4P9

An Introduction to Differential Geometry.
Michael D. Alder
November 29, 2008
Contents
1 Introduction 9
2 Smooth Manifolds and Vector Fields 11
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2 Smooth Manifolds . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.3 Smooth maps and tangent vectors . . . . . . . . . . . . . . . . 15
2.4 Notation: Vector Fields . . . . . . . . . . . . . . . . . . . . . . 27
2.5 Cotangent Bundles . . . . . . . . . . . . . . . . . . . . . . . . 31
2.6 The Tangent Functor . . . . . . . . . . . . . . . . . . . . . . . 32
2.6.1 The (non-existent) Cotangent Functor . . . . . . . . . 35
2.7 Autonomous Systems of ODEs . . . . . . . . . . . . . . . . . . 37
2.7.1 Systems of ODEs and Vector Fields . . . . . . . . . . . 37
2.7.2 Exponentiation of Things . . . . . . . . . . . . . . . . 39
2.7.3 Solving Linear Autonomous Systems . . . . . . . . . . 40
2.7.4 Existence and Uniqueness . . . . . . . . . . . . . . . . 41
2.8 Flows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.9 Lie Brackets . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
2.10 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3 Tensors and Tensor Fields 51
3.1 Tensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.1.1 Natural and Unnatural Isomorphisms . . . . . . . . . . 51
3.1.2 Multilinearity . . . . . . . . . . . . . . . . . . . . . . . 54
3.1.3 Dimension of Tensor spaces . . . . . . . . . . . . . . . 57
3.1.4 The Tensor Algebra . . . . . . . . . . . . . . . . . . . . 62
3.2 Tensor Fields on a Manifold . . . . . . . . . . . . . . . . . . . 67
3.3 The Riemannian Metric Tensor . . . . . . . . . . . . . . . . . 72
3.3.1 What this means: Ancient History . . . . . . . . . . . 76
3.4 Geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
3.5 The Exterior Algebra . . . . . . . . . . . . . . . . . . . . . . . 91
3.6 The Exterior Calculus . . . . . . . . . . . . . . . . . . . . . . 96
3.7 Hodge Duality: The Hodge Operator . . . . . . . . . . . . . 102
3.7.1 The Riemannian Case . . . . . . . . . . . . . . . . . . 102
3.7.2 The SemiRiemannian Case . . . . . . . . . . . . . . . . 106
4 Some Elementary Physics 109
4.1 Three weird forces . . . . . . . . . . . . . . . . . . . . . . . . 109
4.2 Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
4.2.1 Gradient Fields . . . . . . . . . . . . . . . . . . . . . . 116
4.2.2 What are Flux? . . . . . . . . . . . . . . . . . . . . . . 117
4.3 Maxwell and Faraday . . . . . . . . . . . . . . . . . . . . . . . 120
4.4 Invariance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
4.4.1 The Idea of Invariance . . . . . . . . . . . . . . . . . . 126
4.4.2 The Lorentz Group . . . . . . . . . . . . . . . . . . . . 130
4.4.3 The Maxwell Equations . . . . . . . . . . . . . . . . . 135
4.5 Saying it with Differential Forms . . . . . . . . . . . . . . . . 136
4.6 Lorentz Invariance . . . . . . . . . . . . . . . . . . . . . . . . 141
4.6.1 Special Relativity . . . . . . . . . . . . . . . . . . . . . 145
5 DeRham Cohomology: Counting holes 149
5.1 Cultural Anthropology . . . . . . . . . . . . . . . . . . . . . . 149
5.2 Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
5.3 Infinite Variety . . . . . . . . . . . . . . . . . . . . . . . . . . 155
5.4 Gauge Freedom . . . . . . . . . . . . . . . . . . . . . . . . . . 156
5.5 Exact and Closed forms . . . . . . . . . . . . . . . . . . . . . 157
5.6 Homotopies . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
5.7 Counting Holes . . . . . . . . . . . . . . . . . . . . . . . . . . 165
5.8 More Cultural Anthropology . . . . . . . . . . . . . . . . . . . 166
5.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
6 Lie Groups 169
6.1 Introduction and Motivation . . . . . . . . . . . . . . . . . . . 169
6.1.1 The rest of the course . . . . . . . . . . . . . . . . . . 169
6.2 Introduction to Lie Groups . . . . . . . . . . . . . . . . . . . . 169
6.3 Group Representations . . . . . . . . . . . . . . . . . . . . . . 173
6.3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . 173
6.3.2 Irreducible Representations . . . . . . . . . . . . . . . 176
6.3.3 Tensor Representations . . . . . . . . . . . . . . . . . . 177
6.3.4 Schur’s Lemma . . . . . . . . . . . . . . . . . . . . . . 178
6.3.5 Representations of SU(2, C) . . . . . . . . . . . . . . . 179
6.3.6 Representations of SU(2) . . . . . . . . . . . . . . . . . 182
6.4 Lie Algebras . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
7 Fibre Bundles 185
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
7.2 Principal Bundles . . . . . . . . . . . . . . . . . . . . . . . . . 188
7.3 The Endomorphism Bundle . . . . . . . . . . . . . . . . . . . 191
8 Connections 193
8.1 Fundamental Ideas . . . . . . . . . . . . . . . . . . . . . . . . 193
8.2 Back in $\mathbb{R}^n$ . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
8.2.1 Covariant differentiation . . . . . . . . . . . . . . . . . 197
8.2.2 Curves and transporting vectors . . . . . . . . . . . . . 199
8.3 Covariance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
8.4 Extensions to Tensor Fields on $\mathbb{R}^2$ . . . . . . . . . . . . . . 201
8.5 The Koszul Connection . . . . . . . . . . . . . . . . . . . . . . 203
8.6 Vector Potentials . . . . . . . . . . . . . . . . . . . . . . . . . 203
8.6.1 Tensor formulation . . . . . . . . . . . . . . . . . . . . 206
8.7 Concluding Remarks . . . . . . . . . . . . . . . . . . . . . . . 207
Preface
This is a first course in Differential Geometry. I follow a number of sources:
first the text for the course, Baez and Muniain’s Gauge Fields, Knots and
Gravity; second the unique Michael Spivak’s Comprehensive Introduction to
Differential Geometry, which is almost encyclopaedic and also readable, if at
times demanding; third R.W.R. Darling’s Differential Forms and Connections;
and finally the rather old fashioned Sternberg’s Lectures on Differential
Geometry. I shall also make some allusions to Helgason’s Differential
Geometry, Lie Groups and Symmetric Spaces.
My aim is to cover some of the ideas with applications to theoretical physics
from the text book while covering basic ideas. I hope that the students will
have done my 3P0 course which introduces tensors and tensor fields, but it
seems unsafe to count on it having been absorbed as thoroughly as desired.
So some of the introductory material has been lifted from my 3P0 notes.
Mike Alder, February 2007
Chapter 1
Introduction
This course is about Differential Geometry and the text book is really im-
portant. You need your own copy unless you are sharing with a very good
friend. It is particularly important if you want to see where the physical
reasons for studying this topic tie in with the mathematics. We shan’t get
through the whole book but we shall have started on the journey.
I copied some of that from the introduction to 3P0. It’s still true.
You can see what is in the course by reading the contents page. Not that
it will help; this is Mathematics, not one of those subjects where you learn
to say the right things without considering whether they are true or false or
even whether or not they mean anything.
There are many different reasons for studying this subject but one impor-
tant one is that you will come to grips with some of the ideas that modern
Physics needs to make sense of the universe. You may be a physicist or a
mathematician or even an engineer (embryonically at least). The three subjects
tend to attract slightly different kinds of people (only slightly different:
compared with poets, pop-stars, princes, politicians and philosophers we are
barely different at all). Engineers tend to see the world in terms of facts and
protocols which they have to learn and which may or may not make much
sense, physicists see the world in terms of facts and theories, the theories
being there to summarise and predict the facts. Mathematicians expect to
see reasons and logic and relatively few basic facts from which the others
can be deduced. For a mathematician it has to make sense or it is definitely
wrong. For a theoretical physicist it has to be elegant or it is definitely wrong.
Because I am a mathematician, I have put in material which is left out of
the text book where it is treated as some bunch of facts which you need to
know, whereas I want to show why these things are the case. Maybe I just
have a bad memory for facts but a good one for arguments. Whatever the
reason, I am going to try to show you the essential beauty of the subject,
to get you to agree that it is amazingly cool, because this is ultimately why
mathematicians do it. The fact that it is also very useful is not why we do
it, it is why we get paid to do it. Though not much[1].
This is very tough stuff, so don’t expect an easy time. On the other hand it
will be very exciting.
[1] This is a really bad subject to do if you want to get rich, or boss people about, but a
very good one if you want to be happy and have lots of interesting and important things
to think about.
Chapter 2
Smooth Manifolds and Vector
Fields
2.1 Introduction
This chapter considers the machinery needed to say what we mean by a
smooth manifold. We also look at vector fields on smooth manifolds and
explain what this has to do with systems of ordinary differential equations.
The first idea is that a curve (in the plane or in three space) is a one di-
mensional object, a surface such as a sphere (the surface of a beach ball) or
a torus (the surface of an American doughnut where they sell you a hole in
the middle) is a two dimensional object. And there ought reasonably to be
higher dimensional variants of these things, as for example the n-sphere $S^n$
given by
$$S^n = \{x \in \mathbb{R}^{n+1} : \|x\| = 1\}$$
It is also reasonable to look at smooth maps between manifolds. If you draw a
smooth curve on a beach ball without stopping, which joins up to stop where
it starts and has the same final and initial velocity, then we could think of
this as a smooth map from $S^1$ to $S^2$. But we have a problem in dealing
with what we mean by a differentiable map in this case since neither $S^1$ nor
$S^2$ is a Banach space, and Banach spaces are the setting for talking about
differentiation, since they contain the linear algebra and distance notions
needed to talk of linear approximations, which is what derivatives are.
We might be able to make sense of this if the sphere is sitting in $\mathbb{R}^3$, and the
circle in $\mathbb{R}^2$, in which case we have a way of defining smoothness extrinsically.
But smoothness ought to make sense intrinsically, that is without reference
to some external space in which the manifold may or may not be sitting. An
important reason for this is that we live in what looks to be a 3-manifold
called the physical universe, at least at some scales. Make it a 4-manifold if
you want to throw in time. If the universe is sitting in some higher dimen-
sional space, we can’t know much about it, so it is slightly lunatic to believe
it is there. Google branes for an alternative viewpoint. Also string theory for
some disturbing ideas. But these do not contradict the idea that if we live
in a three dimensional physical universe it makes sense to talk about smooth
motion in it without having to postulate some inaccessible space external to
the universe.
So we seek to specify enough extra structure on a manifold so that we can
talk about smooth maps between them without reference to any space in
which they may be sitting. This will certainly be necessary if we are to
suppose that we live in a 3-manifold and want to talk about geodesics in it,
curves of minimal length. We shall certainly want to do this if we are to talk
of the path of a photon in our universe.
All this means generalising ideas about maps from $\mathbb{R}^n$, or subsets of $\mathbb{R}^n$, to
$\mathbb{R}^m$, which involve differentiability. Which we understand. Or do we?
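Recall that if U, V are open subsets of $\mathbb{R}^n$ and f : U → V is a differentiable
map we have that at each point a ∈ U, the derivative of f at a is the linear
map $Df(a) : \mathbb{R}^n \to \mathbb{R}^n$ which is represented in the standard basis by the
$n \times n$ matrix of partial derivatives:
$$
[Df_a] =
\begin{bmatrix}
D_1 f^1(a) & D_2 f^1(a) & \cdots & D_n f^1(a) \\
D_1 f^2(a) & D_2 f^2(a) & \cdots & D_n f^2(a) \\
\vdots & \vdots & & \vdots \\
D_1 f^n(a) & D_2 f^n(a) & \cdots & D_n f^n(a)
\end{bmatrix}
=
\left.
\begin{bmatrix}
\dfrac{\partial f^1}{\partial x^1} & \dfrac{\partial f^1}{\partial x^2} & \cdots & \dfrac{\partial f^1}{\partial x^n} \\
\dfrac{\partial f^2}{\partial x^1} & \dfrac{\partial f^2}{\partial x^2} & \cdots & \dfrac{\partial f^2}{\partial x^n} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial f^n}{\partial x^1} & \dfrac{\partial f^n}{\partial x^2} & \cdots & \dfrac{\partial f^n}{\partial x^n}
\end{bmatrix}
\right|_{x=a}
$$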
Usually I shan’t bother to distinguish between the linear map and its matrix
representation. You know how to compute this matrix if it should be abso-
lutely necessary, and you should understand that the linear map is the linear
part of the best affine approximation to f at a. Note that I have used $f^i$ for
the n component functions of f and $x^i$ for the components of a vector in $\mathbb{R}^n$.
I shall explain this notation later.
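If it should be absolutely necessary to compute the matrix, a machine can do it. The following is a minimal Python sketch and not part of the original notes: the map f, the point a and the step size h are made up for illustration, and the partial derivatives are approximated by central differences rather than computed exactly. It also checks that f(a) + Df(a)(x − a) is close to f(x) for x near a, which is what "best affine approximation" is supposed to mean.

```python
def f(x):
    """Hypothetical example map f : R^2 -> R^2 (an assumption, not from the notes)."""
    x1, x2 = x
    return [x1 * x1 - x2 * x2, 2.0 * x1 * x2]


def jacobian(func, a, h=1e-6):
    """Approximate the matrix [Df_a]; entry (i, j) estimates D_j f^i(a)."""
    n = len(a)
    m = len(func(a))
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        plus = list(a)
        minus = list(a)
        plus[j] += h
        minus[j] -= h
        fp, fm = func(plus), func(minus)
        for i in range(m):
            # central difference in the j-th coordinate direction
            J[i][j] = (fp[i] - fm[i]) / (2.0 * h)
    return J


def affine_approx(func, a, x):
    """Evaluate the affine map x -> f(a) + Df(a)(x - a)."""
    J = jacobian(func, a)
    fa = func(a)
    d = [xi - ai for xi, ai in zip(x, a)]
    return [fa[i] + sum(J[i][j] * d[j] for j in range(len(a)))
            for i in range(len(fa))]


a = [1.0, 2.0]
J = jacobian(f, a)  # the exact matrix at a = (1, 2) is [[2, -4], [4, 2]]
x = [1.01, 2.02]
err = max(abs(p - q) for p, q in zip(f(x), affine_approx(f, a, x)))
print(J)
print(err)  # second order in the size of x - a
```

The error shrinks like the square of the distance from a, which is the numerical face of the statement that the derivative is the linear part of the best affine approximation.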
An important point about smooth curves needs to be considered:
Figure 2.1.1: A smooth curve.
Exercise 2.1.1. Figure 2.1.1 shows two line segments joined together. The
horizontal one is the set of points in $\mathbb{R}^2$ with y = 0 and 0 ≤ x ≤ 1 and the
vertical one is the set of points in $\mathbb{R}^2$ with x = 1 and 0 ≤ y ≤ 1.
Show that there is a continuous but non-differentiable function from [0, 2] to
$\mathbb{R}^2$ which traces the curve formed by the two segments from the origin to the
point $(1, 1)^T$.
Show that there is a differentiable function from [0, 2] to $\mathbb{R}^2$ which does the
same job.
Exercise 2.1.2. Show that [−1, 1] is the image of [−1, 1] by a continuous
bijection which is not differentiable.
The conclusion you should draw from this is that you cannot decide if a curve
is smooth or not merely by looking at the image!
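A small numerical illustration of this conclusion, offered as a sketch rather than as the intended solution to the exercises: the two maps below both have image [−1, 1], and the choice of the cube root as the badly behaved one is my assumption. The symmetric difference quotient at 0 settles down for the identity map and blows up for the cube root.

```python
import math


def identity_param(t):
    """The obvious smooth parametrisation of [-1, 1]."""
    return t


def cube_root_param(t):
    """A continuous bijection of [-1, 1] onto itself: t -> sign(t) * |t|^(1/3)."""
    return math.copysign(abs(t) ** (1.0 / 3.0), t)


def difference_quotient(g, h):
    """Symmetric difference quotient of g at 0 with step h."""
    return (g(h) - g(-h)) / (2.0 * h)


for h in (1e-1, 1e-3, 1e-5):
    print(h,
          difference_quotient(identity_param, h),
          difference_quotient(cube_root_param, h))
```

The second column stays at 1 while the third grows without bound as h shrinks, although the two images are the same set.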
2.2 Smooth Manifolds
Definition 2.2.1. A chart on a topological space X is a homeomorphism
from some open subset of X onto an open subset of $\mathbb{R}^n$. I shall call the
inverse of such a homeomorphism a local parametrisation.
We show a typical local parametrisation map from a rectangular neighbourhood
of the origin in $\mathbb{R}^2$ to a region on the surface in figure 2.2.1. A chart
can be used to give coordinates for points of the space, at least some of them.
Different charts will, of course, give different coordinates in general to those
points on which the domains overlap.
Definition 2.2.2. Two charts on a space X, $f : U \to \mathbb{R}^n$ and $g : V \to \mathbb{R}^n$, are
smoothly compatible iff the maps $f \circ g^{-1}$ and $g \circ f^{-1}$ are infinitely differentiable
wherever they are defined.
Figure 2.2.1: A local coordinate map.
Figure 2.2.2: Two charts.
In other words, the composite map $f \circ g^{-1}$ must have partial derivatives of
all orders at every point of the domain, and the same is true of the inverse
map.
If U and V have empty intersection then this holds vacuously. If they do have
an intersection, then $f \circ g^{-1}$ has domain and codomain some open subsets
of $\mathbb{R}^n$ and is certainly continuous. It makes sense to demand that this map
be smooth, that is, infinitely differentiable. The picture of figure 2.2.2 may
help.
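To see a transition map in the flesh, here is a minimal Python sketch using two stereographic charts on the circle $S^1$. The charts f and g, and the helper names f_inv and transition, are my own choices for illustration and do not come from the notes. With these choices a short calculation gives $g \circ f^{-1}(s) = 1/s$, which is infinitely differentiable wherever it is defined (that is, for s ≠ 0), so the two charts are smoothly compatible.

```python
def f(p):
    """Chart on U = S^1 minus the point (0, 1): project from the top."""
    x, y = p
    return x / (1.0 - y)


def f_inv(s):
    """Local parametrisation inverse to f, sending R onto U."""
    d = s * s + 1.0
    return (2.0 * s / d, (s * s - 1.0) / d)


def g(p):
    """Chart on V = S^1 minus the point (0, -1): project from the bottom."""
    x, y = p
    return x / (1.0 + y)


def transition(s):
    """The composite g o f^{-1}, defined for s != 0."""
    return g(f_inv(s))


for s in (-3.0, -0.5, 0.25, 2.0):
    # the last two columns should agree up to rounding
    print(s, transition(s), 1.0 / s)
```

Two charts suffice to cover the circle here; the maximal atlas is then everything smoothly compatible with these two.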
Definition 2.2.3. A smooth atlas for a space X is a collection of smoothly
compatible charts such that every point of X is in the domain of at least
one chart. Such an atlas is maximal iff every possible (smoothly compatible)
chart is in it.
Definition 2.2.4. A smooth n-manifold is a Hausdorff topological space X
together with a maximal atlas of smoothly compatible charts. The atlas is said
to define a smooth differential structure on X.
The reason for wanting the atlas to be maximal is just so that anyone wan-
dering in with a new local coordinate map can’t cause us trouble. Either it
is compatible with our atlas in which case we already have it, or it is not, in
which case it may be part of a different differential structure for the manifold.
Exercise 2.2.1.
1. Show that $S^1$ and $S^2$ as usually defined are smooth manifolds.
2. Show that the flat torus obtained by gluing opposite edges of a square
is also a smooth manifold.
3. Show that $S^n$ is a smooth manifold for any $n \in \mathbb{Z}^+$. Hint: use the
Implicit Function Theorem.
(Generic hint: you don’t need to have many charts. Enough to cover the
manifold will do, then just add the instruction to fill up with all other possible
smoothly compatible charts.)
Exercise 2.2.2. Construct a definition of an orientable manifold.
2.3 Smooth maps and tangent vectors
Now we have enough to say what it means for a map f : X → Y to be
smooth when X and Y are smooth manifolds:
Definition 2.3.1. A map f : X → Y between smooth manifolds is differentiable
when $g \circ f \circ h^{-1}$ is differentiable for all charts h on X and all charts g
on Y belonging to the differential structures.
The diagram of figure 2.3.1 gives the idea.
We can define higher order differentiability in the same way and we can say
that a map f : X → Y is smooth whenever all composites $g \circ f \circ h^{-1}$ are
smooth for all charts h on X and all charts g on Y.
Exercise 2.3.1. Show that if f : X → Y has composite $g \circ f \circ h^{-1}$ differentiable
at h(a) for some point a in X then the same holds for any other pair of
charts containing a, f(a).
Figure 2.3.1: A smooth map.
Figure 2.3.2: Some tangent curves in a manifold.
Note that although we can say that f is differentiable, we cannot provide
a derivative, since this will generally be different in different charts. If we
move away from simple linear spaces we must pay the price: there is no longer
a best affine approximation because affine maps don’t make sense between
manifolds in general.
We can however say when two maps from R into a manifold X are tangent.
Let f, g : (−1, 1) → X be smooth maps into a manifold and without loss
of generality let f(0) = g(0) = a ∈ X. Then we can say that f and g are
tangent at a iff the derivatives at 0 of $h \circ f$ and $h \circ g$ are the same for
any chart $h : U \to \mathbb{R}^n$ where f(0) = g(0) = a ∈ U. If they are the same in
any one chart they must be the same in any other.
Exercise 2.3.2. Prove the last remark.
Exercise 2.3.3. Show that tangency is an equivalence relation on the set of
maps from R to X, and that we can do the same thing with maps from X
to R.
We can take a tangency equivalence class of maps from R to the manifold
X, and regard it as an object in its own right. The picture 2.3.2 shows some
members of a tangency equivalence class.
The curves can be thought of as the trajectories of moving points, and they
are all moving through the point a at the same speed, and in the same
direction, although we cannot give the direction a particular vector to specify
it, and the speed may also be different in different charts.
Definition 2.3.2. A tangency equivalence class at a point a in a manifold
X is called a tangent vector at a in X.
Remark 2.3.1. Watching the faces of students in class when giving this
definition is a real treat. The look of stark horror and incomprehension
is very encouraging, as it proves that some at least are listening. A small
amount of imagination, however, goes a long way to making this definition
quite reasonable.
Suppose that the North pole has been cleared of snow and turned into a
skating rink for penguins[1], and the North pole itself is marked by a flashing
red light. There are two space craft hovering up there, call them A and B. In
space craft A, an astronaut leans out and takes a photograph of the region
around the north pole. Suppose for simplicity he is directly above the north
pole so his photograph, when enlarged is a disc as in the picture figure 2.3.4.
Astronaut B is somewhere over Russia and he also takes a photograph of
what he can see.
[1] It has been pointed out to me that there are no penguins at the North pole only at
the South pole. On the other hand there isn’t a skating rink at the North pole either. So
if we are going to make a skating rink we might as well import the penguins.
Figure 2.3.3: Penguins (imported from the antarctic) skating.
Now each astronaut looks at his photograph and lays it out flat and enlarges
it to a nice size, and each marks on a coordinate grid using a ruler and pen,
and so each has a chart of a bit of the polar regions, with the north pole
in the domain of each chart. If both put the origin in the centre, astronaut
A will have the flashing red light at the origin, and astronaut B will have a
negative x coordinate for the red light if he puts his coordinates on the chart
in the way suggested by the diagram. I regard the chart as both the bit of
the earth the astronaut can see, and also the process of turning it into a flat
picture with a coordinate grid on it. Call them u and v for the maps and U
and V for the domains of the maps back on earth.
I claim it makes sense to talk of a penguin skating over the north pole as
having a velocity vector as it passes through the north pole. Each astronaut
can plot the position of the green penguin in his chart, and each will agree if
the curve is differentiable. Note that if $g : (-1, 1) \to S^2$ describes the green
penguin then astronaut A will plot the green curve at the top of the picture
and will be able to give it a perfectly respectable velocity on his chart relative
to the cartesian coordinates marked on the chart. Similarly astronaut B can
do the same. The problem is that they will have, usually, different estimates
of what the velocity is. If B is much higher up in space, his scale will be such
that the penguins will seem to be moving more slowly, for instance.
Figure 2.3.4: Two penguins skating under the watchful eyes of two astronauts.
Does this mean my claim that we can assign a meaningful velocity vector to
the penguin is just nonsense? No, for if b is the blue penguin, also skating
over the north pole at the same time as the green penguin (and mysteriously
not knocking over the first penguin: maybe they are ghost penguins and can
occupy the same space), it certainly makes sense to say if they are travelling
in the same direction at the same speed. A penguin cutting across the path
would obviously be travelling in a different direction, and a really slow pen-
guin would be slower for both astronauts, and the fast penguins would pass
through it. So I claim that the penguin velocity is a real thing which exists
at the penguin level if not at the astronaut level. But if one astronaut said
that the blue penguin and the green penguin had the same velocity at the
instant they went through the pole, the other astronaut would agree — even
though disagreeing as to the actual value of the vector in both direction and
speed, these things being properties of the charts, not the penguins.
The reason this happens is that there are two things going on here. The actual
velocity at the north pole is a real thing, penguins are actually moving, and
either they pass through the north pole at the same time in the same direction
at the same speed or they don’t. But attempts by the two astronauts to
describe the penguin motion to each other with numbers involve inventing
coordinate systems which are bits of language. So the numerical value of any
vector is dependent on the language. But the fact that different languages
agree on whether two penguins have the same velocity tells you that the
velocity is real. It exists independent of the coordinate system, provided the
two coordinate systems are related by a diffeomorphism. So there are moving
penguins and there is language, and the penguins will have the same velocity
at the pole or they won’t, and this is true no matter what language you use
to talk about it unless your language is really weird.
The problem then is to say what a velocity vector is given that any pair of
astronauts can disagree about the actual numbers. And the most elegant
solution is to say that it is what all the penguin trajectories, real and poten-
tial, have in common. And what they have in common is that every observer
will agree that they pass through the north pole in the same direction at the
same speed. This is the tangency equivalence class.
Note that I have assumed that all observers use synchronised clocks so they
all agree that the time at which the penguins hit the north pole is time
zero. This doesn’t have to be the case either. They will all agree on the
simultaneity of the events, whatever time they claim they occur. This is
because two penguins either meet or they don’t, and this is not a matter of
language but of fact.
The ghost-penguins are negotiable. Having a nice vivid picture of some sort
is essential: you should be prepared to invent your own, but this time you
may borrow my penguins if they help. If I give you more definitions like this,
it is your job to supply the penguins, or whatever it takes.
Exercise 2.3.4. Show that the claim that the two astronauts would agree if
two penguins have the same velocity at the north pole is true provided that
$u \circ v^{-1}$ and $v \circ u^{-1}$ are both differentiable.
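The penguins can also be put on a computer. What follows is a rough Python sketch under assumptions of my own: the sphere sits in $\mathbb{R}^3$ purely so that the curves can be written down, astronaut A's chart is vertical projection, astronaut B's chart is a tilted and scaled projection (the tilt ALPHA and the scale SCALE are invented numbers), and velocities are estimated by central differences. The green and blue penguins are tangent at the pole and the third penguin cuts across their path.

```python
import math

ALPHA = 0.6  # hypothetical tilt of astronaut B's line of sight, in radians
SCALE = 0.5  # hypothetical scale of B's photograph (B is higher up)


def green(t):
    """Green penguin: a great circle through the north pole at t = 0."""
    return (math.sin(t), 0.0, math.cos(t))


def blue(t):
    """Blue penguin: a different curve, tangent to the green one at the pole."""
    r = math.sqrt(1.0 + t * t + t ** 4)
    return (t / r, t * t / r, 1.0 / r)


def crossing(t):
    """A penguin cutting across the path of the other two."""
    return (0.0, math.sin(t), math.cos(t))


def chart_u(p):
    """Astronaut A, directly above the pole: vertical projection."""
    x, y, z = p
    return (x, y)


def chart_v(p):
    """Astronaut B: projection along a tilted line of sight, then scaled."""
    x, y, z = p
    return (SCALE * (x * math.cos(ALPHA) - z * math.sin(ALPHA)), SCALE * y)


def velocity(chart, curve, h=1e-5):
    """Central difference estimate of the velocity of chart(curve(t)) at t = 0."""
    p, q = chart(curve(h)), chart(curve(-h))
    return tuple((a - b) / (2.0 * h) for a, b in zip(p, q))


def same(v1, v2, tol=1e-6):
    """Do two chart velocities agree up to numerical error?"""
    return all(abs(a - b) < tol for a, b in zip(v1, v2))


for name, chart in (("A", chart_u), ("B", chart_v)):
    print(name,
          velocity(chart, green),
          same(velocity(chart, green), velocity(chart, blue)),
          same(velocity(chart, green), velocity(chart, crossing)))
```

The two astronauts print different numbers for the green penguin's velocity, but both report that green and blue agree and that the crossing penguin does not. Note also that B's chart gives the pole itself a negative x coordinate.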
Remark 2.3.2. There is, of course, a simpler way of defining tangent vectors
on $S^2$. It is usually viewed as a subspace of $\mathbb{R}^3$, so a curve on $S^2$ is also a
curve in $\mathbb{R}^3$ and we can define velocities on $S^2$ as tangent vectors in $\mathbb{R}^3$ in
the sense of the derivatives of maps from (−1, 1) to $\mathbb{R}^3$ which, for tangent
vectors at a particular point, happen to lie in a plane in $\mathbb{R}^3$ which is tangent
to $S^2$ at that point. This certainly removes some tricky conceptual problems
but at the expense of making tangent vectors extrinsic rather than intrinsic
to the space. The whole thrust of the text book is towards using intrinsic ideas
for the very good reason that we live in a 3-manifold and cannot form any
useful idea of an embedding of it in some higher dimensional space.
The next proposition tells us that the set of tangency equivalence classes at
a fixed point a in a manifold form a vector space, the tangent space at a.
Proposition 2.3.1. The set of tangent vectors at a point a of a smooth
n-manifold X comprise a real vector space of dimension n.
Figure 2.3.5: The sum of tangent vectors.
Proof:
We have to produce sensible rules for adding and scaling tangent vectors.
Then we have to show that the result satisfies the axioms for a real vector
space. Suppose we have a tangency equivalence class $\mathbf{v}$ and that $v$ is an
element of it, that is a curve $v : \mathbb{R} \to X$ with $v(0) = a$, so that in any chart
$w : W \to \mathbb{R}^n$ with $a \in W$ there is some derivative of $w \circ v$ at 0. Then we can
scale the function $v$ by a scalar $k \in \mathbb{R}$ to get $v(kt)$ instead of $v(t)$ for $t \in \mathbb{R}$
and the derivative of $w \circ v$ at 0 will also be scaled by the factor k. This will be
the same scaling in any chart, so it makes sense to call this new function $kv$.
This has its own tangency equivalence class, $k\mathbf{v}$.
It would not make a difference if we had chosen another function $v' \in \mathbf{v}$:
$kv'$ is a function tangent to $kv$ since they both have the same derivative no
matter what chart we choose, although in different charts the derivatives will
be different but still equal to each other.
So we can say that $k\mathbf{v}$ exists and we have scaled the equivalence class.
If $\mathbf{v}$ and $\mathbf{u}$ are distinct tangency equivalence classes through the point a
as in figure 2.3.5, we can take representative functions $v, u : \mathbb{R} \to X$ with
u(0) = v(0) = a and composing with $w : W \to \mathbb{R}^n$, a chart, we have two
maps, $w \circ u$ and $w \circ v$ from $\mathbb{R}$ into $\mathbb{R}^n$. Such maps may be added: we take
the map $w \circ u + w \circ v - w(a)$. At t = 0 this passes through $w(a) \in \mathbb{R}^n$. The
resulting curve in $\mathbb{R}^n$ can be mapped back into the manifold by $w^{-1}$, or at
least a bit of it in a neighbourhood of w(a) can be. This gives a sort of ‘sum’
curve of u and v in the manifold, $w^{-1} \circ (w \circ u + w \circ v - w(a))$. The tangency
class of this sum curve is defined to be the sum of $\mathbf{u}$ and $\mathbf{v}$. It is easy to see
that the tangency equivalence class does not depend on the choice of chart.
(Although the sum curve does.)
Nor does it depend on which representatives u of $\mathbf{u}$ and v of $\mathbf{v}$ we
choose because they all have the same derivative. We may write that $\mathbf{u} + \mathbf{v}$ is the
tangency equivalence class of $w^{-1} \circ (w \circ u + w \circ v - w(a))$ therefore, and we
may add tangency equivalence classes, otherwise known as tangent vectors
at a.
(If in doubt about this argument, say it with penguins.)
It is clear that the sum is associative and commutative and there is a zero
which contains the constant function sending R to a. The rest of the axioms
for a vector space are easily checked.
The claim that it has dimension n the same as the dimension of X is left as
an exercise.
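The claim that the sum curve depends on the chart but its tangency class does not can be checked numerically. This is a sketch under assumptions of my own choosing: the manifold is the plane, the point a, the curves u and v, and the second chart w2 (with its explicit inverse) are all invented for the purpose, and velocities are read off in the identity chart by central differences.

```python
import math

A = (0.5, -1.0)  # the base point a (hypothetical)


def u(t):
    """A curve through a with velocity (1, 0) at t = 0."""
    return (A[0] + t, A[1] + t * t)


def v(t):
    """A curve through a with velocity (-2, 3) at t = 0."""
    return (A[0] - 2.0 * t, A[1] + 3.0 * t + t ** 3)


def w1(p):
    """First chart: the identity map on the plane."""
    return p


def w1_inv(q):
    return q


def w2(p):
    """Second chart: a nonlinear diffeomorphism onto its image."""
    x, y = p
    return (math.exp(x), y + x * x)


def w2_inv(q):
    r, s = q
    x = math.log(r)
    return (x, s - x * x)


def sum_curve(w, w_inv):
    """The 'sum' curve w^{-1} o (w o u + w o v - w(a)) built in the chart w."""
    wa = w(A)

    def curve(t):
        wu, wv = w(u(t)), w(v(t))
        return w_inv(tuple(p + q - r for p, q, r in zip(wu, wv, wa)))

    return curve


def velocity(curve, h=1e-5):
    """Central difference estimate of the velocity of a plane curve at t = 0."""
    p, q = curve(h), curve(-h)
    return tuple((c - d) / (2.0 * h) for c, d in zip(p, q))


s1 = sum_curve(w1, w1_inv)
s2 = sum_curve(w2, w2_inv)
print(s1(0.1), s2(0.1))            # the two sum curves are different curves
print(velocity(s1), velocity(s2))  # but they have the same velocity at a
```

Both velocities come out close to (−1, 3), the sum of the velocities of u and v, even though the two sum curves have visibly parted company by t = 0.1.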
Exercise 2.3.5. Check all the axioms for a vector space. This kind of thing
is called axiom bashing and is good for you.
The resulting vector space is called $T_a(X)$ and is isomorphic to $\mathbb{R}^n$ when X
is an n-manifold. I want to emphasise an important point: there is in general
no particular or natural isomorphism between $T_a(X)$ and $\mathbb{R}^n$. If $X = \mathbb{R}^n$,
then I can get away with calling $T_0(\mathbb{R}^n)$ by the ‘slang’ name $\dot{\mathbb{R}}^n$, because in
this case there is an obvious basis for the tangent space at the origin, I have
the unit vectors along the axes. And by a simple translation I can carry $\mathbb{R}^n$
to $\mathbb{R}^n$ and take the origin to any point a, and this translation will also take
curves through the origin (and hence vectors) to curves through a. So in
this rather special case, I do have a natural basis for $T_a(X)$. But there is no
natural basis for $T_a(S^2)$ for any $a \in S^2$; the best I could do is to fudge one by
using the embedding in $\mathbb{R}^3$, but this is a property of the embedding, not of
$S^2$. This loss of a natural basis, or if you prefer a natural isomorphism with
$\mathbb{R}^n$, has important implications. It parallels the fact that there is no obvious
choice of a coordinate frame in the space we inhabit[2].
Exercise 2.3.6. Prove the last statement. Hint: Do it for $X = \mathbb{R}^n$ first,
then observe that locally any X is $\mathbb{R}^n$ as near as dammit, and that tangency
is a very local kind of business.
There is one such tangent space $T_a(X)$ for each point a ∈ X. There is, in
general, no particular isomorphism between any $T_a(X)$ and $\mathbb{R}^n$. If $X = \mathbb{R}^n$
then there is (what?), but in general there is a huge choice and no way of
picking any particular one.
[2] Although in earlier days, it was thought in some quarters that Jerusalem was a good
place to put one. Where exactly in Jerusalem was not altogether clear.
Figure 2.3.6: The simplest tangent bundle.
Exercise 2.3.7. Show that the tangent plane at the north pole to $S^2$ as
usually embedded in $\mathbb{R}^3$ can be mapped isomorphically to the tangent space
as defined here. Is there an obvious isomorphism?
Exercise 2.3.8. Show that there is an isomorphism between $T_a(X)$ and
$T_b(X)$ for any two points a, b ∈ X.
Examples 2.3.1.
1. The simplest case is where $X = \mathbb{R}$. The tangent space at the point 1
can be thought of as the space of velocities of moving points as they go
through 1; the chart consisting of the identity map does it all nicely.
So we have a line of possible velocities attached to each point of $\mathbb{R}$ and
the tangent bundle is the collection of all the tangent spaces. We can
draw it as $\mathbb{R} \times \dot{\mathbb{R}}$ where the first component is the space itself and the
second is the space of velocities. I am making up the notation of $\dot{\mathbb{R}}$
for the space of velocities, and you won’t find it in the books, but it
makes sense and reduces confusion. Since $\dot{\mathbb{R}}$ is isomorphic to $\mathbb{R}$, $\mathbb{R} \times \dot{\mathbb{R}}$
looks an awful lot like $\mathbb{R}^2$. We think of the different tangent spaces $\dot{\mathbb{R}}_a$
attached to each $a \in \mathbb{R}$ and draw some of them as in figure 2.3.6.
The reason it is called a bundle is because it looks like a bundle of (red)
tangent spaces. The tangent spaces are called the fibres of the bundle.
The manifold to which the fibres are attached is called the base space
of the fibre bundle.
2. Let the manifold $X$ be $S^1$. Again it makes sense to have curves in $S^1$ which all pass through a point and have the same velocity vector at that point. The different velocities again form a vector space $\dot{\mathbb{R}}$ and there is one attached to each point of $S^1$. If we draw the possible tangents in the plane, they intersect; this is a property of the space we are trying to squash the tangent bundle into, and if we turn them through a right angle as in the last example we get the fibre bundle of figure 2.3.7. Again the fibres are all copies of a line and the bundle is pretty much the same as $S^1 \times \mathbb{R}$. The red dot sitting over the black one represents a speed in the positive direction passing through the black point underneath it.

Figure 2.3.7: The next simplest tangent bundle.
3. We have now run out of cases where we can draw the pictures, since $\mathbb{R}$ and $S^1$ are the only connected one dimensional manifolds, and if we go to $S^2$ we get a tangent bundle of dimension four. We can draw one tangent plane, but any more would usually intersect, and this is what happens when we try to embed a four dimensional space in $\mathbb{R}^3$. We can see however that there is a collection of planes, one for each point of $S^2$, and they form a four dimensional space. It is useful to visualise at least a part of the tangent bundle of $S^2$ as a sphere in $\mathbb{R}^3$ with some bits of tangent planes attached to it, as in figure 2.3.8, because it is better to have a partial idea than stick entirely to the algebra, but you should be aware of the limitations of the picture.
The two earlier examples came out to be simple cartesian products of the tangent space at any one point with the manifold. Such bundles are called trivial bundles. An example of a non-trivial fibre bundle is the Möbius bundle shown in figure 2.3.9. This has fibre an interval, say $(-1, 1)$, from $\mathbb{R}$ and base space $S^1$. But it is not the cartesian product of the two.

Figure 2.3.8: A bit of the tangent bundle for $S^2$.
Figure 2.3.9: A non-trivial fibre bundle.
Every tangent bundle has, however, a projection onto the base space, the underlying manifold. We may write this as a vertical pair of spaces
$$\begin{array}{c} TM \\ \downarrow \pi \\ M \end{array}$$
where $M$ is the manifold, $TM$ is the tangent bundle and $\pi$ is the projection which sends a tangent vector to the point in the manifold to which it is attached.
Now we have described the tangent bundle as a union of all the tangent spaces $T_a(M)$ for $a \in M$, but that does not specify a topology on it. To do that we say a subset $U$ of $TM$ is open iff the projection $\pi(U)$ is open in $M$ and the intersection of $U$ with any fibre is open in the fibre. Since the fibres are all real vector spaces we can give them the usual topology, obtained from an isomorphism with $\mathbb{R}^n$.
Definition 2.3.3. The tangent bundle to a smooth manifold $M$ is the set
$$\bigcup_{a \in M} T_a(M)$$
with the topology specified by saying $U \subseteq TM$ is open whenever $\pi(U)$ is open in $M$ and for every $a \in \pi(U)$, $U \cap T_a(M)$ is open in $T_a(M)$, where $T_a(M)$ has a topology induced by any isomorphism with $\mathbb{R}^n$.
Note that this assumes that any two isomorphisms with $\mathbb{R}^n$ will induce the same topology.
Exercise 2.3.9.
1. Show that a linear map from $\mathbb{R}^n$ to $\mathbb{R}^m$ is continuous iff it is continuous at the origin.
2. Show that any linear map from $\mathbb{R}^n$ to $\mathbb{R}^m$ is continuous.
3. Show that any isomorphism from $\mathbb{R}^n$ to itself is a homeomorphism.
Now for some formal definitions:
Definition 2.3.4. A fibre bundle is a quartet $(E, B, F, \pi)$ where $E$ is called the total space, $B$ is called the base space, $\pi : E \to B$ is a continuous map called the projection and for every $b \in B$, $\pi^{-1}(b)$ is homeomorphic to $F$. The spaces $\pi^{-1}(b)$ are called the fibres of the bundle.
Definition 2.3.5. A fibre bundle is called locally trivial iff for every $b \in B$ there is an open set $U \subseteq B$ containing $b$ such that $\pi^{-1}(U)$ is homeomorphic to $U \times F$.
The bundle $B \times F$ is called a trivial bundle.
Exercise 2.3.10. Describe clearly the trivial bundle with base space $S^2$ and fibre $S^1$ and give an example of a non-trivial bundle with the same base and fibre. Hint: you might find it easier if you specify some gluings.
Exercise 2.3.11. Show that the tangent bundle of a smooth manifold is
locally trivial.
Exercise 2.3.12. Show there is a natural atlas on the tangent bundle which
makes it a smooth manifold. Is the bundle projection smooth?
Note that for a locally trivial fibre bundle a topology on the bundle must
have as base the cartesian product of sets which are open in B (and over
which the bundle is locally trivial) with open sets in the fibre.
Definition 2.3.6. A section of a fibre bundle $E$ with projection $\pi$ to base space $B$ is a map $s : B \to E$ such that $\pi \circ s$ is the identity on $B$.
Definition 2.3.7. A vector field on a manifold M is a section of the tangent
bundle TM.
You should be able to see that this makes sense and we can talk about
continuous, differentiable and smooth vector fields according as the section
(which is after all a map) is continuous, differentiable or smooth.
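To make the idea of a section concrete, here is a minimal Python sketch for the trivial bundle $\mathbb{R}^2 \times \dot{\mathbb{R}}^2$. The names `projection` and `rotation_section` are made up for illustration, and a point of the bundle is simply modelled as a pair (base point, vector). The only thing being checked is that projecting after applying the section gives back the point you started with.

```python
# A point of the trivial bundle R^2 x R^2 is modelled as (base_point, vector).
# Both function names below are hypothetical, chosen only for this sketch.

def projection(bundle_point):
    """The bundle projection: forget the vector, keep the point it is attached to."""
    base_point, _vector = bundle_point
    return base_point

def rotation_section(base_point):
    """A section: attach the vector (-y, x) to the point (x, y)."""
    x, y = base_point
    return (base_point, (-y, x))

# The defining property of a section: projection after section is the identity.
samples = [(0.0, 0.0), (1.0, 0.0), (0.5, -2.0)]
section_ok = all(projection(rotation_section(p)) == p for p in samples)
print(section_ok)
```

A map which attached to the point $(x, y)$ a vector sitting over some other point would fail this check, and that is exactly what the definition rules out.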
Exercise 2.3.13.
1. Draw a vector field on $\mathbb{R}^2$ which is nice and easy and write it as a section of the tangent bundle.
2. Show that the tangent bundle for $S^2$ is not trivial. Use the hairy ball theorem which says that any continuous vector field on $S^2$ must have at least one place where the vector is of length zero.
2.4 Notation: Vector Fields
On $\mathbb{R}^2$, I can write the tangent bundle as $\mathbb{R}^2 \times \dot{\mathbb{R}}^2$, which is mildly useful for thinking about the meaning but not standard and not particularly useful for computations. I shall extend this to talking about the standard basis for $\dot{\mathbb{R}}^2$ and call it $(\dot e_1, \dot e_2)$. A vector field on $\mathbb{R}^2$ is an assignment to each point of $\mathbb{R}^2$ of a vector, and if it is a smooth vector field this vector changes smoothly as we move around in $\mathbb{R}^2$. So there is a tangent vector, with two components, both of which depend smoothly on $x$ and $y$, and hence it is given by a pair of functions $P(x, y)$, $Q(x, y)$. We might write the vector field as
$$P(x, y)\,\dot e_1 + Q(x, y)\,\dot e_2$$
but we don’t. We write it as
$$P(x, y)\frac{\partial}{\partial x} + Q(x, y)\frac{\partial}{\partial y}$$
This notation takes a bit of explaining.
If we have a smooth function $f : \mathbb{R}^2 \to \mathbb{R}$ and a smooth vector field on $\mathbb{R}^2$ we can take the directional derivative of $f$ at any point in the direction of the vector field at the point, and multiply it by the length of the vector. This will give us a new smooth function on $\mathbb{R}^2$. This means that such a vector field can be thought of as an operator on the space of smooth functions from $\mathbb{R}^2$ to $\mathbb{R}$, which is usually written as $C^\infty(\mathbb{R}^2)$. The constant vector field which assigns the vector $\dot e_1$ to every point of $\mathbb{R}^2$ can easily be seen to be the operator $\partial/\partial x$ and similarly the orthogonal constant vector field which assigns $\dot e_2$ is the operator $\partial/\partial y$. This explains the notation for vector fields on $\mathbb{R}^2$ and by an obvious extension we can write a vector field $v$ on $\mathbb{R}^n$ as
$$\sum_{i \in [1:n]} v^i(x) \frac{\partial}{\partial x^i}$$
where each $v^i$ is a function from $\mathbb{R}^n$ to $\mathbb{R}$ and where I called $v^1$ the function $P(x, y)$ and $v^2$ the function $Q(x, y)$ when $n = 2$.
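The operator point of view can be tried out numerically. The following is a minimal sketch in plain Python, assuming the partial derivatives are approximated by central differences with a small step `h`; the helper name `apply_field` and the particular field and functions are my own choices for illustration, not anything from the text.

```python
def apply_field(P, Q, f, h=1e-5):
    """Return the function (x, y) -> P * df/dx + Q * df/dy.

    The partial derivatives are approximated by central differences,
    so the result is only as accurate as the step h allows.
    """
    def vf(x, y):
        df_dx = (f(x + h, y) - f(x - h, y)) / (2 * h)
        df_dy = (f(x, y + h) - f(x, y - h)) / (2 * h)
        return P(x, y) * df_dx + Q(x, y) * df_dy
    return vf

# The field -y d/dx + x d/dy, whose arrows run around circles about the origin.
P = lambda x, y: -y
Q = lambda x, y: x

f = lambda x, y: x * x + y * y   # constant on circles about the origin
g = lambda x, y: x               # the first coordinate function

vf = apply_field(P, Q, f)
vg = apply_field(P, Q, g)
print(vf(1.0, 2.0), vg(1.0, 2.0))
```

The field kills $f$ (up to rounding) because $f$ does not change along the circles the arrows follow, while applied to the coordinate function $x$ it simply returns the first component $P = -y$ of the field.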
Since the procedure for interpreting a vector field as an operator on $C^\infty(\mathbb{R}^2)$ is local, a vector field on a manifold $M$ is an operator on $C^\infty(M)$, although there is an issue involved in choosing a basis for each tangent space $T_a(M)$ if we wish to do calculations.
This gives two quite different ways of looking at a vector field on a smooth manifold. We have the tangency equivalence classes which we may think of as little arrows, each selected by a section of the tangent bundle. This is a quite straightforward transfer of ideas from $\mathbb{R}^n$ and should seem natural and reasonable once you have come to terms with the problem of having to say everything via charts. But the other way of thinking of a vector field, as an operator on the space $C^\infty(M)$, has some advantages. One of these is that it makes sense without immediate reference to charts. Of course, we need charts to say what it means for some map $f : M \to \mathbb{R}$ to be differentiable, but given that, we have a pleasant freedom from particular coordinate systems. Physicists are particularly interested in this, because the physical universe does not come equipped with charts any more than it has an origin and axes sticking out of it. Recall penguins, and what they do, versus the language for talking about them given by charts. Now we want to focus on the behaviour of the physical universe (penguins) and not be too distracted by the language (charts). So an invariant description, that is one which does not depend on choosing a particular language, is definitely more physical. Note Oliver Heaviside’s remarks quoted at the top of chapter three of the text book.
On $\mathbb{R}^n$ we can therefore write a vector field $v$ as a map
$$v : C^\infty(\mathbb{R}^n) \to C^\infty(\mathbb{R}^n)$$
with $vf$ the map
$$\sum_{i \in [1:n]} v^i(x) \frac{\partial f}{\partial x^i}.$$
This can be compressed into
$$v = \sum_{i \in [1:n]} v^i \frac{\partial}{\partial x^i}$$
An even more compact form is
$$v = \sum_{i \in [1:n]} v^i \partial_i$$
We can make this even terser by using the Einstein Summation Convention, which is that if an index is repeated as a superscript and a subscript then we automatically sum over the possible values. This gives us
$$v = v^i \partial_i$$
where you have to know what the space is in which we are working to know how many $i$’s there are. For some reason physicists prefer to use Greek letters as indices, which means that you are likely to find expressions such as
$$v = v^\mu \partial_\mu$$
instead. I fear that you will have to get used to this as the textbook is committed to it.
This leads to a new definition of a vector field on a smooth manifold $M$. First we define $C^\infty(M)$ to be the set of smooth maps from $M$ to $\mathbb{R}$. This is clearly a real vector space. It is certainly possible to add and scale the functions, and the rest is simple axiom bashing, as done in second year. It is rather more than just a vector space, it is an algebra, which is to say it is possible to multiply any pair of elements, $fg$ being the function
$$\forall a \in M, \quad fg(a) = f(a)g(a)$$
where the right hand side of the equality means we just multiply the two real numbers $f(a)$ and $g(a)$. The multiplication is associative, commutative, and left and right distributive over addition. In other words, it is a real vector space which is also a commutative ring, which is basically what we mean by a real algebra. You should write down the complete list of axioms for such a thing, not relying on the text book too much.
Now we define a linear operator on such an algebra $A$ by saying it is a map
$$v : A \to A$$
which is linear, that is,
$$\forall f, g \in A \quad v(f + g) = vf + vg$$
and
$$\forall f \in A,\ \forall t \in \mathbb{R} \quad v(tf) = t\,v(f)$$
Such an operator is called a derivation if it also satisfies
$$\forall f, g \in A \quad v(fg) = f\,v(g) + g\,v(f)$$
which you will recognise as the Leibniz rule for differentiating a product function.
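The derivation condition can be spot-checked numerically before you prove anything. This is a sketch only, assuming central differences stand in for the partial derivatives; the field $(1 + y)\,\partial/\partial x + x^2\,\partial/\partial y$ and the test functions are arbitrary choices of mine, and `derivation` and `leibniz_gap` are made-up names.

```python
import math

def derivation(P, Q, h=1e-5):
    """Turn the field P d/dx + Q d/dy into an operator on functions of (x, y)."""
    def v(f):
        def vf(x, y):
            fx = (f(x + h, y) - f(x - h, y)) / (2 * h)  # approximate df/dx
            fy = (f(x, y + h) - f(x, y - h)) / (2 * h)  # approximate df/dy
            return P(x, y) * fx + Q(x, y) * fy
        return vf
    return v

# An arbitrary smooth field and two arbitrary smooth functions.
v = derivation(lambda x, y: 1.0 + y, lambda x, y: x * x)
f = lambda x, y: math.sin(x) * y
g = lambda x, y: x + math.exp(y)
fg = lambda x, y: f(x, y) * g(x, y)

def leibniz_gap(x, y):
    """v(fg) - (f v(g) + g v(f)) at (x, y); zero for a true derivation."""
    return v(fg)(x, y) - (f(x, y) * v(g)(x, y) + g(x, y) * v(f)(x, y))

print(leibniz_gap(0.3, -0.7))
```

The printed gap is not exactly zero, but only because of the finite difference approximation and rounding; a numerical check at a few points is of course evidence and not a proof.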
Exercise 2.4.1. Take $M = \mathbb{R}^2$ and any smooth vector field on it. Show that it is a derivation.
Note that it makes sense to define a derivation over any real algebra and
algebraists indeed do exactly this. This is a long way from differentiating
functions, but it gives all the essential properties, and algebraists have a
habit of studying the properties without much caring where they came from.
They have their uses. Algebraists, that is.
We can finally define a vector field on a manifold $M$ as a derivation on the real algebra $C^\infty(M)$. Such a definition has advantages and disadvantages. The obvious disadvantage is that it is so abstract it seems to have nothing to do with the things we care about, but the advantage is that the abstraction has removed all the irrelevancies which get in the way of thinking about things and left the bare essentials. Any lingering suspicion that the geometric baby has been thrown out with all that bathwater may be put to rest by checking through the last exercise carefully, and by doing it with $S^1$ instead of $\mathbb{R}^2$:
Exercise 2.4.2. Take $M = S^1$ and any smooth vector field $v$ on it regarded as a section of the tangent bundle. Show that $v$ is a derivation: take some simple functions from $S^1$ to $\mathbb{R}$ and operate on them by $v$. Confirm that all the rules for a derivation are satisfied.
We also need to be able to go in the opposite direction: if $v : C^\infty(M) \to C^\infty(M)$ is a derivation, then it must be able to be expressed as a vector field in the earlier sense.
Exercise 2.4.3. Do this on $\mathbb{R}^2$ at the origin. Suppose $f : \mathbb{R}^2 \to \mathbb{R}$ is a smooth map. Then we can write
$$f\begin{pmatrix} x \\ y \end{pmatrix} = f(0) + \left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right]_0 \begin{pmatrix} x \\ y \end{pmatrix} + ax^2 + 2bxy + cy^2$$
where $a, b, c$ are (up to the factor $\tfrac{1}{2}$ from Taylor's formula) second order partial derivatives of $f$ evaluated at some point between the origin and $(x, y)^T$ (and hence, we have to admit, depend upon $x$ and $y$). This is just the Taylor expansion with Lagrange form of the remainder in two dimensions.
Now apply $v$ to $f$ to get a new function $g$: then $g(0)$ must be the limit of $g\big((x, y)^T\big)$ as $(x, y)^T \to 0$, as $g$ is certainly continuous, and show that since $v$ is linear, $g\big((x, y)^T\big)$ must be the sum of the action of $v$ on the above three terms in a neighbourhood of the origin, that $v$ takes the constant first term to zero, and that since $v$ satisfies the Leibniz condition, $g(0)$ must be
$$\left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right]_0 \begin{pmatrix} u \\ v \end{pmatrix}$$
for some vector $(u, v)^T \in \dot{\mathbb{R}}^2$. Finally show that if it works on $\mathbb{R}^2$ it must work on $\mathbb{R}^n$ and also on any smooth manifold.
We can now define $\mathrm{Vect}(M)$ or $\mathfrak{X}(M)$ as the set of all vector fields on the smooth manifold $M$.
Exercise 2.4.4. Show that $\mathrm{Vect}(M)$ ($\mathfrak{X}(M)$) is a real vector space. Show that it is a module over $C^\infty(M)$, that is, it is like a vector space over $C^\infty(M)$ except that $C^\infty(M)$ is not a field but a ring.
Exercise 2.4.5. Show that $\mathfrak{X}(M)$ for $M = \mathbb{R}^n$, as a module over $C^\infty(\mathbb{R}^n)$, is finite dimensional and has the obvious basis.
2.5 Cotangent Bundles
I mentioned earlier that we could do the business of equivalence classes of maps from the manifold to $\mathbb{R}$ in exactly the same way as we took maps from $\mathbb{R}$ to the manifold. If we do this we get an exact parallel, and a tangency equivalence class of such maps at a point is called a cotangent or covector at the point. Somewhat easier is to define the space of cotangents at $a \in X$ for a smooth manifold $X$ as the dual space of $T_a(X)$. Recall that the dual (vector) space for a space $V$ is the space $V^*$ of linear maps from $V$ to $\mathbb{R}$. I shall say more about this in the next chapter. We can do exactly the same process of taking the union of all the $T_a(X)^*$ as we did for the tangent bundle and this gives us a slightly different object called the cotangent bundle. It has to be admitted that there is no difference between them as topological spaces. All the difference is in the algebra and it manifests itself strongly when we look to see what happens under maps between manifolds.
Exercise 2.5.1. I have given two different definitions of the cotangent space.
Show they are equivalent.
The same sort of considerations as worked for vector fields apply to covector fields, or differential 1-forms as they are more commonly known. At each point of $\mathbb{R}^2$, we select an element of $\dot{\mathbb{R}}^{2*}$, the cotangent space, which again has two components. I suppose we might call the standard basis for this space $(\dot e^*_1, \dot e^*_2)$ where $\dot e^*_1$ is the linear map from $\dot{\mathbb{R}}^2$ to $\mathbb{R}$ which projects everything onto the first component and $\dot e^*_2$ projects everything onto the second component. But we actually call them $dx$, $dy$ to be loosely consistent with the classical notation. So we interpret $dx$ as the linear map which takes $(x, y)^T$ to $x$, where $(x, y)^T$ is a point in the tangent space $T_a(\mathbb{R}^2)$ at some point $a$. Similarly for $dy$. So a differential 1-form or covector field on $\mathbb{R}^2$ is written
$$P(x, y)\,dx + Q(x, y)\,dy$$
The generalisation to $\mathbb{R}^n$ is of course
$$\sum_{i \in [1:n]} \omega_i(x)\,dx^i \quad \text{(or } \omega_i\,dx^i \text{ using the Einstein summation convention)}$$
and this, for smooth functions $\omega_i$, $i \in [1:n]$, represents a covector field or differential 1-form on $\mathbb{R}^n$. The preference for letters towards the end of the Greek alphabet to denote differential forms is widespread so again you ought to get used to it. The use of subscripts instead of superscripts for indices tells you something about the covariance or contravariance of the entities. I shall explain this properly shortly.
If you wonder why on earth anybody bothers to distinguish between vector fields and differential 1-forms, one answer is that it is natural to differentiate $k$-forms to get $(k+1)$-forms for $k \in \mathbb{N}$. This is what Stokes’ theorem is really all about. As you ought to have learnt in second year but probably didn’t.
2.6 The Tangent Functor
Suppose $f : X \to Y$ is a differentiable map between manifolds. Then for the case where $X = \mathbb{R}^n$ and $Y = \mathbb{R}^m$ there is a map between the tangent spaces at each point which takes the tangent space at $a \in X$ to the tangent space at $f(a) \in Y$. To take a tangent vector $v_a$ in the tangent space $T_a(X)$ to one in the tangent space $T_{f(a)}(Y)$ all we have to do is to operate on it by $Df(a)$, which is by definition a linear map and has the right dimensions for domain and codomain. If we are prepared to choose a basis for $T_a(X)$ and $T_{f(a)}(Y)$ we could represent $Df(a)$ by a matrix, and there is a perfectly sensible way of choosing the ‘same’ basis for tangent spaces over different points. All this makes sense even if $X$ and $Y$ are just finite dimensional real vector spaces without the extra structure of $\mathbb{R}^n$. In fact it makes sense in arbitrary Banach spaces.
Of course, there is a slight problem of how to extend this to manifolds which
are not Banach spaces. Spheres and tori spring to mind.
If we take $v_a$, and recall that it is a tangency equivalence class of curves $v : (-1, 1) \to X$ taking 0 to $a$, then $f \circ v$ is a curve through $f(a)$ and it specifies a tangency class. Moreover if $v'$ is tangent to $v$ at $a$ then $f \circ v'$ is tangent to $f \circ v$ at $f(a)$.
Exercise 2.6.1. Most of this should have been a second year exercise but probably wasn’t. Do it now and all about tangent vectors and maps will be clear. Well, clearer.
1. Let $f : \mathbb{R}^2 \to \mathbb{R}^2$ be defined by
$$\begin{pmatrix} x \\ y \end{pmatrix} \mapsto \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} x^2 + x + y + y^2 \\ 1 + xy \end{pmatrix}$$
Compute $f$ on the set of points $(t, 0)^T$ for $t \in [0, 1]$. Do this by choosing ten points along the interval and evaluating $f$ on them and plot them on a sheet of graph paper to obtain ten points which should lie on a smooth curve. Do the same for points on the interval $(0, t)^T$ for $t \in [0, 1]$.
2. Calculate $f\big((1/10, 0)^T\big)$ and $f\big((0, 1/10)^T\big)$ if you haven’t already.
3. Calculate $Df\big((1, 0)^T\big)$.
4. Evaluate the above matrix on the tangent vector $\dot e_1$.
5. Evaluate the above matrix on the tangent vector $\dot e_2$.
6. Map the two tangent vectors obtained by the last two jobs on the same graph.
7. Represent the tangent vector $\dot e_1$ by any curve $c_1$ in the tangency equivalence class and compose with $f$. Differentiate to find a linear representative of $Tf(0, \dot e_1)$.
8. Repeat for a curve $c_2$ representing $\dot e_2$.
9. Sketch the curves $f \circ c_1$ and $f \circ c_2$.
10. Prove the claim that if $v'$ is tangent to $v$ at $a$ then $f \circ v'$ is tangent to $f \circ v$ at $f(a)$.
It follows that $f$ induces a map $Tf$ which takes tangent vectors at $a$ to tangent vectors at $f(a)$. This process doesn’t, on the face of things, involve differentiation. Nor does it involve charts. Of course it does involve differentiation, as the last series of exercises shows convincingly. And it is easy to see that it goes through on charts for the usual reasons, which involve the chain rule.
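The claim that tangent curves are sent to tangent curves, and that the result agrees with multiplying by the derivative matrix, can be tried out in a few lines of Python. This is only a sketch: the map $f(x, y) = (x^2 - y,\ xy)$, the point $a$ and the vector $w$ are my own choices (deliberately not the map of the exercise above), and the velocity of the image curve is approximated by a central difference.

```python
def f(p):
    """A sample smooth map R^2 -> R^2, chosen only for illustration."""
    x, y = p
    return (x * x - y, x * y)

def jacobian_f(p):
    """The derivative matrix of f at p, worked out by hand."""
    x, y = p
    return ((2 * x, -1.0), (y, x))

def push_by_curve(curve, h=1e-6):
    """Approximate velocity at t = 0 of the image curve f o curve."""
    ahead, behind = f(curve(h)), f(curve(-h))
    return tuple((u - w_) / (2 * h) for u, w_ in zip(ahead, behind))

a = (1.0, 2.0)    # base point
w = (3.0, -1.0)   # tangent vector at a

# Two different curves through a with the same velocity w at t = 0.
straight = lambda t: (a[0] + t * w[0], a[1] + t * w[1])
bent = lambda t: (a[0] + t * w[0] + t * t, a[1] + t * w[1] - 5 * t * t)

J = jacobian_f(a)
by_matrix = tuple(J[i][0] * w[0] + J[i][1] * w[1] for i in range(2))

print(push_by_curve(straight), push_by_curve(bent), by_matrix)
```

All three printed vectors agree to within the accuracy of the finite difference, which is the content of the statement that $Tf$ is well defined on tangency classes and is given on each fibre by $Df(a)$.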
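In the case when we have a differentiable $f : \mathbb{R}^n \to \mathbb{R}^m$ the last exercises should convince you we have at each point $a \in X$ the diagram
$$\begin{array}{ccc} T_a(X) & \xrightarrow{\ Df(a)\ } & T_{f(a)}(Y) \\ \downarrow \pi_X & & \downarrow \pi_Y \\ X & \xrightarrow{\ f\ } & Y \end{array}$$
This diagram commutes, which means whichever way around you go you get the same result. We can do this for every point $a \in X$ to get the commutative diagram:
$$\begin{array}{ccc} TX & \xrightarrow{\ Tf\ } & TY \\ \downarrow \pi_X & & \downarrow \pi_Y \\ X & \xrightarrow{\ f\ } & Y \end{array}$$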
The process of taking a manifold and producing its tangent bundle is said to
be functorial because if we have two manifolds and a smooth map between
them the process gives a map between the bundles.
Instead of writing $Tf$ we often write $f_*$ for the same map. This is more general because it makes sense for some other vector bundles and not just the tangent bundle.
Such a map between tangent bundles is said to be fibre preserving, since it takes anything in the fibre over $a$ to the fibre over $f(a)$. And we can generalise this to maps between any fibre bundles, so they are also called bundle maps. If the fibre is a vector space we talk of vector bundles and we require the bundle maps to be linear, so the map $Tf$ is also a vector bundle map.
Note that the map Tf contains all the information about the derivative and
also tells you where things are, which the derivative (being only the linear
part of an affine map) does not. So this is actually cleaner and conceptually
simpler than the usual description of the business of differentiation. Another
way to put this, in the light of the last exercise, is that when you calculate
lots of partial derivatives you are merely trying to calculate the linear part of
an affine map which specifies a tangency equivalence class, that is, a tangent
vector.
We can usefully think of $Tf$ as coming in two parts, since locally the tangent bundle is simply a cartesian product of possible tangent vectors over a space with a part of the space. On the first part $Tf$ is simply $f$ and on the second part, the fibres, it is $Df$, the derivative of $f$. We can now choose to define the derivative of a smooth map this way. I have hankered after teaching calculus this way in first year. It is actually easier, probably because you need to isolate the core ideas in order to generalise things, and fronting up to the core ideas, although demanding at first, makes life a lot easier subsequently.
Note that the chain rule can now be formulated as
T(f ◦ g) = Tf ◦ Tg
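For maps from $\mathbb{R}$ to $\mathbb{R}$ this formulation is short enough to write out as code. The sketch below assumes a bundle map is modelled as a Python function taking (point, vector) to (point, vector); the helper `T` and the choice $f = \sin$, $g(x) = x^2$ are mine, and the derivatives are supplied by hand, so this illustrates the bookkeeping and does not prove the rule.

```python
import math

def T(f, Df):
    """Build the bundle map Tf on R x R from f and its derivative Df.

    Tf sends (point, vector) to (f(point), Df(point) * vector).
    """
    def Tf(point, vector):
        return f(point), Df(point) * vector
    return Tf

f, Df = math.sin, math.cos
g, Dg = (lambda x: x * x), (lambda x: 2 * x)

Tf, Tg = T(f, Df), T(g, Dg)

def Tf_after_Tg(point, vector):
    """The composite bundle map Tf o Tg."""
    return Tf(*Tg(point, vector))

# T(f o g), with the derivative of sin(x^2) written down directly.
T_fg = T(lambda x: math.sin(x * x), lambda x: math.cos(x * x) * 2 * x)

print(T_fg(1.5, 2.0), Tf_after_Tg(1.5, 2.0))
```

Note how the point component of $Tg$ is what tells $Tf$ where to evaluate $Df$: the usual chain rule formula $f'(g(x))\,g'(x)$ has to carry that information around by hand.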
Exercise 2.6.2. Confirm that the chain rule holds. This is also a part of
saying that T is functorial.
Exercise 2.6.3. Guess what a functor is and what it is a map between.
Confirm your guess by doing some googling. I warn against doing the googling
first.
Exercise 2.6.4. Take $f : \mathbb{R}^+ \to \mathbb{R}^+$ defined by $x \mapsto x^2$. Show this is a diffeomorphism. Let $V$ be the vector field on $\mathbb{R}^+$ which has constant vectors of length 1 at every point. Show that $Tf$ takes this into a new vector field on $\mathbb{R}^+$, and say what the new vector field is. Regarding the two vector fields as differential equations, find both solutions.
2.6.1 The (non-existent) Cotangent Functor
Suppose we have $f : X \to Y$ a smooth map between smooth manifolds, and we look to see what happens in the cotangent bundle. Thinking of a cotangent at $a \in X$ as a tangency equivalence class of maps from some neighbourhood of $a$ to $\mathbb{R}$, we see that the map between the fibres goes in the reverse direction. Given $v : W \subset Y \to \mathbb{R}$ as a representative function in the tangency equivalence class at $f(a)$ (with $f(a) \in W$), $f$ induces $v \circ f : f^{-1}(W) \to \mathbb{R}$ on $X$, which defines a cotangent vector at $a$. So we obtain the diagram:
$$\begin{array}{ccc} T^*_a X & \xleftarrow{\ f^*\ } & T^*_{f(a)} Y \\ \downarrow \pi_X & & \downarrow \pi_Y \\ X & \xrightarrow{\ f\ } & Y \end{array}$$
This makes $T^*$, a hypothetical induced map on the whole cotangent bundle, a mess, because it goes one way (left to right) on the space part and the opposite way (right to left) on the cotangent part. If $f$ has a smooth inverse we can get around this, but it is not so neat. Incidentally:
Definition 2.6.1. A smooth map with a smooth inverse is called a (smooth) diffeomorphism.
Exercise 2.6.5. How, if at all, can we relate the derivative of $f$ to $f^*$ when $X = \mathbb{R}^n$, $Y = \mathbb{R}^m$?
Remark 2.6.1. In older books, a covector field is called a contravariant
vector field and a vector field is called a covariant vector field. See for
example, Mackey’s Theoretical Foundations of Quantum Mechanics. As we
shall see later, a covariant vector field is a contravariant tensor field. Don’t
blame me for this.
This is all rather confusing on first encounter. Familiarity breeds acceptance and the best way to become familiar with these ideas is to work them through in very simple cases. So make up a set of exercises yourself in which you work with particular simple maps between very simple manifolds ($\mathbb{R}^n$ and $\mathbb{R}^m$ for $n, m$ small positive integers). As a start:
Exercise 2.6.6. Let $f : \mathbb{R} \to \mathbb{R}$ be given by $f(x) = x^2$. Put $a = 2$ and investigate what happens if we take (a) a tangent vector at 2 and (b) a cotangent vector at 4.
Now try it for $f : \mathbb{R}^2 \to \mathbb{R}^2$ with
$$\begin{pmatrix} x \\ y \end{pmatrix} \mapsto \begin{pmatrix} x^2 + y^2 \\ xy \end{pmatrix}$$
and some suitable points for $a$ and $f(a)$. In this case you can conveniently represent tangent vectors as columns and cotangents as rows.
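If you want to check your hand calculations for the second map, the columns and rows recipe is easy to mechanise. The following Python sketch assumes the point $a = (1, 2)$, the tangent vector $v$ and the cotangent $\omega$ shown, all of which are arbitrary choices of mine; `push_forward`, `pull_back` and `pair` are made-up names. Tangent vectors at $a$ are pushed forward by multiplying the derivative matrix on the right by a column, and cotangents at $f(a)$ are pulled back by multiplying it on the left by a row.

```python
def jacobian(a):
    """Derivative matrix of (x, y) -> (x^2 + y^2, xy) at the point a."""
    x, y = a
    return ((2 * x, 2 * y), (y, x))

def push_forward(a, column):
    """Send a tangent vector at a (a column) to one at f(a): J times column."""
    J = jacobian(a)
    return tuple(J[i][0] * column[0] + J[i][1] * column[1] for i in range(2))

def pull_back(a, row):
    """Send a cotangent at f(a) (a row) to one at a: row times J."""
    J = jacobian(a)
    return tuple(row[0] * J[0][j] + row[1] * J[1][j] for j in range(2))

def pair(row, column):
    """Evaluate a cotangent on a tangent vector."""
    return row[0] * column[0] + row[1] * column[1]

a = (1.0, 2.0)
v = (3.0, -1.0)       # tangent vector at a
omega = (0.5, 4.0)    # cotangent at f(a)

print(push_forward(a, v), pull_back(a, omega))
print(pair(omega, push_forward(a, v)), pair(pull_back(a, omega), v))
```

The last line prints the same number twice: evaluating $\omega$ on the pushed forward vector is the same as evaluating the pulled back cotangent on the original vector, which is just associativity of matrix multiplication and is the sense in which the two maps go in opposite directions.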
Exercise 2.6.7. Write out a lecture for first year students which describes tangent vectors on $\mathbb{R}$ in a really simple way as possible velocities along the line, and hence define the tangent bundle $\mathbb{R} \times \dot{\mathbb{R}}$. Define differentiation of maps from $\mathbb{R}$ to $\mathbb{R}$ in terms of bundle maps. Prove the chain rule as $T(f \circ g) = Tf \circ Tg$. Be prepared to answer any awful questions an intelligent student might ask.
Write out a lecture on ordinary differential equations in terms of sections of
the tangent bundle. Set up and solve some easy ones in this notation.
Do you think this is easier or harder than the traditional way of doing it?
Assume that since Mathematica can solve ODEs, the idea is not to train
students to jump through hoops but to get them to understand what they
are doing.
2.7 Autonomous Systems of ODEs
2.7.1 Systems of ODEs and Vector Fields
Consider the system of linear ordinary differential equations:
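$$\dot x = -y, \quad x(0) = 1$$
$$\dot y = x, \quad y(0) = 0$$
We can write this as a two dimensional problem:
$$\begin{pmatrix} \dot x \\ \dot y \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$$
or more succinctly:
$$\dot x = Ax \qquad (2.7.1)$$
where $A$ is the above matrix.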
The matrix $A$ defines a vector field on $\mathbb{R}^2$ by taking the location $x$ to the vector $A(x)$. We are now used to the idea of a vector field on $\mathbb{R}^2$ both visually, in terms of lots of little arrows stuck on the space (which can incidentally be generated quickly and painlessly using Mathematica), and algebraically, as a map from $\mathbb{R}^2$ to $\dot{\mathbb{R}}^2$ sending locations to arrows (with their tails attached to those locations).
Such a system of ordinary differential equations is called autonomous, mean-
ing that the vector field specified by the system doesn’t change in time.
Figure 2.7.1: A vector field or system of ODEs in $\mathbb{R}^2$
Consequently we can either refer to an Autonomous System of Ordinary Differential Equations defined on an open set $U \subseteq \mathbb{R}^n$, or we can talk about a Smooth Vector Field on $U$. The second is much shorter and easier to think about.
If we draw the vector field in the above case, we get arrows which go around the space in a positive direction as in figure 2.7.1.
A solution to the system of differential equations, or an integral curve for the vector field, is a map $f : \mathbb{R} \to \mathbb{R}^2$, usually written
$$\begin{pmatrix} x(t) \\ y(t) \end{pmatrix}$$
with the property that $\dot x$ and $\dot y$ satisfy the given system of equations. What this means is that we think of a point moving in $\mathbb{R}^2$ so that its velocity at any point is just the vector attached to that point. So the solution curve has to have the vector field tangent to it always.
It is possible to learn to solve autonomous systems of differential equations
without ever understanding that they are all about vector fields which give
the velocity of a moving point, and that a solution is simply a function which
says where the moving point is at any time, and which agrees with the given
vector field in what the velocity vector is. This is a pity.
In the above case, you can see by looking at the system what the solution is: obviously the solution orbits are circles, and given the initial condition where at time $t = 0$ we start at the point $(1, 0)^T$, the solution can be written down as
$$x = \cos(t), \quad y = \sin(t)$$
and it is easy to verify that this works.
Exercise 2.7.1. Do it.
Obviously, solving initial value ODE problems for more complicated vector fields isn’t going to be so easy, and doing it in dimensions greater than three by the ‘look at it and think’ method also looks doomed. So it is desirable to have a general rule for getting out the solution. Fortunately this is easy enough for linear vector fields in principle, although the calculations can be messy in practice. But again, that’s what computers are for.
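As an illustration of what the computer does when it follows the arrows, here is a minimal Python sketch which integrates the rotation field $(x, y) \mapsto (-y, x)$ from $(1, 0)$ with the classical fourth order Runge-Kutta method. The choice of method, the step count and the function names are my assumptions; nothing in the notes prescribes them, and any reasonable ODE stepper would do.

```python
import math

def field(p):
    """The vector field of the system: the arrow attached to the point p."""
    x, y = p
    return (-y, x)

def rk4_step(p, dt):
    """One classical Runge-Kutta step of size dt along the field."""
    def shifted(q, k, s):
        return (q[0] + s * k[0], q[1] + s * k[1])
    k1 = field(p)
    k2 = field(shifted(p, k1, dt / 2))
    k3 = field(shifted(p, k2, dt / 2))
    k4 = field(shifted(p, k3, dt))
    return (p[0] + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
            p[1] + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)

def integrate(p, t_end, steps):
    """Follow the field from p for time t_end using equal steps."""
    dt = t_end / steps
    for _ in range(steps):
        p = rk4_step(p, dt)
    return p

x, y = integrate((1.0, 0.0), t_end=1.0, steps=100)
print(x - math.cos(1.0), y - math.sin(1.0))
```

The two printed differences are tiny, so the moving point which obeys the arrows ends up, to good accuracy, where the solution $(\cos t, \sin t)$ says it should at time $t = 1$.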
2.7.2 Exponentiation of Things
I did this in second year M213 but some of you may have missed out on it in
which case here it is. Those of you who did it can read this rather quickly.
If you write down the usual series for the exponential function you get:
$$\exp(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!} + \cdots$$
Now think about this and ask yourself what x has to be for this to make
sense. You are used to x being a real number, but it should be obvious that
it could equally well be a complex number. After all, what do you do with
x? Answer, you have to be able to multiply it by itself lots of times, and you
have to be able to scale it by a real number, and you have to be able to add
the results of this. You also have to have an identity to represent $x^0$. Oh, and
you need to be able to take limits of these things. So it will certainly work
for x a real or a complex number. But it also makes sense if x is a square
matrix. Or, with any system where the objects can be added and scaled and
multiplied by themselves. And have limits of sequences of these things.
The name of a system of objects which can be added and scaled by real numbers is a vector space, and a vector space where the vectors can also be multiplied is called an algebra. We can do exponentiation in any algebra which has a norm and a multiplicative identity. (And it would be a help if it was complete in that norm, i.e. limits of Cauchy sequences exist.) The square $n \times n$ matrices form such an algebra. We can also hope to take sequences of them and maybe have them converge to some matrix. So we can exponentiate square matrices.
Exercise 2.7.2. Exponentiate the matrix $A$ in equation 2.7.1. Now exponentiate the matrix $tA$. Do you recognise the result?
It should be obvious that we could, in principle, calculate the exponential of
a matrix to some number of terms, and if the infinite sum makes sense and
the sequence of partial sums converges, then we could always get some sort of
estimate of exp(A) for any matrix A by computing enough terms. We would
hope that multiplying A by itself n times would give some reasonable sort of
matrix, and when we divided all the entries by n! we would get something
pretty close to the zero matrix. If this happened for all the n past some
point, then we could optimistically suppose that exp(A) was some matrix
which we could at least get better and better approximations to, which after
all is exactly what we have with exp(x) for x a real number.
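The partial sums described here take only a few lines of plain Python. This is a sketch under obvious assumptions: matrices are lists of rows, the number of terms (30) is an arbitrary cut-off which is ample for small matrices with modest entries, and `mat_mul` and `mat_exp` are names I have made up. The two test matrices are deliberately not the matrix $A$ of the exercise, so as not to spoil it.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, terms=30):
    """Partial sum I + A + A^2/2! + ... of the exponential series."""
    n = len(A)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]          # holds A^k / k!, starting at k = 0
    for k in range(1, terms):
        term = mat_mul(term, A)
        term = [[entry / k for entry in row] for row in term]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

# A nilpotent matrix: the series stops after two terms.
exp_shear = mat_exp([[0.0, 2.5], [0.0, 0.0]])
# A diagonal matrix: each diagonal entry is exponentiated separately.
exp_diag = mat_exp([[1.0, 0.0], [0.0, 2.0]])

print(exp_shear)
print(exp_diag, math.e, math.e ** 2)
```

For the nilpotent matrix every power beyond the first vanishes, so the sum is exact; for the diagonal one the terms $2^k/k!$ shrink so fast that thirty of them already agree with $e$ and $e^2$ to machine precision.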
Exercise 2.7.3. Define the norm of an $n \times n$ matrix $A$ to be
$$\|A\| = \sup_{\|x\| = 1} \|A(x)\|$$
as in an earlier problem, and show that $\|A^2\| \le (\|A\|)^2$. Hence prove that the function $\exp$ is always defined for any $n \times n$ matrix.
Exercise 2.7.4. If $e^{tA} = \exp(tA)$ denotes a map from $\mathbb{R}$ to the space of $n \times n$ matrices (sending $t$ to $e^{tA}$), show that its derivative is $Ae^{tA}$.
There are other algebras where a bit of exponentiation makes sense, so be
prepared for them.
2.7.3 Solving Linear Autonomous Systems
In principle this is now rather trivial:
Proposition 2.7.1. If ˙ x = Ax is an autonomous linear system of ODEs
with x(0) = a, then
x = e
tA
a
is the solution.
Proof:
Differentiating $e^{tA}$ gives $Ae^{tA}$ by the last exercise, and since $\exp(0) = I$, the
identity matrix, the initial value $x(0) = a$ is satisfied. So it is certainly a
solution.
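As a sanity check on the proposition, here is a minimal Python sketch that integrates $\dot{x} = Ax$ numerically and compares the result with $e^{tA}a$. The matrix `A`, the initial value `a`, and the helper `rk4_linear` are all assumptions made for this illustration; the closed form used is specific to this `A`.

```python
import math

def rk4_linear(A, a, t_end, steps):
    """Integrate x' = A x from x(0) = a up to t_end with classical Runge-Kutta."""
    n = len(a)

    def f(x):
        return [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]

    def shifted(x, c, k):
        return [xi + c * ki for xi, ki in zip(x, k)]

    h = t_end / steps
    x = list(a)
    for _ in range(steps):
        k1 = f(x)
        k2 = f(shifted(x, h / 2, k1))
        k3 = f(shifted(x, h / 2, k2))
        k4 = f(shifted(x, h, k3))
        x = [xi + h / 6 * (p + 2 * q + 2 * r + s)
             for xi, p, q, r, s in zip(x, k1, k2, k3, k4)]
    return x

# Hypothetical data. For this A, e^{tA} = e^t * [[1, t], [0, 1]].
A = [[1.0, 1.0],
     [0.0, 1.0]]
a = [2.0, 3.0]
t = 1.0

closed_form = [math.exp(t) * (a[0] + t * a[1]), math.exp(t) * a[1]]
numeric = rk4_linear(A, a, t, 1000)
print(closed_form)
print(numeric)
```

The two printed vectors should agree to many decimal places, which is what the proposition predicts.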
If this looks a bit like a miracle and in need of explanation, you are thinking
sensibly and merely need to do more of it. It may help to note that the
exponential function is the unique function with slope at a point the same
as the value at the point, and that this leads to the general solution for the
linear ODE in dimension one, and that this goes over to higher dimensions
with no essential changes. In effect, the exponential function was invented to
solve all these cases. It actually goes deeper than this, see Vladimir Arnold’s
book Ordinary Differential Equations.
2.7.4 Existence and Uniqueness
Could you have two different solutions (or more)? No, not for linear systems,
but this requires thought. Certainly the 1-dimensional ODE given by
$$\dot{x}(t) = 3x^{2/3}, \quad x(0) = 0$$
has the solution $x(t) = t^3$ but also the solution $x(t) = 0$. It also has infinitely
many other solutions. (Can you find some?) Of course this is not a linear
ODE, but it is clear that some sort of conditions will need to be imposed
before we can look at vector fields which are not linear and expect them to
have solutions. Happily, there is a simple one which guarantees at least local
existence and uniqueness:
Theorem 2.7.1. If $f : U \subseteq \mathbb{R}^n \to \mathbb{R}^n$ is a continuously differentiable
vector field, then for any point $\mathbf{a}$ in $U$ there is a neighbourhood $W \subseteq U$
of $\mathbf{a}$ containing a solution to the system of equations $\dot{x} = f(x)$ with $\mathbf{a}$ as
initial value, and the solution is unique. Moreover, there is a continuously
differentiable map $F : W \times J \to \mathbb{R}^n$ for some interval $J = (-a, a)$ about $0 \in \mathbb{R}$
such that for all $\mathbf{b}$ in $W$, the map $F_{\mathbf{b}} : J \to \mathbb{R}^n$ is the solution for initial
value $\mathbf{b}$ at $t = 0$.
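The example $\dot{x} = 3x^{2/3}$ from just before the theorem fails the hypothesis at $x = 0$, and that can be seen numerically. The following Python sketch checks that both quoted solutions satisfy the equation, and that the difference quotient of the right-hand side at 0 is unbounded. The function names are invented for this illustration, and $x^{2/3}$ is read as the square of the real cube root so that negative $x$ is allowed.

```python
def rhs(x):
    """Right-hand side 3 * x^(2/3), using the real cube root so x < 0 is allowed."""
    return 3.0 * abs(x) ** (2.0 / 3.0)

def slope(curve, t, h=1e-6):
    """Central-difference estimate of the derivative of curve at t."""
    return (curve(t + h) - curve(t - h)) / (2.0 * h)

def cubic(t):
    return t ** 3

def zero(t):
    return 0.0

# Both curves pass through 0 at t = 0 and both satisfy x' = 3 x^(2/3),
# up to the error of the finite-difference estimate.
residuals = [abs(slope(curve, t) - rhs(curve(t)))
             for curve in (cubic, zero)
             for t in (-1.0, 0.0, 0.5, 2.0)]
print("largest residual:", max(residuals))

# The difference quotient (rhs(x) - rhs(0)) / x grows without bound as x -> 0,
# so the field is not continuously differentiable at 0.
quotients = [rhs(x) / x for x in (1e-3, 1e-6, 1e-9)]
print(quotients)
```

So the theorem is not contradicted: the vector field in the example is continuous but not continuously differentiable at the initial point.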
There is a proof in Hirsch and Smale’s Differential Equations, Dynamical
Systems and Linear Algebra, pages 163 to 169.
There is a better proof in Arnold’s book on page 213. It is actually the
same proof but much better explained. It is given for the general (non-
autonomous) case. Both arguments use the contraction mapping theorem.
You should read through it if you have not already done a proof in your
ODEs course. Assuming you did one.
The results follow easily from a more basic result sometimes called The
Straightening Out Theorem (in Arnold, "The basic theorem of the theory of
ordinary differential equations" or the rectification theorem; see chapter 2).
The theorem says that in a neighbourhood $U$ of a point of $\mathbb{R}^n$ where the
(continuously differentiable) vector field is non-zero, we can find a one-one
differentiable map from $U$ to $W \subseteq \mathbb{R}^n$ with a differentiable inverse, such that
the transformed vector field on $W$ is uniform and constant.
Given that we can do that, we could also make the vectors all have length
one and lie along the $x_1$ axis in $\mathbb{R}^n$ with a rotation and scaling. The system
of ODEs then would be, in this transformed region $W$, the rather boring
system:
$$\dot{x}_1 = 1, \quad \dot{x}_2 = 0, \quad \ldots, \quad \dot{x}_n = 0$$
with the solution
$$x_1(t) = t + a_1; \quad x_2(t) = a_2; \quad \ldots; \quad x_n(t) = a_n$$
If you believe in the Straightening Out Theorem, then it is obvious that any
continuously differentiable vector field has at any point where the vector field
is non-zero a solution which is unique in some neighbourhood of the point
and which depends smoothly on the point. All we have to do is to map the
straight line boring solution(s) back by the differentiable inverse.
Exercise 2.7.5. Prove the last remark.
When the vector field is zero at a point, the solution is the constant function
taking all of R to the point. So there is a unique solution here too.
Remark 2.7.1. You will find a proof of the straightening out theorem in
Arnold. I shan’t prove it in this course on the grounds that this isn’t a course
on ODEs. At least, I don’t think it is.
Remark 2.7.2. It should be obvious that although we have looked at
systems of ordinary differential equations on $\mathbb{R}^n$, the fact that everything is
defined locally means that they ship over to any smooth manifold. If the
manifold is compact then the completeness is guaranteed, and the solution
can be found by doing everything in charts and piecing the bits together.
2.8 Flows
I rather slithered over one important point, which is the question of whether
we always get a solution for all time, past and future. It is not hard to see
that the vector field $X(x) = x^2$ on $\mathbb{R}$, with initial condition $x(0) = 1$, has the solution
$$x(t) = \frac{1}{1 - t}$$
which goes off to infinity in finite time. From this we deduce that it is
not in general possible to ensure that there is a solution for all time, and
this explains the cautious statement of the last theorem. The best we can
hope to do, the theorem tells us, for a smooth vector field at a point is to
find a neighbourhood of the point in which there is a parametrised curve,
x(t) : t ∈ (−a, a) where if we are lucky a will be ∞ and if we aren’t it will
be some possibly rather small positive number.
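The blow-up can be checked directly. The short Python sketch below (function names are assumptions of this illustration) verifies by finite differences that $x(t) = 1/(1 - t)$ satisfies $\dot{x} = x^2$ with $x(0) = 1$, and prints how fast the solution grows as $t$ approaches 1 from below.

```python
def solution(t):
    """The solution x(t) = 1 / (1 - t) with x(0) = 1, defined for t < 1."""
    return 1.0 / (1.0 - t)

def slope(curve, t, h=1e-7):
    """Central-difference estimate of the derivative of curve at t."""
    return (curve(t + h) - curve(t - h)) / (2.0 * h)

# The estimated derivative should match x(t)^2 at each sample time.
for t in (-2.0, 0.0, 0.5, 0.9):
    x = solution(t)
    print(t, x, slope(solution, t), x * x)

# The solution leaves every bounded set before time 1.
for t in (0.9, 0.99, 0.999, 0.9999):
    print(t, solution(t))
```

Here the interval of existence to the right of 0 has length 1, however smooth the vector field is.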
Definition 2.8.1. A vector field on $U \subseteq \mathbb{R}^n$ is said to be complete if any
solution can be extended to the whole real line.
Exercise 2.8.1. Show that if a vector field has compact support then it is
complete.
Exercise 2.8.2. Show that if $U$ is the unit open ball in $\mathbb{R}^n$ centred on the
origin and $X$ is a smooth vector field on $U$, then if $X$ is complete, and if
$\mathrm{Proj}(X(x), x)$ is the projection of $X(x)$ on $x$, then
$$\lim_{\|x\| \to 1} \mathrm{Proj}(X(x), x) = 0$$
Remark 2.8.1. It should be obvious that there are not many physical situa-
tions where things go belting off to infinity in finite time, and for that reason
I shall restrict myself from now on to complete vector fields. If I forget to
put the word in, put it in yourself. Also put the word ‘smooth’ in front of
the term ‘vector field’ whenever it occurs since I shall not consider any other
sort.
The business of getting a solution is going to work not just for the point we
selected as our starting point but also for neighbouring points provided we
don’t go too far away. In the happy case where the vector field has solutions
for all time, the space U on which the vector field is defined is decompos-
able as a set of integral curves, since solutions can’t intersect each other, or
themselves, although they can, of course, be closed loops. This statement
follows from the uniqueness of a solution. Hence we deduce that a vector
field gives rise to what is called a foliation of the space into integral curves.
You can, perhaps, guess that partial differential operators more complicated
than vector fields will give rise to higher dimensional foliations, decomposing
the space into surfaces and other manifolds.
Exercise 2.8.3. Describe the foliation of $\mathbb{R}^2$ by the vector field
$$-y \frac{\partial}{\partial x} + x \frac{\partial}{\partial y}$$
Recall that in second year (M213) we discussed the idea of groups acting on
sets and came to the conclusion that they were conveniently seen as homo-
morphisms from a group G into the group Aut(V ) of maps from the set V
into itself. Then a complete smooth vector field $X$ on $U \subseteq \mathbb{R}^n$ gives rise to
an action of the group $\mathbb{R}$ on $U$ as follows:
$$x : \mathbb{R} \times U \to U, \quad (t, x_0) \mapsto x(t)$$
where $x(t)$ is the integral curve of $X$ with $x(0) = x_0$.
To prove this is indeed a group action, we need to show that $x(0, x_0) = x_0$
for every $x_0$, which follows immediately from my definition of $x$. (Since the
additive identity of $\mathbb{R}$ is 0.) We also need to show that
$$\forall\, s, t \in \mathbb{R},\ \forall\, x_0 \in U, \quad x(s, x(t, x_0)) = x(s + t, x_0)$$
which merely means that if you travel for time $t$ from $x_0$ along the solution
curve, and then go on for time $s$, this gives the same result as travelling for
time $s + t$ from the starting point $x_0$, which is, after all, what we expect a
solution curve to do.
If we fix $t$ and look to see what the group action does, it is a map from $\mathbb{R}^n$
to itself. Well, we knew that. It is a truth that this map is always a smooth
diffeomorphism. The old fashioned way of saying this is that the solutions
depend smoothly upon the initial conditions, but I much prefer the modern
way of saying it. You should be able to see that all we are doing is taking
each point as input, and outputting the point it will get to after time t.
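To make the group action concrete, here is a small Python sketch for one vector field whose flow can be written down by hand. The field $X = x\,\partial/\partial x - y\,\partial/\partial y$ on $\mathbb{R}^2$ is a hypothetical example chosen for this illustration (it is not one of the fields in the exercises); its integral curve through $(x_0, y_0)$ is $(e^t x_0, e^{-t} y_0)$.

```python
import math

def flow(t, p):
    """Flow of the hypothetical field X = x d/dx - y d/dy on R^2.

    The integral curve through p = (x0, y0) is (e^t x0, e^-t y0).
    """
    x0, y0 = p
    return (math.exp(t) * x0, math.exp(-t) * y0)

p = (1.5, -2.0)
s, t = 0.7, -1.3

# Identity: travelling for time 0 goes nowhere.
print(flow(0.0, p))

# Composition: time t followed by time s equals time s + t.
print(flow(s, flow(t, p)))
print(flow(s + t, p))

# Inverse: the time -t map undoes the time t map.
print(flow(-t, flow(t, p)))
```

For a fixed `t` the map `p -> flow(t, p)` is the "take each point to where it gets after time t" map described above.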
Proposition 2.8.1. For a complete smooth vector field $X$ on $U$ open in
$\mathbb{R}^n$, for any $t \in \mathbb{R}$, the map $x_t : U \to U$, which sends $x_0$ to $x(t, x_0)$, is a
diffeomorphism of $U$.
Proof:
The map $x_t$ certainly has an inverse, $x_{-t}$. And the theorem on existence of
solutions to an ODE establishes that the map is continuously differentiable
when $X$ is. So if $X$ is smooth, so is $x_t$.
Remark 2.8.2. The set of diffeomorphisms $\{x_t : t \in \mathbb{R}\}$, or in other words
the map $x : \mathbb{R} \times U \to U$, is called in old fashioned books a one-parameter
group of diffeomorphisms. I shall simply say that the map $x$ obtained from
the vector field $X$ is the flow of $X$.
Remark 2.8.3. Given a flow $x$ on $U \subseteq \mathbb{R}^n$ we can always recover the vector
field by simply taking any point $a$ and differentiating the map $x_a : \mathbb{R} \to U$
which sends $t$ to $x(t, a)$ at $t = 0$. This must give us the required vector field
from which the flow can be derived. So there is a correspondence between
flows and vector fields.
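The recipe in this remark is easy to try numerically. The sketch below uses a hypothetical flow on $\mathbb{R}^2$, namely $(t, (x_0, y_0)) \mapsto (e^t x_0, e^{-t} y_0)$, and estimates its $t$-derivative at $t = 0$ by a central difference; the function names are assumptions of this illustration.

```python
import math

def flow(t, p):
    """A hypothetical flow on R^2: (x0, y0) goes to (e^t x0, e^-t y0)."""
    x0, y0 = p
    return (math.exp(t) * x0, math.exp(-t) * y0)

def field_from_flow(flow_map, p, h=1e-6):
    """Estimate d/dt flow_map(t, p) at t = 0 by a central difference."""
    ahead = flow_map(h, p)
    behind = flow_map(-h, p)
    return tuple((u - v) / (2.0 * h) for u, v in zip(ahead, behind))

# The recovered vector at (x, y) should be close to (x, -y),
# i.e. the field x d/dx - y d/dy.
for p in [(1.5, -2.0), (0.0, 3.0), (-1.0, 1.0)]:
    print(p, field_from_flow(flow, p))
```

Reading off the components at each point gives back the vector field whose integral curves make up the flow.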
You now have four ways of thinking about vector fields. They are bunches
of arrows tacked onto a space; they are autonomous systems of ordinary
differential equations. And they are also flows, obtained by solving the au-
tonomous system. And last but not least they are operators on the algebra of
smooth functions from the space to R. This demonstrates that vector fields
are more interesting and complicated than you might have supposed.
I shall give one important feature of vector fields which arises from this
multiple perspective and which is much less obvious if you stick only to
systems of ordinary differential equations.
2.9 Lie Brackets
Write, as is conventional in some areas, $X$ and $W$ for two vector fields in
$\mathcal{V}(\mathbb{R}^n)$, and bear in mind that we can compose any such operators to get
$X \circ W$ and $W \circ X$ (which we write $XW$ and $WX$ for short). In general the
result is a perfectly good operator but some calculations will rapidly convince
you that $XW$ is not, in general, a vector field operator but something much
nastier.
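Example 2.9.1. Let $V = -y\,\partial/\partial x + x\,\partial/\partial y$ and $W = x\,\partial/\partial x + y\,\partial/\partial y$.
Then $VWh$ is
$$-xy \frac{\partial^2 h}{\partial x^2} - y \frac{\partial h}{\partial x} - y^2 \frac{\partial^2 h}{\partial x \partial y} - 0 + x^2 \frac{\partial^2 h}{\partial y \partial x} + x \frac{\partial h}{\partial y} + xy \frac{\partial^2 h}{\partial y^2} + 0$$
and $WVh =$
$$-xy \frac{\partial^2 h}{\partial x^2} + 0 + x^2 \frac{\partial^2 h}{\partial y \partial x} + x \frac{\partial h}{\partial y} - y^2 \frac{\partial^2 h}{\partial x \partial y} - y \frac{\partial h}{\partial x} + xy \frac{\partial^2 h}{\partial y^2} + 0$$
Neither of these look like a vector field operating on $h$. If however we take
the difference, $VW - WV$ we get some happy cancellation and wind up with
$$VW - WV = \left(-y \frac{\partial}{\partial x} + x \frac{\partial}{\partial y}\right) - \left(x \frac{\partial}{\partial y} - y \frac{\partial}{\partial x}\right) = 0$$
which is a vector field although not a very interesting one.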
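The cancellation in Example 2.9.1 can be checked numerically by treating $V$ and $W$ as operators on functions. In the Python sketch below, partial derivatives are estimated by central differences, and the test function `h` is a hypothetical choice made for this illustration; nothing here is a proof, just a spot check at a few points.

```python
import math

EPS = 1e-4  # finite-difference step (an arbitrary choice for this sketch)

def partial_x(h):
    return lambda x, y: (h(x + EPS, y) - h(x - EPS, y)) / (2.0 * EPS)

def partial_y(h):
    return lambda x, y: (h(x, y + EPS) - h(x, y - EPS)) / (2.0 * EPS)

def V(h):
    """V = -y d/dx + x d/dy applied to the function h."""
    hx, hy = partial_x(h), partial_y(h)
    return lambda x, y: -y * hx(x, y) + x * hy(x, y)

def W(h):
    """W = x d/dx + y d/dy applied to the function h."""
    hx, hy = partial_x(h), partial_y(h)
    return lambda x, y: x * hx(x, y) + y * hy(x, y)

def h(x, y):
    """A hypothetical smooth test function."""
    return x ** 3 * y + math.sin(x) * y ** 2

VWh = V(W(h))
WVh = W(V(h))

for point in [(0.7, -1.2), (1.0, 2.0), (-0.5, 0.3)]:
    print(point, VWh(*point), WVh(*point), VWh(*point) - WVh(*point))
```

The first two printed columns are far from zero, while their difference is zero up to finite-difference error, in line with $VW - WV = 0$.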
Exercise 2.9.1. Write down another pair of vector fields $V, W$ on $\mathbb{R}^2$ and
compute $VW - WV$. Check to see if you always get the zero vector field.
What is it telling you about the vector fields when V W −WV = 0? (Some
intelligent conjectures would be of interest but only if supported by evidence
not used in framing the conjecture.)
Exercise 2.9.2. If $X = P(x, y)\,\partial/\partial x + Q(x, y)\,\partial/\partial y$ and $W = R(x, y)\,\partial/\partial x + S(x, y)\,\partial/\partial y$,
calculate $XW - WX$ and verify that it is a vector field.
Exercise 2.9.3. Compute $XW - WX$ for $X, W \in \mathcal{V}(\mathbb{R}^n)$ and show it is a
vector field in $\mathcal{V}(\mathbb{R}^n)$. Show that this also holds for $\mathcal{V}(U)$ for any open set
$U \subseteq \mathbb{R}^n$.
All this gives the following definition:
Definition 2.9.1. The Lie Bracket or Poisson Bracket of two vector fields
$X, W$ in $\mathcal{V}(U)$ for $U \subseteq \mathbb{R}^n$ is written $[X, W]$ and defined by
$$[X, W] \triangleq XW - WX$$
It is a multiplication on the vector space of vector fields on $U$.
Exercise 2.9.4. Do some simple calculations preferably for $U \subseteq \mathbb{R}^1$ and
convince yourself that the Lie bracket multiplication is not in general associative
but does satisfy the Jacobi Identity:
$$\forall X, Y, Z \in \mathcal{V}(U), \quad [X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0$$
Exercise 2.9.5. Prove that the Jacobi Identity is always satisfied for Vector
Fields.
The Lie bracket almost makes the vector space of vector fields on $U$, an
open subset of $\mathbb{R}^n$, into an algebra, which you will recall is merely a vector
space where the vectors can be multiplied, to make a ring. Here the Lie
Bracket operation fails to be associative in general, but a vector space with a
non-associative multiplication which satisfies the Jacobi Identity is,
notwithstanding, called a Lie Algebra. There are others besides these and again
algebraists have gone to town on investigating abstract Lie Algebras. Well,
we wouldn't like them to be at a loose end and hang around street corners [3].
[3] Although they'd probably have an interesting line in graffiti.
Exercise 2.9.6. Prove $[X, (Y + Z)] = [X, Y] + [X, Z]$ and $[(X + Y), Z] = [X, Z] + [Y, Z]$.
Prove also that $\forall a \in \mathbb{R}$, $[aX, Y] = a[X, Y]$ and $[X, aY] = a[X, Y]$.
Remark 2.9.1. The above properties you will recognise as bilinearity.
Exercise 2.9.7. Investigate the relation between [hX, Y ], [X, hY ] and h[X, Y ].
It should be apparent that although the calculations tend to be messy and
provide great scope for making errors, they are not essentially difficult. A
natural candidate for a good symbolic algebra package, you might say.
Exercise 2.9.8. Is there a multiplicative identity for the Lie Bracket
operation on vector fields? That is, is there a vector field $J$ such that for every
other vector field, $X$, $[J, X] = X$? (Hint: what is $[J, J]$?)
You might be interested in an area of applications of these ideas. If so read
on.
It is easy to find the solution, $h(x, y) = x^2 + y^2$, to the PDE
$$-y \frac{\partial h}{\partial x} + x \frac{\partial h}{\partial y} = 0$$
Now this is one solution, and finding a single solution is very nice, but we
usually want the general solution. In this particular case you can probably
guess it. But in general, if we have some linear partial differential operator
L acting on F, a suitable space of smooth functions, and if we want the set
of all solutions of Lh = 0, then it will usually be a lot harder to find them.
This process is aided by the following idea: The set of solutions of $Lh = 0$ is going
to be a linear subspace of $F$, by definition of the term linear operator. Call
it $F_0$. Now a symmetry of the solution space of the operator $L$, often called
a symmetry of the operator $L$, is some vector field operator $X$ such that $X$
takes $F_0$ into itself, i.e. whenever $h$ is a solution to $Lh = 0$, so is $Xh$. If we
know the collection of all symmetry operators for L and we have a solution,
then we can find all the other solutions. In trivial cases this will amount
to no more than adding in arbitrary constant functions, but in non-trivial
cases it will do a whole lot more than this. So it would be a good idea to
be able to find, for a given L, the set of all symmetries X for L. It is clear
that the Poisson-Lie bracket can be used for any pair of linear operators, not
just vector fields. The following observation goes some way to explaining our
interest in them:
Proposition 2.9.1. If $[L, X] = gL$ for some function $g \in F$, then $X$ is a
symmetry of $L$.
Proof:
We need to show that $\forall h \in F_0$, $L(Xh) = 0$. Now
$$LX - XL = gL \ \Rightarrow\ LX = gL + XL$$
and
$$\forall h \in F_0, \quad LXh = (gL + XL)h = gLh + XLh = 0 + 0 = 0$$
$\Box$
Exercise 2.9.9. Prove the converse, that if X is a (vector space) symmetry
of L, then [L, X] = gL for some g ∈ F.
Exercise 2.9.10. What symmetry is involved in finding the general solution
to the equation
$$-y \frac{\partial h}{\partial x} + x \frac{\partial h}{\partial y} = 0$$
and how does it give the general solution?
Now it is possible to prove that the set of all vector space symmetries of an
operator $L$ is itself a Lie Algebra, which is one reason for wanting to know
more about them.
Some students of PDEs want to know why it is that the standard partial
differential equations all had their variables separable: does this happen for
all possible PDEs and why does it work for these cases? The answer to this
question is rather long and may be found in Volume 4 of the Encyclopedia
of Mathematics and Its Applications, Symmetry and Separation of Variables
by Willard Miller. It has a lot to do with Lie Algebras.
It is now possible to state properly a significant problem.
Going back to the idea of flows, it makes sense to discover whether flows
commute. For a suitable pair of flows, $x, y : \mathbb{R} \times U \to \mathbb{R}^n$ we can start off
from $a \in U$ and go by flow $x$ for a time $s$ and then by flow $y$ for time $t$. This
will get us to some point in $U$, written naturally enough as $y_t \circ x_s(a)$. Or we
could go the other way around, first by $y$ and then by $x$ to get $x_s \circ y_t(a)$. If
we always wind up at the same point for any starting point and any pair of
times $s, t$ then we may say that the flows commute.
Then when the flows x, y correspond to the vector fields X, Y , we have the
following result: x and y commute iff [X, Y ] = 0. You can see that this works
for the case of the two vector fields V, W in Example 2.9.1.
At present we lack the machinery to prove this result economically, so I shall
skip it until it is needed.
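Although the proof is deferred, the statement can be tried out numerically. The Python sketch below integrates two pairs of vector fields on $\mathbb{R}^2$ with a classical Runge-Kutta step and compares the two orders of travel. The fields `V` and `W` are those of Example 2.9.1; the fields `P` and `Q` are a hypothetical non-commuting pair added for contrast (with $P = \partial/\partial x$ and $Q = x\,\partial/\partial y$ one finds $[P, Q] = \partial/\partial y \neq 0$). The function names are assumptions of this illustration.

```python
import math

def rk4_flow(field, p, t, steps=2000):
    """Approximate the time-t flow of `field` starting at p (classical RK4)."""
    h = t / steps
    x, y = p
    for _ in range(steps):
        k1 = field(x, y)
        k2 = field(x + h / 2 * k1[0], y + h / 2 * k1[1])
        k3 = field(x + h / 2 * k2[0], y + h / 2 * k2[1])
        k4 = field(x + h * k3[0], y + h * k3[1])
        x += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        y += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return (x, y)

def V(x, y):
    return (-y, x)       # V = -y d/dx + x d/dy

def W(x, y):
    return (x, y)        # W = x d/dx + y d/dy

def P(x, y):
    return (1.0, 0.0)    # hypothetical: P = d/dx

def Q(x, y):
    return (0.0, x)      # hypothetical: Q = x d/dy

def gap(F, G, a, s, t):
    """Distance between 'F for time s, then G for time t' and the other order."""
    one = rk4_flow(G, rk4_flow(F, a, s), t)
    other = rk4_flow(F, rk4_flow(G, a, t), s)
    return math.hypot(one[0] - other[0], one[1] - other[1])

a = (1.0, 0.5)
s, t = 0.8, 0.6
print("V, W:", gap(V, W, a, s, t))   # essentially zero
print("P, Q:", gap(P, Q, a, s, t))   # close to s * t for this pair
```

For the second pair the two routes end up a distance $st$ apart, which can also be checked by hand from the explicit flows $(x + s, y)$ and $(x, y + tx)$.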
Remark 2.9.2. Again all this makes perfectly good sense on manifolds for
the usual reasons. The idea of thinking of a vector field on a manifold $M$
as a special kind of operator on $C^\infty(M)$ ensures that we can compose them
and add them and subtract them, so the Lie Bracket makes sense there too,
and we can write down vector fields on manifolds via charts and find integral
curves for them and so foliate the manifold.
Exercise 2.9.11. Demonstrate the truth of the last remark by doing some
of these things on $S^1$ and, if you are feeling very brave, $S^2$.
2.10 Conclusion
This has been a quick introduction to the ideas of smooth manifolds and
vector fields on them. There are whole books dedicated to these ideas and
you will find some in the library. You will find some of these ideas covered
very quickly in chapters two and three of the text book, which you should
read and satisfy yourself that it is intelligible. You should be able to see why
definitions are as they are.
Chapter 3
Tensors and Tensor Fields
This chapter deals with the machinery needed to talk about differential
geometry, although it only starts on actually doing so. It contains the information
in Chapter four of the text book and goes into the algebra in more detail.
This is because we are doing it right, on account of being mathematicians
and therefore feeling uneasy about relying on our intuitions without being
able to check the logic. We also cover part of Chapter one of Part three,
where the text book is decidedly scrappy, and part of Chapter five of Part
one. I do things in a slightly different order. You should however read the
text book in conjunction with the notes and do the exercises.
I also throw in a few remarks about the exterior calculus and Stokes’ Theo-
rem, not because this is part of the course but because it is a part of every
educated person’s background in the twenty-first century. It has to be ad-
mitted that there aren’t many educated people around, but then there never
have been.
3.1 Tensors
3.1.1 Natural and Unnatural Isomorphisms
Let $V$ denote a real vector space of finite dimension, so $V$ is isomorphic to
$\mathbb{R}^n$ for some $n \in \mathbb{Z}^+$. Then I can define the space of shifts of $V$ which is a
collection of maps from $V$ to $V$ by taking any $v \in V$ and writing
$$\hat{v} : V \to V, \quad \forall w \in V,\ w \mapsto w + v$$
The set of all such maps I shall call $\hat{V}$. The map $\hat{v}$ is the map that adds $v$ to
everything. I can compose such maps, and it is immediate that $\hat{u} \circ \hat{v} = \widehat{u + v}$.
Similarly, for every $t \in \mathbb{R}$, $\widehat{tu} = t\hat{u}$ where we scale maps in the usual way,
$(tf)(x) = t(f(x))$.
This makes $\hat{V}$ a vector space and gives an isomorphism
$$\hat{\ } : V \to \hat{V}, \quad \forall v \in V,\ v \mapsto \hat{v}$$
Exercise 3.1.1. Confirm that this is an isomorphism of vector spaces.
This isomorphism is natural, which means (in part) that given $f : U \to V$, a
linear map between real vector spaces, we get a map from $\hat{U}$ to $\hat{V}$:
$$f : U \to V \ \rightsquigarrow\ \hat{f} : \hat{U} \to \hat{V}, \quad \hat{f}(\hat{u}) = \widehat{f(u)}$$
Note that we can specify the isomorphism and the map $\hat{f}$ without making
any reference to a basis for $U$ or $V$. This is the other part of what we mean
by natural. I cannot define naturalness properly without an excursion into
category theory which I am hoping to avoid, but the idea is sufficiently clear
for present purposes. I hope.
I shall write $U \cong V$ when vector spaces $U$ and $V$ are isomorphic and $U \overset{N}{\cong} V$
when they are naturally isomorphic.
Exercise 3.1.2. Take the space L(R, V ), of linear maps from R to V , and
show this is also a vector space, naturally isomorphic to V .
The space of shifts being naturally isomorphic to V leads to two pictures
of a vector space, one has got points in it and the other has got arrows in
it. We can certainly think of a shift map taking u to u + v as an arrow
from u to u + v in the original space, and the map itself as a whole lot of
arrows, all basically showing where each point starts and finishes under the
map. And since the spaces are isomorphic we can cheerfully think in either
one. Physicists do this all the time as do applied mathematicians, and so
they confuse the two distinct things, points and arrows, and usually this does
no harm; in fact the more ways you have of thinking about something the
easier it is to solve problems, so it actually does some good. It is, however,
probably better to confuse things when you know you are doing it, rather
than just being confused.
Now I define the space $V^*$, the dual space to $V$.
Definition 3.1.1. $V^*$ is the set $L(V, \mathbb{R})$ of linear maps from $V$ to $\mathbb{R}$ with
the usual rules for addition and scaling of maps, viz, $(tf)(v) = t(f(v))$ and
$(f + g)(v) = f(v) + g(v)$, for every $t \in \mathbb{R}$, every $f, g \in V^*$ and every $v \in V$.
Exercise 3.1.3. Confirm that $V^*$ is indeed a vector space of the same
dimension as $V$.
Exercise 3.1.4. Show that a basis for $V$ determines a basis for $V^*$ in an
obvious way. It is called the dual basis. In $\mathbb{R}^{2*}$, $[1, 0]$ is the dual to
$\begin{bmatrix} 1 \\ 0 \end{bmatrix} \in \mathbb{R}^2$, and so on. Note that my usage of representing vectors as columns (and
elements of $\mathbb{R}^{n*}$ as rows) is consistent with standard matrix notation and
makes it easier to distinguish $\mathbb{R}^n$ from its dual space.
Remark 3.1.1. The standard (ordered) basis in $\mathbb{R}^n$ is often written as the
ordered set $(e_1, e_2, \cdots, e_n)$ which saves writing lots of columns. $e_j$ is the
column of $n$ numbers which has a 1 in the $j$th place and a zero everywhere
else. I shall often write $(e^1, e^2, \cdots, e^n)$ for the dual basis. You can think of $e^j$
either as a row matrix with $n$ entries, with the $j$th entry 1 and all the others
zero, or you can think of it as the projection onto the $j$th axis, according to
taste. People who cannot tell subscripts from superscripts are going to have
a hard time with tensors.
It follows from the last exercise that $V$ and $V^*$ are isomorphic, but the
isomorphism is not natural. Given $U$ and $V$ real vector spaces, and a linear
$f : U \to V$, we get a map $f^* : V^* \to U^*$ defined by
$$f^* : V^* \to U^*, \quad \forall g \in V^*,\ g \mapsto g \circ f$$
This map is the wrong way around. The term contravariant is used for
things like this. Again I am being a trifle vague here in order to avoid a long
discursion.
If $V = \mathbb{R}^n$ then I find it helpful to write the elements of $\mathbb{R}^n$ as column arrays.
Then it is natural to write the elements of $\mathbb{R}^{n*}$ as row arrays. Then this
makes it clear that the latter act on the former (by matrix multiplication) so
there is a map
$$\mathbb{R}^{n*} \times \mathbb{R}^n \to \mathbb{R}, \quad \left( [a_1, a_2, \cdots, a_n],\ \begin{bmatrix} x^1 \\ x^2 \\ \vdots \\ x^n \end{bmatrix} \right) \mapsto a_1 x^1 + a_2 x^2 + \cdots + a_n x^n$$
Physicists write the thing on the right as $a_i x^i$ by what is called the Einstein
summation convention which means that a repeated lower index and upper
index is short for a sum over all possible values of the index. This explains
why we use lower indices or subscripts for covectors, elements of the dual
space, and superscripts or upper indices for the components of a vector. It
makes writing squares and higher powers a real bugger, but fortunately we
don't have to do that very often.
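In code the summation convention is just a loop over the repeated index. A minimal Python sketch, with made-up component values and a hypothetical helper name `pair`:

```python
def pair(covector, vector):
    """Evaluate a_i x^i: sum over the repeated index i."""
    if len(covector) != len(vector):
        raise ValueError("covector and vector must have the same length")
    return sum(a_i * x_i for a_i, x_i in zip(covector, vector))

a = [2.0, -1.0, 0.5]   # row: components a_1, a_2, a_3 of a covector
x = [1.0, 4.0, 6.0]    # column: components x^1, x^2, x^3 of a vector

print(pair(a, x))      # 2*1 + (-1)*4 + 0.5*6

# A dual basis element picks out one component of the vector.
e_upper_2 = [0.0, 1.0, 0.0]
print(pair(e_upper_2, x))
```

Python lists do not distinguish rows from columns, so the covector/vector distinction lives only in the argument order here.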
Note that this generalises: there is a map
$$V^* \times V \to \mathbb{R}, \quad (g, v) \mapsto g(v)$$
All I have done with my rows and columns is specify the maps and the vectors
by arrays of numbers. This is so we can do sums. Usually rather horrid sums,
but that is what Mathematica and MATLAB are for.
Note that confusing a space and its dual is not a good idea: physicists did this
and got themselves in a bit of a mess in consequence. They are isomorphic,
at least when finite dimensional, but not naturally isomorphic, so it is a good
idea to keep them separate.
Exercise 3.1.5. Define the unit vector at 0 on $\mathbb{R}$ to be the tangency
equivalence class of the map $i : \mathbb{R} \to \mathbb{R}$ given by $i(t) = t$. Then $i \in \dot{\mathbb{R}}_0$ is a basis
element. Define $dx : \dot{\mathbb{R}}_0 \to \mathbb{R}$ by $dx(i) = 1$. The identity map $x : \mathbb{R} \to \mathbb{R}$
goes over to the identity map $\dot{\mathbb{R}} \to \dot{\mathbb{R}}$, and it takes the tangent vector $i$ to
itself. What does it do to $dx$?
Note that all the above spaces are isomorphic and all maps are pretty much
the identity map if you are prepared to be sloppy. Examine which of the
various maps are covariant and which contravariant.
Exercise 3.1.6. Let $V$ be the space of all real valued functions defined on $\mathbb{R}$.
Is it the case that $V$ and $V^*$ are isomorphic? If so provide an isomorphism.
Exercise 3.1.7. Show that an isomorphism between a vector space V and
its dual provides a quadratic form on V which, if positive definite, defines
an inner product on V . Show that an inner product on V determines an
isomorphism between V and its dual whenever V is finite dimensional. Is it
true when V is not finite dimensional?
3.1.2 Multilinearity
Definition 3.1.2. A bilinear map $f : U \times V \to W$ for real vector spaces
$U, V, W$ is one such that
$$\forall u, u' \in U,\ \forall v \in V,\ \forall s, t \in \mathbb{R}, \quad f(su + tu', v) = s\, f(u, v) + t\, f(u', v)$$
and
$$\forall u \in U,\ \forall v, v' \in V,\ \forall s, t \in \mathbb{R}, \quad f(u, sv + tv') = s\, f(u, v) + t\, f(u, v')$$
We can describe this by saying that f is linear in each variable separately.
The field can in fact be any field you like as long as it is the same field for
U,V and W .
Exercise 3.1.8. Find a bilinear map from $\mathbb{R} \times \mathbb{R}$ to $\mathbb{R}$.
Definition 3.1.3. For any $u \in U$ and a bilinear map $f : U \times V \to \mathbb{R}$ I can
write $f_{(u,-)} : V \to \mathbb{R}$ as the map
$$f_{(u,-)} : V \to \mathbb{R}, \quad v \mapsto f(u, v)$$
Similarly for any $v \in V$, $f_{(-,v)} : U \to \mathbb{R}$ sends $u$ to $f(u, v)$.
We can describe bilinearity of $f$ by saying that $f$ is linear in each variable
separately, meaning that for any $u \in U$, $f_{(u,-)}$ is linear and for any $v \in V$,
$f_{(-,v)}$ is linear.
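The "fix one argument" maps of Definition 3.1.3 correspond to partial application in code. Here is a minimal Python sketch; the bilinear map `f` on $\mathbb{R}^2 \times \mathbb{R}^2$ (given by a hypothetical matrix `B`), the helper names, and the sample vectors are all assumptions of this illustration, and checking a few samples is of course not a proof of linearity.

```python
B = [[1.0, 2.0],
     [0.0, -1.0]]   # hypothetical coefficients

def f(u, v):
    """A bilinear map R^2 x R^2 -> R, namely f(u, v) = sum_ij u[i] B[i][j] v[j]."""
    return sum(u[i] * B[i][j] * v[j] for i in range(2) for j in range(2))

def fix_first(g, u):
    """Return g_(u,-), the map v -> g(u, v)."""
    return lambda v: g(u, v)

def fix_second(g, v):
    """Return g_(-,v), the map u -> g(u, v)."""
    return lambda u: g(u, v)

def linear_on_sample(g, x, y, s, t):
    """Check g(s x + t y) = s g(x) + t g(y) for one choice of x, y, s, t."""
    combo = [s * xi + t * yi for xi, yi in zip(x, y)]
    return abs(g(combo) - (s * g(x) + t * g(y))) < 1e-12

u, u2 = [1.0, 2.0], [-3.0, 0.5]
v, v2 = [4.0, -1.0], [0.0, 2.0]

print(linear_on_sample(fix_second(f, v), u, u2, 2.0, -3.0))  # linear in the first slot
print(linear_on_sample(fix_first(f, u), v, v2, 2.0, -3.0))   # linear in the second slot
```

Note that `f` itself is not linear on pairs: scaling both arguments by 2 scales the value by 4.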
Two exercises which may help later in understanding some technicalities:
Exercise 3.1.9.
1. Show that when $V$ is finite dimensional, $V^{**}$, the dual of $V^*$, is
naturally isomorphic to $V$. That is, show there is an isomorphism which
does not require a basis of either space to specify it, and that $f : U \to V$
induces a map $f^{**} : U^{**} \to V^{**}$.
2. Show that if $\mathrm{Bil}(U \times V, W)$ is the vector space of bilinear maps from
$U \times V$ to $W$ and $L(A, B)$ is the vector space of linear maps from $A$ to
$B$ for any real vector spaces $A$ and $B$ then
$$\mathrm{Bil}(U \times V, W) \overset{N}{\cong} L(U, L(V, W))$$
where $\overset{N}{\cong}$ denotes a natural isomorphism of vector spaces.
Now we generalise the idea of bilinearity which deals with maps from $U \times V$
to a vector space, to multilinearity which has more than just two terms in
the product.
Definition 3.1.4. If we have a $k$-fold cartesian product of real vector spaces
$U_1 \times U_2 \times \cdots \times U_k$ and if $j \in [1 : k]$ we can take $(u_1, u_2, \cdots, u_k)$ in this
product, and for any
$$f : U_1 \times U_2 \times \cdots \times U_k \to \mathbb{R}$$
we define $f_{(u_1, u_2, \cdots \hat{u}_j, \cdots u_k)} : U_j \to \mathbb{R}$, where the $\hat{u}_j$ means that the $j$th term has
been replaced by $-$, to be the map which sends $u_j$ to $f(u_1, u_2, \cdots, u_j, \cdots, u_k)$.
Note that $\hat{u}_j$ has absolutely nothing to do with shift maps!
Definition 3.1.5. A $k$-multilinear map $f : U_1 \times U_2 \times \cdots \times U_k$ to $\mathbb{R}$ for real
vector spaces $U_j$, $j \in [1 : k]$ is a map such that, if we keep all but one term of
$(u_1, u_2, \cdots, u_k)$ fixed, representing this as $(u_1, u_2, \cdots \hat{u}_j, \cdots u_k)$, then
$$f_{(u_1, u_2, \cdots \hat{u}_j, \cdots u_k)}$$
is a linear map from $U_j$ to $\mathbb{R}$, for any $j \in [1 : k]$ and any $(u_1, u_2, \cdots \hat{u}_j, \cdots u_k)$.
This is actually quite simple but a swine to write down. If you find it
confusing write down a trilinear map from $\mathbb{R}^2 \times \mathbb{R}^2 \times \mathbb{R}^2$ to $\mathbb{R}$.
Definition 3.1.6. A covariant $k$-tensor on a vector space $V$ is a multilinear
map
$$f : \underbrace{V \times V \times \cdots \times V}_{k \text{ copies}} \to \mathbb{R}$$
We write $T^k(V)$ for the vector space (under the usual addition and scaling of
maps) of all covariant $k$-tensors on $V$. By convention, a 0-tensor on a (real)
vector space $V$ is a real number.
Definition 3.1.7. A contravariant $\ell$-tensor on a vector space $V$ is a
multilinear map
$$g : \underbrace{V^* \times V^* \times \cdots \times V^*}_{\ell \text{ copies}} \to \mathbb{R}$$
We write $T_\ell(V)$ for the vector space of contravariant $\ell$-tensors on $V$.
Since a covariant 1-tensor on $V$ is actually an element of $V^*$ and a
contravariant 1-tensor on $V$ is actually an element of $V^{**} \overset{N}{\cong} V$, there is a case for saying
that the names co and contra should be swapped around. But I haven’t got
the nerve.
We can have mixed tensors which are covariant in some arguments and
contravariant in others. When a physicist writes down something like $g_{\mu,\nu}$ the
fact that they are subscripts tells you that this is a covariant tensor, the fact
that there are two of them tells you it is a bilinear map from $V \times V$ to $\mathbb{R}$, and
almost certainly $V$ is either $\mathbb{R}^3$ or possibly the tangent space or the cotangent
space to the manifold we live in. When a physicist writes
$$g^{\tau}_{\mu,\nu}$$
he has a tensor $g : V \times V \times V^* \to \mathbb{R}$ for some $V$ which he frequently forgets to
specify on the grounds that he knows, as do all right thinking people, what it
is. He uses subscripts for the coefficients $g_{\mu,\nu}$ so that he can use superscripts
for the things they operate on and use the Einstein convention.
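Definition 3.1.8. We talk of a $\begin{pmatrix} k \\ \ell \end{pmatrix}$ type tensor $\omega$ on $V$ when it is covariant
of order $k$ and contravariant of order $\ell$, that is when it is a multilinear map
$$\omega : \underbrace{V \times V \times \cdots \times V}_{k \text{ copies}} \times \underbrace{V^* \times V^* \times \cdots \times V^*}_{\ell \text{ copies}} \to \mathbb{R}$$
We write $T^k_\ell(V)$ for the space of type $(k, \ell)^T$ tensors on $V$. $T^k_0(V)$ is written
$T^k(V)$ and $T^0_\ell(V)$ is written $T_\ell(V)$.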
I shall expand on this when I explain tensor fields which comes up next.
Definition 3.1.9. A covariant $k$-tensor $\omega$ is symmetric iff
$$\omega(u_1, u_2, \cdots, u_k) = \omega(u_2, u_1, u_3, \cdots, u_k)$$
and whenever we swap any two arguments the result is the same.
Definition 3.1.10. A covariant $k$-tensor $\omega$ is alternating (or antisymmetric)
iff
$$\omega(u_1, u_2, \cdots, u_k) = -\omega(u_2, u_1, u_3, \cdots, u_k)$$
and whenever we swap any two arguments the sign only is changed.
Note that we can say this more easily: a covariant $k$-tensor is symmetric
iff it is invariant under the symmetric group $S_k$ acting on the arguments; in
algebra, $\omega \in T^k(V)$ is symmetric iff $\omega = \omega \circ \sigma$ for every $\sigma$ in the permutation
group $S_k$ on the arguments of $\omega$. And if it is antisymmetric then it is invariant
under the group $A_k$ of even permutations. If $\sigma$ is a permutation of the set of arguments, we write
$\mathrm{sgn}(\sigma)$ to be $+1$ if $\sigma$ is an even permutation and $-1$ if it is odd. Then we
can say $\omega$ is alternating iff $\omega = \mathrm{sgn}(\sigma)\, \omega \circ \sigma$ for every $\sigma \in S_k$.
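The permutation form of the two definitions is easy to test on examples. The Python sketch below is an illustration under stated assumptions: the tensors `dot`, `det2` and `triple` and the sample vectors are chosen for this example, the helper names are made up, and checking finitely many samples only shows that a tensor "looks" symmetric or alternating.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation of 0..k-1, computed by counting inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def compose(omega, perm):
    """omega o sigma: evaluate omega on the arguments reordered by perm."""
    return lambda *args: omega(*(args[i] for i in perm))

def looks_symmetric(omega, k, samples):
    return all(compose(omega, p)(*args) == omega(*args)
               for p in permutations(range(k)) for args in samples)

def looks_alternating(omega, k, samples):
    return all(sign(p) * compose(omega, p)(*args) == omega(*args)
               for p in permutations(range(k)) for args in samples)

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]      # a symmetric 2-tensor on R^2

def det2(u, v):
    return u[0] * v[1] - u[1] * v[0]      # an alternating 2-tensor on R^2

def triple(u, v, w):
    """u . (v x w), an alternating 3-tensor on R^3."""
    cross = (v[1] * w[2] - v[2] * w[1],
             v[2] * w[0] - v[0] * w[2],
             v[0] * w[1] - v[1] * w[0])
    return sum(a * b for a, b in zip(u, cross))

pairs = [((1, 2), (3, 4)), ((0, 5), (-2, 7))]
triples = [((1, 0, 2), (0, 3, 1), (4, 1, 0))]

print(looks_symmetric(dot, 2, pairs), looks_alternating(dot, 2, pairs))
print(looks_symmetric(det2, 2, pairs), looks_alternating(det2, 2, pairs))
print(looks_symmetric(triple, 3, triples), looks_alternating(triple, 3, triples))
```

Integer sample vectors are used so that the equality tests are exact rather than subject to rounding.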
Alternating k-tensors are also known as k-forms and are important for later
work. They have everything to do with orientation.
Definition 3.1.11. We write $\Omega^k(V^n)$ for the space of alternating covariant
$k$-tensors on the vector space $V$ having dimension $n$.
3.1.3 Dimension of Tensor spaces
You can either read this carefully or simply do the exercises at the end of
the subsection. Or you can do both. As long as you find out how to do the
exercises!
The space of $k$-tensors on $V^n$ is obviously a vector space because we can
add and scale the maps; the sum of two tensors of type $(k, \ell)^T$ is obviously
another tensor of the same type, and likewise scaling such a tensor by a real
number gives another tensor of the same type. Since the set of all maps from
any $X$ to $\mathbb{R}$ is a vector space, the type $(k, \ell)^T$ tensors form a linear subspace.
Example 3.1.1. Suppose ω is any $(2, 0)_T$ tensor on $\mathbb{R}$. Then we put ω(1, 1) =
a. Then by multilinearity, keeping the second component fixed we deduce
that ω(x, 1) = xa for any $x \in \mathbb{R}$, and now keeping the first component fixed
we see that ω(x, y) = xya. Thus the tensor is specified by just one number,
a, and so the space of covariant 2-tensors on $\mathbb{R}$ is a one dimensional vector
space, having the tensor with ω(1, 1) = 1 as a basis element.
We note that ω is always a symmetric tensor. There is precisely one alternating
$(2, 0)_T$ tensor on $\mathbb{R}$ and it is the zero map. So the space of alternating
$(2, 0)_T$ tensors on $\mathbb{R}$ is zero dimensional. The zero tensor is both symmetric
and antisymmetric.
To get a basis for the covariant k-tensors on $\mathbb{R}^n$, we need to specify the maps
on every choice of basis elements. For example, for 2-tensors on $\mathbb{R}^2$ we know
the multilinear map completely if we know it on $(e_1, e_1)$, $(e_1, e_2)$, $(e_2, e_1)$
and $(e_2, e_2)$. Then multilinearity will guarantee us the value on any pair of
vectors, each in $\mathbb{R}^2$. The extension to higher order tensors and different n
is obvious, and by taking any basis for V we get the same conclusion. This
gives the obvious result, the dimension of the space of covariant k-tensors
on $V^n$ is $n^k$. For example, we can take as a basis for $T^2(\mathbb{R}^2)$ the four maps
defined by the four columns:
$$\begin{array}{llll}
(e_1, e_1) \mapsto 1 & (e_1, e_1) \mapsto 0 & (e_1, e_1) \mapsto 0 & (e_1, e_1) \mapsto 0 \\
(e_1, e_2) \mapsto 0 & (e_1, e_2) \mapsto 1 & (e_1, e_2) \mapsto 0 & (e_1, e_2) \mapsto 0 \\
(e_2, e_1) \mapsto 0 & (e_2, e_1) \mapsto 0 & (e_2, e_1) \mapsto 1 & (e_2, e_1) \mapsto 0 \\
(e_2, e_2) \mapsto 0 & (e_2, e_2) \mapsto 0 & (e_2, e_2) \mapsto 0 & (e_2, e_2) \mapsto 1
\end{array}$$
Then it is obvious that these four maps are linearly independent and that
any bilinear map from $\mathbb{R}^2 \times \mathbb{R}^2$ to $\mathbb{R}$ is a linear combination of these.
Exercise 3.1.10. Prove the last remark.
Remark 3.1.2. We can write the map given by the first column as dx ⊗ dx,
the second column map as dx ⊗ dy, the third as dy ⊗ dx and the last as dy ⊗ dy.
I shall explain this neat notation later.
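To see the four-element basis at work, here is a small sketch under the assumption that a 2-tensor on $\mathbb{R}^2$ is stored as a Python function of two tuples. It reads off the four components of a tensor by evaluating it on basis pairs and then rebuilds the tensor from those four numbers. The helper names `basis_tensor`, `components` and `from_components` are invented for the illustration, and indices are 0-based, so `basis_tensor(0, 1)` plays the role of the second column.

```python
from itertools import product


def basis_tensor(i, j):
    """Bilinear map sending (e_i, e_j) to 1 and every other basis pair to 0."""
    return lambda u, v: u[i] * v[j]


def components(omega, n):
    """Values of a 2-tensor on all n*n ordered pairs of standard basis vectors."""
    basis = [tuple(1 if r == c else 0 for c in range(n)) for r in range(n)]
    return {
        (i, j): omega(basis[i], basis[j])
        for i, j in product(range(n), repeat=2)
    }


def from_components(comp):
    """Rebuild the tensor as a linear combination of the basis tensors."""
    return lambda u, v: sum(
        c * basis_tensor(i, j)(u, v) for (i, j), c in comp.items()
    )


# An arbitrary bilinear map on R^2, chosen only for illustration.
def omega(u, v):
    return 2 * u[0] * v[0] - u[0] * v[1] + 7 * u[1] * v[1]


comp = components(omega, 2)
rebuilt = from_components(comp)
print(comp)
print(omega((3, 4), (5, 6)), rebuilt((3, 4), (5, 6)))
```

Four numbers go in, and the rebuilt map agrees with the original on any pair of vectors, which is what it means for the four maps to span.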
If the tensors are of mixed type $(k, \ell)_T$, then by taking the dual basis for
the contravariant tensors we get the dimension is $n^{k+\ell}$. The Riemannian
Curvature tensor, which you may meet later, is a $(3, 1)_T$ tensor and in $\mathbb{R}^4$,
spacetime, it therefore lives in a space of dimension $4^4 = 256$. This means it takes 256
numbers to specify it. Fortunately it has a lot of symmetries which reduce
the number of independent components to 20, otherwise nobody would have the patience to do any
calculations with it.
The space of alternating covariant 2-tensors is obviously a subspace of the
space of all covariant 2-tensors: on $\mathbb{R}^2$, we do not need to look at what any
such tensor ω does to $(e_1, e_1)$ or to $(e_2, e_2)$ because it has to be zero. Similarly if we know
it on $(e_1, e_2)$ we know its value on $(e_2, e_1)$, it is just the negative. So if we
know its value on $(e_1, e_2)$ we know it completely, and since a basis for the
space $\Omega^2(\mathbb{R}^2)$ is the single alternating tensor which sends $(e_1, e_2)$ to 1,
the dimension of $\Omega^2(\mathbb{R}^2)$ is one. Since it is easily verified that the alternating
map which sends $(e_1, e_2)$ to 1 is the determinant of the matrix formed by
putting the two vectors as adjacent columns, we see that the determinant is
a basis for the space $\Omega^2(\mathbb{R}^2)$ of alternating 2-tensors on $\mathbb{R}^2$.
Exercise 3.1.11. Easily verify the above claim.
In $\mathbb{R}^3$ we have three basis elements, $(e_1, e_2, e_3)$, and if we look to see what
we have as a basis for the alternating two tensors we observe that we know
any such alternating ω if we know it on $(e_1, e_2)$, $(e_2, e_3)$ and $(e_1, e_3)$. For
every other pair of basis elements, the result is forced by knowing ω on these
three together with the fact that ω is alternating. Since we have only three
choices of real numbers to make in order to nail down a particular alternating
2-tensor on $\mathbb{R}^3$, the dimension of $\Omega^2(\mathbb{R}^3)$ is 3. And for $\mathbb{R}^n$, all we have to
do is to take pairs $e_i, e_j$ with $i < j$, and again knowing ω on these tells us
everything about ω. There are $n(n-1)$ ordered ways of choosing two different basis
vectors from $\mathbb{R}^n$, and we need half of them, so the dimension of $\Omega^2(\mathbb{R}^n)$ is
$n(n-1)/2$.
And finally, we can choose k distinct basis elements from the set of n in
$n(n-1)(n-2)\cdots(n-k+1)$ ways and each such way can be permuted in
k! ways and we need only one of them. We can choose a suitable basis on
which to define an alternating k-tensor to be the set $\{e_{i_1}, e_{i_2}, \dots, e_{i_k}\}$ with
$i_1 < i_2 < \dots < i_k$ and this can be done in
$${}^{n}C_{k} = \frac{n!}{k!\,(n-k)!}$$
ways, the number of ways of choosing k things from n. So the dimension of $\Omega^k(V^n)$ is
${}^{n}C_{k}$.
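The count can be checked by brute force. This is only a sketch of the counting step, assuming we label basis vectors by the integers 1 to n; the function name `alternating_basis_indices` is invented here. It lists the strictly increasing index tuples and compares their number with the binomial coefficient from Python's `math.comb`.

```python
from itertools import combinations
from math import comb


def alternating_basis_indices(n, k):
    """All index tuples (i_1 < i_2 < ... < i_k) drawn from 1..n."""
    return list(combinations(range(1, n + 1), k))


# The three pairs that pin down an alternating 2-tensor on R^3.
print(alternating_basis_indices(3, 2))

# Compare the brute-force count with n!/(k!(n-k)!) for small n and k.
for n in range(1, 6):
    for k in range(0, n + 2):
        assert len(alternating_basis_indices(n, k)) == comb(n, k)
```

Note that for k greater than n the list is empty, which matches the fact that there are no non-zero alternating k-tensors in that case.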
The space of symmetric tensors is similar except that the value of ω is no
longer forced to be zero when the same basis element is chosen twice. In
$\mathbb{R}^2$ the 2-tensor ω is determined if we know $\omega(e_1, e_1)$, $\omega(e_1, e_2)$, $\omega(e_2, e_1)$,
and $\omega(e_2, e_2)$. If we know it is symmetric we don’t need both $\omega(e_1, e_2)$ and
$\omega(e_2, e_1)$. So the dimension of the symmetric 2-tensors on $\mathbb{R}^2$ is three. The
symmetric k-tensors on $\mathbb{R}^n$ have as a basis the maps defined by their values
on $\{e_{i_1}, e_{i_2}, \dots, e_{i_k}\}$ with $i_1 \le i_2 \le \dots \le i_k$. For the 2-tensors on $\mathbb{R}^n$ we
can choose two elements in $n^2$ ways and we can notice that n of these have
both elements the same. The remaining $n(n-1)$ ways have the subscripts
different and we can select half of them. So the dimension is $n(n-1)/2 + n =
n(n+1)/2$. I leave you to work out the dimension of the space of symmetric
k-tensors on $\mathbb{R}^n$.
Note that the space $\Omega^n(\mathbb{R}^n)$ always has dimension 1. Taking the ω defined by
taking the value one on $(e_1, e_2, e_3, \dots, e_n)$ in that order, we observe that we
have a particularly simple alternating n-form on $\mathbb{R}^n$. It is called the volume
element, and its value on any set of n vectors in some order can be calculated
using multilinearity. If we write each vector out as a column, the result is
the determinant of the resulting n × n matrix. This is a good way to define
the determinant.
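Here is a sketch of that calculation, with the assumption that vectors are tuples of numbers and with the function name `volume_element` chosen for the illustration. Expanding each argument in the standard basis and using multilinearity leaves one term for each way of picking n distinct basis vectors, that is one term per permutation, and the alternating property supplies the sign.

```python
from itertools import permutations


def sign(perm):
    """Return +1 for an even permutation of 0..n-1 and -1 for an odd one."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def volume_element(*vectors):
    """Evaluate the alternating n-form with value 1 on (e_1, ..., e_n).

    Each vector must have as many entries as there are vectors.
    """
    n = len(vectors)
    total = 0
    for perm in permutations(range(n)):
        # Coefficient of (e_perm[0], ..., e_perm[n-1]) in the expansion,
        # times the value sgn(perm) of the form on that basis tuple.
        term = sign(perm)
        for slot, index in enumerate(perm):
            term *= vectors[slot][index]
        total += term
    return total


print(volume_element((1, 0), (0, 1)))
print(volume_element((2, 3), (5, 7)))
print(volume_element((1, 2, 3), (4, 5, 6), (7, 8, 10)))
```

Swapping two arguments flips the sign of the result, and repeating an argument gives zero, as an alternating form must.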
Note that we could write out a basis for $\Omega^k(\mathbb{R}^n)$ for $k \le n$ in terms of
the possible choices of k row elements by taking the determinant of the
result. Thus alternating tensors are all about determinants or, alternatively,
determinants are all about alternating tensors.
Exercise 3.1.12.
1. By evaluating an ω in the space of $(2, 0)_T$ tensors on $\mathbb{R}^2$ on the elements
$(e_1, e_1)$, $(e_2, e_1)$, $(e_1, e_2)$, $(e_2, e_2)$ to get (a, b, c, d) respectively, show that
ω is defined by a 2 × 2 matrix using suitable matrix operations.
2. Show that this is equivalent to acting on the pair of vectors $(\mathbf{a}, \mathbf{b})$ by a
2-tensor φ by writing the matrix as A and calculating $\mathbf{a}^T A \mathbf{b}$.
3. Complete the scruffy arguments used to obtain the dimension of $\Omega^k(\mathbb{R}^n)$
which took suitable basis elements of
$$\underbrace{\mathbb{R}^n \times \mathbb{R}^n \times \cdots \times \mathbb{R}^n}_{k \text{ terms}}$$
to define a set of multilinear maps, with the implied belief that we can
extract a basis for $\Omega^k(\mathbb{R}^n)$ by fixing suitable values. In particular show
that a set of such maps is linearly independent and spans $\Omega^k(\mathbb{R}^n)$.
4. Show that the symmetric $(2, 0)_T$ tensors form a subspace. What is the
dimension? Give a basis for it.
5. Do the same for the alternating $(2, 0)_T$ tensors.
6. Show that the determinant acting on
$\begin{bmatrix} x \\ y \end{bmatrix}, \begin{bmatrix} u \\ v \end{bmatrix}$
by taking the two vectors to $vx - uy$ is an alternating 2-tensor on $\mathbb{R}^2$.
7. Show that any other alternating 2-tensor on $\mathbb{R}^2$ is a multiple of this by
a real number.
8. Repeat for $(0, 2)_T$ tensors and again for $(1, 1)_T$ tensors.
9. Find the dimension of the linear space of all covariant k-tensors on $\mathbb{R}^n$.
10. Find the dimension of the space of all alternating covariant k-tensors on
$\mathbb{R}^n$. Hint: start off in a small way by looking for a non-zero alternating
2-tensor on $\mathbb{R}$ and showing there aren’t any. You have done alternating
2-tensors on $\mathbb{R}^2$. The determinant is a basis for the alternating 3-tensors
on $\mathbb{R}^3$, and this generalises. The alternating 2-tensors on $\mathbb{R}^3$
have a basis obtained by choosing two rows of the three rows made up of
the two vectors side by side, and producing the 2 × 2 determinant on the
entries. This leads to three basis elements. Show this by looking at the
four choices of a pair of bases. Now generalise to higher dimensions.
Finally generalise to higher order alternating tensors. It’s a fair bit of
work but will burn the elements of the exterior algebra of alternating
tensors into your brain for ever.
Once you have done the work of finding out how you manipulate them,
finding out the actual use and hence the point of the things is painless.
Remark 3.1.3. If we take the space of $(2, 0)_T$ tensors on $\mathbb{R}^2$ or $\mathbb{R}^3$ and
represent them as spaces of 2 × 2 and 3 × 3 matrices, you might think that
we have captured all the properties needed for $(2, 0)_T$ tensors and they are
merely matrices dressed up. You might conclude that the same holds for
$(0, 2)_T$ tensors and for $(1, 1)_T$ tensors. We have a bit of a problem however
if we decide to change the basis. Obviously this will change the matrix
representing a particular tensor, even if we agree to use the same basis for
both occurrences of $V$ or $V^*$, or if we use some new basis for $V$ and the
dual basis for $V^*$. There has to be a matrix representing the transition from
one basis to another, and you might think that the usual rule for transition
matrices applies as in linear algebra. But the matrix here represents a
bilinear map, not a linear one, and you would be wrong in general. While I
shan’t be concerned with change of basis in what remains, a proper course
in tensors would certainly go into this, and you might want to play around
with finding out what happens.
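One way to play around with this is numerically. The sketch below is my own illustration rather than anything from the course: it takes a 2 × 2 matrix A, a change of basis matrix P whose columns are the new basis vectors (both chosen arbitrarily), and compares the rule $P^{-1} A P$ for a linear map with the rule $P^T A P$, which is the standard one for a bilinear map evaluated as $\mathbf{a}^T A \mathbf{b}$. Only the second one gives the same number when the two vectors are written in new coordinates.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def transpose(A):
    return [list(row) for row in zip(*A)]


def apply(P, x):
    """Matrix times column vector."""
    return [sum(P[i][j] * x[j] for j in range(len(x))) for i in range(len(P))]


def bilinear(A, a, b):
    """Evaluate the bilinear map a^T A b."""
    return sum(a[i] * A[i][j] * b[j] for i in range(len(a)) for j in range(len(b)))


A = [[1, 2], [3, 4]]          # matrix of a 2-tensor in the standard basis
P = [[1, 1], [0, 1]]          # columns are the new basis vectors
P_inv = [[1, -1], [0, 1]]     # inverse of P, written out by hand

as_bilinear = matmul(transpose(P), matmul(A, P))   # P^T A P
as_linear = matmul(P_inv, matmul(A, P))            # P^{-1} A P

# Coordinates of two vectors with respect to the new basis.
a_new, b_new = [1, 2], [3, -1]
# The same two vectors in standard coordinates.
a_old, b_old = apply(P, a_new), apply(P, b_new)

print(bilinear(A, a_old, b_old))
print(bilinear(as_bilinear, a_new, b_new))
print(bilinear(as_linear, a_new, b_new))
```

The first two printed values agree and the third does not, so the linear-map rule really is the wrong one here.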
Representing higher order tensors with matrices doesn’t work. We would
need at least cubes and tesseracts of numbers instead of squares or rectangles
of them. Fortunately, there are neater ways of writing them down which
we shall meet later. In the next chapter we shall be concerned with Maxwell’s
Equations for the Electro-magnetic field, and this will take us as far as
alternating 3-tensors on $\mathbb{R}^4$. Physicists and old style mathematicians have an
obsession with matrix representations which causes them serious problems
for higher order tensors. We shall breeze through them without effort merely
by using more powerful notations.
The 0-tensors have dimension one by definition.
If you ask an old-fashioned applied mathematician what a tensor is, he might
well tell you that it is a matrix, but it transforms differently. This tells you
more about old-fashioned applied mathematicians than it tells you about
tensors.
3.1.4 The Tensor Algebra
A definition of something we have met before:
Definition 3.1.12. An algebra is a vector space with a left and right dis-
tributive multiplication on it. The multiplication is usually associative so the
elements of the space define a ring. The exception is Lie Algebras which are
not associative but instead satisfy the Jacobi Identity (see Exercise 2.9.4).
Definition 3.1.13. A graded algebra is a set of vector spaces indexed by the
group $\mathbb{Z}_n$ for some $n \in \mathbb{Z}^+$ (or the group $\mathbb{Z}$), with a distributive multiplication
on the set.
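Given two covariant tensors we can multiply them. More specifically, given
a k-tensor and an ℓ-tensor we can construct a (k + ℓ)-tensor as follows:
If
$$\omega : \underbrace{V \times V \times \cdots \times V}_{k \text{ copies}} \to \mathbb{R}$$
is a covariant k-tensor and
$$\eta : \underbrace{V \times V \times \cdots \times V}_{\ell \text{ copies}} \to \mathbb{R}$$
is a covariant ℓ-tensor we define
$$\omega \otimes \eta : \underbrace{V \times V \times \cdots \times V}_{\ell + k \text{ copies}} \to \mathbb{R}$$
by taking ω on the first k elements, η on the last ℓ, and multiplying the
results.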
Exercise 3.1.13. Show that this gives a covariant (k + ℓ)-tensor.
This is called the tensor product in the tensor algebra. This makes the set of
all covariant tensors a graded algebra. Graded algebras are quite common and
you will meet them later if you do algebraic topology or theoretical physics.
Physicists stick the word ‘super’ in front of a theory when it goes to a graded
version, hence superstring theory. Usually they have only two levels so they
talk about $\mathbb{Z}_2$ gradings.
Note that the tensor product of alternating tensors is not alternating unless
one of the tensors is a constant (that is, a 0-tensor).
Note also that the tensor product although not commutative is associative
and distributes over addition.
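The definition translates almost word for word into code. The following is a sketch under the assumption that a covariant tensor is a Python function taking its vector arguments as tuples; the helper `tensor_product` and the names `p1`, `p2` for the two coordinate projections on $\mathbb{R}^2$ are invented for the illustration.

```python
def tensor_product(omega, k, eta, l):
    """Return omega ⊗ eta, where omega takes k vectors and eta takes l vectors."""
    def product(*vectors):
        if len(vectors) != k + l:
            raise ValueError("expected %d vector arguments" % (k + l))
        # omega on the first k arguments, eta on the last l, then multiply.
        return omega(*vectors[:k]) * eta(*vectors[k:])
    return product


def p1(u):
    """Projection of a vector in R^2 onto its first component, a 1-tensor."""
    return u[0]


def p2(u):
    """Projection onto the second component, another 1-tensor."""
    return u[1]


p1_p2 = tensor_product(p1, 1, p2, 1)
p2_p1 = tensor_product(p2, 1, p1, 1)

e1, e2 = (1, 0), (0, 1)
print(p1_p2(e1, e2), p2_p1(e1, e2))   # the product is not commutative

# Associativity on a sample triple of vectors.
left = tensor_product(tensor_product(p1, 1, p2, 1), 2, p1, 1)
right = tensor_product(p1, 1, tensor_product(p2, 1, p1, 1), 2)
a, b, c = (2, 3), (5, 7), (11, 13)
print(left(a, b, c), right(a, b, c))


def difference(u, v):
    """The combination p1 ⊗ p2 − p2 ⊗ p1, which is alternating."""
    return p1_p2(u, v) - p2_p1(u, v)


print(difference(a, b), difference(b, a))
```

Neither product of the two projections is alternating on its own, but their difference changes sign when the arguments are swapped.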
Exercise 3.1.14.
1. Show that if $\omega_1, \omega_2$ are k-tensors and ϕ is an ℓ-tensor then
$$(\omega_1 + \omega_2) \otimes \varphi = \omega_1 \otimes \varphi + \omega_2 \otimes \varphi$$
2. Show that if $s, s'$ and $t, t'$ are real numbers,
$$(s\omega_1 + t\omega_2) \otimes (s'\varphi_1 + t'\varphi_2)$$
is what you’d expect it to be on the optimistic assumption that ⊗ is a
nice well behaved multiplication.
3. Give a basis for the space of $(2, 0)_T$ tensors on $\mathbb{R}^2$ in terms of dx and
dy. Hint: Note that $dx^i : \mathbb{R}^n \to \mathbb{R}$ is a covariant 1-tensor on $\mathbb{R}^n$ for
any n. If n = 2 we call them dx and dy. Certainly the tensor product
of any two 1-tensors is a 2-tensor. Show that every 2-tensor is a linear
combination of such tensor products. (A count of basis elements might
save you some trouble here.) Look back to Remark 3.1.2 to find the
answer written down, with an explanation promised later. This is the
explanation.
4. Represent the tensor dx ⊗ dy as a matrix over $\mathbb{R}^2$.
5. Represent the tensor dx ⊗ dy as a matrix over $\mathbb{R}^3$.
6. Repeat the last two for the tensor dx ⊗ dy − dy ⊗ dx. (Later we shall
call this tensor dx ∧ dy.)
Exercise 3.1.15. Show by an example that not every two-tensor on $\mathbb{R}^2$ can
be written as a tensor product of one-tensors. This is obvious once you see
it but some people are tempted to suppose all higher order tensors are tensor
products of one-tensors. The moral: one 2-tensor is not the same thing as
two 1-tensors!
Example 3.1.2. We can write down a bit of the tensor algebra (not all of
it, it is infinite dimensional) on $\mathbb{R}^2$ without too much trouble. Note that I
use dx to specify the linear map from $\mathbb{R}^2$ to $\mathbb{R}$ which projects on the first
component, and dy for the projection on the second component.
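$$\begin{array}{lll}
\text{Order} & \text{basis} & \text{isomorphic to} \\
T^k(\mathbb{R}^2) & \{dx^{i_1} \otimes \cdots \otimes dx^{i_k}\} & \mathbb{R}^{2^k} \\
\vdots & \vdots & \vdots \\
T^3(\mathbb{R}^2) & \{dx \otimes dx \otimes dx, \dots, dy \otimes dy \otimes dy\} & \mathbb{R}^8 \\
T^2(\mathbb{R}^2) & \{dx \otimes dx,\ dx \otimes dy,\ dy \otimes dx,\ dy \otimes dy\} & \mathbb{R}^4 \\
T^1(\mathbb{R}^2) & \{dx, dy\} & \mathbb{R}^2 \\
T^0(\mathbb{R}^2) & 1 & \mathbb{R}
\end{array}$$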
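Using the isomorphisms we can also write out the tensor multiplication in
an admittedly strange form:
$$\begin{array}{c|cccc}
\otimes & \mathbb{R} & \mathbb{R}^2 & \mathbb{R}^4 & \cdots \\
\hline
\mathbb{R} & (\mathbb{R}, \times) & (\mathbb{R}^2, \cdot) & (\mathbb{R}^4, \cdot) & \cdots \\
\mathbb{R}^2 & (\mathbb{R}^2, \cdot) & (\mathbb{R}^4, ?) & (\mathbb{R}^8, ?) & \cdots \\
\mathbb{R}^4 & (\mathbb{R}^4, \cdot) & (\mathbb{R}^8, ?) & (\mathbb{R}^{16}, ?) & \cdots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{array}$$
Table 3.1.2
Here I have started with the 0-tensors, then the 1-tensors, and so on, and
used the isomorphisms to indicate where the tensor product takes us. The
symbol × means ordinary multiplication, and · means scalar multiplication.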
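The question marks remain to be filled in, but I will do the multiplication
from $T^1(\mathbb{R}^2) \times T^1(\mathbb{R}^2)$ to $T^2(\mathbb{R}^2)$. In the bases given this is
$$\mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}^4, \qquad
\left( \begin{bmatrix} a \\ b \end{bmatrix}, \begin{bmatrix} c \\ d \end{bmatrix} \right)
\mapsto \begin{bmatrix} ac \\ ad \\ bc \\ bd \end{bmatrix}$$
If you represent elements of $T^2(\mathbb{R}^2)$ by 2 × 2 matrices, you can get this result
by a matrix multiplication
$$\begin{bmatrix} c \\ d \end{bmatrix} [a, b]$$
For what that’s worth.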
Exercise 3.1.16. Fill in the other question marks.
It is easy to generalise the tensor algebra so that we can take the tensor
product of contravariant tensors or of a covariant tensor with a contravari-
ant tensor or of two mixed tensors. These things are best understood by
constructing simple examples when they are ridiculously easy, rather than
by looking at the formal definitions which on first encounter are terrifying.
Algebra is learnt by making up lots of examples. When you have done this
you can easily see what is being said and after a small amount of practice
you can use the language to terrorise people unfamiliar with it. This is child-
ish and you should be ashamed of yourself for actually frightening engineers,
applied mathematicians and physicists this way.
Exercise 3.1.17. A covariant 1-tensor on $\mathbb{R}^n$ is a linear map from $\mathbb{R}^n$ to
$\mathbb{R}$ and consequently an element of $\mathbb{R}^{n*}$. We write $dx^i : \mathbb{R}^n \to \mathbb{R}$ to be the
projection which picks out the $i$th component of each vector.
1. Show the set $\{dx^i : i \in [1 : n]\}$ is a basis for $T^1(\mathbb{R}^n)$.
2. Show that the set $\{dx^i \otimes dx^j : i, j \in [1 : n]\}$ is a basis for the space
$T^2(\mathbb{R}^n)$, so any $\omega \in T^2(\mathbb{R}^n)$ can be specified by the entries in an n × n
matrix relative to this basis.
3. Take some $\omega \in T^2(\mathbb{R}^3)$ and specify it by such a matrix, take two elements
of $\mathbb{R}^3$ and show how to evaluate ω on them by matrix multiplications
of representations of the vectors with respect to the standard basis
for $\mathbb{R}^3$.
4. Choose a different basis for $\mathbb{R}^3$; discuss what has to be done to the
matrix in order to get it to still represent the same $\omega \in T^2(\mathbb{R}^3)$.
5. The space $T^1(\mathbb{R}^n)$ is the space of linear maps from $\mathbb{R}^n$ to $\mathbb{R}$ and is hence
$\mathbb{R}^{n*}$. It is naturally isomorphic to $\mathbb{R}^n$, and we can use the natural
isomorphism to take $\{e_i : i \in [1 : n]\}$ to be a basis for $T^1(\mathbb{R}^n)$. What is
a basis for $T^2(\mathbb{R}^n)$? For $T^k(\mathbb{R}^n)$?
6. Since the dimension of $T^2(\mathbb{R}^n)$ is clearly $n^2$ we can represent any element
$\varphi \in T^2(\mathbb{R}^n)$ by an n × n matrix as before. How does this transform
under change of basis?
Remark 3.1.4. Note that this confuses an earlier definition which had $dx^i$
as a linear map from $\dot{\mathbb{R}}^n$ to $\mathbb{R}$, but the reason for the confusion will become
clear later. If you are rather more fussy than I am, you might want to do it
right: If $\{e_i : i \in [1 : n]\}$ are the standard basis elements for $\mathbb{R}^n$, we can write
the corresponding dual basis for $\mathbb{R}^{n*}$ as $\{e^i : i \in [1 : n]\}$. Then go through
replacing $dx^i$ with $e^i$ throughout and you will have it in impeccable form.
Remark 3.1.5. We define a 0-tensor on any real vector space to be another
and more exotic name for a real number. This means we can take tensor
products of 0-tensors with k tensors to get the scaling operation. You should
have already worked this out from doing the exercises.
Exercise 3.1.18. If you read Darling’s book you will discover that he goes
about defining tensor algebras quite differently. He defines U ⊗ V for any real
vector spaces U and V, by proving a universality theorem which is somewhat
obscure.
You can recover Darling’s treatment as follows.
First note that if we have $L(U, \mathbb{R})$ and $L(V, \mathbb{R})$ we can define $L(U, \mathbb{R}) \otimes L(V, \mathbb{R})$
to have elements f ⊗ g which means for each $f \in L(U, \mathbb{R})$ and $g \in L(V, \mathbb{R})$
we take $f \otimes g : U \times V \to \mathbb{R}$ by
$$f \otimes g(u, v) = f(u) \cdot g(v)$$
where · is just multiplication in $\mathbb{R}$. From now on I shall just write f(u)g(v)
for this. It is clear that this is a bilinear map from $U \times V$ to $\mathbb{R}$. It is also
clear that $L(U, \mathbb{R}) \otimes L(V, \mathbb{R})$ is a vector space under the usual operations of
scalar multiplication and addition.
But $L(U, \mathbb{R})$ is just $U^*$ and $L(V, \mathbb{R}) = V^*$. So we have defined $U^* \otimes V^*$ as a
new vector space. It is therefore perfectly straightforward to define $U^{**} \otimes V^{**}$
as a new vector space. And if we identify $U^{**}$ with $U$ and $V^{**}$ with $V$, we
have $U \otimes V$.
Show that this gives us Darling’s treatment. Find the dimension of U ⊗ V
in terms of the dimension of U and the dimension of V. Find an explicit
representation for $\mathbb{R}^2 \otimes \mathbb{R}^3$ and calculate
$$\begin{bmatrix} 1 \\ 2 \end{bmatrix} \otimes \begin{bmatrix} 4 \\ 5 \\ 6 \end{bmatrix}$$
3.2 Tensor Fields on a Manifold
It makes sense to attach various things to manifolds. For example, to each
point of $S^2$ we can attach a number. We can think of it as glued on to the
sphere. Or we can imagine it as giving a function from the sphere to the
real numbers. It might measure the temperature at the surface of a solid
ball, perhaps. It makes sense to attach numbers smoothly so the function is
smooth. As you wander about the surface of the sphere the numbers will not
change too sharply.
For a different example of attaching things to manifolds, we can attach vec-
tors. Again we can think of it in various ways, and it can be used for various
purposes: for example it might make sense to have the wind blowing at the
surface of the sphere and to want to say by how much and in what direction
at each instant. Or we might want to measure the tangential component in
the surface of a magnetic or electric field.
We might want to attach tensors. Such things are important and useful
particularly to physicists. You can see that we might want to assign, for
some purpose, to each point of a sphere a two by two matrix, and it would
make sense to require the matrices to change smoothly as we moved over
the surface of the sphere. To give an example of something quite practical
that we might want to attach to a manifold, suppose we had the job of
describing distances on $S^2$. There is, of course, a standard metric on $S^2$ but
it is extrinsic and arises from the embedding in $\mathbb{R}^3$. If you think about the
intrinsic definition of $S^2$ you can see that there is absolutely nothing in it
which allows us to talk of distances. It is rather reasonable to want to have
an intrinsic notion of distance on $S^2$ and on other manifolds. In fact we can’t
do General Relativity without it. In the same way, there is no way to talk
about the area of a region, or the angle between vectors on a sphere, except
with reference to an embedding of the sphere in $\mathbb{R}^n$, and it makes sense to do
these things intrinsically, as is shown by the fact that we habitually do these
things in the space in which we live, which might be $S^3$ for all we know.
Or something more complicated. All this and much more is done by means
of attaching tensors to a manifold giving what are called tensor fields. We
now investigate these things.
The idea of a vector field on a manifold is not too hard to grasp: technically
a section of the tangent bundle takes each point of the manifold and assigns
to it a vector attached to that point from the space of possible choices. If
the manifold is $\mathbb{R}^2$, we assign to a point an arrow, an element of what I shall
call $\dot{\mathbb{R}}^2$ for the tangent space at each point. Confusing $\dot{\mathbb{R}}^2$ with $\mathbb{R}^2$, we get
such things as
$$V : \mathbb{R}^2 \to \mathbb{R}^2, \qquad
\begin{bmatrix} x \\ y \end{bmatrix} \mapsto \begin{bmatrix} -y \\ x \end{bmatrix}$$
This of course is the same as the system of Ordinary Differential Equations
$$\dot{x} = -y, \qquad \dot{y} = x$$
in the notation of M213. It also shows you why I want to write $\dot{\mathbb{R}}^2$, with
elements $\begin{bmatrix} \dot{x} \\ \dot{y} \end{bmatrix}$, for the codomain of a vector field.
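To see the vector field and the system of ODEs as the same object, one can follow a point along the arrows numerically. The sketch below assumes a classical fourth-order Runge-Kutta integrator, which is my choice for the illustration and not something the text prescribes; the names `V`, `rk4_step` and `flow` are likewise only illustrative. Starting at (1, 0) and following the field for time π/2 should carry the point a quarter of the way round the unit circle.

```python
import math


def V(point):
    """The vector field (x, y) -> (-y, x) on R^2."""
    x, y = point
    return (-y, x)


def shifted(point, direction, scale):
    """Return point + scale * direction, componentwise."""
    return tuple(p + scale * d for p, d in zip(point, direction))


def rk4_step(field, point, h):
    """One classical Runge-Kutta step of size h along the field."""
    k1 = field(point)
    k2 = field(shifted(point, k1, h / 2))
    k3 = field(shifted(point, k2, h / 2))
    k4 = field(shifted(point, k3, h))
    return tuple(
        p + h / 6 * (a + 2 * b + 2 * c + d)
        for p, a, b, c, d in zip(point, k1, k2, k3, k4)
    )


def flow(field, point, t, steps=1000):
    """Approximate the position at time t of the solution starting at point."""
    h = t / steps
    for _ in range(steps):
        point = rk4_step(field, point, h)
    return point


end = flow(V, (1.0, 0.0), math.pi / 2)
print(end)
print(math.hypot(*end))
```

The end point is very close to (0, 1) and its distance from the origin stays very close to 1, as expected for a field whose arrows are everywhere tangent to circles about the origin.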
The difference between the modern notation and the older one is that we are
being careful to make it clear that the two spaces $\mathbb{R}^2$ (in the expression for a
vector field $V : \mathbb{R}^2 \to \mathbb{R}^2$) are actually different. One is a space of locations
and the other is a space of arrows, actually tangent vectors. You probably
found the old fashioned notation confusing when you first met it, and indeed
it is. The new notation is not only clearer, it generalises to manifolds which
the old notation does not. Even $V : \mathbb{R}^2 \to \dot{\mathbb{R}}^2$ is an improvement.
As mentioned earlier, as well as attaching vectors to points in a manifold
we can attach other things. If we attach real numbers we merely get a map
from the manifold to $\mathbb{R}$, and we have now got a new way to think of such
a map. Could we attach a matrix? To see the useful way to do such a
thing, observe that the tangent bundle is merely obtained by taking as fibre
the tangent space at each point. But we could start with the tangent space
and replace it with its dual space. The thing that we get when we take the
fibre bundle with the dual to the tangent space as fibre and glue all these
fibres together using the same method as for the tangent bundle is called
the cotangent bundle. It looks rather similar and in $\mathbb{R}^2$ the only difference
would be that instead of attaching the space of columns of two numbers
(representing the possible arrows at each point), we would be attaching the
space of rows of two numbers (representing linear maps from $\mathbb{R}^2$ to $\mathbb{R}$). The
spaces are isomorphic, but clearly the elements are not the same. Of course I
have added my own bit of confusion here by confusing linear maps with their
matrix representation, another isomorphism. On $\mathbb{R}^n$ this is harmless. On a
manifold it generally is not and again the isomorphism needs to be thought
about.
The cotangent bundle is important in classical mechanics where it corresponds
to the momentum space whereas the tangent bundle corresponds to
the velocity space. The reason is that we have an energy function. If we look
at $\mathbb{R}^2$ it has as tangent bundle what I have called $\mathbb{R}^2 \times \dot{\mathbb{R}}^2$. Now the function
$\frac{1}{2}mv^2$ is a function from $\dot{\mathbb{R}}^2$ to $\mathbb{R}$,
$$\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} \mapsto \frac{m}{2}\left( (v_1)^2 + (v_2)^2 \right)$$
and the derivative of this function is the row matrix
$$[mv_1, mv_2]$$
This is an element of the cotangent bundle because it is a covector, not a
vector. Cheerfully confusing the two leads to ghastly muddle further down
the track.
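A quick numerical check of the row matrix, offered as a sketch only: the mass m = 2 and the velocity (3, 4) are arbitrary values picked for the illustration, and the derivative is approximated by central differences rather than computed symbolically. The helper names are invented. The point of `apply_covector` is that the row is something that eats a velocity vector and returns a number, which is what makes it a covector.

```python
def kinetic_energy(v, m=2.0):
    """Kinetic energy (m/2)(v_1^2 + v_2^2) of a velocity vector v."""
    return 0.5 * m * sum(c * c for c in v)


def derivative_row(f, v, h=1e-6):
    """Central-difference approximation to the derivative of f at v, as a row."""
    row = []
    for i in range(len(v)):
        up = list(v)
        down = list(v)
        up[i] += h
        down[i] -= h
        row.append((f(up) - f(down)) / (2 * h))
    return row


def apply_covector(row, w):
    """Apply a row (a linear map to R) to a column vector w."""
    return sum(r * c for r, c in zip(row, w))


v = (3.0, 4.0)
row = derivative_row(kinetic_energy, v)
print(row)                           # close to [m*v_1, m*v_2] with m = 2
print(apply_covector(row, (1.0, 0.0)))
```

With these values the row comes out close to [6, 8], in agreement with the formula above.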
As well as the dual of the tangent space attached to each point of a smooth
manifold, we can attach tensors. The vector space of k-tensors such as
$$\omega : \underbrace{V \times V \times \cdots \times V}_{k \text{ copies}} \to \mathbb{R}$$
(any space of maps into $\mathbb{R}$ is a vector space) has a basis consisting of the
multilinear maps evaluated on each of the $n^k$ combinations of basis elements
of V. If $V = T_a(M)$ for some manifold M and some $a \in M$, then ω will be
specified relative to this basis by $n^k$ numbers where n is the dimension of V
and hence M, and k is the order of the tensor. In principle this could be an
awful lot of numbers (but in practice it usually isn’t).
This k-tensor vector space can also be thought of as stuck on the manifold
at a. The tangent bundle is just the locally trivial vector bundle over the
manifold as base space with fibre the tangent space at each point. In exactly
the same way we can take a locally trivial vector bundle with fibre the vector
space of k-tensors on the tangent space at each point. At each $a \in M$ this is
a vector space of dimension $n^k$. Given that a finite dimensional real vector
space has a topology which is invariant under isomorphisms, and given that
the tensor bundles must be locally trivial (since the tangent bundle of $\mathbb{R}^n$ is a
trivial bundle), the topology of each tensor bundle has as a base the cartesian
product of those open sets in the manifold which are subsets of those open
sets over which the bundle is trivial, with open sets in the fibre.
If we do this for every $a \in M$ we get the k-tensor bundle of M. If we do it
for all the possible k we get the full tensor bundle of M.
Then:
Definition 3.2.1. A smooth k-tensor field on a manifold M is a smooth
section of the k-tensor bundle.
Awful Warning
There is scope for some confusion here. If we take the manifold to be $\mathbb{R}^n$,
the tangent bundle is, in my idiosyncratic notation, $\mathbb{R}^n \times \dot{\mathbb{R}}^n$ and this makes
a vector field a section of this bundle. If we look to see what sort of tensor
field this is, we see that the tensor field must assign to each point of $\mathbb{R}^n$
a multilinear map from some space V to $\mathbb{R}$. There is only one possibility
and that is to make V the dual space to $\dot{\mathbb{R}}^n$ and take the linear maps. This
means that by identifying the double dual with the original space $\dot{\mathbb{R}}^n$ we get
the right answer. Thus a vector field is a $T_1$ tensor on the tangent space
$T_a(\mathbb{R}^n)$ for every $a \in \mathbb{R}^n$. So a covariant vector field in the ordinary sense,
mentioned in the last chapter, is a contravariant tensor field. There is, if you
like, an element of dualising in defining a tensor in the first place, so we have
to dualise again to get rid of it.
The terminology is unfortunate since the tangent functor takes tangent vec-
tors to tangent vectors and is covariant, but not many people have the nerve
to change the traditional terminology. I certainly don’t.
Some of the books by physicists make a pig’s breakfast of all of this duality.
Confusion is the natural state of man. And woman. Try to be clear about
which space you are working in and avoid the muddle.
End of Awful Warning
If we take the alternating covariant k-tensors on the tangent space at every
point of the manifold the smooth tensor field of sections is called a differential
k-form on the manifold. Similarly we can limit the section to taking values
in the symmetric k-tensors. Both of these are important.
This sounds horrible but if you look at 2-tensors, alternating or not, you can
see that on the tangent space $T_a(M)$ when $M = \mathbb{R}^n$, any one of them can
be represented nicely by an n × n matrix of numbers. And if we select one
such matrix for each point a of M, then we get an n × n matrix of functions
from the manifold to $\mathbb{R}$. So as long as k is one or two we do not have
anything very complicated. If k = 1 then we are talking about vector fields
or covector fields, and if k = 2 we are sticking matrices onto the manifold at
each point. If we never go beyond dimension 3 then the worst thing we have
to imagine is a space with a 3 × 3 matrix associated with each point of the
space. This is not really very bad. Admittedly this only gets us the applied
mathematician’s view of the world, but at least we know how to generalise
it to higher dimensions and higher orders if it turns out to be necessary.
A serious issue with this simplified view of things is that the specification of
the matrix representing a tensor on any $T_a(M)$ requires us to choose a basis
for $T_a(M)$. And if we now do the same for some $T_b(M)$ for $a \neq b$ then we
need to choose a basis for representing tensors on $T_b(M)$. But the spaces
$T_b(M)$ and $T_a(M)$ don’t have much to do with each other in general. So in
what sense can it be made the ‘same’ basis? And if it is different, how do we
ensure that the matrix of functions is going to behave nicely in representing
the tensor fields? The tensor fields are perfectly respectable things, but if
we insist on representing them by matrices of functions we have some serious
problems. Note that if $M = \mathbb{R}^n$, the tangent spaces can be shifted into each
other in a natural way and the idea that we are using the ‘same’ basis for
each of them makes sense. It all goes wrong if $M = S^n$. We need in this
case something like an explicit isomorphism between the tangent spaces at
different points.
Figure 3.2.1: Shifting a vector between tangent spaces.
To see what can go wrong here, imagine a sphere and take a point on the
equator. Attach a vector to this point, say one pointing along the equator.
I have shown this in figure 3.2.1. Now look to see what happens if you move
it parallel to itself along a line of longitude so that it moves up towards the
north pole. It seems reasonable to say that we are shifting the vector so it
is still pointing in the same direction, and still has the same length, despite
the fact that the vectors are all in different tangent spaces. In other words I
am claiming that I can tell when two vectors in two different tangent spaces
are ‘the same’. That this is insanely foolhardy becomes apparent if I go to
the same place by a different route. Suppose I first go around the equator.
My pink vector also goes around the equator, becoming rather purpler as it
goes. When it is opposite the starting point, I now move it up the curve of
longitude until it gets to the north pole. All the way, by both paths, I have
moved the vector so it is pointing in the ‘same’ direction, but the result is
a pair of vectors pointing in opposite directions. So cheerfully doing on a
sphere what makes perfect sense on $\mathbb{R}^2$ is fraught with problems.
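The two journeys can be imitated on the unit sphere in $\mathbb{R}^3$. The sketch below rests on an assumption the text has not yet made precise: that moving a tangent vector "parallel to itself" along an arc of a great circle means rotating it, together with its base point, about the axis perpendicular to the plane of that great circle, which is the usual notion of parallel transport on the round sphere. The rotation is done with Rodrigues' formula, and all function names are invented for the illustration.

```python
import math


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def rotate(v, axis, angle):
    """Rotate v about the unit vector axis through angle (Rodrigues' formula)."""
    c, s = math.cos(angle), math.sin(angle)
    k_cross_v = cross(axis, v)
    k_dot_v = dot(axis, v)
    return tuple(
        v[i] * c + k_cross_v[i] * s + axis[i] * k_dot_v * (1 - c)
        for i in range(3)
    )


def transport(point, vector, axis, angle):
    """Carry a tangent vector along a great-circle arc (assumed parallel transport)."""
    return rotate(point, axis, angle), rotate(vector, axis, angle)


start = (1.0, 0.0, 0.0)      # a point on the equator
arrow = (0.0, 1.0, 0.0)      # tangent vector pointing along the equator

# Route A: straight up the line of longitude to the north pole.
pole_a, arrow_a = transport(start, arrow, (0.0, -1.0, 0.0), math.pi / 2)

# Route B: half way round the equator, then up to the north pole.
far_side, arrow_mid = transport(start, arrow, (0.0, 0.0, 1.0), math.pi)
pole_b, arrow_b = transport(far_side, arrow_mid, (0.0, 1.0, 0.0), math.pi / 2)

print(pole_a, arrow_a)
print(pole_b, arrow_b)
```

Both routes end at the north pole, up to rounding error, but the two transported arrows are negatives of each other.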
One of the hardest things to do is to unlearn things you soaked up through
the skin when young and gullible. If you were encouraged to think that
isomorphic vector spaces were never worth distinguishing and you went all
sloppy in your thinking as a consequence, you now have the formidable job
of working it all out again. Don’t blame me, blame the scruffy bunch who
taught you manifest nonsense and blame yourself for buying it. Think of this
in the future and currently and regard it as a second Awful Warning.
Exercise 3.2.1.
1. Take $M = S^1$ and k = 2. Define a k-tensor field on $S^1$.
2. Take $M = S^2$ and k = 2. Define an alternating 2-tensor field (2-form)
on $S^2$. Explain what this might have to do with area of regions on a
sphere and indicate how you might calculate the area of a region in $S^2$
with respect to your choice of 2-form.
3. Take $M = \mathbb{R}^3$ and k = 2. Define a differential 2-form on $\mathbb{R}^3$.
Note that we can talk about k-covariant and ℓ-contravariant mixed tensors
and mixed tensor fields.
3.3 The Riemannian Metric Tensor
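Recall from M213 that an inner product on a vector space V is a positive
definite symmetric quadratic form, which is to say a map
$$\langle\ ,\ \rangle : V \times V \to \mathbb{R}$$
such that
1. $\langle\ ,\ \rangle$ is bilinear; that is $\forall\, u \in V$, $\langle u, - \rangle : V \to \mathbb{R}$ is linear and
$\forall\, v \in V$, $\langle -, v \rangle : V \to \mathbb{R}$ is linear
2. $\langle\ ,\ \rangle$ is symmetric; that is $\forall\, u, v \in V$, $\langle u, v \rangle = \langle v, u \rangle$
3. $\langle\ ,\ \rangle$ is positive definite; that is $\forall\, u \in V$, $\langle u, u \rangle \ge 0$ and
$\langle u, u \rangle = 0 \Rightarrow u = 0$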
We can now summarise the above conditions by saying that an inner product
for $V$ is a symmetric covariant 2-tensor on $V$, with the additional property
of being positive definite. Recall also from M213 that positive definiteness
can be specified by observing that non-degenerate quadratic forms can be
classified as to their general shape by diagonalising them and then rescaling
the axes so that they are all diagonal with entries $+1$ along the diagonal
down to some point after which they are $-1$. This gives the signature of
the quadratic form $(1, 1, \ldots, 1, -1, \ldots, -1)$ for some number of
positive and some number of negative ones. A positive definite form has all $n$
entries $+1$. There are also degenerate forms where some of the entries after
diagonalisation may be zero.

Figure 3.3.1: (Bits of) some perfectly respectable Hilbert Spaces stuck on a
manifold.
I shall only consider the positive definite forms here, although physicists want
to look at the general case of non-degenerate forms because in relativity we
have to put in time as an extra dimension, which gives a signature (1, 1, 1, −1)
or (3, 1). (Or (−1, 1, 1, 1) if you are a physicist. Physicists put time first.
Some of them even use (1, 3), multiplying our form by −1. I shall outline a
reason for this in the next chapter.)
We now define a Riemannian metric tensor field on a manifold as a positive
definite symmetric two tensor field. That is, at every point of the manifold
we attach, smoothly, some positive definite symmetric 2-tensor. This means
we have some bilinear function of a pair of tangent vectors at each point.
It is a daft name, and it would have been much more sensible to call it a
Riemannian inner product tensor field, because it gives an inner product on
each tangent space. But it is too late to be sensible now. Figure 3.3.1 shows
some vectors in some of the tangent spaces to a sphere, and each pair has a
sort of local dot product in a perfectly respectable tangent space which is now
a perfectly respectable inner product space, in fact a perfectly respectable
Hilbert Space.
Each such inner product may be specified, via charts, as a symmetric $2 \times 2$
matrix in the case of the figure, each matrix $A(a)$ at the point $a$ on the sphere
acting on a pair of tangent vectors $u_a$ and $v_a$ to give
$$u_a^T A(a)\, v_a$$
but it would be better to regard it as a bilinear symmetric map which takes
pairs of tangent vectors with their tails at some point of the manifold, and
returns a real number. Thinking of it as a matrix makes it clear that there
are three distinct numbers which depend on where we are on the sphere. For
an $n$-manifold it will be $\binom{n+1}{2}$ distinct numbers for each point of the
manifold, since a symmetric $n \times n$ matrix is fixed by its entries on and above
the diagonal. Or if you insist you can think of the metric tensor as $n(n+1)/2$
distinct functions from the manifold to $\mathbb{R}$. So it makes sense to physicists to
write such a thing as $g_{\mu,\nu}$ where $\mu$ (mu) and $\nu$ (nu) range through the two
possible values on a sphere or the three possible values on a three-manifold, or
the four on space-time, with $g_{\mu,\nu} = g_{\nu,\mu}$. Of course this involves a choice
of some charts to cover the manifold. It might be better to write $g_{\mu,\nu}(a)$ for
$a$ a point in the manifold to remind ourselves that we have what is in effect a
matrix valued function on the manifold, but we don’t.
Note that there is absolutely no machinery for calculating the dot product
of a tangent vector $u_a$ at a point $a$, with a tangent vector $v_b$ at a different
point $b$. This can be done in $\mathbb{R}^n$ but the Inner Product tensor field doesn’t
allow it.
If the symmetric tensor is always positive definite we call it a Riemannian
metric, and the manifold with this tensor field is called a Riemannian Manifold.
If the symmetric tensor field has signature (−1, 1, 1, 1) it is called a Lorentzian
metric and the manifold is called a Lorentzian manifold. Physicists treat the
universe we live in, including time, as a Lorentzian manifold. More generally,
for any signature of form we say we have a semi-Riemannian metric. Bear in
mind at all times that when a physicist talks about a metric on a manifold
he means, almost always, an inner product on all of its tangent spaces, not
necessarily positive definite but always non-degenerate. Usually it is either
riemannian or lorentzian.
If two quadratic forms are positive definite, so is their sum. It makes sense to
add them because they are just functions, and if $\langle u, u \rangle \geq 0$ and
$\prec u, u \succ\ \geq 0$ then the sum is also non-negative. If the sum is equal to
zero, both the terms $\langle u, u \rangle$ and $\prec u, u \succ$ must be equal to zero, so
$u = 0$. Moreover if we scale by a positive constant the result is another
positive definite form, while if we scale by a negative constant the result is a
negative definite form. Hence the positive definite symmetric covariant 2-tensors
are not a vector subspace of the space of covariant 2-tensors on $V$, but they
are an open subset of the vector space of symmetric covariant 2-tensors and
therefore a manifold with a dimension.
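The closure claims are easy to poke at numerically. Here is a minimal sketch for forms on a 2-dimensional space, assuming each symmetric form is stored as a triple `(a, b, c)` standing for the matrix with rows `(a, b)` and `(b, c)`. The test for positive definiteness is Sylvester's criterion, which is a standard fact and not something derived in these notes; the particular forms `p` and `q` are made up for illustration.

```python
# Symmetric 2x2 forms are stored as (a, b, c), meaning the matrix [[a, b], [b, c]].

def is_positive_definite(form):
    """Sylvester's criterion: leading minors a and a*c - b*b are both positive."""
    a, b, c = form
    return a > 0 and a * c - b * b > 0

def add(f, g):
    """Pointwise sum of two forms."""
    return tuple(x + y for x, y in zip(f, g))

def scale(k, f):
    """Multiply a form by the constant k."""
    return tuple(k * x for x in f)

p = (2.0, 1.0, 3.0)    # hypothetical positive definite form
q = (1.0, -0.5, 1.0)   # another hypothetical positive definite form

print(is_positive_definite(add(p, q)))        # sum of two positive definite forms
print(is_positive_definite(scale(5.0, p)))    # positive multiple
print(is_positive_definite(scale(-1.0, p)))   # negative multiple leaves the set
```

The last line is the reason the positive definite forms fail to be a vector subspace: the set is closed under sums and positive scalings but not under multiplication by $-1$.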
Exercise 3.3.1. What is the dimension of the space of positive definite
symmetric 2-tensors on $\mathbb{R}^2$? Hint: it is the same as the dimension of the vector
space of symmetric 2-tensors and if you represent a tensor by a matrix, you
need to count the number of independent numbers in the matrix.
All the above makes sense if we use contravariant 2-tensors. In fact since I
haven’t said anything about V , it might just as well be the dual space to
some other space.
Now we say it again formally:
Definition 3.3.1. A (positive definite) Riemannian metric for a manifold
M is a positive definite symmetric covariant 2-tensor field on M.
What does this mean in computational terms? It is easiest to begin by
looking at a very simple case, a metric tensor field on $\mathbb{R}^2$. The idea of such a
tensor field on $\mathbb{R}^2$ has to do with inner products on $\dot{\mathbb{R}}^2$, in fact one such inner
product for each point of $\mathbb{R}^2$. This can be grasped by thinking of the matrix
of numbers operating on pairs of vectors in $\dot{\mathbb{R}}^2$ being fixed for each point in
$\mathbb{R}^2$, and as we move about in $\mathbb{R}^2$, we change the numbers in the matrix. So
the numbers depend on where you are, and are given by smooth functions of
your location in $\mathbb{R}^2$.
More generally, we take a manifold, we take a point $a$ on it, and look at
the tangent space at $a$. Now we take the symmetric bilinear maps from this
space to $\mathbb{R}$ which are positive definite. On $\mathbb{R}^n$, this inner product could be
specified by taking $n$ independent vectors as a basis, then taking the dual
space and the basis elements for that, and calling them $(dx^1, dx^2, \ldots, dx^n)$,
and then writing the tensor as
$$\sum_{i,j \in [1:n]} g_{ij}\, dx^i \otimes dx^j$$
where $g_{ij}$ is an $n \times n$ symmetric positive definite matrix. This follows from
an exercise which I hope you did. Alternatively you can use the Einstein
convention and just write $g_{ij}\, dx^i \otimes dx^j$. If you were a classical mathematician
or happen to be scruffy, you might leave out the $\otimes$, as if it is obvious to the
meanest intellect what $dx^i\, dx^j$ means. You might perhaps imagine in a dim
sort of way it means that you are multiplying a very, very little bit of the $i$th
component of a vector with another very, very little bit of the $j$th component
of a possibly different vector. In which case you are so confused there is no
hope for you.
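The standard inner product on $\mathbb{R}^2$ can be written in this form as the identity
matrix. To calculate
$$\begin{bmatrix} x \\ y \end{bmatrix} \cdot \begin{bmatrix} u \\ v \end{bmatrix}$$
we simply compute
$$[x \ \ y] \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix}$$
to get $xu + yv$. Doing the same with any other symmetric positive definite
matrix instead of the identity will give us a new inner product.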
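If you would like to see the arithmetic done for you, here is a small sketch of the computation $u^T A v$. The function name `bilinear`, the matrix `skewed` and the two vectors are all invented for the illustration; only the identity matrix case comes from the text.

```python
def bilinear(A, u, v):
    """Return u^T A v for a square matrix A given as a list of rows."""
    n = len(u)
    return sum(u[i] * A[i][j] * v[j] for i in range(n) for j in range(n))

identity = [[1.0, 0.0], [0.0, 1.0]]
skewed = [[2.0, 1.0], [1.0, 3.0]]   # hypothetical symmetric positive definite matrix

u, v = [1.0, 2.0], [3.0, -1.0]
print(bilinear(identity, u, v))  # the usual dot product xu + yv
print(bilinear(skewed, u, v))    # a different inner product on the same two vectors
```

The same pair of vectors gets a different number from each matrix, which is all that "a new inner product" means in computational terms.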
For $M = \mathbb{R}^2$ it makes sense to take the same basis $(dx, dy)$ for elements of the
cotangent space over every point, so we get that it is possible to represent a
Riemannian metric on $\mathbb{R}^2$ in the form
$$[\dot{a}, \dot{b}] \begin{bmatrix} g_{11}(x, y) & g_{12}(x, y) \\ g_{12}(x, y) & g_{22}(x, y) \end{bmatrix} \begin{bmatrix} \dot{c} \\ \dot{d} \end{bmatrix}$$
Again, this was an exercise which I hope you did.
This when multiplied out gives the required bilinear map from $\dot{\mathbb{R}}^2 \times \dot{\mathbb{R}}^2$ to
$\mathbb{R}$. For any choice of two tangent vectors we get a real number. The $g_{ij}$ are
smooth functions, for $i, j \in [1 : 2]$ (and $g_{12} = g_{21}$).
On $\mathbb{R}^3$ they would be smooth functions for $i, j \in [1 : 3]$. The matrix would be
symmetric still and at each point it would be positive definite (or in general
have the required signature).
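To make the dependence on position concrete, here is a sketch of a metric field on $\mathbb{R}^2$ as a function returning a matrix at each point. The particular functions $g_{11} = 1 + x^2$, $g_{12} = xy$, $g_{22} = 1 + y^2$ are a made-up choice (they give a positive definite matrix everywhere, since the determinant is $1 + x^2 + y^2$); nothing in the notes singles them out.

```python
def g(x, y):
    """Hypothetical metric field: a symmetric positive definite matrix at each (x, y)."""
    return [[1.0 + x * x, x * y],
            [x * y, 1.0 + y * y]]

def inner(point, u, v):
    """Inner product of two tangent vectors u, v based at the same point."""
    G = g(*point)
    return sum(u[i] * G[i][j] * v[j] for i in range(2) for j in range(2))

u, v = (1.0, 0.0), (0.0, 1.0)
print(inner((0.0, 0.0), u, v))  # at the origin this is the ordinary dot product
print(inner((1.0, 2.0), u, v))  # same components, different base point
```

Note that `inner` takes one base point and two vectors: there is deliberately no way to feed it vectors living at two different points.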
3.3.1 What this means: Ancient History
If you reflect a little on what a covariant 2-tensor on the tangent space is,
you will see that we have bilinear maps from pairs of vectors in the tangent
space at a to R, for every a in the manifold.
Now tangent vectors in the old days of classical geometry were not thought
of as elements of a perfectly respectable vector space, but were imagined to
be infinitesimal elements in the base space. You can see that if you take a
velocity vector at a point $a \in \mathbb{R}^2$ and travel along it for a very, very short
time, you trace out, more or less, a line segment in $\mathbb{R}^2$. If you put a little
arrow on its head (its tail being at $a$) you get the beginnings of a picture of
a vector field, which we learnt how to draw in second year. If you have a
uniform velocity parallel to the $X$-axis and of unit length and in the direction
of increasing $x$ we can represent this by a tiny little arrow attached to $a$ and
pointing in the direction of increasing $x$. Such a tangent vector should be
rather small because it really represents a velocity through $a$, and hence an
element of what I have called $\dot{\mathbb{R}}^2$, not a set of points in $\mathbb{R}^2$. The practicalities
are that velocities change and can change continuously so a big long vector
would be misleading. In fact any finite length vector is misleading, but we
can be sloppy and imagine that velocities have been turned into distances by
travelling for very short times.
The idea of an infinitesimal time, one so small it was not effectively
distinguishable from zero, but where ratios of infinitesimals made sense and need
not be zero is one which seems natural to many people. My Mathematics
teacher at school talked of $dy/dx$ being a ratio of numbers each of which was
infinitesimal, that is not individually distinguishable from zero. I thought he
was off his head. I still do. This isn’t mathematics, it’s nonsense [1]. It does
however suggest mathematics. So although my Maths master was talking
incoherent garbage, there is something there which makes sense. And the
idea of infinitesimal distances and times leading to a definite velocity, a sort
of garbled version of the definition of a limit, has been used a great deal in
times past.
One way to think of this which you may find useful is contained in the
following example.
Let $c$ be the curve $x(t) = t$, $y(t) = 2\sin(t)$. We look at the origin,
through which the curve passes. First we take the line segment from
$\begin{bmatrix} 0 \\ 0 \end{bmatrix}$ to $\begin{bmatrix} u \\ 2\sin(u) \end{bmatrix}$ for some $u \neq 0$.
This line segment has two important numbers associated with it, the projection
along the $x$-axis and the projection along the $y$-axis. I shall call such a
line segment $\ell_u$ and the two numbers $\Delta x(\ell_u)$ and $\Delta y(\ell_u)$. The slope of the
line segment is $\frac{\Delta y(\ell_u)}{\Delta x(\ell_u)}$. So I think of $\Delta y$ as assigning one number to each
such $\ell_u$ and $\Delta x$ as assigning another with the ratio being the slope of the
line segment.
As we take shorter and shorter line segments, that is if we let $u \to 0$ in the
example, the numbers get smaller but the ratio in general does not. I can
easily stipulate that the line segment has one end fixed (at $0$ in our case)
and the other end lies along the curve given.
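You can watch this happen with a few lines of arithmetic. The sketch below treats $\Delta x$ and $\Delta y$ as functions of the segment from the origin to the point of the curve with parameter $u$; the function names `delta_x` and `delta_y` are just labels chosen here.

```python
import math

def delta_x(u):
    """Projection on the x-axis of the segment from (0, 0) to (u, 2 sin u)."""
    return u - 0.0

def delta_y(u):
    """Projection on the y-axis of the same segment."""
    return 2.0 * math.sin(u) - 0.0

# Shorter and shorter segments: both projections shrink, the ratio settles down.
for u in (1.0, 0.1, 0.01, 0.001):
    print(u, delta_x(u), delta_y(u), delta_y(u) / delta_x(u))
```

Both columns of projections head for zero while the last column does not, which is the whole content of the remark about ratios.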
[1] It is possible to go through model theory and make these ideas respectable, but this
requires a lot of logic. It is also possible to junk the lot and replace it with the idea of a
limit. And finally it is possible to choose terminology which looks a lot like the incoherent
rubbish but actually makes sense. This last is what we do and it explains some of the
more baroque aspects of our language.
Figure 3.3.2: ∆x and ∆y and dx and dy.
Now look at the tangent vector at $0$ defined by the curve above. It is a
perfectly respectable vector in the tangent space $\dot{\mathbb{R}}^2_0$ at $0$. In fact I can take
a basis for $\dot{\mathbb{R}}^2_0$ consisting of the vector of unit positive speed along the
$x$-axis, which I have called $i$ or $\dot{e}_1$ or $\partial/\partial x$ earlier, and the second vector being
defined by a curve of positive unit speed along the $y$-axis which I have called
$\dot{e}_2$ and $\partial/\partial y$ earlier but might have called $j$. In this basis it is easy to see
that the tangent vector at $0$ defined by the curve is just
$$\begin{bmatrix} 1 \\ 2 \end{bmatrix}$$
I have already defined $dx$ in the cotangent space as the linear map which
sends this tangent vector to $1 \in \mathbb{R}$ and $dy$ as the linear map which sends it
to 2.
So dx and dy do to tangent vectors what ∆x and ∆y do to line segments
in the original space. I have shown the idea in figure 3.3.2. Note that we
can say that dy/dx for this tangent vector is just 2 by straight division. And
of course this is precisely what we get when we differentiate 2 sin(x) at the
origin, which is not exactly a surprise.
Classically, the idea of $\Delta x$ was what you were probably taught at school:
it was a “little bit of $x$”, and $\Delta y$ was a little bit of $y$, but you were really
looking at line segments along curves, and $\Delta x$ and $\Delta y$ are probably better
thought of as maps from line segments to $\mathbb{R}$. It is easy to see that with this
way of looking at things, the claim
$$\frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}$$
makes sense provided we specify what we really mean by the terms. This
would involve saying that we are calculating $\Delta x(\ell)$ and $\Delta y(\ell)$ for line segments
$\ell$ joining some fixed point on a curve to other points, and the limit
means that the other points are taken to be getting closer and closer to the
fixed point. All this explanation was unfortunately regarded as not really
part of the mathematics and consequently got left out of the notation. If we
intend to study the subject on manifolds we have to put it back in.

Figure 3.3.3: A new rule for measuring distances of points from the origin.
The idea, then, that $\Delta x$ means “a little bit of $x$” and $dx$ means “a very, very
little bit of $x$” (so little that it is infinitesimal) still survives in the literature.
And the classical mathematicians wrote
$$d\ell^2 = dx^2 + dy^2$$
to be an infinitesimal version of Pythagoras’ Theorem and then used it to
find the length of curves. These days we define everything through limits,
which you spend a lot of time doing more or less rigorously in first year. At
least, that was the idea.
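So instead of writing
$$[x, y] \begin{bmatrix} a & b \\ b & c \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$
as the square of a new norm on $\mathbb{R}^2$, people wrote
$$[dx, dy] \begin{bmatrix} a & b \\ b & c \end{bmatrix} \begin{bmatrix} dx \\ dy \end{bmatrix}$$
as the same thing with infinitesimals to give infinitesimal sizes of infinitesimal
vectors.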
If you take a symmetric positive definite matrix and use it to define a new
norm (squared) on $\mathbb{R}^2$ you can look at the set
$$\left\{ \begin{bmatrix} x \\ y \end{bmatrix} \in \mathbb{R}^2 : ax^2 + 2bxy + cy^2 = 1 \right\}$$
as the set of points at distance 1 from the origin. This is an ellipse as in
figure 3.3.3. To calculate the distance of the indicated point from the origin
we need to measure the length of the orange line by taking the length of the
blue part of it as one unit. Alternatively we scale the ellipse until it passes
through the point and then look to see what the scaling factor was. This
makes the distance of the point from the origin about 2 units.
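The scaling description and the formula agree, as the following sketch checks. The coefficients `a, b, c` and the point `p` are invented for the purpose (they are not the ones in the figure); the only requirement is that the matrix be positive definite.

```python
import math

def norm(a, b, c, x, y):
    """Length of (x, y) in the norm whose square is a x^2 + 2 b x y + c y^2."""
    return math.sqrt(a * x * x + 2.0 * b * x * y + c * y * y)

a, b, c = 2.0, 0.5, 1.0   # hypothetical coefficients with a > 0 and a*c - b*b > 0
p = (1.5, -2.0)           # hypothetical point whose distance from the origin we want

d = norm(a, b, c, *p)
on_ellipse = (p[0] / d, p[1] / d)   # shrink p by the factor d

# d is the distance; the shrunken point lies on the unit ellipse (norm 1).
print(d, norm(a, b, c, *on_ellipse))
```

So the factor by which the unit ellipse must be blown up to pass through the point is exactly the new distance of the point from the origin.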
If you do the same thing using a Riemannian metric, the ellipse changes as
you move around the space. One can follow the idea of the old fashioned
geometers by drawing little (infinitesimal?) ellipses around every point of
the space. They thought of this in terms of $a\, dx^2 + 2b\, dx\, dy + c\, dy^2$, where
$dx^2$ means take an infinitesimal amount of $x$ and square it. So to calculate
the distance along a curve in $\mathbb{R}^2$ equipped with a Riemannian metric, you
took some finite set of points along the curve, one at the start and one at the
end, took the ellipse on each point, and measured the distance to the next
point and then added them all up. Then you repeated with more and more
points on your curve. In the limit we get the right answer. I have shown
a stage in this process in figure 3.3.4. Nobody, of course, actually did it by
taking limits, they used Calculus which is quicker and less effort.
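Since a machine does not mind the effort, here is a sketch of the finite-set-of-points procedure itself. Everything specific in it is an assumption made for the illustration: the metric is the identity scaled by $1 + x^2 + y^2$, and the curve is the straight segment from $(0,0)$ to $(1,1)$, chosen because the limiting integral $\int_0^1 \sqrt{2(1 + 2t^2)}\, dt$ has the closed form $(\sqrt{6} + \sinh^{-1}\sqrt{2})/2$ to compare against.

```python
import math

def g(x, y):
    """Hypothetical metric: the identity matrix scaled by 1 + x^2 + y^2."""
    s = 1.0 + x * x + y * y
    return [[s, 0.0], [0.0, s]]

def curve(t):
    """Hypothetical curve: the straight segment from (0, 0) to (1, 1)."""
    return (t, t)

def polyline_length(n):
    """Add up n step lengths, each measured with the metric at the step's start."""
    total = 0.0
    for k in range(n):
        x0, y0 = curve(k / n)
        x1, y1 = curve((k + 1) / n)
        dx, dy = x1 - x0, y1 - y0
        G = g(x0, y0)
        total += math.sqrt(G[0][0] * dx * dx + 2.0 * G[0][1] * dx * dy
                           + G[1][1] * dy * dy)
    return total

exact = 0.5 * (math.sqrt(6.0) + math.asinh(math.sqrt(2.0)))
for n in (4, 40, 400, 4000):
    print(n, polyline_length(n), exact)
```

The sums creep up on the integral rather slowly, which is a fair advertisement for using Calculus instead.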
Definition 3.3.2. A geodesic on a manifold with a riemannian metric tensor
is a curve joining two points such that its length is less than or equal to that
of any other curve joining the points.
Exercise 3.3.2. Show that in $\mathbb{R}^n$ with the euclidean metric, geodesics are
straight line segments. Hint: This is a standard calculus of variations
problem. Google this if stuck.
Exercise 3.3.3. Describe the geodesics on the (flat) torus.
Of course the idea of infinitesimal ellipses is daft: the question is how to
rescue the idea so that it gives us a way of computing the length of a curve
in a space where distances keep changing. If the ellipses, or more properly the
positive definite symmetric quadratic forms, are perfectly respectable things
defined on the tangent space at each point, we get what we need.
Figure 3.3.4: Length of a curve via a Riemannian metric.
Example 3.3.1. On a suitable open set in $\mathbb{R}^2$ I define a new metric by saying
that locally it is given by the matrix
$$\begin{bmatrix} 1 + xy & 0 \\ 0 & x^2 + y^2 \end{bmatrix}$$
Find the length of the curve along the parabola $y = x^2$ from the origin to
$x = y = 1$ in this metric.
Solution:
The ordinary formula for the length of the curve is that it is $\int_c d\ell$ where $c$ is
the curve and
$$d\ell^2 = dx^2 + dy^2$$
is the ‘infinitesimal path length’. We can write this as
$$d\ell^2 = [dx, dy] \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} dx \\ dy \end{bmatrix}$$
Our new and improved inner product changes from place to place but it gives
rise to a norm just as the old one does, and it is a norm on the tangent space.
We therefore have
$$d\ell_1^2 = [dx, dy] \begin{bmatrix} 1 + xy & 0 \\ 0 & x^2 + y^2 \end{bmatrix} \begin{bmatrix} dx \\ dy \end{bmatrix}$$
for the new way of measuring the differential path length and so the length
of the path along the parabola, with $x = t$, $y = t^2$ (so that $dx = dt$ and
$dy = 2t\, dt$) is
$$\int_0^1 \sqrt{(1 + t^3) \cdot 1 + (t^2 + t^4)(4t^2)}\ dt \approx 1.49958$$
where the approximation is done using Mathematica. This compares with
$\int_0^1 \sqrt{1 + 4t^2}\ dt \approx 1.47894$ using the standard metric.
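If you have no Mathematica to hand, the integral is easy to approximate yourself. The sketch below uses composite Simpson's rule, which is my choice of method and not what the example used; the helper names `simpson`, `speed_new` and `speed_standard` are made up. It evaluates the length of the parabola in the new metric and, for comparison, the ordinary euclidean arc length of the same parabola.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0

def speed_new(t):
    """Norm of the velocity of x = t, y = t^2 in the metric diag(1 + xy, x^2 + y^2)."""
    x, y = t, t * t
    dx_dt, dy_dt = 1.0, 2.0 * t
    return math.sqrt((1.0 + x * y) * dx_dt ** 2 + (x * x + y * y) * dy_dt ** 2)

def speed_standard(t):
    """Norm of the same velocity in the euclidean metric."""
    return math.sqrt(1.0 + (2.0 * t) ** 2)

print(simpson(speed_new, 0.0, 1.0))
print(simpson(speed_standard, 0.0, 1.0))
```

The euclidean value can also be checked against the closed form $(2\sqrt{5} + \sinh^{-1} 2)/4$ for the arc length of $y = x^2$ on $[0, 1]$.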
Example 3.3.2. Find the path length of the spiral $r = \theta$ for $0 \leq \theta \leq 2\pi$ in
the metric on $\mathbb{R}^2$ given by
$$d\ell_2^2 = [d\theta, dr] \begin{bmatrix} r^2 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} d\theta \\ dr \end{bmatrix}$$
Solution: This is just the usual metric on $\mathbb{R}^2$ disguised by using polar
coordinates since $d\ell_2^2 = (r\, d\theta)^2 + (dr)^2$ is the usual way of calculating the
‘infinitesimal path length’ and the answer is
$$\int_0^{2\pi} \sqrt{t^2 + 1}\ dt \approx 21.2563$$
This compares with $2\sqrt{2}\,\pi \approx 8.885765876$ in the euclidean metric on the $\theta, r$
space. Well, in that space the curve is a straight line.
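Both numbers can be checked in a few lines. The sketch below parametrises the spiral by $\theta = t$, $r = t$, integrates the speed numerically with Simpson's rule (my choice of method), and compares with the antiderivative $\frac{1}{2}\left(t\sqrt{t^2+1} + \sinh^{-1} t\right)$, which is a standard calculus result and not taken from the example.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0

def speed(t):
    """Norm of the velocity of theta = t, r = t in the metric diag(r^2, 1)."""
    r = t
    dtheta_dt, dr_dt = 1.0, 1.0
    return math.sqrt(r * r * dtheta_dt ** 2 + dr_dt ** 2)

T = 2.0 * math.pi
numeric = simpson(speed, 0.0, T)
closed_form = 0.5 * (T * math.sqrt(T * T + 1.0) + math.asinh(T))
chart_length = math.sqrt(T * T + T * T)   # straight line in the (theta, r) plane

print(numeric, closed_form, chart_length)
```

The first two numbers agree with each other, and the third is the length you would get by pretending the $\theta, r$ chart carried the euclidean metric.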
Exercise 3.3.4. Draw the curve and obtain a crude estimate of the length if
possible with upper and lower bounds to see if you think this is the length in
the usual metric.
Exercise 3.3.5. Find the path length of the above spiral using the metric
given by
$$d\ell_2^2 = [d\theta, dr] \begin{bmatrix} r^2 & 0 \\ 0 & r^4 \end{bmatrix} \begin{bmatrix} d\theta \\ dr \end{bmatrix}$$
Remark 3.3.1. I should feel ashamed of myself for writing out expressions
such as the above for specifying a metric (or more accurately the square of a
norm), and should undoubtedly have written
$$\omega = r^2\, d\theta \otimes d\theta + r^4\, dr \otimes dr$$
or something similar. I have tried to give you something which will relate
the correct formulation to the things that the classical mathematicians did
(and which you may find at least as badly expressed in works on tensors and
tensor fields written by the congenitally confused). The bad notation can be
used to do sums quite quickly so is not wholly bad. Much depends on whether
you want to do an awful lot of sums without thinking what you are doing.
And face it, who would want to think while doing monster sums if they didn’t
have to?
Exercise 3.3.6.
1. Find the length of the path A, $r = 1$ for $0 \leq \theta \leq \pi/2$ with respect to
the metric given by $r^2\, d\theta \otimes d\theta + dr \otimes dr$. (Note that in the $\theta, r$ space
this gives the same answer as the usual metric, that is, treating $\theta, r$ as
if it were a piece of $\mathbb{R}^2$ with the euclidean metric.)
2. What is the length of the parallel line B, $r = 2$ for $0 \leq \theta \leq \pi/2$ in the
new metric?
3. What is the length of the line C, $r = 0$ for $0 \leq \theta \leq \pi/2$?
I show the three lines in figure 3.3.5.
4. Explain what has gone wrong. The length of a line segment with the
end points different cannot be zero in a metric.
5. On the figure 3.3.5, draw the curve $r = 1/(\sin(\theta) + \cos(\theta))$, for
$\theta \in [0, \pi/2]$. Calculate its length with respect to the new metric. Hint: you
might try using NIntegrate in Mathematica.
6. Show the curve is a geodesic in the space, in particular it is shorter
than the ‘straight’ line A. Hint: transform back to $\mathbb{R}^2$ with the euclidean
metric. Find out how to do this by reading on a bit.

Figure 3.3.5: Three lines of different lengths.
Note that the Riemannian metric tensor enables us to make sense of the angle
at which two curves cross. Without this it makes no sense at all to say that
curves intersect at right angles on a manifold, because in different charts we
could get totally different answers. We feed in two tangent vectors, one along
each curve, at the point of intersection so they are both in the same tangent
space. The Riemannian metric tensor gives us a number out and this leads
us to the angle just as in $\mathbb{R}^n$.
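Here is a sketch of what "leads us to the angle" means in practice, using the usual formula $\cos \phi = \langle u, v \rangle / \sqrt{\langle u, u \rangle \langle v, v \rangle}$. The metric is the polar one, $\mathrm{diag}(r^2, 1)$ in $(\theta, r)$ components, evaluated at $r = 2$; the two tangent vectors are invented for the illustration, as is the comparison with a naive dot product of chart components.

```python
import math

def inner(G, u, v):
    """u^T G v for a 2x2 matrix G and component pairs u, v."""
    return sum(u[i] * G[i][j] * v[j] for i in range(2) for j in range(2))

def angle(G, u, v):
    """Angle between tangent vectors u, v at one point, using the metric matrix G there."""
    return math.acos(inner(G, u, v) / math.sqrt(inner(G, u, u) * inner(G, v, v)))

r = 2.0
polar = [[r * r, 0.0], [0.0, 1.0]]   # metric in (theta, r) components at radius r
naive = [[1.0, 0.0], [0.0, 1.0]]     # pretending the chart itself is euclidean

u, v = (1.0, 1.0), (1.0, -1.0)       # hypothetical components (d theta, d r)
print(math.degrees(angle(polar, u, v)), math.degrees(angle(naive, u, v)))
```

The two answers differ, which is the point: the components in a chart mean nothing for angles until the metric tensor says how to combine them.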
Suppose now that I want to compute the path length of a curve on a manifold.
Let us say I have a curve $c : [0, 1] \to S^2$ on $S^2$. I want to compute its length.
I take a chart containing some of the curve, say $u : U \to \mathbb{R}^2$ and this takes
the bit of the curve in $S^2$ to a bit of curve in $\mathbb{R}^2$. I have a Riemannian metric
tensor on the manifold. The picture of figure 3.3.6 shows a local
parametrisation by $u^{-1}$ of a patch containing some of the curve. The composite $u \circ c$
‘shifts’ the curve to the codomain of $u$, the open set $u(U)$ in $\mathbb{R}^2$.
Figure 3.3.6: Length of a curve on a manifold via a Riemannian metric.
Now I want to know what happens to the covariant 2-tensor field on $S^2$ which
tells me how to measure distances there. I claim that $u^{-1}$ induces a covariant
2-tensor field on $u(U)$. This requires a certain amount of thought.
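We have the picture from the last chapter:
[Diagram: $T^*_a X$ sits over $X$ via $\pi_X$ and $T^*_{f(a)} Y$ sits over $Y$ via $\pi_Y$,
with $f : X \to Y$ along the bottom and the map induced by $f$ along the top.]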
Now a linear map from $T^*_a(X)$ to $\mathbb{R}$ is taken by differentiable $f$ to a linear
map from $T^*_{f(a)}(Y)$ to $\mathbb{R}$. We can see this by using the natural equivalence
of $V^{**}$ with $V$ or we can simply send $\alpha : T^*_a(X) \to \mathbb{R}$ to
$\alpha \circ f^* : T^*_{f(a)} \to \mathbb{R}$.
These are the same thing.
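Now we know what $f_*$ is on tangent vectors, it is just the derivative of $f$ at
each point. So on $\mathbb{R}^2$ if we have $f : \mathbb{R}^2 \to \mathbb{R}^2$ given by
$$f \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} u \\ v \end{bmatrix}$$
we can write
$$[du \ \ dv] = [dx \ \ dy] \begin{bmatrix} \frac{\partial f_1}{\partial x} & \frac{\partial f_2}{\partial x} \\ \frac{\partial f_1}{\partial y} & \frac{\partial f_2}{\partial y} \end{bmatrix}$$
so that $du = \frac{\partial f_1}{\partial x}\, dx + \frac{\partial f_1}{\partial y}\, dy$, and similarly for $dv$.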
Similarly we can, given $f : X \to Y$ for $X = \mathbb{R}^n$ and $Y = \mathbb{R}^m$ transform the
covectors $dx, dy$ in a way strictly dual to the way we can carry a tangent
vector on $X$ to one on $Y$. In fact it works better for covectors because
a covector field on $Y$ is pulled back to one on $X$ (and it is certainly not
generally true that a vector field on $X$ is taken to one on $Y$).
Exercise 3.3.7. Why not?
Exercise 3.3.8. Define the pullback of a covector field on Y to one on X.
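Example 3.3.3. Let
$$P : \mathbb{R}^2 \to [0, 2\pi) \times [0, \infty), \quad \begin{bmatrix} x \\ y \end{bmatrix} \mapsto \begin{bmatrix} \theta \\ r \end{bmatrix}$$
be the polar coordinate map. Then we can write $P^{-1}$ as
$$x = r\cos(\theta)$$
$$y = r\sin(\theta)$$
Now we have
$$dx = \frac{\partial x}{\partial \theta}\, d\theta + \frac{\partial x}{\partial r}\, dr$$
$$dy = \frac{\partial y}{\partial \theta}\, d\theta + \frac{\partial y}{\partial r}\, dr$$
hence
$$dx = -r\sin(\theta)\, d\theta + \cos(\theta)\, dr$$
$$dy = r\cos(\theta)\, d\theta + \sin(\theta)\, dr$$
whereupon we can calculate the various tensor products in the inner product:
$$dx \otimes dx = (-r\sin(\theta)\, d\theta + \cos(\theta)\, dr) \otimes (-r\sin(\theta)\, d\theta + \cos(\theta)\, dr)$$
$$= r^2\sin^2(\theta)\, d\theta \otimes d\theta - r\sin(\theta)\cos(\theta)\, dr \otimes d\theta - r\sin(\theta)\cos(\theta)\, d\theta \otimes dr + \cos^2(\theta)\, dr \otimes dr$$
and similarly for $dx \otimes dy$, $dy \otimes dx$ and
$$dy \otimes dy = r^2\cos^2(\theta)\, d\theta \otimes d\theta + r\sin(\theta)\cos(\theta)\, dr \otimes d\theta + r\sin(\theta)\cos(\theta)\, d\theta \otimes dr + \sin^2(\theta)\, dr \otimes dr$$
Hence we have
$$d\ell \otimes d\ell = dx \otimes dx + dy \otimes dy = r^2\, d\theta \otimes d\theta + dr \otimes dr$$
This, when translated into matrix terms and old fashioned $dx^2 + dy^2$ language,
gives us that the standard identity matrix on $\mathbb{R}^2$ for the euclidean metric
tensor goes over to the matrix
$$\begin{bmatrix} r^2 & 0 \\ 0 & 1 \end{bmatrix}$$
of example 3.3.2.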
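The same calculation can be done with matrices and checked numerically. In the sketch below the euclidean metric (the identity matrix) is pulled back through the Jacobian $J$ of $x = r\cos\theta$, $y = r\sin\theta$ as $J^T I J$. The function names and the sample point $(\theta, r) = (0.7, 2.5)$ are chosen here for illustration.

```python
import math

def jacobian(theta, r):
    """Rows are (dx/dtheta, dx/dr) and (dy/dtheta, dy/dr) for x = r cos, y = r sin."""
    return [[-r * math.sin(theta), math.cos(theta)],
            [r * math.cos(theta), math.sin(theta)]]

def pulled_back_metric(theta, r):
    """J^T I J: the euclidean metric expressed in (theta, r) components."""
    J = jacobian(theta, r)
    return [[sum(J[k][i] * J[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

G = pulled_back_metric(0.7, 2.5)
print(G)   # should be close to [[r^2, 0], [0, 1]] with r = 2.5
```

The off-diagonal entries vanish because the cross terms in $dx \otimes dx$ and $dy \otimes dy$ cancel, exactly as in the tensor product calculation.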
Example 3.3.4. Suppose $f : [0, 1] \to \mathbb{R}^2$ is a curve in $\mathbb{R}^2$ and we wish to
compute its length. Writing $f(t) = (x(t), y(t))^T$ we have the length of $f$ is
$$\int_{[0,1]} du$$
where $du$ is the pull-back from $\mathbb{R}^2$ by $f$ of the length measure $d\ell$ on $\mathbb{R}^2$. The
usual (Lebesgue) measure on $[0, 1]$ is written $dt$. This gives us:
$$dx \otimes dx = \left(\frac{dx}{dt}\, dt\right) \otimes \left(\frac{dx}{dt}\, dt\right) = \left(\frac{dx}{dt}\right)^2 dt \otimes dt$$
$$dy \otimes dy = \left(\frac{dy}{dt}\, dt\right) \otimes \left(\frac{dy}{dt}\, dt\right) = \left(\frac{dy}{dt}\right)^2 dt \otimes dt$$
$$d\ell \otimes d\ell = dx \otimes dx + dy \otimes dy$$
$$du = f^* d\ell$$
$$\Rightarrow\ du \otimes du = \left(\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2\right) dt \otimes dt$$
$$\Rightarrow\ du = \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\ dt$$
$$\int_{[0,1]} du = \int_{[0,1]} \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\ dt$$
A familiar formula usually derived somewhat less formally but using
essentially the same ideas. It is worth going through this argument while thinking
of $dx/dt$ as the amount of stretching $f$ does to the unit interval in the
$x$-direction when it takes $[0, 1]$ into $\mathbb{R}^2$ (and likewise $dy/dt$). Working through
the new jargon for a simple, friendly example makes you appreciate how the
new jargon actually does a good job of articulating geometric ideas of what
is going on.
Exercise 3.3.9. Write out the matrix and tensor product forms of the
spherical and cylindrical polar coordinate transforms of $\mathbb{R}^3$ and confirm that the
euclidean metric goes to what it ought to.
Returning to the tensor field exported by $u^{-1}$ to $\mathbb{R}^2$, the map $u$ distorts
distances, but it also distorts the metric tensor in exactly the right way so
that if we use the $u^{-1}$-induced metric tensor to measure path length in $\mathbb{R}^2$
we get the right answer for the metric tensor on $S^2$.
You might be surprised at first that we transport a metric tensor field on a
space $X$ to one on a space $Y$ by a homeomorphism $u^{-1}$ which is the inverse
of the map $u : X \to Y$. Actually this makes good sense. Suppose we take
the simplest case of the usual metric tensor field on $\mathbb{R}$ which assigns to the
interval $[a, b]$ the length $b - a$. Map $\mathbb{R} \to \mathbb{R}$ by $u(x) = 2x$. Now we want a
new, shiny metric tensor field on the codomain which gives the image of $[a, b]$
the same length, $b - a$, so we can feel we have shifted not just the interval
$[a, b]$ but also the metric with which to measure its length.
Writing the length in the domain as $\int_a^b dx$ we see that we can get the (usual,
boring, old fashioned) length of the image in the codomain by writing it as
$$\int_{x=a}^{x=b} du = \int_a^b 2\, dx$$
where $du = 2\,dx$ follows from $u = 2x$.
You might have felt a bit happier had I written this as
$$\int_a^b \frac{du}{dx}\, dx = \int_a^b 2\, dx$$
Much depends on your previous experience of Calculus.
If we want to have length $b - a$ for the new, shiny length in the codomain,
which I shall call the $u$-space, we need to use $du^{-1}$. Now the classical
mathematicians, Gauss and his mob, would cheerfully write things not very different
from
$$u^{-1}(x) = x/2, \qquad du^{-1} = \tfrac{1}{2}\, dx$$
then using the metric given by $du^{-1}$ on the $u$-space we get the length of the
interval $[2a, 2b]$ in the $u$-space with the ‘right’ metric is
$$\int_{x=2a}^{x=2b} \tfrac{1}{2}\, dx = \tfrac{1}{2}(2b - 2a) = b - a$$
Obviously this works with all linear maps from $\mathbb{R}$ to $\mathbb{R}$ not just $2x$.
Exercise 3.3.10. Show it works for u(x) = −2(x).
If $u$ is a diffeomorphism from $\mathbb{R}$ to $\mathbb{R}$ then we have something like
$$\frac{dx}{du} = D u^{-1} = \frac{1}{du/dx}$$
by the inverse function theorem. The interval $[a, b]$ in the $x$-space is taken to
$[u(a), u(b)]$ if $u$ is increasing, which I can assume it is without loss of generality
since if it isn’t I just compose with the map that multiplies everything by
$-1$ and rename the composite to be $u$. Now the length of this in the usual
metric is $\int_a^b \frac{du}{dx}\, dx$. If I choose the metric given by $du^{-1}$, so that I replace the
old, boring metric $dx$ with the new, shiny, transported metric $\frac{1}{du/dx}\, dx$,
then the length is
$$\int_a^b \frac{du}{dx} \cdot \frac{1}{du/dx}\, dx = b - a$$
This tells us that if we use the metric transported by $u^{-1}$ to measure the
length of a curve in $\mathbb{R}$ transported by $u$, we get the same length.
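For a map that is not linear the cancellation can be checked numerically. The sketch below assumes the particular increasing diffeomorphism $u(x) = x^3 + x$, which is my choice and not one from the text. Working entirely in the $u$-space, it integrates the transported factor $1/(du/dx)$, evaluated at the preimage of each point, over the image interval $[u(a), u(b)]$; the inverse is found by bisection and the integral by Simpson's rule.

```python
def u(x):
    """Hypothetical increasing diffeomorphism of the real line."""
    return x ** 3 + x

def du_dx(x):
    """Derivative of u; always at least 1, so u is invertible."""
    return 3.0 * x * x + 1.0

def u_inverse(w, lo=-100.0, hi=100.0):
    """Invert u by bisection; assumes u(lo) <= w <= u(hi)."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if u(mid) < w:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def transported_density(w):
    """The factor 1/(du/dx), evaluated at the point x with u(x) = w."""
    return 1.0 / du_dx(u_inverse(w))

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0

a, b = 0.5, 2.0
usual = u(b) - u(a)                                      # ordinary length of the image
transported = simpson(transported_density, u(a), u(b))   # length in the shiny metric
print(usual, transported, b - a)
```

The ordinary length of the image is much bigger than $b - a$, while the transported metric hands back $b - a$, up to the error of the numerical integration.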
Exercise 3.3.11. Show this works just as well if the curve is in $\mathbb{R}^2$ and
$u : \mathbb{R}^2 \to \mathbb{R}^2$ is a diffeomorphism. The length of the curve in the $u$-space
measured by the metric transported to the $u$-space by $u^{-1}$ is the same in both
spaces. Hint: Try it for linear maps $u$ first.
Exercise 3.3.12. By taking two distinct charts both covering a curve on $S^2$
and hence related by a diffeomorphism, show that whichever chart you use, if
you induce the right metric tensors on $\mathbb{R}^2$ from the charts and calculate the
lengths by both of them, they agree on the length of the curve on $S^2$. Note that
you don’t really need $S^2$ at all for this exercise, it is about the way covariant
tensor fields on $\mathbb{R}^2$ transform under diffeomorphisms.
Exercise 3.3.13.
1. Show that contravariant tensors of any order on a vector space $U$ are
carried by linear maps $f : U \to V$ to contravariant tensors of the same
order on the vector space $V$.
2. Show that covariant tensor fields of any order on a smooth manifold
$V$ are carried to covariant tensor fields on a manifold $U$ of the same
order by the inverse of a diffeomorphism $h : V \to U$.
3. Let $\langle\,,\,\rangle$ be an inner product on $V$. Show that it induces an isomorphism
between $V$ and $V^*$.
4. Does an isomorphism from $V$ to $V^*$ always give an inner product on
$V$?
5. Deduce that if we have an inner product on $V$ we can induce a dual
inner product on $V^*$ and vice-versa, and hence that it would be possible
to define a Riemannian metric tensor as being a contravariant tensor
field.
6. Explain why this is not usually done.
7. Find a metric tensor on $S^2$ which gives the usual notions of distance
and angles between intersecting curves.
8. Using this tensor, confirm that the angle at the north pole between the
curves obtained by travelling around the great circle in the $x$-$z$ plane
and the great circle in the $y$-$z$ plane is what common sense says it
should be.
We now have enough machinery to say quite a lot about what we mean by the
geometry of a space and in particular we can say something about curvature.
If I give a curve on a manifold, and a riemannian metric structure on it, we
can cover the curve with open sets homeomorphic to open sets in $\mathbb{R}^n$, shift
the curve (in pieces if necessary) back to $\mathbb{R}^n$, shift the metric structure, and
compute the length. But how do we specify a curve on the manifold in the
first place? And how do we specify a riemannian structure on it? We can do
that with charts too, if all else fails.
Note that this is all intrinsic, it does not require an embedding of the manifold
in $\mathbb{R}^n$. If we do have an embedding, we can derive a riemannian structure
for the manifold from the usual euclidean metric on the enclosing space.
Exercise 3.3.14. How?
But if we are to say anything about the geometry of the space of the universe
in which we live, it has to be done intrinsically. If there is an embedding of
the universe in some higher dimensional euclidean space, we cannot ever
know anything about it, and so it is idle to talk about it.
A useful reference for much of the material covered so far is Volume One of
A Comprehensive Introduction to Differential Geometry by Michael Spivak.
A quick glance should persuade you that there is a rather considerable depth
in the material. Also bear in mind there are several volumes.
The idea of a vector bundle with bundle maps which are linear on each fibre
is of great importance in theoretical physics. Essentially, fields are sections of
suitable locally trivial vector bundles. We can specify a general locally trivial
vector bundle over a manifold by taking local trivialisations and specifying
a way of gluing the local products together. This often involves a group:
in the case of the Möbius bundle, for example, the group $\mathbb{Z}_2$ has everything
to do with the bundle structure. Quantum Chromodynamics and Gauge
Invariance are describable in terms of the structure of locally trivial vector
bundles. The physicists Yang and Mills were obliged to reinvent some of the
ideas well known to differential geometers, a good argument for doing serious
mathematics before tackling theoretical physics.
3.4 Geometry
Given a Riemannian Manifold $M^n$ which we assume is compact and path
connected, for any $a, b \in M^n$ we can take a smooth path between $a$ and $b$
and compute its length. This gives a map from the space of smooth paths
joining a and b to R. Of all possible such paths, we may hope that there is
one with the length a minimum, a geodesic on the manifold. It is not entirely
trivial to show that such a path exists.
Exercise 3.4.1. Show that if $M^n$ is not compact there may be points $a, b$
such that there is no path of minimum length joining them.
When this can be done we assign the distance between $a$ and $b$ to be this
minimum length. Note that the infimum of the lengths of such paths always exists,
even when no path has it as its length, and in that case we take the distance
to be the infimum.
This makes the Riemannian manifold a metric space.
Exercise 3.4.2.
1. Prove that last claim.
2. Show that $T^2$, $S^2$, $\mathbb{RP}^2$, $K^2$ all have the structure of Riemannian
manifolds and give the metric arising.
Definition 3.4.1. A map f : (X, d) → (Y, e) between metric spaces is an
isometry iff it preserves distances, i.e. iff
∀ a, b ∈ X, e(f(a), f(b)) = d(a, b)
If we take a small ball centred on the north pole of $S^2$ as in the diagram,
figure 3.4.1, we can make it a ball in the metric of some radius $r$.
There is certainly a diffeomorphism between this ball and the ball of radius
$r$ centred on the origin in $\mathbb{R}^2$. However it is easy to see that the length of the
perimeter is $2\pi r$ for the ball in $\mathbb{R}^2$ but is less than this for the ball on $S^2$. So
there cannot be an isometry between the two balls.
Exercise 3.4.3. Provide a convincing argument for these claims.
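A small numerical sketch may help to see the size of the discrepancy. It assumes the unit sphere with its usual round metric, where the points at geodesic distance $r$ from the north pole lie at colatitude $r$ and so form a euclidean circle of radius $\sin(r)$. The function names are illustrative only, and this is a check on the claim rather than the argument the exercise asks for.

```python
import math

def perimeter_flat(r):
    """Perimeter of the ball of radius r in the euclidean plane."""
    return 2 * math.pi * r

def perimeter_sphere(r):
    """Perimeter of the geodesic ball of radius r on the unit sphere S^2.

    Points at geodesic distance r from the north pole sit at colatitude r,
    on a horizontal circle of euclidean radius sin(r).
    """
    return 2 * math.pi * math.sin(r)

for r in (0.1, 0.5, 1.0):
    print(r, perimeter_flat(r), perimeter_sphere(r))
```

The two perimeters agree to first order in $r$ and separate as $r$ grows, which is why no map between the two balls can preserve all distances.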
On the other hand, there is an obvious isometry between any two balls of the
same radius in $\mathbb{R}^2$, and also one between any two balls of the same sufficiently
small radius in $S^2$. A shift does it in $\mathbb{R}^2$ and a rotation in $S^2$.
Definition 3.4.2. If $X^n$ is a Riemannian manifold and for any two points
$a, b \in X$, there is a smooth isometry $f : X \to X$ with $f(a) = b$ then we say
the geometry is homogeneous.
Figure 3.4.1: A ball (disc) on $S^2$.
Definition 3.4.3. If $X^n$ and $Y^m$ are homogeneous Riemannian manifolds
and for any sufficiently small ball $B$ in $X$ there is a smooth map $f$ taking $B$
isometrically to a ball on $Y$, then we say that $X$ and $Y$ have the same local
geometry.
Exercise 3.4.4.
1. Show that if two homogeneous manifolds have the same local geometry
they have the same dimension.
2. Show that having the same local geometry is an equivalence relation
on homogeneous Riemannian manifolds.
3. Show that the flat torus defined by gluing is a homogeneous Riemannian
manifold with the same local geometry as $\mathbb{R}^2$.
4. Show that the cylinder $S^1 \times \mathbb{R}$ as the subspace $(\cos(t), \sin(t), z)^T$ of $\mathbb{R}^3$
has the same local geometry as $\mathbb{R}^2$.
5. Show that $\mathbb{RP}^2$ is a homogeneous Riemannian manifold.
6. Construct a definition of what it means for two manifolds to have the
same local topology. Give an example of distinct manifolds having
the same local topology.
7. Construct a definition of what it means for two manifolds to have the
same global geometry.
3.5 The Exterior Algebra
I have explained that the tensor algebra $T^k(\mathbb{R}^n)$ has basis the set
$$\{ dx^{i_1} \otimes dx^{i_2} \otimes \cdots \otimes dx^{i_k} : dx^{i_j} \in T^1(\mathbb{R}^n) \}$$
Well to be more exact, I asked you to prove it. It is all a matter of getting
used to the jargon and is conceptually rather simple once you are happy
with dual bases and elements of the cotangent bundle as linear maps taking
tangent vectors to numbers.
The space of alternating covariant k-tensors, $\Omega^k(\mathbb{R}^n)$, we know has dimension
${}^nC_k$ and we can get a basis by noting that we have the maps specified when
we know what they do to some standard basis elements of
$$\underbrace{\mathbb{R}^n \times \mathbb{R}^n \times \cdots \times \mathbb{R}^n}_{k \text{ terms}}$$
Since the maps are clearly linearly independent for different choices and since,
by an exercise, any alternating k-tensor on $\mathbb{R}^n$ can be expressed as a linear
combination of maps which take each of the needed basis elements of $\mathbb{R}^{nk}$ to
1, we can say, at some length, what the basis elements of $\Omega^k(\mathbb{R}^n)$ are. Each
is the map which takes one choice of $e_{i_1}, e_{i_2}, \cdots, e_{i_k}$ having $i_1 < i_2 < \cdots < i_k$
to one and every other such increasing choice to zero, and whose values on the
remaining basis elements of $\mathbb{R}^{nk}$ are specified by the
fact that it alternates. Then multilinearity forces the value of any linear
combination of these things everywhere.
It has to be said that this is messy. It would be nice if we could specify the
basis elements more neatly. Something similar to the description given for
the general tensor space $T^k(\mathbb{R}^n)$ would be neater. We can take it that this is
possible because we know that we can express a basis for the space in terms
of making choices of k distinct rows of k vectors from $\mathbb{R}^n$ placed side by side
and evaluating the determinant on our choice.
If we have two vectors from $\mathbb{R}^3$,
$$(x, a) = \begin{pmatrix} x & a \\ y & b \\ z & c \end{pmatrix}$$
then we can take the top pair to get $\omega_{12}(x, a) = xb - ya$, or the bottom pair
to get $\omega_{23}(x, a) = yc - zb$, or the top and bottom to get $\omega_{13}(x, a) = xc - za$.
These three are all alternating and every alternating 2-tensor on $\mathbb{R}^3$ is a linear
combination of these three.
Exercise 3.5.1. Prove this last remark.
We actually write these as $dx \wedge dy$, $dy \wedge dz$ and $dx \wedge dz$ respectively, and the
only thing left to do is to explain where the terminology comes from.
First I want to explain the determinant for $n \times n$ matrices. You will need to
recall the material on odd and even permutations from 3P0.
Suppose I have 3 columns, each column a vector in $\mathbb{R}^3$. I shall write them
$$\begin{pmatrix} x_1 & y_1 & z_1 \\ x_2 & y_2 & z_2 \\ x_3 & y_3 & z_3 \end{pmatrix}$$
Now I choose one element from each column, taking care never to choose
two things from the same row. So I can pick $x_1, y_2, z_3$ or $x_2, y_1, z_3$ but not
$x_1, y_3, z_1$.
It follows that if we just look at the indices in xyz order, we get a permutation
of (1, 2, 3) specifying a choice. If I pick $x_1, y_2, z_3$ I get the identity
permutation. If I pick $x_2, y_1, z_3$ I get the permutation which I wrote out as
$$\begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}$$
Now I make every possible choice of $x_i, y_j, z_k$ with no two indices the same,
and I get 6 (count them) possibilities, which is 3!, the size of the permutation
group $S_3$. Now I multiply every number in each choice together, obtaining
$x_1 y_2 z_3$ for the first permutation and $x_2 y_1 z_3$ for the second, and so on. Note
that the terms in the product are never the same (although of course the
values of the terms or the result of multiplying them together may be). This
gives me 3! products. Had I done this with n vectors in $\mathbb{R}^n$ I should have got
n! distinct products.
Now I take these products, multiply each by the sign of the permutation (+1
if an even permutation, −1 if odd), and add them up. This sum of n! terms,
with parity taken into account is the determinant of the matrix.
Exercise 3.5.2. Confirm this for n = 2 and n = 3.
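The recipe just described (pick one entry from each column with no row repeated, multiply, attach the sign of the permutation, add up) can be transcribed almost word for word. The following is a minimal sketch with illustrative names; it stores the matrix as a list of columns and counts inversions to get the sign, which is one of several equivalent ways to compute parity.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation of (0, ..., n-1), found by counting inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def det(columns):
    """Determinant as a signed sum of n! products.

    columns[c][r] is the entry in column c and row r. For each permutation
    we take one entry from every column and never use a row twice.
    """
    n = len(columns)
    total = 0
    for perm in permutations(range(n)):
        product = 1
        for c in range(n):
            product *= columns[c][perm[c]]
        total += sign(perm) * product
    return total

print(det([(1, 2), (3, 4)]))                   # columns (1,2) and (3,4)
print(det([(2, 0, 0), (0, 3, 0), (0, 0, 4)]))  # a diagonal matrix
```

This is hopeless as a practical algorithm for large n, since the number of terms grows like n!, but it is the definition that generalises to the wedge product.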
Exercise 3.5.3. Show that for any $n \times n$ matrix $A$, $\det(A) = \det(A^T)$.
I could write out the choice $x_1, y_2, z_3$ as $dx^1 \otimes dx^2 \otimes dx^3$ applied to the three
vectors. This is taking $dx^i$ to mean the projection map from $\mathbb{R}^3$ to $\mathbb{R}$ which
selects the $i$th component. I am confusing this projection map (which I might
more reasonably have called $e_i^*$, the dual basis element to $e_i$, or $e^i$) with the
map $dx^i : \dot{\mathbb{R}}^n \to \mathbb{R}$ and the reason is that pretty soon we shall be doing all
this on the tangent space, and if I confuse the notation a bit now there is
less novelty later.
In the case of $\mathbb{R}^2$ I get that the determinant can be written easily as
$dx^1 \otimes dx^2 - dx^2 \otimes dx^1$. There are, after all, only two permutations of two things.
I shall write this as $dx \wedge dy$. In fact if I have a covariant 2-tensor on $\mathbb{R}^n$, I
shall also write:
$$dx^1 \wedge dx^2 = dx^1 \otimes dx^2 - dx^2 \otimes dx^1$$
This is equivalent to choosing the first two rows of the $n \times 2$ matrix made
up by choosing any two vectors in $\mathbb{R}^n$, and computing the determinant.
It is easy to see that it is an alternating covariant 2-tensor on $\mathbb{R}^n$. I have
immediately that $dx^2 \wedge dx^1 = -dx^1 \wedge dx^2$.
Similarly I can take $dx^i \wedge dx^j$ defined by
$$dx^i \wedge dx^j = dx^i \otimes dx^j - dx^j \otimes dx^i$$
and this is $-dx^j \wedge dx^i$ and $dx^i \wedge dx^i = 0$, for $i, j \in [1 : n]$. What this means
is that I select the $2 \times 2$ matrix comprising the $i$th and $j$th rows of the two
column vectors, and calculate the determinant of them.
This can be generalised to covariant 3-tensors on $\mathbb{R}^n$ without too much trouble.
In this case I have to define $dx^i \wedge dx^j \wedge dx^k$ and I do this by writing out
every permutation of $i, j, k$ so that if $\sigma$ is a permutation I take the 3! terms
$$dx^{\sigma(i)} \otimes dx^{\sigma(j)} \otimes dx^{\sigma(k)}$$
for the 3! permutations, $\sigma$. I then multiply the resulting numbers together,
multiply by the sign of the permutation, and sum the 3! numbers. This gives
$dx^i \wedge dx^j \wedge dx^k$. It is easy to see that it is an alternating 3-tensor on $\mathbb{R}^n$.
Exercise 3.5.4. Prove the last claim.
The generalisation to alternating k-tensors on $\mathbb{R}^n$ is obvious.
Exercise 3.5.5. Write it down.
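For readers who like to check such things by machine, here is a sketch of the signed-permutation recipe for a wedge of basis covectors, evaluated on a list of vectors. The function names and the 1-based index convention are choices made for this illustration, not notation from the notes.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation of (0, ..., k-1), found by counting inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def wedge(indices, vectors):
    """Value of dx^{i_1} ∧ ... ∧ dx^{i_k} on k vectors (indices are 1-based).

    For each permutation of the k indices, apply the tensor product of the
    permuted projections to the vectors in order, multiply the k numbers,
    attach the sign of the permutation, and add everything up.
    """
    k = len(indices)
    total = 0
    for perm in permutations(range(k)):
        product = 1
        for slot in range(k):
            component = indices[perm[slot]] - 1
            product *= vectors[slot][component]
        total += sign(perm) * product
    return total

u, v = (1, 2, 3), (4, 5, 6)
print(wedge((1, 2), [u, v]), wedge((2, 3), [u, v]), wedge((1, 3), [u, v]))
```

With `u` and `v` as the two columns, the three printed values are the three $2 \times 2$ determinants obtained by choosing two rows, and swapping the two indices or the two vectors flips the sign.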
It follows that we can give a basis for the space $\Omega^k(\mathbb{R}^n)$ rather easily: it
consists of the alternating tensors
$$\{ dx^{i_1} \wedge dx^{i_2} \wedge \cdots \wedge dx^{i_k} : i_1 < i_2 < \cdots < i_k \in [1 : n] \}$$
Now putting $x^1 = x$, $x^2 = y$ and $x^3 = z$ in traditional fashion, we recover
the mysterious expression at the beginning of this section on the Exterior
Algebra.
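Listing the basis is just listing increasing index sequences, so the count ${}^nC_k$ can be seen directly. The sketch below does this for $\mathbb{R}^3$, with the coordinate names x, y, z assumed as above and `basis_labels` a name invented for the illustration.

```python
from itertools import combinations
from math import comb

def basis_labels(k, n):
    """All increasing index tuples i_1 < ... < i_k drawn from 1..n."""
    return list(combinations(range(1, n + 1), k))

names = "xyz"
for k in range(4):
    labels = basis_labels(k, 3)
    # The empty wedge (k = 0) is written as the number 1.
    forms = [" ∧ ".join("d" + names[i - 1] for i in idx) or "1" for idx in labels]
    print(k, len(labels), comb(3, k), forms)
```

Asking for `basis_labels(4, 3)` returns an empty list, in agreement with there being no non-zero alternating k-tensors once k exceeds n.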
Exercise 3.5.6.
1. Show how to construct a linear map $\mathrm{Alt} : T^k(\mathbb{R}^n) \to \Omega^k(\mathbb{R}^n)$ which
‘alternates’ any tensor and sends any alternating $\omega$ to itself. Hint: the
essential idea occurs in turning $dx \otimes dy$ into $dx \wedge dy$ and $dx^1 \otimes dx^2 \otimes dx^3$
into $dx^1 \wedge dx^2 \wedge dx^3$ by adding up the signed permutations. It might be
better to average them in this case.
2. Show how to generalise the $\wedge$ of $dx^i$, $dx^j$ so that if $\omega$ is an alternating
k-tensor on $\mathbb{R}^n$ and $\phi$ is an alternating $\ell$-tensor, then $\omega \wedge \phi$ is an
alternating $(k + \ell)$-tensor. Hint: Hit the tensor product with the Alt of
the last exercise.
3. Show that $dx^1 \wedge dx^2$ applied to a pair of points in $\mathbb{R}^2$, represented in
the standard way, gives twice the oriented area of the triangle formed
by the pair of points together with the origin.
4. What do you need to get the area of the triangle formed by two points
in $\mathbb{R}^n$ and the origin? Does it make sense to talk of an oriented area
in this case?
The exercises should now make it clear that just as $\otimes$ between k-tensors
and $\ell$-tensors gives us $(k + \ell)$-tensors and hence a graded algebra, so $\wedge$ between
alternating k-tensors and alternating $\ell$-tensors gives us an alternating
$(k + \ell)$-tensor and hence another graded algebra. This is called the Exterior
Algebra. Since the only alternating n-tensor on $\mathbb{R}^n$ is, up to a scalar multiple,
the determinant, and since $\Omega^k(\mathbb{R}^n)$ is just the zero tensor whenever $k > n$, we are
really only concerned with the graded algebra $\{\Omega^k(\mathbb{R}^n) : 1 \le k \le n\}$. This makes the
exterior algebra rather simpler (and a lot smaller) than the tensor algebra.
Exercise 3.5.7. Show that $\Omega^k(\mathbb{R}^n)$ is just the zero tensor whenever $k > n$.
I can write out the full exterior algebra in the form:
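Order                        basis                 isomorphic to
$\Omega^0(\mathbb{R}^2)$     $1$                   $\mathbb{R}^1$
$\Omega^1(\mathbb{R}^2)$     $\{dx, dy\}$          $\mathbb{R}^2$
$\Omega^2(\mathbb{R}^2)$     $\{dx \wedge dy\}$    $\mathbb{R}^1$
This is a nice finite table. Just as I wrote out table 3.1.2 I can write out the
exterior algebra for $\mathbb{R}^2$: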
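$\wedge$           $\mathbb{R}$               $\mathbb{R}^2$             $\mathbb{R}$
$\mathbb{R}$       $(\mathbb{R}, \times)$     $(\mathbb{R}^2, \cdot)$    $(\mathbb{R}, \times)$
$\mathbb{R}^2$     $(\mathbb{R}^2, \cdot)$    $(\mathbb{R}, \det)$       $\{0\}$
$\mathbb{R}$       $(\mathbb{R}, \times)$     $\{0\}$                    $\{0\}$
Table 3.5
Again, $\times$ denotes ordinary multiplication in $\mathbb{R}$ and $\cdot$ denotes scalar
multiplication. And det denotes the determinant. The table starts with 0-tensors at
the top and 2-tensors at the bottom.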
Exercise 3.5.8. Write out the full exterior algebra on $\mathbb{R}^3$. You should replicate
the above two tables with rather more columns and rows. In the second
table, work out what the multiplications are, as for table 3.1.2. Do you recognise
anything?
3.6 The Exterior Calculus
The step from the tensor algebra to tensor fields consisted of having a section
of the tensor bundle, which meant attaching a type $(k, \ell)^T$ tensor to each
point on a manifold. We do exactly the same thing again: we take a section
of the $\Omega^k(V)$ bundle where $V$ is a tangent space. This means that we attach
to each point $a$ of the n-manifold $M^n$ an alternating k-tensor on the space
$T_a(M)$. A 0-tensor is just a number, and attaching a number to each point of
a manifold is merely defining a map from $M^n$ to $\mathbb{R}$. Similarly, attaching an
n-tensor is attaching a number, the volume element, at each point of $M^n$. In
between we have k-forms attached at each point of the manifold. Naturally
we want the sections to be smooth.
Such sections are called differential forms on the manifold.
To make this concrete we look at $\mathbb{R}^2$ and $\mathbb{R}^3$.
A differential 0-form on $\mathbb{R}^2$ is just a smooth map from $\mathbb{R}^2$ to $\mathbb{R}$. We know a
fair bit about these.
A differential 1-form on $\mathbb{R}^2$ assigns to each point of $\mathbb{R}^2$ a pair of numbers
$a\,dx + b\,dy$ and consequently is a pair of functions
$$P(x, y)\,dx + Q(x, y)\,dy$$
It is a covector field and looks very like a vector field (but watch out for what
happens when you change bases!)
A differential 2-form on $\mathbb{R}^2$ assigns to each point $a$ of $\mathbb{R}^2$ an operator $\alpha(a)\,dx \wedge dy$.
This is short for $\alpha(a)(dx \otimes dy - dy \otimes dx)$ for some number $\alpha(a)$ which
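depends on $a$. This acts on any pair of vectors in the tangent space at $a$.
Let’s choose some with respect to the standard basis for $\dot{\mathbb{R}}^2$ (since, for any
$a \in \mathbb{R}^2$, $\dot{\mathbb{R}}^2_a$ is isomorphic to $\dot{\mathbb{R}}^2_0$ in a natural way). Then
$$\alpha(a)\,dx \wedge dy \left( \begin{bmatrix} x \\ y \end{bmatrix}, \begin{bmatrix} u \\ v \end{bmatrix} \right) = \alpha(a)(xv - yu)$$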
So $\alpha(a)\,dx \wedge dy$ assigns to any pair of tangent vectors the oriented area of the
parallelogram in the tangent space which they determine, multiplied by a function
of $a$. Or if you prefer, twice $\alpha(a)$ times the area of the triangle consisting of
the two points and the origin of the tangent space $\dot{\mathbb{R}}^2$.
A quite useful way of looking at this is that α(a) dx ∧dy is doing something
a bit like the riemannian metric, but instead of returning the inner product
of two tangent vectors it is returning an infinitesimal area element. So we
imagine that we want the area definition to vary over the space so that
calculating an area of a region is now more complicated. On the other hand
you have seen this before, more or less.
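Example 3.6.1. First I am going to transform the usual area measure $dx \wedge dy$
on $\mathbb{R}^2$ and use it to calculate the area of the unit disc in polar coordinates.
We have the polar coordinate transform
$$P : \mathbb{R}^2 \setminus \{0\} \to S^1 \times \mathbb{R}^+, \qquad \begin{bmatrix} x \\ y \end{bmatrix} \mapsto \begin{bmatrix} \theta \\ r \end{bmatrix}$$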
We have the inverse given by
x = r cos(θ)
y = r sin(θ)
and exactly as before
dx = −r sin(θ) dθ + cos(θ) dr
dy = r cos(θ) dθ + sin(θ) dr
Last time we calculated $dx \otimes dx$ and the three others. This time there is only
one thing to calculate, $dx \wedge dy$. We get
$$dx \wedge dy = (-r \sin(\theta)\,d\theta + \cos(\theta)\,dr) \wedge (r \cos(\theta)\,d\theta + \sin(\theta)\,dr)$$
which we can easily see is just:
$$-r \sin^2(\theta)\,d\theta \wedge dr + r \cos^2(\theta)\,dr \wedge d\theta = r\,dr \wedge d\theta$$
Exercise 3.6.1. Show this carefully.
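A quick numerical sanity check on the identity $dx \wedge dy = r\,dr \wedge d\theta$, offered as a sketch and not a substitute for the exercise: evaluate $dx \wedge dy$ on the images of the coordinate vectors $\partial/\partial r$ and $\partial/\partial\theta$ under $x = r\cos(\theta)$, $y = r\sin(\theta)$. Since $r\,dr \wedge d\theta$ takes the value $r$ on that pair, the two sides should agree. The function name is invented for the illustration.

```python
import math

def dx_wedge_dy_on_polar_frame(r, theta):
    """Evaluate dx ∧ dy on the pair (∂/∂r, ∂/∂θ) pushed forward by
    x = r cos θ, y = r sin θ."""
    d_r = (math.cos(theta), math.sin(theta))               # (∂x/∂r, ∂y/∂r)
    d_theta = (-r * math.sin(theta), r * math.cos(theta))  # (∂x/∂θ, ∂y/∂θ)
    # dx ∧ dy of two vectors is the 2 x 2 determinant of their components.
    return d_r[0] * d_theta[1] - d_r[1] * d_theta[0]

for r, theta in [(0.5, 0.3), (1.0, 2.0), (2.5, -1.2)]:
    print(r, dx_wedge_dy_on_polar_frame(r, theta))
```

The value returned is $r$ whatever the angle, which is the familiar Jacobian factor appearing without any special pleading.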
Using this new area element we get that
$$\int_{B_1(0)} dx \wedge dy = \int_{B_1(0)} r\,dr \wedge d\theta$$
which we already knew although not in this language. Note that the domain
of integration, $B_1(0)$, is a disc in the $x$-$y$ space and a rectangle wrapped
around a cylinder in the $\theta$-$r$ space. This is what happens to the punctured
disc under the diffeomorphism $P$.
The new integral has $\theta \in S^1$ and $r \in [0, 1]$ which makes for an easy integral,
$(1/2)(2\pi) = \pi$.
I knew that.
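The same computation can be done numerically, which is useful once the density is no longer constant. The following is a rough sketch using the midpoint rule on the rectangle $r \in [0, 1]$, $\theta \in [0, 2\pi]$; the function name, the grid sizes and the choice of quadrature rule are all assumptions made for the illustration.

```python
import math

def integrate_over_unit_disc(alpha, n_r=200, n_theta=200):
    """Midpoint-rule approximation of the integral of alpha(x, y) dx ∧ dy
    over the unit disc, computed as alpha * r dr dθ on the rectangle
    r in [0, 1], θ in [0, 2π]."""
    dr = 1.0 / n_r
    dtheta = 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_theta):
            theta = (j + 0.5) * dtheta
            x, y = r * math.cos(theta), r * math.sin(theta)
            total += alpha(x, y) * r * dr * dtheta
    return total

area = integrate_over_unit_disc(lambda x, y: 1.0)
print(area, math.pi)
```

Passing a different function for `alpha` integrates a non-constant density over the disc in the same way.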
Note that this works because we
• transformed the disc in $\mathbb{R}^2$ into a rectangle in $S^1 \times \mathbb{R}^+$, except that the
centre of the disc really got thrown away (zero area so does not affect
the result) so the rectangle (a) doesn’t have a base (zero area in any
sane density on $\mathbb{R}^2$) and (b) gets wrapped once around the circle.
• Back transformed the measure density dx ∧dy to get the right density
to use to compute the area. All this does is to make clear something
which you were trained to do using much sloppier arguments to justify
the right rule for the change to polars. It was all perfectly OK but the
rationale was scruffy. Note how the exterior algebra rules for computing
the new form automatically take care of signs and orientations. Doing
it for any other transformation than the polar one is now a doddle.
Exercise 3.6.2. Calculate the area of the unit disc in $\mathbb{R}^2$ with respect to the
density $xy\,dx \wedge dy$.
Exercise 3.6.3. Work through the argument for the spherical and cylindrical
polar coordinate transformations in $\mathbb{R}^3$.
Exercise 3.6.4. Think of some bizarre diffeomorphism of $\mathbb{R}^2$ to some two
dimensional space that does something frightful but has an explicit inverse
(make sure the inverse can be written down even if the original is a swine).
Use it to evaluate the area of some region in both spaces, before and after
being transformed. This should give two moderately foul double integrals
with weird limits. Use Mathematica to get numerical solutions and confirm
they are pretty much the same.
Remark 3.6.1. This should give you a conviction that differential forms
have their uses, and will suggest the most important thing about them:
Differential Forms are things you integrate over manifolds. A differential
k-form can be integrated over a k-manifold or k-manifold with boundary.
I have made sure you would see this as it tells you what they are for.
Transforming differential forms by diffeomorphisms follows the same pattern
as for transforming the riemannian metric tensor, except that we may have
to transform k-forms for k > 2. The rules are simple however.
Exercise 3.6.5. Write down explicit rules in terms of partial derivatives for
transforming a differential 3-form on $\mathbb{R}^3$ under a diffeomorphism, rules which
you must have used in doing the preceding exercise but one.
It follows from the big announcement that 2-forms on $\mathbb{R}^2$ are integrated over
things like discs, and a 1-form on $\mathbb{R}^2$ has to be integrated over curves.
Example 3.6.2. Let the curve $c$ be the graph of $y = x^2$ between $x = 0$ and
$x = 1$. Let the differential 1-form be $dx + dy$. What would we expect the
answer to $\int_c dx + dy$ to be on the basis of what this means, and what would
the calculation be?
Solution: Drawing the graph and taking a typical line segment $\ell$ on the curve,
$\Delta x(\ell)$ is the projection along the x-axis and $\Delta y(\ell)$ is the projection along
the y-axis. If we add these up we get 1 + 1 = 2, and this is not going to
change as the segments get shorter. So the answer is 2. All done by a little
thought about what these things mean.
If we write $y = x^2$ we get $dy = 2x\,dx$ so
$$\int_c dx + dy = \int_{[0,1]} (1 + 2x)\,dx = \left[ x + x^2 \right]_0^1 = 2$$
As an alternative we could write $x = t$, $y = t^2$, $t \in [0, 1]$ to express the curve
parametrically and this would give the same answer.
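The "add up the projections of short segments" argument can be carried out literally. The sketch below assumes the curve is given parametrically on $[0, 1]$ and evaluates the 1-form at the midpoint parameter of each segment; the names `integrate_one_form` and `parabola` are invented for the illustration.

```python
def integrate_one_form(P, Q, curve, n=1000):
    """Approximate the integral of P dx + Q dy along curve(t), t in [0, 1].

    The curve is cut into n short segments. On each one the form is
    evaluated at the midpoint parameter and applied to the displacement
    (Δx, Δy) of that segment.
    """
    total = 0.0
    for i in range(n):
        x0, y0 = curve(i / n)
        x1, y1 = curve((i + 1) / n)
        xm, ym = curve((i + 0.5) / n)
        total += P(xm, ym) * (x1 - x0) + Q(xm, ym) * (y1 - y0)
    return total

def parabola(t):
    """The graph of y = x^2 between x = 0 and x = 1."""
    return (t, t * t)

print(integrate_one_form(lambda x, y: 1.0, lambda x, y: 1.0, parabola))
```

For the form $dx + dy$ the sum telescopes, so the result is 2 up to rounding however coarse the subdivision, which is the point made in the solution above.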
Exercise 3.6.6. Now try it for the curve c being the first quadrant of the
unit circle. Do you get the same answer? If not why not?
Note that when we express the curve parametrically we do so by a function $c$
and this allows us to pull back the differential 1-form on $\mathbb{R}^2$ to a differential
1-form on $\mathbb{R}$ which is just some function multiplied by $dt$ and we can integrate
this in the usual way, numerically if necessary.
Exercise 3.6.7.
1. What would you expect the result of $\int_c dx + dy$ to be when $c$ is any
smooth closed curve?
2. Suppose we take $f(x, y) = x + y$. Then we have
$$df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy$$
which in this case is the 1-form $dx + dy$. So differentiating a smooth
0-form gives a smooth 1-form. Show that this is always the case.
3. It follows that $\int_c dx + dy = \int_c df$. Use the fundamental theorem
of calculus to prove your solution to the first question is correct, and
verify that it gives the right answers to all the other integrations of this
1-form along curves.
4. Find a 1-form on $\mathbb{R}^2$, $P(x, y)\,dx + Q(x, y)\,dy$, that is not $df$ for any $f$.
Hint: What can we say about $\partial P/\partial y$ and $\partial Q/\partial x$ if the 1-form is the
derivative of a 0-form?
The usual way to represent the derivative of a 0-form $f$ is as the row matrix
$[\partial f/\partial x, \partial f/\partial y]$ and since this represents, when evaluated at any point, a
linear map from $\mathbb{R}^2$ to $\mathbb{R}$ and $P\,dx + Q\,dy$ represents a linear map from $\dot{\mathbb{R}}^2$ to
$\mathbb{R}$ when evaluated at any point, the difference is rather small, but significant.
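When we treat the differentiation in the second sense, we call $d$ the exterior
derivative. It goes much further than this. I shall define an exterior derivative
of 1-forms to give a 2-form:
For $\omega = P\,dx + Q\,dy$, I define
$$d\omega = \frac{\partial P}{\partial y}\,dy \wedge dx + \frac{\partial Q}{\partial x}\,dx \wedge dy = \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dx \wedge dy$$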
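To get a feel for the definition, the coefficient $\partial Q/\partial x - \partial P/\partial y$ can be estimated numerically. This is a sketch using central differences; the step size `h` and the function name are assumptions of the illustration, and the two sample forms are chosen so that the expected answers are easy to work out by hand.

```python
def d_coefficient(P, Q, x, y, h=1e-5):
    """Approximate (∂Q/∂x − ∂P/∂y) at (x, y) by central differences.

    This is the coefficient of dx ∧ dy in dω for ω = P dx + Q dy.
    """
    dQ_dx = (Q(x + h, y) - Q(x - h, y)) / (2 * h)
    dP_dy = (P(x, y + h) - P(x, y - h)) / (2 * h)
    return dQ_dx - dP_dy

# ω = df with f = x^2 y, so P = 2xy and Q = x^2.
exact_form = d_coefficient(lambda x, y: 2 * x * y, lambda x, y: x * x, 0.3, 0.7)

# ω = -y dx + x dy, which is not df for any f.
rotation_form = d_coefficient(lambda x, y: -y, lambda x, y: x, 0.3, 0.7)

print(exact_form, rotation_form)
```

The first value is zero up to rounding, as it must be for the derivative of a 0-form, and the second is 2, so $d(-y\,dx + x\,dy) = 2\,dx \wedge dy$.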
Exercise 3.6.8.
1. Show that $d^2 f = d(df) = 0$ for any 0-form $f$.
2. Calculate the exterior derivative of a 1-form on $\mathbb{R}^3$ by making up a
suitable example.
3. Pretend, briefly, that there is no such thing as duality and that the last
1-form is a vector field. Identify the 2-form.
4. Make the rule: To obtain the exterior derivative of a k-form on $\mathbb{R}^n$, take
each component function $P(x^1, x^2, \cdots, x^n)\,dx^{i_1} \wedge dx^{i_2} \wedge \cdots \wedge dx^{i_k}$ of the
k-form, differentiate each such $P$ with respect to each of the variables
separately to get, for example, some $\partial P/\partial x^j$, and put $dx^j \wedge$ in front of
the existing term, to get
$$\frac{\partial P}{\partial x^j}\,dx^j \wedge dx^{i_1} \wedge dx^{i_2} \wedge \cdots \wedge dx^{i_k}$$
Sum the results for the n different variables $x^j$ and also for the different
functions $P$. The result is a (k + 1)-form on $\mathbb{R}^n$. Show that this rule
gives the same answer as in the particular cases you have worked with.
5. Use the above rule to calculate the exterior derivative of a 2-form on
$\mathbb{R}^3$. Choose your own 2-form, preferably so as to have three non-trivial
but differentiable component functions.
6. Pretending, briefly, that the 2-form $\omega$ on $\mathbb{R}^3$ is a vector field, identify
$d\omega$.
Remark 3.6.2. You should be able to see that the clunky way you did
Stokes’ Theorem using vector fields arose from confusing vector fields with
both 1-forms and 2-forms, which you can do only on $\mathbb{R}^3$. In fact it is really
about differential forms. Stokes’ Theorem in general says
$$\int_{\partial M} \omega = \int_M d\omega$$
where $M$ is an n-manifold with boundary $\partial M$ and $\omega$ is any differential $(n-1)$-form.
For a proof, dig up my old 2C2 notes off the web.
This is the ‘modern’ form of Stokes’ Theorem. It differs from the old obsolete
form in two ways: first it is about differential forms not vector fields, so a
grasp of duality is important, and second it works for all positive integers n,
all n-manifolds with boundary (or without, but they are less interesting).
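As a concrete instance, take $M$ to be the closed unit disc in $\mathbb{R}^2$ and $\omega = -y\,dx + x\,dy$, so that $d\omega = 2\,dx \wedge dy$. The sketch below computes both sides of Stokes’ Theorem numerically; the choice of $\omega$, the counter-clockwise orientation of the boundary, and the function names are assumptions made for the illustration.

```python
import math

def boundary_integral(n=2000):
    """Integral of ω = -y dx + x dy around the unit circle, counter-clockwise,
    using n short chords with the form evaluated at the mid-angle."""
    total = 0.0
    for i in range(n):
        t0 = 2 * math.pi * i / n
        t1 = 2 * math.pi * (i + 1) / n
        tm = 0.5 * (t0 + t1)
        x, y = math.cos(tm), math.sin(tm)
        dx = math.cos(t1) - math.cos(t0)
        dy = math.sin(t1) - math.sin(t0)
        total += -y * dx + x * dy
    return total

def interior_integral(n_r=200, n_theta=200):
    """Integral of dω = 2 dx ∧ dy = 2 r dr ∧ dθ over the unit disc."""
    dr = 1.0 / n_r
    dtheta = 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        # The integrand does not depend on θ, so the θ sum is n_theta equal terms.
        total += 2 * r * dr * dtheta * n_theta
    return total

print(boundary_integral(), interior_integral(), 2 * math.pi)
```

Both numbers come out close to $2\pi$, which is twice the area of the disc.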
Exercise 3.6.9.
1. Show that Stokes’ theorem in dimension 1, with $\omega$ a 0-form, is just a
restatement of the Fundamental Theorem of Calculus.
2. Show that the almost certainly scrofulous proof you met in second year
of Green’s Theorem is also a scrofulous proof that Stokes’ Theorem
holds when $\omega$ is a 1-form on $\mathbb{R}^2$.
3. Show that the almost certainly scrofulous proof you met in second year
of Stokes’ Theorem is also a scrofulous proof that
$$\int_{\partial M} \omega = \int_M d\omega$$
holds when $\omega$ is a 1-form on $\mathbb{R}^3$.
4. Show that the almost certainly scrofulous proof you met in second
year of the Divergence Theorem is also a scrofulous proof that Stokes’
Theorem holds when $\omega$ is a 2-form on $\mathbb{R}^3$.
5. Construct a plausible explanation of why you had to do a bungled
version of Stokes’ Theorem in second year, given that the correct version
has been known since about 1925.
Remark 3.6.3. If you want to see a proper proof of Stokes’ Theorem (all the
above, and more, in one hit) read Michael Spivak’s Calculus on Manifolds.
It consists of proper definitions of all the terms in awful generality and some
calculations. I should point out that all the physical intuitions which led
to the theorem are contained in the exercises and are not, as some shallow
people imagine, completely absent.
3.7 Hodge Duality: The Hodge Operator
3.7.1 The Riemannian Case
In $\mathbb{R}^3$ we know from the table referred to in Exercise 3.5, that the Exterior
Algebra has a striking symmetry. If we look at the 0-forms we observe they
have dimension 1, just as do the 3-forms, while the 1-forms have dimension
3, just like the 2-forms.
It follows from the equality of the dimension that there is an isomorphism
between the space of 1-forms on $\mathbb{R}^3$ and the space of 2-forms on $\mathbb{R}^3$; also one
between the space of 0-forms (numbers) on $\mathbb{R}^3$ and the space of 3-forms on
$\mathbb{R}^3$.
Only a small amount of thought shows that there must be, in general, an
isomorphism between the k-forms on $\mathbb{R}^n$ and the $(n-k)$-forms on $\mathbb{R}^n$. In fact
not just one isomorphism of course, but scads of them. The question is, can
we find a more or less natural isomorphism by some process which works in
all cases? The answer is yes, and the isomorphism is called $*$, or Hodge $*$ if
you want to give credit where it belongs.
Let us start by going from 2-forms on $\mathbb{R}^3$ to 1-forms on $\mathbb{R}^3$ and see if we can
work out the general pattern by doing concrete cases.
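Recall from Exercise 3.5.1 that if $\omega$ is a 2-form on $\mathbb{R}^3$ then it operates on any
pair of vectors
$$\left( \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}, \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} \right)$$
by expressing the resulting number as
$$c_1 \begin{vmatrix} a_2 & b_2 \\ a_3 & b_3 \end{vmatrix} + c_2 \begin{vmatrix} a_1 & b_1 \\ a_3 & b_3 \end{vmatrix} + c_3 \begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}$$
for some particular numbers $c_1, c_2, c_3$.
But this is just the value of the three by three determinant:
$$\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & -c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}$$
It seems reasonable therefore to define $*(\omega)$ to be the 1-form $[c_1, -c_2, c_3]$
which may be written $c_1 e^1 - c_2 e^2 + c_3 e^3$ using the standard dual basis in $\mathbb{R}^3$.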
If you are troubled by the minus sign, note that it arises because I have
always described the submatrices obtained by omitting a row in numerical
order rather than cyclic order. So I get (1, 3) where it might have been more
natural to put (3, 1). But it is much easier to specify submatrices this way
so I shall carry on doing so.
There is also an isomorphism between 0-forms and 3-forms on $\mathbb{R}^3$. It takes 1
to det. Det of course takes the three ordered vectors $(a, b, c)$ to the determinant
of the matrix obtained by writing each vector out as a column (or row)
and listing them to get the three by three matrix.
In $\mathbb{R}^3$ we used $*$ to send the 2-form
$$c_1\, e^2 \wedge e^3 + c_2\, e^1 \wedge e^3 + c_3\, e^1 \wedge e^2$$
to the 1-form
$$c_1 e^1 - c_2 e^2 + c_3 e^3$$
Figure 3.7.1: A choice of k numbers from n.
We can simplify this by saying that $*(e^2 \wedge e^3) = e^1$, $*(e^1 \wedge e^3) = -e^2$ and
$*(e^1 \wedge e^2) = e^3$. Then everything is multilinear and so we don’t need any
more. In other words, it suffices to specify $*$ on all the choices of $i, j$ in $e^i \wedge e^j$.
Now suppose we have a k-form on $\mathbb{R}^n$. We do not have just three basis
elements, we have ${}^nC_k$ of them and we can take one of them and write it as
$$e^{i_1} \wedge e^{i_2} \wedge \cdots \wedge e^{i_k}$$
where we have chosen from the numbers $[1 : n]$ some increasing subsequence
consisting of the k numbers $i_1, i_2, \cdots, i_k$. We specify $*$ if we give an $(n-k)$-form
that every such basis element gets sent to. There is an obvious choice:
we have taken a row of n numbers and picked out k of them. Figure 3.7.1
shows that I have selected some numbers in order and painted them red. This
leaves $n - k$ black ones. I wedge the ‘black’ projections. Thus ‘the indices
we take out go to the indices we leave behind’.
The only problem is that of sign. I have gone from the red and black mixed
up to the red ones followed by the black ones. This is a permutation of the
n numbers $[1 : n]$ and it may be an odd permutation or an even one (see the
3P0 notes if you don’t understand these things). We call the sign of an even
permutation 1 and the sign of an odd permutation −1. So I finish up with
the definition of $*$:
$$*(e^{i_1} \wedge e^{i_2} \wedge \cdots \wedge e^{i_k}) = \mathrm{sign}(\sigma)\, e^{i_{k+1}} \wedge e^{i_{k+2}} \wedge \cdots \wedge e^{i_n}$$
where $\sigma$ is the permutation:
$$\begin{pmatrix} 1 & 2 & 3 & \cdots & n \\ i_1 & i_2 & i_3 & \cdots & i_n \end{pmatrix}$$
and the $i_j$ for $1 \le j \le k$ on the left hand side are the red numbers and the
remainder are the black numbers in the same order as before.
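The red-and-black recipe is mechanical enough to hand to a machine. Here is a sketch for basis elements only, in the riemannian case with the standard ordered basis; the function names and the convention of returning a (sign, indices) pair are choices made for this illustration.

```python
def permutation_sign(seq):
    """Sign of the permutation sending 1, 2, ..., n to seq, by counting inversions."""
    inversions = sum(
        1
        for i in range(len(seq))
        for j in range(i + 1, len(seq))
        if seq[i] > seq[j]
    )
    return -1 if inversions % 2 else 1

def hodge_star_basis(indices, n):
    """Apply ∗ to e^{i_1} ∧ ... ∧ e^{i_k} on R^n (increasing, 1-based indices).

    Returns (sign, remaining), meaning sign times the wedge of the e^j
    with j in remaining: the 'red' indices go to the 'black' ones.
    """
    chosen = tuple(indices)
    remaining = tuple(i for i in range(1, n + 1) if i not in chosen)
    return permutation_sign(chosen + remaining), remaining

for idx in [(2, 3), (1, 3), (1, 2)]:
    print(idx, hodge_star_basis(idx, 3))
```

The three printed results reproduce $*(e^2 \wedge e^3) = e^1$, $*(e^1 \wedge e^3) = -e^2$ and $*(e^1 \wedge e^2) = e^3$, and the same function can be pointed at 2-forms on $\mathbb{R}^4$.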
Exercise 3.7.1. Confirm that the general case copies exactly what we did
with 2-forms on $\mathbb{R}^3$.
Exercise 3.7.2. Confirm by explicit calculation for some particular k-forms
on $\mathbb{R}^n$ for various n that this definition makes sense and gives reasonable
answers. In particular send some 2-forms on $\mathbb{R}^4$ to some 2-forms.
Exercise 3.7.3. Is it true that $*^2$ is the identity? Prove it or give a counterexample.
Exercise 3.7.4. Verify that the result of taking $*(e^i \wedge e^j)$ on $\mathbb{R}^3$ looks a lot
like the cross product. (!) Explain this.
Exercise 3.7.5. Verify that if we choose the ordered basis $(e_1, e_3, e_2)$ and
make $\omega = dx$ we get a different value for $*(\omega)$ than if we used the ordered
basis $(e_1, e_2, e_3)$. Show that if we use the ordered basis
$$\left( e_1, \begin{bmatrix} 0 \\ \cos(\theta) \\ \sin(\theta) \end{bmatrix}, \begin{bmatrix} 0 \\ -\sin(\theta) \\ \cos(\theta) \end{bmatrix} \right)$$
we get the same result as in the standard basis.
If $\omega$ is a k-form on $V$ and $f : U \to V$ is a linear map, then we can pull back
$\omega$ to a k-form on $U$ defined by
$$f^*(\omega)(a_1, a_2, \cdots, a_k) = \omega(f(a_1), f(a_2), \cdots, f(a_k))$$
Exercise 3.7.6. Define $f : \mathbb{R}^3 \to \mathbb{R}^3$ by $f(e_1) = e_2$, $f(e_2) = e_3$, $f(e_3) = e_1$.
Let $\omega$ be the 1-form $dx + 2\,dy + 3\,dz$. Show that $f^*(*(\omega)) = *(f^*(\omega))$.
The last exercise makes it clear that at least in this case, $f^*$ and $*$ commute.
Exercise 3.7.7. Do they always? Is $*$ functorial? Its definition is locked
into the standard basis, does it need to be? If not, what does this do for
defining it on manifolds?
The answer to the last question is that if $f$ preserves both the inner product
and the orientation, that is if it is an element of $SO(n, \mathbb{R})$, then
$f^*(*(\omega)) = *(f^*(\omega))$. This tells us that the $*$ operator involves the parity
(or orientation or chirality) of an orthonormal basis in an essential way. Note that in
a Hilbert space we have a natural definition of an orthonormal basis and
that since a Riemannian structure on an oriented manifold makes each tangent
space a Hilbert space, the $*$ operator makes sense on oriented smooth
manifolds with a Riemannian structure.
I do wish the physicists would learn not to call it a metric, but they are
beyond saving.
3.7.2 The SemiRiemannian Case
We are going to generalise the idea of an inner product so as to be able to
deal with the Minkowski metric on space-time. This is Lorentzian. And we
might as well deal with the general case because it isn’t any more work.
Definition 3.7.1. A symmetric bilinear form $\varphi : V \times V \to \mathbb{R}$ on a real vector
space is said to be non-degenerate iff $(\forall u \in V,\ \varphi(u, v) = 0) \Rightarrow v = 0$.
Definition 3.7.2. A non-degenerate symmetric bilinear form $\varphi : V \times V \to \mathbb{R}$
on a real vector space is said to be an inner product. We write such forms as
$\langle\ ,\ \rangle$ with $\varphi(u, v) = \langle u, v \rangle$.
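Exercise 3.7.8. Define the inner product on $\mathbb{R}^2$ by
$$\left\langle \begin{bmatrix} x \\ y \end{bmatrix}, \begin{bmatrix} a \\ b \end{bmatrix} \right\rangle = xb + ya$$
Show that this is an inner product in the new sense but that it is not positive
definite.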
Exercise 3.7.9. On $\mathbb{R}^4$ define
$$\left\langle \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \end{bmatrix}, \begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{bmatrix} \right\rangle = -x_0 a_0 + x_1 a_1 + x_2 a_2 + x_3 a_3$$
Show that this is an inner product (the Lorentzian inner product on space-time),
and find a non-zero vector which is orthogonal to itself. Find two
distinct points which are at ‘distance’ zero from each other. Explain what
this means physically.
It should be clear that our generalisation of the idea of an inner product is
constructed so as to allow us to do for space-time what we usually do for
space, and that the ‘metric’ derived from the inner product isn’t a metric in
the standard sense at all. This is (a) strictly necessary in order to describe
relativity in a sensible fashion and (b) an awful shock to the system. It means
that I have set the velocity of light equal to 1 and that any two points on
the path of a ray of light have zero separation in space-time. I shall discuss
some aspects of Physics in the next chapter which may shed some light on
this extraordinary behaviour.
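The zero separation along a light ray is easy to see in numbers. The sketch below writes out the Lorentzian inner product with the time coordinate first and the velocity of light set to 1, as above; the particular vectors tried are just examples chosen for the illustration.

```python
def lorentz(x, a):
    """Lorentzian inner product on R^4, time coordinate first."""
    return -x[0] * a[0] + x[1] * a[1] + x[2] * a[2] + x[3] * a[3]

time_axis = (1, 0, 0, 0)
space_axis = (0, 1, 0, 0)
light_ray = (1, 1, 0, 0)   # one unit of time, one unit of distance along x^1

print(lorentz(time_axis, time_axis))
print(lorentz(space_axis, space_axis))
print(lorentz(light_ray, light_ray))
```

The three values have different signs (negative, positive and zero), which is exactly what a positive definite inner product would never allow.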
From now on I shall use the generalised definition of the inner product.
Proposition 3.7.1. An inner product on a finite dimensional real vector
space V determines an isomorphism from V to V

.
Proof: Define $\Phi : V \to V^*$ by $\Phi(u) = \langle u, - \rangle$. That is,
$$\forall v \in V,\quad \Phi(u)(v) = \langle u, v \rangle$$
Then that $\Phi$ is 1-1 follows from the non-degeneracy of $\langle\ ,\ \rangle$, and that $\Phi$ is
linear follows from the bilinearity of $\langle\ ,\ \rangle$. Since the spaces have the same
dimension, this is sufficient to make $\Phi$ an isomorphism.
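To see what the isomorphism looks like in coordinates, here is a sketch for the Lorentzian inner product of exercise 3.7.9. The symbol $G$ for the matrix of inner products of the standard basis, and the choice to write elements of $V^*$ as row matrices, are assumptions made for illustration only.

```latex
% Assumption: G is the matrix of inner products of the standard basis of R^4.
\[
G = \begin{bmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix},
\qquad
\langle u, v \rangle = u^{T} G\, v .
\]
% Phi sends the column vector u to the row matrix u^T G.
\[
\Phi(u) = u^{T} G = \begin{bmatrix} -u_0 & u_1 & u_2 & u_3 \end{bmatrix},
\qquad
\Phi(u)(v) = -u_0 v_0 + u_1 v_1 + u_2 v_2 + u_3 v_3 .
\]
% Phi is 1-1 because G is invertible: det G = -1, so Phi(u) = 0 forces u = 0.
```

The only thing that changes from the positive definite case is the sign of the first entry of the row matrix.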
Exercise 3.7.10. Let $\langle\ ,\ \rangle$ be the inner product of exercise 3.7.8. Write down
the isomorphism explicitly, representing elements of $\mathbb{R}^{2*}$ by row matrices.
The converse is also true:
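Proposition 3.7.2. Suppose $\Phi : V \to V^*$ is an isomorphism of finite dimensional
real vector spaces which is symmetric, that is, $(\Phi(u))(v) = (\Phi(v))(u)$ for all
$u, v \in V$. Then the bilinear form
$$\langle u, v \rangle = \frac{1}{2}\big((\Phi(u))(v) + (\Phi(v))(u)\big)$$
is an inner product on $V$. (Some such condition on $\Phi$ is needed: for the isomorphism
given by $\Phi(u)(v) = u_1 v_2 - u_2 v_1$ on $\mathbb{R}^2$ the form above is identically zero.)
Proof: That it is bilinear follows from the linearity of $\Phi$, and it is constructed
to be symmetric. It remains only to show it is non-degenerate. If it were
degenerate then there would be some non-zero $u$ such that $\langle u, - \rangle$ is the zero
map in $V^*$. Since $\Phi$ is symmetric, $\langle u, - \rangle = \Phi(u)$, which would put
$u \in \ker(\Phi)$, contradicting $\Phi$ being 1-1.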
Exercise 3.7.11. Construct an example of an isomorphism from $\mathbb{R}^2$ to $\mathbb{R}^{2*}$
which gives an inner product in the old sense, that is a bilinear positive
definite map, and also one which gives an inner product which is not one
of the old fashioned sort. Show that by composing any given isomorphism
$\Phi : \mathbb{R}^n \to \mathbb{R}^{n*}$ with a suitably chosen isomorphism from $\mathbb{R}^n$ to itself, we can
always get an old fashioned style inner product. Hint: Use the material on
the classification of quadratic forms from Second year.
Now we have to mess around with Hodge a bit to make it work properly
on a semi-Riemannian manifold. Note that we can take the determinant of
the matrix of inner products of an orthonormal basis, and if the inner product
is Riemannian we must get 1, since the inner product of different basis
elements will be zero, and the inner product of a basis element with itself
is 1. And the determinant of the identity matrix is 1.
This stops being true for a generalised inner product: the inner product
of the time axis with itself is −1, alternatively the matrix representing the
Lorentz inner product is
$$\begin{bmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
which has determinant −1. Since determinants figure largely in defining $*$
we need to take account of this. If the inner product has signature $(s, n - s)$
then we change the definition of $*$ to:
$$*(e_{i_1} \wedge e_{i_2} \wedge \cdots \wedge e_{i_k}) = \mathrm{sign}(\sigma)\,\varepsilon(i_1)\,\varepsilon(i_2) \cdots \varepsilon(i_k)\; e_{i_{k+1}} \wedge e_{i_{k+2}} \wedge \cdots \wedge e_{i_n}$$
where $\varepsilon(i_j)$ is defined to be the inner product of the $i_j$-th basis element with
itself.
This gives the convenient form of the operator given in the text book and
ensures invariance under the generalisation of the special orthogonal group
which preserves the orientation and generalised inner product.
Exercise 3.7.12. Prove this.
As usual, the best way to accommodate a new idea is to do lots of sums until
you are used to using the idea, whereupon it stops leading to panic attacks
and becomes just another part of your machinery for thinking.
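Here is one such sum, as a sketch. It assumes an orthonormal basis $e_0, e_1, e_2, e_3$ of Lorentzian space-time with $\varepsilon(0) = -1$ and $\varepsilon(1) = \varepsilon(2) = \varepsilon(3) = 1$, and it assumes that $\sigma$ is the permutation $(i_1, \ldots, i_n)$ of $(0, 1, 2, 3)$; both the labelling and that reading of $\sigma$ are assumptions made for the illustration.

```latex
% Star of a 2-form containing the time direction:
\[
*(e_0 \wedge e_1)
  = \mathrm{sign}(0\,1\,2\,3)\,\varepsilon(0)\,\varepsilon(1)\; e_2 \wedge e_3
  = (+1)(-1)(+1)\; e_2 \wedge e_3
  = -\,e_2 \wedge e_3 .
\]
% Star of the purely spatial 2-form; (2 3 0 1) is a product of two
% transpositions, (0 2)(1 3), hence even:
\[
*(e_2 \wedge e_3)
  = \mathrm{sign}(2\,3\,0\,1)\,\varepsilon(2)\,\varepsilon(3)\; e_0 \wedge e_1
  = (+1)(+1)(+1)\; e_0 \wedge e_1
  = e_0 \wedge e_1 .
\]
% Applying the operator twice therefore changes the sign:
\[
{*}{*}(e_0 \wedge e_1) = -\,*(e_2 \wedge e_3) = -\,e_0 \wedge e_1 .
\]
```

With a positive definite inner product every $\varepsilon$ is $1$ and the same calculation gives $+\,e_0 \wedge e_1$, so the extra minus sign is entirely the work of the time axis.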
Chapter 4
Some Elementary Physics
Most of you will know most, perhaps almost all, of this, but some may not
and it is convenient to summarise it briefly.
4.1 Three weird forces
The first time I met electrostatic forces up close and personal, in other than
a laboratory situation, was the first time I undressed a girl. It was dark
and she was wearing nylon underwear, and when I took it off it crackled and
glowed as miniature lightning flashed. My first thought was that this was
the wages of sin, and the devil had come to claim his own, but then my brain
started working and I realised I was merely seeing and hearing electricity.
An education can be useful.
Later, in an American hotel, I once walked across a nylon carpet while wear-
ing Ug boots (among other things) and reached out to the door handle, and
got a large blue spark between my fingers and the handle. This was the
second time I encountered electrostatic forces up close and personal, in other
than a laboratory situation.
The management accepted no responsibility for guests electrocuting them-
selves with Ug boots.
It is possible to replicate these effects in a small way by stroking some fur
against a nylon or plastic ball and one can use it to pick up little bits of
paper. They just jump up off the table top to stick, briefly, to the plastic
ball. Try it if you don’t believe me.
The force that affects the paper is called an electrostatic force, and on a
human scale it is usually quite small, working only for rather little bits of
paper, although google Van de Graaff generator to find out how to scale
up rather a lot. Or stand on a hilltop holding up a spiky metal stick in a
thunderstorm if you really need convincing that the process of building up
charge (to use the jargon) can be rather spectacular¹.
So there is a rather weird force which attracts all sorts of objects and which
needs some investigation. And knowledge of which can prevent heart attacks
and interference with your love life.
You can get at any time a ’fridge magnet from the government, or from lots
of even more useless people at New Year, and it magically glues itself to the
’fridge. It does not work on paper or wood. But it can demonstrably attract
small bits of iron. So it is different from the electrostatic force. But similar
in some ways. A second weird force.
And finally if you step off the top of a tall building you will come down
towards the rest of us with an acceleration of about 9.81 meters per second
per second less air resistance². The planet appears to also attract things.
Like electrostatics but unlike magnetism, it appears to attract different substances.
Unlike electrostatics it does not stop working when you connect the
thing attracting and the thing attracted by a metal wire.
So there are at least three weird forces, electrostatics, magnetism and gravity.
It is natural to wonder whether there is any connection between them, what
the similarities and differences are. Michael Faraday investigated these issues
in the early nineteenth century³.
¹ Children, do not do this unless you are very, very bored. As a cure for boredom, this
compares with the guillotine as a cure for dandruff.
² Children, do not attempt this unless you are superman. Note that believing you
are superman doesn’t cut it. The Universe does not have the slightest respect for your
opinions unless they are right. And neither do I. Incidentally, it doesn’t matter how deeply
or passionately you believe something. If you are wrong in your beliefs the universe may,
with supreme indifference, kill you stone dead. If you have been brought up with the
view that you are entitled to your beliefs whatever they are, I encourage you to see if the
universe agrees with you.
³ You can find his Notebooks in the library and they make interesting reading. They
are also very well written. I should rate him as a better writer than Jane Austen, and
the subject is a lot more interesting. Unless you think the matter of manners and which
male ends up mating with which female is more interesting than understanding how the
universe works. If so you have definitely come to the wrong shop and should enrol in
English, where you may learn the vital skill of saying the right things about storybooks.
It never fails to amaze me that there really are people who feel this is quite a reasonable
way to spend their time.
4.2 Fields
If you do some delicate quantitative measurements on electrostatic forces
you find some interesting and important things. One is that the source of an
electric force comes in lumps, although rather small lumps, so this was not
known at the beginning of the study of the subject.
If you find the smallest possible lumps you can find that they are very small
and they all repel each other. They are called electrons, derived from the
Greek name for amber, and you might amuse yourself by figuring out the connection.
The amount of repulsion varies inversely as the square of the distance
between them, and we say the electron has a charge and by a curious convention
it is called a negative charge. Some things appear to have a positive
charge, and indeed something must have or the whole universe would have
negative charge instead of being mainly neutral. Opposite charges attract
again by an amount that depends inversely on the square of the distance
between them. When we say they attract or repel, we use Newton’s termi-
nology of forces: If you want to feel a force, get a friend to poke you with a
stick. Failing that, jump off a wall. You will not feel any force while falling,
but you will when you stop. It will be very like being poked all over by a
large stick.
In Newton’s day the word force was strongly associated with sticks if you
poked, or possibly ropes if you pulled. The idea that you could have a
force acting without some material object connecting the thing on which the
force acted and the thing doing the acting (usually a person or a horse) was
regarded as a contradiction in terms; it was called action at a distance and
was thought to be quite contrary to standard usage of the term ‘force’, and
hence a violation of common sense and the natural order. Hence Newton’s
definition of a force in terms of what it did, dissociating the effect from the
mechanism, was a wild and novel idea. Now that we all think like this, it
is hard to appreciate the intellectual jump made in simply defining force in
terms of observable changes in the motion of a mass. It is true that forces still
tend to be associated with physical objects, the sun for example in the case of
forces acting on planets, but it is possible to conceive of a force field in space
with no such association. Well, it is now; it was a radical if not actually loony
idea in the middle of the seventeenth century. See the discussion by Feynman
in his Lectures on Physics on the question of whether Newton’s Law
is ‘merely’ a definition or something more. I think he misses the point here,
which is to assert that forces are not to be thought of in terms of sticks or
ropes joining the source of the force to the thing acted upon, but in terms of
motion of the thing acted upon. The source of the force is a separate matter
and will depend on what type of force it is.
We therefore measure forces by using Newton’s Law which says Force is Mass
times acceleration. Newton used Latin but we prefer algebra: $F = ma$ or
$F = m\ddot{x}$. Even better, since the mass may change in time (as when a rocket
moves by consuming fuel and throwing it out rather fast at the back), it
would be more useful to write $F = \dot{p}$ or in Leibnitz’ notation, $F = dp/dt$
where $p$ is the momentum.
For a fixed mass, we measure the acceleration. We compare masses using a
spring balance or something equivalent.
Since electrons don’t seem to affect each other if we pile lots of them onto
some small object, other than trying to repel each other, and opposite charges
cancel each other out to a good approximation when they are brought together,
the force between a charge $Q_1$ and one $Q_2$ is
$$\frac{1}{4\pi\varepsilon_0} \frac{Q_1 Q_2}{r^2} \qquad (4.2.1)$$
where $Q_j$ is the amount of charge, ultimately a count of electrons or their
compensating positive equivalents which also come in lumps, $r$ is the distance
between them, and $\varepsilon_0$ is a constant. This is the size of the force; the direction
on each is towards the other if they have different sign and away from
each other if the charges are of the same type. It is worth remarking that
the attractions or repulsions do not depend on there being air or any other
medium in the space between the objects, although the medium changes the
constant $\varepsilon$ which is therefore a property of the medium. We usually take
it that the medium is a hard vacuum and $\varepsilon_0$ is the number associated with
empty space. You would find the reason for the $4\pi$ too incredible, so I shall
simply observe that it is a rather bizarre choice of unit.
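To get a feel for the sizes involved in equation 4.2.1, here is a minimal sketch in Python. It assumes SI units and the standard published values for the vacuum constant and the electron charge; neither the units nor the function name `coulomb_force` come from the notes.

```python
import math

# SI values (an assumption: the notes do not fix a system of units).
EPSILON_0 = 8.8541878128e-12        # vacuum permittivity, farads per metre
ELECTRON_CHARGE = -1.602176634e-19  # coulombs

def coulomb_force(q1, q2, r):
    """Signed size of the force in equation 4.2.1, in newtons.

    Positive means repulsion (like charges), negative means attraction.
    """
    return q1 * q2 / (4.0 * math.pi * EPSILON_0 * r ** 2)

# Two electrons one metre apart repel each other, very feebly.
f_near = coulomb_force(ELECTRON_CHARGE, ELECTRON_CHARGE, 1.0)
# Doubling the separation quarters the force: the inverse square law.
f_far = coulomb_force(ELECTRON_CHARGE, ELECTRON_CHARGE, 2.0)
print(f_near, f_near / f_far)
```

The force between two electrons a metre apart comes out at a few times $10^{-28}$ newtons, which is why you need a great many of them on a nylon carpet before anything interesting happens.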
We can set up some rather special circumstances which merit a little thought:
if we take two metal parallel plates we can put a negative charge on one of
them, using Ug boots or nylon underwear if all else fails, and investigate to
see what happens in between. See figure 4.2.1 for a picture of the situation.
We take the little green ball to have some standard unit charge, supposed
to be small and positive, and we put a (large) positive charge on one plate
and an equal but opposite negative charge on the other. It is found that
there is a force which tends to accelerate the test charge in the direction
shown. This happens throughout the intervening space and we know of its
existence by looking to see how much the test charge accelerates. And only
by looking at some such test charge, because neutral objects are unaffected
to a first approximation, and negatively charged test objects have the force
vector reversed. It really is a force field because we can check back from the
acceleration experienced by the same charge with different masses. It exists
throughout the space. We assume that it is there when we don’t actually
have a charge there to measure it, and we assume that trees that fall in
a forest make a noise even when there is nobody there to hear it. This is
because life seems simpler that way, and we tend to think there really is a
world outside us.
Figure 4.2.1: An Electric Field.
Consequently it is natural for us to believe that there is a vector field de-
fined on the space in which we live, possibly changing in time, and where
the physical meaning is that this is an electric field detected by measuring
the acceleration on a known charge and mass. Acceleration measurements
require, in principle, rulers and clocks, and we can get those. In practice
we also cheat by using geometry and trigonometry, but checking up suggests
these work pretty well.
You have to understand that this idea comes at the end of a loooooong
sequence of delicate measurements and careful experiments and seems to
explain things in a satisfactory way. In particular we can often calculate
in simple cases what we expect the field to be like and we get very good
agreement between the numbers we calculate and the ones we measure. It
is this that we mean when we say we understand something: we get good
agreement between measurements and calculations⁴.
So we are inclined to have a certain amount of faith in the existence of electric
fields, for essentially the same reason that we believe in the existence of the
Pope. Most of us haven’t actually seen the Pope; the closest is usually a
picture, perhaps on television. But the hypothesis that he exists accounts
for a lot of phenomena which would otherwise be rather hard to explain, such
as television pictures and photographs.
⁴ This is not what philosophers or social scientists mean when they say they understand
something; what they mean is that they get a pleasant sensation of insight. Pleasant
sensations of insight are nice to have, and physicists and mathematicians get them too,
but we like to convince ourselves they are not bogus. Read Karl Popper’s Logic of Scientific
Discovery to get an inkling about the difference.
Figure 4.2.2: Two bar magnets attracting each other.
Physicists also believe in Magnetic fields for similar reasons: in fact sprin-
kling some iron filings on a sheet of paper under which sits a bar magnet
(obtainable from most good toy-shops) makes it hard to doubt that you
can test the strength of a magnetic field with a small piece of iron. There
are however some significant differences between magnetism and electricity.
Electric charge comes in lumps, negative lumps and, sort of, positive lumps.
Magnetism doesn’t. People have actually looked hard for so called magnetic
monopoles and totally failed to find them. What turns up is invariably two
of them, one called North and the other called South. The name of course
is derived from the discovery that the planet has a magnetic field which can
be used for finding out what direction on the Earth you are pointing in, by
means of the magnetic compass. This made sailing a boat a much safer bet,
and has been known for a long time by the Chinese⁵.
Magnetism also defines a field. The picture of figure 4.2.2 shows two bar magnets
close together. In the configuration shown, the north pole of the magnet
on the left attracts the south pole of the magnet on the right. The south pole
of the magnet on the left repels the south pole of the magnet on the right,
and vice versa, but less because they are further away, and we can confirm
that we have an inverse square law by using longer magnets. The force is
easily detected and we can arrange various configurations of magnets just as
we could arrange the parallel plates for testing charge. So magnetic fields
exist too.
It is natural to regard the two fields as specified by vector fields and we can
expect to be able to describe the fields for reasonably simple configurations,
we can measure the constants in an equation similar to equation 4.2.1, and
calculate the field at other points and this works very well. Of course we
don’t get the same constant $\varepsilon_0$ occurring, we get a different constant, $\mu_0$.
⁵ We in the West stole the idea over five hundred years ago, along with Printing and the
recipe for Gunpowder. Unfortunately we also stole the idea of Bureaucracy off them. But
we got our own back by letting them copy Marxism off us. That slowed them down a bit,
but they’ve figured out it was a trick quite recently. Which is more than our educational
theorists have done.
Figure 4.2.3: An electrical circuit.
Although an electric field will move a charge, the magnetic field also has an
effect on charges, but only when they move. If a charge q moves at some
velocity vector $v$ in a magnetic field $B$, there is a force which is orthogonal to
both $v$ and $B$ and which is proportional to the product of the speed, the charge,
the strength of the magnetic field at the point, and the sine of the angle
between $v$ and $B$. We can write this out in
vector form as the Lorentz force law:
$$F = q(E + v \times B) \qquad (4.2.2)$$
This can be verified by using a Cathode Ray Tube such as occurs in old
fashioned television sets and computer monitors: just bring a magnet close
to the side of the tube and watch the screen. Try not to electrocute yourself.
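If you would rather not go near a Cathode Ray Tube, equation 4.2.2 can at least be checked on paper or in a few lines of Python. This is a minimal sketch with made-up illustrative values; the helper names `cross`, `dot` and `lorentz_force` are assumptions of the example and not anything from the notes.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Ordinary dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def lorentz_force(q, E, v, B):
    """F = q (E + v x B), equation 4.2.2, computed componentwise."""
    v_cross_B = cross(v, B)
    return tuple(q * (e + c) for e, c in zip(E, v_cross_B))

# Illustrative values: a unit charge moving along x, a magnetic field
# along z, and no electric field at all.
q = 1.0
E = (0.0, 0.0, 0.0)
v = (1.0, 0.0, 0.0)
B = (0.0, 0.0, 1.0)
F = lorentz_force(q, E, v, B)
print(F)                     # the force points along the negative y axis
print(dot(F, v), dot(F, B))  # both vanish: F is orthogonal to v and to B
```

Because the magnetic part of the force is always at right angles to the velocity, it bends the path of the charge without speeding it up or slowing it down.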
All this strongly suggests that electricity and magnetism are closely related,
as is indeed the case.
One of the ways of seeing this is to take a coil of wire and join the ends to
opposite sides of large metal plates as in figure 4.2.3.
The circle labelled A is an ammeter which measures current and you can
ignore it for a first approximation and assume the wire goes right through
it. Disconnect the wire somewhere, and charge the metal plates just as in
figure 4.2.1. Now complete the circuit. Since the wire is metal, the charge
leaves the plates and tries to neutralise itself by flowing along the wire. When
flowing through the coil however it creates a magnetic field, and if you look
ahead to figure 4.3.1 you can work out that you get something very like a
bar magnet created by the flowing charge. This magnetic field is caused
by the changing current flow and it contains energy. The field acts so as
to impede the charge and in fact to send it back the way it came. Once
it is back on the plate, the magnetic field vanishes and the process starts
again. The current, that is the moving charge, thus oscillates and this can
be measured (at least in principle, although it can be rather fast) by the
ammeter, which registers a sine wave. This oscillation will eventually die
down under normal circumstances because of resistance in the wire, so we
get a decaying sinusoidal wave. You probably saw the second order ODE
which describes this process in first year. This circuit is the basis of radio
and television transmissions.
Figure 4.2.4: An explanation of an inverse square law.
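For reference, the second order ODE for the circuit of figure 4.2.3 can be sketched as follows. The symbols are assumptions introduced here and do not appear in the notes: $L$ for the inductance of the coil, $C$ for the capacitance of the pair of plates, $R$ for the resistance of the wire, and $Q(t)$ for the charge on one plate.

```latex
% Sum of the voltages around the loop is zero:
\[
L\,\ddot{Q} + R\,\dot{Q} + \frac{1}{C}\,Q = 0 .
\]
% For small resistance, R^2 < 4L/C, the solution is a decaying sinusoid:
\[
Q(t) = Q_0\, e^{-Rt/(2L)} \cos(\omega t + \phi),
\qquad
\omega = \sqrt{\frac{1}{LC} - \frac{R^2}{4L^2}} .
\]
% With no resistance at all the oscillation never dies down:
\[
R = 0 \;\Longrightarrow\; \omega = \frac{1}{\sqrt{LC}} .
\]
```

The current registered by the ammeter is $\dot{Q}$, which is again a sinusoid inside the same decaying envelope.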
4.2.1 Gradient Fields
One of the possibilities for vector fields is that the direction and length of
each vector corresponds to the acceleration with which a small mass placed
at that point would fall down a hill. The question is, does such a hill exist?
If it does we say that the vector field is the gradient of a potential field.
For an inverse square law of attraction, as with gravity, we can imagine a
field as indicated in figure 4.2.4, where the sun, say, is at the bottom of the
well, and a planet would be a little dimple in the surface (to denote its own
gravitational field) and would move in an ellipse, much like taking a large
sheet of rubber, putting a heavy object in the middle to deform it, and then
knocking a light ball along the rubber sheet. You can, I hope, imagine the
trajectory it would follow.
Figure 4.2.5: Another explanation of an inverse square law.
As a way of thinking about force fields this is quite productive. We can write
$V$ for the vector field at any point and then there is a height function $f$ also
defined at each point and we have that
$$V = \nabla f = \begin{bmatrix} \partial f/\partial x \\ \partial f/\partial y \\ \partial f/\partial z \end{bmatrix}$$
Since we like to think of things running down hills rather than up them, it
is quite usual for physicists and engineers to put a minus sign in the above
equation. Do so if it makes you feel better.
All three of the forces we are considering are of this type, gradient fields,
except for singularities at the centre of attraction.
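As a sketch of how an inverse square law arises from a hill, or rather a well, take the height function below. The constant $k > 0$ and the choice of origin at the centre of attraction are assumptions made for the illustration.

```latex
% A well which gets deeper without limit at the origin:
\[
f(x, y, z) = -\frac{k}{r},
\qquad
r = \sqrt{x^2 + y^2 + z^2} .
\]
% Differentiating, using \partial r / \partial x = x / r:
\[
\frac{\partial f}{\partial x} = \frac{k\,x}{r^3},
\qquad
\nabla f = \frac{k}{r^3} \begin{bmatrix} x \\ y \\ z \end{bmatrix},
\qquad
\lVert \nabla f \rVert = \frac{k}{r^2} .
\]
% The downhill direction, with the physicists' minus sign, points at the origin:
\[
-\nabla f = -\frac{k}{r^3} \begin{bmatrix} x \\ y \\ z \end{bmatrix} .
\]
```

So the size of the field falls off as the inverse square of the distance, and the gradient fails to exist only at $r = 0$, which is the singularity at the centre of attraction.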
It is clear that ∇ is pretty much differentiating f to give the three components
of the derivative of f along orthogonal axes, and this raises the possibility that
it might be more natural to regard the electric or magnetic or gravitational
fields as 1-forms rather than vector fields on the space we live in.
4.2.2 What are Flux?
The fact that the three fields all fall off according to an inverse square law
suggests that this is a property of the space we are living in. One possible
explanation of an inverse square law of repulsion between two objects is that
each is emitting some particles, small point like objects, which hit the other
object and force it away. This would mean that the number passing through
a given area would decrease as the area is moved further away from the
source, as in figure 4.2.5.
The solid angle subtended by a disc of fixed size would be proportional to the
inverse square of the distance from the centre, so counting hits would give
an inverse square law simply as a property of the dimension of the space we
live in reduced by one.
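The counting argument can be written out as a short sketch. The symbols are assumptions introduced for the purpose: $N$ is the number of particles emitted per unit time, uniformly in all directions, and $A$ is the area of a disc which is small compared with its distance $r$ from the source and which faces the source.

```latex
% All N particles cross the sphere of radius r, which has area 4 \pi r^2:
\[
\text{hits per unit area per unit time} = \frac{N}{4 \pi r^2},
\qquad
\text{hits on the disc} \approx \frac{N A}{4 \pi r^2} .
\]
% In a space of dimension n the sphere of radius r has (n-1)-dimensional
% area proportional to r^{n-1}, so the same count gives
\[
\text{hits on the disc} \;\propto\; \frac{1}{r^{\,n-1}} ,
\]
% which is the inverse square law when n = 3.
```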
Even if we don’t believe in anything as fanciful as microscopic particles spat
out by charges, we can certainly think of a flow of imaginary ‘stuff’ put out
at a constant rate proportional to the amount of charge, and so people would
talk and write of the electric flux or the magnetic flux where the word ‘flux’
means something which flows, and was used in medicine to mean stuff which
dribbled out of sores and noses⁶. Like the potential function representing a
hill down which objects roll, this is simply a possible way of thinking about
things and we do not feel obliged to specify the imaginary flowing stuff in
any detail. After all, we are doing nothing more than observing that a vector
field has an associated flow which is equivalent to it in that we can get to the
flow from the vector field by solving a system of ODEs and given the flow
we can get back to the vector field by differentiating it at every point.
Regarding the Electric field in terms of a flow invites us to consider how much
flows out of a region compared with how much flows into the region. This
has much to do with Stokes’ Theorem and the extent to which the stuff is
created. Obviously the stuff flows out of any charge and ought to either
be conserved or get compressed elsewhere. Imagine, to picture this, water
flowing along in a stream. Now place an imaginary football in the stream.
Water flows through the imaginary football as if it isn’t there, which is fair
enough because it isn’t. The point is that water is hard to compress so
the density is pretty much uniform throughout the stream, and moreover
the water doesn’t come out of nowhere or suddenly vanish. This severely
constrains the kind of vector field that we get in a stream by attaching to
each point a little arrow saying how fast, and in which direction, the water
is moving at any point. It satisfies the condition that the divergence of the
vector field is zero at every point, where we measure the divergence at a point
by putting a small box at the point, and taking the amount of water coming
out of the box less the amount of water going into the box and dividing by
the volume of the box. Now take the limit as the box gets smaller to get
a number at each point of the stream. This is the divergence of the vector
field, and for streams it has to be zero. There is precisely as much imaginary
water flowing into the imaginary football as there is flowing out. You might
reasonably conjecture that the divergence of an Electric field is zero at a
point except when there is some charge at that point, when it depends on
the sign and amount of charge.
And you would be right.
In algebra we can write the divergence of a vector field V as a function g
with
$$g = \nabla \cdot V = \partial V_x/\partial x + \partial V_y/\partial y + \partial V_z/\partial z$$
Then if our charge comes in lumps, which we often assume to be the case,
any little box containing a positive charge will have some net amount of flux
coming out proportional to the charge, and if the little box is empty of net
charge there will be just as much going in as there is coming out. Electric
field flows are incompressible, and so are magnetic fields.
⁶ Many things have improved since the early Nineteenth century.
Figure 4.2.6: The Electric field around a positive charge.
Figure 4.2.7: The magnetic dipole field.
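The little box recipe for the divergence can be tried out numerically. The following is a minimal sketch in Python, assuming a unit point source at the origin with the constants dropped; the function names and the test point are made up for the illustration.

```python
def inverse_square_field(x, y, z):
    """Field of a unit point source at the origin: V = (x, y, z) / r^3."""
    r_cubed = (x * x + y * y + z * z) ** 1.5
    return (x / r_cubed, y / r_cubed, z / r_cubed)

def radial_field(x, y, z):
    """V = (x, y, z): 'stuff' is created everywhere, and the divergence is 3."""
    return (x, y, z)

def divergence(field, x, y, z, h=1e-4):
    """Net outflow from a small cube of side 2h centred at (x, y, z), per unit volume.

    Each term is the flow out through one face less the flow in through the
    opposite face, estimated from the field at the face centres.
    """
    d_vx = field(x + h, y, z)[0] - field(x - h, y, z)[0]
    d_vy = field(x, y + h, z)[1] - field(x, y - h, z)[1]
    d_vz = field(x, y, z + h)[2] - field(x, y, z - h)[2]
    return (d_vx + d_vy + d_vz) / (2.0 * h)

# A box well away from the charge: as much goes in as comes out.
div_inverse_square = divergence(inverse_square_field, 1.0, 2.0, 3.0)
# A field which is not incompressible, for comparison.
div_radial = divergence(radial_field, 1.0, 2.0, 3.0)
print(div_inverse_square, div_radial)
```

The first number is zero to within rounding error and the second is 3. Putting the box over the origin is not recommended, since the field is not defined there.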
If the flow is not incompressible, it may still satisfy the Continuity Equation
which holds for a larger class of flows of physical systems. It says:
$$\frac{\partial \rho}{\partial t} = -\nabla \cdot j$$
where ρ is the density of “stuff” at a point and time and j is the vector field
of the flow of the “stuff”. You can translate this as: What goes in must
come out or wind up as a sticky mess in the middle. It applies to every
little box you put in the flow, so it makes sense in the limit as the boxes
get smaller. The right hand side can be imagined to represent the build up
of concentration of the flow, which accounts for the minus sign, and the left
hand side then represents the consequence of an increase in the density.
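The little box argument for the Continuity Equation can be sketched in symbols. Here $\Omega$ is a fixed box with boundary $\partial\Omega$ and outward unit normal $n$; these names are assumptions introduced for the sketch.

```latex
% The stuff in the box changes only by flowing across the boundary:
\[
\frac{d}{dt} \int_{\Omega} \rho \, dV
  = -\oint_{\partial\Omega} j \cdot n \, dS
  = -\int_{\Omega} \nabla \cdot j \, dV ,
\]
% the second equality being the divergence theorem. Since this holds for
% every box, however small, the integrands must agree at each point:
\[
\frac{\partial \rho}{\partial t} = -\nabla \cdot j .
\]
% If the density does not change in time the flow is divergence free:
\[
\frac{\partial \rho}{\partial t} = 0 \;\Longrightarrow\; \nabla \cdot j = 0 .
\]
```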
The flow of an electric field for a point charge looks like figure 4.2.6 and the
flow for a magnetic field looks like figure 4.2.7. If we take a field like this at
every point along a line segment and add them up we get the field of a long,
thin bar magnet.
4.3 Maxwell and Faraday
Faraday spent a lot of time investigating the relationship between the three
forces. He didn’t find any link between the other two and gravity, although
he spent a long time looking as you will see from the Notebooks. But he did
find some important relationships between electricity and magnetism. Some
of this had also been done by Ampère in France⁷.
The key things that turn up are that a moving charge produces a magnetic
field, and that a changing magnetic field moves charge.
Electrons move rather easily through metals. The electrons in metals that
are attached to the positively charged nuclei in the atoms may be bound
more or less tightly in the atoms, and the outer electrons are more or less
communal to a set of atoms in the crystal lattice which metals form. A
bar of iron is basically a mess of little crystals all jammed together; if you
heat it and let it cool very slowly, you get fewer and bigger crystals, in an
extreme case, practicable only for small bits of iron, you can get a single
crystal. Trying to make it one big crystal is important in some applications
because the strength of a crystal is much greater than the strength of the
metal mixture. Or to put it another way, when you pull at two ends of a
wire, it comes apart at the places between the crystals, not in the middle of
a crystal. And electrons are very small and light. So a metal looks to an
electron rather like a sequence of mostly empty rooms (the crystals) and the
electrons are rather like a swarm of flies, buzzing about aimlessly. Except
that the flies repel each other. When an electric field is put across the wire,
the electrons drift in the direction forced by the field. In effect, if you pump a
bunch of electrons in at one end of a piece of wire, they repel nearby electrons
and so on so a compression wave passes rather quickly down the wire.
It makes sense therefore to talk of the current which is basically a count of
the number of electrons passing a point in a second⁸. By measuring charge
in some more practical way we can write $i = dQ/dt$ where charge $Q$ moves
along a wire, or even in a stream through empty space. Since what goes in
must come out and since the electrons are not going to bunch up if they can
help it, the current at one point of a piece of wire must be the same at any
other point except for brief transients when we switch on the process. It is
clear that current is a vector since moving charge has a direction associated
with it.
⁷ During the Napoleonic Wars in Europe, Faraday and Sir Humphry Davy travelled
around to talk to the physicists in various European countries. They regarded the war as
rather a nuisance, and had to avoid the fighting which they saw as a form of insanity to
which some people are addicted. They didn’t need passports which hadn’t been invented:
it was any free-born Englishman’s right to go wherever he wanted. Passports were introduced
later under the usual excuse that the government wants to help you. Few people
of any intelligence believed this in early nineteenth century Britain, it being too obvious
that politicians mainly want to help themselves. Not everything has improved since the
early Nineteenth century.
⁸ But with the direction reversed because current is taken to be a flow of positive charge
and electrons are negative.
Figure 4.3.1: The magnetic field of a current (moving charge).
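To see how many electrons the count in $i = dQ/dt$ involves, here is a minimal sketch in Python. It assumes SI units (amperes and coulombs) and the standard value of the charge on one electron; the function name is made up for the example.

```python
# Size of the charge on one electron in coulombs (SI value; the notes fix no units).
ELEMENTARY_CHARGE = 1.602176634e-19

def electrons_per_second(current_amperes):
    """Number of electrons passing a point each second for a steady current i = dQ/dt."""
    return current_amperes / ELEMENTARY_CHARGE

n = electrons_per_second(1.0)
print(n)
```

A current of one ampere corresponds to something over six billion billion electrons a second, which is why nobody measures current by counting.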
Michael Faraday, one of the finest experimentalists the world has produced
and an all round smart cookie, established that charge moving along a wire
produces a magnetic field which circles the wire. Drawing curves for the flow
of the field we get something like figure 4.3.1. I have drawn only a section at
one point of the wire; there is such a set of circles centred on every point.
He also found that when a magnetic field changes it induces a current. This is
how we get our mains electricity from power stations. We spin a magnet and
surround it by a coil of wire in effect. Some serious googling or any elementary
text book on electricity should show you exactly how this is done⁹.
James Clerk Maxwell took the findings of Faraday and wrote them out in
Algebra.
If you reflect that changing magnetic fields produce an electric field and
changing electric fields produce a magnetic field, it might occur to you that
this swapping of energy between the two forms might happen in a cyclic way,
and might indeed happen in empty space. It might even occur to you that
such a cycling arrangement might travel through space. Your opinions would
not however count for much and would be considered of very minor interest,
mostly by your friends and relatives, and some of them might consider your
views as evidence of insanity. If however you were to take your wamblings
and turn them into algebra, you might be able to prove that this could
indeed happen and show how to calculate the speed of propagation of such
an electromagnetic wave in terms of constants which were properties of the
vacuum, such as $\epsilon_0$ and the corresponding magnetic one $\mu_0$. And if this turned
out to be the same as the measured velocity of light, you would eventually
be taken very seriously. This is what Maxwell did. The velocity of light
just happens to be $1/\sqrt{\epsilon_0 \mu_0}$, both of which were known from entirely static
experiments. And it led shortly afterwards to people trying deliberately to
produce such electromagnetic waves, and this led on to radio, radar, television
and most recently mobile phones [10].
[9] There are people who are convinced they understand electricity: when you click the
switch the light comes on, or maybe the television set, although this generally requires a
different switch. There is more to it than that, and it is a good idea to understand it a
little better, or you are not really a member of our civilisation, just a free-loading parasite
on it, hardly better than a politician or an arts graduate.
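As a quick numerical check of the claim that the velocity of light is $1/\sqrt{\epsilon_0 \mu_0}$, here is a minimal sketch in Python. The values of the two constants are the CODATA 2018 figures in SI units; they are not quoted in these notes and are an assumption of the example, as are the names `wave_speed`, `EPSILON_0` and `MU_0`.

```python
import math

# CODATA 2018 values in SI units (assumed here; the notes do not quote them).
EPSILON_0 = 8.8541878128e-12   # electric constant, farads per metre
MU_0 = 1.25663706212e-6        # magnetic constant, newtons per ampere squared

def wave_speed(eps, mu):
    """Speed of propagation 1/sqrt(eps * mu) of an electromagnetic wave."""
    return 1.0 / math.sqrt(eps * mu)

c_predicted = wave_speed(EPSILON_0, MU_0)
C_MEASURED = 299_792_458.0     # speed of light in metres per second

print(f"predicted: {c_predicted:.6e} m/s")
print(f"relative difference: {abs(c_predicted - C_MEASURED) / C_MEASURED:.2e}")
```

The two numbers agree to within the rounding of the constants, which is the coincidence that made people take Maxwell seriously.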
Maxwell’s Equations are four in number and state things that are known
about the electric and magnetic fields. Two deal with the nature of the fields
separately and two deal with the interaction between them. I give them here
for definiteness in more or less the same form as the text book. We suppose
that E is the electric field, B is the magnetic field and ρ is the density of
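charge.
\[ \nabla \cdot E = \rho \tag{4.3.1} \]
\[ \nabla \cdot B = 0 \tag{4.3.2} \]
\[ \nabla \times E + \frac{\partial B}{\partial t} = 0 \tag{4.3.3} \]
\[ \nabla \times B - \frac{\partial E}{\partial t} = j \tag{4.3.4} \]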
The vector j points in the direction in which the charge is moving and has norm
equal to the rate at which the charge flows.
These are very different from the form that Maxwell wrote them in, which
was much longer and not so compressed, and we shall get an even terser
form later. I note that there are some constants for the medium which have
been fixed up to make the velocity of light one. This is just a choice of units
in which to measure things and so is quite harmless and makes equations
simpler.
[10] Whether this is altogether desirable may be doubted, but there are at least some
benefits. Certainly the reason we are much better off than the inhabitants of Bangladesh
or Congo is that we are more closely related to Isaac Newton, Michael Faraday and James
Clerk Maxwell than they are, biologically or socially. And we live with traditions which
are still in many ways similar to the traditions in which these men lived and produced the
amazing changes that they did. There are also some differences. Nowadays, instead of
being funded by the Royal Society at the discretion of Sir Humphry Davy (its president),
Faraday would have had to submit a grant application to a committee to study electricity.
It is very doubtful if he’d get it. First he had no appropriate qualifications, and second
the practical applications would certainly have been beyond the imagination of the kind
of people who enjoy being on committees. He’d probably have been told to give up all
this foolery with wires and magnets and work on more powerful steam engines.
Figure 4.3.2: The curl of a vector field.
The first two equations simply say that the electric and magnetic fields are
incompressible, that the flux into a region is equal to the flux out in the
case of an electric field, except in a region containing charge, and that the
magnetic field is always incompressible (there are no magnetic monopoles).
The second two contain the information about the interchange between magnetic
fields and electric fields. It is essential to understand what they are
saying; do not merely memorise them.
The curl of a vector field is the extent to which it tends to twist around
some axis. If you visualise a stream of water flowing and you put a very
small paddle in it, then in general the paddle gets turned by the flow being
greater on one side than on the other. Figure 4.3.2 gives the idea.
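The paddle picture can be tried out numerically. The sketch below is only an illustration: the function `curl` approximates the curl by central differences, and `shear_flow` is a hypothetical stream flowing in the x direction which moves faster at larger y, so that the flow is greater on one side of the paddle than on the other.

```python
def curl(field, point, h=1e-5):
    """Approximate the curl of `field` at `point` by central differences."""
    def partial(component, axis):
        # d(field[component]) / d(coordinate[axis]) evaluated at `point`
        forward = list(point)
        backward = list(point)
        forward[axis] += h
        backward[axis] -= h
        return (field(*forward)[component] - field(*backward)[component]) / (2 * h)

    return (
        partial(2, 1) - partial(1, 2),   # dVz/dy - dVy/dz
        partial(0, 2) - partial(2, 0),   # dVx/dz - dVz/dx
        partial(1, 0) - partial(0, 1),   # dVy/dx - dVx/dy
    )

def shear_flow(x, y, z):
    # Hypothetical stream: flows along x, faster at larger y.
    return (y, 0.0, 0.0)

twist = curl(shear_flow, (0.3, -1.2, 2.0))
print(twist)   # approximately (0, 0, -1)
```

The only non-zero component is about the z-axis, and it is negative: seen from above, the paddle turns clockwise, because the water pushes harder on the side with larger y.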
The curl can be thought of as a vector by taking the amount of twist about
the positive x-axis, the positive y-axis and the positive z-axis to give three
components; alternatively we can take the direction of the vector to be that
in which the rotation is a maximum and the length equal to the maximum
torque. Only a little thought suggests that it would be much more natural
to think of it as a 2-form, when it is simply the exterior derivative of the
1-form which replaces the vector field. This is undoubtedly a better way to
think of it, as it generalises to higher dimensions quite naturally. And of
course the divergence can be thought of as applying the exterior derivative
to a 2-form to get a 3-form, which on $\mathbb{R}^3$ is, at each point, a number. So we
may anticipate the next stage of writing these equations out will be to turn
them into differential forms instead of vector fields.
For the present however, equation 4.3.3 says that the curl of the electric field
is the rate of change of the magnetic field with the direction reversed. We
have to think of B as a vector field which depends upon the time: if we
take each of the three components it is a function of x, y, z and t, and if we
differentiate it (partially) with respect to t we get a new vector field, also
usually depending on t. So equation 4.3.3 says that for every time t, the
vector field $\nabla \times E$ is the negative of the time derivative of B. The amount of
twist of the electric field depends upon the rate at which the magnetic field
is changing. This is, like equation 4.5.1, and of course equation 4.3.4, part of
the interconnection between magnetic and electric fields.
Finally, equation 4.3.4 is almost dual to equation 4.3.3 except for
a minus sign and the j term and tells us something about the curl of the
magnetic field at every time in terms of the current vector and the rate of
change of the electric field. The latter term was introduced by Maxwell
not on the basis of experimental results, but because it led to the wave
solution to the four equations. One suspects strongly he had done the vague
English language argument about the exchange of energy between electric
and magnetic fields in free space and wanted to make it come good.
In order to collect your ideas on these equations, and to recall some earlier
work, some simple exercises will establish what is going on. If you did second
year physics you have probably already done these, although not perhaps in
this form.
Exercise 4.3.1. Find a vector field V on $\mathbb{R}^3$ with constant curl the vector
$(0, 0, 1)^T$. Find some more vector fields with the same curl. Show that there
is an infinite dimensional space of vector valued functions on $\mathbb{R}^3$ which can
be added to your solution to give another solution.
Exercise 4.3.2. Find a vector field V on $\mathbb{R}^3$ which has a constant curl vector
zero, but which has the property that the integral around the unit circle (in
the z = 0 plane) of V is non-zero.
Exercise 4.3.3. A current vector j is defined to be uniformly $(0, 0, 1)^T$ for
points of distance less than one from the z-axis. You may imagine a rod of
radius 1 along the z-axis carrying a current. The current is zero for points
at a distance from the z-axis greater than one (that is, outside the rod). Find
a continuous magnetic field the curl of which is the given j field.
Explain why continuity is worth having, and given that there are rather a lot
of other solutions, explain what grounds you have for preferring yours. You
might find figure 4.3.1 inspirational.
Exercise 4.3.4. Show that a travelling wave is a solution to the Maxwell
Equations in empty space. First write down the equation of an electric field
which has all the vectors in a plane the same length and direction: take, say,
planes x = s and arrange to have for fixed s, every electric vector the same
length and direction at any time t, but change the vector in time and also with
s so that it has unit speed along the x axis. Now look to see if this satisfies the
Maxwell Equations, for ρ and j zero, by doing some partial differentiating.
When you have done so, throw your hat in the air and shout ‘huzzah!’ in
celebration. You have seen the light.
Figure 4.3.3: What is the effect on a television set of a magnet?
Exercise 4.3.5. A beam of electrons is emitted by a cathode at the back of
a television set and paints a spot on the centre of the screen. Traditionally,
deflector plates are charged so as to sweep the spot in a raster scan giving your
television picture. I show a horizontal section through the tube in figure 4.3.3.
Discuss what happens when a bar magnet is placed in the location shown.
What numbers would you need to know in order to calculate the deflection of
the beam and its direction?
Exercise 4.3.6. Read the first twenty chapters of volume Two of the Feyn-
man Lectures on Physics (in the library). Chapter nineteen is of no direct
relevance but is good fun and worth reading to see how a great physicist thinks.
If you have been doing physics you should find this easy, but there are some
penetrating observations which you may want to think about. If you haven’t
done much physics, again this will show you something of what you have been
missing.
Exercise 4.3.7. Read Chapter four of the text book and do all the exercises.
Remark 4.3.1. The work which has been described so far has changed the
world, mainly for the better, and changed it enormously. It is the product
of the Western Intellectual Tradition, and it is worth reflecting on the kind
of society which can produce such things, and also on the kinds of society
which cannot, which is most of them.
Maxwell’s Equations represent one of the glories of Western civilisation,
something which is likely to remain as long as humanity endures and possibly
for much longer. Maxwell, Faraday and others stole lightning from the gods:
these men are heroes far beyond such as Alexander, Caesar or Napoleon [11].
[11] Or miscellaneous footy players, or people who hit balls with sticks. Or people who
play guitars or sing. The list goes on.
Figure 4.4.1: A ball about to bounce off a wall.
Your life at University is being spent in part at least in coming to under-
stand the thinking of the great men who produced these marvels, and also
to understand something of how they did it. There are worse ways to spend
your time [12].
4.4 Invariance
4.4.1 The Idea of Invariance
Imagine a ball rolling in the plane and bouncing off a fixed wall, as in figure
4.4.1.
If the ball has initial momentum p in the direction of the arrow, then it is
simple to compute the new momentum after the ball has bounced: the com-
ponent parallel to the wall is unchanged and the component perpendicular
to the wall is reversed in sign.
This makes a number of assumptions which are less than satisfactory; one
is that the ball is elastic since if it was made of putty it might stick to the
wall after deforming. So we also assume that energy is conserved by the
impact, in general not a realistic assumption, but approximately true for
bodies which are elastic enough. We also assume that the wall is rather
flat and very smooth, since the ball will actually impact over a region, not
at a point, and if different bits of the wall made different angles with the
trajectory then the behaviour is potentially more complicated and harder to
compute. When you did this sort of thing at school I rather suspect they just
trained you to make the assumptions that the school teacher made without
asking many questions, so you probably never questioned the assumptions
and indeed didn’t even think about what they were. There can be rather a
lot.
[12] Conquering Europe, or anything involving balls or guitars, for instance.
Exercise 4.4.1. Think of some more assumptions that are necessary to get
a solution to the problem as posed.
In order to do the calculation, subject to all the usual assumptions, we need
to take a coordinate system, and some are better than others. I have shown
some axes in the left lower side of the picture. I haven’t however marked on
any units and you don’t know what units the momentum is given in either. It
should be fairly obvious that the units don’t much matter, in that whatever
we choose, as long as we take the same ones after as before we will get the
same answer. What is crucial is the angle the momentum vector makes with
the wall.
Now suppose we change the position and orientation of the axes. I do the
calculation in the original system and you do it in the new system. We can
translate everything from one to the other; your initial momentum vector p
will consist of an ordered pair of numbers, and an initial point for the ball
will also consist of another ordered pair of numbers. So will mine, and mine
will be different. It is easy to write down a euclidean group element which
will reliably translate your numbers to mine and the inverse will translate
my numbers to yours. And the resulting momentum vector, specified by a
direction and a point through which it passes, will translate by the same rule.
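Here is a small sketch of that claim for the momentum alone, which is affected only by the rotation part of the euclidean group element. The numbers, the wall direction and the function names `reflect` and `rotate` are all made up for the illustration; the bounce rule is the one stated earlier, with the component perpendicular to the wall reversed.

```python
import math

def reflect(p, n):
    """Momentum after the bounce, for a wall with unit normal n."""
    dot = p[0] * n[0] + p[1] * n[1]
    return (p[0] - 2 * dot * n[0], p[1] - 2 * dot * n[1])

def rotate(v, angle):
    """Translate a vector from my axes to yours (rotation part only)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

my_p = (3.0, -2.0)          # my initial momentum (hypothetical numbers)
wall_normal = (0.0, 1.0)    # in my axes the wall lies along the x-axis
angle = 0.7                 # your axes are turned by this much

# I compute the bounce in my system and then translate the answer.
my_after = reflect(my_p, wall_normal)
translated = rotate(my_after, angle)

# You translate the data first and compute the bounce in your system.
your_after = reflect(rotate(my_p, angle), rotate(wall_normal, angle))

mismatch = max(abs(a - b) for a, b in zip(translated, your_after))
print(my_after, mismatch)
```

Both routes give the same final momentum up to rounding, which is what it means for the translation rule to work.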
This is just like the situation of chapter one where I talked about penguins:
we have language which consists of finite lists of numbers, and we have the
physical entities, and the behaviour had better be described in the same way
whatever the language, because what happens in the world does not depend
on the language we use to talk about it. This is a key assumption about the
physical world which we use to put conditions upon the kinds of language
we shall use to talk about it [13].
In the above case we can also change the units, you can use metres per second
and I can use parasangs per lunar month; the translation system still works
in that the system that translates the initial momentum from yours to mine
will also translate the final momentum from yours to mine. In fact it is
difficult to think of different systems of specifying initial and final momenta
and positions where a consistent translation system will not work.
[13] This condition does not seem to apply to one’s love life. There are tactful ways a bloke
can tell his girlfriend he doesn’t like her outfit, and there are others. He may tell the truth
in both cases, but the language makes a difference. In particular, it is always a mistake
to laugh. I speak from bitter experience.
Exercise 4.4.2. But not impossible. Hint: consider the map from the polar
coordinate space r, θ to itself that doubles the angle. This is not a bijection:
we can translate events one way, but not unambiguously the other.
What this means is that if we have two languages for talking about events,
then as long as the translation scheme between the two languages is a bijec-
tion, and as long as an event can be specified in one language, then it can
also be specified in the other, and the translation scheme will ‘work’ for all
such events if those events are correctly described.
But as well as specifying observable events, we also want to predict what will
happen in advance by means of some kind of theory. And it is going to be a
poor sort of theory where the prediction is different in languages with such a
translation scheme to hand. This imposes a constraint on the theory: it must
translate the same predictions into each other. This is known as Einstein’s
Principle of Covariance and you should be able to see how he (and Poincaré)
came to it: by seeing different coordinate frameworks as providing different
languages and there being a translation system between them.
We normally have not just two languages and one translation system between
them but a whole space of languages and a group of translation schemes,
since given any three languages, if I can translate from A to B and from B
to C by maps, then I can translate from A to C by the composite; moreover
the identity will translate from any language to itself, and we really want
every translation system to work in both directions, so there is an inverse
map. The associativity of composites of maps is immediate, so we have a
group of such translation systems. In the case of the shifting and rotation of
a coordinate frame used to specify only the positions of points, this is clearly
the Special Euclidean Group, SE(2, R). See the M2213 lecture notes if you
have forgotten this.
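The group structure can be made concrete with a short sketch. It assumes an element of the Special Euclidean Group is stored as a triple (angle, dx, dy), meaning "rotate by the angle, then shift by (dx, dy)"; this representation and the names `compose`, `inverse` and `act` are choices made for the example and are not taken from the M2213 notes.

```python
import math

def act(g, point):
    """Translate the coordinates of a point using the group element g."""
    a, dx, dy = g
    c, s = math.cos(a), math.sin(a)
    x, y = point
    return (c * x - s * y + dx, s * x + c * y + dy)

def compose(g, h):
    """The composite translation: apply h first, then g."""
    a, gx, gy = g
    b, hx, hy = h
    c, s = math.cos(a), math.sin(a)
    return (a + b, c * hx - s * hy + gx, s * hx + c * hy + gy)

def inverse(g):
    """The translation in the other direction: rotate back and undo the shift."""
    a, dx, dy = g
    c, s = math.cos(a), math.sin(a)
    return (-a, -(c * dx + s * dy), -(-s * dx + c * dy))

a_to_b = (0.5, 1.0, 2.0)      # hypothetical translation from language A to B
b_to_c = (-1.2, 0.3, -4.0)    # hypothetical translation from language B to C
a_to_c = compose(b_to_c, a_to_b)

p = (2.0, -1.0)
direct = act(a_to_c, p)
in_two_steps = act(b_to_c, act(a_to_b, p))
round_trip = act(inverse(a_to_b), act(a_to_b, p))
print(direct, in_two_steps, round_trip)
```

Translating from A to C directly agrees with going via B, and translating there and back returns the original numbers, which are exactly the closure and inverse properties used in the argument.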
If $f : \mathbb{R}^2 \to \mathbb{R}^2$ is any map, it makes sense to ask if it is invariant with
respect to a group action. For example, $f(x, y) = x^2 + y^2$ is invariant under
the rotation group $SO(2, \mathbb{R})$: putting
\[ X = x \cos\theta - y \sin\theta \]
\[ Y = x \sin\theta + y \cos\theta \]
we easily confirm that $f(X, Y) = f(x, y)$ for any $\theta$. So if some sort of
prediction is specified by a function we can look to see if it is invariant under
the appropriate group of transformations of coordinates which we regard as
specifying the possible languages we have available, and if it is not then it
cannot possibly define a satisfactory theory, because different observers will
expect to have different and incompatible outcomes. If a theory is specified
by requiring two functions to be equal, then they must be equal both before
and after we perform the appropriate group actions on them.
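A quick numerical illustration of the invariance test, as a sketch only. The function `f` is the one from the text; `g` is a hypothetical function added here to show what failure looks like, and the sample point and angle are arbitrary.

```python
import math

def rotate(x, y, theta):
    """The change of coordinates (x, y) -> (X, Y) given in the text."""
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))

def f(x, y):
    return x**2 + y**2

def g(x, y):
    return x * y   # hypothetical example which is not invariant

x, y, theta = 1.5, -0.4, 0.9
X, Y = rotate(x, y, theta)

f_change = abs(f(X, Y) - f(x, y))
g_change = abs(g(X, Y) - g(x, y))
print(f_change, g_change)
```

The change in f is zero up to rounding, while the change in g is not, so two observers using rotated axes would agree about predictions made with f and disagree about predictions made with g.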
What is the group action in the case of the ball bouncing off the wall? We
have that the space in question is the space of positions and momenta of
balls. This we have seen is the cotangent space to $\mathbb{R}^2$ which is isomorphic to
$\mathbb{R}^4$. It is not uncommon to write elements of this in the form $(q_1, q_2, p_1, p_2)$
where the $q_i$ are the positions, x and y more conventionally, and the $p_i$ are
the momenta. If we do a shift of a coordinate system, this will affect the $q_i$
but not the $p_i$. If we do a rotation, both will be affected in the same way.
We can also consider a coordinate frame which is moving at a constant velocity
with respect to another one, requiring us to specify also the time. So
we have a five dimensional space in which to specify the position and momentum
of the ball at each time, two coordinate frameworks for turning the
motion into a map from $\mathbb{R}$ denoting the time into $\mathbb{R}^5$, and a map between
them which has the property of taking one description to another description
of the same event.
Exercise 4.4.3. Write down a specification for a ball moving in a straight
line in $\mathbb{R}^2$ with constant momentum. Use the five numbers $(t, q_1, q_2, p_1, p_2)$.
Choose actual numbers for the motion!
1. Take a coordinate frame which is shifted by some constant amount and
translate the function giving the position and momentum of the ball into
the new framework.
2. Do the same with a rotated coordinate frame.
3. Do the same for a frame which is both rotated and shifted.
4. Do the same for a frame which is both rotated and is moving at constant
velocity.
5. Do the same for a frame which is rotating with constant angular veloc-
ity.
6. Write down the groups for the first three transformations. What phys-
ically intelligible function is invariant under this group?
7. Write down the group for all the first four transformations. (This is
called the Galilean Group.) What is its dimension?
8. Write down the group for all five transformations.
9. Is your function invariant under either of the last two groups?
10. What happens when the ball bounces?
11. Explain the physics here.
Exercise 4.4.4. If V is a vector field on $\mathbb{R}^3$ and $f : \mathbb{R}^3 \to \mathbb{R}^3$ is a euclidean
transformation, is it true that when V satisfies the equation $\nabla \cdot V = 0$ then
so does Tf(V)? If so prove it, if not give a counterexample.
Exercise 4.4.5. If V is a vector field on $\mathbb{R}^3$ and $f : \mathbb{R}^3 \to \mathbb{R}^3$ is a diffeomorphism,
is it true that when V satisfies the equation $\nabla \cdot V = 0$ then so
does Tf(V)? If so prove it, if not give a counterexample.
4.4.2 The Lorentz Group
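The definition of the Orthogonal group $O(n, \mathbb{R})$ was either that it consisted of
the orthogonal $n \times n$ real matrices, or, better, that it consisted of the linear
maps from $\mathbb{R}^n$ to $\mathbb{R}^n$ which preserved the inner product. Formally,
\[ \forall A \in L(\mathbb{R}^n, \mathbb{R}^n), \quad A \in O(n, \mathbb{R}) \Leftrightarrow \forall u, v \in \mathbb{R}^n, \ \langle u, v \rangle = \langle Au, Av \rangle \]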
So as to simplify things I shall now take a generalised inner product on $\mathbb{R}^2$,
the elements of which I shall write as
\[ \begin{pmatrix} t \\ x \end{pmatrix} \]
and the (lorentzian) inner product is defined to be
\[ \left\langle \begin{pmatrix} t \\ x \end{pmatrix}, \begin{pmatrix} t' \\ x' \end{pmatrix} \right\rangle = t t' - x x' \]
Note that I have reversed the sign from what you probably expected and the
one that the text book favours. If you feel uneasy about this go through
multiplying everything by −1.
It follows that the norm of the vector $(t, x)^T$ is $t^2 - x^2$.
I shall argue by analogy with the usual inner product on $\mathbb{R}^2$.
In order to find out what the orthogonal maps looked like on $\mathbb{R}^2$, we took
the unit circle, and argued that any point on it had to remain on it under
an orthogonal map. Doing the same here, take the set
\[ H^1 = \left\{ \begin{pmatrix} t \\ x \end{pmatrix} \in \mathbb{R}^2 : t^2 - x^2 = 1 \right\} \]
Figure 4.4.2: An analogue of the unit circle in a Lorentzian space.
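Then if A preserves the new inner product, or is lorentzian, we have that
\[ \begin{pmatrix} t \\ x \end{pmatrix} \in H^1 \Rightarrow \begin{pmatrix} s \\ u \end{pmatrix} = A \begin{pmatrix} t \\ x \end{pmatrix} \in H^1 \Rightarrow s^2 - u^2 = 1 \]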
It is easy to draw the set $t^2 - x^2 = 1$ and it consists of a hyperbola as in
figure 4.4.2.
My reason for drawing it this way around and taking $t^2 - x^2$ and not $x^2 - t^2$
is that all the action has x < t which, given that the velocity of light is one
in these units and that things don’t travel faster than light, is the way things
ought to be.
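Now we can parametrise the unit circle by $(\cos\theta, \sin\theta)$ and it is easy to
parametrise the curve $H^1$ by
\[ t = \cosh\theta, \quad x = \sinh\theta \]
since
\[ \cosh^2\theta - \sinh^2\theta = \frac{e^{2\theta} + e^{-2\theta} + 2}{4} - \frac{e^{2\theta} + e^{-2\theta} - 2}{4} = 1 \]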
Now we note that the standard basis elements in $\mathbb{R}^2$ are t = 1, x = 0 and
t = 0, x = 1 and that the norm of the first is 1, so it is in $H^1$, and the norm
of the second is −1 and it is not in $H^1$. So there is a slight problem with
defining a Lorentzian matrix in terms of cosh θ and sinh θ. The solution is
to observe that we need to extend $H^1$ to contain the other hyperbola which
intersects the x axis, as in figure 4.4.3.
Figure 4.4.3: A better analogue of the unit circle in a Lorentzian space.
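We can now see that we should have defined
\[ H^1 = \left\{ \begin{pmatrix} t \\ x \end{pmatrix} \in \mathbb{R}^2 : t^2 - x^2 = \pm 1 \right\} \]
which would have made no difference in the case of $S^1$ and the usual inner
product had we done it, but does make a difference here. We now have that
the lorentzian matrices are those of the following form, for any real θ:
\[ \begin{pmatrix} \cosh\theta & \sinh\theta \\ \sinh\theta & \cosh\theta \end{pmatrix} \]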
Those vectors for which we are in the original hyperbola are called timelike,
since they represent velocities which are less than 1 and correspond to things
we may see in our universe. Light, which moves at the velocity 1 in our units,
must lie along x = ±t, and consists of vectors not in $H^1$, and the norm of
any such vector is zero. So the analogue of distance in our Lorentzian space,
which we call the interval, is zero for any light ray. Seen from the point
of view of a ray of light, there is no difference between starting from the
Andromeda Galaxy and arriving in your eye. This is definitely weird; well,
that’s reality for you.
Supposing we start with a timelike vector in the two dimensional Lorentzian
space, for example t = 2, x = 1. It goes to
\[ t = 2\cosh\theta + \sinh\theta, \quad x = 2\sinh\theta + \cosh\theta \]
It is easy to verify that the norm of the original vector is 3 and so is the
norm of the final vector.
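The verification can also be done numerically for a few values of θ. This is a sketch only; the helper names `boost`, `apply` and `norm` are made up for the example, and `norm` is the quantity $t^2 - x^2$ defined above.

```python
import math

def boost(theta):
    """The 2 x 2 lorentzian matrix for the parameter theta."""
    ch, sh = math.cosh(theta), math.sinh(theta)
    return ((ch, sh), (sh, ch))

def apply(matrix, vector):
    (a, b), (c, d) = matrix
    t, x = vector
    return (a * t + b * x, c * t + d * x)

def norm(vector):
    t, x = vector
    return t * t - x * x

original = (2.0, 1.0)
norms = []
for theta in (-1.0, 0.0, 0.5, 2.0):
    moved = apply(boost(theta), original)
    norms.append(norm(moved))
    print(theta, moved, norm(moved))
```

Every line of output shows a different vector (t, x) but the same norm, 3 up to rounding.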
Exercise 4.4.6. Confirm that all such matrices as those advertised do indeed
preserve the lorentzian form. What is their determinant? What other
matrices preserve the Lorentzian form? What is their determinant?
Now let’s get back to higher dimensional spaces with a (1,n) signature. I
have the usual spacetime situation with $(x_0, x_1, x_2, \ldots, x_n)^T$ and the lorentzian
generalised inner product $x_0 x_0 - \sum_{j \in [1:n]} x_j x_j$. I am particularly concerned
with n = 3 because that is the number of spatial dimensions of the universe
we live in.
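There are six “basic” lorentzian matrices in $\mathbb{R}^4$ with the lorentzian inner
product:
\[
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & c & -s \\ 0 & 0 & s & c \end{pmatrix},
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & c & 0 & s \\ 0 & 0 & 1 & 0 \\ 0 & -s & 0 & c \end{pmatrix},
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & c & -s & 0 \\ 0 & s & c & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\tag{4.4.1}
\]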
gives three of them, where the c and s entries represent cosines and sines
of angles and give the three dimensional space of real orthogonal matrices.
The time axis is left fixed in this case, and each of these leaves one other
orthogonal axis fixed.
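The other three are
\[
\begin{pmatrix} ch & sh & 0 & 0 \\ sh & ch & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix},
\begin{pmatrix} ch & 0 & sh & 0 \\ 0 & 1 & 0 & 0 \\ sh & 0 & ch & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix},
\begin{pmatrix} ch & 0 & 0 & sh \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ sh & 0 & 0 & ch \end{pmatrix}
\tag{4.4.2}
\]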
where ch is short for cosh θ and sh is short for sinh θ for various θ. Each of
these swaps the time into one of the three space axes and vice versa. Again,
two axes are left fixed. They are known to physicists as Lorentz Boosts.
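A matrix M preserves the lorentzian inner product exactly when $M^T \eta M = \eta$, where $\eta$ is the diagonal matrix with entries (1, −1, −1, −1). The following sketch checks this numerically for one rotation, one boost and their composite. It is a spot check at particular parameter values and not a proof; the names `ETA`, `rotation_yz`, `boost_x` and `preserves_form` are made up for the example, and the last matrix is a hypothetical Galilean-style shear included only for contrast.

```python
import math

ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def rotation_yz(angle):
    """First matrix of (4.4.1): time and one space axis are left fixed."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, c, -s], [0, 0, s, c]]

def boost_x(theta):
    """First matrix of (4.4.2): mixes time with the first space axis."""
    ch, sh = math.cosh(theta), math.sinh(theta)
    return [[ch, sh, 0, 0], [sh, ch, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def preserves_form(m, tol=1e-9):
    """True when m^T ETA m equals ETA up to the tolerance."""
    lhs = matmul(transpose(m), matmul(ETA, m))
    return all(abs(lhs[i][j] - ETA[i][j]) < tol
               for i in range(4) for j in range(4))

composite = matmul(rotation_yz(0.3), boost_x(1.1))
shear = [[1, 0, 0, 0], [0.5, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

print(preserves_form(rotation_yz(0.3)), preserves_form(boost_x(1.1)),
      preserves_form(composite), preserves_form(shear))
```

The first three checks succeed and the shear fails, since it changes the norm of the time axis vector from 1 to 1 − 0.25.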
Exercise 4.4.7. Show that each of the above six matrices preserves the
lorentzian inner product, and hence that any composite of them (for any
consistent values of the argument of cos, sin, cosh or sinh) will also.
Exercise 4.4.8. Show that every matrix which preserves the lorentzian inner
product must be some finite product of such matrices.
Exercise 4.4.9. Show that the Galilean group can be represented by matrices
of the form
\[ \begin{pmatrix} 1 & 0 & 0 & 0 \\ v_1 & a_{11} & a_{12} & a_{13} \\ v_2 & a_{21} & a_{22} & a_{23} \\ v_3 & a_{31} & a_{32} & a_{33} \end{pmatrix} \]
where
\[ \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} \]
is in $SO(3, \mathbb{R})$.
Remark 4.4.1. Note that both the Lorentz group and the Galilean group
can deal with a change of coordinates from a fixed system to one moving at
uniform velocity with respect to it. And they are different! The lorentz group
is the right one for relativity. You should observe that for the lorentz group,
movement with velocity v means setting tanh(θ) = v, and that we recover the
usual (relativistic) rules for the addition of velocities.
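Since composing two boosts along the same axis adds their θ values (by the addition formulae for cosh and sinh), the rule for combining velocities can be sketched in a few lines. The function names and the sample speeds are made up for the illustration.

```python
import math

def rapidity(v):
    """The theta with tanh(theta) = v, for |v| < 1 in units where light has speed 1."""
    return math.atanh(v)

def add_velocities(u, v):
    """Velocity of the composite of two boosts along the same axis."""
    return math.tanh(rapidity(u) + rapidity(v))

u, v = 0.3, 0.6
w = add_velocities(u, v)
print(w, (u + v) / (1 + u * v))
```

The two printed numbers agree, and the result is smaller than the naive sum 0.9 and smaller than 1.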
Exercise 4.4.10. Find a good source on Special Relativity: you could do
worse than the Feynman Lectures on Physics, Volume 1, chapter 15. Note
the equations
\[ x' = \frac{x - ut}{\sqrt{1 - u^2}} \]
\[ y' = y \]
\[ z' = z \]
\[ t' = \frac{t - ux}{\sqrt{1 - u^2}} \]
Show that these are essentially the inverse of the first matrix in 4.4.2.
Explain why physicists use the inverse.
Exercise 4.4.11. Show that if a space ship zooms past you at velocity half
that of light, and another spaceship zooms past that at half the speed of light
(relative to the first spaceship) in the same direction, then you will decide the
second spaceship has a velocity of 4/5 the speed of light. Show that if the first
had speed u and the second had speed v relative to the first, your opinion of
the speed of the second is given by
\[ w = \frac{u + v}{1 + uv} \]
Show that if $|u| < 1$ and $|v| < 1$ then $|w| < 1$.
Exercise 4.4.12. Read the Feynman Lectures on Physics, Volume 1,
chapter 15. If you are a physicist you will have already covered this material;
if not you will find it comforting to discover you have now done Special
Relativity. Easy, isn’t it? Note that apart from a few technical difficulties (!)
you have discovered how atom bombs and nuclear power stations work [14].
[14] The details of atom bombs are very simple, and the recipe is as follows: take about
a kilogram of Uranium 235 or Plutonium and shape it into a hemisphere. Do the same
with a second kilogram. Now clap them together hard to make a solid sphere. Children,
do not do this at home unless you really dislike your parents.
4.4.3 The Maxwell Equations
The first of Maxwell’s equations for the vacuum, with charge density zero, is
\[ \nabla \cdot E = 0 \]
To see that this is invariant under $SO(3, \mathbb{R})$ is in one sense trivial. The equation
says that the net outflow of the flux determined by the vector field E for
any little box is zero, in fact for any box whatever, where a box is a region
bounded by something diffeomorphic to a 2-sphere. Rotating a box gives another
box, so the net outflow from a rotated box is also zero. This obviously
extends to the non-vacuum case with a non-zero charge density. It obviously
holds for a much larger group than $SO(3, \mathbb{R})$ too: it must hold for any diffeomorphism,
although the equation stating the fact that the divergence is zero
would look rather different.
Although this argument is persuasive, it lacks a certain rigour, so a slightly
more careful argument is required. We can observe that when we take the
divergence of the original vector field at any point it has to be the same as
the divergence of the transformed field at the transformed point. And since
the zero map is also preserved by the orthogonal map, the result follows.
The equation $\nabla \cdot B = 0$ is also invariant for the same reason.
Exercise 4.4.13. Show that the statement ‘the divergence of the original
vector field at any point has to be the same as the divergence of the transformed
field at the transformed point’ can be translated into algebra by doing
it, and that it is true for any vector field.
Now this argument uses only the linearity of the matrix and the fact that
$t = t'$, and indeed is far more general than that.
It remains to prove invariance for the boost maps, the first of which is
\[ \begin{pmatrix} ch & sh & 0 & 0 \\ sh & ch & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \]
The problem here is that the electric and magnetic fields are defined as vector
fields on $\mathbb{R}^3$ and the lorentz boost maps are represented by $4 \times 4$ matrices,
so we can expect some serious complications.
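The actual transformation of the E and B fields is rather a shock at first and
is given by
\[
\begin{pmatrix} \cosh\theta & \sinh\theta & 0 & 0 \\ \sinh\theta & \cosh\theta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} E_x \\ E_y \\ E_z \end{pmatrix}
=
\begin{pmatrix} E_x \\ E_y \cosh\theta - B_z \sinh\theta \\ E_z \cosh\theta + B_y \sinh\theta \end{pmatrix}
\tag{4.4.3}
\]
and
\[
\begin{pmatrix} \cosh\theta & \sinh\theta & 0 & 0 \\ \sinh\theta & \cosh\theta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} B_x \\ B_y \\ B_z \end{pmatrix}
=
\begin{pmatrix} B_x \\ B_y \cosh\theta + E_z \sinh\theta \\ B_z \cosh\theta - E_y \sinh\theta \end{pmatrix}
\tag{4.4.4}
\]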
What is surprising about this is that the Electric and Magnetic components
get mixed up. This means that if I am travelling in the x direction with
velocity tanh θ (which you will note has absolute value always less than 1,
the speed of light) and we both measure an electric field and a magnetic
field, we shall differ on which bits are which. This is a strong hint that the
two phenomena of electric fields and magnetic fields are all part of the same
underlying entity, called the electromagnetic field.
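To see the mixing in numbers, here is a sketch which applies the right hand sides of (4.4.3) and (4.4.4) to a field which, for one observer, is purely electric. The function name `boost_fields`, the field values and the chosen velocity of one half are made up for the example.

```python
import math

def boost_fields(E, B, theta):
    """New E and B components, following (4.4.3) and (4.4.4)."""
    ch, sh = math.cosh(theta), math.sinh(theta)
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    E_new = (Ex, Ey * ch - Bz * sh, Ez * ch + By * sh)
    B_new = (Bx, By * ch + Ez * sh, Bz * ch - Ey * sh)
    return E_new, B_new

# One observer measures an electric field along y and no magnetic field at all.
E = (0.0, 1.0, 0.0)
B = (0.0, 0.0, 0.0)

theta = math.atanh(0.5)   # velocity tanh(theta) = 1/2 in the x direction
E_moving, B_moving = boost_fields(E, B, theta)
print(E_moving)
print(B_moving)
```

The second observer finds a stronger electric field along y and, in addition, a non-zero magnetic field along z, so the two of them do indeed differ on which bits are which.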
The derivation of the above transforms will be easier once we go over to using
differential forms instead of vector fields to represent the two parts, E and
B of the electromagnetic field, so I shall defer it.
The invariance of the Maxwell Equations under the transforms is also eas-
ier to see in this setting. So we now turn to the right way to talk of the
electromagnetic field.
4.5 Saying it with Differential Forms
Given a physical phenomenon, in this case the force exerted on a charged
particle, and given that two bits of language can be used to describe it, in
this case as a vector field on $\mathbb{R}^3$ or as a 1-form or a 2-form, the question of
which piece of language to use comes up immediately. We may, of course,
choose the first one that occurs to us and having made a choice stick to
it in defiance of later developments. This is rather stupid and regrettably
common. An alternative is to ask if there are any physically obvious grounds
for making a choice, and a second is to keep them both in use until such time
as one demonstrates advantages.
Let me first argue that it is reasonable to use 2-forms for describing magnetic
fields. The reason is that 2-forms take account of orientations, and magnetic
fields certainly exhibit all the usual signs of being orientation aware. If you
look at the Lorentz Force law, which I give again for your greater comfort,
\[ F = q(E + v \times B) \tag{4.5.1} \]
you will see that the $v \times B$ part clearly has an orientation aspect in it by
virtue of the cross product, whereas the electric field has only the sense or
direction. It therefore makes sense to represent the magnetic field as a 2-
form. Some sort of right hand rule is involved in computing a cross product:
this may be seen as containing the information that magnetic fields also use
some sort of orientation information. They care about which direction a
charge is moving. Of course, we can force a vector view on the field if we
insist, which means we have to keep in mind the right hand rules of physics,
whereas if we use 2-forms, this should be taken care of by the formalism. A
good language is one which does most of the work and doesn’t require us to
keep a close watch on it.
If B is a 2-form then so is its time derivative, and the equation
∇E = −
∂B
∂t
is telling us that E is a 1-form. So we can write the above equation in the
form
dE = −
∂B
∂t
where d is the exterior derivative which we know takes 1-forms to 2-forms.
We may also write $d\mathbf{B} = 0$ with some confidence to represent the classical
equation $\nabla \cdot \mathbf{B} = 0$, since the divergence of a vector field is a scalar field and
the exterior derivative of a 2-form is a 3-form, which on $\mathbb{R}^3$ is pretty much
also a scalar field, multiplied by det if you want to be careful.
Unfortunately, after that everything goes pear-shaped.
The equation
\[ \nabla \times \mathbf{B} - \frac{\partial \mathbf{E}}{\partial t} = \mathbf{j} \]
makes no sense when we try to translate it: B can’t have a curl, it is one.
$\partial_t \mathbf{E}$ has to be another 1-form, and so is j. So we somehow have to arrange
that the translation of $\nabla \times \mathbf{B}$ is a 1-form.
On the other hand, we have also avoided facing the fact that we should be
doing all of this on $\mathbb{R}^4$ with the lorentz metric. Maybe we can save things by
a small amount of rearrangement.
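Let us look at the simplest case first, the equations
\[ \nabla \cdot \mathbf{B} = 0 \]
\[ \nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} = 0 \]
become
\[ d\mathbf{B} = 0 \]
\[ d\mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} = 0 \]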
This corresponds closely to the physics: dB is indeed a divergence that is a
3-form on $\mathbb{R}^3$ and the exterior derivative of the electric 1-form is indeed a
2-form.
We can shift all this to our lorentz space, $\mathbb{R}^4$ with the lorentz inner product,
by defining a 2-form on $\mathbb{R}^4$ as follows:
\[ F = \mathbf{B} + \mathbf{E} \wedge dt \tag{4.5.2} \]
Writing this out in coordinate form with respect to the standard basis we get
\[ \mathbf{B} = B_x\, dy \wedge dz + B_y\, dz \wedge dx + B_z\, dx \wedge dy \]
and
\[ \mathbf{E} \wedge dt = E_x\, dx \wedge dt + E_y\, dy \wedge dt + E_z\, dz \wedge dt \]
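I hope you recall representing the 2-form
$3\,dx \otimes dx + 4\,dx \otimes dy - 2\,dy \otimes dx + 5\,dy \otimes dy$ on $\mathbb{R}^2$
as a matrix
\[ \begin{bmatrix} 3 & 4 \\ -2 & 5 \end{bmatrix} \]
You will have verified that this operates on the two input vectors
\[ \begin{bmatrix} x \\ y \end{bmatrix}, \quad \begin{bmatrix} u \\ v \end{bmatrix} \]
by sending them to the number
\[ [x, y] \begin{bmatrix} 3 & 4 \\ -2 & 5 \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} \]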
If you don’t recall this or didn’t do the exercise, do it NOW.
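If you want a machine to check your arithmetic, here is a minimal Python sketch of that exercise. The function names bilinear and term_by_term are mine, not part of the text, and the test vectors are arbitrary. It evaluates the form once through the matrix and once directly from the tensor expression, so the two numbers should agree.

```python
# Matrix of 3 dx(x)dx + 4 dx(x)dy - 2 dy(x)dx + 5 dy(x)dy on R^2.
M = [[3, 4],
     [-2, 5]]

def bilinear(m, a, b):
    """Return a^T m b for a square matrix m and vectors a, b (plain lists)."""
    return sum(a[i] * m[i][j] * b[j]
               for i in range(len(a)) for j in range(len(b)))

def term_by_term(a, b):
    """Same number, read off the tensor expression: dx picks the first
    component of its argument and dy the second."""
    x, y = a
    u, v = b
    return 3 * x * u + 4 * x * v - 2 * y * u + 5 * y * v

# Arbitrary sample vectors; both routes should print the same value.
print(bilinear(M, [1, 2], [3, 4]), term_by_term([1, 2], [3, 4]))
```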
It is straightforward to verify that the 2-form F can be represented in the
same way on $\mathbb{R}^4$ by the matrix
\[ \begin{bmatrix}
0 & -E_x & -E_y & -E_z \\
E_x & 0 & B_z & -B_y \\
E_y & -B_z & 0 & B_x \\
E_z & B_y & -B_x & 0
\end{bmatrix} \]
Exercise 4.5.1. Verify this on pain of death.
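For a numerical spot check of the matrix (not a substitute for the exercise), here is a small Python sketch. It evaluates F = B + E ∧ dt on pairs of standard basis vectors, ordered (t, x, y, z), using only the definition of the wedge of two basis 1-forms. The helper names wedge_basis and field_matrix and the sample field values are illustrative assumptions of mine.

```python
T, X, Y, Z = 0, 1, 2, 3  # coordinate order (t, x, y, z)

def wedge_basis(i, j, a, b):
    """(dx^i ^ dx^j)(e_a, e_b) for basis 1-forms and basis vectors."""
    delta = lambda p, q: 1 if p == q else 0
    return delta(i, a) * delta(j, b) - delta(i, b) * delta(j, a)

def field_matrix(E, B):
    """Matrix with entries F(e_a, e_b) for F = B + E ^ dt."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    # Each entry is (coefficient, i, j) for a term coefficient * dx^i ^ dx^j.
    terms = [(Bx, Y, Z), (By, Z, X), (Bz, X, Y),
             (Ex, X, T), (Ey, Y, T), (Ez, Z, T)]
    return [[sum(coef * wedge_basis(i, j, a, b) for coef, i, j in terms)
             for b in range(4)] for a in range(4)]

# Arbitrary sample values for the six components.
for row in field_matrix((1, 2, 3), (4, 5, 6)):
    print(row)
```

With E = (1, 2, 3) and B = (4, 5, 6) the printed rows can be compared entry by entry with the matrix pattern in the text.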
It is now easy to compute dF. This will be a 3-form on $\mathbb{R}^4$.
We have:
\[ F = \mathbf{B} + \mathbf{E} \wedge dt \]
\[ dF = d\mathbf{B} + d(\mathbf{E} \wedge dt) \]
Doing the dB part first:
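\begin{align*}
d\mathbf{B} &= d(B_x\, dy \wedge dz + B_y\, dz \wedge dx + B_z\, dx \wedge dy) \\
&= \frac{\partial B_x}{\partial t}\, dt \wedge dy \wedge dz + \frac{\partial B_x}{\partial x}\, dx \wedge dy \wedge dz \\
&\quad + \frac{\partial B_y}{\partial t}\, dt \wedge dz \wedge dx + \frac{\partial B_y}{\partial y}\, dy \wedge dz \wedge dx \\
&\quad + \frac{\partial B_z}{\partial t}\, dt \wedge dx \wedge dy + \frac{\partial B_z}{\partial z}\, dz \wedge dx \wedge dy \\
&= \left( \frac{\partial B_x}{\partial x} + \frac{\partial B_y}{\partial y} + \frac{\partial B_z}{\partial z} \right) dx \wedge dy \wedge dz \\
&\quad + \frac{\partial B_x}{\partial t}\, dt \wedge dy \wedge dz + \frac{\partial B_y}{\partial t}\, dt \wedge dz \wedge dx + \frac{\partial B_z}{\partial t}\, dt \wedge dx \wedge dy
\end{align*}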
Now doing the d(E ∧ dt) part:
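\begin{align*}
d(\mathbf{E} \wedge dt) &= d(E_x\, dx \wedge dt + E_y\, dy \wedge dt + E_z\, dz \wedge dt) \\
&= \frac{\partial E_x}{\partial y}\, dy \wedge dx \wedge dt + \frac{\partial E_x}{\partial z}\, dz \wedge dx \wedge dt \\
&\quad + \frac{\partial E_y}{\partial x}\, dx \wedge dy \wedge dt + \frac{\partial E_y}{\partial z}\, dz \wedge dy \wedge dt \\
&\quad + \frac{\partial E_z}{\partial x}\, dx \wedge dz \wedge dt + \frac{\partial E_z}{\partial y}\, dy \wedge dz \wedge dt \\
&= \left( \frac{\partial E_z}{\partial y} - \frac{\partial E_y}{\partial z} \right) dy \wedge dz \wedge dt \\
&\quad + \left( \frac{\partial E_x}{\partial z} - \frac{\partial E_z}{\partial x} \right) dz \wedge dx \wedge dt \\
&\quad + \left( \frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y} \right) dx \wedge dy \wedge dt
\end{align*}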
Collecting up both parts we get:
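\begin{align*}
dF &= \left( \frac{\partial B_x}{\partial x} + \frac{\partial B_y}{\partial y} + \frac{\partial B_z}{\partial z} \right) dx \wedge dy \wedge dz \tag{4.5.3} \\
&\quad + \left( \frac{\partial E_z}{\partial y} - \frac{\partial E_y}{\partial z} + \frac{\partial B_x}{\partial t} \right) dy \wedge dz \wedge dt \\
&\quad + \left( \frac{\partial E_x}{\partial z} - \frac{\partial E_z}{\partial x} + \frac{\partial B_y}{\partial t} \right) dz \wedge dx \wedge dt \\
&\quad + \left( \frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y} + \frac{\partial B_z}{\partial t} \right) dx \wedge dy \wedge dt
\end{align*}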
If dF = 0 then each of the above four lines must be zero. The first line says
that div B = 0 in old fashioned language, and the last three say that curl
E +∂B/∂t = 0 in the same old fashioned language.
In other words, we get two of the Maxwell equations out. This is encouraging
and leads us to feel that gluing E and B together into a single entity, the
2-form F is a good idea. This is the physically significant thing of which the
magnetic field and the electric field are merely different aspects.
The next step is to express the other pair of Maxwell equations in the same
language.
This is where the $\ast$ operator comes in. It is clear that if we take $\ast F$ we get
another 2-form. When we calculate the matrix for it we get
\[ \begin{bmatrix}
0 & B_x & B_y & B_z \\
-B_x & 0 & E_z & -E_y \\
-B_y & -E_z & 0 & E_x \\
-B_z & E_y & -E_x & 0
\end{bmatrix} \]
Exercise 4.5.2. Confirm this. The calculation is utterly trivial, all you need
to do is to organise your thoughts sensibly. Observe that the Hodge dual can
be memorised by mapping Ej to −Bj and Bj to Ej. This looks very like
what we want for the other pair of Maxwell Equations in the classical form.
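The memorisation rule is easy to mechanise. The Python sketch below assumes the matrix layout for F given earlier in this section (coordinates ordered t, x, y, z) and simply substitutes Ej → −Bj and Bj → Ej. It does not compute the Hodge dual from the inner product, so it checks the bookkeeping rather than the definition. The names field_matrix, dual_by_rule and is_antisymmetric are hypothetical. Since the result represents a 2-form it has to be antisymmetric, which gives a quick test of any hand calculation.

```python
def field_matrix(E, B):
    """Matrix of F in the layout used in the text, order (t, x, y, z)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0, -Ex, -Ey, -Ez],
            [Ex, 0, Bz, -By],
            [Ey, -Bz, 0, Bx],
            [Ez, By, -Bx, 0]]

def dual_by_rule(E, B):
    """Apply the rule E_j -> -B_j, B_j -> E_j and rebuild the matrix."""
    new_E = tuple(-b for b in B)
    new_B = tuple(E)
    return field_matrix(new_E, new_B)

def is_antisymmetric(m):
    n = len(m)
    return all(m[a][b] == -m[b][a] for a in range(n) for b in range(n))

# Arbitrary sample values for the six components.
star_F = dual_by_rule((1, 2, 3), (4, 5, 6))
for row in star_F:
    print(row)
print(is_antisymmetric(star_F))
```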
If we take the exterior derivative we get a 3-form on $\mathbb{R}^4$, and if we apply $\ast$ to it we
get a 1-form. We can represent the current as a 1-form on $\mathbb{R}^4$ by putting the
charge density ρ in the zeroth place and using the other three places to give
the values for the current flow. This means we need to define J as the 1-form
\[ \rho \frac{\partial}{\partial t} + j_1 \frac{\partial}{\partial x} + j_2 \frac{\partial}{\partial y} + j_3 \frac{\partial}{\partial z} \]
Whereupon we may write the other two Maxwell Equations out as
\[ \ast(d(\ast(F))) = J \]
Exercise 4.5.3. Show that this does indeed amount to precisely the other
pair of Maxwell’s Equations.
It is common to leave out all the parentheses and summarise the Maxwell
Equations, all together, in the form
\[ dF = 0 \tag{4.5.4} \]
\[ \ast d \ast F = J \tag{4.5.5} \]
Remark 4.5.1. You have to allow that this is rather cool. Compacting the
equations in this way gives us a much more elegant way to express the basic
facts of electromagnetism and should leave you feeling that it is more true to
the underlying reality than the classical form. If you had never met Maxwell’s
Equations in the classical formalism and you had just met these for the first
time, you would, I think, find a good deal of charm in the conciseness, and
feel that the evidence for the electromagnetic field being a 2-form on a four
dimensional spacetime is overwhelming. The fact that it requires a lorentz
inner product to work properly is at least highly suggestive.
Exercise 4.5.4. How far could you get in rewriting the Maxwell Equations
(in terms of forms) with the standard inner product on $\mathbb{R}^4$? What changes
would you need?
4.6 Lorentz Invariance
The first issue to be addressed is to determine how 2-forms transform under
a transformation of coordinates.
Suppose B is a 2-form on $\mathbb{R}^n$ with a generalised inner product, and
$A : \mathbb{R}^n \to \mathbb{R}^n$ is a diffeomorphism. Then take one coordinate system at the origin and
let another be obtained by performing A on it. Call S the coordinate system
at the origin, and think ordered basis for a concrete example. Then AS is the
second coordinate system. Think of A as a linear map, possibly a lorentz map
to make this relatively concrete. Then a vector x in the coordinate system S
is read as $A^{-1}x$ in AS. Call it $x'$ to save space. That is, we have two ways of
talking about the same point in the space, two languages. Similarly, u and
$u' = A^{-1}u$ for a second vector.
Then if $B'$ is the correct transform of B in AS we shall have that $B'$ acts
on the ordered pair $x', u'$ to give the same number as B does on the ordered
pair x, u. This must be the case since the number we get out must not
depend on the language we are using to describe the points which exist and
are independent of the language.
If A is linear and S and hence $S' = AS$ are given by ordered bases, then we
can represent B by a matrix [B] and write B(x, u) as $x^T [B] u$. Similarly we
have $x'^T [B'] u'$ for the same number obtained by the transformed 2-form, and
these are equal for any choice of x and u. Substituting $x' = A^{-1}x$ and
$u' = A^{-1}u$ and saying this in algebra:
\[ \forall\, x, u \in \mathbb{R}^n \qquad x^T (A^{-1})^T [B'] A^{-1} u = x^T [B] u \]
This can only happen when $(A^{-1})^T [B'] A^{-1} = [B]$, which tells us that
\[ [B'] = A^T [B] A \]
which gives us the transform of the matrix [B] representing the 2-form B.
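A quick numerical sanity check of this kind of transformation rule, as a Python sketch under stated assumptions. Here P denotes the matrix that converts old coordinates to new ones, x′ = Px, so with x′ = A⁻¹x as above P plays the role of A⁻¹. The requirement that B′(x′, u′) = B(x, u) then reads [B′] = (P⁻¹)ᵀ [B] P⁻¹. The boost angle, the antisymmetric matrix and the test vectors are arbitrary choices, and all the function names are mine.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def apply(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def form_value(m, x, u):
    """x^T m u."""
    return sum(x[i] * m[i][j] * u[j]
               for i in range(len(x)) for j in range(len(u)))

theta = 0.7
c, s = math.cosh(theta), math.sinh(theta)
# P takes old coordinates to new ones; P_inv is its inverse (c^2 - s^2 = 1).
P = [[c, -s, 0, 0], [-s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
P_inv = [[c, s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

B_old = [[0, -1, -2, -3], [1, 0, 6, -5], [2, -6, 0, 4], [3, 5, -4, 0]]
B_new = matmul(transpose(P_inv), matmul(B_old, P_inv))

x, u = [1.0, 0.5, -2.0, 3.0], [0.3, -1.0, 4.0, 2.0]
old_value = form_value(B_old, x, u)
new_value = form_value(B_new, apply(P, x), apply(P, u))
print(old_value, new_value)  # should agree up to rounding
```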
Exercise 4.6.1. It is well known that for an orthogonal matrix A, $A^T = A^{-1}$.
What can you say about the transpose of a lorentzian matrix?
Exercise 4.6.2. The question arises, how much of this depends on linearity?
Obviously we have chosen to represent things by matrices, but the equation
makes sense in a much more general setting except possibly for the business
of the transpose, which arose from our determination to represent B by a
matrix. Suppose we express the 2-form relative to a basis in the usual way
as a sum of $dx^i \wedge dx^j$. What can be said about the expression of $B'$ relative
to the $dx^i \wedge dx^j$? How much if anything can be saved if we permit A to be
a diffeomorphism? Hint: investigate this in $\mathbb{R}^2 \setminus \{0\}$ with reference to the
polar coordinate transform.
The obsession which physicists and old style mathematicians have with matrix
representations of differential forms can obscure the basic simplicities. We
have already computed dF in standard terms as a 3-form in equation 4.5.3,
and it is simpler to investigate the lorentz transformations of both 2-forms
and 3-forms directly.
Let’s do it for the 2-form F on $\mathbb{R}^4$ and the first lorentz boost.
We have that any 2-form on $\mathbb{R}^4$ is given by suitable linear combinations,
weighted sums, of the six terms dt ∧ dx, dt ∧ dy, dt ∧ dz, dx ∧ dy, dx ∧ dz,
dy ∧ dz. From the matrix representation for F we can read these off:
\begin{align*}
F = {} & -E_x\, dt \wedge dx - E_y\, dt \wedge dy - E_z\, dt \wedge dz \\
& + B_z\, dx \wedge dy - B_y\, dx \wedge dz + B_x\, dy \wedge dz
\end{align*}
If we suppose that the first lorentz boost is used to transform the standard
basis in $\mathbb{R}^4$ to a new basis, what I shall call the dashed basis, then we need
the inverse map to transform the coordinates of a point (event) in $\mathbb{R}^4$ to the
new coordinates in the dashed frame. Thus we have
\[ \begin{bmatrix} t' \\ x' \\ y' \\ z' \end{bmatrix}
= \begin{bmatrix} c & -s & 0 & 0 \\ -s & c & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} t \\ x \\ y \\ z \end{bmatrix}
= \begin{bmatrix} ct - sx \\ -st + cx \\ y \\ z \end{bmatrix} \]
where c is short for cosh(θ) and s for sinh(θ), for any θ ∈ $\mathbb{R}$.
Taking the exterior derivative we get
\[ \begin{bmatrix} dt' \\ dx' \\ dy' \\ dz' \end{bmatrix}
= \begin{bmatrix} c\, dt - s\, dx \\ -s\, dt + c\, dx \\ dy \\ dz \end{bmatrix} \]
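We have therefore that
\[ dt' \wedge dx' = dt \wedge dx; \qquad dt' \wedge dy' = c\, dt \wedge dy - s\, dx \wedge dy; \]
\[ dt' \wedge dz' = c\, dt \wedge dz - s\, dx \wedge dz; \qquad dx' \wedge dy' = -s\, dt \wedge dy + c\, dx \wedge dy \]
and
\[ dx' \wedge dz' = -s\, dt \wedge dz + c\, dx \wedge dz; \qquad dy' \wedge dz' = dy \wedge dz \]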
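These six identities can be checked numerically with a few lines of Python. This is only a sketch: a 1-form is stored as its list of coefficients on (dt, dx, dy, dz), the wedge of two 1-forms is stored as a dictionary of coefficients on dx^i ∧ dx^j with i < j, and the boost angle is an arbitrary choice. The names wedge, dt_p, dx_p and so on are assumptions of mine.

```python
import math
from itertools import combinations

def wedge(a, b):
    """Coefficients of a ^ b on the basis dx^i ^ dx^j with i < j."""
    return {(i, j): a[i] * b[j] - a[j] * b[i]
            for i, j in combinations(range(4), 2)}

theta = 0.3
c, s = math.cosh(theta), math.sinh(theta)

# Dashed 1-forms written in the undashed basis (dt, dx, dy, dz).
dt_p = [c, -s, 0.0, 0.0]
dx_p = [-s, c, 0.0, 0.0]
dy_p = [0.0, 0.0, 1.0, 0.0]
dz_p = [0.0, 0.0, 0.0, 1.0]

T, X, Y, Z = range(4)
print(wedge(dt_p, dx_p)[(T, X)])                          # coefficient of dt^dx
print(wedge(dt_p, dy_p)[(T, Y)], wedge(dt_p, dy_p)[(X, Y)])  # c and -s
print(wedge(dx_p, dz_p)[(T, Z)], wedge(dx_p, dz_p)[(X, Z)])  # -s and c
```

The first printed number relies on cosh² − sinh² = 1, which is what makes dt′ ∧ dx′ collapse to dt ∧ dx.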
Exercise 4.6.3. Find the expressions for the four basis elements of a three
form,
\[ dt' \wedge dx' \wedge dy', \quad dt' \wedge dx' \wedge dz', \quad dt' \wedge dy' \wedge dz' \quad \text{and} \quad dx' \wedge dy' \wedge dz' \]
in terms of the undashed forms, dt ∧ dx ∧ dy and so on.
Exercise 4.6.4. Calculate the exterior derivative dF in terms of the functions
$E_x, E_y, E_z, B_x, B_y, B_z$ and the basis elements
dt ∧ dx ∧ dy, dt ∧ dx ∧ dz, dt ∧ dy ∧ dz and dx ∧ dy ∧ dz
Your first term should be, using the shorter notation of the text book for
partial derivatives,
\[ (-\partial_y E_x + \partial_x E_y + \partial_t B_z)\, dt \wedge dx \wedge dy \]
In the dashed frame we have the same form as in the last exercise for $dF'$
except that we have $E_x', E_y', E_z', B_x', B_y', B_z'$, partial derivatives of these
with respect to the dashed variables, and the usual suspects:
\[ dt' \wedge dx' \wedge dy', \quad dt' \wedge dx' \wedge dz', \quad dt' \wedge dy' \wedge dz' \quad \text{and} \quad dx' \wedge dy' \wedge dz' \]
Now we can replace the last four terms by the undashed terms.
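We also have that the chain rule allows us to replace the $\partial_{x'} E_y'$ and similar
terms by their undashed translations:
\[ [\partial_{t'} f, \partial_{x'} f, \partial_{y'} f, \partial_{z'} f]
= [\partial_t f, \partial_x f, \partial_y f, \partial_z f]
\begin{bmatrix} c & s & 0 & 0 \\ s & c & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \]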
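The chain rule statement can be tested by finite differences. The Python sketch below assumes the dashed coordinates are obtained from the undashed ones by the inverse boost, so that t = c t′ + s x′ and x = s t′ + c x′. The test function f, the sample point, the step size and all names are arbitrary choices of mine.

```python
import math

theta = 0.4
c, s = math.cosh(theta), math.sinh(theta)

def f(t, x, y, z):
    """An arbitrary smooth test function of the undashed coordinates."""
    return math.sin(t + 2 * x) * y + z * t

def to_undashed(tp, xp, yp, zp):
    """Undashed coordinates of the event with dashed coordinates (tp, xp, yp, zp)."""
    return (c * tp + s * xp, s * tp + c * xp, yp, zp)

def g(tp, xp, yp, zp):
    """The same function, read as a function of the dashed coordinates."""
    return f(*to_undashed(tp, xp, yp, zp))

def grad(func, p, h=1e-6):
    """Central-difference partial derivatives of func at the point p."""
    out = []
    for axis in range(4):
        hi, lo = list(p), list(p)
        hi[axis] += h
        lo[axis] -= h
        out.append((func(*hi) - func(*lo)) / (2 * h))
    return out

p_dashed = (0.2, -0.5, 1.1, 0.7)
lhs = grad(g, p_dashed)                 # derivatives in the dashed variables
df = grad(f, to_undashed(*p_dashed))    # derivatives in the undashed variables
M = [[c, s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
rhs = [sum(df[i] * M[i][j] for i in range(4)) for j in range(4)]  # row vector times M
print(lhs)
print(rhs)
```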
Exercise 4.6.5. Do this to obtain an expression for $dF'$ in terms of the
undashed symbols.
We finally have to confirm that
\[ dF = 0 \Rightarrow dF' = 0 \]
At first sight, the expression for $dF'$ is a mess, but a little thought shows it
is just a new dF for a slightly different pair of electric and magnetic fields,
which also satisfy Maxwell’s equations.
Exercise 4.6.6. Checking with the forms of 4.4.3 and 4.4.4, show that
\[ dF' = 0 \]
This sequence of exercises has established that the vacuum Maxwell equations
are invariant under the first lorentz boost.
Exercise 4.6.7. Confirm that this also works for the other two lorentz boosts.
This is best done using a small amount of thought rather than a large amount
of algebra.
Since we have already seen that the vacuum Maxwell equations are invariant
under the special orthogonal subgroup, it follows that the equation
dF = 0
is invariant under the lorentz group.
Now if $\ast d \ast F = 0$, which it does in the vacuum, it also follows that $d \ast F = 0$,
since $\ast$ takes zero forms to zero forms and $\ast^2$ is a number, in fact ±1.
Exercise 4.6.8. Which?
So the same calculation will establish the invariance of the second equation.
After doing this we conclude that both the vacuum equations are invariant
under the Lorentz group.
Exercise 4.6.9. Show that $d \ast F$ is also invariant under the three lorentz
boosts and the special orthogonal group. Hint: this does not require doing it
all over again!
This result is absolutely astonishing. I shall now explain why.
4.6.1 Special Relativity
Newton’s Laws of motion are about forces, which is to say accelerations
and masses if we look at the things we can actually measure directly. And
these appear at first sight to be invariant under the galilean group. Certainly
Newton thought they were, although group theory not having been formalised
in his day, he wouldn’t have put it in that way. Had the lorentz group been
in existence, the possibility that the laws of motion were invariant under the
lorentz group would have been regarded as a bizarre possibility too absurd
to waste time on, although one couldn’t rule it out on experimental grounds
since speeds with which we are familiar are small compared with the velocity
of light.
Invariance of the laws of nature under the galilean group explained why
we haven’t found anywhere in the universe labelled ‘origin’, a special point,
possibly with three orthogonal axes sticking out of it. It must have looked
unlikely that we will, and the fact that we have used coordinate systems and
bases of orthogonal vectors to talk about the universe led immediately to the
observation that this was merely a convenient language, and no particular
coordinate frame was better than any other; indeed, one could be moving
at constant velocity with respect to another and they were equally good.
Although accelerating frames changed things, as one finds when looking at
tops and roundabouts and planets.
The fact that Maxwell’s equations are not invariant under the galilean group
but are under the lorentz group produces some very strange results. One
is that the velocity of light is constant and does not depend on your own
velocity.
This is very unnerving indeed. If light is a wave motion like waves in water
then the wave velocity is a property of the water. If you are at rest with
respect to the water you get one answer, and if you are moving with respect
to the water you get another. This is not observed. If light goes like little
bullets from a source, then the velocity of the light is affected by the velocity
of the source with respect to any other observer. This is also not observed.
What happens when you travel fast away from the light is that the frequency
is shifted down, the colour changes, and if you travel towards it the frequency
is shifted up, the doppler effect. But the velocity is not affected.
Figure 4.6.1: An experiment with two charges.
It gets worse. Suppose I am sitting in a frame of rulers with a clock and
describing what happens when I take two charged balls of opposite signs and
place them a small distance from each other, as in figure 4.6.1.
I have tied each charge to some fixed object and measured the tendency
of the charges to attract each other by measuring the stretch of a spring.
Since everything is pretty much the same except for the sign of the charge,
including the mass of the balls and the elasticity of the springs, there is a
high degree of symmetry in the arrangement and I expect the stretches to be
the same. I can measure these by reading off the number on the ruler where
the edge of the ball is.
Now you come zooming past me going into the picture at half the speed of
light.
You see the two balls, you see the extension of the spring, and you can
measure the electric and magnetic fields with some little test charge you
carry with you, and some little bar magnet. Your little bar magnet can be
thought of as just two charges orbiting about each other, close enough so
they have no net electric field to speak of, and fast enough so they produce a
magnetic field. You can also observe, just as I can, the numbers on the ruler
where the edges of the balls are, and we had better agree on these numbers.
The collision of penguins is a fact in any language that is not totally bizarre,
and the coincidence of edges of balls and numbers on an external ruler must
also be a property of the universe, not the language we choose to talk about
it.
In my framework there is no magnetic field at all, unless we count your
measuring apparatus. But in yours there are two charges coming towards
you at half the velocity of light so there has to be a magnetic field, because
moving charges have to produce one; Faraday said they did and so they
do. So the numbers you get for the electric field and the magnetic field will
be different from the numbers I get, yours will have a non-zero magnetic
component.
Exercise 4.6.10. Use the first lorentz boost to work out what the rate of
exchange is.
We will both agree that the springs extend and the objects are attracting
each other. But our explanations of why will be different. You will have an
explanation which involves magnetic fields and mine won’t.
In different configurations we may disagree about the extension of the springs
or the masses of the balls or the time duration between events, although we
must of course agree about whether an event occurs or not. If two balls
(or penguins) collide, we must agree that the event happens, this being a
property of the balls (or penguins), but the translation between our languages
may make our measurements of various forces disagree when our translation
is done using the lorentz group.
The problem which was tackled in the early days of the twentieth century
was, can you have the mechanical part of nature invariant under the galilean
group and the electromagnetic part invariant under the lorentz group? As you
can see from our discussion on penguins and Einstein’s covariance principle,
this amounts to having a different and incompatible language for different
aspects of the universe, and we measure the effects of fields by mechanical
means. So it does not make much sense to have two incompatible languages
for talking about the same thing. The Michelson-Morley experiment tried
to measure the velocity of the earth with respect to the luminiferous aether,
the whatever it was that waggled when light passed through it (‘luminiferous’
just means ‘light bearing’ and ‘aether’ meant some weird stuff which spread
throughout the universe and had no other function than to bear light. In
particular it didn’t obstruct or slow down mechanical things like planets or
penguins). This makes some sort of sense if there is one kind of invariance
for matter and another for electromagnetic fields. The answer seemed to
be it was zero: the velocity of light does not depend on the velocity of the
observer measuring it. This is consistent with the lorentz invariance of the
Maxwell equations, but not with the idea that you can get away with two
incompatible languages for talking about the world.
Given this there are only two possibilities: either Maxwell’s laws are wrong
or Newton’s are. Mostly people assumed that Maxwell was going to have
to lose out in the fight between the intellectual giants, mainly because they
were used to Newton, and Maxwell was the new kid on the block, although
it was hard to see how he was wrong.
Poincaré pointed out that the alternative was to suppose that everything was
invariant under the lorentz group. Einstein worked out the consequences and
we had the special theory of relativity, and $E = mc^2$ is a trivial corollary.
Hence atom bombs and nuclear power stations. So today we use galilean
invariance as a simple approximation when velocities are low, and lorentz
invariance is taken to be right. For everything.
It is fashionable for philosophers to pontificate on science, and since scientists
are usually much too busy doing interesting things to bother much about
them, the philosophers have much more influence on the great unwashed
than they should. One line of argument goes: ‘Einstein showed Newton
was wrong, no doubt someone will eventually show Einstein is wrong too,
so nothing is known for sure and all knowledge and belief is temporary. So
we might as well stay totally ignorant of science. In fact since all knowledge
is liable to error we can’t be said to really know anything. And there is no
sense in pursuing truth if there is no possibility of catching it.’ I shall call
this the postmodern fallacy.
This certainly saves philosophers and others the trouble of learning about
tensors, or anything else complicated. The argument is very popular with
people who like to be thought of as intellectual but don’t have much intellect.
Exercise 4.6.11. Explain, as to a philosopher, why the postmodern argument
sucks.
Exercise 4.6.12. Google Michelson-Morley Experiment
This has been a quick introduction to special relativity. I have essentially
followed the historical development of the ideas, whereas it is more usual to
give you the facts which have become known as the result of experiments
since. Facts there are in plenty and they support completely the invariance
under the lorentz group of everything in the universe. Physicists tend to see
life as a huge collection of facts, mathematicians as a much smaller collection
of ideas. To mathematicians, reality is there to give us interesting things to
think about, and so we rely on physicists and engineers to find out how things
behave so we can make languages which describe them concisely. It has been
a very successful partnership and physics (and more recently engineering) has
forced us to produce some beautiful mathematics, some small amount of
which we have used thus far.
There is still lots left as the next chapter will show.
Chapter 5
DeRham Cohomology:
Counting holes
First some observations on a few cultural issues. There are differences
between mathematicians and physicists which cause problems. I don’t want to
overstate these, nor to leave anyone thinking I disapprove of either culture;
my first degree was in Physics and my Ph.D. in Pure Mathematics, and I
find both subjects wonderfully worth studying, but failure to confront issues
tends to make them harder to deal with, not easier. So some thoughts on the
differences may be worth putting up for your consideration. It should also
be noted that, seen from outside, the two cultures are so similar it is hard to
see any difference at all.
5.1 Cultural Anthropology
In the beginning of the twentieth century, the Mato Grosso was the great
unexplored jungle of the Amazon basin in South America. It was full of,
well, jungle, which we now call a rain forest (possibly to avoid giving offence
to jungles), and contained exotic animals and exotic tribes of people with
strange customs, such as shooting curare tipped darts at strangers.
Cultural anthropologists, anxious to study humanity in all its bizarre aspects,
visited these tribes in order to learn their ways; those who avoided the curare
tipped darts were able to return to civilisation and tell it about the customs
and manners of these fascinating people. One of the chief difficulties they
faced was the strange human ability to follow complicated rules without being
able to say what the rules actually are. This is obvious in language: a ten
year old has a good grasp of his native language and can follow incredibly
complex rules of grammar with no apparent difficulty. The conclusion that
French must be an easy language because lots of French children speak it, is
not in fact the case. It is rather that they have internalised a huge number
of rules, but they don’t know what they are. In order to learn French as an
adult, we tend to want to know the rules. It is no use asking a French child
to tell you. They don’t know.
In the same way, it was no use asking a denizen of the Amazon basin what
their basic assumptions about the universe were. They had them, they
followed these assumptions, but they had been brought up in the culture and
couldn’t articulate them. Part of the interest in exotic tribes is trying to
work out what those assumptions are, but there is no use asking the exotic
tribesmen. They learnt them at too early an age to realise they were making
them.
I once talked to an Australian engineer on a Japanese train, and (in front of
some of the Japanese staff) he expressed his amusement about the
introduction of flush toilets to Japan many years ago. They needed to have pictures
explaining to Japanese how to sit on the toilet seat, otherwise the more
hygienically conscious Japanese would stand on it and squat. He thought this
was very funny, because he presumably believed that the customary manner
of using a flush toilet is something people are born knowing. He thought
this because he had been potty-trained at an early age and had forgotten
the process. I doubt if his mother had. Modern Japanese toilets are so
complicated, they have to be explained to foreigners, so the Japanese have had
their revenge on ignorant Aussies.
These days the Mato Grosso is in the process of being turned into farms and
housing estates and the exotic tribesmen drive cars and drink coca-cola, so
there is not much point in a cultural anthropologist visiting the place. It
is much like back home. By way of compensation, there are other exotic
tribes being created. One of these is the tribe of theoretical physicists [1]. Just
like the Amazonian Indians, they have their special language, their cultural
assumptions about the world. And just like the Amazonian Indians, they
don’t actually know what they are.
This is where mathematicians come in. They are also a weird tribe, as you
may have noticed, but being professionally interested in rules they have a
much clearer idea of what those they follow actually are. And when they
study theoretical physics they find it necessary to articulate the assumptions
which the physicists make. It is no use asking the physicists, they do it
by training and reflex and don’t even notice they are doing it. Learning
theoretical Physics as an adult is harder than learning French, and asking
French children is no help, as noted above.
[1] One cultural anthropologist actually spent some months with a group of physicists
but his report on their weird ways aroused little interest, perhaps because he couldn’t
make as much sense of them as they could of him. This is a true story and not a joke.
Well, maybe it is a joke but it is also true.
So I am going to make some points which theoretical physicists would regard
as too obvious to talk about and don’t.
5.2 Solutions
The Maxwell Equations are basically about a set of six functions from $\mathbb{R}^4$ to
$\mathbb{R}$, $E_x, E_y, E_z, B_x, B_y, B_z$, which correspond to things that can be measured
using particular instruments. In practice we can only sample these functions
discretely if there is something in nature producing them, or we can more or
less ignore reality and just write down the six functions. They are, in our
notation, the components of a 2-form on $\mathbb{R}^4$ and we take the Lorentz inner
product should it be necessary. It is possible to see this as a map from $\mathbb{R}^4$ to
$\mathbb{R}^6$. Any such map will define a suitable 2-form, and it is not unreasonable to
demand that we restrict ourselves to smooth functions and maybe analytic
functions.
Exercise 5.2.1. Why is it not unreasonable?
Now the Maxwell Equations impose conditions on these six functions. Not
every choice of six functions from $\mathbb{R}^4$ to $\mathbb{R}$ will satisfy them. In fact there must
be some space of solutions. We have from dF = 0 four conditions on these
functions and from $d \ast F = 0$ another four. I am staying with the vacuum
equations for the time being. So we have a total of eight constraints on an
infinite dimensional space of functions, so we get some infinite dimensional
manifold of solutions. This is not much help.
The sad fact is we have only one solution to the vacuum Maxwell Equations,
which we got by guessing that a plane wave in space would do it. If you
write down
\[ E_x(t, x, y, z) = 0, \quad E_y(t, x, y, z) = 0, \quad E_z(t, x, y, z) = C \sin(y - t) \]
for any real number C, then you are describing a (sine) wave with an electric
field which exists only in the z-direction and which travels at unit speed in
the y-direction. If we take the curl of this we get, so Maxwell tells us,
\[ \frac{\partial \mathbf{B}}{\partial t} = -\begin{bmatrix} C \cos(t - y) \\ 0 \\ 0 \end{bmatrix} \]
and integrating gives
\[ B_x = C \sin(y - t), \quad B_y = 0, \quad B_z = 0 \]
which says the magnetic field is a plane wave also travelling along the y axis
being non-zero only in the x direction.
Exercise 5.2.2. Draw a picture.
It is easy to verify that $\nabla \cdot \mathbf{E} = 0$ and $\nabla \cdot \mathbf{B} = 0$ and it remains only to show
that curl B = $\partial_t \mathbf{E}$ which is rather easy.
Exercise 5.2.3. Do it.
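A numerical spot check is no proof, but it catches sign errors. The Python sketch below takes C = 1, uses E = (0, 0, sin(y − t)) and B = (sin(y − t), 0, 0), the sign for B being the one consistent with ∂B/∂t = −curl E, and estimates all the derivatives by central differences at one arbitrary event. The step size H, the sample point and the function names are my own assumptions.

```python
import math

def E(t, x, y, z):
    return (0.0, 0.0, math.sin(y - t))

def B(t, x, y, z):
    return (math.sin(y - t), 0.0, 0.0)

H = 1e-5  # finite-difference step

def partial(field, comp, axis, p):
    """d(field[comp]) / d(coordinate axis) at p; axes are 0=t, 1=x, 2=y, 3=z."""
    hi, lo = list(p), list(p)
    hi[axis] += H
    lo[axis] -= H
    return (field(*hi)[comp] - field(*lo)[comp]) / (2 * H)

def div(field, p):
    return sum(partial(field, i, i + 1, p) for i in range(3))

def curl(field, p):
    d = lambda comp, axis: partial(field, comp, axis, p)
    return (d(2, 2) - d(1, 3),   # d_y F_z - d_z F_y
            d(0, 3) - d(2, 1),   # d_z F_x - d_x F_z
            d(1, 1) - d(0, 2))   # d_x F_y - d_y F_x

def time_derivative(field, p):
    return tuple(partial(field, i, 0, p) for i in range(3))

def residuals(p):
    """Left-hand sides of the four vacuum equations; all should be near zero."""
    cE, cB = curl(E, p), curl(B, p)
    tE, tB = time_derivative(E, p), time_derivative(B, p)
    return ([div(E, p), div(B, p)]
            + [cE[i] + tB[i] for i in range(3)]    # curl E + dB/dt
            + [cB[i] - tE[i] for i in range(3)])   # curl B - dE/dt

print(max(abs(r) for r in residuals((0.3, -1.2, 0.8, 2.5))))
```

Flipping the sign of B in the sketch makes the curl residuals jump to order one, which is a cheap way of seeing that the relative sign of the two fields matters.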
Of course we can get some more solutions out of this, in fact an infinite
number of them. For a start we can rotate things so that instead of travelling
along the y axis it goes along any other line. Just apply any element of
SO(3,R) to the above system and we get a new one which we already know
also satisfies the Maxwell equations. From a physical perspective it would
be astonishing if it didn’t.
Exercise 5.2.4. Do it.
Better, apply a lorentz transformation to the 2-form and get a larger class of
equivalent solutions.
For a second thing, we can change the frequency of the wave so it has more
oscillations in time and space.
Exercise 5.2.5. Do it.
And finally if we have two or more solutions we can add them and get another
solution.
Exercise 5.2.6. Verify this.
This gives us a lot more solutions, infinitely many more, but one has to feel
that they are not really all that different. Of course Fourier Theory tells us
that any function can be approximated as a finite sum of such things. On
the other hand it is easy to construct functions which are not solutions.
Exercise 5.2.7. Do it.
So we ask the question, how many other solutions are there? It is conceivable
that this is in fact all, and it is conceivable that there are squillions of other
families of solutions, where another family is obtained from any one solution
by doing some rotations, or lorentz transformations, and scalings and sums.
One thing we note is that in going from the wave equation for the electric
field to obtain a magnetic field, we simplified the integral by making some
functions equal to zero.
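We had
\[ \operatorname{curl} \begin{bmatrix} 0 \\ 0 \\ \sin(y - t) \end{bmatrix}
= \begin{bmatrix} \cos(y - t) \\ 0 \\ 0 \end{bmatrix}
= -\frac{\partial \mathbf{B}}{\partial t} \]
Integrating this gives
\[ \mathbf{B} = \begin{bmatrix} \sin(y - t) \\ A \\ B \end{bmatrix} \]
where A and B can be any functions which do not depend on t. It is natural
to cheat and make them zero, as I just did. Can we have any other possibility?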
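We would need to ensure that $\nabla \cdot \mathbf{B} = 0$ which would force $\partial_y A + \partial_z B = 0$
for a start. We also have that
\[ \partial_t \mathbf{E} = \begin{bmatrix} 0 \\ 0 \\ -\cos(y - t) \end{bmatrix} \]
would have to be curl B, that is
\[ \begin{bmatrix} \partial_x \\ \partial_y \\ \partial_z \end{bmatrix} \times
\begin{bmatrix} B_x \\ B_y \\ B_z \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ -\cos(y - t) \end{bmatrix} \]
or
\[ \begin{bmatrix} \partial_y B_z - \partial_z B_y \\ \partial_z B_x - \partial_x B_z \\ \partial_x B_y - \partial_y B_x \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ -\cos(y - t) \end{bmatrix} \]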
This would seem to give rather a lot of possibilities for B other than the
simplest one we have considered.
Exercise 5.2.8. Does it? Find one or prove there aren’t any.
Note how we got this present family. Basically, we guessed from knowing that
light travels through space as a wave and the velocity is 1 in our units and the
conjecture that light is an electromagnetic thing, that a wave would work,
and wow, it did. Checking to see if a function from $\mathbb{R}^4$ to $\mathbb{R}^6$ will satisfy
Maxwell’s Equations is very simple, actually finding one by some process
other than guessing is a different story. And that makes the assumption
that we should look in a space of analytic or at least smooth functions; in
practice we are going to be using the elementary functions because these are
the ones we can easily write down and differentiate. Why should the universe
be so kind as to use the functions we find easy to write down? What if some
important physical process depended on functions we can’t write down as a
small sum of elementary functions? What, if anything, could be said about
it?
Exercise 5.2.9. Think about this. Have we just been dead lucky with light?
The question as to whether there are any other solutions to the vacuum
equation outside finite sums of lorentz transforms of the wave solution merits
a little thought.
The physicist will surely observe that there are bound to be solutions to any
non-vacuum problem. Take any configuration of moving charges. Specify
them by elementary functions Ex, Ey, Ez where possible. Then we can
hope to derive, in any coordinate frame in which the data is specified, the
corresponding magnetic fields. This requires merely some differentiation and
integration, leaving some unknown functions provided by the integration
stage. Now the physicist ‘knows’ that there is a solution: his reasoning
is that the universe will surely provide one, so it must be there to be found.
Indeed he will believe it is unique up to the transformations of coordinates,
since the universe doesn’t toss up between options. This, of course, assumes
that the Maxwell equations are true, which physicists do indeed believe. In
the main.
The question of why physicists feel happy to restrict themselves largely
to the analytic elementary functions, which I invited you to ponder a while
back, and the question of why physicists are so confident about being able to
prove uniqueness and existence of solutions are explained by two essentially
philosophical positions which go back to Newton.
The first can be summarised by the old adage ‘If something is ineffable,
there’s no point trying to eff it’, and perhaps also ‘If something can’t be
detected it isn’t there.’ If a function that was zero around the planet earth
was non-zero somewhere else, first it could not be represented by an analytic
function and second, we would have no way of knowing by local measurements
of any precision that it existed, so there is no point in wasting thinking
time about it, and if a function that can’t be written down is essential to
understanding something then we are never going to understand it, so again
forget about the possibility.
The second can be summarised by the principle that if you have a theory
which accounts for the phenomena, commit yourself to it until either someone
comes up with a simpler or more encompassing theory or you run into facts
which are in conflict with it, in which case bend the theory minimally to
accommodate the new facts. The more committed you are to the theory, the
more likely you are to discover such facts. Pondering what-ifs is a waste of
time.
The belief that the universe does not toss up but is consistent and hence
provides us with unique solutions is again a philosophical position. One can
argue that it is justified in various ways: it can be productive because if we
get lots of solutions we can look for extra conditions to force uniqueness and
usually find them. Recall the exercise you did on the magnetic field for a
wire carrying a current.
Most physicists regard these metaphysical convictions as so obvious that they
never bother to mention them. Much like the French children.
5.3 Infinite Variety
Giving the curl of a field and asking for a solution is, as you will have dis-
covered, difficult because there are so many solutions. In differential form
notation we have dX = Y where X is a 1-form and Y is a 2-form. Now d is
linear, so it follows that if dω = 0 then whenever X is a solution, so is X + ω.
And since $d^2 = 0$, if f is any differentiable function whatever, X + df is a
solution.
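As a small worked instance (the particular X and f are chosen here purely for illustration and are not from the text), take X = x dy on the plane, so that Y = dX = dx ∧ dy, and let f = xy:

```latex
\[
X + df = x\,dy + (y\,dx + x\,dy) = y\,dx + 2x\,dy,
\qquad
d(X + df) = dy \wedge dx + 2\,dx \wedge dy = dx \wedge dy = Y .
\]
```

So the quite different-looking 1-form y dx + 2x dy solves the same equation as x dy.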
What does this do to the physicist’s conviction that solutions have to be
unique on the physical grounds that the universe does not toss up? There
are two things one might do, and physicists do both of them. One is to
impose extra conditions which force uniqueness. Another is to declare the
difference between different solutions as an artefact of the language and deny
that it is physically significant. In the former case they explain that the uni-
verse has some rather unexpected preferences, often for continuous functions,
and in the second they glory in the freedom that they get to choose arbitrary
functions to suit their convenience, seeing no objection to making different
choices at different times. If a mathematician has noticed that they often
do the first and then unexpectedly do the second, and points out the incon-
sistency, they express surprise and a certain contempt that mathematicians
lack the courage to follow them. You will find this attitude in the text book
section on Gauge Freedom. I have not been able to get physicists to agree
that consistency in how they resolve multiple solutions to physical problems
is particularly desirable, although they insist that the universe shows con-
sistency. This appears to be a religious conviction, possibly derived from
Newton who believed (a) that God had created the universe and (b) that
God was not small-minded enough to be inconsistent or to try to fool us.
Quantum mechanics might have given him spiritual indigestion, as would
some of the practices of his intellectual heirs. But then, Newton consid-
ered himself a philosopher first and a mathematician second, and Physics or
indeed Science hadn’t been invented as a separate subject in his time.
Again, physicists remain happily oblivious to the underlying assumptions
in their practice, or the great majority of them do. Extracting them for
inspection is time consuming, but I haven’t found a quicker way of making
sense of their work. And it is sufficiently interesting and important work to
justify the effort.
5.4 Gauge Freedom
We have seen that guessing a solution and then verifying it is the quick and
easy way, but it presumes that we are good at guessing, or equivalently that
the solution is simple. It doesn’t seem safe to rely on this. So it is reasonable
to introduce other assumptions, some on physical grounds, some in a spirit
of optimism.
We know that $d^2 = 0$ and so when given dB = 0 it is tempting to consider
the possibility that the reason dB is equal to zero is that B = dX for some
unknown 1-form X. Such a thing is known in the literature as a vector
potential. We also know that it is far from unique: adding df for any function
(0-form) f will give an equally good X. This is precisely analogous to having
a constant of integration crop up: again we might feel inclined to fix it in
a physical situation by imposing an extra condition, as when we solve an
ODE, or we may feel that it gives us a glorious freedom to choose one that is
convenient, or we may decline to make a choice at all. In the case of vector
potentials, the practice of physicists is to glory in the freedom and call it gauge
freedom. A similar situation exists when we obtain a potential function for
a physical situation, when adding in an arbitrary constant will not change
the force field which is its gradient. Physicists sometimes insist that physical
constraints such as ensuring the potential goes to zero at infinity suffice to
get rid of the ambiguity, but they do not usually feel any such compulsion
in the case of the vector potential. Just what exactly is physically real and
what is an artefact of language is never precisely specified[2]. This allows
physicists to spout manifest drivel. I was once assured that there were 4π
lines of force coming out of a unit charge, and on pointing out that this was
roughly $12\tfrac{4}{7}$ lines and what did $\tfrac{4}{7}$th of a line look like? I was reproved for
being too literal. Clearly one wasn’t supposed to ask what things meant, one
was simply being instructed in the right things to say, whether it made sense
or was total bullshit. Thus do subcultures maintain a wall against outsiders:
there’s a lot of it about.
[2] This creates real problems. One of my lecturers at Imperial College told the class,
rather sadly, that when he was starting on a PhD, he had come up with what he saw as
a very interesting problem. His kindly supervisor had assured him that it wasn’t a real
problem, but an artefact of language. Someone else, perhaps with a less assured or less
kindly supervisor had assumed it was real, done the research, and become famous as a
result. I suppose one moral to be derived is that you shouldn’t trust your supervisor. The
conclusion I derived was that physicists are not at all clear as to what is real and what
isn’t. This surprised me at the time, but I was very young.
Exercise 5.4.1. Listen to some conversation between your friends and decide
how much of what is said is carrying information about the world which
could be translated into a foreign language and remain intelligible, as in ‘Your
dress is transparent in the sunlight’, and how much is comprehensible only after
a large number of extra propositions have also been translated, and possibly
not then, as in ‘All cultures are equally valid in their own terms’.
You will note that the mathematical subculture has a quite different set of
underlying assumptions from those of most of the rest of the human race.
One is that assertions have to make sense and should, if possible, be true
or derivable from other assertions which are either true or clearly stated to
be assumptions. Many students come to university with a quite different
assumption: that what is to be said is anything that has been approved by
authority. Whether it is true, false or totally meaningless is of no importance.
Answering an examination question is done by taking a few half recalled
fragments from lectures and gluing them together with bullshit. No doubt
this works well in the schools, and perhaps in other university departments,
but mathematicians really don’t like it. As I am sure you have noticed by
now.
Exercise 5.4.2. What other fundamental but usually unstated assumptions
characterise mathematical ‘culture’ ?
For the time being I shall simply go along with the physicists, but point out
any oddities while doing so.
5.5 Exact and Closed forms
I suggested earlier that given that dF = 0 for a 2-form F, we could get this
result if F = dX, using the well known result that $d^2 = 0$.
Definition 5.5.1. A form Y which satisfies the condition dY = 0 is said to
be closed.
Definition 5.5.2. A k-form Y such that there exists a (k − 1)-form ω such
that Y = dω is said to be exact.
Then we have the result that:
Proposition 5.5.1. Every exact form is closed.
Proof: $d^2 = 0$.
What about the other way around? Is every closed form exact? The answer
is interesting: it depends completely on topological properties of the space
on which the form is defined. You might think that this is interesting only if
you are a topologist; it has however some important implications for Physics.
The idea is explained clearly enough in Chapter Six of the text book, which
does it first for the case when X is a 1-form on $\mathbb{R}^2$. To say that dX = 0 is
to say that the field, corresponding to X when we use the inner product to
change to an equivalent vector field, has zero curl. The question then is, is
it a potential field? Is it the gradient of a scalar field $f : \mathbb{R}^2 \to \mathbb{R}$?
We can try to construct one by the simple process of taking some point
$a \in \mathbb{R}^2$ and declaring f(a) = 0. To get, for any other point b, a credible value
of f(b), we take a path from a to b and integrate the vector field along the
path. This tells us the amount of work the vector field does along the path.
We can put a minus sign in if we feel like it, but hey, who cares? Now this will
certainly give a value of f(b) but the obvious problem is that if we took a
different path, we might get a different answer. In general, we would. Your
second year exercises in the course of doing Stokes’ Theorem should have
convinced you of this.
If however the curl of the field is zero, then the integral around any closed loop
is zero. This follows from Stokes’ Theorem in the plane, otherwise known
as Green’s Theorem, immediately. And this means that the value of f(b)
cannot depend on the path, because two paths between fixed points when
joined together, one of them traversed backwards, give a closed loop. So we
can take this as a sensible value for f(b) because it
depends only on the vector field and the point a.
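The path (in)dependence just described can be seen numerically. The following is a minimal sketch under my own choice of examples: the 1-form 2xy dx + x² dy has zero curl, the 1-form y dx does not, and both are integrated from (0, 0) to (1, 2) along a straight line and along a parabola. The function path_integral and the two paths are hypothetical names introduced for this illustration.

```python
def path_integral(P, Q, path, n=2000):
    """Approximate the integral of P dx + Q dy along path: [0, 1] -> R^2,
    using n chords with the coefficients evaluated at each chord's midpoint."""
    total = 0.0
    x0, y0 = path(0.0)
    for k in range(1, n + 1):
        x1, y1 = path(k / n)
        xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
        total += P(xm, ym) * (x1 - x0) + Q(xm, ym) * (y1 - y0)
        x0, y0 = x1, y1
    return total

# Two different paths from a = (0, 0) to b = (1, 2).
straight = lambda t: (t, 2 * t)
parabola = lambda t: (t, 2 * t * t)

# Zero curl: d_x(x^2) - d_y(2xy) = 0, so the two answers agree.
closed_line = path_integral(lambda x, y: 2 * x * y, lambda x, y: x * x, straight)
closed_para = path_integral(lambda x, y: 2 * x * y, lambda x, y: x * x, parabola)

# Non-zero curl: the form y dx gives path-dependent answers.
open_line = path_integral(lambda x, y: y, lambda x, y: 0.0, straight)
open_para = path_integral(lambda x, y: y, lambda x, y: 0.0, parabola)

print(closed_line, closed_para)  # both close to 2
print(open_line, open_para)      # close to 1 and to 2/3 respectively
```

In the first case either path gives a usable value for f(b); in the second there is no sensible way to choose.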
Exercise 5.5.1. Show it depends on the point a only up to an additive constant:
in other words if I choose a and you choose $a'$, your function $f'$ and
my function f will differ by a constant.
Exercise 5.5.2. Translate this into the language of 1-forms, 2-forms and
0-forms on $\mathbb{R}^2$.
This seems to give us the following:
Proposition 5.5.2. Every closed 1-form on $\mathbb{R}^2$ is exact.
Proof: Just construct the 0-form as indicated. Then it is trivial to verify
that d of the 0-form is the given 1-form.
This seems perfectly reasonable and hasn’t seemed to involve us in any topol-
ogy, so I shall now give what looks like a counterexample to the last propo-
sition:
Proposition 5.5.3. The 1-form
\[
X = \frac{y}{x^2+y^2}\,dx - \frac{x}{x^2+y^2}\,dy
\]
is closed but not exact.
Proof:
First the closed part.
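\[
\begin{aligned}
dX &= \partial_y\!\left(\frac{y}{x^2+y^2}\right) dy \wedge dx - \partial_x\!\left(\frac{x}{x^2+y^2}\right) dx \wedge dy \\
   &= \frac{2(x^2+y^2) - 2(x^2+y^2)}{(x^2+y^2)^2}\, dy \wedge dx \\
   &= 0
\end{aligned}
\]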
Now suppose X = df for some function (0-form) f. Then it would follow that
the integral of X around the unit circle is zero, since starting at $a = (1, 0)^T$
and proceeding in the positive direction would give us f(a) − f(a) = 0 for
the integral, by definition of the construction of f. But a glance at the vector
field shows this is wrong. We have unit length vectors against us each step of
the way, so the integral is −2π. Check it by doing the algebra if the geometric
argument fails to carry conviction.
So there ain’t no such f, and X is not exact.
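For anyone who would rather see the number than do the algebra, here is a minimal numerical sketch of the loop integral. The function loop_integral and its parameters are hypothetical names introduced for this illustration; it integrates a 1-form P dx + Q dy once anticlockwise around a circle using a midpoint rule in the angle.

```python
import math

def X(x, y):
    """Coefficients (P, Q) of the 1-form X = y/(x^2+y^2) dx - x/(x^2+y^2) dy."""
    r2 = x * x + y * y
    return (y / r2, -x / r2)

def loop_integral(form, centre=(0.0, 0.0), radius=1.0, n=20000):
    """Integrate the 1-form once anticlockwise around the given circle."""
    total = 0.0
    dtheta = 2 * math.pi / n
    for k in range(n):
        theta = (k + 0.5) * dtheta
        x = centre[0] + radius * math.cos(theta)
        y = centre[1] + radius * math.sin(theta)
        dx = -radius * math.sin(theta) * dtheta
        dy = radius * math.cos(theta) * dtheta
        P, Q = form(x, y)
        total += P * dx + Q * dy
    return total

around_origin = loop_integral(X)                    # close to -2*pi
away_from_origin = loop_integral(X, (3.0, 0.0), 1)  # close to 0

print(around_origin, -2 * math.pi)
print(away_from_origin)
```

The second circle, which does not go around the origin, is included only for comparison.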
And something has gone horribly wrong.
Exercise 5.5.3. Can you see what? Stop now and try to work out why this
result is not, as at first appears, in conflict with the penultimate proposition
that said that every closed 1-form on $\mathbb{R}^2$ is exact. Warning: I am about to
give the game away on the next page, so stop now and work it out.
The answer is of course obvious once you have seen it. The 1-form
\[
X = \frac{y}{x^2+y^2}\,dx - \frac{x}{x^2+y^2}\,dy
\]
is not defined on $\mathbb{R}^2$. It is defined and smooth on $\mathbb{R}^2 \setminus \{0\}$. This is $\mathbb{R}^2$ with
a hole in it. The hole completely destroys the argument, because Green’s
Theorem, Stokes in the plane, doesn’t work if there is a hole in the region.
The boundary of a disc with a hole in it consists of both the bounding circle
and the point at the hole. Ignoring missing points screws up everything.
You should be warned that evil people, I suspect physicists, have the bad
habit of writing the negative of this form as dθ. You can see why they do it,
but you have to deplore their moral and mathematical muddle.
Exercise 5.5.4. Why do they do it? You might like to consider the function
which takes a point in the plane, writes it in polar coordinates and sends it
to θ. What happens if you take the exterior derivative of this 0-form?
The removal of a point of $\mathbb{R}^2$ makes a mess of the result that all closed forms
are exact. The argument works however for subsets of $\mathbb{R}^2$ which don’t have
any holes in. One hole is enough to bugger things up.
Exercise 5.5.5. Show this.
Exercise 5.5.6. What about the corresponding case of closed 1-forms on
$\mathbb{R}^3 \setminus \{0\}$? Are they always exact? After all, if we have a loop in $\mathbb{R}^3 \setminus \{0\}$ we
need to find a surface with the loop as boundary which does not contain 0,
in order to use the classical Stokes’ Theorem. This will allow the argument
to go through even in $\mathbb{R}^3 \setminus \{0\}$. And such surfaces are always there, we have
lots of extra room and can deform smoothly any bad surface that contains the
origin until it doesn’t.
It might occur to you to wonder if it goes on in the same way: does closed
imply exact on $\mathbb{R}^n$ in general? Investigating in the simplest case, $\mathbb{R}^2$, we
know that the only 3-form is zero so every 2-form on $\mathbb{R}^2$ is closed. This
would suggest that if it is true, every smooth function on $\mathbb{R}^2$ has a smooth
vector field of which it is the divergence.
Exercise 5.5.7. Is this indeed the case? If so prove it, if not give a coun-
terexample.
Exercise 5.5.8. Show that every 3-form on $\mathbb{R}^3$ is the exterior derivative of
a 2-form.
Exercise 5.5.9. What about closed 2-forms on $\mathbb{R}^3$? The required theorem
we would need is obviously more complicated since we have to construct a
1-form not just a function. Do it for the 2-form $dx \wedge dy + dx \wedge dz + dy \wedge dz$.
The last exercise will show there is a certain amount of slack and that we
can make some choices. It would be nice however to have a more systematic
approach.
To do this, let’s look at 1-forms on $\mathbb{R}^2$ which are exact and see if we can
be systematic about getting the ‘potential function’ f. Suppose we have a
1-form
\[
\omega = P\,dx + Q\,dy
\]
We can take the origin as a starting point and look to see what we get if we
integrate ω along a path from 0 to the point x. Rather than talk about any
old path, let’s do it with a straight line. Then the line is the set of points tx
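for t ∈ [0, 1] and we get that the path integral of ω along this path is
\[
\int_0^1 \left( P\,\frac{dx}{dt} + Q\,\frac{dy}{dt} \right) dt
\quad \text{where } \mathbf{x} = \begin{pmatrix} x \\ y \end{pmatrix}
\]
or
\[
\int_0^1 \bigl( P(tx, ty)\,x + Q(tx, ty)\,y \bigr)\, dt
\]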
Exercise 5.5.10. Evaluate this for $\mathbf{x} = [1, 2]^T$ and the 1-form x dx + y dy.
Exercise 5.5.11. Evaluate this for $\mathbf{x} = [1, 2]^T$ and the 1-form −y dx + x dy.
Note that this is not closed.
The result, for any point x, is a number which we can call f(x). I shall call
it I(ω)(x) and use I(ω) instead of f. The reason is that I goes in the
opposite direction to the exterior derivative so I (for exterior Integral?) seems
a reasonable symbol to use.
So we have an operator I from 1-forms to 0-forms which makes sense on $\mathbb{R}^n$
and always gives an answer whether the 1-form is closed or not. And we
observe that when ω is closed, then dI(ω) = ω so ω must be exact.
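The operator I on 1-forms in the plane is easy to approximate numerically. What follows is a minimal sketch, not a definitive implementation: I1 and grad are hypothetical helper names, Simpson's rule stands in for the exact integral, and the closed 1-form ω = 2xy dx + x² dy is an example of my own choosing (deliberately not one of the exercise forms).

```python
def I1(P, Q, x, y, n=200):
    """Simpson's rule for the integral over [0, 1] of P(tx, ty) x + Q(tx, ty) y."""
    g = lambda t: P(t * x, t * y) * x + Q(t * x, t * y) * y
    h = 1.0 / n  # n must be even for Simpson's rule
    total = g(0.0) + g(1.0)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3

def grad(f, x, y, h=1e-5):
    """Central-difference estimate of (df/dx, df/dy)."""
    return ((f(x + h, y) - f(x - h, y)) / (2 * h),
            (f(x, y + h) - f(x, y - h)) / (2 * h))

# Example closed 1-form (an illustrative choice): omega = 2xy dx + x^2 dy.
P = lambda x, y: 2 * x * y
Q = lambda x, y: x * x

f = lambda x, y: I1(P, Q, x, y)
value = f(1.5, -0.5)          # analytically I(omega) = x^2 y, so -1.125 here
fx, fy = grad(f, 1.5, -0.5)   # should be close to (P, Q) = (-1.5, 2.25)

print(value, fx, fy)
```

Here the integrand is 3t²x²y, so I(ω) = x²y, and the finite-difference gradient recovers the coefficients of ω, which is the statement dI(ω) = ω at that point.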
Can we get from 2-forms to 1-forms by a similar process? We investigate
the simplest case of $\mathbb{R}^2$ and a nice simple 2-form. Let us start by taking
the constant 2-form 2 dx ∧ dy. We want to do some integrating to obtain a
suitable 1-form I(2 dx ∧ dy) = P dx + Q dy. Since all 2-forms on $\mathbb{R}^2$ are closed
we would rather like to have d(P dx + Q dy) = 2 dx ∧ dy.
There is a fair bit of slack here. We would have
\[
\partial_x Q - \partial_y P = 2
\]
and we would need to make up our minds about how to split up the 2 between
the two contributions. Let’s make them equal. Then we would have
\[
\partial_x Q = 1; \quad \partial_y P = -1
\]
We can integrate both these equations to get
Q(x, y) = x; P(x, y) = −y
or
I(2dx ∧ dy) = −y dx +x dy
and checking confirms that this works: d(−y dx +x dy) = 2 dx ∧ dy.
Had we chosen some other way to split the number 2 up between the two
terms, we would have got another equally good 1-form: there is no shortage
of them.
Exercise 5.5.12. Try it. Make one term zero. Or −1. Now look at the
various 1-forms which have constant exterior derivative 2dx ∧dy. What can
you say about their difference?
Now we try to make the process look more like an operator I taking 2-forms
to 1-forms. First we split the elements up in equal amounts to be definite.
Then we integrate along a path as for the case of 1-forms. I write
\[
I(2\,dx \wedge dy) = \left( \int_0^1 t \cdot 2x \, dt \right) dy - \left( \int_0^1 t \cdot 2y \, dt \right) dx \tag{5.5.1}
\]
The term t is in there to make sure we divide by 2, which we can regard as
sharing the contributions out equally.
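Carrying out the two integrals in equation 5.5.1 explicitly, as a quick check that the factor t does what is claimed:

```latex
\[
\int_0^1 t \cdot 2x \, dt = x \left[ t^2 \right]_0^1 = x,
\qquad
\int_0^1 t \cdot 2y \, dt = y,
\qquad\text{so}\qquad
I(2\,dx \wedge dy) = x\,dy - y\,dx .
\]
```

This is the same 1-form −y dx + x dy obtained by splitting the 2 equally between the two terms.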
Exercise 5.5.13. Suppose we do the same with some more complicated 2-form
which is not constant, such as $\omega = (x^2 + y^2)\,dx \wedge dy$. Can you see how to
fix things up to obtain the 1-form I(ω) by modifying equation 5.5.1 appropriately?
Exercise 5.5.14. Can you make it work for 2-forms on $\mathbb{R}^3$? Try it on closed
2-forms first. Then try it on a 2-form ω that is not closed, and also try to
make it work for the 3-form dω. Notice anything?
If you have been good and virtuous and done the sequence of exercises above
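you will be prepared to believe that we can construct for any k-form ω on
$\mathbb{R}^n$, k > 0, a (k − 1)-form Iω, also on $\mathbb{R}^n$, given by:
\[
I\omega(x) = \sum_{i_1 < \cdots < i_k} \sum_{\alpha=1}^{k} (-1)^{\alpha-1}
\left( \int_0^1 t^{k-1}\, \omega_{i_1 < \cdots < i_k}(tx)\, dt \right) x^{i_\alpha}\,
dx^{i_1} \wedge \cdots \wedge \widehat{dx^{i_\alpha}} \wedge \cdots \wedge dx^{i_k}
\tag{5.5.2}
\]
where the $\widehat{dx^{i_\alpha}}$ means this term is omitted.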
This is undoubtedly a bit messy, which is why I gave the sequence of exercises.
If you prefer memorising things to understanding them, the very best of luck.
Note that I takes the zero k-form to the zero (k −1)-form.
It is now possible to prove the Poincaré lemma:
Theorem 5.5.1 (Poincaré Lemma). If a region $U \subseteq \mathbb{R}^n$ is star-shaped
with respect to the origin and if ω is a smooth k-form defined on U, then
there is a smooth (k − 1)-form Iω defined on U and
\[
\omega = d\,I\omega + I\,d\omega
\]
It follows that if ω is closed then it is exact.
Proof: This is a thoroughly horrid calculation which is done on page 95
of Michael Spivak’s Calculus on Manifolds. You have probably worked out
what the term star shaped with respect to the origin means: if a point is
in the set U so is every point on the line segment joining that point to the
origin.
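The identity ω = d Iω + I dω can at least be spot-checked numerically for a 1-form on the plane that is not closed. The sketch below is an illustration under stated assumptions, not a substitute for the proof: the example form ω = −y³ dx + xy dy, the helper names, Simpson's rule and the finite-difference step H are all choices made here, and I on 2-forms is taken from equation 5.5.2 with k = 2 and n = 2.

```python
H = 1e-5  # finite-difference step (illustrative)

def simpson(g, n=200):
    """Simpson's rule for the integral of g over [0, 1]; n must be even."""
    h = 1.0 / n
    total = g(0.0) + g(1.0)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3

def I_one_form(P, Q):
    """I(P dx + Q dy), returned as a function (0-form) of (x, y)."""
    return lambda x, y: simpson(lambda t: P(t * x, t * y) * x + Q(t * x, t * y) * y)

def d_zero_form(f):
    """d of a 0-form f, returned as the coefficient pair (f_x, f_y)."""
    fx = lambda x, y: (f(x + H, y) - f(x - H, y)) / (2 * H)
    fy = lambda x, y: (f(x, y + H) - f(x, y - H)) / (2 * H)
    return fx, fy

def d_one_form(P, Q):
    """d(P dx + Q dy) = g dx^dy; returns the coefficient g = Q_x - P_y."""
    return lambda x, y: ((Q(x + H, y) - Q(x - H, y))
                         - (P(x, y + H) - P(x, y - H))) / (2 * H)

def I_two_form(g):
    """I(g dx^dy) = c (x dy - y dx) with c the integral of t g(tx, ty) over [0, 1].
    Returns the (dx, dy) coefficient pair."""
    c = lambda x, y: simpson(lambda t: t * g(t * x, t * y))
    return (lambda x, y: -c(x, y) * y), (lambda x, y: c(x, y) * x)

# Example 1-form, not closed (an illustrative choice): omega = -y^3 dx + x y dy.
P = lambda x, y: -y ** 3
Q = lambda x, y: x * y

def poincare_residual(x, y):
    """Largest coefficient of omega - (d I omega + I d omega) at (x, y)."""
    a, b = d_zero_form(I_one_form(P, Q))
    c, e = I_two_form(d_one_form(P, Q))
    return max(abs(a(x, y) + c(x, y) - P(x, y)),
               abs(b(x, y) + e(x, y) - Q(x, y)))

print(poincare_residual(0.8, -1.3))  # small, limited by the differencing
```

Neither d Iω nor I dω equals ω on its own for this example; only the sum does, which is why the closed case (dω = 0) gives exactness.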
Exercise 5.5.15. Show that we can get the same result for any star-shaped
subset $U \subseteq \mathbb{R}^n$ where U is star-shaped with respect to any point.
Exercise 5.5.16. Show that if a region $U \subseteq \mathbb{R}^n$ is diffeomorphic to any
star-shaped subset, then the result still holds for all smooth k-forms on U.
Exercise 5.5.17. Give some examples of subsets U in $\mathbb{R}^n$ which are not
diffeomorphic to star-shaped regions.
5.6 Homotopies
Recall from 3P0 the idea of a homotopy:
Definition 5.6.1. We say that two continuous maps f, g : X → Y where
X, Y are topological spaces are homotopic iff there is a continuous map
F : X × I → Y, where I = [0, 1], such that ∀ x ∈ X, F(x, 0) = f(x) and ∀ x ∈ X, F(x, 1) = g(x).
In such a case we write f ≃ g.
Exercise 5.6.1. Show that homotopy is an equivalence relation on the con-
tinuous maps from X to Y .
In other words, we can change t continuously from 0 to 1 and interpolate
between f and g. If X is the space consisting of a single point, ∗, then to say
that two maps, f, g from ∗ to Y are homotopic is to say that f(∗) and g(∗)
can be connected by a continuous path joining them. So we can say that:
Definition 5.6.2. A space Y is path connected (or 0-connected) iff every
two maps from ∗ to Y are homotopic.
Or equivalently, we say Y is 0-connected iff every constant map to Y is
homotopic to every other constant map.
This can be extended considerably:
Definition 5.6.3. A space Y is simply connected or 1-connected iff every
map $f : S^1 \to Y$ is homotopic to a constant map.
Definition 5.6.4. A space Y is k-connected iff every map $S^k \to Y$ is homotopic
to a constant map.
You should be warned that some writers use the term k-connected to mean
what I call n-connected for every n ∈ [0 : k]. In my sense,
Proposition 5.6.1. The circle, $S^1$, is 0-connected but not 1-connected.
Proof:
To see this we make use of the exponential map $\exp : \mathbb{R} \to S^1$, $t \mapsto e^{2\pi i t}$.
If we take a map $f : [0, 1] \to S^1$ with f(0) = f(1) we can regard f as a map
from $S^1$ to $S^1$. Using the fact that exp is locally a diffeomorphism, we can
lift f to $\tilde{f} : [0, 1] \to \mathbb{R}$ with $\exp \circ \tilde{f} = f$. If, without loss of generality, we
assume $f(0) = f(1) = [1, 0]^T$ then we can fix $\tilde{f}(0) = 0$ and observe that $\tilde{f}(1)$
must be an integer. This integer is called the winding number of f.
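The lifting construction can be mimicked numerically: chop [0, 1] into small pieces, and on each piece add the small change in angle, measured in units of full turns. This is a minimal sketch, assuming the loop is given as a function from [0, 1] to the unit complex numbers and that n steps are enough for consecutive samples to be less than half a turn apart; winding_number is a hypothetical name introduced here.

```python
import cmath
import math

def winding_number(loop, n=2000):
    """Accumulate the lift of a loop in S^1 (unit complex numbers) step by step.

    Each step contributes the principal argument of cur / prev, which is the
    local inverse of t -> exp(2*pi*i*t) applied to a short arc.
    """
    lift = 0.0
    prev = loop(0.0)
    for k in range(1, n + 1):
        cur = loop(k / n)
        lift += cmath.phase(cur / prev) / (2 * math.pi)
        prev = cur
    return round(lift)

three_times = lambda t: cmath.exp(2j * math.pi * 3 * t)              # winds 3 times
backwards = lambda t: cmath.exp(-2j * math.pi * t)                   # winds once, clockwise
wobble = lambda t: cmath.exp(2j * math.pi * math.sin(2 * math.pi * t))  # goes out and comes back

print(winding_number(three_times), winding_number(backwards), winding_number(wobble))
```

The last loop travels all the way round the circle and back again, yet its lift returns to 0, so its winding number is 0.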
Exercise 5.6.2. Draw a picture. Confirm that it is always possible to chop
[0, 1] into small enough bits so that exp has a smooth inverse on the image
by f of each bit. Explain precisely how $\tilde{f}$ is constructed.
It is not hard to show that the winding number is a homotopy invariant,
which is to say that if two maps are homotopic then they have the same
winding number, and also that if they have the same winding number they
are homotopic.
Exercise 5.6.3. Do it.
It follows that there is no homotopy between the identity map and a constant
map from $S^1$ to itself, and hence that $S^1$ is not 1-connected.
Exercise 5.6.4. Finish the argument.
Exercise 5.6.5. Show that $S^2$ is path connected and 1-connected but not
2-connected.
Exercise 5.6.6. Show that $S^n$ is k-connected for 0 ≤ k < n but not n-connected.
Definition 5.6.5. If f : X → Y and g : Y → X are continuous maps and if
$f \circ g \simeq I_Y$ and $g \circ f \simeq I_X$ then we say that X and Y have the same homotopy
type, and f is a homotopy equivalence.
Exercise 5.6.7. Show that having the same homotopy type is an equivalence
relation on topological spaces. Show that $\mathbb{R}^n$ has the homotopy type of a one
point space, and that $S^k$ has the homotopy type of $S^n$ iff k = n.
Exercise 5.6.8. Show that $\mathbb{R}^n \setminus \{0\}$ has the homotopy type of $S^{n-1}$ for n ≥ 1.
The last exercise has as an almost immediate corollary that if $\mathbb{R}^2$ has any
holes in it, the resulting space is not simply connected.
Exercise 5.6.9. Show this.
Let A denote any compact subset of $\mathbb{R}^2$. Now it is immediate that if two
loops in $\mathbb{R}^2 \setminus A$ are homotopic, and if ω is any closed 1-form on $\mathbb{R}^2 \setminus A$, then
the integral of ω over the first loop is equal to the integral over the second.
It follows that we can say that:
Theorem 5.6.1. If a manifold is connected and simply connected then every
closed 1-form on it is exact.
Proof:
If ω is a closed 1-form on a connected and simply connected manifold $M^n$,
then the integral around any loop is zero since the loop is homotopic to a
constant map. Hence the integral along any path between any pair of end
points does not depend on the path. To compute the integral along any path
we take a chart containing one end point and take a point along the path
which is in the domain of the same chart, shift the 1-form and the path to
$\mathbb{R}^n$ by means of the chart and compute the integral in $\mathbb{R}^n$. Do this for a
set of points along the path until we have the whole path, and add up the
part integrals to get the value of the integral for the whole path. Putting the
function I(ω) equal to zero at the starting point and the value of the integral
at the finishing point defines I(ω) at the end point. We can do this for every
end point on $M^n$. The argument that dI(ω) = ω takes place in $\mathbb{R}^n$ and is
trivial.
5.7 Counting Holes
The text book does an excellent job of explaining how we have a vector space
$Z^p(M)$ of closed p-forms on M and another $B^p(M)$ of exact p-forms and we
know $B^p(M) \subseteq Z^p(M)$. So we can form the quotient vector space
\[
H^p(M) = Z^p(M)/B^p(M)
\]
This measures the number of p-holes in M. If you have trouble with quotient
spaces, take $\mathbb{R}^n$ and $\mathbb{R}^m$ with m < n, take an embedding of $\mathbb{R}^m$ in $\mathbb{R}^n$ by
a linear map, and look at the quotient object, which should look a lot like
$\mathbb{R}^{n-m}$.
There are a number of ways of computing the cohomology of spaces, not
necessarily manifolds, and hence a number of different cohomology theories.
In a sense, and up to a choice of a coefficient group, they all give the same
answers. This takes us further into algebraic topology than I am game to go
in this course, but you should know that much. If you want to know more
algebraic topology, do the unit in second semester. You can find my notes
on the web, which may or may not help.
Exercise 5.7.1. Do exercise 98 on page 125 of the text book. Read the section
carefully.
5.8 More Cultural Anthropology
There is a section in the text book on the Bohm-Aharonov effect which should
be of keen interest to cultural anthropologists. The effect is a quantum
mechanical phenomenon of some interest.
The text book explains that physicists get some insight into the effect by
visualising an infinitely long core on which is wound a helical wire which the
authors wrongly call a spiral. Then it appears that the fact that the region
outside the wire is not simply connected accounts for the effect happening.
They then go on to admit that in fact the wire is not infinite so the space
outside is in fact simply connected. Since the wires are normally joined via
a generator or battery so as to produce a current in the wire, one might feel
that they were right the first time. But one can certainly visualise a very
long coil, say a light-year length of wire, and a humungous charge at one
end which attracts the electrons towards it. Then if we neutralise the charge
with an equal and opposite one (producing a humungous flash, perhaps) the
electrons will be released to head down the coil. After about nine months
one could conduct the experiment to detect the effect somewhere about the
middle of the coil. Presumably it would be observed to happen despite the
fact that the coil has ends and the complementary space is in fact simply
connected.
Think about this. It is claimed that physicists get insight into why something
happens based on an assumption which is in fact false. It is rather like
claiming that you get some insight into why human beings have two legs by
observing that horses have four legs so the rear half of a horse has two. If
you were told this, you might point out that the claim is made by a person
who might in fact be a horse’s rear end, but since you are not, it does not in
fact contribute noticeably to your understanding.
Us coarse, crude mathematicians have a technical term for this sort of thing:
we call it bullshit.
It might be that some kind of sense can be made of this, and it would be
interesting to see it done, even more interesting to try to actually do it.
One is left with the impression that to a physicist, mathematics is there in
two rôles: one is to supply a means of doing the computations and the other is
as a sort of mnemonic for remembering the rules for doing them. Mnemonics
do not have to make sense, and generally don’t.
To a mathematician, the rules are there because they reflect the way the
universe works, and they have to make sense. Either the universe does in
fact work this way in which case the rules are right and we may trust our
calculations, or it doesn’t and they are unreliable. One may, of course, have
only uncertain knowledge of which of these states of affairs actually obtains.
Taking a punt on it being right and then examining reality closely and
discovering if our sums agree with our measurements is usually felt to be the
way to go. Talking pure bullshit, even if it is the same bullshit as that uttered
by the rest of the tribe, doesn’t cut it. In the creative phase of development
of an idea, some haziness is allowable, indeed necessary. But bullshit is always
a bad idea. And removing the haziness is crucial to progress. Incorporating
it into your subject is popular among people like publicists and politicians,
where a career built on a foundation of bullshit is quite common, but it is
disappointing to find it in Physics.
It wouldn’t be quite so bad if physicists understood that what they are doing
here is rather silly. Like the arts students who feel quite proud of their
inability to use logic and announce that they are not to be constrained by
mere rules and consistency, the price paid is that nobody else will trust their
arguments. Long, long ago, physicists understood that bullshit is baaaaaad.
Some of them these days do not. So do civilisations crumble.
5.9 Summary
There are serious problems for a mathematician trying to understand physics,
many of them put in place by physicists, who have a very different notion
of what constitutes an explanation. Nevertheless it is a fascinating and re-
warding subject.
I have worked through most of part I of the text book and would like to
have got much further. It is possible for the interested reader to tackle the
next two sections, and I would encourage you to do this. You will certainly
come to the conclusion that understanding Physics entails getting a grasp of
an awful lot of contemporary mathematics. You might find it easier to do it
the physicist’s way, which involves knowing a lot of facts and stringing them
together with algebra in a rather muddled manner, or you might find it better
to understand the mathematics first. This is probably to be determined by
how much brainpower you have versus the extent of your memory.
The remaining chapter headings represent a pious hope of how far I would
have liked to get but probably won’t. As time permits I shall continue finish-
ing the material but I doubt if we will get any further this semester. Maybe
we want a post-graduate unit on it.
Chapter 6
Lie Groups
6.1 Introduction and Motivation
6.1.1 The rest of the course
The next few chapters will treat the machinery needed to deal with part Two
of the text book. There are a number of elements of this. The first is a study
of some Lie Groups which will require a small amount of group theory and
a brief return to the tensor algebra, the second is a study of vector bundles
in particular G-bundles, where the bundle structure is specified by a group.
This is known to physicists as the gauge group of the bundle. It tells us
how to glue things together in order to build a bundle from trivial bundles.
This will lead to the Yang-Mills equation as a generalisation of the Maxwell
Equations for force fields other than the electromagnetic.
6.2 Introduction to Lie Groups
I discussed Lie Groups briefly in the second year algebra unit. They were
all matrix groups, and hence mostly subspaces of the general linear group
GL(n, R), which we can think of either as the space of all invertible linear
maps from $\mathbb{R}^n$ to itself, or as the space of $n \times n$ invertible matrices with
real entries. The exceptions were barely mentioned subgroups of GL(n, C)
which is either the space of invertible linear maps from $\mathbb{C}^n$ to itself or the
space of $n \times n$ invertible matrices with complex entries. Which definition you
prefer is a matter of taste; I prefer to think of the linear maps as being more
fundamental and regard the matrices as handy devices for representing the
linear maps in a convenient form for computation [1]. In this course however
I shall usually write GL(n, R) for the matrices and $\mathrm{Aut}(\mathbb{R}^n)$ for the linear
automorphisms (isomorphisms with itself) of $\mathbb{R}^n$.
Notable among these groups were the Orthogonal groups, O(n, R) and the
Special Orthogonal groups SO(n, R), the Unitary groups, U(n, C) and the
Special Unitary groups SU(n, C). GL(n, R) sits inside the vector space of all
$n \times n$ real matrices, which is in fact an algebra because we have a
multiplication, not usually commutative, obtained by composing the maps or,
equivalently, multiplying the matrices. It is obvious that the group property
of the Lie Groups is that of the multiplication, but that if we add two
orthogonal matrices the result is not in general an orthogonal matrix, so the
Lie Groups are not vector spaces. They are however smooth manifolds, and hence
have a dimension.
To see that they are manifolds, the easy way is to note that for all the above
examples, an element of the matrix group is defined by putting a bunch of
smooth conditions on the elements of the matrix. For example, to get O(2, R)
we take the space of all $2 \times 2$ matrices with real entries,
$$\begin{bmatrix} x & u \\ y & v \end{bmatrix}$$
and require the conditions:
$$x^2 + y^2 = 1, \quad u^2 + v^2 = 1, \quad xu + yv = 0$$
This gives us three independent conditions on four numbers so we expect,
or at least hope, to have one degree of freedom left and a one dimensional
manifold. This is a rather sloppy discussion of an application of the implicit
function theorem, which you need to remind yourself of. And the implicit
function theorem is a generalisation done locally of the rank nullity theorem.
Which you know from second year. I hope.
Let’s do two cases in agonising detail. First the unit circle, because it is so
easy.
The Implicit function theorem deals with the zero set of a function
$f : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ which is continuously differentiable near a point
$(a, b) \in \mathbb{R} \times \mathbb{R}$. Think $f(x, y) = x^2 + y^2 - 1$,
$Df(a, b) = [\partial_x f(a, b), \partial_y f(a, b)] = [2a, 2b]$. It tells us that when the derivative
with respect to y is invertible, we can represent the zero set of f locally as the
graph of a curve y = g(x) for a differentiable g. The derivative with respect
to y is invertible when it is non-zero, which happens everywhere except at
y = 0, x = ±1 in the case of $f(x, y) = x^2 + y^2 - 1$. And in this case, if we
swap x and y we can express the curve as the graph of a map from y to
x. Since in either case the rank of the derivative is one and the curve is
locally (that is, in a neighbourhood of the point) a graph of a
differentiable function, we have the conclusion that at every point of
the zero set of f where the rank of the derivative is one, the zero set of f is
locally diffeomorphic to an interval.
[1] This is probably related to the fact that I don’t particularly enjoy doing sums, but I
do like understanding the ideas which tell me how to do them. This often requires me to
do sums, but I prefer to do the bare minimum. Of course, if the ideas didn’t tell me how
to do the sums, I should suspect them of being metaphysical tosh, so I do believe that
sums are, or at least the fact that they can be done is, important.
The generalisation of this which we need is the Implicit Function Theorem
which I give in what may be a new (manifold) form:
Theorem 6.2.1. If $f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^m$ is differentiable and
$$M = \{(x, y) \in \mathbb{R}^n \times \mathbb{R}^m : f(x, y) = 0\}$$
then if $(a, b) \in M$ is such that $\operatorname{rank} Df(a, b) = m$, there is a
neighbourhood U of (a, b) in M which is diffeomorphic to an open ball in $\mathbb{R}^n$.
Exercise 6.2.1. Find the version of the implicit function theorem you are
used to and verify that it is equivalent to the form given.
An even more useful form is:
Theorem 6.2.2. If $f : \mathbb{R}^n \to \mathbb{R}^m$ with $n \geq m$ is a smooth map and
$$M = \{x \in \mathbb{R}^n : f(x) = 0\}$$
then if $\operatorname{rank} Df = m$ on M, M is a smooth manifold of dimension n − m.
This is somewhat stronger than the classical Implicit Function Theorem and
the idea is intuitively appealing: locally we have that f may be approximated
by an affine map the linear part of which is Df, and if $Df : \mathbb{R}^n \to \mathbb{R}^m$ is onto
then the kernel has dimension n − m. And in a neighbourhood, the graph
of the derivative is diffeomorphic to the graph of f. In other words it is the
rank-nullity theorem and the fact that the derivative is a good approximation
to the function in a sufficiently small neighbourhood.
In the case of $f(x, y) = x^2 + y^2 - 1$ it is easy to verify that the rank of Df
is never zero on the solution so must always be at least one.
Now to do the same with the orthogonal group O(2, R): we have
$$f : \mathbb{R}^4 \to \mathbb{R}^3, \quad (x, y, u, v)^T \mapsto (x^2 + y^2 - 1,\ u^2 + v^2 - 1,\ xu + yv)^T$$
and
$$Df = \begin{bmatrix} 2x & 2y & 0 & 0 \\ 0 & 0 & 2u & 2v \\ u & v & x & y \end{bmatrix}$$
and the rank of Df is 3 on M so M is a smooth manifold (since Df is
smooth) and has dimension 1.
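The rank claim can be spot-checked numerically. The following is a minimal Python sketch, assuming the points of O(2, R) are parametrised as rotations and reflections; the helper names `jacobian`, `rank`, `rotation` and `reflection` are illustrative only. Checking a few sample points is not a proof, merely a sanity check.

```python
import math

def jacobian(x, y, u, v):
    """Df for f(x, y, u, v) = (x^2 + y^2 - 1, u^2 + v^2 - 1, xu + yv)."""
    return [
        [2 * x, 2 * y, 0.0, 0.0],
        [0.0, 0.0, 2 * u, 2 * v],
        [u, v, x, y],
    ]

def rank(rows, tol=1e-9):
    """Rank of a small matrix by Gaussian elimination with partial pivoting."""
    m = [[float(entry) for entry in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    r = 0
    for c in range(n_cols):
        if r == n_rows:
            break
        pivot = max(range(r, n_rows), key=lambda i: abs(m[i][c]))
        if abs(m[pivot][c]) < tol:
            continue  # no usable pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, n_rows):
            factor = m[i][c] / m[r][c]
            for j in range(c, n_cols):
                m[i][j] -= factor * m[r][j]
        r += 1
    return r

def rotation(t):
    # (x, y) is the first column and (u, v) the second column of the matrix
    return (math.cos(t), math.sin(t), -math.sin(t), math.cos(t))

def reflection(t):
    return (math.cos(t), math.sin(t), math.sin(t), -math.cos(t))

samples = [k * 2 * math.pi / 12 for k in range(12)]
ranks = [rank(jacobian(*point(t))) for t in samples for point in (rotation, reflection)]
print(ranks)
```

Away from M the rank can drop (at the origin Df is the zero matrix), which is why the theorem asks for full rank only on M.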
Exercise 6.2.2. Do it for O(n, R). Show that the condition that a matrix
be in O(n, R) forces the determinant to be ±1, and deduce the dimension of
SO(n, R).
To get SO(2, R) we need another condition, namely xv − uy = 1. This might
lead you to suspect that SO(2, R) is a zero dimensional manifold, but the
fact is that the constraints are not independent, and we may deduce from
the first three that xv − uy = ±1. This means that O(2, R) is disconnected
and SO(2, R) is one component of it. And the argument from second year
shows that SO(2, R) is diffeomorphic to the unit circle as a manifold. So
O(2, R) is diffeomorphic to two circles. Aren’t you glad we did not stipulate
that our manifolds had to be connected?
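The two components can be seen concretely. The Python sketch below assumes the usual parametrisation of O(2, R) by an angle t (a rotation or a reflection); the function names are illustrative. It checks that both families satisfy the three constraints and that the determinant xv − uy separates them.

```python
import math

def constraints(x, y, u, v):
    """The three defining conditions of O(2, R), each of which should vanish."""
    return (x * x + y * y - 1, u * u + v * v - 1, x * u + y * v)

def det(x, y, u, v):
    # Determinant of the matrix with columns (x, y) and (u, v)
    return x * v - u * y

def rotation(t):
    return (math.cos(t), math.sin(t), -math.sin(t), math.cos(t))

def reflection(t):
    return (math.cos(t), math.sin(t), math.sin(t), -math.cos(t))

def on_o2(point, tol=1e-12):
    return all(abs(c) < tol for c in constraints(*point))

angles = [k * 2 * math.pi / 8 for k in range(8)]
rotation_dets = [det(*rotation(t)) for t in angles]
reflection_dets = [det(*reflection(t)) for t in angles]
all_on_o2 = all(on_o2(rotation(t)) and on_o2(reflection(t)) for t in angles)
```

Since the determinant is continuous and takes only the values +1 and −1 on O(2, R), no path can join a rotation to a reflection.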
Exercise 6.2.3. First show that for a manifold (not necessarily smooth or
even differentiable) connected implies path-connected. Then show that if we
have a Lie group, the connected component containing the identity is a Lie
subgroup.
Other Lie Groups can be obtained in essentially the same way as O(2, R)
by imposing conditions on linear maps or matrices: at one end we have all
GL(n, F) which is the space of all linear maps from $F^n$ to itself which are
invertible, where F is any field. Then we can restrict ourselves to the case
where F = C or F = R which is less than adventurous but still more than
enough to require some thought. We can stipulate that the determinant be
1 which will ensure that the measure is unchanged to get SL(n, R), we can
insist that some generalised inner product with signature (k, n−k) on $\mathbb{R}^n$ be
preserved to get what we call O((k, n−k), R). We can restrict to determinant
1 in addition, which requires us to put S for Special in front of the name.
And we can more or less repeat using C instead of R, except we use the term
unitary instead of orthogonal. And we can do much of it all over again using
finite fields.
You will note that SO((3, 1), R) is what we have called the Lorentz group.
This suggests an extension: we could take any of the groups regarded as
operations on $\mathbb{R}^n$ which preserve the origin and “affinise” them by also allowing
shifts. This will increase the dimension of the group by n since there are n
independent directions in which we can do the shifting. If we do this to the
Lorentz group we get the Poincaré group [2].
[2] This allows the writer of the Wikipedia article on the Lorentz group to start off by
defining the Lorentz group as a subgroup of the Poincaré group, probably the least helpful
Exercise 6.2.4. Show that the cartesian product of two Lie Groups is a Lie
Group.
This gives us enough Lie Groups to be going on with.
Exercise 6.2.5. Do the exercises 1 to 10 in chapter one of part two of the
text book.
6.3 Group Representations
6.3.1 Introduction
Recall, from second year, that an abstract group is merely a collection of
things which can be multiplied and divided to give other things in the collec-
tion. This statement is usually made more precise by giving three axioms:
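Definition 6.3.1. A group is a set G and a binary operation $\bullet : G \times G \to G$
(with the operation $\bullet(g, h)$ usually written in infix notation as $g \bullet h$) such
that
1. $\forall\, a, b, c \in G,\ (a \bullet b) \bullet c = a \bullet (b \bullet c)$
2. $\exists\, e \in G,\ \forall\, a \in G,\ a \bullet e = e \bullet a = a$
3. $\forall\, a \in G,\ \exists\, a^{-1} \in G,\ a \bullet a^{-1} = a^{-1} \bullet a = e$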
We can now give a formal definition of a Lie group:
Definition 6.3.2. A Lie group is a group G which is also a smooth manifold
such that the maps $\mathrm{inv} : G \to G,\ g \mapsto g^{-1}$ and
$\bullet : G \times G \to G$
are smooth, where $\bullet$ is the multiplication in the group.
You will have already worked this out from contemplating the examples. I
hope.
Definition 6.3.3. A Lie group homomorphism is a homomorphism between
Lie Groups which is a smooth map between the manifolds.
definition one could imagine. Perhaps he wrote the article on the Poincaré group and
defined it as an extension of the Lorentz group.
Exercise 6.3.1. Verify that all the Lie groups discussed are indeed Lie
groups.
We are also interested in abelian groups which also satisfy the condition:
4. $\forall\, a, b \in G,\ a \bullet b = b \bullet a$
Abstract groups are sometimes difficult to work with and so we often use a
representation of the group, which means that the elements of the group
become represented by matrices and the group operation by matrix multiplication.
Thus we may take the rather forlorn group $\mathbb{Z}_2$ which has only two elements,
usually written 0 and 1, we replace $\bullet$ by + since the group is abelian, and we
seek to represent 0 by the identity matrix, 1 by some other matrix, and +
by matrix multiplication. There are a lot of possible choices. For example
we can choose the $2 \times 2$ matrix pair
$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}$$
which clearly works. Such a thing is called a representation of dimension 2.
There is a rather simpler representation of dimension 1 which you should be
able to see almost instantly.
Exercise 6.3.2. Write it down!
Formally,
Definition 6.3.4. A real representation of a group G of dimension (or degree)
n is a homomorphism from G into GL(n, R).
and
Definition 6.3.5. A Lie group representation of a Lie group G is a Lie group
homomorphism from G into GL(n, R).
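For a finite group the homomorphism condition in these definitions can be checked by brute force. Here is a minimal Python sketch for the two dimensional representation of $\mathbb{Z}_2$ given above; the names `rho`, `matmul` and `is_homomorphism` are illustrative, and the group is modelled as the integers 0 and 1 under addition mod 2.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

IDENTITY = [[1, 0], [0, 1]]
MINUS_IDENTITY = [[-1, 0], [0, -1]]

# The assignment 0 -> identity, 1 -> minus the identity
rho = {0: IDENTITY, 1: MINUS_IDENTITY}

def is_homomorphism(rep, modulus=2):
    """Check rep(a + b) = rep(a) rep(b) for every pair in Z_modulus."""
    return all(
        rep[(a + b) % modulus] == matmul(rep[a], rep[b])
        for a in range(modulus)
        for b in range(modulus)
    )

# An assignment that is NOT a representation: the square of the second
# matrix is the zero matrix rather than the identity.
not_a_rep = {0: IDENTITY, 1: [[0, 1], [0, 0]]}

print(is_homomorphism(rho), is_homomorphism(not_a_rep))
```

The same check works for any candidate assignment of matrices to the elements of $\mathbb{Z}_n$ by changing `modulus`.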
The theory of complex representations, where we go into GL(n, C), is much
simpler, and we shall find that there is a strong preference for complex repre-
sentations in the books. There is also some interest to physicists in Quater-
nionic representations where the quaternions, H, are the ‘step beyond C’.
Just as C is a two dimensional field, H is a four dimensional ‘field’, actually
not a field but a skew-field since the multiplication does not commute.
Exercise 6.3.3. Define H as the set of quartets a + bi + cj + dk where
i, j, k are meaningless symbols satisfying the rules $i^2 = j^2 = k^2 = -1$ and
ij = k, jk = i, ki = j. Assuming everything distributes in a sensible way,
show the result is a skew field. Go back to the M213 notes to see this done
for C if hopelessly lost.
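Before attempting the proof it can help to experiment. The Python sketch below assumes a quaternion a + bi + cj + dk with real coefficients is stored as the tuple (a, b, c, d); the product formula is what the rules above give when everything distributes. The names `qmul`, `qconj` and `qinv` are illustrative.

```python
def qmul(p, q):
    """Product of quaternions stored as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def qconj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

def qinv(q):
    """Inverse of a non-zero quaternion: the conjugate divided by the squared norm."""
    norm_squared = sum(t * t for t in q)
    return tuple(t / norm_squared for t in qconj(q))

ONE = (1, 0, 0, 0)
I = (0, 1, 0, 0)
J = (0, 0, 1, 0)
K = (0, 0, 0, 1)

print(qmul(I, J), qmul(J, I))  # the two products differ by a sign
```

The existence of `qinv` for every non-zero element, together with the failure of commutativity, is exactly what the phrase skew field records.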
Exercise 6.3.4. Just as there are ‘orthogonal’ groups over C called the unitary
groups, there are analogues over H called the symplectic groups. Construct
one as a group of quaternionic matrices. Construct the one dimensional
complex group U(1, C) as a group of real $2 \times 2$ matrices and the corresponding
symplectic group as a group of real $4 \times 4$ matrices.
Remark 6.3.1. The use of H comes from Hamilton who invented them. It
is said that the relations defining H are carved in a bridge in Ireland. If
you invent something great, you may be allowed to deface bridges too, but
these days it would probably be an offence and you would face a severe fine if
caught.
Remark 6.3.2. You might have felt that it makes more sense to insist that
the homomorphisms are 1-1, but this complicates the theory enough to make
it a bad idea. If it is 1-1, we say the representation is faithful.
Remark 6.3.3. We could call any homomorphism from G into Aut(V) for any
vector space V a representation over V, and this has its advantages. Most
representations are matrix representations in practice.
The above example suggests that there could be rather a lot of representa-
tions of a group, and that we can build some of them up from other, simpler,
representations. Such is indeed the case, and the theory of representations
deals with precisely this issue. It is a quite satisfying kind of theory for alge-
braists and they often give courses on it, usually for finite groups, occasionally
for compact Lie groups, rather rarely in complete generality.
The representation of $\mathbb{Z}_2$ of dimension 2 given above sends the positive x-axis
to the negative x-axis and vice-versa for the non-identity element, and leaves
the x-axis fixed for the identity: we say the x-axis is invariant under the
group action. It is clearly a subspace of the vector space $\mathbb{R}^2$ and gives rise to
the sub-representation which you will surely have discovered when looking
for a one dimensional representation of $\mathbb{Z}_2$. The y-axis is also an invariant
subspace, and $\mathbb{R}^2$ is the direct sum of these two subspaces. This is revision
of second year material and I hope you recall it.
Had one of the minus signs in the second matrix been removed, note that
again there are two invariant subspaces of which $\mathbb{R}^2$ is a direct sum, and that
this gives a new representation of $\mathbb{Z}_2$. Both parts give sub-representations of
$\mathbb{Z}_2$, but only one is faithful. In fact one is distinctly trivial.
Exercise 6.3.5. Find some real representations of $\mathbb{Z}_2 \times \mathbb{Z}_2$. Is there a
faithful one dimensional real representation? Is there a faithful one dimensional
complex representation? A faithful two dimensional real or complex
representation?
The fact that the given two dimensional representation of $\mathbb{Z}_2$ can be split
into two sub-representations of lower dimension means that it is really not
worth a deal of thought, because we can obviously recover it from the lower
dimensional representations by direct summing them. In fact all the real
representations of $\mathbb{Z}_2$ can be obtained by minor variations of this process.
Exercise 6.3.6. Prove the last claim.
Exercise 6.3.7. Find some complex representations of $\mathbb{Z}_3$. Of $\mathbb{Z}_n$. Can you
find any two-dimensional representations which do not have (complex) one
dimensional subrepresentations?
6.3.2 Irreducible Representations
Definition 6.3.6. A representation m : G → GL(n, R) is irreducible iff there
are no proper non-zero subspaces of the space on which the matrices act which
are invariant under the group action.
Remark 6.3.4. It might be more natural to define reducible representations
but they are too boring.
The three things that make this interesting to physicists are
1. The representations of compact groups are all direct sums of irreducible
representations
2. Most gauge groups are compact
3. The irreducible representations of the gauge groups correspond to the
fundamental particles, for example, electrons
This allows us to compute properties of the fundamental particles by looking
at group representations. This is surely quite astonishing and wonderful. I
have said some rude things about physicists, but if they can do this then
they have more than redeemed themselves. They are good blokes. Or good
sheilas, as the case may be. Or at least, some of them certainly are.
I have not given a formal definition of the direct sum of two representations.
Exercise 6.3.8. Construct a suitable definition.
Definition 6.3.7. Two representations f, g : G → GL(n, R) are equivalent
iff there is an isomorphism $\alpha : \mathbb{R}^n \to \mathbb{R}^n$ such that
$$\forall\, a \in G,\quad f(a) \circ \alpha = \alpha \circ g(a)$$
Exercise 6.3.9. Draw the obvious commutative diagram.
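A small concrete instance of Definition 6.3.7 may make the condition less abstract. In the Python sketch below, the two representations of $\mathbb{Z}_2$ and the matrix `alpha` are chosen purely for illustration: `f` sends the non-identity element to the matrix swapping the two coordinates, `g` sends it to the diagonal matrix with entries 1 and −1, and `alpha` is a change of basis to the eigenvectors of the swap.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

I2 = [[1, 0], [0, 1]]

# Two representations of Z_2 = {0, 1} on R^2 (illustrative choices)
f = {0: I2, 1: [[0, 1], [1, 0]]}    # swaps the coordinates
g = {0: I2, 1: [[1, 0], [0, -1]]}   # flips the second coordinate

# Columns of alpha are (1, 1) and (1, -1); its determinant is -2, so it is invertible
alpha = [[1, 1], [1, -1]]

def intertwines(alpha, f, g):
    """Check f(a) alpha = alpha g(a) for every group element a."""
    return all(matmul(f[a], alpha) == matmul(alpha, g[a]) for a in f)

print(intertwines(alpha, f, g))
```

So the swap representation and the diagonal one are equivalent, although they look different as matrices.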
Observe that we could have generalised this by defining Aut(V ) as the set
of invertible linear maps from V to itself where V is any real vector space,
and then defining a representation of a group G over V as a homomorphism
m : G →Aut(V ). Then if α : U → V is an isomorphism of vector spaces, we
can talk of representations of a group G over U and V as being equivalent
provided the appropriate diagram commutes.
Exercise 6.3.10. Draw the new diagram.
6.3.3 Tensor Representations
I shall present this as a sequence of easy exercises.
Exercise 6.3.11. Write out a non-trivial representation, φ, of $\mathbb{Z}_2$ as $2 \times 2$
real matrices.
Exercise 6.3.12. Write out a non-trivial representation, ψ, of $\mathbb{Z}_2$ as $3 \times 3$
real matrices.
Exercise 6.3.13. Using the discussion on Darling’s expressions for the tensor
product, find an isomorphism between $\mathbb{R}^2 \otimes \mathbb{R}^3$ and $\mathbb{R}^6$.
Exercise 6.3.14. Find the obvious tensor representation φ ⊗ ψ in terms of
$6 \times 6$ real matrices and the above isomorphism.
Exercise 6.3.15. Show it really is a representation.
Exercise 6.3.16. Repeat for the group SO(2, R).
Exercise 6.3.17. Define the tensor product of two representations, one over
a vector space U and the other over a vector space V .
Exercise 6.3.18. Write down a really, really obvious map from $\mathbb{R}^2 \times \mathbb{R}^3$ to
$\mathbb{R}^2 \otimes \mathbb{R}^3$.
Exercise 6.3.19. Show that any bilinear map $f : \mathbb{R}^2 \times \mathbb{R}^3 \to \mathbb{R}$ factors into
your really, really obvious map and a linear map from $\mathbb{R}^2 \otimes \mathbb{R}^3$ to $\mathbb{R}$.
Exercise 6.3.20. Show this generalises to bilinear maps from $U \times V$ to $\mathbb{R}$.
Exercise 6.3.21. List all the one dimensional complex representations of
$\mathbb{Z}_4$.
Exercise 6.3.22. Hence or otherwise, list all the one dimensional complex
representations of SO(2, R). (Which it may be convenient to identify with
U(1, C))
Exercise 6.3.23. Explain why the above representations are irreducible when
they are.
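For readers who want to check their answers to the earlier exercises in this sequence, here is a hedged Python sketch. It assumes the isomorphism between $\mathbb{R}^2 \otimes \mathbb{R}^3$ and $\mathbb{R}^6$ is the one that turns the tensor product of matrices into the Kronecker product, and it uses one arbitrary choice of φ and ψ (both swap two coordinates); other choices are equally valid. The names `kron`, `phi`, `psi` and `tensor` are illustrative.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def kron(a, b):
    """Kronecker product: rows indexed by pairs (i, k), columns by pairs (j, l)."""
    return [
        [a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
        for i in range(len(a))
        for k in range(len(b))
    ]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

# One possible non-trivial choice of phi (2 x 2) and psi (3 x 3) for Z_2
phi = {0: identity(2), 1: [[0, 1], [1, 0]]}
psi = {0: identity(3), 1: [[0, 1, 0], [1, 0, 0], [0, 0, 1]]}

# The candidate tensor representation as 6 x 6 matrices
tensor = {a: kron(phi[a], psi[a]) for a in (0, 1)}

def is_homomorphism(rep, modulus=2):
    return all(
        rep[(a + b) % modulus] == matmul(rep[a], rep[b])
        for a in range(modulus)
        for b in range(modulus)
    )

print(is_homomorphism(tensor))
```

The check succeeds because the Kronecker product satisfies (A ⊗ B)(C ⊗ D) = AC ⊗ BD, which is the fact needed to show that the tensor product of two representations really is a representation.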
6.3.4 Schur’s Lemma
I got this from Frank Adams’ Lectures on Lie Groups which I recommend
only to the bravest. It is a beautiful book but very, very dense.
Definition 6.3.8. A CG-space V is a complex finite dimensional vector space
and a homomorphism φ from the Lie group G into Aut(V ); that is, a repre-
sentation of G over V .
Definition 6.3.9. A map between CG spaces U and V is a C linear map f :
U → V which commutes with the homomorphisms, that is, if φ : G →Aut(U)
and ψ : G → Aut(V ) are the representations, for all g ∈ G, for all u ∈ U,
f(φ(g)(u)) = ψ(g)(f(u))
Proposition 6.3.1. If φ and ψ are irreducible, any CG map is either zero
or an isomorphism.
Proof: Ker(f) and Im(f) are clearly invariant subspaces of the representa-
tions and are hence either zero or the whole space.
Remark 6.3.5. It is clear that this works over arbitrary fields, not just C.
The actual Lemma needs the complex numbers:
Proposition 6.3.2 (Schur’s Lemma). If f : V → V is a CG map from an
irreducible representation φ of a Lie group G to itself, then $f = \lambda I_V$ for some
$\lambda \in \mathbb{C}$.
Proof:
V is isomorphic to $\mathbb{C}^n$ for some $n \in \mathbb{Z}^+$ so we work there. Then there are
n complex eigenvalues for f by the Fundamental Theorem of Algebra, not
necessarily different. So there is at least one $\lambda \in \mathbb{C}$ such that $\det(f - \lambda I_V)$ is
zero. Now $f - \lambda I_V$ is again a CG map, so by the previous proposition we must
have $f = \lambda I_V$, since $f - \lambda I_V$ cannot be an isomorphism for this value of λ and
is hence zero.
Corollary 6.3.2.1. All the irreducible complex representations of an abelian
group have dimension one.
Proof: If G is abelian and ρ : G → Aut(V) is an irreducible representation then
for every g ∈ G, ρ(g) is an automorphism of V which commutes with every ρ(h),
because G is abelian, and so is a CG map from ρ to ρ. It follows from Schur’s
Lemma that ρ(g) is $\lambda I_V$ for some complex number λ and hence that every
subspace of V is invariant under ρ(g). Since ρ is irreducible it follows that
V has dimension one, where the only subspaces are the space itself and the
zero element.
Remark 6.3.6. If you have a trace of mathematical taste you will allow that
the last three results are very cool.
It follows that the complex irreducible representations of U(1, C) are all
equivalent to one of the form
$$\rho_n(1, \theta) : (r, \phi) \mapsto (r, \phi + n\theta), \quad n \in \mathbb{Z}$$
Exercise 6.3.24. Show this carefully.
Exercise 6.3.25. Show that tensor multiplication in $\mathbb{R}$ is just multiplication,
likewise in $\mathbb{C}$, and hence that the tensor product of the above irreducible
representations just makes $\rho_n \otimes \rho_m = \rho_{n+m}$.
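In complex notation the representation $\rho_n$ sends the element $e^{i\theta}$ of U(1, C) to multiplication by $e^{in\theta}$. The Python sketch below, which uses only the standard `cmath` module and illustrative names, spot-checks numerically the three properties in play: the homomorphism property, the product rule for tensor products of one dimensional representations, and the description in polar coordinates.

```python
import cmath

def rho(n, theta):
    """rho_n at the group element e^{i theta}: the scalar e^{i n theta}."""
    return cmath.exp(1j * n * theta)

def close(z, w, tol=1e-12):
    return abs(z - w) < tol

theta1, theta2 = 0.3, 1.1
n, m = 2, -5

# Homomorphism: rho_n(theta1 + theta2) = rho_n(theta1) rho_n(theta2)
hom_ok = close(rho(n, theta1 + theta2), rho(n, theta1) * rho(n, theta2))

# For one dimensional representations the tensor product is the ordinary
# product of scalars, so rho_n (x) rho_m agrees with rho_{n+m}
tensor_ok = close(rho(n, theta1) * rho(m, theta1), rho(n + m, theta1))

# Polar form: the point (r, phi) is sent to (r, phi + n * theta)
r, phi = 2.0, 0.4
image = rho(n, theta1) * cmath.rect(r, phi)
polar_ok = close(image, cmath.rect(r, phi + n * theta1))

print(hom_ok, tensor_ok, polar_ok)
```

The restriction to integer n is what makes $\rho_n$ well defined on the circle: θ and θ + 2π must give the same scalar.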
6.3.5 Representations of SU(2, C)
The text book indicates, without any very compelling arguments, that the
irreducible representations of U(1, C) have important physical significance.
Since the definition of U(1, C) means that it has to preserve lengths, it must
be the subset of C which contains only complex numbers of modulus 1, that
is, it is the unit circle. And a very fine group it is too, being isomorphic as
a Lie group to SO(2, R).
Exercise 6.3.26. Prove the last claim.
The representations of U(1, C) being so simple, it is natural to investigate
the representations of U(2, C) and SU(2, C). I shall refer to these as U(2)
and SU(2) from now on since the C may reasonably be taken for granted.
Again we need look only at the irreducible representations and again we are
motivated by the hope of some important physical applications of these ideas.
The first observation worth noting is that U(2) and SU(2) are not abelian
groups, so we expect complications.
First it is essential to get some sort of feeling for the groups. SU(2) is the
subgroup of U(2) having determinant one, and U(2) will consist of the $2 \times 2$
matrices with complex entries which preserve the complex inner product on
$\mathbb{C}^2$, that is the rule
$$\begin{bmatrix} a \\ b \end{bmatrix} \bullet \begin{bmatrix} u \\ v \end{bmatrix} = a\bar{u} + b\bar{v}$$
where $\bar{v}$ is the complex conjugate of v. The maps will have to take the
standard basis for $\mathbb{C}^2$ to vectors which have length 1 and which are orthogonal
with respect to the complex inner product, and so the columns of the matrices
representing these linear maps must also be orthogonal and have length 1,
which implies that the inverse of such a matrix is its conjugate transpose.
We use $A^*$ to denote the conjugate transpose of A, although many physicists
use $A^\dagger$.
We note that
$$\begin{bmatrix} e^{i\theta} & 0 \\ 0 & e^{i\phi} \end{bmatrix}$$
is such a matrix for any θ, φ and so is
$$\begin{bmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{bmatrix}$$
for any α, since real orthogonal matrices are necessarily unitary. Also the
product of two matrices which have inverses equal to their conjugate
transpose has its inverse equal to its conjugate transpose.
Exercise 6.3.27. Prove this.
This would lead one to conjecture that the manifold U(2) has (real) dimension
at least three. That it is a (real) manifold follows from the usual arguments
involving the Implicit Function theorem. Note that it makes sense to have
complex manifolds with smooth maps between charts in $\mathbb{C}^n$, but we shall not
be dealing with such things.
Exercise 6.3.28. Find the dimension of U(n) from the Implicit Function
theorem. Show that an element of U(n) must have determinant a complex
number of modulus 1, and hence deduce the dimension of SU(n). (The answer
to the last part is $n^2 - 1$; make sure you get it right!)
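An insight into the geometry of SU(2) is obtained from the Pauli matrices.
Recall that a matrix is hermitean if it is equal to its conjugate transpose.
The Pauli matrices are
$$\sigma_0 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad \sigma_1 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad \sigma_2 = \begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix}, \quad \sigma_3 = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$$
It is easy to see that these are linearly independent over C and hence form
a basis for the four (complex) dimensional space of all $2 \times 2$ complex matrices,
in which GL(2, C) sits. If we take only real coefficients then we get the
hermitean $2 \times 2$ matrices.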
Exercise 6.3.29. Confirm this claim. Confirm that all hermitean matrices
are obtained in this way.
You will observe that the Pauli matrices are certainly hermitean themselves
but are also unitary.
Multiply each of $\sigma_j$ for $j \in [1 : 3]$ by −i and call these, following Baez and
Muniain, I, J, K to get:
$$\sigma_0 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad I = \begin{bmatrix} 0 & -i \\ -i & 0 \end{bmatrix}, \quad J = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}, \quad K = \begin{bmatrix} -i & 0 \\ 0 & i \end{bmatrix}$$
Note that
1. These matrices also span the space of all $2 \times 2$ complex matrices with
complex coefficients
2. Each has determinant one
3. Each is unitary
Now it is easy to verify that taking all possible real linear combinations of
these matrices gives us a representation of the Quaternions, H.
Exercise 6.3.30. Do it.
It is also easy to verify that whenever $a^2 + b^2 + c^2 + d^2 = 1$, for reals a, b, c, d,
$$a\sigma_0 + bI + cJ + dK$$
is unitary,
Exercise 6.3.31. Do it.
has determinant one
Exercise 6.3.32. Do it.
and only slightly harder to confirm that every unitary $2 \times 2$ matrix with
determinant one is of this form.
Exercise 6.3.33. Do it.
This has shown that SU(2) is the three sphere $S^3$ equipped with a
multiplication which does not commute.
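None of this replaces doing the exercises, but a numerical spot check is cheap. The Python sketch below takes the matrices $\sigma_0$, I, J, K as given above, with helper names (`matmul`, `dagger`, `combo`, `det`, `close`) that are illustrative only. It checks the quaternion relations and tests one point of the three sphere for unitarity and determinant one.

```python
SIGMA0 = [[1, 0], [0, 1]]
I = [[0, -1j], [-1j, 0]]
J = [[0, -1], [1, 0]]
K = [[-1j, 0], [0, 1j]]

def matmul(a, b):
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))] for i in range(len(a[0]))]

def scale(s, a):
    return [[s * entry for entry in row] for row in a]

def combo(a, b, c, d):
    """The matrix a*sigma_0 + b*I + c*J + d*K."""
    return [
        [a * SIGMA0[i][j] + b * I[i][j] + c * J[i][j] + d * K[i][j] for j in range(2)]
        for i in range(2)
    ]

def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

# Quaternion relations: I^2 = J^2 = K^2 = -1, IJ = K, JK = I, KI = J
relations_ok = (
    all(close(matmul(m, m), scale(-1, SIGMA0)) for m in (I, J, K))
    and close(matmul(I, J), K)
    and close(matmul(J, K), I)
    and close(matmul(K, I), J)
)

# A sample point with a^2 + b^2 + c^2 + d^2 = 1
U = combo(0.5, -0.5, 0.5, 0.5)
unitary_ok = close(matmul(U, dagger(U)), SIGMA0)
det_ok = abs(det(U) - 1) < 1e-12

print(relations_ok, unitary_ok, det_ok)
```

Expanding `combo` by hand gives a matrix with rows (a − di, −c − bi) and (c − bi, a + di), whose determinant is $a^2 + b^2 + c^2 + d^2$; that is the calculation the exercises ask for.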
Remark 6.3.7. There is quite a lot of useful structure lying about here which
has been used by engineers and physicists for many a long year. Mathemati-
cians tend to see themselves as discovering structure and pointing it out to
physicists and engineers who eventually come to find it useful in talking about
something in reality, and then imagine that they discovered the structure ex-
perimentally. Physicists and engineers have a different story.
6.3.6 Representations of SU(2)
This is reasonably well described in the text book: the representations are
over vector spaces of homogeneous polynomials. The zero degree polynomials
are simply the complex numbers, and the space $H_j$, for j a non-negative
integer or half-integer, is the space of homogeneous polynomials of degree 2j
in two variables. Thus we have for j = 0 the constant functions from $\mathbb{C}^2$ to
$\mathbb{C}$ and for j = 1/2 the functions
$$f_{a,b} : \mathbb{C}^2 \to \mathbb{C}, \quad \begin{bmatrix} x \\ y \end{bmatrix} \mapsto ax + by$$
for a, b, x, y ∈ C. Then $H_j$ is a vector space over C of (complex) dimension
2j + 1. $U_j : SU(2) \to \mathrm{Aut}(H_j)$ is the representation which takes any
g ∈ SU(2) to the automorphism carrying the polynomial p to the polynomial q
defined by
$$q\begin{bmatrix} x \\ y \end{bmatrix} = p\left(g^{-1}\begin{bmatrix} x \\ y \end{bmatrix}\right)$$
Exercise 6.3.34. Confirm this gives a representation of SU(2).
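The key point in the exercise is that the inverse in the definition makes the action compose in the right order. A hedged Python sketch follows; it models a polynomial simply as a Python function of (x, y), builds elements of SU(2) from four real numbers by normalising to the three sphere, and compares acting by g after h with acting by the product gh at a sample point. All names (`su2`, `act`, `apply`, and so on) are illustrative.

```python
import math

def su2(a, b, c, d):
    """Element of SU(2) from a non-zero real 4-vector, normalised to unit length."""
    n = math.sqrt(a * a + b * b + c * c + d * d)
    a, b, c, d = a / n, b / n, c / n, d / n
    return [[complex(a, -d), complex(-c, -b)], [complex(c, -b), complex(a, d)]]

def matmul(g, h):
    return [
        [sum(g[i][k] * h[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]

def inverse(g):
    """For a unitary matrix the inverse is the conjugate transpose."""
    return [[g[j][i].conjugate() for j in range(2)] for i in range(2)]

def apply(g, v):
    x, y = v
    return (g[0][0] * x + g[0][1] * y, g[1][0] * x + g[1][1] * y)

def act(g, p):
    """The polynomial q with q(x, y) = p(g^{-1} (x, y))."""
    g_inv = inverse(g)
    return lambda x, y: p(*apply(g_inv, (x, y)))

# A homogeneous polynomial of degree 2 (so j = 1), chosen arbitrarily
p = lambda x, y: x * x + 3 * x * y

g = su2(1, 2, 3, 4)
h = su2(0.5, -1, 0, 2)
v = (0.3 + 0.2j, -1.1 + 0.7j)

lhs = act(g, act(h, p))(*v)      # act by h, then by g
rhs = act(matmul(g, h), p)(*v)   # act by the product gh
print(abs(lhs - rhs))
```

Without the inverse the two sides would match only with the order of g and h reversed, which is the difference between a left and a right action.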
These are in fact all the irreducible representations of SU(2), something which
is not proved in the text book and I shan’t prove it either. You may if you
wish.
Remark 6.3.8. This concludes everything we have to say about representa-
tions, where ‘we’ means Baez, Muniain and me, but it is far from completing
the business. There is a lot of important and relevant material still uncovered.
Well, that’s life.
Exercise 6.3.35. Read the discussion in the text book and fill in any gaps.
Remark 6.3.9. I am skipping the material which claims that SU(2) is a
double covering of SO(3); quite a lot can be said about this and it explains
the interest physicists have in SU(2).
Exercise 6.3.36. A lot of deep issues arise which physicists tend to gloss
over. This is an invitation to think about them.
First we have the mystery that the irreducible representations of the group
U(1) or SO(2) have something to do with the fact that charge is conserved.
Then we have that the irreducible representations of SU(2) tell us something
about spin, and about fundamental particles. This invites two separate
questions, the first is what exactly do the groups have to do with it? Groups such
as the Lorentz and Poincaré groups arise naturally enough from our desire
to have the physics independent of the detailed choice of language, Einstein’s
principle of general covariance. What is the explanation for U(1) having
everything to do with the Maxwell Equations and Electromagnetism?
Second, why irreducible representations? Why representations at all? One
can see that they might be convenient in doing calculations, but it looks as
though the use made of them goes beyond simple convenience. What exactly
is the relation between the physics and our description, and why are repre-
sentations central to it?
There is a sketch of an answer to these questions in the part I have skipped:
it involves Quantum Mechanics and the standard Hilbert space representation
of quantum states.
You are invited to write a short essay addressing these questions.
You are also invited to consider the extent to which the Hilbert Space repre-
sentation is essential to Quantum Mechanics, and to ponder whether a wholly
abstract description of what is needed for a mathematical model of QM would
necessitate Unitary representations.
6.4 Lie Algebras
Definition 6.4.1. The Lie Algebra of a Lie group G is the tangent space
at the identity. It is called $\mathfrak{g}$. This makes it a vector space of the same
dimension as G.
Remark 6.4.1. The multiplication comes later.
Remark 6.4.2. Elements of $\mathfrak{g}$ used to be (and still are by some people)
called the infinitesimal elements of G. You can see why. In particular the
infinitesimal rotations are obtainable from the rotations in SO(3) by taking
curves through the identity in SO(3) and differentiating them. The text book
gives some natural examples: we take the matrix function
$$\begin{bmatrix} \cos t & -\sin t & 0 \\ \sin t & \cos t & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
which represents a curve of rotations about the z-axis and differentiate it at
t = 0 to get
$$J_z = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$
and $J_x$, $J_y$ can be obtained in the same way.
Exercise 6.4.1. Do it.
These three matrices are linearly independent and span the algebra $\mathfrak{so}(3)$.
The multiplication is the Lie Bracket in this case,
$$[X, Y] = XY - YX$$
Exercise 6.4.2. Verify that the Lie bracket is in the vector space $\mathfrak{so}(3)$ when
X and Y are.
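A quick experiment can guide the verification. The Python sketch below assumes $J_x$ and $J_y$ are obtained from the curves of rotations about the x-axis and y-axis in the same way as $J_z$ (with the usual right-handed sign conventions, which are an assumption here), so that $\mathfrak{so}(3)$ consists of the skew-symmetric $3 \times 3$ matrices. The helper names are illustrative.

```python
JX = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
JY = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
JZ = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

def matmul(a, b):
    return [
        [sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]

def bracket(x, y):
    """Lie bracket [X, Y] = XY - YX."""
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(3)] for i in range(3)]

def is_skew(a):
    """True when the transpose of a equals minus a."""
    return all(a[i][j] == -a[j][i] for i in range(3) for j in range(3))

def combo(a, b, c):
    """The element a*JX + b*JY + c*JZ of so(3)."""
    return [[a * JX[i][j] + b * JY[i][j] + c * JZ[i][j] for j in range(3)] for i in range(3)]

# Brackets of the basis elements
print(bracket(JX, JY) == JZ, bracket(JY, JZ) == JX, bracket(JZ, JX) == JY)

# The bracket of two arbitrary skew-symmetric matrices is again skew-symmetric
X, Y = combo(1, -2, 3), combo(4, 0, -5)
print(is_skew(bracket(X, Y)))
```

The general argument is one line: if X and Y are skew-symmetric then the transpose of XY − YX is YX − XY.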
We can recover the original matrix functions by exponentiation:
Exercise 6.4.3. Show that $\exp(tJ_z)$ is what it ought to be.
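The expected answer can be checked numerically before proving it. The Python sketch below sums the first few terms of the exponential power series for $tJ_z$ and compares the result with the rotation about the z-axis through angle t; the truncation at 40 terms and the names `expm` and `rotation_z` are choices made for this illustration.

```python
import math

JZ = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

def matmul(a, b):
    return [
        [sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]

def expm(a, terms=40):
    """Truncated power series: identity + a + a^2/2! + ... (terms summands)."""
    result = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    term = [row[:] for row in result]  # current summand a^k / k!
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(3)] for i in range(3)]
    return result

def rotation_z(t):
    return [
        [math.cos(t), -math.sin(t), 0.0],
        [math.sin(t), math.cos(t), 0.0],
        [0.0, 0.0, 1.0],
    ]

t = 0.7
exponential = expm([[t * entry for entry in row] for row in JZ])
target = rotation_z(t)
max_error = max(abs(exponential[i][j] - target[i][j]) for i in range(3) for j in range(3))
print(max_error)
```

The proof itself comes from noticing that the powers of $J_z$ cycle with period four, so the series splits into the series for cos t and sin t.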
Exercise 6.4.4. Show that the Lie algebra of SO(2) is just R.
Exercise 6.4.5. Do exercises 33 to 54 in the text book.
Remark 6.4.3. Lie algebras are, as the book tells us, nicer in many ways to
work with than Lie groups because they are vector spaces. They give a lot of
information about the groups and their representations.
Chapter 7
Fibre Bundles
7.1 Introduction
A standard source on Fibre Bundles is Dale Husemoller’s Fibre Bundles.
There are probably more modern books, and there are certainly better writ-
ten books, but I own a copy so will stick to following it. I shall do very little
on this subject (there is quite a lot to be done) because I want to focus on
differential geometry, the subject of these notes, but there are close connec-
tions, as is shown by the physics. Anyway, to get very far in Fibre Bundles
you would need more homotopy theory than you have. So this will be a short
chapter.
First some examples:
1. The product S
1
R with projection to S
1
has base space S
1
, fibre
(space) R and total space S
1
R. It is easy to see why we call the fibre
a fibre (it is long and thin) and the fibres are glued together by the
topology of the base space.
2. The M¨obius bundle which has the same base space and fibre as the last
example, but has a twist in it so as to make a m¨obius strip (without
a boundary). Again there is a projection from the total space to the
base space, and the inverse image of any point is a copy of R.
3. Any product of two spaces. For example a 2-torus has base space S
1
and also fibre S
1
.
4. Any tangent bundle. This attaches to every point of a smooth manifold
a vector space, the tangent space at the point, and the resulting object
185
186 CHAPTER 7. FIBRE BUNDLES
is a vector bundle, which is defined as a fibre bundle which has a vector
space for the fibre, an important subclass of fibre bundles.
5. Tensor bundles. Again, these are all vector bundles.
6. SO(n,R), n ≥ 2, is a fibre bundle with base space $S^{n-1}$. The map takes
an element of SO(n,R) and sends it to wherever the north pole of the
sphere $S^{n-1}$ is taken by applying the element to the sphere. The inverse
image of this point in SO(n,R) is a subset which is an embedded copy
of SO(n−1,R), the fibre. When n = 2, SO(n−1,R) is a single point,
the identity map from R to itself, so SO(2,R) is just a copy of $S^1$
topologically.
7. The sphere $S^n$ is a fibre bundle over $\mathbb{RP}^n$: the projection sends antipodal points
to the same point and hence has fibre $\mathbb{Z}_2$. It might be better to describe
the fibre as $S^0$, the pair ±1 under multiplication, or O(1,R). More
interesting bundles can be obtained by replacing R with C.
8. Take the sphere $S^2$ and at each point take the space of ordered pairs of
orthonormal tangent vectors. This gives an orthogonal 2-frame bundle
over $S^2$. In general, if M is a smooth manifold, for k an integer less than
or equal to the dimension of the manifold, take the space of (ordered)
k orthonormal tangent vectors at each point. An orthogonal 1-frame
bundle on $S^2$ would consist of attaching a unit circle to each point
of the space, the circle being in the tangent space at the point. An
orthogonal 2-frame bundle on $S^2$ would attach 2 circles at each point
(Explain why). Clearly this supposes a Riemannian Inner Product.
More generally, it makes sense to attach at each point of a smooth
n-manifold, an ordered set of k linearly independent vectors of the
tangent space at that point, for k ≤ n. These bundles are called frame
bundles. A section of the two-frame bundle on $S^2$ would give a rather
special pair of vector fields being everywhere linearly independent, and
we know there is not even one such vector field. So it is not at all
obvious whether a given manifold admits a field of k-frames in general.
Note that there is a rather natural group action on frame bundles,
O(n,R) on the orthogonal frame bundles, and GL(n,R) on the bundles
where we do not suppose a Riemannian structure. The group acts on
the total space but sends fibres to fibres by what is a multiplication
of the (Lie) group and hence a diffeomorphism. Bundles with a group
action of this sort are called principal bundles. I shall elaborate on
these later.
Exercise 7.1.1. Show that attaching an ordered set of n orthonormal vectors
to each point of a space is equivalent to attaching an element of the orthogonal
group, and that the n-frame bundle effectively attaches GL(n,R) to each point
of the n-manifold, this being the fibre. Thus a useful way of thinking of a
bundle with base space a manifold is to regard the manifold as having a copy
of the fibre attached at each point of the manifold.
Exercise 7.1.2. Show that the 2-torus admits a field of 2-frames. Does $S^3$?
The above should convince you that (a) some spaces have a structure which
makes them something like a generalised cartesian product and (b) it is worth
knowing more about them because there are interesting examples.
Definition 7.1.1. A fibre bundle is a triple of spaces E, B, F and a map
$\pi : E \to B$ such that for every $b \in B$, $\pi^{-1}(b)$ is homeomorphic to F.
Definition 7.1.2. A fibre bundle is locally trivial iff there is a cover of B by
open sets $U_j$ and for each of them $\pi^{-1}(U_j)$ is homeomorphic to $U_j \times F$.
Remark 7.1.1. All our fibre bundles will be locally trivial.
Exercise 7.1.3. Give an example of a fibre bundle which is not locally
trivial.
Definition 7.1.3. A bundle map between fibre bundles $(E, B, F, \pi)$ and
$(E', B', F', \pi')$ is a pair of maps $f_B : B \to B'$ and $f_E : E \to E'$ such that
$\pi' \circ f_E = f_B \circ \pi$.
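Remark 7.1.2. It follows that fibres wind up inside fibres under $f_E$. It is
helpful to draw a picture of a square:
$$\begin{array}{ccc}
E & \xrightarrow{\;f_E\;} & E' \\
\downarrow{\scriptstyle \pi} & & \downarrow{\scriptstyle \pi'} \\
B & \xrightarrow{\;f_B\;} & B'
\end{array}$$
We say that the square commutes with the condition $\pi' \circ f_E = f_B \circ \pi$. If π
is onto then $f_E$ determines $f_B$. From the definition, it has to be.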
Exercise 7.1.4. Define the terms product of fibre bundles, subbundle, quo-
tient bundle. Give examples of each.
Exercise 7.1.5. Find out what a fibre product is and give an example.
7.2 Principal Bundles
In many of the above examples, the fibre had some extra structure besides
being a topological space: often it was a vector space, giving a vector bundle,
and sometimes it was a group. A group acts on itself by multiplication, and
so we can more generally consider the case when the fibre has a group action
on it. We care most about the case where the group action on the fibre is that
of a Lie group, and the action is regular, which means it is both transitive,
that is for any two points of the space there is a group element which acts to
take one to the other, and also free, that is only the identity leaves any point
fixed; this is equivalent to saying that for any two x, y in the fibre F, there
exists precisely one g in G such that $g \cdot x = y$. In this case, F is known as
a principal homogeneous space for G or as a G-torsor. This definition holds
whether F is actually the fibre of a bundle or not.
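The "precisely one g" condition can be tested by brute force when everything is finite. The sketch below uses the cyclic group $\mathbb{Z}_6$ as a finite stand-in for a Lie group; that substitution, and all the function names, are assumptions made for illustration and are not taken from the text. It counts, for each ordered pair (x, y), the group elements carrying x to y.

```python
def is_regular_action(group, points, act):
    """True if for every ordered pair (x, y) exactly one g has act(g, x) == y."""
    return all(
        sum(1 for g in group if act(g, x) == y) == 1
        for x in points
        for y in points
    )

N = 6  # work in the cyclic group Z_6, written additively

def translate(g, x):
    """Z_6 acting on itself by addition: free and transitive."""
    return (g + x) % N

def double_translate(g, x):
    """g acts as addition of 2g: still an action, but not regular."""
    return (2 * g + x) % N

print(is_regular_action(range(N), range(N), translate))
print(is_regular_action(range(N), range(N), double_translate))
```

The second action fails on both counts: the element 3 fixes every point, so it is not free, and odd points are never reached from even ones, so it is not transitive.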
Exercise 7.2.1. Show that the action of $S^1$ on itself (regarded as U(1,C),
i.e. the set of complex numbers of modulus 1 with the usual complex
multiplication) makes it an $S^1$-torsor. Is there a regular action of $S^1$ on $T^2$?
Is there a regular action of R on $T^2$? Take the quotient space $I/\partial I$ which
joins the ends of the unit interval together. This is homeomorphic to $S^1$ but
lacks the group structure and the smoothness structure. Show that it can be
given the structure of a smooth manifold via any homeomorphism with $S^1$
and also that it is an $S^1$-torsor. Is any Lie group G a G-torsor? Are there
any G-torsors that are not homeomorphic to G?
Definition 7.2.1. A fibre bundle where each fibre is a G-torsor (for the same
G) is called a principal bundle.
Exercise 7.2.2. Show that the n-frame bundle for any smooth manifold
(usually written F(M)) is a principal bundle.
Exercise 7.2.3. By taking the Möbius strip with fibre a closed interval and
gluing the ends of each fibre, show that the resulting space is a principal
bundle and work out what the space is.
Remark 7.2.1. This has a lot to do with gauge theory.
Exercise 7.2.4. Do some googling to understand the last remark.
Remark 7.2.2. The condition that the fibre be a G-torsor means that we
can use group actions to say something about the bundle structure. We
have, in effect, a sort of Construction Kit for the bundle which tells us how
to put it together: the group elements can be used to specify how to glue
local trivialisations together.
Figure 7.2.1: A locally trivial cover of $S^1$.
In the simplest case, take a bundle over $S^1$ with fibre the interval R and
the action of O(1,R) on it. We might stipulate that the action be always
the identity, so if we have a pair of trivialisations of the bundle, on the
intersection the relation between the fibres is that they are the same way
up. This inevitably forces the bundle to be trivial, $S^1 \times \mathbb{R}$. Or we might
insist that the group action be −1 on one intersection and +1 on the other,
when we would get the Möbius strip. Since we would like to be consistent
on intersections, it is reasonable to want the intersection to be connected, so
for $S^1$ we shall do it with three open sets which cover $S^1$.
Exercise 7.2.5. Is the fibre a G-torsor for the orthogonal group?
Remark 7.2.3. In the above case, if we impose the condition that the group
action has to be constant on the intersection of the trivialising cover of the
base space, then we need at least three such open sets in the cover. Labelling
them α, β and γ, we can characterise each intersection by specifying an or-
dered pair, see the diagram figure 7.2.1. α is the red open set, β the blue and
γ the green. Then the intersection αβ is the region between the black bars
at the top right. If I now assign the element 1 ∈ O(1,R) to αβ, the element
−1 to βγ at the left and the element 1 to γα, then you can read this as an
instruction to start with three strips, $\alpha \times \mathbb{R}$, $\beta \times \mathbb{R}$ and $\gamma \times \mathbb{R}$, and glue the
first two strips together keeping both orientations of R to have the positive
numbers pointing up; the strip $\gamma \times \mathbb{R}$ is glued to $\alpha \times \mathbb{R}$ also with the real line
having the same orientation, but $\gamma \times \mathbb{R}$ is glued to $\beta \times \mathbb{R}$ with a reversal, so
that the β part is upside down. It is clear that these instructions produce a
Möbius strip. Moreover, in general, we can specify a locally trivialising cover
of the base space, with the condition that the intersection is path connected,
take any fibre having a group action on it and, by assigning group elements
to intersections, give instructions to build a new object. We need to ensure
that the instructions are unambiguous and that the resulting object is a fibre
bundle.
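The gluing instructions for strips over $S^1$ can be mimicked in a few lines of Python. This is only a sketch: the function name and the encoding of the instructions as a list of signs, one for each overlap met in going once round the circle, are assumptions made here for illustration. It relies on the fact that, for a cover of the circle of this kind, only the product of the signs matters: flipping one strip upside down changes the signs on both of its overlaps and leaves the product alone.

```python
def classify_line_bundle(transitions):
    """Classify a strip over the circle from its O(1, R) gluing data.

    `transitions` lists the elements (+1 or -1) assigned to the successive
    overlaps met when going once around the circle.
    """
    holonomy = 1
    for g in transitions:
        if g not in (1, -1):
            raise ValueError("O(1, R) has only the elements +1 and -1")
        holonomy *= g
    return "trivial" if holonomy == 1 else "Moebius"

print(classify_line_bundle([1, -1, 1]))    # the assignment used in the remark
print(classify_line_bundle([1, 1, 1]))     # identity everywhere
print(classify_line_bundle([-1, -1, 1]))   # two reversals cancel
```

The third call illustrates that consistently choosing the identity is not the only way to end up with a trivial bundle.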
Definition 7.2.2. In general, if $\{U_\alpha\}$ is a trivialising cover of a manifold,
with the condition that $U_\alpha \cap U_\beta$ is connected or empty, then when there is
a group action on the fibre with a group G, the map from the (non-empty)
intersections (αβ) to G gives the transition functions for the bundle.
Exercise 7.2.6. Suppose αβ means that you hold α the ‘right way up’ in
some sense and apply a group element $g_{\alpha\beta}$ to β before doing the gluing of
each x in the fibre F over α to $g_{\alpha\beta}(x)$ over β. Verify that this means that
we must have $g_{\alpha\beta}$ the inverse of $g_{\beta\alpha}$. What can you say about $g_{\alpha\alpha}$?
Exercise 7.2.7. Verify also that for an unambiguous instruction we need to
have the cocycle condition:
$$g_{\alpha\beta}\, g_{\beta\gamma}\, g_{\gamma\alpha} = 1$$
on any non-empty region α ∩ β ∩ γ.
Exercise 7.2.8. Verify that the construction described always gives a fibre
bundle.
Note that we do not need the fibre to be a G-torsor for this group, it suffices
that the action be that of a subgroup. In fact we can get a trivial bundle by
consistently choosing the identity. (We can get it other ways, too!)
Exercise 7.2.9. Explain the last, parenthetic, remark.
Exercise 7.2.10. Show that by choosing fibre $S^0$ instead of R, we can deal
with the case where the fibre is a G-torsor. Instead of taking a subgroup,
we can throw out most of the fibre. So for the trivial bundle and the Möbius
bundle over $S^1$ with fibre $S^0$, we still have all the essential properties, and
now we need only count connected components to see the difference.
Definition 7.2.3. In the case described above of a bundle the structure of
which is determined by a group action on the fibres and a set of transition
functions, the bundle is called a G-bundle, and the group is called the gauge
group of the bundle.
Remark 7.2.4. In practice, the fibre is a vector space, usually a tangent
space or tensor space.
Definition 7.2.4. For any linear transformation T of a G-vector bundle fibre
$F_p$ attached to a point p in the manifold, we can ask whether it arises from
the action of G. In general some will and some won’t. If it does, we say T
lives in G.
Exercise 7.2.11. Show this is well defined. That is, show that if p is in two
charts with domains α and β, then if T lives in G over α it also lives in G
over β, even though the particular element g ∈ G is in general different.
Exercise 7.2.12. Give examples of G-vector bundles and linear transformations
of $F_p$ that live in G and others which do not.
Exercise 7.2.13. Extend this idea to the Lie algebra g when G is a Lie
group.
Definition 7.2.5. A Gauge Transformation is a smooth G-bundle map from
a vector bundle into itself which is the identity on the base space and such
that every linear map from a fibre $F_p$ to itself lives in the (Lie) group G.
Remark 7.2.5. Physicists care about these a lot. See the section on page
215 of the text book to find out why. Or at least get some vague idea.
7.3 The Endomorphism Bundle
There is a natural isomorphism between
$$V \otimes V^* \quad \text{and} \quad \mathrm{End}(V)$$
where End(V ) is the vector space of endomorphisms of V , that is the linear
maps from V to itself. Observe that End(V ) is a ring under composition, it
has a unit, but is not in general commutative. Recall that we defined a vector
space with an associative multiplication to be an algebra. From the isomor-
phism it is clear that for any smooth manifold we have an endomorphism
bundle where V is the tangent space at each point of the manifold.
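A small numerical illustration of the isomorphism may help. The sketch below assumes V is $\mathbb{R}^2$ with the standard basis and writes a covector by its components in the dual basis; the function names are illustrative. The elementary tensor v ⊗ φ is sent to the endomorphism w ↦ φ(w) v, whose matrix is the outer product of the two component lists.

```python
def outer(v, phi):
    """Matrix of the endomorphism w -> phi(w) v associated with v (x) phi."""
    return [[vi * pj for pj in phi] for vi in v]

def apply(matrix, w):
    """Apply a matrix (list of rows) to the vector w."""
    return [sum(row[j] * w[j] for j in range(len(w))) for row in matrix]

def pair(phi, w):
    """Evaluate the covector phi on the vector w."""
    return sum(p * x for p, x in zip(phi, w))

v = [1.0, 2.0]       # a vector in V
phi = [3.0, -1.0]    # a covector in V*, components in the dual basis
w = [4.0, 5.0]       # a test vector

lhs = apply(outer(v, phi), w)            # the endomorphism applied to w
rhs = [pair(phi, w) * vi for vi in v]    # phi(w) times v
print(lhs, rhs)
```

Sums of such elementary tensors give every matrix, which is one way to see that the map is onto when V is finite dimensional.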
More generally, if E is any vector bundle over a smooth manifold M with
fibre V, we can define the endomorphism bundle $E \otimes E^*$ by attaching $\mathrm{End}(V_p)$
to each point p in M. There is nothing new here, we have the tensor bundle
construction in the very special case of $(1, 1)^T$ tensors.
A section T of $E \otimes E^*$ acts on a section s of E. If at each point p in M, the
section s(p) is in the fibre then T(p) is a linear map from the fibre into itself
which takes s(p) to T(p)(s(p)), everything done pointwise. And if Γ(E) is
the space of sections of E, any section T of $E \otimes E^*$ gives a map
$$T : \Gamma(E) \to \Gamma(E)$$
which is linear regarding Γ(E) as a $C^\infty(M, \mathbb{R})$ module.
Exercise 7.3.1. Verify the above claim.
Exercise 7.3.2. Show that any $C^\infty(M, \mathbb{R})$-linear map from Γ(E) to Γ(E) defines a section of
$E \otimes E^*$. This will involve partitions of unity so needs M paracompact. Do
it first for the case where $E = M \times V$.
Exercise 7.3.3. Show that the set of all gauge transformations for a G-vector
bundle E is itself a group, $\mathcal{G}$, the gauge group.
Exercise 7.3.4. Read page 222 of the text book.
Chapter 8
Connections
I have followed and amplified R.W.R. Darling’s book Differential Forms and
Connections, of which I can only say ‘thank God for Wikipedia’. You should
look up the term Covariant Derivative on Wikipedia and anywhere else you
can find it.
8.1 Fundamental Ideas
You will have noticed that I have used the notation $\dot{\mathbb{R}}^n$ to denote the tangent
space of $\mathbb{R}^n$ at any point. I can get away with this because if I take any
$a, b \in \mathbb{R}^n$, then $\dot{\mathbb{R}}^n_a$ and $\dot{\mathbb{R}}^n_b$ are isomorphic and moreover the isomorphism
comes from the shift map that takes a to b by adding b − a to everything
in $\mathbb{R}^n$. Clearly this takes curves and their tangency equivalence classes at a
to corresponding curves and their equivalence classes at b in a thoroughly
uninteresting way. An important consequence of this is that if I am standing
at the origin in $\mathbb{R}^n$ and you are standing somewhere else, it makes sense to
ask if we are looking in ‘the same’ direction. We can ask if the unit tangent
vector representing my direction of look is carried by the shift from me to
you into the unit vector representing your direction of look.
On a 2-sphere, even the standard one embedded in $\mathbb{R}^3$, this is not the situation
at all. It is true that we could use $\mathbb{R}^3$ as our notion of what constitutes ‘the
same’ direction, but if you and I are both on the equator of the Earth,
supposed to be an embodiment of $S^2$, and if you are a quarter of a planet
away from me, if I am looking at my horizon due West, watching the sun
set, and you are also looking due West, you would not be looking towards
the sun. If you are somewhere to the West of me, then the Sun is overhead
from your point of view, and if I am to the West of you, it is dark for you
and the direction of the Sun is under your feet. Yet if we are both looking
due West, it clearly makes some sort of sense to say we are looking in ‘the
same’ direction. We wouldn’t feel quite so tempted to say this if I watched
the sunset and you were looking due North.
Exercise 8.1.1. Show that there is a curve joining us so that at each step
neighbours are looking in the same direction but I am looking due West and
you are looking due North.
So the question is, how do we define this notion of ‘the same direction’ for
different places on a manifold?
One approach is to go through a group action, since the shift maps from
$\mathbb{R}^n$ to itself constitute precisely such an action of the additive group $\mathbb{R}^n$ on
the vector space $\mathbb{R}^n$, and the group action takes tangent vectors to unique
tangent vectors. In the case of $S^2$ embedded in the standard way in $\mathbb{R}^3$, one
would be tempted to use the special orthogonal group. If there is a rotation
taking me to you (and there surely is) then we could use the rotation to take
my tangent space to yours. Then if the image of my direction of look is your
direction of look, we are looking in ‘the same’ direction. The fact that we
can decide which directions are North, South, East and West, for all points
on the Earth except the poles, suggests that some sort of sense can be made
of this. On the other hand the fact that all directions at the North Pole are
due South also suggests that there is something a bit wrong. Nevertheless, if
I am at the North Pole and you are a kilometre away and we are looking at
each other, then it makes sense to say we are looking in opposite directions,
and if I turn around to see what you are looking at (a polar bear behind
me perhaps) then we would be looking in the same direction. This may be
because a distance of a kilometre is small enough to make the world pretty
much flat. But suppose we have a chain of five thousand people, all looking
towards the next one in the chain, all able to see over each other’s shoulders
at the next one beyond, we could each decide that we were looking in the
same direction. If we had forty thousand people, and I am number one (as
seems reasonable to me) and standing at the North Pole looking towards you,
and you are looking due South at somebody a kilometre away also looking
in the same direction as you, and so on, then we are all looking due South
until we get to the South Pole, when everybody later in the chain is looking
due North. So North and South have got screwed up, but we are all still
looking in ‘the same’ direction. Any three consecutive people will agree on
this. And if we all turn through a right angle anticlockwise, are we still
all looking in the same direction? Any consecutive triple of people, observing
their neighbours out of the corners of their eyes would surely agree that they
were. And if they all held their arms out sideways, their right arms would be
pointing towards the neighbour they had previously been looking at. Half of
them would say they were looking West and the other half would say they
were looking East. Were it not for the axial inclination of the Earth, we
might have them all looking at the sun on the horizon, although some would
find it setting and others rising.
You will note that a suitable rotation of the Earth, although not the usual
one, would take each person to his successor and would carry the direction
of look along with it.
We certainly have that parallel translation of a tangent vector around this
closed loop would have to bring us back to the original vector, but this need
not happen for all closed loops, see figure 3.2.1 in chapter three.
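The failure for other loops can be seen numerically. The sketch below carries a tangent vector around a triangle on the unit sphere made of three quarter great circles, using on each leg the rotation of $\mathbb{R}^3$ that slides that great circle along itself, just as the rotation of the Earth carried the direction of look along the chain of people. Treating a three-legged loop leg by leg in this way is an assumption of the sketch, and the function names are illustrative.

```python
import math

def cross(a, b):
    """Cross product of two vectors in R^3."""
    return [a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0]]

def dot(a, b):
    """Standard inner product."""
    return sum(x * y for x, y in zip(a, b))

def rotate(v, axis, angle):
    """Rodrigues' formula: rotate v about the unit vector `axis` by `angle`."""
    c, s = math.cos(angle), math.sin(angle)
    k_cross_v = cross(axis, v)
    k_dot_v = dot(axis, v)
    return [v[i]*c + k_cross_v[i]*s + axis[i]*k_dot_v*(1 - c) for i in range(3)]

def transport_along_arc(v, p, q):
    """Carry v from p to q by the rotation taking p to q along a great circle."""
    n = cross(p, q)
    length = math.sqrt(dot(n, n))
    axis = [x / length for x in n]
    angle = math.atan2(length, dot(p, q))
    return rotate(v, axis, angle)

A, B, C = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]
start = [0.0, 1.0, 0.0]            # tangent at A, looking towards B
v = start
for p, q in [(A, B), (B, C), (C, A)]:
    v = transport_along_arc(v, p, q)
print(start, v)
```

The vector returns to A tangent to the sphere but turned through a right angle from where it started.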
Exercise 8.1.2. Without looking at the picture, draw a closed loop of people
on $S^2$ with a distinguished starting point, so that everyone is looking in the
same direction as the person in front, but such that the last person is at the
same point as the first person but is looking in a different direction.
Three observations: first that we should feel that when the people are very
close together on any smooth manifold, it should be possible to say if they
are looking in different directions. Since ‘very close’ is a scale dependent
kind of thing, it must be intelligible to take limits, so differentiation must
come into it somewhere. On a symmetric space with a Lie group action,
the group action ought to also give us some sense of what ‘looking the same
way’ means, but the notion ought to be intelligible on any smooth manifold,
although not with the structure we have on it at present. Another reason for
thinking differentiation comes into it somewhere is that directions are given
by tangent vectors.
Second, we seem to have that paths come into it too. Whether two people
at different points a and b can be said to be looking in the same direction
would seem to require us to have a ‘path’ of people, everyone looking in the
same direction as the person next to him, the path joining the person at a
to the person at b. There isn’t a ‘next’ person on a continuous path, and
the last exercise makes it clear that whether the person at a is looking in the
same direction as the person at b would depend on the path.
Exercise 8.1.3. Find two paths on a sphere between points a and b such
that each person along each path is looking in the same direction, the person
at a is looking in a definite direction, but the two coincident people at b are
looking in different directions.
And Third, we might want to transfer along paths on a manifold other things
besides directions of looking: for example we might naturally want to transfer
a frame, that is a coordinate system, or a linear map. For this reason we
need to think in terms of doing parallel transportations for sections of vector
bundles in general, not just the tangent bundle.
So the problem is, how to articulate precisely this notion of shifting some
object parallel to itself along a curve on a smooth manifold. The above
discussion shows it should make some sort of sense, but consideration of $S^2$
shows that it may have some surprises.
There seem to be two possibilities: one is to restrict ourselves to symmetric
spaces with a good group action under which the manifold is invariant, and
the other is to find some more general differential structure. The first choice
leads to Cartan connections, which generalise the idea of using rotations of
$S^2$ to carry tangent vectors along with the points, or shifts to take tangent
spaces to tangent spaces in $\mathbb{R}^n$. The second choice is more general and leads
to Koszul connections and in particular to the Levi-Civita connection. There
are more connections than you can shake a stick at, but it is better to get
one type sorted out properly before going on to others.
Exercise 8.1.4. Can you say what kinds of paths on $S^2$ look ‘right’ for the
action of SO(3, R) to be used for determining how to transport a unit tangent
vector (direction of look!)?
Exercise 8.1.5. Can you transport a frame on $S^2$ using the same idea? If
you had an eye where your left ear is so you could look in two orthogonal
directions at once, could you have a chain of people all looking in the same
direction with eyes and left ears? Or could you have a chain where this is
impossible?
Exercise 8.1.6. You could certainly have a Riemannian inner product on
$\mathbb{R}^2$ that started off with the standard basis being orthonormal and changed
gradually along a path until the basis $\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ 1 \end{pmatrix}$ was an orthonormal
basis. Find such a path. Transport an orthogonal frame along the path so it
stays an orthogonal frame in the Riemannian inner product.
Exercise 8.1.7. Could you do the same thing with the standard basis,
$(e_1, e_2)$, and the basis $(e_2, e_1)$? Prove your claim.
Exercise 8.1.8. Can you do the same kind of thing on $S^2$? $\mathbb{RP}^2$?
We conclude that the idea is to shift ‘things’ along curves. At the very least
the curves should be smooth. In fact any smooth curve, at least locally is the
solution to a system of ODEs (the Straightening Out Theorem from ODE
theory: see Arnol’d.) So an alternative is to move them ‘infinitesimally’
along a vector field, and the ‘things’ will be sections of some vector bundle.
If the shifts are ‘infinitesimal’ then we can hope to get a shift along a curve
by some sort of integration process. This leads to asking if we can have some
sort of differential operation of a vector field on various other sections of a
vector bundle.
8.2 Back in $\mathbb{R}^n$
8.2.1 Covariant differentiation
I shall deal with the case n = 2 in order to save typing, but the extension is
trivial. Take a vector field X on $\mathbb{R}^2$ and a point $a \in \mathbb{R}^2$ with X(a) denoting
the vector at a. We write a as $\begin{pmatrix} a^1 \\ a^2 \end{pmatrix}$ and X(a) as $u = \begin{pmatrix} u^1 \\ u^2 \end{pmatrix}$. Let Y be
another vector field on $\mathbb{R}^2$. Can we differentiate Y in the direction u at a?
If we take $Y(a) = v = \begin{pmatrix} v^1 \\ v^2 \end{pmatrix}$, then we can only talk about differentiating Y
at a if we have Y making sense in a neighbourhood of a so we want to see
v as a pair of functions,
$$v(a) = \begin{pmatrix} v^1(a) \\ v^2(a) \end{pmatrix}$$
Of course, we are going to be looking at the derivative of Y at a in the
direction X(a) for different points a.
We already have a way of talking about the directional derivative of a function
with respect to a vector field. Take a function $f : \mathbb{R}^2 \to \mathbb{R}$ and a vector field
X on $\mathbb{R}^2$. Then I can take the Lie derivative Xf or $L_X(f)$ and differentiate
f along the vector field with
$$Xf\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \dfrac{\partial f}{\partial x} & \dfrac{\partial f}{\partial y} \end{pmatrix} \begin{pmatrix} X^1(x, y) \\ X^2(x, y) \end{pmatrix}$$
This gives me another function, a 0-tensor field.
I can certainly do this to both components $v^1(a)$ and $v^2(a)$ and this will
give me a new vector field which I shall write as $\nabla_X(Y)$. Baez and Muniain
write $D_X(Y)$, at least some of the time, which reminds you that this is
something to do with differentiation, but then, ∇ is also something to do
with differentiation. Notice $\nabla_X(Y)$ is very different from $\nabla_Y(X)$ in general.
The former requires differentiation of Y in a direction at a point, but has
nothing to do with differentiating X, and the latter is the other way around.
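With the notation given I have
$$X = u^1(x, y)\,\frac{\partial}{\partial x} + u^2(x, y)\,\frac{\partial}{\partial y}$$
and
$$Y = v^1(x, y)\,\frac{\partial}{\partial x} + v^2(x, y)\,\frac{\partial}{\partial y}$$
Then going back to writing vectors as columns we have
$$\nabla_X(Y) = \begin{pmatrix} \dfrac{\partial v^1}{\partial x} & \dfrac{\partial v^1}{\partial y} \\[1ex] \dfrac{\partial v^2}{\partial x} & \dfrac{\partial v^2}{\partial y} \end{pmatrix} \begin{pmatrix} u^1(x, y) \\ u^2(x, y) \end{pmatrix}$$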
Remark 8.2.1. We could write this using the Einstein summation convention as
$$\nabla_X(Y) = u^i\, \partial_i(v^j)\, \partial_j, \quad i, j \in [1 : 2]$$
which has the advantage that if I leave out the last part, by not specifying
which n we are working in, it makes sense for $\mathbb{R}^n$ for any n.
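The recipe "Jacobian matrix of Y applied to X" is easy to try out numerically. The following is a rough sketch, not a definitive implementation: the function names, the step size h and the choice of fields are all assumptions made for illustration. It estimates the partial derivatives of the components of Y by central differences and combines them with the components of X.

```python
def covariant_derivative(X, Y, h=1e-5):
    """Return the field p -> (Jacobian of Y at p) applied to X(p) on R^2."""
    def result(p):
        x, y = p
        u1, u2 = X(p)
        # Central-difference estimates of the partial derivatives of Y.
        dY_dx = [(a - b) / (2 * h) for a, b in zip(Y((x + h, y)), Y((x - h, y)))]
        dY_dy = [(a - b) / (2 * h) for a, b in zip(Y((x, y + h)), Y((x, y - h)))]
        return tuple(u1 * dx + u2 * dy for dx, dy in zip(dY_dx, dY_dy))
    return result

def rotation_field(p):
    """X = y d/dx - x d/dy."""
    x, y = p
    return (y, -x)

def radial_field(p):
    """Y = x d/dx + y d/dy."""
    x, y = p
    return (x, y)

p = (2.0, 1.0)
print(covariant_derivative(rotation_field, radial_field)(p))    # close to (1, -2)
print(covariant_derivative(radial_field, rotation_field)(p))    # close to (1, -2)
print(covariant_derivative(rotation_field, rotation_field)(p))  # close to (-2, -1)
```

For these two particular fields the first two results happen to agree; the third points from p back towards the origin, which is what one expects from differentiating a rotating field along itself.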
Exercise 8.2.1. Take a nice vector field X on $\mathbb{R}^2$ such as $y\,\frac{\partial}{\partial x} - x\,\frac{\partial}{\partial y}$. Choose
a nice simple Y and calculate $\nabla_X(Y)$, also $\nabla_Y(X)$, $\nabla_X(X)$ and $\nabla_Y(Y)$.
Sketch all the vector fields and satisfy yourself everything makes sense, and
that we can legitimately regard $\nabla_X(Y)$ as a derivative of Y in the direction
X at each point.
Remark 8.2.2. From the matrix notation, certain things are obvious:
1. $\nabla_X(Y)$ is certainly additive in X:
$$\nabla_{X_1 + X_2}(Y) = \nabla_{X_1}(Y) + \nabla_{X_2}(Y)$$
2. $\nabla_X(Y)$ is R-linear in X:
$$\forall\, t \in \mathbb{R}, \quad \nabla_{tX}(Y) = t\,\nabla_X(Y)$$
3. Since this is done pointwise as far as X is concerned, it is $C^\infty(\mathbb{R}^2, \mathbb{R})$
linear in X:
$$\forall\, f \in C^\infty(\mathbb{R}^2, \mathbb{R}), \quad \nabla_{fX}(Y) = f\,\nabla_X(Y)$$
4. $\nabla_X(Y)$ is R-linear in Y:
$$\nabla_X(Y_1 + Y_2) = \nabla_X(Y_1) + \nabla_X(Y_2)$$
$$\forall\, t \in \mathbb{R}, \quad \nabla_X(tY) = t\,\nabla_X(Y)$$
5. It satisfies the Leibniz rule so far as $C^\infty(\mathbb{R}^2, \mathbb{R})$ scaling of Y is concerned:
$$\nabla_X(fY) = f\,\nabla_X(Y) + (Xf)\,Y$$
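As a hint for how the last property goes, here is a short derivation in the index notation of Remark 8.2.1, writing $X = X^i\partial_i$ and $Y = Y^j\partial_j$. It is a sketch for the Euclidean connection only and uses nothing beyond the ordinary product rule for partial derivatives.

```latex
\begin{aligned}
\nabla_X(fY) &= X^i\,\partial_i\!\left(f\,Y^j\right)\partial_j \\
             &= f\,X^i\,\partial_i(Y^j)\,\partial_j
                + \left(X^i\,\partial_i f\right) Y^j\,\partial_j \\
             &= f\,\nabla_X(Y) + (Xf)\,Y .
\end{aligned}
```

The other four properties follow even more directly, since X enters only through its components at the point in question and differentiation of the components of Y is R-linear.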
Exercise 8.2.2. Verify all these.
Exercise 8.2.3. Confirm that the Lie derivative $L_X(Y)$ does not satisfy all
these conditions.
Exercise 8.2.4. Instead of operating with $\nabla_X$ on a vector field, you could
operate in a very similar way on a covector field or 1-form, $\omega = P\,dx + Q\,dy$
to get another 1-form $\nabla_X(\omega) = XP\,dx + XQ\,dy$. Show this also satisfies
the above conditions.
Remark 8.2.3. We have now defined $\nabla_X$ for sections of two different bundles
over $\mathbb{R}^2$, the tangent bundle and the cotangent bundle.
Definition 8.2.1. Any operator $\nabla_X$ for a vector field X on sections of any
vector bundle over $\mathbb{R}^2$ which satisfies the properties of Remark 8.2.2 is called a
connection, and the particular connections described are called the Euclidean
connections on the tangent and cotangent bundles.
Obviously extending these to $\mathbb{R}^n$ merely means more terms. Extending them
to manifolds gives some complications.
8.2.2 Curves and transporting vectors
We can now talk about moving a vector along a curve in $\mathbb{R}^2$ so it stays
pointing in the same direction. If $\gamma : I \to \mathbb{R}^2$ is a smooth curve, and u(t) is a
vector at $\gamma(t) \in \mathbb{R}^2$, I want to use the fact that for each t ∈ I, if I differentiate
u(t) in the direction $\gamma'(t)$ there is no change, so I want
$$\forall\, t \in I, \quad \nabla_{\gamma'(t)}(u(t)) = 0$$
This does not make sense as it stands, because neither $\gamma'(t)$ nor u are vector fields on
$\mathbb{R}^2$, although $\gamma'$ is a vector field on the curve. This doesn’t matter as far as
$\gamma'(t)$ is concerned, because we only need a vector at each point of the curve
to give a direction in which to differentiate. It does matter as far as u(t)
is concerned: at the very least we want to know what u is doing in some
neighbourhood of the curve, whereupon $\nabla_{\gamma'(t)}(u(t))$ becomes intelligible and
I can insist that it be zero. This will put some conditions on u. We already
know, of course, exactly what we want to get out, because we know that
shifting vectors parallel to themselves in $\mathbb{R}^2$ is rather trivial. What we want
to conclude is that u is constant along γ, and that this does not depend on
the curve. But we have an eye on doing the same sort of thing on $S^2$, where
life is more complicated.
You can see that the proposition that the directional derivative of a function
is zero in every direction certainly tells us that the function is constant. You
can also see that if the directional derivative of a function along a curve is
zero, it must be constant along the curve. And we don’t give a damn what
the function is doing elsewhere. This holds for each component function of
the vector field that extends u in a neighbourhood of the curve.
Exercise 8.2.5. Prove that a smooth function which has directional deriva-
tive along a curve equal to zero is constant along the curve.
Exercise 8.2.6. Is it true for any continuous curve in $\mathbb{R}^2$, that a function
defined on it can always be extended to a neighbourhood of the curve?
We therefore deduce that for the Euclidean connection on $\mathbb{R}^2$, the properties
of the connection ensure that the condition
$$\nabla_{\gamma'(t)}(u(t)) = 0$$
is quite intelligible since it makes sense for any extension of u(t) into a
neighbourhood of the curve, and it tells us how to parallel transport a vector
along a curve, and that for any two points on the curve, the condition ensures
the vector at each point is ‘the same’. Big deal. Of course, the notion of
vectors in two different tangent spaces being ‘the same’ is certainly trivial in
$\mathbb{R}^2$, indeed in $\mathbb{R}^n$ generally, so long as we have the standard structure; there
is one obvious sense in which it makes sense, the trick is to say what it means
along curves in manifolds that are not so simple.
From this, after some reflection, we conclude that the Euclidean connection
on $\mathbb{R}^n$ solves all the problems of parallel transportation on $\mathbb{R}^n$. This, let’s face
it, wasn’t much of a problem. On the other hand, it does give us a hope of
solving the same problem on $S^2$ and other manifolds, including the universe
in which we live. And if we can parallel translate vectors we should be able
to parallel translate other things using the same ideas.
So we study connections.
8.3 Covariance
The question is, can we make this work on manifolds in general? Certain
things are prerequisites: in particular, this all has to be independent of a
choice of basis. If you are a physicist and you use a vector field or differential
1-form to represent something like an electric field, you would insist that
the vector at a point is a real thing that does not depend on the choice of a
coordinate system. We now know that the right way to express this belief is
in terms of invariance under certain group actions. You and I may differ in
the actual numbers to be assigned, but we’d better agree on what happens
in the world after a suitable translation scheme is established. Or it ain’t
Science.
If you change the basis of $\mathbb{R}^n$ for some reason known only to yourself, then
both X and Y will have different representations. Still, a vector field exists
independent of your description, and something would be horribly wrong if
$\nabla_X(Y)$ depended on the basis.
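A numerical spot check of this independence, for one linear change of basis, might look as follows. It is a sketch under stated assumptions: the new basis (1, 1), (1, −1), the sample fields, the test point and all function names are chosen here for illustration. The same recipe (Jacobian of Y applied to X) is carried out once in standard coordinates and once in the new coordinates, and the two answers are compared after converting one of them.

```python
def jacobian_times(F, p, u, h=1e-5):
    """Central-difference estimate of (Jacobian of F at p) applied to u."""
    x, y = p
    dF_dx = [(a - b) / (2 * h) for a, b in zip(F((x + h, y)), F((x - h, y)))]
    dF_dy = [(a - b) / (2 * h) for a, b in zip(F((x, y + h)), F((x, y - h)))]
    return (u[0] * dF_dx[0] + u[1] * dF_dy[0],
            u[0] * dF_dx[1] + u[1] * dF_dy[1])

# New basis b1 = (1, 1), b2 = (1, -1): new coordinates (s, t) describe the
# point or vector s*b1 + t*b2 = (s + t, s - t) in standard coordinates.
def to_standard(c):
    s, t = c
    return (s + t, s - t)

def to_new(v):
    a, b = v
    return ((a + b) / 2, (a - b) / 2)

def in_new_basis(F):
    """The same vector field, with points and components in the new basis."""
    return lambda c: to_new(F(to_standard(c)))

def X(p):
    """y d/dx - x d/dy in standard coordinates."""
    return (p[1], -p[0])

def Y(p):
    """A sample field with a non-constant Jacobian."""
    return (p[0] * p[1], p[0] + p[1] ** 2)

c = (0.5, 1.5)                 # a point, in new coordinates
p = to_standard(c)             # the same point, in standard coordinates
standard = jacobian_times(Y, p, X(p))
new = jacobian_times(in_new_basis(Y), c, in_new_basis(X)(c))
print(to_new(standard), new)   # the two pairs should agree
```

Agreement at one point for one pair of fields proves nothing, of course, but a disagreement would show that something had gone wrong in translating the fields.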
Exercise 8.3.1. Take the vector fields on $\mathbb{R}^2$ you used for an earlier problem
and express them in the basis
$$\begin{pmatrix} 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ -1 \end{pmatrix}$$
in such a way that it really is the same vector field. Do the calculations all
over again.
Exercise 8.3.2. Show that in general if we change the basis on $\mathbb{R}^n$ so that
X and Y are written in the new basis as $X'$ and $Y'$, then $\nabla_{X'}(Y')$ is what it
jolly well ought to be.
Exercise 8.3.3. Using the same vector fields, do the calculations using polar
coordinates. What conclusions do you draw?
Exercise 8.3.4. Suppose φ is a diffeomorphism of $\mathbb{R}^n$ and X, Y are vector
fields on $\mathbb{R}^n$. Explain how to describe X, Y in terms of the ‘coordinate system’
given by φ. What happens to $\nabla_X(Y)$ under this diffeomorphism?
8.4 Extensions to Tensor Fields on $\mathbb{R}^2$
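Returning to $\mathbb{R}^2$, we could express the Euclidean connection in the form:
$$\nabla_X(Y) = \begin{bmatrix} \dfrac{\partial v^1}{\partial x} & \dfrac{\partial v^1}{\partial y} \\[1ex] \dfrac{\partial v^2}{\partial x} & \dfrac{\partial v^2}{\partial y} \end{bmatrix} \begin{bmatrix} u^1(x, y) \\ u^2(x, y) \end{bmatrix}$$
or the somewhat more compact:
$$\nabla_X(Y) = u^i \, \partial_i(v^j) \, \partial_j$$
This rather obscures the fact that I am using u’s to represent the vector field
X and v’s to represent the vector field Y , so it might be better to write
$$\nabla_X(Y) = X^i \, \partial_i(Y^j) \, \partial_j \tag{8.4.1}$$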
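Equation 8.4.1 is easy to try out numerically. What follows is only a minimal Python sketch and not part of the original notes: the helper names `partial` and `euclidean_connection` and the two sample fields are invented for illustration, and the partial derivatives are approximated by central differences rather than computed exactly.

```python
# Sketch of equation 8.4.1 on R^2: the j-th component of nabla_X(Y) is
# X^i * d_i(Y^j). All names here are hypothetical, chosen for this example.

def partial(f, p, i, h=1e-5):
    """Central-difference estimate of the i-th partial derivative of f at p."""
    lo, hi = list(p), list(p)
    lo[i] -= h
    hi[i] += h
    return (f(*hi) - f(*lo)) / (2 * h)

def euclidean_connection(X, Y, p):
    """Components of nabla_X(Y) at the point p, following equation 8.4.1."""
    x_at_p = X(*p)
    result = []
    for j in range(2):
        # Y^j as a scalar function of (x, y)
        Yj = lambda x, y, j=j: Y(x, y)[j]
        result.append(sum(x_at_p[i] * partial(Yj, p, i) for i in range(2)))
    return tuple(result)

# Illustrative fields (assumptions, not the fields used in the exercises):
# X = d_x + x d_y  and  Y = x*y d_x + y^2 d_y.
X = lambda x, y: (1.0, x)
Y = lambda x, y: (x * y, y ** 2)

value = euclidean_connection(X, Y, (2.0, 3.0))
print(value)
```

For these two fields a hand calculation gives $\nabla_X(Y) = (y + x^2)\,\partial_x + 2xy\,\partial_y$, which is $(7, 12)$ at the point $(2, 3)$; the sketch should agree with that up to rounding.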
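The definition of $\nabla_X(\omega)$ where
$$\omega = P\,dx + Q\,dy = \omega_1\,dx + \omega_2\,dy$$
was just
$$\nabla_X(\omega) = X\omega_1\,dx + X\omega_2\,dy = X\omega_j\,dx^j$$
and unpacking the expression for $X\omega_j$ we get
$$\nabla_X(\omega) = X^i \, \partial_i \omega_j \, dx^j \tag{8.4.2}$$
which looks a lot like equation 8.4.1.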
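A small worked instance of equation 8.4.2 may help; the particular X and ω below are chosen purely for illustration and do not come from the exercises.

```latex
% Illustrative choice (an assumption): X = \partial_x + x\,\partial_y,
% \omega = xy\,dx + y^2\,dy, so \omega_1 = xy and \omega_2 = y^2.
\begin{aligned}
X\omega_1 &= \partial_x(xy) + x\,\partial_y(xy) = y + x^2, \\
X\omega_2 &= \partial_x(y^2) + x\,\partial_y(y^2) = 2xy, \\
\nabla_X(\omega) &= (y + x^2)\,dx + 2xy\,dy .
\end{aligned}
```

The recipe is the same as for a vector field: apply X to each component function and leave the basis elements alone.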
Exercise 8.4.1. Write out equation 8.4.2 as a matrix.
The fact that we have the same basic shape for vector fields as for 1-forms
tells us that all we are doing is choosing a suitable basis for each of them:
$(e_1, e_2)$ is the standard basis for the vectors in $\mathbb{R}^2$ and I have $(\partial_1, \partial_2)$ for
the standard basis for the tangent vectors, and $(dx^1, dx^2)$ for the cotangent
vectors. Suppose I have a $(k, \ell)$ tensor bundle, then I can write out a basis
for any section as a collection of terms in the form
$$dx^{i_1} \otimes dx^{i_2} \otimes \cdots \otimes dx^{i_k} \otimes \partial_{i_{k+1}} \otimes \cdots \otimes \partial_{i_{k+\ell}}$$
We can extend the definition of $\nabla_X(Y)$ to $\nabla_X(s)$ where s is any section of
the tensor bundle by writing
$$\nabla_X(\alpha \otimes \beta) = \nabla_X(\alpha) \otimes \beta + \alpha \otimes \nabla_X(\beta)$$
and extending to as many tensor products as you feel a need for, and using
linearity.
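To see what the product rule does in the simplest case, here is a short derivation for a single term $f\,dx^i \otimes dx^j$ with f a smooth function. It is a sketch under the assumption that we are still using the Euclidean connection on $\mathbb{R}^2$, for which the definition of $\nabla_X(\omega)$ given above makes $\nabla_X(dx^j) = 0$ because the components of $dx^j$ are constant.

```latex
\begin{aligned}
\nabla_X\!\left(f\,dx^i \otimes dx^j\right)
  &= \nabla_X\!\left(f\,dx^i\right) \otimes dx^j + f\,dx^i \otimes \nabla_X\!\left(dx^j\right) \\
  &= (Xf)\,dx^i \otimes dx^j + f\,dx^i \otimes 0 \\
  &= (Xf)\,dx^i \otimes dx^j .
\end{aligned}
% By linearity, for s = s_{ij}\,dx^i \otimes dx^j this gives
% \nabla_X(s) = X^k\,\partial_k(s_{ij})\,dx^i \otimes dx^j .
```

So for this connection the covariant derivative of a tensor field is again taken component by component; the interesting behaviour only appears once the basis elements themselves have non-zero derivatives.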
Exercise 8.4.2. Show that for all sections of a tensor bundle s, the properties
of Remark 8.2.2 hold.
Exercise 8.4.3. Let X be the same old vector field on $\mathbb{R}^2$ as in earlier
exercises, and let a Riemannian inner product be defined on the positive
quadrant by the matrix
$$s = \begin{bmatrix} 1 + xy & 0 \\ 0 & 1 + x^2 + y^2 \end{bmatrix}$$
Find $\nabla_X(s)$. Is it positive definite? If we take the covariant derivative of a
symmetric 2-tensor s, is the resulting 2-tensor necessarily symmetric?
8.5 The Koszul Connection
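The crucial properties of the covariant derivative of tensor bundles on $\mathbb{R}^2$
were: For any section s of a tensor bundle,
1. $\nabla_X(s)$ is certainly additive in X:
$$\nabla_{X_1 + X_2}(s) = \nabla_{X_1}(s) + \nabla_{X_2}(s)$$
2. $\nabla_X(s)$ is $\mathbb{R}$-linear in X:
$$\forall\, t \in \mathbb{R}, \quad \nabla_{tX}(s) = t\,\nabla_X(s)$$
3. Since this is done pointwise as far as X is concerned, it is $C^\infty(\mathbb{R}^2, \mathbb{R})$
linear in X:
$$\forall\, f \in C^\infty(\mathbb{R}^2, \mathbb{R}), \quad \nabla_{fX}(s) = f\,\nabla_X(s)$$
4. $\nabla_X(s)$ is $\mathbb{R}$-linear in s:
$$\nabla_X(s_1 + s_2) = \nabla_X(s_1) + \nabla_X(s_2)$$
$$\forall\, t \in \mathbb{R}, \quad \nabla_X(ts) = t\,\nabla_X(s)$$
5. It satisfies the Leibnitz rule so far as $C^\infty(\mathbb{R}^2, \mathbb{R})$ scaling of s is concerned:
$$\nabla_X(fs) = f\,\nabla_X(s) + (Xf)\,s$$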
Extending these to $\mathbb{R}^n$ is rather trivial; consider it done. The next step is
to define a Koszul connection on any vector bundle E over a manifold M as
a map operating on a vector field, X, on M and any section s of E which
satisfies the above rules. This is rather abstract, but I have built up the
simple concrete cases first in order to cheer you up.
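Rules 3 and 5 are the ones that distinguish the two slots of ∇, and they can be spot-checked numerically for the Euclidean connection on $\mathbb{R}^2$. The following Python sketch is an illustration only: the names `partial` and `nabla`, the sample X, s and f, and the test point are all assumptions made for this example, and a numerical check at one point is of course no substitute for the proof.

```python
# Spot-check of rules 3 and 5 for the Euclidean connection on R^2.
# Every name below is hypothetical; derivatives use central differences.

def partial(f, p, i, h=1e-5):
    """Central-difference estimate of the i-th partial derivative of f at p."""
    lo, hi = list(p), list(p)
    lo[i] -= h
    hi[i] += h
    return (f(*hi) - f(*lo)) / (2 * h)

def nabla(X, s, p):
    """Euclidean connection: differentiate each component of s along X at p."""
    xp = X(*p)
    n = len(s(*p))
    return [
        sum(xp[i] * partial(lambda x, y, k=k: s(x, y)[k], p, i) for i in range(2))
        for k in range(n)
    ]

X = lambda x, y: (-y, x)          # sample vector field
s = lambda x, y: (x * x, x * y)   # sample section with two components
f = lambda x, y: 1.0 + x * y      # sample smooth function

fX = lambda x, y: tuple(f(x, y) * c for c in X(x, y))
fs = lambda x, y: tuple(f(x, y) * c for c in s(x, y))

p = (1.5, -0.5)
f_at_p = f(*p)
Xf = sum(X(*p)[i] * partial(f, p, i) for i in range(2))  # X applied to f at p

# Rule 3: nabla_{fX}(s) = f nabla_X(s)
lhs3 = nabla(fX, s, p)
rhs3 = [f_at_p * c for c in nabla(X, s, p)]

# Rule 5: nabla_X(fs) = f nabla_X(s) + (Xf) s
lhs5 = nabla(X, fs, p)
rhs5 = [f_at_p * a + Xf * b for a, b in zip(nabla(X, s, p), s(*p))]

print(lhs3, rhs3)
print(lhs5, rhs5)
```

The asymmetry is the point: multiplying X by a function simply scales the answer, whereas multiplying s by a function produces the extra $(Xf)\,s$ term.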
8.6 Vector Potentials
We gave a very simple covariant derivative on $\mathbb{R}^2$ which quite obviously satisfied
the rules for a connection, indeed that’s where we got the rules from.
Now we take the Physicist’s perspective. Weaning them off coordinates is an
ongoing process, so let’s try doing it their way; then we get to be able to do
lots of sums, which is good, and confuse things horribly, which is bad.
Suppose we have a section s of some bundle E over $\mathbb{R}^2$ with fibre a vector
space F so that $E = \mathbb{R}^2 \times F$. s is therefore a map from $\mathbb{R}^2$ to F, and if
we let $(e_1, e_2, \ldots, e_n)$ be a basis for F, then for every point $v \in \mathbb{R}^2$ we have
$s(v) = s^1(v)e_1 + s^2(v)e_2 + \cdots + s^n(v)e_n$, or $s(v) = s^i e_i$ in Physicist’s notation.
If X is a vector field on $\mathbb{R}^2$ we can write $X = X^1\partial_1 + X^2\partial_2 = X^j\partial_j$. And
if $s' = \nabla_X(s)$ we have also $s'(v) = {s'}^1(v)e_1 + \cdots + {s'}^n(v)e_n = {s'}^i e_i$. And
the n functions ${s'}^i(v)$ depend on the n functions $s^i(v)$ and on the functions
$X^1, X^2$ only. Moreover, the rules for getting the ${s'}^i$ are specified by the rules
of section 8.5 and nothing else. Let’s see how it works out. I shall have to
take $\nabla_Y$ for Y the unit vector field in the direction of the x-axis and also the
y-axis, and rather than write this as $\nabla_{\partial_1}$ or $\nabla_{\partial_2}$ I shall shorten this to $\nabla_1$
and $\nabla_2$ respectively.
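$$\begin{aligned}
s' = \nabla_X(s) &= \nabla_{(X^1\partial_1 + X^2\partial_2)}(s) \\
&= \nabla_{X^1\partial_1}(s) + \nabla_{X^2\partial_2}(s) \\
&= X^1\nabla_1(s) + X^2\nabla_2(s) \\
&= X^1\nabla_1(s^1e_1 + \cdots + s^ne_n) + X^2\nabla_2(s^1e_1 + \cdots + s^ne_n) \\
&= X^j\nabla_j(s^ie_i) && \text{using the Einstein convention} \\
&= X^j\left(s^i\nabla_j(e_i) + (\partial_j s^i)e_i\right) && \text{(Leibnitz)}
\end{aligned}$$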
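It might be better to expand this last to
$$\begin{aligned}
\nabla_X(s) ={} & X^1\left(s^1\nabla_1(e_1) + s^2\nabla_1(e_2) + \cdots + s^n\nabla_1(e_n)\right) \\
& + X^2\left(s^1\nabla_2(e_1) + s^2\nabla_2(e_2) + \cdots + s^n\nabla_2(e_n)\right) \\
& + X^1\left((\partial_1 s^1)e_1 + (\partial_1 s^2)e_2 + \cdots + (\partial_1 s^n)e_n\right) \\
& + X^2\left((\partial_2 s^1)e_1 + (\partial_2 s^2)e_2 + \cdots + (\partial_2 s^n)e_n\right)
\end{aligned}$$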
Now the terms $\nabla_1(e_i)$ and $\nabla_2(e_i)$ are, for each $i \in [1:n]$, going to be
values of the section, and can therefore be expressed in terms of the basis
$(e_1, e_2, \ldots, e_n)$. In other words, for each $j \in [1:2]$ and for each
$i \in [1:n]$, at each point $v \in \mathbb{R}^2$, there is a collection of numbers $\Gamma^k_{i,j}(v)$
expressing $\nabla_j(e_i)$ as
$$\sum_{k \in [1:n]} \Gamma^k_{i,j}\, e_k$$
Which tells us that we can express $\nabla_X(s)$ as
$$X^1 \sum_{i,k \in [1:n]} s^i\, \Gamma^k_{i,1}\, e_k + X^2 \sum_{i,k \in [1:n]} s^i\, \Gamma^k_{i,2}\, e_k + X^1 \sum_{i \in [1:n]} (\partial_1 s^i)\, e_i + X^2 \sum_{i \in [1:n]} (\partial_2 s^i)\, e_i$$
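Collecting these up and changing the name of a summation index gives us:
$$\nabla_X(s) = \sum_{j \in [1:2],\; i \in [1:n]} X^j \Big(\partial_j s^i + \sum_{k \in [1:n]} s^k\, \Gamma^i_{k,j}\Big)\, e_i \tag{8.6.1}$$
or
$$\nabla_X(s) = X^j \left(\partial_j s^i + s^k\, \Gamma^i_{k,j}\right) e_i$$
in physics speak.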
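Equation 8.6.1 translates almost word for word into code, with the vector potential contracted against the components $s^k$ exactly as in the expanded sum that precedes it. The Python sketch below is an illustration under loud assumptions: the function names, the storage convention `gamma(p)[i][k][j]` for $\Gamma^i_{k,j}$ (zero-based in the code), the sample X and s, and above all the particular non-zero Γ are invented for this example and do not correspond to any connection discussed in the text. Partial derivatives are approximated by central differences.

```python
# Sketch of equation 8.6.1: nabla_X(s) = X^j (d_j s^i + s^k Gamma^i_{k,j}) e_i.
# All names, and the sample vector potential, are hypothetical.

def partial(f, p, i, h=1e-5):
    """Central-difference estimate of the i-th partial derivative of f at p."""
    lo, hi = list(p), list(p)
    lo[i] -= h
    hi[i] += h
    return (f(*hi) - f(*lo)) / (2 * h)

def covariant_derivative(X, s, gamma, p):
    """Components of nabla_X(s) at p; gamma(p)[i][k][j] stores Gamma^i_{k,j}."""
    xp, sp, g = X(*p), s(*p), gamma(p)
    n = len(sp)
    out = []
    for i in range(n):
        total = 0.0
        for j in range(2):  # the base space is R^2
            d_j_si = partial(lambda x, y, i=i: s(x, y)[i], p, j)
            coupling = sum(sp[k] * g[i][k][j] for k in range(n))
            total += xp[j] * (d_j_si + coupling)
        out.append(total)
    return out

def zero_gamma(p):
    """Vanishing vector potential: recovers plain componentwise differentiation."""
    return [[[0.0, 0.0] for _k in range(2)] for _i in range(2)]

def sample_gamma(p):
    """Made-up potential with Gamma^1_{2,1} = 1 and every other entry zero."""
    g = zero_gamma(p)
    g[0][1][0] = 1.0  # i = 1, k = 2, j = 1 in the text's one-based indexing
    return g

X = lambda x, y: (1.0, 2.0)
s = lambda x, y: (x * y, x + y)
p = (1.0, 2.0)

flat = covariant_derivative(X, s, zero_gamma, p)
twisted = covariant_derivative(X, s, sample_gamma, p)
print(flat, twisted)
```

With the vanishing potential the result is $(4, 3)$, which is just $X^j \partial_j s^i$; switching on the single entry $\Gamma^1_{2,1} = 1$ adds $X^1 s^2 = 3$ to the first component, giving $(7, 3)$. The count of $2n^2$ numbers at each point is visible in the shape of the nested list.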
Exercise 8.6.1. Find the expression for $\nabla_j(\partial_i)$ in terms of the standard
basis for vectors in $\mathbb{R}^2$. Now do it for polar coordinates.
The $2n^2$ functions $\Gamma^i_{k,j}$ from $\mathbb{R}^2$ to $\mathbb{R}$ (on $\mathbb{R}^2$; on $\mathbb{R}^m$ it would be $mn^2$) pretty
much tell us everything about the connection, given that the $X^j$ tell us about
the vector field X and the $\partial_j s^i$ tell us about differentiating the section. In
general they are $mn^2$ functions from the manifold of dimension m to $\mathbb{R}$,
and they tell us how the connection works on the bundle with fibre F of
dimension n. The collection of functions is called the Vector Potential for
the connection, or sometimes the Christoffel symbols. When the manifold
has the same dimension, n, as the fibre, there are $n^3$ such functions. The
text book prefers to use the term Christoffel symbols for the case when the
connection respects a Riemannian inner product.
Exercise 8.6.2. When we discussed the Euclidean connection for sections
of the tangent bundle, what were the $\Gamma^i_{k,j}$?
The significance of the vector potential term in equation 8.6.1 is not hard to
see. If we left it out, or equivalently insisted that all terms are zero, then in
the case of a vector field we would simply have the situation of the Euclidean
connection,
$$\nabla_X(Y) = \begin{bmatrix} \dfrac{\partial v^1}{\partial x} & \dfrac{\partial v^1}{\partial y} \\[1ex] \dfrac{\partial v^2}{\partial x} & \dfrac{\partial v^2}{\partial y} \end{bmatrix} \begin{bmatrix} u^1(x, y) \\ u^2(x, y) \end{bmatrix}$$
In order to work on $S^2$, this would have to survive a diffeomorphism, and by
an earlier exercise, it doesn’t.
Exercise 8.6.3. Take the usual vector fields on $\mathbb{R}^2 \setminus \{0\}$, $X = -y\,\partial_x + x\,\partial_y$
and $Y = x\,\partial_x + y\,\partial_y$. Compute $\nabla_X(Y)$ in cartesian form. Now find
expressions for the same vector fields in polar form. (You’d better get $X_P = \partial_\theta$
and $Y_P = r\partial_r$ and make sure you can prove these are correct, not just look
at the pictures!) Now use the rule for the Euclidean connection to calculate
$\nabla_{X_P}(Y_P)$. This had better not be the polar form of $\nabla_X(Y)$ or you have got
the wrong answer.
Exercise 8.6.4. Find a vector field Z on $\mathbb{R}^2 \setminus \{0\}$ which corresponds to the
polar field $\partial_r$. (That is, it consists of a unit vector radially outwards at each
point.) Calculate $\nabla_{\partial_r}(\partial_\theta)$ by translating it into cartesian coordinates to do
the sums and then translate back. What if you had chosen a different basis
for the tangent vectors and picked $r\partial_r$ instead? What is $\nabla_{r\partial_r}(\partial_\theta)$? Calculate
$\nabla_{\partial_\theta}(\partial_\theta)$, $\nabla_{\partial_\theta}(r\partial_r)$ and $\nabla_{r\partial_r}(r\partial_r)$ in the same way. Translate them all back
into polar coordinates.
Exercise 8.6.5. Explain why $r\partial_r$ is a better choice than $\partial_r$. (Hint: look at
the polar diffeomorphism.)
Exercise 8.6.6. Hence compute the vector potential for $\nabla_{X_P}(Y_P)$.
Exercise 8.6.7. Confirm that if you take the vector potential into account,
you get the right answer for $\nabla_{X_P}(Y_P)$.
Exercise 8.6.8. $\nabla_{\partial_x}(\partial_y)$ and the similar terms in the cartesian framework
are what you’d expect them to be, but the $\nabla_{\partial_i}(\partial_j)$ in polar coordinates
contain significant information. What are the numbers telling you?
Exercise 8.6.9. If you look at what you have been doing with the above
calculations, you can see that we have defined $\nabla_X(Y)$ in cartesian coordinates
on $\mathbb{R}^2$ (and hence by trivial modification on $\mathbb{R}^n$) and then proceeded to
take it on the subspace $\mathbb{R}^2 \setminus \{0\}$ by ignoring the deleted point. Then we
transferred it all to $S^1 \times \mathbb{R}^+$ by the diffeomorphism P for polar coordinates.
In order to compute $\nabla_{X_P}(Y_P)$, I rather took it for granted that it is to be
done by translating $X_P$ and $Y_P$ into cartesian form, doing it there, and then
translating the answer back into polar coordinates, which surely is the only
sane thing to do. If φ were any diffeomorphism from a subset $U \subseteq \mathbb{R}^2$ to
some other space, V , then what we are doing is taking a vector field X on
U to the vector field $\phi_* \circ X \circ \phi^{-1}$ on V , a vector field Y on U to the vector
field $\phi_* \circ Y \circ \phi^{-1}$ on V , and defining
$$\nabla_{\phi_* \circ X \circ \phi^{-1}}\left(\phi_* \circ Y \circ \phi^{-1}\right) = \phi_* \circ \nabla_X(Y) \circ \phi^{-1}$$
Show that this is a consistent way to export ∇ from one manifold to another
which is diffeomorphic to it, and hence explain why we can define a
connection on a manifold, and why ∇ is called covariant differentiation.
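The export recipe of Exercise 8.6.9 can be tried on something simpler than polar coordinates. The Python sketch below is an illustration only and assumes a shear, φ(x, y) = (x, y + x²), picked because its inverse and derivative can be written down by hand; the function names and the constant sample fields are likewise invented. It compares the exported value $\phi_* \circ \nabla_X(Y) \circ \phi^{-1}$ with what the Euclidean rule gives when applied naively to the transported fields on V.

```python
# Exporting the Euclidean connection along a diffeomorphism phi: U -> V.
# phi is a hypothetical shear; all names are chosen for this example only.

def partial(F, p, i, h=1e-5):
    """Central difference of a vector-valued F along coordinate i, at p."""
    lo, hi = list(p), list(p)
    lo[i] -= h
    hi[i] += h
    return [(a - b) / (2 * h) for a, b in zip(F(*hi), F(*lo))]

def euclidean_rule(X, Y, p):
    """The rule of equation 8.4.1 at p: the j-th component is X^i d_i(Y^j)."""
    xp = X(*p)
    cols = [partial(Y, p, i) for i in range(2)]
    return [sum(xp[i] * cols[i][j] for i in range(2)) for j in range(2)]

phi = lambda x, y: (x, y + x * x)       # the assumed diffeomorphism
phi_inv = lambda a, b: (a, b - a * a)   # its inverse

def dphi(p, v):
    """Apply the derivative of phi at p to the vector v (this is phi_*)."""
    x, _ = p
    return [v[0], 2 * x * v[0] + v[1]]

def push(X):
    """The transported field phi_* o X o phi^{-1} on V."""
    return lambda a, b: dphi(phi_inv(a, b), X(*phi_inv(a, b)))

X = lambda x, y: (1.0, 0.0)   # constant fields on U, so nabla_X(Y) = 0 there
Y = lambda x, y: (1.0, 0.0)

q = (0.5, 2.0)                # a point of V
p = phi_inv(*q)               # the corresponding point of U

exported = dphi(p, euclidean_rule(X, Y, p))       # phi_* o nabla_X(Y) o phi^{-1}
naive = euclidean_rule(push(X), push(Y), q)       # Euclidean rule used on V
print(exported, naive)
```

Here the exported value is $(0, 0)$ while the naive calculation gives $(0, 2)$. The discrepancy comes from the second derivative of φ, and it is exactly the kind of correction that the vector potential term in equation 8.6.1 is there to supply.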
8.6.1 Tensor formulation
The term $\Gamma^i_{k,j}$ looks very like a (2,1) tensor in coordinate terms. For the
Riemannian Inner product, what goes in at each point in the same tangent
space is a pair of vectors and what comes out is a number, and this is bilinear
and varies smoothly as we move around in the manifold. For the vector
potential, we have that the things going in are two vector fields, or at least a
vector in the tangent space at a point and a field (or possibly more general
section) defined in a neighbourhood of the point (so we can differentiate), and
what comes out is another vector field (or possibly more general section).
8.7 Concluding Remarks
This just starts on the subject of connections, which are crucial to much
differential geometry. For example, it is connections that have curvature. We
could show how a Riemannian Inner product (metric) leads to a connection,
the Levi-Civita connection, which is compatible with the metric. But there
is too much to fit into an introductory course unless I follow the tradition of
training you to say the right things with only minimal grasp of what they
mean, something I much prefer not to do.
