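$$
\begin{bmatrix}
D_1 f^1(a) & D_2 f^1(a) & \cdots & D_n f^1(a) \\
D_1 f^2(a) & D_2 f^2(a) & \cdots & D_n f^2(a) \\
\vdots & \vdots & \ddots & \vdots \\
D_1 f^n(a) & D_2 f^n(a) & \cdots & D_n f^n(a)
\end{bmatrix}
=
\left.
\begin{bmatrix}
\dfrac{\partial f^1}{\partial x^1} & \dfrac{\partial f^1}{\partial x^2} & \cdots & \dfrac{\partial f^1}{\partial x^n} \\
\dfrac{\partial f^2}{\partial x^1} & \dfrac{\partial f^2}{\partial x^2} & \cdots & \dfrac{\partial f^2}{\partial x^n} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial f^n}{\partial x^1} & \dfrac{\partial f^n}{\partial x^2} & \cdots & \dfrac{\partial f^n}{\partial x^n}
\end{bmatrix}
\right|_{x=a}
$$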
Usually I shan't bother to distinguish between the linear map and its matrix representation. You know how to compute this matrix if it should be absolutely necessary, and you should understand that the linear map is the linear part of the best affine approximation to $f$ at $a$. Note that I have used $f^i$ for the $n$ component functions of $f$ and $x^i$ for the components of a vector in $\mathbb{R}^n$. I shall explain this notation later.
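To make the phrase "linear part of the best affine approximation" concrete, here is a minimal numerical sketch. The map `f`, the base point `a` and the helper names are all my own illustrative choices, not anything from the text, and the matrix is only estimated by central differences. The point it suggests is that the error of $f(a) + Df(a)(p - a)$ shrinks like the square of the distance from $a$.

```python
import math

def f(p):
    """Illustrative map R^2 -> R^2 (an assumption, not taken from the text)."""
    x, y = p
    return (math.sin(x) + y * y, x * y)

def jacobian(func, a, h=1e-6):
    """Central-difference estimate of the matrix with entries D_j f^i(a)."""
    n = len(a)
    m = len(func(a))
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        up = list(a)
        up[j] += h
        dn = list(a)
        dn[j] -= h
        f_up, f_dn = func(up), func(dn)
        for i in range(m):
            J[i][j] = (f_up[i] - f_dn[i]) / (2 * h)
    return J

def affine_approx(func, a, J, p):
    """Evaluate f(a) + Df(a)(p - a)."""
    fa = func(a)
    d = [pi - ai for pi, ai in zip(p, a)]
    return tuple(fa[i] + sum(J[i][j] * d[j] for j in range(len(a)))
                 for i in range(len(fa)))

def approx_error(func, a, J, step):
    """Largest component error of the affine approximation at a + (step, ..., step)."""
    p = [ai + step for ai in a]
    exact = func(p)
    approx = affine_approx(func, a, J, p)
    return max(abs(e - q) for e, q in zip(exact, approx))

a = [0.5, -1.0]
J = jacobian(f, a)
err_big = approx_error(f, a, J, 1e-1)     # step 0.1
err_small = approx_error(f, a, J, 1e-2)   # step 0.01
```

Shrinking the step by a factor of ten should shrink the error by roughly a factor of a hundred, which is what "best affine approximation" amounts to in practice.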
An important point about smooth curves needs to be considered:
Figure 2.1.1: A smooth curve.
Exercise 2.1.1. Figure 2.1.1 shows two line segments joined together. The horizontal one is the set of points in $\mathbb{R}^2$ with $y = 0$ and $0 \le x \le 1$, and the vertical one is the set of points in $\mathbb{R}^2$ with $x = 1$ and $0 \le y \le 1$.
Show that there is a continuous but non-differentiable function from $[0, 2]$ to $\mathbb{R}^2$ which traces the curve formed by the two segments from the origin to the point $(1, 1)^T$.
Show that there is a differentiable function from $[0, 2]$ to $\mathbb{R}^2$ which does the same job.
Exercise 2.1.2. Show that $[-1, 1]$ is the image of $[-1, 1]$ by a continuous bijection which is not differentiable.
The conclusion you should draw from this is that you cannot decide if a curve
is smooth or not merely by looking at the image!
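A small numerical sketch of this conclusion, under my own choice of parametrisations (neither is given in the text, and other choices work equally well): both functions below trace the same corner, but one arrives at the corner with unit speed from the left and leaves with unit speed upwards, while the other slows to a halt there. The helper compares left and right difference quotients at the corner time $t = 1$.

```python
def corner_constant_speed(t):
    """Unit-speed trace of the two segments: along y = 0, then up x = 1."""
    return (t, 0.0) if t <= 1.0 else (1.0, t - 1.0)

def corner_slowing(t):
    """Same image, but the moving point slows to a halt at the corner."""
    if t <= 1.0:
        return (1.0 - (1.0 - t) ** 2, 0.0)
    return (1.0, (t - 1.0) ** 2)

def one_sided_velocities(curve, t0, h=1e-6):
    """Left and right difference quotients of the curve at t0."""
    p0 = curve(t0)
    before = curve(t0 - h)
    after = curve(t0 + h)
    left = tuple((a - b) / h for a, b in zip(p0, before))
    right = tuple((b - a) / h for a, b in zip(p0, after))
    return left, right

left_a, right_a = one_sided_velocities(corner_constant_speed, 1.0)
left_b, right_b = one_sided_velocities(corner_slowing, 1.0)
```

For the first curve the two one-sided velocities disagree, so no derivative exists at the corner; for the second they both tend to zero, so the same image is traced differentiably.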
2.2 Smooth Manifolds
Definition 2.2.1. A chart on a topological space $X$ is a homeomorphism from some open subset of $X$ onto an open subset of $\mathbb{R}^n$. I shall call the inverse of such a homeomorphism a local parametrisation.
We show a typical local parametrisation map from a rectangular neighbourhood of the origin in $\mathbb{R}^2$ to a region on the surface in figure 2.2.1. A chart can be used to give coordinates for points of the space, at least some of them. Different charts will, of course, give different coordinates in general to those points on which the domains overlap.
Definition 2.2.2. Two charts on a space $X$, $f : U \to \mathbb{R}^n$ and $g : V \to \mathbb{R}^n$, are smoothly compatible iff the maps $f \circ g^{-1}$ and $g \circ f^{-1}$ are infinitely differentiable wherever they are defined.
14 CHAPTER 2. SMOOTH MANIFOLDS AND VECTOR FIELDS
Figure 2.2.1: A local coordinate map.
Figure 2.2.2: Two charts.
In other words, the composite map $f \circ g^{-1}$ must have partial derivatives of all orders at every point of the domain, and the same is true of the inverse map.
If $U$ and $V$ have empty intersection then this holds vacuously. If they do have an intersection, then $f \circ g^{-1}$ has domain and codomain some open subsets of $\mathbb{R}^n$ and is certainly continuous. It makes sense to demand that this map be smooth, that is, infinitely differentiable. The picture of figure 2.2.2 may help.
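As a concrete instance, here is a sketch using two stereographic charts on the circle $S^1$. The charts and their names are my own illustrative choice and are not taken from the text. The chart `chart_f` projects from the point $(0, 1)$ and `chart_g` from $(0, -1)$; on the overlap the transition map $f \circ g^{-1}$ works out to $u \mapsto 1/u$ for $u \neq 0$, which is infinitely differentiable there.

```python
def chart_f(p):
    """Stereographic projection from (0, 1); defined on the circle minus (0, 1)."""
    x, y = p
    return x / (1.0 - y)

def chart_g(p):
    """Stereographic projection from (0, -1); defined on the circle minus (0, -1)."""
    x, y = p
    return x / (1.0 + y)

def chart_g_inverse(u):
    """Local parametrisation inverse to chart_g."""
    d = 1.0 + u * u
    return (2.0 * u / d, (1.0 - u * u) / d)

def transition(u):
    """The composite f o g^{-1}, defined wherever u != 0."""
    return chart_f(chart_g_inverse(u))

samples = [-3.0, -0.5, 0.25, 2.0]
values = [transition(u) for u in samples]
```

Comparing `values` with `1/u` at each sample is a quick sanity check that the two charts are smoothly compatible in the sense of Definition 2.2.2.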
Definition 2.2.3. A smooth atlas for a space $X$ is a collection of smoothly compatible charts such that every point of $X$ is in the domain of at least one chart. Such an atlas is maximal iff every possible (smoothly compatible) chart is in it.
Definition 2.2.4. A smooth $n$-manifold is a Hausdorff topological space together with a maximal atlas of smoothly compatible charts. The atlas is said to define a smooth differential structure on $X$.
The reason for wanting the atlas to be maximal is just so that anyone wandering in with a new local coordinate map can't cause us trouble. Either it is compatible with our atlas, in which case we already have it, or it is not, in which case it may be part of a different differential structure for the manifold.
Exercise 2.2.1.
1. Show that $S^1$ and $S^2$ as usually defined are smooth manifolds.
2. Show that the flat torus obtained by gluing opposite edges of a square is also a smooth manifold.
3. Show that $S^n$ is a smooth manifold for any $n \in \mathbb{Z}^+$. Hint: use the Implicit Function Theorem.
(Generic hint: you don't need to have many charts. Enough to cover the manifold will do, then just add the instruction to fill up with all other possible smoothly compatible charts.)
Exercise 2.2.2. Construct a definition of an orientable manifold.
2.3 Smooth maps and tangent vectors
Now we have enough to say what it means for a map $f : X \to Y$ to be smooth when $X$ and $Y$ are smooth manifolds:
Definition 2.3.1. A map $f : X \to Y$ between smooth manifolds is differentiable when $g \circ f \circ h^{-1}$ is differentiable for all charts $h$ on $X$ and all charts $g$ on $Y$ belonging to the differential structures.
The diagram of figure 2.3.1 gives the idea.
We can define higher order differentiability in the same way and we can say that a map $f : X \to Y$ is smooth whenever all composites $g \circ f \circ h^{-1}$ are smooth for all charts $h$ on $X$ and all charts $g$ on $Y$.
Exercise 2.3.1. Show that if $f : X \to Y$ has composite $g \circ f \circ h^{-1}$ differentiable at some point $a$ in $X$ then it is differentiable in any other pair of charts containing $a$, $f(a)$.
Figure 2.3.1: A smooth map.
Figure 2.3.2: Some tangent curves in a manifold.
Note that although we can say that $f$ is differentiable, we cannot provide a derivative, since this will generally be different in different charts. If we move away from simple linear spaces we must pay the price: there is no longer a best affine approximation because affine maps don't make sense between manifolds in general.
We can however say when two maps from $\mathbb{R}$ into a manifold $X$ are tangent. Let $f, g : (-1, 1) \to X$ be smooth maps into a manifold and without loss of generality let $f(0) = g(0) = a \in X$. Then we can say that $f$ and $g$ are tangent at $a$ iff the derivatives of $h \circ f$ and $h \circ g$ at 0 are the same for any chart $h : U \to \mathbb{R}^n$ where $f(0) = g(0) = a \in U$. If they are the same in any one chart they must be the same in any other.
Exercise 2.3.2. Prove the last remark.
Exercise 2.3.3. Show that tangency is an equivalence relation on the set of
maps from R to X, and that we can do the same thing with maps from X
to R.
We can take a tangency equivalence class of maps from R to the manifold
X, and regard it as an object in its own right. The picture 2.3.2 shows some
members of a tangency equivalence class.
The curves can be thought of as the trajectories of moving points, and they are all moving through the point $a$ at the same speed, and in the same direction, although we cannot give the direction a particular vector to specify it, and the speed may also be different in different charts.
Definition 2.3.2. A tangency equivalence class at a point $a$ in a manifold $X$ is called a tangent vector at $a$ in $X$.
Remark 2.3.1. Watching the faces of students in class when giving this definition is a real treat. The look of stark horror and incomprehension is very encouraging, as it proves that some at least are listening. A small amount of imagination, however, goes a long way to making this definition quite reasonable.
Suppose that the North pole has been cleared of snow and turned into a skating rink for penguins¹, and the North pole itself is marked by a flashing red light. There are two space craft hovering up there, call them A and B. In space craft A, an astronaut leans out and takes a photograph of the region around the north pole. Suppose for simplicity he is directly above the north pole so his photograph, when enlarged, is a disc as in figure 2.3.4. Astronaut B is somewhere over Russia and he also takes a photograph of what he can see.
¹ It has been pointed out to me that there are no penguins at the North pole, only at the South pole. On the other hand there isn't a skating rink at the North pole either. So if we are going to make a skating rink we might as well import the penguins.
Figure 2.3.3: Penguins (imported from the antarctic) skating.
Now each astronaut looks at his photograph and lays it out flat and enlarges it to a nice size, and each marks on a coordinate grid using a ruler and pen, and so each has a chart of a bit of the polar regions, with the north pole in the domain of each chart. If both put the origin in the centre, astronaut A will have the flashing red light at the origin, and astronaut B will have a negative $x$ coordinate for the red light if he puts his coordinates on the chart in the way suggested by the diagram. I regard the chart as both the bit of the earth the astronaut can see, and also the process of turning it into a flat picture with a coordinate grid on it. Call them $u$ and $v$ for the maps and $U$ and $V$ for the domains of the maps back on earth.
I claim it makes sense to talk of a penguin skating over the north pole as having a velocity vector as it passes through the north pole. Each astronaut can plot the position of the green penguin in his chart, and each will agree if the curve is differentiable. Note that if $g : (-1, 1) \to S^2$ describes the green penguin then astronaut A will plot the green curve at the top of the picture and will be able to give it a perfectly respectable velocity on his chart relative to the cartesian coordinates marked on the chart. Similarly astronaut B can do the same. The problem is that they will have, usually, different estimates of what the velocity is. If B is much higher up in space, his scale will be such that the penguins will seem to be moving more slowly, for instance.
Figure 2.3.4: Two penguins skating under the watchful eyes of two astronauts.
Does this mean my claim that we can assign a meaningful velocity vector to the penguin is just nonsense? No, for if $b$ is the blue penguin, also skating over the north pole at the same time as the green penguin (and mysteriously not knocking over the first penguin: maybe they are ghost penguins and can occupy the same space), it certainly makes sense to say if they are travelling in the same direction at the same speed. A penguin cutting across the path would obviously be travelling in a different direction, and a really slow penguin would be slower for both astronauts, and the fast penguins would pass through it. So I claim that the penguin velocity is a real thing which exists at the penguin level if not at the astronaut level. But if one astronaut said that the blue penguin and the green penguin had the same velocity at the instant they went through the pole, the other astronaut would agree even though disagreeing as to the actual value of the vector in both direction and speed, these things being properties of the charts, not the penguins.
The reason this happens is that there are two things going on here. The actual velocity at the north pole is a real thing, penguins are actually moving, and either they pass through the north pole at the same time in the same direction at the same speed or they don't. But attempts by the two astronauts to describe the penguin motion to each other with numbers involve inventing coordinate systems which are bits of language. So the numerical value of any vector is dependent on the language. But the fact that different languages agree on whether two penguins have the same velocity tells you that the velocity is real. It exists independent of the coordinate system, provided the two coordinate systems are related by a diffeomorphism. So there are moving penguins and there is language, and the penguins will have the same velocity at the pole or they won't, and this is true no matter what language you use to talk about it unless your language is really weird.
The problem then is to say what a velocity vector is given that any pair of astronauts can disagree about the actual numbers. And the most elegant solution is to say that it is what all the penguin trajectories, real and potential, have in common. And what they have in common is that every observer will agree that they pass through the north pole in the same direction at the same speed. This is the tangency equivalence class.
Note that I have assumed that all observers use synchronised clocks so they all agree that the time at which the penguins hit the north pole is time zero. This doesn't have to be the case either. They will all agree on the simultaneity of the events, whatever time they claim they occur. This is because two penguins either meet or they don't, and this is not a matter of language but of fact.
The ghost-penguins are negotiable. Having a nice vivid picture of some sort is essential: you should be prepared to invent your own, but this time you may borrow my penguins if they help. If I give you more definitions like this, it is your job to supply the penguins, or whatever it takes.
Exercise 2.3.4. Show that the claim that the two astronauts would agree if two penguins have the same velocity at the north pole is true provided that $u \circ v^{-1}$ and $v \circ u^{-1}$ are both differentiable.
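The penguin story can be tried out numerically. The sketch below is an illustration, not a proof, and every specific in it is my own assumption: astronaut A's chart `chart_u` is taken to be vertical projection onto the $xy$-plane, astronaut B's chart `chart_v` is a stereographic projection that is then rotated and shifted, and the three penguin trajectories are invented curves through the north pole $(0, 0, 1)$ at time 0.

```python
import math

def chart_u(p):
    """Astronaut A (assumed): project the upper hemisphere straight down."""
    x, y, z = p
    return (x, y)

def chart_v(p):
    """Astronaut B (assumed): stereographic projection, rotated 30 degrees, shifted."""
    x, y, z = p
    sx, sy = x / (1.0 + z), y / (1.0 + z)
    c, s = math.cos(math.pi / 6), math.sin(math.pi / 6)
    return (c * sx - s * sy - 0.3, s * sx + c * sy)

def green(t):
    """Green penguin: a great circle through the pole."""
    return (math.sin(t), 0.0, math.cos(t))

def blue(t):
    """Blue penguin: a different curve, same velocity at the pole."""
    r = math.sqrt(t * t + t ** 4 + 1.0)
    return (t / r, t * t / r, 1.0 / r)

def crossing(t):
    """A penguin cutting across the path of the other two."""
    return (0.0, math.sin(t), math.cos(t))

def velocity_in_chart(chart, curve, h=1e-5):
    """Central-difference derivative of chart o curve at t = 0."""
    a, b = chart(curve(h)), chart(curve(-h))
    return tuple((p - q) / (2 * h) for p, q in zip(a, b))

def same_velocity(chart, c1, c2, tol=1e-6):
    """Do the two curves have the same derivative at 0 in this chart?"""
    v1 = velocity_in_chart(chart, c1)
    v2 = velocity_in_chart(chart, c2)
    return all(abs(p - q) < tol for p, q in zip(v1, v2))

green_seen_by_A = velocity_in_chart(chart_u, green)
green_seen_by_B = velocity_in_chart(chart_v, green)
```

The two astronauts report different numbers for the green penguin, yet `same_velocity` gives the same verdict in both charts for each pair of penguins, which is the content of the exercise.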
Remark 2.3.2. There is, of course, a simpler way of defining tangent vectors on $S^2$. It is usually viewed as a subspace of $\mathbb{R}^3$, so a curve on $S^2$ is also a curve in $\mathbb{R}^3$ and we can define velocities on $S^2$ as tangent vectors in $\mathbb{R}^3$ in the sense of the derivatives of maps from $(-1, 1)$ to $\mathbb{R}^3$ which, for tangent vectors at a particular point, happen to lie in a plane in $\mathbb{R}^3$ which is tangent to $S^2$ at that point. This certainly removes some tricky conceptual problems but at the expense of making tangent vectors extrinsic rather than intrinsic to the space. The whole thrust of the text book is towards using intrinsic ideas for the very good reason that we live in a 3-manifold and cannot form any useful idea of an embedding of it in some higher dimensional space.
The next proposition tells us that the set of tangency equivalence classes at a fixed point $a$ in a manifold forms a vector space, the tangent space at $a$.
Proposition 2.3.1. The set of tangent vectors at a point $a$ of a smooth $n$-manifold $X$ forms a real vector space of dimension $n$.
Figure 2.3.5: The sum of tangent vectors.
Proof:
We have to produce sensible rules for adding and scaling tangent vectors. Then we have to show that the result satisfies the axioms for a real vector space. Suppose we have a tangency equivalence class $[v]$ and that $v$ is an element of it, that is, a curve $v : \mathbb{R} \to X$ with $v(0) = a$, and in any chart $w : W \to \mathbb{R}^n$ with $a \in W$ there is some derivative of $w \circ v$. Then we can scale the function $v$ by a scalar $k \in \mathbb{R}$ to get $v(kt)$ instead of $v(t)$ for $t \in \mathbb{R}$ and the derivative of $w \circ v$ will also be scaled by the factor $k$. This will be the same scaling in any chart, so it makes sense to call this new function $kv$. This has its own tangency equivalence class, $[kv]$.
It would not make a difference if we had chosen another function $v' \in [v]$, $kv'$ […]
$$\pi : TM \to M$$
where $M$ is the manifold, $TM$ is the tangent bundle and $\pi$ is the projection which sends a tangent vector to the point in the manifold to which it is attached.
Now we have described the tangent bundle as a union of all the tangent spaces $T_a(M)$ for $a \in M$ but that does not specify a topology on it. To do that we say a subset $U$ of $TM$ is open iff the projection $\pi(U)$ is open in $M$ and the intersection of $U$ with any fibre is open in the fibre. Since the fibres are all real vector spaces we can give them the usual topology, obtained from an isomorphism with $\mathbb{R}^n$.
Definition 2.3.3. The tangent bundle to a smooth manifold $M$ is the set
$$\bigcup_{a \in M} T_a(M)$$
with the topology specified by saying $U \subseteq TM$ is open whenever $\pi(U)$ is open in $M$ and for every $a \in \pi(U)$, $U \cap T_a(M)$ is open in $T_a(M)$, where $T_a(M)$ has a topology induced by any isomorphism with $\mathbb{R}^n$.
Note that this assumes that any two isomorphisms with $\mathbb{R}^n$ will induce the same topology.
Exercise 2.3.9.
1. Show that a linear map from $\mathbb{R}^n$ to $\mathbb{R}^m$ is continuous iff it is continuous at the origin.
2. Show that any linear map from $\mathbb{R}^n$ to $\mathbb{R}^m$ is continuous.
3. Show that any isomorphism from $\mathbb{R}^n$ to itself is a homeomorphism.
Now for some formal definitions:
Definition 2.3.4. A fibre bundle is a quartet $(E, B, F, \pi)$ where $E$ is called the total space, $B$ is called the base space, $\pi : E \to B$ is a continuous map called the projection and for every $b \in B$, $\pi^{-1}(b)$ is homeomorphic to $F$. The spaces $\pi^{-1}(b)$ are called the fibres of the bundle.
Definition 2.3.5. A fibre bundle is called locally trivial iff for every $b \in B$ there is an open set $U \subseteq B$ containing $b$ such that $\pi^{-1}(U)$ is homeomorphic to $U \times F$.
The bundle $B \times F$ is called a trivial bundle.
Exercise 2.3.10. Describe clearly the trivial bundle with base space $S^2$ and fibre $S^1$ and give an example of a non-trivial bundle with the same base and fibre. Hint: you might find it easier if you specify some gluings.
Exercise 2.3.11. Show that the tangent bundle of a smooth manifold is
locally trivial.
Exercise 2.3.12. Show there is a natural atlas on the tangent bundle which
makes it a smooth manifold. Is the bundle projection smooth?
Note that for a locally trivial fibre bundle a topology on the bundle must have as base the cartesian product of sets which are open in $B$ (and over which the bundle is locally trivial) with open sets in the fibre.
Definition 2.3.6. A section of a fibre bundle $E$ with projection $\pi$ to base space $B$ is a map $s : B \to E$ such that $\pi \circ s$ is the identity on $B$.
Definition 2.3.7. A vector field on a manifold $M$ is a section of the tangent bundle $TM$.
You should be able to see that this makes sense and we can talk about continuous, differentiable and smooth vector fields according as the section (which is after all a map) is continuous, differentiable or smooth.
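A toy sketch of the two definitions on $\mathbb{R}^2$, with a point of the tangent bundle stored as a pair (base point, arrow). This representation, the names `projection` and `rotation_field`, and the particular field are my own assumptions made for illustration.

```python
def projection(tangent_vector):
    """pi: send a pair (base point, arrow) to its base point."""
    base, _arrow = tangent_vector
    return base

def rotation_field(p):
    """A section of the tangent bundle of R^2: attach the arrow (-y, x) at (x, y)."""
    x, y = p
    return (p, (-y, x))

points = [(0.0, 0.0), (1.0, 0.0), (0.5, -2.0)]
# A section must satisfy pi(s(p)) = p at every point p.
is_section = all(projection(rotation_field(p)) == p for p in points)
```

The only requirement checked is that following the section by the projection returns the point you started from; smoothness of the field is a separate question about how the arrow varies with the point.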
Exercise 2.3.13.
1. Draw a vector field on $\mathbb{R}^2$ which is nice and easy and write it as a section of the tangent bundle.
2. Show that the tangent bundle for $S^2$ is not trivial. Use the hairy ball theorem which says that any continuous vector field on $S^2$ must have at least one place where the vector is of length zero.
2.4 Notation: Vector Fields
On $\mathbb{R}^2$, I can write the tangent bundle as $\mathbb{R}^2 \times \mathbb{R}^2$, which is mildly useful for thinking about the meaning but not standard and not particularly useful for computations. I shall extend this to talking about the standard basis for $\mathbb{R}^2$ and call it $(e_1, e_2)$. A vector field on $\mathbb{R}^2$ is an assignment to each point of $\mathbb{R}^2$ of a vector, and if it is a smooth vector field this vector changes smoothly as we move around in $\mathbb{R}^2$. So there is a tangent vector with two components, both of which depend smoothly on $x$ and $y$, and hence it is given by a pair of functions $P(x, y)$, $Q(x, y)$. We might write the vector field as
$$P(x, y)\, e_1 + Q(x, y)\, e_2$$
but we don't. We write it as
$$P(x, y) \frac{\partial}{\partial x} + Q(x, y) \frac{\partial}{\partial y}$$
This notation takes a bit of explaining.
If we have a smooth function $f : \mathbb{R}^2 \to \mathbb{R}$ and a smooth vector field on $\mathbb{R}^2$ we can take the directional derivative of $f$ at any point in the direction of the vector field at the point, and multiply it by the length of the vector. This will give us a new smooth function on $\mathbb{R}^2$. This means that such a vector field can be thought of as an operator on the space of smooth functions from $\mathbb{R}^2$ to $\mathbb{R}$, which is usually written as $C^\infty(\mathbb{R}^2)$. The constant vector field which assigns the vector $e_1$ to every point of $\mathbb{R}^2$ can easily be seen to be the operator $\partial/\partial x$ and similarly the orthogonal constant vector field which assigns $e_2$ is the operator $\partial/\partial y$. This explains the notation for vector fields on $\mathbb{R}^2$ and by an obvious extension we can write a vector field $v$ on $\mathbb{R}^n$ as
$$\sum_{i \in [1:n]} v^i(x) \frac{\partial}{\partial x^i}$$
where each $v^i$ is a function from $\mathbb{R}^n$ to $\mathbb{R}$ and where I called $v^1$ the function $P(x, y)$ and $v^2$ the function $Q(x, y)$ when $n = 2$.
Since the procedure for interpreting a vector field as an operator on $C^\infty(\mathbb{R}^2)$ is local, a vector field on a manifold $M$ is an operator on $C^\infty(M)$, although there is an issue involved in choosing a basis for each tangent space $T_a(M)$ if we wish to do calculations.
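The operator point of view can be sketched in a few lines. The helper `apply_field` below is hypothetical (my own name, not a library call): it turns a field with components $(P, Q)$ and a function $f$ into the new function $P\,\partial f/\partial x + Q\,\partial f/\partial y$, with the partial derivatives estimated by central differences. The particular field and the functions `f` and `g` are illustrative assumptions.

```python
import math

def apply_field(field, f, h=1e-5):
    """Return the function v(f) = sum_i v^i * (df/dx^i), estimated numerically."""
    def vf(p):
        comps = field(p)
        total = 0.0
        for i, vi in enumerate(comps):
            up = list(p)
            up[i] += h
            dn = list(p)
            dn[i] -= h
            total += vi * (f(up) - f(dn)) / (2 * h)
        return total
    return vf

def field(p):
    """Components (P, Q) = (-y, x), i.e. the field -y d/dx + x d/dy."""
    x, y = p
    return (-y, x)

def f(p):
    x, y = p
    return x * x + 3.0 * y

def g(p):
    x, y = p
    return math.sin(x) * y

vf = apply_field(field, f)   # by hand: v(f) = -2xy + 3x
vg = apply_field(field, g)
```

Because the operator is built from first derivatives, it should also satisfy the product rule $v(fg) = f\,v(g) + g\,v(f)$, which is easy to check at a sample point.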
This gives two quite different ways of looking at a vector field on a smooth manifold. We have the tangency equivalence classes which we may think of as little arrows, each selected by a section of the tangent bundle. This is a quite straightforward transfer of ideas from $\mathbb{R}^n$ and should seem natural and reasonable once you have come to terms with the problem of having to say everything via charts. But the other way is to think of a vector field as an operator
$$v : C^\infty(\mathbb{R}^n) \to C^\infty(\mathbb{R}^n)$$
with $vf$ the map
$$\sum_{i \in [1:n]} v^i(x) \frac{\partial f}{\partial x^i}.$$
This can be compressed into
$$v = \sum_{i \in [1:n]} v^i \frac{\partial}{\partial x^i}$$
An even more compact form is
$$v = \sum_{i \in [1:n]} v^i \partial_i$$
We can make this even terser by using the Einstein Summation Convention, which is that if an index is repeated as a superscript and a subscript then we automatically sum over the possible values. This gives us
$$v = v^i \partial_i$$
where you have to know what the space is in which we are working to know how many $i$'s there are. For some reason physicists prefer to use Greek letters as indices, which means that you are likely to find expressions such as
$$v = v^\mu \partial_\mu$$
instead. I fear that you will have to get used to this as the textbook is committed to it.
This leads to a new definition of a vector field on a smooth manifold $M$. First we define $C^\infty(M)$ […]
$$f\begin{pmatrix} x \\ y \end{pmatrix} = f(0) + \left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right]\begin{pmatrix} x \\ y \end{pmatrix} + a x^2 + 2bxy + c y^2$$
where $a$, $b$, $c$ are second order partial derivatives of $f$ evaluated at some point between the origin and $(x, y)^T$ (and hence, we have to admit, depend upon $x$ and $y$). This is just the Taylor expansion with Lagrange form of the remainder in two dimensions.
Now apply $v$ to $f$ to get a new function $g$: then $g(0)$ must be the limit of $g((x, y)^T)$ as $(x, y)^T \to 0$, as $g$ is certainly continuous, and show that since $v$ is linear, $g((x, y)^T)$ must be the sum of the action of $v$ on the above three terms in a neighbourhood of the origin, that $v$ takes the constant first term to zero, and that since $v$ satisfies the Leibnitz condition, $g(0)$ must be
$$\left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right]\begin{pmatrix} u \\ v \end{pmatrix}$$
for some $(u, v)^T \in \mathbb{R}^2$. Finally show that if it works on $\mathbb{R}^2$ it must work on $\mathbb{R}^n$ and also on any smooth manifold.
We can now define $\mathrm{Vect}(M)$ or $\mathfrak{X}(M)$ as the set of all vector fields on the smooth manifold $M$.
Exercise 2.4.4. Show that $\mathrm{Vect}(M)$ (or $\mathfrak{X}(M)$) is a real vector space. Show that it is a module over $C^\infty(M)$ […] except that […] $C^\infty(\mathbb{R}^n)$, is finite dimensional and has the obvious basis.
2.5 Cotangent Bundles
I mentioned earlier that we could do the business of equivalence classes of maps from the manifold to $\mathbb{R}$ in exactly the same way as we took maps from $\mathbb{R}$ to the manifold. If we do this we get an exact parallel and a tangency equivalence class of such maps at a point is called a cotangent or covector at the point. Somewhat easier is to define the space of cotangents at $a \in X$ for a smooth manifold $X$ as the dual space of $T_a(X)$. Recall that the dual (vector) space for a space $V$ is the space $V^*$ […] $(e^1, e^2)$ where $e^1$ is the linear map from $\mathbb{R}^2$ to $\mathbb{R}$ which projects everything onto the first component and $e^2$ projects everything onto the second component. But we actually call them $dx$, $dy$ to be loosely consistent with the classical notation. So we interpret $dx$ as the linear map which takes $(x, y)^T$ to $x$ where $(x, y)^T$ is a point in the tangent space $T_a(\mathbb{R}^2)$ at some point $a$. Similarly for $dy$. So a differential 1-form or covector field on $\mathbb{R}^2$ is written
$$P(x, y)\, dx + Q(x, y)\, dy$$
The generalisation to $\mathbb{R}^n$ is of course
$$\sum_{i \in [1:n]} \omega_i(x)\, dx^i \qquad (\text{or } \omega_i\, dx^i \text{ using the Einstein summation convention})$$
and this, for smooth functions $\omega_i$, $i \in [1:n]$, represents a covector field or differential 1-form on $\mathbb{R}^n$. The preference for letters towards the end of the Greek alphabet to denote differential forms is widespread so again you ought to get used to it. The subscripts instead of superscripts for indices tell you something about the covariance or contravariance of the entities. I shall explain this properly shortly.
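Since a covector at a point is a linear map on the tangent space there, a covector field can be fed a vector field to produce an ordinary function, $\omega_i v^i$ in the summation convention. The sketch below spells this out on $\mathbb{R}^2$; the helper name `pair` and the example fields are my own assumptions.

```python
def pair(one_form, field):
    """Return the function p -> sum_i omega_i(p) * v^i(p)."""
    def paired(p):
        return sum(w * v for w, v in zip(one_form(p), field(p)))
    return paired

def dx(p):
    """The 1-form dx: components (1, 0) at every point."""
    return (1.0, 0.0)

def omega(p):
    """The 1-form -y dx + x dy, stored as its components (omega_1, omega_2)."""
    x, y = p
    return (-y, x)

def rotation(p):
    """The vector field -y d/dx + x d/dy."""
    x, y = p
    return (-y, x)

def radial(p):
    """The vector field x d/dx + y d/dy."""
    x, y = p
    return (x, y)

on_rotation = pair(omega, rotation)((3.0, 4.0))   # x^2 + y^2 at (3, 4)
on_radial = pair(omega, radial)((3.0, 4.0))       # -yx + xy at (3, 4)
```

Here `dx` simply reads off the first component of whatever arrow it is given, in line with the description of $dx$ as the projection onto the first component.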
If you wonder why on earth anybody bothers to distinguish between vector fields and differential 1-forms, one answer is that it is natural to differentiate $k$-forms to get $(k+1)$-forms for $k \in \mathbb{N}$. This is what Stokes' theorem is really all about. As you ought to have learnt in second year but probably didn't.
2.6 The Tangent Functor
Suppose $f : X \to Y$ is a differentiable map between manifolds. Then for the case where $X = \mathbb{R}^n$ and $Y = \mathbb{R}^m$ there is a map between the tangent spaces at each point which takes the tangent space at $a \in X$ to the tangent space at $f(a) \in Y$. To take a tangent vector $v_a$ in the tangent space $T_a(X)$ to one in the tangent space $T_{f(a)}(Y)$ all we have to do is to operate on it by $Df(a)$, which is by definition a linear map and has the right dimensions for domain and codomain. If we are prepared to choose a basis for $T_a(X)$ and $T_{f(a)}(Y)$ we could represent $Df(a)$ by a matrix, and there is a perfectly sensible way of choosing the same basis for tangent spaces over different points. All this makes sense even if $X$ and $Y$ are just finite dimensional real vector spaces without the extra structure of $\mathbb{R}^n$. In fact it makes sense in arbitrary Banach spaces.
Of course, there is a slight problem of how to extend this to manifolds which are not Banach spaces. Spheres and tori spring to mind.
If we take $v_a$, and recall that it is a tangency equivalence class of curves $v : (-1, 1) \to X$ taking 0 to $a$, then $f \circ v$ is a curve through $f(a)$ and it specifies a tangency class. Moreover if $v'$ is tangent to $v$ at $a$ then $f \circ v'$ is tangent to $f \circ v$ at $f(a)$.
Exercise 2.6.1. Most of this should have been a second year exercise but probably wasn't. Do it now and all about tangent vectors and maps will be clear. Well, clearer.
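1. Let $f : \mathbb{R}^2 \to \mathbb{R}^2$ be defined by
$$\begin{pmatrix} x \\ y \end{pmatrix} \mapsto \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} x^2 + x + y + y^2 \\ 1 + xy \end{pmatrix}$$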
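[…] $\begin{pmatrix} t \\ 0 \end{pmatrix}$ […] $\begin{pmatrix} 0 \\ t \end{pmatrix}$ […] $\begin{pmatrix} 1/10 \\ 0 \end{pmatrix}$ and $f\begin{pmatrix} 0 \\ 1/10 \end{pmatrix}$ […] $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$.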
4. Evaluate the above matrix on the tangent vector $e_1$.
5. Evaluate the above matrix on the tangent vector $e_2$.
6. Map the two tangent vectors obtained by the last two jobs on the same graph.
7. Represent the tangent vector $e_1$ by any curve $c_1$ in the tangency equivalence class and compose with $f$. Differentiate to find a linear representative of $Tf(0, e_1)$.
8. Repeat for a curve $c_2$ representing $e_2$.
9. Sketch the curves $f \circ c_1$ and $f \circ c_2$.
10. Prove the claim that if $v'$ is tangent to $v$ at $a$ then $f \circ v'$ is tangent to $f \circ v$ at $f(a)$.
It follows that $f$ induces a map $Tf$ which takes tangent vectors at $a$ to tangent vectors at $f(a)$. This process doesn't, on the face of things, involve differentiation. Nor does it involve charts. Of course it does involve differentiation, as the last series of exercises shows convincingly. And it is easy to see that it goes through on charts for the usual reasons, which involve the chain rule.
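To see the two descriptions of $Tf$ agree without giving away the exercise, here is a sketch with a different map, the polar coordinate map $(r, \theta) \mapsto (r\cos\theta, r\sin\theta)$. The map, the base point, the arrow `w` and both representative curves are my own illustrative choices. Two different curves in the same tangency class are pushed through $f$ and differentiated numerically, and the result is compared with $Df(a)$ applied to `w`.

```python
import math

def f(p):
    """Polar coordinate map (an illustrative choice, not the map in the exercise)."""
    r, theta = p
    return (r * math.cos(theta), r * math.sin(theta))

def derivative_at_zero(curve, h=1e-5):
    """Central-difference derivative of a curve R -> R^2 at t = 0."""
    a, b = curve(h), curve(-h)
    return tuple((p - q) / (2 * h) for p, q in zip(a, b))

a = (2.0, math.pi / 2)   # base point
w = (1.0, 3.0)           # the arrow at a that we want to push forward

def straight(t):
    """The obvious representative of the class: a + t w."""
    return (a[0] + t * w[0], a[1] + t * w[1])

def wiggly(t):
    """Another curve through a with the same derivative w at t = 0."""
    return (a[0] + math.sin(t) * w[0] + t * t, a[1] + t * w[1] - 5.0 * t ** 3)

pushed_straight = derivative_at_zero(lambda t: f(straight(t)))
pushed_wiggly = derivative_at_zero(lambda t: f(wiggly(t)))

# Df(a) for this particular f, written out by hand.
Df_a = [[math.cos(a[1]), -a[0] * math.sin(a[1])],
        [math.sin(a[1]), a[0] * math.cos(a[1])]]
by_matrix = tuple(sum(Df_a[i][j] * w[j] for j in range(2)) for i in range(2))
```

All three answers should agree up to the accuracy of the difference quotient, which is the chain rule doing its job.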
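In the case when we have a differentiable $f : \mathbb{R}^n \to \mathbb{R}^m$ the last exercises should convince you we have at each point $a \in X$ the diagram
$$
\begin{array}{ccc}
T_a(X) & \xrightarrow{\;Df(a)\;} & T_{f(a)}(Y) \\
\downarrow{\scriptstyle \pi_X} & & \downarrow{\scriptstyle \pi_Y} \\
X & \xrightarrow{\;\;f\;\;} & Y
\end{array}
$$
This diagram commutes, which means whichever way around you go you get the same result. We can do this for every point $a \in X$ to get the commutative diagram:
$$
\begin{array}{ccc}
TX & \xrightarrow{\;Tf\;} & TY \\
\downarrow{\scriptstyle \pi_X} & & \downarrow{\scriptstyle \pi_Y} \\
X & \xrightarrow{\;\;f\;\;} & Y
\end{array}
$$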
The process of taking a manifold and producing its tangent bundle is said to
be functorial because if we have two manifolds and a smooth map between
them the process gives a map between the bundles.
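Instead of writing $Tf$ we often write $f_*$ […]
[Commutative diagram: $T_a X$ over $X$ and $T_{f(a)} Y$ over $Y$, with vertical projections $\pi_X$, $\pi_Y$ and horizontal maps induced by $f$.]
This makes $T$ […] when $X = \mathbb{R}^n$, $Y = \mathbb{R}^m$?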
Remark 2.6.1. In older books, a covector field is called a contravariant vector field and a vector field is called a covariant vector field. See for example, Mackey's Theoretical Foundations of Quantum Mechanics. As we shall see later, a covariant vector field is a contravariant tensor field. Don't blame me for this.
This is all rather confusing on first encounter. Familiarity breeds acceptance and the best way to become familiar with these ideas is to work them through in very simple cases. So make up a set of exercises yourself in which you work with particular simple maps between very simple manifolds ($\mathbb{R}^n$ and $\mathbb{R}^m$ for $n$, $m$ small positive integers). As a start:
Exercise 2.6.6. Let $f : \mathbb{R} \to \mathbb{R}$ be given by $f(x) = x^2$. Put $a = 2$ and investigate what happens if we take (a) a tangent vector at 1 and (b) a cotangent vector at 4.
Now try it for $f : \mathbb{R}^2 \to \mathbb{R}^2$ with
$$\begin{pmatrix} x \\ y \end{pmatrix} \mapsto \begin{pmatrix} x^2 + y^2 \\ xy \end{pmatrix}$$
and some suitable points for $a$ and $f(a)$. In this case you can conveniently represent tangent vectors as columns and cotangents as rows.
Exercise 2.6.7. Write out a lecture for first year students which describes tangent vectors on $\mathbb{R}$ in a really simple way as possible velocities along the line, and hence define the tangent bundle $\mathbb{R} \times \mathbb{R}$. Define differentiation of maps from $\mathbb{R}$ to $\mathbb{R}$ in terms of bundle maps. Prove the chain rule as $T(f \circ g) = Tf \circ Tg$. Be prepared to answer any awful questions an intelligent student might ask.
Write out a lecture on ordinary differential equations in terms of sections of the tangent bundle. Set up and solve some easy ones in this notation.
Do you think this is easier or harder than the traditional way of doing it? Assume that since Mathematica can solve ODEs, the idea is not to train students to jump through hoops but to get them to understand what they are doing.
2.7 Autonomous Systems of ODEs
2.7.1 Systems of ODEs and Vector Fields
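Consider the system of linear ordinary differential equations:
$$\dot{x} = -y, \qquad x(0) = 1$$
$$\dot{y} = x, \qquad y(0) = 0$$
We can write this as a two dimensional problem:
$$\begin{pmatrix} \dot{x} \\ \dot{y} \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$$
or more succinctly:
$$\dot{\mathbf{x}} = A\mathbf{x} \qquad (2.7.1)$$
where $A$ is the above matrix.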
The matrix $A$ defines a vector field on $\mathbb{R}^2$ by taking the location $\mathbf{x}$ to the vector $A(\mathbf{x})$. We are now used to the idea of a vector field on $\mathbb{R}^2$ both visually, in terms of lots of little arrows stuck on the space (which can incidentally be generated quickly and painlessly using Mathematica), and algebraically, as a map from $\mathbb{R}^2$ to $\mathbb{R}^2$ sending locations to arrows (with their tails attached to those locations).
Such a system of ordinary differential equations is called autonomous, meaning that the vector field specified by the system doesn't change in time.
Figure 2.7.1: A vector field or system of ODEs in $\mathbb{R}^2$
Consequently we can either refer to an Autonomous System of Ordinary Differential Equations defined on an open set $U \subseteq \mathbb{R}^n$, or we can talk about a Smooth Vector Field on $U$. The second is much shorter and easier to think about.
If we draw the vector field in the above case, we get arrows which go around the space in a positive direction as in figure 2.7.1.
A solution to the system of differential equations, or an integral curve for the vector field, is a map $f : \mathbb{R} \to \mathbb{R}^2$, usually written
$$\begin{pmatrix} x(t) \\ y(t) \end{pmatrix}$$
with the property that $x$ and $y$ satisfy the given system of equations. What this means is that we think of a point moving in $\mathbb{R}^2$ so that its velocity at any point is just the vector attached to that point. So the solution curve has to have the vector field tangent to it always.
It is possible to learn to solve autonomous systems of differential equations without ever understanding that they are all about vector fields which give the velocity of a moving point, and that a solution is simply a function which says where the moving point is at any time, and which agrees with the given vector field in what the velocity vector is. This is a pity.
In the above case, you can see by looking at the system what the solution is: obviously the solution orbits are circles, and given the initial condition where at time $t = 0$ we start at the point $(1, 0)^T$, the solution can be written down as
$$x = \cos(t), \qquad y = \sin(t)$$
and it is easy to verify that this works.
Exercise 2.7.1. Do it.
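A numerical cross-check is no substitute for doing the exercise, but it shows what "a moving point whose velocity is the arrow under it" means in practice. The sketch assumes the system reads $\dot{x} = -y$, $\dot{y} = x$ (the reading consistent with the stated solution and with arrows circulating in the positive direction) and follows the field from $(1, 0)^T$ with a standard fourth order Runge-Kutta step. The function names are my own.

```python
import math

def field(p):
    """The arrow attached at (x, y), assuming A = [[0, -1], [1, 0]]."""
    x, y = p
    return (-y, x)

def rk4_step(p, dt):
    """One classical fourth order Runge-Kutta step along the field."""
    def shifted(q, k, s):
        return tuple(qi + s * ki for qi, ki in zip(q, k))
    k1 = field(p)
    k2 = field(shifted(p, k1, dt / 2))
    k3 = field(shifted(p, k2, dt / 2))
    k4 = field(shifted(p, k3, dt))
    return tuple(pi + dt / 6 * (c1 + 2 * c2 + 2 * c3 + c4)
                 for pi, c1, c2, c3, c4 in zip(p, k1, k2, k3, k4))

def integrate(p0, t_end, steps):
    """Follow the field from p0 for time t_end using the given number of steps."""
    p, dt = p0, t_end / steps
    for _ in range(steps):
        p = rk4_step(p, dt)
    return p

t_end = 2.0
numeric = integrate((1.0, 0.0), t_end, 2000)
exact = (math.cos(t_end), math.sin(t_end))
```

The point reached by following the arrows should match $(\cos t, \sin t)$ to many decimal places.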
Obviously, solving initial value ODE problems for more complicated vector fields isn't going to be so easy, and doing it in dimensions greater than three by the "look at it and think" method also looks doomed. So it is desirable to have a general rule for getting out the solution. Fortunately this is easy enough for linear vector fields in principle, although the calculations can be messy in practice. But again, that's what computers are for.
2.7.2 Exponentiation of Things
I did this in second year M213 but some of you may have missed out on it in
which case here it is. Those of you who did it can read this rather quickly.
If you write down the usual series for the exponential function you get:
$$\exp(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!} + \cdots$$
Now think about this and ask yourself what $x$ has to be for this to make sense. You are used to $x$ being a real number, but it should be obvious that it could equally well be a complex number. After all, what do you do with $x$? Answer, you have to be able to multiply it by itself lots of times, and you have to be able to scale it by a real number, and you have to be able to add the results of this. You also have to have an identity to represent $x^0$. Oh, and you need to be able to take limits of these things. So it will certainly work for $x$ a real or a complex number. But it also makes sense if $x$ is a square matrix. Or, with any system where the objects can be added and scaled and multiplied by themselves. And have limits of sequences of these things.
The name of a system of objects which can be added and scaled by real
numbers is a vector space, and a vector space where the vectors can also
be multiplied is called an algebra. We can do exponentiation in any algebra
which has a norm and a multiplicative identity. (And it would be a help if it
was complete in that norm, i.e. limits of Cauchy sequences exist.) The square
$n \times n$ matrices form such an algebra. We can also hope to take sequences of
them and maybe have them converge to some matrix. So we can exponentiate
square matrices.
40 CHAPTER 2. SMOOTH MANIFOLDS AND VECTOR FIELDS
Exercise 2.7.2. Exponentiate the matrix A in equation 2.7.1. Now exponentiate
the matrix tA. Do you recognise the result?
It should be obvious that we could, in principle, calculate the exponential of
a matrix to some number of terms, and if the infinite sum makes sense and
the sequence of partial sums converges, then we could always get some sort of
estimate of exp(A) for any matrix A by computing enough terms. We would
hope that multiplying A by itself n times would give some reasonable sort of
matrix, and when we divided all the entries by n! we would get something
pretty close to the zero matrix. If this happened for all the n past some
point, then we could optimistically suppose that exp(A) was some matrix
which we could at least get better and better approximations to, which after
all is exactly what we have with exp(x) for x a real number.
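The partial-sum idea in the paragraph above can be tried out directly. What follows is a minimal sketch in plain Python, not something from the notes: the helper names `mat_mul` and `mat_exp`, the cut-off values, and the test matrix are all assumptions made for illustration. The matrix A = I + N, with N nilpotent, is chosen only because its exponential, e(I + N), can be written down by hand and compared with the partial sums.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(a, terms=30):
    """Partial sum I + a + a^2/2! + ... of the exponential series, using `terms` terms."""
    n = len(a)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in identity]   # running partial sum, starts at the k = 0 term
    term = [row[:] for row in identity]    # holds a^k / k!
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in mat_mul(term, a)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

# Illustrative matrix (an assumption, not taken from the notes):
# A = I + N with N^2 = 0, so exp(A) = e * (I + N) exactly.
A = [[1.0, 2.0], [0.0, 1.0]]
exact = [[math.e, 2.0 * math.e], [0.0, math.e]]

for terms in (2, 5, 10, 30):
    approx = mat_exp(A, terms)
    error = max(abs(approx[i][j] - exact[i][j]) for i in range(2) for j in range(2))
    print(terms, error)
```

The printed errors should shrink rapidly as more terms are taken, which is the behaviour the paragraph hopes for: the entries of $A^n/n!$ eventually become negligible.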
Exercise 2.7.3. Define the norm of an $n \times n$ matrix A to be
$$\|A\| = \sup_{\|x\| = 1} \|A(x)\|$$
as in an earlier problem, and show that $\|A^2\| \le (\|A\|)^2$. Hence prove that
the function exp is always defined for any $n \times n$ matrix.
Exercise 2.7.4. If $e^{tA} \triangleq \exp(tA)$ denotes a map from $\mathbb{R}$ to the space of $n \times n$
matrices, show that its derivative is $Ae^{tA}$.
There are other algebras where a bit of exponentiation makes sense, so be
prepared for them.
2.7.3 Solving Linear Autonomous Systems
In principle this is now rather trivial:
Proposition 2.7.1. If $\dot{x} = Ax$ is an autonomous linear system of ODEs
with $x(0) = a$, then
$$x = e^{tA} a$$
is the solution.
Proof:
Differentiating $e^{tA}$ gives $Ae^{tA}$ by the last exercise, so $x = e^{tA}a$ satisfies
$\dot{x} = Ae^{tA}a = Ax$, and since $\exp(0) = I$, the identity matrix, the initial
value $x(0) = a$ is satisfied. So it is certainly a solution.
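As a numerical sanity check on Proposition 2.7.1, here is a small sketch, again in plain Python. The matrix A, the initial value a, the time t and the helper names are hypothetical choices for illustration. The series is truncated at 40 terms and the velocity is estimated by a central difference, so agreement is only approximate.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(a, terms=40):
    """Truncated exponential series I + a + a^2/2! + ..."""
    n = len(a)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in mat_mul(term, a)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def solution(t, A, a):
    """The candidate solution x(t) = exp(tA) a."""
    E = mat_exp([[t * entry for entry in row] for row in A])
    return [sum(E[i][j] * a[j] for j in range(len(a))) for i in range(len(a))]

A = [[0.0, 1.0], [-2.0, -3.0]]   # arbitrary illustrative matrix
a = [1.0, 0.0]                   # arbitrary initial value
t, h = 0.8, 1e-5

x = solution(t, A, a)
# Central-difference estimate of the velocity dx/dt at time t.
velocity = [(p - m) / (2 * h)
            for p, m in zip(solution(t + h, A, a), solution(t - h, A, a))]
Ax = [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]
residual = max(abs(v - w) for v, w in zip(velocity, Ax))

print(solution(0.0, A, a))   # the initial value
print(residual)              # how far the velocity is from A x(t)
```

The residual should be tiny compared with the size of the entries of Ax, which is what the proposition predicts.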
If this looks a bit like a miracle and in need of explanation, you are thinking
sensibly and merely need to do more of it. It may help to note that the
exponential function is the unique function taking the value 1 at 0 with slope
at a point the same as the value at the point, and that this leads to the
general solution for the linear ODE in dimension one, and that this goes over
to higher dimensions with no essential changes. In effect, the exponential
function was invented to solve all these cases. It actually goes deeper than
this; see Vladimir Arnold's book Ordinary Differential Equations.
2.7.4 Existence and Uniqueness
Could you have two different solutions (or more)? No, not for linear systems,
but this requires thought. Certainly the 1-dimensional ODE given by
$$\dot{x}(t) = 3x^{2/3}, \quad x(0) = 0$$
has the solution $x(t) = t^3$ but also the solution $x(t) = 0$. It also has infinitely
many other solutions. (Can you find some?) Of course this is not a linear
ODE, but it is clear that some sort of conditions will need to be imposed
before we can look at vector fields which are not linear and expect them to
have solutions. Happily, there is a simple one which guarantees at least local
existence and uniqueness:
Theorem 2.7.1. If $f : U \subseteq \mathbb{R}^n \to \mathbb{R}^n$ is a continuously differentiable
vector field, then for any point a in U there is a neighbourhood $W \subseteq U$
of a containing a solution to the system of equations $\dot{x} = f(x)$ with a as
initial value, and the solution is unique. Moreover, there is a continuously
differentiable map $F : W \times J \to \mathbb{R}^n$ for some interval $J = (-a, a)$ on $0 \in \mathbb{R}$
such that for all b in W, the map $F_b : J \to \mathbb{R}^n$ is the solution for initial
value b at t = 0.
There is a proof in Hirsch and Smale's Differential Equations, Dynamical
Systems and Linear Algebra, pages 163 to 169.
There is a better proof in Arnold's book on page 213. It is actually the
same proof but much better explained. It is given for the general
(non-autonomous) case. Both arguments use the contraction mapping theorem.
You should read through it if you have not already done a proof in your
ODEs course. Assuming you did one.
The results follow easily from a more basic result sometimes called The
Straightening Out Theorem (in Arnold, "The basic theorem of the theory of
ordinary differential equations" or the rectification theorem; see chapter 2).
The theorem says that in a neighbourhood U of a point of $\mathbb{R}^n$ where the
(continuously differentiable) vector field is nonzero, we can find a one-one
differentiable map from U to $W \subseteq \mathbb{R}^n$ with a differentiable inverse, such that
the transformed vector field on W is uniform and constant.
Given that we can do that, we could also make the vectors all have length
one and lie along the $x^1$ axis in $\mathbb{R}^n$ with a rotation and scaling. The system
of ODEs then would be, in this transformed region W, the rather boring
system:
$$\begin{aligned} \dot{x}^1 &= 1 \\ \dot{x}^2 &= 0 \\ &\ \,\vdots \\ \dot{x}^n &= 0 \end{aligned}$$
with the solution
$$x^1(t) = t + a^1; \quad x^2(t) = a^2; \quad \cdots \quad x^n(t) = a^n$$
If you believe in the Straightening Out Theorem, then it is obvious that any
continuously differentiable vector field has at any point where the vector field
is nonzero a solution which is unique in some neighbourhood of the point
and which depends smoothly on the point. All we have to do is to map the
straight line boring solution(s) back by the differentiable inverse.
Exercise 2.7.5. Prove the last remark.
When the vector field is zero at a point, the solution is the constant function
taking all of $\mathbb{R}$ to the point. So there is a unique solution here too.
Remark 2.7.1. You will find a proof of the straightening out theorem in
Arnold. I shan't prove it in this course on the grounds that this isn't a course
on ODEs. At least, I don't think it is.
Remark 2.7.2. It should be obvious that although we have looked at systems
of ordinary differential equations on $\mathbb{R}^n$, the fact that everything is
defined locally means that they ship over to any smooth manifold. If the
manifold is compact then the completeness is guaranteed, and the solution
can be found by doing everything in charts and piecing the bits together.
2.8 Flows
I rather slithered over one important point, which is the question of whether
we always get a solution for all time, past and future. It is not hard to see
that the vector field $X(x) = x^2$ on $\mathbb{R}$, with initial condition $x(0) = 1$, has a
solution
$$x(t) = \frac{1}{1 - t}$$
which goes off to infinity in finite time. From which we deduce that it is
not in general possible to ensure that there is a solution for all time, and
this explains the cautious statement of the last theorem. The best we can
hope to do, the theorem tells us, for a smooth vector field at a point is to
find a neighbourhood of the point in which there is a parametrised curve,
$x(t) : t \in (-a, a)$ where if we are lucky a will be $\infty$ and if we aren't it will
be some possibly rather small positive number.
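The blow-up can be watched numerically. The sketch below is an illustration only: it assumes the initial condition x(0) = 1, uses a classical fourth-order Runge-Kutta step (my choice of integrator, not something the notes prescribe), and the names `rk4_step` and `follow` are hypothetical. It follows the integral curve of $X(x) = x^2$ and compares it with $1/(1 - t)$ as t approaches 1.

```python
def rk4_step(f, x, h):
    """One classical Runge-Kutta step for the autonomous ODE dx/dt = f(x)."""
    k1 = f(x)
    k2 = f(x + 0.5 * h * k1)
    k3 = f(x + 0.5 * h * k2)
    k4 = f(x + h * k3)
    return x + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def follow(f, x0, t_end, steps=20000):
    """Approximate x(t_end) for the solution that starts at x0 when t = 0."""
    h = t_end / steps
    x = x0
    for _ in range(steps):
        x = rk4_step(f, x, h)
    return x

def field(x):
    """The vector field X(x) = x^2 on the real line."""
    return x * x

for t_end in (0.5, 0.9, 0.99):
    # numerical value next to the closed form 1 / (1 - t)
    print(t_end, follow(field, 1.0, t_end), 1.0 / (1.0 - t_end))
```

Both columns grow without bound as t_end approaches 1, so no choice of step size will carry the numerical solution past t = 1 in any meaningful way.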
Definition 2.8.1. A vector field on $U \subseteq \mathbb{R}^n$ is said to be complete if any
solution can be extended to the whole real line.
Exercise 2.8.1. Show that if a vector field has compact support then it is
complete.
Exercise 2.8.2. Show that if U is the unit open ball in $\mathbb{R}^n$ centred on the
origin and X is a smooth vector field on U, then if X is complete, and if
Proj(X(x), x) is the projection of X(x) on x, then
$$\lim_{\|x\| \to 1} \mathrm{Proj}(X(x), x) = 0$$
Remark 2.8.1. It should be obvious that there are not many physical situations
where things go belting off to infinity in finite time, and for that reason
I shall restrict myself from now on to complete vector fields. If I forget to
put the word in, put it in yourself. Also put the word "smooth" in front of
the term "vector field" whenever it occurs since I shall not consider any other
sort.
The business of getting a solution is going to work not just for the point we
selected as our starting point but also for neighbouring points provided we
don't go too far away. In the happy case where the vector field has solutions
for all time, the space U on which the vector field is defined is decomposable
as a set of integral curves, since solutions can't intersect each other, or
themselves, although they can, of course, be closed loops. This statement
follows from the uniqueness of a solution. Hence we deduce that a vector
field gives rise to what is called a foliation of the space into integral curves.
You can, perhaps, guess that partial differential operators more complicated
than vector fields will give rise to higher dimensional foliations, decomposing
the space into surfaces and other manifolds.
Exercise 2.8.3. Describe the foliation of $\mathbb{R}^2$ by the vector field
$$-y \frac{\partial}{\partial x} + x \frac{\partial}{\partial y}$$
Recall that in second year (M213) we discussed the idea of groups acting on
sets and came to the conclusion that they were conveniently seen as
homomorphisms from a group G into the group Aut(V) of maps from the set V
into itself. Then a complete smooth vector field X on $U \subseteq \mathbb{R}^n$ gives rise to
an action of the group $\mathbb{R}$ on U as follows:
$$x : \mathbb{R} \times U \to U, \quad (t, x_0) \mapsto x(t)$$
where x(t) is the integral curve of X with $x(0) = x_0$.
To prove this is indeed a group action, we need to show that $x(0, x_0) = x_0$
for every $x_0$, which follows immediately from my definition of x. (Since the
additive identity of $\mathbb{R}$ is 0.) We also need to show that
$$\forall s, t \in \mathbb{R},\ \forall x_0 \in U, \quad x(s, x(t, x_0)) = x(s + t, x_0)$$
which merely means that if you travel for time t from $x_0$ along the solution
curve, and then go on for time s, this gives the same result as travelling for
time s + t from the starting point $x_0$, which is, after all, what we expect a
solution curve to do.
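The two group-action conditions can be checked numerically for a particular field. The sketch below is only an illustration under stated assumptions: the field $X(x) = \sin(x)$ on $\mathbb{R}$ is my own example (it is bounded, so its solutions exist for all time), the flow is approximated by Runge-Kutta steps rather than computed exactly, and the function name `flow` is hypothetical.

```python
import math

def flow(t, x0, steps=2000):
    """Approximate x(t, x0): follow the integral curve of X(x) = sin(x)
    from x0 for time t. Negative t runs the curve backwards."""
    h = t / steps
    x = x0
    for _ in range(steps):
        k1 = math.sin(x)
        k2 = math.sin(x + 0.5 * h * k1)
        k3 = math.sin(x + 0.5 * h * k2)
        k4 = math.sin(x + h * k3)
        x += (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return x

x0, s, t = 0.4, 0.7, 1.3

print(flow(0.0, x0))                          # identity: travelling for time 0
print(flow(s, flow(t, x0)), flow(s + t, x0))  # time t then time s, against time s + t
print(flow(-t, flow(t, x0)))                  # going forward then back again
```

Up to the error of the numerical integration, the second line prints the same number twice and the third line returns the starting point, which is the content of the two conditions and of the remark that the time-t map has an inverse.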
If we fix t and look to see what the group action does, it is a map from $\mathbb{R}^n$
to itself. Well, we knew that. It is a truth that this map is always a smooth
diffeomorphism. The old fashioned way of saying this is that the solutions
depend smoothly upon the initial conditions, but I much prefer the modern
way of saying it. You should be able to see that all we are doing is taking
each point as input, and outputting the point it will get to after time t.
Proposition 2.8.1. For a complete smooth vector field X on U open in
$\mathbb{R}^n$, for any $t \in \mathbb{R}$, the map $x_t : U \to U$, which sends $x_0$ to $x(t, x_0)$, is a
diffeomorphism of U.
Proof:
The map $x_t$ certainly has an inverse, $x_{-t}$. And the theorem on existence of
solutions to an ODE establishes that the map is continuously differentiable
when X is. So if X is smooth, so is $x_t$.
Remark 2.8.2. The set of diffeomorphisms $\{x_t : t \in \mathbb{R}\}$, or in other words
the map $x : \mathbb{R} \times U \to U$, is called in old fashioned books a one-parameter
group of diffeomorphisms. I shall simply say that the map x obtained from
the vector field X is the flow of X.
Remark 2.8.3. Given a flow x on $U \subseteq \mathbb{R}^n$ we can always recover the vector
field by simply taking any point a and differentiating the map $x_a : \mathbb{R} \to U$
which sends t to x(t, a) at t = 0. This must give us the required vector field
from which the flow can be derived. So there is a correspondence between
flows and vector fields.
You now have four ways of thinking about vector fields. They are bunches
of arrows tacked onto a space; they are autonomous systems of ordinary
differential equations. And they are also flows, obtained by solving the
autonomous system. And last but not least they are operators on the algebra of
smooth functions from the space to $\mathbb{R}$. This demonstrates that vector fields
are more interesting and complicated than you might have supposed.
I shall give one important feature of vector fields which arises from this
multiple perspective and which is much less obvious if you stick only to
systems of ordinary differential equations.
2.9 Lie Brackets
We write, as is conventional in some areas, X and W for two vector fields in
$\mathfrak{X}(\mathbb{R}^n)$, and bear in mind that we can compose any such operators to get
$X \circ W$ and $W \circ X$ (which we write XW and WX for short). In general the
result is a perfectly good operator but some calculations will rapidly convince
you that XW is not, in general, a vector field operator but something much
nastier.
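Example 2.9.1. Let $V = -y\,\partial/\partial x + x\,\partial/\partial y$ and $W = x\,\partial/\partial x + y\,\partial/\partial y$.
Then $VWh$ is
$$-xy \frac{\partial^2 h}{\partial x^2} - y \frac{\partial h}{\partial x} - y^2 \frac{\partial^2 h}{\partial x \partial y} - 0 + x^2 \frac{\partial^2 h}{\partial y \partial x} + x \frac{\partial h}{\partial y} + xy \frac{\partial^2 h}{\partial y^2} + 0$$
and $WVh =$
$$-xy \frac{\partial^2 h}{\partial x^2} + 0 + x^2 \frac{\partial^2 h}{\partial y \partial x} + x \frac{\partial h}{\partial y} - y^2 \frac{\partial^2 h}{\partial x \partial y} - y \frac{\partial h}{\partial x} + xy \frac{\partial^2 h}{\partial y^2} + 0$$
Neither of these looks like a vector field operating on h. If however we take
the difference, $VW - WV$, we get some happy cancellation and wind up with
$$VW - WV = \left(-y \frac{\partial}{\partial x} + x \frac{\partial}{\partial y}\right) - \left(x \frac{\partial}{\partial y} - y \frac{\partial}{\partial x}\right) = 0$$
which is a vector field although not a very interesting one.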
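The cancellation can also be seen numerically, by applying the two operators to a test function in both orders. The sketch below makes several assumptions, all labelled: it takes V to be $-y\,\partial/\partial x + x\,\partial/\partial y$ (the sign is my reading of the example), it estimates partial derivatives by central differences with an arbitrary step, and the test function h and the names `apply_field`, `V`, `W` are hypothetical.

```python
import math

STEP = 1e-3  # finite-difference step, an arbitrary choice

def apply_field(field, h):
    """Return the function obtained by letting a vector field act on h.

    field(x, y) returns the coefficients (P, Q) of P d/dx + Q d/dy, and the
    partial derivatives of h are estimated by central differences.
    """
    def result(x, y):
        p, q = field(x, y)
        h_x = (h(x + STEP, y) - h(x - STEP, y)) / (2 * STEP)
        h_y = (h(x, y + STEP) - h(x, y - STEP)) / (2 * STEP)
        return p * h_x + q * h_y
    return result

def V(x, y):
    return (-y, x)   # assumed: -y d/dx + x d/dy

def W(x, y):
    return (x, y)    # x d/dx + y d/dy

def h(x, y):
    return math.sin(x) * math.exp(y)   # arbitrary smooth test function

VWh = apply_field(V, apply_field(W, h))
WVh = apply_field(W, apply_field(V, h))

point = (0.3, 0.7)
commutator_value = VWh(*point) - WVh(*point)
print(VWh(*point), WVh(*point), commutator_value)
```

The two second-order expressions are individually far from zero at the chosen point, while their difference vanishes up to the finite-difference error.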
Exercise 2.9.1. Write down another pair of vector fields V, W on $\mathbb{R}^2$ and
compute $VW - WV$. Check to see if you always get the zero vector field.
What is it telling you about the vector fields when $VW - WV = 0$? (Some
intelligent conjectures would be of interest but only if supported by evidence
not used in framing the conjecture.)
Exercise 2.9.2. If $X = P(x, y)\,\partial/\partial x + Q(x, y)\,\partial/\partial y$ and
$W = R(x, y)\,\partial/\partial x + S(x, y)\,\partial/\partial y$, calculate $XW - WX$ and verify that it
is a vector field.
Exercise 2.9.3. Compute $XW - WX$ for $X, W \in \mathfrak{X}(\mathbb{R}^n)$ and show it is a
vector field in $\mathfrak{X}(\mathbb{R}^n)$. Show that this also holds for $\mathfrak{X}(U)$ for any open set
$U \subseteq \mathbb{R}^n$.
All this gives the following definition:
Definition 2.9.1. The Lie Bracket or Poisson Bracket of two vector fields
X, W in $\mathfrak{X}(U)$ for $U \subseteq \mathbb{R}^n$ is written [X, W] and defined by
$$[X, W] \triangleq XW - WX$$
It is a multiplication on the vector space of vector fields on U.
Exercise 2.9.4. Do some simple calculations, preferably for $U \subseteq \mathbb{R}^1$, and
convince yourself that the Lie bracket multiplication is not in general associative
but does satisfy the Jacobi Identity:
$$\forall X, Y, Z \in \mathfrak{X}(U), \quad [X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0$$
Exercise 2.9.5. Prove that the Jacobi Identity is always satised for Vector
Fields.
The Lie bracket almost makes the vector space of vector fields on U, an
open subset of $\mathbb{R}^n$, into an algebra, which you will recall is merely a vector
space where the vectors can be multiplied, to make a ring. Here the Lie
Bracket operation fails to be associative in general, but a vector space with a
nonassociative multiplication which satisfies the Jacobi Identity is,
notwithstanding, called a Lie Algebra. There are others besides these and again
algebraists have gone to town on investigating abstract Lie Algebras. Well,
we wouldn't like them to be at a loose end and hang around street corners[3].
Exercise 2.9.6. Prove [X, (Y + Z)] = [X, Y] + [X, Z] and [(X + Y), Z] =
[X, Z] + [Y, Z]. Prove also that $\forall a \in \mathbb{R}$, [aX, Y] = a[X, Y] and [X, aY] =
a[X, Y].
[3] Although they'd probably have an interesting line in graffiti.
Remark 2.9.1. The above properties you will recognise as bilinearity.
Exercise 2.9.7. Investigate the relation between [hX, Y ], [X, hY ] and h[X, Y ].
It should be apparent that although the calculations tend to be messy and
provide great scope for making errors, they are not essentially difficult. A
natural candidate for a good symbolic algebra package, you might say.
Exercise 2.9.8. Is there a multiplicative identity for the Lie Bracket operation
on vector spaces? That is, is there a vector field J such that for every
other vector field, X, [J, X] = X? (Hint: what is [J, J]?)
You might be interested in an area of applications of these ideas. If so read
on.
It is easy to find the solution, $h(x, y) = x^2 + y^2$, to the PDE
$$-y \frac{\partial h}{\partial x} + x \frac{\partial h}{\partial y} = 0$$
Now this is one solution, and finding a single solution is very nice, but we
usually want the general solution. In this particular case you can probably
guess it. But in general, if we have some linear partial differential operator
L acting on F, a suitable space of smooth functions, and if we want the set
of all solutions of Lh = 0, then it will usually be a lot harder to find them.
This process is aided by the following idea: The set of solutions of Lh = 0 is
going to be a linear subspace of F, by definition of the term linear operator.
Call it $F_0$. Now a symmetry of the solution space of the operator L, often called
a symmetry of the operator L, is some vector field operator X such that X
takes $F_0$ into itself, i.e. whenever h is a solution to Lh = 0, so is Xh. If we
know the collection of all symmetry operators for L and we have a solution,
then we can find all the other solutions. In trivial cases this will amount
to no more than adding in arbitrary constant functions, but in nontrivial
cases it will do a whole lot more than this. So it would be a good idea to
be able to find, for a given L, the set of all symmetries X for L. It is clear
that the Poisson-Lie bracket can be used for any pair of linear operators, not
just vector fields. The following observation goes some way to explaining our
interest in them:
Proposition 2.9.1. If $[L, X] = wL$ for some function $w \in F$, then X is a
symmetry of L.
Proof:
We need to show that $\forall h \in F_0$, $L(Xh) = 0$. Now
$$LX - XL = wL \ \Rightarrow\ LX = wL + XL$$
and
$$\forall h \in F_0, \quad (wL + XL)h = wLh + XLh = 0 + 0 = 0$$
: V
V , v V, v v
Exercise 3.1.1. Confirm that this is an isomorphism of vector spaces.
This isomorphism is natural, which means (in part) that given f : U V , a
linear map between real vector spaces, we get a map from
U to
V :
f : U V f
:
U
V , f
( u) =
f(u)
Note that we can specify the isomorphism and the map f
without making
any reference to a basis for U or V . This is the other part of what we mean
by natural. I cannot define naturalness properly without an excursion into
category theory which I am hoping to avoid, but the idea is sufficiently clear
for present purposes. I hope.
I shall write $U \cong V$ when vector spaces U and V are isomorphic and
$U \overset{N}{\cong} V$ when they are naturally isomorphic.
Exercise 3.1.2. Take the space L(R, V ), of linear maps from R to V , and
show this is also a vector space, naturally isomorphic to V .
The space of shifts being naturally isomorphic to V leads to two pictures
of a vector space, one has got points in it and the other has got arrows in
it. We can certainly think of a shift map taking u to u + v as an arrow
from u to u + v in the original space, and the map itself as a whole lot of
arrows, all basically showing where each point starts and finishes under the
map. And since the spaces are isomorphic we can cheerfully think in either
one. Physicists do this all the time as do applied mathematicians, and so
they confuse the two distinct things, points and arrows, and usually this does
no harm; in fact the more ways you have of thinking about something the
easier it is to solve problems, so it actually does some good. It is, however,
probably better to confuse things when you know you are doing it, rather
than just being confused.
Now I define the space V
in an
obvious way. It is called the dual basis. In $\mathbb{R}^2$, $[1, 0]$ is the dual to
$\begin{pmatrix} 1 \\ 0 \end{pmatrix} \in \mathbb{R}^2$, and so on. Note that my usage of representing vectors as
columns (and elements of $\mathbb{R}^{n*}$ as rows) is consistent with standard matrix
notation and makes it easier to distinguish $\mathbb{R}^n$ from its dual space.
Remark 3.1.1. The standard (ordered) basis in $\mathbb{R}^n$ is often written as the
ordered set $(e_1, e_2, \cdots, e_n)$ which saves writing lots of columns. $e_j$ is the
column of n numbers which has a 1 in the $j^{th}$ place and a zero everywhere
else. I shall often write $(e^1, e^2, \cdots, e^n)$ for the dual basis. You can think of $e^j$
either as a row matrix with n entries, with the $j^{th}$ entry 1 and all the others
zero, or you can think of it as the projection onto the $j^{th}$ axis, according to
taste. People who cannot tell subscripts from superscripts are going to have
a hard time with tensors.
It follows from the last exercise that V and V
: V
dened by
f
: V
, g V
g g f
This map is the wrong way around. The term contravariant is used for
things like this. Again I am being a trifle vague here in order to avoid a long
discursion.
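If $V = \mathbb{R}^n$ then I find it helpful to write the elements of $\mathbb{R}^n$ as column arrays.
Then it is natural to write the elements of $\mathbb{R}^{n*}$ as row arrays. Then this
makes it clear that the latter act on the former (by matrix multiplication) so
there is a map
$$\mathbb{R}^{n*} \times \mathbb{R}^n \to \mathbb{R}, \quad \left( [a_1, a_2, \cdots, a_n],\ \begin{pmatrix} x^1 \\ x^2 \\ \vdots \\ x^n \end{pmatrix} \right) \mapsto a_1 x^1 + a_2 x^2 + \cdots + a_n x^n$$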
Physicists write the thing on the right as $a_i x^i$ by what is called the Einstein
summation convention, which means that a repeated lower index and upper
index is short for a sum over all possible values of the index. This explains
why we use lower indices or subscripts for covectors, elements of the dual
space, and superscripts or upper indices for the components of a vector. It
makes writing squares and higher powers a real bugger, but fortunately we
don't have to do that very often.
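The action of a row array on a column array is just a sum over the repeated index, and a few lines of Python make this concrete. This is a minimal sketch: the function name `pair` and the sample numbers are hypothetical, chosen only to illustrate $a_i x^i$.

```python
def pair(covector, vector):
    """Evaluate a row array on a column array: the sum a_i x^i over the index i."""
    if len(covector) != len(vector):
        raise ValueError("covector and vector must have the same number of entries")
    return sum(a_i * x_i for a_i, x_i in zip(covector, vector))

a = [2.0, -1.0, 3.0]   # components a_1, a_2, a_3 of a covector (a row)
x = [1.0, 4.0, 0.5]    # components x^1, x^2, x^3 of a vector (a column)
value = pair(a, x)

# A dual basis row with 1 in the second place acts as the projection
# onto the second axis.
e2_dual = [0.0, 1.0, 0.0]
second = pair(e2_dual, x)

print(value, second)
```

Python lists do not distinguish rows from columns, so the distinction between upper and lower indices lives only in the comments here. That is exactly the kind of sloppiness the text warns against, and it is tolerable only because nothing else in the sketch depends on it.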
Note that this generalises: there is a map
$$V^* \times V \to \mathbb{R}, \quad (g, v) \mapsto g(v)$$
All I have done with my rows and columns is specify the maps and the vectors
by arrays of numbers. This is so we can do sums. Usually rather horrid sums,
but that is what Mathematica and MATLAB are for.
Note that confusing a space and its dual is not a good idea: physicists did this
and got themselves in a bit of a mess in consequence. They are isomorphic,
at least when finite dimensional, but not naturally isomorphic, so it is a good
idea to keep them separate.
Exercise 3.1.5. Define the unit vector at 0 on $\mathbb{R}$ to be the tangency equivalence
class of the map $i : \mathbb{R} \to \mathbb{R}$ given by i(t) = t. Then i
R
0
is a basis
element. Define dx :
R
0
R by dx(i) = 1. The identity map x : R R
goes over to the identity map
R
R, and it takes the tangent vector i to
itself. What does it do to dx?
Note that all the above spaces are isomorphic and all maps are pretty much
the identity map if you are prepared to be sloppy. Examine which of the
various maps are covariant and which contravariant.
Exercise 3.1.6. Let V be the space of all real valued functions defined on $\mathbb{R}$.
Is it the case that V and V
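$$\forall u, u' \in U,\ \forall v \in V,\ \forall s, t \in \mathbb{R}, \quad f(su + tu', v) = s\, f(u, v) + t\, f(u', v)$$
and
$$\forall u \in U,\ \forall v, v' \in V,\ \forall s, t \in \mathbb{R}, \quad f(u, sv + tv') = s\, f(u, v) + t\, f(u, v')$$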
We can describe this by saying that f is linear in each variable separately.
The field can in fact be any field you like as long as it is the same field for
U, V and W.
Exercise 3.1.8. Find a bilinear map from $\mathbb{R} \times \mathbb{R}$ to $\mathbb{R}$.
Definition 3.1.3. For any $u \in U$ and a bilinear map $f : U \times V \to \mathbb{R}$ I can
write $f_{(u,-)} : V \to \mathbb{R}$ as the map
$$f_{(u,-)} : V \to \mathbb{R}, \quad v \mapsto f(u, v)$$
Similarly for any $v \in V$, $f_{(-,v)} : U \to \mathbb{R}$ sends u to f(u, v).
We can describe bilinearity of f by saying that f is linear in each variable
separately, meaning that for any $u \in U$, $f_{(u,-)}$ is linear and for any
$v \in V$, $f_{(-,v)}$ is linear.
Two exercises which may help later in understanding some technicalities:
Exercise 3.1.9.
1. Show that when V is finite dimensional, $V^{**}$, the dual of $V^*$, is naturally
isomorphic to V. That is, show there is an isomorphism which
does not require a basis of either space to specify it, and that $f : U \to V$
induces a map $f^{**} : U^{**} \to V^{**}$.
2. Show that if $\mathrm{Bil}(U \times V, W)$ is the vector space of bilinear maps from
$U \times V$ to W and L(A, B) is the vector space of linear maps from A to
B for any real vector spaces A and B then
Bil(U V, W)
N
. .. .
copies
R
We write T
and a contravari
ant 1tensor on V is actually an element of V
,
he has a tensor g : V V V
. .. .
copies
R
We write T
k
(V ) is written T
(V )
I shall expand on this when I explain tensor fields which comes up next.
Definition 3.1.9. A covariant k-tensor $\omega$ is symmetric iff
$$\omega(u_1, u_2, \cdots, u_k) = \omega(u_2, u_1, u_3, \cdots, u_k)$$
and whenever we swap any two arguments the result is the same.
Definition 3.1.10. A covariant k-tensor $\omega$ is alternating (or antisymmetric)
iff
$$\omega(u_1, u_2, \cdots, u_k) = -\omega(u_2, u_1, u_3, \cdots, u_k)$$
and whenever we swap any two arguments the sign only is changed.
Note that we can say this more easily: a covariant k-tensor is symmetric
iff it is invariant under the symmetry group $S_k$ acting on the arguments; in
algebra, $\omega \in T^k(V)$ is symmetric iff $\omega \circ \sigma = \omega$ for every $\sigma$ in the permutation
group $S_k$ on the arguments of $\omega$. And if it is antisymmetric then it is invariant
under the group $A_k$. If $\sigma$ is a permutation of the set of arguments, we write
$\mathrm{sgn}(\sigma)$ to be +1 if $\sigma$ is an even permutation and $-1$ if it is odd. Then we
can say $\omega$ is alternating iff $\omega \circ \sigma = \mathrm{sgn}(\sigma)\, \omega$.
Alternating k-tensors are also known as k-forms and are important for later
work. They have everything to do with orientation.
Definition 3.1.11. We write $\Lambda^k(V^n)$ for the space of alternating covariant
k-tensors on the vector space V having dimension n.
3.1.3 Dimension of Tensor spaces
You can either read this carefully or simply do the exercises at the end of
the subsection. Or you can do both. As long as you find out how to do the
exercises!
The space of k-tensors on $V^n$ is obviously a vector space because we can
add and scale the maps; the sum of two tensors of type $(k, \ell)^T$ is obviously
another tensor of the same type, and likewise scaling such a tensor by a real
number gives another tensor of the same type. Since the set of all maps from
any X to $\mathbb{R}$ is a vector space, the type $(k, \ell)^T$ tensors form a linear subspace.
Example 3.1.1. Suppose $\omega$ is any $(2, 0)^T$ tensor on $\mathbb{R}$. Then we put $\omega(1, 1) =
a$. Then by multilinearity, keeping the second component fixed we deduce
that $\omega(x, 1) = xa$ for any $x \in \mathbb{R}$, and now keeping the first component fixed
we see that $\omega(x, y) = xya$. Thus the tensor is specified by just one number,
a, and so the space of covariant 2-tensors on $\mathbb{R}$ is a one dimensional vector
space, having $\omega(1, 1) = 1$ as a basis element.
We note that $\omega$ is always a symmetric tensor. There is precisely one
alternating $(2, 0)^T$ tensor on $\mathbb{R}$ and it is the zero map. So the space of alternating
$(2, 0)^T$ tensors on $\mathbb{R}$ is zero dimensional. The zero tensor is both symmetric
and antisymmetric.
To get a basis for the covariant k-tensors on $\mathbb{R}^n$, we need to specify the maps
on every choice of basis elements. For example, for 2-tensors on $\mathbb{R}^2$ we know
the multilinear map completely if we know it on $(e_1, e_1)$, $(e_1, e_2)$, $(e_2, e_1)$
and $(e_2, e_2)$. Then multilinearity will guarantee us the value on any pair of
vectors, each in $\mathbb{R}^2$. The extension to higher order tensors and different n
is obvious, and by taking any basis for V we get the same conclusion. This
gives the obvious result, the dimension of the space of covariant k-tensors
on $V^n$ is $n^k$. For example, we can take as a basis for $T^2(\mathbb{R}^2)$ the four maps
defined by the four columns:
$$\begin{array}{llll}
(e_1, e_1) \mapsto 1 & (e_1, e_1) \mapsto 0 & (e_1, e_1) \mapsto 0 & (e_1, e_1) \mapsto 0 \\
(e_1, e_2) \mapsto 0 & (e_1, e_2) \mapsto 1 & (e_1, e_2) \mapsto 0 & (e_1, e_2) \mapsto 0 \\
(e_2, e_1) \mapsto 0 & (e_2, e_1) \mapsto 0 & (e_2, e_1) \mapsto 1 & (e_2, e_1) \mapsto 0 \\
(e_2, e_2) \mapsto 0 & (e_2, e_2) \mapsto 0 & (e_2, e_2) \mapsto 0 & (e_2, e_2) \mapsto 1
\end{array}$$
Then it is obvious that these four maps are linearly independent and that
any bilinear map from $\mathbb{R}^2 \times \mathbb{R}^2$ to $\mathbb{R}$ is a linear combination of these.
Exercise 3.1.10. Prove the last remark.
Remark 3.1.2. We can write the map given by the first column as $dx \otimes dx$,
the second column map as $dx \otimes dy$, the third as $dy \otimes dx$ and the last as $dy \otimes dy$.
I shall explain this neat notation later.
If the tensors are of mixed type $(k, \ell)^T$, then by taking the dual basis for
the contravariant tensors we get the dimension is $n^{k+\ell}$. The Riemannian
Curvature tensor, which you may meet later, is a $(3, 1)^T$ tensor and in $\mathbb{R}^4$,
spacetime, it therefore has dimension $4^4 = 256$. This means it takes 256
numbers to specify it. Fortunately it has a lot of symmetries which reduces
the dimension to 20, otherwise nobody would have the patience to do any
calculations with it.
The space of alternating covariant 2-tensors is obviously a subspace of the
space of all covariant 2-tensors: on $\mathbb{R}^2$, we do not need to look at what any
such tensor does to $(e_1, e_1)$ because it has to be zero. Similarly if we know
it on $(e_1, e_2)$ we know its value on $(e_2, e_1)$; it is just the negative. So if we
know its value on $(e_1, e_2)$ we know it completely, and since a basis for the
space $\Lambda^2(\mathbb{R}^2)$ is the single alternating tensor which sends $(e_1, e_2)$ to 1,
the dimension of $\Lambda^2(\mathbb{R}^2)$ is one. Since it is easily verified that the alternating
map which sends $(e_1, e_2)$ to 1 is the determinant of the matrix formed by
putting the two vectors as adjacent columns, we see that the determinant is
a basis for the space $\Lambda^2(\mathbb{R}^2)$ of alternating 2-tensors on $\mathbb{R}^2$.
Exercise 3.1.11. Easily verify the above claim.
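A numerical spot check is no substitute for doing the exercise, but it may make the claim more believable. In the sketch below, which is only an illustration, an alternating bilinear map on $\mathbb{R}^2$ is built from its values on pairs of basis vectors and compared with the determinant. The names `det2` and `bilinear_from_table`, the constant c and the sample vectors are all hypothetical choices.

```python
def det2(u, v):
    """Determinant of the 2 x 2 matrix with u and v as adjacent columns."""
    return u[0] * v[1] - u[1] * v[0]

def bilinear_from_table(table):
    """Extend the values table[i][j] on the basis pair (e_(i+1), e_(j+1))
    to all pairs of vectors in R^2 by bilinearity."""
    def omega(u, v):
        return sum(u[i] * table[i][j] * v[j] for i in range(2) for j in range(2))
    return omega

c = 2.5
# An alternating map must vanish on (e_1, e_1) and (e_2, e_2) and change sign
# when its arguments are swapped, so the table is fixed by the single number c.
omega = bilinear_from_table([[0.0, c], [-c, 0.0]])

e1, e2 = [1.0, 0.0], [0.0, 1.0]
u, v = [1.0, 2.0], [3.0, -1.0]

print(det2(e1, e2), det2(e2, e1), det2(u, u))   # value on the basis pair, swap, repeat
print(omega(u, v), c * det2(u, v))              # omega agrees with c times the determinant
```

The last line illustrates the sense in which the determinant is a basis: the alternating map with value c on $(e_1, e_2)$ is c times the determinant.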
In $\mathbb{R}^3$ we have three basis elements, $(e_1, e_2, e_3)$, and if we look to see what
we have as a basis for the alternating two-tensors we observe that we know
any such alternating $\omega$ if we know it on $(e_1, e_2)$, $(e_2, e_3)$ and $(e_1, e_3)$. For
every other pair of basis elements, the result is forced by knowing $\omega$ on these
three together with the fact that $\omega$ is alternating. Since we have only three
choices of real numbers to make in order to nail down a particular alternating
2-tensor on $\mathbb{R}^3$, the dimension of $\Lambda^2(\mathbb{R}^3)$ is 3. And for $\mathbb{R}^n$, all we have to
do is to take pairs $e_i, e_j$ with $i < j$, and again knowing $\omega$ on these tells us
everything about $\omega$. There are $n(n - 1)$ ways of choosing an ordered pair of
different basis vectors from $\mathbb{R}^n$, and we need half of them, so the dimension
of $\Lambda^2(\mathbb{R}^n)$ is $n(n - 1)/2$.
And finally, we can choose k distinct basis elements from the set of n in
$n(n - 1)(n - 2) \cdots (n - k + 1)$ ways and each such way can be permuted in
k! ways and we need only one of them. We can choose a suitable basis on
which to define an alternating k-tensor to be the set $e_{i_1}, e_{i_2}, \cdots, e_{i_k}$ with
$i_1 < i_2 < \cdots < i_k$ and this can be done in
$${}^nC_k = \frac{n!}{k!(n - k)!}$$
ways, the number of ways of choosing k things from n. So the dimension of
$\Lambda^k(V^n)$ is ${}^nC_k$.
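The counting arguments above can be checked by brute force: list the index choices and count them. The following sketch uses only the Python standard library, and the function names are hypothetical. It enumerates all ordered index choices (for general covariant k-tensors) and the strictly increasing ones (for alternating k-tensors).

```python
from itertools import combinations, product
from math import comb

def all_index_choices(n, k):
    """Every ordered choice (i_1, ..., i_k) of basis indices from 1..n."""
    return list(product(range(1, n + 1), repeat=k))

def increasing_index_choices(n, k):
    """The choices with i_1 < i_2 < ... < i_k, one per alternating basis element."""
    return list(combinations(range(1, n + 1), k))

print(increasing_index_choices(3, 2))      # the three pairs used for R^3
print(len(all_index_choices(2, 2)))        # covariant 2-tensors on R^2
print(len(all_index_choices(4, 4)))        # four indices on R^4, as for the curvature tensor
for n, k in [(2, 2), (3, 2), (4, 2), (5, 3)]:
    count = len(increasing_index_choices(n, k))
    print(n, k, count, comb(n, k))          # enumeration against the closed formula
```

Enumerating is hopeless for large n and k, of course, which is why one wants the formula, but for small cases it is a cheap way to catch an arithmetic slip.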
The space of symmetric tensors is similar except that we do not know the
value of $\omega$ when two choices of the same basis elements of $\mathbb{R}^2$ are made. In
$\mathbb{R}^2$ the 2-tensor is determined if we know $\omega(e_1, e_1)$, $\omega(e_1, e_2)$, $\omega(e_2, e_1)$,
and $\omega(e_2, e_2)$. If we know it is symmetric we don't need both $\omega(e_1, e_2)$ and
$\omega(e_2, e_1)$. So the dimension of the symmetric 2-tensors on $\mathbb{R}^2$ is three. The
symmetric k-tensors on $\mathbb{R}^n$ have a basis the set of values of maps defined
on $e_{i_1}, e_{i_2}, \cdots, e_{i_k}$ with $i_1 \le i_2 \le \cdots \le i_k$. For the 2-tensors on $\mathbb{R}^n$ we
can choose two elements in $n^2$ ways and we can notice that n of these have
both elements the same. The remaining $n(n - 1)$ ways have the subscripts
different and we can select half of them. So the dimension is $n(n - 1)/2 + n =
n(n + 1)/2$. I leave you to work out the dimension of the space of symmetric
k-tensors on $\mathbb{R}^n$.
Note that the space $\Lambda^n(\mathbb{R}^n)$ always has dimension 1. Taking the $\omega$ defined by
taking the value one on $(e_1, e_2, e_3, \cdots, e_n)$ in that order, we observe that we
have a particularly simple alternating n-form on $\mathbb{R}^n$. It is called the volume
element, and its value on any set of n vectors in some order can be calculated
using multilinearity. If we write each vector out as a column, the result is
the determinant of the resulting $n \times n$ matrix. This is a good way to define
the determinant.
Note that we could write out a basis for $\Lambda^k(\mathbb{R}^n)$ for $k \le n$ in terms of
the possible choices of k row elements by taking the determinant of the
result. Thus alternating tensors are all about determinants or, alternatively,
determinants are all about alternating tensors.
Exercise 3.1.12.
1. By evaluating an $\omega$ in the space of $(2, 0)^T$ tensors on $\mathbb{R}^2$ on the elements $(e_1, e_1)$, $(e_2, e_1)$, $(e_1, e_2)$, $(e_2, e_2)$ to get $(a, b, c, d)$ respectively, show that $\omega$ is defined by a $2 \times 2$ matrix using suitable matrix operations.
2. Show that this is equivalent to acting on the pair of vectors $(\mathbf{a}, \mathbf{b})$ by a 2-tensor by writing the matrix as $A$ and calculating $\mathbf{a}^T A \mathbf{b}$.
3. Complete the scruffy arguments used to obtain the dimension of $\Lambda^k(\mathbb{R}^n)$ which took suitable basis elements of
$$\underbrace{\mathbb{R}^n \times \mathbb{R}^n \times \cdots \times \mathbb{R}^n}_{k \text{ terms}}$$
to define a set of multilinear maps, with the implied belief that we can extract a basis for $\Lambda^k(\mathbb{R}^n)$ by fixing suitable values. In particular show that a set of such maps is linearly independent and spans $\Lambda^k(\mathbb{R}^n)$.
4. Show that the symmetric $(2, 0)^T$ tensors form a subspace. What is the dimension? Give a basis for it.
5. Do the same for the alternating $(2, 0)^T$ tensors.
6. Show that the determinant acting on
x
y
u
v
2. Show that if s, s
and t, t
1
+t
2
)
is what you'd expect it to be on the optimistic assumption that $\otimes$ is a nice well-behaved multiplication.
3. Give a basis for the space of $(2, 0)^T$ tensors on $\mathbb{R}^2$ in terms of $dx$ and $dy$. Hint: Note that $dx^i : \mathbb{R}^n \to \mathbb{R}$ is a covariant 1-tensor on $\mathbb{R}^n$ for any $n$. If $n = 2$ we call them $dx$ and $dy$. Certainly the tensor product of any two 1-tensors is a 2-tensor. Show that every 2-tensor is a linear combination of such tensor products. (A count of basis elements might save you some trouble here.) Look back to Remark 3.1.2 to find the answer written down, with an explanation promised later. This is the explanation.
4. Represent the tensor $dx \otimes dy$ as a matrix over $\mathbb{R}^2$.
5. Represent the tensor $dx \otimes dy$ as a matrix over $\mathbb{R}^3$.
6. Repeat the last two for the tensor $dx \otimes dy - dy \otimes dx$. (Later we shall call this tensor $dx \wedge dy$.)
Exercise 3.1.15. Show by an example that not every 2-tensor on $\mathbb{R}^2$ can be written as a tensor product of 1-tensors. This is obvious once you see it but some people are tempted to suppose all higher order tensors are tensor products of 1-tensors. The moral: one 2-tensor is not the same thing as two 1-tensors!
Example 3.1.2. We can write down a bit of the tensor algebra (not all of it, it is infinite dimensional) on $\mathbb{R}^2$ without too much trouble. Note that I use $dx$ to specify the linear map from $\mathbb{R}^2$ to $\mathbb{R}$ which projects on the first component, and $dy$ for the projection on the second component.
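$$
\begin{array}{lll}
\text{Order} & \text{basis} & \text{isomorphic to} \\
T^k(\mathbb{R}^2) & dx^{i_1} \otimes \cdots \otimes dx^{i_k} & \mathbb{R}^{2^k} \\
\vdots & \vdots & \vdots \\
T^3(\mathbb{R}^2) & dx \otimes dx \otimes dx, \ldots, dy \otimes dy \otimes dy & \mathbb{R}^8 \\
T^2(\mathbb{R}^2) & dx \otimes dx,\ dx \otimes dy,\ dy \otimes dx,\ dy \otimes dy & \mathbb{R}^4 \\
T^1(\mathbb{R}^2) & dx,\ dy & \mathbb{R}^2 \\
T^0(\mathbb{R}^2) & 1 & \mathbb{R}
\end{array}
$$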
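Using the isomorphisms we can also write out the tensor multiplication in an admittedly strange form:
$$
\begin{array}{c|cccc}
\otimes & \mathbb{R} & \mathbb{R}^2 & \mathbb{R}^4 & \cdots \\
\hline
\mathbb{R} & (\mathbb{R}, \times) & (\mathbb{R}^2, \cdot) & (\mathbb{R}^4, \cdot) & \cdots \\
\mathbb{R}^2 & (\mathbb{R}^2, \cdot) & (\mathbb{R}^4, ?) & (\mathbb{R}^8, ?) & \cdots \\
\mathbb{R}^4 & (\mathbb{R}^4, \cdot) & (\mathbb{R}^8, ?) & (\mathbb{R}^{16}, ?) & \cdots \\
\vdots & \vdots & \vdots & \vdots &
\end{array}
$$
Table 3.1.2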
Here I have started with the 0-tensors, then the 1-tensors, and so on, and used the isomorphisms to indicate where the tensor product takes us. The symbol $\times$ means ordinary multiplication, and $\cdot$ means scalar multiplication. The question marks remain to be filled in, but I will do the multiplication from $T^1(\mathbb{R}^2) \times T^1(\mathbb{R}^2)$ to $T^2(\mathbb{R}^2)$. In the bases given this is
$$\mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}^4, \qquad \left( \begin{bmatrix} a \\ b \end{bmatrix}, \begin{bmatrix} c \\ d \end{bmatrix} \right) \mapsto \begin{bmatrix} ac \\ ad \\ bc \\ bd \end{bmatrix}$$
If you represent elements of $T^2(\mathbb{R}^2)$ by $2 \times 2$ matrices, you can get this result by a matrix multiplication
$$\begin{bmatrix} c \\ d \end{bmatrix} [a, b]$$
For what that's worth.
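A small sketch may make the multiplication of 1-tensors less mysterious. The helper names below are my own and the numbers are arbitrary; the point is only that the coordinates $(ac, ad, bc, bd)$ really do describe the bilinear map $(u, v) \mapsto f(u)\,g(v)$, and that the product depends on the order of the factors.

```python
def tensor_product(f, g):
    """Coordinates of f (x) g in the basis dx(x)dx, dx(x)dy, dy(x)dx, dy(x)dy.

    f = (a, b) stands for a*dx + b*dy and g = (c, d) for c*dx + d*dy.
    """
    return [fi * gj for fi in f for gj in g]

def evaluate_1tensor(f, u):
    return sum(fi * ui for fi, ui in zip(f, u))

def evaluate_2tensor(coords, u, v):
    """Evaluate a 2-tensor, given by its coordinates, on the pair (u, v)."""
    n = len(u)
    return sum(coords[i * n + j] * u[i] * v[j] for i in range(n) for j in range(n))

f, g = (2, 3), (5, 7)
fg = tensor_product(f, g)
print(fg)                                  # [10, 14, 15, 21], that is [ac, ad, bc, bd]

u, v = (1, -1), (4, 2)
print(evaluate_2tensor(fg, u, v))          # -34
print(evaluate_1tensor(f, u) * evaluate_1tensor(g, v))  # -34 again
print(tensor_product(g, f))                # [10, 15, 14, 21], not the same as fg
```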
Exercise 3.1.16. Fill in the other question marks.
It is easy to generalise the tensor algebra so that we can take the tensor product of contravariant tensors or of a covariant tensor with a contravariant tensor or of two mixed tensors. These things are best understood by constructing simple examples when they are ridiculously easy, rather than by looking at the formal definitions which on first encounter are terrifying. Algebra is learnt by making up lots of examples. When you have done this you can easily see what is being said and after a small amount of practice you can use the language to terrorise people unfamiliar with it. This is childish and you should be ashamed of yourself for actually frightening engineers, applied mathematicians and physicists this way.
Exercise 3.1.17. A covariant 1-tensor on $\mathbb{R}^n$ is a linear map from $\mathbb{R}^n$ to $\mathbb{R}$ and consequently an element of $\mathbb{R}^{n*}$. We write $dx^i : \mathbb{R}^n \to \mathbb{R}$ to be the projection which picks out the $i$th component of each vector.
1. Show the set $\{dx^i : i \in [1:n]\}$ is a basis for $T^1(\mathbb{R}^n)$.
2. Show that the set $\{dx^i \otimes dx^j : i, j \in [1:n]\}$ is a basis for the space $T^2(\mathbb{R}^n)$, so any $\omega \in T^2(\mathbb{R}^n)$ can be specified by the entries in an $n \times n$ matrix relative to this basis.
3. Take some $\omega \in T^2(\mathbb{R}^3)$ and specify it by such a matrix, take two elements of $\mathbb{R}^3$ and show how to evaluate $\omega$ on them by matrix multiplications of representations of the vectors with respect to the standard basis for $\mathbb{R}^3$.
4. Choose a different basis for $\mathbb{R}^3$; discuss what has to be done to the matrix in order to get it to still represent the same $\omega \in T^2(\mathbb{R}^3)$.
5. The space $T_1(\mathbb{R}^n)$ is the space of linear maps from $\mathbb{R}^{n*}$ to $\mathbb{R}$ and is hence $\mathbb{R}^{n**}$. It is naturally isomorphic to $\mathbb{R}^n$, and we can use the natural isomorphism to take $\{e_i : i \in [1:n]\}$ to be a basis for $T_1(\mathbb{R}^n)$. What is a basis for $T_2(\mathbb{R}^n)$? For $T_k(\mathbb{R}^n)$?
6. Since the dimension of $T_2(\mathbb{R}^n)$ is clearly $n^2$ we can represent any element $\omega \in T_2(\mathbb{R}^n)$ by an $n \times n$ matrix as before. How does this transform under change of basis?
Remark 3.1.4. Note that this confuses an earlier definition which had $dx^i$ as a linear map from $\vec{\mathbb{R}}^n$ to $\mathbb{R}$, but the reason for the confusion will become clear later. If you are rather more fussy than I am, you might want to do it right: If $\{e_i : i \in [1:n]\}$ are the standard basis elements for $\mathbb{R}^n$, we can write the corresponding dual basis for $\mathbb{R}^{n*}$ as $\{e^i : i \in [1:n]\}$. Then go through replacing $dx^i$ with $e^i$ throughout and you will have it in impeccable form.
Remark 3.1.5. We define a 0-tensor on any real vector space to be another and more exotic name for a real number. This means we can take tensor products of 0-tensors with $k$-tensors to get the scaling operation. You should have already worked this out from doing the exercises.
Exercise 3.1.18. If you read Darling's book you will discover that he goes about defining tensor algebras quite differently. He defines $U \otimes V$ for any real vector spaces $U$ and $V$, by proving a universality theorem which is somewhat obscure.
You can recover Darling's treatment as follows.
First note that if we have $L(U, \mathbb{R})$ and $L(V, \mathbb{R})$ we can define $L(U, \mathbb{R}) \otimes L(V, \mathbb{R})$ to have elements $f \otimes g$, which means for each $f \in L(U, \mathbb{R})$ and $g \in L(V, \mathbb{R})$ we take $f \otimes g : U \times V \to \mathbb{R}$ by
$$f \otimes g(u, v) = f(u) \times g(v)$$
where $\times$ is just multiplication in $\mathbb{R}$. From now on I shall just write $f(u)g(v)$ for this. It is clear that this is a bilinear map from $U \times V$ to $\mathbb{R}$. It is also clear that $L(U, \mathbb{R}) \otimes L(V, \mathbb{R})$ is a vector space under the usual operations of scalar multiplication and addition.
But $L(U, \mathbb{R})$ is just $U^*$ and $L(V, \mathbb{R}) = V^*$. So we have defined $U^* \otimes V^*$ as a new vector space. It is therefore perfectly straightforward to define $U^{**} \otimes V^{**}$, and identifying $U^{**}$ with $U$ and $V^{**}$ with $V$, we have $U \otimes V$.
Show that this gives us Darling's treatment. Find the dimension of $U \otimes V$ in terms of the dimension of $U$ and the dimension of $V$. Find an explicit representation for $\mathbb{R}^2 \otimes \mathbb{R}^3$ and calculate
$$\begin{bmatrix} 1 \\ 2 \end{bmatrix} \otimes \begin{bmatrix} 4 \\ 5 \\ 6 \end{bmatrix}$$
x
y
y
x
x
y
R
2
is an improvement.
As mentioned earlier, as well as attaching vectors to points in a manifold we can attach other things. If we attach real numbers we merely get a map from the manifold to $\mathbb{R}$, and we have now got a new way to think of such a map. Could we attach a matrix? To see the useful way to do such a thing, observe that the tangent bundle is merely obtained by taking as fibre the tangent space at each point. But we could start with the tangent space and replace it with its dual space. The thing that we get when we take the fibre bundle with the dual to the tangent space as fibre and glue all these fibres together using the same method as for the tangent bundle is called the cotangent bundle. It looks rather similar and in $\mathbb{R}^2$ the only difference would be that instead of attaching the space of columns of two numbers (representing the possible arrows at each point), we would be attaching the space of rows of two numbers (representing linear maps from $\mathbb{R}^2$ to $\mathbb{R}$). The spaces are isomorphic, but clearly the elements are not the same. Of course I have added my own bit of confusion here by confusing linear maps with their matrix representation, another isomorphism. On $\mathbb{R}^n$ this is harmless. On a manifold it generally is not and again the isomorphism needs to be thought about.
The cotangent bundle is important in classical mechanics where it corresponds to the momentum space whereas the tangent space corresponds to the velocity space. The reason is that we have an energy function. If we look at $\mathbb{R}^2$ it has tangent space what I have called $\mathbb{R}^2 \times \vec{\mathbb{R}}^2$. Now the function $\frac{1}{2}mv^2$ is a function from $\vec{\mathbb{R}}^2$ to $\mathbb{R}$,
$$\begin{bmatrix} v^1 \\ v^2 \end{bmatrix} \mapsto \frac{m}{2}\left((v^1)^2 + (v^2)^2\right)$$
and the derivative of this function is the row matrix
$$[mv^1, mv^2]$$
This is an element of the cotangent bundle because it is a covector, not a vector. Cheerfully confusing the two leads to ghastly muddle further down the track.
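Here is a rough numerical check that the derivative of the kinetic energy is the row of momenta. It is a sketch only: the mass, the velocity and the helper `derivative_row` are made up for the purpose, and the derivative is approximated by central differences rather than computed symbolically.

```python
def kinetic_energy(m, v):
    return 0.5 * m * sum(vi * vi for vi in v)

def derivative_row(f, v, h=1e-6):
    """Central-difference approximation to the row of partial derivatives of f at v."""
    row = []
    for i in range(len(v)):
        up = list(v)
        up[i] += h
        down = list(v)
        down[i] -= h
        row.append((f(up) - f(down)) / (2 * h))
    return row

m = 3.0
v = [2.0, -1.0]
p = derivative_row(lambda w: kinetic_energy(m, w), v)
print(p)  # close to [m * v[0], m * v[1]] = [6.0, -3.0]

# The row acts on a column (a small change of velocity) to give a number,
# the first-order change in energy. That is what makes it a covector.
w = [0.1, 0.2]
print(sum(pi * wi for pi, wi in zip(p, w)))  # close to 6*0.1 - 3*0.2 = 0.0
```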
As well as the dual of the tangent space attached to each point of a smooth manifold, we can attach tensors. The vector space of $k$-tensors such as
$$\omega : \underbrace{V \times V \times \cdots \times V}_{k \text{ copies}} \to \mathbb{R}$$
(any space of maps into $\mathbb{R}$ is a vector space) has a basis consisting of the multilinear maps evaluated on each of the $n^k$ combinations of basis elements of $V$. If $V = T_a(M)$ for some manifold $M$ and some $a \in M$, then $\omega$ will be specified relative to this basis by $n^k$ numbers where $n$ is the dimension of $V$ and hence $M$, and $k$ is the order of the tensor. In principle this could be an awful lot of numbers (but in practice it usually isn't).
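To see the $n^k$ numbers at work, here is a sketch in which a $k$-tensor is stored as a dictionary from index tuples to values, and evaluated on $k$ vectors by multilinearity. The representation and the example tensor (the ordinary dot product on $\mathbb{R}^3$) are my choices for illustration, not anything forced by the theory.

```python
from itertools import product

def evaluate(components, vectors):
    """Evaluate a k-tensor from its values on basis k-tuples, using multilinearity.

    components maps each index tuple (i_1, ..., i_k) to the value of the tensor
    on (e_{i_1}, ..., e_{i_k}); vectors is a list of k coordinate lists.
    """
    total = 0.0
    for index, value in components.items():
        term = value
        for vec, i in zip(vectors, index):
            term *= vec[i]
        total += term
    return total

n, k = 3, 2
# Illustrative choice: omega(e_i, e_j) = 1 if i == j and 0 otherwise.
components = {
    idx: (1.0 if idx[0] == idx[1] else 0.0)
    for idx in product(range(n), repeat=k)
}
print(len(components))                                  # 9, which is n ** k
print(evaluate(components, [[1, 2, 3], [4, 5, 6]]))     # 32.0
```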
This $k$-tensor vector space can also be thought of as stuck on the manifold at $a$. The tangent space is just the locally trivial vector bundle over the manifold as base space with fibre the tangent space at each point. In exactly the same way we can take a locally trivial vector bundle with fibre the vector space of $k$-tensors on the tangent space at each point. At each $a \in M$ this is a vector space of dimension $n^k$. Given that a finite dimensional real vector space has a topology which is invariant under isomorphisms, and given that the tensor bundles must be locally trivial (since the tangent space to $\mathbb{R}^n$ is a trivial bundle), the topology of each tensor bundle has as a base the cartesian product of those open sets in the manifold which are subsets of those open sets over which the bundle is trivial, with open sets in the fibre.
If we do this for every $a \in M$ we get the $k$-tensor bundle of $M$. If we do it for all the possible $k$ we get the full tensor bundle of $M$.
Then:
Definition 3.2.1. A smooth $k$-tensor field on a manifold $M$ is a smooth section of the $k$-tensor bundle.
Awful Warning
There is scope for some confusion here. If we take the manifold to be $\mathbb{R}^n$, the tangent space is, in my idiosyncratic notation, $\mathbb{R}^n \times \vec{\mathbb{R}}^n$ and this makes a vector field a section of this bundle. If we look to see what sort of tensor field this is, we see that the tensor field must assign to each point of $\mathbb{R}^n$ a multilinear map from some space $V$ to $\mathbb{R}$. There is only one possibility and that is to make $V$ the dual space to $\vec{\mathbb{R}}^n$ and take the linear maps. This means that by identifying the double dual with the original space $\vec{\mathbb{R}}^n$ we get the right answer. Thus a vector field is a $T_1$ tensor on the tangent space $T_a(\mathbb{R}^n)$ for every $a \in \mathbb{R}^n$. So a covariant vector field in the ordinary sense, mentioned in the last chapter, is a contravariant tensor field. There is, if you like, an element of dualising in defining a tensor in the first place, so we have to dualise again to get rid of it.
The terminology is unfortunate since the tangent functor takes tangent vectors to tangent vectors and is covariant, but not many people have the nerve to change the traditional terminology. I certainly don't.
Some of the books by physicists make a pig's breakfast of all of this duality. Confusion is the natural state of man. And woman. Try to be clear about which space you are working in and avoid the muddle.
End of Awful Warning
If we take the alternating covariant $k$-tensors on the tangent space at every point of the manifold the smooth tensor field of sections is called a differential $k$-form on the manifold. Similarly we can limit the section to taking values in the symmetric $k$-tensors. Both of these are important.
This sounds horrible but if you look at 2-tensors, alternating or not, you can see that on the tangent space $T_a(M)$ when $M = \mathbb{R}^n$, any one of them can be represented nicely by an $n \times n$ matrix of numbers. And if we select one such matrix for each point $a$ of $M$, then we get an $n \times n$ matrix of functions from the manifold to $\mathbb{R}$. So as long as $k$ is one or two we do not have anything very complicated. If $k = 1$ then we are talking about vector fields or covector fields, and if $k = 2$ we are sticking matrices onto the manifold at each point. If we never go beyond dimension 3 then the worst thing we have to imagine is a space with a $3 \times 3$ matrix associated with each point of the space. This is not really very bad. Admittedly this only gets us the applied mathematician's view of the world, but at least we know how to generalise it to higher dimensions and higher orders if it turns out to be necessary.
A serious issue with this simplified view of things is that the specification of the matrix representing a tensor on any $T_a(M)$ requires us to choose a basis for $T_a(M)$. And if we now do the same for some $T_b(M)$ for $a \neq b$ then we need to choose a basis for representing tensors on $T_b(M)$. But the spaces $T_b(M)$ and $T_a(M)$ don't have much to do with each other in general. So in what sense can it be made the same basis? And if it is different, how do we ensure that the matrix of functions is going to behave nicely in representing the tensor fields? The tensor fields are perfectly respectable things, but if we insist on representing them by matrices of functions we have some serious problems. Note that if $M = \mathbb{R}^n$, the tangent spaces can be shifted into each other in a natural way and the idea that we are using the same basis for each of them makes sense. It all goes wrong if $M = S^n$. We need in this case something like an explicit isomorphism between the tangent spaces at different points.

Figure 3.2.1: Shifting a vector between tangent spaces.
To see what can go wrong here, imagine a sphere and take a point on the equator. Attach a vector to this point, say one pointing along the equator. I have shown this in figure 3.2.1. Now look to see what happens if you move it parallel to itself along a line of longitude so that it moves up towards the north pole. It seems reasonable to say that we are shifting the vector so it is still pointing in the same direction, and still has the same length, despite the fact that the vectors are all in different tangent spaces. In other words I am claiming that I can tell when two vectors in two different tangent spaces are the same. That this is insanely foolhardy becomes apparent if I go to the same place by a different route. Suppose I first go around the equator. My pink vector also goes around the equator, becoming rather purpler as it goes. When it is opposite the starting point, I now move it up the curve of longitude until it gets to the north pole. All the way, by both paths, I have moved the vector so it is pointing in the same direction, but the result is a pair of vectors pointing in opposite directions. So cheerfully doing on a sphere what makes perfect sense on $\mathbb{R}^2$ is fraught with problems.
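You can watch the two routes disagree numerically. The sketch below rests on an assumption of mine about what "moving a vector parallel to itself" along a great circle should mean: the vector is rotated rigidly, together with the sphere, about the axis of that great circle. The function `transport` and the particular points are made up for illustration.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def transport(v, p, q):
    """Carry the tangent vector v from p to q along the shorter great-circle arc.

    Assumes p and q are unit vectors, neither equal nor antipodal. The vector is
    rotated about the axis p x q by the angle between p and q (Rodrigues' formula).
    """
    axis = cross(p, q)
    norm = math.sqrt(dot(axis, axis))
    k = tuple(x / norm for x in axis)
    angle = math.atan2(norm, dot(p, q))
    c, s = math.cos(angle), math.sin(angle)
    kxv = cross(k, v)
    kv = dot(k, v)
    return tuple(v[i] * c + kxv[i] * s + k[i] * kv * (1 - c) for i in range(3))

start = (1.0, 0.0, 0.0)      # a point on the equator
pole = (0.0, 0.0, 1.0)       # the north pole
v0 = (0.0, 1.0, 0.0)         # tangent at start, pointing along the equator

# Route 1: straight up the line of longitude.
direct = transport(v0, start, pole)

# Route 2: halfway round the equator (as two quarter turns), then up to the pole.
quarter = (0.0, 1.0, 0.0)
opposite = (-1.0, 0.0, 0.0)
w = transport(v0, start, quarter)
w = transport(w, quarter, opposite)
roundabout = transport(w, opposite, pole)

print([round(x, 12) for x in direct])
print([round(x, 12) for x in roundabout])
```

Up to rounding, the first vector is $(0, 1, 0)$ and the second is $(0, -1, 0)$: same pole, opposite directions.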
One of the hardest things to do is to unlearn things you soaked up through the skin when young and gullible. If you were encouraged to think that isomorphic vector spaces were never worth distinguishing and you went all sloppy in your thinking as a consequence, you now have the formidable job of working it all out again. Don't blame me, blame the scruffy bunch who taught you manifest nonsense and blame yourself for buying it. Think of this in the future and currently and regard it as a second Awful Warning.
Exercise 3.2.1.
1. Take $M = S^1$ and $k = 2$. Define a $k$-tensor field on $S^1$.
2. Take $M = S^2$ and $k = 2$. Define an alternating 2-tensor field (2-form) on $S^2$. Explain what this might have to do with area of regions on a sphere and indicate how you might calculate the area of a region in $S^2$ with respect to your choice of 2-form.
3. Take $M = \mathbb{R}^3$ and $k = 2$. Define a differential 2-form on $\mathbb{R}^3$.
Note that we can talk about $k$-covariant and contravariant mixed tensors and mixed tensor fields.
3.3 The Riemannian Metric Tensor
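Recall from M213 that an inner product on a vector space $V$ is a positive definite symmetric quadratic form, which is to say a map
$$\langle\ ,\ \rangle : V \times V \to \mathbb{R}$$
such that
1. $\langle\ ,\ \rangle$ is bilinear; that is $\forall u \in V$, $\langle u, \cdot \rangle : V \to \mathbb{R}$ is linear and $\forall v \in V$, $\langle \cdot, v \rangle : V \to \mathbb{R}$ is linear
2. $\langle\ ,\ \rangle$ is symmetric; that is $\forall u, v \in V$, $\langle u, v \rangle = \langle v, u \rangle$
3. $\langle\ ,\ \rangle$ is positive definite; that is $\forall u \in V$, $\langle u, u \rangle \ge 0$ and $\langle u, u \rangle = 0 \Rightarrow u = 0$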
We can now summarise the above conditions by saying that an inner product for $V$ is a symmetric covariant 2-tensor on $V$, with the additional property of being positive definite. Recall also from M213 that positive definiteness can be specified by observing that nondegenerate quadratic forms can be classified as to their general shape by diagonalising them and then rescaling the axes so that they are all diagonal with entries $+1$ along the diagonal down to some point after which they are $-1$. This gives the signature of the quadratic form $(1, 1, 1, \ldots, 1, -1, -1, \ldots, -1)$ for some number of positive and some number of negative ones. A positive definite form has all $n$ entries $+1$. There are also degenerate forms where some of the entries after diagonalisation may be zero.

Figure 3.3.1: (Bits of) some perfectly respectable Hilbert Spaces stuck on a manifold.

I shall only consider the positive definite forms here, although physicists want to look at the general case of nondegenerate forms because in relativity we have to put in time as an extra dimension, which gives a signature $(1, 1, 1, -1)$ or $(3, 1)$. (Or $(-1, 1, 1, 1)$ if you are a physicist. Physicists put time first. Some of them even use $(1, 3)$, multiplying our form by $-1$. I shall outline a reason for this in the next chapter.)
We now define a Riemannian metric tensor field on a manifold as a positive definite symmetric 2-tensor field. That is, at every point of the manifold we attach, smoothly, some positive definite symmetric 2-tensor. This means we have some bilinear function of a pair of tangent vectors at each point. It is a daft name, and it would have been much more sensible to call it a Riemannian inner product tensor field, because it gives an inner product on each tangent space. But it is too late to be sensible now. Figure 3.3.1 shows some vectors in some of the tangent spaces to a sphere, and each pair has a sort of local dot product in a perfectly respectable tangent space which is now a perfectly respectable inner product space, in fact a perfectly respectable Hilbert Space.
Each such inner product may be specified, via charts, as a symmetric $2 \times 2$ matrix in the case of the figure, each matrix $A(a)$ at the point $a$ on the sphere acting on a pair of tangent vectors $u_a$ and $v_a$ to give
$$u_a^T A(a) v_a$$
but it would be better to regard it as a bilinear symmetric map which takes pairs of tangent vectors with their tails at some point of the manifold, and returns a real number. Thinking of it as a matrix makes it clear that there are three distinct numbers which depend on where we are on the sphere. For an $n$-manifold it will be $^{n+1}C_2$ distinct numbers for each point of the manifold, since a symmetric $n \times n$ matrix has $n$ diagonal entries and $n(n-1)/2$ entries above the diagonal. Or if you insist you can think of the metric tensor as $n(n+1)/2$ distinct functions from the manifold to $\mathbb{R}$. So it makes sense to physicists to write such a thing as $g_{\mu,\nu}$ where $\mu$ (mu) and $\nu$ (nu) range through the two possible values on a sphere or the three possible values on a three-manifold, or the four on space-time. With $g_{\mu,\nu} = g_{\nu,\mu}$. Of course this involves a choice of some charts to cover the manifold. It might be better to write $g_{\mu,\nu}(a)$ for $a$ a point in the manifold to remind ourselves that we have what is in effect a matrix valued function on the manifold, but we don't.
Note that there is absolutely no machinery for calculating the dot product of a tangent vector $u_a$ at a point $a$, with a tangent vector $v_b$ at a different point $b$. This can be done in $\mathbb{R}^n$ but the Inner Product tensor field doesn't allow it.
If the symmetric tensor is always positive definite we call it a Riemannian metric, and the manifold with this tensor field is called a Riemannian Manifold. If the symmetric tensor field has signature $(1, 1, 1, -1)$ it is called a Lorentzian metric and the manifold is called a Lorentzian manifold. Physicists treat the universe we live in, including time, as a Lorentzian manifold. More generally, for any signature of form we say we have a semi-Riemannian metric. Bear in mind at all times that when a physicist talks about a metric on a manifold he means, almost always, an inner product on all of its tangent spaces, not necessarily positive definite but always nondegenerate. Usually it is either Riemannian or Lorentzian.
If two quadratic forms are positive definite, so is their sum. It makes sense to add them because they are just functions, and if $\langle u, u \rangle \ge 0$ and $\prec u, u \succ\ \ge 0$ then the sum is also non-negative; if the sum is equal to zero both the terms $\langle u, u \rangle$ and $\prec u, u \succ$ must be equal to zero so $u = 0$. Moreover if we scale by a positive constant the result is another positive definite form, while if we scale by a negative constant the result is a negative definite form. Hence the positive definite symmetric covariant 2-tensors are not a vector subspace of the space of covariant 2-tensors on $V$, but they are an open subset of the vector space of symmetric covariant 2-tensors and therefore a manifold with a dimension.
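The closure properties are easy to poke at for $2 \times 2$ matrices. This is a sketch with made-up matrices; positive definiteness is tested with Sylvester's criterion (both leading principal minors positive), which is a standard test I am bringing in here rather than something used above.

```python
def is_positive_definite(g):
    """Sylvester's criterion for a symmetric 2x2 matrix ((a, b), (b, c))."""
    (a, b), (b2, c) = g
    assert b == b2, "matrix must be symmetric"
    return a > 0 and a * c - b * b > 0

def add(g, h):
    return tuple(tuple(x + y for x, y in zip(rg, rh)) for rg, rh in zip(g, h))

def scale(t, g):
    return tuple(tuple(t * x for x in row) for row in g)

g1 = ((2.0, 1.0), (1.0, 2.0))
g2 = ((1.0, -0.5), (-0.5, 3.0))

print(is_positive_definite(add(g1, g2)))       # True: sums stay positive definite
print(is_positive_definite(scale(4.0, g1)))    # True: so do positive multiples
print(is_positive_definite(scale(-1.0, g1)))   # False: negative multiples do not
print(is_positive_definite(scale(0.0, g1)))    # False: the zero tensor is excluded, so no subspace

# Openness: a small symmetric nudge does not destroy positive definiteness.
nudge = ((0.01, -0.02), (-0.02, 0.0))
print(is_positive_definite(add(g1, nudge)))    # True
```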
Exercise 3.3.1. What is the dimension of the space of positive definite symmetric 2-tensors on $\mathbb{R}^2$? Hint: it is the same as the dimension of the vector space of symmetric 2-tensors and if you represent a tensor by a matrix, you need to count the number of independent numbers in the matrix.
All the above makes sense if we use contravariant 2-tensors. In fact since I haven't said anything about $V$, it might just as well be the dual space to some other space.
Now we say it again formally:
Definition 3.3.1. A (positive definite) Riemannian metric for a manifold $M$ is a positive definite symmetric covariant 2-tensor field on $M$.
What does this mean in computational terms? It is easiest to begin by looking at a very simple case, a metric tensor field on $\mathbb{R}^2$. The idea of such a tensor field on $\mathbb{R}^2$ has to do with inner products on $\vec{\mathbb{R}}^2$, in fact one such inner product for each point of $\mathbb{R}^2$. This can be grasped by thinking of the matrix of numbers operating on pairs of vectors in $\vec{\mathbb{R}}^2$ being fixed for each point in $\mathbb{R}^2$, and as we move about in $\mathbb{R}^2$, we change the numbers in the matrix. So the numbers depend on where you are, and are given by smooth functions of your location in $\mathbb{R}^2$.
More generally, we take a manifold, we take a point on it, $a$, and look at the tangent space at $a$. Now we take the symmetric bilinear maps from this space to $\mathbb{R}$ which are positive definite. On $\mathbb{R}^n$, this inner product could be specified by taking $n$ independent vectors as a basis, then taking the dual space and the basis elements for that, and calling them $(dx^1, dx^2, \ldots, dx^n)$, and then writing the tensor as
$$\sum_{i,j \in [1:n]} g_{ij}\, dx^i \otimes dx^j$$
where $g_{ij}$ is an $n \times n$ symmetric positive definite matrix. This follows from an exercise which I hope you did. Alternatively you can use the Einstein convention and just write $g_{ij}\, dx^i \otimes dx^j$. If you were a classical mathematician or happen to be scruffy, you might leave out the $\otimes$, as if it is obvious to the meanest intellect what $dx^i\, dx^j$ means. You might perhaps imagine in a dim sort of way it means that you are multiplying a very, very little bit of the $i$th component of a vector with another very, very little bit of the $j$th component of a possibly different vector. In which case you are so confused there is no hope for you.
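The standard inner product on $\mathbb{R}^2$ can be written in this form as the identity matrix. To calculate
$$\begin{bmatrix} x \\ y \end{bmatrix} \cdot \begin{bmatrix} u \\ v \end{bmatrix}$$
we simply compute
$$[x\ \ y] \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix}$$
to get $xu + yv$. Doing the same with any other symmetric positive definite matrix instead of the identity will give us a new inner product.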
For $M = \mathbb{R}^2$ it makes sense to take the same basis $(dx, dy)$ for elements of the cotangent space over every point, so we get that it is possible to represent a Riemannian metric on $\mathbb{R}^2$ in the form
$$[a, b] \begin{bmatrix} g_{11}(x, y) & g_{12}(x, y) \\ g_{12}(x, y) & g_{22}(x, y) \end{bmatrix} \begin{bmatrix} c \\ d \end{bmatrix}$$
which, at each point $(x, y)$, takes a pair of tangent vectors in $\vec{\mathbb{R}}^2$ to $\mathbb{R}$. For any choice of two tangent vectors we get a real number. The $g_{ij}$ are smooth functions, for $i, j \in [1:2]$ (and $g_{12} = g_{21}$).
On $\mathbb{R}^3$ they would be smooth functions for $i, j \in [1:3]$. The matrix would be symmetric still and at each point it would be positive definite (or in general have the required signature).
3.3.1 What this means: Ancient History
If you reflect a little on what a covariant 2-tensor on the tangent space is, you will see that we have bilinear maps from pairs of vectors in the tangent space at $a$ to $\mathbb{R}$, for every $a$ in the manifold.
Now tangent vectors in the old days of classical geometry were not thought of as elements of a perfectly respectable vector space, but were imagined to be infinitesimal elements in the base space. You can see that if you take a velocity vector at a point $a \in \mathbb{R}^2$ and travel along it for a very, very short time, you trace out, more or less, a line segment in $\mathbb{R}^2$. If you put a little arrow on its head (its tail being at $a$) you get the beginnings of a picture of a vector field, which we learnt how to draw in second year. If you have a uniform velocity parallel to the X-axis and of unit length and in the direction of increasing $x$ we can represent this by a tiny little arrow attached to $a$ and pointing in the direction of increasing $x$. Such a tangent vector should be rather small because it really represents a velocity through $a$, and hence an element of what I have called $\vec{\mathbb{R}}^2$, not a set of points in $\mathbb{R}^2$. The practicalities are that velocities change and can change continuously so a big long vector would be misleading. In fact any finite length vector is misleading, but we can be sloppy and imagine that velocities have been turned into distances by travelling for very short times.
The idea of an infinitesimal time, one so small it was not effectively distinguishable from zero, but where ratios of infinitesimals made sense and need not be zero is one which seems natural to many people. My Mathematics teacher at school talked of $dy/dx$ being a ratio of numbers each of which was infinitesimal, that is not individually distinguishable from zero. I thought he was off his head. I still do. This isn't mathematics, it's nonsense[1]. It does however suggest mathematics. So although my Maths master was talking incoherent garbage, there is something there which makes sense. And the idea of infinitesimal distances and times leading to a definite velocity, a sort of garbled version of the definition of a limit, has been used a great deal in times past.
One way to think of this which you may find useful is contained in the following example.
Let the curve $c$, $x(t) = t$, $y(t) = 2\sin(t)$, be given. We look at the origin, through which the curve passes. First we take the line segment from
$$\begin{bmatrix} 0 \\ 0 \end{bmatrix} \text{ to } \begin{bmatrix} u \\ 2\sin(u) \end{bmatrix}$$
for some $u \neq 0$.
This line segment has two important numbers associated with it, the projection along the $x$-axis and the projection along the $y$-axis. I shall call such a line segment $\ell_u$ and the two numbers $\Delta x(\ell_u)$ and $\Delta y(\ell_u)$. The slope of the line segment is $\frac{\Delta y(\ell_u)}{\Delta x(\ell_u)}$. So I think of $\Delta y$ as assigning one number to each such $\ell_u$ and $\Delta x$ as assigning another with the ratio being the slope of the line segment.
As we take shorter and shorter line segments, that is if we let $u \to 0$ in the example, the numbers get smaller but the ratio in general does not. I can easily stipulate that the line segment has one end fixed (at 0 in our case) and the other end lies along the curve given.
[1] It is possible to go through model theory and make these ideas respectable, but this requires a lot of logic. It is also possible to junk the lot and replace it with the idea of a limit. And finally it is possible to choose terminology which looks a lot like the incoherent rubbish but actually makes sense. This last is what we do and it explains some of the more baroque aspects of our language.

Figure 3.3.2: $\Delta x$ and $\Delta y$ and $dx$ and $dy$.
Now look at the tangent vector at 0 defined by the curve above. It is a perfectly respectable vector in the tangent space $\vec{\mathbb{R}}^2_0$ at 0. In fact I can take a basis for $\vec{\mathbb{R}}^2_0$ consisting of the vector of unit positive speed along the $x$-axis, which I have called $i$ or $e_1$ or $\partial/\partial x$ earlier, and the second vector being defined by a curve of positive unit speed along the $y$-axis which I have called $e_2$ and $\partial/\partial y$ earlier but might have called $j$. In this basis it is easy to see that the tangent vector at 0 defined by the curve is just
$$\begin{bmatrix} 1 \\ 2 \end{bmatrix}$$
I have already defined $dx$ in the cotangent space as the linear map which sends this tangent vector to $1 \in \mathbb{R}$ and $dy$ as the linear map which sends it to 2.
So $dx$ and $dy$ do to tangent vectors what $\Delta x$ and $\Delta y$ do to line segments in the original space. I have shown the idea in figure 3.3.2. Note that we can say that $dy/dx$ for this tangent vector is just 2 by straight division. And of course this is precisely what we get when we differentiate $2\sin(x)$ at the origin, which is not exactly a surprise.
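If you want to see the ratio settle down while the two numbers shrink, a few lines of arithmetic will do it. This is a sketch for this one curve only; the names `segment_projections`, `dx` and `dy` in the code are mine, and `dx`, `dy` are implemented as the obvious projections on the tangent vector.

```python
import math

def curve(t):
    return (t, 2 * math.sin(t))

def segment_projections(u):
    """Projections on the axes of the segment from curve(0) to curve(u)."""
    x0, y0 = curve(0.0)
    x1, y1 = curve(u)
    return (x1 - x0, y1 - y0)

for u in (1.0, 0.1, 0.01, 0.001):
    delta_x, delta_y = segment_projections(u)
    # both numbers shrink with u, but their ratio approaches 2
    print(u, delta_x, delta_y, delta_y / delta_x)

# The tangent vector at the origin and the two covectors acting on it.
tangent = (1.0, 2.0)
dx = lambda w: w[0]
dy = lambda w: w[1]
print(dy(tangent) / dx(tangent))  # 2.0, by straight division
```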
Classically, the idea of $\Delta x$ was what you were probably taught at school: it was a little bit of $x$, and $\Delta y$ was a little bit of $y$, but you were really looking at line segments along curves, and $\Delta x$ and $\Delta y$ are probably better thought of as maps from line segments to $\mathbb{R}$. It is easy to see that with this way of looking at things, the claim
$$\frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}$$
makes sense provided we specify what we really mean by the terms. This would involve saying that we are calculating $\Delta x(\ell)$ and $\Delta y(\ell)$ for line segments joining some fixed point on a curve to other points, and the limit means that the other points are taken to be getting closer and closer to the fixed point. All this explanation was unfortunately regarded as not really part of the mathematics and consequently got left out of the notation. If we intend to study the subject on manifolds we have to put it back in.

Figure 3.3.3: A new rule for measuring distances of points from the origin.

The idea, then, that $\Delta x$ means a little bit of $x$ and $dx$ means a very, very little bit of $x$ (so little that it is infinitesimal) still survives in the literature. And the classical mathematicians wrote
$$d\ell^2 = dx^2 + dy^2$$
to be an infinitesimal version of Pythagoras' Theorem and then used it to find the length of curves. These days we define everything through limits, which you spend a lot of time doing more or less rigorously in first year. At least, that was the idea.
So instead of writing
[x, y]
a b
b c
x
y
a b
b c
dx
dy
x
y
R
2
: ax
2
+ 2bxy +cy
2
= 1
1 +xy 0
0 x
2
+y
2
$\int_c d\ell$ where $c$ is the curve and
$$d\ell^2 = dx^2 + dy^2$$
is the infinitesimal path length. We can write this as
$$d\ell^2 = [dx, dy] \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} dx \\ dy \end{bmatrix}$$
Our new and improved inner product changes from place to place but it gives rise to a norm just as the old one does, and it is a norm on the tangent space. We therefore have
$$d\ell_1^2 = [dx, dy] \begin{bmatrix} 1 + xy & 0 \\ 0 & x^2 + y^2 \end{bmatrix} \begin{bmatrix} dx \\ dy \end{bmatrix}$$
for the new way of measuring the differential path length and so the length of the path along the parabola, with $x = t$, $y = t^2$ is
$$\int_0^1 \sqrt{(1 + t^3) \cdot 1 + (t^2 + t^4)(4t^2)}\ dt \approx 1.49958$$
where the approximation is done using Mathematica. This compares with about 1.47894 using the standard metric, for which the integrand is $\sqrt{1 + 4t^2}$.
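If you have no Mathematica to hand, the integral can be checked with a few lines of code. What follows is a sketch: composite Simpson's rule is my stand-in for whatever numerical integrator you prefer, and the function names are invented. With the identity matrix as metric the integrand reduces to $\sqrt{1 + 4t^2}$, whose integral from 0 to 1 is $(2\sqrt{5} + \sinh^{-1} 2)/4 \approx 1.47894$.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def speed(metric, point, velocity):
    """Norm of a tangent vector at a point, using the inner product given there."""
    (g11, g12), (g21, g22) = metric(*point)
    vx, vy = velocity
    return math.sqrt(g11 * vx * vx + (g12 + g21) * vx * vy + g22 * vy * vy)

def path_length(metric, curve, velocity, a, b):
    return simpson(lambda t: speed(metric, curve(t), velocity(t)), a, b)

new_metric = lambda x, y: ((1 + x * y, 0.0), (0.0, x * x + y * y))
euclidean = lambda x, y: ((1.0, 0.0), (0.0, 1.0))

parabola = lambda t: (t, t * t)
parabola_velocity = lambda t: (1.0, 2 * t)

print(path_length(new_metric, parabola, parabola_velocity, 0.0, 1.0))  # about 1.4996
print(path_length(euclidean, parabola, parabola_velocity, 0.0, 1.0))   # about 1.4789
```

The same `path_length` function will take any other metric and curve you care to feed it, provided you supply the velocity of the curve yourself.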
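Example 3.3.2. Find the path length of the spiral $r = \theta$ for $0 \le \theta \le 2\pi$ in the metric on $\mathbb{R}^2$ given by
$$d\ell_2^2 = [d\theta, dr] \begin{bmatrix} r^2 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} d\theta \\ dr \end{bmatrix}$$
Putting $r = \theta = t$, so that $d\theta$ and $dr$ both take the value 1 on the velocity, the length is
$$\int_0^{2\pi} \sqrt{t^2 + 1}\ dt \approx 21.2563$$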
This compares with 2
r
2
0
0 r
4
d
dr
Remark 3.3.1. I should feel ashamed of myself for writing out expressions such as the above for specifying a metric (or more accurately the square of a norm), and should undoubtedly have written
$$r^2\, d\theta \otimes d\theta + r^4\, dr \otimes dr$$
or something similar. I have tried to give you something which will relate the correct formulation to the things that the classical mathematicians did (and which you may find at least as badly expressed in works on tensors and tensor fields written by the congenitally confused). The bad notation can be used to do sums quite quickly so is not wholly bad. Much depends on whether you want to do an awful lot of sums without thinking what you are doing. And face it, who would want to think while doing monster sums if they didn't have to?
Exercise 3.3.6.
1. Find the length of the path A, $r = 1$ for $0 \le \theta \le \pi/2$, with respect to
the metric given by $r^2 \, d\theta \otimes d\theta + dr \otimes dr$. (Note that in the $\theta, r$ space
this gives the same answer as the usual metric, that is, treating $\theta, r$ as
if it were a piece of $\mathbb{R}^2$ with the euclidean metric.)
2. What is the length of the parallel line B, $r = 2$ for $0 \le \theta \le \pi/2$, in the
new metric?
3.3. THE RIEMANNIAN METRIC TENSOR 83
Figure 3.3.5: Three lines of different lengths.
3. What is the length of the line C, $r = 0$ for $0 \le \theta \le \pi/2$?
I show the three lines in figure 3.3.5.
4. Explain what has gone wrong. The length of a line segment with the
end points different cannot be zero in a metric.
5. On the figure 3.3.5, draw the curve $r = 1/(\sin(\theta) + \cos(\theta))$, for
$\theta \in [0, \pi/2]$. Calculate its length with respect to the new metric. Hint: you
might try using NIntegrate in Mathematica.
6. Show the curve is a geodesic in the space, in particular it is shorter
than the straight line A. Hint: transform back to $\mathbb{R}^2$ with the euclidean
metric. Find out how to do this by reading on a bit.
Note that the Riemannian metric tensor enables us to make sense of the angle
at which two curves cross. Without this it makes no sense at all to say that
curves intersect at right angles on a manifold, because in different charts we
could get totally different answers. We feed in two tangent vectors, one along
each curve, at the point of intersection so they are both in the same tangent
space. The Riemannian metric tensor gives us a number out and this leads
us to the angle just as in $\mathbb{R}^n$.
Suppose now that I want to compute the path length of a curve on a manifold.
Let us say I have a curve $c : [0, 1] \to S^2$ on $S^2$. I want to compute its length.
I take a chart containing some of the curve, say $u : U \to \mathbb{R}^2$, and this takes
the bit of the curve in $S^2$ to a bit of curve in $\mathbb{R}^2$. I have a Riemannian metric
tensor on the manifold. The picture of figure 3.3.6 shows a local parametrisation
by $u^{-1}$ of a patch containing some of the curve. The composite $u \circ c$
shifts the curve to the codomain of $u$, the open set $u(U)$ in $\mathbb{R}^2$.
84 CHAPTER 3. TENSORS AND TENSOR FIELDS
Figure 3.3.6: Length of a curve on a manifold via a Riemannian metric.
Now I want to know what happens to the covariant 2-tensor field on $S^2$ which
tells me how to measure distances there. I claim that $u^{-1}$ induces a covariant
2-tensor field on $u(U)$. This requires a certain amount of thought.
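We have the picture from the last chapter:
[Diagram: a commuting square with the map $T_aX \to T_{f(a)}Y$ induced by $f$ along the top, $f : X \to Y$ along the bottom, and the vertical projections $T_aX \to X$ and $T_{f(a)}Y \to Y$.]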
Now a linear map from $T_a(X)$ to $\mathbb{R}$ is taken by differentiable $f$ to a linear
map from $T_{f(a)}(Y)$ to $\mathbb{R}$. We can see this by using the natural equivalence
of V
a
(X) R to f
: T
f(a)
R.
These are the same thing.
Now we know what f
x
y
u
v
we can write
[du dv] = [dx dy]
f
1
x
f
1
y
f
2
x
f
2
y
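\[ dx = \frac{\partial x}{\partial \theta} \, d\theta + \frac{\partial x}{\partial r} \, dr \]
\[ dy = \frac{\partial y}{\partial \theta} \, d\theta + \frac{\partial y}{\partial r} \, dr \]
hence
\[ dx = -r \sin(\theta) \, d\theta + \cos(\theta) \, dr \]
\[ dy = r \cos(\theta) \, d\theta + \sin(\theta) \, dr \]
whereupon we can calculate the various tensor products in the inner product: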
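\[ dx \otimes dx = (-r \sin(\theta) \, d\theta + \cos(\theta) \, dr) \otimes (-r \sin(\theta) \, d\theta + \cos(\theta) \, dr) \]
\[ = r^2 \sin^2(\theta) \, d\theta \otimes d\theta - r \sin(\theta) \cos(\theta) \, dr \otimes d\theta - r \sin(\theta) \cos(\theta) \, d\theta \otimes dr + \cos^2(\theta) \, dr \otimes dr \]
and similarly for $dx \otimes dy$, $dy \otimes dx$ and
\[ dy \otimes dy = r^2 \cos^2(\theta) \, d\theta \otimes d\theta + r \sin(\theta) \cos(\theta) \, dr \otimes d\theta + r \sin(\theta) \cos(\theta) \, d\theta \otimes dr + \sin^2(\theta) \, dr \otimes dr \]
Hence we have
\[ d\ell \otimes d\ell = dx \otimes dx + dy \otimes dy = r^2 \, d\theta \otimes d\theta + dr \otimes dr \]
This, when translated into matrix terms and old fashioned $dx^2 + dy^2$ language,
gives us that the standard identity matrix on $\mathbb{R}^2$ for the euclidean metric
tensor goes over to the matrix
\[ \begin{bmatrix} r^2 & 0 \\ 0 & 1 \end{bmatrix} \]
of example 3.3.2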
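The matrix version of this calculation is $J^T I J$, where $J$ is the Jacobian of the polar coordinate map. A small numerical sketch follows; the function names are mine, and I am assuming the coordinates are ordered $(\theta, r)$ to match $[d\theta, dr]$ in example 3.3.2.

```python
import math

def polar_jacobian(theta, r):
    """Jacobian of (theta, r) -> (x, y) = (r cos(theta), r sin(theta)).

    Rows are x and y, columns are the theta and r derivatives.
    """
    return [[-r * math.sin(theta), math.cos(theta)],
            [ r * math.cos(theta), math.sin(theta)]]

def pulled_back_metric(theta, r):
    """J^T I J: the Euclidean metric expressed in (theta, r) coordinates."""
    J = polar_jacobian(theta, r)
    return [[sum(J[k][i] * J[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Sample point chosen arbitrarily; the result should be diag(r^2, 1).
G = pulled_back_metric(0.7, 2.5)
print(G)
```

The cross terms cancel because the two columns of $J$ are orthogonal, which is the matrix way of saying that the $dr \otimes d\theta$ and $d\theta \otimes dr$ terms drop out.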
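Example 3.3.4. Suppose $f : [0, 1] \to \mathbb{R}^2$ is a curve in $\mathbb{R}^2$ and we wish to
compute its length. Writing $f(t) = (x(t), y(t))^T$ we have the length of $f$ is
\[ \int_{[0,1]} du \]
where $du$ is the pullback from $\mathbb{R}^2$ by $f$ of the length measure $d\ell$ on $\mathbb{R}^2$. The
usual (Lebesgue) measure on $[0, 1]$ is written $dt$. This gives us:
\[ dx \otimes dx = (dx/dt \; dt) \otimes (dx/dt \; dt) = (dx/dt)^2 \, dt \otimes dt \]
\[ dy \otimes dy = (dy/dt \; dt) \otimes (dy/dt \; dt) = (dy/dt)^2 \, dt \otimes dt \]
\[ d\ell \otimes d\ell = dx \otimes dx + dy \otimes dy \]
\[ du = f^* d\ell \]
\[ du \otimes du = ((dx/dt)^2 + (dy/dt)^2) \, dt \otimes dt \]
\[ du = \sqrt{(dx/dt)^2 + (dy/dt)^2} \; dt \]
\[ \int_{[0,1]} du = \int_{[0,1]} \sqrt{(dx/dt)^2 + (dy/dt)^2} \; dt \]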
A familiar formula usually derived somewhat less formally but using essentially
the same ideas. It is worth going through this argument while thinking
of $dx/dt$ as the amount of stretching $f$ does to the unit interval in the $x$
direction when it takes $[0, 1]$ into $\mathbb{R}^2$ (and likewise $dy/dt$). Working through
the new jargon for a simple, friendly example makes you appreciate how the
new jargon actually does a good job of articulating geometric ideas of what
is going on.
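As a quick sanity check of the formula, here it is applied to a quarter of the unit circle. The curve is my choice for illustration and is not part of the example above.

```latex
\[
f(t) = \begin{bmatrix} \cos(\pi t/2) \\ \sin(\pi t/2) \end{bmatrix}, \quad t \in [0, 1],
\qquad
\frac{dx}{dt} = -\frac{\pi}{2} \sin(\pi t/2), \quad
\frac{dy}{dt} = \frac{\pi}{2} \cos(\pi t/2)
\]
\[
\int_{[0,1]} du
= \int_0^1 \sqrt{\frac{\pi^2}{4} \left( \sin^2(\pi t/2) + \cos^2(\pi t/2) \right)} \; dt
= \int_0^1 \frac{\pi}{2} \; dt
= \frac{\pi}{2}
\]
```

This is a quarter of the circumference $2\pi$, as it should be: $f$ stretches the unit interval uniformly by the factor $\pi/2$.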
Exercise 3.3.9. Write out the matrix and tensor product forms of the spherical
and cylindrical polar coordinate transforms of $\mathbb{R}^3$ and confirm that the
euclidean metric goes to what it ought to.
Returning to the tensor field exported by $u^{-1}$ to $\mathbb{R}^2$, the map $u$ distorts
distances, but it also distorts the metric tensor in exactly the right way so
that if we use the $u^{-1}$ induced metric tensor to measure path length in $\mathbb{R}^2$
we get the right answer for the metric tensor on $S^2$.
You might be surprised at first that we transport a metric tensor field on a
space $X$ to one on a space $Y$ by a homeomorphism $u^{-1}$ which is the inverse
of the map $u : X \to Y$. Actually this makes good sense. Suppose we take
the simplest case of the usual metric tensor field on $\mathbb{R}$ which assigns to the
interval $[a, b]$ the length $b - a$. Map $\mathbb{R} \to \mathbb{R}$ by $u(x) = 2x$. Now we want a
new, shiny metric tensor field on the codomain which gives the image of $[a, b]$
the same length, $b - a$, so we can feel we have shifted not just the interval
$[a, b]$ but also the metric with which to measure its length.
Writing the length in the domain as $\int_a^b dx$ we see that we can get the (usual,
boring, old fashioned) length of the image in the codomain by writing it as
\[ \int_{x=a}^{x=b} du = \int_a^b 2 \, dx \]
where $du = 2 \, dx$ follows from $u = 2x$.
You might have felt a bit happier had I written this as
\[ \int_a^b \frac{du}{dx} \, dx = \int_a^b 2 \, dx \]
Much depends on your previous experience of Calculus.
If we want to have length $b - a$ for the new, shiny length in the codomain,
which I shall call the $u$-space, we need to use $du^{-1}$. Now the classical
mathematicians, Gauss and his mob, would cheerfully write things not very different
from
\[ u^{-1}(x) = x/2, \qquad du^{-1} = 1/2 \; dx \]
then using the metric given by $du^{-1}$ on the $u$-space we get the length of the
interval $[2a, 2b]$ in the $u$-space with the right metric is
\[ \int_{x=2a}^{x=2b} 1/2 \; dx = 1/2 \, (2b - 2a) = b - a \]
Obviously this works with all invertible linear maps from $\mathbb{R}$ to $\mathbb{R}$, not just $2x$.
Exercise 3.3.10. Show it works for u(x) = 2(x).
If $u$ is a diffeomorphism from $\mathbb{R}$ to $\mathbb{R}$ then we have something like
\[ \frac{dx}{du} = D u^{-1} = \frac{1}{du/dx} \]
by the inverse function theorem. The interval $[a, b]$ in the $x$-space is taken to
$[u(a), u(b)]$ if $u$ is increasing, which I can assume it is without loss of generality
since if it isn't I just compose with the map that multiplies everything by
$-1$ and rename the composite to be $u$. Now the length of this in the usual
metric is $\int_a^b du/dx \; dx$. If I choose the metric given by $du^{-1}$ then I replace the
old, boring metric $dx$ with the new, shiny, transported metric $1/(du/dx) \; dx$,
then the length is
\[ \int_a^b du/dx \cdot 1/(du/dx) \; dx = b - a \]
This tells us that if we use the metric transported by $u^{-1}$ to measure the
length of a curve in $\mathbb{R}$ transported by $u$, we get the same length.
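To see the cancellation happen with a map that is not linear, here is a sketch using $u = \sinh$, which is a diffeomorphism of $\mathbb{R}$ with the explicit inverse asinh. The choice of $u$, the interval and the helper names are all my assumptions for illustration. The transported length is computed in the $u$-space by integrating $D u^{-1}(w) = 1/\sqrt{1 + w^2}$ over the image interval.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

u = math.sinh                      # the diffeomorphism x -> u(x)
du_dx = math.cosh                  # its derivative

def d_u_inverse(w):
    """Derivative of u^{-1} = asinh at the point w of the u-space."""
    return 1.0 / math.sqrt(1.0 + w * w)

a, b = -1.0, 2.0

# Usual (boring) length of the image interval [u(a), u(b)].
usual_length = simpson(du_dx, a, b)

# Length of the same image interval in the metric transported by u^{-1}.
transported_length = simpson(d_u_inverse, u(a), u(b))

print(usual_length, transported_length, b - a)
```

The first number is $u(b) - u(a)$, which depends on $u$; the second should agree with $b - a$ whatever increasing diffeomorphism is used.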
Exercise 3.3.11. Show this works just as well if the curve is in $\mathbb{R}^2$ and
$u : \mathbb{R}^2 \to \mathbb{R}^2$ is a diffeomorphism. The length of the curve in the $u$-space
measured by the metric transported to the $u$-space by $u^{-1}$ is the same in both
spaces. Hint: Try it for linear maps $u$ first.
Exercise 3.3.12. By taking two distinct charts both covering a curve on $S^2$
and hence related by a diffeomorphism, show that whichever chart you use, if
you induce the right metric tensors on $\mathbb{R}^2$ from the charts and calculate the
lengths by both of them, they agree on the length of the curve on $S^2$. Note that
you don't really need $S^2$ at all for this exercise, it is about the way covariant
tensor fields on $\mathbb{R}^2$ transform under diffeomorphisms.
Exercise 3.3.13.
1. Show that contravariant tensors of any order on a vector space $U$ are
carried by linear maps $f : U \to V$ to contravariant tensors of the same
order on the vector space $V$.
2. Show that covariant tensor fields of any order on a smooth manifold
$V$ are carried to covariant tensor fields on a manifold $U$ of the same
order by the inverse of a diffeomorphism $h : V \to U$.
3. Let $\langle \, , \rangle$ be an inner product on $V$. Show that it induces an isomorphism
between $V$ and $V^*$.
4. Does an isomorphism from V to V
i
, the dual basis element to e
i
, or e
i
) with the
map $dx^i : \mathbb{R}^n \to \mathbb{R}$ and the reason is that pretty soon we shall be doing all
this on the tangent space, and if I confuse the notation a bit now there is
less novelty later.
In the case of $\mathbb{R}^2$ I get that the determinant can be written easily as
$dx^1 \otimes dx^2 - dx^2 \otimes dx^1$. There are, after all, only two permutations of two things.
I shall write this as $dx \wedge dy$. In fact if I have a covariant 2-tensor on $\mathbb{R}^n$, I
shall also write:
\[ dx^1 \wedge dx^2 = dx^1 \otimes dx^2 - dx^2 \otimes dx^1 \]
This is equivalent to choosing the first two rows of the $n \times 2$ matrix made
up by choosing any two vectors in $\mathbb{R}^n$, and computing the determinant.
It is easy to see that it is an alternating covariant 2-tensor on $\mathbb{R}^n$. I have
immediately that $dx^2 \wedge dx^1 = -dx^1 \wedge dx^2$.
Similarly I can take $dx^i \wedge dx^j$ defined by
\[ dx^i \wedge dx^j = dx^i \otimes dx^j - dx^j \otimes dx^i \]
and this is $-dx^j \wedge dx^i$ and $dx^i \wedge dx^i = 0$, for $i, j \in [1 : n]$. What this means
is that I select the $2 \times 2$ matrix comprising the $i$-th and $j$-th rows of the two
column vectors, and calculate the determinant of them.
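The "pick two rows and take the determinant" recipe is short enough to write down directly. In this sketch the function name `wedge2` is mine, and the indices are 0-based as Python likes them, so `wedge2(0, 1, v, w)` plays the role of $dx^1 \wedge dx^2$ applied to $(v, w)$.

```python
def wedge2(i, j, v, w):
    """(dx^i ^ dx^j)(v, w): the determinant of rows i and j of the matrix [v w].

    Indices are 0-based here, unlike the 1-based indices in the text.
    """
    return v[i] * w[j] - v[j] * w[i]

v = [1.0, 2.0, 3.0]
w = [4.0, 5.0, 6.0]

print(wedge2(0, 1, v, w))   # rows 1 and 2 of the two column vectors
print(wedge2(1, 0, v, w))   # swapping the indices flips the sign
print(wedge2(2, 2, v, w))   # a repeated index gives zero
```

Swapping the two vectors instead of the two indices also flips the sign, which is the alternating property.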
This can be generalised to covariant 3-tensors on $\mathbb{R}^n$ without too much trouble.
In this case I have to define $dx^i \wedge dx^j \wedge dx^k$ and I do this by writing out
every permutation of $i, j, k$ so that if $\sigma$ is a permutation I take the $3!$ terms
\[ dx^{\sigma(i)} \otimes dx^{\sigma(j)} \otimes dx^{\sigma(k)} \]
for the $3!$ permutations, $\sigma$. For each term I apply the three factors to the
three vectors, multiply the resulting numbers together,
multiply by the sign of the permutation, and sum the $3!$ numbers. This gives
$dx^i \wedge dx^j \wedge dx^k$. It is easy to see that it is an alternating 3-tensor on $\mathbb{R}^n$.
Exercise 3.5.4. Prove the last claim.
The generalisation to alternating $k$-tensors on $\mathbb{R}^n$ is obvious.
Exercise 3.5.5. Write it down.
It follows that we can give a basis for the space $\Lambda^k(\mathbb{R}^n)$ rather easily: it
consists of the alternating tensors
\[ \left\{ dx^{i_1} \wedge dx^{i_2} \wedge \cdots \wedge dx^{i_k} : i_1 < i_2 < \cdots < i_k \in [1 : n] \right\} \]
Now putting $x^1 = x$, $x^2 = y$ and $x^3 = z$ in traditional fashion, we recover
the mysterious expression at the beginning of this section on the Exterior
Algebra.
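The basis can be listed by machine, and the signed-permutation recipe evaluated, in a few lines. This is a sketch under my own naming (`basis`, `wedge`, `sign`); the basis labels are 1-based as in the text while the component indices passed to `wedge` are 0-based.

```python
from itertools import combinations, permutations
from math import comb

def sign(perm):
    """Sign of a permutation of 0..k-1, found by counting inversions."""
    inversions = sum(1 for a in range(len(perm))
                     for b in range(a + 1, len(perm)) if perm[a] > perm[b])
    return -1 if inversions % 2 else 1

def wedge(indices, vectors):
    """Evaluate dx^{i_1} ^ ... ^ dx^{i_k} on k vectors (0-based indices).

    Sums, over all permutations, the signed product of the permuted
    coordinate functions applied to the vectors in order.
    """
    k = len(indices)
    total = 0.0
    for perm in permutations(range(k)):
        term = 1.0
        for slot, p in enumerate(perm):
            term *= vectors[slot][indices[p]]
        total += sign(perm) * term
    return total

def basis(n, k):
    """Index sets i_1 < ... < i_k in [1 : n] labelling the basis of Lambda^k(R^n)."""
    return list(combinations(range(1, n + 1), k))

print(basis(3, 2))
print([len(basis(4, k)) for k in range(5)])
```

The counts are binomial coefficients, so the dimensions for a given $n$ add up to $2^n$, and there is nothing at all to list once $k > n$.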
3.5. THE EXTERIOR ALGEBRA 95
Exercise 3.5.6.
1. Show how to construct a linear map Alt: $T^k(\mathbb{R}^n) \to \Lambda^k(\mathbb{R}^n)$ which
alternates any tensor and sends any alternating tensor to itself. Hint: the
essential idea occurs in turning $dx \otimes dy$ into $dx \wedge dy$ and $dx^1 \otimes dx^2 \otimes dx^3$
into $dx^1 \wedge dx^2 \wedge dx^3$ by adding up the signed permutations. It might be
better to average them in this case.
2. Show how to generalise the $\wedge$ of $dx^i$, $dx^j$ so that if $\omega$ is an alternating
$k$-tensor on $\mathbb{R}^n$ and $\eta$ is an alternating $\ell$-tensor, then $\omega \wedge \eta$ is an
alternating $(k + \ell)$-tensor. Hint: Hit the tensor product with the Alt of
the last exercise.
3. Show that $dx^1 \wedge dx^2$ applied to a pair of points in $\mathbb{R}^2$, represented in
the standard way, gives twice the oriented area of the triangle formed
by the pair of points together with the origin.
4. What do you need to get the area of the triangle formed by two points
in $\mathbb{R}^n$ and the origin? Does it make sense to talk of an oriented area
in this case?
The exercises should now make it clear that just as $\otimes$ between $k$-tensors
and $\ell$-tensors gives us $(k + \ell)$-tensors and hence a graded algebra, so $\wedge$
between alternating $k$-tensors and alternating $\ell$-tensors gives us an alternating
$(k + \ell)$-tensor and hence another graded algebra. This is called the Exterior
Algebra. Since the only alternating $n$-tensor on $\mathbb{R}^n$ is, up to a scalar multiple,
the determinant, and since $\Lambda^k(\mathbb{R}^n)$ consists of just the zero tensor whenever
$k > n$, we are really only concerned with the graded algebra
$\{\Lambda^k(\mathbb{R}^n) : 1 \le k \le n\}$. This makes the
exterior algebra rather simpler (and a lot smaller) than the tensor algebra.
Exercise 3.5.7. Show that $\Lambda^k(\mathbb{R}^n)$ is just the zero tensor whenever $k > n$.
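I can write out the full exterior algebra in the form:
\[
\begin{array}{lll}
\text{Order} & \text{basis} & \text{isomorphic to} \\ \hline
\Lambda^0(\mathbb{R}^2) & 1 & \mathbb{R}^1 \\
\Lambda^1(\mathbb{R}^2) & dx, \; dy & \mathbb{R}^2 \\
\Lambda^2(\mathbb{R}^2) & dx \wedge dy & \mathbb{R}^1
\end{array}
\]
This is a nice finite table. Just as I wrote out table 3.1.2 I can write out the
exterior algebra for $\mathbb{R}^2$:
\[
\begin{array}{c|ccc}
\wedge & \mathbb{R} & \mathbb{R}^2 & \mathbb{R} \\ \hline
\mathbb{R} & (\mathbb{R}, \times) & (\mathbb{R}^2, \cdot) & (\mathbb{R}, \times) \\
\mathbb{R}^2 & (\mathbb{R}^2, \cdot) & (\mathbb{R}, \det) & 0 \\
\mathbb{R} & (\mathbb{R}, \times) & 0 & 0
\end{array}
\]
Table 3.5
Again, $\times$ denotes ordinary multiplication in $\mathbb{R}$ and $\cdot$ denotes scalar
multiplication. And det denotes the determinant. The table starts with 0-tensors at
the top and 2-tensors at the bottom.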
Exercise 3.5.8. Write out the full exterior algebra on $\mathbb{R}^3$. You should replicate
the above two tables with rather more columns and rows. In the second
table, work out what the multiplications are, as for table 3.1.2. Do you recognise
anything?
3.6 The Exterior Calculus
The step from the tensor algebra to tensor fields consisted of having a section
of the tensor bundle, which meant attaching a type $(k, \ell)^T$ tensor to each
point on a manifold. We do exactly the same thing again, we take a section
of the $\Lambda^k(V)$ bundle where $V$ is a tangent space. This means that we attach
to each point $a$ of the $n$-manifold $M^n$ an alternating $k$-tensor on the space
$T_a(M)$. A 0-tensor is just a number, and attaching a number to each point of
a manifold is merely defining a map from $M^n$ to $\mathbb{R}$. Similarly, attaching an
$n$-tensor is attaching a number, the volume element, at each point of $M^n$. In
between we have $k$-forms attached at each point of the manifold. Naturally
we want the sections to be smooth.
Such sections are called differential forms on the manifold.
To make this concrete we look at $\mathbb{R}^2$ and $\mathbb{R}^3$.
A differential 0-form on $\mathbb{R}^2$ is just a smooth map from $\mathbb{R}^2$ to $\mathbb{R}$. We know a
fair bit about these.
A differential 1-form on $\mathbb{R}^2$ assigns to each point of $\mathbb{R}^2$ a pair of numbers
$a \, dx + b \, dy$ and consequently is a pair of functions
\[ P(x, y) \, dx + Q(x, y) \, dy \]
It is a covector field and looks very like a vector field (but watch out for what
happens when you change bases!)
A differential 2-form on $\mathbb{R}^2$ assigns to each point $a$ of $\mathbb{R}^2$ an operator
$\lambda(a) \, dx \wedge dy$. This is short for $\lambda(a) (dx \otimes dy - dy \otimes dx)$ for some number
$\lambda(a)$ which depends on $a$. This acts on any pair of vectors in the tangent space at $a$.
Let's choose some with respect to the standard basis for $\mathbb{R}^2$ (since, for any
$a \in \mathbb{R}^2$, $\mathbb{R}^2_a$ is isomorphic to $\mathbb{R}^2_0$ in a natural way). Then
\[ \lambda(a) \, dx \wedge dy \left( \begin{bmatrix} x \\ y \end{bmatrix}, \begin{bmatrix} u \\ v \end{bmatrix} \right) = \lambda(a) (xv - yu) \]
So $\lambda(a) \, dx \wedge dy$ assigns to any pair of tangent vectors the area of the
parallelogram in the tangent space which they determine, multiplied by a function
of $a$. Or if you prefer, twice $\lambda(a)$ times the area of the triangle consisting of
the two points and the origin of the tangent space $\mathbb{R}^2$.
A quite useful way of looking at this is that $\lambda(a) \, dx \wedge dy$ is doing something
a bit like the riemannian metric, but instead of returning the inner product
of two tangent vectors it is returning an infinitesimal area element. So we
imagine that we want the area definition to vary over the space so that
calculating an area of a region is now more complicated. On the other hand
you have seen this before, more or less.
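Example 3.6.1. First I am going to transform the usual area measure $dx \wedge dy$
on $\mathbb{R}^2$ and use it to calculate the area of the unit disc in polar coordinates.
We have the polar coordinate transform
\[ P : \mathbb{R}^2 \setminus \{0\} \to S^1 \times \mathbb{R}^+ \]
\[ \begin{bmatrix} x \\ y \end{bmatrix} \]
\[ \int_{B_1(0)} dx \wedge dy = \int_{B_1(0)} r \, dr \wedge d\theta \]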
which we already knew although not in this language. Note that the domain
of integration, $B_1(0)$, is a disc in the $x$-$y$ space and a rectangle wrapped
around a cylinder in the $r$-$\theta$ space. This is what happens to the punctured
disc under the diffeomorphism $P$.
The new integral has $\theta \in S^1$ and $r \in [0, 1]$ which makes for an easy integral,
$(1/2)(2\pi) = \pi$.
I knew that.
Note that this works because we
• transformed the disc in $\mathbb{R}^2$ into a rectangle in $S^1 \times \mathbb{R}^+$, except that the
centre of the disc really got thrown away (zero area so does not affect
the result) so the rectangle (a) doesn't have a base (zero area in any
sane density on $\mathbb{R}^2$) and (b) gets wrapped once around the circle.
• Back transformed the measure density $dx \wedge dy$ to get the right density
to use to compute the area. All this does is to make clear something
which you were trained to do using much sloppier arguments to justify
the right rule for the change to polars. It was all perfectly OK but the
rationale was scruffy. Note how the exterior algebra rules for computing
the new form automatically take care of signs and orientations. Doing
it for any other transformation than the polar one is now a doddle.
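For the record, here is how the exterior algebra rules turn $dx \wedge dy$ into the polar density. This is a sketch of the standard calculation, assuming $x = r\cos\theta$, $y = r\sin\theta$ and using only $dr \wedge dr = 0 = d\theta \wedge d\theta$ and $d\theta \wedge dr = -dr \wedge d\theta$.

```latex
\[
dx = \cos\theta \, dr - r \sin\theta \, d\theta, \qquad
dy = \sin\theta \, dr + r \cos\theta \, d\theta
\]
\[
dx \wedge dy
= r \cos^2\theta \, dr \wedge d\theta - r \sin^2\theta \, d\theta \wedge dr
= r (\cos^2\theta + \sin^2\theta) \, dr \wedge d\theta
= r \, dr \wedge d\theta
\]
\[
\int_0^{2\pi} \int_0^1 r \, dr \, d\theta = \frac{1}{2} \cdot 2\pi = \pi
\]
```

Writing the factors in the other order, $d\theta \wedge dr$, would produce a minus sign, which is the orientation bookkeeping the rules do for you.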
Exercise 3.6.2. Calculate the area of the unit disc in $\mathbb{R}^2$ with respect to the
density $xy \, dx \wedge dy$.
Exercise 3.6.3. Work through the argument for the spherical and cylindrical
polar coordinate transformations in $\mathbb{R}^3$.
Exercise 3.6.4. Think of some bizarre diffeomorphism of $\mathbb{R}^2$ to some two
dimensional space that does something frightful but has an explicit inverse
(make sure the inverse can be written down even if the original is a swine).
Use it to evaluate the area of some region in both spaces, before and after
being transformed. This should give two moderately foul double integrals
with weird limits. Use Mathematica to get numerical solutions and confirm
they are pretty much the same.
Remark 3.6.1. This should give you a conviction that differential forms
have their uses, and will suggest the most important thing about them:

    Differential Forms are things you integrate over manifolds. A
    differential $k$-form can be integrated over a $k$-manifold or
    $k$-manifold with boundary.

I have made sure you would see this as it tells you what they are for.
Transforming differential forms by diffeomorphisms follows the same pattern
as for transforming the riemannian metric tensor, except that we may have
to transform $k$-forms for $k > 2$. The rules are simple however.
Exercise 3.6.5. Write down explicit rules in terms of partial derivatives for
transforming a differential 3-form on $\mathbb{R}^3$ under a diffeomorphism, rules which
you must have used in doing the preceding exercise but one.
It follows from the big announcement that 2-forms on $\mathbb{R}^2$ are integrated over
things like discs, and a 1-form on $\mathbb{R}^2$ has to be integrated over curves.
Example 3.6.2. Let the curve $c$ be the graph of $y = x^2$ between $x = 0$ and
$x = 1$. Let the differential 1-form be $dx + dy$. What would we expect the
answer to $\int_c dx + dy$ be on the basis of what this means, and what would
the calculation be?
Solution: Drawing the graph and taking a typical line segment on the curve,
its $\Delta x$ is the projection along the $x$-axis and its $\Delta y$ is the projection along
the $y$-axis. If we add these up we get $1 + 1 = 2$, and this is not going to
change as the segments get shorter. So the answer is 2. All done by a little
thought about what these things mean.
If we write $y = x^2$ we get $dy = 2x \, dx$ so
\[ \int_c dx + dy = \int_{[0,1]} (1 + 2x) \, dx = \left[ x + x^2 \right]_0^1 = 2 \]
As an alternative we could write $x = t$, $y = t^2$, $t \in [0, 1]$ to express the curve
parametrically and this would give the same answer.
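The parametric version is easy to mechanise: pull the 1-form back along the curve to get a function of $t$ times $dt$, then integrate. The following is a sketch with my own helper names (`simpson`, `integrate_one_form`), using Simpson's rule for the one-dimensional integral.

```python
def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def integrate_one_form(P, Q, x, y, dx, dy, a, b):
    """Integral of P dx + Q dy along the curve t -> (x(t), y(t)), t in [a, b]."""
    def pulled_back(t):
        px, py = x(t), y(t)
        # The pullback of P dx + Q dy is (P x'(t) + Q y'(t)) dt.
        return P(px, py) * dx(t) + Q(px, py) * dy(t)
    return simpson(pulled_back, a, b)

# The 1-form dx + dy along the parabola x = t, y = t^2, t in [0, 1].
result = integrate_one_form(
    lambda x, y: 1.0, lambda x, y: 1.0,     # P = Q = 1
    lambda t: t, lambda t: t * t,           # the curve
    lambda t: 1.0, lambda t: 2.0 * t,       # its derivatives
    0.0, 1.0)
print(result)
```

The same function handles any other smooth parametrised curve and any other $P$ and $Q$, numerically if necessary.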
Exercise 3.6.6. Now try it for the curve $c$ being the first quadrant of the
unit circle. Do you get the same answer? If not why not?
Note that when we express the curve parametrically we do so by a function $c$
and this allows us to pull back the differential 1-form on $\mathbb{R}^2$ to a differential
1-form on $\mathbb{R}$ which is just some function multiplied by $dt$ and we can integrate
this in the usual way, numerically if necessary.
Exercise 3.6.7.
1. What would you expect the result of $\int_c dx + dy$ to be when $c$ is any
smooth closed curve?
2. Suppose we take $f(x, y) = x + y$. Then we have
\[ df = \frac{\partial f}{\partial x} \, dx + \frac{\partial f}{\partial y} \, dy \]
which in this case is the 1-form $dx + dy$. So differentiating a smooth
0-form gives a smooth 1-form. Show that this is always the case.
3. It follows that $\int_c dx + dy = \int_c df$. Use the fundamental theorem
of calculus to prove your solution to the first question is correct, and
verify that it gives the right answers to all the other integrations of this
1-form along curves.
4. Find a 1-form on $\mathbb{R}^2$, $P(x, y) \, dx + Q(x, y) \, dy$, that is not $df$ for any $f$.
Hint: What can we say about $\partial P/\partial y$ and $\partial Q/\partial x$ if the 1-form is the
derivative of a 0-form?
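The usual way to represent the derivative of a 0-form $f$ is as the row matrix
$[\partial f/\partial x, \partial f/\partial y]$ and since this represents, when evaluated at any point, a
linear map from $\mathbb{R}^2$ to $\mathbb{R}$ and $P \, dx + Q \, dy$ represents a linear map from $\mathbb{R}^2$ to
$\mathbb{R}$ when evaluated at any point, the difference is rather small, but significant.
When we treat the differentiation in the second sense, we call $d$ the exterior
derivative. It goes much further than this. I shall define an exterior derivative
of 1-forms to give a 2-form:
For $\omega = P \, dx + Q \, dy$, I define
\[ d\omega = \frac{\partial P}{\partial y} \, dy \wedge dx + \frac{\partial Q}{\partial x} \, dx \wedge dy = \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dx \wedge dy \]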
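A one-line instance may help fix the signs. The 1-form below is my choice for illustration, not one taken from the text.

```latex
\[
\omega = -y \, dx + x \, dy, \qquad P = -y, \quad Q = x
\]
\[
d\omega
= \frac{\partial(-y)}{\partial y} \, dy \wedge dx + \frac{\partial x}{\partial x} \, dx \wedge dy
= -\, dy \wedge dx + dx \wedge dy
= 2 \, dx \wedge dy
\]
```

The two terms add rather than cancel because $dy \wedge dx = -dx \wedge dy$, which is where the minus sign in $\partial Q/\partial x - \partial P/\partial y$ comes from.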
Exercise 3.6.8.
1. Show that $d^2 f = 0$ for any 0-form $f$.
2. Calculate the exterior derivative of a 1-form on $\mathbb{R}^3$ by making up a
suitable example.
3. Pretend, briefly, that there is no such thing as duality and that the last
1-form is a vector field. Identify the 2-form.
4. Make the rule: To obtain the exterior derivative of a $k$-form on $\mathbb{R}^n$, take
each component function $P(x^1, x^2, \ldots, x^n) \, dx^{i_1} \wedge dx^{i_2} \wedge \cdots \wedge dx^{i_k}$ of the
$k$-form, differentiate each such $P$ with respect to each of the variables
separately to get, for example, some $\partial P/\partial x^j$, and put $dx^j$ in front of
the existing term, to get
\[ \partial P/\partial x^j \; dx^j \wedge dx^{i_1} \wedge dx^{i_2} \wedge \cdots \wedge dx^{i_k} \]
Sum the results for the $n$ different variables $x^j$ and also for the different
functions $P$. The result is a $(k + 1)$-form on $\mathbb{R}^n$. Show that this rule
gives the same answer as in the particular cases you have worked with.
5. Use the above rule to calculate the exterior derivative of a 2-form on
$\mathbb{R}^3$. Choose your own 2-form, preferably so as to have three non-trivial
but differentiable component functions.
6. Pretending, briefly, that the 2-form on $\mathbb{R}^3$ is a vector field, identify
its exterior derivative.
Remark 3.6.2. You should be able to see that the clunky way you did
Stokes' Theorem using vector fields arose from confusing vector fields with
both 1-forms and 2-forms, which you can do only on $\mathbb{R}^3$. In fact it is really
about differential forms. Stokes' Theorem in general says
\[ \int_{\partial M} \omega = \int_M d\omega \]
where $M$ is an $n$-manifold with boundary $\partial M$ and $\omega$ is any differential $(n - 1)$-form.
For a proof, dig up my old 2C2 notes off the web.
This is the modern form of Stokes' Theorem. It differs from the old obsolete
form in two ways: first it is about differential forms not vector fields, so a
grasp of duality is important, and second it works for all positive integers $n$,
all $n$-manifolds with boundary (or without, but they are less interesting).
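It is reassuring to watch the two sides agree numerically in the simplest two-dimensional case. The sketch below assumes $M$ is the unit square with its boundary traversed anticlockwise, and takes the 1-form $\omega = -y^3 \, dx + x^3 \, dy$, for which $d\omega = (3x^2 + 3y^2) \, dx \wedge dy$ by the definition of the exterior derivative of a 1-form. The form, the region and the helper names are all my choices for illustration.

```python
def simpson(f, a, b, n=200):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

P = lambda x, y: -y ** 3
Q = lambda x, y: x ** 3
d_omega = lambda x, y: 3 * x ** 2 + 3 * y ** 2   # dQ/dx - dP/dy, worked by hand

# Anticlockwise boundary of the unit square, one edge per entry:
# (x(t), y(t), dx/dt, dy/dt) for t in [0, 1].
edges = [
    (lambda t: t,       lambda t: 0.0,       1.0,  0.0),   # bottom
    (lambda t: 1.0,     lambda t: t,         0.0,  1.0),   # right
    (lambda t: 1.0 - t, lambda t: 1.0,      -1.0,  0.0),   # top
    (lambda t: 0.0,     lambda t: 1.0 - t,   0.0, -1.0),   # left
]

# Left hand side: the integral of omega over the boundary.
boundary_integral = sum(
    simpson(lambda t, x=x, y=y, dx=dx, dy=dy:
            P(x(t), y(t)) * dx + Q(x(t), y(t)) * dy, 0.0, 1.0)
    for x, y, dx, dy in edges)

# Right hand side: the integral of d(omega) over the square.
interior_integral = simpson(
    lambda x: simpson(lambda y: d_omega(x, y), 0.0, 1.0), 0.0, 1.0)

print(boundary_integral, interior_integral)
```

Reversing the direction of travel around the boundary changes the sign of the left hand side only, which is why the orientation of $\partial M$ has to be induced from that of $M$.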
Exercise 3.6.9.
1. Show that Stokes' theorem in dimension 1, with $\omega$ a 0-form, is just a
restatement of the Fundamental Theorem of Calculus.
2. Show that the almost certainly scrofulous proof you met in second year
of Green's Theorem is also a scrofulous proof that Stokes' Theorem
holds when $\omega$ is a 1-form on $\mathbb{R}^2$.
3. Show that the almost certainly scrofulous proof you met in second year
of Stokes' Theorem is also a scrofulous proof that
\[ \int_{\partial M} \omega = \int_M d\omega \]
holds when $\omega$ is a 1-form on $\mathbb{R}^3$.
4. Show that the almost certainly scrofulous proof you met in second
year of the Divergence Theorem is also a scrofulous proof that Stokes'
Theorem holds when $\omega$ is a 2-form on $\mathbb{R}^3$.
5. Construct a plausible explanation of why you had to do a bungled
version of Stokes' Theorem in second year, given that the correct version
has been known since about 1925.
Remark 3.6.3. If you want to see a proper proof of Stokes' Theorem (all the
above, and more, in one hit) read Michael Spivak's Calculus on Manifolds.
It consists of proper definitions of all the terms in awful generality and some
calculations. I should point out that all the physical intuitions which led
to the theorem are contained in the exercises and are not, as some shallow
people imagine, completely absent.
3.7 Hodge Duality: The Hodge Operator
3.7.1 The Riemannian Case
In $\mathbb{R}^3$ we know from the table referred to in Exercise 3.5, that the Exterior
Algebra has a striking symmetry. If we look at the 0-forms we observe they
have dimension 1, just as do the 3-forms, while the 1-forms have dimension
3, just like the 2-forms.
It follows from the equality of the dimension that there is an isomorphism
between the space of 1-forms on $\mathbb{R}^3$ and the space of 2-forms on $\mathbb{R}^3$; also one
between the space of 0-forms (numbers) on $\mathbb{R}^3$ and the space of 3-forms on
$\mathbb{R}^3$.
Only a small amount of thought shows that there must be, in general, an
isomorphism between the $k$-forms on $\mathbb{R}^n$ and the $(n - k)$-forms on $\mathbb{R}^n$. In fact
not just one isomorphism of course, but scads of them. The question is, can
we find a more or less natural isomorphism by some process which works in
all cases? The answer is yes, and the isomorphism is called $\star$, or Hodge $\star$ if
you want to give credit where it belongs.
Let us start by going from 2-forms on $\mathbb{R}^3$ to 1-forms on $\mathbb{R}^3$ and see if we can
work out the general pattern by doing concrete cases.
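Recall from Exercise 3.5.1 that if $\omega$ is a 2-form on $\mathbb{R}^3$ then it operates on any
pair of vectors
\[ \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}, \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} \]
\[ c_1 \begin{vmatrix} a_2 & b_2 \\ a_3 & b_3 \end{vmatrix} + c_2 \begin{vmatrix} a_1 & b_1 \\ a_3 & b_3 \end{vmatrix} + c_3 \begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix} \]
\[ \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} \]
\[ e_1, \quad \begin{bmatrix} 0 \\ \cos(\theta) \\ \sin(\theta) \end{bmatrix}, \quad \begin{bmatrix} 0 \\ -\sin(\theta) \\ \cos(\theta) \end{bmatrix} \]
\[ (f^* \omega)(a_1, a_2, \ldots, a_k) = \omega(f(a_1), f(a_2), \ldots, f(a_k)) \]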
Exercise 3.7.6. Define $f : \mathbb{R}^3 \to \mathbb{R}^3$ by $f(e_1) = e_2$, $f(e_2) = e_3$, $f(e_3) = e_1$.
Let $\omega$ be the 1-form $dx + 2 \, dy + 3 \, dz$. Show that $f^*(\star(\omega)) = \star(f^*(\omega))$. The
last exercise makes it clear that at least in this case, $f^*$ and $\star$ commute.
Exercise 3.7.7. Do they always? Is $\star$ functorial? Its definition is locked
into the standard basis, does it need to be? If not, what does this do for
defining it on manifolds?
The answer to the last question is that if $f$ preserves both the inner product
and the orientation, that is if it is an element of $SO(n, \mathbb{R})$, then
$f^*(\star(\omega)) = \star(f^*(\omega))$. This tells us that the $\star$ operator involves the parity (or
orientation or chirality) of an orthonormal basis in an essential way. Note that in
a Hilbert space we have a natural definition of an orthonormal basis and
that since a Riemannian structure on an oriented manifold makes each tangent
space a Hilbert space, the $\star$ operator makes sense on oriented smooth
manifolds with a Riemannian structure.
I do wish the physicists would learn not to call it a metric, but they are
beyond saving.
3.7.2 The SemiRiemannian Case
We are going to generalise the idea of an inner product so as to be able to
deal with the Minkowski metric on spacetime. This is Lorentzian. And we
might as well deal with the general case because it isn't any more work.
Definition 3.7.1. A symmetric bilinear form $\phi : V \times V \to \mathbb{R}$ on a real vector
space is said to be nondegenerate iff $(\forall u \in V, \; \phi(u, v) = 0) \Rightarrow v = 0$.
Definition 3.7.2. A nondegenerate symmetric bilinear form $\phi : V \times V \to \mathbb{R}$
on a real vector space is said to be an inner product. We write such forms as
$\langle \, , \rangle$ with $\phi(u, v) = \langle u, v \rangle$.
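Exercise 3.7.8. Define the inner product on $\mathbb{R}^2$ by
\[ \left\langle \begin{bmatrix} x \\ y \end{bmatrix}, \begin{bmatrix} a \\ b \end{bmatrix} \right\rangle = xb + ya \]
Show that this is an inner product in the new sense but that it is not positive
definite.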
Exercise 3.7.9. On $\mathbb{R}^4$ define
\[ \left\langle \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \end{bmatrix}, \begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{bmatrix} \right\rangle = -x_0 a_0 + x_1 a_1 + x_2 a_2 + x_3 a_3 \]
Show that this is an inner product (the Lorentzian inner product on spacetime),
and find a non-zero vector which is orthogonal to itself. Find two
distinct points which are at distance zero from each other. Explain what
this means physically.
It should be clear that our generalisation of the idea of an inner product is
constructed so as to allow us to do for spacetime what we usually do for
space, and that the metric derived from the inner product isn't a metric in
the standard sense at all. This is (a) strictly necessary in order to describe
relativity in a sensible fashion and (b) an awful shock to the system. It means
that I have set the velocity of light equal to 1 and that any two points on
the path of a ray of light have zero separation in spacetime. I shall discuss
some aspects of Physics in the next chapter which may shed some light on
this extraordinary behaviour.
From now on I shall use the generalised definition of the inner product.
Proposition 3.7.1. An inner product on a finite dimensional real vector
space $V$ determines an isomorphism from $V$ to $V^*$.
Proof: Dene : V V
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
f/x
f/y
f/z
Since we like to think of things running down hills rather than up them, it
is quite usual for physicists and engineers to put a minus sign in the above
equation. Do so if it makes you feel better.
All three of the forces we are considering are of this type, gradient fields,
except for singularities at the centre of attraction.
It is clear that $\nabla$ is pretty much differentiating $f$ to give the three components
of the derivative of $f$ along orthogonal axes, and this raises the possibility that
it might be more natural to regard the electric or magnetic or gravitational
fields as 1-forms rather than vector fields on the space we live in.
4.2.2 What are Flux?
The fact that the three fields all fall off according to an inverse square law
suggests that this is a property of the space we are living in. One possible
explanation of an inverse square law of repulsion between two objects is that
each is emitting some particles, small point like objects, which hit the other
object and force it away. This would mean that the number passing through
a given area would decrease as the area is moved further away from the
source, as in figure 4.2.5.
The area subtended by a disc of fixed size would be proportional to the
inverse square of the distance from the centre, so counting hits would give
an inverse square law simply as a property of the dimension of the space we
live in reduced by one.
Even if we don't believe in anything as fanciful as microscopic particles spat
out by charges, we can certainly think of a flow of imaginary stuff put out
at a constant rate proportional to the amount of charge, and so people would
talk and write of the electric flux or the magnetic flux where the word flux
means something which flows, and was used in medicine to mean stuff which
dribbled out of sores and noses^6. Like the potential function representing a
hill down which objects roll, this is simply a possible way of thinking about
things and we do not feel obliged to specify the imaginary flowing stuff in
any detail. After all, we are doing nothing more than observing that a vector
field has an associated flow which is equivalent to it in that we can get to the
flow from the vector field by solving a system of ODEs and given the flow
we can get back to the vector field by differentiating it at every point.
Regarding the Electric field in terms of a flow invites us to consider how much
flows out of a region compared with how much flows into the region. This
has much to do with Stokes' Theorem and the extent to which the stuff is
created. Obviously the stuff flows out of any charge and ought to either
be conserved or get compressed elsewhere. Imagine, to picture this, water
flowing along in a stream. Now place an imaginary football in the stream.
Water flows through the imaginary football as if it isn't there, which is fair
enough because it isn't. The point is that water is hard to compress so
the density is pretty much uniform throughout the stream, and moreover
the water doesn't come out of nowhere or suddenly vanish. This severely
constrains the kind of vector field that we get in a stream by attaching to
each point a little arrow saying how fast, and in which direction, the water
is moving at that point. It satisfies the condition that the divergence of the
vector field is zero at every point, where we measure the divergence at a point
by putting a small box at the point, taking the amount of water coming
out of the box less the amount of water going into the box, and dividing by
the volume of the box. Now take the limit as the box gets smaller to get
a number at each point of the stream. This is the divergence of the vector
field, and for streams it has to be zero. There is precisely as much imaginary
water flowing into the imaginary football as there is flowing out. You might
reasonably conjecture that the divergence of an Electric field is zero at a
point except when there is some charge at that point, when it depends on
the sign and amount of charge.
And you would be right.
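The small-box recipe for the divergence can be carried out directly. The
sketch below is only an illustration under stated assumptions: the two fields
`stream` and `source`, the box size and the function name `box_divergence`
are hypothetical, and each face flux is approximated by the field at the
centre of the face times the area of the face.

```python
def box_divergence(V, p, h=1e-3):
    """Net outflow of V through a small cube of side h centred at p,
    divided by the volume of the cube."""
    x, y, z = p
    r = h / 2
    area = h * h
    out = 0.0
    out += (V(x + r, y, z)[0] - V(x - r, y, z)[0]) * area  # faces across x
    out += (V(x, y + r, z)[1] - V(x, y - r, z)[1]) * area  # faces across y
    out += (V(x, y, z + r)[2] - V(x, y, z - r)[2]) * area  # faces across z
    return out / h**3


def stream(x, y, z):
    """An incompressible flow (assumed example): rotation about the z-axis."""
    return (-y, x, 0.0)


def source(x, y, z):
    """A flow in which stuff is created everywhere (assumed example)."""
    return (x, y, z)
```

For `stream` as much goes into any box as comes out, so the result is zero;
for `source` the net outflow per unit volume is 3 at every point.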
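In algebra we can write the divergence of a vector field $V$ as a function $g$
with
\[ g = \nabla \cdot V = \frac{\partial V_x}{\partial x} + \frac{\partial V_y}{\partial y} + \frac{\partial V_z}{\partial z} \]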
Then if our charge comes in lumps, which we often assume to be the case,
any little box containing a positive charge will have some net amount of flux
coming out proportional to the charge, and if the little box is empty of net
charge there will be just as much going in as there is coming out. Electric
field flows are incompressible wherever there is no charge, and so are
magnetic fields.
Figure 4.2.6: The Electric field around a positive charge.
Figure 4.2.7: The magnetic dipole field.
[6] Many things have improved since the early Nineteenth century.
If the flow is not incompressible, it may still satisfy the Continuity Equation,
which holds for a larger class of flows of physical systems. It says:
\[ \frac{\partial \rho}{\partial t} = -\nabla \cdot j \]
where $\rho$ is the density of stuff at a point and time and $j$ is the vector field
of the flow of the stuff. You can translate this as: "What goes in must
come out or wind up as a sticky mess in the middle." It applies to every
little box you put in the flow, so it makes sense in the limit as the boxes
get smaller. The right hand side can be imagined to represent the build up
of concentration of the flow, which accounts for the minus sign, and the left
hand side then represents the consequence, an increase in the density.
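A discrete version of the Continuity Equation makes the "what goes in must
come out" reading concrete. This is a minimal sketch under assumptions of my
own choosing: one space dimension, a ring of cells, a constant positive
velocity `u` so that the flux is j = u * rho, and an upwind update. None of
these choices comes from the text.

```python
def step(rho, u, dx, dt):
    """One upwind step of d(rho)/dt = -d(j)/dx with j = u * rho and u > 0,
    on a periodic ring of cells. Each cell loses what flows out of its right
    face and gains what flows in through its left face."""
    n = len(rho)
    flux = [u * rho[i] for i in range(n)]  # flux through the right face of cell i
    # flux[i - 1] with i = 0 wraps round to the last cell, closing the ring
    return [rho[i] - (dt / dx) * (flux[i] - flux[i - 1]) for i in range(n)]


def total(rho, dx):
    """Total amount of stuff on the ring."""
    return sum(rho) * dx


def run(rho, u, dx, dt, steps):
    """Apply `step` repeatedly."""
    for _ in range(steps):
        rho = step(rho, u, dx, dt)
    return rho
```

The lump of density moves and smears out, but since every face flux is
subtracted from one cell and added to its neighbour, the total never changes.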
The flow of an electric field for a point charge looks like figure 4.2.6 and the
flow for a magnetic field looks like figure 4.2.7. If we take a field like this at
every point along a line segment and add them up we get the field of a long,
thin bar magnet.
4.3 Maxwell and Faraday
Faraday spent a lot of time investigating the relationship between the three
forces. He didn't find any link between the other two and gravity, although
he spent a long time looking, as you will see from the Notebooks. But he did
find some important relationships between electricity and magnetism. Some
of this had also been done by Ampère in France[7].
The key things that turn up are that a moving charge produces a magnetic
field, and that a changing magnetic field moves charge.
Electrons move rather easily through metals. The electrons in metals that
are attached to the positively charged nuclei in the atoms may be bound
more or less tightly in the atoms, and the outer electrons are more or less
communal to a set of atoms in the crystal lattice which metals form. A
bar of iron is basically a mess of little crystals all jammed together; if you
heat it and let it cool very slowly, you get fewer and bigger crystals, and in an
extreme case, practicable only for small bits of iron, you can get a single
crystal. Trying to make it one big crystal is important in some applications
because the strength of a crystal is much greater than the strength of the
metal mixture. Or to put it another way, when you pull at two ends of a
wire, it comes apart at the places between the crystals, not in the middle of
a crystal. And electrons are very small and light. So a metal looks to an
electron rather like a sequence of mostly empty rooms (the crystals) and the
electrons are rather like a swarm of flies, buzzing about aimlessly. Except
that the flies repel each other. When an electric field is put across the wire,
the electrons drift in the direction forced by the field. In effect, if you pump a
bunch of electrons in at one end of a piece of wire, they repel nearby electrons
and so on, so a compression wave passes rather quickly down the wire.
It makes sense therefore to talk of the current, which is basically a count of
the number of electrons passing a point in a second[8]. By measuring charge
in some more practical way we can write $i = dQ/dt$ where charge $Q$ moves
along a wire, or even in a stream through empty space. Since what goes in
must come out and since the electrons are not going to bunch up if they can
help it, the current at one point of a piece of wire must be the same as at any
other point, except for brief transients when we switch on the process. It is
clear that current is a vector since moving charge has a direction associated
with it.
Figure 4.3.1: The magnetic field of a current (moving charge).
[7] During the Napoleonic Wars in Europe, Faraday and Sir Humphry Davy travelled
around to talk to the physicists in various European countries. They regarded the war as
rather a nuisance, and had to avoid the fighting, which they saw as a form of insanity to
which some people are addicted. They didn't need passports, which hadn't been invented:
it was any freeborn Englishman's right to go wherever he wanted. Passports were
introduced later under the usual excuse that the government wants to help you. Few people
of any intelligence believed this in early nineteenth century Britain, it being too obvious
that politicians mainly want to help themselves. Not everything has improved since the
early Nineteenth century.
[8] But with the direction reversed, because current is counted as a flow of positive
charge and electrons are negative.
Michael Faraday, one of the finest experimentalists the world has produced
and an all round smart cookie, established that charge moving along a wire
produces a magnetic field which circles the wire. Drawing curves for the flow
of the field we get something like figure 4.3.1. I have drawn only a section at
one point of the wire; there is such a set of circles centred on every point.
He also found that when a magnetic field changes it induces a current. This is
how we get our mains electricity from power stations. We spin a magnet and
surround it by a coil of wire, in effect. Some serious googling or any elementary
text book on electricity should show you exactly how this is done[9].
James Clerk Maxwell took the findings of Faraday and wrote them out in
Algebra.
[9] There are people who are convinced they understand electricity: when you click the
switch the light comes on, or maybe the television set, although this generally requires a
different switch. There is more to it than that, and it is a good idea to understand it a
little better, or you are not really a member of our civilisation, just a freeloading parasite
on it, hardly better than a politician or an arts graduate.
If you reflect that changing magnetic fields produce an electric field and
changing electric fields produce a magnetic field, it might occur to you that
this swapping of energy between the two forms might happen in a cyclic way,
and might indeed happen in empty space. It might even occur to you that
such a cycling arrangement might travel through space. Your opinions would
not however count for much and would be considered of very minor interest,
mostly by your friends and relatives, and some of them might consider your
views as evidence of insanity. If however you were to take your wamblings
and turn them into algebra, you might be able to prove that this could
indeed happen and show how to calculate the speed of propagation of such
an electromagnetic wave in terms of constants which were properties of the
vacuum, such as $\epsilon_0$ and the corresponding magnetic one $\mu_0$. And if this turned
out to be the same as the measured velocity of light, you would eventually
be taken very seriously. This is what Maxwell did. The velocity of light
just happens to be $1/\sqrt{\epsilon_0 \mu_0}$, where both constants were known from entirely
static experiments. And it led shortly afterwards to people trying deliberately to
produce such electromagnetic waves, and this led on to radio, radar, television
and most recently mobile phones[10].
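The arithmetic behind the remark about the velocity of light is short enough
to do by machine. This is only a sketch: the numerical values of the two
vacuum constants below are the SI values as I recall them from the CODATA
2018 tables, not figures given in the text, and the function name
`wave_speed` is hypothetical.

```python
import math

# Assumed SI values (CODATA 2018), not taken from the text:
EPSILON_0 = 8.8541878128e-12  # electric constant, farads per metre
MU_0 = 1.25663706212e-6       # magnetic constant, newtons per ampere squared


def wave_speed(epsilon, mu):
    """Speed of propagation 1 / sqrt(epsilon * mu) of an electromagnetic wave."""
    return 1.0 / math.sqrt(epsilon * mu)
```

With these values `wave_speed(EPSILON_0, MU_0)` comes out within a whisker of
the measured 299 792 458 metres per second, and in units where both constants
are 1 the speed is 1.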
Maxwell's Equations are four in number and state things that are known
about the electric and magnetic fields. Two deal with the nature of the fields
separately and two deal with the interaction between them. I give them here
for definiteness in more or less the same form as the text book. We suppose
that $E$ is the electric field, $B$ is the magnetic field and $\rho$ is the density of
charge.
\[ \nabla \cdot E = \rho \tag{4.3.1} \]
\[ \nabla \cdot B = 0 \tag{4.3.2} \]
\[ \nabla \times E + \frac{\partial B}{\partial t} = 0 \tag{4.3.3} \]
\[ \nabla \times B - \frac{\partial E}{\partial t} = j \tag{4.3.4} \]
The vector $j$ is in the direction of the moving charge and has norm the rate
at which the charge flows.
These are very different from the form that Maxwell wrote them in, which
was much longer and not so compressed, and we shall get an even terser
form later. I note that there are some constants for the medium which have
been fixed up to make the velocity of light one. This is just a choice of units
in which to measure things and so is quite harmless and makes equations
simpler.
[10] Whether this is altogether desirable may be doubted, but there are at least some
benefits. Certainly the reason we are much better off than the inhabitants of Bangladesh
or Congo is that we are more closely related to Isaac Newton, Michael Faraday and James
Clerk Maxwell than they are, biologically or socially. And we live with traditions which
are still in many ways similar to the traditions in which these men lived and produced the
amazing changes that they did. There are also some differences. Nowadays, instead of
being funded by the Royal Society at the discretion of Sir Humphry Davy (its president),
Faraday would have had to submit a grant application to a committee to study electricity.
It is very doubtful if he'd get it. First he had no appropriate qualifications, and second
the practical applications would certainly have been beyond the imagination of the kind
of people who enjoy being on committees. He'd probably have been told to give up all
this foolery with wires and magnets and work on more powerful steam engines.
Figure 4.3.2: The curl of a vector field.
The first two equations simply say that the electric and magnetic fields are
incompressible: that the flux into a region is equal to the flux out in the
case of an electric field, except in a region containing charge, and that the
magnetic field is always incompressible (there are no magnetic monopoles).
The second two contain the information about the interchange between
magnetic fields and electric fields. It is essential to understand what they are
saying; do not merely memorise them.
The curl of a vector field is the extent to which it tends to twist around
some axis. If you visualise a stream of water flowing and you put a very
small paddle in it, then in general the paddle gets turned by the flow being
greater on one side than on the other. Figure 4.3.2 gives the idea.
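The paddle picture can be turned into a calculation by adding up how hard a
planar flow pushes round a small square and dividing by the area of the
square. The sketch below assumes this circulation recipe as a stand-in for
the paddle; the example fields `whirl` and `shear` and the function name
`circulation_density` are hypothetical.

```python
def circulation_density(V, p, h=1e-3):
    """Circulation of the planar field V anticlockwise round a square of side h
    centred at p, divided by the area. This approximates the amount of twist
    about the axis perpendicular to the plane."""
    x, y = p
    r = h / 2
    circ = (
        V(x, y - r)[0]    # bottom edge, traversed left to right
        + V(x + r, y)[1]  # right edge, traversed upwards
        - V(x, y + r)[0]  # top edge, traversed right to left
        - V(x - r, y)[1]  # left edge, traversed downwards
    ) * h
    return circ / (h * h)


def whirl(x, y):
    """Rigid rotation about the origin (assumed example)."""
    return (-y, x)


def shear(x, y):
    """Water moving in the x direction, faster at larger y (assumed example)."""
    return (y, 0.0)
```

The rigid rotation gives 2 everywhere. The shear flow gives -1: the water is
quicker on the upper side of the paddle, so the paddle turns clockwise.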
The curl can be thought of as a vector by taking the amount of twist about
the positive x-axis, the positive y-axis and the positive z-axis to give three
components; alternatively we can take the direction of the vector to be that
in which the rotation is a maximum and the length equal to the maximum
torque. Only a little thought suggests that it would be much more natural
to think of it as a 2-form, when it is simply the exterior derivative of the
1-form which replaces the vector field. This is undoubtedly a better way to
think of it, as it generalises to higher dimensions quite naturally. And of
course the divergence can be thought of as applying the exterior derivative
to a 2-form to get a 3-form, which on $\mathbb{R}^3$ is, at each point, a number. So we
may anticipate that the next stage of writing these equations out will be to turn
them into differential forms instead of vector fields.
For the present however, equation 4.3.3 says that the curl of the electric field
is the rate of change of the magnetic field with the direction reversed. We
have to think of $B$ as a vector field which depends upon the time: if we
take each of the three components it is a function of $x$, $y$, $z$ and $t$, and if we
differentiate it (partially) with respect to $t$ we get a new vector field, also
usually depending on $t$. So equation 4.3.3 says that for every time $t$, the
vector field $\nabla \times E$ is the negative of the time derivative of $B$. The amount of
twist of the electric field depends upon the rate at which the magnetic field
is changing. This is, like equation 4.5.1, and of course equation 4.3.4, part of
the interconnection between magnetic and electric fields.
Finally, equation 4.3.4 is almost dual to equation 4.3.3 except for
a minus sign and the $j$ term, and tells us something about the curl of the
magnetic field at every time in terms of the current vector and the rate of
change of the electric field. The latter term was introduced by Maxwell
not on the basis of experimental results, but because it led to the wave
solution to the four equations. One suspects strongly he had done the vague
English language argument about the exchange of energy between electric
and magnetic fields in free space and wanted to make it come good.
In order to collect your ideas on these equations, and to recall some earlier
work, some simple exercises will establish what is going on. If you did second
year physics you have probably already done these, although not perhaps in
this form.
Exercise 4.3.1. Find a vector field $V$ on $\mathbb{R}^3$ with constant curl the vector
$(0, 0, 1)^T$. Find some more vector fields with the same curl. Show that there
is an infinite dimensional space of vector valued functions on $\mathbb{R}^3$ which can
be added to your solution to give another solution.
Exercise 4.3.2. Find a vector field $V$ on $\mathbb{R}^3$ which has a constant curl vector
zero, but which has the property that the integral around the unit circle (in
the $z = 0$ plane) of $V$ is non-zero.
Exercise 4.3.3. A current vector $j$ is defined to be uniformly $(0, 0, 1)^T$ for
points of distance less than one from the z-axis. You may imagine a rod of
radius 1 along the z-axis carrying a current. The current is zero for points
at a distance from the z-axis greater than one (that is, outside the rod). Find
a continuous magnetic field the curl of which is the given $j$ field.
Explain why continuity is worth having, and given that there are rather a lot
of other solutions, explain what grounds you have for preferring yours. You
might find figure 4.3.1 inspirational.
Exercise 4.3.4. Show that the wave equation is a solution to the Maxwell
Equations in empty space. First write down the equation of an electric field
which has all the vectors in a plane the same length and direction: take, say,
planes $x = s$ and arrange to have, for fixed $s$, every electric vector the same
length and direction at any time $t$, but change the vector in time and also with
$s$ so that it has unit speed along the x axis. Now look to see if this satisfies the
Maxwell Equations, for $\rho$ and $j$ zero, by doing some partial differentiating.
When you have done so, throw your hat in the air and shout "huzzah!" in
celebration. You have seen the light.
Figure 4.3.3: What is the effect on a television set of a magnet?
Exercise 4.3.5. A beam of electrons is emitted by a cathode at the back of
a television set and paints a spot on the centre of the screen. Traditionally,
deflector plates are charged so as to sweep the spot in a raster scan giving your
television picture. I show a horizontal section through the tube in figure 4.3.3.
Discuss what happens when a bar magnet is placed in the location shown.
What numbers would you need to know in order to calculate the deflection of
the beam and its direction?
Exercise 4.3.6. Read the first twenty chapters of volume Two of the Feynman
Lectures on Physics (in the library). Chapter nineteen is of no direct
relevance but is good fun and worth reading to see how a great physicist thinks.
If you have been doing physics you should find this easy, but there are some
penetrating observations which you may want to think about. If you haven't
done much physics, again this will show you something of what you have been
missing.
Exercise 4.3.7. Read Chapter four of the text book and do all the exercises.
Remark 4.3.1. The work which has been described so far has changed the
world, mainly for the better, and changed it enormously. It is the product
of the Western Intellectual Tradition, and it is worth reflecting on the kind
of society which can produce such things, and also on the kinds of society
which cannot, which is most of them.
Maxwell's Equations represent one of the glories of Western civilisation,
something which is likely to remain as long as humanity endures and possibly
for much longer. Maxwell, Faraday and others stole lightning from the gods:
these men are heroes far beyond such as Alexander, Caesar or Napoleon[11].
Your life at University is being spent in part at least in coming to understand
the thinking of the great men who produced these marvels, and also
to understand something of how they did it. There are worse ways to spend
your time[12].
[11] Or miscellaneous footy players, or people who hit balls with sticks. Or people who
play guitars or sing. The list goes on.
4.4 Invariance
4.4.1 The Idea of Invariance
Imagine a ball rolling in the plane and bouncing off a fixed wall, as in figure
4.4.1.
Figure 4.4.1: A ball about to bounce off a wall.
If the ball has initial momentum $p$ in the direction of the arrow, then it is
simple to compute the new momentum after the ball has bounced: the
component parallel to the wall is unchanged and the component perpendicular
to the wall is reversed in sign.
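The rule for the bounce is a one-line computation once the wall is described
by a unit normal vector. This is a minimal sketch under that assumption; the
name `bounce` and the representation of the wall by its normal `n` are mine,
not the text's.

```python
def bounce(p, n):
    """Momentum after an elastic bounce off a wall with unit normal n.
    The part of p along n is reversed; the part parallel to the wall is kept."""
    dot = p[0] * n[0] + p[1] * n[1]  # component of p perpendicular to the wall
    return (p[0] - 2 * dot * n[0], p[1] - 2 * dot * n[1])
```

For a wall along the x-axis, with normal (0, 1), the momentum (3, -4) becomes
(3, 4): the parallel component 3 is untouched and the perpendicular component
changes sign. The length of the momentum is the same before and after.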
This makes a number of assumptions which are less than satisfactory; one
is that the ball is elastic, since if it was made of putty it might stick to the
wall after deforming. So we also assume that energy is conserved by the
impact, in general not a realistic assumption, but approximately true for
bodies which are elastic enough. We also assume that the wall is rather
flat and very smooth, since the ball will actually impact over a region, not
at a point, and if different bits of the wall made different angles with the
trajectory then the behaviour is potentially more complicated and harder to
compute. When you did this sort of thing at school I rather suspect they just
trained you to make the assumptions that the school teacher made without
asking many questions, so you probably never questioned the assumptions
and indeed didn't even think about what they were. There can be rather a
lot.
[12] Conquering Europe, or anything involving balls or guitars, for instance.
Exercise 4.4.1. Think of some more assumptions that are necessary to get
a solution to the problem as posed.
In order to do the calculation, subject to all the usual assumptions, we need
to take a coordinate system, and some are better than others. I have shown
some axes in the left lower side of the picture. I haven't however marked on
any units and you don't know what units the momentum is given in either. It
should be fairly obvious that the units don't much matter, in that whatever
we choose, as long as we take the same ones after as before we will get the
same answer. What is crucial is the angle the momentum vector makes with
the wall.
Now suppose we change the position and orientation of the axes. I do the
calculation in the original system and you do it in the new system. We can
translate everything from one to the other; your initial momentum vector $p$
will consist of an ordered pair of numbers, and an initial point for the ball
will also consist of another ordered pair of numbers. So will mine, and mine
will be different. It is easy to write down a euclidean group element which
will reliably translate your numbers to mine and the inverse will translate
my numbers to yours. And the resulting momentum vector, specified by a
direction and a point through which it passes, will translate by the same rule.
This is just like the situation of chapter one where I talked about penguins:
we have language which consists of finite lists of numbers, and we have the
physical entities, and the behaviour had better be described in the same way
whatever the language, because what happens in the world does not depend
on the language we use to talk about it. This is a key assumption about the
physical world which we use to put conditions upon the kinds of language
we shall use to talk about it[13].
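The claim that one translation rule carries both the initial and the final
momentum from my numbers to yours can be checked for the bouncing ball. This
is a sketch under assumptions: your axes are taken to be mine rotated through
an angle `a` (a shift of origin does not change momenta, so it is left out),
the wall is described by a unit normal, and the names `rotate`, `bounce` and
`agree` are hypothetical.

```python
import math


def rotate(v, a):
    """Translate the components of a vector into axes rotated by angle a."""
    c, s = math.cos(a), math.sin(a)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])


def bounce(p, n):
    """Elastic bounce off a wall with unit normal n."""
    dot = p[0] * n[0] + p[1] * n[1]
    return (p[0] - 2 * dot * n[0], p[1] - 2 * dot * n[1])


def agree(p, n, a, tol=1e-12):
    """Compare two routes: I compute the bounce and then translate the answer,
    or you translate the data first and compute the bounce yourself."""
    mine_translated = rotate(bounce(p, n), a)
    yours = bounce(rotate(p, a), rotate(n, a))
    return all(abs(m - y) <= tol for m, y in zip(mine_translated, yours))
```

Whatever angle is chosen, `agree` should come out true: the prediction does
not depend on the language in which it is computed.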
In the above case we can also change the units: you can use metres per second
and I can use parasangs per lunar month; the translation system still works
in that the system that translates the initial momentum from yours to mine
will also translate the final momentum from yours to mine. In fact it is
difficult to think of different systems of specifying initial and final momenta
and positions where a consistent translation system will not work.
Exercise 4.4.2. But not impossible. Hint: consider the map from the polar
coordinate space $r, \theta$ to itself that doubles the angle. This is not a bijection;
we can translate events one way, but not unambiguously the other.
[13] This condition does not seem to apply to one's love life. There are tactful ways a bloke
can tell his girlfriend he doesn't like her outfit, and there are others. He may tell the truth
in both cases, but the language makes a difference. In particular, it is always a mistake
to laugh. I speak from bitter experience.
What this means is that if we have two languages for talking about events,
then as long as the translation scheme between the two languages is a
bijection, and as long as an event can be specified in one language, then it can
also be specified in the other, and the translation scheme will work for all
such events if those events are correctly described.
But as well as specifying observable events, we also want to predict what will
happen in advance by means of some kind of theory. And it is going to be a
poor sort of theory where the prediction is different in languages with such a
translation scheme to hand. This imposes a constraint on the theory: it must
translate the same predictions into each other. This is known as Einstein's
Principle of Covariance and you should be able to see how he (and Poincaré)
came to it: by seeing different coordinate frameworks as providing different
languages and there being a translation system between them.
We normally have not just two languages and one translation system between
them but a whole space of languages and a group of translation schemes,
since given any three languages, if I can translate from A to B and from B
to C by maps, then I can translate from A to C by the composite; moreover
the identity will translate from any language to itself, and we really want
every translation system to work in both directions, so there is an inverse
map. The associativity of composites of maps is immediate, so we have a
group of such translation systems. In the case of the shifting and rotation of
a coordinate frame used to specify only the positions of points, this is clearly
the Special Euclidean Group, $SE(2, \mathbb{R})$. See the M2213 lecture notes if you
have forgotten this.
If $f : \mathbb{R}^2 \to \mathbb{R}^2$ is any map, it makes sense to ask if it is invariant with
respect to a group action. For example, $f(x, y) = x^2 + y^2$ is invariant under
the rotation group $SO(2, \mathbb{R})$: putting
\[ X = x \cos\theta - y \sin\theta \]
\[ Y = x \sin\theta + y \cos\theta \]
we easily confirm that $f(X, Y) = f(x, y)$ for any $\theta$. So if some sort of
prediction is specified by a function we can look to see if it is invariant under
the appropriate group of transformations of coordinates which we regard as
specifying the possible languages we have available, and if it is not then it
cannot possibly define a satisfactory theory, because different observers will
expect to have different and incompatible outcomes. If a theory is specified
by requiring two functions to be equal, then they must be equal both before
and after we perform the appropriate group actions on them.
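Checking invariance of a function under the rotation group is easy to
mechanise, at least as a spot check at a few points and angles (which is
evidence, not a proof). In this sketch the second function `g`, the sample
points and the helper `is_invariant` are assumptions introduced for contrast.

```python
import math


def f(x, y):
    """The invariant example from the text: squared distance from the origin."""
    return x * x + y * y


def g(x, y):
    """A function which is not rotation invariant (assumed counterexample)."""
    return x * x - y * y


def rotate(x, y, theta):
    """The substitution X = x cos(theta) - y sin(theta), Y = x sin(theta) + y cos(theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return (x * c - y * s, x * s + y * c)


def is_invariant(func, points, angles, tol=1e-12):
    """True if func takes the same value before and after every sampled rotation."""
    return all(
        abs(func(*rotate(x, y, th)) - func(x, y)) <= tol
        for (x, y) in points
        for th in angles
    )
```

Two observers whose axes differ by a rotation will agree about `f` and
disagree about `g`, so a theory stated in terms of `g` alone would not do.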
What is the group action in the case of the ball bouncing off the wall? We
have that the space in question is the space of positions and momenta of
balls. This we have seen is the cotangent space to $\mathbb{R}^2$, which is isomorphic to
$\mathbb{R}^4$. It is not uncommon to write elements of this in the form $(q_1, q_2, p_1, p_2)$
where the $q_i$ are the positions, $x$ and $y$ more conventionally, and the $p_i$ are
the momenta. If we do a shift of a coordinate system, this will affect the $q_i$
but not the $p_i$. If we do a rotation, both will be affected in the same way.
We can also consider a coordinate frame which is moving at a constant
velocity with respect to another one, requiring us to specify also the time. So
we have a five dimensional space in which to specify the position and
momentum of the ball at each time, two coordinate frameworks for turning the
motion into a map from $\mathbb{R}$, denoting the time, into $\mathbb{R}^5$, and a map between
them which has the property of taking one description to another description
of the same event.
Exercise 4.4.3. Write down a specification for a ball moving in a straight
line in $\mathbb{R}^2$ with constant momentum. Use the five numbers $(t, q_1, q_2, p_1, p_2)$.
Choose actual numbers for the motion!
1. Take a coordinate frame which is shifted by some constant amount and
translate the function giving the position and momentum of the ball into
the new framework.
2. Do the same with a rotated coordinate frame.
3. Do the same for a frame which is both rotated and shifted.
4. Do the same for a frame which is both rotated and is moving at constant
velocity.
5. Do the same for a frame which is rotating with constant angular velocity.
6. Write down the groups for the first three transformations. What
physically intelligible function is invariant under this group?
7. Write down the group for all the first four transformations. (This is
called the Galilean Group.) What is its dimension?
8. Write down the group for all five transformations.
9. Is your function invariant under either of the last two groups?
10. What happens when the ball bounces?
11. Explain the physics here.
Exercise 4.4.4. If $V$ is a vector field on $\mathbb{R}^3$ and $f : \mathbb{R}^3 \to \mathbb{R}^3$ is a euclidean
transformation, is it true that when $V$ satisfies the equation $\nabla \cdot V = 0$ then
so does $Tf(V)$? If so prove it, if not give a counterexample.
Exercise 4.4.5. If $V$ is a vector field on $\mathbb{R}^3$ and $f : \mathbb{R}^3 \to \mathbb{R}^3$ is a
diffeomorphism, is it true that when $V$ satisfies the equation $\nabla \cdot V = 0$ then so
does $Tf(V)$? If so prove it, if not give a counterexample.
4.4.2 The Lorentz Group
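The definition of the Orthogonal group $O(n, \mathbb{R})$ was either that it consisted of
the orthogonal $n \times n$ real matrices, or, better, that it consisted of the linear
maps from $\mathbb{R}^n$ to $\mathbb{R}^n$ which preserved the inner product. Formally,
\[ \forall A \in L(\mathbb{R}^n, \mathbb{R}^n),\quad A \in O(n, \mathbb{R}) \iff \forall u, v \in \mathbb{R}^n,\ \langle u, v \rangle = \langle Au, Av \rangle \]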
So as to simplify things I shall now take a generalised inner product on $\mathbb{R}^2$,
which I shall write as having elements $(t, x)^T$:
\[ \left\langle \begin{pmatrix} t \\ x \end{pmatrix}, \begin{pmatrix} t' \\ x' \end{pmatrix} \right\rangle = tt' - xx' \]
Note that I have reversed the sign from what you probably expected and the
one that the text book favours. If you feel uneasy about this, go through
multiplying everything by $-1$.
It follows that the norm of the vector $(t, x)^T$ is $t^2 - x^2$.
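I shall argue by analogy with the usual inner product on $\mathbb{R}^2$.
In order to find out what the orthogonal maps looked like on $\mathbb{R}^2$, we took
the unit circle, and argued that any point on it had to remain on it under
an orthogonal map. Doing the same here, take the set
\[ H_1 = \left\{ \begin{pmatrix} t \\ x \end{pmatrix} \in \mathbb{R}^2 : t^2 - x^2 = 1 \right\} \]
and require
\[ \begin{pmatrix} t \\ x \end{pmatrix} \in H_1 \ \Rightarrow\ \begin{pmatrix} s \\ u \end{pmatrix} = A \begin{pmatrix} t \\ x \end{pmatrix} \in H_1 \ \Rightarrow\ s^2 - u^2 = 1 \]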
It is easy to draw the set $t^2 - x^2 = 1$ and it consists of a hyperbola as in
figure 4.4.2.
My reason for drawing it this way around and taking $t^2 - x^2$ and not $x^2 - t^2$
is that all the action has $x < t$ which, given that the velocity of light is one
in these units and that things don't travel faster than light, is the way things
ought to be.
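Now we can parametrise the unit circle by $\cos\theta$, $\sin\theta$ and it is easy to
parametrise the curve $H_1$ by
\[ t = \cosh\theta, \qquad x = \sinh\theta \]
since
\[ \cosh^2\theta - \sinh^2\theta = \frac{e^{2\theta} + e^{-2\theta} + 2}{4} - \frac{e^{2\theta} + e^{-2\theta} - 2}{4} = 1 \]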
Now we note that the standard basis elements in $\mathbb{R}^2$ are $t = 1, x = 0$ and
$t = 0, x = 1$, and that the norm of the first is 1, so it is in $H_1$, and the norm
of the second is $-1$ and it is not in $H_1$. So there is a slight problem with
defining a Lorentzian matrix in terms of $\cosh\theta$ and $\sinh\theta$. The solution is
to observe that we need to extend $H_1$ to contain the other hyperbola which
intersects the x axis, as in figure 4.4.3.
Figure 4.4.3: A better analogue of the unit circle in a Lorentzian space.
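We can now see that we should have defined
\[ H_1 = \left\{ \begin{pmatrix} t \\ x \end{pmatrix} \in \mathbb{R}^2 : t^2 - x^2 = \pm 1 \right\} \]
The matrices in question have the form
\[ \begin{pmatrix} \cosh\theta & \sinh\theta \\ \sinh\theta & \cosh\theta \end{pmatrix} \]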
Those vectors for which we are in the original hyperbola are called timelike,
since they represent velocities which are less than 1 and correspond to things
we may see in our universe. Light, which moves at the velocity 1 in our units,
must lie along $x = \pm t$, and consists of vectors not in $H_1$, and the norm of
any such vector is zero. So the analogue of distance in our Lorentzian space,
which we call the interval, is zero for any light ray. Seen from the point
of view of a ray of light, there is no difference between starting from the
Andromeda Galaxy and arriving in your eye. This is definitely weird; well,
that's reality for you.
Supposing we start with a timelike vector in the two dimensional Lorentzian
space, for example $t = 2$, $x = 1$. It goes to
\[ t' = 2\cosh\theta + \sinh\theta, \qquad x' = 2\sinh\theta + \cosh\theta \]
It is easy to verify that the norm of the original vector is 3 and so is the
norm of the final vector.
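The verification for the vector $t = 2$, $x = 1$ can be repeated numerically for
several values of the hyperbolic angle. A minimal sketch follows; the function
names `interval` and `boost` are hypothetical, and the matrix used is the
symmetric one with cosh on the diagonal and sinh off it.

```python
import math


def interval(t, x):
    """The Lorentzian 'norm' t^2 - x^2 used in the text."""
    return t * t - x * x


def boost(t, x, theta):
    """Apply the matrix [[cosh, sinh], [sinh, cosh]] of hyperbolic angle theta."""
    ch, sh = math.cosh(theta), math.sinh(theta)
    return (ch * t + sh * x, sh * t + ch * x)
```

Whatever value of theta is used, `interval(*boost(2.0, 1.0, theta))` should
agree with `interval(2.0, 1.0)`, which is 3, up to rounding error.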
Exercise 4.4.6. Confirm that all such matrices as those advertised do indeed
preserve the Lorentzian form. What is their determinant? What other
matrices preserve the Lorentzian form? What is their determinant?
Now let's get back to higher dimensional spaces with a (1,n) signature. I
have the usual space-time situation with $(x_0, x_1, x_2, \ldots, x_n)^T$ and the lorentzian
generalised inner product $x_0 x'_0 - \sum_{j \in [1:n]} x_j x'_j$. I am particularly concerned
with $n = 3$ because that is the number of spatial dimensions of the universe
we live in.
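There are six basic lorentzian matrices in $\mathbb{R}^4$ with the lorentzian inner
product:
\[
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & c & -s \\ 0 & 0 & s & c \end{pmatrix}
\quad
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & c & 0 & -s \\ 0 & 0 & 1 & 0 \\ 0 & s & 0 & c \end{pmatrix}
\quad
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & c & -s & 0 \\ 0 & s & c & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\tag{4.4.1}
\]
gives three of them, where the $c$ and $s$ entries represent cosines and sines
of angles and give the three dimensional space of real orthogonal matrices.
The time axis is left fixed in this case, and each of these leaves one other
orthogonal axis fixed.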
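The other three are
\[
\begin{pmatrix} ch & sh & 0 & 0 \\ sh & ch & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\quad
\begin{pmatrix} ch & 0 & sh & 0 \\ 0 & 1 & 0 & 0 \\ sh & 0 & ch & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\quad
\begin{pmatrix} ch & 0 & 0 & sh \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ sh & 0 & 0 & ch \end{pmatrix}
\tag{4.4.2}
\]
where $ch$ is short for $\cosh\theta$ and $sh$ is short for $\sinh\theta$ for various $\theta$. Each of
these swaps the time into one of the three space axes and vice versa. Again,
two axes are left fixed. They are known to physicists as Lorentz Boosts.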
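A numerical sanity check on these matrices (which is no substitute for an
argument) is to confirm that $A^T \eta A = \eta$, where $\eta$ is the diagonal matrix
of the lorentzian inner product. The sketch below uses plain lists for
matrices; the names `ETA`, `boost_x`, `rotation_yz` and `preserves_form` are
hypothetical, and the position of the minus sign in the rotation is an
assumed convention.

```python
import math

# Matrix of the lorentzian inner product with signature (+, -, -, -)
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def matmul(A, B):
    """Product of two 4 x 4 matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(A):
    return [[A[j][i] for j in range(4)] for i in range(4)]


def boost_x(theta):
    """Boost mixing the time axis with the first space axis."""
    ch, sh = math.cosh(theta), math.sinh(theta)
    return [[ch, sh, 0, 0], [sh, ch, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def rotation_yz(angle):
    """Rotation of the last two space axes (sign convention is an assumption)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, c, -s], [0, 0, s, c]]


def preserves_form(A, tol=1e-9):
    """True if A^T ETA A equals ETA, so that A preserves the inner product."""
    M = matmul(transpose(A), matmul(ETA, A))
    return all(abs(M[i][j] - ETA[i][j]) <= tol
               for i in range(4) for j in range(4))
```

A boost, a rotation and their composite all pass the check, while something
like doubling the time axis does not.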
Exercise 4.4.7. Show that each of the above six matrices preserves the
lorentzian inner product, and hence that any composite of them (for any
consistent values of the argument of cos, sin, cosh or sinh) will also.
Exercise 4.4.8. Show that every matrix which preserves the lorentzian inner
product must be some finite product of such matrices.
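Exercise 4.4.9. Show that the Galilean group can be represented by matrices
of the form
\[ \begin{pmatrix} 1 & 0 & 0 & 0 \\ v_1 & a_{11} & a_{12} & a_{13} \\ v_2 & a_{21} & a_{22} & a_{23} \\ v_3 & a_{31} & a_{32} & a_{33} \end{pmatrix} \]
where
\[ \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} \]
is in $SO(3, \mathbb{R})$.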
Remark 4.4.1. Note that both the Lorentz group and the Galilean group
can deal with a change of coordinates from a fixed system to one moving at
uniform velocity with respect to it. And they are different! The Lorentz group
is the right one for relativity. You should observe that for the Lorentz group,
movement with velocity $v$ means setting $\tanh(\theta) = v$, and that we recover the
usual (relativistic) rules for the addition of velocities.
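The relation $\tanh(\theta) = v$ means that composing two boosts along the same
axis adds their hyperbolic angles, and the velocities combine accordingly. A
short sketch, with hypothetical function names and velocities measured in
units where light has speed 1:

```python
import math


def add_velocities(u, v):
    """Combine two collinear velocities by adding hyperbolic angles:
    theta_u = atanh(u), theta_v = atanh(v), result = tanh(theta_u + theta_v)."""
    return math.tanh(math.atanh(u) + math.atanh(v))


def addition_formula(u, v):
    """The usual relativistic rule (u + v) / (1 + u v)."""
    return (u + v) / (1 + u * v)
```

The two functions agree to rounding error, half the speed of light combined
with half the speed of light comes out as 0.8, and the result stays below 1
however close to 1 the inputs are taken.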
Exercise 4.4.10. Find a good source on Special Relativity: you could do
worse than the Feynman Lectures on Physics, Volume 1, chapter 15. Note
the equations
\[ x' = \frac{x - ut}{\sqrt{1 - u^2}}, \qquad y' = y, \qquad z' = z, \qquad t' = \frac{t - ux}{\sqrt{1 - u^2}} \]
Show that these are essentially the inverse of the first matrix in 4.4.2.
Explain why physicists use the inverse.
Exercise 4.4.11. Show that if a space ship zooms past you at velocity half
that of light, and another spaceship zooms past that at half the speed of light
(relative to the first spaceship) in the same direction, then you will decide the
second spaceship has a velocity of 4/5 the speed of light. Show that if the first
had speed u and the second had speed v relative to the first, your opinion of
the speed of the second is given by
\[
w = \frac{u + v}{1 + uv}
\]
Show that if $|u| < 1$ and $|v| < 1$ then $|w| < 1$.
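The arithmetic here is easily checked by machine. The sketch below is a minimal illustration in units where the speed of light is 1; it computes the composite velocity both from the formula for w and by adding the arguments of tanh, which is what composing two boosts along the same axis amounts to. The function names are illustrative.

```python
import math

def add_velocities(u, v):
    """Relativistic sum of two collinear velocities, with c = 1."""
    return (u + v) / (1 + u * v)

def add_via_rapidity(u, v):
    """The same sum, obtained by adding atanh(u) and atanh(v)."""
    return math.tanh(math.atanh(u) + math.atanh(v))

w = add_velocities(0.5, 0.5)          # two spaceships at half light speed
w_check = add_via_rapidity(0.5, 0.5)  # agrees with w up to rounding
```

Both routes give 4/5 for the two spaceships, and trying values of u and v close to 1 shows the result creeping towards 1 without reaching it.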
Exercise 4.4.12. Read Feynman's Lecture Notes in Physics, Volume 1,
chapter 15. If you are a physicist you will have already covered this material;
if not, you will find it comforting to discover you have now done Special
Relativity. Easy, isn't it? Note that apart from a few technical difficulties (!)
you have discovered how atom bombs and nuclear power stations work.[14]
[14] The details of atom bombs are very simple, and the recipe is as follows: take about
4.4.3 The Maxwell Equations
The first of Maxwell's equations for the vacuum, with charge density zero, is
\[
\nabla \cdot E = 0
\]
To see that this is invariant under SO(3,R) is in one sense trivial. The equation
says that the net outflow of the flux determined by the vector field E for
any little box is zero, in fact for any box whatever, where a box is a region
bounded by something diffeomorphic to a 2-sphere. Rotating a box gives another
box, so the net outflow from a shifted box is also zero. This obviously
extends to the non-vacuum case with a non-zero charge density. It obviously
holds for a much larger group than SO(3,R) too: it must hold for any
diffeomorphism, although the equation stating the fact that the divergence is zero
would look rather different.
Although this argument is persuasive, it lacks a certain rigour, so a slightly
more careful argument is required. We can observe that when we take the
divergence of the original vector field at any point it has to be the same as
the divergence of the transformed field at the transformed point. And since
the zero map is also preserved by the orthogonal map, the result follows.
The equation $\nabla \cdot B = 0$ is also invariant for the same reason.
Exercise 4.4.13. Show that the statement 'the divergence of the original
vector field at any point has to be the same as the divergence of the
transformed field at the transformed point' can be translated into algebra by doing
it, and that it is true for any vector field.
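Now this argument uses only the linearity of the matrix and the fact that
$t' = t$, which does not hold for a lorentz boost such as
\[
\begin{pmatrix} \mathrm{ch} & \mathrm{sh} & 0 & 0 \\ \mathrm{sh} & \mathrm{ch} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\]
The problem here is that the electric and magnetic fields are defined as vector
fields on $\mathbb{R}^3$ and the lorentz boost maps are represented by $4 \times 4$ matrices,
so we can expect some serious complications.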
[Footnote 14, continued] a kilogram of Uranium 235 or Plutonium and shape it into a hemisphere. Do the same
with a second kilogram. Now clap them together hard to make a solid sphere. Children,
do not do this at home unless you really dislike your parents.
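The actual transformation of the E and B fields is rather a shock at first and
is given by
\[
\begin{pmatrix} \cosh\phi & \sinh\phi & 0 & 0 \\ \sinh\phi & \cosh\phi & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} :
\begin{pmatrix} E_x \\ E_y \\ E_z \end{pmatrix} \mapsto
\begin{pmatrix} E_x \\ E_y \cosh\phi - B_z \sinh\phi \\ E_z \cosh\phi + B_y \sinh\phi \end{pmatrix}
\tag{4.4.3}
\]
and
\[
\begin{pmatrix} \cosh\phi & \sinh\phi & 0 & 0 \\ \sinh\phi & \cosh\phi & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} :
\begin{pmatrix} B_x \\ B_y \\ B_z \end{pmatrix} \mapsto
\begin{pmatrix} B_x \\ B_y \cosh\phi + E_z \sinh\phi \\ B_z \cosh\phi - E_y \sinh\phi \end{pmatrix}
\tag{4.4.4}
\]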
What is surprising about this is that the Electric and Magnetic components
get mixed up. This means that if I am travelling in the x direction with
velocity $\tanh\phi$ (which you will note has absolute value always less than 1,
the speed of light) and we both measure an electric field and a magnetic
field, we shall differ on which bits are which. This is a strong hint that the
two phenomena of electric fields and magnetic fields are all part of the same
underlying entity, called the electromagnetic field.
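To see the mixing happen with actual numbers, here is a minimal Python sketch of the rules (4.4.3) and (4.4.4), in units where the speed of light is 1. It starts with a purely electric field and boosts along x at half the speed of light; the helper names and the sample field are illustrative assumptions, not part of the derivation.

```python
import math

def boost_fields(E, B, phi):
    """Apply (4.4.3) and (4.4.4): fields after the first boost, argument phi."""
    ch, sh = math.cosh(phi), math.sinh(phi)
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    E_new = (Ex, Ey * ch - Bz * sh, Ez * ch + By * sh)
    B_new = (Bx, By * ch + Ez * sh, Bz * ch - Ey * sh)
    return E_new, B_new

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# A purely electric field in my frame ...
E, B = (0.0, 1.0, 0.0), (0.0, 0.0, 0.0)
phi = math.atanh(0.5)        # you travel at half the speed of light
E2, B2 = boost_fields(E, B, phi)

# ... acquires a magnetic component in yours, while these two
# combinations of the fields come out the same in both frames.
invariants_before = (dot(E, B), dot(E, E) - dot(B, B))
invariants_after = (dot(E2, B2), dot(E2, E2) - dot(B2, B2))
```

The z component of `B2` is non-zero although B was zero to begin with, whereas $E \cdot B$ and $E \cdot E - B \cdot B$ agree before and after, which follows from $\cosh^2\phi - \sinh^2\phi = 1$.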
The derivation of the above transforms will be easier once we go over to using
differential forms instead of vector fields to represent the two parts, E and
B, of the electromagnetic field, so I shall defer it.
The invariance of the Maxwell Equations under the transforms is also easier
to see in this setting. So we now turn to the right way to talk of the
electromagnetic field.
4.5 Saying it with Differential Forms
Given a physical phenomenon, in this case the force exerted on a charged
particle, and given that two bits of language can be used to describe it, in
this case as a vector field on $\mathbb{R}^3$ or as a 1-form or a 2-form, the question of
which piece of language to use comes up immediately. We may, of course,
choose the first one that occurs to us and having made a choice stick to
it in defiance of later developments. This is rather stupid and regrettably
common. An alternative is to ask if there are any physically obvious grounds
for making a choice, and a second is to keep them both in use until such time
as one demonstrates advantages.
Let me first argue that it is reasonable to use 2-forms for describing magnetic
fields. The reason is that 2-forms take account of orientations, and magnetic
fields certainly exhibit all the usual signs of being orientation aware. If you
look at the Lorentz Force law, which I give again for your greater comfort,
\[
F = q(E + v \times B) \tag{4.5.1}
\]
you will see that the $v \times B$ part clearly has an orientation aspect in it by
virtue of the cross product, whereas the electric field has only the sense or
direction. It therefore makes sense to represent the magnetic field as a
2-form. Some sort of right hand rule is involved in computing a cross product:
this may be seen as containing the information that magnetic fields also use
some sort of orientation information. They care about which direction a
charge is moving. Of course, we can force a vector view on the field if we
insist, which means we have to keep in mind the right hand rules of physics,
whereas if we use 2-forms, this should be taken care of by the formalism. A
good language is one which does most of the work and doesn't require us to
keep a close watch on it.
If B is a 2-form then so is its time derivative, and the equation
\[
\nabla \times E = -\frac{\partial B}{\partial t}
\]
is telling us that E is a 1-form. So we can write the above equation in the
form
\[
dE = -\frac{\partial B}{\partial t}
\]
where d is the exterior derivative which we know takes 1-forms to 2-forms.
We may also write dB = 0 with some confidence to represent the classical
equation $\nabla \cdot B = 0$ since the divergence of a vector field is a scalar field and
the exterior derivative of a 2-form is a 3-form which on $\mathbb{R}^3$ is pretty much
also a scalar field, multiplied by det if you want to be careful.
Unfortunately, after that everything goes pear-shaped.
The equation
\[
\nabla \times B - \frac{\partial E}{\partial t} = j
\]
makes no sense when we try to translate it: B can't have a curl, it is one.
$\partial_t E$ has to be another 1-form, and so is j. So we somehow have to arrange
that the translation of $\nabla \times B$ is a 1-form.
On the other hand, we have also avoided facing the fact that we should be
doing all of this on $\mathbb{R}^4$ with the lorentz metric. Maybe we can save things by
a small amount of rearrangement.
Let us look at the simplest case first, the equations
\[
\nabla \cdot B = 0, \qquad \nabla \times E + \frac{\partial B}{\partial t} = 0
\]
become
\[
dB = 0, \qquad dE + \frac{\partial B}{\partial t} = 0
\]
This corresponds closely to the physics: dB is indeed a divergence that is a
3-form on $\mathbb{R}^3$ and the exterior derivative of the electric 1-form is indeed a
2-form.
We can shift all this to our lorentz space, $\mathbb{R}^4$ with the lorentz inner product,
by defining a 2-form on $\mathbb{R}^4$ as follows:
\[
F = B + E \wedge dt \tag{4.5.2}
\]
Writing this out in coordinate form with respect to the standard basis we get
\[
B = B_x\, dy \wedge dz + B_y\, dz \wedge dx + B_z\, dx \wedge dy
\]
and
\[
E \wedge dt = E_x\, dx \wedge dt + E_y\, dy \wedge dt + E_z\, dz \wedge dt
\]
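I hope you recall representing the 2-form $3\,dx \otimes dx + 4\,dx \otimes dy - 2\,dy \otimes dx + 5\,dy \otimes dy$ on $\mathbb{R}^2$ as a matrix
\[
\begin{pmatrix} 3 & 4 \\ -2 & 5 \end{pmatrix}
\]
You will have verified that this operates on the two input vectors
\[
\begin{pmatrix} x \\ y \end{pmatrix}, \quad \begin{pmatrix} u \\ v \end{pmatrix}
\qquad \text{to give} \qquad
\begin{pmatrix} x & y \end{pmatrix} \begin{pmatrix} 3 & 4 \\ -2 & 5 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}
\]
In the same way F is represented, with respect to the ordered basis (t, x, y, z), by the matrix
\[
\begin{pmatrix} 0 & -E_x & -E_y & -E_z \\ E_x & 0 & B_z & -B_y \\ E_y & -B_z & 0 & B_x \\ E_z & B_y & -B_x & 0 \end{pmatrix}
\]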
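\[
\begin{aligned}
d(E \wedge dt) = {} & \left( \frac{\partial E_z}{\partial y} - \frac{\partial E_y}{\partial z} \right) dy \wedge dz \wedge dt \\
& + \left( \frac{\partial E_x}{\partial z} - \frac{\partial E_z}{\partial x} \right) dz \wedge dx \wedge dt \\
& + \left( \frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y} \right) dx \wedge dy \wedge dt
\end{aligned}
\]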
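Collecting up both parts we get:
\[
\begin{aligned}
dF = {} & \left( \frac{\partial B_x}{\partial x} + \frac{\partial B_y}{\partial y} + \frac{\partial B_z}{\partial z} \right) dx \wedge dy \wedge dz \\
& + \left( \frac{\partial E_z}{\partial y} - \frac{\partial E_y}{\partial z} + \frac{\partial B_x}{\partial t} \right) dy \wedge dz \wedge dt \\
& + \left( \frac{\partial E_x}{\partial z} - \frac{\partial E_z}{\partial x} + \frac{\partial B_y}{\partial t} \right) dz \wedge dx \wedge dt \\
& + \left( \frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y} + \frac{\partial B_z}{\partial t} \right) dx \wedge dy \wedge dt
\end{aligned}
\tag{4.5.3}
\]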
If dF = 0 then each of the above four lines must be zero. The first line says
that div B = 0 in old fashioned language, and the last three say that curl
E + $\partial B/\partial t$ = 0 in the same old fashioned language.
In other words, we get two of the Maxwell equations out. This is encouraging
and leads us to feel that gluing E and B together into a single entity, the
2-form F, is a good idea. This is the physically significant thing of which the
magnetic field and the electric field are merely different aspects.
The next step is to express the other pair of Maxwell equations in the same
language.
This is where the $\star$ operator comes in. It is clear that if we $\star F$ we get
another 2-form. When we calculate the matrix for it we get
\[
\begin{pmatrix} 0 & B_x & B_y & B_z \\ -B_x & 0 & E_z & -E_y \\ -B_y & -E_z & 0 & E_x \\ -B_z & E_y & -E_x & 0 \end{pmatrix}
\]
Exercise 4.5.2. Confirm this. The calculation is utterly trivial, all you need
to do is to organise your thoughts sensibly. Observe that the Hodge dual can
be memorised by mapping Ej to -Bj and Bj to Ej. This looks very like
what we want for the other pair of Maxwell Equations in the classical form.
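If you would like the machine to organise your thoughts for you, the following Python sketch computes the matrix of the Hodge dual of a 2-form on $\mathbb{R}^4$ from the permutation symbol. It is a sketch under stated assumptions: the axes are ordered t, x, y, z, the orientation is $dt \wedge dx \wedge dy \wedge dz$, and the inner product has signature (-, +, +, +); for 2-forms the opposite overall sign of the inner product gives the same answer, but reversing the orientation flips every sign. The function names are illustrative.

```python
from itertools import permutations

ETA = (-1, 1, 1, 1)   # assumed signature; axes ordered t, x, y, z

def parity(p):
    """Sign (+1 or -1) of a permutation of (0, 1, 2, 3)."""
    sign, p = 1, list(p)
    for i in range(len(p)):
        while p[i] != i:          # sort by swaps, flipping the sign each time
            j = p[i]
            p[i], p[j] = p[j], p[i]
            sign = -sign
    return sign

def field_matrix(E, B):
    """Matrix of F = B + E ^ dt with respect to (t, x, y, z)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0, -Ex, -Ey, -Ez],
            [Ex, 0, Bz, -By],
            [Ey, -Bz, 0, Bx],
            [Ez, By, -Bx, 0]]

def hodge(F):
    """(*F)_{mn} = 1/2 sum over a, b of eps_{mnab} F^{ab}, indices raised with ETA."""
    S = [[0.0] * 4 for _ in range(4)]
    for m, n, a, b in permutations(range(4)):
        S[m][n] += 0.5 * parity((m, n, a, b)) * ETA[a] * ETA[b] * F[a][b]
    return S

E, B = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)
star_F = hodge(field_matrix(E, B))
# The memorising rule: replace each Ej by -Bj and each Bj by Ej.
rule = field_matrix(tuple(-b for b in B), E)
```

With these conventions `star_F` and `rule` agree entry by entry, which is the rule of thumb stated in the exercise.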
If we take the exterior derivative we get a 3-form on $\mathbb{R}^4$, and if we $\star$ it we
get a 1-form. We can represent the current as a 1-form on $\mathbb{R}^4$ by putting the
charge density in the zeroth place and using the other three places to give
the values for the current flow. This means we need to define J as the 1-form
\[
J = \rho\, dt + j_1\, dx + j_2\, dy + j_3\, dz
\]
Whereupon we may write the other two Maxwell Equations out as
\[
\star(d(\star(F))) = J
\]
Exercise 4.5.3. Show that this does indeed amount to precisely the other
pair of Maxwell's Equations.
It is common to leave out all the parentheses and summarise the Maxwell
Equations, all together, in the form
\[
dF = 0 \tag{4.5.4}
\]
\[
\star d \star F = J \tag{4.5.5}
\]
Remark 4.5.1. You have to allow that this is rather cool. Compacting the
equations in this way gives us a much more elegant way to express the basic
facts of electromagnetism and should leave you feeling that it is more true to
the underlying reality than the classical form. If you had never met Maxwell's
Equations in the classical formalism and you had just met these for the first
time, you would, I think, find a good deal of charm in the conciseness, and
feel that the evidence for the electromagnetic field being a 2-form on a four
dimensional space-time is overwhelming. The fact that it requires a lorentz
inner product to work properly is at least highly suggestive.
Exercise 4.5.4. How far could you get in rewriting the Maxwell Equations
(in terms of forms) with the standard inner product on $\mathbb{R}^4$? What changes
would you need?
4.6 Lorentz Invariance
The first issue to be addressed is to determine how 2-forms transform under
a transformation of coordinates.
Suppose B is a 2-form on $\mathbb{R}^n$ with a generalised inner product, and
$A : \mathbb{R}^n \to \mathbb{R}^n$ is a diffeomorphism. Then take one coordinate system at the origin and
let another be obtained by performing A on it. Call S the coordinate system
at the origin, and think 'ordered basis' for a concrete example. Then AS is the
second coordinate system. Think of A as a linear map, possibly a lorentz map
to make this relatively concrete. Then a vector x in the coordinate system S
is read as $A^{-1}x$ in AS. Call it $x' = A^{-1}x$, and likewise $u' = A^{-1}u$ for a second vector.
Then if $B'$ acts on the ordered pair $x', u'$ we need
\[
x^T A^T [B'] A u = x^T [B] u
\]
This can only happen when $A^T [B'] A = [B]$, that is,
\[
[B'] = (A^T)^{-1} [B] A^{-1}
\]
which gives us the transform of the matrix [B] representing the 2-form B.
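The matrix identity behind this is easy to test numerically. The sketch below takes A to be the first lorentz boost, whose inverse is the boost with the sign of its argument reversed, forms $(A^T)^{-1}[B]A^{-1}$ for an arbitrary skew-symmetric matrix, and checks that the new matrix fed with Ax and Au returns the same number as [B] fed with x and u. The sample matrix, the vectors and the function names are illustrative assumptions.

```python
import math

def boost(phi):
    """First lorentz boost with argument phi."""
    ch, sh = math.cosh(phi), math.sinh(phi)
    return [[ch, sh, 0, 0], [sh, ch, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def apply(a, x):
    return [sum(a[i][k] * x[k] for k in range(4)) for i in range(4)]

def evaluate(form, x, u):
    """x^T [form] u for a 2-form given by its matrix."""
    return sum(x[i] * form[i][j] * u[j] for i in range(4) for j in range(4))

def transform_two_form(form, a_inv):
    """(A^T)^{-1} [B] A^{-1}, computed from the inverse of A."""
    return matmul(transpose(a_inv), matmul(form, a_inv))

phi = 0.3
A, A_inv = boost(phi), boost(-phi)   # inverse boost: reverse the argument
B = [[0, 1, 2, 3], [-1, 0, 4, 5], [-2, -4, 0, 6], [-3, -5, -6, 0]]
B_dash = transform_two_form(B, A_inv)

x, u = [1.0, 2.0, 0.5, -1.0], [0.0, 1.0, 3.0, 2.0]
same = abs(evaluate(B_dash, apply(A, x), apply(A, u)) - evaluate(B, x, u)) < 1e-9
```

Here `same` comes out true, and `B_dash` is again skew-symmetric, as the matrix of a 2-form must be.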
Exercise 4.6.1. It is well known that for an orthogonal matrix A, $A^T = A^{-1}$.
What can you say about the transpose of a lorentzian matrix?
Exercise 4.6.2. The question arises, how much of this depends on linearity?
Obviously we have chosen to represent things by matrices, but the equation
makes sense in a much more general setting except possibly for the business
of the transpose, which arose from our determination to represent B by a
matrix. Suppose we express the 2-form relative to a basis in the usual way
as a sum of $dx^i \wedge dx^j$. What can be said about the expression of $B'$ relative
to the $dx'^i \wedge dx'^j$? How much if anything can be saved if we permit A to be
a diffeomorphism? Hint: investigate this in $\mathbb{R}^2 \setminus \{0\}$ with reference to the
polar coordinate transform.
The obsession which physicists and old style mathematicians have with matrix
representations of differential forms can obscure the basic simplicities. We
have already computed dF in standard terms as a 3-form in equation 4.5.3,
and it is simpler to investigate the lorentz transformations of both 2-forms
and 3-forms directly.
Let's do it for the 2-form F on $\mathbb{R}^4$ and the first lorentz boost.
We have that any 2-form on $\mathbb{R}^4$ is given by suitable linear combinations,
weighted sums, of the six terms $dt \wedge dx$, $dt \wedge dy$, $dt \wedge dz$, $dx \wedge dy$, $dx \wedge dz$,
$dy \wedge dz$. From the matrix representation for F we can read these off:
\[
\begin{aligned}
F = {} & -E_x\, dt \wedge dx - E_y\, dt \wedge dy - E_z\, dt \wedge dz \\
& + B_z\, dx \wedge dy - B_y\, dx \wedge dz + B_x\, dy \wedge dz
\end{aligned}
\]
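If we suppose that the first lorentz boost is used to transform the standard
basis in $\mathbb{R}^4$ to a new basis, what I shall call the dashed basis, then we need
the inverse map to transform the coordinates of a point (event) in $\mathbb{R}^4$ to the
new coordinates in the dashed frame. Thus, writing c for $\cosh\phi$ and s for $\sinh\phi$, we have
\[
\begin{pmatrix} t' \\ x' \\ y' \\ z' \end{pmatrix} =
\begin{pmatrix} c & -s & 0 & 0 \\ -s & c & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix} =
\begin{pmatrix} ct - sx \\ -st + cx \\ y \\ z \end{pmatrix}
\]
and so
\[
\begin{pmatrix} dt' \\ dx' \\ dy' \\ dz' \end{pmatrix} =
\begin{pmatrix} c\, dt - s\, dx \\ -s\, dt + c\, dx \\ dy \\ dz \end{pmatrix}
\]
Hence
\[
dt' \wedge dx' = dt \wedge dx; \qquad dt' \wedge dy' = c\, dt \wedge dy - s\, dx \wedge dy;
\]
\[
dt' \wedge dz' = c\, dt \wedge dz - s\, dx \wedge dz; \qquad dx' \wedge dy' = -s\, dt \wedge dy + c\, dx \wedge dy
\]
and
\[
dx' \wedge dz' = -s\, dt \wedge dz + c\, dx \wedge dz; \qquad dy' \wedge dz' = dy \wedge dz
\]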
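These six products can be checked mechanically. In the sketch below a 1-form on $\mathbb{R}^4$ is stored as its four coefficients with respect to dt, dx, dy, dz, and the wedge of two 1-forms a and b has coefficient $a_i b_j - a_j b_i$ on the i-th and j-th basis forms for i < j. This representation, the sample value of the boost argument, and the names used are illustrative assumptions.

```python
import math

def wedge(a, b):
    """Coefficients of a ^ b on the basis 2-forms, keyed by index pairs (i, j), i < j."""
    return {(i, j): a[i] * b[j] - a[j] * b[i]
            for i in range(4) for j in range(i + 1, 4)}

phi = 0.4
c, s = math.cosh(phi), math.sinh(phi)
T, X, Y, Z = range(4)

# Dashed 1-forms written out in the undashed basis (dt, dx, dy, dz).
dt_dash, dx_dash = (c, -s, 0, 0), (-s, c, 0, 0)
dy_dash, dz_dash = (0, 0, 1, 0), (0, 0, 0, 1)

tx = wedge(dt_dash, dx_dash)   # expected: dt ^ dx alone, coefficient 1
ty = wedge(dt_dash, dy_dash)   # expected: c dt ^ dy - s dx ^ dy
xy = wedge(dx_dash, dy_dash)   # expected: -s dt ^ dy + c dx ^ dy
```

The coefficient `tx[(T, X)]` is $c^2 - s^2 = 1$ and every other coefficient of `tx` vanishes, which is why $dt' \wedge dx'$ is unchanged by this boost.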
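Exercise 4.6.3. Find the expressions for the four basis elements of a three-form,
$dt' \wedge dx' \wedge dy'$, $dt' \wedge dx' \wedge dz'$, $dt' \wedge dy' \wedge dz'$ and $dx' \wedge dy' \wedge dz'$
[...] $E_x'$, $E_y'$, $E_z'$, $B_x'$, $B_y'$, $B_z'$
[...] $dt' \wedge dx' \wedge dy'$, $dt' \wedge dx' \wedge dz'$, $dt' \wedge dy' \wedge dz'$ and $dx' \wedge dy' \wedge dz'$.
Now we can replace the last four terms by the undashed terms.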
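We also have that the chain rule allows us to replace the $\partial_{x'} E_y'$ and similar
terms by their undashed translations:
\[
\begin{pmatrix} \partial_{t'} f, & \partial_{x'} f, & \partial_{y'} f, & \partial_{z'} f \end{pmatrix} =
\begin{pmatrix} \partial_t f, & \partial_x f, & \partial_y f, & \partial_z f \end{pmatrix}
\begin{pmatrix} c & s & 0 & 0 \\ s & c & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\]
[...] in terms of the undashed symbols.
We finally have to confirm that
\[
dF = 0 \Rightarrow dF' = 0
\]
At first sight, the expression for $dF'$ [...] $= 0$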
This sequence of exercises has established that the vacuum Maxwell equations
are invariant under the first lorentz boost.
Exercise 4.6.7. Confirm that this also works for the other two lorentz boosts.
This is best done using a small amount of thought rather than a large amount
of algebra.
Since we have already seen that the vacuum Maxwell equations are invariant
under the special orthogonal subgroup, it follows that the equation
\[
dF = 0
\]
is invariant under the lorentz group.
Now if $\star d \star F = 0$, which it does in the vacuum, it also follows that $d \star F = 0$
since $\star$ takes zero forms to zero forms and $\star^2$ is a number, in fact $\pm 1$.
Exercise 4.6.8. Which?
So the same calculation will establish the invariance of the second equation.
After doing this we conclude that both the vacuum equations are invariant
under the Lorentz group.
Exercise 4.6.9. Show that $\star d \star F$ is also invariant under the three lorentz
boosts and the special orthogonal group. Hint: this does not require doing it
all over again!
This result is absolutely astonishing. I shall now explain why.
4.6.1 Special Relativity
Newton's Laws of motion are about forces, which is to say accelerations
and masses if we look at the things we can actually measure directly. And
these appear at first sight to be invariant under the galilean group. Certainly
Newton thought they were, although group theory not having been formalised
in his day, he wouldn't have put it in that way. Had the lorentz group been
in existence, the possibility that the laws of motion were invariant under the
lorentz group would have been regarded as a bizarre possibility too absurd
to waste time on, although one couldn't rule it out on experimental grounds
since speeds with which we are familiar are small compared with the velocity
of light.
Invariance of the laws of nature under the galilean group explained why
we haven't found anywhere in the universe labelled 'origin', a special point,
possibly with three orthogonal axes sticking out of it. It must have looked
unlikely that we will, and the fact that we have used coordinate systems and
bases of orthogonal vectors to talk about the universe led immediately to the
observation that this was merely a convenient language, and no particular
coordinate frame was better than any other; indeed, one could be moving
at constant velocity with respect to another and they were equally good.
Accelerating frames changed things, though, as one finds when looking at
tops and roundabouts and planets.
The fact that Maxwell's equations are not invariant under the galilean group
but are under the lorentz group produces some very strange results. One
is that the velocity of light is constant and does not depend on your own
velocity.
This is very unnerving indeed. If light is a wave motion like waves in water
then the wave velocity is a property of the water. If you are at rest with
respect to the water you get one answer, and if you are moving with respect
to the water you get another. This is not observed. If light goes like little
bullets from a source, then the velocity of the light is affected by the velocity
of the source with respect to any other observer. This is also not observed.
What happens when you travel fast away from the light is that the frequency
is shifted down, the colour changes, and if you travel towards it the frequency
is shifted up, the doppler effect. But the velocity is not affected.
Figure 4.6.1: An experiment with two charges.
It gets worse. Suppose I am sitting in a frame of rulers with a clock and
describing what happens when I take two charged balls of opposite signs and
place them a small distance from each other, as in figure 4.6.1.
I have tied each charge to some fixed object and measured the tendency
of the charges to attract each other by measuring the stretch of a spring.
Since everything is pretty much the same except for the sign of the charge,
including the mass of the balls and the elasticity of the springs, there is a
high degree of symmetry in the arrangement and I expect the stretches to be
the same. I can measure these by reading off the number on the ruler where
the edge of the ball is.
Now you come zooming past me going into the picture at half the speed of
light.
You see the two balls, you see the extension of the spring, and you can
measure the electric and magnetic fields with some little test charge you
carry with you, and some little bar magnet. Your little bar magnet can be
thought of as just two charges orbiting about each other, close enough so
they have no net electric field to speak of, and fast enough so they produce a
magnetic field. You can also observe, just as I can, the numbers on the ruler
where the edges of the balls are, and we had better agree on these numbers.
The collision of penguins is a fact in any language that is not totally bizarre,
and the coincidence of edges of balls and numbers on an external ruler must
also be a property of the universe, not the language we choose to talk about
it.
In my framework there is no magnetic field at all, unless we count your
measuring apparatus. But in yours there are two charges coming towards
you at half the velocity of light so there has to be a magnetic field, because
moving charges have to produce one; Faraday said they did and so they
do. So the numbers you get for the electric field and the magnetic field will
be different from the numbers I get, yours will have a non-zero magnetic
component.
Exercise 4.6.10. Use the first lorentz boost to work out what the rate of
exchange is.
We will both agree that the springs extend and the objects are attracting
each other. But our explanations of why will be different. You will have an
explanation which involves magnetic fields and mine won't.
In different configurations we may disagree about the extension of the springs
or the masses of the balls or the time duration between events, although we
must of course agree about whether an event occurs or not. If two balls
(or penguins) collide, we must agree that the event happens, this being a
property of the balls (or penguins), but the translation between our languages
may make our measurements of various forces disagree when our translation
is done using the lorentz group.
The problem which was tackled in the early days of the twentieth century
was, can you have the mechanical part of nature invariant under the galilean
group and the electromagnetic part invariant under the lorentz group? As you
can see from our discussion on penguins and Einstein's covariance principle,
this amounts to having a different and incompatible language for different
aspects of the universe, and we measure the effects of fields by mechanical
means. So it does not make much sense to have two incompatible languages
for talking about the same thing. The Michelson-Morley experiment tried
to measure the velocity of the earth with respect to the luminiferous aether,
the 'whatever it was' that waggled when light passed through it ('luminiferous'
just means light bearing and 'aether' meant some weird stuff which spread
throughout the universe and had no other function than to bear light. In
particular it didn't obstruct or slow down mechanical things like planets or
penguins). This makes some sort of sense if there is one kind of invariance
for matter and another for electromagnetic fields. The answer seemed to
be that it was zero: the velocity of light does not depend on the velocity of the
observer measuring it. This is consistent with the lorentz invariance of the
Maxwell equations, but not with the idea that you can get away with two
incompatible languages for talking about the world.
Given this there are only two possibilities: either Maxwell's laws are wrong
or Newton's are. Mostly people assumed that Maxwell was going to have
to lose out in the fight between the intellectual giants, mainly because they
were used to Newton, and Maxwell was the new kid on the block, although
it was hard to see how he was wrong.
Poincaré pointed out that the alternative was to suppose that everything was
invariant under the lorentz group. Einstein worked out the consequences and
we had the special theory of relativity, and $E = mc^2$ is a trivial corollary.
Hence atom bombs and nuclear power stations. So today we use galilean
invariance as a simple approximation when velocities are low, and lorentz
invariance is taken to be right. For everything.
It is fashionable for philosophers to pontificate on science, and since scientists
are usually much too busy doing interesting things to bother much about
them, the philosophers have much more influence on the great unwashed
than they should. One line of argument goes: Einstein showed Newton
was wrong, no doubt someone will eventually show Einstein is wrong too,
so nothing is known for sure and all knowledge and belief is temporary. So
we might as well stay totally ignorant of science. In fact since all knowledge
is liable to error we can't be said to really know anything. And there is no
sense in pursuing truth if there is no possibility of catching it. I shall call
this the postmodern fallacy.
This certainly saves philosophers and others the trouble of learning about
tensors, or anything else complicated. The argument is very popular with
people who like to be thought of as intellectual but don't have much intellect.
Exercise 4.6.11. Explain, as to a philosopher, why the postmodern argument
sucks.
Exercise 4.6.12. Google 'Michelson-Morley Experiment'.
This has been a quick introduction to special relativity. I have essentially
followed the historical development of the ideas, whereas it is more usual to
give you the facts which have become known as the result of experiments
since. Facts there are in plenty and they support completely the invariance
under the lorentz group of everything in the universe. Physicists tend to see
life as a huge collection of facts, mathematicians as a much smaller collection
of ideas. To mathematicians, reality is there to give us interesting things to
think about, and so we rely on physicists and engineers to find out how things
behave so we can make languages which describe them concisely. It has been
a very successful partnership and physics (and more recently engineering) has
forced us to produce some beautiful mathematics, some small amount of
which we have used thus far.
There is still lots left as the next chapter will show.
Chapter 5
DeRham Cohomology:
Counting holes
First some observations on a few cultural issues. There are differences
between mathematicians and physicists which cause problems. I don't want to
overstate these, nor to leave anyone thinking I disapprove of either culture;
my first degree was in Physics and my Ph.D. in Pure Mathematics, and I
find both subjects wonderfully worth studying, but failure to confront issues
tends to make them harder to deal with, not easier. So some thoughts on the
differences may be worth putting up for your consideration. It should also
be noted that, seen from outside, the two cultures are so similar it is hard to
see any difference at all.
5.1 Cultural Anthropology
In the beginning of the twentieth century, the Mato Grosso was the great
unexplored jungle of the Amazon basin in South America. It was full of,
well, jungle, which we now call a rain forest (possibly to avoid giving offence
to jungles), and contained exotic animals and exotic tribes of people with
strange customs, such as shooting curare tipped darts at strangers.
Cultural anthropologists, anxious to study humanity in all its bizarre aspects,
visited these tribes in order to learn their ways; those who avoided the curare
tipped darts were able to return to civilisation and tell it about the customs
and manners of these fascinating people. One of the chief difficulties they
faced was the strange human ability to follow complicated rules without being
able to say what the rules actually are. This is obvious in language: a ten
year old has a good grasp of his native language and can follow incredibly
complex rules of grammar with no apparent difficulty. The conclusion that
French must be an easy language because lots of French children speak it is
not in fact the case. It is rather that they have internalised a huge number
of rules, but they don't know what they are. In order to learn French as an
adult, we tend to want to know the rules. It is no use asking a French child
to tell you. They don't know.
In the same way, it was no use asking a denizen of the Amazon basin what
their basic assumptions about the universe were. They had them, they
followed these assumptions, but they had been brought up in the culture and
couldn't articulate them. Part of the interest in exotic tribes is trying to
work out what those assumptions are, but there is no use asking the exotic
tribesmen. They learnt them at too early an age to realise they were making
them.
I once talked to an Australian engineer on a Japanese train, and (in front of
some of the Japanese staff) he expressed his amusement about the
introduction of flush toilets to Japan many years ago. They needed to have pictures
explaining to Japanese how to sit on the toilet seat, otherwise the more
hygienically conscious Japanese would stand on it and squat. He thought this
was very funny, because he presumably believed that the customary manner
of using a flush toilet is something people are born knowing. He thought
this because he had been potty-trained at an early age and had forgotten
the process. I doubt if his mother had. Modern Japanese toilets are so
complicated, they have to be explained to foreigners, so the Japanese have had
their revenge on ignorant Aussies.
These days the Mato Grosso is in the process of being turned into farms and
housing estates and the exotic tribesmen drive cars and drink Coca-Cola, so
there is not much point in a cultural anthropologist visiting the place. It
is much like back home. By way of compensation, there are other exotic
tribes being created. One of these is the tribe of theoretical physicists.[1] Just
like the Amazonian Indians, they have their special language, their cultural
assumptions about the world. And just like the Amazonian Indians, they
don't actually know what they are.
This is where mathematicians come in. They are also a weird tribe, as you
may have noticed, but being professionally interested in rules they have a
much clearer idea of what those they follow actually are. And when they
study theoretical physics they find it necessary to articulate the assumptions
which the physicists make. It is no use asking the physicists, they do it
by training and reflex and don't even notice they are doing it. Learning
theoretical Physics as an adult is harder than learning French, and asking
French children is no help, as noted above.
[1] One cultural anthropologist actually spent some months with a group of physicists
but his report on their weird ways aroused little interest, perhaps because he couldn't
make as much sense of them as they could of him. This is a true story and not a joke.
Well, maybe it is a joke but it is also true.
So I am going to make some points which theoretical physicists would regard
as too obvious to talk about and don't.
5.2 Solutions
The Maxwell Equations are basically about a set of six functions from $\mathbb{R}^4$ to
$\mathbb{R}$, Ex, Ey, Ez, Bx, By, Bz, which correspond to things that can be measured
using particular instruments. In practice we can only sample these functions
discretely if there is something in nature producing them, or we can more or
less ignore reality and just write down the six functions. They are, in our
notation, the components of a 2-form on $\mathbb{R}^4$ and we take the Lorentz inner
product should it be necessary. It is possible to see this as a map from $\mathbb{R}^4$ to
$\mathbb{R}^6$. Any such map will define a suitable 2-form, and it is not unreasonable to
demand that we restrict ourselves to smooth functions and maybe analytic
functions.
Exercise 5.2.1. Why is it not unreasonable?
Now the Maxwell Equations impose conditions on these six functions. Not
every choice of six functions from $\mathbb{R}^4$ to $\mathbb{R}$ will satisfy them. In fact there must
be some space of solutions. We have from $dF = 0$ four conditions on these
functions and from $\star d \star F = 0$ another four. I am staying with the vacuum
equations for the time being. So we have a total of eight constraints on an
infinite dimensional space of functions, so we get some infinite dimensional
manifold of solutions. This is not much help.
The sad fact is we have only one solution to the vacuum Maxwell Equations,
which we got by guessing that a plane wave in space would do it. If you
write down
\[
E_x(t, x, y, z) = 0, \qquad E_y(t, x, y, z) = 0, \qquad E_z(t, x, y, z) = C \sin(y - t)
\]
for any real number C, then you are describing a (sine) wave with an electric
field which exists only in the z-direction and which travels at unit speed in
the y-direction. If we take the curl of this we get, so Maxwell tells us,
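\[
-\frac{\partial B}{\partial t} = \nabla \times \begin{pmatrix} 0 \\ 0 \\ \sin(y - t) \end{pmatrix} = \begin{pmatrix} \cos(y - t) \\ 0 \\ 0 \end{pmatrix}
\]
(taking C = 1). Integrating this gives
\[
B = \begin{pmatrix} \sin(y - t) \\ 0 \\ 0 \end{pmatrix}
\]
up to terms which do not depend on t. The remaining vacuum equation, $\nabla \times B = \partial E / \partial t$, requires, since
\[
\frac{\partial E}{\partial t} = \begin{pmatrix} 0 \\ 0 \\ -\cos(y - t) \end{pmatrix}
\]
that
\[
\nabla \times \begin{pmatrix} B_x \\ B_y \\ B_z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ -\cos(y - t) \end{pmatrix}
\]
or
\[
\begin{pmatrix} \partial_y B_z - \partial_z B_y \\ \partial_z B_x - \partial_x B_z \\ \partial_x B_y - \partial_y B_x \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ -\cos(y - t) \end{pmatrix}
\]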
This would seem to give rather a lot of possibilities for B other than the
simplest one we have considered.
Exercise 5.2.8. Does it? Find one or prove there aren't any.
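Checking that a candidate pair of fields satisfies the vacuum equations really is simple, and can even be done numerically. The sketch below takes the simplest pair, with C = 1 and $B = (\sin(y - t), 0, 0)$, and estimates every derivative by central differences at a sample point. The step size, the sample point and the function names are illustrative assumptions; a numerical check at a few points is evidence, not proof.

```python
import math

def E(t, x, y, z):
    return (0.0, 0.0, math.sin(y - t))

def B(t, x, y, z):
    return (math.sin(y - t), 0.0, 0.0)

H = 1e-5  # step for central differences

def partial(f, axis, p):
    """Central-difference derivative of f along axis (0=t, 1=x, 2=y, 3=z) at p."""
    lo, hi = list(p), list(p)
    lo[axis] -= H
    hi[axis] += H
    return [(a - b) / (2 * H) for a, b in zip(f(*hi), f(*lo))]

def curl(f, p):
    dx, dy, dz = partial(f, 1, p), partial(f, 2, p), partial(f, 3, p)
    return [dy[2] - dz[1], dz[0] - dx[2], dx[1] - dy[0]]

def div(f, p):
    return partial(f, 1, p)[0] + partial(f, 2, p)[1] + partial(f, 3, p)[2]

def residuals(p):
    """Left-hand sides of the four vacuum equations; each should be close to 0."""
    dE, dB = partial(E, 0, p), partial(B, 0, p)
    faraday = [c + d for c, d in zip(curl(E, p), dB)]   # curl E + dB/dt
    ampere = [c - d for c, d in zip(curl(B, p), dE)]    # curl B - dE/dt
    return div(E, p), div(B, p), faraday, ampere

div_E, div_B, faraday, ampere = residuals((0.3, 1.0, -0.7, 2.0))
```

All eight numbers returned are zero up to the accuracy of the differencing. Replacing `B` by a candidate with extra time-independent terms and rerunning the check is one way to start experimenting with the exercise.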
Note how we got this present family. Basically, we guessed from knowing that
light travels through space as a wave and the velocity is 1 in our units and the
conjecture that light is an electromagnetic thing, that a wave would work,
and wow, it did. Checking to see if a function from $\mathbb{R}^4$ to $\mathbb{R}^6$ will satisfy
Maxwell's Equations is very simple; actually finding one by some process
other than guessing is a different story. And that makes the assumption
that we should look in a space of analytic or at least smooth functions; in
practice we are going to be using the elementary functions because these are
the ones we can easily write down and differentiate. Why should the universe
be so kind as to use the functions we find easy to write down? What if some
important physical process depended on functions we can't write down as a
small sum of elementary functions? What, if anything, could be said about
it?
Exercise 5.2.9. Think about this. Have we just been dead lucky with light?
The question as to whether there are any other solutions to the vacuum
equation outside finite sums of lorentz transforms of the wave solution merits
a little thought.
The physicist will surely observe that there are bound to be solutions to any
nonvacuum problem. Take any conguration of moving charges. Specify
them by elementary functions Ex, Ey, Ez where possible. Then we can
hope to derive, in any coordinate frame in which the data is specied, the
corresponding magnetic elds. This requires merely some dierentiation and
integration, leaving some unknown functions provided by the integration
stage. Now the physicist knows that there is a solution: his reasoning
is that the universe will surely provide one, so it must be there to be found.
Indeed he will believe it is unique up to the transformations of coordinates,
since the universe doesnt toss up between options. This, of course, assumes
that the Maxwell equations are true, which physicists do indeed believe. In
the main.
The question of why physicists feel happy to restrict themselves largely
to the analytic elementary functions, which I invited you to ponder a while
back, and the question of why physicists are so confident about being able to
prove uniqueness and existence of solutions, are explained by two essentially
philosophical positions which go back to Newton.
The first can be summarised by the old adage "If something is ineffable,
there's no point trying to eff it", and perhaps also "If something can't be
detected it isn't there". If a function that was zero around the planet earth
was non-zero somewhere else, first it could not be represented by an analytic
function and second, we would have no way of knowing by local measurements
of any precision that it existed, so there is no point in wasting thinking
time about it; and if a function that can't be written down is essential to
understanding something then we are never going to understand it, so again
forget about the possibility.
The second can be summarised by the principle that if you have a theory
which accounts for the phenomena, commit yourself to it until either someone
comes up with a simpler or more encompassing theory or you run into facts
which are in conflict with it, in which case bend the theory minimally to
accommodate the new facts. The more committed you are to the theory, the
more likely you are to discover such facts. Pondering "what ifs" is a waste of
time.
The belief that the universe does not toss up but is consistent and hence
provides us with unique solutions is again a philosophical position. One can
argue that it is justified in various ways: it can be productive because if we
get lots of solutions we can look for extra conditions to force uniqueness and
usually find them. Recall the exercise you did on the magnetic field for a
wire carrying a current.
Most physicists regard these metaphysical convictions as so obvious that they
never bother to mention them. Much like the French children.
5.3 Infinite Variety
Giving the curl of a field and asking for a solution is, as you will have
discovered, difficult because there are so many solutions. In differential form
notation we have $dX = Y$ where X is a 1-form and Y is a 2-form. Now d is
linear, so it follows that if $d\omega = 0$ then whenever X is a solution, so is
$X + \omega$. And since $d^2 = 0$, if f is any differentiable function whatever,
$X + df$ is a solution.
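In vector calculus language on $\mathbb{R}^3$ this says that adding a gradient
to a vector field does not change its curl. The following is a small numerical
sketch of that statement in Python, under my own assumptions: the particular
field X, the function f, the sample point and the step size H are all made up
for illustration, and derivatives are approximated by central differences.

```python
import math

H = 1e-3  # step for central differences (an arbitrary choice)

def shifted(p, i, delta):
    """Copy of the point p with coordinate i moved by delta."""
    q = list(p)
    q[i] += delta
    return q

def grad(f):
    """Numerical gradient of a scalar function f(p), p = [x, y, z]."""
    return lambda p: [(f(shifted(p, i, H)) - f(shifted(p, i, -H))) / (2 * H)
                      for i in range(3)]

def curl(v):
    """Numerical curl of a vector field v(p) with three components."""
    def d(p, i, j):  # derivative of component j along axis i
        return (v(shifted(p, i, H))[j] - v(shifted(p, i, -H))[j]) / (2 * H)
    return lambda p: [d(p, 1, 2) - d(p, 2, 1),
                      d(p, 2, 0) - d(p, 0, 2),
                      d(p, 0, 1) - d(p, 1, 0)]

def X(p):
    x, y, z = p
    return [-y / 2, x / 2, 0.0]  # a field whose curl is (0, 0, 1)

def f(p):
    x, y, z = p
    return math.sin(x) * y * z ** 2 + math.exp(x * y)

def X_plus_df(p):
    return [a + b for a, b in zip(X(p), grad(f)(p))]

point = [0.4, -0.7, 1.3]
print(curl(X)(point))          # compare with (0, 0, 1)
print(curl(X_plus_df)(point))  # should agree with the line above
```

Swapping in any other smooth f should leave the second line of output
unchanged up to rounding, which is the whole point.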
What does this do to the physicist's conviction that solutions have to be
unique on the physical grounds that the universe does not toss up? There
are two things one might do, and physicists do both of them. One is to
impose extra conditions which force uniqueness. Another is to declare the
difference between different solutions as an artefact of the language and deny
that it is physically significant. In the former case they explain that the
universe has some rather unexpected preferences, often for continuous functions,
and in the second they glory in the freedom that they get to choose arbitrary
functions to suit their convenience, seeing no objection to making different
choices at different times. If a mathematician has noticed that they often
do the first and then unexpectedly do the second, and points out the
inconsistency, they express surprise and a certain contempt that mathematicians
lack the courage to follow them. You will find this attitude in the text book
section on Gauge Freedom. I have not been able to get physicists to agree
that consistency in how they resolve multiple solutions to physical problems
is particularly desirable, although they insist that the universe shows
consistency. This appears to be a religious conviction, possibly derived from
Newton who believed (a) that God had created the universe and (b) that
God was not small-minded enough to be inconsistent or to try to fool us.
Quantum mechanics might have given him spiritual indigestion, as would
some of the practices of his intellectual heirs. But then, Newton considered
himself a philosopher first and a mathematician second, and Physics or
indeed Science hadn't been invented as a separate subject in his time.
Again, physicists remain happily oblivious to the underlying assumptions
in their practice, or the great majority of them do. Extracting them for
inspection is time consuming, but I haven't found a quicker way of making
sense of their work. And it is sufficiently interesting and important work to
justify the effort.
5.4 Gauge Freedom
We have seen that guessing a solution and then verifying it is the quick and
easy way, but it presumes that we are good at guessing, or equivalently that
the solution is simple. It doesn't seem safe to rely on this. So it is reasonable
to introduce other assumptions, some on physical grounds, some in a spirit
of optimism.
We know that $d^2 = 0$ and so when given $dB = 0$ it is tempting to consider
the possibility that the reason dB is equal to zero is that $B = dX$ for some
unknown 1-form X. Such a thing is known in the literature as a vector
potential. We also know that it is far from unique: adding df for any function
(0-form) f will give an equally good X. This is precisely analogous to having
a constant of integration crop up: again we might feel inclined to fix it in
a physical situation by imposing an extra condition, as when we solve an
ODE, or we may feel that it gives us a glorious freedom to choose one that is
convenient, or we may decline to make a choice at all. In the case of vector
potentials, the practice of physicists is to glory in the freedom and call it gauge
freedom. A similar situation exists when we obtain a potential function for
a physical situation, when adding in an arbitrary constant will not change
the force field which is its gradient. Physicists sometimes insist that physical
constraints such as ensuring the potential goes to zero at infinity suffice to
get rid of the ambiguity, but they do not usually feel any such compulsion
in the case of the vector potential. Just what exactly is physically real and
what is an artefact of language is never precisely specified [2]. This allows
physicists to spout manifest drivel. I was once assured that there were $4\pi$
lines of force coming out of a unit charge, and on pointing out that this was
roughly $12\frac{4}{7}$ lines and what did $\frac{4}{7}$th of a line look like?
I was reproved for being too literal. Clearly one wasn't supposed to ask what
things meant, one was simply being instructed in the right things to say,
whether it made sense or was total bullshit. Thus do subcultures maintain a
wall against outsiders: there's a lot of it about.
[2] This creates real problems. One of my lecturers at Imperial College told the class,
rather sadly, that when he was starting on a PhD, he had come up with what he saw as
a very interesting problem. His kindly supervisor had assured him that it wasn't a real
problem, but an artefact of language. Someone else, perhaps with a less assured or less
kindly supervisor had assumed it was real, done the research, and become famous as a
result. I suppose one moral to be derived is that you shouldn't trust your supervisor. The
conclusion I derived was that physicists are not at all clear as to what is real and what
isn't. This surprised me at the time, but I was very young.
Exercise 5.4.1. Listen to some conversation between your friends and
decide how much of what is said is carrying information about the world which
could be translated into a foreign language and remain intelligible, as in "Your
dress is transparent in the sunlight", and how much is comprehensible only
after a large number of extra propositions have also been translated, and possibly
not then, as in "All cultures are equally valid in their own terms".
You will note that the mathematical subculture has a quite different set of
underlying assumptions from those of most of the rest of the human race.
One is that assertions have to make sense and should, if possible, be true
or derivable from other assertions which are either true or clearly stated to
be assumptions. Many students come to university with a quite different
assumption: that what is to be said is anything that has been approved by
authority. Whether it is true, false or totally meaningless is of no importance.
Answering an examination question is done by taking a few half recalled
fragments from lectures and gluing them together with bullshit. No doubt
this works well in the schools, and perhaps in other university departments,
but mathematicians really don't like it. As I am sure you have noticed by
now.
Exercise 5.4.2. What other fundamental but usually unstated assumptions
characterise mathematical culture?
For the time being I shall simply go along with the physicists, but point out
any oddities while doing so.
5.5 Exact and Closed forms
I suggested earlier that given that $dF = 0$ for a 2-form F, we could get this
result if $F = dX$, using the well known result that $d^2 = 0$.
Definition 5.5.1. A form Y which satisfies the condition $dY = 0$ is said to
be closed.
Definition 5.5.2. A k-form Y such that there exists a $(k-1)$-form $\omega$ such
that $Y = d\omega$ is said to be exact.
Then we have the result that:
Proposition 5.5.1. Every exact form is closed.
Proof: $d^2 = 0$.
What about the other way around? Is every closed form exact? The answer
is interesting: it depends completely on topological properties of the space
on which the form is defined. You might think that this is interesting only if
you are a topologist; it has however some important implications for Physics.
The idea is explained clearly enough in Chapter Six of the text book, which
does it first for the case when X is a 1-form on $\mathbb{R}^2$. To say that $dX = 0$ is
to say that the field, corresponding to X when we use the inner product to
change to an equivalent vector field, has zero curl. The question then is, is
it a potential field? Is it the gradient of a scalar field $f : \mathbb{R}^2 \to \mathbb{R}$?
We can try to construct one by the simple process of taking some point
$a \in \mathbb{R}^2$ and declaring $f(a) = 0$. To get, for any other point b, a credible value
of f(b), we take a path from a to b and integrate the vector field along the
path. This tells us the amount of work the vector field does along the path.
We can put a minus sign in if we feel like, but hey, who cares? Now this will
certainly give a value of f(b) but the obvious problem is that if we took a
different path, we might get a different answer. In general, we would. Your
second year exercises in the course of doing Stokes' Theorem should have
convinced you of this.
If however the curl of the field is zero, then the integral around any closed loop
is zero. This follows from Stokes' Theorem in the plane, otherwise known
as Green's Theorem, immediately. And this means that the value of f(b)
cannot depend on the path, because two paths between fixed points when
joined together give a closed loop. Hence the value of f(b) does not depend
on the path, and so we can take this as a sensible value for f(b) because it
depends only on the vector field and the point a.
Exercise 5.5.1. Show it depends on the point a only up to an additive
constant: in other words if I choose a and you choose $a'$, your function $f'$
and my function f will differ by a constant.
Exercise 5.5.2. Translate this into the language of 1-forms, 2-forms and
0-forms on $\mathbb{R}^2$.
This seems to give us the following:
Proposition 5.5.2. Every closed 1-form on $\mathbb{R}^2$ is exact.
Proof: Just construct the 0-form as indicated. Then it is trivial to verify
that d of the 0-form is the given 1-form.
This seems perfectly reasonable and hasn't seemed to involve us in any
topology, so I shall now give what looks like a counterexample to the last
proposition:
Proposition 5.5.3. The 1-form
\[ X = \frac{y}{x^2 + y^2}\, dx - \frac{x}{x^2 + y^2}\, dy \]
is closed but not exact.
Proof:
First the closed part.
\begin{align*}
dX &= \frac{\partial}{\partial y}\left( \frac{y}{x^2 + y^2} \right) dy \wedge dx - \frac{\partial}{\partial x}\left( \frac{x}{x^2 + y^2} \right) dx \wedge dy \\
   &= \frac{2(x^2 + y^2) - 2(x^2 + y^2)}{(x^2 + y^2)^2}\, dy \wedge dx \\
   &= 0
\end{align*}
Now suppose $X = df$ for some function (0-form) f. Then it would follow that
the integral of X around the unit circle is zero, since starting at $a = (1, 0)^T$
and proceeding in the positive direction would give us $f(a) - f(a) = 0$ for
the integral, by definition of the construction of f. But a glance at the vector
field shows this is wrong. We have unit length vectors against us each step of
the way, so the integral is $-2\pi$. Check it by doing the algebra if the geometric
argument fails to carry conviction.
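If the algebra also fails to carry conviction, the integral can be estimated
numerically. The sketch below is in Python and rests on my own choices: the
form is taken to be $X = (y\,dx - x\,dy)/(x^2 + y^2)$, the circle is
approximated by a polygon with a made-up number of steps, and each edge uses a
midpoint rule. It also integrates round a circle that does not enclose the
origin, for comparison.

```python
import math

def X(x, y):
    """Coefficients (P, Q) of X = P dx + Q dy from Proposition 5.5.3."""
    r2 = x * x + y * y
    return y / r2, -x / r2

def loop_integral(form, centre, radius, steps=20000):
    """Integrate P dx + Q dy anticlockwise round a circle (midpoint rule)."""
    cx, cy = centre
    total = 0.0
    for k in range(steps):
        t0 = 2 * math.pi * k / steps
        t1 = 2 * math.pi * (k + 1) / steps
        x0, y0 = cx + radius * math.cos(t0), cy + radius * math.sin(t0)
        x1, y1 = cx + radius * math.cos(t1), cy + radius * math.sin(t1)
        P, Q = form((x0 + x1) / 2, (y0 + y1) / 2)
        total += P * (x1 - x0) + Q * (y1 - y0)
    return total

around_origin = loop_integral(X, (0.0, 0.0), 1.0)
away_from_origin = loop_integral(X, (3.0, 0.0), 1.0)
print(around_origin, -2 * math.pi)  # these two should be close
print(away_from_origin)             # this should be close to zero
```

The contrast between the two loops is worth a moment's thought before reading
on.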
So there ain't no such f, and X is not exact.
And something has gone horribly wrong.
Exercise 5.5.3. Can you see what? Stop now and try to work out why this
result is not, as at first appears, in conflict with the penultimate proposition
that said that every closed 1-form on $\mathbb{R}^2$ is exact. Warning: I am about to
give the game away on the next page, so stop now and work it out.
The answer is of course obvious once you have seen it. The 1-form
\[ X = \frac{y}{x^2 + y^2}\, dx - \frac{x}{x^2 + y^2}\, dy \]
is not defined on $\mathbb{R}^2$. It is defined and smooth on $\mathbb{R}^2 \setminus 0$. This is $\mathbb{R}^2$ with
a hole in it. The hole completely destroys the argument, because Green's
Theorem, Stokes' in the plane, doesn't work if there is a hole in the region.
The boundary of a disc with a hole in it consists of both the bounding circle
and the point at the hole. Ignoring missing points screws up everything.
You should be warned that evil people, I suspect physicists, have the bad
habit of writing the negative of this form as $d\theta$. You can see why they do it,
but you have to deplore their moral and mathematical muddle.
Exercise 5.5.4. Why do they do it? You might like to consider the function
which takes a point in the plane, writes it in polar coordinates and sends it
to $\theta$. What happens if you take the exterior derivative of this 0-form?
The removal of a point of $\mathbb{R}^2$ makes a mess of the result that all closed forms
are exact. The argument works however for subsets of $\mathbb{R}^2$ which don't have
any holes in. One hole is enough to bugger things up.
Exercise 5.5.5. Show this.
Exercise 5.5.6. What about the corresponding case of closed 1-forms on
$\mathbb{R}^3 \setminus 0$? Are they always exact? After all, if we have a loop in $\mathbb{R}^3 \setminus 0$ we
need to find a surface with the loop as boundary which does not contain 0,
in order to use the classical Stokes' Theorem. This will allow the argument
to go through even in $\mathbb{R}^3 \setminus 0$. And such surfaces are always there, we have
lots of extra room and can deform smoothly any bad surface that contains the
origin until it doesn't.
It might occur to you to wonder if it goes on in the same way: does closed
imply exact on $\mathbb{R}^n$ in general? Investigating in the simplest case, $\mathbb{R}^2$, we
know that the only 3-form is zero so every 2-form on $\mathbb{R}^2$ is closed. This
would suggest that if it is true, every smooth function on $\mathbb{R}^2$ has a smooth
vector field of which it is the divergence.
Exercise 5.5.7. Is this indeed the case? If so prove it, if not give a
counterexample.
Exercise 5.5.8. Show that every 3-form on $\mathbb{R}^3$ is the exterior derivative of
a 2-form.
Exercise 5.5.9. What about closed 2-forms on $\mathbb{R}^3$? The required theorem
we would need is obviously more complicated since we have to construct a
1-form not just a function. Do it for the 2-form $dx \wedge dy + dx \wedge dz + dy \wedge dz$.
The last exercise will show there is a certain amount of slack and that we
can make some choices. It would be nice however to have a more systematic
approach.
To do this, let's look at 1-forms on $\mathbb{R}^2$ which are exact and see if we can
be systematic about getting the potential function f. Suppose we have a
1-form
\[ \omega = P\,dx + Q\,dy \]
We can take the origin as a starting point and look to see what we get if we
integrate along a path from 0 to the point $\mathbf{x}$. Rather than talk about any
old path, let's do it with a straight line. Then the line is the set of points $t\mathbf{x}$
for $t \in [0, 1]$ and we get that the path integral of $\omega$ along this path is
\[ \int_0^1 \left( P \frac{dx}{dt} + Q \frac{dy}{dt} \right) dt \quad\text{where}\quad \mathbf{x} = \begin{pmatrix} x \\ y \end{pmatrix} \]
or
\[ \int_0^1 \bigl( P(tx, ty)\, x + Q(tx, ty)\, y \bigr)\, dt \]
Exercise 5.5.10. Evaluate this for $\mathbf{x} = [1, 2]^T$ and the 1-form $x\,dx + y\,dy$.
Exercise 5.5.11. Evaluate this for $\mathbf{x} = [1, 2]^T$ and the 1-form $-y\,dx + x\,dy$.
Note that this is not closed.
The result, for any point $\mathbf{x}$, is a number which we can call $f(\mathbf{x})$. I shall call
it $I(\omega)(\mathbf{x})$ and use $I(\omega)$ instead of f. The reason is that I goes in the
opposite direction to the exterior derivative so I (for exterior Integral?) seems
a reasonable symbol to use.
So we have an operator I from 1-forms to 0-forms which makes sense on $\mathbb{R}^n$
and always gives an answer whether the 1-form is closed or not. And we
observe that when $\omega$ is closed, then $dI(\omega) = \omega$ so $\omega$ must be exact.
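The operator I for 1-forms on $\mathbb{R}^2$ is easy to try out on a machine.
What follows is a rough sketch in Python, assuming the straight-line formula
above; the Simpson rule, the finite-difference derivative and the two example
forms are my own choices and are not taken from the text book. One example is
closed and one is not, so that the claim about $dI(\omega)$ can be compared in
both cases.

```python
def simpson(g, n=200):
    """Composite Simpson rule for the integral of g over [0, 1] (n even)."""
    h = 1.0 / n
    total = g(0.0) + g(1.0)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3

def I(P, Q):
    """Return the 0-form I(omega) for omega = P dx + Q dy, integrating
    along the straight line t -> (t x, t y) from the origin."""
    def f(x, y):
        return simpson(lambda t: P(t * x, t * y) * x + Q(t * x, t * y) * y)
    return f

def d(f, x, y, h=1e-5):
    """Numerical exterior derivative of a 0-form: coefficients of dx, dy."""
    return ((f(x + h, y) - f(x - h, y)) / (2 * h),
            (f(x, y + h) - f(x, y - h)) / (2 * h))

# A closed example: omega = 2xy dx + x^2 dy, which is d(x^2 y).
closed = I(lambda x, y: 2 * x * y, lambda x, y: x * x)
# A non-closed example: omega = y dx.
not_closed = I(lambda x, y: y, lambda x, y: 0.0)

print(closed(3.0, -1.0))         # compare with x^2 y at (3, -1)
print(d(closed, 3.0, -1.0))      # compare with (2xy, x^2) at (3, -1)
print(d(not_closed, 3.0, -1.0))  # compare with (y, 0) at (3, -1)
```

For the closed form the derivative of $I(\omega)$ reproduces the coefficients
of $\omega$; for the other it does not, although I still returns an answer.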
Can we get from 2-forms to 1-forms by a similar process? We investigate
the simplest case of $\mathbb{R}^2$ and a nice simple 2-form. Let us start by taking
the constant 2-form $2\,dx \wedge dy$. We want to do some integrating to obtain a
suitable 1-form $I(2\,dx \wedge dy) = P\,dx + Q\,dy$. Since all 2-forms on $\mathbb{R}^2$ are closed
we would rather like to have $d(P\,dx + Q\,dy) = 2\,dx \wedge dy$.
There is a fair bit of slack here. We would have
\[ \partial_x Q - \partial_y P = 2 \]
and we would need to make up our minds about how to split up the 2 between
the two contributions. Let's make them equal. Then we would have
\[ \partial_x Q = 1; \quad \partial_y P = -1 \]
We can integrate both these equations to get
\[ Q(x, y) = x; \quad P(x, y) = -y \]
or
\[ I(2\,dx \wedge dy) = -y\,dx + x\,dy \]
and checking confirms that this works: $d(-y\,dx + x\,dy) = 2\,dx \wedge dy$.
Had we chosen some other way to split the number 2 up between the two
terms, we would have got another equally good 1-form: there is no shortage
of them.
Exercise 5.5.12. Try it. Make one term zero. Or 1. Now look at the
various 1-forms which have constant exterior derivative $2\,dx \wedge dy$. What can
you say about their difference?
Now we try to make the process look more like an operator I taking 2-forms
to 1-forms. First we split the elements up in equal amounts to be definite.
Then we integrate along a path as for the case of 1-forms. I write
\[ I(2\,dx \wedge dy) = \left( \int_0^1 t\, 2x\, dt \right) dy - \left( \int_0^1 t\, 2y\, dt \right) dx \tag{5.5.1} \]
The term t is in there to make sure we divide by 2, which we can regard as
sharing the contributions out equally.
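Equation 5.5.1 can be checked mechanically for a constant 2-form. The sketch
below is in Python and is only an illustration under my own assumptions: the
constant c stands in for the 2 in the text, the integrals are done by a
Simpson rule, and the exterior derivative of the resulting 1-form is estimated
by central differences. The function names are made up.

```python
def simpson(g, n=200):
    """Composite Simpson rule for the integral of g over [0, 1] (n even)."""
    h = 1.0 / n
    total = g(0.0) + g(1.0)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3

def I_constant_2form(c):
    """Equation 5.5.1 with the 2 replaced by a constant c: returns the
    coefficients (P, Q) of the 1-form I(c dx^dy) = P dx + Q dy."""
    def P(x, y):
        return -simpson(lambda t: t * c * y)
    def Q(x, y):
        return simpson(lambda t: t * c * x)
    return P, Q

def d_1form(P, Q, x, y, h=1e-4):
    """Coefficient of dx^dy in d(P dx + Q dy), that is dQ/dx - dP/dy."""
    dQdx = (Q(x + h, y) - Q(x - h, y)) / (2 * h)
    dPdy = (P(x, y + h) - P(x, y - h)) / (2 * h)
    return dQdx - dPdy

P, Q = I_constant_2form(2.0)
print(P(0.5, 1.5), Q(0.5, 1.5))  # compare with (-y, x) at (0.5, 1.5)
print(d_1form(P, Q, 0.5, 1.5))   # compare with the constant 2
```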
Exercise 5.5.13. Suppose we do the same with some more complicated
2-form which is not constant, such as $\omega = (x^2 + y^2)\,dx \wedge dy$. Can you see how to
fix things up to obtain the 1-form $I(\omega)$ by modifying equation 5.5.1 appropriately?
Exercise 5.5.14. Can you make it work for 2-forms on $\mathbb{R}^3$? Try it on closed
2-forms first. Then try it on a 2-form $\omega$ that is not closed, and also try to
make it work for the 3-form $d\omega$. Notice anything?
If you have been good and virtuous and done the sequence of exercises above
you will be prepared to believe that we can construct for any k-form $\omega$ on
$\mathbb{R}^n$, $k > 0$, a $(k-1)$-form $I\omega$, also on $\mathbb{R}^n$, given by:
\[ I\omega(x) = \sum_{i_1 < \cdots < i_k} \sum_{\alpha = 1}^{k} (-1)^{\alpha - 1} \left( \int_0^1 t^{k-1}\, \omega_{i_1 < \cdots < i_k}(tx)\, dt \right) x^{i_\alpha}\, dx^{i_1} \wedge \cdots \wedge \widehat{dx^{i_\alpha}} \wedge \cdots \wedge dx^{i_k} \tag{5.5.2} \]
where the $\widehat{dx^{i_\alpha}}$
x u
y v
2x 2y 0 0
0 0 2u 2v
u v x y
1 0
0 1
1 0
0 1
\[ \rho_n(1, \theta) : (r, \phi) \mapsto (r, \phi + n\theta), \quad n \in \mathbb{Z} \]
Exercise 6.3.24. Show this carefully.
Exercise 6.3.25. Show that tensor multiplication in $\mathbb{R}$ is just multiplication,
likewise in $\mathbb{C}$, and hence that the tensor product of the above irreducible
representations just makes
\[ \rho_n \otimes \rho_m = \rho_{n+m} \]
6.3.5 Representations of SU(2, C)
The text book indicates, without any very compelling arguments, that the
irreducible representations of U(1, C) have important physical significance.
Since the definition of U(1, C) means that it has to preserve lengths, it must
be the subset of $\mathbb{C}$ which contains only complex numbers of modulus 1, that
is, it is the unit circle. And a very fine group it is too, being isomorphic as
a Lie group to SO(2, R).
Exercise 6.3.26. Prove the last claim.
The representations of U(1, C) being so simple, it is natural to investigate
the representations of U(2, C) and SU(2, C). I shall refer to these as U(2)
and SU(2) from now on since the C may reasonably be taken for granted.
Again we need look only at the irreducible representations and again we are
motivated by the hope of some important physical applications of these ideas.
The first observation worth noting is that U(2) and SU(2) are not abelian
groups, so we expect complications.
First it is essential to get some sort of feeling for the groups. SU(2) is the
subgroup of U(2) having determinant one, and U(2) will consist of the $2 \times 2$
matrices with complex entries which preserve the complex inner product on
$\mathbb{C}^2$, that is the rule
\[ \left\langle \begin{pmatrix} a \\ b \end{pmatrix}, \begin{pmatrix} u \\ v \end{pmatrix} \right\rangle = a\bar{u} + b\bar{v} \]
where $\bar{v}$ is the complex conjugate of v. The maps will have to take the
standard basis for $\mathbb{C}^2$ to vectors which have length 1 and which are orthogonal
with respect to the complex inner product, and so the columns of the matrices
representing these linear maps must also be orthogonal and have length 1,
which implies that the inverse of such a matrix is its conjugate transpose.
We use A
.
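We note that
\[ \begin{pmatrix} e^{i\theta} & 0 \\ 0 & e^{i\theta} \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \]
are in U(2) for any $\theta$, since real orthogonal matrices are necessarily unitary. Also the
product of two matrices which have inverses equal to their conjugate
transpose has its inverse equal to its conjugate transpose.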
Exercise 6.3.27. Prove this.
This would lead one to conjecture that the manifold U(2) has (real) dimension
at least three. That it is a (real) manifold follows from the usual arguments
involving the Implicit Function theorem. Note that it makes sense to have
complex manifolds with smooth maps between charts in $\mathbb{C}^n$, but we shall not
be dealing with such things.
Exercise 6.3.28. Find the dimension of U(n) from the Implicit Function
theorem. Show that an element of U(n) must have determinant a complex
number of modulus 1, and hence deduce the dimension of SU(n). (The answer
to the last part is $n^2 - 1$; make sure you get it right!)
An insight into the geometry of SU(2) is obtained from the Pauli matrices.
Recall that a matrix is hermitean if it is equal to its conjugate transpose.
The Pauli matrices are
\[ \sigma_0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \]
It is easy to see that these are linearly independent over $\mathbb{C}$ and hence form
a basis for the four (complex) dimensional space of all $2 \times 2$ complex
matrices (of which GL(2, C) is the open subset of invertible ones). If we take only
real coefficients then we get the hermitean $2 \times 2$ matrices.
Exercise 6.3.29. Confirm this claim. Confirm that all hermitean matrices
are obtained in this way.
You will observe that the Pauli matrices are certainly hermitean themselves
but are also unitary.
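Multiply each of $\sigma_j$ for $j \in [1 : 3]$ by $-i$ and call these, following Baez and
Muniain, I, J, K to get:
\[ \sigma_0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad I = \begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix}, \quad J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad K = \begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix} \]
Note that
1. These matrices also span the space of all $2 \times 2$ complex matrices with complex coefficients
2. Each has determinant one
3. Each is unitary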
Now it is easy to verify that taking all possible real linear combinations of
these matrices gives us a representation of the Quaternions, $\mathbb{H}$.
Exercise 6.3.30. Do it.
It is also easy to verify that whenever $a^2 + b^2 + c^2 + d^2 = 1$, for reals a, b, c, d,
\[ a\sigma_0 + bI + cJ + dK \]
is unitary,
Exercise 6.3.31. Do it.
has determinant one,
Exercise 6.3.32. Do it.
and only slightly harder to confirm that every unitary $2 \times 2$ matrix with
determinant one is of this form.
Exercise 6.3.33. Do it.
This has shown that SU(2) is the three sphere $S^3$ equipped with a
multiplication which does not commute.
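None of this replaces doing the exercises, but a machine check is a useful
guard against sign errors. The following Python sketch assumes the convention
$I = -i\sigma_1$, $J = -i\sigma_2$, $K = -i\sigma_3$ (the one used by Baez and
Muniain) and represents $2 \times 2$ matrices as nested lists; the helper
names and the sample unit quaternion are my own and purely illustrative.

```python
import math

def mul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def add(*matrices):
    return [[sum(M[i][j] for M in matrices) for j in range(2)]
            for i in range(2)]

def dagger(A):
    """Conjugate transpose."""
    return [[complex(A[j][i]).conjugate() for j in range(2)]
            for i in range(2)]

def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol
               for i in range(2) for j in range(2))

s0 = [[1, 0], [0, 1]]
s1 = [[0, 1], [1, 0]]
s2 = [[0, -1j], [1j, 0]]
s3 = [[1, 0], [0, -1]]
I, J, K = (scale(-1j, s) for s in (s1, s2, s3))
minus_s0 = scale(-1, s0)

# A unit quaternion: a^2 + b^2 + c^2 + d^2 = 1 (values chosen arbitrarily).
a, b, c = 0.5, -0.5, 0.1
d = math.sqrt(1 - a * a - b * b - c * c)
q = add(scale(a, s0), scale(b, I), scale(c, J), scale(d, K))

print(close(mul(I, I), minus_s0), close(mul(I, J), K))         # quaternion rules
print(close(mul(dagger(q), q), s0), abs(det(q) - 1) < 1e-12)  # q lies in SU(2)
print(close(mul(I, J), mul(J, I)))                             # commutes or not?
```

With the opposite sign convention (multiplying by $+i$) the squares still
come out as minus the identity, but the product IJ changes sign, so the check
on IJ = K is the one that distinguishes the two.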
Remark 6.3.7. There is quite a lot of useful structure lying about here which
has been used by engineers and physicists for many a long year.
Mathematicians tend to see themselves as discovering structure and pointing it out to
physicists and engineers who eventually come to find it useful in talking about
something in reality, and then imagine that they discovered the structure
experimentally. Physicists and engineers have a different story.
6.3.6 Representations of SU(2)
This is reasonably well described in the text book: the representations are
over vector spaces of homogeneous polynomials. The zero degree polynomials
are simply the complex numbers; the space $H_j$, for j half an integer, is the
space of homogeneous polynomials of degree 2j in two variables. Thus we have
for $j = 0$ the constant functions from $\mathbb{C}^2$ to $\mathbb{C}$ and for $j = 1/2$ the functions
\[ f_{a,b} : \mathbb{C}^2 \to \mathbb{C}, \quad \begin{pmatrix} x \\ y \end{pmatrix} \mapsto ax + by \]
for $a, b, x, y \in \mathbb{C}$. Then $H_j$ is a vector space over $\mathbb{C}$ of (complex) dimension
$2j + 1$. $U_j : SU(2) \to \mathrm{Aut}(H_j)$ is the representation which takes any
$g \in SU(2)$ to the automorphism carrying the polynomial p to the polynomial q
defined by
\[ q\begin{pmatrix} x \\ y \end{pmatrix} = p\left( g^{-1} \begin{pmatrix} x \\ y \end{pmatrix} \right) \]
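\[ \begin{pmatrix} \cos t & -\sin t & 0 \\ \sin t & \cos t & 0 \\ 0 & 0 & 1 \end{pmatrix} \qquad \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \]
and $J_x$, $J_y$ can be obtained in the same way.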
Exercise 6.4.1. Do it.
These three matrices are linearly independent and span the algebra so(3).
The multiplication is the Lie Bracket in this case,
\[ [X, Y] = XY - YX \]
Exercise 6.4.2. Verify that the Lie bracket is in the vector space so(3) when
X and Y are.
We can recover the original matrix functions by exponentiation:
Exercise 6.4.3. Show that $\exp(tJ_z)$ is what it ought to be.
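A numerical sanity check is no substitute for the exercise, but it does catch
sign slips. The Python sketch below assumes the usual sign conventions for the
three generators (so that $J_z$ is the derivative at $t = 0$ of the rotation
about the z-axis); the explicit matrices for $J_x$ and $J_y$, the truncated
power series for the exponential and the value of t are my own choices for
illustration.

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(X, Y):
    """The Lie bracket XY - YX of two 3x3 matrices."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[XY[i][j] - YX[i][j] for j in range(3)] for i in range(3)]

def expm(A, terms=30):
    """Truncated power series for exp(A): the sum of A^m / m!."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    power = [row[:] for row in result]
    for m in range(1, terms):
        # After this line, power holds A^m / m!.
        power = [[x / m for x in row] for row in matmul(power, A)]
        for i in range(n):
            for j in range(n):
                result[i][j] += power[i][j]
    return result

# Assumed sign conventions for the generators of so(3).
Jx = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
Jy = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
Jz = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

t = 0.7
rotation = [[math.cos(t), -math.sin(t), 0.0],
            [math.sin(t), math.cos(t), 0.0],
            [0.0, 0.0, 1.0]]
exp_tJz = expm([[t * x for x in row] for row in Jz])

print(bracket(Jx, Jy) == Jz)  # the bracket stays inside so(3)
print(max(abs(exp_tJz[i][j] - rotation[i][j])
          for i in range(3) for j in range(3)))  # should be tiny
```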
Exercise 6.4.4. Show that the Lie algebra of SO(2) is just R.
Exercise 6.4.5. Do exercises 33 to 54 in the text book.
Remark 6.4.3. Lie algebras are, as the book tells us, nicer in many ways to
work with than Lie groups because they are vector spaces. They give a lot of
information about the groups and their representations.
Chapter 7
Fibre Bundles
7.1 Introduction
A standard source on Fibre Bundles is Dale Husemoller's Fibre Bundles.
There are probably more modern books, and there are certainly better
written books, but I own a copy so will stick to following it. I shall do very little
on this subject (there is quite a lot to be done) because I want to focus on
differential geometry, the subject of these notes, but there are close
connections, as is shown by the physics. Anyway, to get very far in Fibre Bundles
you would need more homotopy theory than you have. So this will be a short
chapter.
First some examples:
1. The product $S^1 \times \mathbb{R}$ with projection to $S^1$ has base space $S^1$, fibre
(space) $\mathbb{R}$ and total space $S^1 \times \mathbb{R}$. It is easy to see why we call the fibre
a fibre (it is long and thin) and the fibres are glued together by the
topology of the base space.
2. The Möbius bundle which has the same base space and fibre as the last
example, but has a twist in it so as to make a Möbius strip (without
a boundary). Again there is a projection from the total space to the
base space, and the inverse image of any point is a copy of $\mathbb{R}$.
3. Any product of two spaces. For example a 2-torus has base space $S^1$
and also fibre $S^1$.
4. Any tangent bundle. This attaches to every point of a smooth manifold
a vector space, the tangent space at the point, and the resulting object
is a vector bundle, which is defined as a fibre bundle which has a vector
space for the fibre, an important subclass of fibre bundles.
5. Tensor bundles. Again, these are all vector bundles.
6. SO(n,R), $n \geq 2$, is a fibre bundle with base space $S^{n-1}$. The map takes
an element of SO(n,R) and sends it to wherever the north pole of the
sphere $S^{n-1}$ is taken by applying the element to the sphere. The inverse
image of this point in SO(n,R) is a subset which is an embedded copy
of SO(n-1,R), the fibre. When n = 2, SO(n-1,R) is a single point,
the identity map from $\mathbb{R}$ to itself, so SO(2,R) is just a copy of $S^1$
topologically.
7. The sphere $S^n$ is a fibre bundle over $\mathbb{R}P^n$ which sends antipodal points
to the same point and hence has fibre $\mathbb{Z}_2$. It might be better to describe
the fibre as $S^0$, the pair $\pm 1$ under multiplication, or O(1,R). More
interesting bundles can be obtained by replacing $\mathbb{R}$ with $\mathbb{C}$.
8. Take the sphere $S^2$ and at each point take the space of ordered pairs of
orthonormal tangent vectors. This gives an orthogonal 2-frame bundle
over $S^2$. In general, if M is a smooth manifold, for k an integer less than
or equal to the dimension of the manifold, take the space of (ordered)
k orthonormal tangent vectors at each point. An orthogonal 1-frame
bundle on $S^2$ would consist of attaching a unit circle to each point
of the space, the circle being in the tangent space at the point. An
orthogonal 2-frame bundle on $S^2$ would attach 2 circles at each point
(Explain why). Clearly this supposes a Riemannian Inner Product.
More generally, it makes sense to attach at each point of a smooth
n-manifold, an ordered set of k linearly independent vectors of the
tangent space at that point, for $k \leq n$. These bundles are called frame
bundles. A section of the two-frame bundle on $S^2$ would give a rather
special pair of vector fields being everywhere linearly independent, and
we know there is not even one such vector field. So it is not at all
obvious whether a given manifold admits a field of k-frames in general.
Note that there is a rather natural group action on frame bundles,
O(n,R) on the orthogonal frame bundles, and GL(n,R) on the bundles
where we do not suppose a Riemannian structure. The group acts on
the total space but sends fibres to fibres by what is a multiplication
of the (Lie) group and hence a diffeomorphism. Bundles with a group
action of this sort are called principal bundles. I shall elaborate on
these later.
Exercise 7.1.1. Show that attaching an ordered set of n orthonormal vectors
to each point of a space is equivalent to attaching an element of the orthogonal
group, and that the n-frame bundle effectively attaches GL(n,R) to each point
of the n-manifold, this being the fibre. Thus a useful way of thinking of a
bundle with base space a manifold is to regard the manifold as having a copy
of the fibre attached at each point of the manifold.
Exercise 7.1.2. Show that the 2-torus admits a field of 2-frames. Does $S^3$?
The above should convince you that (a) some spaces have a structure which
makes them something like a generalised cartesian product and (b) it is worth
knowing more about them because there are interesting examples.
Definition 7.1.1. A fibre bundle is a triple of spaces, E, B, F, and a map
$\pi : E \to B$ such that for every $b \in B$, $\pi^{-1}(b)$ is homeomorphic to F.
Definition 7.1.2. A fibre bundle is locally trivial iff there is a cover of B by
open sets $U_j$ and for each of them $\pi^{-1}(U_j)$ is homeomorphic to $U_j \times F$.
Remark 7.1.1. All our fibre bundles will be locally trivial.
Exercise 7.1.3. Give an example of a fibre bundle which is not locally
trivial.
Definition 7.1.3. A bundle map between fibre bundles $(E, B, F, \pi)$ and
$(E', B', F', \pi')$ is a pair of maps $f_B : B \to B'$ and $f_E : E \to E'$ such that
$\pi' \circ f_E = f_B \circ \pi$.
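Remark 7.1.2. It follows that fibres wind up inside fibres under $f_E$. It is
helpful to draw a picture of a square:
\[ \begin{array}{ccc} E & \xrightarrow{\ f_E\ } & E' \\ \downarrow{\scriptstyle \pi} & & \downarrow{\scriptstyle \pi'} \\ B & \xrightarrow{\ f_B\ } & B' \end{array} \]
We say that the square commutes with the condition $\pi' \circ f_E = f_B \circ \pi$. If
$\pi$ is onto then $f_E$ determines $f_B$. From the definition, it has to be.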
Exercise 7.1.4. Define the terms product of fibre bundles, subbundle,
quotient bundle. Give examples of each.
Exercise 7.1.5. Find out what a fibre product is and give an example.
7.2 Principal Bundles
In many of the above examples, the fibre had some extra structure besides
being a topological space: often it was a vector space, giving a vector bundle,
and sometimes it was a group. A group acts on itself by multiplication, and
so we can more generally consider the case when the fibre has a group action
on it. We care most about the case where the group action on the fibre is that
of a Lie group, and the action is regular, which means it is both transitive,
that is for any two points of the space there is a group element which acts to
take one to the other, and also free, that is only the identity leaves any point
fixed. This is equivalent to saying that for any two $x, y$ in the fibre $F$ there
exists precisely one $g$ in $G$ such that $g \cdot x = y$. In this case, $F$ is known as
a principal homogeneous space for $G$ or as a $G$-torsor. This definition holds
whether $F$ is actually the fibre of a bundle or not.
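The "precisely one $g$" form of the definition is easy to test when everything is finite. The following is a minimal sketch under that assumption: finite cyclic groups stand in for the Lie group, and the function `is_regular` and the three example actions are hypothetical illustrations rather than anything from the text.

```python
def is_regular(group, space, act):
    """True if for every x, y in space exactly one g in group has act(g, x) == y."""
    return all(
        sum(1 for g in group if act(g, x) == y) == 1
        for x in space
        for y in space
    )


def add_mod_6(g, x):
    """Z_6 (or a subgroup of it) acting on Z_6 by addition."""
    return (g + x) % 6


def add_mod_3(g, x):
    """Z_6 acting on a 3-point set by addition mod 3."""
    return (g + x) % 3


Z6 = list(range(6))

# A group acting on itself by multiplication (here, addition) is a torsor.
on_itself = is_regular(Z6, Z6, add_mod_6)

# Transitive but not free: g = 3 fixes every point of the 3-point set.
not_free = is_regular(Z6, [0, 1, 2], add_mod_3)

# Free but not transitive: the subgroup {0, 3} cannot move 0 to 1.
not_transitive = is_regular([0, 3], Z6, add_mod_6)
```

The two failing cases show why both halves of the definition are needed: dropping freeness gives too many group elements per pair of points, dropping transitivity gives too few.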
Exercise 7.2.1. Show that the action of $S^1$ on itself (regarded as $U(1,\mathbb{C})$,
i.e. the set of complex numbers of modulus 1 with the usual complex
multiplication) makes it an $S^1$-torsor. Is there a regular action of $S^1$ on $T^2$?
Is there a regular action of $\mathbb{R}$ on $T^2$? Take the quotient space $I/\partial I$ which
joins the ends of the unit interval together. This is homeomorphic to $S^1$ but
lacks the group structure and the smoothness structure. Show that it can be
given the structure of a smooth manifold via any homeomorphism with $S^1$
and also that it is an $S^1$-torsor. Is any Lie group $G$ a $G$-torsor? Are there
any $G$-torsors that are not homeomorphic to $G$?
Definition 7.2.1. A fibre bundle where each fibre is a $G$-torsor (for the same
$G$) is called a principal bundle.
Exercise 7.2.2. Show that the n-frame bundle for any smooth manifold
(usually written $F(M)$) is a principal bundle.
Exercise 7.2.3. By taking the Möbius strip with fibre a closed interval and
gluing the ends of each fibre, show that the resulting space is a principal
bundle and work out what the space is.
Remark 7.2.1. This has a lot to do with gauge theory.
Exercise 7.2.4. Do some googling to understand the last remark.
Remark 7.2.2. The condition that the fibre be a $G$-torsor means that we
can use group actions to say something about the bundle structure. We
have, in effect, a sort of Construction Kit for the bundle which tells us how
to put it together: the group elements can be used to specify how to glue
local trivialisations together.
Figure 7.2.1: A locally trivial cover of $S^1$.
In the simplest case, take a bundle over $S^1$ with fibre the interval $\mathbb{R}$ and
the action of $O(1,\mathbb{R})$ on it. We might stipulate that the action be always
the identity, so if we have a pair of trivialisations of the bundle, on the
intersection the relation between the fibres is that they are the same way
up. This inevitably forces the bundle to be trivial, $S^1 \times \mathbb{R}$. Or we might
insist that the group action be $-1$ on one intersection and $+1$ on the other,
when we would get the Möbius strip. Since we would like to be consistent
on intersections, it is reasonable to want the intersection to be connected, so
for $S^1$ we shall do it with three open sets which cover $S^1$.
Exercise 7.2.5. Is the fibre a $G$-torsor for the orthogonal group?
Remark 7.2.3. In the above case, if we impose the condition that the group
action has to be constant on the intersection of the trivialising cover of the
base space, then we need at least three such open sets in the cover. Labelling
them $\alpha$, $\beta$ and $\gamma$, we can characterise each intersection by specifying an
ordered pair; see the diagram, figure 7.2.1. $\alpha$ is the red open set, $\beta$ the blue and
$\gamma$ the green. Then the intersection $\alpha\beta$ is the region between the black bars
at the top right. If I now assign the element $1 \in O(1,\mathbb{R})$ to $\alpha\beta$, the element
$1$ to $\beta\gamma$ at the left and the element $-1$ to $\gamma\alpha$, then you can read this as an
instruction to start with three strips, $\alpha \times \mathbb{R}$, $\beta \times \mathbb{R}$ and $\gamma \times \mathbb{R}$, and glue the
first two strips together keeping both orientations of $\mathbb{R}$ to have the positive
numbers pointing up; the strip $\beta \times \mathbb{R}$ is glued to $\gamma \times \mathbb{R}$ also with the real line
having the same orientation, but $\gamma \times \mathbb{R}$ is glued to $\alpha \times \mathbb{R}$ with a reversal, so
that the $\gamma\alpha$ part is upside down. It is clear that these instructions produce a
Möbius strip. Moreover, in general, we can specify a locally trivialising cover
of the base space, with the condition that the intersection is path connected,
take any fibre having a group action on it and, by assigning group elements
to intersections, give instructions to build a new object. We need to ensure
that the instructions are unambiguous and that the resulting object is a fibre
bundle.
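The gluing instructions for the circle can be written down as data. The sketch below assumes the situation of the remark: three open sets covering $S^1$ cyclically, one element of $O(1,\mathbb{R}) = \{+1, -1\}$ per intersection, and no triple intersections. The chart labels and the helpers `classify` and `flip_chart` are hypothetical; the code relies on the fact that, for such a cover, the product of the three signs decides whether the result is the cylinder or the Möbius strip.

```python
def classify(transitions):
    """Multiply the O(1) elements around the circle: +1 is trivial, -1 is Mobius."""
    sign = 1
    for g in transitions.values():
        if g not in (1, -1):
            raise ValueError("transition elements must lie in O(1) = {+1, -1}")
        sign *= g
    return "trivial" if sign == 1 else "Mobius"


def flip_chart(transitions, chart):
    """Turn one strip upside down: every transition touching that chart changes sign."""
    return {
        pair: (-g if chart in pair else g)
        for pair, g in transitions.items()
    }


# Hypothetical labels for the three open sets of the cover.
identity_everywhere = {("alpha", "beta"): 1, ("beta", "gamma"): 1, ("gamma", "alpha"): 1}
one_reversal = {("alpha", "beta"): 1, ("beta", "gamma"): 1, ("gamma", "alpha"): -1}

cylinder = classify(identity_everywhere)
mobius = classify(one_reversal)

# Re-trivialising a chart changes two signs, so the product is unchanged.
cylinder_again = classify(flip_chart(identity_everywhere, "beta"))
mobius_again = classify(flip_chart(one_reversal, "gamma"))
```

Flipping a strip changes the individual instructions but not the bundle, so a trivial bundle can be specified by instructions that are not all the identity.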
Denition 7.2.2. In general, If U
the inverse of g
?
Exercise 7.2.7. Verify also that for an unambiguous instruction we need to
have the cocycle condition:
$$g_{\alpha\beta}\, g_{\beta\gamma}\, g_{\gamma\alpha} = 1$$
on any nonempty region $U_\alpha \cap U_\beta \cap U_\gamma$.
Exercise 7.2.8. Verify that the construction described always gives a fibre
bundle.
Note that we do not need the fibre to be a $G$-torsor for this group; it suffices
that the action be that of a subgroup. In fact we can get a trivial bundle by
consistently choosing the identity. (We can get it other ways, too!)
Exercise 7.2.9. Explain the last, parenthetic, remark.
Exercise 7.2.10. Show that by choosing fibre $S^0$ instead of $\mathbb{R}$, we can deal
with the case where the fibre is a $G$-torsor. Instead of taking a subgroup,
we can throw out most of the fibre. So for the trivial bundle and the Möbius
bundle over $S^1$ with fibre $S^0$, we still have all the essential properties, and
now we need only count connected components to see the difference.
Definition 7.2.3. In the case described above of a bundle the structure of
which is determined by a group action on the fibres and a set of transition
functions, the bundle is called a $G$-bundle, and the group is called the gauge
group of the bundle.
Remark 7.2.4. In practice, the fibre is a vector space, usually a tangent
space or tensor space.
Definition 7.2.4. For any linear transformation $T$ of a $G$-vector bundle fibre
$F_p$ attached to a point $p$ in the manifold, we can ask whether it arises from
the action of $G$. In general some will and some won't. If it does, we say $T$
lives in $G$.
Exercise 7.2.11. Show this is well defined. That is, show that if $p$ is in two
charts with domains $\alpha$ and $\beta$, then if $T$ lives in $G$ over $\alpha$ it also lives in $G$
over $\beta$, even though the particular element $g \in G$ is in general different.
Exercise 7.2.12. Give examples of $G$-vector bundles and linear
transformations of $F_p$ that live in $G$ and others which do not.
Exercise 7.2.13. Extend this idea to the Lie algebra g when G is a Lie
group.
Definition 7.2.5. A Gauge Transformation is a smooth $G$-bundle map from
a vector bundle into itself which is the identity on the base space and such
that every linear map from a fibre $F_p$ to itself lives in the (Lie) group $G$.
Remark 7.2.5. Physicists care about these a lot. See the section on page
215 of the text book to find out why. Or at least get some vague idea.
7.3 The Endomorphism Bundle
There is a natural isomorphism between
$V \otimes V^*$ and $\mathrm{End}(V)$,
where $\mathrm{End}(V)$ is the vector space of endomorphisms of $V$, that is the linear
maps from $V$ to itself. Observe that $\mathrm{End}(V)$ is a ring under composition; it
has a unit, but is not in general commutative. Recall that we defined a vector
space with an associative multiplication to be an algebra. From the
isomorphism it is clear that for any smooth manifold we have an endomorphism
bundle where $V$ is the tangent space at each point of the manifold.
More generally, if $E$ is any vector bundle over a smooth manifold $M$ with
fibre $V$, we can define the endomorphism bundle $E \otimes E^*$ by attaching $\mathrm{End}(V_p)$
to each point $p$ in $M$. There is nothing new here: we have the tensor bundle
construction in the very special case of $(1, 1)^T$ tensors.
A section $T$ of $E \otimes E^*$ gives a map
$$T : \Gamma(E) \to \Gamma(E)$$
which is linear regarding $\Gamma(E)$ as a $C^\infty(M, \mathbb{R})$ module.
Exercise 7.3.1. Verify the above claim.
Exercise 7.3.2. Show that any C
1
0
1
1
was an orthonormal
basis. Find such a path. Transport an orthogonal frame along the path so it
stays an orthogonal frame in the Riemannian inner product.
Exercise 8.1.7. Could you do the same thing with the standard basis,
$(e_1, e_2)$, and the basis $(e_2, e_1)$? Prove your claim.
Exercise 8.1.8. Can you do the same kind of thing on $S^2$? $\mathbb{RP}^2$?
We conclude that the idea is to shift things along curves. At the very least
the curves should be smooth. In fact any smooth curve, at least locally, is the
solution to a system of ODEs (the Straightening Out Theorem from ODE
theory: see Arnold). So an alternative is to move them infinitesimally
along a vector field, and the things will be sections of some vector bundle.
If the shifts are infinitesimal then we can hope to get a shift along a curve
by some sort of integration process. This leads to asking if we can have some
sort of differential operation of a vector field on various other sections of a
vector bundle.
8.2 Back in $\mathbb{R}^n$
8.2.1 Covariant differentiation
I shall deal with the case $n = 2$ in order to save typing, but the extension is
trivial. Take a vector field $X$ on $\mathbb{R}^2$ and a point $a \in \mathbb{R}^2$ with $X(a)$ denoting
the vector at $a$. We write $a$ as $\begin{pmatrix} a^1 \\ a^2 \end{pmatrix}$ and $X(a)$ as $u = \begin{pmatrix} u^1 \\ u^2 \end{pmatrix}$. Let $Y$ be
another vector field on $\mathbb{R}^2$. Can we differentiate $Y$ in the direction $u$ at $a$?
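If we take $Y(a) = v = \begin{pmatrix} v^1(a) \\ v^2(a) \end{pmatrix}$, then each component $v^j$ is a function of
$(x, y)$, and a vector field $X$ differentiates a function $f$ by
$$Xf = \begin{pmatrix} \dfrac{\partial f}{\partial x} & \dfrac{\partial f}{\partial y} \end{pmatrix} \begin{pmatrix} X^1(x, y) \\ X^2(x, y) \end{pmatrix}.$$
Applying this to each component of $Y$ gives
$$\nabla_X(Y) = \begin{pmatrix} \dfrac{\partial v^1}{\partial x} & \dfrac{\partial v^1}{\partial y} \\[1ex] \dfrac{\partial v^2}{\partial x} & \dfrac{\partial v^2}{\partial y} \end{pmatrix} \begin{pmatrix} u^1(x, y) \\ u^2(x, y) \end{pmatrix}$$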
Remark 8.2.1. We could write this using the Einstein summation convention
as
$$\nabla_X(Y) = u^i\, \partial_i(v^j)\, \partial_j, \quad i, j \in [1 : 2]$$
which has the advantage that if I leave out the last part, by not specifying
which $n$ we are working in, it makes sense for $\mathbb{R}^n$ for any $n$.
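As a rough numerical sanity check of the formula $u^i\,\partial_i(v^j)\,\partial_j$, the following sketch estimates the partial derivatives by central differences. The fields `X` and `Y`, the sample point, and the helper names are illustrative choices, not taken from the text; nothing in the code is specific to $n = 2$.

```python
def partial(f, p, i, h=1e-5):
    """Central-difference estimate of the i-th partial derivative of f at p."""
    lo, hi = list(p), list(p)
    lo[i] -= h
    hi[i] += h
    return (f(hi) - f(lo)) / (2 * h)


def euclidean_connection(X, Y, p):
    """Components of nabla_X(Y) at p: the j-th entry is sum_i u^i d_i(v^j)."""
    u = X(p)
    n = len(p)
    return [
        sum(u[i] * partial(lambda q, j=j: Y(q)[j], p, i) for i in range(n))
        for j in range(n)
    ]


def X(p):
    """Illustrative rotational field -y d/dx + x d/dy."""
    return [-p[1], p[0]]


def Y(p):
    """Illustrative radial field x d/dx + y d/dy."""
    return [p[0], p[1]]


point = [1.0, 2.0]
result = euclidean_connection(X, Y, point)
# The Jacobian of Y is the identity matrix, so nabla_X(Y) should equal X(point).
expected = X(point)
```

Because the Jacobian of this particular `Y` is the identity, the matrix formula reduces to $\nabla_X(Y) = X$, which gives an independent check on the code.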
Exercise 8.2.1. Take a nice vector field $X$ on $\mathbb{R}^2$ such as $y\,\dfrac{\partial}{\partial x} - x\,\dfrac{\partial}{\partial y}$. Choose
a nice simple $Y$ and calculate $\nabla_X(Y)$, also $\nabla_Y(X)$, $\nabla_X(X)$ and $\nabla_Y(Y)$.
Sketch all the vector fields and satisfy yourself everything makes sense, and
that we can legitimately regard $\nabla_X(Y)$ as a derivative of $Y$ in the direction
$X$ at each point.
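Remark 8.2.2. From the matrix notation, certain things are obvious:
1. $\nabla_X(Y)$ is certainly additive in $X$:
$$\nabla_{X_1 + X_2}(Y) = \nabla_{X_1}(Y) + \nabla_{X_2}(Y)$$
2. $\nabla_X(Y)$ is $\mathbb{R}$-linear in $X$:
$$\forall t \in \mathbb{R}, \quad \nabla_{tX}(Y) = t\,\nabla_X(Y)$$
3. Since this is done pointwise as far as $X$ is concerned, it is $C^\infty(\mathbb{R}^2, \mathbb{R})$
linear in $X$:
$$\forall f \in C^\infty(\mathbb{R}^2, \mathbb{R}), \quad \nabla_{fX}(Y) = f\,\nabla_X(Y)$$
4. $\nabla_X(Y)$ is $\mathbb{R}$-linear in $Y$:
$$\nabla_X(Y_1 + Y_2) = \nabla_X(Y_1) + \nabla_X(Y_2)$$
$$\forall t \in \mathbb{R}, \quad \nabla_X(tY) = t\,\nabla_X(Y)$$
5. It satisfies the Leibniz rule so far as $C^\infty(\mathbb{R}^2, \mathbb{R})$ scaling of $Y$ is
concerned:
$$\nabla_X(fY) = f\,\nabla_X(Y) + (Xf)\,Y$$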
Exercise 8.2.2. Verify all these.
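A numerical spot check is no substitute for the verification asked for, but it can catch a sign error. The sketch below tests the function-linearity in $X$ and the Leibniz rule in $Y$ at a single point, using finite differences. The fields `X`, `Y`, the function `f`, and all helper names are hypothetical choices made for illustration.

```python
def partial(f, p, i, h=1e-5):
    """Central-difference estimate of the i-th partial derivative of f at p."""
    lo, hi = list(p), list(p)
    lo[i] -= h
    hi[i] += h
    return (f(hi) - f(lo)) / (2 * h)


def nabla(X, Y, p):
    """Euclidean connection: the j-th component is sum_i X^i d_i(Y^j)."""
    u = X(p)
    n = len(p)
    return [
        sum(u[i] * partial(lambda q, j=j: Y(q)[j], p, i) for i in range(n))
        for j in range(n)
    ]


def apply_field(X, f, p):
    """Xf at p: the directional derivative sum_i X^i d_i f."""
    u = X(p)
    return sum(u[i] * partial(f, p, i) for i in range(len(p)))


def X(p):
    return [-p[1], p[0]]


def Y(p):
    return [p[0], p[1]]


def f(p):
    return p[0] ** 2 + p[1]


def fX(p):
    return [f(p) * c for c in X(p)]


def fY(p):
    return [f(p) * c for c in Y(p)]


p = [1.0, 2.0]

# Property 3: nabla_{fX}(Y) = f nabla_X(Y)
lhs_linear = nabla(fX, Y, p)
rhs_linear = [f(p) * c for c in nabla(X, Y, p)]

# Property 5: nabla_X(fY) = f nabla_X(Y) + (Xf) Y
lhs_leibniz = nabla(X, fY, p)
rhs_leibniz = [
    f(p) * a + apply_field(X, f, p) * b
    for a, b in zip(nabla(X, Y, p), Y(p))
]
```

Replacing `fY` by `fX` in the first slot and `Y` by `fY` in the second shows the asymmetry between the two arguments: the first is linear over functions, the second only over constants.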
Exercise 8.2.3. Confirm that the Lie derivative $L_X(Y)$ does not satisfy all
these conditions.
Exercise 8.2.4. Instead of operating with $\nabla_X$ on a vector field, you could
operate in a very similar way on a covector field or 1-form, $\omega = P\,dx + Q\,dy$,
to get another 1-form $\nabla_X(\omega) = XP\,dx + XQ\,dy$. Show this also satisfies
the above conditions.
Remark 8.2.3. We have now defined $\nabla_X$ for two different sections of bundles
over $\mathbb{R}^2$, the tangent bundle and the cotangent bundle.
Definition 8.2.1. Any operator $\nabla_X$ for a vector field $X$ on sections of any
vector bundle over $\mathbb{R}^2$ which satisfies the properties of Remark 8.2.2 is called a
connection, and the particular connections described are called the Euclidean
connections on the tangent and cotangent bundles.
Obviously extending these to $\mathbb{R}^n$ merely means more terms. Extending them
to manifolds gives some complications.
8.2.2 Curves and transporting vectors
We can now talk about moving a vector along a curve in $\mathbb{R}^2$ so it stays
pointing in the same direction. Suppose $\gamma : I \to \mathbb{R}^2$ is a smooth curve, and $u(t)$ is a
vector at $\gamma(t) \in \mathbb{R}^2$. I want to use the fact that for each $t \in I$, if I differentiate
$u(t)$ in the direction $\dot\gamma(t)$ I get zero:
$$\nabla_{\dot\gamma(t)}(u(t)) = 0$$
This does not make sense, because neither $\dot\gamma(t)$ nor $u(t)$ is a vector field on
an open subset of $\mathbb{R}^2$. This does not matter as far as
$\dot\gamma(t)$ is concerned, because we only need a vector at each point of the curve
to give a direction in which to differentiate. It does matter as far as $u(t)$
is concerned: at the very least we want to know what $u$ is doing in some
neighbourhood of the curve, whereupon $\nabla_{\dot\gamma(t)}(u(t))$ becomes intelligible and
I can insist that it be zero. This will put some conditions on $u$. We already
know, of course, exactly what we want to get out, because we know that
shifting vectors parallel to themselves in $\mathbb{R}^2$ is rather trivial. What we want
to conclude is that $u$ is constant along $\gamma$, and that this does not depend on
the curve. But we have an eye on doing the same sort of thing on $S^2$, where
life is more complicated.
200 CHAPTER 8. CONNECTIONS
You can see that the proposition that the directional derivative of a function
is zero in every direction certainly tells us that the function is constant. You
can also see that if the directional derivative of a function along a curve is
zero, it must be constant along the curve. And we don't give a damn what
the function is doing elsewhere. This holds for each component function of
the vector field that extends $u$ in a neighbourhood of the curve.
Exercise 8.2.5. Prove that a smooth function which has directional
derivative along a curve equal to zero is constant along the curve.
Exercise 8.2.6. Is it true for any continuous curve in $\mathbb{R}^2$, that a function
defined on it can always be extended to a neighbourhood of the curve?
We therefore deduce that for the Euclidean connection on $\mathbb{R}^2$, the properties
of the connection ensure that the condition
$$\nabla_{\dot\gamma(t)}(u(t)) = 0$$
is quite intelligible since it makes sense for any extension of $u(t)$ into a
neighbourhood of the curve, and it tells us how to parallel transport a vector
along a curve, and that for any two points on the curve, the condition ensures
the vector at each point is the same. Big deal. Of course, the notion of
vectors in two different tangent spaces being the same is certainly trivial in
$\mathbb{R}^2$, indeed in $\mathbb{R}^n$ generally, so long as we have the standard structure; there
is one obvious sense in which it makes sense. The trick is to say what it means
along curves in manifolds that are not so simple.
From this, after some reflection, we conclude that the Euclidean connection
on $\mathbb{R}^n$ solves all the problems of parallel transportation on $\mathbb{R}^n$. This, face
it, wasn't much of a problem. On the other hand, it does give us a hope of
solving the same problem on $S^2$ and other manifolds, including the universe
in which we live. And if we can parallel translate vectors we should be able
to parallel translate other things using the same ideas.
So we study connections.
8.3 Covariance
The question is, can we make this work on manifolds in general? Certain
things are prerequisites: in particular, this all has to be independent of a
choice of basis. If you are a physicist and you use a vector field or differential
1-form to represent something like an electric field, you would insist that
the vector at a point is a real thing that does not depend on the choice of a
coordinate system. We now know that the right way to express this belief is
in terms of invariance under certain group actions. You and I may differ in
the actual numbers to be assigned, but we'd better agree on what happens
in the world after a suitable translation scheme is established. Or it ain't
Science.
If you change the basis of $\mathbb{R}^n$ for some reason known only to yourself, then
both $X$ and $Y$ will have different representations. Still, a vector field exists
independent of your description, and something would be horribly wrong if
$\nabla_X(Y)$ depended on the basis.
Exercise 8.3.1. Take the vector fields on $\mathbb{R}^2$ you used for an earlier problem
and express them in the basis
1
1
1
1
in such a way that it really is the same vector field. Do the calculations all
over again.
Exercise 8.3.2. Show that in general if we change the basis on $\mathbb{R}^n$ so that
$X$ and $Y$ are written in the new basis as $X'$ and $Y'$, then $\nabla_{X'}(Y')$ is what it
jolly well ought to be.
Exercise 8.3.3. Using the same vector fields, do the calculations using polar
coordinates. What conclusions do you draw?
Exercise 8.3.4. Suppose $\varphi$ is a diffeomorphism of $\mathbb{R}^n$ and $X$, $Y$ are vector
fields on $\mathbb{R}^n$. Explain how to describe $X$, $Y$ in terms of the coordinate system
given by $\varphi$. What happens to $\nabla_X(Y)$ under this diffeomorphism?
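8.4 Extensions to Tensor Fields on $\mathbb{R}^2$
Returning to $\mathbb{R}^2$, we could express the Euclidean connection in the form:
$$\nabla_X(Y) = \begin{pmatrix} \dfrac{\partial v^1}{\partial x} & \dfrac{\partial v^1}{\partial y} \\[1ex] \dfrac{\partial v^2}{\partial x} & \dfrac{\partial v^2}{\partial y} \end{pmatrix} \begin{pmatrix} u^1(x, y) \\ u^2(x, y) \end{pmatrix}$$
or
$$\nabla_X(Y) = u^i\, \partial_i(v^j)\, \partial_j$$
This rather obscures the fact that I am using $u$'s to represent the vector field
$X$ and $v$'s to represent the vector field $Y$, so it might be better to write
$$\nabla_X(Y) = X^i\, \partial_i(Y^j)\, \partial_j \qquad (8.4.1)$$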
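The definition of $\nabla_X(\omega)$ where
$$\omega = P\,dx + Q\,dy = \omega_1\,dx + \omega_2\,dy$$
was just
$$\nabla_X(\omega) = X\omega_1\,dx + X\omega_2\,dy = X\omega_j\,dx^j$$
and unpacking the expression for $X\omega_j$ we get
$$\nabla_X(\omega) = X^i\, \partial_i(\omega_j)\, dx^j \qquad (8.4.2)$$
which looks a lot like equation 8.4.1.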
Exercise 8.4.1. Write out equation 8.4.2 as a matrix.
The fact that we have the same basic shape for vector fields as for 1-forms
tells us that all we are doing is choosing a suitable basis for each of them:
$(e_1, e_2)$ is the standard basis for the vectors in $\mathbb{R}^2$ and I have $(\partial_1, \partial_2)$ for
the standard basis for the tangent vectors, and $(dx^1, dx^2)$ for the cotangent
vectors. Suppose I have a $(k, \ell)$ tensor bundle, then I can write out a basis
for any section as a collection of terms in the form
$$dx^{i_1} \otimes dx^{i_2} \otimes \cdots \otimes dx^{i_k} \otimes \partial_{i_{k+1}} \otimes \cdots \otimes \partial_{i_{k+\ell}}$$
We can extend the definition of $\nabla_X(Y)$ to $\nabla_X(s)$ where $s$ is any section of
the tensor bundle by writing
$$\nabla_X(\sigma \otimes \tau) = \nabla_X(\sigma) \otimes \tau + \sigma \otimes \nabla_X(\tau)$$
and extending to as many tensor products as you feel a need for, and using
linearity.
Exercise 8.4.2. Show that for all sections of a tensor bundle s, the properties
of Remark 8.2.2 hold.
Exercise 8.4.3. Let $X$ be the same old vector field on $\mathbb{R}^2$ as in earlier
exercises, and let a Riemannian inner product be defined on the positive
quadrant by the matrix
$$s = \begin{pmatrix} 1 + xy & 0 \\ 0 & 1 + x^2 + y^2 \end{pmatrix}$$
Find $\nabla_X(s)$. Is it positive definite? If we take the covariant derivative of a
symmetric 2-tensor $s$, is the resulting 2-tensor necessarily symmetric?
8.5 The Koszul Connection
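The crucial properties of the covariant derivative of tensor bundles on $\mathbb{R}^2$
were: For any section $s$ of a tensor bundle,
1. $\nabla_X(s)$ is certainly additive in $X$:
$$\nabla_{X_1 + X_2}(s) = \nabla_{X_1}(s) + \nabla_{X_2}(s)$$
2. $\nabla_X(s)$ is $\mathbb{R}$-linear in $X$:
$$\forall t \in \mathbb{R}, \quad \nabla_{tX}(s) = t\,\nabla_X(s)$$
3. Since this is done pointwise as far as $X$ is concerned, it is $C^\infty(\mathbb{R}^2, \mathbb{R})$
linear in $X$:
$$\forall f \in C^\infty(\mathbb{R}^2, \mathbb{R}), \quad \nabla_{fX}(s) = f\,\nabla_X(s)$$
4. $\nabla_X(s)$ is $\mathbb{R}$-linear in $s$:
$$\nabla_X(s_1 + s_2) = \nabla_X(s_1) + \nabla_X(s_2)$$
$$\forall t \in \mathbb{R}, \quad \nabla_X(ts) = t\,\nabla_X(s)$$
5. It satisfies the Leibniz rule so far as $C^\infty(\mathbb{R}^2, \mathbb{R})$ scaling of $s$ is
concerned:
$$\nabla_X(fs) = f\,\nabla_X(s) + (Xf)\,s$$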
Extending these to $\mathbb{R}^n$ is rather trivial; consider it done. The next step is
to define a Koszul connection on any vector bundle $E$ over a manifold $M$ as
a map operating on a vector field, $X$, on $M$ and any section $s$ of $E$ which
satisfies the above rules. This is rather abstract, but I have built up the
simple concrete cases first in order to cheer you up.
8.6 Vector Potentials
We gave a very simple covariant derivative on $\mathbb{R}^2$ which quite obviously
satisfied the rules for a connection; indeed that's where we got the rules from.
Now we take the Physicist's perspective. Weaning them off coordinates is an
ongoing process, so let's try doing it their way. Then we get to be able to do
lots of sums, which is good, and confuse things horribly, which is bad.
Suppose we have a section $s$ of some bundle $E$ over $\mathbb{R}^2$ with fibre a vector
space $F$ so that $E = \mathbb{R}^2 \times F$. $s$ is therefore a map from $\mathbb{R}^2$ to $F$, and if
we let $(e_1, e_2, \cdots, e_n)$ be a basis for $F$, then for every point $v \in \mathbb{R}^2$ we have
$s(v) = s^1(v)e_1 + s^2(v)e_2 + \cdots + s^n(v)e_n$, or $s(v) = s^i e_i$ in Physicist's notation.
If $X$ is a vector field on $\mathbb{R}^2$ we can write $X = X^1\partial_1 + X^2\partial_2 = X^j\partial_j$. And
if $s' = \nabla_X(s)$ we have also $s'(v) = s'^1(v)e_1 + \cdots + s'^n(v)e_n = s'^i e_i$. And
the $n$ functions $s'^i(v)$ depend on the $n$ functions $s^i(v)$ and on the functions
$X^1$, $X^2$ only. Moreover, the rules for getting the $s'^i$ are specified by the rules
of section 8.5 and nothing else. Let's see how it works out. I shall have to
take $\nabla_Y$ for $Y$ the unit vector field in the direction of the x-axis and also the
y-axis, and rather than write this as $\nabla_{\partial_1}$ or $\nabla_{\partial_2}$ I shall shorten this to $\nabla_1$ and
$\nabla_2$ respectively.
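$$\begin{aligned}
s' = \nabla_X(s) &= \nabla_{(X^1\partial_1 + X^2\partial_2)}(s) \\
&= \nabla_{X^1\partial_1}(s) + \nabla_{X^2\partial_2}(s) \\
&= X^1\nabla_1(s) + X^2\nabla_2(s) \\
&= X^1\nabla_1(s^1e_1 + \cdots + s^ne_n) + X^2\nabla_2(s^1e_1 + \cdots + s^ne_n) \\
&= X^j\nabla_j(s^ie_i) \quad \text{using the Einstein convention} \\
&= X^j\left(s^i\nabla_j(e_i) + (\partial_j s^i)e_i\right) \quad \text{(Leibniz)}
\end{aligned}$$
It might be better to expand this last to
$$\begin{aligned}
\nabla_X(s) ={}& X^1\left(s^1\nabla_1(e_1) + s^2\nabla_1(e_2) + \cdots + s^n\nabla_1(e_n)\right) \\
&+ X^2\left(s^1\nabla_2(e_1) + s^2\nabla_2(e_2) + \cdots + s^n\nabla_2(e_n)\right) \\
&+ X^1\left((\partial_1 s^1)e_1 + (\partial_1 s^2)e_2 + \cdots + (\partial_1 s^n)e_n\right) \\
&+ X^2\left((\partial_2 s^1)e_1 + (\partial_2 s^2)e_2 + \cdots + (\partial_2 s^n)e_n\right)
\end{aligned}$$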
Now the terms $\nabla_1(e_i)$ and $\nabla_2(e_i)$ are, for each $i \in [1 : n]$, going to be
values of the section, and can therefore be expressed in terms of the basis
$(e_1, e_2, \cdots, e_n)$. In other words, for each $j \in [1 : 2]$ and for each
$i \in [1 : n]$, at each point $v \in \mathbb{R}^2$, there is a collection of numbers $\Gamma^k_{i,j}(v)$
expressing $\nabla_j(e_i)$ as
$$\sum_{k \in [1:n]} \Gamma^k_{i,j}\, e_k$$
which tells us that we can express $\nabla_X(s)$ as
$$X^1 \sum_{i,k \in [1:n]} s^i\, \Gamma^k_{i,1}\, e_k + X^2 \sum_{i,k \in [1:n]} s^i\, \Gamma^k_{i,2}\, e_k + X^1 \sum_{i \in [1:n]} (\partial_1 s^i)\, e_i + X^2 \sum_{i \in [1:n]} (\partial_2 s^i)\, e_i$$
Collecting these up and changing the name of a summation index gives us:
$$\nabla_X(s) = \sum_{j \in [1:2],\; i,k \in [1:n]} X^j\left(\partial_j s^i + s^k\, \Gamma^i_{k,j}\right) e_i \qquad (8.6.1)$$
or $\nabla_X(s) = X^j(\partial_j s^i + s^k\, \Gamma^i_{k,j})\, e_i$ in physics speak.
Exercise 8.6.1. Find the expression for $\nabla_j(\partial_i)$ in terms of the standard
basis for vectors in $\mathbb{R}^2$. Now do it for polar coordinates.
The $2n^2$ functions $\Gamma^i_{k,j}$ from $\mathbb{R}^2$ to $\mathbb{R}$ (on $\mathbb{R}^2$; on $\mathbb{R}^m$ it would be $mn^2$) pretty
much tell us everything about the connection, given that the $X^j$ tell us about
the vector field $X$ and the $\partial_j s^i$ tell us about differentiating the section. In
general they are $mn^2$ functions from the manifold of dimension $m$ to $\mathbb{R}$,
and they tell us how the connection works on the bundle with fibre $F$ of
dimension $n$. The collection of functions is called the Vector Potential for
the connection, or sometimes the Christoffel symbols. When the manifold
has the same dimension, $n$, as the fibre, there are $n^3$ such functions. The
text book prefers to use the term Christoffel symbols for the case when the
connection respects a Riemannian inner product.
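To see the vector potential term doing some work, here is a small sketch of the component formula $X^j(\partial_j s^i + s^k\,\Gamma^i_{k,j})$ in polar coordinates $(r, \theta)$. It assumes the standard values for the flat connection on the plane in these coordinates, namely $\Gamma^r_{\theta,\theta} = -r$ and $\Gamma^\theta_{r,\theta} = \Gamma^\theta_{\theta,r} = 1/r$, which are quoted here rather than derived; the function names and the sample point are hypothetical.

```python
R, THETA = 0, 1  # index positions of the polar coordinates (r, theta)


def polar_gamma(i, k, j, p):
    """Gamma^i_{k,j} at p = (r, theta); assumed standard flat-plane values."""
    r = p[R]
    if (i, k, j) == (R, THETA, THETA):
        return -r
    if (i, k, j) in ((THETA, R, THETA), (THETA, THETA, R)):
        return 1.0 / r
    return 0.0


def partial(f, p, j, h=1e-5):
    """Central-difference estimate of the j-th partial derivative of f at p."""
    lo, hi = list(p), list(p)
    lo[j] -= h
    hi[j] += h
    return (f(hi) - f(lo)) / (2 * h)


def covariant_derivative(X, s, p, gamma=None):
    """Components X^j (d_j s^i + s^k Gamma^i_{k,j}); gamma=None drops the potential."""
    u = X(p)
    comps = s(p)
    out = []
    for i in range(len(comps)):
        total = 0.0
        for j in range(len(p)):
            term = partial(lambda q, i=i: s(q)[i], p, j)
            if gamma is not None:
                term += sum(comps[k] * gamma(i, k, j, p) for k in range(len(comps)))
            total += u[j] * term
        out.append(total)
    return out


def X_P(p):
    """The polar field d/dtheta."""
    return [0.0, 1.0]


def Y_P(p):
    """The polar field r d/dr."""
    return [p[R], 0.0]


point = [2.0, 0.5]
naive = covariant_derivative(X_P, Y_P, point)                  # potential omitted
corrected = covariant_derivative(X_P, Y_P, point, polar_gamma)  # potential included
```

With the potential omitted the result is the zero vector, whereas with it the result has components $(0, 1)$, that is $\partial/\partial\theta$, which agrees with differentiating the radial field $x\,\partial_x + y\,\partial_y$ along the rotational field $-y\,\partial_x + x\,\partial_y$ in cartesian coordinates.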
Exercise 8.6.2. When we discussed the Euclidean connection for sections
of the tangent bundle, what were the $\Gamma^i_{k,j}$?
The significance of the vector potential term in equation 8.6.1 is not hard to
see. If we left it out, or equivalently insisted that all terms are zero, then in
the case of a vector field we would simply have the situation of the Euclidean
connection,
$$\nabla_X(Y) = \begin{pmatrix} \dfrac{\partial v^1}{\partial x} & \dfrac{\partial v^1}{\partial y} \\[1ex] \dfrac{\partial v^2}{\partial x} & \dfrac{\partial v^2}{\partial y} \end{pmatrix} \begin{pmatrix} u^1(x, y) \\ u^2(x, y) \end{pmatrix}$$
In order to work on $S^2$, this would have to survive a diffeomorphism, and by
an earlier exercise, it doesn't.
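Exercise 8.6.3. Take the usual vector fields on $\mathbb{R}^2 \setminus 0$, $X = -y\,\dfrac{\partial}{\partial x} + x\,\dfrac{\partial}{\partial y}$
and $Y = x\,\dfrac{\partial}{\partial x} + y\,\dfrac{\partial}{\partial y}$. Compute $\nabla_X(Y)$ in cartesian form. Now find
expressions for the same vector fields in polar form. (You'd better get $X_P = \dfrac{\partial}{\partial \theta}$
and $Y_P = r\,\dfrac{\partial}{\partial r}$ and make sure you can prove these are correct, not just look
at the pictures!) Now use the rule for the Euclidean connection to calculate
$\nabla_{X_P}(Y_P)$. This had better not be the polar form of $\nabla_X(Y)$ or you have got
the wrong answer.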
Exercise 8.6.4. Find a vector field $Z$ on $\mathbb{R}^2 \setminus 0$ which corresponds to the
polar field $\dfrac{\partial}{\partial r}$. (That is, it consists of a unit vector radially outwards at each
point.) Calculate
r
(
)? Calculate
),
(r
r
) and
r
r
(r
r
) in the same way. Translate them all back
into polar coordinates.
Exercise 8.6.5. Explain why $r\,\dfrac{\partial}{\partial r}$ is a better choice than $\dfrac{\partial}{\partial r}$. (Hint, look at
the polar diffeomorphism.)
Exercise 8.6.6. Hence compute the vector potential for $\nabla_{X_P}(Y_P)$.
Exercise 8.6.7. Confirm that if you take the vector potential into account,
you get the right answer for $\nabla_{X_P}(Y_P)$.
Exercise 8.6.8. $\nabla_x(\partial_y)$ and the similar terms in the cartesian framework
are what you'd expect them to be, but the $\nabla_i(\partial_j)$ in polar coordinates
contain significant information. What are the numbers telling you?
Exercise 8.6.9. If you look at what you have been doing with the above
calculations, you can see that we have defined $\nabla_X(Y)$ in cartesian coordinates
on $\mathbb{R}^2$ (and hence by trivial modification on $\mathbb{R}^n$) and then proceeded to
take it on the subspace $\mathbb{R}^2 \setminus 0$ by ignoring the deleted point. Then we
transferred it all to $S^1 \times \mathbb{R}^+$ by the diffeomorphism $P$ for polar coordinates.
In order to compute $\nabla_{X_P}(Y_P)$, I rather took it for granted that it is to be
done by translating $X_P$ and $Y_P$ into cartesian form, doing it there, and then
translating the answer back into polar coordinates, which surely is the only
sane thing to do. If $\varphi$ were any diffeomorphism from a subset $U \subseteq \mathbb{R}^2$ to
some other space, $V$, then what we are doing is taking a vector field $X$ on
$U$ to the vector field $D\varphi \circ X \circ \varphi^{-1}$ on $V$, a vector field $Y$ on $U$ to the vector
field $D\varphi \circ Y \circ \varphi^{-1}$ on $V$, and defining
$$\nabla_{D\varphi \circ X \circ \varphi^{-1}}\left(D\varphi \circ Y \circ \varphi^{-1}\right) = D\varphi \circ \nabla_X(Y) \circ \varphi^{-1}$$
Show that this is a consistent way to export $\nabla$ from one manifold to another
which is diffeomorphic to it, and hence explain why we can define a
connection on a manifold, and why $\nabla$ is called covariant differentiation.
8.6.1 Tensor formulation
The term $\Gamma^i_{k,j}$ looks very like a (2,1) tensor in coordinate terms. For the
Riemannian inner product, what goes in at each point in the same tangent
space is a pair of vectors and what comes out is a number, and this is bilinear
and varies smoothly as we move around in the manifold. For the vector
potential, we have that the things going in are two vector fields, or at least a
vector in the tangent space at a point and a field (or possibly more general
section) defined in a neighbourhood of the point (so we can differentiate), and
what comes out is another vector field (or possibly more general section).
8.7 Concluding Remarks
This just starts on the subject of connections, which are crucial to much
differential geometry. For example, it is connections that have curvature. We
could show how a Riemannian inner product (metric) leads to a connection,
the Levi-Civita connection, which is compatible with the metric. But there
is too much to fit into an introductory course unless I follow the tradition of
training you to say the right things with only minimal grasp of what they
mean, something I much prefer not to do.