A Course in Differential Geometry

TEXTS AND READINGS
IN MATHEMATICS 22
A Course in
Differential Geometry
and Lie Groups
Texts and Readings in Mathematics
Advisory Editor
C. S. Seshadri, Chennai Mathematical Institute, Chennai.
Managing Editor
Rajendra Bhatia, Indian Statistical Institute, New Delhi.
Editors
V. S. Borkar, Tata Institute of Fundamental Research, Mumbai.
R. L. Karandikar, Indian Statistical Institute, New Delhi.
C. Musili, University of Hyderabad, Hyderabad.
K. H. Paranjape, Institute of Mathematical Sciences, Chennai.
T. R. Ramadas, Tata Institute of Fundamental Research, Mumbai.
V. S. Sunder, Institute of Mathematical Sciences, Chennai.
Already Published Volumes

R. B. Bapat: Linear Algebra and Linear Models (Second Edition)
R. Bhatia: Fourier Series
C. Musili: Representations of Finite Groups
H. Helson: Linear Algebra (Second Edition)
D. Sarason: Notes on Complex Function Theory
M. G. Nadkarni: Basic Ergodic Theory (Second Edition)
H. Helson: Harmonic Analysis (Second Edition)
K. Chandrasekharan: A Course on Integration Theory
K. Chandrasekharan: A Course on Topological Groups
R. Bhatia (ed.): Analysis, Geometry and Probability
K. R. Davidson: C*-Algebras by Example
M. Bhattacharjee et al.: Notes on Infinite Permutation Groups
V. S. Sunder: Functional Analysis - Spectral Theory
V. S. Varadarajan: Algebra in Ancient and Modern Times
M. G. Nadkarni: Spectral Theory of Dynamical Systems
A. Borel: Semisimple Groups and Riemannian Symmetric Spaces
M. Marcolli: Seiberg-Witten Gauge Theory
A. Böttcher and S. M. Grudsky: Toeplitz Matrices, Asymptotic Linear
Algebra and Functional Analysis
A. R. Rao and P. Bhimasankaram: Linear Algebra (Second Edition)
C. Musili: Algebraic Geometry for Beginners
A. R. Rajwade: Convex Polyhedra with Regularity Conditions and
Hilbert's Third Problem
A Course in
Differential Geometry
and Lie Groups
S. Kumaresan
University of Mumbai
Published by Hindustan Book Agency (India)
P 19 Green Park Extension, New Delhi 110016
Copyright © 2002 by Hindustan Book Agency (India)
No part of the material protected by this copyright notice may be

reproduced or utilized in any form or by any means, electronic or
mechanical, including photocopying, recording or by any informa-
tion storage and retrieval system, without written permission from
the copyright owner, who has also the sole right to grant licences
for translation into other languages and publication thereof.
All export rights for this edition vest exclusively with Hindustan
Book Agency (India). Unauthorized export is a violation of Copy-
right Law and is subject to legal action.
ISBN 978-81-85931-67-8 ISBN 978-93-86279-08-8 (eBook)

DOI 10.1007/978-93-86279-08-8
Dedicated to the memory of my mother
S. Susila
Contents
Preface
1 Differential Calculus
1.1 Definitions and examples
1.2 Chain rule, mean value theorem and applications
1.3 Directional derivatives
1.4 Inverse mapping theorem
1.5 Local study of immersions and submersions
1.6 Fundamental theorem of calculus
1.7 Higher derivatives and Taylor's theorem
1.8 Smooth functions with compact support
1.9 Existence of solutions of ODE
2 Manifolds and Lie Groups
2.1 Differential manifolds
2.2 Smooth maps and diffeomorphisms
2.3 Tangent spaces to a manifold
2.4 Derivatives of smooth maps
2.5 Immersions and submersions
2.6 Submanifolds
2.7 Vector fields
2.8 Flows and exponential map
2.9 Frobenius theorem
2.10 Lie groups and Lie algebras
2.11 Homogeneous spaces
3 Tensor Analysis
3.1 Multilinear algebra
3.2 Exterior algebra
3.3 Tensor fields
3.4 The exterior derivative
3.5 Lie derivatives
4 Integration
4.1 Orientable manifolds
4.2 Integration on manifolds
4.3 Stokes' theorem
5 Riemannian Geometry
5.1 Covariant differentiation
5.2 Riemannian metrics
5.3 The Levi-Civita connection
5.4 Gauss theory of surfaces in $\mathbb{R}^3$
5.5 Curvature and parallel transport
5.6 Cartan structural equations
5.7 Spaces of constant curvature
A Tangent Bundles and Vector Bundles
B Partitions of Unity
Bibliography
List of Symbols
Index
Preface
This book arose out of the courses offered by me at T.I.F.R. Centre,

Bangalore, T.I.F.R., Bombay (twice), Indian Institute of Technology,
Kanpur and Ramanujan Institute of Mathematics, University of Madras.
I plunged into writing this book thanks to the encouragement and per-
suasions of the audience of my courses.
The book covers the traditional topics of differential manifolds,
tensor fields, Lie groups, integration on manifolds and a short but moti-
vated introduction to basic differential and Riemannian geometry. This
book will be suitable for a course for students of Physics and Mathe-
matics at the graduate level of western universities or at M.Phil. level
of Indian universities. While the topics are traditional, the discern-
ing reader will find the approach to the topics and the proofs at many
places quite novel. Our main emphasis is on the geometric meaning of
the concepts, so that the reader will feel confident and acquire a working
knowledge. For this reason, motivations are given, many simple exer-
cises are included and illuminating nontrivial examples are discussed in
detail.
Some of the salient features of the book are the following:
1. Geometric and conceptual treatment of Differential Calculus with
a wealth of nontrivial examples.
2. A thorough discussion of the much-used result on the existence,
uniqueness and smooth dependence of solutions of ODE.
3. Special care in introducing the concept of tangent space to a
manifold.
4. An early and simultaneous treatment of Lie Groups and related
concepts as we develop the basic topics in differential manifolds.
5. An early and elementary proof of the fact that all classical groups
are Lie groups.
6. A motivated and highly geometric proof of the Frobenius theorem.
7. A constant reconciliation with the classical (such as tensor calculus)
treatment or notation and the modern approach.
8. Simple proofs of the Hairy-Ball theorem and Brouwer's fixed point
theorem.
9. Construction of manifolds of constant curvature à la Chern.
10. Merits and comparisons of different viewpoints, whenever possible.
A major portion of this book was typeset when I was at the Tata Institute
and I thank the Tata Institute for it. A substantial portion of the
preliminary version was typed by G. Santhanam and was proof-read by
C.S. Aravinda. Kirti Joshi and Kapil Paranjape introduced me to the
world of TeX and LaTeX. V. Muruganandam went through the preliminary
version and made suggestions for improvement. V. Nandagopal
formatted the book for the TRIM series. Ajit Kumar drew the figures
on computers and helped me a lot in the final version. It gives me great
pleasure to record my sincere thanks to all these friends.
The book was in hibernation for about ten years and the sustained
efforts of many of my friends, especially S. Ilangovan, led to its seeing
the light of day.
I take this opportunity to record my sense of gratitude to one of
my relatives Mr. G.Gnanasambandam, B.E., who honed my scientific
thinking by getting me into serious discussions on scientific matters dur-
ing my high-school days. I also record my deep sense of appreciation
for the numerous hours of pleasant discussions on a variety of topics in
Mathematics which I had with Adimurthi, Akhil Ranjan, K. Okamoto,
M.S. Raghunathan and R. Parthasarathy during my days at the Tata
Institute. My ideas, knowledge and appreciation of Mathematics owe a
lot to these people.
I thank many of my colleagues, especially juniors, at the Tata Insti-
tute who asked me to give many seminars and discussed their difficulties
and problems, mathematical as well as personal. These helped me perceive
the difficulties of a beginner and also made me a better human
being.
I also record my appreciation for Rajendra Bhatia, Managing Editor
of the series and the referees. Their corrections and persuasive sugges-
tions made me realize that my enthusiasm for dissemination of knowl-
edge cannot be an excuse for being sloppy. I am sure that paying heed
to their suggestions has enhanced the value of this book and eliminated
some of the egregious errors and tactless remarks.
I thank my wife Kalai and my children Sivaguru and Bharathi who
bore with me while I was busy with the preparation of the book.
Books and expositions at this level, as a rule, are written in a formal
and precise way. I broke away from this and wrote the book in a conversational
tone. I hope that beginners will find it easy for self-study. I
crave the indulgence of experts in case they find my style a bit too abra-
sive. I hope that this book will instill confidence in the reader, provide
a working knowledge and create a desire for further studies. I shall be
happy if readers appreciate my efforts in making their learning curve of
this subject less steep.
I welcome corrections and suggestions for improvement.
Mumbai S. Kumaresan
December 2001 kumaresa@math.mu.ac.in
Chapter 1
Differential Calculus
We assume that the reader has had a course in calculus of several
variables. The purpose of this chapter is therefore to establish the
notation and to give a quick review of the theory, with special emphasis
on the geometric ideas, conceptual understanding and nontrivial examples.
The reader may assume that the spaces are the Euclidean spaces
$\mathbb{R}^n$ if he finds it comfortable.
1.1 Definitions and examples
Preliminaries on normed linear spaces

We shall develop differential calculus on Banach spaces. We need nothing
more than the definition of a Banach space. We first recall the
definition of a norm on a vector space $E$ over $\mathbb{R}$.
Definition 1.1.1 A function $\|\ \| : E \to \mathbb{R}$ is said to be a norm if it
has the following properties:
(i) $\|x\| \ge 0$ for all $x \in E$, and $\|x\| = 0$ if and only if $x = 0$.
(ii) $\|a \cdot x\| = |a| \cdot \|x\|$ for all $a \in \mathbb{R}$ and $x \in E$.
(iii) $\|x + y\| \le \|x\| + \|y\|$ for all $x, y \in E$.

Example 1.1.2 Any one of the following is a norm on $\mathbb{R}^n$:
1. $\|x\| := \bigl(\sum_i x_i^2\bigr)^{1/2}$, where $(x_1, \ldots, x_n)$ is the coordinate
representation of $x \in \mathbb{R}^n$ with respect to the standard basis
$e_i := (0, \ldots, 0, 1, 0, \ldots, 0)$ with $1$ at the $i$-th place and $0$ elsewhere.
This is called the Euclidean or $L^2$-norm on $\mathbb{R}^n$ and is denoted by
$\|\ \|_2$ for emphasis.
2. $\|x\| := \max\{|x_i| : 1 \le i \le n\}$. This norm is called the max norm
or $L^\infty$-norm and is usually easier than the $L^2$-norm to estimate.
It is denoted by $\|\ \|_\infty$.
3. $\|x\| := \sum_{i=1}^n |x_i|$. This is called the $L^1$-norm. It is denoted by
$\|\ \|_1$.
We have a distance function or a metric associated with any norm:
$d(x, y) := \|x - y\|$. We leave it as an easy exercise to the reader to check
that $d$ is a metric. Thus, a vector space with a norm acquires a natural
topology defined by the metric. Its basic open sets are open balls of the
form
\[ B(x, r) := \{y \in E : \|x - y\| < r\}. \]
Note that in this case $B(x, r) = x + B(0, r) = x + rB(0, 1)$, where the
right hand side involves vector addition and scalar multiplication.
Definition 1.1.3 We say two norms, $\|\ \|$ and $\|\ \|'$, on a vector space $E$
are equivalent if they induce the same topology.
This is equivalent to saying that there exist positive constants $C_1$
and $C_2$ such that for all $x \in E$ we have
\[ C_1 \|x\| \le \|x\|' \le C_2 \|x\|. \]
We leave this as an exercise.
Definition 1.1.4 We say that a vector space $E$ with a norm $\|\ \|$ is a
complete normed linear space or a Banach space if every Cauchy
sequence in $E$ converges.
It is easy to see that if $\|\ \|$ and $\|\ \|'$ are equivalent norms on $E$,
then $E$ is complete with respect to $\|\ \|$ if and only if it is complete
with respect to $\|\ \|'$. This follows from the characterization of
equivalent norms. This is to be contrasted with the case of equivalent
metrics.
Using the fact that $|x_i| \le \|x\|_2$, it is an easy exercise to show the
following facts about the norms $\|\ \|_1$, $\|\ \|_2$ and $\|\ \|_\infty$ on $\mathbb{R}^n$:
(a) $\|x\|_\infty \le \|x\|_2 \le \sqrt{n}\, \|x\|_\infty$.
(b) $\frac{1}{\sqrt{n}} \|x\|_1 \le \|x\|_2 \le \sqrt{n}\, \|x\|_1$.
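The inequalities (a) and (b) can be spot-checked numerically before one proves them. The following is only an illustrative sketch in Python, not part of the formal development: the names `norm_1`, `norm_2`, `norm_inf` and `check_inequalities` are our own choices, the tolerance `tol` is an assumption made to absorb rounding, and a random test of course proves nothing.

```python
import math
import random

def norm_1(x):
    """L^1-norm: sum of the absolute values of the coordinates."""
    return sum(abs(t) for t in x)

def norm_2(x):
    """Euclidean (L^2) norm."""
    return math.sqrt(sum(t * t for t in x))

def norm_inf(x):
    """Max (L^infinity) norm."""
    return max(abs(t) for t in x)

def check_inequalities(x, tol=1e-12):
    """Return True if x satisfies (a) and (b), up to the rounding tolerance tol."""
    root_n = math.sqrt(len(x))
    a = (norm_inf(x) <= norm_2(x) + tol
         and norm_2(x) <= root_n * norm_inf(x) + tol)
    b = (norm_1(x) / root_n <= norm_2(x) + tol
         and norm_2(x) <= root_n * norm_1(x) + tol)
    return a and b

random.seed(0)  # fixed seed so that the run is reproducible
samples = [[random.uniform(-5, 5) for _ in range(4)] for _ in range(1000)]
all_ok = all(check_inequalities(x) for x in samples)
print(all_ok)
```

Such an experiment is useful mainly for catching a wrongly remembered constant; the proof itself is the exercise.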
When we use concepts such as open balls, distances, etc., associated
with a norm on $\mathbb{R}^n$, we usually refer to the Euclidean norm. Note that
in view of the equivalence of the Euclidean and max norms it follows
that $x_k := (x_1^k, \ldots, x_n^k) \to x := (x_1, \ldots, x_n)$ if and only if $x_i^k \to x_i$ as
$k \to \infty$ for all $i$.
One can prove that all norms on $\mathbb{R}^n$ are equivalent. The crucial point
of the proof is the fact that in $\mathbb{R}^n$ the closed unit ball $B[0, 1]$ is compact.
Proposition 1.1.5 All norms on $\mathbb{R}^n$ are equivalent.
Proof Let $|\ | : \mathbb{R}^n \to \mathbb{R}$ be any norm. Let $e_i$ be the standard basis of
$\mathbb{R}^n$. Write $x = \sum_i x_i e_i$. Then, since $|x_i| \le \|x\|_2$ for each $i$,
\[ |x| \le \sum_i |x_i|\, |e_i| \le C \sum_i |x_i| \le C n \|x\|_2, \tag{1.1.1} \]
where $C := \max\{|e_i| : 1 \le i \le n\}$. We may take $C_1 := (Cn)^{-1}$.
To get $C_2$, we use the fact that the norm function $x \mapsto |x|$
is continuous by (1.1.1). It attains its extrema on the closed and
bounded (that is, compact) set
\[ S := \{x \in \mathbb{R}^n : \|x\|_2 = 1\}. \]
Let $m := \min_{x \in S} |x|$. Then $m > 0$, since otherwise there exists a vector
$x \in S$ with $|x| = 0$. By the definition of a norm, $x = 0$, a contradiction
to the fact that $x \in S$. If $x \in \mathbb{R}^n$ is any nonzero vector, then $u := x / \|x\|_2$
lies in $S$ so that $|u| \ge m$. But $|u| = |x| / \|x\|_2$ so that we have $|x| \ge m \|x\|_2$.
We may then take $C_2 := 1/m$ and finally get $C_1 |x| \le \|x\|_2 \le C_2 |x|$.
□
As an application of this, we observe the following fact which will be
needed later.
Lemma 1.1.6 Let $\mathbb{R}^N := \mathbb{R}^n \times \mathbb{R}^m$ so that $N = n + m$. For any
$z \in \mathbb{R}^N$ we write $z = (x, y)$ with $x \in \mathbb{R}^n$ and $y \in \mathbb{R}^m$. Then the map
$z \mapsto \max\{\|x\|, \|y\|\}$ is a norm on $\mathbb{R}^N$ equivalent to the Euclidean norm.
□
Exercise 1.1.7 On the space $C_b(X)$ of real or complex valued, bounded,
continuous functions on a topological space $X$ the map
\[ f \mapsto \|f\|_\infty := \sup_{x \in X} |f(x)| \]
is a norm. It is called the uniform or sup norm. $(C_b(X), \|\ \|_\infty)$ is a
Banach space.
Exercise 1.1.8 If $E$, $F$ are Banach spaces and $A : E \to F$ is a linear
map then it is continuous if and only if there exists a constant $C$ such
that $\|Ax\| \le C \|x\|$ for all $x \in E$. For this reason continuous linear
maps are also called bounded linear operators. The set of all bounded
linear maps from $E$ to $F$ is denoted by $BL(E, F)$. The infimum of all
such $C$ is known as the norm of the operator and is given by
\[ \|A\| := \sup\{\|Ax\| : \|x\| \le 1\} = \sup\{\|Ax\| : \|x\| = 1\}. \]
Show that $\|A \circ B\| \le \|A\| \|B\|$ where $A : E \to F$ and $B : D \to E$.
Exercise 1.1.9 Let $E$ and $F$ be normed linear spaces. Let $\dim E < \infty$.
Show that any linear map $f : E \to F$ is continuous.
Exercise 1.1.10 If $E$ is a Banach space then $BL(E)$, the set of all
bounded linear operators on $E$, is a Banach space under the operator
norm. Can you generalize this?
Exercise 1.1.11 The norm function $x \mapsto \|x\|$ is continuous.
Convention: Unless specified otherwise in the sequel, the symbols $E$,
$F$, etc. will stand for Banach spaces.
With these preliminaries over, we start our study of differential calculus.

Differentiability
Definition 1.1.12 Let $f : (a, b) \to \mathbb{R}$ be a function. We say that $f$ is
differentiable at $x$ if the limit
\[ \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \]
exists. If the limit exists it is usually denoted by $f'(x)$ and called the
derivative of $f$ at $x$. We can reformulate this using Landau's notation.
Let $(X, d)$ be a metric space and $a \in X$. If $f, g : X \setminus \{a\} \to \mathbb{C}$ are
functions, we say that
\[ f = o(g) \text{ as } x \to a \quad \text{if} \quad \lim_{x \to a} \frac{f(x)}{g(x)} = 0. \]
We read $f = o(g)$ as $f$ is little "oh" of $g$. Thus, $f$ is differentiable at $x$
if and only if there exists a real number $f'(x)$ such that
\[ f(x + h) - f(x) = f'(x) h + o(|h|). \tag{1.1.2} \]
What is so important about this formulation? It tells us that the
derivative of $f$ at $x$ can be thought of as the linear map from $\mathbb{R}$ to $\mathbb{R}$
given by $h \mapsto f'(x) h$ and that the affine (linear) map $h \mapsto f(x) + f'(x) h$
is a good approximation to $f$ near $x$ in the sense that the error
$f(x + h) - (f(x) + f'(x) h)$ goes to zero much faster than the increment $h$ in $x$.
Thus a complicated function $f$ can be described (at a point) by a linear function
(such as $h \mapsto f'(x) h$), a very nice function next only to constants!
The basic idea of differential calculus is to study the local behaviour of
a function at a point by looking at its first order (linear) approximation
at the same point.
In case the reader finds this vague we suggest that he bear with us
for some more time, after which he will understand the meaning of this
sentence. Geometrically, if we think of $f$ as its graph
\[ \{(x, f(x)) : x \in \text{domain of } f\} \subset \mathbb{R}^2, \]
then the derivative of $f$ at $x$ gives the slope of the tangent line at $(x, f(x))$
to the curve, that is, "the line corresponding to the linear map above",
which is the best approximation at that point. Now the formulation (1.1.2)
of the derivative of a function $f$ is easily adapted to maps $f : U \subset E \to F$
where $E$ and $F$ are real Banach spaces and $U$ is an open subset of $E$. To
understand most of what follows, you may assume that $E = \mathbb{R}^m$ and
$F = \mathbb{R}^n$.
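To get a feeling for what (1.1.2) says, one may tabulate the quotient of the error by $|h|$ for a concrete function. The sketch below is only an illustration under our own assumptions: we take $f = \sin$ and $x = 1$, and the names `relative_error`, `x0` and `ratios` are ours. The point to observe is that the error of the affine approximation, even after division by $|h|$, still tends to zero.

```python
import math

def f(x):
    return math.sin(x)

def f_prime(x):
    return math.cos(x)

def relative_error(x, h):
    """|f(x+h) - (f(x) + f'(x) h)| / |h|: the error of the affine
    approximation, measured against the size of the increment h."""
    return abs(f(x + h) - (f(x) + f_prime(x) * h)) / abs(h)

x0 = 1.0
# Shrink h through 1e-1, 1e-2, ..., 1e-5 and record error / |h|.
ratios = [relative_error(x0, 10.0 ** (-k)) for k in range(1, 6)]
for k, r in enumerate(ratios, start=1):
    print(f"h = 1e-{k}: error/|h| = {r:.3e}")
```

Each tenfold decrease in $h$ decreases the printed quotient roughly tenfold, which is the numerical face of the statement that the error is $o(|h|)$.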
We want to imitate (1.1.2) in formulating the concept of derivative
of $f$: "$f(x + h) - f(x) = A(h) + o(h)$", for $x \in U$ and $h$ in a sufficiently
small neighbourhood $W$ of $0$ so that
\[ x + W := \{x + w : w \in W\} \subset U. \]
Now, clearly, we wish $A$ to be a linear map approximating $f$ at $x$. Since
$f(x + h) - f(x) \in F$ and $h \in E$, we see that $A$ must be a map from $E$
to $F$. If $E$ and $F$ are finite dimensional then no further condition on $A$
need be imposed. However if one of them is infinite dimensional (to be
precise, if $E$ is infinite dimensional) then we require $A$ to be continuous
(and linear).
Definition 1.1.13 With the above notation, we say that $f$ is (Fréchet)
differentiable at $x$ if there exists a continuous linear map $A : E \to F$ such
that
\[ f(x + h) - f(x) = A(h) + o(\|h\|) \tag{1.1.3} \]
holds for all $h \in E$ with sufficiently small norm. Or equivalently,
\[ \lim_{h \to 0} \frac{\|f(x + h) - f(x) - Ah\|}{\|h\|} = 0. \tag{1.1.4} \]
In $\varepsilon$-$\delta$ notation this can be phrased as follows: $f$ is differentiable at
$x$ if there exists a continuous linear map $A : E \to F$ such that for any
given $\varepsilon > 0$, there exists a $\delta > 0$ satisfying the following inequality:
\[ \|f(x + h) - f(x) - Ah\| \le \varepsilon \|h\| \text{ for all } h \text{ with } \|h\| < \delta. \tag{1.1.5} \]
If $f$ is differentiable and if $B$ satisfies (1.1.5) with $B$ replacing $A$ then
$A = B$, so that such an $A$ is unique. (This is Exercise 1.1.14. You need
the domain $U$ to be open.) We call $A$ the Fréchet derivative of $f$ at $x$,
and denote it either by $f'(x)$ or by $Df(x)$.
We say that $f$ is differentiable on $U$ if and only if it is differentiable
at every $x \in U$.
Exercise 1.1.14 Prove that if $f$ is differentiable at $x$, then $A$ defined
by (1.1.3) is unique. Hint: Let $A$ and $B$ both do the job. For any unit
vector $v$, consider
\[ \|A(tv) - B(tv)\| \le \|f(x + tv) - f(x) - A(tv)\| + \|f(x + tv) - f(x) - B(tv)\| \le 2 \varepsilon |t| \]
for all $t$ with $|t| < \delta$, where $\delta$ is chosen as in (1.1.5) so as to work for both $A$ and $B$.
Remark 1.1.15 Notice that we make crucial use of the fact that $U$ is
open in Exercise 1.1.14.
Exercise 1.1.16 If f is differentiable at x then it is continuous at x.
We look at some examples.
Example 1.1.17 Let $f := A : E \to F$ be a continuous and linear map.
Since it is already linear it is its own best linear approximation at any
point of $E$ and hence we should expect $Df(x) = A$ for all $x \in E$. We
shall convince ourselves of this:
\[ f(x + h) - f(x) = f(x) + f(h) - f(x) = f(h), \]
since $f$ is linear. Hence we can take $Df(x)(h) = f(h)$, so that $o(\|h\|) = 0$
in this situation. If $f$ is linear but not continuous, is $f$ differentiable?
Exercise 1.1.18 If $f$ is a constant map what is $Df(x)$?
Exercise 1.1.19 Can you define the second derivative $f''(x)$ of a function?
What is $f''(x)$ for $f$ as defined in Example 1.1.17?
Exercise 1.1.20 Find $Df(x)$ where $f : E \to F$ is given by
\[ f(x) := Ax + v. \]
Here $A : E \to F$ is a continuous linear map and $v \in F$ is a fixed vector.
Example 1.1.21 Let $E := M(n, \mathbb{R}) := \{X = (x_{ij}) : x_{ij} \in \mathbb{R},\ 1 \le i, j \le n\}$,
the set of all real $n \times n$ matrices. Then $E$ is a finite dimensional
vector space over $\mathbb{R}$. In fact, the map
\[ X \mapsto (x_{11}, \ldots, x_{1n}, x_{21}, \ldots, x_{2n}, \ldots, x_{n1}, \ldots, x_{nn}) \in \mathbb{R}^{n^2} \]
establishes a vector space isomorphism. $E$ is a Banach space with respect
to the operator norm:
\[ \|X\| := \sup\{\|Xv\| : v \in \mathbb{R}^n \text{ with } \|v\| = 1\}. \]
We now consider the map $f : E \to E$ given by $f(X) := X^2$. Here
$X^2 = X \cdot X$ is the matrix multiplication. We wish to show that this
map is differentiable on all of $E$ and compute its derivative. Let $H \in E$.
We have
\begin{align*}
f(X + H) - f(X) &= (X + H)^2 - X^2 \\
&= X^2 + XH + HX + H^2 - X^2 \\
&= (XH + HX) + H^2.
\end{align*}
Since $\|H^2\| \le \|H\|^2$, we get
\[ \|f(X + H) - f(X) - (XH + HX)\| = o(\|H\|). \]
So if we define $Df(X)(H) = XH + HX$, then $H \mapsto XH + HX$ is
continuous and linear, and $Df(X)$ satisfies (1.1.3).
Exercise 1.1.22 Can you generalize Example 1.1.21 (i) to infinite
dimensional spaces, and (ii) to higher powers?
Remark 1.1.23 If $E$ and $F$ are Banach spaces, it is easy to show that
their direct sum $E \oplus F$ is a Banach space under the norm
$\|(x, y)\| := \max\{\|x\|, \|y\|\}$ where $x \in E$ and $y \in F$. We also write
$E \times F$ for $E \oplus F$.
Before going further, we want to prove
Proposition 1.1.24 Let $f : U \subset E \to F \times G$ be given. For $i = 1, 2$ let
$p_i$ be the projection of $F \times G$ onto its $i$-th factor. Let $f_i(x) = p_i \circ f(x)$
so that $f(x) = (f_1(x), f_2(x)) \in F \times G$. Then $f$ is differentiable at $x$ if
and only if each $f_i$ is differentiable at $x$, and we have
\[ Df(x)v = (Df_1(x)v, Df_2(x)v), \quad v \in E. \]
Proof Let $f$ be differentiable at $x$ with derivative $A := Df(x)$. We
use the max norm on the product: $\|(v, w)\| = \max\{\|v\|, \|w\|\}$. Let
$A_i = p_i \circ A$. We now have
\begin{align*}
\|f_i(x + h) - f_i(x) - A_i(h)\| &\le \max_i \{\|f_i(x + h) - f_i(x) - A_i h\|\} \\
&= \|f(x + h) - f(x) - A(h)\| \\
&= o(\|h\|).
\end{align*}
From this it follows that $f_i$ is differentiable at $x$ and its derivative is
$p_i \circ Df(x)$. We can similarly prove the converse.
□
We now look at two special cases: the domain is an interval in $\mathbb{R}$ or
the codomain is $\mathbb{R}$.
Example 1.1.25 Let $f : (a, b) \subset \mathbb{R} \to E$ be a differentiable map. Here
the derivative $Df(x) : \mathbb{R} \to E$ is determined as soon as we know the
vector $Df(x)(1) \in E$, as $Df(x)$ is linear. In this case there is a geometric
interpretation of the vector $Df(x)(1)$. First of all, the above map $f$ may
be considered as a curve $\gamma : (a, b) \to E$ in $E$. Then we define the tangent
vector of the curve $\gamma$ at a point $x$ (or less precisely, at $\gamma(x) \in E$) to be
the vector
\[ \gamma'(x) := \lim_{h \to 0} \frac{\gamma(x + h) - \gamma(x)}{h}, \]
when the limit exists. It is easy to see that under our assumptions one
has $D\gamma(x)(1) = \gamma'(x)$. For our convenience we have denoted $f$ by $\gamma$,
and hence this is the same as saying that $f'(x) = Df(x)(1)$. Thus when the
domain is a subset of the real line it is useful to think of a differentiable
map as a differentiable curve and of the value $Df(x)(1)$ of the Fréchet
derivative at the special point $1$ as the tangent vector to the curve at
the point $f(x)$. We shall see later that this geometric interpretation is quite
useful and that it throws more light on the domain of the derivative.
(See Section 1.3 on directional derivatives.)
We now specialize to the case when $E = \mathbb{R}^n$. Let $f : (a, b) \to \mathbb{R}^n$
be differentiable on $(a, b)$. Using the standard basis, we write
$f(t) = \sum_i f_i(t) e_i$ or as a column vector:
\[ f(t) := \begin{pmatrix} f_1(t) \\ \vdots \\ f_n(t) \end{pmatrix}. \]
Then, for any $t \in (a, b)$, the derivative $Df(t)$ is a linear map from $\mathbb{R}$ to
$\mathbb{R}^n$. We claim that $Df(t)(h) = h (f_1'(t), \ldots, f_n'(t))^t$. Indeed,
\[ \|f(t + h) - f(t) - h (f_1'(t), \ldots, f_n'(t))^t\| = \max_i \{|f_i(t + h) - f_i(t) - f_i'(t) h|\} = o(h), \]
by Proposition 1.1.24. Note that the matrix representation $J(Df(t))$ of
the linear map $Df(t)$ with respect to the standard bases is then given
by
\[ J(Df(t)) = \begin{pmatrix} f_1'(t) \\ \vdots \\ f_n'(t) \end{pmatrix}. \]
We usually think of such an $f$ as a curve in $\mathbb{R}^n$ and call the vector
$Df(t)(1) = (f_1'(t), \ldots, f_n'(t))^t$ the tangent vector to the curve at the
point $t$ or $f(t)$ and denote it by $f'(t)$. More on this later. Such a map $f$
has a physical interpretation which is worth knowing. We think of the
vector $f(t)$ as the position vector of a particle at time $t$. Then $f'(t)$ is
thought of as the velocity vector of the movement of the particle.
Before considering the second special case, we recall a fact from linear
algebra in the form of an exercise.
Exercise 1.1.26 Let $\varphi : \mathbb{R}^n \to \mathbb{R}$ be linear. Then $\varphi(x) = \langle x, v \rangle$ where
$v = (\varphi(e_1), \ldots, \varphi(e_n))$ and $e_j$ is the standard $j$-th basis vector.
Example 1.1.27 Let $f : U \subset \mathbb{R}^n \to \mathbb{R}$ be differentiable at $x$. Then the
derivative $Df(x)$ is a linear map from $\mathbb{R}^n$ to $\mathbb{R}$. By Exercise 1.1.26,
there exists a unique vector $u \in \mathbb{R}^n$, denoted by $\operatorname{grad} f(x)$, such that
\[ Df(x)(v) = \langle v, u \rangle = \langle v, \operatorname{grad} f(x) \rangle, \]
where the inner product is the Euclidean inner product. This vector
$\operatorname{grad} f(x)$ is called the gradient of $f$ at $x$. We shall identify this vector
later. (See Lemma 1.3.4.)
Example 1.1.28 Let $E$, $F$, $G$ be real Banach spaces and let $f : E \times F \to G$
be a continuous bilinear map. Let $x, h \in E$ and $y, k \in F$. Then we have
\begin{align*}
f(x + h, y + k) - f(x, y) &= f(x, y) + f(x, k) + f(h, y) + f(h, k) - f(x, y) \\
&= f(x, k) + f(h, y) + f(h, k) \\
&= f(x, k) + f(h, y) + o(\|(h, k)\|).
\end{align*}
Thus we find that $Df(x, y)(h, k) = f(x, k) + f(h, y)$. Notice that the
right side is continuous and linear in $(h, k)$ on the space $E \times F$.
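Example 1.1.29 The above easily generalizes to continuous multilinear
maps on $E_1 \times \cdots \times E_n$ to another Banach space $F$. If $f$ is such a map
then one has:
\[ Df(x_1, \ldots, x_n)(h_1, \ldots, h_n) = \sum_{i=1}^n f(x_1, \ldots, x_{i-1}, h_i, x_{i+1}, \ldots, x_n). \]
Work this out immediately as it is needed below.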
We specialize Example 1.1.29 to a situation which may be familiar
to the reader.
Example 1.1.30 Let $E_i = \mathbb{R}^n$ for $1 \le i \le n$ and $x_i \in E_i$. Then the
determinant function
\[ f := \det : E_1 \times \cdots \times E_n \to \mathbb{R} \]
is defined by $\det(x_1, \ldots, x_n) := \det(x_{ij})$ where $x_i = \sum_j x_{ij} e_j$. Recall
that $\det$ is an alternating multilinear function of its variables:
\[ \det(x_1, \ldots, x_n) = \sum_{\sigma \in S_n} \operatorname{sign}(\sigma)\, x_{1\sigma(1)} \cdots x_{n\sigma(n)} \tag{1.1.6} \]
for $\sigma$ a permutation of $\{1, \ldots, n\}$. Using Example 1.1.29 we find the
familiar rule for the differentiation of determinants:
\[ D\det(x_1, \ldots, x_n)(h_1, \ldots, h_n) = \sum_{i=1}^n \det(x_1, \ldots, x_{i-1}, h_i, x_{i+1}, \ldots, x_n). \tag{1.1.7} \]
The reader is strongly urged to check this. The reader might have seen
this when the entries are differentiable functions $f_{ij} : \mathbb{R} \to \mathbb{R}$ in the
following form: if $F := \det(f_{ij}) := \det(F_1, \ldots, F_n)$ then
\[ DF = \sum_i \det(F_1, \ldots, F_i', \ldots, F_n), \]
in an obvious notation.
We can reformulate the above in a different set-up. As in Example 1.1.21
we can identify $\mathbb{R}^n \times \cdots \times \mathbb{R}^n$ ($n$-times) with $M(n, \mathbb{R})$ so
that the map in (1.1.6) is $f : M(n, \mathbb{R}) \to \mathbb{R}$ given by $f(X) = \det(X)$.
The tuple $(e_1, \ldots, e_n)$ of the canonical basis vectors goes under this
identification to $I$, the identity matrix. Now what is $f'(I)(H)$? Let
$H := (h_{ij}) \in M(n, \mathbb{R})$. We use (1.1.7) where $h_i = (h_{i1}, \ldots, h_{in}) \in E_i = \mathbb{R}^n$
and $x_i = e_i$. We then get
\[ \det(x_1, \ldots, h_i, \ldots, x_n) = \det \begin{pmatrix} 1 & 0 & \cdots & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 & \cdots & 0 \\ \vdots & & \ddots & & & \vdots \\ h_{i1} & h_{i2} & \cdots & h_{ii} & \cdots & h_{in} \\ \vdots & & & & \ddots & \vdots \\ 0 & 0 & \cdots & 0 & \cdots & 1 \end{pmatrix} = h_{ii}. \]
Thus we have
\[ Df(I)(H) = \sum_i h_{ii} = \operatorname{tr}(H). \tag{1.1.8} \]
We shall return to this example later.
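The formula (1.1.8) says that $\det(I + tH) = 1 + t \operatorname{tr}(H) + o(t)$, which is easy to test. The sketch below is an illustration under our own assumptions: the determinant is computed directly from the permutation expansion (1.1.6), which is adequate only for very small $n$, and the matrix `H` and all function names are our own choices.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation of 0..n-1, computed by counting inversions."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det(X):
    """Determinant via the permutation expansion (1.1.6)."""
    n = len(X)
    total = 0.0
    for p in permutations(range(n)):
        term = float(sign(p))
        for i in range(n):
            term *= X[i][p[i]]
        total += term
    return total

def trace(H):
    return sum(H[i][i] for i in range(len(H)))

def identity_plus(t, H):
    """The matrix I + tH."""
    n = len(H)
    return [[(1.0 if i == j else 0.0) + t * H[i][j] for j in range(n)]
            for i in range(n)]

H = [[2.0, -1.0, 0.5],
     [0.0, 3.0, 1.0],
     [4.0, 1.0, -1.5]]   # illustrative choice

# Difference quotients (det(I + tH) - det(I)) / t for shrinking t.
quotients = [(det(identity_plus(10.0 ** (-k), H)) - 1.0) / 10.0 ** (-k)
             for k in range(2, 7)]
for k, q in zip(range(2, 7), quotients):
    print(f"t = 1e-{k}: quotient = {q:.6f}, tr(H) = {trace(H)}")
```

The quotients approach $\operatorname{tr}(H)$ as $t$ decreases, in agreement with (1.1.8).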

Example 1.1.31 We shall now consider
\[ U := \{A \in BL(E) : A^{-1} \text{ exists},\ A^{-1} \in BL(E)\}, \]
the set of all invertible continuous linear operators on a Banach space
$E$. We claim that $U$ is an open subset of $BL(E)$, the Banach space of
all bounded (that is, continuous) linear operators on $E$. (The norm on
$BL(E)$ is the operator norm; see Exercise 1.1.8.) We first observe that
if $E$ is finite dimensional, say $E = \mathbb{R}^n$, then we have
\begin{align*}
U = GL(n, \mathbb{R}) &= \{A \in M(n, \mathbb{R}) : A^{-1} \text{ exists}\} \\
&= \{A \in M(n, \mathbb{R}) : \det A \ne 0\} \\
&= \det{}^{-1}(\mathbb{R} \setminus \{0\}).
\end{align*}
Since $\det$ is a polynomial function in the entries $x_{ij}$, it is continuous.
Hence $U$ is open in $M(n, \mathbb{R}) \cong \mathbb{R}^{n^2}$. We shall give a different proof of
this fact which works in all cases.
Let $A \in U$. To show that $U$ is open we must exhibit an open set $W$
in $BL(E)$ containing $0$ such that $A + W \subset U$. Let $H \in BL(E)$. We shall try
to find $(A + H)^{-1}$ formally:
\[ (A + H)^{-1} = (A(I + A^{-1}H))^{-1} = (I + A^{-1}H)^{-1} A^{-1}. \]
Now $(I + A^{-1}H)^{-1}$ looks like $(1 + x)^{-1}$, which has the binomial expansion
$\sum_k (-1)^k x^k$, provided that $|x| < 1$. This suggests that we consider the
series
\[ \sum_k (-1)^k (A^{-1}H)^k = I - (A^{-1}H) + (A^{-1}H)^2 - \cdots \tag{1.1.9} \]
The series of operator norms
\[ \|I\| + \|A^{-1}H\| + \|(A^{-1}H)^2\| + \cdots \]
is convergent if $\|A^{-1}H\| < 1$ (since $\|AB\| \le \|A\| \|B\|$ for
all $A, B \in BL(E)$ by Exercise 1.1.8). The above condition is achieved if
$\|H\| < \|A^{-1}\|^{-1}$. Hence the series in (1.1.9) is norm convergent in $BL(E)$
to an element $B$ which is the inverse of $(I + A^{-1}H)$. (See Exercise 1.1.32
below.) Hence $C = BA^{-1}$ is such that
\[ (A + H)C = I = C(A + H). \]
Thus if we take
\[ W := \{H \in BL(E) : \|H\| < \|A^{-1}\|^{-1}\}, \]
then $A + W \subset U$, that is, $U$ is open.
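The construction of $(A + H)^{-1}$ by the series (1.1.9) can be carried out explicitly for matrices. The following sketch is an illustration only, under assumptions of our own: $E = \mathbb{R}^2$ carries the max norm, so that the operator norm of a matrix is its largest absolute row sum; the matrices `A` and `H` and the number of terms are arbitrary choices; and all function names are hypothetical.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(A, B):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(A, B)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def row_sum_norm(A):
    """Operator norm of A on (R^n, max norm): the largest absolute row sum."""
    return max(sum(abs(t) for t in row) for row in A)

def neumann_inverse(A_inv, H, terms=60):
    """Approximate (A + H)^{-1} by C = B A^{-1}, where B is a partial sum
    of the series (1.1.9) in powers of M = A^{-1} H."""
    n = len(H)
    M = mat_mul(A_inv, H)
    term = identity(n)              # the k = 0 term
    B = identity(n)
    for _ in range(terms):
        # Multiply by -M to pass from (-1)^k M^k to (-1)^(k+1) M^(k+1).
        term = [[-t for t in row] for row in mat_mul(term, M)]
        B = mat_add(B, term)
    return mat_mul(B, A_inv)

A = [[2.0, 1.0], [1.0, 3.0]]
A_inv = [[0.6, -0.2], [-0.2, 0.4]]   # inverse of A, since det A = 5
H = [[0.1, -0.2], [0.05, 0.1]]       # small perturbation (illustrative choice)

C = neumann_inverse(A_inv, H)
product = mat_mul(mat_add(A, H), C)
residual = max(abs(product[i][j] - (1.0 if i == j else 0.0))
               for i in range(2) for j in range(2))
print(row_sum_norm(H) < 1.0 / row_sum_norm(A_inv), residual)
```

Here $\|H\| = 0.3$ and $\|A^{-1}\|^{-1} = 1.25$, so $H$ lies in the set $W$ of the text and the partial sums converge geometrically.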

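Exercise 1.1.32 Let $E$ be a Banach space and $x_k \in E$. If $\sum_k \|x_k\| < \infty$,
then the sequence $s_n := \sum_{k=1}^n x_k$ converges to an element $x \in E$ and we
then write $x := \sum_k x_k$.
We now consider the map $f : U \to U$ given by $A \mapsto A^{-1}$. We want
to show that $f$ is differentiable and compute its derivative. Let $A \in U$
and $H \in BL(E)$. We have
\begin{align*}
f(A + H) - f(A) &= (A + H)^{-1} - A^{-1} \\
&= (I - (A^{-1}H) + o(\|H\|)) A^{-1} - A^{-1} \\
&= -A^{-1} H A^{-1} + o(\|H\|).
\end{align*}
(Exercise: Justify the above set of equations.) Hence we find that
\[ Df(A)(H) = -A^{-1} H A^{-1}, \quad \text{that is,} \quad Df(A) = -L_{A^{-1}} \circ R_{A^{-1}}, \]
where $L_B(X) = BX$ is the left multiplication by $B$ on $BL(E)$, and similarly
$R_B(X) = XB$ is the right multiplication by $B$.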
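The derivative of the inversion map can be tested numerically in the same spirit. The sketch below is merely an illustration with $2 \times 2$ matrices: the closed-form inverse `inverse_2x2`, the matrices `A` and `H`, and the use of the largest absolute entry as norm are our own assumptions.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_lin(a, A, b, B):
    """Entrywise linear combination a*A + b*B."""
    return [[a * p + b * q for p, q in zip(ra, rb)] for ra, rb in zip(A, B)]

def inverse_2x2(A):
    """Closed-form inverse of an invertible 2 x 2 matrix."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def max_abs(A):
    """Largest absolute entry, used here as a norm on M(2, R)."""
    return max(abs(t) for row in A for t in row)

A = [[2.0, 1.0], [1.0, 3.0]]
H = [[0.1, -0.2], [0.05, 0.1]]
A_inv = inverse_2x2(A)

ratios = []
for k in range(1, 6):
    t = 10.0 ** (-k)
    tH = mat_lin(t, H, 0.0, H)
    increment = mat_lin(1.0, inverse_2x2(mat_lin(1.0, A, 1.0, tH)), -1.0, A_inv)
    # Candidate derivative applied to tH: -A^{-1} (tH) A^{-1}.
    linear_part = mat_lin(-1.0, mat_mul(mat_mul(A_inv, tH), A_inv), 0.0, A)
    remainder = mat_lin(1.0, increment, -1.0, linear_part)
    ratios.append(max_abs(remainder) / max_abs(tH))
    print(f"t = 1e-{k}: |remainder| / |tH| = {ratios[-1]:.3e}")
```

The quotients decrease in proportion to $t$, consistent with the remainder being of second order in $H$.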
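Example 1.1.33 In all the previous examples and definitions we worked
with real Banach spaces and real linear maps. We wish to consider
$f : U \subset \mathbb{C} \to \mathbb{C}$, a complex analytic (that is, holomorphic) function.
We say that $f$ as above is holomorphic on $U$ if for all $z \in U$ there is
$f'(z) \in \mathbb{C}$ such that $\lim_{h \to 0} \frac{f(z + h) - f(z)}{h} = f'(z)$. In other words, for all
$z \in U$ there exists $f'(z) \in \mathbb{C}$ such that
\[ f(z + h) - f(z) = f'(z) h + o(|h|). \tag{1.1.10} \]
We consider $f$ as a map from $U \subset \mathbb{R}^2$ to $\mathbb{R}^2$. Let us denote this map as
$F$ so as to avoid confusion. Now we ask: Does $DF(x, y)$ exist and what
is it at $(x, y) \leftrightarrow x + iy := z \in \mathbb{C}$? Notice that (1.1.10) implies that the
$\mathbb{C}$-linear map $h \mapsto f'(z) h$ is a $\mathbb{C}$-linear approximation to $f$ at $z$. Hence
we must expect that the map $h \mapsto f'(z) h$, which is a fortiori $\mathbb{R}$-linear,
"is $DF(x, y)$". We shall explain this in detail.
Let $f'(z) = a + ib$, and $h = h_1 + i h_2$. Then we have
\[ f'(z) h = (a + ib)(h_1 + i h_2) = (a h_1 - b h_2) + i (b h_1 + a h_2). \]
Thus the $\mathbb{C}$-linear map from $\mathbb{C}$ to $\mathbb{C}$ given by $h \mapsto f'(z) h$ has as its
underlying $\mathbb{R}$-linear map, the map
\[ h_1 + i h_2 \mapsto (a h_1 - b h_2) + i (b h_1 + a h_2). \]
Thus the underlying real linear map is the map
\[ \begin{pmatrix} h_1 \\ h_2 \end{pmatrix} \mapsto \begin{pmatrix} a h_1 - b h_2 \\ b h_1 + a h_2 \end{pmatrix}. \]
Thus we would like to claim that $DF(x, y) = \begin{pmatrix} a & -b \\ b & a \end{pmatrix}$. If we write
$f = u + iv$ in the standard notation, then
\[ DF(x, y) = \begin{pmatrix} \frac{\partial u}{\partial x} & \frac{\partial u}{\partial y} \\ \frac{\partial v}{\partial x} & \frac{\partial v}{\partial y} \end{pmatrix} \text{ at } (x, y). \]
(See (1.3.3).) Since $f$ is holomorphic we have the Cauchy-Riemann
equations
\[ \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y} \quad \text{and} \quad \frac{\partial v}{\partial y} = \frac{\partial u}{\partial x}. \]
Hence we have
\[ DF(x, y) = \begin{pmatrix} \frac{\partial u}{\partial x} & \frac{\partial u}{\partial y} \\ -\frac{\partial u}{\partial y} & \frac{\partial u}{\partial x} \end{pmatrix}. \]
But then $\frac{\partial u}{\partial x} = a$ and $\frac{\partial u}{\partial y} = -b$. This implies that
$f'(z) = \frac{\partial u}{\partial x} - i \frac{\partial u}{\partial y}$ at
$z$, as claimed. Thus we have shown that $DF(x, y)$ is the $\mathbb{R}$-linear map
underlying the $\mathbb{C}$-linear map $\zeta \mapsto f'(z) \zeta$.
On first reading the reader may skip the following and go
to the next example.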
Conversely, let F: U c]R2 -t]R2 be differentiable, that is, DF(x,y)

exists on U. Then the corresponding map J: U C C -t C given by
J(z) = u(x, y) + iv(x, y) for z = x + iy
is complex analytic if the ]R-linear maps DF(x, y) are "C-linear", for all
(x,y) E U.
The last statement requires some explanation. Given]R2, how do we
"recognize" it as C? ]R2 can be recognized as a one-dimensional vector
space over C if we endow ]R2 with a C-multiplication by i := ;=I, a
fixed square root of -1. That is,
j :x + iy H i(x + iy) = -y + ix
is the scalar multiplication by i on C. This is a C-linear map and corre-

sponds to the ]R-linear map J: ]R2 -t 1R2 given by
Now viewing ]R2 as a C-vector space is determined as soon as we know

(]R2, J) for the following reason: Define
(a + ib)(x + iy) = "a(x, y)+ J(b(x, y))"

= "(ax, ay) + (-by, bx)"
= "(ax - by, ay + bx)"
= (ax - by) + i(ay + bx).
What happens is this. Secretly, we think of (x, y) as x + iy so that
(a + ib)(x + iy) = (ax - by) + i(ay + bx),

which corresponds to (ax-by, ay+bx). Thus the multiplication by i cor-
responds to the ]R-linear map J. Now given an ]R-linear map A: ]R2 -+ ]R2
it corresponds to a C-linear map if and only if "A(iz) = iA(z)", that is,
if and only if A 0 J(x, y) = J 0 A(x, y) for all (x, y) E ]R2. That is, if and
only if A 0 J = J 0 A. Let us record our finding as
Lemma 1.1.34 An lR-linear map A: 1R2 -+ ]R2 is C-linear if and only

if Ao J = J 0 A.
o
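If you have really followed us up to this point the reward is near. We
apply Lemma 1.1.34 to $F\colon \mathbb{R}^2 \to \mathbb{R}^2$ to conclude:
\[ DF(x, y) = \begin{pmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\[1ex] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{pmatrix} \]
is $\mathbb{C}$-linear if and only if $DF(x, y) \circ J = J \circ DF(x, y)$. This means that
\[ \begin{pmatrix} \dfrac{\partial u}{\partial y} & -\dfrac{\partial u}{\partial x} \\[1ex] \dfrac{\partial v}{\partial y} & -\dfrac{\partial v}{\partial x} \end{pmatrix} = \begin{pmatrix} -\dfrac{\partial v}{\partial x} & -\dfrac{\partial v}{\partial y} \\[1ex] \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \end{pmatrix}. \]
That is, if and only if
\[ \frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} \quad\text{and}\quad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}. \]
That is, $F = (u, v)$ is holomorphic if and only if $F$ satisfies the Cauchy-Riemann
equations. We summarize all these in the following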
Theorem 1.1.35 Let $F\colon U \subset \mathbb{R}^2 \to \mathbb{R}^2$ be differentiable. Let us write
the function $F$ as $F(x, y) = (u(x, y), v(x, y))$ and set
\[ f(z) = u(x, y) + iv(x, y) \quad\text{for } z := x + iy. \]
Then $f$ is holomorphic on $U$ if and only if the Fréchet derivative $DF(x, y)$
is $\mathbb{C}$-linear for all $(x,y) \in U$. □
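The criterion "$DF(x,y)$ commutes with $J$" can be tried out numerically. The following is only a sketch under stated assumptions: the two sample maps (`square` for $z \mapsto z^2$ and `conjugate` for $z \mapsto \bar z$), the finite-difference step and the tolerance are illustrative choices, not part of the text.

```python
# Sketch: test C-linearity of a numerically computed DF(x, y) by checking
# that it commutes with J = [[0, -1], [1, 0]] (multiplication by i).

J = [[0.0, -1.0], [1.0, 0.0]]


def matmul2(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def jacobian(F, x, y, h=1e-6):
    """Central-difference approximation of DF(x, y) for F(x, y) = (u, v)."""
    ux, vx = [(p - m) / (2 * h) for p, m in zip(F(x + h, y), F(x - h, y))]
    uy, vy = [(p - m) / (2 * h) for p, m in zip(F(x, y + h), F(x, y - h))]
    return [[ux, uy], [vx, vy]]


def is_c_linear(A, tol=1e-6):
    """True if A J = J A up to the tolerance tol."""
    AJ, JA = matmul2(A, J), matmul2(J, A)
    return all(abs(AJ[i][j] - JA[i][j]) <= tol
               for i in range(2) for j in range(2))


def square(x, y):
    """z -> z^2 written in real coordinates (holomorphic)."""
    return (x * x - y * y, 2 * x * y)


def conjugate(x, y):
    """z -> conj(z) written in real coordinates (not holomorphic)."""
    return (x, -y)


if __name__ == "__main__":
    print(is_c_linear(jacobian(square, 0.7, -0.3)))
    print(is_c_linear(jacobian(conjugate, 0.7, -0.3)))
```

The first map satisfies the Cauchy-Riemann equations at every point, while the second fails them everywhere, which is what the commutation test detects.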
The next couple of examples deal with the infinite dimensional set-up.
Example 1.1.36 Let $E$ be a real Hilbert space, that is, a Banach space
whose norm comes from an inner product. Consider $f(x) := \langle x, x\rangle$ for
$x \in E$. What is $Df(x)(h)$ for $h \in E$? What can you say if we take
$g(x) := \langle x, x\rangle^{1/2}$? Investigate what happens when $f$ is as above but $E$
is a Hilbert space over $\mathbb{C}$.
Example 1.1.37 Let $E := C^1_0[0, 1]$ be the space of all real-valued continuously
differentiable functions $f$ on $[0,1]$ such that $f(0) = 0$. Let the
norm be given by:
\[ \|f\| := \sup_{0 \le t \le 1} |f(t)| + \sup_{0 \le t \le 1} |f'(t)|. \]
Then it is easy to see that $E$ is a real Banach space. Let $F := C[0, 1]$ be
the Banach space of continuous functions $g$ on $[0,1]$ with the supremum
norm $\|g\| := \sup_{0 \le t \le 1} |g(t)|$. Consider the map $T\colon E \to F$, given by
$Tf = f' + f^k$, for $k \in \mathbb{N}$. Here $f' := \frac{df}{dt}$. Then $DT(f)(g) = g' + k f^{k-1} g$.
1.2 Chain rule, mean value theorem and applications
Chain rule
Lemma 1.2.1 Let $f\colon U \subset X \to Y$ be differentiable at $a \in U$. Then $f$
is locally Lipschitz at $a$, i.e., there exist $\delta > 0$ and $L > 0$ such that
$\|f(a + h) - f(a)\| \le L \|h\|$ if $\|h\| < \delta$. Hence $f$ is continuous at $a$.
Proof Let $A := Df(a)$ be the derivative of $f$ at $a$. Then, for $\varepsilon = 1$,
there exists a $\delta > 0$ such that if $\|h\| < \delta$ we have
\[ \|f(a + h) - f(a) - Ah\| \le \|h\|. \]
From this it follows that
\[ \|f(a+h)-f(a)\| \le \|Ah\| + \|h\| \le (\|A\|+1)\|h\|. \]
We take $L := \|A\| + 1$. □
Theorem 1.2.2 (Chain rule) Assume that $X$, $Y$ and $Z$ are normed
linear spaces. Let $U \subset X$ and $V \subset Y$ be open and $a \in U$. Assume that
$f\colon U \to V$ and $g\colon V \to Z$ are differentiable at $a$ and $f(a)$ respectively.
Then $g \circ f$ is differentiable at $a$ and
\[ D(g \circ f)(a) = Dg(f(a)) \circ Df(a). \]
Proof Let $b := f(a)$, $A := Df(a)$ and $B := Dg(b)$. We wish to show
that given $\varepsilon > 0$ there exists $\delta > 0$ such that
\[ \|(g \circ f)(a+h) - (g \circ f)(a) - B \circ A(h)\| \le \varepsilon \|h\|, \quad \|h\| < \delta. \tag{1.2.1} \]
We observe that
\begin{align*}
\|(g \circ f)(a + h) - (g \circ f)(a) - B \circ A(h)\|
&\le \|(g \circ f)(a + h) - (g \circ f)(a) - B(f(a + h) - f(a))\| \\
&\quad + \|B[f(a + h) - f(a) - Ah]\|. \tag{1.2.2}
\end{align*}
We estimate each of the terms on the right side of (1.2.2). Since $f$ is
differentiable at $a$, for the given $\varepsilon$ there exists a $\delta_1 > 0$ such that
\[ \|f(a+h)-f(a)-Ah\| \le \frac{\varepsilon}{2(1+\|B\|)} \|h\|, \quad \|h\| < \delta_1. \tag{1.2.3} \]
From this we get
\[ \|B[f(a + h) - f(a) - Ah]\| \le \|B\| \, \|f(a + h) - f(a) - Ah\| \le \frac{\varepsilon}{2}\|h\|, \quad \|h\| < \delta_1. \tag{1.2.4} \]
We estimate the first term of the right side of (1.2.2). Since $g$ is
differentiable at $b$, given $\varepsilon_1 > 0$ there exists $\eta > 0$ such that
\[ \|g(b') - g(b) - B(b' - b)\| \le \varepsilon_1 \|b' - b\|, \quad \|b' - b\| < \eta. \tag{1.2.5} \]
Since $f$ is differentiable at $a$ it is locally Lipschitz at $a$: There exist
$\delta_2 > 0$ and $L > 0$ such that
\[ \|f(a+h)-f(a)\| \le L\|h\|, \quad \|h\| < \delta_2. \tag{1.2.6} \]
If we choose $\delta_2 < \eta / L$, we have
\[ \|f(a+h)-f(a)\| < \eta, \quad \|h\| < \delta_2. \tag{1.2.7} \]
From (1.2.7) and (1.2.5) we get, for $\|h\| < \delta_2$,
\begin{align*}
\|g(f(a + h)) - g(f(a)) - B[f(a + h) - f(a)]\| &\le \varepsilon_1 \|f(a + h) - f(a)\| \\
&\le \varepsilon_1 L \|h\| \\
&\le \varepsilon \|h\| / 2, \tag{1.2.8}
\end{align*}
if we choose $\varepsilon_1 := \varepsilon / (2L)$. Combining (1.2.2), (1.2.4) and (1.2.8) we get
(1.2.1) with $\delta := \min\{\delta_1, \delta_2\}$.
□
We now give some typical applications of the chain rule.
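Application 1. Let $f\colon U \subset \mathbb{R}^n \to \mathbb{R}^m$ be differentiable. If
\[ f := (f_1, \ldots, f_m), \]
then $f_i = p_i \circ f$ where $p_i\colon \mathbb{R}^m \to \mathbb{R}$ is the projection of $\mathbb{R}^m$ onto the $i$-th
factor. Since $p_i$ is linear, it is differentiable and so by chain rule $f_i$ is
differentiable and we have $Df_i(x) = p_i \circ Df(x)$ for all $x \in U$.
Application 2. Let $f\colon \mathbb{R}^n \setminus \{0\} \to \mathbb{R}$ be given by
\[ f(x) = \|x\| := \sqrt{\langle x, x\rangle}. \]
Then $f$ is the composition of the maps $x \mapsto \langle x, x\rangle$ from $\mathbb{R}^n \setminus \{0\}$ to $(0, \infty)$ and
$t \mapsto \sqrt{t}$ from $(0, \infty)$ to $\mathbb{R}$. Hence $f$ is differentiable and its derivative is given
by $Df(x)h = \dfrac{\langle x, h\rangle}{\|x\|}$.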
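As a quick sanity check of Application 2, one can compare the chain-rule value $\langle x, h\rangle / \|x\|$ for the derivative of the Euclidean norm with a symmetric difference quotient. This is a minimal sketch; the function names, the sample point and the step size are assumptions made for illustration.

```python
import math


def norm(x):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(c * c for c in x))


def norm_derivative(x, h):
    """Chain-rule value <x, h> / ||x|| of the derivative of the norm at x != 0."""
    return sum(a * b for a, b in zip(x, h)) / norm(x)


def difference_quotient(x, h, t=1e-6):
    """Symmetric difference quotient (||x + t h|| - ||x - t h||) / (2 t)."""
    xp = [a + t * b for a, b in zip(x, h)]
    xm = [a - t * b for a, b in zip(x, h)]
    return (norm(xp) - norm(xm)) / (2 * t)


if __name__ == "__main__":
    x, h = [3.0, 4.0], [1.0, 2.0]
    print(norm_derivative(x, h), difference_quotient(x, h))
```

For $x = (3, 4)$ and $h = (1, 2)$ the formula gives $(3 + 8)/5 = 2.2$, and the difference quotient agrees to within the discretization error.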
Mean value theorems

Recall that if $c\colon [a, b] \to X$ is any map, we think of it as a curve in $X$
and the vector
\[ c'(t) := \lim_{h \to 0} \frac{c(t + h) - c(t)}{h}, \]
if it exists, is called the tangent vector or the velocity vector to $c$ at
the point $t$ (not at $c(t)$!). Recall also that we have shown that $c$ is
differentiable at $t$ if and only if $c'(t)$ exists and they are related by
$c'(t) = Dc(t)(1)$.
We now prove the single most important result in differential calculus:
Theorem 1.2.3 (Mean value inequality-1) Let $f\colon (a, b) \to E$ be a
differentiable function. Then we have the following inequality:
\[ \|f(y) - f(x)\| \le |y-x| \sup_{0 \le t \le 1} \|f'(x+t(y-x))\| \tag{1.2.9} \]
for all $x, y \in (a,b)$.
Proof Let $M > M_0 := \sup_{0 \le t \le 1} \|f'(x + t(y - x))\|$. Let us consider
the set
\[ S := \{t \in [0,1] : \|f(x + t(y - x)) - f(x)\| \le Mt\,|x - y|\}. \]
Clearly, $0 \in S$ and $S$ is closed since, as functions of $t$, both sides are
continuous. (Recall that by Lemma 1.2.1, $f$ is continuous.) Since $S$
is closed and bounded, $s := \sup S$ exists and lies in $S$. We claim that $s = 1$.
Suppose $s < 1$ and let $t > s$ be sufficiently near to $s$. We then have
\begin{align*}
\|f(x+t(y-x))-f(x)\| &\le \|f(x+t(y-x))-f(x+s(y-x))\| \\
&\quad + \|f(x + s(y - x)) - f(x)\|. \tag{1.2.10}
\end{align*}
Since $f$ is differentiable on $[x, y]$, given $\varepsilon > 0$ with $\varepsilon < M - M_0$, for all
$t$ near $s$ we have:
\[ \|f(x + t(y - x)) - f(x + s(y - x)) - f'(x + s(y - x))(t - s)(y - x)\| \le \varepsilon\,|t - s|\,|y - x|. \]
Hence
\begin{align*}
\|f(x + t(y - x)) - f(x + s(y - x))\|
&\le \|f'(x + s(y - x))(t - s)(y - x)\| + \varepsilon\,|t - s|\,|y - x| \\
&\le M_0\,|t - s|\,|y - x| + \varepsilon\,|t - s|\,|y - x| \\
&\le M\,|t - s|\,|y - x|, \tag{1.2.11}
\end{align*}
for all $t$ near $s$. Using this in (1.2.10), for $t > s$ and $t$ near $s$ we get
\[ \|f(x + t(y - x)) - f(x)\| \le M(t - s)\,|y - x| + Ms\,|y - x| = Mt\,|y-x|. \]
This implies that $t \in S$, which contradicts $s = \sup S$. Hence $s = 1$. In particular,
for $s = 1$ we must have
\[ \|f(x+(y-x))-f(x)\| = \|f(y)-f(x)\| \le M\,|y-x|, \]
for any $M > M_0$. Hence the result follows.
□
The following extension of the above result is worth noting:
Corollary 1.2.4 If $f$ is continuous on $[x, y]$ and is differentiable on
$(x,y)$, we obtain (1.2.9) with supremum taken over $(0,1)$ as a limit of
(1.2.9) applied to smaller closed intervals:
\[ \|f(y)-f(x)\| \le |y-x| \sup_{0<t<1} \|f'(x+t(y-x))\|. \tag{1.2.12} \]
Corollary 1.2.5 Let $v \in E$. Applying the mean value inequality to the
function $g(x) := f(x) - xv$, we get
\[ \|f(y)-f(x)-v(y-x)\| \le |y-x| \sup_{0<t<1} \|f'(x+t(y-x))-v\|. \tag{1.2.13} \]
The inequality (1.2.13) is quite useful when we take $v = Df(x)(1) = f'(x)$.
For $x, y \in E$, a vector space, we denote by $[x, y]$ the line segment
joining $x$ and $y$:
\[ [x,y] := \{x+t(y - x) = (1- t)x+ty : 0 \le t \le 1\}. \]
Notice that $t \mapsto \gamma(t) := x + t(y - x)$ is the curve through the point
$x = \gamma(0)$ and having $\gamma'(0) = y - x$ as its tangent at $x$. We are now
ready to prove the general version of the mean value inequality.
Theorem 1.2.6 (Mean value inequality-2) Let $f\colon U \subset E \to F$ be
differentiable on the open set $U$. Let $x, y \in U$ be such that $[x, y] \subset U$. Let
$T \in BL(E, F)$. Then (1.2.13) yields
\[ \|f(y)-f(x)-T(y-x)\| \le \|y-x\| \sup_{0<t<1} \|f'(x+t(y-x))-T\|. \tag{1.2.14} \]
Proof Consider $g(t) := f(x+t(y-x)) - tT(y-x)$. Then the left hand
side of (1.2.14) is $\|g(1) - g(0)\|$. Also, by chain rule we get
\begin{align*}
g'(t) &= f'(x + t(y - x))(y - x) - T(y - x) \\
&= [f'(x + t(y - x)) - T](y - x).
\end{align*}
Since $\|g'(t)\| \le \|f'(x + t(y - x)) - T\| \, \|y - x\|$, the result follows from (1.2.13)
applied to $g$ on $[0, 1]$ with $v = 0$.
□
Corollary 1.2.7 Let $f$ be as above and assume further that $U$ is connected
and that $Df(x) = 0$ for all $x \in U$. Then $f$ is a constant function.
Proof We first of all prove the result when $U$ is convex. Let $x, y \in U$.
Then $[x, y] \subset U$ since $U$ is convex. Now (1.2.14) yields, with $T = 0$,
$\|f(x) - f(y)\| \le \|y - x\| \cdot 0 = 0$ and hence $f(x) = f(y)$.
If $U$ is connected, fix $x \in U$ and consider
\[ E := \{y \in U : f(y) = f(x)\}. \]
Then obviously $E$ is closed in $U$, since $f$ is continuous. If $y \in E$, for sufficiently
small $r > 0$, the ball $B(y, r) \subset U$ since $U$ is open. But then the ball $B(y, r)$ is convex
and hence by the first part of the proof, we have $f(z) = f(y) = f(x)$
for all $z \in B(y, r)$. Thus, for any $y \in E$, there exists an $r > 0$ such that
$B(y, r) \subset E$. Thus $E$ is open. Since $E$ is a nonempty, open and closed
subset of the connected set $U$ we conclude $E = U$.
□
1.3 Directional derivatives

We now want to introduce the concept of the directional derivative and
establish the relation between the directional derivatives and the Frechet
derivative. This is done here since it is often easier to show that the
map under consideration is differentiable and compute the directional
derivatives of a given map and hence the Frechet derivative. Also this
will allow us to establish the geometric meaning of the domain of the
derivative.
We use the standard notation. Let $f\colon U \subset E \to F$. For $v \in E$ we
define
\[ D_v f(x) := \lim_{t \to 0} \frac{f(x + tv) - f(x)}{t} \]
if the limit exists in $F$ and call it the directional derivative of $f$ at $x$ in
the direction of $v$.
If $f\colon \mathbb{R}^m \to \mathbb{R}^n$ and $v_i \in \mathbb{R}^m$ is the $i$-th standard basis vector, we call
$D_{v_i} f(x)$ the $i$-th partial derivative of $f$ at $x$. It is denoted by $D_i f(x)$.
In the case when $n = 1$, $D_{v_i} f(x)$ is denoted by $\frac{\partial f}{\partial x_i}(x)$.
Lemma 1.3.1 Let $f\colon U \subset E \to F$ be differentiable at $x \in U$. Then the
directional derivative $D_v f(x)$ exists for all $v \in E$ and we have
\[ D_v f(x) = Df(x)(v). \tag{1.3.1} \]
Proof We may assume $v \ne 0$. To prove (1.3.1), we need only show
that
\[ \lim_{t \to 0} \frac{f(x + tv) - f(x)}{t} = Df(x)(v). \tag{1.3.2} \]
Since $f$ is differentiable at $x$, given $\varepsilon > 0$, there exists a $\delta > 0$ such that
\[ \|f(x + h) - f(x) - Df(x)(h)\| \le \varepsilon \|h\| \quad\text{for all } h \text{ with } \|h\| < \delta. \]
Using $tv$ in place of $h$, if we restrict $t$ so that $0 < |t| < \delta / \|v\|$, we get
\[ \left\| \frac{f(x + tv) - f(x)}{t} - Df(x)(v) \right\| \le \varepsilon \|v\|, \]
which proves (1.3.2).
□
The converse is not true, as the following example shows.
Example 1.3.2 Let $f\colon \mathbb{R}^2 \to \mathbb{R}$ be defined by
\[ f(x) = \begin{cases} 0 & \text{if } x = 0; \\[0.5ex] \dfrac{u^3}{u^2 + v^2} & \text{if } x = (u, v) \ne (0,0). \end{cases} \]
Note that $f(ax) = a f(x)$ for all $a \in \mathbb{R}$. Then for any vector $w \in \mathbb{R}^2$ and $t \ne 0$, we have
\[ \frac{f(0 + tw) - f(0)}{t} = f(w). \]
It follows that the directional derivative $D_w f(0)$ exists and we have
$D_w f(0) = f(w)$. If $f$ were differentiable at $0$ then we must have, for any
$w \in \mathbb{R}^2$, $Df(0)(w) = D_w f(0) = f(w)$. It follows that if $f$ is differentiable
at $0$ then $Df(0) = f$ and hence $f$ must be linear. But $f$ is not:
\[ f(1,0) = 1, \quad f(0,1) = 0 \quad\text{and}\quad f(1,1) = 1/2. \]
Note that $f$ is continuous on all of $\mathbb{R}^2$.
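The failure of linearity in this example is easy to see numerically. The sketch below evaluates the difference quotient at the origin for the function $u^3/(u^2+v^2)$ of the example; the helper name `directional_derivative_at_origin` and the step `t` are illustrative assumptions.

```python
def f(u, v):
    """The function of the example: u^3 / (u^2 + v^2), with f(0, 0) = 0."""
    if u == 0 and v == 0:
        return 0.0
    return u ** 3 / (u ** 2 + v ** 2)


def directional_derivative_at_origin(w, t=1e-3):
    """Difference quotient (f(t w) - f(0)) / t; by homogeneity it equals f(w)."""
    return (f(t * w[0], t * w[1]) - f(0.0, 0.0)) / t


if __name__ == "__main__":
    d10 = directional_derivative_at_origin((1.0, 0.0))
    d01 = directional_derivative_at_origin((0.0, 1.0))
    d11 = directional_derivative_at_origin((1.0, 1.0))
    # A linear map would give d11 == d10 + d01; here the two sides differ.
    print(d10, d01, d11)
```

All three quotients exist, but the value in the direction $(1,1)$ is not the sum of the values in the directions $(1,0)$ and $(0,1)$, so $w \mapsto D_w f(0)$ is not linear.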
Exercise 1.3.3 Show that the function $f\colon \mathbb{R}^2 \to \mathbb{R}$ defined as follows:
\[ f(x,y) = \begin{cases} xy\,(x^2 + y)^{-1} & \text{if } x^2 + y \ne 0; \\ 0 & \text{otherwise} \end{cases} \]
has directional derivatives in all directions but is not even continuous at
the origin. Hint: Move along the lines $y = mx$.
We use the notion of partial derivatives to identify the vector $\operatorname{grad} f(x)$
of a differentiable function $f\colon \mathbb{R}^n \to \mathbb{R}$.
Now let $f\colon \mathbb{R}^n \to \mathbb{R}$ be differentiable. Given $x \in \mathbb{R}^n$, $Df(x)$ is a
linear functional on $\mathbb{R}^n$. That is, $Df(x) \in (\mathbb{R}^n)^*$, the dual of $\mathbb{R}^n$. Since
$\mathbb{R}^n$ is an inner product space, this linear functional is given by inner
product with a vector $v \equiv \operatorname{grad} f(x) \in \mathbb{R}^n$. Using Exercise 1.1.26 we see
that $v = (v_1, \ldots, v_n)$ is given by $v_j = Df(x)(e_j) = \frac{\partial f}{\partial x_j}(x)$. Here the
first equality is thanks to the exercise and the second one follows from
Lemma 1.3.1. We thus summarize our findings in the form of a lemma.
Lemma 1.3.4 Let $f\colon U \subset \mathbb{R}^n \to \mathbb{R}$ be differentiable at $x$. Then
\[ Df(x)(v) = \langle \operatorname{grad} f(x), v\rangle = \sum_j \frac{\partial f}{\partial x_j}(x)\, v_j. \]
□
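We use this to arrive at the matrix representation of the linear map
$Df(x)$ of a differentiable map $f\colon U \subset \mathbb{R}^m \to \mathbb{R}^n$. Write $f = (f_1, \ldots, f_n)$.
From Proposition 1.1.24, we have
\begin{align*}
Df(x)(v) &= (Df_1(x)(v), \ldots, Df_n(x)(v)) \\
&= (\langle \operatorname{grad} f_1(x), v\rangle, \ldots, \langle \operatorname{grad} f_n(x), v\rangle) \\
&= \begin{pmatrix} \sum_j \frac{\partial f_1}{\partial x_j}(x)\, v_j \\ \vdots \\ \sum_j \frac{\partial f_n}{\partial x_j}(x)\, v_j \end{pmatrix} \\
&= \begin{pmatrix} \frac{\partial f_1}{\partial x_1}(x) & \cdots & \frac{\partial f_1}{\partial x_m}(x) \\ \vdots & & \vdots \\ \frac{\partial f_n}{\partial x_1}(x) & \cdots & \frac{\partial f_n}{\partial x_m}(x) \end{pmatrix} \begin{pmatrix} v_1 \\ \vdots \\ v_m \end{pmatrix}. \tag{1.3.3}
\end{align*}
The matrix $\left(\frac{\partial f_i}{\partial x_j}(x)\right)$ is called the Jacobian matrix of $f$ at $x$.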

We could have arrived at this more directly. Let $A := Df(x) = (a_{ji})_{1 \le j \le n,\ 1 \le i \le m}$
be the matrix with respect to the standard bases $\{e_i\}$ and
$\{v_j\}$ of $\mathbb{R}^m$ and $\mathbb{R}^n$ respectively. Given $\varepsilon > 0$, by differentiability of $f$
at $x$, there exists $\delta > 0$ such that for all $t$ with $0 < |t| < \delta$, we have
\[ \|f(x + te_i) - f(x) - tAe_i\| < \varepsilon |t|. \]
It follows that
\[ \left\| f(x+te_i)-f(x)-t \begin{pmatrix} a_{1i} \\ \vdots \\ a_{ni} \end{pmatrix} \right\| < \varepsilon |t|. \]
In particular, reading the $j$-th component of the quantity inside the
norm we get
\[ |f_j(x + te_i) - f_j(x) - t a_{ji}| < |t| \varepsilon. \]
Hence we find that $a_{ji} = \frac{\partial f_j}{\partial x_i}(x)$.
□
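The column-by-column description of $Df(x)$ translates directly into a finite-difference approximation of the Jacobian matrix: the $i$-th column is a difference quotient in the direction $e_i$. The following is a sketch only; the function `jacobian`, the test map `polar` (polar coordinates) and the step size are assumptions chosen for illustration.

```python
import math


def jacobian(f, x, h=1e-6):
    """Approximate the n x m matrix (d f_j / d x_i) of f: R^m -> R^n at x.

    Column i is the symmetric difference quotient of f in the direction e_i.
    """
    m = len(x)
    columns = []
    for i in range(m):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        fp, fm = f(xp), f(xm)
        columns.append([(a - b) / (2 * h) for a, b in zip(fp, fm)])
    n = len(columns[0])
    # Transpose the list of columns into a list of rows.
    return [[columns[i][j] for i in range(m)] for j in range(n)]


def polar(x):
    """(r, theta) -> (r cos(theta), r sin(theta))."""
    r, theta = x
    return [r * math.cos(theta), r * math.sin(theta)]


if __name__ == "__main__":
    for row in jacobian(polar, [2.0, 0.5]):
        print(row)
```

For the polar-coordinate map the exact Jacobian at $(r, \theta)$ has rows $(\cos\theta, -r\sin\theta)$ and $(\sin\theta, r\cos\theta)$, which the approximation reproduces up to the discretization error.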
We now wish to bring out the geometric meaning of the domain of
the derivative. Let $f\colon U \subset E \to F$ be differentiable at $x$. If we look
at the definition of $D_v f(x)$, the quantity $D_v f(x)$ is nothing but the
derivative at $t = 0$ of the function of one variable $t \mapsto f(x + tv)$. Hidden here is
the crucial observation that we are restricting the function to the curve
$\gamma$ given by $t \mapsto \gamma(t) := x + tv$ for $t$ sufficiently small and the fact that
this curve $\gamma$ passes through the point $x = \gamma(0)$ and that the tangent
vector to the curve $\gamma$ at $x$ is $\gamma'(0) = v$. Recall that the tangent vector
to a curve $\sigma\colon (-\varepsilon, \varepsilon) \to E$ at $0$ or at $\sigma(0)$ is $\frac{d}{dt}\sigma(t)\big|_{t=0}$. This is usually
denoted by $\sigma'(0)$. If $\sigma$ is a curve in $\mathbb{R}^n$ and if we write $\sigma = (\sigma_1, \ldots, \sigma_n)$,
then $\sigma'(0) = (\sigma_1'(0), \ldots, \sigma_n'(0))$.
The chain rule tells us that we can use any curve $\sigma$ such that $\sigma(0) = x$
and $\sigma'(0) = v$ to calculate $D_v f(x)$:
\[ D_v f(x) = \frac{d}{dt} (f \circ \sigma(t))\Big|_{t=0}, \quad\text{where } \sigma(0) = x \text{ and } \sigma'(0) = v. \]
(Check this.)
Thus the vectors $v$ on which the derivative $Df(x)$ acts can be
thought of as tangent vectors to curves passing through $x$. We can
use any curve $\sigma$ such that $\sigma(0) = x$ and $\sigma'(0) = v$ to calculate
$D_v f(x)$:
\[ D_v f(x) = \frac{d}{dt} (f \circ \sigma(t))\Big|_{t=0}, \quad\text{where } \sigma(0) = x \text{ and } \sigma'(0) = v. \]
The significance of the boxed statement cannot be over-emphasized.

To impress this on the minds of the reader we give the following example.
Example 1.3.5 Let $GL(n,\mathbb{R}) = \{A \in M(n,\mathbb{R}) : \det(A) \ne 0\}$. Consider
the function $f\colon GL(n,\mathbb{R}) \to \mathbb{R}$ given by $f(A) = \det(A)$. We have already
computed the derivative of $f$ at $I$: $f'(I)(X) = \operatorname{tr} X$, the linear functional
'tr' on $M(n,\mathbb{R})$. Now we consider the map $\exp\colon M(n,\mathbb{R}) \to GL(n,\mathbb{R})$
given by
\[ e^X := \exp(X) := \sum_{k=0}^{\infty} \frac{X^k}{k!}. \]
For more on this map see Exercise 1.3.14 at the end of this section.
Then the curve $\gamma(t) := \exp(tX) = e^{tX}$ satisfies $\gamma(0) = I$ and $\gamma'(0) = X$.
Hence it follows from Example 1.1.30 and the displayed box that
\[ \frac{d}{dt} \det(e^{tX})\Big|_{t=0} = \operatorname{tr}(X). \tag{1.3.4} \]
We now use this to prove $\det(e^X) = e^{\operatorname{tr}(X)}$. That is,
\[ \det(\exp(X)) = \exp(\operatorname{tr}(X)). \tag{1.3.5} \]
Before going into a proof let us ask ourselves whether it sounds plausible.
Suppose $X$ is a diagonal matrix, say $X = \operatorname{diag}(x_1, \ldots, x_n)$, that is, $X_{ij} = \delta_{ij} x_i$.
($\delta_{ij}$ is the Kronecker delta which is 1 if $i = j$ and 0 otherwise.) Then
$\exp(X) = \operatorname{diag}(e^{x_1}, \ldots, e^{x_n})$ so that Equation (1.3.5) is true for all diagonal
matrices. Hence it is true for all diagonalizable matrices. (Check this.)
(A matrix $X$ is diagonalizable if there exists an invertible matrix $A$
such that $AXA^{-1}$ is diagonal.) But unfortunately not all matrices are
diagonalizable. If you remember your linear algebra well, the second
best thing is to test Equation (1.3.5) on matrices which are in the Jordan
canonical form. It is easily verified that Equation (1.3.5) remains valid
in this case too and hence it is true for all matrices. (Or an alternative
is to realize that any matrix can be put in an upper triangular form
and hence ... ) We shall not go into the details of a proof along these
lines. Instead, we shall give a proof using calculus based on "$\det'(I) = \operatorname{tr}$".
The proof below makes use of the crucial fact that the map $A \mapsto \det(A)$
is a differentiable map from $GL(n,\mathbb{R})$ to $\mathbb{R}$ which is also a group
homomorphism.
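Let us consider the function $g(t) := \det(e^{tX})$ for a fixed $X \in M(n, \mathbb{R})$.
We then have
\begin{align*}
g'(s) = \frac{d}{dt} g(s + t)\Big|_{t=0} &= \frac{d}{dt} \det(e^{(s+t)X})\Big|_{t=0} \\
&= \frac{d}{dt} \left(\det(e^{sX}) \det(e^{tX})\right)\Big|_{t=0} \\
&= \det(e^{sX})\, \frac{d}{dt} \left(\det(e^{tX})\right)\Big|_{t=0} \\
&= g(s) \operatorname{tr}(X)
\end{align*}
by Equation (1.3.4). Thus we have $g'(t) = g(t) \operatorname{tr}(X)$. It therefore
follows that $g(u) = g(0)\, e^{u \operatorname{tr}(X)}$ for all $u \in \mathbb{R}$. Since $g(0) = 1$, by taking
$u = 1$ we get Equation (1.3.5).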
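The identity $\det(\exp(X)) = \exp(\operatorname{tr}(X))$ can be checked numerically on a $2 \times 2$ matrix that is not diagonalizable. This is a sketch under assumptions: the exponential is approximated by a truncated power series (40 terms, an arbitrary choice that is ample for small matrices), and the helper names are illustrative.

```python
import math


def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expm(X, terms=40):
    """Partial sum of the series sum_k X^k / k! (adequate for small ||X||)."""
    n = len(X)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]          # holds X^k / k!, starting at k = 0
    for k in range(1, terms):
        term = [[c / k for c in row] for row in matmul(term, X)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def det2(A):
    """Determinant of a 2x2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


def trace(A):
    """Sum of the diagonal entries."""
    return sum(A[i][i] for i in range(len(A)))


if __name__ == "__main__":
    X = [[0.3, 1.0], [0.0, 0.3]]               # a Jordan block, not diagonalizable
    print(det2(expm(X)), math.exp(trace(X)))
```

Both printed numbers approximate $e^{0.6}$, in line with Equation (1.3.5) for a matrix in Jordan canonical form.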
We do not stop here! We shall give another beautiful application
of the geometric principle above. We use Equation (1.3.5) to find the
derivative of the function $\det$ at $A \in GL(n, \mathbb{R})$. One could proceed as in
Example 1.1.30 to find a formula for $f'(A)(X)$ in terms of coordinates
of $A = (a_{ij})$ and $X = (x_{ij})$. But it is unlikely that you will get the nice
formula for $f'(A)(X)$ that we are going to derive below!
After this sales talk let us get on with our business. So, given
$X \in M(n, \mathbb{R})$, we want to find a curve $\sigma$ such that $\sigma(0) = A$ and
$\sigma'(0) = X$. On first impulse one may try $\sigma(t) = Ae^{tX}$ so that $\sigma(0) = A$.
However, notice that $\sigma'(0) = AX$, not $X$. But this situation is easily
remedied. How about setting $\gamma(t) = Ae^{tA^{-1}X}$? Verify that it works!
Thus we have
\begin{align*}
f'(A)(X) &= \frac{d}{dt}\, f \circ \gamma(t)\Big|_{t=0} \\
&= \frac{d}{dt} \det(Ae^{tA^{-1}X})\Big|_{t=0} \tag{1.3.6} \\
&= \det(A)\, \frac{d}{dt}\, e^{t \operatorname{tr}(A^{-1}X)}\Big|_{t=0} \\
&= \det(A) \operatorname{tr}(A^{-1}X).
\end{align*}
You should work this out using the approach in Example 1.1.30 and
verify that the expression obtained that way is the same as the one on
the right side of Equation (1.3.6). This will convince you of the merit of
the geometric principle enunciated above.
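The formula $f'(A)(X) = \det(A)\operatorname{tr}(A^{-1}X)$ can also be compared with a difference quotient of $t \mapsto \det(A + tX)$, which uses the straight-line curve through $A$ with velocity $X$. The sketch below does this for $2 \times 2$ matrices; the helper names and the sample matrices are assumptions made for illustration.

```python
def det2(A):
    """Determinant of a 2x2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


def inverse2(A):
    """Inverse of an invertible 2x2 matrix."""
    d = det2(A)
    return [[A[1][1] / d, -A[0][1] / d], [-A[1][0] / d, A[0][0] / d]]


def matmul2(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def det_derivative(A, X):
    """The closed form det(A) * tr(A^{-1} X)."""
    P = matmul2(inverse2(A), X)
    return det2(A) * (P[0][0] + P[1][1])


def difference_quotient(A, X, t=1e-6):
    """Symmetric difference quotient of s -> det(A + s X) at s = 0."""
    plus = [[A[i][j] + t * X[i][j] for j in range(2)] for i in range(2)]
    minus = [[A[i][j] - t * X[i][j] for j in range(2)] for i in range(2)]
    return (det2(plus) - det2(minus)) / (2 * t)


if __name__ == "__main__":
    A = [[2.0, 1.0], [1.0, 3.0]]
    X = [[0.5, -1.0], [2.0, 0.25]]
    print(det_derivative(A, X), difference_quotient(A, X))
```

For these sample matrices $\det(A) = 5$ and $\operatorname{tr}(A^{-1}X) = 0.2$, so both values should be close to $1$.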
There is one more thing that we would like to point out: the light
that the geometric principle sheds on Taylor's formula for an $\mathbb{R}$-valued
function of several variables. Let $U$ be a convex or star-shaped open
set in $\mathbb{R}^n$. Let $x \in U$ and $y \in \mathbb{R}^n$. Assume that $x + y \in U$ and
$f\colon U \to \mathbb{R}$ is a $(k + 1)$-times continuously differentiable function. Then
Taylor's formula gives us an expression for $f(x+y)$ in terms of $f$ and its
derivatives in the direction of $y$. The increment in the variable $x$ is $y$
and we wish to approximate (express) the increment in the value, that
is, $f(x + y) - f(x)$, by means of the change of $f$ at $x$ in the direction of
$y$. This intuitive idea suggests that we consider the function
\[ g(t) := f(x + ty) = f \circ \gamma(t), \text{ say}, \]
in an obvious notation. Then clearly $g$ is a $(k + 1)$-times continuously
differentiable function of $t$. We can therefore apply Taylor's formula for
functions of one variable to $g$ to get:
\[ g(t) = g(0) + \sum_{1 \le j \le k} \frac{t^j}{j!}\, g^{(j)}(0) + O(t^{k+1}) \]
and hence we get
\[ f(x + ty) = f(x) + \sum_{1 \le j \le k} \frac{t^j}{j!}\, D_y^j f(x) + O(t^{k+1}). \]
This is Taylor's formula for real-valued functions of a vector variable.
(We have used another notation of Landau: We say $f = O(g)$ as $x \to a$
if there exists $C > 0$ such that $\limsup_{x \to a} \frac{|f(x)|}{|g(x)|} \le C$.) This insight brings
out a couple of points:
1) It is natural to consider the one variable function $g$. It is not pulled
out of a hat; and
2) the Taylor coefficients of the function $f$ are the directional derivatives
of $f$ at $x$ along $y$.
Of course, in applications of analysis one needs various forms of the

remainder term in this formula. Our only point here is to exhibit the
underlying geometric content of this result.
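To see the role of the one-variable function $g(t) = f(x + ty)$ concretely, the sketch below computes the first-order Taylor remainder $g(t) - g(0) - t\,g'(0)$ and checks that it scales like $t^2$. The sample function $f(x_1, x_2) = e^{x_1}\sin x_2$ and its hand-computed gradient are assumptions chosen purely for illustration.

```python
import math


def f(p):
    """Illustrative function f(x1, x2) = exp(x1) * sin(x2)."""
    return math.exp(p[0]) * math.sin(p[1])


def grad_f(p):
    """Gradient of f, computed by hand for this particular f."""
    return [math.exp(p[0]) * math.sin(p[1]),
            math.exp(p[0]) * math.cos(p[1])]


def first_order_remainder(x, y, t):
    """g(t) - g(0) - t g'(0) for g(t) = f(x + t y), with g'(0) = <grad f(x), y>."""
    g_t = f([a + t * b for a, b in zip(x, y)])
    g_0 = f(x)
    dg_0 = sum(a * b for a, b in zip(grad_f(x), y))
    return g_t - g_0 - t * dg_0


if __name__ == "__main__":
    x, y = [0.0, 0.0], [1.0, 1.0]
    for t in (0.1, 0.01, 0.001):
        # The ratio remainder / t^2 stays bounded, as O(t^2) predicts.
        print(t, first_order_remainder(x, y, t) / t ** 2)
```

With $x = (0,0)$ and $y = (1,1)$ one has $g(t) = e^t \sin t = t + t^2 + \cdots$, so the printed ratios approach $1$, which is $g''(0)/2!$.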
Definition 1.3.6 Let $f\colon U \subset E \to F$. We say $f$ is $C^1$ or $f$ is continuously
differentiable if $f$ is differentiable and the map $x \mapsto Df(x) \in BL(E, F)$
is continuous. Here $BL(E, F)$ is endowed with the operator norm.
The following gives a characterization of $C^1$-functions in the case of
finite dimensional spaces.
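Theorem 1.3.7 $f\colon U \subset \mathbb{R}^m \to \mathbb{R}^n$ is $C^1$ iff $f$ has continuous partial
derivatives.
Proof If we write $f = (f_1, \ldots, f_n)$, it suffices to prove the result for
each of the $f_i$'s by Proposition 1.1.24. So we consider $f\colon U \to \mathbb{R}$.
If the result is true, then we must have, for $a \in U$,
\[ Df(a)(h) = \sum_{k=1}^{m} \frac{\partial f}{\partial x_k}(a)\, h_k. \]
Hence we define a linear map $h \mapsto Ah$ by the right hand side of the above
equation and show that $f$ is differentiable at $a$. Let $h = (h_1, \ldots, h_m)$ be so
small that the points below lie in $U$. Then we let
\[ x_k := a + \sum_{j=1}^{k} h_j e_j, \quad 1 \le k \le m, \]
and $x_0 = a$. Then we have
\[ f(a + h) - f(a) = \sum_{k=1}^{m} \left[ f(x_k) - f(x_{k-1}) \right]. \]
Because $x_k$ and $x_{k-1}$ differ only in their $k$-th coordinate, we can apply
the mean value theorem of one variable calculus to find a point $y_k$ on
the line segment joining $x_k$ and $x_{k-1}$ to get
\[ f(x_k) - f(x_{k-1}) = \frac{\partial f}{\partial x_k}(y_k)\, h_k. \]
We therefore have
\[ f(a + h) - f(a) = \sum_{k=1}^{m} \frac{\partial f}{\partial x_k}(y_k)\, h_k. \]
Hence,
\[ |f(a + h) - f(a) - Ah| \le \sum_{k=1}^{m} \left| \frac{\partial f}{\partial x_k}(y_k) - \frac{\partial f}{\partial x_k}(a) \right| |h_k| \le \|h\| \sum_{k=1}^{m} \left| \frac{\partial f}{\partial x_k}(y_k) - \frac{\partial f}{\partial x_k}(a) \right|. \]
Since the partial derivatives are continuous at $a$ and $y_k \to a$ as $h \to 0$, the
differentiability of $f$ at $a$ follows. Also, the map $a \mapsto \left(\frac{\partial f}{\partial x_1}(a), \ldots, \frac{\partial f}{\partial x_m}(a)\right)$ is continuous
and hence $f$ is continuously differentiable.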
Definition 1.3.8 We say that a function $f\colon U \subset E \to F$ has $C^1$-directional
derivatives if for every $x \in U$ there exists $A_x \in BL(E, F)$ such
that the directional derivative $D_v f(x)$ exists and is given by $D_v f(x) = A_x(v)$
and such that the map $x \mapsto A_x$ is continuous.
Lemma 1.3.9 $f$ is $C^1$ if and only if $f$ has $C^1$-directional derivatives.
Proof First we notice that if $f$ is differentiable then $f$ has all directional
derivatives and we have
\[ D_v f(x) = Df(x)(v). \]
To prove the nontrivial part of the lemma, the obvious choice for
$Df(x)$ is the linear map $A_x$ given by $E \ni v \mapsto D_v f(x)$. Thus we must
show that
\[ \|f(x + h) - f(x) - A_x(h)\| = o(\|h\|). \]
We set $g(t) := f(x + th)$. Then $g'(t) = A_{x+th}(h)$. We now have by the
mean value inequality (1.2.13), applied to $g$ on $[0, 1]$ with $v = A_x(h)$,
\[ \|f(x + h) - f(x) - A_x h\| \le \|h\| \sup_{0<t<1} \|A_{x+th} - A_x\|. \]
Since $x \mapsto A_x$ is continuous, the supremum tends to $0$ as $h \to 0$. Hence the left
side is $o(\|h\|)$ and we see that $Df(x) = A_x$.
□
There is an important corollary of Lemma 1.3.9.
Corollary 1.3.10 Let the assumptions be as in Lemma 1.3.9. Let
$E := \prod_{i=1}^{n} E_i$. Let $D_i f(x_1, \ldots, x_n)$ stand for the derivative of the function
\[ E_i \ni x_i \mapsto f(x_1, \ldots, x_i, \ldots, x_n) \]
where except $x_i$ all are kept fixed. If the $D_i f$ are all continuous then $f$ is
$C^1$ and we have, for $h_i \in E_i$,
\[ Df(x)(h_1, \ldots, h_n) = \sum_i D_i f(x_1, \ldots, x_n)(h_i). \]
Proof Left to the reader.
□
The $D_i f$ are called the partial derivatives of $f$. When $E_i = \mathbb{R}$ and $F = \mathbb{R}$
we get the classical result which says that $f\colon U \subset \mathbb{R}^n \to \mathbb{R}$ is $C^1$ if and
only if the partial derivatives $\frac{\partial f}{\partial x_i}$ exist and are continuous.
Another important corollary to Lemma 1.3.9 is the following
Proposition 1.3.11 Let $f_k\colon U \to F$ be $C^1$ on an open subset $U \subset E$.
Let $f_k$ tend to a continuous map $f\colon U \to F$ locally uniformly on $U$, that
is, for every $x \in U$, we have an $r_x > 0$ such that
\[ \sup_{y \in B(x, r_x)} \|f_k(y) - f(y)\| \to 0 \quad\text{as } k \to \infty. \]
Assume further that $f_k' \to g$ locally uniformly where $g\colon U \to BL(E, F)$.
Then $f$ is differentiable with $f' = g$ and hence $f$ is $C^1$.
Proof Let $x \in U$. We apply (1.2.14) to $f_k$ with $T = f_k'(x)$ to get
\[ \|f_k(y) - f_k(x) - f_k'(x)(y - x)\| \le \|y-x\| \sup_{0<t<1} \|f_k'(x+t(y-x)) - f_k'(x)\|. \]
As $k \to \infty$, the above yields
\[ \|f(y) - f(x) - g(x)(y - x)\| \le \|y-x\| \sup_{0<t<1} \|g(x+t(y-x))-g(x)\|. \tag{1.3.7} \]
Since $g$ is continuous, the left side of (1.3.7) is $o(\|y - x\|)$ and hence we
see that $f'(x) = g(x)$.
□
Remark 1.3.12 Proposition 1.3.11 is a special case of Theorem 1.3.13

below. However, Proposition 1.3.11 is what is required often and hence
we chose to present it rather than the general result.
Theorem 1.3.13 Let $U$ be a connected open subset of a Banach space $E$. Let
$f_k\colon U \to F$ be a sequence of differentiable functions. Assume that
i) there exists a point $a \in U$ such that the sequence $f_k(a) \in F$ is
convergent.
ii) $f_k'\colon U \to BL(E, F)$ converges locally uniformly in $U$ to $g\colon U \to BL(E, F)$.
Then, for any $x \in U$, the sequence $\{f_k(x)\}$ has a limit, say $f(x) \in F$,
$f_k \to f$ locally uniformly on $U$ and $f$ is differentiable on $U$ with $f' = g$.
Proof Left as an exercise.
□
We end this section with an important map which will be essential
when we study Lie groups and Lie algebras later.
Exercise 1.3.14 (Exponential Map in $M(n, \mathbb{R})$) The following set of
exercises introduces the exponential map in $M(n, \mathbb{R})$ and its properties:
1. Show that if $f\colon U \subset E \to F$ is differentiable at $x$, then it remains
so if $E$ and $F$ are endowed with equivalent norms. What is $Df(x)$
in this case?
2. For $X \in M(n,\mathbb{R})$, $X := (x_{ij})$, let
\[ \|X\|_\infty := \max_{1 \le i, j \le n} |x_{ij}| \]
be the max norm. It is equivalent to the operator norm $\|\cdot\|$ on
elements of $M(n, \mathbb{R})$ viewed as linear operators on $\mathbb{R}^n$. We shall
use the operator norm in the following.
3. We have $\|AB\| \le \|A\| \|B\|$ for all $A, B \in M(n,\mathbb{R})$ and $\|A^k\| \le \|A\|^k$.
4. A sequence $A_k \to A$ in the operator norm if and only if $a^{(k)}_{ij} \to a_{ij}$
for all $1 \le i,j \le n$ as $k \to \infty$. Here we have $A_k := (a^{(k)}_{ij})$, etc.
5. If $\sum_{k=0}^{\infty} \|A_k\|$ is convergent, then $\sum_{k=0}^{\infty} A_k$ is convergent to an
element $A$ of $M(n, \mathbb{R})$.
6. For any $X \in M(n,\mathbb{R})$, the series $\sum_{k=0}^{\infty} \frac{X^k}{k!}$ is convergent. We
denote the sum by $\exp(X)$ or by $e^X$.
7. For a fixed $X \in M(n, \mathbb{R})$ the function $f(t) := e^{tX}$ satisfies the
matrix differential equation $f'(t) = Xf(t)$, with the initial value
$f(0) = I$. Hint: Note that the $(i,j)$-th entry of $f(t)$ is a power
series in $t$ and use (4).
8. Set $g(t) := e^{tX}e^{-tX}$ and conclude that $e^{tX}$ is invertible for all
$t \in \mathbb{R}$ and for all $X \in M(n, \mathbb{R})$.
9. There exists a unique solution for $f'(t) = Af(t)$ with initial value
$f(0) = B$ given by $f(t) = e^{tA} B$. Hint: If $g$ is any solution, consider
$h(t) = e^{-tA} g(t)$.
10. Let $A, B \in M(n,\mathbb{R})$. If $AB = BA$ then we have
\[ e^{A+B} = e^A e^B. \]
Hint: Consider $\phi(t) := e^{t(A+B)} - e^{tA}e^{tB}$.
11. For $X \in M(n,\mathbb{R})$ and $A \in GL(n,\mathbb{R})$ we have $e^{AXA^{-1}} = Ae^X A^{-1}$.

1.4 Inverse mapping theorem

In this section we prove the inverse mapping theorem (referred to as
IMT in future). Loosely speaking, if $f$ is a $C^1$-map such that $Df(p)$ is
invertible, then $f$ maps an open neighbourhood $U$ of $p$ bijectively onto an
open neighbourhood $V$ of $f(p)$ and the inverse map $f^{-1}$ is differentiable
on $V$. More precisely,
Theorem 1.4.1 (Inverse mapping theorem) Let $U \subset E$ be open
and $f\colon U \to F$ be $C^1$. Assume that $x_0 \in U$ is such that $Df(x_0)$ is
invertible. Then there is an open subset $V$ of $U$ containing the point $x_0$
with the following properties:
i) the map $f$ is one-one on $V$,
ii) the image $f(V)$ is an open neighborhood of $f(x_0)$,
iii) $f^{-1}$ is $C^1$ on $f(V)$ with $Df^{-1}(y) = Df(f^{-1}(y))^{-1}$ for all $y \in f(V)$.
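Proof We may assume that $E = F$ and that $Df(x_0) = I_E$, the identity
map of $E$.
Since $Df$ is continuous at $x_0$ there exists a $\delta > 0$ such that
\[ \|Df(x) - Df(x_0)\| < \frac{1}{2}, \quad\text{for all } x \in B(x_0, \delta). \tag{1.4.1} \]
For $x_1, x_2 \in B(x_0, \delta)$ we have by (1.2.14)
\[ \|f(x_1) - f(x_2) - Df(x_0)(x_1 - x_2)\| \le \|x_1-x_2\| \sup_{0<t<1} \|Df(x_1+t(x_2-x_1))-Df(x_0)\|. \tag{1.4.2} \]
Now
\[ \|x_1 + t(x_2 - x_1) - x_0\| \le (1 - t)\|x_1 - x_0\| + t\|x_2 - x_0\| < \delta \]
so that
\[ x_1 + t(x_2 - x_1) \in B(x_0, \delta) \quad\text{for } 0 \le t \le 1. \]
Hence, in view of (1.4.1), (1.4.2) becomes
\[ \|f(x_1) - f(x_2) - (x_1 - x_2)\| \le \frac{1}{2} \|x_1 - x_2\|. \tag{1.4.3} \]
This implies $f$ is one-one on $B(x_0, \delta)$.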
To show that there exists a ball $B(y_0, \delta')$, where $y_0 := f(x_0)$, contained in the image
of $B(x_0, \delta)$, we modify Newton's method. Newton's method can be
briefly described as follows: if $x_0$ is an approximate solution of $f(x) = 0$
and if $f$ has a non-vanishing derivative around this point $x_0$ then with
\[ x_{n+1} := x_n - \frac{f(x_n)}{f'(x_n)}, \]
defined recursively, we have $x_n$ tending to a limit
$x$ which is a zero of $f$. Look at Figure 1.4.1 to see the geometric idea
underlying this algorithm. We modify this algorithm below.
Figure 1.4.1 Newton's algorithm
Suppose $y \in B(y_0, \delta')$ for some $\delta' > 0$. We want to solve
the equation $f(x) = y$ with $x \in B(x_0, \delta)$. We define recursively
\[ x_k = x_{k-1} + (y - f(x_{k-1})). \]
We need to check whether $x_k \in B(x_0, \delta)$. We have
\[ x_k - x_{k-1} = x_{k-1} - x_{k-2} - (f(x_{k-1}) - f(x_{k-2})). \]
Taking norm on both sides, using (1.4.3), induction and $x_1 - x_0 = y - y_0$ we get
\begin{align*}
\|x_k - x_{k-1}\| &\le \frac{1}{2} \|x_{k-1} - x_{k-2}\| \\
&\le \frac{1}{2^{k-1}} \|x_1 - x_0\| \\
&\le 2^{1-k} \delta'.
\end{align*}
Hence if we choose $\delta' = \delta/2$, we have $\|x_k - x_{k-1}\| \le 2^{-k} \delta$. In particular,
$\|x_0 - x_k\| \le \sum_{i=1}^{k} \delta 2^{-i} < \delta$. This also shows that $(x_k)$ is Cauchy. Let
$\lim x_k = x$. Clearly we have $f(x) = y$. For, from the recursive definition
of $x_k$ by taking the limit as $k \to \infty$ we get
\[ x = \lim_k x_k = \lim_k \left(x_{k-1} + y - f(x_{k-1})\right) = x + y - f(x). \]
For $y \in B(y_0, \delta/2)$ we define $g(y) := x$ where $x \in B(x_0, \delta)$. We want to
prove that $g$ is differentiable on $B(y_0, \delta/2)$ and that $g'(y) = Df(x)^{-1}$.
To see this, we let $g(y) = x$, $g(y + k) = x + h$ so that $f(x + h) = y + k$.
Then we have
\[ k = f(x + h) - f(x) = Df(x)h + o(\|h\|). \tag{1.4.4} \]
From (1.4.3) it follows that
\[ \|k - h\| = \|f(x + h) - f(x) - h\| \le (1/2) \|h\|, \]
so that $(1/2) \|h\| \le \|k\| \le (3/2) \|h\|$. In view of this, Equation (1.4.4)
implies $h = Df(x)^{-1}k + o(\|k\|)$.
□
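The successive approximation used in the proof can be run as an actual algorithm. The sketch below takes the update in the form $x_k = x_{k-1} + (y - f(x_{k-1}))$, that is, Newton's method with the derivative frozen at the identity. The one-dimensional sample map $f(x) = x + x^2/4$, for which $f'(0) = 1$ and $|f'(x) - 1| < 1/2$ on $|x| < 1$, and the number of steps are assumptions chosen for illustration.

```python
def solve_by_iteration(f, y, x0, steps=60):
    """Iterate x <- x + (y - f(x)), starting from x0.

    This is Newton's method with the derivative replaced by the identity;
    it is a contraction wherever |f'(x) - 1| <= 1/2.
    """
    x = x0
    for _ in range(steps):
        x = x + (y - f(x))
    return x


def f(x):
    """Sample map with f(0) = 0 and f'(0) = 1 (an illustrative choice)."""
    return x + 0.25 * x * x


if __name__ == "__main__":
    y = 0.3                     # |y - f(0)| < 1/2, i.e. inside B(y0, delta/2)
    x = solve_by_iteration(f, y, 0.0)
    print(x, f(x))
```

Here $\delta = 1$ and $\delta' = 1/2$ play the roles they have in the proof: for $|y| < 1/2$ the iterates stay in $(-1, 1)$ and converge to the solution of $f(x) = y$ there.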
First of all we shall try to explain the geometric meaning of IMT in
the finite dimensional case. Next we shall indicate by means of a simple
example how IMT helps in nonlinear problems.
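Example 1.4.2 Let $f\colon \mathbb{R}^{n+1} \to \mathbb{R}$ be $C^1$. Without loss of generality,
assume that $0$ lies in the image of $f$. In general, the set $S := f^{-1}(0)$
does not have any nice geometric property. However, if we assume that
\[ Df(p) := \operatorname{grad} f(p) \ne 0 \quad\text{for all } p \in S \]
then $S$ "looks locally like a hyperplane". Of course this needs explanation!
Given $p \in S$, since $Df(p) \ne 0$, we assume without loss of
generality that $\frac{\partial f}{\partial x_{n+1}}(p) \ne 0$. Then consider the map $\varphi\colon \mathbb{R}^{n+1} \to \mathbb{R}^{n+1}$
given by $(x_1, \ldots, x_n, x_{n+1}) \mapsto (x_1, \ldots, x_n, f(x))$. Then $\varphi'(p)$ has the
Jacobian
\[ \begin{pmatrix} I_n & 0 \\[0.5ex] \dfrac{\partial f}{\partial x_1}(p) \ \cdots \ \dfrac{\partial f}{\partial x_n}(p) & \dfrac{\partial f}{\partial x_{n+1}}(p) \end{pmatrix}. \]
The determinant of this matrix is $\frac{\partial f}{\partial x_{n+1}}(p) \ne 0$ and hence $\varphi'(p)$ is
invertible. Thus $\varphi$ is a "$C^1$-diffeomorphism" of an open neighborhood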
$V$ of $p$ in $\mathbb{R}^{n+1}$ onto an open set $\varphi(V) \subset \mathbb{R}^{n+1}$, by IMT. (By a $C^1$-diffeomorphism
we mean a map $F$ which is $C^1$, one-one on its domain
$U$ and such that $F(U)$ is open and $F^{-1}$ is also $C^1$ on $F(U)$.) Now we introduce a
new set of coordinates on $V$ by setting $y_i(r) := u_i \circ \varphi(r)$, where $u_i$ are
the 'usual' coordinates on $\mathbb{R}^{n+1}$: $u_i(x) := x_i$. In plain language this is:
\[ y_i(r) = \begin{cases} u_i(r) & \text{if } 1 \le i \le n; \\ f(r) & \text{if } i = n+1. \end{cases} \]
With respect to this new set of coordinates $y_i$, $S$ has a local description
around $p$ on $V \cap S$: it is the "hyperplane" $\{y_{n+1} = 0\}$. Thus by taking
a suitable system of coordinates we "straighten" the hypersurface to a
hyperplane locally.
To see how this change of coordinates can help us, we pose the following
question: Suppose the hypersurface $S$ is also described around $p$ as
$g^{-1}(0)$. That is, there exists an open set $U \ni p$ such that $S \cap U = g^{-1}(0)$,
with $g\colon U \to \mathbb{R}$ being a $C^1$-function. Is $g$ divisible by $f$ at least locally
around $p$? That is, does there exist another function $h$ defined in an
open set containing $p$ on which we can write $g = fh$?
Let $F := f \circ \varphi^{-1}$ and $G := g \circ \varphi^{-1}$. Then it follows that
\[ \varphi(V \cap S) = \{y_{n+1} = 0\} \cap \varphi(V). \]
The above question then reduces to an equivalent one: Is $G$ divisible
by $F = y_{n+1}$ locally around $0$? This is certainly easy to answer (in the
affirmative, by Taylor expansion).
The moral therefore is that the IMT allows us to use a coordinate

system that is most convenient or that simplifies the geometric
problem on hand.
Most often differential calculus is used to solve non-linear problems

by linearizing them, that is, by "taking derivatives" and applying either
the implicit function theorem or the inverse function theorem.
We now return to the infinite dimensional Example 1.1.37. For $k \ge 2$ we have
$DT(0)(g) = g'$; that is, $DT(0) = \frac{d}{dt}\colon C^1_0[0, 1] \to C[0,1]$. This is a
continuous, one-one, onto, linear map, the inverse being given by the
indefinite integral $I$:
\[ I(h) := \text{the function } t \mapsto \int_0^t h(s)\, ds. \]
(This is the fundamental theorem of calculus, Theorem 1.6.2.) Hence, by IMT, $T$
maps an open ball centered at $0$ in the domain space onto a neighborhood
of $0$ in the range space bijectively. In particular, there exists a $\delta > 0$
such that if $g \in C[0, 1]$ and $\|g\| < \delta$, then there is an $f \in C^1_0[0, 1]$ such
that the equation $Tf = g$ holds. Thus we have solved the nonlinear
problem $Tf = g$ for $g$ with $\|g\| < \delta$.
Example 1.4.3 Let $k \in C(I \times I)$ where $I := [0,1]$. Then the operator
$K$ defined by
\[ (Kf)(x) := \int_0^1 k(x, y) f(y)\, dy, \quad f \in C(I) \]
is a continuous linear map on the Banach space $C(I)$. Assume that
$\lambda \notin \operatorname{spec}(K) := \{\lambda \in \mathbb{C} : (K - \lambda I)^{-1} \notin BL(C(I))\}$. Then there exists
an $\varepsilon > 0$ such that the nonlinear equation
\[ \lambda f(x) = \int_0^1 k(x,y) \left(f(y) + [f(y)]^2\right) dy + g(x) \tag{1.4.5} \]
has a solution in $C(I)$, for all $g \in C(I)$ with $\|g\| < \varepsilon$. This is easy.
Consider the operator $A$ given by $Af :=$ the left side $-$ the right side of
Equation (1.4.5). Then $A'(0) = \lambda I - K$, which is invertible. Complete
the details.
Example 1.4.4 Let $Af(x) := \int_0^1 k(x,y, f(y))\, dy$ for $f \in C(I)$. Assume
that $k$ and $\frac{\partial k}{\partial u}$, the partial derivative with respect to the third variable $u$,
are continuous as functions from $I \times I \times \mathbb{C} \to \mathbb{C}$. Then show that
\[ A'(f)(h)(x) = \int_0^1 \frac{\partial k}{\partial u}(x, y, f(y))\, h(y)\, dy. \]
Before closing this section we shall prove the implicit function theorem.
First recall that if $f\colon U \subset E \times F \to G$ is differentiable at $(x, y)$
then we have
\[ Df(x,y)(u,v) = D_1 f(x,y)(u) + D_2 f(x,y)(v) \]
where $D_1 f(x, y)$ is the partial derivative of $f$ in the first variable etc.
Theorem 1.4.5 (Implicit function theorem) Let $\Omega \subset X \times Y$ be
open. Let $f\colon \Omega \to Y$ be $C^1$. Assume that for some $(x_0, y_0) \in \Omega$ where
$x_0 \in X$ and $y_0 \in Y$ we have
1. $f(x_0, y_0) = 0$.
2. $D_2 f(x_0, y_0)$ is a continuous linear isomorphism with a continuous
inverse.
Then there exists a neighbourhood $\Omega'$ of $(x_0, y_0)$ in $X \times Y$, an open set
$U \subset X$ containing $x_0$ and a $C^1$-map $g$ on $U$ such that
i. $D_2 f(x, y)$ is nonsingular for all $(x, y) \in \Omega'$,
ii. $\{(x,y) \in \Omega' : f(x,y) = 0\} = \{(x,g(x)) : x \in U\}$.
Proof Let $F\colon \Omega \to X \times Y$ be defined as follows: $F(x, y) = (x, f(x, y))$.
Then $F$ is $C^1$ and the derivative $DF(x_0, y_0)$ can be written in the matrix
form
\[ \begin{pmatrix} I_X & 0 \\ D_1 f(x_0, y_0) & D_2 f(x_0, y_0) \end{pmatrix}. \]
This has the inverse
\[ \begin{pmatrix} I_X & 0 \\ -D_2 f(x_0, y_0)^{-1} D_1 f(x_0, y_0) & D_2 f(x_0, y_0)^{-1} \end{pmatrix} \]
which is bounded linear.
Hence by IMT there exists a neighbourhood $\Omega'$ of $(x_0, y_0)$ in $\Omega$ such
that $F(\Omega')$ is a neighbourhood of $F(x_0, y_0) = (x_0, 0)$ in $X \times Y$. Let $p_X$
and $p_Y$ denote the projections onto $X$ and $Y$ respectively. Let
\[ U := \{x \in X : (x,0) \in F(\Omega')\}. \]
Since $F(\Omega')$ is open, so is $U$. Consider $g(x) := p_Y \circ F^{-1}(x, 0)$ for $x \in U$.
Clearly, $g$ is $C^1$ on $U$. Also, if $(x, y) \in \Omega'$, then $f(x, y) = 0$ iff $x \in U$ and
$F(x, y) = (x,0)$. Applying $F^{-1}$ to both sides, we get $F(x, y) = (x,0)$ iff
$(x,y) = F^{-1}(x,0) = (x,g(x))$. This proves (ii) and completes the proof
of the theorem.
□
Remark 1.4.6 Note that we can compute the derivative $Dg(x)$. Let $G\colon U \to X \times Y$ be the map $G(x) = (x,g(x))$ and $\varphi := f \circ G$. Then $\varphi(x) = f \circ G(x) = f(x,g(x)) = 0$ so that $D\varphi(x) = 0$ for all $x \in U$. We apply the chain rule:
\[ D\varphi(x)h = Df(G(x)) \circ DG(x)(h) = Df(x,g(x)) \circ DG(x)(h). \]
Since $G(x) = (x,g(x))$ we have $DG(x)(h) = (h, Dg(x)(h))$. Also,
\[ Df(x,y)(h,k) = D_1 f(x,y)h + D_2 f(x,y)k. \]
Hence
\[ 0 = D\varphi(x)h = Df(x,g(x))(h, Dg(x)h) = D_1 f(x,g(x))h + D_2 f(x,g(x)) \circ Dg(x)h. \]
From this we see that $Dg(x) = -D_2 f(x,g(x))^{-1} \circ D_1 f(x,g(x))$.
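The derivative formula of Remark 1.4.6 can be checked numerically in the simplest case. The following is only an illustrative sketch: the constraint $f(x,y) = x^2 + y^2 - 1$, the starting value for Newton's method, and all function names are assumptions chosen for the example, not part of the text.

```python
import math

def f(x, y):
    # Illustrative constraint: the zero set of f is the unit circle.
    return x * x + y * y - 1.0

def d1f(x, y):
    # Partial derivative of f in the first variable.
    return 2.0 * x

def d2f(x, y):
    # Partial derivative of f in the second variable (nonzero near y = 1).
    return 2.0 * y

def solve_g(x, y_start=1.0, steps=50):
    """Solve f(x, y) = 0 for y near y_start by Newton's method in y."""
    y = y_start
    for _ in range(steps):
        y -= f(x, y) / d2f(x, y)
    return y

def dg_formula(x):
    """Dg(x) = -D2f(x, g(x))^{-1} D1f(x, g(x)), as in Remark 1.4.6."""
    y = solve_g(x)
    return -d1f(x, y) / d2f(x, y)

def dg_numeric(x, h=1e-6):
    """Central difference approximation of g'(x)."""
    return (solve_g(x + h) - solve_g(x - h)) / (2.0 * h)

x0 = 0.3
print(solve_g(x0), math.sqrt(1.0 - x0 * x0))  # implicit solution vs explicit branch
print(dg_formula(x0), dg_numeric(x0))         # formula vs finite difference
```

Here $g(x) = \sqrt{1 - x^2}$ is known explicitly, so both the implicit solution and the sign in the formula for $Dg$ can be compared against a closed form.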
Lagrange multiplier method

The purpose of this section is twofold. One is to bring out the geometry
underlying Lagrange's method in the theory of constrained maxima and
minima. The second is more important and it is to show a very typical
use of the IMT in a very concrete situation.
Let us briefly explain the concept of constrained extremum. Let $f, g\colon U \subset \mathbb{R}^n \to \mathbb{R}$ be $C^1$-functions. Assume that there exist points $x \in U$ such that $g(x) = 0$. Let $S := g^{-1}(0)$. We wish to find points of local maxima/minima of the function $f$ on the set $S$, i.e. the points are constrained to move in $S$. For example, consider $f, g\colon \mathbb{R}^n \to \mathbb{R}$ given by $f(x) := x_1$ and $g(x) := 1 - \sum_j x_j^2$. Then $e_1 \in g^{-1}(0) = S := \{x \in \mathbb{R}^n : \sum_j x_j^2 = 1\}$ is a point of maximum of $f$ restricted to $S$.
Theorem 1.4.7 Let $g\colon \Omega \subset \mathbb{R}^{n+k} \to \mathbb{R}^k$ be $C^1$. Assume that $S := g^{-1}(0)$ is nonempty and that for every $p \in S$, $Dg(p)$ is of rank $k$. We define the tangent space $T_pS$ of $S$ at $p$ to be the set of all vectors $w \in \mathbb{R}^{n+k}$ such that there exists a map $c$ defined on some open interval of $\mathbb{R}$ containing $0$ to $S$ which is $C^1$ as a map from the interval to $\mathbb{R}^{n+k}$ with $c(0) = p$ and $c'(0) = w$. Such a map is called a $C^1$ curve through $p$ in $S$. Thus,
\[ T_pS := \{ w \in \mathbb{R}^{n+k} \mid \text{there exists a } C^1 \text{ map } c\colon (-\varepsilon, \varepsilon) \to S \subset \mathbb{R}^{n+k} \text{ with } c(0) = p \text{ and } c'(0) = w \}. \]
We call the elements of $T_pS$ tangent vectors to $S$ at $p$. We have $T_pS = $ kernel of $Dg(p)$. In particular, $T_pS$ is a vector space of dimension $n$.
Proof It is easy to show that any tangent vector $c'(0)$ to $S$ at $p$ lies in the kernel of $Dg(p)$. For, let $c\colon (-\varepsilon, \varepsilon) \to S$ be a curve through $p$, that is, a $C^1$ map with $c(0) = p$. The $C^1$-function $\alpha := g \circ c$ is then a constant so that $\alpha'(0) = 0$. By the chain rule we find that $Dg(p)(c'(0)) = 0$. Thus, $T_pS$ is contained in the kernel of $Dg(p)$.

To prove the converse, let $w \in \mathbb{R}^{n+k}$ be given such that $Dg(p)(w) = 0$. For ease of notation, let us write, for $z \in \Omega$, $z = (x,y) \in \mathbb{R}^n \times \mathbb{R}^k$. Let $p = (a,b)$ and $w = (u,v)$. Since by hypothesis $Dg(p)$ is of rank $k$, we may assume (permuting the coordinates if necessary) that $\left(\frac{\partial g_i}{\partial y_j}(p)\right)_{1 \le i,j \le k}$ is invertible.

We wish to use the implicit function theorem. We repeat part of the argument of its proof to fix the notation. Let $G(x,y) := (x, g(x,y))$. Then $DG(p)$ is invertible and there exist neighbourhoods $U$ of $p$ in $\mathbb{R}^{n+k}$ and $V$ of $a$ in $\mathbb{R}^n$ and a $C^1$-function $h\colon V \to \mathbb{R}^k$ such that
(i) $h(a) = b$,
(ii) $\{z \in U : g(x,y) = 0\} = \{(x, h(x)) : x \in V\}$,
so that $g(x, h(x)) = 0$ for all $x \in V$. In particular, the portion $S \cap U$ of the surface $S$ is parameterized by $V$.

We consider the curve $\gamma(t) := a + tu$. Since $V$ is open, $\gamma(t) \in V$ for sufficiently small $t$. Let $c(t) := G^{-1}(\gamma(t), 0)$. Since $G$ is a $C^1$-diffeomorphism near $p$, $c$ is $C^1$, and $c(t) \in S$ because $G(c(t)) = (\gamma(t), 0)$ means $g(c(t)) = 0$. Also,
\[ c(0) = G^{-1}(\gamma(0), 0) = G^{-1}(a, 0) = (a,b) = p. \]
We claim that $c'(0) = w$. Now, by the chain rule, we have
\[ c'(0) = DG^{-1}(\gamma(0), 0)(\gamma'(0), 0) = DG^{-1}(a,0)(u,0). \]
To prove the claim, it is enough to show that
\[ DG(p)(u,v) = (u,0), \]
as $DG^{-1}(a,0) = DG(a,b)^{-1}$. It is easily verified that
\[ DG(p)(u,v) = \begin{pmatrix} I_{n \times n} & 0 \\ \left(\frac{\partial g_i}{\partial x_j}(p)\right)_{\substack{1 \le i \le k \\ 1 \le j \le n}} & \left(\frac{\partial g_i}{\partial y_j}(p)\right)_{1 \le i,j \le k} \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} u \\ Dg(p)(w) \end{pmatrix} = \begin{pmatrix} u \\ 0 \end{pmatrix}, \]
since $Dg(p)(w) = 0$.
$\square$
A special case worth noting is when $g$ takes values only in $\mathbb{R}$. In that case, the condition that $Dg(p) \ne 0$ for all $p \in S$ is equivalent to requiring that $\nabla g(p) \ne 0$ for all $p \in S$. (Here $\nabla h(x) := \left(\frac{\partial h}{\partial x_1}, \ldots, \frac{\partial h}{\partial x_n}\right)$ for any differentiable function $h\colon U \subset \mathbb{R}^n \to \mathbb{R}$. Recall that the derivative $Dh(x)\colon \mathbb{R}^n \to \mathbb{R}$ is a linear functional given by $Dh(x)v = \langle v, \nabla h(x) \rangle$.) Note also that in this case, the vector space $T_pS$ is the orthogonal complement of $\nabla g(p)$ in $\mathbb{R}^{n+1}$. That is why $\nabla g$ is called the normal to the "hypersurface" $S$.
More generally, if $g$ is as in Theorem 1.4.7, we write $g = (g_1, \ldots, g_k)$. We then note that $v \in T_pS$ if and only if $\langle v, \nabla g_i(p) \rangle = 0$ for $1 \le i \le k$. We say a vector $w \in \mathbb{R}^{n+k}$ is normal to the surface $S$ at the point $p \in S$ if and only if $\langle w, v \rangle = 0$ for all $v \in T_pS$. Thus, the set of vectors normal to $S$ at $p$ is a vector subspace of $\mathbb{R}^{n+k}$ and has $\{\nabla g_i(p) : 1 \le i \le k\}$ as a basis.
Let $M$ be a smooth hypersurface in $\mathbb{R}^{n+1}$ defined by $g = 0$. This means that $M := \{x \in \mathbb{R}^{n+1} : g(x) = 0\}$ and that $g\colon \mathbb{R}^{n+1} \to \mathbb{R}$ is a $C^1$-function such that $Dg(x) \ne 0$ for any $x \in M$. If $f$ is a smooth function on $M$, we want to investigate conditions for local extremum. Let $p \in M$ be a local extremum for $f$. If $\gamma\colon (-\varepsilon, \varepsilon) \to M$ is a smooth curve through $p$, that is, $\gamma(0) = p$, then $f \circ \gamma\colon (-\varepsilon, \varepsilon) \to \mathbb{R}$ has a local extremum at $0$. Hence by calculus, we have $(f \circ \gamma)'(0) = 0$. That is,
\[ \gamma'(0)(f) := \langle \gamma'(0), \nabla f(p) \rangle = 0. \tag{1.4.6} \]
Since $\gamma'(0)$ is a tangent vector to $M$ at $p$, we know that
\[ \langle \gamma'(0), \nabla g(p) \rangle = 0. \tag{1.4.7} \]
Every tangent vector to $M$ at $p$ arises as such a $\gamma'(0)$, so Equation (1.4.6) says that $\nabla f(p)$ is orthogonal to $T_pM$, while by Equation (1.4.7) and Theorem 1.4.7 $T_pM$ is the orthogonal complement of $\nabla g(p)$. Hence $\nabla f(p) = \lambda \nabla g(p)$ for some real number $\lambda$; $\lambda$ is called the Lagrange multiplier. The above can very easily be generalized for submanifolds which are not necessarily hypersurfaces and which are defined by the equation $g = 0$ where $g$ satisfies the conditions of Theorem 1.4.7. In classical language the above problem is posed as follows: Find the extrema of a function $f$ subject to the constraints $g_i(x) = 0$ for $1 \le i \le k$. Here $n = (n + k) - k$ is the dimension of the submanifold and we assume that it is defined by $g_i = 0$.

In this case the Lagrange multipliers $\lambda_i$ at an extremum are given by
\[ \nabla f(p) = \sum_i \lambda_i \nabla g_i(p). \]
We wish to show an interesting application of this method of
Lagrange multiplier to a problem in linear algebra or in analytic
geometry depending on one's perspective.
Theorem 1.4.8 Given an $n \times n$ real symmetric matrix $A$, there exists an orthogonal matrix $U$ such that $U^{-1}AU = D$, where $D$ is a diagonal matrix $\operatorname{diag}(\lambda_1, \ldots, \lambda_n)$.
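Proof We consider the function
\[ g(x) := x^t A x := \langle Ax, x \rangle \quad \text{for } x \in S := S^{n-1}. \]
Here $\langle \cdot, \cdot \rangle$ is the Euclidean inner product on $\mathbb{R}^n$ and
\[ S^{n-1} := \{x \in \mathbb{R}^n \mid f(x) := \langle x, x \rangle = 1\}, \]
the unit sphere in $\mathbb{R}^n$. As $S^{n-1}$ is closed and bounded, it is compact.

Hence the continuous function $g$ attains a maximum on $S$. Let $v_1 \in S$ be a point where the maximum is attained. We have $\nabla g(x) = 2Ax$ since $Dg(x)(h) = 2\langle Ax, h \rangle$. Also, we have $\nabla f(x) = 2Ix = 2x$. Hence by the Lagrange multiplier method there exists $\lambda_1 \in \mathbb{R}$ such that $\nabla g(v_1) = \lambda_1 \nabla f(v_1)$. That is, we have $Av_1 = \lambda_1 v_1$, so $v_1$ is an eigenvector of $A$ with eigenvalue $\lambda_1$. Moreover, we notice that $\lambda_1 = \langle Av_1, v_1 \rangle$ so that $\lambda_1$ is the maximum value of $g$ on $S$.

Let $E_1 := (\mathbb{R}v_1)^{\perp}$, the orthogonal complement of the one dimensional subspace $\mathbb{R}v_1$ of $E_0 := \mathbb{R}^n$. We now restrict the function $g$ to the unit sphere in $E_1$. That is, the variable $x$ is constrained by $\langle x, x \rangle = 1$ and $\langle x, v_1 \rangle = 0$. As above the function $g$ attains a maximum at a point $v_2 \in S^{n-2} \subset E_1$. Then $v_2$ satisfies
\[ Av_2 - \lambda_2 v_2 - \mu_1 v_1 = 0 \tag{1.4.8} \]
for some real numbers $\lambda_2$ and $\mu_1$. (Can you derive this?) We take the inner product of both sides of Equation (1.4.8) with $v_2$ to get:
\[ \langle Av_2, v_2 \rangle - \lambda_2 - \mu_1 \langle v_1, v_2 \rangle = 0. \]
Thus $\lambda_2 = \langle Av_2, v_2 \rangle$ so that $\lambda_2$ is the maximum of $g$ on the unit sphere of $E_1$. Also, we have $\lambda_1 \ge \lambda_2$. We now take the inner product of Equation (1.4.8) with $v_1$ to get
\[ \langle Av_2, v_1 \rangle - \lambda_2 \langle v_2, v_1 \rangle - \mu_1 \langle v_1, v_1 \rangle = 0. \]
Hence we deduce that
\[ \mu_1 = \langle Av_2, v_1 \rangle = \langle v_2, Av_1 \rangle = \lambda_1 \langle v_2, v_1 \rangle = 0. \]
Therefore from Equation (1.4.8) we see that $v_2$ satisfies $Av_2 = \lambda_2 v_2$. Hence $\lambda_2$ is an eigenvalue of $A$.

We can thus proceed up to $n - 1$ steps so that we get eigenvectors $v_i$ of unit norm with eigenvalues $\lambda_i$ for $1 \le i \le n - 1$. We then consider $g$ on $S^0$, which is a set of two unit vectors. Hence $g$ has a maximum, say at $v_n$, with $\langle v_n, v_i \rangle = 0$ for $1 \le i \le n - 1$. We set $\lambda_n := \langle Av_n, v_n \rangle$. Then $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$. Notice that $\{v_i : 1 \le i \le n\}$ forms an orthonormal basis for $\mathbb{R}^n$. Hence we can write $Av_n = \sum_{i=1}^n a_i v_i$, where $a_i = \langle Av_n, v_i \rangle = \langle v_n, Av_i \rangle = \lambda_i \langle v_n, v_i \rangle = 0$ for $i < n$ and $a_n = \lambda_n$. Thus $\lambda_n$ is also an eigenvalue of $A$. We have therefore diagonalized the symmetric matrix $A$: the matrix $U$ whose columns are $v_1, \ldots, v_n$ is orthogonal and satisfies $U^{-1}AU = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$.
$\square$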
Remark 1.4.9 How is this related to analytic geometry? You can think of the set $\{x \in \mathbb{R}^n : g(x) = 1\}$ as a quadric surface. Thus what we have done above is to find the principal axes of this quadric surface.
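The first step of the proof of Theorem 1.4.8 can be imitated numerically: maximize $\langle Ax, x \rangle$ on the unit sphere and check that the maximizer satisfies the Lagrange condition $Av = \lambda v$. The sketch below is a hedged illustration only. The matrix, the starting vector, the step size and the simple "step along the gradient, then renormalize" scheme are all assumptions made for the example; this scheme is a convenience, not the method of the text.

```python
import math

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def normalize(x):
    n = math.sqrt(dot(x, x))
    return [a / n for a in x]

def maximize_on_sphere(A, x0, step=0.2, iters=500):
    """Ascend g(x) = <Ax, x> on the unit sphere.

    The gradient of g is 2Ax; the factor 2 is absorbed into `step`.
    After each step the point is projected back onto the sphere.
    """
    x = normalize(x0)
    for _ in range(iters):
        Ax = matvec(A, x)
        x = normalize([xi + step * gi for xi, gi in zip(x, Ax)])
    lam = dot(matvec(A, x), x)  # value of g at the maximizer
    return lam, x

# Illustrative symmetric matrix; its eigenvalues are 3 and 3 +/- sqrt(3).
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 4.0]]
lam, v = maximize_on_sphere(A, [1.0, 1.0, 1.0])
Av = matvec(A, v)
residual = math.sqrt(sum((a - lam * b) ** 2 for a, b in zip(Av, v)))
print(lam, residual)  # maximum of g on the sphere, and |Av - lam v|
```

For this positive definite example the iteration converges to the eigenvector of the largest eigenvalue; for a general symmetric matrix the step size would have to be chosen with more care.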
Exercise 1.4.10 Find the maximum of $(x_1 x_2 \cdots x_n)^2$ subject to the constraint $\sum_{i=1}^n x_i^2 = 1$. Use this to show AM $\ge$ GM. This can also be derived by searching for the minimum of
\[ g(x) = x_1 + \cdots + x_n \quad \text{with the constraint } f(x) := x_1 \cdots x_n = 1 \text{ and all } x_i > 0. \]

Exercise 1.4.11 Let $\Delta = \det(x_{ij})$ be the determinant of a real $n \times n$ matrix $(x_{ij})$. Let $v_i := (x_{i1}, \ldots, x_{in})$, the $i$-th row vector. Let $d_i := \|v_i\|$. We indicate a proof of the following famous result of Hadamard:
\[ |\Delta| \le d_1 d_2 \cdots d_n. \]
Let $r_i(x) := \bigl(\sum_j x_{ij}^2\bigr)^{1/2}$ be the norm of the $i$-th row vector. Let $d_i > 0$ be given. We wish to find the maximum of $f(x) := \det(x)$ subject to the constraints
\[ g_i(x) := r_i(x)^2 - d_i^2 = 0, \qquad 1 \le i \le n. \]
Note that we have
\[ f(x) = \sum_j x_{ij} X_{ij} \quad \text{for each } i. \]
Here, as is customary, $X_{ij}$ stands for the cofactor of $x_{ij}$. Thus if $a$ is a point where a maximum is attained, then, writing $A_{ij}$ for the cofactor of $a_{ij}$,
\[ \frac{\partial f}{\partial x_{ij}}(a) = A_{ij}, \qquad \frac{\partial g_i}{\partial x_{ij}}(a) = 2a_{ij}. \]
Thus the Lagrange conditions are
\[ A_{ij} + 2\lambda_i a_{ij} = 0 \quad \text{for all } i, j. \tag{1.4.9} \]
Multiplying Equation (1.4.9) by $a_{ij}$ and summing with respect to $j$ we get
\[ \sum_j a_{ij} A_{ij} + 2\lambda_i \sum_j a_{ij}^2 = f(a) + 2\lambda_i d_i^2 = 0. \tag{1.4.10} \]
Multiplying Equation (1.4.10) by $a_{ij}$ we get
\[ a_{ij} f(a) + 2 a_{ij} \lambda_i d_i^2 = 0 \quad \text{for all } i, j. \tag{1.4.11} \]
Combining Equations (1.4.9) and (1.4.11) we get
\[ a_{ij} f(a) = A_{ij} d_i^2. \tag{1.4.12} \]
We take $b := (b_{ij})$ where $b_{ij} := A_{ji}$. Then $ab = f(a)I$ so that taking determinants on both sides, and using Equation (1.4.12) to compute $\det(b)$, yields:
\[ f(a)^n = \det(ab) = \det(a)\det(b) = f(a) \left( \frac{f(a)^n}{d_1^2 \cdots d_n^2} \right) f(a). \tag{1.4.13} \]
Hence we get $(f(a))^2 = d_1^2 \cdots d_n^2$.

There is a very nice geometric interpretation of this result. First of all an observation. Since $A_{i1}a_{k1} + \cdots + A_{in}a_{kn} = 0$ for $i \ne k$ and since $a_{ij}$ is proportional to $A_{ij}$ we see that the rows of $a$ are orthogonal:
\[ a_{i1}a_{k1} + \cdots + a_{in}a_{kn} = 0 \quad \text{for } i \ne k. \]
Given $n$ vectors $v_i := (x_{i1}, \ldots, x_{in})$ we can think of the determinant $\det(x) := \det(x_{ij})$ as an oriented volume of the parallelepiped $[v_1 \cdots v_n]$ spanned by the $n$ vectors $v_i$. Let $d_i$ be given positive constants. Hadamard's inequality says that under the restriction that $\|v_i\| = d_i$, the volume (:= the absolute value of the oriented volume) is a maximum when the vectors are orthogonal.
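Hadamard's inequality, in the form $|\det(x)| \le d_1 \cdots d_n$ with $d_i$ the norm of the $i$-th row, is easy to test on examples. The following sketch is an illustration only; the cofactor-expansion determinant and the helper names are choices made for this example and are suitable only for small matrices.

```python
import math
import random

def det(m):
    """Determinant by cofactor expansion along the first row (small n only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def row_norm_product(m):
    """The product d_1 ... d_n of the Euclidean norms of the rows."""
    return math.prod(math.sqrt(sum(x * x for x in row)) for row in m)

def hadamard_gap(m):
    """d_1 ... d_n - |det m|; Hadamard's inequality says this is >= 0."""
    return row_norm_product(m) - abs(det(m))

rng = random.Random(0)
sample = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(4)]
print(hadamard_gap(sample))       # nonnegative for any real matrix

orthogonal_rows = [[3, 0, 0], [0, 0, 2], [0, 5, 0]]
print(hadamard_gap(orthogonal_rows))  # zero: equality when rows are orthogonal
```

The second matrix has mutually orthogonal rows of lengths 3, 2 and 5, which is the equality case described in the geometric interpretation.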
Exercise 1.4.12 Show that the maximum area enclosed by a triangle of a given perimeter $2s$ is obtained by the equilateral triangle. Hint: Use Heron's formula: $A^2 = s(s - a)(s - b)(s - c)$.
Exercise 1.4.13 Find the extrema of
\[ f(x,y) = x^2 - y^2 \quad \text{on } S := \{x^2 + y^2 = 1\}. \]
Draw pictures and understand the geometric meaning of your solution.
Exercise 1.4.14 Show that for any $a \in \mathbb{R}^n$, we have
\[ \|a\| = \max\{a \cdot x : \|x\| = 1\}. \]
Exercise 1.4.15 Let $f(x) := x_1 \cdots x_n$. Find its extrema on
\[ S := \Bigl\{x \in \mathbb{R}^n : \sum_k x_k = 1, \text{ and } x_k \ge 0 \text{ for all } k\Bigr\}. \]
1.5 Local study of immersions and submersions

We introduce some definitions. Using the standard notation, we say a map $F\colon W \subset \mathbb{R}^n \to W' \subset \mathbb{R}^n$ is a $C^1$-diffeomorphism of an open set $W$ onto $W'$ if $F$ is bijective and $C^1$ on $W$ and its inverse $F^{-1}\colon W' \to W$ is also $C^1$. Thus we can rephrase IMT as saying that if $F$ is nonsingular at $a$, it is a $C^1$-diffeomorphism of a neighborhood of $a$ onto an open set containing $F(a)$. Note that in this case the domain and the range have the same dimension ($n = m$ in the notation below).
Definition 1.5.1 Let $F\colon U \subset \mathbb{R}^n \to \mathbb{R}^m$ be $C^1$ on an open set $U$. We say $F$ is an immersion at $x \in U$ if $DF(x)$ is one-one on $\mathbb{R}^n$, so that $n \le m$. It is said to be a submersion at $x$ if $DF(x)$ maps $\mathbb{R}^n$ onto $\mathbb{R}^m$, so that $n \ge m$.
We shall establish one of the important consequences of IMT.
Theorem 1.5.2 Let $F\colon U \subset \mathbb{R}^n \to \mathbb{R}^m$ be $C^1$. Assume that $F$ is an immersion at $p \in U$. Then there is a $C^1$-diffeomorphism $\Phi$ of an open neighborhood $V$ of $q = F(p) \in \mathbb{R}^m$ onto an open set $W := \Phi(V) \subset \mathbb{R}^n \times \mathbb{R}^{m-n}$ such that
\[ \Phi \circ F(z) = (z, 0) \quad \text{for all } (z,0) \in W \cap (\mathbb{R}^n \times \{0\}). \]
Proof The theorem can be reformulated as follows: If $F = (f_1, \ldots, f_m)$ is a $C^1$-immersion at $p$ then around $q$ we can choose a set of new coordinates in such a way that with respect to the new coordinates $F$ looks like the natural or canonical inclusion map in an open neighborhood of $p$:
\[ (x_1, \ldots, x_n) \in \mathbb{R}^n \mapsto (x_1, \ldots, x_n, 0, \ldots, 0) \in \mathbb{R}^m. \]
The hypothesis means that the matrix $\left(\frac{\partial f_i}{\partial x_j}(p)\right)_{1 \le i \le m,\ 1 \le j \le n}$ has rank $n$. We may assume without loss of generality that $\left(\frac{\partial f_i}{\partial x_j}(p)\right)_{1 \le i,j \le n}$ is invertible. We consider the map $G\colon U \times \mathbb{R}^{m-n} \to \mathbb{R}^m$ given by
\[ G(z,w) = F(z) + (0,w) \quad \text{for } z \in U,\ w \in \mathbb{R}^{m-n}. \]
Then $G(z,0) = F(z)$ and
\[ DG(p,0) = \begin{pmatrix} \left(\frac{\partial f_i}{\partial x_j}(p)\right)_{1 \le i,j \le n} & 0 \\ * & I_{m-n} \end{pmatrix} \]
and hence it is invertible. By IMT, there exists a neighborhood $\tilde{V}$ of $(p,0)$ in $\mathbb{R}^n \times \mathbb{R}^{m-n}$ on which $G$ is a $C^1$-diffeomorphism. We take $V = G(\tilde{V})$ and $W = \tilde{V}$. We set $\Phi = G^{-1}$. Then
\[ \Phi \circ F(z) = \Phi \circ G(z,0) = (z,0) \]
for all $z \in W \cap \mathbb{R}^n$. That is, for all $(z,0) \in \tilde{V}$.
$\square$
This is a typical way of using IMT. That is, we use it often to change
the coordinates so that the problem on hand becomes easier or geomet-
rically more meaningful.
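Theorem 1.5.3 Let $f\colon U \subset \mathbb{R}^n \to \mathbb{R}^m$ be a submersion at $p$. Then there exists a neighbourhood $U_0$ of $p$ in $U$ and a diffeomorphism $\varphi\colon U_0 \to \varphi(U_0)$ such that the map $f \circ \varphi^{-1}\colon \varphi(U_0) \to \mathbb{R}^m$ is given by
\[ f \circ \varphi^{-1}(x_1, \ldots, x_m, x_{m+1}, \ldots, x_n) = (x_1, \ldots, x_m). \]
Proof Once again, we write $f = (f_1, \ldots, f_m)$. Assume without loss of generality that $\left(\frac{\partial f_i}{\partial x_j}(p)\right)_{1 \le i,j \le m}$ is invertible. Let $k = n - m$. Consider the map $\varphi\colon U \to \mathbb{R}^m \times \mathbb{R}^k$ given by
\[ \varphi(x) = (f(x), x_{m+1}, \ldots, x_n), \quad \text{if } x = (x_1, \ldots, x_m, x_{m+1}, \ldots, x_n). \]
Then
\[ D\varphi(p) = \begin{pmatrix} \left(\frac{\partial f_i}{\partial x_j}(p)\right)_{1 \le i,j \le m} & * \\ 0 & I_{k \times k} \end{pmatrix} \]
is invertible. By IMT, there exists a neighbourhood $V_p$ of $p$ such that $\varphi\colon V_p \to \varphi(V_p)$ is a diffeomorphism. Now $f \circ \varphi^{-1}\colon \varphi(V_p) \to \mathbb{R}^m$ is given by $(y_1, \ldots, y_m, y_{m+1}, \ldots, y_n) \mapsto (y_1, \ldots, y_m)$, so we may take $U_0 := V_p$.
$\square$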
1.6 Fundamental theorem of calculus

We fix a closed and bounded interval $J := [a,b] \subset \mathbb{R}$ and a real Banach space $E$. A map $f\colon J \to E$ is called a step function if there exists a partition $a := t_0 < t_1 < \cdots < t_n := b$ of $J$ such that $f$ is constant on each $[t_i, t_{i+1})$ for $0 \le i \le n - 1$. We denote by $S(J,E)$ the set of all such step functions. If $f_1, f_2$ are in $S(J,E)$ then $f_1 + f_2 \in S(J,E)$. This is easily seen by taking a common refinement of the partitions corresponding to the $f_i$. Thus we see that $S(J,E)$ is a vector subspace of $\mathcal{B}(J,E)$, the Banach space of all bounded functions from $J$ to $E$. The norm on $\mathcal{B}(J,E)$ is of course the supremum norm.
We set $I(f) := \sum_{0 \le i \le n-1} (t_{i+1} - t_i) f(t_i)$. We leave it to the reader to check that the above definition is independent of the partition chosen and hence is well-defined. It is called the integral of $f$. We obviously have:
\[ \|I(f)\| \le \int_a^b \|f(t)\|\, dt \le (b - a) \sup_{t \in J} \|f(t)\| =: (b - a)\|f\|_\infty. \]
Thus $I\colon S(J,E) \to E$ is a continuous linear map and hence it extends to a continuous linear map from the closure $\overline{S(J,E)}$ of $S(J,E)$ in $\mathcal{B}(J,E)$ to $E$. If $f \in \overline{S(J,E)}$ we denote $I(f)$ by $\int_a^b f(t)\, dt$.
The most important class of functions that lie in $\overline{S(J,E)}$ is $C(J,E)$, the space of continuous functions: If $f \in C(J,E)$, by uniform continuity of $f$ on $J$, for $\varepsilon > 0$ given, there exists a $\delta > 0$ such that $\|f(x) - f(y)\| < \varepsilon$ for all $x, y \in J$ with $|x - y| < \delta$. Now for any partition $a := t_0 < t_1 < \cdots < t_n := b$ of $J$ with $|t_{i+1} - t_i| < \delta$ we define a step function $g$ by setting $g(t) := f(t_i)$ if $t_i \le t < t_{i+1}$. Then by the triangle inequality it follows that $\|f - g\| < 2\varepsilon$.
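The construction above, approximating a continuous $f$ by the step function taking the value $f(t_i)$ on $[t_i, t_{i+1})$ and integrating that, can be sketched in a few lines. This is an illustration under assumptions: $E = \mathbb{R}^2$ is represented by tuples, the partition is uniform, and the sample curve is chosen only because its integral is known.

```python
import math

def step_integral(f, a, b, n):
    """I(g) for the step function g with g(t) = f(t_i) on [t_i, t_{i+1}).

    The partition is uniform with n pieces, and f returns a tuple,
    standing for a vector in E = R^d.
    """
    h = (b - a) / n
    values = [f(a + i * h) for i in range(n)]
    dim = len(values[0])
    # I(g) = sum_i (t_{i+1} - t_i) f(t_i), computed coordinate by coordinate.
    return tuple(h * sum(v[k] for v in values) for k in range(dim))

def curve(t):
    # Illustrative continuous map J -> R^2.
    return (math.cos(t), math.sin(t))

approx = step_integral(curve, 0.0, math.pi / 2, 100000)
print(approx)  # close to (1, 1), the integral of (cos t, sin t) over [0, pi/2]
```

As the mesh of the partition shrinks, the step functions converge uniformly to $f$, and continuity of $I$ forces the values $I(g)$ to converge to $\int_a^b f(t)\,dt$.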
We point out an important property of this integral: Let $F$ be another Banach space and let $T \in BL(E,F)$. If $f \in S(J,E)$, then $T \circ f \in S(J,F)$ and hence for every $f \in \overline{S(J,E)}$ we have $T \circ f \in \overline{S(J,F)}$. We also have $I(T \circ f) = T(I(f))$. A particular case of this is got when $F = \mathbb{R}$. In this case we get back $I(f)$ as soon as we know $I(u \circ f)$ for all $u \in E^*$, provided that $E$ is reflexive. (Ignore this and the next line if you do not know what $E^*$ and reflexive spaces are.) For, in this situation, we can define $I(f) \in E$ to be the vector given by the duality:
\[ \langle I(f), u \rangle := I(u \circ f) \quad \text{for all } u \in E^*. \]
A special case of this is when $E$ is a Hilbert space. Here we can describe the integral of $f$ as that vector $v \in E$ which satisfies:
\[ \langle v, w \rangle = I(\langle f(t), w \rangle) \quad \text{for all } w \in E. \]
Now we specialize this further by taking $E = \mathbb{R}^n$. Here if we write in the usual notation $f := (f_1, \ldots, f_n)$, then $I(f) = (I(f_1), \ldots, I(f_n))$, a most gratifying result.
Exercise 1.6.1 Verify all the claims made in the previous paragraph.
After these preliminaries are over, we are ready for the fundamental theorem of calculus.
Recall that a function $f\colon J \to E$ is differentiable at $t$ iff
\[ f'(t) = \lim_{h \to 0} \frac{f(t + h) - f(t)}{h} \]
exists, and we have $f'(t) = Df(t)(1)$.
Theorem 1.6.2 (Fundamental theorem of calculus)
i) If $f\colon J \to E$ is continuous then the map $F\colon (a,b) \to E$ given by $F(x) := \int_a^x f(t)\, dt$ is differentiable and we have $F'(x) = f(x)$.
ii) If $f\colon J \to E$ is continuous and differentiable on $(a,b)$ and if $Df$ extends continuously to $J$, then
\[ f(b) - f(a) = \int_a^b f'(t)\, dt. \]
Proof i) For a fixed $r \in (a,b)$, we have
\[ \|F(r + h) - F(r) - h f(r)\| = \left\| \int_r^{r+h} (f(t) - f(r))\, dt \right\| \le |h| \sup_{r \le t \le r+h} \|f(t) - f(r)\| = o(|h|), \]
since $f$ is continuous. This proves i).
ii) Define $g(x) := \int_a^x Df(t)\, dt - f(x)$. By i), $g'(x) = 0$ on $(a,b)$. Hence $g$ is a constant and in particular we have $g(a) = g(b)$. Since $g(a) = -f(a)$ and $g(b) = \int_a^b f'(t)\, dt - f(b)$, the result follows.
$\square$
1.7 Higher derivatives and Taylor's theorem
Higher derivatives
In this section we shall define derivatives of higher order. Let $f\colon U \subset E \to F$ be differentiable. We thus have a map $f'\colon U \to L(E,F)$ given by $x \mapsto f'(x)$. While working with higher derivatives, it is simpler to use $f'(x)$ in place of $Df(x)$ so that we can denote the second derivative as $f''(x)$ etc.
We now ask whether $f'$ is differentiable on $U$. Notice that the derivative of $f'$ at a point, if it exists, is a continuous linear map from $E$ to $L(E,F)$. That is, $(f')'(x) \in L(E, L(E,F))$. As is customary we shall denote by $f''$ the second derivative of $f$. We say $f$ is $C^2$ on $U$ if $f''(x)$ exists for every $x \in U$ and the map $x \mapsto f''(x)$ is continuous. It is now clear how to define higher derivatives $f^{(k)}$ of $f$. By recursion we set $f^{(k)} := (f^{(k-1)})'$. Thus $f^{(k)}(x) \in L(E, \ldots, L(E,F))$. We say $f$ is $C^k$ on $U$ if $x \mapsto f^{(k)}(x)$ is continuous.
In dealing with these higher derivatives the crucial observation is the fact that the space $L(E, \ldots, L(E,F))$ is norm isomorphic (that is, linear isometric) to the space $L^k(E,F)$ of all continuous $k$-linear maps from $E$ to $F$. Recall that the norm on $L^k(E,F)$ is defined as follows:
\[ \|B\| := \sup\{\|B(x_1, \ldots, x_k)\| : \|x_i\| \le 1 \text{ for } 1 \le i \le k\}. \]
To establish the claimed isometry, we first of all look at the case $k = 2$. Then for $B \in L(E, L(E,F))$ we consider $\tilde{B} \in L^2(E,F)$ defined by $\tilde{B}(x_1, x_2) := B(x_1)(x_2)$ for all $x_1, x_2 \in E$. We leave it to the reader to check that $B \mapsto \tilde{B}$ is a linear isometry. Now we can proceed by induction to define $\tilde{B}(x_1, \ldots, x_k) := B(x_1)(x_2, \ldots, x_k)$. We denote by $L^k(E_1 \times \cdots \times E_k, F)$ the space of all continuous $k$-linear maps from $E_1 \times \cdots \times E_k \to F$.
Examples 1.7.1
(1) If $f$ is a linear map from $E$ to $F$, then $f'' = 0$. (Was not this answered by you earlier?)
(2) If $f\colon E_1 \times \cdots \times E_n \to F$ is a continuous map which is linear in each variable, what is $f^{(k)}$ for $k \in \mathbb{N}$?
(3) Let $U := GL(E)$ be the open set in $BL(E)$ consisting of all invertible bounded linear maps on $E$. We now consider the map $f\colon U \to U$ given by $f(A) = A^{-1}$. Recall that $f'(A)$ is given by
\[ f'(A)(X) = -A^{-1} \circ X \circ A^{-1}. \]
A simple application of the chain rule and induction shows that $f$ is $C^k$ for all $k \in \mathbb{N}$.
(4) In this example, we reconcile our notation of higher, say, second derivative of functions from $\mathbb{R}^n$ to $\mathbb{R}$ with the classical notation. Let $f\colon U \subset \mathbb{R}^n \to \mathbb{R}$ be twice differentiable. What is $f''(x)$? We claim that with respect to the standard basis of $\mathbb{R}^n$ we have
\[ f''(x) = \left( \frac{\partial^2 f}{\partial x_i \partial x_j} \right)_{1 \le i,j \le n}. \]
We shall indicate a proof of this. First we notice that $f''(x) \in L^2(\mathbb{R}^n, \mathbb{R})$ is a bilinear map so that the matrix representation of $f''(x)$ with respect to the usual basis $\{e_i\}$ has entries $b_{ij}$ given by $b_{ij} = f''(x)(e_i, e_j)$. Now by definition $f'' = g'$ where $g := f'$ and we have seen (or remarked in Section 1.1) that
\[ g'(x)(e_j) = \langle \operatorname{grad} g(x), e_j \rangle = \frac{\partial}{\partial x_j} g(x) \quad \text{etc.} \]
This proves the result.

An immediate corollary of item (3) of Example 1.7.1 is the fact that the inverse mapping theorem continues to be true for $C^k$-maps for $k \ge 1$. That is, if we assume that $f$ is $C^k$ in the statement of the inverse mapping theorem, then its inverse $f^{-1}$ is also $C^k$. The product and chain rules also continue to hold for $C^k$-maps.
Proposition 1.7.2 (Product rule) Let $E$, $F_1$, $F_2$ and $G$ be Banach spaces. Let $U$ be an open set in $E$ and $B \in L^2(F_1 \times F_2, G)$ be a continuous bilinear map. Let $f_i\colon U \to F_i$ be $k$-times differentiable maps (respectively, $C^k$). Consider $g(x) = B(f_1(x), f_2(x))$. Then we have
\[ g'(x)(h) = B(f_1'(x)h, f_2(x)) + B(f_1(x), f_2'(x)h) \tag{1.7.1} \]
and $g$ is $k$-times differentiable (respectively, $C^k$).
Proof By hypothesis we have
\[ f_1(x + h) = f_1(x) + f_1'(x)h + o(h), \qquad f_2(x + h) = f_2(x) + f_2'(x)h + o(h), \]
so that
\[ g(x + h) - g(x) = B(f_1(x + h), f_2(x + h)) - B(f_1(x), f_2(x)) = B(f_1'(x)h, f_2(x)) + B(f_1(x), f_2'(x)h) + o(h), \]
as $B$ is continuous and bilinear. Hence $g$ is differentiable and Equation (1.7.1) holds.
To prove that $g$ is $C^k$ if the $f_i$ are so, we proceed by induction. We assume the result for $k - 1 \ge 1$. Let the $f_i$ be $C^k$. Then Equation (1.7.1) is true. The map $\alpha\colon L(E, F_1) \times F_2 \to L(E, G)$ given by $(A, y) \mapsto T$, where $T(x) := B(Ax, y)$, is continuous and bilinear. Since $f_1'$ and $f_2$ are $C^{k-1}$, by the result for $k - 1$ we see that the map $x \mapsto \alpha(f_1'(x), f_2(x))$ is $C^{k-1}$, and a similar result holds for the other term in Equation (1.7.1). Hence $g'$ is $C^{k-1}$, or $g$ is $C^k$. We remark that the same proof shows that we can replace "$C^k$" by "$k$-times differentiable". The proposition is proved.
$\square$
Theorem 1.7.3 (Taylor's theorem) Let $f\colon U \subset E \to F$ be $C^k$. Assume that $U$ is star shaped at $x \in U$. That is, for any $z \in U$ the line segment joining $x$ and $z$ lies entirely in $U$. Then for $y$ with $x + y \in U$ we have
\[ f(x + y) = f(x) + f'(x)y + \cdots + \frac{1}{k!} f^{(k)}(x)\, y^k + o(\|y\|^k), \]
where $y^k := (y, \ldots, y) \in E \times \cdots \times E$ ($k$ times).
Proof For $k = 1$ this is the definition of derivative. Let $f$ be $C^k$ and assume the result for $k - 1$. Then $f'$ is $C^{k-1}$ so that, for $0 \le t \le 1$,
\[ f'(x + ty)y = f'(x)y + t f''(x) y^2 + \cdots + \frac{t^{k-1}}{(k - 1)!} f^{(k)}(x) y^k + o(t^{k-1}\|y\|^k). \]
We now apply the Fundamental Theorem of Calculus in the form
\[ f(x + y) = f(x) + \int_0^1 f'(x + ty)y\, dt \]
to get the result.
$\square$
As we have explained earlier, the reader should note that the Taylor coefficients are the directional derivatives of $f$ if $f$ is real valued. To obtain the integral form of the remainder in the Taylor formula above we use again the fundamental theorem of calculus and integrate by parts.

For example, we wish to write the remainder term for $k = 1$ as
\[ R_1(y) = (f(x + y) - f(x)) - f'(x)y = \int_0^1 f'(x + ty)y\, dt - f'(x)y = \int_0^1 (f'(x + ty) - f'(x))\, y\, dt = \text{``}\int_0^1 u v'\, dt\text{''}. \]
The trick here is to take $v := (t - 1)y$ instead of the obvious choice $v = ty$. (See what happens to the following analysis if we take $v = ty$.) Hence we write (for $u = [f'(x + ty) - f'(x)]$ and $v = (t - 1)y$)
\[ R_1(y) = \int_0^1 (f'(x + ty) - f'(x))\, d((t - 1)y) = \Bigl[ (f'(x + ty) - f'(x))(t - 1)y \Bigr]_0^1 - \int_0^1 f''(x + ty)y^2 (t - 1)\, dt. \]
The first term on the right side of the last equation is $0$ because of our wise choice. Thus we get
\[ R_1(y) = \int_0^1 (1 - t) f''(x + ty) y^2\, dt. \]
Hereafter it is plain sailing. By induction we see that
\[ f(x + y) = \sum_{j=0}^{k-1} \frac{1}{j!} f^{(j)}(x) y^j + \frac{1}{(k - 1)!} \int_0^1 f^{(k)}(x + ty) y^k (1 - t)^{k-1}\, dt. \]
Notice that in this proof we once again used only the directional derivative of $f$ at $x$ in the direction of $y$. Since
\[ R_{k-1}(y) = \frac{1}{(k - 1)!} \int_0^1 f^{(k)}(x + ty) y^k (1 - t)^{k-1}\, dt, \]
it is obvious that $R_{k-1}(y) = o(\|y\|^{k-1})$. (Check this.)

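We now specialize to the case when $E = \mathbb{R}^n$ and $F = \mathbb{R}$. With the above notation, if we set $g(t) := f(x + ty)$ then
\[ g'(t) = \sum_j y_j D_j f(x + ty), \]
where $D_j = \partial/\partial x_j$, and
\[ g''(t) = \sum_{i,j} y_i y_j (D_i D_j f)(x + ty), \]
and so on. In particular, by induction, we get that
\[ g^{(r)}(t) = \sum_I y_{i_1} \cdots y_{i_r} D_I f(x + ty), \quad \text{where } I := (i_1, \ldots, i_r) \text{ and } D_I := D_{i_1} \cdots D_{i_r}. \tag{1.7.2} \]
We now introduce some standard notation to recast Taylor's theorem in its usual form. For $\alpha := (\alpha_1, \ldots, \alpha_n)$ with $\alpha_i \ge 0$ integers, we set, for $y = (y_1, \ldots, y_n)$,
\[ y^\alpha := y_1^{\alpha_1} \cdots y_n^{\alpha_n} \quad \text{and} \quad D^\alpha := D_1^{\alpha_1} \cdots D_n^{\alpha_n}. \]
We also define $|\alpha| := \sum_i \alpha_i$ and $\alpha! := \alpha_1! \cdots \alpha_n!$. Each monomial $y_{i_1} \cdots y_{i_r}$ is of the form $y^\alpha$, with $|\alpha| = r$. Now given $\alpha$ with $|\alpha| = r$, there are $r!/\alpha!$ different $r$-tuples $(i_1, \ldots, i_r)$ in which $j$ occurs $\alpha_j$ times. Hence we have
\[ g^{(r)}(t) = \sum_{|\alpha| = r} \frac{r!}{\alpha!} D^\alpha f(x + ty)\, y^\alpha \quad \text{for } 0 \le r \le k. \]
Thus Taylor's formula becomes
\[ f(x + y) = \sum_{|\alpha| \le k-1} D^\alpha f(x) \frac{1}{\alpha!} y^\alpha + k \int_0^1 (1 - t)^{k-1} \sum_{|\alpha| = k} D^\alpha f(x + ty) \frac{1}{\alpha!} y^\alpha\, dt. \]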
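The counting step used above, that for a multi-index $\alpha$ with $|\alpha| = r$ there are $r!/\alpha!$ different $r$-tuples $(i_1, \ldots, i_r)$ in which $j$ occurs $\alpha_j$ times, can be confirmed by brute force for small cases. The sketch below is only a check by enumeration; the function names are invented for the example and indices run from $0$ rather than $1$.

```python
from itertools import product
from math import factorial

def count_tuples(alpha):
    """Count r-tuples over {0, ..., n-1} in which j occurs alpha[j] times."""
    n, r = len(alpha), sum(alpha)
    return sum(
        1
        for t in product(range(n), repeat=r)
        if all(t.count(j) == alpha[j] for j in range(n))
    )

def multinomial(alpha):
    """The number r!/alpha! with r = |alpha| and alpha! = alpha_1! ... alpha_n!."""
    denom = 1
    for a in alpha:
        denom *= factorial(a)
    return factorial(sum(alpha)) // denom

for alpha in [(2, 1, 1), (3, 0, 2), (1, 1)]:
    print(alpha, count_tuples(alpha), multinomial(alpha))
```

The enumeration visits all $n^r$ tuples, so it is meant only for small $n$ and $r$.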
Corollary 1.7.4 If $f$ is as above, and if $0 \in U$ is such that $f(0) = 0$, then we can write $f(x) = \sum_i x_i f_i(x)$ where the $f_i$ are $C^{k-1}$ around $0$.
Proof Apply Taylor's formula for $k = 1$ to get
\[ f(x) = f(0) + \sum_i x_i \int_0^1 D_i f(tx)\, dt \]
and complete the proof.
$\square$
Exercise 1.7.5 Apply the fundamental theorem of calculus to the function $g(t) := f(tx)$ for $f$ as in Corollary 1.7.4 and deduce Corollary 1.7.4.
Definition 1.7.6 We say a function is smooth or $C^\infty$ on an open set $U \subset \mathbb{R}^n$ if it is $C^k$ for all $k \ge 0$.
It is easy to see that a function $f$ is $C^\infty$ on an open set $U$ if and only if all its partial derivatives of all orders exist and are continuous. That is, $D^\alpha f(x)$ exists and is continuous in $x$ for all multi-indices $\alpha = (\alpha_1, \ldots, \alpha_n)$ with $\alpha_j \ge 0$ and for all $x \in U$. (Verify this.)
Maxima and minima

We give some standard applications of Taylor expansion to problems of maxima and minima of functions $f\colon U \subset \mathbb{R}^n \to \mathbb{R}$ in the form of graded exercises.
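Exercise 1.7.7 Write the second order Taylor expansion of a $C^2$ function as follows, where $x$ is a suitable point on the line segment joining $a$ and $a + h$ and $D^2 f(x)h$ stands for the quadratic form $\sum_{i,j} D_i D_j f(x) h_i h_j$:
\[ f(a + h) - f(a) = \langle \operatorname{grad} f(a), h \rangle + \frac{1}{2} D^2 f(x)h = \langle \operatorname{grad} f(a), h \rangle + \frac{1}{2} D^2 f(a)h + \|h\|^2 E(h), \]
where
\[ \|h\|^2 E(h) = \frac{1}{2}\bigl[D^2 f(x)h - D^2 f(a)h\bigr] = \frac{1}{2} \sum_{i,j=1}^n \bigl[D_i D_j f(x) - D_i D_j f(a)\bigr] h_i h_j. \]
Conclude that $|E(h)| \to 0$ as $\|h\| \to 0$.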

Definition 1.7.8 Let $f\colon U \subset \mathbb{R}^n \to \mathbb{R}$ be a function. A point $a \in U$ is said to be a local maximum, if there exists an open set $B$ containing $a$ such that $f(a) \ge f(x)$ for all $x \in B$.
A local minimum is similarly defined.
Exercise 1.7.9 Let $f\colon U \subseteq \mathbb{R}^m \to \mathbb{R}$ be differentiable and $x$ be a local maximum or local minimum. Show that $\operatorname{grad} f(x) = 0$. Hint: Use the standard trick. Consider the function $g(t) := f(x + tv)$ for any $v \in \mathbb{R}^m$.
Exercise 1.7.10 Let $f\colon U \subseteq \mathbb{R}^m \to \mathbb{R}$ be differentiable. Show that the unit direction $v$ in which $|D_v f(x)|$ has the maximum value is along $\operatorname{grad} f(x)$. Use this to understand the geometry behind Exercise 1.7.9.
Definition 1.7.11 We say a point $a$, in the domain of a differentiable (real valued) function $f$, is a critical point if the gradient of $f$ at $a$ is zero.
Exercise 1.7.12 Any point of local maximum or local minimum is a critical point. Give an example of a critical point which is neither a local maximum nor a local minimum.
We now end this section with a sufficient condition in terms of second

order derivatives for local extrema.
Exercise 1.7.13 (Sufficient condition for local extrema) Let $f$ be a real valued $C^2$-function on an open subset $U \subseteq \mathbb{R}^n$. Assume that $a \in U$ is a critical point of $f$. Let
\[ Q(h) := \sum_{i,j=1}^n D_i D_j f(a)\, h_i h_j. \]
(i) If $Q(h) > 0$ for all $h \ne 0$, then $f$ has a local minimum at $a$.
(ii) If $Q(h) < 0$ for all $h \ne 0$, then $f$ has a local maximum at $a$.
(iii) If $Q(h)$ is indefinite, then in every neighborhood of $a$ we can find points $x, y$ such that $f(x) < f(a) < f(y)$. Such a point $a$ is said to be a saddle point of $f$.
Definition 1.7.14 The matrix $D^2 f(a) = \left(\frac{\partial^2 f}{\partial x_i \partial x_j}(a)\right)$ of a $C^2$ function $f$ is called the Hessian of $f$ at $a$. A symmetric matrix $A = (a_{ij})$ is said to be positive definite (negative definite) iff $\langle Ax, x \rangle > 0$ (respectively $\langle Ax, x \rangle < 0$) for $x \ne 0$. Thus a critical point $a$ of $f$ is a point of local minimum (maximum) if the Hessian $Hf(a)$ of $f$ at $a$ is positive (respectively negative) definite.
Exercise 1.7.15 There are two well-known criteria for the positive definiteness of a symmetric matrix $A = (a_{ij})$.
(i) All the eigenvalues of $A$ are positive.
(ii) All the matrices $(a_{ij})_{1 \le i,j \le k}$ for $1 \le k \le n$ have positive determinants.
Prove the first criterion for all $n$ and the second criterion for $n = 2$.
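For $n = 2$ the second criterion of Exercise 1.7.15, combined with Exercise 1.7.13, gives a mechanical test at a critical point. The following sketch is only an illustration: the function name, the returned strings, and the sample Hessians are assumptions made for the example, and the degenerate case (zero determinant) is reported as inconclusive rather than decided.

```python
def classify_2x2(a11, a12, a22):
    """Classify Q(h) = a11 h1^2 + 2 a12 h1 h2 + a22 h2^2 for a symmetric Hessian.

    det > 0 and a11 > 0: positive definite, so a local minimum.
    det > 0 and a11 < 0: negative definite, so a local maximum.
    det < 0: indefinite, so a saddle point.
    det == 0: the second order test decides nothing.
    """
    det = a11 * a22 - a12 * a12
    if det > 0:
        return "local minimum" if a11 > 0 else "local maximum"
    if det < 0:
        return "saddle point"
    return "inconclusive"

# Hessians at the critical point 0 of some sample functions:
print(classify_2x2(2, 0, 2))     # f = x^2 + y^2
print(classify_2x2(2, 0, -2))    # f = x^2 - y^2
print(classify_2x2(-2, -1, -2))  # f = -x^2 - xy - y^2
print(classify_2x2(0, 0, 0))     # f = x^4 + y^4
```

Note that when the determinant is positive, $a_{11}$ cannot vanish, so the two definite cases exhaust that branch.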
1.8 Smooth functions with compact support

Definition 1.8.1 The support of a function $f\colon U \to \mathbb{R}$ is defined to be the closure of the set $\{x \in U : f(x) \ne 0\}$.
For later use we need the existence of smooth functions with compact support on any open subset of $\mathbb{R}^n$.
The basic ingredient in the construction of smooth functions with compact support is the function $f$ defined on $\mathbb{R}$ as follows:
\[ f(t) = \begin{cases} 0 & \text{for } t \le 0; \\ \exp(-1/t) & \text{for } t > 0. \end{cases} \]
For any point $x \ne 0$, $f$ is smooth near $x$ since it is the composition of two smooth functions. (Prove that the composition of two smooth functions is smooth!) To prove that $f$ is smooth on $\mathbb{R}$ we need only show that $f$ is smooth at $0$.
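We first observe that for $x > 0$, we have
\[ e^x = \sum_k \frac{x^k}{k!} > \frac{1}{k!} x^k \quad \text{for all } k \in \mathbb{N}. \]
Hence for $x > 0$ we have
\[ 0 < f(x) = \exp(-1/x) = \frac{1}{\exp(1/x)} < \left( \frac{1}{k!} \left( \frac{1}{x} \right)^k \right)^{-1} = k!\, x^k \]
for $k \in \mathbb{N}$. Thus $\lim f(x) = 0$ as $x \to 0+$. That is, $f$ is continuous at $0$.
By induction it follows that for all $x > 0$ we have
\[ f^{(k)}(x) = p_k(x^{-1}) f(x) \]
for some polynomial $p_k$ of degree at most $2k$. By way of induction let us assume that $f^{(k)}$ exists and is continuous at $0$. Then $f^{(k)}(0) = 0$. Now for $x > 0$ we have
\[ \frac{f^{(k)}(x) - f^{(k)}(0)}{x} = x^{-1} p_k(x^{-1}) f(x). \]
Since in $x^{-1} p_k(x^{-1})$ the power of $x^{-1}$ is at most $2k + 1$ and since $|f(x)| \le n!\, x^n$ for every $n$, we see that for all $n > 2k + 1$ and $0 < x < 1$ the following holds, where $C_k$ is the sum of the absolute values of the coefficients of $p_k$:
\[ |f(x) x^{-1} p_k(x^{-1})| \le C_k\, n!\, x^{n - 2k - 1} \to 0 \quad \text{as } x \to 0+. \]
The difference quotient from the left is identically $0$. Thus $f^{(k+1)}(0)$ exists and is $0$. An analogous argument yields the continuity of $f^{(k+1)}(x) = p_{k+1}(x^{-1}) f(x)$ at $x = 0$. Thus $f$ is a smooth function on $\mathbb{R}$. We now use this function to construct smooth functions with compact support on $\mathbb{R}^n$.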
Let $0 < a < b$. Consider the function $f_a\colon \mathbb{R} \to \mathbb{R}$ given by
\[ f_a(t) = \exp(-1/(t - a)) \]
for $t > a$ and $0$ otherwise. Similarly consider the function $g_b\colon \mathbb{R} \to \mathbb{R}$ given by $g_b(t) = \exp(1/(t - b))$ for $t < b$ and $0$ otherwise. Then the product $\varphi := f_a g_b$ of these functions is a smooth function which is $0$ outside the interval $[a,b]$. To construct similar functions on $\mathbb{R}^n$ is now easy. We set $\eta(x) := \varphi(\|x\|)$ for $x \in \mathbb{R}^n$. See Figures 1.8.1, 1.8.2 and 1.8.3.
[Figure 1.8.1 Graph of $f_1$. Figure 1.8.2 Graph of $g_3$. Figure 1.8.3 Graph of $\varphi$. Figure 1.8.4 Graph of $h$.]
For many a purpose one needs smooth functions which are $1$ on a given compact set and which are $0$ outside an open set $U$ that contains the given compact set. To this end, we define a function $h$ on $\mathbb{R}$ as follows. With $\varphi$ as above we set:
\[ h(x) := \left( \int_x^b \varphi(t)\, dt \right) \left( \int_a^b \varphi(t)\, dt \right)^{-1}. \]
See Figure 1.8.4. Then $h$ is smooth with $h(x) = 1$ for $x \le a$ and $h(x) = 0$ if $x \ge b$. If we now define $\psi(x) := h\bigl(\sum_i x_i^2\bigr)$ for $x := (x_1, \ldots, x_n) \in \mathbb{R}^n$, then $\psi(x) = 1$ for $\|x\|^2 \le a$ and $\psi(x) = 0$ for $\|x\|^2 \ge b$; that is, $\psi$ is $1$ on the closed ball of radius $\sqrt{a}$ about $0$ and vanishes outside the open ball of radius $\sqrt{b}$. The following consequence of the above construction is of vital importance in differential geometry for showing the existence of global objects on smooth manifolds.
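The functions $f_a$, $g_b$, $\varphi$ and $h$ constructed above are straightforward to evaluate. The sketch below is an illustration under assumptions: the values $a = 1$, $b = 3$, the midpoint rule used for the integrals, and the number of subintervals are all choices made for the example.

```python
import math

def f_a(t, a):
    # exp(-1/(t - a)) for t > a, and 0 otherwise.
    return math.exp(-1.0 / (t - a)) if t > a else 0.0

def g_b(t, b):
    # exp(1/(t - b)) for t < b, and 0 otherwise.
    return math.exp(1.0 / (t - b)) if t < b else 0.0

def phi(t, a=1.0, b=3.0):
    # Product of the two one-sided functions; vanishes outside [a, b].
    return f_a(t, a) * g_b(t, b)

def integral(func, lo, hi, n=2000):
    """Midpoint rule on [lo, hi]; returns 0 for an empty interval."""
    if hi <= lo:
        return 0.0
    w = (hi - lo) / n
    return w * sum(func(lo + (i + 0.5) * w) for i in range(n))

def h(x, a=1.0, b=3.0):
    """h(x) = (integral of phi from x to b) / (integral of phi from a to b)."""
    total = integral(lambda t: phi(t, a, b), a, b)
    # phi vanishes outside [a, b], so x may be clamped to that interval.
    x_clamped = min(max(x, a), b)
    return integral(lambda t: phi(t, a, b), x_clamped, b) / total

for x in (0.0, 1.0, 2.0, 3.0, 3.5):
    print(x, h(x))
```

With these values $h$ equals $1$ up to $x = 1$, decreases on $[1, 3]$, and equals $0$ from $x = 3$ onwards, which is the shape described for Figure 1.8.4.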
Proposition 1.8.2 If $K$ is a compact set in $\mathbb{R}^n$ and $U$ is an open set containing $K$ then there exists a smooth function $f$ on $\mathbb{R}^n$ which is $1$ on $K$ and $0$ outside $U$ (that is, $0$ on $\mathbb{R}^n \setminus U$).
Proof Let $F := \mathbb{R}^n \setminus U$. Then $F$ is closed and $F \cap K = \emptyset$. Let
\[ 2\varepsilon := d(K, F) = \inf\{d(x,y) : x \in K, y \in F\}. \]
Then $2\varepsilon > 0$. (Prove this.) Choose a finite number of points $x_j \in K$ such that $K \subset \bigcup_{j=1}^k B(x_j, \varepsilon/2)$. Choose $\psi_j \in C^\infty(\mathbb{R}^n)$ such that $\psi_j(x) = 1$ on $B(x_j, \varepsilon/2)$ and $\psi_j(x) = 0$ for $x \notin B(x_j, \varepsilon)$. Then the function
\[ \varphi(x) := 1 - \prod_{j=1}^k (1 - \psi_j(x)) \]
does the job.
$\square$
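Corollary 1.8.3 Let $f \in C^\infty(U)$ with $U \subset \mathbb{R}^n$ open. Let $x_0 \in U$.
Then there exists an open set $V$ such that $x_0 \in V \subset U$ and a function
$g \in C^\infty(\mathbb{R}^n)$ such that
\[ g(x) = \begin{cases} f(x), & x \in V \\ 0, & x \notin U. \end{cases} \]
Proof Let $V$ and $W$ be open neighbourhoods of $x_0$ such that
$V \subset \overline{V} \subset W \subset \overline{W} \subset U$ and such that $\overline{V}$ is compact. Taking $K = \overline{V}$ and
$F = \mathbb{R}^n \setminus W$ in the last proposition, we get $\varphi \in C^\infty(\mathbb{R}^n)$ such that
$\varphi(x) = 1$ on $\overline{V}$ and $\varphi(y) = 0$ for $y \notin W$. If we let
\[ g(x) = \begin{cases} f(x)\varphi(x), & x \in U \\ 0, & x \in \mathbb{R}^n \setminus W, \end{cases} \]
then $g$ is as required. (The two formulas agree on $U \setminus W$, where $\varphi = 0$.)
$\square$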
1.9 Existence of solutions of ODE
In this section we shall give a complete proof of the well-known result
on the existence, uniqueness and smooth dependence of solutions of an
ordinary differential equation in $\mathbb{R}^n$.
Proposition 1.9.1 Let $U \subset \mathbb{R}^N$ be an open set. Let $X\colon U \to \mathbb{R}^N$ be a
Lipschitz map with Lipschitz constant $L$: $\|X(x) - X(y)\| \le L \|x - y\|$
for all $x, y \in U$. Let $x_0 \in U$ be fixed. Let $B[x_0, r] \subset U$ and $M > 0$ be
such that $\|X(x)\| \le M$ for $x \in B[x_0, r]$. Let $0 < c < \min\{1/L, r/M\}$. Then
there exists a unique $C^1$ curve $x\colon [-c, c] \to B[x_0, r]$ which is a solution
of the following initial value problem (IVP):
\[ x'(t) = X(x(t)) \quad\text{and}\quad x(0) = x_0. \tag{1.9.1} \]
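Proof In view of the Fundamental Theorem of Calculus, the initial
value problem (IVP) (1.9.1) is equivalent to the integral equation:
\[ x(t) = x_0 + \int_0^t X(x(s))\, ds. \tag{1.9.2} \]
We solve this by Picard's method of iteration. Let $x_0(t) = x_0$ for
$t \in [-c, c]$. Define $x_n$ recursively:
\[ x_{n+1}(t) := x_0 + \int_0^t X(x_n(s))\, ds. \]
We prove by induction that $x_n(t) \in B[x_0, r]$ for $t \in [-c, c]$. We have
\[ \|x_1(t) - x_0\| \le \left| \int_0^t \|X(x_0)\|\, ds \right| \le M |t| \le M c < r. \]
Assume that we have proved the result for all $k \le n$. Now, since $x_n(s) \in
B[x_0, r]$ for $s \in [-c, c]$, we have
\[ \|x_{n+1}(t) - x_0\| \le \left| \int_0^t \|X(x_n(s))\|\, ds \right| \le M |t| \le M c < r. \]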
We next claim that the sequence $(x_n)$ converges uniformly. To show
this, we observe that
\begin{align*}
\|x_{n+1}(t) - x_n(t)\| &\le \left| \int_0^t \|X(x_n(s)) - X(x_{n-1}(s))\|\, ds \right| \\
&\le L |t| \sup_{|s| \le |t|} \|x_n(s) - x_{n-1}(s)\| \\
&\le L^2 |t|^2 \sup_{|s| \le |t|} \|x_{n-1}(s) - x_{n-2}(s)\| \\
&\le \cdots \le L^n |t|^n \sup_{|s| \le |t|} \|x_1(s) - x_0(s)\| \\
&\le M L^n |t|^{n+1}.
\end{align*}
Since $M \sum_n L^n |t|^{n+1} \le M c \sum_n (Lc)^n$ is a convergent geometric series,
it follows from the Weierstrass M-test that the series $\sum_n [x_n(t) - x_{n-1}(t)]$
and hence $(x_n)$ is uniformly convergent on $[-c, c]$ to a continuous
function $x\colon [-c, c] \to B[x_0, r]$. Hence appealing to the result on interchange
of uniform limit and the Riemann integral for a sequence of continuous
functions, we deduce that $x$ satisfies the integral equation (1.9.2).
To prove uniqueness, let $y$ be another solution of the IVP (1.9.1) and
hence of the integral equation (1.9.2) on $[-c, c]$. The continuous function
$t \mapsto \|x(t) - y(t)\|$ assumes its maximum value, say, at $t_0$. We then have
\begin{align*}
\|x(t_0) - y(t_0)\| &= \left\| \int_0^{t_0} [X(x(s)) - X(y(s))]\, ds \right\| \\
&\le \left| \int_0^{t_0} \|X(x(s)) - X(y(s))\|\, ds \right| \\
&\le L c \sup_{|s| \le c} \|x(s) - y(s)\| \\
&= L c\, \|x(t_0) - y(t_0)\|.
\end{align*}
Since $Lc < 1$, this inequality is true if and only if $\|x(t_0) - y(t_0)\| = 0$,
that is, if and only if $x(t) = y(t)$ for $t \in [-c, c]$.
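Picard's iteration can be watched at work on a simple example. The following Python sketch is an illustration under our own assumptions: we take the scalar field $X(x) = x$ with $x_0 = 1$ (so $L = 1$, and with $r = 1$ one may take $M = 2$, $c = 0.4$), we work only on $[0, c]$, and the integral is replaced by the trapezoidal rule on a grid. The function name `picard_iterates` is hypothetical.

```python
import math


def picard_iterates(X, x0, c, n_iter, n_grid=400):
    """Return a grid on [0, c] and the iterates x_0, ..., x_{n_iter} sampled on it."""
    ts = [c * i / n_grid for i in range(n_grid + 1)]
    current = [x0] * (n_grid + 1)  # x_0(t) = x0 for all t
    iterates = [current]
    for _ in range(n_iter):
        values = [X(v) for v in current]
        nxt = [x0]
        for i in range(1, n_grid + 1):
            # trapezoidal rule for the integral of X(x_n(s)) over [t_{i-1}, t_i]
            nxt.append(nxt[-1] + 0.5 * (values[i - 1] + values[i]) * (ts[i] - ts[i - 1]))
        current = nxt
        iterates.append(current)
    return ts, iterates


ts, its = picard_iterates(lambda x: x, 1.0, 0.4, 8)

# sup-norm distance between consecutive iterates; the proof bounds it by M L^n c^(n+1)
sup_gaps = [max(abs(a - b) for a, b in zip(its[n + 1], its[n])) for n in range(8)]

# distance of the last iterate from the exact solution exp(t)
error = max(abs(v - math.exp(t)) for t, v in zip(ts, its[-1]))
```

For this field the exact iterates are the Taylor polynomials of $e^t$, so the gaps shrink much faster than the geometric bound used in the proof.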
Exercise 1.9.2 Let $A\colon [-a, a] \times U \to M(n, \mathbb{R})$ be a continuous matrix
valued function. Then the IVP
\[ \frac{d}{dt}\psi(t, x) \equiv \psi'(t, x) = A(t, x)\psi \quad\text{and}\quad \psi(0, x) = I, \tag{1.9.3} \]
where $I$ is the identity, has a unique solution in $[-c, c]$ for some $c > 0$.
Hint: Adapt the proof of Proposition 1.9.1. Use the operator norm
\[ \|A\| := \max\{ \|Au\| : u \in \mathbb{R}^n \text{ and } \|u\| = 1 \}. \]
Observe that $\|AB\| \le \|A\| \|B\|$.
Exercise 1.9.3 This generalizes the above proposition. Let $\Lambda \subset \mathbb{R}^N$
be open. Assume that $X\colon U \times \Lambda \to \mathbb{R}^N$ is continuous. Assume further
that $X$ is uniformly Lipschitz in $x$: there exists a constant $L$ such that
\[ \|X(x, \lambda) - X(y, \lambda)\| \le L \|x - y\| \quad\text{for all } x, y \in U,\ \lambda \in \Lambda. \]
Then there exists a unique continuous solution $x(t, \lambda)$ on $[-\varepsilon, \varepsilon] \times \Lambda$ for
some suitable $\varepsilon$.
Keep the notation of Proposition 1.9.1. Let the unique solution of
IVP (1.9.1) be denoted by $\gamma_{x_0}(t)$. Then $\gamma_{x_0}$ is a $C^1$ curve from $[-c, c]$
to $B(x_0, r)$ such that $\gamma_{x_0}(0) = x_0$ and $\gamma_{x_0}'(t) = X(\gamma_{x_0}(t))$. The
next theorem shows that if we cut down the neighbourhood of $x_0$ to
$B(x_0, r/2)$, then we can find an $\varepsilon > 0$ such that for each $x \in B(x_0, r/2)$,
we have a $C^1$ curve $\gamma_x\colon [-\varepsilon, \varepsilon] \to B(x_0, r)$ such that $\gamma_x(0) = x$ and
$\gamma_x'(t) = X(\gamma_x(t))$ for $t \in [-\varepsilon, \varepsilon]$. Moreover, if we set $F(t, x) := \gamma_x(t)$
for $x \in B(x_0, r/2)$ and $|t| \le \varepsilon$, then $F$ is jointly continuous on
$[-\varepsilon, \varepsilon] \times B(x_0, r/2)$. Before proving this, we need a celebrated inequality.
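Lemma 1.9.4 (Gronwall inequality) Let $f, g\colon [a, b] \to \mathbb{R}$ be nonnegative
continuous functions. Assume that there is a $C \ge 0$ such that
\[ f(t) \le C + \int_a^t f(s) g(s)\, ds. \]
Then
\[ f(t) \le C \exp\left( \int_a^t g(s)\, ds \right) \quad\text{for } t \in [a, b]. \]
Proof Assume that $C > 0$. Let
\[ h(t) := C + \int_a^t f(s) g(s)\, ds. \]
Then $f(t) \le h(t)$. We observe that $h(t) > 0$ and
\[ h'(t) = f(t) g(t) \le h(t) g(t) \]
so that
\[ \frac{h'(t)}{h(t)} \le g(t). \]
Integrating this inequality from $a$ to $t$ and using $h(a) = C$ yields
$h(t) \le C \exp\left( \int_a^t g(s)\, ds \right)$. Since $f(t) \le h(t)$, the result follows.
If $C = 0$, the hypothesis holds with $C_\varepsilon = \varepsilon$ for every $\varepsilon > 0$. Use the
result for $C_\varepsilon$ and take limits as $\varepsilon \to 0$ to get $f(t) = 0$.
$\square$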
Theorem 1.9.5 Let $\Lambda \subset \mathbb{R}^k$ and $U \subset \mathbb{R}^N$ be open. Let $X\colon U \times \Lambda \to \mathbb{R}^N$
be Lipschitz continuous on $U$, uniformly in the variable from $\Lambda$:
\[ \|X(x, \lambda) - X(y, \lambda)\| \le L \|x - y\| \quad\text{for all } x, y \in U,\ \lambda \in \Lambda. \]
Fix a point $x_0 \in U$. Choose $r > 0$ such that $B(x_0, 2r) \subset U$. Then there
exists an $\varepsilon > 0$ and a continuous function
\[ F\colon [-\varepsilon, \varepsilon] \times B(x_0, r) \times \Lambda \to B(x_0, 2r) \]
such that $\frac{\partial}{\partial t} F(t, x, \lambda) = X(F(t, x, \lambda), \lambda)$ and $F(0, x, \lambda) = x$ for all
$x \in B(x_0, r)$, $t \in [-\varepsilon, \varepsilon]$ and $\lambda \in \Lambda$.
In fact, $F$ is Lipschitz in $x$ uniformly in the variables $(t, \lambda)$.
Proof We shall only highlight the arguments as the details are as in the
proof of Proposition 1.9.1. We shall not write the parameter variables
explicitly in what follows.
For $x \in B(x_0, r)$, consider the integral equation
\[ x(t) = x + \int_0^t X(x(s))\, ds. \]
We take $\varepsilon < \min\{1/L, r/(2M)\}$, where $M$ is a bound for $\|X\|$ on $B(x_0, 2r)$
as in Proposition 1.9.1. As earlier, start with the constant curve $x_0(t) = x$ and
define $x_n(t) = x + \int_0^t X(x_{n-1}(s))\, ds$. It is easily seen by induction that
$x_n(s) \in B(x, r) \subset B(x_0, 2r)$. Then $x_n$ converges to a function $F(s, x) := \gamma_x(s)$
uniformly on $[-\varepsilon, \varepsilon]$.
To show the continuity of $F$, let $f(t) := \|F(t, x) - F(t, y)\|$ for
$x, y \in B(x_0, r)$. We have
\begin{align*}
f(t) &= \left\| \int_0^t [X(F(s, x)) - X(F(s, y))]\, ds + (x - y) \right\| \\
&\le \|x - y\| + L \left| \int_0^t f(s)\, ds \right| \\
&\le e^{L|t|} \|x - y\|,
\end{align*}
by Gronwall's inequality. Note that this shows that the solution $F$ is
Lipschitz in the $x$-variable. The joint continuity follows from this
observation and the fact that $F$ is $C^1$ in $t$:
\begin{align*}
\|F(s, x) - F(t, y)\| &\le \|F(s, x) - F(s, y)\| + \|F(s, y) - F(t, y)\| \\
&\le e^{L|s|} \|x - y\| + \|F(s, y) - F(t, y)\|.
\end{align*}
$\square$
Theorem 1.9.6 Let $X\colon U \to \mathbb{R}^n$ be $C^k$. Let $x_0 \in U$ be fixed. Then the
function $F$ of Theorem 1.9.5 is $C^k$ on $(-\varepsilon, \varepsilon) \times B(x_0, r/4)$.
Proof Let $x \in B(x_0, r/4)$. Choose $h \in \mathbb{R}^n$ such that $\|h\| < r/4$ so that
$x + h \in B(x_0, r)$. Let $F(t, x) := \gamma_x(t)$ and $F(t, x + h) := \gamma_{x+h}(t)$ be the
unique solutions of the IVP with initial values $x$ and $x + h$ respectively.
We recall that $F$ is Lipschitz in the $x$-variable uniformly in $t$:
\[ \|F(t, x + h) - F(t, x)\| \le \|h\| e^{L\varepsilon}. \tag{1.9.4} \]
We now define $\psi$ to be the matrix valued solution of the IVP:
\[ \psi' = DX(F(t, x)) \circ \psi \quad\text{with}\quad \psi(0) = I. \]
Note that such a solution $\psi(t, x)$ exists, say, in $[-\varepsilon, \varepsilon]$ by Exercises 1.9.2
and 1.9.3.
We claim that $\frac{\partial}{\partial x} F(t, x) = \psi(t, x)$. Let
\[ M_1 := \max\{ \|DX(x)\| : x \in B[x_0, r] \}. \]
We have
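\begin{align*}
F(t, x + h) &- F(t, x) - \psi(t, x) h \\
&= \int_0^t \bigl[ X(F(s, x + h)) - X(F(s, x)) - DX(F(s, x))\, \psi(s, x) h \bigr]\, ds \\
&= \int_0^t DX(F(s, x)) \bigl[ F(s, x + h) - F(s, x) - \psi(s, x) h \bigr]\, ds \\
&\quad + \int_0^t \bigl[ X(F(s, x + h)) - X(F(s, x)) - DX(F(s, x)) \bigl( F(s, x + h) - F(s, x) \bigr) \bigr]\, ds. \tag{1.9.5}
\end{align*}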
Let $f(t) := \|F(t, x + h) - F(t, x) - \psi(t, x) h\|$. The integrand in the first
integral is dominated by
\begin{align*}
\|DX(F(s, x)) [F(s, x + h) - F(s, x) - \psi(s, x) h]\|
&\le \sup \|DX(F(s, x))\| \, \|F(s, x + h) - F(s, x) - \psi(s, x) h\| \\
&\le M_1 f(s). \tag{1.9.6}
\end{align*}
Since $X$ is differentiable, given $\eta > 0$, there exists $\delta > 0$ such that if
$\|h\| < \delta$, then
\begin{align*}
\|X(F(s, x + h)) - X(F(s, x)) &- DX(F(s, x)) (F(s, x + h) - F(s, x))\| \\
&< \eta\, \|F(s, x + h) - F(s, x)\|, \tag{1.9.7}
\end{align*}
for $s \in [-\varepsilon, \varepsilon]$.
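It follows from Equations (1.9.5), (1.9.6), (1.9.7) and (1.9.4), that
\[ f(t) \le M_1 \left| \int_0^t f(s)\, ds \right| + \eta \|h\| \varepsilon e^{L\varepsilon}, \]
for $t \in [-\varepsilon, \varepsilon]$, $x \in B(x_0, r/4)$ and $\|h\| < \min\{\delta, r/4\}$.
By Gronwall's inequality, it follows that for each $\eta > 0$,
\[ f(t) \le \eta \|h\| \varepsilon e^{L\varepsilon} e^{M_1 \varepsilon}, \]
for $t \in [-\varepsilon, \varepsilon]$, $x \in B(x_0, r/4)$ and $\|h\| < \min\{\delta, r/4\}$.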
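The claim is thus established. It follows from this that $F$ is $C^1$ in $x$
and $C^2$ in $t$, if $X$ is $C^1$.
We prove that $F$ is $C^k$ in the $x$-variable and $C^{k+1}$ in the $t$-variable by
induction. Since
\[ \frac{\partial}{\partial t} F(t, x) = X(F(t, x)), \]
we deduce that
\begin{align*}
\frac{\partial}{\partial t} \frac{\partial}{\partial t} F(t, x) &= DX(F(t, x))\, X(F(t, x)), \\
\frac{\partial}{\partial t} \frac{\partial}{\partial x} F(t, x) &= DX(F(t, x))\, \frac{\partial}{\partial x} F(t, x).
\end{align*}
The first equation shows that $F$ is $C^{k+1}$ in the $t$-variable while the second
shows that $F$ is $C^k$ in the $x$-variable.
$\square$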
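The claim $\frac{\partial}{\partial x} F = \psi$ can be tested on an example. The following Python sketch is only an illustration under our own assumptions: we take the scalar field $X(x) = x^2$, for which $F(t, x) = x/(1 - tx)$ and $\psi(t, x) = (1 - tx)^{-2}$ in closed form, and we integrate the pair of equations $F' = X(F)$, $\psi' = DX(F)\psi$ with a classical Runge-Kutta step. The function names are hypothetical.

```python
def rk4_step(rhs, state, dt):
    """One classical fourth-order Runge-Kutta step for state' = rhs(state)."""
    k1 = rhs(state)
    k2 = rhs([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = rhs([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = rhs([s + dt * k for s, k in zip(state, k3)])
    return [s + dt * (a + 2 * b + 2 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def X(x):
    """Assumed sample vector field on the line."""
    return x * x


def DX(x):
    """Derivative of the sample vector field."""
    return 2.0 * x


def flow_and_derivative(x, t_end, steps=1000):
    """Integrate F' = X(F), psi' = DX(F) psi with F(0) = x, psi(0) = 1."""
    def rhs(s):
        return [X(s[0]), DX(s[0]) * s[1]]
    state = [x, 1.0]
    dt = t_end / steps
    for _ in range(steps):
        state = rk4_step(rhs, state, dt)
    return state  # [F(t_end, x), psi(t_end, x)]


x, t, step = 0.5, 0.3, 1e-5
F_x, psi = flow_and_derivative(x, t)
F_plus, _ = flow_and_derivative(x + step, t)
F_minus, _ = flow_and_derivative(x - step, t)

# central difference quotient of F in the x-variable, to be compared with psi
difference_quotient = (F_plus - F_minus) / (2 * step)
```

The solution of the variational equation and the difference quotient of the flow agree up to discretisation error, as the theorem predicts.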
Chapter 2
Manifolds and Lie Groups
2.1 Differential manifolds

A differential manifold is, loosely speaking, a space which locally looks
like some $\mathbb{R}^n$ and on which we can speak of $C^k$ functions for $1 \le k \le \infty$.
We shall make this vague definition more precise presently.
Let $M$ be a topological space. We say that it looks locally like $\mathbb{R}^n$ if
for every $p \in M$ there exists a neighbourhood $U$ of $p$ and a homeomorphism
$\varphi$ of $U$ onto some open subset $\varphi(U)$ of $\mathbb{R}^n$. We then would like to
say an $\mathbb{R}$-valued function $f\colon U \to \mathbb{R}$ defined on such a "chart" $(U, \varphi)$ is
$C^k$ (with respect to $(U, \varphi)$) if the function $f \circ \varphi^{-1}$ from the open subset
$\varphi(U)$ of $\mathbb{R}^n$ to $\mathbb{R}$, namely, $f \circ \varphi^{-1}\colon \varphi(U) \subset \mathbb{R}^n \to \mathbb{R}$ is $C^k$.
Notice that there is an inherent ambiguity in this definition of
$C^k$-functions. For instance, let $(V, \psi)$ be another "chart". That is, $V$ is an
open set with a homeomorphism $\psi$ from $V$ onto an open subset $\psi(V)$ of
$\mathbb{R}^n$. (Assume for a moment it is the same $n$; we still don't know why it
has to be so.) Now if we assume that $U \cap V \ne \emptyset$ and if $f$ is defined on
$U \cap V$, can we say that $f$ is $C^k$ with respect to $(U, \varphi)$ if and only
if it is $C^k$ with respect to $(V, \psi)$?
To understand this, let us look at some concrete functions. Let $(U, \varphi)$
be a local chart as above. Let $u_i\colon \mathbb{R}^n \to \mathbb{R}$ be the usual coordinate
functions, namely, $u_i(x) = x_i$. We can then define coordinates $x_i$ on $U$
as follows: For all $p \in U$, we set $x_i(p) = u_i(\varphi(p))$. Then the $x_i$ are obviously
$C^k$-functions on $U$. For, $x_i \circ \varphi^{-1}\colon \varphi(U) \to \mathbb{R}$ is nothing other than
$u_i\colon \varphi(U) \subset \mathbb{R}^n \to \mathbb{R}$.
Now we want the $x_i$ to be $C^k$ with respect to $(V, \psi)$ also, that is, we want
the functions $x_i \circ \psi^{-1}\colon \psi(U \cap V) \to \mathbb{R}$ to be $C^k$. But then the map
$F\colon p \mapsto (x_i \circ \psi^{-1}(p))_{1 \le i \le n}$ is a $C^k$ map. Notice that $F$ is nothing
but $\varphi \circ \psi^{-1}\colon \psi(U \cap V) \to \varphi(U \cap V)$.
In a similar way, starting with the coordinate functions $y_j := u_j \circ \psi$ of the
chart $(V, \psi)$, we see that the map
\[ \psi \circ \varphi^{-1}\colon \varphi(U \cap V) \to \psi(U \cap V) \]
is $C^k$. These two maps, being inverses of each other, are $C^k$ diffeomorphisms
on their respective domains. So if we want a meaningful,
consistent definition of $C^k$-functions on $M$ then we are led to impose
compatibility conditions on all the possible overlaps of the charts $(U, \varphi)$
and $(V, \psi)$.
We are now ready to make a precise definition.
Definition 2.1.1 Let $M$ be a Hausdorff space. Assume that there exists
an open covering $\{U_\alpha : \alpha \in A\}$ of $M$ and homeomorphisms $\varphi_\alpha$ from $U_\alpha$
onto an open subset $\varphi_\alpha(U_\alpha)$ of $\mathbb{R}^{m(\alpha)}$ with $m(\alpha)$ a nonnegative integer.
Further assume that the following compatibility condition (2.1.1) holds:
There exists $k \in \mathbb{N} \cup \{\infty\}$ such that whenever for $\alpha, \beta \in A$ we have
$U_\alpha \cap U_\beta \ne \emptyset$, then the map
\[ \varphi_\alpha \circ \varphi_\beta^{-1}\colon \varphi_\beta(U_\alpha \cap U_\beta) \to \varphi_\alpha(U_\alpha \cap U_\beta) \text{ is } C^k. \tag{2.1.1} \]
Figure 2.1.1 Transition function $\varphi_\alpha \circ \varphi_\beta^{-1}$
Then the object $(M, \{(U_\alpha, \varphi_\alpha) : \alpha \in A\})$ is called a $C^k$-manifold.
The collection $\{(U_\alpha, \varphi_\alpha) : \alpha \in A\}$ is called a $C^k$-atlas on $M$. The
members of the atlas are called charts or coordinate charts for the manifold
$M$. They are called coordinate charts, since they give a system of
local coordinates: if $p \in U_\alpha$, we denote by $x_i^\alpha(p) := u_i(\varphi_\alpha(p))$
the $i$-th coordinate of the point $\varphi_\alpha(p) \in \mathbb{R}^{m(\alpha)}$. An atlas is said to
be smooth if it is a $C^k$-atlas for all $k \in \mathbb{N}$. The manifold $M$ is said to
be smooth if the atlas is smooth.
Remark 2.1.2 If we assume $M$ to be connected then all $m(\alpha)$ are necessarily
the same by the Inverse Mapping Theorem. (For, $\varphi_\alpha \circ \varphi_\beta^{-1}$ as well
as $\varphi_\beta \circ \varphi_\alpha^{-1}$ are $C^k$-diffeomorphisms on $\varphi_\beta(U_\alpha \cap U_\beta)$ and $\varphi_\alpha(U_\alpha \cap U_\beta)$.)
This common integer, say, $m$ is called the dimension of the manifold $M$.
We shall always assume that in any atlas all the $m(\alpha)$ are the same.
Examples of manifolds
Example 2.1.3 Let $M$ be a discrete topological space. By convention
$\mathbb{R}^0$ is a single point. Then $M$ is a 0-dimensional $C^k$ manifold for all
$k \in \mathbb{N} \cup \{\infty\}$. What are the $C^k$-functions on $M$? (Of course we are yet to
define $C^k$-functions!)
Example 2.1.4 Let $M$ be any nonempty open set in some $\mathbb{R}^m$. Then
taking $U = M$ and $\varphi$ to be the identity (inclusion) map of $U$ into $\mathbb{R}^m$ we
get a $C^k$-atlas for all $k$ with $1 \le k \le \infty$. Thus $M$ is an $m$-dimensional
manifold.
As a particular example let us take $M := GL(n, \mathbb{R})$, the set of all
invertible linear maps of $\mathbb{R}^n$ to itself. Then as we have seen in
Section 1.1, $M$ is an open subset of $M(n, \mathbb{R}) \cong \mathbb{R}^{n^2}$. Thus $GL(n, \mathbb{R})$ is an
$n^2$-dimensional manifold. We shall have lots of occasions to look into
this manifold in future.
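Example 2.1.5 We now look at a nontrivial example. Let $S^n$ be the
unit sphere in $\mathbb{R}^{n+1}$ defined by
\[ S^n := \Bigl\{ x := (x_1, \ldots, x_{n+1}) \in \mathbb{R}^{n+1} : \sum_{i=1}^{n+1} x_i^2 = 1 \Bigr\}. \]
We give a $C^k$-atlas for $S^n$ consisting of two members. Let $p := e_{n+1}$ be
the north pole $(0, 0, \ldots, 0, 1)$ and $q := (0, \ldots, 0, -1)$ be the south pole.
Let $U := S^n \setminus \{p\}$ and $V := S^n \setminus \{q\}$. Let $\varphi$ (respectively $\psi$) be the
stereographic projection from $p$ (respectively $q$) onto the hyperplane
\[ \mathbb{R}^n := \{ x \in \mathbb{R}^{n+1} : x_{n+1} = 0 \}. \]
See Figure 2.1.2.
Figure 2.1.2 Stereographic projection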
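Let $p'$ be a point in $U$. Then $\varphi(p')$ is the point of intersection of the
line joining $p$ and $p'$ with the plane $\{x_{n+1} = 0\}$. $\psi$ is defined likewise.
We shall now find a coordinate expression for
\[ \varphi^{-1}\colon \mathbb{R}^n = \{ x \in \mathbb{R}^{n+1} : x_{n+1} = 0 \} \to S^n \setminus \{p\}. \]
Let $x = (x_1, \ldots, x_n, 0) \in \mathbb{R}^n$. Then the line joining $x$ with $e_{n+1} = p$ is
given by
\[ \alpha(t) = t x + (1 - t) p = (t x', 1 - t) \]
where $x' = (x_1, \ldots, x_n)$. Now $\alpha(t)$ lies on $S^n$ if and only if $\|\alpha(t)\|^2 = 1$
if and only if $t^2 \|x'\|^2 + (1 - t)^2 = 1$ if and only if $t = 0$ (which gives the
point $p$ itself) or $t = 2(1 + \|x'\|^2)^{-1}$. In the latter case
\[ \alpha(t) = \left( \frac{2x'}{1 + \|x'\|^2}, \frac{\|x'\|^2 - 1}{\|x'\|^2 + 1} \right). \]
Thus we see that
\[ \varphi^{-1}(x) = \left( \frac{2x}{\|x\|^2 + 1}, \frac{\|x\|^2 - 1}{\|x\|^2 + 1} \right). \tag{2.1.2} \]
In a similar way one can show that
\[ \psi^{-1}(x) = \left( \frac{2x}{\|x\|^2 + 1}, \frac{1 - \|x\|^2}{\|x\|^2 + 1} \right). \]
By drawing some typical points in $\varphi(U \cap V) = \psi(U \cap V) = \mathbb{R}^n \setminus \{0\}$, one
can easily see that $\varphi \circ \psi^{-1}$ or $\psi \circ \varphi^{-1}$ is the inversion with respect to the
equatorial sphere $S^{n-1} := \{ (x_1, \ldots, x_n, 0) : x_1^2 + \cdots + x_n^2 = 1 \}$. More
explicitly, we have $\varphi \circ \psi^{-1}(x) = \|x\|^{-2} (x_1, x_2, \ldots, x_n, 0)$. One can also
directly verify this. Thus $\{(U, \varphi), (V, \psi)\}$ is a $C^k$-atlas for $1 \le k \le \infty$.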
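The direct verification can also be done numerically. The following Python sketch is an illustration only: the function names and the sample point are our own choices, and the formulas $\varphi(s) = s'/(1 - s_{n+1})$ and $\psi(s) = s'/(1 + s_{n+1})$ for the two projections are obtained by intersecting the relevant lines with the plane $\{x_{n+1} = 0\}$.

```python
def phi(s):
    """Stereographic projection from the north pole: S^n minus {p} -> R^n."""
    *head, last = s
    return [c / (1.0 - last) for c in head]


def phi_inv(x):
    """Inverse of phi: x -> (2x / (|x|^2 + 1), (|x|^2 - 1) / (|x|^2 + 1))."""
    sq = sum(c * c for c in x)
    return [2.0 * c / (1.0 + sq) for c in x] + [(sq - 1.0) / (sq + 1.0)]


def psi(s):
    """Stereographic projection from the south pole: S^n minus {q} -> R^n."""
    *head, last = s
    return [c / (1.0 + last) for c in head]


def psi_inv(x):
    """Inverse of psi: x -> (2x / (|x|^2 + 1), (1 - |x|^2) / (|x|^2 + 1))."""
    sq = sum(c * c for c in x)
    return [2.0 * c / (1.0 + sq) for c in x] + [(1.0 - sq) / (1.0 + sq)]


x = [0.3, -1.2, 2.0]                          # an arbitrary nonzero point of R^3
on_sphere = sum(c * c for c in phi_inv(x))    # squared norm of phi^{-1}(x)
round_trip = phi(phi_inv(x))                  # should give back x
transition = phi(psi_inv(x))                  # phi o psi^{-1} at x
inversion = [c / sum(d * d for d in x) for c in x]  # x / |x|^2
```

Comparing `transition` with `inversion` confirms, for this sample point, that the transition map is the inversion in the unit sphere.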
Exercise 2.1.6 Let $S^n(R)$ be the sphere of radius $R$ in $\mathbb{R}^{n+1}$ defined
by
\[ S^n(R) := \Bigl\{ x := (x_1, \ldots, x_{n+1}) \in \mathbb{R}^{n+1} : \sum_{i=1}^{n+1} x_i^2 = R^2 \Bigr\}. \]
Let $\varphi$ be the stereographic projection from the north pole onto $\mathbb{R}^n$.
Proceed as before to show that for all $x \in \mathbb{R}^n$ (2.1.2) corresponds to
\[ \varphi^{-1}(x) = \left( \frac{2R^2 x}{\|x\|^2 + R^2}, \frac{R(\|x\|^2 - R^2)}{\|x\|^2 + R^2} \right). \tag{2.1.3} \]
Exercise 2.1.7 Is it possible to find an atlas for $S^n$ consisting of a
single chart? Can you generalize this?
There is another atlas consisting of $2(n + 1)$ charts for $S^n$ defined as
follows. Let
\[ U_i^\pm := \{ x \in S^n : \pm x_i > 0 \}. \]
Let $\varphi_i^\pm\colon U_i^\pm \to \mathbb{R}^n$ be given by
\[ \varphi_i^\pm(x_1, \ldots, x_{n+1}) := (x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_{n+1}) =: (x_1, \ldots, \widehat{x_i}, \ldots, x_{n+1}). \]
One can easily see that these form a $C^k$-atlas.
Exercise 2.1.8 Verify that these form a $C^k$-atlas. This will help you
understand the next example better!
Now a natural question arises. We have endowed $S^n$ with two
$C^k$-atlases. We want to know whether they are in some sense the same. In
what sense? Namely in the sense that a function $f$ on $M$ is "$C^k$ with
respect to the atlas $\mathcal{A}$" if and only if it is "$C^k$ with respect to the atlas
$\mathcal{B}$". If you have followed our motivation for the compatibility condition
(2.1.1) in the definition of a manifold it should be clear why we make
the following definition.
Definition 2.1.9 Two $C^k$-atlases $\mathcal{A} := \{(U_\alpha, \varphi_\alpha)\}$ and $\mathcal{B} := \{(V_\beta, \psi_\beta)\}$
are said to be compatible with each other if whenever $(U_\alpha, \varphi_\alpha) \in \mathcal{A}$ and
$(V_\beta, \psi_\beta) \in \mathcal{B}$, and $U_\alpha \cap V_\beta \ne \emptyset$, the map
\[ \varphi_\alpha \circ \psi_\beta^{-1}\colon \psi_\beta(U_\alpha \cap V_\beta) \to \varphi_\alpha(U_\alpha \cap V_\beta) \]
is $C^k$. Notice that due to the symmetry in $\alpha$ and $\beta$, it follows that the
above map is in fact a $C^k$-diffeomorphism. Hence we see that a function
$f$ on $M$ is $C^k$ with respect to $\mathcal{A}$ if and only if it is so with respect to $\mathcal{B}$.
Exercise 2.1.10 Two $C^k$-atlases $\mathcal{A}$ and $\mathcal{B}$ on $M$ are compatible if and
only if their union $\mathcal{A} \cup \mathcal{B}$ is another $C^k$-atlas on $M$.
Since we profess to be interested in the concept of $C^k$-functions on
$M$, it makes sense to club together all mutually compatible $C^k$-atlases on
the given space $M$. That is, given a $C^k$-atlas $\mathcal{A}$ we consider the
maximal $C^k$-atlas containing $\mathcal{A}$. It exists, for we take $\mathcal{U} := \bigcup \mathcal{A}'$, where
$\mathcal{A}'$ runs through all $C^k$-atlases that are compatible with $\mathcal{A}$. This again
conforms with our intuitive notion of choosing a system of coordinates
which is convenient for a problem. A maximal atlas on a topological
space $M$ is called a differential structure on $M$.
Example 2.1.11 (Level sets as manifolds)
Let $F\colon U \subset \mathbb{R}^{n+k} \to \mathbb{R}^n$ be a smooth map. Let $q \in F(U)$. Let
$S := \{ x \in U : F(x) = q \}$. Assume that for every $p \in S$, the map
$F'(p)\colon \mathbb{R}^{n+k} \to \mathbb{R}^n$ has rank $n$. Then $S$ can be given a smooth atlas
making it into a smooth manifold of dimension $k$. Before going into the
details of this construction, let us look at some examples.
a) Let $F\colon \mathbb{R}^{n+1} \to \mathbb{R}$ be given by $F(x) := \sum_i x_i^2 - 1$. Then $F^{-1}(0) = S^n$.
b) Let $F\colon \mathbb{R}^3 \to \mathbb{R}$ be given by $F(x) = x_1^2 + x_2^2 - 1$. Then
\[ F^{-1}(0) = \{ (\cos u, \sin u, v) : u, v \in \mathbb{R} \}, \]
the right circular cylinder.
c) Let $U := \{ x \in \mathbb{R}^3 : x_3 > 0 \}$. Let $F\colon U \to \mathbb{R}$ be given by
$F(x) = 3x_1^2 + 3x_2^2 - x_3^2$. Then $S = F^{-1}(0)$ is a cone with vertex at
$0$ (but without the vertex).
d) Let $S_n$ be the set of $n \times n$ real symmetric matrices:
\[ S_n := \{ X \in M(n, \mathbb{R}) : X = X^t, \text{ that is, } x_{ij} = x_{ji} \text{ for all } 1 \le i, j \le n \}. \]
Then $S_n$ is a vector space over $\mathbb{R}$ of dimension $n(n + 1)/2$. We
now consider the map $F\colon M(n, \mathbb{R}) \to S_n$ given by $F(X) = X X^t$.
Let $O(n)$ be the set of orthogonal matrices. Then we have
\[ O(n) = F^{-1}(I) = \{ X \in M(n, \mathbb{R}) : X X^t = I \}. \]
We claim that $F$ has rank $n(n + 1)/2$ at every $A \in O(n)$.
Let us compute $DF(A)$. We have
\[ F(A + H) = (A + H)(A + H)^t. \]
Hence
\begin{align*}
F(A + H) - F(A) &= A H^t + H A^t + H H^t \\
&= A H^t + H A^t + o(\|H\|).
\end{align*}
Thus we find that $DF(A)(H) = A H^t + H A^t$. To establish the
claim we need to show that $DF(A)$ maps $M(n, \mathbb{R})$ onto $S_n$. It is
enough to show that if $X \in S_n$ is given there exists an $H \in M(n, \mathbb{R})$
such that
\[ DF(A)(H) = X = (X + X^t)/2. \]
This suggests that we look for a solution of $A H^t = X/2$. But since
$A^t A = A A^t = I$ we can rewrite the equation as $H^t = (A^t X)/2$. Hence
$H = (X^t A)/2$ is a solution. Thus $O(n)$ is a manifold of dimension
$n^2 - n(n + 1)/2 = n(n - 1)/2$.
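The surjectivity argument for $DF(A)$ can be checked on a concrete matrix. The following Python sketch is an illustration only: the rotation angle, the symmetric matrix $X$ and the small matrix helpers are our own choices, written with plain lists so that nothing beyond the standard library is needed.

```python
import math


def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
            for i in range(len(P))]


def transpose(P):
    return [list(row) for row in zip(*P)]


def add(P, Q):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(P, Q)]


def scale(c, P):
    return [[c * a for a in row] for row in P]


def DF(A, H):
    """Derivative of F(X) = X X^t at A in the direction H: A H^t + H A^t."""
    return add(matmul(A, transpose(H)), matmul(H, transpose(A)))


theta = 0.7  # assumed sample angle; A is a rotation, hence orthogonal
A = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]
X = [[2.0, -1.0], [-1.0, 5.0]]              # an arbitrary symmetric matrix
H = scale(0.5, matmul(transpose(X), A))     # the solution H = X^t A / 2

# largest entry of DF(A)(H) - X; it should vanish up to rounding
residual = max(abs(a - b) for r, s in zip(DF(A, H), X) for a, b in zip(r, s))
```

The residual being zero up to rounding error shows, for this sample, that the given symmetric matrix lies in the image of $DF(A)$.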
Now we go back to what we intended to show. We write
\[ F := (f_1, \ldots, f_n). \]
Let $p \in S$. Then the matrix $\left( \frac{\partial f_i}{\partial x_j}(p) \right)_{1 \le i \le n,\ 1 \le j \le n+k}$ has rank $n$ by hypothesis.
Without loss of generality we assume that the submatrix $\left( \frac{\partial f_i}{\partial x_j}(p) \right)_{1 \le i, j \le n}$
is invertible. Consider now the map $\varphi\colon U \to \mathbb{R}^{n+k}$ given by
\[ \varphi(x) := (f_1(x), \ldots, f_n(x), x_{n+1}, \ldots, x_{n+k}). \]
Then we have
\[ D\varphi(p) = \begin{pmatrix} \left( \frac{\partial f_i}{\partial x_j}(p) \right)_{1 \le i, j \le n} & * \\ 0 & I_k \end{pmatrix} \]
so that $D\varphi(p)$ is nonsingular. Hence, by the inverse mapping theorem, there
exists a neighborhood $V$ of $p$ in $U$ such that $\varphi$ is a diffeomorphism of $V$
onto $\varphi(V)$. Let us now introduce a new set of coordinates on $V$:
\[ y_i(r) = \begin{cases} f_i(r) & \text{if } 1 \le i \le n; \\ u_i(r) & \text{if } i > n \end{cases} \]
for all $r \in V$, where $u_i$ is the $i$-th coordinate function. Then with respect to
these new coordinates we see that
$S \cap V = \{ r \in V : y_i(r) = q_i,\ 1 \le i \le n \}$. Look, for instance, at the case
$n = 1$ in Examples (a) to (c). On $S \cap V$ we introduce new coordinates
$z_j(r) := y_j(r)$ for $n + 1 \le j \le n + k$. Thus the relatively open subsets
$S \cap V$ of $S$ look like $\mathbb{R}^k$ locally. That they overlap smoothly can
be easily verified using the inverse mapping theorem. (Exercise. See
for example the second atlas for $S^n$.) Thus the level sets are smooth
manifolds of dimension $k$. The reader should also notice that the above
argument is quite reminiscent of the one which we used to show that
any hypersurface is locally a hyperplane. (Example 1.4.2 on page 34.)
Whenever in future we speak of a level set of a function $F$ as a smooth
manifold, we shall assume that the rank of $DF(x)$ at each point of the
level set is maximal.
Example 2.1.12 We now give a somewhat contrived example which is
quite often useful to test one's understanding. Let $(\mathbb{R}, u)$ be the "usual"
manifold structure: That is, $u\colon M := \mathbb{R} \to \mathbb{R}$ is $u(t) = t$. Let $(\mathbb{R}, u^3) :=
(M', \varphi)$ be the manifold given rise to by the homeomorphism $\varphi\colon M' \to \mathbb{R}$
where $\varphi(t) = t^3$. We may also think of them as two different atlases on
the same space $\mathbb{R}$. Are they compatible? That is, is (2.1.1) satisfied?
The map $u \circ (u^3)^{-1}\colon \mathbb{R} \to \mathbb{R}$ is given by $t \mapsto t^{1/3} \mapsto t^{1/3}$. This map is
not even differentiable at $t = 0$.
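The failure of differentiability at $0$ is easy to see numerically. The following Python sketch is an illustration only; the helper name `cube_root` and the sample increments are our own choices. It evaluates the difference quotients of the transition map $t \mapsto t^{1/3}$ at $0$, which behave like $h^{-2/3}$ and therefore blow up as $h \to 0$.

```python
def cube_root(t):
    """Real cube root, that is, the transition map u o (u^3)^{-1}."""
    magnitude = abs(t) ** (1.0 / 3.0)
    return magnitude if t >= 0 else -magnitude


# difference quotients (cube_root(h) - cube_root(0)) / h for shrinking h
quotients = [cube_root(h) / h for h in (1e-2, 1e-4, 1e-6)]
```

By contrast, the transition map in the other direction, $t \mapsto t^3$, is smooth; compatibility fails because both directions are required to be $C^k$.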
The following example may be omitted on first reading.
Example 2.1.13 We take $(0, 2\pi)$ and bend it as in Figure 2.1.3 to get
the figure $\infty$. Analytically, let $\varphi^{-1}\colon (0, 2\pi) \to \infty \subset \mathbb{R}^2$ be given by
\[ t \mapsto (\sin t, \sin 2t). \]
The map $\varphi^{-1}$ is a bijection so we use it to give a topology on the image
$\infty$. We call this space $X$. This topology is not the same as the subspace
(or relative) topology which $X$ inherits as a subset of $\mathbb{R}^2$. (Why?) Thus
$\varphi$ is a chart on $X$. We remark that even though we draw as if the arrows
meet at the origin, as a point set in $\mathbb{R}^2$ it is two loops touching at the
origin. See Figure 2.1.3.
Now the same set can be given another chart as follows.
Let $\psi^{-1}\colon (-\pi, \pi) \to \mathbb{R}^2$ be given by
\[ t \mapsto (\sin t, \sin 2t). \]
See Figure 2.1.3. The topologies introduced by these bijections on $\infty$
are not the same. (Check.) The charts are not compatible. For, we have
\[ \psi \circ \varphi^{-1}(t) = \psi(\sin t, \sin 2t) = \begin{cases} t, & 0 < t < \pi; \\ 0, & t = \pi; \\ t - 2\pi, & \pi < t < 2\pi. \end{cases} \]
Figure 2.1.3 Figure $\infty$ manifolds
Figure 2.1.4 Portions of Figure $\infty$ manifolds
Thus $\psi \circ \varphi^{-1}(\pi - \delta, \pi + \delta)$ looks like $(-\pi, -\pi + \delta) \cup \{0\} \cup (\pi - \delta, \pi)$ which is
not an open set in $(-\pi, \pi)$. Look at Figure 2.1.4.
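The jump of $\psi \circ \varphi^{-1}$ at $t = \pi$ can be observed numerically. The following Python sketch is an illustration only: the function names, the tolerance used to recognise the origin, and the way $\psi$ is computed (via $\sin 2s = 2 \sin s \cos s$ and `math.atan2`) are our own choices.

```python
import math


def phi_inv(t):
    """The parametrisation t -> (sin t, sin 2t), used here on (0, 2*pi)."""
    return (math.sin(t), math.sin(2.0 * t))


def psi(point, tol=1e-12):
    """Inverse of the same formula restricted to (-pi, pi).

    Since sin 2s = 2 sin s cos s, a point (a, b) of the curve with a != 0
    determines cos s = b / (2a); the origin corresponds to s = 0.
    """
    a, b = point
    if abs(a) < tol:
        return 0.0
    return math.atan2(a, b / (2.0 * a))


def transition(t):
    """psi o phi^{-1} evaluated at t in (0, 2*pi)."""
    return psi(phi_inv(t))


delta = 0.1
left = transition(math.pi - delta)    # close to pi - delta
middle = transition(math.pi)          # the origin is sent to 0
right = transition(math.pi + delta)   # close to -pi + delta
```

Points just to the left and right of $\pi$ are sent near the two ends of $(-\pi, \pi)$ while $\pi$ itself is sent to $0$, so the composite is not even continuous.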
We shall have many occasions to look at these manifolds. We will
refer to them as $\infty$-manifolds or figure 8 manifolds.
This example is a typical way of constructing manifolds on sets which,
to start with, do not come with any topology. The following theorem for-
malizes the above construction. This formulation will help us construct
lots of manifolds in the sequel.
Theorem 2.1.14 Let $M$ be any set. Let a family $\{(U_\alpha, \varphi_\alpha) : \alpha \in A\}$ of
pairs consisting of subsets $U_\alpha$ of $M$ and maps $\varphi_\alpha$ of $U_\alpha$ into a fixed $\mathbb{R}^m$
be given satisfying the following conditions:
i) $\bigcup_\alpha U_\alpha = M$.
ii) For every $\alpha \in A$, $\varphi_\alpha$ maps $U_\alpha$ bijectively onto an open subset
$\varphi_\alpha(U_\alpha)$ of $\mathbb{R}^m$.
iii) For all $\alpha, \beta \in A$ with $U_\alpha \cap U_\beta \ne \emptyset$, the map $\varphi_\alpha \circ \varphi_\beta^{-1}$ from the
open set $\varphi_\beta(U_\alpha \cap U_\beta) \subset \mathbb{R}^m$ to the open set $\varphi_\alpha(U_\alpha \cap U_\beta)$ is a
$C^k$-diffeomorphism.
Then there exists a natural topology on $M$ which has as a subbase
the class of subsets
\[ \{ V : V \subset U_\alpha \text{ for some } \alpha \text{ and } \varphi_\alpha(V) \text{ open in } \mathbb{R}^m \}. \]
With respect to this topology the family $\{(U_\alpha, \varphi_\alpha) : \alpha \in A\}$ is a $C^k$-atlas
on $M$. Thus $M$ with this $C^k$-atlas becomes a $C^k$-manifold provided that
the topology so obtained is Hausdorff.
Proof As what is involved is a trivial verification, we leave it to the
reader as an instructive exercise.
$\square$
Remark 2.1.15 In practice it is very easy to decide whether or not the

topology obtained in Theorem 2.1.14 is Hausdorff. The reader should
notice that we simply extracted the idea behind Example 2.1.13.
To see an immediate application we present
Example 2.1.16 (Projective spaces over $\mathbb{R}$) The underlying set is
the set $\mathbb{P}^n(\mathbb{R})$ of lines in $\mathbb{R}^{n+1}$ passing through the origin. There is
another way of looking at $\mathbb{P}^n(\mathbb{R})$. On $\mathbb{R}^{n+1} \setminus \{0\}$ we introduce the
equivalence relation $\sim$ defined as follows:
\[ x \sim y \iff \lambda x = y \text{ with } \lambda \in \mathbb{R},\ \lambda \ne 0. \]
An equivalence class $[x]$ containing $x \in \mathbb{R}^{n+1} \setminus \{0\}$ can be identified with the
line through the origin in $\mathbb{R}^{n+1}$ joining any point in the equivalence class.
This second way of looking at $\mathbb{P}^n(\mathbb{R})$ is what we are going to exploit to
endow $\mathbb{P}^n(\mathbb{R})$ with a manifold structure. In $\mathbb{P}^n(\mathbb{R})$ we have some very
special sets $U_i$ for $1 \le i \le n + 1$ defined by
\[ U_i := \{ [x] \in \mathbb{P}^n(\mathbb{R}) : x_i \ne 0 \}. \]
Notice that the definition of $U_i$ is independent of the choice of the
representative of $[x]$. For, if $y \in [x]$, then $y_j = \lambda x_j$ for all $1 \le j \le n + 1$ and
for some nonzero real number $\lambda$. On these sets we define $\varphi_i$ as follows:
\[ \varphi_i([y]) := \left( \frac{x_1}{x_i}, \ldots, \frac{x_{i-1}}{x_i}, \frac{x_{i+1}}{x_i}, \ldots, \frac{x_{n+1}}{x_i} \right) \in \mathbb{R}^n \]
for any $x = (x_1, \ldots, x_{n+1})$ in the equivalence class $[y]$. Again notice
that the map $\varphi_i$ is well defined, that is, it is independent of the choice
of $x \in [y]$.
It is now an easy exercise to check that the family $\{(U_i, \varphi_i)\}$ satisfies
the assumptions of Theorem 2.1.14. Thus $\mathbb{P}^n(\mathbb{R})$ is a $C^k$-manifold of
dimension $n$. If you know about quotient topology you may be interested in
verifying that the topology we gave to $\mathbb{P}^n(\mathbb{R})$ is the quotient topology
induced from $\mathbb{R}^{n+1} \setminus \{0\}$ with respect to $\sim$.
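The two points to be checked, namely that $\varphi_i$ does not depend on the representative and that the transition maps are smooth, can be illustrated on $\mathbb{P}^2(\mathbb{R})$. The following Python sketch is an illustration only: the function names are our own, representatives are stored as lists, and indices start at 0 rather than 1.

```python
def chart(i, x):
    """phi_i([x]) for a representative x with x[i] != 0 (0-based index i)."""
    return [c / x[i] for j, c in enumerate(x) if j != i]


def chart_inv(i, y):
    """A representative of phi_i^{-1}(y): insert the entry 1 in slot i."""
    return list(y[:i]) + [1.0] + list(y[i:])


x = [2.0, -3.0, 0.5]
scaled = [-4.0 * c for c in x]            # another representative of [x]
same = chart(0, x) == chart(0, scaled)    # independence of the representative

# transition map phi_1 o phi_0^{-1} at the point (0.25, 4.0); in general it is
# (y1, y2) -> (1 / y1, y2 / y1), which is smooth wherever y1 != 0
transition = chart(1, chart_inv(0, [0.25, 4.0]))
```

The sample values were chosen so that all divisions are exact in floating point, which is why the comparison with `==` is safe here.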
Exercise 2.1.17 There is another way to realize the set $\mathbb{P}^n(\mathbb{R})$ as the
quotient of the sphere $S^n$: since any line through the origin meets the
sphere exactly at two points which are antipodal to each other (that is,
$x$ and $-x$), the equivalence $\sim$ on $\mathbb{R}^{n+1} \setminus \{0\}$ induces a relation on $S^n$:
$x \sim y$ if and only if $x = \pm y$, for $x, y \in S^n$.
Example 2.1.18 (Grassmann manifolds) Let $V$ be a vector space
over $\mathbb{R}$ of dimension $n$. Fix $r$ with $1 \le r \le n$. The set of all $r$-dimensional
vector subspaces of $V$ is denoted by $Gr(r, V)$. For $F \in Gr(r, V)$, let us
choose a complementary subspace $G$ of $F$ in $V$: $V = F \oplus G$. Let us
define
\[ U_G := \{ E \in Gr(r, V) : E \oplus G = V \}. \]
Notice that $F \in U_G$. Let $\Phi_{F,G}\colon U_G \to L(F, G)$ be defined by
\[ \Phi_{F,G}(E) := \pi(G, E) \circ \pi(F, E)^{-1}. \]
Here $\pi(F, E)$ denotes the restriction to $E$ of the projection of $V$ onto
$F$ along $G$. $\pi(G, E)$ is similarly defined. Note that $\Phi_{F,G}^{-1}(A)$ is the graph of
$A$ for $A \in L(F, G)$. Check that $\Phi_{F,G}$ is well-defined and that it is a
bijection. Show that the family $\{(U_G, \Phi_{F,G})\}$ satisfies the hypothesis
of Theorem 2.1.14, so that $Gr(r, V)$ is a smooth manifold of dimension
$r(n - r)$. Notice that $Gr(1, \mathbb{R}^{n+1}) = \mathbb{P}^n(\mathbb{R})$.
Exercise 2.1.19 What can you say about $Gr(r, V)$ and $Gr(n - r, V)$
for $0 < r < n$?
Exercise 2.1.20 Show that $Gr(r, V)$ is compact. Hint: A subspace
$E \in Gr(r, V)$ is determined by any of its orthonormal bases.
Exercise 2.1.21 (Flag manifolds) Let $\mathcal{F}$ denote the set of all flags
of subspaces: $V_1 \subset V_2 \subset \cdots \subset V_{n-1} \subset V = V_n$ in $V$, with
$\dim V_i = i$. Use Theorem 2.1.14 to give a manifold structure to $\mathcal{F}$. Prove
that $\mathcal{F}$ is compact.
Exercise 2.1.22 Let $M$ and $N$ be two $C^k$-manifolds of dimensions $m$
and $n$ respectively. Then $M \times N$ with product topology is a $C^k$-manifold
in a natural way of dimension $m + n$.
An important remark on our convention: Henceforth we shall
always deal only with smooth, that is, $C^\infty$-manifolds. We shall also
assume that our manifolds are such that the underlying topology has a
countable basis. In particular a discrete space is a smooth 0-dimensional
manifold if and only if it is countable. We also assume that the manifold
is endowed with the corresponding maximal atlas.
2.2 Smooth maps and diffeomorphisms

We defined smooth manifolds so that we can speak of differentiable
functions on objects other than open sets in $\mathbb{R}^n$. If $M$ is a smooth manifold
of dimension $m$, how do we define a smooth function? If you recall
the motivation underlying the definition of manifolds in Section 2.1, it
should be clear to you when to say a function $f$ on $M$ is smooth. Let
us formally define a smooth function:
Definition 2.2.1 A function $f$ on $M$ is said to be smooth at a point $p$
in $M$ if whenever $p$ lies in a coordinate chart $(U_\alpha, \varphi_\alpha)$, then the function
$f \circ \varphi_\alpha^{-1}$ defined on the open subset $\varphi_\alpha(U_\alpha)$ in $\mathbb{R}^m$ is smooth at the
point $\varphi_\alpha(p)$.
Notice that our compatibility condition (2.1.1) is just what is
required to ensure that this definition is independent of the choice of the
local chart: if $p$ also lies in another chart $(U_\beta, \varphi_\beta)$, then $f$ is smooth
at $p$ with respect to $(U_\alpha, \varphi_\alpha)$ if and only if it is smooth with respect to
$(U_\beta, \varphi_\beta)$. (Check this.)
Definition 2.2.2 We say that $f$ is smooth on an open set $U$ of $M$ if and
only if it is smooth at every point of $U$. This is the same as requiring
that the function $f \circ \varphi_\alpha^{-1}$ on $\varphi_\alpha(U \cap U_\alpha)$ is smooth for all $\alpha$ with $U \cap U_\alpha$
non-empty.
Now the question arises whether there are smooth functions on $M$.
Of course the constant functions are smooth. So the real problem is to
know the existence of smooth functions, sufficiently many to separate
points of $M$.
First of all let us observe that at least locally there are many functions.
For if we set $x_i := u_i \circ \varphi_\alpha$, where $u_i$ is the usual coordinate
function defined by $u_i(\tau) := \tau_i$, for $\tau = (\tau_1, \ldots, \tau_m)$, then $x_i$ is a smooth
function on the domain $U_\alpha$. These functions $x_i$ will be called the coordinate
functions of the local chart $(U_\alpha, \varphi_\alpha)$. In future most often we shall
use $(U, x)$ for $(U, \varphi)$ and call it a local chart.
Now the above construction can be generalized: If $F$ is any smooth
function on $\varphi_\alpha(U_\alpha)$ then its pull-back $f := F \circ \varphi_\alpha$ is smooth on $U_\alpha$.
(Check this.) However notice that these functions are in general local
in the sense that they may not admit any extension to the whole space
$M$. For example consider the coordinate function of the stereographic
atlas of the sphere given by $x_1(s) = s_1/(1 - s_{n+1})$, the first coordinate
function corresponding to the chart $(U, \varphi)$. This is defined at all points
of $S^n$ except $(0, \ldots, 0, 1)$. Clearly it is not possible to extend this even
as a continuous function to the whole of $S^n$.
Thus while it is relatively easy to define local objects on $M$ simply
by transferring the corresponding objects on $\mathbb{R}^m$ via the local charts, it
may not be possible to get globally defined objects on $M$.
Now comes the trick which will enable us to produce global objects
on $M$. If $F$ is a smooth function on $\varphi(U)$ with compact support then
the function $f$ defined at first on $U$ by setting $f := F \circ \varphi$
and then by $f := 0$ on $M \setminus U$ is smooth on the whole of $M$. (Recall that
the support of a function $f$ on a topological space $M$ is defined to be
the closure of the set $\{ p \in M : f(p) \ne 0 \}$.) Indeed, $f$ vanishes outside the
compact set $\varphi^{-1}(\operatorname{supp} F) \subset U$, which is closed in $M$ since $M$ is Hausdorff.
We denote by $C^\infty(M)$ the set of smooth functions on
$M$. Before we list the properties of smooth functions and of $C^\infty(M)$, we
want you to familiarize yourself with the concept of smooth functions
by checking which of the following functions are smooth on $M$.
Exercise 2.2.3 Let $M$ be the manifold $\mathbb{R}$ with the $u^3$-atlas. Consider
the function $f$ defined by $f(t) := t$. Is this smooth? When is the function
$g(t) := t^k$ smooth on $M$ for $k \in \mathbb{N}$?
Exercise 2.2.4 Let $M$ be the manifold $\mathbb{R}$ with the atlas $\varphi(t) := \tan^{-1} t$.
Then $\varphi$ maps $\mathbb{R}$ bijectively onto the open set $(-\pi/2, \pi/2)$. Is the function
$f$ defined by $f(t) := \sin(t)$ smooth?
Exercise 2.2.5 Can you think of some smooth functions on $\mathbb{P}^n(\mathbb{R})$?
If $f, g \in C^\infty(M)$ and $a, b \in \mathbb{R}$ then $af + bg \in C^\infty(M)$, that is,
$C^\infty(M)$ is a vector space over $\mathbb{R}$. (Check this.) Further if we define $fg$
on $M$ by setting $fg(p) := f(p) g(p)$, then $fg \in C^\infty(M)$. Thus $C^\infty(M)$ is
an algebra over $\mathbb{R}$. By the existence of sufficiently many smooth functions
with compact support it follows that $C^\infty(M)$ is an infinite dimensional
vector space over $\mathbb{R}$ provided $M$ is infinite as a set.
Now for any open subset $U$ of $M$ we associate $C^\infty(U)$, the set of
smooth functions on $U$. This correspondence $U \mapsto C^\infty(U)$ has the
following properties:
1. For every open subset $U$ of $M$, $C^\infty(U)$ is an algebra of continuous
functions on $U$ containing the constants.
2. For every $V$, $U$ open sets in $M$ with $V \subset U$ we have $f \in C^\infty(U)$
implies $f|_V \in C^\infty(V)$.
3. If $U = \bigcup_i U_i$, with $U_i$ open in $M$ and if $f$ is a function on $U$ such
that $f|_{U_i} \in C^\infty(U_i)$ for all $i$ then $f \in C^\infty(U)$.
4. There exists an integer $m \ge 0$ such that for all $p \in M$ there is
an open set $U \ni p$ and $m$ real functions say, $x_1, \ldots, x_m$ in $C^\infty(U)$
with the following properties:
(a) the map $\phi\colon q \mapsto (x_1(q), \ldots, x_m(q))$ is a homeomorphism of
$U$ onto an open subset of $\mathbb{R}^m$;
(b) for all open $V \subset U$ and a function $f$ on $V$ we have $f \in C^\infty(V)$
if and only if $f \circ \phi^{-1}$ is smooth on $\phi(V)$.
These properties of the above correspondence characterize the differentiable
structure on $M$ in the following sense: If $U \mapsto \mathcal{D}(U)$ is any
assignment which has the above properties then there is a unique maximal
smooth atlas on $M$ for which $\mathcal{D}(U) = C^\infty(U)$.
Exercise 2.2.6 Prove the above statement.

Thus we have yet another way of defining a smooth manifold. This
definition is useful if one wants to define analogues of manifolds in the
algebraic category. These analogues are called algebraic varieties. For a
good and systematic treatment of them we refer the reader to the book
Algebraic Geometry by R. Hartshorne.
Let us come back to the study of differentiable manifolds. We now
wish to define smooth maps between two smooth manifolds.
Definition 2.2.7 A map $f : U \subset M \to N$ is said to be smooth if for
every $p \in U$, there exist coordinate charts $(U_p, \phi)$ around $p$ and $(V_q, \psi)$
around $q := f(p)$ such that $f(U_p) \subset V_q$ and the map
\[ \psi \circ f \circ \phi^{-1} : \phi(U_p) \subset \mathbb{R}^m \to \psi(V_q) \subset \mathbb{R}^n \]
is smooth. See Figure 2.2.1.

Here we denoted by m, n the dimensions of M, N respectively. The

rest of the notation is, hopefully, by now standard. This definition is
what the reader should have come to expect by now.
Let us look at some examples of smooth maps:
Example 2.2.8 First notice that if $f$ is a smooth function on $M$ then
it is a smooth map of $M$ into $\mathbb{R}$.
Figure 2.2.1 Smooth maps between manifolds
Example 2.2.9 Let $X$, $Y$ be the manifolds (i) and (ii) in Figure 2.1.3.
Consider the $\infty$-manifold in $\mathbb{R}^2$ (see Example 2.1.13 in Section 2.1). Let
$f$ be the identity map of the underlying sets of the manifolds: That is,
if $(x, y) \in X \subset \mathbb{R}^2$ then $f((x, y)) := (x, y) \in Y \subset \mathbb{R}^2$. Then $f$ is not
smooth. This is yet another way of rephrasing a fact which we have seen
earlier. (What is this fact?)
Example 2.2.10 Let $f : \mathbb{R}^{n+1} \setminus \{0\} \to \mathbb{P}^n(\mathbb{R})$ be the quotient map,
namely, $x \mapsto [x]$. Then $f$ is a smooth map. Similarly the quotient map
from $S^n$ to $\mathbb{P}^n(\mathbb{R})$ is smooth.
Example 2.2.11 Let $\phi : M \to N$ and $\psi : N \to S$ be smooth maps.
Then their composition $\psi \circ \phi : M \to S$ is a smooth map.
Example 2.2.12 Let $M_i$ be smooth manifolds for $i = 1, 2$. Then the
projections $\pi_i$ are smooth maps from $M_1 \times M_2$ to $M_i$. In fact, the
smooth structure on $M_1 \times M_2$ has the following universal property: For
any smooth manifold $N$, if $\varphi_i$ is a map from $N$ to $M_i$, for $i = 1, 2$, then
$\varphi := (\varphi_1, \varphi_2) : N \to M_1 \times M_2$ is smooth if and only if the $\varphi_i$'s are smooth.
Exercise 2.2.13 Is there an analogous universal property in terms of
maps from $M_1 \times M_2$ to any smooth manifold $S$?
A map $\varphi$ from $M$ to $N$ is smooth if and only if for any smooth
function $f$ on $N$ the function $f \circ \varphi$ is smooth.
Exercise 2.2.14 Can you think of another related question?
We now want to define a concept which plays the role of an isomorphism
for the 'category' of smooth manifolds.
Definition 2.2.15 A map $\varphi$ from $M$ to $N$ is said to be a diffeomorphism
of $M$ onto $N$ if $\varphi$ is smooth, bijective and the inverse $\varphi^{-1}$ is also smooth
from $N$ to $M$.
In particular a diffeomorphism $\varphi$ of $M$ onto $N$ is a homeomorphism of
$M$ onto $N$. However notice that a homeomorphism $\psi$ of $M$ onto $N$ which
is also smooth need not be a diffeomorphism. For example, consider the
map $\psi$ from $\mathbb{R}$ to $\mathbb{R}$ given by $\psi(t) := t^3$. Then it is not a diffeomorphism
for the simple reason that $\psi^{-1}$ is not smooth on $\mathbb{R}$.
If there is a diffeomorphism of a manifold $M$ onto another manifold
$N$ then we say that these two manifolds are diffeomorphic. It is obvious
that being diffeomorphic is an equivalence relation.
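The failure of smoothness in the example $\psi(t) = t^3$ can be made explicit with ordinary calculus. The following is a short sketch of that computation; it uses nothing beyond the formula for the inverse map on $\mathbb{R}$ with its usual structure.

```latex
\[
\psi^{-1}(s) = s^{1/3}, \qquad
\frac{d}{ds}\,\psi^{-1}(s) = \tfrac{1}{3}\, s^{-2/3} \quad (s \neq 0),
\]
\[
\lim_{s \to 0} \frac{\psi^{-1}(s) - \psi^{-1}(0)}{s}
  = \lim_{s \to 0} s^{-2/3} = +\infty .
\]
```

So $\psi^{-1}$ is continuous but not even differentiable at $0$, although $\psi$ itself is a smooth homeomorphism.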
Example 2.2.16 Let $M$ be the manifold $\mathbb{R}$ with the usual smooth
structure and $N$ be the one which has the $u^3$-structure. Then the map
$\varphi$ from $M$ to $N$, defined by $\varphi(t) := t^{1/3}$, is a diffeomorphism of $M$ onto
$N$. (Check this.)
Example 2.2.17 The two $\infty$-manifolds of Example 2.1.13 are diffeomorphic.
(Prove this.)
Example 2.2.18 Let us record a trivial example of a diffeomorphism. If
$(U, \varphi)$ is a local chart in a smooth manifold $M$ then $\varphi$ is a diffeomorphism
of $U$ onto its image $\varphi(U)$.
Definition 2.2.19 Let $G$ be a group. Further assume that $G$ also has
a smooth structure. Assume that these two structures are interwoven in
the following sense:
With respect to the product manifold structure on $G \times G$ the multiplication
map $G \times G \to G$ given by $(x, y) \mapsto xy$, and the inversion
map $G \to G$ given by $x \mapsto x^{-1}$ are both smooth. Then we say that $G$
is a Lie group. Of course these conditions can be replaced by a single
condition requiring the smoothness of the map $G \times G \to G$ defined by
$(x, y) \mapsto xy^{-1}$.
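Why the single condition suffices can be sketched as follows. The name $\mu$ for the map $(x, y) \mapsto xy^{-1}$ and the symbol $e$ for the identity element are notation introduced here for illustration only.

```latex
\[
\mu(x, y) := x y^{-1}, \qquad
x^{-1} = \mu(e, x), \qquad
x y = \mu\bigl(x, \mu(e, y)\bigr).
\]
```

If $\mu$ is smooth, then inversion is the composition of the smooth map $x \mapsto (e, x)$ with $\mu$, and multiplication is the composition of the smooth map $(x, y) \mapsto (x, \mu(e, y))$ with $\mu$. Conversely, $\mu$ is the composition of $(x, y) \mapsto (x, y^{-1})$ with the multiplication map.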
Examples 2.2.20 (of Lie Groups)
1. $\mathbb{R}^n$ with addition as the group operation and with the usual
differential structure is a Lie group.
2. $\mathbb{R}^* := \mathbb{R} \setminus \{0\}$ with multiplication as the group operation and the
smooth structure (as an open subset of $\mathbb{R}$) is a Lie group.
3. $S^1$ considered as the subset $\{z \in \mathbb{C} : |z| = 1\}$ of $\mathbb{C}$ with the usual
multiplication in $\mathbb{C}$ as the group operation and with the smooth
structure introduced in Chapter 1 is a Lie group.
4. A most important example is $GL(n, \mathbb{R})$. This is a Lie group with
its smooth structure and with matrix multiplication as its group
operation. You should check the details of this example instead of
taking them for granted, as this will make you realize the significance
of some of our earlier exercises and examples.
5. The manifolds $O(n, \mathbb{R})$, $U(n)$, $SU(n)$ and $SL(n, \mathbb{R})$ are defined as
follows:
\[
\begin{aligned}
O(n, \mathbb{R}) &:= \{A \in M(n, \mathbb{R}) : A \cdot A^t = I\} \\
U(n) &:= \{A \in M(n, \mathbb{C}) : A \cdot A^* = I\} \\
SU(n) &:= \{A \in M(n, \mathbb{C}) : A \cdot A^* = I \text{ and } \det(A) = 1\} \\
SL(n, \mathbb{R}) &:= \{A \in M(n, \mathbb{R}) : \det(A) = 1\}.
\end{aligned}
\]
These are all Lie groups with smooth structures we have already
introduced and with matrix multiplication. Even though we are
fully equipped to substantiate our claim at this stage, we will postpone
a proof of this. (See Remark 2.6.17, p.106.) Meanwhile the
reader should try to prove this. Even if one does not succeed it
will make one understand the various concepts introduced so far
and also enhance one's appreciation of things to come.
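For the case $n = 2$ of item 4, the reason multiplication and inversion are smooth can be seen directly: the entries of a product are polynomials in the entries of the factors, and the entries of the inverse are rational functions whose denominator, the determinant, does not vanish on $GL(2, \mathbb{R})$. The following is a minimal sketch of these formulas. The helper names `matmul`, `det2`, `inverse2` and `close` are hypothetical, and the code illustrates the formulas only; it is not a proof of smoothness.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows.

    Each entry is a polynomial in the entries of a and b.
    """
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def det2(a):
    """Determinant of a 2x2 matrix."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def inverse2(a):
    """Inverse of a 2x2 matrix by the adjugate formula.

    Each entry is a rational function of the entries of a, with
    denominator det2(a), which is nonzero exactly on GL(2, R).
    """
    d = det2(a)
    if d == 0:
        raise ValueError("matrix is not in GL(2, R)")
    return [[a[1][1] / d, -a[0][1] / d],
            [-a[1][0] / d, a[0][0] / d]]


def close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices up to a tolerance."""
    return all(abs(x - y) < tol
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))


if __name__ == "__main__":
    A = [[2.0, 1.0], [1.0, 1.0]]
    B = [[0.0, -1.0], [1.0, 3.0]]
    print(matmul(A, B))
    print(close(matmul(A, inverse2(A)), [[1.0, 0.0], [0.0, 1.0]]))
```

The same reasoning applies to $GL(n, \mathbb{R})$ through Cramer's rule, which is one way to organize the check the text asks for.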
Exercise 2.2.21 Let $R_a$, $L_a$, $A_g$ denote the maps of a Lie group $G$
into itself given by $R_a x := xa$, $L_a x := ax$, $A_g x := gxg^{-1}$ respectively.
Then they are smooth maps of $G$ to $G$ and hence diffeomorphisms of
$G$. $L_a$ (respectively, $R_a$) is called the left translation (respectively, right
translation) by $a$.
2.3 Tangent spaces to a manifold

Definition 2.3.1 Let $M$ be a smooth manifold. By a curve in $M$ we
mean a smooth map $\gamma : (-\varepsilon, \varepsilon) \to M$. $\gamma$ is said to pass through the
point $p := \gamma(0)$.
Let us recall and reformulate the notion of tangents in $\mathbb{R}^n$. The most
intuitive and pleasant definition is via curves. Let $\gamma : (-\varepsilon, \varepsilon) \to \Omega$, an
open subset of $\mathbb{R}^n$, be a curve in $\Omega$. Without loss of generality, we assume
that $0 \in \Omega$ and $\gamma(0) = 0$. If we let $\gamma_i(t) = u_i(\gamma(t))$, then the tangent
to $\gamma$ at $p = \gamma(0)$ is defined to be the vector $\gamma'(0) = (\gamma_1'(0), \ldots, \gamma_n'(0))$.
Now let $p \in \Omega$ and $v \in \mathbb{R}^n$ be given. Since $\Omega$ is open in $\mathbb{R}^n$, there exists
an $\varepsilon > 0$ such that the linear curve $\gamma_{p,v}(t) := p + tv$ for $t \in (-\varepsilon, \varepsilon)$ lies
in $\Omega$. Thus $\gamma := \gamma_{p,v}$ is a curve through $\gamma(0) = p$ with $\gamma'(0) = v$. Thus
the full space $\mathbb{R}^n$ can be thought of as the set of all (why?) tangent
vectors at any point $p \in \Omega$. We can write $T_p(\Omega)$ for the set of all tangent
vectors at a point $p \in \Omega$, so that $T_p(\Omega) = \mathbb{R}^n$. We would like to recapture the vector
space structure on $T_p(\Omega)$ in a more natural way.
Before doing this, we shall explore the connection between curves
and tangent vectors. We say two curves $\gamma$ and $\sigma$ through $p$ are equivalent at $p$, and
we write $\gamma R_p \sigma$, if and only if $\gamma'(0) = \sigma'(0)$. That is, they have the same
tangent vectors at $p$. $R_p$ is easily seen to be an equivalence relation. We
denote by $[\gamma]$ the equivalence class containing the curve $\gamma$. Then we can
also identify $T_p(\Omega)$ with $\{[\gamma]\}$.
We shall put these concepts in a different light by exploring the
relation between the tangent vectors and derivatives of functions and
maps. Let $f : \Omega \to \mathbb{R}$ be smooth. For any $v \in \mathbb{R}^n$, we recall the definition
of the directional derivative $D_v f(x)$ of $f$ in the direction of $v$ at $x$ as
\[ \lim_{t \to 0} t^{-1} \bigl( f(x + tv) - f(x) \bigr). \]
We know that this limit $D_v f(x)$ exists and is given by
\[ D_v f(x) = \sum_i v_i \frac{\partial f}{\partial u_i}(x) = \langle \operatorname{grad} f(x), v \rangle. \]
Let us adopt a suggestive notation:
\[ \frac{\partial}{\partial u_i}\Big|_x (f) := \frac{\partial f}{\partial u_i}(x). \]
We recall that in computing $D_v f(x)$ we could have used any curve $\sigma$ through $x$ in
place of the linear curve $\gamma_{x,v}$, as long as $\sigma \in [\gamma_{x,v}]$, that is, $\sigma'(0) = v$.
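The two facts just recalled, that the directional derivative equals $\langle \operatorname{grad} f(x), v \rangle$ and that any curve with the same tangent vector gives the same value, can be checked numerically. The following is a rough sketch under stated assumptions: the sample function `f`, its hand-computed gradient `grad_f`, and the helper names are all chosen here for illustration, and the derivative is approximated by a central difference quotient.

```python
import math


def f(x):
    """A sample smooth function on R^2 (chosen only for illustration)."""
    return math.sin(x[0]) * math.exp(x[1])


def grad_f(x):
    """Gradient of the sample function, computed by hand."""
    return (math.cos(x[0]) * math.exp(x[1]),
            math.sin(x[0]) * math.exp(x[1]))


def inner(a, b):
    """Standard inner product on R^n."""
    return sum(ai * bi for ai, bi in zip(a, b))


def linear_curve(x, v):
    """The linear curve t -> x + t v."""
    return lambda t: tuple(xi + t * vi for xi, vi in zip(x, v))


def bent_curve(x, v, w):
    """A curve t -> x + t v + t^2 w; it has the same tangent vector v at t = 0."""
    return lambda t: tuple(xi + t * vi + t * t * wi
                           for xi, vi, wi in zip(x, v, w))


def derivative_along(func, curve, step=1e-6):
    """Central difference approximation of d/dt func(curve(t)) at t = 0."""
    return (func(curve(step)) - func(curve(-step))) / (2 * step)


if __name__ == "__main__":
    x, v, w = (0.3, -0.2), (1.5, -2.0), (4.0, 1.0)
    print(inner(grad_f(x), v))
    print(derivative_along(f, linear_curve(x, v)))
    print(derivative_along(f, bent_curve(x, v, w)))
```

All three printed numbers should agree up to the discretization error of the difference quotient.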
To each $v \in \mathbb{R}^n$, when thought of as a tangent vector at $x$ and hence
to each equivalence class of curves through $x$, we associate the directional
derivative operator $D_{x,v} := \sum_i v_i \frac{\partial}{\partial u_i}\big|_x$. It operates on smooth functions
defined in a neighborhood of $x$ as follows:
\[ D_{x,v}(f) := \sum_i v_i \frac{\partial f}{\partial u_i}(x). \]
Notice that this correspondence is one-one: $D_{x,v} = D_{x,w}$ if and only if
$v = w$. For, if $u_j$ is the $j$-th coordinate function we have
\[ v_j = D_{x,v}(u_j) = D_{x,w}(u_j) = w_j \]
for all $j$, so that the vectors $v$ and $w$ have the same components. Hence
$v = w$.
The set $\{D_{x,v} : v \in \mathbb{R}^n\}$ has a natural vector space structure over $\mathbb{R}$,
as can be seen from the following:
i) $(D_{x,v} + D_{x,w})(f) := D_{x,v}(f) + D_{x,w}(f)$;
ii) $(a D_{x,v})(f) := a D_{x,v}(f)$.
It is trivial to see that $v \mapsto D_{x,v}$ is a linear isomorphism between the
vector spaces.
$D_{x,v}$ has the following properties:
1. $D_{x,v}(f) = D_{x,v}(g)$ if $f$ and $g$ are smooth around $x$ and $f = g$ on
a neighborhood of $x$.
2. $D_{x,v}(af + bg) = a D_{x,v}(f) + b D_{x,v}(g)$ for $a, b \in \mathbb{R}$.
3. $D_{x,v}(fg) = f(x) D_{x,v}(g) + D_{x,v}(f) g(x)$.
Properties (2) and (3) are characterizing properties in the sense that if
$\partial$ is a rule which associates a real number to each smooth real valued
function $f$ defined around $x$ having properties (2) and (3), then there
exists a unique $v \in \mathbb{R}^n$ such that $\partial = D_{x,v}$. In particular,
\[ \partial(f) = \frac{d}{dt}\Big|_{t=0} (f \circ \gamma)(t) \]
for some curve $\gamma$ through $x$ with $\gamma'(0) = v$.

To prove $\partial = D_{x,v}$ for some $v \in \mathbb{R}^n$, we first observe that (2) and (3)
imply
\[ \partial(1) = \partial(1 \cdot 1) = 1\,\partial(1) + \partial(1)\,1 = 2\,\partial(1) \]
so that $\partial(1) = 0$ and hence, by (2), $\partial(c) = 0$ for any constant function $c$. Let us now look at the
case $n = 1$. If the above representation is true then we should have
$\partial(f) = a \frac{d}{du}\big|_x (f) = a f'(x)$ for some constant $a$. (We assume without
loss of generality that $x = 0$.) This suggests that we use the Fundamental
Theorem of Calculus:
\[ f(x) = f(0) + \int_0^x f'(t)\, dt. \tag{2.3.1} \]
We apply $\partial$ to both sides of Equation (2.3.1). The second term on the
right must therefore be recast to facilitate the application of $\partial$. We put
$t = ux$ and get
\[ f(x) = f(0) + x \int_0^1 f'(xu)\, du = f(0) + h(x) g(x) \]
with $h(x) = x$ and $g(x) := \int_0^1 f'(xu)\, du$. Applying $\partial$ to the extreme sides and using (2) and (3)
we get
\[ \partial(f) = 0 + \partial(x) g(0) + 0 \cdot \partial(g) = \partial(x) g(0). \]
What is $g(0)$? We have
\[ g(0) = \int_0^1 f'(0)\, du = f'(0) = \frac{d}{du}\Big|_{u=0} (f). \]
Hence we see that $\partial = D_{0,a} = a \frac{d}{du}\big|_0$, where $a = \partial(x)$.
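As a concrete instance of the decomposition $f(x) = f(0) + x\,g(x)$ used in this argument, one may take $f = \sin$. The choice of this particular function is ours and serves only as an illustration.

```latex
\[
f(x) = \sin x, \qquad
g(x) = \int_0^1 \cos(xu)\, du =
\begin{cases}
  \dfrac{\sin x}{x}, & x \neq 0, \\[6pt]
  1, & x = 0,
\end{cases}
\]
\[
f(0) + x\, g(x) = 0 + x \cdot \frac{\sin x}{x} = \sin x \quad (x \neq 0),
\qquad g(0) = 1 = f'(0).
\]
```

Here $g$ is again smooth around $0$, which is what allows the rule to be applied to it; for this $f$ the conclusion reads $\partial(\sin) = \partial(x) \cdot 1$.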

This proof easily generalizes to higher dimensions. Again take $x = 0$
and a function $F$ smooth on a star-like neighborhood of $0$. Let
$\varphi : [0, 1] \to \mathbb{R}^n$ be defined by $\varphi(t) = tx$ for some $x$ in the neighborhood
of $0$. We then apply the above reasoning to the function $F \circ \varphi$.
Hence we get
\[ F(x) = F(0) + \sum_i x_i \int_0^1 \frac{\partial F}{\partial u_i}(tx)\, dt = F(0) + \sum_i x_i g_i(x), \tag{2.3.2} \]
where $g_i(x) := \int_0^1 \frac{\partial F}{\partial u_i}(tx)\, dt$. Note that this implies
\[ g_i(0) = \frac{\partial F}{\partial u_i}(0) = \frac{\partial}{\partial u_i}\Big|_0 (F). \tag{2.3.3} \]
Now proceeding as earlier we see that $\partial = \sum_i \partial(u_i) \frac{\partial}{\partial u_i}\big|_0$.
The uniqueness of $v$ is clear. Thus we have established a linear isomorphism
from $\mathbb{R}^n$, the space of tangent vectors at $x$, onto the space of directional
derivative operators $\partial$ satisfying (2) and (3). Therefore whenever
we want to think of a tangent vector $v$ at a point $x$, we can think of it
either as an equivalence class of curves having $v$ as the common tangent
vector at $x$ or as the directional derivative operator $D_{x,v}$.
Remark 2.3.2 This remark is about a technical point and hence may
be omitted on first reading.
The reader may be puzzled as to why we always talked of smooth
functions and $\partial$ as operators on smooth functions around $p$. After all we
wanted to talk of tangent vectors. For this purpose it may very well be
enough to have restricted to $C^1$-functions. If we had done so, in the first
order Taylor (-like) expansion $f(x) = f(0) + x g(x)$ we obtained above, $g$
will in general be continuous, not necessarily $C^1$. We cannot apply $\partial$ to
$g$ as the domain of $\partial$ is $C^1$-functions around $0$. Thus we could not have
concluded that $\partial$ is of the form $D_{0,v}$ for some $v \in \mathbb{R}^n$.
Before we define the tangent space at a point to a manifold, we point
out the relation between $DF(p)$ of a map (not necessarily a function)
$F : \Omega \to \mathbb{R}^m$ and the directional derivative operators.
We write $F = (f_1, \ldots, f_m)$ where as usual $f_i := u_i \circ F$. Now $DF(p) :
\mathbb{R}^n \to \mathbb{R}^m$ can be defined as the linear map $DF(p) : T_p(\mathbb{R}^n) \to T_{F(p)}(\mathbb{R}^m)$
as follows:
\[ DF(p)(v) = \bigl( Df_1(p)(v), \ldots, Df_m(p)(v) \bigr)^t, \]
as a column vector so that we get back our Jacobian representation of
$DF(p)$:
\[ \left( \frac{\partial f_i}{\partial u_j} \right)_{1 \le i \le m,\ 1 \le j \le n} \]
at $p$ acting upon the column vector $(v_1, \ldots, v_n)^t$.

After all these preliminary discussions on $\mathbb{R}^n$, we are ready for the
concept of tangent vectors to a manifold. Let $M$ be a smooth manifold,
and let $p \in M$. Let $C^\infty(p)$ be the set of $\mathbb{R}$-valued smooth functions
defined in a neighborhood of $p$. Thus, if $f \in C^\infty(p)$, then there exists a
neighborhood $U_f$ ($U_f$ depends on $f$) of $p$ such that $f$ is defined and $C^\infty$
on $U_f$.
Definition 2.3.3 A tangent vector $v$ at a point $p \in M$ is a mapping
from $C^\infty(p)$ to $\mathbb{R}$ enjoying the following properties:
1. $v(f) \in \mathbb{R}$, for all $f \in C^\infty(p)$.
2. $v(af + bg) = a v(f) + b v(g)$, for $a, b \in \mathbb{R}$ and $f, g \in C^\infty(p)$. Here
$af + bg$ is understood as a function on $U_f \cap U_g$.
3. $v(fg) = f(p) v(g) + v(f) g(p)$.
Thus a tangent vector is a "linear functional" on $C^\infty(p)$ (by (1) and
(2)) and it satisfies a Leibniz type rule (3). Now the question is whether
there exists any nonzero tangent vector at $p$. ($0$ defined by $0(f) = 0$ is
trivially a tangent vector!) Let us denote by $T_pM$ the set of all tangent
vectors at $p$ to $M$. We wish to show that $T_pM$ has nontrivial elements.
Let $(U, \varphi)$ be a coordinate chart. We say it is centered at $p$ if $\varphi(p) = 0$.
(There always exist such charts.) Let $x_i$ be the corresponding coordinate
functions on $U$. Then for any $f \in C^\infty(p)$, we define
\[ \frac{\partial}{\partial x_i}\Big|_p (f) := \frac{\partial}{\partial u_i} (f \circ \varphi^{-1})(0) = \frac{\partial}{\partial u_i}\Big|_0 (f \circ \varphi^{-1}). \]
Then it is easy to check that $\frac{\partial}{\partial x_i}\big|_p \in T_pM$. They are nontrivial since if
we take $f = x_i$, then
\[ \frac{\partial}{\partial x_i}\Big|_p (f) = \frac{\partial}{\partial u_i} (x_i \circ \varphi^{-1})(0) = \frac{\partial}{\partial u_i} (u_i)(0) = 1. \]
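More generally, let $\gamma : (-\varepsilon, \varepsilon) \to M$ be any smooth curve through $p$.
Then we define $\gamma'(0)$ by setting, for any $f \in C^\infty(p)$,
\[ \gamma'(0)(f) := \frac{d}{dt} (f \circ \gamma)(0) = \frac{d}{dt}\Big|_{t=0} (f \circ \gamma). \]
We claim that $\gamma'(0)$ is a tangent vector at $p$.
\[
\begin{aligned}
\gamma'(0)(f + g) &= \gamma'(0)(f) + \gamma'(0)(g) \quad \text{(Check this.)} \\
\gamma'(0)(fg) &= \frac{d}{dt} \bigl( fg \circ \gamma \bigr) \Big|_{t=0} \\
&= \frac{d}{dt} \bigl( f \circ \gamma(t) \cdot g \circ \gamma(t) \bigr) \Big|_{t=0} \\
&= \Bigl( f \circ \gamma(t) \cdot \frac{d}{dt} (g \circ \gamma(t)) \Bigr) \Big|_{t=0}
 + \Bigl( \frac{d}{dt} (f \circ \gamma(t)) \cdot g \circ \gamma(t) \Bigr) \Big|_{t=0} \\
&= f(\gamma(0)) \bigl( \gamma'(0)(g) \bigr) + \bigl( \gamma'(0)(f) \bigr) g(\gamma(0)) \\
&= f(p) \bigl( \gamma'(0)(g) \bigr) + \bigl( \gamma'(0)(f) \bigr) g(p).
\end{aligned}
\]
Now why is this "more general?" For, the tangent vector $\frac{\partial}{\partial x_i}\big|_p$ is $\sigma'(0)$
of the $i$-th coordinate curve $\sigma : t \mapsto \varphi^{-1}(0, \ldots, t, \ldots, 0)$. Look at
Figure 2.3.1. (Exercise: Verify this.)
Figure 2.3.1: Coordinate tangent vectors as the tangent vectors of curves (the $x_1$-curves and $x_2$-curves)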
It is very easy to see that the set $T_pM$ of tangent vectors at $p$ to
$M$ forms a vector space over $\mathbb{R}$ in an obvious way. For $a \in \mathbb{R}$ and
$v, w \in T_pM$, we set $(v + w)(f) := v(f) + w(f)$ and $(av)(f) := a\,v(f)$ for $f \in C^\infty(p)$.
We now show that $\bigl\{ \frac{\partial}{\partial x_i}\big|_p \bigr\}$ forms a basis for this vector space so that
$\dim T_pM = \dim M$. Here the left side is the dimension of the vector
space while the right is the dimension of the manifold.
Theorem 2.3.4 (Basis theorem) Let $p \in M$ and $(U, \varphi)$ be a local
chart at $p$. Let $x_i$ be the corresponding local coordinates for $1 \le i \le m$.
Let $v \in T_pM$. We then have
\[ v = \sum_i v(x_i) \frac{\partial}{\partial x_i}\Big|_p . \tag{2.3.4} \]
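Proof Most of our work is already done. We simply transfer the problem
to one on the open set $\varphi(U)$ of $\mathbb{R}^m$. We shall assume that the chart
is centered at $p$.
For $f \in C^\infty(p)$, we define $F := f \circ \varphi^{-1}$. Then $F \in C^\infty(0)$ in $\mathbb{R}^m$. We
then have, as above, (2.3.2 and 2.3.3 on page 84) the first order Taylor
expression for $F$ in $\mathbb{R}^m$: $F(x) = F(0) + \sum_i x_i G_i(x)$ where the $G_i$ are smooth
around $0$ and $G_i(0) = \frac{\partial}{\partial u_i} F(0)$. We now pull this expression back to $U$
via $\varphi$ to get
\[ f = f(p) + \sum_i x_i g_i, \]
where $g_i := G_i \circ \varphi$ and $x_i := u_i \circ \varphi$. We apply $v$ to both sides of the
equation to get
\[ v(f) = v(f(p)) + \sum_i \bigl( v(x_i) g_i(p) + x_i(p) v(g_i) \bigr). \]
Since $f(p)$ is a constant, $v(f(p)) = 0$, and since the chart is centered at $p$,
$x_i(p) = 0$. Hence $v(f) = \sum_i v(x_i) g_i(p)$.
But we have $g_i(p) = \frac{\partial}{\partial x_i}\big|_p (f)$, for,
\[ g_i(p) = G_i \circ \varphi(p) = G_i(0) = \frac{\partial}{\partial u_i}\Big|_0 (F) = \frac{\partial}{\partial u_i}\Big|_0 (f \circ \varphi^{-1}) = \frac{\partial}{\partial x_i}\Big|_p (f), \]
by the very definition of $\frac{\partial}{\partial x_i}\big|_p$.
Now to deal with the case when $(U, \varphi)$ is not centered at $p$ is easy.
We define $(U, \psi)$ by $\psi(q) = \varphi(q) - \varphi(p)$. Thus, if the local coordinates
for $\psi$ are denoted by $y_i$, we see $y_i(q) = x_i(q) - x_i(p)$. Hence we have
$\frac{\partial}{\partial y_i}\big|_p = \frac{\partial}{\partial x_i}\big|_p$ and also $v(y_i) = v(x_i)$. These observations prove that the
expression for $v$ in the statement of the theorem holds for all charts.
$\square$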
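A worked instance of the basis theorem may help. The following sketch takes $M$ to be the plane with the closed negative $u_1$-axis removed, with the polar chart $(r, \theta)$, $\theta \in (-\pi, \pi)$; this choice of manifold, chart and vector is ours, made only to illustrate formula (2.3.4).

```latex
\[
r = \sqrt{u_1^2 + u_2^2}, \qquad \theta = \operatorname{atan2}(u_2, u_1),
\qquad v := \frac{\partial}{\partial u_1}\Big|_p, \quad p = (a, b),
\]
\[
v(r) = \frac{\partial r}{\partial u_1}(p) = \frac{a}{\sqrt{a^2 + b^2}}, \qquad
v(\theta) = \frac{\partial \theta}{\partial u_1}(p) = -\frac{b}{a^2 + b^2},
\]
\[
\frac{\partial}{\partial u_1}\Big|_p
  = v(r)\, \frac{\partial}{\partial r}\Big|_p
  + v(\theta)\, \frac{\partial}{\partial \theta}\Big|_p
  = \cos\theta\, \frac{\partial}{\partial r}\Big|_p
  - \frac{\sin\theta}{r}\, \frac{\partial}{\partial \theta}\Big|_p .
\]
```

The coefficients are obtained simply by applying $v$ to the coordinate functions $r$ and $\theta$, exactly as the theorem prescribes; note that this chart is not centered at $p$.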
Remarks 2.3.5
1) As the $\frac{\partial}{\partial x_i}\big|_p$ are linearly independent (since $\frac{\partial}{\partial x_i}\big|_p (x_j) = \delta_{ij}$), we infer
$\dim T_pM = \dim M$.
2) We wish to point out the importance of the representation (2.3.4)
in Theorem 2.3.4, whose significance is most often overlooked.
It not only tells us that the $\frac{\partial}{\partial x_i}\big|_p$ span $T_pM$ but also how to write the
expression, that is, what the coefficients are if we write the vector
$v$ in terms of the basis $\frac{\partial}{\partial x_i}\big|_p$.
To wit, the $i$-th coefficient is the real
number $v(x_i)$!
We shall have many occasions to use this observation. Let us show
one such use now.
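Example 2.3.6 Suppose that $p$ also lies in another chart $(V, \psi)$ with
coordinates $y_i$. Since $\frac{\partial}{\partial y_i}\big|_p \in T_pM$ we want to write $\frac{\partial}{\partial y_i}\big|_p$ as a linear
combination of the basic vectors $\frac{\partial}{\partial x_j}\big|_p$. We denote $\frac{\partial}{\partial y_i}\big|_p$ by $w$. Then by
the basis theorem, we have
\[ w = \sum_j w(x_j) \frac{\partial}{\partial x_j}\Big|_p = \sum_j \frac{\partial}{\partial y_i}\Big|_p (x_j) \cdot \frac{\partial}{\partial x_j}\Big|_p . \]
That is, we have
\[ \frac{\partial}{\partial y_i}\Big|_p = \sum_j \frac{\partial x_j}{\partial y_i}(p)\, \frac{\partial}{\partial x_j}\Big|_p . \tag{2.3.5} \]
We now let $v$ be some tangent vector at $p$ with the coordinate representation
$v = \sum_j v^j \frac{\partial}{\partial x^j}\big|_p = \sum_i \bar{v}^i \frac{\partial}{\partial \bar{x}^i}\big|_p$,
where we have set $x^j = x_j$ and $\bar{x}^i = y_i$. How
are these components $v^j$ and $\bar{v}^i$ related? Using Equation (2.3.5) we see
that
\[ v = \sum_i \bar{v}^i \frac{\partial}{\partial \bar{x}^i}\Big|_p = \sum_j \Bigl( \sum_i \bar{v}^i \frac{\partial x^j}{\partial \bar{x}^i}(p) \Bigr) \frac{\partial}{\partial x^j}\Big|_p . \]
Thus we find that
\[ v^j = \sum_i \bar{v}^i \frac{\partial x^j}{\partial \bar{x}^i}(p) = \bar{v}^i \frac{\partial x^j}{\partial \bar{x}^i} . \tag{2.3.6} \]
In the right extreme expression we have used the Einstein summation
convention. If you look up some classical tensor analysis text book, you
will find that Equation (2.3.6) is the transformation law for a contravariant
tensor of rank 1. To keep it in conformity with the classical notation
we have used the $\bar{x}$'s and the summation convention.
In future we shall refer to $\frac{\partial}{\partial x_i}\big|_p$ as the coordinate tangent vectors.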
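The transformation law (2.3.6) can be tested numerically in a familiar case. The sketch below is an illustration under assumptions of our own: the unbarred coordinates are the Cartesian coordinates of the plane, the barred coordinates are polar coordinates $(r, \theta)$, and all function names are hypothetical. It compares the components predicted by (2.3.6) with the velocity of a curve whose polar components are prescribed.

```python
import math


def to_cartesian(r, theta):
    """Cartesian coordinates (x^1, x^2) of the point with polar coordinates (r, theta)."""
    return (r * math.cos(theta), r * math.sin(theta))


def jacobian(r, theta):
    """Matrix of partial derivatives of x^j with respect to the polar coordinates.

    Row index j runs over the Cartesian coordinates,
    column index i over the polar coordinates (r, theta).
    """
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]


def transform_components(vbar, r, theta):
    """Cartesian components v^j obtained from polar components vbar^i via (2.3.6)."""
    jac = jacobian(r, theta)
    return tuple(sum(jac[j][i] * vbar[i] for i in range(2)) for j in range(2))


def velocity_of_curve(vbar, r, theta, step=1e-6):
    """Cartesian velocity at t = 0 of the curve t -> (r + t vbar[0], theta + t vbar[1]).

    The curve is given in polar coordinates; the derivative of its Cartesian
    image is approximated by a central difference quotient.
    """
    plus = to_cartesian(r + step * vbar[0], theta + step * vbar[1])
    minus = to_cartesian(r - step * vbar[0], theta - step * vbar[1])
    return tuple((a - b) / (2 * step) for a, b in zip(plus, minus))


if __name__ == "__main__":
    r0, theta0, vbar = 2.0, 0.5, (0.3, -1.2)
    print(transform_components(vbar, r0, theta0))
    print(velocity_of_curve(vbar, r0, theta0))
```

The two printed pairs should agree up to the error of the difference quotient, since by the basis theorem the Cartesian components of $v$ are the numbers $v(x^j)$.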

An important exercise which will reassure us that tangent vectors are
"tangents to curves" is the following:
Exercise 2.3.7 Show that for any tangent vector $v \in T_pM$, there exists
at least one smooth curve $\gamma$ passing through $p$ such that $v = \gamma'(0)$.
We cannot overemphasize the importance of this exercise. We shall
see that this will be of immense help to us whenever we want to compute
the derivatives of maps. (Recall what we saw in Section 1.1.)
Before we wind up our study of tangent vectors, let us draw the
attention of the reader to the fact that a tangent vector at a point
$p \in M$ can be viewed in any one of the following four equivalent ways:
1. As a directional derivative operator satisfying (1)-(3) of Definition 2.3.3.
2. As an equivalence class $[\gamma]$ of curves all having the property that
for all $f \in C^\infty(p)$ one has $\sigma'(0)(f) = \gamma'(0)(f)$ for any $\sigma \in [\gamma]$. (To
see the equivalence of (1) and (2) you need Exercise 2.3.7!)
3. As an $m$-tuple $(v^1, \ldots, v^m)$ of real numbers satisfying the transformation
law Equation (2.3.6) above. (This is the view point taken
in classical literature.)
4. As any real linear combination of $\frac{\partial}{\partial x_i}\big|_p$ for $1 \le i \le m$. (For
instance, this point of view is to be adopted if we consider only
$C^k$-manifolds for $k < \infty$. The reason for this is the fact that if
we adopt the definition as in (1) above then $T_pM$ can be infinite
dimensional! To have an idea of what goes wrong see the technical
remark (Remark 2.3.2) made earlier in this section.)
We urge the reader to go through this section very thoroughly, as we

have found that the concept of tangent vectors to a manifold is a great
stumbling block for many learners who try to learn from books which
deal with this concept in a more economical and efficient way. Once this
section is mastered half the battle (of learning manifold theory) is won!
Exercise 2.3.8 Let $M$ be a smooth manifold. Let $\mathcal{T}_p$ be the collection
of all triples $(U, \varphi, u)$ where $(U, \varphi)$ is a chart of $M$ containing
the point $p$ and $u \in \mathbb{R}^m$, and $m = \dim M$. Let us define a relation
$\sim$ on this collection by setting $(U, \varphi, u) \sim (V, \psi, v)$ if and only
if $D(\varphi \circ \psi^{-1})(\psi(p))(v) = u$. Show that this is an equivalence relation
and that there is a bijection between the set of equivalence classes and
$T_pM$. (The point to notice is that the naive way of defining tangent
vectors at $p \in M$ is made rigorous in this exercise.)
Exercise 2.3.9 This gives another way of defining tangent vectors at a
point $p$ to $M$. In $C^\infty(p)$ we define an equivalence by setting $(f, U_f) \sim
(g, U_g)$ if and only if $f = g$ on some neighborhood of $p$ contained in $U_f \cap U_g$. Then the equivalence classes
are called germs of smooth functions at $p$. The set of germs at $p$ can
be made into an algebra $\mathcal{E}_p$ over $\mathbb{R}$ in a natural way. Let $\mathcal{F}_p$ be the set
of germs in $\mathcal{E}_p$ which vanish at $p$. (Notice that the phrase "vanish at $p$"
makes sense.) Then $\mathcal{F}_p$ is a maximal ideal in the algebra of germs. Let
$\mathcal{F}_p^2$ be the ideal generated by $[f^2]$ for $[f] \in \mathcal{F}_p$. Then the quotient
$\mathcal{F}_p / \mathcal{F}_p^2$
is a vector space over $\mathbb{R}$ and its dual is canonically isomorphic to $T_pM$.
(This is the approach used in algebraic geometry and is useful at times
when one deals with complex manifolds and their various tangent vectors.
However we shall not deal with complex manifolds in this book.)
2.4 Derivatives of smooth maps

Let $M$ and $N$ be smooth manifolds with $\dim M = m$ and $\dim N = n$. Let
$F : M \to N$ be a smooth map. We introduced tangent spaces so that
we can speak of derivatives of smooth maps. So, the derivative $DF(p)$
at $p$ of $F$ must be a linear map from the tangent space $T_pM$ to $T_q(N)$
where $q := F(p)$. Thus for $v \in T_pM$ we want $DF(p)(v)$ to be a tangent
vector at $q$. We know a tangent vector as soon as we know what it does
to an arbitrary smooth function $g$ defined in a neighborhood of $q$. Recall
that $DF(p)(v)(g) \in \mathbb{R}$. Is there a natural way of defining $DF(p)(v)(g)$
so that it is a real number? Look at Figure 2.4.1.
The vector $v$ operates on smooth functions around $p$ to give a real
number. Is there a natural function $f$ depending on $F$ and $g$ so that
$v$ could operate on $f$? How about $g \circ F$? This suggests that we define
$DF(p)(v)(g) := v(g \circ F)$. We should still check that $DF(p)(v)$ so defined
is a tangent vector at $q$. (Exercise: Check this.) This definition of
derivative $DF(p)$ of $F$ at $p$ is essentially algebraic or functorial.
Figure 2.4.1 Derivative of smooth maps-functorial definition
There is another more geometric definition. Given $v$ we know that
there is a smooth curve $\gamma$ passing through $p$ and having $v$ as its tangent
vector at $p$, that is, $\gamma(0) = p$ and $\gamma'(0) = v$. Then $F \circ \gamma$ is a curve through
$q$. Hence we can set $DF(p)(v) := (F \circ \gamma)'(0)$. Note that since $F \circ \gamma$ is a
curve, $(F \circ \gamma)'(0)$ is certainly a tangent vector at $q$. (See Section 2.3.) In
our first definition of $DF(p)$ we had to prove this. But what is not clear
now is whether $DF(p)$ is well defined in the second approach. That is,
we need to show that $DF(p)(v)$ is independent of the curve $\gamma$ chosen.
(Exercise: show this.) We shall however establish the equality of both
these objects thereby proving two things at the same time:
i) that $DF(p)(v)$ defined the first way is a tangent vector at $q$ and
ii) that $DF(p)(v)$ is well defined in the second approach.
Here we go:
\[
\begin{aligned}
(F \circ \gamma)'(0)(g) &= \frac{d}{dt} \bigl( g \circ (F \circ \gamma) \bigr)(t) \Big|_{t=0} \\
&= \frac{d}{dt} \bigl( (g \circ F) \circ \gamma \bigr)(t) \Big|_{t=0} \\
&= \gamma'(0)(g \circ F) \\
&= v(g \circ F).
\end{aligned}
\]
In the above equations the first member on the left side is $DF(p)(v)(g)$
according to the second definition whereas the last member on the right
side is $DF(p)(v)(g)$ according to the first definition. Hence the two
definitions are the same.
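The agreement of the two definitions can also be observed numerically in the simplest setting $M = N = \mathbb{R}^2$. The sketch below is an illustration only: the map `F`, the test function `g`, the curve `gamma` and the helper names are all assumptions made here, and derivatives in $t$ are approximated by central differences. The first definition is evaluated as $v(g \circ F)$ along the linear curve, the second by first computing the vector $(F \circ \gamma)'(0)$ for a bent curve $\gamma$ and then pairing it with the gradient of $g$ at $q$.

```python
import math


def F(x):
    """A sample smooth map R^2 -> R^2 (chosen only for illustration)."""
    return (x[0] ** 2 - x[1], x[0] * x[1])


def g(y):
    """A sample smooth function R^2 -> R."""
    return math.sin(y[0]) + y[1] ** 2


def grad_g(y):
    """Gradient of g, computed by hand."""
    return (math.cos(y[0]), 2 * y[1])


def gamma(t, p, v):
    """A bent curve with gamma(0) = p and gamma'(0) = v."""
    return (p[0] + t * v[0] + t * t, p[1] + t * v[1] - 3 * t * t)


def derivative_at_zero(h, step=1e-6):
    """Central difference approximation of h'(0) for a real function h."""
    return (h(step) - h(-step)) / (2 * step)


def first_definition(p, v):
    """v(g o F), evaluated along the linear curve t -> p + t v."""
    return derivative_at_zero(
        lambda t: g(F((p[0] + t * v[0], p[1] + t * v[1]))))


def second_definition(p, v):
    """(F o gamma)'(0) applied to g, using the bent curve gamma."""
    w = tuple(derivative_at_zero(lambda t, k=k: F(gamma(t, p, v))[k])
              for k in range(2))
    q = F(p)
    return sum(gk * wk for gk, wk in zip(grad_g(q), w))


if __name__ == "__main__":
    p, v = (1.0, 2.0), (0.5, -1.0)
    print(first_definition(p, v))
    print(second_definition(p, v))
```

For the sample data one can also compute by hand: $DF(p)$ has matrix with rows $(2, -1)$ and $(2, 1)$, so $DF(p)(v) = (2, 0)$, and pairing with $\operatorname{grad} g(q) = (\cos(-1), 4)$ gives $2\cos 1$.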
In our opinion the first definition is useful for theoretical work such
as in proofs whereas the second definition is more useful in any kind of
geometrical setting. This vague remark will be amply illustrated as we
go along.
Figure 2.4.2 Geometric definition of derivatives of smooth maps
Exercise 2.4.1 Verify that $DF(p) : T_pM \to T_{F(p)}N$ is a linear map.

Let us now obtain a coordinate representation for $DF(p)$. Let $(U, \varphi)$
(respectively, $(V, \psi)$) be local charts centered at $p$ (respectively, $q$). Let
the corresponding coordinates be $x_i$ and $y_j$. Since the $\frac{\partial}{\partial x_i}\big|_p$ and the $\frac{\partial}{\partial y_j}\big|_q$ form
a basis for $T_pM$ and $T_q(N)$ respectively, and since $DF(p)$ is linear, we
wish to find a matrix representation of $DF(p)$ with respect to these
bases. Thus, let $v := \frac{\partial}{\partial x_i}\big|_p$. If we set $w := DF(p)(v) \in T_q(N)$, then by
the basis theorem, $w$ can be written as follows:
\[ w = \sum_j w(y_j) \frac{\partial}{\partial y_j}\Big|_q = \sum_j \frac{\partial (y_j \circ F)}{\partial x_i}(p)\, \frac{\partial}{\partial y_j}\Big|_q . \]
Hence the matrix representation (that is, Jacobian matrix) of $F$ with
respect to the local coordinates is
\[ \left( \frac{\partial (y_j \circ F)}{\partial x_i}(p) \right)_{1 \le i \le m,\ 1 \le j \le n} . \]
As a specific example, if we take $M$ to be an open subset of $\mathbb{R}^m$ and
$N = \mathbb{R}^n$, and write $F = (f_1, \ldots, f_n)$ as usual then we have $f_j = y_j \circ F$.
The above matrix becomes $\left( \frac{\partial f_j}{\partial u_i}(p) \right)_{1 \le i \le m,\ 1 \le j \le n}$, the usual Jacobian matrix.
Most often one uses the notation $F_*(p)$ or $F_{*p}$ for what we have so
far denoted by $DF(p)$. If the context is clear we may also write $F_*$
without explicitly mentioning the point $p$. We shall now look at some
illustrative examples which will enhance the reader's understanding of
tangent vectors, derivatives, etc.
Examples 2.4.2
1. Let us take $M = (-\varepsilon, \varepsilon)$ with $\varepsilon > 0$, an open interval in $\mathbb{R}$ endowed
with the usual manifold structure. Let $N$ be any smooth
manifold and $\gamma : (-\varepsilon, \varepsilon) \to N$ be smooth. For $t \in (-\varepsilon, \varepsilon)$, we have
$\gamma_*(t)\bigl(\frac{d}{du}\big|_t\bigr) \in T_{\gamma(t)}N$. Can you guess what it should be? (We were
very deliberate in our choice of notation!) Did you say "It is $\gamma'(t)$"?
If so, well and good. If not, now that you know the answer, why
don't you prove it?
2. Let us now look at the other analogous situation. Let $f : M \to \mathbb{R}$
be smooth. Then $Df(p)(v) \in T_q(\mathbb{R})$, with the obvious notation.
Now, on $\mathbb{R}$, we have a global chart, namely, $u : t \mapsto t$ so that
$T_q\mathbb{R} = \mathbb{R}\, \frac{d}{du}\big|_q$. Hence
\[ Df(p)(v) = a(v)\, \frac{d}{du}\Big|_q \]
for some real number $a$ depending on $v$. Can you guess what this
constant should be?
3. This example is perhaps best understood after Section 2.5. The
setting is as in Example 2.1.11. Let $f : \mathbb{R}^{n+1} \to \mathbb{R}$ be smooth. Let
$S := f^{-1}(q)$. Let $p \in S$. What is $T_pS$? It can be considered as
a subset of $T_p\mathbb{R}^{n+1}$ as follows: The natural inclusion $\iota : S \to \mathbb{R}^{n+1}$
is smooth. (Why?) Also $D\iota(p) : T_pS \to T_p\mathbb{R}^{n+1}$ is one-one.
For, the $\frac{\partial}{\partial x_j}\big|_p$ for $j \ge 2$ span $T_pS$ and $D\iota(p)\bigl(\frac{\partial}{\partial x_j}\big|_p\bigr) = \frac{\partial}{\partial x_j}\big|_p$.
(The reader should try to verify this using the first definition of
derivative of smooth maps.) Thus $T_pS$ is a subspace of $T_p\mathbb{R}^{n+1}$.
Can we identify this subspace using our original function $f$?
Suppose $v \in T_pS$. Let $\gamma$ be a curve that passes through $p$ and is
such that $\gamma'(0) = v$. (Such curves exist by the important exercise
(Exercise 2.3.7).) Since $\gamma(t) \in S$ for all $t \in (-\varepsilon, \varepsilon)$, we have
$f(\gamma(t)) = q$. Thus the smooth function $f \circ \gamma$ is constant on the
connected set $(-\varepsilon, \varepsilon)$ and hence we have
\[ Df(p)(v) = (f \circ \gamma)'(0) = 0. \]
Therefore we see that $T_p(S)$ is contained in the kernel of the linear
map $Df(p)$. Now $Df(p) : T_p\mathbb{R}^{n+1} \to T_q\mathbb{R}$ is a nonzero linear
map. Hence by the rank-nullity theorem, the kernel of $Df(p)$ is
of dimension $n$. Since $\dim T_pS = \dim S = n$, we see that $T_pS$ is
precisely the kernel of the derivative $Df(p)$.
This is the magic formula which will relate our abstract definition of
tangent spaces of concrete spaces such as $S^n$, cylinders etc., with what
we geometrically think of as tangent spaces of these manifolds.
Figure 2.4.3 Tangent space of the cylinder
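Let us start with the sphere $S^n$ described as a level set of the function
$f : \mathbb{R}^{n+1} \to \mathbb{R}$ where $f(x_1, \ldots, x_{n+1}) = \sum_i x_i^2 - 1$. We have for $p \in S^n$,
\[ Df(p)(v) = \sum_i \frac{\partial f}{\partial u_i}(p) \cdot v_i = \langle \operatorname{grad} f(p), v \rangle_{\mathbb{R}^{n+1}} . \]
Since $\operatorname{grad} f(p) = 2(p_1, \ldots, p_{n+1})$ for $p := (p_1, \ldots, p_{n+1})$, we see that the
kernel of $Df(p)$ on $T_p(\mathbb{R}^{n+1}) \cong \mathbb{R}^{n+1}$ via the canonical global coordinates
$u_i$ and hence via the map $\sum_i v_i \frac{\partial}{\partial u_i} \mapsto (v_1, \ldots, v_{n+1})$ is
\[ \{x \in \mathbb{R}^{n+1} : \langle p, x \rangle = 0\}. \]
This is nothing other than the affine tangent space defined by the normal
radial vector $\vec{op}$, translated by $-p$.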
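The description of $T_pS^n$ as $\{x : \langle p, x \rangle = 0\}$ can be checked numerically for $S^2$. The sketch below rests on assumptions of our own: the curve is obtained by radially projecting a straight line onto the sphere, the names are hypothetical, and the velocity is approximated by a central difference. Its velocity at $p$ should be orthogonal to $p$.

```python
import math


def f(x):
    """The defining function of the unit sphere: f(x) = sum of x_i^2 minus 1."""
    return sum(xi * xi for xi in x) - 1.0


def inner(a, b):
    """Standard inner product on R^n."""
    return sum(ai * bi for ai, bi in zip(a, b))


def normalize(x):
    """Radial projection of a nonzero vector onto the unit sphere."""
    norm = math.sqrt(inner(x, x))
    return tuple(xi / norm for xi in x)


def curve(t, p, u):
    """A curve on the sphere through p: the projection of the line t -> p + t u."""
    return normalize(tuple(pi + t * ui for pi, ui in zip(p, u)))


def velocity(p, u, step=1e-6):
    """Central difference approximation of the velocity of the curve at t = 0."""
    plus, minus = curve(step, p, u), curve(-step, p, u)
    return tuple((a - b) / (2 * step) for a, b in zip(plus, minus))


if __name__ == "__main__":
    p = normalize((1.0, 2.0, 2.0))
    u = (0.4, -1.0, 0.7)
    vel = velocity(p, u)
    print(vel)
    print(inner(p, vel))   # should be close to zero
    print(f(curve(0.3, p, u)))  # the curve stays on the level set
```

For a unit vector $p$ the exact velocity of this curve is $u - \langle u, p \rangle p$, the orthogonal projection of $u$ onto the kernel of $Df(p)$.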
Another example is the right circular cylinder $S := f^{-1}(0)$ given by
the map $f : \mathbb{R}^3 \to \mathbb{R}$ by $f(x) := x_1^2 + x_2^2 - 1$. One can give a smooth
atlas for $S$ consisting of two charts $(U, \varphi)$ and $(V, \psi)$, where
\[ \varphi^{-1} : \varphi(U) := (0, 2\pi) \times \mathbb{R} \to S \quad \text{with} \quad \varphi^{-1}(u, v) = (\cos u, \sin u, v) \]
and
\[ \psi^{-1} : \psi(V) := (-\pi, \pi) \times \mathbb{R} \to S \quad \text{with} \quad \psi^{-1}(u, v) = (\cos u, \sin u, v). \]
Then we have $Df(x)(w) = \langle 2(x_1, x_2, 0), w \rangle_{\mathbb{R}^3}$. Hence the kernel of $Df(x)$
is "the real span of the vectors $(-x_2, x_1, 0)$ and $(0, 0, 1)$", that is,
\[ \operatorname{kernel}(Df(x)) = \mathbb{R} \Bigl( -x_2 \frac{\partial}{\partial u_1} + x_1 \frac{\partial}{\partial u_2} \Bigr) + \mathbb{R}\, \frac{\partial}{\partial u_3} . \]
Geometrically if you take the circle parallel to the $u_1 u_2$-plane through
the point $x = (x_1, x_2, 0)$, then the tangent to the circle at $x$ and the line
perpendicular to this plane span the tangent space. The first of these is
nothing but $(-x_2, x_1, 0)$, etc. See Figure 2.4.3.
We now give the answers to Examples 2.4.2 (1) and (2), in case you
have not got them. In the case of (1), let $f \in C^\infty(\gamma(t))$. Then
\[ D\gamma(t)\Bigl( \frac{d}{du}\Big|_t \Bigr)(f) := \frac{d}{du}\Big|_{u=t} (f \circ \gamma)(u) = (f \circ \gamma)'(t) = \gamma'(t)(f). \]
In (2), the only number one could associate with the data is $v(f)$. So, let
us prove that $Df(p)(v) = v(f) \frac{d}{du}\big|_q$. Using the Jacobian representation
or the basis theorem we have $Df(p)(v) = v(u \circ f) \frac{d}{du}\big|_q$. But $u : \mathbb{R} \to \mathbb{R}$
is the identity map $u(t) = t$ and hence the result follows.
Thanks to Example 2.4.2 (2), we can think of $Df(p)$ as the linear
functional $df(p)$ defined on $T_pM$ by $df(p)(v) = v(f)$ for $v \in T_pM$. We
denote by $T_p^*M$ the real dual of the vector space $T_pM$ for any $p \in M$.
Thus $df(p) \in T_p^*M$. Let us now take $f = x_i$, a local coordinate function
of a chart $(U, \varphi)$. Then $dx_i : U \to \bigcup_{p \in U} T_p^*M$ is such that the $dx_i(p)$ form
a basis for $T_p^*M$. This basis is dual to the basis $\frac{\partial}{\partial x_i}\big|_p$ of $T_pM$. (Verify
this.) The reader should also find the transformation rule which the
components of $\omega \in T_p^*M$ obey analogous to the one which those of
$v \in T_pM$ do. If you look up any classical book on tensor analysis you
will find that the rule for $\omega \in T_p^*M$ is the same as that of a covariant
tensor of rank 1.
If $f \in C^\infty(M)$, then we see that the map $df : p \mapsto df(p)$ is from
$M$ to $\bigcup_p T_p^*M$. $df$ is called a smooth differential 1-form. If $(U, \varphi)$ is
a local chart with local coordinates $x_i$, then on $U$ we can write $df$ as
$df(p) = \sum_i g_i(p)\, dx_i(p)$ for some $g_i(p) \in \mathbb{R}$ and for all $p \in U$. Omitting
$p$ we can write this as $df = \sum_i g_i\, dx_i$, for some smooth (?!) functions $g_i$
on $U$. Can you guess what the $g_i$'s are? We believe that you guessed it
right and ask you to prove your guess. This establishes rigorously what
you learnt in advanced calculus about total differentials.
Exercise 2.4.3 Let $\varphi : M \to N$ and $\psi : N \to P$ be smooth maps.
Compute $D(\psi \circ \varphi)(p)$ for $p \in M$.
Exercise 2.4.4 Let $M_i$ be smooth and let $\pi_i : M_1 \times M_2 \to M_i$ be the
canonical projection. Compute $D\pi_i$ for $i = 1, 2$.
Exercise 2.4.5 Now that we have defined the derivatives of a smooth
map $\varphi : M \to N$ as linear maps on the tangent spaces, can you think of
some natural (linear) conditions to impose on $D\varphi(p)$? Can you find some
natural examples that satisfy your conditions? What is the geometric
meaning, consequences, etc., of your definition?
2.5 Immersions and submersions

Let $\varphi : M \to N$ be a smooth map. For all $p \in M$ we have defined a
linear map $D\varphi(p) : T_pM \to T_{\varphi(p)}N$. It is therefore quite natural that
we impose conditions on the linear map $D\varphi(p)$.
Definition 2.5.1 We say that
(i) $\varphi$ is an immersion at $p$ if $D\varphi(p)$ is one-one.
(ii) $\varphi$ is a submersion at $p$ if $D\varphi(p)$ is onto.
(iii) $\varphi$ is a local diffeomorphism at $p$ if $D\varphi(p)$ is one-one and onto.
Notice that the inverse mapping theorem justifies our definition of a
local diffeomorphism: For, we would like to define a map to be a local
diffeomorphism if, for all $p \in M$, there is an open set $U_p$ such that $\varphi$ is
a diffeomorphism of $U_p$ onto $\varphi(U_p)$. (In particular, $\varphi(U_p)$ must be an
open subset of $N$.)
We say that $\varphi$ is an immersion on $M$ if $\varphi$ is an immersion at every
$p \in M$. Similar definitions are made for the other concepts. We now
give some very natural examples.
Example 2.5.2 Let $m \le n$. Let $\iota : \mathbb{R}^m \to \mathbb{R}^n$ be given by
\[ \iota(x_1, \ldots, x_m) := (x_1, \ldots, x_m, 0, \ldots, 0). \]
Then $\iota$ is known as the canonical immersion. Similarly, if $m \ge n$ the
projection $\varphi : \mathbb{R}^m \to \mathbb{R}^n$ given by
\[ \varphi(x_1, \ldots, x_m) := (x_1, \ldots, x_n) \]
is a submersion called a canonical submersion.

We shall presently prove results which will show that locally any
immersion (respectively, submersion) is a canonical immersion (respec-
tively, submersion). More precisely we shall prove the following
Theorem 2.5.3
1) Let $F : M \to N$ be an immersion on a neighborhood of $p \in M$.
Then there exist local charts $(U, \varphi)$ at $p$ and $(V, \psi)$ at $q := F(p)$ in
$N$ such that $\psi \circ F \circ \varphi^{-1}$ is a canonical immersion.
2) If $f : M \to N$ is a submersion at $p \in M$, then we can find local
charts $(U, \varphi)$ of $p$ and $(V, \psi)$ of $q := f(p)$ in $N$ such that $\psi \circ f \circ \varphi^{-1}$
is a canonical submersion.
Before we start attempting a proof of Theorem 2.5.3, we look at some

examples.
Example 2.5.4 If $S$ is a level surface of a smooth function $f$ from an
open set $U$ in $\mathbb{R}^{n+k}$ to $\mathbb{R}^n$, then the identity map from $S$ to $\mathbb{R}^{n+k}$ is an
immersion. In particular, the identity map from the sphere $S^n$ in $\mathbb{R}^{n+1}$
to $\mathbb{R}^{n+1}$ is an immersion.
Example 2.5.5 The quotient map $\varphi : S^n \to \mathbb{P}^n$ (obtained by restricting the
quotient map $\mathbb{R}^{n+1} \setminus \{0\} \to \mathbb{P}^n$) is an immersion as well as a submersion.
Example 2.5.6 The map $\varphi$ or $\psi$ from $(0, 2\pi)$ or from $(-\pi, \pi)$ to $\mathbb{R}^2$
considered in Example 2.1.13 is a one-one immersion.
Example 2.5.7 The map from $\mathbb{R}$ to $S^1$ given by $t \mapsto e^{it}$ is a local
diffeomorphism.
Exercise 2.5.8 Verify all the statements in the above examples.
Proof (Part (1) of Theorem 2.5.3) We choose a coordinate neighborhood
$(U_p, \varphi)$ of $p$ with local coordinates $x$ centered at $p$, that is,
$x_i(p) = 0$ for $1 \le i \le m$. Let $(V, \psi)$ be the corresponding object for $q$,
with $y$ in place of $x$. The derivative $DF$ has $\left( \frac{\partial (y_j \circ F)}{\partial x_i} \right)$ as the Jacobian
with respect to these coordinates. By hypothesis, its rank is $m$. We
set $f_j := y_j \circ F$. We may assume without loss of generality that the
submatrix $\left( \frac{\partial (y_j \circ F)}{\partial x_i} \right)_{1 \le i, j \le m}$ is invertible on a neighborhood $U_1$ of $p$ in
$U_p$.
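Consider the map $\xi : U_1 \to \mathbb{R}^m$ given by $z \mapsto (f_1(z), \ldots, f_m(z))$. Then
$\xi$ is a local diffeomorphism at $p$ by our assumption on the rank of the
Jacobian of $DF$. If we set $s_i := f_i$, then there is a neighborhood $U$ of $p$ in $U_1$ such
that $(U, \xi)$ serves as a coordinate chart around $p$. (This follows from
the inverse mapping theorem.) We also observe that
$$s_i(p) = y_i(F(p)) = y_i(q) = 0 \quad \text{for } 1 \le i \le m.$$
Hence $(U, \xi)$ is centered at $p$. We now want to choose a new set of
coordinates $t_j$ for $1 \le j \le n$ at $q$ such that $t_j = y_j$ for $1 \le j \le m$. So
we set
$$t_j := y_j \ \text{ for } 1 \le j \le m, \qquad t_j := y_j - f_j \circ \xi^{-1}(y_1, \ldots, y_m) \ \text{ for } j > m.$$
Then $r \mapsto (t_1(r), \ldots, t_n(r))$ is a coordinate chart at $q$. For, the
corresponding Jacobian is
$$\left( \frac{\partial t_i}{\partial y_j} \right)_{1 \le i, j \le n} = \begin{pmatrix} I & 0 \\ * & I \end{pmatrix}.$$
Therefore the determinant is nonzero and hence the above map is a local
diffeomorphism around $q$. Let $V$ be a neighborhood of $q$ on which the
above map is a diffeomorphism. Since $t_j \circ F = s_j$ for $1 \le j \le m$ and
$t_j \circ F = 0$ for $j > m$ on $U$, the chart $(V, t)$ is the required
one. This completes the proof of the first part of Theorem 2.5.3.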
Exercise 2.5.9 Work out the proof for $F : \mathbb{R} \to \mathbb{R}^2$ given by $t \mapsto
(\cos t, \sin t)$ at the point $p = 0$. Specify the maximum possible $U$ and $V$
in the notation of Theorem 2.5.3.
Remark 2.5.10 A more conceptual proof of Theorem 2.5.3 is sketched
below:
Using the notation of the proof of Theorem 2.5.3 we choose a basis
$v_i$ of $T_pM$ and $w_j$ of $T_qN$ such that $DF(p)$ is given by $v_i \mapsto w_i$ for
$1 \le i \le m$. Let $G := \psi \circ F \circ \varphi^{-1}$. Then $G$
is the map from $\varphi(U) \subset \mathbb{R}^m$ to $\mathbb{R}^n$ that corresponds to $F$ in terms of
local coordinates. By changing the bases of $\mathbb{R}^m$ and of $\mathbb{R}^n$, we may
assume that $DG(\varphi(p))$ is a canonical immersion. We now define a map
$H : \varphi(U) \times \mathbb{R}^{n-m} \to \mathbb{R}^n$ given by $H(x, z) := G(x) + (0, z)$ where $0 \in \mathbb{R}^m$.
Then it is easy to check that $DH(0) : \mathbb{R}^n \to \mathbb{R}^n$ is the identity. Hence
$H$ is a local diffeomorphism at $0$. The reader can check that $(U, \varphi)$
and $(V, H^{-1} \circ \psi)$, with $V$ shrunk suitably, satisfy the requirements of
Theorem 2.5.3. (Exercise: Check the details of this proof.)
We now prove the second part of Theorem 2.5.3. Here the proof is a
lot easier. Using standard notation, with $f_i := y_i \circ f$, we may assume that the
Jacobian matrix of $f$ at $p$ is such that its first $n \times n$-minor is invertible, that is,
the matrix $\left( \frac{\partial f_i}{\partial x_j}(p) \right)_{1 \le i, j \le n}$ is invertible. Then we consider $\eta$ on $U \subset M$
given by
$$\eta(z) := (f_1(z), \ldots, f_n(z), x_{n+1}(z), \ldots, x_m(z)) \quad \text{for } z \in U.$$
From our hypothesis it follows that $\eta$ is a local diffeomorphism at $p$.
Hence the required new charts are $(W, \eta)$, with $W \subset U$ a neighborhood of $p$ on
which $\eta$ is a diffeomorphism, and $(V, \psi)$.
□
Exercise 2.5.11 Try to write a proof of the second part along the lines
similar to the conceptual proof indicated for immersions.
The single most important topological property of a submersion is
that it is an open map. This follows from the local description of a
submersion as a projection of $\mathbb{R}^m$ to $\mathbb{R}^n$. The following will fasten this
fact on the mind of the reader.
Example 2.5.12 There exists no submersion $f : S^n \to \mathbb{R}^m$ for $m, n \ge 1$.

Notice that this is an improvement of an earlier observation that we
cannot endow $S^n$ with an atlas consisting of a single chart. (Why?)
It is important that the reader realizes that all these local studies are
simple applications of the inverse mapping theorem and they reiterate
what we said about the geometric significance underlying the inverse
mapping theorem. Namely, that the theorem allows us to choose coor-
dinate charts that are expedient for the geometric problem on hand.
Here is a typical example of a local diffeomorphism which is not a
global diffeomorphism:
$$f : \mathbb{R}^2 \to \mathbb{R}^2 \text{ given by } (u, v) \mapsto (e^u \cos v, e^u \sin v).$$
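The two properties claimed for this map can be checked numerically. The following is a minimal sketch in Python (standard library only); the function names and the sample point are illustrative choices, not part of the text. It evaluates the Jacobian determinant, which equals $e^{2u}$ and so never vanishes, and exhibits two distinct points with the same image.

```python
import math

def f(u, v):
    """The map (u, v) -> (e^u cos v, e^u sin v) from the text."""
    return (math.exp(u) * math.cos(v), math.exp(u) * math.sin(v))

def jacobian_det(u, v):
    """Determinant of Df(u, v), using partial derivatives computed by hand."""
    a = math.exp(u) * math.cos(v)    # d f1 / du
    b = -math.exp(u) * math.sin(v)   # d f1 / dv
    c = math.exp(u) * math.sin(v)    # d f2 / du
    d = math.exp(u) * math.cos(v)    # d f2 / dv
    return a * d - b * c             # simplifies to e^(2u), never zero

# Two distinct points that differ by 2*pi in the second coordinate.
p = (0.3, 1.1)
q = (0.3, 1.1 + 2 * math.pi)

det_p = jacobian_det(*p)
same_image = all(
    math.isclose(s, t, abs_tol=1e-12) for s, t in zip(f(*p), f(*q))
)
```

A nonzero determinant at every point gives a local diffeomorphism by the inverse mapping theorem, while `same_image` being true shows that the map is not one-one.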
Did you ever wonder why one never gives an example of a local diffeomorphism
which is not global in $\mathbb{R}$? The following exercise tells you
why.
Example 2.5.13 Any local diffeomorphism from $\mathbb{R}$ to $\mathbb{R}$ is a global
diffeomorphism of $\mathbb{R}$ onto its image.
Example 2.5.14 If $f : S \to M$ is an immersion of $S$ into $M$ and if
$\dim S = \dim M$ what can you say about $f(S)$?
2.6 Submanifolds
If $M$ is a topological space and $S$ is a subspace then there is a natural
way of making $S$ into a topological space. However if $M$ is a smooth
manifold and $S$ is a subset of $M$ there is no natural way in which we
can make $S$ into a submanifold. A correct way to define a submanifold
is as follows:
Definition 2.6.1 A submanifold of a smooth manifold $M$ is a pair $(S, \iota)$,
where $S$ is a smooth manifold and $\iota$ is a smooth map of $S$ into $M$ which
is one-one and an immersion.
Examples of submanifolds
Example 2.6.2 $(\mathbb{R}^m, \iota)$ is a submanifold of $\mathbb{R}^n$ for $m \le n$ where $\iota$ is
the natural inclusion given by
$$\iota(x_1, \ldots, x_m) := (x_1, \ldots, x_m, 0, \ldots, 0).$$
Example 2.6.3 Let $S_1$ (respectively, $S_2$) be the first (respectively, the
second) manifold $\infty$ in Figure 2.1.3, defined as in Example 2.6.2 of Section 2.1.
When $\iota$ is the inclusion of $S_k$ into $\mathbb{R}^2$ we see that $(S_k, \iota)$ is
a submanifold. This is perhaps the most important example of a submanifold,
one which will enable the reader to appreciate the finicky definition
above, as we shall see below.
Example 2.6.4 All the level surfaces we constructed in Chapter 2
are submanifolds of the corresponding ambient manifolds. For example,
$O(n)$ is a submanifold of $GL(n, \mathbb{R})$.
Example 2.6.5 If $M$ is a manifold and if $S$ is an open subset of $M$
then with $\iota$ as the natural inclusion of $S$ into $M$, we see that $(S, \iota)$ is a
submanifold of $M$.
Exercise 2.6.6 What are all the open submanifolds of a given manifold
$M$?
Now let $(S, \iota)$ be a submanifold of a manifold $M$. If we consider the
subspace topology on $\iota(S)$ induced from $M$ and if we pull it to $S$, is it
finer or coarser than the original topology on $S$? Recall that $\iota$ is assumed
to be smooth from $S$ into $M$. Hence $\iota$ is continuous and so the original
topology on $S$ is finer than the topology induced from the subspace
topology on $\iota(S)$. It may happen that the original topology on $S$ is
strictly finer than the subspace topology. For instance in Example 2.6.3
above any basic neighborhood of the point $x = 0$ (or $\pi$) is of the form
$$(-\varepsilon, \varepsilon) \cup (-\pi, -\pi + \delta) \cup (\pi - \delta, \pi)$$
in the subspace topology. However $S$ has a neighborhood of $x$ of the
form $(-\varepsilon, \varepsilon)$ which is not an open set in the subspace topology.
Now how does a submanifold $S$ look locally in the ambient manifold
$M$? Recall that from our study of the local structure of an immersion
we know that given a point $p$ in $S$ there exists a neighborhood $U$ of $p$ in $S$
and a neighborhood $V$ of $\iota(p)$ in $M$ with local coordinates $x$ and $y$ respectively
such that the map $\iota$ with respect to these charts is of the form
$$(x_1, \ldots, x_m) \mapsto (x_1, \ldots, x_m, 0, \ldots, 0).$$
This means that
$$\iota(U) = \{q \in V : y_i(q) = 0 \text{ for } i \ge m + 1\}. \qquad (2.6.1)$$
However notice that this does not say that $\iota(U) = \iota(S) \cap V$. For instance, in
Example 2.6.3 above, (what else?) one can check that such a possibility
arises.
The neighbourhoods $U$ and $V$ as above will be referred to as adapted
neighbourhoods.
Before we proceed we shall introduce some definitions.
Definition 2.6.7 We say that a smooth map $\varphi : M \to N$ is an imbedding
if $\varphi$ is a one-one immersion on $M$. An imbedding is said to be regular
if it is a homeomorphism of $M$ onto its image. This is equivalent to stipulating
that the two topologies on $M$ coincide. Thus a submanifold is
a pair $(M, \iota)$ where $M$ is a smooth manifold and $\iota$ is an imbedding. A
submanifold is said to be regular if the imbedding $\iota$ is regular.
Exercise 2.6.8 Which of the submanifolds above are regular?
We now want to point out that we already know a way of constructing
a lot of regular submanifolds. The idea behind the construction of level
sets is what we are going to formalize. Let $\varphi : M \to N$ be a smooth
map. We remarked earlier that in general the set $\varphi^{-1}(q)$ for $q \in N$ is
not a manifold. If we look at our earlier examples carefully we arrive at
the following condition. We say that a point $q$ in $N$ is a regular value if
the map $\varphi$ is a submersion at all points $p$ in $M$ with $\varphi(p) = q$.
Proposition 2.6.9 Let $S = \varphi^{-1}(q)$ for a regular value $q$ of $\varphi$. Then
$(S, \iota)$ is a regular submanifold of $M$. Here $\iota$ is the natural inclusion of
$S$ into $M$.
Proof This is an easy consequence of our local study of a submersion.
Let $p \in S$. Choose local charts $(U_p, x)$ and $(V, y)$ centered at $p$ and $q$
respectively so that $\varphi$ is a canonical submersion on $U_p$. Then we have
$$S \cap U_p = \{z \in U_p : x_i(z) = 0 \text{ for } 1 \le i \le n\}.$$
We set $\psi$ as the map
$$z \mapsto (x_{n+1}(z), \ldots, x_m(z))$$
on $S \cap U_p$. If we take the family of all such $(S \cap U_p, \psi)$ as $p$ varies on $S$
we get a smooth atlas on $S$. (Check this.) It is clear that the topology
on $S$ is the subspace topology from $M$. Thus, $S$ is a regular submanifold
of $M$, with $\dim S = m - n$.
□
Remark 2.6.10 There is a theorem due to Sard which says that if
$f : M \to N$ is a smooth map, then the set of regular values is a
dense subset of $N$. In fact, the theorem is much more informative than
this. To give this version we need some terminology. Since we shall
have no occasion to use this we shall refrain from elaborating on this
formulation. This result is very basic in Differential Topology. The
interested reader may consult any book on Differential Topology.
Let now $\varphi : M \to N$ be smooth and $(S = \varphi^{-1}(q), \iota)$ a (regular)
submanifold of $M$. We now want to identify $T_pS$ as a vector subspace
of $T_pM$ and characterize this subspace via the map $\varphi$. First notice that
this expectation is reasonable. For, if $v \in T_pS$ and $\gamma$ is a smooth curve
through the point $p$ in $S$ with $\gamma'(0) = v$, then $\iota \circ \gamma$ is a smooth curve through $p$ in $M$.
Hence the mapping $D\iota(p) : \gamma'(0) \mapsto (\iota \circ \gamma)'(0)$ is the one which helps
us identify $T_pS$ with $D\iota(p)(T_pS) \subset T_pM$. (Observe that by hypothesis
$D\iota(p)$ is one-one on $T_pS$.) We henceforth identify $T_pS$ with $D\iota(p)(T_pS)$.
Now what is $D\varphi(p)(v)$ for $v \in T_pS$? Since $\varphi$ is constant on $S$, we have
$D\varphi(p)(v) = (\varphi \circ \gamma)'(0) = 0$. Thus $T_pS$ is contained in the kernel of
$D\varphi(p)$ in $T_pM$. Since $D\varphi(p)$ is a surjection, the dimension of the kernel of
$D\varphi(p)$ is $m - n$ and since $\dim(T_pS) = m - n$, it follows that
$$T_pS = \text{kernel of } D\varphi(p) \subset T_pM.$$

This is the magic formula which relates our abstract definition of
tangent spaces of everyday objects such as the spheres, cylinders, etc.,
with what we geometrically think of as tangent spaces of these manifolds.
Let us look at the sphere $S^n$, for example. It is the level set $f^{-1}(0)$
corresponding to the function $f : \mathbb{R}^{n+1} \to \mathbb{R}$ given by $f(x) = \sum x_i^2 - 1$.
If we identify $T_x\mathbb{R}^{n+1}$ with $\mathbb{R}^{n+1}$, as we usually do, then
$$\text{kernel } Df(x) = \{y \in T_x(\mathbb{R}^{n+1}) \cong \mathbb{R}^{n+1} : 2\langle x, y \rangle = 0\}.$$
Recall that $Df(x)(y) = \langle \operatorname{grad} f(x), y \rangle = 2\langle x, y \rangle$. Thus the tangent space
$T_xS^n$ is the space of vectors orthogonal to the normal $\operatorname{grad} f(x) = 2x$.
If we translate this vector subspace by $x$ we get the subset
$$x + T_xS^n = \{x + y : y \in \mathbb{R}^{n+1} \text{ and } \langle y, x \rangle = 0\}.$$
This latter subset is what we geometrically think of as the tangent space
to $S^n$ and is usually referred to as the affine tangent space of $S^n$ at $x$.
Exercise 2.6.11 Do a similar analysis for the cylinder $g^{-1}(0)$, where
$g : \mathbb{R}^3 \to \mathbb{R}$ is given by $g(x) = x_1^2 + x_2^2 - 1$.
A more interesting class of examples is that of Lie groups in $GL(n, \mathbb{R})$.
Since $GL(n, \mathbb{R})$ is an open subset of $M(n, \mathbb{R}) \cong \mathbb{R}^{n^2}$, we can identify, as
usual, $T_A(GL(n, \mathbb{R}))$ with $M(n, \mathbb{R})$. Now, for instance we can consider
$\iota : SL(n, \mathbb{R}) \hookrightarrow GL(n, \mathbb{R})$ as the regular submanifold of $GL(n, \mathbb{R})$
corresponding to the map $\varphi : GL(n, \mathbb{R}) \to \mathbb{R}$, where $\varphi(X) := \det X$ and
$SL(n, \mathbb{R}) = \varphi^{-1}(1)$. Recall that we have shown that
$$D\varphi(A)(X) = \det A \, \operatorname{tr}(A^{-1}X).$$
Using the formula for the tangent space of a level set manifold we get
$$T_A(SL(n, \mathbb{R})) = \{X \in M(n, \mathbb{R}) \cong T_A(GL(n, \mathbb{R})) : \operatorname{tr}(A^{-1}X) = 0\}$$
since $\det A = 1 \ne 0$. In particular
$$T_I(SL(n, \mathbb{R})) = \{X \in M(n, \mathbb{R}) : \operatorname{tr}(X) = 0\}.$$
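The derivative formula for the determinant, and the resulting description of $T_A(SL(n, \mathbb{R}))$, can be sanity-checked numerically for $n = 2$. The sketch below is in plain Python; all helper names and the sample matrices are illustrative assumptions, not part of the text. It compares a central-difference approximation of $D\varphi(A)(X)$ with $\det A \, \operatorname{tr}(A^{-1}X)$, and checks that a direction of the form $AZ$ with $\operatorname{tr} Z = 0$ is killed by $D\varphi(A)$.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    """Inverse of an invertible 2x2 matrix."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def matmul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace2(m):
    return m[0][0] + m[1][1]

def shifted(a, t, x):
    """Return the matrix a + t*x."""
    return [[a[i][j] + t * x[i][j] for j in range(2)] for i in range(2)]

def ddet(a, x, h=1e-6):
    """Central-difference approximation of D(det)(a)(x)."""
    return (det2(shifted(a, h, x)) - det2(shifted(a, -h, x))) / (2 * h)

A = [[2.0, 1.0], [1.0, 1.0]]   # det A = 1, so A lies in SL(2, R)
X = [[1.0, 2.0], [3.0, 4.0]]   # an arbitrary direction in M(2, R)

numeric = ddet(A, X)
formula = det2(A) * trace2(matmul2(inv2(A), X))

Z = [[1.0, 0.0], [0.0, -1.0]]  # trace zero
T = matmul2(A, Z)              # hence tr(A^{-1} T) = tr(Z) = 0
tangent_derivative = ddet(A, T)
```

Here `numeric` and `formula` should agree up to rounding, and `tangent_derivative` should be numerically zero, consistent with $T = AZ$ lying in the kernel of $D\varphi(A)$.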

Exercise 2.6.12 Let $O(n)$ denote the subgroup of all orthogonal matrices
in $GL(n, \mathbb{R})$. Let $U(n)$ denote the set of all unitary matrices in
$GL(n, \mathbb{C})$, the set of all $n \times n$ invertible matrices with complex entries.
Let $SU(n)$ denote the subset of elements in $U(n)$ with determinant 1.
Compute $T_I(G)$, where $G = O(n), U(n), SU(n)$. Answer: They are
respectively the set of $n \times n$ skew-symmetric matrices, the set of all $n \times n$
skew-hermitian matrices and the set of all $n \times n$ skew-hermitian matrices
with trace 0. The tangent spaces at the identity of $SL(n, \mathbb{R})$,
$O(n, \mathbb{R})$, $U(n)$ and $SU(n)$ are denoted respectively by $\mathfrak{sl}(n, \mathbb{R})$, $\mathfrak{o}(n, \mathbb{R})$,
$\mathfrak{u}(n)$ and $\mathfrak{su}(n)$.
(Recall that we have constructed these $G$ as regular submanifolds of
$GL(n, \mathbb{R})$ for some $n$.)
We now would like to point out the geometric meaning of the technique
of Lagrange Multiplier in extremal problems with constraints. The
connecting link is the above formula. We formulate it as
Theorem 2.6.13 Let $f : \Omega \subset \mathbb{R}^{n+1} \to \mathbb{R}$ be smooth. Assume that 0
is a regular value of $f$ so that $S = f^{-1}(0)$ is a regular submanifold of
$\mathbb{R}^{n+1}$. Let $g : \Omega \to \mathbb{R}$ be a smooth function. Assume further that $p \in S$
is an extreme point of $g$ on $S$. Then there exists $\lambda \in \mathbb{R}$ such that
$\operatorname{grad} g(p) = \lambda \operatorname{grad} f(p)$.
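Proof We know from the formula for $T_pS$ that
$$T_p\mathbb{R}^{n+1} \cong \mathbb{R}^{n+1} = T_pS \oplus \mathbb{R}\,\nabla f(p).$$
Here $\nabla f(p)$ is the vector given by
$$\langle \nabla f(p), w \rangle = \operatorname{grad} f(p)(w) = f'(p)(w)$$
for $w \in \mathbb{R}^{n+1}$. Now if $p$ is an extreme point of $g$ on $S$ and if $v \in T_pS$ then
there exists $\gamma : (-\varepsilon, \varepsilon) \to S$ such that $\gamma(0) = p$ and $\gamma'(0) = v$.
Now $0$ is an extreme point of $g \circ \gamma$ on $(-\varepsilon, \varepsilon)$. Hence we have
$$0 = (g \circ \gamma)'(0) = \operatorname{grad} g(\gamma(0)) \cdot \gamma'(0) = \langle \nabla g(p), v \rangle.$$
Since this is true for all $v \in T_pS$, the vector $\nabla g(p)$ is orthogonal to $T_pS$, and
it follows that there exists $\lambda \in \mathbb{R}$ such that $\nabla g(p) = \lambda \nabla f(p)$.
□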
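As an illustration of the theorem, here is a small numerical sketch in Python (standard library only). The constraint $f(x, y) = x^2 + y^2 - 1$, the objective $g(x, y) = x + y$, and all names below are illustrative assumptions chosen for this example. The code locates the maximum of $g$ on the unit circle by sampling and then checks that $\operatorname{grad} g$ and $\operatorname{grad} f$ are parallel there.

```python
import math

def grad_f(x, y):
    """Gradient of the constraint f(x, y) = x^2 + y^2 - 1."""
    return (2 * x, 2 * y)

def grad_g(x, y):
    """Gradient of the objective g(x, y) = x + y."""
    return (1.0, 1.0)

# Locate the maximum of g on the circle S = f^{-1}(0) by brute-force sampling.
N = 100000
angles = (2 * math.pi * k / N for k in range(N))
t_max = max(angles, key=lambda t: math.cos(t) + math.sin(t))
p = (math.cos(t_max), math.sin(t_max))

gf, gg = grad_f(*p), grad_g(*p)

# grad g(p) is parallel to grad f(p) exactly when this 2x2 determinant vanishes.
cross = gg[0] * gf[1] - gg[1] * gf[0]

# Least-squares estimate of the multiplier in grad g(p) = lam * grad f(p).
lam = (gg[0] * gf[0] + gg[1] * gf[1]) / (gf[0] ** 2 + gf[1] ** 2)
```

In this example the maximum sits at $p = (1/\sqrt{2}, 1/\sqrt{2})$, so one expects `cross` to be numerically zero and `lam` to be close to $1/\sqrt{2}$.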
Remark 2.6.14 This proof is decidedly simpler than the one in Section
1.4.1 thanks to the machinery!
Exercise 2.6.15 Formulate an analogue of the above theorem for points
of extrema of a function $g$ on a submanifold $S$ defined by
$$S := \{x \in \mathbb{R}^{n+k} : f_i(x) = 0, \ 1 \le i \le k\}.$$
Can you think of necessary and sufficient conditions on the functions
$f_i : \mathbb{R}^{n+k} \to \mathbb{R}$ which will ensure that $S$ is a submanifold? Hint: $\nabla g(p) =
\sum_{i=1}^{k} \lambda_i \nabla f_i(p)$.
We now discuss the behaviour of submanifolds with respect to smooth
maps. Let $\iota : S \to M$ be any (not necessarily regular) submanifold of
$M$. If $N$ is a smooth manifold and if $\varphi : N \to S$ is smooth, then
$\iota \circ \varphi : N \to M$ is smooth. In a similar way if $\psi : M \to N$ is smooth and
if we restrict $\psi$ to $S$, that is, if we consider $\psi \circ \iota$, then it is smooth.
Now comes the trouble. Consider a situation in which $\varphi : N \to M$ is
smooth and $\varphi(N) \subset S$. Then we have a set-theoretic map $\varphi : N \to S$.
Is this smooth? Not always! In fact it need not even be continuous.
For example, (what, once again!) if we consider $\varphi = \iota$, the inclusion
of the manifold $N := \infty_1$ into $M = \mathbb{R}^2$, then $\varphi(N) = \infty_2$ (and also
$= \infty_1$). Thus we get a map $\varphi : \infty_1 \to \infty_2$. We know from our earlier
encounters with these manifolds that this map $\varphi$ is not continuous. The
heart-warming fact is that this is all that can go wrong in this situation.
More precisely we have
Theorem 2.6.16 Let $\varphi : N \to M$ be smooth and $\iota : S \hookrightarrow M$ be a
submanifold. Assume that $\varphi(N) \subset \iota(S)$. Then $\varphi : N \to S$ is smooth if and
only if $\varphi$ is continuous.
Proof It is enough to show that for any $q \in S$ and local coordinates
$y_i$ around $q$ the functions $y_i \circ \varphi$ are smooth. Let $V$ be a coordinate
neighborhood of $\varphi(p) = q$ in $S$ such that $V$ is given by
$$V = \{z \in U : x_{s+1}(z) = \cdots = x_m(z) = 0\}$$
for $(U, x)$ a local chart of $M$. This is possible by (2.6.1). Since $\varphi$ is a
continuous map of $N$ into $S$ the inverse image $\varphi^{-1}(V)$ is an open set in
$N$. Hence there exists a local chart $(W, y)$ of $p$ in $N$ with $W \subset \varphi^{-1}(V)$ such
that the map $\varphi : W \to U$ is smooth. This implies that the functions $x_i \circ \varphi : W \to \mathbb{R}$
are smooth for $1 \le i \le m$. In particular, the functions $x_i \circ \varphi : W \to \mathbb{R}$
are smooth for $1 \le i \le s$. But these are the local coordinates for $S$
around $q$. Hence $\varphi$ is smooth. This completes the proof of the theorem.
□
Isn't this proof amazing? We used the continuity of $\varphi$ only to get
hold of a neighborhood of $p$ in $N$. Notice that in the notation of
Theorem 2.6.16, $\varphi$ is continuous if the submanifold is regular. (Check.) We
now want to apply this result to conclude that the regular submanifolds
$O(n)$, $U(n)$, $SU(n)$, $SL(n, \mathbb{R})$, etc., of $GL(n, \mathbb{R})$ are, in fact, Lie groups.
Let us fix one of them, say, $O(n)$. Then the product manifold $O(n) \times
O(n)$ is a regular submanifold of $GL(n, \mathbb{R}) \times GL(n, \mathbb{R})$ and the map
$$\alpha : (x, y) \mapsto xy^{-1} \text{ of } GL(n, \mathbb{R}) \times GL(n, \mathbb{R}) \to GL(n, \mathbb{R})$$
sends $O(n) \times O(n)$ to $O(n)$. From what we said earlier it is clear that the
restriction of $\alpha$ to $O(n) \times O(n)$ is a smooth map into $GL(n, \mathbb{R})$. Also,
$\alpha$ is smooth from $O(n) \times O(n)$ to $O(n)$ as follows from Theorem 2.6.16.
In other words, $O(n)$ is a Lie group, called the orthogonal group. The
reader should observe that this was mentioned earlier and, more significantly,
that this proof could have been given there. In the same way, the
other submanifolds such as $SL(n, \mathbb{R})$, $U(n)$ are all Lie groups.
Remark 2.6.17 Usually, this is derived as a consequence of Cartan's

Theorem 2.10.2. However, our proof is elementary.
Remark 2.6.18 Before we end this section, we want to state a result
due to Whitney. Given any smooth manifold $M$ of dimension $n$, there
exists a regular embedding of $M$ into $\mathbb{R}^{2n+1}$. Thus any manifold is a
regular submanifold of $\mathbb{R}^N$ for a suitable $N$. This result is known as
Whitney's embedding theorem. The interested reader can refer to any
book on Differential Topology.
2.7 Vector fields

Definition 2.7.1 Let $M$ be a smooth manifold. A smooth vector field
$X$ on $M$ is a smooth assignment, $p \mapsto X_p \in T_pM$, of tangent vectors to
points of $M$.
We should think of X as giving directions Xp at points p. See Fig-

ure 2.7.1.
When is $p \mapsto X_p \in T_pM$ smooth? Take any coordinate neighbourhood
$U$ of $p$ with local coordinates $x_i$. Then we can write
$$X_q = \sum_i f_i(q) \left. \frac{\partial}{\partial x_i} \right|_q$$
for $q \in U$. We say $p \mapsto X_p$ is smooth if the coefficients $f_i$ are smooth
with respect to any local chart in the $C^\infty$-atlas of $M$.
Hereafter, by a vector field we will always mean a smooth vector
field.
Figure 2.7.1 Vector field on a manifold
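Example 2.7.2 Let $M = \mathbb{R}^n$. Let $\frac{\partial}{\partial x_i} = X_i : p \mapsto \left. \frac{\partial}{\partial x_i} \right|_p$. Then $X_i$ is
a smooth vector field on $\mathbb{R}^n$. In fact, if $X$ is a vector field on $\mathbb{R}^n$, then
$X = \sum f_i X_i$ with $f_i \in C^\infty(\mathbb{R}^n)$. (Prove this.)
Example 2.7.3 Let $M = \mathbb{R}^2 \setminus \{(x, 0) : x \ge 0\}$. Let $X = \frac{\partial}{\partial r}$ and $Y =
\frac{\partial}{\partial \theta}$, where $(r, \theta)$ are the coordinate functions on $M$ with respect to the
chart $(U, \varphi)$ with $U = M$ and where $\varphi^{-1} : (0, \infty) \times (0, 2\pi) \to M$ is given by
$\varphi^{-1}(r, \theta) = (r \cos\theta, r \sin\theta)$. (Prove that this is a $C^\infty$-atlas compatible with
the usual $C^\infty$-atlas on the open set $M \subset \mathbb{R}^2$.) Why cannot $M' = \mathbb{R}^2 \setminus 0$
be considered?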
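Let us see how these two vector fields give directions at various points
of $M$. For brevity, we set $\varphi^{-1}(r, \theta) = re^{i\theta}$. Now let $p = re^{i\theta} \in M$. We
first express $\left. \frac{\partial}{\partial r} \right|_{re^{i\theta}}$ in terms of the usual coordinates:
$$\left. \frac{\partial}{\partial r} \right|_{re^{i\theta}} = \cos\theta \, \frac{\partial}{\partial x} + \sin\theta \, \frac{\partial}{\partial y}.$$
This means that at the point $p = re^{i\theta}$, $\left. \frac{\partial}{\partial r} \right|_{re^{i\theta}}$ is the unit vector in the
radial direction at $p$. Proceeding in a similar way we find that
$$\left. \frac{\partial}{\partial \theta} \right|_{re^{i\theta}} = -r \sin\theta \, \frac{\partial}{\partial x} + r \cos\theta \, \frac{\partial}{\partial y}$$
at $re^{i\theta}$. That is, $\left. \frac{\partial}{\partial \theta} \right|_{re^{i\theta}}$ is the tangent vector, of magnitude $r$, to the
circle (minus a point) through $p$. Look at Figure 2.7.2.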
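The Cartesian components of $\partial/\partial r$ and $\partial/\partial\theta$ can be checked by differentiating the inverse chart numerically. The following is a minimal Python sketch (standard library only); the function names, step size and sample point are illustrative assumptions. It differentiates $\varphi^{-1}(r, \theta) = (r\cos\theta, r\sin\theta)$ along each coordinate by central differences.

```python
import math

def chart_inverse(r, theta):
    """The inverse chart (r, theta) -> (r cos theta, r sin theta)."""
    return (r * math.cos(theta), r * math.sin(theta))

def d_dr(r, theta, h=1e-6):
    """Cartesian components of d/dr at the point r e^{i theta}."""
    a = chart_inverse(r + h, theta)
    b = chart_inverse(r - h, theta)
    return tuple((u - v) / (2 * h) for u, v in zip(a, b))

def d_dtheta(r, theta, h=1e-6):
    """Cartesian components of d/dtheta at the point r e^{i theta}."""
    a = chart_inverse(r, theta + h)
    b = chart_inverse(r, theta - h)
    return tuple((u - v) / (2 * h) for u, v in zip(a, b))

r, theta = 2.0, 0.7
radial = d_dr(r, theta)        # expected: (cos theta, sin theta)
angular = d_dtheta(r, theta)   # expected: (-r sin theta, r cos theta)

radial_norm = math.hypot(*radial)    # expected to be close to 1
angular_norm = math.hypot(*angular)  # expected to be close to r
```

The computed norms reflect the statements in the text: the radial field has unit length, while the angular field has length $r$.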

Did you notice something? As you follow the directions starting from
a point you traverse a path. For example, in the first case, it is a radial
line through the point $p$. (Notice that the origin is not in the path.) In the
second case the path is a circle. In either case it is the coordinate curve.
What was the path corresponding to $X_i$ of Example 2.7.2 above?
Figure 2.7.2 Rotational vector field
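Example 2.7.4 Let $M = S^n \subset \mathbb{R}^{n+1}$ with $n = 2k - 1$, $k \ge 1$. We
consider $T_p(S^n) \subset T_p(\mathbb{R}^{n+1})$ as explained earlier. For $p = (x_1, \ldots, x_{2k})$, let
$$X_p := (-x_2, x_1, -x_4, x_3, \ldots, -x_{2k}, x_{2k-1}).$$
Then $X_p \in T_p(S^n)$ for all $p \in S^n$. (Why?) It is nowhere zero on $S^n$.
Thus $X$ is a nowhere vanishing vector field on the odd dimensional sphere
$S^n = S^{2k-1}$.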
Can you find a nowhere vanishing vector field on even dimensional

spheres S2k, say, on S2? It is not possible. This is a very deep result
in algebraic topology but now there is a simple proof due to J. Milnor.
This proof is developed as a series of lemmas at the end of this section.
This proof uses only advanced calculus and is accessible to everybody.
Now if $M$ is an arbitrary smooth manifold, does there exist a non-zero
vector field? (Notice carefully: we do not ask for a nowhere zero
vector field.) If $(U, x)$ is a chart on $M$, then we have the coordinate
vector fields $X_i = \frac{\partial}{\partial x_i}$. These are obviously non-zero. (Why?) However
they need not extend to the whole manifold $M$. Do you remember how
we manufactured a lot of smooth functions on $M$? We use the same
trick here too. Choose a non-zero smooth function $f$ with (compact) support in
$U$. Consider $fX_i$. It is a non-zero vector field on $M$. Let me point out a
couple of fall-outs of this construction. The first one is that if $v \in T_p(M)$
is given then there exist many vector fields $X$ such that $X(p) (= X_p) = v$.
For, if $p \in U$, a coordinate chart, so that $v = \sum v(x_i) \left. \frac{\partial}{\partial x_i} \right|_p$, then we can
take $X = \sum v(x_i) \frac{\partial}{\partial x_i}$ as a vector field on $U$ and extend it to $M$ using
smooth functions with compact support in $U$. The resulting vector field
is a required one. That there is no uniqueness should be clear. The
second observation is that in general $\mathfrak{X}(M)$ is infinite dimensional. It is
so precisely when $M$ is an infinite set. (Check all these claims.)
Exercise 2.7.5 Let $M = \mathbb{R}^2$, $X = y \frac{\partial}{\partial x}$, and $Y = x \frac{\partial}{\partial y}$. Carry out what
we did in Example 2.7.3 above for these vector fields, that is, plot the
"directions" at various points and try to find out the path.
Let $C^\infty(M)$ be the set of all $C^\infty$ functions on $M$. Then $C^\infty(M)$ is
an algebra over $\mathbb{R}$ having the constant function 1 as its multiplicative
identity and 0 as its "zero". Let $\mathfrak{X}(M)$ be the set of all smooth vector
fields on $M$. Then $\mathfrak{X}(M)$ is a vector space over $\mathbb{R}$: If $X, Y \in \mathfrak{X}(M)$, then
$$(X + Y)(p) := X(p) + Y(p) := X_p + Y_p \in T_p(M)$$
and for $a \in \mathbb{R}$, we have $(a \cdot X)(p) = aX_p \in T_p(M)$. Clearly $X + Y$,
$aX \in \mathfrak{X}(M)$. In fact, for $X \in \mathfrak{X}(M)$, $f \in C^\infty(M)$ we can define $fX$ as
follows:
$$(fX)(p) = f(p)X_p \in T_p(M).$$
Thus $fX \in \mathfrak{X}(M)$. Check that $fX$ is smooth. Thus $\mathfrak{X}(M)$ becomes a
module over the ring $C^\infty(M)$. (See Section 3.1 for the definition of a
module.)
Note that there exists a map $X : C^\infty(M) \to C^\infty(M)$ associated to
any $X \in \mathfrak{X}(M)$ as follows: $(Xf)(p) = X_p(f)$. Check that:
(1) $Xf \in C^\infty(M)$,
(2) $X(af + bg) = aX(f) + bX(g)$ for $a, b \in \mathbb{R}$, and
(3) $X(fg) = X(f) \cdot g + f \cdot X(g)$. That is, $X$ is a derivation of the
algebra $C^\infty(M)$.
The properties (1), (2) and (3) above characterize vector fields; that is,
if $D$ is a map from $C^\infty(M)$ to $C^\infty(M)$ satisfying (1), (2) and (3) then
there exists a unique vector field $X$ on $M$ such that $D = X$. While the
proof of this is not difficult, it is somewhat technical. One should use
the fact that such a $D$ induces $D_U$ on any open set $U \subset M$ and the
existence of $C_c^\infty$ functions on $M$. We shall give a proof of this fact later
in this section. In the following, as is the general practice, we shall not
distinguish between these two definitions of a vector field. We shall also
write $Xf$ for $X(f)$ in the sequel.
There is a very interesting algebraic operation on $\mathfrak{X}(M)$, called the
Lie bracket. It is defined as follows:
Definition 2.7.6 For all $X, Y \in \mathfrak{X}(M)$, let $[X, Y]$ be defined as follows:
$$[X, Y](p)(f) = X_p(Yf) - Y_p(Xf) \quad \text{for all } p \in M \text{ and } f \in C^\infty(M).$$
Notice that on the right $Yf \in C^\infty(M)$ so that $X_p(Yf) \in \mathbb{R}$, etc.
One can easily check that
(1) $[X, Y](p) \in T_p(M)$ and
(2) $p \mapsto [X, Y](p)$ is a smooth vector field on $M$.
Notice that we could have used the equivalent definition of a vector field
as a derivation of $C^\infty(M)$ to define the bracket operation:
$$[X, Y]f := X(Y(f)) - Y(X(f)) \quad \text{for } f \in C^\infty(M).$$
We thus have a linear map $L_X : \mathfrak{X}(M) \to \mathfrak{X}(M)$ for all $X \in \mathfrak{X}(M)$
defined by $L_X(Y) = [X, Y]$. $L_X(Y)$ is read as the Lie derivative of
$Y$ with respect to $X$. There is a geometric interpretation of $[X, Y]$ in
Theorem 3.5.2 on page 200. We now look at some examples of the
bracket operation.
Example 2.7.7 Let $(U, \varphi)$ be a local chart with $x_i$ as local coordinates.
If we take $X_i = \frac{\partial}{\partial x_i}$ we have $[X_i, X_j] = 0$ on $U$. (Check this.) As a
specific example on $\mathbb{R}^2 \setminus \{(x, 0) \in \mathbb{R}^2 : x \ge 0\}$ take $X_1 = \frac{\partial}{\partial r}$ and
$X_2 = \frac{\partial}{\partial \theta}$. Verify that $[X_1, X_2] = 0$.
A converse of this is also true: If $X_i$ are vector fields such that
$X_i(p)$, for $1 \le i \le r$, span an $r$-dimensional subspace of $T_p(M)$ at some
point $p \in M$ and if $[X_i, X_j] = 0$, $1 \le i, j \le r$, on a neighbourhood of $p$,
then there exists a local chart $(U, x)$ around $p$ such that $X_i = \frac{\partial}{\partial x_i}$ for
$1 \le i \le r$ on $U$. This is a weak form of the Frobenius theorem on complete
integrability. See Theorem 2.9.4 on page 138.
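Let $M = \mathbb{R}^2$, $X = x \frac{\partial}{\partial y}$ and $Y = y \frac{\partial}{\partial y}$. Then
$$\begin{aligned}
[X, Y](f)(p) &= X_p\left( y \frac{\partial f}{\partial y} \right) - Y_p\left( x \frac{\partial f}{\partial y} \right) \\
&= \left. \left[ x \frac{\partial}{\partial y}\left( y \frac{\partial f}{\partial y} \right) - y \frac{\partial}{\partial y}\left( x \frac{\partial f}{\partial y} \right) \right] \right|_p \\
&= \left. x \frac{\partial f}{\partial y} \right|_p = X_p f.
\end{aligned}$$
Thus $[X, Y] = X$. One can work this out in a more compact form as
follows, ignoring "second order terms":
$$\begin{aligned}
[X, Y] &= x \frac{\partial}{\partial y}\left( y \frac{\partial}{\partial y} \right) - y \frac{\partial}{\partial y}\left( x \frac{\partial}{\partial y} \right) \\
&= x \frac{\partial (y)}{\partial y} \frac{\partial}{\partial y} - y \frac{\partial (x)}{\partial y} \frac{\partial}{\partial y} \\
&= x \frac{\partial}{\partial y}.
\end{aligned}$$
We shall often use this short form of the computation of the bracket.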
Since $C^\infty(M)$ acts on $\mathfrak{X}(M)$ we want to find out how the bracket
behaves under this action. It is easy to see as above that
$$\begin{aligned}
[fX, gY] &= fX(gY) - gY(fX) \\
&= f((Xg)Y + gXY) - g((Yf)X + fYX) \\
&= fg[X, Y] + f(Xg)Y - g(Yf)X. \qquad (2.7.1)
\end{aligned}$$
The reader may wish to work out the above by using the definition on
$[fX, gY](h)$. The important thing to notice in this is that the bracket
operation is not $C^\infty(M)$-linear in any of its arguments. In particular,
$L_{fX}Y \ne fL_XY$.
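We can use the above equation (2.7.1) to derive the coordinate expression
for the bracket operation. Let $X = \sum f_i \frac{\partial}{\partial x_i}$ and $Y = \sum g_j \frac{\partial}{\partial x_j}$
on a coordinate chart $(U, x)$. Then
$$[X, Y] = \sum_{i, j} \left( f_i \frac{\partial g_j}{\partial x_i} - g_i \frac{\partial f_j}{\partial x_i} \right) \frac{\partial}{\partial x_j}. \qquad (2.7.2)$$
The most important thing in this equation to observe is, as earlier, that
the derivatives of the coefficients of each of the vector fields are involved.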
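The coordinate expression for the bracket lends itself to a direct numerical check. Below is a minimal Python sketch, with the assumption that a vector field on $\mathbb{R}^n$ is represented as a function returning its list of coefficients; the helper names `partial` and `bracket`, the step size, and the sample point are all illustrative. Partial derivatives are approximated by central differences, and the result is compared with the example $[x\,\partial/\partial y, \, y\,\partial/\partial y] = x\,\partial/\partial y$ and with the vanishing bracket of the polar coordinate fields.

```python
import math

def partial(field, point, i, h=1e-5):
    """Central-difference derivative of all coefficients of `field` along x_i."""
    up, down = list(point), list(point)
    up[i] += h
    down[i] -= h
    return [(a - b) / (2 * h) for a, b in zip(field(up), field(down))]

def bracket(X, Y, point):
    """Coefficients of [X, Y] at `point`: sum_i (f_i dg_j/dx_i - g_i df_j/dx_i)."""
    n = len(point)
    f, g = X(point), Y(point)
    out = [0.0] * n
    for i in range(n):
        dX, dY = partial(X, point, i), partial(Y, point, i)
        for j in range(n):
            out[j] += f[i] * dY[j] - g[i] * dX[j]
    return out

def X(p):
    """The field x d/dy on R^2."""
    return [0.0, p[0]]

def Y(p):
    """The field y d/dy on R^2."""
    return [0.0, p[1]]

def R(p):
    """d/dr in Cartesian coordinates: (x, y) / sqrt(x^2 + y^2)."""
    rho = math.hypot(p[0], p[1])
    return [p[0] / rho, p[1] / rho]

def Theta(p):
    """d/dtheta in Cartesian coordinates: (-y, x)."""
    return [-p[1], p[0]]

point = [1.5, -2.0]
xy_bracket = bracket(X, Y, point)         # expected close to [0, 1.5], i.e. X
polar_bracket = bracket(R, Theta, point)  # expected close to [0, 0]
```

Representing fields by their coefficient functions keeps the code a literal transcription of the formula; a symbolic computation would avoid the finite-difference error but needs a computer algebra package.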
The bracket operation $[\ ,\ ]$ has the following properties: For all
$X, Y, Z \in \mathfrak{X}(M)$,
(i) $[X, Y] = -[Y, X]$

(ii) $[X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0$
This second identity is called the Jacobi identity. Thus $(\mathfrak{X}(M), [\ ,\ ])$ is a
Lie algebra over $\mathbb{R}$ according to the following
Definition 2.7.8 A Lie algebra is a pair $(\mathfrak{g}, [\ ,\ ])$, where $\mathfrak{g}$ is a vector
space over $\mathbb{R}$ and the bracket operation $[\ ,\ ]$ is an $\mathbb{R}$-bilinear map of
$\mathfrak{g} \times \mathfrak{g} \to \mathfrak{g}$ satisfying the anti-commutativity condition and the Jacobi
identity for all $X, Y, Z \in \mathfrak{g}$:
1. $[X, Y] = -[Y, X]$.

2. $[X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0$.
Let $V$ be any vector space over $\mathbb{R}$. If we define the bracket operation
by $[x, y] := 0$ for all $x, y \in V$, we get a Lie algebra $(V, [\ ,\ ])$. This is called
an abelian Lie algebra.
Example 2.7.9 On $M(n, \mathbb{R})$ consider the bracket operation given by
$[X, Y] := XY - YX$. Here $XY$ etc., stand for the product of the matrices
$X$ and $Y$, that is, the composition of the linear operators $X$ and $Y$ on
$\mathbb{R}^n$. Then it is easily seen that $M(n, \mathbb{R})$ becomes a Lie algebra with
respect to this bracket operation. In a similar way the vector spaces
$\mathfrak{sl}(n, \mathbb{R})$, $\mathfrak{o}(n, \mathbb{R})$, $\mathfrak{su}(n)$ and $\mathfrak{u}(n)$ (page 103) are Lie algebras over $\mathbb{R}$.
Notice that all these Lie algebras are finite dimensional.
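The Lie algebra axioms for the matrix commutator can be verified exactly on sample matrices. The following Python sketch uses integer matrices so that all arithmetic is exact; the helper names and the three sample matrices are illustrative assumptions. It checks anti-commutativity and the Jacobi identity, and also that a commutator has trace zero, so that brackets in $M(n, \mathbb{R})$ land in $\mathfrak{sl}(n, \mathbb{R})$.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def bracket(a, b):
    """The commutator [a, b] = ab - ba."""
    return sub(matmul(a, b), matmul(b, a))

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

X = [[0, 1, 0], [0, 0, 2], [3, 0, 0]]
Y = [[1, 0, 0], [0, -1, 4], [0, 0, 2]]
Z = [[0, 0, 1], [5, 0, 0], [0, -2, 0]]
zero = [[0] * 3 for _ in range(3)]

# Anti-commutativity: [X, Y] + [Y, X] should be the zero matrix.
antisymmetry = add(bracket(X, Y), bracket(Y, X))

# Jacobi identity: [X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] should vanish.
jacobi = add(
    add(bracket(X, bracket(Y, Z)), bracket(Y, bracket(Z, X))),
    bracket(Z, bracket(X, Y)),
)

commutator_trace = trace(bracket(X, Y))
```

A check on three matrices is of course not a proof; the identities follow in general from the associativity of matrix multiplication.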
Example 2.7.9 is a special case of the following: Let $A$ be an associative
algebra over $\mathbb{R}$. We define a bracket operation on $A$ as follows:
$[x, y] := xy - yx$ for all $x, y \in A$. Then $(A, [\ ,\ ])$ is a Lie algebra over $\mathbb{R}$.
The reader should have no difficulty in defining a Lie subalgebra of a
Lie algebra. Examples: $\mathfrak{sl}(n, \mathbb{R})$, $\mathfrak{o}(n, \mathbb{R})$ are Lie subalgebras of $M(n, \mathbb{R})$.
Can you think of a Lie algebra over $\mathbb{R}$ of which $\mathfrak{u}(n)$ is a Lie subalgebra?
We now indicate a proof of the fact that any derivation of $C^\infty(M)$
arises out of a vector field.
Theorem 2.7.10 If $D$ is a derivation of $C^\infty(M)$ then there exists a
unique vector field $X \in \mathfrak{X}(M)$ such that $Df = X(f)$ for all $f \in C^\infty(M)$.
Proof The proof is carried out in three steps. We start with the
following
Claim (1): If for $X, Y \in \mathfrak{X}(M)$ and for all $f \in C^\infty(M)$ we have $X(f) =
Y(f)$, then $X = Y$.
It is enough to prove that if $X(f) = 0$ for all $f \in C^\infty(M)$, then
$X = 0$. This follows if we prove that $X_p = 0$ for all $p \in M$. Let $(U, x)$
be a chart around $p$. Then there exists a $g \in C^\infty(M)$ such that $g$ is 1
in a neighborhood of $p$ in $U$ and is 0 outside $U$. If we take $f = gx_i$ then
$f$ is smooth on $M$ and by hypothesis $X(f) = 0$. Since $f = x_i$ near $p$, this
means that the coordinates $X_p(x_i)$ of $X_p$ with respect to the local
coordinates $x_i$ are 0. Hence claim (1) follows.
Claim (2): If $D$ is a derivation and $f = 0$ on an open set $U$, then $Df = 0$
on $U$.
To prove this, it suffices to show that $Df(p) = 0$ for any $p \in U$. Now
we can find neighborhoods $V_p \subset U_p$ of $p$ in $U$ and a function $g \in C^\infty(M)$
such that $g = 1$ on $V_p$ and $g = 0$ on $M \setminus U_p$. We set $h := 1 - g$. Then we
observe that $f = fh$ on $M$. For, if $z \in U_p$, then $f(z) = 0 = h(z)f(z)$.
If $z \notin U_p$, then $h(z) = 1$ so that $f(z) = h(z)f(z)$. Hence we get
$Df = D(hf) = D(h)f + hD(f)$. Evaluating this at $p$ gives us
$$Df(p) = Dh(p)f(p) + h(p)Df(p) = 0 + 0$$
since $h(p) = 0 = f(p)$. Hence the claim follows.
Claim (3): Let $f$ be defined and smooth around $p$. We set $X_p(f) =
Dg(p)$ for any $g \in C^\infty(M)$ which agrees with $f$ around $p$. Since $D$ is
a derivation it is easy to see that $X_p \in T_pM$. To check that this is
well-defined, let $h \in C^\infty(M)$ be such that $h = f$ around $p$. Then we
have $g = h$ on a neighborhood of $p$ so that by Claim 2 it follows that
$Dg = Dh$ on this neighborhood. Thus $X_p$ is well-defined. Hence we get
a vector field $X : p \mapsto X_p$. We need to show that it is smooth. It is
enough to show that its coefficients with respect to any local chart are
smooth. Let $(U, x)$ be a chart and $p \in U$. Then we can find functions
$f_i \in C^\infty(M)$ such that $f_i = x_i$ around $p$. (How?) Then the coefficients
of $X$ are given by $X(x_i) = X(f_i)$ around $p$. But $X(f_i) = D(f_i)$ are smooth.
Hence $X$ is smooth.
□
Let $\varphi : M \to N$ be a smooth map. We wish to study the effect of $\varphi$
on the (smooth) vector fields of $M$. For any manifold $M$ we denote by
$\mathfrak{X}(M)$ the $C^\infty(M)$-module of all smooth vector fields on $M$. Now given
a vector field $X$ on $M$ and $\varphi$ as above, there is, in general, no natural
way of associating a vector field on $N$. However if $\varphi$ is a diffeomorphism
then we can define a vector field $Y$ on $N$ by setting $Y(q) = D\varphi(p)(X(p))$,
where $\varphi(p) = q$. (Exercise: Check that $Y \in \mathfrak{X}(N)$.)
Definition 2.7.11 Let $\varphi : M \to N$ be a smooth map. Two vector fields
$X \in \mathfrak{X}(M)$ and $Y \in \mathfrak{X}(N)$ are said to be $\varphi$-related if $D\varphi(p)(X(p)) =
Y(\varphi(p))$ for all $p \in M$. That is,
$$D\varphi \circ X = Y \circ \varphi. \qquad (2.7.3)$$
Notice that this definition is adapted from the case of a diffeomorphism
seen above. We want to reformulate this using functions on $N$. So, let
$g : N \to \mathbb{R}$ be smooth. Then Equation (2.7.3) holds if and only if the
following equivalent condition holds for all $g \in C^\infty(N)$:
$$X(g \circ \varphi) = Y(g) \circ \varphi. \qquad (2.7.4)$$
This follows from the following set of equivalences where $g \in C^\infty(N)$
and $p \in M$:
$$\begin{aligned}
& D\varphi(p)(X(p))(g) = Y(\varphi(p))(g) \\
\iff{} & X(p)(g \circ \varphi) = Y(g)(\varphi(p)) \\
\iff{} & (X(g \circ \varphi))(p) = (Y(g) \circ \varphi)(p) \\
\iff{} & X(g \circ \varphi) = Y(g) \circ \varphi. \qquad (2.7.5)
\end{aligned}$$
The equivalence of Equation (2.7.3) and the last of Equation (2.7.5)
will be used often and will be referred to frequently in the sequel.
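Example 2.7.12 Let $\varphi\colon \mathbb{R} \to S^1 \subset \mathbb{R}^2$ be given by
$\varphi(t) = (\cos t, \sin t)$. Let $X \in \mathfrak{X}(\mathbb{R})$ be given by
$X = \frac{d}{dt}$ and $Y \in \mathfrak{X}(\mathbb{R}^2)$ be given by
$Y = -y\frac{\partial}{\partial x} + x\frac{\partial}{\partial y}$.
Then the restriction of $Y$ to $S^1$ is a tangent vector field
on $S^1$ (why?) and $X$ and $Y$ are $\varphi$-related.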
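The relation $D\varphi \circ X = Y \circ \varphi$ of Example 2.7.12 can be checked numerically. The following is only a sketch: the function names (`phi`, `dphi_X`, `Y`, `max_mismatch`), the sample range and the finite-difference step are illustrative choices, not part of the text.

```python
import math

def phi(t):
    # The map phi(t) = (cos t, sin t) from Example 2.7.12.
    return (math.cos(t), math.sin(t))

def dphi_X(t, h=1e-6):
    # D(phi)(t) applied to X = d/dt, approximated by a central difference.
    (x1, y1), (x0, y0) = phi(t + h), phi(t - h)
    return ((x1 - x0) / (2 * h), (y1 - y0) / (2 * h))

def Y(point):
    # The vector field Y = -y d/dx + x d/dy on R^2, as a pair of components.
    x, y = point
    return (-y, x)

def max_mismatch(samples=50):
    # Largest componentwise gap between D(phi)(X(t)) and Y(phi(t)) on [-5, 5].
    worst = 0.0
    for k in range(samples):
        t = -5.0 + 10.0 * k / (samples - 1)
        lhs, rhs = dphi_X(t), Y(phi(t))
        worst = max(worst, abs(lhs[0] - rhs[0]), abs(lhs[1] - rhs[1]))
    return worst

print(max_mismatch())  # a value at the level of finite-difference error
```

A small mismatch is of course only evidence, not a proof; the exact statement is the identity $\varphi'(t) = (-\sin t, \cos t) = Y(\varphi(t))$.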
Left-invariant vector fields on Lie groups

Let $G$ be a Lie group. Recall that for $a \in G$, $L_a$ stands for the left
translation by $a$: $L_a(x) = ax$. The right translation $R_a$ is defined similarly.
Definition 2.7.13 A vector field $X \in \mathfrak{X}(G)$ is said to be left invariant
if $X$ is $L_a$-related to itself for all $a \in G$, that is, if
$$DL_a \circ X = X \circ L_a \quad\text{for all } a \in G. \tag{2.7.6}$$
Or, equivalently, from Equation (2.7.5) we require
$$X(f \circ L_a) = X(f) \circ L_a \quad\text{for all } a \in G \text{ and } f \in C^\infty(G). \tag{2.7.7}$$
Right invariant vector fields on a Lie group are defined in an analogous
way.
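Example 2.7.14 What are the left (= right) invariant vector fields on
$\mathbb{R}^n$? We use Equation (2.7.7): $X(f \circ L_a) = X(f) \circ L_a$ to solve this. Let
$X = \sum_i \alpha_i \frac{\partial}{\partial x_i}$, with $\alpha_i \in C^\infty(\mathbb{R}^n)$.
Let $f \in C^\infty(\mathbb{R}^n)$ and $a \in \mathbb{R}^n$. Then
$$X(f \circ L_a) = \sum_i \alpha_i \frac{\partial}{\partial x_i}(f \circ L_a) \tag{2.7.8}$$
and
$$X(f) \circ L_a = \Big(\sum_i \alpha_i \frac{\partial f}{\partial x_i}\Big) \circ L_a. \tag{2.7.9}$$
Therefore
$$X(f \circ L_a)(x) = \sum_i \alpha_i(x)\, \frac{\partial f}{\partial x_i}(a + x) \tag{2.7.10}$$
and
$$(X(f) \circ L_a)(x) = \Big(\sum_i \alpha_i \frac{\partial f}{\partial x_i}\Big)(a + x)
 = \sum_i \alpha_i(a + x)\, \frac{\partial f}{\partial x_i}(a + x). \tag{2.7.11}$$
Equations (2.7.10) and (2.7.11) show that $\alpha_i(a + x) = \alpha_i(x)$ for every
$a \in \mathbb{R}^n$, which implies that each $\alpha_i$ is a constant. Thus the only
left-invariant vector fields on $\mathbb{R}^n$ are the constant vector fields.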
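A natural question at this stage is whether there exists any nonzero
left-invariant vector field on a Lie group $G$ at all. Notice that if $X$ is a
left-invariant vector field then
$$DL_a(X(e)) = X \circ L_a(e) = X(a).$$
Also if $Y$ is another left-invariant vector field such that $Y_e = X_e$ then
$$Y(a) = DL_a(Y_e) = DL_a(X_e) = X(a)$$
for all $a \in G$, or $X = Y$. Thus any left-invariant vector field $X$ is
determined completely by $X_e = X(e)$. This suggests that for $v \in T_e(G)$
we define $X \in \mathfrak{X}(G)$ by setting
$$X(a) = DL_a(v) \equiv DL_a(e)(v).$$
We now show that $X$ so defined is left-invariant. We have
$$X(ab) = DL_{ab}(v) = D(L_a \circ L_b)(v) = DL_a(DL_b(v)) = DL_a(X(b)),$$
that is,
$$(X \circ L_a)(b) = (DL_a \circ X)(b) \implies X \circ L_a = DL_a \circ X.$$
Thus $X$ is left-invariant. We now show that $X$ is smooth. Choose
a chart $(U, x)$ around $e$ and a chart $(W, y)$ around $a$. Then, if
$v = \sum_j c_j \frac{\partial}{\partial x_j}\big|_e$ with $c_j \in \mathbb{R}$, we have
$$X(a) = \sum_i \sum_j c_j\, \frac{\partial (y_i \circ L_a)}{\partial x_j}(e)\, \frac{\partial}{\partial y_i}\Big|_a \tag{2.7.12}$$
on $W_0 \subset W$, $a \in W_0$. Since the multiplication of $G$ is smooth,
$y_i \circ L_a$ is $C^\infty$ and, for each $j$, the functions
$c_j \frac{\partial}{\partial x_j}(y_i \circ L_a)$ are $C^\infty$. Hence $X$ is $C^\infty$.
Thus if $\mathfrak{g}$ is the set of all left-invariant vector fields on $G$, then
the map $\mathfrak{g} \to T_e(G)$ given by $X \mapsto X(e)$
is a linear isomorphism. Therefore $\dim \mathfrak{g} = \dim T_e(G) = \dim G$.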
Exercise 2.7.15 Carry out a similar argument for right invariant vector
fields.
Notice that the formula $\dim \mathfrak{g} = \dim G$ implies that the only
left-invariant vector fields on $\mathbb{R}^n$ are the constant vector fields.
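Let $X_i \in \mathfrak{X}(M)$ be $\varphi$-related to $Y_i \in \mathfrak{X}(N)$, for $i = 1, 2$. Are the
vector fields $[X_1, X_2]$ and $[Y_1, Y_2]$ $\varphi$-related? In view of Equation (2.7.5)
we need only show
$$[X_1, X_2](g \circ \varphi) = [Y_1, Y_2](g) \circ \varphi \quad\text{for all } g \in C^\infty(N).$$
Now
$$\begin{aligned}
X_1 X_2(g \circ \varphi) &= X_1(X_2(g \circ \varphi)) = X_1(Y_2(g) \circ \varphi) \\
 &= Y_1(Y_2(g)) \circ \varphi \\
 &= Y_1 Y_2(g) \circ \varphi.
\end{aligned}$$
Here we used Equation (2.7.5) in the second and the third equalities.
Similarly we have
$$X_2 X_1(g \circ \varphi) = Y_2 Y_1(g) \circ \varphi.$$
Hence
$$[X_1, X_2](g \circ \varphi) = [Y_1, Y_2](g) \circ \varphi.$$
Thus we have proved the following
Lemma 2.7.16 If $X_i \in \mathfrak{X}(M)$ is $\varphi$-related to $Y_i \in \mathfrak{X}(N)$, for $i = 1, 2$,
then $[X_1, X_2]$ is $\varphi$-related to $[Y_1, Y_2]$.
□
We often write $X \overset{\varphi}{\sim} Y$ whenever we want to indicate that $X$ and $Y$
are $\varphi$-related.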
We want to apply Lemma 2.7.16 to exhibit a special Lie algebra
associated to a Lie group. We denote by $\mathfrak{g}$ the $(\dim G)$-dimensional vector space
of all left-invariant vector fields on a Lie group $G$. Is this subspace of
$\mathfrak{X}(G)$ closed under the (Lie) bracket operation? That is, if $X, Y \in \mathfrak{g}$
we want to know whether $[X, Y] \in \mathfrak{g}$. Lemma 2.7.16 tells us that $\mathfrak{g}$ is
a Lie subalgebra of $\mathfrak{X}(G)$. For, by assumption, $X$ and $Y$ are $L_a$-related
to $X$ and $Y$ respectively and hence $[X, Y]$ is $L_a$-related to itself for all
$a \in G$. Thus $[X, Y] \in \mathfrak{g}$. This Lie algebra $\mathfrak{g}$ is called the Lie algebra of
the Lie group $G$. It is often denoted by $\mathrm{Lie}(G)$, or by the corresponding
Gothic letter $\mathfrak{g}$. $\mathrm{Lie}(G)$ reveals a lot about the Lie group $G$, as we shall
see later.
An important remark at this juncture is that since there is a natural
identification of the vector spaces $\mathfrak{g}$ and $T_e(G)$ via the map
$X \mapsto X_e = X(e)$, we may transfer the bracket operation on $\mathfrak{g}$ to $T_e(G)$.
This makes $T_e(G)$ into a Lie algebra over $\mathbb{R}$. The bracket is given on $T_e(G)$ by
$[x, y] := [X, Y](e)$ where $X \in \mathfrak{g}$ is the unique left-invariant vector field
on $G$ such that $X(e) = x$, etc.
Example 2.7.17 Let $G = GL(n, \mathbb{R})$. We know that $G$ is a Lie group
of dimension $n^2$. Since $G$ is an open subset of $M(n, \mathbb{R}) \simeq \mathbb{R}^{n^2}$, we
identify $T_I(G)$ with $M(n, \mathbb{R})$ via the map $T_I(G) \ni v \mapsto (v(x_{ij}))_{ij}$, an
element of $M(n, \mathbb{R})$. Here $x_{ij}(A) = a_{ij}$ is the $(ij)$-th coordinate entry of
$A \in M(n, \mathbb{R})$. We have also seen that $M(n, \mathbb{R})$ is a Lie algebra under the
operation $[A, B] = AB - BA$. What we wish to show now is that the
Lie algebra structure on $M(n, \mathbb{R})$ induced by identifying it with the Lie
algebra $\mathfrak{g}$ of the Lie group $G$ is the same as the one seen above. That is, we
want to prove that $\mathfrak{g}$ and $(M(n, \mathbb{R}), [\ ,\ ])$ are isomorphic as Lie algebras.
To establish this, it is enough to show that the given map is a Lie
algebra homomorphism. (Why?) That is, it is enough to verify that
$$[X, Y](e) = [X(e), Y(e)] \quad\text{or}\quad [X, Y](e)_{ij} = [X(e), Y(e)]_{ij}. \tag{2.7.13}$$
We now compute $[X, Y](e)$:
$$[X, Y](e)_{ij} = [X, Y](e)(x_{ij}) = X(e)Y(x_{ij}) - Y(e)X(x_{ij}). \tag{2.7.14}$$
Let $X = \sum X_{ij} \frac{\partial}{\partial x_{ij}}$ where $X_{ij} = X(x_{ij})$. Now $Y(x_{ij}) \in C^\infty(G)$ and
$Y(x_{ij})(a) = Y(e)(x_{ij} \circ L_a)$ since
$$Y(x_{ij})(a) = Y_a(x_{ij}) = DL_a(Y_e)(x_{ij}) = Y_e(x_{ij} \circ L_a) \tag{2.7.15}$$
as $Y$ is left-invariant. Since $x_{ij}$ is smooth, so is $x_{ij} \circ L_a$. What is
$(x_{ij} \circ L_a)(g)$?
$$(x_{ij} \circ L_a)(g) = x_{ij}(ag) = \sum_k x_{ik}(a)\, x_{kj}(g). \tag{2.7.16}$$
Hence, as functions, we have
$$x_{ij} \circ L_a = \sum_k x_{ik}(a)\, x_{kj}.$$
We see that Equation (2.7.15) becomes
$$Y(x_{ij})(a) = Y_e(x_{ij} \circ L_a) = Y_e\Big(\sum_k x_{ik}(a)\, x_{kj}\Big) = \sum_k x_{ik}(a)\, Y_e(x_{kj}).$$
Thus as functions $Y(x_{ij}) = \sum_k x_{ik}\, Y_e(x_{kj})$. We therefore have
$$\begin{aligned}
[X, Y](e)_{ij} &= X(e)\Big(\sum_k x_{ik}\, Y_e(x_{kj})\Big) - Y(e)\Big(\sum_k x_{ik}\, X_e(x_{kj})\Big) \\
 &= \sum_k X_e(x_{ik})\, Y_e(x_{kj}) - \sum_k Y_e(x_{ik})\, X_e(x_{kj}) \\
 &= [X(e), Y(e)]_{ij}.
\end{aligned}$$
Thus Equation (2.7.13) is established.
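The identity (2.7.13) can be tested numerically for $n = 2$. The sketch below assumes the standard description of the left-invariant field with $X(e) = A$ on $GL(n, \mathbb{R})$, namely $X(g) = gA$, and differentiates along the curve $t \mapsto g(I + tA)$, which has velocity $gA$ at $t = 0$. All function names and the step size `h` are hypothetical choices made for this illustration.

```python
def matmul(P, Q):
    # Plain matrix product of two square matrices given as nested lists.
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def perturb(A, t):
    # The matrix I + t*A (invertible for small t).
    n = len(A)
    return [[(1.0 if i == j else 0.0) + t * A[i][j] for j in range(n)]
            for i in range(n)]

def apply_field(A, F, h=1e-4):
    # Returns g -> (X_A F)(g), where X_A(g) = g A, via a central difference
    # along the curve t -> g (I + tA), whose velocity at t = 0 is g A.
    def XF(g):
        plus = F(matmul(g, perturb(A, h)))
        minus = F(matmul(g, perturb(A, -h)))
        return (plus - minus) / (2 * h)
    return XF

def bracket_at_identity(A, B):
    # Matrix with entries [X_A, X_B](e)(x_ij), computed by finite differences.
    n = len(A)
    e = perturb(A, 0.0)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            x_ij = lambda g, i=i, j=j: g[i][j]  # coordinate function x_ij
            out[i][j] = (apply_field(A, apply_field(B, x_ij))(e)
                         - apply_field(B, apply_field(A, x_ij))(e))
    return out

def commutator(A, B):
    # The matrix commutator AB - BA.
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

A = [[1.0, 2.0], [0.0, 3.0]]
B = [[0.0, 1.0], [4.0, -1.0]]
print(bracket_at_identity(A, B))
print(commutator(A, B))
```

The two printed matrices should agree up to rounding error, which is what Equation (2.7.13) predicts.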
Remark 2.7.18 (cum Exercise) If $H$ is any Lie subgroup of
$G = GL(n, \mathbb{R})$, then Example 2.7.17 allows us to identify $\mathrm{Lie}(H)$ with $T_e(H)$
when the latter is equipped with the bracket operation inherited from
$M(n, \mathbb{R})$. Recall that we have computed the tangent space at $e = I$ to
the Lie subgroups $H = SL(n, \mathbb{R})$, $O(n, \mathbb{R})$, $U(n)$, $SU(n)$ and that the
corresponding vector subspaces in $M(n, \mathbb{R})$ are closed under the bracket
operation.
We now want to show how our study of $\varphi$-related vector fields helps
us associate a derived morphism of the Lie algebras to a morphism of
Lie groups.
First some definitions.
Definition 2.7.19 If $G$ and $H$ are two Lie groups we say $\varphi$ is a (Lie)
homomorphism of $G$ to $H$ if $\varphi$ is a smooth map and if $\varphi(xy) = \varphi(x)\varphi(y)$
for all $x, y \in G$. That is, $\varphi$ is a smooth homomorphism.
As a rule we shall mean by a homomorphism between two Lie groups
a Lie homomorphism.
A Lie algebra homomorphism between two Lie algebras is defined in
the obvious way:
Definition 2.7.20 Let $\mathfrak{g}$ and $\mathfrak{h}$ be Lie algebras. A linear map
$f\colon \mathfrak{g} \to \mathfrak{h}$
is said to be a Lie algebra homomorphism if it preserves the Lie bracket
operations, that is, if $f([X, Y]) = [f(X), f(Y)]$ for all $X, Y \in \mathfrak{g}$.
Now given $\varphi$ as above, is there any associated Lie algebra
homomorphism of the Lie algebras $\mathfrak{g}$ and $\mathfrak{h}$? Recall that we identified $\mathfrak{g}$
(respectively, $\mathfrak{h}$) with $T_e(G)$ (respectively, with $T_e(H)$) and that we
have a linear map $D\varphi\colon T_e(G) \to T_e(H)$. This induces a linear map
$d\varphi$ from $\mathfrak{g}$ to $\mathfrak{h}$ as follows: $d\varphi(X) := Y$, where for $b \in H$ we set
$Y_b := DL_b(D\varphi(X_e))$. Is this a Lie algebra homomorphism? Yes, it is.
This follows from Lemma 2.7.16.
We shall prove this in detail. Observe that $X$ and $Y$ are $\varphi$-related:
$$Y_{\varphi(g)} = D\varphi(X_g) \quad\text{for all } g \in G.$$
For, since $\varphi$ is a homomorphism we have $L_{\varphi(g)} \circ \varphi = \varphi \circ L_g$, and hence
$$\begin{aligned}
Y_{\varphi(g)} &= DL_{\varphi(g)}(Y_e) = DL_{\varphi(g)}(D\varphi_e(X_e)) \\
 &= D(L_{\varphi(g)} \circ \varphi)_e(X_e) = D(\varphi \circ L_g)_e(X_e) \\
 &= D\varphi_g(DL_g(e)(X_e)) = D\varphi(X_g).
\end{aligned}$$
Now by Lemma 2.7.16 it follows that $[X_1, X_2]$ is $\varphi$-related to $[Y_1, Y_2]$ for
all $X_i \in \mathfrak{g}$ and $Y_i = d\varphi(X_i)$. Hence, writing $\simeq$ for the identification
of a left-invariant vector field with its value at $e$, we have
$$\begin{aligned}
d\varphi([X_1, X_2]) &\simeq D\varphi([X_1, X_2](e)) \\
 &= [Y_1, Y_2](e) \\
 &\simeq [Y_1, Y_2].
\end{aligned}$$
This establishes that the natural map
$d\varphi$ is a Lie algebra homomorphism.
Later, we shall see a more geometric way of defining $d\varphi$ which will
be most useful in dealing with this derived map of the Lie algebras. See
Lemma 2.8.29 on page 134.
Continuous vector fields on spheres

We give Milnor's proof of the non-existence of a nowhere vanishing vector
field on even dimensional spheres. We need a few preliminary lemmas.
Lemma 2.7.21 Let $f\colon (X, d) \to (Y, d)$ be a map of metric spaces. Assume
that $f$ is locally Lipschitz, that is, for each $x \in X$, there exists an
open ball $B_x$ containing $x$ and a constant $L_x > 0$ such that
$$d(f(x_1), f(x_2)) \le L_x\, d(x_1, x_2) \quad\text{for all } x_1, x_2 \in B_x.$$
If $X$ is compact, then $f$ is Lipschitz, that is, there exists $L > 0$ such that
$$d(f(x_1), f(x_2)) \le L\, d(x_1, x_2) \quad\text{for all } x_1, x_2 \in X.$$
Proof The collection $\{B_x\}$ of open balls $B_x$, as in the statement, is
an open cover of the compact set $X$. Let $\{B_{x_i} : 1 \le i \le n\}$ be a finite
subcover. We shall let $B_i$ stand for $B_{x_i}$. Let the Lipschitz constant for
$B_i$ be $L_i$. Consider the set
$$C := (X \times X) \setminus \bigcup_{i=1}^{n} (B_i \times B_i).$$
Then $C$ is a closed subset of $X \times X$ and hence is compact. If $(x_1, x_2) \in C$,
then $x_1 \ne x_2$. Hence if we define $\delta\colon C \to \mathbb{R}$ by setting
$\delta(x_1, x_2) := d(x_1, x_2)$, then $\delta > 0$ on $C$. Since $C$ is compact, it follows that
$\varepsilon := \inf\{\delta(x_1, x_2) : (x_1, x_2) \in C\} > 0$. Let
$M := \max\{d(f(x_1), f(x_2)) : (x_1, x_2) \in C\}$
and take $L_0 > M/\varepsilon$. It is easy to check that $L := \max\{L_1, \ldots, L_n, L_0\}$
is as required.
□
Lemma 2.7.22 Let $K \subset \mathbb{R}^n$ be a compact subset and assume that $v$ is a
continuously differentiable function on a neighbourhood of $K$ with values
in $\mathbb{R}^n$. Let $f_t(x) := x + tv(x)$. Then for $|t|$ sufficiently small, $f_t$ is
one-one. Furthermore for such values of $t$, $\mathrm{Vol}(f_t(K))$, the $n$-dimensional
volume (or Lebesgue measure) of $f_t(K)$, is a polynomial function in $t$.
Proof Since $v$ is $C^1$, by the mean value inequality, $v$ is locally Lipschitz
on $K$ and hence Lipschitz on $K$ by the last lemma. Let $L$ be a Lipschitz
constant of $v$ on $K$. Let $t$ be such that $L|t| < 1$. Then $f_t$ is one-one on
$K$. For, if $f_t(x) = f_t(y)$, that is, if $x + tv(x) = y + tv(y)$, we have
$$\|x - y\| = |t|\, \|v(x) - v(y)\| \le L|t|\, \|x - y\|.$$
This implies that $x = y$.
The Jacobian matrix of $f_t$ is of the form $I + t\big(\frac{\partial v_i}{\partial x_j}\big)$ where $I$ is the
$n \times n$ identity matrix. Hence $\det(Df_t)$ is a polynomial function in $t$ of
the form $1 + tp_1(x) + \cdots + t^n p_n(x)$. By continuity of the derivative, it
follows that $\det(Df_t)(x) > 0$ for all $x \in K$ if $|t|$ is sufficiently small. By
the change of variable formula, it follows that
$$\mathrm{Vol}(f_t(K)) = \int_K \det(Df_t)(x)\, dx = \mathrm{Vol}(K) + t\int_K p_1(x)\, dx + \cdots + t^n \int_K p_n(x)\, dx.$$
□
We let $S(x, r) := \{y \in \mathbb{R}^n : \|y - x\| = r\}$ and $S^{n-1} = S(0, 1)$.
Lemma 2.7.23 Let $v\colon S(0, 1) \to \mathbb{R}^n$ be a $C^1$-unit tangent vector field.
Then the function $f_t\colon u \mapsto u + tv(u)$ maps $S(0, 1)$ onto $S(0, \sqrt{1 + t^2})$ if
$|t|$ is sufficiently small.
Proof Note that $f_t(S(0, 1)) \subset S(0, \sqrt{1 + t^2})$, since $\langle u, v(u)\rangle = 0$ and
$\|v(u)\| = 1$. Extend $v$ by setting $v(ru) := rv(u)$ for $r > 0$, so that
$f_t(ru) = rf_t(u)$, and let $K$ be a compact annulus $\{x : a \le \|x\| \le b\}$ with
$0 < a < 1 < b$. We shall assume that $n \ge 2$. For $|t|$ sufficiently small, $Df_t(x)$ is
non-singular on $K$ and hence by the inverse mapping theorem, $f_t$ maps
open subsets in the interior of $K$ to open sets. Hence $f_t(S(0, 1))$ is a relatively
open subset of $S(0, \sqrt{1 + t^2})$. Since $f_t(S(0, 1))$ is compact and hence
closed, by connectedness of the spheres, it follows that $f_t(S(0, 1)) =
S(0, \sqrt{1 + t^2})$.
□
Proposition 2.7.24 There does not exist a $C^1$-unit tangent vector field
on an even dimensional sphere.
Proof Let, if possible, $v\colon S^{n-1} \to \mathbb{R}^n$ be a $C^1$-unit tangent vector
field, where $n = 2k + 1$. Consider $K := \{x \in \mathbb{R}^n : a \le \|x\| \le b\}$, where $0 < a < b$.
Then $K$ is compact. We extend $v$ to $K$ by setting $v(ru) := rv(u)$ for
$a \le r \le b$. Let $f_t(x) := x + tv(x)$ for $x \in K$. Then $f_t$ maps $S(0, r)$
onto $S(0, r\sqrt{1 + t^2})$ for $a \le r \le b$. Hence $f_t$ maps $K$ onto the set
$\{x \in \mathbb{R}^n : a\sqrt{1 + t^2} \le \|x\| \le b\sqrt{1 + t^2}\}$. Since $f_t(ru) = rf_t(u)$, we see
that $\mathrm{Vol}(f_t(K)) = (1 + t^2)^{n/2}\, \mathrm{Vol}(K)$. If $n$ is odd, then the right hand
side is not a polynomial in $t$. (Why?) This contradicts Lemma 2.7.22.
□
Theorem 2.7.25 There exists no continuous vector field of non-zero
tangent vectors on an even dimensional unit sphere.
Proof Let $v$ be a continuous nowhere zero tangent vector field on $S^{n-1}$. Then
$m := \min\{\|v(u)\| : u \in S^{n-1}\} > 0$. By the Weierstrass approximation
theorem, applied to each of the components $v_i$ of $v = (v_1, \ldots, v_n)$,
there exists $p := (p_1, \ldots, p_n)$ where each $p_i$ is a polynomial such that
$\|v(u) - p(u)\| < m/2$ for all $u \in S^{n-1}$. Let $w(u) := p(u) - \langle p(u), u\rangle u$
for $u \in S^{n-1}$. Then $\langle w(u), u\rangle = 0$ and hence $u \mapsto w(u)$ is a tangent
vector field on $S^{n-1}$. We claim that $w(u) \ne 0$ for $u \in S^{n-1}$. First, we
observe that $|\langle p(u), u\rangle| < m/2$. For, since $\langle v(u), u\rangle = 0$, we have
$\langle p(u), u\rangle = \langle p(u) - v(u), u\rangle$ so that
$$|\langle p(u), u\rangle| \le \|p(u) - v(u)\|\, \|u\| < m/2.$$
Using this and $\|p(u)\| \ge \|v(u)\| - \|v(u) - p(u)\| > m - m/2$, we have
$$\|w(u)\| \ge \|p(u)\| - |\langle p(u), u\rangle|\, \|u\| > (m - m/2) - m/2 = 0.$$
Now, $u \mapsto w(u)/\|w(u)\|$ is an infinitely differentiable unit tangent vector
field on $S^{n-1}$. If $n - 1$ is even, this is impossible, by the last proposition.
□
We could not resist showing how this argument can be adapted to
prove Brouwer's fixed point theorem (Thm. 2.7.28). We shall let $B$,
$B^\circ$ and $S$ stand for the closed ball $B[0, 1]$, the open ball $B(0, 1)$ and the unit sphere
$S := \partial B = \{x \in \mathbb{R}^n : \langle x, x\rangle = 1\}$.
Lemma 2.7.26 There exists no $C^1$-map $f\colon B \to S$ such that $f(x) = x$
for $x \in S$.
Proof Assume that $f\colon B \to S$ is a $C^1$-map such that $f(x) = x$ for
$x \in S$. Define $g(x) := f(x) - x$ and $f_t(x) := x + tg(x) = (1 - t)x + tf(x)$
for $\|x\| \le 1$ and $0 \le t \le 1$.
Since $g$ is $C^1$ on the compact set $B$, by the mean value inequality, there
exists $L > 0$ such that
$$\|g(x) - g(y)\| \le L\|x - y\| \quad\text{for } x, y \in B.$$
If $0 \le t < 1/L$, then $f_t$ is one-one. For, if $f_t(x) = f_t(y)$ with $x \ne y$, then we have
$x + tg(x) = y + tg(y)$ and hence
$$\|x - y\| = t\|g(x) - g(y)\| \le tL\|x - y\| < \|x - y\|,$$
a contradiction. Hence $f_t$ is one-one if $0 \le t < 1/L$.
We claim that $f_t$ maps $B$ bijectively onto itself if $t$ is sufficiently
small. We prove this by a typical connectedness argument. Observe
that $Df_t(x) = I + tDg(x)$. Since $Dg(x)$ is continuous, it follows that
$Df_t$ is non-singular provided $0 \le t < t_0$ for sufficiently small $t_0$. By the
inverse function theorem $f_t$ maps the open ball $B(0, 1)$ onto an open subset $U_t$ of $B(0, 1)$ for
$0 \le t < t_0$. Let, if possible, $y \in B(0, 1) \setminus U_t$. Join $y$ to a point $x \in U_t$. Let $z$ be a point on
the line segment joining $x$ and $y$ such that $z$ lies on the boundary
of $U_t$. Since $f_t(B)$ is compact, there exists $u \in B$ such that $z = f_t(u)$.
As $z \notin U_t$, $u$ cannot be in the open ball $B(0, 1)$. Hence we infer that $\|u\| = 1$. But then
$f_t(u) = u$. We deduce that $z = u \in S$, which is impossible since $z$ lies
in the open ball. We therefore see that $U_t = B(0, 1)$
for $0 \le t < t_0$.
We now consider the integral
$$I(t) := \int_B \det Df_t(x)\, dx, \quad 0 \le t \le 1.$$
When $0 \le t < t_0$, by the change of variable formula, $I(t)$ is the volume
of $B$ and hence is a nonzero constant. Also, it is clear that $I(t)$ is a
polynomial in $t$. Thus we conclude that $I(t)$ is a nonzero constant for
all $0 \le t \le 1$. But we shall show that $I(1) = 0$. This contradiction will
establish the result.
Since $f_1$ maps $B$ to $S$, we have, for the standard inner product, $\langle f_1, f_1\rangle = 1$.
Hence, by differentiation, we have
$$\Big\langle \frac{\partial f_1}{\partial x_i}, f_1 \Big\rangle = 0, \quad 1 \le i \le n.$$
This means that the $n$ vectors $\frac{\partial f_1}{\partial x_i}$ lie in the $(n-1)$-dimensional
tangent space of $S$, so that
$$\det Df_1(x) = 0, \quad x \in B.$$
Thus $I(1) = 0$.
□
Instead of proving the above result for continuous maps by employing
the Stone-Weierstrass theorem, we prove the $C^1$-version of Brouwer's fixed
point theorem.
Theorem 2.7.27 Let $f\colon B \to B$ be a $C^1$-map of the unit ball in $\mathbb{R}^n$ to
itself. Then $f$ has a fixed point.
Proof If $f(x) \ne x$ for every $x \in B$, then we define $g\colon B \to S$ as follows:
we let $g(x)$ be the point on the boundary at which the ray starting from
$f(x)$ and passing through $x$ meets $S$. In analytical terms, we have $g(x) = x + tv$,
where $v = \frac{x - f(x)}{\|x - f(x)\|}$ and $t = -\langle x, v\rangle + \sqrt{1 - \|x\|^2 + \langle x, v\rangle^2}$. Then
$g\colon B \to S$ is $C^1$ and $g(x) = x$ for all $x \in S$. This contradicts
Lemma 2.7.26.
□
Theorem 2.7.28 (Brouwer's fixed point theorem) Let $f\colon B \to B$
be a continuous map of the closed unit ball in $\mathbb{R}^n$ to itself. Then $f$ has
a fixed point.
Proof Since the fixed point property is preserved under homeomorphisms,
we may use any equivalent norm on $\mathbb{R}^n$. We shall use the max
norm $\|x\|_\infty := \max\{|x_i| : 1 \le i \le n\}$. Let us write $f = (f_1, \ldots, f_n)$.
Given $\varepsilon > 0$, by the Weierstrass approximation theorem, we can find a
polynomial $g_i$ such that $|f_i(x) - g_i(x)| < \varepsilon$ for $\|x\| \le 1$. We let
$g := (g_1, \ldots, g_n)$ be a vector-polynomial function. We have
$$\|f - g\| := \sup_{x \in B} \|f(x) - g(x)\| < \varepsilon.$$
Are we ready to apply Theorem 2.7.27 to $g$? No, since we do not
know whether $g$ maps $B$ to itself. This however can be attended to.
Since
$$\|g(x)\| \le \|g(x) - f(x)\| + \|f(x)\| < 1 + \varepsilon,$$
$g$ maps $B(0, 1)$ to $B(0, 1 + \varepsilon)$. We consider $h(x) := (1 + \varepsilon)^{-1} g(x)$. We
then have
$$\Big\|f(x) - \frac{g(x)}{1 + \varepsilon}\Big\| \le \|f(x) - g(x)\| + \varepsilon\|f(x)\| < \varepsilon(1 + \|f\|) \le 2\varepsilon.$$
If $f(x) \ne x$ for all $x \in B$, then by compactness there is an $\varepsilon > 0$ such that
$\|f(x) - x\| > 2\varepsilon$ for all $x \in B$. Now if we choose a polynomial function
$g$ such that $\|f - g\| < \varepsilon$, then $h$ defined as above cannot have a fixed
point. For, if $h(x_0) = x_0$ then we have
$$2\varepsilon > \|f(x_0) - h(x_0)\| = \|(f(x_0) - x_0) + (x_0 - h(x_0))\| = \|f(x_0) - x_0\| > 2\varepsilon,$$
a contradiction. Since $h$ is a polynomial map of $B$ to itself, this
contradicts Theorem 2.7.27.
□
If you are still interested in the continuous version of the no retraction
theorem, you can derive it from Thm. 2.7.28!
For another simple (but not so elementary) proof of Brouwer's the-
orem see Section 4.3.
2.8 Flows and exponential map

Let $M$ be a smooth manifold and $X \in \mathfrak{X}(M)$. Let $p \in M$ be given. We
know that there exist lots of smooth curves $\gamma$ passing through $p$ such that
$\gamma'(0) = X_p = X(\gamma(0))$. We now want to know whether there is a smooth
curve $\gamma$ through $p$ such that $\gamma'(t) = X(\gamma(t))$ for every $t \in (-\varepsilon, \varepsilon)$ for
sufficiently small $\varepsilon$. The reader may remember that when we discussed
some examples of vector fields, it was pointed out that we traversed a
path starting from a point and following directions. This suggests the
following
Definition 2.8.1 Let $X \in \mathfrak{X}(M)$ and $p \in M$ be given. A smooth curve
$\gamma$ is said to be an integral curve of $X$ passing through $p$ if $\gamma(0) = p$ and
$\gamma'(t) = X(\gamma(t)) = X_{\gamma(t)}$ for all $t$ in the domain of $\gamma$.
As the reader may have guessed, this amounts to integrating the
derivatives corresponding to the tangent vectors assigned by the vector
field. We shall make this precise. Let $(U, \varphi)$ be a chart around $p$ with
local coordinates $x_i$. Assume that $\gamma$ is an integral curve of $X$ through
$p$. By the basis theorem we write
$$X(\gamma(t)) = \sum_{i=1}^{m} X(x_i)\, \frac{\partial}{\partial x_i}\Big|_{\gamma(t)} \quad\text{for } t \in \text{domain of } \gamma. \tag{2.8.1}$$
By the local representation of the derivative of a smooth map as a Jacobian
we also have
$$\gamma'(t) = \sum_{i=1}^{m} \frac{d(x_i \circ \gamma)}{dt}(t)\, \frac{\partial}{\partial x_i}\Big|_{\gamma(t)}. \tag{2.8.2}$$
Equations (2.8.1) and (2.8.2) yield the following system of ordinary
differential equations (ODE):
$$\frac{d}{dt}(x_i \circ \gamma) = X(x_i) \circ \gamma \quad\text{for all } t \in (-\varepsilon, \varepsilon),\ 1 \le i \le m, \tag{2.8.3}$$
with the initial value (i.v., for short) $x_i(p) = 0$. Thus finding an integral
curve through $p$ is the same as solving the above system of ODE.
Using local coordinates, we transfer the problem to one on an open
set containing the origin in $\mathbb{R}^n$. Thus we wish to find an $x\colon (-\varepsilon, \varepsilon) \to U$
satisfying the following vector-valued ordinary differential equation:
$$\frac{dx}{dt} = X(x) \quad\text{with i.v. condition } x(0) = 0.$$
Here the curve $x$ and the vector field $X$ are assumed to be smooth. The
basic theorem of ODE (see Section 1.9) assures us that such a system
has a unique solution. Let us look at some examples.
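Example 2.8.2 Let $U$ denote an open set in $\mathbb{R}^n$. Consider the vector
field $X = \frac{\partial}{\partial x_i}$. Given a point $a = (a_1, \ldots, a_n)$, the integral curve of $X$
through $a$ is
$$\gamma(t) = (a_1, \ldots, a_{i-1}, a_i + t, a_{i+1}, \ldots, a_n),$$
defined for those $t$ for which $\gamma(t)$ stays in $U$.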
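Example 2.8.3 Let $M = \mathbb{R}^2$ and $X = x\frac{\partial}{\partial x} + y\frac{\partial}{\partial y}$. Take $p = (a, b)$. Then
the system of differential equations is
$$\frac{dx}{dt} = x \ \text{ with i.v. condition } x(0) = a; \qquad \frac{dy}{dt} = y \ \text{ with i.v. condition } y(0) = b.$$
The integral curve is given by $\gamma(t) = (x(t), y(t)) = e^t(a, b)$. That is, for
$(a, b) \ne (0, 0)$, $\gamma$ is the open ray starting at the origin and passing
through the point $(a, b)$:
$\{\alpha(a, b) : \alpha > 0\}$.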
Example 2.8.4 Let $M = \mathbb{R}^2$ and $X = x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}$. Take $p = (a, b)$ in $M$.
What is the system of ODE here? We have
$$\frac{dx}{dt} = -y \quad\text{and}\quad \frac{dy}{dt} = x.$$
What is the integral curve of $X$ through $p$? We need to solve the above
system. We differentiate the first of the equations with respect to $t$ and
use the second, etc., to get:
$$\frac{d^2x}{dt^2} = -\frac{dy}{dt} = -x \ \text{ with i.v. } x(0) = a, \qquad
\frac{d^2y}{dt^2} = \frac{dx}{dt} = -y \ \text{ with i.v. } y(0) = b.$$
The solution is $x(t) = a\cos(t) - b\sin(t)$ and $y(t) = a\sin(t) + b\cos(t)$. Thus
if $(a, b) \ne (0, 0)$, then the integral curve is non-constant and is a circle
with radius $\sqrt{a^2 + b^2}$ and center at the origin.
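The closed-form solution of Example 2.8.4 can be compared with a direct numerical integration of the system $dx/dt = -y$, $dy/dt = x$. The following sketch uses the classical fourth-order Runge-Kutta scheme; the function names, the step count and the sample point are assumptions made for illustration only.

```python
import math

def X(p):
    # The vector field X = x d/dy - y d/dx, written as (dx/dt, dy/dt).
    x, y = p
    return (-y, x)

def rk4_flow(field, p, t, steps=1000):
    # Approximates the point reached at time t on the integral curve
    # through p, using the classical Runge-Kutta method of order 4.
    h = t / steps
    x, y = p
    for _ in range(steps):
        k1 = field((x, y))
        k2 = field((x + 0.5 * h * k1[0], y + 0.5 * h * k1[1]))
        k3 = field((x + 0.5 * h * k2[0], y + 0.5 * h * k2[1]))
        k4 = field((x + h * k3[0], y + h * k3[1]))
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return (x, y)

def exact(p, t):
    # Closed form from the example: rotation of (a, b) through angle t.
    a, b = p
    return (a * math.cos(t) - b * math.sin(t),
            a * math.sin(t) + b * math.cos(t))

p = (1.0, 2.0)
print(rk4_flow(X, p, 2.0))
print(exact(p, 2.0))
```

The two printed points should agree to many decimal places, and both lie (up to numerical error) on the circle of radius $\sqrt{a^2 + b^2}$ about the origin.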
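Example 2.8.5 Let $M = \mathbb{R}^2$ and $X = e^{-x}\frac{\partial}{\partial x} + y\frac{\partial}{\partial y}$. Take $p = (a, b)$ in $M$.
Then the integral curve is $\gamma(t) = (\log(t + e^a), be^t)$.
Example 2.8.6 Let $M = \mathbb{R}^2 \setminus \{(x, 0) : x \ge 0\}$, i.e. $\mathbb{R}^2$ with the
non-negative $x$-axis removed, and $X = \frac{\partial}{\partial r}$. Take $p = (a, b) := r_0 e^{i\theta_0} \in M$.
Then the integral curve is $\gamma(t) = (r_0 + t, \theta_0)$ in polar coordinates. Is this related to any of the
previous examples?
Example 2.8.7 Let $M = \mathbb{R}^2$ and $X = \frac{\partial}{\partial x} + e^{-y}\frac{\partial}{\partial y}$. Take $p = (a, b)$ in $M$.
Find the integral curves of $X$.
Example 2.8.8 Let $M = \mathbb{R}^2$ and $X = x\frac{\partial}{\partial x} - y\frac{\partial}{\partial y}$. Take $p = (a, b)$ in $M$.
Then the integral curve is $\gamma(t) = (ae^t, be^{-t})$. Thus the integral curves
lie on the hyperbolas $xy = C$ with $C = ab$, a constant.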
We now state the basic theorem of ODE (Section 1.9) since we need
to fix some notation.
Theorem 2.8.9 Let $U$ be an open set in $\mathbb{R}^n$. Let $p \in U$. Let $X$ be a
$C^k$-map from $U$ to $\mathbb{R}^n$ for some $k \ge 1$. Then there exists an $\varepsilon > 0$ and
an open subset $V$ of $U$ with $p \in V$ and a unique $C^k$-map
$$F\colon (-\varepsilon, \varepsilon) \times V \to U$$
with the following property: $\frac{d}{dt}F(t, x) = X(F(t, x))$ and $F(0, x) = x$ for
all $t \in (-\varepsilon, \varepsilon)$ and $x \in V$.
□
We introduce some notation. For $X$ and $F$ as in Theorem 2.8.9 we write
$\varphi_t$ for the map
$$\varphi_t\colon z \mapsto F(t, z)$$
for $(t, z) \in$ domain of $F$.
We also denote by $\gamma_p$ the integral curve of $X$ through $p$:
$$\gamma_p\colon t \mapsto F(t, p)$$
for $t$ in the domain of $F$. $F$ is called the flow of the vector field $X$.
Observation 2.8.10 If $\sigma$ and $\gamma$ are integral curves of $X$ defined on an
interval $I$ and if $\gamma(t_0) = \sigma(t_0)$ for some $t_0 \in I$, then $\gamma = \sigma$ on $I$.
Proof Let $E := \{s \in I : \gamma(s) = \sigma(s)\}$. Then $E$ is nonempty and
closed. Now for $s_0 \in E$, the curves $t \mapsto \gamma(t + s_0)$ and $t \mapsto \sigma(t + s_0)$
are integral curves of $X$ through $p := \gamma(s_0) = \sigma(s_0)$. Hence by the
uniqueness part of the basic theorem (Theorem 2.3.4), we see that there
exists an $\varepsilon > 0$ such that $\gamma(t + s_0) = \sigma(t + s_0)$ for $t \in (-\varepsilon, \varepsilon)$. This means
that $s_0$ is an interior point of $E$. Thus $E$ is both open and closed in the
connected set $I$, so that $E = I$.
□
Observation 2.8.10 informs us about the existence of a maximal domain
of definition of the integral curve through $p$. For, if $\gamma_j$ is an integral
curve through $p$ defined on the interval $I_j \ni 0$, then there is a unique
map $\gamma$ on $I_p := \bigcup I_j$ which extends all these. This curve will be referred
to as the maximal integral curve of $X$ through $p$.
Observation 2.8.11 With the notation as above, we have for $s \in I_p$,
$I_{\gamma_p(s)} = I_p - s$.
Proof Let $\gamma = \gamma_p$. We set $\sigma(u) := \gamma(u + s)$ for $u \in I_p - s$. Then
$\sigma'(u) = \gamma'(u + s) = X_{\gamma(u + s)} = X_{\sigma(u)}$. Hence
$\sigma$ is an integral curve of $X$ through $\sigma(0) = \gamma(s)$, whence the
observation.
□
Definition 2.8.12 A vector field $X$ is said to be complete if the domains
of definition of all its maximal integral curves through points of $M$ are
$\mathbb{R}$. Otherwise $X$ is said to be incomplete.
Example 2.8.13 The vector fields of Examples 2.8.3, 2.8.4 and 2.8.8
are complete while those of 2.8.5, 2.8.6 and 2.8.7 are not. What can you
say about the vector field in Example 2.8.2?
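Incompleteness can be seen concretely in Example 2.8.5, where the integral curve $\gamma(t) = (\log(t + e^a), be^t)$ makes sense only for $t > -e^a$. The sketch below checks that this formula satisfies the ODE and shows where it stops being defined; the helper names and the finite-difference step are illustrative assumptions.

```python
import math

def gamma(t, a, b):
    # Closed form from Example 2.8.5; defined only when t + e^a > 0.
    s = t + math.exp(a)
    if s <= 0:
        raise ValueError("t lies outside the maximal interval (-e^a, infinity)")
    return (math.log(s), b * math.exp(t))

def field(p):
    # X = e^{-x} d/dx + y d/dy, written as (dx/dt, dy/dt).
    x, y = p
    return (math.exp(-x), y)

def residual(t, a, b, h=1e-6):
    # Size of gamma'(t) - X(gamma(t)), with gamma' from a central difference.
    p1, p0 = gamma(t + h, a, b), gamma(t - h, a, b)
    vel = ((p1[0] - p0[0]) / (2 * h), (p1[1] - p0[1]) / (2 * h))
    target = field(gamma(t, a, b))
    return max(abs(vel[0] - target[0]), abs(vel[1] - target[1]))

print(residual(0.5, 0.0, 1.0))   # close to zero: gamma is an integral curve
try:
    gamma(-1.5, 0.0, 1.0)        # here e^a = 1, so t = -1.5 is not allowed
except ValueError as err:
    print(err)
```

The first coordinate tends to $-\infty$ as $t \to -e^a$, so the curve leaves every compact set in finite time and cannot be extended to all of $\mathbb{R}$.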
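The following version of Taylor's formula should spring no surprise
if the geometric principle is ingested.
Theorem 2.8.14 (Taylor's formula) If $X$ is a vector field and
$f \in C^\infty(M)$ and if $\varphi_t$ denotes the flow of the vector field $X$, then
$$f \circ \varphi_t(p) = \sum_{k=0}^{n} \frac{t^k}{k!}\, D_X^k f(p) + o(t^n).$$
Proof To prove the result we first observe that $f \circ \varphi_t$ is a $C^\infty$-function
of $t$. So we apply Taylor's formula for one variable to get
$$f \circ \varphi_t(p) = \sum_{k=0}^{n} \frac{t^k}{k!}\, \frac{d^k}{dt^k}(f \circ \varphi_t)(p)\Big|_{t=0} + o(t^n).$$
We now compute $\frac{d^k}{dt^k}(f \circ \varphi_t)(p)$ at $t = 0$. We set $l(t, p) := (f \circ \varphi_t)(p)$
and write $l^{(k)}$ for the $k$-th derivative of $l$ with respect to $t$. For $k = 1$,
we have $l^{(1)}(0, p) = \frac{d}{dt}(f \circ \varphi_t)(p)\big|_0 = Xf(p)$ since $t \mapsto \varphi_t(p)$ is the integral
curve of $X$ through $p$. We proceed by induction. Assume that we have
shown that, for $1 \le k \le n$,
$$l^{(k)}(0, p) = \frac{d^k}{dt^k}(f \circ \varphi_t)(p)\Big|_{t=0} = D_X^k f(p).$$
Now, for $k = n + 1$, we have $D_X^{n+1} f(p) = X_p\big(l^{(n)}(0, \cdot)\big)$. Since $X_p$ is
the tangent vector of the curve $s \mapsto \varphi_s(p)$, and $\varphi_t \circ \varphi_s = \varphi_{t+s}$, we have
$$X_p\big(l^{(n)}(0, \cdot)\big) = \frac{d}{ds}\Big|_{s=0} l^{(n)}(0, \varphi_s(p))
 = \frac{d}{ds}\Big|_{s=0} \frac{d^n}{dt^n}\Big|_{t=0} f(\varphi_{t+s}(p))
 = \frac{d^{n+1} l}{dt^{n+1}}(0, p).$$
The result follows.
□
We shall give applications of this formulation later.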
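As a sanity check of Taylor's formula, here is a small worked instance. The choice of $M = \mathbb{R}$, $X = x\,\frac{d}{dx}$ and $f(x) = x^2$ is ours and is made only for illustration.

```latex
% Illustrative instance of Theorem 2.8.14 (the choices of M, X and f are assumptions).
% M = \mathbb{R}, \quad X = x\,\frac{d}{dx}, \quad f(x) = x^2 .
\begin{align*}
\varphi_t(p) &= e^{t}p
  && \text{(solves } \dot{x} = x,\ x(0) = p\text{)}, \\
D_X f &= x \cdot 2x = 2x^2, \qquad D_X^k f = 2^k x^2
  && \text{(by induction on } k\text{)}, \\
\sum_{k=0}^{n} \frac{t^k}{k!}\, D_X^k f(p)
  &= p^2 \sum_{k=0}^{n} \frac{(2t)^k}{k!}
   = p^2 e^{2t} + o(t^n)
   = f(\varphi_t(p)) + o(t^n).
\end{align*}
```

Here the series is just the exponential series for $e^{2t}$, in agreement with $f(\varphi_t(p)) = (e^t p)^2$.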
The maps $\varphi_t$ enjoy some very interesting properties. First of all, let
us assume that $X$ is complete. Then each $\varphi_t$ is a diffeomorphism of $M$
to itself. This follows from the observation that for all $s, t \in \mathbb{R}$ we have:
$$\varphi_s \circ \varphi_t(p) = \varphi_{s+t}(p) \quad\text{and}\quad \varphi_0(p) = p, \tag{2.8.4}$$
so that the smooth map $\varphi_t$ has $\varphi_{-t}$ as its inverse. Equation (2.8.4) is
an easy consequence of Observation 2.8.11 made above. This family $\varphi_t$
is referred to as the one parameter group of diffeomorphisms associated
to $X$. In the general case, when $X$ is not complete, the basic theorem
still yields what is known as the local one parameter family of local
diffeomorphisms.
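Definition 2.8.15 By a local one-parameter family of local diffeomorphisms,
we mean a collection $\{(U_\alpha, \varphi_t^\alpha, \varepsilon_\alpha)\}$ with the following
properties:
1. The $U_\alpha$ form an open cover of $M$.
2. $\varepsilon_\alpha$ is a positive real number.
3. The domain of definition of $\varphi_t^\alpha$ contains $U_\alpha$ and the map given by
$(t, z) \mapsto \varphi_t^\alpha(z)$ for $z \in U_\alpha$ and $t \in (-\varepsilon_\alpha, \varepsilon_\alpha)$ is smooth.
4. $\varphi_t^\alpha(z) = \varphi_t^\beta(z)$ for all $z$ in the intersection of the domains of $\varphi_t^\alpha$ and
$\varphi_t^\beta$ and for all $t$ with $|t| < \min\{\varepsilon_\alpha, \varepsilon_\beta\}$.
5. For $t, s$ with $t, s, t + s \in (-\varepsilon_\alpha, \varepsilon_\alpha)$, the composite $\varphi_t^\alpha \circ \varphi_s^\alpha$ is defined
on an open set containing $U_\alpha$ and $\varphi_t^\alpha \circ \varphi_s^\alpha = \varphi_{s+t}^\alpha$ on $U_\alpha$.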
Exercise 2.8.16 Let a vector field $X$ on $M$ be given. Then there exists
a local one-parameter family as in the definition associated to this vector
field. Hint: This follows from the basic theorem and Observations 2.8.10
and 2.8.11.
This local family is very closely related to the vector field $X$. For, if
we are given a collection as in Definition 2.8.15 without any reference to
a vector field and if we set $X_p(f) = \frac{d}{dt}\big|_{t=0}(f \circ \varphi_t)(p)$, then the map $p \mapsto X_p$
is a smooth vector field with the given collection as the associated local
family. (Exercise.)
One thing should be noticed by the reader. There is no question of
uniqueness about this local family. For, if the data are as above we can
take $\delta_\alpha := \frac{1}{2}\varepsilon_\alpha$ and the resulting family defined by replacing the $\varepsilon$'s by the $\delta$'s
will still serve as a local family for $X$. Yet these are of use as we shall
see below.
Exercise 2.8.17 With the above notation, let $\psi\colon M \to M$ be a
diffeomorphism. Then $Y$ defined by $Y_q := D\psi(p)(X_p)$, for $q := \psi(p)$, is
a vector field on $M$. Is it possible to describe the local family of $Y$ in
terms of that of $X$ and the map $\psi$? We shall give the answer to this,
since we need it below.
The family $\{(\psi(U_\alpha), \psi \circ \varphi_t^\alpha \circ \psi^{-1}, \varepsilon_\alpha)\}$ is a local family of $Y$.
In terms of this local family we can give a necessary and sufficient
condition for $X$ to be complete.
Lemma 2.8.18 Let $X$ be a vector field on $M$. If there exists a local
one-parameter family $\{(U_\alpha, \varphi_t^\alpha, \varepsilon_\alpha)\}$ of local diffeomorphisms corresponding
to $X$ such that $\inf \varepsilon_\alpha > 0$ then $X$ is complete.
Proof The idea is simple. Let $\varepsilon := \inf \varepsilon_\alpha$. We define, for any $p \in M$,
$\phi_t(p) := \varphi_t^\alpha(p)$ if $p \in U_\alpha$ and $t \in (-\varepsilon, \varepsilon)$, and extend it to the entire real
line by setting
$$\varphi_t := \underbrace{\phi_{t/n} \circ \cdots \circ \phi_{t/n}}_{n \text{ times}} = (\phi_{t/n})^n,$$
where $n$ is any integer such that $|t/n| < \varepsilon$.
As one realizes, to carry this idea out, one needs to check that $\phi_t$ and
$\varphi_t$ are well-defined and also show that $\varphi_t$ is the one-parameter family
associated to $X$. We leave this as an instructive exercise to the reader.
□
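We want to apply Lemma 2.8.18 to exhibit complete vector fields.
Example 2.8.19 Let $M$ be a compact manifold and $X$ a vector field
on $M$. Then $X$ is complete. For, if $\{(U_\alpha, \varphi_t^\alpha, \varepsilon_\alpha)\}$ is a local family of $X$,
then by compactness of $M$ there exists a finite collection $\{U_{\alpha_i}\}$ which
covers $M$. Now $\{(U_{\alpha_i}, \varphi_t^{\alpha_i}, \varepsilon_{\alpha_i})\}$ is a local family for $X$ satisfying the
hypothesis of Lemma 2.8.18.
If $X$ is a vector field with compact support on any smooth manifold,
the above argument shows that $X$ is complete.
Example 2.8.20 Let $G$ be a Lie group with $\mathrm{Lie}(G) = \mathfrak{g}$. Then any
left-invariant vector field is complete. Proving this is very easy. If
$\{(U_\alpha, \varphi_t^\alpha, \varepsilon_\alpha)\}$ is a local family for a left-invariant vector field $X$, choose
an $\alpha$ such that $e \in U_\alpha$. Using Exercise 2.8.17 we see that, for each $a \in G$, the family
$$\{(L_a(U_\alpha), L_a \circ \varphi_t^\alpha \circ L_a^{-1}, \varepsilon_\alpha)\}$$
is a local family of $DL_a(X)$, which is $X$ by left-invariance of $X$. Letting
$a$ vary over $G$ we obtain a local family for $X$ with the constant value
$\varepsilon_\alpha$, so Lemma 2.8.18 applies. Hence the claim.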
The Exponential map

Definition 2.8.21 Let $X \in \mathfrak{X}(M)$ be complete. Let $\varphi_t$ be the
associated one-parameter group of diffeomorphisms. We define
$$\mathrm{Exp}_X(p) := \varphi_1(p)$$
for all $p \in M$.
$\mathrm{Exp}$ is called the exponential map of $X$.
The reason for calling this map the exponential map will be clear from
the examples below.
Example 2.8.22 Let $A = (a_{ij}) \in M(n, \mathbb{R})$. Then we have a vector field
on $\mathbb{R}^n$, which we again denote by $A$, as follows: $\sum_{i,j} a_{ij}\, x_j \frac{\partial}{\partial x_i}$. The integral
curves of this vector field are given by $F(t, x) = e^{tA}x$. (Verify this.)
Here $x$ is treated as a column vector.
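The claim $F(t, x) = e^{tA}x$ of Example 2.8.22 can be explored numerically by summing the exponential series applied to a vector. This is a sketch only: `mat_vec`, `exp_flow`, the truncation at 40 terms and the test matrix are all choices made for this illustration.

```python
import math

def mat_vec(A, x):
    # Product of a square matrix (nested lists) with a column vector (list).
    n = len(A)
    return [sum(A[i][k] * x[k] for k in range(n)) for i in range(n)]

def exp_flow(A, x, t, terms=40):
    # Partial sum of sum_k (tA)^k x / k!, i.e. an approximation of e^{tA} x.
    total = list(x)
    term = list(x)
    for k in range(1, terms):
        term = [t * c / k for c in mat_vec(A, term)]
        total = [u + v for u, v in zip(total, term)]
    return total

# The linear field of the rotation example: dx/dt = -y, dy/dt = x.
A = [[0.0, -1.0], [1.0, 0.0]]
x0 = [1.0, 0.0]

print(exp_flow(A, x0, math.pi / 2))                  # approximately [0, 1]
print(exp_flow(A, exp_flow(A, x0, 0.3), 0.4))        # flow for 0.3, then 0.4
print(exp_flow(A, x0, 0.7))                          # flow for 0.7 directly
```

For this particular $A$ the flow is rotation through the angle $t$, and the last two lines illustrate the group law $\varphi_s \circ \varphi_t = \varphi_{s+t}$ of Equation (2.8.4).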
Example 2.8.23 Let $G$ be a Lie group, with $\mathfrak{g}$ as the Lie algebra. Then
any left invariant vector field $X \in \mathfrak{g}$ is complete. So we can define $\mathrm{Exp}$
as above. In this case we define
$$\exp(X) := \mathrm{Exp}_X(e) = \varphi_1(e).$$
More generally, we define $\exp(tX) := \varphi_t(e)$. Thus we have a map,
$\exp\colon \mathfrak{g} \to G$, again called the exponential map of $G$.
For example, if we take $G = GL(n, \mathbb{R})$, so that $\mathfrak{g} = M(n, \mathbb{R})$, the
exponential map defined above is nothing other than our favourite one!
That is, we have
$$\exp(X) = e^X := \sum_{k=0}^{\infty} \frac{X^k}{k!}.$$
This follows from the fact that the curve $t \mapsto e^{tX}$ from $\mathbb{R}$ to $GL(n, \mathbb{R})$ is
the maximal integral curve for the vector field $X$ through $e$, as we have
seen earlier.
Do you wonder what the map $\mathrm{Exp}$ looks like at other points of the
group $GL(n, \mathbb{R})$? Well, then solve the following
Exercise 2.8.24 $\mathrm{Exp}_X(g) = L_g(\exp(X))$ for any $g \in G$. The
importance of this exercise cannot be overemphasized. It tells us, for instance,
that the complete flow $F_X$ of any $X \in \mathfrak{g}$ is given by
$$F_X(t, g) = g \exp(tX) \quad\text{for any } t \in \mathbb{R} \text{ and } g \in G.$$
You will soon see how valuable this observation is going to be for us.
Exercise 2.8.25 Can you guess what the exp map for $\mathbb{R}$ is? Prove or
disprove your guess by finding the map from first principles. Beware,
many beginners make mistakes here! What is the exp map of $\mathbb{R}^+$?
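In the rest of the chapter we shall investigate the properties of the
exp map and use it to show how it relates the structures on the Lie
group and the Lie algebras.
Lemma 2.8.26 $t \mapsto \exp(tX)$ is the unique maximal integral curve of
$X$ passing through $e$.
Proof This is just the definition of $\varphi_t$.
□
Lemma 2.8.27 $\exp((s + t)X) = \exp(sX)\exp(tX)$ for all $s, t \in \mathbb{R}$ and
$X \in \mathfrak{g}$.
Proof We write $\mathrm{Exp}(tX)$ for the time-$t$ map $\varphi_t$ of the flow of $X$.
We use Exercise 2.8.17 dealing with $X$, $D\psi(X)$ and the corresponding
local families. For $g \in G$, we have, by left invariance of $X$,
$$L_g\, \mathrm{Exp}(tX)\, L_g^{-1} = \mathrm{Exp}\, t(DL_g(X)) = \mathrm{Exp}(tX),$$
and hence
$$L_g\, \mathrm{Exp}(tX) = \mathrm{Exp}(tX)\, L_g.$$
This implies that
$$\mathrm{Exp}(tX)(g) = \mathrm{Exp}(tX)(L_g(e)) = L_g(\mathrm{Exp}(tX)(e)) = g\exp(tX).$$
Now we have
$$\exp((s + t)X) = \mathrm{Exp}((s + t)X)(e) = \mathrm{Exp}(tX)(\mathrm{Exp}(sX)(e)) = \mathrm{Exp}(tX)(\exp(sX)) = \exp(sX)\exp(tX).$$
Hence the result.
□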
Thus the map from $\mathbb{R}$ to $G$ given by $t \mapsto \exp(tX)$ is a Lie group
homomorphism. In general a homomorphism $\varphi\colon \mathbb{R} \to G$ is called a
one-parameter subgroup of $G$. There is a close connection between the
one-parameter subgroups, the exp map and the left (right) invariant
vector fields. We shall exhibit this relation below (Theorem 2.8.32).
Lemma 2.8.28 The map $\exp\colon \mathfrak{g} \to G$ is a local diffeomorphism around
$0 \in \mathfrak{g}$.
Proof If you grant that exp is smooth then Lemma 2.8.28 is more or
less obvious. For, we must show that the derivative of exp at $0$ is a
bijection. Now for any $X \in \mathfrak{g}$ the curve $t \mapsto tX$ has $X$ as its tangent at
$0$. The image of this curve under the exp map is the curve $t \mapsto \exp(tX)$,
whose tangent vector at $e$ is $X_e \simeq X$. Hence the result follows.
So the non-trivial point is to prove the smoothness of exp. We resort
to a trick. First of all, what is the smooth structure on $\mathfrak{g}$? Since $\mathfrak{g}$
is a finite dimensional vector space over $\mathbb{R}$, canonically identified with
$T_eG$, we may use a local chart around $e$ to identify $T_e(G) \simeq \mathfrak{g}$ with $\mathbb{R}^m$.
(How?) Thus $\mathfrak{g}$ is endowed with a smooth structure which is easily seen
to be independent of the local chart chosen. In particular, $\mathfrak{g}$ becomes a
Lie group with addition as the group operation. If we can somehow show
that the flow $F$ of $X$ is a smooth function of $t$, $x$, and also of $X \simeq X_e$
then the smoothness of exp follows. To this end, we consider the Lie
group $\mathfrak{g} \times G$. On this Lie group, we consider the vector
field $Z$ defined by setting $Z(X, g) = (0, X_g)$. Then $Z$ is a smooth vector field
on $\mathfrak{g} \times G$. Also, its integral curve through $(X, g)$ is $t \mapsto (X, g\exp tX)$.
The corresponding one parameter group is $\varphi^Z_t(X, g) = (X, g\exp tX)$.
Now $\exp(X) = \pi(\varphi^Z_1(X, e))$, where $\pi$ is the projection onto the second factor.
Since $\varphi^Z_t$ is the flow of a smooth vector field it is smooth and hence
exp is also smooth.
□
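Lemma 2.8.29 Let $\phi : G \to H$ be a Lie group homomorphism. Then we have
$$\exp(d\phi(X)) = \phi(\exp(X)).$$
That is, the following diagram is commutative:
$$\begin{array}{ccc}
G & \xrightarrow{\ \phi\ } & H \\
\big\uparrow{\scriptstyle \exp} & & \big\uparrow{\scriptstyle \exp} \\
\mathfrak{g} & \xrightarrow{\ d\phi\ } & \mathfrak{h}
\end{array}$$
Proof We need only observe that the curves $t \mapsto \phi(\exp(tX))$ and $t \mapsto \exp t(d\phi(X))$ are both integral curves of $d\phi(X)$ through $e$ and hence they are the same.
□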
The importance of Lemma 2.8.29 lies in the fact that it yields the
geometric way of computing the derived homomorphism $d\phi$ which we
alluded to earlier. To find $d\phi(X)$, all we have to do is to differentiate:
$$d\phi(X) = \frac{d}{dt}\, \phi(\exp(tX))\Big|_{t=0}.$$
This is our favourite way of computing derivatives! We shall return to
see this principle in action later.
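As an illustration of this recipe, take the homomorphism $\det : GL(2, \mathbb{R}) \to \mathbb{R}^*$, whose derived homomorphism is the trace (a standard fact, not proved in this passage). The sketch below approximates $\frac{d}{dt}\det(\exp(tX))|_{t=0}$ by a central difference; the matrix `X`, the step `h` and all helper names are assumptions made for the example.

```python
def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(A, terms=40):
    """Matrix exponential via its truncated power series (small matrices only)."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

X = [[0.5, 2.0], [-1.0, 0.3]]   # assumed example element of gl(2, R)
h = 1e-5                        # step for the central difference

def det_along_curve(t):
    """phi(exp(tX)) for the homomorphism phi = det."""
    return det2(expm([[t * x for x in row] for row in X]))

# d(det)(X) computed as d/dt det(exp(tX)) at t = 0
derivative = (det_along_curve(h) - det_along_curve(-h)) / (2 * h)
trace = X[0][0] + X[1][1]
```

Up to discretisation error, `derivative` agrees with `trace`.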
Exercise 2.8.30 Let $\phi, \psi : G \to H$ be Lie group homomorphisms of the Lie groups $G$ and $H$. Further assume that $G$ is connected. If $d\phi = d\psi$, then $\phi = \psi$.
One-parameter subgroups
We now wish to establish the connection between left-invariant vector
fields and one-parameter subgroups. Let $\alpha : \mathbb{R} \to G$ be a one-parameter subgroup of a Lie group $G$. Then we have a one-parameter group of diffeomorphisms, namely, $\{L_{\alpha_t} : t \in \mathbb{R}\}$. Let $Y$ be the vector field corresponding to this family. It is more or less clear that $Y$ must be right or left-invariant. We look into this:
$$Y(f \circ R_a)(g) = \frac{d}{dt}\Big|_{t=0} f(\alpha_t g a) = Y_{ga}(f) = \big(Y(f) \circ R_a\big)(g)$$
for all $a, g \in G$ and $f \in C^\infty(G)$. That is, $Y$ is right-invariant. Of course, if we start with $\{R_{\alpha_t}\}$ then the associated vector field $X$ will be left-invariant. In fact, we have the following
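Lemma 2.8.31 Let $\{\varphi_t\}$ be a one-parameter group of diffeomorphisms of a Lie group $G$. Assume that for all $g \in G$ and $t \in \mathbb{R}$ we have $\varphi_t \circ L_g = L_g \circ \varphi_t$. Then the map $t \mapsto \alpha_t := \varphi_t(e)$ is a one-parameter subgroup of $G$ and we have $\varphi_t = R_{\alpha_t}$ for $t \in \mathbb{R}$.
Proof The proof of the fact that $\alpha_t$ is a one-parameter subgroup is the same as the one we gave for the map $t \mapsto \exp(tX)$.
To see the rest, we have
$$\varphi_t(g) = \varphi_t(L_g(e)) = L_g(\varphi_t(e)) = g\,\alpha_t = R_{\alpha_t}(g).$$
That is, $\varphi_t = R_{\alpha_t}$.
□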
All these put together establish
Theorem 2.8.32 Let $G$ be a Lie group. If $\alpha : \mathbb{R} \to G$ is a one-parameter subgroup of $G$, then there is an $X \in \mathfrak{g}$ such that
$$\alpha_t = \varphi_{(X,t)}(e) = \operatorname{Exp} tX(e).$$
Conversely, if $X \in \mathfrak{g}$ and if we set $\beta(t) := \varphi_{(X,t)}(e)$, then $\beta$ is a one-parameter subgroup of $G$ such that $\operatorname{Exp} tX = R_{\beta_t}$.
□
2.9 Frobenius theorem

Given a vector field $X$ on $M$, with $X_p \neq 0$ for any $p \in M$, the basic theorem in ODE gives us an integral curve $\gamma_p$ of $X$ passing through $p$. Notice that since $X_p \neq 0$, $\gamma_p : (-\varepsilon, \varepsilon) \to M$ is a one-one $C^\infty$ immersion if $\varepsilon$ is sufficiently small since $D\gamma_p(\frac{d}{dt}) = X(\gamma_p(t))$.
We can reformulate this as follows: Let $\mathcal{D}_p$ be the 1-dimensional subspace spanned by $X_p$ in $T_pM$. We then say $p \mapsto \mathcal{D}_p$ is a 1-dimensional differential system on $M$. What we have shown (or the ODE theorem tells us) is that given a 1-dimensional differential system and a point $p \in M$ there exists a submanifold $S$ of $M$ with $p \in S$ and such that $T_zS = \mathcal{D}_z$ for all $z \in S$, that is, $S$ is $((-\varepsilon, \varepsilon), \gamma_p)$.
We can ask for a similar result for differential systems $\mathcal{D}$ of higher dimensions: $p \mapsto \mathcal{D}_p \subset T_pM$ where $\mathcal{D}_p$ is a $d$-dimensional subspace of $T_pM$ for all $p \in M$. Notice that we have to impose certain smoothness conditions on $\mathcal{D}$. The smoothness requirement is: For all $p \in M$, there exists an open set $U$ containing $p$ and $X_1, \ldots, X_d \in \mathfrak{X}(U)$ such that $\mathcal{D}_{p_0}$ is the span of $\{X_1(p_0), \ldots, X_d(p_0)\}$ for all $p_0 \in U$. We then say that the vector fields $X_i$ span $\mathcal{D}$.
So the question is: Given a smooth $d$-dimensional differential system on $M$ and $p \in M$, does there exist a submanifold $S$ passing through $p$ (that is, $p \in S$) such that $T_zS = \mathcal{D}_z$ for all $z \in S$?
Suppose such a submanifold exists, say, $(S, \iota)$ where $\iota : S \to M$ is a one-one smooth immersion with $p \in S$. Then, if $Y_1, Y_2$ are in $\mathcal{D}$ (that is, $Y_j(z) \in \mathcal{D}_z$ for $j = 1, 2$ and for all $z$ for which $Y_j$ is defined), there exists a unique $X_j \in \mathfrak{X}(S)$ such that $X_j$ is $\iota$-related to $Y_j$ for $j = 1, 2$. Hence $[X_1, X_2] \in \mathfrak{X}(S)$ is $\iota$-related to $[Y_1, Y_2]$ and so
$$[Y_1, Y_2](z) = D\iota(\iota^{-1}(z))\big([X_1, X_2](\iota^{-1}(z))\big) \in \mathcal{D}_z = T_z(\iota(S))$$
for all $z \in \iota(S)$. That is, whenever $Y_1, Y_2$ lie in $\mathcal{D}$, $[Y_1, Y_2]$ also lies in $\mathcal{D}$. This is a necessary condition for the existence of a submanifold $S$ through any point $p$ so that $T_zS = \mathcal{D}_z$ holds for all $z \in S$.
The Frobenius theorem says that this condition is also sufficient.
Definition 2.9.1 A differential system $\mathcal{D}$ is said to be involutive if for all $X, Y \in \mathcal{D}$, we have $[X, Y] \in \mathcal{D}$, that is, for all $p \in M$ we have $[X, Y](p) \in \mathcal{D}_p$.
A submanifold $S$ is said to be an integral submanifold at $p$ or through $p$ for the differential system $\mathcal{D}$ if $p \in S$ and $\mathcal{D}_z = T_zS$ for all $z \in S$.
$\mathcal{D}$ is said to be integrable if for any point $p \in M$, there is an integral submanifold $S$ through $p$.
Thus the Frobenius theorem says that a $d$-dimensional differential system $\mathcal{D}$ is involutive if and only if it is integrable.
Let $\mathcal{D}$ be an involutive $d$-dimensional differential system. For $p \in M$, let $\{X_i : 1 \le i \le d\}$ span $\mathcal{D}$ on a neighborhood $U$ of $p$. The involutivity of $\mathcal{D}$ means that there are functions $c_{ijk}$ such that for all $i, j$ we have
$$[X_i, X_j] = \sum_{k=1}^{d} c_{ijk} X_k.$$
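Involutivity can be tested numerically at a point: approximate the bracket by finite differences and check whether it lies in the span of the given fields. The sketch below does this for two rank-2 systems on $\mathbb{R}^3$; the vector fields, the sample point and the helper names are assumptions chosen for illustration. For three vectors in $\mathbb{R}^3$, the determinant of $(X, Y, [X, Y])$ vanishes exactly when the bracket lies in the span of the linearly independent pair $X, Y$.

```python
import math

def lie_bracket(X, Y, p, h=1e-5):
    """Approximate [X, Y](p) = DY(p) X(p) - DX(p) Y(p) by central differences."""
    def directional(F, v):
        plus = F([pi + h * vi for pi, vi in zip(p, v)])
        minus = F([pi - h * vi for pi, vi in zip(p, v)])
        return [(a - b) / (2 * h) for a, b in zip(plus, minus)]
    dY_X = directional(Y, X(p))
    dX_Y = directional(X, Y(p))
    return [a - b for a, b in zip(dY_X, dX_Y)]

def det3(r0, r1, r2):
    return (r0[0] * (r1[1] * r2[2] - r1[2] * r2[1])
            - r0[1] * (r1[0] * r2[2] - r1[2] * r2[0])
            + r0[2] * (r1[0] * r2[1] - r1[1] * r2[0]))

p = [0.3, 0.4, 0.5]   # sample point (x, y, z)

# System 1: X = d/dx + y d/dz, Y = d/dy.  Here [X, Y] = -d/dz is NOT in the span.
X1 = lambda q: [1.0, 0.0, q[1]]
Y1 = lambda q: [0.0, 1.0, 0.0]
bracket_1 = lie_bracket(X1, Y1, p)
det_1 = det3(X1(p), Y1(p), bracket_1)

# System 2: X = d/dx, Y = e^x d/dy.  Here [X, Y] = Y, so the system is involutive.
X2 = lambda q: [1.0, 0.0, 0.0]
Y2 = lambda q: [0.0, math.exp(q[0]), 0.0]
bracket_2 = lie_bracket(X2, Y2, p)
det_2 = det3(X2(p), Y2(p), bracket_2)
```

A nonzero `det_1` shows that the first system fails the involutivity condition at `p`, while `det_2` vanishes up to discretisation error.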
Let us look for the simplest case of an involutive system. Given $p \in M$, suppose that there exists an open neighborhood $U$ of $p$ and $X_1, \ldots, X_d \in \mathfrak{X}(U)$ such that $X_i \in \mathcal{D}$ for all $i$ and $[X_i, X_j] = 0$ for all $i, j$. (We assume $X_1, \ldots, X_d$ span $\mathcal{D}$ on $U$.) A prototype of such an involutive system is $X_i = \frac{\partial}{\partial x_i}$, the coordinate vector fields on $U$. Can we say that these are the only examples (locally)?
Let us take up a test case. If $\mathcal{D}$ is 1-dimensional and $\mathcal{D}$ is spanned by $X$ on an open set $U$ with $p \in U$, we should then have a coordinate chart $(U, \varphi)$ around $p$ with local coordinates $x_i$ such that $X = \frac{\partial}{\partial x_1}$ on $U$. But observe that $\frac{\partial}{\partial x_1}$ has the $x_1$-coordinate curve $t \mapsto \varphi^{-1}(t, 0, \ldots, 0)$ as its integral curve. Hence we should have the integral curve of $X$ as the first coordinate curve. This suggests that if $(V, \psi)$ with local coordinates $y_i$ is the original coordinate chart, we should replace the $y_1$-coordinate curve by the integral curve of $X$. This idea is carried out to prove
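Theorem 2.9.2 Let $X \in \mathfrak{X}(M)$, and $p \in M$ with $X_p \neq 0$. Then there exists a coordinate chart $(U, x)$ around $p$ such that $X = \frac{\partial}{\partial x_1}$ on $U$.
Proof Let $(V, \psi)$ be a coordinate chart centered at $p$ with $y_i$ as the local coordinates. Since $X_p \neq 0$ and $X_p, \frac{\partial}{\partial y_1}\big|_p, \ldots, \frac{\partial}{\partial y_m}\big|_p$ span $T_pM$, we can assume that $X_p, \frac{\partial}{\partial y_2}\big|_p, \ldots, \frac{\partial}{\partial y_m}\big|_p$ form a basis of $T_pM$. We now implement our idea above. Let
$$F(s, y_2, \ldots, y_m) := \varphi_s\big(\psi^{-1}(0, y_2, \ldots, y_m)\big).$$
Here $\varphi_s$ is the "flow" of $X$: $\varphi_s(p) = \gamma_p(s)$ $(= \text{“}F(s, p)\text{”})$ in the notation following Theorem 2.8.9. $F$ is defined for a sufficiently small neighborhood of 0 in $\mathbb{R}^m$. Then
$F(0, \ldots, 0) = p$, for $\varphi_0(\psi^{-1}(0, \ldots, 0)) = \varphi_0(p) = \gamma_p(0) = p$.
$F$ is smooth as $\psi^{-1}$ and $\varphi_s$ are smooth.
$DF(0, \ldots, 0)$ is nonsingular since it takes the basis $\frac{\partial}{\partial s}\big|_0$, $\frac{\partial}{\partial y_i}\big|_0$ for $2 \le i \le m$, to the basis $X_p$, $\frac{\partial}{\partial y_i}\big|_p$ for $i \ge 2$ of $T_pM$. This is because $\frac{\partial}{\partial s}\big|_0$ is the tangent vector at 0 to the smooth curve $s \mapsto (s, 0, \ldots, 0)$ and this curve is mapped by $F$ to the integral curve of $X$ through $p$:
$$s \mapsto F(s, 0, \ldots, 0) = \varphi_s(\psi^{-1}(0, \ldots, 0)) = \varphi_s(p)$$
and hence $\frac{\partial}{\partial s}\big|_0$ is mapped by $DF(0)$ to the tangent vector of the curve $s \mapsto \varphi_s(p) = \gamma_p(s)$, that is, $\gamma_p'(0) = X_p$.
A similar reasoning establishes $DF(0)\big(\frac{\partial}{\partial y_j}\big|_0\big) = \frac{\partial}{\partial y_j}\big|_p$ for all $j > 1$.
Thus, by the inverse mapping theorem, $F$ maps a neighborhood of 0 diffeomorphically onto a neighborhood $U$ of $p$ in $M$. Obviously, $(U, F^{-1})$ is the desired coordinate chart.
□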
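The construction in this proof can be tried on a concrete field. The sketch below assumes $X = -y\,\frac{\partial}{\partial x} + x\,\frac{\partial}{\partial y}$ on $\mathbb{R}^2$, the point $p = (1, 0)$ and the chart $\psi(x, y) = (y, x - 1)$ centered at $p$; these choices and the function names are illustrative only. The resulting map $F(s, y_2)$ is a polar-type parametrisation, and the code checks by a central difference that $\partial F/\partial s = X \circ F$, which is what $X = \frac{\partial}{\partial x_1}$ means in the new chart.

```python
import math

def X(point):
    """The rotation field -y d/dx + x d/dy."""
    x, y = point
    return (-y, x)

def flow(s, point):
    """Exact flow of X: rotation about the origin by the angle s."""
    x, y = point
    return (x * math.cos(s) - y * math.sin(s),
            x * math.sin(s) + y * math.cos(s))

def psi_inverse(y1, y2):
    """Inverse of the assumed chart psi(x, y) = (y, x - 1) centred at p = (1, 0)."""
    return (1.0 + y2, y1)

def F(s, y2):
    """F(s, y2) = phi_s(psi^{-1}(0, y2)), as in the proof."""
    return flow(s, psi_inverse(0.0, y2))

def max_residual(h=1e-6):
    """Largest deviation of dF/ds from X(F) over a few sample points."""
    worst = 0.0
    for s in (-0.4, 0.0, 0.3):
        for y2 in (-0.2, 0.0, 0.25):
            fp, fm = F(s + h, y2), F(s - h, y2)
            dF_ds = ((fp[0] - fm[0]) / (2 * h), (fp[1] - fm[1]) / (2 * h))
            field = X(F(s, y2))
            worst = max(worst, abs(dF_ds[0] - field[0]), abs(dF_ds[1] - field[1]))
    return worst

residual = max_residual()
base_point = F(0.0, 0.0)   # should be p = (1, 0)
```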
Exercise 2.9.3 Let $X \in \mathfrak{X}(M)$ and $p \in M$ be given. Assume that $X_p \neq 0$. For an open set $U \ni p$ and $f \in C^\infty(U)$ there exists $V$, an open subset of $U$, with $p \in V$ and $u \in C^\infty(V)$ such that $Xu = f$ holds on $V$.
This exercise may be considered as an initiation to the modern theory of partial differential equations (PDE).
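Theorem 2.9.4 Let $p \in V$, where $V$ is an open subset of $M$. Let $X, Y \in \mathfrak{X}(V)$ be such that $[X, Y] = 0$ and $X_p, Y_p$ are linearly independent non-zero vectors. Then there exists a coordinate neighborhood $(U, x)$ of $p$ such that $X = \frac{\partial}{\partial x_1}$ and $Y = \frac{\partial}{\partial x_2}$ on $U$.
Proof We may assume $(V, \psi, y)$ is a coordinate chart centered at $p$. As earlier, we may assume $X_p, Y_p, \frac{\partial}{\partial y_3}\big|_p, \ldots, \frac{\partial}{\partial y_m}\big|_p$ form a basis of $T_pM$. Let $\varphi^1$ (respectively, $\varphi^2$) be the flow of $X$ (respectively, of $Y$). That is, $\varphi^1_s(p) = \gamma_p(s)$ where $\gamma_p$ is the integral curve of $X$ and $\varphi^2_s(p) = \sigma_p(s)$, where $\sigma_p$ is the integral curve of $Y$.
Encouraged by our success in Theorem 2.9.2, we can define
$$F(s, t, y_3, \ldots, y_m) := \varphi^1_s\big(\varphi^2_t(\psi^{-1}(0, 0, y_3, \ldots, y_m))\big)$$
around a neighborhood of $0 \in \mathbb{R}^m$. Then $F(0) = p$ and $F$ is smooth. $DF(0)$ is non-singular as it carries $\frac{\partial}{\partial s}\big|_0$ to $X_p$, $\frac{\partial}{\partial t}\big|_0$ to $Y_p$ and $\frac{\partial}{\partial y_j}\big|_0$ to $\frac{\partial}{\partial y_j}\big|_p$ for $j \ge 3$. Hence $F$ is a diffeomorphism of a neighborhood of 0 onto a neighborhood $U$ of $p$ in $M$. We set $(U, F^{-1}, x)$ as the coordinate chart.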
Are we through? No. For we can imitate the argument in Theorem 2.9.2 to conclude $X = \frac{\partial}{\partial x_1}$ on $U$ and $Y = \frac{\partial}{\partial x_2}$ on the set
$$\{z \in U : x_1(z) = 0\}$$
but not on the whole of $U$! (Convince yourself of this; the trouble is the place of $\varphi^1_s$ in the expression for $F$ and hence we cannot cross over!)
Since $\frac{\partial}{\partial x_i}$ are coordinate vector fields, we have
$$\frac{\partial}{\partial x_i}(x_j) = \delta_{ij}.$$
That is, $X(x_i) = \delta_{i1}$ on $U$ and $Y = \sum_i Y(x_i) \frac{\partial}{\partial x_i}$. So, what we know is that the function $Y(x_i) = \delta_{i2}$ on $\{z \in U : x_1(z) = 0\} =: S$. Through every point of the slice $S$ as we move along the $x_1$ curves, we describe the open set $U$. If we show that the function $Y(x_i)$ remains constant as we move along the $x_1$ curves, we are done. For, the constant has to be the value of $Y(x_i)$ on $S$, namely, $\delta_{i2}$.
Since $x_1$ curves are the integral curves of $X$, to show that the function $Y(x_i)$ is constant along $x_1$ curves is equivalent to proving $X(Y(x_i)) = 0$. But $X(Y(x_i)) = Y(X(x_i))$ as $[X, Y] = 0$. By what we said above $X(x_i) = \delta_{i1}$, a constant, and hence $Y(X(x_i)) = 0$ or $X(Y(x_i)) = 0$, that is, $Y(x_i) = \delta_{i2}$, or $Y = \frac{\partial}{\partial x_2}$ on $U$.
□
Remarks 2.9.5
1. The above proof extends in an obvious way to the situation when $\{X_i\}_{i=1}^{d}$ are given such that $[X_i, X_j] = 0$ for all $i, j$.
2. We observe that Theorem 2.9.4 (on a pair of commuting vector fields) implies the commutativity of their flows: $\varphi^1_s \circ \varphi^2_t = \varphi^2_t \circ \varphi^1_s$ for all $s, t$ in the domain of definition. (Check.)
3. Using (2), we can give a proof of the analogue of Theorem 2.9.4 mentioned in (1) using the commutativity of the flows $\varphi^i$ and $\varphi^j$. (Exercise.)
4. The most important observation, of course, is what Theorem 2.9.4 tells us about the integral submanifolds through $z \in U$ of $\mathcal{D} = \operatorname{span}\{X_i : 1 \le i \le d\}$. Let us start with $p \in U$. Then, by construction, the coordinates satisfy $x_i(p) = 0$ for $1 \le i \le m$. The submanifold through $p$ should have $\operatorname{span}\{X_i(p) : 1 \le i \le d\}$ as the tangent space at $p$. We then have
$$\operatorname{span}\{X_i(p) : 1 \le i \le d\} = \operatorname{span}\Big\{\frac{\partial}{\partial x_i}\Big|_p : 1 \le i \le d\Big\}.$$
Therefore the submanifold is given by the slice
$$\{q \in U : x_j(q) = 0,\ d + 1 \le j \le m\}.$$
Similarly for any point $q \in U$, the integral submanifold of $\mathcal{D}$ through $q$ is given by the slice
$$\{z \in U : x_j(z) = x_j(q),\ j \ge d + 1\}.$$
5. Integral submanifolds through a point are by no means unique. For instance, if the differential system on $\mathbb{R}^2$ is given by $\frac{\partial}{\partial \theta}$ in the usual notation, we could take any arc of the circle passing through the point $p$ with center at the origin, or any union of such arcs of such circles as long as it contains an arc of the circle through $p$, as an integral submanifold.
Exercise 2.9.6 Let $S$ be the set of all lines through $(0, y)$ parallel to the $x$-axis, where $y$ varies over the set of irrationals. Is it an integral submanifold for $\frac{\partial}{\partial x}$?
Exercise 2.9.7 Draw a few integral submanifolds of $X = -y \frac{\partial}{\partial x} + x \frac{\partial}{\partial y}$.
6. The example in (5) suggests to us what conditions to impose on integral submanifolds to get uniqueness results: We look for maximal connected integral submanifolds through a given point. (What does the emphasized phrase mean?) The global version of the Frobenius theorem says that for an involutive $\mathcal{D}$, there is a unique maximal connected integral submanifold through each point. (See Theorem 2.9.9.)
7. Let $N$ be a submanifold through $p$ such that $T_zN \subset \mathcal{D}_z$ for all $z \in N$. (The submanifold $N$ need not have the maximum dimension $d$.) Assume that $N$ is connected and $N \subset U$. We then claim that $N$ is contained in a slice. To see this, we observe that $dx_j(v) = 0$ for $v \in \mathcal{D}_z$ and for all $z \in U$ and $j \ge d + 1$. This is clear, since $\mathcal{D}_z = \operatorname{span}\{\frac{\partial}{\partial x_i}\big|_z : 1 \le i \le d\}$ and $dx_j(\frac{\partial}{\partial x_i}) = \delta_{ij}$. Hence, in a similar way, $dx_j(v) = 0$ for $v \in T_zN$ for all $z \in N$. Since $N$ is connected, this means that the smooth functions $x_j$ are constants on $N$, that is, for all $z \in N$, we have $x_j(z) = x_j(p)$ for $d + 1 \le j \le m$. Hence the claim.
We shall now deduce the general Frobenius theorem from the weaker
version (Theorem 2.9.4 and its extension in item 1 of Remark 2.9.5)
using an algebraic trick.
Theorem 2.9.8 (Frobenius theorem - local version) Let $\mathcal{D}$ be an involutive $d$-dimensional differential system on $M$ and $p \in M$. Then there exists an integral submanifold for $\mathcal{D}$ through $p$.
Proof Let $(U, x)$ be a coordinate chart around $p$, small enough so that there exist $X_1, X_2, \ldots, X_d \in \mathfrak{X}(U)$ which span $\mathcal{D}$ on $U$. We can write $X_i = \sum_{j=1}^{m} f_{ij} \frac{\partial}{\partial x_j}$ with $f_{ij} \in C^\infty(U)$. The fact that $X_i(p)$ span a $d$-dimensional subspace of $T_pM$ tells us that the matrix of $C^\infty$-functions $(f_{ij})_{1 \le i \le d,\ 1 \le j \le m}$ has rank $d$.
Without loss of generality, we may assume that $(f_{ij})_{1 \le i, j \le d}$ is invertible around $p$, say, with inverse $(g_{ij})_{1 \le i, j \le d}$, where $g_{ij} \in C^\infty(V)$, and $V$ is an open neighborhood of $p$ in $U$. Set $Y_i = \sum_{j=1}^{d} g_{ij} X_j$ for $1 \le i \le d$. Then $Y_i \in \mathcal{D}$ (Why?) and $\operatorname{span}\{Y_i\} = \mathcal{D}$. (Why?) Notice that
$$Y_i = \frac{\partial}{\partial x_i} + \sum_{j \ge d+1} a_{ij} \frac{\partial}{\partial x_j}$$
so that
$$[Y_i, Y_k] = \sum_{j \ge d+1} a_{ikj} \frac{\partial}{\partial x_j}. \qquad (2.9.1)$$
Since $\mathcal{D}$ is involutive, $[Y_i, Y_k] \in \mathcal{D}$ for $1 \le i, k \le d$ and since $\{Y_i\}_{i=1}^{d}$ span $\mathcal{D}$ we can write
$$[Y_i, Y_k] = \sum_{1 \le l \le d} b_{ikl} Y_l.$$
Now if $[Y_i, Y_k] \neq 0$, by linear independence, $b_{ikl} \neq 0$ for some $l$ with $1 \le l \le d$. Therefore $[Y_i, Y_k]$ will have a non-zero $\frac{\partial}{\partial x_l}$ term for some $l$ with $1 \le l \le d$. This contradicts Equation (2.9.1). We thus conclude $[Y_i, Y_k] = 0$ for $1 \le i, k \le d$. Theorem 2.9.4 (or more precisely, Remark 2.9.5-1) now completes the proof.
□
We are now ready to prove the global version of the Frobenius theorem.
Theorem 2.9.9 (Frobenius theorem - global version) Let $\mathcal{D}$ be a $d$-dimensional, involutive differential system on $M$ and $p \in M$. Then through $p$, there passes a unique maximal connected integral submanifold of $\mathcal{D}$. Every connected integral submanifold of $\mathcal{D}$ through $p$ is contained in the maximal one.
Proof We start with the observation that a manifold is connected if and only if any two points are connected by a piecewise smooth path.
Now if such a maximal connected integral submanifold $N$ containing $p$ exists, then any point $q \in N$ is joined to $p$ by a smooth curve $\gamma$ such that $\gamma(0) = p$ and $\gamma(1) = q$. Then $\gamma'(t) \in T_{\gamma(t)}(N) = \mathcal{D}_{\gamma(t)}$. That is, $N$ must be made up of integral curves of $\mathcal{D}$. This suggests to us how to define $N$.
Let $N$ be the set of all points $q$ in $M$ which can be joined to $p$ by piecewise smooth integral curves of $\mathcal{D}$. That is, there exist smooth curves $\gamma_1, \gamma_2, \ldots, \gamma_k$ such that
$$\gamma_1(0) = p, \qquad \gamma_i(1) = \gamma_{i+1}(0) \ \text{for } 1 \le i < k, \qquad \gamma_k(1) = q,$$
and $\gamma_i'(t) \in \mathcal{D}_{\gamma_i(t)}$.
By the local Frobenius theorem (Theorem 2.9.8) and (4) of Remark 2.9.5 after Theorem 2.9.4 and the second countability of $M$, there is a countable open cover of $M$ by coordinate charts $\{(U_i, x^i_1, \ldots, x^i_m)\}_{i=0}^{\infty}$ such that the integral submanifolds of $\mathcal{D}$ in $U_i$ are given by slices of the form
$$S_i = \{z \in U_i : x^i_j(z) = c_j,\ d + 1 \le j \le m\}$$
where the $c_j$'s are constants. We shall assume $p \in U_0$.
Now given $q \in N$, there exists $i(q) \in \mathbb{N} \cup \{0\}$ such that $q \in U_{i(q)}$ and there is a slice $S_{i(q)} \ni q$ in $U_{i(q)}$. Thus the collection
$$\big\{\big(S_{i(q)}, x^{i(q)}_1, \ldots, x^{i(q)}_d\big)\big\}_{q \in N}$$
satisfies the requirements of our practical definition of a manifold. That is, the above collection defines a smooth atlas on $N$. Note that, by definition of $N$, $N$ is connected.
To claim that it is indeed a manifold, we need only show that $N$ is second countable (with respect to the topology given by the above atlas).
For this, fix $i \in \mathbb{N} \cup \{0\}$. It is enough to show that only countably many slices of $U_i$ can lie in $N$. (Why?)
Each point of $U_i$, which belongs to $N$, is joined to $p$ by piecewise smooth curves which lie entirely in $N$. Each such curve from $p$ to $U_i$ passes through a finite sequence $U_0, U_{i_1}, \ldots, U_{i_k} = U_i$. (Why?) (Note that this sequence need not be unique.) Now there are only countably many such sequences from $U_0$ to $U_i$. (Why?) Hence it suffices to show that for each such sequence there are at most countably many slices of $U_i$ reachable in the above way. To prove this, we observe that for all $j, k \in \mathbb{N} \cup \{0\}$, a single slice of $U_j$ can intersect at most countably many slices of $U_k$. For, if $S$ is a slice of $U_j$, then $S \cap U_k$ is an open submanifold of $S$ and hence contains at most countably many connected components. Now each such component is a connected integral manifold of $\mathcal{D}$ in $U_k$ and hence lies in a slice of $U_k$ by (7) of Remark 2.9.5. This proves the second countability of $N$.
The other assertions in the theorem are obvious. Hence $N$ is the unique maximal connected integral submanifold of $\mathcal{D}$ through $p$.
□
Now we give an application of the Frobenius theorem:
Theorem 2.9.10 Let $G$ be a Lie group with Lie algebra $\mathfrak{g}$. Let $\mathfrak{h}$ be a Lie subalgebra of $\mathfrak{g}$. Then there exists a unique connected Lie subgroup $H$ of $G$ such that $\operatorname{Lie}(H) = \mathfrak{h}$.
Proof Let $G$ be a Lie group. Let $\mathfrak{h}$ be a Lie subalgebra of $\mathfrak{g}$. Define $\mathcal{D}_g = \{X_g : X \in \mathfrak{h}\}$ for each $g \in G$. Then $\mathcal{D}$ is an involutive differential system of rank $d = \dim \mathfrak{h}$. By the global Frobenius theorem there exists a maximal integral submanifold $H$ containing $e$. We claim that $H$ is a subgroup.
Let $a \in H$. Then $L_a(H)$ is a connected integral submanifold for $\mathcal{D}$ and $L_a(H)$ contains $a$. Therefore $L_a(H) \subset H$. Similarly one shows that if $a \in H$ then $a^{-1} \in H$. This means that $H$ is a subgroup. If we prove $H$ is a Lie group then $\iota : H \hookrightarrow G$ is a Lie subgroup. In view of Theorem 2.6.16, it is enough to prove the following
Lemma 2.9.11 Let $M$ be a manifold with an involutive differential system $\mathcal{D}$. Let $S$ be an integral submanifold of $\mathcal{D}$. Then if $\phi : N \to M$ is a smooth map such that $\phi(N) \subset S$, $\phi : N \to S$ is continuous and hence smooth.
Proof Let $p \in S$ and $n \in \phi^{-1}(p)$. Choose a coordinate chart $(U, x)$ of $p$ in $M$ such that the integral submanifolds of $\mathcal{D}$ in $U$ are given by the slices of the form $\{z \in U : x_j(z) = c_j,\ d + 1 \le j \le m\}$. Let $\Sigma$ be the slice such that $p \in \Sigma$. $\Sigma$ is, by definition, open in $S$. Let $V$ be the connected component of $\phi^{-1}(U)$ containing $n$. To prove continuity, it is enough if we show that $\phi(V) \subset \Sigma$. Now $x_j \circ \phi$ is a continuous map of the connected set $V$ and hence $x_j \circ \phi(V)$ is an interval containing $c_j$. But then $\phi(V) \subset S \cap U$. Since $S$ is an integral submanifold, $S \cap U$ consists of at most countably many slices. This means that the above interval has to be the singleton interval $\{c_j\}$. That is, $\phi(V) \subset \Sigma$. This proves the continuity of $\phi$. Smoothness of $\phi$ follows from Theorem 2.6.16.
The uniqueness part is left to the reader.
□
Thus we have established the following
Theorem 2.9.12 Let G be a Lie group with Lie algebra g. Then there
exists a bijective correspondence between connected Lie subgroups of G
and Lie subalgebras of g.
o
2.10 Lie groups and Lie algebras

Exponential map and Taylor's formula
Let $G$ be a Lie group with $\operatorname{Lie}(G) = \mathfrak{g}$. In this chapter, we shall investigate the close relation between the algebraic structure of $\mathfrak{g}$ and the analytic structure of $G$. The main tool for this is the exponential map, exp of $\mathfrak{g}$ to $G$.
It may be worthwhile recalling the definition of exp. If $X \in \mathfrak{g}$ then $X$ is complete so that we have a one-parameter group of diffeomorphisms denoted by $\{\operatorname{Exp}(tX) : t \in \mathbb{R}\}$ associated to $X$. We then set
$$\exp(tX) := \operatorname{Exp}(tX)(e)$$
so that $\{R_{\exp(tX)} : t \in \mathbb{R}\}$ is the one-parameter group of diffeomorphisms corresponding to $X$. That is, if we denote by $F$ the map
$$F : \mathbb{R} \times G \to G \quad \text{given by} \quad (t, g) \mapsto g \exp(tX),$$
then $F$ is the flow of $X$. Even though we have proved this earlier, we indicate a proof of this because of its significant role in what follows.
Fix $g \in G$. Consider the curve $\gamma_g : t \mapsto g \exp(tX)$. Then $\gamma_g = L_g \circ \gamma$ (where $\gamma = \gamma_e$) is a curve through $g$. The tangent vector of $\gamma_g$ at $g$ is
$$D\gamma_g(0)\Big(\frac{d}{dt}\Big|_{t=0}\Big) = (DL_g) \circ (D\gamma)(0)\Big(\frac{d}{dt}\Big|_{t=0}\Big) = (DL_g)(X_e) = X_g = X_{\gamma_g(0)}.$$
More generally, we have $D\gamma_g(s)\big(\frac{d}{dt}\big|_{t=s}\big) = X_{\gamma_g(s)}$. Thus $\gamma_g$ is the unique integral curve of $X$ through $g$. Hence we have for a smooth function $f$ on $G$
$$X_g(f) = \frac{d}{dt}\big(f \circ \phi_t(g)\big)\Big|_{t=0} = \frac{d}{dt} f(g \exp(tX))\Big|_{t=0}.$$
We have seen that exp is a diffeomorphism on a neighborhood $V$ of $0 \in \mathfrak{g}$ onto a neighborhood $U$ of $e \in G$. We fix a basis $X_1, \ldots, X_n$ of $\mathfrak{g}$. Then we get a system of coordinates on $U$ given by: $x_i(g) = t_i$ if $g = \exp(X) \in U$ and $X = \sum_i t_i X_i$. This is referred to as a system of canonical coordinates around $e$.
We want to prove a Taylor formula for smooth functions on $G$ again. To do this, we claim
$$X^k(f)(g) = \frac{d^k}{dt^k} f(g \exp(tX))\Big|_{t=0}.$$
We shall prove this by induction. For $k = 1$, this follows from the previous paragraph. We assume as induction hypothesis that the claim is true for all $j \le k$.
$$\begin{aligned}
(X^{k+1})(f)(g \exp(tX)) &= X\big(X^k(f)\big)(g \exp(tX)) \\
&= X_{g \exp(tX)}\big(X^k(f)\big) \\
&= \frac{d}{ds}\big(X^k(f)\big)(g \exp(tX) \exp(sX))\Big|_{s=0} \\
&= \frac{d}{ds}\big(X^k(f)\big)(g \exp((t + s)X))\Big|_{s=0} \\
&= \frac{d}{du}\big(X^k(f)(g \exp(uX))\big)\Big|_{u=t} \\
&= \frac{d^{k+1}}{dt^{k+1}} f(g \exp(tX)).
\end{aligned}$$
In particular, we see that
$$X^{k+1}(f)(g) = \frac{d^{k+1}}{dt^{k+1}} f(g \exp(tX))\Big|_{t=0}.$$
We also get the following Taylor formula for smooth $f$ on $G$:
$$f(g \exp(tX)) = \sum_{k=0}^{n} \frac{t^k}{k!} X^k(f)(g) + o(t^n).$$
The following theorem lists some facts about the exponential map
which are the tools needed to establish the relation between the analytic
structure of a Lie group G and the algebraic structure of Lie(G).
Theorem 2.10.1 Let $G$ be a Lie group with Lie algebra $\mathfrak{g}$. Let $X, Y \in \mathfrak{g}$. We then have:
1. $\exp(tX) \exp(tY) = \exp\{t(X + Y) + \frac{t^2}{2}[X, Y] + O(t^3)\}$.
2. $\exp(tX) \exp(tY) \exp(-tX) \exp(-tY) = \exp\{t^2 [X, Y] + O(t^3)\}$.
3. $\exp(tX) \exp(tY) \exp(-tX) = \exp\{tY + t^2 [X, Y] + O(t^3)\}$.
Proof We point out that all these formulas are obvious generalizations of the corresponding ones in the case of $G = GL(n, \mathbb{R})$. For example, let us see how to prove the second one in this special case. Here, as we pointed out earlier, we have
$$\exp(tX) = e^{tX} = \sum_{k=0}^{\infty} \frac{t^k X^k}{k!}.$$
Let $P_2(Z) := [1 + tZ + \frac{t^2}{2} Z^2 + O(t^3)]$ for any $Z \in \mathfrak{g}$. Then, for small enough $t$,
$$e^{tX} e^{tY} e^{-tX} e^{-tY} = P_2(X)\, P_2(Y)\, P_2(-X)\, P_2(-Y) = e^{t^2 [X, Y]} + O(t^3).$$
The other formulas are derived in the same way for $GL(n, \mathbb{R})$. We also remark that we shall prove later a more general version of the second formula.
To deal with the general case, we use Taylor's formula. We have
$$f(\exp(sX) \exp(tY)) = \sum_{m+n=0}^{2} \frac{s^m t^n}{m!\, n!} X^m Y^n (f)(e) + O(3).$$
Here we have used $O(3)$ to denote terms in which the sum of exponents of $s$ and $t$ is at least 3. Therefore we have
$$f(\exp(tX) \exp(tY)) = \sum_{m+n=0}^{2} \frac{t^{m+n}}{m!\, n!} X^m Y^n (f)(e) + O(t^3). \qquad (2.10.1)$$
But if $t$ is small enough then $\exp(tX)$ and $\exp(tY)$ lie in a neighbourhood of $e \in G$ and so does $\exp(tX) \exp(tY)$. Therefore there is a unique $Z(t)$ in a neighbourhood of $0 \in \mathfrak{g}$ such that
$$\exp(tX) \exp(tY) = \exp(Z(t)).$$
We note that $Z(0) = 0$ and that $t \mapsto Z(t)$ is smooth. (Why?) We can, therefore, write
$$Z(t) = Z(0) + tZ_1 + \frac{t^2}{2} Z_2 + O(t^3) = tZ_1 + \frac{t^2}{2} Z_2 + O(t^3).$$
We again use Taylor's formula to get:
$$\begin{aligned}
f(\exp(Z(t))) &= \sum_{k=0}^{2} \frac{1}{k!} \Big(tZ_1 + \frac{t^2}{2} Z_2\Big)^k f(e) + O(t^3) \\
&= f(e) + tZ_1(f)(e) + \frac{t^2}{2} Z_2 f(e) + \frac{t^2}{2} Z_1^2 f(e) + O(t^3). \qquad (2.10.2)
\end{aligned}$$
Comparing the coefficients of powers of $t$ in Equations (2.10.1) and (2.10.2) and taking $f$ as the coordinate functions of a system of canonical coordinates, we have
$$Z_1 = X + Y \quad \text{and} \quad \tfrac{1}{2}(Z_1^2 + Z_2) = XY + \tfrac{1}{2}(X^2 + Y^2).$$
Using the first in the second we have
$$\tfrac{1}{2}(X^2 + Y^2 + XY + YX) + \tfrac{1}{2} Z_2 = XY + \tfrac{1}{2}(X^2 + Y^2)$$
which yields $Z_2 = [X, Y]$. Hence (1) follows. Comparing the last of the Equation (2.10.2) with the following equation
$$f(\exp(tX) \exp(tY) \exp(-tX) \exp(-tY)) = \sum_{m,n,r,s=0}^{2} \frac{t^{m+n} (-t)^{r+s}}{m!\, n!\, r!\, s!} X^m Y^n X^r Y^s (f)(e) + O(t^3)$$
we see, by reasoning as above, that $Z_1 = 0$ and $\tfrac{1}{2} Z_2 = [X, Y]$. Thus (2) follows. (3) is left as an exercise to the reader.
□
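The second formula of Theorem 2.10.1 can be observed numerically in a matrix group. The sketch below assumes $G = GL(2, \mathbb{R})$ with the nilpotent matrices `X` and `Y` shown (chosen for the example, not taken from the text) and measures the gap between the group commutator and $\exp(t^2 [X, Y])$. If that gap is of order $t^3$, halving $t$ should divide it by about 8.

```python
def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(A, terms=40):
    """Matrix exponential via its truncated power series (small matrices only)."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def bracket(A, B):
    """Matrix commutator [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(AB, BA)]

X = [[0.0, 1.0], [0.0, 0.0]]   # assumed example
Y = [[0.0, 0.0], [1.0, 0.0]]   # assumed example

def commutator_error(t):
    """Max-norm distance between exp(tX)exp(tY)exp(-tX)exp(-tY) and exp(t^2 [X, Y])."""
    def e(sign, A):
        return expm([[sign * t * x for x in row] for row in A])
    group_comm = matmul(matmul(e(1, X), e(1, Y)), matmul(e(-1, X), e(-1, Y)))
    target = expm([[t * t * x for x in row] for row in bracket(X, Y)])
    return max(abs(a - b) for ra, rb in zip(group_comm, target)
               for a, b in zip(ra, rb))

error_coarse = commutator_error(0.02)
error_fine = commutator_error(0.01)
ratio = error_coarse / error_fine   # close to 2**3 for a third-order error
```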
As an application of Theorem 2.10.1, we shall prove Cartan's theorem on closed subgroups of Lie groups. To understand the proof we suggest that the reader attempt a proof of this in the case when $G = \mathbb{R}^n$.
Theorem 2.10.2 (Cartan's theorem) Let $G$ be a Lie group and $H$ a closed subgroup of $G$. Then there exists a unique differential structure on $H$ making it a Lie group so that the natural inclusion $\iota : H \to G$ is a Lie homomorphism.
Proof If the result is true, then we should be able to talk of the Lie algebra of $H$. Why are we interested in finding it? The reason is that using the "Lie subalgebra" and the exp we may be able to endow a smooth structure on $H$. If $K$ is any Lie subgroup of $G$ then the Lie algebra of $K$ is given by
$$\operatorname{Lie}(K) = \{X \in \mathfrak{g} : \exp(tX) \in K \text{ for all } t \in \mathbb{R}\}.$$
Thus we are led to consider
$$\mathfrak{h} := \{X \in \mathfrak{g} : \exp(tX) \in H \text{ for all } t \in \mathbb{R}\}.$$
We first remark that $\mathfrak{h}$ is non-empty since $0 \in \mathfrak{h}$. Is $\mathfrak{h}$ a Lie subalgebra of $\mathfrak{g}$? Notice that it is closed under scalar multiplication. It is also closed under addition: For, if $X, Y$ are in $\mathfrak{h}$, then $\exp(tX) \exp(tY) \in H$ for all $t \in \mathbb{R}$ since $H$ is a subgroup. Now it is easy to show that the following hold:
$$\begin{aligned}
\exp\{t(X + Y)\} &= \lim_{n \to \infty} \Big(\exp\frac{tX}{n}\, \exp\frac{tY}{n}\Big)^{n}, \\
\exp\{t^2 [X, Y]\} &= \lim_{n \to \infty} \Big(\exp\frac{tX}{n}\, \exp\frac{tY}{n}\, \exp\frac{-tX}{n}\, \exp\frac{-tY}{n}\Big)^{n^2}.
\end{aligned} \qquad (2.10.3)$$
Since $H$ is a closed subgroup, the Equations (2.10.3) show that $\mathfrak{h}$ is a Lie subalgebra of $\mathfrak{g}$.
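The first limit in (2.10.3) can be watched converging in a matrix group. The following is a rough sketch assuming $G = GL(2, \mathbb{R})$, $t = 1$ and the two matrices shown; these choices and the helper names are illustrative. The error is expected to shrink roughly like $1/n$.

```python
def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(A, terms=40):
    """Matrix exponential via its truncated power series (small matrices only)."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def matpow(A, n):
    """A multiplied by itself n times (plain repeated multiplication)."""
    size = len(A)
    result = [[float(i == j) for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, A)
    return result

X = [[0.0, 1.0], [0.0, 0.0]]   # assumed example
Y = [[0.0, 0.0], [1.0, 0.0]]   # assumed example
target = expm([[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)])

def product_formula_error(n):
    """Distance between (exp(X/n) exp(Y/n))^n and exp(X + Y)."""
    step = matmul(expm([[x / n for x in row] for row in X]),
                  expm([[y / n for y in row] for row in Y]))
    approx = matpow(step, n)
    return max(abs(a - b) for ra, rb in zip(approx, target)
               for a, b in zip(ra, rb))

errors = [product_formula_error(n) for n in (10, 100, 1000)]
```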

Now, by the Frobenius theorem, we know that there is a unique connected Lie subgroup $H^*$ such that $\operatorname{Lie}(H^*) = \mathfrak{h}$. The obvious thing to do is to claim that $H^*$ is $H^0$, the connected component of $H$ containing the identity. By definition of $\mathfrak{h}$ it follows that $H^* \subset H^0$. Since $H^0$ is the identity component of $H$ it is enough to show that $H^*$ is open in $H^0$. For, any open subgroup of $H$ is always closed and hence by connectedness of $H^0$ we must have $H^* = H^0$.
If $H^*$ is not open in $H^0$ then there exists a sequence $x_n$ of points of $H^0 \setminus H^*$ such that $\lim_{n \to \infty} x_n = e$. (Why?) We choose a complementary subspace $\mathfrak{m}$ of $\mathfrak{h}$ in $\mathfrak{g}$: that is, such that $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{m}$. Let $U$ and $V$ be compact neighbourhoods of 0 in $\mathfrak{h}$ and $\mathfrak{m}$ so that the map
$$(X, Y) \mapsto \exp X \exp Y$$
is a diffeomorphism on $U \times V$, a neighbourhood of $0 \in \mathfrak{g}$. By selecting a subsequence $x_k$, if necessary, we may assume that $x_k = \exp(X_k) \exp(Y_k)$ with $X_k \in U$ and $Y_k \in V$. We observe that $\exp(X_k) \in H$, $X_k \to 0$ and $Y_k \neq 0$ for every $k$.
For every $k$, we choose a positive integer $r_k$ such that $r_k Y_k \in V$ but $(r_k + 1) Y_k \notin V$. This is possible since $V$ is a compact neighbourhood of 0 in the vector space $\mathfrak{m}$. Again, by compactness of $V$, there is a limit point of $r_k Y_k$, say $Z \in V$. Since $Z$ is a limit point of the sequence $r_k Y_k$, $Z \neq 0$. For, otherwise, $(r_k + 1) Y_k \in V$ for $k \gg 0$. This contradicts our assumption on $r_k$. If we prove that $Z \in \mathfrak{h}$, we are through.
To show $Z \in \mathfrak{h}$, we must prove that $\exp(tZ) \in H$ for all $t \in \mathbb{R}$. Since $H$ is a closed subgroup, it suffices to show that $\exp(\frac{m}{n} Z) = \exp(\frac{1}{n} Z)^m$ lies in $H$ for all $m, n \in \mathbb{N}$. It is clear from what we wrote just now that we need only prove $\exp(\frac{1}{n} Z) \in H$ for any positive integer $n$. Let $r_k = s_k n + t_k$ with $0 \le t_k < n$. Then we have
$$\exp\Big(\frac{r_k}{n} Y_k\Big) = \exp(s_k Y_k)\, \exp\Big(\frac{t_k}{n} Y_k\Big) = (\exp Y_k)^{s_k} \exp\Big(\frac{t_k}{n} Y_k\Big).$$
Since $Y_k \to 0$ and $\frac{t_k}{n} < 1$, we see that $\exp(\frac{t_k}{n} Y_k) \to e$. Also since $r_k Y_k \to Z$ we have $\exp(\frac{r_k}{n} Y_k) \to \exp(\frac{1}{n} Z)$. The facts that $\exp Y_k \in H$ and that $H$ is closed imply that
$$\exp\Big(\frac{1}{n} Z\Big) = \lim_{k} \exp(s_k Y_k) \in H.$$
This completes the proof of the theorem.
□
As a consequence of Cartan's theorem we see that the closed subgroups of $GL(n, \mathbb{R})$ such as $O(n, \mathbb{R})$, $SL(n, \mathbb{R})$ are Lie subgroups and, in particular, Lie groups, a fact which we already knew!
Adjoint representation
Let $G$ be a Lie group. For $a, g \in G$, let $\iota_a$ denote the inner conjugation: $g \mapsto a g a^{-1}$. Then $\iota_a$ is a Lie group homomorphism. For example, in the case of $G = GL(n, \mathbb{R})$, we have $\iota_A(B) = ABA^{-1}$ for $A, B$ in $G$. What is the derived map of the corresponding Lie algebras? From what we learnt from the section on flows, we know that we must use the exp map to compute the derivative. Let $\operatorname{Ad}(A) := d\iota_A$ be the Lie algebra homomorphism of $\mathfrak{g}$ to itself. Then we have
$$\operatorname{Ad}(A)(X) = \frac{d}{dt} A \exp(tX) A^{-1}\Big|_{t=0} = \frac{d}{dt} e^{tAXA^{-1}}\Big|_{t=0},$$
so that $\operatorname{Ad}(A)(X) = AXA^{-1}$. Thus $\operatorname{Ad} : G \to GL(\mathfrak{g}) := \operatorname{Aut}(\mathfrak{g})$ is a Lie group homomorphism. Now $\operatorname{Aut}(\mathfrak{g})$ has $\operatorname{End}(\mathfrak{g})$ as its Lie algebra. (The reader who is not comfortable with these abstract objects may start with a basis of $\mathfrak{g}$ and use it to identify $GL(\mathfrak{g})$ and $\operatorname{End}(\mathfrak{g})$ with $GL(n^2, \mathbb{R})$ and $M(n^2, \mathbb{R})$ respectively.) Now again the same question: what is the derived map of the Lie algebras? Notice that if we denote by ad the derived map $d\operatorname{Ad}$, then $\operatorname{ad} : \mathfrak{g} \to \operatorname{End}(\mathfrak{g})$. To find an expression for ad, we take $A_t := \exp(tX)$ so that for $Y \in \mathfrak{g}$,
$$\operatorname{Ad}(A_t)(Y) = e^{tX} Y e^{-tX}.$$
Hence
$$\operatorname{ad}(X)(Y) = \frac{d}{dt}\big(e^{tX} Y e^{-tX}\big)\Big|_{t=0} = XY - YX.$$
That is, $\operatorname{ad}(X)(Y) = [X, Y]$.
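We now want to prove the analogues of these in the general case using the Taylor expressions of the last section. The differential $d\iota_a =: \operatorname{Ad}(a)$ is a Lie algebra homomorphism of $\mathfrak{g}$. Thus $\operatorname{Ad} : G \to GL(\mathfrak{g})$ is a group homomorphism of $G$ into $GL(\mathfrak{g}) = \operatorname{Aut}(\mathfrak{g})$, the group of linear automorphisms of the vector space $\mathfrak{g}$. It is easily verified that Ad is indeed smooth. (Exercise.) That is, Ad is a Lie group homomorphism of $G$ into $\operatorname{Aut}(\mathfrak{g})$. The map ad is a Lie algebra homomorphism of $\mathfrak{g}$ to $\operatorname{End}(\mathfrak{g})$, the Lie algebra of the Lie group $\operatorname{Aut}(\mathfrak{g})$.
Let $X \in \mathfrak{g}$ and $t \in \mathbb{R}$. Let $g = \exp(tX)$. Then we have
$$\begin{aligned}
\exp\big(\operatorname{Ad}(g)(tY)\big) &= \iota_g(\exp(tY)) \\
&= \exp(tX) \exp(tY) \exp(-tX) \\
&= \exp\{tY + t^2 [X, Y] + O(t^3)\}.
\end{aligned}$$
But in view of the relation "$\varphi(\exp X) = \exp(d\varphi(X))$", we get
$$\operatorname{Ad}(\exp(tX)) = e^{\operatorname{ad} tX} = e^{t \operatorname{ad} X}.$$
$$\begin{array}{ccc}
G & \xrightarrow{\ \operatorname{Ad}\ } & GL(\mathfrak{g}) \\
\big\uparrow{\scriptstyle \exp} & & \big\uparrow{\scriptstyle \exp} \\
\mathfrak{g} & \xrightarrow{\ \operatorname{ad}\ } & \operatorname{End}(\mathfrak{g})
\end{array}$$
The reader should notice that on the right we have used the fact that we have $\exp(Z) = e^Z$ in $GL(\mathfrak{g})$. Thus we get
$$\operatorname{Ad}(\exp(tX))(tY) = I(tY) + t^2 \operatorname{ad}(X)(Y) + O(t^3).$$
Hence
$$\exp\{\operatorname{Ad}(\exp(tX))(tY)\} = \exp\{I(tY) + t^2 \operatorname{ad}(X)(Y) + O(t^3)\}.$$
Comparing both expressions for small values of $t$, we get
$$\operatorname{ad}(X)(Y) = [X, Y].$$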
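For matrices both facts can be checked directly: the derivative of $t \mapsto \operatorname{Ad}(\exp(tX))(Y)$ at $t = 0$ is the commutator, and $\operatorname{Ad}(\exp X) = e^{\operatorname{ad} X}$. The sketch below is a minimal illustration assuming $\mathfrak{g} = \mathfrak{gl}(2, \mathbb{R})$; the matrices `X`, `Y`, the step `h` and the helper names are assumptions made for the example.

```python
def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(A, terms=40):
    """Matrix exponential via its truncated power series (small matrices only)."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def scale(t, A):
    return [[t * x for x in row] for row in A]

def bracket(A, B):
    """Matrix commutator [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(AB, BA)]

def max_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

X = [[0.0, 1.0], [-0.5, 0.2]]   # assumed example
Y = [[1.0, 0.0], [0.3, -1.0]]   # assumed example

def Ad_exp(t):
    """Ad(exp(tX))(Y) = exp(tX) Y exp(-tX)."""
    return matmul(matmul(expm(scale(t, X)), Y), expm(scale(-t, X)))

# (a) ad(X)(Y) as the derivative of Ad(exp(tX))(Y) at t = 0 (central difference)
h = 1e-5
plus, minus = Ad_exp(h), Ad_exp(-h)
derivative = [[(p - m) / (2 * h) for p, m in zip(rp, rm)]
              for rp, rm in zip(plus, minus)]
ad_error = max_diff(derivative, bracket(X, Y))

# (b) Ad(exp X)(Y) versus e^{ad X}(Y) = sum over k of ad(X)^k (Y) / k!
series = [row[:] for row in Y]
term = [row[:] for row in Y]
for k in range(1, 40):
    term = scale(1.0 / k, bracket(X, term))
    series = [[s + u for s, u in zip(rs, ru)] for rs, ru in zip(series, term)]
exp_ad_error = max_diff(Ad_exp(1.0), series)
```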

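Definition 2.10.3 A Lie subalgebra $\mathfrak{h}$ of a Lie algebra $\mathfrak{g}$ is an ideal in $\mathfrak{g}$ if
$Y \in \mathfrak{h}$ and $X \in \mathfrak{g}$ implies $[X, Y] \in \mathfrak{h}$.
Theorem 2.10.4 If $H$ is a connected Lie subgroup of $G$ which is normal in $G$, then $\mathfrak{h} := \operatorname{Lie}(H)$ is an ideal in $\mathfrak{g} := \operatorname{Lie}(G)$. Conversely if $G$ is a connected Lie group and $\mathfrak{h}$ is an ideal in $\mathfrak{g}$, then the connected Lie subgroup $H$ corresponding to the Lie subalgebra $\mathfrak{h}$ is a normal subgroup of $G$.
Proof Let $H$ be normal in $G$. Let $Z \in \mathfrak{h}$ and $X \in \mathfrak{g}$. Then $\exp(tZ) \in H$ for all $t \in \mathbb{R}$. Since $H$ is normal in $G$ we have, for $s, t \in \mathbb{R}$,
$$\exp(sX) \exp(tZ) \exp(-sX) \in H.$$
We first notice that $g \exp(tZ) g^{-1} = \exp(t \operatorname{Ad}(g)(Z))$. (Check this.) Hence, for $g \in G$, $t \in \mathbb{R}$ and $Z \in \mathfrak{h}$ we see that $\exp(t \operatorname{Ad}(g) Z) \in H$. Now $t \mapsto \exp(t \operatorname{Ad}(g)(Z))$ is a one-parameter subgroup of $H$ with $\operatorname{Ad}(g)(Z)$ as the associated left-invariant vector field on $H$. If we now take $g := \exp(sX)$ then
$$\operatorname{Ad}(\exp(sX))(Z) = e^{s \operatorname{ad}(X)}(Z) \in \mathfrak{h} \quad \text{for all } s \in \mathbb{R}.$$
The curve $s \mapsto e^{s \operatorname{ad}(X)}(Z)$ lies in $\mathfrak{h}$ and has $\operatorname{ad}(X)(Z)$ as its tangent vector at $s = 0$. Thus $[X, Z] \in \mathfrak{h}$. The converse part is left as an exercise to the reader.
□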
Theorem 2.10.5 Let $G$ be a connected Lie group. Then the center of $G$ is the kernel of Ad.
Proof Exercise. (Hint: Notice that the kernel is a closed subgroup of $G$.)
□
Corollary 2.10.6 The center $Z(G)$ of $G$ is a closed Lie subgroup. We have
$$\operatorname{Lie}(Z(G)) = \{X \in \mathfrak{g} : \operatorname{ad}(X)(Y) = 0 \text{ for all } Y \in \mathfrak{g}\}.$$
Proposition 2.10.7 Let $G$ be a Lie group. Let $\varphi : \mathbb{R} \to G$ be a continuous
homomorphism. Then $\varphi$ is smooth.
Proof We begin by noticing that if the proposition is true, then $\varphi$ is a
one-parameter subgroup of $G$. Due to the close correspondence between
such subgroups and the left invariant vector fields there should be an
$X \in \mathfrak{g}$, the Lie algebra of $G$, which generates $\varphi$. How to find this $X$?
Since exp is a diffeomorphism around $0 \in \mathfrak{g}$, there is a convex
neighbourhood $U$ of $0$ in $\mathfrak{g}$ on which exp is one-one, and $\varepsilon > 0$ such that
$\varphi([-\varepsilon, \varepsilon]) \subset \exp(\tfrac12 U)$. Thus there exists a unique $X \in \tfrac12 U$ such that
$\varphi(\varepsilon) = \exp(X)$. Now the crucial observations are the existence of a
unique square root for any $g \in \exp(\tfrac12 U)$ thanks to the relation
$$\exp(sX)\exp(tX) = \exp((s + t)X)$$
and the fact that the dyadic rationals are dense in $\mathbb{R}$. (Recall that the
set of all dyadic rationals is of the form $\{m/2^n : m, n \in \mathbb{Z}\}$.)
Let $g \in \exp(\tfrac12 U)$. Then $g = \exp(\tfrac12 X)$ for some $X \in U$. We can
also write $g = \exp(\tfrac14 X)^2$ with $\tfrac14 X \in \tfrac14 U$. Thus $g$ has a square root in
$\exp(\tfrac14 U)$. If $a \in \exp(\tfrac12 U)$ is such that $g = a^2$, then $a = \exp(\tfrac12 Y)$ for
some $Y \in U$ so that
$$\exp(\tfrac12 X) = g = a^2 = (\exp(\tfrac12 Y))^2 = \exp Y.$$
This means that $\tfrac12 X = Y$ since both lie in $U$ on which exp is injective
(that is, one-one). Thus every element in $\exp(\tfrac12 U)$ has a unique square
root in $\exp(\tfrac12 U)$.
The uniqueness of square roots and the homomorphism property of $\varphi$ give
$$\varphi(\varepsilon/2) = \exp(\tfrac12 X)$$
$\Rightarrow \varphi(\varepsilon/2^k) = \exp(X/2^k)$ for all $k \geq 1$, by induction;
$\Rightarrow \varphi(r\varepsilon/2^k) = \exp(rX/2^k)$ for all $r, k \geq 1$;
$\Rightarrow \varphi(t\varepsilon) = \exp(tX)$ for all $t \in [-1, 1]$, by continuity.
Thus we see that $\varphi(t\varepsilon) = \exp(tX)$ for $t \in \mathbb{R}$. □
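The square root argument can be tried out numerically in the simplest case. The following is a minimal sketch, assuming $G = S^1$ (unit complex numbers), $U = \{i\theta : |\theta| < \pi\}$ and a hypothetical homomorphism `phi(t) = exp(i*a*t)`; the names `a`, `eps` and `phi` are illustrative and not part of the text. It recovers $X$ from the single value $\varphi(\varepsilon)$ and checks the dyadic values.

```python
import cmath

# Assumed example: G = S^1 and phi(t) = exp(i*a*t); `a` only serves to define phi.
a = 0.7
eps = 1.0          # a*eps < pi/2, so phi([-eps, eps]) lies in exp(U/2)

def phi(t):
    """A continuous homomorphism from R into the unit circle."""
    return cmath.exp(1j * a * t)

# X is read off from phi(eps) alone, using the local inverse of exp.
X = cmath.log(phi(eps))          # principal logarithm

# cmath.sqrt returns the root with non-negative real part, which is the
# unique square root lying in exp(U/2) = {exp(i*theta) : |theta| < pi/2}.
g = phi(eps)
for k in range(1, 11):
    g = cmath.sqrt(g)
    assert abs(g - phi(eps / 2**k)) < 1e-12
    assert abs(g - cmath.exp(X / 2**k)) < 1e-12

# Dyadic multiples r/2^k then agree by the homomorphism property.
max_err = max(abs(phi(r * eps / 2**6) - cmath.exp(r * X / 2**6))
              for r in range(-64, 65))
print(max_err < 1e-12)
```

The loop only uses square roots of the known value $\varphi(\varepsilon)$, which mirrors the induction on $k$ in the proof; continuity is what extends the agreement from dyadic $t$ to all real $t$.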
Theorem 2.10.8 Let $\varphi : G \to H$ be a continuous homomorphism
between Lie groups. Then $\varphi$ is a Lie group homomorphism.
Proof Since $\varphi$ is a homomorphism, it is enough to show that it is
smooth at $e$. We choose a basis $X_1, \ldots, X_n$ of $\mathfrak{g}$. For each $i$, the map
$t \mapsto \varphi(\exp(tX_i))$ is a continuous homomorphism and hence smooth
by Proposition 2.10.7. Thus the map is a one-parameter subgroup
of $H$. Hence there is a left invariant vector field $Y_i$ on $H$ such that
$\varphi(\exp(tX_i)) = \exp(tY_i)$ for $t \in \mathbb{R}$. Since $\varphi$ is a homomorphism, we have
$$\varphi(\exp(t_1X_1)\cdots\exp(t_nX_n)) = \exp(t_1Y_1)\cdots\exp(t_nY_n).$$
But, for sufficiently small $t_i$, the map
$$\psi : (t_1, \ldots, t_n) \mapsto \exp(t_1X_1)\cdots\exp(t_nX_n)$$
is a diffeomorphism of a neighborhood of $0 \in \mathbb{R}^n$ onto a neighborhood
$V$ of $e \in G$. On $V$, we have $\varphi = (\varphi \circ \psi) \circ \psi^{-1}$. But then $(\varphi \circ \psi)$ is
the map $(t_1, \ldots, t_n) \mapsto \exp(t_1Y_1)\cdots\exp(t_nY_n)$ which was shown to be
smooth. Therefore, $\varphi$, being the composition of two smooth maps, is
smooth around $e \in G$.
□
Exercise 2.10.9 Give an alternative proof of the above theorem using
the following facts: (i) The graph of $\varphi$ is a closed subgroup of
$G \times H$. (ii) A bijective homomorphism of Lie groups is a diffeomorphism.
Corollary 2.10.10 Let G be a locally Euclidean second countable topo-

logical group. Then there exists at most one smooth structure on G with
respect to which G is a Lie group.
Proof Follows by considering the identity map from one to the other.
o
The fifth problem of D. Hilbert asks whether there exists any smooth
structure on G, as above, which makes G into a Lie group. It was
solved in the affirmative by Gleason and Montgomery. The details of
this solution can be found in the monograph Topological Transformation
Groups by Montgomery and Zippin.
The following result shows that the only Lie subgroups which are
also topological subgroups are the closed subgroups.
Proposition 2.10.11 Let $H$ be a Lie subgroup of a Lie group $G$. If the
topology of $H$ is the subspace topology induced from $G$, then $H$ is closed
in $G$.
Proof Let $\overline{H}$ be the closure of $H$. Then it is easy to see that $\overline{H}$ is a
closed subgroup of $G$. Thus, by Cartan's theorem (Theorem 2.10.2), $\overline{H}$
is a Lie subgroup of $G$. By hypothesis, the natural inclusion $\iota : H \to G$ is
smooth. Since $\overline{H}$ is a regular submanifold of $G$, $\iota : H \to \overline{H}$ is continuous
and hence smooth. Since $H$ has the subspace topology, $H$ is a regular
submanifold of $\overline{H}$.
Suppose now $H \neq \overline{H}$. If $d := \dim H < r := \dim \overline{H}$, then for any
$p \in H$ there are coordinate neighborhoods $U$ in $H$ and $V$ in $\overline{H}$ so that
$$V \cap H = \{z \in V : x_{d+1}(z) = \cdots = x_r(z) = 0\}.$$
Since $H$ is dense in $\overline{H}$, $x_j(z) = 0$ for all $z \in V$ and for $r \geq j \geq d + 1$.
This is a contradiction to the fact that the $dx_j$'s are linearly independent
on $V$. Hence we see that $\dim H = \dim \overline{H}$ and hence $H = \overline{H}$.
□
Remark 2.10.12 There is a purely topological proof of the proposition.
We recall that a locally compact subspace of a Hausdorff space is locally
closed and hence open in its closure. Using this, we see that $H$ is open in
its closure $\overline{H}$. Any open subgroup $K$ in a topological group $G$ is closed,
for $G$ is the union of cosets of $K$. It follows that $H$ is closed in $\overline{H}$ and
hence the result follows.
Exercise 2.10.13 Let $X, Y$ be in $\mathfrak{g}$, the Lie algebra of a Lie group $G$.
Assume that $[X, Y] = 0$. Then
$$\exp(sX)\exp(tY) = \exp(sX + tY) \quad \text{for all } s, t \in \mathbb{R}.$$
Exercise 2.10.14 A connected Lie group is abelian if and only if its

Lie algebra is abelian. Why do we need to assume 'connectedness'?
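We shall indicate a proof of the smoothness of Ad, in case you have
not got one. The idea is to imitate the method of proof of the smoothness
of any left invariant vector field. Since Ad is a homomorphism, it is
enough to show that Ad is smooth at $e$. Let $(U, x)$ be a coordinate chart
centered at $e \in G$. Let $\iota_g(z) := gzg^{-1}$ denote the conjugation by $g$, so that
$\mathrm{Ad}(g) = D\iota_g(e)$. We choose a neighborhood $V \subset U$ of $e$ so that
$\iota_g(z) \in U$ for $g \in U$ and $z \in V$. Since $\iota_g : V \to U$ is smooth, there exist
smooth functions $f_j$ on a neighborhood of $(0, 0) \in \mathbb{R}^{2n}$ for $1 \leq j \leq n$ such
that
$$x_j(\iota_g(z)) = f_j(x_1(g), \ldots, x_n(g), x_1(z), \ldots, x_n(z))$$
for $g \in U$ and $z \in V$. Now, for a fixed $g$, we have
$$\mathrm{Ad}(g)\Big(\frac{\partial}{\partial x_i}\Big|_e\Big) = \sum_j \frac{\partial f_j}{\partial y_i}(x(g), 0)\, \frac{\partial}{\partial x_j}\Big|_e,$$
where $y_1, \ldots, y_n$ denote the last $n$ coordinates on $\mathbb{R}^{2n}$.
The coefficients in the last sum are smooth in $g$ and they constitute the
matrix elements of $\mathrm{Ad}(g)$ with respect to the basis $\frac{\partial}{\partial x_i}\big|_e$ of $T_e(G)$. Hence
Ad is smooth at $e$.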
2.11 Homogeneous spaces

Let us start with some general remarks on the study of groups. As the
reader undoubtedly knows, groups were originally studied as groups of
transformations of some object.
Definition 2.11.1 If $G$ is an abstract group and $M$ is any set we say
that $G$ acts on $M$ if there is a map $\psi : G \times M \to M$ satisfying the
conditions:
1. $\psi(e, x) = x$ for $x \in M$ and $e$, the identity of $G$.
2. $\psi(ab, x) = \psi(a, \psi(b, x))$ for all $a, b \in G$ and $x \in M$.

Notice that the conditions imply that if we denote by $\rho(g)$ the map
$x \mapsto \psi(g, x)$ then $\rho(g)$ is a bijection of $M$. Also note that the map
$g \mapsto \rho(g)$ is a group homomorphism of $G$ into the group (with respect to
composition of maps) of all bijective maps of $M$ to itself. Thus we realize
$G$ as a group of (bijective) transformations of $M$. As is generally done,
we will write $gx$ in place of $\psi(g, x)$ if the action $\psi$ is understood. As
examples, take $M := G$ and $\psi(a, g) = L_a(g)$, $R_{a^{-1}}(g)$ or $\iota_a(g)$ where $L_a$
stands for the left translation, $R_{a^{-1}}$ for the right translation by $a^{-1}$, and
$\iota_a(g) = aga^{-1}$ is the inner conjugation by $a$. In this case we have
$\rho(a) = L_a$, $R_{a^{-1}}$ or $\iota_a$. If $H$ is a subgroup of
$G$, then $G$ acts on the quotient set $G/H$ (of left cosets of $G$ with respect
to $H$) as follows: $\psi(a, gH) := agH$.
Definition 2.11.2 We say that $G$ acts transitively on $X$ if given $x, y$
in $X$ there is $g \in G$ such that $gx = y$.
This is the same as requiring that if $x_0$ is a fixed element of $X$, then
$Gx_0 := \{gx_0 : g \in G\}$ is $X$. That is, given $y \in X$, there is a $g \in G$
such that $gx_0 = y$. This $g$ is not unique, as can be seen in the example
of group action, namely, $G/H$. The transitivity condition means that
the set $X$ is homogeneous: there is no distinguished point; all points are
the same as far as the group action is concerned.
If $G$ is a Lie group and the set $M$ is a smooth manifold we would
like to impose some smoothness condition on the action $\psi$. The natural
condition is to insist on the smoothness of $\psi$. We say a manifold $M$ is
homogeneous if there is a smooth transitive action of a Lie group $G$ on
$M$.
Before going any further let us look at some examples of such actions.
Example 2.11.3 Let $G$ be a Lie group and take $M = G$. Then the
above actions such as $\psi(a, g) := ag$, etc., are in fact smooth actions.
Which of these actions are transitive?
Example 2.11.4 Let $G = GL(n, \mathbb{R})$ and $M = \mathbb{R}^n$. Define
$$\psi(g, v) := gv := g(v),$$
the image of the vector $v \in \mathbb{R}^n$ under the linear map $g$. Then $\psi$ is a
smooth action of $G$ on $\mathbb{R}^n$. Is this action transitive?
Example 2.11.5 Let $H$ be the Lie subgroup $O(n)$ of $GL(n, \mathbb{R})$. Take
$M = S^{n-1}$, the $(n - 1)$-dimensional unit sphere in $\mathbb{R}^n$. Then the
restriction of the action $\psi$ of the previous example gives rise to an action of
$H$ on $S^{n-1}$. This action is smooth, since $O(n)$ (respectively, $S^{n-1}$) is a
regular submanifold of $GL(n, \mathbb{R})$ (respectively, $\mathbb{R}^n$).
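Example 2.11.6 Let $G = SL(2, \mathbb{R})$ and $M$ be the upper half plane
$M := \{(x, y) \in \mathbb{R}^2 : y > 0\} \subset \mathbb{R}^2$. We think of $\mathbb{R}^2$ as the one-dimensional
complex plane $\mathbb{C}$ and write $z$ for $(x, y) \in \mathbb{R}^2$. Then $G$ acts on $M$ via
fractional linear transformations:
$$\text{If } g := \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \text{ then } gz := \frac{az + b}{cz + d}.$$
This is easily seen to be a smooth action of $G$ on $M$. $G$ acts transitively
on $M$. To see this, let $z = (x, y) \leftrightarrow x + iy$. Then
$g = \begin{pmatrix} \sqrt{y} & x/\sqrt{y} \\ 0 & 1/\sqrt{y} \end{pmatrix}$ takes
the point $i \leftrightarrow (0, 1)$ to $z$.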
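The action by fractional linear transformations is easy to test numerically. The following is a minimal sketch using Python's built-in complex numbers; the helper names `act`, `mul` and `to_point` are illustrative only. It checks that the matrix with rows $(\sqrt{y},\, x/\sqrt{y})$ and $(0,\, 1/\sqrt{y})$ takes $i$ to $x + iy$, and that $(gh)z = g(hz)$ for a sample pair.

```python
import math

def act(g, z):
    """Fractional linear action of g = ((a, b), (c, d)) on a complex number z."""
    (a, b), (c, d) = g
    return (a * z + b) / (c * z + d)

def mul(g, h):
    """Product of two 2x2 matrices given as nested tuples."""
    (a, b), (c, d) = g
    (p, q), (r, s) = h
    return ((a * p + b * r, a * q + b * s), (c * p + d * r, c * q + d * s))

def to_point(x, y):
    """A matrix of determinant 1 taking i to x + iy, assuming y > 0."""
    r = math.sqrt(y)
    return ((r, x / r), (0.0, 1.0 / r))

g = to_point(3.0, 2.0)
h = ((2.0, 1.0), (1.0, 1.0))      # determinant 2*1 - 1*1 = 1
z = 0.5 + 1.5j                    # a point of the upper half plane

image_of_i = act(g, 1j)                                  # should be 3 + 2i
action_gap = abs(act(mul(g, h), z) - act(g, act(h, z)))  # should vanish
stays_in_M = act(h, z).imag > 0                          # h maps M into M
print(image_of_i, action_gap, stays_in_M)
```

Since every point of $M$ is the image of $i$ under some `to_point(x, y)`, any two points can be joined by a group element, which is the transitivity claimed in the example.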
Example 2.11.7 Let
$$G := \Big\{(\alpha, x) := \begin{pmatrix} \alpha & x \\ 0 & 1 \end{pmatrix} : \alpha \in \mathbb{C},\ |\alpha| = 1,\ x \in \mathbb{C}\Big\}.$$
Then $G$ is a closed subgroup of $GL(2, \mathbb{C})$ and hence is a Lie group. It
acts on $\mathbb{C} \leftrightarrow \mathbb{R}^2$ as a group of rigid motions as follows: for $z \in \mathbb{C}$ and
$g := (\alpha, x) \in G$ set $gz := \alpha z + x$. If $a := (\alpha, x)$ and $b := (\beta, y)$ are in $G$,
then $(ab)z = \alpha\beta z + \alpha y + x$ so that the composite action corresponds to
the matrix multiplication. The action is smooth. It is transitive, since
if $z \in \mathbb{C}$ then we may take $g := (1, z)$ which sends $0 \in \mathbb{C}$ to $z$.
This example can be generalized. Let $G$ be the group of rigid motions
of $\mathbb{R}^n$. Then $G$ consists of pairs $(g, v)$, where $g \in O(n)$, the group of
orthogonal transformations on $\mathbb{R}^n$, and $v \in \mathbb{R}^n$. $G$ acts on $\mathbb{R}^n$ as follows:
if $(a, u) \in G$, then
$$(a, u)x := ax + u.$$
We then have:
$$(a, u)((b, v)(x)) = (a, u)(bx + v) = (ab)x + (av + u).$$
This induces a group operation on $G$: $(a, u) \cdot (b, v) := (ab, av + u)$. It is
easy to see that with the product smooth structure on $G = O(n) \times \mathbb{R}^n$,
$G$ becomes a Lie group and $G$ acts on $\mathbb{R}^n$ smoothly. Any element $g \in G$
acts rigidly on $\mathbb{R}^n$ in the sense that it is isometric on $\mathbb{R}^n$.
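The group law $(a, u) \cdot (b, v) = (ab, av + u)$ and the isometry property can be checked on a small case. The following is a minimal sketch for $n = 2$ with matrices stored as nested tuples; all function names are illustrative and the sample motions are arbitrary choices.

```python
import math

def rot(theta):
    """Rotation of the plane by the angle theta, an element of O(2)."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))

def mat_vec(a, x):
    return (a[0][0] * x[0] + a[0][1] * x[1], a[1][0] * x[0] + a[1][1] * x[1])

def mat_mul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def act(g, x):
    """(a, u)x := ax + u."""
    a, u = g
    return add(mat_vec(a, x), u)

def compose(g, h):
    """(a, u).(b, v) := (ab, av + u)."""
    (a, u), (b, v) = g, h
    return (mat_mul(a, b), add(mat_vec(a, v), u))

g = (rot(0.3), (1.0, -2.0))
h = (((1.0, 0.0), (0.0, -1.0)), (0.5, 4.0))   # a reflection followed by a shift
x, y = (2.0, 3.0), (-1.0, 0.5)

# Acting by h and then by g agrees with acting by the product g.h
law_gap = math.dist(act(g, act(h, x)), act(compose(g, h), x))
# Distances are preserved by the action of g
iso_gap = abs(math.dist(act(g, x), act(g, y)) - math.dist(x, y))
print(law_gap < 1e-12, iso_gap < 1e-12)
```

The product rule is forced by the requirement that composition of motions corresponds to multiplication in $G$; the translation part of the product is $av + u$ and not $u + v$.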
Let $G$ be a Lie group and let $H$ be a closed subgroup of $G$. Let $\mathfrak{h} :=
\mathrm{Lie}(H)$ and $\mathfrak{g} := \mathrm{Lie}(G)$. Then by Cartan's theorem (Theorem 2.10.2)
we know that $H$ has a unique smooth structure so that $H$ becomes a
Lie subgroup of $G$. There is a natural topology on $G/H$ so that the
canonical quotient map $\pi : G \to G/H$ given by $\pi(g) := gH$ is both
continuous and open. The class $\{U : \pi^{-1}(U) \text{ is open in } G\}$ of subsets of
$G/H$ forms a topology on $G/H$. In fact it is easily verified that this is
the unique topology satisfying the conditions that $\pi$ is continuous and
open. It is also equally easy to see that the map
$$G \times G/H \to G/H \quad \text{given by} \quad (a, gH) \mapsto agH$$
is continuous for $a, g$ in $G$. What we would like to do is to make $G/H$
into a smooth manifold so that the above map is smooth. The idea is to
use a complementary subspace $M$ of $\mathfrak{h}$ in $\mathfrak{g}$, the Lie algebra of $G$, and
the exp map. So, let $\mathfrak{g} = \mathfrak{h} \oplus M$ as vector spaces. Let $p_0 := eH$ be
the identity coset. We plan to describe local charts at $p_0$ and transport
them to other points of the space $G/H$ via the group action. We let $\psi$ be
the restriction of exp to the subspace $M$. The lemma below is a precise
statement of what we said above.
Lemma 2.11.8 There exists a neighborhood $U$ of $0 \in M$ such that
$\psi : U \to \psi(U)$ is a homeomorphism and $\pi$ maps $\psi(U)$ onto a
neighborhood of $p_0$ in $G/H$.
Proof We choose neighborhoods $U_1$ and $U_2$ of $0$ in $M$ and $\mathfrak{h}$ such that
the map $(X, Y) \mapsto \exp(X)\exp(Y)$ from $U_1 \times U_2$ is a diffeomorphism
onto a neighborhood of $e$ in $G$. Since $H$ is closed, it is a Lie subgroup
and is a regular submanifold of $G$. Hence we can choose a neighborhood
$V$ of $e$ in $G$ such that $V \cap H = \exp(U_2)$. Now we choose a compact
neighborhood $U \subset U_1$ such that $\exp(-U)\exp(U) \subset V$.
Then $\psi$, the restriction of exp to $M$, is a homeomorphism of $U$ onto
$\psi(U)$. Moreover, $\pi$ is one-one on $\psi(U)$. For, if $X_1, X_2$ are in
$U$ with $\pi(\exp(X_1)) = \pi(\exp(X_2))$ then there is an $h \in H$ such that
$\exp(X_1)h = \exp(X_2)$. Hence
$$\exp(-X_1)\exp(X_2) = h \in V \cap H = \exp(U_2).$$
This implies that
$$\exp(-X_1)\exp(X_2) = \exp(Z)$$
for some $Z \in U_2$, that is, $\exp(X_2)\exp(0) = \exp(X_1)\exp(Z)$. Since the map
$(X, Y) \mapsto \exp(X)\exp(Y)$ is one-one on $U_1 \times U_2$, we get $X_1 = X_2$ and
$Z = 0$. Thus $\pi$ is one-one on $\psi(U)$. Since $\pi \circ \psi$ is a composition of
continuous maps, it is continuous and one-one on the compact set $U$ and hence it is a
homeomorphism onto its image.
Since $U \times U_2$ is a neighborhood of $(0, 0) \in \mathfrak{g}$, $\exp(U)\exp(U_2)$ is a
neighborhood of $e$ in $G$. Since $\pi$ is open, $\pi(\psi(U^{\circ}))$, where $U^{\circ}$ is the
interior of $U$, is an open neighborhood of $p_0$ in $G/H$.
□
Remarks 2.11.9
1. Lemma 2.11.8 tells us of the existence of a neighborhood $U_1$ of
$0 \in M$ and a neighborhood $U_2$ of $0 \in \mathfrak{h}$ so that the map $U_1 \times U_2 \to G$
given by
$$(X, Y) \mapsto \exp(X)\exp(Y)$$
is a diffeomorphism of $U_1 \times U_2$ onto a neighborhood $V$ of $e \in G$
and such that the map $U_1 \to G/H$ given by $X \mapsto \pi(\exp(X))$ is a
homeomorphism of $U_1$ onto a neighborhood $U$ of $p_0 := eH \in G/H$.
We let $x : U \to U_1$ be the corresponding homeomorphism. We then
have $(U, x)$ as a chart at $p_0 \in G/H$.
2. The map $\varphi : \exp(U_1)\exp(U_2) \to U_1$ given by
$$\varphi : g := \exp(X)\exp(Y) \mapsto X$$
is smooth. Hence the quotient map $\pi : \exp(U_1)\exp(U_2) \to U$ is
smooth since we have $\pi = x^{-1} \circ \varphi$.
3. Let $p = gH \in U$ with $g = \exp(X)$. We set $\sigma := \exp \circ\, x$; then
$\sigma : U \to G$ is given by $\sigma(gH) = \exp(X)$ and $\sigma$ is smooth. We also
have $(\pi \circ \sigma)(gH) = gH$. Thus $\pi \circ \sigma$ is the identity on $U$. $\sigma$ is called
a (smooth) local section of $\pi : G \to G/H$.
In summary we have the following:
There exists a neighborhood $V := \exp(U_1)\exp(U_2)$ of $e \in G$ and a
neighborhood $U$ of $p_0 \in G/H$ such that
1. $\pi : V \to U$ is smooth.
2. There exists a smooth map $\sigma : U \to G$ such that $\pi \circ \sigma$ is the identity
on $U$.
Theorem 2.11.10 Let $G$ be a Lie group and $H$ a closed subgroup of $G$.
Let $G/H$ have the quotient topology. Then
1. $G/H$ has a unique smooth structure such that
(a) the projection map $\pi : G \to G/H$ is smooth.
(b) for any $p \in G/H$ there is a neighborhood $U_p$ and a smooth
map $\sigma_p : U_p \to G$ such that $\pi \circ \sigma_p$ is the identity on $U_p$.
2. $G$ acts smoothly on $G/H$, when the latter is endowed with the above
smooth structure.
Proof Let $\alpha : G \times G/H \to G/H$ be the set theoretic action of $G$ on $G/H$. We
continue to use the notation of Remark 2.11.9. We set $U_p := \alpha(g)(U)$
if $p = gH$ and $\sigma_p := L_g \circ \sigma \circ \alpha(g^{-1})$. We also define $x^p : U_p \to U_1$ by
$x^p := x \circ \alpha(g^{-1})$. The collection $\{(U_p, x^p) : p \in G/H\}$ is the smooth
atlas with $\sigma_p$ as the local sections. That the overlaps are smooth follows
from the following equation:
$$\alpha(a) \circ \pi = \pi \circ L_a.$$
To prove uniqueness, let $(x, \sigma)$ and $(y, \eta)$ be the data for $p_0$. Then the
map $x \circ y^{-1} = x \circ (\pi \circ \eta) \circ y^{-1} = (x \circ \pi) \circ (\eta \circ y^{-1})$ is a composition of
smooth maps on the common domain.
Let $\mu$ denote the multiplication map: $(a, b) \mapsto ab$. Then the map $\alpha$
is, on $G \times U$,
$$G \times U \longrightarrow G \times G \xrightarrow{\ \mu\ } G \xrightarrow{\ \pi\ } G/H,$$
$$(a, p) \mapsto (a, \sigma(p)) \mapsto \mu(a, \sigma(p)) \mapsto \pi(a\sigma(p)) = \alpha(a, p),$$
a composition of smooth maps. This shows that $G$ acts smoothly on
$G/H$.
□
Theorem 2.11.11 Let the notation be as in Theorem 2.11.10. Assume
further that $H$ is a normal subgroup of $G$. Then the quotient group $G/H$
with the smooth structure as above is a Lie group.
Proof Let $\nu(a, b) := ab^{-1}$ for $a, b$ in $G$. Then $\nu$ is a smooth map from
$G \times G \to G$. The map $(aH, bH) \mapsto (ab^{-1})H$ is the map $\pi \circ \nu \circ (\sigma_a \times \sigma_b)$
on an appropriate domain.
□
Let us return to the action of a group on a set. If a group $G$ acts
on a set $M$, then the isotropy or isotropy subgroup $H_p$ at a point $p$ is
defined to be the set $\{g \in G : gp = p\}$. It is easily seen to be a subgroup
of $G$. If $G$ acts on $M$ and if $q = gp$ then the isotropy at $q$ is $gH_pg^{-1}$.
(Check this.) The most important thing in this situation is the fact
that, when the action is transitive, there is an almost natural bijection $T$ from $G/H_p$ to $M$ given as
follows: $gH_p \mapsto gp$. Notice that the map is independent of the choice of
the representative of the coset $gH_p$. If $g'H_p = gH_p$ then $g' = gh$ for some
$h \in H_p$ and hence $g'p = (gh)p = g(hp) = gp$. The map is one-one, since
$gp = g'p$ implies $g^{-1}g' \in H_p$. The map is onto, for, if
$q \in M$ then by transitivity there is a $g \in G$ such that $q = gp$. Thus
$T(gH_p) = gp = q$.
If, in the above set-up, $M$ is a smooth manifold which is homogeneous
under the action of a Lie group $G$ then it is natural to ask whether
the map $T$ is a diffeomorphism. The next theorem answers this in the
affirmative. Before stating the theorem, let us look at some examples. In
Example 2.11.5, if we choose $p = e_n := (0, \ldots, 0, 1)$, then the isotropy
at $p$ is
$$\Big\{\begin{pmatrix} A & 0 \\ 0 & 1 \end{pmatrix} : A \in O(n - 1)\Big\}.$$
In Example 2.11.6, for $p := (0, 1) \leftrightarrow i$ the isotropy is seen to be
$$\Big\{\begin{pmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{pmatrix} : t \in \mathbb{R}\Big\}.$$
For, if $g = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is in the isotropy, then we should have $ai + b = i(ci + d)$,
that is, $a = d$ and $b = -c$. This along with the fact that $g$ has
determinant $ad - bc = 1$ means that we solve for real numbers $a$ and $b$
satisfying $a^2 + b^2 = 1$, whence the result. In Example 2.11.7, if we
take $p = (0, 0) \in \mathbb{R}^2$ then the isotropy at $p$ is the subgroup consisting
of elements of the form $(\alpha, 0) \in G$.
Theorem 2.11.12 Let $G$ be a Lie group acting transitively on a smooth
manifold $M$. Let $p_0 \in M$ be an arbitrary point. Let $H$ be the isotropy
at $p_0$. Then the map $T : gH \mapsto gp_0$ is a diffeomorphism of $G/H$ onto
$M$ which respects the $G$-actions: $T(\alpha(a)gH) = aT(gH)$.
Remark 2.11.13 (May be omitted on first reading.) The real point of
the proof is to show that the map $T$ is a homeomorphism. If we are
in a purely topological situation, then again we need to assume that
the group $G$ is second countable and locally compact so that we can
apply the Baire category theorem to conclude that $T$ is open and hence
a homeomorphism. In the proof below we could use this fact and the
invariance of domain, a deep result in topology, to conclude that
$\dim M = \dim G/H$ so that $T$ is a diffeomorphism.
Proof We use the notation above. The set $W := \psi(U_1)$ is a submanifold
of $G$ diffeomorphic to $U$. Let $\iota : W \hookrightarrow G$ be the inclusion map. Let $\beta$
be the map $g \mapsto gp_0$. The map $T$ is smooth since we have $T = \beta \circ \iota \circ \pi^{-1}$
on $U$, where $\pi^{-1}$ stands for the inverse of the restriction of $\pi$ to $W$.
We claim that $T$ has rank equal to $\dim G/H$. This will show that
$T$ has maximal rank so that by Remark 2.11.14 below it is a
diffeomorphism.
The idea is to show that $\beta$ has maximal rank at $e$. Consider the map
$D\beta(e) : \mathfrak{g} \to T_{p_0}(M)$. Let $X \in \mathfrak{g}$ be in the kernel of $D\beta(e)$. Then for
any $f \in C^{\infty}(M)$, we have
$$0 = D\beta(e)(X)(f) := X(f \circ \beta) := \Big\{\frac{d}{dt} f(\exp(tX)p_0)\Big\}_{t=0}. \qquad (2.11.1)$$
Let $s \in \mathbb{R}$. We use Equation (2.11.1) on the function
$$f^*(q) := f(\exp(sX)(q)) \quad \text{for } q \in M.$$
Then we see
$$0 = \Big\{\frac{d}{dt} f^*(\exp(tX)p_0)\Big\}_{t=0} = \Big\{\frac{d}{dt} f(\exp(tX)p_0)\Big\}_{t=s}.$$
Since $f$ is arbitrary this implies that the curve $t \mapsto \exp(tX)p_0$ is constant,
that is, $\exp(tX) \in H$ for all $t$, so that $X \in \mathfrak{h}$. It is clear that $\mathfrak{h}$ is
contained in the kernel of $D\beta(e)$ so that we have $\mathfrak{h} = \mathrm{kernel}(D\beta(e))$.
The fact that $T$ preserves the $G$-actions shows that the map $T$ is an
immersion on $G/H$.
□
Remark 2.11.14 If $\varphi : M \to N$ is a one-one, onto smooth map such
that it is of maximal rank on $M$ then $\varphi$ is a diffeomorphism. This is a
consequence of the local study of an immersion, our second countability
assumption on smooth manifolds and the Baire category theorem.
(Exercise.)
Abelian Lie groups

We denote $S^1 \times \cdots \times S^1$ by $T^n$ and call it the $n$-dimensional torus. It
is a connected abelian Lie group. If we take the closed Lie subgroup
$H := \oplus_i \mathbb{Z}\,2\pi e_i$ of $\mathbb{R}^n$, then the quotient group $\mathbb{R}^n/H$ is isomorphic to
$T^n$ as Lie groups. In fact, if $\{v_i\}$ is a basis for $\mathbb{R}^n$ and if $H := \oplus_i \mathbb{Z}v_i$ then
the quotient group $\mathbb{R}^n/H$ is isomorphic to $T^n$. (Exercise: Check this.)
We have seen that the exponential map of an abelian Lie group is a
Lie group homomorphism of the Lie algebra (considered as a Lie group
under addition) to the Lie group. Using this we shall show that any
connected abelian Lie group $G$ is isomorphic to $T^n \times \mathbb{R}^m$ for some $n$ and
$m$. We first establish
Lemma 2.11.15 Let $G$ be a connected abelian Lie group and $\mathfrak{g}$ its Lie
algebra. Then the exponential map is surjective.
Proof Let $U_0$ be a neighborhood of $0$ in $\mathfrak{g}$ on which exp is a
diffeomorphism. We may assume that $U_0 = -U_0$, since otherwise we take
$V_0 = U_0 \cap (-U_0)$ in the following considerations. Then $U := \exp(U_0)$
is an open neighborhood of $e$ in $G$ and the subset $H := \cup_{n \in \mathbb{N}} U^n$ is
an open subgroup of $G$. (Why?) Now, any open subgroup of any Lie
(topological) group is closed. For, we have $H = G \setminus \cup_{g \notin H}\, gH$. Thus $H$
is both open and closed. Since $G$ is connected, it follows that $H = G$.
Therefore any $g \in G$ can be written as $g = \exp(X_1) \cdots \exp(X_n)$. Since
exp is a homomorphism, we have $g = \exp(X_1 + \cdots + X_n)$.
□
We are now ready to prove the structure theorem for connected

abelian Lie groups.
Theorem 2.11.16 Let $G$ be a connected abelian Lie group. Then $G$ is
isomorphic to $T^k \times \mathbb{R}^n$ for some $k$ and $n$.
Proof Since exp is a homomorphism, the set $K :=$ kernel of exp is a
closed Lie subgroup of $\mathfrak{g}$. It is normal in $\mathfrak{g}$. As $G \cong \mathfrak{g}/K$, exp being
surjective by Lemma 2.11.15, we need
to analyze the structure of $K$. The first thing to notice is that $K$ is a
discrete subgroup of $\mathfrak{g}$. In other words, $K$ is a closed subgroup such
that there is a neighborhood $U$ of $0$ in $\mathfrak{g}$ with $U \cap K = \{0\}$. This
is easy to see, since we can take any neighborhood $U$ of $0$ on which
exp is a diffeomorphism. Since $K$ is closed, the topology on $K$ is the
induced subspace topology so that $K \cap U$ is an open subset of $K$. Thus
$\{0\}$, and hence any singleton set, is open in $K$. Thus our problem is
reduced to that of finding the discrete subgroups of $\mathbb{R}^n$.
The following lemma describes the structure of the discrete subgroups of $\mathbb{R}^n$
and hence completes the proof of the theorem.
Lemma 2.11.17 Let $H$ be any discrete subgroup of $\mathbb{R}^n$. Then there
exist $x_1, \ldots, x_r$ in $\mathbb{R}^n$ such that
1. $x_1, \ldots, x_r$ are linearly independent over $\mathbb{R}$.
2. $H = \mathbb{Z}x_1 \oplus \cdots \oplus \mathbb{Z}x_r$.
The integer $r$ is unique and is called the rank of $H$.
Proof We assume that $H \neq 0$. We set $l(v) := \sum_i |a_i|$, where $v =
\sum_i a_ie_i$, with respect to the canonical basis. (Any other basis will do as
well.) Since $H$ is discrete, the set $\{l(v) : v \in H, v \neq 0\}$ has a positive
minimum. For, otherwise, we shall have a nonzero
$$v_n \in \{v \in H : l(v) \leq 1/n\}$$
for every $n$ so that $v_n \to 0$ in $\mathbb{R}^n$. In other words, any neighborhood of
$0$ in $H$ will have infinitely many points of $H$. Hence $H$ is not discrete,
a contradiction.
Let $c_1 := \min\{l(v) : 0 \neq v \in H\} > 0$. Let $v_1$ be an element of $H$
with $l(v_1) = c_1$. Let $H_1 := \mathbb{Z}v_1$. If $H_1 = H$ we are through. So we
assume that $H_1 \neq H$. We enlarge $\{v_1\}$ to a basis $\{v_i\}$ of $\mathbb{R}^n$ and define
$l_1(v) := \sum_{i=2}^n |a_i|$, where $v = \sum_{i=1}^n a_iv_i$. We claim that
$$\{l_1(v) : v \in H, v \notin H_1\}$$
has a positive minimum. If not, there exist points $x_k \in H$ such that
$l_1(x_k) \to 0$. We write $x_k := \sum_i a_{ki}v_i$ so that $l(x_k) = d_k + \eta_k$ where
$d_k = |a_{k1}|$ and $l_1(x_k) = \eta_k \to 0$. We consider $y_k := x_k \pm d_kv_1$, depending
on whether the sign of $a_{k1}$ is negative or positive. Then $y_k \in H$ and
$l_1(y_k) = \eta_k \to 0$. Hence $\{y_k\}$ lies in a bounded subset
$$\{x \in \mathbb{R}^n : l(x) \leq C\}$$
for some $C > 0$. Let $y$ be a limit point of $\{y_k\}$. Then $y \in H$ since $H$
is closed. This contradicts the assumption that $H$ is discrete. Hence it
follows that $y_r = y_s$ for sufficiently large $r$ and $s$. We therefore conclude
that $\{l_1(v) : v \in H, v \notin H_1\}$ has a positive minimum. Let $w_2 \in H$ be
a point at which this minimum is attained. It is clear that $v_1$ and $w_2$
are linearly independent. We define $H_2 := \mathbb{Z}v_1 \oplus \mathbb{Z}w_2$. If $H_2 \neq H$, we
extend $\{v_1, w_2\}$ to a basis $\{w_i\}$ of $\mathbb{R}^n$ such that $w_1 = v_1$. We proceed as
above by defining $l_2$, etc. This process must stop at a finite stage, say,
$r$. (Why?)
Let $V$ denote the vector subspace spanned by $H$. If
$$\{x_i : 1 \leq i \leq r\} \quad \text{and} \quad \{y_j : 1 \leq j \leq s\}$$
satisfy the conditions of Lemma 2.11.17, then both sets are bases of
the vector space $V$ and hence $r = s$.
□
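The first step of the proof, the existence of a positive minimum of $l$ on the nonzero elements of $H$, can be seen concretely on a small lattice. The following is a minimal sketch, assuming the hypothetical subgroup $H = \mathbb{Z}(3, 1) \oplus \mathbb{Z}(1, 2)$ of $\mathbb{R}^2$; it searches a finite window of integer coefficients, which is enough for this particular lattice but is not a general algorithm.

```python
from itertools import product

gens = [(3, 1), (1, 2)]       # assumed generators of a lattice H in R^2

def l(v):
    """l(v) := sum of |a_i| with respect to the canonical basis."""
    return sum(abs(c) for c in v)

def combo(m, n):
    """The lattice point m*gens[0] + n*gens[1]."""
    return (m * gens[0][0] + n * gens[1][0], m * gens[0][1] + n * gens[1][1])

window = range(-10, 11)
nonzero = [combo(m, n) for m, n in product(window, window) if (m, n) != (0, 0)]

c1 = min(l(v) for v in nonzero)                 # the positive minimum c_1
minimisers = sorted(v for v in nonzero if l(v) == c1)
print(c1, minimisers)
```

Any of the minimisers can serve as $v_1$; the choice is not unique, which is consistent with the lemma asserting uniqueness only of the rank $r$.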
Chapter 3
Tensor Analysis
3.1 Multilinear algebra

Let $R$ be a ring with identity. Let $V$ be a left $R$-module. This means
that $V$ is an abelian group (written additively) and there is an action of
$R$ on $V$ in the following sense: There is a map $\alpha : R \times V \to V$ with the
properties:
1. $\alpha(1, x) = x$ for the identity $1 \in R$, for all $x \in V$;
2. $\alpha(a + b, x) = \alpha(a, x) + \alpha(b, x)$ for all $a, b \in R$, and $x \in V$;
3. $\alpha(ab, x) = \alpha(a, \alpha(b, x))$ for all $a, b \in R$, and $x \in V$;
4. $\alpha(a, x + y) = \alpha(a, x) + \alpha(a, y)$ for all $a \in R$, and $x, y \in V$.

We usually write $ax$ for $\alpha(a, x)$ if the action is understood. We have any
abelian group $V$ as a $\mathbb{Z}$-module. Here $\mathbb{Z}$ stands for the ring of integers.
The action is the multiple or power: If $n$ is a positive integer and $x \in V$ then
$nx := x + \cdots + x$ ($n$-times). A more useful example for us is obtained
if we take $R = \mathbb{R}$ and $V$, a vector space over $\mathbb{R}$. Another important
example for us is $\mathfrak{X}(M)$ considered as a $C^{\infty}(M)$-module where $M$ is a
smooth manifold.
Definition 3.1.1 If $V$ and $W$ are $R$-modules, a map $\varphi : V \to W$ is said
to be $R$-linear if $\varphi$ is a homomorphism of the underlying groups with
the additional property: $\varphi(ax) = a\varphi(x)$ for $a \in R$ and $x \in V$. Here, on
the right, the action is the one on $W$.
Definition 3.1.2 Let $V_i$, for $1 \leq i \leq n$, and $V$ be $R$-modules. A map
$\varphi : V_1 \times \cdots \times V_n \to V$ is said to be multilinear or $n$-linear if the map
is linear in each of its variables. That is, for each $i$ with $1 \leq i \leq n$, the
map
$$x_i \mapsto \varphi(z_1, \ldots, z_{i-1}, x_i, z_{i+1}, \ldots, z_n)$$
is linear where the $z_j$'s are fixed and $x_i$ is varying.
The set $L_k(V_1, \ldots, V_k; V)$ of all $k$-linear maps can be made an
$R$-module in a natural way. For $\varphi$ and $\psi$ in the above set and $a \in R$ we
set
$$(\varphi + \psi)(x_1, \ldots, x_k) := \varphi(x_1, \ldots, x_k) + \psi(x_1, \ldots, x_k)$$
and
$$(a\varphi)(x_1, \ldots, x_k) := a\,\varphi(x_1, \ldots, x_k).$$
Then it is easy to check that the above set becomes an $R$-module.
Notation: Let $L_k(V_1, \ldots, V_k)$ denote the module $L_k(V_1, \ldots, V_k; R)$.
$\mathrm{Hom}(V, W)$ will stand for the $R$-linear maps from $V$ to $W$.
For any $R$-module $V$, let $V^*$ denote the $R$-dual of $V$, that is, $V^*$ is
the set of $R$-linear maps from $V$ to $R$. Since $R$ is a left $R$-module it is
easy to see that $V^*$ can be given a natural $R$-module structure.
Let us also recall that the double dual $V^{**} := (V^*)^*$ of a finite
dimensional vector space $V$ is naturally isomorphic to $V$: if $v \in V$ then
$v$ defines a linear functional $\hat{v}$ on $V^*$ as follows. For $v^* \in V^*$ set
$\hat{v}(v^*) := v^*(v)$. The map $v \mapsto \hat{v}$ is a one-one linear map of $V$ to $V^{**}$.
Since $\dim(V) = \dim(V^{**})$, the above map is onto and hence an
isomorphism. This isomorphism is natural or canonical in the sense that its
definition does not depend on the choice of basis for $V$, etc. In contrast,
$V$ and $V^*$ are isomorphic as $\mathbb{R}$-vector spaces but they are not naturally
isomorphic.
We now define the concept of tensor product of two finite dimensional
vector spaces over the field $\mathbb{R}$.
Definition 3.1.3 The tensor product of two finite dimensional vector
spaces $V$ and $W$ over $\mathbb{R}$ is, by definition, the set $L_2(V^*, W^*)$ of all
bilinear maps from $V^* \times W^*$ to $\mathbb{R}$. It is denoted by $V \otimes W$.
To get the hang of this, let us have a closer look at the case when
$R = \mathbb{R}$ and bilinear maps between finite dimensional vector spaces over
$\mathbb{R}$. The case of multilinear maps is treated similarly. Let $\{v_i^*\}_{i=1}^n$ and
$\{w_j^*\}_{j=1}^m$ be bases of $V^*$ and $W^*$, respectively. Now if $B \in L_2(V^*, W^*)$,
$B$ is uniquely determined by its values on the pairs $(v_i^*, w_j^*)$. For, if
$v^* = \sum_i a_iv_i^* \in V^*$ and $w^* = \sum_j b_jw_j^* \in W^*$ then, by the bilinearity of
$B$, we have:
$$B(v^*, w^*) = \sum_i a_i B\Big(v_i^*, \sum_j b_jw_j^*\Big) = \sum_{i,j} a_ib_j B(v_i^*, w_j^*).$$
This means that if we know $B(v_i^*, w_j^*)$ for all $i$ and $j$, then we know the
bilinear map $B$. For $v \in V$ and $w \in W$, let $v \otimes w$ denote the bilinear
map
$$\beta_{v,w}(v^*, w^*) := v^*(v)\, w^*(w).$$
From what we said above it follows that the collection
$$\{v_i \otimes w_j : 1 \leq i \leq n,\ 1 \leq j \leq m\},$$
where $\{v_i\}$ and $\{w_j\}$ are the bases of $V$ and $W$ whose dual bases are
$\{v_i^*\}$ and $\{w_j^*\}$, and
where $n = \dim V$ and $m = \dim W$, forms a basis of $L_2(V^*, W^*) =
V \otimes W$. Thus $V \otimes W$ is a finite dimensional vector space of dimension
$nm$. Obviously, the analogous statements for $V_1 \otimes \cdots \otimes V_k$ are true.
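The statement that a bilinear map on $V^* \times W^*$ is determined by its values on pairs of basis covectors can be checked on a small case. The following is a minimal sketch with $V = \mathbb{R}^2$ and $W = \mathbb{R}^3$, where vectors and covectors are both stored as coordinate tuples with respect to the standard basis and its dual; the names `pair`, `tensor`, `unit` and the coefficient table `C` are illustrative assumptions.

```python
def pair(covector, vector):
    """Evaluate a covector on a vector; both are coordinate tuples."""
    return sum(c * x for c, x in zip(covector, vector))

def tensor(v, w):
    """The bilinear map v (x) w on V* x W*: (phi, psi) -> phi(v) * psi(w)."""
    return lambda phi, psi: pair(phi, v) * pair(psi, w)

def unit(i, n):
    """Coordinates of the i-th basis vector (or of the i-th dual basis covector)."""
    return tuple(1 if k == i else 0 for k in range(n))

n, m = 2, 3
C = [[1, -2, 0], [4, 5, 7]]       # an arbitrary sample bilinear map

def B(phi, psi):
    return sum(C[i][j] * phi[i] * psi[j] for i in range(n) for j in range(m))

# Values of B on the pairs of dual basis elements
coeff = [[B(unit(i, n), unit(j, m)) for j in range(m)] for i in range(n)]

def expansion(phi, psi):
    """The sum of coeff[i][j] * (v_i (x) w_j), evaluated at (phi, psi)."""
    return sum(coeff[i][j] * tensor(unit(i, n), unit(j, m))(phi, psi)
               for i in range(n) for j in range(m))

phi, psi = (2, -1), (3, 0, 5)
print(B(phi, psi) == expansion(phi, psi))
```

The table `coeff` has $nm = 6$ entries, one for each basis element $v_i \otimes w_j$, which is the dimension count in the text.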
It is essential for what follows that the reader feels at home with
these new objects. In order to facilitate understanding of these objects,
we shall look at various natural maps one has in this situation.
There is a natural isomorphism from $L_2(V^*, W^*)$ to $\mathrm{Hom}(V^*, W)$ as
follows. Let $\varphi \in L_2(V^*, W^*)$. We want an $A \in \mathrm{Hom}(V^*, W)$. Thus,
given $v^* \in V^*$, $Av^*$ must be in $W$. Since $(W^*)^*$ is naturally isomorphic
to $W$, $Av^*$ is uniquely determined once we know its action on $W^*$. So, if
$w^* \in W^*$, then we want to define $Av^*(w^*) \in \mathbb{R}$. The most natural thing
is to set $Av^*(w^*) := \varphi(v^*, w^*)$. It is clear that $Av^* \in W^{**} \cong W$. The above
process can be reversed. If $A \in \mathrm{Hom}(V^*, W)$, then we set $\varphi(v^*, w^*) :=
(Av^*)(w^*)$. The above correspondences $\varphi \mapsto A$ and $A \mapsto \varphi$ are linear
and inverses of each other. Since by definition $V \otimes W = L_2(V^*, W^*)$,
we get a natural linear isomorphism of $V \otimes W$ onto $\mathrm{Hom}(V^*, W)$. Note
that $V \otimes \mathbb{R}$ is then $V^{**} \cong V$.
As a special case of the above example let us look a little more closely
at the isomorphism $V^* \otimes V \cong \mathrm{Hom}(V^{**}, V)$. First notice that if $\{e_i\}$ is a
basis of $V$ and $\{u^i\}$ is a basis of $V^*$ then $\{u^i \otimes e_j : 1 \leq i, j \leq n\}$ is a basis
of $V^* \otimes V$. Thus any element of $V^* \otimes V$ is a finite linear combination of
the form $\sum_{i,j} a_i^j u^i \otimes e_j$. To understand the isomorphism it is enough to
see what happens to the basis elements. Thus, given $u^i \otimes e_j$ we want to
define $A := A_{u^i \otimes e_j} \in \mathrm{Hom}(V, V)$. If $v \in V$ then we want $A_{u^i \otimes e_j}(v) \in V$.
The obvious thing to do is to set
$$A_{u^i \otimes e_j}(v) := u^i(v)\, e_j.$$
Let $\{e^i\}$ denote the basis of $V^*$ dual to the basis $\{e_i\}$. These are the
linear maps $e^i$ such that we have $e^i(e_j) = \delta^i_j$ for all $i, j$. Here $\delta^i_j$ is the
so called Kronecker delta which is 1 if $i = j$ and 0 otherwise. Now if we
choose $u^i := e^i$, the dual basis of $e_i$, then $\sum_{i,j} a_i^j e^i \otimes e_j$ corresponds to
the linear map $A : V \to V$ given by
$$\Big(\sum_{i,j} a_i^j e^i \otimes e_j\Big)(e_k) = \sum_i \sum_j a_i^j (e^i \otimes e_j)(e_k) = \sum_i \sum_j a_i^j e^i(e_k) e_j = \sum_{i,j} a_i^j \delta^i_k e_j = \sum_j a_k^j e_j.$$
That is, the linear map $A$ has matrix representation $(a_i^j)$ with respect
to the basis $\{e_i\}$. In particular, the identity map $I$ corresponds to
$\sum_{i,j} \delta_i^j e^i \otimes e_j$.
This is perhaps the right place to tell you a standard way of writing
subscripts and superscripts in differential geometry. Learning it now will
save some real headache for you later if you get into some nasty "local
computations" .
Let $V$ be an $n$-dimensional vector space over $\mathbb{R}$. Let $\{e_i\}_{i=1}^n$ be a
basis for $V$ over $\mathbb{R}$. Let $V^*$ denote the dual space of $V$. That is, $V^*$
consists of all real linear functionals from $V$ to $\mathbb{R}$. Let $\{e^i\}$ denote the
basis of $V^*$ dual to the basis $\{e_i\}$. Let $\{v_i\}$ be another basis of $V$ over
$\mathbb{R}$. Then there exists an invertible linear map $A$ from $V$ to itself which
takes $e_i$ to $v_i$. If we write $A$ as a matrix $(a_i^j)$ with respect to the basis
$\{e_i\}$ then we have:
$$\sum_j a_i^j e_j = v_i. \qquad (3.1.1)$$
If $\{u^i\}$ is the basis of $V^*$ dual to $\{v_i\}$ then we wish to express $u^i$ in terms
of $e^i$. We have
(3.1.2)
We also have
(3.1.3)
Now Equations (3.1.1) and (3.1.3) imply
$$e^i = a^i_j u^j := \sum_j a^i_j u^j \qquad (3.1.4)$$
$$u^i = b^i_j e^j := \sum_j b^i_j e^j, \qquad (3.1.5)$$
where $(b^i_j)$ is the inverse of the matrix $(a^i_j)$. We have also used the
Einstein summation convention, namely, the repeated indices are summed
unless otherwise specified. We shall always use this summation
convention without saying so in the sequel. There is a subtle change in the way
the summations for $v_i$ and that for $u^i$ go. If we think of the upper index $i$ in
the matrix $(a^i_j)$ as the row index then in the expression for $v_j$ in terms of
the $e_i$ the summation was over the row index whereas in the dual situation
it was over the column index. A mnemonic way of remembering this is
that if $A$ acts on $V$ then $V^*$ is acted upon by ${}^tA^{-1}$. Not knowing this
has cost beginners in this subject many agonizing hours.
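The mnemonic that $V^*$ is acted upon by the inverse transpose can be verified on a $2 \times 2$ example with exact arithmetic. The following is a minimal sketch; the storage convention (row $i$ of `A` holds the components of $v_i$ with respect to $\{e_j\}$) and all names are assumptions made for this illustration.

```python
from fractions import Fraction

# Row i holds the components of v_i: v_1 = 2 e_1 + e_2, v_2 = e_2.
A = [[Fraction(2), Fraction(1)],
     [Fraction(0), Fraction(1)]]

def inverse_2x2(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

def transpose(M):
    return [list(row) for row in zip(*M)]

def pair(covector, vector):
    return sum(c * x for c, x in zip(covector, vector))

# Row i of dual holds the components of u^i with respect to the dual basis.
dual = inverse_2x2(transpose(A))
gram = [[pair(dual[i], A[k]) for k in range(2)] for i in range(2)]

# Using the inverse without the transpose does not give the dual basis.
wrong = inverse_2x2(A)
wrong_gram = [[pair(wrong[i], A[k]) for k in range(2)] for i in range(2)]

print(gram == [[1, 0], [0, 1]], wrong_gram == [[1, 0], [0, 1]])
```

The first comparison confirms $u^i(v_k) = \delta^i_k$; the second shows that forgetting the transpose breaks this relation as soon as the matrix is not symmetric.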
Coming back to tensor products, it is easy to verify the following
natural isomorphisms:
1. $V \otimes W \cong W \otimes V$ via the map $v \otimes w \mapsto w \otimes v$.
2. $(V \otimes W) \otimes Z \cong V \otimes (W \otimes Z) \cong V \otimes W \otimes Z$.
3. $V \otimes (W \oplus Z) \cong (V \otimes W) \oplus (V \otimes Z)$.
The reader is strongly advised to check these in detail. One should also
notice that for $a \in \mathbb{R}$, $v \in V$ and $w \in W$, we have
$$a(v \otimes w) = (av) \otimes w = v \otimes (aw).$$
This follows, for example, from the identification of each of the above
three quantities with the bilinear maps $\beta_{av,w}$, etc. It is impractical to
think of elements of $V \otimes W$ as bilinear maps always. The reader will do
well to deal with objects of the form $v \otimes w$.
We now fix a finite dimensional vector space $V$ and let $\otimes^kV$ denote
$V \otimes \cdots \otimes V := L_k(V^*, \ldots, V^*)$. For $k = 0$, we define $\otimes^0V = \mathbb{R}$. We call
$\otimes^kV$ the $k$-th tensor power of $V$. We define $\otimes V := \oplus_{k=0}^{\infty} \otimes^kV$, the
direct sum of all the $k$-th tensor powers of $V$. Recall that the direct sum
means that any element $t \in \otimes V$ is uniquely of the form $t = \sum_k a_kt_k$
with $t_k \in \otimes^kV$ and $a_k \in \mathbb{R}$, with only finitely many of them non-zero.
There is a map from $\otimes^kV \times \otimes^lV$ to $\otimes^{k+l}V$ as follows: Let $s \in \otimes^kV$
and $t \in \otimes^lV$. Then $s \otimes t$ is the natural element of $(\otimes^kV) \otimes (\otimes^lV)$. This
gives rise to a natural product defined on $\otimes V$ as follows: If $s$ and $t$ are
in $\otimes V$, their product is $s \otimes t$. One easily shows that $\otimes V$ is an associative
algebra over $\mathbb{R}$, called the tensor algebra of $V$. This algebra has $1 \in \mathbb{R}$
as the identity, since $(\otimes^kV) \otimes \mathbb{R} = \otimes^kV = \mathbb{R} \otimes (\otimes^kV)$. This algebra is
never commutative if $\dim V > 1$. For, if $\{e_i\}$ is a basis of $V$ and $\{\varepsilon^j\}$ is
the basis of $V^*$ dual to $\{e_i\}$, then
$$e_1 \otimes e_2(\varepsilon^1, \varepsilon^2) = 1 \quad \text{and} \quad e_2 \otimes e_1(\varepsilon^1, \varepsilon^2) = 0.$$
The reason for these explicit computations is to make the reader feel at
home with the tensor products. An element of $\otimes V$ is called a tensor.
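The non-commutativity computation can be reproduced directly from the definition of tensors as multilinear maps on copies of $V^*$. The following is a minimal sketch for $V = \mathbb{R}^2$; `as_tensor` and `tensor_product` are hypothetical helper names, and the number of slots of the first factor is passed explicitly.

```python
def as_tensor(v):
    """Regard a vector v as a 1-tensor, that is, as a linear map on V*."""
    return lambda phi: sum(p * x for p, x in zip(phi, v))

def tensor_product(s, k, t):
    """Product of a k-tensor s with a tensor t: the first k arguments go to s."""
    return lambda *covectors: s(*covectors[:k]) * t(*covectors[k:])

e1, e2 = (1, 0), (0, 1)          # basis of V
eps1, eps2 = (1, 0), (0, 1)      # dual basis of V*, in coordinates

e1_e2 = tensor_product(as_tensor(e1), 1, as_tensor(e2))
e2_e1 = tensor_product(as_tensor(e2), 1, as_tensor(e1))
print(e1_e2(eps1, eps2), e2_e1(eps1, eps2))

# Associativity on a sample: (a (x) b) (x) c and a (x) (b (x) c) agree.
a, b, c = as_tensor((1, 2)), as_tensor((3, -1)), as_tensor((0, 5))
left = tensor_product(tensor_product(a, 1, b), 2, c)
right = tensor_product(a, 1, tensor_product(b, 1, c))
args = ((2, 1), (1, 1), (4, -3))
print(left(*args) == right(*args))
```

The two values in the first printed line differ, so $e_1 \otimes e_2 \neq e_2 \otimes e_1$, while the associativity check reflects that the product only concatenates argument slots.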
There is a formal way of thinking of the tensor algebra of $V$. If we
fix a basis $\{x_i\}$ of $V$ then $\otimes V$ is nothing other than all finite linear
combinations of non-commutative polynomials $x_{i_1}^{r_1} \cdots x_{i_k}^{r_k}$
where $1 \leq i_j \leq n$ and $r_j \in \mathbb{N} \cup \{0\}$. We denote polynomials of this form simply as
xy. The addition and the multiplication are formally carried out. For
example, $(x_1x_2^2x_1^3)(x_3x_1) = x_1x_2^2x_1^3x_3x_1$. Thus we can think of $\otimes V$ as
the free algebra with generators $\{x_i\}$. Note that this identification is
not canonical or natural in the sense that we need a basis to effect this
isomorphism.
There are some special kinds of tensors with which we should be
acquainted. For example, if we look at $v \otimes w + w \otimes v \in \otimes^2 V$ and
$v \otimes w - w \otimes v \in \otimes^2 V$, the striking difference between them is that we are
allowed to switch v and w in the first whereas in the second we pick up a
minus sign. The first is symmetric whereas the second is antisymmetric
or skew symmetric. More generally
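Definition 3.1.4 The symmetric group $S_k$ on $k$ symbols acts on $\otimes^k V$
as follows: if $\sigma \in S_k$ and $s \in \otimes^k V$, set
$$(\sigma s)(u_1, \ldots, u_k) := s(u_{\sigma(1)}, \ldots, u_{\sigma(k)}).$$
We say that a tensor $s \in \otimes^k V$ is symmetric if $\sigma(s) = s$, that is,
$$s(u_{\sigma(1)}, \ldots, u_{\sigma(k)}) = s(u_1, \ldots, u_k)$$
for all $\sigma$ in the symmetric (permutation) group on $k$ symbols, and for all
$u_i$ in $V^*$. For example, $\sum_{i,j} a_{ij}\, e_i \otimes e_j \in \otimes^2 V$ is symmetric if and only if
$a_{ij} = a_{ji}$ for all $i$ and $j$, where the $e_i$ are elements of a basis of $V$.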
Now, given a tensor $t \in \otimes^k V$, is there a way of getting a symmetric
tensor from $t$? There is an age old golden trick, namely, averaging. Let
$S_k$ denote the symmetric group on $k$ symbols. Then we define
$$\mathcal{S}_k(t) := \frac{1}{k!} \sum_{\sigma \in S_k} \sigma(t).$$
Then it is obvious that $\mathcal{S}_k(t) \in \otimes^k V$ and that it is symmetric: for if $\tau$
is a permutation then
$$\tau(\mathcal{S}_k(t))(u_1, \ldots, u_k) = \frac{1}{k!} \sum_{\sigma \in S_k} t(u_{\tau\sigma(1)}, \ldots, u_{\tau\sigma(k)}) = \frac{1}{k!} \sum_{\nu \in S_k} t(u_{\nu(1)}, \ldots, u_{\nu(k)}),$$
since $\tau\sigma$ varies all over the group as $\sigma$ varies. The last expression is the
definition of $\mathcal{S}_k(t)(u_1, \ldots, u_k)$. Notice that if $t$ is already symmetric then
$\mathcal{S}_k(t) = t$.
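The averaging trick is easy to try out on components. Below is a minimal Python sketch under our own assumptions (a tensor in $\otimes^k \mathbb{R}^n$ is stored as a dictionary from index tuples to components, and the function name `symmetrize` is hypothetical); it averages the components over all permutations of the index slots.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial

def symmetrize(t, k, n):
    """Average the components of t over all permutations of the k index slots."""
    out = {}
    for idx in product(range(n), repeat=k):
        total = sum(t.get(tuple(idx[j] for j in p), 0) for p in permutations(range(k)))
        out[idx] = Fraction(total) / factorial(k)
    return out

# t = e_0 (x) e_1 in the second tensor power of R^2, stored as {index tuple: component}
t = {(0, 1): 1}
s = symmetrize(t, 2, 2)
print(s[(0, 1)], s[(1, 0)])
```

Applying `symmetrize` a second time leaves the result unchanged, which is the statement that averaging fixes tensors that are already symmetric.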
Classically, tensors were treated by their components. Let us fix a
basis $\{e_i\}$ for $V$. Then any $t \in \otimes^k V$ can be uniquely written as a finite
linear combination of the form
$$t = \sum_{i_1, \ldots, i_k} a^{i_1 i_2 \ldots i_k}\, e_{i_1} \otimes e_{i_2} \otimes \cdots \otimes e_{i_k}.$$
Thus the tensor $t$ is known with respect to the basis $\{e_i\}$ as soon as
we know its components $a^{i_1 i_2 \ldots i_k}$. What happens if we choose another
basis? We have a transformation rule for the tensor. (Do you recall the
transformation laws we derived for the components of tangent vectors
and cotangent vectors?) For example, if $\{v_i\}$ is another basis of $V$, then
for the basis $\{v_i \otimes v_j\}$ of $\otimes^2 V$ we have
$$v_i \otimes v_j = a_i^k e_k \otimes a_j^l e_l = a_i^k a_j^l\, e_k \otimes e_l.$$
Here we assume that the matrix $(a_i^j)$ takes $\{e_i\}$ to $\{v_i\}$: $v_i := a_i^j e_j$ (summing over the repeated index).
Now if $t \in \otimes^2 V$ has components $t^{i_1 i_2}$ with respect to $e_i$ and $T^{i_1 i_2}$ as
components with respect to another basis $v_i$, then these components are
related by
$$t^{k_1 k_2} = a_{i_1}^{k_1} a_{i_2}^{k_2}\, T^{i_1 i_2}.$$
Similar transformation formulas can be written for other tensors too.

We now want to apply these constructions to a manifold $M$ to get
tensor fields on $M$. At each point $p \in M$ we have the tangent vector
space $T_p(M)$. We write $\otimes^k T_p(M)$ for the $k$-fold tensor product of $T_pM$.
We write $\mathcal{D}^r_s(p)$ for the tensor product of $T_pM$ ($r$ times) and $T_p^*(M)$
($s$ times). That is,
$$\mathcal{D}^r_s(p) := \underbrace{T_p(M) \otimes \cdots \otimes T_p(M)}_{r \text{ times}} \otimes \underbrace{T_p^*(M) \otimes \cdots \otimes T_p^*(M)}_{s \text{ times}}.$$

An element of $\mathcal{D}^r_s(p)$ is called a tensor of contravariance rank $r$ and
covariance rank $s$. A tensor field $\varphi$ of contravariance rank $r$ and covariance
rank $s$ on $M$ is an assignment $p \mapsto \varphi_p \in \mathcal{D}^r_s(p)$. How do we define
smooth tensor fields? The reader should be able to formulate the notion
of smooth tensor fields. In Section 3.3 we shall define smooth tensor
fields in a global way and show that both the definitions are equivalent.
3.2 Exterior algebra

Let V be an n-dimensional vector space over R.
Definition 3.2.1 We say a tensor $\varphi \in \otimes^k V$ is alternating or skew symmetric
if for any $\sigma \in S_k$ we have $\sigma(\varphi) = \operatorname{sgn}(\sigma)\varphi$.
Here, as in Definition 3.1.4, the permutation group $S_k$ acts on $\otimes^k V$
and $\operatorname{sgn}(\sigma)$ denotes the sign of the permutation $\sigma$. We shall deal with the
skew-symmetric or alternating tensors in a more systematic way. Recall
that $\sigma \in S_k$ is said to be a transposition if, for some $i \ne j$, $\sigma(i) = j$ and $\sigma(j) = i$ and
for any $l \ne i, j$ with $1 \le l \le k$ we have $\sigma(l) = l$. Since transpositions
generate the permutation group, a tensor $\varphi \in \otimes^k V$ is alternating if and
only if for any transposition $\sigma$ we have $\sigma(\varphi) = -\varphi$.
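Definition 3.2.2 For any $k$, we define the alternator $\operatorname{Alt}_k$ on $\otimes^k V$ as
follows: for $\varphi \in \otimes^k V$, we set
$$\operatorname{Alt}_k(\varphi) := \frac{1}{k!} \sum_{\sigma \in S_k} \operatorname{sgn}(\sigma)\, \sigma(\varphi).$$
Thus the alternator can be thought of as a weighted average. It is
trivial to see that $\operatorname{Alt}_k(\varphi)$ is indeed alternating.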
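Notation We let $\wedge^k V$ denote the set of all alternating tensors in $\otimes^k V$.
Now if $\varphi$ is alternating and if $v_i \in V$ are such that $v_i = v_j$ for some
$1 \le i < j \le k$ then $\varphi(v_1, \ldots, v_i, \ldots, v_j, \ldots, v_k) = 0$, since interchanging the $i$-th and $j$-th arguments gives
$$\varphi(v_1, \ldots, v_i, \ldots, v_j, \ldots, v_k) = -\varphi(v_1, \ldots, v_j, \ldots, v_i, \ldots, v_k) = -\varphi(v_1, \ldots, v_i, \ldots, v_j, \ldots, v_k).$$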
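More generally, if $v_i \in V$ are linearly dependent elements then again we
have $\varphi(v_1, \ldots, v_k) = 0$. For, we can then write $v_k = \sum_{i=1}^{k-1} a_i v_i$ (without
loss of generality!) and by multilinearity we have
$$\varphi(v_1, \ldots, v_k) = \sum_{i=1}^{k-1} a_i\, \varphi(v_1, \ldots, v_{k-1}, v_i) = 0.$$
As a consequence we see that $\wedge^k V = 0$ if $k > \dim V$.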

Before we go any further into the study of alternating tensors, let us
indicate the geometric meaning of the alternating tensors. Let us start
with $\mathbb{R}^2$ equipped with the usual Euclidean inner product. If $e_1 = (1, 0)$
and $e_2 = (0, 1)$ are the canonical orthonormal basis vectors then for
any $v \in \mathbb{R}^2$ we write $v = \sum_j v^j e_j$. If $v_i = \sum_j v_i^j e_j \in \mathbb{R}^2$ ($i = 1, 2$)
are two vectors then we consider the parallelogram spanned by $v_1$ and
$v_2$. The area of this parallelogram is given by the absolute value of the
determinant $\det(v_i^j)$. (Convince yourself of this fact.) Thus we can think
of $\det(v_i^j)$ as the oriented area of the parallelogram spanned by the $v_i$. If
the $v_i$ are linearly dependent then they span a 1-dimensional parallelogram
and hence their oriented area ought to be zero, which is the case since
the rows of the determinant are linearly dependent. Similarly, if we
take three vectors $v_i$ in $\mathbb{R}^2$ then their oriented volume is 0 since they,
at best, can span a two dimensional parallelepiped. Similar geometric
considerations hold in $\mathbb{R}^n$.
Now given any $\varphi \in \wedge^k V$, we think of $\varphi$ as a preferred oriented $k$-dimensional
volume element on $V$. For, in general, $V$ may not come
equipped with an inner product. Thus $\varphi$ as above is a rule which assigns
to any $k$ vectors $v_i$ of $V$ a real number which we think of as the
oriented volume of the parallelepiped spanned by the $v_i$. Recall that the
parallelepiped spanned by the $v_i$ is given by
$$[v_1, \ldots, v_k] := \Big\{ v = \sum_{i=1}^{k} t_i v_i : 0 \le t_i \le 1,\ 1 \le i \le k \Big\}.$$
The importance of this notion lies in the fact that it allows us to speak of
(oriented) volumes of geometrically nice objects such as parallelepipeds.
If there is an inner product on $V$ then we could start with an orthonormal
basis $\{e_i\}$ and declare the volume of the parallelepiped
(in this case a cube) $Q$ spanned by $\{e_i\}$ to be the unit against which we
compare the "size" of other geometrically nice objects $P$ and assign a
positive real number as the volume of $P$. The reader will find it useful
to have this intuitive geometric description or meaning whenever alternating
tensors are in the arena.
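Coming back to abstractions, we have $\operatorname{Alt}_k \varphi = \varphi$ if $\varphi \in \otimes^k V$ is
alternating:
$$\begin{aligned} \operatorname{Alt}_k \varphi(v_1, \ldots, v_k) &= \frac{1}{k!} \sum_{\sigma \in S_k} \operatorname{sgn}(\sigma)\, \varphi(v_{\sigma(1)}, \ldots, v_{\sigma(k)}) \\ &= \frac{1}{k!} \sum_{\sigma \in S_k} (\operatorname{sgn}(\sigma))^2\, \varphi(v_1, \ldots, v_k) \\ &= \varphi(v_1, \ldots, v_k), \end{aligned}$$
where we used the fact that $\varphi$ is alternating in the second equality.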

Hereafter we shall write simply $\operatorname{Alt}$ instead of $\operatorname{Alt}_k$, as the context
will make the subscript $k$ clear. We set $\wedge^0 V := \mathbb{R}$. Notice that, by
definition, $\wedge^1 V = V$. We let $\wedge V := \bigoplus_{k \ge 0} \wedge^k V$. Now, given $\varphi$ and $\psi$
in $\wedge V$, their tensor product $\varphi \otimes \psi$ is, in general, not alternating. For
instance, if both of them come from $\wedge^1 V = V$ and if they are different
from zero then their tensor product is not alternating. What we want
to do is to define a "multiplication" on $\wedge V$. The obvious thing to do is
to define
$$(\varphi, \psi) \mapsto \operatorname{Alt}(\varphi \otimes \psi).$$
We however define the wedge or exterior product on $\wedge V$ by setting
$$(\varphi, \psi) \mapsto \varphi \wedge \psi := \frac{(k + \ell)!}{k!\,\ell!} \operatorname{Alt}(\varphi \otimes \psi).$$
Here we assume that $\varphi \in \wedge^k V$ and $\psi \in \wedge^\ell V$. The expression $\varphi \wedge \psi$ is
read as "phi wedge psi". We shall later give a convincing reason for the
weird constant in the definition of the wedge product. It is easy to see
that for $a, b \in \mathbb{R}$, we have
$$(a\varphi_1 + b\varphi_2) \wedge \psi = a(\varphi_1 \wedge \psi) + b(\varphi_2 \wedge \psi),$$
$$\varphi \wedge (a\psi_1 + b\psi_2) = a\,\varphi \wedge \psi_1 + b\,\varphi \wedge \psi_2.$$
The wedge product is also associative so that $\wedge V$ is an associative algebra
over $\mathbb{R}$. To prove associativity of the wedge it is enough to show that
$$\varphi \wedge (\psi \wedge \eta) = \frac{(k + \ell + r)!}{k!\,\ell!\,r!} \operatorname{Alt}(\varphi \otimes \psi \otimes \eta) = (\varphi \wedge \psi) \wedge \eta.$$
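We first observe that we have a direct sum decomposition $\otimes^k V = \wedge^k V \oplus N^k V$, where
$$N^k V := \{\varphi \in \otimes^k V : \operatorname{Alt}(\varphi) = 0\}.$$
The decomposition is obtained as follows: we write
$$\varphi = \operatorname{Alt}(\varphi) + (\varphi - \operatorname{Alt}(\varphi)).$$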
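Lemma 3.2.3 If $\varphi \in N^k V$ and $\psi \in \otimes^\ell V$, then $\operatorname{Alt}(\varphi \otimes \psi) = 0$. Similarly
if $\varphi \in \otimes^k V$ and $\psi \in N^\ell V$, then $\operatorname{Alt}(\varphi \otimes \psi) = 0$.
Proof Let $G$ denote the group of all permutations on $k + \ell$ symbols.
Let $H$ be the subgroup of all permutations $\sigma$ such that $\sigma(j) = j$ if
$j > k$. Then we can identify $H$ with the symmetric group on the $k$ symbols
$\{1, 2, \ldots, k\}$. Note that $\sigma(\varphi \otimes \psi) = \sigma(\varphi) \otimes \psi$ for $\sigma \in H$. We write a coset
decomposition of $G$ with respect to $H$: $G = \bigcup \tau H$. Our assumption on
$\varphi$ means that we have
$$\sum_{\sigma \in H} \operatorname{sgn}(\sigma)\, \sigma(\varphi) = 0.$$
Hence
$$\begin{aligned} \operatorname{Alt}(\varphi \otimes \psi) &= \frac{1}{(k + \ell)!} \sum_{\tau} \sum_{\sigma \in H} \operatorname{sgn}(\tau\sigma)\, \tau(\sigma(\varphi \otimes \psi)) \\ &= \frac{1}{(k + \ell)!} \sum_{\tau} \sum_{\sigma \in H} \operatorname{sgn}(\tau) \operatorname{sgn}(\sigma)\, \tau((\sigma\varphi) \otimes \psi) \\ &= \frac{1}{(k + \ell)!} \sum_{\tau} \operatorname{sgn}(\tau)\, \tau\Big( \Big( \sum_{\sigma \in H} \operatorname{sgn}(\sigma)(\sigma\varphi) \Big) \otimes \psi \Big) \\ &= \frac{1}{(k + \ell)!} \sum_{\tau} \operatorname{sgn}(\tau)\, \tau(0 \otimes \psi) \\ &= 0. \end{aligned}$$
This proves the first part of the lemma. The other part is proved similarly.
□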
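Lemma 3.2.4 The wedge product is associative. That is, for $\varphi \in \wedge^k V$,
$\psi \in \wedge^\ell V$ and $\eta \in \wedge^r V$ we have
$$\varphi \wedge (\psi \wedge \eta) = \frac{(k + \ell + r)!}{k!\,\ell!\,r!} \operatorname{Alt}(\varphi \otimes \psi \otimes \eta) = (\varphi \wedge \psi) \wedge \eta.$$
Proof Since $\operatorname{Alt}$ is a projection we have $\operatorname{Alt} \circ \operatorname{Alt} = \operatorname{Alt}$. We therefore
have
$$\operatorname{Alt}(\operatorname{Alt}(\varphi \otimes \psi) - \varphi \otimes \psi) = \operatorname{Alt}(\varphi \otimes \psi) - \operatorname{Alt}(\varphi \otimes \psi) = 0.$$
We now use Lemma 3.2.3 with $\operatorname{Alt}(\varphi \otimes \psi) - \varphi \otimes \psi$ in place of $\varphi$. Hence
we have
$$\operatorname{Alt}(\{\operatorname{Alt}(\varphi \otimes \psi) - \varphi \otimes \psi\} \otimes \eta) = 0.$$
That is,
$$\operatorname{Alt}(\{\operatorname{Alt}(\varphi \otimes \psi)\} \otimes \eta) - \operatorname{Alt}(\varphi \otimes \psi \otimes \eta) = 0. \tag{3.2.1}$$
Now by Equation (3.2.1) and the definition of exterior product we have
$$\begin{aligned} (\varphi \wedge \psi) \wedge \eta &= \frac{(k + \ell + r)!}{(k + \ell)!\,r!} \operatorname{Alt}((\varphi \wedge \psi) \otimes \eta) \\ &= \frac{(k + \ell + r)!}{(k + \ell)!\,r!} \operatorname{Alt}\Big( \frac{(k + \ell)!}{k!\,\ell!} \operatorname{Alt}(\varphi \otimes \psi) \otimes \eta \Big) \\ &= \frac{(k + \ell + r)!}{(k + \ell)!\,r!} \cdot \frac{(k + \ell)!}{k!\,\ell!} \operatorname{Alt}(\varphi \otimes \psi \otimes \eta) \\ &= \frac{(k + \ell + r)!}{k!\,\ell!\,r!} \operatorname{Alt}(\varphi \otimes \psi \otimes \eta), \end{aligned}$$
where the third expression follows from the second in view of Equation (3.2.1). The
other equality is proved the same way.
□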
Thus $\wedge V$ becomes an associative algebra under the wedge product.
Notice that if $x, y$ are in $V = \wedge^1 V$ then
$$x \wedge y = \frac{2!}{1!\,1!} \cdot \frac{1}{2!}\,(x \otimes y - y \otimes x) = -(y \wedge x).$$

This generalizes to yield the anticommutativity of the wedge product.
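Proposition 3.2.5 For $\varphi \in \wedge^r V$ and $\psi \in \wedge^s V$, we have
$$\varphi \wedge \psi = (-1)^{rs}\, \psi \wedge \varphi.$$
Proof The result follows from the following fact:
$$\operatorname{Alt}(\varphi \otimes \psi) = (-1)^{rs} \operatorname{Alt}(\psi \otimes \varphi).$$
Let
$$\tau := \begin{pmatrix} 1 & 2 & \cdots & s & s+1 & \cdots & s+r \\ r+1 & r+2 & \cdots & r+s & 1 & \cdots & r \end{pmatrix}.$$
Then $\operatorname{sgn}(\tau) = (-1)^{rs}$, as can be seen by counting the number of inversions
in $\tau$. We leave it for the reader to show that $\tau(\psi \otimes \varphi) = \varphi \otimes \psi$.
Then we have
$$\begin{aligned} \operatorname{Alt}(\varphi \otimes \psi) &= \frac{1}{(r+s)!} \sum_{\sigma \in S_{r+s}} \operatorname{sgn}(\sigma)\, \sigma(\varphi \otimes \psi) \\ &= \frac{1}{(r+s)!} \sum_{\sigma \in S_{r+s}} \operatorname{sgn}(\sigma)\, \sigma(\tau(\psi \otimes \varphi)) \\ &= \frac{1}{(r+s)!} \sum_{\sigma \in S_{r+s}} \operatorname{sgn}(\sigma \circ \tau) \operatorname{sgn}(\tau)\, \sigma\tau(\psi \otimes \varphi) \\ &= \operatorname{sgn}(\tau)\, \frac{1}{(r+s)!} \sum_{\sigma' \in S_{r+s}} \operatorname{sgn}(\sigma')\, \sigma'(\psi \otimes \varphi) \\ &= \operatorname{sgn}(\tau) \operatorname{Alt}(\psi \otimes \varphi). \end{aligned}$$

This completes the proof.
□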
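Lemma 3.2.4 and Proposition 3.2.5 can be spot-checked numerically. The following Python sketch is only an illustration under our own assumptions: alternating tensors are modelled as Python functions of vectors in $\mathbb{R}^3$, the helper names `sign`, `alt`, `wedge` and `covector` are hypothetical, and the test data are arbitrary. Exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def sign(p):
    """Sign of a permutation p of 0..k-1, found by counting inversions."""
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def alt(t, k):
    """Alt(t) for a k-linear function t, following Definition 3.2.2."""
    def alternated(*vs):
        total = sum(sign(p) * t(*(vs[i] for i in p)) for p in permutations(range(k)))
        return Fraction(total) / factorial(k)
    return alternated

def wedge(phi, k, psi, l):
    """phi ^ psi := (k+l)!/(k! l!) Alt(phi (x) psi), phi of degree k and psi of degree l."""
    def tensor_product(*vs):
        return phi(*vs[:k]) * psi(*vs[k:])
    alternated = alt(tensor_product, k + l)
    constant = Fraction(factorial(k + l), factorial(k) * factorial(l))
    return lambda *vs: constant * alternated(*vs)

def covector(c):
    """The linear functional v -> sum_i c_i v_i."""
    return lambda v: sum(ci * vi for ci, vi in zip(c, v))

# Arbitrary illustrative data in R^3
a, b, c = covector((1, 2, 0)), covector((0, 1, 3)), covector((2, 0, 1))
vs = [(1, 0, 2), (0, 1, 1), (3, 1, 0)]

ab, ba, bc = wedge(a, 1, b, 1), wedge(b, 1, a, 1), wedge(b, 1, c, 1)
left = wedge(ab, 2, c, 1)(*vs)      # (a ^ b) ^ c
right = wedge(a, 1, bc, 2)(*vs)     # a ^ (b ^ c)
swapped = wedge(bc, 2, a, 1)(*vs)   # (b ^ c) ^ a, expected sign (-1)^(1*2) = +1

print(left == right, right == swapped, ab(vs[0], vs[1]) == -ba(vs[0], vs[1]))
```

A check on one set of vectors is of course not a proof; it only guards against sign and normalisation slips when implementing the definitions.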
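The associativity allows us to exhibit a basis of $\wedge V$. If $\{e_i\}$ is a basis
of the vector space $V$ then the collection
$$\{e_I := e_{i_1} \wedge e_{i_2} \wedge \cdots \wedge e_{i_k}\}$$
forms a basis of $\wedge^k V$, where $I := (i_1, \ldots, i_k)$ with $1 \le i_1 < \cdots < i_k \le n$. This
follows from the fact that any $\varphi \in \wedge^k V$ can be written as
$$\varphi = \sum_{i_1, \ldots, i_k} a^{i_1 \ldots i_k}\, e_{i_1} \otimes \cdots \otimes e_{i_k}$$
and the skew symmetry of $\varphi$. Thus the dimension of the vector space
$\wedge^k V$ is $\binom{n}{k}$. $\wedge V$ is called the exterior algebra of the vector space $V$.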
There is a formal way of looking at the exterior algebra of $V$. We fix
a basis $\{e_i\}$ of $V$ as above. Then we want to construct an associative
algebra over $\mathbb{R}$ which is as "free" as possible subject to the following
relations:
$$e_i^2 = 0 \quad\text{and}\quad e_i e_j = -e_j e_i$$
for all $i$ and $j$ with $i \ne j$. To construct such an algebra we start with
all real linear combinations of possible finite expressions of the form
$e_{i_1} e_{i_2} \cdots e_{i_r}$ where the juxtaposition of the basis elements means the
"product". We add them formally and multiply them by real scalars
formally. For example, if we have an expression of the form $e_2 e_1 e_2$ we
then use the relations above to see that
$$e_2 e_1 e_2 = -e_1 e_2 e_2 = -e_1 e_2^2 = 0$$
since $e_2^2 = 0$. Similarly
$$\begin{aligned} (e_1 e_2 + e_3)^2 &= e_1 e_2 e_1 e_2 + (e_1 e_2 e_3 + e_3 e_1 e_2) + e_3^2 \\ &= 0 + 2 e_1 e_2 e_3 + 0 \\ &= 2 e_1 e_2 e_3. \end{aligned}$$
The reader should try to write some more expressions like this and simplify
them using the relations. It is easy to see that this algebra is
isomorphic to the exterior algebra of $V$. One thing to notice here is that
while the definition of exterior algebra of $V$ does not depend on any
basis of $V$, the second definition depends on the choice of a basis of $V$.
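The formal rules $e_i^2 = 0$ and $e_i e_j = -e_j e_i$ are easy to mechanise, which gives a way to check hand simplifications such as the two above. The sketch below is our own illustration, not the text's construction: an element is stored as a dictionary mapping increasing index tuples to coefficients, and the names `sort_with_sign` and `multiply` are hypothetical.

```python
def sort_with_sign(indices):
    """Bubble-sort a tuple of indices; return (sign, sorted tuple), or (0, ()) if an index repeats."""
    idx = list(indices)
    if len(set(idx)) < len(idx):
        return 0, ()              # a repeated generator kills the monomial (e_i e_i = 0)
    sign = 1
    for end in range(len(idx) - 1, 0, -1):
        for j in range(end):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign      # each adjacent swap uses e_i e_j = -e_j e_i
    return sign, tuple(idx)

def multiply(a, b):
    """Product of two elements stored as {increasing index tuple: coefficient}."""
    out = {}
    for I, x in a.items():
        for J, y in b.items():
            sign, K = sort_with_sign(I + J)
            if sign:
                out[K] = out.get(K, 0) + sign * x * y
    return {K: c for K, c in out.items() if c != 0}

e1e2_plus_e3 = {(1, 2): 1, (3,): 1}
square = multiply(e1e2_plus_e3, e1e2_plus_e3)                   # (e1 e2 + e3)^2
e2e1e2 = multiply(multiply({(2,): 1}, {(1,): 1}), {(2,): 1})    # e2 e1 e2
print(square, e2e1e2)
```

The first result has the single key `(1, 2, 3)`, standing for $e_1 e_2 e_3$, and the second is the empty dictionary, standing for $0$.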
If $A$ is a linear endomorphism of $V$ then $A$ induces a natural map on
the vector spaces $\wedge^k V$, which we again denote by $A$. If $e_I$ is as above
then $Ae_I := (Ae_{i_1}) \wedge \cdots \wedge (Ae_{i_k})$. In fact we have
Lemma 3.2.6 Let $A$ be a linear endomorphism of $V$. Then there exists
a unique linear map $A : \wedge^k V \to \wedge^k V$ with the following
property:
$$A(v_1 \wedge \cdots \wedge v_k) = Av_1 \wedge \cdots \wedge Av_k \tag{3.2.2}$$
for all $v_1, \ldots, v_k \in V$.
Proof We choose a basis $\{e_i\}$ for $V$. Let $1 \le i_1 < \cdots < i_k \le n$, and
set
$$A(e_{i_1} \wedge \cdots \wedge e_{i_k}) := Ae_{i_1} \wedge \cdots \wedge Ae_{i_k}.$$
Then extend $A$ linearly to $\wedge^k V$. If for a collection $i_1, \ldots, i_k$ we have
$e_{i_r} = e_{i_s}$ with $r \ne s$, then
$$A(e_{i_1} \wedge \cdots \wedge e_{i_k}) = A(0) = 0 = Ae_{i_1} \wedge \cdots \wedge Ae_{i_k}$$
and hence Equation (3.2.2) is satisfied. Now assume that $v_i = \sum_j a_i^j e_j$.
Then $Av_i = \sum_j a_i^j Ae_j$ for $1 \le i \le k$. We therefore get
$$\begin{aligned} Av_1 \wedge \cdots \wedge Av_k &= a_1^{j_1} Ae_{j_1} \wedge \cdots \wedge a_k^{j_k} Ae_{j_k} \\ &= a_1^{j_1} \cdots a_k^{j_k}\, (Ae_{j_1} \wedge \cdots \wedge Ae_{j_k}) \\ &= a_1^{j_1} \cdots a_k^{j_k}\, A(e_{j_1} \wedge \cdots \wedge e_{j_k}). \end{aligned}$$
In the above set of equations we have used the summation convention.
We see in a similar way that
$$\begin{aligned} A(v_1 \wedge \cdots \wedge v_k) &= A(a_1^{j_1} e_{j_1} \wedge \cdots \wedge a_k^{j_k} e_{j_k}) \\ &= a_1^{j_1} \cdots a_k^{j_k}\, A(e_{j_1} \wedge \cdots \wedge e_{j_k}) \\ &= a_1^{j_1} \cdots a_k^{j_k}\, Ae_{j_1} \wedge \cdots \wedge Ae_{j_k} \\ &= A(a_1^{j_1} e_{j_1}) \wedge \cdots \wedge A(a_k^{j_k} e_{j_k}) \\ &= A(v_1) \wedge \cdots \wedge A(v_k). \end{aligned}$$
These computations show that $A$ is well-defined on $\wedge^k V$ and that it is
unique.
□
Exercise 3.2.7 More generally, A gives rise to a natural map on the

tensor powers ®k V in a similar way.
Now the top degree space $\wedge^n V$ is one-dimensional so that the induced
map is multiplication by a scalar. We claim that this scalar is the
determinant of the operator. That is to say, if we choose a basis $\{e_i\}$ for
$V$ and express $A$ as a matrix $(a_i^j)$ we have
Proposition 3.2.8 For any $n$ vectors $e_i \in V$ we have
$$A(e_1 \wedge \cdots \wedge e_n) = \det(A)\, (e_1 \wedge \cdots \wedge e_n).$$
Proof We may assume that $\{e_i\}$ forms a basis of $V$ since otherwise
both sides of the equation are zero. (Why?) Then we can write
$Ae_i = a_i^j e_j$. We now apply Lemma 3.2.6 to see that
$$\begin{aligned} A(e_1 \wedge \cdots \wedge e_n) &= Ae_1 \wedge \cdots \wedge Ae_n \\ &= a_1^{j_1} e_{j_1} \wedge \cdots \wedge a_n^{j_n} e_{j_n} \\ &= a_1^{j_1} \cdots a_n^{j_n}\, e_{j_1} \wedge \cdots \wedge e_{j_n} \\ &= a_1^{j_1} \cdots a_n^{j_n}\, \operatorname{sgn}(j_1, j_2, \ldots, j_n)\, e_1 \wedge \cdots \wedge e_n \\ &= \det(A)\, (e_1 \wedge \cdots \wedge e_n). \end{aligned}$$
This proves the proposition.

□
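Proposition 3.2.8 can be tried out on a small matrix by multiplying the images $Ae_1, \ldots, Ae_n$ in the formal exterior algebra and reading off the coefficient of $e_1 \wedge \cdots \wedge e_n$. The Python sketch below is an illustration under our own assumptions: elements are dictionaries from increasing index tuples to coefficients, the helper names are hypothetical, and the matrix is an arbitrary example. The comparison value comes from an independent cofactor expansion.

```python
def sort_with_sign(indices):
    """Bubble-sort indices; return (sign, sorted tuple), or (0, ()) if an index repeats."""
    idx = list(indices)
    if len(set(idx)) < len(idx):
        return 0, ()
    sign = 1
    for end in range(len(idx) - 1, 0, -1):
        for j in range(end):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return sign, tuple(idx)

def multiply(a, b):
    """Wedge product of elements stored as {increasing index tuple: coefficient}."""
    out = {}
    for I, x in a.items():
        for J, y in b.items():
            sign, K = sort_with_sign(I + J)
            if sign:
                out[K] = out.get(K, 0) + sign * x * y
    return {K: c for K, c in out.items() if c != 0}

def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

A = [[2, 1, 0], [0, 3, 1], [1, 0, 4]]   # A[j][i] = a_i^j, so column i holds A e_i
n = len(A)
images = [{(j + 1,): A[j][i] for j in range(n)} for i in range(n)]   # A e_1, ..., A e_n

top = images[0]
for image in images[1:]:
    top = multiply(top, image)          # A e_1 ^ ... ^ A e_n
print(top, det(A))
```

The only surviving key is `(1, 2, 3)`, and its coefficient agrees with the cofactor determinant.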
In algebra one defines the determinant of a linear map $A$ of $V$ to itself
as the scalar by which the induced map $A$ acts on the top exterior
power $\wedge^n V$. Proposition 3.2.8 shows that the definition one comes across
first is the same as the sophisticated definition. The advantage of this
latter definition is that it is invariantly defined. That is, we do not
require a matrix representation of $A$ to define its determinant. Also, it
can be used to prove the familiar results on the determinant function
such as $\det(ABA^{-1}) = \det B$, etc. It is also quite useful to compute
determinants!
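Proposition 3.2.9 Let $v_i$ be elements of $V$ and $u^j$ be elements of $V^*$,
$1 \le i, j \le k$. Then we have
$$(u^1 \wedge \cdots \wedge u^k)(v_1, \ldots, v_k) = \det \begin{pmatrix} u^1(v_1) & \cdots & u^1(v_k) \\ u^2(v_1) & \cdots & u^2(v_k) \\ \vdots & & \vdots \\ u^k(v_1) & \cdots & u^k(v_k) \end{pmatrix}.$$
Proof We prove this by induction. For $k = 1$, this is clear. So, we
assume that the result is true for wedge products of $k - 1$ factors. By the definition
of the exterior product the left side of the above displayed formula is
$$\frac{k!}{(k-1)!\,1!} \operatorname{Alt}_k\big((u^1 \wedge \cdots \wedge u^{k-1}) \otimes u^k\big)(v_1, v_2, \ldots, v_k) = \frac{1}{(k-1)!} \sum_{\sigma \in S_k} \operatorname{sgn}(\sigma)\, (u^1 \wedge \cdots \wedge u^{k-1})(v_{\sigma(1)}, \ldots, v_{\sigma(k-1)})\, u^k(v_{\sigma(k)}). \tag{3.2.3}$$
By induction hypothesis, we have
$$(u^1 \wedge \cdots \wedge u^{k-1})(v_{\sigma(1)}, \ldots, v_{\sigma(k-1)}) = \det\big(u^i(v_{\sigma(j)})\big)_{1 \le i, j \le k-1}.$$
As earlier, we identify the subgroup $H$ of $G$ ($:= S_k$) of all permutations
that leave $k$ fixed with $S_{k-1}$. Then the determinant above expands as
$$(u^1 \wedge \cdots \wedge u^{k-1})(v_{\sigma(1)}, \ldots, v_{\sigma(k-1)}) = \sum_{\tau \in H} \operatorname{sgn}(\tau)\, u^1(v_{\sigma(\tau(1))}) \cdots u^{k-1}(v_{\sigma(\tau(k-1))}).$$
Hence the right side of Equation (3.2.3) becomes
$$\frac{1}{(k-1)!} \sum_{\tau \in H} \Big\{ \sum_{\sigma \in G} \operatorname{sgn}(\sigma\tau)\, u^1(v_{\sigma\tau(1)}) \cdots u^{k-1}(v_{\sigma\tau(k-1)})\, u^k(v_{\sigma\tau(k)}) \Big\}.$$
For $\tau$ fixed, as $\sigma$ varies over $G$, $\sigma\tau$ varies over $G$ so that the expression
within the braces is $\det(u^i(v_j))_{1 \le i, j \le k}$. Now the result follows since
the right side is $(k-1)!\,\big(1/(k-1)!\big) \det(u^i(v_j))_{1 \le i, j \le k}$.
□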
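As a sanity check of Proposition 3.2.9 for $k = 3$, one can evaluate an iterated wedge product of three covectors directly from the definition and compare it with the determinant of the matrix $(u^i(v_j))$. The Python sketch below rests on our own assumptions: forms are modelled as functions of vectors in $\mathbb{R}^3$, all helper names are hypothetical, and the data are arbitrary.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def sign(p):
    """Sign of a permutation p of 0..k-1, found by counting inversions."""
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def wedge(phi, k, psi, l):
    """phi ^ psi := (k+l)!/(k! l!) Alt(phi (x) psi)."""
    n = k + l
    constant = Fraction(factorial(n), factorial(k) * factorial(l))
    def result(*vs):
        total = sum(sign(p) * phi(*(vs[i] for i in p[:k])) * psi(*(vs[i] for i in p[k:]))
                    for p in permutations(range(n)))
        return constant * Fraction(total) / factorial(n)
    return result

def covector(c):
    """The linear functional v -> sum_i c_i v_i."""
    return lambda v: sum(ci * vi for ci, vi in zip(c, v))

def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

us = [covector((1, 2, 0)), covector((0, 1, 3)), covector((2, 0, 1))]   # u^1, u^2, u^3
vs = [(1, 0, 2), (0, 1, 1), (3, 1, 0)]                                  # v_1, v_2, v_3

form = wedge(wedge(us[0], 1, us[1], 1), 2, us[2], 1)   # u^1 ^ u^2 ^ u^3
value = form(*vs)
matrix = [[u(v) for v in vs] for u in us]              # entries u^i(v_j)
print(value, det(matrix))
```

Dropping the factor `constant` from `wedge` (the alternative convention used in some books) would change `value`, which is one way to see that the normalisation matters for this formula.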
Remark 3.2.10 It is essential to observe that if $\{v_i\}$ is a basis of $V$ and
$\{u^i\}$ is the basis of $V^*$ dual to $\{v_i\}$ then the formula above reads
$$(u^1 \wedge \cdots \wedge u^n)(v_1, \ldots, v_n) = 1.$$
This is true because of our definition of the wedge product, for in some
books, the wedge product is defined by
$$\varphi \wedge \psi := \operatorname{Alt}(\varphi \otimes \psi).$$
With this definition, for example, we have
$$u^1 \wedge u^2 = \tfrac{1}{2}\,(u^1 \otimes u^2 - u^2 \otimes u^1)$$
so that
$$(u^1 \wedge u^2)(v_1, v_2) = \tfrac{1}{2}.$$
More generally, for $u^i$ and $v_i$ dual to each other, we shall have
$$(u^1 \wedge \cdots \wedge u^k)(v_1, \ldots, v_k) = \frac{1}{k!}$$
so that Proposition 3.2.9 is no longer true. If the vector space $V$
comes with an inner product, then we can identify $V^*$ with $V$. We start
with an orthonormal basis $\{e_i\}$ for $V$ and may assume that the 'dual'
basis of $V^* = V$ is also $\{e_i\}$. Then we would like the left side of the
formula in Proposition 3.2.9 to be the volume of the unit cube, namely,
1. This is the convincing reason we talked of above.
Example 3.2.11 We want to show that the sophisticated way of looking
at the determinant function as the top degree skew symmetric multilinear
function on the row vectors is useful in computations too. For instance,
consider the matrix
$$A := \begin{pmatrix} 1 + x_1^2 & x_1 x_2 & \cdots & x_1 x_n \\ x_2 x_1 & 1 + x_2^2 & \cdots & x_2 x_n \\ \vdots & \vdots & \ddots & \vdots \\ x_n x_1 & x_n x_2 & \cdots & 1 + x_n^2 \end{pmatrix}.$$
To compute its determinant we notice that the $i$-th row is $e_i^t + x_i x^t$, where
$\{e_i\}$ is the canonical basis, we think of $e_i$ and $x$ as column
vectors, and $x^t$ denotes the transpose of $x$. Hence we have
$$\det A = \det(e_1^t + x_1 x^t, \ldots, e_n^t + x_n x^t).$$

Now using the multilinearity and the skew symmetry (in particular, the
fact that $\det$ is zero if the arguments are linearly dependent) we easily
see that $\det A = 1 + x_1^2 + \cdots + x_n^2$. This computation will be needed later
when we wish to find the volume form of a hypersurface in $\mathbb{R}^{n+1}$ defined
by $x_{n+1} = f(x_1, \ldots, x_n)$.
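The closed form $\det A = 1 + x_1^2 + \cdots + x_n^2$ can be confirmed on a concrete vector. The short Python sketch below is only a numerical check with an arbitrarily chosen $x$; the function name `det` and the cofactor expansion are our own choices, not the method of the example.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

x = [1, 2, 3, 4]   # arbitrary sample vector
n = len(x)
# Entries delta_ij + x_i x_j, as in the matrix of Example 3.2.11
A = [[(1 if i == j else 0) + x[i] * x[j] for j in range(n)] for i in range(n)]

lhs = det(A)
rhs = 1 + sum(t * t for t in x)
print(lhs, rhs)
```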
We end this section with one more observation for those readers who
have been exposed to vector algebra on $\mathbb{R}^3$. We take $\{e_i\}$ to be the
canonical basis. For $v, w \in V := \mathbb{R}^3$, we write $v = a^1 e_1 + a^2 e_2 + a^3 e_3$
and $w = b^1 e_1 + b^2 e_2 + b^3 e_3$. Then we have
$$\begin{aligned} v \wedge w &= (a^1 e_1 + a^2 e_2 + a^3 e_3) \wedge (b^1 e_1 + b^2 e_2 + b^3 e_3) \\ &= a^1 b^2\, e_1 \wedge e_2 + a^1 b^3\, e_1 \wedge e_3 + a^2 b^1\, e_2 \wedge e_1 + \cdots. \end{aligned}$$
Thus we have
$$v \wedge w = (a^1 b^2 - a^2 b^1)\, e_1 \wedge e_2 + (a^2 b^3 - a^3 b^2)\, e_2 \wedge e_3 + (a^3 b^1 - a^1 b^3)\, e_3 \wedge e_1.$$
If we identify $e_1 \wedge e_2$ with $e_3$, $e_2 \wedge e_3$ with $e_1$ and $e_3 \wedge e_1$ with $e_2$ then
the above equation says that the wedge product on $\mathbb{R}^3$ can be identified
with the cross or vector product.
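The identification can be made concrete in a few lines. The Python sketch below is our own illustration (the function names and the sample vectors are hypothetical): it reads off the coefficients of $v \wedge w$ from the antisymmetrised components $a^i b^j - a^j b^i$ and compares them, after the relabelling described above, with the usual cross product formula.

```python
def wedge_coefficients(v, w):
    """Coefficients of v ^ w on e1^e2, e2^e3 and e3^e1 (0-based component indices)."""
    def comp(i, j):
        return v[i] * w[j] - v[j] * w[i]   # coefficient of e_i ^ e_j
    return comp(0, 1), comp(1, 2), comp(2, 0)

def cross(v, w):
    """Standard cross product on R^3."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

v, w = (1, 2, 3), (4, 5, 6)   # arbitrary sample vectors
c12, c23, c31 = wedge_coefficients(v, w)
# e2^e3 -> e1, e3^e1 -> e2, e1^e2 -> e3
identified = (c23, c31, c12)
print(identified, cross(v, w))
```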
Exercise 3.2.12 For $v$ and $w$ in $\mathbb{R}^3$, show that the vector $v \times w$ is the
unique vector such that the following holds for all $u \in \mathbb{R}^3$:
$$\langle v \times w, u \rangle = \text{the oriented volume of } [v, w, u].$$

Here $\langle \cdot, \cdot \rangle$ is the Euclidean inner product and $[v, w, u]$ stands for the
parallelepiped spanned by the three vectors.
Exercise 3.2.13 (Cartan's lemma) Let $V$ be an $n$-dimensional vector
space. Let $\{e_i\}_{i=1}^n$ be a basis of $V$. Let $\{v_j\}_{j=1}^n$ be a set of vectors
such that $\sum_{i=1}^n v_i \wedge e_i = 0$. Then there exist real numbers $a_{ij}$ such that $a_{ij} = a_{ji}$
and $v_i = \sum_j a_{ij} e_j$.
3.3 Tensor fields
We want to apply our constructions of the previous two sections to the
study of manifolds. If $M$ is a smooth manifold, then at each point
$p \in M$ we have two vector spaces $T_pM$ and $T_p^*M$. We can therefore
start constructing various tensor spaces as follows:
$$\mathcal{D}^r_s(p) := \underbrace{T_p(M) \otimes \cdots \otimes T_p(M)}_{r \text{ times}} \otimes \underbrace{T_p^*(M) \otimes \cdots \otimes T_p^*(M)}_{s \text{ times}}.$$
An element $\varphi_p$ of $\mathcal{D}^r_s(p)$ is called a tensor of contravariant rank $r$ and
covariant rank $s$ at $p$. Just as in the case of a vector field we say $\varphi$ is
a tensor field if it is a map $p \mapsto \varphi_p \in \mathcal{D}^r_s(p)$. How do we impose some
smoothness conditions on a tensor field? Given $p \in M$ we choose any
local chart $(U, x)$ of $M$ with $p \in U$. Then we have seen that the $\frac{\partial}{\partial x_i}\big|_p$ form
a basis for $T_pM$ and the $dx_i|_p$ form a basis for $T_p^*M$. Hence we can write $\varphi$
on $U$ as follows:
$$\varphi_p = \sum \varphi^{i_1 \ldots i_r}_{j_1 \ldots j_s}(p)\, \frac{\partial}{\partial x_{i_1}}\Big|_p \otimes \cdots \otimes \frac{\partial}{\partial x_{i_r}}\Big|_p \otimes dx_{j_1}|_p \otimes \cdots \otimes dx_{j_s}|_p.$$
We say $\varphi$ is a smooth tensor field of type $(r, s)$ if the coefficients $\varphi^I_J$ (in
an obvious notation) are smooth for all coordinate charts.
In classical tensor analysis, tensors were recognized by their components
$\varphi^I_J$ and the transformation rule they obeyed. We pointed out an instance of
this when we were dealing with vector fields, that
is, contravariant tensor fields of rank 1 or tensor fields of type $(1, 0)$.
Just to give you a sample of this formulation, let us see how a tensor
field of type $(1, 2)$ is expressed in terms of two charts $(U, x)$ and $(V, y)$
with $U \cap V \ne \emptyset$.
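We first recall the transformation rule for the tangent vectors and
the cotangent vectors.
$$\frac{\partial}{\partial x_i} = \sum_j \frac{\partial y_j}{\partial x_i}\, \frac{\partial}{\partial y_j} \tag{3.3.1}$$
$$dx_i = \sum_j \frac{\partial x_i}{\partial y_j}\, dy_j. \tag{3.3.2}$$
Now we write $\varphi \in \mathcal{D}^1_2(p)$ as
$$\varphi = \sum_{i,j,k} a^i_{jk}\, \frac{\partial}{\partial x_i}\Big|_p \otimes dx_j|_p \otimes dx_k|_p = \sum_{m,r,s} b^m_{rs}\, \frac{\partial}{\partial y_m}\Big|_p \otimes dy_r|_p \otimes dy_s|_p \tag{3.3.3}$$
with respect to the two charts. Then we use the transformation formulas
for the tangent vectors and the cotangent vectors to get
$$\begin{aligned} \varphi &= \sum a^i_{jk}\, \frac{\partial y_m}{\partial x_i}(p)\, \frac{\partial}{\partial y_m}\Big|_p \otimes \frac{\partial x_j}{\partial y_r}(p)\, dy_r|_p \otimes \frac{\partial x_k}{\partial y_s}(p)\, dy_s|_p \\ &= \sum \frac{\partial y_m}{\partial x_i}(p)\, \frac{\partial x_j}{\partial y_r}(p)\, \frac{\partial x_k}{\partial y_s}(p)\, a^i_{jk}\, \frac{\partial}{\partial y_m}\Big|_p \otimes dy_r|_p \otimes dy_s|_p, \end{aligned} \tag{3.3.4}$$
where the sums run over $i, j, k, m, r, s$.
Using the linear independence of the $\partial/\partial y$ and $dy$ and Equations (3.3.3) and
(3.3.4) we get the transformation formula for the tensors of type $(1, 2)$:
$$b^m_{rs} = \sum_{i,j,k} \frac{\partial y_m}{\partial x_i}(p)\, \frac{\partial x_j}{\partial y_r}(p)\, \frac{\partial x_k}{\partial y_s}(p)\, a^i_{jk}.$$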
There is a more elegant (and quite useful) way of introducing tensor
fields. Recall how we can think of a vector field $X$ as a field of tangent
vectors $p \mapsto X_p \in T_pM$ as well as a derivation of the algebra $C^\infty(M)$.
Now the vector space $\mathfrak{X}(M)$ is a module over the commutative ring
$C^\infty(M)$. We quite often write hereafter $\mathcal{D}^1(M)$ for $\mathfrak{X}(M)$. Let us denote
by $\mathcal{D}_1(M)$ the dual of $\mathcal{D}^1(M)$. By definition, $\mathcal{D}_1(M)$ is the set of all
$C^\infty(M)$-linear maps from $\mathcal{D}^1(M)$ to $C^\infty(M)$. Thus, $\varphi \in \mathcal{D}_1(M)$ if and
only if $\varphi$ is additive and we have
$$\varphi(fX) = f\varphi(X) \quad\text{for all } f \in C^\infty(M) \text{ and } X \in \mathfrak{X}(M).$$
We call elements of $\mathcal{D}_1(M)$ differential 1-forms. We shall make a
formal definition.
Definition 3.3.1 A differential 1-form $\varphi$ on a manifold $M$ is a $C^\infty(M)$-linear
map from the set $\mathcal{D}^1(M)$ of vector fields on $M$ to $C^\infty(M)$.
As an example, we consider $\omega := df$ for any smooth function $f$ on
$M$: $df(X) = X(f)$. Then $df(gX) = gX(f) = g\,df(X)$ for any smooth
function $g$ on $M$. Thus $df$ is a differential one form, that is, $df \in \mathcal{D}_1(M)$.
We wish to find local expressions for $\omega \in \mathcal{D}_1(M)$. That is, we want to
show that we can write $\omega = \sum_i f_i\, dx_i$, with $f_i \in C^\infty(U)$, on a local chart
around $p$. To do this we adopt the same method we used to identify a
derivation of $C^\infty(M)$ with a vector field. We define $\omega_p \in T_p^*M$ by setting
$\omega_p(v) = \omega(X)(p)$ where $X$ is any smooth vector field on $M$ such that
$X_p = v$. Is $\omega_p$ well defined?
First observe that if $X \in \mathcal{D}^1(M)$ and if $X = 0$ on an open set $V$,
then the smooth function $\omega(X) = 0$ on $V$. To establish this, given $p \in V$ we take
any smooth function $f$ such that $f = 0$ in a neighbourhood of $p$ inside $V$
and $f = 1$ outside $V$. Then $fX = X$. Since $\omega$ is $C^\infty(M)$-linear we have
$\omega(fX) = f\omega(X)$ and hence $\omega(X)(p) = f(p)\,\omega(X)(p) = 0$ since $f(p) = 0$.
Thus any 1-form induces a 1-form on an open submanifold of $M$. Now
if $X = \sum_i X(x_i) \frac{\partial}{\partial x_i}$ on $U$, then $X_p = 0$ if and only if $X(x_i)(p) = 0$ for
all $i$. Therefore we have
$$\begin{aligned} \omega(X)(p) &= \Big( \sum_i X(x_i)\, \omega\Big(\frac{\partial}{\partial x_i}\Big) \Big)(p) \\ &= \sum_i X(x_i)(p)\, \omega\Big(\frac{\partial}{\partial x_i}\Big)(p) \\ &= 0. \end{aligned}$$
Thus, if $X_p = 0$ for some $p \in M$, then $\omega(X_p) = \omega(X)(p) = 0$. This
shows that $\omega_p$ is well-defined.
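We set
$$\mathcal{D}_1(p) := \{\omega_p : \omega \in \mathcal{D}_1(M)\}.$$
We claim that $\mathcal{D}_1(p) = T_p^*M$. By definition, we have $\omega_p \in T_p^*M$. Now,
if $\lambda \in T_p^*M$ is given then we set $\lambda_i := \lambda\big(\frac{\partial}{\partial x_i}\big|_p\big)$. Now we consider
$\varphi_U := \sum_i \lambda_i\, dx_i$ on a chart $(U, x)$ around $p$. Then by Remark 3.3.2 below,
there exists a $\varphi \in \mathcal{D}_1(M)$ such that $\varphi_U = \varphi$ on an open set $V$ of $U$ with
$p \in V$. Clearly $\varphi_p = \lambda$. Hence we have substantiated our claim. The
outcome of what we have done so far is this: $\mathcal{D}_1(M)$ consists of elements
$\omega$ which can be thought of as a map $p \mapsto \omega_p \in T_p^*M$.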
Remark 3.3.2 Do you recall for a given $v \in T_pM$ how we produced
a vector field $X$ on $M$ such that $X_p = v$? We apply the same trick
here too. We choose a compact neighbourhood $V$ of $p$ in $U$. Let $f$ be
a smooth function on $M$ such that $f$ has compact support in $U$ and
$f = 1$ on $V$. Then we define $\varphi$ by setting $\varphi(X) := f\varphi_U(X)$ on $U$ and
$\varphi(X) = 0$ outside $U$. This $\varphi$ meets our requirements.
We are now ready for the global definition of tensor fields. We set
$\mathcal{D}^r(M)$ to be the set of $r$-multilinear maps
$$\mathcal{D}_1(M) \times \cdots \times \mathcal{D}_1(M) \to C^\infty(M).$$
Note that the left side is a $C^\infty(M)$-module and the linearity is with
respect to $C^\infty(M)$.
Similarly we set $\mathcal{D}_s(M)$ to be the set of all $s$-linear maps
$$\mathfrak{X}(M) \times \cdots \times \mathfrak{X}(M) \to C^\infty(M).$$

In particular, when $s = 1$, $\mathcal{D}_1(M)$ consists of all $C^\infty(M)$-linear maps
from $\mathfrak{X}(M)$ to $C^\infty(M)$. Thus our notation is consistent.
Now what is $\mathcal{D}^1(M)$? It is the set of $C^\infty(M)$-linear maps $F :
\mathcal{D}_1(M) \to C^\infty(M)$ by definition. We remark that if $X \in \mathfrak{X}(M)$, then
$X \in \mathcal{D}^1(M)$, for we can define $X(\omega) := \omega(X) \in C^\infty(M)$. This maps
$\mathfrak{X}(M)$ in a one-one way into $\mathcal{D}^1(M)$. It turns out that we indeed have
$\mathfrak{X}(M) = \mathcal{D}^1(M)$. This is the content of Proposition 3.3.3 below.
Proposition 3.3.3 We have $(\mathcal{D}_1)^* = \mathcal{D}^1(M) = \mathfrak{X}(M)$.
Proof The proof follows the pattern of the fact that a derivation of
$C^\infty(M)$ gives rise to a vector field on $M$.
Let $F \in (\mathcal{D}_1)^*$. If $\omega \in \mathcal{D}_1(M)$ vanishes on an open set $V$, then $F(\omega)$
vanishes on $V$. The argument is the same as in the case of vector fields
and hence is omitted. (Use the fact that $F(f\omega) = fF(\omega)$.) If $p \in M$
and $(U, x)$ is a local chart containing $p$ then any 1-form on $U$ looks
like $\sum_i f_i\, dx_i$ on $U$ with $f_i$ smooth on $U$. This implies, by the usual
reasoning, that $F(\omega)(p) = 0$ whenever $\omega_p = 0$.
The second step shows that the map $\omega_p \mapsto F(\omega)(p)$ is a well defined
linear functional on $\mathcal{D}_1(p)$, which is $T_p^*M$. Hence there exists a unique
$X_p \in T_pM$ such that
$$F(\omega)(p) = \omega_p(X_p) \quad\text{for all } \omega \in \mathcal{D}_1(M).$$
Thus $F$ gives rise to $p \mapsto X_p$ with $X_p \in T_pM$. We claim that the above
assignment is smooth. For, if $z \in U$, we write $X_z = \sum_i a_i(z) \frac{\partial}{\partial x_i}\big|_z$
for some $a_i(z) \in \mathbb{R}$. Now for any $i$ there is a 1-form $\omega_i$ on $M$ which equals $dx_i$ on a
neighbourhood $V$ of $p$ contained in $U$ (Remark 3.3.2). Thus we have
$$a_i(z) = (\omega_i)_z(X_z) = F(\omega_i)(z) \quad\text{for } z \in V$$
and hence the $a_i$ are smooth functions on $V \ni p$.

□
Proposition 3.3.3 allows us to define $\mathcal{D}_s(M)$ as the set of $C^\infty(M)$-$s$-linear
maps
$$\mathcal{D}^1(M) \times \cdots \times \mathcal{D}^1(M) \to C^\infty(M).$$
More generally, we define $\mathcal{D}^r_s(M)$ to be the set of all multilinear (with
respect to $C^\infty(M)$) maps
$$\underbrace{\mathcal{D}_1(M) \times \cdots \times \mathcal{D}_1(M)}_{r \text{ times}} \times \underbrace{\mathcal{D}^1(M) \times \cdots \times \mathcal{D}^1(M)}_{s \text{ times}} \to C^\infty(M).$$

An element $\varphi$ of $\mathcal{D}^r_s(M)$ is called a tensor field of type $(r, s)$. $\varphi$ is said
to be contravariant of rank $r$ and covariant of rank $s$. In particular, a
tensor field of type $(0, 0)$ (respectively, $(1, 0)$, $(0, 1)$) is a smooth function
(respectively a smooth vector field, a smooth differential 1-form).
Observe that $\mathcal{D}^r_s(M)$ is a $C^\infty(M)$-module in an obvious way.
We now look at some examples.
Example 3.3.4 Let $E : \mathcal{D}^1(M) \times \mathcal{D}_1(M) \to C^\infty(M)$ be defined by
$E(X, \omega) := \omega(X)$. Then clearly we have $E(fX, g\omega) = fg\,\omega(X)$ for
smooth functions $f$ and $g$ on $M$. Thus, $E$ is a tensor of type $(1, 1)$.
In classical tensor analysis books this is called the Kronecker tensor and
denoted by $\delta^i_j$.
Example 3.3.5 For $\omega \in \mathcal{D}_1(M)$, we define
$$d\omega : \mathcal{D}^1(M) \times \mathcal{D}^1(M) \to C^\infty(M)$$
as follows:
$$d\omega(X, Y) = X(\omega(Y)) - Y(\omega(X)) - \omega([X, Y]).$$

Then it is easy to see that $d\omega$ is a $(0, 2)$-tensor field and $d\omega(X, Y) =
-d\omega(Y, X)$ for all vector fields $X$ and $Y$.
Example 3.3.6 We use the notation of the above example. Let $X$ be
a vector field on $M$. We consider $\mathcal{L}_X\omega : \mathcal{D}^1(M) \to C^\infty(M)$ defined by
$$\mathcal{L}_X\omega(Y) := X(\omega(Y)) - \omega([X, Y]).$$

Then it is easily seen that $\mathcal{L}_X\omega$ is again a 1-form, called the Lie derivative
of $\omega$ with respect to $X$.
Example 3.3.7 Fix $\omega \in \mathcal{D}_1(M)$ and consider $F : \mathcal{D}^1(M) \times \mathcal{D}^1(M) \to
C^\infty(M)$ defined by $F(X, Y) := X(\omega(Y))$. Is $F$ a $(0, 2)$-tensor?
Just as we view any vector field as a mapping from $M$ which assigns
to each point $p \in M$ a tangent vector in $T_pM$ in a smooth way, we
want to show that the definitions of tensor fields in this section and in
Section 3.1 are the same. A proof of this runs along the same lines as in
the case of vector fields and hence we strongly urge the reader to work
this out on his own. However, we include a proof for the sake of completeness.
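Proof We take any $\varphi \in \mathcal{D}^r_s(M)$. By $C^\infty(M)$-multilinearity we have
$$\varphi(g_1\omega_1, \ldots, g_r\omega_r, f_1X_1, \ldots, f_sX_s) = g_1 \cdots g_r f_1 \cdots f_s\, \varphi(\omega_1, \ldots, \omega_r, X_1, \ldots, X_s)$$
for $f_i, g_i \in C^\infty(M)$. If any of $X_i$ or $\omega_i$ vanishes on an open set $V$,
then the function $\varphi(\omega_1, \ldots, X_s)$ also vanishes on $V$. The proof of this
should have become routine to the reader by now. (Given $p \in V$, choose a compact
neighbourhood $W$ of $p$ inside $V$ and a smooth function $f$ which is 0 on $W$ and
is 1 outside $V$. If, say, $X_i$ vanishes on $V$, then $fX_i = X_i$, so that $\varphi(\omega_1, \ldots, X_s) = f\varphi(\omega_1, \ldots, X_s)$ and hence
$\varphi(\omega_1, \ldots, X_s)(p) = f(p)\,\varphi(\omega_1, \ldots, X_s)(p) = 0$.)
Let $(U, x)$ be a local chart around $p$. We write $X_i = \sum_k f_{ik} \frac{\partial}{\partial x_k}$ and
$\omega_j = \sum_l g_{jl}\, dx_l$. Hence we have, on a neighbourhood of $p$,
$$\varphi(\omega_1, \ldots, \omega_r, X_1, \ldots, X_s) = \sum g_{1 l_1} \cdots g_{r l_r} f_{1 k_1} \cdots f_{s k_s}\, \varphi\Big(dx_{l_1}, \ldots, dx_{l_r}, \frac{\partial}{\partial x_{k_1}}, \ldots, \frac{\partial}{\partial x_{k_s}}\Big).$$
This shows that $\varphi(\omega_1, \ldots, X_s)(p) = 0$ if some $\omega_j$ or $X_j$ vanishes at $p$.
Thus the map defined by
$$\varphi_p\big((\omega_1)_p, \ldots, (\omega_r)_p, (X_1)_p, \ldots, (X_s)_p\big) := \varphi(\omega_1, \ldots, \omega_r, X_1, \ldots, X_s)(p)$$
is well defined, so that $\varphi_p \in \mathcal{D}^r_s(p)$.

□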
It is clear that $\varphi_p = 0$ for all $p$ if and only if $\varphi$ is 0 and that $p \mapsto \varphi_p$ is
smooth in the usual sense. Conversely, if the assignment $p \mapsto \psi_p \in \mathcal{D}^r_s(p)$
is smooth, then there is a tensor field $\Psi \in \mathcal{D}^r_s(M)$ such that $\Psi_p = \psi_p$,
where $\Psi_p$ is as defined in the paragraph above.
In differential geometry the covariant tensors play an important role.
The reason for this is the fact that they behave well under smooth maps.
Do you recall that we were able to define the image of a vector field under
a map only if the map is a diffeomorphism? In contrast, if $\varphi$ is a
covariant tensor of type $(0, s)$ on a manifold $N$ and if $f : M \to N$ is a
smooth map then we can define the pull-back $f^*(\varphi)$ of $\varphi$ under $f$ as
follows:
$$f^*(\varphi)(X_1, \ldots, X_s)(p) := \varphi\big(f_*(X_1(p)), \ldots, f_*(X_s(p))\big)$$
for vector fields $X_i$ on $M$ and $p \in M$. Here we have used the notation $f_*(X_p)$
for the derivative map $Df(p)(X_p)$. In the sequel we shall often use this
notation. It is easy to check that $f^*(\varphi)$ is indeed a (smooth) covariant
tensor field of rank $s$ on $M$.
As an example, we see how the pull-back of a differential 1-form looks
locally. Let $q \in N$ and $(V, y)$ be a local chart around $q$. Then we have
the form $dy_i$ on $V$. If $f : M \to N$ is a smooth map, if $p \in M$ is such
that $f(p) = q$ and if $(U, x)$ is a local chart at $p$, then by definition we
have
$$\begin{aligned} f^*(dy_i)(X)(p) &= dy_i(f_*X)(p) \\ &= dy_i(f_*(p)(X_p)) \\ &= f_*(p)(X_p)(y_i) \\ &= f_*(X_p)(y_i) \\ &= X_p(y_i \circ f) \\ &= d(y_i \circ f)(X_p). \end{aligned}$$
That is, $f^*(dy_i)$ is the 1-form $d(y_i \circ f)$ on $M$.
Exercise 3.3.8 Pull-back respects the tensor products.
There is an important class of operations known as contractions.
They are defined as follows.
$$C^i_j(X_1 \otimes \cdots \otimes X_r \otimes \omega_1 \otimes \cdots \otimes \omega_s) := \omega_j(X_i)\, \big(X_1 \otimes \cdots \otimes \widehat{X_i} \otimes \cdots \otimes X_r \otimes \omega_1 \otimes \cdots \otimes \widehat{\omega_j} \otimes \cdots \otimes \omega_s\big).$$
Here a hat indicates that the factor is omitted. Thus, the contraction $C^i_j$ of a mixed tensor of type $(r, s)$ is the pairing
of the $i$-th contravariant argument and $j$-th covariant argument. Thus
$C^i_j : \mathcal{D}^r_s(M) \to \mathcal{D}^{r-1}_{s-1}(M)$.
Exercise 3.3.9 Is $\mathcal{D}^r_s(M)$ dual to $\mathcal{D}^s_r(M)$ as $C^\infty(M)$-module? Prove
your guess.
Exercise 3.3.10 If $f : M \to M$ is a diffeomorphism then $f$ extends to
a type preserving automorphism of the tensor algebra
$$\mathcal{D}(M) := \bigoplus_{r, s \ge 0} \mathcal{D}^r_s(M).$$
Show that this automorphism commutes with contractions.

This section is based very much on [8].
3.4 The exterior derivative

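We want to define an operation known as the exterior differentiation on
differential forms. Recall that $\mathcal{D}_r(M)$ is the set of all maps from
$$\underbrace{\mathfrak{X}(M) \times \cdots \times \mathfrak{X}(M)}_{r \text{ times}} \to C^\infty(M)$$
which are $C^\infty(M)$-linear in each of their variables. In this section we consider only
the alternating elements of $\mathcal{D}_r(M)$; each such element is called a differential form of degree $r$.
We first define the exterior derivative locally and show that it extends
to a globally defined operation.
Definition 3.4.1 Let $\omega \in \mathcal{D}_r(M)$. Assume that $\omega$ has the
local expression $\omega = \sum a_{i_1 i_2 \ldots i_r}\, dx_{i_1} \wedge dx_{i_2} \wedge \cdots \wedge dx_{i_r}$ on a coordinate
neighborhood $(U, x)$. We define $d_L\omega := \sum da_{i_1 i_2 \ldots i_r} \wedge dx_{i_1} \wedge \cdots \wedge dx_{i_r}$.
We would like to show that $d_L$ extends globally, $d_L : \mathcal{D}_r(M) \to
\mathcal{D}_{r+1}(M)$. That is, if $\omega = \sum b_{j_1 j_2 \ldots j_r}\, dy_{j_1} \wedge \cdots \wedge dy_{j_r}$ with respect to
a different coordinate neighborhood $(V, y)$, and if $U \cap V \ne \emptyset$, then
$$d_L\omega = \sum db_{j_1 j_2 \ldots j_r} \wedge dy_{j_1} \wedge \cdots \wedge dy_{j_r} = \sum da_{i_1 i_2 \ldots i_r} \wedge dx_{i_1} \wedge \cdots \wedge dx_{i_r}$$
on $U \cap V$.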
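Let us look at the simplest possible case: Let $\omega = \sum_i a_i\, dx_i = \sum_j b_j\, dy_j$
on $(U, x)$ and $(V, y)$. Then, as usual, $a_i = \sum_j b_j \frac{\partial y_j}{\partial x_i}$ and $b_j = \sum_i a_i \frac{\partial x_i}{\partial y_j}$. In the
following if $\varphi$ is a tensor field over $M$ and $p \in M$, then $\varphi(p)$ (or
$\varphi|_p$) will stand for the map at the appropriate product of tangent and
cotangent vector spaces at $p$. For instance, if $\varphi$ is a one form, then $\varphi(p)$
denotes the restriction of $\varphi$ to $T_p(M)$.
Therefore $d_L\omega(x) = \sum_i da_i \wedge dx_i$ and
$$\begin{aligned} d_L\omega(y) &= \sum_j db_j \wedge dy_j = \sum_{i,j} d\Big(a_i \frac{\partial x_i}{\partial y_j}\Big) \wedge dy_j \\ &= \sum_i da_i \wedge dx_i + \sum_{i,j} a_i \Big( \sum_k \frac{\partial}{\partial y_k} \frac{\partial x_i}{\partial y_j}\, dy_k \Big) \wedge dy_j \\ &= \sum_i da_i \wedge dx_i + \sum_i a_i \Big( \sum_{j,k} \frac{\partial^2 x_i}{\partial y_k\, \partial y_j}\, dy_k \wedge dy_j \Big) \\ &= d_L\omega(x), \end{aligned}$$
where the last double sum vanishes because the second partial derivatives are symmetric in $j$ and $k$ while $dy_k \wedge dy_j$ is antisymmetric.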
Thus $d_L$ is globally defined for 1-forms on the whole of $M$. Instead of
trying to show straight away that $d$ is well-defined on all forms of any
degree, we give an invariant definition of $d$ on $M$.
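Let $d : \bigoplus_r \mathcal{D}_r M \to \bigoplus_r \mathcal{D}_{r+1} M$ be defined as follows: for all $f \in \mathcal{D}_0 M = C^\infty(M)$, $df$ stands for the differential of $f$, that is, $df(X) = Xf$ for all $X \in \mathcal{D}^1(M)$. Let $\omega \in \mathcal{D}_r(M)$, $r \ge 1$, and $X_1, X_2, \ldots, X_{r+1} \in \mathcal{D}^1(M)$. Then
$$d\omega(X_1, \ldots, X_{r+1}) := \sum_{i=1}^{r+1} (-1)^{i+1} X_i\big(\omega(X_1, \ldots, \widehat{X_i}, \ldots, X_{r+1})\big) + \sum_{i<j} (-1)^{i+j}\, \omega([X_i, X_j], X_1, \ldots, \widehat{X_i}, \ldots, \widehat{X_j}, \ldots, X_{r+1}), \tag{3.4.1}$$
where a hat indicates that the corresponding argument is omitted.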
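Does $d\omega \in \mathcal{D}_{r+1}(M)$ for $\omega \in \mathcal{D}_r(M)$? That is, we want to know whether $d\omega$ is an alternating covariant tensor of rank $r + 1$. Now $d\omega$ is clearly additive. We show that for all $f \in C^\infty(M)$
$$d\omega(X_1, \ldots, fX_k, \ldots, X_{r+1}) = f\, d\omega(X_1, \ldots, X_{r+1})$$
for all $k$. Now
$$\begin{aligned}
d\omega(X_1, \ldots, fX_k, \ldots, X_{r+1}) ={}& \sum_{i \neq k} (-1)^{i+1} X_i\big(f\, \omega(X_1, \ldots, \widehat{X_i}, \ldots, X_{r+1})\big) + (-1)^{k+1} f\, X_k\big(\omega(X_1, \ldots, \widehat{X_k}, \ldots, X_{r+1})\big) \\
&+ \sum_{i<j;\ i,j \neq k} (-1)^{i+j} f\, \omega([X_i, X_j], X_1, \ldots, \widehat{X_i}, \ldots, \widehat{X_j}, \ldots, X_{r+1}) \\
&+ \sum_{k=i<j} (-1)^{k+j}\, \omega([fX_k, X_j], X_1, \ldots, \widehat{X_k}, \ldots, \widehat{X_j}, \ldots, X_{r+1}) \\
&+ \sum_{i<j=k} (-1)^{i+k}\, \omega([X_i, fX_k], X_1, \ldots, \widehat{X_i}, \ldots, \widehat{X_k}, \ldots, X_{r+1}).
\end{aligned}$$
Using the facts that
$$X_i(fg) = (X_i f)\, g + f\, X_i(g) \quad \text{for } g \in C^\infty(M)$$
and
$$[X_i, fX_k] = f[X_i, X_k] + (X_i f) X_k,$$
it follows that the terms involving the derivatives $X_i f$ add up to zero. Finally we get
$$d\omega(X_1, \ldots, fX_k, \ldots, X_{r+1}) = f\, d\omega(X_1, \ldots, X_{r+1}).$$
That $d\omega$ is alternating is seen easily by showing $d\omega(X_1, \ldots, X_{r+1}) = 0$ if any two of its arguments coincide. Therefore $d\omega \in \mathcal{D}_{r+1}(M)$.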
Proposition 3.4.2 $d$ enjoys the following properties:
1. for all $f \in C^\infty(M)$, $df(X) = X(f)$.
2. $d$ is linear over $\mathbb{R}$, that is, $d(a\omega + b\eta) = a\,d\omega + b\,d\eta$.
3. $d(\omega \wedge \eta) = d\omega \wedge \eta + (-1)^r \omega \wedge d\eta$ for $\omega \in \mathcal{D}_r M$ and $\eta \in \mathcal{D}_s M$.
4. $d^2 = 0$, that is, $d(d\omega) = 0$ for all $\omega$.
In fact, $d$ is characterized by the above properties.
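As a quick sanity check of property 3, here is a small worked computation on $\mathbb{R}^3$. The forms $\omega = x\,dy$ and $\eta = z\,dx$ are chosen purely for illustration and do not come from the text.

```latex
% Assumption: omega = x dy and eta = z dx on R^3 (illustrative choice), so r = 1.
\[
\begin{aligned}
\omega \wedge \eta &= xz\, dy \wedge dx = -xz\, dx \wedge dy, \\
d(\omega \wedge \eta) &= -(z\,dx + x\,dz) \wedge dx \wedge dy
   = -x\, dz \wedge dx \wedge dy = -x\, dx \wedge dy \wedge dz, \\
d\omega \wedge \eta &= (dx \wedge dy) \wedge z\,dx = 0, \\
(-1)^{1}\, \omega \wedge d\eta &= -\,x\,dy \wedge (dz \wedge dx) = -x\, dx \wedge dy \wedge dz .
\end{aligned}
\]
```

Both sides agree, and the sign $(-1)^r = -1$ is exactly what makes them match.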
Before embarking on a proof of Proposition 3.4.2, we wish to point out that if $D : \mathcal{D}(M) \to \mathcal{D}(M)$ satisfies the four properties listed above, then $D$ coincides with $d_L$ locally. For,
$$D(dx_{i_1} \wedge dx_{i_2} \wedge \cdots \wedge dx_{i_r}) = D(dx_{i_1}) \wedge dx_{i_2} \wedge \cdots \wedge dx_{i_r} - dx_{i_1} \wedge D(dx_{i_2}) \wedge \cdots \wedge dx_{i_r} + \cdots$$
by property (3). Since $Df = df$ for all $f \in C^\infty(M)$,
$$D(dx_{i_k}) = D(Dx_{i_k}) = D^2(x_{i_k}) = 0$$
by property (4). Hence each of the summands above is zero and hence $D(dx_{i_1} \wedge \cdots \wedge dx_{i_r}) = 0$. Therefore
$$\begin{aligned}
D(f\,dx_{i_1} \wedge \cdots \wedge dx_{i_r}) &= Df \wedge dx_{i_1} \wedge \cdots \wedge dx_{i_r} + f\, D(dx_{i_1} \wedge \cdots \wedge dx_{i_r}) \\
&= df \wedge dx_{i_1} \wedge \cdots \wedge dx_{i_r} + f \cdot 0 \\
&= d_L(f\,dx_{i_1} \wedge \cdots \wedge dx_{i_r}).
\end{aligned}$$
Thus the above properties characterize $d_L$ locally, that is, if at all there exists $D$ on all of $\mathcal{D}(M)$ satisfying the properties, then for all $\omega \in \mathcal{D}(M)$, $D\omega|_U = d_L(\omega|_U)$ where $(U, x)$ is any coordinate neighborhood in $M$.
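Does $d_L$ satisfy the above properties (locally) on $U$? The first two properties are clear. Since $d_L$ is linear, it is enough to prove the third one when $\omega$ and $\eta$ are given as
$$\omega = a\,dx_{i_1} \wedge \cdots \wedge dx_{i_r}$$
and
$$\eta = b\,dx_{j_1} \wedge \cdots \wedge dx_{j_s}.$$
Then
$$d_L\omega = da \wedge dx_{i_1} \wedge \cdots \wedge dx_{i_r}, \qquad d_L\eta = db \wedge dx_{j_1} \wedge \cdots \wedge dx_{j_s},$$
and
$$\begin{aligned}
d_L(\omega \wedge \eta) &= d(ab) \wedge dx_{i_1} \wedge \cdots \wedge dx_{i_r} \wedge dx_{j_1} \wedge \cdots \wedge dx_{j_s} \\
&= (b\,da + a\,db) \wedge dx_{i_1} \wedge \cdots \wedge dx_{i_r} \wedge dx_{j_1} \wedge \cdots \wedge dx_{j_s} \\
&= da \wedge dx_{i_1} \wedge \cdots \wedge dx_{i_r} \wedge b\,dx_{j_1} \wedge \cdots \wedge dx_{j_s} \\
&\quad + (-1)^r\, a\,dx_{i_1} \wedge \cdots \wedge dx_{i_r} \wedge db \wedge dx_{j_1} \wedge \cdots \wedge dx_{j_s} \\
&= d_L\omega \wedge \eta + (-1)^r \omega \wedge d_L\eta.
\end{aligned}$$
This proves the third property for $d_L$. To prove the fourth one for $d_L$, observe that $(d_L)^2(f) = d_L(d_L f) = 0$ since
$$\begin{aligned}
d_L(d_L f) &= d_L\Big(\sum_j \frac{\partial f}{\partial x_j}\, dx_j\Big) = \sum_j d_L\Big(\frac{\partial f}{\partial x_j}\Big) \wedge dx_j \\
&= \sum_j \Big(\sum_i \frac{\partial}{\partial x_i}\Big(\frac{\partial f}{\partial x_j}\Big) dx_i\Big) \wedge dx_j = \sum_{i,j} \frac{\partial^2 f}{\partial x_i\, \partial x_j}\, dx_i \wedge dx_j \\
&= 0,
\end{aligned}$$
as $\frac{\partial^2 f}{\partial x_i\, \partial x_j} = \frac{\partial^2 f}{\partial x_j\, \partial x_i}$ while $dx_i \wedge dx_j = -dx_j \wedge dx_i$. Now $d_L\omega$ is a sum of terms $da \wedge dx_{i_1} \wedge \cdots \wedge dx_{i_r}$, each a product of 1-forms of type $df$, and hence, by the third property, each term in $(d_L)^2\omega$ has a factor $(d_L)^2 f = 0$ and so $(d_L)^2\omega = 0$.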
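Thus $d_L$ does satisfy the properties of Proposition 3.4.2 and any $D$ satisfying the proposition on $M$ is $d_L$ locally. What is not clear is whether such a global $D$ exists. Therefore the meaning of Proposition 3.4.2 is that the $d$ which we defined by (3.4.1) is such a global $D$.
To prove Proposition 3.4.2 it therefore suffices to show that $d$ has $d_L$ as its local expression. Since $d$ and $d_L$ are linear, it is enough to show this for $\omega = a\,dx_1 \wedge \cdots \wedge dx_r$. Let $X_{i_k} := \partial/\partial x_{i_k}$ with $i_1 < \cdots < i_{r+1}$. Since coordinate vector fields commute, the bracket terms vanish and we have
$$\begin{aligned}
d\omega(X_{i_1}, \ldots, X_{i_{r+1}}) &= \sum_k (-1)^{k+1} X_{i_k}\, \omega(X_{i_1}, \ldots, \widehat{X_{i_k}}, \ldots, X_{i_{r+1}}) + \sum_{k<l} (-1)^{k+l}\, \omega([X_{i_k}, X_{i_l}], X_{i_1}, \ldots, X_{i_{r+1}}) \\
&= \sum_k (-1)^{k+1} X_{i_k}\Big((a\,dx_1 \wedge \cdots \wedge dx_r)\Big(\frac{\partial}{\partial x_{i_1}}, \ldots, \widehat{\frac{\partial}{\partial x_{i_k}}}, \ldots, \frac{\partial}{\partial x_{i_{r+1}}}\Big)\Big) \\
&= \sum_k (-1)^{k+1} \frac{\partial}{\partial x_{i_k}}\Big(a\, \delta^1_{i_1} \cdots \delta^{k-1}_{i_{k-1}}\, \delta^k_{i_{k+1}} \cdots \delta^r_{i_{r+1}}\Big) \\
&= \sum_k (-1)^{k+1} \frac{\partial a}{\partial x_{i_k}}\, \delta^1_{i_1} \cdots \delta^{k-1}_{i_{k-1}}\, \delta^k_{i_{k+1}} \cdots \delta^r_{i_{r+1}}.
\end{aligned}$$
We now compute $d_L\omega$:
$$\begin{aligned}
d_L\omega(X_{i_1}, \ldots, X_{i_{r+1}}) &= (da \wedge dx_1 \wedge \cdots \wedge dx_r)(X_{i_1}, \ldots, X_{i_{r+1}}) \\
&= \sum_k (-1)^{k+1}\, da(X_{i_k})\, (dx_1 \wedge \cdots \wedge dx_r)(X_{i_1}, \ldots, \widehat{X_{i_k}}, \ldots, X_{i_{r+1}}) \\
&= \sum_k (-1)^{k+1} \frac{\partial a}{\partial x_{i_k}}\, \delta^1_{i_1} \cdots \delta^{k-1}_{i_{k-1}}\, \delta^k_{i_{k+1}} \cdots \delta^r_{i_{r+1}}.
\end{aligned}$$
Therefore $d_L\omega = d\omega$. Thus $d$ is a globally defined operator satisfying Proposition 3.4.2.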
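There is another very important property of $d$:
Lemma 3.4.3 If $\varphi : M \to N$ is smooth and $\omega \in \mathcal{D}(N)$, then
$$d(\varphi^*\omega) = \varphi^*(d\omega).$$
In other words, the exterior derivative $d$ commutes with pull-backs.
Proof Notice that we have already seen a very special case of this: if $g : N \to \mathbb{R}$ is smooth, then $dg \in \mathcal{D}(N)$ and $\varphi^*(dg) = d(g \circ \varphi)$. For, if $X \in \mathcal{D}^1(M)$ and $p \in M$, then
$$\varphi^*(dg)(X)(p) = dg(\varphi_* X_p) = (\varphi_* X_p)(g) = X_p(g \circ \varphi) = d(g \circ \varphi)(X)(p).$$
It follows therefore that if $\varphi : (U, x) \to (V, y)$ is smooth and if $\varphi = (\varphi_1, \ldots, \varphi_n)$, then $\varphi^*(dy_i) = d\varphi_i$, since $\varphi_i(x) = y_i(\varphi(x))$, where $y_i : V \to \mathbb{R}$ is the $i$-th coordinate function.
Now the proof of the fact that $d(\varphi^*\omega) = \varphi^*(d\omega)$ is clear. We may prove it locally and also assume $\omega = a\,dy_1 \wedge \cdots \wedge dy_r$. Then
$$d\omega = da \wedge dy_1 \wedge \cdots \wedge dy_r$$
and
$$\varphi^*(d\omega) = \varphi^* da \wedge \varphi^* dy_1 \wedge \cdots \wedge \varphi^* dy_r = d(a \circ \varphi) \wedge d\varphi_1 \wedge d\varphi_2 \wedge \cdots \wedge d\varphi_r. \tag{3.4.2}$$
We also have $\varphi^*\omega = (a \circ \varphi)\, d\varphi_1 \wedge \cdots \wedge d\varphi_r$. Hence, by properties (3) and (4) of Proposition 3.4.2,
$$d(\varphi^*\omega) = d(a \circ \varphi) \wedge d\varphi_1 \wedge \cdots \wedge d\varphi_r. \tag{3.4.3}$$
Equations (3.4.2) and (3.4.3) imply that $d(\varphi^*\omega) = \varphi^* d\omega$.
□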
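To see the lemma at work, the following worked computation checks $d(\varphi^*\omega) = \varphi^*(d\omega)$ in one concrete case. The polar coordinate map $\varphi(r, \theta) = (r\cos\theta, r\sin\theta)$ and the form $\omega = x\,dy$ are our own choices, made only for illustration.

```latex
% Assumption: phi(r, theta) = (r cos(theta), r sin(theta)), omega = x dy (illustrative choice).
\[
\begin{aligned}
\varphi^*\omega &= r\cos\theta\; d(r\sin\theta)
   = r\cos\theta\sin\theta\, dr + r^2\cos^2\theta\, d\theta, \\
d(\varphi^*\omega) &= r(\cos^2\theta - \sin^2\theta)\, d\theta \wedge dr
   + 2r\cos^2\theta\, dr \wedge d\theta
   = r\, dr \wedge d\theta, \\
\varphi^*(d\omega) &= d(r\cos\theta) \wedge d(r\sin\theta)
   = r(\cos^2\theta + \sin^2\theta)\, dr \wedge d\theta = r\, dr \wedge d\theta .
\end{aligned}
\]
```

The two results coincide, as the lemma predicts.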
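We now wish to exhibit the close relation between the exterior derivative of forms on $\mathbb{R}^3$ and the operators gradient, curl and divergence which one comes across in vector analysis.
Let $f : \mathbb{R}^3 \to \mathbb{R}$ be smooth. Then one identifies the 1-form $df = \sum_i \frac{\partial f}{\partial x_i}\, dx_i$ with the vector $\big(\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \frac{\partial f}{\partial x_3}\big)$. Thus $d$ on 0-forms (scalar fields, in classical language) is the gradient, yielding a vector field.
Let $\omega = f_1\,dx_1 + f_2\,dx_2 + f_3\,dx_3$ be a 1-form. Again, using the inner product, we identify the tangent space to $\mathbb{R}^3$ at any point with its dual, namely, the cotangent space. Thus we think of the above 1-form as the vector field $F = (f_1, f_2, f_3)$. Now
$$\begin{aligned}
d\omega &= df_1 \wedge dx_1 + df_2 \wedge dx_2 + df_3 \wedge dx_3 \\
&= \Big(\sum_i \frac{\partial f_1}{\partial x_i}\, dx_i\Big) \wedge dx_1 + \cdots \\
&= \Big(\frac{\partial f_3}{\partial x_2} - \frac{\partial f_2}{\partial x_3}\Big) dx_2 \wedge dx_3 + \Big(\frac{\partial f_1}{\partial x_3} - \frac{\partial f_3}{\partial x_1}\Big) dx_3 \wedge dx_1 + \Big(\frac{\partial f_2}{\partial x_1} - \frac{\partial f_1}{\partial x_2}\Big) dx_1 \wedge dx_2.
\end{aligned}$$
Thus we see that $d$ on 1-forms can be identified with the curl ($\nabla \times$) of the vector field $F$.
Let $\varphi = g_1\, dx_2 \wedge dx_3 + g_2\, dx_3 \wedge dx_1 + g_3\, dx_1 \wedge dx_2$ be a 2-form on $\mathbb{R}^3$. Then proceeding as above we easily see that
$$d\varphi = \Big(\frac{\partial g_1}{\partial x_1} + \frac{\partial g_2}{\partial x_2} + \frac{\partial g_3}{\partial x_3}\Big) dx_1 \wedge dx_2 \wedge dx_3.$$
Thus on 2-forms $d$ is the divergence operator if we use the identification which we introduced in Section 3.2, namely, $e_1 \wedge e_2 \leftrightarrow e_3$, etc.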
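The dictionary between $d$ and the classical operators can be tested on simple fields. The vector fields $F = (-x_2, x_1, 0)$ and $G = (x_1, x_2, x_3)$ below are our own illustrative choices.

```latex
% Assumption: F = (-x_2, x_1, 0) and G = (x_1, x_2, x_3) on R^3 (illustrative choices).
\[
\begin{aligned}
\omega &= -x_2\,dx_1 + x_1\,dx_2, &
d\omega &= -dx_2 \wedge dx_1 + dx_1 \wedge dx_2 = 2\,dx_1 \wedge dx_2
  &&\longleftrightarrow\ \operatorname{curl} F = (0, 0, 2), \\
\varphi &= x_1\,dx_2 \wedge dx_3 + x_2\,dx_3 \wedge dx_1 + x_3\,dx_1 \wedge dx_2, &
d\varphi &= 3\,dx_1 \wedge dx_2 \wedge dx_3
  &&\longleftrightarrow\ \operatorname{div} G = 3 .
\end{aligned}
\]
```

In the first line the coefficient of $dx_1 \wedge dx_2$ is the third component of the curl, in agreement with the identification $e_1 \wedge e_2 \leftrightarrow e_3$.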
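We now wish to give the dual formulation of the Frobenius theorem in terms of forms. Before doing this, we wish to point out a very special case of the global definition of the exterior derivative $d$, which is quite often used in differential geometry and which is often a source of confusion for a beginner. If $\omega \in \mathcal{D}_1 M$, then we have
$$d\omega(X, Y) = X(\omega(Y)) - Y(\omega(X)) - \omega([X, Y]). \tag{3.4.4}$$
In some books there is a factor $(1/2)$ on the right side of the above equation. The reason for this anomaly is the way the exterior multiplication (wedge) is defined. In the definition adopted by these books the wedge of an $r$-form $\omega$ and an $s$-form $\eta$ is
$$\omega \wedge \eta := \mathrm{Alt}(\omega \otimes \eta),$$
without the factor $\frac{(r+s)!}{r!\,s!}$ used here.
Let us verify the above equation for $d\omega$ from the basics so that we can perceive how the definition of wedge product affects the exterior derivative. It is enough to check it on 1-forms of the type $\omega = f\,dx_1$. By the local expression for $d$ we have $d\omega = df \wedge dx_1$. Now if $\partial_k := \frac{\partial}{\partial x_k}$, etc., then we have
$$\begin{aligned}
(df \wedge dx_1)(\partial_k, \partial_l) &= \frac{(1+1)!}{1!\,1!}\, \mathrm{Alt}(df \otimes dx_1)(\partial_k, \partial_l) \\
&= \frac{(1+1)!}{1!\,1!}\, \frac{1}{2!}\, \{(df \otimes dx_1)(\partial_k, \partial_l) - (df \otimes dx_1)(\partial_l, \partial_k)\} \\
&= df(\partial_k)\, dx_1(\partial_l) - df(\partial_l)\, dx_1(\partial_k) \\
&= \partial_k(f)\, \delta_{1l} - \partial_l(f)\, \delta_{1k}.
\end{aligned}$$
On the other hand using the global definition we see that
$$\begin{aligned}
d\omega(\partial_k, \partial_l) &= \partial_k(\omega(\partial_l)) - \partial_l(\omega(\partial_k)) - \omega([\partial_k, \partial_l]) \\
&= \partial_k(f\,dx_1(\partial_l)) - \partial_l(f\,dx_1(\partial_k)) \\
&= \delta_{1l}\, \partial_k(f) - \delta_{1k}\, \partial_l(f).
\end{aligned}$$
Thus we have verified the equation. Similarly, in the expression for the global definition of $d$ in the other convention one will see a factor of $1/(k+1)$ if the form is of degree $k$.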
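Formula (3.4.4) can also be checked on vector fields that do not commute, so that the bracket term really contributes. The data below ($\omega = x\,dy$, $X = \partial_x$, $Y = x\,\partial_y$ on $\mathbb{R}^2$) are our own illustrative choices.

```latex
% Assumption: M = R^2, omega = x dy, X = d/dx, Y = x d/dy (illustrative choice).
\[
\begin{aligned}
[X, Y] &= \partial_y, \qquad \omega(X) = 0, \qquad \omega(Y) = x^2, \\
X(\omega(Y)) - Y(\omega(X)) - \omega([X, Y]) &= 2x - 0 - x = x, \\
d\omega(X, Y) = (dx \wedge dy)(X, Y) &= dx(X)\,dy(Y) - dx(Y)\,dy(X) = x .
\end{aligned}
\]
```

Dropping the term $\omega([X, Y])$ would give $2x$ instead of $x$, which shows why the bracket is needed for the right side to be tensorial.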
Now we attend to the dual formulation of Frobenius. If $\mathcal{D}$ is a differential system of rank $r$, then we can describe $\mathcal{D}$ also by $\mathcal{D}^*$ where
$$\mathcal{D}^*_p := \{u \in T^*_p(M) : u(v) = 0 \text{ for all } v \in \mathcal{D}_p\}.$$
Notice that the dimension of $\mathcal{D}^*_p$ is $m - r$. Given $p \in M$ we can find a coordinate neighborhood $(U, x)$ of $p$ and $m - r$ smooth 1-forms $\omega_i$ on $U$ such that the $\omega_i$ span $\mathcal{D}^*_q$ for $q \in U$. It is clear that $\mathcal{D}$ is determined as soon as we know $\mathcal{D}^*$.
Proposition 3.4.4 $\mathcal{D}$ is integrable if and only if for all $p \in M$ there is a coordinate neighborhood $(U, x)$ of $p$, 1-forms $\omega_i$ as above and 1-forms $\omega_{ij}$ on $U$, for $1 \le i, j \le m - r$, such that we have
$$d\omega_i = \sum_{j=1}^{m-r} \omega_{ij} \wedge \omega_j.$$
Proof This is an easy exercise using the vector field version of the Frobenius theorem and the above formula for $d\omega$.
□
Exercise 3.4.5 Compute $f^*(\omega)$ where $f$ and $\omega$ are given as follows:
(i) $f : \mathbb{R}^3 \to \mathbb{R}^2$ where $f(x, y, z) = (xy - yz, xyz)$ and $\omega = x\,dy \wedge dz + y\,dz \wedge dx + z\,dx \wedge dy$.
(ii) $f : \mathbb{R}^2 \to \mathbb{R}^3$ with $f(x, y) = (x, y, xy)$ and $\omega = dx \wedge dy \wedge dz$.
(iii) Compute $d(f^*\omega)$ where $f : \mathbb{R} \to \mathbb{R}^2$ with $f(x) = (x, -x)$ and $\omega = dx + dy$. (Compare your work with your friends!)
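We saw above that if $\varphi = d\omega$ then $d\varphi = 0$. We may ask whether the converse is true. This question is intimately connected with topology. Before explaining what we mean by this let us show that locally the converse is true. This follows from the following
Lemma 3.4.6 (Poincaré) Let $U$ be a star-shaped region with respect to a point $p \in U$ in $\mathbb{R}^n$. Then there exist maps $H_k : \mathcal{D}_k(U) \to \mathcal{D}_{k-1}(U)$ (with $H_0 := 0$) such that, for $\varphi \in \mathcal{D}_k(U)$,
$$H_{k+1}(d\varphi) + d(H_k\varphi) = \begin{cases} -\varphi, & \text{if } k > 0; \\ \varphi(p) - \varphi, & \text{if } k = 0. \end{cases}$$
Proof The idea is to integrate; "but how?" is the question. Those readers who have learnt some algebraic topology will recognize that the $H_k$ are homotopies. When $k = 0$, the lemma is nothing but the fundamental theorem of calculus.
We define $F : [0, 1] \times U \to U$ by $F(t, x) := tp + (1 - t)x$. We pull back $\varphi$, a $k$-form on $U$, via $F$ to get a form on $I \times U$, where $I = [0, 1]$, and use the differential $dt$ to integrate. We can write $F^*\varphi = dt \wedge \alpha + \beta$, where $\alpha$ and $\beta$ are forms not having a $dt$-term. We define, in an obvious notation,
$$H_k(\varphi) := \int_{t=0}^{1} \alpha\, dt.$$
We now compute
$$F^*(d\varphi) = d(F^*\varphi) = d(dt \wedge \alpha + \beta) = -dt \wedge d\alpha + dt \wedge \frac{\partial \beta}{\partial t} + \eta,$$
where $\eta$ does not involve $dt$. Therefore we have
$$H_{k+1}(d\varphi) = \int_{t=0}^{1} \Big(\frac{\partial \beta}{\partial t} - d\alpha\Big) dt = \beta|_{t=1} - \beta|_{t=0} - d(H_k\varphi). \tag{3.4.5}$$
When $k > 0$ we have $\beta|_{(0,U)} = (\mathrm{Id})^*\varphi = \varphi$. Since $F(1, x) = p$ for all $x \in U$, we have $\beta|_{(1,U)} = 0$. Hence we get
$$H_{k+1}(d\varphi) + d(H_k\varphi) = -\varphi.$$
When $k = 0$, we have $\alpha = 0$ and $\beta = \varphi \circ F$, so that (3.4.5) reads $H_1(d\varphi) = \varphi(p) - \varphi$.
□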
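A minimal worked instance of this construction may help. It assumes that $H_k\varphi$ is obtained by integrating the coefficient $\alpha$ of $dt$ in $F^*\varphi$ over $[0, 1]$ (our reading of the definition), and uses $U = \mathbb{R}^2$, $p = 0$ and the closed form $\varphi = y\,dx + x\,dy$, all chosen for illustration.

```latex
% Assumptions: U = R^2, p = 0, F(t, x, y) = ((1-t)x, (1-t)y),
% and H_1(phi) := \int_0^1 alpha dt where F^* phi = dt ^ alpha + beta.
\[
\begin{aligned}
F^*\varphi &= (1-t)y\,\big((1-t)\,dx - x\,dt\big) + (1-t)x\,\big((1-t)\,dy - y\,dt\big) \\
           &= dt \wedge \big(-2(1-t)\,xy\big) + (1-t)^2\,(y\,dx + x\,dy), \\
H_1\varphi &= \int_0^1 -2(1-t)\,xy\; dt = -xy, \qquad
d(-H_1\varphi) = y\,dx + x\,dy = \varphi .
\end{aligned}
\]
```

Under these assumptions $-H_1\varphi = xy$ is a primitive of $\varphi$, which is the content of the lemma for a closed 1-form.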
We remark that Lemma 3.4.6 shows the following: on any manifold $M$, if we are given a $k$-form $\varphi$ with $d\varphi = 0$ and a point $p$, then we can find a neighborhood $U$ of this point and a $(k-1)$-form $\omega_p$ such that we have $d\omega_p = \varphi$ on $U$. However it may not be possible to find a form $\omega$ on the manifold $M$ so that we have $d\omega = \varphi$. An example can be found in Section 4.2. Thus to solve $d\omega = \varphi$ locally, a necessary (and, by the lemma, sufficient) condition is that we have $d\varphi = 0$. However to solve this equation globally there are topological obstructions. To explain this we need to introduce the de Rham cohomology.
Definition 3.4.7 We say a form $\omega$ of degree $k$ is closed if $d\omega = 0$ and exact if there exists a $(k-1)$-form $\varphi$ such that $d\varphi = \omega$. An exact form is obviously closed. We denote the set of all closed $k$-forms by $F^k(M)$ and that of exact forms by $B^k(M)$. These are vector spaces and $B^k(M)$ is a vector subspace of $F^k(M)$. For $0 \le k \le m$, we define the $k$-th de Rham cohomology group $H^k(M)$ of $M$ as the quotient
$$H^k(M) := F^k(M) / B^k(M).$$
It can then be proved that these objects are topological invariants of the underlying space. More precisely, the de Rham Theorem says that these groups are dual to the singular homology groups (with real coefficients) of the space. This theorem is much more precise than what we have stated. The more precise version of the theorem and its proof can be found in any of the following books: [22], [23] and [4].
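A standard example of a closed form that is not exact is sketched below. It is the usual "angle form" on the punctured plane; we include it only as an illustration, and it is not necessarily the example given later in the text.

```latex
% Assumption: M = R^2 \ {0} and r^2 = x^2 + y^2 (example chosen for illustration).
\[
\omega = \frac{-y\,dx + x\,dy}{x^2 + y^2}, \qquad
d\omega = \Big(\frac{\partial}{\partial x}\frac{x}{r^2} + \frac{\partial}{\partial y}\frac{y}{r^2}\Big)\, dx \wedge dy
        = \Big(\frac{y^2 - x^2}{r^4} + \frac{x^2 - y^2}{r^4}\Big)\, dx \wedge dy = 0 .
\]
% Pull back to the unit circle c(t) = (cos t, sin t), 0 <= t <= 2 pi:
\[
c^*\omega = (\sin^2 t + \cos^2 t)\, dt = dt, \qquad \int_0^{2\pi} c^*\omega = 2\pi \neq 0 .
\]
```

If $\omega$ were $df$ for a smooth function $f$ on $M$, then $c^*\omega = d(f \circ c)$ would integrate to $f(c(2\pi)) - f(c(0)) = 0$. Hence $\omega$ is closed but not exact, so $H^1$ of the punctured plane is nonzero.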
3.5 Lie derivatives

We first define the Lie derivative of a tensor by a vector field X on a
manifold.
Definition 3.5.1 Let $X \in \mathcal{D}^1(M)$, with the corresponding flow $\varphi_t$. Let $p \in M$ and consider a basis $\{e_i\}$ of $T_pM$. Since $\varphi_s$ is a diffeomorphism around $p$, $\{\varphi_{s*}(e_i)\}$ is a basis of $T_{\varphi_s(p)}(M)$. Thus $E_i : s \mapsto \varphi_{s*}(e_i)$ can be thought of as curves lying above the integral curve of $X$ through the point $p$. The map $s \mapsto E(s) = (E_1(s), \ldots, E_m(s))$ can be considered as a curve in the frame bundle $F$ of $M$. (This is the union $\bigcup_p F_p$ of the sets $F_p$ of frames, that is, bases of $T_pM$, as $p$ varies over $M$. Each $F_p$ is diffeomorphic to $GL(m, \mathbb{R})$ and we can make $F$ into a manifold of dimension $m + m^2$. See Appendix A for related notions.)
If $T$ is a tensor field around $p$, then its components with respect to the basis $e_i(s) := E_i(s)$ are smooth functions, say, $t^\alpha$ with $1 \le \alpha \le m^k$, where we have indexed the components in a definite way. The derivatives $Z^\alpha := dt^\alpha/ds$ are the components of a tensor $Z$ along the integral curve $\gamma_p$ of $X$. This tensor is well defined in the sense that it is independent of the choice of the initial basis $\{e_i\}$. For, if $\{v_i\}$ is another basis of $T_pM$, then we have $v_i = \sum_j a_i^j e_j$ for a nonsingular matrix $(a_i^j)$. We obviously have $v_i(s) = \sum_j a_i^j e_j(s)$ since $\varphi_{s*}$ is linear. Thus the components of $T$ with respect to $e$ and $v$ are related by functions which are constant in $s$ along $\gamma_p$, and hence the derivatives of these components are related in the same way. Thus we do get a genuine tensor $Z$, called the Lie derivative of $T$ with respect to $X$.
Of course, we can put everything in an invariant manner. This we do presently. We now want to show the geometric meaning of the Lie bracket of two vector fields $X$ and $Y$. Let $\varphi_t$ be the flow of $X$ as above. We start with the vector $Y_{\varphi_t(p)}$ at $\varphi_t(p)$. We pull it back via the linear isomorphism $(\varphi_{-t})_*$ to get the vector $(\varphi_{-t})_*(Y_{\varphi_t(p)})$ at $p$. We form the difference quotient and take its limit as $t$ tends to 0. We claim that the limit exists and equals $[X, Y](p)$. We state this as a theorem and offer a variety of proofs for the reader to pick his favourite.
Theorem 3.5.2 Let $X$ and $Y$ be vector fields on a manifold $M$ and let $\varphi_t$ be the flow of $X$. Let $p \in M$ and $t$ be in the domain of the flow of $X$ through $p$. Then we have
$$\lim_{t \to 0} \frac{(\varphi_{-t})_*(Y_{\varphi_t(p)}) - Y_p}{t} = [X, Y](p).$$
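[Figure 3.5.1 Geometric meaning of Lie derivative: the integral curves of $X$, $Y$, $-X$ and $-Y$ through $p$.]
Proof (1) If $X_p$ is not 0, then we choose a coordinate neighborhood $U$ of $p$ so that $X = \frac{\partial}{\partial x_1}$ on this chart. Then for $q \in U$ and small $t$, we have, in these coordinates,
$$\varphi_t(q) = (x_1(q) + t, x_2(q), \ldots, x_m(q))$$
and we have
$$(\varphi_{-t})_*\Big(\frac{\partial}{\partial x_i}\Big|_{\varphi_t(q)}\Big) = \frac{\partial}{\partial x_i}\Big|_q.$$
Hence, writing $Y = \sum_i f_i \frac{\partial}{\partial x_i}$ and $f_i(t) := f_i(\varphi_t(p))$, we have, using an obvious notation,
$$\lim_{t \to 0} \frac{(\varphi_{-t})_*(Y_{\varphi_t(p)}) - Y_p}{t} = \lim_{t \to 0} \frac{1}{t}\big((f_1(t), \ldots, f_m(t)) - (f_1(0), \ldots, f_m(0))\big) = \Big(\frac{\partial f_1}{\partial x_1}, \ldots, \frac{\partial f_m}{\partial x_1}\Big)(p),$$
and
$$[X, Y](p) = \Big[\frac{\partial}{\partial x_1}, \sum_i f_i \frac{\partial}{\partial x_i}\Big](p) = \sum_i \frac{\partial f_i}{\partial x_1}(p)\, \frac{\partial}{\partial x_i}\Big|_p.$$
Thus, in this case the result is proved. The case when $X_p = 0$ is handled by a simple limiting argument.
Let now $X = 0$ in a neighborhood of $p$. Then $[X, Y](p) = 0$. In this case the integral curves starting at points near $p$ are constant so that $\varphi_t$ is the identity near $p$ for all $t$. Hence the numerator in the difference quotient is zero and the result follows in this case too.
If $X_p = 0$ but $X$ does not vanish identically near $p$, there exists a sequence $\{p_n\}$ of points such that $p_n \to p$ and such that $X_{p_n} \neq 0$. The result follows in this case by continuity of $X$ and the first case.
□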
Proof (2) The reader should compare this proof with that of the Basis theorem (for the tangent space). We set $F(t, p) := f(\varphi_t(p))$ for an arbitrary smooth function $f$ on $M$. We then have the first order Taylor expansion
$$F(t, p) - F(0, p) = t \int_0^1 \frac{\partial F}{\partial t}(st, p)\, ds = t \cdot h(t, p), \tag{3.5.1}$$
where $h \in C^\infty(\mathbb{R} \times M)$ with $h(0, p) = Xf(p)$. We then have
$$((\varphi_t)_* Y)_p f = (Y(f \circ \varphi_t))(\varphi_t^{-1}(p)) = Yf(\varphi_t^{-1}(p)) + t\,(Yh)(t, \varphi_t^{-1}(p)).$$
Hence
$$\lim_{t \to 0} \frac{(Y - (\varphi_t)_* Y)_p f}{t} = (XYf)(p) - (YXf)(p).$$
Replacing $t$ by $-t$, this is the assertion of the theorem.
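Proof (3) It is enough to show that for any smooth function $f$ on $M$, if we set $h(t) := df((\varphi_{-t})_* Y_{\varphi_t(p)})$ then $h'(0) = [X, Y]_p(f)$. We unwind the expression for $h$ using definitions to see
$$\begin{aligned}
h(t) &= df((\varphi_{-t})_* Y_{\varphi_t(p)}) \\
&= ((\varphi_{-t})_*(Y_{\varphi_t(p)}))(f) \\
&= Y_{\varphi_t(p)}(f \circ \varphi_{-t}).
\end{aligned}$$
We define $F(s, t) := Y_{\varphi_s(p)}(f \circ \varphi_{-t})$, so that $h(t) = F(t, t)$. Then
$$h'(0) = \frac{\partial F}{\partial s}(0, 0) + \frac{\partial F}{\partial t}(0, 0).$$
We have
$$\frac{\partial F}{\partial s}(0, 0) = \frac{d}{ds}\Big|_{s=0} Y_{\varphi_s(p)}(f) = \frac{d}{ds}\Big|_{s=0} (Yf)(\varphi_s(p)).$$
This last quantity is $X_p(Yf)$. We now compute
$$\frac{\partial F}{\partial t}(0, 0) = \frac{d}{dt}\Big|_{t=0} \big(Y_p(f \circ \varphi_{-t})\big) = Y_p(-Xf) = -Y_p(Xf). \tag{3.5.2}$$
In Equation (3.5.2) above, we used the fact that $\frac{d}{dt}\big|_{t=0}$ and $Y_p$ are tangent vectors to the factor spaces of $\mathbb{R} \times M$ and hence they commute. Therefore $h'(0) = X_p(Yf) - Y_p(Xf) = [X, Y]_p(f)$.
□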
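The limit in Theorem 3.5.2 can be computed by hand in a simple case. The vector fields $X = \partial/\partial x_1$ and $Y = x_1\, \partial/\partial x_2$ on $\mathbb{R}^2$ are our own illustrative choices.

```latex
% Assumption: M = R^2, X = d/dx_1, Y = x_1 d/dx_2, p = (x_1, x_2) (illustrative choice).
\[
\varphi_t(x_1, x_2) = (x_1 + t,\, x_2), \qquad
(\varphi_{-t})_*\big(Y_{\varphi_t(p)}\big) = (x_1 + t)\, \frac{\partial}{\partial x_2}\Big|_p ,
\]
\[
\lim_{t \to 0} \frac{(x_1 + t) - x_1}{t}\, \frac{\partial}{\partial x_2}\Big|_p
 = \frac{\partial}{\partial x_2}\Big|_p
 = \Big[\frac{\partial}{\partial x_1},\, x_1 \frac{\partial}{\partial x_2}\Big](p) .
\]
```

Here the flow of $X$ is a translation, so $(\varphi_{-t})_*$ leaves the components of a vector unchanged and only the base point at which $Y$ is evaluated moves.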
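We now wish to define the Lie derivative in a more systematic way that will be of practical use.
Theorem 3.5.3 The map $\mathcal{L}_X : \mathfrak{X}(M) \to \mathfrak{X}(M)$ defined by $\mathcal{L}_X(Y) := [X, Y]$ can be extended uniquely to a derivation of the tensor fields on $M$ with the following properties:
1. $\mathcal{L}_X(f) = X(f)$ for $f \in C^\infty(M)$.
2. $\mathcal{L}_X$ is a derivation of $\mathcal{D}(M)$ preserving the types of the tensors.
3. $\mathcal{L}_X$ commutes with contractions.
Proof First of all let us explain what we mean by a derivation. A linear map $A : \mathcal{D} \to \mathcal{D}$ is said to be a derivation if
$$A(\varphi \otimes \psi) = A(\varphi) \otimes \psi + \varphi \otimes A(\psi) \quad \text{for all } \varphi, \psi \in \mathcal{D}.$$
Assume that one such $\mathcal{L}_X$ exists. Let $C : \mathcal{D}^1_1 \to C^\infty(M)$ be the contraction. Then property (2) implies that, for a 1-form $\omega$ and a vector field $Y$,
$$\mathcal{L}_X(\omega \otimes Y) = \mathcal{L}_X\omega \otimes Y + \omega \otimes \mathcal{L}_X Y.$$
Now we apply contraction and (3) to get
$$\mathcal{L}_X \circ C(\omega \otimes Y) = C(\mathcal{L}_X\omega \otimes Y + \omega \otimes \mathcal{L}_X Y). \tag{3.5.3}$$
The left side of Equation (3.5.3) is $X(\omega(Y))$ whereas the right side is $(\mathcal{L}_X\omega)(Y) + \omega([X, Y])$. Hence it follows that we must define $\mathcal{L}_X\omega$ for any 1-form $\omega$ as
$$\mathcal{L}_X\omega(Y) := X(\omega(Y)) - \omega([X, Y]).$$
One then easily verifies that $\mathcal{L}_X\omega$ is a 1-form on $M$. Thus $\mathcal{L}_X$ is defined on $\mathcal{D}^1$, $\mathcal{D}_1$ and $C^\infty(M)$, which generate $\mathcal{D}$ over $C^\infty(M)$, and now we extend it to any tensor field by the derivation rule.
□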
The Lie derivative $\mathcal{L}_X$ is quite a useful gadget whenever we want to test the behaviour of some geometric object under the one-parameter group of diffeomorphisms generated by $X$. Most often it is used in the context of covariant tensors. Instead of the axiomatic characterization of the Lie derivative we shall explain in detail what it is in this special case. Let $S$ be a covariant tensor of rank $k$ on $M$. For any smooth map $\varphi : M \to M$ we have introduced the concept of the pull-back of $S$ under $\varphi$:
$$(\varphi^* S)_p(v_1, \ldots, v_k) := S_{\varphi(p)}(\varphi_*(v_1), \ldots, \varphi_*(v_k)).$$
Now if $X \in \mathfrak{X}(M)$ with the associated flow $\{\varphi_t\}$, then for any $k$ vector fields $X_i$ we have a smooth function on $\mathbb{R} \times M$ given by $(\varphi_t^* S)(X_1, \ldots, X_k)$. We then claim that
$$(\mathcal{L}_X S)(X_1, \ldots, X_k) = \lim_{t \to 0} \frac{1}{t}\big(\varphi_t^* S(X_1, \ldots, X_k) - S(X_1, \ldots, X_k)\big).$$
To prove this, it is enough to see what happens to a 1-form. Since the problem is local we may assume that the form is of the type $S = df$. Let $v := Y_p \in T_pM$. We set
$$h(t) := \varphi_t^*(df_{\varphi_t(p)})(v).$$
Then we must show that $h'(0) = \mathcal{L}_X(df)(Y)(p)$. We unwind the expression for $h$:
$$\begin{aligned}
h(t) &= df_{\varphi_t(p)}((\varphi_t)_* v) \\
&= ((\varphi_t)_* v)(f) \\
&= v(f \circ \varphi_t).
\end{aligned}$$
Hence we have
$$h'(0) = \frac{d}{dt}\Big|_{t=0} v(f \circ \varphi_t) = \Big(v \circ \frac{d}{dt}\Big|_{t=0}\Big)(f \circ \varphi_t) = v(X(f)) = Y_p(Xf).$$
In the above computation, we used the observation that $\frac{d}{dt}\big|_{t=0}$ and $v$ are tangent vectors on factor spaces and hence their order can be interchanged. (Compare the third proof of Theorem 3.5.2.) Since
$$Y_p(Xf) = X_p(Yf) - [X, Y]_p(f),$$
we can rewrite the last of the equations above as
$$\begin{aligned}
h'(0) &= -[X, Y]_p(f) + X_p(Yf) \\
&= -df([X, Y])(p) + X(Yf)(p) \\
&= -df([X, Y])|_p + X(df(Y))|_p.
\end{aligned}$$
Dropping $p$ we get
$$\mathcal{L}_X(df)(Y) = X(df(Y)) - df([X, Y]).$$
That is, for any 1-form $\omega$ we have
$$\mathcal{L}_X(\omega)(Y) = X(\omega(Y)) - \omega([X, Y]).$$
This completes the proof of the claim.
□
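The formula $\mathcal{L}_X\omega(Y) = X(\omega(Y)) - \omega([X, Y])$ is easy to apply in coordinates. In the sketch below the Euler field $X = x\,\partial_x + y\,\partial_y$ and the form $\omega = x\,dy$ on $\mathbb{R}^2$ are our own illustrative choices.

```latex
% Assumption: M = R^2, X = x d/dx + y d/dy, omega = x dy (illustrative choice).
\[
\begin{aligned}
[X, \partial_x] &= -\partial_x, \qquad [X, \partial_y] = -\partial_y, \\
\mathcal{L}_X\omega(\partial_x) &= X(\omega(\partial_x)) - \omega([X, \partial_x]) = X(0) + \omega(\partial_x) = 0, \\
\mathcal{L}_X\omega(\partial_y) &= X(\omega(\partial_y)) - \omega([X, \partial_y]) = X(x) + \omega(\partial_y) = x + x = 2x,
\end{aligned}
\]
\[
\mathcal{L}_X\omega = 2x\,dy .
\]
```

The factor 2 reflects the fact that $x\,dy$ is homogeneous of degree 2 under the dilations generated by $X$.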
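Proposition 3.5.4 For a vector field $X$ and any covariant tensor $\varphi$ of rank $k$ we have
$$(\mathcal{L}_X\varphi)(Y_1, \ldots, Y_k) = X(\varphi(Y)) - \sum_{i=1}^{k} \varphi(Y_1, \ldots, [X, Y_i], \ldots, Y_k),$$
where we have used $Y$ for $(Y_1, \ldots, Y_k)$.
Proof We have already proved the result when $k = 1$. We prove the result by induction on $k$. We assume the result for $k$. We can assume that the covariant tensor $\psi$ of rank $k + 1$ is of the form $\varphi \otimes \omega$, where $\varphi$ has rank $k$ and $\omega$ is a 1-form. Then
$$\mathcal{L}_X\psi = (\mathcal{L}_X\varphi) \otimes \omega + \varphi \otimes (\mathcal{L}_X\omega)$$
and hence
$$\begin{aligned}
(\mathcal{L}_X\psi)(X_1, \ldots, X_{k+1}) &= \big((\mathcal{L}_X\varphi) \otimes \omega + \varphi \otimes (\mathcal{L}_X\omega)\big)(X_1, \ldots, X_k, X_{k+1}) \\
&= (\mathcal{L}_X\varphi)(X_1, \ldots, X_k)\, \omega(X_{k+1}) + \varphi(X_1, \ldots, X_k)\, \mathcal{L}_X\omega(X_{k+1}).
\end{aligned}$$
Now we use the induction hypothesis to see that the right side of the above equation is
$$\begin{aligned}
& X(\varphi(X_1, \ldots, X_k))\, \omega(X_{k+1}) - \sum_{i=1}^{k} \varphi(X_1, \ldots, [X, X_i], \ldots, X_k)\, \omega(X_{k+1}) \\
&\quad + \varphi(X_1, \ldots, X_k)\, X(\omega(X_{k+1})) - \varphi(X_1, \ldots, X_k)\, \omega([X, X_{k+1}]).
\end{aligned}$$
We group the terms as follows: (first term + third term) + (second term + fourth term) to get
$$X(\psi(X_1, \ldots, X_{k+1})) - \sum_{i=1}^{k+1} \psi(X_1, \ldots, [X, X_i], \ldots, X_{k+1}).$$
Thus, by induction, Proposition 3.5.4 is proved.
□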
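We end this section with a derivation of Cartan's formula for the Lie derivative. We need the concept of interior multiplication. Let $\varphi$ be a $k$-form on a manifold $M$. Let $X$ be a vector field on $M$. We define a $(k-1)$-form $i_X\varphi$ by setting
$$(i_X\varphi)(X_1, \ldots, X_{k-1}) := \begin{cases} \varphi(X, X_1, \ldots, X_{k-1}), & \text{if } k \ge 1 \\ 0, & \text{if } k = 0. \end{cases}$$
Cartan's formula relates $d$, $\mathcal{L}_X$ and $i_X$.
Theorem 3.5.5 (Cartan's Formula) For $\varphi \in \mathcal{D}_k(M)$, we have
$$\mathcal{L}_X\varphi = (i_X \circ d + d \circ i_X)\varphi. \tag{3.5.4}$$
Proof We prove this by induction on $k$. For $k = 0$, (3.5.4) is trivial. Assume the result for $k - 1$. Let $\varphi$ be a $k$-form. We compute, using (3.4.1),
$$\begin{aligned}
i_X d(\varphi)(Y_1, \ldots, Y_k) &= d\varphi(X, Y_1, \ldots, Y_k) \\
&= X\varphi(Y_1, \ldots, Y_k) \\
&\quad + \sum_{i=1}^{k} (-1)^{i+2}\, Y_i\, \varphi(X, Y_1, \ldots, \widehat{Y_i}, \ldots, Y_k) \\
&\quad + \sum_{j=1}^{k} (-1)^{1+j+1}\, \varphi([X, Y_j], Y_1, \ldots, \widehat{Y_j}, \ldots, Y_k) \\
&\quad + \sum_{i<j} (-1)^{i+j+2}\, \varphi([Y_i, Y_j], X, \ldots, \widehat{Y_i}, \ldots, \widehat{Y_j}, \ldots, Y_k) \\
&= (\text{1st term} + \text{3rd term}) + (\text{2nd term} + \text{4th term}) \\
&= \mathcal{L}_X\varphi(Y_1, \ldots, Y_k) + (-1)\, d \circ i_X\varphi(Y_1, \ldots, Y_k).
\end{aligned}$$
Here the first bracket equals $\mathcal{L}_X\varphi(Y_1, \ldots, Y_k)$ by Proposition 3.5.4, and the second equals $-d(i_X\varphi)(Y_1, \ldots, Y_k)$ by (3.4.1) applied to the $(k-1)$-form $i_X\varphi$.
□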
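Cartan's formula can be checked directly in a small case. The vector field $X = y\,\partial_x$ and the 1-form $\omega = x\,dy$ on $\mathbb{R}^2$ below are our own illustrative choices, and the left side is evaluated with the formula $\mathcal{L}_X\omega(Y) = X(\omega(Y)) - \omega([X, Y])$.

```latex
% Assumption: M = R^2, X = y d/dx, omega = x dy (illustrative choice).
\[
\begin{aligned}
i_X\omega &= \omega(X) = 0, & d(i_X\omega) &= 0, \\
d\omega &= dx \wedge dy, & i_X(d\omega) &= dx(X)\,dy - dy(X)\,dx = y\,dy, \\
\mathcal{L}_X\omega(\partial_x) &= X(0) - \omega([X, \partial_x]) = 0, &
\mathcal{L}_X\omega(\partial_y) &= X(x) - \omega([X, \partial_y]) = y - \omega(-\partial_x) = y .
\end{aligned}
\]
```

Both sides of (3.5.4) therefore equal $y\,dy$; the brackets used are $[X, \partial_x] = 0$ and $[X, \partial_y] = -\partial_x$.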
Chapter 4
Integration
4.1 Orientable manifolds

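Let $V$ be a vector space over $\mathbb{R}$ with $\dim V = n$. We know that $\bigwedge^n V$ is 1-dimensional, so that any two non-zero elements $\varphi$ and $\psi$ of $\bigwedge^n V$ are related by $\varphi = a\psi$ for some non-zero $a \in \mathbb{R}$. As we mentioned earlier, we would like to think of any non-zero element of $\bigwedge^n V$ as a volume element. We say that $\varphi$ and $\psi$ are equivalent if $a > 0$; an equivalence class is called an orientation of $V$.
Notice that $\bigwedge^n V \setminus \{0\} = [\varphi] \cup [-\varphi]$, where $\varphi$ is a non-zero element and $[\varphi]$ stands for the equivalence class of $\varphi$. Thus, each $V$ has two orientations. For any ordered basis $B := \{v_1, v_2, \ldots, v_n\}$ of $V$, $\psi(B) := \psi(v_1, \ldots, v_n)$ has the same sign for all $\psi \in [\varphi]$. The space $(V, [\varphi])$ with a fixed orientation is called an oriented vector space, and $B$ is said to be positively (respectively, negatively) oriented if $\psi(B) > 0$ (respectively, $< 0$).
Exercise 4.1.2 Two ordered bases $\{e_i\}$ and $\{v_i\}$ have the same orientation (that is, both are positively oriented or both are negatively oriented) if and only if the (invertible) linear map $A$ taking one to the other has positive determinant; they have opposite orientations if and only if $\det A$ is negative.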
Now we want to extend the notion of orientability to smooth manifolds. The naive attempt to define this concept in terms of vector fields would be: $M$ is orientable if we can find $X_j \in \mathfrak{X}(M)$, $1 \le j \le \dim M$, such that $\{X_j(p)\}$ is a basis of $T_pM$ for all $p \in M$. This definition is not the correct one, as we know that we may not even find one smooth vector field on $M$ which is nowhere vanishing. There is a more appropriate way of defining orientability.
Since we want to talk of infinitesimal volume elements we would like to assign oriented volumes to parallelepipeds in $T_pM$, for all $p \in M$. Thus we need to choose an oriented volume element $\omega_p$ in $\bigwedge^m T_p^*M \setminus \{0\}$. Naturally, we wish to stipulate that this choice be smooth, that is, $p \mapsto \omega_p$ is a smooth $m$-form on $M$, where $m$ is the dimension of $M$. This suggests the following
Definition 4.1.3 A manifold $M$ is said to be orientable if there exists a smooth $m$-form which is nowhere vanishing. ($\dim M = m$.)
We say that a manifold $M$ is oriented if it is orientable and a nowhere vanishing top degree form $\omega$ is fixed. We refer to $\omega$ as the orientation class of $M$. A basis $\{v_j : 1 \le j \le \dim M\}$ of $T_pM$ is said to be positively oriented if $\omega_p(v_1, \ldots, v_m) > 0$.
Now let $M$ be an oriented manifold. Let $\omega$ be the orientation class. Let $(U, x)$ be a chart around $p$. Then for any $q \in U$, $\omega_q$ defines an orientation on $T_qM$ so that we can speak of positively or negatively oriented coordinate systems. We say $(U, x)$ is a positively (respectively, negatively) oriented local chart if $\omega_q\big(\frac{\partial}{\partial x_1}\big|_q, \ldots, \frac{\partial}{\partial x_m}\big|_q\big) > 0$ (respectively, $< 0$) for all $q \in U$. Since $\omega_q$ is alternating, we see that if $(x_1, x_2, \ldots, x_m)$ is positively oriented then $(x_2, x_1, \ldots, x_m)$ is negatively oriented.
If we assume $U$ is connected then $q \mapsto \omega_q\big(\frac{\partial}{\partial x_1}\big|_q, \ldots, \frac{\partial}{\partial x_m}\big|_q\big)$ has the same sign on $U$, so that we can write $\omega = f_U\, dx_1 \wedge dx_2 \wedge \cdots \wedge dx_m$ on $U$ with $f_U(q) > 0$ (or $< 0$) for all $q \in U$. If $(U, x)$ and $(V, y)$ are two positively (or negatively) oriented local charts, then $\det\big(\frac{\partial x_i}{\partial y_j}\big) > 0$ on $U \cap V$.
Definition 4.1.4 Two charts $(U_\alpha, x^\alpha)$ and $(U_\beta, x^\beta)$ are said to be consistently oriented if either they do not intersect or, if $U_\alpha \cap U_\beta \neq \emptyset$, then we have $\det\big(\frac{\partial x_i^\alpha}{\partial x_j^\beta}\big) > 0$ on $U_\alpha \cap U_\beta$. An atlas consisting of consistently oriented coordinate charts is called an oriented atlas.
Thus we have seen that an orientable manifold has an oriented atlas. The converse is also true. This leads us to the following characterization of orientability of manifolds purely in terms of the smooth structure.
Theorem 4.1.5 Let $M$ be a smooth manifold. Then $M$ is orientable if and only if there exists an oriented atlas.
Proof We need only prove the sufficiency part.
Let $\mathcal{A} = \{(U_\alpha, x^\alpha)\}$ be an oriented atlas. Thus on each $U_\alpha$ we have a form $\omega_\alpha = dx_1^\alpha \wedge \cdots \wedge dx_m^\alpha$ with the property that, on $U_\alpha \cap U_\beta$,
$$\omega_\alpha = \det\Big(\frac{\partial x_i^\alpha}{\partial x_j^\beta}\Big)\, \omega_\beta.$$
Now how do we get a global form out of these $\omega_\alpha$? Notice that our usual trick to get global objects from local objects so far had been the employment of $C_c^\infty$-functions and that it is of no use to us now. For, if we set $\omega = \varphi\,\omega_\alpha$ for some $\alpha$ and $\varphi \in C_c^\infty(U_\alpha)$, then $\omega$ is not nowhere vanishing!
This is the place where we first use the existence of a partition of unity, subordinate to the above atlas. (See Appendix B.) Let $\{V_i\}$ be a locally finite refinement of $\{U_\alpha\}$ such that each $V_i$ has compact closure. We fix a partition of unity $\{f_i\}$ subordinate to this cover $\{V_i\}$. For each $i$, let $\alpha_i$ be such that $V_i \subset U_{\alpha_i}$, and we set $x_j^i(z) := x_j^{\alpha_i}(z)$ for all $z \in V_i$. Then $\{(V_i, x^i)\}$ is a smooth oriented atlas of $M$. On each $V_i$ we have, as above, the differential form $\omega^i = dx_1^i \wedge \cdots \wedge dx_m^i$. We now use the partition of unity to patch these forms to give us a desired nowhere vanishing $m$-form.
Set $\omega_p := \sum_i f_i(p)\, \omega_p^i$. Then $p \mapsto \omega_p$ is easily seen to be a differential $m$-form which induces the original orientation on each $T_pM$. For, if $p \in V_i$,
$$\omega_p\Big(\frac{\partial}{\partial x_1^i}, \ldots, \frac{\partial}{\partial x_m^i}\Big) = \sum_j f_j(p)\, \det\Big(\frac{\partial x_k^j}{\partial x_l^i}\Big)\, (dx_1^i \wedge \cdots \wedge dx_m^i)_p\Big(\frac{\partial}{\partial x_1^i}, \ldots, \frac{\partial}{\partial x_m^i}\Big) > 0,$$
where the sum is over the finite number of $j$ such that $f_j(p) \neq 0$.
□
Example 4.1.6 If $M$ is a manifold and if there exists an atlas consisting of two coordinate charts $\{(U, x), (V, y)\}$ with $U \cap V$ connected, then $M$ is orientable. (Exercise.) Hence $S^n$ is orientable.
Example 4.1.7 In Example 4.1.6, we need the connectedness of $U \cap V$. For, if we let $M$ be the Möbius band, we can give a smooth atlas for $M$ as follows:
[Figure 4.1.1 Möbius band: the strip with corners $(-1, 1)$, $(1, 1)$, $(-1, -1)$ and $(1, -1)$, whose vertical sides are identified in opposite directions.]
Recall that $M$ can be obtained as a quotient space of a rectangular strip with horizontal sides removed. The points on the vertical sides are identified in the opposite direction, as indicated in the picture. Then $M$ has two coordinate neighborhoods: $U_1$, the open rectangular strip (that is, without the vertical sides), and $U_2$, the dotted portion on the left side of Figure 4.1.1 (identified with the dotted one on the right hand side as indicated) along with the open set in $M$ to the right of the line $x = \frac{1}{2}$.
In symbols, we have
$$U_1 = \{(x, y) : x \neq \pm 1\},$$
$$U_2 = \{(x, y) : \tfrac{1}{2} < x \le 1 \text{ or } -1 \le x < -\tfrac{1}{2}\}.$$
The coordinates are given by $\varphi_1$, the identity on $U_1$, and $\varphi_2$ on $U_2$ by
$$\varphi_2(x, y) = \begin{cases} (x, y) & \text{if } \frac{1}{2} < x \le 1; \\ (2 + x, -y) & \text{if } -1 \le x < -\frac{1}{2}. \end{cases}$$
Note that as $(1, y) = (-1, -y)$ in $M$, we need to check that $\varphi_2$ is well defined. It is so and it is smooth. Clearly we have
$$U_1 \cap U_2 = \{(x, y) : \tfrac{1}{2} < x < 1\} \cup \{(x, y) : -1 < x < -\tfrac{1}{2}\}.$$
Note that $U_2$ is twisted and then joined to $U_1$. (See Figure 4.1.1.) The Jacobian of the coordinate changes has determinant $+1$ on the first set on the right side above and determinant $-1$ on the second set. Now this implies that $M$ is not orientable. (Why?)
[Figure 4.1.2 Möbius Band as Surface of Revolution]
There is another neat geometric realization of the Möbius band. It is described as the surface obtained by revolving the line segment $[-1/2, 1/2]e_3$ around the circle of radius 2 with center at the origin in the $xy$-plane, but moving the line segment along the circle in such a way that if we have moved through an angle of $u$ radians, the angle between the line segment at that point and $e_3$ is $u/2$. More explicitly, we have the following parameterization:
$$\varphi(u, v) = \big((2 + v\sin(u/2))\cos u,\ (2 + v\sin(u/2))\sin u,\ v\cos(u/2)\big).$$
Figure 4.1.2 illustrates this revolution.
To get genuine parameterizations, we assume $\varphi$ and $\psi$ are defined exactly as above but with domains $(0, 2\pi) \times (-1/2, 1/2)$ and $(-\pi, \pi) \times (-1/2, 1/2)$ respectively.
Example 4.1.8 If $M$ is a hypersurface in $\mathbb{R}^{n+1}$, that is, $M$ is an $n$-dimensional submanifold of $\mathbb{R}^{n+1}$, then $M$ is orientable if and only if there is a nowhere vanishing normal field on $M$.
This statement needs some explanation. We identify $T_p\mathbb{R}^{n+1}$, as usual in a canonical way, with $\mathbb{R}^{n+1}$. We also identify $T_pM$ as a subspace of $T_p\mathbb{R}^{n+1}$ and hence of $\mathbb{R}^{n+1}$. Now on $\mathbb{R}^{n+1}$ we have the usual inner product. We say, for $p \in M$, a vector $\eta \in T_p\mathbb{R}^{n+1}$ is normal to $M$ if $\langle \eta, v \rangle = 0$ for all $v \in T_pM$. Let $p \mapsto \eta_p$ be a nowhere vanishing normal field on $M$. Then we want to construct a nowhere vanishing $n$-form on $M$. The trick is to take the $(n+1)$-form $du_1 \wedge \cdots \wedge du_{n+1}$ on $\mathbb{R}^{n+1}$ and contract or interior multiply it to get the required form, that is, we set
$$\omega := i_\eta(du_1 \wedge \cdots \wedge du_{n+1}).$$
In other words, for any $v_1, \ldots, v_n \in T_pM$, we set
$$\omega_p(v_1, \ldots, v_n) = (du_1 \wedge \cdots \wedge du_{n+1})(\eta_p, v_1, \ldots, v_n).$$
This defines $\omega_p \in \bigwedge^n T_p^*M$ and $\omega_p \neq 0$. For, if $\{v_1, \ldots, v_n\}$ is a basis for $T_pM$, then $\{v_1, \ldots, v_n, \eta_p\}$ is a basis for $T_p\mathbb{R}^{n+1}$ and hence $\omega_p(v_1, \ldots, v_n)$ is the oriented volume of $[\eta_p, v_1, \ldots, v_n]$. Up to sign, the latter can be shown to be $\sqrt{\det(\langle v_i, v_j \rangle)_{0 \le i, j \le n}}$, where $v_0 = \eta_p$. (See Lemma 4.2.10.)
We now prove the converse. From the local description of submanifolds, we know that given $p \in M$, there exists a coordinate chart $U_p$ around $p$ such that it is the level set of a regular function $f_p$ defined on an open neighbourhood of $U_p$, $f_p : V_p \supset U_p \to \mathbb{R}$. The vector field $\eta_p : q \mapsto \frac{\operatorname{grad} f_p(q)}{\|\operatorname{grad} f_p(q)\|}$ is a smooth normal field on $U_p$. We let $(x_1^p, \ldots, x_n^p)$ be a positively oriented system of coordinates on $U_p$. We let $\nu_p(q) := \pm\eta_p(q)$ according as
$$(du_1 \wedge \cdots \wedge du_{n+1})\Big(\eta_p, \frac{\partial}{\partial x_1^p}, \ldots, \frac{\partial}{\partial x_n^p}\Big) > 0 \text{ or } < 0.$$
Note that $\nu_p(z) = \nu_q(z)$ if $z \in U_p \cap U_q$. Thus $\nu$ is a well-defined smooth unit normal field on $M$.
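For a concrete instance of the first half of this argument, take the unit sphere with its outward normal. The choice $M = S^2 \subset \mathbb{R}^3$ and $\eta_p = p$ is ours, made only to illustrate the contraction.

```latex
% Assumption: M = S^2 in R^3 with the outward normal field eta_p = p (illustrative choice).
\[
\eta = u_1 \frac{\partial}{\partial u_1} + u_2 \frac{\partial}{\partial u_2} + u_3 \frac{\partial}{\partial u_3},
\qquad
\omega = i_\eta(du_1 \wedge du_2 \wedge du_3)
 = u_1\, du_2 \wedge du_3 + u_2\, du_3 \wedge du_1 + u_3\, du_1 \wedge du_2 .
\]
% At the north pole p = (0, 0, 1), with the basis e_1, e_2 of T_p S^2:
\[
\omega_p(e_1, e_2) = (du_1 \wedge du_2 \wedge du_3)(e_3, e_1, e_2) = 1 \neq 0 .
\]
```

The same computation at any other point of the sphere, with an orthonormal basis of the tangent plane, gives $\pm 1$, so $\omega$ restricts to a nowhere vanishing 2-form on $S^2$.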
To give a more 'algebraic' proof, we need the following lemma from
linear algebra.
Lemma 4.1.9 Let $V$ be an $(n+1)$-dimensional real inner product space. Let $W$ be a vector subspace of $V$ with $\dim W = n$. Assume that $\varphi$ (respectively, $\psi$) is a nonzero top degree skew-symmetric form on $V$ (respectively, $W$). Then there exists a unique vector $\nu$ such that $\nu \perp W$ and
$$\varphi(\nu, w_1, \ldots, w_n) = \psi(w_1, \ldots, w_n) \quad \text{for all } w_i \in W. \tag{4.1.1}$$
Proof For $v \in V$, define a skew-symmetric $n$-linear form $f_v$ on $W$ by setting
$$f_v(w_1, \ldots, w_n) := \varphi(v, w_1, \ldots, w_n).$$
Since $\varphi$ is nonzero, there exists $v \in V$ such that $f_v \neq 0$. Fix one such $v$. Write $v = w + \eta$ with $w \in W$, $\eta \in W^\perp$. Since $\psi$ is a nonzero top degree form on $W$, $f_v$ must be a multiple of $\psi$, say, $f_v = a(v)\psi$ with $a(v) \neq 0$. Note that $f_\eta = f_v$, since $w, w_1, \ldots, w_n$ are linearly dependent, so that $f_\eta = a(v)\psi$. Set $\nu := \eta / a(v)$. Then $\nu$ satisfies Eq. (4.1.1). It is unique. For, if $u \in W^\perp$ also satisfies (4.1.1), then $u - \nu$ is such that $\varphi(u - \nu, w_1, \ldots, w_n) = 0$ for all $w_i \in W$. Hence $u - \nu \in W \cap W^\perp = \{0\}$.
□
Let us assume that $M$ admits a nowhere vanishing top degree form $\psi$ so that there exists an oriented atlas on $M$. Let $\varphi = dx_1 \wedge \cdots \wedge dx_{n+1}$ be the orientation form on $\mathbb{R}^{n+1}$. By the lemma, there exists $\nu_p$ such that
$$\varphi_p(\nu_p, w_1, \ldots, w_n) = \psi_p(w_1, \ldots, w_n), \quad w_i \in T_pM.$$
Then $p \mapsto \nu_p$ is a nowhere vanishing normal field on $M$. It is smooth since $\psi$ and $\varphi$ are so.
Example 4.1.10 If $G$ is a Lie group, then $G$ is an orientable manifold. For, let $X_k$, $1 \le k \le n$, be an ordered basis for $\mathfrak{g}$, the Lie algebra of left invariant vector fields on $G$. Let $\omega^i$ be the dual 1-forms, that is, $\omega^i(X_j) = \delta^i_j$ for all $i, j$. Then $\Omega = \omega^1 \wedge \omega^2 \wedge \cdots \wedge \omega^n$ is a nowhere vanishing $n$-form on $G$. Show that the above $n$-form is left invariant.
Example 4.1.11 When is $\mathbb{P}^n(\mathbb{R})$ orientable? Hint: The answer depends on the parity of $n$.
Example 4.1.12 If $G$ is a Lie group and $H$ is a closed subgroup of $G$ then we know that $M := G/H$ can be given a smooth structure. If $\mathfrak{g}$ (respectively, $\mathfrak{h}$) denotes the Lie algebra of $G$ (respectively, $H$) then $\mathfrak{g}/\mathfrak{h}$ is a vector space on which $H$ acts as follows:
$$\rho(h)(Y + \mathfrak{h}) := \mathrm{Ad}(h)(Y) + \mathfrak{h}.$$
This is well defined since if we have $Y + \mathfrak{h} = Z + \mathfrak{h}$, then $Y - Z = X$ for some $X \in \mathfrak{h}$. Now for $h \in H$ and $X \in \mathfrak{h}$ we have $\mathrm{Ad}(h)(X) \in \mathfrak{h}$, so that $\rho(h)$ is well defined. If $\rho(h)$ has determinant 1 for all $h \in H$, then $G/H$ is orientable. Hint: Go through Example 4.1.10 carefully.
Exercise 4.1.13 Let $M$ and $N$ be connected oriented manifolds of the same dimension. Then a map $\varphi : M \to N$ is said to be orientation preserving if $D\varphi(p) : T_p(M) \to T_{\varphi(p)}N$ is orientation preserving for each $p \in M$. In terms of positively oriented coordinate systems $x_i$ and $y_j$ on $M$ and $N$ respectively, this means that the Jacobian matrix has positive determinant.
214 4. Integration
4.2 Integration on manifolds

So far we have been discussing the differential calculus aspects of smooth manifolds. We shall now talk of integration on manifolds. The basic fact we need to know is the change of variable formula for $C^1$-diffeomorphisms of open subsets of $\mathbb{R}^n$. We recall this now and recast it in the language of differential forms. If $U$ and $V$ are open sets in $\mathbb{R}^n$ and if $\varphi : U \to V$ is a $C^1$-diffeomorphism, then for any $f \in C(V)$ with compact support, we have
\[ \int_V f(y)\,dy = \int_U (f \circ \varphi)(x)\,|\det D\varphi(x)|\,dx. \]

Here the integrals are the Riemann integrals. For the sake of simplicity, let us assume that the sets under consideration are connected. Then $\varphi$ is either orientation preserving or orientation reversing. That is, we have either $\det D\varphi(x) > 0$ or $\det D\varphi(x) < 0$ on $U$. Then we can rewrite the above formula as follows:
\[ \int_V f(y)\,dy = \pm \int_U (f \circ \varphi)(x)\,\det D\varphi(x)\,dx, \]
depending on whether $\varphi$ preserves or reverses the orientation.

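We wish to interpret the above formula in terms of differential forms. So, let us assume that $f$ is smooth in the formula. Then on $V$ we consider the differential form $\omega := f(y)\,dy_1 \wedge \cdots \wedge dy_n$, where $y_i$ are the usual coordinates on $\mathbb{R}^n$. Then the pull-back of $\omega$ under $\varphi$ is the form
\[ \varphi^*(\omega) := (f \circ \varphi)\,\varphi^*(dy_1 \wedge \cdots \wedge dy_n). \]
Let us use $x_i$ to denote the usual coordinates of $\mathbb{R}^n$ on $U$ so as to make the distinction. Then we have seen that
\[ \varphi^*(dy_1 \wedge \cdots \wedge dy_n) = \det\left(\frac{\partial y_i}{\partial x_j}\right) dx_1 \wedge \cdots \wedge dx_n. \]
Thus we see that
\[ \varphi^*(\omega) = (f \circ \varphi)\,\det\left(\frac{\partial y_i}{\partial x_j}\right) dx_1 \wedge \cdots \wedge dx_n. \]
In other words, we can write the change of variable formula as
\[ \int_V \omega = \pm \int_U \varphi^*(\omega). \tag{4.2.1} \]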
Equation (4.2.1) tells us how to define the integral of a smooth differential form $\omega$ of degree $m = \dim M$, provided that it has compact support and the support of $\omega$ lies inside a coordinate chart $(U, x)$. Here the support of a differential form is defined in the obvious way:
\[ \operatorname{supp} \omega := \text{the closure of } \{p \in M : \omega_p \neq 0\}. \]
From what we saw in Section 3.3 we know that $\omega_p$ is well defined, so that the above definition makes sense. We assume that the local chart is $(U, \varphi^{-1})$. On the open set $U$ we can write $\omega = f\,dx_1 \wedge \cdots \wedge dx_m$; we define
\[ \int_M \omega := \int_{\varphi^{-1}(U)} f(\varphi(p))\,dx_1 \cdots dx_m, \tag{4.2.2} \]
where on the right side we have the Riemann integral of the smooth function $f \circ \varphi$. It is finite since this function is compactly supported on $\varphi^{-1}(U)$. Equation (4.2.1) tells us that the left side of Equation (4.2.2) is well defined. For, if $(V, \psi^{-1})$ is another local chart inside which the support of $\omega$ lies, we could have defined
\[ \int_M \omega := \int_{\psi^{-1}(V)} f(\psi(p))\,dy_1 \cdots dy_m, \]
where $y_i$ are the local coordinates on $V$. It is easy to verify that Equation (4.2.1) implies that the above two left sides are the same, provided that these charts are consistently oriented, that is, we have $\det\left(\frac{\partial y_i}{\partial x_j}\right) > 0$.
It is important to notice that it may or may not be possible to achieve
the effect of consistently oriented charts. We know precisely when it is
possible to do this, namely, when the manifold M is orientable. This
means that to define the integral of a differential form we need to as-
sume that M is oriented and we should restrict ourselves to positively
oriented coordinate charts. It is essential that the reader appreciates the
way the concept of orientability enters the integration theory of differ-
ential forms.
We shall always assume that the manifold is oriented wherever we

talk of integrals of forms.
Now how do we extend the notion of integrals of forms whose sup-

ports are not necessarily contained in a local chart? This is where a
partition of unity comes to our rescue.
We take an oriented manifold $M$ and an $m$-form $\omega$ on $M$ with compact support. Let $\mathcal{A}$ be a positively oriented atlas on $M$ and let us choose
any partition of unity $\{f_i\}$ subordinate to this oriented atlas. Here again notice that we can and do take positively oriented coordinates on the sets which refine the given atlas. Then we set
\[ \int_M \omega := \sum_i \int_{U_i} f_i\,\omega. \]
The first thing to observe in the definition above is that the right side is finite since the summation is over a finite number of indices and each of these integrals is finite due to the locally finite refinement. The second thing to see is whether this notion of integral is independent of the choice of the partition of unity. Thus, if $\{g_j\}$ is another partition of unity, then we claim that
\[ \sum_j \int g_j\,\omega = \sum_i \int f_i\,\omega. \tag{4.2.3} \]
This is easy. We carry this out in two steps. First we notice that if $\omega$ has support in the intersection of two positively oriented charts and if we take integrals with respect to each of them then the integrals are the same by the differential form version of the change of variable formula. The second step is as follows. Note that $g_j\omega = \sum_i f_i g_j \omega$ so that $\int g_j\omega = \int \sum_i f_i g_j \omega$. By our assumption on the supports of the functions, we get $\int g_j\omega = \sum_i \int f_i g_j\omega$ and hence
\[ \sum_j \int g_j\,\omega = \sum_j \sum_i \int f_i g_j\,\omega. \tag{4.2.4} \]
Starting with the observation $f_i\omega = \sum_j g_j f_i \omega$ and proceeding as earlier, we see that
\[ \sum_i \int f_i\,\omega = \sum_i \sum_j \int g_j f_i\,\omega. \tag{4.2.5} \]
From Equations (4.2.4) and (4.2.5), we get Equation (4.2.3).

Even though we can only integrate forms of top degree (that is, forms
of degree m = dim M) on an oriented manifold we can pull back k-forms
to submanifolds of dimension k and integrate them there. That is to say,
we can integrate k-forms on k-dimensional submanifolds of M. We look
at some examples. The first example has probably entered the readers'
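life in a course on physics or vector analysis. Let $\gamma : (a, b) \to M = \mathbb{R}^n$ be a submanifold. If $\omega$ is a 1-form on $M$, we can write $\omega = \sum_i f_i\,dx_i$ with $f_i$ smooth on $\mathbb{R}^n$. If we set $\gamma = (\gamma_1, \ldots, \gamma_n)$, we have
\[ \gamma^*(\omega) = \sum_i f_i(\gamma(t))\,d(x_i \circ \gamma) = \sum_i f_i(\gamma(t))\,d\gamma_i \]
on $(a, b)$. Thus
\[ \int_\gamma \omega := \int_{(a,b)} \gamma^*(\omega) = \sum_i \int_a^b (f_i \circ \gamma)(t)\,\frac{d\gamma_i}{dt}\,dt, \]
the last integral being the Riemann integral.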
Example 4.2.1 As a specific example, let us compute $\int_\gamma \omega$, where
\[ \omega(x, y) = \frac{-y\,dx + x\,dy}{x^2 + y^2} \quad \text{on } \mathbb{R}^2 \setminus \{0\} \]
and $\gamma$ is any circle centered at the origin. Thus $\gamma(t) = (r\cos t, r\sin t)$ for $t \in [-\pi, \pi)$. Then $\gamma$ is a diffeomorphism of the circle $[-\pi, \pi]/(\pm\pi)$ onto the circle, with center at the origin, of radius $r$ in $\mathbb{R}^2$. We see that
\[ \gamma^*\omega = \frac{-r\sin t\,d(r\cos t) + r\cos t\,d(r\sin t)}{r^2} = dt. \]
Hence
\[ \int_\gamma \omega = \int_{-\pi}^{\pi} dt = 2\pi. \]
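The computation of Example 4.2.1 can be checked numerically. The following is only a sketch, not part of the text's argument: the function name `line_integral_angular_form` and the midpoint rule are our own choices. It approximates the Riemann integral of the pulled-back form along $\gamma(t) = (r\cos t, r\sin t)$ and shows that the answer does not depend on $r$.

```python
import math

def line_integral_angular_form(r, steps=10_000):
    """Midpoint-rule approximation of the integral of
    omega = (-y dx + x dy) / (x^2 + y^2) over the circle of radius r,
    parametrized by gamma(t) = (r cos t, r sin t), t in [-pi, pi)."""
    h = 2 * math.pi / steps
    total = 0.0
    for k in range(steps):
        t = -math.pi + (k + 0.5) * h
        x, y = r * math.cos(t), r * math.sin(t)
        dx, dy = -r * math.sin(t), r * math.cos(t)  # components of gamma'(t)
        total += (-y * dx + x * dy) / (x * x + y * y) * h
    return total

for r in (0.5, 1.0, 3.0):
    print(r, line_integral_angular_form(r), 2 * math.pi)
```

The integrand of the pulled-back form is identically 1, which is why every radius gives the same value.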
Example 4.2.1 is of significance to us. From the Poincaré lemma (Lemma 3.4.6) we know that if we are given any closed $k$-form $\omega$ on $M$ then locally we can find a $(k-1)$-form $\varphi$ such that $d\varphi = \omega$. That is, if $p \in M$ then there is an open set $U \ni p$ and a form $\varphi$ on $U$ such that $d\varphi = \omega$ on $U$. In general, it is not possible to solve this equation globally. That is to say, we may not find a $\varphi$ on the whole of $M$ such that $d\varphi = \omega$ holds on $M$. Example 4.2.1 furnishes one such instance. For, if possible, let $g$ be a smooth function on the manifold $M := \mathbb{R}^2 \setminus \{0\}$ such that $dg = \omega$. We claim that $\int_\gamma dg = 0$. This will contradict what we have shown above, namely, $\int_\gamma \omega = 2\pi$. Now we prove the claim:
\[ \gamma^*(\omega) = \gamma^*(dg) = d(g \circ \gamma) \]
so that
\[ \int_\gamma \omega = \int_{-\pi}^{\pi} \frac{d(g \circ \gamma)}{dt}\,dt = \int_{-\pi}^{\pi} (g \circ \gamma)'(t)\,dt = g \circ \gamma(\pi) - g \circ \gamma(-\pi) = 0. \]
Thus there are obstructions to a global solution of the equation $\omega = d\varphi$. In the above case the obstruction is due to the presence of a hole at "the origin".
Going back to the general manifold, we saw that we can integrate an $m$-form on an orientable $m$-dimensional manifold $M$. If there is any reason to fix an $m$-form $\omega$, we can then define the integral of an arbitrary function $f \in C_c^\infty(M)$, the space of smooth functions with compact support, as follows:
\[ \int_M f := \int_M f\,\omega. \]
On a general manifold there is no natural choice for such a form. How-

ever, if the manifold has some additional structure it may be possible to
make a canonical choice. We have two such classes of manifolds in mind.
One is the class of Riemannian manifolds and the other is Lie groups.
We first take up the study of integration of forms on Lie groups. Let us recall that any Lie group $G$ is orientable. Let $\{v_i : 1 \le i \le \dim G\}$ be an ordered basis of $T_eG$. Let $X_i$ be the left invariant vector field defined by $X_i(e) = v_i$. If $\omega^i$ are the 1-forms dual to $X_i$, the form $\omega := \omega^1 \wedge \cdots \wedge \omega^n$ is a nowhere vanishing $n$-form on $G$. Hence $G$ is orientable. Observe that if we choose any other basis $\{w_i\}$ of $T_eG$ then the resulting $n$-form will be $a\omega$ for a nonzero real number $a$. Thus, except for a scalar multiple, we have a uniquely defined nowhere vanishing left invariant $n$-form on $G$. Hence if we fix one ordered basis of $T_eG$, then we can define the integral of any compactly supported smooth function $f$ on $G$ by setting $\int_G f := \int_G f\omega$, where the right side is defined as the integral of the form $f\omega$. The form $\omega$, defined as above, has an interesting property:
\[ L_g^*(\omega) = \omega \quad \text{for all } g \in G. \]
That is, $\omega$ is a left-invariant $n$-form on $G$. This in turn implies that the integral itself of any function is left-invariant:
\[ \int_G f(ag) = \int_G f(g) \quad \text{for all } a \in G. \tag{4.2.6} \]
From the Riesz representation theorem of measure theory we know that the above integral corresponds to a measure $\mu$ on the group $G$. Equation (4.2.6) tells us that this measure is left-invariant: $\mu(gE) = \mu(E)$ for any $g \in G$ and any Borel set $E \subset G$. Such a measure is called a (left-invariant) Haar measure. We first look at some examples.
Example 4.2.2 Of course our first example is $\mathbb{R}^n$. Here, clearly, we may take $\omega = dx_1 \wedge \cdots \wedge dx_n$, where we have used the usual global chart $(\mathbb{R}^n, x)$.
Example 4.2.3 Let us consider $G = \mathbb{R}^+$, the multiplicative group of positive reals. We take $(\mathbb{R}^+, x)$ as the chart. We now investigate how the 1-form $dx$ transforms under left translations. Let $a \in \mathbb{R}^+$. Then
\[ L_a^*(dx) = d(ax) = a\,dx. \]
This suggests the definition $\omega := x^{-1}\,dx$. For then we shall have
\[ L_a^*(\omega) = (ax)^{-1}\,d(ax) = \omega. \]
Thus $\omega$ is left-invariant. The integral of a compactly supported function $f$ is therefore $\int_{\mathbb{R}^+} f(x)\,x^{-1}\,dx$. In particular, the sets $(0, 1)$ and $(1, \infty)$ have the same measure, namely, infinity.
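The left-invariance in Example 4.2.3 can be tested numerically. The sketch below is illustrative only; the names `haar_integral` and `bump` and the choice of test function are assumptions of ours. It integrates against $x^{-1}\,dx$ by substituting $s = \log x$, so that $dx/x$ becomes $ds$, and compares $\int f(x)\,x^{-1}dx$ with $\int f(ax)\,x^{-1}dx$.

```python
import math

def haar_integral(f, lo, hi, steps=20_000):
    """Approximate the integral of f(x) dx / x over [lo, hi] by the
    midpoint rule in the variable s = log x (where dx / x = ds)."""
    a, b = math.log(lo), math.log(hi)
    h = (b - a) / steps
    return sum(f(math.exp(a + (k + 0.5) * h)) for k in range(steps)) * h

def bump(x):
    """Hypothetical test function, supported in the interval (1, e^2)."""
    s = math.log(x)
    return math.sin(math.pi * s / 2) ** 2 if 0 < s < 2 else 0.0

a = 5.0
original = haar_integral(bump, 1.0, math.exp(2))
# x -> bump(a x) is supported in (1/a, e^2/a)
translated = haar_integral(lambda x: bump(a * x), 1.0 / a, math.exp(2) / a)
print(original, translated)
```

Both numbers agree up to rounding, as Equation (4.2.6) predicts for this group.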
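Example 4.2.4 Let
\[ G := \left\{ [x, y] := \begin{pmatrix} x & y \\ 0 & 1 \end{pmatrix} : x \in \mathbb{R}^+,\ y \in \mathbb{R} \right\}. \]
We give it a global chart
\[ [x, y] \mapsto (x, y) \in \mathbb{R}^+ \times \mathbb{R}. \]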
Let us see how the 1-forms $dx$ and $dy$ change under left translations. Let $g = [a, b] \in G$. The diffeomorphism $L_g$ is given in local coordinates by $(x, y) \mapsto (ax, ay + b)$ since $L_g([x, y]) = [ax, ay + b]$. This shows that we have
\[ L_g^*(dx) = a\,dx; \qquad L_g^*(dy) = a\,dy. \]
Thus, if we define $\omega_1 := x^{-1}\,dx$ and $\omega_2 := x^{-1}\,dy$, then $\omega_i$ for $i = 1, 2$ are left-invariant 1-forms and $\omega := \omega_1 \wedge \omega_2$ is a left-invariant 2-form. Thus the Haar measure corresponding to this form is given by
\[ \int_G f(g)\,dg = \int_{\mathbb{R}^+ \times \mathbb{R}} f(x, y)\,\frac{dx\,dy}{x^2}. \]
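As a sanity check on Example 4.2.4, one can verify pointwise that the density $x^{-2}$ is unchanged when pulled back by a left translation: the density at $L_g(p)$ times the Jacobian determinant of $L_g$ at $p$ should equal the density at $p$. The sketch below is only an illustration; the helper names and the finite-difference Jacobian are our own choices.

```python
def left_translate(g, p):
    """L_g([x, y]) = [a x, a y + b] for g = [a, b], in the global chart."""
    a, b = g
    x, y = p
    return (a * x, a * y + b)

def density(p):
    """Density of the 2-form x^{-2} dx ^ dy with respect to dx dy."""
    x, _ = p
    return 1.0 / (x * x)

def jacobian_det(phi, p, h=1e-6):
    """Central-difference Jacobian determinant of a map phi: R^2 -> R^2."""
    x, y = p
    fx1, fx0 = phi((x + h, y)), phi((x - h, y))
    fy1, fy0 = phi((x, y + h)), phi((x, y - h))
    a11 = (fx1[0] - fx0[0]) / (2 * h)
    a21 = (fx1[1] - fx0[1]) / (2 * h)
    a12 = (fy1[0] - fy0[0]) / (2 * h)
    a22 = (fy1[1] - fy0[1]) / (2 * h)
    return a11 * a22 - a12 * a21

def pulled_back_density(g, p):
    """Density of L_g^*(x^{-2} dx ^ dy) at the point p."""
    phi = lambda q: left_translate(g, q)
    return density(phi(p)) * jacobian_det(phi, p)

g, p = (3.0, -2.0), (0.7, 1.9)
print(pulled_back_density(g, p), density(p))
```

The two printed values coincide up to discretization error, in agreement with $L_g^*(\omega) = \omega$.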
Exercise 4.2.5 What is the Haar measure of $G = GL(n, \mathbb{R})$?

Exercise 4.2.6 Let $G = SL(2, \mathbb{R})$ act on the upper half plane $X$ via fractional linear transformations on $X$, as in Example 2.11.6. Find a basis for $G$-invariant 1-forms on $X$.
Exercise 4.2.7 Compute a basis of the right-invariant 1-forms on the Lie group $G$ of Example 4.2.4 and hence find the right-invariant Haar measure. Is it the same as the left-invariant Haar measure?
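Exercise 4.2.7 prompts us to find conditions under which a left invariant top degree form on a Lie group is right-invariant. Let $G$ be a Lie group and $\omega$ be a left-invariant form of degree $n = \dim G$. We wish to find the effect of a right translation $R_a$ on $\omega$. We claim that
\[ R_a^*\omega = \det(\mathrm{Ad}(a^{-1}))\,\omega. \]
We check this on $X_i(e)$, where $\{X_i\}$ is a basis of left-invariant vector fields; it is just a matter of unwinding the definitions. To see this, let $\iota_a(x) := axa^{-1}$ be the inner automorphism. Then we have
\[ \iota_a = L_a \circ R_{a^{-1}}. \]
Since $R_a \circ L_b = L_b \circ R_a$, we see that
\[ \iota_a^*\omega = R_{a^{-1}}^* L_a^*\omega = R_{a^{-1}}^*\omega. \]
We claim that $\iota_a^*\omega$ is left-invariant for any $a \in G$. For, we have
\[ L_b^*(\iota_a^*\omega) = L_b^* R_{a^{-1}}^*\omega = R_{a^{-1}}^* L_b^*\omega = R_{a^{-1}}^*\omega = \iota_a^*\omega. \]
Now we compute
\[ (\iota_a^*\omega)_e(X_1, \ldots, X_n) = \omega_e(\iota_{a*}X_1, \ldots, \iota_{a*}X_n) = \omega_e(\mathrm{Ad}\,a(X_1), \ldots, \mathrm{Ad}\,a(X_n)) = \det(\mathrm{Ad}(a))\,\omega_e(X_1, \ldots, X_n). \]
Our claim follows from this.

Thus we see that any left-invariant $n$-form on $G$ is also right-invariant if and only if $\det(\mathrm{Ad}(g)) = 1$ for all $g \in G$.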
Definition 4.2.8 A Lie group is said to be unimodular if every left-invariant Haar measure on it is also right-invariant.

Hence a Lie group for which $\det(\mathrm{Ad}(g)) = 1$ holds for all $g$ is unimodular. As examples we have abelian Lie groups, $SL(n, \mathbb{R})$ and compact Lie groups. Compact groups are unimodular since the continuous homomorphism $g \mapsto \det(\mathrm{Ad}(g))$ maps $G$ onto a compact subgroup of $\mathbb{R}^+$, namely, the subgroup $\{1\}$.
Exercise 4.2.9 Let M be a manifold on which a Lie group G acts

transitively. Find conditions under which
1. M is orientable
2. There is a top degree G-invariant form on M.
(Hint: Recall that you may assume M = G / H for some closed subgroup
H.)
The rest of the section should be read after Section 5.2.
We assume that $M$ is an oriented Riemannian manifold. At each point $p \in M$, we choose an oriented orthonormal basis $\{e_i(p)\}$ in $T_pM$. Let $\{e_i^*(p)\}$ be the dual basis. Then we have an $m$-form
\[ dV := p \mapsto e_1^*(p) \wedge \cdots \wedge e_m^*(p) \]
on $M$. Is this form smooth? We derive local expressions for $dV$. Let $(U, x)$ be a positively oriented chart. Then $\left\{\frac{\partial}{\partial x_i}\big|_p\right\}$ form a positively oriented basis. To compute the volume (with respect to the volume element $dV(p)$) of the parallelepiped spanned by this basis we use the following lemma from linear algebra.
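Lemma 4.2.10 Let $V$ be an inner product space over $\mathbb{R}$. If $\{e_i\}$ is an oriented orthonormal basis of $V$, then the volume of the parallelepiped spanned by $\{v_1, \ldots, v_n\}$ is $\sqrt{G}$, where $G := \det(\langle v_i, v_j \rangle)$. That is, we have
\[ \left| (e_1^* \wedge \cdots \wedge e_n^*)(v_1, \ldots, v_n) \right| = \sqrt{G}. \]
Proof We write $v_i = \sum_k a_i^k e_k$ and set $A := (a_i^k)$. Then
\[ \langle v_i, v_j \rangle = \Big\langle \sum_k a_i^k e_k, \sum_l a_j^l e_l \Big\rangle = \sum_k a_i^k a_j^k = (AA^t)_{ij}, \]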
so that
\[ \det(\langle v_i, v_j \rangle) = \det(AA^t) = (\det A)^2. \]
By definition, we have
\[ (e_1^* \wedge \cdots \wedge e_n^*)(v_1, \ldots, v_n) = \det A. \]
This proves the lemma.
□
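A small numerical instance of Lemma 4.2.10 may help. The sketch below takes three hypothetical vectors in $\mathbb{R}^3$ (with the standard basis as the orthonormal basis) and compares $|\det A|$ with $\sqrt{\det(\langle v_i, v_j\rangle)}$; the helper names `det3` and `gram` are ours.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as a sequence of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def gram(vectors):
    """Gram matrix (<v_i, v_j>) for the standard inner product."""
    return [[sum(a * b for a, b in zip(v, w)) for w in vectors] for v in vectors]

# Hypothetical example vectors; the rows of A are their coordinates.
vs = [(1.0, 2.0, 0.0), (0.0, 1.0, 3.0), (2.0, 0.0, 1.0)]
volume_from_det = abs(det3(vs))
volume_from_gram = math.sqrt(det3(gram(vs)))
print(volume_from_det, volume_from_gram)
```

Here $\det A = 13$ and the Gram determinant is $169 = 13^2$, so both routes give the volume 13.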
From Lemma 4.2.10 we see that on a Riemannian manifold
\[ dV = \sqrt{G}\,dx_1 \wedge \cdots \wedge dx_m, \]
where we have set $G := \det(g_{ij})$. Here, of course, $g_{ij}(p) := g\left(\frac{\partial}{\partial x_i}\big|_p, \frac{\partial}{\partial x_j}\big|_p\right)$. Thus, the form $dV$ is smooth and is called the volume form corresponding to the Riemannian metric $g$. This is the canonical choice for an oriented Riemannian manifold. Hence we can define the integral of any compactly supported smooth function $f$ on $M$ as
\[ \int_M f := \int_M f\,dV. \]
We shall now let you into a secret. If we are interested only in integrating functions on a Riemannian manifold $M$, we do not need $M$ to be orientable. This should come as no surprise to the discerning reader who has understood the geometric reasoning behind the infinitesimal volume elements. For any Riemannian manifold $M$ and any smooth function $f$ with compact support inside a coordinate chart $(U, \varphi)$, we set
\[ \int_M f := \int_{\varphi(U)} (f \circ \varphi^{-1})\,\sqrt{G}(\varphi^{-1}(p))\,d\mu(p), \]
where the integral on the right side is the usual Riemann integral. Here we have used $d\mu$ for the Lebesgue measure on $\mathbb{R}^m$ since we did not want it to be confused with the earlier definition in which we used the symbol $dx$. Now we should check that this notion of integral is well defined. Thus let us assume that $f$ has support in $(U, \varphi)$ as well as in $(V, \psi)$. Then we need to show
\[ \int_{\varphi(U)} (f \circ \varphi^{-1})\,\sqrt{G}(\varphi^{-1}(p))\,d\mu(p) = \int_{\psi(V)} (f \circ \psi^{-1})\,\sqrt{G}(\psi^{-1}(p))\,d\mu(p). \]
This follows trivially from the (non-differential form version of the) change of variable formula and the following exercise.
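Exercise 4.2.11 Let $(U, x)$ and $(V, y)$ be two coordinate charts in the Riemannian manifold $M$ such that $U \cap V$ is nonempty. We observe that
\[ g^y_{ij} := \Big\langle \frac{\partial}{\partial y_i}, \frac{\partial}{\partial y_j} \Big\rangle = \Big\langle \sum_r \frac{\partial x_r}{\partial y_i}\frac{\partial}{\partial x_r}, \sum_s \frac{\partial x_s}{\partial y_j}\frac{\partial}{\partial x_s} \Big\rangle = \sum_{r,s} \frac{\partial x_r}{\partial y_i}\frac{\partial x_s}{\partial y_j}\,g^x_{rs}. \]
Hence,
\[ \det(g^y_{ij}) = \det\left(\frac{\partial x_r}{\partial y_i}\right)^2 \cdot \det(g^x_{ij}). \tag{4.2.7} \]
Thus, for $f \in C(M)$ with support in $(U, x)$, we define
\[ \int_M f := \int_{\varphi(U)} f \circ \varphi^{-1}(x)\,\sqrt{\det(g^x_{ij})(p)}\,dx, \]
where $(U, x) \equiv (U, \varphi)$ and $\varphi(p) = x$. Equation (4.2.7) shows that this is well-defined.
Now using a partition of unity we can extend the notion of integral to any (continuous) function with compact support. Let us once again remark that if we wish to integrate forms we must assume that the manifold is orientable.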
4.3 Stokes' theorem

Let $\Omega$ be a connected open set in $M$. Let $\partial\Omega$ be the (topological) boundary of $\Omega$ in $M$. Stokes' theorem relates the integral of an $(m-1)$-form $\phi$ on the boundary $\partial\Omega$ of $\Omega$ to that of $d\phi$ on $\Omega$. In symbols,
\[ \langle \phi, \partial\Omega \rangle = \langle d\phi, \Omega \rangle, \]
that is, the analytic operation $d$ has the topological operation $\partial$ as its adjoint. To make sense of $\int_{\partial\Omega} \phi$, we need $\partial\Omega$ to be a smooth $(m-1)$-dimensional manifold.
Definition 4.3.1 We say $\Omega$ has smooth boundary if, for each $p \in \partial\Omega$, we have a local chart $(U, x)$ in $M$ with $x_i(p) = 0$ for $1 \le i \le m$ such that
\[ U \cap \overline{\Omega} = \{z \in U : x_m(z) \ge 0\}. \]

That is, under the induced coordinate map $U \cap \overline{\Omega}$ is mapped to a set in the half-space
\[ \mathbb{R}^m_+ := \{x \in \mathbb{R}^m : x_m \ge 0\}. \]
Proposition 4.3.2 If $\Omega$ is a domain with a smooth boundary in $M$, then $\partial\Omega$ is an $(m-1)$-dimensional closed submanifold of $M$. Furthermore, if $M$ is orientable, $\partial\Omega$ is orientable.
Proof We first observe that $\partial\Omega$ is a closed subset of $M$ and can be given the (relative) subspace topology: if $p \in \partial\Omega$ and $(U_p, x^p)$ is a local chart as above, the sets $U_p \cap \partial\Omega = \{z \in U_p : x^p_m(z) = 0\}$ serve as a basis for this topology on $\partial\Omega$. For, if $q \in \partial\Omega$, and if $x^p_m(q) = c > 0$, then by choosing $\delta$ with $0 < \delta < c$, the set
\[ \{z \in U_p : |x^p_i(z) - x^p_i(q)| < \delta,\ 1 \le i \le m\} \]
is an open subset of $\Omega$ and hence $q$ lies in the interior of $\Omega$, a contradiction. Now on $U_p \cap \partial\Omega$, we have an obvious choice of local coordinates, namely, $y^p_i(z) = x^p_i(z)$ for $1 \le i \le m-1$. This is a genuine set of local coordinates, since $U_p \cap \partial\Omega$ is described as the inverse image $(x^p_m)^{-1}(0)$ of a regular value of a smooth map $x^p_m : U_p \to \mathbb{R}$. The overlaps between the charts $(U_p \cap \partial\Omega, y^p)$ are also smooth since those between the charts $(U_p, x^p)$ are smooth.
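It remains to prove the second assertion of the proposition. Let $M$ be an orientable manifold. We fix an orientation of $M$. We can assume, following our notation above, that all the $(U_p, x^p)$ are positively oriented. (Can we?) We then want to show that the charts $\{(U_p \cap \partial\Omega, y^p)\}$ of $\partial\Omega$ are consistently oriented. We assume that for $p, q \in \partial\Omega$, $U_p \cap U_q \neq \emptyset$. For $z \in U_p \cap U_q \cap \partial\Omega$, we have
\[ \frac{\partial x^q_m}{\partial x^p_i}(z) = 0 \quad \text{for } 1 \le i \le m-1. \]
Hence the Jacobian of coordinate changes (at such a $z$) is
\[ \det\left(\frac{\partial x^q_i}{\partial x^p_j}\right)_{1 \le i,j \le m} = \left(\frac{\partial x^q_m}{\partial x^p_m}\right) \det\left(\frac{\partial x^q_i}{\partial x^p_j}\right)_{1 \le i,j \le m-1} \tag{4.3.1} \]
on $U_p \cap U_q$. By assumption, the left side of Equation (4.3.1) is positive. Since the second member of the right side is the Jacobian of the coordinate changes, namely, $\det\left(\frac{\partial y^q_i}{\partial y^p_j}\right)_{1 \le i,j \le m-1}$, it suffices to show that
\[ \frac{\partial x^q_m}{\partial x^p_m}(z) \ge 0 \quad \text{for } z \in U_p \cap U_q \cap \partial\Omega. \]
(Why?) For $z$ as above, we choose $z_k \in U_p \cap U_q$ with the property that $x^p_i(z_k) = x^p_i(z)$ for all $1 \le i \le m-1$ and $x^p_m(z_k) = x^p_m(z) + c_k$ with $c_k > 0$ and $c_k \to 0$. This means that $z_k \to z$. Since $z_k \in U_q$, we have $x^q_m(z_k) > x^q_m(z) = 0$. Hence
\[ \frac{\partial x^q_m}{\partial x^p_m}(z) = \lim_{k \to \infty} \frac{x^q_m(z_k) - 0}{c_k} \ge 0. \]
□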
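Let $M$ be a smooth oriented manifold and $\Omega$ be a domain in $M$ with smooth boundary. We want to fix an orientation on $\partial\Omega$. To motivate our choice of orientation of $\partial\Omega$, let us look at the case when $M = \mathbb{R}^{n+1}$ and $\Omega = \mathbb{R}^{n+1}_+ = \{x : x_{n+1} \ge 0\}$. If $\{e_i\}$ is the canonical basis and $\{u_i\}$ the dual basis, then we have
\[ \Omega = \{x \in \mathbb{R}^{n+1} : u_{n+1}(x) \ge 0\} \]
and
\[ \partial\Omega = \{x \in \mathbb{R}^{n+1} : u_{n+1}(x) = 0\} \simeq \mathbb{R}^n. \]
Now on $M = \mathbb{R}^{n+1}$, we have the orientation determined by the $(n+1)$-form $dx_1 \wedge dx_2 \wedge \cdots \wedge dx_{n+1}$. We identify, as usual, $T_x\mathbb{R}^{n+1}$ with $\mathbb{R}^{n+1}$, so that on each $T_x\mathbb{R}^{n+1}$ we have the form $u_1 \wedge u_2 \wedge \cdots \wedge u_{n+1}$ corresponding to $(dx_1 \wedge dx_2 \wedge \cdots \wedge dx_{n+1})|_p$.
Now the outward normal to $\partial\Omega$ is given by the vector field $p \mapsto -\frac{\partial}{\partial x_{n+1}}\big|_p$, that is, $-e_{n+1}$ under the usual identification. We fix an orientation on $T_p(\partial\Omega) \simeq \mathbb{R}^n$ by requiring that $\{v_1, v_2, \ldots, v_n\}$, a basis of $T_p(\partial\Omega)$, is a positively oriented basis if
\[ (u_1 \wedge u_2 \wedge \cdots \wedge u_{n+1})(-e_{n+1}, v_1, \ldots, v_n) > 0. \]
For example, this means that if we take $v_i = e_i$ we have
\[ \det\begin{pmatrix} -u_1(e_{n+1}) & u_1(e_1) & \cdots & u_1(e_n) \\ -u_2(e_{n+1}) & u_2(e_1) & \cdots & u_2(e_n) \\ \vdots & \vdots & & \vdots \\ -u_{n+1}(e_{n+1}) & u_{n+1}(e_1) & \cdots & u_{n+1}(e_n) \end{pmatrix}. \]
This is positive if and only if $n$ is odd. Hence a positive orientation for $\partial\Omega$ is given by the form
\[ dx_1 \wedge dx_2 \wedge \cdots \wedge dx_n \ \text{ if $n$ is odd}; \qquad -dx_1 \wedge dx_2 \wedge \cdots \wedge dx_n \ \text{ if $n$ is even}. \]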
We keep the above notation. Now for any $p \in \partial\Omega$, we have a positively oriented local chart $(U, x)$ of $p$ in $M$ such that
\[ U \cap \partial\Omega = \{z \in U : x_m(z) = x_m(p)\}. \]
On this chart $U \cap \partial\Omega$, we stipulate, in accordance with the model case above with $n = m - 1$, that the orientation is given by the form
\[ dy_1 \wedge \cdots \wedge dy_{m-1} \ \text{ if $m-1$ is odd}; \qquad -dy_1 \wedge \cdots \wedge dy_{m-1} \ \text{ if $m-1$ is even}. \]
The orientation so obtained on $\partial\Omega$ is said to be a compatible orientation or a boundary orientation.
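Theorem 4.3.3 (Stokes' theorem) Let $M$ be an oriented manifold and $\Omega \subset M$ a domain with smooth boundary. Assume $\overline{\Omega}$ is compact. Let $\omega$ be a smooth $(n-1)$-form on $M$, where $n = \dim M$, and let $i : \partial\Omega \to M$ be the imbedding. Let $\partial\Omega$ be given the compatible boundary orientation. Then
\[ \int_{\partial\Omega} i^*(\omega) = \int_\Omega d\omega. \]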
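Proof We choose a locally finite cover $\{U_\alpha\}$ of $M$ with the following properties:
(1) Each $U_\alpha$ is a coordinate neighborhood.
(2) For each $\alpha$ such that $U_\alpha \cap \overline{\Omega} \neq \emptyset$, either $U_\alpha \subset \Omega$ or $U_\alpha \cap \partial\Omega \neq \emptyset$; in the latter case we assume
\[ U_\alpha \subset \{p : |x^\alpha_i(p)| < \varepsilon \text{ for } 1 \le i \le n\}. \]
Here $U_\alpha \cap \overline{\Omega} = \{q : x_m(q) \ge 0\}$.
Let $\{f_\alpha\}$ be a partition of unity subordinate to $\{U_\alpha\}$. Since $\partial\Omega$ and $\overline{\Omega}$ are compact and $\{U_\alpha\}$ is locally finite, the number of $\alpha$'s such that $U_\alpha \cap \partial\Omega \neq \emptyset$ or $U_\alpha \cap \overline{\Omega} \neq \emptyset$ is finite. Hence we have
\[ \int_{\partial\Omega} i^*(\omega) = \sum_\alpha \int_{\partial\Omega} i^*(f_\alpha\omega), \qquad \int_\Omega d\omega = \sum_\alpha \int_\Omega d(f_\alpha\omega), \]
the sums being finite. Hence the theorem reduces to proving the following: If $M$ is a compact manifold with boundary and $\partial\Omega$ is given the boundary orientation then
\[ \int_{\partial\Omega} i^*(\omega) = \int_\Omega d\omega. \]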
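It is enough to prove the theorem under the additional assumption that $\operatorname{supp}(\omega) \subset U_\alpha$. Let us write $U$ for $U_\alpha$. If $U \cap \overline{\Omega} = \emptyset$, then $\omega$ and $d\omega$ are zero on $\overline{\Omega}$ and hence both the integrals are zero. So, we assume $U \cap \overline{\Omega} \neq \emptyset$. If $U \subset \Omega$, that is, $U$ does not intersect $\partial\Omega$, we have $\int_{\partial\Omega} i^*(\omega) = 0$. Now we write
\[ \omega = \sum_k (-1)^{k-1} f_k\,dx_1 \wedge \cdots \wedge \widehat{dx_k} \wedge \cdots \wedge dx_m \]
so that
\[ d\omega = \sum_k \frac{\partial f_k}{\partial x_k}\,dx_1 \wedge \cdots \wedge dx_m. \]
Therefore
\[ \int_\Omega d\omega = \sum_k \int \frac{\partial f_k}{\partial x_k}\,dx_1 \wedge \cdots \wedge dx_m = 0, \]
since the $f_k$'s have compact support.

If $U \cap \partial\Omega \neq \emptyset$, then the last argument works for all $k \le m-1$. For $k = m$, we have
\[ \int_0^\infty \frac{\partial f_m}{\partial x_m}\,dx_m = -f_m(x_1, \ldots, x_{m-1}, 0) \]
on $\partial\Omega$. Therefore
\[ \int_\Omega d\omega = -\int_{\mathbb{R}^{m-1}} f_m(x_1, \ldots, x_{m-1}, 0)\,dx_1 \cdots dx_{m-1}. \tag{4.3.2} \]
On the other hand, we have
\[ i^*(\omega) = (-1)^{m-1} f_m(x_1, \ldots, x_{m-1}, 0)\,dx_1 \wedge \cdots \wedge dx_{m-1} \]
on $U \cap \partial\Omega$. On $U \cap \partial\Omega$ we have the orientation determined by the form
\[ (-1)^m\,dx_1 \wedge \cdots \wedge dx_{m-1}. \]
Hence we have
\[ \int_{\partial\Omega} i^*(\omega) = -\int_{\mathbb{R}^{m-1}} f_m(x_1, \ldots, x_{m-1}, 0)\,dx_1 \cdots dx_{m-1}. \tag{4.3.3} \]
Comparing Equations (4.3.2) and (4.3.3) we get the result.
□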
As an application of Stokes' theorem, we prove the Brouwer fixed
point theorem via Theorem 4.3.4 below.
Theorem 4.3.4 There exists no smooth map $f : B \to \partial B = S^{n-1}$ of the closed unit ball in $\mathbb{R}^n$ to its boundary such that $f(x) = x$ for all $x \in \partial B$.
Proof Suppose that there is a smooth map $f$ with the property that $f$ is the identity on $\partial B$. We write $f := (f_1, \ldots, f_n)$. We consider the two $(n-1)$-forms defined as follows:
\[ \omega(x) := x_1\,dx_2 \wedge \cdots \wedge dx_n, \qquad \eta(x) := f_1(x)\,df_2(x) \wedge \cdots \wedge df_n(x). \]
Since $f$ is the identity on $\partial B$ we see that $\iota^*\omega = \iota^*\eta$ on $\partial B$, where $\iota$ is the canonical imbedding. We therefore deduce that
\[ \int_{\partial B} \iota^*\omega = \int_{\partial B} \iota^*\eta. \]
We now wish to apply Stokes' theorem to the $n$-forms $d\omega$ and $d\eta$. First we observe that
\[ d\eta = df_1 \wedge \cdots \wedge df_n = 0, \]
since $Df(x)$ takes values in $T_{f(x)}S^{n-1}$, an $(n-1)$-dimensional space. On the other hand, $d\omega = dx_1 \wedge \cdots \wedge dx_n$, the volume form on $B$. Stokes' theorem yields the following absurdity:
\[ \operatorname{volume}(B) = \int_B d\omega = \int_{\partial B} \omega = \int_{\partial B} \eta = \int_B d\eta = 0. \]
This contradiction establishes the result.
□
We now give the proof of the fixed point theorem essentially for
cultural reasons, since the proof is a beautiful mixture of analysis and
elementary topology.
Proposition 4.3.5 Let $f : B \to B$ be smooth. Then $f$ has a fixed point.
Proof If $f(x) \neq x$ for all $x \in B$, we define $g : B \to \partial B$ as follows: we let $g(x)$ be the point on the boundary at which the ray starting from $f(x)$ and going through $x$ meets $\partial B$. In analytical terms, we have $g(x) = x + tv$, where
\[ v = \frac{x - f(x)}{\|x - f(x)\|} \quad \text{and} \quad t = -\langle x, v \rangle + \sqrt{1 - \langle x, x \rangle + \langle x, v \rangle^2}. \]
Then $g : B \to \partial B$ is smooth and $g(x) = x$ for all $x \in \partial B$. This contradicts Theorem 4.3.4.
□
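The map $g$ used in the proof of Proposition 4.3.5 is easy to experiment with. The sketch below is only an illustration: the function name `retraction` is ours, and $t$ is taken to be the non-negative root of $\|x + tv\|^2 = 1$, which is our reading of "the point at which the ray meets $\partial B$". It computes $g(x)$ from $x$ and a given value $f(x) \neq x$.

```python
import math

def retraction(x, fx):
    """Point where the ray from fx through x leaves the closed unit ball.
    Assumes x lies in the closed unit ball and fx != x."""
    d = [a - b for a, b in zip(x, fx)]
    norm = math.sqrt(sum(c * c for c in d))
    v = [c / norm for c in d]                      # unit direction from fx to x
    xv = sum(a * b for a, b in zip(x, v))          # <x, v>
    xx = sum(a * a for a in x)                     # <x, x>
    disc = max(0.0, xv * xv + 1.0 - xx)            # guard against rounding
    t = -xv + math.sqrt(disc)                      # non-negative root of |x + t v| = 1
    return [a + t * b for a, b in zip(x, v)]

inside = retraction((0.2, 0.1), (-0.3, 0.4))
on_boundary = retraction((0.6, 0.8), (0.1, -0.2))
print(inside, math.hypot(*inside))
print(on_boundary)
```

The first output has norm 1, and the second reproduces the boundary point $(0.6, 0.8)$, matching the two properties of $g$ used in the proof.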
Brouwer's fixed point theorem is deduced as in Theorem 2.7.28.
We now show how the classical results such as Green's theorem and
Gauss divergence theorem can be derived from our version of Stokes'
theorem.
Theorem 4.3.6 (Green's theorem) Let $\Omega$ be a domain in $\mathbb{R}^2$ with smooth boundary. Let $\omega = P\,dx + Q\,dy$ be a 1-form on $\Omega$. Then we have
\[ \int_{\partial\Omega} P\,dx + Q\,dy = \int_\Omega \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx \wedge dy. \]
Proof This is an easy consequence of Stokes' theorem. One computes $d\omega$ and shows that it is the 2-form on the right side of Green's formula above.
□
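Green's formula can be checked numerically on the unit disk. The sketch below is illustrative and rests on assumptions of ours: the choice $P = -y^3$, $Q = x^3$, the counterclockwise parametrization of the boundary circle (taken here to be the boundary orientation), and the midpoint rule in polar coordinates. For this choice both sides equal $3\pi/2$.

```python
import math

def boundary_integral(P, Q, steps=2000):
    """Integral of P dx + Q dy over the unit circle, counterclockwise."""
    h = 2 * math.pi / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        x, y = math.cos(t), math.sin(t)
        total += (P(x, y) * (-math.sin(t)) + Q(x, y) * math.cos(t)) * h
    return total

def area_integral(curl, nr=400, nt=400):
    """Integral of curl(x, y) dx dy over the unit disk, in polar coordinates."""
    hr, ht = 1.0 / nr, 2 * math.pi / nt
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) * hr
        for j in range(nt):
            t = (j + 0.5) * ht
            total += curl(r * math.cos(t), r * math.sin(t)) * r * hr * ht
    return total

P = lambda x, y: -y ** 3
Q = lambda x, y: x ** 3
curl = lambda x, y: 3 * x ** 2 + 3 * y ** 2   # dQ/dx - dP/dy, computed by hand

lhs = boundary_integral(P, Q)
rhs = area_integral(curl)
print(lhs, rhs, 1.5 * math.pi)
```

The boundary sum is exact up to rounding for this trigonometric integrand, while the area sum carries a small discretization error.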
Gauss' theorem, on the other hand, requires some preparation, since it involves concepts from Riemannian geometry. We define the divergence of a vector field, $\mathrm{Div}\,X$, by the equation $\mathcal{L}_X dV = \mathrm{Div}(X)\,dV$, where $dV$ is the volume form of the oriented Riemannian manifold. By Cartan's formula (3.5.4), we have $\mathcal{L}_X dV = (d \circ i_X)\,dV$. Now by Stokes' theorem we have
\[ \int_M \mathrm{Div}(X)\,dV = \int_M (d \circ i_X)\,dV = \int_{\partial M} \iota^*(i_X\,dV). \]
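Let $\{e_i\}$ be an oriented orthonormal frame at $p \in \partial M$ with $e_m = \nu_p$, the unit normal. Then
\[ \iota^*(i_X\,dV)(e_1, \ldots, e_{m-1}) = dV(X, e_1, \ldots, e_{m-1}) = \langle X_p, \nu_p \rangle\,\omega(e_1, \ldots, e_{m-1}), \]
where $\omega := i_\nu(dV)$ is the volume form on $T_p(\partial M)$. Thus we get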
Theorem 4.3.7 (Gauss divergence theorem) Let $(M, g)$ be an oriented Riemannian manifold. Let $U$ be a domain in $M$ with smooth boundary $\partial U$. Let $dV$ be the volume element of $M$. If $\nu$ denotes the unit outward normal vector field on the boundary $\partial U$, the volume element $\omega$ of $\partial U$ is given by $\omega := i_\nu(dV)$. Let $X$ be a vector field on $M$. The divergence of $X$ is defined by the equation $\mathcal{L}_X dV := \mathrm{Div}(X)\,dV$. We then have
\[ \int_U \mathrm{Div}(X)\,dV = \int_{\partial U} \langle X, \nu \rangle\,\omega. \tag{4.3.4} \]
□
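We derive an expression for the divergence of a vector field $X$ on $U \subset \mathbb{R}^n$. Let $X := \sum_j f_j\,\partial_j$, where $\partial_j := \frac{\partial}{\partial u_j}$ on $U$. We claim that
\[ \mathrm{Div}\,X = \sum_j \frac{\partial f_j}{\partial u_j}. \tag{4.3.5} \]
First observe that if $dV := du_1 \wedge \cdots \wedge du_n$ is the standard volume form, then
\[ i_{\partial_j}(dV) = (-1)^{j-1}\,du_1 \wedge \cdots \wedge \widehat{du_j} \wedge \cdots \wedge du_n \]
so that
\[ i_X(dV) = \sum_j (-1)^{j-1} f_j\,du_1 \wedge \cdots \wedge \widehat{du_j} \wedge \cdots \wedge du_n. \]
It follows that $d(i_X(dV)) = \sum_j \frac{\partial f_j}{\partial u_j}\,dV$. The claim follows.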

We derive some standard corollaries of the divergence theorem which are essential for analysis on $\mathbb{R}^n$. Let $\Omega$ be a bounded domain in $\mathbb{R}^n$ with smooth boundary $S \equiv \partial\Omega$.
Theorem 4.3.8 Let $f \in C^1(\Omega) \cap C(\overline{\Omega})$. Then
\[ \int_\Omega f_{x_i}\,dx = \int_S f\,\nu_i\,dS, \tag{4.3.6} \]
where $\nu = (\nu_1, \ldots, \nu_n)$ is the unit normal defining the boundary orientation.
Proof Take $X = (0, \ldots, 0, f, 0, \ldots, 0)$ with $f$ at the $i$-th place in (4.3.4).
□
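Theorem 4.3.9 (Integration-by-parts) Let $f, g \in C^1(\Omega) \cap C(\overline{\Omega})$. Then
\[ \int_\Omega f_{x_i}\,g\,dx = -\int_\Omega f\,g_{x_i}\,dx + \int_S f g\,\nu_i\,dS. \tag{4.3.7} \]
If $fg = 0$ on $S$ (in particular if one of them has compact support in $\Omega$), we have
\[ \int_\Omega f_{x_i}\,g\,dx = -\int_\Omega f\,g_{x_i}\,dx. \tag{4.3.8} \]
Proof Apply (4.3.6) with $f$ replaced by $fg$.
□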
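Corollary 4.3.10 (Green's Identities) Let $u, v \in C^2(\Omega) \cap C^1(\overline{\Omega})$. Then
(i) Gauss Law:
\[ \int_\Omega \Delta u\,dx = \int_S \frac{\partial u}{\partial \nu}\,dS. \tag{4.3.9} \]
(ii) First Green's Identity:
\[ \int_\Omega \nabla u \cdot \nabla v\,dx = -\int_\Omega u\,\Delta v\,dx + \int_S \frac{\partial v}{\partial \nu}\,u\,dS. \tag{4.3.10} \]
(iii) Second Green's Identity:
\[ \int_\Omega (u\,\Delta v - v\,\Delta u)\,dx = \int_S \left(u\,\frac{\partial v}{\partial \nu} - v\,\frac{\partial u}{\partial \nu}\right) dS. \tag{4.3.11} \]
Proof Using (4.3.7) with $u_{x_i}$ in place of $f$ and $g = 1$ we see that
\[ \int_\Omega u_{x_i x_i}\,dx = \int_S u_{x_i}\,\nu_i\,dS. \]
Summing over $i$ yields (i).

To prove (ii), invoke (4.3.7) with $f = u$ and $g = v_{x_i}$.
Interchanging $u$ and $v$ in (4.3.10) and subtracting will result in (iii).
□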
Chapter 5
Riemannian Geometry
5.1 Covariant differentiation

Let $M$ be any manifold. Given a vector field $X$ on $M$ and $f \in C^\infty(M)$, we speak of the derivative of $f$ in the $X$-direction as the function
\[ Xf : p \mapsto Xf(p) := X_p(f). \]
Notice that (once again) $X_p(f)$ depends only on the values of $f$ along any curve $\gamma$ through $p$ and having $X_p$ as its tangent vector at $p$: if $\gamma(0) = p$ and $\gamma'(0) = D\gamma(0)\left(\frac{d}{dt}\right) = X_p \in T_pM$, then
\[ X_p(f) = \frac{d}{dt}\,(f \circ \gamma(t))\Big|_{t=0}. \]
We now ask: Is there a notion of derivative $D_vY$ of a vector field $Y$ along $v \in T_pM$? As has been emphasized earlier, as soon as we think of differentiating something with respect to something else, the natural thing is to differentiate along curves. First of all we would like to define $D_XY(p) = D_{X_p}Y$, assuming that we succeeded in defining the right hand side object, so that $D_XY \in \mathfrak{X}(M)$ for all $X, Y \in \mathfrak{X}(M)$.
Now let $\gamma$ be a curve through $p$ with $\gamma'(0) = X_p = v$ (say). We then would like to define
\[ (D_XY)(p) = D_vY = \frac{d}{dt}\,Y \circ \gamma(t)\Big|_{t=0} = \lim_{h \to 0} \left[\frac{Y(\gamma(h)) - Y(\gamma(0))}{h}\right]. \tag{5.1.1} \]
But wait a second, there is no meaning for the symbol within [ ] on the right hand side of Equation (5.1.1)! For, $Y(\gamma(h)) \in T_{\gamma(h)}M$ and $Y(\gamma(0)) \in T_pM$ and so it makes no sense to talk of their difference. Thus our naive attempt to define $D_vY$ fails.

Figure 5.1.1 Parallel Transport
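Is everything lost? Let us look at the most special case, which is dear to our heart. Let $M = \mathbb{R}^n$. The situation in Equation (5.1.1) for $\mathbb{R}^n$ looks as in Figure 5.1.1. There is a natural way to make sense out of Equation (5.1.1).
Let
\[ Y_h := Y_{\gamma(h)} = \sum_i Y_i(\gamma(h))\,\frac{\partial}{\partial u_i}\Big|_{\gamma(h)}, \]
where $Y_i(\gamma(h)) = Y_h(u_i)$, the $u_i$ being the natural (usual) coordinates on $\mathbb{R}^n$. We can therefore translate it parallelly to get a tangent vector in $T_{\gamma(0)}M$, namely, define $\tau_0(Y_h) = \sum_i Y_i(\gamma(h))\,\frac{\partial}{\partial u_i}\big|_{\gamma(0)}$. Call this vector $\tau_0(Y_h) = w(h) \in T_pM$. Then we can modify Equation (5.1.1) as
\[ D_vY = \frac{d}{dh}\Big|_{h=0} w(h) = \lim_{h \to 0} \frac{w(h) - w(0)}{h}. \tag{5.1.2} \]
Since $w(h) = \sum_i Y_i(\gamma(h))\,\frac{\partial}{\partial u_i}\big|_{\gamma(0)}$, Equation (5.1.2) now reads:
\[ D_vY = \lim_{h \to 0} \frac{1}{h} \sum_i \left[Y_i(\gamma(h)) - Y_i(\gamma(0))\right] \frac{\partial}{\partial u_i}\Big|_{\gamma(0)} = \sum_i \gamma'(0)(Y_i)\,\frac{\partial}{\partial u_i}\Big|_{\gamma(0)}. \]
In other words,
\[ D_vY = (v(Y_1), \ldots, v(Y_n)), \tag{5.1.3} \]
if we think of a vector field $Z = \sum_i Z(u_i)\,\frac{\partial}{\partial u_i} = \sum_i Z_i\,\frac{\partial}{\partial u_i}$ simply as $(Z_1, \ldots, Z_n)$.
Let us summarize our findings in the form of a proposition for later use.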
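Proposition 5.1.1 Given two vector fields $X, Y$ on $\mathbb{R}^n$, if we define
\[ D_XY = \sum_i X(Y_i)\,\frac{\partial}{\partial u_i}, \quad \text{where } Y = \sum_i Y_i\,\frac{\partial}{\partial u_i}, \]
then $D_XY(p)$ depends only on the value $X_p$ of $X$ at $p$. That is, we can find $D_{X_p}Y$ as soon as we know $Y$ along any curve $\gamma$ through $p$ having $X_p$ as its tangent vector at $p$.
We call $D_XY$ the covariant derivative of $Y$ with respect to $X$. What made things work on $\mathbb{R}^n$ was the existence of a natural parallelism or a parallel transport from $T_q\mathbb{R}^n$ to $T_p\mathbb{R}^n$, $\tau_{pq} : T_q\mathbb{R}^n \to T_p\mathbb{R}^n$, given by
\[ \tau_{pq}\Big(\sum_i a_i\,\frac{\partial}{\partial u_i}\Big|_q\Big) = \sum_i a_i\,\frac{\partial}{\partial u_i}\Big|_p. \]
The map $D : \mathfrak{X}(M) \times \mathfrak{X}(M) \to \mathfrak{X}(M)$ given by $(X, Y) \mapsto D_XY$ is called the covariant differentiation. Before we start investigating the properties of $D$ we would like to make a remark.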
Remark 5.1.2 (The reader can omit this on first reading.) The formula in Equation (5.1.3) for $D_XY$ above clearly brings out the distinction between the covariant derivative $D_XY$ and the Lie derivative $L_XY = [X, Y]$: in $D_XY$, $X$ does not get differentiated, whereas in $L_XY$ the coefficients of $X$ get differentiated with respect to $Y$. Compare property (3) below with (2.7.1).
Properties of $D$ on $M := \mathbb{R}^n$
Since $\mathfrak{X}(M)$ is a $C^\infty(M)$-module it is natural to ask what happens when we take $fX$ in place of $X$ and $gY$ instead of $Y$. (Is there any relation between $D_XY$ and $D_YX$, etc.?) We have for all $X, Y, Z \in \mathfrak{X}(M)$ and for all $f, g \in C^\infty(M)$
1. $D_X(Y + Z) = D_XY + D_XZ$

2. $D_{X+Y}Z = D_XZ + D_YZ$
3. $D_{fX}Y = f\,D_XY$
4. $D_X(gY) = g\,D_XY + X(g)\,Y$
5. $D_XY - D_YX = [X, Y]$
6. $Z\langle X, Y \rangle = \langle D_ZX, Y \rangle + \langle X, D_ZY \rangle$

The last property needs explanation, especially on the notation. We shall deal with this after attending to the rest. The proofs of all these are easy. Hence we indicate proofs for the last three.
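We have
\[ D_{fX}Y = (fX(Y_1), \ldots, fX(Y_n)) = f\,(X(Y_1), \ldots, X(Y_n)) = f\,D_XY. \]
Thus Property (3) follows. Property (4) is seen as follows.
\[ D_X(gY) = (X(gY_1), \ldots, X(gY_n)) = (gX(Y_1) + X(g)Y_1, \ldots, gX(Y_n) + X(g)Y_n) = g\,D_XY + X(g)\,Y. \]
Property (5) follows thus:
\[ D_XY - D_YX = (X(Y_1) - Y(X_1), \ldots, X(Y_n) - Y(X_n)) = ([X, Y](u_1), \ldots, [X, Y](u_n)) = [X, Y]. \]
We now explain the notation of Property (6) and then prove it. On each $T_p\mathbb{R}^n$, we have a natural inner product $g_p$ defined as follows. Let $v = \sum_i v_i\,\frac{\partial}{\partial u_i}\big|_p$ and $w = \sum_i w_i\,\frac{\partial}{\partial u_i}\big|_p$. Then we set
\[ g_p(v, w) = \langle v, w \rangle = \sum_{i=1}^n v_i w_i. \]
Now given $X, Y \in \mathfrak{X}(\mathbb{R}^n)$, the map $p \mapsto \langle X, Y \rangle(p) = \langle X_p, Y_p \rangle := g_p(X_p, Y_p)$ is a smooth function on $\mathbb{R}^n$. That is, $p \mapsto \sum_i X_i(p)Y_i(p)$ is smooth, where $X := \sum_i X_i\,\frac{\partial}{\partial u_i}$ with $X_i \in C^\infty(\mathbb{R}^n)$, and similarly for $Y$. Call this function $f$. Let $Z \in \mathfrak{X}(\mathbb{R}^n)$. We then have
\[ Zf = Z\Big(\sum_i X_iY_i\Big) = \sum_i \big(Z(X_i)\,Y_i + X_i\,Z(Y_i)\big). \tag{5.1.4} \]
Similarly
\[ \langle D_ZX, Y \rangle + \langle X, D_ZY \rangle = \sum_i Z(X_i)\,Y_i + \sum_i X_i\,Z(Y_i). \tag{5.1.5} \]
From the Equations (5.1.3), (5.1.4) and (5.1.5) we get Property (6).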
Definition 5.1.3 Let $M$ be any manifold. A map $D : \mathfrak{X}(M) \times \mathfrak{X}(M) \to \mathfrak{X}(M)$ satisfying Properties (1)-(4) is called an (affine) connection or covariant differentiation. (The reason for the nomenclature 'connection' lies in the motivation with which we started, namely, the existence of an isomorphism $\tau_{pq} : T_qM \to T_pM$ along a curve. More on this in Section 5.2.) If $D$ satisfies Property (5) also, then it is said to be symmetric.
To say $D$ 'on $M$' has Property (6) we need the notion of a Riemannian metric, that is, an inner product on $T_pM$ for all $p \in M$ which varies smoothly. This is the subject matter of Section 5.2.
Before looking for connections on general manifolds, let us apply our knowledge of $D$ on $\mathbb{R}^n$ to find candidates for the covariant differentiation on submanifolds of $\mathbb{R}^n$. Let us assume for simplicity that $S$ is an $n$-dimensional submanifold (usually called a hypersurface) of $\mathbb{R}^{n+1}$. We denote by $\overline{D}$ what was $D$ above: namely, the covariant differentiation defined by $\overline{D}_XY = (X(Y_1), \ldots, X(Y_{n+1}))$.
So, given $X, Y \in \mathfrak{X}(S)$ we want to define $D_XY$, another vector field on $S$. How about setting $D_{X_p}Y = \overline{D}_{X_p}Y$? Notice that the right side makes sense: if $\gamma$ is a curve in $S$ such that $\gamma(0) = p$ and $\gamma'(0) = X_p$, then $\gamma : (-\varepsilon, \varepsilon) \to S \hookrightarrow \mathbb{R}^{n+1}$ is a curve in $\mathbb{R}^{n+1}$ with the same properties. By Proposition 5.1.1 applied to $\overline{D}$, $\overline{D}_{X_p}Y$ is well defined. The only trouble here is in a different direction, namely, $\overline{D}_{X_p}Y$ need not be a tangent vector to $S$ at $p$. That is, we may have $\overline{D}_{X_p}Y \notin T_p(S)$, even though $X, Y \in \mathfrak{X}(S)$.
To convince you that this unpleasantness is quite 'normal', we shall
give a simple example.
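Let $S = S^n \subset \mathbb{R}^{n+1}$, say $S^1 \subset \mathbb{R}^2$. Take $Y = -y\frac{\partial}{\partial x} + x\frac{\partial}{\partial y}$, a tangent vector field on $S$. Consider the curve $\gamma : t \mapsto (\cos t, \sin t)$ in $S$ with $\gamma'(t) = -\sin t\,\frac{\partial}{\partial x} + \cos t\,\frac{\partial}{\partial y}$, that is, $\gamma'(t) = -y\frac{\partial}{\partial x} + x\frac{\partial}{\partial y}$, and
\[ \bar D_{\gamma'(t)}\gamma'(t) = \text{"}\Big(\big(-y\tfrac{\partial}{\partial x} + x\tfrac{\partial}{\partial y}\big)(-y),\ \big(-y\tfrac{\partial}{\partial x} + x\tfrac{\partial}{\partial y}\big)(x)\Big)\text{"} = \text{"}-(x,y)\text{"} = -\Big(x\frac{\partial}{\partial x} + y\frac{\partial}{\partial y}\Big). \]
This is the normal field $n$, that is, at each point $p = (x, y)$, this vector is perpendicular to $T_p(S)$ (with respect to the inner product on $\mathbb{R}^2$: we have $T_p(\mathbb{R}^2) = T_p(S) \oplus \mathbb{R}n_p$, an orthogonal direct sum).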
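But there is a way out! Why not define $D_XY = (\bar D_XY)^T$, where $(\bar D_XY)^T$ is the tangential component: $T_p(\mathbb{R}^{n+1}) = T_p(S) \oplus (T_p(S))^\perp$ with respect to the natural inner product given by
\[ \Big\langle \frac{\partial}{\partial u_i}, \frac{\partial}{\partial u_j} \Big\rangle = \delta_{ij} \]
on $\mathbb{R}^{n+1}$? Given $p \in S$, we can find a neighborhood $U$ of $p$ in $S$ such that there exists a unit normal vector field $N$ on $U$ so that $T_q\mathbb{R}^{n+1} = T_q(S) \oplus \mathbb{R}N_q$, for all $q \in U$. (Verify this. Hint: Gram-Schmidt orthogonalization applied to the coordinate fields.) Thus we can set
\[ (\bar D_XY)^T = \bar D_XY - \langle \bar D_XY, N\rangle N. \]
Thus we define
\[ D_XY := \bar D_XY - \langle \bar D_XY, N\rangle N. \tag{5.1.6} \]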
For instance in the example above, namely, $S = S^1 \subset \mathbb{R}^2$, we shall have $D_{\gamma'(t)}Y = 0$. (Did you notice that $Y = \gamma'(t)$?)
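The circle example can be checked numerically. The following is a minimal sketch, assuming only the standard library; the helper names `ambient_derivative` and `tangential_part` are illustrative and not from the text. It differentiates $Y$ along $\gamma$ in the ambient plane and then removes the normal component.

```python
import math

def ambient_derivative(field, gamma, t, h=1e-6):
    """Central-difference derivative of t -> field(gamma(t)), i.e. D-bar along gamma."""
    fp = field(*gamma(t + h))
    fm = field(*gamma(t - h))
    return tuple((a - b) / (2 * h) for a, b in zip(fp, fm))

def tangential_part(vec, unit_normal):
    """Subtract the normal component: vec - <vec, n> n."""
    dot = sum(a * b for a, b in zip(vec, unit_normal))
    return tuple(a - dot * b for a, b in zip(vec, unit_normal))

gamma = lambda t: (math.cos(t), math.sin(t))   # the unit circle
Y = lambda x, y: (-y, x)                       # Y = -y d/dx + x d/dy

t0 = 0.8
dbar = ambient_derivative(Y, gamma, t0)        # close to -(cos t0, sin t0): purely normal
normal = gamma(t0)                             # on the unit circle the position is a unit normal
d_tan = tangential_part(dbar, normal)          # close to (0, 0)
```

Up to discretization error, `dbar` points along the inward normal and `d_tan` vanishes, in agreement with the two computations above.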
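Even though we succeeded in defining $D_XY$, for $X, Y \in \mathfrak{X}(S)$, we must first of all check that $D_XY \in \mathfrak{X}(S)$, that is, whether it is smooth, and then whether it has the properties enjoyed by $\bar D$ on $\mathbb{R}^{n+1}$. We will check these details now.
$D_XY$ is smooth on $S$ if $X, Y \in \mathfrak{X}(S)$: Given $p \in S$ choose adapted coordinate neighborhoods $U$ and $\tilde U$ of $p$ in $S$ and $\mathbb{R}^{n+1}$, and extend every vector field in sight to $\tilde U$. Call them $\tilde X$, $\tilde Y$, $\tilde N$, etc. Then
\[ \bar D_{X_q}Y = (\bar D_{\tilde X}\tilde Y)(q) \quad \text{for } q \in U \]
(so that in the defining equation for $D$ the functions are all smooth on $U$). For, if $\tilde Y = \sum_{i=1}^{n+1} Y_i \frac{\partial}{\partial u_i}$ in $\tilde U$, then
\[ (\bar D_{\tilde X}\tilde Y)(q) = \sum_{i=1}^{n+1} X_q(Y_i)\,\frac{\partial}{\partial u_i}\Big|_q \]
for all $q \in U$. The right side depends only on $X_q$ and the values of $Y$ on $U \subset S$.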
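That $D : \mathfrak{X}(S) \times \mathfrak{X}(S) \to \mathfrak{X}(S)$ satisfies Properties (1)-(6) is easily verified. Since $T_pS \subset T_p\mathbb{R}^{n+1}$, $T_pS$ has a natural inner product which smoothly varies as $p$ does. For, if $p \in U$ is as above, then $q \mapsto \langle X_q, Y_q\rangle$ is the restriction of the smooth function $z \mapsto \langle \tilde X_z, \tilde Y_z\rangle$ to $U$. Now it is trivial to see that Property (6) also holds for $D$.
For $q \in U$ as above, we have
\[ D_XY - D_YX = \bar D_{\tilde X}\tilde Y - \bar D_{\tilde Y}\tilde X - \langle \bar D_{\tilde X}\tilde Y - \bar D_{\tilde Y}\tilde X, \tilde N\rangle \tilde N = [\tilde X, \tilde Y] - \langle [\tilde X, \tilde Y], \tilde N\rangle \tilde N = [\tilde X, \tilde Y] = [X, Y]. \]
The last follows from Lemma 2.7.16. Thus $D$ satisfies Properties (1)-(6).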
Exercise 5.1.4 If $\nabla : \mathfrak{X}(\mathbb{R}^{n+1}) \times \mathfrak{X}(\mathbb{R}^{n+1}) \to \mathfrak{X}(\mathbb{R}^{n+1})$ satisfies Properties (1)-(6) then $\nabla = D$.
Exercise 5.1.5 Properties (1)-(6) were obtained since we wanted to find the relation between $D$ and the operations such as $X \mapsto fX$, $Y \mapsto [X, Y]$, etc., on $\mathfrak{X}(\mathbb{R}^{n+1})$. Can the reader think of something which we ought to explore along these lines?
5.2 Riemannian metrics

Definition 5.2.1 Let $M$ be a smooth manifold. A Riemannian metric $g$ on $M$ is a map $p \mapsto g_p$ where $g_p$ is a positive definite inner product on $T_pM$.
We require this map to be smooth in the sense that for all coordinate neighborhoods $(U, x)$, the functions $p \mapsto g_{ij}(p) = g_p\big(\frac{\partial}{\partial x_i}\big|_p, \frac{\partial}{\partial x_j}\big|_p\big)$ are smooth for all $i, j$. This smoothness condition is the same as requiring that for all $X, Y \in \mathfrak{X}(M)$, $p \mapsto g_p(X_p, Y_p)$ is smooth. (Verify this.)
A more sophisticated way of defining a Riemannian metric is that it is a symmetric covariant tensor field $g$ of rank 2 on $M$ which is positive definite in the sense that $g(X, X)(p) > 0$ if $X_p \neq 0$ for $X \in \mathfrak{X}(M)$.
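Example 5.2.2 Let $M = \mathbb{R}^m$, with the global coordinates $u_i$. Then we set
\[ g_p\Big(\frac{\partial}{\partial u_i}\Big|_p, \frac{\partial}{\partial u_j}\Big|_p\Big) = \delta_{ij}, \]
that is, $T_pM$ has the usual inner product when we identify $T_p\mathbb{R}^m$ with $\mathbb{R}^m$ in the canonical way:
\[ \frac{\partial}{\partial u_i}\Big|_p \mapsto e_i = (0, \ldots, 0, 1, 0, \ldots, 0) \]
where 1 is in the $i$-th place.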
Example 5.2.3 If $M$ is a smooth manifold, then there always exist smooth Riemannian metrics on $M$. This is proved using a partition of unity. Let $M$ be a smooth manifold. If $U$ is a coordinate open set with local coordinates $x_i$ then there is a Riemannian metric on $U$. For if $p \in U$ and $v, w$ are in $T_pM$ then we can write $v = \sum v_i \frac{\partial}{\partial x_i}$ and $w = \sum w_i \frac{\partial}{\partial x_i}$. We set $g_p(v, w) := \sum v_i w_i$. Then $g_p$ is a positive definite inner product on $T_pM$ for $p \in U$. We now use a partition of unity to patch these to get a Riemannian metric on $M$. Let $\{U_\alpha\}$ be an open cover of $M$ such that each $U_\alpha$ is the domain of a coordinate chart and such that the closure $\bar U_\alpha$ is compact. Let $\{V_i\}$ be a locally finite refinement of $\{U_\alpha\}$. Notice that each $V_i$ is a coordinate neighborhood and the closure $\bar V_i$ is compact. On each $V_i$ we have a Riemannian metric $g^i$. If $\{f_i\}$ is a partition of unity subordinate to the cover $\{V_i\}$ then we set
\[ g_p(v, w) := \sum_i f_i(p)\, g^i_p(v, w). \]
Since $f_i(p) \neq 0$ for finitely many $i$ only, the right hand side above is a finite sum. It is clear that $g_p$ is a positive definite inner product on $T_pM$. The map $p \mapsto g_{ij}(p)$ is smooth. (Check.)
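Example 5.2.4 If $(M, \bar g)$ is a Riemannian manifold and $\varphi : S \to M$ is a submanifold, then there is an induced Riemannian metric $g$ on $S$ "pulled back via $\varphi$": for $X, Y \in \mathfrak{X}(S)$ define
\[ g_p(X_p, Y_p) := \bar g_{\varphi(p)}\big(D\varphi(p)X_p, D\varphi(p)Y_p\big). \]
First notice that the right side is defined. To show that $g$ is smooth, we use adapted coordinate systems to show that $p \mapsto g_{ij}(p)$ is the composition $p \mapsto \bar g_{ij} \circ \varphi(p)$.
As a concrete example, let $S$ be a surface in $\mathbb{R}^3$. That is, $S$ is a 2-dimensional submanifold of $\mathbb{R}^3$. Let $(U, \varphi)$ be a coordinate chart and $\psi = \varphi^{-1}$ the corresponding parameterization. We let the coordinates on $\mathbb{R}^2$ be $(u, v)$ and the induced coordinates on $U$ be $x_1, x_2$. Then the coordinate vector fields $\frac{\partial}{\partial x_1} = D\psi\big(\frac{\partial}{\partial u}\big)$ and $\frac{\partial}{\partial x_2} = D\psi\big(\frac{\partial}{\partial v}\big)$ are written with respect to the coordinate fields $\frac{\partial}{\partial u_i}$ for $1 \le i \le 3$ on $\mathbb{R}^3$ as follows: Let
\[ \psi(u, v) = (x(u, v), y(u, v), z(u, v)) \]
where $x(u, v) = u_1 \circ \psi(u, v)$ etc. Then
\[ \frac{\partial}{\partial x_1} = \frac{\partial x}{\partial u}\frac{\partial}{\partial u_1} + \frac{\partial y}{\partial u}\frac{\partial}{\partial u_2} + \frac{\partial z}{\partial u}\frac{\partial}{\partial u_3}. \]
Similarly
\[ \frac{\partial}{\partial x_2} = \frac{\partial x}{\partial v}\frac{\partial}{\partial u_1} + \frac{\partial y}{\partial v}\frac{\partial}{\partial u_2} + \frac{\partial z}{\partial v}\frac{\partial}{\partial u_3}. \]
Therefore
\[ g_{11} = x_u^2 + y_u^2 + z_u^2, \qquad g_{12} = x_ux_v + y_uy_v + z_uz_v, \qquad g_{22} = x_v^2 + y_v^2 + z_v^2. \]
In classical notation $g_{11} = E$, $g_{12} = F$, and $g_{22} = G$. Thus $g = \begin{pmatrix} E & F \\ F & G \end{pmatrix}$ in coordinate representation.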
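Example 5.2.5 Let $\varphi : M = \mathbb{R}^+ \times (0, 2\pi) \to \mathbb{R}^2 \setminus \{(x, 0) : x \ge 0\}$ be the polar coordinates: $\varphi(r, \theta) = (r\cos\theta, r\sin\theta)$. Then we can pull back the Riemannian metric $\bar g$ on $\mathbb{R}^2$ to $M$ as above. That is, declare $g_{11} = g\big(\frac{\partial}{\partial r}, \frac{\partial}{\partial r}\big) = \bar g\big(D\varphi(\frac{\partial}{\partial r}), D\varphi(\frac{\partial}{\partial r})\big)$, etc. Then we have $g_{11} = 1$, $g_{22} = r^2$, $g_{12} = 0$ so that $g = \begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix}$. We usually think of this as the representation of the metric on $\varphi(M)$ with respect to the polar coordinates. (Check the expressions for $g_{ij}$ above.)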
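The check requested above can also be done numerically. This is a minimal sketch, assuming only the standard library; `jacobian` and `pullback_metric` are illustrative helper names. It approximates $D\varphi$ by central differences and forms $g_{ij} = \langle D\varphi(e_i), D\varphi(e_j)\rangle$.

```python
import math

def jacobian(phi, p, h=1e-6):
    """Return the images D phi(e_j) of the coordinate vectors, by central differences."""
    cols = []
    for j in range(len(p)):
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        fp, fm = phi(*plus), phi(*minus)
        cols.append([(a - b) / (2 * h) for a, b in zip(fp, fm)])
    return cols

def pullback_metric(phi, p):
    """g_ij = <D phi(e_i), D phi(e_j)> with the Euclidean inner product on the target."""
    cols = jacobian(phi, p)
    n = len(cols)
    return [[sum(a * b for a, b in zip(cols[i], cols[j])) for j in range(n)]
            for i in range(n)]

polar = lambda r, theta: (r * math.cos(theta), r * math.sin(theta))
g = pullback_metric(polar, (2.0, 0.7))   # approximately [[1, 0], [0, r^2]] with r = 2
```

The same two helpers apply unchanged to any parameterization into a Euclidean space, so they can be reused to sanity-check hand computations of a first fundamental form.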
Example 5.2.6 Let $S = S^n(R) = \{x \in \mathbb{R}^{n+1} : \sum_{i=1}^{n+1} x_i^2 = R^2\}$. Let $S$ be given the Riemannian metric $g$ induced from $\bar g$ on $\mathbb{R}^{n+1}$. Let $\varphi : \mathbb{R}^n \to S$ be the inverse of the stereographic projection from the north pole. Then
(Compare this with the formula (2.1.3) on page 68.) We pull the Riemannian metric on $S$ back to $\mathbb{R}^n$ via $\varphi$.
For $v \in T_p\mathbb{R}^n$, $D\varphi(p)(v) = \gamma'(0)$, where $\gamma(t) = \varphi(p + tv)$. Therefore,
(Check this.) Therefore,
\[ \langle D\varphi(p)(v), D\varphi(p)(w)\rangle_{\mathbb{R}^{n+1}} = g(v, w) = \frac{4R^4}{(|p|^2 + R^2)^2}\langle v, w\rangle = \frac{4}{(1 + (|p|/R)^2)^2}\langle v, w\rangle. \]
(Check this.) This metric $g$ on $\mathbb{R}^n$ is called the spherical metric.
Example 5.2.7 Let $M = H = \{(x, y) \in \mathbb{R}^2 : y > 0\}$ be the upper half plane, with the usual coordinates. Set $g_p = \begin{pmatrix} y^{-2} & 0 \\ 0 & y^{-2} \end{pmatrix}$ at $p = (x, y)$. Then $g$ is called the Poincaré or hyperbolic metric on $H$ and $(H, g)$ is called the Poincaré hyperbolic plane.
The classical notation for a Riemannian metric is given by the line element $ds^2 = \sum_{i,j} g_{ij}\,dx_i\,dx_j$. The reason for this notation is the following. Let $\gamma : [a, b] \to M$ be a smooth curve. Then $\gamma'(t) \in T_{\gamma(t)}M$. We then define the length of $\gamma$ by setting
\[ \ell(\gamma) = \int_a^b \big(g_{\gamma(t)}(\gamma'(t), \gamma'(t))\big)^{1/2}\,dt. \tag{5.2.1} \]
Equation (5.2.1) is the obvious generalization of the familiar formula for the arc-length of a curve in $\mathbb{R}^n$:
\[ \ell(\gamma) = \int_a^b \Big(\sum_{i=1}^n \gamma_i'(t)^2\Big)^{1/2}\,dt \]
with $\gamma = (\gamma_1, \gamma_2, \ldots, \gamma_n)$ where $\mathbb{R}^n$ is given the usual Riemannian metric as in Example 5.2.2.
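Formula (5.2.1) is easy to evaluate numerically. The sketch below is an illustration only, assuming the standard library and a hypothetical helper `curve_length` that applies the composite Simpson rule. It measures the vertical segment from $(0, 1)$ to $(0, e)$ in the hyperbolic metric of Example 5.2.7 with two different parameterizations.

```python
import math

def curve_length(metric, gamma, velocity, a, b, n=2000):
    """Composite Simpson rule for the integral of sqrt(g(gamma', gamma')) over [a, b]; n must be even."""
    def speed(t):
        g = metric(gamma(t))
        v = velocity(t)
        dim = len(v)
        return math.sqrt(sum(g[i][j] * v[i] * v[j] for i in range(dim) for j in range(dim)))
    h = (b - a) / n
    total = speed(a) + speed(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * speed(a + k * h)
    return total * h / 3

# Hyperbolic metric on the upper half plane: g = y^(-2) * identity.
hyperbolic = lambda p: [[p[1] ** -2, 0.0], [0.0, p[1] ** -2]]

# Same segment, traversed as t -> (0, t) on [1, e] and as t -> (0, e^t) on [0, 1].
length_slow = curve_length(hyperbolic, lambda t: (0.0, t), lambda t: (0.0, 1.0), 1.0, math.e)
length_fast = curve_length(hyperbolic, lambda t: (0.0, math.exp(t)),
                           lambda t: (0.0, math.exp(t)), 0.0, 1.0)
```

Both values agree with $\int_1^e dt/t = 1$ up to quadrature error, which illustrates that the length does not depend on the parameterization.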
Exercise 5.2.8 Let $(M, g)$ be a Riemannian manifold. Show that $\ell(\gamma)$ is independent of the parameterization. Thus the distance is the same, independent of whether your train is fast or slow! So we usually parameterize the curve by its arc length:
\[ s(t) = \int_a^t \big(g_{\gamma(\tau)}(\gamma'(\tau), \gamma'(\tau))\big)^{1/2}\,d\tau. \]
Then $\big(\frac{ds}{dt}\big)^2 = g\big(\frac{d\gamma}{dt}, \frac{d\gamma}{dt}\big)$. Assume $\gamma$ lies in a single coordinate chart $(U, x)$ so that
\[ s = \ell(t) = \int_a^t \Big(\sum_{i,j} g_{ij}(x(\tau))\,\frac{dx_i}{d\tau}\frac{dx_j}{d\tau}\Big)^{1/2}\,d\tau \]
whence it follows that
\[ ds^2 = \sum_{i,j} g_{ij}\,dx_i\,dx_j := I. \]
In the case of Example 5.2.4, namely, surfaces $S \subset \mathbb{R}^3$, in classical notation
\[ ds^2 = E\,du^2 + 2F\,du\,dv + G\,dv^2 \]
and it is called the first fundamental form of the surface and is denoted by $I$.
Exercise 5.2.9 Calculate $I$ for the following surfaces with parameterization given as below. (parameterization $:= \varphi^{-1}$, if $(U, \varphi)$ is a chart.)
1. Surface of revolution: Let $S$ be a surface of revolution $(u, v) \mapsto (u\cos v, u\sin v, f(u))$ with $a < u < b$ and $0 < v < 2\pi$. See (e) in Figure 5.2.2 for the surface of revolution got by revolving a part of the hyperbola $x^2 - y^2 = 1$.
2. Saddle surface: Let $S$ be the saddle surface given by $z = xy$.
3. Torus: Let $S$ be the torus, considered as a surface of revolution in $\mathbb{R}^3$, by revolving the curve $C = (R + r\cos u, r\sin u)$ in the $(x, z)$-plane around the $z$-axis. Assume $R > r$ and $0 < u, v < 2\pi$. Then $(u, v) \mapsto ((R + r\cos u)\cos v, (R + r\cos u)\sin v, r\sin u)$ is a parameterization of $S$.
4. Helicoid: $(u, v) \mapsto (u\cos v, u\sin v, bv)$ with $b \neq 0$.
5. Sphere: $(u, v) \mapsto (\cos u\cos v, \cos u\sin v, \sin u)$, where $-\frac{\pi}{2} < u < \frac{\pi}{2}$. Note that the image is $S^2 \setminus$ poles.
6. Cylinder: $(u, v) \mapsto (\cos u, \sin u, v)$.
Exercise 5.2.10 Calculate the first fundamental form for the flat torus in $\mathbb{R}^4$: $(u, v) \mapsto (\cos u, \sin u, \cos v, \sin v)$.
Pictures of some of these surfaces can be found in the next couple of pages. These pictures were drawn using MATHEMATICA® software. For more such pictures, refer to [13].
Figure 5.2.1 Some surfaces: (a) Torus, (b) Helicoid, (c) Sphere, (d) Cylinder
Figure 5.2.2 Some surfaces (continued): (e) Surface of revolution, (f) Saddle surface, (g) Cone, (h) Möbius strip

Local isometry and isometry

Definition 5.2.11 Let $(M, g)$ and $(\bar M, \bar g)$ be two Riemannian manifolds. Then a smooth map $\varphi : M \to \bar M$ is said to be a local isometry if for all $p \in M$ and $u, v \in T_pM$ we have $\langle u, v\rangle = \langle D\varphi(u), D\varphi(v)\rangle$, that is, if the following holds:
\[ g_p(u, v) = \bar g_{\varphi(p)}\big(D\varphi(p)u, D\varphi(p)v\big), \qquad p \in M,\ u, v \in T_pM. \]
Thus each $D\varphi(p)$ is a linear isometry of $T_pM$ onto $T_{\varphi(p)}\bar M$.
If a local isometry $\varphi$ is a diffeomorphism of $M$ onto $\bar M$, then $\varphi$ is said to be an isometry.
Notice that a local isometry is a local diffeomorphism.
Example 5.2.12 Let $M = \bar M = \mathbb{R}^n$ with the usual Euclidean metric. Then any rigid motion $B$ is an isometry. Recall that $B = A \circ T_x$, where $A$ is an orthogonal transformation and $T_x$ is the translation $y \mapsto x + y$. The verification is left as an exercise to the reader.
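Example 5.2.13 Let $M = \mathbb{R}^2$ with the usual Riemannian metric and let $\bar M$ be the cylinder $\{(x, y, z) : x^2 + y^2 = 1\}$ in $\mathbb{R}^3$ with the induced Riemannian metric. Then $\varphi : \mathbb{R}^2 \to \bar M$ given by $(u, v) \mapsto (\cos u, \sin u, v)$ is a local isometry which is not a global isometry.
We have $T_pM = \mathbb{R}\frac{\partial}{\partial u} \oplus \mathbb{R}\frac{\partial}{\partial v}$ with $\frac{\partial}{\partial u}$, $\frac{\partial}{\partial v}$ orthonormal. Now
\[ X = D\varphi\Big(\frac{\partial}{\partial u}\Big) = -\sin u\,\frac{\partial}{\partial x} + \cos u\,\frac{\partial}{\partial y} \]
and
\[ Y = D\varphi\Big(\frac{\partial}{\partial v}\Big) = \frac{\partial}{\partial z}. \]
Thus $D\varphi$ is a linear isometry and
\[ \Big\langle \frac{\partial}{\partial u}, \frac{\partial}{\partial u} \Big\rangle = \langle X, X\rangle = \sin^2 u + \cos^2 u = 1, \]
etc. Therefore $\varphi$ is a local isometry but clearly not an isometry.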

Example 5.2.14 The notation is as in Example 5.2.13. If we take
\[ M_1 = \{(u, v) \in M := \mathbb{R}^2 : -\pi < u < \pi,\ v \in \mathbb{R}\} \]
and
\[ M_2 = \bar M \setminus \{(-1, 0, z) : z \in \mathbb{R}\} \]
with the induced Riemannian metric, then the restriction of $\varphi$ to $M_1$ is an isometry of $M_1$ onto $M_2$.
Example 5.2.15 Let $M$ be the flat torus in $\mathbb{R}^4$:
\[ \{x \in \mathbb{R}^4 : x_1^2 + x_2^2 = 1,\ x_3^2 + x_4^2 = 1\}. \]
Then $\psi : \mathbb{R}^2 \to M$ given by $\psi(u, v) = (\cos u, \sin u, \cos v, \sin v)$ is a local isometry.
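Example 5.2.16 Let $D = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 < 1\}$. Define the Poincaré metric on $D$:
\[ g(v, w) = \frac{\langle v, w\rangle}{(1 - (x^2 + y^2))^2}, \]
where $v, w \in T_{(x,y)}D$ and $\langle v, w\rangle$ is the usual inner product. In line element representation, we can write this as
\[ ds^2 = \frac{dx^2 + dy^2}{(1 - x^2 - y^2)^2}. \]
We shall consider $D \subset \mathbb{C}$, the field of complex numbers, that is, $(x, y)$ will be identified with $x + iy = z \in \mathbb{C}$. For all $(a, b) \in \mathbb{C}^2$ such that $|a|^2 - |b|^2 = 1$, the map
\[ \varphi : z \mapsto \frac{az + b}{\bar b z + \bar a} \]
is an isometry of $D$.
For $v \in T_zD$, let $\gamma$ be a curve such that $\gamma(0) = z$ and $\gamma'(0) = v$. Then
\[ D\varphi(v) = \frac{d}{dt}(\varphi \circ \gamma)\Big|_{t=0} = \frac{d}{dt}\Big(\frac{a\gamma(t) + b}{\bar b\gamma(t) + \bar a}\Big)\Big|_{t=0} = \frac{|a|^2 - |b|^2}{(\bar b z + \bar a)^2}\,\gamma'(0) = \frac{\gamma'(0)}{(\bar b z + \bar a)^2} \in T_{\varphi(z)}D. \]
Therefore
\[ \langle D\varphi(v), D\varphi(v)\rangle = \frac{\langle \gamma'(0), \gamma'(0)\rangle}{(\bar b z + \bar a)^2\,(b\bar z + a)^2}. \tag{5.2.2} \]
But
\[ 1 - |\varphi(z)|^2 = \frac{1 - |z|^2}{|a + b\bar z|^2} = \frac{1 - |z|^2}{(a + b\bar z)(\bar a + \bar b z)}. \tag{5.2.3} \]
Equations (5.2.2) and (5.2.3) imply that
\[ \frac{\langle D\varphi(v), D\varphi(v)\rangle}{(1 - |\varphi(z)|^2)^2} = \frac{\langle v, v\rangle}{(1 - |z|^2)^2}. \]
(Check.) Hence by polarization we have $g(D\varphi(v), D\varphi(w)) = g(v, w)$ for $v, w \in T_zD$ as above.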
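The "(Check.)" above can be supported by a quick numerical experiment. This is only a sketch with standard-library complex arithmetic; `make_map` and `disk_norm_sq` are illustrative names, and the particular values of $a$, $b$, $z$, $v$ are arbitrary choices satisfying $|a|^2 - |b|^2 = 1$ and $|z| < 1$.

```python
import cmath
import math

def disk_norm_sq(z, v):
    """g_z(v, v) = |v|^2 / (1 - |z|^2)^2 for the metric of Example 5.2.16."""
    return abs(v) ** 2 / (1 - abs(z) ** 2) ** 2

def make_map(a, b):
    """phi(z) = (a z + b)/(conj(b) z + conj(a)) and its derivative, assuming |a|^2 - |b|^2 = 1."""
    phi = lambda z: (a * z + b) / (b.conjugate() * z + a.conjugate())
    dphi = lambda z, v: v / (b.conjugate() * z + a.conjugate()) ** 2
    return phi, dphi

t = 0.9
a = math.cosh(t) * cmath.exp(0.3j)    # |a|^2 - |b|^2 = cosh^2 t - sinh^2 t = 1
b = math.sinh(t) * cmath.exp(-1.1j)
phi, dphi = make_map(a, b)

z, v = 0.4 - 0.5j, 1.2 + 0.7j         # a point of D and a tangent vector at it
before = disk_norm_sq(z, v)
after = disk_norm_sq(phi(z), dphi(z, v))
```

The two numbers `before` and `after` coincide up to rounding, and `phi(z)` stays inside the unit disk.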
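Example 5.2.17 Consider the helicoid (a spiral staircase) with the parameterization
\[ (u, v) \mapsto (u\cos v, u\sin v, v), \qquad u, v \in \mathbb{R} \text{ and } u > 0 \]
and the catenoid with the parameterization
\[ (\theta, \phi) \mapsto (\cosh\theta\cos\phi, \cosh\theta\sin\phi, \theta), \qquad \phi, \theta \in \mathbb{R} \text{ and } 0 < \phi < 2\pi. \]
Then the map given by $v \mapsto \phi$, $u \mapsto \sinh\theta$ (compare Example 5.4.15) is a local isometry. (Exercise.)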
Example 5.2.18 Show that on the upper half-plane
\[ H = \{(x, y) \in \mathbb{R}^2 : y > 0\} \]
with the hyperbolic metric $g(v, w) = y^{-2}\langle v, w\rangle$ for $v, w \in T_{(x,y)}H$, the maps $\varphi : z \mapsto \frac{az + b}{cz + d}$ where $z = x + iy$ and $a, b, c, d \in \mathbb{R}$ such that $ad - bc = 1$ are isometries.
Since each such 0,
3. (x, y) t-t (x/(x 2 + y2), y/(x 2 + y2)),
it is enough to show that each of these is an isometry. Or one can proceed as in Example 5.2.16 above.
Example 5.2.19 H of Example 5.2.18 and D of Example 5.2.16 are
isometric via the Cayley transformation D -+ H given by z t-+ -i !=~:~
(Prove this.)
We make an important remark below:
Remark 5.2.20 In Examples 5.2.16, 5.2.18 and 5.2.19 do not resist the natural coordinates $z$ and $\bar z$. It would be highly cumbersome to express everything in terms of the real and imaginary parts $x$, $y$ during the computation.
5.3 The Levi-Civita connection

Definition 5.3.1 Let D be an (affine) connection on a manifold M.
We say that D is symmetric or torsion free if DxY - DyX = [X, Y]
holds for all X, Y E X(M).
Definition 5.3.2 A connection $D$ on a Riemannian manifold $M$ is said to be a Riemannian connection if for all $X, Y, Z \in \mathfrak{X}(M)$
\[ X\langle Y, Z\rangle = \langle D_XY, Z\rangle + \langle Y, D_XZ\rangle, \]
where $\langle\ ,\ \rangle = g(\ ,\ )$.
Definition 5.3.3 A connection D on a Riemannian manifold (M,g)
which is both symmetric and Riemannian is called the Levi-Civita Con-
nection.
We shall prove now what is called the fundamental theorem of Rie-
mannian geometry.
Theorem 5.3.4 (Fundamental Thm. of Riemannian Geometry)

There exists a unique Levi-Civita connection on a Riemannian manifold
M.
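Proof We plan to derive an expression for $\langle D_ZX, Y\rangle$ purely in terms of the metric and the Lie bracket. This will establish the uniqueness. The same expression also gives the existence. We use the Riemannian and symmetry nature of $D$ in the computation below.
\[ \langle D_ZX, Y\rangle = Z\langle X, Y\rangle - \langle X, D_ZY\rangle = Z\langle X, Y\rangle - \langle X, D_YZ\rangle - \langle X, [Z, Y]\rangle. \tag{5.3.1} \]
By cyclically permuting $X$, $Y$ and $Z$ we get from Eq. 5.3.1
\[ \langle D_ZX, Y\rangle = Z\langle X, Y\rangle - \langle X, D_YZ\rangle - \langle X, [Z, Y]\rangle, \tag{5.3.2} \]
\[ \langle D_XY, Z\rangle = X\langle Y, Z\rangle - \langle Y, D_ZX\rangle - \langle Y, [X, Z]\rangle, \tag{5.3.3} \]
\[ \langle D_YZ, X\rangle = Y\langle Z, X\rangle - \langle Z, D_XY\rangle - \langle Z, [Y, X]\rangle. \tag{5.3.4} \]
Subtracting Eq. 5.3.4 from the sum of Eq. 5.3.2 and Eq. 5.3.3 yields
\[ \langle D_ZX, Y\rangle = \frac{1}{2}\big(\langle X, [Y, Z]\rangle + \langle Y, [Z, X]\rangle - \langle Z, [X, Y]\rangle\big) + \frac{1}{2}\big(Z\langle X, Y\rangle - Y\langle Z, X\rangle + X\langle Y, Z\rangle\big). \tag{5.3.5} \]
Since the right side does not involve $D$, we get uniqueness. One can also check that $D$ can be defined using this formula.
□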
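We shall indicate the relation of our formulation with the local study. Let $(U, x)$ be a local coordinate system and let $g_{ij}(p) = g_p\big(\frac{\partial}{\partial x_i}\big|_p, \frac{\partial}{\partial x_j}\big|_p\big)$. We write $\partial_i$ for $\frac{\partial}{\partial x_i}$ at times.
Let $(g^{ij}(p))$ be the inverse of $(g_{ij}(p))$. (It exists since $(g_{ij}(p))$ is positive definite and hence non-singular.) The $g^{ij}$ are smooth functions on $U$.
Now if $D$ is a map from $\mathfrak{X}(U) \times \mathfrak{X}(U) \to \mathfrak{X}(U)$ satisfying the conditions (i) and (ii), then $D_{\partial_i}\partial_j = \sum_{k=1}^n \Gamma^k_{ij}\partial_k$ for uniquely defined smooth functions $\Gamma^k_{ij}$. Knowing $\Gamma^k_{ij}$ is equivalent to knowing the connection $D$. For, if $X = \sum f_i\partial_i$ and $Y = \sum h_j\partial_j$, then
\[ D_XY = \sum_i D_{f_i\partial_i}\Big(\sum_j h_j\partial_j\Big) = \sum_i f_i\Big(\sum_j D_{\partial_i}(h_j\partial_j)\Big) = \sum_i f_i \sum_j \big\{\partial_i(h_j)\partial_j + h_jD_{\partial_i}\partial_j\big\} \]
\[ = \sum_i f_i\Big\{\sum_k \partial_i(h_k)\partial_k + \sum_{j,k} h_j\Gamma^k_{ij}\partial_k\Big\} = \sum_k \Big\{X(h_k) + \sum_{i,j}\Gamma^k_{ij}f_ih_j\Big\}\partial_k. \]
Hence we can compute $D_XY$, once we know $\Gamma^k_{ij}$. The $\Gamma^k_{ij}$ are called the Christoffel symbols. We derive the classical expressions for these symbols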
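from Equation (5.3.5):
\[ \frac{\partial g_{ij}}{\partial x_k} + \frac{\partial g_{jk}}{\partial x_i} - \frac{\partial g_{ki}}{\partial x_j} = 2\sum_l \Gamma^l_{ki}\,g_{lj}. \tag{5.3.6} \]
To obtain $\Gamma^l_{ki}$ we apply $g^{-1} = (g^{lj})$ on both sides and get
\[ \Gamma^l_{ki} = \frac{1}{2}\sum_j \Big(\frac{\partial g_{ij}}{\partial x_k} + \frac{\partial g_{jk}}{\partial x_i} - \frac{\partial g_{ki}}{\partial x_j}\Big)g^{jl}. \]
By changing $l \to k$, $k \to i$, $i \to j$, $j \to l$,
\[ \Gamma^k_{ij} = \frac{1}{2}\sum_l \Big(\frac{\partial g_{jl}}{\partial x_i} + \frac{\partial g_{li}}{\partial x_j} - \frac{\partial g_{ij}}{\partial x_l}\Big)g^{lk}. \tag{5.3.7} \]
Equations (5.3.7) are called the Christoffel identities.
Condition (5) of the definition of Levi-Civita connection puts a restriction on $\Gamma^k_{ij}$: From the fact $[\partial_i, \partial_j] = 0$ we deduce $D_{\partial_i}\partial_j - D_{\partial_j}\partial_i = 0$. This implies that $\sum_k (\Gamma^k_{ij} - \Gamma^k_{ji})\partial_k = 0$. But then the $\partial_k$ are linearly independent on $U$ and hence
\[ \Gamma^k_{ij} = \Gamma^k_{ji}. \]
This explains the reason for the name "symmetric connection". But one should remember that the symmetry of the suffixes of $\Gamma^k_{ij}$ remains valid only with respect to the coordinate vector fields.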
Surface of revolution
We now take up the example of a surface of revolution and show how the proof of Theorem 5.3.4 can be carried out to compute the Levi-Civita connection of the surface with respect to the induced metric.
We consider a smooth curve $\sigma : (a, b) \to \mathbb{R}^2$, the $(x, z)$-plane in $\mathbb{R}^3$, given by $\sigma(u) = (x(u), z(u))$. We assume that the curve lies in the open right half $(x, z)$-plane. We also assume that $\sigma$ is parameterized by the arc length so that the tangent vector $\sigma'(u)$ is of unit norm: $\|\sigma'(u)\| = 1$. If $x(u) = 0$ for some $u$ in the domain we demand that $z'(u) = 0$ at that point. The surface $S$ of revolution is got by revolving the profile curve $\sigma$ about the $z$-axis. It has local parameterization given by
\[ X : (u, v) \mapsto (x(u)\cos v, x(u)\sin v, z(u)), \qquad 0 \le v < 2\pi. \]
The curves $u = $ constant are called the parallels and those with $v = $ constant are called the meridians. The coordinate vector fields are given by:
\[ X_u := X_*\Big(\frac{\partial}{\partial u}\Big) = x'(u)\cos v\,\frac{\partial}{\partial x} + x'(u)\sin v\,\frac{\partial}{\partial y} + z'(u)\,\frac{\partial}{\partial z}, \]
\[ X_v := X_*\Big(\frac{\partial}{\partial v}\Big) = -x(u)\sin v\,\frac{\partial}{\partial x} + x(u)\cos v\,\frac{\partial}{\partial y} + 0\,\frac{\partial}{\partial z}. \]
Thus we have the components of the metric:
\[ g_{11} = \langle X_u, X_u\rangle = |x'(u)|^2 + |z'(u)|^2 = |\sigma'(u)|^2 = 1, \]
\[ g_{12} = \langle X_u, X_v\rangle = 0, \]
\[ g_{22} = \langle X_v, X_v\rangle = (x(u))^2. \]
Hence the metric is given by
\[ ds^2 = du^2 + r(u)^2\,d\theta^2. \]
Here we have set $r(u) := x(u)$ and $\theta := v$.
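We now wish to find the Levi-Civita connection $D$ on $S$. We set $\partial_\theta = X_v = \frac{\partial}{\partial\theta}$ and $\partial_u = X_u$. We shall repeatedly use the fact that $[\partial_\theta, \partial_u] = 0$ and hence the symmetry of the connection.
\[ \langle D_{\partial_\theta}\partial_\theta, \partial_\theta\rangle = \frac{1}{2}\partial_\theta\langle\partial_\theta, \partial_\theta\rangle = \frac{1}{2}\partial_\theta(r^2) = 0. \tag{5.3.8} \]
\[ \langle D_{\partial_\theta}\partial_\theta, \partial_u\rangle = \partial_\theta\langle\partial_\theta, \partial_u\rangle - \langle\partial_\theta, D_{\partial_\theta}\partial_u\rangle = -\langle\partial_\theta, D_{\partial_u}\partial_\theta\rangle = -\frac{1}{2}\partial_u\langle\partial_\theta, \partial_\theta\rangle = -\frac{1}{2}\partial_u(r^2) = -r'(u)r(u). \tag{5.3.9} \]
From Equations (5.3.8) and (5.3.9) it follows that $D_{\partial_\theta}\partial_\theta$ has only a $\partial_u$-component and
\[ D_{\partial_\theta}\partial_\theta = -r'(u)r(u)\,\partial_u. \tag{5.3.10} \]
Also,
\[ \langle D_{\partial_\theta}\partial_u, \partial_\theta\rangle = \partial_\theta\langle\partial_u, \partial_\theta\rangle - \langle\partial_u, D_{\partial_\theta}\partial_\theta\rangle = -\langle\partial_u, D_{\partial_\theta}\partial_\theta\rangle = r(u)r'(u), \tag{5.3.11} \]
\[ \langle D_{\partial_\theta}\partial_u, \partial_u\rangle = \frac{1}{2}\partial_\theta\langle\partial_u, \partial_u\rangle = 0. \tag{5.3.12} \]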
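From Equations (5.3.11) and (5.3.12) we deduce that $D_{\partial_\theta}\partial_u$ has only the $\partial_\theta$-component. Recalling that $g_{22} = r(u)^2$, we see that
\[ D_{\partial_\theta}\partial_u = \frac{r'(u)}{r(u)}\,\partial_\theta. \tag{5.3.13} \]
We also have
\[ \langle D_{\partial_u}\partial_u, \partial_\theta\rangle = \partial_u\langle\partial_u, \partial_\theta\rangle - \langle\partial_u, D_{\partial_u}\partial_\theta\rangle = -\langle\partial_u, D_{\partial_u}\partial_\theta\rangle = -\Big\langle\partial_u, \frac{r'(u)}{r(u)}\,\partial_\theta\Big\rangle = 0, \tag{5.3.14} \]
\[ \langle D_{\partial_u}\partial_u, \partial_u\rangle = \frac{1}{2}\partial_u\langle\partial_u, \partial_u\rangle = 0. \tag{5.3.15} \]
Thus we have
\[ D_{\partial_u}\partial_u = 0, \tag{5.3.16} \]
\[ D_{\partial_\theta}\partial_\theta = -r'(u)r(u)\,\partial_u, \tag{5.3.17} \]
\[ D_{\partial_\theta}\partial_u = \frac{r'(u)}{r(u)}\,\partial_\theta = D_{\partial_u}\partial_\theta. \tag{5.3.18} \]
The Christoffel symbols are therefore:
\[ \Gamma^1_{22} = -r'(u)r(u) \quad\text{and}\quad \Gamma^2_{21} = \frac{r'(u)}{r(u)} = \Gamma^2_{12}, \tag{5.3.19} \]
and the rest of the Christoffel symbols are 0.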
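The Christoffel identities (5.3.7) can be evaluated mechanically, which gives an independent check of (5.3.19). The following is a minimal sketch for two-dimensional metrics, assuming only the standard library; the function name `christoffel` and the sample profile $r(u) = 2 + \cos u$ are illustrative choices, and the derivatives of $g_{ij}$ are approximated by central differences.

```python
import math

def christoffel(metric, x, h=1e-5):
    """Return G with G[k][i][j] = Gamma^k_ij from (5.3.7), for a 2-dimensional metric."""
    g = metric(x)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    ginv = [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]
    dg = []  # dg[l][i][j] approximates the partial derivative of g_ij in x_l
    for l in range(2):
        xp, xm = list(x), list(x)
        xp[l] += h
        xm[l] -= h
        gp, gm = metric(xp), metric(xm)
        dg.append([[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(2)] for i in range(2)])
    return [[[0.5 * sum((dg[i][j][l] + dg[j][l][i] - dg[l][i][j]) * ginv[l][k]
                        for l in range(2))
              for j in range(2)] for i in range(2)] for k in range(2)]

# Metric du^2 + r(u)^2 dtheta^2 in coordinates x = (u, theta), indices 0 and 1.
r = lambda u: 2.0 + math.cos(u)
dr = lambda u: -math.sin(u)
metric = lambda x: [[1.0, 0.0], [0.0, r(x[0]) ** 2]]

u0 = 0.5
gamma = christoffel(metric, (u0, 1.0))
gamma_1_22 = gamma[0][1][1]   # compare with -r'(u0) r(u0)
gamma_2_12 = gamma[1][0][1]   # compare with r'(u0) / r(u0)
```

With zero-based indices, `gamma[0][1][1]` plays the role of $\Gamma^1_{22}$ and `gamma[1][0][1]` that of $\Gamma^2_{12}$; both agree with (5.3.19) up to discretization error, and the remaining entries come out numerically zero.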
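5.4 Gauss theory of surfaces in $\mathbb{R}^3$

Gaussian curvature
Let $\bar D$ be the connection on $\mathbb{R}^3$ given by $\bar D_XY = \sum_i X(Y_i)\frac{\partial}{\partial u_i}$ where $Y = \sum_i Y_i\frac{\partial}{\partial u_i}$ with $Y_i \in C^\infty(\mathbb{R}^3)$. Let $S$ be a surface in $\mathbb{R}^3$, that is, a submanifold of dimension 2 in $\mathbb{R}^3$. Let $\varphi : U \subset \mathbb{R}^2 \to S \subset \mathbb{R}^3$ be a parameterization of an open set $\varphi(U)$ of $S$. (That is, $\varphi^{-1} : \varphi(U) \to U$ is a local chart.) We shall denote the coordinates of $U$ by $(u, v)$ and $\varphi(u, v) = X = X(u, v)$ or $\varphi(u, v) = r = r(u, v)$, "the position vector": $r(u, v) = (x(u, v), y(u, v), z(u, v))$. We set
\[ X_u = r_u = D\varphi(u, v)\Big(\frac{\partial}{\partial u}\Big|_{(u,v)}\Big) = \frac{\partial x}{\partial u}\frac{\partial}{\partial x} + \frac{\partial y}{\partial u}\frac{\partial}{\partial y} + \frac{\partial z}{\partial u}\frac{\partial}{\partial z}. \]
(To be in conformity with the classical notation, in this and in Section 5.4.2, quite often we shall adopt the usual notations for coordinates: $x$ instead of $u_1$, $\frac{\partial}{\partial x}$ instead of $\frac{\partial}{\partial u_1}$, etc.) $X_v = r_v = \frac{\partial x}{\partial v}\frac{\partial}{\partial x} + \frac{\partial y}{\partial v}\frac{\partial}{\partial y} + \frac{\partial z}{\partial v}\frac{\partial}{\partial z}$ is similarly defined. By hypothesis, $X_u$ and $X_v$ span $T_{\varphi(u,v)}S$. (They are the coordinate tangent vectors of the local chart $(\varphi(U), \varphi^{-1})$.) $T_pS \subset T_p\mathbb{R}^3$ inherits the usual Riemannian metric. Hence there exists a smooth vector field $N$, or $n$ sometimes, on $\varphi(U)$ such that $X_u$, $X_v$, $n_{\varphi(u,v)}$ span $T_p\mathbb{R}^3$, where $p = \varphi(u, v)$, in such a way that $T_p\mathbb{R}^3 = T_pS \oplus \mathbb{R}n_p$ is an orthogonal direct sum for all $p \in \varphi(U)$. We may and will assume that $n$ is of unit norm. (We refer to $n = N$ as a normal field on $\varphi(U) \subset S$.)
Let $D$ be the induced connection on $S$. Recall that for all $X, Y \in \mathfrak{X}(S)$,
\[ D_{X_p}Y = (\bar D_{X_p}Y)^T. \]
That is,
\[ D_{X_p}Y = \bar D_{X_p}Y - \langle \bar D_{X_p}Y, n\rangle\,n. \tag{5.4.1} \]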
This connection is symmetric and Riemannian. By the fundamental theorem of Riemannian geometry proved in Section 5.3 it is the Levi-Civita connection on $(S, g)$, $g_p = \langle\ ,\ \rangle|_{T_pS}$.
It is intuitively clear that the correction term, namely, $\langle \bar D_{X_p}Y, n\rangle$ should determine the geometry of $S$ sitting in $\mathbb{R}^3$. It measures the velocity, in the $X_p$-direction, with which $Y$ escapes into the outer space.
Since $\langle Y, n\rangle = 0$ we have
\[ 0 = X_p\langle Y, n\rangle = \langle \bar D_{X_p}Y, n\rangle + \langle Y, \bar D_{X_p}n\rangle \]
and hence
\[ \langle \bar D_{X_p}Y, n\rangle = -\langle Y, \bar D_{X_p}n\rangle. \tag{5.4.2} \]
Notice that $\bar D_{X_p}n \in T_pS$. For, the fact that $\langle n, n\rangle = 1$ implies that $X_p\langle n, n\rangle = 0$ and hence $2\langle \bar D_{X_p}n, n\rangle = 0$. Thus $\bar D_{X_p}n \perp n$. Hence the claim.
We define $A : T_pS \to T_pS$ by $Av = -\bar D_vn$. Thus we can write
\[ \bar D_XY = D_XY + \langle AX, Y\rangle\,n. \tag{5.4.3} \]
Equation (5.4.3) is called the Gauss equation and the map $A : T_pS \to T_pS$ is called the Weingarten map. Since Equation (5.4.3) is a recast of Equation (5.4.1), we should expect that the geometry of $S$ in $\mathbb{R}^3$ is shaped by $A$.
This linear map $A$ has a geometric interpretation. Given $n$ on $\varphi(U) \subset S$ as above, we have a map $\tilde n : \varphi(U) \to S^2$, the unit sphere in $\mathbb{R}^3$, namely, $\tilde n(p) = n_p$. That is, if
\[ n_p = f_1\frac{\partial}{\partial x}\Big|_p + f_2\frac{\partial}{\partial y}\Big|_p + f_3\frac{\partial}{\partial z}\Big|_p, \]
so that $\sum f_i^2 = 1$, then $\tilde n(p) = (f_1, f_2, f_3) \in S^2$. This map is called the Gauss (spherical) map of the surface (defined locally). We wish to calculate the derivative of this map. First of all, observe that at $n_p$ in the unit sphere, $T_{n_p}S^2$ is the same as $T_p(S)$. Let $v \in T_pS$, $\gamma$ a curve in $S$ such that $\gamma(0) = p$ and $\gamma'(0) = v$, etc. Let us write $n = \sum f_i\frac{\partial}{\partial u_i}$. Then
\[ D\tilde n(p)(v) = (\tilde n \circ \gamma)'(0) = \sum_i \frac{d}{dt}(f_i \circ \gamma)\,\frac{\partial}{\partial u_i} = \bar D_vn = -Av. \tag{5.4.4} \]
Thus $-A$ is the derivative of the Gauss map.
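Example 5.4.1 Let $S = S^2(R) = \{x \in \mathbb{R}^3 : \|x\|^2 = R^2\}$ with $R > 0$. Then $\tilde n(x) = x/R$ for all $x \in S$. Hence the derivative $D\tilde n(x) : T_pS \to T_pS^2$ is given by $v \mapsto v/R$. Thus, for $v \in T_pS$, $Av = -v/R$.
If you want an analytical derivation of this, take $n_x = \sum_i (x_i/R)\frac{\partial}{\partial u_i}$. Then $Av = -\bar D_vn = -(\bar D_Xn)(p)$, where $X = \sum_i v_i\frac{\partial}{\partial u_i}$ and $v = \sum_i v_i\frac{\partial}{\partial u_i}\big|_p$. Then
\[ \bar D_Xn = \sum_i \Big(\sum_j v_j\frac{\partial}{\partial u_j}\Big)\Big(\frac{x_i}{R}\Big)\frac{\partial}{\partial u_i} = \sum_i \frac{v_i}{R}\frac{\partial}{\partial u_i} = \frac{1}{R}\,v. \]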
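We now show that the Weingarten map $A \equiv A_p : T_pS \to T_pS$ is symmetric (self-adjoint with respect to the inner product), that is, $\langle Av, w\rangle = \langle v, Aw\rangle$ for all $v, w \in T_pS$. We shall give two proofs of this result.
Proof Let $V = \varphi(U)$, $\tilde V$ be adapted neighborhoods in $S$ and $\mathbb{R}^3$. We choose vector fields $X$ and $Y$ on $V$ such that $X_p = v$, $Y_p = w$ and extend $X$, $Y$, $n$ to $\tilde X$, $\tilde Y$, $\tilde n$ on $\tilde V$. Since $Av = -\bar D_vn$, we have $\langle Av, w\rangle - \langle v, Aw\rangle = -\big(\langle \bar D_vn, w\rangle - \langle v, \bar D_wn\rangle\big)$, and, at $p$,
\[ \langle \bar D_vn, w\rangle - \langle v, \bar D_wn\rangle = \langle \bar D_Xn, Y\rangle\big|_p - \langle X, \bar D_Yn\rangle\big|_p \]
\[ = X\langle n, Y\rangle - \langle n, \bar D_XY\rangle - Y\langle X, n\rangle + \langle \bar D_YX, n\rangle \]
\[ = \langle n, \bar D_YX - \bar D_XY\rangle = \langle n, [Y, X]\rangle = 0. \]
Hence $\langle v, Aw\rangle = \langle Av, w\rangle$.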
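Now the symmetric map $A$ on $T_pS$ gives rise to a symmetric bilinear form $B : T_pS \times T_pS \to \mathbb{R}$ as follows:
\[ B(v, w) := \langle Av, w\rangle, \]
$\langle\ ,\ \rangle$ the induced inner product on $T_pS$. $B$ is called the second fundamental form and is denoted by $II$. We now obtain the local expression for $II$ with respect to the parameterization. We use the notations introduced above.
\[ B(r_u, r_u) = \langle Ar_u, r_u\rangle = -\langle \bar D_{r_u}n, r_u\rangle = -\Big\langle \frac{\partial n}{\partial u}, r_u\Big\rangle = \langle n, r_{uu}\rangle := L. \tag{5.4.5} \]
Equations (5.4.5) are obtained as follows: Since $\langle n, r_u\rangle = 0$, we have $\frac{\partial}{\partial u}\langle n, r_u\rangle = 0$, which in turn implies $\big\langle \frac{\partial n}{\partial u}, r_u\big\rangle + \langle n, r_{uu}\rangle = 0$. Similarly
\[ B(r_u, r_v) = \langle n, r_{uv}\rangle := M \]
and
\[ B(r_v, r_v) = \langle n, r_{vv}\rangle := N. \]
$L$, $M$ and $N$ are classical notations. We then write
\[ II := L\,du^2 + 2M\,du\,dv + N\,dv^2. \]
The above computation furnishes a second proof for the self-adjointness of $A$.
□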
We now look at the various choices involved in the definition of $A$. First of all, we could have taken a different coordinate neighborhood and the corresponding parameterization. Secondly, since $T_pS$ is intrinsically defined as a subspace of $T_p\mathbb{R}^3$, there are only 2 obvious choices for the normal field, namely, $n$ and $-n$ on any coordinate neighborhood. For the time being, let one be $n$. If now $(\tilde u, \tilde v)$ is another set of parameters, then the matrix representations of $A$, say $A_1$ and $A_2$ with respect to $(u, v)$ and $(\tilde u, \tilde v)$ are related by
where T = (~~) is the Jacobian corresponding to the coordinate change.

The above is just linear algebra: Let $A : V \to V$ be linear, with $A = (a_{ij})$ and $\tilde A = (\tilde a_{ij})$ with respect to $\{e_i\}$ and $\{\tilde e_i\}$ respectively, and let $T$ take $e_i$ to $\tilde e_i$. That is, if
\[ \sum_j t_{ij}e_j = \tilde e_i, \]
then
where $(t^{kl})$ is the inverse of $(t_{kl})$. (Exercise: Check this.)

Thus we are looking for numerical quantities invariant under conjugation. Two obvious choices are: $A \mapsto \det(A)$ and $A \mapsto \operatorname{tr}(A)$. So we accordingly define the Gaussian curvature $K(p)$ of $S$ at a point $p$ by $K_p := \det(A_p)$ and the mean curvature $H(p)$ of $S$ at a point $p$ by $H_p := \operatorname{tr}(A_p)/2$.
Using simple linear algebra (of inner product spaces) we see that knowing $I$ and $II$, we can calculate $K$:
\[ K = \frac{LN - M^2}{EG - F^2}. \]
We give the simple result from linear algebra as the following
Exercise 5.4.2 Let $A : V \to V$ be a symmetric linear map of a finite dimensional inner product space over $\mathbb{R}$. Let $v_i$ be any basis of $V$. Let $\langle Av_i, v_j\rangle = \alpha_{ij}$ for $1 \le i, j \le \dim V$. Then
\[ \det A = \frac{\det(\alpha_{ij})}{\det(\langle v_i, v_j\rangle)}. \]
Exercise 5.4.3 Show that in the above notation the Riemannian volume element of $S$ is given by $\sqrt{EG - F^2}\;dx \wedge dy$.
Gaussian curvature K is thus a function on S. It has a very nice

geometric interpretation which is based on the geometric meaning of
the Weingarten map and the geometric meaning of the determinant.
Gaussian curvature at a point p measures the distortion of the areas of
infinitesimal regions around the point p caused by the Gauss spherical
map. When we introduce the Cartan structural equations this can be
easily established. This geometric insight will help you understand the
qualitative behaviour of the curvature functions in some of the examples
below.
Now let us look at some examples of surfaces in ]R3 and compute I,
II and K.
We first make an important remark:
Remark 5.4.4 Most often the parameterizations given below cover only
the good portion of the surface. We shall not comment, in general, upon
the rest. However the continuity of K, etc., will imply that the equation
for K will continue to hold for the whole of S.
Example 5.4.5 (Torus)
\[ \varphi(u, v) = r = ((a + b\cos u)\cos v, (a + b\cos u)\sin v, b\sin u) \]
with $0 < b < a$ and $(u, v) \in \mathbb{R}^2$.
\[ r_u = (-b\sin u\cos v, -b\sin u\sin v, b\cos u) \]
\[ E = \langle r_u, r_u\rangle = b^2 = g_{11} \]
\[ r_{uu} = (-b\cos u\cos v, -b\cos u\sin v, -b\sin u) \]
\[ r_{uv} = (b\sin u\sin v, -b\sin u\cos v, 0) \]
\[ r_v = (-(a + b\cos u)\sin v, (a + b\cos u)\cos v, 0) \]
\[ G = \langle r_v, r_v\rangle = (a + b\cos u)^2 \]
\[ F = \langle r_u, r_v\rangle = 0 \]
\[ n = -(\cos u\cos v, \cos u\sin v, \sin u) \]
\[ \langle r_{uu}, n\rangle = b\cos^2 u\,(\cos^2 v + \sin^2 v) + b\sin^2 u = b = L \]
\[ \langle r_{uv}, n\rangle = 0 = M \]
\[ \langle r_{vv}, n\rangle = (a + b\cos u)\cos u\,(\sin^2 v + \cos^2 v) = N. \]
Therefore
\[ K = \frac{LN - M^2}{EG - F^2} = \frac{b\cos u\,(a + b\cos u)}{b^2(a + b\cos u)^2} = \frac{\cos u}{b(a + b\cos u)}. \]
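The torus computation can be cross-checked numerically from the parameterization alone. This is a sketch under stated assumptions: it uses only the standard library, the helper `gaussian_curvature` is an illustrative name, all derivatives are replaced by central differences, and the values $a = 2$, $b = 1$ and the sample point are arbitrary.

```python
import math

def _sub(p, q):
    return tuple(x - y for x, y in zip(p, q))

def _dot(p, q):
    return sum(x * y for x, y in zip(p, q))

def _cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def gaussian_curvature(r, u, v, h=1e-4):
    """K = (LN - M^2)/(EG - F^2) for a parameterized surface r(u, v) in R^3."""
    ru = tuple(c / (2 * h) for c in _sub(r(u + h, v), r(u - h, v)))
    rv = tuple(c / (2 * h) for c in _sub(r(u, v + h), r(u, v - h)))
    r0 = r(u, v)
    ruu = tuple((p - 2 * q + s) / h**2 for p, q, s in zip(r(u + h, v), r0, r(u - h, v)))
    rvv = tuple((p - 2 * q + s) / h**2 for p, q, s in zip(r(u, v + h), r0, r(u, v - h)))
    ruv = tuple((p - q - s + t) / (4 * h**2)
                for p, q, s, t in zip(r(u + h, v + h), r(u + h, v - h),
                                      r(u - h, v + h), r(u - h, v - h)))
    nvec = _cross(ru, rv)
    norm = math.sqrt(_dot(nvec, nvec))
    n = tuple(c / norm for c in nvec)           # unit normal; its sign does not affect K
    E, F, G = _dot(ru, ru), _dot(ru, rv), _dot(rv, rv)
    L, M, N = _dot(n, ruu), _dot(n, ruv), _dot(n, rvv)
    return (L * N - M * M) / (E * G - F * F)

a, b = 2.0, 1.0
torus = lambda u, v: ((a + b * math.cos(u)) * math.cos(v),
                      (a + b * math.cos(u)) * math.sin(v),
                      b * math.sin(u))
u0, v0 = 0.7, 1.1
K_numeric = gaussian_curvature(torus, u0, v0)
K_formula = math.cos(u0) / (b * (a + b * math.cos(u0)))
```

The two values agree up to discretization error. Since $K$ is unchanged when $n$ is replaced by $-n$, the orientation picked up from $r_u \times r_v$ is immaterial here.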
Example 5.4.6 (Helicoid) $(u, v) \mapsto (v\cos u, -v\sin u, bu)$ with $b \neq 0$. We then have $r_u = (-v\sin u, -v\cos u, b)$ and $r_v = (\cos u, -\sin u, 0)$ so that
\[ n = \frac{(b\sin u, b\cos u, v)}{\sqrt{b^2 + v^2}}. \]
We further have
\[ r_{uu} = (-v\cos u, v\sin u, 0) \]
\[ r_{uv} = (-\sin u, -\cos u, 0) \]
\[ r_{vv} = (0, 0, 0) \]
\[ K = \frac{-b^2}{(b^2 + v^2)^2}. \]
Example 5.4.7 (Sphere) $(u, v) \mapsto (\cos u\cos v, \cos u\sin v, \sin u)$. Compute $I$, $II$ and $K$ and so on.
Example 5.4.8 (Cylinder) $(u, v) \mapsto (\cos u, \sin u, v)$. Compute $II$ and $K$
(i) straight away without using any formula above and
(ii) using the parameterization formula etc. Do the same for the sphere too.
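Example 5.4.9 (Monge Surfaces) $S \subset \mathbb{R}^3$ is given by $z = z(x, y)$. Put $p = \frac{\partial z}{\partial x}$, $q = \frac{\partial z}{\partial y}$, $s = \frac{\partial^2 z}{\partial x^2}$, $r = \frac{\partial^2 z}{\partial x\,\partial y}$ and $t = \frac{\partial^2 z}{\partial y^2}$. Then we have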
Example 5.4.10 Let $S$ be the surface got by revolving a part of the hyperbola:
\[ (u, v) \mapsto (\cosh u\cos v, \cosh u\sin v, \sinh u). \]
Compute the first and second fundamental forms and the Gaussian curvature of $S$. (See (e) in Figure 5.2.2, on page 245.)
Exercise 5.4.11 Let a hypersurface $S \subset \mathbb{R}^{n+1}$ be given by
Compute its volume form with respect to the induced Riemannian metric. (Hint: See Example 3.2.11.)
Gauss Theorema Egregium

We started with the idea that the correction term in the Gauss equation (Equation (5.4.3)) should influence the geometry of the imbedded surface $S$ in $\mathbb{R}^3$. This in turn led us to two invariants of the associated Weingarten map $A$ and we defined the Gaussian curvature $K$ of $S$ in terms of the first and second fundamental forms:
\[ K(p) = \det(A_p) = \frac{LN - M^2}{EG - F^2}. \]
The theorem of Gauss says that $K$ is effectively computable from $I$. More precisely, we have
Theorem 5.4.12 (Gauss theorem) If $S_1$ and $S_2$ are two surfaces in $\mathbb{R}^3$ and if $\varphi : S_1 \to S_2$ is a local isometry, then the Gaussian curvatures $K_1$ and $K_2$ are the same at corresponding points. In particular, $K$ does not depend on the way the surface is imbedded in $\mathbb{R}^3$ but only on the induced Riemannian metric.
Before we prove the theorem we explain the meaning of this by means

of examples.
Example 5.4.13 Consider
\[ S_1 = \{p \in \mathbb{R}^3 : z(p) = 0\} \]
and
\[ S_2 = \{(x, y, z) \in \mathbb{R}^3 : x^2 + y^2 = 1\}, \]
the cylinder. Let $\varphi : S_1 \to S_2$ be given by
\[ (x, y, 0) \mapsto (\cos x, \sin x, y). \]
Then we have seen that $\varphi$ is a local isometry. Clearly $K_1 = 0 = K_2$ at corresponding points. (Check this for $K_1$ and for $K_2$ see Example 5.4.8 above.) If we let
\[ S = \{(x, y, 0) \in \mathbb{R}^3 : -\pi < x < \pi,\ y \in \mathbb{R}\} \subset \mathbb{R}^3 \]
and $S_1^* = S$ and $S_2^* = $ the image of $S_1^*$ under $\varphi$, that is, the cylinder with a line removed (parallel to the $z$-axis), then $S_i^*$ is an imbedding of $S$ into $\mathbb{R}^3$ in two different ways. However the first fundamental forms at the corresponding points are the same and hence by Gauss theorem we should expect $K_1 = K_2$, etc.
Example 5.4.14 Consider the local isometry $\varphi$ between the punctured plane $S = \mathbb{R}^2 \setminus \{0\}$ and the cone $\tilde S = \{(x, y, z) \in \mathbb{R}^3 : 3x^2 + 3y^2 = z^2,\ z > 0\}$ given by
\[ \varphi(r\cos\theta, r\sin\theta) = \Big(\frac{r}{2}\cos 2\theta, \frac{r}{2}\sin 2\theta, \frac{\sqrt 3}{2}\,r\Big). \]
We leave it to the reader to check that the Gaussian curvature coincides at the corresponding points.
Example 5.4.15 Consider the helicoid (spiral staircase) given by
\[ \varphi(u, v) = (u\cos v, u\sin v, v) \]
with $u > 0$ and $v \in \mathbb{R}$ and the catenoid given by
\[ \psi(\theta, \phi) = (\cosh\theta\cos\phi, \cosh\theta\sin\phi, \theta) \]
with $0 < \phi < 2\pi$ and $\theta \in \mathbb{R}$, and the local isometry given by
\[ v \mapsto \phi, \qquad u \mapsto \sinh\theta. \]
The reader should calculate $I$, $II$, $K$, etc., and verify Theorem 5.4.12 in this case. See also Figure 5.4.1 on page 263.
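Before doing the calculation by hand, one can check numerically that the two first fundamental forms agree under the substitution $u = \sinh\theta$, $v = \phi$. This is a minimal sketch, assuming only the standard library; `first_fundamental_form` is an illustrative helper using central differences, and the sample point is arbitrary.

```python
import math

def first_fundamental_form(r, p, q, h=1e-6):
    """Return (E, F, G) of the parameterized surface r at the parameter point (p, q)."""
    rp = [(a - b) / (2 * h) for a, b in zip(r(p + h, q), r(p - h, q))]
    rq = [(a - b) / (2 * h) for a, b in zip(r(p, q + h), r(p, q - h))]
    dot = lambda x, y: sum(a * b for a, b in zip(x, y))
    return dot(rp, rp), dot(rp, rq), dot(rq, rq)

helicoid = lambda u, v: (u * math.cos(v), u * math.sin(v), v)
catenoid = lambda th, ph: (math.cosh(th) * math.cos(ph), math.cosh(th) * math.sin(ph), th)

# The helicoid written in the catenoid's parameters via u = sinh(theta), v = phi.
helicoid_in_catenoid_coords = lambda th, ph: helicoid(math.sinh(th), ph)

theta0, phi0 = 0.6, 1.3
I_catenoid = first_fundamental_form(catenoid, theta0, phi0)
I_helicoid = first_fundamental_form(helicoid_in_catenoid_coords, theta0, phi0)
```

Both triples come out as approximately $(\cosh^2\theta_0, 0, \cosh^2\theta_0)$, which is what a local isometry requires; the hand computation of $I$ should reproduce $\cosh^2\theta\,(d\theta^2 + d\phi^2)$ on both sides.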
To prove Theorem 5.4.12, let us go back to the question posed at the end of Section 5.1. The answer is that we would like to investigate
\[ R(X, Y)Z := D_XD_YZ - D_YD_XZ - D_{[X,Y]}Z. \]
The following exercise is easy but an important one.
Exercise 5.4.16 For $M = \mathbb{R}^n$ and $D$ the Levi-Civita connection show that $R(X, Y)Z = 0$, for all $X, Y, Z \in \mathfrak{X}(\mathbb{R}^n)$.
Proof of Theorem 5.4.12. On $\mathbb{R}^3$ we have (we think of $\tilde X$, $\tilde Y$ as extensions of $X$, $Y$ on $S$)
\[ \bar D_X\bar D_YZ - \bar D_Y\bar D_XZ - \bar D_{[X,Y]}Z = 0. \]
We now use the Gauss equation (Equation (5.4.3)) in the form
\[ \bar D_XY = D_XY + \langle AX, Y\rangle\,n \]
in this and take the tangential components alone, remembering that $\bar D_Xn = -AX$:
\[ \bar D_X(D_YZ + \langle AY, Z\rangle n) - \bar D_Y(D_XZ + \langle AX, Z\rangle n) - \bar D_{[X,Y]}Z \]
\[ = D_XD_YZ - \langle AY, Z\rangle AX - D_YD_XZ + \langle AX, Z\rangle AY - D_{[X,Y]}Z + \text{normal component}. \]
That the above expression is 0 implies that each of the components is 0. Hence we get
\[ (D_XD_Y - D_YD_X - D_{[X,Y]})(Z) = \langle AY, Z\rangle AX - \langle AX, Z\rangle AY. \]
Therefore if we define $R(X, Y)Z$ by setting
\[ R(X, Y)Z := (D_XD_Y - D_YD_X - D_{[X,Y]})(Z), \tag{5.4.6} \]
then we have
\[ R(X, Y)Z = \langle AY, Z\rangle AX - \langle AX, Z\rangle AY. \tag{5.4.7} \]
Now assume $X$ and $Y$ are orthonormal on $U \subset S$, so that
\[ \langle R(X, Y)Y, X\rangle = \langle AY, Y\rangle\langle AX, X\rangle - \langle AX, Y\rangle\langle AY, X\rangle = \det(A). \]
Therefore
\[ \langle R(X, Y)Y, X\rangle = K. \tag{5.4.8} \]
Thus the curvature $K$ is computed as $\langle R(X, Y)Y, X\rangle$, which depends only on $D$, the Levi-Civita connection, and the Riemannian metric ($I$) of $S$. Since the connection $D$ is determined by the metric, the theorem is completely proved.
□
Figure 5.4.1 Catenoid to helicoid

Remark 5.4.17 Notice that thanks to Equation (5.4.7), $R(X, Y)Z|_p$ depends only on $X_p$, $Y_p$ and $Z_p$ (contrary to what one may expect from the expression for $R$ in terms of $D$'s), since the right hand side depends only on $X_p$, $Y_p$ and $Z_p$. Thus $R(X_p, Y_p) : T_pS \to T_pS$ is a linear operator, called the Curvature Operator.
On any Riemannian manifold $(M, g)$ with the Levi-Civita connection $D$, if we define $R(X, Y)Z$ as above, one can directly verify that $(R(X, Y)Z)(p)$ depends only on $X_p$, $Y_p$ and $Z_p$ (Exercise). We call this general $R(X_p, Y_p)$ the curvature operator associated to $X_p$, $Y_p$ on any Riemannian manifold.
5.5 Curvature and parallel transport

In this section we shall define parallel transport and give a geometric
meaning of R(Xp, Yp) in terms of parallel transport.
Let $(M, g)$ be a Riemannian manifold, $D$ the Levi-Civita connection on it. Motivated by Section 5.4, we define for all $X, Y, Z \in \mathfrak{X}(M)$,
$$R(X, Y)Z = (D_X D_Y - D_Y D_X - D_{[X,Y]})(Z).$$
We first show that $(R(X, Y)Z)(p)$ depends only on $X_p$, $Y_p$ and $Z_p$. This means that $R(X, Y)Z$ is a tensor:
$$R(fX, hY)(uZ) = f h u\, R(X, Y)Z$$
for all $f, h, u \in C^\infty(M)$. Consider
$$D_{fX} D_Y Z = f D_X D_Y Z,$$
$$D_Y D_{fX} Z = D_Y(f D_X Z) = Y(f) D_X Z + f D_Y D_X Z.$$
Since $[fX, Y] = f[X, Y] - Y(f)X$, we have
$$D_{[fX,Y]} Z = D_{f[X,Y] - Y(f)X} Z = f D_{[X,Y]} Z - Y(f) D_X Z.$$
Putting all these together yields
$$R(fX, Y)Z = f R(X, Y)Z.$$
Similarly
$$R(X, fY)Z = f R(X, Y)Z = R(X, Y)(fZ).$$
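For readers who want the bookkeeping spelled out, here is a sketch of how the three computations combine, together with the analogous check in the third slot. It uses only the Leibniz rule for $D$ and the definition of the bracket; no new assumptions are introduced.

```latex
\begin{aligned}
R(fX,Y)Z
  &= f D_X D_Y Z - \bigl(Y(f)\,D_X Z + f D_Y D_X Z\bigr)
     - \bigl(f D_{[X,Y]}Z - Y(f)\,D_X Z\bigr) \\
  &= f\bigl(D_X D_Y Z - D_Y D_X Z - D_{[X,Y]}Z\bigr) = f\,R(X,Y)Z, \\[4pt]
D_X D_Y (fZ)
  &= X(Y(f))\,Z + Y(f)\,D_X Z + X(f)\,D_Y Z + f D_X D_Y Z, \\
D_{[X,Y]}(fZ)
  &= [X,Y](f)\,Z + f D_{[X,Y]}Z, \\
R(X,Y)(fZ)
  &= \bigl(X(Y(f)) - Y(X(f)) - [X,Y](f)\bigr) Z + f\,R(X,Y)Z = f\,R(X,Y)Z .
\end{aligned}
```

In the last line the first-order terms $Y(f) D_X Z$ and $X(f) D_Y Z$ cancel against their counterparts from $D_Y D_X (fZ)$, and the coefficient of $Z$ vanishes by the definition of $[X, Y]$.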
Hence if $X = \sum_i X_i \partial_i$ etc., we then have
$$R(X, Y)Z(p) = \sum_{i,j,k} X_i(p)\, Y_j(p)\, Z_k(p)\, R(\partial_i, \partial_j)(\partial_k).$$
This establishes what we wanted. Hence without ambiguity we can write $R(X_p, Y_p)(Z_p)$, so that
$$R(X_p, Y_p) : T_pM \to T_pM$$
is a linear map for all $X_p, Y_p \in T_pM$ such that
$$R(X_p, Y_p)(w) = (R(X, Y)Z)(p),$$
where $X$, $Y$, $Z$ are vector fields on $M$ such that $X(p) = X_p$, $Y(p) = Y_p$, and $Z(p) = w$. We call this map $R(X_p, Y_p)$ the curvature operator associated to $X_p$ and $Y_p$.
The curvature operator R(Xp, Yp) has a simple geometric meaning in
terms of parallel transport which we now discuss. Recall the geometric
motivation with which we started the concept of $D_X Y$. By abstracting
the properties of $D$ on $\mathbb{R}^n$, we abandoned the geometric idea of parallel
transport of TpM to TqM along curves. We now complete the circuit.
That is, given a connection D on M, we shall define the notion of parallel
transport.
We now introduce the concept of covariant derivation of a vector field
along a curve in a manifold with a connection.
Let $M$ be any smooth manifold. Let $c : [0, T] \to M$ be a smooth map.
A map $X : t \mapsto X(t) \in T_{c(t)}M$ is called a vector field along the curve $c$.
$X$ is said to be smooth if $t \mapsto X(t)(f)$ is smooth for any $f \in C^\infty(M)$.
All vector fields along a curve will be assumed to be smooth. We let
$\mathfrak{X}(c)$ denote the set of all smooth vector fields along $c$. The following
theorem tells us how to differentiate a vector field along a curve on a
manifold with a connection. Though easy to prove, this result plays a
fundamental role in the sequel.
Theorem 5.5.1 Let $M$ be a manifold with a connection $D$. Let $c : I \to M$ be a smooth curve. There exists a unique map $\frac{D}{dt} : \mathfrak{X}(c) \to \mathfrak{X}(c)$ with the following properties:
(1) $\frac{D}{dt}(X + Y) = \frac{D}{dt}X + \frac{D}{dt}Y$ for all $X, Y \in \mathfrak{X}(c)$.
(2) If $f : [0, T] \to \mathbb{R}$ is smooth, then $\frac{D}{dt}(fX) = \frac{df}{dt}X + f\,\frac{D}{dt}X$.
(3) If $X \in \mathfrak{X}(c)$ and $V$ is a vector field on $M$ such that $V(c(t)) = X(c(t))$, then $\frac{D}{dt}X(c(t)) = D_{c'(t)}V := D_{c'(t)}V(c(t))$.
Proof Let us assume that such an operation exists and derive its local expression. Let $\varphi : U \to \mathbb{R}^n$ be a local chart. Assume that $c(I) \subset U$. Let $c(t) := (c_1(t), \ldots, c_n(t))$ where $c_i(t) = x_i(c(t))$. Let $\partial_i := \frac{\partial}{\partial x_i}$. Given a vector field $X$ along $c$ we can write it as $X = \sum_i f_i \partial_i$ where $f_i(t) = X(t)(x_i)$. By properties 1) and 2), we have
$$\frac{D}{dt}X = \sum_i \Bigl(\frac{df_i}{dt}\,\partial_i + f_i\,\frac{D}{dt}\partial_i\Bigr). \tag{5.5.1}$$
By 3) we have
$$\frac{D}{dt}\partial_i = D_{c'(t)}\partial_i = D_{\sum_j \frac{d(x_j \circ c)}{dt}\partial_j}\,\partial_i = \sum_j \frac{d(x_j \circ c)}{dt}\, D_{\partial_j}\partial_i.$$
Using this in Eq. 5.5.1, we get
$$\frac{DX}{dt} = \sum_i \frac{df_i}{dt}\,\partial_i + \sum_{i,j} f_i\,\frac{d(x_j \circ c)}{dt}\, D_{\partial_j}\partial_i. \tag{5.5.2}$$
Thus if $\frac{D}{dt}$ exists, it is unique and its local expression is as in Eq. 5.5.2.
To prove the existence, we define $\frac{D}{dt}X$ on $U$ by Eq. 5.5.2. It is easily verified that this definition satisfies the properties 1)-3). The uniqueness assertion above implies that $\frac{D}{dt}$ is well defined. $\square$
Note: Most often in the sequel we shall use this theorem without any further comment. We also use the sloppy notation $D_{c'}X$, $X'$ or $X_t$ for $\frac{D}{dt}X$ for any vector field $X$ along $c$, if there is no cause for confusion.
Remark 5.5.2 The above theorem may be best understood if one thinks in terms of induced bundles and induced connections. What goes on there is that the tangent bundle of $M$ is pulled back to the domain of $c$ via $c$. Then $\frac{D}{dt}$ is the induced connection on this bundle. See also Section 2.7.1.
Exercise 5.5.3 This exercise serves as an eye opener to many as it helps them understand the meaning of the theorem. Let $c : (a, b) \to M$ be the constant curve $c(t) = p$. Let $X \in \mathfrak{X}(c)$. What is $D_{c'}X$?
The subtle point here is that $X(t)$ is not the restriction to the image of $c$ of a vector field on $M$. If we write $X(t) := \sum_i f_i(t)\,\frac{\partial}{\partial x_i}\big|_p$, then using the definition of $\frac{D}{dt}$, one shows that
$$\frac{D}{dt}X = \sum_i f_i'(t)\,\frac{\partial}{\partial x_i}\Big|_p.$$
Note that in this case, the pull-back bundle is the trivial bundle $(a, b) \times T_p(M)$.
Definition 5.5.4 Let $M$ be a manifold with a connection $D$. Let $X$ be a vector field along a curve $\gamma$, i.e., $X(t) \in T_{\gamma(t)}M$. We then say $X$ is parallel along $\gamma$ if the covariant derivative of $X$ along $\gamma$ is zero: $\frac{D}{dt}X = 0$ on $\gamma$. In the sloppy notation this is written as $D_{\gamma'(t)}X(\gamma(t)) = 0$ for $t \in \operatorname{domain}(\gamma)$.
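Example 5.5.5 Take $\gamma(t) = (t, 0, \ldots, 0)$ in $M = \mathbb{R}^n$ and $D$ the usual connection. Let $X = \sum_i f_i\,\frac{\partial}{\partial u_i}$. Then $D_{\gamma'(t)}X = 0$ implies
$$\Bigl(\frac{\partial f_1}{\partial u_1}, \ldots, \frac{\partial f_n}{\partial u_1}\Bigr) = 0.$$
That is, the $f_i$'s are constant along the $u_1$-axis. Clearly $X = \sum_i c_i\,\frac{\partial}{\partial u_i}$ on the $u_1$-axis, with $c_i \in \mathbb{R}$. That is, $X$ is parallel in the usual sense. More generally, if $\gamma$ is any curve and $X$ is parallel along $\gamma$, then $X(\gamma(s))$ is parallel to $X(\gamma(t))$ in the usual sense for all $s$, $t$ in the domain of $\gamma$.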
On the other hand $Y = \sin(\pi/2 + t)\,\frac{\partial}{\partial u_1} + \sum_{i \ge 2} \frac{\partial}{\partial u_i}$ is not parallel on $\gamma$.
Example 5.5.6 Let $\gamma(t) := (\cos t, \sin t, 0)$, $t \in \mathbb{R}$, be a curve on the unit sphere $S^2 \subset \mathbb{R}^3$ endowed with the induced Riemannian metric. Let $X(t) := e_3$ for all $t$. Then $X$ is parallel on $\gamma$.
Example 5.5.7 Let $M$ be a hypersurface in $\mathbb{R}^n$. Let $\gamma$ be a curve on $M$. Take $X = \gamma'(t)$. Then $D_{\gamma'(t)}\gamma'(t) = 0$ if and only if the tangent vector field $\gamma'(t)$ of the curve $\gamma$ is parallel along $\gamma$. Using the definition of the induced connection (with $\bar{D}$ the usual connection of $\mathbb{R}^n$ and $(\cdot)^T$ the tangential component) we get
$$D_{\gamma'}\gamma' = \frac{D}{dt}\gamma' = (\bar{D}_{\gamma'}\gamma')^T = (\gamma'')^T = 0.$$
Intuitively this means that the acceleration of $\gamma$ is orthogonal to the tangent space to $M$ in $\mathbb{R}^n$. As observed from the manifold $M$ there is no acceleration. These curves deserve a special name: they are called the geodesics of $M$. For example on $M = S^n$, if $e_1$ and $e_2$ are two orthonormal vectors then $\gamma(t) = \sin t\, e_1 + \cos t\, e_2$ is a great circle and it is a geodesic. (Since $\gamma''(t) = -\gamma(t) \perp T_{\gamma(t)}S^n$.)
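Now on a general manifold with a connection $D$, we shall show that if $X$ is parallel along $\gamma$ (not necessarily a geodesic), then $X$ satisfies a system of linear ODE. For $\gamma$ (or a piece of $\gamma$) lying in a coordinate chart, write $X = \sum_i f_i \partial_i$ and $\gamma_i := x_i \circ \gamma$, and let $\Gamma^k_{ij}$ be given by $D_{\partial_i}\partial_j = \sum_k \Gamma^k_{ij}\partial_k$. By Eq. 5.5.2 the condition $\frac{D}{dt}X = 0$ reads
$$0 = \frac{df_k}{dt} + \sum_{i,j} f_j\,\gamma_i'\,\Gamma^k_{ij} \quad \text{on } \gamma \text{ for all } k.$$
These are the differential equations for the components $f_i$ of a vector field parallel along $\gamma$.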
So from the global existence and uniqueness of solutions of linear systems of ODE, we have a unique system of solutions $(f_1, f_2, \ldots, f_n)$ as soon as we impose an initial condition, say $f_i(t_0) = v_i$ for $1 \le i \le n$. This means that a vector field $X$ parallel along $\gamma$ is uniquely determined once we know its value at a point of the curve $\gamma$. In other words, given $v \in T_{\gamma(t_0)}M$, there exists a unique vector field $X$ such that $X$ is parallel along $\gamma$ and $X(\gamma(t_0)) = v$.
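The linear system above can be integrated numerically. The following is a minimal sketch, assuming the unit sphere with coordinates $(\theta, \phi)$ (colatitude, longitude), whose nonzero Christoffel symbols are $\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi_{\theta\phi} = \Gamma^\phi_{\phi\theta} = \cot\theta$, and a classical fourth-order Runge-Kutta scheme. The function names, the choice of curve (a circle of latitude) and the step count are illustrative assumptions.

```python
import math

def christoffel_sphere(x):
    """G[k][i][j] = Gamma^k_{ij} for the unit sphere, x = (theta, phi)."""
    theta = x[0]
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -math.sin(theta) * math.cos(theta)
    G[1][0][1] = G[1][1][0] = math.cos(theta) / math.sin(theta)
    return G

def rhs(t, f, curve, velocity, christoffel):
    """f_k' = -sum_{i,j} Gamma^k_{ij} gamma_i' f_j, the parallel-field ODE."""
    G = christoffel(curve(t))
    dg = velocity(t)
    n = len(f)
    return [-sum(G[k][i][j] * dg[i] * f[j] for i in range(n) for j in range(n))
            for k in range(n)]

def parallel_transport(f0, t0, t1, curve, velocity, christoffel, steps=2000):
    """Integrate the ODE from t0 to t1 with classical RK4."""
    h = (t1 - t0) / steps
    t, f = t0, list(f0)
    for _ in range(steps):
        k1 = rhs(t, f, curve, velocity, christoffel)
        k2 = rhs(t + h / 2, [a + h / 2 * b for a, b in zip(f, k1)],
                 curve, velocity, christoffel)
        k3 = rhs(t + h / 2, [a + h / 2 * b for a, b in zip(f, k2)],
                 curve, velocity, christoffel)
        k4 = rhs(t + h, [a + h * b for a, b in zip(f, k3)],
                 curve, velocity, christoffel)
        f = [a + h / 6 * (p + 2 * q + 2 * r + s)
             for a, p, q, r, s in zip(f, k1, k2, k3, k4)]
        t += h
    return f

theta0 = math.pi / 3                      # colatitude of the circle of latitude
latitude = lambda t: (theta0, t)
latitude_velocity = lambda t: (0.0, 1.0)
f_end = parallel_transport([1.0, 0.0], 0.0, 2 * math.pi,
                           latitude, latitude_velocity, christoffel_sphere)
print(f_end)  # close to (-1, 0): the vector comes back rotated by the angle pi
```

For this curve the system can be solved by hand: the pair $(f_\theta, \sin\theta_0\, f_\phi)$ rotates with angular speed $\cos\theta_0$, so after one full circuit at $\theta_0 = \pi/3$ the vector returns as its negative, while its length is unchanged.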
We therefore get a map $T_{\gamma(t_0)} : T_{\gamma(t_0)}M \to T_{\gamma(t)}M$ as follows: for all $v \in T_{\gamma(t_0)}M$, let $X$ be the unique parallel vector field on $\gamma$ such that $X(\gamma(t_0)) = v$; then $T_{\gamma(t_0)}(v) = X(\gamma(t))$. $T_{\gamma(t_0)}$ is called the parallel transport from $\gamma(t_0)$ to $\gamma(t)$ along $\gamma$ or with respect to $\gamma$. It has the following property: for any Riemannian connection $D$, $T_{\gamma(t_0)}$ is a linear isometry, as we show below.
It is linear: Let $\gamma(t_0) = p$ and $\gamma(t) = q$. If $v, w \in T_pM$, with corresponding parallel fields $X$, $Y$ along $\gamma$, then $X + Y$ will be parallel along $\gamma$ with $(X + Y)(p) = v + w$, etc. Hence it is linear.
It is injective: For, if $T_{\gamma(t_0)}(v) = 0$, then the parallel field $X$ with $X(\gamma(t_0)) = v$ vanishes at $\gamma(t)$; by the uniqueness of solutions of the ODE, $X \equiv 0$ and hence $v = 0$.
It is an isometry: For, if $X$, $Y$ are as above, we have along $\gamma$, since $D$ is Riemannian,
$$\frac{d}{dt}\langle X, Y\rangle = \langle D_{\gamma'(t)}X, Y\rangle + \langle X, D_{\gamma'(t)}Y\rangle = 0. \tag{5.5.3}$$
Hence along $\gamma$, $\langle X, Y\rangle$ is a constant. Thus we have
$$\langle X(t_0), Y(t_0)\rangle = \langle X(t), Y(t)\rangle$$
for all $t$. This proves that $T_{\gamma(t_0)}$ is an isometry.

Because of Equation (5.5.3), one usually paraphrases the condition (6) for a Riemannian connection $D$ as follows: parallel translation along any curve with respect to $D$ is an isometry.
In the case of $\mathbb{R}^n$, parallel transport does not depend on curves. However, in general, it will depend on the curve. In fact, our final theorem in this section will show that this is the feature which tells us how curved $M$ is from being as flat as $\mathbb{R}^n$.
We shall now give a simple example to illustrate the phenomenon

emphasized above.
Figure 5.5.1 Parallel transport along curves
Example 5.5.8 Let $M = S^2$ with the induced metric and Levi-Civita connection. Take $p = (0, 0, 1)$ and $q = (0, 0, -1)$, $\gamma(t) = (\sin t, 0, \cos t)$ and $\sigma(t) = (0, \sin t, \cos t)$. Take $v = (1, 0, 0) \in T_p(S^2)$. $\gamma$ and $\sigma$ are half great circles from $p$ to $q$ and hence are geodesics by Example 5.5.7. Now if a vector field $X$ (respectively, $Y$) is parallel along $\gamma$ (respectively, $\sigma$) with $X_p = v = Y_p$, then $X$ (respectively, $Y$) has constant norm on $\gamma$ (respectively, on $\sigma$) and also $\angle(X, \gamma')$ (respectively, $\angle(Y, \sigma')$) is constant, and conversely. For, on $\gamma$, $\gamma'(t) \neq 0$ (respectively, on $\sigma$, $\sigma'(t) \neq 0$). Choose $0 \neq w \in T_p(S^2)$ such that $w \perp \gamma'(0)$ (respectively, $\perp \sigma'(0)$). Let $Z$ (respectively, $W$) be the unique vector field parallel along $\gamma$ (respectively, $\sigma$) with $Z_p = w$. Then $Z_{\gamma(t)}$ and $\gamma'(t)$ span $T_{\gamma(t)}(S^2)$. The analogous statement for $W$ and $\sigma$ also holds. Hence we can write $X = \alpha Z + \beta\gamma'(t)$ along $\gamma$ ($Y = \alpha W + \beta\sigma'(t)$ along $\sigma$), and
$$\cos\bigl(\angle(X, \gamma'(t))\bigr) = \beta\,\frac{\langle \gamma'(t), \gamma'(t)\rangle}{\|\gamma'(t)\|}.$$
Here we assumed $\|v\| = 1$. The constancy of the cosine of the angle and of $\|X\|$ implies that $X$ is a constant linear combination of the parallel vector fields $Z$ and $\gamma'(t)$ along $\gamma$. Hence $X$ is parallel along $\gamma$; similarly $Y$ is parallel along $\sigma$. Thus we can write $X(t) = X(\gamma(t)) = \gamma'(t)$ and $Y(\sigma(t)) = e_1$, so that $X(\pi) = -e_1$ and $Y(\pi) = e_1$. Thus $T(\gamma)(e_1) = -e_1$ and $T(\sigma)(e_1) = e_1$. (Geometrically this is obvious! See Figure 5.5.1.)
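The conclusion of Example 5.5.8 can also be checked numerically. The sketch below assumes the description of the induced connection on the unit sphere in $\mathbb{R}^3$ as the tangential part of the ordinary derivative; a tangent field $X$ along a curve $c$ is then parallel exactly when $X' = -\langle X, c'\rangle\, c$. The function name and step count are illustrative assumptions.

```python
import math

def transport_on_unit_sphere(curve, velocity, x0, t0, t1, steps=2000):
    """RK4 integration of X' = -<X, c'> c along a curve c on the unit sphere."""
    def rhs(t, x):
        c, dc = curve(t), velocity(t)
        lam = -sum(a * b for a, b in zip(x, dc))
        return [lam * ci for ci in c]

    h = (t1 - t0) / steps
    t, x = t0, list(x0)
    for _ in range(steps):
        k1 = rhs(t, x)
        k2 = rhs(t + h / 2, [a + h / 2 * b for a, b in zip(x, k1)])
        k3 = rhs(t + h / 2, [a + h / 2 * b for a, b in zip(x, k2)])
        k4 = rhs(t + h, [a + h * b for a, b in zip(x, k3)])
        x = [a + h / 6 * (p + 2 * q + 2 * r + s)
             for a, p, q, r, s in zip(x, k1, k2, k3, k4)]
        t += h
    return x

gamma = lambda t: (math.sin(t), 0.0, math.cos(t))
gamma_dot = lambda t: (math.cos(t), 0.0, -math.sin(t))
sigma = lambda t: (0.0, math.sin(t), math.cos(t))
sigma_dot = lambda t: (0.0, math.cos(t), -math.sin(t))

v = (1.0, 0.0, 0.0)  # the vector e_1 at the north pole p
along_gamma = transport_on_unit_sphere(gamma, gamma_dot, v, 0.0, math.pi)
along_sigma = transport_on_unit_sphere(sigma, sigma_dot, v, 0.0, math.pi)
print(along_gamma)  # close to -e_1
print(along_sigma)  # close to  e_1
```

The two results differ by a sign, which is the dependence of parallel transport on the curve that the example illustrates.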
Our first idea of differentiating a vector field with respect to another
vector field given parallel translation is realized in the following
Theorem 5.5.9 Let $M$ be a manifold with a connection $D$. Let $p \in M$ and $X, Y \in \mathfrak{X}(M)$ with $X_p \neq 0$. Let $s \mapsto \gamma(s)$ be an integral curve of $X$ through $p$. Let $T_s : T_pM \to T_{\gamma(s)}M$ be the parallel translation with respect to $\gamma$. Then
$$(D_X Y)(p) = \lim_{s \to 0} \frac{T_s^{-1}Y_{\gamma(s)} - Y_{\gamma(0)}}{s}.$$
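Proof With $x_i$ as local coordinates centered at $p$, we write $x_i(s)$ for $x_i(\gamma(s))$. Fix $s > 0$. Let $Z_{\gamma(t)}$ for $0 \le t \le s$ be the parallel translate of $Z_{\gamma(0)} = T_s^{-1}Y_{\gamma(s)}$ along $\gamma$. We can then write
$$Z_{\gamma(t)} = \sum_i Z_i(t)\,\frac{\partial}{\partial x_i}\Big|_{\gamma(t)}, \qquad Y_{\gamma(t)} = \sum_i Y_i(t)\,\frac{\partial}{\partial x_i}\Big|_{\gamma(t)}.$$
Then by the defining ODE for parallel vector fields along $\gamma$ (note that $\gamma_i'(t) = x_i'(t)$, as $\gamma$ is the integral curve for $X$) we have
$$Z_k'(t) + \sum_{i,j} \Gamma^k_{ij}\, x_i'(t)\, Z_j(t) = 0$$
for $0 \le t \le s$ with initial conditions $Z_k(s) = Y_k(s)$ with $1 \le k \le m$. But by the mean value theorem $Z_k(s) = Z_k(0) + s\, Z_k'(\xi)$ where $0 \le \xi \le s$. Therefore $(1/s)\bigl(T_s^{-1}Y_{\gamma(s)} - Y_p\bigr)$ has as its $k$-th component
$$(1/s)\bigl(Z_k(0) - Y_k(0)\bigr) = (1/s)\bigl(Z_k(s) - s\, Z_k'(\xi) - Y_k(0)\bigr) = \sum_{i,j} \Gamma^k_{ij}(\gamma(\xi))\, x_i'(\xi)\, Z_j(\xi) + (1/s)\bigl(Y_k(s) - Y_k(0)\bigr).$$
The right hand side tends to $a_k := \frac{dY_k}{dt}(0) + \sum_{i,j} \Gamma^k_{ij}\,\frac{dx_i}{dt}\, Y_j$ as $s \to 0$. But then $(D_X Y)(p) = \sum_k a_k\,\frac{\partial}{\partial x_k}\big|_p$. $\square$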
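Theorem 5.5.10 Let $M$, $D$ be as in Theorem 5.5.9. Fix $p \in M$. Let $X_p \neq 0$, $Y_p \neq 0$ and $Z_p \in T_pM$. Let $x_i$ be coordinates around $p$ such that $x_i(p) = 0$, $\frac{\partial}{\partial x_1}\big|_p = X_p$, $\frac{\partial}{\partial x_2}\big|_p = Y_p$. (This is always possible.) Let $\gamma_\varepsilon$ denote the boundary of the square
$$\{(x_1, x_2, 0, \ldots, 0) : 0 \le x_1, x_2 \le \varepsilon\}$$
in $M$ traced counter clockwise. Let $Z_\varepsilon \in T_pM$ denote the vector obtained by parallelly translating $Z_p$ along $\gamma_\varepsilon$. Then
$$\lim_{\varepsilon \to 0} \frac{Z_\varepsilon - Z_p}{\varepsilon^2} = -R(X_p, Y_p)Z_p.$$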
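Proof By our choice of coordinates, $\gamma_\varepsilon$ is formed by integral curves of $X$, $Y$, $-X$ and $-Y$ in that order. Let $P_0 = p = (0, 0, \ldots, 0)$, $P_1 = (\varepsilon, 0, \ldots, 0)$, $P_2 = (\varepsilon, \varepsilon, 0, \ldots, 0)$, $P_3 = (0, \varepsilon, 0, \ldots, 0)$. Let $T_{ij}$ := the parallel transport from $P_j$ to $P_i$ along $\gamma_\varepsilon$. Let $Z_i = Z_{P_i}$ (so that $Z_0 = Z \in T_pM$). We then have
$$\begin{aligned}
T_{03}T_{32}T_{21}T_{10}Z_0 - Z_0 &= (T_{03}T_{32}T_{21}T_{10}Z_0 - T_{03}T_{32}T_{21}Z_1) + (T_{03}T_{32}T_{21}Z_1 - T_{03}T_{32}Z_2) \\
&\quad + (T_{03}T_{32}Z_2 - T_{03}Z_3) + (T_{03}Z_3 - Z_0) \\
&= T_{03}T_{32}T_{21}(T_{10}Z_0 - Z_1) + T_{03}T_{32}(T_{21}Z_1 - Z_2) \\
&\quad + T_{03}T_{32}(Z_2 - T_{23}Z_3) + T_{03}(Z_3 - T_{30}Z_0) \\
&= T_{03}T_{32}T_{21}\bigl(-\varepsilon (D_X Z)_1 + \tfrac{\varepsilon^2}{2}(D_X^2 Z)_1\bigr) + T_{03}T_{32}\bigl(-\varepsilon (D_Y Z)_2 + \tfrac{\varepsilon^2}{2}(D_Y^2 Z)_2\bigr) \\
&\quad - T_{03}T_{32}\bigl(-\varepsilon (D_X Z)_2 + \tfrac{\varepsilon^2}{2}(D_X^2 Z)_2\bigr) - T_{03}\bigl(-\varepsilon (D_Y Z)_3 + \tfrac{\varepsilon^2}{2}(D_Y^2 Z)_3\bigr) + O(\varepsilon^3).
\end{aligned}$$
We now group the first and fifth terms together and apply Theorem 5.5.9 above, then the second and sixth and apply Theorem 5.5.9, etc. We then get
$$\begin{aligned}
T_{03}T_{32}T_{21}T_{10}Z_0 - Z_0 &= \varepsilon^2\, T_{03}T_{32}(D_Y D_X Z)_2 - \varepsilon^2\, T_{03}(D_X D_Y Z)_3 + O(\varepsilon^3) \\
&= \varepsilon^2\, T_{03}(D_Y D_X Z)_3 - \varepsilon^2\, T_{03}(D_X D_Y Z)_3 + O(\varepsilon^3) \\
&= -\varepsilon^2\, T_{03}(R(X, Y)Z)_3 + O(\varepsilon^3) \\
&= -\varepsilon^2\, (R(X, Y)Z)_0 + O(\varepsilon^3) \\
&= -\varepsilon^2\, R(X_p, Y_p)Z_p + O(\varepsilon^3).
\end{aligned}$$
$\square$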
Theorem 5.5.11 Let $x, y, z, v, w \in T_pM$. Then
1. $R_{xy} = -R_{yx}$
2. $\langle R_{xy}v, w\rangle = -\langle R_{xy}w, v\rangle$
3. $R_{xy}z + R_{yz}x + R_{zx}y = 0$
4. $\langle R_{xy}v, w\rangle = \langle R_{vw}x, y\rangle$
Proof Exercise. $\square$
The above theorem lists the important symmetry properties enjoyed
by the curvature operator on a Riemannian manifold with Levi-Civita
connection.
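These symmetries can be tested on the model curvature operator of Equation (5.4.7), $R(x, y)z = \langle Ay, z\rangle Ax - \langle Ax, z\rangle Ay$ with $A$ symmetric. The sketch below is only a numerical spot check on one arbitrarily chosen symmetric matrix and a few vectors (all of them illustrative assumptions); it does not replace the proof asked for in the exercise.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def matvec(A, x):
    return [dot(row, x) for row in A]

def curvature(A, x, y, z):
    """R(x, y)z = <Ay, z> Ax - <Ax, z> Ay, the form of Equation (5.4.7)."""
    Ax, Ay = matvec(A, x), matvec(A, y)
    return [dot(Ay, z) * p - dot(Ax, z) * q for p, q in zip(Ax, Ay)]

# A symmetric matrix and test vectors, chosen arbitrarily for illustration.
A = [[2.0, 0.5, 0.0], [0.5, 1.0, -0.3], [0.0, -0.3, 0.7]]
x, y, z = [1.0, 2.0, -1.0], [0.5, -1.0, 3.0], [2.0, 0.0, 1.0]
v, w = [1.0, 1.0, 1.0], [-2.0, 0.5, 1.5]

# 1. R_xy + R_yx = 0 (applied to z)
skew_xy = [a + b for a, b in zip(curvature(A, x, y, z), curvature(A, y, x, z))]
# 2. <R_xy v, w> + <R_xy w, v> = 0
skew_vw = dot(curvature(A, x, y, v), w) + dot(curvature(A, x, y, w), v)
# 3. R_xy z + R_yz x + R_zx y = 0
bianchi = [a + b + c for a, b, c in zip(curvature(A, x, y, z),
                                        curvature(A, y, z, x),
                                        curvature(A, z, x, y))]
# 4. <R_xy v, w> - <R_vw x, y> = 0
pair_swap = dot(curvature(A, x, y, v), w) - dot(curvature(A, v, w, x), y)

print(skew_xy, skew_vw, bianchi, pair_swap)  # all zero up to rounding
```

Properties 3 and 4 use the symmetry of $A$; replacing `A` by a non-symmetric matrix makes them fail in general.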
5.6 Cartan structural equations

Definition 5.6.1 Let $(M, g)$ be a Riemannian manifold with the Levi-Civita connection $D$. A set of vector fields $e_i \in \mathfrak{X}(U)$ for $1 \le i \le m$ is said to be an orthonormal frame if $\{e_i(p)\}$ form an (orthonormal) basis for $T_pM$ for each $p \in U$.
Given a point $p$, there exists an open set $U \ni p$ and an orthonormal frame on $U$. This follows trivially from the Gram-Schmidt orthogonalization procedure. We shall only sketch the construction. We choose a chart containing $p$ and let $X_i := \frac{\partial}{\partial x_i}$ be the coordinate vector fields. The $X_i$'s form a frame on $U$. Now we observe that the norm function is smooth on $V \setminus \{0\}$ where $V$ is any real inner product space, so that when we apply the Gram-Schmidt process we shall get a smooth orthonormal frame.
However, we may not be able to find such an orthonormal frame on the whole of $M$. (Can you think of an example?)
Cartan's structural equations deal with $D_{e_i}e_j$. First of all we remark that $D_{e_i}e_j$ makes sense, even though $e_i$ may not be defined on all of $M$. For, as we had seen in Section 5.5, to compute $D_X Y(p)$, it is sufficient to know the value of $Y$ along a curve $\gamma$ through $p$ such that $\gamma'(0) = X_p$.
Since $\{e_i\}$ is a frame on $U$, we can write
$$D_{e_i}e_j = \sum_{k=1}^m \Gamma^k_{ij}\, e_k.$$
Thus we can find $m^2$ one-forms $\omega_i^j$, for $1 \le i, j \le m$, such that
$$D_X(e_i) = \sum_k \omega_i^k(X)\, e_k. \tag{5.6.1}$$
These forms $\omega_i^k$ are called the connection forms of the connection $D$. Notice that if we take $e_i = X_i$, the coordinate vector fields, then $\omega_i^k(e_j) = \Gamma^k_{ji}$, the Christoffel symbols. If we denote by $\omega^i$ the 1-forms dual to $e_i$, then we refer to $\omega^i$ as a dual frame. We recall that for any 1-form $\omega$, the exterior derivative is given by (see (3.4.4) page 196)
$$d\omega(X, Y) = X(\omega(Y)) - Y(\omega(X)) - \omega([X, Y]).$$
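We fix a local frame $\{e_i\}$ and its dual frame $\{\omega^i\}$. We wish to compute $d\omega^i$. We have (using the symmetry of the connection)
$$\begin{aligned}
d\omega^i(e_k, e_l) &= e_k(\omega^i(e_l)) - e_l(\omega^i(e_k)) - \omega^i([e_k, e_l]) \\
&= -\omega^i([e_k, e_l]) \\
&= -\omega^i(D_{e_k}e_l - D_{e_l}e_k) \\
&= -\omega^i\Bigl(\sum_s \omega_l^s(e_k)\, e_s - \sum_r \omega_k^r(e_l)\, e_r\Bigr) = -\omega_l^i(e_k) + \omega_k^i(e_l).
\end{aligned} \tag{5.6.2}$$
On the other hand,
$$\Bigl(\sum_j \omega_j^i \wedge \omega^j\Bigr)(e_k, e_l) = \sum_j \bigl(\omega_j^i(e_k)\,\omega^j(e_l) - \omega_j^i(e_l)\,\omega^j(e_k)\bigr) = \omega_l^i(e_k) - \omega_k^i(e_l). \tag{5.6.3}$$
From Equations (5.6.2) and (5.6.3) we get the first structural equation of Cartan:
$$d\omega^i = -\sum_j \omega_j^i \wedge \omega^j. \tag{5.6.4}$$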
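We now derive the second equation of Cartan, which deals with the relation between the exterior derivative of the connection form and the curvature. We compute $R(X, Y)e_i$:
$$\begin{aligned}
R(X, Y)e_i &= (D_X D_Y - D_Y D_X - D_{[X,Y]})\,e_i \\
&= D_X\Bigl(\sum_r \omega_i^r(Y)\, e_r\Bigr) - D_Y\Bigl(\sum_r \omega_i^r(X)\, e_r\Bigr) - \sum_r \omega_i^r([X, Y])\, e_r \\
&= \sum_r \bigl(X(\omega_i^r(Y)) - Y(\omega_i^r(X)) - \omega_i^r([X, Y])\bigr)\, e_r \\
&\quad + \sum_r \omega_i^r(Y)\Bigl(\sum_s \omega_r^s(X)\, e_s\Bigr) - \sum_r \omega_i^r(X)\Bigl(\sum_s \omega_r^s(Y)\, e_s\Bigr).
\end{aligned}$$
Thus we see that
$$R(X, Y)e_i = \sum_s \Bigl(d\omega_i^s - \sum_r \omega_i^r \wedge \omega_r^s\Bigr)(X, Y)\, e_s. \tag{5.6.5}$$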
We set
$$R_{ijkl} \equiv R^j_{ikl} := \langle R(e_k, e_l)e_i, e_j\rangle. \tag{5.6.6}$$
If we define two-forms
$$\Omega_i^j := \sum_{1 \le k < l \le m} R_{ijkl}\, \omega^k \wedge \omega^l, \tag{5.6.7}$$
then the curvature operator $R(X, Y)$ has matrix entries $\Omega_i^j(X, Y)$ with respect to the basis $\{e_i\}$. The 2-forms $\Omega_i^j$ are called the curvature forms of the connection $D$. We have thus derived the second equation of Cartan:
$$\Omega_i^j = d\omega_i^j + \sum_k \omega_k^j \wedge \omega_i^k. \tag{5.6.8}$$
We now further assume that the frame $\{e_i\}$ is orthonormal. Since $\langle e_i, e_j\rangle = \delta_{ij}$, the Kronecker delta, we have $X\langle e_i, e_j\rangle = 0$ for any $X \in \mathfrak{X}(U)$. Since the connection is Riemannian, we have
$$0 = X\langle e_i, e_j\rangle = \langle D_X e_i, e_j\rangle + \langle e_i, D_X e_j\rangle = \Bigl\langle \sum_k \omega_i^k(X)\, e_k,\ e_j\Bigr\rangle + \Bigl\langle e_i,\ \sum_k \omega_j^k(X)\, e_k\Bigr\rangle.$$
Hence we deduce that $\omega_i^j + \omega_j^i = 0$ for all $1 \le i, j \le m$. We collect all these results in the form of
Theorem 5.6.2 (Cartan Structural Equations) Let $\{e_i\}_{1 \le i \le m}$ be an orthonormal frame on $U$. Let $\{\omega^i\}$ be the dual frame. Let the connection forms be defined by
$$D_X e_i := \sum_j \omega_i^j(X)\, e_j.$$
If $R(X, Y)$ denotes the curvature operator for $X, Y \in \mathfrak{X}(U)$, let the curvature tensor $R^j_{ikl}$ be defined by
$$R(e_k, e_l)e_i := \sum_j R^j_{ikl}\, e_j.$$
Let the curvature forms $\Omega_i^j$ be defined as follows:
$$\Omega_i^j := \sum_{1 \le k < l \le m} R^j_{ikl}\, \omega^k \wedge \omega^l.$$
We then have the following set of equations:
1. $\omega_i^j + \omega_j^i = 0$ for all $1 \le i, j \le m$.
2. $d\omega^i = -\sum_j \omega_j^i \wedge \omega^j$. (First equation of Cartan)
3. $\Omega_i^j = d\omega_i^j + \sum_k \omega_k^j \wedge \omega_i^k$. (Second equation of Cartan)
$\square$
Conversely, if we are given an orthonormal dual frame $\omega^i$ and 1-forms $\omega_i^j$ satisfying equations (1) and (2) of Theorem 5.6.2, then we can define the Levi-Civita connection by the equation $D_X e_i = \sum_j \omega_i^j(X)\, e_j$. It is a miraculous fact that all the information about the local geometric properties of the Riemannian manifold is encoded in this set of three equations. These equations are of great use in local computations. We shall give a few sample applications in this section. Another neat application is presented in Section 5.7.
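Example 5.6.3 Let $M := U \subset \mathbb{R}^2$ be open. Let a new Riemannian metric be given on this manifold by $ds^2 := a^2(x, y)\,dx^2 + b^2(x, y)\,dy^2$, where $a$ and $b$ are positive smooth functions on $U$. That is, for the coordinate vector fields $\frac{\partial}{\partial x}$ and $\frac{\partial}{\partial y}$ we define $\langle \frac{\partial}{\partial x}, \frac{\partial}{\partial y}\rangle = 0$, $\langle \frac{\partial}{\partial x}, \frac{\partial}{\partial x}\rangle = a^2$ and $\langle \frac{\partial}{\partial y}, \frac{\partial}{\partial y}\rangle = b^2$. Then as an orthonormal frame we can take $e_1 = a^{-1}\frac{\partial}{\partial x}$ and $e_2 = b^{-1}\frac{\partial}{\partial y}$, so that the dual frame is $\omega^1 = a\,dx$ and $\omega^2 = b\,dy$. Let the connection form $\omega_2^1$ be expressed by $p\,\omega^1 + q\,\omega^2$ for smooth functions $p$ and $q$. Notice that we then have the other connection form $\omega_1^2 = -p\,\omega^1 - q\,\omega^2$ in view of the skew symmetry of the connection forms. We compute:
$$d\omega^1 = da \wedge dx = -b^{-1}a_y\, dx \wedge \omega^2,$$
$$d\omega^2 = db \wedge dy = -a^{-1}b_x\, dy \wedge \omega^1.$$
Comparing with the first structural equation, which here reads $d\omega^1 = -\omega_2^1 \wedge \omega^2$ and $d\omega^2 = \omega_2^1 \wedge \omega^1$, we have $\omega_2^1 = b^{-1}a_y\,dx - a^{-1}b_x\,dy$. We now compute $d\omega_2^1$ and use the second Cartan structural equation to find the Gaussian curvature of $(M, g)$. We have
$$d\omega_2^1 = (b^{-1}a_y)_y\, dy \wedge dx - (a^{-1}b_x)_x\, dx \wedge dy = -\frac{1}{ab}\bigl((b^{-1}a_y)_y + (a^{-1}b_x)_x\bigr)\, \omega^1 \wedge \omega^2.$$
Now we recall that the Gaussian curvature $K$ of a two dimensional manifold is given by $K = \langle R(e_1, e_2)e_2, e_1\rangle$. (See the proof of Gauss theorem in Section 5.4, in particular, Equation (5.4.8).) By Theorem 5.6.2 we have
$$d\omega_2^1 = R^1_{212}\, \omega^1 \wedge \omega^2 = \langle R(e_1, e_2)e_2, e_1\rangle\, \omega^1 \wedge \omega^2.$$
Hence the Gaussian curvature is given by
$$K = -\frac{1}{ab}\Bigl(\Bigl(\frac{a_y}{b}\Bigr)_y + \Bigl(\frac{b_x}{a}\Bigr)_x\Bigr).$$
When $a = b$, the Riemannian metric is said to be conformal with the standard Riemannian metric on $U$.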
Example 5.6.4 We shall now specialize to the Poincaré metric on the upper half plane. Let $M := \{(x, y) \in \mathbb{R}^2 : y > 0\}$. The Poincaré metric on $M$ is given by $ds^2 = y^{-2}(dx^2 + dy^2)$. We take $e_1 = y\,\frac{\partial}{\partial x}$ and $e_2 = y\,\frac{\partial}{\partial y}$. Then the dual frame is $\omega^1 = y^{-1}dx$ and $\omega^2 = y^{-1}dy$. Proceeding as above we find that the connection form $\omega_2^1$ is given by $\omega_2^1 = -y^{-1}dx = -\omega^1$. Hence we find that
$$\Omega_2^1 = d\omega_2^1 = -\omega^1 \wedge \omega^2 = \langle R(e_1, e_2)e_2, e_1\rangle\, \omega^1 \wedge \omega^2.$$
Hence the curvature of $(M, g)$ is $-1$.
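The curvature formula $K = -\frac{1}{ab}\bigl((a_y/b)_y + (b_x/a)_x\bigr)$ for a metric $a^2dx^2 + b^2dy^2$ is easy to evaluate numerically. The following sketch uses nested central differences; the function name, the step `h` and the sample points are illustrative assumptions, and the result is accurate only up to discretisation error.

```python
import math

def gaussian_curvature(a, b, x, y, h=1e-3):
    """K = -(1/(a b)) * ((a_y / b)_y + (b_x / a)_x) by nested central differences."""
    def ay_over_b(s, t):
        return (a(s, t + h) - a(s, t - h)) / (2 * h) / b(s, t)

    def bx_over_a(s, t):
        return (b(s + h, t) - b(s - h, t)) / (2 * h) / a(s, t)

    d_y = (ay_over_b(x, y + h) - ay_over_b(x, y - h)) / (2 * h)
    d_x = (bx_over_a(x + h, y) - bx_over_a(x - h, y)) / (2 * h)
    return -(d_y + d_x) / (a(x, y) * b(x, y))

one = lambda x, y: 1.0
inv_y = lambda x, y: 1.0 / y

# Poincare upper half plane: a = b = 1/y
print(gaussian_curvature(inv_y, inv_y, 0.3, 1.5))                   # close to -1
# Round unit sphere, ds^2 = dx^2 + sin(x)^2 dy^2 (x = colatitude)
print(gaussian_curvature(one, lambda x, y: math.sin(x), 1.0, 0.2))  # close to 1
# Flat plane in polar coordinates, ds^2 = dx^2 + x^2 dy^2 (x = radius)
print(gaussian_curvature(one, lambda x, y: x, 2.0, 0.5))            # close to 0
```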

Definition 5.6.5 In a general Riemannian manifold, the sectional curvature of a plane $P \subset T_pM$ at $p$ is defined by $K_p(P) := \langle R(e_1, e_2)e_2, e_1\rangle$ where $\{e_1, e_2\}$ is assumed to be an orthonormal basis of $P$.
If we use any basis $x$, $y$ of $P$ then we have
$$K_p(P) = \frac{\langle R(x, y)y, x\rangle}{\langle x, x\rangle\langle y, y\rangle - \langle x, y\rangle^2}.$$
The denominator is the square of the area of the parallelogram spanned by $x$ and $y$. We say a Riemannian manifold $(M, g)$ has non-negative sectional curvature or simply non-negative curvature if $K_p(P) \ge 0$ for all two planes $P$ in $T_pM$ and for all $p \in M$. It can be shown that the curvature is completely determined by the sectional curvatures. We say $(M, g)$ is of constant curvature $k$ if the sectional curvatures $K_p(P) = k$ for all $p \in M$ and all 2-planes $P \subset T_pM$. It is the same as saying $\Omega_j^i = k\, \omega^i \wedge \omega^j$. (Check this.) Is it possible that $K_p(P) = K_p$ is independent of the two planes in $T_pM$ at each $p$, and yet varies with $p$? The answer to this question is in
Theorem 5.6.6 (Schur's theorem) Let $(M, g)$ be a connected Riemannian manifold. Assume that the dimension of $M$ is at least 3. If the sectional curvature $K_p(P)$ is independent of the 2-plane $P$ at $p$ and hence is a function of $p$ alone, then $K_p = K$, a constant.
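Proof The idea is to show that the differential $dK$ is 0.
The assumption is that $\Omega_j^i(p) = K(p)\, \omega^i \wedge \omega^j$. We take the exterior derivative of the second structural equation, written as $\Omega_j^i = d\omega_j^i + \sum_k \omega_k^i \wedge \omega_j^k$, and use both the structural equations below:
$$\begin{aligned}
0 = d(d\omega_j^i) &= d\Omega_j^i - d\Bigl(\sum_k \omega_k^i \wedge \omega_j^k\Bigr) \\
&= dK \wedge \omega^i \wedge \omega^j + K\, d\omega^i \wedge \omega^j - K\, \omega^i \wedge d\omega^j - \sum_k d\omega_k^i \wedge \omega_j^k + \sum_k \omega_k^i \wedge d\omega_j^k \\
&= dK \wedge \omega^i \wedge \omega^j + K \sum_k \omega_i^k \wedge \omega^k \wedge \omega^j - K\, \omega^i \wedge \sum_k \omega_j^k \wedge \omega^k \\
&\quad - \sum_k \Bigl(\Omega_k^i - \sum_l \omega_l^i \wedge \omega_k^l\Bigr) \wedge \omega_j^k + \sum_k \omega_k^i \wedge \Bigl(\Omega_j^k - \sum_l \omega_l^k \wedge \omega_j^l\Bigr) \\
&= dK \wedge \omega^i \wedge \omega^j.
\end{aligned}$$
We can write the 1-form $dK$ as $dK = K_1\omega^1 + \cdots + K_m\omega^m$. Then we have $\sum_l K_l\, \omega^l \wedge \omega^i \wedge \omega^j = 0$. Thus, $K_l\, \omega^l \wedge \omega^i \wedge \omega^j = 0$ for all $1 \le i, j, l \le m$. Since $m \ge 3$, for each $l$ we can choose $i$, $j$ so that $l$, $i$, $j$ are distinct; by the linear independence of the 3-forms $\omega^l \wedge \omega^i \wedge \omega^j$ for distinct $l$, $i$, $j$ we see that $K_l = 0$ for all $1 \le l \le m$. Hence $dK = 0$ and, $M$ being connected, $K$ is a constant. $\square$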
5.7 Spaces of constant curvature
Among the Riemannian manifolds, those of constant curvature are perhaps the simplest and geometrically the most important. We now wish to exhibit such manifolds. We remark that two Riemannian manifolds with the same constant curvature are locally isometric. We omit the proof of this result.
The idea for constructing such manifolds is quite a useful trick in Riemannian geometry. We start with an open subset $U$ of $\mathbb{R}^n$ and look for a metric which is conformally equivalent to the usual Euclidean metric and try to compute the curvature form of this new metric. It will entail a differential equation for the conformal factor, a function $u$. We solve this equation to get the manifold in question.
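So, we start with a function $u : U \to \mathbb{R}$ and consider the conformal metric
$$ds^2 := e^{-2u}(dx_1^2 + \cdots + dx_n^2) =: p^{-2}(dx_1^2 + \cdots + dx_n^2),$$
where $p := e^u := e^{u(r)}$, and $r^2 := x_1^2 + \cdots + x_n^2$. We assume that the corresponding Riemannian space has constant curvature. Consider at each point the orthonormal frame whose $i$-th vector is the unit vector (with respect to the new metric) tangent to the $x_i$ curve. We denote by $\omega^i$ and $\omega_i^j$ the dual forms and the connection forms of the Levi-Civita connection. We then have $\omega^i = e^{-u}dx_i = p^{-1}dx_i$. Hence
$$d\omega^i = -e^{-u}\, du \wedge dx_i = -e^{-u} \sum_j \frac{\partial u}{\partial x_j}\, dx_j \wedge dx_i.$$
Comparing this with the first structural equation $d\omega^i = -\sum_j \omega_j^i \wedge \omega^j$ and using $\omega_j^i = -\omega_i^j$, we get
$$\sum_j \Bigl(\omega_i^j - \frac{\partial u}{\partial x_j}\, dx_i + \frac{\partial u}{\partial x_i}\, dx_j\Bigr) \wedge dx_j = 0.$$
This implies that
$$\omega_i^j = \frac{\partial u}{\partial x_j}\, dx_i - \frac{\partial u}{\partial x_i}\, dx_j = \frac{u'}{r}\,(x_j\, dx_i - x_i\, dx_j),$$
where $u' := du/dr$.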

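By Cartan's structural equations the curvature is given by
$$\Omega_i^j = d\omega_i^j - \sum_k \omega_i^k \wedge \omega_k^j.$$
Hence we have
$$\Omega_i^j = \Bigl[\frac{1}{r}\Bigl(\frac{u'}{r}\Bigr)' + \Bigl(\frac{u'}{r}\Bigr)^2\Bigr] \Bigl(\sum_k x_k\, dx_k\Bigr) \wedge (x_j\, dx_i - x_i\, dx_j) + \Bigl[-2\,\frac{u'}{r} + (u')^2\Bigr]\, dx_i \wedge dx_j.$$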
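We want this to be of constant curvature, say, $K$. This is the same as requiring
$$\Omega_i^j = -K\, \omega^i \wedge \omega^j = -K e^{-2u}\, dx_i \wedge dx_j.$$
Thus we see that $u$ has to satisfy the differential equation
$$\frac{1}{r}\Bigl(\frac{u'}{r}\Bigr)' + \Bigl(\frac{u'}{r}\Bigr)^2 = 0. \tag{5.7.1}$$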
This equation can be solved as follows: set $v := u'/r$. Then the equation becomes $\frac{1}{r}v' = -v^2$, or what is the same, $v' = -v^2 r$, or $v'/v^2 = -r$. Integrating both sides yields $-v^{-1} = (-r^2 + 2A)/2$ for some constant $A$. That is, $v = 2/(r^2 - 2A)$, which means $u'/r = 2/(r^2 - 2A)$ or
$$u' = \frac{2r}{r^2 - 2A}.$$
Integrating both sides yields the solution
$$u = \log\bigl((r^2 - 2A)B\bigr)$$
for another constant $B$. Thus we have $p = B(r^2 - 2A)$, so that the curvature of the new metric is
$$K = p^2\Bigl(\frac{2u'}{r} - (u')^2\Bigr) = -8AB^2.$$
For simplicity we choose the constants $A$ and $B$ to be such that $-2AB = 1$. Then we have $K = 4B$, so that $p = 1 + (K/4)r^2$. Thus the metric of constant curvature $K$ is given by
$$ds^2 = \frac{\sum_i dx_i^2}{\bigl(1 + \frac{K}{4}\sum_i x_i^2\bigr)^2}.$$
The above considerations are local. We now investigate the conditions on $x_i$ under which the above formula for the metric remains valid.
If $K = 0$ then we can allow $x_i$ to vary over the entire space $\mathbb{R}^n$.
If $K < 0$ then the metric is defined only when the $x_i$ remain inside the ball
$$\{x \in \mathbb{R}^n : x_1^2 + \cdots + x_n^2 < -4/K\}.$$
We make some remarks for which we do not offer any justification. Notice that as a point goes to the boundary of the ball the arc-length tends to $\infty$, so that the resulting Riemannian manifold is complete. (That is, any maximal geodesic has $\mathbb{R}$ as its domain of definition.) It is called the $n$-dimensional hyperbolic space.
If $K > 0$ then the metric is defined on the entire space $\mathbb{R}^n$. Comparing it with the spherical metric which we have introduced earlier it is clear that the resulting space is isometric to the sphere (with a point removed) of radius $1/\sqrt{K}$ in $\mathbb{R}^{n+1}$.
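In dimension two the claim that $ds^2 = p^{-2}(dx^2 + dy^2)$ with $p = 1 + (K/4)r^2$ has curvature $K$ can be spot checked numerically using the formula of Example 5.6.3 with $a = b = 1/p$. The sketch below makes that two-dimensional assumption explicitly; the function names, the step `h` and the sample point are illustrative.

```python
def conformal_curvature(p, x, y, h=1e-3):
    """Curvature of ds^2 = p^{-2}(dx^2 + dy^2), i.e. a = b = 1/p in Example 5.6.3."""
    a = lambda s, t: 1.0 / p(s, t)

    def ay_over_a(s, t):
        return (a(s, t + h) - a(s, t - h)) / (2 * h) / a(s, t)

    def ax_over_a(s, t):
        return (a(s + h, t) - a(s - h, t)) / (2 * h) / a(s, t)

    d_y = (ay_over_a(x, y + h) - ay_over_a(x, y - h)) / (2 * h)
    d_x = (ax_over_a(x + h, y) - ax_over_a(x - h, y)) / (2 * h)
    return -(d_y + d_x) / (a(x, y) ** 2)

def constant_curvature_factor(K):
    """p = 1 + (K/4) r^2 as a function of (x, y)."""
    return lambda x, y: 1.0 + (K / 4.0) * (x * x + y * y)

for K in (1.0, -1.0, 0.0, 2.5):
    # The sample point has r^2 = 0.25, inside the ball r^2 < -4/K when K = -1.
    print(K, conformal_curvature(constant_curvature_factor(K), 0.4, -0.3))
```

Each printed pair should agree up to discretisation error.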
Appendix A
Tangent Bundles and Vector Bundles
Let $M$ be a smooth manifold. Let $TM = \bigcup_{p \in M} T_pM$. We then have a set theoretic map $\pi$ from $TM$ to $M$ given by $\pi(v) = p$ if $v \in T_pM$. We wish to endow $TM$ with a smooth structure. We fix a smooth atlas for $M$. For $(U, \varphi)$, a local chart in the atlas, let $\tilde{U} = \pi^{-1}(U) = \bigcup_{p \in U} T_pM$. We define
$$\tilde{\varphi} : \tilde{U} \to \mathbb{R}^{2m}$$
by setting
$$\tilde{\varphi}(v) = (x_1(\pi(v)), \ldots, x_m(\pi(v)), v(x_1), \ldots, v(x_m)) \in \mathbb{R}^{2m}.$$
Then $\tilde{\varphi}$ is one-one on $\pi^{-1}(U)$ onto $\varphi(U) \times \mathbb{R}^m$, an open subset of $\mathbb{R}^{2m}$. We define local coordinates $\xi_j$ on $\tilde{U}$ by
$$\xi_j := u_j \circ \tilde{\varphi} \quad \text{for } 1 \le j \le 2m,$$
where $u_j$ stand for the usual coordinates on $\mathbb{R}^{2m}$. That is,
$$\xi_j(v) = \begin{cases} x_j(\pi(v)) & \text{for } 1 \le j \le m; \\ v(x_i) & \text{for } m < j = m + i \le 2m. \end{cases}$$
We now show that this collection $\{(\tilde{U}, \tilde{\varphi})\}$, as $U$ varies over all charts of the atlas for $M$, forms a smooth atlas for $TM$. The only thing to be checked is that the overlaps are smooth.
Notice that $(\tilde{U}, \tilde{\varphi})$ and $(\tilde{V}, \tilde{\psi})$ overlap if and only if the corresponding charts $(U, \varphi)$ and $(V, \psi)$ of the atlas overlap. Let $x_i$ and $y_i$ be the corresponding local coordinates on $U$ and $V$ respectively. Then given $(a, b) \in \psi(U \cap V) \times \mathbb{R}^m$ we see that
$$\tilde{\psi}^{-1}(a, b) = \sum_{i=1}^m b_i\, \frac{\partial}{\partial y_i}\Big|_{\psi^{-1}(a)}.$$
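To prove the smoothness of the overlaps it is enough to show that $u_i \circ \tilde{\varphi} \circ \tilde{\psi}^{-1}$ is smooth on $\tilde{\psi}(\tilde{U} \cap \tilde{V})$ for $1 \le i \le 2m$.
Let $1 \le i \le m$. Then
$$u_i \circ \tilde{\varphi} \circ \tilde{\psi}^{-1}(a, b) = x_i \circ \pi \circ \tilde{\psi}^{-1}(a, b) = x_i(\psi^{-1}(a)).$$
Thus $(u_1 \circ \tilde{\varphi} \circ \tilde{\psi}^{-1}, \ldots, u_m \circ \tilde{\varphi} \circ \tilde{\psi}^{-1}) = \varphi \circ \psi^{-1}$. Hence the functions $u_i \circ \tilde{\varphi} \circ \tilde{\psi}^{-1}$ for $1 \le i \le m$ are smooth.
Now, for $j = m + r$ with $1 \le r \le m$, we have
$$u_j \circ \tilde{\varphi} \circ \tilde{\psi}^{-1}(a, b) = \sum_{i=1}^m b_i\, \frac{\partial x_r}{\partial y_i}(\psi^{-1}(a)).$$
That is, the coefficients of the $b_i$ are the components of the Jacobian of the map $\varphi \circ \psi^{-1}$. Thus these maps are also smooth. $\square$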
This smooth manifold T M is called the tangent bundle of the given
manifold M.
Exercise A.1 Show how to construct the cotangent bundle $T^*M = \bigcup_{p \in M} T_p^*M$.
Exercise A.2 The map $\pi : TM \to M$ is smooth. It is called the canonical map of the tangent bundle $TM$ to the base manifold $M$.
Exercise A.3 Let $\varphi : M \to N$ be a smooth map. Then we have the derivative map $D\varphi : TM \to TN$ given by $D\varphi(v) := D\varphi(p)(v)$ for $v \in T_pM$. Is this map smooth?
Definition A.1 A smooth section $q$ of $TM$ is a smooth map $q : M \to TM$ such that $q(p) \in T_pM$ for all $p \in M$. Notice that this is the same as requiring that $\pi \circ q(p) = p$ for all $p \in M$.
We usually denote sections by symbols like $X$, $Y$, etc. We claim that a smooth section of $TM$ on $M$ is a smooth vector field on $M$. The only thing to be verified is the smoothness of a section $X$ of $TM$ when it is considered as a vector field. This follows from the
Exercise A.4 Show that in the notation as above $X$ is smooth if and only if for any local chart $(U, x)$, if we write $X = \sum_{i=1}^m X(x_i)\,\frac{\partial}{\partial x_i}$ on $U$ (using the basis theorem), then the coefficients $X(x_i)$ are smooth on $U$.
The tangent bundle is an example of the following more general concept.
Definition A.2 A vector bundle over a (smooth) manifold $M$ is a pair $(E, \pi)$ satisfying the following conditions:
(a) $E$ is a smooth manifold, called the total space.
(b) $\pi : E \to M$ is a smooth map, called the projection.
(c) There is a fixed $r \in \mathbb{N}$ (called the rank of $E$) such that for each $p \in M$, $\pi^{-1}(p)$ is an $r$-dimensional vector space over $\mathbb{R}$. $\pi^{-1}(p)$ is called the fiber over $p$ and is usually denoted by $E_p$.
(d) Condition of Local Triviality: For each $p \in M$, there is a neighborhood $U$ of $p$ and a diffeomorphism
$$\varphi : U \times \mathbb{R}^r \to \pi^{-1}(U)$$
such that for any fixed $q \in U$, the map $v \mapsto \varphi(q, v)$ is a linear isomorphism of $\mathbb{R}^r$ onto the fiber $E_q$.
On any manifold $M$ there are many vector bundles, for example, the tangent bundle $TM$, the cotangent bundle $T^*M$ and more generally $T^kM$, the $k$-th tensor bundle on $M$ (to be defined later). There are also trivial bundles $(M \times \mathbb{R}^r, \pi)$ where $\pi$ is the projection of $M \times \mathbb{R}^r$ onto the first factor. Thus the condition (d) of Definition A.2 says that any vector bundle is locally trivial.
Definition A.3 A (smooth) section of $(E, \pi)$ (or simply $E$) over $M$ is a (smooth) map $s : M \to E$ such that $s(p) \in E_p$.
Thus a (smooth) vector field is a smooth section of $TM$. A section of any trivial bundle $M \times \mathbb{R}^r$ is nothing but a map $s : M \to \mathbb{R}^r$.
Definition A.4 A local frame of $E$ is by definition a set of sections $\{e_i\}$ defined on an open subset $U$ of $M$ such that $\{e_i(p) : 1 \le i \le r\}$ is a basis for the vector space $E_p$ for all $p \in U$.
Notice that, by local triviality of $E$, local frames always exist. However there may not exist even a single nowhere vanishing section (that is, a section $s : M \to E$ such that $s(p) \neq 0$ for all $p \in M$). Can you think of an example?
There is a natural bijection between local frames and local trivializations as in condition (d). Let $(U, \varphi)$ be given as in (d). We fix a basis $v_i$ of $\mathbb{R}^r$ and define $e_i(p) := \varphi(p, v_i)$. Then it is trivial to see that $\{e_i\}$ is a local frame on $U$. Conversely, if $\{e_i\}$ is a local frame on $V$, then we have a local trivialization of $E$ on $V$ as follows: If $e \in E_p$ with $p \in V$, we have the expression $e = \sum_i a_i e_i(p)$. We then define $\psi^{-1} : \pi^{-1}(V) \to V \times \mathbb{R}^r$ by setting $\psi^{-1}(e) := (p, a_1, \ldots, a_r)$. Thus a vector bundle is trivial if and only if there exists a global frame, that is, a frame whose domain of definition is $U = M$. We say a manifold is parallelizable if the tangent bundle $TM$ is trivial. Thus any Lie group $G$ is parallelizable, whereas there exist manifolds which are not parallelizable. (Substantiate these claims.)
There are certain obvious algebraic constructions we can carry out on the class of vector bundles on $M$. Given vector bundles $E$ and $F$ we can construct $E^*$, $T^kE$, $E \oplus F$, $E \otimes F$ and $T^rE \otimes T^sF$, where $T^kE$ denotes the vector bundle whose fibers are the $k$-th tensor powers of the fibers of $E$, which were introduced in Chapter 3. The reader should be able to construct these objects quite easily.
Another important concept in the study of vector bundles is the notion of transition functions. They arise as follows. Given a vector bundle $E$ on $M$, by local triviality, there exists an open covering $\{U_\alpha\}$ of $M$ such that we have the trivializations
$$\varphi_\alpha : U_\alpha \times \mathbb{R}^r \to \pi^{-1}(U_\alpha) \subset E$$
satisfying condition (d). Now if $U_\alpha \cap U_\beta \neq \emptyset$, then we have the diffeomorphisms
$$\varphi_\alpha : (U_\alpha \cap U_\beta) \times \mathbb{R}^r \to \pi^{-1}(U_\alpha \cap U_\beta),$$
$$\varphi_\beta : (U_\alpha \cap U_\beta) \times \mathbb{R}^r \to \pi^{-1}(U_\alpha \cap U_\beta).$$
As remarked above each $\varphi_\alpha$ gives rise to a local frame $\{e_i^\alpha\}_{i=1}^r$ on $U_\alpha$. Now for $p \in U_\alpha \cap U_\beta$, both $\{e_i^\alpha(p)\}_{i=1}^r$ and $\{e_i^\beta(p)\}_{i=1}^r$ form a basis of $E_p$. Hence there exists $g_{\beta\alpha}(p) \in GL(r, \mathbb{R})$ which takes the $\alpha$-basis to the $\beta$-basis. In view of the fact that the $\varphi_\alpha$ are diffeomorphisms, it is easily seen that the map $p \mapsto g_{\beta\alpha}(p)$ from $U_\alpha \cap U_\beta$ to $GL(r, \mathbb{R})$ is smooth. These $g_{\beta\alpha}$ are called the transitions or transition functions of $E$. They enjoy the following properties:
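1. $g_{\alpha\alpha} = 1$, the identity.
2. $g_{\alpha\beta} = g_{\beta\alpha}^{-1}$.
3. If $U_\alpha \cap U_\beta \cap U_\gamma \neq \emptyset$, then $g_{\gamma\alpha} = g_{\gamma\beta} \circ g_{\beta\alpha}$ on the open set $U_\alpha \cap U_\beta \cap U_\gamma$.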
Conversely, if we are given an open covering $\{U_\alpha\}$ of a manifold
$M$ and smooth maps $g_{\beta\alpha} : U_\alpha \cap U_\beta \to GL(r, \mathbb{R})$ for all $\alpha$ and $\beta$ with
$U_\alpha \cap U_\beta \neq \emptyset$ satisfying the above three properties, then we can construct
a vector bundle $E$ in a natural way. (Exercise.) Thus the transition
functions give us an indication as to how the locally trivial products
$U_\alpha \times \mathbb{R}^r$ are glued together.
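One possible way to carry out the gluing is sketched below. It is an outline under a stated convention, namely that the transition functions are related to the trivializations by $\varphi_\beta^{-1} \circ \varphi_\alpha(p, v) = (p, g_{\beta\alpha}(p) v)$. The subscript notation $(p, v)_\alpha$ for a point of the copy $U_\alpha \times \mathbb{R}^r$ is introduced here for illustration only.

```latex
% Assumed convention: (p, v)_alpha denotes the point (p, v) in the copy
% U_alpha x R^r of the disjoint union.
\[
  E := \Big( \bigsqcup_{\alpha} U_\alpha \times \mathbb{R}^r \Big) \Big/ \sim ,
  \qquad
  (p, v)_\alpha \sim \big( p,\; g_{\beta\alpha}(p)\, v \big)_\beta
  \quad \text{for } p \in U_\alpha \cap U_\beta .
\]
% The three properties are exactly what makes ~ an equivalence relation:
%   property 1 (g_{alpha alpha} = 1)                     gives reflexivity,
%   property 2 (g_{alpha beta} = g_{beta alpha}^{-1})    gives symmetry,
%   property 3 (g_{gamma alpha} = g_{gamma beta} g_{beta alpha}) gives transitivity.
\[
  \pi\big( [(p, v)_\alpha] \big) := p ,
  \qquad
  \varphi_\alpha(p, v) := [(p, v)_\alpha] ,
  \qquad
  \varphi_\beta^{-1} \circ \varphi_\alpha (p, v) = \big( p,\; g_{\beta\alpha}(p)\, v \big) .
\]
```

With this definition each fiber $\pi^{-1}(p)$ inherits a vector space structure from any chart containing $p$, and the structure is independent of the chart because each $g_{\beta\alpha}(p)$ is linear.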
Appendix B
Partitions of Unity
Definition B.1 A collection $\{U_\alpha\}$ of subsets of a topological space $M$
is said to be locally finite if for all $p \in M$ there exists a neighbourhood
$U$ of $p$ such that $U \cap U_\alpha = \emptyset$ except for finitely many indices $\alpha$.
Definition B.2 A partition of unity on a manifold $M$ is a collection
$\{g_i : i \in I\}$ of smooth functions such that
1. the collection of supports $\{\operatorname{supp} g_i : i \in I\}$ is locally finite.
2. $g_i \geq 0$ on $M$ for all $i \in I$.
3. For all $p \in M$, $\sum_i g_i(p) = 1$. Note that by (1) this sum is finite.
A partition of unity $\{g_i\}$ is said to be subordinate to an open cover $\{U_\alpha\}$
if for each $i \in I$ there exists an $\alpha_i$ such that $\operatorname{supp} g_i \subseteq U_{\alpha_i}$. We say that
it is subordinate to the cover $\{U_i : i \in I\}$ with the same index set as the
partition of unity if $\operatorname{supp} g_i \subseteq U_i$ for all $i \in I$.
Definition B.3 Let $M$ be a topological space. We say that a cover
$\{U_\alpha\}$ is a refinement of another cover $\{V_i\}$ if for each $\alpha$ there is an $i$
such that $U_\alpha \subseteq V_i$.
Definition B.4 A topological space is paracompact if every open cover
has a locally finite refinement consisting of open sets.
Lemma B.1 Any second countable, locally compact Hausdorff space $M$
is paracompact.
Proof We first prove that there is a countable collection $\{G_i : i \in \mathbb{N}\}$
of open sets in $M$ with the following property:
$$\overline{G_i} \text{ is compact}, \qquad \overline{G_i} \subseteq G_{i+1}, \qquad M = \bigcup_{i \in \mathbb{N}} G_i. \tag{$*$}$$
Let $\{U_i\}$ be a countable basis of $M$ consisting of open sets with compact
closures. Such a basis can be obtained by starting with any countable
basis and selecting the subcollection consisting of basic open sets with
compact closures. The fact that $M$ is locally compact and Hausdorff
shows that this subcollection is again a basis.
Now let $G_1 = U_1$. Assume that $G_k = U_1 \cup \cdots \cup U_{j_k}$. Let $j_{k+1}$ be the
smallest integer greater than $j_k$ such that
$$\overline{G_k} \subseteq \bigcup_{i=1}^{j_{k+1}} U_i.$$
Define $G_{k+1} = \bigcup_{i=1}^{j_{k+1}} U_i$. Then $\{G_k\}$ satisfies $(*)$ above.
Now let $\{U_\alpha : \alpha \in A\}$ be an arbitrary open cover. The set $\overline{G_i} \setminus G_{i-1}$ is
compact and $\overline{G_i} \setminus G_{i-1} \subseteq G_{i+1} \setminus \overline{G_{i-2}}$, an open set. For each $i \geq 3$, we
choose a finite subcover of the open cover $\{U_\alpha \cap (G_{i+1} \setminus \overline{G_{i-2}}) : \alpha \in A\}$ of
$\overline{G_i} \setminus G_{i-1}$. We also choose a finite subcover of the open cover $\{U_\alpha \cap G_3 :
\alpha \in A\}$ of the compact set $\overline{G_2}$. This collection of open sets is easily seen
to be a countable locally finite refinement of $\{U_\alpha\}$, and consists of open
sets with compact closure.
□
Theorem B.2 Any second countable smooth manifold admits a partition
of unity subordinate to a given smooth atlas of $M$.
Proof Let $\{U_\alpha\}$ be an open cover of $M$. Then there exists a locally
finite refinement consisting of coordinate charts $(V_i, \varphi_i)$ such that $\varphi_i(V_i)$
is a ball of radius 3 and such that $\{\varphi_i^{-1}(\text{ball of radius } 1)\}$ covers $M$.
This is possible by Lemma B.1.
Let $\mathcal{A}$ be an atlas of $M$. Let $(V_i, \varphi_i)$ be a locally finite refinement of $\mathcal{A}$
with the above properties. There exist functions $h_i \in C^\infty(M)$ such that
$\operatorname{supp}(h_i) \subseteq V_i$, $h_i \geq 0$, and $h_i > 0$ on $\varphi_i^{-1}(\text{ball of radius } 1)$. We then define
$$g_i := \frac{h_i}{\sum_{j \in I} h_j}.$$
Notice that the sum in the denominator is $> 0$ and is finite at each point. Then
$\{g_i : i \in I\}$ is a partition of unity subordinate to the given atlas $\mathcal{A}$.
□
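The functions $h_i$ used in the proof can be produced from a standard bump function. The following is a sketch of one common choice, not necessarily the construction intended by the text. It assumes that the balls are centred at the origin, so that $\varphi_i(V_i) = B(0, 3)$, and the auxiliary functions $f$ and $h$ are names introduced here.

```latex
% Assumption: phi_i(V_i) = B(0, 3), a ball centred at the origin of R^n.
\[
  f(t) :=
  \begin{cases}
    e^{-1/t}, & t > 0, \\
    0,        & t \le 0,
  \end{cases}
  \qquad
  h(x) := \frac{ f\big( 4 - \|x\|^2 \big) }
               { f\big( 4 - \|x\|^2 \big) + f\big( \|x\|^2 - 1 \big) } .
\]
% f is smooth on R. The denominator of h never vanishes, since 4 - |x|^2 and
% |x|^2 - 1 cannot both be <= 0. Moreover h = 1 for |x| <= 1, h = 0 for
% |x| >= 2, and 0 <= h <= 1 everywhere.
\[
  h_i(p) :=
  \begin{cases}
    h\big( \varphi_i(p) \big), & p \in V_i, \\
    0,                         & p \notin V_i .
  \end{cases}
\]
% supp(h_i) is the preimage of the closed ball of radius 2, a compact subset
% of V_i, so h_i is smooth on M. Finally, at every p in M,
\[
  \sum_{i \in I} g_i(p)
  = \frac{ \sum_{i \in I} h_i(p) }{ \sum_{j \in I} h_j(p) }
  = 1 .
\]
```

The denominator is positive at each $p$ because $p$ lies in some $\varphi_i^{-1}(\text{ball of radius } 1)$, where $h_i = 1$, and the sums are finite near $p$ because the cover $\{V_i\}$ is locally finite.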
Bibliography
As general references and for collateral reading
we suggest the following four books:
[1] C. Chevalley: The Theory of Lie Groups I. Princeton University
Press, Princeton, 1946.
A classic. Undoubtedly still the best in my opinion.
[2] Y. Matsushima: Differentiable Manifolds. Marcel Dekker, Inc., New
York, 1972.
[3] M. Spivak: A Comprehensive Introduction to Differential Geometry.
Vol. I, Publish or Perish, Boston, 1970.
[4] F. Warner: Foundations of Differentiable Manifolds and Lie Groups.
Springer-Verlag, 1984.
For Lie groups, Lie algebras and representation theory we
recommend:
[5] V. S. Varadarajan: Lie Groups, Lie Algebras and their Representations.
Springer-Verlag, GTM
[6] A. W. Knapp: Lie Groups, Lie Algebras, and Cohomology. Mathematical
Notes 34, Princeton University Press, Princeton, 1988.
[7] A. W. Knapp: Representation Theory of Semisimple Lie Groups:
An Overview Based on Examples. Princeton University Press, Princeton,
1986.
[8] S. Helgason: Differential Geometry, Lie Groups and Symmetric
Spaces. Vol I, Academic Press, New York, 1980.
[9] S. Helgason: Groups and Geometric Analysis. Academic Press,
1984.
[10] N. Wallach: Harmonic Analysis on Homogeneous Spaces. Marcel
Dekker Inc., New York, 1973.
[11] N. Wallach: Real Reductive Groups I. Academic Press, 1988.
For a graded introduction to Differential and Riemannian
Geometry, we highly recommend the following books in
the order in which they appear:
[12] M. P. do Carmo: Differential Geometry of Curves and Surfaces.
Prentice-Hall, Englewood Cliffs, NJ, 1976.
[13] A. Gray: Modern Differential Geometry of Curves and Surfaces
with MATHEMATICA®. CRC Press, 1998.
[14] W. Klingenberg: A Course in Differential Geometry. Springer-Verlag,
1978.
[15] N. J. Hicks: Notes on Differential Geometry. D. Van Nostrand Company
Inc., Princeton, NJ, 1965.
[16] M. P. do Carmo: Riemannian Geometry. Birkhäuser, Boston, 1992.
[17] P. Petersen: Riemannian Geometry. GTM-171, Springer-Verlag,
1997.
[18] B. O'Neill: Semi-Riemannian Geometry. Academic Press, New
York, 1983.
[19] M. Spivak: A Comprehensive Introduction to Differential Geometry.
Vols. I-V, Publish or Perish, Boston, 1970.
[20] S. Kobayashi and K. Nomizu: Foundations of Differential Geometry.
2 Vols., 1963, 1969.
[21] J. Cheeger & D. G. Ebin: Comparison Theorems in Riemannian
Geometry. North-Holland Publishing Company, Amsterdam, 1975.
A few other books mentioned in the text for reference:
[22] J. L. Dupont: Curvature and Characteristic Classes. Lecture Notes
in Mathematics, No. 640, Springer-Verlag, 1978.
[23] W. Massey: A Basic Course in Algebraic Topology. GTM-127,
Springer-Verlag, 1991.