Introduction To Mathematical Analysis-Igor Kriz

Igor Kriz
Aleš Pultr
Introduction to Mathematical Analysis
Igor Kriz
Department of Mathematics
University of Michigan
Ann Arbor, MI
USA

Aleš Pultr
Department of Applied Mathematics (KAM)
Faculty of Mathematics and Physics
Charles University
Prague
Czech Republic

ISBN 978-3-0348-0635-0 ISBN 978-3-0348-0636-7 (eBook)

DOI 10.1007/978-3-0348-0636-7
Springer Basel Heidelberg New York Dordrecht London
Library of Congress Control Number: 2013941992
© Springer Basel 2013

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology
now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection
with reviews or scholarly analysis or material supplied specifically for the purpose of being entered
and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of
this publication or parts thereof is permitted only under the provisions of the Copyright Law of the
Publisher’s location, in its current version, and permission for use must always be obtained from Springer.
Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations
are liable to prosecution under the respective Copyright Law.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
While the advice and information in this book are believed to be true and accurate at the date of
publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for
any errors or omissions that may be made. The publisher makes no warranty, express or implied, with
respect to the material contained herein.
Printed on acid-free paper
Springer Basel AG is part of Springer Science+Business Media (www.birkhauser-science.com)
To Sophie
To Jitka
Preface
This book is a result of a long-term project which originated in courses we taught
to undergraduate students who specialize in mathematics. These students had δ-ε
calculus before, but there did not seem to be a suitable comprehensive textbook for
a follow-up course in analysis.
We wanted to write such a textbook based on our courses, but that was not
the only goal. Teaching bright students is about introducing them to mathematics.
Therefore, we wanted to write a book which the students may want to keep after the
course is over, and which could serve them as a bridge to higher mathematics. Such
a book would necessarily exceed the scope of their courses.
We start with standard material of second year analysis: multi-variable
differential calculus, Lebesgue integration, ordinary differential equations and vector
calculus. What makes all this go smoothly is that we introduce some basic concepts
of point set topology first. Since our aim is to be completely rigorous and as
self-contained as possible, we also include a Preliminaries chapter on the basic topics
of one-variable calculus, and two Appendices on the necessary concepts of linear
algebra. This pretty much comprises the first part of our book.
With the foundations covered, it is possible to venture much further. The common
theme of the second part of our book is the interplay between analysis and geometry.
After a second installment of point set topology, we are quickly able to introduce
complex analysis, and after some multi-linear algebra, also manifolds, differential
forms and the general Stokes Theorem. The methods of manifolds and complex
analysis combine in a treatment of Riemann surfaces. Basic methods of the calculus
of variations are applied to a theory of geodesics, which in turn leads to basic tensor
calculus and Riemannian geometry. Finally, infinite-dimensional spaces, which have
already made an appearance in multiple places throughout the text, are treated more
systematically in a chapter on the basic concepts of functional analysis, and another
on a few of its applications.
The total amount of material in this book cannot be covered in any single year
course. An instructor of a course based on this book should probably aim for
covering the first part, and take his or her picks in the second part. As already
mentioned, we hope to motivate the student to hold on to their textbook, and use
it for further study in years to come. They will eventually get to more advanced
books in analysis and beyond, but here they can get, relatively quickly, their first
glimpse of a big picture.
Because of this, the aim of our book is not limited to undergraduate students. This
text may equally well serve a graduate student or a mathematician at any career
stage who would like a quick source or reference on basic topics of analysis. A
scientist (for example in physics or chemistry) who may have always been using
analysis in their work, can use this book to go back and fill in the rigorous details
and mathematical foundations. Finally, an instructor of analysis, even if not using
this book as a textbook, may want to use it as a reference for those pesky proofs
which usually get skipped in most courses: we do quite a few of them.
Ann Arbor, USA Igor Kriz

Prague 1, Czech Republic Aleš Pultr
Contents
Part I A Rigorous Approach to Advanced Calculus
1 Preliminaries
  1 Real and complex numbers
  2 Convergent and Cauchy sequences
  3 Continuous functions
  4 Derivatives and the Mean Value Theorem
  5 Uniform convergence
  6 Series. Series of functions
  7 Power series
  8 A few facts about the Riemann integral
  9 Exercises
2 Metric and Topological Spaces I
  1 Basics
  2 Subspaces and products
  3 Some topological concepts
  4 First remarks on topology
  5 Connected spaces
  6 Compact metric spaces
  7 Completeness
  8 Uniform convergence of sequences of functions. Application: Tietze’s Theorems
  9 Exercises
3 Multivariable Differential Calculus
  1 Real and vector functions of several variables
  2 Partial derivatives. Defining the existence of a total differential
  3 Composition of functions and the chain rule
  4 Partial derivatives of higher order. Interchangeability
  5 The Implicit Functions Theorem I: The case of a single equation
  6 The Implicit Functions Theorem II: The case of several equations
  7 An easy application: regular mappings and the Inverse Function Theorem
  8 Taylor’s Theorem, Local Extremes and Extremes with Constraints
  9 Exercises
4 Integration I: Multivariable Riemann Integral and Basic Ideas Toward the Lebesgue Integral
  1 Riemann integral on an n-dimensional interval
  2 Continuous functions are Riemann integrable
  3 Fubini’s Theorem in the continuous case
  4 Uniform convergence and Dini’s Theorem
  5 Preparing for an extension of the Riemann integral
  6 A modest extension
  7 A definition of the Lebesgue integral and an important lemma
  8 Sets of measure zero; the concept of “almost everywhere”
  9 Exercises
5 Integration II: Measurable Functions, Measure and the Techniques of Lebesgue Integration
  1 Lebesgue’s Theorems
  2 The class $\Lambda$ (measurable functions)
  3 The Lebesgue measure
  4 The integral over a set
  5 Parameters
  6 Fubini’s Theorem
  7 The Substitution Theorem
  8 Hölder’s inequality, Minkowski’s inequality and $L^p$-spaces
  9 Exercises
6 Systems of Ordinary Differential Equations
  1 The problem
  2 Converting a system of ODE’s to a system of integral equations
  3 The Lipschitz property and a solution of the integral equation
  4 Existence and uniqueness of a solution of an ODE system
  5 Stability of solutions
  6 A few special differential equations
  7 General substitution, symmetry and infinitesimal symmetry of a differential equation
  8 Symmetry and separation of variables
  9 Exercises
7 Systems of Linear Differential Equations
  1 The definition and the existence theorem for a system of linear differential equations
  2 Spaces of solutions
  3 Variation of constants
  4 A Linear differential equation of nth order with constant coefficients
  5 Systems of LDE with constant coefficients. An application of Jordan’s Theorem
  6 Exercises
8 Line Integrals and Green’s Theorem
  1 Curves and line integrals
  2 Line integrals of the first kind (= according to length)
  3 Line integrals of the second kind
  4 The complex line integral
  5 Green’s Theorem
  6 Exercises
Part II Analysis and Geometry
9 Metric and Topological Spaces II
  1 Separable and totally bounded metric spaces
  2 More on compact spaces
  3 Baire’s Category Theorem
  4 Completion
  5 More on topological spaces: Separation
  6 The space of continuous functions revisited: The Arzelà-Ascoli Theorem and the Stone-Weierstrass Theorem
  7 Exercises
10 Complex Analysis I: Basic Concepts
  1 The derivative of a complex function. Cauchy-Riemann conditions
  2 From the complex line integral to primitive functions
  3 Cauchy’s formula
  4 Taylor’s formula, power series, and a uniqueness theorem
  5 Applications: Liouville’s Theorem, the Fundamental Theorem of Algebra and a remark on conformal maps
  6 Laurent series, isolated singularities and the Residue Theorem
  7 Exercises
11 Multilinear Algebra
  1 Hom and dual vector spaces
  2 Multilinear maps and the tensor product
  3 The exterior (Grassmann) algebra
  4 Exercises
12 Smooth Manifolds, Differential Forms and Stokes’ Theorem
  1 Smooth manifolds
  2 Tangent vectors, vector fields and differential forms
  3 The exterior derivative and integration of differential forms
  4 Integration of differential forms and Stokes’ Theorem
  5 Exercises
13 Complex Analysis II: Further Topics
  1 The Riemann Mapping Theorem
  2 Holomorphic isomorphisms of disks onto polygons and the Schwarz-Christoffel formula
  3 Riemann surfaces, coverings and complex differential forms
  4 The universal covering and multi-valued functions
  5 Complex analysis beyond holomorphic functions
  6 Exercises
14 Calculus of Variations and the Geodesic Equation
  1 The basic problem of the calculus of variations, and the Euler-Lagrange equations
  2 A few special cases and examples
  3 The geodesic equation
  4 The geometry of geodesics
  5 Exercises
15 Tensor Calculus and Riemannian Geometry
  1 Tensor calculus
  2 Affine connections
  3 Tensors associated with an affine connection: torsion and curvature
  4 Riemann manifolds
  5 Riemann surfaces and surfaces with Riemann metric
  6 Exercises
16 Banach and Hilbert Spaces: Elements of Functional Analysis
  1 Banach and Hilbert spaces
  2 Uniformly convex Banach spaces
  3 Orthogonal complements and continuous linear forms
  4 Infinite sums in a Hilbert space and Hilbert bases
  5 The Hahn-Banach Theorem
  6 Dual Banach spaces and reflexivity
  7 The duality of $L^p$-spaces
  8 Images of Banach spaces under bounded linear maps
  9 Exercises
17 A Few Applications of Hilbert Spaces
  1 Some preliminaries: Integration by a measure
  2 The spaces $L^p(X, \mathbb{C})$ and the Radon-Nikodym Theorem
  3 Application: The Fundamental Theorem of (Lebesgue) Calculus
  4 Fourier series and the discrete Fourier transformation
  5 The continuous Fourier transformation
  6 Exercises
A Linear Algebra I: Vector Spaces
  1 Vector spaces and subspaces
  2 Linear combinations, linear independence
  3 Basis and dimension
  4 Inner products and orthogonality
  5 Linear mappings
  6 Congruences and quotients
  7 Matrices and linear mappings
  8 Exercises
B Linear Algebra II: More about Matrices
  1 Transforming a matrix. Rank
  2 Systems of linear equations
  3 Determinants
  4 More about determinants
  5 The Jordan canonical form of a matrix
  6 Exercises
Bibliography
Index of Symbols
Index
Introduction
The main purpose of this introduction is to tell the reader what to expect while
reading this book, and to give advice on how to read it. We assume the reader to be
acquainted with the basics of differential and integral calculus in one variable, as
traditionally covered in the first year of study. Nevertheless, we include, for the
reader’s convenience, in Chapter 1, a few pivotal theoretical points of analysis
in one variable: continuity, derivatives, convergence of sequences and series of
functions, the Mean Value Theorem, Taylor expansion, and the single-variable
Riemann integral. The purpose of including this material is two-fold. First, we
would like this text to be as self-contained as possible: we wish to spare the reader a
tedious search, in another text, for an elementary fact he or she may have forgotten.
The second, and perhaps more important reason, is to focus attention on facts
of elementary differential and integral calculus that have deeper aspects, and are
fundamental to more advanced topics. In connection with this, we also review in the
exercises to Chapter 1 definitions of elementary functions and proofs, from the first
principles, of their properties needed later. What we omit at this stage is a proof of
the existence of real numbers; the reader probably knows it from elsewhere, but if
not there will be an opportunity to come back and do it as an exercise to Chapter 9.
An entirely different prerequisite is linear algebra. While not a part of
mathematical analysis in the narrowest sense, it contains many necessary techniques. In fact,
differential calculus (in particular in more than one variable) can be without much
exaggeration understood as the study of linear approximations of more general
mappings, and a basic knowledge in dealing with the linear case is indispensable.
The reader’s skills in these topics (determinants, linear equations, operations with
matrices, and others) may determine to a considerable degree his or her success with
a large part of this book. Because of this, we feel it is appropriate to include linear
algebra in this text as a reference. In order not to slow down the narrative, we do
so in two appendices: Appendix A for more theoretical topics such as vector spaces
and linear mappings, and Appendix B for more computational questions regarding
matrices, culminating with a treatment of the Jordan canonical form.
Let us turn to the main body of this book. It is divided into two parts. One
of our main goals is to present a rigorous treatment of the traditional topics
of advanced calculus: multivariable differentiation, (Lebesgue) integration, and
differential equations. All this is covered by Part I, including basic facts about line
integrals and Green’s Theorem.
In Part II we use the techniques developed in Part I to approach phenomena of
a geometrical nature which the reader may already have encountered without proof,
or will certainly encounter in further studies. Using the tools developed, these can be
probed to considerable depth without much further difficulty.
Part I. We think it essential to start rigorous advanced calculus with the basic
notions of (at least metric) topology. Concepts such as neighborhood, open set,
closure and convergence viewed narrowly just in the context of the Euclidean space
$\mathbb{R}^n$ do not give a satisfactory picture of what is going on (and besides, would not be
sufficient for what will come later). In Chapter 2, we discuss these concepts first in
the context of metric spaces. This generality is, strictly speaking, already sufficient
for most of our purposes. Yet, it is useful to learn about the more general topological
spaces to be able to distinguish what really depends on metric and what does not.
For this reason, our treatment of space in this chapter is an interweaving narrative
of metric and topological spaces with the goal of presenting an adequate general
outlook. We stick here, however, to the simpler facts and concepts needed in the
nearest chapters (the more advanced topics on spaces are postponed to Chapter 9)
and the reader will certainly not find this chapter hard.
With the basic knowledge of metric topology we are ready for multivariable
differential calculus. This is covered in Chapter 3. We start with the basic notions
of partial derivative and total differential, and the chain rule. Emphasizing the role of
the total differential as a linear map is key to more coordinate-free approaches
to analysis, vital when investigating manifolds (later in Chapter 12). Next, we
prove from first principles the Implicit Function Theorem. This is the first more
complicated analytic proof in our text; the reader is advised to pay detailed attention
to this material, and certainly not to skip it, as it is a good model of what a proof in
analysis looks and feels like. The chapter is concluded with material related to the
multivariable version of Taylor’s Theorem, and to calculating extremes and saddles
of multivariable functions.
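To make the phrase "total differential as a linear map" concrete, here is the standard formulation, given as a reminder rather than as the book's own wording; the symbols $U$, $f$, $g$, $L$ and $h$ are notation assumed for this sketch, and Chapter 3 may use different ones. A map $f : U \to \mathbb{R}^m$ on an open set $U \subseteq \mathbb{R}^n$ has a total differential at $x \in U$ if there is a linear map $L : \mathbb{R}^n \to \mathbb{R}^m$ approximating the increment of $f$ to first order:

```latex
% Assumed notation: U open in R^n, f : U -> R^m, L linear, h an increment.
\[
  \lim_{h \to 0} \frac{\| f(x+h) - f(x) - L(h) \|}{\| h \|} = 0 ,
  \qquad\text{in which case}\qquad
  L(h) = \sum_{j=1}^{n} \frac{\partial f}{\partial x_j}(x)\, h_j .
\]
% Chain rule in the same coordinate-free language, writing Df(x) for L
% and assuming g has a total differential at f(x):
\[
  D(g \circ f)(x) = Dg\bigl(f(x)\bigr) \circ Df(x) .
\]
```

Written this way, the chain rule is simply composition of linear maps, which is the form that carries over to manifolds.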
The following two chapters are devoted to integration. In Chapter 4 we start with
the multivariable Riemann integral over a product of intervals. It becomes clear very
quickly, however, that a more versatile theory is needed. For example, we want to
take integrals of unbounded functions, or integrals over more general types of sets.
Or, we would like to know when we can take limits or derivatives behind the integral
sign. We would like to understand precisely why “the boundary does not matter”
when taking a multivariable integral, and how and why we can change variables
in a non-linear way. All this leads to the concept of Lebesgue integral, which is
somewhat notorious for being time-consuming because of the abstract concepts it
entails. There is, however, a way around that. A method of P.J. Daniell (going back
to 1918 and unjustly neglected for decades) allows a straightforward introduction
of the Lebesgue integral starting with monotone limits of Riemann integrals of
continuous functions with compact support. The necessary technical theorems can
be proved very quickly and we present them in the second half of Chapter 4. In
Chapter 5, we go on to present the more technical aspects of the Lebesgue integral.
We explain how to take limits and derivatives behind the integral sign, prove Fubini’s
Theorem, define the Lebesgue measure and prove its basic properties. Further, we
introduce Borel sets and prove criteria of measurability. We present a rigorous proof
of a multivariable substitution theorem. Finally, we introduce $L^p$-spaces: while this
may seem like an early place, we will have enough integration theory at this point to
do so, and to prove their basic properties. This is useful, as the $L^p$-spaces often occur
throughout analysis (for example, in this book, we will use them in Chapters 13
and 15 in proving the existence of a complex structure on an oriented surface with
a Riemann metric). We will return to the study of $L^p$-spaces in Chapter 16, where
they provide the most basic examples in functional analysis.
Next, having covered differentiation and integration, we turn to differential
equations. We restrict our attention to the ordinary differential equations (ODEs),
as partial differential equations have quite a different flavor and constitute a vast
field of their own, far beyond a general course in analysis (even an advanced one).
For a text on partial differential equations, we refer the reader, for example, to [5].
Chapter 6 on (general) ordinary differential equations is in fact independent of
Chapters 4 and 5 and uses only the material of Chapters 1, 2 and 3. We introduce
the concept of a Lipschitz function and prove the local existence and uniqueness
theorem for the systems of ODEs (the Picard-Lindelöf Theorem). We also discuss
stability of solutions and differentiation with respect to parameters. Further, we
discuss the basic method for separation of variables, and finally discuss global
and infinitesimal symmetries of systems of ODEs (thus motivating further study of
vector fields); also, we explain how the methods of separation of variables discussed
earlier are related to symmetries of the system.
Chapter 7 covers some aspects of linear differential equations (LDEs). The global
existence theorem is proved, and the affine set of solutions of a linear system is
discussed. We show how to use the Wronskian for recognizing a fundamental system
(basis) of the space of solutions of a homogeneous system of LDEs, and how to
get solutions of a non-homogeneous system from the homogeneous one using the
variation of constants. Also, we present a method of solving systems of LDEs with
constant coefficients, easier in the case of a single higher order LDE, and requiring
the Jordan canonical form of a matrix from Appendix B in the harder general case.
Chapter 8, concluding Part I, treats parametric curves, line integrals of the first
and second kind and the complex line integral. At the end we prove Green’s
Theorem, which we will need when dealing with complex derivatives, but which
is also an elementary warm-up for the general Stokes’s Theorem.
Part II. Now our perspective changes. The traditional items of advanced calculus
have been mostly covered and we turn to topics interesting from the point of view
of geometry.
To proceed, perhaps by now not surprisingly, we need another installment of
topological foundations. This is done in Chapter 9, presenting more material on
topological spaces (separability, compactness, separation axioms and the Urysohn
Theorem) as well as on metric spaces (completion, Baire’s Category Theorem). In
the last section we prove the Stone-Weierstrass Theorem providing a remarkably
general method to obtain useful dense sets in spaces of functions, and the Arzelà-
Ascoli Theorem, which greatly clarifies the meaning of uniform convergence, and
will be useful in Chapter 10 when proving the Riemann Mapping Theorem.
Next, in Chapter 10 we introduce the basic methods of complex analysis. The
fundamental facts can be derived almost immediately from the complex line integral
and Green’s Theorem of Chapter 8. The conclusions, however, are powerful and
surprising. Unlike differentiable functions of a real variable, complex functions
with a complex derivative (holomorphic functions) are much more rigid. They
are determined, for example, by their values on a convergent sequence of points,
and the existence of a derivative automatically implies the existence of derivatives
of all orders; on the other hand, “geometrically very smooth” functions may not
have a complex derivative. Thus, our view of the differential calculus as we know
it from the real case is turned upside down. Yet, complex analysis has important
real applications, such as for instance the explanation of the convergence properties
of a Taylor series. Other applications presented are the Fundamental Theorem of
Algebra, and an important geometric one, the Jordan Curve Theorem. We then
go on to cover other basic methods of complex analysis, such as Laurent series,
the classification of isolated singularities, the Residue Theorem and the Argument
Principle, which has several interesting applications, including the Open Mapping
Theorem.
Next, we also have to upgrade our knowledge of linear algebra; more specifically,
we must get acquainted with the techniques of multilinear algebra. This is done in
Chapter 11, which includes dual vector spaces, tensor products, and the exterior
(Grassmann) algebra. Thus equipped, we can now study calculus on manifolds.
This is done in Chapter 12. We define smooth manifolds, tangent vectors, vector
fields, and differential forms. Further, we present the exterior derivative, de Rham
complex, and the de Rham cohomology. A general form of Stokes’s Theorem is
proved and related to the operators grad, div and curl as introduced in traditional
calculus courses.
A combination of the study of manifolds with complex analysis in one variable
leads to the concept of Riemann surfaces. Their basic theory is presented in
Chapter 13. We begin the chapter with the Riemann Mapping Theorem, showing
conformal equivalence (holomorphic isomorphism) of simply connected proper
open subsets of C. We also present the Schwarz-Christoffel integrals giving
conformal equivalence between open convex polygons and the unit open disk;
examples include elliptic integrals. Then we introduce the theory of Riemann
surfaces and coverings, and construct universal coverings. We will see that even if
we are not interested in the abstraction of manifolds, this formalism will greatly
enhance our understanding of complex integration: we will now be able, for
example, to integrate a holomorphic function over a homotopy class of continuous
paths. We will also be able to understand how to make rigorous the concept
of a “multi-valued holomorphic function”, which was strongly suggested by the
methods of Chapter 10, yet could not be adequately approached by its methods.
Finally, studying complex differential forms on Riemann surfaces will lead us to
the basic notions of “$dz$-$d\bar{z}$-calculus”, which is very helpful in complex analysis.
To demonstrate, we will apply this to extending some of the methods of complex
analysis beyond the case of holomorphic functions.
Chapter 14 is devoted, primarily, to the basic problem of the calculus of
variations in one independent variable, and to the Euler-Lagrange equation for
critical functions. Here, only the material of Part I is used. In the second half of
the chapter we define a Riemann metric on an open subset of $\mathbb{R}^n$ and discuss the
geodesic equation in more detail. Also, we prove the local minimality of geodesics.
A part of the reason for introducing this material is a motivation of the topics
investigated in Chapter 15 where we combine it with the material on manifolds to
obtain the basic concepts of Riemannian geometry. We start with tensor calculus and
then move on to affine connections, Riemann metrics on manifolds and curvature,
and give a local characterization of the Euclidean space as the Riemannian manifold
with zero curvature tensor. Using the methods of the last section of Chapter 13, we
also show how to construct a complex structure on an oriented Riemann manifold
in dimension 2.
Chapter 16 concerns Hilbert and Banach spaces, and introduces the basic
concepts of functional analysis. Here we need as prerequisites only the techniques
from Part I and Chapter 9. We start with the definition and basic properties of
Hilbert spaces. We show that a Hilbert space provides, in a sense, an infinite-
dimensional extension of the nice properties of the finite-dimensional vector spaces
with inner product. Banach spaces are also introduced; their theory is much harder,
but nevertheless we are able to prove a few neat results. Starting with Hahn-Banach’s
Theorem, we go on to examining duals of Banach spaces, proving, for example, that
the dual of $L^p$ is $L^q$ for $1 < p < \infty$, $1/p + 1/q = 1$. We will also prove the Open
Mapping Theorem and the Closed Graph Theorem.
In Chapter 17 we present some applications, mainly of Hilbert spaces. (One tends
to use Hilbert spaces wherever possible, precisely because they are much easier.)
We will prove the Radon-Nikodym Theorem, and use it to prove a version of the
Fundamental Theorem of Calculus for the Lebesgue integral, a fairly hard fact.
In the framework of Hilbert spaces we also define Fourier series and (continuous)
Fourier transformation. As a fringe benefit, we will introduce Borel measures. The
theory of $L^p$-spaces generalizes to this case, and includes some interesting new
examples.
When using this book as a textbook for a course, an instructor should aim to cover
most of Part I in the first semester. After this, one can work with Part II, basically,
on three independent tracks, thus customizing the course as needed, and as time
permits.
For Chapters 10 and 14, no additional prerequisites beyond Part I are needed.
All the other chapters require Chapter 9; just this added to Part I suffices for the
study of Hilbert spaces.
In the remaining group, multilinear algebra of Chapter 11 has to precede
manifolds in Chapter 12 and Riemannian geometry in Chapter 15, which also uses
the facts from Chapter 14, and its last section uses Chapter 10.
Chapter 13 uses Chapter 10 and Chapter 12.
Using this dependence of chapters, an instructor may decide about the topics for
the next semester, possibly assigning some material to the students as independent
reading. There is no need to cover entire chapters; there are endless possibilities for
mixing and matching topics to create an interesting course.
The student (or reader) is, in any case, most strongly encouraged to keep the
book for further study. As already mentioned, we anticipate that graduate students
of mathematics, mathematicians and scientists in areas using analysis, as well as
instructors of courses in analysis will find this book useful as a reference, and will
find their own ways through the topics.
In the Bibliography section, the reader will find suggestions for further reading.
In the more advanced sections of this text, we often introduce concepts (such as
“Lie group” or “de Rham cohomology”) which arise as a natural culmination of
our discussion, but whose systematic development is beyond the scope of this book.
These concepts are meant to motivate further study. We would like to emphasize
that our list of literature is by no means meant to be complete. The books we do
suggest all have a fairly close connection to the present text, and to mathematical
analysis. They contain more detailed information, as well as suggestions of further
literature.
Finally, we would like to say a few words about sources. The overall conception
of the book is original: we designed the logic of the interdependence of topics, and
the strategy for their presentation. Many proofs are, in fact, also “original” in the
sense that we made up our own arguments to fit best the particular stage of the
presentation (the book contains no new mathematical results). Given the scope of
the project, however, we did, in some cases, consult lecture notes, other books and
occasionally even research papers for particular proofs. All the books used are listed
in the Bibliography at the end of the book. In the case of research papers, we give
the name of who we believe is the original author of the proof, but do not include
explicit journal references, as we feel an effort of being even partially fair would lead
to a web of references which would only bewilder a first-time student of the subject.
We did want to mention, however, that there are also quite a few proofs which seem
to have become “standard” in this field (including, sometimes, particular notation),
and whose original author we were not able to track down. We would like to thank
all of those, who, by inventing those proofs, contributed to this book implicitly. We
would also like to thank colleagues and students who read parts of our book, and
gave us valuable comments. Last but not least, the authors gratefully acknowledge
the support of CE-ITI of Charles University, the Michigan Center for Theoretical
Physics and the NSF.
Part I
A Rigorous Approach to Advanced Calculus
Preliminaries
1
The typical reader of this text will have had a rigorous “$\delta$-$\varepsilon$” first year calculus
course, using a text such as for example [22]. Such a course will have included
definitions and basic properties of the standard elementary functions (polynomials,
rational functions, exponentials and logarithms, trigonometric and cyclometric
functions), the concept of continuity of a real function and the fact that continuity
is preserved under standard constructions (sum, product, composition, etc.), and the
basic rules of computing derivatives. We review here mainly the more theoretical
aspects of these topics. The reasons for reviewing them are two-fold. The first reason
is that we would like this text to be as self-contained as possible. The second reason
is that some of the basic results have, in fact, substantial depth in them, and the more
advanced topics on which this book focuses make heavy use of them. Not reviewing
such topics would at times even create a danger of circular arguments.
1 Real and complex numbers
Perhaps it is useful to go over a few basic conventions first. By a map or mapping
from a set $S$ to a set $T$ we mean a rule which assigns to each element of $S$ precisely
one element of $T$. Two rules are considered the same if they always produce the
same value (in $T$) on the same input element (of $S$).
Therefore, technically, a map is a binary relation, i.e. a set $R$ of pairs $(x, y)$,
$x \in S$, $y \in T$, such that for each $s \in S$, there is precisely one $(s, y) \in R$. The
sets $S$, $T$ are called the domain and codomain, respectively. We will denote a map
$f$ from a set $S$ to a set $T$ by $f : S \to T$. For such a map, and a set $X \subseteq S$, we will
denote by $f[X]$ the set of all elements $f(x)$ such that $x \in X$. Similarly, for $Y \subseteq T$,
we will denote by $f^{-1}[Y]$ the set of all $x \in S$ such that $f(x) \in Y$. The set $f[X]$
is called the image of the set $X$ under the map $f$, and the set $f^{-1}[Y]$ is called the
pre-image of the set $Y$ under $f$. This use of the square bracket may perhaps seem
unusually pedantic, but will soon pay off in the text below. The image $f[S]$ of the
domain is sometimes called the image of the map $f$.
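For finite sets, the image and pre-image just defined can be computed directly. The following is a minimal sketch in Python; the helper names `image` and `preimage`, the sample domain `S` and the map `square` are illustrative assumptions and not notation used in this text.

```python
def image(f, X):
    """Return f[X], the set of all f(x) with x in X."""
    return {f(x) for x in X}


def preimage(f, S, Y):
    """Return f^{-1}[Y], the set of all x in the domain S with f(x) in Y."""
    return {x for x in S if f(x) in Y}


# Hypothetical example: squaring on a small finite domain S.
S = {-2, -1, 0, 1, 2}
square = lambda x: x * x

print(sorted(image(square, {-1, 1, 2})))   # [1, 4]
print(sorted(preimage(square, S, {4})))    # [-2, 2]
print(sorted(image(square, S)))            # [0, 1, 4], the image of the map
```

Note that the pre-image is computed relative to an explicitly given domain, which reflects the point that a map is not determined by its rule alone.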
I. Kriz and A. Pultr, Introduction to Mathematical Analysis, 3

DOI 10.1007/978-3-0348-0636-7 1, © Springer Basel 2013
To comment briefly on the use of inclusion symbols, throughout this book, we
generally use $\subseteq$ to denote a subset with possible equality; when equality is excluded,
we use $\subsetneq$. We generally avoid the somewhat ambiguous symbol $\subset$. When we do use
it, it means $\subsetneq$ in a context where equality is a priori excluded for an obvious reason
not entering the logic of the argument (this may happen, for example, for a finite
subset of the real numbers when we are not using the finiteness to conclude that the
complement is non-empty).
Returning to the subject of mappings, for a map $f : S \to T$, and $U \subseteq S$, often
it is useful to have a special symbol for the map $g : U \to T$ which is defined by
$g(x) = f(x)$ when $x \in U$. The map $g$ is called the restriction of $f$ to the subset $U$, and
denoted by $f|_U$ or $f|U$.
A map $f : S \to T$ is called onto if $f[S] = T$, and is called one-to-one (briefly 1-1) if for
every $s_1, s_2 \in S$, $f(s_1) = f(s_2) \Rightarrow s_1 = s_2$. Onto maps are also called surjective
and one-to-one maps are called injective. A bijective map is a map which is both
surjective and injective.
The composition of maps $f : S \to T$, $g : T \to U$ will be denoted by $g \circ f$, where
$(g \circ f)(x) = g(f(x))$ for $x \in S$. In fact, the circle is often omitted, and instead of $g \circ f$,
one simply writes $gf$. One must, of course, make sure there is no possibility of
confusion with multiplication. The identity map $\mathrm{Id}_S$ on a set $S$ is defined simply by
$\mathrm{Id}_S(x) = x$ for every $x \in S$. Note that a bijective map $f : S \to T$ has an inverse,
i.e. a map $f^{-1} : T \to S$ such that $f^{-1} \circ f = \mathrm{Id}_S$, $f \circ f^{-1} = \mathrm{Id}_T$.
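The notions of injective, surjective and inverse map can be checked mechanically when the domain and codomain are finite. The sketch below is only an illustration under that finiteness assumption; the function names and the sample sets are hypothetical.

```python
def is_injective(f, S):
    """True if f(s1) == f(s2) implies s1 == s2 for all s1, s2 in S."""
    values = [f(s) for s in S]
    return len(values) == len(set(values))


def is_surjective(f, S, T):
    """True if the image f[S] equals the codomain T."""
    return {f(s) for s in S} == set(T)


def inverse(f, S, T):
    """Return the inverse of a bijective f: S -> T as a dict {t: s}."""
    if not (is_injective(f, S) and is_surjective(f, S, T)):
        raise ValueError("f is not bijective, so it has no inverse")
    return {f(s): s for s in S}


# Hypothetical example data.
S, T = {0, 1, 2}, {1, 3, 5}
f = lambda x: 2 * x + 1
f_inv = inverse(f, S, T)

print(all(f_inv[f(s)] == s for s in S))   # True: f^{-1} o f = Id_S
print(all(f(f_inv[t]) == t for t in T))   # True: f o f^{-1} = Id_T
print(is_injective(lambda x: x * x, {-1, 0, 1}))  # False, since (-1)^2 == 1^2
```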
We will use the symbol $\mathbb{N}$ to denote the set of (positive) natural numbers
$\{1, 2, \dots\}$. The set of non-negative integers will be denoted by $\mathbb{N}_0$, and the set of all
integers by $\mathbb{Z}$. The set of all rational numbers will be denoted by $\mathbb{Q}$. The set $\mathbb{R}$ of
real numbers needs more attention.
1.1
Let us summarize the structure of the set $\mathbb{R}$ of real numbers as it will be used in this
text. We do not give a rigorous construction of the real numbers at this point. Such a
construction, however, will emerge in the context of our discussion of completeness
in Chapter 9, where it is reviewed as an exercise.
First, $\mathbb{R}$ is a field, that is, there are binary operations, addition $+$ and multiplication
$\cdot$ (which will be often indicated simply by juxtaposition) that are associative
(that is, $a + (b + c) = (a + b) + c$ and $a(bc) = (ab)c$) and commutative
(that is, $a + b = b + a$ and $ab = ba$) and related by the distributivity law
($a(b + c) = ab + ac$). There are neutral elements, zero $0$ and one (also called unit) $1$,
such that $a + 0 = a$ and $a \cdot 1 = a$. With each $a \in \mathbb{R}$ we have associated an element
$-a \in \mathbb{R}$ such that $a + (-a) = 0$; almost the same holds for the multiplication where
we have for every non-zero $a$ an element $a^{-1}$ (also denoted by $\frac{1}{a}$) such that $a \cdot a^{-1} = 1$.
Furthermore there is a linear order $\leq$ on $\mathbb{R}$ (a binary relation such that $a \leq a$,
that $a \leq b$ and $b \leq a$ implies $a = b$, that $a \leq b$ and $b \leq c$ implies $a \leq c$, and
finally that for any $a, b$ either $a \leq b$ or $a = b$ or $a \geq b$), and this order is preserved
by addition and by multiplication by elements that are $\geq 0$.
Then, we have the absolute value $|a|$ equal to $a$ if $a \geq 0$ and to $-a$ if $a \leq 0$. One
often views $\mathbb{R}$ as a line with $|a - b|$ representing the distance between $a$ and $b$.
For $M \subseteq \mathbb{R}$, we say that $a$ is an upper (resp. lower) bound of $M$ if $x \leq a$ (resp.
$x \geq a$) for all $x \in M$. A supremum (resp. infimum), denoted by
$$\sup M \quad (\text{resp. } \inf M),$$
is the least upper bound (resp. greatest lower bound), if it exists. Thus, the supremum
$s$ of $M$ is characterized by the properties
(1) $\forall x \in M$, $x \leq s$, and
(2) if $x \leq a$ for all $x \in M$ then $s \leq a$
(similarly for infimum with $\geq$ instead of $\leq$). Condition (2) is often expediently replaced by
(2') if $a < s$ then there is an $x \in M$ such that $a < x$
(realize that (1)&(2) is indeed equivalent to (1)&(2')). It is a specific property of the
ordered field $\mathbb{R}$ that
each non-empty $M \subseteq \mathbb{R}$ that has an upper bound has a supremum
or, equivalently, that
each non-empty $M \subseteq \mathbb{R}$ that has a lower bound has an infimum.
In mathematical analysis, it is often customary to use the symbols $\infty = +\infty$
and $-\infty$. The supremum (resp. infimum) of the empty set is defined to be $-\infty$
(resp. $\infty$), and the supremum (resp. infimum) of a set with no upper bound (resp. no
lower bound) is defined to be $\infty$ (resp. $-\infty$). Accordingly, it is customary to write
$-\infty < a < +\infty$ for any real number $a$, and to define $\infty + \infty = (-\infty) \cdot (-\infty) = \infty$,
and $(-\infty) + (-\infty) = (-\infty) \cdot \infty = -\infty$, $a \cdot (\pm\infty) = \pm\infty$ resp. $a \cdot (\pm\infty) = \mp\infty$
for $a > 0$ resp. $a < 0$. It is important to keep in mind, however, that the symbols $\infty$,
$-\infty$ are not real numbers, and expressions such as $\infty - \infty$ or $0 \cdot \infty$ are undefined
(although see Section 6 of Chapter 4 for an exception).
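As an informal analogy only, IEEE floating-point arithmetic (as exposed by Python's `float`) follows the same conventions for the symbols $\pm\infty$, and reports the undefined expressions as "not a number". This is a sketch of that analogy, assuming standard Python floats; it is not a model of the real numbers.

```python
import math

inf = math.inf

print(inf + inf)                # inf
print((-inf) * (-inf))          # inf
print((-inf) + (-inf))          # -inf
print(2.0 * inf, -2.0 * inf)    # inf -inf
print(math.isnan(inf - inf))    # True: the expression is undefined
print(math.isnan(0.0 * inf))    # True: the expression is undefined

# For finite sets, max with a default mirrors the convention sup(empty set) = -infinity.
print(max((), default=-inf))    # -inf
```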
If $M$ is a subset of $\mathbb{R}$ and $\sup(M) \in M$ (resp. $\inf(M) \in M$), we say that the
supremum (resp. infimum) is attained, and speak of a maximum resp. minimum. In
this case, we may use the notation
$$\max M, \quad \min M.$$
It is important to keep in mind that, unlike the supremum and infimum, a maximum
and/or minimum of a non-empty bounded subset of $\mathbb{R}$ may not exist. A non-empty
finite subset of $\mathbb{R}$, however, always has a maximum and a minimum.
Variants of notation associated with suprema and infima (resp. maxima and
minima) are often used. For example, instead of $\sup M$, one may write
$$\sup_{x \in M} x,$$
and similarly for the infimum, etc.
Let us fix notations for open and closed intervals in $\mathbb{R}$: As usual, $(a, b)$ means the
set of all $x \in \mathbb{R}$ such that $a < x < b$, where $a, b$ are real numbers or $\pm\infty$. We will
denote by $\langle a, b\rangle$ the corresponding closed interval, i.e. the set of all $x \in \mathbb{R} \cup \{\pm\infty\}$
such that $a \leq x \leq b$. The reader can fill in the meaning of the symbols $\langle a, b)$, $(a, b\rangle$.
1.2
The field of complex numbers $\mathbb{C}$ can be represented as $\mathbb{R} \times \mathbb{R}$ with addition
$(x_1, x_2) + (y_1, y_2) = (x_1 + y_1, x_2 + y_2)$ and multiplication $(x_1, x_2)(y_1, y_2) =
(x_1 y_1 - x_2 y_2, x_1 y_2 + x_2 y_1)$; we have the zero $(0, 0)$ and the unit $(1, 0)$. It is an
easy exercise to check that $\mathbb{C}$ has the arithmetic properties of a field (associativity,
commutativity, distributivity) and that $-(x_1, x_2) = (-x_1, -x_2)$ and
$$(x_1, x_2)^{-1} = \left( \frac{x_1}{x_1^2 + x_2^2}, \frac{-x_2}{x_1^2 + x_2^2} \right).$$
The field of complex numbers, however, has no reasonable order.
One introduces the complex conjugate of $x = (x_1, x_2)$ as $\bar{x} = (x_1, -x_2)$. It is
easy to see that
$$\overline{x + y} = \bar{x} + \bar{y} \quad \text{and} \quad \overline{x \cdot y} = \bar{x} \cdot \bar{y}. \tag{*}$$
Further, there is the absolute value (also called the modulus) defined by setting
$$|x| = \sqrt{x \bar{x}} = \sqrt{x_1^2 + x_2^2}$$
(thus, $x^{-1} = \frac{\bar{x}}{|x|^2}$).
If we view $\mathbb{C}$ as the Euclidean plane (one often speaks of the Gaussian plane)
then $|x|$ is the standard distance of $x$ from $(0, 0)$, and $|x - y|$ is the standard
Pythagorean distance.
Usually one sets $i = (0, 1)$ and writes
$$x_1 + i x_2 \quad \text{for} \quad (x_1, x_2)$$
(note that the multiplication rule in $\mathbb{C}$ comes from distributivity and the equality
$i^2 = -1$). In the other direction, one puts
$$\operatorname{Re}(x_1 + i x_2) = x_1, \quad \operatorname{Im}(x_1 + i x_2) = x_2,$$
and calls these real numbers the real resp. imaginary part of $x_1 + i x_2$.
We have a natural embedding of fields
$$(x \mapsto (x, 0)) : \mathbb{R} \to \mathbb{C}$$
which will be used without further mention; note that this embedding respects the
absolute value.
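The representation of $\mathbb{C}$ as pairs of real numbers can be tried out directly. The following is a minimal sketch, assuming pairs are stored as Python tuples; the function names `add`, `mul`, `conj`, `modulus` and `inv` are illustrative and chosen only for this example.

```python
import math


def add(x, y):
    """(x1, x2) + (y1, y2) = (x1 + y1, x2 + y2)."""
    return (x[0] + y[0], x[1] + y[1])


def mul(x, y):
    """(x1, x2)(y1, y2) = (x1*y1 - x2*y2, x1*y2 + x2*y1)."""
    return (x[0] * y[0] - x[1] * y[1], x[0] * y[1] + x[1] * y[0])


def conj(x):
    """Complex conjugate (x1, -x2)."""
    return (x[0], -x[1])


def modulus(x):
    """Absolute value sqrt(x1^2 + x2^2)."""
    return math.hypot(x[0], x[1])


def inv(x):
    """Multiplicative inverse (x1/d, -x2/d) with d = x1^2 + x2^2, for x != (0, 0)."""
    d = x[0] ** 2 + x[1] ** 2
    if d == 0:
        raise ZeroDivisionError("(0, 0) has no multiplicative inverse")
    return (x[0] / d, -x[1] / d)


i = (0, 1)
print(mul(i, i))                                   # (-1, 0), i.e. i^2 = -1
print(modulus((3, 4)))                             # 5.0
x, y = (1, 2), (3, -4)
print(conj(mul(x, y)) == mul(conj(x), conj(y)))    # True, an instance of (*)
```

With floating-point entries the identity $x \cdot x^{-1} = (1, 0)$ holds only up to rounding, so a numerical check of the inverse should compare with a tolerance rather than test exact equality.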
1.3 Theorem. For the absolute value of complex numbers one has
$$|x + y| \leq |x| + |y|.$$
Proof. Let $x = x_1 + i x_2$ and $y = y_1 + i y_2$. We can assume $y \neq 0$. For any real
number $\lambda$ we have $0 \leq (x_j + \lambda y_j)^2 = x_j^2 + 2\lambda x_j y_j + \lambda^2 y_j^2$, $j = 1, 2$. Adding
these inequalities, we obtain
$$0 \leq |x|^2 + 2\lambda (x_1 y_1 + x_2 y_2) + \lambda^2 |y|^2.$$
Setting $\lambda = -\frac{x_1 y_1 + x_2 y_2}{|y|^2}$ yields
$$0 \leq |x|^2 - 2 \frac{(x_1 y_1 + x_2 y_2)^2}{|y|^2} + \frac{(x_1 y_1 + x_2 y_2)^2}{|y|^4} |y|^2 = |x|^2 - \frac{(x_1 y_1 + x_2 y_2)^2}{|y|^2}$$
and hence $(x_1 y_1 + x_2 y_2)^2 \leq |x|^2 |y|^2$. Consequently,
$$|x + y|^2 = (x_1 + y_1)^2 + (x_2 + y_2)^2 = |x|^2 + 2(x_1 y_1 + x_2 y_2) + |y|^2 \leq |x|^2 + 2|x||y| + |y|^2 = (|x| + |y|)^2.$$ □
1.3.1 Corollary. If $x = x_1 + i x_2$ and $y = y_1 + i y_2$ then
$$|x - y| \leq |x_1 - y_1| + |x_2 - y_2| \quad \text{and} \quad |x_j - y_j| \leq |x - y|.$$
1.3.2 Comment:
A function is basically the same thing as a map, although in many texts (including
this one), the term function is reserved for a map whose codomain is a set whose
elements we perceive as numbers, or at least some closely related generalizations.
For example, the codomain may be $\mathbb{R}$, $\mathbb{C}$ or a subset of one of these sets, or it may
be, say, $[0, \infty]$. Sometimes, we will allow the codomain to consist even of n-tuples
of numbers, see for example Chapter 3. While many basic courses define functions
simply by formulas without worrying about the domain and codomain, in a rigorous
view of the subject, specifying domains and codomains is essential for capturing
even the most basic phenomena: Consider, for example, the function
$$f(x) = x^2. \tag{*}$$
If we specify the domain as $\mathbb{R}$, the function certainly cannot have an inverse no
matter what the codomain is, since it is not injective. If we do specify the domain,
say, as $[0, \infty)$, and the codomain as $\mathbb{R}$, there is still no inverse, since the function is
not onto. If, however, (*) is considered as a function
$$f : [0, \infty) \to [0, \infty),$$
then there is an inverse, which is rather useful, namely
$$f^{-1}(x) = \sqrt{x}.$$
1.4 Polynomials and their roots
1.4.1
Recall that a polynomial with coefficients in $\mathbb{R}$ resp. $\mathbb{C}$ is an expression which is
either $0$ (the zero polynomial) or is of the form
$$p(x) = a_n x^n + \dots + a_1 x + a_0 \quad \text{with } a_j \in \mathbb{R} \text{ resp. } \mathbb{C} \tag{*}$$
for some $n \in \mathbb{N}_0$, where $a_n \neq 0$. Technically, then, a non-zero polynomial is
simply the $(n + 1)$-tuple of real (resp. complex) numbers $(a_0, \dots, a_n)$. (This is the
information we would have to specify if we were to store the polynomial, say, on a
computer.)
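To make the remark about storing a polynomial concrete, here is a minimal sketch that keeps a non-zero polynomial as the tuple $(a_0, \dots, a_n)$ and evaluates it by Horner's scheme. The function names `evaluate` and `degree` and the sample polynomial are assumptions made for this illustration.

```python
def evaluate(coeffs, x):
    """Evaluate a_n x^n + ... + a_1 x + a_0 at x, where coeffs = (a_0, ..., a_n)."""
    result = 0
    for a in reversed(coeffs):      # Horner's scheme: ((a_n x + a_{n-1}) x + ...) x + a_0
        result = result * x + a
    return result


def degree(coeffs):
    """Degree n of a non-zero polynomial stored as (a_0, ..., a_n) with a_n != 0."""
    if not coeffs or coeffs[-1] == 0:
        raise ValueError("expected a non-empty tuple with non-zero leading coefficient")
    return len(coeffs) - 1


p = (6, -5, 1)                        # x^2 - 5x + 6 = (x - 2)(x - 3)
print(degree(p))                      # 2
print(evaluate(p, 2), evaluate(p, 3)) # 0 0
print(evaluate(p, 1j))                # the same tuple evaluated at the complex number i
```

The same tuple can be evaluated at real or complex arguments, which matches the remark that the function attached to a polynomial depends on where its coefficients are considered to live.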
The number $n$ is called the degree of the polynomial $p(x)$. The degree of the
zero polynomial is not defined.
Of course, the polynomial (*) also determines a function
$$(x \mapsto a_n x^n + \dots + a_1 x + a_0) : \mathbb{R} \to \mathbb{R} \text{ resp. } \mathbb{C} \to \mathbb{C}.$$
The zero polynomial determines a function, too, namely one which is constantly $0$.
In analysis, it is quite common to identify a polynomial with the function it
determines (although note carefully that the domain and codomain of the function
corresponding to a polynomial with real coefficients will change if its coefficients
are considered as complex numbers). Nevertheless, this identification is permissible,
since two different polynomials over $\mathbb{R}$ (resp. $\mathbb{C}$) never correspond to the same
function. To see this, note that it suffices to show that a non-zero polynomial does
not correspond to the $0$ function (by passing to the difference). To this end, simply
note that if $|x_0|$ is very large, then
$$|a_n x_0^n| > |a_{n-1}| \cdot |x_0^{n-1}| + \dots + |a_0| \geq |a_{n-1} x_0^{n-1} + \dots + a_0|,$$
and hence $p(x_0) \neq 0$ by the triangle inequality.
In fact, much more is true: a polynomial of degree $n$ can be zero at no more than
$n$ different points of $\mathbb{R}$ (or $\mathbb{C}$). Define a complex root of a polynomial $p(x)$ to be a
number $c \in \mathbb{C}$ such that $p(c) = 0$. If $c \in \mathbb{R}$, we speak of a real root.
1.4.2 Lemma. If $p(x)$ is a polynomial of degree $n$ with coefficients in $\mathbb{C}$ with root $c \in \mathbb{C}$, then
there exists a unique polynomial $q(x)$ with coefficients in $\mathbb{C}$ such that
$$p(x) = q(x)(x - c).$$
Moreover, $q(x)$ has degree $n - 1$. If the coefficients of $p(x)$ and the number $c$ are
real, then the coefficients of the polynomial $q(x)$ are real.
Proof. For existence, recall (or observe by chain cancellation) that for $k \in \mathbb{N}$,
$$x^k - c^k = (x - c)(x^{k-1} + x^{k-2} c + \dots + x c^{k-2} + c^{k-1}).$$
Therefore,
$$p(x) - p(c) = a_n (x^n - c^n) + \dots + a_1 (x - c)$$
can be written as $x - c$ times another polynomial. If $c$ is a root of $p(x)$, $p(c) = 0$
by definition, so our statement follows.
For uniqueness, note that for a non-zero polynomial $q(x)$ of degree $k$, the
polynomial $q(x)(x - c)$ has degree $k + 1$, and hence is non-zero. □
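The quotient $q(x)$ of the lemma can be computed explicitly. The sketch below uses synthetic division, a standard technique which is not the argument used in the proof above; it returns both the quotient and the remainder $p(c)$, so that $p(x) = q(x)(x - c) + p(c)$. The function name `divide_by_linear` and the coefficient-tuple convention $(a_0, \dots, a_n)$ are assumptions of this example.

```python
def divide_by_linear(coeffs, c):
    """Divide p(x) by (x - c) using synthetic division.

    coeffs = (a_0, ..., a_n) with n >= 1. Returns (q, r), where q is the
    coefficient tuple of the quotient (lowest degree first) and r = p(c).
    """
    high_to_low = list(reversed(coeffs))
    acc = [high_to_low[0]]
    for a in high_to_low[1:]:
        acc.append(acc[-1] * c + a)
    remainder = acc.pop()            # the last value is p(c)
    return tuple(reversed(acc)), remainder


p = (6, -5, 1)                       # x^2 - 5x + 6
print(divide_by_linear(p, 2))        # ((-3, 1), 0): p(x) = (x - 3)(x - 2)
print(divide_by_linear(p, 1))        # ((-4, 1), 2): p(x) = (x - 4)(x - 1) + 2
```

When the remainder is zero, $c$ is a root and the returned tuple is the quotient $q(x)$ of the lemma; for real inputs all intermediate values stay real.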
We immediately have the following
1.4.3 Corollary. A polynomial $p(x)$ of degree $n$ with coefficients in $\mathbb{R}$ or $\mathbb{C}$ has at
most $n$ distinct roots.
1.4.4 Proposition. Let $c$ be a (possibly complex) root of a polynomial $p$ with
coefficients in $\mathbb{R}$. Then the complex conjugate $\bar{c}$ is also a root of $p$.
Proof. By 1.2.(*), $p(\bar{c}) = \overline{p(c)}$. □
The Fundamental Theorem of Algebra (which will be proved in Chapter 10,
Theorem 5.2), states that
every polynomial of degree $\geq 1$ has a root in $\mathbb{C}$.
By Lemma 1.4.2, we then see that every polynomial of degree $n$ with coefficients
in $\mathbb{C}$ can be written uniquely (up to order of factors) as
$$p(x) = a_n (x - c_1) \cdots (x - c_n)$$
for some complex numbers $c_1, \dots, c_n$. (The uniqueness is proved by induction.)
Note that the numbers $c_1, \dots, c_n$ may not be all distinct. When $c = c_i$ for exactly
$k > 0$ different values $i \in \{1, \dots, n\}$, we say that the root $c$ has multiplicity $k$.
Applying Proposition 1.4.4 inductively, if a polynomial $p(x)$ has real coefficients,
then the multiplicity of the root $c$ is equal to the multiplicity of $\bar{c}$.
2 Convergent and Cauchy sequences
2.1
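A sequence $(x_n)_n$ in $\mathbb{R}$ or in $\mathbb{C}$ is said to converge to $x$ if
$$\forall \varepsilon > 0 \ \exists n_0 \text{ such that } n \geq n_0 \Rightarrow |x_n - x| < \varepsilon.$$
We write
$$\lim_n x_n = x \quad \text{or simply} \quad \lim x_n = x.$$
The reader is certainly familiar with the easy facts such as $\lim(x_n + y_n) = \lim x_n +
\lim y_n$ or $\lim(x_n y_n) = \lim x_n \cdot \lim y_n$, etc.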
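As a small worked illustration of the definition, consider the sequence $x_n = 1/n$ with limit $0$: for a given $\varepsilon > 0$, any $n_0 > 1/\varepsilon$ works, since $n \geq n_0$ gives $1/n \leq 1/n_0 < \varepsilon$. The sketch below computes such an $n_0$ and spot-checks finitely many indices; the function name `n0_for` is hypothetical, and a finite check of course illustrates the definition rather than proving convergence.

```python
import math


def n0_for(eps):
    """For x_n = 1/n (limit 0), return an n0 such that 1/n < eps whenever n >= n0."""
    return math.floor(1 / eps) + 1


for eps in (0.5, 0.1, 0.001):
    n0 = n0_for(eps)
    # Spot-check finitely many indices; the full claim follows from n >= n0 > 1/eps.
    ok = all(abs(1 / n - 0) < eps for n in range(n0, n0 + 10_000))
    print(eps, n0, ok)
```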
2.2
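A sequence $(x_n)_n$ in $\mathbb{R}$ or in $\mathbb{C}$ is said to be Cauchy if
$$\forall \varepsilon > 0 \ \exists n_0 \text{ such that } m, n \geq n_0 \Rightarrow |x_m - x_n| < \varepsilon.$$
Observation. Every convergent sequence is Cauchy.
(If we have the implication $n \geq n_0 \Rightarrow |x_n - x| < \varepsilon$ then $m, n \geq n_0 \Rightarrow
|x_m - x_n| \leq |x_m - x| + |x - x_n| < 2\varepsilon$.)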
2.3 Theorem. If $a \leq x_n \leq b$ for all $x_n$, then the sequence $(x_n)_n$ contains a
convergent subsequence $(x_{k_n})_n$, and $a \leq \lim_n x_{k_n} \leq b$.
Proof. Let $a \leq x_n \leq b$ for all $n$. Set
$$M = \{x \mid \exists \text{ infinitely many } n \text{ such that } x \leq x_n\}.$$
This set is non-empty ($a \in M$) and bounded (no $x > b$ is in $M$) and hence there is
a finite $s = \sup M$. By the definition, each
$$K_n = \{k \mid s - \tfrac{1}{n} < x_k < s + \tfrac{1}{n}\}$$
is infinite, and we can choose, first, $x_{k_1}$ such that $s - 1 < x_{k_1} < s + 1$ and if
$k_1 < \dots < k_n$ are chosen with $k_j \in K_j$ we can choose a $k_{n+1} \in K_{n+1}$ such that
$k_{n+1} > k_n$. Then obviously $\lim_n x_{k_n} = s$, and equally obviously $a \leq s \leq b$. □
2.4 Theorem. (Bolzano - Cauchy) Every Cauchy sequence of real numbers converges.
Proof. Since for some $m$ and all $n \geq m$, $|x_n - x_m| < 1$, a Cauchy sequence is
bounded and hence it contains a subsequence $x_{k_1}, \dots, x_{k_n}, \dots$ converging to an $x$.
But then $\lim_n x_n = x$: indeed, choose for an $\varepsilon > 0$ an $n_0$ such that for $m, n \geq n_0$
we have $|x_m - x_n| < \varepsilon$ and $|x - x_{k_n}| < \varepsilon$. Then, since $k_n \geq n$,
$|x - x_n| \leq |x - x_{k_n}| + |x_{k_n} - x_n| < 2\varepsilon$ for $n \geq n_0$. □
2.5
From 1.3.1, we see that if $(x_n = x_{n1} + i x_{n2})_n$ is a sequence of complex numbers
then
$(x_n)_n$ converges if and only if both $(x_{nj})_n$ converge
and
$(x_n)_n$ is Cauchy if and only if both $(x_{nj})_n$ are Cauchy.
Consequently we can infer from Theorem 2.4 the following
Corollary. Every Cauchy sequence of complex numbers converges.
3 Continuous functions
3.1
Recall that a real (resp. complex) function of one real (resp. complex) variable is a
mapping
$$f : X \to \mathbb{R} \ (\text{resp. } \to \mathbb{C}) \quad \text{with } X \subseteq \mathbb{R} \ (\text{resp. } \mathbb{C}).$$
In the real case $X$ will be most often an interval, that is, a set $J \subseteq \mathbb{R}$ such that
$x, y \in J$ and $x \leq z \leq y$ implies that $z \in J$.
Recall the standard notation from 1.1 for (bounded) open and closed intervals:
$$(a, b) = \{x \mid a < x < b\} \quad \text{and} \quad \langle a, b\rangle = \{x \mid a \leq x \leq b\}.$$
The intervals $\langle a, b\rangle$ will be often referred to as compact intervals; the reason for this
terminology will become apparent in Chapter 2 below. A function $f : X \to \mathbb{R}$ resp.
$\mathbb{C}$ is said to be continuous if
$$\forall x \in X \ \forall \varepsilon > 0 \ \exists \delta > 0 \text{ such that } |y - x| < \delta \Rightarrow |f(y) - f(x)| < \varepsilon. \tag{3.1.1}$$
3.2 Proposition. A function $f$ is continuous if and only if for every convergent
sequence one has $f(\lim x_n) = \lim f(x_n)$.
Proof. If $f$ is continuous, if $x = \lim x_n \in X$ and if $\varepsilon > 0$ then first choose a
$\delta > 0$ as in (3.1.1) and then an $n_0$ such that $|x_n - x| < \delta$ for $n \geq n_0$. Then
$|f(x_n) - f(x)| < \varepsilon$ for $n \geq n_0$.
Now suppose $f$ is not continuous. Then there is an $x \in X$ and an $\varepsilon > 0$ such that
for every $\delta > 0$ there is a $y(\delta)$ such that $|y(\delta) - x| < \delta$ and $|f(y(\delta)) - f(x)| \geq \varepsilon$.
Set $x_n = y(\frac{1}{n})$. Then $\lim_n x_n = x$ while $f(x_n)$ cannot converge to $f(x)$. □
3.3 Theorem. (The Intermediate Value Theorem) Let $J$ be an interval, let
$f : J \to \mathbb{R}$ be a continuous function, and let for some $u < v$, $\min(f(u), f(v)) \leq
K \leq \max(f(u), f(v))$. Then there is an $x \in \langle u, v\rangle$ such that $f(x) = K$.
Proof. Since a restriction of a continuous function is obviously continuous, since
$f$ is continuous if and only if $-f$ is, and since if $f$ is continuous then any $x \mapsto
f(x) - K$ with $K$ fixed is continuous, it suffices to prove that if $f : \langle a, b\rangle \to \mathbb{R}$ is
continuous and $f(a) \leq 0 \leq f(b)$ then there is a $c \in \langle a, b\rangle$ such that $f(c) = 0$.
Set $c = \sup\{x \in \langle a, b\rangle \mid f(x) \leq 0\}$.
Suppose $f(c) > 0$. Then for $\varepsilon = f(c)$ we have a $\delta > 0$ such that for $x > c - \delta$,
$f(x) > f(c) - \varepsilon = 0$ while there should exist an $x > c - \delta$ such that $f(x) \leq 0$.
Similarly we cannot have $f(c) < 0$ because for $\varepsilon = -f(c)$ we would have a $\delta > 0$
with $f(x) < f(c) + \varepsilon = 0$ for $c \leq x < c + \delta$ contradicting the definition of $c$
again. Thus, $f(c) = 0$. □
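The reduced statement in the proof (a continuous $f$ on $\langle a, b\rangle$ with $f(a) \leq 0 \leq f(b)$ has a zero) also suggests a way to approximate such a zero numerically. The sketch below uses bisection, which is a different technique from the supremum argument above: it repeatedly halves the interval while keeping the sign condition at the endpoints. The function name `bisect_root`, the tolerance parameter and the sample function are assumptions of this illustration.

```python
import math


def bisect_root(f, a, b, tol=1e-12):
    """Approximate a zero of f in [a, b], assuming f is continuous and f(a) <= 0 <= f(b)."""
    if not (f(a) <= 0 <= f(b)):
        raise ValueError("need f(a) <= 0 <= f(b)")
    while b - a > tol:
        mid = (a + b) / 2
        if f(mid) <= 0:
            a = mid          # keeps the invariant f(a) <= 0
        else:
            b = mid          # keeps the invariant f(b) >= 0
    return (a + b) / 2


root = bisect_root(lambda x: x * x - 2, 0.0, 2.0)
print(abs(root - math.sqrt(2)) < 1e-9)   # True
```

Continuity is what guarantees that the shrinking intervals close in on a point where $f$ vanishes; without it the sign condition alone would not suffice.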
3.4 Theorem. A continuous function $f : \langle a, b\rangle \to \mathbb{R}$ on a compact interval attains
a maximum and a minimum.
Proof for the maximum. Set $M = \{f(x) \mid x \in \langle a, b\rangle\}$. If it is not bounded from above, choose
$x_n$ with $f(x_n) > n$ and consider a convergent subsequence $x_{k_n}$ with limit $y$. We have $f(y) =
\lim_n f(x_{k_n})$ which is impossible because it would yield $f(y) > n$ for all $n$. Hence $M$
is bounded from above and has a finite supremum $s$. Now choose $x_n$ with $s - \frac{1}{n} < f(x_n) \leq s$, and
a convergent subsequence $x_{k_n}$ with limit $y \in \langle a, b\rangle$ to obtain $f(y) = s$. □
3.5
A function $f$ is said to be uniformly continuous if
$$\forall \varepsilon > 0\ \ \exists \delta > 0 \text{ such that } \forall x, y,\ |y - x| < \delta \Rightarrow |f(y) - f(x)| < \varepsilon.$$
3.5.1 Theorem. A continuous function on $\langle a, b\rangle$ is uniformly continuous.
Proof. Suppose not. Then there exists an $\varepsilon > 0$ such that
$$\forall n\ \ \exists x_n, y_n \text{ such that } |x_n - y_n| < \frac{1}{n} \text{ and } |f(x_n) - f(y_n)| \ge \varepsilon.$$
Choose a convergent subsequence $(x_{k_n})_n$ and then a convergent subsequence $(y_{k_{m_n}})_n$
of $(y_{k_n})_n$. Then we have $\lim_n x_{k_{m_n}} = \lim_n y_{k_{m_n}}$, contradicting Proposition 3.2 and
the inequality $|\lim_n f(x_{k_{m_n}}) - \lim_n f(y_{k_{m_n}})| \ge \varepsilon$. □
4 Derivatives and the Mean Value Theorem
4.1
Let $f : X \to \mathbb{R}$ be a function, $X \subseteq \mathbb{R}$. We say that $f$ has a limit $A$ at a point $a$ and
write
$$\lim_{x \to a} f(x) = A$$
if it is defined on $(u, v) \setminus \{a\}$ for some $u < a < v$ and if
$$\forall \varepsilon > 0\ \ \exists \delta > 0 \text{ such that } x \in (a - \delta, a + \delta) \setminus \{a\} \Rightarrow |f(x) - A| < \varepsilon.$$
Note that $f$ does not have to be defined in $a$, and if it is, $\lim_{x \to a} f(x) = A$ does not
say anything about the value $f(a)$.
4.2
Let $J$ be an open interval. A function $f : J \to \mathbb{R}$ has a derivative $A$ at a point $x$ if
$$\lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = A$$
(that is, if the limit on the left-hand side exists, and if it is equal to $A$). The reader is
certainly familiar with the notation
$$A = f'(x), \quad \text{or} \quad \frac{df(x)}{dx},$$
and with the basic computation rules like $(f + g)' = f' + g'$ or $(fg)' = f'g +
fg'$ etc.
4.3 Theorem. A function $f$ has a derivative $A$ at the point $x$ if and only if there is
a function $\mu$ defined on some $(-\delta, \delta) \setminus \{0\}$ ($\delta > 0$) such that
$$\lim_{h \to 0} \mu(h) = 0 \quad \text{and} \quad f(x + h) - f(x) = Ah + h\mu(h).$$
Proof. If such a $\mu$ exists we have for $h \in (-\delta, \delta) \setminus \{0\}$,
$$\frac{f(x + h) - f(x)}{h} = A + \mu(h)$$
and hence $\lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = A$. On the other hand, if the derivative exists
then we can set $\mu(h) = \frac{f(x + h) - f(x)}{h} - A$. □
4.3.1 Corollary. If $f$ has a non-zero derivative at a point $x$ then $f(x)$ is neither a
maximum nor a minimum value of $f$. (A maximum resp. minimum value of a function
$f$ is the maximum resp. minimum, if one exists, of the set of values of $f$.)
(Indeed, consider $f(x + h) - f(x) = h(A + \mu(h))$ for $|\mu(h)| < |A|$.)
A point at which a function $f$ has zero derivative or the derivative does not exist
is called a critical point. Corollary 4.3.1 implies that critical points are the only
points at which a function $f$ can have a minimum or a maximum. It is, of course,
not guaranteed that a critical point would be an actual minimum or maximum (take
the point $x = 0$ for the function $f(x) = x^3$). However, see Theorem 4.7 below for
a partial converse of the Corollary.
4.4 The Mean Value Theorem
4.4.1 Theorem. (Rolle) Let $f$ be continuous in $\langle a, b\rangle$ and let it have a derivative
in $(a, b)$. Let $f(a) = f(b)$. Then there is a $c \in (a, b)$ such that $f'(c) = 0$.
Proof. If $f$ is constant then $f'(c) = 0$ for all $c$. If not then, as $f(a) = f(b)$, either
its maximum or its minimum (recall Theorem 3.4) has to be attained at a $c \in (a, b)$.
By 4.3.1, $f'(c) = 0$. □
4.4.2 Theorem. (The Mean Value Theorem, Lagrange’s Theorem) Let $f$ be continuous
in $\langle a, b\rangle$ and let it have a derivative in $(a, b)$. Then there is a $c \in (a, b)$ such
that
$$f'(c) = \frac{f(b) - f(a)}{b - a}.$$
More generally, if, furthermore, $g$ is a function with the same properties and such
that $g(b) \ne g(a)$ and $g'(x) \ne 0$ then there is a $c \in (a, b)$ such that
$$\frac{f'(c)}{g'(c)} = \frac{f(b) - f(a)}{g(b) - g(a)}.$$
Proof. Set $F(x) = (f(x) - f(a))(g(b) - g(a)) - (f(b) - f(a))(g(x) - g(a))$.
Then $F(a) = F(b) = 0$ and $F'(x) = f'(x)(g(b) - g(a)) - g'(x)(f(b) - f(a))$. By 4.4.1
there is a $c \in (a, b)$ with $F'(c) = 0$, and the second formula follows. For the first one, set $g(x) = x$. □
4.4.3
The Mean Value Theorem is often used in the following form (to be compared
with 4.3):
let $x$, $x + h$ be both in an interval in which $f$ has a derivative. Then
$$f(x + h) - f(x) = f'(x + \theta h)\, h \quad \text{for some } \theta \in (0, 1).$$
(Use 4.4.2 for $\langle x, x + h\rangle$ resp. $\langle x + h, x\rangle$.)
4.4.4 Corollary. If $f$ is continuous in $\langle a, b\rangle$ and if it has a positive (resp. negative)
derivative in $(a, b)$ then it strictly increases (resp. decreases) (i.e. $x < y \Rightarrow f(x) <
f(y)$ resp. $x < y \Rightarrow f(x) > f(y)$) in $\langle a, b\rangle$. If $f' \equiv 0$ in $(a, b)$ then $f$ is constant.
(For, $f(y) - f(x) = f'(c)(y - x)$ for some $c$ between $x$ and $y$.)
4.5 The second derivative, convex and concave functions
Suppose $f$ has a derivative $f'(x)$ at every $x \in J$, where $J$ is an open interval. Thus,
we have a new real function $f' : J \to \mathbb{R}$ and this function may have a derivative
again. In such a case we speak of the second derivative.
4.5.1
A function $f$ is said to be convex resp. concave on an interval $\langle a, b\rangle$ if for any two
$x < y$ in $\langle a, b\rangle$ and any $z = tx + (1 - t)y$, ($0 < t < 1$), between these arguments,
$$f(z) \le t f(x) + (1 - t) f(y) \quad \text{resp.} \quad f(z) \ge t f(x) + (1 - t) f(y)$$
(that is, the points of the graph of $f$ lie below (resp. above) the straight line
connecting the points $(x, f(x))$ and $(y, f(y))$).
4.5.2 Proposition. Let $f$ be continuous on $\langle a, b\rangle$ and let $f$ have a non-negative
(resp. non-positive) second derivative on $(a, b)$. Then it is convex (resp. concave)
on $\langle a, b\rangle$.
Proof. In the notation above we have
$$y - z = y - tx - (1 - t)y = t(y - x), \quad z - x = (1 - t)(y - x).$$
Let the second derivative be non-negative. Then by the Mean Value Theorem we have $x < u < z < v < y$ and
$u < w < v$ such that
$$\frac{f(y) - f(z)}{y - z} - \frac{f(z) - f(x)}{z - x} = f'(v) - f'(u) = f''(w)(v - u) \ge 0$$
so that
$$\frac{f(y) - f(z)}{t(y - x)} \ge \frac{f(z) - f(x)}{(1 - t)(y - x)},$$
hence $(1 - t)(f(y) - f(z)) \ge t(f(z) - f(x))$ and finally
$$t f(x) + (1 - t) f(y) \ge f(z).$$ □
4.5.3 An application: Young’s inequality
We have
Proposition. Let $a, b > 0$ and let $p, q > 1$ be such that $\frac{1}{p} + \frac{1}{q} = 1$. Then
$$ab \le \frac{a^p}{p} + \frac{b^q}{q}.$$
Proof. Since $\ln''(x) = -\frac{1}{x^2} < 0$, $\ln$ is concave; thus if, say, $a^p < b^q$ we have
$$\ln\left(\frac{1}{p} a^p + \frac{1}{q} b^q\right) \ge \frac{1}{p} \ln(a^p) + \frac{1}{q} \ln(b^q) = \ln a + \ln b = \ln(ab)$$
and since $\ln$ increases, the inequality follows. □
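Young's inequality can be spot-checked numerically. The sketch below is only an illustration, not part of the proof; the helper name `young_gap`, the sampling ranges and the random seed are assumptions chosen for the example.

```python
import random

def young_gap(a, b, p):
    """Return a**p/p + b**q/q - a*b, where q is determined by 1/p + 1/q = 1."""
    q = p / (p - 1.0)
    return a**p / p + b**q / q - a * b

random.seed(0)
gaps = [
    young_gap(random.uniform(0.1, 5.0), random.uniform(0.1, 5.0), random.uniform(1.1, 6.0))
    for _ in range(1000)
]
# The inequality predicts a non-negative gap (up to rounding error).
print(min(gaps) >= -1e-9)
# For p = q = 2 and a = b the two sides coincide.
print(young_gap(2.0, 2.0, 2.0))
```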
4.6 Derivatives of higher order and Taylor’s Theorem
Just as we defined the first and second derivative of a function on an open interval
$J$, we may iterate the process to define the third, fourth derivative, etc. In general,
we speak of the derivative of $n$'th order, and define
$$f^{(0)} = f, \quad f^{(1)} = f' \quad \text{and further} \quad f^{(n+1)} = (f^{(n)})'.$$
(Of course, as before, for a given function, such higher derivatives may or may not
exist.)
4.6.1 Theorem. (Taylor) Let $f$ have derivatives up to order $n + 1$ in an open
interval containing $a$ and $x$, $a \ne x$. Then there is a $c$ in the open interval between
$a$ and $x$ such that
$$f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!} (x - a)^k + \frac{f^{(n+1)}(c)}{(n + 1)!} (x - a)^{n+1}.$$
Proof. Fix $x$ and $a$ and define a function $R(t)$ of one real variable $t$ by setting
$$R(t) = f(x) - \sum_{k=0}^{n} \frac{f^{(k)}(t)}{k!} (x - t)^k.$$
Then we have
$$R'(t) = \frac{dR(t)}{dt} = -\sum_{k=0}^{n} \frac{f^{(k+1)}(t)}{k!} (x - t)^k + \sum_{k=1}^{n} \frac{f^{(k)}(t)}{k!} k (x - t)^{k-1}.$$
Substituting $k = l + 1$ in the second sum we obtain
$$R'(t) = -\sum_{k=0}^{n} \frac{f^{(k+1)}(t)}{k!} (x - t)^k + \sum_{l=0}^{n-1} \frac{f^{(l+1)}(t)}{l!} (x - t)^l = -\frac{f^{(n+1)}(t)}{n!} (x - t)^n.$$
Now define $g(t) = (x - t)^{n+1}$. Then $g'(t) = -(n + 1)(x - t)^n$ and $g(x) = 0$. Since
also $R(x) = 0$ we obtain from Theorem 4.4.2,
$$\frac{R(a)}{g(a)} = \frac{R(a) - R(x)}{g(a) - g(x)} = \frac{R'(c)}{g'(c)} = \frac{f^{(n+1)}(c)(x - c)^n}{n!(n + 1)(x - c)^n}$$
and hence
$$R(a) = \frac{f^{(n+1)}(c)}{n!(n + 1)} g(a) = \frac{f^{(n+1)}(c)}{(n + 1)!} (x - a)^{n+1}$$
and the statement follows, since $R(a) = f(x) - \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!} (x - a)^k$, that is,
$f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!} (x - a)^k + R(a)$. □
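To see the remainder term of Theorem 4.6.1 at work, the following sketch compares the Taylor polynomial of the exponential function with the true value and with a bound derived from the Lagrange form of the remainder. The choice $f = \exp$ and the helper names `taylor_exp` and `lagrange_bound` are assumptions made for illustration; the bound uses only that every derivative of $\exp$ is $\exp$ and that $\exp$ is increasing.

```python
import math

def taylor_exp(x, a, n):
    """Taylor polynomial of exp of order n centered at a, evaluated at x."""
    return sum(math.exp(a) * (x - a) ** k / math.factorial(k) for k in range(n + 1))

def lagrange_bound(x, a, n):
    """Bound for |f^(n+1)(c)| |x - a|^(n+1) / (n+1)! with f = exp:
    c lies between a and x, so exp(c) <= exp(max(a, x))."""
    return math.exp(max(a, x)) * abs(x - a) ** (n + 1) / math.factorial(n + 1)

x, a = 1.0, 0.0
for n in (2, 5, 10):
    err = abs(math.exp(x) - taylor_exp(x, a, n))
    print(n, err, lagrange_bound(x, a, n))
```

In each printed row the actual error should not exceed the bound, and both shrink rapidly with $n$.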
4.7 Local extremes
One immediate consequence of Taylor’s Theorem is a partial converse of
Corollary 4.3.1. Suppose a function $f$ is defined on an open interval containing
a point $x_0$. We say that $x_0$ is a local maximum (resp. local minimum) of $f$ if there
exists a $\delta > 0$ such that for all $x \in (x_0 - \delta, x_0 + \delta)$ such that $x \ne x_0$, $f(x) < f(x_0)$
(resp. $f(x) > f(x_0)$). We have the following
Theorem. Let $f$ be a function such that $f'$ and $f''$ exist and are continuous on
an open interval $(a, b)$ containing a point $x_0$. Suppose further that $f'(x_0) = 0$,
$f''(x_0) < 0$ (resp. $f''(x_0) > 0$). Then $x_0$ is a local maximum (resp. local minimum)
of $f$.
Proof. Let us treat the case of $f''(x_0) = q > 0$; the proof in the other case is
analogous. By Taylor’s Theorem (with $n = 1$, using $f'(x_0) = 0$), for $x \in (a, b)$, $x \ne x_0$, there exists a point $c$ in
the open interval between $x_0$ and $x$ such that
$$f(x) = f(x_0) + \frac{f''(c)}{2} (x - x_0)^2. \tag{*}$$
Since $f''$ is continuous, there exists a $\delta > 0$ such that $f''(y) > 0$ for all $y \in (x_0 - \delta, x_0 + \delta)$;
in particular $f''(c) > 0$ whenever $x \in (x_0 - \delta, x_0 + \delta)$, since $c$ lies between $x_0$ and $x$.
Then it follows immediately from (*) that if $x \in (x_0 - \delta, x_0 + \delta)$,
$x \ne x_0$, then $f(x) > f(x_0)$. □
5 Uniform convergence
5.1
Let $f_n$ be real or complex functions defined on a set $X$. We write $\lim_n f_n = f$, or
briefly $f_n \to f$, if $\lim_n f_n(x) = f(x)$ for all $x \in X$, and say that $f_n$ converge to
$f$ pointwise. This convergence is not very satisfactory: consider $f_n : \langle 0, 1\rangle \to \mathbb{R}$
defined by $f_n(x) = x^n$, an example where all the $f_n$ are continuous while the limit
$f$ is not.
We shall need to work with a stronger concept. A sequence of (real or complex)
functions $(f_n)_n$ is said to converge to $f$ uniformly if
$$\forall \varepsilon > 0\ \ \exists n_0 \text{ such that } \forall n \ge n_0\ \ \forall x,\ |f_n(x) - f(x)| < \varepsilon.$$
This is often indicated by writing $f_n \rightrightarrows f$.
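The example $f_n(x) = x^n$ can be explored numerically. The sketch below, in which the helper name `witness` is an assumption for illustration, exhibits for each $n$ a point $x_n \in (0, 1)$ with $f_n(x_n) = \frac{1}{2}$, while the pointwise limit vanishes on $\langle 0, 1)$. So the distance between $f_n$ and its limit does not drop below $\frac{1}{2}$, even though $f_n(x) \to 0$ at every fixed $x < 1$.

```python
def witness(n):
    """A point x_n in (0, 1) with x_n**n = 1/2."""
    return 0.5 ** (1.0 / n)

for n in (1, 10, 100, 1000):
    x = witness(n)
    print(n, x, x**n)          # x**n stays at about 0.5 for every n

# Pointwise convergence at the fixed point x = 0.9:
print([0.9**n for n in (1, 10, 100, 1000)])
```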
5.2 Theorem. Let $f_n$ be continuous and let $f_n \rightrightarrows f$. Then $f$ is continuous.
Proof. Take an $x_0 \in X$ and an $\varepsilon > 0$. Choose an $n$ such that
for all $x$, $|f_n(x) - f(x)| < \frac{\varepsilon}{3}$, and then a $\delta > 0$ such that $|f_n(x_0) - f_n(x)| < \frac{\varepsilon}{3}$ for
$|x_0 - x| < \delta$. Then for $|x_0 - x| < \delta$,
$$|f(x_0) - f(x)| \le |f(x_0) - f_n(x_0)| + |f_n(x_0) - f_n(x)| + |f_n(x) - f(x)| < \varepsilon.$$ □
5.3 Theorem. Let $f_n$ have derivatives on an open interval $J$, let $f_n \to f$ and let
$f_n' \rightrightarrows g$. Then $f$ has a derivative and $f' = g$.
Proof. By the Mean Value Theorem we have for some $0 < \theta < 1$,
$$\begin{aligned}
\left|\frac{f(x + h) - f(x)}{h} - g(x)\right|
&= \left|\frac{f(x + h) - f_n(x + h)}{h} - \frac{f(x) - f_n(x)}{h} + \frac{f_n(x + h) - f_n(x)}{h} - g(x)\right| \\
&= \left|\frac{f(x + h) - f_n(x + h)}{h} - \frac{f(x) - f_n(x)}{h} + f_n'(x + \theta h) - g(x)\right| \\
&\le \frac{1}{|h|} |f(x + h) - f_n(x + h)| + \frac{1}{|h|} |f(x) - f_n(x)| \\
&\quad + |f_n'(x + \theta h) - g(x + \theta h)| + |g(x + \theta h) - g(x)|.
\end{aligned}$$
Fix an $h \ne 0$ such that $|g(x + \theta h) - g(x)| < \frac{\varepsilon}{4}$. Then choose an $n$ such that
(1) $|f(x + h) - f_n(x + h)| < \frac{\varepsilon}{4} |h|$ and $|f(x) - f_n(x)| < \frac{\varepsilon}{4} |h|$, and
(2) $|f_n'(x + \theta h) - g(x + \theta h)| < \frac{\varepsilon}{4}$.
(Inequality (2) is where we need the convergence to be uniform: we do not know
the exact position of $x + \theta h$.) Then
$$\left|\frac{f(x + h) - f(x)}{h} - g(x)\right| < \frac{1}{|h|} \frac{\varepsilon}{4} |h| + \frac{1}{|h|} \frac{\varepsilon}{4} |h| + \frac{\varepsilon}{4} + \frac{\varepsilon}{4} = \varepsilon.$$ □
6 Series. Series of functions
6.1
Let $(a_n)_n$ be a sequence of real or complex numbers. The associated series (or sum
of a series) $\sum_{n=1}^{\infty} a_n$ (briefly, $\sum a_n$ if there is no danger of confusion) is the limit
$\lim_n \sum_{k=1}^{n} a_k$ provided it exists; in such a case we say that $\sum a_n$ converges, and we say
that it converges absolutely if $\sum |a_n|$ converges.
6.2 Consequences of Absolute Convergence
6.2.1 Proposition. An absolutely convergent series converges. More generally, if
$|a_n| \le b_n$ and $\sum b_n$ converges then $\sum a_n$ converges.
Proof. Set $s_n = \sum_{k=1}^{n} a_k$ and $\bar{s}_n = \sum_{k=1}^{n} b_k$. For $m \ge n$ we have
$$|s_m - s_n| = \left|\sum_{k=n+1}^{m} a_k\right| \le \sum_{k=n+1}^{m} |a_k| \le \sum_{k=n+1}^{m} b_k = |\bar{s}_m - \bar{s}_n|.$$
Thus, if $(\bar{s}_n)_n$ is convergent, hence Cauchy, then $(s_n)_n$ is Cauchy and hence
convergent. □
6.2.2 Proposition. The series $\sum a_n$ converges absolutely if and only if for every
$\varepsilon > 0$ there is an $n_0$ with $\sum_{k \in K} |a_k| < \varepsilon$ for every finite $K \subseteq \{n \mid n \ge n_0\}$.
Proof. The formula is equivalent to stating that $\sum_{k=n}^{m} |a_k| < \varepsilon$ for $n_0 \le n \le m$.
Thus, the condition amounts to stating that the sequence $\left(\sum_{k=1}^{n} |a_k|\right)_n$ is
Cauchy. □
6.2.3 Theorem. Let $\sum a_n$ converge absolutely. Then for all bijections $p$ from the
set of natural numbers $\{1, 2, \dots\}$ to itself the sums $\sum_{n=1}^{\infty} a_{p(n)}$ are equal.
Proof. Let $\sum_{n=1}^{\infty} a_{p(n)} = s$ for a bijection $p$, and let $\varepsilon > 0$. Choose $n_1$ sufficiently large such that
$\sum_{k \in K} |a_k| < \frac{\varepsilon}{2}$ for every finite $K \subseteq \{n \mid n \ge n_1\}$ and, further, an $n_0$ such that for
$n \ge n_0$ we have
$$\left|\sum_{k=1}^{n} a_{p(k)} - s\right| < \frac{\varepsilon}{2} \quad \text{and} \quad \{p(1), \dots, p(n)\} \supseteq \{1, \dots, n_1\}.$$
Now if $n \ge \max\{p(1), \dots, p(n_0)\}$ and we consider $K = \{1, \dots, n\} \setminus \{p(1), \dots, p(n_0)\}$ we obtain
$$\left|\sum_{k=1}^{n} a_k - s\right| = \left|\sum_{k=1}^{n_0} a_{p(k)} + \sum_{k \in K} a_k - s\right| \le \left|\sum_{k=1}^{n_0} a_{p(k)} - s\right| + \sum_{k \in K} |a_k| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$ □
6.3
It is worth taking this a little further. A set $S$ is called countable if there exists a
bijection $\varphi : \{1, 2, \dots\} \to S$. Note that this is the same as ordering $S$ into an
infinite sequence $s_1, s_2, \dots$ where the $s_i$ go through all elements of $S$, and each element
occurs exactly once. Let us say that $\sum_{s \in S} a_s$ converges absolutely if
$$\sup_{K \subseteq S \text{ finite}} \sum_{s \in K} |a_s|$$
is finite. By Proposition 6.2.2, this is equivalent to $\sum_{n=1}^{\infty} a_{\varphi(n)}$ converging absolutely
for one specified bijection $\varphi$ (which can be arbitrary). Theorem 6.2.3 then shows
that when this occurs, then
$$\sum_{s \in S} a_s$$
is well-defined. Here is an example where this point of view helps:
6.3.1 Theorem. Let $S_1, S_2, \dots$ be disjoint finite or countable sets, and let $S =
\bigcup_i S_i$. Then the set $S$ is finite or countable. Furthermore, if $\sum_{s \in S} a_s$ converges
absolutely, then
$$\sum_{i=1}^{\infty} \left(\sum_{s \in S_i} a_s\right) = \sum_{s \in S} a_s, \tag{*}$$
and the left-hand side converges absolutely.
Proof. The case when $S$ is finite is not interesting. Otherwise, we may order the
elements of $S$ into an infinite sequence as follows: Assume each of the sets $S_i$ is
ordered into a (finite or infinite) sequence. Then let $T_n$ consist of all the $i$'th elements
(if any) of $S_j$ such that $1 \le i, j \le n$. Then clearly each $T_n$ is finite, and $T_n \subseteq T_{n+1}$,
and $\bigcup_i T_i = S$. Thus, we can order $S$ by taking all the elements of $T_1$, then all the
remaining elements of $T_2$, etc. Thus, $S$ is countable.
Now let us investigate (*). The supremum $\sup \sum_{s \in K} |a_s|$ over finite subsets $K$
of $S_i$ is less than or equal to the analogous supremum over finite subsets $K$ of
$S$, which shows that each $\sum_{s \in S_i} a_s$ converges absolutely. Further, for a finite subset
$K \subseteq \{1, 2, \dots\}$,
$$\sum_{i \in K} \left|\sum_{s \in S_i} a_s\right| \le \sum_{i \in K} \sum_{s \in S_i} |a_s| \le \sup \sum_{i \in K} \sum_{s \in L_i} |a_s|$$
where the supremum on the right-hand side is over all finite subsets $L_i \subseteq S_i$. We
see that the right-hand side is finite by our assumption of absolute convergence over
$S$, and therefore the left-hand side of (*) converges absolutely.
Finally, to prove equality in (*), use a variation of the above proof of the fact that
$S$ is countable: Let $T_n$ consist of sufficiently many elements of $S_1, \dots, S_n$ such that
the sum
$$\sum_{i=1}^{n} \left(\sum_{s \in T_n \cap S_i} a_s\right)$$
differs from
$$\sum_{i=1}^{n} \left(\sum_{s \in S_i} a_s\right)$$
by less than $1/n$. Then the limit of these particular partial sums is the left-hand side
of (*), but is also equal to the right-hand side by absolute convergence. □
6.3.2 Corollary. Let $\sum_{m=0}^{\infty} a_m$, $\sum_{n=0}^{\infty} b_n$ be absolutely convergent series. Then
$$\left(\sum_{m=0}^{\infty} a_m\right) \left(\sum_{n=0}^{\infty} b_n\right) = \sum_{n=0}^{\infty} \left(\sum_{k=0}^{n} a_k b_{n-k}\right), \tag{*}$$
with the right-hand side converging absolutely.
Proof. By the assumption, the supremum of $\sum_{m \in K, n \in L} |a_m| \cdot |b_n|$ over $K$, $L$ finite
subsets of $\{0, 1, 2, \dots\}$ is finite, thus proving that $\sum_{(m,n) \in S} a_m b_n$ is absolutely convergent,
where $S$ is the set of all pairs of numbers $0, 1, \dots$. The rest follows from
Theorem 6.3.1. □
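The product formula of Corollary 6.3.2 can be illustrated on truncated series. The following sketch uses two geometric series as an assumed example, and the helper name `cauchy_product` is hypothetical; truncating at 60 terms is a choice made so that the neglected tails are far below rounding error.

```python
def cauchy_product(a, b):
    """Coefficients c_n = sum_{k=0}^{n} a_k * b_{n-k} for n below the common length."""
    n_terms = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(n_terms)]

N = 60
a = [0.5**m for m in range(N)]           # geometric series with sum 2
b = [(-1.0 / 3.0)**n for n in range(N)]  # geometric series with sum 3/4
c = cauchy_product(a, b)
print(sum(a) * sum(b), sum(c))           # both should be close to 1.5
```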
6.4
Let $(f_n)_n$ be a sequence of real or complex functions (defined on an $X \subseteq \mathbb{R}$ resp. $\mathbb{C}$).
The series $\sum_{n=1}^{\infty} f_n$ is defined as a function with values $\sum_{n=1}^{\infty} f_n(x)$ whenever the last
series converges. If $f(x) = \sum_{n=1}^{\infty} f_n(x)$ converges (resp. converges absolutely) for
all $x \in X$ we say that $\sum f_n$ converges (resp. converges absolutely). If $\sum_{k=1}^{n} f_k(x) \rightrightarrows
f(x)$ we say that $\sum_{n=1}^{\infty} f_n$ converges uniformly.
6.4.1
Since finite sums of continuous functions are continuous and since $(f_1 + \cdots + f_n)' =
f_1' + \cdots + f_n'$ we obtain from Theorem 5.2, Theorem 5.3 and Proposition 6.2.1
Corollary. 1. Let $f_n$ be continuous and let $\sum f_n$ converge uniformly. Then the
resulting function is continuous.
2. Let $f = \sum_n f_n$ converge, let $f_n'$ exist and let $\sum f_n'$ converge uniformly. Then $f'$
exists and is equal to $\sum f_n'$ (that is, the derivative of $\sum f_n$ can be obtained by
taking derivatives of the individual summands).
3. The statements 1 (resp. 2) apply to the case of $|f_n(x)| \le a_n$ (resp. $|f_n'(x)| \le a_n$) with $\sum a_n$
convergent; here the convergence is, moreover, absolute.
7 Power series
7.1
A power series with center $c$ is a series $\sum_{n=1}^{\infty} a_n (x - c)^n$. For now we will limit ourselves
to the real context; later, in Chapter 10, we will discuss power series in the complex case.
7.2
The limes superior (sometimes also called the upper limit) of a sequence $(a_n)_n$ of
real numbers is the number
$$\limsup_n a_n = \inf_n \sup_{k \ge n} a_k.$$
It obviously exists if the sequence $(a_n)_n$ is bounded; if it is not bounded from above we set $\limsup_n a_n =
+\infty$. It is easy to see that $\limsup_n a_n = \lim a_n$ whenever the latter exists. The limes
inferior (or lower limit) is defined analogously with inf and sup switched.
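7.2.1 Proposition. Let $\limsup_n a_n = \inf_n \sup_{k \ge n} a_k = a$ and $\lim_n b_n = b$. Let $a_n, b_n \ge
0$ and let $a, b$ be finite. Then $\limsup_n a_n b_n = ab$.
Proof. Choose an $\varepsilon > 0$ and a $K > a + b$. Take an $\eta > 0$ such that $K > a + b + \eta$
and $K\eta < \varepsilon$. There is an $n_0$ such that
$$n \ge n_0 \Rightarrow \sup_{k \ge n} a_k < a + \eta \text{ and } b - \eta < b_n < b + \eta.$$
Hence for all $k \ge n_0$,
$$a_k b_k < (a + \eta)(b + \eta) = ab + \eta(a + b + \eta) < ab + K\eta < ab + \varepsilon.$$
Moreover, since $\sup_{k \ge n} a_k \ge a$, for every $n \ge n_0$ there exists a $k(n) \ge n$ such that $a - \eta < a_{k(n)} < a + \eta$ and
$b - \eta < b_{k(n)} < b + \eta$, so that
$$a_{k(n)} b_{k(n)} \ge ab - \eta(a + b) > ab - K\eta > ab - \varepsilon$$
(if $a < \eta$ or $b < \eta$ the first inequality holds because $ab - \eta(a + b) \le 0$; otherwise
$a_{k(n)} b_{k(n)} > (a - \eta)(b - \eta) > ab - \eta(a + b)$). We see that
$$ab - \varepsilon < \sup_{k \ge n} a_k b_k \le ab + \varepsilon \quad \text{for all } n \ge n_0$$
and conclude that $\limsup_n a_n b_n = ab$. □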
7.2.2
For a power series $\sum_{n=1}^{\infty} a_n (x - c)^n$ define the radius of convergence
$$\rho = \rho((a_n)_n) = \frac{1}{\limsup_n \sqrt[n]{|a_n|}}$$
if $\limsup_n \sqrt[n]{|a_n|} \ne 0$; otherwise set $\rho((a_n)_n) = +\infty$.
Theorem. Let $r < \rho((a_n)_n)$. Then the power series $\sum_{n \ge 1} a_n (x - c)^n$ converges
absolutely and uniformly on the set $\{x \mid |x - c| \le r\}$.
On the other hand, if $|x - c| > \rho$ then $\sum_{n=1}^{\infty} a_n (x - c)^n$ does not converge.
Proof. I. Let $|x - c| \le r < \rho$. Choose a $q$ such that
$$r \cdot \inf_n \sup_{k \ge n} \sqrt[k]{|a_k|} < q < 1.$$
Then there is an $n$ such that
$$r \cdot \sup_{k \ge n} \sqrt[k]{|a_k|} < q \quad \text{and hence} \quad r \sqrt[k]{|a_k|} < q \text{ for all } k \ge n.$$
Choose a $K \ge 1$ such that $r^k |a_k| \le K q^k$ for $k \le n$. Then
$$|a_k (x - c)^k| \le K q^k \quad \text{for all } k \text{ and } |x - c| \le r$$
and hence by Corollary 6.4.1, the series converges on $\{x \mid |x - c| \le r\}$ absolutely
and uniformly.
II. Let $|x - c| > \rho$; then $|x - c| \cdot \inf_n \sup_{k \ge n} \sqrt[k]{|a_k|} > 1$, hence $|x - c| \cdot
\sup_{k \ge n} \sqrt[k]{|a_k|} > 1$ for all $n$, and hence for each $n$ there is a $k(n) \ge n$ such
that $|x - c| \cdot \sqrt[k(n)]{|a_{k(n)}|} > 1$, and hence $|a_{k(n)} (x - c)^{k(n)}| > 1$ and the series
cannot converge: its summands do not even converge to 0. □
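The radius of convergence can be estimated numerically from the root formula. The sketch below is a rough heuristic, not a proof: it replaces the limes superior by a maximum over a finite window of indices, and the function name `root_test_estimate`, the window and the sample coefficients are assumptions made for illustration.

```python
def root_test_estimate(coeff, n_lo, n_hi):
    """Approximate lim sup |a_n|**(1/n) by the maximum of |a_n|**(1/n)
    over n_lo <= n <= n_hi and return its reciprocal as an estimate
    of the radius of convergence."""
    roots = [abs(coeff(n)) ** (1.0 / n) for n in range(n_lo, n_hi + 1) if coeff(n) != 0]
    return 1.0 / max(roots)

# a_n = n * 2**n: here |a_n|**(1/n) = 2 * n**(1/n) tends to 2, so the radius is 1/2.
estimate = root_test_estimate(lambda n: n * 2.0**n, 500, 1000)
print(estimate)
```

The estimate approaches $\frac{1}{2}$ slowly because $\sqrt[n]{n}$ tends to 1 slowly, which is the same limit used in 7.3 below.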
7.3
Consider the series
$$\sum_{n=1}^{\infty} n a_n (x - c)^{n-1}. \tag{*}$$
Obviously it converges if and only if $\sum_{n=1}^{\infty} n a_n (x - c)^n$ does and hence its radius of
convergence is
$$\frac{1}{\limsup_n \sqrt[n]{n |a_n|}}.$$
By Proposition 7.2.1, $\limsup_n \sqrt[n]{n |a_n|} = \limsup_n \sqrt[n]{n} \sqrt[n]{|a_n|} = \lim_n \sqrt[n]{n} \cdot \limsup_n
\sqrt[n]{|a_n|} = \limsup_n \sqrt[n]{|a_n|}$ (since $\lim_n \sqrt[n]{n} = \lim_n e^{\frac{1}{n} \ln n} = e^0 = 1$). Thus,
the radius of convergence of the series (*) is the same as that of the original series
$\sum a_n (x - c)^n$, and since $n a_n (x - c)^{n-1}$ is the derivative of $a_n (x - c)^n$ we conclude
from 5.3 and 6.4.1 that for $|x - c| < \rho$ the series $\sum a_n (x - c)^n$ has a derivative,
and that it is obtained as the sum of the derivatives of the individual summands.
So far, this derivative had to be understood as in the real context. In fact, however,
it is valid for complex power series as well; see Chapter 10.
7.4 Remark
If we proceed to compute the higher derivatives summand-wise, we obtain
$$f^{(k)}(x) = \sum_{n=k}^{\infty} n(n - 1) \cdots (n - k + 1) a_n (x - c)^{n-k}.$$
In particular
$$f^{(k)}(c) = k!\, a_k, \quad \text{and hence} \quad a_k = \frac{f^{(k)}(c)}{k!}.$$
Thus, if a function can be written as a power series with a center $c$ then the
coefficients $a_n$ are uniquely determined (they do depend on the $c$, of course).
Compare this with the formula in 4.6.1. It should be noted, though, that in real
analysis it can easily happen that a function $f$ has all derivatives without being
representable as a power series: the remainder $\frac{f^{(n+1)}(t)}{(n+1)!} (x - c)^{n+1}$ may not converge
to zero with increasing $n$ (see Exercise (13)). In fact, it is interesting to note that
many important constructions in real analysis, such as the smooth partition of unity
which we will need in Chapter 12, depend on the use of such functions.
8 A few facts about the Riemann integral
8.1
A partition of a compact interval $\langle a, b\rangle$ is a sequence
$$D : a = t_0 < t_1 < \cdots < t_n = b.$$
The mesh of the partition is the maximum of the numbers $|t_{i+1} - t_i|$. A partition $D' :
a = t_0' < t_1' < \cdots < t_m' = b$ refines $D$ if $\{t_j \mid j = 1, \dots, n\} \subseteq \{t_j' \mid j = 1, \dots, m\}$.
Let $f : \langle a, b\rangle \to \mathbb{R}$ be a bounded function (this means that the set of values of
$f$ is bounded). Define the lower and upper sum of $f$ in $D$ as
$$s(f, D) = \sum_{j=1}^{n} m_j (t_j - t_{j-1}) \quad \text{and} \quad S(f, D) = \sum_{j=1}^{n} M_j (t_j - t_{j-1})$$
where $m_j = \inf\{f(x) \mid x \in \langle t_{j-1}, t_j\rangle\}$ and $M_j = \sup\{f(x) \mid x \in \langle t_{j-1}, t_j\rangle\}$.
8.1.1 Proposition. 1. If $D'$ refines $D$ then $s(f, D) \le s(f, D')$ and $S(f, D) \ge
S(f, D')$.
2. For any two partitions $D_1, D_2$, $s(f, D_1) \le S(f, D_2)$.
Proof. If $t_{k-1} = t_l' < t_{l+1}' < \cdots < t_{l+r}' = t_k$ and $a_j = \sup\{f(x) \mid x \in
\langle t_{l+j-1}', t_{l+j}'\rangle\}$, $A = \sup\{f(x) \mid x \in \langle t_{k-1}, t_k\rangle\}$ then
$$\sum_j a_j (t_{l+j}' - t_{l+j-1}') \le \sum_j A (t_{l+j}' - t_{l+j-1}') = A(t_k - t_{k-1})$$
and $S(f, D') \le S(f, D)$ follows. Similarly for the lower sums.
Let $D$ be a common refinement of $D_1$ and $D_2$ (easily obtained, e.g., from the
union of the elements of the two partitions). Then
$$s(f, D_1) \le s(f, D) \le S(f, D) \le S(f, D_2).$$ □
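Lower and upper sums are easy to compute when the infimum and supremum on each subinterval are known. The sketch below makes the simplifying assumption that $f$ is non-decreasing, so that $m_j$ and $M_j$ are the values at the left and right endpoints; the helper names `lower_upper_sums` and `uniform_partition` are hypothetical. It illustrates Proposition 8.1.1 for $f(x) = x^2$ on $\langle 0, 1\rangle$.

```python
def lower_upper_sums(f, partition):
    """Return (s(f, D), S(f, D)) for a function f that is non-decreasing
    on the interval, so inf and sup on [t0, t1] are f(t0) and f(t1)."""
    pairs = list(zip(partition, partition[1:]))
    s = sum(f(t0) * (t1 - t0) for t0, t1 in pairs)
    S = sum(f(t1) * (t1 - t0) for t0, t1 in pairs)
    return s, S

def uniform_partition(a, b, n):
    """Partition of [a, b] into n subintervals of equal length."""
    return [a + (b - a) * j / n for j in range(n + 1)]

f = lambda x: x * x                        # non-decreasing on [0, 1]
coarse = uniform_partition(0.0, 1.0, 4)
fine = uniform_partition(0.0, 1.0, 8)      # refines the coarse partition
s_coarse, S_coarse = lower_upper_sums(f, coarse)
s_fine, S_fine = lower_upper_sums(f, fine)
print(s_coarse, s_fine, S_fine, S_coarse)
```

Refining the partition raises the lower sum and lowers the upper sum, squeezing both toward the value $\frac{1}{3}$ of the integral.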
8.2
By Proposition 8.1.1, we can define the lower resp. upper Riemann integral of $f$
over $\langle a, b\rangle$ by setting
$$\underline{\int_a^b} f(x)\,dx = \sup_D s(f, D) \quad \text{resp.} \quad \overline{\int_a^b} f(x)\,dx = \inf_D S(f, D).$$
If $\underline{\int_a^b} f(x)\,dx = \overline{\int_a^b} f(x)\,dx$ we denote the common value by
$$\int_a^b f(x)\,dx \quad \text{or briefly} \quad \int_a^b f$$
and call it the Riemann integral of $f$ over $\langle a, b\rangle$.

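8.2.1 Proposition. $\int_a^b f$ exists if and only if for every $\varepsilon > 0$ there is a partition $D$
such that
$$S(f, D) - s(f, D) < \varepsilon.$$
Proof. I. Let $\int_a^b f$ exist and let $\varepsilon > 0$. There is a partition $D_1$ such that
$S(f, D_1) < \int_a^b f + \frac{\varepsilon}{2}$ and a partition $D_2$ such that $s(f, D_2) > \int_a^b f - \frac{\varepsilon}{2}$.
Then we have, for the common refinement $D$ of $D_1$ and $D_2$,
$$S(f, D) - s(f, D) < \int_a^b f + \frac{\varepsilon}{2} - \int_a^b f + \frac{\varepsilon}{2} = \varepsilon.$$
II. Let the statement hold. Choose an $\varepsilon > 0$ and a $D$ such that $S(f, D) - s(f, D) <
\varepsilon$. Then
$$\overline{\int_a^b} f \le S(f, D) < s(f, D) + \varepsilon \le \underline{\int_a^b} f + \varepsilon.$$
Since $\varepsilon > 0$ was arbitrary, and since the lower integral never exceeds the upper one by Proposition 8.1.1, $\overline{\int_a^b} f = \underline{\int_a^b} f$. □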
8.3 Theorem. For every continuous function $f : \langle a, b\rangle \to \mathbb{R}$ the Riemann integral
$\int_a^b f$ exists. In fact, more strongly, for every sequence $D_n$ of partitions of $\langle a, b\rangle$
whose mesh approaches 0 with $n \to \infty$, we have
$$\lim_{n \to \infty} s(f, D_n) = \lim_{n \to \infty} S(f, D_n) = \int_a^b f.$$
Proof. Let $\varepsilon > 0$. By 3.5.1, $f$ is uniformly continuous. Hence there exists a $\delta > 0$
such that
$$|x - y| < \delta \Rightarrow |f(x) - f(y)| < \frac{\varepsilon}{b - a}.$$
Choose a partition $D : a = t_0 < t_1 < \cdots < t_n = b$ such that $t_j - t_{j-1} < \delta$ for
all $j = 1, \dots, n$. Then $M_j - m_j = \sup\{f(x) \mid x \in \langle t_{j-1}, t_j\rangle\} - \inf\{f(y) \mid y \in
\langle t_{j-1}, t_j\rangle\} \le \sup\{|f(x) - f(y)| \mid x, y \in \langle t_{j-1}, t_j\rangle\} \le \frac{\varepsilon}{b - a}$
and hence
$$S(f, D) - s(f, D) = \sum_j (M_j - m_j)(t_j - t_{j-1}) \le \frac{\varepsilon}{b - a} \sum_j (t_j - t_{j-1}) = \frac{\varepsilon}{b - a} (b - a) = \varepsilon.$$ □
8.4 Theorem. (The Integral Mean Value Theorem) Let $f$ be a continuous function
on $\langle a, b\rangle$, $M = \max\{f(x) \mid x \in \langle a, b\rangle\}$ and $m = \min\{f(x) \mid x \in \langle a, b\rangle\}$ (they
exist by 3.4). Then there exists a $c \in \langle a, b\rangle$ such that
$$\int_a^b f(x)\,dx = f(c)(b - a).$$
Proof. From the definition one immediately obtains that
$$m(b - a) \le \int_a^b f(x)\,dx \le M(b - a).$$
Thus there is a $K$, $m \le K \le M$, such that $\int_a^b f(x)\,dx = K(b - a)$. By 3.3, there
exists a $c$ such that $K = f(c)$. □
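8.5 Proposition. Let $a < b < c$ and let $f$ be a bounded function defined on $\langle a, c\rangle$.
Then
$$\underline{\int_a^b} f + \underline{\int_b^c} f = \underline{\int_a^c} f \quad \text{and} \quad \overline{\int_a^b} f + \overline{\int_b^c} f = \overline{\int_a^c} f.$$
Proof. Denote by $\mathcal{D}(u, v)$ the set of all partitions of $\langle u, v\rangle$. For $D_1 \in \mathcal{D}(a, b)$
and $D_2 \in \mathcal{D}(b, c)$ define $D_1 + D_2 \in \mathcal{D}(a, c)$ as the union of the two sequences.
Obviously
$$s(f, D_1 + D_2) = s(f, D_1) + s(f, D_2).$$
We have
$$\begin{aligned}
\underline{\int_a^b} f + \underline{\int_b^c} f &= \sup_{D_1 \in \mathcal{D}(a,b)} s(f, D_1) + \sup_{D_2 \in \mathcal{D}(b,c)} s(f, D_2) \\
&= \sup\{s(f, D_1) + s(f, D_2) \mid D_1 \in \mathcal{D}(a, b),\ D_2 \in \mathcal{D}(b, c)\} \\
&= \sup\{s(f, D_1 + D_2) \mid D_1 \in \mathcal{D}(a, b),\ D_2 \in \mathcal{D}(b, c)\} \\
&= \sup\{s(f, D) \mid D \in \mathcal{D}(a, c)\} = \underline{\int_a^c} f,
\end{aligned}$$
the penultimate equality because each $D \in \mathcal{D}(a, c)$ can be refined to a partition of the form $D_1 + D_2$
by adding the point $b$. The upper integrals are treated in the same way. □
8.5.1 Convention
For $b < a$ we will write formally $\int_a^b f$ for $-\int_b^a f$, and we set $\int_a^a f = 0$. Then we have, for any $a, b, c$,
$$\int_a^b f + \int_b^c f = \int_a^c f.$$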
8.6 Theorem. (The Fundamental Theorem of Calculus) Let $f$ be continuous on
$\langle a, b\rangle$. For $x \in \langle a, b\rangle$ set
$$F(x) = \int_a^x f(t)\,dt.$$
Then we have $F'(x) = f(x)$ for all $x \in (a, b)$.
Proof. Let $h \ne 0$. By 8.5 and 8.4 we have
$$F(x + h) - F(x) = \int_a^{x+h} f - \int_a^x f = \int_x^{x+h} f = f(x + \theta h)\, h$$
with some $\theta \in \langle 0, 1\rangle$. Thus,
$$\frac{1}{h}(F(x + h) - F(x)) = f(x + \theta h)$$
and, as $f$ is continuous, $\lim_{h \to 0} f(x + \theta h) = f(x)$. □
8.6.1 Corollary. If $f$ and $G$ are continuous on $\langle a, b\rangle$ and if $G' = f$ in $(a, b)$ then
$$\int_a^b f(x)\,dx = G(b) - G(a).$$
(By 4.4.4, $\int_a^x f(t)\,dt - G(x)$ is constant, say equal to $C$. Thus, $\int_a^b f = \int_a^b f - \int_a^a f = G(b) + C -
(G(a) + C) = G(b) - G(a)$.)
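Corollary 8.6.1 can be checked numerically on a simple example. The sketch below approximates the integral of $\cos$ over $\langle 0, 2\rangle$ by a Riemann sum with midpoint tags, which lies between the lower and upper sums of the same partition, and compares it with $G(b) - G(a)$ for $G = \sin$. The midpoint rule, the helper name `midpoint_integral` and the number of subintervals are assumptions chosen for illustration.

```python
import math

def midpoint_integral(f, a, b, n=2000):
    """Riemann sum of f over a uniform partition of [a, b] with n pieces,
    evaluating f at the midpoint of each piece."""
    h = (b - a) / n
    return h * sum(f(a + (j + 0.5) * h) for j in range(n))

approx = midpoint_integral(math.cos, 0.0, 2.0)
exact = math.sin(2.0) - math.sin(0.0)      # G(b) - G(a) with G = sin, G' = cos
print(approx, exact)
```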
9 Exercises
(1) Assuming the Fundamental Theorem of Algebra, prove that every non-zero
polynomial with coefficients in $\mathbb{R}$ is a product of polynomials with coefficients
in $\mathbb{R}$ each of which has degree $\le 2$. [Hint: Use 1.4.4.]
(2) Prove that the set $\mathbb{R}$ of all real numbers is not countable (we say it is
uncountable). [Hint: Prove that the numbers $\sum_{k=0}^{\infty} a_k 2^{-k}$ are all well-defined
and different for all choices $a_k \in \{0, 1\}$. If there were a sequence $\sum_{k=0}^{\infty} a_{k,n} 2^{-k}$,
$n \in \mathbb{N}$, of all these numbers, then the number $\sum_{k=0}^{\infty} (1 - a_{k,k}) 2^{-k}$ would be different
from all of them - a contradiction.]
(3) (a) Prove directly that the function $e^x = \sum_{n=0}^{\infty} \frac{1}{n!} x^n$ satisfies $e^x e^y = e^{x+y}$.
[Hint: Use Corollary 6.3.2.]
(b) Prove that $e^x \ne 0$ for any $x \in \mathbb{R}$. [Hint: Use (a).]
(c) Prove that $e^x$ is a continuous function on $\mathbb{R}$ which takes on only positive
values. [Hint: Use Theorem 7.2.2, Theorem 5.2 and Theorem 3.3.]
(4) Using the definition from Exercise (3), prove that $(e^x)' = e^x$. [Hint: Corollary
6.4.1 is relevant.]
(5) (a) Prove that $e^x$ is an increasing function on $\mathbb{R}$. [Hint: Use Exercises (3)
and (4).]
(b) Prove that $\lim_{x \to -\infty} e^x = 0$, $\lim_{x \to \infty} e^x = \infty$. [Hint: Use (a) and Exercise (3).]
(6) (a) Prove that there exists a function $\ln(x) : \{x \in \mathbb{R} \mid x > 0\} \to \mathbb{R}$ inverse to
$e^x$. [Hint: Use Exercise (5) (b).]
(b) Prove that $(\ln(x))' = 1/x$. [Hint: This follows from the chain rule; a direct
proof can also be given using Theorem 4.3.]
(7) For $a \in \mathbb{R}$, $x > 0$, define $x^a = e^{a \ln(x)}$. Using the chain rule, prove that
$(x^a)' = a x^{a-1}$.
(8) Define functions $\sin(x)$, $\cos(x)$ by $\sin(x) = \sum_{n=0}^{\infty} (-1)^n x^{2n+1}/(2n + 1)!$,
$\cos(x) = \sum_{n=0}^{\infty} (-1)^n x^{2n}/(2n)!$.
(a) Prove that $\cos(-x) = \cos(x)$, $\sin(-x) = -\sin(x)$ (i.e. $\cos(x)$ is even and
$\sin(x)$ is odd).
(b) Prove that $(\sin(x))' = \cos(x)$, $(\cos(x))' = -\sin(x)$. [Hint: Corollary
6.4.1 is relevant.]
(9) Prove that there exists a minimum number $a > 0$ such that $\cos(a) = 0$.
This number $a$ is called $\pi/2$. Prove that $\cos(x)$ is decreasing in the interval
$(0, \pi/2)$. [Hint: By Exercise (8), we have $\cos''(x) = -\cos(x)$, while
$\cos(0) = 1$, $(\cos(0))' = 0$. This means that $(\cos(x))'$ is negative in some
interval $(0, \varepsilon)$, $\varepsilon > 0$, and $(\cos(x))'$ is decreasing on any interval $(0, a)$ on
which $\cos(x) > 0$. Let $\cos'(\varepsilon/2) = -b$, $\cos(\varepsilon/2) = c$, $b, c > 0$. Then
$\cos(\varepsilon/2 + t) \le c - bt$ if $\varepsilon/2 < \varepsilon/2 + t < a$ ($a$ as above). From this, it follows
we cannot have $a - \varepsilon/2 > c/b$.]
(10) (a) Prove that $\cos(x \pm y) = \cos(x)\cos(y) \mp \sin(x)\sin(y)$, $\sin(x \pm y) =
\sin(x)\cos(y) \pm \cos(x)\sin(y)$. [Hint: analogous to Exercise (3).]
(b) Prove that $\sin(\pi/2) = 1$, $\sin(x)$ is increasing on the interval $(0, \pi/2)$, and
$\cos(\pi/2 - x) = \sin(x)$. [Hint: Let $\sin(\pi/2) = a$. Apply (a) to show that
$\cos(\pi/2 - x) = a \sin(x)$, $\sin(\pi/2 - x) = a \cos(x)$, and therefore $a^2 = 1$.
Observe that we must then have $a = 1$ because $\sin(x)$ is increasing on the
interval $(0, \pi/2)$ by Exercise (8).]
(11) Prove that $\cos(x)$ and $\sin(x)$ are both periodic with period $2\pi$, their values
(on $x$ real) are between $-1$ and $1$, and describe their maxima and minima, and
intervals on which they are decreasing resp. increasing.
[Hint: Use Exercise (10) and the fact that $\cos(x)$ is even to prove that $\cos(x +
\pi) = -\cos(x)$, etc.]
(12) Now consider the definition of $e^x$ from Exercise (3) for a complex number $x$.
(a) Prove that $e^x$ is well-defined (i.e. the series converges) for all $x \in \mathbb{C}$, and
that $e^x e^y = e^{x+y}$ for $x, y \in \mathbb{C}$. [Interpret this as separate statements
about the real and imaginary parts.]
(b) Prove that for a complex number $\alpha$, the functions $\mathrm{Re}(e^{\alpha x})$, $\mathrm{Im}(e^{\alpha x})$ are
continuous and differentiable in the real variable $x$, and that $(e^{\alpha x})' = \alpha e^{\alpha x}$
[this is, again, to be interpreted as equalities of the real and imaginary
parts].
(c) Prove the equalities $\cos(x) = \frac{e^{ix} + e^{-ix}}{2}$, $\sin(x) = \frac{e^{ix} - e^{-ix}}{2i}$ for $x \in \mathbb{R}$.
[Remark: The attentive reader surely noticed that something is missing here;
we should learn how to differentiate with respect to a complex variable $x$.
(!) However, we will have to build up a lot more foundations, and wait until
Chapter 10 below, to understand that rigorously.]
(13) Let $f(x) = e^{-1/x}$ for $x > 0$, $f(x) = 0$ for $x \le 0$. Prove that $f^{(n)}(0) = 0$ for
all $n \ge 1$.
2 Metric and Topological Spaces I
A key to rigorous multivariable calculus is a basic understanding of point set

topology in the framework of metric spaces. Covering these basic concepts is the
purpose of this chapter. We will see that studying these concepts in detail will really
pay off in the chapters below. While studying metric spaces, we will discover certain
concepts which are independent of metric, and seem to beg for a more general
context. This is why, in the process, we will introduce topological spaces as well.
1 Basics
1.1
Let $\mathbb{R}_+$ denote the set of all non-negative real numbers together with $+\infty$. A metric space
is a set $X$ endowed with a metric (or distance function, briefly distance) $d : X \times
X \to \mathbb{R}_+$ such that
(M1) $d(x, y) = 0$ if and only if $x = y$,
(M2) $d(x, y) = d(y, x)$, and
(M3) $d(x, y) + d(y, z) \ge d(x, z)$.
Condition (M3) is called the triangle inequality; the reader will easily guess why.
The elements of a metric space are usually referred to as points.
Very often one considers distance functions which take on finite values only, but
allowing infinite distances comes in handy sometimes.
1.1.1 Examples
(a) The set $\mathbb{R}$ of real numbers with the distance function $d(x, y) = |x - y|$.
(b) The set (plane) $\mathbb{C}$ of complex numbers, again with the distance $|x - y|$; note,
however, that here the fact that it satisfies the triangle inequality is much less
trivial than in the previous case (see Theorem 1.3 of Chapter 1).
(c) The Euclidean space $\mathbb{R}_m = \{(x_1, \dots, x_m) \mid x_j \in \mathbb{R}\}$ with
$$d((x_1, \dots, x_m), (y_1, \dots, y_m)) = \sqrt{\sum_j (x_j - y_j)^2}.$$
Comment: In linear algebra, there are good reasons for distinguishing row and
column vectors, and equally good reasons why the ordinary Euclidean space
$\mathbb{R}^n$ should consist of column vectors. This is the reason why we used the
subscript notation $\mathbb{R}_n$ above for row vectors, which are easier to write down (compare
with A.7.3). From the point of view of metric and topological spaces, however,
the distinction between row and column vectors has no meaning. Because of
that, in this chapter, we will use the symbols $\mathbb{R}^n$ and $\mathbb{R}_n$ interchangeably, not
distinguishing between row and column vectors.
(d) $C(\langle a, b\rangle)$, the set of all continuous real functions on the interval $\langle a, b\rangle$, with
$$d(f, g) = \max_x |f(x) - g(x)|.$$
(e) The set $F(X)$ of all bounded real functions on a set $X$ with
$$d(f, g) = \sup_x |f(x) - g(x)|.$$
(f) The unit circle
$$S^1 = \{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 = 1\}$$
where for two points $P, Q \in S^1$, $d(P, Q)$ is the lesser of the two angles
between the half-lines $\{tP \mid t \ge 0\}$ and $\{tQ \mid t \ge 0\}$.
(g) Any set $S$ with the metric given by $d(x, y) = 0$ if $x = y \in S$ and $d(x, y) = 1$
if $x \ne y \in S$. This is known as the discrete space.
1.2 Norms
The metrics in Examples 1.1.1 (a)–(e) in fact all come from a more special situation,
which plays an especially important role. A norm on a vector space V (over real or
complex numbers) is a mapping jj jj W V ! R such that
(1) jjxjj 0, and jjxjj D 0 only if x D o,
(2) jjx C yjj jjxjj C jjyjj, and
(3) jj˛xjj D j˛j jjxjj.
1.2.1
A normed vector space is a (real or complex) vector space $V$ provided with a norm.
(The term normed linear space is also common.) Since we have
$$\|x-z\| = \|x-y+y-z\| \le \|x-y\|+\|y-z\|,$$
the function $\rho(x,y) = \|x-y\|$ is a metric on $V$, called the metric associated with
the norm. In this sense, we can always view a normed linear space as a metric space.
1.2.2 Examples
1. Any of the following formulas yields a norm in $\mathbb{R}^n$.
(a) $\|x\| = \max_j |x_j|$,
(b) $\|x\| = \sum_j |x_j|$,
(c) $\|x\| = \sqrt{\sum_j x_j^2}$.
Notice that (c) gives the metric space in Example 1.1.1 (c).
2. In the space of bounded real functions on a set $X$ we can consider the norm
$$\|\varphi\| = \sup\{|\varphi(x)| \mid x\in X\}.$$
The associated metric gives rise to Example 1.1.1 (e) above.
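As an illustration, here is a minimal sketch of the three norms on $\mathbb{R}^n$ and of the passage from a norm to its associated metric. The function names are hypothetical, and vectors are represented as plain Python lists.

```python
import math

def norm_max(x):
    """The norm max_j |x_j|."""
    return max(abs(t) for t in x)

def norm_sum(x):
    """The norm sum_j |x_j|."""
    return sum(abs(t) for t in x)

def norm_euclid(x):
    """The norm sqrt(sum_j x_j^2)."""
    return math.sqrt(sum(t * t for t in x))

def metric_from_norm(norm):
    """The metric associated with a norm: (x, y) -> ||x - y||."""
    return lambda x, y: norm([a - b for a, b in zip(x, y)])

x = [3.0, -4.0]
y = [0.0, 0.0]
for norm in (norm_max, norm_sum, norm_euclid):
    d = metric_from_norm(norm)
    scaled = [-2.0 * t for t in x]
    # Axiom (3) with alpha = -2: ||alpha x|| = |alpha| ||x||.
    print(norm.__name__, d(x, y), norm(scaled) == 2.0 * norm(x))
```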
1.2.3 A particularly important example

Example 1.2.2 (c) is, in fact, a special case of the following construction: On a (real
or complex) vector space with an inner product (see 4.2 of Appendix A), we have a
norm
$$\|x\| = \sqrt{x\cdot x}.$$
Indeed: (1) of 1.2 is obvious. Further, by the Cauchy-Schwarz inequality (see 4.4 of
Appendix A),
$$\begin{aligned}
\|x+y\|^2 &= (x+y)\cdot(x+y) = x\cdot x + x\cdot y + y\cdot x + y\cdot y \\
&= |x\cdot x + x\cdot y + y\cdot x + y\cdot y| \le \|x\|^2 + |x\cdot y| + |y\cdot x| + \|y\|^2 \\
&\le \|x\|^2 + 2\|x\|\,\|y\| + \|y\|^2 = (\|x\|+\|y\|)^2.
\end{aligned}$$
Finally, $\|\alpha x\| = \sqrt{(\alpha x)\cdot(\alpha x)} = \sqrt{\alpha\bar{\alpha}\,(x\cdot x)} = |\alpha|\,\|x\|$. □
1.3 Convergence
A sequence $x_1, x_2, \dots$ of points of a metric space converges to a point $x$ whenever
for every $\varepsilon > 0$, there exists an $n_0$ such that for all $n\ge n_0$, we have $d(x_n,x) < \varepsilon$.
This is expressed by writing
$$\lim_{n\to\infty} x_n = x \quad\text{or}\quad \lim_n x_n = x \quad\text{or just}\quad \lim x_n = x.$$
We then speak of a convergent sequence. Note that obviously
(*) any subsequence $(x_{k_n})_n$ of a convergent sequence converges to the same limit.
1.3.1 Examples
(a) The usual convergence in R or C.
(b) Consider the examples in 1.1.1 (d) and (e). Note that the convergence of a
sequence of functions $f_1, f_2, \dots$ in these spaces is what one usually calls uniform
convergence of functions.
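To see the difference from pointwise convergence, consider the following sketch. It uses two families chosen here purely for illustration (they are not from the text): $g_n(x) = x/n$ on $\langle 0,1\rangle$, whose sup-distance from the zero function is $1/n$, and $f_n(x) = x^n$ on $\langle 0,1)$, which tends to $0$ at every point but takes the value $1/2$ somewhere for every $n$, so its sup-distance from the zero function stays at least $1/2$.

```python
def f(n, x):
    """f_n(x) = x**n, considered on the half-open interval <0, 1)."""
    return x ** n

def g(n, x):
    """g_n(x) = x / n, considered on the interval <0, 1>."""
    return x / n

def witness(n):
    """A point of <0, 1) at which f_n takes the value 1/2 (up to rounding)."""
    return 0.5 ** (1.0 / n)

for n in (1, 10, 100, 1000):
    # sup |g_n| over <0, 1> is attained at x = 1 and equals 1/n.
    # sup |f_n| over <0, 1) is at least f_n(witness(n)), which is about 1/2.
    print(n, g(n, 1.0), f(n, witness(n)))
```

So $(g_n)_n$ converges to the zero function in the metric of 1.1.1 (e), while $(f_n)_n$ does not, although $f_n(x)\to 0$ for each fixed $x\in\langle 0,1)$.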
1.4
Two metrics $d_1, d_2$ on the same set $X$ are said to be equivalent if there exist positive
real numbers $\alpha,\beta$ such that for every $x,y\in X$,
$$\alpha d_1(x,y) \le d_2(x,y) \le \beta d_1(x,y).$$
Note that we have an obvious
1.4.1 Observation. If $d_1$ and $d_2$ are equivalent then $(x_n)_n$ converges in $(X,d_1)$ if
and only if it converges in $(X,d_2)$.
1.5
Let $(X,d)$ and $(Y,d')$ be metric spaces. A map $f : X\to Y$ is said to be continuous
if

(ct) for every $x\in X$ and every $\varepsilon > 0$ there is a $\delta > 0$ such that, for every $y$ in $X$,
$$d(x,y) < \delta \;\Rightarrow\; d'(f(x),f(y)) < \varepsilon.$$

Later on we will need a stronger concept: a mapping $f : X\to Y$ is said to be
uniformly continuous if

(uct) for every $\varepsilon > 0$ there is a $\delta > 0$ such that, for all $x,y$ in $X$,
$$d(x,y) < \delta \;\Rightarrow\; d'(f(x),f(y)) < \varepsilon.$$

Note the subtle difference between the two concepts. In the former the $\delta$ can depend
on $x$, while in the latter it depends on the $\varepsilon$ only. For example,
$$f = (x\mapsto x^2) : \mathbb{R}\to\mathbb{R}$$
is continuous but not uniformly continuous.
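The failure of uniform continuity for $x\mapsto x^2$ can be made concrete. In the sketch below (the choice of points is ours), for a given $\delta$ we take $x = 1/\delta$ and $y = x+\delta/2$; then $d(x,y) < \delta$, but $|y^2-x^2| = \frac{\delta}{2}\left(\frac{2}{\delta}+\frac{\delta}{2}\right) = 1+\frac{\delta^2}{4} > 1$. Hence no $\delta$ works for $\varepsilon = 1$.

```python
def gap(delta):
    """|f(x) - f(y)| for f(t) = t**2, with x = 1/delta and y = x + delta/2."""
    x = 1.0 / delta
    y = x + delta / 2.0
    return abs(y * y - x * x)

for delta in (1.0, 0.1, 0.01, 0.001):
    # The two points are closer than delta, yet the values differ by more than 1
    # (up to floating-point rounding).
    print(delta, gap(delta))
```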
It is easy to prove
1.5.1 Proposition. A composition $g\circ f$ of continuous (resp. uniformly continuous)
maps $f$ and $g$ is continuous (resp. uniformly continuous).
1.5.2
Here is another easy but important
Observation. Let $d, d_1$ be equivalent metrics on $X$ and let $d', d_1'$ be equivalent
metrics on $Y$. Then a map $f : X\to Y$ is continuous (resp. uniformly continuous)
with respect to $d, d'$ if and only if it is continuous (resp. uniformly continuous) with
respect to $d_1, d_1'$.
1.6 Proposition. A map $f : (X,d)\to(Y,d')$ is continuous if and only if for every
convergent sequence $(x_n)_n$ in $(X,d)$, the sequence $(f(x_n))_n$ is convergent and
$$f(\lim x_n) = \lim f(x_n).$$
(Compare with Proposition 3.2 of Chapter 1.)
Proof. $\Rightarrow$: Let $\lim x_n = x$. Consider the $\delta > 0$ from (ct) taken for the $x$ and an
$\varepsilon > 0$. There is an $n_0$ such that $n\ge n_0$ implies $d(x_n,x) < \delta$. Then for $n\ge n_0$,
$d'(f(x_n),f(x)) < \varepsilon$.
$\Leftarrow$: Suppose $f$ is not continuous. Then there is an $x\in X$ and an $\varepsilon_0 > 0$
such that for every $\delta > 0$ there exists an $x(\delta)$ such that $d(x(\delta),x) < \delta$ while
$d'(f(x(\delta)),f(x)) \ge \varepsilon_0$. Now set $x_n = x(\frac{1}{n})$; obviously $\lim x_n = x$ and $(f(x_n))_n$ does not
converge to $f(x)$. □
2 Subspaces and products
2.1
Let $(X,d)$ be a metric space and let $X'\subseteq X$ be an arbitrary subset. Obviously
$(X',d')$, where $d'$ is $d$ restricted to $X'\times X'$, is a metric space again.
Examples.
(a) Intervals in the real line.
(b) More generally, the typical subspaces of the Euclidean space $\mathbb{R}^m$ one usually
works with: n-dimensional intervals (by which we mean cartesian products of
n-tuples of intervals), polyhedra, balls, spheres, etc.
(c) The space $C(\langle a,b\rangle)$ from 1.1.1 (d) is a subspace of the space $F(\langle a,b\rangle)$ from 1.1.1 (e).
Convention. Unless otherwise stated we will think of subsets of spaces automatically
as subspaces.
2.1.1 Observations. 1. Let $(X',d')$ be a subspace of $(X,d)$. Then the embedding
map $j = (x\mapsto x) : X'\to X$ is uniformly continuous. Consequently,
a restriction $f|X' : (X',d')\to(Y,d)$ of a continuous (resp. uniformly
continuous) $f : (X,d)\to(Y,d)$ is continuous (resp. uniformly continuous).
2. Let $f : (X,d)\to(Y,d)$ be a continuous (resp. uniformly continuous) map and
let $Y'\subseteq Y$ be a subspace such that $f[X]\subseteq Y'$. Then $f' = (x\mapsto f(x)) :
(X,d)\to(Y',d)$ is continuous (resp. uniformly continuous).
Proof. 1. For $\varepsilon > 0$ take $\delta = \varepsilon$. For the consequence recall 1.5.1 and the fact that
$f|X' = fj$.
2. For $x$ and $\varepsilon > 0$ use the same $\delta$ as for $f$. □
2.2
Let $(X_i,d_i)$, $i = 1,\dots,m$, be metric spaces. On the cartesian product
$\prod_{i=1}^m X_i = X_1\times\cdots\times X_m$ consider the following distances:
$$\rho((x_1,\dots,x_m),(y_1,\dots,y_m)) = \sqrt{\sum_{i=1}^m d_i(x_i,y_i)^2},$$
$$\sigma((x_1,\dots,x_m),(y_1,\dots,y_m)) = \sum_{i=1}^m d_i(x_i,y_i), \quad\text{and}$$
$$d((x_1,\dots,x_m),(y_1,\dots,y_m)) = \max_i d_i(x_i,y_i).$$
($\sigma$ and $d$ satisfy (M1), (M2) and (M3) obviously. The triangle inequality of
$\rho$ needs some simple reasoning; one can use, for instance, Theorem 4.4 from
Appendix A. In fact, we will rarely use this metric in the context of the topology
of multivariable functions. However, note its geometrical significance: it yields the
standard Pythagorean metric in the space $\mathbb{R}^m$ viewed as $\mathbb{R}\times\cdots\times\mathbb{R}$.)
2.2.1 Proposition. The distance functions $\rho$, $\sigma$ and $d$ are equivalent metrics.
Proof.
$$\rho((x_i)_i,(y_i)_i) \le \sqrt{\sum_{i=1}^m \max_j d_j(x_j,y_j)^2} = \sqrt{m}\; d((x_i)_i,(y_i)_i).$$
Obviously $d((x_i)_i,(y_i)_i) \le \rho((x_i)_i,(y_i)_i)$ and $d((x_i)_i,(y_i)_i) \le \sigma((x_i)_i,(y_i)_i)$, and finally
$$\sigma((x_i)_i,(y_i)_i) \le \sum_{i=1}^m \max_j d_j(x_j,y_j) = m\; d((x_i)_i,(y_i)_i).$$
2.2.2
The space $\prod X_i$ endowed with any of the metrics $\rho$, $\sigma$, $d$ (typically, with $d$) will be
referred to as the product of the spaces $(X_i,d_i)$, $i = 1,\dots,m$.
Theorem. 1. The projections $p_j = ((x_1,\dots,x_m)\mapsto x_j) : \prod_i (X_i,d_i)\to(X_j,d_j)$
are uniformly continuous.
2. A sequence
$$(x_1^1,\dots,x_m^1),\ (x_1^2,\dots,x_m^2),\ (x_1^3,\dots,x_m^3),\ \dots \qquad (*)$$
converges in $\prod (X_i,d_i)$ if and only if each of the sequences
$$x_j^1,\ x_j^2,\ x_j^3,\ \dots \qquad (**)$$
converges in the respective $(X_j,d_j)$.
3. Let $f_j : (Y,d')\to(X_j,d_j)$ be continuous (resp. uniformly continuous). Then
the mapping
$$f = (y\mapsto(f_1(y),\dots,f_m(y))) : (Y,d')\to\prod (X_i,d_i)$$
(the unique mapping such that $p_j f = f_j$ for all $j$) is continuous (resp. uniformly
continuous).
Proof. 1. We have $d((x_i)_i,(y_i)_i) \ge d_j(x_j,y_j)$. Thus, it suffices to take $\delta = \varepsilon$.
2. If (*) converges then each (**) converges by 1 and 1.6. Conversely, let each (**)
converge to some $x_j$. For $\varepsilon > 0$ choose $n_j$
such that for $k\ge n_j$, $d_j(x_j^k,x_j) < \varepsilon$, and consider $n_0 = \max_j n_j$. Then for
$k\ge n_0$, $d_j(x_j^k,x_j) < \varepsilon$ for all $j$, and hence $\max_j d_j(x_j^k,x_j) < \varepsilon$.
3. immediately follows from 2 and 1.6.
3 Some topological concepts
3.1 Neighborhoods
First, define the $\varepsilon$-ball with center $x$ as
$$\Omega(x,\varepsilon) = \{y \mid d(x,y) < \varepsilon\}.$$
A subset $U\subseteq X$ is a neighborhood of a point $x\in X$ if there exists an $\varepsilon > 0$ such
that
$$\Omega(x,\varepsilon)\subseteq U.$$
Remark: While the concept of an $\varepsilon$-ball depends on the concrete metric, the
concept of neighborhood does not change if we replace a metric by an equivalent
one. In fact, we can change the metric even much more radically; see Exercise (5)
below.
3.1.1 Observations. 1. If $U$ is a neighborhood of $x$ and $U\subseteq V$ then $V$ is a
neighborhood of $x$.
2. If $U_1, U_2$ are neighborhoods of $x$ then so is $U_1\cap U_2$.
(1: for $V$ use the same $\Omega(x,\varepsilon)$. 2: if $\Omega(x,\varepsilon_i)\subseteq U_i$ then
$\Omega(x,\min(\varepsilon_1,\varepsilon_2))\subseteq U_1\cap U_2$.)
3.2 Open and closed sets
A subset $U\subseteq(X,d)$ is open if it is a neighborhood of each of its points.
A subset $A\subseteq(X,d)$ is closed if for every sequence $(x_n)_n$, $x_n\in A$, convergent in
$(X,d)$, the limit $\lim x_n$ is in $A$.
3.2.1 Proposition. 1. $X$ and $\emptyset$ are open. If $U$ and $V$ are open then $U\cap V$ is open,
and if $U_i$, $i\in J$, are open ($J$ arbitrary) then $\bigcup_{i\in J} U_i$ is open.
2. $U$ is open if and only if $X\setminus U$ is closed.
3. $X$ and $\emptyset$ are closed. If $A$ and $B$ are closed then $A\cup B$ is closed, and if $A_i$, $i\in J$,
are closed then $\bigcap_{i\in J} A_i$ is closed.
Proof. 1 is straightforward (use 3.1.1).
2: Let $U$ be open, $A = X\setminus U$. The limit $x$ of a sequence $(x_n)_n$ that is all in $A$
cannot be in $U$: otherwise there would be an $\varepsilon > 0$ such that $\Omega(x,\varepsilon)\subseteq U$, and the $x_n$'s with
sufficiently large $n$ would have to be in such $\Omega(x,\varepsilon)$.
On the other hand, if $U$ is not open, then there is an $x\in U$ such that for every $n$,
$\Omega(x,\frac{1}{n})\not\subseteq U$. Therefore, we can choose points $x_n\in\Omega(x,\frac{1}{n})\cap A$ with $x = \lim x_n\in
U = X\setminus A$.
3 follows from 1, 2 and the formulas relating intersections and unions with
complements. □
3.3 Closure
Let $A$ be a general subset of a metric space $X = (X,d)$. For a point $x\in X$, define
the distance of $x$ from $A$ by
$$d(x,A) = \inf\{d(x,a) \mid a\in A\}.$$
Note that if $x\in A$ then $d(x,A) = 0$ but $d(x,A)$ can be $0$ even if $x\notin A$.
The closure of a set $A$ in $(X,d)$ is the set
$$\overline{A} = \{x \mid d(x,A) = 0\}.$$
This definition seems to depend heavily on the distance function. But we have
3.3.1 Proposition. 1. The set $\overline{A}$ is closed, and it is the smallest closed set
containing $A$. In other words,
$$\overline{A} = \bigcap\{B \text{ closed} \mid A\subseteq B\}.$$
2. A point $x\in X$ is in $\overline{A}$ if and only if for each of its neighborhoods $U$, $U\cap A\neq\emptyset$
(in other words, if and only if for each open $U\ni x$, $U\cap A\neq\emptyset$).
Proof. 1: $U = X\setminus\overline{A}$ is open, since if $x\notin\overline{A}$ there is an $\varepsilon > 0$ such that
$\Omega(x,2\varepsilon)\cap A = \emptyset$ and hence by the triangle inequality $\Omega(x,\varepsilon)\cap\overline{A} = \emptyset$.
Let $B$ be closed and $B\supseteq A$. Let $x\in\overline{A}$. For each $n$ choose an $x_n\in A$ (and
hence in $B$) such that $d(x,x_n) < \frac{1}{n}$. Then $x = \lim x_n$ is in $B$. The correctness of
the formula follows from 3.2.1.
2 is obvious: in yet other words we are speaking about the balls $\Omega(x,\varepsilon)$
intersecting $A$. □
3.3.2 Proposition. 1. $\overline{\emptyset} = \emptyset$, $A\subseteq\overline{A}$, and $A\subseteq B \Rightarrow \overline{A}\subseteq\overline{B}$,
2. $\overline{A\cup B} = \overline{A}\cup\overline{B}$, and
3. $\overline{\overline{A}} = \overline{A}$.
Proof. 1 is trivial.
2: By 1, $\overline{A\cup B}\supseteq\overline{A}\cup\overline{B}$. Now let $x\in\overline{A\cup B}$; $x$ is or is not in $\overline{A}$. In the latter
case, all sufficiently close elements from $A\cup B$ have to be in $B$ and hence $x\in\overline{B}$.
3: By 3.3.1 1, $\overline{A}$ is closed and since it contains $B = \overline{A}$, it also contains $\overline{B} = \overline{\overline{A}}$. □
We also define the interior $\mathrm{Int}(A) = X\setminus\overline{X\setminus A}$. The interior of $A$ is also
denoted by $A^\circ$. It immediately follows from Proposition 3.3.1 that the interior is
the union of all open sets contained in $A$. The boundary of $A$ is defined as $\partial A =
\overline{A}\setminus\mathrm{Int}(A)$.
3.4
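Continuity can be expressed in terms of the concepts introduced in this section. We
have
Theorem. The following statements on a mapping $f : (X,d)\to(Y,d')$ are
equivalent.
(1) $f$ is continuous.
(2) For every $x\in X$ and every neighborhood $V$ of $f(x)$ there is a neighborhood $U$
of $x$ such that $f[U]\subseteq V$.
(3) For every $U$ open in $(Y,d')$ the preimage $f^{-1}[U]$ is open in $(X,d)$.
(4) For every $A$ closed in $(Y,d')$ the preimage $f^{-1}[A]$ is closed in $(X,d)$.
(5) For every subset $A\subseteq X$,
$$f[\overline{A}]\subseteq\overline{f[A]}.$$
(6) For every subset $B\subseteq Y$,
$$\overline{f^{-1}[B]}\subseteq f^{-1}[\overline{B}].$$
Proof. (1)$\Rightarrow$(2): Let $V$ be a neighborhood of $f(x)$ with $\Omega(f(x),\varepsilon)\subseteq V$. Choose
a $\delta > 0$ as in (ct) for $x$ and $\varepsilon$. Then $f[\Omega(x,\delta)]\subseteq\Omega(f(x),\varepsilon)$, and $\Omega(x,\delta)$ is a
neighborhood of $x$.
(2)$\Rightarrow$(3): If $U\subseteq Y$ is open and $x\in f^{-1}[U]$ then $f(x)\in U$ and $U$ is a
neighborhood of $f(x)$. Hence there is a neighborhood $V$ of $x$ such that $f[V]\subseteq U$ and we
have $x\in V\subseteq f^{-1}[U]$, making $f^{-1}[U]$ a neighborhood of $x$.
(3)$\Leftrightarrow$(4) by 3.2.1 2, since $f^{-1}[-]$ preserves complements.
(4)$\Rightarrow$(5): We have
$$A\subseteq f^{-1}[f[A]]\subseteq f^{-1}[\overline{f[A]}].$$
Since $f^{-1}[\overline{f[A]}]$ is closed, we have by 3.3.1 $\overline{A}\subseteq f^{-1}[\overline{f[A]}]$ and the statement
follows.
(5)$\Rightarrow$(6): We have, by (5), $f[\overline{f^{-1}[B]}]\subseteq\overline{f[f^{-1}[B]]}\subseteq\overline{B}$ and hence $\overline{f^{-1}[B]}\subseteq
f^{-1}[\overline{B}]$.
(6)$\Rightarrow$(1): Let $x\in X$ and $\varepsilon > 0$, and set $B = Y\setminus\Omega(f(x),\varepsilon)$. The set $B$ is closed,
so $\overline{B} = B$. Since $f(x)\in\Omega(f(x),\varepsilon)$, we have $f(x)\notin\overline{B}$ and hence $x\notin
f^{-1}[\overline{B}]$. Hence, by (6), $x\notin\overline{f^{-1}[B]}$ and there is a $\delta > 0$
such that $\Omega(x,\delta)\cap f^{-1}[B] = \emptyset$. Thus if $d(x,y) < \delta$ then $f(y)\notin B$, that is,
$f(y)\in\Omega(f(x),\varepsilon)$. □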
3.5
A continuous mapping $f : (X,d)\to(Y,d')$ is called a homeomorphism if there is
a continuous mapping $g : (Y,d')\to(X,d)$ such that
$$fg = \mathrm{id}_Y \quad\text{and}\quad gf = \mathrm{id}_X.$$
If there exists a homeomorphism $f : (X,d)\to(Y,d')$ we say that the spaces
$(X,d)$ and $(Y,d')$ are homeomorphic.
Note that if $d$ and $d'$ are equivalent metrics then the identity map $\mathrm{id}_X : (X,d)\to
(X,d')$ is a homeomorphism. But $\mathrm{id}_X : (X,d)\to(X,d')$ can be a homeomorphism
even when $d$ and $d'$ are far from being equivalent (consider, e.g., the interval $\langle 0,\frac{\pi}{2})$
with the standard metric $d$ and with $d'(x,y) = |\tan x-\tan y|$).
A property of a space or a concept related to spaces is said to be topological
if it is preserved under all homeomorphisms. For example, by Theorem 3.4, for a
set to be a neighborhood of a point, or to be open resp. closed, or the closure, are
topological concepts. By 1.6, convergence is a topological concept.
Continuity is a topological concept, but uniform continuity is not.
This suggests the possibility of formulating a notion of a space based only on
topological properties. We will explore this in the next section.
4 First remarks on topology
Very often, a choice of metric is not really important. We may be interested just
in continuity, and a concrete choice of metric may be somehow off the point. For
example, note that the "natural" Pythagorean metric would have been a real burden
in dealing with the product. Sometimes it even happens that one has a natural notion
of continuity, or convergence, without having a metric defined first. It may even
happen that there is no reasonable way to define a metric.
This leads to a more general notion of a space, called a topological space. The
idea is to describe the structure of interest simply in distinguishing whether a subset
$U\subseteq X$ containing $x$ "surrounds" (is a neighborhood of) $x$, or declaring some
subsets open resp. closed, or specifying an operator of closure. We will present here
three variants of the definition, which turn out to be equivalent.
4.1
We will start with the neighborhood approach, which was historically the first one
(introduced by Hausdorff in 1914). It is convenient to denote by $\mathcal{P}(X)$ the power
set of $X$, which means the set of all subsets of $X$ (including the empty set and $X$).
With every $x\in X$, one associates a set $\mathcal{U}(x)\subseteq\mathcal{P}(X)$, called the system of the
neighborhoods of $x$, satisfying the following axioms:
(1) For each $U\in\mathcal{U}(x)$, $x\in U$,
(2) If $U\in\mathcal{U}(x)$ and $U\subseteq V\subseteq X$ then $V\in\mathcal{U}(x)$,
(3) If $U,V\in\mathcal{U}(x)$ then $U\cap V\in\mathcal{U}(x)$, and
(4) For every $U\in\mathcal{U}(x)$ there is a $V\in\mathcal{U}(x)$ such that $U\in\mathcal{U}(y)$ for every $y\in V$.
One then defines a (possibly empty) subset $U$ of $X$ to be open if $U$ is a neighborhood
of each of its points. One defines a subset $A$ of $X$ to be closed if the complement
$X\setminus A$ of $A$ is open. The closure of a subset $S$ of $X$ is defined by the formula
$\overline{S} = \{x \mid \forall U\in\mathcal{U}(x),\ U\cap S\neq\emptyset\}$.
4.2
Nowadays probably the most common approach to the structure of topology is to
define open sets first as a set of subsets of $X$ satisfying certain axioms. It may be
perhaps less intuitive, but it turns out to be much simpler technically.
In this approach, a topology on a set $X$ is a subset $\tau\subseteq\mathcal{P}(X)$ satisfying
(1) $\emptyset, X\in\tau$,
(2) $U,V\in\tau \Rightarrow U\cap V\in\tau$,
(3) $U_i\in\tau,\ i\in J \Rightarrow \bigcup_{i\in J} U_i\in\tau$.
In other words, we may simply say that a topology is a subset of the set $\mathcal{P}(X)$
of all subsets of $X$ which is closed under all unions and all finite intersections.
(To include (1), we allow the union of an empty set of subsets of $X$, which is said
to be $\emptyset$, and the intersection of an empty set of subsets of $X$, which is said to be $X$.)
One then defines a closed set as a complement of an open set; $U$ is a
neighborhood of $x$ if there is an open $V$ such that $x\in V\subseteq U$, and the closure
is defined by the formula
$$\overline{A} = \bigcap\{B \mid A\subseteq B,\ B \text{ closed}\}.$$
A subset $A\subseteq X$ is called dense if $\overline{A} = X$.
Remark: It is possible to start equivalently with closed sets first and then define
open sets as their complements; the axioms of closed sets are obtained by expressing
the axioms for open sets in terms of their complements (see Exercise (9)).
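On a finite set the axioms can be checked mechanically. The following sketch (with hypothetical function names) represents a candidate topology as a collection of Python sets; since the family is finite, closure under pairwise unions already gives closure under all unions. It also computes the closure of a subset as the intersection of all closed sets containing it.

```python
from itertools import combinations

def is_topology(X, tau):
    """Check the open-set axioms for a family tau of subsets of a finite set X."""
    X = frozenset(X)
    opens = {frozenset(U) for U in tau}
    if frozenset() not in opens or X not in opens:
        return False
    # For a finite family, pairwise intersections and unions suffice.
    for U, V in combinations(opens, 2):
        if U & V not in opens or U | V not in opens:
            return False
    return True

def closure(X, tau, A):
    """Intersection of all closed sets (complements of open sets) containing A."""
    X = frozenset(X)
    A = frozenset(A)
    result = X
    for U in tau:
        B = X - frozenset(U)  # a closed set
        if A <= B:
            result &= B
    return result

X = {1, 2, 3}
tau = [set(), {1}, {1, 2}, {1, 2, 3}]
print(is_topology(X, tau))                       # the axioms hold
print(is_topology(X, [set(), {1}, {2}, X]))      # {1} | {2} is missing
print(sorted(closure(X, tau, {2})))
print(closure(X, tau, {1}) == frozenset(X))      # {1} is dense here
```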
4.3
Or, one can start with a closure operator $u : \mathcal{P}(X)\to\mathcal{P}(X)$ satisfying
(1) $u(\emptyset) = \emptyset$ and $A\subseteq u(A)$,
(2) $u(A\cup B) = u(A)\cup u(B)$ and
(3) $u(u(A)) = u(A)$.
$A$ is declared closed if $u(A) = A$, the open sets are complements of the closed ones,
and $U$ is a neighborhood of $x$ if $x\notin u(X\setminus U)$.
4.4
In fact one usually thinks of a topological space as a set endowed with all the
above mentioned notions simultaneously, and the only question is which of them
one considers primitive concepts and which are defined afterwards. The resulting
structure is the same. (See the Exercises.)
4.5
A topology is not always obtained from a metric (if it is we speak of a metrizable
space). Here are two rather easy examples.
(a) Take an infinite set $X$ and declare $U\subseteq X$ to be open if either it is void or if
$X\setminus U$ is finite.
(b) Take a partially ordered set $(X,\le)$ and declare $U$ to be open if $U =
\{x \mid \exists y\in U,\ x\ge y\}$. (Note: this topology is metrizable for certain special
choices of partial orderings, but certainly not in general.)
Non-metrizable spaces of importance are of course seldom defined as easily as
this. But it should be noted that many non-metrizable spaces are of interest today.
4.6
A mapping $f : X\to Y$ between topological spaces is continuous if for every
$x\in X$ and every neighborhood $V$ of $f(x)$ there is a neighborhood $U$ of $x$ such
that $f[U]\subseteq V$ (cf. (2) in Theorem 3.4). If we replace in 3.4 the metric definition of
continuity (1) with the definition we just made, we have the following more general
result:
Theorem. Let $X, Y$ be topological spaces. Then the following statements on a
mapping $f : X\to Y$ are equivalent.
(1) $f$ is continuous.
(2) For every $U$ open in $Y$ the preimage $f^{-1}[U]$ is open in $X$.
(3) For every $A$ closed in $Y$ the preimage $f^{-1}[A]$ is closed in $X$.
(4) For every subset $A\subseteq X$,
$$f[\overline{A}]\subseteq\overline{f[A]}.$$
(5) For every subset $B\subseteq Y$,
$$\overline{f^{-1}[B]}\subseteq f^{-1}[\overline{B}].$$
Proof. Most of the implications can be proved by the same reasoning as in 3.4. The
only one needing a simple adjustment is
(5)$\Rightarrow$(1): Let (5) hold and let $V$ be a neighborhood of $f(x)$. Thus, $f(x)\notin
\overline{Y\setminus V}$, that is, $x\notin f^{-1}[\overline{Y\setminus V}]$, and by (5) $x\notin\overline{f^{-1}[Y\setminus V]}$. Hence, $U = X\setminus f^{-1}[Y\setminus V] = f^{-1}[V]$ is
a neighborhood of $x$, and $f[U] = f f^{-1}[V]\subseteq V$. □
4.7
The system of open sets constituting a topology $\tau$ is often determined by a so-called
basis, which means a subset $\mathcal{B}\subseteq\tau$ such that
$$B_1, B_2\in\mathcal{B} \Rightarrow B_1\cap B_2\in\mathcal{B} \quad\text{and}$$
$$\text{for every } U\in\tau,\quad U = \bigcup\{B \mid B\in\mathcal{B},\ B\subseteq U\}.$$
(For example, the set of all open intervals, or the set of all open intervals with
rational endpoints are bases of the standard topology of the real line $\mathbb{R}$.)
One may wish to define a topological space where some particular subsets
are open, thus specifying a subset $\mathcal{S}\subseteq\mathcal{P}(X)$ of such sets without any a priori
properties. One easily sees that the smallest topology containing $\mathcal{S}$ is the set of all
unions of finite intersections of elements of $\mathcal{S}$. Then one speaks of $\mathcal{S}$ as of a subbasis
of the topology obtained.
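For a finite set the recipe "unions of finite intersections" can be carried out directly. The sketch below (the function name is hypothetical) first forms all finite intersections of members of the subbasis, with the empty intersection taken to be $X$, and then all unions of the resulting sets, with the empty union taken to be $\emptyset$.

```python
from itertools import combinations

def generated_topology(X, S):
    """Smallest topology on the finite set X containing every member of S."""
    X = frozenset(X)
    sub = [frozenset(s) for s in S]
    # All finite intersections; the empty intersection is X itself.
    basis = {X}
    for r in range(1, len(sub) + 1):
        for combo in combinations(sub, r):
            basis.add(frozenset.intersection(*combo))
    # All unions of subfamilies of the basis; the empty union is the empty set.
    opens = {frozenset()}
    for B in basis:
        opens |= {U | B for U in opens}
    return opens

topology = generated_topology({1, 2, 3}, [{1, 2}, {2, 3}])
print(sorted(sorted(U) for U in topology))
```

Here the subbasis $\{1,2\}, \{2,3\}$ forces $\{2\}$ to be open as well, and the resulting topology has five open sets.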
The preimages of (finite) intersections are (finite) intersections, and preimages of
unions are unions of preimages. Consequently we obtain from 4.6 an important
Observation. A mapping $f : (X,\tau)\to(Y,\theta)$ is continuous if and only if there is
a subbasis $\mathcal{S}$ of $\theta$ such that each $f^{-1}[S]$ with $S\in\mathcal{S}$ is open.
(Thus e.g. to make sure a real function $f : X\to\mathbb{R}$ is continuous it suffices to
check that all the $f^{-1}[(-\infty,a)]$ and $f^{-1}[(a,+\infty)]$ are open.)
4.8
Let $(X,\tau)$ be a topological space and let $Y\subseteq X$ be a subset. We define the subspace
of $(X,\tau)$ carried (or induced) by $Y$ as
$$(Y,\tau|Y) \quad\text{where}\quad \tau|Y = \{U\cap Y \mid U\in\tau\}.$$
Since for the embedding map $j : Y\to X$, $j^{-1}[U] = U\cap Y$, the map $j$ is continuous;
furthermore, if $f : (Z,\theta)\to(X,\tau)$ is a continuous map such that $f[Z]\subseteq Y$ then
the map $(z\mapsto f(z)) : (Z,\theta)\to(Y,\tau|Y)$ is continuous as well.
Note that this is in accordance with the concept of subspace in the metric case: the
metric subspace (cf. 2.1) has the topology just described, obtained from the topology
of the larger metric space.
4.8.1 Convention
Unless otherwise stated, the subsets of a topological space will be understood to
be endowed with the induced topology, and we will subject the terminology to this
convention. Thus we will speak of "connected subsets" or "compact subsets" etc.
(see below) or on the other hand of an "open subspace" or "closed subspace", etc.
5 Connected spaces
One of the simplest notions defined for topological spaces is connectedness.
5.1
A topological space $X$ is said to be connected if for any two open sets $U,V\subseteq X$
which satisfy $U\cap V = \emptyset$ and $U\cup V = X$, we have $U = \emptyset$ (and hence $V = X$),
or $V = \emptyset$ (and hence $U = X$). It is also common, for a subset $S\subseteq X$, to say
that $S$ is connected if $S$ is a connected topological space with respect to the induced
topology. Note that this is equivalent to saying that for open sets $U,V\subseteq X$ such
that $U\cap V\cap S = \emptyset$ and $U\cup V\supseteq S$, we have $U\supseteq S$ or $V\supseteq S$. The following
observations are immediate.
5.1.1 Proposition. Let $X$ be a connected space and $f : X\to Y$ a continuous map
which is onto. Then $Y$ is connected.
Proof. Suppose $U,V\subseteq Y$ are open, $U\cap V = \emptyset$, $U\cup V = Y$. Then $f^{-1}[U]\cap
f^{-1}[V] = \emptyset$, $f^{-1}[U]\cup f^{-1}[V] = X$, so $f^{-1}[U] = \emptyset$ or $f^{-1}[V] = \emptyset$, which
implies $U = \emptyset$ or $V = \emptyset$ since $f$ is onto. □
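For a finite topological space the definition of connectedness in 5.1 can be tested by brute force. The sketch below (hypothetical function name; the topology is passed as a list of open sets and is assumed to satisfy the axioms) looks for a non-empty open set whose complement is also non-empty and open.

```python
def is_connected(X, tau):
    """True if the finite space (X, tau) has no splitting into two non-empty open sets."""
    X = frozenset(X)
    opens = {frozenset(U) for U in tau}
    for U in opens:
        V = X - U
        # U and V are disjoint with union X; a disconnection needs both
        # to be non-empty and open.
        if U and V and V in opens:
            return False
    return True

chain = [set(), {1}, {1, 2}, {1, 2, 3}]
discrete = [set(), {1}, {2}, {1, 2}]
print(is_connected({1, 2, 3}, chain))
print(is_connected({1, 2}, discrete))
```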
5.1.2 Proposition. Let $S_i\subseteq X$, $i\in I$, and let each $S_i$ be connected. Suppose
further for every $i,j\in I$, there exist $i_0,\dots,i_k\in I$, $i_0 = i$, $i_k = j$ such that
$S_{i_t}\cap S_{i_{t+1}}\neq\emptyset$. Then
$$S = \bigcup_{i\in I} S_i$$
is connected.
Proof. Suppose $U,V$ are open in $X$, $U\cup V\supseteq S$, $U\cap V\cap S = \emptyset$. Suppose further
$U\cap S$ is non-empty. Then there exists an $i\in I$ such that $U\cap S_i\neq\emptyset$, and hence $U\supseteq S_i$
since $S_i$ is connected. Now select any $j\in I$ and let $i_0,\dots,i_k$ be as in the statement
of the Proposition. By induction on $t$, we see that $U\cap S_{i_t}\neq\emptyset$, and hence $U\supseteq S_{i_t}$
since $S_{i_t}$ is connected. Thus, $U\supseteq S_j$. Since $j\in I$ was arbitrary, $U\supseteq S$. □
5.1.3 Corollary. A product $X\times Y$ of two connected metric spaces $X, Y$ is
connected.
Proof. Choose a point $x\in X$ and consider the sets $S_0 = \{x\}\times Y$, $S_y = X\times\{y\}$
for $y\in Y$. Then $S_i$, $i\in Y\amalg\{0\}$, satisfy the assumptions of Proposition 5.1.2. □
5.1.4 Proposition. The closure of a connected subset $S$ of a topological space is
connected.
Proof. If $U,V\subseteq\overline{S}$ satisfy $U\cap V = \emptyset$, $U\cup V = \overline{S}$ and $U,V$ are non-empty open
in $\overline{S}$, then $U\cap S$, $V\cap S$ are non-empty and open in $S$, their union is $S$ and their
intersection is empty, contradicting the assumption that $S$ is connected. □
5.2 Connectedness of the real numbers
The fact that the set R of all real numbers is connected is “intuitively obvious”, but
must be proved with care. Let us start with a preliminary result.
5.2.1 Lemma. Every open set $U\subseteq\mathbb{R}$ is a union of countably (or finitely) many
disjoint open intervals.
Proof. We know that $U$ is a union of countably many open intervals $U_i$, $i =
1,2,\dots$, since open intervals $(q_1,q_2)$, $q_1,q_2\in\mathbb{Q}$, form a basis of the topology
of $\mathbb{R}$. Note also that if $V,W$ are open intervals and $V\cap W\neq\emptyset$, then $V\cup W$ is
an open interval, and that an increasing union of open intervals is an open interval.
Now consider an equivalence relation $\sim$ on $\{1,2,\dots\}$ where $i\sim j$ if and only if there
exist $i_0,\dots,i_k$ such that $i_0 = i$, $i_k = j$ and $U_{i_t}\cap U_{i_{t+1}}\neq\emptyset$. Then the sets
$$\bigcup_{i\in C} U_i$$
where $C$ are equivalence classes with respect to $\sim$ are disjoint open intervals whose
union is $U$. □
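For a finite family of open intervals, the grouping into equivalence classes used in this proof amounts to merging overlapping intervals. A minimal sketch follows, under assumptions of ours: intervals are pairs `(a, b)` with finite endpoints, and the function name is hypothetical. Note the strict comparison: the open intervals $(0,1)$ and $(1,2)$ do not intersect, so they are not merged.

```python
def disjoint_components(intervals):
    """Merge finitely many open intervals (a, b) into disjoint open intervals."""
    merged = []
    # Sort by left endpoint; skip empty intervals (a >= b).
    for a, b in sorted(iv for iv in intervals if iv[0] < iv[1]):
        if merged and a < merged[-1][1]:
            # (a, b) meets the last component: extend that component.
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged

print(disjoint_components([(0, 2), (1, 3), (5, 6), (3, 4), (5.5, 7)]))
```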
5.2.2 Theorem. The connected subsets of $\mathbb{R}$ are precisely (open, closed, half-open,
bounded, unbounded, etc.) intervals.
Proof. Let us first prove that intervals are connected. Let $J$ be an interval. Suppose
$U,V$ are open in $\mathbb{R}$, $U\cup V\supseteq J$, $U\cap V\cap J = \emptyset$. Suppose $U$ is non-empty. By
Lemma 5.2.1, $U$ is a disjoint union of countably many open intervals $U_i$, $i\in I\neq\emptyset$.
Without loss of generality, none of the sets $U_i$ is disjoint with $J$. Choose $i\in I$, and
suppose $U_i = (a,b)$ does not contain $J$. Then $(a,b)\cup J$ is an interval containing
but not equal to $(a,b)$, so $a\in J$ or $b\in J$. Let, without loss of generality, $b\in J$.
Then $b\notin V$, $b\notin U_j$, $j\neq i$, since $V$, $U_j$, $j\neq i$, are open and disjoint with $U_i$.
Thus, $b\in J\setminus(U\cup V)$, which is a contradiction.
On the other hand, suppose that $S\subseteq\mathbb{R}$ is connected but isn't an interval. Then
there exist points $x < z < y$, $x,y\in S$, $z\notin S$. But then $S\subseteq(-\infty,z)\cup(z,\infty)$,
which contradicts the assumption that $S$ is connected. □
5.2.3 Corollary. The Euclidean space $\mathbb{R}^n$ is connected.
Proof. This follows from Theorem 5.2.2 and Corollary 5.1.3. □
5.3 Path-connected spaces
A topological space $X$ is called path-connected if for any two points $x,y\in X$,
there exists a continuous map $\varphi : \langle 0,1\rangle\to X$ such that $\varphi(0) = x$, $\varphi(1) = y$.
By Theorem 5.2.2, Proposition 5.1.1 and Proposition 5.1.2, a path-connected space
is connected. See Exercise (14) for an example of a closed subset of $\mathbb{R}^2$ which is
connected but not path-connected.
5.3.1 Proposition. Let $U\subseteq\mathbb{R}^n$ be a connected open set (with the induced
topology). Then $U$ is path-connected.
Proof. If $U$ is empty, it is clearly path-connected. Suppose $U$ is non-empty. Choose
a point $x\in U$. Let $V\subseteq U$ be the set of all points $y\in U$ for which there exists a
continuous map $\varphi : \langle 0,1\rangle\to U$ such that $\varphi(0) = x$, $\varphi(1) = y$. We claim that $V$ is
open in $U$: this is the same as being open in $\mathbb{R}^n$. If $\varphi$ is as above, $\Omega(y,\varepsilon)\subseteq U$, and
$z\in\Omega(y,\varepsilon)$, extend $\varphi$ to a map $\langle 0,2\rangle\to U$ by putting $\varphi(1+t) = tz+(1-t)y$ for
$t\in\langle 0,1\rangle$. Clearly $\varphi$ is continuous, and defining $\psi : \langle 0,1\rangle\to U$ by $\psi(t) = \varphi(2t)$
shows $z\in V$.
We also claim, however, that $V$ is closed in $U$: Let $y_n\to y$, $y_n\in V$, $y\in U$.
Since $U$ is open, there exists an $\varepsilon > 0$ with $\Omega(y,\varepsilon)\subseteq U$. Then there exists an $n$ such
that $y_n\in\Omega(y,\varepsilon)$. Then we proceed the same way as above: Let $\varphi : \langle 0,1\rangle\to U$,
$\varphi(0) = x$, $\varphi(1) = y_n$. Extend $\varphi$ to a map $\langle 0,2\rangle\to U$ by putting $\varphi(1+t) =
ty+(1-t)y_n$ for $t\in\langle 0,1\rangle$. Putting again $\psi(t) = \varphi(2t)$ shows that $y\in V$.
Since $V\neq\emptyset$ (since $x\in V$), and since $V$ is open and closed in $U$, we must have
$V = U$, since $U$ is connected. □
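The path extension used twice in this proof is the usual concatenation of paths followed by a reparametrization. A small sketch, with hypothetical helper names and points of $\mathbb{R}^n$ represented as tuples: `straight(p, q)` is the segment $t\mapsto tq+(1-t)p$, and `concatenate(phi, chi)` runs through `phi` on $\langle 0,\frac12\rangle$ and through `chi` on $\langle\frac12,1\rangle$, which is the map $\psi(t) = \varphi(2t)$ of the proof.

```python
def straight(p, q):
    """The segment t -> t*q + (1 - t)*p from p to q, for t in <0, 1>."""
    return lambda t: tuple((1 - t) * a + t * b for a, b in zip(p, q))

def concatenate(phi, chi):
    """Path running through phi and then chi; assumes phi(1) == chi(0)."""
    def psi(t):
        return phi(2 * t) if t <= 0.5 else chi(2 * t - 1)
    return psi

x, y, z = (0.0, 0.0), (1.0, 0.0), (1.0, 1.0)
phi = straight(x, y)          # a path from x to y
psi = concatenate(phi, straight(y, z))  # a path from x to z through y
print(psi(0.0), psi(0.5), psi(0.75), psi(1.0))
```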
5.4 Connected components
Let $X$ be a topological space. Let $\sim$ be a relation on $X$ where $x\sim y$ if and
only if there exists a connected subset $S\subseteq X$ such that $x,y\in S$. Then $\sim$ is an
equivalence relation (transitivity follows from Proposition 5.1.2). The equivalence
classes of $\sim$ are called the connected components of $X$. Also by Proposition 5.1.2,
connected components are connected subsets of $X$.
An immediate consequence of Proposition 5.1.4 is the following:
5.4.1 Lemma. Connected components of $X$ are closed subsets of $X$. □
Connected components may not be open: consider $\mathbb{Q}$ (with the topology induced
from $\mathbb{R}$). Then the connected components are single points. We have, however,
5.4.2 Lemma. Let $U\subseteq\mathbb{R}^n$ be an open set. Then the connected components of $U$
are open in $U$ (hence in $\mathbb{R}^n$).
Proof. Let $x\in U$. Then there exists $\varepsilon > 0$ such that $\Omega(x,\varepsilon)\subseteq U$, but $\Omega(x,\varepsilon)$
is homeomorphic to $\mathbb{R}^n$ and hence connected by Corollary 5.2.3, so $\Omega(x,\varepsilon)$ is
contained in the connected component of $x$. Since this is true for every point $x$,
the connected components are open. □
5.5 A result on bounded closed intervals
The proof of the following result will seem, in nature, related to the proof of the
fact that the real numbers are connected. While this is true, it turns out to be mainly
due to special properties of the real numbers. The result itself is a reformulation of
compactness, a notion which we will discuss in the next section. An understanding
of this connection for general metric spaces, however, will have to be postponed
until Chapter 9 below.
By an open interval (resp. bounded closed interval) in $\mathbb{R}^n$ we mean a set of the
form $\prod_{k=1}^n (a_k,b_k)$ (resp. of the form $\prod_{k=1}^n \langle a_k,b_k\rangle$, $-\infty < a_k, b_k < \infty$).
Theorem. For every bounded closed interval $K$ in $\mathbb{R}^n$ and every set $S$ of open
intervals such that $K\subseteq\bigcup_{I\in S} I$, there exists a finite subset $F\subseteq S$ such that
$K\subseteq\bigcup_{I\in F} I$.
Proof. Let us first consider the case $n = 1$. Let $\langle a,b\rangle$ be contained in a union of a
set $S$ of open intervals. Let $t\in\langle a,b\rangle$ be the supremum of the set $M$ of all $s\in\langle a,b\rangle$
such that $\langle a,s\rangle$ is contained in a union of some finite subset of $S$. We want to prove
that $t = b$. Assume, then, that $t < b$. Then there exists a $J\in S$ such that $t\in J$. On
the other hand, by the definition of supremum, there exist $s_i\in M$ such that $s_i\nearrow t$.
Then, for some $i$, $s_i\in J$. But we also know that there exists a finite subset $F\subseteq S$
whose union contains $\langle a,s_i\rangle$. Then the union of the finite subset $F\cup\{J\}$ contains
$\langle a,x\rangle$ for every $x\in J$, contradicting $t = \sup M$.
Now let us consider general $n$. Assume, by induction, that the statement holds
with $n$ replaced by $n-1$. Let $K = \langle a_1,b_1\rangle\times\cdots\times\langle a_n,b_n\rangle$. Then for every point
$x\in\langle a_1,b_1\rangle$, there exists, by the induction hypothesis, a finite subset $F_x\subseteq S$
such that $\{x\}\times\langle a_2,b_2\rangle\times\cdots\times\langle a_n,b_n\rangle\subseteq\bigcup F_x$. Let $I_x$ be the intersection of all the
(1-dimensional) intervals $I_1$ where $I_1\times\cdots\times I_n\in F_x$. Then $\langle a_1,b_1\rangle$ is contained in
the union of the open intervals $I_x$, $x\in\langle a_1,b_1\rangle$, and hence there are finitely many
points $x_1,\dots,x_k\in\langle a_1,b_1\rangle$ such that $\langle a_1,b_1\rangle\subseteq\bigcup_{i=1}^k I_{x_i}$. Then $K$ is contained in the
union of the open intervals in $F_{x_1}\cup\cdots\cup F_{x_k}$. □
Corollary. For every bounded closed interval $K$ in $\mathbb{R}^n$ and every set $Q$ of open sets
such that $K\subseteq\bigcup_{I\in Q} I$, there exists a finite subset $F\subseteq Q$ such that $K\subseteq\bigcup_{I\in F} I$.
(Apply the theorem to the set $S$ of all open intervals which are contained in one
of the open sets in $Q$.)
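In the case $n = 1$ the supremum argument of the proof suggests a simple left-to-right selection procedure. The following is only a sketch under assumptions of ours: the cover is given as a finite Python list of open intervals `(c, d)` (so finiteness is built in, and the point is merely how a subcover is picked), and the function name is hypothetical. At each step we take, among the intervals containing the first point not yet covered, the one reaching farthest to the right.

```python
def finite_subcover(a, b, intervals):
    """Pick open intervals from `intervals` covering <a, b>; None if impossible."""
    chosen = []
    t = a  # the leftmost point of <a, b> not yet known to be covered
    while True:
        candidates = [(c, d) for (c, d) in intervals if c < t < d]
        if not candidates:
            return None  # t is covered by no interval of the family
        best = max(candidates, key=lambda iv: iv[1])
        chosen.append(best)
        if best[1] > b:
            return chosen  # everything up to and including b is covered
        t = best[1]  # the right endpoint itself is not in the open interval

cover = [(-0.1, 0.3), (0.2, 0.6), (0.25, 0.5), (0.55, 1.1), (0.9, 1.2)]
print(finite_subcover(0.0, 1.0, cover))
print(finite_subcover(0.0, 1.0, [(-0.1, 0.5), (0.5, 1.1)]))  # 0.5 is not covered
```

The loop terminates because the right endpoint of the chosen interval strictly increases at every step.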
6 Compact metric spaces
6.1
A metric space $X$ is said to be compact if each sequence $(x_n)_n$ in $X$ contains a
convergent subsequence. Thus, in particular, a bounded closed interval $\langle a,b\rangle$ in $\mathbb{R}$
is compact (recall Theorem 2.3 of Chapter 1).
6.2 Proposition. 1. A subspace of a compact space is compact if and only if it is
closed.
2. If $f : X\to Y$ is continuous then the image $f[A]$ of any compact $A\subseteq X$ is
compact.
Proof. 1. Let $A$ be a closed subspace of a compact $X$. Let $(x_n)_n$ be a sequence of
points of $A$. There is a subsequence $x_{k_1}, x_{k_2}, x_{k_3}, \dots$ converging in $X$. Since $A$
is closed, the limit is in $A$.
Now let $A$ not be closed. Then there is a sequence $(x_n)_n$ of elements of $A$
convergent in $X$, with the limit $x$ in $X\setminus A$; since each subsequence converges
to $x$, there is none converging to a point in $A$.
2. Let $(y_n)_n$ be a sequence in $f[A]$. Choose $x_i\in A$ such that $y_i = f(x_i)$. Since
$A$ is compact we have a subsequence $x_{k_1}, x_{k_2}, x_{k_3}, \dots$ converging to an $x\in A$.
Then by 1.6, $y_{k_1}, y_{k_2}, y_{k_3}, \dots$ converges to $f(x)$. □
6.2.1
Note that from the second part of the proof of the first statement we obtain an
immediate
Observation. A compact subspace of any metric space $X$ is closed in $X$.
Remark. Thus we have a slightly surprising consequence: if $X$ is compact, $Y$ is
a general metric space and if $f : X\to Y$ is a continuous mapping then, besides
preimages of closed sets being closed, also the images of closed sets are closed.
We will learn more about this phenomenon in Chapter 9 below. For now, let us
record the following
6.2.2 Corollary. Let $f : X\to Y$ be a continuous bijective (i.e. one to one and
onto) map of metric spaces where $X$ is compact. Then $f$ is a homeomorphism.
6.3 Proposition. Let $X$ be a compact metric space. Then for each continuous real
function $f$ on $X$ there exist $x_1, x_2\in X$ such that
$$f(x_1) = \min\{f(x) \mid x\in X\} \quad\text{and}\quad f(x_2) = \max\{f(x) \mid x\in X\}.$$
(Compare with 3.4 of Chapter 1.)
Proof. A non-empty compact subspace $A$ of $\mathbb{R}$ has a minimal and a maximal point, namely
$\inf A$ and $\sup A$, which are obviously limits of sequences in $A$. Apply this to $A = f[X]$,
compact by 6.2. □
6.4 Proposition. (Finite) products of compact spaces are compact.
Proof. We will begin with the product $X \times Y$ of two compact metric spaces; the extension to a general finite product follows by induction.
Let
$$(x_1, y_1), (x_2, y_2), (x_3, y_3), \dots \qquad (*)$$
be a sequence of points of $X \times Y$. In $X$, choose a convergent subsequence $(x_{k_n})_n$ of $(x_n)_n$. Now take the sequence $(y_{k_n})_n$ in $Y$ and choose a convergent subsequence $(y_{k_{r_n}})_n$. Then by 2.2.2.2 (and (1.2.1)),
$$(x_{k_{r_1}}, y_{k_{r_1}}), (x_{k_{r_2}}, y_{k_{r_2}}), (x_{k_{r_3}}, y_{k_{r_3}}), \dots$$
is a convergent subsequence of (*). □
A metric space $(X, d)$ is bounded if there exists a number $K$ such that for all $x, y \in X$, $d(x, y) < K$. From the triangle inequality we immediately see that this is equivalent to any of the following statements:
there is a $K$ such that for every $x$, $X \subseteq \Omega(x, K)$;
for every $x$ there is a $K$ such that $X \subseteq \Omega(x, K)$.
6.5 Theorem. A subspace of the Euclidean space $\mathbb{R}^m$ is compact if and only if it is bounded and closed.
Proof. I. From Theorem 2.3 of Chapter 1, we already know that a bounded closed interval is compact.
II. Now let $X$ be a bounded closed subspace of $\mathbb{R}^m$. Since it is bounded there are intervals $\langle a_i, b_i\rangle$, $i = 1, \dots, m$, such that
$$X \subseteq J = \langle a_1, b_1\rangle \times \cdots \times \langle a_m, b_m\rangle.$$
By 6.4 and I, $J$ is compact. The subspace $X$ is closed in $\mathbb{R}^m$, hence in $J$, and hence it is compact by 6.2.
III. Let $X$ not be closed in $\mathbb{R}^m$. Then it is not compact, by 6.2.1.
IV. Let $X$ not be bounded. Choose arbitrarily $x_1$ and then $x_n$ such that $d(x_1, x_n) > n$. A convergent sequence is always bounded (all but finitely many of its elements are in the $\varepsilon$-ball of the limit). Thus, $(x_n)_n$ cannot have a convergent subsequence as it has no bounded one. □
6.6
We have already observed that uniform continuity is a much stronger property than continuity (even the real function $x \mapsto x^2$ is not uniformly continuous). But the situation is different for compact spaces. We have
Theorem. Let $X, Y$ be metric spaces and let $X$ be compact. Then a mapping $f : X \to Y$ is uniformly continuous if and only if it is continuous.
(Compare with Theorem 3.5.1 of Chapter 1.)
Proof. Let $f$ be continuous but not uniformly continuous. Negating the definition, there is an $\varepsilon_0 > 0$ such that for every $\delta > 0$ there are $x(\delta), y(\delta)$ such that
$$d(x(\delta), y(\delta)) < \delta \quad\text{while}\quad d'(f(x(\delta)), f(y(\delta))) \ge \varepsilon_0.$$
Consider $x_n = x(\frac{1}{n})$ and $y_n = y(\frac{1}{n})$. Choose a convergent subsequence $(x_{k_n})_n$ of $(x_n)_n$ and a convergent subsequence $(y_{k_{r_n}})_n$ of $(y_{k_n})_n$, set $\tilde{x}_n = x_{k_{r_n}}$ and $\tilde{y}_n = y_{k_{r_n}}$, and finally $x = \lim \tilde{x}_n$ and $y = \lim \tilde{y}_n$. As $d(\tilde{x}_n, \tilde{y}_n) < \frac{1}{n}$, $x = y$. This is a contradiction since by continuity $f(x) = \lim f(\tilde{x}_n)$ and $f(y) = \lim f(\tilde{y}_n)$ and $d'(f(\tilde{x}_n), f(\tilde{y}_n))$ is always at least $\varepsilon_0$. □
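The remark opening 6.6, that $x \mapsto x^2$ is not uniformly continuous on the whole real line, can be made tangible numerically. The following is only an illustrative sketch: the helper name `max_gap`, the grid size and the value of `delta` are assumptions made for the example. For a fixed $\delta$ it reports the largest observed jump $|f(x+\delta) - f(x)|$ on the compact interval $\langle 0, \mathit{hi}\rangle$; the jump stays small on each fixed interval but grows without bound as the interval grows.

```python
def max_gap(f, lo, hi, delta, samples=10_000):
    """Largest |f(x + delta) - f(x)| over a uniform grid of x in [lo, hi - delta]."""
    step = (hi - delta - lo) / samples
    return max(abs(f(lo + i * step + delta) - f(lo + i * step))
               for i in range(samples + 1))

def square(x):
    return x * x

delta = 1e-3  # a fixed delta; no single delta works for every interval
for hi in (1.0, 10.0, 100.0, 1000.0):
    print(hi, max_gap(square, 0.0, hi, delta))
```

On each bounded closed interval the gap is controlled (as the Theorem predicts), while no single $\delta$ serves all of $\mathbb{R}$.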
7 Completeness
7.1
A sequence $(x_n)_n$ in a metric space $(X, d)$ is said to be Cauchy if
$$\forall \varepsilon > 0\ \exists n_0 \text{ such that } \forall m, n \ge n_0,\ d(x_m, x_n) < \varepsilon.$$
7.2 Proposition. 1. Every convergent sequence is Cauchy.
2. Let a Cauchy sequence $(x_n)_n$ contain a convergent subsequence; then the whole sequence $(x_n)_n$ converges.
3. Every Cauchy sequence is bounded.
Proof. 1. Let $\lim x_n = x$. For $\varepsilon > 0$ choose an $n_0$ such that $d(x_n, x) < \frac{\varepsilon}{2}$ for all $n \ge n_0$. Then for $m, n \ge n_0$,
$$d(x_m, x_n) \le d(x_m, x) + d(x, x_n) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$
2. Let $(x_n)_n$ be Cauchy and let $(x_{k_n})_n$ be a subsequence converging to a point $x$. Let $\varepsilon > 0$. Choose an $n_1$ such that for $m, n \ge n_1$, $d(x_m, x_n) < \frac{\varepsilon}{2}$, and an $n_2$ such that for $n \ge n_2$, $d(x_{k_n}, x) < \frac{\varepsilon}{2}$. Set $n_0 = \max(n_1, n_2)$. Since $k_n \ge n$ we have, for $n \ge n_0$,
$$d(x_n, x) \le d(x_n, x_{k_n}) + d(x_{k_n}, x) < \varepsilon.$$
3. Choose $n_0$ such that for $m, n \ge n_0$, $d(x_m, x_n) < 1$. Then for any $n$,
$$d(x_n, x_{n_0}) < 1 + \max_{k \le n_0} d(x_{n_0}, x_k).$$ □
7.3
A metric space $(X, d)$ is said to be complete if every Cauchy sequence in $X$ converges.
7.3.1 Proposition. A subspace $A$ of a complete space $X$ is complete if and only if it is closed.
Proof. Let $A$ be closed. If a sequence is Cauchy in $A$, it is Cauchy in $X$ and hence convergent. Since $A$ is closed, the limit of the sequence has to be in $A$.
If $A$ is not closed there is a sequence $(x_n)_n$ with $x_n \in A$, convergent in $X$ to an $x \in X \setminus A$. Then $(x_n)_n$ is Cauchy in $X$ and hence in $A$ as well; but all of its subsequences converge to $x$ and hence do not converge in $A$. □
7.4 Proposition. A compact metric space is complete.
Proof. Let $(x_n)_n$ be a Cauchy sequence in a compact metric space $X$. Then it has a convergent subsequence, and by the second statement of 7.2, it converges. □
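7.5 Theorem. The Euclidean space $\mathbb{R}^m$ (in particular, the real line $\mathbb{R}$) is complete. Consequently, a subspace of $\mathbb{R}^m$ is complete if and only if it is closed.
Proof. Let $(x_n)_n$ be a Cauchy sequence in $\mathbb{R}^m$. By 7.2 it is bounded and hence
$$\{x_n \mid n = 1, 2, \dots\} \subseteq J = \langle a_1, b_1\rangle \times \cdots \times \langle a_m, b_m\rangle$$
for sufficiently large intervals $\langle a_j, b_j\rangle$. Since $J$ is compact (by 6.4), it is complete by 7.4; thus $(x_n)_n$ converges in $J$ and hence it converges in $\mathbb{R}^m$. The consequence follows from 7.3.1. □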
Remark. The special case of the real line is the well-known Bolzano-Cauchy
Theorem (Theorem 2.4 of Chapter 1).
7.6
The following is the well-known Banach Fixed Point Theorem. At first sight it
may seem that its use will be rather limited: the assumption is very strong. But the
reader will be perhaps surprised by the generality of one of the applications in 3.3
of Chapter 6.
Theorem. Let $(X, d)$ be a complete metric space. Let $f : X \to X$ be a mapping such that there is a $q < 1$ with
$$d(f(x), f(y)) \le q \cdot d(x, y) \qquad (*)$$
for all $x, y \in X$. Then there is precisely one $x \in X$ such that $f(x) = x$.
Proof. Choose any $x_1 \in X$ and then, inductively,
$$x_{n+1} = f(x_n).$$
Set $C = d(x_1, x_2)$. By the assumption we have
$$d(x_2, x_3) \le C q,\quad d(x_3, x_4) \le C q^2,\ \dots,\ d(x_n, x_{n+1}) \le C q^{n-1}.$$
Thus, by the triangle inequality, for $m \ge n + 1$,
$$d(x_n, x_m) \le C(q^{n-1} + q^n + \cdots + q^{m-2}) \le C q^{n-1}(1 + q + q^2 + \cdots) = \frac{C}{1-q}\, q^{n-1}.$$
Hence, $(x_n)_n$ is a Cauchy sequence and we have a limit $x = \lim x_n$. Now a mapping $f$ satisfying (*) is clearly continuous and hence we have
$$f(x) = f(\lim x_n) = \lim f(x_n) = \lim x_{n+1} = x.$$
Finally, if $f(x) = x$ and $f(y) = y$ then
$$d(x, y) = d(f(x), f(y)) \le q \cdot d(x, y) \quad\text{with } q < 1,$$
which is possible only if $d(x, y) = 0$. □
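The proof is constructive: the iteration $x_{n+1} = f(x_n)$ converges, and letting $m \to \infty$ in the estimate above gives the a priori bound $d(x_n, x) \le \frac{C}{1-q} q^{n-1}$. The sketch below illustrates this on the real line with $f = \cos$ on $\langle 0, 1\rangle$. The function name `fixed_point`, the tolerance and the choice of example are assumptions made for illustration; the contraction constant $q = \sin 1$ is supplied by the caller and not verified by the code.

```python
import math

def fixed_point(f, x1, q, tol=1e-12, max_iter=10_000):
    """Iterate x_{n+1} = f(x_n) until the bound C*q**(n-1)/(1-q) drops below tol.

    q is a contraction constant supplied by the caller (assumed, not checked).
    Returns the last iterate x_n and the index n.
    """
    x = x1
    c = abs(f(x1) - x1)            # C = d(x_1, x_2)
    n = 1
    while c * q ** (n - 1) / (1 - q) >= tol and n < max_iter:
        x = f(x)                   # pass from x_n to x_{n+1}
        n += 1
    return x, n

# cos maps [0, 1] into itself and |cos'(x)| = |sin x| <= sin(1) < 1 there.
q = math.sin(1.0)
x, steps = fixed_point(math.cos, 1.0, q)
print(x, steps)
```

The stopping rule uses only $C$ and $q$, so the number of steps is known before the iteration starts; in practice the iterates usually converge faster than this worst-case bound suggests.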
7.7 An Example: Spaces of continuous functions
Let $X = (X, d)$ be a metric space. Denote by
$$C(X)$$
the space of all bounded continuous real functions $f : X \to \mathbb{R}$, endowed with the metric
$$d(f, g) = \sup_{x \in X} |f(x) - g(x)|.$$
(The function $d$ thus defined really is a metric. Obviously $d(f, g) = 0$ implies $f = g$ and $d(f, g) = d(g, f)$. Suppose $d(f, g) + d(g, h) < d(f, h)$; then there is an $x \in X$ such that $d(f, g) + d(g, h) < |f(x) - h(x)|$, but then in particular $|f(x) - g(x)| + |g(x) - h(x)| < |f(x) - h(x)|$, a contradiction.)
Remark. Of course, by 2.4.2, if $X$ is compact then $C(X)$ is the space of all continuous functions on $X$.
7.7.1 Observation. The convergence in $C(X)$ is exactly the uniform convergence defined in 8.1.
(We have $d(f, g) < \varepsilon$ if and only if for all $x \in X$, $|f(x) - g(x)| < \varepsilon$.)
7.7.2 Proposition. The space $C(X)$ with the metric defined above is complete.
Proof. Let $(f_n)_n$ be a Cauchy sequence in $C(X)$. Then, since $|f_n(x) - f_m(x)| \le d(f_n, f_m)$ for each $x \in X$, every $(f_n(x))_n$ is a Cauchy sequence in $\mathbb{R}$, and hence a convergent one. Set
$$f(x) = \lim_n f_n(x).$$
Claim. The sequence $(f_n)_n$ converges to $f$ uniformly.
Proof of the Claim. Consider an $\varepsilon > 0$. There exists an $n_0$ such that for $m, n \ge n_0$,
$$\forall x,\quad |f_n(x) - f_m(x)| < \frac{\varepsilon}{2}$$
and hence $\lim_{m\to\infty} |f_n(x) - f_m(x)| = |f_n(x) - \lim_{m\to\infty} f_m(x)| = |f_n(x) - f(x)| \le \frac{\varepsilon}{2} < \varepsilon$. Thus, for $n \ge n_0$ and for all $x \in X$, $|f_n(x) - f(x)| < \varepsilon$. □
Proof of the Proposition continued. By the Claim and 8.2, $f$ is continuous. Now there exists an $n_0$ such that for all $n, m \ge n_0$, $d(f_n, f_m) = \sup_x |f_n(x) - f_m(x)| < 1$ and hence, taking the limit, we obtain $|f_n(x) - f(x)| \le 1$ for all $x$. Thus, if $|f_{n_0}(x)| \le K$ we have $|f(x)| \le K + 1$ for all $x$.
Now we know that $f$ is bounded and continuous, hence $f \in C(X)$, and by 7.7.1 and the Claim again, $(f_n)_n$ converges to $f$ in $C(X)$. □
7.7.3
Let $a, b \in \mathbb{R} \cup \{-\infty, +\infty\}$. Put
$$C(X; a, b) = \{f \in C(X) \mid \forall x,\ a \le f(x) \le b\}.$$
Proposition. The subspace $C(X; a, b)$ is closed in $C(X)$. Consequently, it is complete.
Proof. Recall 8.1.1. Since uniform convergence implies pointwise convergence, if $a \le f_n(x) \le b$ and $f_n$ converge to $f$ then $a \le f(x) \le b$ and $f \in C(X; a, b)$. The consequence follows from 7.3.1. □
8 Uniform convergence of sequences of functions. Application: Tietze's Theorems
On various occasions we have seen that general facts the reader knew about real functions of one real variable held generally, and the proofs did not really need anything but replacing $|x - y|$ by the distance $d(x, y)$. For example, this was the case when studying the relationship between continuity and convergence, or when proving that continuous maps of compact spaces are automatically uniformly continuous; or the fact about maxima and minima of real functions on a compact space (where in fact the general proof was in a way simpler, or more transparent, due to the observation that the image of a compact space is compact).
In this section we will introduce yet another case of such a mechanical extension, namely the behavior of uniformly convergent sequences of mappings, resp. uniformly convergent series of real functions. As an application we will present the rather important Tietze Theorems on extension of continuous maps.
8.1
Let $(X, d)$, $(Y, d')$ be metric spaces. A sequence of mappings
$$f_1, f_2, f_3, \dots : X \to Y$$
is said to converge uniformly to $f$ if for every $\varepsilon > 0$ there is an $n_0$ such that for all $n \ge n_0$ and for all $x \in X$,
$$d'(f_n(x), f(x)) < \varepsilon.$$
This is usually indicated by
$$f_n \rightrightarrows f.$$
8.1.1 Remarks
1. Note that if $f_n \rightrightarrows f$ then
$$\lim f_n(x) = f(x) \text{ for all } x. \qquad (*)$$
The statement (*) alone (called pointwise convergence) is much weaker, and would not suffice as an assumption in 8.2 below.
2. Also note that in the above definition, one uses the metric structure in $(Y, d')$ only. See 8.2.1 below.
8.2 Proposition. Let $f_n \rightrightarrows f$ for mappings $(X, d) \to (Y, d')$. Let all the functions $f_n$ be continuous. Then $f$ is continuous.
Proof. Fix a point $x$. For $\varepsilon > 0$ choose $n$ such that $d'(f_n(z), f(z)) < \frac{\varepsilon}{3}$ for all $z \in X$. Since $f_n$ is continuous there is a $\delta > 0$ such that $d(x, y) < \delta$ implies $d'(f_n(x), f_n(y)) < \frac{\varepsilon}{3}$. Now we have the implication
$$d(x, y) < \delta \ \Rightarrow\ d'(f(x), f(y)) \le d'(f(x), f_n(x)) + d'(f_n(x), f_n(y)) + d'(f_n(y), f(y)) < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon.$$ □
8.2.1
Note that an analogous proposition also holds for a topological space $(X, \tau)$ instead of a metric one. In the proof replace the requirement of $\delta$ by a neighborhood $U$ of $x$ such that $f_n[U] \subseteq \Omega(f_n(x), \frac{\varepsilon}{3})$ and use for $y \in U$ the triangle inequality as before.
8.3 Corollary. Let $f_n : (X, d) \to \mathbb{R}$ be continuous functions, let $\sum a_n$ be a convergent series of real numbers, and let for every $n$ and every $x$, $|f_n(x)| \le a_n$. Then $g_n(x) = \sum_{k=1}^{n} f_k(x)$ uniformly converge to $\sum_{k=1}^{\infty} f_k(x)$ and hence $g = (x \mapsto \sum_{k=1}^{\infty} f_k(x))$ is a continuous function.
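A small numerical illustration of 8.3 may help. The sketch below takes the hypothetical example $f_k(x) = \sin(kx)/k^2$ with $a_k = 1/k^2$ (this series, the grid and the helper names are assumptions chosen for the example, not taken from the text) and compares the largest observed gap between two partial sums with the corresponding tail of $\sum a_k$, which bounds it independently of $x$.

```python
import math

def partial_sum(x, n):
    """g_n(x) = sum_{k=1}^n sin(k x) / k^2."""
    return sum(math.sin(k * x) / k**2 for k in range(1, n + 1))

def tail_bound(n, N):
    """sum_{k=n+1}^N a_k with a_k = 1/k^2."""
    return sum(1.0 / k**2 for k in range(n + 1, N + 1))

grid = [2 * math.pi * i / 200 for i in range(201)]   # sample points in [0, 2*pi]
N = 400
gN = [partial_sum(x, N) for x in grid]
for n in (5, 20, 80):
    gap = max(abs(partial_sum(x, n) - g) for x, g in zip(grid, gN))
    print(n, gap, tail_bound(n, N))   # the gap never exceeds the tail
```

The bound on the right does not depend on $x$, which is exactly what makes the convergence uniform.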
8.4 Lemma. Let $A, B$ be disjoint closed subsets of a metric space $(X, d)$ and let $\alpha, \beta$ be real numbers. Then there is a continuous function
$$\varphi = \Phi(A, B; \alpha, \beta) : X \to \mathbb{R}$$
such that
$$\varphi[A] \subseteq \{\alpha\},\quad \varphi[B] \subseteq \{\beta\} \quad\text{and}\quad \min\{\alpha, \beta\} \le \varphi(x) \le \max\{\alpha, \beta\}. \qquad (\Phi)$$
Proof. Set
$$\varphi(x) = \alpha + (\beta - \alpha)\,\frac{d(x, A)}{d(x, A) + d(x, B)}.$$
This definition is correct: $d(x, A) + d(x, B) = 0$ yields $d(x, A) = d(x, B) = 0$ and by closedness $x \in A$ and $x \in B$; but $A$ and $B$ are disjoint.
Furthermore, the function $x \mapsto d(x, C)$ is continuous (by the triangle inequality, $d(y, C) \le d(x, C) + d(x, y)$ and hence $|d(x, C) - d(y, C)| \le d(x, y)$) so that $\varphi$, obtained by arithmetic operations from continuous functions, is continuous as well.
The properties listed in $(\Phi)$ are obvious. □
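The formula in the proof is easy to evaluate directly. As a minimal sketch, assume $X = \mathbb{R}$ and let $A$, $B$ be finite unions of closed intervals; the helper names `dist_to_set` and `phi` and the sample sets are hypothetical choices for this illustration.

```python
def dist_to_set(x, intervals):
    """Distance from x to a finite union of closed intervals [lo, hi] in R."""
    return min(max(lo - x, 0.0, x - hi) for lo, hi in intervals)

def phi(x, A, B, alpha, beta):
    """phi(x) = alpha + (beta - alpha) * d(x, A) / (d(x, A) + d(x, B))."""
    da, db = dist_to_set(x, A), dist_to_set(x, B)
    # da + db > 0 because A and B are assumed disjoint and closed
    return alpha + (beta - alpha) * da / (da + db)

A = [(-2.0, -1.0)]
B = [(1.0, 3.0)]
for x in (-1.5, -1.0, 0.0, 0.5, 1.0, 2.0):
    print(x, phi(x, A, B, -1 / 3, 1 / 3))
```

On $A$ the value is $\alpha$, on $B$ it is $\beta$, and in between it interpolates continuously according to the relative distances.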
8.5 Theorem. (Tietze) Let $A$ be a closed subspace of a metric space $X$ and let $J$ be a compact interval in $\mathbb{R}$. Then each continuous mapping $f : A \to J$ can be extended to a continuous $g : X \to J$ (that is, there is a continuous $g$ such that $g|_A = f$).
Proof. For a degenerate interval $\langle a, a\rangle$ the statement is trivial and all the other compact intervals are homeomorphic; if the statement holds for $J_1$ and if $h : J \to J_1$ is a homeomorphism we can extend, for $f : A \to J$, the map $hf$ to a $\bar{g} : X \to J_1$ and then take $g = h^{-1}\bar{g}$. Thus we can choose the $J$ arbitrarily. For our purposes, $J = \langle -1, 1\rangle$ will be particularly convenient.
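Set $A_1 = f^{-1}[\langle -1, -\frac{1}{3}\rangle]$ and $B_1 = f^{-1}[\langle \frac{1}{3}, 1\rangle]$ and consider
$$\varphi_1 = \Phi(A_1, B_1; -\tfrac{1}{3}, \tfrac{1}{3}).$$
We obviously have
$$\forall x \in A,\quad |f(x) - \varphi_1(x)| \le \frac{2}{3}.$$
Set $f_0 = f$ and $f_1 = f - \varphi_1$.
Suppose we already have continuous
$$f = f_0, f_1, \dots, f_n : A \to \langle -1, 1\rangle \quad\text{and}\quad \varphi_1, \varphi_2, \dots, \varphi_n : X \to \langle -1, 1\rangle$$
such that for all $k = 1, \dots, n$,
$$|\varphi_k(x)| \le \frac{2^{k-1}}{3^k}, \quad f_k(x) = f_{k-1}(x) - \varphi_k(x) \quad\text{and}\quad |f_k(x)| \le \frac{2^k}{3^k}. \qquad (*)$$
Then set
$$A_{n+1} = f_n^{-1}\Big[\Big\langle -\frac{2^n}{3^n}, -\frac{2^n}{3^{n+1}}\Big\rangle\Big], \quad B_{n+1} = f_n^{-1}\Big[\Big\langle \frac{2^n}{3^{n+1}}, \frac{2^n}{3^n}\Big\rangle\Big],$$
$$\varphi_{n+1} = \Phi\Big(A_{n+1}, B_{n+1}; -\frac{2^n}{3^{n+1}}, \frac{2^n}{3^{n+1}}\Big) \quad\text{and}\quad f_{n+1} = f_n - \varphi_{n+1}.$$
Thus we obtain sequences of continuous functions $\varphi_1, \varphi_2, \dots, \varphi_k, \dots$ and $f = f_0, f_1, \dots, f_k, \dots$ satisfying (*) for all $k$. By 8.3, we have a continuous function $g = (x \mapsto \sum_{k=1}^{\infty} \varphi_k(x)) : X \to \mathbb{R}$ and since $|g(x)| \le \sum_{k=1}^{\infty} \frac{2^{k-1}}{3^k} = 1$, we can view it as a continuous function
$$g : X \to \langle -1, 1\rangle.$$
Now let $x \in A$. We have
$$f(x) = \varphi_1(x) + f_1(x) = \varphi_1(x) + \varphi_2(x) + f_2(x) = \cdots = \varphi_1(x) + \cdots + \varphi_n(x) + f_n(x)$$
and since $\lim_n f_n(x) = 0$ we conclude that $f(x) = g(x)$. □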
8.5.1 Theorem. (Tietze's Real Line Theorem) Let $A$ be a closed subspace of a metric space $X$. Then each continuous mapping $f : A \to \mathbb{R}$ can be extended to a continuous $g : X \to \mathbb{R}$.
Proof. We can replace $\mathbb{R}$ by any space homeomorphic with $\mathbb{R}$ (recall the first paragraph of the previous proof). We will take the open interval $(-1, 1)$ instead and extend a map $f : A \to (-1, 1)$.
By 8.5, $f$ can be extended to a $\bar{g} : X \to \langle -1, 1\rangle$. Such a $\bar{g}$ can, however, reach the values $-1$ or $1$ and hence is not an extension as desired. To remedy the situation, consider $B = \bar{g}^{-1}[\{-1, 1\}]$ which is a closed set disjoint with $A$, consider the $\varphi = \Phi(A, B; 1, 0)$ from 8.4 (so that $\varphi$ equals 1 on $A$ and 0 on $B$), and define
$$g(x) = \bar{g}(x)\,\varphi(x).$$
Now we have $f(x) = \bar{g}(x) = g(x)$ for $x \in A$, and $|g(x)| < 1$ for all $x \in X$: if $\bar{g}(x) = 1$ or $-1$ then $\varphi(x) = 0$.
8.5.2
A subspace $R$ of a space $Y$ is said to be a retract of $Y$ if there exists a continuous $r : Y \to R$ such that $r(x) = x$ for all $x \in R$.
A metric space $Y$ is injective if for every metric space $X$ and closed $A \subseteq X$, each continuous $f : A \to Y$ can be extended to a continuous $g : X \to Y$. (Thus, we have learned above that $\mathbb{R}$ and any compact interval are injective spaces.)
Theorem. Every retract of a Euclidean space is injective.
Proof. First we will prove that a Euclidean space itself is injective. Consider it as the product
$$\mathbb{R}^m = \mathbb{R} \times \cdots \times \mathbb{R} \quad (m \text{ times})$$
with the projections $p_j((x_1, \dots, x_m)) = x_j$. Let $f : A \to \mathbb{R}^m$ be a continuous mapping. Then we have by 8.5.1 continuous $g_j : X \to \mathbb{R}$ such that $g_j|_A = p_j f$. By 2.2.2 we have the continuous $g = (x \mapsto (g_1(x), \dots, g_m(x))) : X \to \mathbb{R}^m$ and for $x \in A$ we obtain $g(x) = (p_1 f(x), \dots, p_m f(x)) = f(x)$.
Now let $Y$ be a retract of $\mathbb{R}^m$ with a retraction $r : \mathbb{R}^m \to Y$ and an inclusion map $j : Y \to \mathbb{R}^m$ (thus, $rj = \mathrm{id}$). Now if $f : A \to Y$ (or, rather, $jf : A \to \mathbb{R}^m$) is extended to $g : X \to \mathbb{R}^m$, the desired extension $\bar{g} : X \to Y$ is $rg$. □
9 Exercises
(1) Prove 1.4.1.

(2) Prove Proposition 1.5.1.
(3) Prove Observation 1.5.2.
(4) Prove that $f : (X, d) \to (Y, d')$ is continuous if and only if for each convergent sequence $(x_n)_n$ in $(X, d)$ the sequence $(f(x_n))_n$ is convergent (not specifying the limits).
(5) (a) Consider the set of real numbers $\mathbb{R}$. Prove that the function
$$d'(x, y) = |x^3 - y^3|$$
is a metric which is not equivalent to the metric $d$ given in example 1.1.1 (a).
(b) Prove that nevertheless, neighborhoods with respect to $d$ are the same as neighborhoods with respect to $d'$.
(6) Each $\Omega(x, \varepsilon)$ is open (use the triangle inequality).
(7) Let $Y$ be a subspace of $(X, d)$. $U$ is open (closed) in $Y$ if and only if there exists an open (closed) $V$ in $X$ such that $U = V \cap Y$. The closure of $A$ in $Y$ is $\bar{A} \cap Y$ where $\bar{A}$ is the closure in $X$ (discuss this from the various aspects of closure as presented in 3.3).
(8) Find an example when uniform continuity is not preserved under homeomor-
phism.
(9) Write down a definition of topology based on closed subsets of X .
(10) Check that the closures as defined in 4.1 and 4.2 satisfy the requirements of 4.3.
(11) Starting with open sets, define neighborhoods, and from them define closure
as indicated above. Prove that you get the same as the closure defined from
open sets directly.
(12) Start with open sets, define neighborhoods, and then open sets as in 4.1. Prove
that the open sets thus defined are precisely the same sets as the original ones
(note the role of the somewhat clumsy requirement (4) in 4.1).
(13) Preserving connectedness is not the same as continuity. Give an example of a map $f : X \to Y$ such that for every connected $S \subseteq X$, $f[S]$ is connected (with the induced topology from $Y$), but $f$ is not continuous. [Hint: Take $X = \mathbb{Q}$, the rational numbers.]
(14) Let $X \subseteq \mathbb{R}^2$ be the union of the set of all points $(0, y)$, $y \in \langle -1, 1\rangle$ and the set of all points $(x, \sin(1/x))$, $x > 0$, with the induced topology.
(a) Prove that $X \subseteq \mathbb{R}^2$ is a closed subset.
(b) Prove that $X$ is connected but not path-connected.
(15) Let $U \subseteq \mathbb{R}^n$ be a connected open set, and let $x, y \in U$. Prove that there exist $x_0, \dots, x_k \in U$, $x_0 = x$, $x_k = y$, such that the straight line segment connecting $x_t, x_{t+1}$ is contained in $U$. [Hint: mimic the proof of Proposition 5.3.1.]
(16) Path-connected components are defined the same way as connected com-
ponents in 5.4, with the word “connected” replaced by the word “path-
connected”. Are path-connected components necessarily closed? Prove or give
a counterexample.
(17) Check that convergence in the metric spaces defined in 1.1.1 (d), (e) is
precisely uniform convergence.
(18) Prove an analogue of Proposition 8.2 for uniform continuity instead of
continuity.
(19) Let $K$ be the set of all real numbers of the form $\sum_{k=1}^{\infty} a_k 3^{-k}$, where $a_k \in \{0, 2\}$. (This is called the Cantor set.) Prove that $K$ is compact. Prove that $K$ contains no compact interval with more than one point.
(20) Prove that a subspace of Rm is injective if and only if it is a retract.
3 Multivariable Differential Calculus
In this chapter, we will learn multivariable differential calculus. We will develop the
multivariable versions of the concept of a derivative, and prove the Implicit Function
Theorem. We will also learn how to use derivatives to find extremes of multivariable
functions.
To understand Multivariable Differential Calculus, one must be familiar with
Linear Algebra. We assume that the typical reader of this book will already have
had a course in linear algebra, but for convenience we review the basic concepts in
Appendices A and B. We refer periodically to results of these Appendices, and we
recommend that the reader who has seen some linear algebra simply start reading the
present chapter, and refer to these results in the Appendix as needed. Notationally,
the most important are the conventions in Sections 1.3 and 7.3 of Appendix A below: $\mathbb{R}^n$ will be the space of real $n$-dimensional column vectors (matrices of type $n \times 1$). To avoid awkward notation, however, we will usually write rows and decorate them with the superscript $^T$ which means transposition (Subsection 7.3 in Appendix A). Row or column vectors will be denoted by bold-faced letters, such as $\mathbf{v}$. The zero vector (origin) will be denoted by $\mathbf{o}$.
1 Real and vector functions of several variables
1.1
We will deal with real functions of several real variables, that is, mappings $f : D \to \mathbb{R}$ with a domain $D \subseteq \mathbb{R}^n$. Typically, $D$ will be open. Interchangeably with $f(\mathbf{x})$ where, in accordance with convention 7.3 of Appendix A, $\mathbf{x} = (x_1, \dots, x_n)^T$, we will also write $f(x_1, \dots, x_n)$. When $\mathbf{x} \in \mathbb{R}^m$, $\mathbf{y} \in \mathbb{R}^n$, notations such as $f(\mathbf{x}, \mathbf{y})$, $f(\mathbf{x}, y_1, \dots, y_n)$ will also be allowed for a function $f$ of $m + n$ variables.
Given such a function $f$, we will often be concerned with the associated functions of one variable
$$t \mapsto f(x_1, \dots, x_{k-1}, t, x_{k+1}, \dots, x_n), \quad x_j\ (j \ne k) \text{ fixed}. \qquad (1)$$
It is useful to realize right away that the study of an $f : D \to \mathbb{R}$ cannot be reduced to the system of all such functions of one variable. For instance, all of the functions (1) may be continuous while $f$ itself is not. See the following example. Set
$$f(x, y) = \begin{cases} \dfrac{(x - y)^2}{x^2 + y^2} & \text{for } (x, y) \ne (0, 0), \\[1ex] 1 & \text{for } (x, y) = (0, 0). \end{cases} \qquad (2)$$
Then each $f(a, -)$ and each $f(-, b)$ is continuous, but $f$ is not: the sequence $(\frac{1}{n}, \frac{1}{n})$ converges to $(0, 0)$ while $\lim f(\frac{1}{n}, \frac{1}{n}) = 0 \ne f(0, 0)$.
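The behaviour of example (2) can be checked by direct evaluation. The following short sketch (the sampling points $1/n$ are an arbitrary choice for illustration) evaluates $f$ along the two coordinate axes, where it is constantly 1, and along the diagonal, where it is constantly 0 away from the origin.

```python
def f(x, y):
    """The function of 1.1 (2): (x - y)^2 / (x^2 + y^2), and 1 at the origin."""
    if x == 0.0 and y == 0.0:
        return 1.0
    d = x - y
    return d * d / (x * x + y * y)

for n in (1, 10, 100, 1000):
    t = 1.0 / n
    # values on the x-axis, on the y-axis, and on the diagonal
    print(n, f(t, 0.0), f(0.0, t), f(t, t))
```

Approaching the origin along either axis gives the value $f(0, 0) = 1$, while approaching along the diagonal gives 0, so no single limit exists at the origin.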
1.2
Recall again Convention 1.3, 7.3 of Appendix A. It is important to note that a vector function
$$\mathbf{f} = (f_1, \dots, f_m)^T : D \to \mathbb{R}^m, \quad f_j : D \to \mathbb{R},$$
is continuous if and only if all the $f_j$ are continuous (recall Theorem 2.2.2 of Chapter 2).
1.3 Composition
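Vector functions $\mathbf{f} : D \to \mathbb{R}^m$, $D \subseteq \mathbb{R}^n$, and $\mathbf{g} : D' \to \mathbb{R}^k$, $D' \subseteq \mathbb{R}^m$, can be composed whenever $\mathbf{f}[D] \subseteq D'$, and we shall write
$$\mathbf{g} \circ \mathbf{f} : D \to \mathbb{R}^k \quad (\text{if there is no danger of confusion, } \mathbf{g}\mathbf{f} : D \to \mathbb{R}^k)$$
for the composition (sending $\mathbf{x}$ to $\mathbf{g}(\mathbf{f}(\mathbf{x}))$), without pedantically restricting $\mathbf{f}$ to a map $\mathbf{f}' : D \to D'$ first.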
2 Partial derivatives. Defining the existence of a total differential
2.1
Let $f : D \to \mathbb{R}$ be a real function of $n$ variables. The partial derivative of $f$ by $x_k$ (or, the $k$-th partial derivative) at the point $(x_1, \dots, x_n)$ is the (ordinary) derivative of the function of 1.1 (1), i.e. the limit
$$\lim_{h \to 0} \frac{f(x_1, \dots, x_{k-1}, x_k + h, x_{k+1}, \dots, x_n) - f(x_1, \dots, x_n)}{h}. \qquad (*)$$
The standard notation is
$$\frac{\partial f(x_1, \dots, x_n)}{\partial x_k} \quad\text{or}\quad \frac{\partial f}{\partial x_k}(x_1, \dots, x_n);$$
in case of multiple variables denoted by different letters, say for $f(x, y)$, we write, of course,
$$\frac{\partial f(x, y)}{\partial x} \quad\text{and}\quad \frac{\partial f(x, y)}{\partial y}, \quad\text{etc.}$$
This notation is slightly inconsistent: the $x_k$ in the "denominator" $\partial x_k$ just indicates focusing on the $k$-th variable while the $x_n$ in the $f(x_1, \dots, x_n)$ in the "numerator" refers to an actual value of the argument. When confusion is possible, one can write more specifically
$$\left.\frac{\partial f(x_1, \dots, x_n)}{\partial x_k}\right|_{(x_1, \dots, x_n) = (a_1, \dots, a_n)}.$$
However, we will use this notation only occasionally.

Example.
$$\frac{\partial (x^2 + e^{xy + \sin(y)})}{\partial x} = 2x + y e^{xy + \sin(y)},$$
$$\frac{\partial (x^2 + e^{xy + \sin(y)})}{\partial y} = (x + \cos(y)) e^{xy + \sin(y)}.$$
2.1.1
It can happen (and typically it does) that partial derivatives $\frac{\partial f(x_1, \dots, x_n)}{\partial x_k}$ exist for all $(x_1, \dots, x_n)$ in some domain $D' \subseteq D$. In such case, we obtain a function
$$\frac{\partial f}{\partial x_k} : D' \to \mathbb{R}.$$
It is usually obvious from the context whether, speaking of a partial derivative, we have in mind a function or just a number, as in the definition 2.1, (*) above.
2.2
We shall write
$$\|\mathbf{x}\| = \max_i |x_i|$$
for the distance of $\mathbf{x}$ from $\mathbf{o}$ (for our purposes we could have taken any of the equivalent distances (recall Subsection 2.2 of Chapter 2) such as the Euclidean norm $\sqrt{\mathbf{x}\cdot\mathbf{x}}$ where $\mathbf{x}\cdot\mathbf{x}$ is the dot product (see Appendix A, 4.3); our choice is perhaps the most convenient technically because of its simple behavior with respect to products).
We say that $f(x_1, \dots, x_n)$ has a total differential at a point $\mathbf{a} = (a_1, \dots, a_n)$ if there exists a function $\mu$ continuous in a neighborhood $U$ of $\mathbf{o}$ which satisfies $\mu(\mathbf{o}) = 0$ (in an alternate but equivalent formulation, one requires $\mu$ to be defined in $U \setminus \{\mathbf{o}\}$ and satisfy $\lim_{\mathbf{h} \to \mathbf{o}} \mu(\mathbf{h}) = 0$), and numbers $A_1, \dots, A_n$ such that
$$f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) = \sum_{k=1}^{n} A_k h_k + \|\mathbf{h}\|\,\mu(\mathbf{h}) \qquad (2.2.1)$$
(using the dot product, we may write $f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) = \mathbf{A}\cdot\mathbf{h} + \|\mathbf{h}\|\,\mu(\mathbf{h})$, where $\mathbf{A} = (A_1, \dots, A_n)$).
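To see the definition at work, one can solve (2.2.1) for the remainder term and watch it vanish. The following is a sketch under assumed choices: the polynomial $f(x, y) = x^2 y + 3y$, the point $\mathbf{a} = (1, 2)$ and the direction $(1, -2)$ are hypothetical examples, and the numbers $A_1 = A_2 = 4$ are its partial derivatives $2xy$ and $x^2 + 3$ at $\mathbf{a}$.

```python
def f(x, y):
    return x * x * y + 3.0 * y

a = (1.0, 2.0)
A = (4.0, 4.0)   # (2xy, x^2 + 3) evaluated at a = (1, 2)

def mu(h):
    """Remainder of (2.2.1) divided by the max-norm of h (h must not be o)."""
    norm = max(abs(h[0]), abs(h[1]))
    linear = A[0] * h[0] + A[1] * h[1]
    return (f(a[0] + h[0], a[1] + h[1]) - f(*a) - linear) / norm

for t in (1e-1, 1e-2, 1e-3):
    print(t, mu((t, -2.0 * t)))   # shrinks proportionally to t
```

For this particular choice a short computation gives the remainder in closed form as $-t - t^2$, so it tends to 0 as $\mathbf{h} \to \mathbf{o}$ along the chosen direction.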
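2.3 Proposition. Let a function $f$ have a total differential at a point $\mathbf{a}$, as in the definition above. Then
1. $f$ is continuous in $\mathbf{a}$.
2. $f$ has all the partial derivatives in $\mathbf{a}$ and one has
$$\frac{\partial f(\mathbf{a})}{\partial x_k} = A_k.$$
Proof. 1. Writing $\mathbf{x}$ for the point $\mathbf{a}$ and taking $\mathbf{h} = \mathbf{y} - \mathbf{x}$ in (2.2.1), we have
$$|f(\mathbf{y}) - f(\mathbf{x})| \le |\mathbf{A}\cdot(\mathbf{y} - \mathbf{x})| + |\mu(\mathbf{y} - \mathbf{x})|\,\|\mathbf{y} - \mathbf{x}\|$$
and the limit of the right-hand side for $\mathbf{y} \to \mathbf{x}$ is clearly 0.
2. We have
$$\frac{1}{h}\big(f(x_1, \dots, x_{k-1}, x_k + h, x_{k+1}, \dots, x_n) - f(x_1, \dots, x_n)\big) = A_k + \mu((0, \dots, 0, h, 0, \dots, 0))\,\frac{\|(0, \dots, h, \dots, 0)\|}{h},$$
and the limit of the right-hand side is clearly $A_k$. □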
2.4 Directional derivatives
It may now seem silly to prefer the basis vectors in $\mathbb{R}^n$ when defining partial derivatives. In effect, for any vector $\mathbf{v} \in \mathbb{R}^n$, one can define a directional derivative of $f$ by $\mathbf{v}$ by
$$\partial_{\mathbf{v}} f(\mathbf{x}) = \lim_{h \to 0} \frac{f(\mathbf{x} + h\mathbf{v}) - f(\mathbf{x})}{h}.$$
(Caution: Some calculus textbooks use a different convention, calling the $\partial_{\mathbf{v}/\|\mathbf{v}\|}$ the directional derivative when $\mathbf{v} \ne \mathbf{o}$, the point being that it only depends on the "direction" of $\mathbf{v}$. The notion as we defined it, without requiring any assumption on $\mathbf{v}$, and moreover linear in $\mathbf{v}$, is much more natural for use in geometry, as we will see later.) In any case, the following fact is proved precisely in the same way as Proposition 2.3:
Proposition. If a function $f$ has a total differential at a point $\mathbf{a}$, and $\mathbf{v} \in \mathbb{R}^n$ is any vector, then the corresponding directional derivative exists and one has
$$\partial_{\mathbf{v}} f(\mathbf{a}) = \sum_{k=1}^{n} A_k v_k.$$
2.5
The formula
$$f(a_1 + h_1, \dots, a_n + h_n) - f(a_1, \dots, a_n) = f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) = \sum_{k=1}^{n} A_k h_k + \|\mathbf{h}\|\,\mu(\mathbf{h})$$
may be interpreted as saying that in a small neighborhood of $\mathbf{a}$, the function $f$ is well approximated by the affine function (see Appendix A, 5.9)
$$L(x_1, \dots, x_n) = f(a_1, \dots, a_n) + \sum A_k (x_k - a_k):$$
by the required properties of $\mu$, the error term is much smaller than the difference $\mathbf{x} - \mathbf{a}$.
In case of just one variable, there is no distinction between having a derivative at $a$ and having a total differential at the same point. In case of more than one variable, however, the difference between having all partial derivatives and having a total differential at a point is tremendous.
A function $f$ may have all partial derivatives in an open set without even being continuous there: in the example 1.1 (2), both partial derivatives exist everywhere. If we consider a single point, there are even much simpler examples, say the function $f$ defined by $f(x, 0) = f(0, y) = 0$ for all $x, y$, and $f(x, y) = 1$ otherwise. Then both $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ still exist at the point $(0, 0)$.
What is happening geometrically is this: If we think of a function $f$ as represented by its "graph", the hypersurface
$$S = \{(x_1, \dots, x_n, f(x_1, \dots, x_n)) \mid (x_1, \dots, x_n) \in D\} \subseteq \mathbb{R}^{n+1}, \qquad (*)$$
the partial derivatives describe just the tangent lines in the directions of the coordinate axes, while a total differential guarantees the existence of an entire tangent hyperplane.
Possessing continuous partial derivatives is another matter, though.
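2.6 Theorem. Let $f$ have continuous partial derivatives in a neighborhood of a point $\mathbf{a}$. Then $f$ has a total differential at $\mathbf{a}$.
Proof. Let
$$\mathbf{h}^{(0)} = \mathbf{h},\quad \mathbf{h}^{(1)} = (0, h_2, \dots, h_n),\quad \mathbf{h}^{(2)} = (0, 0, h_3, \dots, h_n) \quad\text{etc.}$$
(so that $\mathbf{h}^{(n)} = \mathbf{o}$). Then we have
$$f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) = \sum_{k=1}^{n} \big(f(\mathbf{a} + \mathbf{h}^{(k-1)}) - f(\mathbf{a} + \mathbf{h}^{(k)})\big) =: M.$$
The points $\mathbf{a} + \mathbf{h}^{(k-1)}$ and $\mathbf{a} + \mathbf{h}^{(k)}$ differ in the $k$-th coordinate only. Hence, by Lagrange's Theorem, there are $0 \le \theta_k \le 1$ such that, writing $\mathbf{c}_k = (a_1, \dots, a_{k-1}, a_k + \theta_k h_k, a_{k+1} + h_{k+1}, \dots, a_n + h_n)$,
$$f(\mathbf{a} + \mathbf{h}^{(k-1)}) - f(\mathbf{a} + \mathbf{h}^{(k)}) = \frac{\partial f(\mathbf{c}_k)}{\partial x_k}\, h_k$$
and hence we can proceed with
$$M = \sum \frac{\partial f(\mathbf{c}_k)}{\partial x_k}\, h_k = \sum \frac{\partial f(\mathbf{a})}{\partial x_k}\, h_k + \sum \Big(\frac{\partial f(\mathbf{c}_k)}{\partial x_k} - \frac{\partial f(\mathbf{a})}{\partial x_k}\Big) h_k = \sum \frac{\partial f(\mathbf{a})}{\partial x_k}\, h_k + \|\mathbf{h}\| \sum \Big(\frac{\partial f(\mathbf{c}_k)}{\partial x_k} - \frac{\partial f(\mathbf{a})}{\partial x_k}\Big) \frac{h_k}{\|\mathbf{h}\|}.$$
Set
$$\mu(\mathbf{h}) = \sum \Big(\frac{\partial f(\mathbf{c}_k)}{\partial x_k} - \frac{\partial f(\mathbf{a})}{\partial x_k}\Big) \frac{h_k}{\|\mathbf{h}\|}.$$
Since $\left|\frac{h_k}{\|\mathbf{h}\|}\right| \le 1$, since $\mathbf{c}_k \to \mathbf{a}$ as $\mathbf{h} \to \mathbf{o}$, and since the functions $\frac{\partial f}{\partial x_k}$ are continuous, $\lim_{\mathbf{h} \to \mathbf{o}} \mu(\mathbf{h}) = 0$. □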
2.7
Thus, focusing on an open set in the domain of a function, we may write schematically
$$\text{continuous PD} \Rightarrow \text{TD} \Rightarrow \text{PD}$$
(where PD stands for all partial derivatives and TD for total differential). Note that
neither of the implications can be reversed. We have already discussed the second
one; for the first one, recall that for functions of one variable the existence of a
derivative at a point coincides with the existence of a total differential there, but a
derivative is not necessarily a continuous function even when it exists at every point
of an open set.
In the rest of this chapter, simply assuming that partial derivatives exist will
almost never be enough. Sometimes the existence of the total differential will
suffice, but more often than not we will assume the existence of continuous partial
derivatives.
3 Composition of functions and the chain rule
3.1 Theorem. Let $f(\mathbf{x})$ have a total differential in a point $\mathbf{a}$. Let real functions $g_k(t)$ have derivatives at a point $b$ and let $g_k(b) = a_k$ for all $k = 1, \dots, n$. Put
$$F(t) = f(\mathbf{g}(t)) = f(g_1(t), \dots, g_n(t)).$$
Then $F$ has a derivative in $b$, and
$$F'(b) = \sum_{k=1}^{n} \frac{\partial f(\mathbf{a})}{\partial x_k}\, g_k'(b).$$
Proof. Consider the formula 2.2.1. Applying it to our function $f$, we get
$$\frac{1}{h}(F(b + h) - F(b)) = \frac{1}{h}\big(f(\mathbf{g}(b + h)) - f(\mathbf{g}(b))\big) = \frac{1}{h}\big(f(\mathbf{g}(b) + (\mathbf{g}(b + h) - \mathbf{g}(b))) - f(\mathbf{g}(b))\big)$$
$$= \sum_{k=1}^{n} A_k\, \frac{g_k(b + h) - g_k(b)}{h} + \mu(\mathbf{g}(b + h) - \mathbf{g}(b))\, \max_k \frac{|g_k(b + h) - g_k(b)|}{h}.$$
Now $\lim_{h \to 0} \mu(\mathbf{g}(b + h) - \mathbf{g}(b)) = 0$ since the functions $g_k$ are continuous at $b$, and $\max_k \frac{|g_k(b + h) - g_k(b)|}{|h|}$ is bounded in a sufficiently small neighborhood of 0, since the $g_k$ have derivatives. Thus, the limit of the last summand is zero and we have
$$\lim_{h \to 0} \frac{1}{h}(F(b + h) - F(b)) = \lim_{h \to 0} \sum_{k=1}^{n} A_k\, \frac{g_k(b + h) - g_k(b)}{h} = \sum_{k=1}^{n} A_k \lim_{h \to 0} \frac{g_k(b + h) - g_k(b)}{h} = \sum_{k=1}^{n} \frac{\partial f(\mathbf{a})}{\partial x_k}\, g_k'(b).$$ □
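3.1.1 Corollary. Let $f(\mathbf{x})$ have a total differential at a point $\mathbf{a}$. Let real functions $g_k(t_1, \dots, t_r)$ have partial derivatives at $\mathbf{b} = (b_1, \dots, b_r)$ and let $g_k(\mathbf{b}) = a_k$ for all $k = 1, \dots, n$. Then
$$(f \circ \mathbf{g})(t_1, \dots, t_r) = f(\mathbf{g}(\mathbf{t})) = f(g_1(\mathbf{t}), \dots, g_n(\mathbf{t}))$$
has all the partial derivatives at $\mathbf{b}$, and
$$\frac{\partial (f \circ \mathbf{g})(\mathbf{b})}{\partial t_j} = \sum_{k=1}^{n} \frac{\partial f(\mathbf{a})}{\partial x_k}\, \frac{\partial g_k(\mathbf{b})}{\partial t_j}.$$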
3.1.2 Remark
The assumption of the existence of total differential in 3.1 is essential and it is easy to see why. Recall the geometric intuition from 2.5. The $n$-tuple of functions $\mathbf{g} = (g_1, \dots, g_n)$ represents a parametrized curve in $D$, and $f \circ \mathbf{g}$ is then a curve on the hypersurface $S$ of 2.5, (*). The partial derivatives of $f$, or the tangent lines of $S$ in the directions of the coordinate axes, have in general nothing to do with the behaviour on this curve.
3.2 What is the total differential?
The perceptive reader has noticed that in fact, while we defined what it means that a function has a total differential, we have not yet defined the total differential as an object. To remedy this, let us go one step further and consider in 3.1.1 a mapping $\mathbf{f} = (f_1, \dots, f_s)^T : D \to \mathbb{R}^s$. Take its composition $\mathbf{f} \circ \mathbf{g}$ with a mapping $\mathbf{g} : D' \to \mathbb{R}^n$ (recall the convention in 1.3). Then we get
$$\frac{\partial (f_i \circ \mathbf{g})}{\partial t_j} = \sum_k \frac{\partial f_i}{\partial x_k}\, \frac{\partial g_k}{\partial t_j}. \qquad (3.2.1)$$
This formula is often referred to as the chain rule. It certainly has not escaped the reader's attention that the right-hand side is the product of matrices
$$\Big(\frac{\partial f_i}{\partial x_k}\Big)_{i,k} \Big(\frac{\partial g_k}{\partial t_j}\Big)_{k,j}.$$
Recall that the product of matrices is the matrix of the composition of the linear maps the matrices represent (see Theorem 7.6 of Appendix A).
In view of this, it is natural to define the total differential $D\mathbf{f}_{\mathbf{x}_0} : \mathbb{R}^n \to \mathbb{R}^s$ of the map $\mathbf{f}$ at a point $\mathbf{x}_0 \in D$ as the linear map
$$f_A : \mathbb{R}^n \to \mathbb{R}^s$$
associated with the matrix
$$A = \left.\Big(\frac{\partial f_i(\mathbf{x})}{\partial x_j}\Big)_{i,j}\right|_{\mathbf{x}_0}.$$
For the purposes of practical calculation, in fact, the map $D\mathbf{f}_{\mathbf{x}_0}$ and its associated matrix $A$ are often identified.
The chain rule can be then stated in the form
$$D(\mathbf{f} \circ \mathbf{g})_{\mathbf{v}_0} = D(\mathbf{f})_{\mathbf{g}(\mathbf{v}_0)} \circ D(\mathbf{g})_{\mathbf{v}_0}.$$
Compare it with the one variable rule
$$(f \circ g)'(t) = f'(g(t))\, g'(t);$$
for $1 \times 1$ matrices we of course have $(a)(b) = (ab)$.
Note that additionally, the total differential at this point can be used to define an affine approximation $\mathbf{f}^{\mathrm{aff}}_{\mathbf{x}_0}$ of the map $\mathbf{f}$ at the point $\mathbf{x}_0$ (i.e. an affine map approximating $\mathbf{f}$ near $\mathbf{x}_0$, see Appendix A, 5.9):
$$\mathbf{f}^{\mathrm{aff}}_{\mathbf{x}_0}(\mathbf{x}) = \mathbf{f}(\mathbf{x}_0) + D\mathbf{f}_{\mathbf{x}_0}(\mathbf{x} - \mathbf{x}_0).$$
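The matrix form of the chain rule lends itself to a numerical sanity check. In the sketch below the maps `f` and `g`, the point `v0` and the helpers `jacobian` and `matmul` are all hypothetical choices made for illustration; the matrices of partial derivatives are approximated by symmetric difference quotients rather than computed exactly.

```python
import math

def g(t):                      # an assumed sample map g : R^2 -> R^2
    return [t[0] * t[1], t[0] + math.sin(t[1])]

def f(x):                      # an assumed sample map f : R^2 -> R^2
    return [math.exp(x[0]) + x[1], x[0] * x[1] ** 2]

def jacobian(F, p, h=1e-6):
    """Central-difference approximation of the matrix (dF_i/dp_j)_{i,j} at p."""
    cols = []
    for j in range(len(p)):
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        cols.append([(u - v) / (2 * h) for u, v in zip(F(plus), F(minus))])
    # cols[j][i] approximates dF_i/dp_j; transpose so that rows are indexed by i
    return [[cols[j][i] for j in range(len(p))] for i in range(len(cols[0]))]

def matmul(M, N):
    """Product of two matrices given as lists of rows."""
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

v0 = [0.5, 1.2]
lhs = jacobian(lambda t: f(g(t)), v0)                 # matrix of D(f o g) at v0
rhs = matmul(jacobian(f, g(v0)), jacobian(g, v0))     # Df at g(v0) times Dg at v0
print(lhs)
print(rhs)
```

Up to the discretization error of the difference quotients, the two printed matrices agree, in line with $D(\mathbf{f} \circ \mathbf{g})_{\mathbf{v}_0} = D(\mathbf{f})_{\mathbf{g}(\mathbf{v}_0)} \circ D(\mathbf{g})_{\mathbf{v}_0}$.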
3.3 Lagrange’s Formula in several variables
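Recall that a subset $D \subseteq \mathbb{R}^n$ is said to be convex if
\[
\mathbf{x}, \mathbf{y} \in D \;\Rightarrow\; \forall t,\ 0 \le t \le 1,\quad (1 - t)\mathbf{x} + t\mathbf{y} = \mathbf{x} + t(\mathbf{y} - \mathbf{x}) \in D.
\]
Proposition. Let a real function $f$ have continuous partial derivatives in a convex
open set $D \subseteq \mathbb{R}^n$. Then for any two points $\mathbf{x}, \mathbf{y} \in D$, there exists a $\theta$, $0 \le \theta \le 1$,
such that
\[
f(\mathbf{y}) - f(\mathbf{x}) = \sum_{j=1}^{n} \frac{\partial f(\mathbf{x} + \theta(\mathbf{y} - \mathbf{x}))}{\partial x_j} (y_j - x_j).
\]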
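Proof. Set $F(t) = f(\mathbf{x} + t(\mathbf{y} - \mathbf{x}))$. Then $F = f \circ \mathbf{g}$ where $\mathbf{g}$ is defined by $g_j(t) = x_j + t(y_j - x_j)$, and
\[
F'(t) = \sum_{j=1}^{n} \frac{\partial f(\mathbf{g}(t))}{\partial x_j}\, g_j'(t) = \sum_{j=1}^{n} \frac{\partial f(\mathbf{g}(t))}{\partial x_j} (y_j - x_j).
\]
Hence by Lagrange’s formula in one variable,
\[
f(\mathbf{y}) - f(\mathbf{x}) = F(1) - F(0) = F'(\theta),
\]
which yields the statement of the proposition. $\square$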
Remark. The formula is often used in the form
\[
f(\mathbf{x} + \mathbf{h}) - f(\mathbf{x}) = \sum_{j=1}^{n} \frac{\partial f(\mathbf{x} + \theta \mathbf{h})}{\partial x_j}\, h_j.
\]
Compare this with the formula for total differential.
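To see the intermediate point $\theta$ concretely, here is a small sketch under stated assumptions: the function `f`, the endpoints `x`, `y` and the use of bisection are our own illustrative choices, and bisection works here only because the quantity `gap` happens to change sign on $[0,1]$ for this particular example.

```python
# Locating theta in Lagrange's formula f(y) - f(x) = sum_j df/dx_j(x + theta(y - x)) (y_j - x_j).
# f, x and y are illustrative assumptions, not taken from the text.

def f(p):
    x1, x2 = p
    return x1 * x1 * x2 + x2 ** 3

def grad_f(p):
    x1, x2 = p
    return [2 * x1 * x2, x1 * x1 + 3 * x2 * x2]

x = [1.0, 0.0]
y = [2.0, 3.0]
d = [y[j] - x[j] for j in range(2)]
total = f(y) - f(x)                      # left-hand side of the formula

def gap(theta):
    """Right-hand side at theta minus the left-hand side."""
    point = [x[j] + theta * d[j] for j in range(2)]
    return sum(gj * dj for gj, dj in zip(grad_f(point), d)) - total

lo, hi = 0.0, 1.0                        # gap(0) < 0 < gap(1) for this example
for _ in range(60):                      # bisection on the continuous function gap
    mid = (lo + hi) / 2
    if gap(mid) < 0:
        lo = mid
    else:
        hi = mid
theta = (lo + hi) / 2
print(theta, gap(theta))
```

The proposition only asserts that some such $\theta$ exists; the search above finds one for this example and says nothing about uniqueness in general.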
3.4
It may be of interest that the formula for the derivative of a product of single-variable
functions is a consequence of the chain rule.
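Set $h(u, v) = u \cdot v$ so that $\frac{\partial h}{\partial u} = v$ and $\frac{\partial h}{\partial v} = u$. Then
\[
(f(x) g(x))' = \frac{\partial h(f(x), g(x))}{\partial u}\, f'(x) + \frac{\partial h(f(x), g(x))}{\partial v}\, g'(x) = g(x) f'(x) + f(x) g'(x).
\]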
4 Partial derivatives of higher order. Interchangeability
4.1
Similarly to the second derivative of a function of one variable, we may consider
partial derivatives of a partial derivative, i.e. of a function of the form $g(\mathbf{x}) = \frac{\partial f(\mathbf{x})}{\partial x_k}$,
\[
\frac{\partial g(\mathbf{x})}{\partial x_l}.
\]
The result, if it exists, is then denoted by
\[
\frac{\partial^2 f(\mathbf{x})}{\partial x_k \partial x_l}.
\]
More generally, we may iterate this process to obtain
\[
\frac{\partial^r f(\mathbf{x})}{\partial x_{k_1} \partial x_{k_2} \dots \partial x_{k_r}}.
\]
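These functions, when they exist, are called partial derivatives of order $r$.
For example,
\[
\frac{\partial^3 f(x, y, z)}{\partial x\, \partial y\, \partial z} \quad\text{and}\quad \frac{\partial^3 f(x, y, z)}{\partial x\, \partial x\, \partial x}
\]
are derivatives of third order (even though in the first case, we have taken a partial
derivative by each variable only once).
To simplify notation, taking partial derivatives by the same variable more than
once consecutively may be indicated by an exponent, e.g.,
\[
\frac{\partial^5 f(x, y)}{\partial x^2 \partial y^3} = \frac{\partial^5 f(x, y)}{\partial x\, \partial x\, \partial y\, \partial y\, \partial y},
\]
\[
\frac{\partial^5 f(x, y)}{\partial x^2 \partial y^2 \partial x} = \frac{\partial^5 f(x, y)}{\partial x\, \partial x\, \partial y\, \partial y\, \partial x}.
\]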
4.2
Consider the function
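\[
f(x, y) = x \sin(y^2 + x).
\]
Compute
\[
\frac{\partial f(x, y)}{\partial x} = \sin(y^2 + x) + x \cos(y^2 + x) \quad\text{and}\quad \frac{\partial f(x, y)}{\partial y} = 2xy \cos(y^2 + x).
\]
Computing the second-order derivatives, we obtain
\[
\frac{\partial^2 f}{\partial x \partial y} = 2y \cos(y^2 + x) - 2xy \sin(y^2 + x) = \frac{\partial^2 f}{\partial y \partial x}.
\]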
Whether it is surprising or not, it suggests a conjecture that higher order partial

derivatives do not depend on the order of differentiation. In effect, this is true –
provided all the derivatives in question are continuous.
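The computation in this example can be cross-checked numerically. The sketch below is only an illustration: the step size `h`, the sample point `(x0, y0)` and the helper names are our assumptions. Note that nested central differences commute by construction, so the informative comparison is the one against the closed formula obtained above.

```python
# Finite-difference check of the mixed second derivatives of f(x, y) = x sin(y^2 + x).
import math

def f(x, y):
    return x * math.sin(y * y + x)

def d_dx(func, h=1e-4):
    """Return a function approximating the partial derivative of func in x."""
    return lambda x, y: (func(x + h, y) - func(x - h, y)) / (2 * h)

def d_dy(func, h=1e-4):
    """Return a function approximating the partial derivative of func in y."""
    return lambda x, y: (func(x, y + h) - func(x, y - h)) / (2 * h)

def exact_mixed(x, y):
    # Closed form derived in the text: 2y cos(y^2 + x) - 2xy sin(y^2 + x)
    return 2 * y * math.cos(y * y + x) - 2 * x * y * math.sin(y * y + x)

f_xy = d_dy(d_dx(f))   # differentiate in x first, then in y
f_yx = d_dx(d_dy(f))   # differentiate in y first, then in x

x0, y0 = 0.8, -1.3     # arbitrary sample point (assumption)
print(f_xy(x0, y0), f_yx(x0, y0), exact_mixed(x0, y0))
```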
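4.2.1 Proposition. Let $f(x, y)$ be a function such that the partial derivatives $\frac{\partial^2 f}{\partial x \partial y}$
and $\frac{\partial^2 f}{\partial y \partial x}$ are defined and continuous in a neighborhood of a point $(x, y)$. Then we
have
\[
\frac{\partial^2 f(x, y)}{\partial x \partial y} = \frac{\partial^2 f(x, y)}{\partial y \partial x}.
\]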
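Proof. Consider the function of a real variable $h$ defined by the formula
\[
F(h) = \frac{f(x + h, y + h) - f(x, y + h) - f(x + h, y) + f(x, y)}{h^2}.
\]
If we set
\[
\varphi_h(y) = f(x + h, y) - f(x, y) \quad\text{and}\quad \psi_k(x) = f(x, y + k) - f(x, y),
\]
we have
\[
F(h) = \frac{1}{h^2} \bigl( \varphi_h(y + h) - \varphi_h(y) \bigr) = \frac{1}{h^2} \bigl( \psi_h(x + h) - \psi_h(x) \bigr).
\]
Let us compute the first expression. The function $\varphi_h$, which is a function of one
variable $y$, has the derivative
\[
\varphi_h'(y) = \frac{\partial f(x + h, y)}{\partial y} - \frac{\partial f(x, y)}{\partial y}
\]
and hence by 3.3, we have
\[
F(h) = \frac{1}{h^2} \bigl( \varphi_h(y + h) - \varphi_h(y) \bigr) = \frac{1}{h}\, \varphi_h'(y + \theta_1 h)
= \frac{1}{h} \left( \frac{\partial f(x + h, y + \theta_1 h)}{\partial y} - \frac{\partial f(x, y + \theta_1 h)}{\partial y} \right).
\]
Using 3.3 again, we obtain
\[
F(h) = \frac{\partial}{\partial x} \left( \frac{\partial f}{\partial y} \right) (x + \theta_2 h,\ y + \theta_1 h) \tag{*}
\]
for some $\theta_1, \theta_2$ between 0 and 1.
Similarly, computing $\frac{1}{h^2} ( \psi_h(x + h) - \psi_h(x) )$, we obtain
\[
F(h) = \frac{\partial}{\partial y} \left( \frac{\partial f}{\partial x} \right) (x + \theta_4 h,\ y + \theta_3 h) \tag{**}
\]
for some $\theta_3, \theta_4$ between 0 and 1.
Now since both $\frac{\partial}{\partial y} ( \frac{\partial f}{\partial x} )$ and $\frac{\partial}{\partial x} ( \frac{\partial f}{\partial y} )$ are continuous at the point $(x, y)$, we can
compute $\lim_{h \to 0} F(h)$ from either of the formulas (*) or (**) and obtain
\[
\lim_{h \to 0} F(h) = \frac{\partial^2 f(x, y)}{\partial x \partial y} = \frac{\partial^2 f(x, y)}{\partial y \partial x}. \qquad \square
\]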
Remark. Look what happens: $F(h)$ (and its possible limit at 0) is an attempt
to compute the second partial derivative in one step. The continuity of $\frac{\partial^2 f}{\partial x \partial y}$ and
$\frac{\partial^2 f}{\partial y \partial x}$ makes sure that it is, in fact, possible.
4.3
Iterating the interchanges allowed by 4.2.1, we easily obtain, as a corollary,
Theorem. Let a function $f$ of $n$ variables possess continuous partial derivatives
up to the order $k$. Then the values of these derivatives depend only on the number of
times a partial derivative is taken in each of the individual variables $x_1, \dots, x_n$.
4.3.1
Thus, under the assumption of the theorem, we can write a general partial derivative
of the order $r \le k$ as
\[
\frac{\partial^r f}{\partial x_1^{r_1} \partial x_2^{r_2} \dots \partial x_n^{r_n}} \quad\text{with}\quad r_1 + r_2 + \dots + r_n = r
\]
where, of course, $r_j = 0$ is allowed and indicates the absence of the symbol $\partial x_j$.
5 The Implicit Functions Theorem I: The case of a single equation
5.1
Suppose we have a function of $n + 1$ variables, which we will write as
\[
F(\mathbf{x}, y),
\]
and consider the problem of finding a solution $y = f(\mathbf{x})$ of the equation
\[
F(\mathbf{x}, y) = 0. \tag{5.1.1}
\]
Even in very simple cases we can hardly expect a unique solution. Take for example
$F(x, y) = x^2 + y^2 - 1$. Then for $|x| > 1$ there is no solution $f(x)$. For $|x_0| < 1$,
on some open interval containing $x_0$, we have two solutions
\[
f(x) = \sqrt{1 - x^2} \quad\text{and}\quad g(x) = -\sqrt{1 - x^2}.
\]
This is better, but we have two values at each point, contradicting the definition of
a function. To achieve uniqueness, we have to restrict not only the values of $x$, but
also the values of $y$ to an interval $(y_0 - \Delta, y_0 + \Delta)$ (where $F(x_0, y_0) = 0$). That is,
if we have a particular solution $(x_0, y_0)$ we must restrict our attention to a “window”
\[
(x_0 - \delta, x_0 + \delta) \times (y_0 - \Delta, y_0 + \Delta)
\]
through which we see a unique solution.

In our example, there is also the case $(x_0, y_0) = (1, 0)$, where there is a unique
solution, but no suitable window as above, since in every neighborhood of $(1, 0)$,
there are no solutions on the right-hand side of $(1, 0)$, and two solutions to the left.
In another example
\[
y^2 - |x| = 0,
\]
the solution $(0, 0)$ can be extended indefinitely both ways, but still there is no
neighborhood of $(0, 0)$ in which there would be a unique solution.
5.2
Actually, the above examples cover more or less all the exceptions that can occur
for “reasonable” functions F .
Theorem. Let $F(\mathbf{x}, y)$ be a function of $n + 1$ variables defined in a neighborhood
of a point $(\mathbf{x}^0, y_0)$. Let $F$ have continuous partial derivatives up to the order $r \ge 1$
and let
\[
F(\mathbf{x}^0, y_0) = 0 \quad\text{and}\quad \left| \frac{\partial F(\mathbf{x}^0, y_0)}{\partial y} \right| \ne 0.
\]
Then there exist $\delta > 0$ and $\Delta > 0$ such that for every $\mathbf{x}$ with $\|\mathbf{x} - \mathbf{x}^0\| < \delta$ there
exists precisely one $y$ with $|y - y_0| < \Delta$ such that
\[
F(\mathbf{x}, y) = 0.
\]
Furthermore, if we write $y = f(\mathbf{x})$ for this unique value $y$, then the function
\[
f : (x_1^0 - \delta, x_1^0 + \delta) \times \dots \times (x_n^0 - \delta, x_n^0 + \delta) \to \mathbb{R}
\]
has continuous partial derivatives up to the order $r$.
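Proof. As before, we write $\|\mathbf{x}\| = \max_i |x_i|$. Let
\[
J(<\gamma) = \{ \mathbf{x} \mid \|\mathbf{x} - \mathbf{x}^0\| < \gamma \} \quad\text{and}\quad J(\gamma) = \{ \mathbf{x} \mid \|\mathbf{x} - \mathbf{x}^0\| \le \gamma \}
\]
(thus, the “window” interval we are seeking is $J(<\delta) \times (y_0 - \Delta, y_0 + \Delta)$).
Without loss of generality, let, say,
\[
\frac{\partial F(\mathbf{x}^0, y_0)}{\partial y} > 0.
\]
Since the first partial derivatives of $F$ are continuous, there exist $a > 0$, $K$, $\delta_1 > 0$
and $\Delta > 0$ such that for all $(\mathbf{x}, y) \in J(\delta_1) \times \langle y_0 - \Delta, y_0 + \Delta \rangle$, we have
\[
\frac{\partial F(\mathbf{x}, y)}{\partial y} \ge a \quad\text{and}\quad \left| \frac{\partial F(\mathbf{x}, y)}{\partial x_i} \right| \le K \tag{5.2.1}
\]
(use Theorem 6.6 of Chapter 2).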

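I. The function $f$: For fixed $\mathbf{x} \in J(\delta_1)$, we will consider the function of one
variable $y \in (y_0 - \Delta, y_0 + \Delta)$ defined by
\[
\varphi_{\mathbf{x}}(y) = F(\mathbf{x}, y).
\]
Thus, $\varphi_{\mathbf{x}}'(y) = \frac{\partial F(\mathbf{x}, y)}{\partial y} > 0$ and hence
all $\varphi_{\mathbf{x}}(y)$ with $\mathbf{x} \in J(\delta_1)$ are increasing functions of
$y$, and $\varphi_{\mathbf{x}^0}(y_0 - \Delta) < \varphi_{\mathbf{x}^0}(y_0) = 0 < \varphi_{\mathbf{x}^0}(y_0 + \Delta)$.
By 2.6 and 2.3, $F$ is continuous, and hence there is a $\delta$, $0 < \delta \le \delta_1$, such that
\[
\forall \mathbf{x} \in J(<\delta),\quad \varphi_{\mathbf{x}}(y_0 - \Delta) < 0 < \varphi_{\mathbf{x}}(y_0 + \Delta).
\]
Now by Theorem 3.3 of Chapter 1, there is precisely one $y \in (y_0 - \Delta, y_0 + \Delta)$
($\varphi_{\mathbf{x}}$ is one-to-one since it is increasing) such that $\varphi_{\mathbf{x}}(y) = 0$, that is, $F(\mathbf{x}, y) = 0$.
Define this to be $f(\mathbf{x})$.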
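II. The first derivatives. We will fix an index $j$, abbreviate the $(j - 1)$-dimensional
vector $x_1, \dots, x_{j-1}$ by $\mathbf{x}_b$ (“the $x_i$’s before”) and the $(n - j)$-dimensional vector
$x_{j+1}, \dots, x_n$ by $\mathbf{x}_a$ (“the $x_i$’s after”); thus, we may write
\[
\mathbf{x} = (\mathbf{x}_b, x_j, \mathbf{x}_a).
\]
Compute $\frac{\partial f}{\partial x_j}$ as the derivative of $\psi(t) = f(\mathbf{x}_b, t, \mathbf{x}_a)$.
By 3.3, we have
\begin{align*}
0 &= F(\mathbf{x}_b, t + h, \mathbf{x}_a, \psi(t + h)) - F(\mathbf{x}_b, t, \mathbf{x}_a, \psi(t)) \\
&= F(\mathbf{x}_b, t + h, \mathbf{x}_a, \psi(t) + (\psi(t + h) - \psi(t))) - F(\mathbf{x}_b, t, \mathbf{x}_a, \psi(t)) \\
&= \frac{\partial F(\mathbf{x}_b, t + \theta h, \mathbf{x}_a, \psi(t) + \theta(\psi(t + h) - \psi(t)))}{\partial x_j}\, h \\
&\quad + \frac{\partial F(\mathbf{x}_b, t + \theta h, \mathbf{x}_a, \psi(t) + \theta(\psi(t + h) - \psi(t)))}{\partial y}\, (\psi(t + h) - \psi(t))
\end{align*}
and hence
\[
\psi(t + h) - \psi(t) = -h \cdot \frac{\dfrac{\partial F(\mathbf{x}_b, t + \theta h, \mathbf{x}_a, \psi(t) + \theta(\psi(t + h) - \psi(t)))}{\partial x_j}}{\dfrac{\partial F(\mathbf{x}_b, t + \theta h, \mathbf{x}_a, \psi(t) + \theta(\psi(t + h) - \psi(t)))}{\partial y}} \tag{5.2.2}
\]
for some $\theta$ between 0 and 1.

Thus by (5.2.1),
\[
|\psi(t + h) - \psi(t)| \le |h| \cdot \left| \frac{K}{a} \right|
\]
and $f$ is continuous (note that we have not known that before). Using this fact,
we can compute from (5.2.2)
\[
\lim_{h \to 0} \frac{\psi(t + h) - \psi(t)}{h}
= \lim_{h \to 0} \left( - \frac{\dfrac{\partial F(\mathbf{x}_b, t + \theta h, \mathbf{x}_a, \psi(t) + \theta(\psi(t + h) - \psi(t)))}{\partial x_j}}{\dfrac{\partial F(\mathbf{x}_b, t + \theta h, \mathbf{x}_a, \psi(t) + \theta(\psi(t + h) - \psi(t)))}{\partial y}} \right)
= - \frac{\dfrac{\partial F(\mathbf{x}_b, t, \mathbf{x}_a, \psi(t))}{\partial x_j}}{\dfrac{\partial F(\mathbf{x}_b, t, \mathbf{x}_a, \psi(t))}{\partial y}}.
\]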
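III. The higher derivatives. Note that we have not only proved the existence of the
first derivative of $f$, but also the formula
\[
\frac{\partial f(\mathbf{x})}{\partial x_j} = - \frac{\partial F(\mathbf{x}, f(\mathbf{x}))}{\partial x_j} \left( \frac{\partial F(\mathbf{x}, f(\mathbf{x}))}{\partial y} \right)^{-1}. \tag{5.2.3}
\]
From this we can inductively compute the higher derivatives of $f$ (using the
standard rules of differentiation) as long as the derivatives
\[
\frac{\partial^r F}{\partial x_1^{r_1} \cdots \partial x_n^{r_n} \partial y^{r_{n+1}}}
\]
exist and are continuous. $\square$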
5.3
We have obtained the formula (5.2.3) while proving that f has a derivative. If we
knew beforehand that f has a derivative, we could deduce (5.2.3) immediately from
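the chain rule. In effect, we have
\[
0 \equiv F(\mathbf{x}, f(\mathbf{x}));
\]
taking a derivative of both sides we obtain
\[
0 = \frac{\partial F(\mathbf{x}, f(\mathbf{x}))}{\partial x_j} + \frac{\partial F(\mathbf{x}, f(\mathbf{x}))}{\partial y} \cdot \frac{\partial f(\mathbf{x})}{\partial x_j}.
\]
Differentiating further, we obtain inductively linear equations from which we can
compute the values of all the derivatives guaranteed by the theorem.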
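Formula (5.2.3) can be tried out on the circle example of 5.1. This is a sketch under our own assumptions: the base point $(0.6, 0.8)$, the window bounds `y_lo`, `y_hi` and the use of bisection (which mirrors the monotonicity argument of part I of the proof) are chosen for illustration only.

```python
# Implicit function for F(x, y) = x^2 + y^2 - 1 near (x0, y0) = (0.6, 0.8),
# and a check of the derivative formula f'(x) = -(dF/dx) / (dF/dy).

def F(x, y):
    return x * x + y * y - 1.0

def solve_y(x, y_lo=0.3, y_hi=1.3):
    """Unique zero of y -> F(x, y) in the window (y_lo, y_hi).

    For x near 0.6 the map y -> F(x, y) is increasing on the window and
    changes sign there, so bisection finds the single solution.
    """
    for _ in range(80):
        mid = (y_lo + y_hi) / 2
        if F(x, mid) < 0:
            y_lo = mid
        else:
            y_hi = mid
    return (y_lo + y_hi) / 2

x0 = 0.6
y0 = solve_y(x0)                              # should be close to 0.8
h = 1e-5
numeric = (solve_y(x0 + h) - solve_y(x0 - h)) / (2 * h)
formula = -(2 * x0) / (2 * y0)                # -(dF/dx)/(dF/dy) at (x0, f(x0))
print(y0, numeric, formula)
```

Shrinking the window in $y$ is what makes the solution unique; with a window containing both $0.8$ and $-0.8$ the bisection argument would no longer apply.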
5.4 Remark
The solution $f$ in 5.2 has as many derivatives as the initial $F$. But note the restriction
$r \ge 1$. One usually thinks of the 0-th derivative as of the function itself. The theorem
does not guarantee a continuous solution $f$ of an equation $F(\mathbf{x}, f(\mathbf{x})) = 0$ with
continuous $F$. Even just for the existence of the $f$ we have used the first derivatives.
6 The Implicit Functions Theorem II: The case of several equations
6.1 A warm-up: what happens in the case of two equations
Suppose we try to find a solution $y_i = f_i(\mathbf{x})$, $i = 1, 2$, of a pair of equations
\begin{align*}
F_1(\mathbf{x}, y_1, y_2) &= 0, \\
F_2(\mathbf{x}, y_1, y_2) &= 0
\end{align*}
in a neighborhood of a point $(\mathbf{x}^0, y_1^0, y_2^0)$ (at which the equalities hold). We will
apply the “substitution method” based on Theorem 5.2. First we will think of
the second equation as an equation for the unknown $y_2$; in a neighborhood of
$(\mathbf{x}^0, y_1^0, y_2^0)$ we obtain $y_2$ as a function $\psi(\mathbf{x}, y_1)$. Substitute this into the first equation
to obtain
\[
G(\mathbf{x}, y_1) = F_1(\mathbf{x}, y_1, \psi(\mathbf{x}, y_1));
\]
if we find, in a neighborhood of $(\mathbf{x}^0, y_1^0)$, a solution $y_1 = f_1(\mathbf{x})$, we can substitute it
into $\psi$ and obtain $y_2 = f_2(\mathbf{x}) = \psi(\mathbf{x}, f_1(\mathbf{x}))$.
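What did we have to assume? First, of course, we have to have the continuous
partial derivatives of the functions $F_i$. Then, to be able to obtain $\psi$ by 5.2 the way
we did, we need to have
\[
\frac{\partial F_2}{\partial y_2}(\mathbf{x}^0, y_1^0, y_2^0) \ne 0. \tag{6.1.1}
\]
Finally, we also need to have
\[
\frac{\partial G}{\partial y_1}(\mathbf{x}^0, y_1^0) \ne 0;
\]
by 3.1.1, this is equivalent to
\[
\frac{\partial F_1}{\partial y_1} + \frac{\partial F_1}{\partial y_2} \frac{\partial \psi}{\partial y_1} \ne 0. \tag{6.1.2}
\]
Now we have (recall (5.2.3))
\[
\frac{\partial \psi}{\partial y_1} = - \left( \frac{\partial F_2}{\partial y_2} \right)^{-1} \frac{\partial F_2}{\partial y_1}
\]
and (6.1.2) becomes
\[
\frac{\partial F_1}{\partial y_1} - \left( \frac{\partial F_2}{\partial y_2} \right)^{-1} \frac{\partial F_1}{\partial y_2} \frac{\partial F_2}{\partial y_1} \ne 0,
\]
that is,
\[
\frac{\partial F_1}{\partial y_1} \frac{\partial F_2}{\partial y_2} - \frac{\partial F_1}{\partial y_2} \frac{\partial F_2}{\partial y_1} \ne 0.
\]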
This formula should be conspicuously familiar. Indeed, it is (see the notation for
determinants from Subsection 3.3 of Appendix B)
\[
\begin{vmatrix}
\dfrac{\partial F_1}{\partial y_1} & \dfrac{\partial F_1}{\partial y_2} \\[2ex]
\dfrac{\partial F_2}{\partial y_1} & \dfrac{\partial F_2}{\partial y_2}
\end{vmatrix}
= \det \left( \frac{\partial F_i}{\partial y_j} \right)_{i,j} \ne 0. \tag{6.1.3}
\]
Note that if we assume that this determinant is non-zero we have
\[
\frac{\partial F_2}{\partial y_2}(\mathbf{x}^0, y_1^0, y_2^0) \ne 0
\]
or
\[
\frac{\partial F_2}{\partial y_1}(\mathbf{x}^0, y_1^0, y_2^0) \ne 0
\]
(or both), so if the latter holds, we can start by solving $F_2(\mathbf{x}, y_1, y_2) = 0$ for $y_1$ instead of $y_2$.
Thus the condition (6.1.3) suffices.
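The role of condition (6.1.3) can be illustrated on a concrete pair of equations. The sketch below is only an illustration under stated assumptions: the functions $F_1, F_2$, the base point $(x^0, y_1^0, y_2^0) = (0, 1, 1)$ and the value `x = 0.05` are our own choices, and Newton's method is swapped in for the substitution procedure of the text because it is easier to code; it uses the same non-vanishing determinant.

```python
# Solving F1 = F2 = 0 for (y1, y2) near a point where det(dF_i/dy_j) != 0.
# Hypothetical example: F1 = y1^2 + y2^2 - 2 - x,  F2 = y1 - y2^3 + x.

def F(x, y1, y2):
    return (y1 * y1 + y2 * y2 - 2.0 - x, y1 - y2 ** 3 + x)

def jac(y1, y2):
    """Matrix (dF_i/dy_j) for the example above."""
    return ((2 * y1, 2 * y2), (1.0, -3 * y2 * y2))

def det2(J):
    return J[0][0] * J[1][1] - J[0][1] * J[1][0]

def solve(x, y1=1.0, y2=1.0, steps=20):
    """Newton iteration started at the known solution for x = 0."""
    for _ in range(steps):
        r1, r2 = F(x, y1, y2)
        J = jac(y1, y2)
        d = det2(J)
        # Cramer's rule for the linear system J * (dy1, dy2) = -(r1, r2)
        dy1 = (-r1 * J[1][1] + r2 * J[0][1]) / d
        dy2 = (-J[0][0] * r2 + J[1][0] * r1) / d
        y1, y2 = y1 + dy1, y2 + dy2
    return y1, y2

base_det = det2(jac(1.0, 1.0))        # determinant (6.1.3) at the base point
y1, y2 = solve(0.05)
residual = F(0.05, y1, y2)
print(base_det, (y1, y2), residual)
```

Each Newton step divides by the determinant, so the iteration is only defined where the Jacobi determinant stays away from zero, which is the same requirement the substitution argument arrives at.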
6.2 The Jacobian
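For a system of functions
\[
\mathbf{F}(\mathbf{x}, \mathbf{y}) = (F_1(\mathbf{x}, y_1, \dots, y_m), \dots, F_m(\mathbf{x}, y_1, \dots, y_m))
\]
and variables $y_1, \dots, y_m$, define the Jacobi determinant (briefly, the Jacobian)
\[
\frac{D(\mathbf{F})}{D(\mathbf{y})} = \det \left( \frac{\partial F_i}{\partial y_j} \right)_{i,j = 1, \dots, m}.
\]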
6.3
By extending the substitution procedure indicated in 6.1, we will now prove the
general Implicit Function Theorem.
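Theorem. Let $F_i(\mathbf{x}, y_1, \dots, y_m)$, $i = 1, \dots, m$, be functions of $n + m$ variables
with continuous partial derivatives up to an order $k \ge 1$. Let
\[
\mathbf{F}(\mathbf{x}^0, \mathbf{y}^0) = \mathbf{o}
\]
and let
\[
\frac{D(\mathbf{F})}{D(\mathbf{y})}(\mathbf{x}^0, \mathbf{y}^0) \ne 0.
\]
Then there exist $\delta > 0$ and $\Delta > 0$ such that for every
\[
\mathbf{x} \in (x_1^0 - \delta, x_1^0 + \delta) \times \dots \times (x_n^0 - \delta, x_n^0 + \delta)
\]
there exists precisely one
\[
\mathbf{y} \in (y_1^0 - \Delta, y_1^0 + \Delta) \times \dots \times (y_m^0 - \Delta, y_m^0 + \Delta)
\]
such that
\[
\mathbf{F}(\mathbf{x}, \mathbf{y}) = \mathbf{o}.
\]
Furthermore, if we write this $\mathbf{y}$ as a vector function $\mathbf{f}(\mathbf{x}) = (f_1(\mathbf{x}), \dots, f_m(\mathbf{x}))$, then
the functions $f_i$ have continuous partial derivatives up to the order $k$.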
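Proof. We proceed by induction. By Theorem 5.2, the statement holds for $m = 1$.
Now assume it holds for a given $m$, and let us have a system of equations
\[
F_i(\mathbf{x}, \mathbf{y}), \quad i = 1, \dots, m + 1
\]
satisfying the assumptions above (i.e. the unknown vector $\mathbf{y}$ is $(m + 1)$-dimensional).
Then, in particular, in the Jacobian determinant we cannot have a column consisting
entirely of zeros, and hence, after possibly renumbering the $F_i$’s, we may assume
without loss of generality that
\[
\frac{\partial F_{m+1}}{\partial y_{m+1}}(\mathbf{x}^0, \mathbf{y}^0) \ne 0.
\]
If we write $\tilde{\mathbf{y}} = (y_1, \dots, y_m)$, we then have by the induction hypothesis $\delta_1 > 0$ and
$\Delta_1 > 0$ such that for
\[
(\mathbf{x}, \tilde{\mathbf{y}}) \in (x_1^0 - \delta_1, x_1^0 + \delta_1) \times \dots \times (x_n^0 - \delta_1, x_n^0 + \delta_1) \times (y_1^0 - \delta_1, y_1^0 + \delta_1) \times \dots \times (y_m^0 - \delta_1, y_m^0 + \delta_1),
\]
there exists precisely one $y_{m+1} = \psi(\mathbf{x}, \tilde{\mathbf{y}})$ satisfying
\[
F_{m+1}(\mathbf{x}, \tilde{\mathbf{y}}, y_{m+1}) = 0 \quad\text{and}\quad |y_{m+1} - y_{m+1}^0| < \Delta_1.
\]
This $\psi$ has continuous partial derivatives up to the order $k$ and hence so have the
functions
\[
G_i(\mathbf{x}, \tilde{\mathbf{y}}) = F_i(\mathbf{x}, \tilde{\mathbf{y}}, \psi(\mathbf{x}, \tilde{\mathbf{y}})), \quad i = 1, \dots, m + 1
\]
(the last of which, $G_{m+1}$, is identically zero). By 3.1.1, we then have
\[
\frac{\partial G_j}{\partial y_i} = \frac{\partial F_j}{\partial y_i} + \frac{\partial F_j}{\partial y_{m+1}} \frac{\partial \psi}{\partial y_i}. \tag{6.3.1}
\]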
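Now consider the determinant
\[
\frac{D(\mathbf{F})}{D(\mathbf{y})} =
\begin{vmatrix}
\dfrac{\partial F_1}{\partial y_1} & \dots & \dfrac{\partial F_1}{\partial y_m} & \dfrac{\partial F_1}{\partial y_{m+1}} \\
\vdots & & \vdots & \vdots \\
\dfrac{\partial F_m}{\partial y_1} & \dots & \dfrac{\partial F_m}{\partial y_m} & \dfrac{\partial F_m}{\partial y_{m+1}} \\[2ex]
\dfrac{\partial F_{m+1}}{\partial y_1} & \dots & \dfrac{\partial F_{m+1}}{\partial y_m} & \dfrac{\partial F_{m+1}}{\partial y_{m+1}}
\end{vmatrix}.
\]
Add to the $i$th column the product of the last column with the scalar $\frac{\partial \psi}{\partial y_i}$. By (6.3.1),
taking into account the fact that $G_{m+1} \equiv 0$ and hence
\[
\frac{\partial G_{m+1}}{\partial y_i} = \frac{\partial F_{m+1}}{\partial y_i} + \frac{\partial F_{m+1}}{\partial y_{m+1}} \frac{\partial \psi}{\partial y_i} = 0,
\]
we obtain
\[
\frac{D(\mathbf{F})}{D(\mathbf{y})} =
\begin{vmatrix}
\dfrac{\partial G_1}{\partial y_1} & \dots & \dfrac{\partial G_1}{\partial y_m} & \dfrac{\partial F_1}{\partial y_{m+1}} \\
\vdots & & \vdots & \vdots \\
\dfrac{\partial G_m}{\partial y_1} & \dots & \dfrac{\partial G_m}{\partial y_m} & \dfrac{\partial F_m}{\partial y_{m+1}} \\[2ex]
0 & \dots & 0 & \dfrac{\partial F_{m+1}}{\partial y_{m+1}}
\end{vmatrix}
= \frac{\partial F_{m+1}}{\partial y_{m+1}} \cdot \frac{D(G_1, \dots, G_m)}{D(y_1, \dots, y_m)}.
\]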
Thus,
\[
\frac{D(G_1, \dots, G_m)}{D(y_1, \dots, y_m)} \ne 0
\]
and hence by the induction hypothesis there are $\delta_2 > 0$, $\Delta_2 > 0$ such that for
$|x_i - x_i^0| < \delta_2$ there is a uniquely determined $\tilde{\mathbf{y}}$ with $|y_i - y_i^0| < \Delta_2$ such that
\[
G_i(\mathbf{x}, \tilde{\mathbf{y}}) = 0 \quad\text{for } i = 1, \dots, m
\]
and that the resulting $f_i(\mathbf{x})$ have continuous partial derivatives up to the order $k$.
If we define, further,
\[
f_{m+1}(\mathbf{x}) = \psi(\mathbf{x}, f_1(\mathbf{x}), \dots, f_m(\mathbf{x}))
\]
we obtain a solution $\mathbf{f}$ of the original system of equations $\mathbf{F}(\mathbf{x}, \mathbf{y}) = \mathbf{o}$.

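The proof is almost finished but not quite. What about the uniqueness of the
solution within the constraints $\|\mathbf{x} - \mathbf{x}^0\| < \delta$ and $\|\mathbf{y} - \mathbf{y}^0\| < \Delta$? Does uniqueness
in the two steps of the proof above (solving $F_{m+1}(\mathbf{x}, y_1, \dots, y_m, y_{m+1}) = 0$ for
$y_{m+1}$, and then $\mathbf{G}(\mathbf{x}, \tilde{\mathbf{y}}) = \mathbf{o}$ for $y_1, \dots, y_m$) really guarantee that a different solution
cannot be found by some other procedure (e.g. reversing the order of variables)?
Luckily, in this particular proof, this turns out not to be a serious problem.
Choose $0 < \Delta \le \delta_1, \Delta_1, \Delta_2$ and then $0 < \delta < \delta_1, \delta_2$ and, moreover, sufficiently
small so that for $|x_i - x_i^0| < \delta$ one has $|f_j(\mathbf{x}) - f_j(\mathbf{x}^0)| < \Delta$ (the last to make sure
to have in the $\Delta$-interval at least one solution). Now let
\[
\mathbf{F}(\mathbf{x}, \mathbf{y}) = \mathbf{o}, \quad\text{and}\quad \|\mathbf{x} - \mathbf{x}^0\| < \delta \quad\text{and}\quad \|\mathbf{y} - \mathbf{y}^0\| < \Delta. \tag{6.3.2}
\]
We have to prove that then necessarily $y_i = f_i(\mathbf{x})$ for all $i$. Since $|x_i - x_i^0| < \delta \le \delta_1$
for $i = 1, \dots, n$, $|y_i - y_i^0| < \Delta \le \delta_1$ for $i = 1, \dots, m$ and $|y_{m+1} - y_{m+1}^0| < \Delta \le \Delta_1$
we have, necessarily, $y_{m+1} = \psi(\mathbf{x}, \tilde{\mathbf{y}})$. Thus, by (6.3.2),
\[
\mathbf{G}(\mathbf{x}, \tilde{\mathbf{y}}) = \mathbf{o}
\]
and since $|x_i - x_i^0| < \delta \le \delta_2$ and $|y_i - y_i^0| < \Delta \le \Delta_2$ we have indeed $y_i = f_i(\mathbf{x})$. $\square$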
7 An easy application: regular mappings and the Inverse Function Theorem
7.1
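Let $U \subseteq \mathbb{R}^n$ be an open set. A mapping $\mathbf{f} : U \to \mathbb{R}^n$ is said to be regular if each $f_i$
has continuous partial derivatives $\frac{\partial f_i}{\partial x_j}$ and if for all $\mathbf{x} \in U$, we have
\[
\frac{D(\mathbf{f})}{D(\mathbf{x})}(\mathbf{x}) \ne 0.
\]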
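7.2 Proposition. Let $\mathbf{f} : U \to \mathbb{R}^n$ be a regular mapping. Then the image $\mathbf{f}[V]$ of
every open $V \subseteq U$ is open.
Proof. Let $\mathbf{f}(\mathbf{x}_0) = \mathbf{y}_0$. Define $\mathbf{F} : V \times \mathbb{R}^n \to \mathbb{R}^n$ by setting
\[
F_i(\mathbf{x}, \mathbf{y}) = f_i(\mathbf{x}) - y_i. \tag{7.2.1}
\]
Thus $\mathbf{F}(\mathbf{x}_0, \mathbf{y}_0) = \mathbf{o}$ and $\frac{D(\mathbf{F})}{D(\mathbf{x})} \ne 0$, and hence, by 6.3, there exist $\delta > 0$ and $\Delta > 0$
such that for every $\mathbf{y}$ with $\|\mathbf{y} - \mathbf{y}_0\| < \delta$, there exists (precisely one, but this is not
important at this moment) $\mathbf{x}$ with $\|\mathbf{x} - \mathbf{x}_0\| < \Delta$ and $F_i(\mathbf{x}, \mathbf{y}) = f_i(\mathbf{x}) - y_i = 0$. This
means that we have $\mathbf{f}(\mathbf{x}) = \mathbf{y}$ (note that the roles of the $x_i$ and the $y_i$ are reversed
from the usual convention: here, the $y_i$ are the independent variables). Thus, we
have
\[
\Omega(\mathbf{y}_0, \delta) = \{ \mathbf{y} \mid \|\mathbf{y} - \mathbf{y}_0\| < \delta \} \subseteq \mathbf{f}[V]. \qquad \square
\]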
Remark. Compare this fact with the characterization of continuous maps in
Theorem 3.4 of Chapter 2: for regular maps, both images and preimages of open
sets are open.
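7.3 Proposition. Let $\mathbf{f} : U \to \mathbb{R}^n$ be a regular mapping. Then for each $\mathbf{x}_0 \in U$
there exists an open neighborhood $V$ such that the restriction $\mathbf{f}|V$ is one-to-one.
Moreover, the mapping $\mathbf{g} : \mathbf{f}[V] \to \mathbb{R}^n$ inverse to $\mathbf{f}|V$ is regular.
Proof. Consider the $\mathbf{F}$ from (7.2.1) again. We have, for a sufficiently small $\Delta > 0$,
precisely one $\mathbf{x} = \mathbf{g}(\mathbf{y})$ such that $\mathbf{F}(\mathbf{x}, \mathbf{y}) = \mathbf{o}$ and $\|\mathbf{x} - \mathbf{x}_0\| < \Delta$. This $\mathbf{g}$ has,
furthermore, continuous partial derivatives. We have, by 3.2,
\[
D(\mathrm{Id}) = D(\mathbf{f} \circ \mathbf{g}) = D\mathbf{f} \cdot D\mathbf{g}.
\]
By the chain rule,
\[
\frac{D(\mathbf{f})}{D(\mathbf{x})} \cdot \frac{D(\mathbf{g})}{D(\mathbf{y})} = \det D\mathbf{f} \cdot \det D\mathbf{g} = 1
\]
and hence for each $\mathbf{y} \in \mathbf{f}[V]$, $\frac{D(\mathbf{g})}{D(\mathbf{y})}(\mathbf{y}) \ne 0$. $\square$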
7.3.1 Corollary. If a regular mapping $\mathbf{f} : U \to \mathbb{R}^n$ is one-to-one, then the inverse
$\mathbf{g} : \mathbf{f}[U] \to \mathbb{R}^n$ is regular as well.
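The identity $\det D\mathbf{f} \cdot \det D\mathbf{g} = 1$ from the proof of 7.3 can be observed on a familiar regular mapping. The sketch below is an illustration under our own assumptions: the polar-coordinate map `f` on $r > 0$, its local inverse `g` written with `atan2`, the sample point and the finite-difference helper are not taken from the text.

```python
# Regular mapping f(r, phi) = (r cos phi, r sin phi) on r > 0 and a local inverse g.
import math

def f(p):
    r, phi = p
    return [r * math.cos(phi), r * math.sin(phi)]

def g(q):
    # Local inverse, valid away from the negative real axis (assumption of this sketch)
    u, v = q
    return [math.hypot(u, v), math.atan2(v, u)]

def jacobian(func, p, h=1e-6):
    """Central-difference approximation of the 2x2 Jacobian matrix."""
    J = [[0.0, 0.0], [0.0, 0.0]]
    for j in range(2):
        hi, lo = list(p), list(p)
        hi[j] += h
        lo[j] -= h
        fh, fl = func(hi), func(lo)
        for i in range(2):
            J[i][j] = (fh[i] - fl[i]) / (2 * h)
    return J

def det2(J):
    return J[0][0] * J[1][1] - J[0][1] * J[1][0]

x0 = [2.0, 0.5]                    # sample point (r, phi)
y0 = f(x0)
det_f = det2(jacobian(f, x0))      # equals r = 2 for the polar map
det_g = det2(jacobian(g, y0))      # should be 1 / det_f
print(det_f, det_g, det_f * det_g)
```

The polar map is regular on $r > 0$ but not one-to-one globally, which is why 7.3 only promises an inverse on a suitable neighborhood $V$.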
8 Taylor’s Theorem, Local Extremes and Extremes with Constraints
8.1 Taylor’s Theorem
A function $f$ defined on an open set of $\mathbb{R}^n$ is called a $C^r$-function if $f$ is continuous
and possesses continuous partial derivatives up to (and including) order $r$.
A function which is $C^r$ for all $r \in \mathbb{N}$ is called $C^\infty$. $C^\infty$ functions will be also called
smooth, while $C^1$-functions will be called continuously differentiable. (Terminology
in the literature varies, some texts use the word smooth for $C^1$. We shall never do
so in the present text.) Taylor’s Theorem for multivariable functions may look more
intimidating, but we will see that it is an easy consequence of the corresponding
single variable theorem:
Theorem. (Taylor) Let $f$ be a $C^{r+1}$-function defined on an open convex subset
$U \subseteq \mathbb{R}^n$, and let $\mathbf{a} \in U$. Then for every point $\mathbf{x} \in U$, $\mathbf{x} \ne \mathbf{a}$, there exists a point $\mathbf{c}$
on the open line segment connecting $\mathbf{a}$ and $\mathbf{x}$ such that
\begin{align*}
f(\mathbf{x}) ={} & \sum_{k=0}^{r} \ \sum_{k_1 + \dots + k_n = k,\ k_i \ge 0} \frac{1}{k_1! \cdots k_n!} \frac{\partial^k f(\mathbf{a})}{(\partial x_1)^{k_1} \cdots (\partial x_n)^{k_n}} (x_1 - a_1)^{k_1} \cdots (x_n - a_n)^{k_n} \\
& + \sum_{k_1 + \dots + k_n = r + 1,\ k_i \ge 0} \frac{1}{k_1! \cdots k_n!} \frac{\partial^{r+1} f(\mathbf{c})}{(\partial x_1)^{k_1} \cdots (\partial x_n)^{k_n}} (x_1 - a_1)^{k_1} \cdots (x_n - a_n)^{k_n}. \tag{*}
\end{align*}
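Proof. Simply use Theorem 4.5.1 for the function
\[
g(t) = f(\mathbf{a} + t(\mathbf{x} - \mathbf{a})).
\]
The formula (*) follows immediately from the observation
\[
g^{(k)}(t) = \sum_{k_1 + \dots + k_n = k,\ k_i \ge 0} \frac{k!}{k_1! \cdots k_n!} \left. \frac{\partial^k f(\mathbf{s})}{(\partial s_1)^{k_1} \cdots (\partial s_n)^{k_n}} \right|_{\mathbf{s} = \mathbf{a} + t(\mathbf{x} - \mathbf{a})} (x_1 - a_1)^{k_1} \cdots (x_n - a_n)^{k_n} \tag{**}
\]
which follows by applying the chain rule repeatedly. $\square$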
It is useful to note that the affine approximation in the sense of 3.2 of the function
f at a point a is simply the sum of the constant and linear terms of its Taylor
expansion.
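As an illustration of formula (*) with $r = 2$, the following sketch sums the multi-index terms of order at most two for a sample function. The function $f(x, y) = e^x \sin y$, the expansion point $\mathbf{a} = (0, 0)$ and the table of hand-computed derivatives are our assumptions; the remainder is only observed numerically, not bounded.

```python
# Second-order Taylor polynomial of f(x, y) = exp(x) sin(y) at a = (0, 0),
# built from the multi-index sum in Taylor's formula.
import math
from math import factorial

def f(x, y):
    return math.exp(x) * math.sin(y)

# Partial derivatives of f at the origin, indexed by (k1, k2); computed by hand.
derivs_at_origin = {
    (0, 0): 0.0,               # f
    (1, 0): 0.0, (0, 1): 1.0,  # f_x, f_y
    (2, 0): 0.0, (1, 1): 1.0, (0, 2): 0.0,  # f_xx, f_xy, f_yy
}

def taylor2(x, y):
    """Sum over k1 + k2 <= 2 of D^(k1,k2) f(a) / (k1! k2!) * x^k1 * y^k2."""
    return sum(d / (factorial(k1) * factorial(k2)) * x ** k1 * y ** k2
               for (k1, k2), d in derivs_at_origin.items())

def error(h):
    return abs(f(h, h) - taylor2(h, h))

# The remainder consists of third-order terms, so halving h should
# shrink the error by a factor close to 2^3 = 8.
ratio = error(0.1) / error(0.05)
print(error(0.1), error(0.05), ratio)
```

Dropping the second-order entries from the table leaves the constant and linear terms, that is, the affine approximation mentioned above.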
8.2 Local extremes and critical points
Let $f$ be a function defined on an open subset $U \subseteq \mathbb{R}^n$ and let $\mathbf{x}_0 \in U$. In analogy
with the one-variable case (4.7 of Chapter 1), we say that $f$ has a local minimum
(resp. local maximum) at $\mathbf{x}_0$ if there exists a $\delta > 0$ such that for every $\mathbf{x} \in \Omega(\mathbf{x}_0, \delta)$
with $\mathbf{x} \ne \mathbf{x}_0$, we have $f(\mathbf{x}) > f(\mathbf{x}_0)$ (resp. $f(\mathbf{x}) < f(\mathbf{x}_0)$). A local minimum or a
local maximum are referred to by the joint term local extreme.
On the other hand, x0 is called a critical point of f if either f does not have a
total differential at x0 , or the total differential is 0. The following is then a direct
consequence, for example, of Proposition 2.3 and Corollary 4.3.1 of Chapter 1.
8.2.1 Proposition. A local minimum or local maximum of a function $f : U \to \mathbb{R}$
is a critical point of $f$.
8.3 The Hessian
Just as in 4.7 of Chapter 1, we would like a partial converse of Proposition 8.2.1
based on second derivatives. We will see, however, that in the multivariable case,
the geometry is intrinsically more complicated. Suppose a function $f : U \to \mathbb{R}$ is
$C^2$ on some open set $U \subseteq \mathbb{R}^n$. One considers the Hessian matrix $H$ of type $n \times n$
whose $(i, j)$’th entry is
\[
\frac{\partial^2 f}{\partial x_i \partial x_j}.
\]
This is a symmetric matrix by Proposition 4.2.1, and hence has an associated real
symmetric bilinear form. If the Hessian is non-degenerate at a critical point $\mathbf{x}_0$, we
call $\mathbf{x}_0$ a non-degenerate critical point. We have the following
Theorem. Suppose $f$ is $C^2$ on an open set $U \subseteq \mathbb{R}^n$ containing a non-degenerate
critical point $\mathbf{x}_0$. Then the following holds: if the Hessian $H(\mathbf{x}_0)$ is positive-definite
(resp. negative-definite) at $\mathbf{x}_0$, then $\mathbf{x}_0$ is a local minimum (resp. local maximum). If
the Hessian is indefinite, then $\mathbf{x}_0$ is neither a local minimum nor a local maximum.
Such a point $\mathbf{x}_0$ is called a saddle point.
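Proof. By Taylor’s Theorem 8.1, for any $\varepsilon > 0$ for which $\Omega(\mathbf{x}_0, \varepsilon) \subseteq U$, for every
$\mathbf{x} \in \Omega(\mathbf{x}_0, \varepsilon)$, $\mathbf{x} \ne \mathbf{x}_0$, there exists a point $\mathbf{c}$ on the open line segment connecting $\mathbf{x}_0$
and $\mathbf{x}$ such that
\[
f(\mathbf{x}) = f(\mathbf{x}_0) + \frac{1}{2} (\mathbf{x} - \mathbf{x}_0)^T H(\mathbf{c}) (\mathbf{x} - \mathbf{x}_0) \tag{8.3.1}
\]
(the first-order terms vanish because $\mathbf{x}_0$ is a critical point).
Then we conclude that if $H(\mathbf{c})$ is positive-definite (resp. negative-definite), we have
$f(\mathbf{x}) > f(\mathbf{x}_0)$ (resp. $f(\mathbf{x}) < f(\mathbf{x}_0)$). If $H(\mathbf{c})$ is indefinite, then, by definition, both
positive and negative values will occur.
However, in the statement of the theorem, we have $H(\mathbf{x}_0)$, not $H(\mathbf{c})$. To remedy
this situation, we proceed as follows: Consider
\[
\psi(\mathbf{v}, \mathbf{c}) = \mathbf{v}^T H(\mathbf{c}) \mathbf{v}
\]
as a function of $(\mathbf{v}, \mathbf{c}) \in X$ where
\[
X = \{ (\mathbf{v}, \mathbf{c}) \mid \mathbf{v} \in \mathbb{R}^n,\ \mathbf{v} \cdot \mathbf{v} = 1,\ \mathbf{c} \in \overline{\Omega(\mathbf{x}_0, \varepsilon/2)} \}.
\]
Then by our assumptions, $\psi$ is continuous. However, $X \subseteq \mathbb{R}^{2n}$ is compact by
Theorem 6.5 of Chapter 2, and hence by Theorem 6.6 of Chapter 2, $\psi$ is uniformly
continuous.
Now suppose $H(\mathbf{x}_0)$ is positive-definite. The closed subset $X_0 \subseteq X$ consisting of
all $(\mathbf{v}, \mathbf{c})$ where $\mathbf{c} = \mathbf{x}_0$ is compact, and hence $\psi$ has a minimum value $m$ on $X_0$ by
Proposition 6.3 of Chapter 2. Since $H(\mathbf{x}_0)$ is positive-definite, we have $m > 0$. Now
by the uniform continuity of $\psi$, there exists a $\delta > 0$ such that for all $\mathbf{c} \in \Omega(\mathbf{x}_0, \delta)$,
$(\mathbf{v}, \mathbf{c}) \in X$, we have $\psi(\mathbf{v}, \mathbf{c}) > 0$, and hence $H(\mathbf{c})$ is positive-definite also.
The case of $H(\mathbf{x}_0)$ negative-definite is handled analogously.
When $H(\mathbf{x}_0)$ is indefinite non-degenerate, there exist $(\mathbf{v}_1, \mathbf{x}_0) \in X_0$, $(\mathbf{v}_2, \mathbf{x}_0) \in X_0$
such that $\psi(\mathbf{v}_1, \mathbf{x}_0) > 0$, $\psi(\mathbf{v}_2, \mathbf{x}_0) < 0$. Since $\psi$ is continuous, there exists a
$\delta > 0$ such that for $\mathbf{c} \in \Omega(\mathbf{x}_0, \delta)$, $\psi(\mathbf{v}_1, \mathbf{c}) > 0$, $\psi(\mathbf{v}_2, \mathbf{c}) < 0$, and hence $H(\mathbf{c})$ is
indefinite. $\square$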
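For functions of two variables the theorem can be turned into a small classification routine. This is a sketch under stated assumptions: it uses the standard determinant test for a symmetric $2 \times 2$ matrix (positive-definite when $\det H > 0$ and the upper-left entry is positive, indefinite when $\det H < 0$), which is not derived in the text, and the example Hessians are our own.

```python
# Classifying a critical point of a C^2 function of two variables from its
# Hessian H = [[a, b], [b, c]] evaluated at that point.

def classify(a, b, c):
    """Return the type of a critical point with Hessian [[a, b], [b, c]]."""
    det = a * c - b * b
    if det > 0:
        # Definite case: the sign of a decides between positive and negative
        return "local minimum" if a > 0 else "local maximum"
    if det < 0:
        return "saddle point"          # indefinite Hessian
    return "degenerate"                # the theorem does not apply

# Hessians at the origin of some sample functions (all have a critical point there):
print(classify(2, 1, 2))     # f = x^2 + xy + y^2
print(classify(2, 0, -2))    # f = x^2 - y^2
print(classify(-2, 0, -6))   # f = -x^2 - 3y^2
print(classify(2, 0, 0))     # f = x^2 (Hessian is degenerate)
```

In the degenerate case the second derivatives alone decide nothing, which is why the theorem assumes a non-degenerate critical point.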
8.4 Global extremes
Suppose $f : X \to \mathbb{R}$ is a continuous function on a compact subset $X \subseteq \mathbb{R}^n$. Then

by Proposition 6.3 of Chapter 2, f attains a (global) minimum and maximum on X
at some points x1 ; x2 2 X . Can we find these points in practice? This is a classic
example of an optimization problem, which, as the reader can imagine, has many
applications outside of mathematics.
The first method that comes to mind is computing all critical points, and
checking the values to see at which of these points the maximum (resp. minimum)
occurs. This is generally an adequate method when n D 1. A typical (although not
general, see Exercise (19) in Chapter 2 above) example of a set X is a compact
interval or a finite union of compact intervals. If it happens (as it often does) that
the equation f 0 .x/ D 0 has only finitely many solutions, then the only other critical
points to check are the finitely many boundary points of the intervals.
One immediately realizes, however, that the method in this form does not work
even for a perfectly “reasonable” compact subset $X \subseteq \mathbb{R}^n$ when $n > 1$ such as
for example a cube (or, more generally, a region with corners as we introduce it in
Chapter 12 below). The point is that the boundary of such sets X will in general be
infinite (in fact, uncountable, see Exercise (2) of Chapter 1), and will consist entirely
of critical points as defined above, so there is no way of checking all of them.
To see what else we can do, let us consider a simple example. Suppose a function
$f(x, y)$ is continuously differentiable on some open set containing the ball
$B = \{ (x, y) \mid x^2 + y^2 \le 1 \}$, and suppose we are
to find the global extremes of $f$ on the compact set $B$. In the interior of $B$, we can
then solve the equations
\[
\frac{\partial f}{\partial x} = 0, \quad \frac{\partial f}{\partial y} = 0. \tag{*}
\]
On the boundary, the extreme may not satisfy the equations (*), but we note that the
boundary is itself the set of solutions of the “nice” equation
\[
x^2 + y^2 = 1. \tag{C}
\]
It is certainly worth asking if some generalization of (*) might hold, which would
allow us to solve the problem. Note that generically speaking, we expect a single
equation in the boundary case, since in addition to it, we still have the equation
(“constraint”) (C).
8.5 Local Extremes with constraints. Lagrange multipliers
The problem we encountered at the end of Subsection 8.4 can be formalized as
follows: Let $U \subseteq \mathbb{R}^n$ be open, and let $f : U \to \mathbb{R}$ be a real function. Let, further,
$g_i : U \to \mathbb{R}$ be real functions, $i = 1, \dots, k$. A point $\mathbf{x}_0 \in U$ is called a local
minimum (resp. maximum) subject to the constraints
\[
g_i(\mathbf{x}) = 0, \quad i = 1, \dots, k \tag{*}
\]
if $\mathbf{x} = \mathbf{x}_0$ satisfies (*) and there exists a $\delta > 0$ such that for every $\mathbf{x} \in \Omega(\mathbf{x}_0, \delta)$,
$\mathbf{x} \ne \mathbf{x}_0$ which satisfies (*) we have $f(\mathbf{x}) > f(\mathbf{x}_0)$ (resp. $f(\mathbf{x}) < f(\mathbf{x}_0)$).
We have the following
Theorem. Let $f, g_1, \dots, g_k$ be real functions defined in an open set $D \subseteq \mathbb{R}^n$, and
suppose they are continuously differentiable. Suppose further that the rank of the
matrix
\[
M = \begin{pmatrix}
\dfrac{\partial g_1}{\partial x_1} & \dots & \dfrac{\partial g_1}{\partial x_n} \\
\vdots & & \vdots \\
\dfrac{\partial g_k}{\partial x_1} & \dots & \dfrac{\partial g_k}{\partial x_n}
\end{pmatrix}
\]
is exactly $k$ at each point of $D$. Suppose $f$ has a local extreme subject to the
constraints (*) at a point $\mathbf{x} = \mathbf{a} = (a_1, \dots, a_n)$. Then there exist numbers
$\lambda_1, \dots, \lambda_k$ (known as Lagrange multipliers)
such that for each $i = 1, \dots, n$, we have
\[
\frac{\partial f(\mathbf{a})}{\partial x_i} + \sum_{j=1}^{k} \lambda_j \frac{\partial g_j(\mathbf{a})}{\partial x_i} = 0.
\]
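Proof. See Subsection 2.4 of Appendix B. If the matrix $M$ has rank $k$, then at least
one of the $k \times k$ submatrices of $M$ is regular, and hence has a non-zero determinant.
Without loss of generality, let us assume that at the extremal point we have, say,
\[
\begin{vmatrix}
\dfrac{\partial g_1}{\partial x_1} & \dots & \dfrac{\partial g_1}{\partial x_k} \\
\vdots & & \vdots \\
\dfrac{\partial g_k}{\partial x_1} & \dots & \dfrac{\partial g_k}{\partial x_k}
\end{vmatrix} \ne 0. \tag{1}
\]
If this holds, we have by 6.3 in a neighborhood of the point $\mathbf{a}$ functions
\[
\phi_i(x_{k+1}, \dots, x_n)
\]
(let us write $\tilde{\mathbf{x}}$ for $(x_{k+1}, \dots, x_n)$) with continuous partial derivatives such that
\[
g_i(\phi_1(\tilde{\mathbf{x}}), \dots, \phi_k(\tilde{\mathbf{x}}), \tilde{\mathbf{x}}) = 0 \quad\text{for } i = 1, \dots, k.
\]
Thus, an extreme (i.e. local maximum or local minimum) of $f(\mathbf{x})$ at $\mathbf{a}$ subject to the
given constraints implies the corresponding extreme property (without constraints)
of the function
\[
F(\tilde{\mathbf{x}}) = f(\phi_1(\tilde{\mathbf{x}}), \dots, \phi_k(\tilde{\mathbf{x}}), \tilde{\mathbf{x}}),
\]
at $\tilde{\mathbf{a}}$, and hence by Proposition 8.2.1,
\[
\frac{\partial F(\tilde{\mathbf{a}})}{\partial x_i} = 0 \quad\text{for } i = k + 1, \dots, n,
\]
and this is, by 3.1.1, equivalent to
\[
\sum_{r=1}^{k} \frac{\partial f(\mathbf{a})}{\partial x_r} \frac{\partial \phi_r(\tilde{\mathbf{a}})}{\partial x_i} + \frac{\partial f(\mathbf{a})}{\partial x_i} = 0 \quad\text{for } i = k + 1, \dots, n. \tag{2}
\]
Taking derivatives of the constant functions $g_j(\phi_1(\tilde{\mathbf{x}}), \dots, \phi_k(\tilde{\mathbf{x}}), \tilde{\mathbf{x}}) = 0$ we obtain
for $j = 1, \dots, k$,
\[
\sum_{r=1}^{k} \frac{\partial g_j(\mathbf{a})}{\partial x_r} \frac{\partial \phi_r(\tilde{\mathbf{a}})}{\partial x_i} + \frac{\partial g_j(\mathbf{a})}{\partial x_i} = 0 \quad\text{for } i = k + 1, \dots, n. \tag{3}
\]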
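Now we will use (1) again, for another purpose. By Theorem B.2.5.1, the system of
linear equations
\[
\frac{\partial f(\mathbf{a})}{\partial x_i} + \sum_{j=1}^{k} \lambda_j \frac{\partial g_j(\mathbf{a})}{\partial x_i} = 0, \quad i = 1, \dots, k,
\]
has a unique solution $\lambda_1, \dots, \lambda_k$. Those are the equalities from the statement, but,
so far, for $i \le k$ only. It remains to be shown that the same equalities hold also for
$i > k$. In effect, by (2) and (3), for $i > k$ we obtain
\begin{align*}
\frac{\partial f(\mathbf{a})}{\partial x_i} + \sum_{j=1}^{k} \lambda_j \frac{\partial g_j(\mathbf{a})}{\partial x_i}
&= - \sum_{r=1}^{k} \frac{\partial f(\mathbf{a})}{\partial x_r} \frac{\partial \phi_r(\tilde{\mathbf{a}})}{\partial x_i} - \sum_{j=1}^{k} \lambda_j \sum_{r=1}^{k} \frac{\partial g_j(\mathbf{a})}{\partial x_r} \frac{\partial \phi_r(\tilde{\mathbf{a}})}{\partial x_i} \\
&= - \sum_{r=1}^{k} \left( \frac{\partial f(\mathbf{a})}{\partial x_r} + \sum_{j=1}^{k} \lambda_j \frac{\partial g_j(\mathbf{a})}{\partial x_r} \right) \frac{\partial \phi_r(\tilde{\mathbf{a}})}{\partial x_i}
= - \sum_{r=1}^{k} 0 \cdot \frac{\partial \phi_r(\tilde{\mathbf{a}})}{\partial x_i} = 0. \qquad \square
\end{align*}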
8.6 Remarks
1. The functions $f, g_i$ were assumed to be defined in an open $D$ so that we can
take derivatives whenever we need them. In particular, this was used in the
computation of $\frac{\partial F(\tilde{\mathbf{a}})}{\partial x_i}$, and the resulting equality (3) in Theorem 8.5 above.
Take the unit ball $B$ from the end of 8.4 with, for example,
$f(x, y) = x + 2y$. Then the formulas $x + 2y$ and $x^2 + y^2 - 1$ make sense
on all of $\mathbb{R}^2$.
2. The force of the statement in 8.5 is in asserting the existence of $\lambda_1, \dots, \lambda_k$ that
satisfy more than $k$ equations, thus creating equations for the $\lambda_i$’s. In the above
mentioned example, we have $\frac{\partial f}{\partial x} = 1$ and $\frac{\partial f}{\partial y} = 2$, $g(x, y) = x^2 + y^2 - 1$ and
hence $\frac{\partial g}{\partial x} = 2x$ and $\frac{\partial g}{\partial y} = 2y$. There is one $\lambda$ that has to satisfy two equations
\[
1 + 2\lambda x = 0 \quad\text{and}\quad 2 + 2\lambda y = 0.
\]
This is possible only if $y = 2x$. Hence, as $x^2 + y^2 = 1$ we obtain $5x^2 = 1$ and
hence $x = \pm \frac{1}{\sqrt{5}}$; this localizes the extremes to $\left( \frac{1}{\sqrt{5}}, \frac{2}{\sqrt{5}} \right)$ and $\left( -\frac{1}{\sqrt{5}}, -\frac{2}{\sqrt{5}} \right)$.
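The two candidate points can be verified directly. The sketch below is an illustration only: the multiplier is recovered from the first equation $1 + 2\lambda x = 0$, and the brute-force scan of the circle with `N` sample points is our own cross-check, not a method from the text.

```python
# Lagrange multiplier check for f(x, y) = x + 2y on the circle x^2 + y^2 = 1.
import math

s5 = math.sqrt(5.0)
candidates = [(1 / s5, 2 / s5), (-1 / s5, -2 / s5)]

def f(x, y):
    return x + 2 * y

residuals = []
for x, y in candidates:
    lam = -1.0 / (2.0 * x)                 # solve 1 + 2*lam*x = 0 for lam
    residuals.append((1 + 2 * lam * x,     # first multiplier equation
                      2 + 2 * lam * y,     # second multiplier equation
                      x * x + y * y - 1))  # the constraint itself

# Brute-force comparison: sample the boundary circle densely.
N = 20000
values = [f(math.cos(2 * math.pi * i / N), math.sin(2 * math.pi * i / N))
          for i in range(N)]
scan_max, scan_min = max(values), min(values)
print(residuals)
print(scan_max, f(*candidates[0]))
print(scan_min, f(*candidates[1]))
```

Since $f$ has no critical points in the interior of $B$ (its partial derivatives are the constants 1 and 2), the global extremes on $B$ are attained at these two boundary points.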
8.7
A problem of finding extremes with constraints may not be related to extremes at
boundary points. Here is an example of another nature.
Let us ask the question which rectangular parallelepiped of a given surface area
has the largest volume. Denoting the lengths of the edges by $x_1, \dots, x_n$, the surface
area is
\[
S(x_1, \dots, x_n) = 2 x_1 \cdots x_n \left( \frac{1}{x_1} + \dots + \frac{1}{x_n} \right)
\]
and the volume is
\[
V(x_1, \dots, x_n) = x_1 \cdots x_n.
\]
Thus, we have
\[
\frac{\partial V}{\partial x_i} = \frac{1}{x_i}\, x_1 \cdots x_n \quad\text{and}\quad \frac{\partial S}{\partial x_i} = \frac{2}{x_i} (x_1 \cdots x_n) \left( \frac{1}{x_1} + \dots + \frac{1}{x_n} \right) - 2 x_1 \cdots x_n \frac{1}{x_i^2}.
\]
If we write $y_i = \frac{1}{x_i}$ and $s = y_1 + \dots + y_n$ and divide the equation from the theorem
by $x_1 \cdots x_n$, we obtain
\[
2 y_i (s - y_i) + \lambda y_i = 0, \quad\text{or}\quad y_i = s + \frac{\lambda}{2}.
\]
Thus, all the $x_i$ are equal and the unique solution is the cube.
9 Exercises
(1) Prove that the function
\[
f(x, y) = \begin{cases}
\dfrac{(x^2 - y)^2}{x^4 + y^2} & \text{for } (x, y) \ne (0, 0), \\[2ex]
1 & \text{for } (x, y) = (0, 0)
\end{cases}
\]
becomes continuous when restricted to any straight line in $\mathbb{R}^2$. Prove, however,
that $f$ is not continuous.
(2) Let $f(x, y) : \mathbb{R}^2 \to \mathbb{R}$ be the function defined by
\[
f(x, y) = e^{-\frac{x}{y} - \frac{y}{x}} \quad\text{for } x, y \ne 0
\]
and by $f(x, 0) = f(0, y) = 0$. Prove that $f$ has partial derivatives of all
orders on $\mathbb{R}^2$, but is not continuous.
[Hint: for x; y ¤ 0, inductively, all partial derivatives (including higher ones)
are of the form Q.x; y/f .x; y/ where Q is a rational function. Taking limits
of such functions along vertical or horizontal lines to points of the form .0; y/,
.x; 0/, however, the limit is always 0. Therefore, by the mean value theorem,
the (possibly higher) partial derivative in question is also 0 at those points.]
(3) Prove Proposition 2.4 in detail.
(4) Prove that if in 3.1.1 the functions $g_k$ have total differentials in $\mathbf{b}$ then $f \circ \mathbf{g}$
has one as well.
(5) Derive, similarly as in 3.4, a formula for the derivative of $\frac{f}{g}$.
(6) How many different expressions 4.3.1 are there?
(7) Find the first three summands in the Taylor expansion of the solution of (5.2.3).
(8) Give a counterexample of the statement of Theorem 5.2 when we drop the
assumption r D 1.
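(9) Implicit differentiation. Let $\mathbf{F} : U \to \mathbb{R}^m$, with $U \subseteq \mathbb{R}^{n+m}$ open, be as
in Theorem 6.3 and let $\mathbf{f} : V \to \mathbb{R}^m$, with $V \subseteq \mathbb{R}^n$ open, be the map mentioned in
Theorem 6.3. Let $D_{\mathbf{x}}\mathbf{F} : \mathbb{R}^n \to \mathbb{R}^m$ and $D_{\mathbf{y}}\mathbf{F} : \mathbb{R}^m \to \mathbb{R}^m$ be linear maps such
that $D\mathbf{F}(\mathbf{x}, \mathbf{y}) = D_{\mathbf{x}}\mathbf{F}(\mathbf{x}) + D_{\mathbf{y}}\mathbf{F}(\mathbf{y})$. Using the chain rule, prove that then
\[
D\mathbf{f}|_{\mathbf{x}} = - \left( D_{\mathbf{y}}\mathbf{F}|_{(\mathbf{x}, \mathbf{f}(\mathbf{x}))} \right)^{-1} \left( D_{\mathbf{x}}\mathbf{F}|_{(\mathbf{x}, \mathbf{f}(\mathbf{x}))} \right).
\]
(10) Prove formula 8.1 (**) in detail.
(11) Prove Proposition 8.2.1 in detail.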
(12) (a) Find a maximum and minimum of the function $f(x, y) = ax + by$ on the
set
\[
B = \{ (x, y) \mid x^2 + y^2 \le 1 \} \subseteq \mathbb{R}^2
\]
for every choice of values of the constants $a, b \in \mathbb{R}$.
(b) Find a minimum and maximum of the function $f(x, y) = x^2 + 2y^2$ on
the set $B$.
Chapter 4. Integration I: Multivariable Riemann Integral and Basic Ideas Toward the Lebesgue Integral
1 Riemann integral on an n-dimensional interval
In the first part of this chapter we will present a simple generalization of the one-
dimensional Riemann integral which the reader already knows (see Section 8 of
Chapter 1). To start with, we will consider the integral only for functions defined
on n-dimensional intervals (= “bricks”) and we will be concerned, basically, with
continuous functions. Later, the domains and functions to be integrated on will
become much more general.
1.1
A compact interval in the n-dimensional Euclidean space $\mathbb{R}^n$ is a product
\[ J = \langle a_1, b_1\rangle \times \cdots \times \langle a_n, b_n\rangle \]
where the $\langle a_k, b_k\rangle$ are compact intervals in $\mathbb{R}$.

A partition $D$ of such an interval is an n-tuple $(D_1, \dots, D_n)$ where the $D_i$ are partitions of the intervals $\langle a_i, b_i\rangle$, that is, sequences
\[ D_i : a_i = t_{i1} < t_{i2} < \cdots < t_{i,n_i} = b_i, \tag{*} \]
often also viewed as sequences of intervals
\[ \langle t_{i1}, t_{i2}\rangle, \langle t_{i2}, t_{i3}\rangle, \dots, \langle t_{i,n_i-1}, t_{i,n_i}\rangle. \]
The partition $D$ above is called a refinement of a partition $D' = (D'_1, \dots, D'_n)$ if the sequences
\[ D'_i : a_i = t'_{i1} < t'_{i2} < \cdots < t'_{i,n'_i} = b_i \]
are subsequences of the sequences (*) above, that is, if every division point of $D'$ is also a division point of $D$.
We have the obvious
We have the obvious
1.1.1 Observation. Any two partitions have a common refinement.
1.2
A member of a partition $D = (D_1, \dots, D_n)$ is any of the intervals (bricks)
\[ \langle t_{1,i_1}, t_{1,i_1+1}\rangle \times \cdots \times \langle t_{n,i_n}, t_{n,i_n+1}\rangle \]
where the $t_{i,j}$ are as in (*). The set of all members of a partition $D$ will be denoted by $|D|$.
The volume of an interval $J = \langle a_1, b_1\rangle \times \cdots \times \langle a_n, b_n\rangle$ is the number
\[ \mathrm{vol}\,J = \prod_{i=1}^{n} (b_i - a_i). \]
Let $f$ be a bounded function on an interval $J$ and let $D$ be a partition of $J$. The lower (resp. upper) sum of $f$ in $D$ is the number
\[ s(f,D) = \sum_{K \in |D|} m_K\,\mathrm{vol}\,K \quad\text{resp.}\quad S(f,D) = \sum_{K \in |D|} M_K\,\mathrm{vol}\,K \]
where
\[ m_K = \inf\{f(x) \mid x \in K\} \quad\text{and}\quad M_K = \sup\{f(x) \mid x \in K\}. \]
1.2.1
From the definitions of suprema and infima we immediately see that if $D$ refines $D'$ then
\[ s(f,D) \ge s(f,D') \quad\text{and}\quad S(f,D) \le S(f,D'), \tag{*} \]
and taking into account a common refinement we immediately obtain
Observation. For any two partitions $D, D'$ we have
\[ s(f,D) \le S(f,D'). \]
Now we can define the lower and the upper Riemann integral of $f$ over $J$ by setting
\[ \underline{\int}_J f = \sup_D s(f,D) \quad\text{and}\quad \overline{\int}_J f = \inf_D S(f,D), \]
and if these two values coincide we speak of the Riemann integral of $f$ over $J$ and write
\[ \int_J f \]
or, if we wish to emphasize the variables,
\[ \int_J f(x_1, \dots, x_n)\,dx_1 \cdots dx_n \quad\text{or}\quad \int_J f(x)\,dx. \]
We then speak of a Riemann integrable function.
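As a purely numerical illustration of the lower and upper sums, here is a minimal Python sketch. It is not part of the text's development: the helper name `lower_upper_sums` is hypothetical, only uniform partitions are used, and the function is assumed to be non-decreasing in each coordinate, so that the infimum and supremum on each brick are attained at its lowest and highest corner.

```python
from itertools import product


def lower_upper_sums(f, bounds, n):
    """Lower and upper sums of f over the brick `bounds` for the uniform
    partition with n pieces per axis.

    Assumption (not from the text): f is non-decreasing in every coordinate,
    so inf f on a brick is f(lowest corner) and sup f is f(highest corner).
    """
    grids = [[a + (b - a) * k / n for k in range(n + 1)] for a, b in bounds]
    lower = upper = 0.0
    for idx in product(range(n), repeat=len(bounds)):
        lo = [g[i] for g, i in zip(grids, idx)]        # lowest corner of the brick K
        hi = [g[i + 1] for g, i in zip(grids, idx)]    # highest corner of the brick K
        vol = 1.0
        for l, h in zip(lo, hi):
            vol *= h - l                               # vol K as a product of side lengths
        lower += f(*lo) * vol                          # m_K * vol K
        upper += f(*hi) * vol                          # M_K * vol K
    return lower, upper


if __name__ == "__main__":
    unit_square = [(0.0, 1.0), (0.0, 1.0)]
    for n in (2, 4, 8, 16):
        print(n, lower_upper_sums(lambda x, y: x * y, unit_square, n))
```

For f(x, y) = xy on the unit square the two sums bracket the value 1/4, and passing from n to 2n (a refinement) raises the lower sum and lowers the upper sum, in line with 1.2.1 (*).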
1.3
The following easy fact can be left to the reader (it can be proved by a literal
repetition of the one variable case – Exercise (1)).
Proposition. If $f, g$ are Riemann integrable and if $\alpha, \beta$ are real numbers then $\alpha f + \beta g$ is Riemann integrable and we have
\[ \int_J (\alpha f + \beta g) = \alpha \int_J f + \beta \int_J g. \]
1.4 Almost disjoint unions of intervals
An interval $J = \langle a_1, b_1\rangle \times \cdots \times \langle a_n, b_n\rangle$ is an almost disjoint union of a pair of intervals $J^i = \langle a_1^i, b_1^i\rangle \times \cdots \times \langle a_n^i, b_n^i\rangle$, $i = 1, 2$, if for some $k$ we have
\[ \forall j \ne k,\quad (a_j, b_j) = (a_j^1, b_j^1) = (a_j^2, b_j^2), \quad\text{and} \]
\[ a_k = a_k^1,\ b_k^1 = a_k^2,\ b_k^2 = b_k \quad\text{or}\quad a_k = a_k^2,\ b_k^2 = a_k^1,\ b_k^1 = b_k. \]
An interval $J$ is an almost disjoint union of intervals $J_1, J_2, \dots, J_n$ if it can be produced recursively from $J_1, \dots, J_n$ by taking almost disjoint unions of pairs, using each $J_i$ precisely once.
1.4.1 Proposition. Let $J$ be an almost disjoint union of intervals $J^i$, $i = 1, \dots, n$. Then
\[ \underline{\int}_J f = \sum_{i=1}^{n} \underline{\int}_{J^i} f \quad\text{and}\quad \overline{\int}_J f = \sum_{i=1}^{n} \overline{\int}_{J^i} f. \]
Proof. It suffices to prove the statement for an almost disjoint union of a pair of intervals $J^1, J^2$, and for this case it suffices to realize that each partition of $J$ can be refined into a pair of partitions of the $J^i$'s, and, on the other hand, from any pair of partitions of the $J^i$'s we can obtain, using common refinements, a partition of $J$. □
2 Continuous functions are Riemann integrable
2.1 Theorem. A function $f$ is Riemann integrable if and only if for every $\varepsilon > 0$ there exists a partition $D$ such that
\[ S(f,D) - s(f,D) < \varepsilon. \]
Proof. If the formula holds, then, for each $\varepsilon > 0$,
\[ \overline{\int}_J f \le S(f,D) < s(f,D) + \varepsilon \le \underline{\int}_J f + \varepsilon \le \overline{\int}_J f + \varepsilon. \]
On the other hand, if $\underline{\int}_J f = \overline{\int}_J f$ then by definition there are $D', D''$ such that $S(f,D') - s(f,D'') < \varepsilon$; take a common refinement $D$ of $D', D''$ and use 1.2.1 (*). □
Theorem. Every continuous function on an interval $J$ is Riemann integrable.
Proof. By Theorem 6.6 of Chapter 2, $f$ is uniformly continuous. Take an $\varepsilon > 0$ and choose a $\delta > 0$ such that for the distance in $\mathbb{R}^n$ we have
\[ d(x,y) < \delta \ \Rightarrow\ |f(x) - f(y)| < \frac{\varepsilon}{\mathrm{vol}\,J}. \]
Further, choose a partition $D$ such that
\[ \forall K \in |D|,\ \forall x, y \in K,\quad d(x,y) < \delta. \]
Then we have for $m_K = \inf\{f(x) \mid x \in K\}$ and $M_K = \sup\{f(x) \mid x \in K\}$,
\[ M_K - m_K \le \frac{\varepsilon}{\mathrm{vol}\,J}, \]
and since obviously
\[ \sum_{K \in |D|} \mathrm{vol}\,K = \mathrm{vol}\,J, \]
we have
\[ S(f,D) - s(f,D) = \sum_{K \in |D|} (M_K - m_K)\,\mathrm{vol}\,K \le \frac{\varepsilon}{\mathrm{vol}\,J} \sum_{K \in |D|} \mathrm{vol}\,K = \varepsilon. \]
□
2.2
The following statements are straightforward (they hold more generally, but we will
need them so far for continuous functions only).
Proposition. Let $f, g$ be continuous functions. Then
1. $\left|\int_J f\right| \le \int_J |f|$.
2. If $f \le g$ then $\int_J f \le \int_J g$.
3. In particular, if $f(x) \le C$ for all $x \in J$ then
\[ \int_J f \le C \cdot \mathrm{vol}\,J. \]
3 Fubini’s Theorem in the continuous case
3.1 Theorem. Let $J' \subseteq \mathbb{R}^m$, $J'' \subseteq \mathbb{R}^n$ be intervals, $J = J' \times J''$. Let $f$ be a continuous function defined on $J$. Then
\[ \int_J f(x,y)\,d(x,y) = \int_{J'} \Big( \int_{J''} f(x,y)\,dy \Big) dx = \int_{J''} \Big( \int_{J'} f(x,y)\,dx \Big) dy. \]
Proof. We will prove the first equality; the second one is analogous.
Put $F(x) = \int_{J''} f(x,y)\,dy$. We will prove that
\[ \int_J f = \int_{J'} F. \]
This will also include the fact that the latter integral exists; this could be easily shown by proving, using uniform continuity, that $F$ is continuous. But we will get it during the proof for free anyway.
Choose a partition $D$ of $J$ such that
\[ \int_J f - \varepsilon \le s(f,D) \le S(f,D) \le \int_J f + \varepsilon. \]
The partition $D$ (as any partition of $J$) obviously consists of a partition $D'$ of $J'$ and a partition $D''$ of $J''$, and we have
\[ |D| = \{K' \times K'' \mid K' \in |D'|,\ K'' \in |D''|\} \]
and each member appears as precisely one $K' \times K''$. We have
\[ F(x) \le \sum_{K'' \in |D''|} \max_{y \in K''} f(x,y)\,\mathrm{vol}\,K'' \]
and hence
\[ S(F,D') \le \sum_{K' \in |D'|} \max_{x \in K'} \Big( \sum_{K'' \in |D''|} \max_{y \in K''} f(x,y)\,\mathrm{vol}\,K'' \Big) \mathrm{vol}\,K' \]
\[ \le \sum_{K' \in |D'|} \sum_{K'' \in |D''|} \max_{(x,y) \in K' \times K''} f(x,y)\,\mathrm{vol}\,K''\,\mathrm{vol}\,K' \]
\[ = \sum_{K' \times K'' \in |D|} \max_{z \in K' \times K''} f(z)\,\mathrm{vol}(K' \times K'') = S(f,D) \]
and similarly
\[ s(f,D) \le s(F,D'). \]
Hence we have
\[ \int_J f - \varepsilon \le s(F,D') \le \underline{\int}_{J'} F \le \overline{\int}_{J'} F \le S(F,D') \le \int_J f + \varepsilon, \]
and therefore $\int_{J'} F$ exists and is equal to $\int_J f$. □
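The statement can be checked numerically on a simple example. The following Python sketch is only an illustration under stated assumptions: the helper names are hypothetical, the midpoint rule stands in for the Riemann integral, and the test function f(x, y) = x y^2 on the brick ⟨0,1⟩ × ⟨0,2⟩ (with exact integral 4/3) is chosen for convenience.

```python
def midpoint_integral(g, a, b, n=200):
    """Midpoint-rule approximation of the integral of g over <a, b>."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))


def iterated_xy(f, jx, jy, n=200):
    """Integrate over y first (inner integral F(x)), then over x."""
    def inner(x):
        return midpoint_integral(lambda y: f(x, y), jy[0], jy[1], n)
    return midpoint_integral(inner, jx[0], jx[1], n)


def iterated_yx(f, jx, jy, n=200):
    """Integrate over x first, then over y."""
    def inner(y):
        return midpoint_integral(lambda x: f(x, y), jx[0], jx[1], n)
    return midpoint_integral(inner, jy[0], jy[1], n)


if __name__ == "__main__":
    f = lambda x, y: x * y ** 2
    print(iterated_xy(f, (0.0, 1.0), (0.0, 2.0)))
    print(iterated_yx(f, (0.0, 1.0), (0.0, 2.0)))
```

Both orders of integration give a value close to 4/3. Since both are finite double sums over the same grid, their agreement is automatic here; the sketch only makes the two iterated integrals of the theorem concrete.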
4 Uniform convergence and Dini’s Theorem
4.1 Theorem. Let $f_n$ be continuous real functions on a compact interval $J$ and let them converge uniformly to a function $f$. Then
\[ \int_J f = \lim_{n \to \infty} \int_J f_n. \]
Proof. Choose an $\varepsilon > 0$ and an $n_0$ such that, for $n \ge n_0$,
\[ |f_n(x) - f(x)| < \frac{\varepsilon}{\mathrm{vol}\,J}. \]
The symbols $m_K$ and $M_K$ will be as in 1.2, and the corresponding values for $f_n$ will be denoted by $m_K^n$ and $M_K^n$. Thus we have
\[ |m_K - m_K^n|,\ |M_K - M_K^n| < \frac{\varepsilon}{\mathrm{vol}\,J} \]
so that
\[ |s(f,D) - s(f_n,D)| \le \sum_{K \in |D|} |m_K - m_K^n|\,\mathrm{vol}\,K < \varepsilon \]
(again we use the fact that $\sum_{K \in |D|} \mathrm{vol}\,K = \mathrm{vol}\,J$) and similarly
\[ |S(f,D) - S(f_n,D)| < \varepsilon. \]
Choose a partition $D$ such that
\[ \int_J f - \varepsilon \le s(f,D) \le S(f,D) \le \int_J f + \varepsilon. \]
Then
\[ \int_J f - 2\varepsilon \le s(f,D) - \varepsilon \le s(f_n,D) \le \int_J f_n \le S(f_n,D) \le S(f,D) + \varepsilon \le \int_J f + 2\varepsilon, \]
and we conclude that $\lim \int_J f_n = \int_J f$. □
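A small numerical experiment illustrates the estimate used in the proof. The sketch below is an assumption-laden example rather than anything from the text: it takes f(x) = |x| on ⟨-1,1⟩ and the uniformly convergent approximations f_n(x) = sqrt(x^2 + 1/n^2), approximates the integrals by the midpoint rule, and estimates the sup distance on a finite grid. All function names are hypothetical.

```python
import math


def f(x):
    return abs(x)


def f_n(n, x):
    # Differs from |x| by at most 1/n everywhere, so f_n -> f uniformly.
    return math.sqrt(x * x + 1.0 / (n * n))


def midpoint_integral(g, a, b, nodes=2000):
    h = (b - a) / nodes
    return h * sum(g(a + (k + 0.5) * h) for k in range(nodes))


def sup_gap(n, a=-1.0, b=1.0, samples=2001):
    """Grid estimate of sup |f_n - f| on <a, b> (the grid contains x = 0)."""
    pts = (a + (b - a) * k / (samples - 1) for k in range(samples))
    return max(abs(f_n(n, x) - f(x)) for x in pts)


def integral_gap(n, a=-1.0, b=1.0):
    """Difference of the approximate integrals of f_n and f over <a, b>."""
    return abs(midpoint_integral(lambda x: f_n(n, x), a, b)
               - midpoint_integral(f, a, b))


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, sup_gap(n), integral_gap(n))
```

In each row the gap between the integrals stays below vol J times the sup distance (here vol J = 2), which is the bound behind the theorem.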
4.2 Notation
A sequence $(f_n)_n$ of functions is said to be increasing if for all $x$
\[ f_1(x) \le f_2(x) \le \cdots \le f_n(x) \le \cdots \]
(usually this is referred to as non-decreasing, but “increasing” is shorter and there will be no danger of confusion). Similarly we speak of a decreasing sequence.
In the remainder of this chapter, we will allow infinite values, that is, a function will be a mapping $f : \mathbb{R}^m \to \mathbb{R} \cup \{-\infty, +\infty\}$. Consequently, an increasing (resp. decreasing) sequence $(f_n)_n$ always has a limit, namely the supremum resp. infimum. We write
\[ f_n \nearrow f \quad\text{resp.}\quad f_n \searrow f \]
and if there is a danger of confusion (e.g. in double indexing) we emphasize the varying index as in
\[ f_{nk} \nearrow_k f_n, \qquad f_{nk} \searrow_k f_n. \]
The notation $a_n \searrow a$, $a_n \nearrow a$ may be used also for monotone sequences of numbers.
The constant zero function will be denoted, hopefully without danger of confusion, simply by 0.
4.3 Theorem. (Dini) Let $f_n$ be continuous real functions on a compact metric space $X$ and let $f_n \searrow 0$. Then $f_n$ converge to 0 uniformly.
Proof. It suffices to prove that $m_n = \max_x f_n(x)$ converges to zero, because then $|f_n(x) - 0| < \varepsilon$ for sufficiently large $n$ independently of the choice of $x \in X$.
Suppose it does not. Reducing, possibly, $f_n$ to a subsequence, we obtain an example with
\[ f_n \searrow 0 \quad\text{and}\quad \forall n,\ m_n > \varepsilon_0 \]
for a fixed $\varepsilon_0 > 0$.
Since $X$ is a compact metric space, there exist $x_n$ such that $f_n(x_n) = m_n$, and we can choose a subsequence of $x_n$ converging to some $x \in X$. After reducing to a subsequence, we may assume without loss of generality that we have
\[ f_n \searrow 0, \quad \forall n\ f_n(x_n) > \varepsilon_0 \quad\text{and}\quad \lim_n x_n = x. \]
Now for $k \ge n$,
\[ f_n(x_k) \ge f_k(x_k) > \varepsilon_0 \]
and hence
\[ f_n(x) = \lim_k f_n(x_k) \ge \varepsilon_0 \quad\text{for all } n. \]
This is a contradiction with $\lim_n f_n(x) = 0$. □
4.4
From 4.1 and 4.3, we immediately obtain the following
Corollary. Let $f_n$ be continuous real functions on a compact interval $J$ and let $f_n \searrow 0$. Then
\[ \lim_n \int_J f_n = 0. \]
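For a concrete instance of Dini's Theorem and its corollary, one may take f_n(x) = x^n (1 - x) on ⟨0,1⟩: these functions are continuous, decrease pointwise in n, and tend to 0 at every point. The Python sketch below is illustrative only; the example and the helper names are assumptions, and the maximum is estimated on a finite grid rather than computed exactly.

```python
def f(n, x):
    # Continuous on <0, 1>, non-negative, decreasing in n, pointwise limit 0.
    return x ** n * (1.0 - x)


def grid_max(n, samples=1001):
    """Grid estimate of m_n = max f_n over <0, 1>."""
    return max(f(n, k / (samples - 1)) for k in range(samples))


def exact_integral(n):
    # Integral of x^n - x^(n+1) over <0, 1>: 1/(n+1) - 1/(n+2).
    return 1.0 / ((n + 1) * (n + 2))


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16, 32, 64):
        print(n, grid_max(n), exact_integral(n))
```

The printed maxima decrease to 0, which is the uniform convergence asserted by the theorem, and the integrals tend to 0 as in the corollary.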
5 Preparing for an extension of the Riemann integral
5.1
For many purposes, the Riemann integral is not sufficiently general. For example,
we may be interested in computing integrals such as
\[ \int_0^1 \frac{dx}{\sqrt{x}} = 2\sqrt{x}\,\Big|_0^1 = 2, \]
which however is incorrect in the setting we considered so far, since the Riemann
integral on the left-hand side does not exist. While in this particular case there is a
quick fix in the form of “improper Riemann integrals” (which we do not treat here),
clearly, a more systematic solution is needed: What about a function f where f .x/
is 0 for x rational and 1 for x irrational? (This function is known as the Dirichlet
function.) Obviously, f is not Riemann integrable, but should we define
\[ \int_0^1 f(x)\,dx = 1 \]
to express that modifying the value of the function which is constantly equal to 1 on
countably many points should not change the value of the integral? More generally,
can one define the integral in such a way that we have
\[ \int \lim f_n = \lim \int f_n \tag{*} \]
in a situation more general than the case of a uniform limit? Clearly, it is

unreasonable to expect (*) in complete generality: for example, consider functions
fn where fn is constant n on the interval .0; 1=n/ and constant 0 elsewhere. Then
fn ! 0, while each of the functions fn has (Riemann) integral equal to 1.
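The counterexample can be made concrete in a few lines of Python. This is a minimal sketch with hypothetical names; the integral of the step function is computed as height times length with exact rational arithmetic, which is an elementary shortcut and not a general integration routine.

```python
from fractions import Fraction


def f(n, x):
    """f_n equals n on the open interval (0, 1/n) and 0 elsewhere."""
    return n if 0 < x < Fraction(1, n) else 0


def integral_f(n):
    # Step function: height n times the length 1/n of the interval (0, 1/n).
    return Fraction(n) * Fraction(1, n)


if __name__ == "__main__":
    x = Fraction(1, 100)
    print([f(n, x) for n in (10, 50, 100, 1000)])   # eventually 0 at the fixed point x
    print([integral_f(n) for n in (1, 10, 1000)])   # always 1
```

At any fixed point the values are eventually 0, yet every integral equals 1, so the limit of the integrals is not the integral of the limit.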
Given all these questions, it is remarkable that there is a satisfactory answer:
people more or less agree on one standard extension of the Riemann integral to
a much larger class of functions, known as the Lebesgue integral. While there
are different approaches to the Lebesgue integral, and the concept is somewhat
notorious for taking a long time to cover, we will present here a relatively quick yet rigorous approach of defining the Lebesgue integral simply by starting with certain special cases of (*) as the definition of the value of $\int \lim f_n$, and then showing that this leads to a consistent theory. This approach to the Lebesgue integral is due to P. J. Daniell.
5.2 The class Z
The support of a function $f : \mathbb{R}^n \to \mathbb{R}$ is the closure of the set $\{x \in \mathbb{R}^n \mid f(x) \ne 0\}$. The support of $f$ is denoted by
\[ \mathrm{supp}(f). \]
Thus, a function has compact support if and only if it vanishes outside a compact subset $X \subseteq \mathbb{R}^n$. There is obviously the smallest interval $J_0$ containing the set $X$. Any interval $J$ containing $J_0$ is easily represented as an almost disjoint union of a set of intervals containing $J_0$ and such that $f$ is zero on all the other members of the system. Thus by 1.4, the integral $\int_J f$ does not depend on the choice of the interval $J$ containing the support $X$ of $f$. We will denote the common value by
\[ If \]
(we will reserve the standard symbol $\int$ for an extended integral defined later).
The set of all continuous functions with compact support in $\mathbb{R}^n$ will be denoted by
\[ Z. \]
Let us summarize the basic facts we will use below: We have a class $Z$ of functions defined on $\mathbb{R}^n$ such that
(Z1) for all $\alpha, \beta \in \mathbb{R}$ and $f, g \in Z$, $\alpha f + \beta g \in Z$,
(Z2) if $f \in Z$ then $|f| \in Z$,
and a mapping $I : Z \to \mathbb{R}$ such that
(I1) if $f \ge 0$ then $If \ge 0$,
(I2) $I$ is a linear map, and
(I3) if $f_n \searrow 0$ then $If_n \searrow 0$
(for (I3), use 4.4, realizing that the support of $f_n$ is contained in the support of $f_1$).
Below, we will consistently use only the facts (Zj) and (Ij) and their consequences. For example, let $\max(f,g)$ (resp. $\min(f,g)$) denote the function whose value at a point $x$ is $\max(f(x), g(x))$ (resp. $\min(f(x), g(x))$), and let $f^+ = \max(f, 0)$, $f^- = -\min(f, 0)$. Note that
\[ \max(f,g) = \frac{1}{2}(f + g + |f - g|) \quad\text{and}\quad \min(f,g) = \frac{1}{2}(f + g - |f - g|). \]
Thus, we easily deduce that
\[ f \le g \ \Rightarrow\ If \le Ig; \quad\text{and} \]
\[ f, g \in Z \ \Rightarrow\ \max(f,g),\ \min(f,g),\ f^+,\ f^- \in Z. \]
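The identities expressing max and min through the absolute value, which are what makes (Z1) and (Z2) sufficient, can be checked pointwise. The Python sketch below uses hypothetical helper names and represents functions as plain callables; the negative part is taken as -min(f, 0), which is an assumption about the sign convention (chosen so that f = f^+ - f^-).

```python
def fmax(f, g):
    """max(f, g) built only from sums, scalar multiples and absolute values."""
    return lambda x: 0.5 * (f(x) + g(x) + abs(f(x) - g(x)))


def fmin(f, g):
    """min(f, g) built only from sums, scalar multiples and absolute values."""
    return lambda x: 0.5 * (f(x) + g(x) - abs(f(x) - g(x)))


def zero(x):
    return 0.0


def positive_part(f):
    return fmax(f, zero)


def negative_part(f):
    # Assumed convention: f^- = -min(f, 0), so that f = f^+ - f^-.
    return lambda x: -fmin(f, zero)(x)


if __name__ == "__main__":
    f = lambda x: x
    g = lambda x: 1.0 - x
    for x in (-1.0, 0.25, 0.5, 2.0):
        print(x, fmax(f, g)(x), fmin(f, g)(x),
              positive_part(f)(x), negative_part(f)(x))
```

Because only linear combinations and absolute values appear, any class of functions satisfying (Z1) and (Z2) is automatically closed under these four operations.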
6 A modest extension
6.1
Define
\[ Z^{up} = \{f : \mathbb{R}^n \to (-\infty, +\infty] \mid \exists f_n \in Z,\ f_n \nearrow f\}, \]
\[ Z^{dn} = \{f : \mathbb{R}^n \to [-\infty, +\infty) \mid \exists f_n \in Z,\ f_n \searrow f\}, \]
\[ Z^{*} = Z^{up} \cup Z^{dn}. \]
Remark. We choose, of course, the topology on $(-\infty, +\infty]$ where a set is a neighborhood of $+\infty$ if and only if it contains some interval $(K, +\infty]$. This makes $f_n \nearrow f$ well defined: it means that $f_n$ is an increasing sequence of functions in $Z$ such that for each $x \in \mathbb{R}^n$, the sequence $f_n(x)$ converges to $f(x)$ in $(-\infty, +\infty]$. The treatment of $Z^{dn}$ is symmetrical. We will refer to $f$ as a monotone limit of the functions $f_n \nearrow f$ or $f_n \searrow f$. The functions in $Z^{*}$ are not necessarily continuous, they do not have to have a compact support, and can (obviously) reach infinite values. Also note that $Z \subseteq Z^{up} \cap Z^{dn}$ and this inclusion is not an equality.
6.2 Proposition. Let $f, g \in Z^{*}$ be monotone limits of sequences of functions $f_n \in Z$ and $g_n \in Z$, respectively. Let $f \le g$. Then
\[ \lim If_n \le \lim Ig_n. \]
Proof. (a) If $f_n \nearrow f$ and $g_n \searrow g$ then $f_n \le f \le g \le g_n$.
(b) Let $f_n \nearrow f$ and $g_n \nearrow g$. For a fixed $k$ set
\[ h_n = \min(g_n, f_k). \]
Then the sequence $(h_n)$ increases and we have
\[ \lim h_n = \min(g, f_k) = f_k, \]
and hence
\[ h_n \nearrow_n f_k, \quad\text{that is,}\quad (f_k - h_n) \searrow_n 0 \]
and we obtain, by (I3), that $\lim_n Ih_n = If_k$. Now $g_n \ge h_n$, hence $Ig_n \ge Ih_n$, and hence
\[ \lim_n Ig_n \ge If_k \]
for each $k$ so that finally $\lim_n If_n \le \lim_k Ig_k$.
(c) If $f_n \searrow f$ and $g_n \searrow g$ use (b) for $-f, -g$.
(d) Let $f_n \searrow f$ and $g_n \nearrow g$. Then $f_n - g_n \le h_n = (f_n - g_n)^+$; since $h_n \searrow 0$ we have $\lim Ih_n = 0$ and finally
\[ \lim If_n - \lim Ig_n = \lim I(f_n - g_n) \le 0. \]
□
6.3 A Corollary and a Definition
For $f \in Z^{*}$, we can define
\[ If = \lim_n If_n \]
where $f_n$ is an arbitrary monotone sequence of functions in $Z$ converging (pointwise) to $f$.
6.4 A few immediate facts
For the purposes of integration, it is convenient to adopt the convention $0 \cdot (+\infty) = 0 \cdot (-\infty) = 0$. We will use this convention for the remainder of this chapter, and in Chapter 5.
(a) $f \in Z^{up}$ if and only if $-f \in Z^{dn}$.
(b) If $f, g \in Z^{up}$ resp. $Z^{dn}$ then $f + g \in Z^{up}$ resp. $Z^{dn}$ and we have $I(f + g) = If + Ig$.
(c) If $f \in Z^{up}$ and $\alpha \ge 0$ resp. $\alpha \le 0$ then $\alpha f \in Z^{up}$ resp. $Z^{dn}$ and we have $I(\alpha f) = \alpha\,If$.
(d) If $f, g \in Z^{*}$ and $f \le g$ then $If \le Ig$.
(e) If $f, g \in Z^{up}$ then $\max(f,g), \min(f,g) \in Z^{up}$.
6.5 Proposition. Let $f_n \in Z^{up}$ and $f_n \nearrow f$. Then $f \in Z^{up}$ and $If_n \nearrow If$. Similarly for $f_n \in Z^{dn}$ and $f_n \searrow f$.
Proof. Choose $f_{nk} \in Z$ such that $f_{nk} \nearrow_k f_n$ and set
\[ g_n = \max\{f_{ij} \mid 1 \le i, j \le n\}. \]
(The maximum of finitely many functions is defined by applying the definition of 5.2 recursively; alternately, take the maximum of the values at one point at a time.)
Then $g_n \nearrow g$ for some $g$. Since
\[ g_n(x) = f_{ij}(x) \le f_i(x) \quad\text{for some } i, j \le n \]
we have
\[ g_n \le f_n \le f. \tag{1} \]
On the other hand, for $k \ge n$ we have $g_k \ge f_{nk}$ and hence
\[ g \ge f_n. \tag{2} \]
By (1) and (2), $g_n \nearrow f$.
Regarding the value of $If$, by (2), $If = Ig \ge If_n$ and hence $If \ge \lim If_n$; on the other hand, by (1), $If = \lim Ig_n \le \lim If_n$. □
7 A definition of the Lebesgue integral and an important lemma
In this section, we will define the well-known Lebesgue integral by the method of
Daniell. This approach differs from the original Lebesgue construction based on
defining a measure first. Here we will obtain measure later as a consequence of an
already defined integral. We will see in Chapter 5 that the basic properties of
measure will follow practically for free.
7.1
For an arbitrary function $f : \mathbb{R}^n \to [-\infty, \infty]$, let
\[ \underline{\int} f = \sup\{Ig \mid g \le f,\ g \in Z^{dn}\} \quad\text{and}\quad \overline{\int} f = \inf\{Ig \mid g \ge f,\ g \in Z^{up}\}. \]
$\underline{\int} f$ resp. $\overline{\int} f$ is called the lower resp. upper (Lebesgue) integral of $f$.
Remark. This notation will not interfere with the notation for the lower and upper Riemann integral introduced in 1.2 and used through Section 4. While the meanings of both notations are in fact different, we will not encounter the lower and upper Riemann integral any longer (with the exception of the Exercises).
7.2 Proposition. (1) $\underline{\int} f = \sup\{Ig \mid g \le f,\ g \in Z^{*}\}$ and $\overline{\int} f = \inf\{Ig \mid g \ge f,\ g \in Z^{*}\}$.
(2) $\underline{\int} f \le \overline{\int} f$.
(3) If $f \le g$ then $\underline{\int} f \le \underline{\int} g$ and $\overline{\int} f \le \overline{\int} g$.
Proof. (1) Assume that, say, the second equality does not hold. Then there exists a $g \ge f$, $g \in Z^{dn}$, such that $Ig < \overline{\int} f$. Let $g_n \searrow g$ with $g_n \in Z$. Then there has to be a $k$ such that $Ig_k < \overline{\int} f$. This is a contradiction, since $g_k \ge f$ and $g_k \in Z \subseteq Z^{up}$.
(2) and (3) are trivial. □
7.3
From 7.2 (1), we immediately obtain the following
Corollary. For $f \in Z^{*}$ we have $\underline{\int} f = \overline{\int} f = If$.
7.4
Denote by
\[ L \]
the set of all functions $f$ such that $\underline{\int} f = \overline{\int} f$ and such that the common value is finite. Such functions are called (Lebesgue) integrable, the common finite value is called the Lebesgue integral of $f$ and denoted by
\[ \int f. \]
We will keep this notation for a while to distinguish the Lebesgue integral from the types of integral developed earlier. Note, however, that in practice, other notations are also common, for example, if $x_1, \dots, x_n$ are the standard coordinates in $\mathbb{R}^n$, one commonly writes
\[ \int f(x_1, \dots, x_n)\,dx_1 \dots dx_n \]
or
\[ \int f(x)\,dx \]
for the Lebesgue integral also.

Remark. The assumption of finiteness of the common value is essential. Functions with infinite $\underline{\int} f = \overline{\int} f$ can in general misbehave. We will have functions with infinite Lebesgue integral later, but their class will have to be restricted – see 7.9 below.
7.5 Proposition. A function $f : \mathbb{R}^n \to [-\infty, \infty]$ satisfies $f \in L$ if and only if for every $\varepsilon > 0$ there exist $g_1 \in Z^{dn}$ and $g_2 \in Z^{up}$, $g_1 \le f \le g_2$, such that the $Ig_i$ are finite and $Ig_2 - Ig_1 < \varepsilon$.
Proof. The implication $\Rightarrow$ is obvious.
$\Leftarrow$: If the $g_i$ are as assumed in the statement, then
\[ Ig_1 \le \underline{\int} f \le \overline{\int} f \le Ig_2 < Ig_1 + \varepsilon \]
so that $\overline{\int} f - \underline{\int} f$ is smaller than any $\varepsilon > 0$. □
7.6 Convention
Functions from $L$ can have infinite values. Let us agree that in case of $f(x) = +\infty$ and $g(x) = -\infty$ the value $f(x) + g(x)$ will be chosen arbitrarily. We will see that for our purposes such arbitrariness in the definition of $f + g$ does not matter.
7.7 Proposition. (1) If $f, g \in L$ then $f + g \in L$ and one has
\[ \int (f + g) = \int f + \int g. \]
(2) If $f \in L$ then any $\alpha f \in L$ and one has
\[ \int \alpha f = \alpha \int f. \]
(3) If $f, g \in L$ then $\max(f,g) \in L$ and $\min(f,g) \in L$.
(4) If $f, g \in L$ and $f \le g$ then $\int f \le \int g$.
(5) If $f \in L$ then $f^+, f^- \in L$.
(6) If $f \in L$ then $|f| \in L$ and $\left|\int f\right| \le \int |f|$.
Proof. (1) We shall use 7.5. Choose $f_1, g_1 \in Z^{dn}$ and $f_2, g_2 \in Z^{up}$ such that $f_1 \le f \le f_2$, $g_1 \le g \le g_2$ and $If_2 - If_1 < \varepsilon$, $Ig_2 - Ig_1 < \varepsilon$. Then
\[ f_1 + g_1 \le f + g \le f_2 + g_2 \tag{*} \]
and the statement follows (realize that the inequalities hold also at the ambiguous points mentioned in the convention of 7.6: if, say, $f(x) = +\infty$ and $g(x) = -\infty$ then $f_2(x) = +\infty$ and $g_1(x) = -\infty$; $f_1(x) < +\infty$, as a limit of a decreasing sequence of finite numbers, and similarly $g_2(x) > -\infty$, so that the inequalities (*) are satisfied trivially).
(2) follows immediately from 7.5.
(3) Take the $f_i, g_i$ as in (1) to obtain
\[ \max(f_1, g_1) \le \max(f, g) \le \max(f_2, g_2) \quad\text{and}\quad \min(f_1, g_1) \le \min(f, g) \le \min(f_2, g_2) \]
and realize that
\[ \max(f_2, g_2) - \max(f_1, g_1) \le (f_2 - f_1) + (g_2 - g_1). \]
Similarly for the minimum.
(4) is obvious and (5) follows from (3).
(6) $\left|\int f\right| = \left|\int (f^+ - f^-)\right| = \left|\int f^+ - \int f^-\right| \le \int f^+ + \int f^- = \int |f|$. □
7.8 Lemma. If $f_n \in L$ and if $f_n \nearrow f$ then
\[ \lim \int f_n = \overline{\int} f. \]
Remarks before the proof.
1. This lemma is very important and will play a crucial role below.
2. As $\underline{\int} f_n \le \underline{\int} f$, we have trivially $\lim \int f_n \le \underline{\int} f$. Hence, under the assumptions of the lemma, we have
\[ \lim_n \int f_n = \underline{\int} f = \overline{\int} f. \]
Proof. We obviously have $\lim \int f_n \le \overline{\int} f$, and if $\lim \int f_n = +\infty$ the equality is trivial.
Thus, we can assume that the limit is finite. By the definition of $\int f_n$ choose $g_n \in Z^{up}$, $g_n \ge f_n$, such that
\[ \int f_n + \frac{\varepsilon}{2^{n+1}} > Ig_n. \]
Set $h_n = \max\{g_i \mid i = 1, \dots, n\}$. Then $h_n \in Z^{up}$ and the sequence $h_n$ is increasing so that by 6.5, $h = \lim h_n \in Z^{up}$. Now $h_n \ge g_n \ge f_n$ and hence $h \ge f$, and $Ih \ge \overline{\int} f$.
Here is an important
Claim.
\[ h_n - f_n \le (g_1 - f_1) + (g_2 - f_2) + \cdots + (g_n - f_n). \]
(Indeed, at each point $x$, we have $g_j(x) - f_j(x) = h_n(x) - f_j(x)$ for some $j \le n$. The summands are non-negative, and hence the inequality holds for $j = n$; otherwise the sum is greater than or equal to $h_n(x) - f_j(x) + g_n(x) - f_n(x) = h_n(x) - f_n(x) + g_n(x) - f_j(x) \ge h_n(x) - f_n(x) + g_n(x) - f_n(x) \ge h_n(x) - f_n(x)$.)
Thus we have
\[ Ih_n - \int f_n \le \sum_{i=1}^{n} \frac{\varepsilon}{2^{i+1}} < \varepsilon \]
so that $Ih_n \le \int f_n + \varepsilon$ and finally $\overline{\int} f \le Ih = \lim Ih_n \le \lim \int f_n + \varepsilon$. □
7.9 Some more notation
Set
\[ L^{up} = \{f \mid \exists f_n \in L,\ f_n \nearrow f\}, \quad L^{dn} = \{f \mid \exists f_n \in L,\ f_n \searrow f\}, \quad\text{and}\quad L^{*} = L^{up} \cup L^{dn}. \]
Now we obtain from 7.8 the following
7.9.1 Corollary. For each $f \in L^{*}$ we have $\underline{\int} f = \overline{\int} f$. Consequently,
\[ L^{up} \cap L^{dn} = L. \]
7.9.2 Convention
For $f \in L^{*}$ we will use the symbol $\int f$ for the common value of $\underline{\int} f$ and $\overline{\int} f$, even when it is infinite. However, we will not refer to such functions as integrable.
7.9.3 Proposition. If $f \in L^{*}$ and if the integral $\int f$ from 7.9.2 is finite then $f \in L$ and the integral coincides with the standard integral in $L$.
Proof. Let, say, $f \in L^{up}$, let $f_n \nearrow f$ with $f_n \in L$. Then by Lemma 7.8 and part 2 of the Remark in 7.8, $\int f = \lim \int f_n = \underline{\int} f = \overline{\int} f$. □
8 Sets of measure zero; the concept of “almost everywhere”
8.1
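The characteristic function of a subset $M \subseteq \mathbb{R}^m$ will be denoted by
\[ c_M \]
(that is, $c_M(x) = 1$ if $x \in M$ and $c_M(x) = 0$ otherwise). We have
\[ M \subseteq N \text{ if and only if } c_M \le c_N, \]
\[ c_{M \cup N} = \max(c_M, c_N) \quad\text{and}\quad c_{M \cap N} = \min(c_M, c_N), \]
and if $M_1 \subseteq M_2 \subseteq \cdots \subseteq M_n \subseteq \cdots$, $M = \bigcup_{n=1}^{\infty} M_n$, then
\[ c_{M_n} \nearrow c_M. \]
$M$ is a set of measure zero if $\overline{\int} c_M = 0$ (then, since $c_M \ge 0$, we also have $\underline{\int} c_M = 0$ and hence $c_M \in L$).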
8.2 Proposition. (1) If $M$ is a set of measure zero and $N \subseteq M$ then $N$ is a set of measure zero.
(2) If $M_n$ are sets of measure zero then also $M = \bigcup_{n=1}^{\infty} M_n$ is a set of measure zero.
Proof. (1) is trivial. For (2), consider $N_n = M_1 \cup \cdots \cup M_n$. Then $c_{N_n} \le c_{M_1} + \cdots + c_{M_n}$ and hence $N_n$ is a set of measure zero by 7.7. Now $c_{N_n} \nearrow c_M$ and hence $\int c_M = 0$ by 7.8. □
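For intuition about why a countable union of negligible sets stays negligible, the following Python sketch uses the classical covering picture, which is a different route from the Daniell-style argument above and is offered only as an illustration. It lists finitely many rationals in ⟨0,1⟩ and covers the k-th one by an open interval of length ε/2^(k+1), so that the total length stays below ε however many points are covered. The function names and the particular enumeration are assumptions.

```python
from fractions import Fraction


def rationals_in_unit_interval(max_den):
    """Distinct rationals p/q in <0, 1> with q <= max_den, in a fixed order."""
    seen, out = set(), []
    for q in range(1, max_den + 1):
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
    return out


def cover(points, eps):
    """Cover the k-th point by an open interval of length eps / 2**(k + 1)."""
    return [(r - eps / 2 ** (k + 2), r + eps / 2 ** (k + 2))
            for k, r in enumerate(points)]


def total_length(intervals):
    return sum(b - a for a, b in intervals)


if __name__ == "__main__":
    eps = Fraction(1, 100)
    pts = rationals_in_unit_interval(10)
    print(len(pts), float(total_length(cover(pts, eps))))
```

The total length is ε(1 - 2^(-N)) for N covered points, hence below ε for every N; letting N grow mirrors the monotone limit c_{N_n} ↗ c_M used in the proof.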
8.3
Let $V(x)$ be a statement about points in $\mathbb{R}^m$. We say that
$V$ holds almost everywhere (briefly, a.e.)
if the set
\[ \{x \mid \text{not } V(x)\} \]
is a set of measure zero.
If $f(x) = g(x)$ almost everywhere, we will write
\[ f \sim g. \]
8.4 Proposition. (1) If $f \in L$ then $f(x)$ is finite almost everywhere.
(2) If $f \in L^{up}$ (resp. $L^{dn}$) then $f(x) > -\infty$ (resp. $< +\infty$) almost everywhere.
Proof. (1) Recall the convention on sums in 7.6, and Proposition 7.7 (1). We may define $f + (-f)$ equally well as 0 or as $c_M$ where $M = \{x \mid f(x) = \pm\infty\}$ and hence $\int c_M = \int 0 = 0$.
(2) When $f \in L^{up}$, take $f_n \in L$ with $f_n \nearrow f$. Then $\{x \mid f(x) = -\infty\} \subseteq \{x \mid f_1(x) = \pm\infty\}$ and the latter set is a set of measure zero by (1). The case of $f \in L^{dn}$ is analogous. □
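8.5 Proposition. If $f \sim g$ then $\underline{\int} f = \underline{\int} g$ and $\overline{\int} f = \overline{\int} g$.
Proof. We will consider the case of $\overline{\int}$ (the other case is analogous). If we do not have $\overline{\int} f = \overline{\int} g = +\infty$ we can assume that $\overline{\int} f < +\infty$. Set $M = \{x \mid f(x) \ne g(x)\}$ and $r_n = n \cdot c_M$. By 7.8 we have $\int r = 0$ for $r = \lim r_n$.
Choose $h_1, h_2 \in Z^{up}$ such that $h_1 \ge f$, $h_2 \ge r$, $Ih_1 < \overline{\int} f + \varepsilon$ and $Ih_2 < \varepsilon$. Then we have $h_1 + h_2 \in Z^{up}$, $h_1 + h_2 \ge g$, and hence $\overline{\int} g \le Ih_1 + Ih_2 < \overline{\int} f + 2\varepsilon$.
Thus, $\overline{\int} g \le \overline{\int} f$, in particular $\overline{\int} g < +\infty$, and we can repeat the procedure with $f, g$ interchanged. □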
8.6 Corollary. (1) If $f \in L$ and $f \sim g$ then $g \in L$.
(2) If $f \in L^{up}$ resp. $L^{dn}$ and $f \sim g$ then $g \in L^{up}$ resp. $L^{dn}$.
8.7 Proposition. If $f \ge 0$ and $\int f = 0$ then $f \sim 0$.
Proof. Set $M_n = \{x \mid f(x) \ge \frac{1}{n}\}$. Since $0 \le c_{M_n} \le nf$ we have $\int c_{M_n} = 0$, hence $M_n$ is a set of measure zero, and consequently $\{x \mid f(x) \ne 0\} = \bigcup_{n=1}^{\infty} M_n$ is a set of measure zero. □
9 Exercises
(1) Prove Proposition 1.3.

(3) Prove the second equality in Theorem 3.1.
(4) Prove that the lower Riemann integral of a bounded function on an interval
in Rn is always less than or equal to the lower Lebesgue integral, and that
the upper Riemann integral is always greater than or equal to the upper
Lebesgue integral. Conclude that a Riemann integrable function on an interval
is Lebesgue integrable and that both integrals are equal.
(5) Prove by definition that the Lebesgue integral of the function equal to $x^{-q}$ on $\langle 0, b\rangle$ and 0 elsewhere, where $0 < q < 1$, $b > 0$ are constants, exists, and compute it. [Hint: Consider the functions equal to $x^{-q}$ on $\langle a, b\rangle$ where $0 < a < b$ and 0 elsewhere.]
(6) Prove that the Lebesgue integral of the function $f(x) = \dfrac{1}{1 + x^2}$ exists and compute it. [Hint: see the hint to Exercise (5).]
(7) Prove that the function which is equal to 1 on every irrational number in $\langle 0, 1\rangle$ and 0 elsewhere is Lebesgue integrable and calculate its Lebesgue integral.
(8) Prove that the Cantor set of Exercise (19) in Chapter 2 has measure 0. [Hint: Express its characteristic function as an appropriate monotone limit.]
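(9) By a generalized Cantor set, we shall mean the intersection $S = \bigcap S_i$ of sets $S_0 \supseteq S_1 \supseteq S_2 \supseteq \dots$ constructed as follows: We put $S_0 = \langle 0, 1\rangle$. The set $S_n$ is a union of $2^n$ closed intervals $\langle a_i, b_i\rangle$, $i = 1, \dots, 2^n$, and for some number $\varepsilon_n > 0$, $\varepsilon_n < \dfrac{b_i - a_i}{2}$, we have
\[ S_{n+1} = S_n \smallsetminus \bigcup_{i=1}^{2^n} \left( \frac{a_i + b_i}{2} - \varepsilon_n,\ \frac{a_i + b_i}{2} + \varepsilon_n \right). \]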
(a) Prove that there exist generalized Cantor sets which are not of measure 0.
(b) Derive a necessary and sufficient condition (in terms of the numbers "i )
for the set S to be of measure 0.
(10) (a) Prove that for two generalized Cantor sets $S$, $T$, there exists a monotone homeomorphism $\phi : \langle 0, 1\rangle \to \langle 0, 1\rangle$ such that $\phi[S] = T$. [Hint: Construct such a map with $S$, $T$ replaced by $S_n$, $T_n$ and prove that the sequence of those maps converges uniformly. Use a separate argument to show that the limit is monotone.]
(b) Conclude that for a homeomorphism $\langle 0, 1\rangle \to \langle 0, 1\rangle$, a continuous image of a set of measure 0 may not be of measure 0.
(11) Let $f : \mathbb{R} \to \langle 0, 1\rangle$ be defined as follows: If $x$ is irrational, then $f(x) = 0$. If $x = a/b$ where $a \in \mathbb{Z}$, $b \in \mathbb{N}$ and the greatest common divisor of $a$ and $b$ is 1, then $f(a/b) = 1/b$. Prove that $f$ is continuous almost everywhere. [Hint: Try to guess the set of all points at which $f$ is continuous.]
5 Integration II: Measurable Functions, Measure and the Techniques of Lebesgue Integration
1 Lebesgue’s Theorems
1.1 Theorem. (Lebesgue's Monotone Convergence Theorem) Let $f_n \in L^{up}$ and let $f_n \nearrow f$ a.e. Then $f \in L^{up}$ and $\int f = \lim \int f_n$. Similarly for $f_n \in L^{dn}$ and $f_n \searrow f$.
Proof. Let us treat the case $f_n \nearrow f$, $f_n \in L^{up}$; the other case is analogous. Choose $f_{nk} \in L$ such that $f_{nk} \nearrow_k f_n$ and set
\[ g_n = \max\{f_{ij} \mid i, j \le n\}. \]
Now $g_n \nearrow g$ with $g_n \in L$. Since $g_n \le f$ we have $g \le f$. On the other hand, however, $g_p \ge f_{np}$ for $p \ge n$ and hence $g \ge f_n$, and finally $g \ge f$. Thus, $f = g \in L^{up}$.
Now consider the value of $\int f$. If $\lim \int f_n = +\infty$ the equality is trivial; hence we can assume that $\lim \int f_n$ is finite. Then $f_n \in L$ and we can use 7.8 of Chapter 4 to obtain $\lim \int f_n = \overline{\int} f = \int f$. □
Remark. This statement is also known as Levi's Theorem.
1.2 Theorem. (Lebesgue's Dominated Convergence Theorem) Let $f_n \in L$. Assume $\lim f_n(x) = f(x)$ a.e., and let there exist a $g \in L$ such that $|f_n(x)| \le g(x)$ a.e. Then $f \in L$ and $\int f = \lim \int f_n$.
Remark. The attentive reader may worry about the seemingly sloppy formulation: does one mean “almost everywhere one has for all $n$ that $|f_n(x)| \le g(x)$” or “for each $n$ one has that $|f_n(x)| \le g(x)$ almost everywhere”? But it is an easy exercise (Exercise (1)) to show these two statements are equivalent.
Proof. By 8.5 of Chapter 4, we may omit “almost everywhere” from the assumptions.
Set
\[ h_n = \max\{f_k \mid k \ge n\}, \qquad g_n = \min\{f_k \mid k \ge n\}. \]
Since $\max_{j=0,\dots,p} f_{n+j} \nearrow_p h_n$ we have $h_n \in L^{up}$, and similarly $g_n \in L^{dn}$. But we have, moreover,
\[ -g \le g_n \le f_n \le h_n \le g \]
and hence $\int g_n$ and $\int h_n$ are finite and we have in fact $g_n, h_n \in L$, and consequently $g_n \in L^{up}$ and $h_n \in L^{dn}$ and we can use Lebesgue's Monotone Convergence Theorem. Now obviously $g_n \nearrow f$ and $h_n \searrow f$, by Lebesgue's Monotone Convergence Theorem we have $\lim \int g_n = \lim \int h_n = \int f$, and finally since $g_n \le f_n \le h_n$ we conclude that $\int f = \lim \int f_n$. □
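A standard example where the Dominated Convergence Theorem applies although the convergence is not uniform is f_n(x) = n x e^{-nx} on ⟨0,1⟩: every f_n is bounded by the constant 1/e, f_n tends to 0 pointwise, and the maximum 1/e is attained at x = 1/n for each n. The Python sketch below is an illustration under these assumptions; the names are hypothetical and the closed form of the integral comes from the substitution t = nx.

```python
import math

DOMINATING_CONSTANT = 1.0 / math.e   # sup of t * exp(-t) over t >= 0


def f(n, x):
    return n * x * math.exp(-n * x)


def exact_integral(n):
    # (1/n) * integral of t * exp(-t) over <0, n> = (1 - (n + 1) * exp(-n)) / n
    return (1.0 - (n + 1) * math.exp(-n)) / n


def midpoint_integral(g, a, b, nodes=20000):
    h = (b - a) / nodes
    return h * sum(g(a + (k + 0.5) * h) for k in range(nodes))


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        peak = f(n, 1.0 / n)          # stays at 1/e: no uniform convergence
        print(n, peak, exact_integral(n))
```

The peak value never decreases, yet the integrals tend to 0, which is the integral of the pointwise limit, as the theorem predicts.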
1.3 Proposition. Let $g \in L$, let $f_n \in L^{*}$, let $f_n \ge g$ a.e. and let $\lim_n f_n(x) = f(x)$ a.e. Then $f \in L^{up}$. Similarly for $f_n \le g$ we obtain $f \in L^{dn}$.
Proof. Since $-\infty < \int g \le \int f_n$, $f_n \in L^{up}$ (if $f_n \in L^{dn}$ it has, hence, a finite integral so that, by 7.9.3 of Chapter 4, $f_n \in L \subseteq L^{up}$ as well). Set $\varphi = \sup_n f_n$. We have $\max_{k \le n} f_k \nearrow_n \varphi$ and hence $\varphi \in L^{up}$ by 1.1, and there exist $\varphi_n \in L$ such that $\varphi_n \nearrow \varphi$. Obviously $\varphi \ge f \ge g$ and we can assume that $\varphi_n \ge g$ (else replace $\varphi_n$ by $\max(\varphi_n, g)$). Set
\[ g_{kn} = \min(\varphi_k, f_n). \]
We have $g \le g_{kn} \le \varphi_k$ and hence $g_{kn} \in L$ and, moreover, we can use Lebesgue's Dominated Convergence Theorem for $\lim_n g_{kn}$ and obtain
\[ \min(\varphi_k, f) = \lim_n g_{kn} \in L. \]
Now we conclude that $\min(\varphi_k, f) \nearrow_k f$ and hence $f \in L^{up}$. □
2 The class ƒ (measurable functions)
2.1
As before, $\lim_n f_n = f$ will be abbreviated by writing $f_n \to f$. Let
\[ \Lambda = \{f \mid \exists f_n \in L,\ f_n \to f\} \]
(unlike in the definition of $L^{up}$ and $L^{dn}$ there is no assumption on the nature of the convergence). Functions which belong to $\Lambda$ are called (Lebesgue) measurable.
2.2 Proposition. If $f \sim g$ and $f \in \Lambda$ then $g \in \Lambda$.
Proof. Let $f_n \in L$ and $f_n \to f$. Define $M = \{x \mid f(x) \ne g(x)\}$ and set
\[ g_n(x) = g(x) \text{ for } x \in M, \qquad g_n(x) = f_n(x) \text{ otherwise}. \]
Then by 8.6 of Chapter 4, $g_n \in L$, and $g_n \to g$. □
2.3
From 1.3, we immediately see the following
Corollary. If $f \in \Lambda$ and $f \ge 0$ then $f \in L^{up}$.
2.4
The following is trivial.
Proposition. (a) If $f, g \in \Lambda$ and if $f + g$ makes sense a.e. then $f + g \in \Lambda$.
(b) If $f \in \Lambda$ and $\alpha \in \mathbb{R}$ then $\alpha f \in \Lambda$.
(c) If $f, g \in \Lambda$ then $\max(f,g), \min(f,g) \in \Lambda$.
(d) If $f \in \Lambda$ then $|f| \in \Lambda$.
2.5 Proposition. $f \in \Lambda$ if and only if both $f^+$ and $f^-$ are in $L^{up}$.
Proof. If $f_n$ are in $L$ and $f_n \to f$ then obviously $f_n^+ \to f^+$ and $f_n^- \to f^-$. Use 2.3. The other implication is trivial. □
2.5.1 Corollary. Let $f \in \Lambda$ and let there exist a $g \in L$ such that $|f| \le g$. Then $f \in L$.
2.6 Proposition. If $f_n \in \Lambda$ and if $f_n \to f$ a.e. then $f \in \Lambda$.
Proof. We have $f_n^+, f_n^- \in L^{up}$ and $f_n^+ \to f^+$, $f_n^- \to f^-$. Thus, by 1.3, both $f^+$ and $f^-$ are in $L^{up}$. □
2.7 Proposition. $f \in L^{*}$ if and only if $f^+$ and $f^-$ are in $L^{up}$ and if the difference $\int f^+ - \int f^-$ makes sense.
Consequently, $f \in \Lambda \smallsetminus L^{*}$ if and only if $f^+$ and $f^-$ are in $L^{up}$ and $\int f^+ = \int f^- = +\infty$.
Proof. $\Rightarrow$: Let, say, $f \in L^{up}$ and let $f_n \nearrow f$ and $f_n \in L$. As $f_1 = f_1^+ - f_1^- \le f = f^+ - f^-$ we have $f^- \le f_1^- \in L$ and hence the value of $\int f^-$ is finite.
$\Leftarrow$: If $\int f^+ - \int f^-$ makes sense then at least one of the integrals is finite and either $f^+$ or $f^-$ is in $L$. Thus, $f^+ - f^-$ is either in $L^{up}$ or in $L^{dn}$. □
u
2.8 Remark
Some of the statements proved in this section may be somewhat surprising. It turned
out, for example, that for integrability of a limit of integrable functions, the nature
of the limiting process is not very important: all one needs is that the positive and
negative parts of the limit not both have infinite integrals.
For the value of the integral of the limit, on the other hand, the nature of the
convergence obviously matters a great deal.
3 The Lebesgue measure
3.1
A set $A \subseteq \mathbb{R}^m$ is said to be (Lebesgue) measurable if the characteristic function $c_A$ is in $\Lambda$ (then, of course, it is in $L^{up}$, by 2.3). We put
\[ \mu(A) = \int c_A \]
and call $\mu(A)$ the (Lebesgue) measure of $A$.
Note that this terminology is in accordance with 8.4 of Chapter 4 (see Exercise (4)).
3.2 General facts
If $A, B \subseteq \mathbb{R}^m$ are measurable, $A \cup B$ is measurable and
\[ \mu(A \cup B) \le \mu(A) + \mu(B) \]
(by 2.4, we have $c_{A \cup B} = \max(c_A, c_B)$ $(\le c_A + c_B)$ in $\Lambda$) and if $A, B$ are disjoint then
\[ \mu(A \cup B) = \mu(A) + \mu(B) \tag{3.2.1} \]
as then $c_{A \cup B} = c_A + c_B$.
But we have much more: the measure is countably additive ($\sigma$-additive, as this fact is usually referred to). Here are some facts on measurability.
Proposition. (1) Let $A_n$, $n = 1, 2, \dots$, be measurable sets. Then $\bigcup_{n=1}^\infty A_n$ is measurable. If for any two distinct $n, k$ the intersection $A_n \cap A_k$ is a set of measure zero then
\[ \mu\Big(\bigcup_{n=1}^\infty A_n\Big) = \sum_{n=1}^\infty \mu(A_n). \]
(2) The intersection of a countable system of measurable sets is measurable.
(3) If $A, B$ are measurable then the difference $A \setminus B$ is measurable.
(4) $\mu(\emptyset) = 0$ and for measurable sets $A \subseteq B$, $\mu(A) \le \mu(B)$.
Proof. (1) We have
\[ c_{A_1 \cup \dots \cup A_n} \nearrow_n c_{\bigcup_{n=1}^\infty A_n} \]
and hence $c_{\bigcup_{n=1}^\infty A_n} \in L^{up}$. In the almost disjoint case we obtain the value from the finite additivity (3.2.1) and from Lebesgue’s Monotone Convergence Theorem.
(3) $c_{A \setminus B} = \max(c_A - c_B, 0)$. (See 2.4.)
(2) From (1), (3).
(4) is trivial. □
3.3 Special sets
Proposition. (1) Every open set in $\mathbb{R}^m$ is measurable.
(2) Every closed set in $\mathbb{R}^m$ is measurable.
(3) For the interval $J = \langle a_1, b_1\rangle \times \dots \times \langle a_m, b_m\rangle$, one has
\[ \mu(J) = (b_1 - a_1)(b_2 - a_2) \cdots (b_m - a_m). \]
(4) Every countable set is measurable, with measure 0.
Proof. The Euclidean distance in $\mathbb{R}^m$ will be denoted by $\rho(x, y)$.
(1) It suffices to show that bounded open sets are measurable: for a general open $U$ consider the open balls $B_n = \{x \mid \rho(x, (0, \dots, 0)) < n\}$ and use Proposition 3.2 (1) for $U = \bigcup_n (U \cap B_n)$.
Thus, let $U$ be a bounded open set. Set
\[ A_n = \{x \mid \rho(x, \mathbb{R}^m \setminus U) \ge \tfrac{1}{n}\} \]
and define $f_n: \mathbb{R}^m \to \mathbb{R}$ by
\[ f_n(x) = \frac{\rho(x, \mathbb{R}^m \setminus U)}{\rho(x, \mathbb{R}^m \setminus U) + \rho(x, A_n)}. \]
Since $A_n$ and $\mathbb{R}^m \setminus U$ are disjoint closed sets, $f_n$ is a continuous map. Since $f_n(x) = 0$ for $x \notin U$, we have $f_n \in Z \subseteq \Lambda$. Now if $x \in U$ then $\rho(x, \mathbb{R}^m \setminus U) \ge \frac{1}{n_0}$ for some $n_0$ and hence $x \in A_n$, and $f_n(x) = 1$, for all $n \ge n_0$. Thus,
\[ f_n \to c_U \]
and $c_U \in \Lambda$.
(2) Use (1) and 3.2 (3).
(3) Note that for a bounded closed set $C$ we can use a similar procedure as in (1): this time set
\[ A_n = \{x \mid \rho(x, C) \ge \tfrac{1}{n}\} \]
and define $f_n: \mathbb{R}^m \to \mathbb{R}$ by
\[ f_n(x) = \frac{\rho(x, A_n)}{\rho(x, A_n) + \rho(x, C)}. \]
Now obviously $f_n(x) = 1$ for $x \in C$ and $f_n(x) = 0$ for $\rho(x, C) \ge \frac{1}{n}$. Furthermore, if $k \le n$ then $\rho(x, A_k) \ge \rho(x, A_n)$, and $f_k(x) \ge f_n(x)$. Thus,
\[ f_n \searrow c_C. \]
In particular this holds for the interval $J$. Moreover, $f_n(x) = 0$ outside
\[ \langle a_1 - \tfrac{1}{n}, b_1 + \tfrac{1}{n}\rangle \times \dots \times \langle a_m - \tfrac{1}{n}, b_m + \tfrac{1}{n}\rangle \]
and $0 \le f_n(x) \le 1$ so that by the standard estimate of Riemann integrals
\[ (b_1 - a_1) \cdots (b_m - a_m) \le \int f_n \le (b_1 - a_1 + \tfrac{2}{n}) \cdots (b_m - a_m + \tfrac{2}{n}) \]
and $\int c_J = (b_1 - a_1)(b_2 - a_2) \cdots (b_m - a_m)$ by Lebesgue’s Monotone Convergence Theorem (actually already by Dini’s Theorem).
(4) By (3), $\mu(\{x\}) = 0$. Use 3.2 (1). □
3.4 The set B of Borel sets
The smallest class of subsets of Rm containing all open subsets and closed under
• Complements,
• Countable unions, and
• Countable intersections
(of course, the last follows from the first two) is called the class of Borel sets, and
denoted by B.
Thus, all the open and closed sets are Borel. However, we have more complicated sets. For example, an $F_\sigma$ set is a countable union of closed subsets, and a $G_\delta$ set is an intersection of countably many open subsets. Going on, a $G_{\delta\sigma}$ set is a union of countably many $G_\delta$ sets, and an $F_{\sigma\delta}$ set is an intersection of countably many $F_\sigma$ sets, and so on. All sets produced in this way are Borel by definition.
From 3.2 and 3.3 we immediately obtain
3.4.1 Corollary. Every Borel set is measurable.
3.5
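Let us conclude this section with a trivial remark. From 3.2 (1) and 2.2 (1), we immediately obtain the frequently used somewhat paradoxical observation that for every $\varepsilon > 0$, there exists a dense open set $U$ of the unit interval $I$ such that $\mu(U) < \varepsilon$: order all the rationals in $I$ in a sequence $r_1, r_2, \dots, r_n, \dots$ and set
\[ U = \bigcup_{n=1}^\infty \Big(r_n - \frac{\varepsilon}{2^{n+2}},\ r_n + \frac{\varepsilon}{2^{n+2}}\Big) \]
(where $(a, b)$ designate open intervals). Indeed, $U$ contains all the rationals of $I$, and $\mu(U) \le \sum_{n=1}^\infty \varepsilon/2^{n+1} = \varepsilon/2$.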
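The measure estimate behind this remark is a geometric series, and it can be checked with exact rational arithmetic. The following is a minimal sketch, assuming the $n$-th interval has half-width $\varepsilon/2^{n+2}$ (so length $\varepsilon/2^{n+1}$); the helper name `interval_lengths` and the value $\varepsilon = 1/10$ are illustrative choices, not part of the text.

```python
from fractions import Fraction

def interval_lengths(eps, count):
    """Lengths of the first `count` intervals (r_n - eps/2^(n+2), r_n + eps/2^(n+2)).

    Hypothetical helper: the length does not depend on the rational r_n itself.
    """
    return [2 * eps / 2 ** (n + 2) for n in range(1, count + 1)]

eps = Fraction(1, 10)   # illustrative choice of epsilon
count = 50
partial_sum = sum(interval_lengths(eps, count))

# Closed form of the partial geometric sum: (eps/2) * (1 - 2^(-count)).
closed_form = eps / 2 * (1 - Fraction(1, 2 ** count))
print(partial_sum == closed_form, partial_sum < eps / 2)
```

Every partial sum stays below $\varepsilon/2$, which bounds the measure of the union by countable subadditivity.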
4 The integral over a set
4.1
Unlike the additivity of the classes L etc., we do not have similarly well behaved
multiplicativity properties. Nevertheless, multiplying by characteristic functions cM
of M measurable does give satisfactory results.
Proposition. Let $M$ be a measurable set and let $f \in L$. Then $c_M f \in L$.
Proof. Put $\varphi_n = \min(n c_M, \max(f, -n c_M))$. Then $\varphi_n \in \Lambda$ and since $|\varphi_n| \le |f|$ we have $c_M f = \lim \varphi_n$ in $L$ by 2.5.1. □
4.2
By 4.1, we can define for a measurable set $M$ and $f \in L$,
\[ \int_M f =_{\mathrm{df}} \int c_M f, \]
the integral of $f$ over $M$.
4.3 Proposition. Let $M_n$, $n = 1, 2, \dots$ be measurable.
(a) Let for $n \ne k$, $M_n, M_k$ be almost disjoint (i.e. $\mu(M_n \cap M_k) = 0$), $f \in \Lambda$, and assume that for $M = \bigcup M_n$, $\int_M f$ makes sense. Then
\[ \int_M f = \sum_{n=1}^\infty \int_{M_n} f. \]
(b) Let $M_1 \subseteq M_2 \subseteq \cdots$, $M = \bigcup M_n$ and assume that $\int_M f$ makes sense. Then
\[ \int_M f = \lim_n \int_{M_n} f. \]
(c) Let $M_1 \supseteq M_2 \supseteq \cdots$, $M = \bigcap M_n$ and assume that $\int_{M_1} f$ makes sense. Then
\[ \int_M f = \lim_n \int_{M_n} f. \]
Proof. (a) For $f \ge 0$ the statement immediately follows from Lebesgue’s Monotone Convergence Theorem and the fact that the sum formula obviously holds for finitely many $M_n$. Thus, we have the equality for $f^+$ and $f^-$. Now if $\int_M f$ makes sense then by 2.7 one of $\int_M f^+$, $\int_M f^-$ is finite, and hence at least one of the series $\sum_{n=1}^\infty \int_{M_n} f^+$, $\sum_{n=1}^\infty \int_{M_n} f^-$ converges, and since the summands are non-negative, it converges absolutely. Thus,
\[ \int_M f = \int_M f^+ - \int_M f^- = \sum_{n=1}^\infty \int_{M_n} f^+ - \sum_{n=1}^\infty \int_{M_n} f^- = \sum_{n=1}^\infty \int_{M_n} (f^+ - f^-), \]
the last reshuffling being made possible by the absolute convergence of at least one of the series (and the other’s being a sum of non-negative numbers).
(b) Apply (a) for $M_1, M_2 \setminus M_1, M_3 \setminus M_2, \dots$.
(c) Set $N_n = M_1 \setminus M_n$. Then $M = M_1 \setminus \bigcup N_n$. Use (b). □
4.3.1 Remark
For the general statement, the assumption that $\int_M f$ make sense is essential. The point is that we could have both $\int_M f^+$ and $\int_M f^-$ infinite.
4.4 Criteria of measurability
For many purposes, we need a criterion by which sets and functions are measurable.
Let us begin with the following definition: For a Borel set $X \subseteq \mathbb{R}^m$, a function $f: X \to \langle -\infty, \infty\rangle$ is called Borel measurable if
For every $S \subseteq \langle -\infty, \infty\rangle$ Borel, $f^{-1}[S]$ is Borel. (+)
Theorem. A function $f: X \to \langle -\infty, \infty\rangle$ is (Lebesgue) measurable if and only if there exists a Borel measurable function equal to $f$ almost everywhere.
4.4.1 Corollary. A subset $S \subseteq \mathbb{R}^m$ is measurable if and only if there exists a Borel set $B \subseteq \mathbb{R}^m$ such that $S \setminus B$ and $B \setminus S$ are sets of measure 0.
Comment: Note that since the inverse image preserves unions, intersections and complements, we may equivalently replace every Borel set $S$ in (+) by either every interval $\langle -\infty, a)$, $a \in \mathbb{R}$ or every interval $(a, \infty\rangle$, $a \in \mathbb{R}$.
Proof of the Theorem: We begin by considering the easy implication. First, suppose $f$ is Borel measurable. Then so are $f^+$ and $f^-$, so by Proposition 2.5, we may assume $f \ge 0$. Then define
\[ f_n(x) = \frac{k}{2^n} \quad\text{when}\quad \frac{k}{2^n} \le f(x) < \frac{k+1}{2^n}. \tag{*} \]
Then clearly
\[ f_n \nearrow f. \]
Further, each $f_n$ is an increasing limit of a sequence of functions each of which takes on only finitely many values, the inverse images of which are Borel, and hence measurable sets. Therefore $f_n \in L^{up}$, and hence $f \in L^{up}$.
Now a function equal to f almost everywhere is measurable by Corollary 8.6 of
Chapter 4.
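The dyadic approximation used in the easy implication (the value $k/2^n$ is assigned wherever $k/2^n \le f(x) < (k+1)/2^n$) can be tried out numerically at individual points. This is only a sketch for finite non-negative values; the function name `dyadic_approx` and the sample values are illustrative assumptions.

```python
import math

def dyadic_approx(value, n):
    """Return k/2^n for the integer k with k/2^n <= value < (k+1)/2^n.

    Assumes `value` is a finite, non-negative float (one value f(x) of the function).
    """
    return math.floor(value * 2 ** n) / 2 ** n

samples = [0.0, 0.3, 1.0, math.pi, 7.75]   # illustrative values of f(x)
for v in samples:
    approximations = [dyadic_approx(v, n) for n in range(1, 11)]
    # The sequence is non-decreasing in n and stays within 2^(-n) below v.
    print(v, approximations[:4], approximations[-1])
```

At each point the approximations increase with $n$ and differ from the true value by less than $2^{-n}$, which is the pointwise content of the monotone convergence of the approximations to $f$.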
To prove the converse implication, we first prove some lemmas.
4.4.2 Lemma. If $f_n \nearrow f$ or $f_n \searrow f$ and the functions $f_n$ are Borel-measurable, so is $f$.
Proof. Consider $f_n \nearrow f$ (the case of $f_n \searrow f$ clearly follows by taking negatives). Note that $f(x) > a$ if and only if there exists an $n$ such that $f_n(x) > a$, so $f^{-1}[(a, \infty\rangle] = \bigcup_n f_n^{-1}[(a, \infty\rangle]$, so our statement follows from the Comment. □
4.4.3 Lemma. If $f \ge 0$, $f$ is Borel measurable, and $\int f = 0$, then $f = 0$ almost everywhere.
Proof. Otherwise, $\mu(f^{-1}[(1/n, \infty\rangle]) > 0$ for some $n = 1, 2, \dots$. But then
\[ \int f \ge \int_{f^{-1}[(1/n, \infty\rangle]} f \ge \frac{1}{n}\, \mu(f^{-1}[(1/n, \infty\rangle]) > 0. \]
A contradiction. □
4.4.4 Lemma. If $f$, $g$ are Borel measurable, so is $-f$ and, if $f, g \ge 0$, also $f + g$.
Proof. The statement for $-f$ is immediate. For $f + g$, note that $(f + g)(x) < a$ if and only if there exist rational numbers $q$, $r$ such that $f(x) < q$, $g(x) < r$ and $q + r < a$ and thus, $(f + g)^{-1}[(-\infty, a)]$ is the (countable) union of the Borel sets $f^{-1}[(-\infty, q)] \cap g^{-1}[(-\infty, r)]$. □
Now let $f$ be measurable. Then by Lemma 4.4 of Chapter 4, and Proposition 2.5, it suffices to prove the statement for $f^+$, $f^-$, and hence, by Lemma 4.4.2, for $f \in L$.
When $f \in L$, by 4.7.5, there exist $g_n \in Z^{dn}$ such that
\[ g_n \le g_{n+1} \le f \]
and
\[ \int g_n \nearrow \int f. \]
Similarly, there exist $h_n \in Z^{up}$ such that
\[ h_n \ge h_{n+1} \ge f \]
and
\[ \int h_n \searrow \int f. \]
By the Comment, functions in $Z^{up}$ and $Z^{dn}$ are clearly Borel-measurable, so if we put
\[ g = \lim g_n, \quad h = \lim h_n, \]
$g$ and $h$ are Borel-measurable functions,
\[ g \le f \le h \]
and
\[ \int (h - g) = 0. \]
By Lemma 4.4.4, $h - g$ is Borel measurable. Let
\[ B = \{x \mid (h - g)(x) = 0\}. \]
By Lemma 4.4.3, $X \setminus B$ is a set of measure 0. Therefore, we can take $h$ as the Borel measurable function required by the statement. □
4.5 Corollary. A function $f: \mathbb{R}^m \to \langle -\infty, \infty\rangle$ is measurable if and only if for every interval $B = (a, \infty\rangle$ (alternately, every interval $B = \langle -\infty, a)$), $f^{-1}[B]$ is measurable.
Proof. If $f$ is measurable then, by Theorem 4.4, it is equal to a Borel-measurable function almost everywhere, and hence clearly satisfies our condition by Proposition 8.2 (1) of Chapter 4.
If, on the other hand, $f$ satisfies our criterion then, as in the Comment above, $f^{-1}[B]$ is Lebesgue measurable for every Borel set $B$. As above, we may pass to the functions $f^+$ and $f^-$, and hence may assume that $f \ge 0$. Now the formula (*) again produces an increasing sequence of measurable functions converging to $f$, and hence $f$ is measurable. □
5 Parameters
5.1 Theorem. Let $T$ be a metric space, $t_0 \in T$, and let $f: T \times \mathbb{R}^m \to \mathbb{R} \cup \{+\infty, -\infty\}$ be a function such that
(1) for almost all $x$, $f(-, x)$ is continuous in a point $t_0$,
(2) there is a neighborhood $U$ of $t_0$ such that the functions $f(t, -)$ belong to $L$ for all $t \in U \setminus \{t_0\}$, and
(3) there exists a $g \in L$ and a neighborhood $U$ of $t_0$ such that for almost all $x$ and for all $t \in U \setminus \{t_0\}$ one has $|f(t, x)| \le g(x)$.
Then $f(t_0, -)$ is in $L$ and we have
\[ \int f(t_0, -) = \lim_{t \to t_0} \int f(t, -). \]
Proof. Choose $t_n \in U \setminus \{t_0\}$ such that $\lim_n t_n = t_0$ and use the Lebesgue Dominated Convergence Theorem. □
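5.2 Theorem. Let $f: \mathbb{R} \times \mathbb{R}^m \to \mathbb{R} \cup \{+\infty, -\infty\}$ be such that in a neighborhood $U$ of $t_0$
(1) there exist partial derivatives $\frac{\partial f(t, x)}{\partial t}$ for almost all $x$,
(2) there exists a $g \in L$ such that for almost all $x$ and for all $t \in U$ one has
\[ \left| \frac{\partial f(t, x)}{\partial t} \right| \le g(x), \]
(3) and for $t \in U$ there exist $\int f(t, -)$.
Then there exists the integral $\int \frac{\partial f(t_0, -)}{\partial t}$ and one has
\[ \int \frac{\partial f(t_0, -)}{\partial t} = \frac{d}{dt} \int f(t_0, -). \]
Proof. We have $\frac{\partial f(t_0, x)}{\partial t} = \lim_{h \to 0} \frac{1}{h}(f(t_0 + h, x) - f(t_0, x))$. Set $\varphi(h, x) = \frac{1}{h}(f(t_0 + h, x) - f(t_0, x))$. By Lagrange’s Theorem we have, for some $0 < \theta < 1$,
\[ |\varphi(h, x)| = \left| \frac{\partial f(t_0 + \theta h, x)}{\partial t} \right| \le g(x) \]
and hence we can apply Theorem 5.1. □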
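Differentiation under the integral sign can be illustrated numerically. The sketch below uses the integrand $f(t, x) = e^{-t x^2}$ on $(0, 1)$, which is an illustrative assumption (not an example from the text); its $t$-derivative is bounded in absolute value by the integrable function $g(x) = x^2$ for $t \ge 0$. The integral of the partial derivative is compared with a central difference quotient of $t \mapsto \int f(t, -)$; both integrals are approximated by a composite midpoint rule.

```python
import math

def midpoint_integral(func, a, b, n=2000):
    """Composite midpoint rule for the integral of func over (a, b)."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def f(t, x):
    # Illustrative integrand (an assumption, not from the text).
    return math.exp(-t * x * x)

def df_dt(t, x):
    # Partial derivative of f in t; |df_dt| <= x^2 for t >= 0.
    return -x * x * math.exp(-t * x * x)

t0, h = 1.0, 1e-5
# Integral of the partial derivative at t0.
lhs = midpoint_integral(lambda x: df_dt(t0, x), 0.0, 1.0)
# Central difference quotient of t -> integral of f(t, .).
F = lambda t: midpoint_integral(lambda x: f(t, x), 0.0, 1.0)
rhs = (F(t0 + h) - F(t0 - h)) / (2 * h)
print(abs(lhs - rhs))
```

The two numbers agree up to the error of the difference quotient, as the theorem predicts for this dominated family.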
6 Fubini’s Theorem
In this section we will have to indicate the dimension of the Euclidean space we work in. When working in $\mathbb{R}^m$, we will decorate the symbols $Z, Z^{up}, L$ etc. with subscripts $Z_m, Z_m^{up}, L_m$ etc., and for the integral symbols we will use $\int^{(m)}, \overline{\int}^{(m)}, \underline{\int}^{(m)}$ instead of $\int, \overline{\int}, \underline{\int}$.
We will abandon the integral symbol $I$ since we already know that for $f \in Z$ we have $If = \int f$.
Finally, to avoid confusion in the case of two variables we will sometimes use the classical
\[ \int f(x, y)\,dy \ \text{ or } \int f(x, y)\,dx \quad\text{for}\quad \int f(x, -) \ \text{ or } \int f(-, y). \]
6.1 Lemma. For a function $f$ defined on $\mathbb{R}^{m+n}$ define functions $\overline{F}$ and $\underline{F}$ on $\mathbb{R}^m$ by setting
\[ \overline{F}(x) = \overline{\int}^{(n)} f(x, y)\,dy \quad (\text{resp. } \underline{F}(x) = \underline{\int}^{(n)} f(x, y)\,dy). \]
Then one has
\[ \overline{\int}^{(m+n)} f \ge \overline{\int}^{(m)} \overline{F} \quad (\text{resp. } \underline{\int}^{(m+n)} f \le \underline{\int}^{(m)} \underline{F}). \]
Proof. I. If $f \in Z_{m+n}$ then we have equalities, by the case of Fubini’s Theorem for the Riemann integral of continuous maps on compact intervals. Furthermore, writing $F = \overline{F} = \underline{F}$, we have
\[ F \in Z_m. \]
Indeed, choose a compact interval $J$ carrying the function $f$. The function $F$ obviously has compact support, contained in the projection of $J$ (the values elsewhere are integrals of 0). Further, let $K$ be the volume of $J$. For an $\varepsilon > 0$ there exists a $\delta > 0$ such that for $\rho(x, x') < \delta$, we have $|f(x, y) - f(x', y)| < \frac{\varepsilon}{K}$, independently of $y$. Therefore, we have
\[ |F(x) - F(x')| < \frac{\varepsilon}{K} \cdot K = \varepsilon, \]
and $F$ is continuous.
II. Now let $f_k \in Z_{m+n}$, $f_k \nearrow_k f$. Then
\[ F_k(x) = \int f_k(x, y)\,dy \nearrow F(x) \quad\text{and also}\quad f_k(x, -) \nearrow f(x, -) \]
for all $x$. Therefore, we still have
\[ \int^{(m+n)} f = \lim_k \int^{(m+n)} f_k = \lim_k \int^{(m)} F_k = \int^{(m)} F. \]
III. Now let $f$ be general and let $g \in Z^{up}$ be such that $g \ge f$. Put $G(x) = \int^{(n)} g(x, y)\,dy$. Then $G \ge \overline{F}$, and by II we have
\[ \int^{(m+n)} g = \int^{(m)} G \ge \overline{\int}^{(m)} \overline{F} \]
and hence
\[ \overline{\int} f = \inf\Big\{\int g \ \Big|\ g \in Z^{up},\ g \ge f\Big\} \ge \overline{\int}\, \overline{F}. \] □
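Theorem. (Fubini) Let $f \in L_{m+n}$. Then for almost all $x$ there exists the integral $\int^{(n)} f(x, y)\,dy$. If we denote its value by $F(x)$, and define the values $F(x)$ arbitrarily in the remaining points, we have $F \in L_m$ and
\[ \int^{(m+n)} f = \int^{(m)} F. \]
Proof. Put $\overline{F}(x) = \overline{\int} f(x, y)\,dy$ and $\underline{F}(x) = \underline{\int} f(x, y)\,dy$. By Lemma 6.1, we have
\[ \int f = \overline{\int} f \ge \overline{\int}\, \overline{F} \ge \left\{ \begin{matrix} \overline{\int}\, \underline{F} \\ \underline{\int}\, \overline{F} \end{matrix} \right\} \ge \underline{\int}\, \underline{F} \ge \underline{\int} f = \int f. \]
Let $f$ be in $L_{m+n}$. Then the values are finite and we obtain, first of all, that $\overline{\int}\, \overline{F} = \underline{\int}\, \overline{F}$ is finite and hence $\overline{F} \in L_m$, and similarly $\underline{F} \in L_m$. Further, $\int \overline{F} = \int \underline{F}$ and hence $\int (\overline{F} - \underline{F}) = 0$ and hence $\overline{F} - \underline{F} = 0$ almost everywhere, by 4.7. If $f \in L^*_{m+n}$ use Lebesgue’s Monotone Convergence Theorem. □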
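The integrability hypothesis in Fubini's Theorem cannot simply be dropped. The following worked computation is a standard illustration (it is not taken from the text): for the function below on the open unit square, both iterated integrals exist but they differ, so the function cannot be integrable over the square.

```latex
\[
  f(x,y) = \frac{x^2 - y^2}{(x^2 + y^2)^2}, \qquad (x,y) \in (0,1) \times (0,1).
\]
% Inner integral in y: the integrand is the y-derivative of y/(x^2+y^2).
\[
  \int_0^1 f(x,y)\,dy = \left[ \frac{y}{x^2 + y^2} \right]_{y=0}^{y=1} = \frac{1}{1 + x^2},
  \qquad
  \int_0^1 \Big( \int_0^1 f(x,y)\,dy \Big) dx = \int_0^1 \frac{dx}{1 + x^2} = \frac{\pi}{4}.
\]
% Since f(y,x) = -f(x,y), exchanging the order of integration changes the sign.
\[
  \int_0^1 \Big( \int_0^1 f(x,y)\,dx \Big) dy = -\frac{\pi}{4}.
\]
```

Since the two iterated integrals disagree, the conclusion of the theorem fails for this $f$, which is consistent with $f$ not being Lebesgue integrable on the square.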
7 The Substitution Theorem
In this section, we will prove a substitution theorem for multivariable integrals. The
reader should be aware that a much more general substitution theorem is valid (see
[18]). In this text, we would basically be happy with a substitution theorem for the
Riemann integral of a continuous bounded function where the coordinate change is
a diffeomorphism with bounded partial derivatives (as needed, for example, in the
Stokes Theorem in Chapter 12 below). However, we will typically need to integrate
over Borel sets, which makes Lebesgue integral relevant. The purpose of this section
is to give a rigorous, but otherwise as straightforward as possible, proof of the
version of the theorem needed here.
7.1
Recall the set $\mathcal{B}$ of all Borel sets in $\mathbb{R}^m$. Let $U \subseteq \mathbb{R}^m$ be an open set. Define
\[ \mathcal{B}_U = \{S \in \mathcal{B} \mid S \subseteq U\}. \]
Note that clearly, $\mathcal{B}_U$ is the smallest set of subsets of $U$ closed under complements and countable unions, which contains all open subsets of $U$. Let us also write
\[ \mathcal{I}_U = \{\langle a_1, b_1) \times \dots \times \langle a_n, b_n) \mid \langle a_1, b_1\rangle \times \dots \times \langle a_n, b_n\rangle \subseteq U\}. \]
7.2 Lemma. Let $U \subseteq \mathbb{R}^m$ be open and let $S = S_0 \in \mathcal{I}_U$. Then there exist $S_1, S_2, \dots, S_n, \dots$ such that $i \ne j \Rightarrow S_i \cap S_j = \emptyset$ for $i, j = 0, 1, 2, \dots$ and
\[ U = \bigcup_{i=0}^\infty S_i. \]
Proof. Let
\[ S_0 = \langle a_1, b_1) \times \dots \times \langle a_n, b_n). \]
Assume $S_0$ is non-empty. (If $S_0$ is empty, choose $S_1 \in \mathcal{I}_U$ arbitrary and proceed with $k - 1$ instead.) Let $d_i(k) = (b_i - a_i)/2^k$. Assuming $S_0 \ne \emptyset$, let $T_0 = \{S_0\}$. Suppose we have already defined $T_0, \dots, T_{k-1}$. Let $T_k$ be the set of all
\[ \langle r_1, s_1) \times \dots \times \langle r_n, s_n) \subseteq (U \setminus (T_0 \cup \dots \cup T_{k-1})) \]
where
\[ s_i = r_i + d_i(k), \quad r_i = a_i + \ell_i d_i(k) \]
for some $\ell_i \in \mathbb{Z}$, $i = 1, \dots, n$. Let
\[ \{S_1, S_2, \dots\} = T_1 \cup T_2 \cup \dots. \]
By definition, the $S_i$’s are disjoint and one easily checks that $\bigcup S_i$ is open and closed in $U$. □
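7.3 Lemma. For $S \in \mathcal{I}_U$, there exist open sets $U \supseteq V_1 \supseteq \dots \supseteq V_k \supseteq \dots$ such that
\[ S = \bigcap_{k=1}^\infty V_k. \]
Proof. Using the same notation as in Lemma 7.2, take
\[ V_k = (a_1 - \tfrac{1}{k}, b_1) \times \dots \times (a_n - \tfrac{1}{k}, b_n). \] □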
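7.4 Proposition. Let $\mathcal{S}_U$ be the smallest set of subsets of $U$ which satisfies
(1) $\mathcal{I}_U \subseteq \mathcal{S}_U$;
(2) When $S_1, S_2, \dots \in \mathcal{S}_U$ are disjoint, then
\[ \bigcup_{i=1}^\infty S_i \in \mathcal{S}_U; \tag{+} \]
(3) When $S \in \mathcal{S}_U$, we have $U \setminus S \in \mathcal{S}_U$.
Then $\mathcal{S}_U = \mathcal{B}_U$.
Comment: This proposition is a special case of a more abstract theorem known as Dynkin’s Lemma. The proof is essentially the same; the greater generality would be of no use to us.
Proof. Let $S \in \mathcal{S}_U$. Let
\[ \mathcal{S}_U(S) = \{T \in \mathcal{S}_U \mid S \cap T \in \mathcal{S}_U\}. \tag{7.4.1} \]
Step 1: If $S \in \mathcal{S}_U$, then the conditions (2) and (3) above hold with $\mathcal{S}_U$ replaced by $\mathcal{S}_U(S)$.
Proof. (2) is trivial by distributivity. To prove (3), when $S, T, S \cap T \in \mathcal{S}_U$, then
\[ S \cap (U \setminus T) = U \setminus ((U \setminus S) \cup (S \cap T)) \in \mathcal{S}_U, \]
since $(U \setminus S) \cap (S \cap T) = \emptyset$. □
Step 2: When $S \in \mathcal{I}_U$, clearly $\mathcal{I}_U \subseteq \mathcal{S}_U(S)$. Therefore, by Step 1, $\mathcal{S}_U(S) = \mathcal{S}_U$.
Step 3: Now let $S \in \mathcal{S}_U$. By Step 2, $\mathcal{I}_U \subseteq \mathcal{S}_U(S)$. Therefore, by Step 1, $\mathcal{S}_U(S) = \mathcal{S}_U$.
Step 4: By Step 3 (note (7.4.1)) and (2), (+) holds for any $S_1, S_2, \dots \in \mathcal{S}_U$ (without assuming disjointness). By Lemma 7.2 (with $S_0 = \emptyset$ and $U$ replaced by $V$), every open subset $V \subseteq U$ satisfies $V \in \mathcal{S}_U$. Therefore, $\mathcal{B}_U \subseteq \mathcal{S}_U$.
Step 5: Note that by Lemma 7.2, $\mathcal{I}_U \subseteq \mathcal{B}_U$, and hence, by definition, $\mathcal{S}_U \subseteq \mathcal{B}_U$. □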
7.5 Assumption
Assume now $U \subseteq \mathbb{R}^m$ is an open set, and
\[ \mathbf{F}: U \to \mathbb{R}^m \]
is an injective map with continuous first partial derivatives which satisfies
\[ \det(D\mathbf{F}_x) \ne 0 \quad\text{for all } x \in U. \]
(Then $\mathbf{F}$ is regular, and by 7.2, 7.3 of Chapter 3, its image is open and its inverse also satisfies the Assumption.) Recall 3.2 of Chapter 3 for a discussion of $D\mathbf{F}_x$. The attentive reader has noticed that
\[ \det(D\mathbf{F}_x) \]
is a special case of the Jacobian considered in 6.2 of Chapter 3 when the variables $x$ of 6.2 of Chapter 3 are not present and $y$ is labeled as $x$. Many texts, in fact, reserve the term for this special case.
7.6 Lemma. Let $S \in \mathcal{I}_U$. Then
\[ \mu(\mathbf{F}[S]) \le \int_S |\det(D\mathbf{F}_x)|\,dx. \tag{*} \]
Proof. Note first that by Lemma 7.3 and the fact that $\mathbf{F}$ is a homeomorphism onto its image, $\mathbf{F}[S]$ is Borel.
Next, one proves (*) in the case when $\mathbf{F}$ is an affine map (see 5.9 of Appendix A). By the multiplicative property of the determinant with respect to composition, translation-invariance of Lebesgue measure, Fubini’s Theorem and Gauss elimination, it then suffices to prove (*) for $n = 1$ (which is obvious) and for the map
\[ \begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix}. \tag{+} \]
For the case of (+), since $\mu$ is clearly invariant under translation, it suffices to prove the statement for
\[ S = \langle 0, b_1) \times \langle 0, b_2), \quad b_1, b_2 > 0. \]
Then
\[ \mathbf{F}[S] \subseteq \bigcup_{i=0}^{n-1} \Big\langle \frac{i a b_2}{n} - |a|\frac{b_2}{n},\ \frac{i a b_2}{n} + b_1 + |a|\frac{b_2}{n} \Big) \times \Big\langle \frac{i b_2}{n},\ \frac{(i+1) b_2}{n} \Big). \]
The Lebesgue measure of the right-hand side, with $n = 2^k$, $k \to \infty$, clearly approaches $b_1 b_2$, while $\mathbf{F}[S]$ is an intersection of this decreasing sequence of sets.
For the case of general $\mathbf{F}$ satisfying our assumption, by countable additivity, it suffices to consider the case when the $\mathbb{R}^m$-closure $\overline{S}$ of $S$ is contained in $U$. Then since the partial derivatives of $\mathbf{F}$ are continuous on $\overline{S}$, they are uniformly continuous by Theorem 6.6 of Chapter 2. Therefore, for every $\varepsilon > 0$ there exists a $\delta > 0$ such that for $a = (a_1, \dots, a_n) \in S$ and
\[ 0 < b_i - a_i < \delta, \tag{7.6.1} \]
we have
\[ \left| \frac{\partial F_i(y)}{\partial x_j} - \frac{\partial F_i(a)}{\partial x_j} \right| < \varepsilon. \]
By the Mean Value Theorem, then, assuming (7.6.1),
\[ \mathbf{F}[\langle a_1, b_1) \times \dots \times \langle a_n, b_n)] \]
is a subset of
\[ x + D\mathbf{F}_x[\langle -\varepsilon(b_1 - a_1), (b_1 - a_1) + \varepsilon(b_1 - a_1)) \times \dots \times \langle -\varepsilon(b_n - a_n), (b_n - a_n) + \varepsilon(b_n - a_n))]. \]
From the affine case, we conclude that
\[ \mu(\mathbf{F}[\langle a_1, b_1) \times \dots \times \langle a_n, b_n)]) \le (1 + 2\varepsilon)^m\, |\det(D\mathbf{F}_x)|\, (b_1 - a_1) \cdots (b_n - a_n). \]
Since $\varepsilon > 0$ was arbitrary, our statement follows. □
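7.7 Lemma. Let $S \in \mathcal{I}_U$ and let $f: \mathbf{F}[U] \to \mathbb{R}$ be a non-negative continuous function. Then
\[ \int_{\mathbf{F}[S]} f \le \int_S (f \circ \mathbf{F})\, |\det(D\mathbf{F}_x)|\,dx. \tag{*} \]
This also holds with $S$ replaced by an open subset $V \subseteq U$.
Proof. Let
\[ S = \langle a_1, b_1) \times \dots \times \langle a_n, b_n). \]
Let, for integers $0 \le i_1 < 2^k, \dots, 0 \le i_n < 2^k$,
\[ S_k(i_1, \dots, i_n) \]
denote the set
\[ \Big\langle a_1 + \frac{i_1(b_1 - a_1)}{2^k},\ a_1 + \frac{(i_1 + 1)(b_1 - a_1)}{2^k} \Big) \times \dots \times \Big\langle a_n + \frac{i_n(b_n - a_n)}{2^k},\ a_n + \frac{(i_n + 1)(b_n - a_n)}{2^k} \Big). \]
Then define “step functions” $f_k$ by
\[ f_k(x) = \inf_{z \in S_k(i_1, \dots, i_n)} f(z) \quad\text{for } x \in S_k(i_1, \dots, i_n). \]
Then $f_k \nearrow f$ and with $f$ replaced by $f_k$, the statement for $S \in \mathcal{I}_U$ holds by Lemma 7.6. For $V$ open, the statement holds by Lemma 7.2 (with $S_0 = \emptyset$, $U$ replaced by $V$). □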
7.8 Proposition. For $V \subseteq U$ open, $f: \mathbf{F}[U] \to \mathbb{R}$ non-negative continuous, we have
\[ \int_{\mathbf{F}[V]} f = \int_V (f \circ \mathbf{F})\, |\det(D\mathbf{F}_x)|\,dx. \]
The statement also holds with $V$ replaced by $S \in \mathcal{I}_U$.
Proof. First note that the statement for $S \in \mathcal{I}_U$ follows from the statement for $V$ open by Lemma 7.3. For $V$ open, the inequality $\le$ follows from Lemma 7.7. The inequality $\ge$ follows from Lemma 7.7 with $f$ replaced by $f \circ \mathbf{F}$, $\mathbf{F}$ replaced by $\mathbf{F}^{-1}$, $\mathbf{F}[U]$ replaced by $U$ and $V$ replaced by $\mathbf{F}[V]$ (recall that the set $\mathbf{F}[U]$ is open). □
7.9 Theorem. (The Substitution Theorem) Let $\mathbf{F}$ satisfy Assumption 7.5, and let $f: \mathbf{F}[U] \to \mathbb{R}$ be a continuous function. Let $S \in \mathcal{B}_U$. Then
\[ \int_{\mathbf{F}[S]} f = \int_S (f \circ \mathbf{F})\, |\det(D\mathbf{F}_x)|\,dx, \tag{+} \]
provided that the integral on at least one side of the equation exists and is finite.
Proof. By considering $f^+ = \max(f, 0)$, $f^- = -\min(f, 0)$ (recall 5.2 of Chapter 4), we may assume $f \ge 0$. By Proposition 7.8 (for $S \in \mathcal{I}_U$), and by the additivity of the integral, we clearly have (+) for all $S \in \mathcal{S}_U$ and hence our statement follows from Proposition 7.4. □
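As a numerical illustration of the substitution formula, one can take polar coordinates $\mathbf{F}(r, \theta) = (r\cos\theta, r\sin\theta)$ on $S = (0, 1) \times (0, 2\pi)$, where $\mathbf{F}$ is injective with Jacobian determinant $r \ne 0$, and whose image is the open unit disk minus a slit of measure zero. The sketch below is an assumption-laden example rather than anything from the text: the integrand $f(x, y) = e^{-(x^2 + y^2)}$, the grid sizes and the function names are all illustrative. Both sides are approximated by midpoint sums and compared with the closed form $\pi(1 - e^{-1})$.

```python
import math

def f(x, y):
    # Illustrative continuous integrand on the image (an assumption, not from the text).
    return math.exp(-(x * x + y * y))

def F(r, theta):
    # Polar coordinate map; injective on (0, 1) x (0, 2*pi).
    return r * math.cos(theta), r * math.sin(theta)

def abs_det_DF(r, theta):
    # |det DF| for the polar map equals r.
    return abs(r)

def rhs_polar(n_r=200, n_theta=200):
    """Midpoint rule for the integral of (f o F) * |det DF| over (0,1) x (0,2*pi)."""
    hr, ht = 1.0 / n_r, 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * hr
        for j in range(n_theta):
            theta = (j + 0.5) * ht
            total += f(*F(r, theta)) * abs_det_DF(r, theta)
    return total * hr * ht

def lhs_cartesian(n=400):
    """Midpoint rule for the integral of f over the unit disk, via its indicator on [-1,1]^2."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * h
        for j in range(n):
            y = -1.0 + (j + 0.5) * h
            if x * x + y * y < 1.0:
                total += f(x, y)
    return total * h * h

exact = math.pi * (1.0 - math.exp(-1.0))  # closed form of both sides
print(rhs_polar(), lhs_cartesian(), exact)
```

The polar sum converges much faster because its integrand is smooth on a rectangle, while the Cartesian sum has to resolve the curved boundary of the disk.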
8 Hölder’s inequality, Minkowski’s inequality and Lp -spaces
In this section, we will introduce $L^p$-spaces, $1 \le p \le \infty$, which are a very basic source of examples in analysis. The true significance of those spaces in mathematics will emerge in Chapters 16 and 17 below. However, their definition and basic properties are often used throughout analysis, and thus now is a good place to treat them. In this section, let $B$ be a Borel subset of $\mathbb{R}^n$. Let $f$ be a real measurable function defined on $B$. We will write (assuming that the right-hand side is finite) for $1 \le p < \infty$,
\[ \|f\|_p = \Big(\int_B |f|^p\Big)^{\frac{1}{p}}. \]
In this section, we will tend to write $\int$ instead of $\int_B$, since the set $B$ will not change.
For $p = \infty$, one defines
\[ \|f\|_\infty = \inf\{M \ge 0 \mid |f(x)| \le M \text{ almost everywhere on } B\}, \]
again, assuming this number is finite.
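8.1 Theorem. (Hölder’s inequality) Let $p, q > 1$ and let $\frac{1}{p} + \frac{1}{q} = 1$. We have
\[ \int |fg| \le \|f\|_p \|g\|_q. \]
Proof. Put $\alpha = \|f\|_p$, $\beta = \|g\|_q$. Then
\[ \frac{1}{\alpha^p} \int |f|^p = \frac{1}{\beta^q} \int |g|^q = 1. \]
Set $\tilde{f} = \frac{1}{\alpha} f$ and $\tilde{g} = \frac{1}{\beta} g$. By Young’s inequality 4.5.3 of Chapter 1 we have
\[ |\tilde{f}(x)\tilde{g}(x)| \le \frac{|\tilde{f}(x)|^p}{p} + \frac{|\tilde{g}(x)|^q}{q}, \]
and hence
\[ \frac{1}{\alpha\beta} \int |fg| = \int |\tilde{f}\tilde{g}| \le \frac{1}{p} \int |\tilde{f}|^p + \frac{1}{q} \int |\tilde{g}|^q = \frac{1}{p} + \frac{1}{q} = 1, \]
and finally
\[ \int |fg| \le \alpha\beta = \|f\|_p \|g\|_q. \] □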
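8.1.1 Observation. If $|f|^p$ and $|g|^q$ are linearly dependent, then
\[ \int |fg| = \|f\|_p \|g\|_q. \]
Remark: The equality holds if and only if the functions are dependent, but we will not need the other implication.
Proof. Let, say, $|g|^q = \alpha |f|^p$. Then
\[ \|g\|_q = \Big(\int |g|^q\Big)^{\frac{1}{q}} = \alpha^{\frac{1}{q}} \Big(\int |f|^p\Big)^{\frac{1}{q}} = \alpha^{\frac{1}{q}} (\|f\|_p)^{\frac{p}{q}} \]
and hence
\[ \|f\|_p \|g\|_q = \alpha^{\frac{1}{q}} (\|f\|_p)^{1 + \frac{p}{q}} = \alpha^{\frac{1}{q}} (\|f\|_p^p)^{\frac{p+q}{pq}} = \alpha^{\frac{1}{q}} \|f\|_p^p. \]
On the other hand we also have
\[ \int |f||g| = \int \big(|f|\, \alpha^{\frac{1}{q}} |f|^{\frac{p}{q}}\big) = \alpha^{\frac{1}{q}} \int |f|^{\frac{p}{q} + 1} = \alpha^{\frac{1}{q}} \int |f|^{p(\frac{1}{q} + \frac{1}{p})} = \alpha^{\frac{1}{q}} \|f\|_p^p. \] □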
8.2 Theorem. (Minkowski’s inequality) We have, for $1 \le p \le \infty$,
\[ \|f + g\|_p \le \|f\|_p + \|g\|_p \]
whenever the right-hand side is defined.
Proof. The inequality is obvious for $p = 1$ and $p = \infty$, hence we can assume that $\infty > p > 1$.
Recall Proposition 4.5.2 of Chapter 1. For $p \ge 1$ and $x \ge 0$, the function $h(x) = x^p$ is convex (since $h''(x) = p(p - 1)x^{p-2} \ge 0$) and hence we have
\[ |f + g|^p \le \Big(\frac{1}{2}|2f| + \frac{1}{2}|2g|\Big)^p \le \frac{1}{2}|2f|^p + \frac{1}{2}|2g|^p = 2^{p-1}|f|^p + 2^{p-1}|g|^p. \]
Thus, first, if the integrals $\int |f|^p$ and $\int |g|^p$ are finite, also the integral of the sum $\int |f + g|^p$ is finite, and $\|f + g\|_p$ makes sense. If it is zero then the inequality holds. Thus suppose it is not zero.
We have
\[ (\|f + g\|_p)^p = \int |f + g|^p \le \int (|f| + |g|)|f + g|^{p-1} = \int |f||f + g|^{p-1} + \int |g||f + g|^{p-1}. \]
Proceed, using Hölder’s inequality, taking into account that $\frac{1}{q} = 1 - \frac{1}{p} = \frac{p-1}{p}$ and hence $q = \frac{p}{p-1}$,
\[ \int |f||f + g|^{p-1} + \int |g||f + g|^{p-1} \le \Big(\big(\int |f|^p\big)^{\frac{1}{p}} + \big(\int |g|^p\big)^{\frac{1}{p}}\Big) \Big(\int |f + g|^{(p-1)\frac{p}{p-1}}\Big)^{\frac{p-1}{p}} = (\|f\|_p + \|g\|_p)(\|f + g\|_p)^{p-1}. \]
Hence
\[ (\|f + g\|_p)^p \le (\|f\|_p + \|g\|_p)(\|f + g\|_p)^{p-1} \]
and Minkowski’s inequality follows dividing both sides by $(\|f + g\|_p)^{p-1}$. □
8.3 The definition of Lp
Denote by $\mathcal{L}^p(B)$ the set of all measurable functions on $B$ for which
\[ \|f\|_p < \infty. \]
By Theorem 8.2, $\mathcal{L}^p(B)$ is a vector space over $\mathbb{R}$, and it may appear that
\[ \|f - g\|_p \tag{8.3.1} \]
therefore defines a norm on $\mathcal{L}^p(B)$ in the sense of 1.2.1 of Chapter 2. This is, however, not true for the simple reason that two functions $f, g$ which are equal almost everywhere have 0 distance! It is immediately obvious, on the other hand, that the converse is also true, since we have the following fact.
8.3.1 Lemma. If $f: B \to [0, \infty]$ and $\int_B f = 0$, then $f = 0$ almost everywhere on $B$.
Proof. Let, for $\varepsilon > 0$, $E_\varepsilon = \{x \in B \mid f(x) > \varepsilon\}$. Then clearly $\int_B f \ge \varepsilon \mu(E_\varepsilon)$, so $\mu(E_\varepsilon) = 0$. The set $E = E_{1/1} \cup E_{1/2} \cup \dots \cup E_{1/n} \cup \dots$ therefore satisfies $\mu(E) = 0$, but we have $E = \{x \in B \mid f(x) \ne 0\}$. □
Thus, we see that (8.3.1) gives a well-defined norm on the quotient space
\[ L^p(B) = \mathcal{L}^p(B)/\mathcal{L}^0 \]
where $\mathcal{L}^0$ is the subspace of functions which are 0 almost everywhere. (See Section 6 of Appendix A for the definition of a quotient vector space.) More precisely, the formula (8.3.1) is applied to representatives $f$, $g$ of two equivalence classes constituting the quotient space $L^p(B)$, but does not depend on the choice of representatives. Additionally, by what we just observed, the distance of two equivalence classes which are not equal cannot be 0.
In the context of the normed vector spaces $L^p(B)$, it is common to identify a function $f$ notationally with the coset to which it belongs, i.e. to write $f \in L^p(B)$. This slight imprecision does not tend to cause difficulties.
8.4 A comment on complex functions
Sometimes, we are interested in an analogue of the $L^p$-spaces for complex functions. In this context, the following simple result is useful:
8.4.1 Lemma. Let $f: B \to \mathbb{C}$ be an integrable function. Then
\[ \Big| \int_B f \Big| \le \int_B |f|. \]
Proof. Let $\alpha$ be such that $|\alpha| = 1$ and $\alpha \int_B f = |\int_B f|$. Then
\[ \Big| \int_B f \Big| = \alpha \int_B f = \int_B \alpha f = \int_B \mathrm{Re}(\alpha f) \le \int_B |f|. \]
(The last equality follows from the fact that $\int_B \alpha f$ is real.) □
Therefore, Minkowski’s inequality also holds for complex-valued functions by the following argument:
\[ \|f + g\|_p \le \|\,|f| + |g|\,\|_p \le \|\,|f|\,\|_p + \|\,|g|\,\|_p = \|f\|_p + \|g\|_p. \]
The case of $p = \infty$ needs a separate (easy) discussion, see Exercise (17). Note that a complex analogue of Hölder’s inequality follows from the real case immediately.
Thus, we can define the normed vector spaces $L^p(B, \mathbb{C})$, $1 \le p \le \infty$, completely analogously to the spaces $L^p(B)$, with real functions replaced by complex ones.
8.5 Completeness of the spaces Lp
8.5.1 Lemma. (Fatou’s Lemma) Let $f_n: B \to [0, \infty]$ be measurable functions. Then
\[ \int_B (\liminf_{n \to \infty} f_n) \le \liminf_{n \to \infty} \int_B f_n. \]
Proof. Let $g_n = \inf_{m \ge n} f_m$. We have
\[ \int_B g_n \le \inf_{m \ge n} \int_B f_m, \]
while $g_n \nearrow \liminf_{n \to \infty} f_n$, so the statement follows by passing to the limit by the Lebesgue Monotone Convergence Theorem. □
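The inequality in Fatou's Lemma can be strict. A standard illustration (not taken from the text) uses tall, thin spikes on $B = (0, 1)$, written here with the characteristic-function notation $c_A$:

```latex
\[
  f_n = n \cdot c_{(0, 1/n)}, \qquad n = 1, 2, \dots
\]
% For each fixed x in (0,1) we have f_n(x) = 0 as soon as n > 1/x,
% so the pointwise lower limit vanishes everywhere on B.
\[
  \liminf_{n \to \infty} f_n = 0
  \quad\text{on } (0,1),
  \qquad\text{while}\qquad
  \int_B f_n = n \cdot \mu\big((0, 1/n)\big) = 1 \ \text{ for all } n.
\]
\[
  \int_B \big(\liminf_{n \to \infty} f_n\big) = 0 \;<\; 1 = \liminf_{n \to \infty} \int_B f_n .
\]
```

The sequence is neither monotone nor dominated by an integrable function, so neither the Monotone nor the Dominated Convergence Theorem applies, and only the one-sided estimate survives.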
8.5.2 Theorem. The spaces $L^p(B)$ and $L^p(B, \mathbb{C})$, $1 \le p \le \infty$, are complete metric spaces.
Proof. Consider, for example, the complex case (the proof in the real case is the same). Let $f_n: B \to \mathbb{C}$ represent a Cauchy sequence in $L^p$. Then there exist $n_1 < n_2 < \dots < n_k < \dots$ such that
\[ \sum_{k=1}^\infty \|f_{n_k} - f_{n_{k+1}}\|_p < \infty. \]
For $p < \infty$, this means (apply Minkowski’s inequality to the partial sums of $\sum_k |f_{n_k} - f_{n_{k+1}}|$ and use the Monotone Convergence Theorem) that
\[ \sum_{k=1}^\infty |f_{n_k}(x) - f_{n_{k+1}}(x)| < \infty \]
almost everywhere, so $(f_{n_k}(x))_k$ is a Cauchy sequence in $\mathbb{C}$ almost everywhere in $x \in B$, so the sequence of functions $f_{n_k}$ converges in a set $S \subseteq B$ such that $\mu(B \setminus S) = 0$. In the case $p = \infty$, the same conclusion also holds, and moreover, in that case, the convergence is uniform (Exercise (18)). Now let $f(x) = \lim f_{n_k}(x)$ for $x \in S$, and $f(x) = 0$ for $x \in B \setminus S$. In the case of $p = \infty$, we are done. For $p < \infty$, by Fatou’s Lemma 8.5.1,
\[ \int_B |f_n - f|^p \le \liminf_{k \to \infty} \int_B |f_n - f_{n_k}|^p. \tag{8.5.1} \]
If we choose $n$ such that $\|f_n - f_m\|_p < \varepsilon$ for all $m \ge n$, then the right-hand side of (8.5.1) is at most $\varepsilon^p$. Thus the right-hand side of (8.5.1) converges to 0 with $n \to \infty$ because the sequence $f_n$ is Cauchy. □
8.6 An inequality between Lp norms
8.6.1 Lemma. Let $1 < p$ and let $B \subseteq \mathbb{R}^n$ be a Borel subset such that $\mu(B) < \infty$. Then
\[ \Big(\frac{1}{\mu(B)} \int_B |f(x)|\Big)^p \le \frac{1}{\mu(B)} \int_B |f(x)|^p. \]
Proof. Put
\[ x_0 = \frac{1}{\mu(B)} \int_B |f(x)|. \]
Since $(x^p)'' > 0$ on $(0, \infty)$, the derivative of $x^p$ is increasing on $(0, \infty)$. Therefore, if we let $a$ be the value of $(x^p)' = p x^{p-1}$ at $x_0$ and let $b = (x_0)^p - a x_0$, we have
\[ a x_0 + b = (x_0)^p \]
and the derivative of $ax + b$ is $\ge (x^p)'$ on $(0, x_0)$ and $\le (x^p)'$ on $(x_0, \infty)$. We conclude that
\[ ax + b \le x^p \quad\text{for all } x \in (0, \infty). \]
Now compute:
\[ \frac{1}{\mu(B)} \int_B |f(x)|^p \ge \frac{1}{\mu(B)} \int_B (a|f(x)| + b) = a x_0 + b = (x_0)^p, \]
as claimed. □
8.6.2 Theorem. Let $\mu(B) < \infty$, $1 \le r \le p \le \infty$. Then, for a measurable function $f$ on $B$,
\[ \|f\|_r \le \mu(B)^{\frac{1}{r} - \frac{1}{p}} \|f\|_p. \]
In particular, $L^p(B)$ is a subspace of $L^r(B)$ and the inclusion map is continuous (and similarly in the complex case).
Proof. Clearly, the case of $p = \infty$ is a direct consequence of the definition. Additionally, it suffices to consider the case $r = 1$ (otherwise, replace $f$ by $|f|^r$ and $p$ by $p/r$). The case of $r = 1$ and $p < \infty$ follows from Lemma 8.6.1. □
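The comparison of norms on a set of finite measure, in the form $\|f\|_r \le \mu(B)^{1/r - 1/p}\,\|f\|_p$ for $r \le p$, can be checked on a discrete model in which $B = (0, 2)$ is cut into equal cells and integrals become weighted sums. This is a sketch only; the sample function, the exponents and the helper name `weighted_norm` are illustrative assumptions.

```python
def weighted_norm(values, weights, p):
    """Discrete analogue of ||f||_p: (sum_i w_i |f_i|^p)^(1/p)."""
    return sum(w * abs(v) ** p for v, w in zip(values, weights)) ** (1.0 / p)

n = 1000
mu_B = 2.0                                        # total measure of B = (0, 2)
xs = [mu_B * (i + 0.5) / n for i in range(n)]     # cell midpoints
weights = [mu_B / n] * n                          # each cell has measure mu_B / n
f_vals = [x * x - 1.0 for x in xs]                # illustrative sample function

r, p = 1.5, 3.0
lhs = weighted_norm(f_vals, weights, r)
rhs = mu_B ** (1.0 / r - 1.0 / p) * weighted_norm(f_vals, weights, p)
print(lhs, rhs, lhs <= rhs)
```

For a constant function the two sides coincide, which shows that the exponent of $\mu(B)$ cannot be improved.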
9 Exercises
(1) Prove the statement contained in Remark 1.2.
(2) Consider a modification of Theorem 1.1 where one replaces $L^{up}$, $L^{dn}$ by $L^*$. Is this modified statement true? Prove or disprove.
(4) Prove that sets of measure 0 as defined in 8.4 of Chapter 4 are precisely Lebesgue measurable sets of measure 0, as defined in 3.1.
(5) (a) Prove that if $A_1 \subseteq A_2 \subseteq \dots$ are measurable sets and $A = \bigcup A_i$, then
\[ \mu(A) = \lim \mu(A_i). \tag{*} \]
(b) Now let $A_1 \supseteq A_2 \supseteq \dots$, $A = \bigcap A_i$. Give an example when (*) does not hold. Formulate a reasonable hypothesis which fixes the problem. [Hint: Finiteness.]
(6) Let $M \subseteq \mathbb{R}^m$ be a measurable set, and let $f: M \to \mathbb{R}$ be a function such that for every Borel set $S \subseteq \mathbb{R}$, $f^{-1}[S]$ is measurable. Prove that then the function $\tilde{f}$ defined by
\[ \tilde{f}(x) = \begin{cases} f(x) & \text{for } x \in M, \\ 0 & \text{otherwise} \end{cases} \]
is measurable.
(7) Give an example of a measurable function $f: \mathbb{R}^m \to \mathbb{R}$ such that there exists a measurable set $S \subseteq \mathbb{R}$ where $f^{-1}[S]$ is not measurable.
(8) Prove the following strengthening of Corollary 4.4.1: Let $S$ be a Lebesgue measurable set in $\mathbb{R}^m$. Then there exists a subset $K \subseteq S$ of type $F_\sigma$ (a countable union of compact sets) such that $\mu(S \setminus K) = 0$. [Hint: First note that for a real function $f \in Z^{dn}$, $f^{-1}[\langle a, \infty)]$ is closed. Now in the proof of 4.4, we produced a non-decreasing sequence of $Z^{dn}$-functions $f_n \le c_S$ such that $f_n \nearrow c_S$ almost everywhere. Let $K$ be the union of $f_n^{-1}[\langle 1/2, \infty)]$.]
(9) Prove that if $S$ is a Lebesgue measurable set in $\mathbb{R}^m$, then there exists a set $U$ of type $G_\delta$ (countable intersection of open sets) containing $S$ such that $\mu(U \setminus S) = 0$.
(10) Prove that a bounded function on a compact interval $\langle a, b\rangle$ is Riemann-integrable if and only if it is continuous almost everywhere. (An analogue in $\mathbb{R}^m$ also holds and can be proved using analogous methods.) [Hint: For necessity, take a sequence of partitions for which both the upper and lower Riemann sums converge to the integral; prove that the function is continuous outside of the union of any set of closed intervals which are neighborhoods of all the points $t_i$ involved in all these partitions - recall 8.1 of Chapter 1. For sufficiency, let $f$ be continuous almost everywhere. Let $F_n$ be the set of all $x_0 \in \langle a, b\rangle$ such that $\limsup_{x \to x_0} |f(x) - f(x_0)| \ge 1/n$. Then $F_n$ is closed, and, by assumption, covered by a set $S$ of countably many open intervals the sum of lengths of which is $< 1/n$. For a point $x_0 \notin F_n$, consider a $\delta_{x_0} > 0$ such that for $x \in \Omega(x_0, \delta_{x_0})$, $|f(x) - f(x_0)| < 1/n$. Then $\langle a, b\rangle$ is contained in the union of the elements of $S$ and all the $\Omega(x_0, \delta_{x_0}/2)$, $x_0 \notin F_n$. Hence, by 5.5 of Chapter 2, $\langle a, b\rangle$ is contained in a union of elements of a finite subset $S_n$. Show that the partitions by the boundary points of all the intervals in $S_n$ give upper and lower Riemann sums which converge to the same number with $n \to \infty$.]
(11) Evaluate the integral
\[ \int_0^{\pi/2} \frac{\ln(1 + \cos(a)\cos(x))}{\cos(x)}\,dx \]
for $0 < a < \pi$. [Hint: Find the derivative with respect to $a$ first.]
(12) Compute
\[ \int_E xy \]
where $E$ is the tetrahedron in $\mathbb{R}^3$ with vertices
\[ (0, 0, 0)^T,\ (1, 1, 1)^T,\ (2, 3, 4)^T,\ (3, 6, 7)^T. \]
[Hint: Use linear substitution.]

(13) Spherical coordinates. For m 2, consider the map
m W .0; 1/
. =2; =2/n2
.0; 2 / ! Rm
given as follows. If we denote the variables in the target as x1 ; : : : ; xm and the

variables in the source as r; t1 ; : : : ; tm1 then
x1 D r cos.t1 / : : : cos.tm1 /;
xi D r cos.t1 / : : : cos.tmi / sin.tmi C1 / i D 2; : : : m:
Prove that
jdetDm j.t1 ;:::;tm1 / j D r m .cos.t1 //m2 cos.t2 /m3 : : : cos.tm1 /1 :
[Hint: Express m D ı where is given by the formula
y1 D tm1 ; .y2 ; : : : ; ym / D m1 .r; t1 ; : : : ; tm2 /
and is given by
x1 D y2 cos.y1 /; x2 D y2 sin.y1 /; xi D yi for 3 i m.
Use the chain rule.]

(14) Using Exercise (13), compute the volume of $D^m$, where
$$D^m=\{(x_1,\dots,x_m)\mid \textstyle\sum x_i^2\le r\}\subseteq\mathbb{R}^m.$$
(15) Prove that
$$\int_{\mathbb{R}}e^{-t^2}\,dt=\sqrt{\pi}.$$
[Hint: First compute
$$\int_{\mathbb{R}^2}e^{-x^2-y^2}$$
using 2-dimensional spherical (= polar) coordinates. The integral in question
is the square root of the result. Why?]
(16) Let U be an open subset of Rn , and let F W U ! Rn be a map satisfying
Assumption 7.5. Prove that if U is connected, then det.DF/ does not change
signs on U . [Hint: Recall 5.1.1 of Chapter 2.]
(17) Define in detail the metric space L1 .B; C/.
(18) Complete the details of the proof of Theorem 8.5.2 for p D 1.
(19) Using the method of Lemma 8.6.1, prove the following Jensen inequality: If
is a convex function on .0; 1/, then
Z Z
1 1
. jf .x/j/ .jf .x/j/:
.B/ B .B/ B
(20) (“Baby $L^p$”) Define, on $\mathbb{R}^n$ or $\mathbb{C}^n$, for $1\le p<\infty$,
$$\|(x_1,\dots,x_n)\|_p=(|x_1|^p+\dots+|x_n|^p)^{1/p}$$
(and similarly for $\mathbb{C}^n$). Prove that this makes $\mathbb{R}^n$, $\mathbb{C}^n$ into normed vector spaces.
What is the appropriate definition in the case of $p=\infty$?
6 Systems of Ordinary Differential Equations
1 The problem
1.1
A system of ordinary differential equations (briefly, ODE’s) is a problem of finding
functions $y_1(x),\dots,y_n(x)$ on some open interval in $\mathbb{R}$ such that
$$y_k'(x)=f_k(x,y_1(x),\dots,y_n(x))\quad\text{for } k=1,\dots,n \tag{1.1.1}$$
where the $f_k$ are continuous functions of $n+1$ real variables. Note that then the $y_i$, since
they are required to have a derivative, must in particular be continuous, and the
derivative is then also continuous by (1.1.1). The expression “ordinary” indicates
that there appear only derivatives of functions of one variable, not partial derivatives
of functions of several variables.
Using the vector symbols $\mathbf{y}$, $\mathbf{f}$ as in Chapter 3, we can describe the task by writing
$$\mathbf{y}'(x)=\mathbf{f}(x,\mathbf{y}(x)).$$
1.2
We may encounter systems involving higher derivatives, such as for example
$$y_1^{(4)}=f_1(x,y_1,y_2,y_1',y_2',y_1'',y_2'',y_1''',y_2'''),$$
$$y_2'''=f_2(x,y_1,y_2,y_1',y_2',y_1'',y_2'',y_1''').$$
This appears to call for a generalization of the original problem. But in fact, such
systems are easily converted to systems of ODE’s as above: in this particular case,
introduce additional variables
$$z_1=y_1,\ z_2=y_2,\ z_3=y_1',\ z_4=y_2',\ z_5=y_1'',\ z_6=y_2''\ \text{and}\ z_7=y_1''',$$
making the two equations into the equivalent system of the form (1.1.1):
$$\begin{aligned}
z_1'&=z_3,\\
z_2'&=z_4,\\
z_3'&=z_5,\\
z_4'&=z_6,\\
z_5'&=z_7,\\
z_6'&=f_2(x,z_1,\dots,z_7),\\
z_7'&=f_1(x,z_1,\dots,z_7,f_2(x,z_1,\dots,z_7)).
\end{aligned}$$
The reader certainly sees how to apply this procedure in a general situation
$$\begin{aligned}
y_1^{(k_1)}&=f_1(x,y_1,\dots,y_1^{(k_1-1)},\dots,y_n,\dots,y_n^{(k_n-1)}),\\
&\;\;\vdots\\
y_n^{(k_n)}&=f_n(x,y_1,\dots,y_1^{(k_1-1)},\dots,y_n,\dots,y_n^{(k_n-1)}).
\end{aligned} \tag{1.2.1}$$
Introduce additional variables for all the derivatives of yi of order less than the
highest order derivative of yi which occurs in the system, and rewrite the original
system in terms of the additional variables, introducing additional equations relating
the new variables as derivatives of each other (see Exercise (1), (2)). To be explicit,
one sometimes refers to a system of the form (1.1.1) as a system of first-order ODE’s,
but we already see that such systems are all we need to consider.
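The reduction can be tried out numerically. The following is only a minimal sketch under our own assumptions: the sample equation $y''=-y$, the helper names `rk4_step` and `solve`, and the use of the classical Runge-Kutta scheme are illustrative choices, not part of the text. With $z_1=y$, $z_2=y'$ the equation becomes the first-order system $z_1'=z_2$, $z_2'=-z_1$.

```python
import math

def f(x, z):
    """Right-hand side of the first-order system equivalent to y'' = -y.

    z[0] stands for y and z[1] for y', so z0' = z1 and z1' = -z0.
    """
    return [z[1], -z[0]]

def rk4_step(f, x, z, h):
    """One classical Runge-Kutta step for the system z' = f(x, z)."""
    k1 = f(x, z)
    k2 = f(x + h / 2, [zi + h / 2 * ki for zi, ki in zip(z, k1)])
    k3 = f(x + h / 2, [zi + h / 2 * ki for zi, ki in zip(z, k2)])
    k4 = f(x + h, [zi + h * ki for zi, ki in zip(z, k3)])
    return [zi + h / 6 * (a + 2 * b + 2 * c + d)
            for zi, a, b, c, d in zip(z, k1, k2, k3, k4)]

def solve(f, x0, z0, x_end, steps):
    """Integrate z' = f(x, z) from x0 to x_end with a fixed step size."""
    h = (x_end - x0) / steps
    x, z = x0, list(z0)
    for _ in range(steps):
        z = rk4_step(f, x, z, h)
        x += h
    return z

# Initial conditions y(0) = 0, y'(0) = 1; the exact solution is y = sin x.
y_end, dy_end = solve(f, 0.0, [0.0, 1.0], 1.0, 100)
```

Here `y_end` and `dy_end` approximate $\sin 1$ and $\cos 1$; the second component of the state is exactly the auxiliary variable introduced for the derivative.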
1.3
We may, in fact, encounter even more general systems, namely a system of equations
of the form
$$\begin{aligned}
F_1(x,y_1,\dots,y_1^{(k_1)},\dots,y_n,\dots,y_n^{(k_n)})&=0,\\
&\;\;\vdots\\
F_m(x,y_1,\dots,y_1^{(k_1)},\dots,y_n,\dots,y_n^{(k_n)})&=0.
\end{aligned} \tag{1.3.1}$$
In such a case, we will always assume that $m=n$ and that the Jacobian of the
$F_i$’s in the variables corresponding to $y_1^{(k_1)},\dots,y_n^{(k_n)}$ is non-zero. Then, using the
Implicit Function Theorem 6.3 of Chapter 3, the system (1.3.1) can be converted (at
least locally) to the system (1.2.1), and hence again, by the method explained there,
to a first-order system of the form (1.1.1). If $m\ne n$ or the Jacobian in question is 0,
the problem (1.3.1) will be considered ill-posed from our point of view.
Note that whether the problem (1.3.1) is well-posed depends on the values of $x$,
the $y_i$’s and their derivatives up to $y_i^{(k_i-1)}$, and the (numerical) solution of the resulting
equations for the $y_i^{(k_i)}$’s. We will see, however, that this is in the spirit of the theory
we will develop, as in solving the system (1.2.1), we get to specify $x$, the $y_i$’s and
their derivatives up to $y_i^{(k_i-1)}$ as initial conditions. (This is equivalent to specifying
$x$ and the $y_i$ as initial conditions in the system (1.1.1).)
The translations of 1.2 and 1.3 serve a theoretical purpose. They may often be
difficult to carry out in practice. In many cases, different reductions may be more
advantageous. (See Exercise (3).)
1.4 Remarks
1. To simplify notation, we write $y'=f(x,y)$ instead of the more correct $y'(x)=f(x,y(x))$, etc. Thus, the symbol $y$ may feature both as a variable in a function
$f$ of two variables, and as a name of a function $y(x)$.
2. Differential equations play a fundamental role in various applications. Let us just
mention a simple geometric interpretation of the ODE $y'=f(x,y)$: the function
$f(x,y)$ determines directions at individual points $(x,y)$ of the plane $\mathbb{R}^2$; the
graphs of the desired solutions are curves following the prescribed directions.
2 Converting a system of ODE’s to a system of integral equations
2.1 Theorem. Let $(a,b)$ be an open interval containing a number $x_0$. Let $\eta_1,\dots,\eta_n$
be arbitrary real numbers. Then the functions $y_1(x),\dots,y_n(x)$ constitute a solution
of the ODE system
$$y_j'(x)=f_j(x,y_1(x),\dots,y_n(x)),\quad j=1,\dots,n \tag{2.1.1}$$
in this interval such that, moreover, $y_j(x_0)=\eta_j$ if and only if they satisfy
the equations
$$y_j(x)=\int_{x_0}^x f_j(t,y_1(t),\dots,y_n(t))\,dt+\eta_j. \tag{2.1.2}$$
Proof. This is an easy consequence of the Fundamental Theorem of Calculus.
If (2.1.1) is satisfied then one has
$$y_j(x)=\int_{x_0}^x f_j(t,y_1(t),\dots,y_n(t))\,dt+c_j$$
for some constants $c_j$. If, moreover, $y_j(x_0)=\eta_j$ we obtain for $x=x_0$,
$$\eta_j=y_j(x_0)=\int_{x_0}^{x_0}f_j(\dots)\,dt+c_j=0+c_j.$$
On the other hand, if the functions $y_j(x)$ satisfy (2.1.2) then by taking the derivative
by $x$ we obtain that $y_j'(x)=f_j(x,y_1(x),\dots,y_n(x))$, and setting $x=x_0$ we
conclude that $y_j(x_0)=\eta_j$. □
2.2 Remark
This very easy translation of our problem has in fact a quite surprising consequence.
Let us illustrate it on the equation $y'=f(x,y)$. Denote by $D$ the operator of taking
the derivative, and by $F$ the operator transforming $y(x)$ to $f(x,y(x))$. Further,
define an operator $J$ by setting
$$J(y)(x)=\int_{x_0}^x f(t,y(t))\,dt.$$
The original task was to solve the equation
$$D(y)=F(y). \tag{*}$$
This looks somewhat scary: for example, if we take the space $X=C((a,b))$
of bounded continuous functions on $(a,b)$ as considered in 7.7 of Chapter 2,
the operator D is not even defined on X , as not every continuous function has
a derivative. It seems that in order to treat the equation by means of spaces of
functions, we would have to think hard what space to work on, and what metric
to choose to make both sides of the equation (*) continuous. Such problems do,
indeed, arise with some types of differential equations.
However, in case of our system (1.1.1), Theorem 2.1 gives a way out: After the
translation we obtain the equation
y D J.y/ (**)
where J is (as we will see) continuous. Furthermore, this is a fixed-point problem

about which we already know something (see 7.6 of Chapter 2); indeed, the Banach
Fixed Point Theorem will be of a great help.
3 The Lipschitz property and a solution of the integral equation
3.1
Let $f(x,y_1,\dots,y_n)$ be a function in $n+1$ (real) variables. It is said to be Lipschitz
in the variables $y_1,\dots,y_n$ if there exists a number $M$ such that
$$|f(x,y_1,\dots,y_n)-f(x,z_1,\dots,z_n)|\le M\max_i|y_i-z_i|.$$
We say that $f$ is locally Lipschitz in $y_1,\dots,y_n$ if for each point $u^0=(x_0,y_1^0,\dots,y_n^0)$
of the domain in question there is an open $U\ni u^0$ such that the restriction $f|_U$ is
Lipschitz.
3.2 Observation. If a function $f(x,y_1,\dots,y_n)$ has continuous partial derivatives
$\frac{\partial f}{\partial y_j}$ then it is locally Lipschitz.
(Indeed take a point $u^0=(x_0,y_1^0,\dots,y_n^0)$, a convex open set $U\ni u^0$ and an $M$
such that on $U$
$$\left|\frac{\partial f(x,y_1,\dots,y_n)}{\partial y_j}\right|\le\frac{M}{n}.$$
Then by the Mean Value Theorem, we have for $(x,y_1,\dots,y_n),(x,z_1,\dots,z_n)\in U$,
$$|f(x,y_1,\dots,y_n)-f(x,z_1,\dots,z_n)|=\Bigl|\sum_j\frac{\partial f(\dots)}{\partial y_j}(y_j-z_j)\Bigr|
\le\sum_j\Bigl|\frac{\partial f(\dots)}{\partial y_j}\Bigr|\,|y_j-z_j|\le\frac{M}{n}\cdot n\cdot\max_j|y_j-z_j|.)$$
3.3 Theorem. Let $f_j(x,y_1,\dots,y_n)$, $j=1,\dots,n$ be continuous and Lipschitz in
the variables $y_1,\dots,y_n$ in a neighborhood of a point $u=(x_0,\eta_1,\dots,\eta_n)$. Then
there is an $a>0$ such that in the interval $(x_0-a,x_0+a)$ the system of equations
$$u_j(x)=\int_{x_0}^x f_j(t,u_1(t),\dots,u_n(t))\,dt+\eta_j$$
has precisely one solution $u_1,\dots,u_n$.
Proof. First, choose a neighborhood
$$U=(x_0-\alpha',x_0+\alpha')\times(\eta_1-\beta',\eta_1+\beta')\times\dots\times(\eta_n-\beta',\eta_n+\beta')$$
on which all the $f_j|_U$ are Lipschitz. Now choose $0<\alpha<\alpha'$ and $0<\beta<\beta'$. We have an
$M$ such that
$$|x_0-x|\le\alpha,\quad|\eta_j-y_j|\le\beta,\quad|\eta_j-z_j|\le\beta$$
implies that
$$|f_j(x,y_1,\dots,y_n)-f_j(x,z_1,\dots,z_n)|\le M\max_i|y_i-z_i|.$$
Since the $f_j$ are continuous we also have an $A$ such that
$$|f_j(x,y_1,\dots,y_n)|\le A$$
in the compact interval $\langle x_0-\alpha,x_0+\alpha\rangle\times\langle\eta_1-\beta,\eta_1+\beta\rangle\times\dots\times\langle\eta_n-\beta,\eta_n+\beta\rangle$
(recall Proposition 6.3 of Chapter 2).
Choose an $a$ such that
(1) $0<a\le\alpha$,
(2) $a\le\dfrac{\beta}{A}$, and
(3) $a\le\dfrac{q}{M}$ for some $q<1$.
Consider the space of continuous functions
$$C=C((x_0-a,x_0+a))$$
(recall 7.7 of Chapter 2) and the subspaces
$$Y_j=\{u\mid u\in C,\ \eta_j-\beta\le u(x)\le\eta_j+\beta\}.$$
All the $Y_j$ are complete metric spaces and hence also the product
$$Y=Y_1\times Y_2\times\dots\times Y_n$$
with, say, the maximum metric
$$\rho(u,v)=\max_j\rho_j(u_j,v_j),$$
where $\rho_j(\varphi,\psi)=\sup_x|\varphi(x)-\psi(x)|$, is complete (7.7.2 and 7.3.1 of Chapter 2).
Now define for $u=(u_1,\dots,u_n)$
$$J(u)=(J_1(u),\dots,J_n(u))$$
where
$$J_j(u)(x)=\int_{x_0}^x f_j(t,u_1(t),\dots,u_n(t))\,dt+\eta_j.$$
Since
$$|J_j(u)(x)-\eta_j|=\Bigl|\int_{x_0}^x f_j(t,u_1(t),\dots,u_n(t))\,dt\Bigr|
\le\Bigl|\int_{x_0}^x|f_j(t,u_1(t),\dots,u_n(t))|\,dt\Bigr|\le|x_0-x|\cdot A\le a\cdot A\le\beta,$$
$J$ is a mapping $Y\to Y$, and our problem is to find a fixed point of $J$. We have
$$\begin{aligned}
\rho(J(u),J(v))&=\max_k\sup_x|J_k(u)(x)-J_k(v)(x)|\\
&=\max_k\sup_x\Bigl|\int_{x_0}^x f_k(t,u_1(t),\dots)\,dt-\int_{x_0}^x f_k(t,v_1(t),\dots)\,dt\Bigr|\\
&=\max_k\sup_x\Bigl|\int_{x_0}^x\bigl(f_k(t,u_1(t),\dots)-f_k(t,v_1(t),\dots)\bigr)\,dt\Bigr|\\
&\le\max_k\sup_x\Bigl|\int_{x_0}^x|f_k(t,u_1(t),\dots)-f_k(t,v_1(t),\dots)|\,dt\Bigr|=c.
\end{aligned}$$
Since we have
$$|f_k(t,u_1(t),\dots)-f_k(t,v_1(t),\dots)|\le M\max_j|u_j(t)-v_j(t)|\le M\max_j\sup_x|u_j(x)-v_j(x)|=M\rho(u,v)$$
we obtain
$$\rho(J(u),J(v))\le c\le\sup_x|x-x_0|\cdot M\rho(u,v)\le a\cdot M\rho(u,v)\le q\cdot\rho(u,v).$$
Thus, $J:Y\to Y$ satisfies the condition of the Banach Fixed Point Theorem 7.6 of
Chapter 2 and we conclude that there is precisely one $u$ such that $J(u)=u$, that is,
precisely one solution of our integral equations on the interval $(x_0-a,x_0+a)$. □
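The contraction $J$ can be iterated numerically. The sketch below is an illustration under our own assumptions, not the construction used in the proof: functions are replaced by their values on a uniform grid, the integral by the cumulative trapezoidal rule, only the right half $\langle x_0,x_0+a\rangle$ of the interval is treated, and the name `picard_iterates` is hypothetical. For the sample equation $y'=y$, $y(0)=1$ the iterates approach $e^x$.

```python
import math

def picard_iterates(f, x0, eta, a, iterations, n=2000):
    """Iterate u -> J(u), J(u)(x) = eta + integral_{x0}^{x} f(t, u(t)) dt, on [x0, x0 + a].

    Functions are stored as values on a uniform grid; the integral is
    approximated by the cumulative trapezoidal rule.
    """
    h = a / n
    xs = [x0 + i * h for i in range(n + 1)]
    u = [eta] * (n + 1)              # start from the constant function eta
    for _ in range(iterations):
        g = [f(x, ux) for x, ux in zip(xs, u)]
        new_u = [eta]
        acc = 0.0
        for i in range(1, n + 1):
            acc += 0.5 * h * (g[i - 1] + g[i])
            new_u.append(eta + acc)
        u = new_u
    return xs, u

# Sample problem y' = y, y(0) = 1 on [0, 0.5]; here M = 1, so a = 0.5 gives q = 0.5.
xs, u = picard_iterates(lambda x, y: y, 0.0, 1.0, 0.5, iterations=20)
error = max(abs(ux - math.exp(x)) for x, ux in zip(xs, u))
```

The remaining `error` is dominated by the discretization of the integral rather than by the number of iterations.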
4 Existence and uniqueness of a solution of an ODE system
4.1
Using 2.1, we immediately infer from Theorem 3.3 the following
Theorem. (The Picard-Lindelöf Theorem) Let $f_j(x,y_1,\dots,y_n)$, $j=1,\dots,n$ be
continuous and let them be Lipschitz with respect to $y_1,\dots,y_n$ in a neighborhood
of a point $u=(x_0,\eta_1,\dots,\eta_n)$. Then for a sufficiently small $a>0$ the system
$$y_j'(x)=f_j(x,y_1(x),\dots,y_n(x)),\quad j=1,\dots,n$$
has precisely one solution on $(x_0-a,x_0+a)$ such that $y_j(x_0)=\eta_j$ for all $j$.
Remark. Thus, unlike the uniqueness in 3.3, the solution is unique with respect
to the extra conditions $y_j(x_0)=\eta_j$. These requirements are usually referred to as
the initial conditions.
4.2
The solutions in 4.1 are of a local character, that is, they are guaranteed in a small
neighborhood of the initial point $x_0$ only. Now we will head to solutions of a more
global character, defined as far as possible. To start with, we will speak of a local
solution $(u,J)$ defined on an open interval $J$ and we will endeavour to extend the interval $J$.
4.2.1 Lemma. Under the conditions of 4.1, let $J,K$ be open intervals, let $x_0\in J\cap K$,
and let $(u,J)$ and $(v,K)$ be local solutions such that $u(x_0)=v(x_0)$. If $f$ is
continuous and Lipschitz with respect to the $y_j$ in the domain in which we consider
our system, we have $u|_{J\cap K}=v|_{J\cap K}$.
Proof. By 4.1, if $u$ and $v$ coincide at a point they coincide in some open
neighborhood of that point. Thus,
$$U=\{x\mid u(x)=v(x),\ x\in J\cap K\}$$
is an open subset of $J\cap K$. From the continuity of $u$, $v$ it follows that $U$ is closed as
well. Since $J\cap K$ is an interval, hence connected by 5.2.2 of Chapter 2, and since
$U$ is non-empty, $U=K\cap J$. □
4.2.2
Take the union $\mathcal{J}$ of all the intervals $J$ on which there exists a solution $u$ satisfying
$u_j(x_0)=\eta_j$. By Lemma 4.2.1, any two such solutions agree on the intersection of their
domains, so there exists a solution $(u,\mathcal{J})$ with the domain $\mathcal{J}$.
Such maximal solutions are called the characteristics of the given ODE system. In
this terminology we can summarize the preceding facts in the following
Theorem. Let $U$ be an open subset of $\mathbb{R}^{n+1}$ and let $\mathbf{f}(x,y_1,\dots,y_n):U\to\mathbb{R}^n$ be
continuous and locally Lipschitz in $y_1,\dots,y_n$. Then for each $(x_0,\eta_1,\dots,\eta_n)\in U$ there
is a unique characteristic $u$ such that $u_j(x_0)=\eta_j$.
4.3
Consider a differential equation
$$y^{(n)}=f(x,y,y',\dots,y^{(n-1)}). \tag{4.3.1}$$
From the method of 1.2 and from Theorem 4.2.2, we obtain the following
Corollary. Let $U$ be an open subset of $\mathbb{R}^{n+1}$ and let $f(x,y_1,y_2,\dots,y_n)$ be
continuous and locally Lipschitz in $y_1,\dots,y_n$. Then for each $(x_0,\eta_1,\dots,\eta_n)\in U$
there exists precisely one solution $y$ of the equation (4.3.1) with maximum interval
domain such that
$$y^{(k)}(x_0)=\eta_{k+1},\quad k=0,\dots,n-1.$$
4.4 Examples
1. The domain of a characteristic may not be equal to the domain on which a
differential equation is defined. For example, the differential equation
$$y'=1+y^2$$
has solutions $y=\tan(x+C)$ where $C$ is any constant, as easily verified. (See
the next section for a more systematic method for finding the solution.) The right-hand
side is defined on all of $\mathbb{R}^2$, yet each of these solutions exists only on an open
interval of length $\pi$.
2. The Lipschitz condition in the assumptions is essential. Consider the equation
$$y'=3y^{2/3}.$$
We have the solutions $y(x)=(x+c)^3$. The function $f(x,y)=3y^{2/3}$ is Lipschitz
in $y$ in all the points but the $(x,0)$. And indeed in these exceptional points we
have solutions
$$y(x)=\begin{cases}(x-a)^3&\text{for } x\le a,\\ 0&\text{for } a\le x\le b,\\ (x-b)^3&\text{for } x\ge b,\end{cases}$$
all of them satisfying $y(x_0)=0$ for any $x_0\in(a,b)$.
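The failure of uniqueness in Example 2 can be checked numerically. The following sketch (our own illustration; the helper names `glued` and `residual` and the sample values of $a$, $b$ are hypothetical) compares a central-difference derivative of two of the glued solutions with the right-hand side $3y^{2/3}$, where $y^{2/3}$ is read as the square of the real cube root.

```python
def f(y):
    """Right-hand side of y' = 3 y^(2/3), using the real cube root."""
    return 3.0 * (abs(y) ** (1.0 / 3.0)) ** 2

def glued(x, a, b):
    """The solution equal to (x-a)^3 left of a, 0 on [a, b], (x-b)^3 right of b."""
    if x <= a:
        return (x - a) ** 3
    if x <= b:
        return 0.0
    return (x - b) ** 3

def residual(y, x, h=1e-5):
    """|y'(x) - f(y(x))| with y' approximated by a central difference."""
    dy = (y(x + h) - y(x - h)) / (2 * h)
    return abs(dy - f(y(x)))

# Two different choices of (a, b), both with 0 in the interval (a, b).
solutions = [lambda x: glued(x, -1.0, 1.0), lambda x: glued(x, -2.0, 0.5)]
points = [-3.0, -1.5, -1.0, 0.0, 0.5, 1.0, 2.0]
worst = max(residual(y, x) for y in solutions for x in points)
```

Both functions vanish at $x_0=0$ and satisfy the equation up to discretization error, yet they differ, for instance, at $x=-1.5$.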
5 Stability of solutions
5.1 The problems of stability
Consider the equations
$$y_j'(x)=f_j(x,y_1(x),\dots,y_n(x)),\qquad y_j(x_0)=\eta_j,\quad j=1,\dots,n$$
solved as in 4.1. The solution depends (uniquely) on the $\eta_j$. A question naturally
arises whether this dependence is continuous. For example, if it were not continuous,
using the solution in practical applications would be rather suspect, as the effect
of small errors in initial conditions would be unpredictable.
Furthermore, a practical setting often contains additional parameters, so the
system becomes
$$\begin{aligned}
y_j'(x,\alpha_1,\dots,\alpha_k)&=f_j(x,y_1(x,\alpha_1,\dots,\alpha_k),\dots,y_n(x,\alpha_1,\dots,\alpha_k),\alpha_1,\dots,\alpha_k),\\
y_j(x_0,\alpha_1,\dots,\alpha_k)&=\eta_j,\quad j=1,\dots,n.
\end{aligned} \tag{*}$$
As before, the derivative is taken by $x$ (while technically, this is a partial derivative,
the convention is to continue using the ordinary derivative symbol to emphasize
the fact that we have one system of ordinary differential equations for each
value of the parameters). As, again, in practice the parameters are known only
approximately, the solution makes practical sense only if it depends continuously
on the parameters $\alpha_i$.
The two stability problems can be reduced to one. Fix initial conditions $\eta_j^0$,
consider
$$\beta_i=\eta_i-\eta_i^0,\qquad z_i=y_i-\beta_i$$
and define
$$g_j(x,z_1,\dots,z_n,\alpha_1,\dots,\alpha_k,\beta_1,\dots,\beta_n)=f_j(x,z_1+\beta_1,\dots,z_n+\beta_n,\alpha_1,\dots,\alpha_k)$$
which turns the combined task (*) into
$$\begin{aligned}
z_j'(x,\alpha_1,\dots,\alpha_k,\beta_1,\dots,\beta_n)&=g_j(x,z_1(x,\alpha_1,\dots,\beta_n),\dots,z_n(x,\alpha_1,\dots,\beta_n),\alpha_1,\dots,\alpha_k,\beta_1,\dots,\beta_n),\\
z_j(x_0,\alpha_1,\dots,\alpha_k,\beta_1,\dots,\beta_n)&=\eta_j^0,\quad j=1,\dots,n
\end{aligned}$$
with the initial values $\eta_j^0$ fixed. Thus, it suffices to study the dependence of the
system on parameters only, with initial conditions fixed; in the notation (*), this
means we will study stability with respect to $\alpha_1,\dots,\alpha_k$, with $\eta_j$ fixed.
5.1.1 Remark
One can also convert the combined stability problem into a problem concerning
initial conditions only. But the trick with parameters is more expedient and we will
concentrate on that.
5.2 Lemma. (Gronwall’s inequality) Let $F$ be a non-negative real-valued function
on an interval $\langle a,b\rangle$ and let there exist positive constants $C,K$ such that for all
$x\in\langle a,b\rangle$ we have
$$F(x)\le C+K\int_a^x F(t)\,dt.$$
Then for all $x\in\langle a,b\rangle$,
$$F(x)\le C\,e^{K(x-a)}.$$
Proof. Put
$$G(x)=C+K\int_a^x F(t)\,dt.$$
Then we have
$$F(x)\le G(x)\quad\text{and}\quad G'(x)=K\cdot F(x)\le K\cdot G(x).$$
Since $G(x)>0$, we have
$$\frac{G'(x)}{G(x)}\le K$$
and hence
$$\int_a^x\frac{G'(t)}{G(t)}\,dt\le K\int_a^x 1\,dt=K(x-a).$$
Substituting in the first integral $y=G(t)$, we obtain
$$\int_{G(a)}^{G(x)}\frac{dy}{y}=\ln G(x)-\ln G(a)=\ln G(x)-\ln C,$$
so that
$$\ln G(x)\le\ln C+K(x-a),\quad\text{and hence}\quad G(x)\le C\,e^{K(x-a)}.$$
Using $F(x)\le G(x)$ again, we obtain the desired inequality. □
5.3
To simplify notation, in the proof of the following theorem we will write $\alpha$ for
$\alpha_1,\dots,\alpha_k$ and use the symbol
$$\|\alpha\|\quad\text{for}\quad\max_{j=1,\dots,k}|\alpha_j|.$$
Similarly, for a system $y_1,\dots,y_n$ resp. $y_1(x),\dots,y_n(x)$ we will write
$$\|y\|=\max_{j=1,\dots,n}|y_j|\quad\text{or}\quad\|y(x)\|=\max_{j=1,\dots,n}|y_j(x)|.$$
Theorem. Let $f_j(x,y_1,\dots,y_n,\alpha_1,\dots,\alpha_k)$ be functions continuous in all variables
and Lipschitz in the variables $y_j$ and $\alpha_j$ in some neighborhood of a point
$$(x_0,\eta_1,\dots,\eta_n,\alpha_1^0,\dots,\alpha_k^0). \tag{5.3.1}$$
Then the solution $y_j(x,\alpha_1,\dots,\alpha_k)$ of the system of equations
$$\begin{aligned}
y_j'(x,\alpha_1,\dots,\alpha_k)&=f_j(x,y_1(x,\alpha_1,\dots,\alpha_k),\dots,y_n(x,\alpha_1,\dots,\alpha_k),\alpha_1,\dots,\alpha_k),\\
y_j(x_0,\alpha_1,\dots,\alpha_k)&=\eta_j,\quad j=1,\dots,n
\end{aligned}$$
is continuous in all variables in some neighborhood $U$ of the point (5.3.1).
Moreover, if $K$ is a Lipschitz constant for the variables $y_1,\dots,y_n$, $\alpha_1,\dots,\alpha_k$, we
have an estimate on $U$:
$$|y_j(x,\alpha_1,\dots,\alpha_k)-y_j(x,\beta_1,\dots,\beta_k)|\le\max_{i=1,\dots,k}|\alpha_i-\beta_i|\,e^{K(x-a)} \tag{5.3.2}$$
for all $j=1,\dots,n$.
Proof. We have
$$\begin{aligned}
|y_j(x,\alpha)-y_j(x,\beta)|&=\Bigl|\int_a^x\bigl(f_j(t,y_1(t,\alpha),\dots,y_n(t,\alpha),\alpha)-f_j(t,y_1(t,\beta),\dots,y_n(t,\beta),\beta)\bigr)\,dt\Bigr|\\
&\le\int_a^x|f_j(t,y_1(t,\alpha),\dots,y_n(t,\alpha),\alpha)-f_j(t,y_1(t,\beta),\dots,y_n(t,\beta),\beta)|\,dt\\
&\le\int_a^x\bigl(|f_j(t,y_1(t,\alpha),\dots,\alpha)-f_j(t,y_1(t,\alpha),\dots,\beta)|\\
&\qquad\quad+|f_j(t,y_1(t,\alpha),\dots,\beta)-f_j(t,y_1(t,\beta),\dots,\beta)|\bigr)\,dt\\
&\le\int_a^x\bigl(K\|\alpha-\beta\|+K\|y(t,\alpha)-y(t,\beta)\|\bigr)\,dt,
\end{aligned}$$
so
$$\|y(x,\alpha)-y(x,\beta)\|\le K\int_a^x\bigl(\|\alpha-\beta\|+\|y(t,\alpha)-y(t,\beta)\|\bigr)\,dt.$$
If we set $F(x)=\|\alpha-\beta\|+\|y(x,\alpha)-y(x,\beta)\|$, we obtain
$$F(x)\le\|\alpha-\beta\|+K\int_a^x F(t)\,dt.$$
By Lemma 5.2, we now have
$$F(x)\le\|\alpha-\beta\|\,e^{K(x-a)}$$
and since $\|y(x,\alpha)-y(x,\beta)\|\le F(x)$, the estimate (5.3.2) follows. □
5.4 Remark
Recall that the existence and uniqueness in Theorem 4.1 was proved using the
Banach Fixed Point Theorem 7.6 of Chapter 2. The reader may naturally ask
whether the stability theorem (at least the continuity) is not an easy consequence of
a general property of such fixed points. That is, we think of the following problem.
Let us have metric spaces $X,T$ and a mapping
$$f:X\times T\to X$$
such that $d(f(x,t),f(y,t))\le r_t\,d(x,y)$ where the $r_t<1$ depend on $t\in T$ only. Define
$F(t)\in X$ by the equation $f(F(t),t)=F(t)$. How does $F(t)$ depend on $t$?
There are fairly general facts known on this subject, but they do not fit well with
our present topic. Due to the special character of our equations it is, luckily enough,
easy to show the dependence by an explicit estimate, as we have done.
5.5
The solution of a system of differential equations is (under reasonable conditions)
not only continuously dependent on parameters. In fact we can even take derivatives.
Consider, again, the system of equations
$$\begin{aligned}
y_i'(x,\alpha)&=f_i(x,y_1(x,\alpha),\dots,y_n(x,\alpha),\alpha),\quad i=1,\dots,n,\\
y_i(x_0,\alpha)&=\eta_i
\end{aligned} \tag{5.5.1}$$
satisfying the conditions from 4.1 (where we write, similarly as before, $y_i'$, not $\frac{\partial y_i}{\partial x}$,
for the derivatives by $x$, to keep in mind the fact that we are dealing with an ordinary
differential equation).
Theorem. Let $f_i(x,y_1,\dots,y_n,\alpha_1,\dots,\alpha_k)$ be continuous functions defined on an
open neighborhood of a point (5.3.1), continuously differentiable with respect to
$y_j$ and $\alpha_p$. Then the solutions $y_i(x,\alpha)$ of the system (5.5.1), which exist and are
unique on some open neighborhood $U$ of (5.3.1), are differentiable with respect to
$\alpha_p$, $p=1,\dots,k$ on $U$, and the functions
$$z_i(x,\alpha)=\frac{\partial y_i}{\partial\alpha_p}(x,\alpha)$$
satisfy the system of equations
$$\begin{aligned}
z_i'(x,\alpha)&=\sum_{j=1}^n\frac{\partial f_i}{\partial y_j}(x,y(x,\alpha),\alpha)\,z_j+\frac{\partial f_i}{\partial\alpha_p}(x,y(x,\alpha),\alpha),\quad i=1,\dots,n,\\
z_i(x_0,\alpha)&=0,
\end{aligned} \tag{5.5.2}$$
where we write briefly $y$ for $y_1,\dots,y_n$ and $\alpha$ for $\alpha_1,\dots,\alpha_k$.
Remarks.
1. The continuous differentiability with respect to yj and ˛p makes, of course, the
functions fi locally Lipschitz with respect to these variables.
2. The system (5.5.1) is viewed as solved and the $y_i(x,\alpha)$ constitute the (unique)
solution. The equations (5.5.2) contain these functions as already given, not as
something dependent on the $z_i$. Thus, the right-hand sides of the equations in
(5.5.2) are Lipschitz with respect to $z_j$ and therefore the system has a solution.
Our task will be to prove that the individual $z_i$’s are the partial derivatives of the
$y_i$ by $\alpha_p$.
3. The reader has certainly not overlooked that the equations for the $z_i$, which we hoped
to be the $\frac{\partial y_i}{\partial\alpha_p}$, come naturally in the form (5.5.2): if we already knew the $y_i$ to have
derivatives, we would obtain the equality by taking derivatives of the equalities
in (5.5.1). But this we do not know yet.
Proof. First of all, note that the problem is immediately reduced to the case k D 1:
We may treat all parameters but one as constant for the existence of a single
partial derivative; once equation (5.5.2) is proved, we can use Theorem 5.3 to prove
continuity of the partial derivatives in all the ˛p ’s. Thus, let us assume k D 1, and
write ˛ for ˛p .
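Let $y_i$ be a solution of the system (5.5.1) and $z$ a solution of the system (5.5.2).
Put
$$u_i(x,\alpha,h)=\frac1h\bigl(y_i(x,\alpha+h)-y_i(x,\alpha)\bigr)$$
and
$$v_i(x,\alpha,h)=u_i(x,\alpha,h)-z_i(x,\alpha).$$
Thus,
$$\begin{aligned}
\frac{\partial v_i}{\partial x}(x,\alpha,h)&=\frac{\partial u_i}{\partial x}(x,\alpha,h)-z_i'(x,\alpha)\\
&=\frac{\partial u_i}{\partial x}(x,\alpha,h)-\sum_{j=1}^n\frac{\partial f_i}{\partial y_j}(x,y(x,\alpha),\alpha)\,z_j(x,\alpha)-\frac{\partial f_i}{\partial\alpha}(x,y(x,\alpha),\alpha).
\end{aligned}$$
Let us compute the derivative $\frac{\partial u_i}{\partial x}(x,\alpha,h)$:
$$\begin{aligned}
\frac{\partial u_i}{\partial x}(x,\alpha,h)&=\frac1h\bigl(y_i'(x,\alpha+h)-y_i'(x,\alpha)\bigr)\\
&=\frac1h\bigl(f_i(x,y(x,\alpha+h),\alpha+h)-f_i(x,y(x,\alpha),\alpha+h)\bigr)\\
&\quad+\frac1h\bigl(f_i(x,y(x,\alpha),\alpha+h)-f_i(x,y(x,\alpha),\alpha)\bigr).
\end{aligned}$$
By the Mean Value Theorem we may continue, writing $\Delta y$ for
$(y_1(x,\alpha+h)-y_1(x,\alpha),\dots,y_n(x,\alpha+h)-y_n(x,\alpha))$, that is, $(hu_1(x,\alpha,h),\dots,hu_n(x,\alpha,h))$,
and choosing suitable $\theta_1,\theta_2\in(0,1)$,
$$\frac{\partial u_i}{\partial x}(x,\alpha,h)=\sum_{j=1}^n\frac{\partial f_i}{\partial y_j}(x,y(x,\alpha)+\theta_1\Delta y,\alpha+h)\,u_j(x,\alpha,h)+\frac{\partial f_i}{\partial\alpha}(x,y(x,\alpha),\alpha+\theta_2h).$$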
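Let us now consider $\frac{\partial v_i}{\partial x}$. Since $u_i(x,\alpha,h)=v_i(x,\alpha,h)+z_i(x,\alpha)$, we obtain
$$\begin{aligned}
\Bigl|\frac{\partial v_i}{\partial x}(x,\alpha,h)\Bigr|&\le\sum_{j=1}^n\Bigl|\frac{\partial f_i}{\partial y_j}(x,y(x,\alpha)+\theta_1\Delta y,\alpha+h)\Bigr|\,|v_j(x,\alpha,h)|\\
&\quad+\sum_{j=1}^n\Bigl|\Bigl(\frac{\partial f_i}{\partial y_j}(x,y(x,\alpha)+\theta_1\Delta y,\alpha+h)-\frac{\partial f_i}{\partial y_j}(x,y(x,\alpha),\alpha)\Bigr)z_j(x,\alpha)\Bigr|\\
&\quad+\Bigl|\frac{\partial f_i}{\partial\alpha}(x,y(x,\alpha),\alpha+\theta_2h)-\frac{\partial f_i}{\partial\alpha}(x,y(x,\alpha),\alpha)\Bigr|,
\end{aligned}$$
and further
$$\begin{aligned}
\Bigl|\frac{\partial v_i}{\partial x}(x,\alpha,h)\Bigr|&\le\sum_{j=1}^n\Bigl|\frac{\partial f_i}{\partial y_j}(x,y(x,\alpha)+\theta_1\Delta y,\alpha+h)\Bigr|\,|v_j(x,\alpha,h)|\\
&\quad+\sum_{j=1}^n\Bigl|\Bigl(\frac{\partial f_i}{\partial y_j}(x,y(x,\alpha)+\theta_1\Delta y,\alpha+h)-\frac{\partial f_i}{\partial y_j}(x,y(x,\alpha),\alpha+h)\Bigr)z_j(x,\alpha)\Bigr|\\
&\quad+\sum_{j=1}^n\Bigl|\Bigl(\frac{\partial f_i}{\partial y_j}(x,y(x,\alpha),\alpha+h)-\frac{\partial f_i}{\partial y_j}(x,y(x,\alpha),\alpha)\Bigr)z_j(x,\alpha)\Bigr|\\
&\quad+\Bigl|\frac{\partial f_i}{\partial\alpha}(x,y(x,\alpha),\alpha+\theta_2h)-\frac{\partial f_i}{\partial\alpha}(x,y(x,\alpha),\alpha)\Bigr|.
\end{aligned}$$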
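Choose a compact neighbourhood of $(x_0,y(x_0,\alpha),\alpha)$ and a $K$ sufficiently large to
have, in this range,
$$\max_i\sum_{j=1}^n\Bigl|\frac{\partial f_i}{\partial y_j}(x,y(x,\alpha),\alpha)\Bigr|\le K.$$
Now let $\varepsilon>0$. From the Lipschitz property we see that for $h$ sufficiently small we
have for all $x$ sufficiently close to $x_0$ to stay in the aforementioned range
$$\begin{aligned}
&\sum_{j=1}^n\Bigl|\Bigl(\frac{\partial f_i}{\partial y_j}(x,y(x,\alpha)+\theta_1\Delta y,\alpha+h)-\frac{\partial f_i}{\partial y_j}(x,y(x,\alpha),\alpha+h)\Bigr)z_j(x,\alpha)\Bigr|\\
&\quad+\sum_{j=1}^n\Bigl|\Bigl(\frac{\partial f_i}{\partial y_j}(x,y(x,\alpha),\alpha+h)-\frac{\partial f_i}{\partial y_j}(x,y(x,\alpha),\alpha)\Bigr)z_j(x,\alpha)\Bigr|\\
&\quad+\Bigl|\frac{\partial f_i}{\partial\alpha}(x,y(x,\alpha),\alpha+\theta_2h)-\frac{\partial f_i}{\partial\alpha}(x,y(x,\alpha),\alpha)\Bigr|<\varepsilon
\end{aligned}$$
and hence
$$\Bigl|\frac{\partial v_i}{\partial x}(x,\alpha,h)\Bigr|\le\varepsilon+K\sum_{j=1}^n|v_j(x,\alpha,h)|$$
so that
$$\sum_{i=1}^n\Bigl|\frac{\partial v_i}{\partial x}(x,\alpha,h)\Bigr|\le n\varepsilon+nK\sum_{j=1}^n|v_j(x,\alpha,h)|,$$
and consequently, since $v_i(x_0,\alpha,h)=0$,
$$\sum_{i=1}^n|v_i(x,\alpha,h)|\le\int_{x_0}^x\Bigl(n\varepsilon+nK\sum_{i=1}^n|v_i(t,\alpha,h)|\Bigr)dt.$$
Thus, for $F(x)=n\varepsilon+nK\sum_{i=1}^n|v_i(x,\alpha,h)|$ we have
$$F(x)\le n\varepsilon+nK\int_{x_0}^x F(t)\,dt$$
and can apply Gronwall’s inequality to obtain, for each individual $i$,
$$|v_i(x,\alpha,h)|=\Bigl|\frac1h\bigl(y_i(x,\alpha+h)-y_i(x,\alpha)\bigr)-z_i(x,\alpha)\Bigr|\le\frac{\varepsilon\,(e^{nK(x-x_0)}-1)}{K},$$
and since $\varepsilon>0$ was arbitrary we conclude that
$$\lim_{h\to0}\frac1h\bigl(y_i(x,\alpha+h)-y_i(x,\alpha)\bigr)=z_i(x,\alpha).\qquad\square$$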
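The statement of the theorem can be observed on a small example. The sketch below is an illustration under our own assumptions: the sample right-hand side $f(x,y,\alpha)=\alpha y$ with $y(0)=1$, the helper names `rk4` and `augmented`, and the Runge-Kutta integrator are not from the text. For this $f$, system (5.5.2) reads $z'=\alpha z+y$, $z(0)=0$, whose solution is $z=xe^{\alpha x}$, the derivative of $y=e^{\alpha x}$ by $\alpha$.

```python
import math

def rk4(f, x0, w0, x_end, steps):
    """Classical Runge-Kutta integration of w' = f(x, w) from x0 to x_end."""
    h = (x_end - x0) / steps
    x, w = x0, list(w0)
    for _ in range(steps):
        k1 = f(x, w)
        k2 = f(x + h / 2, [wi + h / 2 * k for wi, k in zip(w, k1)])
        k3 = f(x + h / 2, [wi + h / 2 * k for wi, k in zip(w, k2)])
        k4 = f(x + h, [wi + h * k for wi, k in zip(w, k3)])
        w = [wi + h / 6 * (a + 2 * b + 2 * c + d)
             for wi, a, b, c, d in zip(w, k1, k2, k3, k4)]
        x += h
    return w

def augmented(alpha):
    """System (5.5.1) together with (5.5.2) for f(x, y, alpha) = alpha * y."""
    def rhs(x, w):
        y, z = w
        return [alpha * y,         # y' = f(x, y, alpha)
                alpha * z + y]     # z' = (df/dy) z + df/dalpha
    return rhs

alpha, x_end, dh = 0.7, 1.0, 1e-6
y, z = rk4(augmented(alpha), 0.0, [1.0, 0.0], x_end, 200)
y_shift, _ = rk4(augmented(alpha + dh), 0.0, [1.0, 0.0], x_end, 200)
quotient = (y_shift - y) / dh      # the difference quotient u(x, alpha, h)
```

The value `z` obtained from the linear system and the difference `quotient` agree up to terms of order `dh`, which is the content of the limit at the end of the proof.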
6 A few special differential equations
6.1
First of all, let us realize that in the situations where the theorem on the existence and
uniqueness is applicable, we do not really have to be concerned about the correctness
of the procedure we use (e.g. working with $\frac{dy}{dx}$ as if it were a fraction, failing to
control whether there might not be a zero in a denominator, etc.). If we obtain a
function satisfying the equation (and initial conditions), it has to be the one and
only solution we are looking for, by Theorem 4.2.2. This is a perfect example of the
importance of theoretical work for calculations.
6.2
We have already encountered a differential equation without knowing it. Namely,
looking for a primitive function of a function $f$ amounts to solving the ODE
$$y'=f(x).$$
In general, to determine a primitive function is by no means an easy task (indeed

it is often impossible to obtain a formula in terms of elementary functions). It is,
however, customary to think of an ODE as solved if it is reduced to formulas in
primitive functions.
6.3 Separation of variables
The equation
$$y'=f(x)g(y)$$
can be treated as follows: rewrite it as
$$\frac{1}{g(y(x))}\,y'(x)=f(x)$$
and compare the primitive functions of both sides (these are indicated by plain $\int$).
We obtain
$$\Bigl(\int\frac1g\Bigr)(y(x))=\Bigl(\int f\Bigr)(x)+C.$$
This somewhat clumsy computation can be, more intuitively, modified as follows.
Take the equation as
$$\frac{dy}{dx}=f(x)g(y),$$
proceed to
$$\frac{dy}{g(y)}=f(x)\,dx$$
and “integrate”
$$\int\frac{dy}{g(y)}=\int f(x)\,dx+C.$$
Examples.
1. For $y'=y\cdot\sin x$ we obtain
$$\int\frac{dy}{y}=\int\sin x\,dx+C,$$
hence
$$\ln|y|=-\cos x+C$$
yielding
$$|y|=e^{-\cos x+C},\quad\text{that is,}\quad y=D\cdot e^{-\cos x}.$$
2. Similarly, the equation $y'=1+y^2$ of Example 4.4 1 is transformed to
$$\int\frac{dy}{1+y^2}=x+C,$$
yielding $\arctan y=x+C$ and finally $y=\tan(x+C)$.
3. For
$$y'=-\frac{x}{y}$$
we obtain $\int y\,dy=-\int x\,dx+C$, hence $\frac12y^2=-\frac12x^2+c$ and finally $x^2+y^2=r^2$.
This is a very intuitive example: What curves are perpendicular in each $(x,y)$
to the vector $(x,y)$? Of course, the circles with their centers at $(0,0)$.
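In the spirit of 6.1, formulas obtained this way can simply be checked against the equation. The following sketch is our own illustration: the helper `residual`, the sample constants `D`, `C`, `r` and the test points are hypothetical choices. It compares a central-difference derivative of each of the three solutions with the corresponding right-hand side.

```python
import math

def residual(y, rhs, x, h=1e-6):
    """|y'(x) - rhs(x, y(x))| with y' approximated by a central difference."""
    dy = (y(x + h) - y(x - h)) / (2 * h)
    return abs(dy - rhs(x, y(x)))

D, C, r = 2.0, 0.3, 2.0            # sample values of the integration constants
checks = [
    # Example 1: y = D e^{-cos x} solves y' = y sin x
    (lambda x: D * math.exp(-math.cos(x)), lambda x, y: y * math.sin(x)),
    # Example 2: y = tan(x + C) solves y' = 1 + y^2
    (lambda x: math.tan(x + C), lambda x, y: 1 + y * y),
    # Example 3: the upper half circle y = sqrt(r^2 - x^2) solves y' = -x / y
    (lambda x: math.sqrt(r * r - x * x), lambda x, y: -x / y),
]
worst = max(residual(y, rhs, x) for y, rhs in checks for x in (-0.5, 0.2, 1.0))
```

The test points are chosen inside the domains of all three solutions; `worst` stays at the level of the discretization error.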
6.4
To solve the equation
$$y'=f(ax+by),$$
substitute $z(x)=ax+by(x)$. Then we have $z'=b\cdot y'+a=b\cdot f(z)+a$, a particularly
simple example of the equations from 6.3 where the right-hand side is independent
of $x$ (such an equation is known as an autonomous equation).
6.5
(The “homogeneous equation” - not to be confused with homogeneous linear
differential equations in Chapter 7 below.) To solve the equation
$$y'=f\Bigl(\frac{y}{x}\Bigr),$$
(in other words, $y'=F(x,y)$ where $F$ is such that for any $t\ne0$, $F(x,y)=F(tx,ty)$),
substitute $z=\frac{y}{x}$. Then we obtain
$$z'=\frac{y'x-y}{x^2}=\frac{y'-z}{x}=(f(z)-z)\,\frac1x,$$
again an equation with separated variables.
6.6
The equation
$$y'=f\Bigl(\frac{ax+by+c}{\alpha x+\beta y+\gamma}\Bigr) \tag{6.6.1}$$
would be of the type 6.5 if we had $c=\gamma=0$. If not, let us try to force it. Let $x_0,y_0$
be a solution of the linear (algebraic) equations
$$ax+by+c=0,$$
$$\alpha x+\beta y+\gamma=0.$$
Then
$$\frac{ax+by+c}{\alpha x+\beta y+\gamma}=\frac{a(x-x_0)+b(y-y_0)}{\alpha(x-x_0)+\beta(y-y_0)}.$$
If we substitute
$$\xi=x-x_0,\quad z=y-y_0,$$
we obtain $z(\xi)=y(\xi+x_0)-y_0$ and $\frac{dx}{d\xi}=1$ so that
$$\frac{dz}{d\xi}=y'(\xi+x_0)=f\Bigl(\frac{a\xi+bz}{\alpha\xi+\beta z}\Bigr).$$
The linear algebraic equations above may fail to have a solution: namely we could
have had $(a,b)=K\cdot(\alpha,\beta)$ or $K\cdot(a,b)=(\alpha,\beta)$. Then, however, the equation (6.6.1)
is already of the form $y'=F(Ax+By)$ as it is, and we can use the procedure
from 6.4.
6.7 The linear equation $y'=a(x)y+b(x)$; first encounter with variation of constants
First, solve the equation $y'=a(x)y$. This is a case of separated variables and by
the method from 6.3, we obtain a solution
$$u_1(x)=c\cdot e^{\int a(x)\,dx}. \tag{6.7.1}$$
Let us try to find a solution of the original equation in the form
$$y(x)=c(x)\cdot u_1(x)$$
(because of replacing the constant $c$ from (6.7.1) by a function in $x$ one speaks of a
variation of constant; in a more general setting, it will be used in Chapter 7 below).
Thus, we should have the equality
$$y'=c'u_1+cu_1'$$
and since $u_1'=au_1$, we have, further,
$$y'=c'u_1+cau_1=c'u_1+ay.$$
Thus, we need a $c(x)$ such that
$$b(x)=c'(x)u_1(x)$$
and this equality is satisfied by
$$c(x)=\int\frac{b(x)}{u_1(x)}\,dx+K.$$
6.8 At least one second-order equation
In physics, we encounter the equation
$$y''=f(y).$$
Such an equation can be solved as follows. First, multiply both sides by $y'$ to obtain
$$y'y''=f(y)y',$$
that is,
$$\Bigl(\frac12(y')^2\Bigr)'=\Bigl(\Bigl(\int f\Bigr)\circ y\Bigr)'\quad\text{and further}\quad\frac12(y')^2=\Bigl(\int f\Bigr)\circ y+C$$
($\circ$ indicates composition of functions) and finally
$$y'=\sqrt{2\Bigl(\int f\Bigr)\circ y+C},$$
a case of separated variables.
7 General substitution, symmetry and infinitesimal symmetry of a differential equation
7.1
One may ask how, looking at a differential equation, one finds the substitution which
allows us to separate variables. Of course, in most cases, it is not possible. When it
is, however, there is, in fact, a general strategy for finding the substitution, relating
separation of variables to symmetry. To study symmetry, it is convenient to write
a system of differential equations in a form in which the right-hand side does not
depend explicitly on $x$:
$$y_i'=f_i(y_1,\dots,y_n). \tag{7.1.1}$$
Clearly, this is a special case of the system (1.1.1). On the other hand, a system of the
form (1.1.1) can be always reduced to the form (7.1.1) by introducing an additional
variable $y_0$:
$$y_0'=1,$$
$$y_i'=f_i(y_0,y_1,\dots,y_n).$$
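In code, this reduction is a thin wrapper around the right-hand side. The sketch below is an illustration only; the name `autonomize` and the sample equation $y'=xy$ are our own hypothetical choices.

```python
def autonomize(f):
    """Turn the right-hand side f(x, y) of (1.1.1) into a right-hand side F(w) of (7.1.1).

    The extended state is w = (y0, y1, ..., yn), with y0 playing the role of x.
    """
    def F(w):
        y0, y = w[0], w[1:]
        return [1.0] + list(f(y0, y))   # y0' = 1, yi' = fi(y0, y1, ..., yn)
    return F

# Sample equation y' = x * y becomes (y0, y1)' = (1, y0 * y1).
F = autonomize(lambda x, y: [x * y[0]])
value = F([2.0, 3.0])
```

At the state $(y_0,y_1)=(2,3)$ the autonomous right-hand side `value` consists of the constant component 1 followed by $y_0y_1$.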
7.2
Now assume we have a system of the form (7.1.1). We may write it in vector
notation, putting $\mathbf{y}=(y_1,\dots,y_n)^T$, $\mathbf{f}=(f_1,\dots,f_n)^T$ (recall that reconciling the
direction of composition of maps with matrix multiplication favors viewing vectors
as columns here, see e.g. Appendix A, 7.5):
$$\mathbf{y}'=\mathbf{f}(\mathbf{y}). \tag{7.2.1}$$
Let us point out a geometric interpretation of the system (7.2.1). Denote the
independent variable by $t$. A solution $\mathbf{y}(t)$ can be interpreted as a parametric curve
with the parameter $t$. Then the equation (7.2.1) says that the tangent (“velocity”)
vector of the curve $\mathbf{y}$ at the point $t$ is equal to $\mathbf{f}(\mathbf{y}(t))$. A function $U\to\mathbb{R}^n$ on
a subset $U\subseteq\mathbb{R}^n$, when we interpret its values as vectors, is called a vector field.
The curves $\mathbf{y}(t)$ are called integral curves of the vector field. One sometimes denotes
the solution as
$$\mathbf{y}(t)=\exp(t\mathbf{f})\,\mathbf{y}(0), \tag{7.2.2}$$
although this is somewhat misleading, given the fact that the solution is not an
exponential even in the case of $n=1$ unless $\mathbf{f}$ is constant, and cannot be figured out
explicitly in general when $n>1$.
7.3
Let us now study how a vector field changes when we change variables. By a
substitution at $\mathbf{y}_0\in\mathbb{R}^n$ we shall mean a smooth map $\varphi:U\to\mathbb{R}^n$ where $U$ is an
open neighborhood of $\mathbf{y}_0$ whose differential at $\mathbf{y}_0$ is non-singular. Writing $\mathbf{z}=\varphi(\mathbf{y})$,
then, by the chain rule, we get from (7.2.1) a system of differential equations for $\mathbf{z}$,
$$\mathbf{z}'=D\varphi|_{\varphi^{-1}(\mathbf{z})}\cdot\mathbf{f}(\varphi^{-1}(\mathbf{z})),$$
(the operation on the right-hand side is matrix multiplication), so from the point of
view of differential equations, $\varphi$ transforms the vector field $\mathbf{f}$ to the vector field
$$\mathbf{g}(\mathbf{z})=D\varphi|_{\varphi^{-1}(\mathbf{z})}\cdot\mathbf{f}(\varphi^{-1}(\mathbf{z}))$$
in an open neighborhood of $\mathbf{z}_0=\varphi(\mathbf{y}_0)$. In other words, the differential equation
(7.2.1), expressed in the variables $\mathbf{z}$, reads
$$\mathbf{z}'=\mathbf{g}(\mathbf{z}). \tag{7.3.1}$$
7.4
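We will call $\phi$ a symmetry (at $\mathbf{y}^0$) if the differential equations (7.2.1) and (7.3.1)
coincide, i.e. we have $\mathbf{g}(\mathbf{z}) = \mathbf{f}(\mathbf{z})$, or
$$\mathbf{f}(\phi(\mathbf{y})) = D\phi|_{\mathbf{y}} \cdot \mathbf{f}(\mathbf{y}). \tag{7.4.1}$$
However, we are less interested in a single symmetry than in a (continuous) family
of symmetries. By this, we mean a smooth map $\phi: \mathbb{R}^{n+1} \to \mathbb{R}^n$, which, denoting
the first variable as $\varepsilon$, and writing $\phi(\varepsilon, ?)$ as $\phi_\varepsilon: \mathbb{R}^n \to \mathbb{R}^n$, has the property that
each $\phi_\varepsilon$ is a symmetry, and
$$\phi_0 = \mathrm{Id}$$
(in other words, $\phi_0(\mathbf{y}) = \mathbf{y}$). Given a family of symmetries, what is happening near
$\varepsilon = 0$? Let
$$\mathbf{u} = \left.\frac{\partial \phi_\varepsilon}{\partial \varepsilon}\right|_{\varepsilon = 0}. \tag{7.4.2}$$
Then considering the condition (7.4.1) for $\phi = \phi_\varepsilon$ and differentiating by $\varepsilon$ at $\varepsilon = 0$,
we get that
$$\left.\frac{\partial}{\partial \varepsilon}\, \mathbf{f}(\phi_\varepsilon(\mathbf{y}))\right|_{\varepsilon = 0} = D\mathbf{f} \cdot \mathbf{u}(\mathbf{y}) = \partial_{\mathbf{u}} \mathbf{f}(\mathbf{y}),$$
$$\left.\frac{\partial}{\partial \varepsilon}\, \bigl(D\phi_\varepsilon|_{\mathbf{y}} \cdot \mathbf{f}(\mathbf{y})\bigr)\right|_{\varepsilon = 0} = D\mathbf{u}|_{\mathbf{y}} \cdot \mathbf{f}(\mathbf{y}) = \partial_{\mathbf{f}} \mathbf{u}(\mathbf{y})$$
(here on the right-hand side we use the notation $\partial_{\mathbf{u}} \mathbf{f}(\mathbf{y}) = \frac{d}{dt} \mathbf{f}(\mathbf{y} + t\mathbf{u})$, see 2.4 of
Chapter 3).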
7.5
For two smooth vector fields $\mathbf{u}$, $\mathbf{f}$, we write
$$[\mathbf{u}, \mathbf{f}] = \partial_{\mathbf{u}}(\mathbf{f}) - \partial_{\mathbf{f}}(\mathbf{u}),$$
and call this the Lie bracket of vector fields. This is, again, a vector field. The
derivative of the condition (7.4.1) at $\varepsilon = 0$ then reads
$$[\mathbf{u}, \mathbf{f}] = 0. \tag{7.5.1}$$
A smooth vector field $\mathbf{u}$ defined on an open neighborhood of $\mathbf{y}^0$ which satisfies
(7.5.1) will be called an infinitesimal symmetry of the differential equation (7.2.1)
at $\mathbf{y}^0$. For technical reasons (dealing with possibly different domains of definition),
we will consider two infinitesimal symmetries at $\mathbf{y}^0$ equal when they coincide on an
open neighborhood of $\mathbf{y}^0$.
7.6
It is worth pointing out two properties of the Lie bracket of vector fields:
$$[\mathbf{u}, \mathbf{v}] = -[\mathbf{v}, \mathbf{u}], \tag{7.6.1}$$
$$[\mathbf{u}, [\mathbf{v}, \mathbf{w}]] + [\mathbf{v}, [\mathbf{w}, \mathbf{u}]] + [\mathbf{w}, [\mathbf{u}, \mathbf{v}]] = 0. \tag{7.6.2}$$
The equality (7.6.2) is called the Jacobi identity. Generally, a vector space over $\mathbb{R}$
or $\mathbb{C}$ with a binary operation $[?, ?]$ which is linear in each coordinate and satisfies
the equalities (7.6.1), (7.6.2) is called a Lie algebra. Thus, in particular, smooth
vector fields defined on the same open subset of $\mathbb{R}^n$ form a Lie algebra, as do
infinitesimal symmetries of the differential equation (7.2.1) at a given point $\mathbf{y}^0$ (this
follows from the Jacobi identity).
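The definition $[\mathbf{u}, \mathbf{f}] = \partial_{\mathbf{u}}(\mathbf{f}) - \partial_{\mathbf{f}}(\mathbf{u})$ can be tried out numerically. The sketch below is only an illustration under stated assumptions: the directional derivatives are replaced by central differences, and the three planar fields `rotation`, `scaling` and `shear`, like all function names, are hypothetical choices that do not appear in the text.

```python
def directional(g, y, v, h=1e-5):
    """Central-difference approximation of d/dt g(y + t v) at t = 0."""
    plus = g([yi + h * vi for yi, vi in zip(y, v)])
    minus = g([yi - h * vi for yi, vi in zip(y, v)])
    return [(p - m) / (2 * h) for p, m in zip(plus, minus)]

def bracket(u, f, y):
    """Value at y of the Lie bracket [u, f] = d_u f - d_f u."""
    d_u_f = directional(f, y, u(y))
    d_f_u = directional(u, y, f(y))
    return [a - b for a, b in zip(d_u_f, d_f_u)]

# Three illustrative vector fields in the plane.
def rotation(y):
    return [-y[1], y[0]]

def scaling(y):
    return [y[0], y[1]]

def shear(y):
    return [y[1], 0.0]

point = [0.7, -1.3]
print(bracket(scaling, rotation, point))  # scaling commutes with rotation
print(bracket(shear, rotation, point))
print(bracket(rotation, shear, point))    # opposite sign, as in (7.6.1)
```

Since the scaling field has zero bracket with the rotation field, it is an infinitesimal symmetry of $\mathbf{y}' = (-y_2, y_1)^T$ in the sense of 7.5; the last two lines illustrate the antisymmetry (7.6.1).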
7.7 Comment
Several concepts of this and the next section are closely related to Chapter 12 below.
After finishing that chapter, the reader may be ready to tie this together with some
highly interesting and important geometrical notions which are beyond the scope of
this text. For example, the notion of Lie algebra just mentioned leads to the notion
of a Lie group. In Chapter 12, we will develop enough techniques to introduce the
concept of a Lie group, and will mention it briefly in Exercises (6), (7), (8) of
Chapter 12. Lie groups are a major field of mathematical study. We recommend
[9, 10] for further reading.
8 Symmetry and separation of variables
8.1
Given a single infinitesimal symmetry $\mathbf{u}$, then
$$\exp(\varepsilon \mathbf{u}) \tag{8.1.1}$$
(used in the sense of the notation (7.2.2)) is a continuous family of symmetries. This
is because in case of $\phi_\varepsilon$ equal to (8.1.1), by definition, the derivative of the condition
(7.4.1) by $\varepsilon$ is the same at every point $\varepsilon$, and is equal to $[\mathbf{u}, \mathbf{f}] = 0$. Now given an
infinitesimal symmetry $\mathbf{u}$ of the equation (7.2.1) at a point $\mathbf{y}^0$, and assuming
$$\mathbf{u}(\mathbf{y}^0) \neq 0, \tag{8.1.2}$$
then, without loss of generality, we may assume that
$$\mathbf{u}, f_2(\mathbf{y}^0), \dots, f_n(\mathbf{y}^0) \text{ form a basis of } \mathbb{R}^n. \tag{8.1.3}$$
(By Steinitz’ Theorem 2.6 of Appendix A, this can be always achieved after
permuting the coordinates $f_i$.) Assuming (8.1.3) holds, consider the following
smooth map $U \to \mathbb{R}^n$ defined in an open neighborhood $U$ of $\mathbf{y}^0$:
$$\Phi((z_1, \dots, z_n)^T) = \exp((z_1 - y_1^0)\mathbf{u})\,(y_1^0, z_2, \dots, z_n). \tag{8.1.4}$$
We have set things up in such a way that
$$\Phi(\mathbf{y}^0) = \mathbf{y}^0,$$
(although obviously that is not important), and by (8.1.3) and the Implicit Function
Theorem, the map $\Phi$ has a smooth inverse $\Psi$ in an open neighborhood of $\mathbf{y}^0$. We
consider the substitution
$$\mathbf{z} = \Psi(\mathbf{y}). \tag{8.1.5}$$
Because $\mathbf{u}$ is an infinitesimal symmetry, the differential equation expressed in the
variables $\mathbf{z}$, i.e. (7.3.1), has a family of symmetries
$$z_1 \mapsto z_1 + \varepsilon, \tag{8.1.6}$$
$$z_i \mapsto z_i \quad \text{for } i = 2, \dots, n. \tag{8.1.7}$$
This means that the function $\mathbf{g}$ does not depend on the variable $z_1$, and thus, we
have reduced the number of variables by 1: we have a system of $n - 1$ differential
equations in the variables $z_2, \dots, z_n$, and an equation for $z_1'$ in terms of $z_2, \dots, z_n$.
For $n = 2$, this implies a complete solution (separation of variables). Of course,
to make this method work, we must be able to evaluate (8.1.1), which, a priori, is
a system of $n$ differential equations. However, in some cases, symmetries may be
more easily visible than direct solutions.
8.2
It is useful to mention one generalization. By a generalized symmetry of the
equation (7.2.1) we shall mean a substitution $\mathbf{z} = \phi(\mathbf{y})$ such that
$$\mathbf{f}(\phi(\mathbf{y})) = \alpha(\mathbf{y})\, D\phi|_{\mathbf{y}} \cdot \mathbf{f}(\mathbf{y}), \tag{8.2.1}$$
for some function $\alpha: U \to \mathbb{R}$ (i.e., a scalar). The significance of a generalized
symmetry is that it preserves the direction, but not the magnitude of the tangent
vectors to the integral curves. Thus, roughly speaking, a generalized symmetry
preserves the integral curves as sets, but not their parametrization.
The infinitesimal version of this condition is
$$[\mathbf{u}, \mathbf{f}] = \lambda \cdot \mathbf{f} \tag{8.2.2}$$
for another scalar function $\lambda: U \to \mathbb{R}$. Again, generalized infinitesimal symmetries
of (7.2.1) at a point form a Lie algebra, a derivative at 0 of a continuous
system of generalized symmetries is a generalized infinitesimal symmetry, and
conversely, (8.1.1) for a generalized infinitesimal symmetry is a continuous family
of generalized symmetries.
In the case of a generalized symmetry, we may still apply the substitution (8.1.4).
As a result, (8.1.6), (8.1.7) will be a generalized symmetry of the system (7.3.1). In
this case, we know that the function
$$\mathbf{g}(\mathbf{z}) / g_1(\mathbf{z})$$
does not depend on $z_1$, so (7.3.1) reduces to a system of $n - 1$ equations
$$\frac{dz_i}{dz_1} = \frac{g_i(\mathbf{z})}{g_1(\mathbf{z})}, \quad i = 2, \dots, n.$$
Note, however, that now unless the factor $\alpha$ of the generalized symmetry has some
special form, we still end up with a general first-order differential equation for the
variable $z_1$.
8.2.1 Example
Consider the homogeneous differential equation
$$y' = f\left(\frac{y}{x}\right).$$
In symmetric form, this is
$$y' = f\left(\frac{y}{x}\right), \qquad x' = 1.$$
We have an obvious family of generalized symmetries
$$\phi_\lambda (x, y)^T = (\lambda x, \lambda y)^T \tag{*}$$
(to conform with the above notation, $\varepsilon = \lambda - 1$). The corresponding infinitesimal
symmetry is
$$\mathbf{u}(x, y)^T = (x, y)^T,$$
which exponentiates to
$$\exp(z\mathbf{u})(x, y)^T = e^z (x, y)^T,$$
so (fixing, say, $x^0 = 1$ and calling the new variables $z$, $v$), the substitution becomes
$$(x, y)^T = (e^{z-1}, y^0 e^{z-1} v)^T,$$
or
$$z = 1 + \ln(x), \qquad v = \frac{y}{y^0 x}. \tag{**}$$
Up to scalar multiple, the formula for $v$ is the substitution from the last section. It
is worthwhile noting, however, that in the present form, we obtain the autonomous
equation
$$\frac{dv}{dz} = \frac{1}{y^0}\bigl(f(y^0 v) - y^0 v\bigr)$$
(which we may not have noticed in the last section). Obviously, the rather simple
form of the generalized infinitesimal symmetry allows us to recover $z$ in this case.
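The reduction can be checked numerically. The following sketch rests on assumptions that are not in the text: it takes $y^0 = 1$ (so that $v = y/x$ and the autonomous equation reads $dv/dz = f(v) - v$), picks the sample right-hand side $f(s) = 1 + s^2$, and uses a hand-written Runge-Kutta helper `rk4`. It integrates the original equation and the reduced one separately and compares the results.

```python
import math

def rk4(rhs, t0, y0, t1, steps=2000):
    """Integrate the scalar equation y' = rhs(t, y) from t0 to t1 (classical RK4)."""
    h = (t1 - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        k1 = rhs(t, y)
        k2 = rhs(t + h / 2, y + h * k1 / 2)
        k3 = rhs(t + h / 2, y + h * k2 / 2)
        k4 = rhs(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y

def f(s):
    """Illustrative choice of f in y' = f(y / x)."""
    return 1.0 + s * s

x_end = 1.5

# Original equation y' = f(y / x) with y(1) = 1.
y_direct = rk4(lambda x, y: f(y / x), 1.0, 1.0, x_end)

# Reduced equation with y^0 = 1: z = 1 + ln x, v = y / x, dv/dz = f(v) - v.
v_end = rk4(lambda z, v: f(v) - v, 1.0, 1.0, 1.0 + math.log(x_end))
y_reduced = x_end * v_end

print(y_direct, y_reduced)
```

The two printed values should agree up to the discretization error, since the second computation never sees the independent variable $z$ on the right-hand side.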
8.3 Example
The fact that for $n = 2$, a symmetry leads to separation of variables, raises the
question whether the separated equation
$$y' = a(x) y \tag{8.3.1}$$
always has an infinitesimal symmetry. In fact, we plainly see that making a
substitution in $x$ (independent of $y$) introduces multiplication by a function of $x$,
so we should be able to make a substitution in $x$ which would eliminate the factor
$a(x)$, and the equation would become autonomous. This suggests an infinitesimal
symmetry of the form
$$\mathbf{u} = (k(x), 0)^T. \tag{8.3.2}$$
The condition (7.5.1) becomes
$$k'(x) = k(x) a(x), \tag{8.3.3}$$
which can be solved. In fact, it is the original equation, so this is no simplification,
but we have found a symmetry, which, as we will see, is useful. Note also that
the fact that the equation (8.3.3) coincides with (8.3.1) has a geometric reason:
Choosing a non-zero characteristic $C$ of the equation (8.3.1), the vector field the
value of which at each $(x, y)^T$ is the vertical vector from $(x, 0)^T$ to the characteristic
$C$ is a symmetry because any other characteristic, considered as a function of $x$, is
a constant multiple of the function with graph $C$.
8.4 Example
The symmetry (8.3.2) (subject to the condition (8.3.3)) plainly also is a symmetry
of the equation
$$y' = a(x) y + b(x) \tag{8.4.1}$$
(since (8.3.2) has 0 Lie bracket with $(b(x), 0)^T$). Thus, we may use this symmetry
to solve the equation (8.4.1). The substitution we get by choosing $\mathbf{y}^0 = (0, 0)$,
$y_1 = y$, $y_2 = x$, is
$$y = z_1 k(z_2), \qquad x = z_2.$$
Setting $z = z_1$, we get
$$z' = \frac{y' k(x) - y k'(x)}{k(x)^2} = \frac{b(x)}{k(x)},$$
which is solvable by an integral, as desired.
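As a concrete instance of the last formula, the sketch below solves $y' = a(x)y + b(x)$ by integrating $z' = b(x)/k(x)$ and multiplying back by $k$. Everything specific in it is an assumption chosen for illustration: the coefficients $a(x) = 2x$, $b(x) = x$, the solution $k(x) = e^{x^2}$ of (8.3.3), the initial value $y(0) = 1$, and the composite Simpson rule used for the integral.

```python
import math

def simpson(g, a, b, n=200):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def a(x):
    return 2.0 * x

def b(x):
    return x

def k(x):
    """A solution of k' = a k, here with k(0) = 1."""
    return math.exp(x * x)

def solve(x, y_start):
    """y(x) for y' = a y + b with y(0) = y_start, via z = y / k and z' = b / k."""
    z = y_start / k(0.0) + simpson(lambda t: b(t) / k(t), 0.0, x)
    return z * k(x)

y1 = solve(1.0, 1.0)
exact = -0.5 + 1.5 * math.e   # closed form -1/2 + (3/2) e^{x^2} at x = 1
print(y1, exact)
```

For these coefficients the integral can also be done by hand, giving $y = -\tfrac12 + \tfrac32 e^{x^2}$, which is what the variable `exact` evaluates at $x = 1$.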
9 Exercises
(1) Convert the differential equation

x
y 000 D .y 00 /2 C ln.x/
sin.y 2 /
into a system of first-order ODE’s.

(2) Convert the system of ordinary differential equations
z0 y 0
y 00 D C y3;
zCy Cx
z00 D ln.z0 C cos.y 0 C z// C 3
into a system of first-order ODE’s.
(3) (a) Using Exercise (9) of Chapter 3, describe a procedure of converting a
system of equations of the form (1.3.1) to a system of the form (1.2.1)
with $k_n$ raised to $k_n + 1$ without assuming we can find the implicit function
explicitly. (Note that this may even be useful in the case $k_0 = \dots = k_n = 0$.)
(b) Using this method, convert the differential equation
$$y' + \sin(x + y + y') = 0$$
into an ordinary (explicit) second-order differential equation.

(4) State and prove an analogue of Corollary 4.3 for the general system (1.2.1) of
1.2.
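(5) Solve the differential equation
$$y' = \frac{2x}{e^y}.$$
(6) Solve the differential equation
$$y' = \frac{x^2 + y^2}{xy}.$$
(7) Solve the differential equation
$$y' = x + \frac{y}{x}.$$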
(8) Prove the Jacobi identity for vector fields.
(9) Prove that infinitesimal symmetries of a system of differential equations at a
point $\mathbf{y}^0$ form a Lie algebra under the operation of Lie bracket of vector fields.
(10) Prove that generalized infinitesimal symmetries of a system of differential
equations at a point $\mathbf{y}^0$ form a Lie algebra under the operation of Lie bracket
of vector fields.
(11) Prove that a generalized infinitesimal symmetry exponentiates to a generalized
symmetry.
(12) Find an infinitesimal symmetry of the equation
$$y' = f(ax + by)$$
and recover the solution.
(13) Find a generalized infinitesimal symmetry of the equation
$$y' = f\left(\frac{ax + by + c}{\alpha x + \beta y + \gamma}\right)$$
and use it to find the solution.
Chapter 7 Systems of Linear Differential Equations
Systems of linear differential equations have many special properties, the most
important of which is that a characteristic is defined in any open interval in which
the system is defined (in contrast with ODE, see Example 4.4.1 of Chapter 6).
In this chapter, we prove this important “no blow-up” theorem, and discuss the
linear character of the set of solutions. We also describe a method for solving
completely the important class of systems of linear differential equations with
constant coefficients.
1 The definition and the existence theorem for a system of linear differential equations
1.1
Let $a_{ij}, b_i$ be continuous functions on an open interval $J$. A system of linear
differential equations (briefly, LDE’s) is the following special case of the system
of ODE’s (1.1.1) of Chapter 6:
$$y_i'(x) = \sum_{j=1}^{n} a_{ij}(x) y_j(x) + b_i(x), \quad i = 1, \dots, n. \tag{L}$$
Recall that such systems arise naturally as equations for partial derivatives of
solutions of general differential equations by a parameter (see (5.5.2) of Chapter 6).
A linear (differential) equation of order $n$, where $a_i, b$ are continuous on $J$, is
$$y^{(n)}(x) + a_{n-1}(x) y^{(n-1)}(x) + \dots + a_1(x) y'(x) + a_0(x) y(x) = b(x). \tag{L̃}$$
Again, the equation (L̃) is easy to translate to a system of the form (L) by the method
of 1.2 of Chapter 6. In fact, again, one may call (L) a system of first order LDE’s, define
systems of higher order LDE’s, and then show such systems are equivalent to systems of first
order LDE’s using the method of 1.2 of Chapter 6. Consequently, it suffices, again, to
develop a theory for first-order systems (L). However, in some practical situations,
it is advantageous to treat the special case of a single higher order equation (L̃)
separately, as we will see below.
If all the functions $b_i$ are zero (in the case (L̃), if $b$ is zero), we speak of
homogeneous equations resp. equation. The homogeneous counterpart of an (L)
resp. (L̃) will be indicated by (L-hom) resp. (L̃-hom).
1.2 Lemma. Let $f$ be continuous and bounded on the half-open interval $\langle a, b)$.
Define a value of $f$ at $b$ arbitrarily. Then there exists the (Riemann) integral
$\int_a^b f(t)\,dt$ and we have
$$\int_a^b f(t)\,dt = \lim_{x \to b} \int_a^x f(t)\,dt.$$
Comment: We prove this result here directly to make this chapter (and Chapter 6
above) largely self-contained, and independent of the techniques of the Lebesgue
integral as introduced in Chapters 4, 5. The attentive reader, however, should see
how the present statement follows from a much stronger result in Exercise (10) of
Chapter 5, Exercise (4) of Chapter 4, and the Lebesgue Dominated Convergence
Theorem.
Proof. The Riemann integrals $\int_a^x$ trivially exist (because of the continuity).
Let $|f(x)| \le C$ and let $\varepsilon > 0$. We can choose partitions $D(x)$ of $\langle a, x\rangle$ such that
$$\int_a^x f - \frac{\varepsilon}{2} \le s(f|\langle a, x\rangle, D(x)) \le S(f|\langle a, x\rangle, D(x)) \le \int_a^x f + \frac{\varepsilon}{2} \tag{*}$$
(notation from Section 8 of Chapter 1).
Let $x > b - \frac{\varepsilon}{2C}$. Define a partition $D'(x)$ of $\langle a, b\rangle$ by adding the interval $\langle x, b\rangle$
to $D(x)$. Then we have
$$s(f|\langle a, x\rangle, D(x)) - \frac{\varepsilon}{2} \le s(f|\langle a, x\rangle, D(x)) - (b - x)C \le s(f, D'(x)) \le \underline{\int_a^b} f$$
$$\le \overline{\int_a^b} f \le S(f, D'(x)) \le S(f|\langle a, x\rangle, D(x)) + (b - x)C \le S(f|\langle a, x\rangle, D(x)) + \frac{\varepsilon}{2}. \tag{**}$$
From (*) and (**), we obtain
$$\int_a^x f - \varepsilon \le \underline{\int_a^b} f \le \overline{\int_a^b} f \le \int_a^x f + \varepsilon,$$
hence
$$\left| \int_a^x f - \underline{\int_a^b} f \right| \le \varepsilon \quad \text{and} \quad \left| \int_a^x f - \overline{\int_a^b} f \right| \le \varepsilon$$
and finally
$$\underline{\int_a^b} f = \lim_{x \to b} \int_a^x f = \overline{\int_a^b} f. \qquad \square$$
1.3 Theorem. Let $a_{ij}(x), b_i(x)$ be continuous on an interval $J$, let $x_0 \in J$ and let
$\eta_j$, $j = 1, \dots, n$, be arbitrary real numbers. Then the LDE system
$$y_i'(x) = \sum_{j=1}^{n} a_{ij}(x) y_j(x) + b_i(x), \quad i = 1, \dots, n$$
has precisely one solution $y_1, \dots, y_n$, defined on the whole of $J$, such that
$y_j(x_0) = \eta_j$.
Proof. Uniqueness follows from the general Theorem 4.1 of Chapter 6, from which
we also know that there exists a solution defined on a neighborhood of the point
$x_0$. We will prove that this solution can be extended on the whole of $J$. We will
construct the extension on the part of the interval to the right of $x_0$, the extension to
the left is analogous.
Recall 2.1 of Chapter 6 and denote by $M$ the set of all $z \in J$, $z \ge x_0$ such that
there is a solution of the equations
$$y_i(x) = \int_{x_0}^{x} \Bigl( \sum_{j=1}^{n} a_{ij}(t) y_j(t) + b_i(t) \Bigr) dt + \eta_i$$
on $\langle x_0, z\rangle$. Set $s = \sup M$. If the set $M$ is not all of $J \cap \langle x_0, +\infty)$, we have
(1) $s$ finite, and
(2) $s \in J \setminus M$.
((1) is obvious; regarding (2), either $s < \sup J$ and there is a solution in one of
its neighborhoods, or $s \notin M$ while $s \in J$, since it is the only point at which
$J \cap \langle x_0, +\infty)$ can differ from $M$).
Since $a_{ij}$ and $b_i$ are continuous functions defined on $\langle x_0, s\rangle$, they are bounded on
this interval, say
$$|a_{ij}(x)| \le A \quad \text{and} \quad |b_i(x)| \le B.$$
Choose $C$, $\alpha$ sufficiently large to have
$$\alpha > 2nA \quad \text{and} \quad B(s - x_0) + \max_i |\eta_i| < \frac{C}{2} e^{\alpha x_0}$$
and moreover such that the set
$$\tilde{M} = \{x \mid x \in M \text{ and } |y_i(x)| < C e^{\alpha x}\}$$
is non-empty. The set $\tilde{M}$ is obviously open in $M$. The set $M$ is obviously connected.
Thus, if we prove that $\tilde{M}$ is closed we will see (recall Section 5 of Chapter 2)
that $\tilde{M} = M$, and to do that it suffices to show that $\tilde{M}$ is closed under limits of
increasing sequences (if $|y_i(y)| < C e^{\alpha y}$ holds for $y$’s arbitrarily close above $x$ it
obviously holds for $x$ as well).
To this end, consider an increasing sequence $x_n$ of points of $\tilde{M}$ and let
$\lim_n x_n = \xi$. Important: we do not assume $\xi \in M$; this will be a consequence of
this part of the proof, and will be used later.
From continuity, we immediately see that $|y_i(x)| \le C e^{\alpha x}$ on the interval $\langle x_0, \xi)$
and hence by Lemma 1.2 we know that the Riemann integral
$$\int_{x_0}^{\xi} \Bigl( \sum_j a_{ij}(t) y_j(t) + b_i(t) \Bigr) dt$$
is equal to $\lim_{x \to \xi} \int_{x_0}^{x} \bigl( \sum_j a_{ij}(t) y_j(t) + b_i(t) \bigr) dt = \lim_n y_i(x_n) - \eta_i$. If we define $y_i(\xi)$ as
$\lim_n y_i(x_n)$ (if the $y_i$ has been already defined at $\xi$, this coincides, by continuity, with the
original value) we have extended the solution of our LDE system to $\xi$ (hence, in
particular, we have $\xi \in M$). We have, however
$$\Bigl| \int_{x_0}^{\xi} \Bigl( \sum_j a_{ij}(t) y_j(t) + b_i(t) \Bigr) dt + \eta_i \Bigr| \le \int_{x_0}^{\xi} (nAC e^{\alpha t} + B)\,dt + |\eta_i|$$
$$\le \frac{nAC}{\alpha} e^{\alpha \xi} + B(\xi - x_0) + |\eta_i| < C e^{\alpha \xi},$$
so $\xi$ is not only in $M$, but in fact in $\tilde{M}$.
Therefore, $\tilde{M} = M$. Now we will take advantage of the fact that in our
procedure, we did not assume $\xi$ to be in $M$: the supremum point $s$ can be written as
a limit of an increasing sequence of elements from $M$ (equal to $\tilde{M}$) in contradiction
with $s \in J \setminus M$ which followed from the assumption that $M \neq J \cap \langle x_0, +\infty)$. □
1.4 Corollary. Let $a_i(x)$ $(i = 0, \dots, n - 1)$ and $b(x)$ be continuous on an interval $J$,
let $x_0 \in J$ and let $\eta_i$, $i = 0, \dots, n - 1$ be arbitrary. Then the equation
$$y^{(n)} + \sum_{j=0}^{n-1} a_j(x) y^{(j)}(x) = b(x)$$
has precisely one solution on the interval $J$ satisfying the conditions $y^{(j)}(x_0) = \eta_j$
for all $j = 0, \dots, n - 1$.
2 Spaces of solutions
2.1
In this section, the continuous functions
$$a_{ij},\ b_i$$
are defined on an open interval $J$. We denote by $C(J)$ the $\mathbb{R}$-vector space of all
continuous functions on $J$. Further, we denote the vector space
$$C(J) \times \dots \times C(J) \quad (n \text{ times})$$
(the $n$-th power of $C(J)$) by
$$C^n(J).$$
2.2 Theorem. The system of all solutions of the LDE system (L) constitutes an
affine subset $\mathbf{y}_0 + W$ of $C^n(J)$, and the system of all solutions of the $n$-th order
equation (L̃) constitutes an affine subset $y_0 + W$, where the vector subspaces $W$ are
the sets of all solutions of the associated homogeneous equations.
Proof. This will be done for (L). Obviously if $\mathbf{y} = (y_1, \dots, y_n)$ and $\mathbf{z} = (z_1, \dots, z_n)$
solve the associated homogeneous system (L-hom) then so does any $\alpha\mathbf{y} + \beta\mathbf{z}$, and
the system of all the solutions of (L-hom) is a vector subspace of $C^n(J)$. Now
if $\mathbf{y}_0 = (y_{01}, \dots, y_{0n})$ solves (L), that is, if $y_{0i}' = \sum_j a_{ij} y_{0j} + b_i$, and if $\mathbf{y}$ solves
(L-hom), that is, $y_i' = \sum_j a_{ij} y_j$, then $y_{0i}' + y_i' = \sum_j a_{ij}(y_{0j} + y_j) + b_i$ and $\mathbf{y}_0 + \mathbf{y}$
solves (L). On the other hand if $\mathbf{z}$ is an arbitrary solution of (L) then $z_i' - y_{0i}' =
\sum_j a_{ij}(z_j - y_{0j})$ so that $\mathbf{z} - \mathbf{y}_0 \in W$ and $\mathbf{z} = \mathbf{y}_0 + (\mathbf{z} - \mathbf{y}_0) \in \mathbf{y}_0 + W$. □
Remark. Of course the principle is the same as in the solution of systems of
algebraic linear equations.
Theorem. The dimensions of (both of) the affine sets from the previous theorem
are $n$.
Proof. Again, we will prove the statement for the system (L). Let $\mathbf{y}_1, \dots, \mathbf{y}_p$ be
solutions of (L-hom) and let $p > n$. Take an $x_0 \in J$. Then the system of algebraic
linear equations
$$\begin{aligned}
y_{11}(x_0)\alpha_1 + y_{21}(x_0)\alpha_2 + \dots + y_{p1}(x_0)\alpha_p &= 0 \\
&\ \,\vdots \\
y_{1n}(x_0)\alpha_1 + y_{2n}(x_0)\alpha_2 + \dots + y_{pn}(x_0)\alpha_p &= 0
\end{aligned}$$
has a non-trivial solution $\alpha_1, \dots, \alpha_p$ (in fact, the vector space of such solutions has
dimension at least $p - n$). Set
$$\mathbf{y} = \sum_{i=1}^{p} \alpha_i \mathbf{y}_i.$$
In particular we have $\mathbf{y}(x_0) = (0, \dots, 0)$. But we already know such a solution,
namely zero: $\mathbf{o} = (\mathrm{const}_0, \dots, \mathrm{const}_0)$. From uniqueness, it now follows that
$\sum_{i=1}^{p} \alpha_i \mathbf{y}_i = \mathbf{o}$, i.e. that the system $\mathbf{y}_1, \dots, \mathbf{y}_p$ is linearly dependent; hence, the
dimension of $W$ is at most $n$. On the other hand consider the solutions $\mathbf{y}_i$ of
(L-hom) such that $y_{ij}(x_0)$ is 1 for $i = j$ and 0 otherwise. Then we obtain a linearly
independent system: if $\sum \alpha_i \mathbf{y}_i = \mathbf{o}$ then in particular $\sum \alpha_i \mathbf{y}_i(x_0) = 0$, that is,
$\sum_i \alpha_i \delta_{ij} = 0$ and all the $\alpha_i$ are zero. □
2.3 The Wronski determinants (Wronskians)
For solutions $\mathbf{y}_1, \dots, \mathbf{y}_n$ of (L-hom), one introduces the determinant
$$W(\mathbf{y}_1, \dots, \mathbf{y}_n)(x) = \begin{vmatrix} y_{11}(x) & \dots & y_{1n}(x) \\ \vdots & & \vdots \\ y_{n1}(x) & \dots & y_{nn}(x) \end{vmatrix}.$$
For solutions $y_1, \dots, y_n$ of the equation (L̃-hom), one introduces
$$W(y_1, \dots, y_n)(x) = \begin{vmatrix} y_1(x) & \dots & y_n(x) \\ y_1'(x) & \dots & y_n'(x) \\ \vdots & & \vdots \\ y_1^{(n-1)}(x) & \dots & y_n^{(n-1)}(x) \end{vmatrix}.$$
The functions $W(\mathbf{y}_1, \dots, \mathbf{y}_n)(x)$ resp. $W(y_1, \dots, y_n)(x)$ are called the Wronski
determinants of the equations in question.
Remark. Note that the latter is in fact a special case of the former obtained from
the standard translation as in 1.2 of Chapter 6.
2.4 Theorem. The following statements are equivalent for a system of solutions
$\mathbf{y}_1, \dots, \mathbf{y}_n$ of the system (L-hom) (the interval $J$ is as before):
(1) the solutions $\mathbf{y}_1, \dots, \mathbf{y}_n$ are linearly independent,
(2) $W(\mathbf{y}_1, \dots, \mathbf{y}_n)(x) \neq 0$ at all $x \in J$,
(3) there exists an $x_0 \in J$ such that $W(\mathbf{y}_1, \dots, \mathbf{y}_n)(x_0) \neq 0$.
Similarly for the equation (L̃-hom). If the conditions hold, the system $\mathbf{y}_1, \dots, \mathbf{y}_n$ is called a
fundamental system of solutions.
Proof. We will prove the statement for the case (L̃-hom), just for a change.
(1)⇒(2): Suppose (2) does not hold and we have an $x_0 \in J$ such that
$$W(y_1, \dots, y_n)(x_0) = \begin{vmatrix} y_1(x_0) & \dots & y_n(x_0) \\ y_1'(x_0) & \dots & y_n'(x_0) \\ \vdots & & \vdots \\ y_1^{(n-1)}(x_0) & \dots & y_n^{(n-1)}(x_0) \end{vmatrix} = 0.$$
Then the system of algebraic linear equations
$$\begin{aligned}
y_1(x_0)\alpha_1 + y_2(x_0)\alpha_2 + \dots + y_n(x_0)\alpha_n &= 0, \\
y_1'(x_0)\alpha_1 + y_2'(x_0)\alpha_2 + \dots + y_n'(x_0)\alpha_n &= 0, \\
&\ \,\vdots \\
y_1^{(n-1)}(x_0)\alpha_1 + y_2^{(n-1)}(x_0)\alpha_2 + \dots + y_n^{(n-1)}(x_0)\alpha_n &= 0
\end{aligned}$$
has a non-trivial solution $\alpha_1, \dots, \alpha_n$. If we set $y = \sum \alpha_i y_i$ we have in particular
$y(x_0) = y'(x_0) = \dots = y^{(n-1)}(x_0) = 0$. This holds for the trivial constant zero
solution as well and hence, by uniqueness, $\sum \alpha_i y_i = \mathrm{const}_0$ and our solutions are
linearly dependent.
The implications (2)⇒(3) and (3)⇒(1) are trivial. □
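To see the theorem at work on a small case, here is a sketch for a second-order equation. The equation $y'' - 3y' + 2y = 0$, its solutions $e^x$ and $e^{2x}$, and the function names are illustrative assumptions, not taken from the text; the derivatives are supplied by hand rather than computed.

```python
import math

def wronskian(y1, dy1, y2, dy2, x):
    """Wronski determinant of two solutions of a second-order equation at x."""
    return y1(x) * dy2(x) - y2(x) * dy1(x)

# y'' - 3y' + 2y = 0 has the solutions e^x and e^{2x}.
def y1(x):
    return math.exp(x)

def dy1(x):
    return math.exp(x)

def y2(x):
    return math.exp(2 * x)

def dy2(x):
    return 2 * math.exp(2 * x)

# 3 e^x is another solution, linearly dependent on y1.
def y3(x):
    return 3 * math.exp(x)

def dy3(x):
    return 3 * math.exp(x)

for x in (-1.0, 0.0, 2.0):
    print(x, wronskian(y1, dy1, y2, dy2, x), wronskian(y1, dy1, y3, dy3, x))
```

For the independent pair the determinant equals $e^{3x}$, which vanishes nowhere, while for the dependent pair it is identically zero, in line with the equivalence of (1), (2) and (3).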
3 Variation of constants
This is a method which allows us to find the system of solutions of the system
(L) (resp. (L̃)), provided we know a fundamental system of solutions of the system
(L-hom) (resp. (L̃-hom)). Again, the latter is a special case of the former, but in this
case we will present both cases explicitly.
3.1 The system (L)
Suppose we have a basis $\mathbf{y}_1, \dots, \mathbf{y}_n$ of solutions of (L-hom). We will try to find a
solution of (L) in the form
$$\mathbf{y}_0(x) = \sum_{i=1}^{n} c_i(x) \mathbf{y}_i(x)$$
(recall 6.7 of Chapter 6). We have
$$y_{ij}' = \sum_k a_{jk} y_{ik}$$
and hence
$$y_{0j}' = \sum_i c_i' y_{ij} + \sum_i c_i y_{ij}' = \sum_i c_i' y_{ij} + \sum_{i,k} c_i a_{jk} y_{ik}
= \sum_i c_i' y_{ij} + \sum_k a_{jk} \sum_i c_i y_{ik} = \sum_i c_i' y_{ij} + \sum_k a_{jk} y_{0k}$$
and hence the problem is in finding functions $c_i(x)$ such that
$$\sum_i c_i'(x) y_{ij}(x) = b_j(x).$$
This is easily done using the Cramer rule (Appendix B, 4.2). If we denote by $W_i(x)$
the Wronskian in which we replace the $i$-th column by the
$$\begin{pmatrix} b_1(x) \\ \vdots \\ b_n(x) \end{pmatrix}$$
we obtain
$$c_i'(x) = \frac{W_i(x)}{W(\mathbf{y}_1, \dots, \mathbf{y}_n)}$$
with the denominator non-zero by 2.4, and conclude that
$$c_i(x) = \int \frac{W_i(x)}{W(\mathbf{y}_1, \dots, \mathbf{y}_n)}.$$
3.2 The equation (L̃)
Consider a basis $y_1(x), \dots, y_n(x)$. Let us look for a solution in the form
$$y(x) = \sum c_i(x) y_i(x).$$
We have $y_i^{(n)}(x) + \sum_{j=0}^{n-1} a_j y_i^{(j)}(x) = 0$. Thus, if we require
$$\sum c_i'(x) y_i^{(k)}(x) = 0 \tag{*}$$
for $k = 0, \dots, n - 2$, we will have
$$\begin{aligned}
y'(x) &= \sum c_i(x) y_i'(x) \\
&\ \,\vdots \\
y^{(n-1)}(x) &= \sum c_i(x) y_i^{(n-1)}(x).
\end{aligned}$$
Let us add a further requirement
$$\sum c_i'(x) y_i^{(n-1)}(x) = b(x). \tag{**}$$
Then we have
$$y^{(n)}(x) = \sum c_i(x) y_i^{(n)}(x) + b(x)$$
and conclude that
$$y^{(n)}(x) + \sum_k a_k(x) y^{(k)}(x) = b(x).$$
The requirements (*) and (**) constitute, again, a system of algebraic linear
equations solvable using the Cramer rule (again with the non-zero Wronskian in the
denominator) to obtain $c_i'(x)$. Finally, take the primitive functions to obtain $c_i(x)$.
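The procedure can be followed step by step on a small example. The sketch below makes several assumptions for illustration only: the equation is $y'' + y = x$, the basis of the homogeneous equation is $\cos x$, $\sin x$ (with Wronskian 1), the primitives are normalized by $c_i(0) = 0$, and they are computed with a composite Simpson rule. For $n = 2$ the requirements (*) and (**) give, by the Cramer rule, $c_1' = -b\,y_2 / W$ and $c_2' = b\,y_1 / W$.

```python
import math

def simpson(g, a, b, n=400):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def b(x):
    """Right-hand side of the illustrative equation y'' + y = b(x)."""
    return x

def particular(x):
    """Solution of y'' + y = b obtained by variation of constants with c_i(0) = 0."""
    wronski = 1.0  # cos(t) * cos(t) + sin(t) * sin(t)
    c1 = simpson(lambda t: -b(t) * math.sin(t) / wronski, 0.0, x)
    c2 = simpson(lambda t: b(t) * math.cos(t) / wronski, 0.0, x)
    return c1 * math.cos(x) + c2 * math.sin(x)

print(particular(2.0), 2.0 - math.sin(2.0))
```

Carrying out the two integrals by hand gives $y(x) = x - \sin x$, which indeed satisfies $y'' + y = x$; the printed values compare the numerical result with this closed form.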
4 A linear differential equation of nth order with constant coefficients
In this and the following section we will consider linear differential equations with
constant coefficients $a_i$, resp. $a_{ij}$. In view of the previous section, it suffices to solve
the corresponding homogeneous equations. If these are solved, the general case can
be computed by variation of constants; note that the right-hand sides $b$ resp. $b_i$ do
not have to be constant.
4.1 The Characteristic Polynomial
Consider the problem of finding a function $y$ satisfying
$$y^{(n)} + a_{n-1} y^{(n-1)} + \dots + a_1 y' + a_0 y = 0 \tag{*}$$
where $a_k$ are real numbers.
We already know that it suffices to find $n$ linearly independent solutions. Let
us try
$$y(x) = e^{\lambda x}.$$
We have
$$y^{(k)}(x) = \lambda^k e^{\lambda x},$$
and hence the equation (*) will be satisfied if (and only if)
$$e^{\lambda x}(\lambda^n + a_{n-1}\lambda^{n-1} + \dots + a_1 \lambda + a_0) = 0,$$
that is, since $e^{\lambda x} \neq 0$, if and only if
$$p(\lambda) = \lambda^n + a_{n-1}\lambda^{n-1} + \dots + a_1 \lambda + a_0 = 0.$$
The polynomial $p$ is called the characteristic polynomial of the equation (*).
Thus we see that
if $\lambda$ is a root of the characteristic polynomial of (*) then $y(x) = e^{\lambda x}$ is a solution of this
equation.
4.2
If $\lambda_1, \dots, \lambda_n$ are distinct numbers then the functions $e^{\lambda_1 x}, \dots, e^{\lambda_n x}$ constitute a
linearly independent system. This is easily proved using the Wronski and Vandermonde
determinants. For our purposes this would not suffice, though. We will need
a stronger
Lemma. Let $\lambda_1, \dots, \lambda_k$ be distinct complex numbers and let $p_1(x), \dots, p_k(x)$ be
polynomials. Let
$$\sum_{j=1}^{k} p_j(x) e^{\lambda_j x}$$
be identically zero. Then all the polynomials $p_j$ are zero.
Proof. Suppose not. Then among the counterexamples, choose one such that
(a) the maximum of the degrees of the polynomials $p_j$ is the least possible, and
(b) the number of the polynomials $p_j$ with this maximum degree is the least
possible.
Here the degree of a constant non-zero polynomial is defined to be 0, and the degree
of the constant zero is defined to be $-1$. Thus, taking derivative of a non-zero
polynomial decreases the degree by one. We have identically
$$\sum_{j=1}^{k} p_j(x) e^{\lambda_j x} = 0. \tag{4.2.1}$$
Taking the derivative we obtain
$$\sum_{j=1}^{k} p_j'(x) e^{\lambda_j x} + \sum_{j=1}^{k} p_j(x) \lambda_j e^{\lambda_j x} = 0. \tag{4.2.2}$$
Let, say, $p_1$ have the maximum degree. Subtracting (4.2.1) multiplied by $\lambda_1$ from
(4.2.2), we obtain
$$p_1'(x) e^{\lambda_1 x} + \sum_{j=2}^{k} \bigl( (\lambda_j - \lambda_1) p_j(x) + p_j'(x) \bigr) e^{\lambda_j x} = 0. \tag{4.2.3}$$
Now the degree of the polynomial at $e^{\lambda_1 x}$ has decreased and none of the other
degrees has increased. Thus, the formula (4.2.3) cannot be a counterexample to the
statement and hence we have to have
$$p_1'(x) \equiv 0, \quad \text{and}$$
$$(\lambda_j - \lambda_1) p_j(x) + p_j'(x) \equiv 0 \quad \text{for } j > 1.$$
From the second equation we immediately see that all the $p_j$ with $j > 1$ are
identically zero (since $\lambda_1 \neq \lambda_j$). The first one immediately yields only that $p_1$
has to be a constant, but $C e^{\lambda_1 x}$ is zero only if $C = 0$. □
4.3 Corollary. Let $\lambda_1, \dots, \lambda_k$ be distinct complex numbers. Then the system of
functions
$$e^{\lambda_1 x}, x e^{\lambda_1 x}, \dots, x^{s_1} e^{\lambda_1 x},\ e^{\lambda_2 x}, x e^{\lambda_2 x}, \dots, x^{s_2} e^{\lambda_2 x},\ \dots,\ e^{\lambda_k x}, x e^{\lambda_k x}, \dots, x^{s_k} e^{\lambda_k x}$$
with arbitrary non-negative integers $s_j$ is linearly independent.
4.4 The simplest case
If the characteristic polynomial has $n$ distinct real roots $\lambda_1, \dots, \lambda_n$ then we have,
by 4.1 and 4.3, the fundamental system of solutions
$$e^{\lambda_1 x}, \dots, e^{\lambda_n x}.$$
The problem is, hence, what to do with the complex roots, and how to deal with a
possible multiplicity of some of the roots.
4.5 Complex roots
We are dealing with an LDE in real variables. Thus the characteristic polynomial has
real coefficients and consequently each of the roots which is not real is accompanied
by its complex conjugate as another root. That is, if $\beta_j \neq 0$ in a root
$$\lambda_j = \alpha_j + i\beta_j$$
then there is a $k \neq j$ with
$$\lambda_k = \alpha_j - i\beta_j.$$
The two complex functions $e^{\lambda_j x}, e^{\lambda_k x}$ are then in our basis replaced by
$$e^{\alpha_j x} \cos \beta_j x \quad \text{and} \quad e^{\alpha_j x} \sin \beta_j x. \tag{4.5.1}$$
Replacing $e^{ix}$ and $e^{-ix}$ by linear combinations of $\cos x$ and $\sin x$, and vice versa, in
the present context, is justified by Exercise (12) of Chapter 1. We will gain a much
better understanding of this in Chapter 10 below.
4.6 Multiple roots
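Define an operator
$$L(y) = y^{(n)} + \sum_{j=0}^{n-1} a_j y^{(j)}$$
to be applied on functions $y(x, \lambda)$ of two real variables. Thus we have
$$L(y) = \frac{\partial^n y}{\partial x^n} + \sum_{j=0}^{n-1} a_j \frac{\partial^j y}{\partial x^j}.$$
By 4.2.1 of Chapter 3, we obtain
$$\frac{\partial}{\partial \lambda} L(y) = \frac{\partial}{\partial \lambda} \frac{\partial^n y}{\partial x^n} + \dots = \frac{\partial^n}{\partial x^n} \frac{\partial y}{\partial \lambda} + \dots = L\left(\frac{\partial y}{\partial \lambda}\right)$$
and more generally
$$\frac{\partial^k}{\partial \lambda^k} L(y) = L\left(\frac{\partial^k y}{\partial \lambda^k}\right).$$
In particular for $y(x, \lambda) = e^{\lambda x}$ we have $L(y) = e^{\lambda x} p(\lambda)$ and hence
$$L(x^k e^{\lambda x}) = L\left(\frac{\partial^k y}{\partial \lambda^k}\right) = \frac{\partial^k}{\partial \lambda^k}\bigl(e^{\lambda x} p(\lambda)\bigr).$$
By induction we easily learn that
$$\frac{\partial^k}{\partial \lambda^k}\bigl(e^{\lambda x} p(\lambda)\bigr) = \sum_{j=0}^{k} \binom{k}{j} p^{(j)}(\lambda) x^{k-j} e^{\lambda x}.$$
If $\lambda$ is a $k$-multiple root of $p$ we have
$$p(\lambda) = p'(\lambda) = \dots = p^{(k-1)}(\lambda) = 0$$
and hence the equation $L(y) = 0$ is satisfied, besides $e^{\lambda x}$, also by
$$x e^{\lambda x}, x^2 e^{\lambda x}, \dots, x^{k-1} e^{\lambda x}.$$
Thus we obtain $k$ solutions, and if we apply this to all the roots we obtain $n$
solutions, independent by 4.3, and hence the fundamental system of solutions we
needed.
For a conjugate pair of complex roots $\alpha + i\beta$, $\alpha - i\beta$ we take, of course,
$$e^{\alpha x} \cos \beta x,\ x e^{\alpha x} \cos \beta x,\ \dots,\ x^{k-1} e^{\alpha x} \cos \beta x,$$
$$e^{\alpha x} \sin \beta x,\ x e^{\alpha x} \sin \beta x,\ \dots,\ x^{k-1} e^{\alpha x} \sin \beta x.$$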
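The role of the multiplicity can be observed on a small example. The sketch below is an illustration under assumptions not found in the text: it takes the equation $y''' - 3y'' + 3y' - y = 0$, whose characteristic polynomial $(\lambda - 1)^3$ has the triple root 1, and evaluates $L(x^m e^{\lambda x})$ at one sample point, with the derivatives of $x^m e^{\lambda x}$ computed exactly by the Leibniz rule. The helper names `deriv` and `apply_L` are hypothetical.

```python
import math

def deriv(m, lam, k, x):
    """k-th derivative of x**m * exp(lam * x) at x, by the Leibniz rule."""
    total = 0.0
    for j in range(min(k, m) + 1):
        # j-th derivative of x**m is m!/(m-j)! * x**(m-j)
        falling = math.factorial(m) // math.factorial(m - j)
        total += math.comb(k, j) * falling * x ** (m - j) * lam ** (k - j)
    return total * math.exp(lam * x)

def apply_L(coeffs, m, lam, x):
    """L(y) = y^(n) + sum_j a_j y^(j) for y = x^m e^{lam x}; coeffs = [a_0, ..., a_{n-1}]."""
    n = len(coeffs)
    lower = sum(a * deriv(m, lam, j, x) for j, a in enumerate(coeffs))
    return deriv(m, lam, n, x) + lower

coeffs = [-1.0, 3.0, -3.0]   # y''' - 3y'' + 3y' - y, so p(lam) = (lam - 1)^3
residuals = [apply_L(coeffs, m, 1.0, 0.7) for m in range(4)]
print(residuals)
```

The first three residuals (for $e^x$, $x e^x$, $x^2 e^x$) vanish up to rounding, whereas $x^3 e^x$ is not a solution: by the formula above, $L(x^3 e^x) = p'''(1) e^x = 6 e^x$.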
5 Systems of LDE with constant coefficients. An application of Jordan’s Theorem
5.1
Consider a system of first-order linear differential equations
$$\mathbf{y}' = A\mathbf{y}. \tag{5.1.1}$$
In fact, let us carefully consider two contexts in which (5.1.1) makes sense. The first
context is, as above, when $A$ is a constant $n \times n$ matrix over $\mathbb{R}$, and $\mathbf{y}: \mathbb{R} \to \mathbb{R}^n$ is an
unknown vector-valued function. However, it also makes sense to consider the case
when $A$ is an $n \times n$ matrix over $\mathbb{C}$, and the unknown function is $\mathbf{y}: \mathbb{R} \to \mathbb{C}^n$. This
case makes sense since we may identify $\mathbb{C} \cong \mathbb{R}^2$, and such a system of $n$ complex-valued
first order differential equations can therefore be interpreted as a system of
$2n$ real-valued first-order linear differential equations. Let us emphasize, however,
that in this discussion, the independent variable remains real.
The advantage of considering (5.1.1) over $\mathbb{C}$ is that over $\mathbb{C}$, every matrix is similar
to a matrix in Jordan canonical form. Changing basis to the basis in which the
matrix is in Jordan form gives a substitution which allows us to solve the system
of equations. Even more explicitly, this can be said as follows: consider a $k \times k$
Jordan block of the matrix $A$ with respect to an eigenvalue $\lambda$. This corresponds to $k$
vectors $u_1, \dots, u_k \in \mathbb{C}^n$ such that
$$\begin{aligned}
A u_1 &= \lambda u_1, \\
A u_j &= \lambda u_j + u_{j-1}, \quad j = 2, \dots, k.
\end{aligned} \tag{5.1.2}$$
Then this data give the following solutions of the system (5.1.1):
$$\begin{aligned}
& u_1 e^{\lambda x}, \\
& u_2 e^{\lambda x} + u_1 x e^{\lambda x}, \\
& \quad \vdots \\
& u_k e^{\lambda x} + u_{k-1} x e^{\lambda x} + \dots + u_1 \frac{x^{k-1}}{(k-1)!} e^{\lambda x}.
\end{aligned} \tag{5.1.3}$$
Taking the solutions (5.1.3) for all Jordan blocks gives a fundamental system of
solutions, which we can see by taking the determinant of their values at 0 (where
we get the base change matrix from the Jordan basis to the standard basis); recall
from Theorem 2.4 that a system of $n$ solutions whose values are independent at one
point is a fundamental system of solutions.
5.2
Let us now consider the case when the system (5.1.1) is over R. Then, the matrix A
is a real matrix. This means that for every solution y over C,
$$\mathrm{Re}(\mathbf{y}),\ \mathrm{Im}(\mathbf{y}) \tag{5.2.1}$$
are real solutions of (5.1.1). Taking all such solutions for all Jordan blocks gives a
system of real solutions which, when considered over C, generate the vector space
of all the complex solutions and hence must contain a basis of the space of real
solutions (which can be found explicitly by finding a set of columns which form a
basis of the matrix of values at 0).
5.2.1 Example
Consider the system (5.1.1) with
$$A = \begin{pmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & -1 & 0 \end{pmatrix}.$$
Then one sees right away that the characteristic polynomial is
$$\chi_A(x) = (x^2 + 1)^2,$$
and the Jordan canonical form is
$$J = \begin{pmatrix} i & 0 & 0 & 0 \\ 1 & i & 0 & 0 \\ 0 & 0 & -i & 0 \\ 0 & 0 & 1 & -i \end{pmatrix}.$$
Let us consider the Jordan block corresponding to the eigenvalue $i$. By solving
systems of linear equations, we find that
$$u_1 = (0, 0, 1, i)^T,$$
$$u_2 = (2i, -2, 0, 1)^T.$$
Note that we could equivalently take a scalar multiple of both vectors by the same
non-zero complex number. Thus, (5.1.3) produces solutions
$$(0, 0, e^{ix}, i e^{ix})^T,$$
$$(2i, -2, 0, 1)^T e^{ix} + (0, 0, 1, i)^T x e^{ix}.$$
Taking real and imaginary parts, we get four real solutions
$$(0, 0, \cos(x), -\sin(x))^T,$$
$$(0, 0, \sin(x), \cos(x))^T,$$
$$(-2\sin(x), -2\cos(x), x\cos(x), \cos(x) - x\sin(x))^T,$$
$$(2\cos(x), -2\sin(x), x\sin(x), \sin(x) + x\cos(x))^T.$$
Since the data obtained from the other Jordan block can be taken complex conjugate,
we know that these solutions span the space of all complex solutions, and hence
form a fundamental system of real solutions.
5.3 Remark
As mentioned above, a single differential equation with constant coefficients
$$y^{(n)} + a_1 y^{(n-1)} + \dots + a_n y = 0$$
can be converted to a system of first-order linear differential equations (5.1.1) where
$$A = \begin{pmatrix}
0 & 1 & 0 & \dots & 0 \\
0 & 0 & 1 & \dots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \dots & 1 \\
-a_n & -a_{n-1} & -a_{n-2} & \dots & -a_1
\end{pmatrix}.$$
We clearly have
$$\chi_A(x) = x^n + a_1 x^{n-1} + \dots + a_n. \tag{5.3.1}$$
In fact, we call $A$ the characteristic matrix of the polynomial (5.3.1). We may
ask, conversely, when a system of first-order linear differential equations with
constant coefficients can be converted by a substitution to a single $n$'th order linear
differential equation. Clearly, this is equivalent to asking which square matrices are
similar to characteristic matrices. We will solve this question in the exercises.
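The relation between the characteristic matrix and its polynomial can be tested on a concrete case. The sketch below uses hypothetical coefficients $a_1, \dots, a_4$ and compares $\det(xI - A)$ with $x^n + a_1 x^{n-1} + \dots + a_n$ at more points than the degree, in exact rational arithmetic; the helper names are illustrative only.

```python
from fractions import Fraction

def companion(a):
    """Matrix A of Remark 5.3 for y^(n) + a[0] y^(n-1) + ... + a[n-1] y = 0."""
    n = len(a)
    A = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n - 1):
        A[i][i + 1] = Fraction(1)          # ones above the diagonal
    for j in range(n):
        A[n - 1][j] = -Fraction(a[n - 1 - j])  # last row: -a_n, ..., -a_1
    return A

def det(M):
    """Determinant by Gaussian elimination over exact fractions."""
    M = [row[:] for row in M]
    n, d = len(M), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            M[r] = [x - factor * y for x, y in zip(M[r], M[c])]
    return d

def char_poly_at(A, x):
    """Value of det(x I - A)."""
    n = len(A)
    return det([[(x if i == j else 0) - A[i][j] for j in range(n)]
                for i in range(n)])

def p(a, x):
    """Value of x^n + a_1 x^(n-1) + ... + a_n."""
    n = len(a)
    return x**n + sum(Fraction(c) * x**(n - 1 - k) for k, c in enumerate(a))

a = [2, -3, 5, 7]  # hypothetical coefficients a_1, ..., a_4
A = companion(a)
checks = [char_poly_at(A, Fraction(x)) == p(a, Fraction(x)) for x in range(-3, 4)]
```

Two monic polynomials of degree 4 that agree at seven points coincide, so the list `checks` being all true confirms (5.3.1) for this example.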
6 Exercises
(1) Prove that the Wronskian $W(x)$ of any $n$ solutions of the system
$$y' = A(x) y$$
satisfies the differential equation
$$W(x)' = \operatorname{tr}(A) W(x).$$
(Here for a square matrix $A$, $\operatorname{tr}(A)$ is the sum of its diagonal terms.)
(2) The differential equation
$$y'' + \frac{y'}{x} - \frac{y}{x^2} = 0$$
has solutions
$$y = x, \qquad y = \frac{1}{x}.$$
Find all solutions of the differential equation
$$y'' + \frac{y'}{x} - \frac{y}{x^2} = e^x.$$
(3) Find a fundamental system of solutions of the equation
$$y^{(3)} - y^{(2)} + 8y' + 12y = 0.$$
(4) Find all solutions of the system of LDE’s
$$y_1' = y_1 - y_2 + x e^x, \qquad y_2' = y_1 + 3y_2 + x^2.$$
(5) Find a fundamental system of real solutions of the system of LDE’s (5.1.1)
with
0 1
1 1 0 1
B 1 1 0 1C
ADB
@ 0
C:
0 1 1A
1 0 1 1
(6) Prove that the characteristic polynomial of a characteristic matrix $A_p$ of a
polynomial $p$ with highest coefficient 1 is equal to $p$.
(7) A cyclic vector for a linear transformation $f : V \to V$ is a vector $v \in V$ such
that the vectors $v, f(v), \dots, f^N(v), \dots$ span the vector space $V$. (As usual, we
will identify an $n \times n$ matrix with the linear transformation $\mathbb{R}^n \to \mathbb{R}^n$ it defines
by matrix multiplication.) Prove that a matrix is similar to a characteristic
matrix if and only if it has a cyclic vector.
(8) Suppose an $n \times n$ matrix $A$ over $\mathbb{C}$ has only one eigenvalue $\lambda$. Prove that $A$ has
a cyclic vector if and only if $A$ is similar to a Jordan block. [Hint: If $v$ is a
cyclic vector, prove that $(\lambda I - A)^j v$ for $j \ge 0$ span $\mathbb{C}^n$.]
(9) Prove that if $f : V \to V$ is a linear transformation, $v \in V$ is a cyclic
vector and $W \subseteq V$ is a subspace such that $f(W) \subseteq W$, then the image
$v + W$ of $v$ in $V/W$ is a cyclic vector for the induced linear transformation
$f/W : V/W \to V/W$.
(10) Using the results of Exercises 8 and 9, prove that a square matrix $A$ over $\mathbb{C}$ has
a cyclic vector if and only if $A$ has exactly one Jordan block for each eigenvalue
$\lambda$. (Note: such matrices are sometimes called regular, which however may be
confusing since this notion has nothing to do with non-singularity.)
(11) Suppose you know a cyclic vector of an $n \times n$ (constant) matrix $A$. Explain
how you can use the method of Section 4 (which is simpler than the method of
Section 5) for solving the system of LDE’s
$$y' = Ay.$$
8 Line Integrals and Green’s Theorem
In this chapter, we introduce the line integral and prove Green’s Theorem which
relates a line integral over a closed curve (or curves) in R2 to the ordinary integral of
a certain quantity over the region enclosed by the curve(s). Making rigorous sense
of what this last concept means is a big part of the work. Much of the material
of this section is subsumed by the more general treatment of Stokes’ Theorem
in manifolds of arbitrary dimension in Chapter 12 below. However, there are two
important reasons to present Green’s Theorem first. The first reason is that Green’s
Theorem is much more elementary, and does not require the added abstraction, and
algebra and topology material needed for Stokes’ Theorem. The other important
reason is that Green’s Theorem can be, in fact, used directly to set up the foundations
of basic complex analysis, which we do in the next chapter, and which is rather nice
to do without having to go into Stokes’ Theorem in a general dimension.
1 Curves and line integrals
1.1
A parametrization of a (piecewise continuously differentiable) curve in $\mathbb{R}^n$ is a
continuous map
$$\boldsymbol{\phi} = (\phi_1, \dots, \phi_n)^T : \langle a, b \rangle \to \mathbb{R}^n$$
(recall our convention 1.1 of Chapter 3 of denoting vector functions by bold-faced
letters) such that there exists a partition
$$a = a_0 < a_1 < \dots < a_r = b$$
of the interval $\langle a, b \rangle$ for which we have:
(1) On each of the intervals $\langle a_{i-1}, a_i \rangle$, each of the functions $\phi_j$ has a continuous
derivative (use one-sided derivatives at the endpoints).
(2) For every $i$ there exists a $j$ such that $\phi_j'(t)$ is positive or negative on all of
$\langle a_{i-1}, a_i \rangle$ (again, take one-sided derivatives at the endpoints).
We will sometimes also speak of a parametrized piecewise continuously
differentiable curve.
Comment: Instead of condition (2), one may simply require that $\boldsymbol{\phi}'(t) \neq 0$ on
$\langle a_{i-1}, a_i \rangle$, as the interval can then be subdivided into finitely many intervals on each
of which condition (2) holds.
1.2
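We say that two parametrizations
$$\boldsymbol{\phi} : \langle a, b \rangle \to \mathbb{R}^n, \qquad \boldsymbol{\psi} : \langle c, d \rangle \to \mathbb{R}^n$$
are weakly equivalent, and write
$$\boldsymbol{\phi} \approx \boldsymbol{\psi},$$
if there exists a homeomorphism $\alpha : \langle a, b \rangle \to \langle c, d \rangle$ such that
$$\boldsymbol{\psi} \circ \alpha = \boldsymbol{\phi}.$$
Note that the relation $\approx$ really is an equivalence relation, i.e. that it is reflexive,
symmetrical and transitive (see Exercise (2)). Equivalence classes with respect to
$\approx$ will be called piecewise continuously differentiable curves.
1.3 Remark
Clearly, $\boldsymbol{\phi} \approx \boldsymbol{\psi}$ implies $\boldsymbol{\phi}[\langle a, b \rangle] = \boldsymbol{\psi}[\langle c, d \rangle]$. On the other hand, if $\boldsymbol{\phi}$ and $\boldsymbol{\psi}$ are
one-to-one and we have $\boldsymbol{\phi}[\langle a, b \rangle] = \boldsymbol{\psi}[\langle c, d \rangle]$, then we have $\boldsymbol{\phi} \approx \boldsymbol{\psi}$. In effect,
consider the maps
$$\overline{\boldsymbol{\phi}} : \langle a, b \rangle \to \boldsymbol{\phi}[\langle a, b \rangle], \qquad \overline{\boldsymbol{\psi}} : \langle c, d \rangle \to \boldsymbol{\phi}[\langle a, b \rangle]$$
defined by the same formulas as $\boldsymbol{\phi}$, $\boldsymbol{\psi}$. Since the relevant spaces are compact, $\overline{\boldsymbol{\phi}}$, $\overline{\boldsymbol{\psi}}$
are homeomorphisms (see 6.2.2 of Chapter 2). Put $\alpha = \overline{\boldsymbol{\psi}}^{-1} \circ \overline{\boldsymbol{\phi}}$.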
1.4 Proposition. The map $\alpha$ from the definition of weak equivalence is
piecewise continuously differentiable.
Proof. Let $a_0, \dots, a_r$, $c_0, \dots, c_s$ be the partitions figuring in Definition 1.1 for the
parametrizations $\boldsymbol{\phi}$, $\boldsymbol{\psi}$. Let $b_0, \dots, b_k$ be a common refinement of $a_0, \dots, a_r$ and
$\alpha^{-1}(c_0), \dots, \alpha^{-1}(c_s)$. On the interval $\langle \alpha(b_{i-1}), \alpha(b_i) \rangle$, choose $j$ such that $\psi_j$ is a
one-to-one continuously differentiable map with non-zero derivative. Then $\psi_j^{-1}$ has
a derivative and we have $\alpha(t) = \psi_j^{-1}(\phi_j(t))$ on $\langle b_{i-1}, b_i \rangle$. $\square$
1.5
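We say that two parametrizations
$$\boldsymbol{\phi} : \langle a, b \rangle \to \mathbb{R}^n, \qquad \boldsymbol{\psi} : \langle c, d \rangle \to \mathbb{R}^n$$
are equivalent, and write
$$\boldsymbol{\phi} \sim \boldsymbol{\psi},$$
if there exists an increasing homeomorphism $\alpha$ such that $\boldsymbol{\psi} \circ \alpha = \boldsymbol{\phi}$.
The equivalence classes of $\sim$ are called oriented piecewise continuously
differentiable curves.
Proposition. Suppose a parametrization $\boldsymbol{\phi}$ of a piecewise continuously differentiable
curve is one-to-one with the possible exception $\boldsymbol{\phi}(a) = \boldsymbol{\phi}(b)$. Then the
$\approx$-equivalence class of $\boldsymbol{\phi}$ contains precisely two $\sim$-equivalence classes.
Proof. Since $\boldsymbol{\phi} \sim \boldsymbol{\psi} \Rightarrow \boldsymbol{\phi} \approx \boldsymbol{\psi}$, a $\approx$-equivalence class is a union of $\sim$-equivalence
classes. Define $\iota : \langle a, b \rangle \to \langle a, b \rangle$ by
$$\iota(t) = -t + b + a.$$
Then we have $\boldsymbol{\phi} \circ \iota \approx \boldsymbol{\phi}$ but (by injectivity), not $\boldsymbol{\phi} \circ \iota \sim \boldsymbol{\phi}$. Therefore, there are at
least two $\sim$-equivalence classes in each $\approx$-equivalence class. When $\boldsymbol{\phi} \approx \boldsymbol{\psi} \approx \boldsymbol{\xi}$
but the homeomorphisms $\alpha$, $\beta$ from Definition 1.2 for the pairs $\boldsymbol{\phi}, \boldsymbol{\psi}$ and $\boldsymbol{\psi}, \boldsymbol{\xi}$ are
not increasing, then $\beta\alpha$ is increasing, and hence $\boldsymbol{\phi} \sim \boldsymbol{\xi}$. $\square$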
1.6 A Remark and a Convention
The geometric idea of a curve is modelled well by the concepts of 1.2 (see 1.3).
A parametrization can be interpreted as additional information about “time” at
which we are at a particular point when travelling along the curve. In an oriented
curve, we do not care about the precise time at which we are at a particular point,
but we do want to keep track of the direction of travel.
We will often speak freely of an (oriented or unoriented) curve
$$\boldsymbol{\phi} : \langle a, b \rangle \to \mathbb{R}^n.$$
Of course, what we really mean is the corresponding equivalence class of
parametrizations.
1.7
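Let $K$, $L$ be oriented curves with parametrizations $\boldsymbol{\phi} : \langle a, b \rangle \to \mathbb{R}^n$, $\boldsymbol{\psi} : \langle a_1, b_1 \rangle \to \mathbb{R}^n$
such that $\boldsymbol{\phi}(b) = \boldsymbol{\psi}(a_1)$. Without loss of generality, we may assume $a_1 = b$
(otherwise, we may replace $\boldsymbol{\psi}$, say, by the parametrization $\boldsymbol{\psi} \circ \alpha$ where $\alpha(t) = a_1 + (t - b)$).
Let us then write $c = b_1$, and define $\boldsymbol{\phi} * \boldsymbol{\psi} : \langle a, c \rangle \to \mathbb{R}^n$ by
$$(\boldsymbol{\phi} * \boldsymbol{\psi})(t) = \begin{cases} \boldsymbol{\phi}(t) & \text{for } t \in \langle a, b \rangle, \\ \boldsymbol{\psi}(t) & \text{for } t \in \langle b, c \rangle. \end{cases}$$
Then $\boldsymbol{\phi} * \boldsymbol{\psi}$ is a parametrization of a new oriented curve which we will denote by
$$K + L.$$
This oriented curve is well-defined in terms of $K$, $L$; moreover, this operation
is associative (see Exercises (3), (4), (5) below).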
1.8
Recall the map $\iota$ of 1.5. If $L$ is an oriented curve with parametrization
$\boldsymbol{\phi} : \langle a, b \rangle \to \mathbb{R}^n$, define
$$-L$$
as the oriented curve determined by $\boldsymbol{\phi} \circ \iota$. Thus, $-L$ is “the other oriented curve”
which represents the same (unoriented) curve, and accordingly, we shall refer to $-L$
as the oriented curve with opposite orientation. Again, $-L$ does not depend on the
particular parametrization of the oriented curve $L$.
1.9 Some terminology
A curve with a one-to-one parametrization is sometimes called a simple arc; a curve
with a representation $\boldsymbol{\phi}$ such that $\boldsymbol{\phi}(a) = \boldsymbol{\phi}(b)$ but $\boldsymbol{\phi}(x) \neq \boldsymbol{\phi}(y)$ unless $x = y$ or
$\{x, y\} = \{a, b\}$ is called a simple closed curve. The word ‘closed’ in this context
has, of course, a different meaning than a closed subset of a topological space.
2 Line integrals of the first kind (= according to length)

2.1
Recall our definition of Euclidean norm $\|\mathbf{u}\| = \sqrt{\mathbf{u} \cdot \mathbf{u}}$ where $\cdot$ is the dot product.
Then $\|\mathbf{u} - \mathbf{v}\|$ is the ordinary Euclidean distance. While this distance function was
unimportant (even awkward) in topological considerations, and could be replaced
by any equivalent metric, in the present section it will play a crucial role.
2.2
Recall the definition of the Riemann integral and view it informally as a kind of
summation of the function f over the length of an interval. Since an interval is
a very special case of a piecewise continuously differentiable curve (where the
parametrization is the identity), we may wonder if the Riemann integral could
be generalized to a situation where the domain is (the image of) a piecewise
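continuously differentiable curve. This intuition indeed works. By a partition of
a parametrized piecewise continuously differentiable curve $\boldsymbol{\phi} : \langle a, b \rangle \to \mathbb{R}^n$ we
will mean a sequence of points
$$\boldsymbol{\phi}(t_0), \boldsymbol{\phi}(t_1), \dots, \boldsymbol{\phi}(t_k) \tag{*}$$
where $t_0 < t_1 < \dots < t_k$ is a partition of the interval $\langle a, b \rangle$. The mesh of a partition
is the maximum of the numbers $\|\boldsymbol{\phi}(t_{i-1}) - \boldsymbol{\phi}(t_i)\|$.
Note that since $\langle a, b \rangle$ is a compact space, $\boldsymbol{\phi}$ is a uniformly continuous map and
hence if the mesh of a sequence of partitions goes to 0, so does the mesh of their
$\boldsymbol{\phi}$-images.
Now consider a continuous real function $f$ defined (at least) on $\boldsymbol{\phi}[\langle a, b \rangle]$. In
analogy with the Riemann integral (recall, in particular, Theorem 8.3 of Chapter 1),
let us investigate sums of the form
$$\sum_{i=1}^{k} f(\boldsymbol{\phi}(t_i)) \|\boldsymbol{\phi}(t_i) - \boldsymbol{\phi}(t_{i-1})\|$$
and let us see if they converge to a particular value when the mesh goes to 0. By the
Mean Value Theorem, there are points $\theta_{ij}$ between $t_{i-1}$ and $t_i$ such that
$$\begin{aligned}
\sum_i f(\boldsymbol{\phi}(t_i)) \sqrt{\sum_{j=1}^{n} (\phi_j(t_i) - \phi_j(t_{i-1}))^2}
&= \sum_i f(\boldsymbol{\phi}(t_i)) \sqrt{\sum_j \phi_j'(\theta_{ij})^2 (t_i - t_{i-1})^2} \\
&= \sum_i f(\boldsymbol{\phi}(t_i)) \sqrt{\sum_j \phi_j'(\theta_{ij})^2} \, (t_i - t_{i-1})
\end{aligned}$$
which, by Theorem 8.3 of Chapter 1, when the mesh goes to 0, converges to
$$\int_a^b f(\boldsymbol{\phi}(t)) \|\boldsymbol{\phi}'(t)\| \, dt.$$
When $\boldsymbol{\phi} : \langle a, b \rangle \to \mathbb{R}^n$ is a parametrization of a piecewise continuously
differentiable curve $L$, we call the number
$$\int_a^b f(\boldsymbol{\phi}(t)) \|\boldsymbol{\phi}'(t)\| \, dt \tag{**}$$
the line integral of the first kind (or integral according to length) of the function $f$
over the curve $L$, and denote it by
$$\int_L f \quad \text{or} \quad \int_L f(\mathbf{x}) \|d\mathbf{x}\|.$$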
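The convergence of the sums over partitions to the integral (**) can be observed numerically. The following is a minimal sketch in which the curve (the upper unit semicircle) and the integrand $f(x, y) = y$ are hypothetical choices for illustration; the exact value of the integral in this case is 2.

```python
import math

def phi(t):
    """Hypothetical curve: upper unit semicircle, t in <0, pi>."""
    return (math.cos(t), math.sin(t))

def dphi(t):
    """Derivative of phi."""
    return (-math.sin(t), math.cos(t))

def f(p):
    """Hypothetical integrand f(x, y) = y."""
    return p[1]

def chord_sum(f, phi, a, b, k):
    """Sum of f(phi(t_i)) * ||phi(t_i) - phi(t_{i-1})|| over a uniform partition."""
    ts = [a + (b - a) * i / k for i in range(k + 1)]
    return sum(f(phi(ts[i])) * math.dist(phi(ts[i]), phi(ts[i - 1]))
               for i in range(1, k + 1))

def param_integral(f, phi, dphi, a, b, k):
    """Midpoint rule for the integral of f(phi(t)) * ||phi'(t)|| dt."""
    h = (b - a) / k
    return sum(f(phi(a + (i + 0.5) * h)) * math.hypot(*dphi(a + (i + 0.5) * h)) * h
               for i in range(k))

s = chord_sum(f, phi, 0.0, math.pi, 2000)
I = param_integral(f, phi, dphi, 0.0, math.pi, 2000)
```

Both `s` (the sum over a partition of the curve) and `I` (the parametrized integral) should be close to 2.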
2.2.1 Comment
The formula (**) makes sense, of course, for any integrable function f , in which
case the integral (**) exists as a Lebesgue integral. A similar comment will apply
to all the types of curve integrals we shall introduce. It is useful to note, however,
that in the context of the present chapter, we are not interested in such level of
generality, and are happy to assume that the function f is continuous in which case
the Lebesgue integral is the same as the Riemann integral. Nevertheless, even with
that in mind, the Lebesgue integral techniques we developed in Chapter 5 are still
needed for example in arguments such as differentiating behind the integral sign in
Proposition 3.7 or the use of multivariable substitution in Section 5 below.
2.3 Proposition. The expression in the definition of the line integral of the first kind
is independent of parametrization.
Proof. Let $\boldsymbol{\phi}$ and $\boldsymbol{\psi}$ be as in 1.1, let $\boldsymbol{\phi} = \boldsymbol{\psi} \circ \alpha$. By 1.4, $\alpha$ is piecewise continuously
differentiable (except at finitely many points where there are, at most, discontinuities
of the first kind, i.e. such that the corresponding one-sided limits exist), and hence
we have
$$\begin{aligned}
\|\boldsymbol{\phi}'(t)\| = \sqrt{\sum_j \phi_j'(t)^2} &= \sqrt{\sum_j \psi_j'(\alpha(t))^2 (\alpha'(t))^2} \\
&= \sqrt{\sum_j \psi_j'(\alpha(t))^2} \, |\alpha'(t)| = \|\boldsymbol{\psi}'(\alpha(t))\| \, |\alpha'(t)|
\end{aligned}$$
and hence by the Substitution Theorem (for the Riemann integral in one variable),
we have
$$\int_a^b f(\boldsymbol{\phi}(t)) \|\boldsymbol{\phi}'(t)\| \, dt = \int_a^b f(\boldsymbol{\psi}(\alpha(t))) \|\boldsymbol{\psi}'(\alpha(t))\| \, |\alpha'(t)| \, dt = \int_c^d f(\boldsymbol{\psi}(\tau)) \|\boldsymbol{\psi}'(\tau)\| \, d\tau. \qquad \square$$
(The attentive reader will recall from the theory of the single variable Riemann
integral substitution that if $\alpha$ is decreasing, the absolute value in $|\alpha'(t)|$ is
nevertheless correct because of an interchange of bounds.)
2.4 Remark
The length of a curve L is defined as the integral of the first kind of the function 1
over L, i.e.
$$\int_L 1 = \int_a^b \|\boldsymbol{\phi}'\|.$$
3 Line integrals of the second kind
3.1
Let $\boldsymbol{\phi} : \langle a, b \rangle \to \mathbb{R}^n$ be a parametrization of a piecewise continuously differentiable
oriented curve $L$, and let $\mathbf{f} = (f_1, \dots, f_n)^T$ be a vector function defined (at least)
on $\boldsymbol{\phi}[\langle a, b \rangle]$. The line integral of the second kind of the function $\mathbf{f}$ over the oriented
curve $L$ is the number
$$\int_L \mathbf{f} = \int_a^b \mathbf{f}(\boldsymbol{\phi}(t)) \cdot \boldsymbol{\phi}'(t) \, dt = \sum_{j=1}^{n} \int_a^b f_j(\boldsymbol{\phi}(t)) \phi_j'(t) \, dt$$
(note, in the middle expression, the dot product of vectors). When there is a danger
of confusion, we will denote line integrals of the first and second kind explicitly by
$$(I)\int_L, \qquad (II)\int_L.$$
In the literature, the line integral of the second kind is also often denoted by
$$\int_L (f_1 \, dx_1 + \dots + f_n \, dx_n).$$
This notation, in fact, conforms to the notation of differential forms, which we will
see later in Chapter 12. When $\mathbf{x} = (x_1, \dots, x_n)^T$, we will also use the notation
$$\int_L \mathbf{f}(\mathbf{x}) \cdot d\mathbf{x}.$$
3.2
The “physical” meaning of the line integral of the second kind: We travel along
the curve $L$ from the beginning point to the end point. $\int_L \mathbf{F}$ is then the work done
when we exert the force $\mathbf{F}$ at each given point of the curve.
3.3 Proposition. The expression in the definition of the line integral of the second
kind does not depend on the choice of parametrization of an oriented piecewise
continuously differentiable curve.
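Proof. Let $\boldsymbol{\phi} = \boldsymbol{\psi} \circ \alpha$. Now, of course, $\alpha'(t) > 0$ (with the possible exception of
finitely many points, where $\alpha'$ has, at most, discontinuities of the first kind). We have
$$\sum_{j=1}^{n} \int_a^b f_j(\boldsymbol{\phi}(t)) \phi_j'(t) \, dt = \sum_{j=1}^{n} \int_a^b f_j(\boldsymbol{\psi}(\alpha(t))) \psi_j'(\alpha(t)) \alpha'(t) \, dt = \sum_{j=1}^{n} \int_c^d f_j(\boldsymbol{\psi}(\tau)) \psi_j'(\tau) \, d\tau. \qquad \square$$
Observation. $\displaystyle \int_{-L} \mathbf{f} = -\int_L \mathbf{f}$.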
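Both the independence of the oriented parametrization and the sign change under reversal of orientation can be illustrated numerically. The sketch below rests on hypothetical choices: the vector field $\mathbf{f}(x, y) = (-y, x)$, the quarter circle from $(1,0)$ to $(0,1)$, and the increasing reparametrization $\alpha(s) = \frac{\pi}{4}(s + s^2)$ of $\langle 0, 1 \rangle$ onto $\langle 0, \pi/2 \rangle$.

```python
import math

def second_kind(f, phi, dphi, a, b, k=4000):
    """Midpoint rule for the integral over <a, b> of f(phi(t)) . phi'(t) dt."""
    h = (b - a) / k
    total = 0.0
    for i in range(k):
        t = a + (i + 0.5) * h
        fx, fy = f(*phi(t))
        dx, dy = dphi(t)
        total += (fx * dx + fy * dy) * h
    return total

def f(x, y):
    """Hypothetical vector field."""
    return (-y, x)

def psi(t):
    """Quarter circle, t in <0, pi/2>."""
    return (math.cos(t), math.sin(t))

def dpsi(t):
    return (-math.sin(t), math.cos(t))

def alpha(s):
    """Increasing homeomorphism of <0, 1> onto <0, pi/2> (an assumption)."""
    return (math.pi / 4) * (s + s * s)

def dalpha(s):
    return (math.pi / 4) * (1 + 2 * s)

def phi(s):
    """Equivalent parametrization phi = psi o alpha."""
    return psi(alpha(s))

def dphi(s):
    dx, dy = dpsi(alpha(s))
    return (dx * dalpha(s), dy * dalpha(s))

def rev(t):
    """Opposite orientation: psi o iota with iota(t) = -t + pi/2."""
    return psi(math.pi / 2 - t)

def drev(t):
    dx, dy = dpsi(math.pi / 2 - t)
    return (-dx, -dy)

I_psi = second_kind(f, psi, dpsi, 0.0, math.pi / 2)
I_phi = second_kind(f, phi, dphi, 0.0, 1.0)
I_rev = second_kind(f, rev, drev, 0.0, math.pi / 2)
```

Here `I_psi` and `I_phi` should agree (both approximate $\pi/2$), while `I_rev` should be their negative.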
3.4
We immediately see the following
Proposition. Let $K$, $L$ be oriented piecewise continuously differentiable curves
such that $K + L$ is defined. Then
$$\int_{K+L} \mathbf{f} = \int_K \mathbf{f} + \int_L \mathbf{f}.$$
3.5
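Now let $f$ be a (scalar) function defined on $\boldsymbol{\phi}[\langle a, b \rangle]$ where $\boldsymbol{\phi}$ is a parametrization
of a piecewise continuously differentiable oriented curve. On the same set, define
$$\mathbf{f}(\boldsymbol{\phi}(t)) = f(\boldsymbol{\phi}(t)) \frac{\boldsymbol{\phi}'(t)}{\|\boldsymbol{\phi}'(t)\|}.$$
From the definitions, one has immediately
$$(I)\int_L f = (II)\int_L \mathbf{f}.$$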
Thus, the line integral of the first kind can be reduced to the line integral of the
second kind.
3.6 Remarks
1. The traditional terms “of the first kind” and “of the second kind” therefore should
not be interpreted as expressing the order of importance. The line integral of the
second kind is in fact more fundamental, and the integral of the first kind can be
reduced to it. Perhaps the reason for the terminology is that the line integral of
the first kind is the more naive notion.
2. The function $f$ or $\mathbf{f}$ often is defined on an open set containing $\boldsymbol{\phi}[\langle a, b \rangle]$. This
will play a crucial role in the proof of Green’s Theorem.
3.7
Since continuous functions on a compact set are bounded, we obtain immediately

from Theorem 5.2 of Chapter 5 the following
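3.8 Proposition. Let $\mathbf{f}(\alpha, \mathbf{x})$ be a continuous vector function defined in an open set
$U$ of $\mathbb{R}^n$ such that $\dfrac{\partial f_j(\alpha, \mathbf{x})}{\partial \alpha}$ is continuous on $U$ for each $j$. Then the line integral
of the second kind satisfies
$$\frac{d}{d\alpha} \int_L \mathbf{f}(\alpha, \mathbf{x}) \cdot d\mathbf{x} = \int_L \frac{\partial \mathbf{f}(\alpha, \mathbf{x})}{\partial \alpha} \cdot d\mathbf{x}.$$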
4 The complex line integral
4.1
For a complex function of one real variable, $f(t) = f_1(t) + i f_2(t)$ where $f_1$, $f_2$ are
real functions, one introduces the Riemann integral by the formula
$$\int_a^b f(t) \, dt = \int_a^b f_1(t) \, dt + i \int_a^b f_2(t) \, dt.$$
4.2
Recall that on the field of complex numbers $\mathbb{C}$, we use the distance function
$d(x, y) = |x - y|$, which is the same as the Euclidean distance when we identify
$\mathbb{C}$ with $\mathbb{R}^2$ by $x + iy \mapsto (x, y)$. We will use this identification freely to define
piecewise continuously differentiable functions in $\mathbb{C}$, etc., but now note that the values $\phi(t)$
are elements of the field $\mathbb{C}$ and hence can be subjected to the multiplication in $\mathbb{C}$
which is different from the dot multiplication in $\mathbb{R}^2$ (for example in that the result is
again an element of $\mathbb{C}$ rather than $\mathbb{R}$). This distinction, in fact, is the main point of
the present section. Because of this, when working with complex-valued functions,
we will not use bold-faced letters as we did in the case of vector functions.
4.3
Let $\phi : \langle a, b \rangle \to \mathbb{C}$ be a parametrization of an oriented piecewise continuously
differentiable curve $L$ and let $f$ be a (continuous) complex function of one complex
variable defined on some set containing $\phi[\langle a, b \rangle]$. The complex line integral
$$\int_L f(z) \, dz$$
is introduced by the formula
$$\int_a^b f(\phi(t)) \phi'(t) \, dt \tag{*}$$
(independence of the (oriented) parametrization will be discussed below in 4.4). Again,
note with caution that while the formula (*) is similar to the definition of the
line integral of the second kind, it is different and “more mysterious” in that it
involves complex multiplication. For example, there is no simple interpretation of
the complex line integral similar to the interpretations given in 2.2 or 3.2.
4.4
It is, however, again possible to express the complex line integral in terms of line
integrals of the second kind.
Theorem. Let $f$ be a complex function of one complex variable. Let
$$f(z) = f_1(z) + i f_2(z)$$
where $f_1$, $f_2$ are real functions of one complex variable. Then the complex line
integral satisfies
$$\int_L f(z) \, dz = (II)\int_L (f_1, -f_2)^T + i \, (II)\int_L (f_2, f_1)^T.$$
Proof. We have
$$\begin{aligned}
\int_a^b f(\phi(t)) \phi'(t) \, dt &= \int_a^b (f_1(\phi(t)) + i f_2(\phi(t)))(\phi_1'(t) + i \phi_2'(t)) \, dt \\
&= \int_a^b (f_1(\phi(t)) \phi_1'(t) - f_2(\phi(t)) \phi_2'(t)) \, dt + i \int_a^b (f_2(\phi(t)) \phi_1'(t) + f_1(\phi(t)) \phi_2'(t)) \, dt \\
&= \int_L (f_1, -f_2)^T + i \int_L (f_2, f_1)^T. \qquad \square
\end{aligned}$$
Remark: This theorem also implies that the complex line integral does not
depend on the parametrization of an oriented piecewise continuously differentiable
curve. (Of course, reversal of orientation results in a reversal of sign.)
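The decomposition of Theorem 4.4 can be checked numerically on an example. The sketch below is an illustration under hypothetical choices: $f(z) = \bar{z}$ on the unit circle traversed once counter-clockwise, for which the complex line integral equals $2\pi i$.

```python
import cmath
import math

def complex_line_integral(f, gamma, dgamma, a, b, k=4000):
    """Midpoint rule for the integral of f(gamma(t)) * gamma'(t) dt (complex product)."""
    h = (b - a) / k
    return sum(f(gamma(a + (i + 0.5) * h)) * dgamma(a + (i + 0.5) * h) * h
               for i in range(k))

def second_kind(F, gamma, dgamma, a, b, k=4000):
    """Midpoint rule for the second-kind integral of the real vector field F."""
    h = (b - a) / k
    total = 0.0
    for i in range(k):
        t = a + (i + 0.5) * h
        z, dz = gamma(t), dgamma(t)
        F1, F2 = F(z.real, z.imag)
        total += (F1 * dz.real + F2 * dz.imag) * h
    return total

def f(z):
    """Hypothetical integrand: complex conjugation."""
    return z.conjugate()

def gamma(t):
    """Unit circle, t in <0, 2 pi>."""
    return cmath.exp(1j * t)

def dgamma(t):
    return 1j * cmath.exp(1j * t)

def f1(x, y):
    return f(complex(x, y)).real

def f2(x, y):
    return f(complex(x, y)).imag

a, b = 0.0, 2 * math.pi
lhs = complex_line_integral(f, gamma, dgamma, a, b)
rhs = complex(second_kind(lambda x, y: (f1(x, y), -f2(x, y)), gamma, dgamma, a, b),
              second_kind(lambda x, y: (f2(x, y), f1(x, y)), gamma, dgamma, a, b))
```

The two values `lhs` and `rhs` should coincide up to rounding.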
4.5
The estimate in the following statement is not particularly tight. However, it will
prove useful in Chapter 10 below.
Lemma. Let $L$ be a piecewise continuously differentiable curve in $\mathbb{C}$ of length $d$
(recall 2.4), and assume a complex function $f$ on $\phi[\langle a, b \rangle]$ satisfies $|f(z)| \le A$.
Then we have
$$\left| \int_L f(z) \, dz \right| \le 4Ad.$$
Proof. We have
$$\begin{aligned}
\left| \int_a^b f(\phi(t)) \phi'(t) \, dt \right| &= \left| \int_a^b f_1 \phi_1' - \int_a^b f_2 \phi_2' + i \int_a^b f_2 \phi_1' + i \int_a^b f_1 \phi_2' \right| \\
&\le \left| \int_a^b f_1 \phi_1' \right| + \left| \int_a^b f_2 \phi_2' \right| + \left| \int_a^b f_2 \phi_1' \right| + \left| \int_a^b f_1 \phi_2' \right| \\
&\le \int_a^b |f_1| \, |\phi_1'| + \int_a^b |f_2| \, |\phi_2'| + \int_a^b |f_2| \, |\phi_1'| + \int_a^b |f_1| \, |\phi_2'| \\
&\le 4 \int_a^b A |\phi'| = 4A \int_a^b |\phi'| = 4Ad. \qquad \square
\end{aligned}$$
5 Green’s Theorem
5.1 Smooth partition of unity: a “baby version”
Let $Z \subseteq \mathbb{R}^n$ be a compact set, and let $S$ be a set of open subsets of $\mathbb{R}^n$ whose union
contains $Z$. A smooth partition of unity subordinate to $S$ is a set of finitely many
smooth functions $\lambda_i : \mathbb{R}^n \to \mathbb{R}$, $i = 1, \dots, k$ such that the image of each $\lambda_i$ is
contained in $\langle 0, 1 \rangle$, the support of each $\lambda_i$ is compact and contained in one of the
sets from $S$, and $\lambda = \sum_{i=1}^{k} \lambda_i$ has the property that $\lambda(x) = 1$ for every $x \in Z$.
Lemma. Let $Z \subseteq \mathbb{R}^n$ be a compact set. For every set $S$ of open subsets of $\mathbb{R}^n$ whose union contains $Z$,
there exists a smooth partition of unity subordinate to $S$.
Proof. First of all, $Z$ is bounded by 6.5 of Chapter 2, and hence contained in a
bounded closed interval $K = \langle A_1, B_1 \rangle \times \dots \times \langle A_n, B_n \rangle$. Consider the set $T$ consisting
of all bounded open intervals whose closures are either contained in one of the sets
from $S$, or are disjoint with $Z$. By 5.5 of Chapter 2, $K$ is contained in the union
of the elements of a finite subset $F$ of $T$. Now consider the function $\rho(x)$ equal to
$e^{-1/x}$ for $x > 0$, and equal to 0 for $x \le 0$ (see Exercise (13) of Chapter 1). For an
interval $J = (a_1, b_1) \times \dots \times (a_n, b_n)$, let
$$\rho_J(x) = \prod_{i=1}^{n} \rho(x_i - a_i) \rho(b_i - x_i).$$
Consider further the functions $\rho_{i,A}(x) = \rho(x_i - B_i)$, $\rho_{i,B}(x) = \rho(A_i - x_i)$, and let
$\sigma$ be the sum of all these functions and of the functions $\rho_J$, $J \in F$. Then the functions
$\lambda_J = \rho_J / \sigma$, taken over those $J \in F$ whose closure is contained in one of the sets
from $S$, form a smooth partition of unity: on $Z$, all the remaining summands of $\sigma$ vanish. $\square$
5 Green’s Theorem 205
5.2
By a domain we shall mean an open subset $U$ of $\mathbb{R}^2$ (or of $\mathbb{C}$) which has
compact closure (which, by 6.5 of Chapter 2, is equivalent to being bounded). This condition
may not be very strong, but we will see that it will play an absolutely crucial
role in the proof of Green’s Theorem. Let $L_1, \dots, L_k$ be oriented piecewise
continuously differentiable simple closed curves in $\mathbb{R}^2$ with disjoint images and
with parametrizations $c_1, \dots, c_k$. We will say that $L_1 \amalg \dots \amalg L_k$ is the boundary of
a domain $U$ oriented counter-clockwise if the images of $L_i$ are contained in $\overline{U}$ and
for every $x \in \overline{U} \smallsetminus U$ there exists an open neighborhood $V_x$ of $x$ and an injective
regular map with bounded partial derivatives
$$\phi_x : V_x \to \Omega(o, 1) = \{x \in \mathbb{R}^2 \mid \|x\| < 1\}$$
with
$$\det(D\phi_x) > 0,$$
a number $\alpha \in (0, 2\pi)$ and numbers $a > 0$, $b \in \mathbb{R}$ and $i \in \{1, \dots, k\}$ such that
(1) $c_i(b) = x$;
(2) $\phi_x[U \cap V_x] = \{(r \cos\theta, r \sin\theta) \mid 0 < r < 1, \ 0 < \theta < \alpha\}$;
(3) $\left( \bigcup_{j=1}^{k} \operatorname{Im}(c_j) \right) \cap V_x = c_i[(b - a, b + a)]$;
(4) For $s \in \langle -1, 0 \rangle$, we have $\phi_x c_i(as + b) = (-s \cos(\alpha), -s \sin(\alpha))$ and for $s \in \langle 0, 1 \rangle$,
we have $\phi_x c_i(as + b) = (s, 0)$.
5.3 Comment
Informally, the above definition says simply that the boundary of U is a union of the
images of the Li ’s and that at a neighborhood of every point of the boundary, locally
U looks like a wedge of an open disk (the wedge may also be a half-disk) whose
boundary is parametrized linearly by one of the curves ci in the same direction as
the increasing parametrization of $(-1, 1)$ is with respect to the upper half-disk
$$\{(x, y) \in \Omega(o, 1) \mid y \ge 0\}.$$
Note, however, the great generality this allows, for example a disk D with several
open disks removed whose disjoint closures are in the interior of D, or similarly
with polygons, etc. The beauty of the upcoming proof is that it uses no intuitive
properties of such situations except the formal properties given in the definition;
for example, we do not use any intuitive notion of “interior” or “exterior” of the
curves Li , and although the expression “counter-clockwise” matches the intuition,
Definition 5.2 is not based on intuition. Another way of putting this is to note that
our definition of boundary is purely local in the sense that it is completely described
by requirements on neighborhoods of individual points of C.
5.4
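Let
$$\int_{L_1 \amalg \dots \amalg L_k} \mathbf{f} = \int_{L_1} \mathbf{f} + \dots + \int_{L_k} \mathbf{f}.$$
Theorem. (Green’s Theorem) Let $U$ be a domain in $\mathbb{R}^2$ and let $L_1, \dots, L_k$ be
oriented piecewise continuously differentiable simple closed curves with disjoint
images such that $L_1 \amalg \dots \amalg L_k$ is the boundary of $U$ oriented counter-clockwise.
Let $M = \overline{U}$ and let $\mathbf{f} : V \to \mathbb{R}^2$ be a function with continuous (first) partial
derivatives for some $V \supseteq M$ open. Then we have
$$\int_{L_1 \amalg \dots \amalg L_k} \mathbf{f} = \int_M \left( \frac{\partial f_2}{\partial x_1} - \frac{\partial f_1}{\partial x_2} \right). \tag{5.4.1}$$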
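Proof. First, we note that the theorem is valid for $U = (0, K) \times (0, K)$, $K > 0$,
$k = 1$ and
$$c_1 : \langle 0, 4 \rangle \to \mathbb{R}^2$$
defined by
$$c_1(t) = \begin{cases}
(Kt, 0) & \text{for } 0 \le t \le 1, \\
(K, K(t - 1)) & \text{for } 1 \le t \le 2, \\
(K(3 - t), K) & \text{for } 2 \le t \le 3, \\
(0, K(4 - t)) & \text{for } 3 \le t \le 4.
\end{cases}$$
In this case, applying Fubini’s Theorem and the Fundamental Theorem of Calculus
in one variable, we get
$$\int_M \frac{\partial f_2}{\partial x_1} = \int_0^K \int_0^K \frac{\partial f_2(x_1, x_2)}{\partial x_1} \, dx_1 \, dx_2 = \int_0^K (f_2(K, x_2) - f_2(0, x_2)) \, dx_2 = \int_{L_1} \mathbf{f} + \int_{L_3} \mathbf{f},$$
where $L_1$ and $L_3$ denote the right and left sides of the square, oriented as in $c_1$.
Similarly, we have
$$-\int_M \frac{\partial f_1}{\partial x_2} = -\int_0^K \int_0^K \frac{\partial f_1(x_1, x_2)}{\partial x_2} \, dx_2 \, dx_1 = \int_0^K (f_1(x_1, 0) - f_1(x_1, K)) \, dx_1 = \int_{L_4} \mathbf{f} + \int_{L_2} \mathbf{f},$$
where $L_4$ and $L_2$ denote the bottom and top sides of the square, oriented as in $c_1$.
Adding these two formulas gives the statement in the present case. Amazingly, this
is the only concrete case of the theorem we need to prove by direct calculation.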
Now consider the general case. First we need to observe that the statement (5.4.1)
doesn’t change if we perform a (2-variable) substitution by a diffeomorphism $\mathbf{F} :
V \to V'$ (see Theorem 7.9 of Chapter 5). This is easy to accept, but somewhat
harder to do in detail. The reason is that even in two variables, the concepts we set
up so far do not transform in the simplest possible way under coordinate change.
We will understand this better in Chapter 12 below.
To do the calculation we need, let us write
$$(x_1, x_2)^T = \mathbf{F}((r_1, r_2)^T),$$
so identifying, at a point, the linear map $D\mathbf{F}$ with its associated matrix, we have
$$D\mathbf{F} = \begin{pmatrix} \dfrac{\partial x_1}{\partial r_1} & \dfrac{\partial x_1}{\partial r_2} \\[2ex] \dfrac{\partial x_2}{\partial r_1} & \dfrac{\partial x_2}{\partial r_2} \end{pmatrix}.$$
Now consider a vector function $\mathbf{f} : \mathbb{R}^2 \to \mathbb{R}^2$ where we understand
the independent variables to be $x_1, x_2$ (i.e. “the $x_1 x_2$-plane”). Let $L$ be an oriented
piecewise continuously differentiable curve in $\mathbb{R}^2$, which we understand as “the
$r_1 r_2$-plane”. We will denote, slightly imprecisely, by $\mathbf{F}[L]$ the “$\mathbf{F}$-image of the curve
$L$ in the $x_1 x_2$-plane”, i.e. the oriented curve obtained by composing the parametrization
of $L$ with the map $\mathbf{F}$. The key observation then is that the definition of the line
integral of the second kind gives
$$\int_{\mathbf{F}[L]} \mathbf{f} \cdot d\mathbf{x} = \int_L ((D\mathbf{F})^T (\mathbf{f} \circ \mathbf{F})) \cdot d\mathbf{r}. \tag{5.4.2}$$
(Note that if we wrote the integrand of a line integral of the second kind as a row
instead of column vector, the transposition on the right hand side of (5.4.2) would
be unnecessary; again, we will understand this better in Chapter 12 below.)
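Denoting the integrand on the right hand side of (5.4.2) by $\mathbf{g}$, we have, in
coordinates,
$$\mathbf{g} = \begin{pmatrix} \dfrac{\partial x_1}{\partial r_1} f_1 + \dfrac{\partial x_2}{\partial r_1} f_2 \\[2ex] \dfrac{\partial x_1}{\partial r_2} f_1 + \dfrac{\partial x_2}{\partial r_2} f_2 \end{pmatrix}.$$
Now compute:
$$\begin{aligned}
\frac{\partial g_2}{\partial r_1} - \frac{\partial g_1}{\partial r_2}
&= \frac{\partial x_1}{\partial r_2} \frac{\partial f_1}{\partial r_1} + f_1 \frac{\partial^2 x_1}{\partial r_1 \partial r_2} + \frac{\partial x_2}{\partial r_2} \frac{\partial f_2}{\partial r_1} + f_2 \frac{\partial^2 x_2}{\partial r_1 \partial r_2} \\
&\quad - \frac{\partial x_1}{\partial r_1} \frac{\partial f_1}{\partial r_2} - f_1 \frac{\partial^2 x_1}{\partial r_1 \partial r_2} - \frac{\partial x_2}{\partial r_1} \frac{\partial f_2}{\partial r_2} - f_2 \frac{\partial^2 x_2}{\partial r_1 \partial r_2}.
\end{aligned} \tag{5.4.3}$$
We see that the second order terms cancel out, and after applying the chain rule
$$\frac{\partial f_i}{\partial r_j} = \frac{\partial f_i}{\partial x_1} \frac{\partial x_1}{\partial r_j} + \frac{\partial f_i}{\partial x_2} \frac{\partial x_2}{\partial r_j},$$
the right hand side of (5.4.3) becomes a sum of eight terms, four of which cancel
out, leaving
$$\left( \frac{\partial f_2}{\partial x_1} - \frac{\partial f_1}{\partial x_2} \right) \left( \frac{\partial x_1}{\partial r_1} \frac{\partial x_2}{\partial r_2} - \frac{\partial x_1}{\partial r_2} \frac{\partial x_2}{\partial r_1} \right) = \left( \frac{\partial f_2}{\partial x_1} - \frac{\partial f_1}{\partial x_2} \right) \det(D\mathbf{F}),$$
which is what we need to transform (5.4.1) from the $x$-coordinates to the
$r$-coordinates, provided $\det(D\mathbf{F}) > 0$ (see Theorem 7.9 and Exercise (16) of
Chapter 5).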
Now by compactness of $\overline{U}$ (our main assumption!), there exist open sets
$V_1, \dots, V_m$ of $\mathbb{R}^2$ such that
$$V_1 \cup \dots \cup V_m \supseteq \overline{U}$$
and for each $i$, we have either $V_i \subseteq U$ or $x \in V_i \subseteq V_x$ for some $x \in \overline{U} \smallsetminus U$. Let $u_i$ be
a smooth partition of unity subordinate to the cover $(V_i)$. We shall prove the formula
(5.4.1) for each of the functions $u_i \mathbf{f}$, $i = 1, \dots, m$. We distinguish four cases:
Case 1: $V_i \subseteq U$. By linear substitution, we may assume $V_i \subseteq (0, K) \times (0, K)$.
Thus, the statement for $u_i \mathbf{f}$ follows from the special case already proved (the
left hand side of (5.4.1) with $\mathbf{f}$ replaced by $u_i \mathbf{f}$ is 0).
Case 2: $x \in V_i \subseteq V_x$ and $0 < \alpha < \pi$. By $\mathbb{R}^2$-linear substitution, we may assume
$\alpha = \pi/2$. In this case, choose $K = 1$ and extend the map $u_i \mathbf{f} \circ (\phi_x)^{-1}$ to an open
set containing $\langle 0, K \rangle \times \langle 0, K \rangle$ by 0. Again, the statement reduces to the special
case already proved (noting that for this new function, the contributions to the
left hand side of (5.4.1) for $1 \le t \le 3$ are 0).
Case 3: $\alpha = \pi$. By the linear substitution
$$r = \frac{1}{2}(x_1 + 1), \qquad s = x_2,$$
applied to the function $u_i \mathbf{f} \circ (\phi_x)^{-1}$, the statement reduces to the special case
already proved with $K = 1$. (Note that for this function, the contributions to the
left hand side of (5.4.1) with $1 \le t \le 4$ are 0.)
Case 4: $\pi < \alpha < 2\pi$. By $\mathbb{R}^2$-linear substitution applied to the function
$u_i \mathbf{f} \circ (\phi_x)^{-1}$, we may assume $\alpha = 3\pi/2$. Then extend the function $u_i \mathbf{f} \circ (\phi_x)^{-1}$
to an open neighborhood in $\mathbb{R}^2$ of the set
$$Z = (\langle -1, 1 \rangle \times \langle -1, 1 \rangle) \smallsetminus ((0, 1 \rangle \times \langle -1, 0)).$$
Express
$$Z = Z_1 \cup Z_2$$
where
$$Z_1 = \langle -1, 1 \rangle \times \langle 0, 1 \rangle, \qquad Z_2 = \langle -1, 0 \rangle \times \langle -1, 0 \rangle.$$
The sets are not disjoint, but the intersection has measure 0. For the sets $Z_1$, $Z_2$
and restrictions of the function $u_i \mathbf{f} \circ (\phi_x)^{-1}$, the statement follows from Cases 3
and 2, respectively. When adding the left hand sides of formula (5.4.1) for these
functions, the contributions from the line segment $\langle -1, 0 \rangle \times \{0\}$ cancel out. $\square$
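The special case of the square, on which the whole proof rests, is easy to test numerically. The sketch below is an illustration under hypothetical choices: $K = 2$ and the vector field $\mathbf{f}(x_1, x_2) = (-x_2^2, x_1 x_2)$, for which $\partial f_2/\partial x_1 - \partial f_1/\partial x_2 = 3x_2$ and both sides of (5.4.1) equal $\frac{3}{2}K^3 = 12$. The boundary is traversed counter-clockwise starting at the origin.

```python
K = 2.0  # hypothetical side length

def f(x1, x2):
    """Hypothetical vector field (f1, f2)."""
    return (-x2 * x2, x1 * x2)

def curl(x1, x2):
    """d f2 / d x1 - d f1 / d x2 for the field above."""
    return x2 + 2.0 * x2

def c1(t):
    """Counter-clockwise boundary of (0, K) x (0, K), t in <0, 4>."""
    if t <= 1:
        return (K * t, 0.0)
    if t <= 2:
        return (K, K * (t - 1))
    if t <= 3:
        return (K * (3 - t), K)
    return (0.0, K * (4 - t))

def dc1(t):
    """Derivative of c1 away from the corner parameters 1, 2, 3."""
    if t < 1:
        return (K, 0.0)
    if t < 2:
        return (0.0, K)
    if t < 3:
        return (-K, 0.0)
    return (0.0, -K)

def boundary_integral(n=1000):
    """Midpoint rule on each of the four sides; midpoints avoid the corners."""
    h = 1.0 / n
    total = 0.0
    for piece in range(4):
        for i in range(n):
            t = piece + (i + 0.5) * h
            fx, fy = f(*c1(t))
            dx, dy = dc1(t)
            total += (fx * dx + fy * dy) * h
    return total

def area_integral(n=200):
    """Two-dimensional midpoint rule over the square."""
    h = K / n
    return sum(curl((i + 0.5) * h, (j + 0.5) * h) * h * h
               for i in range(n) for j in range(n))

lhs = boundary_integral()
rhs = area_integral()
```

The line integral `lhs` and the area integral `rhs` should agree.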
6 Exercises
(1) Prove the statement of the comment in 1.1.

(2) Prove that the relation $\approx$ in 1.2 is an equivalence relation.
(3) Prove that in 1.7, $\boldsymbol{\phi} * \boldsymbol{\psi}$ is a piecewise continuously differentiable
parametrization of a curve.
(4) Prove that the parametrized curve $K + L$ defined in 1.7 depends only on the
parametrized curves $K$ and $L$, and not on their parametrizations.
(5) Prove that in 1.7, we have $(K + L) + M = K + (L + M)$.
(6) Prove directly that the factor $\dfrac{\boldsymbol{\phi}'(t)}{\|\boldsymbol{\phi}'(t)\|}$ of 3.5 at each point of $\boldsymbol{\phi}[\langle a, b \rangle]$ (where
defined) does not depend on the parametrization of the piecewise continuously
differentiable oriented curve. Prove also that reversal of orientation of the
curve results in multiplication of this factor by $-1$.
(7) Compute the complex line integral
$$\int_L e^z \, dz$$
where $L$ is the straight line segment in $\mathbb{C}$ from $2 + 3i$ to $1 + i$.
(8) Write out in detail the simplification of the right hand side of (5.4.3) using the
chain rule.
(9) Compute
$$(II)\int_L y^2 \, dx + 2xy \, dy$$
where $L$ is the boundary of the upper unit half-disk
$$\{(x, y)^T \in \mathbb{R}^2 \mid x^2 + y^2 < 1, \ y > 0\}$$
oriented counterclockwise.
(10) Prove that if $L_1 \amalg \dots \amalg L_k$ is the boundary of a domain $U$ oriented
counter-clockwise, then the area of $U$ is equal to
$$\frac{1}{2} \int_{L_1 \amalg \dots \amalg L_k} x \, dy - y \, dx.$$
[Hint: Use Green’s Theorem.]
(11) Using Green’s Theorem and Theorem 4.4, compute the complex line integral
$$\int_L z^2 \, dz$$
where $L$ is the boundary of the square
$$\{x + iy \mid 0 < x < 1, \ 0 < y < 1\}$$
oriented counter-clockwise.
Part II
Analysis and Geometry
9 Metric and Topological Spaces II
For the remaining chapters of this text, we must revisit our foundations. Specifically,
it is time to upgrade our knowledge of both metric and topological spaces. For
example, in the upcoming discussion of manifolds in Chapter 12, we will need
separability. We will need a characterization of compactness by properties of open
covers. Also, it is natural to define manifolds as topological and not metric spaces
which prompts the development of separation axioms, with a focus on normality. On
the other hand, when discussing Hilbert spaces in Chapters 16 and 17, we will need
completion, extension of uniformly continuous maps, and the Stone-Weierstrass
Theorem. These are the topics we will discuss in the present chapter.
1 Separable and totally bounded metric spaces
1.1 A few concepts
A subset M X of a topological space is said to be dense if M D X .

A space is separable if it contains an at most countable dense subset. (At most
countable means finite or countable.)
A cover of a space $(X, \tau)$ ($\tau$ is the set of all open sets, recall Subsection 4.2 of
Chapter 2) is a subset $\mathcal{U} \subseteq \tau$ such that
$$\bigcup \mathcal{U} = X.$$
Note that we only consider covers by open sets. (In other texts, this requirement
is sometimes dropped, in which case our concept would be called an open cover.)
A subcover $\mathcal{V}$ of a cover $\mathcal{U}$ is a subset $\mathcal{V} \subseteq \mathcal{U}$ that is itself a cover.
A space X is said to be Lindelöf if every cover of X contains an at most
countable subcover.
1.2 Theorem. The following statements about a metric space $X = (X, d)$ are
equivalent.

(1) X is separable.
(2) The topology of X has a countable basis.
(3) X is Lindelöf.
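Proof. (1)$\Rightarrow$(2): Let $M$ be a countable dense subset of $X$. Put
$$\mathcal{B} = \{\Omega(m, r) \mid m \in M,\ r \text{ rational}\}.$$
We will prove that $\mathcal{B}$ is a basis. Take an open $U$, an $x \in U$, and an $\varepsilon > 0$ such that
$\Omega(x, \varepsilon) \subseteq U$. Now choose an $m \in M$ such that $d(x, m) < \frac{1}{3}\varepsilon$ and a rational $r$ such
that $\frac{1}{3}\varepsilon < r < \frac{2}{3}\varepsilon$. Then $x \in \Omega(m, r) \subseteq \Omega(x, \varepsilon) \subseteq U$: in effect, if $d(m, y) < r$ we
have $d(x, y) \le d(x, m) + d(m, y) < (\frac{1}{3} + \frac{2}{3})\varepsilon = \varepsilon$.
(2)$\Rightarrow$(3): Let $\mathcal{B}$ be a countable basis and let $\mathcal{U}$ be an arbitrary open cover. Put
$\mathcal{B}' = \{B \in \mathcal{B} \mid \exists U \in \mathcal{U},\ B \subseteq U\}$. Then $\mathcal{B}'$ is a countable cover, and if we choose
for each $B \in \mathcal{B}'$ a $U_B \in \mathcal{U}$ with $B \subseteq U_B$ then also $\{U_B \mid B \in \mathcal{B}'\}$ is a countable
cover.
(3)$\Rightarrow$(1): For every positive natural number $n$, choose a countable subcover of
the cover $\{\Omega(x, \frac{1}{n}) \mid x \in X\}$, say
$$\Omega(x_{n1}, \tfrac{1}{n}), \dots, \Omega(x_{nk}, \tfrac{1}{n}), \dots.$$
Then the set $\{x_{nk} \mid n, k = 1, 2, \dots\}$ is dense in $X$. □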
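The step (1)$\Rightarrow$(2) can be traced on the real line with the dense set $M = \mathbb{Q}$. The following Python sketch picks a rational center $m$ with $d(x, m) < \frac{1}{3}\varepsilon$ and the rational radius $r = \frac{1}{2}\varepsilon$, as in the proof. The function name `rational_ball` is ours, and exact `Fraction` arithmetic stands in for real numbers (the point $x$ is the exact value of a floating-point number), so this is only an illustration under these assumptions.

```python
from fractions import Fraction
import math

def rational_ball(x, eps):
    """Return rational (m, r) with x in (m - r, m + r) and
    (m - r, m + r) contained in (x - eps, x + eps).

    x and eps are Fractions with eps > 0; the dense set is M = Q in X = R.
    """
    denom = math.ceil(3 / eps) + 1        # then 1/(2*denom) < eps/3
    m = x.limit_denominator(denom)        # closest fraction with small denominator
    r = eps / 2                           # rational with eps/3 < r < 2*eps/3
    return m, r

x = Fraction(math.sqrt(2))                # exact value of a float, standing in for a real number
eps = Fraction(1, 100)
m, r = rational_ball(x, eps)
```

The inclusion of the small ball in the large one is exactly the triangle inequality estimate used in the proof.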
1.2.1 Remarks
1. This is a very specific fact concerning metric spaces. In a general topological
space one has only the (very easy) implications (2)$\Rightarrow$(3) and (2)$\Rightarrow$(1) and nothing
more.
2. In the literature, the existence of a countable basis is often called the second
axiom of countability.
1.2.2
Obviously, if $X$ has a countable basis $\mathcal{B}$ then each subspace $Y \subseteq X$ has one, namely
$\mathcal{B}|Y = \{U \cap Y \mid U \in \mathcal{B}\}$. Hence we have
Corollary. A subspace of a separable metric space is separable.

Equivalently, a subspace of a Lindelöf metric space is Lindelöf.
The first of these statements hardly comes as a surprise (it is easy to prove it
directly, too). But the second one should sound somewhat strange. We will see
shortly (in 2.3 below) that the Lindelöf property is very close to compactness, and
compactness is (very obviously) not inherited by subspaces. Again, this corollary is
characteristic of metric spaces. In a general topological context neither of the two
statements holds.
1.3
A metric space $X$ is said to be totally bounded if for each $\varepsilon > 0$ there exists a finite
subset $M(\varepsilon)$ of $X$ such that
$$\text{for every } x \in X, \text{ we have } d(x, M(\varepsilon)) < \varepsilon. \tag{TB}$$
1.3.1
A totally bounded space is always bounded but a bounded space is not necessarily
totally bounded: take any infinite set and define $d(x, y) = 1$ for $x \ne y$. But we
have
Proposition. A subspace $X$ of the Euclidean space $\mathbb{R}^n$ is totally bounded if and
only if it is bounded.
Proof. If $X$ is bounded, then we have
$$X \subseteq \langle -N, N\rangle \times \dots \times \langle -N, N\rangle$$
for a sufficiently large natural $N$. Choose a natural number $k$ such that $\frac{\sqrt{n}}{k} < \frac{\varepsilon}{2}$
and put
$$M = \{s = (\tfrac{s_1}{k}, \dots, \tfrac{s_n}{k}) \mid s_i \text{ are integers},\ -Nk \le s_i \le Nk\}.$$
For every $x \in X$, there exists an $s \in M$ such that $d(x, s) < \frac{\varepsilon}{2}$. For an $s \in M$,
choose an $x(s) \in X$ such that $d(x(s), s) < \frac{\varepsilon}{2}$, if such $x(s)$ exists, and put
$$M_X = \{x(s) \mid s \in M \text{ such that } x(s) \text{ exists}\}.$$
Then, by the triangle inequality, we have, for every $x \in X$, $d(x, M_X) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon$. □
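The grid construction from this proof can be tried out on a finite sample of a bounded set. The following Python sketch is only an illustration under our own assumptions: the function name `grid_net` is hypothetical, the set $X$ is replaced by a finite list of sample points (here on the unit circle in $\mathbb{R}^2$), and each point is assigned to its nearest grid point $s$, which is slightly more restrictive than the condition $d(x(s), s) < \frac{\varepsilon}{2}$ used above.

```python
import math

def grid_net(points, eps):
    """Finite eps-net for a finite sample `points` of a bounded subset of R^n."""
    n = len(points[0])
    k = math.floor(2 * math.sqrt(n) / eps) + 1   # guarantees sqrt(n)/k < eps/2
    net = {}
    for p in points:
        # integer indices (s_1, ..., s_n) of the nearest grid point (s_1/k, ..., s_n/k)
        s = tuple(round(c * k) for c in p)
        net.setdefault(s, p)                     # keep one representative x(s) per grid point
    return list(net.values())

points = [(math.cos(t / 100), math.sin(t / 100)) for t in range(629)]
eps = 0.5
net = grid_net(points, eps)
```

Here $k = 6$ and $N = 1$, so the net has at most $(2Nk + 1)^2 = 169$ points no matter how many sample points are supplied; this bound, independent of the sample, is what total boundedness expresses.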
1.4 Proposition. A metric space $X$ is totally bounded if and only if every sequence
in $X$ contains a Cauchy subsequence.
Proof. I. Let $X$ be totally bounded. Consider the sets $M(\frac{1}{n})$ from the definition 1.3.
Now consider a sequence $(x_i)_{i=1,2,\dots}$ in $X$. If the set $P = \{x_i \mid i = 1, 2, \dots\}$ is finite,
then our sequence contains a constant subsequence, which is, of course, Cauchy.
Otherwise choose first $m_1 \in M(1)$ so that $P_1 = P \cap \Omega(m_1, 1)$ is infinite, and
then $k_1$ with $x_{k_1} \in P_1$. Now assuming we have
$$m_j \in M(\tfrac{1}{j}),\ j = 1, \dots, n-1, \text{ such that } P_j = P_{j-1} \cap \Omega(m_j, \tfrac{1}{j}) \text{ are infinite, and } k_1 < k_2 < \dots < k_{n-1} \text{ such that } x_{k_j} \in P_j,$$
choose $m_n \in M(\frac{1}{n})$ with $P_n = P_{n-1} \cap \Omega(m_n, \frac{1}{n})$ infinite, and a $k_n > k_{n-1}$ such
that $x_{k_n} \in P_n$. Then the sequence $x_{k_1}, x_{k_2}, x_{k_3}, \dots$ is obviously Cauchy.
II. Let $X$ not be totally bounded. Then there exists an $\varepsilon > 0$ such that for
every finite $M \subseteq X$ there exists an $x \in X$ with $d(x, M) \ge \varepsilon$. Pick an
arbitrary point $x_1$; assuming we have chosen $x_1, \dots, x_n$, pick an $x_{n+1} \in X$
so that $d(x_{n+1}, \{x_1, \dots, x_n\}) \ge \varepsilon$. The resulting sequence obviously contains
no Cauchy subsequence. □
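The selection process of part II can be run on finite samples. In a totally bounded space it has to stall after finitely many steps, while for the discrete metric of 1.3.1 it never does. The following Python sketch (the function name `greedy_separated` and both samples are ours, chosen only for illustration) picks points one by one, each at distance at least $\varepsilon$ from all earlier picks.

```python
def greedy_separated(points, eps, dist):
    """Pick points one by one, each at distance >= eps from all earlier picks."""
    chosen = []
    for p in points:
        if all(dist(p, q) >= eps for q in chosen):
            chosen.append(p)
    return chosen

# Integer points 0..1000 with the usual distance and eps = 100:
# the process stalls after the picks 0, 100, ..., 1000.
interval = greedy_separated(range(1001), 100, lambda a, b: abs(a - b))

# Fifty points with the discrete metric d(x, y) = 1 for x != y and eps = 1:
# every point is picked, however large the sample is.
discrete = greedy_separated(range(50), 1, lambda a, b: 0 if a == b else 1)
```

Every rejected point lies at distance less than $\varepsilon$ from some chosen point, so whenever the process stalls, the chosen points form a set $M(\varepsilon)$ as in (TB) for the sample.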
1.5 Proposition. A totally bounded metric space is separable.
Proof. It suffices to take $M = \bigcup_{n=1}^{\infty} M(\frac{1}{n})$. □
2 More on compact spaces
2.1
A point $x$ of a space $X$ is said to be an accumulation point of a subset $M \subseteq X$ if
for every neighborhood $U$ of $x$ the intersection $U \cap M$ is infinite.
Here is a simple reformulation of the definition of compactness we have used so
far (i.e. requiring that every sequence have a convergent subsequence).
2.1.1 Proposition. A metric space $X$ is compact if and only if every infinite set
$M \subseteq X$ has an accumulation point.
Proof. I. Assume that every sequence in $X$ has a convergent subsequence. Let $M$
be an infinite subset of $X$. Choose a one-to-one (not necessarily onto) mapping
$\varphi : \mathbb{N} \to M$ and a convergent subsequence $(\varphi(k_n))_n$ of $(\varphi(n))_n$. Then $\lim_n \varphi(k_n)$
is an accumulation point of $M$.
II. Assume that every infinite $M \subseteq X$ has an accumulation point. Let $(x_n)_n$ be
a sequence in $X$. If $M = \{x_n \mid n = 1, 2, \dots\}$ is finite then $(x_n)_n$ contains a
constant, and hence convergent, subsequence. Assume that $M$ is infinite, and
let $x$ be one of its accumulation points. Choose $k_1$ arbitrarily, and assuming
$x_{k_1}, \dots, x_{k_{n-1}}$, $k_1 < \dots < k_{n-1}$, are chosen, choose $k_n > k_{n-1}$ so that
$x_{k_n} \in \Omega(x, \frac{1}{n})$ (such a $k_n$ exists since $x$ is an accumulation point, and on the other
hand, only finitely many $j$, namely those with $j \le k_{n-1}$, are disqualified by the
definition of a subsequence). Now obviously $\lim_n x_{k_n} = x$. □
2.2 Theorem. A metric space $X$ is compact if and only if it is complete and totally
bounded.
Proof. I. If $X$ is compact then it is complete by 7.4 of Chapter 2. If $X$ were not
totally bounded, there would exist an $\varepsilon > 0$ such that for every finite subset
$M$ there is a point $x$ with $d(x, M) \ge \varepsilon$. Choose $x_1$ arbitrarily and assuming
$x_1, \dots, x_n$ are already chosen, pick an $x_{n+1}$ such that $d(x_{n+1}, \{x_1, \dots, x_n\}) \ge \varepsilon$.
Then $\{x_n \mid n = 1, 2, \dots\}$ is infinite and obviously has no accumulation point.
II. If $(x_n)_n$ is a sequence in a totally bounded complete metric space then by
1.4 it contains a Cauchy subsequence, and by completeness, this subsequence
converges. □
2.2.1 Remark
This fact is a generalization of Theorem 6.5 of Chapter 2 stating that a subset $X \subseteq \mathbb{R}^n$
is compact if and only if it is closed and bounded. We know that $\mathbb{R}^n$ is complete
(7.5 of Chapter 2), and hence, by 7.5 of Chapter 2 again, $X$ is complete if and
only if it is closed; by 1.3.1, for $X \subseteq \mathbb{R}^n$, boundedness and total boundedness are
equivalent.
2.3
The following is the famous Heine-Borel Theorem. One can think of it as a

generalization of 5.5 of Chapter 2 to arbitrary metric spaces.
Theorem. A metric space is compact if and only if each of its (open) covers
contains a finite subcover.
Proof. I. Let $X$ be compact and let $\mathcal{U}'$ be a cover of $X$ which has no finite subcover.
By 2.2 and 1.5, $X$ is separable, hence by 1.2 it is Lindelöf, and hence $\mathcal{U}'$ has a
countable subcover
$$\mathcal{U} = \{U_1, U_2, \dots, U_n, \dots\}.$$
By assumption, $\mathcal{U}$ has no finite subcover.

Now our strategy is to discard the $U_i$’s which are “redundant” in the given order.
More precisely,
let $V_1 = U_j$ for the lowest $j$ for which $U_j \ne \emptyset$,
and assuming $V_j$, $j = 1, \dots, n-1$ are already chosen,
let $V_n = U_j$ for the lowest $j$ such that $U_j \setminus \bigcup_{i=1}^{n-1} V_i \ne \emptyset$
(by assumption, the finite system $\{V_1, \dots, V_{n-1}\}$ cannot be a cover). Choose
$$x_n \in V_n \setminus \bigcup_{i=1}^{n-1} V_i \quad\text{and put}\quad M = \{x_n \mid n = 1, 2, \dots\}.$$
Now
• $x_n \notin \{x_1, \dots, x_{n-1}\}$ ($\subseteq \bigcup_{i=1}^{n-1} V_i$) and hence $M$ is infinite,
• $V_1, \dots, V_n, \dots$ is a cover since each discarded $U_j$ is contained in the union of
the $V_i$’s, and
• $V_n \cap M \subseteq \{x_1, \dots, x_n\}$ and hence is finite.
This is a contradiction: The set $M$ must have an accumulation point $x$, this $x$ is an
element of some $V_n$, but this neighborhood of $x$ meets $M$ in finitely many points
only.
II. Assume each cover of $X$ has a finite subcover and assume $M \subseteq X$ has no
accumulation point. Then for every $x \in X$ there exists an open neighborhood $U_x$
such that $U_x \cap M$ is finite. Choose a finite subcover $U_{x_1}, \dots, U_{x_n}$ of the cover
$\{U_x \mid x \in X\}$. Then
$$M = \Big(\bigcup_{i=1}^{n} U_{x_i}\Big) \cap M = \bigcup_{i=1}^{n} (U_{x_i} \cap M)$$
is a finite union of finite sets and hence is finite. Thus every infinite subset of $X$
has an accumulation point, and $X$ is compact by 2.1.1. □
2.4
Theorem 2.3 suggests the following definition of compactness for general topologi-
cal spaces, which we will adopt from now on:
A topological space is said to be compact if each of its (open) covers has a finite
subcover.
Similarly as in the special case of metric spaces (recall 6.2 of Chapter 2) we have
2.4.1 Proposition. Let $f : X \to Y$ be a continuous map and let $X$ be compact.
Then the subspace $f[X]$ of $Y$ is compact.
Proof. Let $U_i$, $i \in J$, be open in $Y$ and let $f[X] \subseteq \bigcup_{i \in J} U_i$. Then
$X \subseteq f^{-1}[\bigcup_{i \in J} U_i] = \bigcup_{i \in J} f^{-1}[U_i]$ and hence there exist $i_1, \dots, i_n$ such that
$$X \subseteq \bigcup_{j=1}^{n} f^{-1}[U_{i_j}] = f^{-1}\Big[\bigcup_{j=1}^{n} U_{i_j}\Big].$$
This is equivalent to $f[X] \subseteq \bigcup_{j=1}^{n} U_{i_j}$. □
From this statement we obtain, again, the following important generalization of

Proposition 6.3 of Chapter 2:
2.4.2 Proposition. Let X be a compact topological space. Then every continuous

real function f W X ! R has both a maximum and a minimum.
2.4.3 Proposition. A closed subspace $Y$ of a compact topological space $X$ is
compact.
Proof. Let $U_i$, $i \in J$, be open sets in $X$ such that $\bigcup_{i \in J} U_i \supseteq Y$. Then
$\{U_i \mid i \in J\} \cup \{X \setminus Y\}$ is an open cover of $X$ and hence there exists a finite subcover
$$U_{i_1}, \dots, U_{i_n}, X \setminus Y$$
of $X$. Since $Y \cap (X \setminus Y) = \emptyset$ we have $Y \subseteq \bigcup_{j=1}^{n} U_{i_j}$. □
2.4.4 Remark
Unlike the case of metric spaces, a compact subspace of a topological space is
not necessarily closed: for example, any subspace of a finite topological space is
compact, but not every subset may be closed. This, in fact, is one of the motivations
of separation axioms, which can be used to remedy this situation, and which will be
discussed in Section 5 below.
3 Baire’s Category Theorem

3.1
A subset $A$ of a topological space $X$ is said to be nowhere dense if $X \setminus \overline{A}$ is dense in
$X$, that is, if $\overline{X \setminus \overline{A}} = X$ (recall that $\overline{A}$ denotes the closure of $A$, i.e. the intersection
of all closed subsets of $X$ which contain $A$).
In other words,
$A$ is nowhere dense if and only if for each non-empty open $U$ the intersection
$U \cap (X \setminus \overline{A})$ is non-empty.
Consequently we obtain
3.2 Observation. A union of finitely many nowhere dense subsets of $X$ is nowhere
dense.
(If $A, B$ are nowhere dense and $U$ is non-empty open then $U \cap (X \setminus \overline{A})$ is
non-empty open and hence $U \cap (X \setminus \overline{A}) \cap (X \setminus \overline{B}) = U \cap (X \setminus (\overline{A} \cup \overline{B})) =
U \cap (X \setminus \overline{A \cup B})$ is non-empty.)
3.3
A subset $A \subseteq X$ is of the first category (or meager) in $X$ if $A$ is a countable union
$$\bigcup_{n=1}^{\infty} A_n$$
with $A_n$ nowhere dense. From 3.2 we immediately see that
a subset $A \subseteq X$ is of the first category in $X$ if it is a union $\bigcup_{n=1}^{\infty} A_n$ of an increasing
sequence $A_1 \subseteq A_2 \subseteq \cdots$ of nowhere dense subsets.
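3.4 Theorem. (Baire’s Category Theorem) If $X$ is a complete metric space then $X$
is not of the first category in $X$.
Proof. Let
$$A_1 \subseteq A_2 \subseteq \dots \subseteq A_n \subseteq \cdots$$
be an increasing sequence of nowhere dense subsets of a complete metric space $X$.
We will prove that $X \setminus \bigcup_n A_n \ne \emptyset$.
Since a closure of a nowhere dense set is nowhere dense, we may assume without
loss of generality that the sets $A_n$ are closed.
The set $A_1$ is nowhere dense closed and hence there exists an $x_1 \in X \setminus A_1$ and
an $\varepsilon_1$, $0 < \varepsilon_1 < 1$, such that $\Omega(x_1, 2\varepsilon_1) \cap A_1 = \emptyset$.
Now $\Omega(x_1, \varepsilon_1)$ is a non-empty open set and hence $\Omega(x_1, \varepsilon_1) \cap (X \setminus A_2) \ne \emptyset$
and we have an $x_2$ and an $\varepsilon_2$, $0 < \varepsilon_2 < \frac{1}{2}$, such that
$\Omega(x_2, 2\varepsilon_2) \subseteq \Omega(x_1, \varepsilon_1) \cap (X \setminus A_2)$, i.e.
$$\Omega(x_2, 2\varepsilon_2) \cap A_2 = \emptyset \quad\text{and}\quad \Omega(x_2, 2\varepsilon_2) \subseteq \Omega(x_1, \varepsilon_1).$$
Now assume we already have $x_1, \dots, x_n$ and $\varepsilon_1, \dots, \varepsilon_n$, $0 < \varepsilon_k < \frac{1}{k}$, such that
$$\Omega(x_k, 2\varepsilon_k) \cap A_k = \emptyset \text{ for } k \le n, \quad\text{and}\quad \Omega(x_k, 2\varepsilon_k) \subseteq \Omega(x_{k-1}, \varepsilon_{k-1}) \text{ for } 1 < k \le n.$$
Since $\Omega(x_n, \varepsilon_n)$ is a non-empty open set, we have a non-empty open
$\Omega(x_n, \varepsilon_n) \cap (X \setminus A_{n+1})$ and hence there is an $x_{n+1}$ and an $\varepsilon_{n+1}$ with
$0 < \varepsilon_{n+1} < \frac{1}{n+1}$ such that
$$\Omega(x_{n+1}, 2\varepsilon_{n+1}) \cap A_{n+1} = \emptyset \quad\text{and}\quad \Omega(x_{n+1}, 2\varepsilon_{n+1}) \subseteq \Omega(x_n, \varepsilon_n).$$
Since $\overline{\Omega(x, \varepsilon)} \subseteq \Omega(x, 2\varepsilon)$ (if $d(y, \Omega(x, \varepsilon)) = 0$ we can find a $z \in \Omega(x, \varepsilon)$ such that
$d(y, z) < \varepsilon$), setting $B_n = \overline{\Omega(x_n, \varepsilon_n)}$ we obtain a sequence
$$\Omega(x_1, 2\varepsilon_1) \supseteq B_1 \supseteq \Omega(x_2, 2\varepsilon_2) \supseteq B_2 \supseteq \Omega(x_3, 2\varepsilon_3) \supseteq B_3 \supseteq \cdots$$
such that $\Omega(x_k, 2\varepsilon_k) \cap A_k = \emptyset$ (and hence $B_k \cap A_k = \emptyset$).
For $k \ge n$ we have $x_k \in \Omega(x_n, 2\varepsilon_n)$ and since $\varepsilon_n < \frac{1}{n}$ the sequence $(x_n)_n$ is
Cauchy, and by completeness it has a limit $x \in X$. Furthermore, for $k \ge n$ we have
$x_k \in B_n$, and since $B_n$ is closed, $x \in B_n$. Thus, $x \in \bigcap_n B_n$. Since $B_n \cap A_n = \emptyset$ we
have $(\bigcap_k B_k) \cap A_n = \emptyset$ for every $n$ and finally $(\bigcap_k B_k) \cap \bigcup_n A_n = \emptyset$.
Therefore, $x \notin \bigcup_n A_n$. □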
4 Completion
4.1
Let $X = (X, d)$ be a metric space. On the set of Cauchy sequences $(x_n)_n$ in $X$,
introduce an equivalence relation
$$(x_n)_n \sim (x'_n)_n \ \equiv_{\mathrm{df}}\ \lim_n d(x_n, x'_n) = 0$$
($\sim$ is obviously reflexive and symmetric, and the transitivity immediately follows
from the triangle inequality).
4.2 Lemma. 1. If $(x_n)_n$ and $(y_n)_n$ are Cauchy sequences in $X$ then $(d(x_n, y_n))_n$
is a Cauchy, and hence convergent, sequence in $\mathbb{R}$.
2. If $(x_n)_n \sim (x'_n)_n$ and $(y_n)_n \sim (y'_n)_n$ then $\lim_n d(x_n, y_n) = \lim_n d(x'_n, y'_n)$.
Proof. 1. From the triangle inequality, we immediately see that
$$|d(x_m, y_m) - d(x_n, y_n)| \le d(x_m, x_n) + d(y_m, y_n).$$
Thus, if $d(x_m, x_n), d(y_m, y_n) < \frac{\varepsilon}{2}$, then $|d(x_m, y_m) - d(x_n, y_n)| < \varepsilon$.
2. $d(x_n, y_n) \le d(x_n, x'_n) + d(x'_n, y'_n) + d(y'_n, y_n)$ and hence $\lim_n d(x_n, y_n) \le
\lim_n d(x'_n, y'_n)$, and by symmetry also $\lim_n d(x'_n, y'_n) \le \lim_n d(x_n, y_n)$. □
4.3
Denote by $\tilde{X}$ the set of all the $\sim$-equivalence classes of Cauchy sequences in $(X, d)$.
For $\xi, \eta \in \tilde{X}$, define
$$\tilde{d}(\xi, \eta) = \lim_n d(x_n, y_n) \quad\text{where } (x_n)_n \in \xi \text{ and } (y_n)_n \in \eta.$$
Observation. $\tilde{X} = (\tilde{X}, \tilde{d})$ is a metric space.
(The definition of $\tilde{d}$ is correct by 4.2: obviously $\tilde{d}$ is symmetric and satisfies the
triangle inequality, and if $\tilde{d}(\xi, \eta) = 0$ and $(x_n)_n \in \xi$, $(y_n)_n \in \eta$ then we obtain
$(x_n)_n \sim (y_n)_n$ by comparing the definitions of $\sim$ and $\tilde{d}$.)
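The construction can be followed concretely for $X = \mathbb{Q}$ with $d(x, y) = |x - y|$. The following Python sketch (the function name `newton_sqrt2` and the choice of starting values are ours) produces two Cauchy sequences of rationals by the Newton iteration $x_{n+1} = \frac{1}{2}(x_n + 2/x_n)$. The distances $d(x_n, y_n)$ tend to 0, so the two sequences represent the same element of the completion, while the distance to the class of the constant sequence $1, 1, 1, \dots$ is approximated by $d(x_n, 1)$ for large $n$.

```python
from fractions import Fraction

def newton_sqrt2(start, steps):
    """Rational Cauchy sequence x_{n+1} = (x_n + 2/x_n)/2, as a list of Fractions."""
    seq = [Fraction(start)]
    for _ in range(steps):
        x = seq[-1]
        seq.append((x + 2 / x) / 2)
    return seq

xs = newton_sqrt2(1, 6)
ys = newton_sqrt2(3, 6)
gaps = [abs(x - y) for x, y in zip(xs, ys)]   # d(x_n, y_n), tending to 0
dist_to_one = abs(xs[-1] - 1)                 # approximates the distance to the class of 1, 1, 1, ...
```

No rational number is the limit of these sequences, which is why the class they represent is a new point of the completion.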
4.4
A bijection (i.e. a one-to-one onto map) $f : (X, d) \to (X', d')$ is called an
isometry if
$$\forall x, y \quad d'(f(x), f(y)) = d(x, y). \tag{*}$$
(Note that (*) implies that $f$ is one-to-one. Thus, to verify that a mapping satisfying
this condition is an isometry it suffices to prove that it is onto.)
If such a mapping exists we say that the spaces $(X, d)$ and $(X', d')$ are isometric.
A map satisfying the condition (*) without assuming that it is onto will be called
an isometric embedding.
Proposition. Every metric space is isometric to a dense subspace of a complete
metric space.
Proof. For $x \in X$ define $\tilde{x} \in \tilde{X}$ as the class containing the constant sequence
$$x, x, x, \dots.$$
Obviously the mapping
$$\iota = (x \mapsto \tilde{x}) : X \to \iota[X] = \{\tilde{x} \mid x \in X\} \subseteq \tilde{X}$$
is an isometry.
I. $\iota[X]$ is dense in $\tilde{X}$. Consider an arbitrary $\varepsilon > 0$. For a $\xi \in \tilde{X}$, choose a
representative $(x_n)_n$ and an $n_0$ such that $d(x_m, x_n) < \varepsilon$ for $m, n \ge n_0$. Then
$$\tilde{d}(\xi, \tilde{x}_{n_0}) = \lim_n d(x_n, x_{n_0}) \le \varepsilon.$$
II. $\tilde{X}$ is complete. Let $(\xi_n)_n$ be a Cauchy sequence in $\tilde{X}$. Since $\iota[X]$ is a dense subset
of $\tilde{X}$, we can choose an $x_n \in X$ such that
$$\tilde{d}(\xi_n, \tilde{x}_n) < \tfrac{1}{n}.$$
For an $\varepsilon > 0$, choose an $n_0$ such that $\tilde{d}(\xi_m, \xi_n) < \varepsilon$ whenever $m, n \ge n_0$. Then
$$d(x_m, x_n) = \tilde{d}(\tilde{x}_m, \tilde{x}_n) \le \tilde{d}(\tilde{x}_m, \xi_m) + \tilde{d}(\xi_m, \xi_n) + \tilde{d}(\xi_n, \tilde{x}_n) < \tfrac{1}{m} + \varepsilon + \tfrac{1}{n}$$
and we see that $(x_n)_n$ is a Cauchy sequence.

Denote by $\xi$ the equivalence class of $(x_n)_n$. We will show that this is a limit, in
$\tilde{X}$, of the sequence $(\xi_n)_n$. Take an arbitrary $\varepsilon > 0$ and an $n_0$ such that $\frac{1}{n_0} < \frac{\varepsilon}{2}$ and
for $n, k \ge n_0$, $d(x_n, x_k) < \frac{\varepsilon}{2}$ (this can be done since we already know that $(x_n)_n$ is
a Cauchy sequence). Then for $n \ge n_0$, we have
$$\tilde{d}(\xi_n, \xi) \le \tilde{d}(\xi_n, \tilde{x}_n) + \tilde{d}(\tilde{x}_n, \xi) < \tfrac{\varepsilon}{2} + \lim_k d(x_n, x_k) \le \tfrac{\varepsilon}{2} + \tfrac{\varepsilon}{2} = \varepsilon.$$
□
4.5
An isometric embedding of a metric space $X$ into a complete metric space with a
dense image is called a completion of $X$.
Proposition. Up to isometry, there exists precisely one completion of a metric space
$X$. More precisely, in the notation of 4.4, writing $\iota : X \to \tilde{X}$ for the embedding
$x \mapsto \tilde{x}$, if $\varphi : X \to Y$ is a completion then there
exists an isometry $f : \tilde{X} \to Y$ such that $f \circ \iota = \varphi$.
Proof. If we denote by $\rho$ the metric on $Y$, we have
$$\forall x, y \in X,\ \rho(\varphi(x), \varphi(y)) = d(x, y) \quad\text{and}\quad \overline{\varphi[X]} = Y.$$
For a $\xi \in \tilde{X}$, choose a representative $(x_n)_n$ and put
$$f(\xi) = \lim_n \varphi(x_n) \quad (\text{in } Y)$$
(by the isometric embedding requirement, $(\varphi(x_n))_n$ is Cauchy and hence convergent
in $Y$; if $(x_n)_n \sim (y_n)_n$, then again by the isometric embedding requirement,
$$\lim_n \rho(\varphi(x_n), \varphi(y_n)) = \lim_n d(x_n, y_n) = 0,$$
and hence $\lim_n \varphi(x_n) = \lim_n \varphi(y_n)$ so that the definition does not depend on the
choice of a representative).
We have $f(\tilde{x}) = \varphi(x)$ (the limit of a constant sequence), and since a metric is
(obviously) a continuous function, we have
$$\rho(f(\xi), f(\eta)) = \rho(\lim_n \varphi(x_n), \lim_n \varphi(y_n)) = \lim_n \rho(\varphi(x_n), \varphi(y_n)) = \lim_n d(x_n, y_n) = \tilde{d}(\xi, \eta).$$
Thus, $f$ is an isometric embedding, and it remains to show that $f$ is onto. Take a
$y \in Y$. Since $\varphi[X]$ is dense, there exists a sequence $(x_n)_n$ in $X$ such that
$\lim_n \varphi(x_n) = y$. Thus in particular $(\varphi(x_n))_n$ is Cauchy, and, since $\varphi$ is an isometric
embedding, so is $(x_n)_n$. If we denote by $\xi$ the equivalence class of $(x_n)_n$, we obtain
$f(\xi) = \lim_n \varphi(x_n) = y$. □
4.6 Extension of uniformly continuous maps
When discussing the Fourier transform in Chapter 17, we will need the following
important result on extension of uniformly continuous maps to the completion.
Proposition. Let $(X, d)$, $(X', d')$ be metric spaces, let $(X', d')$ be complete and let
$Y$ be a dense subspace of $X$. Then each uniformly continuous $f : Y \to X'$ has a
unique uniformly continuous extension $g : X \to X'$.
Proof. For an $x \in X$, choose a sequence $(x_n)_n$ in $Y$ such that $\lim_n x_n = x$, and set
$g(x) = \lim_n f(x_n)$. (Clearly, this definition is forced by the assumption of uniform
continuity of $g$, which already proves uniqueness.) Let us show that this is a correct
definition of a mapping: $(x_n)_n$ is a Cauchy sequence, hence, by the uniform continuity
of $f$, $(f(x_n))_n$ is Cauchy and hence convergent; if $(y_n)_n$ is another sequence in $Y$
converging to $x$ we have a Cauchy sequence $f(x_1), f(y_1), f(x_2), f(y_2), \dots, f(x_n), f(y_n), \dots$
converging to both $\lim_n f(x_n)$ and $\lim_n f(y_n)$. Considering the constant sequence,
$g(x) = f(x)$ for $x \in Y$.
Now let $\varepsilon > 0$. Choose $\varepsilon_1, \eta > 0$ such that $\varepsilon_1 + 2\eta < \varepsilon$, and a $\delta > 0$ such that
$d(u, v) < \delta$ implies $d'(f(u), f(v)) < \varepsilon_1$ for $u, v \in Y$. Let $d(x, y) < \delta$, and let
$(x_n)_n$, $(y_n)_n$ be sequences in $Y$ converging to $x$ and $y$ respectively. Choose $n$
sufficiently large such that
$$d(x_n, y_n) < \delta \quad\text{and}\quad d'(f(x_n), g(x)),\ d'(f(y_n), g(y)) < \eta.$$
Then
$$d'(g(x), g(y)) \le d'(g(x), f(x_n)) + d'(f(x_n), f(y_n)) + d'(f(y_n), g(y)) < \eta + \varepsilon_1 + \eta < \varepsilon.$$
□
5 More on topological spaces: Separation
Topological spaces are seldom used in the generality of Chapter 2, Section 4. For
various purposes, extra assumptions are usually added. In analysis, we typically
encounter so-called separation axioms, (in fact, typically, the stronger ones), which
we will briefly introduce in this section. It is worth noting that in this context,
separation refers to separation of points or subsets by open sets; it is not related
to separability as defined in Section 1 above.
5.1 T0 and T1
A topological space $X$ is said to be $T_0$ if for any two distinct points $x, y \in X$ there
exists an open set $U$ such that either $x \notin U \ni y$ or $y \notin U \ni x$. This is equivalent
to requiring that $\overline{\{x\}} = \overline{\{y\}}$ implies that $x = y$.
A space is said to be $T_1$ if for any two distinct points $x, y \in X$ there is an open
set $U$ such that $y \notin U \ni x$. This is equivalent to requiring that every finite set be
closed.
It should be noted that while there is not much use for spaces that are not T0 ,
spaces which are not T1 are used a lot (typically, however, in applications outside
analysis).
5.2 $T_2$, or the Hausdorff axiom
A space is Hausdorff (or $T_2$) if for any two distinct points $x, y \in X$ there are
disjoint open sets $U, V$ such that $x \in U$ and $y \in V$.
Hausdorff spaces are already “analysis-friendly”; for instance they admit con-
cepts of convergence in which limits are unique. We will not discuss such topics but
will present the following fact which has been promised before.
5.2.1 Proposition. In a Hausdorff space every compact subset is closed.
Proof. Let $A \subseteq X$ be a compact subspace. Fix an $x \notin A$. We will prove that there
is a neighborhood of $x$ that is disjoint from $A$.
For each $a \in A$ choose disjoint open sets $U_a \ni a$ and $V_a \ni x$. Then $\{U_a \mid a \in A\}$
is a cover of $A$ and hence there is a finite subcover $U_{a_1}, \dots, U_{a_n}$. Set $V = \bigcap_{i=1}^{n} V_{a_i}$.
Then $V \cap \bigcup_{i=1}^{n} U_{a_i} = \emptyset$ and hence $V \cap A = \emptyset$. □
From 2.4.1, we obtain the following generalization of 7.2 of Chapter 2.
5.2.2 Corollary. Let $f : X \to Y$ be a continuous map, let $X$ be compact and let
$Y$ be Hausdorff. Then for every closed $A \subseteq X$, the image $f[A]$ is closed. Thus in
particular such an $f : X \to Y$ that is bijective is a homeomorphism.
5.3 Regularity and complete regularity ($T_3$ and $T_{3\frac{1}{2}}$)
A space $X$ is regular, or $T_3$, if for every $x \in X$ and every closed $A \subseteq X$ such that
$x \notin A$, there are open disjoint $U, V$ such that $x \in U$ and $A \subseteq V$.
$X$ is completely regular, or $T_{3\frac{1}{2}}$, if for every $x \in X$ and every closed $A \subseteq X$
such that $x \notin A$ there is a continuous mapping $\varphi : X \to \langle 0, 1\rangle$ such that $\varphi(x) = 0$
and $\varphi[A] \subseteq \{1\}$.
Obviously a completely regular space is regular: take the assumed $\varphi$ and set
$U = \varphi^{-1}[\langle 0, \frac{1}{2})]$ and $V = \varphi^{-1}[(\frac{1}{2}, 1\rangle]$.
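5.3.1 Proposition. A topological space $X$ is regular if and only if for every open
$U \subseteq X$,
$$U = \bigcup \{V \mid V \text{ open},\ \overline{V} \subseteq U\}.$$
Proof. I. Let $X$ be regular and let $x \in U$. Then $x \notin X \setminus U$ and there are disjoint
open sets $V \ni x$ and $W \supseteq X \setminus U$. Now $V \subseteq X \setminus W \subseteq U$ and since $X \setminus W$
is closed, $\overline{V} \subseteq U$.
II. Let the condition hold, let $A$ be closed, and let $x \notin A$. Then
$$x \in \bigcup \{V \mid V \text{ open},\ \overline{V} \subseteq X \setminus A\}$$
and hence there is an open set $V \ni x$ such that $A \subseteq X \setminus \overline{V}$. □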
5.4 Normality
A space $X$ is normal (or $T_4$) if for any two disjoint closed subsets $A, B \subseteq X$, there
exist disjoint open sets $U, V$ such that $A \subseteq U$ and $B \subseteq V$.
5.4.1 Remarks
1. After 5.3, the reader may expect an axiom $T_{4\frac{1}{2}}$ requiring a separation of disjoint
closed sets by continuous real functions. This, however, already follows from
normality as we will see in 5.4.6 below. On the other hand, complete regularity
does not follow from regularity.
2. Of course we have $T_2 \Rightarrow T_1 \Rightarrow T_0$ while we do not have such implications for
the higher separation axioms ($T_3$ does not imply $T_2$, $T_4$ does not imply $T_3$). The
reason is that the higher separation axioms in fact do not require that points
be closed. In practice, one usually works with $T_3\,\&\,T_1$, $T_{3\frac{1}{2}}\,\&\,T_1$ and $T_4\,\&\,T_1$
and then the expected implications from “higher” to “lower” separation axioms
naturally hold.
5.4.2 Proposition. Every metric space $(X, d)$ is normal.
Proof (Recall 8.4 of Chapter 2). For disjoint closed sets $A, B \subseteq X$ define a mapping
$$\varphi : X \to \langle 0, 1\rangle$$
by setting
$$\varphi(x) = \frac{d(x, A)}{d(x, A) + d(x, B)}.$$
Since $A, B$ are closed and disjoint we cannot have simultaneously $d(x, A) = 0$
and $d(x, B) = 0$ and hence $d(x, A) + d(x, B) > 0$ for all $x$. Thus, $\varphi$ is continuous
and we can take $U = \varphi^{-1}[\langle 0, \frac{1}{2})]$ and $V = \varphi^{-1}[(\frac{1}{2}, 1\rangle]$. □
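The separating function of this proof is easy to evaluate on the real line. The following Python sketch (the function name `separating_function` and the finite sets $A$, $B$ are ours, chosen only for illustration; finite subsets of $\mathbb{R}$ are closed) computes $\varphi(x) = d(x, A)/(d(x, A) + d(x, B))$.

```python
def separating_function(x, A, B):
    """phi(x) = d(x, A) / (d(x, A) + d(x, B)) on the real line,
    for finite disjoint sets A and B."""
    dA = min(abs(x - a) for a in A)      # distance from x to A
    dB = min(abs(x - b) for b in B)      # distance from x to B
    return dA / (dA + dB)

A = [0.0, 1.0]
B = [3.0, 5.0]
```

The function vanishes on $A$, equals 1 on $B$, and the sets where it is smaller resp. larger than $\frac{1}{2}$ are the disjoint open sets $U$ and $V$ of the proof; for instance $x = 1.5$ belongs to $U$.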
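5.4.3 Proposition. Every regular Lindelöf topological space is normal.
Proof. Let $X$ be regular Lindelöf and let $A, B$ be closed and disjoint sets. For $a \in A$,
choose open disjoint sets $U_a \ni a$ and $V'_a \supseteq B$. Then
$\{U_a \mid a \in A\} \cup \{X \setminus A\}$ is a cover of $X$ and therefore we have a countable subcover
$$X \setminus A, U_1, \dots, U_n, \dots.$$
Thus we have obtained open sets
$$U_1, \dots, U_n, \dots \text{ such that } \bigcup_n U_n \supseteq A \text{ and } \overline{U_n} \cap B = \emptyset.$$
Taking, instead, the unions $U_1, U_1 \cup U_2, U_1 \cup U_2 \cup U_3, \dots$ we can assume that
$$U_1 \subseteq U_2 \subseteq \dots \subseteq U_n \subseteq \cdots.$$
Similarly we can find open sets
$$V_1 \subseteq V_2 \subseteq \dots \subseteq V_n \subseteq \cdots \text{ such that } \bigcup_n V_n \supseteq B \text{ and } \overline{V_n} \cap A = \emptyset.$$
Now set
$$\tilde{U}_n = U_n \setminus \bigcup_{j=1}^{n} \overline{V_j}, \quad U = \bigcup_n \tilde{U}_n, \quad\text{and}\quad \tilde{V}_n = V_n \setminus \bigcup_{j=1}^{n} \overline{U_j}, \quad V = \bigcup_n \tilde{V}_n.$$
We have $A \subseteq U$ (no point of $A$ appears in any of the subtracted $\overline{V_j}$) and $B \subseteq V$,
and $U \cap V = \bigcup_{m,n} (\tilde{U}_m \cap \tilde{V}_n) = \emptyset$, since in any of the intersections $\tilde{U}_m \cap \tilde{V}_n$, we have
either $m \le n$ or $m \ge n$. □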
5.4.4 Proposition. Every compact Hausdorff space is normal.
Proof. By 5.4.3, it suffices to prove that the space is regular. Let $A$ be closed and
$x \notin A$. For $a \in A$ choose disjoint open sets $U_a \ni a$ and $V_a \ni x$. Then $\{U_a \mid a \in A\}$
is a cover of $A$, which is compact by 2.4.3, and hence there is a finite subcover
$U_{a_1}, \dots, U_{a_n}$. Set $U = \bigcup_{i=1}^{n} U_{a_i}$ and $V = \bigcap_{i=1}^{n} V_{a_i}$. Then $x \in V$, $A \subseteq U$ and
$U \cap V = \emptyset$. □
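5.4.5 Lemma. Let $Q \subseteq \langle 0, 1\rangle$ be a dense subset. Let us have in a topological space
$X$ open sets $U_q$, $q \in Q$, such that
$$q < r \ \Rightarrow\ \overline{U_q} \subseteq U_r.$$
Define a mapping $\varphi : X \to \langle 0, 1\rangle$ by setting
$$\varphi(x) = \inf\{q \mid x \in U_q\}.$$
Then $\varphi$ is continuous.
Proof. Set $M(x) = \{q \mid x \in U_q\}$. Since obviously $q \in M(x)$ and $q < r$ imply
$r \in M(x)$, we have $q > \varphi(x) \Rightarrow x \in U_q$ and hence
$$x \notin U_q \ \Rightarrow\ \varphi(x) \ge q. \tag{*}$$
For $q < \varphi(x)$ take an $r$ with $q < r < \varphi(x)$; then $x \notin U_r$ and we see that
$$q < \varphi(x) \ \Rightarrow\ x \notin \overline{U_q}. \tag{**}$$
Let $\varphi(x) \in (\alpha, \beta)$ (the cases $\varphi(x) = 0$ or $1$ are only simpler and can be left to the
reader). Choose $q, r \in Q$ with $\alpha < q < \varphi(x) < r < \beta$. Then by the implications above,
$$x \in U_r \setminus \overline{U_q} \quad\text{and}\quad \forall y \in U_r \setminus \overline{U_q},\ \varphi(y) \in (\alpha, \beta).$$
Thus, the neighborhood $U_r \setminus \overline{U_q}$ of $x$ is being mapped into $(\alpha, \beta)$ and we see that
$\varphi$ is continuous. □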
5.4.6 Proposition. (Urysohn’s Theorem) Let $A, B$ be disjoint closed subsets of a
normal space $X$. Then there is a continuous mapping $\varphi : X \to \langle 0, 1\rangle$ such that
$\varphi[A] \subseteq \{0\}$ and $\varphi[B] \subseteq \{1\}$.
Proof. Let $Q$ be the set of all dyadic rationals between 0 and 1, that is, the numbers
$$\frac{k}{2^n}, \quad n = 1, 2, \dots;\ k = 1, 2, \dots, 2^n - 1.$$
Choose disjoint open $U(\frac{1}{2})$, $V$ such that $A \subseteq U(\frac{1}{2})$ and $B \subseteq V$ (so that
$\overline{U(\frac{1}{2})} \subseteq X \setminus B$). Now let $U(\frac{k}{2^m})$ be already chosen for $m \le n$ so that
$$q < r \ \Rightarrow\ \overline{U(q)} \subseteq U(r).$$
For $k = 0, \dots, 2^n - 1$, choose disjoint open sets $U(\frac{2k+1}{2^{n+1}})$, $V$ such that
$$\overline{U(\tfrac{k}{2^n})} \subseteq U(\tfrac{2k+1}{2^{n+1}}) \quad\text{and}\quad X \setminus U(\tfrac{k+1}{2^n}) \subseteq V \quad (\text{and hence } \overline{U(\tfrac{2k+1}{2^{n+1}})} \subseteq U(\tfrac{k+1}{2^n}))$$
where for $k = 0$ we take the set $A$ instead of $\overline{U(0)}$ and for $k = 2^n - 1$ we take $B$ instead
of $X \setminus U(1)$.
Thus we obtain inductively a system $U(q)$, $q \in Q$, satisfying the requirements
of Lemma 5.4.5, and the statement follows. □
5.4.7 Remarks
1. In particular, every Lindelöf regular space is completely regular. It should be
noted that, with the exception of $T_3 \not\Rightarrow T_{3\frac{1}{2}}$, proving that a lower separation
axiom does not imply a higher one is easy. This exception, on the contrary,
was a hard nut to crack (and had been an open problem for quite some
time). Proposition 5.4.6 shows why: the counterexample has to use uncountable
reasoning in a substantial way.
2. Lemma 5.4.5 can be used to reformulate complete regularity without referring to
the real numbers. Recall 5.3.1. Denote by $\prec$ the relation
$$V \prec U \ \equiv_{\mathrm{df}}\ \overline{V} \subseteq U.$$
It is in general not interpolative (that is, for $V \prec U$ we generally do not necessarily
have a $W$ such that $V \prec W \prec U$). If we denote by $\prec\!\!\prec$ the largest interpolative subrelation
of $\prec$ then completely regular spaces can be characterized as those where each
open $U$ is the union $\bigcup \{V \mid V \prec\!\!\prec U\}$.
6 The space of continuous functions revisited:

The Arzelà-Ascoli Theorem and the Stone-Weierstrass
Theorem
Certain very strong theorems hold about the space $C(K)$ of (necessarily bounded)
continuous real functions on a compact metric space $K$ with the supremum metric
considered in 7.7 of Chapter 2. We will prove two such results in this section,
and use them in Chapters 10 and 17 below.
6.1 The Arzelà-Ascoli Theorem
A sequence of functions $f_n \in C(K)$ is called uniformly bounded if there exists a
number $M$ such that $|f_n(x)| < M$ for every $n$ and every $x \in K$. Therefore, being
uniformly bounded is the same thing as $f_n \in \Omega(0, M)$ for all $n$, for a fixed $M > 0$,
where $0$ is the constant zero function. Additionally, the sequence of functions $(f_n)_n$
is called equicontinuous if for every $\varepsilon > 0$, there exists a $\delta > 0$ such that for every
$x, y \in K$ and every $n \in \mathbb{N}$,
$$d(x, y) < \delta \ \Rightarrow\ |f_n(x) - f_n(y)| < \varepsilon.$$
Thus, this means that the functions $f_n$ are all uniformly continuous with the same
bound $\delta$ depending on $\varepsilon$, independent of $n$.
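The difference between an equicontinuous and a merely uniformly bounded sequence can be seen numerically on $K = \langle 0, 1\rangle$. In the following Python sketch (the function name `oscillation`, the grid size and the two example sequences are ours) we estimate, on a grid, the largest value of $|f(x) - f(y)|$ over pairs with $|x - y| < \delta$. For $f_n(x) = \sin(nx)/n$ this never exceeds $\delta$, whatever $n$ is, while for $g_n(x) = \sin(nx)$ it stays large for large $n$.

```python
import math

def oscillation(f, delta, grid=400):
    """Sampled estimate of sup |f(x) - f(y)| over x, y in [0, 1] with |x - y| < delta."""
    vals = [f(i / grid) for i in range(grid + 1)]
    width = math.ceil(delta * grid) - 1      # index gaps j - i with (j - i)/grid < delta
    best = 0.0
    for i in range(grid + 1):
        for j in range(i + 1, min(i + width, grid) + 1):
            best = max(best, abs(vals[i] - vals[j]))
    return best

delta = 0.05
# f_n(x) = sin(n x)/n satisfies |f_n(x) - f_n(y)| <= |x - y| for every n: equicontinuous
tame = max(oscillation(lambda x, n=n: math.sin(n * x) / n, delta) for n in range(1, 21))
# g_n(x) = sin(n x) is uniformly bounded, but no single delta works for all n
wild = oscillation(lambda x: math.sin(50 * x), delta)
```

A grid estimate can of course only bound the true supremum from below; the inequality for $f_n$ holds exactly because each $f_n$ has derivative bounded by 1.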
6.2 Theorem. (The Arzelà-Ascoli Theorem) Let $K$ be a compact metric space.
Then any uniformly bounded equicontinuous sequence of functions $(f_n)_n$ in $C(K)$
has a uniformly convergent subsequence (i.e. a subsequence convergent in $C(K)$).
Proof. By Theorem 2.2, the space $K$ is totally bounded. Therefore, for each $\varepsilon > 0$,
there is a finite subset $S_\varepsilon \subseteq K$ such that for every $x \in K$, $d(x, y) < \varepsilon$ for at least
one $y \in S_\varepsilon$.
Now let
$$S = \bigcup_k S_{1/k} = \{x_1, x_2, x_3, \dots\}.$$
By uniform boundedness, the sequence $(f_n(x_1))_n$ lies in a compact interval $\langle -M, M\rangle$
(recall Theorem 6.5 of Chapter 2), so there exists a subsequence $(f_{i_{1n}})_n$
of $(f_n)_n$ such that the sequence $(f_{i_{1n}}(x_1))_n$ converges. Next, there exists a subsequence
$(f_{i_{2n}})_n$ of $(f_{i_{1n}})_n$ such that $(f_{i_{2n}}(x_2))_n$ converges. Repeating this procedure, we may
successively pick subsequences $(f_{i_{jn}})_n$ such that $(f_{i_{jn}}(x_j))_n$ converges. Since we
picked each sequence as a subsequence of the previous one, the
“diagonal” subsequence $(f_{i_{nn}})_n$ converges at every point of $S$.
Now let $\varepsilon > 0$. Take $\delta = \delta(\varepsilon/3)$ from the definition of equicontinuity and choose $k$
such that $1/k < \delta$. Since $S_{1/k}$ is finite, there is an $N$ such
that for $m, n > N$, $|f_{i_{mm}}(s) - f_{i_{nn}}(s)| < \varepsilon/3$ for every $s \in S_{1/k}$. For every $x \in K$ there
exists an $s \in S_{1/k}$ with $d(x, s) < 1/k < \delta$, and hence, by the triangle inequality,
$$|f_{i_{mm}}(x) - f_{i_{nn}}(x)| \le |f_{i_{mm}}(x) - f_{i_{mm}}(s)| + |f_{i_{mm}}(s) - f_{i_{nn}}(s)| + |f_{i_{nn}}(s) - f_{i_{nn}}(x)| < \varepsilon$$
for every $x \in K$ and all $m, n > N$. This shows
that the subsequence $(f_{i_{nn}})_n$ is Cauchy in $C(K)$. Since however $C(K)$ is complete
(by Proposition 7.7.2), this subsequence converges in $C(K)$. □
Sometimes we are interested in working in the space $C(X)$ of bounded real
continuous functions on a space $X$ which is not compact. In that case, the
assumptions of equicontinuity and uniform boundedness, and consequently the
conclusion of uniform convergence, are often too strong. One strategy for getting
around this is the following: We say that a topological space $X$ is $\sigma$-compact if it is
a union of countably many compact subsets.
6.3 Theorem. Suppose that $X$ is a $\sigma$-compact metric space. Then every sequence
$(f_n)_n$ in $C(X)$ which is equicontinuous and bounded on every compact $K \subseteq X$ has
a subsequence which is uniformly convergent on every compact $K \subseteq X$.
Proof. We use the “diagonal method” one more time. Let
$$X = \bigcup_{n=1}^{\infty} K_n$$
for $K_n$ compact. Then using Theorem 6.2, choose a subsequence $(f_{i_{1n}})_n$ which
converges uniformly on $K_1$. Within this subsequence, choose another subsequence
$(f_{i_{2n}})_n$ which converges uniformly on $K_2$. Proceeding in the same way, keep
choosing consecutive subsequences, so that $(f_{i_{jn}})_n$ converges uniformly on $K_j$.
Then the “diagonal” subsequence $(f_{i_{nn}})_n$ satisfies the requirement. □
Another important problem in analysis is approximation, i.e. the problem of

finding a convenient subset dense in a given metric space X . We will now prove
a very strong approximation theorem for the space C.K/ of real functions on a
compact metric space K, for which we will find an application in Chapter 17 below,
in our treatment of Fourier series.
6.4 The Stone-Weierstrass Theorem: Assumptions and statement
Notice that the space $C(K)$ has the structure of a vector space over $\mathbb{R}$, and that the
operations of addition and multiplication by a scalar are continuous. In addition to
this, $C(K)$ also has an operation of product of functions, which is also continuous.
We will consider subsets $A \subseteq C(K)$ satisfying the following assumptions:
(1) $A$ is a vector subspace of $C(K)$, contains the constant function $1$ with value 1,
and for $f, g \in A$, we have $fg \in A$. (We say that $A$ is a unital subalgebra of
$C(K)$.)
(2) For any two distinct points $x, y \in K$, there exists a function $f \in A$ such that
$f(x) \ne f(y)$ (we say that $A$ separates points).
6.4.1 Theorem. (The Stone-Weierstrass Theorem) Let $A$ be a unital subalgebra of
$C(K)$ which separates points. Then $A$ is a dense subset of $C(K)$.
The proof of this theorem will occupy the remainder of this section. However,
let us observe one thing right away: since the operations of addition of functions,
multiplication of functions and multiplication by a scalar are continuous functions
$C(K) \times C(K) \to C(K)$, $\mathbb{R} \times C(K) \to C(K)$, the closure of a unital subalgebra
is a unital subalgebra. Therefore, the statement of the theorem will follow if we can
prove that every closed unital subalgebra of $C(K)$ which separates points is equal
to $C(K)$.
6.5
An important step in the proof of the theorem is the fact that the square root (and
hence the absolute value) of a non-negative continuous function on a bounded
compact interval is a uniform limit of polynomials. To prove this, we use the Taylor
expansion of $\sqrt{1 - x}$.
Lemma. Let $0 < b < 1$. Then the Taylor expansion of $\sqrt{1 - x}$ at the point $x = 0$
converges absolutely and uniformly in the interval $\langle -b, b \rangle$.
While it is possible to prove this fact in an elementary way, a much easier proof
will follow from the methods of complex analysis. Because of this, we will skip
the proof at this point, and refer the reader to Exercise (8) of Chapter 10 where we
define rigorously the function $\sqrt{1 - x}$ for $x \in \mathbb{C}$, $\operatorname{Re}(x) < 1$, and prove that the
(complex) radius of convergence of its Taylor series is 1.
Comment: In fact, using a lemma of Abel’s, the upper bound of uniform convergence
can be extended to 1. However, we do not need that fact.
6.6 Lemma. Let $A \subseteq C(K)$ be a closed unital subalgebra.
(1) If $f \in A$ and $f \ge 0$, then $\sqrt{f} \in A$.
(2) If $f \in A$, then $|f| \in A$.
(3) If $f, g \in A$, then $\max(f, g), \min(f, g) \in A$.
Proof. (1) Without loss of generality, $\max |f| < 1$. By Lemma 6.5,
$$\sqrt{f + 1/n} = \sum_{k=0}^{\infty} (-1)^k \binom{1/2}{k} (1 - 1/n - f)^k$$
converges uniformly for $n = 1, 2, \dots$, and hence
$$\sqrt{f + 1/n} \in A. \tag{6.6.1}$$
The function $\sqrt{x}$ is continuous, and hence, by Theorem 6.6 of Chapter 2, uniformly
continuous on $\langle 0, 2 \rangle$, which implies that $\sqrt{f + 1/n}$ converges to $\sqrt{f}$ uniformly
with $n \to \infty$, and hence $\sqrt{f} \in A$.
(2) This follows from the formula $|f| = \sqrt{f^2}$ and from (1).
(3) This follows from (2) and the fact that
$$\max(f, g) = \tfrac{1}{2}(f + g + |f - g|), \quad \min(f, g) = \tfrac{1}{2}(f + g - |f - g|).$$ □
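The uniform convergence claimed in Lemma 6.5 can be observed numerically. The following is only an illustrative sketch, not part of the proof: the function names, the choice $b = 0.9$ and the sampling grid are assumptions made for this example. It sums the series $\sum_k (-1)^k \binom{1/2}{k} x^k$ and reports the largest deviation from $\sqrt{1 - x}$ on a grid in $\langle -b, b \rangle$.

```python
def sqrt_one_minus_partial_sum(x, terms):
    """Partial sum (first `terms` terms) of the Taylor series of sqrt(1 - x) at 0."""
    total = 0.0
    coeff = 1.0   # holds (-1)^k * binom(1/2, k), starting at k = 0
    power = 1.0   # holds x^k
    for k in range(terms):
        total += coeff * power
        coeff *= -(0.5 - k) / (k + 1)   # pass from index k to index k + 1
        power *= x
    return total


def max_error(b, terms, samples=401):
    """Largest deviation from sqrt(1 - x) over a uniform grid in [-b, b]."""
    grid = [-b + 2 * b * j / (samples - 1) for j in range(samples)]
    return max(abs(sqrt_one_minus_partial_sum(x, terms) - (1 - x) ** 0.5)
               for x in grid)


for terms in (5, 20, 80, 320):
    print(terms, max_error(0.9, terms))
```

The printed errors shrink as more terms are taken, in line with the lemma; a grid check of this kind is of course no substitute for the proof referred to in the text.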
6.7 Proof of Theorem 6.4.1:
Let $A \subseteq C(K)$ be a closed unital subalgebra which separates points, and let $f \in C(K)$.
Given $\varepsilon > 0$, we will construct a $g \in A$ such that for every $x \in K$,
$$|f(x) - g(x)| < \varepsilon. \tag{*}$$
Since $\varepsilon > 0$ was arbitrary, this will imply that $f$ is a limit of a uniformly convergent
sequence of elements of $A$, and hence $f \in A$ since $A$ is closed. Since $f$ was
arbitrary, $A = C(K)$, which implies the statement of the theorem.
To construct $g$, consider two points $s \ne t \in K$. Since $A$ separates points, we
may choose $h \in A$ such that $h(s) \ne h(t)$. Now define, for $v \in K$,
$$f_{s,t}(v) = f(s) + (f(t) - f(s))\,\frac{h(v) - h(s)}{h(t) - h(s)}.$$
Clearly, $f_{s,t} \in A$, and
$$f_{s,t}(s) = f(s), \quad f_{s,t}(t) = f(t).$$
Now fixing $s$, let
$$U_t = \{v \in K \mid f_{s,t}(v) < f(v) + \varepsilon\}.$$
Then
$$U_t = (f_{s,t} - f)^{-1}[(-\infty, \varepsilon)],$$
and since $f_{s,t}, f$ are continuous, $U_t$ is open. On the other hand, $s, t \in U_t$, and hence
$(U_t)_{t \ne s}$ is an open cover of $K$. Since $K$ is compact, this open cover has a finite
subcover $(U_{t_1}, \dots, U_{t_m})$. Putting
$$h_s = \min(f_{s,t_1}, \dots, f_{s,t_m}),$$
we have $h_s \in A$ by Lemma 6.6, and
$$h_s < f + \varepsilon, \quad h_s(s) = f(s).$$
Now let
$$V_s = \{v \in K \mid h_s(v) > f(v) - \varepsilon\}.$$
Then
$$V_s = (h_s - f)^{-1}[(-\varepsilon, \infty)],$$
and hence $V_s$ is open. Since $s \in V_s$, $(V_s)_{s \in K}$ is an open cover of $K$. Since $K$ is
compact, this cover has a finite subcover $(V_{s_1}, \dots, V_{s_p})$. Let
$$g = \max(h_{s_1}, \dots, h_{s_p}).$$
Then $g \in A$, and
$$f - \varepsilon < g < f + \varepsilon,$$
as desired. □
7 Exercises
(1) Prove directly that a subspace of a separable metric space is separable.
(2) Prove that a subspace of a totally bounded metric space is totally bounded.
(3) Prove that a (finite) product of totally bounded metric spaces is totally bounded.
(4) Using Baire’s theorem, prove that an increasing function $f : \langle 0, 1 \rangle \to \mathbb{R}$ is
Lipschitz on a dense subset of $\langle 0, 1 \rangle$.
(5) Prove a modification of Baire’s Category Theorem where “complete metric
space” is replaced by “compact Hausdorff space”.
(6) Prove that an onto isometry of metric spaces is a homeomorphism.
(7) A rigorous construction of real numbers. Note carefully that the field of
real numbers R cannot be constructed as a completion of the metric space Q
of rational numbers directly using 4.1, since the definition of the metric in
Lemma 4.2 uses the real numbers, thereby making such an argument circular.
Nevertheless, this difficulty can be circumvented, and the approach of 4.1 can
be used to define R after all. Following the logically correct sequence of steps
is the point of the present exercise.
(a) Consider, on $\mathbb{Q}$, the metric $d(a, b) = |a - b|$. Now define $\mathbb{R}$ as the set of
equivalence classes of Cauchy sequences with respect to the equivalence
relation defined in 4.1. Prove that $\mathbb{R}$ is a field with respect to the operations
of addition and multiplication of Cauchy sequences, which contains $\mathbb{Q}$ as
the subfield of (equivalence classes of) constant sequences.
(b) Write, for a Cauchy sequence $x = (x_i)_i$ in $\mathbb{Q}$, $x > 0$ when there exists an
$N$ such that $x_i > 0$ for every $i > N$. Prove that if $x \sim y$, then $x > 0$ if
and only if $y > 0$. (Caution: note that this fails if we tried to use $\ge$ instead
of $>$.)
(c) Define, for $a \in \mathbb{R}$, $|a| = a$ when $a > 0$ and $|a| = -a$ $(= 0 - a)$
otherwise. Prove that $d(a, b) = |a - b|$ is a metric on $\mathbb{R}$ and that $\mathbb{R}$ is a
complete metric space with respect to this metric.
(d) The material of 4.1 is now rigorous without previously assuming a
construction of R. Verify (caution, it is very nearly a tautology) that the
metric space R is indeed the completion of the metric space Q as defined
in 4.1.
(8) Prove that any open set in $\mathbb{R}^n$ is $\sigma$-compact.
(9) Prove the following converse to the Arzelà-Ascoli Theorem: If $X$ is a compact
metric space and $(f_n)_n$ is a uniformly convergent sequence in $C(X)$, then it is
uniformly bounded and equicontinuous.
(10) Prove the following result known as the Weierstrass Approximation Theorem:
For a continuous function $f : \langle a, b \rangle \to \mathbb{R}$, there exists a sequence of
polynomials (with real coefficients) $p_n(x)$ which, when restricted to $\langle a, b \rangle$,
converge uniformly to $f$.
(11) Prove that the set of all polynomials in the variables $\sin(nx)$, $\cos(nx)$, $n = 0, 1, 2, \dots$
is dense in $C(\langle 0, 2\pi \rangle)$. Is the set of all polynomials in the variables
$\sin(nx)$, $n = 0, 1, 2, \dots$ dense in $C(\langle 0, 2\pi \rangle)$? Prove or disprove.
10 Complex Analysis I: Basic Concepts
In this chapter, we will develop the basic principles of the analysis of complex
functions of one complex variable. As we will see, using the results of Chapter 8,
these developments come almost for free. Yet, the results are of great significance.
On the one hand, complex analysis gives a perfect computation of the convergence
of a Taylor expansion, which is of use even if we are looking at functions of
one real variable (for example, power functions with a real power). On the other
hand, the very rigid, almost “algebraic”, behavior of holomorphic functions is a
striking mathematical phenomenon important for the understanding of areas of
higher mathematics such as algebraic geometry ([8]). In this chapter, the reader
will also see a proof of the Fundamental Theorem of Algebra and, in Exercise (4), a
version of the famous Jordan Theorem on simple curves in the plane.
1 The derivative of a complex function. Cauchy-Riemann conditions
1.1
From 1.2 of Chapter 1, recall the complex conjugate $\bar z = x - iy$ of $z = x + iy$ and
the absolute value $|z| = \sqrt{z \bar z}$, the easy rules
$$\overline{z_1 + z_2} = \bar z_1 + \bar z_2, \quad \overline{z_1 z_2} = \bar z_1 \bar z_2 \quad \text{and} \quad |z_1 z_2| = |z_1|\,|z_2|,$$
and the slightly harder triangle inequality
$$|z_1 + z_2| \le |z_1| + |z_2|.$$
Further recall from 4.2 of Chapter 8 that the set of complex numbers $\mathbb{C}$ is identified
with the Euclidean plane, with the distance $|z_1 - z_2|$ equal to Euclidean distance
in $\mathbb{R}^2$.
1.2
Let $U \subseteq \mathbb{C}$ be an open subset and let $f : U \to \mathbb{C}$ be a mapping, i.e. a complex
function of one variable. We can compute, in the field $\mathbb{C}$, the values
$\frac{1}{h}(f(z + h) - f(z))$ for $h \ne 0 = 0 + i0$, and, analogously to the case of real functions of one
variable, consider the limit
$$\lim_{h \to 0} \frac{f(z + h) - f(z)}{h},$$
(but this time in the metric space $\mathbb{C}$), if it exists. If the limit exists, we speak (again)
of a derivative of $f$ in $z$. More generally, one can introduce, in the obvious way,
partial derivatives of functions $f : U_1 \times \dots \times U_n \to \mathbb{C}$ of several complex variables.
One uses the same notation as in the real case:
$$f', \quad f'(z), \quad \frac{df}{dz}, \quad \text{etc.}$$
By precisely the same procedure as in the real case we can prove the formulas
$$(f + g)' = f' + g', \quad (\alpha f)' = \alpha f', \quad (fg)' = f'g + fg'$$
(the second of which concerns the multiplication by a complex constant), the
composition rule
$$(f \circ g)'(z) = f'(g(z)) \cdot g'(z)$$
and the formula $(z^n)' = n z^{n-1}$, so we can take derivatives of polynomials exactly
as in the real case.
1.3
What we cannot do, however, is adopt the interpretation of a derivative as describing
a tangent, or expressing smoothness, as in the real case. The function $f(z) = \bar z$ is
certainly as smooth as a map can be: geometrically it is just mirroring the plane
along the real axis. But we have here
$$\frac{f(z + h) - f(z)}{h} = \frac{\overline{z + h} - \bar z}{h} = \frac{\bar h}{h},$$
an expression that has no limit for $h$ approaching 0: on the real axis, i.e. for
$h = h_1 + i0$, we have constantly the value $\frac{\bar h}{h} = \frac{h_1}{h_1} = 1$ while on the imaginary axis, i.e.
for $h = 0 + ih_2$, we have $\frac{\bar h}{h} = \frac{-h_2}{h_2} = -1$.
In other words, while the condition of existence of a complex derivative does imply
the existence of the total differential of the function $f$ considered as a map $\mathbb{R}^2 \to \mathbb{R}^2$
(or $U \to \mathbb{R}^2$ where $U$ is an open set in $\mathbb{R}^2$), the converse is not true: the existence
of a complex derivative is a much stronger condition. We will see below in 5.3
that it has a different interpretation, namely of $f$ preserving orientation and angles:
smoothness follows.
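The failure of the limit for $f(z) = \bar z$ can be seen directly in a small numerical experiment. This is a sketch only: the sample point, the step sizes and the helper name `difference_quotient` are choices made for this illustration.

```python
def difference_quotient(f, z, h):
    """The quotient (f(z + h) - f(z)) / h for a complex increment h != 0."""
    return (f(z + h) - f(z)) / h


def conjugate(z):
    return z.conjugate()


def square(z):
    return z * z


z = 1.0 + 2.0j
for step in (1e-2, 1e-4, 1e-6):
    along_real = difference_quotient(conjugate, z, step)            # h = h1 + i0
    along_imaginary = difference_quotient(conjugate, z, 1j * step)  # h = 0 + i h2
    print(step, along_real, along_imaginary)

# For comparison, the polynomial z^2 gives (nearly) the same value 2z in both directions.
print(difference_quotient(square, z, 1e-6), difference_quotient(square, z, 1e-6j))
```

For the conjugation the two columns stay at $1$ and $-1$ however small the step is, while for $z^2$ both quotients approach $2z$.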
1.4 Cauchy–Riemann conditions
Writing $z = x + iy$, we can view a complex function $f : U \to \mathbb{C}$ as
$$f(z) = P(x, y) + iQ(x, y)$$
where $P, Q$ are real functions in two real variables. We will now show that the
differentiability of $f$ implies certain equations between the partial derivatives of
$P$ and $Q$.
1.4.1 Theorem. Let a complex function $f$ have a (complex) derivative at a point
$z = x + iy$. Then the functions $P, Q$ have partial derivatives at $(x, y)$ and we have
$$\frac{\partial P(x, y)}{\partial x} = \frac{\partial Q(x, y)}{\partial y} \quad \text{and} \quad \frac{\partial P(x, y)}{\partial y} = -\frac{\partial Q(x, y)}{\partial x}. \tag{CR}$$
The derivative of $f$ is then given by the formulas
$$f'(z) = \frac{\partial P(x, y)}{\partial x} + i\,\frac{\partial Q(x, y)}{\partial x} = \frac{\partial Q(x, y)}{\partial y} - i\,\frac{\partial P(x, y)}{\partial y}.$$
Remark. The equations (CR) are referred to as the Cauchy-Riemann conditions.
The theorem says that these conditions are necessary for complex differentiability.
We will show in Theorem 1.5 below that the conditions are also sufficient when
$f$ is continuously differentiable. A theorem of Looman and Menchoff states,
more generally, that the conditions are also sufficient assuming only that $f$ is
continuous, but we will not need that result here. The conditions (CR) alone,
without any additional assumption on $f$, however, do not imply differentiability
(see Exercise (2)).
Proof. Put $h = h_1 + ih_2$. We have
$$\frac{1}{h}(f(z + h) - f(z)) = \frac{1}{h_1 + ih_2}\bigl(P(x + h_1, y + h_2) - P(x, y)\bigr) + \frac{i}{h_1 + ih_2}\bigl(Q(x + h_1, y + h_2) - Q(x, y)\bigr). \tag{*}$$
For $h_2 = 0$ (and $h_1 \ne 0$) this yields in particular
$$\frac{1}{h_1}\bigl(P(x + h_1, y) - P(x, y)\bigr) + \frac{i}{h_1}\bigl(Q(x + h_1, y) - Q(x, y)\bigr) \tag{**}$$
while for $h_1 = 0$ (and $h_2 \ne 0$) we obtain
$$-\frac{i}{h_2}\bigl(P(x, y + h_2) - P(x, y)\bigr) + \frac{1}{h_2}\bigl(Q(x, y + h_2) - Q(x, y)\bigr). \tag{***}$$
If the expression (*) has a limit for $h \to 0$, the expression (**) has the same limit
for $h_1 \to 0$, namely
$$\frac{\partial P(x, y)}{\partial x} + i\,\frac{\partial Q(x, y)}{\partial x} \quad (= f'(z))$$
and similarly (***) yields
$$-i\,\frac{\partial P(x, y)}{\partial y} + \frac{\partial Q(x, y)}{\partial y} \quad (= f'(z)).$$
Comparing the real and the imaginary parts, we obtain the desired equations. □
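The conditions (CR) are easy to test numerically for a concrete function. The sketch below is an illustration under stated assumptions, not a proof: it estimates the four partial derivatives by central differences (the step `h = 1e-6`, the sample point and the helper names are choices made here) and measures by how much the two equations in (CR) fail.

```python
def partials(f, x, y, h=1e-6):
    """Central-difference estimates (P_x, P_y, Q_x, Q_y) for f = P + iQ at x + iy."""
    d_dx = (f(complex(x + h, y)) - f(complex(x - h, y))) / (2 * h)
    d_dy = (f(complex(x, y + h)) - f(complex(x, y - h))) / (2 * h)
    return d_dx.real, d_dy.real, d_dx.imag, d_dy.imag


def cauchy_riemann_defect(f, x, y):
    """max(|P_x - Q_y|, |P_y + Q_x|); close to 0 when (CR) holds at x + iy."""
    p_x, p_y, q_x, q_y = partials(f, x, y)
    return max(abs(p_x - q_y), abs(p_y + q_x))


def cube(z):
    return z ** 3


def conjugate(z):
    return z.conjugate()


print(cauchy_riemann_defect(cube, 0.7, -0.3))       # tiny: z^3 satisfies (CR)
print(cauchy_riemann_defect(conjugate, 0.7, -0.3))  # about 2: conjugation does not
```

For $f(z) = \bar z$ we have $P = x$, $Q = -y$, so $\partial P/\partial x = 1$ while $\partial Q/\partial y = -1$, which is the defect of size 2 reported by the second line.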
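1.5 Theorem. Let $P, Q$ be real functions of two variables with continuous partial
derivatives, let $f(z) = P(x, y) + iQ(x, y)$ and let the conditions (CR) be satisfied
at some point $z = x + iy \in U$. Then $f$ has a derivative in $z$.
Proof. We have
$$\begin{aligned}
\frac{1}{h}(f(z + h) - f(z)) &= \frac{1}{h}\bigl(P(x + h_1, y + h_2) - P(x, y) + iQ(x + h_1, y + h_2) - iQ(x, y)\bigr) \\
&= \frac{1}{h}\bigl(P(x + h_1, y + h_2) - P(x + h_1, y) + P(x + h_1, y) - P(x, y) \\
&\qquad + i(Q(x + h_1, y + h_2) - Q(x + h_1, y) + Q(x + h_1, y) - Q(x, y))\bigr).
\end{aligned}$$
Denote the right-hand side by $u$. Using the Mean Value Theorem and (CR), we
obtain
$$\begin{aligned}
P(x + h_1, y + h_2) &- P(x + h_1, y) + P(x + h_1, y) - P(x, y) \\
&= \frac{\partial P(x + h_1, y + \alpha h_2)}{\partial y} h_2 + \frac{\partial P(x + \beta h_1, y)}{\partial x} h_1 \\
&= -\frac{\partial Q(x + h_1, y + \alpha h_2)}{\partial x} h_2 + \frac{\partial P(x + \beta h_1, y)}{\partial x} h_1
\end{aligned}$$
and similarly
$$\begin{aligned}
Q(x + h_1, y + h_2) &- Q(x + h_1, y) + Q(x + h_1, y) - Q(x, y) \\
&= \frac{\partial Q(x + h_1, y + \gamma h_2)}{\partial y} h_2 + \frac{\partial Q(x + \delta h_1, y)}{\partial x} h_1 \\
&= \frac{\partial P(x + h_1, y + \gamma h_2)}{\partial x} h_2 + \frac{\partial Q(x + \delta h_1, y)}{\partial x} h_1,
\end{aligned}$$
with some $0 < \alpha, \beta, \gamma, \delta < 1$. Thus, setting $h = h_1 + ih_2$,
$$\begin{aligned}
u &= \frac{1}{h}\Bigl[\frac{\partial P(x + h_1, y + \gamma h_2)}{\partial x}(h_1 + ih_2) + i\,\frac{\partial Q(x + \delta h_1, y)}{\partial x}(h_1 + ih_2) \\
&\qquad + \Bigl(\frac{\partial P(x + \beta h_1, y)}{\partial x} - \frac{\partial P(x + h_1, y + \gamma h_2)}{\partial x}\Bigr) h_1
 - \Bigl(\frac{\partial Q(x + h_1, y + \alpha h_2)}{\partial x} - \frac{\partial Q(x + \delta h_1, y)}{\partial x}\Bigr) h_2\Bigr] \\
&= \frac{\partial P(x + h_1, y + \gamma h_2)}{\partial x} + i\,\frac{\partial Q(x + \delta h_1, y)}{\partial x} + d_1 \frac{h_1}{h} + d_2 \frac{h_2}{h}
\end{aligned}$$
and since the differences $d_1, d_2$ tend to 0 (the partial derivatives being continuous) and
$\left|\frac{h_i}{h}\right| \le 1$, the statement follows. □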
1.6 Holomorphic functions
A complex function $f : U \to \mathbb{C}$ on an open set $U \subseteq \mathbb{C}$ with continuous
partial derivatives which satisfies the Cauchy-Riemann conditions is said to be
holomorphic. It can be shown that a complex function is holomorphic on $U$ if and
only if it has a complex derivative on $U$. (By what we already proved, sufficiency is
the non-trivial part.) This is the famous theorem of Goursat which can be found, for
example, in [1].
From the chain rule, it is again immediate that for holomorphic functions $f, g$ in
an open set $U$, $f + g$, $f - g$, $f \cdot g$ are holomorphic, as is $\frac{f}{g}$ provided that $g$ is
non-zero at all points of $U$.
1.7
Recall the complex line integral from Section 4 of Chapter 8. Later we will need the
following fact. It is an easy consequence of 3.7 and 4.4 of Chapter 8, but we shall
spell things out, mainly to exercise the Cauchy-Riemann conditions.
Theorem. Let $f(\zeta, z)$ be a continuous complex function of two variables which
is holomorphic in $\zeta$ in some open set $U \subseteq \mathbb{C}$. Then the complex line integral $\int_L$
satisfies
$$\frac{d}{d\zeta} \int_L f(\zeta, z)\,dz = \int_L \frac{\partial f(\zeta, z)}{\partial \zeta}\,dz.$$
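Proof. Set $F(\zeta) = \int_L f(\zeta, z)\,dz$ and write $f(\zeta, z) = P(\alpha, \beta, x, y) + iQ(\alpha, \beta, x, y)$
where $\zeta = \alpha + i\beta$. From 4.4 of Chapter 8, we see that
$$F(\zeta) = \mathcal{P}(\alpha, \beta) + i\,\mathcal{Q}(\alpha, \beta)$$
where
$$\begin{aligned}
\mathcal{P}(\alpha, \beta) &= (\mathrm{II})\!\int_L \bigl(P(\alpha, \beta, x, y), -Q(\alpha, \beta, x, y)\bigr) = \int_L \bigl(P(\alpha, \beta, x, y)\,dx - Q(\alpha, \beta, x, y)\,dy\bigr), \\
\mathcal{Q}(\alpha, \beta) &= (\mathrm{II})\!\int_L \bigl(Q(\alpha, \beta, x, y), P(\alpha, \beta, x, y)\bigr) = \int_L \bigl(Q(\alpha, \beta, x, y)\,dx + P(\alpha, \beta, x, y)\,dy\bigr).
\end{aligned}$$
Since $f$ is holomorphic in $\zeta$, we have
$$\frac{\partial P}{\partial \alpha} = \frac{\partial Q}{\partial \beta} \quad \text{and} \quad \frac{\partial P}{\partial \beta} = -\frac{\partial Q}{\partial \alpha}$$
so that by 3.7 of Chapter 8,
$$\begin{aligned}
\frac{\partial \mathcal{P}}{\partial \alpha} &= (\mathrm{II})\!\int_L \Bigl(\frac{\partial P}{\partial \alpha}, -\frac{\partial Q}{\partial \alpha}\Bigr) = (\mathrm{II})\!\int_L \Bigl(\frac{\partial Q}{\partial \beta}, \frac{\partial P}{\partial \beta}\Bigr) = \frac{\partial \mathcal{Q}}{\partial \beta}, \\
\frac{\partial \mathcal{P}}{\partial \beta} &= (\mathrm{II})\!\int_L \Bigl(\frac{\partial P}{\partial \beta}, -\frac{\partial Q}{\partial \beta}\Bigr) = -(\mathrm{II})\!\int_L \Bigl(\frac{\partial Q}{\partial \alpha}, \frac{\partial P}{\partial \alpha}\Bigr) = -\frac{\partial \mathcal{Q}}{\partial \alpha}
\end{aligned} \tag{1.7.1}$$
so that $F(\zeta)$ is holomorphic and hence has a derivative. By 1.4,
$\frac{\partial f}{\partial \zeta} = \frac{\partial P}{\partial \alpha} + i\,\frac{\partial Q}{\partial \alpha}$
and hence by (1.7.1) and 1.4 again,
$$\int_L \frac{\partial f(\zeta, z)}{\partial \zeta}\,dz = (\mathrm{II})\!\int_L \Bigl(\frac{\partial P}{\partial \alpha}, -\frac{\partial Q}{\partial \alpha}\Bigr) + i\,(\mathrm{II})\!\int_L \Bigl(\frac{\partial Q}{\partial \alpha}, \frac{\partial P}{\partial \alpha}\Bigr) = \frac{\partial \mathcal{P}}{\partial \alpha} + i\,\frac{\partial \mathcal{Q}}{\partial \alpha} = \frac{dF}{d\zeta}.$$ □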
2 From the complex line integral to primitive functions
2.1 Theorem. Let $U$ be a domain in $\mathbb{C}$. Let $L_1, \dots, L_k$ be simple piecewise
continuously differentiable closed curves with disjoint images such that
$L_1 \amalg \dots \amalg L_k$ is the boundary of $U$ oriented counter-clockwise (see 5.2 of
Chapter 8). Let $f$ be a holomorphic function defined on an open set $V$ containing $\overline{U}$. Then
the complex line integral of $f$ satisfies
$$\int_{L_1} f(z)\,dz + \dots + \int_{L_k} f(z)\,dz = 0.$$
Proof. Put again $f(z) = P(x, y) + iQ(x, y)$. By 4.4 of Chapter 8, we have
$$\int_{L_i} f = (\mathrm{II})\!\int_{L_i} (P, -Q) + i\,(\mathrm{II})\!\int_{L_i} (Q, P)$$
and by Green’s formula (5.4.1) of Chapter 8, the sum of these terms over $i = 1, \dots, k$ is equal to
$$\int_U \Bigl(-\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\Bigr) + i \int_U \Bigl(\frac{\partial P}{\partial x} - \frac{\partial Q}{\partial y}\Bigr).$$
By the Cauchy-Riemann conditions, both the summands are zero. □
2.2
Consider two oriented simple arcs $P_1, P_2$ expressed by parametrizations
$\varphi_i : \langle \alpha_i, \beta_i \rangle \to \mathbb{C}$ such that $\varphi_1(\alpha_1) = \varphi_2(\alpha_2)$ and $\varphi_1(\beta_1) = \varphi_2(\beta_2)$ and
$\varphi_1(x) \ne \varphi_2(y)$ unless $x = \alpha_1$ and $y = \alpha_2$ or $x = \beta_1$ and $y = \beta_2$. Then
$L = -P_2 + P_1$ is a piecewise continuously differentiable simple closed curve. If
$L$ is the boundary of a domain $U$ and $f$ is holomorphic on an open subset of $\mathbb{C}$
containing $\overline{U}$, then by 2.1,
$$\int_{P_1} f = \int_{P_2} f.$$
2.3
Let $f$ be holomorphic in a convex open set $U \subseteq \mathbb{C}$. For $a, b \in U$ define
$$\int_a^b f(z)\,dz = \int_{L(a,b)} f(z)\,dz$$
where $L(a, b)$ is parametrized by $\varphi : \langle 0, 1 \rangle \to \mathbb{C}$, $\varphi(t) = a + t(b - a)$.
Fix $a \in U$ and write for $u \in U$,
$$F(u) = \int_a^u f(z)\,dz.$$
Theorem. We have $F'(z) = f(z)$.
Proof. We claim that
$$\int_{L(a,u+h)} f(z)\,dz = \int_{L(a,u)} f(z)\,dz + \int_{L(u,u+h)} f(z)\,dz. \tag{2.3.1}$$
In effect, this is trivial when the points $a$, $u$ and $u + h$ are collinear. Otherwise the
piecewise continuously differentiable simple curves $P_1 = L(a, u + h)$ and
$P_2 = L(a, u) + L(u, u + h)$ satisfy the assumptions of 2.2 and hence (2.3.1)
follows from 3.4 and 4.4 of Chapter 8. Now, by (2.3.1),
$$\begin{aligned}
\frac{1}{h}(F(u + h) - F(u)) &= \frac{1}{h} \int_0^1 f(u + th)\,h\,dt = \int_0^1 f(u + th)\,dt \\
&= \int_0^1 P(u + th)\,dt + i \int_0^1 Q(u + th)\,dt
\end{aligned}$$
which with real $h \to 0$ approaches $P(u) + iQ(u)$, by the Mean Value Theorem. □
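The construction of $F$ can be tried out numerically. The sketch below is an illustration only: it approximates the integral over the segment $L(a, b)$ by the midpoint rule (the number of sample points, the test function $\cos$ and the names `segment_integral` and `primitive` are assumptions made for this example) and compares a difference quotient of $F$ with $f$.

```python
import cmath


def segment_integral(f, a, b, samples=2000):
    """Midpoint-rule value of the integral of f over L(a, b), phi(t) = a + t(b - a)."""
    step = (b - a) / samples                      # phi'(t) dt for one subinterval
    return sum(f(a + (j + 0.5) * step) for j in range(samples)) * step


def primitive(f, a):
    """The function F(u) = integral of f from a to u along the segment L(a, u)."""
    return lambda u: segment_integral(f, a, u)


F = primitive(cmath.cos, 0j)
u = 0.4 + 0.3j
print(F(u), cmath.sin(u))                         # F should agree with sin here

for h in (1e-4, 1e-4j):                           # a real and an imaginary increment
    print((F(u + h) - F(u)) / h, cmath.cos(u))    # both quotients are close to f(u)
```

Both the real and the imaginary increment give a quotient close to $\cos(u)$, as the theorem predicts for a holomorphic integrand.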
2.4 Comment
By analogy with the theory of real functions, we call $F$ a primitive function of $f$ if
$F' = f$. It is easy to observe that the difference between two primitive functions on
an open set is locally constant, i.e. constant on each connected component. Indeed,
by 1.4, we can reduce this to the fact that a real function with partial derivatives
equal to 0 on an open set is locally constant. In particular, on a convex open set $U$,
any two primitive functions differ by a constant.
2.5
It is curious to observe that the proof of Theorem 2.3 can be “transported” (with only
minor modifications) by a (real) injective regular map. More precisely, identifying
$\mathbb{C}$ with $\mathbb{R}^2$, let $\varphi : U \to V$ be a bijective regular map in the sense of Subsection 7.1
of Chapter 3. Then the proof of Theorem 2.3 remains valid with the line segments
$L(a, b)$ replaced by their $\varphi$-images. We obtain therefore the following
Proposition. If $V$ is an open set in $\mathbb{C}$ such that there exists a bijective (real) regular
map $\varphi : U \to V$ where $U$ is convex, then every holomorphic function on $V$ has a
primitive function.
As it turns out, the converse is also true. In fact, in Section 1 of Chapter 13, we
shall prove much more, namely that unless $U = \mathbb{C}$, the map $\varphi$ can be chosen to be
holomorphic. This is the famous Riemann Mapping Theorem.
3 Cauchy’s formula
3.1 Lemma. Let $K_r$ be a circle with center in a point $z$ and radius $r > 0$, oriented
counter-clockwise. Then we have
$$\int_{K_r} \frac{d\zeta}{\zeta - z} = 2\pi i.$$
Proof. Parametrize $K_r$ by
$$\varphi : \langle 0, 2\pi \rangle \to \mathbb{C}, \quad \varphi(t) = z + r(\cos(t) + i \sin(t)).$$
Then we have $\varphi'(t) = r(-\sin(t) + i \cos(t))$, and hence
$$\int_{K_r} \frac{d\zeta}{\zeta - z} = \int_0^{2\pi} \frac{r(-\sin(t) + i \cos(t))}{r(\cos(t) + i \sin(t))}\,dt = \int_0^{2\pi} i\,dt = 2\pi i.$$ □
3.2
Notice that the integral computed in 3.1 is not required to vanish by Theorem 2.1
because the integrand is not defined (and in fact, goes to infinity) at $\zeta = z$.
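The contrast between 3.1 and Theorem 2.1 can be checked with a short computation. The sketch below is only illustrative: it uses the parametrization from the proof of 3.1 and replaces the integral over $t$ by a Riemann sum (the number of sample points, the sample values of $z$ and $r$, and the name `circle_integral` are choices made for this example).

```python
import math


def circle_integral(g, center, radius, samples=2000):
    """Riemann-sum approximation of the integral of g over the counter-clockwise
    circle phi(t) = center + radius (cos t + i sin t), 0 <= t <= 2 pi."""
    total = 0j
    for j in range(samples):
        t = 2 * math.pi * j / samples
        point = center + radius * complex(math.cos(t), math.sin(t))
        velocity = radius * complex(-math.sin(t), math.cos(t))   # phi'(t)
        total += g(point) * velocity
    return total * (2 * math.pi / samples)


z = 0.3 + 0.2j
print(circle_integral(lambda zeta: 1 / (zeta - z), z, 1.5))   # close to 2 pi i
print(circle_integral(lambda zeta: zeta ** 2, z, 1.5))        # close to 0
```

The first integrand has a singularity at the center of the circle and the integral is $2\pi i$; the second is holomorphic on the whole disk and the integral vanishes up to rounding.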
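3.3 Theorem. (Cauchy’s formula) Let $f$ be holomorphic in an open disk $\Omega(z, R)$
with $R > r > 0$. Then we have
$$\frac{1}{2\pi i} \int_{K_r} \frac{f(\zeta)}{\zeta - z}\,d\zeta = f(z).$$
Proof. We have
$$\int_{K_r} \frac{f(\zeta)}{\zeta - z}\,d\zeta = \int_{K_r} \frac{f(z)}{\zeta - z}\,d\zeta + \int_{K_r} \frac{f(\zeta) - f(z)}{\zeta - z}\,d\zeta.$$
The first summand on the right-hand side is equal to $2\pi i f(z)$ by 3.1. We shall prove
that the second summand is 0. Since
$$f'(z) = \lim_{\zeta \to z} \frac{f(\zeta) - f(z)}{\zeta - z},$$
the quantity $\frac{f(\zeta) - f(z)}{\zeta - z}$ is bounded on the set $U \setminus \{z\}$ for some open neighborhood
$U$ of $z$ (and hence, by continuity, on $\overline{\Omega}(z, r) \setminus \{z\}$). Let
$$\left| \frac{f(\zeta) - f(z)}{\zeta - z} \right| < A \quad \text{in } \overline{\Omega}(z, r) \setminus \{z\}.$$
By Lemma 4.5 of Chapter 8, for $0 < s < r$, we have
$$\left| \int_{K_s} \frac{f(\zeta) - f(z)}{\zeta - z}\,d\zeta \right| \le 4A \cdot 2\pi s = 8A\pi s.$$
In particular,
$$\lim_{s \to 0} \int_{K_s} \frac{f(\zeta) - f(z)}{\zeta - z}\,d\zeta = 0. \tag{*}$$
Now we will apply 2.1 to
$$U = \Omega(z, r) \setminus \overline{\Omega}(z, s),$$
with $k = 2$, $L_1 = K_r$, $L_2 = -K_s$. By (*), we have
$$\int_{K_r} \frac{f(\zeta) - f(z)}{\zeta - z}\,d\zeta = \lim_{s \to 0} \left( \int_{K_r} \frac{f(\zeta) - f(z)}{\zeta - z}\,d\zeta - \int_{K_s} \frac{f(\zeta) - f(z)}{\zeta - z}\,d\zeta \right) = \text{(by 2.1)} = \lim_{s \to 0} 0 = 0.$$ □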
3.4 Theorem. A holomorphic complex function on an open set $U$ has complex
derivatives of all orders on $U$.
Proof. By 1.7, we may differentiate the integrand in Cauchy’s
formula repeatedly by $z$, giving
$$f^{(k)}(z) = \frac{k!}{2\pi i} \int_{K_r} \frac{f(\zeta)}{(\zeta - z)^{k+1}}\,d\zeta. \tag{3.4.1}$$ □
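Formula (3.4.1) can be evaluated numerically for a function whose derivatives are known. The following sketch is an illustration under stated assumptions: the circle $K_r$ is sampled at equally spaced parameter values, and the test function $e^z$, the radius, the number of samples and the name `cauchy_derivative` are choices made for this example. The case $k = 0$ is Cauchy's formula itself.

```python
import cmath
import math


def cauchy_derivative(f, z, k, radius=1.0, samples=400):
    """Approximate f^(k)(z) = k!/(2 pi i) * integral over K_r of f(zeta)/(zeta - z)^(k+1)."""
    total = 0j
    for j in range(samples):
        t = 2 * math.pi * j / samples
        direction = complex(math.cos(t), math.sin(t))
        zeta = z + radius * direction
        velocity = 1j * radius * direction            # derivative of the parametrization
        total += f(zeta) / (zeta - z) ** (k + 1) * velocity
    integral = total * (2 * math.pi / samples)
    return math.factorial(k) / (2j * math.pi) * integral


z = 0.5 - 0.25j
for k in range(4):
    # every derivative of exp equals exp, so each line should repeat exp(z)
    print(k, cauchy_derivative(cmath.exp, z, k), cmath.exp(z))
```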
3.5 Corollary. A continuous complex function $f$ on a convex open set in $\mathbb{C}$ is
holomorphic if and only if it has a primitive function.
Proof. If $f$ is holomorphic then it has a primitive function $F$ by Theorem 2.3. If $f$
has a primitive function $F$ then $F$ is holomorphic since $f$ is continuous. Now apply
Theorem 3.4 to the function $F$. □
We also get the following
3.6 Theorem. (Weierstrass’s Theorem) Suppose that $f_n$ is a sequence of holomorphic
functions defined on an open set $U \subseteq \mathbb{C}$ which converge to a function $f(z)$
uniformly on every compact subset of $U$. Then $f$ is a holomorphic function on $U$.
Furthermore, $f_n'$ converge to $f'$ uniformly on every compact subset of $U$.
Proof. Using Cauchy’s formula (Theorem 3.3) with $f$ replaced by $f_n$, and taking
the limit after the integral sign using Lebesgue’s Dominated Convergence Theorem
implies the same formula for $f$, proving that $f$ is holomorphic. Further, using the
same argument on formula (3.4.1) ($k = 1$), we see that $f_n'$ converges to $f'$, and
further that the convergence is uniform in a disk with center $z$ and radius $r/2$. A
compact set is covered by finitely many such disks by the Heine-Borel Theorem 2.3
of Chapter 9, which implies that the convergence of derivatives is uniform on a
compact set. □
The following result will be useful for applying the Arzelà-Ascoli Theorem 6.2
to sequences of analytic functions.
3.7 Theorem. A sequence $(f_n)_n$ of holomorphic functions defined on an open
set $U \subseteq \mathbb{C}$ which is uniformly bounded on every compact subset $K \subseteq U$ is
equicontinuous on every compact subset $K \subseteq U$.
Proof. Let $f$ be holomorphic on $U$, let $z_0 \in U$, and assume $\overline{\Omega}(z_0, r) \subseteq U$. Let $M$ be the boundary of $\Omega(z_0, r)$,
oriented counterclockwise. For $z \in \Omega(z_0, r)$, we get
$$f(z) - f(z_0) = \frac{1}{2\pi i} \int_M \left( \frac{1}{\zeta - z} - \frac{1}{\zeta - z_0} \right) f(\zeta)\,d\zeta = \frac{z - z_0}{2\pi i} \int_M \frac{f(\zeta)\,d\zeta}{(\zeta - z)(\zeta - z_0)}. \tag{*}$$
If $|f(t)| < C$ for all $t \in M$, and if $z \in \Omega(z_0, r/2)$, then the absolute value of the right-hand side of (*) is
less than or equal to
$$\frac{4C|z - z_0|}{r}. \tag{C}$$
Now let $K \subseteq U$ be a compact subset. We claim that there exists an $r > 0$ and a
compact set $L$, $K \subseteq L \subseteq U$ such that every point of distance $\le r$ from some point
of $K$ belongs to $L$.
(For every point $x \in K$, there is a number $s(x) > 0$ such that $\Omega(x, s(x)) \subseteq U$.
By the Heine-Borel Theorem 2.3 of Chapter 9, $K$ is covered by finitely many of the
open disks $\Omega(x_i, s(x_i)/3)$, for some points $x_i$, $i = 1, \dots, k$. Let $s = \min\{s(x_i) \mid i = 1, \dots, k\}$.
Then we may put $r = s/3$, $L = \bigcup_{i=1}^{k} \overline{\Omega}(x_i, 2s(x_i)/3)$.)
Now let $C$ be a uniform bound on $|f_n(z)|$ for $z \in L$. Then in (C) we may always
use these values of $C$ and $r$. We see that then at least for $z, t \in K$, $|z - t| < r/2$,
$$|f_n(z) - f_n(t)| < \frac{4C|z - t|}{r},$$
which implies equicontinuity on $K$. □
Note that in the preceding proof, we have proved more than equicontinuity,
namely a uniform Lipschitz constant.
3.8 Remarks
1. Note that the statements 3.4 and 3.5 are in sharp contrast with real analysis.
2. We will see that Cauchy’s formula in complex analysis plays an analogous role
to the Mean Value Theorem in real analysis. It is, however, a much stronger tool,
which makes certain concepts (such as the Taylor series) much easier.
3. Realize the role of the integrand $\frac{f(\zeta)}{\zeta - z}$ going to infinity at the point $z$. Note that
all the information about the integral in 3.3 is contained in an arbitrarily small
neighborhood of $z$.
4. By the same argument, the circle $K_r$ could be replaced by any closed simple
curve $L$ which is the boundary of a domain $U$ oriented counter-clockwise and
such that $z \in U$.
4 Taylor’s formula, power series, and a uniqueness theorem
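4.1 Theorem. (Taylor’s formula) Let $f$ be holomorphic in a neighborhood of a
point $c \in \mathbb{C}$. Then, in a sufficiently small neighborhood of $c$, we have
$$f(z) = f(c) + \frac{1}{1!} f'(c)(z - c) + \frac{1}{2!} f''(c)(z - c)^2 + \dots + \frac{1}{n!} f^{(n)}(c)(z - c)^n + \dots.$$
Proof. We have
$$\frac{1}{\zeta - z} = \frac{1}{\zeta - c} \cdot \frac{1}{1 - \frac{z - c}{\zeta - c}}. \tag{*}$$
Consider a circle $K_r$ with center $c$ and radius $r$ such that $f$ is holomorphic in
$\Omega(c, R)$ for some $R > r$. Let $z$ be such that $|z - c| < r$, so that
$$\left| \frac{z - c}{\zeta - c} \right| < 1$$
for a point $\zeta$ of the circle $K_r$. From (*), we obtain
$$\begin{aligned}
\frac{1}{\zeta - z} &= \frac{1}{\zeta - c} \left( 1 + \frac{z - c}{\zeta - c} + \left( \frac{z - c}{\zeta - c} \right)^2 + \dots + \left( \frac{z - c}{\zeta - c} \right)^n + \dots \right) \\
&= \frac{1}{\zeta - c} + (z - c) \frac{1}{(\zeta - c)^2} + (z - c)^2 \frac{1}{(\zeta - c)^3} + \dots + (z - c)^n \frac{1}{(\zeta - c)^{n+1}} + \dots.
\end{aligned}$$
Thus, from Cauchy’s formula and Lebesgue’s Dominated Convergence Theorem
(note that we are dealing with continuous functions and therefore all partial sums
have a uniform constant bound), we get:
$$\begin{aligned}
f(z) &= \frac{1}{2\pi i} \int_{K_r} \frac{f(\zeta)}{\zeta - z}\,d\zeta \\
&= \frac{1}{2\pi i} \int_{K_r} \frac{f(\zeta)}{\zeta - c}\,d\zeta + (z - c) \frac{1}{2\pi i} \int_{K_r} \frac{f(\zeta)}{(\zeta - c)^2}\,d\zeta + \dots + (z - c)^n \frac{1}{2\pi i} \int_{K_r} \frac{f(\zeta)}{(\zeta - c)^{n+1}}\,d\zeta + \dots.
\end{aligned}$$
By the formula in the proof of Theorem 3.4, we have
$$\frac{1}{2\pi i} \int_{K_r} \frac{f(\zeta)}{(\zeta - c)^{n+1}}\,d\zeta = \frac{1}{n!} f^{(n)}(c).$$ □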
4.2
Note that repeating verbatim the proofs in Section 7 of Chapter 1, we get the
following
Proposition. A (complex) power series
$$\sum_{k=0}^{\infty} a_k (z - c)^k \tag{*}$$
converges absolutely and uniformly in a circle with center $c$ and any radius
$$s < r = \liminf \frac{1}{\sqrt[n]{|a_n|}}$$
and diverges outside of the closed circle with center $c$ and radius $r$. (The number $r$
is called the radius of convergence of the power series (*).)
Moreover, the power series
$$\sum_{k=1}^{\infty} k a_k (z - c)^{k-1}$$
has the same radius of convergence as (*), and the series (*) may be differentiated
term by term.
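As a small worked illustration of the formula for $r$, consider the coefficients $a_k = 2^k$ with $c = 0$. This particular series is chosen here only as an example and does not appear in the text.

```latex
\[
\sqrt[n]{|a_n|} = \sqrt[n]{2^n} = 2
\quad\Longrightarrow\quad
r = \liminf_{n\to\infty} \frac{1}{\sqrt[n]{|a_n|}} = \frac{1}{2},
\]
\[
\sum_{k=0}^{\infty} 2^k z^k = \frac{1}{1-2z},
\qquad
\sum_{k=1}^{\infty} k\,2^k z^{k-1} = \frac{2}{(1-2z)^2}
\qquad \Bigl(|z| < \tfrac{1}{2}\Bigr).
\]
```

The second series is the term-by-term derivative of the first, and its sum is indeed the derivative of $1/(1 - 2z)$; both diverge for $|z| > \tfrac12$.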
4.3
The power series
$$\begin{aligned}
e^z &= 1 + \frac{z}{1!} + \frac{z^2}{2!} + \dots, \\
\sin(z) &= z - \frac{z^3}{3!} + \frac{z^5}{5!} - \dots, \\
\cos(z) &= 1 - \frac{z^2}{2!} + \frac{z^4}{4!} - \dots
\end{aligned}$$
will now be considered the definitions of the functions $e^z$, $\sin(z)$, $\cos(z)$ for $z$
complex (the radius of convergence of these series is $\infty$). Therefore, we have
$$e^{iz} = \cos(z) + i \sin(z), \quad e^{-iz} = \cos(z) - i \sin(z),$$
and also
$$\cos(z) = \frac{e^{iz} + e^{-iz}}{2}, \quad \sin(z) = \frac{e^{iz} - e^{-iz}}{2i}.$$
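Since the three functions are defined by their power series, the identity $e^{iz} = \cos(z) + i\sin(z)$ can be checked by summing the series directly. The sketch below is only an illustration: the number of terms, the sample point and the function names are choices made for this example.

```python
import cmath


def exp_series(z, terms=60):
    """Partial sum of 1 + z/1! + z^2/2! + ..."""
    total, term = 0j, 1 + 0j
    for k in range(terms):
        total += term
        term *= z / (k + 1)
    return total


def sin_series(z, terms=30):
    """Partial sum of z - z^3/3! + z^5/5! - ..."""
    total, term = 0j, z + 0j
    for k in range(terms):
        total += term
        term *= -z * z / ((2 * k + 2) * (2 * k + 3))
    return total


def cos_series(z, terms=30):
    """Partial sum of 1 - z^2/2! + z^4/4! - ..."""
    total, term = 0j, 1 + 0j
    for k in range(terms):
        total += term
        term *= -z * z / ((2 * k + 1) * (2 * k + 2))
    return total


z = 1.2 - 0.7j
print(exp_series(1j * z))
print(cos_series(z) + 1j * sin_series(z))
print(cmath.exp(1j * z))   # library value, for comparison
```

All three printed numbers agree to within rounding error for this sample point.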
4.4 A uniqueness theorem
Lemma. Let $f$, $g$ be holomorphic in an open set $U$, and let $c \in U$, $c = \lim c_n$,
$c_n \ne c$. Suppose $f(c_n) = g(c_n)$ for all $n$. Then $f = g$ in some neighborhood of $c$.
Proof. It suffices to prove that if $f(c_n) = 0$ for all $n$, then $f \equiv 0$ in some
neighborhood of $c$. By Taylor’s formula, we have
$$f(z) = \sum_{k=0}^{\infty} a_k (z - c)^k$$
for some constants $a_k$. It suffices to prove that
$$a_k = 0 \text{ for all } k. \tag{*}$$
Assuming (*) does not hold, let $n$ be the smallest number such that $a_n \ne 0$. Then in
some neighborhood of $c$,
$$f(z) = (z - c)^n (a_n + a_{n+1}(z - c) + a_{n+2}(z - c)^2 + \dots).$$
The function in the parentheses on the right-hand side is continuous (it is a uniform
limit of continuous functions), and not zero at $c$; thus, it is non-zero in some
neighborhood of $c$, and so is $(z - c)^n$ for $z \ne c$, contradicting our assumptions. □
Theorem. Assume $f, g$ are holomorphic on a connected open set $U$, and let $c \in U$,
$c = \lim c_n$, $c \ne c_n$, and $f(c_n) = g(c_n)$ for all $n$. Then $f \equiv g$ on $U$.
Proof. Let
$$M = \{z \in U \mid f(u) = g(u) \text{ for all } u \text{ in some neighborhood of } z\}.$$
$M$ is clearly open, and by the lemma, it is also closed and non-empty. Since $U$ is
connected, we have $M = U$. □
4.5 The algebra of power series
Note that on two power series of the form 4.2.(*) with the same $c$, we can perform
addition, subtraction and multiplication (in the case of multiplication, note that only
finitely many terms with the same power $(z - c)^k$ are added). As an inverse operation
to this purely algebraic multiplication, note that it is also possible to divide by any
power series 4.2.(*) with $a_0 \ne 0$, figuring the coefficients of the ratio by a recursive
procedure.
It will be important for us that when these purely algebraic operations are
performed on power series with a positive radius of convergence representing Taylor
series at $c$ of holomorphic functions $f$, $g$, the power series resulting from an algebraic
operation converges and is the Taylor series of $f + g$, $f - g$, $f \cdot g$ or $f/g$ (the
division requires $g(c) \ne 0$). All of these statements are more or less obvious with
the exception of the division. Here we note that since $g(c) \ne 0$, we have $g(z) \ne 0$ in
some disk $\Omega(c, r)$, $r > 0$. Therefore, $f/g$ is a holomorphic function in a disk with
center $c$, and hence has a Taylor expansion at $c$. Multiplying this Taylor expansion
with the Taylor expansion of $g$ at $c$ algebraically, we then get the Taylor expansion
of $f$ at $c$ by uniqueness. This implies that the Taylor expansion of $f/g$ at $c$ is the
algebraic ratio of the Taylor expansions of $f$ and $g$ at $c$.
5 Applications: Liouville’s Theorem, the Fundamental Theorem of Algebra and a remark on conformal maps
5.1 Theorem. (Liouville’s Theorem) Suppose $f$ is holomorphic and bounded in all
of $\mathbb{C}$. Then $f$ is constant.
Proof. By the formula (3.4.1) with $k = 1$, for any circle $K_r$ with center $z$ and radius $r$ we have
$$f'(z) = \frac{1!}{2\pi i} \int_{K_r} \frac{f(\zeta)}{(\zeta - z)^2}\,d\zeta.$$
Suppose $|f(z)| \le A$ for all $z \in \mathbb{C}$. For a point $\zeta$ on $K_r$, we then have
$$\left| \frac{f(\zeta)}{(\zeta - z)^2} \right| \le \frac{A}{r^2},$$
and by 4.5 of Chapter 8, we have
$$|f'(z)| \le \frac{1!}{2\pi} \cdot 4 \cdot 2\pi r \cdot \frac{A}{r^2} = \frac{4A}{r}.$$
Since $r > 0$ was arbitrary, we must have $f'(z) \equiv 0$ and hence $f$ must be constant. □
5.2 Theorem. (The Fundamental Theorem of Algebra) Every non-constant polynomial
has at least one root in $\mathbb{C}$.
Proof. Suppose a polynomial
$$p(z) = z^n + a_{n-1} z^{n-1} + \dots + a_1 z + a_0, \quad n \ge 1$$
has no root in $\mathbb{C}$. Then the function
$$f(z) = \frac{1}{p(z)}$$
is defined and holomorphic on all of $\mathbb{C}$. Let
$$R = 2n \max(|a_0|, \dots, |a_n|)$$
(where $a_n = 1$). For $|z| \ge R$, we then have
$$|p(z)| \ge |z|^n - |a_{n-1} z^{n-1} + \dots + a_1 z + a_0| \ge |z|^n - \frac{R}{2}|z|^{n-1} \ge \frac{R}{2}|z|^{n-1} \ge \frac{R^n}{2}$$
and hence
$$|f(z)| \le \frac{c}{R^n}$$
with the constant $c = 2$.
On the other hand, on $\{z \mid |z| \le R\}$, $f$ is bounded because it is continuous. Thus, $f$
is bounded on all of $\mathbb{C}$, and by Liouville’s Theorem, it is constant, and hence so is
$p(z)$. This is a contradiction since we assumed $n \ge 1$. □
5.3
A conformal map is a regular map $\mathbf{f} : U \to \mathbb{R}^n$ defined on an open set $U \subseteq \mathbb{R}^n$
such that for two vectors $\mathbf{u}, \mathbf{v} \in \mathbb{R}^n$, and every point $z \in U$, the angle between
the vectors $D_z\mathbf{f}(\mathbf{u})$, $D_z\mathbf{f}(\mathbf{v})$ is the same as the angle between $\mathbf{u} \ne 0$ and $\mathbf{v} \ne 0$.
(Recall that the angle $0 \le \alpha \le \pi$ between non-zero vectors $\mathbf{u}$, $\mathbf{v}$ is defined by
$\cos(\alpha) = \mathbf{u} \cdot \mathbf{v} / (\|\mathbf{u}\| \cdot \|\mathbf{v}\|)$.)
Note that for $n = 1$, any regular map is conformal. For $n > 2$, it can be shown
that every conformal map is locally a constant multiple of an isometry, a fact which
we will not show here. However, for $n = 2$ we have the following result. Identify,
again, $\mathbb{C}$ with $\mathbb{R}^2$ by $x + iy \mapsto (x, y)$, and drop the bold-faced letters.
Theorem. For $n = 2$, a regular map $f$ as in 1.1 is conformal if and only if on each
connected component of $U$, $f$ is either holomorphic or the complex conjugate of a
holomorphic function (such a function is often called antiholomorphic).
Proof: This is really a statement entirely about the $\mathbb{R}$-linear map $Df_z$ for each
$z \in U$ (see Exercise (1)), which is a consequence of the following
Lemma. A regular $\mathbb{R}$-linear map $A : \mathbb{C} \to \mathbb{C}$ preserves angles between non-zero
pairs of vectors if and only if $A$ is given either by the formula $Az = \lambda z$ or $Az = \lambda \bar z$
for some $\lambda \ne 0 \in \mathbb{C}$.
The proof of the lemma goes as follows: Representing $A$ by a $2 \times 2$ matrix using
the basis $1$, $i$, by our assumption, the columns of $A$ must be non-zero and orthogonal.
Since multiplication by a non-zero complex number preserves angles by the
geometric interpretation of complex numbers, we may assume (by composing, if
necessary, $A$ with multiplication by a suitable non-zero complex number) that the
first column of $A$ is $(1, 0)^T$. By orthogonality, the second column is then $(0, a)^T$
for some non-zero (real) $a$. However, the requirement that $A(1, 1)^T$, $A(1, -1)^T$ be
orthogonal gives $a^2 = 1$. If $a = 1$, $Az = z$ and if $a = -1$, $Az = \bar z$. □
6 Laurent series, isolated singularities and the Residue Theorem
6.1 Laurent series
Let $f$ be a holomorphic function defined on an annulus $R_1 < |z - c| < R_2$ for some
$c \in \mathbb{C}$. Let $L_r$ be a circle with center $c$ and radius $r$ oriented counterclockwise.
Define
$$f_1(z) = \frac{1}{2\pi i} \int_{L_r} \frac{f(\zeta)}{\zeta - z}\,d\zeta$$
for some $|z - c| < r < R_2$ and
$$f_2(z) = -\frac{1}{2\pi i} \int_{L_s} \frac{f(\zeta)}{\zeta - z}\,d\zeta$$
for some $R_1 < s < |z - c|$. The exact choice of $r$ or $s$ does not change the value by
Theorem 2.1. Furthermore, we have
$$f(z) = f_1(z) + f_2(z) \quad \text{for } R_1 < |z - c| < R_2. \tag{6.1.1}$$
To see this, consider a circle K with center z and a small radius oriented
counterclockwise, and apply Theorem 2.1 to the function
f ./
z
of the variable with simple closed curves Lr , Ls , K , along with Cauchy’s
formula (Theorem 3.3).
By differentiating under the integral sign (Theorem 1.7), and the fact that the
value does not depend on $r$, we see that the function $f_1(z)$ is holomorphic in the
disk $|z - c| < R_2$, and hence has a Taylor expansion. In case of the function $f_2(z)$,
it is convenient to perform the substitution
$$\omega = \frac{1}{\zeta - c}, \quad \zeta = c + \frac{1}{\omega}, \quad d\zeta = -\frac{1}{\omega^2}\,d\omega,$$
and similarly
$$t = \frac{1}{z - c}, \quad z = c + \frac{1}{t},$$
so that
$$\frac{1}{\zeta - z} = \frac{\omega t}{t - \omega}.$$
For the function $g(t) = f_2(z(t))$, this gives
$$g(t) = t\,\frac{1}{2\pi i}\int_M \frac{f(c + 1/\omega)}{\omega(\omega - t)}\,d\omega$$
where $M$ is the circle with center $0$ and radius $1/s > |t|$ oriented counterclockwise
(note that the substitution reverses orientation, so we have a total of 4 minus
signs, which result in a plus). Again by differentiating under the integral sign, we
see that $g(t)/t$ is a holomorphic function in the disk $|t| < 1/R_1$, and hence has
a Taylor expansion. (Note: when performing the substitution, we implicitly used
the fact that in complex line integrals, we may treat differentials the same way as
in ordinary single-variable integral substitution; see Exercise (11) below.) Writing
the Taylor series of $g(t)$ in the variable $(z - c)$, we obtain an expansion of the form
$$f_2(z) = \sum_{n<0} a_n (z - c)^n,$$
which leads to the following result:
Theorem. A holomorphic function $f(z)$ in an annulus $R_1 < |z - c| < R_2$ has
an expansion
$$f(z) = \sum_{n=-\infty}^{\infty} a_n (z - c)^n \qquad (6.1.2)$$
which is absolutely convergent in the annulus $R_1 < |z - c| < R_2$, and the
convergence is uniform on every compact subset. Furthermore, the coefficients
$a_n$ are uniquely determined by $f$. (This is called the Laurent expansion of the
function $f(z)$.)
Proof. The existence of the expansion (6.1.2) follows from the expansions for the
functions $f_1$, $f_2$ in the variable $z - c$ discussed above. Moreover, the convergence
properties of the series (6.1.2) follow from our already discussed theory of power
series. Regarding uniqueness, note that the coefficients $a_n$ can be calculated by
Cauchy integrals, which can be performed term by term by the convergence
properties of the power series (see Exercise (13) below). □
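The remark that the coefficients can be calculated by Cauchy integrals can be tried out numerically. Integrating (6.1.2) term by term and using Exercise (13) gives $a_n = \frac{1}{2\pi i}\int_{L_r} f(\zeta)(\zeta - c)^{-n-1}\,d\zeta$. The sketch below approximates this integral by equally spaced samples on the circle; the test function $f(z) = 1/(z(z-2))$, the radius and the sample count are illustrative assumptions, not taken from the text. On $0 < |z| < 2$ a geometric series gives $a_n = -2^{-(n+2)}$ for $n \ge -1$ and $a_n = 0$ for $n < -1$.

```python
import cmath
import math

def laurent_coefficient(f, c, radius, n, samples=512):
    """Approximate a_n = (1/(2 pi i)) * integral of f(zeta) (zeta-c)^(-n-1) d zeta
    over the counterclockwise circle of the given radius around c."""
    total = 0j
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        w = radius * cmath.exp(1j * theta)        # w = zeta - c
        # d zeta = i w d theta, so (1/(2 pi i)) * i w * (2 pi / samples) = w / samples
        total += f(c + w) * w ** (-n - 1) * w
    return total / samples

def f(z):
    return 1.0 / (z * (z - 2.0))                  # holomorphic on 0 < |z| < 2

for n in (-2, -1, 0, 1):
    print(n, laurent_coefficient(f, 0.0, 1.0, n))
```

Any radius strictly between 0 and 2 should give the same values up to rounding, in line with the independence of the choice of $r$.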
6.2 Classification of isolated singularities and the Residue Theorem
Let $U$ be an open subset of $\mathbb{C}$, and let $c \in U$. A holomorphic function $f$ defined
on $U \setminus \{c\}$ is said to have an isolated singularity at $c$. In this case, $f$ has a Laurent
expansion (6.1.2) at $c$ with $R_1 = 0$. Isolated singularities are classified using this
expansion: If $a_n = 0$ for all $n < 0$, we say that $f$ has a removable singularity at $c$.
Clearly, in that case, one can extend $f$ to $U$ by setting $f(c) = a_0$. (For a stronger
statement, see Exercise (14).) On the other hand, if the set of all $n$ for which $a_n \ne 0$
is not bounded below, then we say that $f$ has an essential singularity at $c$. If $n > 0$
is such that $a_{-n} \ne 0$ and $a_m = 0$ for all $m < -n$, then we say that $f$ has a pole
of order $n$ at $c$. Symmetrically, if $n > 0$ and $a_n \ne 0$ while $a_m = 0$ for $m < n$,
we say that $f$ has a zero of order $n$ at $c$, although that is not really a singularity. If
$a_m = 0$ for $m < -n$, $n \ge 0$, we say that $f$ has at most a pole of order $n$ at $c$, and
if $a_m = 0$ for $m < n$, $n > 0$, we say that $f$ has a zero of order at least $n$ at $c$. We
say that $f(z)$ has at most a pole at $c$ if it does not have an essential singularity there.
From the uniqueness of the Laurent expansion, it immediately follows that $f$ has at
most a pole of order $n$ at $c$ if and only if $f(z)(z - c)^n$ is holomorphic in $U$, while $f$
has a zero of order at least $n$ at $c$ if and only if $f(z)/(z - c)^n$ is holomorphic in $U$.
Note that the remarks 4.5 on algebraic operations with power series obviously
extend to Laurent series of functions which have at most a pole at the point $c$. When
essential singularities are present, however, multiplication and division may involve
infinite sums of coefficients at the same power, and hence the purely algebraic
operations are undefined. We will also need the following result on essential
singularities:
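Proposition. Let $f$ be a holomorphic function on $\Omega(c, r) \setminus \{c\}$ with an essential
singularity at $c$ and let $A \in \mathbb{C}$. Then for each $\varepsilon > 0$ and each $\delta > 0$, $f(z) \in \Omega(A, \varepsilon)$
for infinitely many $z \in \Omega(c, \delta)$.
Proof. Assuming the opposite, there exist an $A \in \mathbb{C}$, an $\varepsilon > 0$ and a $\delta > 0$ such
that (after making $\delta$ smaller if necessary) $|f(z) - A| \ge \varepsilon$ for $z \in \Omega(c, \delta) \setminus \{c\}$.
But then the function
$$\frac{1}{f(z) - A}$$
is bounded near $c$, so it has a removable singularity at $c$ (see Exercise (14)), and
hence $f(z)$ has at most a pole at $c$. □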
For a function $f$ with an isolated singularity at $c$ as above, we define the residue
of $f$ at $c$ as
$$\mathrm{res}_{z=c} f(z) = a_{-1}. \qquad (6.2.1)$$
Since we may integrate the Laurent series term by term, it follows that if $L$ is a circle
in $U$ with center $c$ oriented counter-clockwise such that the interior of the circle is
also contained in $U$, then
$$\mathrm{res}_{z=c} f(z) = \frac{1}{2\pi i}\int_L f(\zeta)\,d\zeta. \qquad (6.2.2)$$
From this and Theorem 2.1, we then immediately get the following fact:
Theorem. (The Residue Theorem) Let $U$ be a domain in $\mathbb{C}$ and let $L_1, \dots, L_k$
be simple piecewise continuously differentiable closed curves with disjoint images
such that $L_1 \amalg \cdots \amalg L_k$ is the boundary of $U$ oriented counter-clockwise. Let further
$c_1, \dots, c_m$ be finitely many distinct points in $U$, and let $f$ be a holomorphic function
on $V \setminus \{c_1, \dots, c_m\}$ where $V \supseteq \overline{U}$ is an open set. Then
$$\int_{L_1} f(z)\,dz + \cdots + \int_{L_k} f(z)\,dz = 2\pi i\,\bigl(\mathrm{res}_{z=c_1} f(z) + \cdots + \mathrm{res}_{z=c_m} f(z)\bigr).$$
□
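As a numerical illustration of the Residue Theorem (a sketch under stated assumptions, not part of the original text), take $f(z) = e^z/(z(z-1))$, which has simple poles at $0$ and $1$ with residues $-1$ and $e$. The function `circle_integral` and its parameters are hypothetical names chosen for this example; it approximates the contour integral over a counterclockwise circle by equally spaced samples.

```python
import cmath
import math

def circle_integral(f, center, radius, samples=400):
    """Approximate the integral of f(z) dz over the counterclockwise circle."""
    total = 0j
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        w = radius * cmath.exp(1j * theta)
        total += f(center + w) * 1j * w       # dz = i w d theta
    return total * (2.0 * math.pi / samples)

def f(z):
    return cmath.exp(z) / (z * (z - 1.0))     # residues: -1 at z = 0, e at z = 1

both_poles = circle_integral(f, 0.0, 2.0)     # encloses 0 and 1
only_zero = circle_integral(f, 0.0, 0.5)      # encloses 0 only
print(both_poles, 2j * math.pi * (math.e - 1.0))
print(only_zero, -2j * math.pi)
```

The two circles enclose different sets of poles, so the two integrals differ exactly by $2\pi i$ times the residue at $1$.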
6.3 Applications: The Argument Principle and Rouché’s Theorem
The Residue Theorem has the following celebrated consequence. We say that a
function $f$ is meromorphic in an open set $U \subseteq \mathbb{C}$ if $f$ is holomorphic and non-zero
on $U \setminus S$ for a discrete set $S \subseteq U$, and $f$ has at most a pole at each $c \in S$. Then
we define the degree of $f$ at $c \in U$ as
$$\deg_c(f) = \begin{cases} n & \text{if } f \text{ has a zero of order } n \text{ at } c, \\ -n & \text{if } f \text{ has a pole of order } n \text{ at } c, \\ 0 & \text{otherwise.} \end{cases}$$
6.3.1 Theorem. (The Argument Principle) Let $U$ be a domain in $\mathbb{C}$ and let
$L_1, \dots, L_k$ be simple piecewise continuously differentiable closed curves with
disjoint images such that $L_1 \amalg \cdots \amalg L_k$ is the boundary of $U$ oriented
counter-clockwise. Let $f$ be a meromorphic function on $V \supseteq \overline{U}$ with no zeros or poles on
$L_1 \amalg \cdots \amalg L_k$. Then
$$\sum_{j=1}^{k} \frac{1}{2\pi i}\int_{L_j} \frac{f'(z)}{f(z)}\,dz = \sum_{c \in U} \deg_c(f).$$
(Note that since $\overline{U}$ is compact, the sum on the right-hand side has only finitely many
non-zero terms.)
Proof. Let $f(z) = (z - c)^n g(z)$ with $g$ holomorphic near $c$ and $g(c) \ne 0$. Then
$$f'(z) = n(z - c)^{n-1} g(z) + (z - c)^n g'(z),$$
so that
$$\frac{f'(z)}{f(z)} = \frac{n}{z - c} + \frac{g'(z)}{g(z)},$$
and hence
$$\mathrm{res}_{z=c} \frac{f'(z)}{f(z)} = n.$$
The statement then follows directly from the Residue Theorem (see Exercise (17)). □
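The Argument Principle lends itself to a simple numerical experiment. The sketch below is illustrative only: the polynomial $p(z) = z^2(z - \tfrac12)(z - 2) = z^4 - 2.5z^3 + z^2$, the function names and the sample counts are assumptions made for this example. It approximates $\frac{1}{2\pi i}\int p'(z)/p(z)\,dz$ over circles around $0$ and rounds the result to the nearest integer.

```python
import cmath
import math

def poly_and_derivative(coeffs, z):
    """Horner evaluation of a polynomial (highest degree first) and its derivative."""
    p, dp = 0j, 0j
    for a in coeffs:
        dp = dp * z + p
        p = p * z + a
    return p, dp

def count_zeros(coeffs, radius, samples=256):
    """Approximate (1/(2 pi i)) * integral of p'/p over |z| = radius, counterclockwise."""
    total = 0j
    for k in range(samples):
        z = radius * cmath.exp(2j * math.pi * k / samples)
        p, dp = poly_and_derivative(coeffs, z)
        total += dp / p * z                   # dz = i z d theta; the factor i cancels
    return round((total / samples).real)

coeffs = [1.0, -2.5, 1.0, 0.0, 0.0]           # z^4 - 2.5 z^3 + z^2
print(count_zeros(coeffs, 0.25), count_zeros(coeffs, 1.0), count_zeros(coeffs, 3.0))
```

The count includes multiplicities, so the double zero at $0$ contributes $2$. The circle must avoid the zeros themselves, as the theorem requires.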
Comment: The argument of a number $z \in \mathbb{C} \setminus \{0\}$ is defined as the angle
$\mathrm{Arg}(z) = \alpha$ such that $z = |z|e^{i\alpha}$. Since this $\alpha$ is only defined up to adding an
integral multiple of $2\pi$, one usually normalizes by requiring $0 \le \mathrm{Arg}(z) < 2\pi$ (this
is the argument in the narrower, normalized sense). It follows that in a connected
open set $U$ where there exists a holomorphic function $\mathrm{Ln}(z)$ which satisfies
$$e^{\mathrm{Ln}(z)} = z,$$
we have
$$\mathrm{Arg}(z) = \mathrm{Im}(\mathrm{Ln}(z)) + 2\pi k \quad \text{for some } k \in \mathbb{Z}.$$
If $U \subseteq \mathbb{C}$ is, say, a convex open set on which $f(z)$ has no zero, then $f'(z)/f(z)$
has a primitive function $\mathrm{Ln}(f(z))$ whose imaginary part differs from $\mathrm{Arg}(f(z))$ by
$2\pi k$, $k \in \mathbb{Z}$. The whole point is, however, that by Lemma 3.1, $\mathrm{Ln}(z)$ cannot be
well-defined on the whole set $\mathbb{C} \setminus \{0\}$; roughly speaking, when we follow a circle with
center $0$ once around counter-clockwise, the value of the logarithm will increase by
$2\pi i$ (note that its real part won't change: it is just its imaginary part, the argument,
which will increase by $2\pi$). Thus, Theorem 6.3.1 in the case $k = 1$ makes precise the
following intuitive assertion: if we follow once around a simple closed curve on which
$f(z)$ has no zero and which is the boundary, oriented counter-clockwise, of a domain $U$,
then the increase of the argument of $f$ along this curve is equal to $2\pi$ times the number
of zeros of $f$ inside $U$.
Let $f$ be a holomorphic function on $U$ which is non-zero outside of a finite set of
points. Then $f$ is meromorphic, and the sum of degrees of $f$ at all the points $a \in U$
(which has only finitely many non-zero summands) is called the number of zeros of
the function $f$ in the set $U$. (Thus, this is a count of zeros with "multiplicities".)
6.3.2 Corollary. (Rouché's Theorem) Let $U$ be a domain in $\mathbb{C}$ and let $L_1, \dots, L_k$
be simple piecewise continuously differentiable closed curves with disjoint images
such that $L_1 \amalg \cdots \amalg L_k$ is the boundary of $U$ oriented counter-clockwise. Let $f$, $g$
be holomorphic on $V \supseteq \overline{U}$ and satisfy
$$|f(z) - g(z)| < |f(z)| \quad \text{for } z \in L_1 \amalg \cdots \amalg L_k.$$
Then $f$, $g$ have the same number of zeros in $U$. (Note that again, since $\overline{U}$ is
compact, by Theorem 4.4, $f$ and $g$ have only finitely many zeros in $U$.)
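Proof. By assumption, we have
$$\left|1 - \frac{g(z)}{f(z)}\right| < 1 \quad \text{for } z \in L_1 \amalg \cdots \amalg L_k.$$
Thus, if we put $F(z) = g(z)/f(z)$, then
$$F[L_1 \amalg \cdots \amalg L_k] \subseteq \Omega$$
where $\Omega$ is the open disk with center $1$ and radius $1$. Then $1/z$ has a primitive
function on $\Omega$, which we will denote by $\mathrm{Ln}(z)$. The chain rule then implies
$$(\mathrm{Ln}(F(z)))' = \frac{F'(z)}{F(z)}.$$
Therefore,
$$\sum_{j=1}^{k} \frac{1}{2\pi i}\int_{L_j} \frac{F'(z)}{F(z)}\,dz = 0,$$
and our statement follows from the Argument Principle applied to $F$, whose degree
at each point is the degree of $g$ minus the degree of $f$. □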
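A small numerical illustration of Rouché's Theorem, offered as a sketch with illustrative choices that do not come from the text: on the unit circle take $f(z) = 3z$ and $g(z) = z^5 + 3z + 1$, so that $|f - g| = |z^5 + 1| \le 2 < 3 = |f|$. The code samples the inequality on the circle and then counts zeros via the total increase of the argument along the curve, in the spirit of the Comment above. The helper names are hypothetical.

```python
import cmath
import math

def f(z):
    return 3.0 * z

def g(z):
    return z ** 5 + 3.0 * z + 1.0

def unit_circle(samples):
    return [cmath.exp(2j * math.pi * k / samples) for k in range(samples)]

def winding_number(h, samples=2000):
    """Total increase of Arg(h(z)) along the unit circle, divided by 2 pi.
    Assumes h has no zero on the circle and the sampling is fine enough that
    consecutive values of h differ in argument by less than pi."""
    pts = unit_circle(samples)
    total = 0.0
    for k in range(samples):
        a, b = h(pts[k]), h(pts[(k + 1) % samples])
        total += cmath.phase(b / a)           # small signed change of the argument
    return round(total / (2.0 * math.pi))

hypothesis_holds = all(abs(f(z) - g(z)) < abs(f(z)) for z in unit_circle(2000))
print(hypothesis_holds, winding_number(f), winding_number(g))
```

Checking the inequality at sample points is only evidence; here it also follows from the triangle inequality, so both functions have exactly one zero in the unit disk.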
6.3.3 Theorem. Suppose a holomorphic function $f(z)$ defined on an open set
$U \subseteq \mathbb{C}$ is such that for some $z_0 \in U$, $f(z_0) = w_0$ and the function $f(z) - w_0$ has a zero
of order $n$ at $z_0$. Then there exists an $\varepsilon_0 > 0$ such that for $0 < \varepsilon < \varepsilon_0$, there exists a
$\delta > 0$ such that for all $c \in \mathbb{C}$ with $|c - w_0| < \delta$, the number of zeros of $f(z) - c$ in
$\Omega(z_0, \varepsilon)$ is $n$.
Proof. Let $L_\varepsilon$ be the circle with center $z_0$ and radius $\varepsilon$ oriented counterclockwise. We
will study the integral
$$\frac{1}{2\pi i}\int_{L_\varepsilon} \frac{f'(z)}{f(z) - c}\,dz. \qquad (*)$$
Choose $\varepsilon_0 > 0$ so that $f(z) - w_0 \ne 0$ for $z \in \Omega(z_0, \varepsilon_0) \setminus \{z_0\}$. (Such an $\varepsilon_0 > 0$
must exist or else $f(z)$ is constant in a neighborhood of $z_0$ by Theorem 4.4.) Then
if we choose $0 < \varepsilon < \varepsilon_0$, since $L_\varepsilon$ is compact, there exists a $\delta > 0$ such that the
denominator of the integrand (*) is non-zero for all $c \in \Omega(w_0, \delta)$. Therefore, the
integral (*) is defined and continuous as a function of $c$ in that domain. However, we
know by the Argument Principle that (*) is a non-negative integer, namely the number of
zeros of the function $f(z) - c$ in the disk $\Omega(z_0, \varepsilon)$. Thus, it must be constant, and
hence equal to its value $n$ at $c = w_0$, as claimed. □
6.3.4 Corollary. (The Holomorphic Open Mapping Theorem) A non-constant
holomorphic function on a connected open set $U \subseteq \mathbb{C}$ maps open sets onto open sets.
Proof. Note that in particular in the conclusion of Theorem 6.3.3, every element of
$\Omega(w_0, \delta)$ is in the image of $f$. □
An immediate consequence is then the following:
6.3.5 Corollary. (The maximum principle) If $f(z)$ is holomorphic and
non-constant in a connected open set $U \subseteq \mathbb{C}$, then $|f(z)|$ has no maximum in $U$.
Proof. By Corollary 6.3.4, for any $z \in U$, all points in a neighborhood of $f(z)$ are
in the image of $f$, so this will include points of greater absolute value. □
Another consequence of the Argument Principle is the following
6.3.6 Theorem. (Hurwitz's Theorem) Let $f_n$ be holomorphic functions on a
connected open set $U \subseteq \mathbb{C}$ which converge uniformly on every compact subset
of $U$ to a function $f : U \to \mathbb{C}$. Assume further that $f_n(z) \ne 0$ for all $n$ and all
$z \in U$. Then either $f$ is identically $0$ on $U$, or $f(z) \ne 0$ for all $z \in U$.
Proof. We know from Weierstrass's Theorem (Theorem 3.6) that $f(z)$ is a
holomorphic function on $U$. Suppose $f(z)$ is not identically $0$. Then by Theorem 4.4, for
any point $z_0 \in U$, there exists a number $r > 0$ such that $f(z)$ is defined and not
equal to $0$ for $0 < |z - z_0| \le r$. In particular, by Proposition 6.3 of Chapter 2,
$|f(z)|$ has a positive minimum on the circle $K = \{z \in \mathbb{C} \mid |z - z_0| = r\}$. It follows that
$1/f_n(z)$ converges uniformly to $1/f(z)$ on $K$, and by Weierstrass's Theorem 3.6 also
$f_n'(z)$ converges uniformly to $f'(z)$ on $K$. By Lebesgue's Dominated Convergence
Theorem, considering $K$ oriented counter-clockwise, we conclude that
$$\lim_{n \to \infty} \frac{1}{2\pi i}\int_K \frac{f_n'(z)}{f_n(z)}\,dz = \frac{1}{2\pi i}\int_K \frac{f'(z)}{f(z)}\,dz. \qquad (*)$$
Now by the Argument Principle, the argument of the limit in (*) is the number of
zeros of $f_n$ inside the circle $K$, which is $0$, while the right-hand side is the number
of zeros of $f$ inside $K$. In particular, $f(z_0) \ne 0$, and the statement follows, since $z_0$
was arbitrary. □
6.4 Example: The values of the Riemann zeta function at even integers $k \ge 2$
The Riemann zeta function is
$$\zeta(s) = \sum_{m=1}^{\infty} \frac{1}{m^s} \quad \text{for } \mathrm{Re}(s) > 1.$$
A lot can be said about the Riemann zeta function, but here we want to show how
the Residue Theorem can be applied to evaluating $\zeta(k)$ for $k \ge 2$ an even integer,
which is a typical example of an application of the theorem. (The evaluation of $\zeta(k)$
for odd integers $k > 2$ is still an open problem.) First, note that $e^z - 1$ has a simple
(= order 1) zero at $z = 0$, and hence
$$\frac{z}{e^z - 1}$$
has a removable singularity at $0$, and hence has a Taylor expansion at $z = 0$:
$$\frac{z}{e^z - 1} = \sum_{j=0}^{\infty} \frac{B_j}{j!} z^j.$$
The numbers $B_j$ are called the Bernoulli numbers. One has
$$B_0 = 1, \quad B_1 = -1/2, \quad B_2 = 1/6, \quad B_3 = 0, \quad B_4 = -1/30, \ \dots.$$
(See Exercise (18).) Now consider the function
$$f(z) = \frac{2\pi i}{z^k (e^{2\pi i z} - 1)}.$$
Then, by definition, for $k \ge 2$ an even integer,
$$\mathrm{res}_{z=0} f(z) = \frac{(2\pi i)^k B_k}{k!}.$$
On the other hand, clearly $f(z)$ has a simple (= order 1) pole at each $m \in \mathbb{Z} \setminus \{0\}$, and
using Taylor series at $z = m$, one gets
$$\mathrm{res}_{z=m} f(z) = \frac{1}{m^k}.$$
Also, clearly, $f(z)$ has no other poles. Let $L$ be a rectangle with sides
$\pm(n + \tfrac12) + ti$, $\pm ni + t$, $t \in \mathbb{R}$ in the appropriate ranges, oriented counterclockwise.
By the Residue Theorem,
$$\int_L f(z)\,dz = 2\pi i\left(\frac{(2\pi i)^k B_k}{k!} + 2\sum_{m=1}^{n} \frac{1}{m^k}\right). \qquad (+)$$
On the other hand, the left-hand side tends to $0$ with $n \to \infty$. In effect, we claim that
$$|e^{2\pi i z} - 1| > C \qquad (*)$$
on $L$, where $C > 0$ is a constant independent of $n$. To see this, note that on the
vertical sides of the rectangle, $e^{2\pi i z} - 1$ is a negative real number less than $-1$, while on the
horizontal sides, $e^{2\pi i z}$ is a complex number of constant absolute value, which with
$n \to \infty$ tends to $0$ on the upper side, and to $\infty$ on the lower side. This proves (*),
and since we are further dividing by $z^k$, $k \ge 2$, the absolute value of the integrand
on the left-hand side of (+) is $< K/n^2$ for a constant $K$ independent of $n$, which
implies that the left-hand side of (+) converges to $0$ with $n \to \infty$ (the length of $L$
grows only linearly in $n$). We conclude the following
Theorem. For every even integer $k \ge 2$,
$$\zeta(k) = -\frac{(2\pi i)^k B_k}{2\,(k!)}.$$
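The formula can be checked against partial sums of the series. The sketch below makes two assumptions that are not in the text: it computes the Bernoulli numbers from the recurrence $\sum_{j=0}^{m}\binom{m+1}{j}B_j = 0$ for $m \ge 1$ (which follows by multiplying the Taylor series of $z/(e^z - 1)$ by that of $e^z - 1$), and it uses $i^k = (-1)^{k/2}$ for even $k$ to stay in real arithmetic. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial, pi

def bernoulli(n_max):
    """Exact Bernoulli numbers B_0..B_{n_max} with the convention B_1 = -1/2."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        s = sum(comb(m + 1, j) * B[j] for j in range(m))
        B.append(-s / (m + 1))
    return B

def zeta_even(k):
    """zeta(k) = -(2 pi i)^k B_k / (2 k!) for an even integer k >= 2."""
    if k < 2 or k % 2:
        raise ValueError("k must be an even integer >= 2")
    sign = (-1) ** (k // 2)                   # i^k for even k
    return -sign * (2.0 * pi) ** k * float(bernoulli(k)[k]) / (2 * factorial(k))

def zeta_partial(k, terms=100000):
    return sum(1.0 / m ** k for m in range(1, terms + 1))

for k in (2, 4, 6):
    print(k, zeta_even(k), zeta_partial(k))
```

For $k = 2$ and $k = 4$ the formula gives the familiar values $\pi^2/6$ and $\pi^4/90$.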
7 Exercises
(1) Prove from first principles that for a holomorphic function $f : U \to \mathbb{C}$ where
$U$ is open, $f$, thought of as a map from an open set of $\mathbb{R}^2$ to $\mathbb{R}^2$, has a total
differential at every point.
(2) Prove that the function of one complex variable
$$f(z) = \begin{cases} e^{-z^{-4}} & \text{if } z \ne 0 \\ 0 & \text{if } z = 0 \end{cases}$$
satisfies the Cauchy-Riemann conditions (CR) everywhere in $\mathbb{C}$, but is not
holomorphic. [This example is due to H. Looman.]
(3) Consider the function of one complex variable
$$f(z) = \begin{cases} z^5/|z|^4 & \text{if } z \ne 0 \\ 0 & \text{if } z = 0. \end{cases}$$
Prove that $f$ is continuous everywhere in $\mathbb{C}$, satisfies the Cauchy-Riemann
conditions (CR) at $z = 0$, but does not have a complex derivative at $z = 0$.
(4) Jordan's Theorem. Let $L$ be an oriented closed simple piecewise continuously
differentiable curve in $\mathbb{C}$. Assume for simplicity that there exists a
parametrization $c : \langle a, b\rangle \to \mathbb{C}$ of $L$ where the partition $a = a_0 < \cdots < a_k = b$
mentioned in 1.1 of Chapter 8 satisfies $c'_+(a_i) \ne -\lambda c'_-(a_i)$ for
$i = 1, \dots, k - 1$, $c'_+(a) \ne -\lambda c'_-(b)$ for any $\lambda > 0$.
1. Prove that for every $x \in c[\langle a, b\rangle]$, there exists an open neighborhood $V_x$
and a diffeomorphism
$$\varphi_x : V_x \to \Omega(0, 1)$$
with
$$\det(D\varphi_x) > 0,$$
a number $\alpha \in (0, 2\pi)$ and numbers $a > 0$, $b \in \mathbb{R}$ such that
$$c(b) = x,$$
$$c[\langle a, b\rangle] \cap V_x = c[(b - a, b + a)]$$
and
for $s \in \langle -1, 0\rangle$, we have $\varphi_x c(as + b) = (-s\cos(\alpha), -s\sin(\alpha))$
and for $s \in \langle 0, 1\rangle$, we have $\varphi_x c(as + b) = (s, 0)$.
2. Define, for a point $z \in \mathbb{C} \setminus c[\langle a, b\rangle]$,
$$\mathrm{ind}_c(z) = \frac{1}{2\pi i}\int_L \frac{d\zeta}{\zeta - z}.$$
Consider the notation from part 1 of this exercise. Let $t_1 = (q\cos(\beta), q\sin(\beta))$,
$t_2 = (q\cos(\gamma), q\sin(\gamma))$ where $0 < q < 1$, $0 < \beta < \alpha < \gamma < 2\pi$.
Let $z_i = \varphi_x^{-1}(t_i)$, $i = 1, 2$. Prove that
$$\mathrm{ind}_c(z_1) = \mathrm{ind}_c(z_2) + 1.$$
[Hint: Assume without loss of generality $b = 0$, $a = 1$, $\varphi_x = \mathrm{Id}$. Let
$q < r < 1$. Consider the curve $c_1$ parametrized by the restriction of $c$
to $\langle -r, r\rangle$. Let $c_2 : [0, \alpha] \to \mathbb{C}$ and $c_3 : [\alpha, 2\pi] \to \mathbb{C}$ be defined by
$t \mapsto (r\cos(t), r\sin(t))$. Let $L_i$ be the oriented curve parametrized by $c_i$
for $i = 1, 2, 3$. Use Remark 8.6. (4) for the curves $L_1 + L_2$, $L_3 - L_1$,
$L_2 + L_3$.]
3. Prove that $\mathrm{ind}_c(z)$ is constant on connected components of $\mathbb{C} \setminus c[\langle a, b\rangle]$.
4. Let $c[\langle a, b\rangle] \subseteq \Omega(0, R)$ and let $|z| \ge R$. Prove that
$$\mathrm{ind}_c(z) = 0.$$
5. Prove that there exists a point $x$ of $c[\langle a, b\rangle]$ for which, in the notation of part
2 of this Exercise, $\mathrm{ind}_c(z_1) = 0$ or $\mathrm{ind}_c(z_2) = 0$. Note that either alternative
can arise depending on the orientation of $c$. [Hint: Let $x \in \mathrm{Im}(c)$ be a point
with maximal real part.]
6. Let $U_i$ be the connected component of $\mathbb{C} \setminus c[\langle a, b\rangle]$ which contains the
point $z_i$, $i = 1, 2$. Prove that $\overline{U_i} \setminus U_i = c[\langle a, b\rangle]$. [Hint: Use part 1 and
compactness.]
7. Prove from part 5 that $U_1 \cup U_2 \cup c[\langle a, b\rangle]$ is open, and equal to its
closure, hence equal to $\mathbb{C}$. Hence, $\mathbb{C} \setminus c[\langle a, b\rangle]$ has precisely two connected
components, namely $U_1$ and $U_2$ (note that, by parts 2 and 3, $U_1 \ne U_2$).
(5) Prove that the set of all $z \in \mathbb{C}$ such that $e^z = 1$ is precisely the set
$\{2k\pi i \mid k \in \mathbb{Z}\}$. [Hint: Recall Exercises (12), (11) of Chapter 1].
(6) Prove that if $\mathrm{Re}(t) > 0$, then there exists a unique $z \in \mathbb{C}$ with $-\pi/2 <
\mathrm{Im}(z) < \pi/2$ such that $e^z = t$. Denote $z = \ln(t)$. Prove that the complex
derivative of $\ln(z)$ is $1/z$.
(7) For $\mathrm{Re}(z) > 0$, $a \in \mathbb{C}$, define $z^a = e^{a\ln(z)}$. Mimic Exercise (7) of Chapter 1 to
show that the complex derivative of $z^a$ is $az^{a-1}$.
(8) Define, for $a \in \mathbb{C}$,
$$\binom{a}{0} = 1$$
and for $k \in \{1, 2, \dots\}$,
$$\binom{a}{k} = \frac{a(a - 1)\cdots(a - k + 1)}{k!}.$$
Prove Newton's formula, which states that for $z \in \mathbb{C}$ with $|z| < 1$, we have
$$(1 + z)^a = \sum_{n=0}^{\infty} \binom{a}{n} z^n.$$
(9) Suppose that $f$ is a holomorphic function on $\mathbb{C}$, and suppose there exist
non-zero numbers $a, b \in \mathbb{C}$ such that we do not have $qa = b$ for any $q \in \mathbb{Q}$, and
such that $f(z + a) = f(z)$, $f(z + b) = f(z)$ for all $z \in \mathbb{C}$. Prove that then $f$
is constant. [Note that there is more than one case to consider.]
(10) Prove that a non-constant holomorphic function $f : \mathbb{C} \to \mathbb{C}$ satisfies
$\overline{f[\mathbb{C}]} = \mathbb{C}$. [Hint: If $a \notin \overline{f[\mathbb{C}]}$, then the function $1/(f(z) - a)$ is holomorphic
and bounded.]
(11) Prove that if $L$ is a parametrized oriented piecewise smooth curve in an open
set $U \subseteq \mathbb{C}$, $h : U \to \mathbb{C}$ is a holomorphic injective function and $f$ is a
holomorphic function on $h[U]$, then
$$\int_{h[L]} f(t)\,dt = \int_L f(h(z))h'(z)\,dz.$$
(Note that the notation $h[L]$ applied to a parametrized curve is slightly
imprecise, but the meaning is clear.)
(12) Prove that if $f(z)$ is a holomorphic function on an annulus $R_1 < \|z - a\| < R_2$
which cannot be holomorphically extended to any annulus $r_1 < \|z - a\| < r_2$
for $r_1 \le R_1$, $r_2 \ge R_2$ where equality does not arise in both cases, then the
Laurent expansion (6.1.2) diverges outside the annulus $R_1 \le \|z - a\| \le R_2$.
(13) Prove that
$$\int_K (\zeta - a)^n\,d\zeta = 0$$
where $K$ is a circle with center $a$ oriented counter-clockwise, and $n \in \mathbb{Z}$,
$n \ne -1$.
(14) Let $U \subseteq \mathbb{C}$ be an open set, and let $a \in U$. Let $f$ be a holomorphic function
on $U \setminus \{a\}$. Suppose that $f$ is bounded in some neighborhood of $a$. Prove that
then $f$ has a removable singularity at $a$. [Hint: Consider the function $f_2$ of
Subsection 6.1. Then $f_2$ is bounded in a neighborhood of $a$ because $f$ is. This
means that the function $g$ of Subsection 6.1 is holomorphic and bounded in all
of $\mathbb{C}$. Apply Liouville's Theorem.]
(15) Prove that the complex function $f(z) = e^{1/z}$ for $z \ne 0$ has an essential
singularity at $z = 0$. Conclude that the Taylor expansion of $f$ at $a \in \mathbb{C} \setminus \{0\}$
has radius of convergence $\|a\|$. (Compare to Exercise (13) of Chapter 1.)
(16) Prove that a function $f$ as in Subsection 6.2 has a pole of order $n$ at $a$ if and
only if $g(z) = f(z)(z - a)^n$ is holomorphic in $U$ and $g(a) \ne 0$. Similarly,
prove that $f$ has a zero of order $n$ at $a$ if and only if $h(z) = f(z)/(z - a)^n$ is
holomorphic in $U$ and $h(a) \ne 0$.
(17) Let $U$ be a connected open subset of $\mathbb{C}$ and let $f, g$ be meromorphic functions
on $U$. Prove that $f \cdot g$, $f/g$ are meromorphic on $U$.
(18) Prove that $B_k = 0$ if $k > 2$ is an odd integer.
11 Multilinear Algebra
Now that we have strengthened our foundations in topology, algebra needs an upgrade
as well. We already know the concept of a bilinear map, which we have used,
for example, in Chapter 3, Section 8. Of the more general multilinear maps, we
encountered one additional example: the determinant. In this chapter, we will study
multilinear maps in some depth. This is essential for our treatment of differential
forms in Chapter 12 below, as well as for tensor calculus in Chapter 15 below.
In this chapter and the next, we will drop the bold-faced letter convention
of 1.2 of Chapter 3, as it is generally not used in this context.
1 Hom and dual vector spaces
1.1
In this Chapter, the symbol $\mathbb{F}$ stands for either the field $\mathbb{R}$ of real numbers or the
field $\mathbb{C}$ of complex numbers. Let $V$, $W$ be vector spaces over $\mathbb{F}$. Denote by
$$\mathrm{Hom}_{\mathbb{F}}(V, W)$$
the set of all linear maps (homomorphisms of $\mathbb{F}$-vector spaces)
$$f : V \to W.$$
Observe that $\mathrm{Hom}_{\mathbb{F}}(V, W)$ is again a vector space: for $f, g \in \mathrm{Hom}_{\mathbb{F}}(V, W)$, we have
a linear map $f + g \in \mathrm{Hom}_{\mathbb{F}}(V, W)$ defined by
$$(f + g)(x) = f(x) + g(x),$$
and when $\lambda \in \mathbb{F}$, we also have a linear map $\lambda f$ defined by
$$(\lambda f)(x) = \lambda f(x).$$
The required identities are obvious.
A case of special interest is when $W = \mathbb{F}$. Then we write
$$V^* = \mathrm{Hom}_{\mathbb{F}}(V, \mathbb{F}),$$
and call $V^*$ the dual of the vector space $V$ and often refer to elements of $V^*$ as
linear forms on $V$.
1.2 Covariance and contravariance
Observe the behavior of $\mathrm{Hom}_{\mathbb{F}}(V, W)$ with respect to linear maps.
First, let $\varphi : W \to W'$ be a linear map. We can naturally define a map
$$\varphi_* : \mathrm{Hom}_{\mathbb{F}}(V, W) \to \mathrm{Hom}_{\mathbb{F}}(V, W')$$
by composition with $\varphi$:
$$f \in \mathrm{Hom}_{\mathbb{F}}(V, W) \mapsto \varphi \circ f \in \mathrm{Hom}_{\mathbb{F}}(V, W').$$
The map $\varphi_*$ is clearly linear. Moreover, this construction clearly preserves the
identity, and if we have another linear map $\psi : W' \to W''$, we have
$$\psi_* \circ \varphi_* = (\psi \circ \varphi)_*.$$
Next, consider a linear map $\varphi : V \to V'$. Again, we can define naturally a map $\varphi^*$
on $\mathrm{Hom}_{\mathbb{F}}(?, W)$ by
$$g \mapsto \varphi^*(g) = g \circ \varphi.$$
This map,
$$\varphi^* : \mathrm{Hom}_{\mathbb{F}}(V', W) \to \mathrm{Hom}_{\mathbb{F}}(V, W),$$
however, goes in the opposite direction! Again, $\varphi^*$ is clearly linear, and this
construction preserves identity. Also, it preserves composition, but this time in the
reversed order: If $\psi : V' \to V''$ is a linear map, then
$$(\psi \circ \varphi)^* = \varphi^* \circ \psi^*.$$
This behavior, i.e. reversing the direction of maps and the order of composition, is
referred to as contravariance and the opposite of contravariance, i.e. preserving the
direction of maps and order of composition, is then referred to as covariance. Thus,
the construction $\varphi^*$ is contravariant and the construction $\varphi_*$ is covariant.
These are basic concepts of category theory where the constructions $\varphi \mapsto \varphi_*$
and $\varphi \mapsto \varphi^*$ are referred to as covariant and contravariant functors, which means
that they preserve the identity, and preserve or reverse the order of composition.
For our purposes, however, we do not need to investigate these concepts further.
The interested reader can look at [12].
At this moment, the most important fact for us is that the dual is contravariant,
i.e. for a linear map of vector spaces
$$\varphi : V \to W$$
we obtain a linear map
$$\varphi^* : W^* \to V^*.$$
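Covariance and contravariance can be made concrete with ordinary function composition. The following is a minimal sketch, with illustrative maps on $\mathbb{R}^2$ and $\mathbb{R}^3$ that are not taken from the text: `pushforward` and `pullback` are hypothetical names for the constructions $\varphi \mapsto \varphi_*$ and $\varphi \mapsto \varphi^*$.

```python
def compose(outer, inner):
    return lambda x: outer(inner(x))

def pushforward(phi):
    """phi_* : f |-> phi o f (covariant in phi)."""
    return lambda f: compose(phi, f)

def pullback(phi):
    """phi^* : g |-> g o phi (contravariant in phi)."""
    return lambda g: compose(g, phi)

# Illustrative linear maps phi : R^2 -> R^3, psi : R^3 -> R^2, and a linear form g on R^2.
def phi(v):
    x, y = v
    return (x + y, 2 * x, y)

def psi(v):
    a, b, c = v
    return (a - c, b + 3 * c)

def g(v):
    p, q = v
    return 5 * p - q

v = (2, -1)
lhs = pullback(compose(psi, phi))(g)(v)       # (psi o phi)^* g
rhs = pullback(phi)(pullback(psi)(g))(v)      # phi^* (psi^* g): reversed order
cov = pushforward(compose(psi, phi))(lambda t: (t, -t))(4)
cov2 = pushforward(psi)(pushforward(phi)(lambda t: (t, -t)))(4)
print(lhs, rhs, cov, cov2)
```

Evaluating at one vector only illustrates the identities; they hold for all arguments because composition of functions is associative.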
1.3 The dual basis
Suppose now that a vector space $V$ is finite-dimensional, and let $(v_1, \dots, v_n)$ be
an ordered basis of $V$. Then, by the definition of a basis, there exist linear forms
$(f_1, \dots, f_n) \in V^*$ such that
$$f_i(v_j) = \begin{cases} 1 & \text{when } i = j \\ 0 & \text{else.} \end{cases}$$
Proposition. $(f_1, \dots, f_n)$ is an ordered basis of $V^*$ (and is referred to as the dual
(ordered) basis of $(v_1, \dots, v_n)$).
Proof. We know that any linear form $f : V \to \mathbb{F}$ satisfies
$$f(\lambda_1 v_1 + \cdots + \lambda_n v_n) = \lambda_1 f(v_1) + \cdots + \lambda_n f(v_n).$$
We conclude that
$$f = f(v_1)f_1 + \cdots + f(v_n)f_n.$$
Linear independence follows by evaluating a relation $\mu_1 f_1 + \cdots + \mu_n f_n = 0$ at $v_j$,
which gives $\mu_j = 0$. □
Note that finite-dimensionality was used in the last line of the proof, where we
would get an undefined infinite sum, were the basis infinite.
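In coordinates, the dual basis can be computed by inverting a matrix: if the basis vectors $v_j$ are the columns of a matrix $B$, then the rows of $B^{-1}$ are the coordinate rows of the forms $f_i$, since $B^{-1}B = I$ says exactly $f_i(v_j) = 1$ for $i = j$ and $0$ otherwise. The sketch below assumes $V = \mathbb{Q}^n$ with exact rational arithmetic; the function names and the sample basis are illustrative.

```python
from fractions import Fraction

def invert(matrix):
    """Gauss-Jordan inverse over the rationals; raises StopIteration if singular."""
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def dual_basis(basis):
    """Rows of the inverse of the matrix whose columns are the basis vectors."""
    n = len(basis)
    columns_as_matrix = [[basis[j][i] for j in range(n)] for i in range(n)]
    return invert(columns_as_matrix)

def apply_form(form, vector):
    return sum(a * x for a, x in zip(form, vector))

basis = [(1, 1), (1, 2)]                      # illustrative ordered basis of Q^2
forms = dual_basis(basis)
print(forms)
print([[apply_form(f, v) for v in basis] for f in forms])
```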
1.4 The double dual
Let $V$ be a vector space. Then there is a map
$$\iota : V \to (V^*)^*$$
defined naturally as follows: Let
$$v \in V$$
and
$$(f : V \to \mathbb{F}) \in V^*.$$
Then define
$$(\iota(v))(f) = f(v).$$
Proposition. The map $\iota$ is an isomorphism when $V$ is finite-dimensional.
Proof. Let $V$ have an ordered basis $(v_1, \dots, v_n)$. Let $(f_1, \dots, f_n)$ be the dual
ordered basis, and let $(w_1, \dots, w_n)$ be the dual ordered basis of $(f_1, \dots, f_n)$. By
definition, we have $\iota(v_i) = w_i$. □
1.5 Duals of inner product spaces
For a general finite-dimensional space $V$, there is no "naturally defined"
isomorphism $V \to V^*$. This statement can actually be made more precise, but we won't
need that. For a finite-dimensional real vector space $V$ with inner product $\langle u, v\rangle$,
however, the situation is different. We can define a linear map
$$\theta : V \to V^* \qquad (*)$$
by
$$(\theta(v))(w) = \langle v, w\rangle. \qquad (**)$$
It is easily seen that this is an isomorphism, since when $(v_1, \dots, v_n)$ is an
orthonormal ordered basis, $(\theta(v_1), \dots, \theta(v_n))$ is the dual basis.
When $V$ is a finite-dimensional inner product vector space over $\mathbb{C}$, the situation
is somewhat more complicated. If we attempt to define an isomorphism (*) by the
formula (**), we find by the properties of the complex inner product that the map
we obtain is anti-linear, not linear. It is possible to define the complex conjugate
space $\overline{V}$ which is the same as $V$ as a real vector space, and the multiplication by
$\lambda \in \mathbb{C}$ on $\overline{V}$ is defined as the multiplication by the complex conjugate $\bar{\lambda}$ on $V$. Then
the formula (**) defines an isomorphism
$$\overline{V} \to V^*.$$
2 Multilinear maps and the tensor product
2.1
Let $V_1, \dots, V_n, W$ be vector spaces over the same field $\mathbb{F}$ (again, we assume $\mathbb{F} = \mathbb{R}$
or $\mathbb{F} = \mathbb{C}$). A multilinear map $\varphi$ from $V_1 \times \cdots \times V_n$ into $W$ is a map of sets which is
linear in each coordinate. This means that for fixed $v_i \in V_i$, $i \ne k$ ($i, k = 1, \dots, n$),
and for $x, y \in V_k$, $\lambda \in \mathbb{F}$, we have
$$\varphi(v_1, \dots, v_{k-1}, x + y, v_{k+1}, \dots, v_n) = \varphi(v_1, \dots, v_{k-1}, x, v_{k+1}, \dots, v_n) + \varphi(v_1, \dots, v_{k-1}, y, v_{k+1}, \dots, v_n)$$
and
$$\varphi(v_1, \dots, v_{k-1}, \lambda x, v_{k+1}, \dots, v_n) = \lambda\varphi(v_1, \dots, v_{k-1}, x, v_{k+1}, \dots, v_n).$$
When $n = 2$ resp. $n = 3$, we speak of a bilinear resp. trilinear map, etc.
Beware the standard mistake: Note that a multilinear map is not linear (except
in special cases such as when $V_1 = \cdots = V_n = 0$). For example, in the bilinear
case, the linear additivity formula gives
$$2\varphi(x, z + t) = \varphi(2x, z + t) = \varphi(x, z) + \varphi(x, t),$$
while the multilinear additivity formula gives
$$\varphi(x, z + t) = \varphi(x, z) + \varphi(x, t).$$
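The warning can be seen on the simplest bilinear map, multiplication $\mathbb{R} \times \mathbb{R} \to \mathbb{R}$. The sketch below is an illustration with arbitrarily chosen integer inputs: it checks additivity and homogeneity in each slot separately, and shows that additivity fails when pairs are added as vectors of $\mathbb{R} \times \mathbb{R}$.

```python
def phi(x, y):
    """The product map R x R -> R: bilinear, but not linear on R x R."""
    return x * y

x, x2, y, y2, lam = 3, -2, 5, 7, 4

additive_in_first = phi(x + x2, y) == phi(x, y) + phi(x2, y)
additive_in_second = phi(x, y + y2) == phi(x, y) + phi(x, y2)
homogeneous_in_first = phi(lam * x, y) == lam * phi(x, y)
homogeneous_in_second = phi(x, lam * y) == lam * phi(x, y)

# Linear additivity would require phi((x, y) + (x2, y2)) = phi(x, y) + phi(x2, y2).
linear_additivity = phi(x + x2, y + y2) == phi(x, y) + phi(x2, y2)

print(additive_in_first, additive_in_second,
      homogeneous_in_first, homogeneous_in_second, linear_additivity)
```

Scaling both slots by $\lambda$ multiplies the value by $\lambda^2$, which is another way to see that the map is not linear.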
2.2 The tensor product by universality
Since a multilinear map from $V_1 \times \cdots \times V_n$ to $W$ is not linear, it raises the question
whether the information contained in a multilinear map could be equivalently
replaced by the information contained in a linear map.
In mathematics, a standard approach to such a situation is by looking for a
universal object: Is there a vector space $W_0$ and a multilinear map $\varphi$ from $V_1 \times \cdots \times V_n$
to $W_0$ such that for every multilinear map $\psi$ from $V_1 \times \cdots \times V_n$ to any vector space
$W$ there exists a unique linear map $\psi_0 : W_0 \to W$ such that $\psi_0 \circ \varphi = \psi$? We express
this by a diagram (the arrows labelled multi mean multilinear maps):
$$\begin{array}{ccc} V_1 \times \cdots \times V_n & & \\ {\scriptstyle \varphi\ (\text{multi})}\downarrow & \searrow{\scriptstyle \psi\ (\text{multi})} & \\ W_0 & \stackrel{\psi_0}{\dashrightarrow} & W. \end{array}$$
The dotted arrow means that the map (in this case linear) exists and is determined
by the other data. For given vector spaces $V_1, \dots, V_n$, such a universal vector space
$W_0$ indeed exists. It is called the tensor product, and denoted by
$$W_0 = V_1 \otimes \cdots \otimes V_n.$$
Of course, the existence is yet to be proved. However, let us observe that just from
the universal property, if the tensor product exists, it must be unique up to a preferred
(we say canonical) isomorphism: Suppose
$$\varphi' : V_1 \times \cdots \times V_n \to W_0'$$
is another universal multilinear map. Then by the universality of $W_0$, there exists a
linear map $\alpha : W_0 \to W_0'$ such that
$$\varphi' = \alpha\varphi.$$
Similarly, by the universality of $W_0'$, there exists a linear map $\beta : W_0' \to W_0$
such that
$$\varphi = \beta\varphi'.$$
But now, since we have
$$\varphi = \beta\alpha\varphi,$$
by the uniqueness part of the universal property of $\varphi$, we must have
$$\beta\alpha = \mathrm{Id},$$
and similarly
$$\alpha\beta = \mathrm{Id},$$
so $\alpha$ and $\beta$ are linear isomorphisms.
2.3 The existence of the tensor product
Proposition. Let $V_1, \dots, V_n$ be vector spaces. Then there exists a vector space
$V_1 \otimes \cdots \otimes V_n$ (the tensor product) which satisfies the universality property from the last
paragraph.
Proof. The construction is not very inspiring. Recall from Appendix A, 5.6 the
construction of the free vector space. Now take the free vector space
$$F(V_1 \times \cdots \times V_n) \qquad (*)$$
on the (typically infinite) basis $V_1 \times \cdots \times V_n$ (forgetting, for the moment, the vector
space structure of $V_1, \dots, V_n$ completely).
Then there is a canonical (i.e. obvious) map
$$\varphi_0 : V_1 \times \cdots \times V_n \to F(V_1 \times \cdots \times V_n),$$
namely sending each
$$x = (v_1, \dots, v_n) \in V_1 \times \cdots \times V_n$$
to the free generator of the same name. This map $\varphi_0$ is just a map of sets; there is no
reason even to suspect that it may be multilinear.
Now, however, we apply our technique of factorization. Namely, in (*), take the
vector subspace $Z$ generated by all the elements
$$-(v_1, \dots, v_{k-1}, x + y, v_{k+1}, \dots, v_n) + (v_1, \dots, v_{k-1}, x, v_{k+1}, \dots, v_n) + (v_1, \dots, v_{k-1}, y, v_{k+1}, \dots, v_n),$$
and
$$(v_1, \dots, v_{k-1}, \lambda x, v_{k+1}, \dots, v_n) - \lambda(v_1, \dots, v_{k-1}, x, v_{k+1}, \dots, v_n)$$
where $v_i \in V_i$, $i \ne k$, $i, k = 1, \dots, n$, $x, y \in V_k$ and $\lambda \in \mathbb{F}$. Now define $W_0$ as the factor
$$W_0 = F(V_1 \times \cdots \times V_n)/Z.$$
Therefore, by definition of the factor space, we have a canonical projection
$$\pi : F(V_1 \times \cdots \times V_n) \to W_0.$$
Put
$$\varphi = \pi\varphi_0.$$
Then we immediately see that $\varphi$ is a multilinear map, because the definition of
multilinearity is precisely equivalent to asserting that our generators of $Z$ go to $0$.
Now by definition of basis, any map of sets
$$\psi : V_1 \times \cdots \times V_n \to W$$
determines a unique linear map
$$\Phi : F(V_1 \times \cdots \times V_n) \to W$$
such that
$$\Phi(v_1, \dots, v_n) = \psi(v_1, \dots, v_n).$$
If $\psi$ is moreover multilinear, then
$$\Phi[Z] = 0,$$
so by the Homomorphism Theorem, there exists a (necessarily unique) linear map
$$\psi_0 : F(V_1 \times \cdots \times V_n)/Z \to W$$
such that
$$\psi_0\pi = \Phi.$$
□
Notation: One usually denotes
$$v_1 \otimes \cdots \otimes v_n = \varphi(v_1, \dots, v_n) \in V_1 \otimes \cdots \otimes V_n.$$
Let us also remark that to be completely precise, we should denote our tensor
product by
$$V_1 \otimes_{\mathbb{F}} \cdots \otimes_{\mathbb{F}} V_n$$
to distinguish the field. We will, however, typically not use this longer notation
unless confusion can arise.
Perhaps the most important convention is that in most of advanced mathematics,
a multilinear map
$$\psi : V_1 \times \cdots \times V_n \to W$$
is generally identified with the corresponding linear map
$$\psi_0 : V_1 \otimes \cdots \otimes V_n \to W,$$
which means that the two concepts are no longer distinguished explicitly, and the
linear variant is written in all formulas.
2.4 The tensor product and bases
To avoid excessive indexing, assume here that $n = 2$ and investigate the tensor
product $V \otimes W$ of vector spaces $V$, $W$ with ordered bases $(v_1, \dots, v_m)$ and
$(w_1, \dots, w_p)$. (See Exercise (2).)
Proposition. The set
$$\{v_i \otimes w_j \mid i = 1, \dots, m;\ j = 1, \dots, p\}$$
is a basis of $V \otimes W$.
Proof. By the uniqueness explained in 2.2, it suffices to exhibit a bilinear map
$$\varphi_0 : V \times W \to F(B)$$
into the free vector space on the set $B$ of formal symbols $v_i \otimes w_j$, $i = 1, \dots, m$,
$j = 1, \dots, p$, which satisfies the universal property for bilinear maps. Put
$$\varphi_0\Bigl(\sum_{i=1}^{m} \lambda_i v_i, \sum_{j=1}^{p} \mu_j w_j\Bigr) = \sum_{i,j} \lambda_i\mu_j\, v_i \otimes w_j.$$
Bilinearity is an immediate consequence of associativity and distributivity.
For universality, simply note that every bilinear map
$$\psi : V \times W \to U$$
into another vector space $U$ must satisfy
$$\psi\Bigl(\sum_{i=1}^{m} \lambda_i v_i, \sum_{j=1}^{p} \mu_j w_j\Bigr) = \sum_{i,j} \lambda_i\mu_j\,\psi(v_i, w_j),$$
so the map required by the universality property is uniquely given by the formula
$$\psi_0(v_i \otimes w_j) = \psi(v_i, w_j).$$
□
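In coordinates with respect to the basis $v_i \otimes w_j$, the tensor $v \otimes w$ of $v = \sum \lambda_i v_i$ and $w = \sum \mu_j w_j$ has the $m \times p$ array of coefficients $\lambda_i\mu_j$. The sketch below uses this coordinate description (an illustration, with hypothetical function names and arbitrary integer inputs) to check bilinearity and to exhibit an element of $V \otimes W$ that is not of the form $v \otimes w$.

```python
def tensor(v, w):
    """Coordinates of v (x) w in the basis v_i (x) w_j: the array lambda_i * mu_j."""
    return [[a * b for b in w] for a in v]

def add(s, t):
    return [[x + y for x, y in zip(rs, rt)] for rs, rt in zip(s, t)]

def scale(c, t):
    return [[c * x for x in row] for row in t]

def det2(t):
    return t[0][0] * t[1][1] - t[0][1] * t[1][0]

v, v2, w = [1, 2], [3, -1], [4, 0, 5]
sum_v = [a + b for a, b in zip(v, v2)]

bilinear_add = tensor(sum_v, w) == add(tensor(v, w), tensor(v2, w))
bilinear_scale = tensor([7 * a for a in v], w) == scale(7, tensor(v, w))
dimension = len(tensor(v, w)) * len(tensor(v, w)[0])   # m * p coordinates

# For m = p = 2, every v (x) w has determinant 0, but e1 (x) e1 + e2 (x) e2 does not.
e1, e2 = [1, 0], [0, 1]
mixed = add(tensor(e1, e1), tensor(e2, e2))
print(bilinear_add, bilinear_scale, dimension, det2(tensor([2, 3], [5, 7])), det2(mixed))
```

The last line illustrates that the elements $v \otimes w$ span $V \otimes W$ without exhausting it.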
2.5 The tensor product and duals
Let $V$, $W$ be vector spaces over $\mathbb{F}$. Note that we have canonical maps
$$\kappa : V^* \otimes W \to \mathrm{Hom}(V, W),$$
$$\lambda : V^* \otimes W^* \to (V \otimes W)^*.$$
Specifically, let $v \in V$, $w \in W$, and let
$$(f : V \to \mathbb{F}) \in V^*, \qquad (g : W \to \mathbb{F}) \in W^*.$$
Define
$$(\kappa(f \otimes w))(v) = f(v)\,w,$$
$$(\lambda(f \otimes g))(v \otimes w) = f(v)\,g(w).$$
Proposition. Let $V$, $W$ be finite-dimensional vector spaces. Then the linear maps
$\kappa$, $\lambda$ defined above are isomorphisms.
Proof. Let $V$, $W$ have ordered bases $(v_1, \dots, v_m)$ and $(w_1, \dots, w_n)$, and let the dual
ordered bases be $(f_1, \dots, f_m)$, $(g_1, \dots, g_n)$. We already know that the space
$\mathrm{Hom}(V, W)$ is isomorphic to the space of $(n \times m)$-matrices by assigning to a linear
map $\alpha : V \to W$ its matrix with respect to the bases $(v_i)$, $(w_j)$. Denote by
$$\theta_{i,j} \in \mathrm{Hom}(V, W)$$
the linear map whose matrix has 1 in the $j$'th row and $i$'th column and 0's elsewhere.
Then clearly the set of all $\theta_{i,j}$, $i = 1, \dots, m$, $j = 1, \dots, n$, is a basis of $\mathrm{Hom}(V, W)$,
and we have
$$\kappa(f_i \otimes w_j) = \theta_{i,j},$$
thus proving the statement for $\kappa$.
Now let $(e_{i,j})$ be the basis of $(V \otimes W)^*$ dual to the basis $(v_i \otimes w_j)$. Then by
definition
$$\lambda(f_i \otimes g_j) = e_{i,j},$$
thus proving the statement about $\lambda$. $\square$
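In coordinates, the first canonical map of 2.5 sends a tensor $f \otimes w$ to a rank-one matrix. The following is a minimal Python sketch under the assumption that functionals are given by their coordinates in the dual basis and vectors by their coordinates in the chosen basis; the names `kappa` and `apply` are hypothetical and serve only this illustration.

```python
def kappa(f, w):
    """Matrix of the linear map v |-> f(v) * w.
    f: coordinates of a functional in the dual basis (f_1, ..., f_m);
    w: coordinates of a vector in the basis (w_1, ..., w_n).
    The result has n rows and m columns, with entry [row][col] = w[row] * f[col]."""
    return [[wj * fi for fi in f] for wj in w]

def apply(matrix, v):
    """Multiply a matrix (list of rows) by a coordinate vector."""
    return [sum(r * x for r, x in zip(row, v)) for row in matrix]

def unit(size, i):
    return [int(k == i) for k in range(size)]

m, n = 2, 3

# kappa(f_i (x) w_j) has a single 1 in row j, column i (0-based here).
elementary = {(i, j): kappa(unit(m, i), unit(n, j))
              for i in range(m) for j in range(n)}

f, w, v = [2, -1], [1, 0, 3], [4, 5]
f_of_v = sum(fi * vi for fi, vi in zip(f, v))   # f(v) = 2*4 - 1*5
image = apply(kappa(f, w), v)                   # equals f(v) * w
```

The dictionary `elementary` lists the $mn$ elementary matrices, which is the coordinate form of the basis used in the proof.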
3 The exterior (Grassmann) algebra

3.1 Alternating (multilinear) maps
Let $V$, $W$ be vector spaces over the field $\mathbb{F}$ (which, again, we assume to be equal to
$\mathbb{R}$ or $\mathbb{C}$). Recall that a multilinear map
$$\psi : \underbrace{V \times \dots \times V}_{k \text{ times}} \to W$$
can be identified with a linear map
$$\psi : \underbrace{V \otimes \dots \otimes V}_{k \text{ times}} \to W$$
(the domain is, of course, also denoted by $V^{\otimes k}$). The multilinear map $\psi$ is
called alternating if for any permutation
$$\sigma : \{1, \dots, k\} \to \{1, \dots, k\}$$
and any vectors $v_i \in V$, we have
$$\psi(v_{\sigma(1)} \otimes \dots \otimes v_{\sigma(k)}) = \mathrm{sgn}(\sigma)\,\psi(v_1 \otimes \dots \otimes v_k).$$
3.2
It is natural to ask if there is a universal object for alternating multilinear maps just
as the tensor product was for multilinear maps, i.e. if for every vector space $V$ and
every $k = 0, 1, 2, \dots$ there exists a vector space $W_a$ and an alternating map
$$\phi : V^{\otimes k} \to W_a$$
such that for every alternating map
$$\psi : V^{\otimes k} \to W$$
there exists a unique linear map $\psi_a : W_a \to W$ such that
$$\psi = \psi_a \phi,$$
or, expressed by a diagram,
$$\begin{array}{ccc}
V^{\otimes k} & \xrightarrow{\ \psi\ (\mathrm{alt})\ } & W \\
{\scriptstyle \phi\ (\mathrm{alt})}\ \big\downarrow & \nearrow {\scriptstyle \psi_a} & \\
W_a & &
\end{array}$$
The notation alt means an alternating map. Such an object indeed exists, as we shall
prove in 3.3. It is called the exterior power, and is denoted by $\Lambda^k(V)$. It is also
unique up to canonical isomorphism by the same argument as the tensor product
(see Exercise (6)).
3.3 The Existence of the exterior power
Proposition. The vector space $W_a = \Lambda^k(V)$ and the map $\phi : V^{\otimes k} \to \Lambda^k(V)$ with
the universal property described in Section 3.2 exist.
Proof. Let $Z \subseteq V^{\otimes k}$ be the vector subspace generated by all elements of the form
$$(v_{\sigma(1)} \otimes \dots \otimes v_{\sigma(k)}) - \mathrm{sgn}(\sigma)(v_1 \otimes \dots \otimes v_k)$$
where $\sigma$ is a permutation on $\{1, \dots, k\}$ and $v_i \in V$. Put $\Lambda^k(V) = V^{\otimes k}/Z$. Then by
definition, the quotient map
$$\phi : V^{\otimes k} \to \Lambda^k(V)$$
is alternating (since all the generators of $Z$ being sent to 0 is a translation of the
definition of an alternating map). Let $\psi : V^{\otimes k} \to W$ be an alternating map. Then, again, by
definition,
$$\psi[Z] = 0,$$
so by the Homomorphism Theorem, there exists a unique linear map
$$\psi_a : \Lambda^k(V) = V^{\otimes k}/Z \to W$$
such that $\psi = \psi_a \circ \phi$, as claimed. $\square$
Notation: For $v_1, \dots, v_k \in V$, one writes
$$v_1 \wedge \dots \wedge v_k = \phi(v_1 \otimes \dots \otimes v_k) \in \Lambda^k(V).$$
3.4 Exterior powers and bases
Proposition. Let $V$ be a finite-dimensional vector space with ordered basis
$(v_1, \dots, v_n)$. Then the set
$$\{v_{i_1} \wedge \dots \wedge v_{i_k} \mid 1 \le i_1 < \dots < i_k \le n\} \tag{1}$$
is a basis of $\Lambda^k(V)$.
Proof. Again, we will use the uniqueness which follows from the universal property.
Let $W_a'$ be the free vector space on the set (1). A linear map on a vector space can
be defined by specifying its values on the basis elements. The basis elements of
$V^{\otimes k}$ are
$$v_{i_1} \otimes \dots \otimes v_{i_k}, \qquad i_1, \dots, i_k \in \{1, \dots, n\}. \tag{2}$$
Define thus
$$\phi' : V^{\otimes k} \to W_a'$$
by sending the element (2) to 0 if two of the numbers $i_1, \dots, i_k$ are equal, and to
$$\mathrm{sgn}(\sigma)\, v_{i_{\sigma(1)}} \wedge \dots \wedge v_{i_{\sigma(k)}}$$
if
$$i_{\sigma(1)} < \dots < i_{\sigma(k)}.$$
Now let $\psi : V^{\otimes k} \to W$ be an alternating map. Define
$$\psi_a' : W_a' \to W$$
by
$$\psi_a'(v_{i_1} \wedge \dots \wedge v_{i_k}) = \psi(v_{i_1} \otimes \dots \otimes v_{i_k}). \tag{3}$$
Then for a basis element $x$ in (2),
$$\psi(x) = \psi_a' \phi'(x) \tag{4}$$
follows from the definition of an alternating map (in particular, note that if two
coordinates of $x$ coincide, swapping these two coordinates only changes the sign but
not $x$, so we get $\psi(x) = -\psi(x)$, implying $\psi(x) = 0$). Note also that the definition
(3) is thereby forced by (4), so $\phi'$ has the universal property of 3.2. $\square$
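The map on basis tensors used in the proof of 3.4 is easy to compute. The following is a minimal Python sketch, not part of the original text: a basis tensor is represented by its tuple of indices, and the hypothetical helper `sort_sign` returns the sign of the sorting permutation together with the increasing index tuple, or a zero sign when an index repeats. The choice $n = 4$, $k = 2$ is an arbitrary example.

```python
from itertools import combinations, product
from math import comb

def sort_sign(indices):
    """Return (sign, sorted_tuple) for distinct indices, where sign is the sign
    of the permutation that sorts them; return (0, None) if an index repeats."""
    if len(set(indices)) < len(indices):
        return 0, None
    idx = list(indices)
    sign = 1
    # Bubble sort: every adjacent swap is a transposition and flips the sign.
    for end in range(len(idx) - 1, 0, -1):
        for p in range(end):
            if idx[p] > idx[p + 1]:
                idx[p], idx[p + 1] = idx[p + 1], idx[p]
                sign = -sign
    return sign, tuple(idx)

n, k = 4, 2
wedge_basis = list(combinations(range(1, n + 1), k))      # increasing index tuples
tensor_basis = list(product(range(1, n + 1), repeat=k))   # all index tuples

# Image of every basis tensor: a signed basis wedge, or zero.
images = {t: sort_sign(t) for t in tensor_basis}
nonzero = [t for t, (s, _) in images.items() if s != 0]
```

Each increasing tuple is hit by exactly $k!$ basis tensors, which matches the count of permutations of $k$ distinct indices.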
3.5 Remark
Note that if $\dim(V) = n$, then we have
$$\dim \Lambda^k(V) = \binom{n}{k}.$$
In particular, for $k > n$, we have $\Lambda^k(V) = 0$; and we also have
$$\dim(\Lambda^n(V)) = 1.$$
This means that the space of alternating multilinear maps on $V^{\otimes n}$ with values in $\mathbb{F}$
is 1-dimensional.
Specifying an isomorphism
$$\iota : V \to \mathbb{F}^n$$
(where the right-hand side is the space of columns), one such non-zero alternating
map is
$$v_1 \otimes \dots \otimes v_n \mapsto \det(\iota(v_1), \dots, \iota(v_n)).$$
(Here the argument of the determinant is simply the $n \times n$ matrix with the listed
columns put in that order.) Thus, we have proved that any alternating multilinear map
on $V^{\otimes n}$ is a constant multiple of the determinant!
When $\mathbb{F} = \mathbb{R}$, a choice of one of the two connected components of $\Lambda^n(V) \setminus \{0\}$
is called an orientation of the vector space $V$, and the other orientation is called
the opposite orientation. A linear isomorphism $f : V \to V$ is said to preserve
orientation if the linear isomorphism $\Lambda^n(f)$ (see Exercise (7)) restricted to
$\Lambda^n(V) \setminus \{0\}$ preserves the chosen connected component. Otherwise, $f$ is said to
reverse orientation.
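The alternating property of the determinant mentioned in Remark 3.5 can be checked directly. The following is a minimal Python sketch, assuming vectors of $\mathbb{F}^n$ are given as lists of integers; it uses the Leibniz expansion, and the names `perm_sign` and `det` are local helpers introduced for this illustration only.

```python
from itertools import permutations

def perm_sign(p):
    """Sign of a permutation of 0..n-1 given as a tuple, via its inversion count."""
    inversions = sum(1 for a in range(len(p))
                     for b in range(a + 1, len(p)) if p[a] > p[b])
    return -1 if inversions % 2 else 1

def det(columns):
    """Leibniz formula; `columns` is a list of n column vectors of length n."""
    n = len(columns)
    total = 0
    for p in permutations(range(n)):
        term = perm_sign(p)
        for col in range(n):
            term *= columns[col][p[col]]
        total += term
    return total

vectors = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
d = det(vectors)

# Permuting the arguments multiplies the value by the sign of the permutation.
alternating_ok = all(
    det([vectors[s[i]] for i in range(3)]) == perm_sign(s) * d
    for s in permutations(range(3))
)
```

A repeated argument gives the value 0, in agreement with the general argument in the proof of 3.4.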
3.6 The exterior product
Let $V$, $Z$, $W$ be vector spaces. Consider two numbers $k, \ell = 0, 1, 2, \dots$. It is useful
to study multilinear maps
$$\psi : V^{\otimes k} \otimes Z^{\otimes \ell} \to W$$
which are alternating in the first $k$ coordinates and the last $\ell$ coordinates separately.
By this, we mean that
$$\psi(v_{\sigma(1)} \otimes \dots \otimes v_{\sigma(k+\ell)}) = \mathrm{sgn}(\sigma)\,\psi(v_1 \otimes \dots \otimes v_{k+\ell})$$
whenever $v_1, \dots, v_k \in V$, $v_{k+1}, \dots, v_{k+\ell} \in Z$, and $\sigma$ is a permutation which
satisfies
$$\sigma(\{1, \dots, k\}) = \{1, \dots, k\}$$
(and hence, of course, also $\sigma(\{k+1, \dots, k+\ell\}) = \{k+1, \dots, k+\ell\}$). It turns
out that the universal object for multilinear maps alternating in the first $k$ and last $\ell$
coordinates separately is $\Lambda^k(V) \otimes \Lambda^\ell(Z)$. More precisely, we have the following
Proposition. The map
$$\phi_2 = \phi \otimes \phi : V^{\otimes k} \otimes Z^{\otimes \ell} \to \Lambda^k(V) \otimes \Lambda^\ell(Z)$$
(see Exercises (3) and (4)) is alternating in the first $k$ and last $\ell$ coordinates
separately. For any vector space $W$ and any linear map
$$\psi : V^{\otimes k} \otimes Z^{\otimes \ell} \to W$$
alternating in the first $k$ and last $\ell$ coordinates separately, there exists a unique
linear map
$$\psi_2 : \Lambda^k(V) \otimes \Lambda^\ell(Z) \to W$$
such that $\psi = \psi_2 \phi_2$.
Proof. It is obvious that $\phi_2$ is alternating in the first $k$ and the last $\ell$ coordinates
separately. Consider a map $\psi$ as in the statement of the proposition. Then for
$w \in Z^{\otimes \ell}$ fixed,
$$v \mapsto \psi(v \otimes w)$$
is an alternating map on $V^{\otimes k}$, which gives us a map
$$\psi_w : \Lambda^k(V) \to W. \tag{*}$$
Fixing now $v \in \Lambda^k(V)$, on the other hand, the value of (*) at $v$ is clearly linear and
alternating in $w$, thus giving us a map
$$\psi_v : \Lambda^\ell(Z) \to W. \tag{**}$$
Therefore, (**) specifies a bilinear map
$$\Lambda^k(V) \times \Lambda^\ell(Z) \to W,$$
and hence a linear map
$$\psi_2 : \Lambda^k(V) \otimes \Lambda^\ell(Z) \to W.$$
We have $\psi = \psi_2 \phi_2$ by definition, which in turn uniquely determines $\psi_2$ since
$\Lambda^k(V) \otimes \Lambda^\ell(Z)$ is generated by elements of the form
$(v_1 \wedge \dots \wedge v_k) \otimes (v_{k+1} \wedge \dots \wedge v_{k+\ell})$, $v_1, \dots, v_k \in V$, $v_{k+1}, \dots, v_{k+\ell} \in Z$. $\square$
The whole point of the proposition for our purposes is that for $V = Z$,
when a map
$$\psi : V^{\otimes (k+\ell)} \to W$$
is alternating, it is clearly alternating in the first $k$ and last $\ell$ coordinates separately,
so the universal property in the proposition (for $W = \Lambda^{k+\ell}(V)$) gives a map
$$\wedge : \Lambda^k(V) \otimes \Lambda^\ell(V) \to \Lambda^{k+\ell}(V).$$
We will think of this map as a kind of a product, called the exterior product, i.e.
write, for $x \in \Lambda^k(V)$, $y \in \Lambda^\ell(V)$,
$$x \wedge y \in \Lambda^{k+\ell}(V).$$
One has, of course, for $v_1, \dots, v_{k+\ell} \in V$,
$$(v_1 \wedge \dots \wedge v_k) \wedge (v_{k+1} \wedge \dots \wedge v_{k+\ell}) = v_1 \wedge \dots \wedge v_{k+\ell}.$$
In the above notation, we therefore see that
$$x \wedge y = (-1)^{k\ell}\, y \wedge x$$
since $(-1)^{k\ell}$ is the sign of the permutation swapping the first $k$ with the last $\ell$
coordinates (without changing their individual orders). Note that if we put
$$\Lambda(V) = \bigoplus_{k=0}^{\infty} \Lambda^k(V)$$
(let $\Lambda^0(V) = \mathbb{F}$), this defines an actual bilinear product
$$\wedge : \Lambda(V) \otimes \Lambda(V) \to \Lambda(V).$$
This product is associative and unital in the sense that
$$(x \wedge y) \wedge z = x \wedge (y \wedge z), \qquad 1 \wedge x = x \wedge 1 = x.$$
One calls $\Lambda(V)$ the exterior algebra (or the Grassmann algebra).
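The rules of computation in the exterior algebra can be tried out on a small example. The following is a minimal Python sketch, not part of the original text: an element of the exterior algebra is stored as a dictionary mapping increasing index tuples (standing for wedges of basis vectors) to coefficients, and the names `sort_sign`, `wedge` and `scale` are hypothetical helpers for this illustration.

```python
def sort_sign(indices):
    """Sign of the sorting permutation and the sorted tuple; (0, None) on a repeat."""
    if len(set(indices)) < len(indices):
        return 0, None
    idx = list(indices)
    sign = 1
    for end in range(len(idx) - 1, 0, -1):
        for p in range(end):
            if idx[p] > idx[p + 1]:
                idx[p], idx[p + 1] = idx[p + 1], idx[p]
                sign = -sign
    return sign, tuple(idx)

def wedge(x, y):
    """Exterior product, extended bilinearly from the basis wedges."""
    result = {}
    for ix, cx in x.items():
        for iy, cy in y.items():
            sign, key = sort_sign(ix + iy)
            if sign:
                result[key] = result.get(key, 0) + sign * cx * cy
    return {key: c for key, c in result.items() if c != 0}

def scale(c, x):
    return {key: c * val for key, val in x.items()}

one = {(): 1}                       # the unit 1 in degree 0
x = {(1,): 1, (2,): 3}              # degree k = 1
y = {(2,): 1, (4,): -2}             # degree l = 1
z = {(1, 3): 1, (2, 4): 5}          # degree 2

xy = wedge(x, y)
yx = wedge(y, x)
```

With these elements, `xy` and `yx` differ by the sign $(-1)^{1 \cdot 1}$, while `x` and `z` commute because $(-1)^{1 \cdot 2} = 1$.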
3.7 The exterior product and duality
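Let $V$ be a vector space over $\mathbb{R}$ or $\mathbb{C}$. Define a linear map
$$\nu : \Lambda^k(V^*) \to (\Lambda^k(V))^*$$
by
$$(\nu(f_1 \wedge \dots \wedge f_k))(v_1 \wedge \dots \wedge v_k) = \sum_{\sigma} \mathrm{sgn}(\sigma)\, f_{\sigma(1)}(v_1) \cdots f_{\sigma(k)}(v_k)$$
where the sum is over all permutations $\sigma$ on the set $\{1, \dots, k\}$.
Proposition. If $V$ is a finite-dimensional vector space, then the map $\nu$ defined
above is an isomorphism.
Proof. Let $(v_1, \dots, v_n)$ be an ordered basis of $V$ and let $(f_1, \dots, f_n)$ be the dual
ordered basis of $V^*$. Then for $1 \le i_1 < \dots < i_k \le n$, we have
$$(v_{i_1} \wedge \dots \wedge v_{i_k})^* = \nu(f_{i_1} \wedge \dots \wedge f_{i_k}),$$
where the left-hand side denotes the element of the basis of $(\Lambda^k(V))^*$ dual to the
basis (1) of 3.4. $\square$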
3.8 The Hodge * operator
Now let $V$ be a finite-dimensional real inner product space of dimension $n$. (The
complex case can be treated also, but we don't need it; see e.g. [8].) Then $\Lambda^k(V)$ is
naturally an inner product space where the inner product is defined by
$$\langle v_1 \wedge \dots \wedge v_k,\ w_1 \wedge \dots \wedge w_k \rangle = \sum_{\sigma} \mathrm{sgn}(\sigma)\, \langle v_{\sigma(1)}, w_1 \rangle \cdots \langle v_{\sigma(k)}, w_k \rangle$$
where the sum is over all permutations $\sigma$ on $\{1, \dots, k\}$. It is useful to note that
if $(v_1, \dots, v_n)$ is an ordered orthonormal basis of $V$, then the basis of $\Lambda^k(V)$ given by
Proposition 3.4 is orthonormal.
Now let $V$ be an oriented real finite-dimensional inner product space of dimension $n$.
Recall from Remark 3.5 that $\dim(\Lambda^n(V)) = 1$ and note that an orientation specifies
a connected component $C$ of $\Lambda^n(V) \setminus \{0\}$. Now there exists a unique $\eta \in C$ with
$\langle \eta, \eta \rangle = 1$. There exists a unique linear isomorphism
$$\varepsilon : \Lambda^n(V) \to \mathbb{R}$$
such that
$$\varepsilon(\eta) = 1.$$
Then we have a linear map
$$\mu : \Lambda^k(V) \to (\Lambda^{n-k}(V))^*$$
defined by
$$(\mu(v_1 \wedge \dots \wedge v_k))(v_{k+1} \wedge \dots \wedge v_n) = \varepsilon(v_1 \wedge \dots \wedge v_n).$$
Now define the Hodge $*$ operator
$$* : \Lambda^k(V) \to \Lambda^{n-k}(V)$$
as the composition
$$\Lambda^k(V) \xrightarrow{\ \mu\ } (\Lambda^{n-k}(V))^* \xrightarrow{\ \cong\ } \Lambda^{n-k}(V)$$
where the second map is given by the inner product on $\Lambda^{n-k}(V)$.
Note that when $(v_1, \dots, v_n)$ is an ordered orthonormal basis of $V$, then
$v_1 \wedge \dots \wedge v_n$ is equal either to $\eta$ or $-\eta$. We say that the basis is oriented if
$$v_1 \wedge \dots \wedge v_n = \eta.$$
Then we see readily that for an oriented ordered orthonormal basis $(v_1, \dots, v_n)$ of $V$, we have
$$*(v_1 \wedge \dots \wedge v_k) = v_{k+1} \wedge \dots \wedge v_n.$$
4 Exercises
(1) Let $V$, $W$ be finite-dimensional vector spaces, and let $\phi : V \to W$ be a
linear map. Fix ordered bases $(v_1, \dots, v_n)$ of $V$ and $(w_1, \dots, w_m)$ of $W$. Let
$(f_1, \dots, f_n)$ and $(g_1, \dots, g_m)$ be the dual ordered bases. Let $A$ be the matrix
of the map $\phi$ with respect to the ordered bases $(v_1, \dots, v_n)$, $(w_1, \dots, w_m)$.
Prove that the matrix of the map $\phi^*$ with respect to the dual ordered bases is
the transposed matrix $A^T$.
(2) Write down a basis of $V_1 \otimes \dots \otimes V_n$ in terms of chosen bases of $V_1, \dots, V_n$ for
general $n$.
(3) "Functoriality" of the tensor product: For linear maps $f : V \to V'$, $g : W \to W'$,
construct a map $f \otimes g : V \otimes W \to V' \otimes W'$ in such a way that $\mathrm{Id} \otimes \mathrm{Id}$
is the identity, and the construction preserves compositions.
(4) Prove commutativity, associativity and unitality of the tensor product, i.e.
construct isomorphisms
$$V \otimes W \to W \otimes V,$$
$$V \otimes (W \otimes Z) \to (V \otimes W) \otimes Z,$$
$$\mathbb{F} \otimes V \to V$$
which form commutative diagrams with the linear maps constructed in
Exercise (3). (This property is called "naturality".)
(5) Prove that for any vector spaces $V$, $W$, $Z$, there is a canonical isomorphism
$$\mathrm{Hom}(V, \mathrm{Hom}(W, Z)) \cong \mathrm{Hom}(V \otimes W, Z).$$
(6) Prove the uniqueness of $\Lambda^n(V)$ based on its universal property discussed
in 3.2.
(7) Prove "functoriality" of $\Lambda^n$, i.e. for a linear map $f : V \to W$, construct a
linear map $\Lambda^n(f) : \Lambda^n(V) \to \Lambda^n(W)$ which preserves identity maps and
compositions.
(8) Prove that for a finite-dimensional real vector space $V$, a linear isomorphism
$f : V \to V$ preserves orientation if and only if $\det(f) > 0$.
(9) Prove the associativity and unitality property of the exterior product defined
in Section 3.6.
(10) Let $V$ be a real finite-dimensional inner product space. Prove the commutativity
of the diagram
$$\begin{array}{ccc}
\Lambda^k(V^*) & \xrightarrow{\ \cong\ } & (\Lambda^k(V))^* \\
{\scriptstyle \cong}\ \big\uparrow & & \big\uparrow\ {\scriptstyle \cong} \\
\Lambda^k(V) & \xrightarrow{\ \mathrm{Id}\ } & \Lambda^k(V)
\end{array}$$
where the vertical maps are given by the inner products in $V$ and $\Lambda^k(V)$.
12 Smooth Manifolds, Differential Forms and Stokes’ Theorem
In this chapter, we will introduce smooth manifolds (“locally Euclidean spaces”).

A theory of differential forms, which we will exhibit, allows us to set up a
general theory of integration on such spaces, and to generalize Green’s Theorem
in Chapter 8 to the general Stokes Theorem in arbitrary dimension.
In the process of introducing these topics, we will touch on the field of algebraic
topology. For basic information on this topic, the reader may look at [20]. For a more
advanced introduction to algebraic topology from the point of view of differential
forms, we recommend [3]. For an introduction to topics which are more abstract,
we recommend [13, 14].
1 Smooth manifolds
1.1 Topological manifolds
A topological manifold of dimension $n$ (briefly a topological $n$-manifold) is a
metrizable separable topological space $M$ (metrizable means that there exists a
metric on $M$ which induces the given topology on $M$) such that for every $x \in M$
there exists an open neighborhood $U_x$ of $x$ and an injective open map
$$h_x : U_x \to \mathbb{R}^n \tag{*}$$
(open map means that the image of every set open in the domain is open in the
codomain). The neighborhood $U_x$ is called a coordinate neighborhood, and the
function $h_x$ is called a coordinate system, or coordinate system at $x$. The map
assigning to each $x \in M$ a coordinate neighborhood and a coordinate system is
called an atlas. The coordinate systems of an atlas are also referred to as charts.
Remarks:
1. Note that instead of requiring $h_x$ to be open, we could have equivalently required
that $h_x$ be a homeomorphism onto an open subset of $\mathbb{R}^n$ (see Exercise (1)).
2. Since we assume $M$ is separable and metrizable, it has a countable basis (see 1.2
of Chapter 9). The reason we don't actually say $M$ is a metric space is that we do
not want to specify the metric: the metric has no geometric significance, and is
only a technical tool at this point. While there are metrics on manifolds which do
have a geometrical significance, we will only see these when we develop more
structure (such as the concept of a Riemann metric in Chapter 15).
3. Note that the pairs $(U_x, h_x)$ may coincide for different points $x$. For example,
for $M = \mathbb{R}^n$, the atlas may contain only one coordinate system, namely $\mathbb{R}^n$ with
the identity map $\mathrm{Id} : \mathbb{R}^n \to \mathbb{R}^n$, which can be equal to $(U_x, h_x)$ for all $x \in \mathbb{R}^n$.
In other interesting cases, the atlas may contain only finitely many coordinate
systems (in fact, note that by definition, a compact manifold always has such
a finite atlas). The reader may wonder why we don't simply speak of atlases
as open covers $\mathcal{U}$, with coordinate systems on each $U \in \mathcal{U}$. This is merely a
technical point: it turns out that being able to denote a coordinate neighborhood
of a point by a single symbol simplifies many arguments.
4. Because we required separability, by our definition, an uncountable discrete set is
not a manifold. There is an alternative definition, calling a manifold any (possibly
uncountable) disjoint union of manifolds in our sense. (In a disjoint union $\coprod_i M_i$,
a set $U$ is open if and only if each $U \cap M_i$ is open in $M_i$.)
1.2 Smooth manifolds
A smooth manifold of dimension $n$ (briefly a smooth $n$-manifold) is a topological
manifold $M$ with an atlas $(U_x, h_x)$ such that for every $x, y \in M$, the composition
$$h_x[U_x \cap U_y] \xrightarrow{\ (h_x)^{-1}\ } U_x \cap U_y \xrightarrow{\ h_y\ } h_y[U_x \cap U_y] \tag{C}$$
is a smooth map, i.e. a map which is continuous and has partial derivatives of all
orders which are also continuous. (Note that the domain and codomain of (C) are
open subsets of $\mathbb{R}^n$; also note that the intersection of $U_x$ and $U_y$ may be empty; in
that case, the condition (C) is void.)
Remarks:
1. Note that this definition is completely intuitive: it simply says that in a coordinate
neighborhood, we can speak of smooth real functions, and that these concepts are
compatible when we pass from one coordinate neighborhood to another.
2. Note that the continuity of all higher partial derivatives does not follow from
their existence, even on an open set (see Exercise (2) of Chapter 3).
1.3 Differentiable maps
Let $M$ and $N$ be smooth manifolds of dimensions $m$ and $n$, respectively. A map
$f : M \to N$ is called a $C^r$-map if $f$ is continuous and for every $x \in M$, the
composition
$$h_x[f^{-1}[U_{f(x)}] \cap U_x] \xrightarrow{\ h_x^{-1}\ } f^{-1}[U_{f(x)}] \cap U_x \xrightarrow{\ f\ } U_{f(x)} \xrightarrow{\ h_{f(x)}\ } \mathbb{R}^n$$
has continuous partial derivatives up to order $r \in \mathbb{N}$. Note that the source of
the composition is an open subset of a Euclidean space. A $C^\infty$ map is a map
which is $C^r$ for all $r$. A $C^r$-diffeomorphism ($r \ge 1$) is a homeomorphism
$f : M \to N$ such that both $f$, $f^{-1}$ are $C^r$. A $C^\infty$-diffeomorphism will be referred
to simply as a diffeomorphism. Two smooth manifolds $M$, $N$ for which there exists
a diffeomorphism $M \to N$ are called diffeomorphic. For a point $x \in M$, a smooth
coordinate system at $x$ consists of an open neighborhood $U$ of $x$ and a (smooth)
diffeomorphism $h : U \to V$ where $V$ is an open subset of $\mathbb{R}^n$.
Given a smooth manifold $M$ with a given atlas $(U_x, h_x)$, any other atlas $(V_x, k_x)$
on the topological manifold $M$ is considered an atlas on the smooth manifold $M$
if the identity on $M$ is a diffeomorphism from the manifold defined by the atlas
$(U_x, h_x)$ to the manifold defined by the atlas $(V_x, k_x)$.
1.4 Examples
(1) Any open subset of a Euclidean space $\mathbb{R}^n$ is, of course, a smooth manifold,
and $C^r$-maps between such manifolds are simply maps for which the required
partial derivatives (in the old sense) exist and are continuous.
(2) More generally, an open subset $U$ of a smooth manifold $M$ automatically
inherits a structure of a smooth manifold.
(3) Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is a $C^\infty$-function. Define
$$M = \{(x, f(x)) \in \mathbb{R}^{n+1} \mid x \in \mathbb{R}^n\}.$$
Then $M$ is a smooth manifold with a single coordinate neighborhood $M$ and
the coordinate function
$$(x, f(x)) \mapsto x.$$
The smooth manifold $M$ is known as the graph of the function $f$. The identity
embedding $M \subseteq \mathbb{R}^{n+1}$ is a $C^\infty$-map and the projection
$$M \to \mathbb{R}^n$$
given by
$$(x, f(x)) \mapsto x$$
is a diffeomorphism.
(4) The first "non-trivial" example of a smooth manifold is the $n$-sphere
$$S^n = \Big\{(x_0, \dots, x_n) \in \mathbb{R}^{n+1} \ \Big|\ \sum_{i=0}^{n} x_i^2 = 1\Big\}.$$
For every $x = (x_0, \dots, x_n) \in S^n$, there exists a $k \in \{0, \dots, n\}$ such that
$x_k \ne 0$. Choose, for each $x$, such a $k$ and an $\varepsilon$ satisfying $0 < \varepsilon < 1 - \sqrt{1 - x_k^2}$.
Then we can take
$$h_x : (y_0, \dots, y_n) \mapsto (y_0, \dots, y_{k-1}, y_{k+1}, \dots, y_n),$$
$$U_x = h_x^{-1}[\Omega(h_x(x), \varepsilon)].$$
(Here $h_x$ is understood as restricted to the points $y \in S^n$ whose coordinate $y_k$ has
the same sign as $x_k$, so that it is injective.)
(5) If $M$, $N$ are smooth manifolds, then $M \times N$ is naturally a smooth manifold
where the coordinate neighborhood of a point $(x, y) \in M \times N$ is $U_x \times U_y$ and
the coordinate function is $h_{(x,y)}(z, t) = (h_x(z), h_y(t))$. The product projections
$M \times N \to M$, $M \times N \to N$ are $C^\infty$-maps.
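The coordinate systems on the sphere from Example (4) can be written out explicitly. The following is a minimal Python sketch for $S^2$, not part of the original text: the names `chart` and `chart_inverse` and the sample point are assumptions for illustration. The chart drops the $k$-th coordinate, and its inverse on the hemisphere containing $x$ recovers the dropped coordinate from the equation of the sphere.

```python
import math

def chart(y, k):
    """Drop the k-th coordinate of a point y of the sphere."""
    return tuple(c for i, c in enumerate(y) if i != k)

def chart_inverse(u, k, sign):
    """Inverse of the chart on the hemisphere where the k-th coordinate has the
    given sign (+1 or -1); requires u to lie in the open unit disc."""
    missing = sign * math.sqrt(1.0 - sum(c * c for c in u))
    return tuple(u[:k]) + (missing,) + tuple(u[k:])

x = (0.6, 0.0, 0.8)          # a point of S^2
k = 2                        # x_k = 0.8 is non-zero
u = chart(x, k)              # coordinates of x in the chart
back = chart_inverse(u, k, +1)

# A ball around u of radius below this bound stays inside the open unit disc,
# which is the image of the hemisphere under the chart.
radius_bound = 1.0 - math.sqrt(sum(c * c for c in u))
```

The transition maps between two such charts are compositions of `chart_inverse` and `chart`, which are smooth on the open unit disc because the square root is smooth away from 0.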
1.5 Smooth partition of unity
Let $M$ be a smooth manifold and let $(U_i)_{i \in I}$ be an open cover of $M$. A smooth
partition of unity subordinate to the cover $(U_i)$ is a system of smooth functions
$u_i : M \to \mathbb{R}$ such that for every $x \in M$, $0 \le u_i(x) \le 1$,
$$\overline{u_i^{-1}[(0, 1]]} \subseteq U_i$$
(i.e. the support of $u_i$ is contained in $U_i$), and for every $x \in M$ there exists an open
neighborhood $V_x$ of $x$ and a finite subset $I_x \subseteq I$ such that for all $y \in V_x$, $i \in I \setminus I_x$,
we have $u_i(y) = 0$ and
$$\sum_{i \in I} u_i = 1. \tag{1.5.1}$$
(Note that the expression on the left-hand side of (1.5.1) makes sense because on
$V_x$, it can be defined as the sum over $I_x$.)
A refinement of an open cover $(U_i)_{i \in I}$ is an open cover $(V_j)_{j \in J}$ such that for
every $j \in J$, there exists an $i \in I$ such that $V_j \subseteq U_i$. A cover $(U_i)_{i \in I}$ is called
locally finite if for every $x \in M$, there exists an open neighborhood $V_x$ and a finite
subset $I_x \subseteq I$ such that for $i \in I \setminus I_x$, $V_x \cap U_i = \emptyset$.
Lemma. For every open cover $(A_i)_{i \in I}$ of a smooth manifold $M$, there exists an
atlas $(U_j, h_j)_{j \in J}$ such that $J$ is countable, the cover $(U_j)$ is locally finite and is a
refinement of the cover $(A_i)$, we have $h_j[U_j] = \Omega(o, 3)$, and $(h_j^{-1}[\Omega(o, 1)])_{j \in J}$ is
also a cover of $M$.
Proof. Since $M$ has a countable basis by Theorem 1.2 of Chapter 9, any open
cover has a countable subcover. Since clearly every point of $M$ has a compact
neighborhood, there exists a countable cover by open sets whose closures are
compact, which is a refinement of $(A_i)$. Assume, without loss of generality, that
$(A_i)_{i \in I}$ itself is such a cover, and that, moreover, $I = \{1, 2, \dots\}$. Now define
$K_1 = \overline{A_1}$, and assuming $K_1, \dots, K_i$ are defined, let
$$K_{i+1} = \overline{A_1} \cup \dots \cup \overline{A_r}$$
where $r > i$ is the smallest number such that $K_i \subseteq A_1 \cup \dots \cup A_r$. (Note that such
a number exists by compactness.)
Denote by $X^\circ$ the interior of a set $X \subseteq M$, i.e.
$$X^\circ = M \setminus \overline{(M \setminus X)}.$$
Now setting $K_0 = K_{-1} = \emptyset$, one has
$$M = \bigcup_{i=1}^{\infty} K_i \setminus K_{i-1}^\circ,$$
and
$$K_{i-1} \subseteq K_i^\circ.$$
For each $x \in K_i \setminus K_{i-1}^\circ$, we can find an open neighborhood $U_x \subseteq K_{i+1}^\circ \setminus K_{i-2}$
which is contained in one of the $A_i$'s, and a diffeomorphism $h_x : U_x \to \Omega(o, 3)$
such that $h_x(x) = o$. Furthermore, there are finite sets $S_i \subseteq K_i \setminus K_{i-1}^\circ$ so that
the sets $h_x^{-1}[\Omega(o, 1)]$, $x \in S_i$, cover $K_i \setminus K_{i-1}^\circ$.
The system $(U_x, h_x)_{x \in \bigcup S_i}$ is the required atlas. $\square$
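Theorem. For any open cover $(A_i)$ of a smooth manifold $M$ there exists a smooth
partition of unity subordinate to $(A_i)$.
Proof. Let $\lambda : \mathbb{R} \to \mathbb{R}$ be a function defined by $\lambda(t) = e^{-1/t}$ for $t > 0$ and
$\lambda(t) = 0$ for $t \le 0$. Then $\lambda$ is smooth (see Exercise (3) of Chapter 1). Hence the
function $g : \mathbb{R}^n \to \mathbb{R}$ defined by
$$g(x) = \frac{\lambda(2 - \|x\|)}{\lambda(2 - \|x\|) + \lambda(\|x\| - 1)}$$
satisfies $0 \le g(x) \le 1$ for every $x \in \mathbb{R}^n$, and
$$g(x) = 0 \ \text{ for } \|x\| \ge 2,$$
$$g(x) = 1 \ \text{ for } \|x\| \le 1.$$
Now take the atlas $(U_j, h_j)$ from the statement of the Lemma, let $g_j = g \circ h_j$
(extended by 0 outside $U_j$) and define
$$u_j = \frac{g_j}{\sum_{k \in J} g_k} \quad \text{for } j \in J.$$
(Note that the right-hand side is well defined by local finiteness, and the denominator
is positive because the sets $h_j^{-1}[\Omega(o, 1)]$ cover $M$.) $\square$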
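The construction in the proof can be tried numerically. The following is a minimal Python sketch, not part of the original text: `lam` and `g` follow the formulas of the proof, while the atlas on the real line (charts $t \mapsto t - j$ on the intervals $(j - 3, j + 3)$ for integers $j$) is an assumption chosen only to have a concrete locally finite cover.

```python
import math

def lam(t):
    """exp(-1/t) for t > 0 and 0 otherwise."""
    return math.exp(-1.0 / t) if t > 0 else 0.0

def g(x):
    """Bump function: equal to 1 for |x| <= 1 and to 0 for |x| >= 2."""
    r = math.sqrt(sum(c * c for c in x))
    return lam(2.0 - r) / (lam(2.0 - r) + lam(r - 1.0))

def partition_values(t):
    """Values u_j(t) of the partition of unity on the real line built from the
    assumed charts h_j(t) = t - j; only finitely many j contribute at each t."""
    base = math.floor(t)
    gs = {j: g((t - j,)) for j in range(base - 2, base + 4)}
    total = sum(gs.values())
    return {j: val / total for j, val in gs.items() if val > 0}

values = partition_values(0.3)
```

At each point only the indices $j$ with $|t - j| < 2$ give a non-zero value, which is the local finiteness used in the proof.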
2 Tangent vectors, vector fields and differential forms
The notion of a tangent vector to a smooth manifold models the geometric intuition
(for example, the instantaneous velocity of a point moving in the manifold). As we learned
in the previous section, however, we must model everything in terms of coordinate
neighborhoods.
2.1 Tangent vectors
Let $M$ be a smooth $n$-manifold and let $x \in M$. Consider the set $\widetilde{TM}_x$ of all triples
$(U, h, v)$ where $U$ is an open neighborhood of $x$, $h : U \to V$ is a diffeomorphism for
some $V \subseteq \mathbb{R}^n$ open, and $v \in \mathbb{R}^n$.
Now introduce the following equivalence relation on $\widetilde{TM}_x$: We put
$$(U, h, v) \sim (V, k, w)$$
if there exists an open neighborhood $W$ of $x$ contained in $U \cap V$ such that if we
denote by $f$ the composition
$$h[W] \xrightarrow{\ h^{-1}\ } W \xrightarrow{\ k\ } k[W],$$
then
$$Df_{h(x)}(v) = w.$$
(Recall that $D$ denotes the total differential, see 3.2 of Chapter 3.) It is easy to
verify that this is indeed an equivalence relation. The set of equivalence classes
of $\widetilde{TM}_x$ is denoted by $TM_x$ and its elements are called tangent vectors to $M$ at
$x$. A representative of a $\sim$-equivalence class will be called a representative of a
tangent vector. The tangent vector represented by a triple $(U, h, v)$ will be sometimes
denoted by $[(U, h, v)]$. When this gets too cumbersome, we will also refer to $v$ as
the vector $[(U, h, v)]$ in the coordinate system $h : U \to V$.
Lemma. Let $u \in TM_x$. Then for every open neighborhood $U$ of $x$ and every
diffeomorphism $h : U \to V$ for $V \subseteq \mathbb{R}^n$ open, there exists a unique representative
$(U, h, v)$ of the tangent vector $u$.
Proof. If $(V, k, w)$ is any representative of $u$, put
$$v = \big(D(k \circ h^{-1})_{h(x)}\big)^{-1}(w).$$
(Note that $k \circ h^{-1}$ is defined in a neighborhood of $h(x)$.) By definition, this proves
existence. To prove uniqueness, note that by definition, clearly we cannot have
$(U, h, v) \sim (U, h, v')$ for $v \ne v' \in \mathbb{R}^n$. $\square$
Note that by the lemma it immediately follows that $TM_x$ has a natural structure
of an $\mathbb{R}$-vector space, and that moreover, this vector space is $n$-dimensional. In effect,
let $U$ be an open neighborhood of $x$ and let $h : U \to V$ be a diffeomorphism onto
an open subset of $\mathbb{R}^n$. Let
$$[(U, h, v)] + [(U, h, w)] = [(U, h, v + w)],$$
$$\lambda [(U, h, v)] = [(U, h, \lambda v)]$$
where $\lambda \in \mathbb{R}$. Correctness of the definitions of these operations (i.e. independence
of the results of chosen representatives) follows from the linearity of the differential
in $\mathbb{R}^n$.
As noted above, a coordinate system $h : U \to V$ at $x \in M$ identifies $TM_x$ with
$\mathbb{R}^n$. Putting $h = (h_1, \dots, h_n)$, $h_i : U \to \mathbb{R}$, it is useful to denote the ordered basis
of $TM_x$ corresponding to the standard basis of $\mathbb{R}^n$ by
$$\Big(\frac{\partial}{\partial h_1}, \dots, \frac{\partial}{\partial h_n}\Big). \tag{*}$$
The reason for this notation is that if $f : U \to \mathbb{R}$ is a smooth function, in the spirit
of the chain rule, it makes sense to write
$$\frac{\partial}{\partial h_i} f(x) = \frac{\partial f(x)}{\partial h_i} = \frac{\partial (f \circ h^{-1})}{\partial x_i}(h(x))$$
where on the right-hand side, $x_i$ denotes the standard $i$'th coordinate of $\mathbb{R}^n$, as used
in Chapter 3.
It is also useful to notice that when $U \subseteq \mathbb{R}^n$ is an open subset, $x \in U$, we have a
canonical identification
$$\mathbb{R}^n \cong TU_x$$
via
$$v \mapsto [(U, \mathrm{Id}, v)].$$
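The equivalence relation of 2.1 says how the coordinates of a tangent vector change when the coordinate system changes. The following is a minimal Python sketch, not part of the original text: on the open right half-plane it takes the identity as one coordinate system and polar coordinates as the other, and approximates the differential of the transition map by central differences. The names `polar` and `differential` and the step size are assumptions for illustration.

```python
import math

def cartesian(p):
    """Coordinate system h = identity on the right half-plane."""
    return p

def polar(p):
    """Coordinate system k(x, y) = (r, theta), valid for x > 0."""
    x, y = p
    return (math.hypot(x, y), math.atan2(y, x))

def differential(F, a, v, step=1e-6):
    """Central-difference approximation of DF_a(v) for a map F between
    open subsets of Euclidean spaces."""
    plus = F(tuple(ai + step * vi for ai, vi in zip(a, v)))
    minus = F(tuple(ai - step * vi for ai, vi in zip(a, v)))
    return tuple((p - m) / (2 * step) for p, m in zip(plus, minus))

x = (1.0, 1.0)
v = (1.0, 0.0)                      # representative (U, h, v)

def transition(a):
    """k o h^{-1}; here h is the identity, so this is just k."""
    return polar(a)

# Representative (U, k, w) of the same tangent vector.
w = differential(transition, cartesian(x), v)
```

The exact value is $w = (x/r, -y/r^2) = (1/\sqrt{2}, -1/2)$ at the point $(1, 1)$, so the unit vector in the $x$-direction has both a radial and an angular component.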
2.2 The total differential on manifolds
Let $M$, $N$ be smooth manifolds and let $f : M \to N$ be a $C^1$-map. Let $x \in M$. We
define the total differential of $f$ at $x$
$$Df_x : TM_x \to TN_{f(x)}$$
as follows: Let $V$ be an open neighborhood of $f(x)$ and let $k : V \to W$ be a
diffeomorphism where $W \subseteq \mathbb{R}^m$ is open. Then define
$$Df_x([(U, h, v)]) = [(V, k, D(k \circ f \circ h^{-1})_{h(x)}(v))].$$
This definition is correct by the chain rule (in Euclidean spaces) and $Df_x$ is linear by
linearity of differentials (in Euclidean spaces). Additionally, note that it generalizes
the definition of total differential 3.2 of Chapter 3 when we identify the tangent
space of an open subset of $\mathbb{R}^n$ at every point with $\mathbb{R}^n$.
If we have a real $C^1$-function $f : U \to \mathbb{R}$ from some $U \subseteq M$ open, we usually
write $df(x)$ instead of $Df_x$. From this point of view, $df$ can also be viewed as a
$C^1$ 1-form (see 2.3 below). Similar statements, of course, hold for $C^r$ and smooth
functions. In particular, it is useful to note that if $h : U \to V$ is a coordinate system
at $x \in M$, and $h = (h_1, \dots, h_n)$, then
$$(dh_1, \dots, dh_n)$$
is a basis of $(TM_x)^*$ dual to the basis 2.1 (*) of $TM_x$.
In preparation for the next subsection, note also that by the properties of duals
and exterior products (see Chapter 11), we have canonical linear maps
$$(Df_x)^* : (TN_{f(x)})^* \to (TM_x)^*,$$
$$\Lambda^k((Df_x)^*) : \Lambda^k((TN_{f(x)})^*) \to \Lambda^k((TM_x)^*).$$
A smooth map $f : M \to N$ is called an immersion (resp. submersion)
if for every $x \in M$, $Df_x$ is injective (resp. onto). An immersion which is a
homeomorphism onto its image is also called an embedding or an inclusion of a
submanifold. We will then also refer to $f[M]$ as a submanifold of $N$.
2.3 Smooth vector fields and differential forms
Let $M$ be a smooth $n$-manifold. Then a vector field $v$ on $M$ (resp. a $k$-form $\omega$
on $M$) is a map assigning to each $x \in M$ an element $v(x) \in TM_x$ (resp.
$\omega(x) \in \Lambda^k((TM_x)^*)$). A differential form is a common term for $k$-forms for any $k$.
A vector field (resp. a $k$-form) is called $C^r$ if for every $x \in M$ there exists an open
neighborhood $U$ and a diffeomorphism $h : U \to V$ for $V \subseteq \mathbb{R}^n$ open such that the map
on $U$ given by
$$y \mapsto Dh_y(v(y)) \in \mathbb{R}^n$$
resp.
$$y \mapsto \Lambda^k\big(((Dh_y)^*)^{-1}\big)(\omega(y)) \in \Lambda^k((\mathbb{R}^n)^*)$$
is a $C^r$ map where the right-hand side uses the identification of the tangent spaces
of an open subset of $\mathbb{R}^n$ at the end of Section 2.1. $C^\infty$ vector fields and $k$-forms are
also called smooth.
It is useful to note that if $h : U \to V$ is a smooth coordinate system at some
point $x \in M$, $h = (h_1, \dots, h_n)$, then immediately from the definition, the vector
space of all smooth vector fields on $U$ is
$$\Big\{\sum_{i=1}^{n} f_i \frac{\partial}{\partial h_i} \ \Big|\ f_i : U \to \mathbb{R} \text{ smooth functions}\Big\},$$
and the space of all smooth 1-forms on $U$ is
$$\Big\{\sum_{i=1}^{n} f_i\, dh_i \ \Big|\ f_i : U \to \mathbb{R} \text{ smooth functions}\Big\}.$$
Thus, the smooth vector field or 1-form is completely determined by the $n$-tuple of
smooth functions $(f_1, \dots, f_n)$, and vice versa, the functions $f_i$ are determined by
the vector field (resp. differential form) and the coordinate system.
Using Proposition 3.4 of Chapter 11, we can extend this to $k$-forms. The space
of all smooth $k$-forms on $U$ is isomorphic to
$$\Big\{\sum_{1 \le i_1 < \dots < i_k \le n} f_{i_1, \dots, i_k}\, dh_{i_1} \wedge \dots \wedge dh_{i_k} \ \Big|\ f_{i_1, \dots, i_k} : U \to \mathbb{R} \text{ smooth}\Big\},$$
and the smooth functions $f_{i_1, \dots, i_k}$ are completely determined by a smooth $k$-form.
It is also useful to realize that if $(U_i, h_i)$ is a smooth atlas of $M$, and we have
smooth vector fields $v_i$ resp. smooth $k$-forms $\omega_i$ on each $U_i$ such that
$$v_i|_{U_i \cap U_j} = v_j|_{U_i \cap U_j}$$
resp.
$$\omega_i|_{U_i \cap U_j} = \omega_j|_{U_i \cap U_j},$$
then this uniquely determines a smooth vector field resp. smooth $k$-form on $M$. In
other words, smooth vector fields and $k$-forms can be described by a collection of
local descriptions in the charts of an atlas.
Analogous statements are, of course, true with "smooth" replaced by $C^r$.
2.4 Products and functoriality
For a smooth vector field $v$ and a smooth $k$-form $\omega$ on $M$, and a smooth function
$f : M \to \mathbb{R}$, the product $fv$ is again a smooth vector field, and $f\omega$ is a smooth
$k$-form (here the products are evaluated point-wise, i.e. $(fv)(x) = f(x)v(x)$
for all $x \in M$, and similarly for the differential form). Additionally, for a smooth
$\ell$-form $\eta$ on $M$, we have a smooth $(k + \ell)$-form $\omega \wedge \eta$ defined using the exterior
product 3.6 of Chapter 11:
$$(\omega \wedge \eta)(x) = \omega(x) \wedge \eta(x) \in \Lambda^{k+\ell}((TM_x)^*).$$
There are, also, analogous statements for $C^r$.
Now let $f : M \to N$ be a smooth map. Using the maps constructed at the
end of 2.2, for a smooth $k$-form $\omega$ on $N$, we obtain a smooth $k$-form $f^*\omega$ on $M$.
Explicitly, for $x \in M$,
$$(f^*\omega)(x) = \Lambda^k((Df_x)^*)(\omega(f(x))) \in \Lambda^k((TM_x)^*).$$
This correspondence, of course, sends the identity to the identity, and $(f \circ g)^*(\omega) =
g^*(f^*(\omega))$. Thus, we conclude that differential forms are contravariant in smooth
maps (in the sense of 1.2 of Chapter 11). There are, of course, analogous statements
for smooth replaced by $C^r$.
It may be surprising that vector fields are neither covariant nor contravariant
in smooth maps: One can see this by realizing that vectors are covariant, while
smooth functions are contravariant. Vector fields can be made, however, covariant
in diffeomorphisms: Let $f : M \to N$ be a diffeomorphism and let $v$ be a smooth
vector field on $M$. Then we can define a smooth vector field $f_* v$ on $N$ by
$$(f_* v)(x) = Df_{f^{-1}(x)}(v(f^{-1}(x))) \in TN_x.$$
2.4.1 Comment
The meaning of the symbols $f^*$ and $f_*$ here is related to, but not quite the same as
in 1 of Chapter 11. Note that, for example, in the current situation, $f$ is not a linear
map. Nevertheless, using the same symbol in both situations is quite standard in
this case.
2.5 A Slice Theorem
The attentive reader has noticed a similarity of this material with our remarks on
substitution in differential equations. In fact, much of what we observed in Section 7
of Chapter 6 can be done coordinate-free. Let us make this concrete in one aspect,
which will be instructive as a contrast with what we will do with differential forms:
Proposition. Let $v$ be a smooth vector field on a smooth $n$-manifold $M$, and
suppose $v(x) \ne 0$ for all $x \in M$ (we speak of a non-vanishing vector field).
Let $x \in M$. Then there exists a coordinate system $h : U \to V$ at $x$ such that
the vector field $h_*(v|_U)$ on $V$ is constant and equal to $(1, 0, \dots, 0) \in \mathbb{R}^n$ (using the
identification from the end of 2.1).
Proof. Let $k : U_1 \to V_1$ be any coordinate system at $x$ and consider the vector field
$k_* v$. We can treat this vector field as a system of differential equations on $V_1$: For a
smooth function $f : (a, b) \to V_1$, the equation is
$$f'(t) = \sum_{i=1}^{n} (k_* v)(f(t))_i\, \frac{\partial}{\partial x_i}. \tag{*}$$
Now we know that this system has a smooth solution in a neighborhood of a point
of $V_1$. Specifically, consider vectors $w_2, \dots, w_n \in \mathbb{R}^n$ such that
$$(k_* v)(k(x)), w_2, \dots, w_n \text{ are linearly independent (hence form a basis of } \mathbb{R}^n\text{)}. \tag{**}$$
Then by Theorem 4.1 of Chapter 6, there exists an open neighborhood $V_2 \subseteq \mathbb{R}^n$ of
$o$ and a smooth map $\phi : V_2 \to V_1$ such that
$$\phi(0, a_2, \dots, a_n) = k(x) + \sum_{i=2}^{n} a_i w_i$$
and for any constants $a_2, \dots, a_n$ such that $(t, a_2, \dots, a_n) \in V_2$, the function
$$f(t) = \phi(t, a_2, \dots, a_n)$$
satisfies the equation (*). Additionally, by our assumption (**), the map $\phi$ is regular
at $o$, so by the Inverse Function Theorem 7.3 of Chapter 3, there exists an open
neighborhood $V$ of $o$ such that the restriction $\phi|_V : V \to \phi[V]$ is a diffeomorphism.
Now put
$$U = k^{-1}[\phi[V]]$$
and
$$h = (\phi^{-1} k)|_U. \qquad \square$$
3 The exterior derivative and integration of differential forms
3.1 The exterior derivative
The R-vector space of all smooth k-forms on a smooth n-manifold M is denoted by
k .M /. It is clearly a vector space over R. We will now construct a linear map
d W
k .M / !
kC1 .M /: (1)
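In terms of a smooth coordinate system $h : U \to V$, one writes
\[
\begin{aligned}
d\Bigl(\sum_{1 \le i_1 < \dots < i_k \le n} f_{i_1,\dots,i_k}\, dh_{i_1} \wedge \dots \wedge dh_{i_k}\Bigr)
&= \sum_{1 \le i_1 < \dots < i_k \le n} df_{i_1,\dots,i_k} \wedge dh_{i_1} \wedge \dots \wedge dh_{i_k} \\
&= \sum_{1 \le i_1 < \dots < i_k \le n} \sum_{j=1}^{n} \frac{\partial f_{i_1,\dots,i_k}}{\partial h_j}\, dh_j \wedge dh_{i_1} \wedge \dots \wedge dh_{i_k}.
\end{aligned}
\tag{2}
\]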
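As a quick illustration of formula (2), here is the case $n = 2$, $k = 1$ written out. This is only a sketch: the coefficient names $f_1, f_2$ are illustrative, and the computation uses nothing beyond $dh_j \wedge dh_j = 0$ and $dh_2 \wedge dh_1 = -dh_1 \wedge dh_2$.

```latex
\begin{aligned}
d(f_1\,dh_1 + f_2\,dh_2)
&= \Bigl(\frac{\partial f_1}{\partial h_1}\,dh_1 + \frac{\partial f_1}{\partial h_2}\,dh_2\Bigr) \wedge dh_1
 + \Bigl(\frac{\partial f_2}{\partial h_1}\,dh_1 + \frac{\partial f_2}{\partial h_2}\,dh_2\Bigr) \wedge dh_2 \\
&= \Bigl(\frac{\partial f_2}{\partial h_1} - \frac{\partial f_1}{\partial h_2}\Bigr)\, dh_1 \wedge dh_2 .
\end{aligned}
```

The surviving coefficient is the familiar integrand of Green's Theorem.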
Lemma. The formula (2) does not depend on the choice of coordinate system.
Proof. One first notices that for a smooth function $f$, $df$ is independent of
coordinate system by the chain rule, and that for smooth functions $f, g$, one has
the Leibniz rule
\[ d(fg) = f\,dg + g\,df. \]
Now let $g : U \to W$ be another coordinate system. By the chain rule, we have
\[ dh_i = \sum_{j=1}^{n} \frac{\partial h_i}{\partial g_j}\, dg_j. \]
Now differentiating
\[ f_{i_1,\dots,i_k}\, dh_{i_1} \wedge \dots \wedge dh_{i_k} \tag{3} \]
in the $h$-coordinate system and converting to the $g$-coordinate system, we obtain
\[ \sum df_{i_1,\dots,i_k} \wedge \frac{\partial h_{i_1}}{\partial g_{j_1}} \cdots \frac{\partial h_{i_k}}{\partial g_{j_k}}\, dg_{j_1} \wedge \dots \wedge dg_{j_k} \tag{4} \]
where the sum is over all possible choices $1 \le j_1, \dots, j_k \le n$ (the numbers $j_p$ do
not have to form an increasing sequence in $p$).
Now converting (3) to the $g$-coordinate system first, we obtain
\[ \sum f_{i_1,\dots,i_k}\, \frac{\partial h_{i_1}}{\partial g_{j_1}} \cdots \frac{\partial h_{i_k}}{\partial g_{j_k}}\, dg_{j_1} \wedge \dots \wedge dg_{j_k}. \tag{5} \]
Now differentiating (5) in the $g$-coordinate system, we must form
\[ d\Bigl( f_{i_1,\dots,i_k}\, \frac{\partial h_{i_1}}{\partial g_{j_1}} \cdots \frac{\partial h_{i_k}}{\partial g_{j_k}} \Bigr) \tag{6} \]
and then multiply by $dg_{j_1} \wedge \dots \wedge dg_{j_k}$. However, by the Leibniz rule, we may
differentiate $f_{i_1,\dots,i_k}$ and the partial derivative factors separately, and the key point is
that when we differentiate
\[ \frac{\partial h_{i_p}}{\partial g_{j_p}}, \]
we get a double partial derivative
\[ \frac{\partial^2 h_{i_p}}{\partial g_{j_p}\, \partial g_{j'_p}}. \]
In the resulting sum, however, each such term will appear twice, with the attached
$dg_{j_p}$ and $dg_{j'_p}$ terms swapped. Thus, by the rules of computation in the exterior
algebra, the two terms in each such pair appear with opposite signs, and hence
cancel out. $\square$
3.2 The de Rham complex, de Rham cohomology and Betti numbers
Lemma. We have
\[ d \circ d = 0 : \Omega^k(M) \to \Omega^{k+2}(M). \]
Proof. Using formula (2) from 3.1 twice in a coordinate system $h : U \to V$, the
$(k + 2)$-form
\[ d\Bigl(d\Bigl(\sum_{1 \le i_1 < \dots < i_k \le n} f_{i_1,\dots,i_k}\, dh_{i_1} \wedge \dots \wedge dh_{i_k}\Bigr)\Bigr) \]
is a sum of expressions of the form
\[ \frac{\partial^2 f_{i_1,\dots,i_k}}{\partial h_j\, \partial h_\ell}\, dh_\ell \wedge dh_j \wedge dh_{i_1} \wedge \dots \wedge dh_{i_k}, \]
but each of these terms appears twice with $j$ and $\ell$ in opposite orders, and therefore
with opposite signs, and hence the entire expression vanishes (of course, the terms
with $j = \ell$ vanish immediately). $\square$
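The cancellation in this proof can be watched on a concrete example. The following is a minimal sketch, not part of the text: it assumes forms with polynomial coefficients on $\mathbb{R}^n$ with the standard coordinates (numbered from 0), stores a polynomial as a dictionary from exponent tuples to coefficients and a form as a dictionary from increasing index tuples to polynomials, and implements formula (2) of 3.1 directly. All names (`poly_diff`, `poly_add`, `d`) are hypothetical helpers introduced for illustration.

```python
def poly_diff(p, j):
    """Partial derivative in variable j of a polynomial {exponent tuple: coeff}."""
    out = {}
    for exps, c in p.items():
        if exps[j] > 0:
            lowered = exps[:j] + (exps[j] - 1,) + exps[j + 1:]
            out[lowered] = out.get(lowered, 0) + c * exps[j]
    return out


def poly_add(p, q, sign):
    """Return p + sign * q, dropping zero coefficients."""
    out = dict(p)
    for exps, c in q.items():
        out[exps] = out.get(exps, 0) + sign * c
    return {e: c for e, c in out.items() if c != 0}


def d(form, n):
    """Exterior derivative of {increasing index tuple: polynomial} in n variables."""
    out = {}
    for idx, p in form.items():
        for j in range(n):
            if j in idx:
                continue  # dx_j ^ dx_j = 0
            dp = poly_diff(p, j)
            # Formula (2) puts dx_j in front; sorting it into place costs
            # one sign change for every index of idx that is smaller than j.
            sign = -1 if sum(1 for i in idx if i < j) % 2 else 1
            new_idx = tuple(sorted(idx + (j,)))
            out[new_idx] = poly_add(out.get(new_idx, {}), dp, sign)
    return {idx: p for idx, p in out.items() if p}


# omega = x0 * x1^2 dx2 + x0 * x2^3 dx0 on R^3
omega = {(2,): {(1, 2, 0): 1}, (0,): {(1, 0, 3): 1}}
d_omega = d(omega, 3)
dd_omega = d(d_omega, 3)
print(d_omega)
print(dd_omega)
```

Here `d_omega` has two non-zero components, while `dd_omega` is the empty dictionary: the two mixed second derivatives land on the same index tuple with opposite signs, exactly as in the proof.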
We therefore obtain a sequence of vector spaces and linear maps
\[ \Omega^0(M) \xrightarrow{\;d\;} \Omega^1(M) \xrightarrow{\;d\;} \cdots \xrightarrow{\;d\;} \Omega^n(M) \tag{*} \]
(note that we can only have $\Omega^k(M) \neq 0$ for $0 \le k \le n$) such that
\[ d \circ d = 0. \]
The sequence (*) is called the de Rham complex of the smooth manifold $M$, and is
denoted by $\Omega^*(M)$. A $k$-form $\omega$ is called closed if
\[ d\omega = 0 \]
and is called exact if there exists a $(k - 1)$-form $\eta$ such that
\[ \omega = d\eta. \]
(We consider the 0-form 0 exact.) Then the set of all closed $k$-forms is a vector
subspace of $\Omega^k(M)$ which is denoted by $Z^k(M)$, and the set of all exact $k$-forms is
then a vector subspace of $Z^k(M)$ which is denoted by $B^k(M)$.
The quotient $\mathbb{R}$-vector space
\[ H^k_{DR}(M) = Z^k(M)/B^k(M) \]
is called the $k$'th de Rham cohomology vector space of $M$. We write
\[ b_k(M) = \dim(H^k_{DR}(M)) \]
and call this the $k$'th Betti number of $M$ (it can, of course, be infinite, see
Exercise (17)).
Betti numbers are fundamental characteristics of manifolds. For example, they
are computable in practice, and they turn out to be topological invariants, which
means that two homeomorphic manifolds have the same Betti numbers. Also, Betti
numbers can be defined for topological manifolds, and in fact, for all topological
spaces. This leads to an area of mathematics called algebraic topology (see, for
example, [3, 13, 14, 20]). Unfortunately, in this text, a systematic treatment of Betti
numbers would take us too far afield, and we will confine ourselves to a few basic
exercises (Exercises (11), (12), (13), (14), (15), (16), (17)).
4 Integration of differential forms and Stokes’ Theorem
4.1 Orientation of smooth manifolds
Let $M$ be a smooth $n$-manifold. An orientation of $M$ is a choice of orientation of the
space $TM_x$ for each $x \in M$ such that for each $x \in M$ there exists a coordinate
system $h : U \to V$ at $x$ for which the orientation is constant when we use the
identification from the end of 2.1. Two orientations of $M$ are considered equal if
they are equal at every point $x \in M$. An orientation may not exist (see Exercise (18)
below). A smooth manifold for which there exists an orientation is called orientable.
Recall from 3.5 of Chapter 11 that a non-zero element of $\Lambda^n((TM_x)^*)$ determines
an orientation of $TM_x$. Hence a form $\omega \in \Omega^n(M)$ such that $\omega(x) \neq 0$ for all
$x \in M$ determines an orientation of $M$.
Lemma. Every orientation of a smooth $n$-manifold $M$ is determined by a form
$\omega \in \Omega^n(M)$ such that $\omega(x) \neq 0$ for all $x \in M$ (which is then often called the
volume form). Moreover, two such forms $\omega, \eta$ determine the same orientation if and
only if there exists a smooth positive function $k : M \to \mathbb{R}$ such that
$\omega = k\eta$.
Proof. To prove the first statement (existence), take a smooth atlas $(U_i, h_i)$ such
that a form $\omega_i$ as required exists for the restriction of our orientation to $U_i$ (such an
atlas exists by the definition of orientation). Now take a smooth partition of unity $u_i$
subordinate to the cover $U_i$, and put
\[ \omega = \sum_i u_i \omega_i. \]
To prove the second statement, let $\omega$, $\eta$ determine the same orientation. Choose a
smooth atlas $(U_i, h_i)$. Then $\omega|_{U_i} = f_i\, dh_1 \wedge \dots \wedge dh_n$, $\eta|_{U_i} = g_i\, dh_1 \wedge \dots \wedge dh_n$.
Define $k(x) = f_i(x)/g_i(x)$ when $x \in U_i$. $\square$
Let $M$, $N$ be oriented $n$-manifolds and let $\eta$ be a volume form on $N$ specifying
the orientation. We say that a diffeomorphism $\varphi : M \to N$ preserves orientation if
the volume form $\varphi^*\eta$ specifies the given orientation on $M$.
4.2 Integration
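Let $\omega$ be a smooth $n$-form on $\mathbb{R}^n$. Then we may write
\[ \omega = f\, dx_1 \wedge \dots \wedge dx_n \]
for a smooth function $f : \mathbb{R}^n \to \mathbb{R}$. Let $B \subseteq \mathbb{R}^n$ be a Borel set such that $\overline{B}$ is
compact. Define
\[ \int_B \omega = \int_B f\, dx_1 \dots dx_n. \tag{1} \]
Now let $M$ be a smooth oriented $n$-manifold and let $\omega$ be a smooth $n$-form on $M$.
Let $B \subseteq M$ be a Borel set such that $\overline{B}$ is compact. Then there exists a smooth atlas
$(U_i, h_i)_{i \in I}$ of $M$ such that $h_i$ preserves orientation if we take the standard orientation
$dx_1 \wedge \dots \wedge dx_n$ on $h_i[U_i]$, and such that there exists a finite subset $F \subseteq I$ where
$U_i \cap B = \emptyset$ for $i \notin F$ (take an orientation-preserving atlas, choose a finite subcover
of $\overline{B}$ and intersect the remaining charts with $M \setminus \overline{B}$). Now put
\[ \int_B \omega = \sum_{i \in F} \int_{h_i[B \cap U_i]} u_i\, (h_i^{-1})^* \omega \tag{2} \]
where $u_i$ is a smooth partition of unity subordinate to the cover $(U_i)$ (recall 2.4).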
Lemma. The number (2) does not depend on the choice of the atlas $(U_i, h_i)$ (subject
to the given conditions).
Proof. First note that if $U, V \subseteq \mathbb{R}^n$ are open sets such that $\overline{B} \subseteq U$, $\varphi : U \to V$ is
an orientation-preserving diffeomorphism, and $\omega$ is a smooth $n$-form, then
\[ \int_B \omega = \int_{\varphi[B]} (\varphi^{-1})^* \omega \tag{3} \]
as defined by (1), by the Substitution Theorem 7.9 of Chapter 5.
Now let $(U_i, h_i)_{i \in I}$, $(U'_i, h'_i)_{i \in I'}$ be two atlases as in the statement of the lemma.
First, note that by the (finite) additivity of the integral, we may assume $I = I'$,
$U_i = U'_i$. We may still have $h_i \neq h'_i$, but the invariance of the integral under this
choice follows from (3). $\square$
Remark: Note that our notation is slightly inconsistent. In (2), we should display
the orientation of the manifold $M$. In (1), on the other hand, we assume the standard
orientation of $\mathbb{R}^n$, i.e. the orientation defined by the $n$-form $dx_1 \wedge \dots \wedge dx_n$. A reversal
of orientation results, of course, in a reversal of sign.
4.3 Regions with corners
4.3.1
Let $M$ be an oriented smooth $n$-manifold. By a region with corners in $M$ we
mean a compact subset $K \subseteq M$ such that for every $x \in K \setminus K^\circ$, there exists
an orientation-preserving coordinate system $h : U \to V$ at $x$ in $M$ such that
\[ V = (-1, 1)^{\times n} \]
and there exists a $k \in \{0, \dots, n\}$ such that
\[ h[K \cap U] = \langle 0, 1)^{\times k} \times (-1, 1)^{\times (n-k)} \tag{1} \]
or
\[ h[K \cap U] = (-1, 1)^{\times n} \setminus \bigl((-1, 0)^{\times k} \times (-1, 1)^{\times (n-k)}\bigr). \tag{2} \]
(We use the symbol $S^{\times n}$ for the $n$-th Cartesian power of a set $S$ here to reduce
the chance of confusion.) A special case worth pointing out is the case when one
always has $k \le 1$. In this case, we call $K$ a compact $n$-dimensional submanifold
with boundary. Note that then our coordinate system gives $K \setminus K^\circ$ the structure of
an $(n - 1)$-dimensional compact submanifold of $M$.
4.3.2 Integrating over the boundary

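Now let $\omega$ be a smooth $(n - 1)$-form on $M$. Consider an atlas $(U_i, h_i)_{i \in I}$ of $M$ such
that there exists a finite subset $F \subseteq I$ where $K \cap U_i = \emptyset$ when $i \notin F$, and $(U_i, h_i)$
satisfy (1) or (2) when $i \in F$. Let $u_i$ be a smooth partition of unity subordinate
to the cover $U_i$. Let $F_1$, resp. $F_2$ denote the set of all $i \in F$ for which $h = h_i$
satisfies (1) (resp. (2)). Denote by
\[ c_j : \mathbb{R}^{n-1} \to \mathbb{R}^n \]
the map given by
\[ (x_1, \dots, x_{n-1}) \mapsto (x_1, \dots, x_{j-1}, 0, x_j, \dots, x_{n-1}). \]
Then define
\[
\begin{aligned}
\int_{\partial K} \omega
&= \sum_{i \in F_1} \sum_{j=1}^{k} (-1)^j \int_{\langle 0,1)^{\times (k-1)} \times (-1,1)^{\times (n-k)}} u_i\, (h_i^{-1} c_j)^* \omega \\
&\quad + \sum_{i \in F_2} \sum_{j=1}^{k} (-1)^j \int_{(-1,0\rangle^{\times (k-1)} \times (-1,1)^{\times (n-k)}} u_i\, (h_i^{-1} c_j)^* \omega .
\end{aligned}
\tag{*}
\]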
It can be proved that the expression (*) does not depend on the choice of atlas with
the properties required above. However, this is a bit tedious and we will omit the
proof, as it is not needed for proving Stokes’ Theorem. When stating the theorem in
the next paragraph, we will simply assume that an atlas as above has been chosen.
It is worth noting, however, that in the special case of a compact $n$-dimensional
submanifold with boundary, it follows that the integral defined by (*) coincides with
\[ \int_{\partial K} \iota^* \omega \]
where $\iota : \partial K \to M$ is the inclusion of a submanifold, as discussed above. Note
however that then one must be careful about orientation. The correct orientation at
a point $x \in \partial K$, $x \in U_i$, is by
\[ -(h_i^{-1})^* (dx_2 \wedge \dots \wedge dx_n) \in \Lambda^{n-1}\bigl((T(\partial K)_x)^*\bigr). \]
(The minus sign comes from the fact that the added first vector of the ordered
basis representing the orientation of TMx should point “outside” from the boundary,
which, in our setup, happens to be in the negative direction.)
4.4 Theorem. (Stokes’ Theorem) Let $M$ be an oriented smooth $n$-manifold and let
$\omega \in \Omega^{n-1}(M)$. Let $K$ be a region with corners in $M$ and let $(U_i, h_i)$ and $u_i$ be chosen
as in 4.3. Then
\[ \int_{\partial K} \omega = \int_K d\omega. \tag{*} \]
Proof. The statement and the proof are both straightforward generalizations of
our treatment of Green’s Theorem. (In fact, the part of the proof dealing with
substitutions becomes simpler, since Stokes’ Theorem is stated in terms of differential
forms, which are contravariant.) Again, the key step is to prove the result for the
case of a cube: $M = \mathbb{R}^n$,
\[ K = \prod_{j=1}^{n} \langle a_j, b_j \rangle, \]
$a_j < b_j$. In this case, suppose without loss of generality that
\[ \omega = f\, dx_1 \wedge \dots \wedge dx_{j-1} \wedge dx_{j+1} \wedge \dots \wedge dx_n. \]
Then
\[ d\omega = (-1)^{j+1} \frac{\partial f}{\partial x_j}\, dx_1 \wedge \dots \wedge dx_n. \]
Then by Fubini’s Theorem and the Fundamental Theorem of Calculus in one
variable,
\[
\begin{aligned}
\int_K d\omega
&= \int_{\prod_{\ell \neq j} \langle a_\ell, b_\ell \rangle} (-1)^{j+1} \bigl( f(x_1, \dots, x_{j-1}, b_j, x_{j+1}, \dots, x_n) \\
&\qquad - f(x_1, \dots, x_{j-1}, a_j, x_{j+1}, \dots, x_n) \bigr)\, dx_1 \cdots dx_{j-1}\, dx_{j+1} \cdots dx_n
= \int_{\partial K} \omega.
\end{aligned}
\]
(Note that on the right-hand side, the summands corresponding to coordinates other
than the $j$’th coordinate vanish.)
Now in the general case, one proves the theorem by considering each of the
summands 4.3.2 (*) separately, applying the case of the cube to the smooth
$(n - 1)$-form
\[ u_i\, (h_i^{-1} c_j)^* \omega. \]
When $i \in F_1$, one uses the cube
\[ \langle 0, 1 \rangle^{\times k} \times \langle -1, 1 \rangle^{\times (n-k)}. \]
When $i \in F_2$, one sums over the cubes
\[ \langle -1, 0 \rangle^{\times (\ell - 1)} \times \langle 0, 1 \rangle \times \langle -1, 1 \rangle^{\times (n - \ell)} \]
with $\ell = 1, \dots, k$. Again, the summands not relevant to the statement are 0 or
appear twice with opposite signs. $\square$
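The cube case of the theorem is easy to check numerically. The following sketch is an illustration under stated assumptions, not part of the proof: it takes $n = 2$, the rectangle $K = \langle 0, 1 \rangle \times \langle 0, 2 \rangle$, and an arbitrarily chosen 1-form $\omega = P\,dx + Q\,dy$ whose partial derivatives are entered by hand. The helper names `midpoint_1d` and `midpoint_2d` are hypothetical, and the midpoint rule stands in for the Lebesgue integral.

```python
import math


def midpoint_1d(g, a, b, m=200):
    """Midpoint-rule approximation of the integral of g over <a, b>."""
    h = (b - a) / m
    return h * sum(g(a + (i + 0.5) * h) for i in range(m))


def midpoint_2d(g, a, b, c, d, m=200):
    """Midpoint-rule approximation of the integral of g over <a, b> x <c, d>."""
    hx, hy = (b - a) / m, (d - c) / m
    total = 0.0
    for i in range(m):
        x = a + (i + 0.5) * hx
        for j in range(m):
            total += g(x, c + (j + 0.5) * hy)
    return total * hx * hy


# omega = P dx + Q dy, an arbitrary smooth example
def P(x, y):
    return -x * y * y


def Q(x, y):
    return math.sin(x) + x * y


def d_omega_coeff(x, y):
    """Coefficient of dx ^ dy in d(omega), i.e. dQ/dx - dP/dy."""
    return (math.cos(x) + y) - (-2.0 * x * y)


a, b, c, d = 0.0, 1.0, 0.0, 2.0
interior = midpoint_2d(d_omega_coeff, a, b, c, d)
# Boundary traversed counterclockwise: bottom, right, top (reversed), left (reversed).
boundary = (midpoint_1d(lambda x: P(x, c), a, b)
            + midpoint_1d(lambda y: Q(b, y), c, d)
            - midpoint_1d(lambda x: P(x, d), a, b)
            - midpoint_1d(lambda y: Q(a, y), c, d))
print(interior, boundary)
```

Both numbers approximate $4 + 2\sin 1$; the two edges on which the relevant coefficient is integrated against a constant coordinate contribute nothing, which is the remark about vanishing summands in the proof.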
4.5 Three special cases: grad, div and curl
On open submanifolds $U \subseteq \mathbb{R}^n$, smooth 1-forms are identified with $\mathbb{R}^n$-valued
functions. The differential
\[ d : \Omega^0(U) \to \Omega^1(U) \]
is then identified with a map grad from the space of smooth functions on $U$ to the
space of $\mathbb{R}^n$-valued smooth functions (or, equivalently, $n$-tuples of smooth functions)
on $U$. The corresponding case of the Stokes Theorem is the “Fundamental
Theorem of Line Integrals” which says that for an oriented piecewise smooth curve
$L$ represented by $\varphi : \langle a, b \rangle \to \mathbb{R}^n$, we have
\[ (\mathrm{II}) \int_L \operatorname{grad}(f) = f(\varphi(b)) - f(\varphi(a)). \tag{*} \]
(Note that our current setup is slightly different: to get a special case of Theorem 4.4,
we would have to formulate (*) on smooth 1-manifolds rather than piecewise
smooth curves, but both statements are equally easy to prove - see Exercise (20).)
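The Fundamental Theorem of Line Integrals can be checked numerically on a sample curve. The sketch below is only an illustration: the function `f`, its gradient `grad_f` (entered by hand), and the helix `phi` on the parameter interval from 0 to pi are arbitrary choices, and the midpoint rule approximates the line integral of the second kind as the integral of grad f along the curve dotted with the velocity of the parametrization.

```python
import math


def f(p):
    x, y, z = p
    return x * x * y + z


def grad_f(p):
    x, y, z = p
    return (2 * x * y, x * x, 1.0)


def phi(t):
    """A smooth parametrization of a helix in R^3."""
    return (math.cos(t), math.sin(t), t)


def phi_prime(t):
    return (-math.sin(t), math.cos(t), 1.0)


def line_integral(a, b, m=2000):
    """Midpoint-rule approximation of the integral of grad f along phi."""
    h = (b - a) / m
    total = 0.0
    for i in range(m):
        t = a + (i + 0.5) * h
        total += sum(g * v for g, v in zip(grad_f(phi(t)), phi_prime(t)))
    return total * h


a, b = 0.0, math.pi
lhs = line_integral(a, b)
rhs = f(phi(b)) - f(phi(a))
print(lhs, rhs)
```

The two values agree up to discretization error, and the right-hand side depends only on the endpoints of the curve.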
Smooth $(n - 1)$-forms can also be identified with smooth 1-forms and smooth
$n$-forms can be identified with smooth functions using the Hodge $*$-operator. For a
function $F : U \to \mathbb{R}^n$, denote by $\omega$ the 1-form $\sum F_i\, dx_i$. Then we put
\[ \operatorname{div}(F) = *(d(*\omega)), \]
and for a region with corners $K \subseteq U$, we put
\[ \int_{\partial K} F = \int_{\partial K} *\omega. \]
(In this form, this integral is also known as flux.) Then the Stokes Theorem takes the
form
\[ \int_{\partial K} F = \int_K \operatorname{div}(F). \]
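For $n = 3$ the definition of div can be unwound as follows. This is a sketch that assumes the usual convention $*dx_1 = dx_2 \wedge dx_3$, $*dx_2 = dx_3 \wedge dx_1$, $*dx_3 = dx_1 \wedge dx_2$ and $*(dx_1 \wedge dx_2 \wedge dx_3) = 1$ for the Hodge operator on $\mathbb{R}^3$ with its standard inner product and orientation; writing $\omega$ for the 1-form $\sum F_i\, dx_i$ is likewise a choice of notation.

```latex
\begin{aligned}
*\omega &= F_1\, dx_2 \wedge dx_3 + F_2\, dx_3 \wedge dx_1 + F_3\, dx_1 \wedge dx_2, \\
d(*\omega) &= \Bigl( \frac{\partial F_1}{\partial x_1} + \frac{\partial F_2}{\partial x_2}
              + \frac{\partial F_3}{\partial x_3} \Bigr)\, dx_1 \wedge dx_2 \wedge dx_3, \\
\operatorname{div}(F) = *(d(*\omega)) &= \frac{\partial F_1}{\partial x_1}
              + \frac{\partial F_2}{\partial x_2} + \frac{\partial F_3}{\partial x_3}.
\end{aligned}
```

Under this convention the abstract definition recovers the classical divergence.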
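When $n = 3$, one also denotes by $\operatorname{curl}(F)$ the $\mathbb{R}^3$-valued function associated
with the 1-form
\[ *(d\omega). \]
In coordinates, we obtain
\[ \operatorname{curl}(F) = \Bigl( \frac{\partial F_3}{\partial x_2} - \frac{\partial F_2}{\partial x_3},\;
\frac{\partial F_1}{\partial x_3} - \frac{\partial F_3}{\partial x_1},\;
\frac{\partial F_2}{\partial x_1} - \frac{\partial F_1}{\partial x_2} \Bigr). \]
Let $M$ be a 2-dimensional submanifold of $\mathbb{R}^3$, let $K$ be a region with corners in $M$
and let $F$ be an $\mathbb{R}^3$-valued function defined in an open subset of $\mathbb{R}^3$ containing $M$.
Then the Stokes Theorem takes on the form
\[ \int_K \operatorname{curl}(F) = \int_{\partial K} F. \]
Observe that the right-hand side may be interpreted as a sum of line integrals of the
second kind.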
5 Exercises
(1) Prove that the definitions of a manifold and a smooth manifold would remain
equivalent if we require the coordinate maps $h_x$ to be homeomorphisms.
(2) Prove in detail that the definition given in Example 1.4 (4) really specifies a
smooth manifold and that the inclusion $S^n \subseteq \mathbb{R}^{n+1}$ is a $C^\infty$-map.
(3) Prove that the function used in the proof of Theorem 1.5 is smooth.
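(4) Recall the example of the manifold $S^n$ from the last section. For $x \in S^n$,
construct an isomorphism of vector spaces
\[ \phi_x : T(S^n)_x \cong \{ w \in \mathbb{R}^{n+1} \mid x \cdot w = 0 \} \]
such that for every smooth map $f : \mathbb{R}^{n+1} \to \mathbb{R}^{n+1}$ which satisfies $f[S^n] \subseteq S^n$
we have a commutative diagram
\[
\begin{array}{ccc}
T(S^n)_x & \xrightarrow{\;\phi_x\;} & \mathbb{R}^{n+1} \\
\Big\downarrow {\scriptstyle D(f|_{S^n})} & & \Big\downarrow {\scriptstyle Df} \\
T(S^n)_{f(x)} & \xrightarrow{\;\phi_{f(x)}\;} & \mathbb{R}^{n+1}.
\end{array}
\]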
(5) Recall the notion of Lie bracket of smooth vector fields from 7.5 of Chapter 6.
Let us generalize this notion to vector fields on manifolds. In other words,
let $u$, $v$ be vector fields which on some open set $U$ with smooth coordinates
$h_1, \dots, h_n$ are given by
\[ u = \sum_{i=1}^{n} f_i \frac{\partial}{\partial h_i}, \qquad v = \sum_{i=1}^{n} g_i \frac{\partial}{\partial h_i} \]
for smooth functions $f_i$, $g_i$. Define
\[ [u, v] = \sum_{i,j=1}^{n} \Bigl( f_i \frac{\partial g_j}{\partial h_i} - g_i \frac{\partial f_j}{\partial h_i} \Bigr) \frac{\partial}{\partial h_j}. \]
Prove that this is a well-defined operation on smooth vector fields on a smooth
manifold $M$, and that it satisfies (7.6.1) and (7.6.2) of Chapter 6.
(6) A Lie group is a smooth manifold $G$ which is also a group (see B.3.1), such
that the operations of multiplication $G \times G \to G$ and inverse $G \to G$
are smooth maps (see 1.4, (5)). Prove that the groups $GL_n(\mathbb{R})$, $GL_n(\mathbb{C})$ (see
Appendix B, Exercise (6)) are open subsets of the real vector spaces of all $n \times n$
real (resp. complex) matrices, and are Lie groups by considering their group
structure and the induced smooth manifold structure.
(7) Let $G$ be a Lie group. A vector field $v$ on the manifold $G$ is called left
invariant if
\[ (DL_g)(v(e)) = v(g) \]
for every $g \in G$ where $e$ is the unit element and $L_g : G \to G$ is the
diffeomorphism given by left multiplication by $g$. Prove that the $\mathbb{R}$-vector
space of left invariant vector fields on $G$ is isomorphic to $TG_e$ by
\[ v \mapsto v(e). \]
(8) Prove that if $G$ is a Lie group, then the vector space $\mathfrak{g}$ of left-invariant smooth
vector fields on $G$ forms a sub-algebra of the Lie algebra of all smooth
vector fields discussed in Exercise (5) in the sense that the Lie bracket of two
left invariant vector fields is left invariant. This $\mathfrak{g}$ is called the Lie algebra
associated with the Lie group $G$, and can be shown to encode a large part of
the Lie group structure of $G$. (For further reading, see for example [9, 10].)
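(9) Find two smooth 1-forms $\omega$, $\eta$ on $\mathbb{R}^2$ such that for every $x \in \mathbb{R}^2$ we have
$\omega(x), \eta(x) \neq 0$ and there does not exist any non-empty open set $U \subseteq \mathbb{R}^2$ and
$\varphi : U \to \mathbb{R}^2$ with $\varphi^*\eta = \omega|_U$. Compare with 2.5. [Hint: use the exterior
derivative.]
(10) Prove that for a smooth $k$-form $\omega$ and a smooth $\ell$-form $\eta$,
\[ d(\omega \wedge \eta) = (d\omega) \wedge \eta + (-1)^k\, \omega \wedge d\eta. \]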
(11) Generalize the proof of Lemma 3.1 to prove that for a smooth map
$f : M \to N$ and a smooth $k$-form $\omega \in \Omega^k(N)$, we have
\[ d(f^*\omega) = f^*(d\omega). \]
Conclude that we have a canonical linear map
\[ f^* : Z^k(N) \to Z^k(M) \]
which restricts to
\[ f^* : B^k(N) \to B^k(M) \]
and hence determines a linear map
\[ f^* : H^k_{DR}(N) \to H^k_{DR}(M). \]
Hence, de Rham cohomology is contravariant in smooth maps.
(12) Prove that diffeomorphic smooth manifolds have the same Betti numbers
[Hint: use Exercise (11)].
(13) Note that a smooth 0-form is the same thing as a smooth function. Prove that
a smooth 0-form is closed if and only if it is locally constant. Conclude that
$b_0(M)$ is the number of connected components of $M$.
(14) Let $\varphi : \mathbb{R} \to S^1$ be the smooth map defined by
\[ \varphi(t) = (\cos(t), \sin(t)). \]
Prove that a smooth 1-form $f\,dx$ on $\mathbb{R}$ is equal to $\varphi^*\omega$ for some $\omega \in \Omega^1(S^1)$
if and only if $f$ is a smooth periodic function with period $2\pi$. Prove that $\omega$ is
exact if and only if
\[ \int_0^{2\pi} f(x)\,dx = 0. \]
Conclude that $b_1(S^1) = 1$. Conclude also that $b_1(S^1 \times S^1) \neq 0$. [Hint:
Consider the smooth map $S^1 \times S^1 \to S^1$ given by $(x, y) \mapsto x$ and the
smooth map $S^1 \to S^1 \times S^1$ given by $x \mapsto (x, a)$ for some constant $a$.
Use Exercise (11).]
(15) Prove that $b_n(\mathbb{R}^n) = 0$ (this is a special case of the Poincaré lemma, which
says that $b_i(\mathbb{R}^n) = 0$ for $i > 0$). [Hint: writing an $\omega \in \Omega^n(\mathbb{R}^n)$ as
$f\,dx_1 \wedge \dots \wedge dx_n$, put
\[ g(x_1, \dots, x_n) = \int_0^{x_1} f(t, x_2, \dots, x_n)\,dt \]
and consider the form $g\,dx_2 \wedge \dots \wedge dx_n$.]

(16) Let $\omega \in Z^1(S^2)$. Let $U^+ = S^2 \setminus \{(0, 0, 1)\}$, $U^- = S^2 \setminus \{(0, 0, -1)\}$, $U =
U^+ \cap U^-$. Then $U^+$ and $U^-$ are diffeomorphic to $\mathbb{R}^2$, so by Exercise (15),
there exist smooth 0-forms (i.e. smooth functions) $f : U^+ \to \mathbb{R}$, $g : U^- \to \mathbb{R}$
such that $df = \omega|_{U^+}$, $dg = \omega|_{U^-}$. Additionally, $d(f|_U - g|_U) = 0$ and hence
$f|_U - g|_U$ is locally constant and hence constant, since $U$ is connected. Let
$c = f|_U - g|_U$. Define a function $h : S^2 \to \mathbb{R}$ by $h(x) = f(x)$ for $x \in U^+$,
$h(x) = g(x) + c$ for $x \in U^-$. Prove that $dh = \omega$, and hence $b_1(S^2) = 0$.
Conclude (see Exercise (14)) that the smooth manifolds $S^2$ and $S^1 \times S^1$ are
not diffeomorphic - an intuitively obvious, but highly non-trivial fact.
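(17) Prove that $b_1(\mathbb{C} \setminus \mathbb{Z}) = \infty$.
(18) The Möbius strip. Consider here $S^1$ as the unit circle in $\mathbb{C}$. Let
\[ M = \{ (x, z) \in \mathbb{C} \times S^1 \mid x^2/z \in \mathbb{R} \}. \]
Prove that $M$ is not orientable. [Hint: Consider the immersion and submersion
$f : \mathbb{R} \times \mathbb{R} \to M$ given by $(x, t) \mapsto (x e^{\pi i t}, e^{2\pi i t})$. Prove that a 2-form
$h\,dx\,dy \in \Omega^2(\mathbb{R}^2)$ with $h\,dx\,dy = f^*\omega$ for a 2-form $\omega \in \Omega^2(M)$ must satisfy
$h(0, 1) = -h(0, 0)$ and that therefore $\omega$ cannot be nowhere vanishing.]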
(19) Consider, on $S^{n-1}$, the smooth $(n - 1)$-form
\[ \omega = \sum_{i=1}^{n} (-1)^{i+1} x_i\, dx_1 \wedge \dots \wedge dx_{i-1} \wedge dx_{i+1} \wedge \dots \wedge dx_n. \]
Prove that
\[ \int_{S^{n-1}} \omega \neq 0. \]
Conclude that $b_{n-1}(S^{n-1}) \ge 1$. [Hint: use Stokes’ Theorem, the Hodge $*$
operator and spherical coordinates.]
(20) Prove the Fundamental Theorem of Line Integrals, 4.5 (*). [Hint: After
composing with the map $\varphi$, it becomes essentially a special case of the
Fundamental Theorem of Calculus for the Riemann integral, but a little bit
of care is needed since $L$ is only piecewise smooth.]
13 Complex Analysis II: Further Topics
There are some extremely important concepts in complex analysis which we did
not cover in Chapter 10, and which ultimately lead up to several other areas of
mathematics. First of all, quite a bit more can be said about conformal maps. Under
very general conditions, one open subset of C can be mapped holomorphically
bijectively onto another. We prove one such result, the famous Riemann Mapping
Theorem. In many situations, such maps can even be written down explicitly. Those
are the Schwarz-Christoffel formulas, which have applications in cartography, as
the basic condition on mappings in cartography is to be conformal (since distortion
of distances in a topographical map is generally considered more allowable than
distortion of angles). Yet, the Schwarz-Christoffel formulas also lead to elliptic
integrals, which are “inverse” to elliptic functions (see for example [11]).
A major topic not covered in Chapter 10 is the question of “multi-valued
holomorphic maps” such as, for example, the natural logarithm on $\mathbb{C} \setminus \{0\}$ (or, for that
matter, elliptic integrals). What is the appropriate theoretical underpinning for such
functions? It turns out that now is the right moment for us to study such questions,
since we have already learned about manifolds. In this chapter, we will study com-
plex manifolds of complex dimension 1, which are called Riemann surfaces. It turns
out that the right way of thinking about multivalued functions on an open subset U
of C is as functions defined on a certain Riemann surface which is a covering of U
(not to be confused with open covers as studied in 1.1 of Chapter 9). In the process of
developing this concept, we will also learn a lot more about complex integration (we
will develop, for example, integration of holomorphic functions along continuous
paths and will show that if two paths are homotopic, i.e. one can be continuously
deformed to another, the integrals are the same). At the same time, we will also
explore striking ways in which complex differential forms behave on Riemann
surfaces, which will greatly enhance our understanding of complex integration.
Finally, we will see that methods of complex analysis extend even to functions
which are not holomorphic, generalizing, for example, the Cauchy formula to func-
tions which are continuously differentiable but not holomorphic. These methods will
be very useful in Chapter 15 below, where we will construct compatible complex
structures on oriented surfaces with Riemann metrics.

As is the case with the concept of manifolds, the study of coverings has a close
connection with algebraic topology, which we will not explore here in detail. We
will, however, briefly introduce the concept of the fundamental group and give two
examples in Exercises (15) and (16). For more information on Riemann surfaces,
we recommend the book [6], and for a very concise yet informative study of the
fundamental group and coverings in an abstract topological setting, [13]. For very
interesting ventures to higher dimensions, [8] may be an excellent source.
1 The Riemann Mapping Theorem
In this section, we will consider bijective holomorphic maps $f : U \to V$ where
$U, V$ are open subsets of $\mathbb{C}$. Note that by Theorem 6.3.3 of Chapter 10, we must
have $f'(z) \neq 0$ for $z \in U$, and hence $f^{-1} : V \to U$ is also a bijective holomorphic
function. Such functions will be called holomorphic isomorphisms, and if $U = V$,
holomorphic automorphisms.
1.1 Holomorphic self-maps of C and the unit disk
1.1.1 Proposition. The only injective holomorphic functions on $\mathbb{C}$ are $f(z) = az + b$
with $a \neq 0$.
Proof. Let us study the singularity of the function $f(1/z)$ at $z = 0$. If this singularity
is removable, then $f$ is bounded, and hence constant by Liouville’s Theorem 5.1 of
Chapter 10, contradicting our assumptions. If $f(1/z)$ has a pole of order $k > 1$ at 0,
then for $\varepsilon > 0$ sufficiently small, there exists, by Theorem 6.3.3 of Chapter 10, a
$\delta > 0$ such that $1/f(1/z) - a$ has exactly $k$ zeros in $\Omega(0, \varepsilon)$ for every $a \in \Omega(0, \delta)$. Note
that these $k$ zeros may include zeros of order $> 1$, but not for $\delta$ sufficiently small,
since otherwise the holomorphic function $(1/f(1/z))'$ would have zeros arbitrarily
close to 0, and hence would be constantly 0 by Theorem 4.4 of Chapter 10. However,
if the $k$ zeros are all different, this contradicts injectivity of $f$. Finally, if $f(1/z)$
has an essential singularity at 0, let $f(0) = A$. Then by the Holomorphic Open
Mapping Theorem 6.3.4 of Chapter 10, for every $r > 0$ there exists an $\varepsilon > 0$
such that $f|_{\Omega(0, r)}$ takes on every value in $\Omega(A, \varepsilon)$. On the other hand, applying
Proposition 6.2 of Chapter 10 to $f(1/z)$, we see that there are (infinitely many) $z$
with $|z| > r$ such that $f(z) \in \Omega(A, \varepsilon)$ which, again, contradicts injectivity.
We have concluded that $f(1/z)$ has a pole of order 1 at $z = 0$. Then the
function $(f(z) - A)/z$ is holomorphic and bounded on $\mathbb{C}$, and hence is constant
by Liouville’s Theorem 5.1. $\square$
It is, however, convenient to consider a slightly larger class of maps called
Möbius transformations. These maps are formally defined as maps $\mathbb{C} \cup \{\infty\} \to
\mathbb{C} \cup \{\infty\}$ by formulas
\[ f_A(z) = \frac{az + b}{cz + d} \]
where $A$ is a matrix of complex numbers
\[ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \]
and we assume
\[ \det(A) \neq 0. \]
One readily verifies that
\[ f_A \circ f_B = f_{AB}, \]
and thus all Möbius transformations are bijective maps $\mathbb{C} \cup \{\infty\} \to \mathbb{C} \cup \{\infty\}$.
We will understand that better in Section 3 below. While in the formalism we
introduced, Möbius transformations with $c \neq 0$ are, by definition, not holomorphic
functions on $\mathbb{C}$, they can be useful in mapping injectively holomorphically certain
open subsets of $\mathbb{C}$ onto one another (see Exercises 1, 2).
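The composition rule for Möbius transformations is easy to test numerically. The sketch below is illustrative only: the matrices `A`, `B` and the sample point `z` are arbitrary choices with non-zero determinants, the helper names `mobius`, `matmul` and `det` are hypothetical, and the point at infinity is not handled, so the sample point is chosen to keep every denominator non-zero.

```python
def mobius(A):
    """Return the map z -> (a z + b) / (c z + d) for A = ((a, b), (c, d))."""
    (a, b), (c, d) = A
    return lambda z: (a * z + b) / (c * z + d)


def matmul(A, B):
    """Product of two 2 x 2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def det(A):
    (a, b), (c, d) = A
    return a * d - b * c


A = ((1, 2j), (3, 4))
B = ((2, -1), (1j, 1))
z = 0.3 + 0.7j

composed = mobius(A)(mobius(B)(z))   # f_A(f_B(z))
product = mobius(matmul(A, B))(z)    # f_{AB}(z)
print(composed, product)
```

Since $f_A \circ f_{A^{-1}} = f_I$ is the identity, the same rule gives the bijectivity claimed in the text.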
1.1.2 Lemma. (Schwarz’s Lemma) If $f(z)$ is a holomorphic function on $\Omega(0, 1)$
which satisfies the conditions $|f(z)| \le 1$ for all $z \in \Omega(0, 1)$ and $f(0) = 0$, then
$|f(z)| \le |z|$ for $z \in \Omega(0, 1)$, and $|f'(0)| \le 1$. If additionally $|f(z_0)| = |z_0|$ for
some non-zero $z_0 \in \Omega(0, 1)$ or $|f'(0)| = 1$, then $f(z) = cz$ for all $z \in \Omega(0, 1)$ for some
constant $c$ with $|c| = 1$.
Proof. Consider the function
\[ g(z) = \begin{cases} f(z)/z & \text{for } z \in \Omega(0, 1) \setminus \{0\}, \\ f'(0) & \text{for } z = 0. \end{cases} \]
This function is holomorphic on $\Omega(0, 1)$ (see Exercise (14)). By the maximum
principle 6.3.5 of Chapter 10, then, the maximum of the function $|g(z)|$ on the closed
disk $\overline{\Omega(0, r)}$ can occur only on the boundary for any $0 < r < 1$. By assumption, $|g(z)| \le 1/r$
for $|z| = r$, and hence also for $|z| \le r$. Passing to the limit $r \to 1$, we get $|g(z)| \le 1$ for
$z \in \Omega(0, 1)$, which is the first claim. If equality arises for a single point $z_0 \in \Omega(0, 1)$,
the function $|g|$ has a maximum at that point, and hence $g$ must be constant. In order for
the equality to actually arise, however, the constant must have absolute value 1. $\square$
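The two inequalities of the lemma can be observed on an example. The sketch below is an illustration, not a proof: it assumes the sample function $f(z) = z (z - a)/(1 - az)$ with the arbitrary real parameter $a = 0.5$, which is holomorphic on the unit disk, bounded by 1 there and vanishes at 0. It evaluates $|f(z)|/|z|$ on a polar grid and estimates $f'(0)$ by a central difference.

```python
import cmath
import math

a = 0.5  # arbitrary real parameter with |a| < 1


def f(z):
    """z times a disk automorphism; maps the unit disk into itself, f(0) = 0."""
    return z * (z - a) / (1 - a * z)


radii = (0.1, 0.3, 0.5, 0.7, 0.9, 0.99)
angles = [2 * math.pi * k / 24 for k in range(24)]
samples = [r * cmath.exp(1j * th) for r in radii for th in angles]

# Largest observed value of |f(z)| / |z| over the sample points.
max_ratio = max(abs(f(z)) / abs(z) for z in samples)

# Central-difference estimate of f'(0); the exact value is -a.
h = 1e-6
deriv0 = (f(h) - f(-h)) / (2 * h)
print(max_ratio, abs(deriv0))
```

The ratio stays below 1 and $|f'(0)| = 0.5 < 1$; equality is impossible here because this $f$ is not of the form $cz$.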
1.1.3 Corollary. Let $f$ be a holomorphic automorphism of $\Omega(0, 1)$ such that
$f(0) = 0$ and $f'(0)$ is a positive real number. Then $f(z) = z$ for all $z \in \Omega(0, 1)$.
Proof. First note that by Theorem 6.3.3 of Chapter 10, $f'(z) \neq 0$ for all $z \in
\Omega(0, 1)$, and thus $f^{-1} : \Omega(0, 1) \to \Omega(0, 1)$ is also a holomorphic map. Thus,
Schwarz’s Lemma can be applied to both $f$ and $f^{-1}$. In particular, $|f'(0)| \le 1$
and $1/|f'(0)| = |(f^{-1})'(0)| \le 1$, and hence equality must arise. Therefore, by
Schwarz’s Lemma again, $f(z) = cz$ with $|c| = 1$, but the assumption that $f'(0)$ is
a positive real number then implies $c = 1$. $\square$
1.2 The Riemann Mapping Theorem
An open subset $U \subseteq \mathbb{C}$ is called simply connected if $U$ is connected and every
holomorphic function on $U$ has a primitive function.
Lemma. Let $U \subseteq \mathbb{C}$ be a simply connected open set and let $\varphi : U \to \mathbb{C} \setminus \{0\}$ be
a holomorphic function. Then there exists a holomorphic function $\operatorname{Ln}(\varphi(z))$ on $U$
such that
\[ e^{\operatorname{Ln}(\varphi(z))} = \varphi(z) \]
for all $z \in U$.
Proof. Since $U$ is simply connected, the function $\dfrac{\varphi'(z)}{\varphi(z)}$ has a primitive function on
$U$, which we will denote by $\operatorname{Ln}(\varphi(z))$. This function is determined up to an additive
constant. But using the chain rule and the product rule, we find that
\[ \Bigl( \frac{e^{\operatorname{Ln}(\varphi(z))}}{\varphi(z)} \Bigr)' = 0, \]
so this function is constant. The additive constant can therefore be chosen in such a
way that
\[ e^{\operatorname{Ln}(\varphi(z))} = \varphi(z). \qquad \square \]
Theorem. (The Riemann Mapping Theorem) Let $U \subsetneq \mathbb{C}$ be an open simply
connected set, and let $z_0 \in U$. Then there exists a unique holomorphic isomorphism
$f : U \to \Omega(0, 1)$ such that $f(z_0) = 0$ and $f'(z_0)$ is a positive real number.
Proof. First of all, note that uniqueness follows from Corollary 1.1.3, since if $f_1$, $f_2$
were two maps satisfying the conclusion of the Theorem, then $f_2 \circ (f_1)^{-1}$ would be a
holomorphic automorphism of $\Omega(0, 1)$ with positive real derivative at 0.
To prove existence, we will first prove that there exists an injective holomorphic
map $f : U \to \Omega(0, 1)$ with $f(z_0) = 0$ where $f'(z_0)$ is a positive real number. In
effect, let $a \notin U$. Apply Lemma 1.2 to the function $\varphi(z) = z - a$. Thus, we have a
function $\operatorname{Ln}(z - a)$ such that
\[ e^{\operatorname{Ln}(z - a)} = z - a. \]
Now let
\[ h(z) = e^{\operatorname{Ln}(z - a)/2}. \]
Then
\[ h(z)^2 = z - a \quad \text{on } U, \]
which means that for any distinct $z, t \in U$,
\[ h(z) \neq \pm h(t), \]
and moreover $h(z) \neq -h(z)$, since $h$ has no zero.
By the Holomorphic Open Mapping Theorem 6.3.4 of Chapter 10, there is an $r > 0$
such that
\[ \Omega(h(z_0), r) \subseteq h[U]. \]
Therefore,
\[ h[U] \cap \Omega(-h(z_0), r) = \emptyset. \]
This means that for $z \in U$,
\[ |h(z) + h(z_0)| \ge r, \]
and in particular,
\[ 2|h(z_0)| \ge r. \]
Now consider the function
\[ f_0(z) = \frac{r}{4} \cdot \frac{|h'(z_0)|}{|h(z_0)|^2} \cdot \frac{h(z_0)}{h'(z_0)} \cdot \frac{h(z) - h(z_0)}{h(z) + h(z_0)}. \]
First, note that the denominator is non-zero. Clearly, we have $f_0(z_0) = 0$. From the
chain rule, in fact,
\[ f_0'(z_0) = (r/8) \cdot |h'(z_0)| / |h(z_0)|^2 > 0. \]
Additionally, $f_0$ is a composition of a Möbius transformation with the injective
function $h$. Thus, $f_0$ is injective. Finally,
\[ \left| \frac{h(z) - h(z_0)}{h(z) + h(z_0)} \right|
= |h(z_0)| \cdot \left| \frac{1}{h(z_0)} - \frac{2}{h(z) + h(z_0)} \right| \le \frac{4 |h(z_0)|}{r} \]
which shows that $|f_0(z)| \le 1$ for $z \in U$. Of course, strict inequality must hold by
the Holomorphic Open Mapping Theorem 6.3.4 of Chapter 10.
Now let $N$ be the supremum of all the values $f'(z_0)$ over the set $S$ of all injective
holomorphic functions $f : U \to \Omega(0, 1)$ which satisfy $f(z_0) = 0$ and $f'(z_0) > 0$. (Note that
a priori, one may have $N = +\infty$.) There exists, however, a sequence $(f_n)_n$ of
functions in $S$ such that
\[ \lim_{n \to \infty} f_n'(z_0) = N. \]
Clearly, the sequence $f_n$ is uniformly bounded and by Theorem 3.7 of Chapter 10,
is also equicontinuous on every compact subset of $U$. By Theorem 6.3 of Chapter 9
(a consequence of the Arzelà-Ascoli Theorem), there exists a subsequence $(f_{i_n})_n$
which converges uniformly on every compact subset $K \subseteq U$. Denote the limit
function by $f$. We know by Weierstrass’s Theorem 3.6 of Chapter 10 that $f$ is
holomorphic, $f'(z_0) = N$ (and thus $N < \infty$), and $|f(z)| \le 1$ for every $z \in U$, but,
again, a strict inequality must arise by Theorem 6.3.4 of Chapter 10. We will now
show that $f$ is injective. In effect, let $z_1 \in U$. Then the functions $f_{i_n}(z) - f_{i_n}(z_1)$
have no zero in $U \setminus \{z_1\}$, and hence $f(z) - f(z_1)$ has no zero in $U \setminus \{z_1\}$ by Hurwitz’s
Theorem 6.3.6 of Chapter 10. Since $z_1$ was arbitrary, $f$ is injective as claimed.
We claim that the function $f : U \to \Omega(0, 1)$ is onto. Assume, for contradiction,
that $w_0 \in \Omega(0, 1) \setminus f[U]$. Let
\[ \varphi(z) = \frac{f(z) - w_0}{1 - \overline{w_0}\, f(z)}. \]
Note that this function is injective since it is the composition of a Möbius
transformation with the injective function $f$. Applying Lemma 1.2 to this function
$\varphi(z)$, we can find a function $\operatorname{Ln}(\varphi(z))$ on $U$ which satisfies
\[ e^{\operatorname{Ln}(\varphi(z))} = \varphi(z). \]
Let, again,
\[ g(z) = e^{(\operatorname{Ln} \varphi(z))/2}, \]
so that
\[ \varphi(z) = (g(z))^2. \]
Let
\[ F(z) = \frac{|g'(z_0)|}{g'(z_0)} \cdot \frac{g(z) - g(z_0)}{1 - \overline{g(z_0)}\, g(z)}. \]
Note that we have $F \in S$. (Compare Exercise (2).)
Note that $f = \psi \circ F$ for a certain holomorphic function $\psi : \Omega(0, 1) \to \Omega(0, 1)$
which satisfies $\psi(0) = 0$. Concretely, $\psi$ is a composition of a holomorphic
automorphism of $\Omega(0, 1)$ which maps 0 to $g(z_0)$ followed by squaring, and a
holomorphic automorphism of $\Omega(0, 1)$ which maps 0 to $w_0$. Since $\psi$ is obviously
not a linear map, we have $|\psi'(0)| < 1$ by Schwarz’s Lemma 1.1.2, so, since
$f'(z_0) = \psi'(0) F'(z_0)$, we get $F'(z_0) > f'(z_0) = N$ (since both are positive real numbers),
which is a contradiction. $\square$
1.3 Comments
1. It is clear why the case $U = \mathbb{C}$ must be excluded in the statement of the Riemann
Mapping Theorem: By Liouville’s Theorem 5.1 of Chapter 10, any bounded
holomorphic map defined on $\mathbb{C}$ is constant.
2. Excluding the case of $U = \mathbb{C}$, as already remarked, Proposition 2.5 of Chapter 10
provides a converse to the Riemann Mapping Theorem when stated for real-
regular images of convex open sets. Perhaps much more importantly, however,
Proposition 2.5 of Chapter 10 serves as a source of examples for the Theorem.
While our definition of a simply connected set above precisely fits the proof of
the Theorem, it is not a condition which is easy to verify. On the other hand,
constructing real injective regular maps on convex sets, as in Proposition 2.5 of
Chapter 10, is easy (for example, see Exercise (4)).
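For some sets the map promised by the Riemann Mapping Theorem can be written down explicitly. The sketch below is an illustration that is not taken from the text: it assumes the standard example of the open upper half-plane with base point $z_0 = i$, for which a Möbius transformation $f(z) = i(z - i)/(z + i)$ satisfies the normalization $f(z_0) = 0$, $f'(z_0) > 0$. The code checks this numerically on a few arbitrary sample points; `f_inv` is the inverse computed by hand.

```python
def f(z):
    """Candidate Riemann map of the upper half-plane, normalized at z0 = i."""
    return 1j * (z - 1j) / (z + 1j)


def f_inv(w):
    """Inverse of f, mapping the unit disk back to the upper half-plane."""
    return (1 - 1j * w) / (w - 1j)


z0 = 1j
h = 1e-5
deriv = (f(z0 + h) - f(z0 - h)) / (2 * h)  # central difference for f'(z0)

# Arbitrary sample points with positive imaginary part.
samples = [complex(x, y) for x in (-3, -1, 0, 0.5, 2) for y in (0.01, 0.5, 1, 4)]
inside_disk = all(abs(f(z)) < 1 for z in samples)
round_trip_error = max(abs(f_inv(f(z)) - z) for z in samples)
print(f(z0), deriv, inside_disk, round_trip_error)
```

By the uniqueness part of the theorem, this is the only holomorphic isomorphism of the upper half-plane onto the unit disk with this normalization at $i$.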
2 Holomorphic isomorphisms of disks onto polygons and the Schwarz-Christoffel formula
2.1 Convex polygons
We will examine holomorphic isomorphisms between $\Omega(0,1)$ and open polygons. We will restrict here to convex polygons, although the restriction is not really necessary (in the sense that the same formula applies to non-convex polygons as well). However, convex polygons are much easier to treat rigorously. By an open half-plane of angle $a$, $0 \le a < 2\pi$, we shall mean a subset of $\mathbb{C}$ consisting of all the numbers $b + z$ where
$$\operatorname{Im}(z e^{-ia}) > 0 \tag{2.1.1}$$
for some constant $b \in \mathbb{C}$. A closed half-plane is the closure of an open half-plane of the same angle.
An open convex polygon $P$ is a bounded non-empty intersection of open half-planes $P_1, \dots, P_k$ of angles $a_1, \dots, a_k$, $0 < a_1 < \dots < a_k \le 2\pi$. We put $a_0 = a_k - 2\pi$. The corresponding closed convex polygon is its closure $\overline{P}$. Consider the points $z_i$, where $\{z_i\} = \partial P_i \cap \partial P_{i-1}$, $i = 1, \dots, k$, and we let $P_0 = P_k$ and $z_0 = z_k$. Let us assume, without loss of generality, that $k$ is the smallest possible for the given $P$. Then the points $z_i$ are called the vertices of $P$ and the number $\beta_i = (a_i - a_{i-1})/\pi$ the exterior angle at the vertex $z_i$, $i = 1, \dots, k$. Setting $\alpha_i = 1 - \beta_i$, the angle $\alpha_i$ is called the interior angle at the vertex $z_i$ (both angles are thus measured in multiples of $\pi$). The boundary $\partial P$ is the union of the closed line segments $L_i$ between the points $z_i$, $z_{i+1}$, $i = 0, \dots, k-1$.
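As a small numerical illustration of these definitions, the following sketch computes exterior angles in multiples of $\pi$ from a list of vertices. It assumes the vertices are given counterclockwise as complex numbers, and it identifies the difference of consecutive half-plane angles with the turning angle between consecutive sides; the function name `exterior_angles` is hypothetical.

```python
import cmath
import math

def exterior_angles(vertices):
    """Exterior angles beta_i, in multiples of pi, of a convex polygon whose
    vertices are given counterclockwise as complex numbers."""
    k = len(vertices)
    betas = []
    for i in range(k):
        incoming = complex(vertices[i]) - complex(vertices[i - 1])        # side arriving at z_i
        outgoing = complex(vertices[(i + 1) % k]) - complex(vertices[i])  # side leaving z_i
        # The turning angle between the two sides plays the role of a_i - a_{i-1}.
        betas.append(cmath.phase(outgoing / incoming) / math.pi)
    return betas

square = [0, 1, 1 + 1j, 1j]
triangle = [0, 2, 1 + 1j]
betas_square = exterior_angles(square)
alphas_triangle = [1 - beta for beta in exterior_angles(triangle)]
print(betas_square, sum(betas_square))        # exterior angles of the square and their sum
print(alphas_triangle, sum(alphas_triangle))  # interior angles of the triangle and their sum
```

For a convex polygon the exterior angles add up to 2 (a full turn), a fact used later when the arguments of the factors in the Schwarz-Christoffel integrand are added up.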
Now let $z_0 \in \mathbb{C}$ and $0 < \alpha < 1$. By Lemma 1.2, the function
$$(z - z_0)^{\alpha} = e^{\alpha \operatorname{Ln}(z - z_0)} \tag{2.1.2}$$
can then be defined on any open half-plane $P$ whose boundary contains $z_0$, and inspection shows that this function can be extended to a bijective continuous function mapping $\overline{P}$ onto a closed angle of value $\alpha\pi$.
Let $f(w)$ be a holomorphic function defined in an open neighborhood $U$ of a point $w_0 \in \mathbb{C}$, let $f(w_0) = z_0$ and let $f'(w_0) \ne 0$. Define
$$g(w) = (f(w) - z_0)^{\alpha}.$$
Then we just proved that $g(w)$ is defined on $f^{-1}[P]$ where $P$ is an open half-plane whose boundary contains $z_0$.
2.2 Lemma. The function
$$g(w)(w - w_0)^{-\alpha}$$
extends holomorphically to an open neighborhood of $w_0$, and the extension is non-zero there.
Proof. By our assumption, the Taylor expansion of $f(w) - z_0$ at $w_0$ is of the form
$$(w - w_0) \sum_{n=0}^{\infty} a_{n+1} (w - w_0)^n$$
where $a_1 \ne 0$. However, since $z^{\alpha}$ can be defined as a holomorphic function in the neighborhood of any non-zero point, we may then write
$$g(w)(w - w_0)^{-\alpha} = \left( \sum_{n=0}^{\infty} a_{n+1} (w - w_0)^n \right)^{\alpha},$$
which is a holomorphic function in a neighborhood of $w_0$, with the non-zero value $a_1^{\alpha}$ at $w_0$. □
2.3 Lemma. Let $f : P \to \Omega(0,1)$ be a holomorphic isomorphism where $P$ is an open polygon. Then $f$ extends to a homeomorphism $\overline{f} : \overline{P} \to \overline{\Omega(0,1)}$. Furthermore, if we denote by $g(w)$ the inverse of $\overline{f}$, assuming that $g(w_i) = z_i$ where $z_1, \dots, z_k$ are the vertices of $P$, then for $w \ne w_1, \dots, w_k$ in the domain of $g$, $g$ can be extended to a holomorphic function with non-zero derivative in a neighborhood of $w$, and additionally
$$(g(w) - z_i)(w - w_i)^{-\alpha_i} \tag{2.3.1}$$
can be extended to a holomorphic function on an open neighborhood of $w_i$, which is non-zero there.
Proof. Consider a point $z \in \partial P$, $z \ne z_1, \dots, z_k$. Let us use the above notation for $P$. Assume, without loss of generality, that $z \in L_0$, and $L_0 \subset \mathbb{R}$. Consider the set $Q = \operatorname{Int}(\overline{P} \cup \{\overline{z} \mid z \in \overline{P}\})$ (note that $\overline{z}$ means the complex conjugate of $z$ while $\overline{P}$ is the closure), and the set $R = \mathbb{C} \setminus \{z \in \mathbb{R} \mid |z| \ge 1\}$. For $z \in L_0 \setminus \{z_0, z_1\}$, by the Riemann Mapping Theorem 1.2, there exists a unique holomorphic isomorphism $g : Q \to R$ such that $g(z) = 0$, and $g'(z)$ is a positive real number. Since, however, the map $\overline{g(\overline{z})}$ is another solution, it must be equal to $g(z)$ or, in other words, $g(\overline{z}) = \overline{g(z)}$. It then follows from the Intermediate Value Theorem that $g[P]$ is contained in the upper half-plane (the open half-plane with angle $0$), and hence must be equal to it. Thus, $g$ restricts to a holomorphic isomorphism from $P$ onto the upper half-plane. Replacing $\Omega(0,1)$ by the upper half-plane (which we may do by a Möbius transformation), we see that $f$ extends holomorphically to an open neighborhood of $z$.
Let us now consider the points $z = z_i$. Assume, without loss of generality, $i = 0$, $L_0 \subset \mathbb{R}$, $z_0 = 0$. Then denoting by $\phi$ the continuous extension of the function (2.1.2) for $\alpha = \alpha_0$ to the closed upper half-plane, let $\psi$ be the restriction of $\phi^{-1}$ to $\overline{P}$. Then we have
$$\psi[\overline{P}] \cap \mathbb{R} = \psi[L_{k-1} \cup L_0].$$
Let this image be the interval $\langle s, t\rangle$ where $s < 0 < t$. Now applying the argument of the previous paragraph to the holomorphic isomorphism $\Phi$ from the set
$$\operatorname{Int}(\psi[\overline{P}] \cup \{\overline{z} \mid z \in \psi[\overline{P}]\})$$
to $\mathbb{C} \setminus ((-\infty, s\rangle \cup \langle t, \infty))$ which maps $0$ to $0$ with a positive real derivative at $0$, we again see that the map must be symmetric under complex conjugation, and hence must map $\psi[P]$ holomorphically bijectively onto the open upper half-plane. Therefore (after composing with a Möbius transformation to pass from the upper half-plane to $\Omega(0,1)$), $\Phi \circ \psi$ gives a continuous extension of $f$ to a neighborhood of $z_0$ in $\overline{P}$, and, in fact, also a holomorphic extension of $f \circ \psi^{-1}$ to an open neighborhood of $0$.
Now the open neighborhoods of all points $z \in \partial P$ cover $\partial P$ which is compact, and hence by the Heine-Borel Theorem 2.3 of Chapter 9, we may cover $\partial P$ by finitely many such neighborhoods. By the uniqueness theorem for holomorphic functions 4.4 of Chapter 10, the local extensions of $f$ agree on the intersections of the neighborhoods, which proves the existence of the continuous map $\overline{f}$.
Now the statement about the holomorphic extension of the function (2.3.1) follows from Lemma 2.2 (applied to this same function $g$). □
2.4 Theorem. (The Schwarz-Christoffel formula) Let $P$ be a convex polygon as above, and let $f : P \to \Omega(0,1)$ be a holomorphic isomorphism. Let $\overline{f}(z_i) = w_i$ (see Lemma 2.3). Then the inverse $g$ of the map $f$ is given by the formula
$$g(w) = C \int_0^w \prod_{i=1}^{k} (u - w_i)^{-\beta_i}\, du + D \tag{2.4.1}$$
for some constants $C, D \in \mathbb{C}$.
Proof. Apply Lemma 2.3. Differentiating, we get that the function
$$h(w) = g'(w) \prod_{j=1}^{k} (w - w_j)^{\beta_j} \tag{*}$$
extends holomorphically to an open set $U$ containing $\overline{\Omega(0,1)}$, and the extension has no zero on $U$. We will show that the argument of the function (*) is, in fact, constant on the boundary $\partial\Omega(0,1)$. First, consider the argument (in the sense of Subsection 6.3 of Chapter 10) of the function $g'(w)$ for $w$ in the segment of the unit circle between the points $w_i$ and $w_{i+1}$. But we see that $g(w)$ on that segment is a composition of a linear function with $\operatorname{Ln}(z)$, and using the chain rule,
$$\operatorname{Arg}(g'(w)) = C_i - \operatorname{Arg}(w)$$
where $C_i$ is a constant in the circle segment between $w_i$ and $w_{i+1}$. (Comment: In this case, we consider the argument in the broader sense, i.e. determined only up to an integral multiple of $2\pi$.) Now the key point is that the slope of the side of $P$ changes by $\beta_i\pi$ when passing the point $w_i$, which immediately gives
$$C_i - C_{i-1} = \beta_i \pi.$$
On the other hand, by basic geometry of an isosceles triangle, we have
$$\operatorname{Arg}(w - w_i) = \pm\frac{\pi}{2} + \frac{\operatorname{Arg}(w_i) + \operatorname{Arg}(w)}{2}. \tag{**}$$
Therefore, for $w$ on the circle segment between $w_i$ and $w_{i+1}$, since the exterior angles satisfy $\sum_j \beta_j = 2$,
$$\operatorname{Arg}\left( \prod_{j=1}^{k} (w - w_j)^{\beta_j} \right) = Q_i + \Big( \sum_{j=1}^{k} \beta_j \Big) \operatorname{Arg}(w)/2 = Q_i + \operatorname{Arg}(w)$$
for some constant $Q_i$. When passing $w_i$ in the clockwise direction, one of the $+$ signs in (**) changes into a $-$, which shows
$$Q_i - Q_{i-1} = -\beta_i \pi.$$
We see then that the argument of the function $h(w)$ of (*), which equals $C_i + Q_i$ on each circle segment, is constant on the unit circle. Thus, the holomorphic function $h(w)$ on $U$ maps the unit circle into a set of the form
$$S = \{t e^{ib} \mid t > 0\}$$
for $b$ constant (a ray). Applying the Maximum Principle 6.3.5 of Chapter 10 to the holomorphic functions
$$e^{i h(w) e^{-ib}}, \qquad e^{-i h(w) e^{-ib}},$$
we then see that $h(w)$ maps the whole set $\overline{\Omega(0,1)}$ to $S$, and thus, by the Open Mapping Theorem 6.3.4, is constant. Integrating gives the statement of the theorem. □
Comment: The numbers $w_i$ are not determined by Theorem 2.4 or any of the
above discussion. They are difficult to determine analytically except in a few very
special situations (see Exercises (6), (7)).
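To make the formula concrete, here is a minimal numerical sketch for one symmetric special case. It assumes, by symmetry, that the points $w_i$ for a square can be taken to be $1, i, -1, -i$, so that all $\beta_i = 1/2$ and $\prod_i (u - w_i) = u^4 - 1$; the constants are chosen so that the integrand is $(1 - u^4)^{-1/2}$ and $D = 0$. The helper names `simpson` and `g_on_ray` are hypothetical. The check performed is that the midpoint of the arc between $w = 1$ and $w = i$ is sent to the midpoint of the straight side joining the images of $1$ and $i$.

```python
import cmath
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule for a real or complex integrand (n must be even)."""
    h = (b - a) / n
    total = func(a) + func(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * func(a + j * h)
    return total * h / 3

def g_on_ray(theta, radius=1.0):
    """Integrate (1 - u^4)^(-1/2) along the ray u = t*e^{i*theta}, 0 <= t <= radius."""
    direction = cmath.exp(1j * theta)
    integrand = lambda t: direction / cmath.sqrt(1 - (t * direction) ** 4)
    return simpson(integrand, 0.0, radius)

# Image of the point w = 1: substituting t = sin(phi) removes the endpoint singularity.
K = simpson(lambda phi: 1 / math.sqrt(1 + math.sin(phi) ** 2), 0.0, math.pi / 2)

# By the substitution u = i*t, the image of w = i is i*K, so one side joins K and i*K.
mid = g_on_ray(math.pi / 4)          # image of the midpoint of the arc from 1 to i
side_midpoint = (K + 1j * K) / 2
print(K, mid, side_midpoint)
```

The same substitution argument shows that the four points $1, i, -1, -i$ are sent to $K, iK, -K, -iK$, the vertices of a square.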
3 Riemann surfaces, coverings and complex differential forms
We will now use what we learned about complex analysis to discuss a partial
“complex analog” of some of the material of Chapter 12. While this may seem like
an abstract exercise, it actually turns out to be an extremely useful device, which
will enhance greatly our understanding of topics already covered, such as Möbius
transformations, simply connected open subsets of C, and even primitive functions.
3.1 Riemann Surfaces: the basic definitions
Much of the theory of smooth manifolds of Chapter 12 can be directly translated to form a theory of “complex manifolds” by simply replacing $\mathbb{R}$ with $\mathbb{C}$ and smooth functions by holomorphic functions. However, there are some notable exceptions which require care. First of all, to discuss a theory of complex manifolds in an arbitrary dimension, we would first have to study analysis in several complex variables. While the reader could probably fill in the basic definitions, the theory of several complex variables is a special area of analysis with many subtleties, which exceeds the realm of this book. For a good introduction to that subject, we recommend [15].
Because of this, we will restrict our attention to complex dimension 1. A complex
manifold of complex dimension 1 is called a Riemann surface. (It has, of course,
topological dimension 2, and 2-dimensional manifolds are often called surfaces.)
Thus, we have the following definition:
A Riemann surface is a 2-dimensional topological manifold $\Sigma$ with an atlas $(U_x, h_x)$ (where the coordinate maps are understood as maps into $\mathbb{C}$) such that the compositions (C) of Subsection 1.2 of Chapter 12 are holomorphic maps.
Analogously with Subsection 1.3 of Chapter 12, a map $f : \Sigma_1 \to \Sigma_2$ of Riemann surfaces is called holomorphic if $f$ is continuous, and for every $x \in \Sigma_1$, the composition
$$h_x[f^{-1}[U_{f(x)}] \cap U_x] \xrightarrow{\;h_x^{-1}\;} f^{-1}[U_{f(x)}] \cap U_x \xrightarrow{\;f\;} U_{f(x)} \xrightarrow{\;h_{f(x)}\;} \mathbb{C}$$
is holomorphic. As expected, a holomorphic map $f : \Sigma \to \mathbb{C}$ for a Riemann surface $\Sigma$ will be called a holomorphic function on $\Sigma$.
The treatment of tangent vectors of Riemann surfaces also parallels directly the smooth case, i.e. Subsection 2.1 of Chapter 12. Of course, the tangent space $T\Sigma_x$ to a Riemann surface at a point $x \in \Sigma$ is a complex line, i.e. a vector space over $\mathbb{C}$ of dimension 1.
The first difference between Riemann surfaces and smooth manifolds is that Remark 1 of Subsection 1.1 of Chapter 12 does not apply to Riemann surfaces. In other words, we cannot assume that the coordinate maps are onto $\mathbb{C}$: if we did, then there would not be enough examples. By Proposition 1.1.1, an open subset $U \subsetneq \mathbb{C}$, which we can consider as a Riemann surface where the (single) coordinate map is the inclusion, does not have an atlas whose coordinate systems would be holomorphic maps onto $\mathbb{C}$.
Another substantial difference is the absence of a “holomorphic partition of
unity”. In other words, the discussion of Subsection 1.5 of Chapter 12 does not have
a holomorphic analogue. For example, a holomorphic function on a connected open
set which is 0 outside of a compact subset is necessarily constant 0 by Theorem 4.4
of Chapter 10.
On the other hand, also in contrast with the case of real manifolds, note again that for a bijective holomorphic map of Riemann surfaces $f : \Sigma_1 \to \Sigma_2$, by Theorem 6.3.3 of Chapter 10, $Df_x \ne 0$ for every $x \in \Sigma_1$, and thus $f^{-1}$ is also a bijective holomorphic map. Again, such maps will be called holomorphic isomorphisms, and holomorphic automorphisms if $\Sigma_1 = \Sigma_2$.
3.2 The first examples
It turns out that we already have a number of examples of Riemann surfaces. Of course, open subsets of $\mathbb{C}$ are immediate examples. A first “non-trivial” example is the complex projective space $\mathbb{CP}^1$: As a set, it is $\mathbb{C} \cup \{\infty\}$. It is topologized as $S^2$, with $\mathbb{C}$ identified homeomorphically with $S^2 \setminus \{a\}$ for any chosen point $a \in S^2$. Then the atlas has two charts: one is $\mathbb{C}$ and the identity on $\mathbb{C}$, the other is $\mathbb{CP}^1 \setminus \{0\}$ with the chart defined by
$$z \mapsto \begin{cases} 1/z & \text{for } z \ne \infty \\ 0 & \text{for } z = \infty. \end{cases}$$
Now it is pretty much obvious from the definition that the Möbius transformations of Subsection 1.1 are holomorphic automorphisms of $\mathbb{CP}^1$, and it is not difficult to check that they are the only ones (see Exercise (10)).
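The action of Möbius transformations on $\mathbb{CP}^1$, including the point $\infty$, can be sketched in a few lines. The sketch below assumes the usual form $z \mapsto (az + b)/(cz + d)$ with $ad - bc \ne 0$ (Subsection 1.1 is not part of this excerpt), represents $\infty$ by `None` as a convention of this example only, and uses the standard fact that composition corresponds to multiplying the coefficient matrices. The function names are hypothetical.

```python
def mobius(a, b, c, d):
    """Return z -> (a z + b)/(c z + d) on CP^1, with None standing for infinity."""
    def apply(z):
        if z is None:                       # value at infinity
            return None if c == 0 else a / c
        denom = c * z + d
        return None if denom == 0 else (a * z + b) / denom
    return apply

def compose_coefficients(m1, m2):
    """Coefficients of m1 after m2, obtained as the 2x2 matrix product m1 * m2."""
    (a1, b1, c1, d1), (a2, b2, c2, d2) = m1, m2
    return (a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
            c1 * a2 + d1 * c2, c1 * b2 + d1 * d2)

m1 = (1, 2j, 0, 1)        # translation z -> z + 2i
m2 = (0, 1, 1, 0)         # inversion z -> 1/z, swapping 0 and infinity
composite = mobius(*compose_coefficients(m1, m2))
direct = lambda z: mobius(*m1)(mobius(*m2)(z))
samples = [0.5 + 0.25j, -3.0, 0, None]
print([composite(z) for z in samples])
print([direct(z) for z in samples])
```

Note that the inversion `m2` is exactly the second chart of the atlas above extended to all of $\mathbb{CP}^1$, so in that chart the point $\infty$ is treated like any other point.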
Moreover, for an open set $U \subseteq \mathbb{C}$, note that a meromorphic function on $U$ is precisely the same thing as a holomorphic map $U \to \mathbb{CP}^1$. Because of this, one extends this to call a meromorphic function on a Riemann surface $\Sigma$ a holomorphic map $f : \Sigma \to \mathbb{CP}^1$.
Here is another example: Let $a, b$ be complex numbers linearly independent over $\mathbb{R}$. Introduce an equivalence relation on $\mathbb{C}$ where $z_1 \sim z_2$ if $z_1 - z_2 = ka + \ell b$ where $k, \ell$ are integers. The set $E$ of equivalence classes with respect to this equivalence relation is called an elliptic curve. (The use of the term “curve” here stems from algebraic geometry, where one develops methods for defining geometric objects, called varieties, over general fields. A 1-dimensional variety is called a curve. A non-singular curve over the field $\mathbb{C}$ is then, in particular, a Riemann surface.)
Denote the equivalence class of $z \in \mathbb{C}$ by $[z]$; an element of an equivalence class is called its representative. Clearly, we have a projection
$$\pi : \mathbb{C} \to E$$
given by
$$\pi(z) = [z].$$
We may define a metric on $E$ by letting the distance of two classes $[z_0]$, $[t_0]$ be
$$\min |z - t|$$
where $z \in [z_0]$, $t \in [t_0]$. The reason the minimum exists is that the subset
$$L = \{ka + \ell b \mid k, \ell \in \mathbb{Z}\}$$
is discrete. The projection $\pi$ is then continuous. There exists, therefore, an $\varepsilon > 0$ such that $\Omega(0, \varepsilon) \cap L = \{0\}$. Then for any $z \in \mathbb{C}$, $\pi|_{\Omega(z, \varepsilon)}$ is a homeomorphism onto $\Omega([z], \varepsilon)$. Thus, the inverses of these restrictions can be taken for an atlas, making $E$ a Riemann surface.
Meromorphic functions on $E$ are the same data as doubly periodic meromorphic functions on $\mathbb{C}$. Such functions are called elliptic functions. See Exercise (8) for one method by which examples of elliptic functions can be constructed.
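The metric on $E$ can be evaluated numerically. The following is a minimal sketch, assuming that two points of $\mathbb{C}$ are equivalent when their difference lies in the lattice $L$; the function name `lattice_distance` and the search window are assumptions of this example, and the small window is only adequate for lattice bases that are not too skewed.

```python
import math

def lattice_distance(z, t, a, b, window=2):
    """Distance between the classes [z] and [t] in C/L, L = {k*a + l*b}.

    The difference z - t is first reduced to coefficients in [-1/2, 1/2]
    with respect to the real basis (a, b); the minimum is then searched
    over nearby lattice translates.
    """
    diff = z - t
    det = a.real * b.imag - a.imag * b.real      # non-zero: a, b independent over R
    x = (diff.real * b.imag - diff.imag * b.real) / det
    y = (a.real * diff.imag - a.imag * diff.real) / det
    reduced = diff - round(x) * a - round(y) * b
    return min(abs(reduced + k * a + l * b)
               for k in range(-window, window + 1)
               for l in range(-window, window + 1))

a, b = 1 + 0j, 1j                                  # square lattice
d = lattice_distance(0.9 + 0.9j, 0.1 + 0.1j, a, b)
same_class = lattice_distance(0.3 + 0.2j, 0.3 + 0.2j + 3 * a - 2 * b, a, b)
print(d, same_class)
```

In the first example the two representatives are far apart in $\mathbb{C}$, but their classes are close on $E$ because the representative $0.9 + 0.9i$ can be replaced by $-0.1 - 0.1i$.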
3.3 Coverings
3.3.1
Let $\Sigma$ be a Riemann surface. A holomorphic map $\pi : T \to \Sigma$, where $T$ is another Riemann surface, is called a covering if for every $z \in \Sigma$, there exists an open neighborhood $V_z$ such that $\pi^{-1}[V_z]$ is a disjoint union of open subsets $U_i$, $i \in I$, such that for each $i$, the restriction
$$\pi|_{U_i} : U_i \to V_z$$
is a holomorphic isomorphism. We call $V_z$ a fundamental neighborhood. Note that an open subset of a fundamental neighborhood which contains $z$ is also a fundamental neighborhood.
Obviously, a holomorphic isomorphism is a covering. If $E$ is an elliptic curve, the projection $\pi : \mathbb{C} \to E$ discussed in Subsection 3.2 is a covering. For yet another example, see Exercise (14).
3.3.2 Coverings from (local) primitive functions

Another example of a covering, which will be of great significance to us, is obtained as follows: Let $U \subseteq \mathbb{C}$ be an open subset, and let $f : U \to \mathbb{C}$ be a holomorphic function. Let $U_f$ be equal to $U \times \mathbb{C}$ as a set. Denote by
$$\pi : U_f \to U$$
the projection to the first factor: $\pi(z, t) = z$. Introduce, however, a topology on $U_f$ as follows: Let its basis consist of all sets of the form
$$W_{V,F} = \{(z, F(z)) \mid z \in V\} \tag{*}$$
where $V \subseteq U$ is an open subset and $F$ is a primitive function of $f$ on $V$. By Theorem 2.3 of Chapter 10, every point of $U_f$ is contained in one of the sets (*), and in fact $(W_{V,F}, \pi|_{W_{V,F}})$ form an atlas of a Riemann surface $U_f$, and, furthermore, the projection $\pi$ is a covering. (Convex open subsets of $U$ can be taken as fundamental neighborhoods.)
3.3.3 Paths and homotopy

We will now briefly investigate topological properties of coverings. By a path in a topological space $X$, one means a continuous map $\omega : \langle 0,1\rangle \to X$. The points $\omega(0)$ and $\omega(1)$ are called the beginning point and end point, respectively. A homotopy of paths $\omega$, $\eta$ with the same beginning point and the same end point is a continuous map $h : \langle 0,1\rangle \times \langle 0,1\rangle \to X$ such that $h(s,0) = \omega(s)$, $h(s,1) = \eta(s)$, $h(0,t) = \omega(0)$, $h(1,t) = \omega(1)$ for all $s, t \in \langle 0,1\rangle$. We write $h : \omega \simeq \eta$. Our main result is the following
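3.3.4 Theorem. Let $\pi : T \to \Sigma$ be a covering, and let $\omega : \langle 0,1\rangle \to \Sigma$ be a path. Let a point $x \in T$ be such that $\pi(x) = \omega(0)$. Then there exists a unique path $\tilde\omega$ in $T$ such that $\tilde\omega(0) = x$ and $\pi\tilde\omega(t) = \omega(t)$ for all $t \in \langle 0,1\rangle$. Furthermore, if $\omega \simeq \eta$, then $\tilde\omega \simeq \tilde\eta$ (in particular, $\tilde\omega$ and $\tilde\eta$ have the same endpoints). One refers to the path $\tilde\omega$ as a lifting of the path $\omega$.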
Proof. Let $A_t$ be an open interval containing the point $t \in \langle 0,1\rangle$ such that $\omega[A_t \cap \langle 0,1\rangle]$ is contained in a fundamental neighborhood. By Theorem 5.5 of Chapter 2, $\langle 0,1\rangle$ is covered by finitely many of the open intervals $A_t$. Denoting their end points by $0 = t_0 < t_1 < \dots < t_k = 1$, each of the images $\omega[\langle t_i, t_{i+1}\rangle]$ is contained in a fundamental neighborhood. We can prove by induction on $i$ that a lift $\tilde\omega_i$ of $\omega|_{\langle 0, t_i\rangle}$ with beginning point $x$ exists and is unique: in fact, assuming that for a given $i$, $\tilde\omega_i$ exists and is unique, let $V$ be a fundamental neighborhood containing $\omega[\langle t_i, t_{i+1}\rangle]$, and let $V_j$ be the open subset given by the definition of a covering which is mapped homeomorphically to $V$ by the restriction $\pi_i$ of the projection, and has the property that $\tilde\omega_i(t_i) \in V_j$. Then for $t \in \langle t_i, t_{i+1}\rangle$, define
$$\tilde\omega_{i+1}(t) = \pi_i^{-1}\omega(t).$$
Clearly, this extends $\tilde\omega_i$ to the required $\tilde\omega_{i+1}$, and further this extension is uniquely determined, since $\pi_i$ is a homeomorphism. Now we can put $\tilde\omega = \tilde\omega_k$, and we have both existence and uniqueness.
Regarding the homotopy, let $h : \omega \simeq \eta$. We shall construct a lift of this homotopy to $T$. Note that we already know the lift exists and is uniquely determined by applying the path lifting theorem separately to the path $h(?, a)$ with each fixed $a$. However, we must prove that this lift $\tilde h : \langle 0,1\rangle \times \langle 0,1\rangle \to T$ is continuous. To this end, we must repeat, to some extent, our above argument for paths: The set $\langle 0,1\rangle \times \langle 0,1\rangle$ is compact, and hence is covered by finitely many rectangles $\langle s, s'\rangle \times \langle t, t'\rangle$ the closures of whose images lie in fundamental neighborhoods. Taking the finite sets of all such $s, s'$ and $t, t'$, we obtain partitions $0 = s_0 < s_1 < \dots < s_\ell = 1$, $0 = t_0 < t_1 < \dots < t_m = 1$ where the $h$-image of each rectangle $\langle s_i, s_{i+1}\rangle \times \langle t_j, t_{j+1}\rangle$ is in a fundamental neighborhood $U_{i,j}$. For each $j$, we then prove by induction on $i$ that $\tilde h|_{\langle 0, s_i\rangle \times \langle t_j, t_{j+1}\rangle}$ is continuous; indeed, suppose the statement is true for a given $i$ (and a fixed $j$). Then by the connectedness of intervals and the induction hypothesis, $\tilde h[\{s_i\} \times \langle t_j, t_{j+1}\rangle]$ is contained in one of the disjoint open sets which, by $\pi$, map homeomorphically onto $U_{i,j}$. Inverting the homeomorphism, we obtain the statement for $i + 1$.
To see that $\tilde h(1, t)$ is constant in $t$, note that $\pi^{-1}[\{\omega(1)\}]$ is discrete, and a continuous function from a connected space to a discrete space is constant. □
Remark: It is useful to note that the proof of this theorem was purely topological
and did not make any use of the holomorphic structure.
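The step-by-step construction in the proof can be imitated numerically. The sketch below uses the exponential map $\exp : \mathbb{C} \to \mathbb{C} \setminus \{0\}$, which is a standard example of a covering but is an assumption here, since it is not discussed in this section. At each step the lift is continued by choosing the point of the fiber closest to the previous lifted point, which corresponds to staying in the same sheet over a fundamental neighborhood; the step count is assumed fine enough for this to be valid. The function name `lift_through_exp` is hypothetical.

```python
import cmath
import math

def lift_through_exp(omega, start, steps=1000):
    """Lift a path omega in C minus {0} through exp, beginning at `start`."""
    if abs(cmath.exp(start) - omega(0.0)) > 1e-12:
        raise ValueError("start must lie over omega(0)")
    lifted = [start]
    for j in range(1, steps + 1):
        previous = lifted[-1]
        principal = cmath.log(omega(j / steps))      # one point of the fiber over omega(t_j)
        sheets = round((previous.imag - principal.imag) / (2 * math.pi))
        lifted.append(principal + 2j * math.pi * sheets)   # fiber point nearest to previous
    return lifted

around_origin = lambda t: cmath.exp(2j * math.pi * t)      # loop winding once around 0
away_from_origin = lambda t: 2 + cmath.exp(2j * math.pi * t)  # loop not enclosing 0

end_winding = lift_through_exp(around_origin, 0j)[-1]
end_trivial = lift_through_exp(away_from_origin, cmath.log(3))[-1]
print(end_winding, end_trivial)
```

The first loop lifts to a path that does not close up, so by the theorem it cannot be homotopic to a constant path in $\mathbb{C} \setminus \{0\}$; the second loop lifts to a closed path.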
3.4 Complex and holomorphic differential forms
3.4.1 Integration on Riemann surfaces

Let us begin with a brief discussion of complex line integrals on Riemann surfaces. A Riemann surface $\Sigma$ is certainly a smooth manifold, and by the material of Chapter 12, for a differential 1-form $\omega$ on $\Sigma$ and a continuously differentiable map $L : \langle a, b\rangle \to \Sigma$, we may integrate
$$\int_L \omega = \int_a^b L^*(\omega). \tag{*}$$
This definition extends, as before, in an obvious way to piecewise continuously differentiable curves $L$, and is independent of parametrization in the sense of Chapter 8. Therefore, the key point is specifying the differential 1-form $\omega$.
What one means by complex integral is that using complex multiplication, we can introduce 1-forms with complex coefficients (also called complex-valued differential forms). We obtain those by applying $? \otimes_{\mathbb{R}} \mathbb{C}$ to the spaces $TM_x$, $\Lambda^k(TM_x)$. (When tensoring over $\mathbb{R}$ with $\mathbb{C}$, we consider $\mathbb{C}$ as an $\mathbb{R}$-vector space. However, note that $? \otimes_{\mathbb{R}} \mathbb{C}$ covariantly turns $\mathbb{R}$-vector spaces into $\mathbb{C}$-vector spaces by using the multiplication in $\mathbb{C}$.) Thus, a smooth complex-valued $k$-form assigns to each $x \in M$ an element of
$$\Lambda^k(TM_x) \otimes_{\mathbb{R}} \mathbb{C}$$
which becomes smooth upon identification of $TM_x$ with $\mathbb{C} \cong \mathbb{R}^2$ when $x \in U$ and $U$ is a coordinate neighborhood. Identifying $\mathbb{C}$ with $\mathbb{R}^2$, a complex-valued $k$-form on a Riemann surface is then precisely the same thing as a pair of real $k$-forms: its real and imaginary part. This construction could, in fact, be done for any (real) smooth manifold.
We remarked in Subsection 4.4 of Chapter 8 that the complex line integral over a piecewise continuously differentiable curve in an open subset $U \subseteq \mathbb{C}$ we used so extensively in Chapter 10 (and the present chapter) can be expressed in terms of the line integral of the second kind. In the more modern context of complex-valued differential forms, this is expressed by the simple but somewhat profound formula
$$dz = dx + i\,dy. \tag{**}$$
In (**), we identify $\mathbb{C}$ with $\mathbb{R}^2$ by
$$z = x + iy.$$
The right-hand side of (**) is then a differential 1-form with complex coefficients, so we may integrate it over piecewise continuously differentiable curves $L$ in $\mathbb{C}$. When integrating the left-hand side of (**) over $L$, we mean, on the other hand, the corresponding complex line integral. This is, then, the same thing as treating $dz$ as a complex-valued 1-form. Using complex multiplication, we then have additional complex 1-forms $\omega = f(z)\,dz$ for a complex continuously real-differentiable function $f(z)$. A line integral of the complex-valued 1-form $\omega$ is then the same thing as the complex line integral as treated earlier, thus explaining in this way a complex line integral as an integral of a complex-valued 1-form.
3.4.2 Holomorphic 1-Forms on a Riemann surface

On a general Riemann surface $\Sigma$, we no longer have a preferred form $dz$, but we do have one on a coordinate neighborhood with a holomorphic coordinate system $z$. Using the complex chain rule, we see that if $z = z(t)$ is a holomorphic function of another holomorphic coordinate $t$, then
$$dz = z'(t)\,dt$$
where $z'$ denotes the complex derivative (note that we only need to make sense of this on an open subset of $\mathbb{C}$). This means that 1-forms on a coordinate system, which can be given as
$$f(z)\,dz,$$
where $f$ is a holomorphic function, transform to 1-forms of the same kind upon holomorphic change of coordinates. Such complex-valued 1-forms are called holomorphic 1-forms on the Riemann surface $\Sigma$.
Now in analogy with Subsection 2.4 of Chapter 10, for a holomorphic 1-form $\omega$ on a Riemann surface $\Sigma$, a primitive function (if one exists) is a function $F : \Sigma \to \mathbb{C}$ such that
$$dF = \omega. \tag{*}$$
To see that this is the right generalization, note that on an open set $U \subseteq \mathbb{C}$, indeed, $dF = f(z)\,dz$ is equivalent to $F'(z) = f(z)$, see Exercise (12). Note that therefore by what we proved in Chapter 10, it immediately follows that a primitive function to a holomorphic 1-form (if one exists) is necessarily holomorphic.
Even if a primitive function does not exist, note that the construction 3.3.2 immediately generalizes to give, for any holomorphic 1-form $\omega$ on a Riemann surface $\Sigma$, a covering
$$\pi : \Sigma_\omega \to \Sigma.$$
Again, $\Sigma_\omega = \Sigma \times \mathbb{C}$ as a set, and the topology has basis consisting of sets 3.3.2 (*), where $V \subseteq \Sigma$ is open, and $F$ is a primitive function of $\omega$ on $V$.
3.5 The basis $dz$, $d\bar z$
We are not, however, always interested just in holomorphic 1-forms. It is therefore natural to also introduce the “complex conjugate 1-form”
$$d\bar z = dx - i\,dy$$
on a coordinate neighborhood $U$ of a Riemann surface, where $z$ is the coordinate. Then $dz$, $d\bar z$ at each point of a coordinate neighborhood $U$ clearly form a basis of the (complex) dual of the complexified tangent space $T\Sigma_x \otimes_{\mathbb{R}} \mathbb{C}$. Under a holomorphic change of coordinates $z = z(t)$, the 1-form $d\bar z$ transforms by
$$d\bar z = \overline{z'(t)}\,d\bar t.$$
It follows that the $\mathbb{C}$-vector spaces of forms on $U$
$$\{\phi(z)\,dz \mid \phi \text{ smooth } \mathbb{C}\text{-valued}\}, \qquad \{\phi(z)\,d\bar z \mid \phi \text{ smooth } \mathbb{C}\text{-valued}\}$$
are preserved by holomorphic change of coordinates. Such forms are called 1-forms of type $(1,0)$, resp. of type $(0,1)$. In fact, note that if we define, for $\omega = f(z)\,dz + g(z)\,d\bar z$ with $f, g$ smooth on $U$,
$$\overline{\omega} = \overline{f(z)}\,d\bar z + \overline{g(z)}\,dz,$$
then this “complex conjugation” operator is invariant under holomorphic coordinate change, and switches the spaces of $(1,0)$-forms and $(0,1)$-forms.
In view of this, it is helpful also to write the basis of complex vector fields dual to $dz$, $d\bar z$ on $U$:
$$\frac{\partial}{\partial z} = \frac{1}{2}\left( \frac{\partial}{\partial x} - i \frac{\partial}{\partial y} \right),$$
$$\frac{\partial}{\partial \bar z} = \frac{1}{2}\left( \frac{\partial}{\partial x} + i \frac{\partial}{\partial y} \right).$$
Note that in this notation, the Cauchy-Riemann equations for a function $f$ can be expressed simply by
$$\frac{\partial f}{\partial \bar z} = 0. \tag{3.5.1}$$
In other words, a continuously differentiable function $f : \Sigma \to \mathbb{C}$ is holomorphic if and only if it satisfies (3.5.1) in holomorphic coordinates. Let us now examine how the new basis behaves with respect to the exterior differential and the exterior product. Regarding exterior product, note that
$$dz \wedge d\bar z = -2i\,dx \wedge dy. \tag{3.5.2}$$
Regarding the exterior differential, one has, of course, for a complex continuously (real)-differentiable function $f$,
$$df = \frac{\partial f}{\partial z}\,dz + \frac{\partial f}{\partial \bar z}\,d\bar z.$$
Recalling the Cauchy-Riemann condition in the form (3.5.1), it then becomes natural to write for a form $\omega_0 \in \{1, dz, d\bar z, dz \wedge d\bar z\}$, and a complex continuously differentiable function $f$ on $U$,
$$\partial(f\omega_0) = \frac{\partial f}{\partial z}\,dz \wedge \omega_0,$$
$$\bar\partial(f\omega_0) = \frac{\partial f}{\partial \bar z}\,d\bar z \wedge \omega_0$$
(the point, of course, being that $d\omega_0 = 0$). Of course, we have
$$d = \partial + \bar\partial.$$
One readily verifies that $\partial$ and $\bar\partial$ are invariant under a change of holomorphic coordinate (see Exercise (13)). Because of that, $\partial$ and $\bar\partial$ are well-defined on any Riemann surface $\Sigma$.
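The operators $\partial/\partial z$ and $\partial/\partial\bar z$ on an open subset of $\mathbb{C}$ can be approximated by finite differences, which gives a quick way to test the Cauchy-Riemann condition (3.5.1) on examples. The sketch below is only an illustration: the function names `d_dz` and `d_dzbar`, the step size and the test functions are choices of this example.

```python
def partials(f, z, h=1e-5):
    """Central-difference approximations of the real partials df/dx and df/dy."""
    fx = (f(z + h) - f(z - h)) / (2 * h)
    fy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)
    return fx, fy

def d_dz(f, z, h=1e-5):
    """Approximate (1/2)(d/dx - i d/dy) applied to f at z."""
    fx, fy = partials(f, z, h)
    return 0.5 * (fx - 1j * fy)

def d_dzbar(f, z, h=1e-5):
    """Approximate (1/2)(d/dx + i d/dy) applied to f at z."""
    fx, fy = partials(f, z, h)
    return 0.5 * (fx + 1j * fy)

z0 = 0.3 + 0.7j
square = lambda z: z ** 2                  # holomorphic
conjugate = lambda z: z.conjugate()        # not holomorphic
modulus_squared = lambda z: z * z.conjugate()

print(d_dz(square, z0), d_dzbar(square, z0))            # close to 2*z0 and 0
print(d_dz(conjugate, z0), d_dzbar(conjugate, z0))      # close to 0 and 1
print(d_dzbar(modulus_squared, z0))                     # close to z0
```

Only the first function satisfies (3.5.1); for the other two, the $d\bar z$ component of $df$ does not vanish.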
Note that on a compact Riemann surface, there may exist non-trivial holomorphic 1-forms. For example, the form $dz$ obviously determines a well-defined holomorphic 1-form on any elliptic curve as defined in Subsection 3.2. Compare this with Exercise (9) which asserts that every holomorphic function on a compact Riemann surface is constant. In fact, note that if $\Sigma$ is a compact Riemann surface, then the space $\Omega^1_{Hol}(\Sigma)$ of holomorphic 1-forms embeds canonically into the de Rham cohomology with complex coefficients
$$H^1_{DR}(\Sigma, \mathbb{C}) = H^1_{DR}(\Sigma) \otimes_{\mathbb{R}} \mathbb{C}.$$
This is because if a holomorphic 1-form $\omega$ satisfies $\omega = df$, then $f$ is necessarily holomorphic and hence constant, and hence $\omega = 0$. In fact, one can prove that for $\Sigma$ compact, there is a canonical isomorphism
$$H^1_{DR}(\Sigma, \mathbb{C}) \cong \Omega^1_{Hol}(\Sigma) \oplus \overline{\Omega^1_{Hol}(\Sigma)}.$$
Let us remark that the 1-form $dz$, of course, pulls back to any open subset $U \subseteq \mathbb{C}$, and hence also to any covering $\pi : V \to U$. We shall simplify notation by denoting $\pi^* dz = d(z \circ \pi)$ also by $dz$, thus defining “complex integration” of functions on any Riemann surface $V$ equipped with a covering $\pi : V \to U$ where $U \subseteq \mathbb{C}$ is an open subset. Since every point $z \in V$ has an open neighborhood which is mapped by $\pi$ holomorphically bijectively onto an open subset of $U$, a complex derivative of holomorphic functions $f : V \to \mathbb{C}$ is then also defined, as is the concept of a primitive function of $f$ on open subsets of $V$.
3.6 Complex line integrals revisited
In Chapter 8, we investigated extensively the implications of reparametrizing a piecewise continuously differentiable parametrized curve $L$. Note that in particular, we can make the domain of the parametrization the interval $\langle 0,1\rangle$, which lets us consider the parametrized curve $L$ as a path in the sense of Subsection 3.3. Note also that reparametrizations result in homotopic paths.
3.6.1 Theorem. Let $\Sigma$ be a Riemann surface, and let $\omega$ be a holomorphic 1-form on $\Sigma$. Let $L$, $M$ be piecewise continuously differentiable parametrized curves in $\Sigma$ which are homotopic as paths (in particular, they have the same beginning points and the same end points). Then
$$\int_L \omega = \int_M \omega.$$
Proof. Consider the covering $\Sigma_\omega$ of $\Sigma$ corresponding to the local primitive functions of $\omega$ (see 3.3.2 and 3.4). Let $z_0$ be the beginning point of the parametrized curves $L$, $M$. Let $\tilde L$ (resp. $\tilde M$) be a lift of the path $L$ (resp. $M$) to $\Sigma_\omega$ with beginning point $(z_0, 0)$. Let $(z_1, K_1)$, $(z_2, K_2)$ be the end points of $\tilde L$, $\tilde M$. We claim that
$$\int_L \omega = K_1, \qquad \int_M \omega = K_2.$$
In effect, find again $0 = t_0 < t_1 < \dots < t_k = 1$ such that $L[\langle t_i, t_{i+1}\rangle]$, $M[\langle t_i, t_{i+1}\rangle]$ for each chosen $i$ are contained in a fundamental neighborhood of the covering, and use the properties of primitive functions.
But then since $L$, $M$ are homotopic, $(z_1, K_1) = (z_2, K_2)$ by Theorem 3.3.4, which proves our statement. □
3.6.2 Corollary. Let $U \subseteq \mathbb{C}$ be an open set, let $f : U \to \mathbb{C}$ be a holomorphic function and let $L$, $M$ be piecewise continuously differentiable parametrized curves which are homotopic as paths. Then
$$\int_L f(z)\,dz = \int_M f(z)\,dz. \qquad \square$$
Note that it would be quite difficult to prove this directly using the techniques of
Chapter 10, in particular since there is no theory of line integrals of the second kind
over continuous paths: we have really used the force of Theorem 3.3.4 here.
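The homotopy invariance of the integral is easy to observe numerically. The following sketch takes $U = \mathbb{C} \setminus \{0\}$ and $f(z) = 1/z$ (choices of this example, not of the text) and compares three paths from $1$ to $-1$: two that stay in the closed upper half-plane, and are therefore homotopic in $U$ by a straight-line homotopy, and one through the lower half-plane. The integration routine `line_integral` is a hypothetical helper based on the midpoint rule.

```python
import cmath
import math

def line_integral(f, path, steps=20000):
    """Midpoint-rule approximation of the complex line integral of f(z) dz."""
    total = 0j
    for j in range(steps):
        t0, t1 = j / steps, (j + 1) / steps
        total += f(path((t0 + t1) / 2)) * (path(t1) - path(t0))
    return total

def upper_arc(t):
    return cmath.exp(1j * math.pi * t)          # from 1 to -1 through i

def upper_polygon(t):
    if t <= 0.5:
        return 1 + 2 * t * (2j - 1)             # segment from 1 to 2i
    return 2j + (2 * t - 1) * (-1 - 2j)         # segment from 2i to -1

def lower_arc(t):
    return cmath.exp(-1j * math.pi * t)         # from 1 to -1 through -i

f = lambda z: 1 / z
integral_arc = line_integral(f, upper_arc)
integral_polygon = line_integral(f, upper_polygon)
integral_lower = line_integral(f, lower_arc)
print(integral_arc, integral_polygon, integral_lower)
```

The first two values agree (both approximate $i\pi$), while the third approximates $-i\pi$: the lower arc is not homotopic in $U$ to the other two paths.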
However, for open subsets of $\mathbb{C}$, we can go even further. Recall the definition of a simply connected open set from Subsection 1.2.
3.6.3 Theorem. For a connected open set $U \subsetneq \mathbb{C}$, the following are equivalent:
(1) $U$ is simply connected (i.e. every holomorphic function on $U$ has a primitive function).
(2) $U$ is holomorphically isomorphic to $\Omega(0,1)$.
(3) Let $a, b \in U$. Then any two paths $\omega$, $\eta$ with beginning point $a$ and end point $b$ are homotopic.
Proof. (1) implies (2) by the Riemann Mapping Theorem 1.2. (2) implies (3) because $\Omega(0,1)$ is a convex set: We may define the homotopy simply by $h(s,t) = (1-t)\omega(s) + t\eta(s)$. To see that (3) implies (1), suppose that $U$ is a connected open subset of $\mathbb{C}$ satisfying (3). Let $f$ be a holomorphic function on $U$. Let $\Sigma$ be the covering 3.3.2 corresponding to the primitive function of $f$, and let $\Sigma_0$ be a connected component of $\Sigma$. By definition, the restriction of the projection $\pi_0 : \Sigma_0 \to U$ is a covering. We claim, in fact, that it is a holomorphic isomorphism. By Theorem 3.3.4, and the fact that $U$ is path-connected, $\pi_0$ is onto. Thus, if it is not a holomorphic isomorphism, it cannot be injective, i.e. there must be two distinct points $x, y \in \Sigma_0$ with $\pi_0(x) = \pi_0(y)$. But $\Sigma_0$ is connected, and since it is a manifold, also path-connected, so there is a path $\omega$ in $\Sigma_0$ with beginning point $x$ and end point $y$. Then the projection $\pi_0 \circ \omega$ in $U$ has the same beginning point and end point $\pi_0(x) = \pi_0(y)$, but cannot be homotopic to the constant path by Theorem 3.3.4, since its lift $\omega$ has a different beginning point and end point.
The contradiction proves that $\pi_0$ is a holomorphic isomorphism; the second coordinate of $\pi_0^{-1}(z)$ is then a primitive function of $f$ on $U$. □
4 The universal covering and multi-valued functions
Theorem 3.6.3 suggests the following definition: A Riemann surface $\Sigma$ is called simply connected if it is connected, and if any two paths $\omega$, $\eta$ which have the same beginning point and the same end point are homotopic. The Riemann Mapping Theorem actually has a generalization called the Uniformization Theorem stating that every simply connected Riemann surface is holomorphically isomorphic to $\Omega(0,1)$, $\mathbb{C}$ or $\mathbb{CP}^1$, but we shall not prove this here (see, however, Exercise (17)).
4.1 Theorem. Every connected Riemann surface $\Sigma$ has a covering $\pi : \tilde\Sigma \to \Sigma$ where $\tilde\Sigma$ is simply connected. (This covering is called the universal covering of $\Sigma$.)
Proof. Select a point $x_0 \in \Sigma$. Define $\tilde\Sigma$ as the set of homotopy classes (i.e. equivalence classes with respect to the relation of homotopy) of paths $\omega$ with beginning point $x_0$. The homotopy class of a path $\omega$ will be denoted by $[\omega]$. We have an obvious map $\pi : \tilde\Sigma \to \Sigma$, sending a class $[\omega]$ to the end point of $\omega$ (by our definition of homotopy, this does not depend on the choice of a representative). Therefore, it remains to define a structure of a Riemann surface on $\tilde\Sigma$ and to prove that it is simply connected and that $\pi$ is a covering.
It is helpful here to introduce the operation of concatenation of paths, which is a generalization of the operation $+$ on parametrized continuously differentiable curves we considered in Chapter 8: If $\omega$, $\eta$ are paths in $\Sigma$ where the end point of $\omega$ is the beginning point of $\eta$, define the path $\omega * \eta$ by
$$(\omega * \eta)(t) = \begin{cases} \omega(2t) & \text{for } 0 \le t \le 1/2 \\ \eta(2t - 1) & \text{for } 1/2 \le t \le 1. \end{cases}$$
Note that (just as for piecewise continuously differentiable curves) concatenation is associative up to homotopy. Also similarly as for curves, the operation $-L$ of Chapter 8 has a generalization to paths: the inverse path of $\omega$ is defined by
$$\overline{\omega}(t) = \omega(1 - t).$$
One readily proves that $\omega * \overline{\omega}$ is homotopic to a constant path, as is $\overline{\omega} * \omega$.
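Concatenation and inversion of paths translate directly into code. In the following minimal sketch a path is modeled as a Python function on the interval from 0 to 1 with complex values; this modeling, and the names `concatenate` and `inverse`, are assumptions of the example.

```python
def concatenate(omega, eta):
    """Concatenation: omega on [0, 1/2] at double speed, then eta on [1/2, 1]."""
    def path(t):
        return omega(2 * t) if t <= 0.5 else eta(2 * t - 1)
    return path

def inverse(omega):
    """Inverse path: omega traversed backwards."""
    return lambda t: omega(1 - t)

omega = lambda t: t * (1 + 1j)              # straight segment from 0 to 1+i
eta = lambda t: (1 + 1j) + t * (2 - 1j)     # straight segment from 1+i to 3
both = concatenate(omega, eta)
there_and_back = concatenate(omega, inverse(omega))

print(both(0), both(0.5), both(1))                    # 0, then 1+i, then 3
print(there_and_back(0), there_and_back(0.5), there_and_back(1))
```

The path `there_and_back` begins and ends at the same point, as it must for it to be homotopic to a constant path.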

To proceed further, let $[\omega]\in\tilde\Sigma$, $\pi([\omega])=x$. Let $(U_x,h_x)$ be a coordinate system of $\Sigma$ at $x$, and let $V\subseteq h_x[U_x]$ be a convex open subset containing $h_x(x)$. Then let $U_{[\omega],V}\subseteq\tilde\Sigma$ be the set consisting of all classes $[\omega*((h_x)^{-1}\circ L)]$ where $L$ is a linearly parametrized line segment in $V$ with beginning point $h_x(x)$. Note that this class does not depend on the choice of the representative $\omega$ of the class $[\omega]$. Note that by definition, $\pi$ maps $U_{[\omega],V}$ bijectively onto $h_x^{-1}[V]$. (Note: our notation implies that a fixed coordinate system is specified at each $x\in\Sigma$; otherwise, the notation $U_{[\omega],V}$ must be modified to reflect the coordinate system.)
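4.1.1 Lemma. If $\pi([\omega])=\pi([\eta])$ (i.e. $\omega$ and $\eta$ have the same end point $x$) and $[\omega]\ne[\eta]$ (i.e. $\omega$ and $\eta$ are not homotopic), then for any convex open subset $V\subseteq h_x[U_x]$,
$$U_{[\omega],V}\cap U_{[\eta],V}=\emptyset.$$
Proof. If $\omega*((h_x)^{-1}\circ L)\simeq\eta*((h_x)^{-1}\circ L)$, then
$$\omega*((h_x)^{-1}\circ L)*\overline{((h_x)^{-1}\circ L)}\simeq\eta*((h_x)^{-1}\circ L)*\overline{((h_x)^{-1}\circ L)}$$
(note: this uses associativity of $*$ up to homotopy), which in turn implies
$$\omega\simeq\eta$$
(which uses the inverse property). □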
We still need to make yet another observation.
4.1.2 Lemma. Let $\omega$, $V$ be as above and let $y\in h_x^{-1}[V]$. Then there exists a path $\eta$ and an $\varepsilon>0$ such that for a convex open $W\subseteq\Omega(h_y(y),\varepsilon)$, $U_{[\eta],W}\subseteq U_{[\omega],V}$.
Proof. It suffices to choose $\varepsilon>0$ such that $h_y^{-1}[\Omega(h_y(y),\varepsilon)]\subseteq h_x^{-1}[V]$. We set
$$\eta=\omega*((h_x)^{-1}\circ L)$$
where $L$ is a linearly parametrized line segment with beginning point $h_x(x)$ and end point $h_x(y)$. To prove that $U_{[\eta],W}\subseteq U_{[\omega],V}$, let $M$ be a linearly parametrized line segment in $W$ with beginning point $h_y(y)$ and end point $h_y(z)$. We need to prove that
$$[\eta*(h_y^{-1}\circ M)]\in U_{[\omega],V}.\tag{*}$$
To this end, note that by associativity of $*$,
$$\eta*(h_y^{-1}\circ M)\simeq\omega*(h_x^{-1}\circ L)*(h_y^{-1}\circ M).$$
Now we have
$$(h_x^{-1}\circ L)*(h_y^{-1}\circ M)=h_x^{-1}\circ(L*(h_x\circ h_y^{-1}\circ M)).$$
The path $L*(h_x\circ h_y^{-1}\circ M)$ in $V$ is in general not a linearly parametrized line segment, but it is homotopic to one since $V$ is a convex set. This proves (*). □
Now by Lemma 4.1.2, we can give $\tilde\Sigma$ a topology where a subset $U$ is a neighborhood of an $[\omega]\in U$ if and only if it contains a subset of the form $U_{[\omega],V}$ (we need the lemma to conclude that this definition is correct in the sense that a set we call a neighborhood indeed contains an open subset). Lemma 4.1.1 then implies that for $U=h_x^{-1}[V]$ as above, $\pi^{-1}[U]$ is a disjoint union of open subsets $U_{[\omega],V}$ over all the $[\omega]$ with $\pi([\omega])=x$. We can then define an atlas of $\tilde\Sigma$ as the open sets $U_{[\omega],V}$ together with the coordinate maps $h_x\circ\pi$ where $\pi([\omega])=x$. It then follows that $\tilde\Sigma$ is a path-connected Riemann surface and $\pi$ is a covering, except for one detail: we must prove that $\tilde\Sigma$ is separable.
To this end, let $U_i$ be a countable basis of $\Sigma$ such that each $U_i$ is connected and contained in a convex subset of a coordinate neighborhood. Then each $U_i$ is a fundamental neighborhood. We will prove that for each $x\in\Sigma$, the set $\pi^{-1}[\{x\}]$ is countable; then the connected components of $\pi^{-1}[U_i]$ form a countable basis of $\tilde\Sigma$. But note that by compactness as above, for every path $\omega$ with beginning point $x_0$ and end point $x$, there exist $0=t_0<t_1<\dots<t_m=1$ and a finite sequence $i_1,\dots,i_m$ such that $\omega[\langle t_{j-1},t_j\rangle]\subseteq U_{i_j}$ for $j=1,\dots,m$. In particular, $U_{i_j}\cap U_{i_{j+1}}\ne\emptyset$. One then proves by induction that there exist unique connected components $\tilde U_{i_j}$ of $\pi^{-1}[U_{i_j}]$ such that $\tilde U_{i_j}\cap\tilde U_{i_{j+1}}\ne\emptyset$. Since clearly $[\omega]\in\tilde U_{i_m}$, and since there are only countably many such finite sequences $i_1,\dots,i_m$, there are only countably many $[\omega]$ with $\pi([\omega])=x$, as claimed.
Finally, we shall prove that $\tilde\Sigma$ is simply connected. But this is easy. First of all, $\tilde\Sigma$ is path-connected by construction (since the lift of a path $\omega$ with beginning point $x_0$ has, by definition, end point $[\omega]$). Next, suppose that $\alpha$, $\beta$ are two paths in $\tilde\Sigma$ with the same beginning point $[\omega]$ and the same end point $[\eta]$. But this means $[\omega*(\pi\circ\alpha)]=[\eta]=[\omega*(\pi\circ\beta)]$, i.e. $\omega*(\pi\circ\alpha)\simeq\omega*(\pi\circ\beta)$, which implies $\pi\circ\alpha\simeq\pi\circ\beta$ (by concatenating with $\bar\omega$), which implies $\alpha\simeq\beta$ by Theorem 3.3.4. □
4.2 Base points, universality and multi-valued functions
Let $\Sigma$ be a connected Riemann surface and let $x_0\in\Sigma$. We refer to such a chosen point as a base point of $\Sigma$. Note that we already used the base point in the construction of the universal covering $\tilde\Sigma$, and that in fact that construction comes with a preferred base point $\tilde x_0$, represented by the constant path at $x_0$. We have $\pi(\tilde x_0)=x_0$. We refer to a covering $\pi:T\to\Sigma$ with a choice of base points $\pi(\tilde x_0)=x_0$ as a based covering.
The term ‘universal covering’ (which should really pedantically be called
“universal based covering”) is justified by the following fact:
4.2.1 Theorem. Let $\Sigma$ be a connected Riemann surface. Consider the based universal covering $\pi:\tilde\Sigma\to\Sigma$, with base points $\tilde x_0\mapsto x_0$, and let $\rho:T\to\Sigma$ be any based covering, with base points $y_0\mapsto x_0$. Then there exists a unique based covering $\phi:\tilde\Sigma\to T$ such that $\phi(\tilde x_0)=y_0$. In fact, we have $\rho\circ\phi=\pi$.
Proof. Let $x\in\tilde\Sigma$. Let $\omega$ be a path in $\tilde\Sigma$ with beginning point $\tilde x_0$ and end point $x$. By Theorem 3.3.4, there is a unique lift $\eta$ of the path $\pi\circ\omega$ to $T$ with beginning point $y_0$. Let $\phi(x)$ be the end point of $\eta$. (Note in fact that this definition is forced by the path lifting property, which already implies uniqueness.) On the other hand, also note that our definition of $\phi(x)$ did not depend on the choice of the path $\omega$, since any two such paths are homotopic as $\tilde\Sigma$ is simply connected. Because of this, if $U$ is a connected fundamental open neighborhood of a point $z\in\Sigma$ for both the coverings $\pi$ and $\rho$, and if $U_i$ (resp. $U_j$) is the open disjoint summand of $\pi^{-1}[U]$ (resp. $\rho^{-1}[U]$) which contains $x$ (resp. $\phi(x)$), with $\pi_0=\pi|_{U_i}:U_i\to U$ (resp. $\rho_0=\rho|_{U_j}$), then $\phi|_{U_i}$ is given by the formula $\rho_0^{-1}\circ\pi_0$, which shows that $\phi$ is a covering with such fundamental neighborhoods $U_j$. (Note: if $y\in T$ is not in the connected component of the base point, then it won't be in the image of $\phi$, so the fundamental neighborhood of $y$ can be chosen to be the whole connected component.) □
We immediately get the following
4.2.2 Theorem. A connected Riemann surface $\Sigma$ is simply connected if and only if every covering $\rho:T\to\Sigma$ with $T$ connected is a holomorphic isomorphism.
Proof. Suppose $\Sigma$ is simply connected and $\rho:T\to\Sigma$ is a covering with $T$ connected. We already remarked that $\rho$ is onto by Theorem 3.3.4. Suppose $\rho(x_1)=\rho(x_2)$. Let $\omega$ be a path in $T$ with beginning point $x_1$ and end point $x_2$. Then $\rho\circ\omega$ has a beginning point equal to its end point, and hence is homotopic to the constant path since $\Sigma$ is simply connected. Thus, by Theorem 3.3.4, $x_1=x_2$. Thus, $\rho$ is also injective, and thus is a holomorphic isomorphism.
On the other hand, suppose $\Sigma$ is connected but not simply connected. Then the universal covering $\pi:\tilde\Sigma\to\Sigma$ cannot be a holomorphic isomorphism, since $\tilde\Sigma$ is simply connected. □
4.2.3 Corollary. (Uniqueness of universal covering) Let $\Sigma$ be a connected Riemann surface. A based universal covering $\pi:\tilde\Sigma\to\Sigma$ with base points $\tilde x_0\mapsto x_0$ is unique in the sense that for any other based universal covering $\rho:T\to\Sigma$ with base points $z_0\mapsto x_0$, there exists a unique holomorphic isomorphism $\phi:\tilde\Sigma\to T$ such that $\phi(\tilde x_0)=z_0$. In fact, $\rho\circ\phi=\pi$.
Proof. By Theorem 4.2.1, there exists a unique covering $\phi:\tilde\Sigma\to T$ with the specified properties. Since $T$ is simply connected, by Theorem 4.2.2, $\phi$ is a holomorphic isomorphism. □
4.2.4 Multi-valued functions

Let $\Sigma$ be a based connected Riemann surface with base point $x_0$, and let $\pi:\tilde\Sigma\to\Sigma$ be a based universal covering with base points $\tilde x_0\mapsto x_0$. Then we define a multi-valued holomorphic function on $\Sigma$ based at $x_0$ as a holomorphic function on $\tilde\Sigma$. Note that then, in particular, the multi-valued function based at a point $x_0$ does have a well-defined “value” at the point $x_0$.
Multi-valued holomorphic functions based at $x_0$ form an algebra in the sense that they contain (ordinary) holomorphic functions (a holomorphic function $f$ is identified with the multi-valued function $f\circ\pi$), and have well-defined operations of addition and multiplication. Much more is true, of course: for example, if $f$ is a multi-valued holomorphic function based at $x_0$ and $g:\mathbb{C}\to\mathbb{C}$ is an ordinary holomorphic function, then there is a well-defined multi-valued holomorphic function $g\circ f$ based at $x_0$.
Note that by Corollary 4.2.3, the choice of $\tilde\Sigma$ does not matter in the sense that multi-valued holomorphic functions defined via any other based holomorphic universal covering $T$ are related to those defined via $\tilde\Sigma$ by a preferred bijection, namely the one induced by the based holomorphic isomorphism between $\tilde\Sigma$ and $T$, and that this bijection preserves all the operations in sight. It is important to note, however, that unless $\Sigma$ is simply connected, there is no preferred way of identifying the algebras of multi-valued holomorphic functions based at different base points of $\Sigma$.
Examples of multi-valued holomorphic functions on Riemann surfaces can be obtained from holomorphic 1-forms $\omega$: Note that we have a primitive function $F$ of $\omega$ well-defined on any connected component of the covering $\Sigma_\omega$, and hence, by Theorem 4.2.1, on the universal cover. This is referred to as the multi-valued primitive function of $\omega$. Note that a discussion of base points is not so important here, since no matter how we choose base points, two multi-valued primitive functions of the same holomorphic 1-form will differ by a constant. In particular, for connected open sets $U\subseteq\mathbb{C}$, we have a well-defined notion (up to additive constant) of a multi-valued primitive function based at $z_0\in U$ of a given multi-valued function based at $z_0$.
For example, the multi-valued primitive function of
$$f(z)=\frac{1}{z-z_0}$$
on $\mathbb{C}\setminus\{z_0\}$ with value equal to 0 at the base point which is chosen to project to $z_0+1\in\mathbb{C}\setminus\{z_0\}$ is called the multi-valued logarithm $\ln(z-z_0)$. Choosing an arbitrary $\alpha\in\mathbb{C}$, we then obtain the multi-valued function
$$(z-z_0)^\alpha=e^{\alpha\ln(z-z_0)}$$
on $\mathbb{C}\setminus\{z_0\}$, also based at $z_0+1\in\mathbb{C}\setminus\{z_0\}$. Sometimes different conventions of base points are appropriate (see below). In any case, no matter what base point we specify, the multi-valued logarithm is well defined up to adding an integral multiple of $2\pi i$, and $(z-z_0)^\alpha$ is well defined up to a non-zero multiplicative constant of the form $e^{2\pi i k\alpha}$, $k\in\mathbb{Z}$ (which has modulus 1 when $\alpha$ is real).
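The ambiguity by multiples of $2\pi i$ can be observed numerically by continuing the logarithm along loops around $z_0$. The following Python sketch is only an illustration: the functions `continued_log` and `loop` are our own hypothetical helpers, and the continuation is approximated by summing principal logarithms of small quotients along a discretized path.

```python
import cmath
import math

def continued_log(path, z0, steps=1000):
    """Continue ln(z - z0) along path, starting from the principal value at path(0)."""
    prev = path(0.0) - z0
    value = cmath.log(prev)
    for j in range(1, steps + 1):
        cur = path(j / steps) - z0
        # cur / prev is close to 1, so its principal logarithm is the
        # correct small increment of the continued logarithm.
        value += cmath.log(cur / prev)
        prev = cur
    return value

def loop(z0, k):
    """The circle of radius 1 about z0, traversed k times starting from z0 + 1."""
    return lambda t: z0 + cmath.exp(2j * math.pi * k * t)

z0 = 0.5 + 0.25j
once = continued_log(loop(z0, 1), z0)   # close to 2*pi*i
twice = continued_log(loop(z0, 2), z0)  # close to 4*pi*i
root = cmath.exp(0.5 * once)            # (z - z0)^(1/2) after one loop: close to -1
```

Although each loop returns to the point $z_0+1$, the continued value changes; this is exactly the statement that the end points of the lifted loops in the universal covering are different points over $z_0+1$.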
4.2.5 Example
The behavior of multi-valued functions can be quite complicated. Consider the multi-valued function
$$f(z)=z^a(z-1)^b\tag{1}$$
on $U=\mathbb{C}\setminus\{0,1\}$. Assume, for simplicity, $a,b>0$ to be real numbers. (Note that there exist unique multi-valued functions $z^a$, $(z-1)^b$ based at any chosen point $0<z_0<1$ whose values at the base point are positive real numbers.)
Now let $F$ be the multi-valued primitive function of $f$ on $U$. (Let, for example, the value of $F$ at the base point $\tilde z_0$, $\pi(\tilde z_0)=z_0$, be 0.) Now let $K$ be the circle with center 0 and radius $z_0$ (and beginning point $z_0$) oriented counter-clockwise, and let $L$ be the circle with center 1 and radius $1-z_0$ (and beginning point $z_0$) oriented counter-clockwise.
Let $\omega$ be a concatenation of $m$ copies of $K$ and $n$ copies of $L$ (in any fixed order), and let $\tilde z_1$ be the end point of the lift $\tilde\omega$ to the universal covering with beginning point $\tilde z_0$. Then one immediately sees that
$$f(\tilde z_1)=f(\tilde z_0)e^{2\pi(ma+nb)i}.\tag{2}$$
Let us now examine the behavior of the function $F$: First note that the integrals
$$A=\int_0^{z_0}z^a(z-1)^b\,dz,\qquad B=\int_{z_0}^1z^a(z-1)^b\,dz$$
actually exist in the sense of ordinary real analysis, and are equal to (finite) positive real numbers. Additionally, the integrals of (1) over a circle with radius $\varepsilon$ and center 0 or 1 go to 0 with $\varepsilon\to0$. Because of this, if $\tilde K$ is a lift of $K$ to the universal covering with beginning point $\tilde z_1$ as above, we have, denoting the end point by $\tilde z_2$,
$$F(\tilde z_2)-F(\tilde z_1)=e^{2\pi(ma+nb)i}(e^{2\pi ai}-1)A,\tag{3}$$
while if $\tilde L$ is a lift of $L$ to the universal cover with beginning point $\tilde z_1$ as above and end point $\tilde z_3$, we have
$$F(\tilde z_3)-F(\tilde z_1)=e^{2\pi(ma+nb)i}(1-e^{2\pi bi})B.\tag{4}$$
Note that the operations (3), (4) do not commute: if we begin at $\tilde z_1$ and follow first $K$ and then $L$, the value of the primitive function increases by
$$e^{2\pi(ma+nb)i}(e^{2\pi ai}-1)A+e^{2\pi((m+1)a+nb)i}(1-e^{2\pi bi})B,$$
while following $L$ first and then $K$ beginning from the same point $\tilde z_1$ gives an increase of
$$e^{2\pi(ma+(n+1)b)i}(e^{2\pi ai}-1)A+e^{2\pi(ma+nb)i}(1-e^{2\pi bi})B.$$
These two values are in general not equal. Because of this, it is not true, contrary to what one may naively expect, that $F(z)/f(z)$ would be a single-valued function on $U$ (in the sense that it would be a composition of an ordinary holomorphic function on $U$ with $\pi$). Note that we also see that the end points of the lifts of $K*L$ and $L*K$ to the universal covering with the same beginning point are, in fact, different.
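The failure of commutativity can be checked with a few lines of arithmetic. The Python sketch below evaluates the two increments for sample data; the values of `A` and `B` are placeholder positive numbers (not computed integrals), and the function names are ours. A short computation, which is not stated in the text but follows directly from the two displayed expressions, shows that their difference equals $e^{2\pi(ma+nb)i}(e^{2\pi ai}-1)(1-e^{2\pi bi})(A+B)$, and the sketch compares against this closed form.

```python
import cmath
import math

def phase(x: float) -> complex:
    """Return e^{2 pi x i}."""
    return cmath.exp(2j * math.pi * x)

def increment_K_then_L(a, b, m, n, A, B):
    """Increase of F when following first K and then L from the point z1~."""
    return (phase(m * a + n * b) * (phase(a) - 1) * A
            + phase((m + 1) * a + n * b) * (1 - phase(b)) * B)

def increment_L_then_K(a, b, m, n, A, B):
    """Increase of F when following first L and then K from the point z1~."""
    return (phase(m * a + (n + 1) * b) * (phase(a) - 1) * A
            + phase(m * a + n * b) * (1 - phase(b)) * B)

# Sample exponents and winding counts; A, B are illustrative placeholders.
a, b, m, n = 0.5, 1 / 3, 2, 1
A, B = 1.0, 2.0

gap = increment_K_then_L(a, b, m, n, A, B) - increment_L_then_K(a, b, m, n, A, B)
closed_form = phase(m * a + n * b) * (phase(a) - 1) * (1 - phase(b)) * (A + B)
```

Since $A+B>0$, the closed form shows that the two increments agree only when $a$ or $b$ is an integer, which is the case where one of the factors of $f$ is single-valued.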
Up to normalization, the function F belongs to a family of functions
called hypergeometric functions; they are, in some sense, the “simplest” multi-
valued holomorphic functions on a connected open subset of C for which this
phenomenon occurs.
4.3 The fundamental group
Let $\Sigma$ be a connected Riemann surface with base point $x_0$. Denote by $\pi_1(\Sigma,x_0)$ the set of all homotopy classes of paths in $\Sigma$ with beginning point and end point $x_0$. Recall the proof of Theorem 4.1, and specifically the operation $*$ of concatenation of paths. From the arguments given there, it follows that $*$ gives a well-defined binary operation on $\pi_1(\Sigma,x_0)$. Moreover, note that the constant path at $x_0$ is a unit element for the operation $*$. Also observe that if, for a path $\omega$, we define a path $\bar\omega$ given by
$$\bar\omega(t)=\omega(1-t),$$
then $[\bar\omega]$ is the inverse of $[\omega]$ with respect to $*$. Thus, the set $\pi_1(\Sigma,x_0)$ with the operation $*$ is a group in the sense of Appendix B, 3.1. This group is called the fundamental group of $\Sigma$ with base point $x_0$. There are many interesting and deep connections between the fundamental group and coverings, which we cannot explore in this text, in part because we do not develop the theory of groups in any substantial way. After filling in the necessary algebra, say, in [2], the reader can find more information in [6, 13, 20].
There is, however, one connection between the fundamental group and the universal cover which is too beautiful and striking to pass up. Consider a based universal cover
$$\pi:(\tilde\Sigma,\tilde x_0)\to(\Sigma,x_0)$$
of a connected Riemann surface $\Sigma$. A deck transformation is a homeomorphism $f:\tilde\Sigma\to\tilde\Sigma$ such that the following diagram commutes:
[Diagram: a commutative triangle with $f:\tilde\Sigma\to\tilde\Sigma$ along the top and $\pi:\tilde\Sigma\to\Sigma$ down both sides.]
(In other words, such that $\pi\circ f=\pi$.) Note that a deck transformation is automatically a holomorphic isomorphism, and that deck transformations form a group with respect to composition of maps. Denote this group by $\Delta$.
Now define a map
$$\Phi:\Delta\to\pi^{-1}[\{x_0\}]$$
by letting, for a deck transformation $f$,
$$\Phi(f)=f(\tilde x_0).$$
Define, on the other hand, a map
$$\Psi:\pi_1(\Sigma,x_0)\to\pi^{-1}[\{x_0\}]$$
as follows: Let $\omega$ be a path in $\Sigma$ with beginning point and end point $x_0$. Let $\tilde\omega$ be a path in $\tilde\Sigma$ which is the unique lifting of $\omega$ with beginning point $\tilde x_0$ (see Theorem 3.3.4). Then let $\Psi([\omega])$ be the end point of $\tilde\omega$. By Theorem 3.3.4, this does not depend on the choice of the representative $\omega$ of the class $[\omega]\in\pi_1(\Sigma,x_0)$.
The following result can often be used to compute the fundamental group (see Exercise (15)).
Theorem. The maps $\Phi$ and $\Psi$ are bijections. Moreover, the composition $\Phi^{-1}\circ\Psi$ is a homomorphism (hence isomorphism) of groups.
Proof. The fact that $\Phi$ is bijective is a special case of the universality Theorem 4.2.1. To show that $\Psi$ is onto, recall that $\tilde\Sigma$ is connected, and hence path-connected. Let $y\in\pi^{-1}[\{x_0\}]$ and let $\eta$ be a path in $\tilde\Sigma$ from $\tilde x_0$ to $y$. Put $\omega=\pi\circ\eta$. Then $\Psi([\omega])=y$. To prove injectivity, note that the just mentioned $\eta$ is unique up to homotopy since $\tilde\Sigma$ is simply connected, and composing with $\pi$ gives uniqueness of $[\omega]$.
To prove that $\Phi^{-1}\circ\Psi$ is a homomorphism of groups, let $\eta$, $\omega$ be paths in $\Sigma$ with beginning points and end points $x_0$ and let $\tilde\eta$, $\tilde\omega$ be their lifts to $\tilde\Sigma$ with beginning point $\tilde x_0$. Let, on the other hand, $\hat\eta$ be the lift of $\eta$ to $\tilde\Sigma$ whose beginning point is the end point $\tilde x_1$ of $\tilde\omega$. Now let $f$ be a deck transformation which sends $\tilde x_0$ to $\tilde x_1$. Then by uniqueness of path lifting, $f\circ\tilde\eta=\hat\eta$. In particular, if we denote the end point of $\tilde\eta$ by $\hat x_0$ and the end point of $\hat\eta$ by $\hat x_1$, then
$$f(\hat x_0)=\hat x_1.$$
Let $g$ be the deck transformation which sends $\tilde x_0$ to $\hat x_0$. Since $\tilde\omega*\hat\eta$ is the lift of $\omega*\eta$ with beginning point $\tilde x_0$, we see that
$$\Phi(f\circ g)=\hat x_1=\Psi([\omega*\eta]),$$
so
$$f\circ g=\Phi^{-1}\circ\Psi([\omega*\eta]),$$
while
$$f=\Phi^{-1}\circ\Psi([\omega]),\qquad g=\Phi^{-1}\circ\Psi([\eta]),$$
which is what we wanted to prove. □
4.4 Comment
The reader no doubt noticed that the concepts of covering, universal covering, and
fundamental group do not use the structure of a Riemann surface very substantially.
They can, indeed, be defined for more general topological spaces. In order for the
nice theorems we presented to be true, however, some “local assumptions” about
the topological spaces involved must be included. The book [20] contains an easily
accessible discussion of coverings in a more general topological context. One case
which works very well is the case of smooth (or even topological) manifolds.
Definition 3.3.1, Theorem 3.3.4, Theorem 4.1, Theorem 4.2.1, the definition of
fundamental group in 4.3 and Theorem 4.3 remain valid if we replace “Riemann
surface” by “smooth manifold” (resp. “topological manifold”) and “holomorphic
isomorphism” by “diffeomorphism” (resp. “homeomorphism”).
Yet, the case of Riemann surfaces, which we discussed above, is particularly
striking, and in this context, coverings were first discovered by Riemann.
5 Complex analysis beyond holomorphic functions
We are now ready to extend Cauchy’s formula (Theorem 3.3 of Chapter 10) to the case of any continuously (real)-differentiable function. Let us write an integration variable
$$\zeta=s+it$$
(to distinguish from the standard convention $z=x+iy$).
5.1 Theorem. (The Cauchy-Green formula) Let $U$ be a domain in $\mathbb{C}$. Let $L_1,\dots,L_k$ be simple piecewise continuously differentiable closed curves with disjoint images such that $L_1\amalg\dots\amalg L_k$ is the boundary of $U$ oriented counter-clockwise. Let $f$ be defined and have continuous real partial derivatives on an open set $V\subseteq\mathbb{C}$ containing $\overline{U}$. Then for $z\in U$, we have
$$\frac{1}{\pi}\int_U\frac{(\partial f/\partial\bar\zeta)\,ds\,dt}{z-\zeta}+\frac{1}{2\pi i}\sum_{j=1}^k\int_{L_j}\frac{f(\zeta)\,d\zeta}{\zeta-z}=f(z).\tag{5.1.1}$$
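Proof. It is actually almost the same as the proof of Theorem 3.3 of Chapter 10. Using the language of Subsection 3.5, we may rewrite (5.1.1) as
$$-\frac{1}{2\pi i}\int_U d\,\frac{f(\zeta)\,d\zeta}{\zeta-z}+\frac{1}{2\pi i}\sum_{j=1}^k\int_{L_j}\frac{f(\zeta)\,d\zeta}{\zeta-z}=f(z).\tag{*}$$
On the other hand, for $\varepsilon>0$ small, if we denote by $K$ the boundary of $\Omega(z,\varepsilon)$ oriented counter-clockwise, then Green’s Theorem 5.4 of Chapter 8 gives
$$-\frac{1}{2\pi i}\int_{U\setminus\Omega(z,\varepsilon)}d\,\frac{f(\zeta)\,d\zeta}{\zeta-z}+\frac{1}{2\pi i}\sum_{j=1}^k\int_{L_j}\frac{f(\zeta)\,d\zeta}{\zeta-z}=\frac{1}{2\pi i}\int_K\frac{f(\zeta)\,d\zeta}{\zeta-z}.\tag{**}$$
When $\varepsilon\to0$, the right-hand side tends to $f(z)$ by the same argument as in the proof of Theorem 3.3 of Chapter 10. So it remains to prove that
$$\lim_{\varepsilon\to0}\int_{\Omega(z,\varepsilon)}\frac{(\partial f/\partial\bar\zeta)\,ds\,dt}{\zeta-z}=0,$$
which, since $\partial f/\partial\bar\zeta$ is continuous and hence bounded near $z$, follows from
$$\lim_{\varepsilon\to0}\int_{\Omega(z,\varepsilon)}\frac{ds\,dt}{|\zeta-z|}=0,$$
which is an obvious calculation (in polar coordinates, the integral equals $2\pi\varepsilon$). □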
5.2 The “inverse” Cauchy-Riemann operator
The Cauchy-Green formula is the starting point of applying methods of complex analysis to classes of functions which are not necessarily holomorphic, but merely satisfy differentiability conditions in the real sense. We will use these methods in Section 5 of Chapter 15, when we will construct a complex structure on an oriented surface with a Riemann metric.
Recall Hölder’s inequality (Theorem 8.1 of Chapter 5)
$$\Big\|\int_{\mathbb{C}}f(\zeta)g(z-\zeta)\,ds\,dt\Big\|_\infty\le\|f\|_p\|g\|_q\quad\text{for }\frac1p+\frac1q=1.\tag{5.2.1}$$
We will focus here on functions defined on an open disk
$$D=\Omega(0,1).$$
(We could, of course, equivalently work on any other disk.) Where needed, we may extend such functions to $\mathbb{C}$ by 0. Note also that for $z\in D$, using polar coordinates, the function
$$\zeta\mapsto\frac{1}{\zeta-z}$$
is in $L^q(D)$ for every $q<2$. Thus, by (5.2.1), for $p>2$, we have a well-defined operator
$$P:L^p(D)\to L^\infty(\mathbb{C})$$
defined by
$$(P(f))(z)=\frac1\pi\int_D\frac{f(\zeta)\,ds\,dt}{z-\zeta}.$$
We will also need another version of this operator, defined by the formula
$$(P_1(f))(z)=\frac1\pi\int_{\mathbb{C}}f(\zeta)\Big(\frac{1}{z-\zeta}+\frac1\zeta\Big)ds\,dt.$$
Note that the function
$$\zeta\mapsto\frac{1}{\zeta-z}-\frac1\zeta$$
is in $L^q(\mathbb{C}\setminus\Omega(0,2|z|))$ for every $q>1$, and thus $P_1(f)$ is defined for any function $f\in L^p(\mathbb{C})$, $p>2$, and produces a (not necessarily bounded) complex function defined everywhere on $\mathbb{C}$.
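5.2.1 Lemma. Let $f$ be a continuous function on $\mathbb{C}$ with support in $D$ such that
$$|f(z)-f(t)|\le K|z-t|^\alpha\quad\text{for some }\alpha>0\tag{5.2.2}$$
for some constant $K$. Then $P(f)$ is a continuously differentiable function on $\mathbb{C}$. In fact, we have
$$\frac{\partial P(f(z))}{\partial z}=-\frac1\pi\int_D\frac{f(\zeta)-f(z)}{(\zeta-z)^2}\,ds\,dt,\qquad\frac{\partial P(f(z))}{\partial\bar z}=f(z).\tag{5.2.3}$$
If $f$ is a continuous function on $\mathbb{C}$ which is in $L^p(\mathbb{C})$, $p>2$, and satisfies (5.2.2), then $P_1(f)$ is a continuously differentiable function on $\mathbb{C}$ and
$$\frac{\partial P_1(f(z))}{\partial z}=-\frac1\pi\int_{\mathbb{C}}\Big(\frac{f(\zeta)-f(z)}{(\zeta-z)^2}-\frac{f(\zeta)-f(0)}{\zeta^2}\Big)ds\,dt,\qquad\frac{\partial P_1(f(z))}{\partial\bar z}=f(z)-f(0).\tag{5.2.4}$$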
Proof. Let us first prove the statement for $P$. Using polar coordinates, one easily proves the identity
$$\frac1\pi\int_D\frac{ds\,dt}{z-\zeta}=\bar z\quad\text{for }z\in D.\tag{1}$$
Using this formula for $z\in D$ and small $|\Delta z|$, we have
$$(P(f))(z+\Delta z)-(P(f))(z)=\frac1\pi\int_D\frac{f(\zeta)-f(z)}{\zeta-z}\cdot\frac{(-\Delta z)\,ds\,dt}{\zeta-z-\Delta z}+f(z)\cdot\overline{\Delta z}.\tag{2}$$
Dividing by $\Delta z$ and taking limits $\Delta z\to0$ along the lines $y=ix$, $y=-ix$, we obtain the formulas (5.2.3). Note carefully that the function
$$\frac{f(\zeta)-f(z)}{|\zeta-z|^\alpha}$$
(with arbitrary value at $\zeta=z$) is, by assumption, bounded in $\zeta\in D$. After dividing by $\Delta z$, the limit behind the integral sign can be taken by the Lebesgue Dominated Convergence Theorem after we restrict the integral to $D\setminus\Omega(z,2|\Delta z|)$. The remaining integral is bounded by a constant times
$$\int_{\Omega(z,2|\Delta z|)}\frac{ds\,dt}{|\zeta-z|^{1-\alpha}\,|\zeta-z-\Delta z|}.\tag{3}$$
We must show that (3) converges to 0 with $\Delta z\to0$. The integral (3) is certainly finite, and without loss of generality, $z=0$. Now a substitution $\zeta=\eta\,\Delta z$ shows that (3) is proportional to $|\Delta z|^\alpha$, and hence tends to 0 with $\Delta z\to0$, as needed.
Proving the continuity of
$$\frac{\partial P(f(z))}{\partial z}$$
is actually easier: we may use the Lebesgue Dominated Convergence Theorem directly on the entire range of integration after substituting $\eta=\zeta-z$. If $z$ is not in the support of $f$, (2) still remains valid since $f(z)=0$ (formula (1) is not used in that case).
The case of $P_1$ is analogous: Instead of (2), we have
$$(P_1(f))(z+\Delta z)-(P_1(f))(z)=\frac1\pi\int_{\mathbb{C}}\Big(\frac{f(\zeta)-f(z)}{\zeta-z}\cdot\frac{-\Delta z}{\zeta-z-\Delta z}-\frac{f(\zeta)-f(0)}{\zeta}\cdot\frac{-\Delta z}{\zeta-\Delta z}\Big)ds\,dt+(f(z)-f(0))\cdot\overline{\Delta z}.$$
The Lebesgue dominated convergence argument can then be applied on the set
$$\mathbb{C}\setminus\big(\Omega(z,2|\Delta z|)\cup\Omega(0,2|\Delta z|)\big),$$
and we use the estimate (5.2.2) at the point $z$ on $\Omega(z,2|\Delta z|)$ and at the point $z=0$ on $\Omega(0,2|\Delta z|)$. The rest of the argument is the same. □
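Identity (1) of the preceding proof can be sanity-checked numerically. The Python sketch below is an illustration under our own choice of method: in polar coordinates $\zeta=z+re^{i\theta}$ centred at $z$, the integrand $ds\,dt/(z-\zeta)$ becomes $-e^{-i\theta}\,dr\,d\theta$, so the inner integral is just the distance $R(\theta)$ from $z$ to the unit circle in the direction $\theta$, and only a one-dimensional integral in $\theta$ remains. The function name `disk_kernel_integral` is hypothetical.

```python
import cmath
import math

def disk_kernel_integral(z: complex, n: int = 2000) -> complex:
    """Approximate (1/pi) * integral over the unit disk of ds dt / (z - zeta), |z| < 1."""
    total = 0j
    for j in range(n):
        theta = 2 * math.pi * j / n
        e = cmath.exp(1j * theta)
        # Solve |z + r e^{i theta}| = 1 for r > 0:
        # r^2 + 2 r Re(conj(z) e^{i theta}) + |z|^2 - 1 = 0.
        half_b = (z.conjugate() * e).real
        radius = -half_b + math.sqrt(half_b * half_b + 1 - abs(z) ** 2)
        # Contribution of the ray in direction theta: -e^{-i theta} * R(theta).
        total += -e.conjugate() * radius
    return total * (2 * math.pi / n) / math.pi

z = 0.3 + 0.4j
value = disk_kernel_integral(z)  # should agree with the conjugate of z
```

The same change of variables is what makes the singularity of the kernel harmless: the factor $r$ from the area element cancels the $1/r$ of the kernel.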
As an application, we get the following extension of Liouville’s Theorem, which will be useful in Section 5 of Chapter 15.
5.3 Theorem. Let $f$ be a function on $\mathbb{C}$ with continuous first (real) partial derivatives. Assume that
$$\lim_{z\to\infty}f(z)=0$$
and that there exists a function $A(z)$ with continuous first (real) partial derivatives and compact support such that
$$\frac{\partial f}{\partial\bar z}=Af.$$
Then $f(z)=0$ for all $z\in\mathbb{C}$.
Proof. Assume without loss of generality that the support of $A(z)$ is contained in $D$. Put $F(z)=f(z)e^{-(P(A))(z)}$. Using Lemma 5.2.1, we compute
$$\frac{\partial F}{\partial\bar z}=e^{-(P(A))(z)}\Big(\frac{\partial f(z)}{\partial\bar z}-f(z)A(z)\Big)=0.$$
Thus, $F$ is a holomorphic function on $\mathbb{C}$, and since $P(A)$ is bounded, $F$ tends to 0 at $\infty$. Hence $F$ is zero by Liouville’s Theorem 5.1 of Chapter 10, and so is $f$. □
Finally, we will prove two easy inequalities involving the operator $P$, which will also be useful in Section 5 of Chapter 15:
5.3.1 Lemma. (1) If $f$ is a continuously differentiable function on $\mathbb{C}$ with support in $D$, then
$$|(P(f))(z)|\le\frac{8\|f\|_\infty}{1+|z|}.$$
(2) For every $p>2$ there exists a constant $C_p$ such that if $f\in L^p(\mathbb{C})$, then
$$|(P_1(f))(z_1)-(P_1(f))(z_2)|\le C_p\|f\|_p|z_1-z_2|^{1-2/p}.$$
Proof. For (1), clearly it suffices to prove that
$$\int_D\frac{ds\,dt}{|\zeta-z|}\le\frac{8\pi}{1+|z|}.\tag{*}$$
First of all, by polar coordinates, we clearly have
$$\int_{\Omega(0,r)}\frac{dx\,dy}{\sqrt{x^2+y^2}}=2\pi r.$$
Thus, for $|z|\le1$, we may use $r=2$ to show that the left-hand side of (*) is less than or equal to $4\pi$. For $|z|>1$, the idea is to integrate $1/|\zeta-z|$ over the intersection of
$$\Omega(z,1+|z|)\setminus\Omega(z,|z|-1)\tag{**}$$
with the smallest angle with center $z$ which contains $D$. As already remarked, the integral of $1/|\zeta-z|$ over (**) is $4\pi$, so the integral over the intersection of (**) with an angle of size $\alpha$ will be
$$2\alpha.$$
The angle in question has size
$$2\arcsin(1/|z|)\le\frac{\pi}{|z|}\le\frac{2\pi}{1+|z|},$$
which is good enough.
To prove (2), use Hölder’s inequality with $1/p+1/q=1$ (put $\omega=\zeta/|z|$, $u=s/|z|$, $v=t/|z|$, and rotate so that $z/|z|=1$):
$$\left|\int_{\mathbb{C}}\Big(\frac{f(\zeta)}{\zeta-z}-\frac{f(\zeta)}{\zeta}\Big)ds\,dt\right|\le|z|\,\|f\|_p\Big(\int_{\mathbb{C}}\frac{ds\,dt}{|\zeta(\zeta-z)|^q}\Big)^{1/q}=\|f\|_p\,|z|^{(2/q)-1}\Big(\int_{\mathbb{C}}\frac{du\,dv}{|\omega(\omega-1)|^q}\Big)^{1/q}.$$
Applying this to the function $f(z+z_2)$ at the point $z_1-z_2$ gives the claimed inequality. □
6 Exercises
(1) Prove that a Möbius transformation maps an open disk, an open half-plane or the complement of a closed disk onto an open disk, open half-plane or the complement of a closed disk. Prove furthermore that for any two subsets of $\mathbb{C}\cup\{\infty\}$ of any two of the above three types, there exists a Möbius transformation mapping one onto the other.
(2) Let $w_0\in\Omega(0,1)$. Consider the Möbius transformation
$$f(z)=\frac{z-w_0}{1-\overline{w_0}z}.$$
Prove that this gives a holomorphic automorphism of $\Omega(0,1)$ which maps $w_0$ to 0. [Hint: Consider the effect of this Möbius transformation on $|z|=1$.]
(3) Construct a non-constant (non-injective) holomorphic function $f$ on $\mathbb{C}$ which is not onto.
(4) Let $a<b\in\mathbb{R}$ and let $f,g:[a,b]\to\mathbb{R}$ be continuous real functions which are continuously differentiable in $(a,b)$ and such that for $a<x<b$, $f(x)<g(x)$. Prove that the set
$$\{x+iy\in\mathbb{C}\mid a<x<b,\ f(x)<y<g(x)\}$$
is simply connected.
(5) Find an elementary function which maps the set $\{z\in\Omega(0,1)\mid\mathrm{Re}(z)>0,\ \mathrm{Im}(z)>0\}$ bijectively holomorphically onto $\Omega(0,1)$. [Hint: Find, in this order, holomorphic isomorphisms of the set described onto an open half-disk, an open quadrant, an open half-plane, $\Omega(0,1)$.]
(6) Show that if the polygon $P$ is a triangle, then in Theorem 2.4, the points $w_1,w_2,w_3$ can be chosen to be any three points on the unit circle which occur in this order when the circle is oriented counter-clockwise. [Hint: Using the maps of Exercise (2) and rotations, show that there is a holomorphic automorphism of $\Omega(0,1)$ which extends holomorphically to an open set containing $\overline{\Omega(0,1)}$, and maps a given choice of points $w_1,w_2,w_3$ to any other such given choice.]
(7) Determine a choice of the points $w_i$ when $P$ is a regular $k$-gon.
(8) Using the Schwarz-Christoffel formula, write down an explicit formula (with one free parameter) for a function $f$ mapping bijectively holomorphically the upper half-plane on a rectangle. Such formulas are called elliptic integrals. Using complex conjugation (similarly as in Lemma 2.3), prove that the inverse function $g$ extends to a meromorphic function on $\mathbb{C}$, which is doubly periodic, with periods equal to the sides of the rectangle (such functions are called elliptic functions). For information on elliptic functions, the reader may look at [11].
(9) Prove that every holomorphic function on a compact Riemann manifold is constant.
(10) Prove that the Möbius transformations are the only holomorphic automorphisms of $\mathbb{CP}^1$. [Hint: Use Proposition 1.1.1.]
(11) Prove that non-constant meromorphic functions on $\mathbb{CP}^1$ are precisely rational functions, i.e. functions of the form $p(z)/q(z)$ where $p(z)$, $q(z)$ are polynomials, $q(z)$ not identically zero. [Hint: Multiply (resp. divide) such a function $f(z)$ by the product of all factors $(z-z_i)^{k_i}$ where $z_i$ is a pole (resp. zero) of order $k_i$ in $\mathbb{C}$ (infinitely many zeroes or poles would mean $f$ is a constant 0 or $\infty$ by the Uniqueness Theorem 4.4 of Chapter 10). Then we may assume without loss of generality that the restriction of $f$ to $\mathbb{C}$ has neither zeroes nor poles. Now if $f(\infty)\ne\infty$, then $f$ is bounded on $\mathbb{C}$, while if $f(\infty)=\infty$, then $1/f(z)$ is bounded on $\mathbb{C}$. In either case, $f$ is constant by Liouville’s Theorem 5.1 of Chapter 10.]
(12) Prove in detail that for $U\subseteq\mathbb{C}$ an open set, $F(z)$ is a primitive function for the 1-form $f(z)dz$ with $f$ holomorphic if and only if $F(z)$ is a primitive function of $f(z)$.
(13) Prove in detail that the definitions of the operators $\partial$, $\bar\partial$ on differential forms on a Riemann surface are invariant under holomorphic change of coordinates.
(14) Prove that the function $e^z$, considered as a holomorphic map $\mathbb{C}\to\mathbb{C}\setminus\{0\}$, is a covering and that, in fact, it is the universal covering of $\mathbb{C}\setminus\{0\}$.
(15) From Exercise (14), construct an isomorphism
$$\pi_1(\mathbb{C}\setminus\{0\},x_0)\to\mathbb{Z}$$
for any base point $x_0$.
(16) Prove that $\pi_1(\mathbb{C}\setminus\{0,1\},x_0)$ is not abelian for any base point $x_0$. [Hint: Use Example 4.2.5.]
(17) Prove that the Riemann surface $\mathbb{CP}^1$ is simply connected. [Hint: Use smooth partition of unity to prove that every path is homotopic to a piecewise continuously differentiable parametrized curve. For any two piecewise continuously differentiable curves, there exists a point which is in neither of their images.]
(18) Prove that a connected Riemann surface $\Sigma$ (or, for that matter, a connected manifold, see Comment 4.4) with a point $x_0\in\Sigma$ is simply connected if and only if $\pi_1(\Sigma,x_0)$ is the trivial group (i.e. a group with a single element).
(19) Recall the concept of a Lie group from Chapter 12, Exercise (6). Prove that the fundamental group of a Lie group is commutative (see Comment 4.4). [Hint: the concatenation of paths is homotopic to the point-wise product, using the group operation.]
(20) Prove that if $\pi:\Gamma\to G$ is a covering and $G$ is a Lie group (cf. Comment 4.4) with both $G$ and $\Gamma$ connected, then $\Gamma$ can be given a structure of a Lie group such that $\pi$ is a homomorphism of groups.
(21) Define for $f\in L^p(D)$, $p>2$,
$$(Q(f))(z)=\frac1\pi\int_D\frac{f(\zeta)\,ds\,dt}{\bar z-\bar\zeta}.$$
Assuming (5.2.2), calculate
$$\frac{\partial Q(f(z))}{\partial z},\qquad\frac{\partial Q(f(z))}{\partial\bar z}.$$
14 Calculus of Variations and the Geodesic Equation
The aim of this chapter is to give a glimpse of the main principle of the calculus of
variations which, in its most basic problem, concerns minimizing certain types of
linear functions on the space of continuously differentiable curves in Rn with fixed
beginning point and end point. For further study in this subject, we recommend [7].
We derive the Euler-Lagrange equation which can be used to axiomatize a large
part of classical mechanics. We then consider in more detail the possibly most
fundamental example of the calculus of variations, namely the problem of finding
the shortest curve connecting two points in an open set in Rn with an arbitrary given
(smoothly varying) inner product on its tangent space. The Euler-Lagrange equation
in this case is known as the geodesic equation. The smoothly varying inner product
captures the idea of curved space. Thus, solving the geodesic equation here goes a
long way toward motivating the basic techniques of Riemannian geometry, which
we will develop in the next chapter.
1 The basic problem of the calculus of variations, and the Euler-Lagrange equations
1.1
For the purposes of this chapter, define a continuously differentiable function
$$\mathbf{y}:\langle a,b\rangle\to\mathbb{R}^n\tag{*}$$
as a function with the property that the function defined as the derivative of $\mathbf{y}$ on $(a,b)$ and as the respective one-sided derivatives at $a$ and $b$ is everywhere defined and continuous on $\langle a,b\rangle$.
Now consider the (affine) space $V=V_{a,b,\mathbf{p},\mathbf{q}}$ of all continuously differentiable functions (*) such that
$$\mathbf{y}(a)=\mathbf{p}\quad\text{and}\quad\mathbf{y}(b)=\mathbf{q}$$
for some fixed values $\mathbf{p},\mathbf{q}\in\mathbb{R}^n$. Let
$$L=L(t,x_1,\dots,x_n,v_1,\dots,v_n):\langle a,b\rangle\times\mathbb{R}^{2n}\to\mathbb{R}$$
be a function with continuous first partial derivatives (again, take one-sided derivatives at $a$, $b$ when applicable).
The most basic problem of the calculus of variations is looking for the extremes of the function (we use the term functional)
$$S:V\to\mathbb{R}$$
given by
$$S(\mathbf{y})=\int_a^bL(t,\mathbf{y}(t),\mathbf{y}'(t))\,dt.$$
Note that $S$ is continuous when we consider the metric on $V$ given by the norm
$$\|\mathbf{y}\|=\sup_{t\in\langle a,b\rangle}\|\mathbf{y}(t)\|+\sup_{t\in\langle a,b\rangle}\|\mathbf{y}'(t)\|.$$
(Here we may choose any of the usual norms on $\mathbb{R}^n$, for example the maximum one.) However, in the kind of formal investigation we are going to do, even this will play only a peripheral role.
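To make the functional $S$ concrete, here is a small Python sketch which evaluates it numerically in the case $n=1$. Everything in it is an assumption made for illustration: the midpoint rule as the quadrature, the helper name `action`, the arc-length choice $L(t,x,v)=\sqrt{1+v^2}$, and the two sample curves, both of which satisfy $y(0)=0$ and $y(1)=1$.

```python
import math

def action(L, y, dy, a, b, n=2000):
    """Approximate S(y) = integral from a to b of L(t, y(t), y'(t)) dt (midpoint rule)."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        total += L(t, y(t), dy(t))
    return total * h

def arc_length(t, x, v):
    """Lagrangian whose functional is the length of the graph of y."""
    return math.sqrt(1 + v * v)

# The straight line and a perturbation of it with the same end points.
line_y, line_dy = (lambda t: t), (lambda t: 1.0)
bent_y = lambda t: t + 0.3 * math.sin(math.pi * t)
bent_dy = lambda t: 1.0 + 0.3 * math.pi * math.cos(math.pi * t)

s_line = action(arc_length, line_y, line_dy, 0.0, 1.0)  # length of the segment
s_bent = action(arc_length, bent_y, bent_dy, 0.0, 1.0)  # strictly larger
```

The comparison of `s_line` and `s_bent` is the simplest instance of the extremal problem: among the curves with the given end points, the straight segment gives the smallest value of $S$.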
Lemma. Let $f:\langle a,b\rangle\to\mathbb{R}$ be a continuous function such that
$$\int_a^bf(t)h(t)\,dt=0$$
for all continuously differentiable functions $h$ such that $h(a)=h(b)=0$. Then $f\equiv0$.
Proof. Suppose $f$ is not identically zero. Then $f(t_0)\ne0$ for some $t_0\in(a,b)$. Suppose, without loss of generality, $f(t_0)>0$. Since $f$ is continuous, there exists an $\varepsilon>0$ such that $f(t)>0$ for all $t\in(t_0-\varepsilon,t_0+\varepsilon)$. Now let $h$ be a continuously differentiable function which is positive on some non-empty interval contained in $(t_0-\varepsilon,t_0+\varepsilon)$, and 0 elsewhere (we may use the “baby version” of smooth partition of unity 5.1 of Chapter 8). Then
$$\int_a^bf(t)h(t)\,dt>0,$$
a contradiction. □
1.2 Theorem. (The Euler-Lagrange equations) Suppose the functional $S:V\to\mathbb{R}$
has an extreme at a function $\mathbf{y}\in V$. Then the function $\mathbf{y}$ satisfies the system of
differential equations
\[ \left.\frac{\partial L}{\partial x_i}\right|_{\mathbf{x}=\mathbf{y},\,\mathbf{v}=\mathbf{y}'} = \frac{d}{dt}\left(\left.\frac{\partial L}{\partial v_i}\right|_{\mathbf{x}=\mathbf{y},\,\mathbf{v}=\mathbf{y}'}\right). \]
Comment: It is often customary to write the equations in the form

\[ \frac{\partial L}{\partial y_i} = \frac{d}{dt}\frac{\partial L}{\partial y_i'}, \]
but there is some danger in such a notation, since in the partial derivatives, we must
treat yi , yi0 as formal symbols plugged in for the arguments xi , vi of L, while the
derivative by t is the actual total derivative by the independent variable t.
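Proof of the theorem: Choose any continuously differentiable function
$h:\langle a,b\rangle\to\mathbb{R}$ such that $h(a)=h(b)=0$. Consider the real function of $n$ variables
\[ \Phi_h(u_1,\dots,u_n) = \int_a^b L(t,\mathbf{y}(t)+\mathbf{u}h(t),\mathbf{y}'(t)+\mathbf{u}h'(t))\,dt, \]
where $\mathbf{u}=(u_1,\dots,u_n)$.
If the functional $S$ has an extreme at $\mathbf{y}$, then $\Phi_h$ has an extreme at $\mathbf{o}$, and since it has
continuous partial derivatives by the chain rule everywhere, we must have
\[ \frac{\partial\Phi_h(\mathbf{o})}{\partial u_i} = 0. \]
Denoting by $\mathbf{e}_i$ the $i$'th standard basis vector of $\mathbb{R}^n$, compute
\begin{align*}
\frac{1}{u}\bigl(\Phi_h(u\mathbf{e}_i)-\Phi_h(\mathbf{o})\bigr)
&= \frac{1}{u}\int_a^b \bigl(L(t,\mathbf{y}(t)+u\mathbf{e}_ih(t),\mathbf{y}'(t)+u\mathbf{e}_ih'(t)) - L(t,\mathbf{y}(t),\mathbf{y}'(t))\bigr)\,dt \\
&= \frac{1}{u}\left(\int_a^b \frac{\partial L(t,\mathbf{y}(t)+\theta u\mathbf{e}_ih(t),\mathbf{y}'(t)+u\mathbf{e}_ih'(t))}{\partial x_i}\,u\,h(t)\,dt\right. \\
&\qquad\left. {}+ \int_a^b \frac{\partial L(t,\mathbf{y}(t),\mathbf{y}'(t)+\eta u\mathbf{e}_ih'(t))}{\partial v_i}\,u\,h'(t)\,dt\right)
\end{align*}
for some $0<\theta,\eta<1$ (which may depend on $t$). On the right hand side, we used the Mean Value Theorem 3.3
of Chapter 3 twice. Note that the $u$ factor cancels out. Letting $u\to 0$, the left hand side tends to
$\partial\Phi_h(\mathbf{o})/\partial u_i = 0$ and the arguments of the partial derivatives tend to
$(t,\mathbf{y}(t),\mathbf{y}'(t))$. Using $h(a)=h(b)=0$
and integration by parts in the second integral, we get
\[ 0 = \int_a^b\left(\frac{\partial L(\dots)}{\partial x_i} - \frac{d}{dt}\frac{\partial L(\dots)}{\partial v_i}\right)h(t)\,dt. \]
Now use Lemma 1.1. □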
1.3 Comment
The main idea of the proof resembles the idea of the total differential of a function
of finitely many variables (see Exercise (1) below for a more concrete statement). It
may seem we got something for free: how come we can find extremes of functionals
on a space of continuously differentiable functions as easily as extremes of functions
of finitely many variables? There is, however, one major catch: with the space V not
being compact (not even locally), there is no guarantee an extreme of the functional
S on V exists at all! Therefore, Theorem 1.2 is not nearly as strong as it may seem,
giving only candidates for a possible extreme. Similarly as in the case of functions
of finitely many variables, we call these candidates critical functions. Highly non-
trivial methods are generally needed to show that a given critical function is in fact
an extreme (we will see an example of that below).
2 A few special cases and examples
Simplifications in the form of the Euler-Lagrange equations occur in certain special
cases.
2.1 When L does not depend on x
In this case, the Euler-Lagrange equations become
\[ \frac{\partial L(t,\mathbf{y}(t),\mathbf{y}'(t))}{\partial v_i} = K_i \]
where $K_1,\dots,K_n\in\mathbb{R}$ are constants.

Example: Let $n=1$. Let us verify that the shortest graph of a function
connecting two points $(a,p)$, $(b,q)$ in $\mathbb{R}^2$, $a<b$, is indeed a straight line. The
formula for the arc length of a graph of a function $y=y(t)$ is
\[ S(y) = \int_a^b\sqrt{1+y'(t)^2}\,dt, \]
and hence
\[ L(t,x,v) = \sqrt{1+v^2}. \]
Therefore, we have
\[ \frac{\partial L}{\partial v} = \frac{v}{\sqrt{1+v^2}}, \]
and we get a differential equation
\[ \frac{y'}{\sqrt{1+y'^2}} = K, \]
or
\[ (1-K^2)\,y'^2 = K^2. \]
Thus we get a critical function if and only if $y'$ is constant.
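The conclusion of this example can be checked numerically. The following is only a rough sketch, not part of the argument: it approximates the arc length of a graph by an inscribed polygon and compares the straight line with a few perturbed graphs having the same endpoints. The helper names `graph_length`, `line` and `perturbed`, the endpoint values, and the sine-shaped perturbation are all assumptions made for illustration.

```python
import math

def graph_length(y, a, b, n=2000):
    """Approximate the arc length of the graph of y on [a, b] by an inscribed polygon."""
    total = 0.0
    for k in range(n):
        t0 = a + (b - a) * k / n
        t1 = a + (b - a) * (k + 1) / n
        total += math.hypot(t1 - t0, y(t1) - y(t0))
    return total

# Sample boundary data (arbitrary illustrative values): y(a) = p, y(b) = q.
a, b, p, q = 0.0, 2.0, 1.0, 3.0

def line(t):
    # The critical function: constant slope, matching both boundary values.
    return p + (q - p) * (t - a) / (b - a)

def perturbed(eps):
    # An admissible competitor: the added bump vanishes at t = a and t = b.
    return lambda t: line(t) + eps * math.sin(math.pi * (t - a) / (b - a))

straight = graph_length(line, a, b)
lengths = {eps: graph_length(perturbed(eps), a, b) for eps in (-0.5, -0.1, 0.1, 0.5)}
```

In this experiment every entry of `lengths` should exceed `straight`, in line with the claim that the critical function is the minimizer. A finite set of perturbations of course proves nothing in general.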
2.2 When L does not depend on t
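Then we have
\[ \frac{d}{dt}L(\mathbf{y}(t),\mathbf{y}'(t)) = \sum_{i=1}^n\left(\frac{\partial L(\mathbf{y}(t),\mathbf{y}'(t))}{\partial x_i}\,y_i' + \frac{\partial L(\mathbf{y}(t),\mathbf{y}'(t))}{\partial v_i}\,y_i''\right), \]
which, using the Euler-Lagrange equation, is equal to
\[ \sum_{i=1}^n\left(\left(\frac{d}{dt}\frac{\partial L(\mathbf{y}(t),\mathbf{y}'(t))}{\partial v_i}\right)y_i' + \frac{\partial L(\mathbf{y}(t),\mathbf{y}'(t))}{\partial v_i}\,y_i''\right) = \frac{d}{dt}\sum_{i=1}^n\frac{\partial L(\mathbf{y}(t),\mathbf{y}'(t))}{\partial v_i}\,y_i'. \]
Thus, we obtain
\[ \frac{d}{dt}\left(L - \sum_{i=1}^n\frac{\partial L}{\partial v_i}\,y_i'\right) = 0, \]
or in other words
\[ \sum_{i=1}^n\frac{\partial L}{\partial v_i}\,y_i' - L = K \tag{2.2.1} \]
is a "conserved quantity" in $t$. This expression is called the Hamiltonian. When
$n=1$, the equation (2.2.1) can be used directly instead of the Euler-Lagrange
equation to find critical functions.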
Example: The brachistochrone problem. Design a shape of a roller-coaster track
in the $tx$ plane such that the car starting (at rest) at the point $(0,s)$ reaches the point $(r,0)$ in
the shortest possible time. (Gravity is assumed to pull in the negative direction of the
$x$ axis. Caution: here $x$ is the vertical coordinate, and $t$ is the horizontal coordinate,
not time!)
We may choose units such that the mass of the car is 1, as is the acceleration of
gravity. Then the potential energy at the point $(0,s)$ is $s$, and hence by conservation
of energy, at a point $(t,x)$, the kinetic energy is $(s-x)$. Thus, if the component of
the velocity in the $t$ direction is $w$, we have
\[ \frac{1}{2}w^2(1+x'^2) = s-x, \]
and hence
\[ w = \sqrt{\frac{2(s-x)}{1+x'^2}}, \]
or
\[ L(t,x,v) = 1/w = \sqrt{\frac{1+v^2}{2(s-x)}}. \]
Thus, (2.2.1) gives (after changing the sign of the constant $K$)
\[ \sqrt{\frac{1+x'^2}{2(s-x)}} - x'\,\frac{x'}{\sqrt{1+x'^2}\,\sqrt{2(s-x)}} = K, \]
which yields
\[ 1 = K\sqrt{2(s-x)(1+x'^2)} \]
or
\[ 1 = 2K^2(s-x)(1+x'^2). \]
It is not difficult to verify that the solution can be expressed parametrically as
\[ t(\theta) = A(\theta+\sin\theta)+B,\qquad x(\theta) = s-A(1+\cos\theta) \tag{2.2.2} \]
for suitable constants $A$, $B$. This curve is called a cycloid.
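As a plausibility check, the conserved quantity of this example can be evaluated numerically along a cycloid. The sketch below assumes the parametrization $t = A(\theta+\sin\theta)+B$, $x = s-A(1+\cos\theta)$ and verifies that $(s-x)(1+x'^2)$ takes the same value, namely $2A$, at several parameter values. The function names, the sample constants, and the finite-difference step are illustrative assumptions.

```python
import math

def cycloid_point(theta, A, B, s):
    """Assumed parametrization: t = A(theta + sin theta) + B, x = s - A(1 + cos theta)."""
    t = A * (theta + math.sin(theta)) + B
    x = s - A * (1.0 + math.cos(theta))
    return t, x

def first_integral(theta, A, B, s, d=1e-6):
    """Evaluate (s - x)(1 + x'^2), where x' = dx/dt is estimated by central differences in theta."""
    t_lo, x_lo = cycloid_point(theta - d, A, B, s)
    t_hi, x_hi = cycloid_point(theta + d, A, B, s)
    slope = (x_hi - x_lo) / (t_hi - t_lo)
    _, x = cycloid_point(theta, A, B, s)
    return (s - x) * (1.0 + slope * slope)

# Arbitrary sample constants; any A > 0 should behave the same way.
A, B, s = 0.75, 0.0, 2.0
values = [first_integral(theta, A, B, s) for theta in (-2.5, -1.0, 0.0, 1.0, 2.5)]
```

Under these assumptions each entry of `values` agrees with `2 * A` up to discretization error, which corresponds to $2A = 1/(2K^2)$ in the notation of the example.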
2.3 Lagrangian mechanics
In physics, the motion of a system of finitely many particles in $\mathbb{R}^3$ can be described
using the Euler-Lagrange equations. Consider all the coordinates of all the particles
together, so we have a variation problem in $\mathbb{R}^{3n}$, where the coordinates of the $i$-th
particle are coordinates number $3i-2$, $3i-1$, $3i$. The basic principle for writing
down the Lagrangian is
\[ L = \text{kinetic energy} - \text{potential energy}. \tag{2.3.1} \]
In the basic setup of Newtonian mechanics, the particles have masses $m_i$, and the
kinetic energy is
\[ \sum_{i=1}^n\frac{1}{2}m_i\,(v_{3i-2}^2+v_{3i-1}^2+v_{3i}^2). \tag{2.3.2} \]
A kinetic energy formula of this form, i.e. essentially the form $\sum\frac{1}{2}mv^2$, is referred
to as a standard kinetic term.
The potential energy term is more variable. Assuming the particles act on one
another by gravity, Newton's law of gravity gives potential energy
\[ -G\sum_{i<j}\frac{m_im_j}{\sqrt{\displaystyle\sum_{k=0}^{2}(x_{3i-k}-x_{3j-k})^2}} \tag{2.3.3} \]
where $G$ is Newton's universal constant of gravity. We may generalize this further
by including a conservative force field acting on each particle, $F_i = -\operatorname{grad}(\phi_i)$,
$\phi_i:\mathbb{R}^3\to\mathbb{R}$, $i=1,\dots,n$, in which case we add to the potential energy the term
\[ \sum_{i=1}^n\phi_i(x_{3i-2},x_{3i-1},x_{3i}). \tag{2.3.4} \]
According to the recipe (2.3.1), the (original) Lagrangian is obtained by taking the
standard kinetic term (2.3.2), and subtracting the potential terms (2.3.3), (2.3.4),
thus getting
\begin{align*}
L(x_1,\dots,x_{3n},v_1,\dots,v_{3n})
&= \sum_{i=1}^n\frac{1}{2}m_i\,(v_{3i-2}^2+v_{3i-1}^2+v_{3i}^2) \\
&\quad + G\sum_{i<j}\frac{m_im_j}{\sqrt{\displaystyle\sum_{k=0}^{2}(x_{3i-k}-x_{3j-k})^2}} - \sum_{i=1}^n\phi_i(x_{3i-2},x_{3i-1},x_{3i}).
\end{align*}
Lagrange’s principle states that the equation of motion is given by the critical
function for this Lagrangian on a time interval ha; bi with given positions at the
times a and b, i.e. that it is subject to the Euler-Lagrange equations 1.2. We will not
prove this here. In fact, a mathematical “proof” in this setting is not to be expected:
we are referring to a system of physical particles. What could be proved, however,
is that Lagrange’s equations are equivalent to Newton’s.
Observe that in the presence of the standard kinetic term (2.3.2), the Hamiltonian
(2.2.1) of 2.2 has the physical meaning of the total energy of the system,
which, indeed, should be conserved by the law of conservation of energy.
The Lagrangian mechanics setup may seem like nothing new, since it only recovers
Newton's equations, and, in fact, is even less general, since it requires a conservative
force field. However, the Lagrangian turns out to be extremely beneficial for
generalizations. In fact, most of modern physics uses the Lagrangian formalism.
3 The geodesic equation
Let us return to mathematics. Perhaps the single most important example of the
Euler-Lagrange equation is the geodesic equation in a Riemann metric (although it
should be pointed out that the equation does have a physical meaning, describing in
fact the motion of a light ray in a gravity field in Einstein’s general relativity).
3.1 A Riemann metric on an open subset of Rn
Let $U\subseteq\mathbb{R}^n$ be an open subset. Let $g_{ij}:U\to\mathbb{R}$, $i,j=1,\dots,n$, be smooth functions
such that for each $x\in U$, $g=(g_{ij})_{i,j}$ is a positive definite symmetric matrix.
We will interpret $g(x)$ as the associated matrix of a (real) inner product
\[ \langle u,v\rangle_g \]
of tangent vectors at the point $x\in U$, which will be called a Riemann metric on $U$
(see 7.7 of Appendix A). The key point is that the tangent space of $U$ is canonically
identified with $\mathbb{R}^n$ via the coordinate map which is simply the embedding $U\subseteq\mathbb{R}^n$.
As a generalization of formula (**) in Subsection 2.2 of Chapter 8, we will, then,
define the length with respect to the Riemann metric $g$ of a piecewise continuously
differentiable curve represented by a map
\[ \gamma:\langle a,b\rangle\to U \]
by the formula
\[ s_g(\gamma) = \int_a^b\sqrt{\langle\gamma'(t),\gamma'(t)\rangle_g}\,dt. \tag{3.1.1} \]
(See Exercise (8).)
We will be interested in the variational problem of minimizing the functional
(3.1.1) over the set of continuously differentiable curves with given boundary
points $\gamma(a)=A$, $\gamma(b)=B\in U$. Before getting into this seriously, we will
introduce a notational convention which is helpful when figuring out numerical
examples in complicated formulas with many indices: often, we are making multiple
sums over indices, for example, $i=1,\dots,n$ over terms where the index $i$ occurs,
and is equal, in two factors entering the formula. In this, and the following chapter,
we will make the convention that

    When an index $i$ appears in more than one factor of a product, then $i$
    will occur in precisely two such factors, once as a subscript and once as
    a superscript. This notation shall mean summation over all permissible
    values of $i$, which must be the same in both factors in question; the
    summation symbol $\sum_i$ will be omitted.                          (3.1.2)

Thus, using this convention, the components of the function $\gamma$ will be written with
superscripts, $\gamma^i$, $i=1,\dots,n$, and the formula (3.1.1) above will assume the form
\[ s_g(\gamma) = \int_a^b\sqrt{g_{ij}\gamma'^i\gamma'^j} = \int_a^b\sqrt{g_{ij}(\gamma(t))\,\gamma'^i(t)\,\gamma'^j(t)}\,dt. \tag{3.1.3} \]
The convention (3.1.2) may seem unreasonably restrictive, but turns out adequate
in the types of formulas we will encounter. It is known as (one version of) the
Einstein convention. When two quantities share an index as a subscript in one
and a superscript in the other (and summation over all permissible values is to
be performed), we call the quantities coupled. We can see already in (3.1.3) in
comparison with (3.1.1) that the Einstein convention can make formulas more
explicit. In the next chapter, when talking about the more general context of
manifolds, we will talk about tensors, and will give the Einstein convention a deeper
interpretation.
3.2 A trick: modifying the functional
We see immediately that the Euler-Lagrange equation for the functional (3.1.3)
will be a pain because of the square root in the Lagrangian. This problem has a
surprisingly simple solution, which, at first, cannot possibly seem right: simply omit
the square root! Thus, we will consider the functional
\[ S_g(\gamma) = \int_a^b g_{ij}\gamma'^i\gamma'^j. \tag{3.2.1} \]
To justify this, recall that by Lemma 8.6.1 of Chapter 5,
\[ S_g(\gamma) \geq \frac{1}{b-a}\,(s_g(\gamma))^2, \]
while equality arises if and only if $g_{ij}(\gamma(t))\,\gamma'^i(t)\,\gamma'^j(t)$ is constant in $t$ (keep
in mind that we are using the Einstein convention). This condition is called
parametrization by arc length.
Note that any continuously differentiable curve with nowhere vanishing derivative can be
parametrized by arc length: Letting
\[ s(t) = \int_a^t\sqrt{g_{ij}(\gamma(\tau))\,\gamma'^i(\tau)\,\gamma'^j(\tau)}\,d\tau, \]
we obtain an increasing continuously differentiable map $s$ with positive derivative
from $\langle a,b\rangle$ to the interval $\langle 0,s_g(\gamma)\rangle$; composing $\gamma$ with $s^{-1}$ is a parametrization by
arc length. This shows that if the functional $S_g$ indeed has a minimum in the space
of continuously differentiable curves with fixed boundary points $A$, $B$, then the
minimum curve also minimizes the functional $s_g$, and furthermore is parametrized
by arc length!
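The inequality between the two functionals can be observed numerically. The sketch below is an illustration under stated assumptions, not a proof: the metric $g_{ij}(p) = (1+|p|^2)\delta_{ij}$ on $\mathbb{R}^2$, the curve $\gamma(t) = (t,t^2)$ on $\langle 0,1\rangle$, and the midpoint quadrature rule are all hypothetical choices made for the example. Since this curve does not have constant speed, the gap $S_g(\gamma) - s_g(\gamma)^2/(b-a)$ should come out strictly positive.

```python
import math

def metric(p):
    """Hypothetical example metric on R^2: g_ij(p) = (1 + |p|^2) * delta_ij."""
    c = 1.0 + p[0] ** 2 + p[1] ** 2
    return [[c, 0.0], [0.0, c]]

def speed_squared(curve, velocity, t):
    """g_ij(gamma(t)) gamma'^i(t) gamma'^j(t), with the sums over i and j written out."""
    g, v = metric(curve(t)), velocity(t)
    return sum(g[i][j] * v[i] * v[j] for i in range(2) for j in range(2))

def integrate(f, a, b, n=20000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

a, b = 0.0, 1.0
curve = lambda t: (t, t * t)          # gamma(t) = (t, t^2), an arbitrary sample curve
velocity = lambda t: (1.0, 2.0 * t)   # gamma'(t)

energy = integrate(lambda t: speed_squared(curve, velocity, t), a, b)             # S_g(gamma)
length = integrate(lambda t: math.sqrt(speed_squared(curve, velocity, t)), a, b)  # s_g(gamma)
gap = energy - length ** 2 / (b - a)
```

Reparametrizing the same curve by arc length would leave `length` unchanged and shrink `gap` to zero, which is the point of the trick.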
3.3 The Euler-Lagrange equation for the modified functional: the geodesic equation
The modified functional (3.2.1) of 3.2 gives us the Lagrangian
\[ L(x,v) = g_{ij}(x)\,v^iv^j, \tag{3.3.1} \]
using the notation $x=(x^1,\dots,x^n)$, $v=(v^1,\dots,v^n)$ and the Einstein convention.
We have
\[ \frac{\partial L}{\partial v^i} = 2g_{ij}(x)\,v^j. \]
Note here that from the point of view of the Einstein convention, we must
treat the $i$ in $\dfrac{\partial}{\partial v^i}$ as a subscript.
By the chain rule, we therefore have
\begin{align*}
\frac{d}{dt}\frac{\partial L(x,x')}{\partial v^i}
&= 2g_{ij}\,(x^j)'' + 2\frac{\partial g_{ij}}{\partial x^k}\,(x^j)'(x^k)' \\
&= 2g_{ij}\,(x^j)'' + \left(\frac{\partial g_{ij}}{\partial x^k}+\frac{\partial g_{ik}}{\partial x^j}\right)(x^j)'(x^k)'.
\end{align*}
The last step may seem to do nothing, but we will see later that it is useful to have the
quantity coupled to $(x^j)'(x^k)'$ symmetrical in $j,k$ (it will help eliminate a certain,
somewhat counterintuitive, quantity known as torsion).
Also note that by the chain rule, we have
\[ \frac{\partial L(x,x')}{\partial x^i} = \frac{\partial g_{jk}}{\partial x^i}\,(x^j)'(x^k)', \]
and hence the Euler-Lagrange equation becomes (after cancelling 2),
\[ g_{ij}\,(x^j)'' + \frac{1}{2}\left(\frac{\partial g_{ij}}{\partial x^k}+\frac{\partial g_{ik}}{\partial x^j}-\frac{\partial g_{jk}}{\partial x^i}\right)(x^j)'(x^k)' = 0, \tag{3.3.2} \]
which is called the geodesic equation. It is useful to write
\[ \Gamma_{ijk} = \frac{1}{2}\left(\frac{\partial g_{ij}}{\partial x^k}+\frac{\partial g_{ik}}{\partial x^j}-\frac{\partial g_{jk}}{\partial x^i}\right). \tag{3.3.3} \]
Then (3.3.2) becomes
\[ g_{ij}\,(x^j)'' + \Gamma_{ijk}\,(x^j)'(x^k)' = 0. \tag{3.3.4} \]
As we learned from the theory of differential equations in Chapter 6, it is useful
to have the highest derivative in explicit form. In the present case, it suffices to
multiply by the matrix $g^{-1}$ inverse to $g$. To conform with the Einstein convention,
it is customary to denote the $(i,j)$'th entry of the matrix $g^{-1}$ as $g^{ij}$. Then we obtain
\[ g^{ij}g_{jk} = \delta^i_k \]
where
\[ \delta^i_k = \begin{cases} 1 & \text{when } i=k, \\ 0 & \text{otherwise} \end{cases} \]
is called the Kronecker $\delta$ (see also Appendix A, 7.2). Thus, putting
\[ \Gamma^i_{jk} = g^{i\ell}\,\Gamma_{\ell jk}, \]
the geodesic equation becomes
\[ (x^i)'' + \Gamma^i_{jk}\,(x^j)'(x^k)' = 0. \tag{3.3.5} \]
The symbols $\Gamma_{ijk}$ or $\Gamma^i_{jk}$ are known as Christoffel symbols of the first resp. second
kind.
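The definitions of the Christoffel symbols translate directly into a small computation. The following sketch approximates the partial derivatives of $g_{ij}$ by central differences and then forms the symbols of the first and second kind. The example metric (the Euclidean plane written in polar coordinates $(r,\theta)$, so $g = \mathrm{diag}(1, r^2)$), the step size, and all function names are assumptions made for illustration.

```python
def metric(x):
    """Assumed example: the Euclidean metric in polar coordinates x = (r, theta)."""
    r = x[0]
    return [[1.0, 0.0], [0.0, r * r]]

def d_metric(x, k, h=1e-5):
    """Central-difference approximation of the partial derivatives d g_ij / d x^k."""
    hi, lo = list(x), list(x)
    hi[k] += h
    lo[k] -= h
    g_hi, g_lo = metric(hi), metric(lo)
    return [[(g_hi[i][j] - g_lo[i][j]) / (2 * h) for j in range(2)] for i in range(2)]

def christoffel_first(x):
    """Gamma_ijk = (d_k g_ij + d_j g_ik - d_i g_jk) / 2."""
    dg = [d_metric(x, k) for k in range(2)]  # dg[k][i][j] = d g_ij / d x^k
    return [[[0.5 * (dg[k][i][j] + dg[j][i][k] - dg[i][j][k])
              for k in range(2)] for j in range(2)] for i in range(2)]

def inverse_2x2(g):
    """Inverse of a 2x2 matrix, giving the entries g^{ij}."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]

def christoffel_second(x):
    """Gamma^i_jk = g^{il} Gamma_ljk, with the sum over l written out."""
    first, g_inv = christoffel_first(x), inverse_2x2(metric(x))
    return [[[sum(g_inv[i][l] * first[l][j][k] for l in range(2))
              for k in range(2)] for j in range(2)] for i in range(2)]

# gamma[i][j][k] approximates Gamma^i_jk at the sample point r = 2, theta = 0.7.
gamma = christoffel_second((2.0, 0.7))
```

For this metric the only nonzero symbols of the second kind are $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$, which the numerical values reproduce up to rounding.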
Parametrized curves satisfying the geodesic equation are called geodesics
parametrized by arc length, or simply geodesics. Let us keep in mind, however,
that geodesics are merely critical for the functional (3.2.1) of 3.2. We have not
proved that geodesics minimize the length of continuously differentiable curves
with given boundary points. In fact, this is false in general (see Exercise (7) (c)
of Chapter 15 below). Yet, for the sake of geometry, we are clearly interested at
least in some minimum length statement regarding geodesics, and it is important to
note that the variational tools we supplied do not give that. We will prove such a
statement in the next section using different methods.
4 The geometry of geodesics
The purpose of this section is to study geodesics in more detail, and eventually to
prove that locally they really are the curves of minimal length connecting two points
with respect to a given Riemann metric.
4.1 Dependence on boundary conditions, the exponential map
Recall now Theorem 6.5.5 where we investigated the dependence of solutions of an ordinary
differential equation on initial conditions. At this point, we are interested in dealing
with smooth functions. Let us distill the result we will need here:
4.1.1 Lemma. Let $U\subseteq\mathbb{R}^n$ be an open set, and let $\mathbf{f}:\mathbb{R}\times U\to\mathbb{R}^n$ be a smooth
function. Consider points $t_0\in\mathbb{R}$, $\mathbf{x}_0\in U$. Then there exists an open neighborhood
$V$ of $(t_0,\mathbf{x}_0)$ in $\mathbb{R}\times U$ and a unique smooth function $\mathbf{y}:V\to U$ such that
$\mathbf{y}(t_0,\mathbf{x})=\mathbf{x}$ and
\[ \frac{\partial\mathbf{y}(t,\mathbf{x})}{\partial t} = \mathbf{f}(t,\mathbf{y}(t,\mathbf{x})) \tag{*} \]
for all $(t,\mathbf{x})\in V$.
Proof. As explained in Subsection 5.1 of Chapter 6, we can treat dependence
on initial conditions as dependence on parameters. From this point of view, the
existence and uniqueness of a continuous solution $\mathbf{y}$ as claimed follows from
Theorem 5.3 of Chapter 6, and its continuous differentiability in all variables follows
from Theorem 5.5 of Chapter 6. Now applying the equations (5.5.2) of Chapter 6
for the partial derivatives, we obtain, by induction, the existence and continuity of
all higher partial derivatives. □
By 1.2 of Chapter 6, an analogue of Lemma 4.1.1 also holds for systems of higher
order differential equations. Applying this specifically to the case of the geodesic
equation (3.3.5) of 3.3, we obtain the following
4.1.2 Corollary. For a smooth Riemann metric $g$ on an open set $U\subseteq\mathbb{R}^n$ and a
point $P\in U$, pick an isometry
\[ \phi:(\mathbb{R}^n,\langle ?,?\rangle)\to(\mathbb{R}^n,\langle ?,?\rangle_{g_P}). \]
(Here on the left hand side, $\langle ?,?\rangle$ denotes the dot product, see Appendix A,
Section 4.3.) Then there exists a convex open neighborhood $V$ of $\mathbf{o}\in\mathbb{R}^n$, and a
unique smooth map $\psi:V\to U$ such that
(1) $\psi(\mathbf{o})=P$,
(2) for each $\mathbf{v}\in\mathbb{R}^n$, $\psi(\mathbf{v}t)$ considered as a function of $t$ in an open neighborhood
of $0$ in which $\mathbf{v}t\in V$ is a $g$-geodesic parametrized by arc length (in the sense
of 3.3),
(3) $\partial_{\mathbf{v}}\psi(\mathbf{o})=\phi(\mathbf{v})$.
The smooth map $\psi$ of Corollary 4.1.2 is often denoted by exp and called the
exponential map.
4.2 Behavior of geodesics with respect to lengths and angles
Let us first verify that solutions of the equation (3.3.5) of 3.3 are indeed parametrized
by arc length with respect to the Riemann metric g. While we argued in 3.2 that this
must be true for parametric curves minimizing the functional (3.2.1), note that we
have so far only proved that the solutions of (3.3.5) are critical. Hence, that argument
cannot be used rigorously.
4.2.1 Lemma. Let $x:(a,b)\to U$ be a solution of the equation (3.3.5). Then
we have
\[ (g_{ij}\,(x^i)'(x^j)')' = 0 \]
(using the Einstein convention).
Proof. Let us compute the Hamiltonian (2.2.1) of 2.2 for the Lagrangian (3.3.1)
of 3.3:
\[ \frac{\partial L(x,x')}{\partial v^i}\,(x^i)' - L(x,x') = 2g_{ij}\,(x^i)'(x^j)' - g_{ij}\,(x^i)'(x^j)' = g_{ij}\,(x^i)'(x^j)'. \]
Thus, the quantity whose constancy in $t$ we are trying to prove is in effect the
Hamiltonian. Hence, our statement follows from 2.2. □
Note that the proof of Lemma 4.2.1 suggests multiplying the Lagrangian (3.3.1)
of 3.3 by a factor of 1/2, and calling it energy.
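The conservation statement of Lemma 4.2.1 can be watched in a numerical experiment. The sketch below integrates the geodesic equation (3.3.5) with a classical fourth-order Runge-Kutta step and tracks $g_{ij}(x^i)'(x^j)'$ along the solution. It assumes, purely for illustration, the Euclidean plane in polar coordinates $(r,\theta)$ with $g = \mathrm{diag}(1,r^2)$, for which the geodesic equation reads $r'' = r\,\theta'^2$, $\theta'' = -2r'\theta'/r$. The function names, initial data, and step size are hypothetical choices.

```python
import math

def geodesic_rhs(state):
    """First-order form of the geodesic equation for the assumed metric diag(1, r^2)."""
    r, theta, dr, dtheta = state
    return (dr, dtheta, r * dtheta * dtheta, -2.0 * dr * dtheta / r)

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(s, k, c):
        return tuple(si + c * ki for si, ki in zip(s, k))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, h / 2))
    k3 = geodesic_rhs(shifted(state, k2, h / 2))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def speed_squared(state):
    """g_ij (x^i)'(x^j)' = r'^2 + r^2 theta'^2 for the assumed metric."""
    r, _, dr, dtheta = state
    return dr * dr + r * r * dtheta * dtheta

state = (1.0, 0.0, 0.3, 0.8)   # (r, theta, r', theta') at t = 0, arbitrary sample data
initial = speed_squared(state)
drift = 0.0
for _ in range(2000):          # integrate up to t = 2 with step 0.001
    state = rk4_step(state, 1e-3)
    drift = max(drift, abs(speed_squared(state) - initial))

# Cartesian position at t = 2; the geodesics of this metric are straight lines.
end_x = state[0] * math.cos(state[1])
end_y = state[0] * math.sin(state[1])
```

In this run `drift` stays at the level of the integration error, and the endpoint lies on the straight line through $(1,0)$ with Cartesian velocity $(0.3, 0.8)$, as expected for the flat metric.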
4.2.2
Now we will prove that when we shift a geodesic to a nearby geodesic, the
angle of the shift is also conserved, provided that we do not change the scale of
parametrization. More precisely, let solutions of the geodesic equation (3.3.5) of 3.3
depend on some smooth parameter $u$ in the space of initial conditions, as in the proof
of Lemma 4.1.1. Let us assume further that
\[ \frac{\partial\bigl(g_{ij}\,(x^i)'(x^j)'\bigr)}{\partial u} = 0. \tag{1} \]
Note that by Lemma 4.2.1, it suffices to verify this condition at one point, and
the condition indeed means that we are not changing the scale of arc length
parametrization with $u$. Now let
\[ z = \frac{\partial x}{\partial u}, \]
as, again, in the proof of Lemma 4.1.1.
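Lemma. We have
\[ (g_{ij}\,z^i\,(x^j)')' = 0. \tag{2} \]
Proof. Compute
\[ (g_{ij}\,z^i\,(x^j)')' = \frac{\partial g_{ij}}{\partial x^k}\frac{\partial x^k}{\partial t}\frac{\partial x^i}{\partial u}\frac{\partial x^j}{\partial t} + g_{ij}\frac{\partial^2x^i}{\partial u\,\partial t}\frac{\partial x^j}{\partial t} + g_{ij}\frac{\partial x^i}{\partial u}\frac{\partial^2x^j}{(\partial t)^2}. \tag{3} \]
Now (1) implies that
\[ \frac{\partial g_{ij}}{\partial x^k}\frac{\partial x^k}{\partial u}\frac{\partial x^i}{\partial t}\frac{\partial x^j}{\partial t} + 2g_{ij}\frac{\partial^2x^i}{\partial u\,\partial t}\frac{\partial x^j}{\partial t} = 0. \tag{4} \]
Subtracting 1/2 times (4) from the right hand side of (3), we get
\[ (g_{ij}\,z^i\,(x^j)')' = \frac{\partial g_{ij}}{\partial x^k}\frac{\partial x^k}{\partial t}\frac{\partial x^i}{\partial u}\frac{\partial x^j}{\partial t} - \frac{1}{2}\frac{\partial g_{ij}}{\partial x^k}\frac{\partial x^k}{\partial u}\frac{\partial x^i}{\partial t}\frac{\partial x^j}{\partial t} + g_{ij}\frac{\partial x^i}{\partial u}\frac{\partial^2x^j}{(\partial t)^2}. \tag{5} \]
Using the geodesic equation (3.3.5) of 3.3 for $\dfrac{\partial^2x^j}{(\partial t)^2}$, we see that the last term is
equal to
\[ -g_{i\ell}\,z^i\,\Gamma^\ell_{jk}\,(x^j)'(x^k)' = -\Gamma_{ijk}\,z^i\,(x^j)'(x^k)'. \]
Using the definition of $\Gamma_{ijk}$ in 3.3, this is equal to
\[ -\frac{1}{2}\frac{\partial x^i}{\partial u}\left(\frac{\partial g_{ij}}{\partial x^k}+\frac{\partial g_{ik}}{\partial x^j}-\frac{\partial g_{jk}}{\partial x^i}\right)\frac{\partial x^j}{\partial t}\frac{\partial x^k}{\partial t}. \]
This is equal to
\[ -\frac{\partial g_{ij}}{\partial x^k}\frac{\partial x^k}{\partial t}\frac{\partial x^i}{\partial u}\frac{\partial x^j}{\partial t} + \frac{1}{2}\frac{\partial g_{ij}}{\partial x^k}\frac{\partial x^k}{\partial u}\frac{\partial x^i}{\partial t}\frac{\partial x^j}{\partial t} \]
by renaming variables, which shows that (5) is 0. □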
Remark: In comparison with Lemma 4.2.1, we may ask if Lemma 4.2.2 has
a similarly conceptual proof (our proof was by calculation from the definition of
the Christoffel symbols). Such a conceptual proof indeed exists, and is related to
our comments in Sections 7 and 8 of Chapter 6: the condition (1) indicates that
the Lagrangian has an infinitesimal symmetry. By a similar but somewhat more
elaborate argument to the discussion in Chapter 6, this always implies a conserved
quantity known as a Noether current, which is the cause of the conservation law
proved in Lemma 4.2.2. Discussing this more systematically, however, exceeds the
scope of this text.
4.3 Minimality of geodesics
Let us now consider an open subset $U\subseteq\mathbb{R}^n$ with a smooth Riemann metric $g$ and a
point $P\in U$. Choose an isometry $\phi$ as in Corollary 4.1.2, and let $\psi:V\to U$,
$\psi(\mathbf{o})=P$, be the corresponding exponential map. By the Inverse Function
Theorem 7.3 of Chapter 3, we may further assume that $\psi$ is a diffeomorphism
onto its image.
4.3.1 Lemma. Let
\[ S_r = \{x\in\mathbb{R}^n \mid \|x\| = r\}. \]
Assuming $S_r\subseteq V$, $x\in S_r$, $v\in T(S_r)_x$, then $(D\psi)_x(v)$ is $g$-orthogonal to $(D\psi)_x(x)$.
In other words, $T(\psi(S_r))_{\psi(x)}$ is $g$-orthogonal to $(D\psi)_x(x)$.
Caution: It is not claimed, and, as we will see in the next section, certainly not
true in general, that $\psi$ would be an isometry!
Proof. We will use Lemma 4.2.2. Let $\tilde{x} = x/\|x\| = x/r$. Consider the geodesic
$\psi(t\tilde{x})$. By the definition of $\psi$, and the fact that it is a diffeomorphism onto its image
when restricted to $V$, the space $T(\psi(S_r))_{\psi(x)}$ is spanned by the vectors $z(r)$ of 4.2.2
with respect to the boundary condition change
\[ x(0,u) = P,\qquad x'(0,u) = \phi\!\left(\frac{\tilde{x}+uw}{\|\tilde{x}+uw\|}\right) \tag{*} \]
where $\langle x,w\rangle = 0$. The condition (1) of 4.2.2 is then satisfied (at $t=0$ and hence,
by Lemma 4.2.1, for all $t\in(-r,r)$) by the fact that $\phi$ is an isometry. By (*),
\[ g_{ij}\,z^i\,(x^j)' = 0 \]
at $t=0$, and hence, by Lemma 4.2.2, also at $t=r=\|x\|$, which implies the
statement of Lemma 4.3.1. □
Assume now, without loss of generality, that $V$ is the open ball $\Omega(\mathbf{o},R)$ for some $R>0$.
4.3.2 Theorem. Let $y:\langle 0,a\rangle\to\psi(V)$ be a continuously differentiable curve such
that $y(0)=P$ and $y(a)\in\psi(S_r)$. Then, recalling the notation (3.1.1), we have
\[ s_g(y) \geq r, \]
and equality is attained if and only if $y$ is a geodesic, i.e. $y$ is a reparametrization of $\tilde{y}$
where $\tilde{y}$ is a geodesic parametrized by arc length.
Proof. Consider the function
\[ h:\psi(V)\to\mathbb{R} \]
given by
\[ h(x) = \|\psi^{-1}x\|. \]
Then the function $h$ is smooth on $\psi(V)\smallsetminus\{P\}$. By Lemma 4.3.1, the vector
\[ \left(\frac{\partial h(x)}{\partial x^i}\right)_i \tag{a} \]
is a positive multiple of the derivative at $t=\|\psi^{-1}(x)\|$ of the geodesic
\[ \psi\bigl(t\,\psi^{-1}(x)/\|\psi^{-1}(x)\|\bigr). \tag{b} \]
We have
\[ g^{ij}\,\frac{\partial h(x)}{\partial x^i}\frac{\partial h(x)}{\partial x^j} = 1. \tag{c} \]
(Change coordinates so that one coordinate vector will be the derivative of (b) at
$t=\|\psi^{-1}(x)\|$ and the other, $g$-orthogonal coordinate vectors will be tangent vectors
at $x$ to $\psi(S_{\|\psi^{-1}(x)\|})$. Then the contributions to (c) in the new coordinates from all
but $i=j=1$ will be 0, and the contribution from the first coordinate is 1 by the
fact that the geodesic (b) is parametrized by arc length, and $\phi$ is an isometry.) Hence,
by the finite-dimensional Cauchy-Schwarz inequality (see 4.4 of Appendix A), for
any $z\in\mathbb{R}^n$,
\[ \sqrt{g_{ij}(x)\,z^iz^j} \geq \frac{\partial h(x)}{\partial x^i}\,z^i \]
where equality arises if and only if $z$ is a positive multiple of (a). Therefore,
\[ s_g(y) = \int_{\langle 0,a\rangle}\sqrt{g_{ij}(y(t))\,(y^i)'(t)\,(y^j)'(t)}\,dt \geq \int_{\langle 0,a\rangle}h(y(t))'\,dt = h(y(a)) = r \]
where equality arises if and only if $y'(t)$ is a positive multiple of a tangent vector
of a geodesic of the form (b) for $y(t)=x$ almost everywhere in $t$ (and hence
everywhere, by continuity). □
5 Exercises
(1) Prove that for $\mathbf{y}\in V_{a,b,\mathbf{p},\mathbf{q}}$, and $\mathbf{h}\in V_{a,b,\mathbf{o},\mathbf{o}}$, we have
\[ S(\mathbf{y}+\mathbf{h}) - S(\mathbf{y}) = \int_a^b D_{\mathbf{y}}(t)\,\mathbf{h}(t)\,dt + M(\mathbf{h})\,\|\mathbf{h}\| \]
where
\[ M:V_{a,b,\mathbf{o},\mathbf{o}}\to\mathbb{R} \]
satisfies
\[ \lim_{\mathbf{h}\to\mathbf{o}}M(\mathbf{h}) = 0 \]
and
\[ (D_{\mathbf{y}}(t))_i = \frac{\partial L(t,\mathbf{y}(t),\mathbf{y}'(t))}{\partial x_i} - \frac{d}{dt}\frac{\partial L(t,\mathbf{y}(t),\mathbf{y}'(t))}{\partial v_i}. \]
The function $D_{\mathbf{y}}(t)$ is an example of what we call a Fréchet derivative,
although it is more common to consider this concept on normed vector spaces
(while $V_{a,b,\mathbf{p},\mathbf{q}}$ is an affine space). [Hint: Mimic the proof of Theorem 1.2, but
keep in mind that $\mathbf{h}$ plays a slightly different role, $\mathbf{h}(t)$ now having values in
$\mathbb{R}^n$.]
(2) Find the critical functions $\langle a,b\rangle\to\mathbb{R}$ for the functional
\[ S(y) = \int_a^b y(x)\sqrt{1+(y'(x))^2}\,dx \]
on continuously differentiable functions $y\in V_{a,b,p,q}$, $p,q\in\mathbb{R}$, $p,q>0$.
(3) Find the Euler-Lagrange equation for the functional
\[ S(y) = \int_0^1 y^2\,(y')^2\,dx. \]
(4) Find the critical functions $\langle a,b\rangle\to\mathbb{R}$ for the functional
\[ S(y) = \int_a^b\bigl(p^2(y')^2+q^2y^2\bigr)\,dx. \]
(5) By reversing the coordinates in Example 2.2 (i.e. making the vertical coordinate
the independent and the horizontal coordinate the dependent variable),
find an alternate solution to the brachistochrone problem using the method of
Example 2.1.
(6) Find the critical functions for the functional
\[ S(u,v) = \int_0^1\bigl((u')^2+(v')^2+u'v'\bigr)\,dx. \]
(7) Prove in detail the parametric form (2.2.2) of the solution of the brachistochrone
problem.
(8) Prove that the formula (3.1.1) of 3.1 does not depend on the parametrization
of a piecewise continuously differentiable curve $L$.
(9) The hyperbolic plane is the upper half-plane of complex numbers, i.e. the set
\[ \mathbb{H} = \{x+iy\in\mathbb{C} \mid y>0\} \]
with the Riemannian metric $g_{ij}$ associated, at a point $x+iy\in\mathbb{H}$, with the
matrix
\[ \begin{pmatrix} 1/y^2 & 0 \\ 0 & 1/y^2 \end{pmatrix}. \]
Using the geodesic equation, determine the geodesics in $\mathbb{H}$.
(10) (Spherical geometry) Consider, on
\[ \mathbb{C} = \{x+iy \mid x,y\in\mathbb{R}\}, \]
the Riemann metric $g_{ij}$ associated, at a point $x+iy\in\mathbb{C}$, with the matrix
\[ \begin{pmatrix} 1/(1+x^2+y^2) & 0 \\ 0 & 1/(1+x^2+y^2) \end{pmatrix}. \]
Using the geodesic equations, determine the geodesics in this space.
15 Tensor Calculus and Riemannian Geometry
The attentive reader probably noticed that the concept of a Riemann metric on an
open subset of Rn which we introduced in the last chapter, and the related material
on geodesics, beg for a generalization to manifolds. Although this is not quite as
straightforward as one might imagine, the work we have done in the last chapter
gets us well underway. A serious problem we must address, of course, is how the
concepts we introduced behave under change of coordinates. It turns out that what
we have said on covariance and contravariance in manifolds is not quite enough: we
need to discuss the notation of tensor calculus.
Additionally, it turns out that discussing geodesics in a Riemann metric directly
would cause us to copy many expressions over and over unnecessarily. There is a
natural intermediate notion which axiomatizes the Christoffel symbols of the second
kind directly, without referring to a Riemann metric. This gives rise to the concept
of an affine connection. In the presence of an affine connection, we can discuss
geodesics, but also the important geometric concepts of torsion and curvature. We
will show that vanishing of torsion and curvature characterizes, in an appropriate
sense, the canonical affine connection on Rn (the flat connection).
We will define the notion of a Riemann manifold, and show how it canonically
specifies an affine connection, known as the Levi-Civita connection. This will lead
us to the concept of curvature of a Riemann manifold. We will show that locally, a
Riemann manifold with zero curvature is isometric to an open subset of Rn . We will
also show that every oriented Riemann manifold in dimension 2 has a compatible
structure of a Riemann surface.
Although we make no reference to physics, the present chapter gives a good
rigorous foundation for the mathematics of general relativity theory. In fact, the
notation we use (writing out the indices in tensors) is closer to physics than is
customary in most mathematical texts. As we shall see, this notation does not
sacrifice rigor, and can make calculations with tensors more transparent by showing
explicitly which coordinates we are contracting.
To comment on the title of this chapter, by tensor calculus, one usually means
the basic development of tensor fields, their transformation under changes of
coordinates, and the covariant derivative. Riemannian geometry develops the same

concepts further and on a higher level of abstraction. By making a kind of a

vertical slice through the concepts, we are hoping to make advanced geometry more
accessible to the reader.
Riemannian geometry is a vast subject, and here we only explore its very
beginnings. For further study of differential geometry and Riemannian geometry,
we recommend [10, 16, 21].
From here on, we commonly drop the bold-faced letter convention from 1.2 of
Chapter 3. Exceptions will be made where we specifically need to refer to material
of previous chapters, such as Comment 1.1 below.
1 Tensor calculus
1.1 Tensors and tensor fields
Let $M$ be a smooth manifold and let $x\in M$. An $m$-times contravariant and $n$-times
covariant tensor (or, more briefly, tensor of type $(m,n)$) at $x$ is simply an element of
\[ (TM_x)^{\otimes m}\otimes((TM_x)^*)^{\otimes n}. \tag{1.1.1} \]
A smooth tensor field $T$ on $M$ of type $(m,n)$ is a map assigning to each $x\in M$ a
tensor $T_x$ at $x$ of type $(m,n)$ such that for a smooth coordinate system $h:U\to\mathbb{R}^N$
at any point $x\in M$,
\[ \underbrace{(h_*\otimes\cdots\otimes h_*)}_{m+n\ \text{times}}\,T_y,\qquad y\in U \tag{1.1.2} \]
depends smoothly on $y$. (By $h_*$, we mean $Dh_x$ on the contravariant coordinates and
$(D(h^{-1})_{h(x)})^*$ on the covariant coordinates. Note also that the tangent space of $\mathbb{R}^N$
is identified canonically with $\mathbb{R}^N$, so the target of (1.1.2) is canonically identified
with the same finite-dimensional vector space for all $y\in U$.)
For example, therefore, a smooth tensor field of type .1; 0/ is the same as a
smooth vector field, and a smooth tensor field of type .0; 1/ is the same as a smooth
1-form.
Comment: The reader may wonder why the convention on using the terms
covariant and contravariant when referring to tensors is opposite to the functoriality
we observed in 2.4 of Chapter 12. The reason is that the traditional terminology
on tensors (which we follow here) focuses on coordinates rather than the objects
themselves. In other words, one does not refer to functoriality with respect to smooth
maps, but with respect to coordinate change, which turns out to be the opposite. To
give an example, to use the notation of Chapter 14, a tangent vector would be, in
local coordinates $h^1,\dots,h^n$, written as
\[ v = v^i\,\frac{\partial}{\partial h^i} \]
(using the Einstein convention). From the point of view of tensor calculus, we write
$v$ simply as $v^i$. Note that composing the coordinate system $h$ with $\phi$ where $\phi:U\to V$
is a diffeomorphism of open subsets of $\mathbb{R}^n$, we have
\[ \frac{\partial}{\partial(\phi\circ h)^i} = \frac{\partial h^j}{\partial\phi^i}\,\frac{\partial}{\partial h^j} \]
by the chain rule, and thus with respect to the coordinates $(\phi\circ h)^i$, the coordinates
of $v$ will be
D 1 .v 1 ; : : : ; v n /T :
1.2 A coordinate-free meaning for indices
Even though we have not specified coordinates, it is often customary to give a tensor
of type $(m,n)$ $m$ different superscripts and $n$ different subscripts, e.g.
\[ T^{i_1i_2\dots i_m}_{j_1j_2\dots j_n}. \]
The superscripts and subscripts are formal symbols each one of which refers simply
to a particular factor of (1.1.1). For example a tensor of type $(2,2)$ may be then
denoted by
\[ T^{ij}_{k\ell}. \]
This notation has immediate benefits. For example, the Einstein convention now
makes sense for tensors: for tensors $T$, $S$, by the symbol
\[ T^{\dots i\dots}_{\dots\dots}\,S^{\dots\dots}_{\dots i\dots} = S^{\dots\dots}_{\dots i\dots}\,T^{\dots i\dots}_{\dots\dots} \]
we mean the image of $S\otimes T$ under the map which applies the evaluation map
\[ (TM_x)^*\otimes(TM_x)\to\mathbb{R} \]
to the coordinates of $S$ and $T$ labeled by $i$. We stipulate that each index will occur
at most twice, but there may be multiple pairs of coinciding indices, in which case
we apply multiple evaluation maps: For example,
\[ T^{ij}_{k\ell}\,S^{k\ell}_{ij}\in\mathbb{R} \]
makes sense for two tensors of type $(2,2)$ at the same point $x\in M$. This operation
is often referred to as contraction.
The other benefit is that we can easily talk about symmetric and antisymmetric
tensors. Recall that for two vector spaces $V, W$ there is a canonical interchange map
$$ V \otimes W \to W \otimes V, \qquad v \otimes w \mapsto w \otimes v. $$
A tensor
$$ T^{\dots i \dots j \dots}_{\dots} \quad\text{or}\quad T^{\dots}_{\dots i \dots j \dots} $$
is called symmetric (resp. antisymmetric) in the coordinates $i, j$ if applying the
interchange map to those coordinates gives again $T$ (resp. $-T$). We may also say
that a tensor is symmetric (resp. antisymmetric) in a set of coordinates $S$ if it is
symmetric (resp. antisymmetric) in any pair $i, j \in S$. Note that then, for example,
a smooth tensor field of type $(0,k)$ antisymmetric in all its coordinates is the same
thing as a smooth $k$-form.
One example needs to be discussed explicitly: recall from 2.5 of Chapter 11 that
the canonical map
$$ \kappa_{V,W} : V^* \otimes W \to \mathrm{Hom}(V, W), \qquad (\kappa_{V,W}(f \otimes w))(v) = f(v)\,w $$
is an isomorphism when $V$, $W$ are finite-dimensional. We then have a smooth tensor
field of type $(1,1)$ on any manifold $M$ which, at any point $x \in M$, is given by
$$ \kappa^{-1}_{TM_x,\,TM_x}(\mathrm{Id}_{TM_x}). $$
This tensor field is denoted by
$$ \delta^i_j. $$
1.3 Comment
The reader probably noticed the difference between the way subscripts and
superscripts are used in the context of tensors on a Riemann manifold, and the
way we used them in the last chapter: in the last chapter, an index $i$ stood simply
for the $i$'th coordinate, where $i$ is a number, and the Einstein convention was used
to sum terms where the same $i$ occurs twice. In the context of tensors, no number
is plugged in for $i$; it is simply a label denoting which factor of the tensor product
we are working with, and the Einstein convention means an application of the
evaluation map.
Conveniently, these two points of view are somewhat interchangeable: if we pick
a local coordinate system $h$, then we have a basis $\partial/\partial h^i$ of $TM_x$, and a dual basis $dh^i$
of $(TM_x)^*$, and the evaluation map can indeed be computed by summing products
of terms coupling a basis element with the corresponding element of the dual basis.
Nevertheless, one must be careful to note that the coordinate-free tensor context
is more restrictive: the tensor notation should be used only for quantities which
are intrinsically coordinate-free. For example, let us take the Christoffel symbols
$\Gamma^k_{ij}$. On an open set in $\mathbb{R}^n$, the tangent space is canonically identified with $\mathbb{R}^n$, so
we could certainly view $\Gamma^k_{ij}$ as a tensor of type $(1,2)$. The trouble is, however, that
if we change coordinates, i.e. apply a diffeomorphism to another open subset of
$\mathbb{R}^n$, this will not preserve the tangent space identification, and we find that it would
not preserve the tensor $\Gamma^k_{ij}$ we just defined, i.e. for each choice of coordinates,
we would get a different tensor. Usually, this is expressed by saying that $\Gamma^k_{ij}$ is not
a tensor and transforms according to different rules (see Exercise (1) below). It
is more accurate, however, to say that there is no canonical tensor given by the
Christoffel symbols.
2 Affine connections
2.1 The definition of an affine connection
There is no general natural way of taking a derivative of a vector field by another
vector field on a smooth manifold. However, we can give a manifold additional
structure which enables such operations, and specify axioms which make this
operation “behave like a derivative”. This leads to the notion of an affine connection.
Consider the $\mathbb{R}$-vector space $W(M)$ of all smooth vector fields on a smooth manifold
$M$. An affine connection (or, more briefly, connection) on $M$ is a bilinear map
$$ W(M) \times W(M) \to W(M), \qquad (u, v) \mapsto \nabla_u(v) $$
such that for a smooth function $f : M \to \mathbb{R}$, we have
$$ \nabla_{fu}(v) = f\,\nabla_u(v) \tag{2.1.1} $$
and
$$ \nabla_u(fv) = (\partial_u f)\,v + f\,\nabla_u(v). \tag{2.1.2} $$
By $\partial_u f$ we mean the function which, at $x \in M$, is the directional derivative at $x$ of $f$
by the vector $u(x)$. (Note that (2.1.2) can be interpreted as a kind of Leibniz rule.)
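The two axioms can be checked numerically in the simplest situation, the plane $\mathbb{R}^2$ with $\nabla_u(v)$ taken to be the directional derivative of $v$ along $u$ (the canonical connection of Example 2.3.1). The sketch below approximates derivatives by central differences; the step size, the helper names and the sample fields $u$, $v$, $f$ are all hypothetical choices, and the comparison is only up to a small tolerance.

```python
H = 1e-5  # finite-difference step (an arbitrary small number)

def nabla(u, v):
    """Canonical connection on R^2: differentiate v at p in the direction u(p)."""
    def result(p):
        ux, uy = u(p)
        plus = v((p[0] + H * ux, p[1] + H * uy))
        minus = v((p[0] - H * ux, p[1] - H * uy))
        return tuple((a - b) / (2 * H) for a, b in zip(plus, minus))
    return result

def d(u, f):
    """Directional derivative of the scalar function f along u."""
    def result(p):
        ux, uy = u(p)
        return (f((p[0] + H * ux, p[1] + H * uy))
                - f((p[0] - H * ux, p[1] - H * uy))) / (2 * H)
    return result

def scale(f, v):
    """The vector field f * v."""
    return lambda p: tuple(f(p) * c for c in v(p))

def close(a, b, tol=1e-6):
    return all(abs(x - y) < tol for x, y in zip(a, b))

# Hypothetical sample data: two vector fields, a function and a point.
u = lambda p: (p[1], 1.0)
v = lambda p: (p[0] * p[1], p[0] ** 2)
f = lambda p: 1.0 + p[0] + p[1] ** 2
p0 = (0.3, -0.7)

# Axiom (2.1.1): nabla_{fu}(v) = f nabla_u(v).
lhs_211 = nabla(scale(f, u), v)(p0)
rhs_211 = tuple(f(p0) * c for c in nabla(u, v)(p0))

# Axiom (2.1.2): nabla_u(fv) = (d_u f) v + f nabla_u(v).
lhs_212 = nabla(u, scale(f, v))(p0)
rhs_212 = tuple(d(u, f)(p0) * a + f(p0) * b
                for a, b in zip(v(p0), nabla(u, v)(p0)))
```

Comparing `lhs_211` with `rhs_211` and `lhs_212` with `rhs_212` via `close` illustrates the two axioms at the chosen point; it is a numerical illustration, not a proof.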
2.2 Locality
Perhaps the first thing to notice about affine connections is that they are “local” in
the following sense: the value of $\nabla_u(v)$ at a point $x \in M$ depends only on
the value $u(x)$ and the values of $v$ on the image of any continuously differentiable
oriented curve
$$ \gamma : (-a, a) \to M $$
(such that $\gamma'(t) \neq 0$) where $\gamma(0) = x$ and $\gamma'(0) = u(x)$. Choosing vector
fields $e_1, \dots, e_N$ such that $e_1(x), \dots, e_N(x)$ form a basis of $TM_x$, by (2.1.1) and
bilinearity, the values $\nabla_{e_i}(v)(x)$ clearly determine the value of $\nabla_u(v)$ at $x$ for any $u$. To prove the
statement about the $v$ variable, first note that $\nabla_u(v)(x)$ and $\nabla_u(w)(x)$ are equal if $v$,
$w$ coincide in a neighborhood of $x$: in such a case, there exists a smooth function
$h : M \to \mathbb{R}$ such that $h$ is constant 1 in a neighborhood of $x$, and $hv = hw$. This
implies our claim by axiom (2.1.2). Now choose local coordinates $h : U \to \mathbb{R}^n$,
$h(x) = 0$. Then without loss of generality, $M = h[U]$, which is an open set in $\mathbb{R}^n$
(in other words, $h$ is the inclusion). We can write
$$ v = f^i e_i, \qquad w = g^i e_i, $$
and by our assumption, $f^i$ and $g^i$ coincide on the image $\gamma[(-a, a)]$ of $\gamma$. In
particular,
$$ \partial_u f^i(x) = \partial_u g^i(x). $$
Consequently, again, our claim follows from axiom (2.1.2).

Consider now a function $v$ assigning to $y \in \gamma[(-a, a)]$ an element in $TM_y$ which
is smooth in the sense that $h_* v \circ \gamma : (-\varepsilon, \varepsilon) \to \mathbb{R}^n$ is a smooth function, where $h$ are
local coordinates at $x$. We shall call such a function a smooth vector field defined
on $\gamma[(-a, a)]$. By the Implicit Function Theorem, a smooth vector field defined
on $\gamma[(-a, a)]$ extends to a smooth vector field in a neighborhood of $x$, and by the
previous remarks, $\nabla_u(v)(x)$ is well defined even though $v$ is not a priori a smooth
vector field defined on a neighborhood of $x$. This is sometimes important.
2.3 Examples
1. The most basic example is the canonical connection in $\mathbb{R}^n$: since the tangent
space of $\mathbb{R}^n$ is canonically identified with $\mathbb{R}^n$, vector fields are canonically
identified with $\mathbb{R}^n$-valued functions, and we may simply define the value of $\nabla_u(v)$
at $x$ as the $u(x)$-directional derivative of $v$ (considered as an $\mathbb{R}^n$-valued function)
at $x$.
2. Let us now generalize this example in the spirit of the previous section. Let $U$ be
an open subset of $\mathbb{R}^n$ and let $g_{ij}$ be a Riemann metric on $U$. Define
$$ \nabla_u(v) = u^i \left( \frac{\partial v^j}{\partial x^i}\, e_j + v^j\, \Gamma^k_{ij}\, e_k \right) \tag{2.3.1} $$
using the standard coordinates $x^i$ in $\mathbb{R}^n$, and letting $e_i$ be the standard basis of
$\mathbb{R}^n$. The axioms (2.1.1), (2.1.2) are readily verified. To explain where this formula
comes from, note that by the chain rule, if $x = x(t)$ is a geodesic, then (2.3.1)
gives precisely
$$ \nabla_{x'(t)}\, x' = 0, \tag{2.3.2} $$
which really “looks like” a generalization of the equation of a straight line in $\mathbb{R}^n$,
although the left-hand side must be taken in the sense of the remarks made in 2.2.
Perhaps the main purpose of this section is to develop this example further,
generalize it to Riemann manifolds and show its significance to a Riemann metric;
but we need to develop more of the general theory of connections first.
2.4 Parallel transport and geodesics
2.4.1
Let $\gamma : \langle a, b\rangle \to M$ be a continuously differentiable parametrized curve in a smooth
manifold $M$ with affine connection $\nabla$ (as usual, we assume that $\gamma'(t) \neq 0$ for any
$t \in \langle a, b\rangle$ and take the one-sided derivatives at the boundary points). Consider the
equation
$$ \nabla_{\gamma'(t)}\, y(\gamma(t)) = 0 \tag{*} $$
where $y$ is a smooth vector field defined on $\gamma[\langle a, b\rangle]$. Clearly, we can treat this
problem locally, and hence we may work in a coordinate neighborhood $U$ of $M$,
where we have a smooth coordinate system $h : U \to \mathbb{R}^N$. Let $e_i = \partial/\partial h^i$. Writing
$$ y = x^i e_i, $$
the equation (*) becomes a system of first-order linear differential equations in the
coefficients $x^i$. Thus, by Theorem 1.3 of Chapter 7, there is a unique solution to the
equation (*) with given value
$$ v = y(\gamma(a)) \in TM_{\gamma(a)}. $$
This solution is called the parallel transport of the vector $v$ along the parametrized
curve $\gamma$ with respect to the affine connection $\nabla$. It is important to note, however, that
performing parallel transport on a vector $v = y(\gamma(a)) \in TM_{\gamma(a)}$ along a parametrized
closed curve may produce a different vector $v \neq y(\gamma(b)) \in TM_{\gamma(b)} = TM_{\gamma(a)}$. This
is related to two quantities known as torsion and curvature associated with the affine
connection $\nabla$, which we will discuss in the next section.
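The failure of parallel transport to return the original vector along a closed curve can be seen numerically. The sketch below integrates the linear system (*) in the form given by (2.3.1) along a circle of latitude on the unit sphere. The spherical coordinates $(\theta, \varphi)$ and the Christoffel symbols of the round metric used here are standard values assumed for this illustration, not taken from the text; the function name, the step count and the Runge-Kutta integrator are likewise hypothetical choices.

```python
import math

def transport_along_latitude(theta0, y0, steps=2000):
    """Solve dy^k/dt = -Gamma^k_{ij} gamma'^i y^j along gamma(t) = (theta0, t), 0 <= t <= 2*pi.

    Assumed (standard) nonzero Christoffel symbols of the unit sphere in (theta, phi):
    Gamma^theta_{phi phi} = -sin(theta) cos(theta),
    Gamma^phi_{theta phi} = Gamma^phi_{phi theta} = cos(theta) / sin(theta).
    """
    s, c = math.sin(theta0), math.cos(theta0)

    def rhs(y):
        y_theta, y_phi = y
        # gamma' = (0, 1), so only the terms with i = phi survive.
        return (s * c * y_phi, -(c / s) * y_theta)

    h = 2 * math.pi / steps
    y = tuple(y0)
    for _ in range(steps):  # classical fourth-order Runge-Kutta steps
        k1 = rhs(y)
        k2 = rhs(tuple(a + 0.5 * h * b for a, b in zip(y, k1)))
        k3 = rhs(tuple(a + 0.5 * h * b for a, b in zip(y, k2)))
        k4 = rhs(tuple(a + h * b for a, b in zip(y, k3)))
        y = tuple(a + h * (p + 2 * q + 2 * r + w) / 6
                  for a, p, q, r, w in zip(y, k1, k2, k3, k4))
    return y

start = (1.0, 0.0)                      # components (y^theta, y^phi) at t = 0
end = transport_along_latitude(math.pi / 3, start)
```

At latitude $\theta_0 = \pi/3$ the exact solution of this system rotates the vector by the angle $2\pi\cos\theta_0 = \pi$, so `end` comes out close to the negative of `start` even though the curve is closed.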
2.4.2 Geodesics
We can now also see that the concept of a geodesic generalizes to any smooth
manifold with an affine connection. In effect, if, in a local coordinate system $h$,
we write $e_i = \partial/\partial h^i$, and define Christoffel symbols of the connection $\nabla$ by
$$ \nabla_{e_i}(e_j) = \Gamma^k_{ij}\, e_k, \tag{*} $$
then in this generalized sense, any affine connection in local coordinates is given
by the formula (2.3.1) of 2.3 (by the axioms of 2.1). We then see that the
“geodesic equation” (2.3.2) written in coordinates becomes a (non-linear) second-order
ordinary differential equation, and hence locally has solutions uniquely
determined by the value and derivative at a single point (by Corollary 4.1.2 of
Chapter 14).
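To make the shape of this second-order equation explicit, one can substitute $u = v = x'(t)$ into (2.3.1) and use the chain rule $\dot x^i\, \partial \dot x^j/\partial x^i = \ddot x^j$ along the curve (in the sense of 2.2). The dot notation for $d/dt$ is introduced here only for this sketch.

```latex
\begin{aligned}
\nabla_{x'(t)}\, x'
  &= \dot x^i \left( \frac{\partial \dot x^j}{\partial x^i}\, e_j
       + \dot x^j\, \Gamma^k_{ij}\, e_k \right) \\
  &= \left( \ddot x^k + \Gamma^k_{ij}\, \dot x^i \dot x^j \right) e_k ,
\end{aligned}
\qquad \text{so (2.3.2) reads} \qquad
\ddot x^k + \Gamma^k_{ij}\, \dot x^i \dot x^j = 0 .
```

For the canonical connection of $\mathbb{R}^n$ all $\Gamma^k_{ij}$ vanish and the equation reduces to $\ddot x^k = 0$, the equation of a straight line.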
3 Tensors associated with an affine connection: torsion and curvature
Recall the vector space $W(M)$ of smooth vector fields on $M$. We will prove the
following
3.1 Lemma. Suppose we have a multi-linear function
$$ \Phi : \underbrace{W(M) \times \dots \times W(M)}_{k \text{ times}} \to W(M) $$
which has the property that
$$ \Phi(u_1, \dots, u_{i-1}, y u_i, u_{i+1}, \dots, u_k)_x = y(x)\, \Phi(u_1, \dots, u_k)_x \tag{*} $$
for every smooth function $y : M \to \mathbb{R}$ and every $i = 1, \dots, k$. Then $\Phi(u_1, \dots, u_k)_x$
only depends on $(u_1)_x, \dots, (u_k)_x$, and defines a smooth tensor field of type $(1,k)$.
Remark: Multi-linearity of $\Phi$, as we defined it, guarantees condition (*) for a
constant function $y$.
Proof. By the same reasoning as in 2.2, the value $\Phi(u_1, \dots, u_k)_x$ depends only on
the values of $u_i$ in an open neighborhood $U$ of $x$. We may assume $U$ to be a coordinate
neighborhood with a coordinate function $h : U \to \mathbb{R}^N$, and let $e_i = \partial/\partial h^i$. Then we
may write
$$ u_i = {}_i y^j\, e_j $$
for smooth functions ${}_i y^j$ on $U$, so (*) implies that $\Phi(u_1, \dots, u_k)_x$ is the sum of
$$ {}_1 y^{j_1}(x) \cdots {}_k y^{j_k}(x)\, \Phi(e_{j_1}, \dots, e_{j_k})_x $$
over all possible choices $1 \le j_1, \dots, j_k \le N$. This implies our statement. □
3.2
Let $\nabla$ be an affine connection on a smooth manifold $M$. We will give two examples
of quantities satisfying the assumptions of Lemma 3.1, namely
$$ T(u, v) = \nabla_u(v) - \nabla_v(u) - [u, v] $$
and
$$ R(u, v, w) = \nabla_u \nabla_v(w) - \nabla_v \nabla_u(w) - \nabla_{[u,v]}(w) $$
where $[u, v]$ is the Lie bracket of the smooth vector fields $u, v$ (see Section 7 of
Chapter 6, and Exercise (5)).
Lemma. The functions $T$, $R$ satisfy the hypotheses of Lemma 3.1, and hence define
smooth tensor fields $T^k_{ij}$, $R^\ell_{ijk}$. Furthermore, both of these tensors are antisymmetric
in the coordinates $i, j$.
Remark: The tensors $T^k_{ij}$, $R^\ell_{ijk}$ are called the torsion tensor and curvature tensor,
respectively.
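Proof of the Lemma: Multilinearity is obvious, as is antisymmetry in the specified
coordinates. Condition (*) of Lemma 3.1 is a direct calculation:
$$ \begin{aligned}
T(yu, v) &= \nabla_{yu}(v) - \nabla_v(yu) - [yu, v] \\
&= y\nabla_u(v) - y\nabla_v(u) - (\partial_v y)\,u - y[u, v] + (\partial_v y)\,u \\
&= y\,T(u, v),
\end{aligned} $$
$$ \begin{aligned}
R(yu, v, w) &= \nabla_{yu}\nabla_v(w) - \nabla_v\nabla_{yu}(w) - \nabla_{[yu,v]}(w) \\
&= y\nabla_u\nabla_v(w) - \nabla_v(y\nabla_u(w)) - \nabla_{y[u,v]}\,w + \nabla_{(\partial_v y)u}\,w \\
&= y\nabla_u\nabla_v(w) - y\nabla_v\nabla_u(w) - (\partial_v y)\nabla_u(w) - y\nabla_{[u,v]}\,w + (\partial_v y)\nabla_u(w) \\
&= y\,R(u, v, w),
\end{aligned} $$
$$ \begin{aligned}
R(u, v, yw) &= \nabla_u\nabla_v(yw) - \nabla_v\nabla_u(yw) - \nabla_{[u,v]}(yw) \\
&= \nabla_u(y\nabla_v(w)) + \nabla_u((\partial_v y)\,w) - \nabla_v(y\nabla_u(w)) - \nabla_v((\partial_u y)\,w) \\
&\quad - y\nabla_{[u,v]}(w) - (\partial_{[u,v]} y)\,w \\
&= y\nabla_u\nabla_v(w) + (\partial_u y)\nabla_v(w) + (\partial_v y)\nabla_u(w) + (\partial_u\partial_v y)\,w \\
&\quad - y\nabla_v\nabla_u(w) - (\partial_v y)\nabla_u(w) - (\partial_u y)\nabla_v(w) - (\partial_v\partial_u y)\,w \\
&\quad - y\nabla_{[u,v]}\,w - (\partial_u\partial_v y)\,w + (\partial_v\partial_u y)\,w \\
&= y\,R(u, v, w).
\end{aligned} $$
The other cases follow by antisymmetry. □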
3.3 Example
The connection defined in Example 2.3.2 has zero torsion. This immediately follows
from the fact that
$$ \Gamma^k_{ij} = \Gamma^k_{ji}. \tag{3.3.1} $$
Compare this to the beginning of Subsection 3.3 of Chapter 14, where we specifically
defined the Christoffel symbols in such a way as to make (3.3.1) true.
In fact, more generally, we see from the comments made in 2.4.2 and formula
(2.3.1) of 2.3 that any affine connection has zero torsion if and only if, in local
coordinates, it satisfies (3.3.1) in the sense of 2.4.2.
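The equivalence can be read off by evaluating the torsion on coordinate vector fields $e_i = \partial/\partial h^i$. The short computation below is a sketch that uses only the definition of $T$, the defining relation $\nabla_{e_i}(e_j) = \Gamma^k_{ij} e_k$ of 2.4.2, and the fact that coordinate vector fields commute, $[e_i, e_j] = 0$.

```latex
\begin{aligned}
T(e_i, e_j) &= \nabla_{e_i}(e_j) - \nabla_{e_j}(e_i) - [e_i, e_j] \\
            &= \Gamma^k_{ij}\, e_k - \Gamma^k_{ji}\, e_k - 0 , \\
T^k_{ij}    &= \Gamma^k_{ij} - \Gamma^k_{ji} .
\end{aligned}
```

Hence the torsion tensor vanishes exactly when the Christoffel symbols are symmetric in their lower indices.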
3.4 A characterization of the Euclidean connection
Theorem. Let $M$ be a smooth manifold with an affine connection $\nabla$, and let
$x \in M$. Then there exists an open neighborhood of $x$ in which $\nabla$ has torsion
and curvature tensors equal identically to 0 if and only if there exists an open
neighborhood $U$ of $x$ and a coordinate system $h : U \to \mathbb{R}^n$ which sends $\nabla$ restricted
to $U$ to the canonical connection (Example 2.3.1) on $\mathbb{R}^n$, restricted to $h[U]$.
Proof. Clearly, the Euclidean connection has torsion and curvature 0, and hence the
existence of the coordinate system $h : U \to \mathbb{R}^n$ with the specified properties implies
that $\nabla$ is torsion and curvature free on $U$.
On the other hand, consider a connection $\nabla$ on $M$ which is torsion and curvature
free on an open neighborhood of $x$. Choose a basis $e_1, \dots, e_n$ of $TM_x$. Let
$\gamma : (-a_1, a_1) \to M$, $\gamma(0) = x$, $\gamma'(0) = e_1$ be a geodesic with respect to $\nabla$. Now
denote the parallel transport of $e_2$ along $\gamma$ also by $e_2$ at each point $t_1 \in (-a_1, a_1)$. Let
$\gamma_{t_1} : (-a_2, a_2) \to M$ be a geodesic with $\gamma_{t_1}(0) = \gamma(t_1)$, $\gamma_{t_1}'(0) = e_2$. Note that we
may assume the number $a_2 > 0$ is independent of $t_1$ because of the smooth dependence
of geodesics on boundary conditions (the argument of 2.4 extends verbatim to this
situation). By the same argument, we may also consider $\gamma$ as a smooth function
$$ \gamma : (-a_1, a_1) \times (-a_2, a_2) \to M. $$
We will denote the two independent variables by $t_1 \in (-a_1, a_1)$, $t_2 \in (-a_2, a_2)$. We
clearly have
$$ \left[ \frac{\partial}{\partial t_1}, \frac{\partial}{\partial t_2} \right] = 0 \tag{1} $$
by the commutation of partial derivatives. Write
$$ e_i = \frac{\partial}{\partial t_i}, \tag{2} $$
$i = 1, 2$. By the fact that $\nabla$ has 0 curvature, parallel transports along the curves $\gamma_{t_1,?}$
and $\gamma_{?,t_2}$ with constant $t_1$ resp. $t_2$ therefore commute. We conclude in particular that
$$ \nabla_{e_2}(e_1) = 0, $$
since it is true at $t_2 = 0$ by our definition. Since $\nabla$ has 0 torsion, we also have
$$ \nabla_{e_1}(e_2) = 0, $$
and since the curvature is 0,
$$ \nabla_{e_2} \nabla_{e_1}(e_1) = \nabla_{e_1} \nabla_{e_2}(e_1) = 0. $$
Hence, in fact,
$$ \nabla_{e_1}(e_1) = 0, $$
since it is true at $t_2 = 0$ by our definitions. In conclusion,
$$ \nabla_{e_i}(e_j) = 0 \tag{3} $$
for $i, j \in \{1, 2\}$.

Now assume, by induction, that we have a function
$$ \gamma : (-a_1, a_1) \times \dots \times (-a_k, a_k) \to M $$
such that if we define (2), then (3) is true for all $i, j \in \{1, \dots, k\}$. If $k < n$,
denote the parallel transport of $e_{k+1}$ to any of the points $\gamma(t_1, \dots, t_k)$ by the
curves $\gamma(t_1, \dots, t_{i-1}, ?, t_{i+1}, \dots, t_k)$ (with only one $t_i$ non-constant) by $e_{k+1}$. Smooth
dependence on boundary conditions implies that $\gamma$ is a smooth function of the $k+1$
variables $t_1, \dots, t_{k+1}$ on some set
$$ (-a_1, a_1) \times \dots \times (-a_{k+1}, a_{k+1}) $$
and applying the above argument to individual pairs of coordinates gives (3) for
$i, j \in \{1, \dots, k+1\}$.
Thus, we may assume $k = n$. But then $\gamma$ is locally the inverse of a local
coordinate system on $M$ at $x$ (by the Inverse Function Theorem), and (3) implies
that this coordinate system carries the connection $\nabla$ to the Euclidean connection
of Example 2.3.1, as claimed. □
4 Riemann manifolds
The purpose of this section is to put, finally, everything together. We define a

connection canonically associated with a Riemann metric on a smooth manifold,
called the Levi-Civita connection. We define the curvature of a Riemann manifold,
and prove that vanishing of the curvature locally characterizes Euclidean geometry
up to isometry.
4.1 Riemann metrics
A (smooth) Riemann metric on a smooth manifold $M$ is a smooth tensor field of type
$(0,2)$, denoted usually by $g_{ij}$, which is symmetric and such that for each $x \in M$, the
symmetric bilinear form on $TM_x$ defined by
$$ g(u, v) = g_{ij}\, u^i v^j $$
is positive-definite (and hence defines a real inner product). A smooth manifold with
a Riemann metric is called a Riemann manifold. The fact that we considered an inner
product on $TM_x$ (as opposed to $(TM_x)^*$) is merely a convention: we claim that given
a Riemann metric $g_{ij}$, there exists a unique tensor of type $(2,0)$ denoted by $g^{ij}$ such
that
$$ g_{ij}\, g^{jk} = \delta_i^k, $$
which, moreover, defines a positive-definite symmetric bilinear form on $(TM_x)^*$:
picking an ordered basis $B$ of $TM_x$, the matrix of $g^{ij}$ with respect to the ordered
basis of $(TM_x)^*$ dual to $B$ is the inverse of the matrix of $g_{ij}$ with respect to $B$, which
is also positive-definite (see Exercise (2)). Similarly, we could have started with a
positive-definite symmetric tensor $g^{ij}$, and a positive-definite symmetric tensor $g_{ij}$
would be determined.
An isometry is a smooth diffeomorphism $f : M \to N$ between Riemannian
manifolds with Riemann metrics $g$, $\tilde g$ such that $f^* \tilde g = g$.
4.1.1 Lemma. Every smooth manifold $M$ has a Riemann metric.
Proof. The statement is certainly true if we replace $M$ by one of its coordinate
neighborhoods $U_i$ (since for an open subset of $\mathbb{R}^n$, we can take the standard inner
product on $\mathbb{R}^n$). Let ${}_i g$ be the Riemann metric on $U_i$, and let $u_i$ be a smooth partition
of unity subordinate to the open cover $(U_i)$. Then
$$ \sum_i u_i\, ({}_i g) $$
is a Riemann metric on $M$. (Note that a linear combination of finitely many
positive-definite symmetric matrices with positive coefficients is a positive-definite
symmetric matrix.) □
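The parenthetical remark at the end of the proof can be illustrated at a single point, where the partition of unity supplies non-negative weights summing to 1 and each local metric is a positive-definite symmetric matrix. The following is a small sketch for $2 \times 2$ matrices using Sylvester's criterion; the matrices, the weights and the helper names are hypothetical sample data.

```python
def is_positive_definite(m):
    """Sylvester's criterion for a symmetric 2x2 matrix [[a, b], [b, d]]."""
    (a, b), (_, d) = m
    return a > 0 and a * d - b * b > 0

def combine(weights, matrices):
    """Entrywise sum of weights[k] * matrices[k] for 2x2 matrices."""
    return [[sum(w * m[r][c] for w, m in zip(weights, matrices))
             for c in range(2)] for r in range(2)]

# Hypothetical values of three local metrics at one point of an overlap,
# and values of the partition of unity there (non-negative, summing to 1).
g1 = [[2.0, 0.5], [0.5, 1.0]]
g2 = [[1.0, -0.3], [-0.3, 3.0]]
g3 = [[5.0, 2.0], [2.0, 1.0]]
weights = [0.2, 0.5, 0.3]

g = combine(weights, [g1, g2, g3])
```

Here `g` has entries close to 2.4, 0.55 and 2.0, and passes the same positivity test as the three matrices it was built from. The test still passes if some weights are zero, as long as at least one is positive, which is the situation at a point covered by only some of the $U_i$.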
4.1.2 The induced Riemann metric

Lemma 4.1.1 is often very useful technically, but is perhaps not very geometric: the
Riemann metric which we proved to exist has no geometric meaning. Typically, we
are dealing with a situation where a Riemann metric is given and we are interested in
its properties. The most common way a Riemann metric can be given is as follows:
suppose we are given a Riemann metric on a smooth manifold $N$, and suppose
$\iota : M \subset N$ is a smooth submanifold (we could more generally consider the situation
when $\iota$ is an immersion). Then we have a naturally induced Riemann metric on $M$,
simply because for $x \in M$, we have an embedding $TM_x \subset TN_x$. To show that this
induced Riemann metric is smooth, recall that $g_{ij}$ is contravariant with respect to $\iota$,
so we know that $(g_{ij})_M = \iota^*((g_{ij})_N)$ is a smooth tensor field.
4.2 Riemann metrics and connections
Let $g_{ij}$ be a Riemann metric on a smooth manifold $M$, and let $\nabla$ be an affine
connection on $M$. We say that the connection $\nabla$ is compatible with the Riemann
metric $g_{ij}$ if $g_{ij}$ is preserved by parallel transport, i.e. for a smooth parametrized
curve with boundary points $x$, $y$, and two vectors $u, v \in TM_x$, if $\tilde u$, $\tilde v$ are the
parallel transports of $u, v$ to $TM_y$, we have
$$ g_{ij}(\tilde u^i, \tilde v^j) = g_{ij}(u^i, v^j). \tag{4.2.1} $$
An “infinitesimal version” of this condition is (dropping the indices)
$$ \partial_u(g(v, w)) = g(\nabla_u(v), w) + g(v, \nabla_u(w)). \tag{2} $$
(See Exercise (4).)
Theorem. For every Riemann metric $g$ on a smooth manifold $M$, there exists a
unique affine connection $\nabla$ on $M$ which is compatible with $g$ and has 0 torsion.
This affine connection $\nabla$ is known as the Levi-Civita connection.
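Proof. We shall prove uniqueness first. Suppose we have a torsion-free affine connection $\nabla$
compatible with the Riemann metric $g$. Let $u, v, w$ be smooth vector fields on $M$.
Compute from (2) and the fact that $\nabla$ is torsion free:
$$ \begin{aligned}
&\partial_u(g(v, w)) + \partial_v(g(u, w)) - \partial_w(g(u, v)) \\
&\quad = g(\nabla_u v, w) + g(\nabla_u w, v) + g(\nabla_v u, w) + g(\nabla_v w, u) - g(\nabla_w u, v) - g(\nabla_w v, u) \\
&\quad = 2g(\nabla_u v, w) - g([u, v], w) + g([u, w], v) + g([v, w], u).
\end{aligned} $$
Therefore,
$$ g(\nabla_u v, w) = \tfrac{1}{2} \left( \partial_u(g(v, w)) + \partial_v(g(u, w)) - \partial_w(g(u, v)) + g([u, v], w) - g([u, w], v) - g([v, w], u) \right). $$
Hence, $\nabla_u v$ is determined by $g$.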
Now we will prove existence. We will first treat the case when $M = U$ is
an open subset of $\mathbb{R}^n$. In this case, consider the connection (2.3.1) constructed in
Example 2.3.2. We already know from Example 3.3 that this connection is torsion
free. To verify that this connection is compatible with the metric $g$, by the chain
rule, it suffices to verify the condition (2) in the case when $u = e_i$, $v = e_j$, $w = e_k$.
Thus, we need to show that
$$ \partial_{e_i}(g(e_j, e_k)) = g(\nabla_{e_i} e_j, e_k) + g(e_j, \nabla_{e_i} e_k), $$
which translates to
$$ \frac{\partial g_{ij}}{\partial x^k} = \Gamma_{kij} + \Gamma_{jik}, $$
which follows directly from equation (3.3.3) of Chapter 14.
Now let $M$ be an arbitrary smooth Riemann manifold, and let $(U_i)$ be a
coordinate cover of $M$. Then by what we just proved, and by locality of connections,
we have smooth torsion free connections on each $U_i$ which are compatible with $g$.
By uniqueness, further, the connections corresponding to $U_i$ and $U_j$ coincide on
$U_i \cap U_j$. Thus, these connections together define a torsion free affine connection on
$M$ compatible with $g$. □
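For coordinate vector fields the Lie brackets vanish, so the uniqueness formula reduces to $g(\nabla_{e_i} e_j, e_k) = \tfrac{1}{2}(\partial_i g_{jk} + \partial_j g_{ik} - \partial_k g_{ij})$, and raising the index gives $\Gamma^k_{ij} = \tfrac{1}{2} g^{k\ell}(\partial_i g_{j\ell} + \partial_j g_{i\ell} - \partial_\ell g_{ij})$. The sketch below evaluates this expression numerically in dimension 2; that it agrees with the definition of the Christoffel symbols in Chapter 14 is an assumption, and the function names, the finite-difference step and the polar-coordinate metric $\mathrm{diag}(1, r^2)$ are hypothetical choices for illustration.

```python
def christoffel(metric, p, h=1e-5):
    """Return G with G[k][i][j] approximating Gamma^k_{ij} at the point p (dimension 2).

    Uses Gamma^k_{ij} = 1/2 g^{kl} (d_i g_{jl} + d_j g_{il} - d_l g_{ij}),
    with partial derivatives replaced by central differences.
    """
    def dg(l):
        # derivative of the metric matrix in the coordinate direction l
        plus, minus = list(p), list(p)
        plus[l] += h
        minus[l] -= h
        gp, gm = metric(plus), metric(minus)
        return [[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(2)] for a in range(2)]

    d = [dg(0), dg(1)]  # d[l][a][b] approximates d g_{ab} / d x^l
    g = metric(p)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    inv = [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]
    return [[[0.5 * sum(inv[k][l] * (d[i][j][l] + d[j][i][l] - d[l][i][j])
                        for l in range(2))
              for j in range(2)] for i in range(2)] for k in range(2)]

def polar_metric(p):
    """Euclidean metric of the plane written in polar coordinates (r, theta)."""
    r, _theta = p
    return [[1.0, 0.0], [0.0, r * r]]

G = christoffel(polar_metric, (2.0, 0.7))
```

At $r = 2$ the only non-negligible entries are `G[0][1][1]`, close to $-r = -2$, and `G[1][0][1] = G[1][1][0]`, close to $1/r = 0.5$; the symmetry in the lower indices reflects the vanishing torsion.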
4.3 The curvature tensor of a Riemann manifold, and a characterization of Euclidean geometry
Let $M$ be a smooth manifold with a Riemann metric $g$. To this data, we have
uniquely associated the Levi-Civita connection $\nabla$ by Theorem 4.2. The curvature
tensor $R$ of the Levi-Civita connection is called the curvature tensor of the
Riemann manifold $M$. The culmination of our work is the following result, which
characterizes Euclidean geometry in the world of Riemann manifolds!
Theorem. Let $M$ be a Riemann manifold, and let $x \in M$. Then there exists an open
neighborhood of $x$ on which $R = 0$ if and only if there exists an open neighborhood
$U$ of $x$ and a smooth map $h : U \to \mathbb{R}^n$ which is an isometry onto its image.
Proof. The necessity of 0 curvature for the existence of $h$ follows directly from
Theorem 3.4, and the sufficiency almost does. In effect, if curvature vanishes in a
neighborhood of $x$, from Theorem 3.4, we get an open neighborhood $U$ of $x$ and
a map $h : U \to \mathbb{R}^n$ which is a diffeomorphism onto its image such that $h$ maps
the Levi-Civita connection on $U$ to the Euclidean connection on $h[U]$. Clearly,
we may then assume that $U = M$ and $h$ is the identity. Note however that we
have not proved the map $h$ preserves Riemann metrics. In effect, we must investigate
the question: What Riemann metrics is the Euclidean connection $\nabla$ compatible
with?
To answer this question, assume, without loss of generality, that $U$ is connected
(in fact, we could assume without loss of generality that it is an open ball). We
see from the formulation (4.2.1) of compatibility of the connection with the metric
that given an inner product $g_x$ on $TM_x$ for a chosen point $x \in U$, there is at most
one Riemann metric $g_{ij}$ on $U$ with which $\nabla$ is compatible and such that $(g_{ij})_x = g_x$
(since the inner product on $TM_y$ for all $y \in U$ is then determined by parallel
transport). Since, however, for the Euclidean connection, parallel transport is simply
the identity when we make the canonical identification of $TM_y$ with $\mathbb{R}^n$, for any
inner product $g_x$ on $TM_x = \mathbb{R}^n$, there is precisely one Riemann metric with which $\nabla$
is compatible, namely the one specified by the same inner product on all $TM_y = \mathbb{R}^n$.
Since any two inner product spaces of the same dimension are isomorphic, to get
the desired isometry, it suffices to pick an affine map $\alpha : \mathbb{R}^n \to \mathbb{R}^n$ which takes the
inner product on $TM_x$ to the standard inner product on $\mathbb{R}^n$ for a single point $x \in U$.
We may then put $h = \alpha|_U$. □
Remark: For a general Riemann metric, it is not so easy to characterize all
Riemann metrics with which its Levi-Civita connection is compatible, although (for
connected manifolds) it remains true that such Riemann metrics are characterized
by the inner product they give on $TM_x$ at a single point. Which of these inner
products are allowable, however, is related to the notion of holonomy, which we
do not discuss here. We refer the interested reader to [21].
5 Riemann surfaces and surfaces with Riemann metric
Despite the fact that both concepts are attributed to Riemann, a Riemann surface is
not the same thing as a Riemann manifold which is a surface (i.e. has dimension 2).
A Riemann surface $\Sigma$ is of course, in particular, a 2-dimensional manifold, and
hence Lemma 4.1.1 applies. Additionally, $\Sigma$ comes with the structure of a complex
manifold, but that is not the same thing as a Riemann metric.
5.1 The compatible complex structure
When putting a Riemann metric on a Riemann surface, we are usually only
interested in compatible metrics, which means that for any tangent vector $u \in T\Sigma_x$
for any $x \in \Sigma$, $u$ is orthogonal to $iu$. Nevertheless, the method of Lemma 4.1.1
readily applies to prove the following
Lemma. Every Riemann surface $\Sigma$ has a compatible Riemann metric.
Proof. On an open subset of $\mathbb{C}$, the metric on $\mathbb{C}$ identified with $\mathbb{R}^2$ via the
isomorphism
$$ z \mapsto (\mathrm{Re}(z), \mathrm{Im}(z)) \tag{5.1.2} $$
is clearly compatible. Let, again, $U_i$ be the coordinate neighborhoods of $\Sigma$, let ${}_i g$
be a compatible Riemann metric on $U_i$ and let $u_i$ be a smooth partition of unity
subordinate to $(U_i)$. Then, as before,
$$ \sum_i u_i\, ({}_i g) $$
is the desired compatible Riemann metric on $\Sigma$. □
In this context, it is also appropriate to make the following
Observation. Every Riemann surface $\Sigma$ comes with a canonical (i.e. preferred)
orientation.
Proof. We will produce a nowhere vanishing 2-form on $\Sigma$. In fact, on an open subset
of $\mathbb{C}$, we can simply take the form $dx\,dy$ where $x$ and $y$ are the first and second
coordinates of $\mathbb{R}^2$ (i.e. $z = x + iy$). Note again that the coordinates of a complex
number $\lambda z = \tilde x + i\tilde y$, $\lambda = |\lambda| e^{i\alpha} \in \mathbb{C}$, are given by
$$ \begin{pmatrix} \tilde x \\ \tilde y \end{pmatrix} = |\lambda| \begin{pmatrix} \cos(\alpha) & -\sin(\alpha) \\ \sin(\alpha) & \cos(\alpha) \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}. $$
We conclude that
$$ d\tilde x\, d\tilde y = |\lambda|^2\, dx\,dy. \tag{5.1.3} $$
Now let $(U_i)$ be the coordinate neighborhoods of $\Sigma$ and let $\omega_i$ be a 2-form induced as
above from $dx\,dy$ by the complex coordinate $z = x + iy$ on $U_i$. The key observation
is that, by (5.1.3), on the intersection $U_i \cap U_j$,
$$ \omega_i = h\,\omega_j $$
where $h$ is a positive smooth real function. Thus, if $u_i$ is, again, a smooth partition
of unity subordinate to $(U_i)$, then
$$ \omega = \sum_i u_i\, \omega_i $$
is the nowhere vanishing 2-form on $\Sigma$ we were seeking. Simultaneously, it follows
that the form obtained from any other complex atlas is a multiple of $\omega$ by a positive
smooth real function. □
5.2 The complex structure on an oriented surface with a Riemann metric: reduction to the equation of holomorphic disks
The orientation constructed in the Observation is called a compatible orientation
on $\Sigma$. In view of the Observation and Lemma 5.1, it is a natural question if there is
a converse, i.e. if every 2-dimensional oriented Riemann manifold has a structure of
a Riemann surface with which the Riemann metric and orientation are compatible.
The answer is affirmative, but the proof turns out to be quite hard. We will need the
full force of the methods of Section 5 of Chapter 13.
Let $\Sigma$ be a 2-dimensional oriented manifold with a Riemann metric. Our task is
to construct a complex structure compatible with the metric. Let $x \in \Sigma$. Clearly, it is
enough to construct a conformal orientation-preserving coordinate $u : U \to \mathbb{C}$ (with
non-singular differential at $x$). It turns out that it is somewhat easier to construct the
inverse of the coordinate function $u$, which we will denote by $f = f(z)$. Note that,
without loss of generality, we may assume that $U = \Sigma$ is an open subset of $\mathbb{C}$ and
$x = 0 = u(x)$, so the function $f$ we seek should map an open neighborhood of 0
onto $U$, $f(0) = 0$, $Df_0$ should be non-singular and orientation preserving. What is,
however, the condition of compatibility of complex structure with Riemann metric
in this setting? To understand this, note that a 2-dimensional oriented inner product
$\mathbb{R}$-vector space $V$ comes with a canonical complex structure $J$, which means a
linear map $J : V \to V$ such that $J^2 = -\mathrm{Id}$. In fact, define $Jv$ to be the vector
of length $\|v\|$ which is orthogonal to $v$ and has the property that $v \wedge Jv$ has positive
orientation.
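For $V = \mathbb{R}^2$ with its standard orientation and an inner product given by a positive-definite symmetric matrix with entries $E, F, F, G$, one matrix satisfying this description is $J = (EG - F^2)^{-1/2} \begin{pmatrix} -F & -G \\ E & F \end{pmatrix}$. This closed form is a direct computation offered here as an illustration, not a formula from the text; the sketch below checks the defining properties numerically for a hypothetical inner product and test vector.

```python
import math

def complex_structure(g):
    """Matrix of J for the inner product with positive-definite Gram matrix g on R^2.

    Assumes the standard orientation of R^2 (e1 ^ e2 positive).
    """
    E, F, G = g[0][0], g[0][1], g[1][1]
    root = math.sqrt(E * G - F * F)
    return [[-F / root, -G / root], [E / root, F / root]]

def apply(m, v):
    """Matrix-vector product for a 2x2 matrix."""
    return (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])

def inner(g, v, w):
    """Inner product g(v, w)."""
    gw = apply(g, w)
    return v[0] * gw[0] + v[1] * gw[1]

g = [[2.0, 0.5], [0.5, 1.0]]   # hypothetical inner product
J = complex_structure(g)
v = (1.0, -3.0)                # hypothetical test vector
Jv = apply(J, v)
JJv = apply(J, Jv)
```

For these data `JJv` agrees with $-v$, `Jv` is $g$-orthogonal to `v` and has the same $g$-length, and the determinant of the pair `(v, Jv)` is positive, up to rounding.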
In this setting, the Riemann metric therefore specifies, at each $z \in U$, a complex
structure $J_z$ on $\mathbb{C} = TU_z$, which varies smoothly as a function of $z$. This is referred
to as an almost complex structure. We are therefore seeking a smooth function $f(z)$
defined in a neighborhood of 0 such that
$$ Df_z(it) = J_{f(z)}\, Df_z(t), \qquad f(0) = 0, \qquad \det(Df_0) \neq 0. \tag{5.2.1} $$
This is our first encounter with the equation of holomorphic disks. In order to
solve the equation, however, it is more convenient to write it in terms of complex
differential 1-forms. A complex differential 1-form $\alpha$ on $U$ is said to be of $J$-type
$(1,0)$ if for every $v \in \mathbb{C}$, and every $z \in U$,
$$ \alpha(z)(J_z v) = i\,\alpha(z)(v). $$
(Note that a 1-form of type $(1,0)$ with respect to the standard complex structure $i$ is
simply of the form
$$ \lambda(z)\,dz, $$
where $\lambda(z)$ is a smooth function, i.e. not necessarily a holomorphic 1-form.)

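Now by definition, there exists a smooth function $\mu : U \to \mathbb{C}$ such that
$$ dz = \alpha + \mu(z)\,\bar\alpha $$
where $\alpha$ is of $J$-type $(1,0)$. We have
$$ d\bar z = \bar\alpha + \bar\mu\,\alpha, $$
and hence
$$ \alpha = \frac{dz - \mu\, d\bar z}{1 - |\mu|^2}. $$
Thus, the complex 1-form $dz - \mu(z)\,d\bar z$ is of $J$-type $(1,0)$ and the condition of $f$
being $J$-holomorphic means that
$$ f^*(dz - \mu(z)\,d\bar z) = \lambda(z)\,dz $$
for a smooth function $\lambda(z)$. We have
$$ f^*(dz) = \frac{\partial f}{\partial z}\,dz + \frac{\partial f}{\partial \bar z}\,d\bar z, $$
$$ f^*(d\bar z) = (\bar f)^*(dz) = \frac{\partial \bar f}{\partial z}\,dz + \frac{\partial \bar f}{\partial \bar z}\,d\bar z. $$
Thus, we have
$$ f^*(dz - \mu(z)\,d\bar z) = \left( \frac{\partial f}{\partial z} - \mu(f(z)) \frac{\partial \bar f}{\partial z} \right) dz + \left( \frac{\partial f}{\partial \bar z} - \mu(f(z)) \frac{\partial \bar f}{\partial \bar z} \right) d\bar z. $$
The condition that this be a form of type $(1,0)$ with respect to the standard complex
structure then reads
$$ \partial f/\partial \bar z = \mu(f(z))\,\partial \bar f/\partial \bar z. \tag{5.2.2} $$
(Note that $\partial \bar f/\partial \bar z = \overline{\partial f/\partial z}$.)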
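Our goal is then to solve the differential equation (5.2.2). To this end, we will
make one more reduction. Applying $\partial/\partial z$ to (5.2.2) and writing
$$ g(z) = \frac{\partial \mu}{\partial z}, \qquad h(z) = \frac{\partial \mu}{\partial \bar z}, \tag{*} $$
we obtain
$$ \begin{aligned}
&\partial^2 f/(\partial z\,\partial \bar z) - \mu(f(z))\,\partial^2 \bar f/(\partial z\,\partial \bar z) \\
&\quad = \partial \bar f/\partial \bar z \cdot \left( g(f(z))(\partial f/\partial z) + h(f(z))(\partial \bar f/\partial z) \right) \\
&\quad = \partial \bar f/\partial \bar z \cdot \left( g(f(z))(\partial f/\partial z) + h(f(z))\,\overline{\mu(f(z))}\,(\partial f/\partial z) \right) \\
&\quad = \left( g(f(z)) + \overline{\mu(f(z))}\,h(f(z)) \right) |\partial f/\partial z|^2 .
\end{aligned} $$
(The second equality uses the equation (5.2.2).) Putting
$$ b(z) = g(z) + \overline{\mu(z)}\,h(z), \tag{5.2.3} $$
we therefore have
$$ \partial^2 f/(\partial z\,\partial \bar z) - \mu(f(z))\,\partial^2 \bar f/(\partial z\,\partial \bar z) = b(f(z))\,|\partial f/\partial z|^2 . $$
The complex conjugate equation is
$$ -\overline{\mu(f(z))}\,\partial^2 f/(\partial z\,\partial \bar z) + \partial^2 \bar f/(\partial z\,\partial \bar z) = \overline{b(f(z))}\,|\partial f/\partial z|^2 . $$
Putting
$$ a(z) = \frac{b(z) + \mu(z)\,\overline{b(z)}}{1 - |\mu(z)|^2}, \tag{5.2.4} $$
this gives the equation
$$ \partial^2 f/(\partial z\,\partial \bar z) = a(f(z))\,|\partial f/\partial z|^2 . \tag{5.2.5} $$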
Our strategy is first to solve the equation (5.2.5), and then show that the solution
(with suitable conditions) also satisfies (5.2.2), and hence (5.2.1).
Before doing so, however, let us briefly consider what restriction we can place
on the function $a(z)$. Note that this function is related to the smooth function $\mu(z)$
by the equations (*), (5.2.3) and (5.2.4). On the function $\mu(z)$ we can certainly
impose the relation
$$ \mu(0) = 0, $$
since we are free to choose the differential of $f$ to preserve the complex structure
at 0. Further, by substituting $t = \delta z$ for $\delta > 0$ small if necessary, we can make
$\mu(z)$ and its first several chosen partial derivatives arbitrarily small in a chosen
neighborhood of 0, and further, since we are only interested in a correct solution in
a neighborhood of 0, we may assume $\mu(z) = 0$ for $|z| > 1/2$. Using the equations
(*), (5.2.3) and (5.2.4), we can translate this to similar conditions on $a(z)$, i.e., for
any fixed chosen $\delta > 0$, we can assume
$$ a(0) = 0, \qquad a(z) = 0 \text{ for } |z| > 1/2, \qquad |a(z)|,\ |\partial a/\partial z|,\ |\partial a/\partial \bar z| < \delta \text{ for all } z \in \mathbb{C}. \tag{5.2.6} $$
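5.3 Theorem. There exists a $\delta > 0$ such that for a smooth function $a(z)$ satisfying
(5.2.6), there exists a solution $f(z)$ to the equation (5.2.5) with $\partial f/\partial z$, $\partial f/\partial \bar z$
continuous, $f(0) = 0$,
$$ \lim_{z \to \infty} f(z) = \infty, \tag{5.3.1} $$
$(\partial f/\partial z)(0) \neq 0$ and
$$ \lim_{z \to \infty} \frac{\partial f}{\partial \bar z} = 0. \tag{5.3.2} $$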
Proof. Recall Section 5.2 of Chapter 13. We will find a solution of the form
$$ f(z) = z + P_1(\varphi(z)), \qquad \varphi \in L^3(\mathbb{C}). \tag{5.3.3} $$
Define
$$ (A(\varphi))(z) = a(z + P_1(\varphi)) \cdot |\varphi(z) + 1|^2 . $$
Let us consider first the equation
$$ \frac{\partial \varphi}{\partial \bar z} = A(\varphi). \tag{5.3.4} $$
In effect, we will solve the equation (5.3.4) in the set $Q_\varepsilon$ of continuous bounded
functions on $\mathbb{C}$ which satisfy
$$ |\varphi(z)| \le \frac{\varepsilon}{1 + |z|} \tag{5.3.5} $$
with the metric induced from the metric on the space $C(\mathbb{C})$ of bounded continuous
functions on $\mathbb{C}$ (the supremum metric). Note that obviously, $Q_\varepsilon$ is a closed subset
of $C(\mathbb{C})$.
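The parameter $\varepsilon > 0$ will be chosen later, but note that (5.3.5) implies
$$ Q_\varepsilon \subset L^3(\mathbb{C}). $$
Since
$$ |(P_1(\varphi))(z)| \le C_3 K \varepsilon\, |z|^{1/3} $$
where
$$ K = \left( \int_{\mathbb{C}} \frac{dx\,dy}{(1 + |z|)^3} \right)^{1/3}, $$
choosing $C_3 K \varepsilon < 1/2$ guarantees
$$ |z + P_1(\varphi)| > 1/2 \quad \text{for } |z| > 1 - \delta \text{ for some } \delta > 0, $$
so
$$ \mathrm{supp}((A(\varphi))(z)) \subseteq D. \tag{5.3.6} $$
Let us also assume $0 < \varepsilon < 1$. Now by choosing $\delta > 0$ sufficiently small, we may
assume
$$ |A(\varphi)| < \varepsilon/8 $$
and
$$ |A(\varphi) - A(\psi)| \le \frac{1}{2}\,|\varphi - \psi| \tag{5.3.7} $$
for $\varphi, \psi \in Q_\varepsilon$. (Again, we are considering the norm in $C(\mathbb{C})$.)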

Now put
$$ \varphi_1 = 0, \qquad \varphi_{n+1} = P(A(\varphi_n)). $$
By Lemma 5.3.1 (1) of Chapter 13, we have $\varphi_n \in Q_\varepsilon$, and by (5.3.7), $(\varphi_n)$ is a
Cauchy sequence in $Q_\varepsilon$. Put
$$ \varphi = \lim_{n \to \infty} \varphi_n . $$
Since $P$ is continuous on $C(\mathbb{C})$, we have
$$ \varphi = PA(\varphi), \qquad |\varphi(z)| \le \varepsilon/(1 + |z|), \qquad |A(\varphi)| < \varepsilon/8. \tag{5.3.8} $$
Now by (5.3.6), $A(\varphi)$ has support in $D$, so by Lemma 5.3.1 (2) of Chapter 13,
$$ |\varphi(z) - \varphi(t)| < K |z - t|^{1/3} $$
for a suitable constant $K$. By Lemma 5.3.1 (2) of Chapter 13 again, there exist
constants $L, \alpha > 0$ such that
$$ |A\varphi(z) - A\varphi(t)| < L |z - t|^{\alpha}, $$
and hence $\varphi$ is continuously differentiable by Lemma 5.2.1 of Chapter 13, and
moreover satisfies (5.3.4).
Now consider the function f .z/ defined by (5.3.3). First note that by the
definition of P1 , f .0/ D 0. The equality (5.3.1) follows from the second estimate
(5.3.8) and from Lemma 5.3.1 (2) of Chapter 13. Also,
@P1 ..z//
D .z/ .0/
@z
by formula (5.2.4) of Lemma 5.2.1 of Chapter 13. Therefore, we have in (5.3.3)
@f
D .z/ C 1 .0/;
@z
and f is continuously differentiable on C by Lemma 5.2.1 of Chapter 13. Therefore,

f solves the equation (5.2.5), and @f =@z is non-zero at the point z D 0 because
j.0/j ":
To prove (5.3.2), it suffices to prove that
@P1 ./
lim D 0: (5.3.9)
z!1 @z
Because of the second estimate (5.3.8), we can write

Z Z
1 ./ 1 ./
P1 ..z// D dsdt C dsdt:
C z C
The second summand is constant in z, the first one is, by substitution D z,

D u C iv,
Z
1 . C z/
dudv:
C
Differentiating after the integral sign gives
Z Z
1 dudv 1 dsdt
@.z C /=@z D @./=@ : (5.3.10)
C D z
Note that the integrand on the right-hand side is 0 outside $D$, which lets us restrict the
integration from $\mathbb{C}$ to $D$. This also implies that taking derivatives after the integral
sign is legal by Theorem 5.2 of Chapter 5. Now the right-hand side of (5.3.10)
obviously tends to 0 as $z \to \infty$, which proves (5.3.9). □
5.4 Proposition. Any solution f .z/ of the equation (5.2.5) which satisfies the
conditions of Theorem 5.3 is also a solution of the equation (5.2.2).
Proof. Let f be as assumed. Then, recalling (5.2.4), we have
@2 f =.@z@z/ .f .z//@2 f =.@z@z/
D .a.f .z// .f .z//a.f .z///j@f =@zj2 D b.f .z//j@f =@zj2 :
Using the chain rule, we obtain from (5.2.3)
@ @.f .z//
.@f =@z .f .z//@f =@z/ C @f =@z
@z @z

@f @f
D @f =@z g.f .z// C .f .z// h.f .z// :
@z @z
From this, we obtain
@
.@f =@z .f .z//@f =@z/
@z

D @f =@z h.f .z// @f =@z .f .z// .@f =@z/ :
Setting
F .z/ D @f =@z .f .z// .@f =@z/;
we therefore have
@F
D A.z/ F .z/
@z
where
A.z/ D @f =@z h.f .z//
is a continuously differentiable function with compact support. Further, we have
lim F .z/ D 0
z!1
(by (5.3.1) and the fact that has compact support). Hence, F .z/ D 0 for all z 2 C
by Theorem 5.3 of Chapter 13, which proves our statement. t
u
Therefore, we have finished the proof of the following result.
5.5 Theorem. Every oriented smooth surface $\Sigma$ with a Riemann metric has a
compatible complex structure. □
Note that in view of the comments of Subsection 5.3 of Chapter 10 and the
Riemann Mapping Theorem 1.2 of Chapter 13, this can be equivalently phrased
to say that for every surface $\Sigma$ with a Riemann metric, and any point $x \in \Sigma$,
any sufficiently small simply connected open neighborhood of $x$ can be mapped
conformally bijectively onto $\Omega(0,1)$. In cartography, this theorem is of major
significance: together with the Riemann Mapping Theorem, it lets us make
a flat local chart of any (smooth) landscape in the shape of any simply connected
open set in $\mathbb{C}$ (other than $\mathbb{C}$ itself) which preserves surface angles.
6 Exercises
(1) Let $M$ be a smooth manifold with an affine connection and let $U$ be an open
subset of $M$. Let $x^i$, $y^i$ be two different coordinate systems on $U$, and let
$\Gamma^k_{ij}$ be the Christoffel symbols with respect to the coordinates $x^i$, and $\bar\Gamma^k_{ij}$ the
Christoffel symbols with respect to $y^i$. Prove that
$$\bar\Gamma^k_{ij} = \frac{\partial x^p}{\partial y^i}\,\frac{\partial x^q}{\partial y^j}\,\frac{\partial y^k}{\partial x^r}\,\Gamma^r_{pq} + \frac{\partial y^k}{\partial x^m}\,\frac{\partial^2 x^m}{\partial y^i\,\partial y^j}.$$
Note that the second term is the “error term for the symbol $\Gamma^k_{ij}$ behaving as a
tensor of type $(2,1)$”.
(2) Prove that the inverse of a positive-definite symmetric matrix is positive-definite.
[Hint: We have $x^T A x > 0$ when $x \ne 0$, and we want to prove
$y^T (A^{-1}) y > 0$ for $y \ne 0$. Consider $y = Ax$.]
(3) Let $M$ be a Riemann manifold with Riemann metric $g$. Define, for $x, y \in M$,
$$\rho(x,y) = \inf_{\gamma} s_g(\gamma)$$
where $\gamma$ is a parametrized continuously differentiable curve with boundary
points $x, y$. Prove that the function $\rho$ is a metric and that the topology
associated to $\rho$ is the topology on $M$ which is a part of the definition of a
manifold. [Hint: Use Theorem 4.3.2 of Chapter 14. Keep in mind that one of
the things to show is that $\rho(x,y) = 0$ implies $x = y$.]
(4) Prove that the conditions (4.2.1) and (2) of 4.2 are equivalent. [Hint: Integrating
condition (2) along a curve $\gamma$ where $\nabla_{\gamma'}(v) = \nabla_{\gamma'}(w) = 0$, $u = \gamma'$
gives (4.2.1). This also means that (4.2.1) implies (2) at points where $\nabla_u(v) =
\nabla_u(w) = 0$. Fixing local coordinates, the general case then follows by the
chain rule.]
(5) Volume associated with a Riemann metric:
(a) Let $g$ be a Riemann metric defined on a bounded open subset $U \subseteq \mathbb{R}^n$.
Assuming $B \subseteq U$ is a Borel set, define
$$\mathrm{vol}_g(B) = \int_B \sqrt{\det(g_{ij})}.$$
Prove that this definition is invariant under diffeomorphism, provided we
transform $g_{ij}$ as a tensor of type $(2,0)$.
(b) Let M be a Riemann manifold with coordinate atlas .Up ; hp /p2P and let
up be a smooth partition of unity subordinate to Up . Recall that P can be
chosen to be countable, since we defined manifolds to have a countable
basis. Let B be a Borel subset of M . Prove that we can write B as a
disjoint union of Borel sets Bp , p 2 P , such that Bp Up . Put
X
volg .B/ D vol.hp / g .hp ŒBp /:
p
Prove that volg .B/ does not depend on the choices (i.e. the atlas and the
set Bp ).
(6) Let W ha; bi ! .0; 1/ be a smooth function (taking one-sided derivatives at
the boundary points). Consider the smooth map of manifolds
W .a; b/
S 1 ! R3
given by
.x; e 2 i t / 7! .x; .x/ cos.t/; .x/ sin.t//:
Prove that is an embedding of manifolds. Let g be the Riemannian metric

on M D Im./ induced from R3 . Find an explicit formula for the volume
(=“area”) of M in terms of the function . Find the function which minimize
the surface area of M subject to given values .a/; .b/ > 0. You may assume
without proof such smooth function exists. [Hint: compare with Exercise (2)
of Chapter 14.]
(7) (a) Consider the 2-sphere
$$S^2 = \{(x,y,z) \in \mathbb{R}^3 \mid x^2 + y^2 + z^2 = 1\}$$
with the Riemann metric induced from $\mathbb{R}^3$. State precisely and prove
that geodesics are precisely segments of great circles parametrized by arc
length.
(b) Generalize this to the n-sphere.
(c) Construct a Riemann metric on R2 in which there exists a geodesic with
boundary points A, B which does not minimize the distance functional
among continuously differentiable curves with boundary points A, B.
[Hint: Remove a point from S 2 , and induce a Riemann metric on R2 from
the Riemann metric (a) via the radial projection diffeomorphism.]
(8) Let $M \subseteq N$ be a smooth submanifold, and let $g$ be the Riemann metric on
$M$ induced by a Riemann metric $\tilde g$ on $N$. If we denote by $\nabla$ resp. $\tilde\nabla$ the
Levi-Civita connection of $g$ resp. $\tilde g$, prove that $(\nabla_u(v))_x$ is the $\tilde g$-orthogonal
projection of $\tilde\nabla_u(v)$ onto $TM_x$ for $x \in M$ (note that $\tilde\nabla_u(v)$ is only defined
in the sense of 2.2). Use this to compute the curvature tensor of $S^2$ with the
Riemann metric induced from $\mathbb{R}^3$. Conclude that no non-empty open set of $S^2$
is isometric to an open set of $\mathbb{R}^2$ (with the respective Riemann metrics). This
fact was first rigorously proved by Gauss.
(9) Prove that every 1-dimensional manifold is diffeomorphic either to S 1 or to
R. [Hint: Use Lemma 4.1.1 and parametrization by arc length.]
(10) Consider the sphere $S$ in $\mathbb{R}^3$ given by the equation
$$x^2 + y^2 + (z-1)^2 = 1.$$
Identifying the $xy$-plane with $\mathbb{C}$ by
$$z = x + iy,$$
define a map from $S \smallsetminus \{(0,0,2)\}$ to $\mathbb{C}$ by mapping a point $P$ on $S$ to
the point $Q$ in the $xy$-plane such that $P$, $Q$ and $(0,0,2)$ lie on a straight
line. This is called the stereographic projection. If we take on $S$ the induced
Riemann metric from $\mathbb{R}^3$, and the standard complex structure on $\mathbb{C}$, prove
that the stereographic projection gives a coordinate system of a compatible
complex structure on $S$ (or, equivalently, a conformal map).
[Hint: This can be done using basic trigonometry. A particularly elegant
solution can be obtained by comparing the isometries of $S$ with Möbius
transformations on $\mathbb{C} \cup \{\infty\}$.]
16 Banach and Hilbert Spaces: Elements of Functional Analysis
Let us now turn to infinite-dimensional geometry. The simplest such structure is

probably that of a Hilbert space. It is highly relevant for analysis, and plays a key
role in such areas as stochastic analysis and quantum physics. In this chapter we
will discuss the basics of this concept; in the next one we will present some of its
uses.
In the process we will also introduce the more general Banach spaces. Some facts
about Hilbert spaces readily generalize to Banach ones, but deeper theorems in this
much broader area require separate methods. These methods comprise a vast area
of mathematics called functional analysis. For good texts on this subject we can
recommend, e.g., [17, 19]. In this chapter we will be able to present some of the
simpler highlights of functional analysis, in particular the Hahn-Banach Theorem
and some of its consequences, and the duality of Lp spaces.
1 Banach and Hilbert spaces
1.1
In this chapter we will work with vector spaces over the field R of real numbers
and the field C of complex numbers (see Appendix A). Since the case of C is
perhaps less familiar, we will emphasize it, especially in the theory of Hilbert spaces.
All we say for C there remains true essentially verbatim over the field R as well,
and the reader is encouraged to consider what changes are appropriate in the real
case (mostly, complex conjugation disappears). In the case of Banach spaces, the
cases of R and C are sometimes really different. In those cases, we will spell out
both alternatives in detail.
Now recall the notion of an inner product from 4.2 of Appendix A and its
associated norm (and hence metric) from 1.2.3 of Chapter 2. Recall also the general
notion of a norm as introduced in 1.2 of Chapter 2.

If a normed vector space is complete (in the sense of Section 7 of Chapter 2)

we speak of a Banach space. If, moreover, the norm has been obtained from
an inner product as in 1.2.3 of Chapter 2, we speak of a Hilbert space. By an
isomorphism of Banach spaces, we mean a vector space isomorphism which is also
a homeomorphism. An isometric isomorphism (briefly isometry) is an isomorphism
of vector spaces which preserves the norm. Note that an isometric isomorphism
of Banach spaces which are Hilbert also necessarily preserves the inner product
(Exercise (2)).
Examples: In particular, $\mathbb{R}^n$ (resp. $\mathbb{C}^n$) equipped with the standard Pythagorean
metric is an example of a real (resp. complex) Hilbert space, and more generally,
each of the norms $\|v\|_p$ makes $\mathbb{R}^n$, $\mathbb{C}^n$ into a Banach space (Exercise (20) of
Chapter 5).
More interestingly, let $B \subseteq \mathbb{R}^n$ be a Borel subset. Recall the spaces $L_p(B)$,
$L_p(B, \mathbb{C})$ of Section 8 of Chapter 5. In the present terminology, Theorem 8.5.2 of
Chapter 5 says that $L_p(B)$ and $L_p(B, \mathbb{C})$, $1 \le p \le \infty$, are real resp. complex
Banach spaces. In fact, on $L_2(B)$, $L_2(B, \mathbb{C})$ we have a real (resp. complex) inner
product defined by
$$f \cdot g = \int_B f \bar g$$
which is finite by the Cauchy-Schwarz inequality applied at every point. Since the
norm on $L_2$ is the norm corresponding to this inner product, the spaces $L_2(B)$ and
$L_2(B, \mathbb{C})$ are real and complex Hilbert spaces. The spaces $L_p(B)$, $L_p(B, \mathbb{C})$ are, in
some sense, the most fundamental examples.
1.2 Theorem. A norm is a uniformly continuous map $V \to \mathbb{R}$.
Proof. We have $\|x\| = \|y + (x - y)\| \le \|y\| + \|x - y\|$ and similarly with the roles
of $x$ and $y$ reversed, so
$$\big|\,\|x\| - \|y\|\,\big| \le \|x - y\|. \qquad \square$$
1.3 An important convention
A subspace of a Banach resp. Hilbert space is a subset that is a Banach resp. Hilbert
space in the inherited structure. In particular, it is required to be complete. Thus, by
Proposition 7.3.1 of Chapter 2,
subspaces of a Banach resp. Hilbert space are precisely closed linear (vector)
subspaces.
2 Uniformly convex Banach spaces

2.1
A normed linear space $V$ is said to be uniformly convex if

$\forall \varepsilon > 0\ \exists \delta > 0$ such that for all $x, y \in V$ we have the implication
$$\|x\| = \|y\| = 1 \ \text{and}\ \Big\|\frac{x+y}{2}\Big\| > 1 - \delta \ \Rightarrow\ \|x - y\| < \varepsilon.$$
To reduce $\delta$-$\varepsilon$ reasoning, it is sometimes convenient to rephrase this as the following obviously
equivalent statement:
For sequences of elements $(x_n)$, $(y_n)$ in $V$,
$$\Big(\|x_n\| = \|y_n\| = 1 \ \text{and}\ \Big\|\frac{x_n + y_n}{2}\Big\| \to 1\Big) \ \Rightarrow\ \|x_n - y_n\| \to 0$$
(here the symbol $\to$ indicates the limit with $n \to \infty$).
Explanation. This condition expresses the intuitive notion of convexity of the
(unit) ball in the space as a sort of “bulging”. If you take for instance the norm from
Example 1.2.2(a) of Chapter 2, the unit ball is a cube; it does not really bulge: two
elements $x, y$ on any of its faces may be far from each other while the distance of
the mean point $\frac{x+y}{2}$ from the center is still 1. In Example 1.2.2(c) of Chapter 2, on
the other hand, if we move $x, y$ on the border apart from each other, the point $\frac{x+y}{2}$
moves away from the border. Draw a picture.
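The picture can also be checked numerically. The following is a minimal sketch in the plane, assuming (as the description of the unit balls suggests) that the two norms referred to are the maximum norm and the Euclidean norm; the helper names are illustrative and not part of the text.

```python
import math

def max_norm(v):
    return max(abs(c) for c in v)

def euclid_norm(v):
    return math.sqrt(sum(c * c for c in v))

def midpoint(x, y):
    return tuple((a + b) / 2 for a, b in zip(x, y))

def diff(x, y):
    return tuple(a - b for a, b in zip(x, y))

# Maximum norm: two unit vectors on the same face of the square.
x, y = (1.0, 1.0), (1.0, -1.0)
mid_max = max_norm(midpoint(x, y))    # midpoint stays on the boundary
dist_max = max_norm(diff(x, y))       # although the points are far apart

# Euclidean norm: two unit vectors at a right angle.
u, v = (1.0, 0.0), (0.0, 1.0)
mid_euclid = euclid_norm(midpoint(u, v))   # midpoint moves inside the ball
dist_euclid = euclid_norm(diff(u, v))
```

In the first case the midpoint has norm 1 while the distance is 2, so no $\delta$ can work for $\varepsilon \le 2$; in the second case the midpoint already lies strictly inside the unit ball.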
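2.2 Theorem. A Hilbert space is uniformly convex.
Proof. Choose an $\varepsilon > 0$ and set $\delta = 1 - \sqrt{1 - \frac{\varepsilon^2}{4}}$. If $\|x\| = \|y\| = 1$ and
$\big\|\frac{x+y}{2}\big\| > 1 - \delta = \sqrt{1 - \frac{\varepsilon^2}{4}}$, then we have
$$1 - \frac{\varepsilon^2}{4} < \frac{1}{4}(x+y)(x+y) = \frac{1}{4}(1 + yx + xy + 1) = \frac{1}{4}(2 + yx + xy)$$
and consequently
$$xy + yx > 2 - \varepsilon^2,$$
so
$$\|x - y\|^2 = (x-y)(x-y) = \|x\|^2 + \|y\|^2 - xy - yx = 2 - (xy + yx) < \varepsilon^2. \qquad \square$$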
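As a sanity check of the choice $\delta = 1 - \sqrt{1 - \varepsilon^2/4}$, one can test the implication on unit vectors of the Euclidean plane. This is only an illustrative sketch: the function names, the grid of angles and the small tolerance are assumptions made for the example.

```python
import math

def delta_for(eps):
    # The delta chosen in the proof of Theorem 2.2.
    return 1.0 - math.sqrt(1.0 - eps * eps / 4.0)

def implication_holds(eps, steps=1000):
    """Test: ||(x+y)/2|| > 1 - delta  implies  ||x - y|| < eps
    for x = (1, 0) and y on a grid of the upper unit semicircle."""
    d = delta_for(eps)
    x = (1.0, 0.0)
    for i in range(steps + 1):
        theta = math.pi * i / steps
        y = (math.cos(theta), math.sin(theta))
        mid = math.hypot((x[0] + y[0]) / 2, (x[1] + y[1]) / 2)
        dist = math.hypot(x[0] - y[0], x[1] - y[1])
        if mid > 1.0 - d and not dist < eps + 1e-12:
            return False
    return True

results = {eps: implication_holds(eps) for eps in (0.1, 0.5, 1.0)}
```

For an angle $\theta$ between $x$ and $y$ one has $\|\frac{x+y}{2}\| = \cos(\theta/2)$ and $\|x - y\| = 2\sin(\theta/2)$, so in the plane the two conditions are in fact equivalent for this $\delta$.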
2.3 Lemma. Let $y_n, z_n$ be elements of a uniformly convex Banach space such that
$$\lim \|y_n\| = \lim \|z_n\| = \lim \Big\|\frac{y_n + z_n}{2}\Big\| = 1.$$
Then $\lim \|y_n - z_n\| = 0$.
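Proof. First, we obviously have
$$\lim \Big\|\frac{z_n}{\|z_n\|} - \frac{z_n}{\|y_n\|}\Big\| = \lim \Big\|\frac{1}{\|z_n\|}\Big(1 - \frac{\|z_n\|}{\|y_n\|}\Big) z_n\Big\| = 0. \tag{2.3.1}$$
Since the norm is a continuous function, it follows from (2.3.1) and the assumptions
that
$$\lim \Big\|\frac{1}{2}\Big(\frac{z_n}{\|z_n\|} + \frac{y_n}{\|y_n\|}\Big)\Big\| = \lim \Big\|\frac{1}{\|y_n\|}\,\frac{z_n + y_n}{2} + \frac{1}{2}\Big(\frac{z_n}{\|z_n\|} - \frac{z_n}{\|y_n\|}\Big)\Big\| = 1$$
and hence we obtain, by the uniform convexity, that
$$\lim \Big\|\frac{y_n}{\|y_n\|} - \frac{z_n}{\|z_n\|}\Big\| = 0$$
and we conclude, using (2.3.1) again, that
$$\lim \|y_n - z_n\| = \lim \|y_n\| \cdot \Big\|\frac{y_n}{\|y_n\|} - \frac{z_n}{\|z_n\|} + \frac{z_n}{\|z_n\|} - \frac{z_n}{\|y_n\|}\Big\| = 0. \qquad \square$$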
2.4 Theorem. Let $K$ be a closed convex subset of a uniformly convex Banach space
$B$ and let $a \in B$. Then there exists precisely one element $y \in K$ such that
$$\|y - a\| = \inf\{\|x - a\| \mid x \in K\}.$$
Proof. The maps $x \mapsto x - a$ and $x \mapsto \alpha x$ (with $\alpha \ne 0$) are obviously homeomorphisms
preserving convexity. Thus, except for the trivial case of $a \in K$, we can assume
that
$$a = o \quad \text{and} \quad \inf\{\|x\| \mid x \in K\} = 1.$$
Then there exists a sequence $(x_n)$, $n = 1, 2, \dots$, of elements of $K$ such that
$$\lim \|x_n\| = 1.$$
Since $K$ is convex we have
$$1 \le \Big\|\frac{x_n + x_m}{2}\Big\| \le \frac{1}{2}(\|x_n\| + \|x_m\|). \tag{2.4.1}$$
Suppose that the sequence $(x_n)_n$ is not Cauchy. Then there exist subsequences $(y_n)_n$
and $(z_n)_n$ such that for some $\varepsilon_0 > 0$ and all $n$,
$$\|y_n - z_n\| \ge \varepsilon_0.$$
However, we have $\lim \|y_n\| = \lim \|z_n\| = 1$ and by (2.4.1) also $\lim \big\|\frac{y_n + z_n}{2}\big\| = 1$ and
hence by Lemma 2.3, $\lim \|y_n - z_n\| = 0$, a contradiction.
Thus, $(x_n)_n$ is a Cauchy sequence and if we set $y = \lim x_n$ we have $y \in K$ and
$\|y\| = 1$. If we had $\|z\| = 1$ for another $z \in K$ we would have, according to the
same reasoning as above, a Cauchy sequence $y, z, y, z, \dots, y, z, \dots$, which forces $z = y$. □
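In the Euclidean plane (a Hilbert space, hence uniformly convex by 2.2) the nearest point can be written down explicitly for simple sets. The sketch below assumes $K$ is the closed unit disc; the function name and the sample of comparison points are illustrative choices, not part of the text.

```python
import math

def nearest_in_unit_disc(a):
    """Nearest point to a in K = {x : ||x|| <= 1} for the Euclidean norm."""
    n = math.hypot(a[0], a[1])
    if n <= 1.0:
        return a                      # trivial case: a already lies in K
    return (a[0] / n, a[1] / n)       # radial projection onto the boundary

a = (3.0, 4.0)
y = nearest_in_unit_disc(a)
best = math.hypot(y[0] - a[0], y[1] - a[1])

# Compare with other points of K (here: a sample of the unit circle).
sample = [(math.cos(2 * math.pi * k / 360), math.sin(2 * math.pi * k / 360))
          for k in range(360)]
others = [math.hypot(p[0] - a[0], p[1] - a[1]) for p in sample]
```

Here $\|a\| = 5$, so the minimal distance is $4$ and it is attained only at $a/\|a\|$.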
3 Orthogonal complements and continuous linear forms
3.1
Similarly as in 4.6 of Appendix A, we define for a subspace $M$ of a Hilbert space $H$
$$M^\perp = \{x \mid xy = 0 \ \text{for all } y \in M\}.$$
Note that from the property $xx = 0 \Rightarrow x = o$ of the scalar product it follows that
$$M \cap M^\perp = \{o\}.$$
Also note that $M^\perp$ is a Hilbert subspace: it is obviously a vector subspace of $H$,
and it is closed since the mapping $((x,y) \mapsto xy) : H \times H \to \mathbb{C}$ resp. $\mathbb{R}$ is continuous
(indeed, we have $|xy - x'y'| = |xy - xy' + xy' - x'y'| \le |xy - xy'| + |xy' - x'y'| \le
\|x\|\,\|y - y'\| + \|y'\|\,\|x - x'\|$).
3.2 Theorem. Let $M$ be a (Hilbert) subspace of a Hilbert space $H$. Then each
$x \in H$ can be uniquely written as
$$x = y + z \quad \text{with } y \in M \text{ and } z \in M^\perp.$$
Proof. Using 2.4 (which applies by 2.2), consider the element $y \in M$ for which
$$\|x - y\| = \min\{\|x - u\| \mid u \in M\}$$
and put $z = x - y$. For a general non-zero $u \in M$ we have $y + \frac{zu}{uu}u \in M$, so
$$\Big\|z - \frac{zu}{uu}\,u\Big\| \ge \|x - y\| = \|z\|,$$
and hence
$$\|z\|^2 - \frac{uz}{uu}\,zu - \frac{zu}{uu}\,uz + \frac{zu}{uu}\,\frac{uz}{uu}\,uu = \|z\|^2 - \frac{(zu)(uz)}{uu} \ge \|z\|^2; \quad \text{hence}$$
$$|zu|^2 = (zu)\overline{zu} = (zu)(uz) \le 0$$
so $zu = 0$, and finally $z \in M^\perp$.
If we have $x = z + y = z' + y'$ with $y, y' \in M$ and $z, z' \in M^\perp$ then $y - y' = z' - z$
and these differences are in $M \cap M^\perp = \{o\}$. □
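In finite dimensions the decomposition can be computed directly. The following sketch works in $\mathbb{R}^3$ with a subspace $M$ spanned by two orthogonal vectors; the function `project` and the chosen vectors are illustrative assumptions, and the formula used for $y$ is the usual projection onto an orthogonal spanning set rather than the minimization argument of the proof.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project(x, basis):
    """Orthogonal projection of x onto span(basis).
    The vectors in basis are assumed to be pairwise orthogonal and non-zero."""
    y = [0.0] * len(x)
    for m in basis:
        c = dot(x, m) / dot(m, m)
        y = [yi + c * mi for yi, mi in zip(y, m)]
    return y

basis = [(1.0, 0.0, 0.0), (0.0, 1.0, 1.0)]   # spans M
x = (2.0, 3.0, 5.0)
y = project(x, basis)                        # component in M
z = [xi - yi for xi, yi in zip(x, y)]        # component in the orthogonal complement
```

Here $y = (2, 4, 4)$ and $z = (0, -1, 1)$, and $z$ is orthogonal to both spanning vectors of $M$.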
3.3 Theorem. $(M^\perp)^\perp = M$ for all subspaces $M$.
Proof. Obviously $M \subseteq (M^\perp)^\perp$. Now let $x \in (M^\perp)^\perp$. Using 3.2, write $x = y + z$
with $y \in M$ and $z \in M^\perp$. Then
$$zz = zx - zy = 0 - 0 = 0,$$
and hence $z = o$ and $x = y \in M$. □
3.4
By Theorem 3.3, the mapping $M \mapsto M^\perp$ is a bijection of the set of all subspaces
of $H$ onto itself. Since it obviously reverses order by inclusion (i.e. $M_1 \subseteq M_2 \Rightarrow
M_2^\perp \subseteq M_1^\perp$), we have
$$(M \cap N)^\perp = M^\perp + N^\perp \quad \text{and} \quad (M + N)^\perp = M^\perp \cap N^\perp$$
where $M + N$ is the smallest subspace containing both $M$ and $N$.
3.5 Theorem. Let $V, V'$ be normed vector spaces (real or complex). Then the
following statements for a linear operator $f : V \to V'$ are equivalent.
(1) $f$ is continuous.
(2) $f$ is uniformly continuous.
(3) There exists a number $K$ such that
$$\|x\| \le 1 \ \Rightarrow\ \|f(x)\| \le K.$$
Because of condition (3) of the theorem, continuous linear operators between
normed linear spaces are also referred to as bounded.
Proof. (2)⇒(1) is trivial.
(1)⇒(3): Suppose the implication does not hold. Then there exist $x_k \in V$ such
that $\|x_k\| \le 1$ and $\|f(x_k)\| \ge k$. Put $y_k = \frac{1}{k} x_k$. Then $\lim y_k = o$ while $\|f(y_k)\| \ge
1$ and hence $f(y_k)$ cannot converge to $o = f(o)$.
(3)⇒(2): Suppose such a $K$ exists; we may assume $K > 0$. For $\varepsilon > 0$, put $\delta = \frac{1}{K}\varepsilon$. Now if $\|x - y\| < \delta$,
then $\big\|\frac{K}{\varepsilon}(x - y)\big\| \le 1$, and hence
$$\|f(x) - f(y)\| = \|f(x - y)\| = \Big\|\frac{\varepsilon}{K}\, f\Big(\frac{K}{\varepsilon}(x - y)\Big)\Big\| \le \frac{\varepsilon}{K}\, K = \varepsilon. \qquad \square$$
3.5.1
This leads to a concept of a norm of a continuous linear map $f : V \to V'$ between
normed vector spaces defined by
$$\|f\| = \sup\{\|f(x)\| \mid \|x\| \le 1\}.$$
It is an easy exercise to show that it is indeed a norm on the vector space
$$L(V, V')$$
of all continuous linear maps $f : V \to V'$ (with the natural addition and multiplication by scalars).
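For a concrete feeling for this norm, one can approximate the supremum for a linear map of the Euclidean plane by sampling the unit circle. The sketch below is only a numerical illustration under that assumption; the matrix and the function names are chosen for the example.

```python
import math

def apply(A, x):
    return tuple(sum(a * xi for a, xi in zip(row, x)) for row in A)

def euclid_norm(v):
    return math.sqrt(sum(c * c for c in v))

def sampled_operator_norm(A, steps=3600):
    """Approximate sup{||A x|| : ||x|| <= 1} using unit vectors (cos t, sin t).
    For a linear map the supremum over the ball is attained on the unit sphere."""
    best = 0.0
    for k in range(steps):
        t = 2 * math.pi * k / steps
        best = max(best, euclid_norm(apply(A, (math.cos(t), math.sin(t)))))
    return best

A = [[3.0, 0.0], [0.0, 1.0]]
estimate = sampled_operator_norm(A)
```

For this diagonal matrix the value is $3$, attained at $x = (1, 0)$.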
3.5.2
A linear form on a real or complex normed vector space $V$ is a continuous linear
mapping $V \to \mathbb{R}$ resp. $V \to \mathbb{C}$. Similarly as in 1.1 of Chapter 11, we will denote by
$$V^*$$
the space of all linear forms on $V$. This is called the dual space of the normed
vector space $V$. Note, however, that, unlike in 1.1 of Chapter 11, we now take
the continuous linear forms only. The definition from 3.5.1 yields a norm on $V^*$
defined by
$$\|\varphi\| = \sup\{|\varphi(x)| \mid \|x\| \le 1\}.$$
3.5.3
Similarly as in 1.2 of Chapter 11, we have for a continuous linear mapping $f : V \to
V'$ a linear mapping $f^* : (V')^* \to V^*$ defined by
$$f^*(\varphi) = \varphi \circ f$$
(if $f, \varphi$ are continuous then the composition $\varphi \circ f$ is continuous as well). We will
show that $f^*$ is continuous. This is an immediate consequence of the following
3.6 Lemma. We have $\|f^*(\varphi)\| \le \|f\|\,\|\varphi\|$.
Proof. We have $\|f^*(\varphi)\| = \|\varphi \circ f\| = \sup\{|\varphi(f(x))| \mid \|x\| \le 1\}$. The case $f = 0$ is trivial, so let $f \ne 0$. If $\|x\| \le 1$ then
$\|f(x)\| \le \|f\|$. Thus, $\big\|\frac{1}{\|f\|} f(x)\big\| \le 1$ and $\frac{1}{\|f\|}|\varphi(f(x))| = \big|\varphi\big(\frac{1}{\|f\|} f(x)\big)\big| \le \|\varphi\|$. □
3.6.1 Theorem. (The Riesz Representation Theorem) Let $H$ be a Hilbert
space. Then
- for every $a \in H$, the mapping $(x \mapsto xa) : H \to \mathbb{C}$ is a linear form, and
- on the other hand every linear form $\varphi : H \to \mathbb{C}$ is given by the formula $(x \mapsto xa)$
for a uniquely determined $a \in H$.
Proof. The first statement is obvious.
Now let $\varphi : H \to \mathbb{C}$ be a continuous linear mapping. If it is constant (and hence,
zero everywhere) we can set $\varphi(x) = xo$. Otherwise
$$M = \{x \mid \varphi(x) = 0\}$$
is a subspace unequal to $H$ and hence $M^\perp \ne \{o\}$, by 3.2. First we will show that
$\dim M^\perp = 1$. Indeed, let $o \ne x, y \in M^\perp$ and consider $u = \varphi(y)x - \varphi(x)y$. Then
$$\varphi(u) = \varphi(y)\varphi(x) - \varphi(x)\varphi(y) = 0$$
and hence $u \in M \cap M^\perp = \{o\}$. Thus, $\varphi(y)x - \varphi(x)y = o$ and since $x, y$ are
non-zero in $M^\perp$, $\varphi(x), \varphi(y)$ are nonzero and $x, y$ are linearly dependent.
Thus, we have
$$M^\perp = \{\alpha b \mid \alpha \in \mathbb{C}\}$$
for some $b \ne o$. Now by 3.2, a general $x \in H$ can be written as
$$x = x_M + \alpha(x) b \quad \text{with } x_M \in M.$$
Hence we have
$$\varphi(x) = \alpha(x)\varphi(b) \quad \text{and} \quad xb = \alpha(x)(bb).$$
Comparing these two equations we obtain
$$\varphi(x) = xa \quad \text{where } a = \frac{\overline{\varphi(b)}}{bb}\, b.$$
The uniqueness is obvious (if $a \ne b$ then $0 \ne (a - b)(a - b)$ and hence $xa \ne xb$ for
$x = a - b$). □
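3.7 Lemma. Let $\varphi : H \to \mathbb{C}$ be given by $\varphi(x) = xa$. Then
$$\|\varphi\| = \sup\{|\varphi(x)| \mid \|x\| \le 1\} = \|a\|.$$
Proof. If $\|x\| \le 1$ then $|\varphi(x)| = |xa| \le \|x\|\,\|a\| \le \|a\|$. On the other hand, for $a \ne o$ we have
$\varphi\big(\frac{1}{\|a\|} a\big) = \frac{1}{\|a\|}\, aa = \|a\|$. □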
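The lemma is easy to observe in $\mathbb{C}^2$ with the inner product $xy = \sum x_i \bar y_i$. The following sketch is an illustration only; the vector $a$ and the test vectors are arbitrary choices made for the example.

```python
import math

def inner(x, y):
    # x . y = sum x_i * conj(y_i): linear in x, antilinear in y
    return sum(xi * yi.conjugate() for xi, yi in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x).real)

a = (1 + 2j, 2j)                      # ||a|| = 3

def phi(x):
    # The linear form x -> x . a represented by a.
    return inner(x, a)

unit_a = tuple(c / norm(a) for c in a)
attained = phi(unit_a)                # value of phi at a / ||a||

s = 1 / math.sqrt(2)
unit_vectors = [(1 + 0j, 0j), (0j, 1 + 0j), (s + 0j, s * 1j)]
values = [abs(phi(x)) for x in unit_vectors]
```

The value at $a/\|a\|$ equals $\|a\| = 3$, while the other unit vectors give smaller absolute values, in accordance with the Cauchy-Schwarz inequality.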
3.7.1
A map $f$ between vector spaces over $\mathbb{C}$ is said to be antilinear if it preserves
addition and sends $\alpha z$ to $\bar\alpha f(z)$.
Theorem. The correspondence $\theta = \theta_H : H \to H^*$ defined by $\theta(a)(x) = xa$ is
bijective, antilinear and preserves norms.
Proof. $\theta$ is one-one onto by Theorem 3.6.1. We have $\theta(a + b)(x) = x(a + b) =
xa + xb = \theta(a)(x) + \theta(b)(x)$, and $\theta(\alpha z)(x) = x(\alpha z) = \bar\alpha(xz)$. The norm is preserved by Lemma 3.7. □
3.7.2 Remark
Note that in the case of Hilbert spaces over $\mathbb{R}$, the mappings $\theta_H$ are norm preserving
isomorphisms.
3.8
Let $f : H \to H'$ be a continuous linear mapping. By 3.5.3 we have a continuous
linear mapping $f^* : (H')^* \to H^*$ dual to $f$. On the other hand, in view of 3.7, we
have a continuous linear mapping associated with $f$ going in the same direction,
namely the $g$ from the commutative diagram
$$\begin{array}{ccc}
H & \xrightarrow{\ \theta_H\ } & H^* \\
{\scriptstyle f}\downarrow & & \downarrow{\scriptstyle g = \theta_{H'} f\, \theta_H^{-1}} \\
H' & \xrightarrow{\ \theta_{H'}\ } & (H')^*
\end{array}$$
This calls for a closer analysis. For a continuous linear mapping $f$ and a fixed
$y \in H'$ we have the linear form, obviously continuous,
$$h = (x \mapsto f(x) \cdot y).$$
By Theorem 3.6.1, there is, hence, a $z \in H$ such that
$$h = (x \mapsto xz).$$
Setting $z = f^{Ad}(y)$ we obtain a mapping $H' \to H$ satisfying the formula
$$\forall x, y, \quad f(x) \cdot y = x \cdot f^{Ad}(y).$$
This mapping $f^{Ad}$ is referred to as the mapping adjoint to $f$. We will show that the
mapping $g$ from the diagram above is equal to $(f^{Ad})^*$. Indeed, we have
$$(f^{Ad})^*(\theta_H(a))(x) = (\theta_H(a) \circ f^{Ad})(x) = \theta_H(a)(f^{Ad}(x))$$
$$= f^{Ad}(x) \cdot a = \overline{a \cdot f^{Ad}(x)} = \overline{f(a) \cdot x} = x \cdot f(a) = \theta_{H'}(f(a))(x).$$
3.9
A continuous linear mapping $f : H \to H$ is said to be Hermitian if it is adjoint to
itself, that is, if $f = f^{Ad}$, explicitly
$$f(x) \cdot y = x \cdot f(y) \quad \text{for all } x, y \in H.$$
Remark. Hermitian mappings (one also speaks of Hermitian operators) play an
important role in theoretical physics. It is a useful exercise to show that Hermitian
operators $\mathbb{C}^n \to \mathbb{C}^n$ are associated with matrices $A$ such that
$$A = \bar{A}^T = A^*$$
(we have in mind the complex case with $xy = \sum x_i \bar y_i$ and the complex conjugate
matrix defined by $\overline{(a_{ij})_{ij}} = (\bar a_{ij})_{ij}$; $A^T$ is, as usual, the transposed matrix). Recall
from 7.2 of Appendix A that $A^*$ is sometimes called the adjoint matrix.
3.9.1
The eigenvalues of a linear operator $f : H \to H$ are numbers $\lambda$ such that $f(u) =
\lambda u$ for a non-zero $u$, and the $u$'s satisfying such equations are called eigenvectors
(compare 5.1 of Appendix B). We have
Theorem. 1. All the eigenvalues of a Hermitian operator $f$ are real.
2. Two eigenvectors associated with different eigenvalues are orthogonal.
Proof. 1. Let $f(u) = \lambda u$ and $u \ne o$. Then we have
$$\lambda(u \cdot u) = \lambda u \cdot u = f(u) \cdot u = u \cdot f(u) = u \cdot (\lambda u) = \bar\lambda(u \cdot u).$$
2. Let $f(u) = \alpha u$ and $f(v) = \beta v$, $\alpha \ne \beta$. Then, since $\beta$ is real by 1, we have
$$(\alpha - \beta)uv = \alpha(uv) - \beta(uv) = (\alpha u)v - u(\beta v) = f(u)v - uf(v) = 0. \qquad \square$$
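A small concrete instance in $\mathbb{C}^2$ may help. The sketch below uses a hand-picked $2 \times 2$ matrix that equals its conjugate transpose; the matrix, its eigenvectors and the helper names are illustrative assumptions chosen so that all arithmetic is exact.

```python
def inner(x, y):
    # x . y = sum x_i * conj(y_i)
    return sum(xi * yi.conjugate() for xi, yi in zip(x, y))

def apply(A, x):
    return tuple(sum(a * xi for a, xi in zip(row, x)) for row in A)

def conj_transpose(A):
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

A = [[2 + 0j, 1j], [-1j, 2 + 0j]]          # equals its conjugate transpose

u = (1 + 0j, 1j)                           # A u = 1 * u
v = (1 + 0j, -1j)                          # A v = 3 * v

x = (1 + 1j, 2 + 0j)
y = (3 + 0j, -1j)
lhs = inner(apply(A, x), y)                # f(x) . y
rhs = inner(x, apply(A, y))                # x . f(y)
```

The eigenvalues $1$ and $3$ are real, the eigenvectors $u$ and $v$ are orthogonal, and $f(x) \cdot y = x \cdot f(y)$ for the sample vectors.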
4 Infinite sums in a Hilbert space and Hilbert bases
4.1
We say that a system $(x_j)_{j \in J}$ of elements of a Hilbert space has a sum $x$ and write
$$x = \sum_J x_j$$
if for every $\varepsilon > 0$ there exists a finite $J(\varepsilon) \subseteq J$ such that for every finite $K$ such
that $J(\varepsilon) \subseteq K \subseteq J$ we have
$$\Big\|x - \sum_{j \in K} x_j\Big\| < \varepsilon.$$
Observation. If a sum of $(x_j)_{j \in J}$ exists then it is uniquely determined.
(Indeed, let the statement above hold for $x$ and $y$. Then for every $\varepsilon > 0$ and a suitable finite $K$,
$$\|x - y\| \le \Big\|x - \sum_{j \in K} x_j\Big\| + \Big\|y - \sum_{j \in K} x_j\Big\| < 2\varepsilon.)$$
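4.2 Theorem. $(x_j)_{j \in J}$ has a sum if and only if for every $\varepsilon > 0$ there exists a finite
subset $K(\varepsilon) \subseteq J$ such that for each finite subset $K \subseteq J$ satisfying $K \cap K(\varepsilon) = \emptyset$
one has $\big\|\sum_K x_j\big\| < \varepsilon$.
Proof. ⇒ : Consider an $\varepsilon > 0$ and put $K(\varepsilon) = J(\frac{\varepsilon}{2})$. Let $K$ be finite and such that
$K \cap K(\varepsilon) = \emptyset$. Then we have
$$\Big\|\sum_K x_j\Big\| = \Big\|\sum_{K \cup K(\varepsilon)} x_j - \sum_{K(\varepsilon)} x_j\Big\| \le \Big\|\sum_{K \cup K(\varepsilon)} x_j - x\Big\| + \Big\|\sum_{K(\varepsilon)} x_j - x\Big\| < \varepsilon.$$
⇐ : Set $K_n = K(1) \cup K(\frac{1}{2}) \cup \dots \cup K(\frac{1}{n})$ and $y_n = \sum_{j \in K_n} x_j$. From the assumption
we easily see that $(y_n)_n$ is a Cauchy sequence and hence it has a limit $x = \lim y_n$.
We will show that $x = \sum_J x_j$.
Choose an $\varepsilon > 0$ and an $n$ such that $\|x - y_n\| < \frac{\varepsilon}{2}$ and at the same time $\frac{1}{n} < \frac{\varepsilon}{2}$.
Take a finite $K \supseteq K_n$ and set $L = K \smallsetminus K_n$. Then
$$\Big\|x - \sum_K x_j\Big\| = \Big\|x - y_n - \sum_L x_j\Big\| \le \|x - y_n\| + \Big\|\sum_L x_j\Big\| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$
(The last inequality uses $L \cap K_n = \emptyset$.) □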
4.3 Theorem. A system $(x_j)_{j \in J}$ has a sum $x$ if and only if either $J$ is finite and
$\sum_{j \in J} x_j = x$ in the ordinary sense, or the following conditions hold simultaneously:
(a) for at most countably many $j$, $x_j \ne o$,
(b) whenever we order the $x_j \ne o$ in a sequence $x_1, x_2, \dots$ we have
$$\lim_n \sum_{k=1}^n x_k = x$$
with the same result $x$.
Proof. We will use the same notation as above.
⇒ : The set $L = \bigcup_{n=1}^\infty K(\frac{1}{n})$ is countable and if $j \notin L$ then $\|x_j\| < \frac{1}{n}$ for all $n$,
and hence $x_j = o$. Thus, without loss of generality,
$$J = \{1, 2, \dots, n, \dots\}.$$
For $\varepsilon > 0$ choose $n_\varepsilon$ such that $J(\varepsilon) \subseteq \{1, 2, \dots, n_\varepsilon\}$. Then for $n \ge n_\varepsilon$, we
obviously have
$$\Big\|x - \sum_{k=1}^n x_k\Big\| < \varepsilon.$$
⇐ : Suppose the sum $\sum_J x_j$ does not exist. Choose a fixed order $x_1, x_2, \dots$.
Then the limit $x = \lim_n \sum_{k=1}^n x_k$ either does not exist or it does but it is not $\sum_J x_j$.
In the latter case, by the definition, there exists an $a > 0$ such that
$$\forall \text{ finite } L \subseteq J \ \ \exists \text{ finite } K(L) \text{ such that } L \subseteq K(L) \subseteq J \text{ and } \Big\|\sum_{K(L)} x_j - x\Big\| \ge a.$$
Put
$$A_1 = \{1\}, \quad B_1 = K(A_1),$$
$$A_2 = \{1, 2, \dots, \max B_1 + 1\}, \quad B_2 = K(A_2)$$
and further, assuming $A_1, \dots, A_n$, $B_1, \dots, B_n$ are already determined, put
$$A_{n+1} = \{1, 2, \dots, \max B_n + 1\}, \quad B_{n+1} = K(A_{n+1}).$$
Now $A_1 \subseteq B_1 \subsetneq A_2 \subseteq B_2 \subsetneq A_3 \subseteq \cdots$ and
$$\lim_n \Big\|\sum_{A_n} x_j - x\Big\| = 0 \quad \text{while} \quad \Big\|\sum_{B_n} x_j - x\Big\| \ge a. \tag{4.3.1}$$
If we rearrange the sequence $x_1, x_2, \dots$ into a sequence $y_1, y_2, \dots$ by taking
successively all $x_j$'s from the blocks
$$A_1,\ B_1 \smallsetminus A_1,\ A_2 \smallsetminus B_1,\ \dots,\ A_n \smallsetminus B_{n-1},\ B_n \smallsetminus A_n,\ A_{n+1} \smallsetminus B_n,\ \dots$$
(the $x_j$ in the individual blocks ordered arbitrarily), we see that in view of (4.3.1),
$\lim_n \sum_{k=1}^n y_k$ does not exist. □
4.4 Theorem. Let $\sum_J x_j$ and $\sum_J y_j$ exist in a Hilbert space $H$. Then
(1) $\sum_J \alpha x_j$ exists and is equal to $\alpha \sum_J x_j$,
(2) $\sum_J (x_j + y_j)$ exists and is equal to $\sum_J x_j + \sum_J y_j$, and
(3) for every $z$ the sum $\sum_J (x_j z)$ exists and is equal to $(\sum_J x_j) z$.
Proof. (1) and (2) are straightforward.
(3): The mapping $(x \mapsto xz)$ is continuous. By Theorem 4.3, we can think of the
system $(x_j)_j$ as of a sequence $x_1, x_2, \dots$ with the sum $x = \lim \sum_{k=1}^n x_k$ and conclude
that $xz = (\lim \sum_{k=1}^n x_k) z = \lim \sum_{k=1}^n (x_k z) = \sum_J (x_j z)$. □
4.5
Similarly as in 4.5 of Appendix A, we will speak of an orthogonal system $(x_j)_{j \in J}$
if $x_j x_k = 0$ whenever $j \ne k$. If, moreover, $\|x_j\| = 1$ for all $j \in J$ we say that the
system is orthonormal.
4.6 Theorem. (Generalized Pythagoras’ Theorem) An orthogonal system $(x_j)_j$ in
a Hilbert space has a sum if and only if the system $(\|x_j\|^2)_j$ has a sum in $\mathbb{R}$. In that
case, we have
$$\Big\|\sum_J x_j\Big\|^2 = \sum_J \|x_j\|^2.$$
Proof. I. Existence:
⇒ : Consider the sets $K(\varepsilon)$ from 4.2. If $K \subseteq J$ is finite and $K \cap K(\varepsilon) = \emptyset$
then, using orthogonality,
$$\sum_K \|x_j\|^2 = \sum_{j,k \in K} x_j x_k = \Big(\sum_K x_j\Big)\Big(\sum_K x_j\Big) = \Big\|\sum_K x_j\Big\|^2 < \varepsilon^2.$$
⇐ : Reason as in the ⇒ implication but in reverse, using, this time, the sets
$K(\varepsilon^2)$.
II. The equality:
Set $x = \sum_J x_j$. By 4.4(3), we have
$$xx = \Big(\sum_J x_j\Big)x = \sum_J (x_j x) = \sum_J x_j \Big(\sum_J x_k\Big) = \sum_J \sum_J (x_j x_k) = \sum_j x_j x_j. \qquad \square$$
4.7 Theorem. (Bessel’s inequality) Let $(x_j)_{j \in J}$ be an orthonormal system in a
Hilbert space $H$. Then for each element $x \in H$, the sum $\sum_J |xx_j|^2$ exists and
one has
$$\sum_J |xx_j|^2 \le \|x\|^2.$$
Proof. Let $K \subseteq J$ be a finite subset. We have
$$0 \le \Big\|x - \sum_K (xx_j)x_j\Big\|^2 = \Big(x - \sum_K (xx_j)x_j\Big)\Big(x - \sum_K (xx_j)x_j\Big)$$
$$= xx - \sum_K (xx_j)(x_j x) - \sum_K \overline{(xx_j)}(xx_j) + \sum_{j,k \in K} (xx_j)\overline{(xx_k)}(x_j x_k)$$
$$= xx - \sum_K (xx_j)\overline{(xx_j)} - \sum_K \overline{(xx_j)}(xx_j) + \sum_K (xx_j)\overline{(xx_j)}$$
$$= xx - \sum_K (xx_j)\overline{(xx_j)} = xx - \sum_K |xx_j|^2$$
and hence
$$\sum_K |xx_j|^2 \le \|x\|^2.$$
Thus, the sum $\sum_J |xx_j|^2$ absolutely converges (recall 6.2 and 6.3 of Chapter 1). □
4.8
From 4.7 and 4.6, we immediately obtain the following
Corollary. If $(x_j)_{j \in J}$ is an orthonormal system in $H$ then for every $x \in H$ there
exists the sum
$$\sum_J (xx_j)x_j.$$
4.9 Theorem. (Parseval’s equality) One has $\sum_J |xx_j|^2 = \|x\|^2$, that is the Bessel
inequality becomes equality, if and only if $x = \sum_J (xx_j)x_j$.
Proof. Recall the beginning of the proof of Theorem 4.7: instead of the inequality
$0 \le \big\|x - \sum_K (xx_j)x_j\big\|^2$ consider $0 = \big\|x - \sum_K (xx_j)x_j\big\|^2$ and observe that the
formulas in the statement express the same fact. □
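Bessel's inequality and Parseval's equality can be watched side by side in $\mathbb{R}^3$. The sketch below uses the standard unit vectors; the vector $x$ and the helper `expand` are illustrative choices, not notation from the text.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x = (1.0, 2.0, 2.0)
e1, e2, e3 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)

norm_sq = dot(x, x)                                        # ||x||^2
bessel_sum = sum(dot(x, e) ** 2 for e in (e1, e2))         # orthonormal, not maximal
parseval_sum = sum(dot(x, e) ** 2 for e in (e1, e2, e3))   # orthonormal and maximal

def expand(system):
    # The vector sum_j (x . e_j) e_j, computed coordinate by coordinate.
    return tuple(sum(dot(x, e) * e[i] for e in system) for i in range(3))

partial = expand((e1, e2))
full = expand((e1, e2, e3))
```

With the incomplete system the sum of squares is $5 < 9 = \|x\|^2$ and the expansion misses the third coordinate; with the full system both the equality and the expansion $x = \sum (xx_j)x_j$ hold.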
4.10
A Hilbert basis of a Hilbert space H is a maximal orthonormal system in H , that

is, an orthonormal system .xj /j 2J such that no non-zero x 2 H is orthogonal to all
of the xj , j 2 J .
Using Zorn’s lemma (for the system of all orthogonal systems ordered by
inclusion), one easily proves the following
Proposition. Every Hilbert space has a Hilbert basis.
Remark. There is a terminological conflict: a Hilbert basis of H is not a basis

of H as a vector space; the point is not in the orthogonality – we already have the
concept of an orthogonal basis in a vector space with a scalar product, and a Hilbert
basis is in general not that either. It does not generate the space: a general element
is not necessarily a linear combination of its elements. But, as we will see, a general
element can be expressed as an “infinite linear combination” of the elements of a
Hilbert basis.
4.11 Theorem. Let $(x_j)_{j \in J}$ be an orthonormal system in a Hilbert space $H$. Then
the following statements are equivalent.
(1) $(x_j)_{j \in J}$ is a Hilbert basis.
(2) If $x$ is orthogonal to all the $x_j$, $j \in J$, then $x = o$.
(3) For every $x \in H$ one has $x = \sum_J (xx_j)x_j$.
(4) For every two $x, y \in H$ one has
$$xy = \sum_J (xx_j)\overline{(yx_j)}.$$
(5) For every $x \in H$ one has
$$\|x\| = \sqrt{\sum_J |xx_j|^2}.$$
Proof. (1)⇔(2) is just a reformulation of the definition.
(2)⇒(3) : For every $x \in H$, one has $(x - \sum (xx_j)x_j)x_k = xx_k - xx_k = 0$ for
each $k$ and hence by (2), $x - \sum (xx_j)x_j = o$.
(3)⇒(4) : We have
$$xy = \Big(\sum_j (xx_j)x_j\Big)\Big(\sum_k (yx_k)x_k\Big) = \sum_{j,k} (xx_j)\overline{(yx_k)}\,x_j x_k = \sum_j (xx_j)\overline{(yx_j)}.$$
(4)⇒(5) : Set $y = x$.
(5)⇒(1) : Suppose (1) does not hold. Choose an element $x$ such that $\|x\| = 1$
and $xx_j = 0$ for all $j$. Then $\|x\| = 1 \ne 0 = \sqrt{\sum_J |xx_j|^2}$. □
5 The Hahn-Banach Theorem
Let us now turn our attention to Banach spaces. Recall that linear maps $f : V \to \mathbb{R}$,
$f : V \to \mathbb{C}$ for a real resp. complex vector space $V$ are called linear forms.
5.1 Theorem. (Hahn - Banach) Let V be a real vector space and let W V ! R
be a function such that
(a) for all x; y 2 V , .x C y/ .x/ C .y/ and
(b) for every x 2 V and r 2 h0; 1/, .rx/ D r .x/.
Let V0 be a vector subspace of V and let f0 be a linear form on V0 such that
f0 .x/ .x/ for all x 2 V0 :
Then there exists a linear form f on V such that
f0 D f jV0 and f .x/ .x/ for all x 2 V:
Proof. Consider the system $\mathcal{W}$ of all pairs $(W, g)$ where $W \supseteq V_0$ is a vector subspace of $V$ and $g : W \to \mathbb{R}$ is a linear form such that $g|V_0 = f_0$ and $g(x) \le \rho(x)$ for all $x \in W$.
On $\mathcal{W}$ define an order $\sqsubseteq$ by the formula
$$(W_1, g_1) \sqsubseteq (W_2, g_2) \quad \equiv_{\mathrm{df}} \quad W_1 \subseteq W_2 \ \text{and} \ g_2|W_1 = g_1.$$
Let $\mathcal{C} = \{(W_i, g_i) \mid i \in J\} \subseteq \mathcal{W}$ be a chain in this order. Setting $W = \bigcup_{i \in J} W_i$ and defining $g : W \to \mathbb{R}$ by $g(x) = g_i(x)$ for $x \in W_i$, we obtain a $(W, g)$ majorizing all the $(W_i, g_i)$. By Zorn's Lemma, there is, hence, a $(W, g) \in \mathcal{W}$ maximal in the order $\sqsubseteq$.
We will prove the statement of the theorem by showing that $W = V$. Suppose $W \ne V$. Choose $a \in V \setminus W$ and let
$$W' = \{x + ra \mid x \in W,\ r \in \mathbb{R}\}.$$
For arbitrary $x, y \in W$ we have
$$g(x) + g(y) = g(x + y) \le \rho(x + a + y - a) \le \rho(x + a) + \rho(y - a)$$
and hence
$$g(y) - \rho(y - a) \le -g(x) + \rho(x + a).$$
Since $x, y$ are arbitrary there is a real number $\alpha$ such that
$$\forall x, y \in W: \quad g(y) - \rho(y - a) \le \alpha \le -g(x) + \rho(x + a) \qquad (*)$$
(for instance $\alpha = \sup_y (g(y) - \rho(y - a))$, or $\alpha = \inf_x (-g(x) + \rho(x + a))$). Now define a linear form
$$h : W' \to \mathbb{R} \quad \text{by letting} \quad h(x + ra) = g(x) + r\alpha$$
(this is correct: if $x + ra = y + sa$ then $(r - s)a = y - x \in W$, hence $s = r$ and $x = y$). Let $r > 0$. Since, by (*),
$$g\Big(\frac{1}{r}x\Big) + \alpha \le \rho\Big(\frac{1}{r}x + a\Big),$$
we have
$$h(x + ra) = r\Big(g\Big(\frac{1}{r}x\Big) + \alpha\Big) \le r\rho\Big(\frac{1}{r}x + a\Big) = \rho(x + ra).$$
Similarly if $r < 0$ we use the inequality
$$g\Big(-\frac{1}{r}x\Big) - \alpha \le \rho\Big(-\frac{1}{r}x - a\Big)$$
to obtain
$$h(x + ra) = -r\Big(g\Big(-\frac{1}{r}x\Big) - \alpha\Big) \le -r\rho\Big(-\frac{1}{r}x - a\Big) = \rho(x + ra).$$
Since trivially $h(x + 0 \cdot a) \le \rho(x + 0 \cdot a)$ we conclude that $h(y) \le \rho(y)$ for all $y \in W'$, contradicting the maximality of $(W, g)$. □
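The one-step extension in the proof can be tried out numerically. The following is only a sketch under illustrative assumptions that are not taken from the text: the space is $\mathbb{R}^2$, the dominating function is the $\ell_1$ norm, $W$ is the first coordinate axis with $g(t, 0) = t$, and $a = (0, 1)$. The names `rho`, `g`, `h` and `dominated` are hypothetical helpers. The code estimates the two bounds for $\alpha$ in (*) on a grid and checks the domination of the extended form for a few values of $\alpha$.

```python
def rho(v):
    """Sublinear function on R^2: the l1 norm (an illustrative assumption)."""
    return abs(v[0]) + abs(v[1])

def g(t):
    """Linear form on W = {(t, 0)}: g(t, 0) = t, which is dominated by rho on W."""
    return t

a = (0.0, 1.0)                               # a vector outside W
ts = [k / 10.0 for k in range(-100, 101)]    # sample parameters in [-10, 10]

# Grid estimates of the two bounds for alpha appearing in inequality (*).
lower = max(g(t) - rho((t - a[0], 0.0 - a[1])) for t in ts)
upper = min(-g(t) + rho((t + a[0], 0.0 + a[1])) for t in ts)

def h(t, r, alpha):
    """Candidate extension h((t, 0) + r a) = g(t) + r alpha."""
    return g(t) + r * alpha

def dominated(alpha):
    """Check h <= rho on a grid of points (t, r) of W' = R^2."""
    return all(h(t, r, alpha) <= rho((t, r)) + 1e-12 for t in ts for r in ts)

print(lower, upper)
print(dominated(0.0), dominated(1.0), dominated(1.5))
```

In this example every $\alpha$ between the two bounds (here roughly $-1$ and $1$) gives a dominated extension, while a value outside that interval fails the check, which illustrates why the extension is in general not unique.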
5.2 Corollary. (Hahn-Banach's Theorem - the complex version) Let $V$ be a complex vector space, and let $\rho : V \to \langle 0, \infty)$ satisfy
(a) for all $x, y \in V$, $\rho(x + y) \le \rho(x) + \rho(y)$, and
(b) for every $x \in V$ and $r \in \mathbb{C}$, $\rho(rx) = |r|\rho(x)$.
Let $V_0$ be a vector subspace of $V$ and let $f_0$ be a linear form on $V_0$ such that
$$|f_0(x)| \le \rho(x) \quad \text{for all } x \in V_0.$$
Then there exists a linear form $f$ on $V$ such that
$$f_0 = f|V_0 \quad \text{and} \quad |f(x)| \le \rho(x) \quad \text{for all } x \in V.$$
Proof. View $V$ as a vector space over $\mathbb{R}$. By Hahn-Banach's Theorem, there exists a linear map $g : V \to \mathbb{R}$ such that $g|V_0 = \mathrm{Re}(f_0)$, $g(x) \le \rho(x)$. Then there exists a (unique) complex-linear map $f : V \to \mathbb{C}$ such that $\mathrm{Re}(f) = g$ (namely $f(x) = g(x) - i g(ix)$). In particular, by uniqueness, $f|V_0 = f_0$. Now for every $x \in V$, there exists a complex number $\lambda$ of modulus $1$ such that
$$\lambda f(x) = |f(x)|.$$
Thus,
$$|f(x)| = f(\lambda x).$$
Hence $f(\lambda x) \in \mathbb{R}$, and hence $f(\lambda x) = g(\lambda x)$. Now compute:
$$|f(x)| = g(\lambda x) \le \rho(\lambda x) = |\lambda| \rho(x) = \rho(x). \qquad \square$$
5.3
As an easy consequence of the Hahn-Banach Theorem we obtain
Proposition. Let $L$ be a normed real or complex vector space and let $M$ be a vector subspace of $L$. Let $g$ be a continuous linear form on $M$. Then there exists a continuous linear form $f$ on $L$ extending $g$ such that $\|f\| = \|g\|$ (the norms in $L^*$ and $M^*$).
Proof. Use Theorem 5.1 resp. Corollary 5.2 with $V = L$, $V_0 = M$ and $\rho(x) = \|g\| \cdot \|x\|$. □
5.4
And here is another one.
Proposition. Let $L$ be a normed vector space and let $M$ be a closed vector subspace. Let $M \ne L$. Then there is a continuous non-zero linear form $f$ on $L$ such that $f|M$ is constant zero.
Remark. Note that we speak of continuity but not of the norm: the norm of $f|M$ is zero and would not help us.
Proof. Choose an $a \in L \setminus M$. Since $M$ is closed, $\inf\{\|x - a\| \mid x \in M\} = d > 0$. Define a linear form $g$ on $M' = \{x + ra \mid x \in M,\ r \in \mathbb{R}\}$ by setting $g(x + ra) = r$. For $r \ne s$ we have
$$\|(x + ra) - (y + sa)\| = \|x - y + (r - s)a\| = |r - s| \cdot \Big\|\frac{1}{r - s}(x - y) + a\Big\| \ge |r - s| d$$
and hence $g$ is continuous. Now extend $g$ to a continuous linear form on $L$ using the Hahn-Banach Theorem. □
6 Dual Banach spaces and reflexivity
6.1
Recall the definition 3.5.2 of the dual $L^*$ of a normed vector space $L$.
Proposition. $L^*$ is always complete (and consequently is always a Banach space).
Proof. To fix ideas, let us consider the real case (the complex case is analogous). Suppose $(f_n)$ is a Cauchy sequence in $L^*$. Let $B$ be the unit ball in $L$. Then, by definition, the restrictions $f_n|B$ form a Cauchy sequence in the space $C(B)$ of bounded continuous functions on $B$, which we discussed in Chapter 2 (and, in fact, the $L^*$-distances $\|f_m - f_n\|$ are equal to the $C(B)$-distances). However, we already know that the space $C(B)$ is complete, and thus the sequence $(f_n|B)$ converges uniformly to a function $f_0 : B \to \mathbb{R}$. Then it is immediate that the function $f \in L^*$ defined by
$$f(v) = \|v\| \cdot f_0(v / \|v\|)$$
for $v \ne o$ (and $f(o) = 0$) is the limit of the sequence $(f_n)$ in $L^*$. □
6.2
Recall from Section 3.6 that for a continuous linear mapping $f : L \to M$, we have a continuous linear mapping
$$f^* : M^* \to L^* \quad \text{defined by setting} \quad f^*(\varphi) = \varphi \circ f,$$
and that we have $\|f^*\| \le \|f\|$. In fact, the norms are equal.
Proposition. We have $\|f^*\| = \|f\|$.
Proof. To fix ideas, let us consider the real case (the complex case is analogous). Choose an $\varepsilon > 0$ and an $x_0 \in L$ such that $0 < \|x_0\| \le 1$ and $\|f(x_0)\| \ge \|f\| - \varepsilon$. On the vector subspace $\{r f(x_0) \mid r \in \mathbb{R}\}$ define a linear form $g$ by setting $g(r f(x_0)) = r \|f(x_0)\|$. Then $\|g\| = 1$ (the unit ball is $\{r f(x_0) \mid |r| \le \frac{1}{\|f(x_0)\|}\}$) and hence there is, by Proposition 5.3, a linear form $\varphi \in M^*$ such that $\|\varphi\| \le 1$ and $\varphi(f(x_0)) = \|f(x_0)\|$. Thus,
$$\|f^*\| \ge \|f^*(\varphi)\| = \|\varphi \circ f\| \ge |\varphi(f(x_0))| = \|f(x_0)\| \ge \|f\| - \varepsilon.$$
Since $\varepsilon > 0$ was arbitrary we conclude that $\|f^*\| = \|f\|$. □
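6.3
For a normed linear space $L$ define
$$\nu = \nu_L : L \to L^{**} \quad \text{by setting} \quad (\nu(x))(\varphi) = \varphi(x).$$
6.3.1 Proposition. $\nu$ is a linear map preserving norm, and for every continuous linear map $f : L \to M$ we have a commutative diagram
$$\begin{array}{ccc} L & \xrightarrow{\ \nu_L\ } & L^{**} \\ {\scriptstyle f} \downarrow & & \downarrow {\scriptstyle f^{**}} \\ M & \xrightarrow{\ \nu_M\ } & M^{**}. \end{array}$$
Proof. Again, to fix ideas, let us work in the real case. The complex case is the same.
Checking that $\nu$ is linear is straightforward. Consider the formula
$$\|\nu(x)\| = \sup\{|\nu(x)(f)| \mid \|f\| \le 1\} = \sup\{|f(x)| \mid \|f\| \le 1\}.$$
By Lemma 3.6, $|f(x)| \le \|f\| \cdot \|x\|$ and hence we see that $\|\nu(x)\| \le \|x\|$.
Now fix an $x \ne o$ and define a linear form $g : L' = \{rx \mid r \in \mathbb{R}\} \to \mathbb{R}$ by setting $g(rx) = r\|x\|$. The unit ball in $L'$ is the set $\{rx \mid |r| \le \frac{1}{\|x\|}\}$ and hence $\|g\| = 1$. By Proposition 5.3, we can extend $g$ to a linear form $f$ on $L$ with $\|f\| = 1$ and we have $\nu(x)(f) = f(x) = \|x\|$. Thus, $\|\nu(x)\| \ge \|x\|$.
Finally, let $f : L \to M$ be a continuous linear map, $x \in L$ and $\varphi \in M^*$. We have
$$((f^{**} \circ \nu_L)(x))(\varphi) = (f^{**}(\nu_L(x)))(\varphi) = (\nu_L(x) \circ f^*)(\varphi) = \nu_L(x)(f^*(\varphi)) = \nu_L(x)(\varphi \circ f)$$
$$= \varphi(f(x)) = (\nu_M(f(x)))(\varphi) = ((\nu_M \circ f)(x))(\varphi),$$
that is, $f^{**} \circ \nu_L = \nu_M \circ f$. □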
6.4
A Banach space $B$ is said to be reflexive if the mapping $\nu_B$ is surjective (and hence a norm preserving isomorphism).
6.4.1 Remark:
We have seen in Theorem 3.7.1 that the dual space of a Hilbert space $H$ is antilinearly isomorphic to $H$ by the inner product. Composing the antilinear isomorphisms
$$H \to H^* \to (H^*)^*,$$
one gets the map $\nu$ of 6.3, and thus a Hilbert space is always reflexive.
6.5 Proposition. Let a Banach space $B$ not be reflexive. Then neither is the Banach space $B^*$.
Proof. Since $B$ is complete, the vector subspace $\nu_B[B]$ of $B^{**}$ is also complete (it is norm-isomorphic) and hence, by Proposition 7.3.1 of Chapter 2, closed in $B^{**}$. By Proposition 5.4, there exists an $F \in B^{***}$, a linear form on $B^{**}$ that is non-zero but identically zero on $\nu_B[B]$. We will show that it is not in $\nu_{B^*}[B^*]$. Suppose it is, that is, $F = \nu_{B^*}(f)$ for a linear form $f$ on $B$. In particular, for each $\nu_B(x)$ we have $F(\nu_B(x)) = 0$. Thus,
$$0 = \nu_{B^*}(f)(\nu_B(x)) = \nu_B(x)(f) = f(x)$$
for all $x$, hence $f = o$ and finally also $F = \nu_{B^*}(o)$ is identically zero, a contradiction. □
6.6 The weak topology
The following construction works over $\mathbb{R}$ or $\mathbb{C}$. To fix ideas, let us work over $\mathbb{C}$. The treatment over $\mathbb{R}$ is analogous. Let $W$ be a Banach space and let $W^*$ be its dual. The weak topology of $W^*$ (with respect to $W$) has a basis of open sets determined by all possible choices of elements $f_1, \dots, f_n \in W$ and open sets $U_1, \dots, U_n \subseteq \mathbb{C}$. The basis element corresponding to this data is
$$\{X \in W^* \mid X(f_1) \in U_1, \dots, X(f_n) \in U_n\}.$$
6.6.1 Lemma. Let $V$ be a normed vector space. Then the unit ball $B$ of $(V^*)^*$ is the closure of the image $B_1$ of the unit ball of $V$ under the canonical map $V \to (V^*)^*$, with respect to the weak topology (with respect to $V^*$).
Proof. To prove that $B$ is contained in the closure of $B_1$ with respect to the weak topology, it suffices to show that every open set $U$ in the weak topology disjoint from $B_1$ is also disjoint from $B$. For open sets $U$ which are of the form
$$F_1^{-1}[U_1] \cap \dots \cap F_n^{-1}[U_n]$$
with $U_1, \dots, U_n$ open, for $F_1, \dots, F_n \in V^*$ (such sets form a basis of the weak topology), we may as well take the quotient of both $V$ and $(V^*)^*$ by the annihilator of $F_1, \dots, F_n$ (i.e. the subspace of elements which have $0$ evaluation on $F_1, \dots, F_n$). The map induced on the quotients from the canonical map $V \to (V^*)^*$, however, is the canonical map $W \to (W^*)^*$ where $W$ is the quotient of $V$ by the annihilator of $F_1, \dots, F_n$, which is an isomorphism since $W$ is finite-dimensional.
On the other hand, if $X \notin B$, there exists an $F \in V^*$ such that $\|F\| = 1$, $X(F) > 1$. This means that the open set determined by
$$F_1 = F$$
and
$$U_1 = (1 + (X(F) - 1)/2, \infty)$$
contains $X$ but is disjoint from $B_1$, thus showing that $X$ is not in the closure of $B_1$ with respect to the weak topology. □
6.7 Theorem. (The Milman-Pettis Theorem) Every uniformly convex Banach space $V$ is reflexive.
Proof (The proof we present here is due to J.R. Ringrose). Let $V$ be a uniformly convex Banach space. By uniform convexity, for every $\varepsilon > 0$ it is possible to choose a $\delta = \delta(\varepsilon) > 0$ such that if $x, y \in V$ satisfy
$$\|x\|, \|y\| \le 1, \quad \|x + y\| \ge 2 - \delta,$$
then
$$\|x - y\| < \varepsilon.$$
Now suppose $V$ is a uniformly convex Banach space which is not reflexive. Let $B$ be the closed unit ball in $(V^*)^*$, and let $B_1$ be the image of the closed unit ball in $V$ under the canonical map $V \to (V^*)^*$. Then $B$ is contained in the closure of $B_1$ under the weak topology (with respect to the space $V^*$) by Lemma 6.6.1. Assuming $B \ne B_1$, since the canonical embedding $V \to (V^*)^*$ is an isometry, by completeness, the image is closed, and thus $B_1$ is a closed subset of $B$. This means that there exists an $\varepsilon > 0$ and an $X \in B$ such that, in $(V^*)^*$,
$$\Omega(X, 2\varepsilon) \cap B_1 = \emptyset. \qquad (*)$$
Now choose an $F \in V^*$ such that $\|F\| = 1$ and $|X(F) - 1| < \frac{1}{2}\delta$ where $\delta = \delta(\varepsilon)$. Then put
$$\mathcal{V} = \{Y \in (V^*)^* \mid |Y(F) - 1| < \tfrac{1}{2}\delta\}.$$
If $Y, Y_1 \in \mathcal{V} \cap B_1$, we have $|Y(F) + Y_1(F)| > 2 - \delta$, and hence $\|Y + Y_1\| > 2 - \delta$, and therefore $\|Y - Y_1\| < \varepsilon$. Fixing $Y$, we deduce that
$$\mathcal{V} \cap B_1 \subseteq Y + \varepsilon B.$$
Since, however, the right-hand set is closed in $(V^*)^*$ under the weak topology (with respect to $V^*$), while $X$ is in the closure of $\mathcal{V} \cap B_1$ with respect to the weak topology (since, in that topology, $\mathcal{V}$ is open), we deduce that $X \in Y + \varepsilon B$. This is a contradiction with (*). □
7 The duality of $L_p$-spaces
We begin with the following result:
7.1 Theorem. For $1 < p < \infty$, the spaces $L_p(B)$, $L_p(B, \mathbb{C})$ are uniformly convex.
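The role of the restriction $1 < p < \infty$ can be seen already in a two-point analogue, which is an illustrative assumption and not the Lebesgue setting of the theorem: vectors in $\mathbb{R}^2$ with the $p$-norm. The sketch below uses hypothetical helper names `p_norm` and `midpoint_norm`. It takes the unit vectors $(1, 0)$ and $(0, 1)$, whose distance is bounded away from zero for every $p$, and computes the norm of their midpoint.

```python
def p_norm(v, p):
    """The p-norm of a finite real vector (counting measure on a finite set)."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

def midpoint_norm(x, y, p):
    """Norm of (x + y) / 2 in the p-norm."""
    return p_norm([(s + t) / 2.0 for s, t in zip(x, y)], p)

x, y = (1.0, 0.0), (0.0, 1.0)      # unit vectors for every p
for p in (1.0, 1.5, 2.0, 4.0):
    print(p, midpoint_norm(x, y, p))
```

For $p = 1$ the midpoint still has norm $1$ although the two vectors are far apart, so no $\delta$ as in the definition of uniform convexity can exist; for the other listed exponents the midpoint norm drops strictly below $1$.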
7.2 Reduction to the real case
The remainder of this section will consist of a proof of Theorem 7.1. The first thing we should realize is that the real and complex cases are actually somewhat different, since in the complex case the definition of $L_p$ uses the complex absolute value, which, in effect, is a Hilbert space norm on $\mathbb{C} = \mathbb{R}^2$. Because of this, we don't have an obvious isomorphism of $L_p(B, \mathbb{C})$, considered as a real Banach space, to a real $L_p$-space (although we won't prove that they are not isomorphic). Of course, $L_p(B)$ is embedded into $L_p(B, \mathbb{C})$ isometrically, and hence the uniform convexity of $L_p(B, \mathbb{C})$ implies the uniform convexity of $L_p(B)$. We will, however, be interested in the opposite implication, as the proof of uniform convexity of $L_p(B)$ is, in fact, somewhat simpler.
Assume, therefore, that we already know that $L_p(B)$ is uniformly convex, and let $(f_n)$, $(g_n)$ be sequences in $L_p(B, \mathbb{C})$ such that
$$\|f_n\|_p = \|g_n\|_p = 1, \quad \Big\|\frac{f_n + g_n}{2}\Big\|_p \to 1.$$
Then certainly
$$\|\,|f_n|\,\|_p = \|\,|g_n|\,\|_p = 1,$$
and
$$\Big\|\frac{f_n + g_n}{2}\Big\|_p \le \Big\|\frac{|f_n| + |g_n|}{2}\Big\|_p \le 1$$
(the second inequality by the triangle inequality), so
$$\Big\|\frac{|f_n| + |g_n|}{2}\Big\|_p \to 1,$$
and hence by the uniform convexity of $L_p(B)$,
$$\|\,|f_n| - |g_n|\,\|_p \to 0.$$
This means that there exist measurable functions $\alpha_n : B \to \mathbb{C}$, $|\alpha_n(x)| = 1$ for all $x$, such that
$$\|f_n - \alpha_n g_n\|_p \to 0. \qquad (*)$$
From the uniform convexity of Hilbert spaces (applied to the 1-dimensional complex Hilbert space $\mathbb{C}$), we know that for each $\varepsilon > 0$ there exists a $\delta > 0$ such that
$$|\alpha_n(x) - 1| > \varepsilon \ \Rightarrow\ \Big|\frac{g_n(x) + \alpha_n(x) g_n(x)}{2}\Big| < (1 - \delta)^{1/p} |g_n(x)|.$$
Denote by $S_n$ the set of all $x \in B$ such that $|\alpha_n(x) - 1| > \varepsilon$, and denote by $c_n = c_{S_n}$ its characteristic function (i.e. the function equal to $1$ on $S_n$ and $0$ elsewhere). Then
$$\Big(\Big\|\frac{g_n + \alpha_n g_n}{2}\Big\|_p\Big)^p \le (\|g_n\|_p)^p - \delta (\|g_n c_n\|_p)^p.$$
Taking $n \to \infty$ and using (*), we obtain
$$\lim_{n \to \infty} \|g_n c_n\|_p = 0,$$
and hence
$$\lim_{n \to \infty} \|f_n - g_n\|_p \le \lim_{n \to \infty} \|f_n - \alpha_n g_n\|_p + \lim_{n \to \infty} \|(1 - \alpha_n) g_n\|_p \le \lim_{n \to \infty} (\varepsilon \|g_n\|_p + 2\|g_n c_n\|_p) = \varepsilon.$$
Since $\varepsilon > 0$ was arbitrary, we are done: it suffices to prove the uniform convexity of $L_p(B)$.
7.3 The uniform convexity of $L_p(B)$
We will show now a simple argument proving the uniform convexity of $L_p(B)$ which does not generalize to the complex case, thus explaining in particular why the reduction 7.2 pays off.
7.3.1 Lemma. Let $1 \le p < \infty$ and let $f, g$ be non-negative real functions which represent elements in $L_p(B)$. Then
$$(\|f + g\|_p)^p \ge (\|f\|_p)^p + (\|g\|_p)^p.$$
Proof. Note that for non-negative numbers $x, y$ and $p \ge 1$, we have
$$(x + y)^p \ge x^p + y^p.$$
In effect, dividing by $y^p$, we may assume without loss of generality $y = 1$, and then
$$(x + 1)^p - x^p = \int_x^{x+1} p t^{p-1}\,dt$$
is a non-decreasing function in $x$, and hence is $\ge 1$ (its value at $x = 0$). Now we have
$$(\|f + g\|_p)^p - (\|f\|_p)^p - (\|g\|_p)^p = \int_B ((f(t) + g(t))^p - f(t)^p - g(t)^p) \ge 0,$$
as claimed. □
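The pointwise inequality $(x + y)^p \ge x^p + y^p$ on which the lemma rests is easy to probe numerically. The following sketch is only a sanity check on random samples, not a proof; the helper name `superadditive_gap` and the sampling ranges are assumptions chosen for illustration.

```python
import random

def superadditive_gap(x, y, p):
    """Return (x + y)**p - x**p - y**p for non-negative x, y and p >= 1."""
    return (x + y) ** p - x ** p - y ** p

random.seed(0)
samples = [(random.uniform(0, 5), random.uniform(0, 5), random.uniform(1, 4))
           for _ in range(1000)]

# The smallest gap found; it should be non-negative up to rounding error.
worst = min(superadditive_gap(x, y, p) for x, y, p in samples)
print(worst)
```

Integrating the non-negative gap over $B$ with $x = f(t)$, $y = g(t)$ is exactly the last step of the proof.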
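7.3.2 Lemma. If, in a normed vector space, sequences $(x_n)$, $(y_n)$ satisfy
$$\|x_n\| \to 1, \quad \|x_n + y_n\|^p + \|x_n - y_n\|^p \to 2,$$
then
$$\|x_n + y_n\| \to 1, \quad \|x_n - y_n\| \to 1.$$
Proof. Using the compactness of the interval $\langle 0, 3\rangle$, by picking a subsequence, we may assume, without loss of generality, that
$$\|x_n + y_n\| \to \alpha, \quad \|x_n - y_n\| \to \beta$$
for some $\alpha, \beta \ge 0$. Now we have
$$\alpha + \beta = \lim_{n \to \infty} (\|x_n + y_n\| + \|x_n - y_n\|) \ge \lim_{n \to \infty} \|2x_n\| = 2,$$
while $\alpha^p + \beta^p = 2$. Thus,
$$\Big(\frac{1}{2}(\alpha + \beta)\Big)^p \ge \frac{1}{2}(\alpha^p + \beta^p) = 1,$$
and hence, since $t^p$ is a (strictly, for $p > 1$) convex function on $\langle 0, \infty)$, $\alpha = \beta$ and equality occurs, so that $\alpha = \beta = 1$. □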
7.3.3 Proof that $L_p(B)$ is uniformly convex
Suppose $(f_n)$, $(g_n)$ to be sequences in $L_p(B)$ such that
$$\|f_n\|_p = \|g_n\|_p = 1, \quad \Big\|\frac{f_n + g_n}{2}\Big\|_p \to 1.$$
Put
$$x_n = \frac{f_n + g_n}{2}, \quad y_n = \frac{f_n - g_n}{2}.$$
Then
$$\|x_n + y_n\|_p = \|f_n\|_p = 1 = \|g_n\|_p = \|x_n - y_n\|_p,$$
and hence
$$2 = (\|x_n + y_n\|_p)^p + (\|x_n - y_n\|_p)^p$$
$$= \int_B (|x_n(t) + y_n(t)|^p + |x_n(t) - y_n(t)|^p)$$
$$= \int_B (|\,|x_n(t)| + |y_n(t)|\,|^p + |\,|x_n(t)| - |y_n(t)|\,|^p)$$
$$= (\|\,|x_n| + |y_n|\,\|_p)^p + (\|\,|x_n| - |y_n|\,\|_p)^p.$$
(Note that in the third equality, it is crucial that $x_n$, $y_n$ are real numbers.) Now by Lemma 7.3.2,
$$\|\,|x_n| + |y_n|\,\|_p \to 1.$$
Using Lemma 7.3.1,
$$(\|y_n\|_p)^p \le (\|\,|x_n| + |y_n|\,\|_p)^p - (\|x_n\|_p)^p \to 0,$$
as claimed. This concludes the proof that $L_p(B)$ is uniformly convex, and hence, by Subsection 7.2, the proof of Theorem 7.1. □
7.4 Theorem. Let $B$ be a Borel subset of $\mathbb{R}^n$. Let $1 < p < \infty$ and let $\frac{1}{p} + \frac{1}{q} = 1$ (then, of course, also $1 < q < \infty$). We have isometric isomorphisms of Banach spaces
$$U_q : L_q(B) \cong (L_p(B))^*$$
and
$$U_q : L_q(B, \mathbb{C}) \cong (L_p(B, \mathbb{C}))^*$$
given by
$$(U_q(y))(x) = \int_B x \cdot y. \qquad (7.4.1)$$
Proof. Let us prove the complex case (the real case is analogous). By Hölder's inequality, the integral (7.4.1) exists, and we have
$$|(U_q(y))(x)| \le \|y\|_q \cdot \|x\|_p.$$
Since $U_q(y)$ is linear, we therefore have $U_q(y) \in (L_p(B, \mathbb{C}))^*$ with
$$\|U_q(y)\| \le \|y\|_q. \qquad (*)$$
To deduce that $U_q$ is an isometry, we need to show that the norms are in fact equal. Let, therefore, $y \in L_q(B, \mathbb{C})$ be such that $\|y\|_q = 1$. Let $\alpha : B \to \mathbb{C}$ be a measurable function such that $|\alpha(t)| = 1$ for $t \in B$ and
$$\alpha(t) y(t) = |y(t)|.$$
Define $x(t) = |y(t)|^{q/p} \alpha(t)$. Then $x \in L_p(B, \mathbb{C})$, and $\|x\|_p = 1$. We compute:
$$(U_q(y))(x) = \int_B xy = \int_B |y|^{q/p} \cdot |y| = \int_B |y|^q = 1,$$
thus proving the equality in (*).
Thus, $U_q$ is an isometric embedding, and since $L_q(B, \mathbb{C})$ is complete, the image of $U_q$ is closed. We need to show this map is onto. However, if $U_q$ is not onto, then by Proposition 5.4, there exists a non-zero $\omega \in ((L_p(B, \mathbb{C}))^*)^*$ such that
$$\omega(U_q(y)) = 0 \quad \text{for all } y \in L_q(B, \mathbb{C}).$$
However, since $L_p(B, \mathbb{C})$ is uniformly convex by Theorem 7.1, it is reflexive by Theorem 6.7, and hence $\omega = \nu(x)$ for some $x \in L_p(B, \mathbb{C})$. Since $\omega(U_q(y)) = (U_q(y))(x) = \int_B x \cdot y = (U_p(x))(y)$, we conclude that
$$(U_p(x))(y) = 0 \quad \text{for all } y \in L_q(B, \mathbb{C}),$$
which contradicts the fact that $U_p$ is an isometry. □
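The norming element $x = |y|^{q/p}\alpha$ used in the proof can be checked in a finite analogue. This is a sketch under an assumption not made in the text: the integral over $B$ is replaced by a finite sum (counting measure on four points), and the names `p_norm` and `norming_element` as well as the sample vector are hypothetical.

```python
def p_norm(v, p):
    """The p-norm of a finite complex vector."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

def norming_element(y, p, q):
    """Build x = |y|^(q/p) * alpha, where |alpha| = 1 and alpha * y = |y|."""
    x = []
    for t in y:
        if t == 0:
            x.append(0j)                      # alpha is arbitrary where y vanishes
        else:
            alpha = t.conjugate() / abs(t)    # unit modulus, alpha * t = |t|
            x.append(abs(t) ** (q / p) * alpha)
    return x

p = 3.0
q = p / (p - 1.0)                             # conjugate exponent: 1/p + 1/q = 1
raw = [1 + 2j, -3j, 0.5 + 0j, -2 + 1j]
scale = p_norm(raw, q)
y = [t / scale for t in raw]                  # normalise so that the q-norm is 1

x = norming_element(y, p, q)
pairing = sum(s * t for s, t in zip(x, y))    # finite analogue of (7.4.1)
print(p_norm(x, p), pairing)
```

Up to rounding, both printed values equal $1$, mirroring the computation $\int_B xy = \int_B |y|^q = 1$ that gives equality in (*).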
8 Images of Banach spaces under bounded linear maps
8.1
Recall that a map $f : X \to Y$ between topological spaces is open if the image of each open subset of $X$ is open. It is relatively open if its restriction $X \to f[X]$ is open. In this section, we will write for subsets $S, T$ of a vector space $V$ and a point $x \in V$,
$$x + S = \{x + y \mid y \in S\},$$
$$S + T = \{x + y \mid x \in S,\ y \in T\}$$
and similarly $x - S$, $S - T$ etc.
We have an immediate
8.1.1 Observation. A linear map $f : M \to N$ between normed vector spaces is open if and only if the image $f[U]$ of every neighbourhood of zero in $M$ is a neighbourhood of zero in $N$.
8.1.2 Corollary. An open linear map $f : M \to N$ is onto.
Proof. $f[M]$ contains an open neighbourhood $U$ of $o$, so there exists an $\varepsilon > 0$ such that $\|v\| < \varepsilon \Rightarrow v \in f[M]$. But scalar multiples of elements of $U$ are also in $f[M]$ since $f$ is linear, and these include all elements of $N$. □
8.2 Proposition. Let $M, N$ be normed vector spaces and let $f : M \to N$ be an open continuous linear map. If $M$ is complete then $N$ is also complete.
Proof. Let $(y_n)$ be a Cauchy sequence in $N$. Let $B$ be the unit ball in $M$. Then since $f$ is open, there exists a $\delta > 0$ such that $f[B]$ contains all vectors of norm $\le \delta$. By passing to a subsequence, if necessary, we may assume that
$$\|y_n - y_{n+1}\| < \frac{1}{2^n}.$$
Now $f$ is onto, so there is an $x_1 \in M$ such that $f(x_1) = y_1$. By induction, then, we may choose $x_n$ such that
$$f(x_n) = y_n$$
and
$$\|x_n - x_{n+1}\| < \frac{1}{2^n \delta}.$$
Then $(x_n)$ is a Cauchy sequence. Let $x = \lim x_n$. Then $f(x) = \lim y_n$ by continuity. □
8.3 Lemma. Let $M, M_1$ be normed vector spaces such that $M$ is complete. Let $f : M \to M_1$ be a continuous linear map such that for each neighbourhood $U$ of $o$ in $M$ the closure $\overline{f[U]}$ of the image is a neighbourhood of $o$ in $M_1$. Then for each neighbourhood $U$ of $o$ the image $f[U]$ is a neighbourhood of $o$ (and hence $f$ is open).
Proof. Choose a neighbourhood $U$ of $o$ and an $\alpha > 0$ such that
$$\{x \in M \mid \|x\| \le \alpha\} \subseteq U.$$
Let
$$U_n = \Big\{x \mid \|x\| \le \frac{\alpha}{2^n}\Big\}, \quad V_n = \overline{f[U_n]}.$$
Thus, every $V_n$ is a neighbourhood of $o$ in $M_1$. We will prove that $f[U]$ is a neighbourhood of zero by showing that $V_1 \subseteq f[U]$. To this end, let $y \in V_1$ be arbitrary; we look for an $x \in U$ such that $y = f(x)$.
We will find inductively $x_k \in U_k$, $k = 1, 2, \dots$, such that for all $n$,
$$y - \sum_{k=1}^{n} f(x_k) \in V_{n+1} \quad \text{and} \quad \Big\|y - \sum_{k=1}^{n} f(x_k)\Big\| < \frac{1}{n}. \qquad (*)$$
First, since $(y - V_2) \cap \{z \mid \|y - z\| < 1\}$ is a neighbourhood of $y$ and $y$ is in the closure of $f[U_1]$, we have a
$$y_1 \in (y - V_2) \cap \{z \mid \|y - z\| < 1\} \cap f[U_1],$$
that is, a $y_1 = f(x_1)$ with $x_1 \in U_1$ such that $\|y - f(x_1)\| < 1$ and $y_1 = y - v$ with $v \in V_2$, that is, $y - f(x_1) = v \in V_2$.
Now suppose we already have $x_1, \dots, x_n$ such that (*) holds. Then $y - \sum_{k=1}^{n} f(x_k) \in \overline{f[U_{n+1}]}$ and since
$$\Big(\Big(y - \sum_{k=1}^{n} f(x_k)\Big) - V_{n+2}\Big) \cap \Big\{z \ \Big|\ \Big\|y - \sum_{k=1}^{n} f(x_k) - z\Big\| < \frac{1}{n+1}\Big\}$$
is a neighbourhood of $y - \sum_{k=1}^{n} f(x_k)$ there is an $x_{n+1} \in U_{n+1}$ such that
$$\Big(y - \sum_{k=1}^{n} f(x_k)\Big) - f(x_{n+1}) = y - \sum_{k=1}^{n+1} f(x_k) \in V_{n+2}, \quad \text{and}$$
$$\Big\|y - \sum_{k=1}^{n} f(x_k) - f(x_{n+1})\Big\| = \Big\|y - \sum_{k=1}^{n+1} f(x_k)\Big\| < \frac{1}{n+1},$$
which are the conditions (*) with $n + 1$ replacing $n$.
Since $x_k \in U_k$, clearly, the sequence
$$\Big(\sum_{k=1}^{n} x_k\Big)_n$$
is Cauchy, and if we denote its limit by $x$, then
$$f(x) = \lim_{n \to \infty} \sum_{k=1}^{n} f(x_k) = y.$$
Finally, $\|x\| \le \alpha$, and hence $x \in U$. □
Recall the definition of a meager set (set of the first category) from 3.3 of
Chapter 9, and the Theorem 3.4 of Chapter 9 stating that no complete space is
meager in itself (Baire’s Category Theorem).
8.4 Theorem. Let $M, N$ be normed vector spaces, $M$ a complete one. Let $f : M \to N$ be a continuous linear map. Then there holds precisely one of the following statements.
(1) $f[M]$ is complete and $f$ is relatively open.
(2) $f[M]$ is meager in itself and $f$ is not open; moreover, there is a neighbourhood $U$ of $o$ such that $f[U]$ is nowhere dense in $f[M]$.
Proof. The two alternatives exclude each other by Baire's Category Theorem (Theorem 3.4 of Chapter 9).
I. Suppose there is a neighbourhood $U$ of zero such that $f[U]$ is nowhere dense in $f[M]$. Then $f$ is obviously not open. Furthermore, $M = \bigcup_{n=1}^{\infty} nU$ and hence $f[M] = \bigcup_{n=1}^{\infty} n f[U]$. Obviously, if $A$ is nowhere dense, then $nA$ is nowhere dense also. Thus, $f[M]$ is meager in itself.
II. Let none of the $f[U]$ with $U$ a neighbourhood of zero be nowhere dense. Thus, each such closure $\overline{f[U]}$ (taken in $f[M]$) is a neighbourhood of some of its points. We will prove that in fact it is a neighbourhood of $o$ and the statement will follow from Proposition 8.2 and Lemma 8.3.
Let $U$ be a neighbourhood of zero in $M$. By continuity of the addition we have a neighbourhood $V'$ such that $V' + V' \subseteq U$ and by continuity of the map $x \mapsto (-x)$, $-V'$ is a neighbourhood of zero, and finally also $V = V' \cap (-V')$ is a neighbourhood of $o$. The set $\overline{f[V]}$ is a neighbourhood of a point $y_0$ and since $V = V' \cap (-V')$, it is also a neighbourhood of $-y_0$. Consider the homeomorphism $\varphi = (y \mapsto y - y_0)$. It maps $\overline{f[V]}$ onto $\overline{f[V]} - y_0$ and since $\overline{f[V]} - y_0 \subseteq \overline{f[V]} + \overline{f[V]} \subseteq \overline{f[U]}$ we have $\varphi(\overline{f[V]}) \subseteq \overline{f[U]}$ and since $\varphi(y_0) = o$ and $\varphi$ is a homeomorphism, $\overline{f[U]}$ is a neighbourhood of $o$. □
8.5
As an immediate corollary we obtain an important
Theorem. Let $M, N$ be Banach spaces and let $f : M \to N$ be a bijective continuous linear map. Then $f$ is a homeomorphism.
Proof. Alternative (2) of Theorem 8.4 is excluded by Baire's Category Theorem, since $f[M] = N$ is complete. □
8.6
Note that, somewhat surprisingly, we have in Theorem 8.5 the continuity of $f^{-1}$ implied by the continuity of $f$ (reminiscent of the mappings between compact Hausdorff spaces, and, even more basically, the behaviour of algebraic homomorphisms).
We will present, as a consequence of Theorem 8.5, another case of an "inverted implication".
Let $X_1, X_2$ be metric spaces; consider a mapping $f : X_1 \to X_2$ and its graph
$$G = \{(x, f(x)) \mid x \in X_1\} \subseteq X_1 \times X_2.$$
If $f$ is continuous then the graph $G$ is obviously closed in $X_1 \times X_2$ (the sequence $(x_n, f(x_n))$ either converges to $(\lim x_n, f(\lim x_n))$ or does not converge at all). Equally obviously, closedness of the graph $G$ does not imply continuity (consider a discontinuous one-one onto map $f$ with continuous $f^{-1}$). For Banach spaces we have, however,
8.6.1 Theorem. (The Closed Graph Theorem) Let $M_i$, $i = 1, 2$, be Banach spaces and let $f : M_1 \to M_2$ be a linear map with a closed graph $G = \{(x, f(x)) \mid x \in M_1\} \subseteq M_1 \times M_2$. Then $f$ is continuous.
Proof. Consider the space $M_1 \times M_2$ with the norm
$$\|(x_1, x_2)\| = \max(\|x_1\|, \|x_2\|).$$
This is a Banach space (a product of two complete metric spaces is complete). The graph $G = \{(x, f(x)) \mid x \in M_1\}$ is a closed vector subspace of $M_1 \times M_2$ and hence it is, again, a Banach space.
Now the projection
$$p_1 = ((x, y) \mapsto x) : G \to M_1$$
is a continuous map. It is linear, one-one and onto, and hence, by Theorem 8.5, the inverse $p_1^{-1} : M_1 \to G$ is continuous. Since also $p_2 = ((x, y) \mapsto y) : G \to M_2$ is continuous, the composition $f = p_2 \circ p_1^{-1} : M_1 \to M_2$ is continuous. □
8.6.2 Remark:
The completeness hypothesis in Theorem 8.6.1 is essential. Consider the space $C(\langle a, b\rangle)$ of continuous real functions on a closed interval $\langle a, b\rangle$ with the norm $\|\varphi\| = \max_{t \in \langle a, b\rangle} |\varphi(t)|$. Take the subspace $M \subseteq C(\langle a, b\rangle)$ consisting of the functions with a continuous derivative (one-sided in $a$ and $b$). Now $M$ is a normed vector space (not complete, though) and the convergence in $M$ is uniform convergence. By Theorem 5.3 of Chapter 1, if functions $x_n$ converge to $x$ and if the derivatives $x_n'$ converge to $y$ then $x'$ exists and $x' = y$. Thus, the mapping $D = (x \mapsto x') : M \to C(\langle a, b\rangle)$ of taking the derivative has a closed graph. Obviously, however, $D$ is not continuous; in fact it is continuous at no point $x \in M$.
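The discontinuity of $D$ can be made concrete with the functions $x_n(t) = \sin(nt)/n$ on the interval $\langle 0, 1\rangle$; this particular sequence and interval are an illustrative choice, not taken from the text. The sketch below approximates the maximum norm on a uniform grid (the helper `sup_norm` and the grid size are assumptions) and compares $\|x_n\|$ with $\|D x_n\|$.

```python
import math

def sup_norm(func, a=0.0, b=1.0, samples=2001):
    """Approximate max |func(t)| over [a, b] on a uniform grid."""
    step = (b - a) / (samples - 1)
    return max(abs(func(a + k * step)) for k in range(samples))

def x_n(n):
    """The function t -> sin(n t) / n."""
    return lambda t: math.sin(n * t) / n

def dx_n(n):
    """Its derivative t -> cos(n t)."""
    return lambda t: math.cos(n * t)

for n in (1, 10, 100):
    print(n, sup_norm(x_n(n)), sup_norm(dx_n(n)))
```

The norms of $x_n$ are bounded by $1/n$ and tend to zero, while the norms of the derivatives stay equal to $1$ (attained at $t = 0$), so $D$ is not continuous at $o$.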
9 Exercises
(1) Prove that any finite-dimensional vector space $V$ with an inner product is a Hilbert space. Prove that the norms associated with any two inner products on $V$ define equivalent metrics.
(2) Prove that if $f : H \to H'$ is an isometric isomorphism of Banach spaces where $H, H'$ are Hilbert spaces, then $f(u) \cdot f(v) = u \cdot v$. [Hint: there is a formula expressing the dot product from its associated norm.]
(3) Prove that the closure of the unit ball $\Omega(o, 1)$ in a Hilbert space $H$ is compact if and only if $H$ is finite-dimensional.
(4) Give an example of a bounded linear operator $F : H \to H$, where $H$ is a Hilbert space, whose image is not closed.
(5) Prove that the symbol $\|f\|$ defined in 3.5.1 is a norm on the space $L(B, B')$ of continuous linear maps $B \to B'$ for Banach spaces $B, B'$.
(6) Prove the statement of 3.4 in detail.
(7) Let $V$ be a finite-dimensional Hilbert (= inner product) space over $\mathbb{C}$ and let $f : V \to V$ be a Hermitian operator. Define, for $x, y \in V$, $B(x, y) = f(x) \cdot y$. Prove that $B$ is a Hermitian form.
(8) Let $H, J$ be Hilbert spaces. A linear operator $F : H \to J$ is called compact if $\overline{F[B]}$ is compact where $B = \{x \in H \mid \|x\| \le 1\}$.
(a) Prove that if $F$ is compact then for any bounded closed subset $S \subseteq H$, $\overline{F[S]}$ is compact.
(b) Prove that a compact operator is always bounded.
(c) An operator $F : H \to J$ between Hilbert spaces is called finite if its image is finite-dimensional. Prove that a finite operator is always compact.
(d) Give an example of a compact operator between Hilbert spaces which is not finite.
(9) Prove that if $F : H \to J$ is a compact linear operator between Hilbert spaces, then there exists an $x \in H$ such that $\|x\| = 1$ and $\|F(x)\| = \|F\| \cdot \|x\|$. [Hint: Consider $y \in \overline{F[B]}$ to be of maximal norm (note that the norm is continuous and $\overline{F[B]}$ is compact).]
(10) Let $F : H \to J$ be a compact linear operator where $H$, $J$ are Hilbert spaces.
(a) Prove that there exist orthonormal systems $(e_i)_{i \in \mathbb{N}}$, $(f_i)_{i \in \mathbb{N}}$ in $H$ and $J$ respectively and numbers $s_1 \ge s_2 \ge \cdots$ such that
$$F(e_n) = s_n f_n \qquad \text{(i)}$$
and
$F$ is $0$ on the orthogonal complement of the closure of the vector subspace generated by $e_1, e_2, \dots$. (ii)
Prove further that the numbers $s_n$ are uniquely determined and that the orthonormal systems $(e_i)$, $(f_i)$ are uniquely determined up to a scalar multiple if $s_1 > s_2 > \cdots$. The numbers $s_i$ are known as singular values of the operator $F$. [Hint: $s_1 = \|F\|$. Use Exercise (9) and pass to orthogonal complements.]
(b) Prove that
$$\lim_{n \to \infty} s_n = 0. \qquad \text{(iii)}$$
Conversely, prove that if $F : H \to J$ is an operator which satisfies (i), (ii) and (iii), then $F$ is compact.
(11) A compact linear operator $F : H \to J$ between Hilbert spaces is called trace class if its singular values satisfy
$$\sum_{n=1}^{\infty} s_n < \infty.$$
Prove that when an operator $F : H \to H$ for a Hilbert space $H$ is trace class, and, for every Hilbert basis $(e_i)_{i \in I}$ of $H$,
$$F(e_j) = \sum_{i \in I} a_{ij} e_i,$$
then the series $\sum_{i \in I} a_{ii}$ is absolutely convergent and does not depend on the choice of Hilbert basis. This number is denoted by $\mathrm{tr}(F)$ and called the trace of $F$.
(12) A compact linear operator $F : H \to J$ between Hilbert spaces is called Hilbert-Schmidt if its singular values satisfy
$$\sum_{n=1}^{\infty} (s_n)^2 < \infty.$$
Let $(e_i)_{i \in I}$ be a Hilbert basis of $H$. For two Hilbert-Schmidt operators $F, G : H \to J$, define
$$F \cdot G = \sum_{i \in I} F(e_i) \cdot G(e_i).$$
Prove that this is a well-defined inner product on the space $HS(H, J)$ of all Hilbert-Schmidt linear operators, and that moreover $HS(H, J)$ with this inner product is a Hilbert space.
(13) Prove that if $L$ is a uniformly convex Banach space and $0 \ne h \in L^*$, then there exists a $z \in L$ such that $\|z\| = 1$ and $h(z) = \|h\|$.
[Hint: Choose a sequence $z_n$ in the unit ball of $L$ such that $h(z_n) \to \|h\|$. Uniform convexity implies that it is Cauchy.]
(14) Let $B$ be a Borel set in $\mathbb{R}^n$ such that $\mu(B) > 0$. Prove that $L_1(B)$, $L_1(B, \mathbb{C})$, $L_\infty(B)$, $L_\infty(B, \mathbb{C})$ are not uniformly convex.
[Hint: It suffices to consider the "baby" version - see Exercise (20) of Chapter 5.]
(15) Let $F : L \to M$ be a bounded operator where $L, M$ are Banach spaces, and the vector space $M/F[L]$ is finite-dimensional. Prove that then $F[L]$ is closed in $M$.
[Hint: There is a finite-dimensional vector space $V$ and an extension $\tilde{F} : L \oplus V \to M$ which is onto, and maps $V$ isomorphically onto $M/F[L]$. Now $\tilde{F}$ is open and the image, under $\tilde{F}$, of the open subset $L \times (V \setminus \{0\})$ is $M \setminus F[L]$.]
17 A Few Applications of Hilbert Spaces
In the previous chapter we developed, with the help of analysis, an understanding of

Hilbert (and Banach) spaces as a kind of satisfactory generalization of linear algebra
to infinite-dimensional spaces. In particular, we developed modified notions of duals
and bases which behave well in this situation.
The real force of Hilbert spaces, however, is that they naturally occur in a variety
of contexts. In physics, specifically in quantum mechanics, a (complex) Hilbert
space is the basic structure on a state space, which is the fundamental concept of
the theory. In this chapter, we will remain in mathematics, and give examples of
Hilbert spaces which occur as certain spaces of functions (generally known as $L_2$-spaces). We will then explore two particular roles $L_2$-spaces play.
First, they provide us with a useful technical tool. To illustrate this, we will
prove the Radon-Nikodym Theorem on derivatives of measures. We will then apply
that result to proving a Lebesgue integral version of the Fundamental Theorem of
Calculus, which is ultimately a very satisfactory, but also very difficult theorem.
In some sense, this theorem brings the story of the Lebesgue integral, which we used
extensively (although often implicitly) throughout this book, to a conclusion.
The second use of L2 -spaces, and generally Hilbert spaces, is as a rigorous
foundation for modelling intuitive geometric ideas. We will illustrate this on the
concepts of Fourier series and the Fourier transformation.
1 Some preliminaries: Integration by a measure
In most of this book, we worked with the Lebesgue integral which we constructed by
passing to limits from the Riemann integral. As a result, we obtained a construction
of the Lebesgue measure. At this point, however, we need to talk about measures
in greater generality. In this section, we summarize the basics of integration theory
with respect to more general measures.

1.1
To avoid excessive definitions, we will only consider so-called Borel measures. For the completely abstract concepts of measure and integration, the reader is referred to [18].
Let $X \subseteq \mathbb{R}^n$ be a Borel subset. By a Borel measure on $X$ we shall mean a map $\mu$ which assigns to each Borel subset $S \subseteq X$ a number $\mu(S) \in [0, \infty]$. We require that for disjoint Borel subsets $S_1, S_2, \dots, S_n, \dots$ of $X$, we have
$$\mu(S_1 \cup S_2 \cup \dots) = \sum_{n=1}^{\infty} \mu(S_n)$$
(a property known as $\sigma$-additivity). Note that when $E_1 \subseteq E_2 \subseteq \dots$, then $\sigma$-additivity, applied to the sets $E_{n+1} \setminus E_n$, implies
$$\mu(E_n) \nearrow \mu\Big(\bigcup E_n\Big).$$
Example: By Proposition 3.2 and Corollary 3.4.1 of Chapter 4, we know that the Lebesgue measure on $\mathbb{R}^n$ can be considered as a Borel measure on $\mathbb{R}^n$ (if we ignore the fact that it is defined on even more general sets).
1.2 Definition and basic facts about integration of non-negative real functions with respect to a measure
First, by a simple function on $X$ we mean a function expressed in the form
$$s = \sum_{i=1}^{n} a_i c_{A_i} \qquad (1.2.1)$$
where $A_i \subseteq X$ are Borel subsets, and $0 \le a_i < \infty$. We define the integral of a simple function with respect to a Borel measure $\mu$ by
$$\int_X s\,d\mu = \sum_{i=1}^{n} a_i \mu(A_i). \qquad (1.2.2)$$
If $\mu(A_i) = \infty$ and $a_i = 0$, we set the $i$'th summand equal to $0$. Note carefully that a priori, the integral of $s$ as defined may depend on the expression (1.2.1). However, it doesn't (see Exercise (1)). Even without knowing that fact, however, we define for a Borel measurable function $f : X \to [0, \infty]$ (recall 4.4 of Chapter 4)
$$\int_X f\,d\mu = \sup \int_X s\,d\mu \qquad (1.2.3)$$
where the supremum is taken over all simple functions (1.2.1) such that $s \le f$.
Note: For a Borel set $B \subseteq X$, $\int_B f\,d\mu$ may be defined simply as the integral of the restriction of $f$ by the restriction of $\mu$ to $B$.
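Formula (1.2.2) and its independence of the chosen expression (1.2.1) can be illustrated on a toy measure. The sketch assumes a measure concentrated on four points with hypothetical masses stored in `weights`; this finite setting, and the helper names `mu`, `integral_simple` and `evaluate`, are illustrative assumptions rather than part of the text.

```python
# Hypothetical point masses: mu({x}) = weights[x] for x in {0, 1, 2, 3}.
weights = {0: 2, 1: 1, 2: 5, 3: 0}

def mu(subset):
    """Measure of a subset: the sum of the point masses it contains."""
    return sum(weights[x] for x in subset)

def integral_simple(terms):
    """Integral (1.2.2) of s = sum of a_i * c_{A_i}, given as pairs (a_i, A_i)."""
    return sum(a * mu(A) for a, A in terms)

def evaluate(terms, x):
    """Value s(x) of the simple function at the point x."""
    return sum(a for a, A in terms if x in A)

# Two different expressions of the same simple function.
rep1 = [(3, {0, 1}), (4, {1, 2})]
rep2 = [(3, {0}), (7, {1}), (4, {2})]

print([evaluate(rep1, x) for x in weights], integral_simple(rep1))
print([evaluate(rep2, x) for x in weights], integral_simple(rep2))
```

Both expressions take the same values at every point and give the same integral, as Exercise (1) asserts in general.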
1.2.1 Lemma. For any Borel function $f : X \to [0, \infty]$ there exist simple functions $s_n$ such that $s_n \nearrow f$.
Proof. Put
$$s_n(x) = k / 2^n$$
when $0 \le k \le n 2^n$ and $x$ is such that
$$k / 2^n \le f(x) < (k + 1) / 2^n,$$
and put $s_n(x) = n$ whenever $f(x) \ge n$. □
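The dyadic approximation in the proof can be evaluated pointwise. The sketch below computes the value of the $n$-th simple function at a point where $f$ takes a given value; it assumes, as a completion of the definition for large values, that the approximation is capped at $n$ once $f(x) \ge n$. The helper names `s_n` and `approximations` are hypothetical.

```python
import math

def s_n(f_value, n):
    """Value of the n-th approximating simple function where f equals f_value."""
    if f_value >= n:                       # assumption: cap the value at n
        return float(n)
    k = math.floor(f_value * 2 ** n)       # k / 2^n <= f_value < (k + 1) / 2^n
    return k / 2 ** n

def approximations(f_value, up_to=12):
    """The values s_1, ..., s_up_to at a point where f equals f_value."""
    return [s_n(f_value, n) for n in range(1, up_to + 1)]

for value in (0.3, math.pi, math.inf):
    print(value, approximations(value, 6))
```

At each point the values are non-decreasing in $n$ and never exceed the value of $f$; where $f$ is finite the error is below $2^{-n}$ as soon as $n$ exceeds $f(x)$, and where $f$ is infinite the values grow without bound.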
1.2.2 Corollary. When $\mu$ is the Lebesgue measure and $f : X \to [0, \infty]$ is Borel measurable, then (1.2.3) is the Lebesgue integral of $f$ over $X$.
Proof. Use Lemma 1.2.1, the Lebesgue Monotone Convergence Theorem (Theorem 1.1 of Chapter 5), and recall definition 3.1 of Chapter 5. □
1.2.3 Theorem. (the Lebesgue Monotone Convergence Theorem for a Borel measure) Let $f_n \nearrow f$, where $f_n : X \to [0, \infty]$ are Borel-measurable functions. Then
$$\lim_{n \to \infty} \int_X f_n\,d\mu = \int_X f\,d\mu.$$
Proof. First note that
$$\int_X f_n\,d\mu \le \int_X f_{n+1}\,d\mu,$$
so the limit makes sense. If $s \le f_n$ is a simple function, then clearly $s \le f$. This implies the inequality $\le$. For the opposite inequality, let $s \le f$ be a simple function and let $0 < c < 1$. Let $E_n$ be the set of all $x \in X$ such that $c s(x) \le f_n(x)$. Clearly, $E_n \subseteq E_{n+1}$, and $\bigcup E_n = X$, so
$$c \int_X s\,d\mu = \int_X cs\,d\mu = \lim_{n \to \infty} \int_{E_n} cs\,d\mu \le \lim_{n \to \infty} \int_{E_n} f_n\,d\mu \le \lim_{n \to \infty} \int_X f_n\,d\mu.$$
(The second equality follows from $\sigma$-additivity.) Now taking the supremum of the left-hand side of the inequality we just derived over all $0 < c < 1$ and all simple functions $s \le f$, we obtain the inequality $\ge$ of the statement. □
1.2.4 Lemma. When $f, g : X \to [0, \infty]$ are Borel measurable functions, and $c \in [0, \infty)$, we have
$$\int_X (cf)\,d\mu = c \int_X f\,d\mu,$$
$$\int_X (f + g)\,d\mu = \int_X f\,d\mu + \int_X g\,d\mu.$$
Proof. The first equality is obvious, since (for $c > 0$) simple functions $\le f$ correspond bijectively to simple functions $\le cf$ by multiplication by $c$. For the second equality, let, by Lemma 1.2.1, $s_n \nearrow f$, $s_n' \nearrow g$ where $s_n$, $s_n'$ are simple functions. We have $s_n + s_n' \nearrow f + g$, so by the Lebesgue Monotone Convergence Theorem,
$$\int_X (f + g)\,d\mu = \lim \int_X (s_n + s_n')\,d\mu = \lim \int_X s_n\,d\mu + \lim \int_X s_n'\,d\mu = \int_X f\,d\mu + \int_X g\,d\mu. \qquad \square$$
1.2.5 Comment
Let $\mu$ be a Borel measure on $X$ and $u : X \to [0,\infty]$ a Borel-measurable function.
Then it follows from Lemma 1.2.4 and the Lebesgue Monotone Convergence
Theorem that
$$E \mapsto \int_E u\,d\mu$$
is a Borel measure on $X$; this Borel measure is often denoted by $u\mu$.
1.3 Integration of complex functions over a measure
Let $\mu$ be a Borel measure. A Borel-measurable function $f : X \to \mathbb{C}$ is called
$\mu$-integrable if
$$\int_X |f|\,d\mu < \infty.$$
(This is clearly equivalent to requiring that $\mathrm{Re}(f)^+$, $\mathrm{Im}(f)^+$, $\mathrm{Re}(f)^-$ and $\mathrm{Im}(f)^-$
all have finite integrals.) We then put
$$\int_X f\,d\mu = \int_X \mathrm{Re}(f)^+\,d\mu - \int_X \mathrm{Re}(f)^-\,d\mu + i\int_X \mathrm{Im}(f)^+\,d\mu - i\int_X \mathrm{Im}(f)^-\,d\mu.$$
By Proposition 2.7 of Chapter 5, a Borel-measurable function is integrable by the

Lebesgue measure if and only if it has a finite Lebesgue integral, and the integral
just defined equals its Lebesgue integral.
1.3.1 Lemma. Let $f, g : X \to \mathbb{C}$ be $\mu$-integrable functions, and let $\alpha \in \mathbb{C}$. Then
$\alpha f$, $f + g$ are $\mu$-integrable and we have
$$\int_X \alpha f\,d\mu = \alpha \int_X f\,d\mu,$$
$$\int_X (f+g)\,d\mu = \int_X f\,d\mu + \int_X g\,d\mu.$$
Proof. The second formula immediately follows from Lemma 1.2.4. To prove the
first formula, one first notes that for $\alpha \ge 0$ it follows from Lemma 1.2.4, then one
checks it for $\alpha = -1$ and $\alpha = i$, and uses the second formula to pass to the case of
$\alpha$ arbitrary. $\square$
1.3.2 Theorem. (the Lebesgue Dominated Convergence Theorem) Suppose
$f_n : X \to \mathbb{C}$ are Borel measurable functions and assume that $f_n \to f$, and there exists
a $\mu$-integrable function $g : X \to [0,\infty]$ such that for all $n$, $|f_n| \le g$. Then
$$\lim \int_X f_n\,d\mu = \int_X f\,d\mu.$$
Proof. We have
$$|f_n - f| \le 2g,$$
so by Fatou’s lemma (the proof of Lemma 8.5.1 of Chapter 5 works for any Borel
measure),
$$\int_X 2g\,d\mu \le \liminf_{n\to\infty} \int_X (2g - |f - f_n|)\,d\mu$$
$$= \liminf_{n\to\infty} \Big(\int_X 2g\,d\mu - \int_X |f - f_n|\,d\mu\Big)$$
$$= \int_X 2g\,d\mu - \limsup_{n\to\infty} \int_X |f - f_n|\,d\mu.$$
Subtracting $\int_X 2g\,d\mu$ from both sides (and using that the integrals are non-negative),
$$\limsup_{n\to\infty} \int_X |f - f_n|\,d\mu = 0$$
and hence
$$\lim_{n\to\infty} \int_X |f - f_n|\,d\mu = 0.$$
An analogue of Lemma 8.4.1 of Chapter 5 also holds by the same proof. Therefore,
$$\lim_{n\to\infty} \int_X (f - f_n)\,d\mu = 0,$$
which implies our statement by Lemma 1.3.1. $\square$
2 The spaces $L^p_\mu(X,\mathbb{C})$ and the Radon-Nikodym Theorem
2.1 The spaces $L^p_\mu(X,\mathbb{C})$
The definition of spaces $L^p$, $1 \le p \le \infty$, with respect to an arbitrary Borel measure
parallels completely the discussion of the case of the Lebesgue measure in Section 8
of Chapter 5. In particular, let $X \subseteq \mathbb{R}^n$ be a Borel subset and let $\mu$ be a Borel
measure on $X$. Let, for $1 \le p < \infty$, $L^p_\mu(X,\mathbb{C})$ denote the set of equivalence classes
of all Borel-measurable functions $f : X \to \mathbb{C}$ such that
$$\int_X |f|^p\,d\mu < \infty \tag{2.1.1}$$
with respect to the equivalence relation $\sim$ of being equal almost everywhere (i.e.
$f \sim g$ if and only if $\mu(\{x \in X \mid f(x) \ne g(x)\}) = 0$). The relation $\sim$ is a
congruence, so $L^p_\mu(X,\mathbb{C})$ inherits a structure of a $\mathbb{C}$-vector space from the set of
all functions satisfying (2.1.1). Again, elements of $L^p_\mu(X,\mathbb{C})$ are often (slightly
imprecisely but usually harmlessly) identified with their representative functions.
Again, we define $\|f\|_p$ to be the $p$’th root of the left-hand side of (2.1.1). For
$p = \infty$, we define, again, $\|f\|_\infty$ to be the infimum of $M \le \infty$ such that
$|f(x)| \le M$ almost everywhere, and we define $L^\infty_\mu(X,\mathbb{C})$ to be the quotient of the
vector space of functions with $\|f\|_\infty < \infty$ by the congruence of being equal almost everywhere.
An analogue of Minkowski’s inequality (Theorem 8.2 of Chapter 5) holds by the
same proof, thus providing us with a norm on $L^p_\mu(X,\mathbb{C})$. The proof of Theorem 8.5.2
of Chapter 5 extends to the case of Borel measures to prove that the spaces $L^p_\mu(X,\mathbb{C})$
are complete, and hence are Banach spaces. In fact, all the theory of the spaces
$L^p(B)$, $L^p(B,\mathbb{C})$ we built up in Chapter 16 extends verbatim to the case of the
spaces $L^p_\mu(X)$, $L^p_\mu(X,\mathbb{C})$. In particular, for $1 < p < \infty$, these Banach spaces
are uniformly convex and hence are reflexive; we simply didn’t want to complicate
the discussion in Chapter 16 with unnecessary generality where we didn’t need it.
(However, see Exercises 6, 7 below.)
It is worthwhile pointing out, though, that the case of Borel measures gives some
interesting examples which we haven’t seen before: Let $S$ be a countable set with
the measure $\mu$ in which every element has measure 1. Then the space $L^p_\mu(S)$ is
isometric to the space of all sequences $(a_n)_{n\in\mathbb{N}}$ such that
$$\sum_{n=1}^{\infty} |a_n|^p < \infty$$
with the norm
$$\|(a_n)_n\| = \Big(\sum |a_n|^p\Big)^{1/p}.$$
Such spaces are denoted by $\ell^p$ ($\ell^p(\mathbb{C})$ in the complex case).
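For finitely supported sequences the norm of $\ell^p$ is a finite sum, so Minkowski's inequality can be spot-checked directly. This is a small illustrative sketch; the function name `lp_norm` and the sample sequences are hypothetical and not taken from the text.

```python
def lp_norm(a, p):
    """Norm of a finite sequence a in l^p for 1 <= p < infinity (counting measure)."""
    return sum(abs(x) ** p for x in a) ** (1.0 / p)

a = [3.0, -4.0, 0.0]
b = [1.0, 2.0, -2.0]

# Minkowski's inequality ||a + b||_p <= ||a||_p + ||b||_p for several exponents.
minkowski_ok = all(
    lp_norm([x + y for x, y in zip(a, b)], p) <= lp_norm(a, p) + lp_norm(b, p) + 1e-12
    for p in (1, 1.5, 2, 3.5, 10)
)
```

Since `abs` also accepts complex numbers, the same helper works for finite sequences in $\ell^p(\mathbb{C})$.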

As before, a special role belongs to the spaces $L^2_\mu(X)$, $L^2_\mu(X,\mathbb{C})$. By the Cauchy-
Schwarz inequality, for $f, g \in L^2_\mu(X,\mathbb{C})$, the integral
$$\int_X f\,\overline{g}\,d\mu \tag{2.1.2}$$
is defined and finite, so the formula (2.1.2) defines an inner product on $L^2_\mu(X,\mathbb{C})$. Since
the norm comes from the inner product, $L^2_\mu(X,\mathbb{C})$ with the inner product (2.1.2) is a
Hilbert space (and similarly, $L^2_\mu(X)$ is a real Hilbert space).
2.2 The Radon-Nikodym Theorem
Let $\mu$, $\nu$ be Borel measures on a Borel set $X \subseteq \mathbb{R}^n$. We say that $\nu$ is absolutely
continuous with respect to $\mu$ if for every Borel set $S \subseteq X$, $\mu(S) = 0$ implies
$\nu(S) = 0$.
Theorem. Suppose that $\mu$, $\nu$ are Borel measures on a Borel set $X \subseteq \mathbb{R}^n$,
$\mu(X) < \infty$, $\nu(X) < \infty$ and $\nu$ is absolutely continuous with respect to $\mu$. Then
there exists a Borel measurable function $h : X \to [0,\infty)$ such that $\nu = h\mu$ (see
Comment 1.2.5). The function $h$ is called the Radon-Nikodym derivative of $\nu$ by $\mu$.
Proof. Consider the measure $\varphi = \mu + \nu$. We then have $\varphi(X) < \infty$ and for every
Borel set $S \subseteq X$, $\nu(S) \le \varphi(S)$. Then every function $f \in L^2_\varphi(X,\mathbb{C})$ is
$\varphi$-integrable, hence $\nu$-integrable. Define
$$I(f) = \int_X f\,d\nu.$$
Clearly,
$$I : L^2_\varphi(X,\mathbb{C}) \to \mathbb{C}$$
is a $\mathbb{C}$-linear map. We claim that $I$ is continuous. By Theorem 3.5 of Chapter 16,
it suffices to prove that there exists a number $K$ such that $\|f\|_{\varphi,2} \le 1$ implies
$|I(f)| \le K$. Let $S = \{x \in X \mid |f(x)| \ge 1\}$. Then
$$|I(f)| \le \int_S |f|\,d\nu + \int_{X \setminus S} |f|\,d\nu \le \int_S |f|^2\,d\varphi + \nu(X \setminus S) \le \|f\|^2_{\varphi,2} + \nu(X \setminus S) \le 1 + \nu(X).$$
By the Riesz Representation Theorem 3.6.1 of Chapter 16, there exists a
$g \in L^2_\varphi(X,\mathbb{C})$ such that for all $f \in L^2_\varphi(X,\mathbb{C})$,
$$\int_X f\,d\nu = \int_X fg\,d\varphi. \tag{2.2.1}$$
But we claim that in fact
$$\varphi(\{x \mid 0 \le g < 1\}) = \varphi(X). \tag{2.2.2}$$
In effect, $\varphi(S) > 0$ where $S$ is any of the sets $\{x \in X \mid \mathrm{Im}(g) > 1/n\}$,
$\{x \in X \mid \mathrm{Im}(g) < -1/n\}$, $\{x \in X \mid \mathrm{Re}(g) < -1/n\}$, $\{x \in X \mid \mathrm{Re}(g) \ge 1\}$, would
violate (2.2.1) for $f = c_S$ (in particular, for $S = \{x \in X \mid \mathrm{Re}(g) \ge 1\}$, we would
get $\nu(S) \ge \varphi(S) = \mu(S) + \nu(S)$, so $\mu(S) = 0$ which contradicts the assumption
$\varphi(S) > 0$ by absolute continuity). Thus the above sets $S$ have $\varphi(S) = 0$, which
proves (2.2.2). Now rewrite (2.2.1) as
$$\int_X f(1-g)\,d\nu = \int_X fg\,d\mu. \tag{2.2.3}$$
Put $h = g/(1-g)$ where defined, and $h = 0$ elsewhere. Now let
$$E_n = \{x \in X \mid g(x) < 1 - 1/n\}.$$
Then for a Borel set $S \subseteq E_n$, $f = c_S/(1-g)$ is bounded non-negative
Borel-measurable, and hence is in $L^2_\varphi(X,\mathbb{C})$, so applying (2.2.3) gives
$\nu(S) = \int_S h\,d\mu$. If $S \subseteq X \setminus \bigcup E_n$, then $\varphi(S) = 0$ hence $\nu(S) = \mu(S) = 0$, so
$\nu(S) = \int_S h\,d\mu$. Now any Borel subset of $X$ is a countable union of sets for which
the statement was just proved. $\square$
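On a finite set, where measures are just lists of point masses, the steps of the proof of Theorem 2.2 can be followed explicitly. The sketch below is a toy illustration under that assumption; the data and the helper names `measure` and `integral` are hypothetical. It forms the sum of the two measures, the density $g$ of the absolutely continuous measure with respect to that sum (playing the role of the Riesz representative), and then $h = g/(1-g)$.

```python
from fractions import Fraction as Fr
from itertools import combinations

# Hypothetical example on X = {0, 1, 2, 3}: point masses of two finite measures.
mu = [Fr(1, 2), Fr(1, 4), Fr(0), Fr(2)]
nu = [Fr(1, 3), Fr(0), Fr(0), Fr(5)]   # nu vanishes wherever mu does

phi = [m + n for m, n in zip(mu, nu)]  # the sum measure used in the proof
# Density of nu with respect to phi; set to 0 on phi-null points.
g = [n / p if p != 0 else Fr(0) for n, p in zip(nu, phi)]
# h = g / (1 - g) where defined, 0 elsewhere.
h = [gi / (1 - gi) if gi != 1 else Fr(0) for gi in g]

def measure(weights, S):
    return sum(weights[x] for x in S)

def integral(func, weights, S):
    return sum(func[x] * weights[x] for x in S)

# nu(S) equals the integral of h over S with respect to mu, for every subset S.
radon_nikodym_ok = all(
    measure(nu, S) == integral(h, mu, S)
    for r in range(len(mu) + 1)
    for S in combinations(range(len(mu)), r)
)
```

Exact rational arithmetic is used so that the equality can be tested without rounding error.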
2.3
The following statement will be useful in the next section.
Lemma. Let $X \subseteq \mathbb{R}^n$ be a Borel set, and let $\mu$, $\nu$ be Borel measures on $X$ such
that $\nu(X) < \infty$. Then $\nu$ is absolutely continuous with respect to $\mu$ if and only if for
every $\varepsilon > 0$ there exists a $\delta > 0$ such that $\mu(S) < \delta$ implies $\nu(S) < \varepsilon$.
Proof. Let the $\delta$-$\varepsilon$ condition hold. Then a set $S$ with $\mu(S) = 0$ satisfies the
hypothesis for every $\delta$, and hence $\nu(S) < \varepsilon$ for every $\varepsilon > 0$.
Conversely, let $\nu$ be absolutely continuous with respect to $\mu$. Suppose the $\delta$-$\varepsilon$
condition does not hold, i.e. there exists an $\varepsilon > 0$ and sets $E_i$ with $\mu(E_i) \le 1/2^i$ such that
$\nu(E_i) \ge \varepsilon$. Put $A_i = E_i \cup E_{i+1} \cup \cdots$. Then $\nu(A_i) \ge \varepsilon$, $A_i \supseteq A_{i+1}$,
$\mu(A_i) \le 1/2^{i-1}$, and hence $\mu(\bigcap A_i) = 0$, $\nu(\bigcap A_i) \ge \varepsilon$ by $\sigma$-additivity,
a contradiction. $\square$
3 Application: The Fundamental Theorem of (Lebesgue) Calculus
In this section, we derive an application of the Radon-Nikodym Theorem which is

the analogue, for the Lebesgue integral, of the Fundamental Theorem of Calculus,
stating, roughly, that the derivative and the integral are inverse operations. We begin
with the part about the derivative of the integral. This part does not need the Radon-Nikodym
Theorem, but it shows that things are much harder than in the case of
the Riemann integral. Throughout this section, we will work with real functions;
all statements immediately follow for complex-valued functions by treating them as
pairs of real functions.
3.1 Absolute continuity of functions
A function $f : \langle a,b\rangle \to \mathbb{R}$ is called absolutely continuous if for every $\varepsilon > 0$ there
exists a $\delta > 0$ such that for any $m$-tuple of non-empty disjoint intervals
$\langle a_i,b_i\rangle \subseteq \langle a,b\rangle$, $i = 1,\dots,m$ which satisfy
$$\sum_{i=1}^{m} (b_i - a_i) < \delta,$$
we have
$$\sum_{i=1}^{m} |f(b_i) - f(a_i)| < \varepsilon.$$
An absolutely continuous function is clearly continuous (take $m = 1$).
3.2 The derivative of an integral
Consider now the situation when $f : \langle a,b\rangle \to \mathbb{R}$ is a Lebesgue integrable
function. (Recall that by Theorem 4.4 of Chapter 5, we may assume that $f$ is Borel
measurable.) Now define a function
$$F : \langle a,b\rangle \to \mathbb{R}$$
by
$$F(x) = \int_{\langle a,x\rangle} f.$$
(The integral is with respect to the Lebesgue measure $\lambda$.)
Proposition. The function $F : \langle a,b\rangle \to \mathbb{R}$ is absolutely continuous.
Proof. The measure $|f|\lambda$ (see Comment 1.2.5) is clearly absolutely continuous
with respect to $\lambda$. Our statement therefore follows from Lemma 2.3 and
Lemma 8.4.1 of Chapter 5. $\square$
Theorem. The function $F$ has a derivative almost everywhere in $\langle a,b\rangle$ and we have
$F'(x) = f(x)$ almost everywhere in $\langle a,b\rangle$.
Proof. Recall that for every $\delta > 0$, there exists a continuous function $g : \langle a,b\rangle \to \mathbb{R}$
such that
$$\int_{\langle a,b\rangle} |f - g| < \delta.$$
(By our definition of the Lebesgue integral, we may replace $f$ by a function in
$Z^{up}$, which can then be replaced by a continuous function.) Now our statement is
true for $g$ in place of $f$ by the corresponding statement for the Riemann integral
(Theorem 8.6 of Chapter 1). Now let us investigate the function
$$h = f - g.$$
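Let $\varepsilon > 0$. Let $B$ be the set of all $x \in \langle a,b\rangle$ for which there exists a $t(x) > 0$ with
$$a \le x - t(x) < x + t(x) \le b$$
such that
$$\int_{\langle x - t(x),\, x + t(x)\rangle} |h| > \varepsilon t(x).$$
Let $K$ be a compact subset of the open set $B$. Then there exist $x_1, \dots, x_N$ such that
$$\bigcup_{i=1}^{N} (x_i - t(x_i),\, x_i + t(x_i)) \supseteq K.$$
Note that we may find $i_1 < \cdots < i_m$ such that the intervals
$$(x_{i_j} - t(x_{i_j}),\, x_{i_j} + t(x_{i_j})), \quad j = 1, \dots, m,$$
are disjoint, and
$$\bigcup_{j=1}^{m} (x_{i_j} - 3t(x_{i_j}),\, x_{i_j} + 3t(x_{i_j})) \supseteq K. \tag{3.2.1}$$
In fact, assume without loss of generality that
$$t(x_1) \ge \cdots \ge t(x_N).$$
Then it suffices to let $i_{j+1}$ be the smallest number $i > i_j$ such that
$(x_i - t(x_i),\, x_i + t(x_i))$ is disjoint from $(x_{i_k} - t(x_{i_k}),\, x_{i_k} + t(x_{i_k}))$ for $k \le j$.
By (3.2.1), we see that
$$\lambda(K) \le 6\sum_{j=1}^{m} t(x_{i_j}) < \frac{6}{\varepsilon} \sum_{j=1}^{m} \int_{\langle x_{i_j} - t(x_{i_j}),\, x_{i_j} + t(x_{i_j})\rangle} |h| \le \frac{6}{\varepsilon} \int_{\langle a,b\rangle} |h| \le \frac{6\delta}{\varepsilon}.$$
Since $K \subseteq B$ was an arbitrary compact subset, we conclude
$$\lambda(B) \le \frac{6\delta}{\varepsilon}.$$
Now the point is that for every $\varepsilon > 0$ we can choose $\delta > 0$ such that $\lambda(B)$ is
arbitrarily small. Let
$$C = \{a \le x \le b \mid |h(x)| \ge \varepsilon\}.$$
Clearly,
$$\lambda(C) \le \frac{\delta}{\varepsilon}.$$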
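However, for
$$x \in \langle a,b\rangle \setminus (B \cup C), \tag{3.2.2}$$
for every $t > 0$ such that $a \le x - t < x + t \le b$, we have for both $J = \langle x - t, x\rangle$
and $J = \langle x, x + t\rangle$
$$\frac{1}{t}\left|\int_J f - \int_J g\right| \le \frac{1}{t}\int_J |h| \le \varepsilon,$$
while $|f(x) - g(x)| < \varepsilon$, and thus
$$\left|\Big(\frac{1}{t}\int_J f\Big) - f(x)\right| < 3\varepsilon \quad \text{for sufficiently small } t > 0 \tag{3.2.3}$$
for $x$ as in (3.2.2). Now the sets $B$, $C$ depend on $\delta$ and $\varepsilon$, but writing $B = B(\delta,\varepsilon)$,
$C = C(\delta,\varepsilon)$, (3.2.3) holds for
$$x \in \langle a,b\rangle \setminus \bigcap_n \big(B(1/n,\varepsilon) \cup C(1/n,\varepsilon)\big),$$
which is almost everywhere. Since $\varepsilon$ was arbitrary, considering $\varepsilon = 1/k$,
$k = 1, 2, \dots$, we see that $F'(x) = f(x)$ almost everywhere on $\langle a,b\rangle$, as claimed. $\square$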
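The covering step used in the proof of the Theorem of 3.2 (pick disjoint intervals greedily by decreasing radius, then enlarge them threefold) is easy to experiment with. The following is an illustrative sketch only; the function names and the sample intervals, given as hypothetical (center, radius) pairs with integer data, are not from the text.

```python
def select_disjoint(intervals):
    """Greedy selection from the proof: intervals are (center, radius) pairs.

    Intervals are processed by decreasing radius; one is kept when the open
    interval (x - t, x + t) is disjoint from all intervals kept so far.
    """
    chosen = []
    for x, t in sorted(intervals, key=lambda it: -it[1]):
        if all(abs(x - y) >= t + s for y, s in chosen):
            chosen.append((x, t))
    return chosen

def covered_by_tripled(point, chosen):
    """True when the point lies in some (y - 3s, y + 3s) with (y, s) chosen."""
    return any(abs(point - y) < 3 * s for y, s in chosen)

intervals = [(0, 10), (5, 8), (25, 6), (15, 3), (40, 2)]
chosen = select_disjoint(intervals)

# Sample interior points of every original interval and check the 3x cover.
samples = [x - t + k * (2 * t) / 20 for x, t in intervals for k in range(1, 20)]
cover_ok = all(covered_by_tripled(p, chosen) for p in samples)
```

A discarded interval always meets a kept interval of at least the same radius, which is why the threefold enlargements still cover everything, as in (3.2.1).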
3.3 The integral of the derivative
Let us now consider the harder direction, namely the integral of the derivative of a
function $F : \langle a,b\rangle \to \mathbb{R}$. By Proposition 3.2, it suffices to consider the case when
$F$ is absolutely continuous.
Theorem. Let $F : \langle a,b\rangle \to \mathbb{R}$ be absolutely continuous. Then $F'(x)$ exists almost
everywhere and for every $x \in \langle a,b\rangle$,
$$\int_{\langle a,x\rangle} F' = F(x) - F(a).$$
The proof will consist of several steps. First assume that
$$F \text{ is increasing.} \tag{*}$$
We start with
3.3.1 Lemma. Let (*) hold and let $F$ be absolutely continuous. Let $S \subseteq \langle a,b\rangle$
satisfy $\lambda(S) = 0$. Then $\lambda(F[S]) = 0$.
Proof. Suppose, without loss of generality, $a, b \notin S$. By Exercise (9) of Chapter 5,
there exists for every $\delta > 0$ an open set $U \supseteq S$ such that $\lambda(U) < \delta$. Then we
may express $U$ as a countable disjoint union of open intervals $(a_i, b_i)$, $i = 1, 2, \dots$
(Lemma 5.2.1 of Chapter 2). By the definition of absolute continuity (applied to
$i = 1, \dots, n$ and taking a limit with $n \to \infty$) we see that for every $\varepsilon > 0$ there
exists a $\delta > 0$ for which $\lambda(F[S]) < \varepsilon$. Since $\varepsilon > 0$ was arbitrary, $\lambda(F[S]) = 0$, as
desired. $\square$
3.3.2 Proof of the Theorem under the hypothesis (*)
Since $F$ is increasing and continuous on a compact interval, $F^{-1}$ is continuous on
$F[\langle a,b\rangle]$, hence Borel measurable. Hence, we can define a Borel measure $\nu$ on
$\langle a,b\rangle$ by
$$\nu(S) = \lambda(F[S]).$$
Further, by the lemma, $\nu$ is absolutely continuous with respect to the Lebesgue
measure $\lambda$, and hence satisfies the assumptions of the Radon-Nikodym Theorem.
Let $h$ be the Radon-Nikodym derivative of $\nu$ by $\lambda$. Then applying the statement of
Theorem 2.2 to the sets $\langle a,x\rangle$, we get
$$\int_{\langle a,x\rangle} h\,d\lambda = F(x) - F(a),$$
as claimed. The fact that $h$ is the derivative of $F$ almost everywhere follows from
Theorem 3.2. $\square$
3.3.3 Lemma. Let $F : \langle a,b\rangle \to \mathbb{R}$ be absolutely continuous. Let
$$G(x) = \sup \sum_{i=1}^{N} |F(t_i) - F(t_{i-1})|$$
where the supremum is over all $N$ and all choices of points
$$a = t_0 < \cdots < t_N = x.$$
Then the functions $G$, $G - F$, $G + F$ are increasing and absolutely continuous.
(The function $G$ is called the total variation of the function $F$.)
Proof. Let $a \le y < x \le b$. The supremum in the definition of $G(x)$ clearly will
not change if we take it only over such tuples $(t_i)$ which additionally satisfy $t_i = y$
for some $i$. This shows that
$$G(x) - G(y) = \sup \sum_{i=1}^{N} |F(t_i) - F(t_{i-1})| \tag{*}$$
where the supremum is taken over all
$$y = t_0 < \cdots < t_N = x.$$
Now choose an $\varepsilon > 0$. Then if $F$ satisfies the condition of absolute continuity with
a particular $\delta > 0$, (*) (applied to $y = a_i$, $x = b_i$ for each individual $i$ in the
definition 3.1) shows that $G$ satisfies the condition of absolute continuity for the
same $\delta$.
To show that $G - F$ and $G + F$ are non-decreasing, note that by definition, for
$a \le y < x \le b$,
$$G(x) - G(y) \ge |F(x) - F(y)|,$$
and hence
$$G(x) - G(y) \ge \pm(F(x) - F(y)),$$
as required. $\square$
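The total variation of Lemma 3.3.3 can be approximated on a fixed grid. The sketch below is only an illustration: the helper `total_variation` is hypothetical, and a sum over one fixed partition is in general just a lower bound for the supremum defining $G$ (for the sample function $\sin$ on $\langle 0, 2\pi\rangle$ the grid contains the turning points, so the value is essentially exact).

```python
import math

def total_variation(values):
    """Running variation along a fixed partition: G[k] = sum_{i<=k} |F(t_i) - F(t_{i-1})|."""
    G = [0.0]
    for prev, cur in zip(values, values[1:]):
        G.append(G[-1] + abs(cur - prev))
    return G

N = 1000
ts = [2 * math.pi * k / N for k in range(N + 1)]
F = [math.sin(t) for t in ts]
G = total_variation(F)

plus = [g + f for g, f in zip(G, F)]    # samples of G + F
minus = [g - f for g, f in zip(G, F)]   # samples of G - F

def non_decreasing(seq, tol=1e-12):
    return all(b - a >= -tol for a, b in zip(seq, seq[1:]))

monotone_ok = non_decreasing(G) and non_decreasing(plus) and non_decreasing(minus)
```

The small tolerance in `non_decreasing` only absorbs floating-point rounding.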
3.3.4 Proof of the Theorem in the general case
Clearly, an $\mathbb{R}$-linear combination of absolutely continuous functions is absolutely
continuous. Let $F$ be as in the hypothesis of the theorem. Then the conclusion holds
with $F$ replaced by the increasing functions $G + F + x$, $G + x$, and hence, by the
linearity of derivatives and integrals, for
$$F = (G + F + x) - (G + x). \qquad \square$$
4 Fourier series and the discrete Fourier transformation
In the preceding sections, we obtained strong theorems (Theorems 2.2 and 3.3)
which used the theory of Hilbert spaces in their proofs, but Hilbert spaces were
not a part of the final statements. The role of Hilbert spaces in this and the next
section is different, namely as a framework in which intuitive statements can be
easily made rigorous. Of course, much more can be said on the subjects we touch on
here, but what we say is a good example of the role the concept plays, for example,
in mathematical physics.
4.1 The discrete Fourier transform ($L^2$-Fourier series)
We begin with an auxiliary result.
4.1.1 The subspace of continuous functions with compact support in $L^p$
Let $U \subseteq \mathbb{R}^n$ be an open set. Recall that the support $\mathrm{supp}(f)$ of a function
$f : U \to \mathbb{R}$ is the closure in $U$ of the set of all $x \in U$ such that $f(x) \ne 0$. The
set (vector space) of continuous functions on $U$ with compact support is denoted by
$C_c(U)$. Similarly, the space of continuous complex functions with compact support
on $U$ is denoted by $C_c(U,\mathbb{C})$.
Theorem. Let $U \subseteq \mathbb{R}^n$ be an open set and let $1 \le p < \infty$. Then the set $C_c(U)$
(resp. $C_c(U,\mathbb{C})$) is dense in $L^p(U)$ (resp. $L^p(U,\mathbb{C})$).
Proof. Let us prove the complex case, the real case is analogous. Let $K \subseteq U$ be a
compact set. We will first prove that in $L^p(U,\mathbb{C})$,
$$c_K \in \overline{C_c(U,\mathbb{C})} \tag{4.1.1}$$
(recall that $c_K$ is the characteristic function, which has value 1 on $K$ and 0
elsewhere). In effect, $K$ is contained in the union of all balls $\Omega(x,\varepsilon_x)$ with $x \in K$,
$\varepsilon_x < 1/k$ which are contained in $U$, and hence in finitely many of those balls by
compactness. Let $U_k$ be the union of these finitely many open balls. Then by Tietze’s
Theorem (Theorem 8.5 of Chapter 2), there exists a continuous function $f_k : \mathbb{R}^n \to \langle 0,1\rangle$ such
that $f_k(x) = 1$ for $x \in K$, and $f_k(x) = 0$ for $x \notin U_k$. Clearly, $f_k$ has compact
support, and for $k$ sufficiently large, $\mathrm{supp}(f_k) \subseteq U$. Then $f_k \searrow c_K$, and we have
$$\lim_{k\to\infty} \|f_k - c_K\|_p = 0$$
by the Lebesgue Dominated Convergence Theorem, which implies (4.1.1).
Next, we claim that (4.1.1) extends to any $F_\sigma$-set $K$ which satisfies $\lambda(K) < \infty$:
this is because any such set is a union of countably many compact sets $K_n$, and we
may assume $K_1 \subseteq K_2 \subseteq \cdots$ and use the fact that by the Lebesgue Dominated
Convergence Theorem,
$$\lim_{k\to\infty} \|c_{K_k} - c_K\|_p = 0.$$
Finally, recall from Exercise (8) of Chapter 5 that for every measurable set
$S \subseteq U$ with $\lambda(S) < \infty$, there exists an $F_\sigma$-set $K \subseteq S$, $\lambda(S \setminus K) = 0$, so in $L^p(U,\mathbb{C})$, $c_S$
is in the closure of $C_c(U,\mathbb{C})$. Consequently, so is any non-negative simple function
$s$ with finite integral (which is equivalent to $s^p$ having a finite integral). Now for any
$f \ge 0$, $f \in L^p(U,\mathbb{C})$, there are non-negative simple functions $s_n$ with $s_n \nearrow f$.
Then
$$\lim_{k\to\infty} \|f - s_k\|_p = 0$$
by Lebesgue’s Dominated Convergence Theorem, and hence $f \in \overline{C_c(U,\mathbb{C})}$, which
implies the same conclusion about any $f \in L^p(U,\mathbb{C})$ (by considering $\mathrm{Re}(f)^+$,
$\mathrm{Re}(f)^-$, $\mathrm{Im}(f)^+$, $\mathrm{Im}(f)^-$). $\square$
4.1.2 Comments
1. Note that unlike our previous results on $L^p$, Theorem 4.1.1 does not readily
generalize to an arbitrary Borel measure.
2. Also note that $C_c(U)$ is certainly not dense in $L^\infty(U)$. Since the complement of
a measure 0 set in $U$ is necessarily dense, on $C_c(U)$, $L^\infty$-convergence is uniform
convergence, and thus the closure of $C_c(U)$ in $L^\infty(U)$ consists, in particular, of
continuous functions.
4.1.3 The Discrete Fourier Transform Theorem
Consider the space $L^2(\langle 0,2\pi\rangle,\mathbb{C})$ (but we could adapt our arguments to any compact
interval of non-zero length, see Exercise (12)). Then by explicit calculation,
$$\frac{1}{\sqrt{2\pi}}\,e^{inx}, \quad n \in \mathbb{Z} \tag{4.1.2}$$
form an orthonormal system in $L^2(\langle 0,2\pi\rangle,\mathbb{C})$.
Theorem. The system (4.1.2) forms an orthonormal basis of $L^2(\langle 0,2\pi\rangle,\mathbb{C})$.
Proof. Consider the space
$$S^1 = \{z \in \mathbb{C} \mid |z| = 1\}$$
with the topology induced by $\mathbb{C}$. Now consider the $\mathbb{R}$-vector subspace $\mathcal{A} \subseteq C(S^1,\mathbb{R})$
spanned by the functions $z^n + z^{-n}$, $i(z^n - z^{-n})$, $n \in \mathbb{Z}$. Then $\mathcal{A}$ is closed under
multiplication, contains a non-zero constant function and separates points, and
hence satisfies the hypotheses of the Stone-Weierstrass Theorem 6.4.1 of Chapter 9.
Consequently, every continuous function $f : S^1 \to \mathbb{R}$ is a uniform limit of a
sequence of elements of $\mathcal{A}$. Composing with the map $e^{ix}$, we see that in particular,
every continuous function $g : (0,2\pi) \to \mathbb{R}$ with compact support is a uniform
limit of functions $g_n$ which are finite linear combinations of the functions $\sin(nx)$,
$\cos(nx)$, $n \in \mathbb{Z}$. Therefore, every continuous function $g : (0,2\pi) \to \mathbb{C}$ with
compact support is a uniform limit of functions $g_n$ where each $g_n$ is a finite linear
combination of the functions $e^{inx}$, $n \in \mathbb{Z}$. By the Lebesgue Dominated Convergence
Theorem, a sequence in $L^2(\langle 0,2\pi\rangle,\mathbb{C})$ which converges uniformly converges in $L^2$.
Since the functions $g_n$ are (finite) linear combinations of the elements (4.1.2), $g$ is
in the closure of the subspace spanned by (4.1.2). Thus, our statement follows from
Theorem 4.1.1. $\square$
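The "explicit calculation" showing that the system (4.1.2) is orthonormal can be mirrored numerically. This is a rough sketch, assuming a midpoint-rule approximation of the inner product; the helper names `e` and `inner` are hypothetical.

```python
import cmath
import math

def e(n):
    """The function x -> e^{inx} / sqrt(2*pi) from (4.1.2)."""
    return lambda x: cmath.exp(1j * n * x) / math.sqrt(2 * math.pi)

def inner(f, g, N=2000):
    """Midpoint-rule approximation of the inner product of f and g on [0, 2*pi]."""
    h = 2 * math.pi / N
    return sum(f((k + 0.5) * h) * g((k + 0.5) * h).conjugate() for k in range(N)) * h

# Gram matrix of e(-2), ..., e(2); it should be close to the identity matrix.
indices = range(-2, 3)
gram = [[inner(e(m), e(n)) for n in indices] for m in indices]
gram_ok = all(
    abs(gram[i][j] - (1 if i == j else 0)) < 1e-9
    for i in range(5)
    for j in range(5)
)
```

The check covers only finitely many indices; completeness of the system is the content of the theorem and is not something a finite computation can confirm.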
4.1.4
As already remarked in Section 2.1 above, sometimes one denotes by $\ell^2(\mathbb{C})$ the
space $L^2_\mu(\mathbb{Z},\mathbb{C})$ where $\mu$ is the counting measure on $\mathbb{Z}$, i.e. $\mu(S)$ is the number of
elements of $S$ when $S$ is finite, and $\mu(S) = \infty$ for $S$ infinite. Then the assignment
$$\sum_{n\in\mathbb{Z}} \frac{a(n)}{\sqrt{2\pi}}\,e^{inx} \mapsto (a : \mathbb{Z} \to \mathbb{C}) \tag{*}$$
defines an isomorphism
$$L^2(\langle 0,2\pi\rangle,\mathbb{C}) \to \ell^2(\mathbb{C})$$
which is sometimes referred to as the discrete Fourier transformation and the
expression on the left-hand side of (*) of an element $f \in L^2(\langle 0,2\pi\rangle,\mathbb{C})$ is called
its Fourier series. Much hard mathematics concerns convergence of Fourier series
in other spaces than $L^2$. Note, however, that by Theorem 4.9 of Chapter 16, we have
an expression for the coefficients $a_n$:
$$a_n = \frac{1}{\sqrt{2\pi}} \int_{\langle 0,2\pi\rangle} f(x)e^{-inx}. \tag{**}$$
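As a worked instance of the coefficient formula (**), take $f(x) = x$ on $\langle 0,2\pi\rangle$. Integration by parts gives $a_n = \sqrt{2\pi}\,i/n$ for $n \ne 0$ and $a_0 = \sqrt{2\pi}\,\pi$. The sketch below approximates the integral by a midpoint rule (an assumption of this illustration; the helper `fourier_coefficient` is hypothetical) and compares a partial sum of $|a_n|^2$ with $\|f\|_2^2 = 8\pi^3/3$.

```python
import cmath
import math

def fourier_coefficient(f, n, N=4000):
    """Midpoint-rule approximation of a_n = (1/sqrt(2*pi)) * int_0^{2*pi} f(x) e^{-inx} dx."""
    h = 2 * math.pi / N
    total = sum(
        f((k + 0.5) * h) * cmath.exp(-1j * n * (k + 0.5) * h) for k in range(N)
    )
    return total * h / math.sqrt(2 * math.pi)

f = lambda x: x
a1 = fourier_coefficient(f, 1)          # exact value: sqrt(2*pi) * i

# Bessel's inequality: the partial sums of |a_n|^2 stay below ||f||_2^2.
norm_sq = 8 * math.pi ** 3 / 3          # integral of x^2 over [0, 2*pi]
partial = sum(abs(fourier_coefficient(f, n)) ** 2 for n in range(-20, 21))
```

Since the system (4.1.2) is an orthonormal basis, the partial sums converge to `norm_sq` as the range of `n` grows; the truncation at 20 is arbitrary.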
5 The continuous Fourier transformation
5.1 The continuous Fourier transformation formula
While Exercise (15) of the previous Section gives a basis of $L^2(\mathbb{R},\mathbb{C})$, one may
ask if there is a more compelling analogue of formula (**) which would apply
to $L^2(\mathbb{R},\mathbb{C})$. There is a surprisingly simple answer, namely to apply (**) for a
continuous parameter instead of $n \in \mathbb{Z}$, and integrate over all of $\mathbb{R}$, thus obtaining,
again, a function on $\mathbb{R}$: Define for a function $f : \mathbb{R} \to \mathbb{C}$ and for $t \in \mathbb{R}$,
$$\hat f(t) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} f(x)e^{-ixt}\,dx. \tag{5.1.1}$$
(The integral on the right-hand side is the Lebesgue integral; we include the symbol
$dx$ to emphasize that we are integrating in the variable $x$.)
Despite the simplicity of the generalization, it is immediately visible that
the situation will be more complicated than in the case of the discrete Fourier
transformation. For example, we cannot expect the formula (5.1.1) to work for every
$f \in L^2(\mathbb{R},\mathbb{C})$: in order for (5.1.1) to make sense, $f$ must be integrable. Conversely,
suppose (5.1.1) does make sense. Do we have $\hat f \in L^2(\mathbb{R},\mathbb{C})$?
We will answer these questions partially: We will apply the continuous Fourier
transform formula (5.1.1) to a certain subspace of functions called “rapidly decreasing
functions”, and extend it to an isometric isomorphism of Hilbert spaces
$$\mathcal{F} : L^2(\mathbb{R},\mathbb{C}) \xrightarrow{\ \cong\ } L^2(\mathbb{R},\mathbb{C}).$$
Again, much deeper and more specific convergence theorems exist, but we will not
discuss them in this text.
5.2 Lemma. (The Riemann-Lebesgue lemma) Let $f : \mathbb{R} \to \mathbb{C}$ be an integrable
function. Then $\hat f : \mathbb{R} \to \mathbb{C}$ is continuous and we have
$$\lim_{t\to\infty} \hat f(t) = \lim_{t\to-\infty} \hat f(t) = 0.$$
Proof. When $t_n \to t$, then $f(x)e^{-it_n x} \to f(x)e^{-itx}$, while $|f(x)e^{-it_n x}| = |f(x)|$.
Thus, $\hat f(t_n) \to \hat f(t)$ by the Lebesgue Dominated Convergence Theorem. This
proves continuity.
To prove the limit formula, first consider the case when $f = c_{(a,b\rangle}$, $a < b$:
we have
$$\left|\int_{(a,b\rangle} e^{-itx}\,dx\right| = \frac{1}{|t|}\,|e^{-itb} - e^{-ita}|$$
and the right-hand side goes to 0 with $|t| \to \infty$. By a step function we shall now
mean a (finite) $\mathbb{C}$-linear combination of the functions $c_{(a,b\rangle}$ (with varying $a < b$).
Then we claim that for every integrable function $f : \mathbb{R} \to \mathbb{C}$ and every $\varepsilon > 0$, there
exists a step function $s$ such that
$$\int_{\mathbb{R}} |f - s| < \varepsilon.$$
First, this is true for continuous functions with compact supports (by the
convergence of the Riemann integral). Then it is true for non-negative functions in $Z^{dn}$ and
hence for all integrable functions by the Lebesgue Monotone Convergence Theorem
and linearity of integrals. But
$$|\hat f(t) - \hat s(t)| \le \int_{\mathbb{R}} |f - s| < \varepsilon,$$
and thus the limit formula for $s$ implies the limit formula for $f$. $\square$
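For the characteristic function of an interval $(a,b\rangle$, the first step of the proof of Lemma 5.2 gives a closed formula and the explicit decay bound $|\hat f(t)| \le 2/(|t|\sqrt{2\pi})$. The sketch below evaluates that closed formula, assuming the normalization of (5.1.1); the function name `hat_indicator` is hypothetical.

```python
import cmath
import math

def hat_indicator(a, b, t):
    """Fourier transform (5.1.1) of the characteristic function of (a, b], for t != 0.

    Uses int_a^b e^{-itx} dx = (e^{-ita} - e^{-itb}) / (it).
    """
    integral = (cmath.exp(-1j * t * a) - cmath.exp(-1j * t * b)) / (1j * t)
    return integral / math.sqrt(2 * math.pi)

# The transform is bounded by 2 / (|t| * sqrt(2*pi)), so it tends to 0 as |t| grows.
ts = [1.0, 10.0, 100.0, 1000.0, -1000.0]
decay_ok = all(
    abs(hat_indicator(0.0, 1.0, t)) <= 2 / (abs(t) * math.sqrt(2 * math.pi)) + 1e-15
    for t in ts
)
```

For a general integrable function no rate of decay is available; the lemma only asserts that the limit is 0.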
5.3 Lemma. Let $f : \mathbb{R} \to \mathbb{C}$ be such that both $f(x)$ and $x \cdot f(x)$ are integrable.
Then $\hat f(t)$ is differentiable, and
$$\frac{d\hat f}{dt} = \widehat{-ixf(x)}(t).$$
(Note: By the right-hand side, we mean the Fourier transform of $-ixf(x)$, which is
a function of $t$.)
Proof. Under the conditions given, we have
$$\frac{d}{dt}\int_{\mathbb{R}} f(x)e^{-itx}\,dx = \int_{\mathbb{R}} \frac{\partial f(x)e^{-itx}}{\partial t}\,dx = \int_{\mathbb{R}} (-ix)f(x)e^{-itx}\,dx$$
by Theorem 5.2 of Chapter 5 (differentiation under the integral sign). $\square$
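5.4 Lemma. Let $f : \mathbb{R} \to \mathbb{C}$ have a continuous derivative, and assume $f(x)$ and
$f'(x)$ are integrable, and that
$$\lim_{x\to\pm\infty} f(x) = 0.$$
Then we have
$$\widehat{f'}(t) = it\hat f(t).$$
Proof. Compute:
$$\widehat{f'}(t) = \frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}} f'(x)e^{-itx}\,dx = \frac{1}{\sqrt{2\pi}}\lim_{a\to\infty} \int_{\langle -a,a\rangle} f'(x)e^{-itx}\,dx$$
$$= \frac{1}{\sqrt{2\pi}}\lim_{a\to\infty} \Big(f(a)e^{-ita} - f(-a)e^{ita} + \int_{\langle -a,a\rangle} itf(x)e^{-itx}\,dx\Big) = \frac{it}{\sqrt{2\pi}} \int_{\mathbb{R}} f(x)e^{-itx}\,dx.$$
The passages to the limit follow from the Lebesgue Dominated Convergence
Theorem. The middle equality is integration by parts (for the Riemann integral). $\square$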
5.5 Lemma. Let $f, g : \mathbb{R} \to \mathbb{C}$ be integrable functions and let $a > 0$. Then
$$\int_{\mathbb{R}} f(ax)\hat g(x)\,dx = \int_{\mathbb{R}} g(ay)\hat f(y)\,dy. \tag{5.5.1}$$
(Again, on both sides, we mean the Lebesgue integral.)
Proof. First note that both sides of (5.5.1) make sense by the Riemann-Lebesgue
lemma, since $\hat f$ and $\hat g$ are continuous and bounded. Next, consider the integral
$$\int_{\mathbb{R}^2} f(x)g(t)e^{-itx/a}.$$
Clearly, this integral exists (replace the integrand by $|f(x)| \cdot |g(t)|$), and is equal,
up to the same constant factor $1/(a\sqrt{2\pi})$, to both sides of (5.5.1) by Fubini’s Theorem
and linear substitutions $x/a = u$ and $t/a = v$. $\square$
5.6 Rapidly decreasing functions
A function $f : \mathbb{R} \to \mathbb{C}$ is called rapidly decreasing (or Schwarzian) if $f$ has all
derivatives, and for all numbers $m, n = 0, 1, 2, \dots$, we have
$$\lim_{x\to\pm\infty} x^m f^{(n)}(x) = 0.$$
(Note that the term “rapidly decreasing” is a misnomer, since these functions are,
in fact, never decreasing.) Note that any smooth function with compact support is
rapidly decreasing (since all its derivatives will have, again, compact support). The
vector space of all rapidly decreasing functions $f : \mathbb{R} \to \mathbb{C}$ is denoted by $\mathcal{S}$.
Lemma. Let $f \in \mathcal{S}$. Then $\hat f \in \mathcal{S}$.
Proof. By induction, using Lemmas 5.3 and 5.4, $t^m \hat f^{(n)}(t)$ is a (finite) linear
combination of functions of the form $\widehat{x^k f^{(\ell)}(x)}(t)$. Use the assumption and the
Riemann-Lebesgue lemma. $\square$
5.7 The Fourier Inversion Theorem
Define the inverse Fourier transform $\tilde f$ by
$$\tilde f(t) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} f(x)e^{itx}\,dx.$$
Then by definition,
$$\tilde f = \overline{\hat{\bar f}}$$
where $\bar x$ is the complex conjugate of $x$. It follows that the inverse Fourier transform
maps $\mathcal{S}$ to $\mathcal{S}$.
Theorem. For $f \in \mathcal{S}$, the inverse Fourier transform of the Fourier transform of $f$
is equal to $f$.
Proof. Let $f, g \in \mathcal{S}$. By the Lebesgue Dominated Convergence Theorem and
Lemma 5.6, we may pass to the limit $a \to 0$ in Lemma 5.5, getting
$$f(0)\int_{\mathbb{R}} \hat g = g(0)\int_{\mathbb{R}} \hat f. \tag{5.7.1}$$
Setting
$$g(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2},$$
and using Exercise (15) of Chapter 5, and Exercise (18) below, (5.7.1) becomes
$$f(0) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \hat f,$$
which is the special case of the formula we desire at the point $x = 0$. The general
case follows from Exercise (16) below. $\square$
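The proof above relies on the fact (Exercise (18)) that $e^{-x^2/2}$ is its own Fourier transform under the normalization (5.1.1). A numerical sanity check is sketched below; it assumes a midpoint rule on a truncated interval, and the helper name `fourier_transform` and the truncation length `L` are choices made for this illustration only.

```python
import cmath
import math

def fourier_transform(f, t, L=12.0, N=4000):
    """Midpoint-rule approximation of (5.1.1), truncating the integral to [-L, L]."""
    h = 2 * L / N
    total = 0j
    for k in range(N):
        x = -L + (k + 0.5) * h
        total += f(x) * cmath.exp(-1j * x * t)
    return total * h / math.sqrt(2 * math.pi)

gauss = lambda x: math.exp(-x * x / 2)

# Compare the numerical transform with the function itself at a few points.
errors = [abs(fourier_transform(gauss, t) - gauss(t)) for t in (0.0, 0.5, 1.0, 2.0)]
max_error = max(errors)
```

The truncation is harmless here because the Gaussian is negligible outside $\langle -12, 12\rangle$; for slowly decaying functions this crude quadrature would not be adequate.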
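Corollary. For $f, g \in \mathcal{S}$, we have
$$\langle f, g\rangle = \langle \hat f, \hat g\rangle.$$
Proof. We have
$$\langle \hat f, \hat g\rangle = \int_{\mathbb{R}} \hat f\,\overline{\hat g} = \int_{\mathbb{R}} f\,\widehat{\overline{\hat g}} = \int_{\mathbb{R}} f\,\overline{\tilde{\hat g}} = \int_{\mathbb{R}} f\,\overline{g} = \langle f, g\rangle. \qquad \square$$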
Lemma. $\mathcal{S}$ is dense in $L^2(\mathbb{R},\mathbb{C})$.
Proof. In effect, by Theorem 4.1.1, continuous functions with compact support
are dense in $L^2(\mathbb{R},\mathbb{C})$, but we claim that if $f$ is a continuous function with
compact support $K$, then there exists an $L \supseteq K$ compact and $f_n$ smooth with
$\mathrm{supp}(f_n) \subseteq L$ which converge to $f$ uniformly (and hence in $L^2$). In effect, let
$U$ be any open neighborhood of $K$ such that $\overline{U}$ is compact. Let $\varepsilon > 0$. Since $f$
is uniformly continuous, there exists a $\delta > 0$ such that for $x \in K$, $y \in \Omega(x,\delta)$,
$|f(x) - f(y)| < \varepsilon$. Further, by compactness, for $\delta > 0$ sufficiently small and all
$x \in K$, $\Omega(x,\delta) \subseteq U$. Now choose a smooth partition of unity $u_x$ subordinate to the
open cover by the balls $\Omega(x,\delta)$ and $\mathbb{R} \setminus K$, and let
$$g(t) = \sum_{x\in K} u_x(t)f(x).$$
Then $|f(t) - g(t)| < 2\varepsilon$ for all $t \in K$, while $g$ is smooth and $\mathrm{supp}(g) \subseteq U$. $\square$
5.8 Theorem. The maps $\mathcal{S} \to \mathcal{S}$ given by $f \mapsto \hat f$, $f \mapsto \tilde f$ extend to linear
isometries
$$\mathcal{F}, \mathcal{F}^{-1} : L^2(\mathbb{R},\mathbb{C}) \to L^2(\mathbb{R},\mathbb{C})$$
which are inverse to each other.
Proof. An isometry of inner product spaces is always injective. Thus, by
Corollary 5.7, the Fourier transform gives an injective linear map $\mathcal{S} \to \mathcal{S}$, and by
Theorem 5.7, it is onto. Hence it is a linear isomorphism, and the inverse Fourier
transform is an inverse linear isomorphism. Hence, the inverse Fourier transform is
also an isometry (of course, this could also be proved directly).
Now composing either the Fourier transform or the inverse Fourier transform
$\mathcal{S} \to \mathcal{S}$ with the inclusion $\mathcal{S} \subseteq L^2(\mathbb{R},\mathbb{C})$, we obtain uniformly continuous maps into
a complete metric space, which can therefore be uniquely extended to a uniformly
continuous map
$$L^2(\mathbb{R},\mathbb{C}) \to L^2(\mathbb{R},\mathbb{C})$$
by Proposition 4.6 of Chapter 9. These maps are clearly linear isometries by
continuity of the inner product and vector space operations, and are inverse to each
other by uniqueness of the extension. $\square$
6 Exercises
(1) Prove that the expression (1.2.2) of 1.2 does not depend on the expression of a
simple function (1.2.1).
(2) Prove that the function $\mathrm{vol}_g(B)$ on Borel subsets $B$ of a Riemann manifold $M$
from Exercise (5) of Chapter 15 is a Borel measure on $M$.
(3) Prove that for two Riemann metrics $g_1$, $g_2$ on a smooth manifold $M$, the Borel
measure $\mathrm{vol}_{g_1}$ is absolutely continuous with respect to the Borel measure $\mathrm{vol}_{g_2}$.
Conclude that it makes sense to speak of a measure 0 set in a smooth manifold,
even when we do not specify a Riemann metric.
(4) Extend the Radon-Nikodym Theorem to the case when there exist subsets
$X_1 \subseteq X_2 \subseteq \cdots \subseteq X$ such that $X = \bigcup X_n$ and $\mu(X_n) < \infty$. (The measure $\mu$ on $X$
is then called $\sigma$-finite.) Note that we are keeping the assumption $\nu(X) < \infty$.
[Hint: Apply Theorem 2.2 for each $X_n$ instead of $X$.]
(5) Prove uniqueness in the Radon-Nikodym Theorem, i.e. prove that if two
functions $h_1$, $h_2$ in the statement of Theorem 2.2 satisfy the conclusion, then
they are equal almost everywhere.
(6) Prove that if $B \subseteq \mathbb{R}^n$ is a Borel set and $\mu$ is a $\sigma$-finite Borel measure on $B$, then
there is an isomorphism of Banach spaces $(L^1_\mu(B))^* \cong L^\infty_\mu(B)$, and similarly
in the complex case.
[Hint: Extend the Radon-Nikodym Theorem to a situation where instead
of the measure $\nu$ we have a continuous linear functional on $L^1_\mu(B)$ under
the condition $\mu(X) < \infty$ - the proof is the same! The “Radon-Nikodym
derivative” $h$ is the function in $L^\infty$ which we are seeking; Exercise (4) is also
relevant. To prove that there is a bound $M$ such that $|h(x)| < M$ almost
everywhere, assume for contradiction that $|h(x)| > 2^n$ on a subset $X_n$ of
positive measure, $X_n$ disjoint. Then there exists an integrable function $f_n$ with
$\int_{X_n} |f_n| \le 1/2^n$, $\int_{X_n} f_n h = 1$.]
(7) Prove that if $U$ is an open set in $\mathbb{R}^n$, then the spaces $L^1(U)$, $L^1(U,\mathbb{C})$ are not
reflexive.
[Hint: Use 4.1.2. Let $V$ be the closure of $C_c(U)$ in $L^\infty(U)$. Prove that there
is a non-zero continuous linear form $X$ on $L^\infty(U)$ which is 0 on $C_c(U)$. Consequently,
$X$ cannot come from $L^1(U)$. (Consider Exercise (6).)]
(8) Prove that in Lemma 2.3, the assumption $\nu(X) < \infty$ is needed. Find a
counterexample and describe where the proof goes wrong when we omit this
condition.
(9) The requirement in Definition 3.1 that the intervals $\langle a_i,b_i\rangle$ be disjoint is
needed. Give an example showing that we get a different notion if we drop it.
(10) Prove that while $x^2 \sin(1/x^2)$ has a derivative everywhere, it is not absolutely
continuous on $\langle -1,1\rangle$, and thus the Lebesgue integral of its derivative does not
exist.
(11) Let $F : \langle a,b\rangle \to \mathbb{R}$ be Lipschitz (see 3.1 of Chapter 6). Prove that $F$ has a
derivative almost everywhere.
(12) In analogy with 4.1, find a Hilbert basis of the space $L^2(\langle a,b\rangle)$ for $a < b$.
(13) Using 4.1, find a real orthonormal basis of the real Hilbert space
$L^2(\langle 0,2\pi\rangle,\mathbb{R})$.
(14) Let $f : \langle 0,2\pi\rangle \to \mathbb{R}$ be defined by
(a) $f(x) = x$,
(b) $f(x) = 1$ for $0 \le x \le \pi$ and $f(x) = 0$ else.
Compute the Fourier series of $f$.
(15) Prove that the functions $\varphi_{m,n} : \mathbb{R} \to \mathbb{C}$ where $\varphi_{m,n}(x) = \frac{1}{\sqrt{2\pi}} e^{inx}$ when
$2\pi m \le x < 2\pi(m+1)$ and $\varphi_{m,n}(x) = 0$ otherwise, form an orthonormal
basis of $L^2(\mathbb{R},\mathbb{C})$.
(16) Let $f : \mathbb{R} \to \mathbb{C}$ be an integrable function and let $a \in \mathbb{R}$. Define a function
$f_a : \mathbb{R} \to \mathbb{C}$ by $f_a(x) = f(x + a)$. Prove that $\widehat{f_a}(t) = e^{ita}\hat f(t)$.
(17) Define the convolution of functions $f, g : \mathbb{R} \to \mathbb{C}$ by
$$f * g(t) = \int_{\mathbb{R}} f(x)g(t - x)\,dx.$$
Prove that if $f$ and $g$ are integrable then the convolution is well defined, and,
with the normalization of (5.1.1), one has
$$\widehat{f * g} = \sqrt{2\pi}\,\hat f \cdot \hat g.$$
[Hint: Use Fubini’s Theorem.]
(18) Prove that the function $e^{-x^2/2}$ is rapidly decreasing and that its Fourier
transform is the same function.
A Linear Algebra I: Vector Spaces
1 Vector spaces and subspaces
1.1
Let $\mathbb{F}$ be a field (in this book, it will always be either the field of reals $\mathbb{R}$ or the field
of complex numbers $\mathbb{C}$). A vector space
$$V = (V, +, o, \alpha\cdot(-)\ (\alpha \in \mathbb{F}))$$
over $\mathbb{F}$ is a set $V$ with a binary operation $+$, a constant $o$ and a collection of unary
operations (i.e. maps) $\alpha\cdot(-) : V \to V$ labelled by the elements of $\mathbb{F}$, satisfying
(V1) $(x + y) + z = x + (y + z)$,
(V2) $x + y = y + x$,
(V3) $0 \cdot x = o$,
(V4) $\alpha \cdot (\beta \cdot x) = (\alpha\beta) \cdot x$,
(V5) $1 \cdot x = x$,
(V6) $(\alpha + \beta) \cdot x = \alpha \cdot x + \beta \cdot x$, and
(V7) $\alpha \cdot (x + y) = \alpha \cdot x + \alpha \cdot y$.
Here, we write $\alpha \cdot x$ and we will write also $\alpha x$ for the result $\alpha\cdot(x)$ of the unary
operation $\alpha\cdot(-)$ in $x$. Often, one uses the expression “multiplication of $x$ by $\alpha$”; but it is
useful to keep in mind that what we really have is a collection of unary operations
(see also 5.1 below). The elements of a vector space are often referred to as vectors.
In contrast, the elements of the field $\mathbb{F}$ are then often referred to as scalars.
In view of this, it is useful to reflect for a moment on the true meaning of the
axioms (equalities) above. For instance, (V4), often referred to as the “associative
law”, in fact states that the composition of the functions $V \to V$ labelled by
$\beta$, $\alpha$ is labelled by the product $\alpha\beta$ in $\mathbb{F}$, the “distributive law” (V6) states that the
(pointwise) sum of the mappings labelled by $\alpha$ and $\beta$ is labelled by the sum $\alpha + \beta$
in $\mathbb{F}$, and (V7) states that each of the maps $\alpha\cdot(-)$ preserves the sum $+$. See Example 3
in 1.2.

1.2 Examples
Vector spaces are ubiquitous. We present just a few examples; the reader will
certainly be able to think of many more.
1. The n-dimensional row vector space $\mathbb{F}^n$. The elements of $\mathbb{F}^n$ are the n-tuples
$(x_1, \dots, x_n)$ with $x_i \in \mathbb{F}$, the addition is given by
$$(x_1, \dots, x_n) + (y_1, \dots, y_n) = (x_1 + y_1, \dots, x_n + y_n),$$
$o = (0, \dots, 0)$, and the $\alpha$’s operate by the rule
$$\alpha \cdot ((x_1, \dots, x_n)) = (\alpha x_1, \dots, \alpha x_n).$$
Note that $\mathbb{F}^1$ can be viewed as $\mathbb{F}$ itself. However, although the operations $\alpha\cdot$ come
from the binary multiplication in $\mathbb{F}$, their role in a vector space is different. See
5.1 below.
2. Spaces of real functions. The set $F(M)$ of all real functions on a set $M$,
with pointwise addition and multiplication by real numbers, is obviously a
vector space over $\mathbb{R}$. Similarly, we have the vector space $C(J)$ of all the
continuous functions on an interval $J$, or e.g. the space $C^1(J)$ of all continuously
differentiable functions on an open interval $J$, or the space $C^\infty(J)$ of all smooth
functions on $J$, i.e. functions which have all higher derivatives. There are also
analogous $\mathbb{C}$-vector spaces of complex functions.
3. Let $V$ be the set of positive reals. Define $x \oplus y = xy$, $o = 1$, and for arbitrary
$\alpha \in \mathbb{R}$, $\alpha \cdot x = x^\alpha$. Then $(V, \oplus, o, \alpha\cdot(-)\ (\alpha \in \mathbb{R}))$ is a vector space (see
Exercise (1)).
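Example 3 can be spot-checked numerically. The following is only a sketch, not a proof: it evaluates the axioms (V1) to (V7) of 1.1 at a few sample values with a floating-point tolerance. The function names vadd, smul and check_axioms are hypothetical and chosen just for this illustration.

```python
import math

# Hypothetical names for the operations of Example 3 on the positive reals.
def vadd(x, y):
    """Vector 'addition': x (+) y = x * y."""
    return x * y

def smul(alpha, x):
    """Action of the scalar alpha: alpha . x = x ** alpha."""
    return x ** alpha

ZERO = 1.0  # the zero vector o of this space

def close(a, b):
    # floating-point comparison; the tolerance is an arbitrary choice
    return math.isclose(a, b, rel_tol=1e-9)

def check_axioms(x, y, z, alpha, beta):
    """Evaluate (V1)-(V7) at the given sample points; True if all hold numerically."""
    return all([
        close(vadd(vadd(x, y), z), vadd(x, vadd(y, z))),                      # (V1)
        close(vadd(x, y), vadd(y, x)),                                        # (V2)
        close(smul(0.0, x), ZERO),                                            # (V3)
        close(smul(alpha, smul(beta, x)), smul(alpha * beta, x)),             # (V4)
        close(smul(1.0, x), x),                                               # (V5)
        close(smul(alpha + beta, x), vadd(smul(alpha, x), smul(beta, x))),    # (V6)
        close(smul(alpha, vadd(x, y)), vadd(smul(alpha, x), smul(alpha, y))), # (V7)
    ])
```

For instance, check_axioms(2.0, 3.5, 0.25, 1.5, -2.0) evaluates all seven identities at these values. The underlying reason they hold is that the logarithm translates $\oplus$ into ordinary addition and $x^\alpha$ into ordinary multiplication by $\alpha$.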
1.3 An important convention
We have distinguished above the elements of the vector space and the elements
of the field by using Roman and Greek letters. This is a good convention for a
definition, but in the row vector spaces $\mathbb{F}^n$, which will play a particular role below,
it is somewhat clumsy. Instead, we will use for an arithmetic vector a bold-faced
variant of the letter denoting the coordinates. Thus,
$$\mathbf{x} = (x_1, \dots, x_n), \quad \mathbf{a} = (a_1, \dots, a_n), \quad \text{etc.}$$
Similarly we will write
$$\mathbf{f} = (f_1, \dots, f_n)$$
for the n-tuple of functions $f_j : X \to \mathbb{R}$ resp. $\mathbb{C}$ (after all, they can be viewed as
mappings $\mathbf{f} : X \to \mathbb{F}^n$), and similarly in analogous situations.
These conventions make reading about vectors much easier, and we will maintain
them as long as possible (for example in our discussion of multivariable differential
calculus in Chapter 3). The fact is, however, that in certain more advanced settings
the conventions become cumbersome or even ambiguous (for example in the context
of tensor calculus in Chapter 15), and because of this, in the later chapters of this
book we eventually abandon them, as one usually does in more advanced topics of
analysis.
We do, however, use the symbol $o$ universally for the zero element of a general
vector space, so that in $\mathbb{F}^n$ we have $o = (0, 0, \dots, 0)$.
1.4
We have the following trivial
Observation. In any vector space $V$, for all $x \in V$, we have $x + o = x$ and there
exists precisely one $y$ such that $x + y = o$, namely $y = (-1)x$.
(Indeed, $x + o = 1 \cdot x + 0 \cdot x = (1 + 0)x = x$ and $x + (-1)x = 1x + (-1)x =
(1 + (-1))x = 0 \cdot x = o$, and if $x + y = o$ and $x + z = o$ then
$y = y + (x + z) = (y + x) + z = z$.)
1.5 (Vector) subspaces
A subspace of a vector space $V$ is a subset $W \subseteq V$ that is itself a vector space with
the operations inherited from $V$. Since the equations required in $V$ hold for special
as well as general elements, we have a trivial
Observation. A subset $W \subseteq V$ of a vector space is a subspace if and only if
(a) $o \in W$,
(b) if $x, y \in W$ then $x + y \in W$, and
(c) for all $\alpha \in \mathbb{F}$ and $x \in W$, $\alpha x \in W$.
1.5.1
Also the following statement is immediate.
Proposition. The intersection of an arbitrary set of subspaces of a vector space V

is a subspace of V .
1.6 Generating sets
By 1.5.1, we see that for each subset $M$ of $V$ there exists the smallest subspace
$W \subseteq V$ containing $M$, namely
$$L(M) = \bigcap \{W \mid W \text{ subspace of } V \text{ and } M \subseteq W\}.$$
For $M$ finite, we use the notation
$$L(u_1, \dots, u_n) \text{ instead of } L(\{u_1, \dots, u_n\}).$$
Obviously $L(\emptyset) = \{o\}$.

We say that M generates L.M /; in particular if L.M / D V we say that M is a
generating set (of V ). One often speaks of a set of generators but we have to keep
in mind that this does not imply each of its elements generates V , which would be
a much stronger statement.
If there exists a finite generating system we say that V is finitely generated, or
finite-dimensional.
1.7 The sum of subspaces
Let $W_1, W_2$ be subspaces. Unlike the intersection $W_1 \cap W_2$, the union $W_1 \cup W_2$
is generally (and typically) not a subspace. But we have the smallest subspace
containing both $W_1$ and $W_2$, namely $L(W_1 \cup W_2)$. It will be denoted by
$$W_1 + W_2$$
and called the sum of $W_1$ and $W_2$. (One often uses the symbol ‘$\oplus$’ instead of ‘$+$’
when one also has $W_1 \cap W_2 = \{o\}$.)
2 Linear combinations, linear independence
2.1
A linear combination of a system $x_1, \dots, x_n$ of elements of a vector space $V$ over
$\mathbb{F}$ is a formula
$$\alpha_1 x_1 + \cdots + \alpha_n x_n \quad \Big(\text{briefly, } \sum_{j=1}^{n} \alpha_j x_j\Big). \qquad (*)$$
The “system” in question is to be understood as the sequence, although the order
in which it is presented will play no role. However, a possible repetition of an
individual element is essential.
Note that we spoke of (*) as a “formula”. That is, we had in mind the full
information involved (more pedantically, we could speak of the linear combination
as the sequence together with the mapping $\{1, \dots, n\} \to \mathbb{F}$ sending $j$ to $\alpha_j$).
The vector obtained as the result of the indicated operations should be referred to as
the result of the linear combination (*). We will follow this convention consistently
to begin with; later, we will speak of a linear combination $\sum_{j=1}^{n} \alpha_j x_j$ more loosely,
trusting that the reader will be able to tell from the context whether we mean
the explicit formula or its result.
2.2
A linear combination (*) is said to be non-trivial if at least one of the $\alpha_j$ is non-zero.
A system $x_1, \dots, x_n$ is linearly dependent if there exists a non-trivial linear
combination (*) with result $o$. Otherwise, we speak of a linearly independent
system.
2.2.1 Proposition. 1. If $x_1, \dots, x_n$ is linearly dependent resp. independent then for
any permutation $\pi$ of $\{1, \dots, n\}$ the system $x_{\pi(1)}, \dots, x_{\pi(n)}$ is linearly dependent
resp. independent.
2. A subsystem of a linearly independent system is linearly independent.
3. Let $\beta_2, \dots, \beta_n$ be arbitrary. Then $x_1, \dots, x_n$ is linearly independent if and only if
the system $x_1 + \sum_{j=2}^{n} \beta_j x_j,\ x_2, \dots, x_n$ is.
4. A system $x_1, \dots, x_n$ is linearly dependent if and only if one of its members is
a (result of a) linear combination of the others.
In particular, any system containing $o$ is linearly dependent. Similarly, if there
exist $j \ne k$ such that $x_j = x_k$ then $x_1, \dots, x_n$ is linearly dependent.
Proof. 1. is trivial.
2. A non-trivial linear combination demonstrating the dependence of the smaller
system demonstrates the dependence of the bigger one if we put $\alpha_j = 0$ for the
remaining summands.
3. It suffices to prove one implication; the other follows by symmetry since the first
system can be obtained from the second by using the coefficients $-\beta_j$. Thus,
let $\alpha_1 (x_1 + \sum_{j=2}^{n} \beta_j x_j) + \alpha_2 x_2 + \cdots + \alpha_n x_n = o$ with an $\alpha_k \ne 0$. Then we have
$$\alpha_1 x_1 + (\alpha_2 + \alpha_1 \beta_2) x_2 + \cdots + (\alpha_n + \alpha_1 \beta_n) x_n = o$$
and it is a non-trivial linear combination of the $x_1, \dots, x_n$: indeed either $\alpha_1 \ne 0$
or $(\alpha_k + \alpha_1 \beta_k) = \alpha_k \ne 0$.
4. If $\alpha_1 x_1 + \cdots + \alpha_n x_n = o$ (briefly, $\sum_{j=1}^{n} \alpha_j x_j = o$) with $\alpha_k \ne 0$ then
$x_k = \sum_{j \ne k} \frac{-\alpha_j}{\alpha_k} x_j$. On the other hand, if $x_k = \sum_{j \ne k} \alpha_j x_j$ we have the non-trivial
linear combination $x_k + \sum_{j \ne k} (-\alpha_j) x_j = o$. □
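For systems in the row vector space $\mathbb{Q}^n$, items 1 and 3 of 2.2.1 justify the usual elimination test for independence: permuting the vectors and adding multiples of one vector to another does not change whether the system is independent, and a system containing $o$ is dependent. The sketch below makes this concrete under the assumption that the vectors are given as tuples of rational numbers; the names rank and is_linearly_independent are hypothetical helpers, not notation from the text.

```python
from fractions import Fraction

def rank(vectors):
    """Number of non-zero rows left after Gaussian elimination (exact arithmetic)."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivot rows found so far
    for col in range(ncols):
        # look for a row at position >= r with a non-zero entry in this column
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]          # permutation, 2.2.1 item 1
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            # adding a multiple of another vector, 2.2.1 item 3
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def is_linearly_independent(vectors):
    """A system is independent exactly when elimination produces no zero row."""
    return rank(vectors) == len(vectors)
```

Exact Fraction arithmetic is used so that the test "entry equals zero" is reliable; with floating-point numbers one would need a tolerance instead.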
2.3 Conventions
We speak of a linearly independent finite set $X \subseteq V$ if $X$ is independent when
ordered as a sequence without repetition. A general subset $X \subseteq V$ is said to be
independent if each of its finite subsets is independent.
2.4 Theorem. Let $M$ be an arbitrary subset of a vector space $V$. Then $L(M)$ is the
set of all the (results of) linear combinations of finite subsystems of $M$.
Proof. The set of all such results of linear combinations is obviously a subspace of
$V$. On the other hand, a subspace $W$ containing $M$ has to contain all the (results of)
linear combinations of elements of $M$. □
2.5 Proposition. $L(u_1, \dots, u_n) \subseteq L(v_1, \dots, v_k)$ if and only if each of the $u_j$’s is a
linear combination of $v_1, \dots, v_k$.
Proof. If it is, the inclusion follows from 2.4 since $L(u_1, \dots, u_n)$ is the smallest
subspace containing all the $u_j$; if we have the inclusion then the $u_j$’s are the desired
linear combinations, again by 2.4. □
2.6 Theorem. (Steinitz’ Theorem, or The Exchange Theorem) Let $v_1, \dots, v_k$ be a
linearly independent system in a vector space $V$ and let $\{u_1, \dots, u_n\}$ be a generating
set. Then
(1) $k \le n$, and
(2) there exists a bijection $\phi : \{1, \dots, n\} \to \{1, \dots, n\}$ (i.e. a permutation of the
set $\{1, \dots, n\}$) such that
$$\{v_1, \dots, v_k, u_{\phi(k+1)}, \dots, u_{\phi(n)}\}$$
is a generating set.
Proof. By induction on $k$.
If $k = 1$ we have $v_1 = \sum_{j=1}^{n} \alpha_j u_j$ and since $v_1 \ne o$ by 2.2, there exists at least
one $u_{j_0}$ with $\alpha_{j_0} \ne 0$. Now
$$u_{j_0} = \frac{1}{\alpha_{j_0}} v_1 - \sum_{j \ne j_0} \frac{\alpha_j}{\alpha_{j_0}} u_j$$
and we have, by 2.5,
$$L(v_1, u_1, \dots, u_{j_0 - 1}, u_{j_0 + 1}, \dots, u_n) = L(u_1, \dots, u_n) = V.$$
Rearrange the $u_j$ by exchanging $u_1$ with $u_{j_0}$.
Now let the statement hold for $k$ and let us have a linearly independent system
$v_1, \dots, v_k, v_{k+1}$. Then $v_1, \dots, v_k$ is linearly independent and we have, after a
rearrangement of the $u_j$,
$$L(v_1, \dots, v_k, u_{k+1}, \dots, u_n) = V.$$
Since $v_{k+1} \in V$ we have
$$v_{k+1} = \sum_{j=1}^{k} \alpha_j v_j + \sum_{j=k+1}^{n} \alpha_j u_j.$$
We cannot have all the $\alpha_j$ with $j > k$ equal to zero: since $v_1, \dots, v_k, v_{k+1}$ are
independent, this would contradict 2.2.1 4. Thus, $\alpha_{j_0} \ne 0$ for some $j_0 > k$ and
hence, first,
$$n \ge k + 1,$$
and, second, after rearranging the $u_j$’s to exchange $u_{j_0}$ with $u_{k+1}$ we obtain
$$\frac{1}{\alpha_{k+1}} v_{k+1} = \sum_{j=1}^{k} \frac{\alpha_j}{\alpha_{k+1}} v_j + u_{k+1} + \sum_{j=k+2}^{n} \frac{\alpha_j}{\alpha_{k+1}} u_j,$$
and hence
$$u_{k+1} = -\sum_{j=1}^{k} \frac{\alpha_j}{\alpha_{k+1}} v_j + \frac{1}{\alpha_{k+1}} v_{k+1} - \sum_{j=k+2}^{n} \frac{\alpha_j}{\alpha_{k+1}} u_j,$$
and $L(v_1, \dots, v_k, v_{k+1}, u_{k+2}, \dots, u_n) = L(u_1, \dots, u_n) = V$ by 2.5 again. □
3 Basis and dimension
3.1
We have observed a somewhat complementary behaviour of generating sets and

independent systems: the former remain a generating set if more elements are
added, the latter remain independent if some elements are deleted. This suggests
the importance of minimal generating sets and maximal independent ones. We will
see they are, basically, the same. The resulting concept is of fundamental importance
in linear algebra.
A basis of a vector space V is a subset that is both generating and linearly
independent.
3.1.1 Observation. In a vector space $V$,
(1) if $u_1, \dots, u_n$ is a generating set then each $x$ can be written as $x = \sum_{j=1}^{n} \alpha_j u_j$,
(2) if $u_1, \dots, u_n$ is linearly independent then each $x$ can be written in at most one way
as $x = \sum_{j=1}^{n} \alpha_j u_j$,
(3) if $u_1, \dots, u_n$ is a basis then each $x$ can be written in precisely one way as
$x = \sum_{j=1}^{n} \alpha_j u_j$.
((1) is in 2.4; as for (2), if $\sum_{j=1}^{n} \alpha_j u_j = \sum_{j=1}^{n} \beta_j u_j$ then $\sum_{j=1}^{n} (\alpha_j - \beta_j) u_j = o$ and
$\alpha_j - \beta_j = 0$; (3) is a combination of (1) and (2).)
3.2 Theorem. 1. Every (finite) generating system $u_1, \dots, u_n$ contains a basis.
2. Every linearly independent system $v_1, \dots, v_k$ of a finitely generated vector
space can be extended to a basis.
3. All bases of a finitely generated vector space have the same number of elements.
Proof. 1. If $u_1, \dots, u_n$ are linearly independent we already have a basis. Else
there is, by 2.2.1 4, an element $u_j$, say $u_n$ (which we can achieve by rearrangement),
that is a linear combination of the others. Then by 2.5, $L(u_1, \dots, u_{n-1}) =
L(u_1, \dots, u_n) = V$ and we can repeat the procedure with the generating system
$u_1, \dots, u_{n-1}$. After repeating the procedure sufficiently many times we finish
with a generating system $u_1, \dots, u_k$ that is linearly independent. (Note that this last
system can be empty if the preceding system $u_1$ consisted of $u_1 = o$ only; the
empty system is formally independent, and constitutes a basis of the trivial vector
space $\{o\}$.)
2. From 1 we already know that $V$ has a basis $u_1, \dots, u_n$ and from 2.6 we infer that
after rearrangement we have a generating system
$$v_1, \dots, v_k, u_{k+1}, \dots, u_n \qquad (*)$$
and this, by 1 again, has to contain a basis. But this basis cannot be a proper
subset of (*), by 2.6, since there exists an independent system $u_1, \dots, u_n$.
3. If $u_1, \dots, u_n$ and $v_1, \dots, v_k$ are bases then by 2.6, $k \le n$ and $n \le k$. □
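Part 1 of Theorem 3.2 is constructive, and for generators in $\mathbb{Q}^n$ it can be sketched in code. Note that the sketch below is a variant of the procedure in the proof, not a transcription of it: instead of discarding redundant vectors from the end, it walks through the generators once and keeps a vector exactly when it is not a linear combination of those already kept. The name extract_basis is a hypothetical helper.

```python
from fractions import Fraction

def extract_basis(generators):
    """Return a linearly independent sublist of `generators` with the same span."""
    echelon = []  # pairs (pivot column, reduced copy of a kept vector)
    basis = []    # the original vectors that were kept
    for v in generators:
        w = [Fraction(c) for c in v]
        # subtract multiples of the reduced vectors so that w vanishes
        # in each of their pivot columns
        for pivot_col, e in echelon:
            if w[pivot_col] != 0:
                factor = w[pivot_col] / e[pivot_col]
                w = [a - factor * b for a, b in zip(w, e)]
        # a non-zero remainder means v is not a combination of the kept vectors
        pivot_col = next((i for i, c in enumerate(w) if c != 0), None)
        if pivot_col is not None:
            echelon.append((pivot_col, w))
            basis.append(tuple(v))
    return basis
```

By part 3 of the theorem, the length of the returned list does not depend on the order of the generators, even though the particular vectors selected may.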
3.3
The common cardinality of all bases of a finitely generated vector space $V$ is called
the dimension of $V$ and denoted by
$$\dim V.$$
From 2.6 and 3.2 we immediately obtain
Corollary. Let $\dim V = n$. Then
1. every generating system $u_1, \dots, u_n$ is a basis, and
2. every linearly independent system $u_1, \dots, u_n$ is a basis.
3.4 Theorem. A subspace $W$ of a finitely generated vector space $V$ is finitely
generated and we have $\dim W \le \dim V$. If $\dim W = \dim V$ then $W = V$.
Proof. We just have to show that $W$ is finitely generated; the other statements are
consequences of the already proved facts (since a basis of $W$ is a linearly independent
system in $V$). Suppose $W$ is not finitely generated. Then, first, it contains a
non-zero element $u_1$. Suppose we have already found a linearly independent system
$u_1, \dots, u_n$ in $W$. Since $W \ne L(u_1, \dots, u_n)$ there exists a $u_{n+1} \in W \setminus L(u_1, \dots, u_n)$. Then,
by 2.2.1 4, $u_1, \dots, u_n, u_{n+1}$ is linearly independent, and we can construct inductively
an arbitrarily large independent system, contradicting 2.6. □
3.5 Remark
We have learned that every finitely generated vector space has a basis. In fact, one
can easily prove, using Zorn’s lemma, that every vector space has one. Indeed, let
$$\{I_j \mid j \in J\}$$
be a chain of independent subsets of $V$. Then $I = \bigcup \{I_j \mid j \in J\}$ is an independent
set again, since any finite subset $M = \{x_1, \dots, x_n\} \subseteq I$ is independent: if $x_k \in I_{j_k}$
then $M \subseteq I_r$, the largest of the $I_{j_k}$, $k = 1, \dots, n$. Thus there exists a maximal
independent set $B$ and this $B$ is a basis: if there were $x \notin L(B)$ we would have
$\{x\} \cup B$ independent, by 2.2.1 4, contradicting the maximality.
Recall the sum of subspaces from 1.7. We have
3.6 Theorem. Let $W_1, W_2$ be finitely generated subspaces of a vector space $V$.
Then
$$\dim W_1 + \dim W_2 = \dim(W_1 \cap W_2) + \dim(W_1 + W_2).$$
Proof. Consider a basis $u_1, \dots, u_k$ of $W_1 \cap W_2$. By 3.2, there exist bases
$$u_1, \dots, u_k, v_{k+1}, \dots, v_r \text{ of } W_1, \text{ and}$$
$$u_1, \dots, u_k, w_{k+1}, \dots, w_s \text{ of } W_2.$$
Then the system
$$u_1, \dots, u_k, v_{k+1}, \dots, v_r, w_{k+1}, \dots, w_s$$
obviously generates $W_1 + W_2$ and hence our statement will follow if we prove that it
is linearly independent (and hence a basis), since then $\dim(W_1 + W_2) = r + s - k$.
To this end, let
$$\sum_{j=1}^{k} \alpha_j u_j + \sum_{j=k+1}^{r} \beta_j v_j + \sum_{j=k+1}^{s} \gamma_j w_j = o.$$
Then we have
$$\sum_{j=k+1}^{r} \beta_j v_j = -\sum_{j=1}^{k} \alpha_j u_j - \sum_{j=k+1}^{s} \gamma_j w_j \in W_1 \cap W_2$$
and since it also can be written as $\sum_{j=1}^{k} \delta_j u_j$, all the $\beta_j$ are zero, by 3.1.1.
Consequently,
$$\sum_{j=1}^{k} \alpha_j u_j + \sum_{j=k+1}^{s} \gamma_j w_j = o$$
and since $u_1, \dots, u_k, w_{k+1}, \dots, w_s$ is a basis, also all the $\alpha_j$ and $\gamma_j$ are zero. □
4 Inner products and orthogonality
4.1
In this section, it is important that we work with vector spaces over R or C. Since
all the formulas in the real context will be special cases of the respective complex
ones, the proofs will be done in C.
Recall the complex conjugate $\bar{z} = z_1 - i z_2$ of $z = z_1 + i z_2$, the formulas $\overline{z + z'} =
\bar{z} + \bar{z}'$ and $\overline{z \cdot z'} = \bar{z} \cdot \bar{z}'$, the absolute value $|z| = \sqrt{z \bar{z}}$, and realize that for a real $z$ this
absolute value is the standard one.
4.2
An inner product in a vector space $V$ over $\mathbb{C}$ resp. $\mathbb{R}$ is a mapping
$$((x, y) \mapsto x \cdot y) : V \times V \to \mathbb{C} \text{ resp. } \mathbb{R}$$
such that
(1) $u \cdot u \ge 0$ (in particular always real), and $u \cdot u = 0$ only if $u = o$,
(2) $u \cdot v = \overline{v \cdot u}$ ($u \cdot v = v \cdot u$ in the real case),
(3) $(\alpha u) \cdot v = \alpha (u \cdot v)$, and
(4) $u \cdot (v + w) = u \cdot v + u \cdot w$.
We usually write simply $uv$ for $u \cdot v$, and $u^2$ for $uu$. Note that
$$u(\alpha v) = \overline{(\alpha v) u} = \overline{\alpha (vu)} = \bar{\alpha}\, \overline{(vu)} = \bar{\alpha} (uv)$$
and, similarly using the complex conjugate twice,
$$(v + w) u = vu + wu.$$
Remark: The notation for an inner product sometimes varies. The most common
alternate notation to $x \cdot y$ is $\langle x, y \rangle$ (although one must beware of possible confusion
with our notation for closed intervals). The notation is particularly convenient when
we want to express the dependence of the product on some other data, such as a
matrix (see Section 7.7 below).
Further, we introduce the norm
$$\|u\| = \sqrt{uu}.$$
4.3 An important example
In the row vector space $\mathbb{F}^n$ we will use, without further mention, the inner product
$$\mathbf{x}\mathbf{y} = \sum_{j=1}^{n} x_j \overline{y_j} \quad \Big(\text{in the real case } \mathbf{x}\mathbf{y} = \sum_{j=1}^{n} x_j y_j\Big)$$
(see Exercise (2)). This specific example of an inner product is sometimes referred
to as the dot product.
4.4 Theorem. (The Cauchy-Schwarz inequality) We have $|xy| \le \sqrt{xx}\, \sqrt{yy}$.
Proof. We have
$$0 \le (x + \lambda y)(x + \lambda y) = xx + (\lambda y) x + x (\lambda y) + (\lambda y)(\lambda y) = xx + \lambda (yx) + \bar{\lambda} (xy) + \lambda \bar{\lambda} (yy). \qquad (*)$$
If $y = o$ then the inequality in the statement holds trivially. Else set
$$\lambda = -\frac{xy}{yy}$$
to obtain from (*)
$$0 \le xx - \frac{xy}{yy} (yx) - \frac{yx}{yy} (xy) + \frac{(xy)(yx)}{(yy)(yy)} (yy) = xx - \frac{xy}{yy} (yx)$$
and hence $(xy) \overline{(xy)} = (xy)(yx) \le (xx)(yy)$. Take square roots. □
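The inequality is easy to try out on concrete vectors. The following sketch, written under the assumption that vectors in $\mathbb{C}^n$ are given as Python lists of numbers, implements the dot product of 4.3 (linear in the first argument, conjugate-linear in the second) and compares both sides of 4.4 for one sample pair; inner and norm are hypothetical helper names.

```python
import math

def inner(x, y):
    """Dot product on C^n from 4.3: sum of x_j times the conjugate of y_j."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def norm(x):
    """The norm ||x|| = sqrt(xx); xx is real and non-negative by axiom (1) of 4.2."""
    return math.sqrt(inner(x, x).real)

# sample vectors, chosen arbitrarily for illustration
x = [1 + 2j, -1j, 3]
y = [2 - 1j, 4, 1 + 1j]

lhs = abs(inner(x, y))    # |xy|
rhs = norm(x) * norm(y)   # sqrt(xx) * sqrt(yy)
```

Here xx = 15 and yy = 23, and xy = 3 - 2i, so lhs is the square root of 13 while rhs is the square root of 345. Equality in 4.4 occurs when one vector is a scalar multiple of the other, which can be checked in the same way up to rounding.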
4.5
Vectors $u, v$ are said to be orthogonal if $uv = 0$. Note that
the only vector orthogonal to itself is $o$.
A system $u_1, \dots, u_n$ is said to be orthogonal if $u_j u_k = 0$ whenever $j \ne k$. It is
orthonormal if, moreover, $\|u_j\| = 1$ for all $j$.
4.5.1 Proposition. An orthogonal system consisting of non-zero elements (in particular,
an orthonormal system) is linearly independent.
Proof. Multiply $o = \sum \alpha_j u_j$ by $u_k$ from the right. We obtain $0 = (\sum \alpha_j u_j) u_k =
\sum \alpha_j (u_j u_k) = \alpha_k (u_k u_k)$. Since $u_k u_k \ne 0$, $\alpha_k = 0$. □
4.5.2 Theorem. (The Gram-Schmidt orthogonalization process) For every basis
$u_1, \dots, u_n$ of a vector space $V$ with inner product there exists an orthonormal basis
$v_1, \dots, v_n$ such that for each $k = 1, 2, \dots, n$,
$$L(v_1, \dots, v_k) = L(u_1, \dots, u_k).$$
If $u_1, \dots, u_r$ is orthonormal we can have $v_j = u_j$ for $j \le r$.
Proof. Start with $v_1 = \frac{u_1}{\|u_1\|}$. If we already have an orthonormal system $v_1, \dots, v_k$
such that $L(v_1, \dots, v_r) = L(u_1, \dots, u_r)$ for all $r \le k$ set
$$w = u_{k+1} - \sum_{j=1}^{k} (u_{k+1} v_j) v_j.$$
For all $v_r$, $r \le k$, we have
$$w v_r = u_{k+1} v_r - \sum_{j=1}^{k} (u_{k+1} v_j)(v_j v_r) = u_{k+1} v_r - u_{k+1} v_r = 0.$$
We have $w \ne o$ since otherwise $u_{k+1} = \sum_{j=1}^{k} (u_{k+1} v_j) v_j \in L(v_1, \dots, v_k) =
L(u_1, \dots, u_k)$, contradicting the linear independence of $u_1, \dots, u_k, u_{k+1}$. Thus we
can set
$$v_{k+1} = \frac{w}{\|w\|}$$
and obtain an orthonormal system $v_1, \dots, v_k, v_{k+1}$ and
$$L(v_1, \dots, v_k, v_{k+1}) = L(u_1, \dots, u_k, u_{k+1})$$
by 2.5.
Finally observe that if $u_1, \dots, u_r$ was already orthonormal, the procedure yields
$v_j = u_j$ until $j = r$. □
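The proof of 4.5.2 is an algorithm, and a minimal sketch of it for vectors in $\mathbb{C}^n$ with the dot product of 4.3 might look as follows. The helper names inner, norm and gram_schmidt are hypothetical, and the threshold 1e-12 used to detect a (numerically) zero vector $w$ is an arbitrary assumption of this sketch; in exact arithmetic the proof guarantees $w \ne o$.

```python
import math

def inner(x, y):
    """Dot product on C^n: sum of x_j times the conjugate of y_j."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x).real)

def gram_schmidt(basis):
    """Orthonormalize a linearly independent list of vectors, following 4.5.2."""
    ortho = []
    for u in basis:
        # w = u - sum_j (u v_j) v_j over the orthonormal vectors found so far
        w = [complex(c) for c in u]
        for v in ortho:
            coeff = inner(u, v)
            w = [a - coeff * b for a, b in zip(w, v)]
        length = norm(w)
        if length < 1e-12:  # assumed tolerance, see the remark above
            raise ValueError("input vectors are (numerically) linearly dependent")
        ortho.append([a / length for a in w])
    return ortho
```

Applied to a list whose first vectors are already orthonormal, the function returns those vectors unchanged (up to rounding), in line with the last statement of the theorem.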
4.6
The orthogonal complement of a subspace $W$ of a vector space $V$ with inner product
is the set
$$W^\perp = \{u \in V \mid uv = 0 \text{ for all } v \in W\}.$$
From the properties in 4.2 we immediately obtain
4.6.1 Observations. 1. $W^\perp$ is a subspace of $V$ and we have $W^\perp \cap W = \{o\}$ and
the implication
$$W_1 \subseteq W_2 \Rightarrow W_2^\perp \subseteq W_1^\perp.$$
2. $L(v_1, \dots, v_n)^\perp = \{u \mid u v_j = 0 \text{ for all } j = 1, \dots, n\}$.
4.6.2 Theorem. Let $V$ be a finite-dimensional vector space with inner product.
Then we have, for subspaces $W, W_j \subseteq V$,
(1) $W \oplus W^\perp = V$,
(2) $\dim W^\perp = \dim V - \dim W$,
(3) $(W^\perp)^\perp = W$, and
(4) $(W_1 \cap W_2)^\perp = W_1^\perp + W_2^\perp$ and $(W_1 + W_2)^\perp = W_1^\perp \cap W_2^\perp$.
Proof. (1) and (2): Let $u_1, \dots, u_k$ be an orthonormal basis of $W$. By 2.6 and
4.5.2 we can extend it to an orthonormal basis $u_1, \dots, u_k, u_{k+1}, \dots, u_n$ of $V$. If
$x = \sum_{j=1}^{n} \alpha_j u_j$ is in $W^\perp$ we have $0 = x u_r = \sum_{j=1}^{n} \alpha_j (u_j u_r) = \alpha_r$ for $r \le k$ and
$x \in L(u_{k+1}, \dots, u_n)$.
On the other hand, if $x \in L(u_{k+1}, \dots, u_n)$ then $x \in W^\perp$ by 4.6.1 2. Thus,
$W^\perp = L(u_{k+1}, \dots, u_n)$, and (1) and (2) follow.
(3) Obviously $W \subseteq (W^\perp)^\perp$. By (2), $\dim W = \dim (W^\perp)^\perp$ and hence $W =
(W^\perp)^\perp$ by 3.4.
(4) Obviously $W_i^\perp \subseteq (W_1 \cap W_2)^\perp$ and hence $W_1^\perp + W_2^\perp \subseteq (W_1 \cap W_2)^\perp$, and
similarly $W_1^\perp \cap W_2^\perp \supseteq (W_1 + W_2)^\perp$. Now, using (3) and 4.6.1 1 we obtain
$$(W_1 \cap W_2)^\perp = ((W_1^\perp)^\perp \cap (W_2^\perp)^\perp)^\perp \subseteq ((W_1^\perp + W_2^\perp)^\perp)^\perp = W_1^\perp + W_2^\perp, \text{ and}$$
$$(W_1 + W_2)^\perp = ((W_1^\perp)^\perp + (W_2^\perp)^\perp)^\perp \supseteq ((W_1^\perp \cap W_2^\perp)^\perp)^\perp = W_1^\perp \cap W_2^\perp. \ □$$
4.7 Hermitian and Symmetric Bilinear Forms
For a vector space $V$ over $\mathbb{C}$, a mapping $V \times V \to \mathbb{C}$ satisfying all the axioms of
4.2 except axiom (1) is called a Hermitian form. (Note that by axiom (2) of 4.2,
$B(v, v)$ is always a real number.) If we replace $\mathbb{C}$ by $\mathbb{R}$ in this definition, we speak
of a symmetric bilinear form (over $\mathbb{R}$). For Hermitian and symmetric bilinear forms,
one usually does not use the notation $\cdot$, but a letter, for example $B(u, v)$, $u, v \in V$.
A Hermitian (resp. real symmetric bilinear) form $B$ is then called positive definite
(resp. negative definite) if $B$ is an inner product (resp. $-B$ is an inner product). $B$
is called indefinite if it is neither positive nor negative definite. A Hermitian resp.
real symmetric bilinear form $B$ is called degenerate if there exists a non-zero vector
$v \in V$ such that for every $w \in V$, $B(v, w) = 0$. Otherwise, $B$ is called non-degenerate.
Clearly, every degenerate Hermitian or real symmetric bilinear form is
indefinite.
Real symmetric bilinear forms, and whether they are non-degenerate and positive
or negative definite, are important in multivariable differential calculus (see Section 8
of Chapter 3). Hermitian forms behave analogously in many ways. It is therefore
natural to ask: Given a Hermitian or real symmetric bilinear form, can we decide if it
is positive or negative definite? Doing this algorithmically requires solving systems
of linear equations, which we will review in Appendix B, so we will postpone the
solution of this problem to Appendix B.2.6 below.
5 Linear mappings
5.1
Let $V, W$ be vector spaces. A mapping $f : V \to W$ is said to be linear if
$$\text{for all } x, y \in V,\ f(x + y) = f(x) + f(y), \text{ and}$$
$$\text{for all } \alpha \in \mathbb{F} \text{ and } x \in V,\ f(\alpha x) = \alpha f(x).$$
Note that the “multiplication by elements of $\mathbb{F}$” really acts as individual unary
operations (recall 1.1). In particular, a linear mapping $f : \mathbb{F} \to \mathbb{F}$ with $\mathbb{F}$ viewed as
$\mathbb{F}^1$ (recall 1.2 1) satisfies $f(ax) = a f(x)$, not $f(ax) = f(a) f(x)$.
A linear mapping $f : V \to W$ is an isomorphism if there is a linear mapping
$g : W \to V$ such that $fg = \mathrm{id}$ and $gf = \mathrm{id}$; $V$ and $W$ are then said to be
isomorphic.
We have an immediate
5.1.1 Observation. A composition of linear mappings is a linear mapping.
5.2 Examples
1. The projections $p_k = ((x_1, \dots, x_n) \mapsto x_k) : \mathbb{F}^n \to \mathbb{F}^1$ are linear mappings.
2. The mapping $((x_1, x_2, x_3) \mapsto (x_2, x_1 - x_3)) : \mathbb{F}^3 \to \mathbb{F}^2$ is linear.
3. Recall 1.2 2. For a fixed $x \in X$, the mapping $(\varphi \mapsto \varphi(x)) : F(X) \to \mathbb{R}^1$ is linear.
4. Let $J$ be an open interval. Recall 1.2 2 again. Taking the derivative at a point
$a \in J$ is a linear mapping from $C^1(J)$ to $\mathbb{R}^1$.
See the Exercises for more examples.
5.3 Theorem. Let $f : V \to W$ be a linear mapping such that $f[V] = W$, let
$g : V \to Z$ be a linear mapping, and let $h : W \to Z$ be a mapping such that
$hf = g$. Then $h$ is linear.
Proof. For each $w \in W$ choose an element $\sigma(w) \in V$ such that $f(\sigma(w)) = w$.
We have $h(x + y) = h(f(\sigma(x)) + f(\sigma(y))) = hf(\sigma(x) + \sigma(y)) = g(\sigma(x) +
\sigma(y)) = g(\sigma(x)) + g(\sigma(y)) = hf(\sigma(x)) + hf(\sigma(y)) = h(x) + h(y)$ and similarly $h(\alpha x) =
h(\alpha f(\sigma(x))) = hf(\alpha \sigma(x)) = g(\alpha \sigma(x)) = \alpha g(\sigma(x)) = \alpha hf(\sigma(x)) = \alpha h(x)$. □
Note. This is a general fact about homomorphisms between algebraic structures.
5.3.1 Corollary. Every linear mapping $f : V \to W$ that is one-one and onto is an
isomorphism.
(Indeed, there is a $g : W \to V$ such that $fg = \mathrm{id}$ and $gf = \mathrm{id}$. Since $f$ is onto
and $\mathrm{id}$ is linear, $g$ is linear by 5.3.)
5.3.2 Corollary. If $\dim V = n$ then $V$ is isomorphic to $\mathbb{F}^n$.
(Choose a basis $u_1, \dots, u_n$ and define a mapping $f : \mathbb{F}^n \to V$ by setting
$f((x_1, \dots, x_n)) = \sum x_i u_i$. This $f$ is obviously linear and by 3.1.1 it is one-one
and onto.)
5.4 Proposition. Let $f : V \to W$ be a linear mapping. If $f$ is one-one then it
sends every linearly independent system to a linearly independent one; if $f$ is onto
then it sends every generating set to a generating one. Consequently, isomorphisms
preserve generating sets, linearly independent ones, and bases.
Proof. Let $f$ be one-one and let $\sum \alpha_j f(x_j) = o$. Then $f(\sum \alpha_j x_j) = f(o)$ and
$\sum \alpha_j x_j = o$ so that if $x_1, \dots, x_n$ is linearly independent, all the $\alpha_j$ are zero.
Let $f$ be onto and let $M$ generate $V$. For a $y \in W$ choose an $x \in V$ such that
$f(x) = y$ and write $x$ as $\sum \alpha_i u_i$ with $u_i \in M$. Then $y = f(x) = f(\sum \alpha_i u_i) =
\sum \alpha_i f(u_i)$ with $f(u_i) \in f[M]$. □
5.5 Theorem. Let $u_1, \dots, u_n$ be a basis of a vector space $V$, let $W$ be a vector
space and let $\varphi : \{u_1, \dots, u_n\} \to W$ be an arbitrary mapping. Then there exists
precisely one linear mapping $f : V \to W$ such that $f(u_i) = \varphi(u_i)$ for each $i$.
Proof. Since every element of $V$ can be written as $x = \sum \alpha_j u_j$ there is at most
one such $f$: we must have $f(x) = \sum \alpha_j \varphi(u_j)$. On the other hand, if $x = \sum \alpha_j u_j$
and $y = \sum \beta_j u_j$ then $x + y = \sum (\alpha_j + \beta_j) u_j$ and it is, by 3.1.1, the only such
representation. Similarly for $\alpha x = \sum \alpha \alpha_j u_j$. Thus, setting
$$f(x) = \sum \alpha_j \varphi(u_j) \quad \text{where } x = \sum \alpha_j u_j$$
yields a linear mapping $f : V \to W$ such that $f(u_i) = \varphi(u_i)$. □
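Theorem 5.5 can be illustrated for the standard basis $e_1, \dots, e_n$ of the row vector space $\mathbb{R}^n$, where the coefficients $\alpha_j$ of a vector are simply its coordinates. The sketch below assumes that the prescribed images $\varphi(e_j)$ are given as tuples of equal length; linear_extension is a hypothetical name for the construction in the proof.

```python
def linear_extension(images):
    """Given images[j] = phi(e_j) for the standard basis e_1, ..., e_n of R^n,
    return the unique linear map f with f(e_j) = images[j]."""
    dim_w = len(images[0])

    def f(x):
        # f(x) = sum_j x_j * phi(e_j); the coordinates x_j are unique by 3.1.1
        return tuple(sum(xj * img[i] for xj, img in zip(x, images))
                     for i in range(dim_w))

    return f
```

For example, with images (1, 0), (0, 1), (1, 1) the resulting map sends (2, 3, 4) to 2(1, 0) + 3(0, 1) + 4(1, 1) = (6, 7). For a general basis one would first have to compute the coordinates of x with respect to that basis, which amounts to solving a system of linear equations.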
5.6 The Free Vector Space on a Set S
In view of Theorem 5.5, it is an interesting question whether for any set $S$ we can find
a vector space with a basis $B$ and a bijection $\varphi : S \xrightarrow{\cong} B$. This is called the free
$\mathbb{F}$-vector space on the set $S$, and denoted by $\mathbb{F}S$ (it is customary to treat $\varphi$ as the
identity, which is usually OK, since it is specified). Of course, for $S$ finite, we may
simply take $\mathbb{F}^n$ where $n$ is the cardinality of $S$. However, for $S$ infinite, the Cartesian
product $\mathbb{F}^S$ turns out not to be the right construction. Rather, we set
$$\mathbb{F}S = \{a : S \to \mathbb{F} \mid \text{there exists a finite subset } F \subseteq S \text{ such that } a(s) = 0 \text{ for } s \in S \setminus F\}.$$
The operations of addition and multiplication by a scalar are done point-wise. In
fact, this is a vector subspace of $\mathbb{F}^S$, which is the space of all maps $S \to \mathbb{F}$. The
basis $B$ in question is the set of all maps $a_s : S \to \mathbb{F}$ where $a_s(s) = 1$ and $a_s(t) = 0$
for $t \ne s$. It is easily verified that this is a basis. One usually treats the map $S \to \mathbb{F}S$,
$s \mapsto a_s$, as an inclusion, so $a_s$ becomes identified with $s$.
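The finite-support condition has a natural computational counterpart: an element of the free vector space can be stored as a dictionary that lists only the finitely many elements of $S$ with a non-zero value. The following is a small sketch of this idea; the representation and the names basis_vector, add and scale are assumptions made for illustration only.

```python
def basis_vector(s):
    """The map a_s: value 1 at s and 0 everywhere else."""
    return {s: 1}

def add(a, b):
    """Pointwise sum; zero values are dropped so that the support stays explicit."""
    result = dict(a)
    for s, c in b.items():
        total = result.get(s, 0) + c
        if total == 0:
            result.pop(s, None)
        else:
            result[s] = total
    return result

def scale(alpha, a):
    """Pointwise multiplication by the scalar alpha."""
    return {s: alpha * c for s, c in a.items() if alpha * c != 0}
```

With this representation, add(scale(2, basis_vector("p")), scale(-3, basis_vector("q"))) is the dictionary {"p": 2, "q": -3}, i.e. the formal combination 2p - 3q, and the empty dictionary plays the role of $o$. The set $S$ itself may be infinite (here: all hashable Python values), while each element has finite support.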
5.7 Affine subsets
Let $W$ be a subspace of a vector space $V$ and let $x_0 \in V$. A subset of the form
$$x_0 + W = \{x_0 + w \mid w \in W\}$$
is called an affine subset of $V$ (or affine set in $V$).
5.7.1 Proposition. Let $L$ be an affine set in $V$. Then the subspace $W$ in the
representation
$$L = x_0 + W$$
is uniquely determined, while for $x_0$ one can take an arbitrary element of $L$. The
space $W$ is sometimes referred to as the associated vector subspace of $L$, and the
dimension of $W$ is referred to as the dimension of $L$.
Proof. We have
$$w \in W \text{ if and only if } w = x - y \text{ with } x, y \in L$$
($(x_0 + u) - (x_0 + v) = u - v \in W$ and on the other hand, if $w \in W$ then $w =
(x_0 + w) - x_0$). Now let $x_1 = x_0 + w_0$ be arbitrary, $w_0 \in W$. Then for any $w \in W$
we have $x_1 + w = x_0 + (w_0 + w) \in L$, and $x_0 + w = x_1 + (w - w_0) \in x_1 + W$. □
5.8 Theorem. Let $f : V \to Z$ be a linear mapping. Then
(1) $W = f^{-1}[\{o\}]$ is a subspace of $V$, and
(2) the non-empty $f^{-1}[\{z\}]$ are precisely the affine sets in $V$ of the form $v + W$ with $f(v) = z$.
Proof. (1): If $f(x) = f(y) = o$ then $f(\alpha x + \beta y) = o$.
(2) Let $f(v_0) = z$. Then for each $w \in W$ we have $f(v_0 + w) = f(v_0) + f(w) =
z + o = z$ and on the other hand, if $f(v) = z$ then $f(v - v_0) = z - z = o$,
hence $v - v_0 \in W$, and $v = v_0 + (v - v_0)$. □
5.9 Affine maps
By an affine map between affine subsets $L \subseteq V$, $M \subseteq W$ of vector spaces $V$, $W$
we shall mean simply a map
$$f : L \to M$$
which is of the form
$$f(x) = y_0 + g(x - x_0)$$
where $x_0 \in L$, $y_0 \in M$, and $g$ is a linear map between the associated vector
subspaces.
It is possible to say a lot more about affine subsets and affine maps. Alternately,
many calculus texts do not mention them at all and refer to affine subsets as “linear
subsets”, and affine maps imprecisely as “linear maps [in the broader sense]”.
We decided to make the compromise of keeping the terminology precise without
dwelling on details which would not be useful to us.
6 Congruences and quotients
6.1
A congruence on a vector space $V$ is an equivalence relation $E \subseteq V \times V$ (we will
write $xEy$ for $(x, y) \in E$) such that
$$xEy \Rightarrow (\alpha x)E(\alpha y) \text{ for all } \alpha \in \mathbb{F}, \text{ and}$$
$$x_i E y_i,\ i = 1, 2 \Rightarrow (x_1 + x_2)E(y_1 + y_2).$$
For the equivalence (congruence) classes $[x], [y]$ set
$$[x] + [y] = [x + y] \text{ and } \alpha[x] = [\alpha x]$$
(this is correct: if $x' \in [x]$ and $y' \in [y]$ then $x'Ex$ and $y'Ey$ and hence
$(x' + y')E(x + y)$ and $x' + y' \in [x + y]$; similarly for $[\alpha x]$). It is easy to check
that the set of equivalence classes with these operations constitutes a vector space,
denoted by
$$V/E,$$
and that
$$p_E = (x \mapsto [x]) : V \to V/E$$
is a linear mapping onto.
6.2 Theorem. The formulas
$$E \mapsto W_E = \{x \mid xEo\} \text{ and } W \mapsto E_W = \{(x, y) \mid x - y \in W\}$$
constitute a one-one correspondence between the congruences on $V$ and subspaces
of $V$.
The congruence classes of $E$ are precisely the affine sets
$$x + W_E.$$
Proof. Obviously $W_E = \{x \mid xEo\}$ is a subspace. If $W$ is a subspace then $E_W$ is a
congruence: trivially $x E_W x$; if $x E_W y$ then $x - y \in W$, hence $y - x = -(x - y) \in W$
and $y E_W x$; and if $x E_W y$ and $y E_W z$ then $x - z = (x - y) + (y - z) \in W$ and $x E_W z$;
if $x_i E_W y_i$ then $(x_1 - y_1) + (x_2 - y_2) \in W$, that is, $(x_1 + x_2) - (y_1 + y_2) \in W$; and
finally if $x E_W y$ we have $x - y \in W$ and hence $\alpha x - \alpha y \in W$, that is, $\alpha x E_W \alpha y$.
Now $x \in W_{E_W}$ if and only if $x E_W o$ if and only if $x = x - o \in W$, and $x E_{W_E} y$
if and only if $x - y \in W_E$ if and only if $(x - y)Eo$ if and only if $xEy$.
Finally, if $y \in [x]$ then $yEx$, hence $(y - x)Eo$, that is, $y - x \in W_E$, and
$y = x + (y - x) \in x + W_E$. If $y \in (x + W_E)$ then $y = x + w$ with $w \in W_E$ and
$y - x = w \in W_E$, hence $yEx$. □
6.2.1
If $W$ is a subspace of $V$ we will use, in view of 6.2, the symbol
$$V/W \text{ instead of } V/E_W.$$
We call the vector space $V/W$ the quotient space (or factor) of $V$ by the
subspace $W$.
6.3
Let $f : V \to Z$ be a linear mapping. The subspace $f^{-1}[\{o\}]$ of $V$ is called the
kernel of $f$ and denoted by
$$\operatorname{Ker} f.$$
Theorem. (The homomorphism theorem for vector spaces) For every linear mapping
$f : V \to Z$ and every subspace $W \subseteq \operatorname{Ker} f$ there is a homomorphism
$$h : V/W \to Z$$
defined by $h(x + W) = f(x)$. If $f$ is onto, so is $h$. If $W = \operatorname{Ker} f$, $h$ is one-to-one.
Proof. Using the projection $V/W \to V/\operatorname{Ker} f$, $x + W \mapsto x + \operatorname{Ker} f$, it suffices to
consider the case $W = \operatorname{Ker} f$. If $x + \operatorname{Ker} f = y + \operatorname{Ker} f$ then $x - y \in \operatorname{Ker} f$, hence
$f(x) - f(y) = o$ and $f(x) = f(y)$. Thus, the mapping $h$ is correctly defined. Since
for the linear mapping $p = (x \mapsto [x]) : V \to V/\operatorname{Ker} f$ we have $hp = f$, $h$ is
a linear mapping, by 5.3. Now $h$ is obviously onto if $f$ is. If $x + \operatorname{Ker} f \ne y + \operatorname{Ker} f$
then $x - y \notin \operatorname{Ker} f$ and $f(x) - f(y) = f(x - y) \ne o$ so that $h$ is one-one. □
7 Matrices and linear mappings

7.1 Matrices
In this section we will deal with vector spaces over the field of complex or real numbers. A matrix of the type $m\times n$ is an array
$$A=\begin{pmatrix} a_{11} & \dots & a_{1n}\\ \vdots & & \vdots\\ a_{m1} & \dots & a_{mn}\end{pmatrix}$$
where the entries $a_{jk}$ are numbers, real or complex, according to the context. If $m$ and $n$ are obvious we often write simply
$$A=(a_{jk})_{j,k}\quad\text{or}\quad(a_{jk})_{jk}.$$
Sometimes the $jk$-th entry of a matrix $A$ is denoted by $A_{jk}$.

The row vectors
$$(a_{j1},\dots,a_{jn}),\quad j=1,\dots,m$$
are called the rows of the matrix $A$, and the
$$(a_{1k},\dots,a_{mk}),\quad k=1,\dots,n$$
are called the columns of $A$. Hence, a matrix of the type $m\times n$ is sometimes referred to as a matrix with $m$ rows and $n$ columns.
Matrices of the type $m\times m$ are called square matrices.
7.2 Basic operations with matrices
Transposition. Let $A=(a_{jk})_{jk}$ be an $m\times n$ matrix. The $n\times m$ matrix
$$A^T=(a'_{jk})_{jk}\quad\text{where}\quad a'_{jk}=a_{kj}$$
is called the transposed matrix of $A$. There is a variant of this construction over the field $\mathbb{C}$: if $A$ is a matrix over $\mathbb{C}$, we denote by $A^*$ the complex conjugate of $A^T$, i.e. the matrix obtained from $A^T$ by replacing every entry by its complex conjugate. This is sometimes called the adjoint matrix of $A$. A (necessarily square) matrix $A$ which satisfies $A^T=A$ (resp. $A^*=A$) is called symmetric (resp. Hermitian).
Multiplication. Let $A=(a_{jk})_{jk}$ be an $m\times n$ matrix and let $B=(b_{jk})_{jk}$ be an $n\times p$ matrix. The product of $A$ and $B$ is the $m\times p$ matrix
$$AB=(c_{jk})_{jk}\quad\text{where}\quad c_{jk}=\sum_{r=1}^{n}a_{jr}b_{rk}.$$
The unit matrices are the matrices of type $n\times n$ defined by
$$I=I_n=(\delta_{jk})_{jk}\quad\text{where}\quad\delta_{jk}=\begin{cases}1 & \text{if } j=k,\\ 0 & \text{if } j\ne k.\end{cases}$$
We obviously have
$$(AB)^T=B^TA^T,\quad (AB)^*=B^*A^*,\quad AI=A\quad\text{and}\quad IA=A\quad\text{whenever defined.}$$
The motivation for the definition of the product will be apparent in 7.6 below,
where we will also learn more about its properties.
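The definitions of 7.2 can be tried out on small examples. The following is a minimal sketch in plain Python, assuming matrices are stored as lists of rows; the helper names `transpose`, `matmul` and `identity` are illustrative choices, not notation from the text.

```python
def transpose(a):
    """Return the transposed matrix: entry (j, k) of the result is a[k][j]."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Product of an m x n matrix a and an n x p matrix b.

    Entry (j, k) of the result is the sum over r of a[j][r] * b[r][k].
    """
    n = len(b)
    if any(len(row) != n for row in a):
        raise ValueError("inner dimensions do not match")
    p = len(b[0])
    return [[sum(row[r] * b[r][k] for r in range(n)) for k in range(p)]
            for row in a]


def identity(n):
    """Unit matrix of type n x n (the Kronecker delta)."""
    return [[1 if j == k else 0 for k in range(n)] for j in range(n)]


A = [[1, 2, 3], [4, 5, 6]]      # type 2 x 3
B = [[1, 0], [2, 1], [0, 3]]    # type 3 x 2
AB = matmul(A, B)               # type 2 x 2

# The identities stated at the end of 7.2, checked on this example.
transpose_rule_holds = transpose(AB) == matmul(transpose(B), transpose(A))
unit_rule_holds = matmul(A, identity(3)) == A and matmul(identity(2), A) == A
```

Both flags evaluate to `True` here; a single example of course illustrates the identities rather than proving them.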
7.3 Row and column vectors as matrices
A vector $x=(x_1,\dots,x_n)\in\mathbb{F}_n$ will be viewed as a matrix of the type $1\times n$. Also, we will consider the column vectors, matrices of type $n\times 1$,
$$x^T=\begin{pmatrix}x_1\\ \vdots\\ x_n\end{pmatrix}.$$
Clearly, all column vectors of a given dimension $n$ also form a vector space over $\mathbb{F}$, known as the $n$-dimensional column vector space and denoted as $\mathbb{F}^n$. We will
see that in spite of the fact that it is more convenient to write rows than columns,
the space of columns is more convenient in the sense that for columns, composition
of linear maps corresponds to multiplication of matrices without reversing orders
(see Theorem 7.6 below). Because of this, nearly all courses in linear algebra now
use the space of column vectors and not row vectors as the default model of an
n-dimensional vector space. We will follow this convention in this text as well. In
particular, we will extend the convention 1.3 to column vectors.
7.4 The standard bases of $\mathbb{F}_n$, $\mathbb{F}^n$
In the row vector space $\mathbb{F}_n$, we will consider the basis
$$e^1,\dots,e^n\quad\text{where}\quad(e^j)_k=\begin{cases}1 & \text{if } j=k,\\ 0 & \text{if } j\ne k\end{cases}$$
and in $\mathbb{F}^n$, we will consider the basis
$$e_1,\dots,e_n\quad\text{where}\quad e_i=(e^i)^T$$
(this notation conforms with 1.3; of course $(e^j)_k=\delta_{jk}$ from 7.2).
The $e^j$'s from $\mathbb{F}_m$ and $\mathbb{F}_n$ with $m\ne n$ differ (and similarly for $e_j$), but this rarely causes confusion. In the rare cases where it can we will display the dimension $n$ as ${}_ne^j$, ${}_ne_j$.
Obviously we have
$$x=\sum_{j=1}^{n}x_je^j. \tag{7.4.1}$$
7.5 The linear maps $f_A$, $f^A$
Let $A$ be a matrix of type $m\times n$. Define a mapping
$$f_A:\mathbb{F}_m\to\mathbb{F}_n\quad\text{by setting}\quad f_A(x)=xA,$$
and a mapping
$$f^A:\mathbb{F}^n\to\mathbb{F}^m\quad\text{by setting}\quad f^A(x)=Ax.$$
7.5.1 Theorem. The mappings $f_A$, $f^A$ are linear and the formula
$$A\mapsto f_A\quad\text{resp.}\quad A\mapsto f^A$$
yields a bijective correspondence between matrices of type $m\times n$ and the set of all linear mappings $\mathbb{F}_m\to\mathbb{F}_n$ resp. $\mathbb{F}^n\to\mathbb{F}^m$.
Proof. We will prove the statement about row spaces. The statement for column spaces is analogous (see Exercise (10)). The linearity of the mappings is an immediate consequence of the definition of a product of matrices.
We have
$$(e^jA)_{1k}=\sum_{r=1}^{m}e^j_ra_{rk}=a_{jk} \tag{*}$$
and hence if $A\ne B$, there exist $r,s$ such that $a_{rs}\ne b_{rs}$, so that $f_A(e^r)\ne f_B(e^r)$. Thus, $f_A\ne f_B$.
Now let $f:\mathbb{F}_m\to\mathbb{F}_n$ be an arbitrary linear mapping. Consider the $a_{jk}$ uniquely defined by the formula
$$f({}_me^j)=\sum_{k=1}^{n}a_{jk}\,({}_ne^k)$$
and define $A$ as the array $(a_{jk})_{jk}$. We have, by (*),
$$f(x)=f\Big(\sum_jx_j\,({}_me^j)\Big)=\sum_jx_jf({}_me^j)=\sum_jx_j\sum_ka_{jk}\,({}_ne^k)=\sum_k\Big(\sum_jx_ja_{jk}\Big)({}_ne^k),$$
and hence $f(x)_{1k}=(xA)_{1k}$ and finally $f(x)=xA$. □
7.6 Theorem. In the representation of linear mappings from 7.5 we have
$$f_I=\mathrm{id},\qquad f_{AB}=f_B\circ f_A,$$
and
$$f^I=\mathrm{id},\qquad f^{AB}=f^A\circ f^B.$$
Proof. We will only prove the statement for row vectors. The statement for column vectors is analogous (see Exercise (11)). The first formula is obvious. Now let $A$, $B$ be matrices of types $m\times n$ resp. $n\times p$. If two linear maps agree on a basis they obviously coincide. We have
$$f_B(f_A({}_me^j))=f_B\Big(\sum_ka_{jk}\,({}_ne^k)\Big)=\sum_ka_{jk}f_B({}_ne^k)=\sum_ka_{jk}\Big(\sum_rb_{kr}\,({}_pe^r)\Big)=\sum_r\Big(\sum_ka_{jk}b_{kr}\Big)({}_pe^r)=f_{AB}({}_me^j).\ \square$$
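The order reversal for rows, and its absence for columns, can be observed numerically. The sketch below assumes plain Python lists; `f_row` and `f_col` are hypothetical helper names standing for $f_A$ and $f^A$, and vectors are passed as flat lists.

```python
def matmul(a, b):
    """Matrix product of lists of rows, as defined in 7.2."""
    return [[sum(row[r] * b[r][k] for r in range(len(b)))
             for k in range(len(b[0]))] for row in a]


def f_row(a):
    """The map x -> xA acting on row vectors (the map written f_A)."""
    return lambda x: matmul([x], a)[0]


def f_col(a):
    """The map x -> Ax acting on column vectors (the map written f^A)."""
    return lambda x: [row[0] for row in matmul(a, [[t] for t in x])]


A = [[1, 2, 3], [4, 5, 6]]      # type 2 x 3
B = [[1, 0], [2, 1], [0, 3]]    # type 3 x 2
AB = matmul(A, B)

x = [7, -1]                     # a row vector with 2 entries
row_side = f_row(AB)(x)
row_composed = f_row(B)(f_row(A)(x))   # first f_A, then f_B

y = [2, 5]                      # a column vector with 2 entries, stored flat
col_side = f_col(AB)(y)
col_composed = f_col(A)(f_col(B)(y))   # first f^B, then f^A
```

For rows the matrix written first is applied first, while for columns the composition reads in the same order as the product, which is the convenience mentioned in 7.3.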
7.6.1
From the associativity of composition of mappings and from the uniqueness of the
matrix in the representation of linear mappings as fA we immediately obtain
Corollary. Multiplication of matrices is associative, that is, $A(BC)=(AB)C$ whenever defined.
7.6.2 Different bases, base change

At this point we must mention the fact that the association between matrices and linear maps works for arbitrary finite-dimensional vector spaces $V$, $W$. Let $B=(v_1,\dots,v_n)$ resp. $C=(w_1,\dots,w_m)$ be sequences of distinct vectors in $V$ resp. $W$ which, when considered as sets, form bases of $V$ and $W$ (we speak of ordered bases). Then for an $m\times n$ matrix $A$ over $\mathbb{F}$, we have an associated linear map
$${}_{B,C}f^A:V\to W$$
given by
$${}_{B,C}f^A(v_j)=\sum_{i=1}^{m}a_{ij}w_i.$$
Clearly (for example, by considering the isomorphisms between $V$, $\mathbb{F}^n$ and $W$, $\mathbb{F}^m$ mapping $B$ and $C$ to the standard bases), this again defines a bijective correspondence between $m\times n$ matrices over $\mathbb{F}$ and linear maps from $V$ to $W$. We will say that the linear map ${}_{B,C}f^A$ is associated to the matrix $A$ with respect to the bases $B$ and $C$, and, vice versa, that $A$ is the matrix associated with the linear map (or simply the matrix of the linear map) $f={}_{B,C}f^A$ with respect to the bases $B$, $C$. An analogue of Theorem 7.6 of course holds, i.e.
$${}_{B,D}f^{A_1A_2}={}_{C,D}f^{A_1}\circ{}_{B,C}f^{A_2} \tag{*}$$
for an $m\times n$ matrix $A_1$ and an $n\times p$ matrix $A_2$, and ordered bases $B$, $C$, $D$ of $p$- resp. $n$- resp. $m$-dimensional spaces $U$, $V$, $W$.
For two ordered bases $B$, $B'$ of the same finite-dimensional vector space $V$, the matrix of $\mathrm{Id}:V\to V$ with respect to the basis $B$ in the domain and $B'$ in the codomain is sometimes referred to as the base change matrix from the basis $B$ to the basis $B'$. By (*), base change matrices can be used to relate matrices of linear maps with respect to different bases, both in the domain and codomain.
7.7 Hermitian matrices and Hermitian forms
Given a Hermitian (resp. symmetric) matrix $A$ of type $n\times n$ over $\mathbb{C}$ (resp. over $\mathbb{R}$), we have a Hermitian (resp. symmetric bilinear) form $B$ on $\mathbb{C}^n$ (resp. $\mathbb{R}^n$) given by
$$B(x,y)=y^*Ax.$$
When $B$ is positive-definite, this becomes an inner product, also denoted by
$$\langle x,y\rangle_B.$$
(In the real case, of course, $y^*=y^T$.) Conversely, the axioms immediately imply that every Hermitian (resp. symmetric bilinear) form on $\mathbb{C}^n$ (resp. $\mathbb{R}^n$) arises in this way. We will say that the form $B$ is associated with the matrix $A$ and vice versa.
Sometimes we simplify the terminology and call a Hermitian (resp. real symmetric)
matrix positive definite resp. negative definite resp. indefinite if the corresponding
property holds for its associated Hermitian (resp. symmetric bilinear) form.
8 Exercises
(1) Prove the statement made in Example 1.2 3.

(2) Prove that the dot-product from 4.2 satisfies the definition of an inner product,
and more generally the B defined in Subsection 7.7 is a Hermitian (resp.
symmetric bilinear) form.
(3) Prove that every Hermitian (resp. symmetric bilinear) form on Cn (resp. Rn )
is associated with a Hermitian (resp. symmetric) matrix.
(4) Take the vector space $V$ from 1.23. Prove that $(x\mapsto\ln x)$ is an isomorphism $V\to\mathbb{R}_1$.
(5) Prove that if $\langle\,,\rangle_1$, $\langle\,,\rangle_2$ are inner products on a (real or complex) vector space $V$, and $\lambda,\mu>0$, then $\lambda\langle\,,\rangle_1+\mu\langle\,,\rangle_2$ is an inner product.
(6) Prove that linear maps $\mathbb{F}\to\mathbb{F}$ are precisely the mappings $(x\mapsto ax)$ where $a\in\mathbb{F}$ is fixed.
(7) Prove that if $\langle a,b\rangle$ is a closed interval then $\big(\varphi\mapsto\int_a^b\varphi(x)\,dx\big)$ is a linear mapping $C(\langle a,b\rangle)\to\mathbb{R}_1$.
(8) Prove that the set of all as , s 2 S in 5.6 forms a basis of the free vector space
FS on a set S .
(9) Prove that an affine map f W L ! M between affine subsets of vector spaces
V , W can be made to satisfy the definition 5.9 with any choice of the element
x0 2 L. Is an analogous statement true for y0 2 M ?
(10) Prove the statement of Theorem 7.5.1 for column vectors.
(11) Prove the statement of Theorem 7.6 for column vectors.
(12) Prove that the set of all matrices of type $m\times n$ with entries in $\mathbb{F}$ is a vector space over $\mathbb{F}$ where addition is addition of matrices, and multiplication by a scalar $\lambda\in\mathbb{F}$ is the operation which multiplies each entry by $\lambda$. Is this vector space finite-dimensional? What is its dimension?
B Linear Algebra II: More about Matrices
1 Transforming a matrix. Rank
1.1 Elementary row and column operations
Recall Section A.7. Let $A$ be a matrix of type $m\times n$. The vector subspace $\operatorname{Row}(A)$ of $\mathbb{F}_n$ generated by the rows of $A$ is called the row space of $A$ and the vector subspace $\operatorname{Col}(A)$ of $\mathbb{F}^m$ generated by the columns is called the column space of $A$.
An elementary row (resp. column) operation on $A$ is any of the following three transformations of the matrix.
(E1) A permutation of the rows (resp. columns).
(E2) Multiplication of one of the rows (resp. columns) by a non-zero number.
(E3) Adding to a row (resp. column) a linear combination of the other rows (resp. columns).
1.1.1 Observation. An elementary row (resp. column) operation does not change
the row resp. column space.
1.2
The column space is, of course, changed by a row operation (and the row space is
changed by a column operation). We have, however, the following
Proposition. An elementary row (resp. column) operation preserves the dimension

of the column (resp. row) space.
Proof. Let $p$ be a permutation of the set $\{1,2,\dots,n\}$. Define $\varphi_p:\mathbb{F}^n\to\mathbb{F}^n$ by setting
$$\varphi_p(x_1,\dots,x_n)=(x_{p(1)},\dots,x_{p(n)}).$$
Obviously $\varphi_p$ is an isomorphism: trivially it is linear, and it has an inverse, namely $\varphi_{p^{-1}}$.
Further, for a non-zero $a$ define
$$\psi_a(x_1,x_2,\dots,x_n)=(ax_1,x_2,\dots,x_n).$$
Again, it is an isomorphism, with the inverse $\psi_{a^{-1}}$.
Finally, setting for numbers $b_2,\dots,b_n$,
$$\chi(x_1,x_2,\dots,x_n)=\Big(x_1+\sum_{j=2}^{n}b_jx_j,\ x_2,\dots,x_n\Big),$$
we obtain an isomorphism with the inverse sending $(x_1,x_2,\dots,x_n)$ to
$$\Big(x_1-\sum_{j=2}^{n}b_jx_j,\ x_2,\dots,x_n\Big).$$
Now performing elementary row operations on $A$ we transform the column space by the isomorphisms $\varphi_p$, $\psi_a$ and $\chi$; an isomorphism sends a basis to a basis (A.5.4) and hence preserves dimension. □
1.3 Theorem. For any matrix A, the dimensions of the row and column spaces
coincide.
Proof. By 1.1.1 and 1.2, the dimensions are unchanged after arbitrarily many row
and column operations.
If $a_{jk}=0$ for all $j,k$ then both the dimensions are zero. Let there be an $a_{jk}\ne0$. Performing (E1), we can move the $a_{jk}$ to the position $(1,1)$ and multiplying the (now) first row by $\frac{1}{a_{jk}}$ we have our matrix transformed to
$$\begin{pmatrix}1 & b_{12} & \dots & b_{1n}\\ b_{21} & b_{22} & \dots & b_{2n}\\ \vdots & \vdots & & \vdots\\ b_{m1} & b_{m2} & \dots & b_{mn}\end{pmatrix}.$$
Now we will perform the operations (E3), subtracting the first row $b_{j1}$ times from the $j$-th one, and when this is finished we do the same with the columns, thus obtaining the matrix transformed to
$$\begin{pmatrix}1 & 0 & \dots & 0\\ 0 & a^{(2)}_{22} & \dots & a^{(2)}_{2n}\\ \vdots & \vdots & & \vdots\\ 0 & a^{(2)}_{m2} & \dots & a^{(2)}_{mn}\end{pmatrix}.$$
If all the $a^{(2)}_{jk}$ with $j,k\ge2$ are zero, the dimensions of the two spaces are 1. Otherwise choose an $a^{(2)}_{jk}\ne0$, move it to the position $(2,2)$ by (E1) operations (without affecting the first row and column) and repeat the procedure as above to obtain
$$\begin{pmatrix}1 & 0 & 0 & \dots & 0\\ 0 & 1 & 0 & \dots & 0\\ 0 & 0 & a^{(3)}_{33} & \dots & a^{(3)}_{3n}\\ \vdots & \vdots & \vdots & & \vdots\\ 0 & 0 & a^{(3)}_{m3} & \dots & a^{(3)}_{mn}\end{pmatrix}.$$
After sufficiently many repetitions of the procedure we have $a^{(r+1)}_{jk}=0$ for all $j,k>r$ and have a matrix
$$B=\begin{pmatrix}1 & 0 & \dots & 0 & 0 & \dots & 0\\ 0 & 1 & \dots & 0 & 0 & \dots & 0\\ \vdots & & \ddots & & & & \vdots\\ 0 & 0 & \dots & 1 & 0 & \dots & 0\\ 0 & 0 & \dots & 0 & 0 & \dots & 0\\ \vdots & & & & & & \vdots\\ 0 & 0 & \dots & 0 & 0 & \dots & 0\end{pmatrix}$$
with the first $r$ diagonal entries 1 and all the others zero, and hence the dimensions of both the row and the column spaces are equal to $r$. □
1.4
The common dimension of the row and column spaces is called the rank of the matrix and denoted by
$$\operatorname{rank}A.$$
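The reduction used in the proof of 1.3 suggests a way to compute the rank. The sketch below is a simplified variant, not the procedure of the proof itself: it uses row operations only (by 1.1.1 and 1.2 these preserve both dimensions) and counts the pivots. Exact arithmetic with `fractions.Fraction` is an implementation choice made here to avoid rounding; the function names are illustrative.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix, computed with elementary row operations (E1)-(E3)."""
    a = [[Fraction(v) for v in row] for row in matrix]
    m = len(a)
    n = len(a[0]) if m else 0
    r = 0  # number of pivots found so far
    for col in range(n):
        # Look for a non-zero entry in this column at or below row r.
        pivot = next((i for i in range(r, m) if a[i][col] != 0), None)
        if pivot is None:
            continue
        a[r], a[pivot] = a[pivot], a[r]                  # (E1)
        lead = a[r][col]
        a[r] = [v / lead for v in a[r]]                  # (E2)
        for i in range(m):                               # (E3)
            if i != r:
                factor = a[i][col]
                a[i] = [vi - factor * vr for vi, vr in zip(a[i], a[r])]
        r += 1
        if r == m:
            break
    return r


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


A = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]   # the second row is twice the first
rank_of_A = rank(A)
rank_of_A_transposed = rank(transpose(A))
```

Computing the rank of the transposed matrix gives the dimension of the column space, so the two values agree, as Theorem 1.3 asserts.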
2 Systems of linear equations
2.1
Let $A=(a_{jk})_{jk}$ be a matrix of type $m\times n$ and let $b_1,\dots,b_m$ be numbers. A system of linear equations is a name for the task of determining $x_1,\dots,x_n\in\mathbb{F}$ such that
$$\begin{aligned}a_{11}x_1+a_{12}x_2+\dots+a_{1n}x_n&=b_1\\ \vdots\qquad\qquad&\\ a_{m1}x_1+a_{m2}x_2+\dots+a_{mn}x_n&=b_m.\end{aligned} \tag{2.1.1}$$
If $(b_1,\dots,b_m)=o$ we speak of a homogeneous system, and when replacing the original $b$ by $o$ we speak of the homogeneous system associated with (2.1.1).
The matrix $A$ is called the matrix of the system and the matrix
$$\begin{pmatrix}a_{11} & \dots & a_{1n} & b_1\\ \vdots & & \vdots & \vdots\\ a_{m1} & \dots & a_{mn} & b_m\end{pmatrix}$$
is referred to as the augmented matrix of the system.
2.2 Three views of the task
1. Recall A.7.5. We seek an $x$ such that
$$Ax^T=b^T.$$
Thus we have a linear map $f:\mathbb{F}^n\to\mathbb{F}^m$ and would like to determine the set
$$f^{-1}[b^T].$$
2. If we denote by $c_1,\dots,c_n$ the columns of $A$, we are seeking numbers $x_1,\dots,x_n$ such that
$$\sum_{j=1}^{n}x_jc_j=b^T.$$
3. The associated homogeneous system can be understood as seeking the $x$ such that
$$x\cdot\bar{a}_j=0\quad\text{for all } j=1,\dots,m$$
where $\cdot$ is the dot product and $\bar{a}_j=(\bar{a}_{j1},\dots,\bar{a}_{jn})$ are the complex conjugates of the rows of $A$ (this approach is valid for $\mathbb{F}=\mathbb{R},\mathbb{C}$, which, as remarked above, are the only contexts we are interested in).
Thus, the set of solutions of the associated homogeneous system coincides with the orthogonal complement
$$L(\bar{a}_1,\dots,\bar{a}_m)^{\perp}.$$
Now the dimension of $L(\bar{a}_1,\dots,\bar{a}_m)$ is the same as that of the row space, that is, equal to the rank $r$ of $A$: if we perform the procedure from Theorem A.3.2 (the Gram-Schmidt process) on the system $a_1,\dots,a_m$, we end up with a basis of the same size as when starting with $\bar{a}_1,\dots,\bar{a}_m$ (since $a_j$ is a linear combination of the other $a_k$'s if and only if $\bar{a}_j$ is a linear combination of the other $\bar{a}_k$'s).
Thus, by Theorem A.4.6.2, the dimension of the subspace of solutions of a homogeneous system is $n-\operatorname{rank}A$.
2.3
From 2.2 2, we immediately obtain
2.3.1 Theorem (Frobenius). A system of linear equations has a solution if and only
if the rank of the matrix of the system is the same as the rank of the augmented one.
(That is: if and only if the right-hand side column is in the column space of A.)
From 2.2 1 and 2.2 3, we obtain
2.3.2 Theorem. If a system of linear equations has a solution x0 , then the set of all
solutions is an affine set
x0 C W
where $W$ is the set of all solutions of the associated homogeneous system. The dimension of this affine set is $n-\operatorname{rank}A$.
2.4 The Gauss Elimination Method
By 2.3.2, to determine the set of all solutions of the system (2.1.1), it suffices to find one of its solutions $x_0$ and $s=n-r$ linearly independent solutions $x_1,\dots,x_s$ of the associated homogeneous system, where $r=\operatorname{rank}A$. The general solution is then
$$x_0+\sum_{j=1}^{s}\alpha_jx_j,\quad\alpha_j\in\mathbb{F}\ \text{arbitrary}.$$
First observe that

elementary row operations on the augmented matrix preserve the solution set.
Column operations change the solution set and will not be used, with the exception
of the (E1) performed on the A-part of the augmented matrix: this is relatively
harmless; we will only have to keep track of the permuted coordinates of solutions.
Start with the augmented matrix and transform it by (E1) operations so that the $(1,1)$ entry is non-zero, moving there a non-zero $a_{j_1k}$. Remember $j_1$. Then multiply the first row by $(a_{j_1k})^{-1}$ to obtain
$$\begin{pmatrix}1 & a'_{12} & \dots & a'_{1n} & b'_1\\ a'_{21} & a'_{22} & \dots & a'_{2n} & b'_2\\ \vdots & \vdots & & \vdots & \vdots\\ a'_{m1} & a'_{m2} & \dots & a'_{mn} & b'_m\end{pmatrix}$$
and then subtract from the $j$-th rows, $j=2,\dots,m$, the $a'_{j1}$ multiple of the first one. Now we have
$$\begin{pmatrix}1 & a'_{12} & \dots & a'_{1n} & b'_1\\ 0 & a''_{22} & \dots & a''_{2n} & b''_2\\ \vdots & \vdots & & \vdots & \vdots\\ 0 & a''_{m2} & \dots & a''_{mn} & b''_m\end{pmatrix}.$$
We repeat the procedure in the part of the matrix with indices $\ge2$ (during this, of course, the $a'_{12},\dots,a'_{1n}$ are permuted, too; again, the $j_2$ from the $a''_{j_2k}$ moved to the $(2,2)$ position is to be remembered). After repeating the procedure $r-1$ times we obtain a matrix
$$\begin{pmatrix}1 & c_{12} & c_{13} & \dots & c_{1r} & \dots & c_{1n} & \tilde{b}_1\\ 0 & 1 & c_{23} & \dots & c_{2r} & \dots & c_{2n} & \tilde{b}_2\\ 0 & 0 & 1 & \dots & c_{3r} & \dots & c_{3n} & \tilde{b}_3\\ \vdots & & & & \vdots & & \vdots & \vdots\\ 0 & 0 & 0 & \dots & 1 & \dots & c_{rn} & \tilde{b}_r\\ 0 & 0 & 0 & \dots & 0 & \dots & 0 & 0\\ \vdots & & & & \vdots & & \vdots & \vdots\\ 0 & 0 & 0 & \dots & 0 & \dots & 0 & 0\end{pmatrix}$$
(note that because of Frobenius' Theorem the right-hand side becomes zero after the $r$-th row or else the system has no solution) corresponding to a system of equations
$$\begin{aligned}y_1+c_{12}y_2+c_{13}y_3+\dots+c_{1r}y_r+c_{1,r+1}y_{r+1}+\dots+c_{1n}y_n&=\tilde{b}_1,\\ y_2+c_{23}y_3+\dots+c_{2r}y_r+c_{2,r+1}y_{r+1}+\dots+c_{2n}y_n&=\tilde{b}_2,\\ \vdots\qquad\qquad&\\ y_r+c_{r,r+1}y_{r+1}+\dots+c_{rn}y_n&=\tilde{b}_r\end{aligned}$$
with the same system of solutions if we set $y_k=x_{j_k}$.

The one solution $y_0$ of the system can be obtained by setting $y_{0,r+1}=y_{0,r+2}=\dots=y_{0,n}=0$, $y_{0,r}=\tilde{b}_r$, and then recursively
$$y_{0,k-1}=-\sum_{j=k}^{n}c_{k-1,j}y_{0j}+\tilde{b}_{k-1}.$$
A basis $y_i$ ($i=1,\dots,s=n-r$) of the vector space of solutions of the associated homogeneous system can then be obtained by setting $y_{i,r+i}=1$, $y_{i,r+j}=0$ for $j\ne i$, and then recursively
$$y_{i,k-1}=-\sum_{j=k}^{n}c_{k-1,j}y_{ij}.$$
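The elimination method can be sketched in code. The version below is a variant of the procedure described above, under two stated assumptions: instead of permuting columns it records the pivot columns, and instead of back substitution it clears each pivot column completely (Gauss-Jordan reduction), so the solutions can be read off directly. The name `solve` and the use of `fractions.Fraction` are choices made for this illustration.

```python
from fractions import Fraction


def solve(a, b):
    """Solve a x = b for an m x n matrix a given as a list of rows.

    Returns None if there is no solution; otherwise a pair (x0, basis) where
    x0 is one solution and basis is a list of n - rank(a) independent
    solutions of the associated homogeneous system.
    """
    m, n = len(a), len(a[0])
    aug = [[Fraction(v) for v in row] + [Fraction(rhs)]
           for row, rhs in zip(a, b)]
    pivots = []  # column index of the leading 1 in each reduced row
    r = 0
    for col in range(n):
        p = next((i for i in range(r, m) if aug[i][col] != 0), None)
        if p is None:
            continue
        aug[r], aug[p] = aug[p], aug[r]              # (E1) on rows
        lead = aug[r][col]
        aug[r] = [v / lead for v in aug[r]]          # (E2)
        for i in range(m):                           # (E3)
            if i != r:
                f = aug[i][col]
                aug[i] = [vi - f * vr for vi, vr in zip(aug[i], aug[r])]
        pivots.append(col)
        r += 1
        if r == m:
            break
    # Frobenius: a zero row with a non-zero right-hand side means no solution.
    if any(row[n] != 0 for row in aug[r:]):
        return None
    x0 = [Fraction(0)] * n
    for i, c in enumerate(pivots):
        x0[c] = aug[i][n]                            # free variables stay 0
    basis = []
    for fc in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[fc] = Fraction(1)                          # one free variable is 1
        for i, c in enumerate(pivots):
            v[c] = -aug[i][fc]
        basis.append(v)
    return x0, basis


x0, basis = solve([[1, 1, 1], [1, 2, 3]], [6, 14])
```

For this system of rank 2 in 3 unknowns the sketch returns one particular solution and a single basis vector of the homogeneous solution space, in agreement with the dimension $n-\operatorname{rank}A$ from 2.3.2.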
2.5 Regular matrices
A matrix $A=(a_{ij})_{ij}$ of type $n\times n$ is said to be regular (or non-singular) if $\operatorname{rank}A=n$. In such a case, each system of equations
$$\sum_{j=1}^{n}a_{ij}x_j=b_i,\quad i=1,2,\dots,n$$
has precisely one solution: it has a solution since the augmented matrix, being of type $n\times(n+1)$, cannot have a bigger rank than $n$; on the other hand, the dimension of the set of solutions is $n-n=0$. By 1.3,
a matrix $A$ is regular if and only if $A^T$ is regular.
2.5.1 Theorem. The following statements about a square matrix A are equivalent.
(1) A is regular.
(2) There exists a matrix U such that AU D I .
(3) There exists a matrix V such that VA D I .
(4) The matrix A has a unique inverse matrix, that is, there is a unique U such that
UA D AU D I .
Notation. The inverse matrix of $A$ will be denoted by $A^{-1}$.
Proof. (1)$\Rightarrow$(2),(3): Notation from 2.2 1 and A.7.4. For each $e_i$ on the right-hand side there is a solution $x_i$ such that
$$Ax_i^T=(e^i)^T\ (=e_i).$$
Thus, $\sum_ka_{jk}x_{ik}=\delta_i^j$, and if we set $u_{ij}=x_{ji}$ we have $\sum_ka_{jk}u_{ki}=\delta_i^j$, that is, we have a $U$ such that $AU=I$. The statement (3) is obtained by applying this reasoning to $A^T$ and using A.7.2.
(2)$\Rightarrow$(1): Let $\sum_ja_{ij}u_{jk}=\delta_{ik}$. Fix $k$ and set $x_j=u_{jk}$. Then in the notation of 2.2 2 we have for the columns $c_j$ of $A$,
$$\sum_jx_jc_j=e_k.$$
Thus, the column space contains all the $e_k$ and hence its dimension is $n$. The implication (3)$\Rightarrow$(1) follows by applying this to $A^T$.
(2)&(3)$\Rightarrow$(4): If $AU=I$ and $VA=I$ we have $V=V(AU)=(VA)U=U$.
(4)$\Rightarrow$(2) is trivial. □
2.6 Deciding if a Hermitian form is positive-definite

or negative-definite
Recall now our problem from A.4.7 of deciding if a Hermitian (or real symmetric
bilinear) form is positive-definite or negative-definite. Consider a Hermitian form
B on a finite-dimensional complex vector space V (the case of a real symmetric
bilinear form is analogous). Then perform the following procedure:
Start with $k=0$. Suppose we have constructed vectors $v_1,\dots,v_k\in V$ such that $B(v_i,v_i)\ne0$, $B(v_i,v_j)=0$ for $i\ne j$. Note that the vectors $v_i$ must be linearly independent. (In effect, suppose
$$\sum_{i=1}^{k}a_iv_i=0.$$
Applying $B(?,v_i)$, we get $a_i=0$.) Then, using a system of linear equations, find a non-zero vector $w\in V$ such that $B(v_i,w)=0$ for all $i=1,\dots,k$. If no such $w$ exists, then by 2.2 3, $k\ge\dim(V)$, and by linear independence, equality arises, so the $v_i$'s form a basis of $V$. In this case, if the real numbers $B(v_i,v_i)$ are all positive (resp. negative), $B$ is positive-definite (resp. negative-definite). Otherwise, $B$ is indefinite.
Suppose the vector $w$ exists. If $B(w,w)\ne0$, put $v_{k+1}=w$ and repeat the procedure with $k$ replaced by $k+1$. If $B(w,w)=0$, find a vector $u\in V$ such that $B(w,u)\ne0$. If no such $u$ exists, $B$ is degenerate. If $u$ exists, then
$$4B(w,u)=B(u+w,u+w)+iB(iu+w,iu+w)-B(-u+w,-u+w)-iB(-iu+w,-iu+w)$$
by the axioms (with $B$ linear in the first variable, as in A.7.7), so choosing $v_{k+1}$ as one of the vectors $u+w$, $-u+w$, $iu+w$, $-iu+w$, the vector $v_{k+1}$ will satisfy $B(v_{k+1},v_{k+1})\ne0$. Repeat the procedure with $k$ replaced by $k+1$.
3 Determinants
3.1
A group $G$ is a set with a binary operation $\cdot$ which satisfies associativity, has a unit element $e$ and an inverse unary operation $(?)^{-1}$. Explicitly, the axioms are
$$(a\cdot b)\cdot c=a\cdot(b\cdot c),\qquad a\cdot e=e\cdot a=a,\qquad x\cdot x^{-1}=x^{-1}\cdot x=e.$$
For groups $G$, $H$, a map $f:G\to H$ is called a homomorphism of groups if we have
$$f(a\cdot b)=f(a)\cdot f(b)\quad\text{for all } a,b\in G.$$
A bijective homomorphism of groups is called an isomorphism (of groups). Obviously, the inverse of an isomorphism is again an isomorphism.
Immediate examples of groups include the set $\mathbb{Z}$ of all integers with the operation $+$, the set $\{1,-1\}$ with the operation $\cdot$ (multiplication), as well as $\mathbb{R}$ or $\mathbb{C}$ with the operation $+$, or $\mathbb{R}^{\times}=\mathbb{R}\setminus\{0\}$, $\mathbb{C}^{\times}=\mathbb{C}\setminus\{0\}$ with the operation $\cdot$. Note that all those groups have the additional property that
$$a\cdot b=b\cdot a$$
where $\cdot$ is the operation. Groups satisfying this property are called commutative or abelian. We will soon encounter examples of groups which are not abelian.
We will not develop the theory of groups at all here (and the reader is referred to [2] and [4] for more on abstract algebra), but they do come up naturally in the context of the determinant. In particular we will use the obvious fact that the mappings $G\to G$,
$$x\mapsto x^{-1}\quad\text{and}\quad x\mapsto ax\ \text{for a fixed } a\in G,$$
are bijections (the first is inverse to itself, the other one to $x\mapsto a^{-1}x$). It then follows that if $G$ is finite and $f:G\to\mathbb{R}$ or $\mathbb{C}$ is any mapping then
$$\sum_{x\in G}f(x)=\sum_{x\in G}f(x^{-1})=\sum_{x\in G}f(ax) \tag{3.1.1}$$
(all three are the same sum, only rearranged).
3.2 The sign of a permutation
We will be concerned with the group $P(n)$ of permutations of the set $\{1,2,\dots,n\}$, i.e. bijections $\{1,2,\dots,n\}\to\{1,2,\dots,n\}$, where the operation is composition. A permutation $p\in P(n)$ will be usually encoded as a sequence
$$(k_1,\dots,k_n)\quad\text{where}\quad k_j=p(j).$$
A transposition is a permutation interchanging two of the elements and keeping all the others fixed.
3.2.1 Theorem. 1. Every permutation can be obtained as a composition of transpositions.
2. If $p\in P(n)$ can be represented as a composition of an even (resp. odd) number of transpositions then in any such representation the number of transpositions is even (resp. odd).
Proof. 1. By induction. The statement is obvious for $n=1,2$. Now let it hold for $P(n)$ and let $p$ be a permutation of $\{1,\dots,n,n+1\}$. Consider the transposition $\tau$ interchanging $n+1$ with $p(n+1)$ (if $p(n+1)=n+1$ set $\tau=\mathrm{id}$). Now $q=\tau\circ p$ sends $n+1$ to $n+1$, hence $\{1,\dots,n\}$ to $\{1,\dots,n\}$. The restriction $q'$ of $q$ to $\{1,\dots,n\}$ can be written as $q'=\tau'_1\circ\dots\circ\tau'_r$ with transpositions $\tau'_j$. Extending these to transpositions $\tau_j$ of $\{1,\dots,n,n+1\}$ we obtain a representation
$$p=\tau\circ\tau_1\circ\dots\circ\tau_r.$$
2. Encode $p$ as $(k_1,\dots,k_n)$ and set
$$I(p)=\{(i,j)\mid i<j\ \text{and}\ k_i>k_j\},\qquad\iota(p)=\#I(p)$$
($\#$ indicates the number of elements). We will prove that for any transposition $\tau$ the number
$$|\iota(\tau\circ p)-\iota(p)|$$
is odd; since $\iota(\mathrm{id})=0$ the statement will follow.
Let $\tau$ exchange $\alpha$ with $\beta$, $\alpha<\beta$, let $q=\tau\circ p$. Then we have
$$p=(k_1,\dots,k_{\alpha-1},k_\alpha,k_{\alpha+1},\dots,k_{\beta-1},k_\beta,k_{\beta+1},\dots,k_n)\quad\text{and}$$
$$q=(k_1,\dots,k_{\alpha-1},k_\beta,k_{\alpha+1},\dots,k_{\beta-1},k_\alpha,k_{\beta+1},\dots,k_n).$$
We obviously have $(i,j)\in I(p)$ if and only if $(i,j)\in I(q)$ for $i,j\ne\alpha,\beta$; or $i<\alpha$ and $j\in\{\alpha,\beta\}$, or $\beta<j$ and $i\in\{\alpha,\beta\}$.
Thus we have to discuss the cases
(a) $(\alpha,j)$ with $\alpha<j<\beta$,
(b) $(j,\beta)$ with $\alpha<j<\beta$, and
(c) $(\alpha,\beta)$.
In cases (a) and (b) we have together an even number of changes: we have $(\alpha,j)\in I(p)$ if and only if $(j,\beta)\notin I(q)$, and $(j,\beta)\in I(p)$ if and only if $(\alpha,j)\notin I(q)$; thus if there are $s$ many $(\alpha,j)\in I(p)$ and $t$ many $(j,\beta)\in I(p)$ we have $s+t$ such pairs in $I(p)$ and $u-s+u-t=2u-(s+t)$ such pairs in $I(q)$ where $u=\beta-\alpha-1$. The case (c) stands alone, and it is in precisely one of the $I(p)$, $I(q)$. □
3.2.2 Notation and observation

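We define
$$\operatorname{sgn}p=\begin{cases}+1 & \text{if } p \text{ is a composition of an even number of transpositions,}\\ -1 & \text{if } p \text{ is a composition of an odd number of transpositions.}\end{cases}$$
From the definition we immediately infer that
$$\operatorname{sgn}\mathrm{id}=1,\qquad\operatorname{sgn}(p\circ q)=\operatorname{sgn}p\cdot\operatorname{sgn}q\qquad\text{and}\qquad\operatorname{sgn}p^{-1}=\operatorname{sgn}p.$$
Permutations $p$ with $\operatorname{sgn}p=1$ (resp. $\operatorname{sgn}p=-1$) are called even (resp. odd).
3.2.3 Corollary. The map
$$\operatorname{sgn}:P(n)\to\{1,-1\}$$
sending a permutation to its sign is a homomorphism of groups, where on $\{1,-1\}$, we consider the operation of multiplication.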
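The proof of 3.2.1 shows that each transposition changes the number of inversions by an odd amount, so the sign can be computed from the parity of the inversion count. The sketch below assumes permutations are encoded as tuples of 0-based values $(p(0),\dots,p(n-1))$, a shift of the 1-based encoding used in the text; the function names are illustrative.

```python
from itertools import permutations


def inversions(p):
    """Number of pairs (i, j) with i < j and p[i] > p[j]."""
    n = len(p)
    return sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])


def sgn(p):
    """Sign of a permutation: +1 for an even inversion count, -1 for odd."""
    return -1 if inversions(p) % 2 else 1


def compose(p, q):
    """The composition p o q, i.e. j -> p(q(j))."""
    return tuple(p[q[j]] for j in range(len(q)))


all_perms = list(permutations(range(4)))

# sgn is a homomorphism to {1, -1}: checked exhaustively for n = 4.
is_homomorphism = all(
    sgn(compose(p, q)) == sgn(p) * sgn(q)
    for p in all_perms for q in all_perms
)
even_count = sum(1 for p in all_perms if sgn(p) == 1)
```

The exhaustive check over all 24 permutations of four elements confirms the multiplicativity stated in 3.2.2, and exactly half of them are even.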
3.3
The determinant of a square matrix $A=(a_{ij})_{ij}$ of type $n\times n$ is the number
$$\det A=\sum_{p\in P(n)}\operatorname{sgn}p\cdot a_{1,p(1)}\cdots a_{n,p(n)}.$$
It is often indicated as
$$\begin{vmatrix}a_{11} & \dots & a_{1n}\\ \vdots & & \vdots\\ a_{n1} & \dots & a_{nn}\end{vmatrix}.$$
Thus for instance $\begin{vmatrix}a & b\\ c & d\end{vmatrix}=ad-bc$ (and this is about the only case of a determinant easily and transparently computed from the basic definition).
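For small matrices the defining formula can be evaluated literally. The sketch below sums over all permutations, so its cost grows like $n!$ and it is meant only as an illustration of the definition. Permutations are 0-based tuples and the sign is taken from the parity of the inversion count; both are conventions chosen for this example.

```python
from itertools import permutations
from math import prod


def sgn(p):
    """Sign of a permutation given as a tuple of 0-based values."""
    n = len(p)
    inv = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inv % 2 else 1


def det_by_definition(a):
    """Sum over all permutations p of sgn(p) * a[0][p(0)] * ... * a[n-1][p(n-1)]."""
    n = len(a)
    return sum(sgn(p) * prod(a[i][p[i]] for i in range(n))
               for p in permutations(range(n)))


two_by_two = det_by_definition([[1, 2], [3, 4]])        # ad - bc
three_by_three = det_by_definition([[2, 0, 1], [1, 3, 2], [1, 1, 4]])
```

The first value is $1\cdot4-2\cdot3$; the second sums six signed products.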
3.3.1 Proposition. 1. $\det A^T=\det A$.
2. If $B$ is obtained from a square matrix $A$ by permuting the rows or columns following a permutation $p\in P(n)$ then $\det B=\operatorname{sgn}p\cdot\det A$.
Proof. 1. Rearranging the factors we obtain the formula $a_{1p(1)}\cdots a_{np(n)}=a_{p^{-1}(1)1}\cdots a_{p^{-1}(n)n}$ and since $\operatorname{sgn}p^{-1}=\operatorname{sgn}p$ we can rewrite the formula from the definition as
$$\det A=\sum_{p\in P(n)}\operatorname{sgn}p^{-1}\cdot a_{p^{-1}(1)1}\cdots a_{p^{-1}(n)n}$$
which is, by (3.1.1), equal to
$$\sum_{p\in P(n)}\operatorname{sgn}p\cdot a_{p(1)1}\cdots a_{p(n)n}=\det A^T.$$
2. It suffices to prove it for a permutation of rows. We have $B=(a_{p(i)j})_{ij}$ so that
$$\det B=\sum_{q\in P(n)}\operatorname{sgn}q\cdot a_{p(1)q(1)}\cdots a_{p(n)q(n)}.$$
Rearranging the factors and using 3.2.2, we obtain
$$\det B=\sum_{q\in P(n)}\operatorname{sgn}q\cdot a_{1,qp^{-1}(1)}\cdots a_{n,qp^{-1}(n)}=\operatorname{sgn}p\sum_{q\in P(n)}\operatorname{sgn}(qp^{-1})\cdot a_{1,qp^{-1}(1)}\cdots a_{n,qp^{-1}(n)}$$
and by (3.1.1),
$$=\operatorname{sgn}p\sum_{q\in P(n)}\operatorname{sgn}q\cdot a_{1,q(1)}\cdots a_{n,q(n)}=\operatorname{sgn}p\cdot\det A.\ \square$$
3.3.2 Corollary. If there are in a matrix $A$ two equal columns or rows then $\det A=0$.
(For, transposing such two rows yields $\det A=-\det A$.)

From the formula for det A we immediately get the following
3.4 Theorem. A determinant is linear in each of its rows (resp. columns). That is, if $A$ is a matrix of type $n\times n$ and if $A_j(x)$ is obtained from $A$ by replacing the $j$-th row by $x$ then the mapping
$$(x\mapsto\det A_j(x)):\mathbb{F}_n\to\mathbb{R}\ \text{resp.}\ \mathbb{C}$$
is linear.
3.4.1 Convention
The notation $A_j(x)$ will be kept in the remainder of this chapter. Furthermore, we will use the symbol $A^j(x^T)$ for the matrix in which the $j$-th column is replaced by $x^T$.
3.4.2 Theorem. If $B$ is obtained from $A$ by adding to a row (resp. column) a linear combination of the other rows (resp. columns) then $\det B=\det A$.
Proof. Let $a_1,\dots,a_n$ be the rows of $A$. We have $A=A_i(a_i)$ and $B=A_i\big(a_i+\sum_{j\ne i}\alpha_ja_j\big)$. By 3.3.2, $\det A_i(a_j)=0$ for $j\ne i$ and hence, by 3.4,
$$\det B=\det A_i\Big(a_i+\sum_{j\ne i}\alpha_ja_j\Big)=\det A_i(a_i)+\sum_{j\ne i}\alpha_j\det A_i(a_j)=\det A.\ \square$$
3.4.3 Proposition. Let $a_{ij}=0$ for $i>j$. Then $\det A=a_{11}a_{22}\cdots a_{nn}$. More explicitly,
$$\begin{vmatrix}a_{11} & a_{12} & a_{13} & \dots & a_{1,n-1} & a_{1n}\\ 0 & a_{22} & a_{23} & \dots & a_{2,n-1} & a_{2n}\\ 0 & 0 & a_{33} & \dots & a_{3,n-1} & a_{3n}\\ \vdots & & & & & \vdots\\ 0 & 0 & 0 & \dots & 0 & a_{nn}\end{vmatrix}=a_{11}a_{22}\cdots a_{nn}.$$
Proof. This follows again from the definition: if $p\ne\mathrm{id}$ then there is an $i$ with $i>p(i)$, so the corresponding product contains the factor $a_{i,p(i)}=0$. □
3.4.4 Computing a determinant

Using elementary operations of the type (E1) and (E3) we can easily transform the matrix in our determinant into the form as in 3.4.3; then we will have the value as the product of the elements on the diagonal.
The (E3) operations do not change the value (see 3.4.2). We have to be more careful with the (E1) operations, though. Since computing the sign may not be quite transparent, it is prudent to use transpositions only, and whenever one is performed, to multiply automatically one of the rows or columns by $-1$.
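The recipe of 3.4.4 translates directly into a short routine. The sketch below uses row operations only: (E3) to clear entries below the diagonal, and row transpositions each followed by multiplying one row by $-1$, as the text recommends. The function name and the use of `fractions.Fraction` for exact arithmetic are assumptions of this illustration.

```python
from fractions import Fraction


def det_by_elimination(matrix):
    """Determinant via reduction to upper triangular form (3.4.3, 3.4.4)."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n = len(a)
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry here.
        p = next((i for i in range(col, n) if a[i][col] != 0), None)
        if p is None:
            # The column is zero from the diagonal down, so the triangular
            # form has a zero on the diagonal and the determinant vanishes.
            return Fraction(0)
        if p != col:
            a[col], a[p] = a[p], a[col]      # a transposition of rows ...
            a[p] = [-v for v in a[p]]        # ... compensated by a factor -1
        for i in range(col + 1, n):          # (E3): value is unchanged
            f = a[i][col] / a[col][col]
            a[i] = [vi - f * vc for vi, vc in zip(a[i], a[col])]
    result = Fraction(1)
    for i in range(n):
        result *= a[i][i]                    # product of the diagonal
    return result


d = det_by_elimination([[2, 0, 1], [1, 3, 2], [1, 1, 4]])
swapped = det_by_elimination([[0, 1], [1, 0]])
```

This needs on the order of $n^3$ arithmetic operations, in contrast to the $n!$ terms of the defining sum.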
4 More about determinants

4.1 Minors and the inverse matrix
Denote by $A^{(i,j)}$ the matrix obtained from $A$ by deleting the $i$-th row and the $j$-th column. The number
$$\alpha_{ij}=(-1)^{i+j}\det A^{(i,j)}$$
is called the $(i,j)$-th minor of $A$.
4.1.1
Recall the notation from 3.4.1. We have the following
Theorem. $\det A_i(x)=\sum_{j=1}^{n}x_j\alpha_{ij}$ and $\det A^j(x^T)=\sum_{i=1}^{n}x_i\alpha_{ij}$.
Proof. We shall treat the case of rows (the case of columns is analogous). Since $x=\sum_jx_je^j$ we have
$$\det A_i(x)=\sum_jx_j\det A_i(e^j).$$
Now
$$\det A_i(e^j)=\begin{vmatrix}a_{1,1} & \dots & a_{1,j-1} & 0 & a_{1,j+1} & \dots & a_{1,n}\\ \vdots & & \vdots & \vdots & \vdots & & \vdots\\ a_{i-1,1} & \dots & a_{i-1,j-1} & 0 & a_{i-1,j+1} & \dots & a_{i-1,n}\\ 0 & \dots & 0 & 1 & 0 & \dots & 0\\ a_{i+1,1} & \dots & a_{i+1,j-1} & 0 & a_{i+1,j+1} & \dots & a_{i+1,n}\\ \vdots & & \vdots & \vdots & \vdots & & \vdots\\ a_{n,1} & \dots & a_{n,j-1} & 0 & a_{n,j+1} & \dots & a_{n,n}\end{vmatrix}.$$
Exchange successively the $i$-th row with the $(i-1)$-th one, then the $(i-1)$-th row with the $(i-2)$-th one, etc., and then, operating similarly with the columns, we move the 1 from the $(i,j)$-th to the $(1,1)$-th position and obtain
$$\det A_i(e^j)=(-1)^{i+j}\begin{vmatrix}1 & o\\ y^T & A^{(i,j)}\end{vmatrix}=(-1)^{i+j}\det A^{(i,j)}=\alpha_{ij}.\ \square$$
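4.1.2 Corollary. In particular, for $x$ the $k$-th row (resp. column) of $A$, we obtain
$$\sum_{j=1}^{n}a_{kj}\alpha_{ij}=\sum_{j=1}^{n}a_{jk}\alpha_{ji}=\delta_i^k\det A,\quad\text{hence}\quad A\,(\alpha_{jk})^T_{jk}=I\cdot\det A \tag{*}$$
from which, for $\det A\ne0$, we immediately get a formula for the inverse matrix,
$$A^{-1}=\left(\frac{\alpha_{ij}}{\det A}\right)^T_{ij}.$$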
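The formula for the inverse can be checked on a small example. The sketch below assumes integer entries and 0-based indices, evaluates determinants by the defining sum (adequate only for small $n$), and uses `fractions.Fraction` to keep the quotients exact; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import permutations
from math import prod


def det(a):
    """Determinant by the defining sum over permutations (small n only)."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        inv = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
        total += (-1) ** inv * prod(a[i][p[i]] for i in range(n))
    return total


def delete(a, i, j):
    """The matrix a with row i and column j removed."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(a) if r != i]


def inverse(a):
    """Inverse of an integer matrix: entry (i, j) is alpha_ji / det a."""
    n = len(a)
    d = det(a)
    if d == 0:
        raise ValueError("matrix is not regular")
    # alpha[i][j] is the signed determinant of the matrix with row i and
    # column j deleted; (-1)**(i + j) is unaffected by 0-based indexing.
    alpha = [[(-1) ** (i + j) * det(delete(a, i, j)) for j in range(n)]
             for i in range(n)]
    return [[Fraction(alpha[j][i], d) for j in range(n)] for i in range(n)]


A = [[2, 1], [5, 3]]
A_inv = inverse(A)
```

Note the transposition in the last line of `inverse`: the $(i,j)$ entry of the inverse uses $\alpha_{ji}$, not $\alpha_{ij}$.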
4.2 Cramer’s Rule
Recall the representation of a system of linear equations as
$$Ax^T=b^T$$
from 2.2 1. If $A$ is a regular matrix we can multiply this formula by $A^{-1}$ from the left to obtain
$$x^T=A^{-1}Ax^T=A^{-1}b^T.$$
Thus, by 4.1.2 we obtain
$$x_i=\frac{1}{\det A}\sum_j\alpha_{ji}b_j.$$
The sum is then by 4.1.1 equal to $\det A^i(b^T)$ so that we obtain the formula (Cramer's Rule)
$$x_i=\frac{\det A^i(b^T)}{\det A}.$$
Of course computing the solutions using this formula would be much harder than using the Gauss Elimination. It is, however, useful for theoretical purposes.
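A direct transcription of Cramer's Rule, for illustration only, might look as follows. The sketch assumes integer entries, a regular matrix, and determinants evaluated by the defining sum, so it is practical only for very small systems; `cramer` is a name chosen here.

```python
from fractions import Fraction
from itertools import permutations
from math import prod


def det(a):
    """Determinant by the defining sum over permutations (small n only)."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        inv = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
        total += (-1) ** inv * prod(a[i][p[i]] for i in range(n))
    return total


def cramer(a, b):
    """Solve a x = b for a regular integer matrix a by Cramer's Rule."""
    d = det(a)
    if d == 0:
        raise ValueError("matrix is not regular")
    solution = []
    for i in range(len(a)):
        # Replace the i-th column of a by the right-hand side b.
        replaced = [row[:i] + [b[r]] + row[i + 1:] for r, row in enumerate(a)]
        solution.append(Fraction(det(replaced), d))
    return solution


x = cramer([[2, 1], [1, 3]], [3, 5])
```

Each unknown costs one additional determinant, which is why the text recommends elimination for actual computation.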
4.3 Determinants and products of matrices
4.3.1 Lemma. Let $A$, $B$ be square matrices and let $C$ be a matrix of the form
$$\begin{pmatrix}A & M\\ O & B\end{pmatrix}\quad\text{or}\quad\begin{pmatrix}A & O\\ M & B\end{pmatrix}$$
where $O$ indicates a block of zero entries while the entries at $M$ are arbitrary. Then
$$\det C=\det A\cdot\det B.$$
Proof. It suffices to treat the first case. Transform the matrix as indicated in 3.4.4 to obtain
$$\begin{pmatrix}A' & M\\ O & B'\end{pmatrix}\quad\text{with}\quad A'=\begin{pmatrix}a'_{11} & a'_{12} & a'_{13} & \dots & a'_{1m}\\ 0 & a'_{22} & a'_{23} & \dots & a'_{2m}\\ 0 & 0 & a'_{33} & \dots & a'_{3m}\\ \vdots & & & & \vdots\\ 0 & 0 & 0 & \dots & a'_{mm}\end{pmatrix},\quad B'=\begin{pmatrix}b'_{11} & b'_{12} & b'_{13} & \dots & b'_{1n}\\ 0 & b'_{22} & b'_{23} & \dots & b'_{2n}\\ 0 & 0 & b'_{33} & \dots & b'_{3n}\\ \vdots & & & & \vdots\\ 0 & 0 & 0 & \dots & b'_{nn}\end{pmatrix}.$$
If we do this first just in the first $m$ rows and columns and then in the remaining ones, the left upper part corresponds to the transformation of the matrix $A$ and the right lower one is the matrix $B$ transformed. Thus we have $\det A=a'_{11}a'_{22}\cdots a'_{mm}$, $\det B=b'_{11}b'_{22}\cdots b'_{nn}$ and $\det C=a'_{11}a'_{22}\cdots a'_{mm}b'_{11}b'_{22}\cdots b'_{nn}=\det A\cdot\det B$. □
4.3.2 Theorem. Let A; B be matrices of type n

n. Then
det AB D det A det B:
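Proof. Consider the matrix
$$C = \begin{pmatrix}
a_{11} & \dots & a_{1n} & -1 & 0 & \dots & 0 \\
a_{21} & \dots & a_{2n} & 0 & -1 & \dots & 0 \\
\dots & \dots & \dots & \dots & \dots & \dots & \dots \\
a_{n1} & \dots & a_{nn} & 0 & 0 & \dots & -1 \\
0 & \dots & 0 & b_{11} & b_{12} & \dots & b_{1n} \\
\dots & \dots & \dots & \dots & \dots & \dots & \dots \\
0 & \dots & 0 & b_{n1} & b_{n2} & \dots & b_{nn}
\end{pmatrix},
\quad\text{schematically}\quad
C = \begin{pmatrix} A & -I_n \\ O & B \end{pmatrix}.$$
To the $i$-th column add the $a_{1i}$ multiple of the $(n+1)$-th column, the $a_{2i}$ multiple
of the $(n+2)$-th column, etc., until the $a_{ni}$ multiple of the $2n$-th column. These
operations do not change the determinant. The upper left part is annihilated, and the
entry in position $(j,i)$ of the lower left part becomes $\sum_k b_{jk} a_{ki}$, so that the lower
left part becomes $BA$; schematically,
$$\begin{pmatrix} O & -I_n \\ BA & B \end{pmatrix}.$$
Now let us exchange the $i$-th and $(n+i)$-th columns and, to compensate the change of
sign, multiply after each of these exchanges the $i$-th column by $-1$. We obtain
$$D = \begin{pmatrix} I_n & O \\ -B & BA \end{pmatrix}$$
and still $\det C = \det D$. By Lemma 4.3.1, $\det C = \det A \cdot \det B$ and
$\det D = \det I_n \cdot \det BA = \det BA$. Hence $\det BA = \det A \cdot \det B$ for all matrices
$A, B$ of type $n \times n$; exchanging the roles of $A$ and $B$ gives
$\det AB = \det B \cdot \det A = \det A \cdot \det B$. □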
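Lemma 4.3.1 and Theorem 4.3.2 are easy to check on small examples. The following Python sketch (the helper names `det` and `matmul` and the sample matrices are chosen here purely for illustration) computes determinants by elimination over the rationals and compares both sides of the two identities.

```python
from fractions import Fraction


def det(m):
    """Determinant via Gaussian elimination with exact rational arithmetic."""
    a = [[Fraction(x) for x in row] for row in m]
    n = len(a)
    result = Fraction(1)
    for c in range(n):
        # Find a row at or below the diagonal with a non-zero entry in column c.
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result  # a row exchange changes the sign
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return result


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


A = [[1, 2], [3, 4]]
B = [[0, 1], [5, 2]]

# Block matrix of the first form in Lemma 4.3.1, with an arbitrary block M = ((7, 8), (9, 1)).
C = [[1, 2, 7, 8],
     [3, 4, 9, 1],
     [0, 0, 0, 1],
     [0, 0, 5, 2]]

print(det(A), det(B), det(matmul(A, B)), det(C))  # -2 -5 10 10
```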
4.4 Proposition. A square matrix $A$ is regular if and only if $\det A \neq 0$.
Proof. If $A$ is not regular then one of the rows is a linear combination of the
others and $\det A = 0$ by 3.4.2. If $A$ is regular it has an inverse $A^{-1}$. Thus by 3.3.2,
$\det A \cdot \det A^{-1} = \det AA^{-1} = \det I = 1$ and hence $\det A \neq 0$. □
4.5 The determinant of a linear map
Let $V$ be a finite-dimensional vector space over $\mathbb{F}$ and let $f : V \to V$ be a linear
map. Then Theorem 4.3.2 enables us to define the determinant $\det(f)$ of the linear
map $f$ as the determinant of the matrix $A$ of $f$ with respect to the same ordered
basis $B$ in the domain and the codomain (see A.7.6.2). Note that the choice of the
basis $B$ does not matter because if we choose another basis $B'$ and denote the base
change matrix from $B$ to $B'$ by $M$, then the matrix of $f$ with respect to $B'$ in the
domain and codomain is $MAM^{-1}$, and
$$\det(MAM^{-1}) = \det(M)\det(A)\det(M)^{-1} = \det(A).$$
5 The Jordan canonical form of a matrix
5.1 Eigenvalues and eigenvectors of a matrix
An eigenvalue of a matrix $A$ is a number $\lambda \in \mathbb{F}$ such that there exists a non-zero
column vector $v$ with
$$Av = \lambda v. \qquad (5.1.1)$$
The column vector $v$ is then called an eigenvector of $A$ (associated with the
eigenvalue $\lambda$).
Note. These concepts are very useful (see an application in Chapter 7). One
interpretation is as a generalized fixed point. If we recall the linear mapping
$f^A : \mathbb{F}_n \to \mathbb{F}_n$ we see that we have here an "almost fixed point" $v$ with $f^A(v) = \lambda v$.
In the set of all lines through the origin $\{\alpha v \mid \alpha \in \mathbb{F}\}$ ($v \neq o$), which has a lot
of structure and is called the $(n-1)$-dimensional projective space, the directions
generated by eigenvectors become fixed points of the action by $f^A$.
5.1.1 Determining eigenvalues: the characteristic polynomial

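The formula (5.1.1), that is, $\sum_{k=1}^n a_{jk} v_k = \lambda v_j$, can be viewed as
$\sum_{k=1}^n a_{jk} v_k = \sum_{k=1}^n \lambda \delta_{jk} v_k$, rewritten as
$\sum_{k=1}^n (\lambda \delta_{jk} - a_{jk}) v_k = 0$, or
$$(\lambda I - A) v = o. \qquad (5.1.2)$$
Now this is a system of linear equations that has a nonzero solution if and only if
$\operatorname{rank}(\lambda I - A) < n$, that is, by 4.4, if and only if
$$\chi_A(\lambda) = \det(\lambda I - A) = 0.$$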
The expression $\chi_A(\lambda)$ is easily seen to be a polynomial in $\lambda$ with coefficients in $\mathbb{F}$.
It is called the characteristic polynomial of $A$.
We will also apply it to arguments more general than the numbers from $\mathbb{F}$,
see the next paragraph.
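For a concrete instance (an illustrative example chosen here, not taken from the text), consider a symmetric matrix of type $2 \times 2$. The computation below finds its characteristic polynomial, reads off the eigenvalues as its roots, and exhibits one eigenvector for each.

```latex
A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \qquad
\chi_A(\lambda) = \det(\lambda I - A)
  = \det\begin{pmatrix} \lambda - 2 & -1 \\ -1 & \lambda - 2 \end{pmatrix}
  = (\lambda - 2)^2 - 1 = (\lambda - 1)(\lambda - 3).

% Eigenvalues: \lambda = 1 and \lambda = 3. Solving (\lambda I - A) v = o in each case:
\lambda = 1:\quad v = \begin{pmatrix} 1 \\ -1 \end{pmatrix}, \quad
  A v = \begin{pmatrix} 1 \\ -1 \end{pmatrix} = 1 \cdot v; \qquad
\lambda = 3:\quad v = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad
  A v = \begin{pmatrix} 3 \\ 3 \end{pmatrix} = 3 \cdot v.
```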
5.2 The algebra of matrices of type $n \times n$
Matrices of type $n \times n$ can be added by the rule
$$A + B = (a_{jk} + b_{jk})_{jk} \quad\text{where } A = (a_{jk})_{jk} \text{ and } B = (b_{jk})_{jk}$$
and multiplied by a number $\alpha \in \mathbb{F}$ by setting
$$\alpha A = (\alpha a_{jk})_{jk}.$$
This is of course the same as computing in the vector space of $n \times n$ matrices over $\mathbb{F}$.
(Recall that the zero vector is the zero matrix $O$, i.e. the matrix with all the entries
0.) Note that the $\lambda I - A$ in (5.1.2) agrees with this notation.
For convenience we sometimes write the multiplication by numbers also from the
right, as $A\alpha$.
Furthermore we have the multiplication of matrices $AB$ and we easily deduce
that
$$(A + B)C = AC + BC, \quad A(B + C) = AB + AC, \quad AO = OA = O,$$
$$0 \cdot A = O, \quad\text{and}\quad (\alpha A)B = A(\alpha B) = \alpha(AB).$$
This structure is called the algebra of matrices (of type $n \times n$). It will be denoted by
$$\mathcal{A}_n.$$
Thus, we can consider polynomials with coefficients in $\mathcal{A}_n$.
5.2.1 Lemma. Let $A \in \mathcal{A}_n$, and let
$$p(x) = C_k x^k + \dots + C_1 x + C_0$$
where $C_0, \dots, C_k \in \mathcal{A}_n$ commute with $A$. Then there exists a polynomial $q(x)$
with coefficients in $\mathcal{A}_n$ such that
$$p(x) = (xI - A)q(x) + p(A).$$
Proof. Apply division of polynomials with remainder by $xI - A$; we work with
polynomials with coefficients in $\mathcal{A}_n$. All matrices involved as coefficients commute
with $A$. □
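5.3 Theorem. (Cayley-Hamilton) Plugging a matrix $A \in \mathcal{A}_n$ into its own
characteristic polynomial gives
$$\chi_A(A) = O.$$
Proof. Let $B(\lambda) = \lambda I - A$, let
$$C(\lambda)_{jk} = (-1)^{j+k} \det B(\lambda)_{(j,k)},$$
where $B(\lambda)_{(j,k)}$ is $B(\lambda)$ with the $j$-th row and the $k$-th column deleted.
By Cramer's rule,
$$(\lambda I - A) C(\lambda)^T = I \chi_A(\lambda).$$
Applying Lemma 5.2.1 to $p(\lambda) = I \chi_A(\lambda)$, whose coefficients are multiples of $I$
and hence commute with $A$, we have
$$(\lambda I - A) C(\lambda)^T = (\lambda I - A) q(\lambda) + \chi_A(A),$$
or
$$(\lambda I - A)(C(\lambda)^T - q(\lambda)) = \chi_A(A).$$
Examining the highest power of $\lambda$ which occurs in $C(\lambda)^T - q(\lambda)$ (a non-zero
difference would produce a positive power of $\lambda$ on the left, while the right-hand side
does not depend on $\lambda$), we see that
$$C(\lambda)^T - q(\lambda) = 0,$$
proving the statement of the Theorem. □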
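The Cayley-Hamilton Theorem can be checked by direct computation. The sketch below does not follow the proof above: as an assumption made for this illustration, it obtains the coefficients of the characteristic polynomial by the Faddeev-LeVerrier recursion and then evaluates the polynomial at $A$ by Horner's scheme. All function names are hypothetical.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly(a):
    """Coefficients [c_0, ..., c_n] of det(lambda*I - A) (Faddeev-LeVerrier)."""
    n = len(a)
    a = [[Fraction(x) for x in row] for row in a]
    coeffs = [Fraction(0)] * n + [Fraction(1)]  # leading coefficient c_n = 1
    m = [[Fraction(0)] * n for _ in range(n)]   # M_0 = O
    for k in range(1, n + 1):
        # M_k = A M_{k-1} + c_{n-k+1} I
        am = matmul(a, m)
        m = [[am[i][j] + (coeffs[n - k + 1] if i == j else 0)
              for j in range(n)] for i in range(n)]
        # c_{n-k} = -trace(A M_k) / k
        am = matmul(a, m)
        coeffs[n - k] = -sum(am[i][i] for i in range(n)) / k
    return coeffs


def eval_at_matrix(coeffs, a):
    """Evaluate c_n A^n + ... + c_1 A + c_0 I by Horner's scheme."""
    n = len(a)
    result = [[Fraction(0)] * n for _ in range(n)]
    for c in reversed(coeffs):
        prod = matmul(result, a)
        result = [[prod[i][j] + (c if i == j else 0) for j in range(n)]
                  for i in range(n)]
    return result


A = [[2, 1, 0],
     [0, 2, 0],
     [0, 0, 3]]
coeffs = char_poly(A)            # (x - 2)^2 (x - 3) = x^3 - 7x^2 + 16x - 12
value = eval_at_matrix(coeffs, A)
print(coeffs == [-12, 16, -7, 1], all(e == 0 for row in value for e in row))  # True True
```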
5.4
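By a Jordan block we mean a matrix of the form
$$\begin{pmatrix}
\lambda & 0 & \dots & 0 & 0 \\
1 & \lambda & \dots & 0 & 0 \\
0 & 1 & \dots & 0 & 0 \\
\dots & \dots & \dots & \dots & \dots \\
0 & 0 & \dots & \lambda & 0 \\
0 & 0 & \dots & 1 & \lambda
\end{pmatrix}.$$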
A matrix similar to a matrix $A$ is a matrix of the form $B^{-1}AB$ where $B$ is an
invertible matrix. A direct sum of square matrices $A_1, \dots, A_k$ is the matrix
$$\begin{pmatrix}
A_1 & 0 & \dots & 0 \\
0 & A_2 & \dots & 0 \\
\dots & \dots & \dots & \dots \\
0 & 0 & \dots & A_k
\end{pmatrix}.$$
5.5
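A vector space $V$ is a direct sum of subspaces $U_1, \dots, U_r$ if
$U_j \cap \sum_{k \neq j} U_k = \{o\}$ for every $j$ and $V = U_1 + \dots + U_r$; in other words, if
each $v \in V$ can be written as a unique sum $v = v_1 + \dots + v_r$ with $v_j \in U_j$. We then
write $V = U_1 \oplus \dots \oplus U_r$. From now on, we will work over the field $\mathbb{F} = \mathbb{C}$.
5.5.1 Lemma. Put
$$U_\lambda = \{v \in \mathbb{C}^n \mid (\lambda I - A)^N v = 0 \text{ for some } N = 0, 1, 2, \dots\}.$$
Then $\mathbb{C}^n$ is the direct sum of the spaces $U_\lambda$, where $\lambda$ runs through the eigenvalues of $A$.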
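Proof. Let us write
$$\chi_A(x) = \prod_{i=1}^{k} (x - \lambda_i)^{n_i};$$
thus, the $\lambda_i$ are the distinct eigenvalues of $A$, and $\sum n_i = n$. Define subspaces
$W_i \subseteq \mathbb{C}^n$, $i = 0, \dots, k$, and linear transformations
$$f_i : W_{i-1} \to W_i, \quad i = 1, \dots, k$$
as follows:
$$W_0 = \mathbb{C}^n, \qquad f_i = (A - \lambda_i I)^{n_i}|_{W_{i-1}}, \qquad W_i = f_i[W_{i-1}].$$
By definition,
$$\operatorname{Ker}(f_i) \subseteq U_{\lambda_i}, \quad i = 1, \dots, k. \qquad (1)$$
By the Cayley-Hamilton Theorem,
$$W_k = 0. \qquad (2)$$
Since the $f_i$ are onto, $\dim W_{i-1} = \dim \operatorname{Ker}(f_i) + \dim W_i$, and hence by (2),
$$\dim(\operatorname{Ker}(f_1)) + \dots + \dim(\operatorname{Ker}(f_k)) = n;$$
hence, by (1),
$$\sum_{i=1}^{k} \dim(U_{\lambda_i}) \geq n.$$
Thus, it suffices to show that if
$$\sum_{i=1}^{k} v_i = 0, \quad v_i \in U_{\lambda_i}, \qquad (3)$$
then $v_1 = \dots = v_k = 0$. Let
$$m_i = \min\{N \mid (A - \lambda_i I)^N v_i = 0\}.$$
Suppose $m_{i_0} \neq 0$. Then replacing each vector $v_i$ by $v'_i = (A - \lambda_{i_0} I) v_i$, the vectors
$v'_i$ still satisfy (3) in place of the $v_i$'s. When we make this replacement, the number
$m_{i_0}$ decreases by 1, while the numbers $m_i$, $i \neq i_0$, remain unchanged (because
$A - \lambda_{i_0} I$ commutes with $A - \lambda_i I$ and is injective on $U_{\lambda_i}$ for $i \neq i_0$). After applying
this procedure finitely many times, we achieve a situation where $m_{i_1} = 1$ for some
$i_1$, and $m_i = 0$ for $i \neq i_1$. Then (3) reads
$$v_{i_1} = 0,$$
which contradicts $m_{i_1} = 1$.
Thus, we have proved that $m_i = 0$ for all $i$, in other words $v_i = 0$, which is what
we needed to show. □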
5.5.2 Theorem. (Jordan) Every $n \times n$ matrix is similar to a direct sum of Jordan
blocks. Moreover, up to order, the Jordan blocks are uniquely determined.
(We refer to this direct sum as the Jordan canonical form of the matrix $A$.)
Proof. We will exhibit a proof which will allow us to find the Jordan blocks and the
transforming matrix explicitly (assuming we already have the eigenvalues).
Fix an eigenvalue $\lambda$. We shall exhibit a basis of $U_\lambda$ with respect to which the
matrix of the linear transformation $A|_{U_\lambda}$ is a direct sum of Jordan blocks. Put
$f_\lambda = \lambda I - A$. Define subspaces
$$U_{\lambda,0} \subset U_{\lambda,1} \subset \dots \subset U_{\lambda,m} \qquad (1)$$
of $U_\lambda$ inductively by
$$U_{\lambda,0} = 0, \qquad U_{\lambda,i+1} = f_\lambda^{-1}[U_{\lambda,i}].$$
We see that if we let $m$ be the first number such that
$$U_{\lambda,m} = U_\lambda,$$
then all the inclusions (1) are strict. Let $v_{j1}, \dots, v_{jq_j}$ be a set of vectors in $U_{\lambda,j}$
which projects to a basis of $U_{\lambda,j}/(U_{\lambda,j-1} + f_\lambda[U_{\lambda,j+1}])$, $j = 1, \dots, m$ (recall
A.6.2.1). Then
$$v_{ji},\ f_\lambda(v_{ji}),\ \dots,\ (f_\lambda)^{j-1} v_{ji}, \qquad j = 1, \dots, m,\ i = 1, \dots, q_j$$
is by definition the desired basis. Combining these bases over all eigenvalues $\lambda$,
by Lemma 5.5.1, gives a basis with respect to which the linear transformation $A$ is
a sum of Jordan blocks. Further, the sizes of the Jordan blocks determine and are
determined by the dimensions of the spaces $U_{\lambda,j}$, which in turn depend only on the
matrix $A$. This implies the uniqueness statement. □
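The uniqueness part of the proof says that the block sizes are determined by the dimensions of the spaces $U_{\lambda,j}$, that is, of the kernels of the powers of $f_\lambda$. A minimal sketch of that bookkeeping, assuming the eigenvalue is already known and rational: the number of blocks of size at least $j$ equals $\dim U_{\lambda,j} - \dim U_{\lambda,j-1}$. The code works with $A - \lambda I$ instead of $\lambda I - A$, which has the same kernels; the names `rank` and `jordan_block_sizes` are made up for this example.

```python
from fractions import Fraction


def rank(m):
    """Rank of a matrix via row elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in m]
    rows, cols = len(a), len(a[0])
    r = 0  # number of pivots found so far
    for c in range(cols):
        if r == rows:
            break
        pivot = next((i for i in range(r, rows) if a[i][c] != 0), None)
        if pivot is None:
            continue
        a[r], a[pivot] = a[pivot], a[r]
        for i in range(r + 1, rows):
            factor = a[i][c] / a[r][c]
            a[i] = [x - factor * y for x, y in zip(a[i], a[r])]
        r += 1
    return r


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def jordan_block_sizes(a, lam):
    """Sizes of the Jordan blocks of a that belong to the eigenvalue lam."""
    n = len(a)
    shifted = [[a[i][j] - (lam if i == j else 0) for j in range(n)]
               for i in range(n)]
    dims = [0]  # dims[j] = dimension of the kernel of (A - lam I)^j
    power = [[int(i == j) for j in range(n)] for i in range(n)]
    while True:
        power = matmul(power, shifted)
        d = n - rank(power)
        if d == dims[-1]:  # the chain of kernels has stabilized
            break
        dims.append(d)
    # at_least[j - 1] = number of blocks of size >= j
    at_least = [dims[j] - dims[j - 1] for j in range(1, len(dims))]
    sizes = []
    for j, count in enumerate(at_least, start=1):
        larger = at_least[j] if j < len(at_least) else 0
        sizes.extend([j] * (count - larger))
    return sorted(sizes, reverse=True)


A = [[2, 1, 0],
     [0, 2, 0],
     [0, 0, 2]]
print(jordan_block_sizes(A, 2))  # [2, 1]
```

If `lam` is not an eigenvalue, the first kernel is already trivial and the function returns an empty list.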
6 Exercises
(1) Write down a detailed proof of Theorem 2.3.2.

(2) Find all solutions of the system of linear equations over $\mathbb{R}$
$$\begin{aligned}
x + 2y + 3z + 4t + u &= 10, \\
2x + 4y + 2z + 5t + u &= 8, \\
3x + 6y + 5z + 9t + 2u &= 1.
\end{aligned}$$
(3) Prove that a Hermitian form over $\mathbb{C}^n$ (resp. symmetric bilinear form over $\mathbb{R}^n$)
is non-degenerate if and only if its associated matrix is regular.
(4) Decide whether the symmetric bilinear form on $\mathbb{R}^3$ associated with the matrix
$$\begin{pmatrix} 4 & 6 & 1 \\ 6 & 8 & 2 \\ 1 & 2 & 4 \end{pmatrix}$$
is non-degenerate, and whether it is positive-definite, negative-definite or
indefinite.
(5) Compute the determinant of the matrix
$$\begin{pmatrix} 2 & 1 & 3 & 4 \\ 2 & 2 & 4 & 5 \\ 1 & 4 & 3 & 3 \\ 3 & 5 & 6 & 8 \end{pmatrix}.$$
(6) Prove that the set of all $n \times n$ matrices over $\mathbb{R}$ (resp. $\mathbb{C}$) of non-zero determinant
with the operation of matrix multiplication is a group. This group is called the
general linear group and denoted by $GL_n(\mathbb{R})$ (resp. $GL_n(\mathbb{C})$).
(7) Prove that
$$\det : GL_n(\mathbb{F}) \to \mathbb{F}^{\times}$$
is a homomorphism of groups, where $\mathbb{F}$ stands for $\mathbb{R}$ or $\mathbb{C}$ and $\mathbb{F}^{\times}$ is the
multiplicative group of non-zero elements of $\mathbb{F}$.
(8) Prove that the determinant of a square matrix with entries in $\mathcal{A}_n$ in which two
rows (or two columns) coincide is 0. [Hint: the same product appears once
with a $+$ and once with a $-$.]
(9) Write down an explicit condition on when a $2 \times 2$ matrix
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}$$
($a, b, c, d \in \mathbb{C}$) is regular, and write down a closed formula for its inverse.
(10) Determine the Jordan canonical form of the matrix
$$A = \begin{pmatrix} 1 & 1 & 0 & 3 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
and find a non-singular matrix $P$ such that $P^{-1}AP$ is in Jordan form.
Bibliography
1. L. Ahlfors, Complex Analysis, 3rd edn. (McGraw-Hill Science/Engineering/Math, New York,
1979)
2. M. Artin, Algebra, 2nd edn. (Pearson, Boston, 2011)
3. R. Bott, L.W. Tu, Differential Forms in Algebraic Topology. Graduate Texts in Mathematics,
vol. 82 (Springer, New York, 2011)
4. D. Dummit, R. Foote, Abstract Algebra, 3rd edn. (Wiley, Hoboken, 2004)
5. L. Evans, Partial Differential Equations. Graduate Studies in Mathematics, vol. 19, 2nd edn.
(American Mathematical Society, Providence, 2010)
6. O. Forster, B. Gilligan, Lectures on Riemann Surfaces. Graduate Texts in Mathematics, vol. 81
(Springer, New York, 1981)
7. I.M. Gelfand, S.V. Fomin, Calculus of Variations. Dover Books in Mathematics (Dover
Publications, Mineola, 2000)
8. P. Griffiths, J. Harris, Principles of Algebraic Geometry (Wiley, New York, 1994)
9. B.C. Hall, Lie Groups, Lie Algebras, and Representations: An Elementary Introduction. Graduate
Texts in Mathematics, vol. 222 (Springer, New York, 2003)
10. S. Helgason, Differential Geometry, Lie Groups, and Symmetric Spaces. Graduate Studies in
Mathematics, vol. 34 (American Mathematical Society, Providence, 2001)
11. S. Lang, Elliptic Functions. Graduate Texts in Mathematics, vol. 112 (Springer, New York,
1987)
12. S. MacLane, Categories for the Working Mathematician. Graduate Texts in Mathematics,
vol. 5, 2nd edn. (Springer, New York, 1998)
13. J.P. May, A Concise Course in Algebraic Topology (University of Chicago Press, Chicago,
1999)
14. J.R. Munkres, Elements of Algebraic Topology (Westview Press, Boulder, 1996)
15. R. Narasimhan, Several Complex Variables (University of Chicago Press, 1995)
16. P. Petersen, Riemannian Geometry. Graduate Texts in Mathematics, vol. 171 (Springer,
New York, 2010)
17. F. Riesz, B. Nagy, Functional Analysis (Dover Publications, New York, 1990)
18. W. Rudin, Real and Complex Analysis. International Series in Pure and Applied Mathematics,
3rd edn. (McGraw-Hill, New York, 1987)
19. W. Rudin, Functional Analysis, 2nd edn. (McGraw-Hill, New York, 1991)
20. M. Singer, J.A. Thorpe, Lecture Notes on Elementary Topology and Geometry. Undergraduate
Texts in Mathematics (Springer, New York, 1976)
21. M. Spivak, A Comprehensive Introduction to Differential Geometry. 5 volume set, 3rd edn.
(Publish or Perish, Houston, 1999)
22. M. Spivak, Calculus, 4th edn. (Publish or Perish, Houston, 2008)

