
Differential Geometry

Ignacio Lopez de Arbina

October 12, 2015
This is a summary of the book Geometrical Methods of Mathematical
Physics by Bernard F. Schutz, typed in LaTeX, with some additional
mathematical notation for the concepts and shorter introductions to the topics.

From around 300 B.C. to the 1800s, mathematicians believed that Euclidean geometry was the correct idealization of real physical space. Whereas
philosophers like Kant held that our mind has an a priori conception of
reality, philosophers like Hume assumed that science is purely empirical, so
that the laws of Euclidean geometry are not necessarily physical truths. But
Kant's point of view was accepted by his contemporaries, who believed that the external world is known only insofar as we are forced to interpret it with our minds.
The root of the problem was that the Euclidean axiom of the parallels
was not self-evident. Although it seemed obvious to everybody, it lacked the certainty
needed to be an axiom. In trying to prove it from the other Euclidean axioms,
geometers realized that they could not do so. And this is the beginning of the
non-Euclidean geometries.
1. The space Rn and its topology
The space Rn is an n-dimensional space of vector algebra where a point, or
n-tuple, (x1, x2, . . . , xn) is a sequence of n real numbers. This is a continuous
space, where for any given point there exists another as close as we want; or,
between any two given points there are infinitely many others. With this concept we can state
the local topology of the space Rn.
What we first need in order to define a topology is the concept of a distance function
between any two points. If
x = (x1, x2, . . . , xn)  and  y = (y1, y2, . . . , yn)
are two points of Rn, then the distance function between both points is defined as
d(x, y) = [(x1 − y1)^2 + (x2 − y2)^2 + . . . + (xn − yn)^2]^{1/2}.
A neighbourhood of radius r of the point x ∈ Rn is the set of points Nr(x) whose
distance from x is less than r. (Clarification: Nr(x) is the set of points contained
in an open ball centred at x with radius r, where r is an arbitrary distance, so that
d(x, y) < r.) Now we can define more precisely the continuity of the space. A
set of points of Rn is discrete if each point has a neighbourhood which contains
no other points of the set. By this, Rn is not discrete. A set of points S of Rn
is said to be open if every point x in S has a neighbourhood entirely contained
in S (open set: S ⊂ Rn is open ⇔ ∀ x ∈ S ∃ Nr(x) ⊂ S). Clearly discrete sets are
not open. Accordingly, a subset of Rn will be open if we do not include the
boundary of the subset in itself.
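As a small illustration (a sketch not in the text), the Euclidean distance function and the neighbourhood test can be written in a few lines of Python; the sample points are arbitrary:

```python
import math

def d(x, y):
    """Euclidean distance between two points of R^n."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def in_neighbourhood(y, x, r):
    """True if y lies in the neighbourhood N_r(x), i.e. d(x, y) < r."""
    return d(x, y) < r

x = (0.0, 0.0)
print(d(x, (3.0, 4.0)))                       # 5.0
print(in_neighbourhood((0.1, 0.1), x, 0.5))   # True
print(in_neighbourhood((3.0, 4.0), x, 0.5))   # False
```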
The Hausdorff property of Rn says that any two distinct points of Rn have neighbourhoods which do not intersect. From a geometrical point of view, this is the same as saying that the line
joining any two points of Rn can be infinitely subdivided.
We have used the distance function d(x, y) to define neighbourhoods and
thereby open sets. We say that d(x, y) has induced a topology on Rn. This has
led us to define open sets of Rn which have the properties:
(Ti) if O1 and O2 are open, then their intersection O1 ∩ O2 is also open; and
(Tii) the union of any collection (possibly infinite in number) of open sets is
also open.
In order to make (Ti) apply to all open sets of Rn, we define the empty set ∅ to
be open, and in order to make (Tii) work we likewise define Rn itself to be open.
Now we can ask whether the induced topology depends very much on the
precise form of d(x, y). Let us suppose, for example, a distance function of the form
d′(x, y) = [α(x1 − y1)^2 + β(x2 − y2)^2 + . . . + γ(xn − yn)^2]^{1/2},  α, β, . . . , γ positive real numbers,
which also defines neighbourhoods and open sets. The key point is that any
set which is open according to d′(x, y) is also open according to d(x, y), and vice
versa. The proof rests on the fact that any given d-type neighbourhood of x
contains a d′-type neighbourhood entirely within it, and vice versa. Hence
∀ ε, ∀ Nε(x), ∃ δ such that N′δ(x) ⊂ Nε(x),
where N and N′ denote the d- and d′-type neighbourhoods, respectively. So we
can conclude that if a set is open as defined by d(x, y) it is also open as defined
by d′(x, y), and vice versa. We therefore say that both d and d′ induce the
same topology on Rn. So although we began with the usual Euclidean distance
function d(x, y), the topology we have defined does not really depend on the
form of d. This is called the natural topology of Rn. Topology is a more
primitive concept than distance. So we do not care about the actual distance
between two points, since there are many different possible definitions; all we
need is the notion that the distance between points can be made arbitrarily
small and that no two distinct points have zero distance between them.
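A numerical sketch of this fact (the weights α = 1, 4, 1/4 for d′ are an arbitrary choice): since d(x, y) ≤ d′(x, y)/√(min α), every d′-ball of radius δ = ε√(min α) fits inside the d-ball of radius ε, which we spot-check by random sampling:

```python
import math
import random

# Assumed weights for d' (any positive numbers work):
alphas = (1.0, 4.0, 0.25)

def d(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def d_prime(x, y):
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(alphas, x, y)))

# d(x,y) <= d'(x,y)/sqrt(min alpha), so the d'-ball of radius
# delta = eps*sqrt(min alpha) lies inside the d-ball of radius eps.
eps = 0.1
delta = eps * math.sqrt(min(alphas))
x = (0.0, 0.0, 0.0)
random.seed(0)
for _ in range(10_000):
    y = tuple(random.uniform(-1, 1) for _ in range(3))
    if d_prime(x, y) < delta:
        assert d(x, y) < eps   # the d'-neighbourhood lies in the d-one
```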
The topology of a manifold is more general than any particular distance
function so that the word neighbourhood is often used in a different sense.
So we will let a neighbourhood of a point x be any set containing an open set
containing x.
2. Mapping
A map f from a space M to a space N is a rule which associates with each
element x of M a unique element y of N, that is
f : M → N,  x ↦ y = f(x).
We remark that a map gives a unique f(x) for every x, but not necessarily a
unique x for every f(x). When more than one value of x gives the
same value f(x), the map is called many-to-one (not injective). More generally,
if f maps M to N, then for any set S in M the elements in N mapped from the
points of S form a set T called the image of S under f, denoted by f(S):
if f : M → N and S ⊂ M, then S ↦ T = f(S) ⊂ N, the image of S.
Conversely, the set S is called the inverse image of T, or the kern (from German)
of T, denoted by f⁻¹(T). If the map f is many-to-one, then the inverse image of
a single point of N is not a single point of M, so there is no map f⁻¹ from N to
M, since every map must have a unique image. So in general f⁻¹(T) must be
read as a single symbol: it is not the image of T under a map f⁻¹ but simply
a set called f⁻¹(T). On the other hand, if every point in f(S) has a unique
inverse image point in S, then f is said to be one-to-one (1-1) and there does
exist another 1-1 map f⁻¹, called the inverse of f, which maps the image of M
to M.
For example, f(x) = sin x is many-to-one, since f(x) = f(x + 2nπ) = f[(2n + 1)π − x] ∀ n ∈ Z.
Therefore, a true inverse function does not exist. The usual inverse function,
arcsin y or sin⁻¹ y, is obtained by restricting the original sine function to the
principal values, −π/2 ≤ x ≤ π/2, on which it is indeed 1-1 and invertible.
If we have two maps, f and g, with f : M → N and g : N → P, then there is a
map called the composition of f and g, denoted by g ∘ f, which maps from M to P
(g ∘ f : M → P). The composition takes a point x ∈ M, finds the point f(x) ∈ N, and
then uses g to map it to P: (g ∘ f)(x) = g(f(x)).
If a map is defined at every point of M, then we say that it is a mapping
from M into N. If, in addition, every point of N has an inverse image (not
necessarily a unique one), we say it is a mapping from M onto N. As we said,
if the inverse image is unique, the map is 1-1. A map which is both 1-1 and
onto is called a bijection.
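The composition rule can be sketched directly in Python; the maps f and g below are arbitrary examples, with M = N = P = R:

```python
def f(x):
    return x + 1          # f : R -> R

def g(x):
    return x * x          # g : R -> R

def compose(g, f):
    """Return the composition g∘f, the map x -> g(f(x))."""
    return lambda x: g(f(x))

gf = compose(g, f)
fg = compose(f, g)
print(gf(2))   # g(f(2)) = g(3) = 9
print(fg(2))   # f(g(2)) = f(4) = 5, so composition is order-sensitive
```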
A map f : M → N is continuous at x ∈ M if any open set of N containing
f(x) contains the image of an open set of M. [This presupposes that M and N
are topological spaces.] More generally, f is continuous on M if it is continuous
at all the points of M. In calculus notation we would say that f is continuous at
a point x0 if
∀ ε > 0, ∃ δ > 0 such that |f(x) − f(x0)| < ε whenever |x − x0| < δ.
In terms of open sets, consider the distance function d′′(x, x0) = |x − x0|.
We say that f is continuous at x0 if every d′′-neighbourhood of f(x0) contains
the image of a d′′-neighbourhood of x0. [Remember that these neighbourhoods
are open sets.]

Theorem 1. f is continuous on M if and only if the inverse image of every
open set of N is open in M.
Proof 1 If f is continuous at all x, then the inverse image of any open set is
open because it contains an open set containing each point in the inverse image.
Conversely, if the inverse image of every open set of N is open, then it contains
an open set about any of its points, so f is continuous at each of these points.
Now we can go on and define differentiation of a function. If f (x1 , . . . , xn ) is
a function defined in some open region S of Rn , then it is said to be differentiable
of class C k if all its partial derivatives of order less than or equal to k exist and
are continuous functions on S. As a shorthand, such a function f is said to be
a C^k function. Special cases: a C^0 function is a continuous function; a C^∞
function is one all of whose derivatives exist. If f is a 1-1 map of an open set
M ⊂ Rn onto another open set N ⊂ Rn, it can be expressed as
y_i = f_i(x1, x2, . . . , xn), i = 1, . . . , n,  or simply  y = f(x).
If the functions {fi , i = 1, . . . , n} are all C k -differentiable, then the map is said
to be C^k-differentiable. The Jacobian matrix of a C^1 map is the matrix of
partial derivatives ∂f_i/∂x_j. The determinant of this matrix is simply called the
Jacobian, J, and is denoted by
J = ∂(f1, . . . , fn)/∂(x1, . . . , xn) = det(∂f_i/∂x_j).
If the Jacobian at a point x is nonzero, then the inverse function theorem assures
us that the map f is 1-1 and onto in some neighbourhood of x.
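As a sketch using the sympy library (the polar-coordinate map is an illustrative example, not from the text), the Jacobian can be computed symbolically:

```python
import sympy as sp

# Polar-coordinate map f(r, theta) = (r cos(theta), r sin(theta)).
r, th = sp.symbols('r theta', positive=True)
f = sp.Matrix([r * sp.cos(th), r * sp.sin(th)])

Jmat = f.jacobian(sp.Matrix([r, th]))   # matrix of partials df_i/dx_j
J = sp.simplify(Jmat.det())             # the Jacobian determinant
print(J)                                # r
```

Since J = r ≠ 0 away from the origin, the inverse function theorem guarantees the map is 1-1 and onto in a neighbourhood of any such point.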
If a function g(x1, . . . , xn) is mapped into a function g*(y1, . . . , yn) by the rule
g*[f1(x1, . . . , xn), . . . , fn(x1, . . . , xn)] = g(x1, . . . , xn)
(that is, g* has the same value at f(x) as g has at x), then the integral of g over M
equals the integral of g*J over N:
∫_M g(x1, . . . , xn) dx1 . . . dxn = ∫_N g*(y1, . . . , yn) J dy1 . . . dyn.
Since g and g* have the same value at corresponding points, the volume element dx1 . . . dxn has changed to J dy1 . . . dyn (coordinate change).
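A one-dimensional sanity check with sympy (the interval, the map y = x², and the sample function g are all assumptions chosen for illustration):

```python
import sympy as sp

# M = (0, 1); the map y = f(x) = x**2 is 1-1 on M; sample g(x) = x**2.
x, y = sp.symbols('x y', positive=True)
g = x**2
g_star = g.subs(x, sp.sqrt(y))             # g*(y) = g(f^{-1}(y))
J = sp.diff(sp.sqrt(y), y)                 # Jacobian of the inverse map x(y)

lhs = sp.integrate(g, (x, 0, 1))           # integral of g over M
rhs = sp.integrate(g_star * J, (y, 0, 1))  # integral of g* J over N
print(lhs, rhs)                            # 1/3 1/3
```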
3. Real analysis
A real function of a single variable, f (x), is said to be analytic at x = x0 if it
has a Taylor expansion about x0 which converges to f (x) in some neighbourhood
of x0 , and we write

f(x) = f(x0) + (df/dx)|_{x0}(x − x0) + (1/2!)(d²f/dx²)|_{x0}(x − x0)² + (1/3!)(d³f/dx³)|_{x0}(x − x0)³ + . . .

So functions which are not C^∞ at x0 are not analytic. Likewise, there are C^∞
functions which are not analytic. For example exp(−1/x²), whose value and all
of whose derivatives are zero at x = 0, but which is not identically zero in any
neighbourhood of x = 0. However, there are nonanalytic functions that can be
well approximated by analytic ones in the following sense. Let g(x1, . . . , xn) be a
real-valued function defined on an open region S of Rn; it is said to be
square-integrable if the multiple integral
∫_S [g(x1, . . . , xn)]² dx1 . . . dxn
exists. There is a theorem of functional analysis that any square-integrable
function g may be approximated by an analytic function g′ in such a way that
the integral of (g − g′)² over S can be made as small as one wishes. [For this
reason physicists typically do not hesitate to assume that a given function is
analytic if this helps to establish a result.] Since a C^∞ function need not be
analytic, there is a special notation for analytic functions: C^ω. Naturally, a C^ω
function is C^∞.
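The standard counterexample above can be checked with sympy, a sketch not in the text:

```python
import sympy as sp

x = sp.Symbol('x')
f = sp.exp(-1 / x**2)

# The function and its first few derivatives all tend to 0 at x = 0 ...
for k in range(3):
    assert sp.limit(sp.diff(f, x, k), x, 0) == 0
# ... so every Taylor coefficient about 0 vanishes, yet f is nonzero
# arbitrarily close to 0; hence f is C-infinity but not analytic there.
print(f.subs(x, sp.Rational(1, 2)))     # exp(-4)
```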
An operator A on functions defined on Rn is a map which takes one function
f into another one, A(f). If A(f) = gf, where g is another function, then the
operator is simply multiplicative. Other operators on R might be simple differentiation,
D(f) = df/dx,
or integration using a fixed kernel function g,
[G(f)](x) = ∫^x f(y) g(x, y) dy.
In each case the operator may or may not be defined on all functions f . D may
not be defined on a function which is not C 1 , while G is undefined on functions
which give unbounded integrals. Specifying the set of functions on which an
operator is allowed to act in fact forms part of the definition of the operator;
this set is called its domain.
The commutator of two operators A and B, denoted [A, B], is another operator
defined as
[A, B](f) = (AB − BA)(f) = A[B(f)] − B[A(f)].
The commutator only makes sense when acting on a function. If the two
operators have vanishing commutator, they are said to commute. One has to be
careful about the domains of the operators: the domain of [A, B] may not be as
large as that of either A or B. E.g., if A = d/dx and B = xd/dx, then we may
take both their domains to be all C 1 functions. But not for all C 1 functions will
the successive operator A[B(f )] be defined, since it involves second derivatives.
The operators AB and BA can be given all C 2 functions as domains, which
is a smaller set than C 1 functions. Then the commutator [A, B] has only C 2
functions in its domain. We can enlarge the domain (extending the operator)
in this case, though not always, observing that for any C 2 function f

[A, B](f ) =
dx dx

so we can identify [A, B] simply with d/dx and thereby extend its domain to
all C 1 functions. The point is that the commutator may be defined even on
functions on which the products in the commutator are not.
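The identity [A, B] = d/dx for A = d/dx, B = x d/dx can be verified symbolically with sympy, applying both operators to a generic function:

```python
import sympy as sp

x = sp.Symbol('x')
f = sp.Function('f')(x)

A = lambda u: sp.diff(u, x)             # A = d/dx
B = lambda u: x * sp.diff(u, x)         # B = x d/dx

# [A, B](f) = A(B(f)) - B(A(f)) for a generic (symbolic) function f:
comm = sp.expand(A(B(f)) - B(A(f)))
print(comm)                             # Derivative(f(x), x)
assert comm == sp.diff(f, x)            # i.e. [A, B] = d/dx
```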
4. Group theory
A collection of elements G together with a binary operation called · is a
group if it satisfies the axioms:
(Gi) Associativity: if x, y, z ∈ G, then
x · (y · z) = (x · y) · z.
(Gii) Right identity: G contains an element e such that, ∀ x ∈ G,
x · e = x.
(Giii) Right inverse: ∀ x ∈ G there is an element called x⁻¹ ∈ G for which
x · x⁻¹ = e.
A group is said to be Abelian, or commutative, if in addition it satisfies
(Giv) x · y = y · x ∀ x, y ∈ G.
A familiar example is the group of all permutations of n objects; the binary
composition of two permutations is simply the permutation
obtained by following one permutation by the other. The group has n!
elements. Its identity element is the permutation which leaves all objects unchanged.
From (Gi) to (Giii) we can conclude that: the identity element e is unique; it
is also a left-identity (e · x = x); the inverse element x⁻¹ is unique for any x; and
it is also a left-inverse (x⁻¹ · x = e). As usual, it is common to omit the symbol ·
when there is no risk of confusion.
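The group axioms can be checked exhaustively for the permutation group of n = 3 objects, a small sketch of the example above:

```python
from itertools import permutations

# Each permutation is a tuple p with p[i] the image of object i;
# compose(p, q) means "apply q first, then p".
n = 3
G = list(permutations(range(n)))
compose = lambda p, q: tuple(p[q[i]] for i in range(n))
e = tuple(range(n))                     # identity permutation

assert len(G) == 6                      # n! elements
for p in G:
    assert compose(p, e) == p           # (Gii) right identity
    inv = tuple(sorted(range(n), key=lambda i: p[i]))
    assert compose(p, inv) == e         # (Giii) right inverse
for p in G:
    for q in G:
        for r in G:                     # (Gi) associativity
            assert compose(p, compose(q, r)) == compose(compose(p, q), r)
print("all group axioms verified for the 3-object permutation group")
```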
The most important kind of group in modern physics is the Lie group. It is
a continuous group: any open set of the elements of a Lie group has a 1-1 map
onto an open set of Rn for some n. An example of a Lie group is the translation
group of Rn (x → x + a, a = const). Each point a of Rn corresponds to an
element of the group, so the group has in fact a 1-1 map onto all of Rn. The
group composition law is simply addition: two elements a = (a1, . . . , an) and
b = (b1, . . . , bn) compose to form c = a + b = (a1 + b1, . . . , an + bn). This
illustrates the fact that one need not always use the symbol · to represent the
group operation. With Abelian groups, as this one is, it is more common to
use the symbol +. A subgroup S of a group G is a collection of elements of G
which themselves form a group with the same binary operation, and is denoted
by S ⊂ G. As a group, a subgroup must have an identity element.
Since the group's identity e is unique, any subgroup must also contain e. E.g.,
a subgroup of the permutation group could be the permutations of n objects
which do not change the position of the first object, because: (i) the identity e
leaves the first object fixed; (ii) the inverse of such a permutation still leaves the
first object fixed; (iii) the composition of any two such permutations still leaves

the first object fixed. This subgroup is identical to the group of permutations
of n − 1 objects. This statement, that a certain subgroup of the permutation
group is identical to the group of permutations of n − 1 objects, is an example
of a group isomorphism. Two groups G1 and G2, with binary operations · and ∗
respectively, are isomorphic (which means identical in their group properties)
if there is a 1-1 map f of G1 onto G2 which respects the group operations:
f(x · y) = f(x) ∗ f(y). (7)


The isomorphism f of our example: an element of the subgroup of the n-permutation group which permutes only the last n − 1 objects is mapped to
the same permutation in the (n − 1)-permutation group. Another example: let
G1 = (R⁺, ·), and let G2 = (R, +). [G1 is a group, with identity
e = 1, an inverse x⁻¹ for every x ∈ R⁺ and, e.g., 1 · (2 · 3) = (1 · 2) · 3.] Then if x ∈ G1,
f(x) = log x defines a map f : G1 → G2 which satisfies (7):
log(xy) = log x + log y.
The two groups are isomorphic and f is an isomorphism.
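A quick numerical spot-check of this isomorphism (the sample values are arbitrary):

```python
import math

# f(x) = log x respects the group operations of (R+, *) and (R, +):
x, y = 2.0, 3.5
assert math.isclose(math.log(x * y), math.log(x) + math.log(y))
# f is 1-1 onto R (its inverse is exp), so it is an isomorphism:
assert math.isclose(math.exp(math.log(x)), x)
```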
Another relation between groups is called a group homomorphism. It is like an
isomorphism, but the map may be many-to-one and may be into rather than onto.
Condition (7) must still be satisfied.
5. Linear algebra
A vector space (over R) is a set V which has a binary operation called +, with
which it is an Abelian group, and a multiplication by real numbers, (V, +, ·),
which satisfies the following axioms. Let x, y ∈ V and a, b ∈ R; then
(Vi) a · (x + y) = (a · x) + (a · y),
(Vii) (a + b) · x = (a · x) + (b · x),
(Viii) (ab) · x = a · (b · x),
(Viv) 1 · x = x.
The identity element of V is called 0. Some examples of vector spaces are:
(i) The set of all n × n matrices, where + means adding corresponding entries
and · means multiplying each entry by the real number.
(ii) The set of all continuous functions f(x) defined on the interval a ≤ x ≤ b.
A linear combination of vectors is an expression like
a x + b y + c z,
where x, y, z ∈ V and a, b, c ∈ R. A set of elements {x1, x2, . . . , xm} of V is
said to be linearly dependent if it is possible to find real numbers {a1, a2, . . . , am},
not all zero, for which
a1 x1 + a2 x2 + . . . + am xm = 0;
otherwise the set is linearly independent.


The set is a maximal linearly independent set if including any other vector of
V in it would make it linearly dependent. This means that any other vector in

V can be expressed as a linear combination of elements of a maximal set, and

so a maximal set forms a basis for V . For example, if V is the set of n n
real matrices, then one basis is the collection of the n2 different matrices that
each have zeroes everywhere except for a one in a single entry. Generally, the
number of vectors in a basis is called the dimension of V, dim V. Let the vectors
{xi, i = 1, . . . , n} be a basis. Then an arbitrary vector y is expressible as
y = Σ_i a_i x_i,
and the numbers {a_i, i = 1, . . . , n} are called the components of y on the basis
{xi}. As a complement, let us play the theorem-proof game.
Theorem 2. If V is a vector space over a field F, then a basis of V is a maximal
linearly independent set in V.
Proof 2. Let B = (x1, . . . , xm) be a basis of V. Then any vector y ∈ V \ B can
be expressed as a linear combination of the vectors of the basis B, i.e.,
y = a1 x1 + . . . + am xm,  ai ∈ F,
which means that B ∪ {y} is linearly dependent. So B is a linearly independent set
such that adding any vector to it yields a linearly dependent set, which in turn is
the definition of maximal. Q.E.D.
A subspace of a vector space V is a subset of V that is itself a vector space
(usually of smaller dimension). In particular it must include the zero vector and all
linear combinations of any of its elements. Any set of vectors S = {y1, . . . , ym}
is said to generate the subspace of V which is formed by all possible linear
combinations
a1 y1 + a2 y2 + . . . + am ym.
[This set formed by all linear combinations of S is denoted ⟨S⟩, which is itself a
vector space. Then, if S ⊂ V, the set ⟨S⟩ is the smallest vector subspace of
V which contains S.] The dimension of the subspace is the maximum number
of linearly independent vectors among the generators.
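A numpy sketch of this last statement (the generators are an illustrative choice): the rank of the matrix of generators gives the dimension of ⟨S⟩.

```python
import numpy as np

# Three generators of a subspace of R^3; y3 is a linear combination of
# y1 and y2, so the generated subspace <S> is only 2-dimensional.
y1 = np.array([1.0, 0.0, 0.0])
y2 = np.array([0.0, 1.0, 0.0])
y3 = 2.0 * y1 - 3.0 * y2
S = np.vstack([y1, y2, y3])

# dimension of <S> = maximum number of linearly independent generators
print(np.linalg.matrix_rank(S))   # 2
```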
To introduce the inner product between vectors we can first define the norm of
a vector. A normed vector space V is a vector space with a mapping from V
into R (i.e. a function that assigns each vector a real number called its norm),
n : V → R,  x ↦ n(x),
where the map satisfies the axioms
(Ni) n(x) ≥ 0 ∀ x ∈ V, and n(x) = 0 ⇔ x = 0;
(Nii) n(a x) = |a| n(x) ∀ a ∈ R, x ∈ V;
(Niii) n(x + y) ≤ n(x) + n(y) ∀ x, y ∈ V.

Indeed, n is a function, and there are many that can satisfy these axioms. For
example, let us consider Rn as a vector space, where vector addition is defined by
x + y = (x1 + y1, . . . , xn + yn),
and multiplication by real numbers by
a x = (a x1, . . . , a xn).
Then we can define a norm as the distance of a vector from the origin, as we did
in section 1:
n(x) = [(x1)² + (x2)² + . . . + (xn)²]^{1/2},
n′(x) = [α(x1)² + β(x2)² + . . . + γ(xn)²]^{1/2},  α, β, . . . , γ positive real numbers,
where n = d and n′ = d′. These two norms satisfy an additional axiom, the

parallelogram rule:
(Niv) [n(x + y)]² + [n(x − y)]² = 2[n(x)]² + 2[n(y)]².
Norms of this sort permit one to define a bilinear symmetric inner product
between two vectors:
x · y = ¼[n(x + y)]² − ¼[n(x − y)]².

The bilinearity means that
(a x + b y) · z = a(x · z) + b(y · z),
z · (a x + b y) = a(z · x) + b(z · y).
Symmetry means that
x · y = y · x.

Moreover, the inner product is positive-definite, i.e.
x · x ≥ 0, and x · x = 0 ⇔ x = 0.
Finally we get that
x · x = [n(x)]².
This norm n(x) on Rn is called the Euclidean norm. When we regard Rn
as a vector space with this norm we denote it by E^n and call it n-dimensional
Euclidean space. We remark the important difference between Rn and E^n: Rn
is simply the set of all the n-tuples (x1, . . . , xn), without any distance, vector
properties, or norms defined.
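The parallelogram rule and the recovery of the inner product from the norm can be verified numerically for the Euclidean norm (random test vectors, an illustration only):

```python
import numpy as np

def n(v):
    """Euclidean norm on R^n, as defined above."""
    return float(np.sqrt(np.sum(v ** 2)))

rng = np.random.default_rng(0)
x = rng.normal(size=3)
y = rng.normal(size=3)

# (Niv) the parallelogram rule:
assert np.isclose(n(x + y) ** 2 + n(x - y) ** 2,
                  2 * n(x) ** 2 + 2 * n(y) ** 2)

# the inner product built from the norm agrees with the usual dot product:
dot = 0.25 * n(x + y) ** 2 - 0.25 * n(x - y) ** 2
assert np.isclose(dot, float(np.dot(x, y)))
```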
To define the inner product and show it is bilinear and symmetric, only
(Ni) and (Niv) are needed. A pseudo-norm is one which violates (Ni) and (Niii):
the inner product of a vector with itself could be non-positive. For example,
the spacelike 4-vectors in Special Relativity under the Minkowski metric η =
diag(1, −1, −1, −1).
Furthermore, it is possible to define a vector space over the field of complex numbers, C, such that a, b ∈ C.
6. The algebra of square matrices

A linear transformation T on a vector space V is a map from V into itself
which obeys the rule of linearity, i.e.
T : V → V,  x ↦ T(x),
where linearity means
T(a x + b y) = a T(x) + b T(y).


Given a basis (e1, e2, . . . , en) for V, we can express any vector as a linear
combination of these basis vectors,
x = Σ_i a_i e_i,
and applying the linear transformation to this vector we get
T(x) = T(Σ_i a_i e_i) = Σ_i a_i T(e_i) = Σ_i Σ_j a_i T_ij e_j,
where T(e_i) = Σ_j T_ij e_j, i.e. the numbers T_ij are the components of the
vector T(e_i) on the basis. Likewise, the T_ij are called the components of T,
and can be represented as a square n × n matrix.
Another important algebraic result is that
Σ_i a_i (Σ_j B_ij c_j) = Σ_j (Σ_i a_i B_ij) c_j,
which means that the order in which the sums are performed makes no difference. Therefore, we can write
Σ_{i=1}^{n} Σ_{j=1}^{n} a_i B_ij c_j  or just  Σ a_i B_ij c_j,
saying that the sum is simply the sum of the various products over all possible
combinations of indices.
Two linear transformations T and U acting on the space V produce the
transformation UT:
UT(x) = U(T(x)) = U(Σ_{i,j} a_i T_ij e_j) = Σ_{i,j,k} a_i T_ij U_jk e_k = Σ_{i,k} a_i I_ik e_k,
where I_ik = Σ_j T_ij U_jk are the components of UT. We can see that if we represent T_ij as a
matrix (i being the row index and j the column index), and similarly for U_jk,
then I is just the matrix product of their respective matrices. Generally, if A_ij
and B_ij are matrices, then their matrix products are
(AB)_ik = Σ_j A_ij B_jk = Σ_j B_jk A_ij, (25)
(BA)_ik = Σ_j B_ij A_jk = Σ_j A_jk B_ij. (26)
By comparing (25) and (26) we can see that the order of the factors is
important, which means that the matrix product is, in general, not commutative.
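A minimal numeric example of this non-commutativity (the matrices are an arbitrary choice):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

AB = A @ B    # (AB)_ik = sum_j A_ij B_jk
BA = B @ A    # (BA)_ik = sum_j B_ij A_jk
print(AB)     # [[2 1] [4 3]]
print(BA)     # [[3 4] [1 2]]
assert not np.array_equal(AB, BA)   # the product is not commutative
```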
The transpose Aᵀ of a matrix A has elements
(Aᵀ)_ij = A_ji,
which means interchanging rows and columns. [If A is complex we define the adjoint
A† of A by (A†)_ij = Ā_ji, where the bar denotes complex conjugation.] The
unit matrix, I, has ones on the main diagonal and zeros elsewhere:
I_ij = δ_ij = 1 if i = j, 0 if i ≠ j,
where δ_ij is the Kronecker delta symbol. The identity transformation is the one
which maps any vector x into itself. The inverse A⁻¹ of a matrix A is a matrix
such that
A⁻¹A = AA⁻¹ = I.
Not every matrix has an inverse, for example the zero matrix. When an inverse
exists it is unique. Of course A is the inverse of A⁻¹. When A⁻¹ exists, A is
said to be nonsingular (otherwise it is singular). The set of all nonsingular
n × n matrices forms a group under matrix multiplication. The group
identity is I. This is a very important Lie group called GL(n, R) (the general linear group).
The determinant of a 2 × 2 matrix
A = ( a  b
      c  d )
is called det A, and is defined as
det(A) = ad − bc.


The determinant of an n × n matrix is defined by induction on (n − 1) × (n − 1) matrices
by the following rule of cofactors. The cofactor of an element a_ij of A, denoted
ã_ij, is defined as (−1)^{i+j} times the determinant of the (n − 1) × (n − 1)
matrix formed by eliminating from A the row and the column that a_ij belongs
to. Thus, in the matrix
A = ( a  b  c
      d  e  f
      g  h  k )   (31)
the cofactor of a is ek − fh, while that of f is bg − ah. Then the determinant
of A is defined as
det(A) = Σ_j a_ij ã_ij, for a fixed i.
For matrix (31), taking i = 1 gives
det(A) = a(ek − fh) + b(fg − dk) + c(dh − eg),
while i = 2 gives
det(A) = d(hc − bk) + e(ak − cg) + f(bg − ah),
both being equal.
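The cofactor rule translates directly into a recursive Python function, which we compare against numpy's determinant (the test matrix is an arbitrary example):

```python
import numpy as np

def det_cofactor(M):
    """Determinant by cofactor expansion along the first row."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0.0
    for j in range(n):
        # minor: eliminate row 0 and column j
        minor = [row[:j] + row[j+1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det_cofactor(minor)
    return total

M = [[1.0, 2.0, 3.0],
     [0.0, 4.0, 5.0],
     [1.0, 0.0, 6.0]]
print(det_cofactor(M))              # 22.0
print(np.linalg.det(np.array(M)))   # ~22.0, up to rounding
```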
The rows and the columns of an n × n matrix may each be thought of as giving
the components of a vector in some n-dimensional vector space.
The determinant of a matrix vanishes if and only if the n vectors defined by its
rows (or columns) are linearly dependent.
This follows from the determinant's properties:
(Di) if a single row is multiplied by a constant λ, the determinant is multiplied
by λ;
(Dii) if one row is replaced element-by-element by the sum of itself and any
multiple of another row, the determinant remains unchanged; and
(Diii) if any two rows are interchanged, the determinant changes sign.
The properties are true for both rows and columns.
Morphism: a structure-preserving map from one mathematical structure to another. E.g., in set theory, morphisms are functions; in linear algebra, linear transformations; in group theory, group homomorphisms; in
topology, continuous functions.
Homomorphism (same shape): a map which preserves selected structure between two algebraic structures. E.g.,
Group homomorphism: a homomorphism that preserves the group operation.
Isomorphism: a morphism that admits an inverse morphism.
Group isomorphism: a one-to-one map between two groups that preserves the group operation.