on Metric Spaces
Craig Calcaterra
29 November 2008
Version 1.0
Contents
Preface v
Introduction vii
0.1 Context and objective . . . . . . . . . . . . . . . . . . . . . . . . vii
0.2 Example: flows on $L^2(\mathbb{R})$ . . . . . . . . . . . . . . . . . . . . . . xi
0.3 Example: ﬂows on manifolds . . . . . . . . . . . . . . . . . . . . xiv
0.4 Example: ﬂows on a space with no linear structure . . . . . . . . xxi
0.5 Chapter outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxiii
0.6 Prerequisites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxv
0.7 Abridged version of the book . . . . . . . . . . . . . . . . . . . . xxv
0.8 Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . xxvi
I Theory 1
1 Flows 3
1.1 Generating ﬂows with arc ﬁelds . . . . . . . . . . . . . . . . . . . 3
1.1.1 The fundamental theorem . . . . . . . . . . . . . . . . . . 3
1.1.2 Local ﬂows . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.1.3 Global ﬂows . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.2 Forward ﬂows and ﬁxed points . . . . . . . . . . . . . . . . . . . 19
1.3 Invariant sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.4 Commutativity of ﬂows . . . . . . . . . . . . . . . . . . . . . . . 22
2 Lie algebra on metric spaces 25
2.1 Metric space arithmetic . . . . . . . . . . . . . . . . . . . . . . . 25
2.2 Metric space Lie bracket . . . . . . . . . . . . . . . . . . . . . . . 31
2.3 Covariance and contravariance . . . . . . . . . . . . . . . . . . . 34
3 Foliations 41
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.2 Local integrability . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.3 Commutativity of ﬂows . . . . . . . . . . . . . . . . . . . . . . . 58
3.4 The Global Frobenius Theorem . . . . . . . . . . . . . . . . . . . 60
3.5 Control theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
II Examples 71
4 Brackets on function spaces 73
5 Approximation with nonorthogonal families 83
5.1 Gaussians . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
5.1.1 First approximation formula . . . . . . . . . . . . . . . . . 83
5.1.2 Signal synthesis . . . . . . . . . . . . . . . . . . . . . . . . 84
5.1.3 Deconvolution . . . . . . . . . . . . . . . . . . . . . . . . 85
5.1.4 Coeﬃcient formulas . . . . . . . . . . . . . . . . . . . . . 88
5.1.5 Instability . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
5.2 Low-frequency trigonometric series . . . . . . . . . . . . . . . . . 90
5.2.1 Density in $L^2$ . . . . . . . . . . . . . . . . . . . . . . . . . 90
5.2.2 Coeﬃcient formulas . . . . . . . . . . . . . . . . . . . . . 92
5.2.3 Damping gives a stable family . . . . . . . . . . . . . . . . 97
6 Partial diﬀerential equations 101
6.1 Metric space arithmetic . . . . . . . . . . . . . . . . . . . . . . . 101
6.2 PDEs as arc ﬁelds . . . . . . . . . . . . . . . . . . . . . . . . . . 104
7 Flows on $H(\mathbb{R}^n)$ 107
7.1 IFS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
7.2 Continuous IFS . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
7.3 Fixed points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
7.4 Cyclically attracted sets . . . . . . . . . . . . . . . . . . . . . . . 114
7.5 Control theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
8 Counterexamples 119
Appendix A: Metric spaces 123
.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
.2.1 Regularity . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
.2.2 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . 130
.3 Geometric objects . . . . . . . . . . . . . . . . . . . . . . . . . . 131
.3.1 Triangles . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
.3.2 Metric coordinates . . . . . . . . . . . . . . . . . . . . . . 132
.3.3 Conversion formulas . . . . . . . . . . . . . . . . . . . . . 133
Appendix B: ODEs as vector ﬁelds 137
Appendix C: Numerical diﬀerentiation 141
List of notation 151
Preface
This book explores the subject of metric geometry using continuous dynamics.
Metric geometry is currently attracting intense interest, due to Perelman’s
solution of the Poincaré Conjecture and the influence of Gromov’s ideas on
string theory in physics. Despite this advanced pedigree, metric geometry begins
at a basic level requiring no more than an undergraduate introduction to point-set
topology and the definition of a distance metric. The novel perspective of
this text is its focus on using flows on an abstract metric space to crack into
geometric objects such as foliations. The abstract environment allows us to
pinpoint the necessary ideas to make all our analytic constructions—we employ
the bare minimum deﬁnitions for creating dynamics, geometric decompositions,
and approximations on metric spaces. This book is written with students in
mind, with the intention of using this minimum apparatus to make learning and
understanding the ideas easier. Hopefully the treatment will be of interest to
researchers as well, being the ﬁrst uniﬁed presentation of this dynamic approach
to metric geometry. Further, researchers can use this abstract environment to
test the limits of their understanding of fundamental constructions such as ﬂows,
Lie derivatives, foliations, holonomy and connections.
Introduction
In this chapter the case is made for the importance of studying ﬂows on a metric
space. The concept of a metric space is the deepest point of contact between
geometry and analysis; we gain new perspective on these subjects by generalizing
several of their results to metric spaces. The generalized Fundamental Theorem
of Ordinary Diﬀerential Equations and Frobenius’ Foliation Theorem are the
major theoretical results of this book. The ﬁrst theorem belongs to analysis
and the second to geometry.
The greater generality also gives a richer palette for mathematical modeling, as demonstrated with novel dynamics on $H(\mathbb{R}^n)$, the space of nonempty compact subsets of $\mathbb{R}^n$. Innovative dynamics arise even on well-studied spaces. E.g., geometric control theory on function spaces leads to our centerpiece example: low-frequency trigonometric series can approximate any $L^2$ function on any interval (Theorem 94 and Example 95), which the reader can turn to immediately, before learning the details of the metric space dynamics from which the idea was conceived.
0.1 Context and objective
A metric space $(M, d)$ is a set $M$ with a function $d : M \times M \to \mathbb{R}$, called the metric, which is positive, definite, symmetric and satisfies the triangle inequality:
(i) $d(x, y) \ge 0$ positivity
(ii) $d(x, y) = 0$ iff $x = y$ definiteness (or nondegeneracy)
(iii) $d(x, y) = d(y, x)$ symmetry
(iv) $d(x, y) \le d(x, z) + d(z, y)$ triangle inequality
for all $x, y, z \in M$. A metric space is locally complete if for each element $x \in M$ there exists an $r > 0$ such that the closed ball
\[ B(x, r) := \{ y \in M \mid d(x, y) \le r \} \]
is complete. Every major result in this book is written at this generality, so our constant friend is the triangle inequality, exploited without acknowledgement.
The most important metric spaces include $n$-dimensional Euclidean space $\mathbb{R}^n$, Riemannian manifolds and function spaces such as $L^2(\mathbb{R})$. Appendix A gives definitions for these and other examples and lists general properties of metric spaces.
The term “continuous dynamics”, as opposed to “discrete dynamics”, means
the study of ﬂows:
Definition 1 A flow is a continuous map $F : M \times \mathbb{R} \to M$ which, for all $x \in M$ and $s, t \in \mathbb{R}$, satisfies
(i) $F(x, 0) = x$
(ii) $F(F(x, s), t) = F(x, s + t)$.
More efficient notations are
\[ F_t(x) := F(x, t) =: F_x(t) \]
with the space variable $x$ or time parameter $t$ in the subscript, depending on which quantity is active in a calculation. Flows will typically be denoted with $F$, $G$, or $H$.
For fixed $t$, a flow gives a map $F_t : M \to M$ which is necessarily an automorphism, i.e., a homeomorphism of $M$ to itself, since $F_{-t}$ is the continuous inverse of $F_t$:
\[ F_{-t} \circ F_t = F_0 = \mathrm{Id}. \]
A flow may thus be viewed as a 1-parameter family of homeomorphisms. From another point of view, the $\mathbb{R}$ parameter $t$ often signifies time, and the map $F_x : \mathbb{R} \to M$ then describes the motion of a fixed $x$ through its position/configuration space $M \ni F_x(t)$ for all $t \in \mathbb{R}$ with initial condition $x = F_x(0)$.
Our chief interest is to use continuous dynamics to explore the geometry
of general metric spaces. Insights into geometric structure, in turn, give us
deeper understanding of possible dynamics. It is surprising how many important
geometrical ideas require only a metric for their deﬁnition. Balls and spheres,
of course, are utilized at the inception of metric spaces. A more extensive list
of static geometric deﬁnitions (ellipses, cylinders, etc.) appears on page 133.
Ekeland’s variational principle and the Mountain Pass Theorem have natural
expressions on a metric space [43]. For many decades algebraic topologists
have been aware that a topology without further algebraic structure is suﬃcient
to deﬁne geometrically insightful indices, such as the fundamental homotopy
group or the homological Conley index [25]. More important for this book,
geometric notions such as curves, surfaces, tangency, and transversality have
natural expressions on metric spaces. The generalization of the Fundamental
Theorem of Ordinary Diﬀerential Equations to metric spaces ([52], [7], [18], [30])
and Frobenius’ Foliation Theorem (Chapter 3) are the major theoretical results
explicated in this text. Further, length, speed, angles, norm, curvature [14],
the Lie derivative (Chapter 2), gradients ([41], [3]) and many others also have
natural and fruitful generalizations. The spirit that guides the development of
metric geometry is the conviction that every major geometrical result has a
substantial expression on metric spaces.
Metric geometry’s goal may be summarized as the attempt to generalize Riemannian geometry to metric spaces; a complementary point of view holds that Riemannian geometry is a specialized pursuit within the wider goal of exploring the geometry of general metric spaces. Hilbert’s 4th and 23rd problems are a good place to start the long history of metric geometry. The major contributors to metric geometry are unprintably populous, but an embarrassingly short list of highlights, particularly relevant to the goal of this book, includes Menger [48], A. D. Aleksandrov [2], Busemann [15], and Gromov [37], who take the notions “curve”, “path” or “arc” in $M$ as primary objects of study. Different authors have contradictory definitions. Let us define a curve to be a continuous map $c : I \to M$, where $I \subset \mathbb{R}$ is a subinterval with nonempty interior; and define the path of $c$ as its image, the set $\{ c(t) : t \in I \} \subset M$. An arc is a curve with a special property, e.g., it may minimize distance or energy. We will reserve the freedom to use the term “arc” in this loose, evocative manner as any distinguished curve. The length $L(c)$ of a curve $c$ is the supremum of the sum
\[ \sum_{i=1}^{n} d(c(t_i), c(t_{i-1})) \]
taken over all finite partitions $\{ t_0, t_1, \ldots, t_n \}$ of its domain $I$. A curve $c : I \to M$ has speed bounded by $\rho$ with $\rho \ge 0$ if $d(c(s), c(t)) \le \rho |s - t|$ for all $s$ and $t$ in $I$. The speed of $c$ is the infimum of all such bounds $\rho$. The length of the curve restricted to any interval $[t_1, t_2] \subset I$ is then less than or equal to $\rho |t_1 - t_2|$, and the notion of speed as length-traveled-divided-by-time is still valid (infinitesimally and on average) in metric spaces.
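The definition of length as a supremum of partition sums can be tried out numerically. The following is a minimal sketch, assuming the unit circle in the Euclidean plane as the curve; the helper names `partition_length`, `euclid` and `circle` are illustrative and do not come from the text. Refining the partition can only increase the sum (by the triangle inequality), and since the circle has speed bounded by 1, the sums never exceed $1 \cdot 2\pi$.

```python
import math

def partition_length(curve, dist, a, b, n):
    """Sum of d(c(t_i), c(t_{i-1})) over the uniform partition of [a, b] into n pieces."""
    ts = [a + (b - a) * i / n for i in range(n + 1)]
    return sum(dist(curve(ts[i]), curve(ts[i - 1])) for i in range(1, n + 1))

def euclid(p, q):
    """Euclidean metric on the plane."""
    return math.dist(p, q)

def circle(t):
    """Unit circle, a curve with speed bounded by rho = 1."""
    return (math.cos(t), math.sin(t))

# The 4096-piece partition refines the 8-piece one, so its sum is larger.
coarse = partition_length(circle, euclid, 0.0, 2 * math.pi, 8)
fine = partition_length(circle, euclid, 0.0, 2 * math.pi, 4096)
print(coarse, fine, 2 * math.pi)
```

Only the metric `dist` is used, so the same function applies unchanged to a curve in any metric space for which a distance function is available.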
Much of the differential calculus may also be generalized, inspired by the observation that tangency may be characterized using solely the metric (lines (1) and (2) below) without requiring any algebraic properties for the underlying space. Differential equations and their solutions are thereby expressible on general metric spaces. Stripping manifolds of their local-linear structure and leaving only the ability to compare distances between points helps us understand the essence of these geometrical and dynamical facts, and it has the added benefit of giving occasionally stronger theorems and a wider descriptive power that accompanies the more general framework. But our ulterior motivation is: focusing on the metric alone often makes things easier. It is easier to prove and understand a result when there are fewer assumptions; and it is easier to construct examples when we are not restricted to a highly structured environment, such as a finite-dimensional manifold. Throughout the book, though, we rigidly adhere to the philosophy of faithfully and naturally generalizing analysis and geometry on manifolds; this allows us to use the voluminous library of traditional results whenever our generalized examples inhabit a more structured environment.
Two curves $c_i : I_i \to M$ for $i = 1, 2$ are tangent at a point $t \in I_1 \cap I_2$ if
\[ \lim_{h \to 0} \frac{d(c_1(t + h), c_2(t + h))}{h} = 0. \tag{1} \]
This definition faithfully generalizes tangency on $M = \mathbb{R}^n$ or any normed linear space. For instance a curve $c : I \to \mathbb{R}^n$ is differentiable with $c'(t_0) \in \mathbb{R}^n$ for $t_0 \in I$ if and only if $c$ is tangent to the curve $l(t) := c(t_0) + (t - t_0) c'(t_0)$, which is a line in the direction of $c'(t_0)$, since
\[ \lim_{h \to 0} \frac{d(c(t_0 + h), l(t_0 + h))}{h} = \lim_{h \to 0} \left\| \frac{c(t_0 + h) - c(t_0)}{h} - c'(t_0) \right\| \tag{2} \]
where the metric $d$ is derived from the norm, $d(x, y) := \|x - y\|$. So the smoothness of a curve $c$ is determined by its tangency with a special curve, an arc, $l$.
Remember (Appendix B) nearly any ODE may be rewritten as a vector field problem
\[ x' = V(x) \]
where $V : \mathbb{R}^n \to \mathbb{R}^n$ is the vector field and a solution is a curve $\sigma_{x_0} : I \to \mathbb{R}^n$ with initial condition $\sigma_{x_0}(0) = x_0 \in \mathbb{R}^n$ satisfying
\[ \frac{d}{dt} \sigma_{x_0}(t) = V(\sigma_{x_0}(t)). \]
The fundamental result of ODEs is: if $V$ is Lipschitz continuous then there exists a collection of solutions which generates a unique local flow $F(x, t) := \sigma_x(t)$. We generalize this result in Chapter 1 using the idea contained in (2) that a curve can represent a vector or derivative. In analogy with vectors on a linear space, we study arcs on a metric space. Whereas the vector field $V$ specifies a direction $V(x) \in \mathbb{R}^n$ at each point $x \in \mathbb{R}^n$ to which solutions must be tangent, an arc field $X$ specifies a direction with an arc at each point $x$. So an arc field is a map $X : M \times [-1, 1] \to M$ with $X(x) : [-1, 1] \to M$ being the arc at the position $x \in M$.
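To make the generalization claimed in the previous paragraph more concrete, let us show how every vector field $V$ may be naturally represented as an arc field $X$. Define $X : \mathbb{R}^n \times [-1, 1] \to \mathbb{R}^n$ by $X(x, t) := x + tV(x)$. If $\sigma := \sigma_{x_0} : I \to \mathbb{R}^n$ is a solution to the vector field problem, then $\sigma$ is also tangent to $X$ at each value $t \in I$ in the sense that
\[ \lim_{h \to 0} \frac{d(\sigma(t + h), X(\sigma(t), h))}{h} = 0. \]
To check this notice
\[
\begin{aligned}
\lim_{h \to 0} \frac{d(\sigma(t + h), X(\sigma(t), h))}{h}
&= \lim_{h \to 0} \frac{d(\sigma(t + h), \sigma(t) + hV(\sigma(t)))}{h} \\
&= \lim_{h \to 0} \left\| \frac{\sigma(t + h) - \sigma(t)}{h} - V(\sigma(t)) \right\| \\
&= \left\| \frac{d}{dt} \sigma(t) - V(\sigma(t)) \right\| \\
&= 0.
\end{aligned}
\]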
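The tangency between a solution and the arc field $X(x, t) = x + tV(x)$ can be observed numerically. The sketch below assumes one particular example that is not taken from the text: the rotation field $V(x, y) = (-y, x)$ on $\mathbb{R}^2$, whose solution through $(1, 0)$ is $\sigma(t) = (\cos t, \sin t)$. The function names are illustrative. The quotient $d(\sigma(t+h), X(\sigma(t), h))/h$ should shrink roughly like $h/2$.

```python
import math

def V(p):
    """Rotation vector field on the plane (an assumed example)."""
    x, y = p
    return (-y, x)

def X(p, h):
    """Arc field built from V: the straight arc x + h V(x)."""
    v = V(p)
    return (p[0] + h * v[0], p[1] + h * v[1])

def sigma(t):
    """Exact solution of x' = V(x) with sigma(0) = (1, 0)."""
    return (math.cos(t), math.sin(t))

def tangency_quotient(t, h):
    """d(sigma(t + h), X(sigma(t), h)) / h with the Euclidean metric."""
    return math.dist(sigma(t + h), X(sigma(t), h)) / h

quotients = [tangency_quotient(0.7, h) for h in (1e-1, 1e-2, 1e-3)]
print(quotients)
```

Replacing `math.dist` by another metric and `X` by another family of arcs gives the same test on a space with no linear structure.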
The motivation for generalizing the calculus is to analyze dynamics (i.e., flows) on such archetypical examples of metric spaces as the infinite-dimensional space $L^2(\mathbb{R})$, manifolds, and the space of nonempty compact subsets of the plane $H(\mathbb{R}^2)$.
0.2 Example: flows on $L^2(\mathbb{R})$
The space of square integrable functions $L^2(\mathbb{R})$ (see Appendix A.1 for a precise definition) is a linear space and may seem an unlikely candidate to yield novel results through our program of abstracting classical results to metric spaces while avoiding the use of any linear structure. However, for this most elementary of all infinite-dimensional spaces, this Hilbert space, the linear structure is actually a hindrance to understanding some of its most basic flows.
Example 2 On $M := L^2(\mathbb{R})$, the (Hilbert) space of square integrable functions of one real variable, the metric is derived from the $L^2$ norm:
\[ d(f, g) := \left( \int_{\mathbb{R}} (f - g)^2 \, d\mu \right)^{1/2} = \|f - g\|_2. \]
What is the simplest example of a flow on $M$? For many visual thinkers, translating the graph leaps to mind:
\[ F(f, t)(x) := f(x + t). \]
[Figure: graphs of $f(x + t)$ and $f(x)$]
The two flow properties are automatically verified: (i) $F(f, 0) = f$ and (ii) $F(F(f, s), t) = F(f, s + t)$ for any $f \in L^2(\mathbb{R})$. In fact $\{ F(\cdot, t) \mid t \in \mathbb{R} \}$ is clearly a family of isometries of $M$.
This example is so perfectly regular as to seem trivial. However a confounding blow to our intuition is that for most initial conditions $f$, the curves $F(f, \cdot)$ are nondifferentiable with respect to either Gâteaux or Fréchet differentiability (notions we won’t use and won’t define). To get a feel for this situation, consider the initial condition $f := \chi_{[0,1]}$. Here $\chi_S$ represents the characteristic function of a set $S$, i.e.,
\[ \chi_S(x) := \begin{cases} 1 & \text{for } x \in S \\ 0 & \text{otherwise.} \end{cases} \]
For $f := \chi_{[0,1]}$
\[ \frac{F(f, t + h) - F(f, t)}{h} = \frac{1}{h} \left( \chi_{[1+t,\,1+t+h]} - \chi_{[t,\,t+h]} \right) \]
has norm $\sqrt{2/h}$ and does not converge to a member of $L^2(\mathbb{R})$ as $h \to 0$. The linear structure of the vector space $L^2(\mathbb{R})$ is not helping in our quest to analyze $F$.¹
Even more fundamentally bothersome is the fact that the speed of the flow is not locally bounded, i.e., the speed of the curves $F(f, \cdot)$ can become arbitrarily large on any neighborhood in $M$.
(Here we are referring to the notion of speed defined technically above. The speed of $F(f, \cdot)$ is not related to the rate at which the graph is translated along the $\mathbb{R}$ axis, which is constantly 1. The metric is biased toward the structure of addition of functions in order to achieve a norm and is less sensitive to comparing how similar the graphs appear. Reread the definitions carefully so as not to be misled by initial intuition.)
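The blow-up of the difference quotient for the translation flow can be seen with a crude numerical integration. This is a minimal sketch, assuming a midpoint Riemann sum on $[-3, 3]$ as a stand-in for the $L^2$ norm; the helper names and the grid spacing are illustrative choices, not part of the text. For $f = \chi_{[0,1]}$ the computed norm should track $\sqrt{2/h}$, which grows without bound as $h \to 0$.

```python
import math

def indicator(x):
    """Characteristic function of [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def translate(f, t):
    """Translation flow F(f, t)(x) := f(x + t)."""
    return lambda x: f(x + t)

def l2_norm(f, lo=-3.0, hi=3.0, dx=1e-4):
    """Midpoint Riemann-sum approximation of the L^2 norm on [lo, hi]."""
    n = int(round((hi - lo) / dx))
    total = sum(f(lo + (k + 0.5) * dx) ** 2 for k in range(n))
    return math.sqrt(total * dx)

def quotient_norm(f, t, h):
    """Norm of the difference quotient (F(f, t + h) - F(f, t)) / h."""
    ft, fth = translate(f, t), translate(f, t + h)
    return l2_norm(lambda x: (fth(x) - ft(x)) / h)

results = {h: quotient_norm(indicator, 0.0, h) for h in (0.4, 0.1, 0.025)}
for h, q in results.items():
    print(h, q, math.sqrt(2.0 / h))
```

The numerator has norm $\sqrt{2h}$, which does tend to zero, so the flow is continuous in $t$; it is only the division by $h$ that diverges.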
This difficulty with translation is at the heart of many obstacles to answering the well-posedness of partial differential equations (PDEs), since translation is the solution of
\[ \frac{\partial F}{\partial t} = \frac{\partial F}{\partial x}. \]
This is the simplest nontrivial partial differential equation and yet we already see the “unbounded” property of some functional analysis operators rearing its head. This warns us about the difficulties inherent in transporting the language and intuition of continuous dynamics in finite dimensions to infinite dimensions or more general metric spaces. $L^2$ is a beautiful, complete metric space which is natural to consider as an environment for solving PDEs, but the pitfall mentioned in this paragraph may lead us to widen our search to other metric spaces.
Example 3 Another basic flow on $M := L^2(\mathbb{R})$ is vector space translation, $G : L^2(\mathbb{R}) \times \mathbb{R} \to L^2(\mathbb{R})$ given by
\[ G(f, t) := f + tg \]
for any choice of $g \in L^2(\mathbb{R})$. The evolution of the graph of $G_t(f)$ as $t$ changes is not quite as easy to visualize as Example 2; but since $G$ respects the vector space structure, it is much tamer analytically. Verifying the flow properties is trivial. Continuity in particular follows immediately from the properties of the norm. In fact the speed is globally bounded by $\rho := \|g\|$ since
\[ d(G(f, t), G(f, s)) = \|(f + tg) - (f + sg)\| = |t - s| \, \|g\|. \]
$G(\cdot, t)$, like $F(\cdot, t)$ above, is again a family of isometries of $M$:
\[ \|G(f_1, t) - G(f_2, t)\| = \|(f_1 + tg) - (f_2 + tg)\| = \|f_1 - f_2\|. \]
¹ This difference quotient does, of course, converge to a difference of Dirac point distributions $\delta_{t+1} - \delta_t$ if we bother to define the wider notion of a distribution in the linear dual. Admittedly we’re being overly critical of the value of linearity at this stage, but read on and note for yourself why even the use of covectors won’t simplify the analysis.
How do our two flows $F$ and $G$ from Examples 2 and 3 compare? How do they interact on $M$, and what does this tell us about $M$? Let us determine the reachable set for this pair of flows. The reachable set is an object of fundamental concern in the subject of control theory, which we take up in greater detail in §3.5. Imagine we are running some process which allows us to apply either flow $F$ or $G$ successively, at will, to an initial condition in our configuration space $M$. The reachable set starting from the initial point $f$ is then defined as
\[ R_{F,G}(f) := \left\{ G_{s_n} F_{t_n} G_{s_{n-1}} F_{t_{n-1}} \cdots G_{s_1} F_{t_1}(f) \in M \mid s_i, t_i \in \mathbb{R},\ n \in \mathbb{N} \right\}. \]
Here we are dropping the composition parentheses, using $G_s F_t(f) = G(F(f, t), s)$ to simplify notation; the general associativity of composition means the extra parentheses are unnecessary. So starting with the initial condition $f \in M$ we can steer our process in finite time to any configuration in $R_{F,G}(f) \subset M$ by judiciously applying $F$ and $G$ by various amounts $s_i$ and $t_i$.
If $R_{F,G}(f)$ is dense in $M$, then $M$ is said to be controllable by $F$ and $G$. For instance we could imagine $M$ consists of the space of possible signals a circuit can generate in a looped line. $G$ then represents adding a waveform in the shape of the graph of $g$; and $F$ would correspond to time lag as the signal naturally cycles around the loop. The reachable set in this idealized scenario represents the possible signals that can be generated with our circuit.
As a first inquiry into the nature of $R_{F,G}(f)$ for our two types of translation on $L^2(\mathbb{R})$, let us test whether $F$ and $G$ commute, i.e., does $F(G(f, s), t) = G(F(f, t), s)$? If so the reachable set will be merely a two-dimensional subset of the infinite-dimensional space $M = L^2(\mathbb{R})$, since any member collapses to the simple representation
\[ G_{s_n} F_{t_n} G_{s_{n-1}} F_{t_{n-1}} \cdots G_{s_1} F_{t_1}(f) = G_{s_1 + \cdots + s_n} F_{t_1 + \cdots + t_n}(f) = G_s F_t(f). \]
Perhaps surprisingly $F$ and $G$ are usually far from commutative, and by how much depends on the function $g$:
\[
\begin{aligned}
\left[ F(G(f, s), t) - G(F(f, t), s) \right](x)
&= [f(x + t) + s g(x + t)] - [f(x + t) + s g(x)] \\
&= s \, [g(x + t) - g(x)].
\end{aligned}
\tag{3}
\]
From the point of view of differentiable flows on a manifold, we would at least expect
\[ d(F(G(f, t), t), G(F(f, t), t)) = O(t^2) \]
and, in fact, continuing from line (3) we calculate
\[ \lim_{t \to 0} \frac{F(G(f, t), t) - G(F(f, t), t)}{t^2} = \frac{dg}{dx} \]
if $g$ is differentiable. Following the ideas of geometric control theory, this break in holonomy suggests the reachable set is more than two-dimensional. In fact $R_{F,G}(f)$ should be dense in the span of the set of all Lie brackets generated by $F$ and $G$.
This turns out to be exactly correct:
\[ \overline{\operatorname{span}} \left\{ \frac{d^n}{dx^n} g \;\middle|\; n \in \mathbb{N} \right\} \subset \overline{R_{F,G}(f)} \tag{4} \]
where $\overline{S}$ denotes the topological closure of a set $S \subset M$, and $\overline{\operatorname{span}}\, S$ denotes the closed linear span of $S$ in $M$. There are algorithms for steering any initial condition to a member of the reachable set, and for many choices of $g \in M$, e.g., $g(x) := e^{-x^2}$, we find that all of $M$ is controllable. Continuing the application of this model to signal processing above, this means that for the correct choice of $g$, any signal can be synthesized by alternately applying $F$ and $G$. In the course of this book we will clarify the terminology and ideas surrounding these claims, culminating in §5.1 with Theorem 88.
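The failure of $F$ and $G$ to commute, computed in line (3), can be checked pointwise with functions represented as Python callables. This is a minimal sketch under stated assumptions: $g(x) = e^{-x^2}$ as in the text, an arbitrary smooth $f$ chosen only for illustration, and helper names (`F`, `G`, `commutator_quotient`) that are not from the book. The quotient by $t^2$ should approach $dg/dx$.

```python
import math

def F(f, t):
    """Translation flow: F(f, t)(x) = f(x + t)."""
    return lambda x: f(x + t)

def G(f, s, g):
    """Vector space translation: G(f, s) = f + s g."""
    return lambda x: f(x) + s * g(x)

def g(x):
    return math.exp(-x * x)

def dg(x):
    """Exact derivative of g, used only for comparison."""
    return -2.0 * x * math.exp(-x * x)

def f(x):
    """An arbitrary initial condition (assumed for illustration)."""
    return 1.0 / (1.0 + x * x)

def commutator_quotient(x, t):
    """[F(G(f, t), t) - G(F(f, t), t)](x) / t^2, cf. line (3) with s = t."""
    left = F(G(f, t, g), t)(x)
    right = G(F(f, t), t, g)(x)
    return (left - right) / t ** 2

for x in (-1.0, 0.5, 2.0):
    print(x, commutator_quotient(x, 1e-3), dg(x))
```

The initial condition $f$ cancels out of the difference exactly as in line (3), so any other choice of `f` gives the same quotient up to rounding.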
One corollary is that low-frequency trigonometric series of the form
\[ \sum_{n=1}^{N} a_n e^{ix/n} \]
are dense in $L^2[a, b]$ for any interval $[a, b]$, Theorem 90. Here’s a quick motivation for this shocking fact: approximate $f(x) \approx \sum_{n=1}^{N} b_n x^n$, then notice
\[ x^n = i^{-n} \left. \frac{d^n}{dt^n} e^{itx} \right|_{t=0}. \]
Now approximate the derivatives with finite differences (the formula is reviewed in Example 129). Example 95 gives the coefficients for approximating $x^3$ to arbitrary accuracy using just 3 low-frequency sine functions.
To achieve these results we generalize the Chow-Rashevsky Theorem to metric spaces: the closure of the reachable set is the closure of the integral manifold to the distribution consisting of the set of all arc fields bracket-generated from $F$ and $G$, which gives line (4). The proof uses generalized versions of foliations (Chapter 3), Lie brackets (§2.2), geometrical distributions (§3.4), and an arithmetic of flows which works on a geometric as well as algebraic level (§2.1, Theorem 43). To give a firm footing for such complicated constructions using only the abstract building blocks of metric spaces, we devote Chapter 1 to carefully establishing the fundamental properties of existence, uniqueness and regularity of flows on $M$ and Chapter 2 to the algebraic properties.
0.3 Example: ﬂows on manifolds
The ultimate source of inspiration for metric space generalization of geometrical and dynamical ideas is the theory of differentiable manifolds, in addition to being the colliery for our examples. The elaborate apparatus constructed to do calculus on a differentiable manifold is remarkably successful in extending traditional calculus on $\mathbb{R}^n$ to a more general setting, indispensable in fundamental areas of mathematics and physics. Much of this apparatus is ready to be further extended to metric spaces.
Digesting the brief overview of differentiable manifolds in this section is not necessary to digest the rest of the material in this book, but a familiarity with manifold theory will allow you to anticipate all our results. This is an apology to the beginner for how dense the next paragraph is. [23] or [12] or many other proper introductions expand the following paragraph to chapters; [1] gives an introduction to infinite-dimensional manifolds, or Banach manifolds. We make no pretense toward rigor in this section, but promise to rectify this imprecision in the presentation of the generalizations throughout the remainder of the text. We focus instead on the properties of manifolds which are naturally generalized to metric spaces.
Example 4 Define the torus
\[ T^2 := S^1 \times S^1 = \{ (x \bmod 2\pi, y \bmod 2\pi) = (x, y) \bmod 2\pi \mid x, y \in \mathbb{R} \} \]
where $x \bmod 2\pi$ is the remainder upon dividing $x$ by $2\pi$. The torus $T^2$ is most easily geometrically visualized when embedded in $\mathbb{R}^3$ as a doughnut: First embed the circle $S^1$ in $\mathbb{R}^3$ via a map such as $c(t) := (0, 3 + \sin t, \cos t)$, then rotate the circle around the $z$-axis with the embedding $S : T^2 \to \mathbb{R}^3$
\[ S(s, t) := \begin{pmatrix} \cos s & -\sin s & 0 \\ \sin s & \cos s & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 0 \\ 3 + \sin t \\ \cos t \end{pmatrix}. \]
(This particular construction is easily extended to give more 2-dimensional manifolds such as found on pp. 43-47.)
Figure 1: $T^2$ embedded in $\mathbb{R}^3$
$T^2$ is an archetypical example of a manifold, which by definition is a set which is locally flat; this means every point $x \in T^2$ has a neighborhood $U$ with $\phi : U \to V$ a homeomorphism onto its image in the topological vector space $V$. For the torus $V = \mathbb{R}^2$, which makes it 2-dimensional, hence the superscript in $T^2$. $\phi$ is called a chart for $T^2$, which gives local coordinates on the manifold. $T^2$ is also a differentiable manifold, which means the charts “match up nicely”: for any two charts $\phi_i : U_i \to V$ for $i = 1, 2$ the composition $\phi_2 \circ \phi_1^{-1}$ is a differentiable map from $V$ to $V$ wherever it is defined. The existence of these nicely-matched-up charts means any calculus done on $V$ may be applied to $T^2$. Charts are easy to construct for $T^2$; the only difficulty arises near a point $(x, y) \in T^2$ when $x = 0$ or $y = 0$, which the reader is encouraged to resolve.
In Figure 1 you can see the locally flat patches of $\mathbb{R}^2$ nicely matching up as grid lines to form the manifold, but this is an aberration amongst manifolds. Excepting the Klein bottle, no other compact 2-dimensional manifold has such perfectly aligned patches globally. Consider, e.g., a globe where longitude and latitude grid lines degenerate at the poles; results from topology prove that any way you attempt to construct such a grid on a globe will end with at least one point of degeneration [40].
Particularly important for our investigation of geometry and dynamics is the concept of the tangent space of a manifold, roughly the space of all possible directions in which you can move from a point within the manifold. The tangent space may be defined in several equivalent ways; let’s outline the most relevant for our purposes. A tangent vector $v$ at a point $x \in M$ is an equivalence class of curves in the manifold $c : (-\delta_c, \delta_c) \to M$ with $c(0) = x$ under the equivalence relation that $c_1 \sim c_2$ if $(\phi \circ c_1)'(0) = (\phi \circ c_2)'(0)$ for any chart $\phi$, i.e., $c_1$ and $c_2$ are differentiable and tangent to each other at $x$. This equivalence relation distills the idea of a “direction with magnitude” $v$ located at the position $x$ into an abstract mathematical object, represented by an explicit, constructible object $c'(0)$. The tangent space at $x$ is the set of all equivalence classes under this relation, denoted $T_x M$. The tangent bundle $TM$ is the collection of all tangent spaces, i.e., the disjoint union $TM := \bigsqcup_{x \in M} T_x M$, representing all possible directions of motion $(x, v)$. A vector field is a map $f : M \to TM$ with $f(x) \in T_x M$, and represents a rule for motion on the manifold. A solution to a vector field is a curve $\sigma : (\alpha, \beta) \to M$ which is tangent to the vector field at all points, i.e., a curve which follows the rule. More concretely, if the translation map $\tau_s : \mathbb{R} \to \mathbb{R}$ is given by $\tau_s(t) := s + t$ then $\sigma \circ \tau_s : (\alpha - s, \beta - s) \to M$ has $\sigma \circ \tau_s \in f(\sigma(s))$ for all $s \in (\alpha, \beta)$. In other words $\sigma$ follows the rules of motion of $f$ through $M$. Assuming $0 \in (\alpha, \beta)$ we say $\sigma(0) \in M$ is the initial condition of the solution, the place where the motion $\sigma$ begins. By the Fundamental Theorem of ODEs (Appendix B) applied to charts, we always have solutions to a smooth vector field, and we can combine them to give a local flow. Conversely, any differentiable flow on a manifold is generated by a vector field.
Example 5 A simple flow that spirals around the torus $F : T^2 \times \mathbb{R} \to T^2$ is defined by $F_t(x, y) = (x + t, y + at) \bmod 2\pi$. If $a$ is a rational number then the path $F_{(x,0)}(\mathbb{R})$ closes and is homeomorphic to the circle $S^1$ (Figure 2). These paths are evocatively described as toral helices. The different paths, starting from different points $(x, 0)$, partition $T^2$. This partition is an example of a 1-dimensional foliation of the torus. Two more foliations perpendicular to each other are illustrated by grid lines in Figure 1 above.
When $a$ is not a rational number the path $F_{(x,0)}(\mathbb{R})$ does not close on itself and is homeomorphic to $\mathbb{R}$, as a dense subset of $T^2$ (Figure 3). Still there are an infinite number of disjoint paths, starting from different points $(x, 0)$, which again foliate $T^2$.
Figure 2: Rational flow path
Figure 3: Irrational flow path
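The flow of Example 5 is simple enough to test directly. The following sketch checks the flow property and the closing of paths for a rational slope; the helper names `torus_flow` and `torus_dist`, the flat metric used on the torus, and the sample values $a = 3/2$ and $a = \sqrt{2}$ are assumptions made for illustration. For $a = p/q$ in lowest terms the path returns to its starting point at time $t = 2\pi q$.

```python
import math

TWO_PI = 2.0 * math.pi

def torus_flow(point, t, a):
    """F_t(x, y) = (x + t, y + a t) mod 2 pi."""
    x, y = point
    return ((x + t) % TWO_PI, (y + a * t) % TWO_PI)

def torus_dist(p, q):
    """Flat metric on the torus: shortest wrap-around distance in each coordinate."""
    dx = abs(p[0] - q[0]) % TWO_PI
    dy = abs(p[1] - q[1]) % TWO_PI
    return math.hypot(min(dx, TWO_PI - dx), min(dy, TWO_PI - dy))

start = (1.0, 0.0)

# Flow property (ii): F(F(x, s), t) = F(x, s + t).
flow_gap = torus_dist(torus_flow(torus_flow(start, 0.8, 1.5), 2.3, 1.5),
                      torus_flow(start, 0.8 + 2.3, 1.5))

# Rational slope a = 3/2: the path closes at t = 2 pi * 2.
rational_gap = torus_dist(torus_flow(start, 2 * TWO_PI, 1.5), start)

# Irrational slope a = sqrt(2): at the same time the path has not returned.
irrational_gap = torus_dist(torus_flow(start, 2 * TWO_PI, math.sqrt(2.0)), start)

print(flow_gap, rational_gap, irrational_gap)
```

The wrap-around in `torus_dist` matters: without it, two points on opposite sides of the seam at $0 \equiv 2\pi$ would appear far apart even though they are close on $T^2$.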
The solutions to the vector field $f(x, y) := (1, a)$ generate the flow $F$. Here the pair $(1, a)$ really represents the curve
\[ c(t) := [(x, y) + t(1, a)] \bmod 2\pi \in T^2 \]
which itself is a representative of an equivalence class of curves under tangency and so $[c] \in T_{(x,y)} T^2$. Sometimes it’s easier to just construct the flow than to think about the vector fields; but vector fields are generally considered primary, and often have great descriptive power, giving a link between algebra and geometry. E.g., the vector field illustrated here
\[ V(x, y) := (\cos y, \sin x) \]
is smooth on $T^2$ since it matches up at $0$ and $2\pi$. $V$ is easily solved on the plane and transferred by charts (or $S$) to the manifold. However, the flow paths of $V$ do not foliate $T^2$ as before, since there are 4 fixed points; instead this leads to a stratification since the paths are of different dimension, namely 1 and 0.
Another inequivalent foliation of $T^2$ is given by the paths in Figure 4, consisting of two closed circles and a continuum’s worth of toral helices which accumulate on the circles. More circles may be added, producing topologically distinct foliations. Essentially the final twist we can add in foliating a 2-dimensional compact manifold is a Reeb component, illustrated in Figure 5. See [40] for an elementary classification of foliations on compact manifolds.
Higher-dimensional foliations of manifolds are vital to the study of geometry and dynamics. Examples of 2-dimensional foliations of a 3-dimensional space are illustrated on pages 42-47. Just as integral curves and 1-dimensional foliations are generated by vector fields or by 1-dimensional subbundles of the tangent bundle $TM$, surfaces and $n$-dimensional foliations are generated by $n$ transverse vector fields or by $n$-dimensional subbundles of $TM$ called distributions.² Not all distributions may be integrated to generate foliations, not even if they are smoothly defined (see Example 58). However, a simple condition called “involutivity” characterizes the integrable distributions; this characterization is referred to as Frobenius’ Foliation Theorem.
To define involutivity, we use the Lie bracket $[f,g]$ of two vector fields $f$ and $g$, which is a new vector field on $M$. The vector $[f,g](x)$ is the tangency equivalence class represented by the curve
$$G_{-\sqrt{t}}\, F_{-\sqrt{t}}\, G_{\sqrt{t}}\, F_{\sqrt{t}}(x)$$
where $F$ and $G$ are the respective flows of $f$ and $g$. I.e., start at $x \in M$ and move in an
[Footnote 2] The term “distribution” is not to be confused with the several other mathematical concepts that share its name. As a striking case of poor terminology, when studying dynamics on abstract function spaces three of these definitions may be needed in a single example: probability distributions, functionals, and subbundles, e.g., in Example 85.
Figure 4: 4 leaves of another toral foliation are depicted: 2 circles and two partially complete squished toral helices. Everyone loves a Slinky.
Figure 5: Reeb component
approximate parallelogram following $F$, then $G$, then $F$ backwards, then $G$ backwards. The little “parallelogram” almost returns to $x$, but $\sqrt{t}$ has infinite speed at $t = 0$, which cancels the natural tendency of the parallelogram to close, at least to order $O(t)$, giving a curve with finite speed (Figure 6).
Figure 6: $O(t)$ gap represents the new vector $[f,g]$
Why do we use $\sqrt{t}$? If we restrict our attention to $x \in \mathbb{R}^n$ and move $\epsilon > 0$ in each of the $f$ and $g$ vector field directions around the parallelogram starting at $x_0$, then using Taylor series, we get a curve
$$x(\epsilon) = x_0 + \epsilon^2 \left[ \frac{\partial g}{\partial x} f(x_0) - \frac{\partial f}{\partial x} g(x_0) \right] + O(\epsilon^3).$$
Setting $\epsilon = \sqrt{t}$ therefore gives a displacement of order $t$, i.e., a curve with finite speed at $t = 0$.
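The parallelogram construction can be tried on a toy pair of fields. The sketch below is only an illustration and is not taken from the text: it assumes the hypothetical choice $f(x,y) = (1,0)$ and $g(x,y) = (0,x)$ on $\mathbb{R}^2$, whose flows are known in closed form, and the function names are invented for the example. For this pair the Taylor formula predicts the bracket $(\partial g/\partial x) f - (\partial f/\partial x) g = (0,1)$.

```python
import math

def F(t, p):
    """Flow of the assumed field f(x, y) = (1, 0): translation in x."""
    x, y = p
    return (x + t, y)

def G(t, p):
    """Flow of the assumed field g(x, y) = (0, x): shear in y at rate x."""
    x, y = p
    return (x, y + t * x)

def bracket_curve(t, p):
    """The curve G_{-sqrt t} F_{-sqrt t} G_{sqrt t} F_{sqrt t}(p), for t >= 0."""
    e = math.sqrt(t)
    return G(-e, F(-e, G(e, F(e, p))))

p0 = (0.3, -1.2)
for t in (1e-1, 1e-2, 1e-3):
    q = bracket_curve(t, p0)
    # Average velocity of the curve over [0, t]; it should approach (0, 1).
    velocity = ((q[0] - p0[0]) / t, (q[1] - p0[1]) / t)
    print(t, velocity)
```

For these particular flows the gap is exactly $t\,(0,1)$ up to rounding, so the average velocity does not depend on $t$; for general fields one would only expect agreement in the limit $t \to 0$.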
The bracket encapsulates a subtle difference between $f$ and $g$ which is critical to appreciate. For example $[f,g] = 0$ if and only if $F$ and $G$ commute, meaning the parallelogram closes perfectly. But there is much more. The Lie bracket gives us fundamental geometrical information about the subbundle of $TM$ generated by $f$ and $g$, i.e., the distribution
$$\Delta(f,g) := \{ \operatorname{span}\{f(x), g(x)\} \subset T_xM \mid x \in M \}.$$
The distribution $\Delta(f,g)$ gives a plane at each point $x$ and so is also called a plane field. Frobenius' Foliation Theorem says $\Delta(f,g)$ foliates $M$ into 2-dimensional surfaces (“leaves”) exactly when $[f,g](x) \in \Delta(f,g)$ for all $x \in M$. Higher-dimensional foliations are determined by a straightforward generalization.
Frobenius' Theorem has an important corollary for control theory, the Chow-Rashevsky Theorem, concerning the reachable set of a control system. If $\Delta(f,g)$ is involutive then the situation is simple, and the reachable set $R_{f,g}(x)$ is the leaf of the foliation through $x$; both sets consist of all points on piecewise differentiable paths through $x$ whose derivatives are linear combinations of $f$ and $g$. If $\Delta(f,g)$ is not involutive then $[f,g]$ is transverse to any surface tangent to $f$ and $g$, so cycling through the motions of $f$ and $g$ according to the bracket definition sends us away from the tangent surface, and thus the reachable set is not as simple as in the involutive case. But there is a simple solution in this case as well. If $\Delta(f,g)$ is not involutive, then we may form the distribution $\Delta[f,g]$ bracket-generated by $f$ and $g$, consisting of the linear combinations of $f$ and $g$ and all finitely iterated brackets, such as $[[f,[f,g]],f]$. By definition $\Delta[f,g]$ is involutive and so foliates $M$ by Frobenius' Theorem. The Chow-Rashevsky Theorem says the closure of the reachable set $R_{f,g}(x)$ is the leaf of the foliation from the bracket-generated distribution $\Delta[f,g]$ through $x$. This is easy to believe now since iterated brackets of $f$ and $g$ are tangent to the flows of some complicated composition of $F$ and $G$. E.g.,
$$[f,[f,g]] \sim F_{-\sqrt[4]{t}}\, G_{-\sqrt[4]{t}}\, F_{\sqrt[4]{t}}\, G_{\sqrt[4]{t}}\, F_{-\sqrt{t}}\, G_{-\sqrt[4]{t}}\, F_{-\sqrt[4]{t}}\, G_{\sqrt[4]{t}}\, F_{\sqrt[4]{t}}\, F_{\sqrt{t}}(x).$$
So the flow of any iterated bracket is in $R_{f,g}(x)$. This shows that the leaf is contained in the closure of the reachable set; the (less interesting) reverse inclusion follows from the Nagumo-Brézis invariant-set theorem, proven in §1.3.
Vector fields on a manifold are generalized to metric spaces with arc fields as a special family of curves (cf. the technical description on page 3). The definition of Lie brackets (§2.2) is essentially the same on metric spaces as given above for manifolds. But to define distributions (§3.4) using spans of arc fields, and also to define involutivity, we need an arithmetic for flows on a metric space, yet metric spaces have no usable linear structure by definition. Surprisingly, though you cannot add points together in a metric space, you can add arc fields in a natural way which faithfully generalizes the linear properties of vector fields on manifolds: scalar multiplication is defined by changing the speed of the curves, and arc fields can be added simply by composing them (§2.1). Then global foliations on metric spaces follow with a new proof of Frobenius' Theorem (Chapter 3). The Chow-Rashevsky Theorem is generalized in §3.5.
0.4 Example: flows on a space with no linear structure
As a ﬁnal introductory example we consider a metric space which resists any
natural ascription of a linear structure, but still gives a fertile environment for
dynamics.
Example 6 Let $(\mathbb{R}^n, d)$ be the usual $n$-dimensional Euclidean space. The metric space $H(\mathbb{R}^n)$ is the set of all nonempty compact subsets of $\mathbb{R}^n$, and the Hausdorff distance is given by
$$d_H(a,b) := \max\left\{ \sup_{x \in a} \inf_{y \in b} d(x,y),\ \sup_{y \in b} \inf_{x \in a} d(x,y) \right\}.$$
Using the simplifying notation $d(x,a) := \inf_{y \in a} d(x,y) =: d(a,x)$ for $x \in \mathbb{R}^n$ and $a \subset \mathbb{R}^n$, we have
$$d_H(a,b) = \sup_{x \in a;\ y \in b} \{ d(x,b),\ d(y,a) \}.$$
$H(\mathbb{R}^n)$ has several useful topological properties in common with $\mathbb{R}^n$. It is separable, complete and even locally compact (separability is obvious by considering finite subsets of $\mathbb{R}^n$ with rational coordinates; for completeness, see [8]; for local compactness, see [33, p. 183]).
What makes this space interesting for modeling is that shapes of homogeneous matter are merely points in this metric space. A circle, a rectangle, a pentagram: all points in $H(\mathbb{R}^2)$. A ball, a box, a cloud: points in $H(\mathbb{R}^3)$.
Exercise 7 Find $d_H(a,b)$ when $a \in H(\mathbb{R}^2)$ is the unit coordinate box
$$a := \{ (x,y) \mid 0 \le x \le 1,\ 0 \le y \le 1 \}$$
and $b \in H(\mathbb{R}^2)$ is the unit ball
$$b := \{ (x,y) \mid x^2 + y^2 \le 1 \}.$$
Hint: Since $a \ne b$ we cannot have $d_H(a,b) = 0$.
Figure 7: $d_H(\text{ball}, \text{square}) = ?$
Exercise 8 Determine which points are in the ball $B_{d_H}(0,1) \subset H(\mathbb{R}^2)$.
Hint: $B_{d_H}(0,1) \ne B_d(0,1)$ and the word “point” is easily misinterpreted here.
As further motivation for the potential of this space, answer the question:
What is a curve in $H(\mathbb{R}^n)$?
Looking at a black and white newspaper photo with a magnifying glass we see a finite collection of black dots. This photograph may be thought of as a point in $H(\mathbb{R}^2)$, the compact set representing the union of the black dots which forms a closed and bounded subset of $\mathbb{R}^2$. Now if a black and white photograph is a point in $H(\mathbb{R}^2)$, a black and white film clip is a curve in $H(\mathbb{R}^2)$. These points of $H(\mathbb{R}^2)$, the photographs or individual frames of the movie, move continuously with respect to the Hausdorff metric as time goes by (or at least approximately continuously, since there are only a finite number of frames in a film clip). Color cinema is a curve in $H(\mathbb{R}^3)$.
The ability to describe the motion of complex patterns makes $H(\mathbb{R}^n)$ a very interesting space. It is easy to imagine the motion and evolution of a homogeneous material simply as a curve in this metric space: a moving cloud may be characterized as a curve in $H(\mathbb{R}^3)$, and a lightning stroke is a very fast curve in $H(\mathbb{R}^3)$; the growth of a bacteria colony in a petri dish and the evolution of a six-sided snowflake growing from a tiny ice seed are both geometrically curves in $H(\mathbb{R}^3)$.
Very well then, $H(\mathbb{R}^n)$ has a strong potential for describing all sorts of shape changes, but do we have any control on this profusion of information with which $H(\mathbb{R}^n)$ presents us? How do we mathematically encapsulate motion or characterize forces on such a space? Can we generalize differential equations, somehow? Even then, could we stomach any calculations with this complicated metric? Happily, all of these questions have positive answers.
Let's construct some curves in $H(\mathbb{R}^n)$.
Example 9 For $x, y \in \mathbb{R}^n$, let $\lambda_{xy} : [-1,1] \to \mathbb{R}^n$ be the line defined by
$$\lambda_{xy}(t) := (1-t)x + ty.$$
For $k$ functions $f_i : \mathbb{R}^n \to \mathbb{R}^n$ define the arc field $X : H(\mathbb{R}^n) \times [-1,1] \to H(\mathbb{R}^n)$ by
$$X_a(t) := \bigcup_{\substack{x \in a \\ i = 1,\dots,k}} \lambda_{x f_i(x)}(t)$$
which describes curves from $X_a(0) = a$ to $X_a(1) = \bigcup_{i=1,\dots,k} f_i(a)$ in $H(\mathbb{R}^n)$. In Chapter 7 this arc field $X$ is shown to generate a flow $F$ on $H(\mathbb{R}^n)$. When the $f_i$ are affine and contractive, $F_t(a)$ converges to a unique fixed point in $H(\mathbb{R}^n)$ as $t \to \infty$, the convex hull of a fractal. Another example in §7.5 characterizes the reachable set of a control system as the limit of the flow of a similar arc field.
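The arc field of Example 9 is easy to experiment with on finite sets, which are themselves points of $H(\mathbb{R}^n)$. The sketch below makes several assumptions that are not in the text: sets are represented as Python sets of coordinate tuples, and the two affine contractions `f1` and `f2` are hypothetical choices made only to have something concrete to run.

```python
def segment(x, y, t):
    """The point lambda_{xy}(t) = (1 - t) x + t y."""
    return tuple((1 - t) * xi + t * yi for xi, yi in zip(x, y))

def arc_field(a, t, maps):
    """X_a(t): union over x in a and over the maps f of lambda_{x f(x)}(t)."""
    return {segment(x, f(x), t) for x in a for f in maps}

# Hypothetical affine contractions of the plane (illustration only).
def f1(p):
    return (0.5 * p[0], 0.5 * p[1])

def f2(p):
    return (0.5 * p[0] + 0.5, 0.5 * p[1])

a = {(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)}
start = arc_field(a, 0.0, [f1, f2])   # should reproduce a
end = arc_field(a, 1.0, [f1, f2])     # should be f1(a) united with f2(a)
print(sorted(start))
print(sorted(end))
```

Intermediate values of `t` give the sets swept out as each point slides along the straight line toward each of its images; this is only the arc field, not the flow $F$ that Chapter 7 constructs from it.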
0.5 Chapter outline
Part I: Theory In these chapters our will is bent to proving generalizations of the basic theorems of dynamical systems and differential geometry.
Chapter 1: Flows The Fundamental Theorem of ODEs is generalized, proving the well-posedness of arc fields, Theorem 12. This gives a means for generating flows on metric spaces. Global flows are guaranteed when an arc field satisfies the extra condition of linearly bounded speed, Theorem 25. A fixed point is guaranteed when the arc field is suitably contractive, Theorem 31. An invariant-set theorem generalizing the Nagumo-Brézis Theorem is given with Theorem 33, which is used later to piece together integral surfaces in a global foliation theorem. Theorem 35 gives a condition analogous to a vanishing Lie bracket which guarantees forward flows commute.
Chapter 2: Lie algebra on a metric space An arithmetic for arc fields is introduced which generalizes the algebraic structure of vector fields on a manifold. Theorem 43 elucidates which module properties general arc fields enjoy. Then the Lie bracket is introduced and its algebraic properties are explored. Theorem 52 shows how pull-back and push-forward operations are natural with respect to this new Lie algebra.
Chapter 3: Foliations Transverse arc fields generate geometric distributions. The Lie bracket is used to prove a local Frobenius theorem, showing involutive distributions have integral surfaces, Theorem 62. These integral surfaces are pieced together to foliate metric spaces, culminating in a global Frobenius theorem, Theorem 75. A corollary of this result is an application to control theory with Chow's Theorem on a metric space, Theorem 78.
Part II: Examples
Chapter 4: Brackets on function spaces The Lie bracket and the Frobenius Theorem are applied to simple flows on $L^2(\mathbb{R})$ to make good on the promises of §0.2. Various foliations of $L^2$ and other function spaces are explored.
Chapter 5: Approximation with non-orthogonal families Applications of the results of Chapter 4 give surprising new approximation methods using non-orthogonal families of functions such as translations of a Gaussian $\{ e^{-(x+1/n)^2} \mid n \in \mathbb{N} \}$ in §5.1 and low-frequency trigonometric functions $\{ e^{ix/n} \mid n \in \mathbb{N} \}$ in §5.2.
Chapter 6: More flows on function spaces PDEs are rewritten as arc fields avoiding derivatives.
Chapter 7: Flows on $H(\mathbb{R}^n)$ A continuous version of the discrete IFS fractal generator and other flows with novel dynamics are introduced.
Some sections are not logically dependent on others. The fastest tour of the high points is
§1.1 → §2.1 → §2.2 → §3.2 → §4 → §5
0.6 Prerequisites
Technically the prerequisites for understanding this book are very basic; a single semester of undergraduate analysis which introduces the concept of a limit in a metric space is sufficient. We've made efforts to keep the book self-contained and gently introduce each concept. Certainly, those with experience in the differentiable-manifold presentations of flows, Lie brackets and foliations will find this generalized environment easy to apprehend. When released from the details of charts, atlases and coordinates, new students may likewise find these concepts simpler to grasp.
Several proofs are extremely long. This is a good place to apologize, justify ourselves, and prepare the reader. This is an abstract subject with concrete claims. We are a bit defensive, therefore, and feel the need to detail every pedestrian step exhaustively. Instead of relying on our readers' mathematical dexterity in this unfamiliar terrain, we spoiled the fun and printed out six-page proofs. Instead of slogging through, line by line, it may be more productive for you to read the proof's outline, then create one yourself.
0.7 Abridged version of the book
Generalizations of the major ideas in dynamics and geometry can be fruitfully made to metric spaces. As well as greater descriptive power, the extra generality gives insight into classical questions on infinite-dimensional spaces.
A vector field on a manifold is recast as an arc field $X$, that is, a set of curves on a metric space $M$, each curve representing a direction; i.e., $X$ is a continuous map $X : M \times [-1,1] \to M$ such that for all $x \in M$, $X(x,0) = x$. Tangency between two arc fields $X$ and $Y$ is given by the condition
$$d(X_x(t), Y_x(t)) = o(t).$$
If $X$ satisfies the regularity conditions E1 and E2 (§1.1.1) on a complete metric space, then there exists a unique local flow tangent to $X$. If $X$ has linearly bounded speed, it generates a global flow.
An arithmetic for arc fields is given by $X + Y$ and $aX$ for $a \in \mathbb{R}$ defined by
$$(X+Y)_t(x) := Y_t X_t(x)$$
and
$$(aX)_t(x) := X_{at}(x).$$
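To see why composing arc fields deserves to be called addition, one can compare it with ordinary addition of vector fields on the real line. The sketch below is an illustration under assumptions not made in the text at this point: arc fields are taken of the form $X(x,t) = x + t f(x)$, the fields $\sin$ and $\cos$ are arbitrary choices, and the helper names are hypothetical. The composed arc field should be tangent to the arc field of $f + g$, with a gap of order $t^2$.

```python
import math

def arc_from_field(f):
    """Arc field X(x, t) = x + t f(x) of a vector field f on the real line."""
    return lambda x, t: x + t * f(x)

def add(X, Y):
    """(X + Y)_t(x) := Y_t(X_t(x))."""
    return lambda x, t: Y(X(x, t), t)

def scale(a, X):
    """(aX)_t(x) := X_{at}(x); the caller must keep |a t| <= 1."""
    return lambda x, t: X(x, a * t)

X = arc_from_field(math.sin)
Y = arc_from_field(math.cos)
S = add(X, Y)
Z = arc_from_field(lambda x: math.sin(x) + math.cos(x))

x0 = 0.7
for t in (1e-1, 1e-2, 1e-3):
    gap = abs(S(x0, t) - Z(x0, t))
    print(t, gap, gap / t**2)  # gap / t^2 stays bounded, so gap = o(t)
```

Here the gap equals $|t|\,|\cos(x + t \sin x) - \cos x|$, which is at most $t^2$ because $\cos$ is 1-Lipschitz and $|\sin| \le 1$.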
The Lie bracket $[X,Y]$ of two arc fields is given by
$$[X,Y](x,t) := G_{-\sqrt{t}}\, F_{-\sqrt{t}}\, G_{\sqrt{t}}\, F_{\sqrt{t}}(x)$$
where $F$ and $G$ are the flows generated by $X$ and $Y$.
A distribution is a set of arc fields. A distribution $\Delta$ is involutive if for any $X, Y \in \Delta$, we have $[X,Y] \sim \Delta$. An involutive distribution has a unique maximal integral surface through each point in $M$. The integral surfaces, pieced together, foliate $M$.
One application on $M := L^2(\mathbb{R})$ shows the flows $F_t(f)(x) := f(x+t)$ and $G_t(f) := f + tg$ bracket-generate an infinite-dimensional distribution when $g(x) = e^{-x^2}$, and the reachable set is all of $M$. Similarly $G$ and $Z_t(f)(x) := e^{ixt} f(x)$ have an infinite-dimensional bracket-generated distribution and $L^2([a,b], \mathbb{C})$ is controllable with $G$ and $Z$, for any choice of interval $[a,b]$. Consequently series of Gaussians $\sum_{k=0}^{N} a_k e^{-(x+1/k)^2}$ or low-frequency trig series $\sum_{k=M}^{N} a_k e^{ix/k}$ may be made arbitrarily close to any square integrable function.
0.8 Acknowledgements
This theory took more than 10 years to commit to paper, though I had assumed
it could be hammered out in a few months. It’s all Axel Boldt’s fault. To
my constant irritation, he corrected countless mistakes and misunderstandings,
which really slowed down the creative process. He also introduced me to several
branches of mathematics, which distracted me from metric spaces, and made
me a more versatile mathematician. Thanks for screwing up my focus, pal.
Michael Green was the mathematician who gave me the most extensive and
useful feedback on this manuscript. David Bleecker suggested I write this book,
which was the strangest thing I had seen him do, so I took him seriously. He
has been my greatest supporter in the development of these ideas.
Except for my wife, Karen. Often when authors thank their wives, I imagine
a shrew who speeds the writing of a book by folding her arms and tapping her
feet at the doorway to the study. But Karen took an interest in all the ideas
in this book, even the applications outside her ﬁeld of expertise. She was my
best sounding board, my best critic. And by introducing me to fatherhood then
guiding me for a year abroad in China, she’s been my best teacher.
Part I
Theory
Chapter 1
Flows
“Panta rhei.” (Everything ﬂows.)
Heraclitus, ca. 500 B.C.
The purpose of this chapter is to introduce a general method for producing flows (dynamical systems) on a metric space. A flow may proceed forward and backward in time $F : M \times (-\infty, +\infty) \to M$, or possibly only forward in time $F : M \times [0, +\infty) \to M$ as in the case of diffusion. We explore the generation of both types of flows and study some conditions which guarantee global existence, fixed points and commutativity.
1.1 Generating ﬂows with arc ﬁelds
This section follows the generation of flows on a manifold $M$ from a vector field: first we find solutions for each initial condition $x \in M$, then we piece together the solutions with domain $(-\delta, \delta)$ in a neighborhood of $x$ to get a local flow, which is then continued to produce a global flow with domain $(-\infty, \infty)$.
1.1.1 The fundamental theorem
The following deﬁnition is made in analogy with the representation of a vector
ﬁeld on a manifold as a family of curves, detailed in §0.3.
Definition 10 An arc field on a metric space $M$ is a continuous map $X : M \times [-1,1] \to M$ with locally uniformly bounded speed, such that for all $x \in M$, $X(x,0) = x$.
Saying $X$ has locally uniformly bounded speed means $X(x,\cdot) : [-1,1] \to M$ is Lipschitz, locally uniformly in $x$. Specifically we have
$$\rho(x) := \sup_{s \ne t} \frac{d(X(x,s), X(x,t))}{|s-t|} < \infty$$
(i.e., $X(x,\cdot)$ is Lipschitz), and the function $\rho(x)$ is locally bounded, meaning there exists $r > 0$ such that
$$\rho(x,r) := \sup\{ \rho(y) \mid y \in B(x,r) \} < \infty.$$
A solution curve to $X$ is a curve $\sigma$ which is tangent to $X$ throughout its domain, i.e., $\sigma : (\alpha, \beta) \to M$ for some open interval $(\alpha,\beta) \subset \mathbb{R}$ such that for each $t \in (\alpha,\beta)$
$$\lim_{h \to 0} \frac{d(\sigma(t+h), X(\sigma(t), h))}{h} = 0, \tag{1.1}$$
i.e., $d(\sigma(t+h), X(\sigma(t),h)) = o(h)$.
Arc fields are typically denoted with $X$, $Y$, or $Z$. The two independent variables for arc fields, usually denoted by $x$ and $t$, are often thought of as representing space and time. We typically use $x$, $y$, and $z$ for space variables, while $r$, $s$, $t$, and $h$ fill the time variable slot. As with flows, the variables of an arc field $X$ will often migrate liberally between parentheses and subscripts
$$X(x,t) = X_x(t) = X_t(x)$$
depending on which variable we wish to emphasize in a calculation.
On $\mathbb{R}^n$ a vector field which is Lipschitz continuous generates a local flow constructed by Euler curves. An arc field is a faithful analogy for a metric space, and when it satisfies analogous regularity conditions (E1 and E2 detailed below), we will soon show Euler curves converge to a flow. To further the analogy with vector fields on manifolds, an arc field may be thought of as a map $X : M \to AM$ where $AM$ is the arc bundle, consisting of the set of all Lipschitz continuous arcs, and we require $X(x)(0) = x$.
The initial condition of $\sigma$ is the point $x = \sigma(0) \in M$. Notationally we use $\sigma_x$ to mean the solution with initial condition $x$. We say $\sigma_x : (\alpha_x, \omega_x) \to M$ is the unique solution to $X$ with initial condition $x$ if for any other solution $\hat\sigma_x : (\hat\alpha_x, \hat\omega_x) \to M$ also having initial condition $x$, we have $(\hat\alpha_x, \hat\omega_x) \subset (\alpha_x, \omega_x)$ and $\hat\sigma_x = \sigma_x|_{(\hat\alpha_x, \hat\omega_x)}$ (i.e., $\sigma_x$ is the unique maximal solution curve).
We will prove below that on a locally complete metric space the next two conditions guarantee the arc field problem is well posed, i.e., there exists a unique solution from any initial condition $x \in M$ (Theorem 12).
Condition E1: For each $x_0 \in M$, there exist positive constants $r$, $\delta$ and $\Lambda$ such that for all $x, y \in B(x_0, r)$ and $t \in (-\delta, \delta)$
$$d(X_t(x), X_t(y)) \le d(x,y)\,(1 + |t|\Lambda).$$
Condition E2: For each $x_0 \in M$, there exist positive constants $r$, $\delta$ and $\Omega$ such that for all $x \in B(x_0, r)$ and $s \in (-\delta, \delta)$ and any $t$ with $|t| \le |s|$
$$d(X_{s+t}(x), X_t(X_s(x))) \le |st|\,\Omega.$$
These conditions may be restated as saying
$$\frac{d(X_t(x), X_t(y))}{d(x,y)} - 1 \le O(|t|)$$
and
$$d(X_{s+t}(x), X_t(X_s(x))) = O(|st|)$$
for $|t| \le |s|$ as $s \to 0$, locally uniformly in $x$. Infinitesimally these conditions have E1 limiting the spread of $X$, and E2 restraining $X$ to be flow-like (Figure 1.1).
Figure 1.1: E1 and E2 are continuity conditions on $X$ which ensure some geometric regularity using only the metric.
Example 11 A Banach space $(M, \|\cdot\|)$ is a complete normed vector space (e.g., $\mathbb{R}^n$ with Euclidean norm). A Banach space is an example of a metric space $(M,d)$ where the metric is defined by $d(u,v) := \|u - v\|$. A vector field on a Banach space $M$ is a map $f : M \to M$. A solution to a vector field $f$ with initial condition $x$ is a curve $\sigma_x : (\alpha, \omega) \to M$ defined on an open interval $(\alpha,\omega) \subset \mathbb{R}$ containing $0$ such that $\sigma_x(0) = x$ and $\sigma_x'(t) = f(\sigma_x(t))$ for all $t \in (\alpha,\omega)$. The Fundamental Theorem of ODEs (detailed in Appendix B) guarantees unique solutions for any locally Lipschitz vector field $f$. With a few tricks, most differential equations can be represented as vector fields on a suitably abstract space.
Every Lipschitz vector field $f : M \to M$ naturally gives rise to an arc field $X(x,t) := x + t f(x)$ on $M$, and it is easy to check $X$ satisfies E1 and E2. Calculating
$$d(X_t(x), X_t(y)) = \|X_t(x) - X_t(y)\| \le \|x - y\| + |t|\,\|f(x) - f(y)\| \le (1 + |t| K_f)\,\|x - y\|$$
where $K_f$ is the local Lipschitz constant for $f$, so $\Lambda_X := K_f$ gives Condition E1. Similarly
$$\begin{aligned}
d(X_{s+t}(x), X_t(X_s(x))) &= \| x + (s+t) f(x) - [X_s(x) + t f(X_s(x))] \| = \| t f(x) - t f(X_s(x)) \| \\
&\le |t| K_f \| x - [x + s f(x)] \| = |st|\, K_f \|f(x)\|,
\end{aligned}$$
so Condition E2 holds with $\Omega_X := K_f \sup\{ \|f(x)\| : x \in B(x_0, r) \}$, which is finite because $f$ is Lipschitz. (When $f(0) = 0$ we have $\|f(x)\| \le K_f \|x\|$, which gives the bound $|st| K_f^2 \|x\|$.) Further the solutions to the arc field are precisely the solutions to the vector field guaranteed by the fundamental theorem since
$$d(\sigma(t+h), X_{\sigma(t)}(h)) = |h| \left\| \frac{\sigma(t+h) - \sigma(t)}{h} - f(\sigma(t)) \right\| = o(h) \iff \sigma'(t) = f(\sigma(t)).$$
Therefore Theorem 12, below, is a generalization of the classical Fundamental
Theorem of ODEs (given in Appendix B). Similarly, Lipschitz vector ﬁelds on a
Banach manifold (a manifold whose charts map to a Banach space; if f is locally
Lipschitz in one chart, it is in any and on the manifold with any compatible
metric) give arc ﬁelds which satisfy E1 and E2.
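The two estimates of Example 11 can be spot-checked numerically. The sketch below assumes the concrete choice $M = \mathbb{R}$ and $f = \sin$ (not from the text), for which $K_f = 1$ and $|f| \le 1$, so the constants $\Lambda = 1$ and $\Omega = 1$ should work; the sampling ranges and tolerance are arbitrary. A finite sample can of course only fail to find a counterexample, it proves nothing.

```python
import math
import random

def X(x, t):
    """Arc field X(x, t) = x + t f(x) for the assumed field f = sin."""
    return x + t * math.sin(x)

LAMBDA = 1.0  # assumed E1 constant: Lipschitz constant of sin
OMEGA = 1.0   # assumed E2 constant: K_f * sup|f| = 1 * 1

def e1_holds(x, y, t, tol=1e-12):
    """Check d(X_t(x), X_t(y)) <= d(x, y) (1 + |t| Lambda)."""
    return abs(X(x, t) - X(y, t)) <= abs(x - y) * (1 + abs(t) * LAMBDA) + tol

def e2_holds(x, s, t, tol=1e-12):
    """Check d(X_{s+t}(x), X_t(X_s(x))) <= |s t| Omega."""
    return abs(X(x, s + t) - X(X(x, s), t)) <= abs(s * t) * OMEGA + tol

random.seed(0)
all_ok = True
for _ in range(10_000):
    x = random.uniform(-5.0, 5.0)
    y = random.uniform(-5.0, 5.0)
    s = random.uniform(-0.5, 0.5)
    t = random.uniform(-abs(s), abs(s))  # E2 requires |t| <= |s|
    all_ok = all_ok and e1_holds(x, y, t) and e2_holds(x, s, t)
print(all_ok)
```

Keeping $|s| \le 0.5$ ensures $s + t$ stays inside $[-1, 1]$, the time domain of an arc field.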
The basic iterative trick for proving ODEs are well-posed on $\mathbb{R}^n$, or more generally on a Banach space, applies just as well for arc fields on general metric spaces. For economy of description we use round brackets in the superscript, $f^{(i)}$, to denote the composition of a map $f : M \to M$ with itself $i$ times. So, for example,
$$X^{(i)}_{t/2^n}(x) = \underbrace{X_{t/2^n} \circ X_{t/2^n} \circ \dots \circ X_{t/2^n}}_{i \text{ compositions}}(x).$$
Then given $x \in M$ and a positive integer $n$, we may define the $n$th Euler curve $\xi_n : (-2^n, 2^n) \to M$ for $X$ starting at $x$ as
$$\xi_n(t) := X^{(2^n)}_{t/2^n}(x) \tag{1.2}$$
for $n \in \mathbb{N}$ such that $2^n > |t|$. Taking $n \to \infty$ generates a solution to $X$ in the following fundamental result.
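A small experiment shows the Euler curves (1.2) converging. The sketch assumes the simplest nontrivial case, $M = \mathbb{R}$ with arc field $X(x,t) = x + t x$ coming from the vector field $f(x) = x$ (an assumption made for illustration, not an example from the text). Then $\xi_n(t) = x\,(1 + t/2^n)^{2^n}$, which should approach the classical solution $x e^{t}$.

```python
import math

def X(x, t):
    """Arc field X(x, t) = x + t f(x) for the assumed field f(x) = x."""
    return x + t * x

def euler_curve(n, x, t):
    """xi_n(t): compose X_{t / 2^n} with itself 2^n times, starting at x."""
    h = t / 2**n
    for _ in range(2**n):
        x = X(x, h)
    return x

x0, t = 1.0, 0.5
exact = x0 * math.exp(t)
errors = [abs(euler_curve(n, x0, t) - exact) for n in range(13)]
for n, err in enumerate(errors):
    print(n, err)  # the error roughly halves with each increment of n
```

The halving of the error matches the factor $2^{-m}$ that appears in the Cauchy estimate of the existence proof.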
Theorem 12 Let $X$ be an arc field satisfying E1 and E2 on a locally complete metric space $M$. Then given any point $x \in M$, there exists a unique solution $\sigma_x : (\alpha_x, \omega_x) \to M$ with initial condition $\sigma_x(0) = x$ for some $\alpha_x < 0 < \omega_x \in \mathbb{R} \cup \{\infty, -\infty\}$.
Proof of Existence of Solutions. We will show
$$\lim_{n \to \infty} \xi_n(t) = \lim_{n \to \infty} X^{(2^n)}_{t/2^n}(x)$$
exists for each $t$ sufficiently close to $0$ and define $\sigma_x(t)$ as this limit. Then $\sigma_x(t)$ will be shown to be tangent to $X$ at $t = 0$. The elaborate chain of elementary calculations checking these two facts becomes convoluted, but the inspiration guiding us is sketched simply enough in Figures 2.2 and 2.3. We then establish $\sigma_x(s+t) = \sigma_{\sigma_x(s)}(t)$, which shows $\sigma_x$ is tangent to $X$ at all $t$ in its domain by the previous result. Uniqueness of solutions is elaborated and verified in Remark 17, below.
First we show that for sufficiently small $c > 0$ the image of the Euler curves $\xi_n([-c,c])$ must remain bounded for all $n$. This is intuitively true because the arc field $X$ from which the Euler curve is constructed has locally bounded speed $\rho < \infty$, so successively following $2^n$ compositions of $X$ for small time $t/2^n$ does not allow us to travel further than $\rho|t|$ distance. This is exactly correct, but we need to demonstrate how we can achieve this bound using only the metric. We exhaust the rest of this voluminous paragraph with the tedious details. Suppose $r > 0$ is chosen so $\rho(x,r) < \infty$. If $\rho(x,r) = 0$, then $\sigma(t) := x$ defines a solution curve and there is nothing to prove. Thus, assume $\rho(x,r) > 0$, and let
$$c := r / \rho(x,r).$$
We assume hereafter that $t$ is restricted to $|t| < c$ and $|t| < 1$, guaranteeing the Euler curve $\xi_n(t)$ is well defined. In this case we claim $\xi_n(t) \in B(x,r)$: the triangle inequality gives
$$d(x, \xi_n(t)) \le \sum_{k=1}^{2^n} d\left( X^{(k-1)}_{t/2^n}(x),\ X^{(k)}_{t/2^n}(x) \right)$$
where $X^{(0)}_{t/2^n}(x) = x$ by definition, and
$$d\left( X^{(k-1)}_{t/2^n}(x),\ X^{(k)}_{t/2^n}(x) \right) = d\left( y, X_{t/2^n}(y) \right) \le \rho(y)\,|t|/2^n \quad \text{for each } k,$$
where $y := X^{(k-1)}_{t/2^n}(x)$. So if $y \in B(x,r)$ then $\rho(y) \le \rho(x,r)$ and induction allows us to conclude
$$d(\xi_n(t), x) \le \rho(x,r)\,|t| < r.$$
Next we additionally assume the above $r > 0$ is chosen small enough that $\Lambda$ and $\Omega$ from Conditions E1 and E2 hold uniformly on $B(x,r)$ and, for convenience, that $\Lambda, \Omega > 1$. We may further assume the closure $\overline{B(x,r)}$ is a complete metric subspace of $M$ by again taking $r$ to be smaller if need be. In this carefully chosen neighborhood we will now show the Euler curves converge by proving $\xi_n$ is Cauchy. (If $M$ were locally compact, Arzelà-Ascoli would allow us to bypass this one-page verification.)
Figure 2.2: To prove the Euler curves are Cauchy, apply E1 and E2 repeatedly to estimate the distance between $\xi_n(t)$ and $\xi_{n+1}(t)$ tracking back to $\xi_n(0) = x_0 = \xi_{n+1}(0)$.
Figure 2.3: To prove tangency apply E1 and E2 to estimate the distance between $X_t(x_0)$ and $\xi_n(t) \to \sigma_x(t)$.
Consider
$$\begin{aligned}
d(\xi_n(t), \xi_{n+1}(t)) &= d\left( X^{(2^n)}_{t/2^n}(x),\ X^{(2^{n+1})}_{t/2^{n+1}}(x) \right) \\
&\le d\left( X^{(2^n)}_{t/2^n}(x),\ X^{(2)}_{t/2^{n+1}} X^{(2^n-1)}_{t/2^n}(x) \right) + d\left( X^{(2)}_{t/2^{n+1}} X^{(2^n-1)}_{t/2^n}(x),\ X^{(2^{n+1})}_{t/2^{n+1}}(x) \right).
\end{aligned}$$
The first term is approximated by
$$\begin{aligned}
d\left( X^{(2^n)}_{t/2^n}(x),\ X^{(2)}_{t/2^{n+1}} X^{(2^n-1)}_{t/2^n}(x) \right) &= d\left( X_{2t/2^{n+1}} X^{(2^n-1)}_{t/2^n}(x),\ X^{(2)}_{t/2^{n+1}} X^{(2^n-1)}_{t/2^n}(x) \right) \\
&= d\left( X_{2t/2^{n+1}}(y),\ X^{(2)}_{t/2^{n+1}}(y) \right) \le \left( \frac{t}{2^{n+1}} \right)^2 \Omega
\end{aligned}$$
for $y := X^{(2^n-1)}_{t/2^n}(x)$ using Condition E2, while the second term is approximated by
$$\begin{aligned}
d\left( X^{(2)}_{t/2^{n+1}} X^{(2^n-1)}_{t/2^n}(x),\ X^{(2^{n+1})}_{t/2^{n+1}}(x) \right) &= d\left( X^{(2)}_{t/2^{n+1}} X^{(2^n-1)}_{t/2^n}(x),\ X^{(2)}_{t/2^{n+1}} X^{(2^{n+1}-2)}_{t/2^{n+1}}(x) \right) \\
&\le d\left( X^{(2^n-1)}_{t/2^n}(x),\ X^{(2^{n+1}-2)}_{t/2^{n+1}}(x) \right) \left( 1 + \frac{|t|}{2^{n+1}} \Lambda \right)^2
\end{aligned}$$
using Condition E1 twice. Such calculations will now be performed frequently and without comment for the rest of the proof; usually when a new $\Lambda$ or $\Omega$ erupts, the triangle inequality and Condition E1 or E2 have been used.
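Inserting these last two estimates and iterating we have
$$\begin{aligned}
d\left( X^{(2^n)}_{t/2^n}(x),\ X^{(2^{n+1})}_{t/2^{n+1}}(x) \right)
&\le d\left( X^{(2^n-1)}_{t/2^n}(x),\ X^{(2^{n+1}-2)}_{t/2^{n+1}}(x) \right) \left( 1 + \tfrac{|t|}{2^{n+1}} \Lambda \right)^2 + \left( \tfrac{t}{2^{n+1}} \right)^2 \Omega \\
&\le d\left( X^{(2^n-2)}_{t/2^n}(x),\ X^{(2^{n+1}-2\cdot 2)}_{t/2^{n+1}}(x) \right) \left( 1 + \tfrac{|t|}{2^{n+1}} \Lambda \right)^{2\cdot 2} + \left( 1 + \tfrac{|t|}{2^{n+1}} \Lambda \right)^2 \left( \tfrac{t}{2^{n+1}} \right)^2 \Omega + \left( \tfrac{t}{2^{n+1}} \right)^2 \Omega \\
&\le d\left( X^{(2^n-2^n)}_{t/2^n}(x),\ X^{(2^{n+1}-2\cdot 2^n)}_{t/2^{n+1}}(x) \right) \left( 1 + \tfrac{|t|}{2^{n+1}} \Lambda \right)^{2\cdot 2^n} + \sum_{k=0}^{2^n-1} \left( 1 + \tfrac{|t|}{2^{n+1}} \Lambda \right)^{2k} \left( \tfrac{t}{2^{n+1}} \right)^2 \Omega \\
&= 0 + \left( \tfrac{t}{2^{n+1}} \right)^2 \Omega \sum_{k=0}^{2^n-1} \left( 1 + \tfrac{|t|}{2^{n+1}} \Lambda \right)^{2k} \qquad \text{(geometric series)} \\
&= \left( \tfrac{t}{2^{n+1}} \right)^2 \Omega\, \frac{ \left( 1 + \tfrac{t}{2^{n+1}} \Lambda \right)^{2^{n+1}} - 1 }{ \left( 1 + \tfrac{t}{2^{n+1}} \Lambda \right)^2 - 1 } \\
&= \left( \tfrac{t}{2^{n+1}} \right)^2 \Omega\, \frac{ \left( 1 + \tfrac{t}{2^{n+1}} \Lambda \right)^{2^{n+1}} - 1 }{ \tfrac{t}{2^n} \Lambda + \left( \tfrac{t}{2^{n+1}} \Lambda \right)^2 } \\
&\le \left( \tfrac{t}{2^{n+1}} \right)^2 \Omega\, \frac{ e^{t\Lambda} - 1 }{ \tfrac{t}{2^n} \Lambda } \le \frac{|t|}{2^{n+1}}\, \Omega\, \frac{ e^{t\Lambda} - 1 }{ \Lambda }.
\end{aligned}$$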
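Then for $m < n$
$$\begin{aligned}
d(\xi_m(t), \xi_n(t)) &\le \sum_{k=m}^{n-1} d(\xi_k(t), \xi_{k+1}(t)) = \sum_{k=m}^{n-1} d\left( X^{(2^k)}_{t/2^k}(x),\ X^{(2^{k+1})}_{t/2^{k+1}}(x) \right) \\
&\le \sum_{k=m}^{n-1} \frac{|t|}{2^{k+1}}\, \Omega\, \frac{e^{t\Lambda} - 1}{\Lambda} \le |t|\, \Omega\, \frac{e^{t\Lambda} - 1}{\Lambda}\, 2^{-(m+1)} \sum_{k=0}^{\infty} 2^{-k} = |t|\, \Omega\, \frac{e^{t\Lambda} - 1}{\Lambda}\, 2^{-m}
\end{aligned}$$
and we see $\xi_n(t)$ is uniformly Cauchy on the interval $|t| < c$ in the complete metric space $\overline{B(x,r)}$. By the bound $\rho$ on speed, the curves $\xi_n(t)$ are uniformly continuous in $t$ and so they converge to a (continuous) curve, denoted
$$\sigma_x(t) := \lim_{n \to \infty} \xi_n(t).$$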
Let us now check $\sigma_x$ is tangent to $X$, first at $t = 0$. Notice
$$d(\sigma_x(t), X_x(t)) \le d(\sigma_x(t), \xi_n(t)) + d(\xi_n(t), X_x(t)).$$
The first summand is easily controlled. For the second summand consider the fact that for any $t \in [-1,1]$ and $n \in \mathbb{N}$ we have
$$d\left( X_t(x),\ X^{(n)}_{t/n}(x) \right) \le e^{t\Lambda} t^2 \Omega \tag{1.3}$$
which holds since
$$\begin{aligned}
d\left( X_t(x),\ X^{(n)}_{t/n}(x) \right) &\le \sum_{k=0}^{n-1} d\left( X^{(n-k)}_{t/n} X_{kt/n}(x),\ X^{(n-(k+1))}_{t/n} X_{(k+1)t/n}(x) \right) \\
&\le \sum_{k=0}^{n-1} \left( 1 + \frac{|t|}{n} \Lambda \right)^{n-(k+1)} k \left( \frac{t}{n} \right)^2 \Omega \le e^{t\Lambda} t^2 \Omega.
\end{aligned}$$
Replacing the $n$ in (1.3) with $2^n$, the bound is undisturbed, and we have
$$d(\sigma_x(t), X_x(t)) \le d(\sigma_x(t), \xi_n(t)) + e^{t\Lambda} t^2 \Omega.$$
Letting $n \to \infty$ gives
$$d(\sigma_x(t), X_x(t)) \le e^{t\Lambda} t^2 \Omega = O(t^2) \tag{1.4}$$
locally uniformly in $x$.
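Next we show $\sigma_x$ is locally 2nd-order tangent to $X$ for all $t$. This will be done if we show $\sigma_x(s+t) = \sigma_{\sigma_x(s)}(t)$ because in that case
$$d\left( \sigma_x(s+t),\ X_{\sigma_x(s)}(t) \right) = d\left( \sigma_{\sigma_x(s)}(t),\ X_{\sigma_x(s)}(t) \right) = O(t^2),$$
this last equality having been established by line (1.4). Using (1.3) we have
$$d\left( X_t(x),\ X^{(n)}_{t/n}(y) \right) \le \left( d(x,y) + t^2 \Omega \right) e^{t\Lambda} \tag{1.5}$$
since
$$\begin{aligned}
d\left( X_t(x),\ X^{(n)}_{t/n}(y) \right) &\le d\left( X_t(x),\ X^{(n)}_{t/n}(x) \right) + d\left( X^{(n)}_{t/n}(x),\ X^{(n)}_{t/n}(y) \right) \\
&\le t^2 \Omega e^{t\Lambda} + \left( 1 + \frac{|t|}{n} \Lambda \right)^n d(x,y) \le \left( d(x,y) + t^2 \Omega \right) e^{t\Lambda}.
\end{aligned}$$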
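Next if $k$ divides $j$ then using (1.5) we have
$$d\left( X^{(j)}_{t/j}(x),\ X^{(j/k)}_{kt/j}(x) \right) \le e^{t\Lambda} t^2 \Omega\, k/j \tag{1.6}$$
since
$$\begin{aligned}
d\left( X^{(j)}_{t/j}(x),\ X^{(j/k)}_{kt/j}(x) \right) &= d\left( X^{(k)}_{t/j} X^{(k[j/k-1])}_{t/j}(x),\ X_{kt/j} X^{(j/k-1)}_{kt/j}(x) \right) \\
&\le \left[ d\left( X^{(k[j/k-1])}_{t/j}(x),\ X^{(j/k-1)}_{kt/j}(x) \right) + \left( \tfrac{kt}{j} \right)^2 \Omega \right] e^{(kt/j)\Lambda} \\
&\le \dots \\
&\le \left[ \dots \left[ \left[ d\left( X^{(0)}_{t/j}(x),\ X^{(0)}_{kt/j}(x) \right) + \left( \tfrac{kt}{j} \right)^2 \Omega \right] e^{(kt/j)\Lambda} + \left( \tfrac{kt}{j} \right)^2 \Omega \right] e^{(kt/j)\Lambda} + \dots + \left( \tfrac{kt}{j} \right)^2 \Omega \right] e^{(kt/j)\Lambda}
\end{aligned}$$
where the sum is taken $j/k$ times, and since $d\left( X^{(0)}_{t/j}(x), X^{(0)}_{kt/j}(x) \right) = 0$ the above is
$$= \left( \frac{kt}{j} \right)^2 \Omega \sum_{i=1}^{j/k} e^{i(kt/j)\Lambda} \le \left( \frac{kt}{j} \right)^2 \Omega\, \frac{j}{k}\, e^{t\Lambda} = t^2 \Omega e^{t\Lambda} \frac{k}{j}.$$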
(1.6) is useful because it gives us
$$\lim_{j \to \infty} X^{(j)}_{t/j}(x) = \lim_{j \to \infty} X^{(j/k)}_{kt/j}(x)$$
or better put
$$\lim_{j \to \infty} X^{(kj)}_{t/j}(x) = \lim_{j \to \infty} X^{(j)}_{kt/j}(x) \tag{1.7}$$
where $k$ can be any function of $j$ and $t$ as long as $k/j \to 0$ as $j \to \infty$.
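For each $n \in \mathbb{N}$, choose $i(n) \in \mathbb{N}$ such that $n/i(n) \to 0$ as $n \to \infty$ (for example, choose $i(n) = n^2$), and let $j(i,n), k(i,n) \in \mathbb{N}$ be chosen so
$$\left| j\, \frac{s+t}{2^i} - \frac{s}{2^n} \right| < \frac{|s+t|}{2^i} \quad \text{and} \quad \left| k\, \frac{s+t}{2^i} - \frac{t}{2^n} \right| < \frac{|s+t|}{2^i}$$
so
$$\left| j\, \frac{s+t}{2^i} + k\, \frac{s+t}{2^i} - \left( \frac{s}{2^n} + \frac{t}{2^n} \right) \right| < 2\, \frac{|s+t|}{2^i}$$
which implies
$$\left| (j+k) - 2^{i-n} \right| < 2$$
so
$$j + k = 2^{i-n} + \delta(n)$$
where $|\delta(n)| < 2$. Therefore
$$\begin{aligned}
\sigma_x(s+t) &= \lim_{n \to \infty} X^{(2^n)}_{(s+t)/2^n}(x) = \lim_{i \to \infty} X^{(2^i)}_{\frac{s+t}{2^i}}(x) = \lim_{i \to \infty} X^{(2^n[j+k])}_{\frac{s+t}{2^i}}(x) \\
&= \lim_{i \to \infty} X^{(2^n j)}_{\frac{s+t}{2^i}} X^{(2^n k)}_{\frac{s+t}{2^i}}(x) \quad \text{and using (1.7) this is} \\
&= \lim_{i \to \infty} X^{(2^n)}_{j \frac{s+t}{2^i}} X^{(2^n)}_{k \frac{s+t}{2^i}}(x) = \lim_{n \to \infty} X^{(2^n)}_{t/2^n} X^{(2^n)}_{s/2^n}(x) \\
&= \lim_{n \to \infty} X^{(2^n)}_{t/2^n} \left( \lim_{n \to \infty} X^{(2^n)}_{s/2^n}(x) \right) = \sigma_{\sigma_x(s)}(t).
\end{aligned}$$
This completes the proof that solutions exist which are locally uniformly 2nd-order tangent to $X$. The proof of uniqueness follows from Theorem 16 below; see Remark 17.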
Remark 13 Theorem 12 has a simple corollary showing the well-posedness of time-dependent dynamics following the exact same idea for time-dependent vector fields on a manifold. Simply consider a time-independent arc field on $M \times \mathbb{R}$, namely $((x,t),h) \mapsto (X_{x,t}(h), t+h)$ in $M \times \mathbb{R}$, and project solutions onto the $M$ factor.
2nd-order differential equations can be rewritten with 2nd-order vector fields. A 2nd-order arc field is a straightforward generalization with well-posedness a simple corollary of Theorem 12 (see [16] for details).
With a little extra effort Theorem 12 and those which follow are true in even greater generality, and the reader is encouraged to study the work in, e.g., [52], [7] and [18]. But in the examples throughout this book the stronger conditions E1 and E2 are satisfied and are Easier to use.
The above proof actually gives a result stronger than the statement of the theorem which will be frequently useful:
Corollary 14 Assuming E1 and E2, the solutions $\sigma$ are locally uniformly 2nd-order tangent to $X$ in the variable $x$, i.e.,
$$d(X_x(t), \sigma_x(t)) = O(t^2)$$
locally uniformly for $x \in M$; i.e., for each $x_0 \in M$ there exist positive constants $r, \delta, T > 0$ such that for all $x \in B(x_0, r)$
$$d(X_x(t), \sigma_x(t)) \le t^2 T$$
whenever $|t| < \delta$.
Proof. This was established at line (1.4).
Denote local uniform tangency of two arc fields $X$ and $Y$ by $X \sim Y$ and local uniform 2nd-order tangency by $X \approx Y$. It is easy to check $\sim$ and $\approx$ are equivalence relations. E.g., transitivity follows from the triangle inequality:
$$d(X_t(x), Z_t(x)) \le d(X_t(x), Y_t(x)) + d(Y_t(x), Z_t(x)).$$
We use the symbols $\sim$ and $\approx$ in many contexts in this monograph (particularly §3.4), and always with an associated local-uniform-tangency property.
Further, the proof of Theorem 12 gives us another useful fact we will subsequently need:
Corollary 15 Assuming E1 and E2, the solutions $\sigma$ are tangent uniformly over all arc fields $X$ which satisfy E1 and E2 for specified $\Lambda$ and $\Omega$.
Proof. This was also established at line (1.4).
Also notice the proof used only the weaker property $s = t$ and not the more general $|t| \le |s|$ from Condition E2 to prove the Euler curves are Cauchy. The full assumption was used to prove the solution is tangent to the arc field.
Theorem 16 Let $\sigma_x : (\alpha_x, \beta_x) \to M$ and $\sigma_y : (\alpha_y, \beta_y) \to M$ be two solutions to an arc field $X$ which satisfies E1. Assume $(\alpha_x, \beta_x) \cap (\alpha_y, \beta_y) \supset I$ for some interval $I$ containing 0, and assume E1 holds uniformly with $\Lambda$ on a set containing
\[ \{\sigma_x(t) \mid t \in I\} \cup \{\sigma_y(t) \mid t \in I\}. \]
Then
\[ d(\sigma_x(t), \sigma_y(t)) \le e^{\Lambda |t|} d(x, y) \quad \text{for all } t \in I. \]
Proof. We check $t \ge 0$, the case $t < 0$ being similar. Let
\[ g(t) = e^{-\Lambda t} d(\sigma_x(t), \sigma_y(t)). \]
For $h \ge 0$, we have
\[
\begin{aligned}
g(t+h) - g(t)
&= e^{-\Lambda(t+h)} d(\sigma_x(t+h), \sigma_y(t+h)) - e^{-\Lambda t} d(\sigma_x(t), \sigma_y(t)) \\
&= e^{-\Lambda(t+h)} \bigl( d(X_h(\sigma_x(t)), X_h(\sigma_y(t))) + o(h) \bigr) - e^{-\Lambda t} d(\sigma_x(t), \sigma_y(t)) \\
&\le e^{-\Lambda t} e^{-\Lambda h} d(\sigma_x(t), \sigma_y(t)) (1 + \Lambda h) - e^{-\Lambda t} d(\sigma_x(t), \sigma_y(t)) + o(h) \\
&= \bigl[ e^{-\Lambda h}(1 + \Lambda h) - 1 \bigr] e^{-\Lambda t} d(\sigma_x(t), \sigma_y(t)) + o(h) \\
&= o(h)\, e^{-\Lambda t} d(\sigma_x(t), \sigma_y(t)) + o(h) = o(h) (g(t) + 1).
\end{aligned}
\]
Hence, the upper forward derivative of $g(t)$ is nonpositive; i.e.,
\[ D^+ g(t) := \limsup_{h \to 0^+} \frac{g(t+h) - g(t)}{h} \le 0. \]
Consequently, $g(t) \le g(0)$ or
\[ d(\sigma_x(t), \sigma_y(t)) \le e^{\Lambda t} d(\sigma_x(0), \sigma_y(0)) = e^{\Lambda t} d(x, y). \]
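The exponential bound can be checked numerically in a simple case. The following is only a sketch under stated assumptions: it takes $M = \mathbb{R}$ with the usual distance and the illustrative arc field $X_h(x) = x + h \sin x$, for which E1 holds with $\Lambda = 1$ because $\sin$ is 1-Lipschitz. The function names are hypothetical, and the solution is approximated by a dyadic Euler curve as in the proof of Theorem 12.

```python
import math

LAMBDA = 1.0  # Lipschitz constant of sin, so E1 holds with Lambda = 1


def arc(x, h):
    """Illustrative arc field X_h(x) = x + h*sin(x) on M = R."""
    return x + h * math.sin(x)


def euler_curve(x, t, n=14):
    """Approximate sigma_x(t) by the dyadic Euler curve X^{(2^n)}_{t/2^n}(x)."""
    steps = 2 ** n
    h = t / steps
    for _ in range(steps):
        x = arc(x, h)
    return x


def divergence_ratio(x, y, t):
    """Return d(sigma_x(t), sigma_y(t)) / (e^{Lambda t} d(x, y)) for t >= 0."""
    gap = abs(euler_curve(x, t) - euler_curve(y, t))
    return gap / (math.exp(LAMBDA * t) * abs(x - y))


if __name__ == "__main__":
    for t in (0.25, 0.5, 1.0, 2.0):
        print(t, divergence_ratio(0.3, 0.4, t))
```

Under these assumptions the printed ratios should not exceed 1, since each Euler step stretches distances by at most $1 + h\Lambda$.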
Theorem 16 says solutions locally diverge at most exponentially, which is the most useful result we have for proving regularity of flows. When $I$ is compact,
\[ \{\sigma_x(t) \mid t \in I\} \cup \{\sigma_y(t) \mid t \in I\} \]
is compact since $\sigma_x$ and $\sigma_y$ are continuous, and so it is often easy to find a uniform bound $\Lambda$ for E1 on the set.
Remark 17 Uniqueness of solutions in Theorem 12 has the same meaning as in classical ODE theory:
(1.) Any two solutions $\sigma^1_x : (\alpha^1_x, \beta^1_x) \to M$ and $\sigma^2_x : (\alpha^2_x, \beta^2_x) \to M$ with initial condition $x$ have $\sigma^1_x(t) = \sigma^2_x(t)$ for all $t \in (\alpha^1_x, \beta^1_x) \cap (\alpha^2_x, \beta^2_x)$, and
(2.) There exists a solution $\sigma_x : (\alpha_x, \omega_x) \to M$ with maximal domain, in the sense that any other solution $\bar\sigma_x : (\bar\alpha_x, \bar\omega_x) \to M$ has $(\bar\alpha_x, \bar\omega_x) \subset (\alpha_x, \omega_x)$.
Choosing $x = y$ in Theorem 16 establishes (1.) for a small interval containing the origin. The exact same extension argument as in ODEs then establishes (1.) and (2.) fully (cf. practically any text introducing ODEs, e.g., [39]). The maximal interval $(\alpha_x, \omega_x)$ described in (2.) is the union of the domains of all solutions with initial condition $x$.
Example 18 **Good spot for the nonunique solutions example $x' = \sqrt{x}$. This example indicates how E1 and E2 cannot be weakened too much if we want to guarantee a general well-posedness result.**
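As a hedged sketch of the classical nonuniqueness phenomenon for $x' = \sqrt{x}$ with $x(0) = 0$: both $x(t) = 0$ and $x(t) = t^2/4$ solve the equation for $t \ge 0$. The code below checks this, and also evaluates the E1 difference quotient for the assumed arc field $X_t(x) = x + t\sqrt{x}$ (an illustrative choice, not taken from the text), which equals $1/\sqrt{x}$ and so admits no uniform $\Lambda$ near $0$. All names are hypothetical.

```python
import math


def f(x):
    """Right-hand side of x' = sqrt(x), defined for x >= 0."""
    return math.sqrt(x)


def residual(sol, dsol, t):
    """|x'(t) - sqrt(x(t))| for a candidate solution and its derivative."""
    return abs(dsol(t) - f(sol(t)))


# Two different solutions through the same initial condition x(0) = 0.
zero_solution = (lambda t: 0.0, lambda t: 0.0)
parabola_solution = (lambda t: t * t / 4.0, lambda t: t / 2.0)


def e1_quotient(x, t):
    """(d(X_t(x), X_t(0)) / d(x, 0) - 1) / t for X_t(x) = x + t*sqrt(x), x > 0."""
    return ((x + t * f(x)) / x - 1.0) / t
```

Evaluating `e1_quotient` at smaller and smaller `x` shows the quotient growing without bound, which is how E1 fails in this sketch.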
Remark 19 Theorem 16 gives uniqueness of solutions for any arc field which satisfies E1 alone. E2 is only used to prove general existence, but E2 is typically the more difficult condition to verify, so if we can verify solutions exist in some other manner (perhaps directly calculating the limit of Euler curves, as in Example 100) E1 is sufficient.
Theorem 16 also gives an easy proof of a very general Nagumo-type invariant set theorem, Theorem 33 below in §1.3.
Notice in the proof of Theorem 12 the Euler curves were defined with nodes spaced at a distance of $t/2^n$. This was for convenience. The simpler expression
\[ \lim_{n \to \infty} X^{(n)}_{t/n}(x) = \sigma_x(t) \tag{1.8} \]
may also be verified, but we won't present the more tedious analysis.
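To make the limit (1.8) concrete, here is a small sketch under the assumption $M = \mathbb{R}$ and $X_h(x) = x + hx$ (an illustrative arc field whose flow is $x e^{t}$). Then $X^{(n)}_{t/n}(x) = x(1 + t/n)^n$, and the error against the exact flow shrinks as $n$ grows. The function names are hypothetical.

```python
import math


def arc(x, h):
    """Illustrative arc field X_h(x) = x + h*x on M = R."""
    return x + h * x


def euler(x, t, n):
    """n-fold composition X^{(n)}_{t/n}(x)."""
    h = t / n
    for _ in range(n):
        x = arc(x, h)
    return x


def errors(x, t, ns):
    """Distance from each Euler approximation to the exact flow x*e^t."""
    exact = x * math.exp(t)
    return [abs(euler(x, t, n) - exact) for n in ns]


if __name__ == "__main__":
    print(errors(1.0, 1.0, [10, 100, 1000]))        # nodes spaced t/n
    print(errors(1.0, 1.0, [2 ** k for k in (4, 7, 10)]))  # dyadic nodes t/2^k
```

Both node spacings give the same limit in this example; the dyadic choice is just the subsequence $n = 2^k$.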
Yet a third definition of Euler curves for any real number $r > 0$ is common: for $i, n \in \mathbb{N}$ define
\[
\xi_{r,n}(t) :=
\begin{cases}
X_{t - i \cdot r2^{-n}}\, X^{(i)}_{r2^{-n}}(x) & i\, r2^{-n} \le t \le (i+1)\, r2^{-n} \\
X_{t + i \cdot r2^{-n}}\, X^{(i)}_{-r2^{-n}}(x) & -(i+1)\, r2^{-n} \le t \le -i\, r2^{-n}.
\end{cases}
\]
This concatenation of arcs is more complicated notationally, but more intuitively compelling, and is in introductory texts on differential equations. Again $\xi_{r,n} \to \sigma_x$ as $n \to \infty$ as was proven in [18] with $r = 1$ to verify well-posedness under commensurate conditions. Notice
\[ \xi_{t,n}(t) = X^{(2^n)}_{t/2^n}(x) \]
since $t = 2^n\, t 2^{-n}$.
1.1.2 Local ﬂows
From now on $(\alpha_x, \omega_x)$ will denote the maximal domain with initial condition $x$.
Corollary 20 Assume the conditions of Theorem 12 and let $s \in (\alpha_x, \omega_x)$. Then $\alpha_{\sigma_x(s)} = \alpha_x - s$ and $\omega_{\sigma_x(s)} = \omega_x - s$. Thus $t \in (\alpha_{\sigma_x(s)}, \omega_{\sigma_x(s)})$ if and only if $t + s \in (\alpha_x, \omega_x)$, and then we have
\[ \sigma_{\sigma_x(s)}(t) = \sigma_x(s + t). \]
Define $W \subset M \times \mathbb{R}$ by
\[ W := \{(x, t) \in M \times \mathbb{R} \mid t \in (\alpha_x, \omega_x)\} \quad \text{and} \quad F : W \to M \ \text{ by } \ F(x, t) := \sigma_x(t). \tag{1.9} \]
Then
(i) $M \times \{0\} \subset W$ and $F(x, 0) = x$ for all $x \in M$ (identity at 0 property)
(ii) $F(F(x, s), t) = F(x, s + t)$ (1-parameter local group property)
(iii) For each (fixed) $x \in M$, $F(x, \cdot) : (\alpha_x, \omega_x) \to M$ is the maximal solution $\sigma_x$ to $X$.
The map $F$ is called the local flow of $X$.
Compare Condition E2 with (ii) above to see why an arc field might be described as a “pre-flow”.
Theorem 16 says if $F$ is the local flow of an arc field $X$ which satisfies Condition E1 with uniform constant $\Lambda_X$ then
\[ d(F_t(x), F_t(y)) \le e^{\Lambda_X |t|} d(x, y). \tag{1.10} \]
Thus $F(x, t)$ is continuous in $x$. Notice $e^{\Lambda_X |t|} = 1 + \Lambda_X |t| + O(t^2)$ and compare Condition E1 with line (1.10) to see why E1 may be thought of as a local linearity property for $X$, needed for the continuity of $F$. Now let's check continuity in the other variable, $t$:
Lemma 21 Suppose $c > 0$ and $\sigma : (-c, c) \to M$ is a solution curve of $X$. Assume the speed of $X$ is bounded by $\rho \in [0, \infty)$ on $\sigma((-c, c))$. Then the speed of $\sigma$ is also bounded by $\rho$.
Proof. First let $t \ge 0$. For $-c < t_0 \le t_0 + t < c$, let
\[ f(t) := d(\sigma(t_0 + t), \sigma(t_0)) - \rho t. \]
Since $f(0) = 0$ we wish to show $D^+ f(t) \le 0$, since then $f(t) \le 0$ and we will then know
\[ d(\sigma(t_0 + t), \sigma(t_0)) \le \rho t \]
as desired.
\[
\begin{aligned}
f(t+h) - f(t) &= d(\sigma(t_0 + t + h), \sigma(t_0)) - d(\sigma(t_0 + t), \sigma(t_0)) - \rho h \\
&\le d(\sigma(t_0 + t + h), \sigma(t_0 + t)) - \rho h \\
&\le d(\sigma(t_0 + t + h), X_h(\sigma(t_0 + t))) + d(X_h(\sigma(t_0 + t)), \sigma(t_0 + t)) - \rho h \\
&= o(h) + d(X_h(\sigma(t_0 + t)), X_0(\sigma(t_0 + t))) - \rho h \\
&\le o(h) + \rho h - \rho h = o(h).
\end{aligned}
\]
Checking $d(\sigma(t_0 + t), \sigma(t_0)) \le \rho |t|$ for $t < 0$ is similar, mutatis mutandis.
Theorem 22 For $F$ and $W$ as above (1.9) we have $W$ open in $M \times \mathbb{R}$ and $F$ continuous on $W$.
Proof. Continuity is easy to check using Theorem 16 and Lemma 21 on the separate variables, once we've established a proper environment on which their assumptions are satisfied. So we first check $W$ is open by showing for any $(x_0, t_0) \in W$ there is a neighborhood $V$ of $x_0$ and $\epsilon > 0$ such that $V \times (t_0 - \epsilon, t_0 + \epsilon) \subset W$. Define $x_1 := \sigma_{x_0}(t_0)$. Since $X$ has locally bounded speed there exists $r > 0$ such that $\rho := \rho(x_1, r) < \infty$, and so for any $x \in B(x_1, r/2\rho)$ we have $\sigma_x(t) \in B(x_1, r)$ for $|t| < \frac{r}{2\rho}$. Consequently $B\bigl(x_1, \frac{r}{2\rho}\bigr) \times \bigl(-\frac{r}{2\rho}, \frac{r}{2\rho}\bigr) \subset W$.
Now the rest of the proof follows the idea that there is a small enough neighborhood $V$ of $x_0$ such that $F(V, t_0) \subset B(x_1, r/2\rho)$ by Theorem 16, which guarantees $V \times \bigl(t_0 - \frac{r}{2\rho}, t_0 + \frac{r}{2\rho}\bigr) \subset W$ since
\[ F\Bigl(F(V, t_0), \bigl(-\tfrac{r}{2\rho}, \tfrac{r}{2\rho}\bigr)\Bigr) = F\Bigl(V, \bigl(t_0 - \tfrac{r}{2\rho}, t_0 + \tfrac{r}{2\rho}\bigr)\Bigr) \]
by the local group property. Theorem 16 requires only there be a set on which E1 is satisfied uniformly by some $\Lambda > 0$. Then $V := B\bigl(x_0, e^{-t_0 \Lambda} \frac{r}{4\rho}\bigr)$ is sufficient. Now to show the set for Theorem 16 exists. Notice $\bigl[0, t_0 + \frac{r}{2\rho}\bigr]$ is compact and so its continuous image $\sigma_{x_0}\bigl(\bigl[0, t_0 + \frac{r}{2\rho}\bigr]\bigr) \subset M$ is compact. For each $t \in \bigl[0, t_0 + \frac{r}{2\rho}\bigr]$ there is a ball $B(\sigma_{x_0}(t), r_t) \subset M$ with $r_t > 0$ on which Condition E1 is satisfied with $\Lambda_t$. These neighborhoods cover $\sigma_{x_0}\bigl(\bigl[0, t_0 + \frac{r}{2\rho}\bigr]\bigr)$, so there is a finite subcover $\{B(\sigma_{x_0}(t_i), r_{t_i}) \mid i = 1, \dots, n\}$. Let $\Lambda := \max\{\Lambda_i \mid i = 1, \dots, n\}$, let $U := \bigcup_{i=1}^{n} B(\sigma_{x_0}(t_i), r_{t_i})$ and let $M \setminus U$ denote the set complement. The function $f : \bigl[0, t_0 + \frac{r}{2\rho}\bigr] \to \mathbb{R}$ defined by $f(t) := d(\sigma_{x_0}(t), M \setminus U)$ is positive and continuous on a compact domain and so has a minimum $m > 0$. Therefore any $y \in \sigma_{x_0}\bigl(\bigl[0, t_0 + \frac{r}{2\rho}\bigr]\bigr)$ has a neighborhood ball $B(y, m) \subset U$ and therefore any solution curve which stays within a distance of $m$ of the path $\sigma_{x_0}\bigl(\bigl[0, t_0 + \frac{r}{2\rho}\bigr]\bigr)$ will have a uniform $\Lambda$ satisfying E1. Therefore Theorem 16 applies and we can choose $V := B\bigl(x_0, e^{-t_0 \Lambda} \frac{r}{4\rho}\bigr)$ as explained above, giving $(x_0, t_0) \in V \times \bigl(t_0 - \frac{r}{2\rho}, t_0 + \frac{r}{2\rho}\bigr) \subset W$ and $W$ is open. (In fact we have proven $V \times \bigl[0, t_0 + \frac{r}{2\rho}\bigr) \subset W$.)
Now proving continuity is easy. Since $X$ is an arc field, it has locally bounded speed and there exists $r > 0$ and a local bound on speed $\rho := \rho(\sigma_{x_0}(t_0), r) < \infty$ for $X_y$ for all $y \in B(\sigma_{x_0}(t_0), r)$; in particular Lemma 21 implies the speed of $\sigma_{x_0}(t)$ is bounded by $\rho$ for all $t$ with $|t - t_0| < \frac{r}{2\rho}$. Using Theorem 16 (on the set constructed in the previous paragraph for which $\Lambda$ is uniform) and Lemma 21, as $(x, t) \to (x_0, t_0)$ we have
\[
\begin{aligned}
d(F(x, t), F(x_0, t_0)) &= d(\sigma_x(t), \sigma_{x_0}(t_0)) \le d(\sigma_x(t), \sigma_{x_0}(t)) + d(\sigma_{x_0}(t_0), \sigma_{x_0}(t)) \\
&\le e^{\Lambda(t_0 + 1)} d(x_0, x) + \rho(\sigma_{x_0}(t_0), r)\, |t_0 - t| \to 0.
\end{aligned}
\]
For fixed $t$ it is clear $F_t$ is a local lipeomorphism, when defined, by Theorem 16.
1.1.3 Global ﬂows
We now investigate conditions which guarantee local flows are in fact “global”, i.e., $(\alpha_x, \omega_x) = \mathbb{R}$ for all $x \in M$. To achieve this, we mimic ODE theory.
Example 23 Consider the classic elementary example of “quadratic” speed growth
\[ x' = f(x) \]
where $f : \mathbb{R} \to \mathbb{R}$ is the (locally Lipschitz) vector field given by $f(x) = x^2$ which has solutions $x(t) := \frac{x_0}{1 - t x_0}$ so that when the initial condition is $x(0) = x_0 \ne 0$, the solutions “blow up” at time $t = 1/x_0$. The vector field $f(x) = x^2$ grows too quickly as solutions $x$ grow, sending $x$ to $\infty$ in finite time.
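A quick numerical sanity check of this blow-up, offered as a sketch only: the code evaluates the closed-form solution from the example, confirms by a central difference that it satisfies $x' = x^2$, and shows the values growing as $t$ approaches $1/x_0$. The function names and the step size `h` are illustrative assumptions.

```python
def exact(x0, t):
    """Solution x(t) = x0 / (1 - t*x0) of x' = x^2, valid while t*x0 < 1."""
    return x0 / (1.0 - t * x0)


def blow_up_time(x0):
    """Finite escape time 1/x0 for an initial condition x0 > 0."""
    return 1.0 / x0


def ode_residual(x0, t, h=1e-6):
    """Central-difference estimate of |x'(t) - x(t)^2|."""
    dx = (exact(x0, t + h) - exact(x0, t - h)) / (2.0 * h)
    return abs(dx - exact(x0, t) ** 2)


if __name__ == "__main__":
    x0 = 2.0
    for t in (0.0, 0.25, 0.4, 0.49, 0.499):
        print(t, exact(x0, t))  # grows without bound as t -> blow_up_time(x0)
```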
To guarantee globally deﬁned ﬂows, ﬁrst the space cannot have holes, i.e.,
M must be complete. Secondly we must limit the magnitude of the vector ﬁeld
to prevent the situation in Example 23, which inspires the following
Definition 24 An arc field $X$ on a metric space $M$ is said to have linear speed growth if there is a point $x \in M$ and positive constants $c_1$ and $c_2$ such that for all $r > 0$
\[ \rho(x, r) \le c_1 r + c_2, \tag{1.11} \]
where $\rho(x, r)$ is the local bound on speed given in Definition 10.
If $y$ is any other point in $M$ then $B(y, r) \subseteq B(x, d(x, y) + r)$. Thus,
\[
\begin{aligned}
\rho(y, r) \le \rho(x, d(x, y) + r) &\le c_1(x) (d(x, y) + r) + c_2(x) \\
&= c_1(x)\, r + (c_1(x)\, d(x, y) + c_2(x)).
\end{aligned}
\]
Hence, if the relation (1.11) holds for a point $x$, then for any other $y \in M$ we also have
\[ \rho(y, r) \le c_1(y)\, r + c_2(y), \tag{1.12} \]
where $c_1(y) = c_1(x)$ and $c_2(y) = c_1(x)\, d(x, y) + c_2(x)$.
Theorem 25 Let $X$ be an arc field on a complete metric space $M$, which satisfies E1 and E2 and has linear speed growth. Then $F$ has domain $W = M \times \mathbb{R}$, i.e., $F$ is a flow.
Proof. A similar proof in this context of metric spaces appears in [18]. Most other proofs on manifolds can be easily transferred to our current situation.
Assume $t \ge 0$ (the case $t < 0$ being similar). Then for any partition $0 = t_1 < t_2 < \dots < t_{n+1} = t$ of $[0, t]$ we have
\[ d(\sigma_x(t), x) \le d(\sigma_x(t), \sigma_x(0)) \le \sum_{i=1}^{n} d(\sigma_x(t_i), \sigma_x(t_{i+1})) \le \sum_{i=1}^{n} \rho(\sigma_x(s_i))\, |t_{i+1} - t_i| \]
for some choice of $s_i \in [t_i, t_{i+1}]$ which leads to
\[ d(\sigma_x(t), x) \le \int_0^t \rho(\sigma_x(s))\, ds \le \int_0^t c_1(x)\, d(\sigma_x(s), x) + c_2(x)\, ds. \]
In other words, for $f(t) := d(\sigma_x(t), x)$ we wish to use the inequality
\[ f(t) \le \int_0^t c_1 f(s) + c_2\, ds \tag{1.13} \]
to bound $f$. In fact (1.13) gives
\[ D^+ f(t) := \limsup_{h \to 0^+} \frac{f(t + h) - f(t)}{h} \le c_1 f(t) + c_2. \]
As motivation, the solution of $x'(t) = c_1 x(t) + c_2$ satisfying $x(0) = 0$ is $x(t) = \frac{c_2}{c_1}(e^{c_1 t} - 1)$. Since we expect $f'(t) \le x'(t)$ and $f(0) = 0$, we expect $f$ to grow at most exponentially. Then assuming the domain $(\alpha_x, \omega_x)$ has $\omega_x < \infty$ we would have by continuity and the boundedness of $f$ that $\sigma_x$ can be continued to have $\sigma_x(\omega_x) \in M$. But then the fundamental theorem allows us to continue $\sigma_x$ beyond $\omega_x$ giving us the contradiction.
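The expected exponential bound can be made precise by the standard Grönwall argument. The following derivation is a sketch, assuming only (1.13), $c_1 > 0$, and the continuity of $f$ (which holds since $\sigma_x$ is continuous); the auxiliary function $u$ is introduced here for illustration.

```latex
% Auxiliary function: the right-hand side of (1.13).
u(t) := \int_0^t \bigl( c_1 f(s) + c_2 \bigr)\, ds, \qquad u(0) = 0, \qquad f(t) \le u(t).

% u is differentiable because f is continuous, and f <= u with c_1 > 0 gives
u'(t) = c_1 f(t) + c_2 \le c_1 u(t) + c_2 .

% Multiply by the integrating factor e^{-c_1 t}:
\frac{d}{dt}\Bigl( e^{-c_1 t} u(t) \Bigr) = e^{-c_1 t}\bigl( u'(t) - c_1 u(t) \bigr) \le c_2\, e^{-c_1 t} .

% Integrate from 0 to t and use u(0) = 0:
e^{-c_1 t} u(t) \le \frac{c_2}{c_1}\bigl( 1 - e^{-c_1 t} \bigr)
\quad\Longrightarrow\quad
f(t) \le u(t) \le \frac{c_2}{c_1}\bigl( e^{c_1 t} - 1 \bigr).
```

In particular $f$ stays bounded on any bounded time interval, which is the boundedness used in the continuation step.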
A ﬂow is sometimes called a full ﬂow, or a global ﬂow, or a complete
ﬂow to distinguish it from a local ﬂow. Since local ﬂows are continuous—and
continuity is a local property—full ﬂows are continuous.
Example 26 The support of an arc field $X$ is the closure of the set $S := \{x \in M \mid X \nsim 0\}$. Here 0 is the constant arc field $0_x(t) = x$. Assuming E1 and E2 on a locally complete space $M$, it is easy to see that when the support of $X$ is compact, the flow $F$ is complete; in particular if $M$ is compact all such $X$ give complete flows.
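Example 27 Every local flow on a metric space is generated by an arc field. Any local flow $F$ gives rise to an arc field $\overline{F} : M \times [-1, 1] \to M$ defined by
\[
\overline{F}(x, t) :=
\begin{cases}
F(x, t) & \text{if } t \in \bigl(\frac{\alpha_x}{2}, \frac{\omega_x}{2}\bigr) \\
F\bigl(x, \frac{\alpha_x}{2}\bigr) & \text{if } t \in \bigl[-1, \frac{\alpha_x}{2}\bigr] \\
F\bigl(x, \frac{\omega_x}{2}\bigr) & \text{if } t \in \bigl[\frac{\omega_x}{2}, 1\bigr].
\end{cases}
\]
The issue here is that $F$, being a local flow, may have $[-1, 1] \not\subset (\alpha_x, \omega_x)$, so we have to be careful at the endpoints. Clearly the local flow generated by $\overline{F}$ is $F$. Since all our concerns with arc fields are local, we will never focus on $t \notin \bigl(\frac{\alpha_x}{2}, \frac{\omega_x}{2}\bigr)$ and henceforth we will not notationally distinguish between $\overline{F}$ and $F$ as arc fields.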
With this identification of flows being arc fields (but not usually vice versa) we may simplify Corollary 14 to:
Corollary 28 $X \approx F$ if $X$ satisfies E1 and E2.
Examples relevant to this chapter occur in Chapter 7 and Example 100.
1.2 Forward ﬂows and ﬁxed points
In many applications the solution of a differential equation or vector field is not defined for $t < 0$. For example, diffusion phenomena are usually only tractable forward in time. In this case we work with forward flows (also called semi-flows, or in the context of operators on Banach spaces, semigroups). We list here the minor modifications to the above theory for this more general situation, then prove a simple fixed point theorem. We don't bother to stress much new forward-specific terminology, as it should be clear from context whether we mean forward or bidirectional in any examples.
Change the domain of arcs on $M$ from $c : [-1, 1] \to M$ to $c : [0, 1] \to M$ and similarly replace $[-1, 1]$ with $[0, 1]$ everywhere it occurs, e.g., (forward) arc fields are defined as maps $X : M \times [0, 1] \to M$. Solutions $\sigma_x : [0, \beta_x) \to M$ are forward tangent to $X$ defined by
\[ \lim_{h \to 0^+} \frac{d(\sigma(t + h), X(\sigma(t), h))}{h} = 0, \]
i.e., $t$ and $h$ are restricted to positive values. We explicitly spell out the minor changes to Conditions E1 and E2 since a new possibility of allowing negative $\Lambda_X$ will prove to be useful.
Condition E1: For each $x_0 \in M$, there are constants $r > 0$, $\delta > 0$ and $\Lambda \in \mathbb{R}$ such that for all $x, y \in B(x_0, r)$ and $t \in [0, \delta)$
\[ d(X_t(x), X_t(y)) \le d(x, y)\, (1 + t\Lambda). \]
Condition E2:
\[ d(X_{s+t}(x), X_t(X_s(x))) = O(st) \]
for $0 \le t \le s$ as $s \to 0$, locally uniformly in $x$.
Corollary 29 Let $X$ be an arc field satisfying E1 and E2 on a locally complete metric space $M$. Then given any point $x \in M$, there exists a unique solution $\sigma_x : [0, \omega_x) \to M$ with initial condition $\sigma_x(0) = x$.
Proof. Follow the proof of Theorem 12.
Corollary 30 Let $X$ be an arc field on a complete metric space $M$, which satisfies E1 and E2. Solutions $\sigma_x : [0, \omega_x) \to M$ satisfy
\[ \sigma_{\sigma_x(s)}(t) = \sigma_x(s + t) \]
for $s, t \ge 0$ and $s + t < \omega_x$. Defining $W \subset M \times \mathbb{R}_+$ by
\[ W := \{(x, t) \mid t \in [0, \omega_x)\} \quad \text{and} \quad F : W \to M \ \text{ by } \ F(x, t) := \sigma_x(t) \]
we know $F$ is continuous and
(i) $M \times \{0\} \subset W$ and $F(x, 0) = x$ for all $x \in M$ (identity at 0 property)
(ii) $F(F(x, s), t) = F(x, s + t)$ (1-parameter local semigroup property)
(iii) For each $x \in M$, $F(x, \cdot) : [0, \omega_x) \to M$ is the maximal solution $\sigma_x$ to $X$.
If in addition $X$ has linear speed growth, then $F$ has domain $W = M \times \mathbb{R}_+$, i.e., $F$ is a forward flow.
Proof. Use Corollary 29 and adapt the proof of Theorem 22.
Theorem 31 Let $X$ be an arc field on a complete metric space $M$, which has linear speed growth and satisfies Conditions E1 and E2, with $\Lambda < 0$ uniformly valid for all of $M$. Then the forward flow $F : M \times [0, \infty) \to M$ of $X$ has a unique fixed point. That is, there exists $p \in M$, such that for all $t \ge 0$, $F(p, t) = p$, and if $F(x, t_0) = x$ for some $t_0 > 0$, then $x = p$. Furthermore, the flow converges to the fixed point exponentially:
\[ d(F(x, t), p) = d(F(x, t), F(p, t)) \le e^{\Lambda t} d(x, p). \]
Proof. Theorem 16 is valid mutatis mutandis and gives
\[ d(F(a, t), F(b, t)) \le e^{\Lambda t} d(a, b). \]
Thus, $F_t := F(\cdot, t)$ is a contraction mapping for $t > 0$ on $M$, and therefore has a unique fixed point, say $p_t$, by the Contraction Mapping Theorem (Theorem 119, Appendix A.1). Note $p_t$ is a continuous function of $t$, since
\[
\begin{aligned}
d(p_t, p_{t'}) &= d(F(p_t, t), F(p_{t'}, t')) \\
&\le d(F(p_t, t), F(p_{t'}, t)) + d(F(p_{t'}, t), F(p_{t'}, t')) \\
&\le e^{\Lambda t} d(p_t, p_{t'}) + d(F(p_{t'}, t), F(p_{t'}, t')) \\
\Rightarrow \quad d(p_t, p_{t'}) &\le \frac{d(F(p_{t'}, t), F(p_{t'}, t'))}{1 - e^{\Lambda t}} \to 0 \quad \text{as } t' \to t
\end{aligned}
\]
by the continuity of $F$. The 1-parameter local semigroup property of $F$ gives
\[ p_t = F(p_t, t) = F(F(p_t, t), t) = F(p_t, 2t) = \dots = F(p_t, nt) \]
for any positive integer $n$. Hence, $p_{nt} = p_t$ and further $p_{i/j} = p_i = p_1$ for all positive integers $i$ and $j$. Since $t \mapsto p_t$ is continuous and constant on the positive rationals, $p_t = p_1$ for all $t > 0$.
See Chapter 7 for examples in $H(\mathbb{R}^n)$. Theorem 33 in the next section deals with the more general question of invariant sets instead of just fixed points.
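The exponential convergence in Theorem 31 can be observed in a one-dimensional toy case. This is a sketch under assumptions not in the text: $M = \mathbb{R}$, the forward arc field $X_h(x) = x + hK(p - x)$ with illustrative constants $K = 2$ and $p = 3$, for which E1 holds with $\Lambda = -K$ when $0 \le h \le 1/K$. The flow is approximated by Euler curves; all names are hypothetical.

```python
import math

K = 2.0       # decay rate; E1 holds with Lambda = -K < 0 for 0 <= h <= 1/K
TARGET = 3.0  # the fixed point p of the flow in this toy example


def arc(x, h):
    """Forward arc field X_h(x) = x + h*K*(TARGET - x)."""
    return x + h * K * (TARGET - x)


def flow(x, t, n=20000):
    """Approximate F(x, t) by the Euler curve X^{(n)}_{t/n}(x)."""
    h = t / n
    for _ in range(n):
        x = arc(x, h)
    return x


def contraction_ratio(x, t):
    """d(F(x, t), p) / (e^{Lambda t} d(x, p)) with Lambda = -K."""
    return abs(flow(x, t) - TARGET) / (math.exp(-K * t) * abs(x - TARGET))


if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0):
        print(t, flow(10.0, t), contraction_ratio(10.0, t))
```

In this linear example the bound is essentially attained, so the ratio sits just below 1 and the distance to `TARGET` decays like $e^{-Kt}$.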
1.3 Invariant sets
Definition 32 A set $S \subset M$ is defined to be locally uniformly tangent to $X$ if
\[ d(X_t(x), S) = o(t) \]
locally uniformly for $x \in S$, denoted $S \sim X$.
$S$ is invariant under the flow $F$ if for any $x \in S$ we have $F_t(x) \in S$ for all $t \in (\alpha_x, \omega_x)$.
The next theorem is a metric space generalization of the Nagumo-Brézis Invariance Theorem (Example 11 shows how this generalizes the Banach space setting). The bidirectional case is given, but the result obviously holds also for forward flows mutatis mutandis. Cf. [50] for an exposition on general invariance theorems.
Theorem 33 Let X satisfy E1 and E2 and assume a closed set S ⊂ M has
S ∼ X. Then S is an invariant set of the ﬂow F.
Proof. By choosing x ∈ S this theorem is an immediate corollary of the
following, slightly stronger fact:
Lemma 34 Let $\sigma_x : (\alpha, \omega) \to U \subset M$ be a solution to $X$ which meets Condition E1 with uniform constant $\Lambda$ on a neighborhood $U$. Assume $S \subset U$ is a closed set with $S \sim X$. Then
\[ d(\sigma_x(t), S) \le e^{\Lambda |t|} d(x, S) \quad \text{for all } t \in (\alpha, \omega). \]
Proof. (Adapted from the proof of Theorem 16, due to David Bleecker.) We check only $t > 0$. Define $g(t) := e^{-\Lambda t} d(\sigma_x(t), S)$. For $h \ge 0$, we have
\[
\begin{aligned}
g(t + h) - g(t) &= e^{-\Lambda(t+h)} d(\sigma_x(t + h), S) - e^{-\Lambda t} d(\sigma_x(t), S) \\
&\le e^{-\Lambda(t+h)} \bigl[ d(\sigma_x(t + h), X_h(\sigma_x(t))) + d(X_h(\sigma_x(t)), X_h(y)) + d(X_h(y), S) \bigr] \\
&\qquad - e^{-\Lambda t} d(\sigma_x(t), S)
\end{aligned}
\]
for any $y \in S$, which in turn is
\[
\begin{aligned}
&\le e^{-\Lambda(t+h)} \bigl[ d(X_h(\sigma_x(t)), X_h(y)) + o(h) \bigr] - e^{-\Lambda t} d(\sigma_x(t), S) \\
&\le e^{-\Lambda t} e^{-\Lambda h} d(\sigma_x(t), y) (1 + \Lambda h) - e^{-\Lambda t} d(\sigma_x(t), S) + o(h) \\
&= \bigl[ e^{-\Lambda h} (1 + \Lambda h)\, d(\sigma_x(t), y) - d(\sigma_x(t), S) \bigr] e^{-\Lambda t} + o(h).
\end{aligned}
\]
Therefore
\[ g(t + h) - g(t) \le \bigl[ e^{-\Lambda h} (1 + \Lambda h) - 1 \bigr] e^{-\Lambda t} d(\sigma_x(t), S) + o(h) \]
since $y$ was arbitrary in $S$. Thus
\[ g(t + h) - g(t) \le o(h)\, e^{-\Lambda t} d(\sigma_x(t), S) + o(h) = o(h) (g(t) + 1). \]
Hence, the upper forward derivative of $g(t)$ is nonpositive; i.e.,
\[ D^+ g(t) := \limsup_{h \to 0^+} \frac{g(t + h) - g(t)}{h} \le 0. \]
Consequently, $g(t) \le g(0)$ or
\[ d(\sigma_x(t), S) \le e^{\Lambda t} d(\sigma_x(0), S) = e^{\Lambda t} d(x, S). \]
Theorem 33 will be used to piece together local integral surfaces to get
foliations in §3.4. Also see Example 87.
1.4 Commutativity of ﬂows
The following theorem is valid for both bidirectional and forward ﬂows.
Theorem 35 Let $X$ and $Y$ be arc fields on a complete metric space $M$ which satisfy Conditions E1 and E2. Let $F$ and $G$ be the local flows generated by $X$ and $Y$, respectively. If
\[ d(Y_t X_t(x), X_t Y_t(x)) = o(t^2) \tag{1.14} \]
locally uniformly in $x$ then
\[ F_s G_t = G_t F_s, \]
that is, the flows commute.
Proof. (1.14) means for any $x \in M$ there exists a neighborhood $U := B(x, \delta)$ and a function $\varphi$ with $\lim_{t \to 0} \varphi(t) = 0$ such that for all $y \in U$ we have $d(Y_t X_t(y), X_t Y_t(y)) \le t^2 \varphi(t)$. By shrinking $U$ if necessary, both arc fields will satisfy E1 and E2 uniformly on $U$ for some constants $\Lambda > 1$ and $\Omega$, and also the speeds of $X$ and $Y$ are uniformly bounded by $\rho > 0$. For the time being we assume $s$ and $t$ are sufficiently small so all the compositions of $X$ and $Y$ appearing below remain in $U$, i.e., $|s|, |t| < \delta/(2\rho)$. In the last paragraph of the proof, continuation will eliminate this restriction on $s$ and $t$.
The calculations can become a little convoluted, but using Figure **add it!**
you might ﬁnd it easier to construct your own proof than to read this one.
Let us ﬁrst check the theorem in the case s = t. This next estimate is the
linchpin of the proof.
Lemma:
\[ d\bigl( (Y_r X_r)^i (x), (X_r Y_r)^i (x) \bigr) \le r \varphi(r) (1 + r\Lambda)^{2i} \tag{1.15} \]
Let us verify this estimate. Denote $x_j := (Y_r X_r)^j (x)$. Then
\[
\begin{aligned}
d\bigl( (Y_r X_r)^i (x), (X_r Y_r)^i (x) \bigr)
&\le \sum_{k=0}^{i-1} d\bigl( (X_r Y_r)^k (Y_r X_r)^{i-k} (x), (X_r Y_r)^{k+1} (Y_r X_r)^{i-k-1} (x) \bigr) \\
&= \sum_{k=0}^{i-1} d\bigl( (X_r Y_r)^k (Y_r X_r) (x_{i-k-1}), (X_r Y_r)^k (X_r Y_r) (x_{i-k-1}) \bigr) \\
&\le \sum_{k=0}^{i-1} d\bigl( Y_r X_r (x_{i-k-1}), X_r Y_r (x_{i-k-1}) \bigr) (1 + r\Lambda)^{2k} \\
&\le \Bigl[ \max_{x_k} d\bigl( Y_r X_r (x_k), X_r Y_r (x_k) \bigr) \Bigr] \sum_{k=0}^{i-1} (1 + r\Lambda)^{2k} \\
&\le r^2 \varphi(r) \frac{(1 + r\Lambda)^{2i} - 1}{(1 + r\Lambda)^2 - 1} \le r \varphi(r) (1 + r\Lambda)^{2i}
\end{aligned}
\]
as desired.
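We will show the following estimate can be made arbitrarily small.
\[
\begin{aligned}
d(G_t F_t(x), F_t G_t(x)) &\le d\bigl( G_t F_t(x), (Y_{t/n} X_{t/n})^n (x) \bigr) \\
&\quad + d\bigl( (Y_{t/n} X_{t/n})^n (x), (X_{t/n} Y_{t/n})^n (x) \bigr) \\
&\quad + d\bigl( (X_{t/n} Y_{t/n})^n (x), F_t G_t(x) \bigr)
\end{aligned}
\tag{1.16}
\]
The above lemma (1.15), applied with $r = t/n$ and $i = n$, satisfactorily bounds the middle term by
\[ d\bigl( (Y_{t/n} X_{t/n})^n (x), (X_{t/n} Y_{t/n})^n (x) \bigr) \le \frac{t}{n} \varphi\Bigl(\frac{t}{n}\Bigr) \Bigl(1 + \frac{t}{n}\Lambda\Bigr)^{2n} \le \frac{t}{n} \varphi\Bigl(\frac{t}{n}\Bigr) e^{2t\Lambda} \to 0 \tag{1.17} \]
as $n \to \infty$. Next
\[
\begin{aligned}
d\bigl( G_t F_t(x), (Y_{t/n} X_{t/n})^n (x) \bigr) &\le d\bigl( G_t F_t(x), (Y_{t/n})^n (X_{t/n})^n (x) \bigr) \\
&\quad + d\bigl( (Y_{t/n})^n (X_{t/n})^n (x), (Y_{t/n} X_{t/n})^n (x) \bigr).
\end{aligned}
\]
The first term converges to 0 as $n \to \infty$ by the Euler curve approximation, so let us consider the second term separately: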
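\[
\begin{aligned}
& d\bigl( (Y_{t/n})^n (X_{t/n})^n (x), (Y_{t/n} X_{t/n})^n (x) \bigr) \\
&\le \sum_{k=0}^{n-2} d\Bigl( (Y_{t/n})^{n-k} (X_{t/n} Y_{t/n})^{k} (X_{t/n})^{n-k} (x), (Y_{t/n})^{n-k-1} (X_{t/n} Y_{t/n})^{k+1} (X_{t/n})^{n-k-1} (x) \Bigr)
\end{aligned}
\tag{1.18}
\]
using only the triangle inequality since
\[ (Y_{t/n} X_{t/n})^n (x) = (Y_{t/n})^1 (X_{t/n} Y_{t/n})^{n-1} (X_{t/n})^1 (x), \text{ etc.} \]
Denote $y_k := (X_{t/n})^k (x)$ so
\[
\begin{aligned}
& d\Bigl( (Y_{t/n})^{n-k} (X_{t/n} Y_{t/n})^{k} (X_{t/n})^{n-k} (x), (Y_{t/n})^{n-k-1} (X_{t/n} Y_{t/n})^{k+1} (X_{t/n})^{n-k-1} (x) \Bigr) \\
&\le d\Bigl( (Y_{t/n} X_{t/n})^{k+1} (y_{n-k-1}), (X_{t/n} Y_{t/n})^{k+1} (y_{n-k-1}) \Bigr) \Bigl(1 + \Lambda \frac{t}{n}\Bigr)^{n-k-1} \\
&\le \frac{t}{n} \varphi\Bigl(\frac{t}{n}\Bigr) \Bigl(1 + \frac{t}{n}\Lambda\Bigr)^{2(k+1)} \Bigl(1 + \Lambda \frac{t}{n}\Bigr)^{n-k-1} = \frac{t}{n} \varphi\Bigl(\frac{t}{n}\Bigr) \Bigl(1 + \Lambda \frac{t}{n}\Bigr)^{n+k+1}.
\end{aligned}
\]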
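Therefore (1.18) is bounded by
\[
\begin{aligned}
&\le \sum_{k=0}^{n-2} \frac{t}{n} \varphi\Bigl(\frac{t}{n}\Bigr) \Bigl(1 + \Lambda \frac{t}{n}\Bigr)^{n+k+1} = \frac{t}{n} \varphi\Bigl(\frac{t}{n}\Bigr) \Bigl(1 + \Lambda \frac{t}{n}\Bigr)^{n+1} \sum_{k=0}^{n-2} \Bigl(1 + \Lambda \frac{t}{n}\Bigr)^{k} \\
&= \frac{t}{n} \varphi\Bigl(\frac{t}{n}\Bigr) \Bigl(1 + \Lambda \frac{t}{n}\Bigr)^{n+1} \frac{\bigl(1 + \Lambda \frac{t}{n}\bigr)^{n-1} - 1}{\bigl(1 + \Lambda \frac{t}{n}\bigr) - 1} \qquad \text{(and using } \Lambda > 1 \text{ gives)} \\
&\le \varphi\Bigl(\frac{t}{n}\Bigr) \Bigl(1 + \Lambda \frac{t}{n}\Bigr)^{n+1} \Bigl(1 + \Lambda \frac{t}{n}\Bigr)^{n-1} = \varphi\Bigl(\frac{t}{n}\Bigr) \Bigl(1 + \Lambda \frac{t}{n}\Bigr)^{2n} \le \varphi\Bigl(\frac{t}{n}\Bigr) e^{2\Lambda t} \to 0
\end{aligned}
\]
as $n \to \infty$. We just barely have a bound going to zero at this point, which indicates a sharpness for the theorem. Now tracing the calculations backward shows $d\bigl( G_t F_t(x), (Y_{t/n} X_{t/n})^n (x) \bigr) \to 0$ as $n \to \infty$. Similar manipulations yield $d\bigl( (X_{t/n} Y_{t/n})^n (x), F_t G_t(x) \bigr) \to 0$ as $n \to \infty$, and putting these results together in line (1.16) gives $d(G_t F_t(x), F_t G_t(x)) = 0$ as desired.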
Now since $F_t G_t(x) = G_t F_t(x)$ for any valid $t$ we can use this and the semigroup property of flows to get
\[ F_{2t} G_t(x) = F_t F_t G_t(x) = F_t G_t F_t(x) = G_t F_t F_t(x) = G_t F_{2t}(x) \]
and similarly
\[ F_{mt} G_{nt}(x) = G_{nt} F_{mt}(x) \]
for any $m, n \in \mathbb{N}$ (or $\mathbb{Z}$ for the bidirectional case) by induction. Since $t$ may be chosen arbitrarily, for small enough $t$ we have $F_r G_s(x) = G_s F_r(x)$ for any valid $r, s \in \mathbb{Q}$. By the continuity of the flows $F$ and $G$, the result follows.
This theorem is a generalization of a classical result in mechanics, as will be
obvious once we introduce the metric space analog of the Lie bracket in §2.2,
which allows an alternate proof given in §3.3.
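Hypothesis (1.14) can be probed numerically for linear arc fields on $\mathbb{R}^2$. The sketch below assumes $X_t(x) = x + tAx$ and $Y_t(x) = x + tBx$ for $2 \times 2$ matrices $A$, $B$ (an illustrative choice), for which $Y_t X_t(x) - X_t Y_t(x) = t^2 (BA - AB)x$. The ratio $d(Y_t X_t(x), X_t Y_t(x))/t^2$ therefore tends to 0 exactly when the matrices commute at $x$. Matrix names and helper functions are hypothetical.

```python
import math


def apply_arc(mat, x, t):
    """Arc field X_t(x) = x + t*(mat @ x) on R^2, with mat given as nested tuples."""
    (a, b), (c, d) = mat
    return (x[0] + t * (a * x[0] + b * x[1]),
            x[1] + t * (c * x[0] + d * x[1]))


def commutator_ratio(mat_x, mat_y, x, t):
    """d(Y_t X_t(x), X_t Y_t(x)) / t^2 for the arc fields built from mat_x, mat_y."""
    yx = apply_arc(mat_y, apply_arc(mat_x, x, t), t)
    xy = apply_arc(mat_x, apply_arc(mat_y, x, t), t)
    return math.hypot(yx[0] - xy[0], yx[1] - xy[1]) / (t * t)


ROTATION = ((0.0, -1.0), (1.0, 0.0))    # commutes with IDENTITY
IDENTITY = ((1.0, 0.0), (0.0, 1.0))
SHEAR_UP = ((0.0, 1.0), (0.0, 0.0))     # does not commute with SHEAR_DOWN
SHEAR_DOWN = ((0.0, 0.0), (1.0, 0.0))

if __name__ == "__main__":
    point = (1.0, 1.0)
    for t in (1e-1, 1e-2, 1e-3):
        print(t,
              commutator_ratio(ROTATION, IDENTITY, point, t),
              commutator_ratio(SHEAR_UP, SHEAR_DOWN, point, t))
```

For the two shears the ratio stays at a nonzero constant, so (1.14) fails there, which is consistent with the corresponding linear flows not commuting.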
Chapter 2
Lie algebra on metric spaces
On a metric space we cannot add or multiply elements without imposing further
structure on the space. But arc ﬁelds enjoy some natural algebraic properties
because of the fact that R is embedded in their deﬁnition. In fact just as
vector ﬁelds on a manifold form a module, arc ﬁelds with minimal regularity
assumptions (E1, E2, and closure) form a module up to tangency equivalence.
The Lie bracket of two vector ﬁelds is a key tool in the study of geometry
and dynamics on manifolds. In §2.2 a generalization is introduced to exploit
its power on metric spaces. The asymptotic characterization given at line (2.6)
below is the natural definition to choose for the metric space context. Remarkably, though, the Lie derivative interpretation is shown to be valid as well at line (2.8).
The relation of the Lie bracket to the other algebraic deﬁnitions on arc ﬁelds
is explored, and we ﬁnd the operations of pullback and pushforward to be
natural with respect to this algebra. All of this machinery then allows us to
prove Frobenius’ Foliation Theorem on a metric space in Chapter 3.
2.1 Metric space arithmetic
Definition 36 If $X$ and $Y$ are arc fields on $M$ then define the arc field $X + Y$ on $M$ by
\[ (X + Y)_t (x) := Y_t X_t (x). \]
For any locally Lipschitz function $a : M \to \mathbb{R}$ define the arc field $aX$ by
\[ aX(x, t) := X(x, a(x)\, t). \tag{2.1} \]
To be fastidiously precise we need to define $aX_x(t)$ for all $t \in [-1, 1]$ so technically when $a > 1$ we must specify
\[
aX(x, t) :=
\begin{cases}
X(x, a(x)\, t) & -\frac{1}{|a(x)|} \le t \le \frac{1}{|a(x)|} \\
X(x, 1) & t > 1/|a(x)| \\
X(x, -1) & t < -1/|a(x)|
\end{cases}
\quad \text{when } a(x) \ne 0, \qquad aX(x, t) := x \ \text{ for } -1 \le t \le 1 \ \text{ if } a(x) = 0
\tag{2.2}
\]
using the trick from Example 27. Again, we will not burden ourselves with this detail; in all cases our concern with the properties of an arc field $X_x(t)$ is only near $t = 0$ and $a$ is always continuous.
When $X + Y$ satisfies Conditions E1 and E2, its flow $H$ is then computable with Euler curves (using line (1.8) above) as
\[ H(x, t) = \lim_{n \to \infty} (X + Y)^{(n)}_{t/n} (x) = \lim_{n \to \infty} \bigl( Y_{t/n} X_{t/n} \bigr)^{(n)} (x). \tag{2.3} \]
Therefore, this definition of $X + Y$ using compositions is a direct generalization of the concept of adding vector fields on a differentiable manifold (see [1], §4.1A). The sum of two flows on a metric space was introduced in [24] in the same spirit as defined here, with commensurable conditions.
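Formula (2.3) can be tried out on the real line. The sketch below assumes two illustrative vector fields, $f(x) = ax$ and the constant field $g(x) = b$, with arc fields $X_h(x) = x + hax$ and $Y_h(x) = x + hb$. The composed Euler curve $(Y_{t/n} X_{t/n})^{(n)}(x)$ should then approach the solution of $x' = ax + b$. The constants and function names are assumptions made for this example.

```python
import math

A_COEF = -0.5  # f(x) = A_COEF * x
B_COEF = 2.0   # g(x) = B_COEF (constant vector field)


def x_arc(x, h):
    """Arc field X_h(x) = x + h*f(x)."""
    return x + h * A_COEF * x


def y_arc(x, h):
    """Arc field Y_h(x) = x + h*g(x)."""
    return x + h * B_COEF


def sum_flow(x, t, n):
    """Euler curve (Y_{t/n} X_{t/n})^{(n)}(x) of the arc field X + Y."""
    h = t / n
    for _ in range(n):
        x = y_arc(x_arc(x, h), h)
    return x


def exact_flow(x, t):
    """Solution at time t of x' = A_COEF*x + B_COEF starting from x."""
    shift = B_COEF / A_COEF
    return (x + shift) * math.exp(A_COEF * t) - shift


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, abs(sum_flow(1.0, 1.0, n) - exact_flow(1.0, 1.0)))
```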
It is a simple definition check to prove $aX$ is an arc field when $a$ is locally Lipschitz, since $aX_x(t) = X_x(a(x)\, t)$ is Lipschitz in $t$ if $X_x(t)$ is:
\[
\begin{aligned}
\rho_{aX}(x) &:= \sup_{s \ne t} \frac{d(X_x(a(x)\, s), X_x(a(x)\, t))}{|s - t|} = \sup_{s \ne t} \frac{d(X_x(s), X_x(t))}{\bigl| \frac{s}{a(x)} - \frac{t}{a(x)} \bigr|} \\
&= |a(x)| \sup_{s \ne t} \frac{d(X_x(s), X_x(t))}{|s - t|} = |a(x)|\, \rho_X(x)
\end{aligned}
\]
so
\[
\begin{aligned}
\rho_{aX}(x, r) &:= \sup_{y \in B(x, r)} \{\rho_{aX}(y)\} = \sup_{y \in B(x, r)} \{|a(y)|\, \rho(y)\} \\
&\le (|a(x)| + r K_a)\, \rho_X(x, r) < \infty.
\end{aligned}
\]
However, we need another condition to guarantee linear combinations of arc
ﬁelds with local ﬂows still give arc ﬁelds with local ﬂows:
Definition 37 We say $X$ & $Y$ close if
\[ d(Y_s X_t(x), X_t Y_s(x)) = O(|st|) \]
locally uniformly in $x$, i.e., if for each $x_0 \in M$ there exist positive constants $C_{XY}$, $\delta$, and $r$ such that for all $x \in B(x_0, r)$
\[ d(Y_s X_t(x), X_t Y_s(x)) \le C_{XY} |st| \]
for all $|s|, |t| < \delta$.
Example 38 As in Example 11 let $f, g : B \to B$ be Lipschitz vector fields on a Banach space $B$, and let $X$ and $Y$ be their corresponding arc fields
\[ X(x, t) := x + t f(x) \quad \text{and} \quad Y(x, t) := x + t g(x). \]
Notice
\[ (X + Y)(x, t) = [x + t f(x)] + t g(x + t f(x)) = x + t\, [f(x) + g(x + t f(x))] \]
which is tangent to the arc field
\[ Z(x, t) := x + t\, [f(x) + g(x)] \]
since
\[
\begin{aligned}
d((X + Y)(x, t), Z(x, t)) &= |t|\, \bigl\| [f(x) + g(x + t f(x))] - [f(x) + g(x)] \bigr\| \\
&\le t^2 K_g \|f(x)\| = o(t)
\end{aligned}
\]
which motivates the definition of the sum of arc fields.
It is easy to check $X$ & $Y$ close:
\[
\begin{aligned}
d(Y_s X_t(x), X_t Y_s(x)) &= \bigl\| x + t f(x) + s g(x + t f(x)) - [x + s g(x) + t f(x + s g(x))] \bigr\| \\
&\le |t|\, \| f(x) - f(x + s g(x)) \| + |s|\, \| g(x + t f(x)) - g(x) \| \\
&\le |t| K_f \| x - (x + s g(x)) \| + |s| K_g \| x + t f(x) - x \| \\
&\le |st|\, (K_f \|g(x)\| + K_g \|f(x)\|)
\end{aligned}
\]
so $C_{XY} := (K_f \|g(x)\| + K_g \|f(x)\|)$.
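The closure estimate of this example can be spot-checked on $B = \mathbb{R}$. As an assumed illustration, take $f = \sin$ and $g = \cos$, both 1-Lipschitz, so the bound reads $|st|(|\cos x| + |\sin x|)$. The function names below are hypothetical.

```python
import math


def x_arc(x, t):
    """X(x, t) = x + t*f(x) with f = sin (Lipschitz constant K_f = 1)."""
    return x + t * math.sin(x)


def y_arc(x, s):
    """Y(x, s) = x + s*g(x) with g = cos (Lipschitz constant K_g = 1)."""
    return x + s * math.cos(x)


def closure_gap(x, s, t):
    """d(Y_s X_t(x), X_t Y_s(x)) on the real line."""
    return abs(y_arc(x_arc(x, t), s) - x_arc(y_arc(x, s), t))


def closure_bound(x, s, t):
    """|st| (K_f |g(x)| + K_g |f(x)|) with K_f = K_g = 1."""
    return abs(s * t) * (abs(math.cos(x)) + abs(math.sin(x)))


if __name__ == "__main__":
    for x in (-2.0, 0.0, 0.7, 3.0):
        print(x, closure_gap(x, 0.1, 0.2), closure_bound(x, 0.1, 0.2))
```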
Proposition 39 Assume $X$ & $Y$ close and satisfy E1 and E2. Then their sum $X + Y$ satisfies E1 and E2.
Proof. Checking Condition E1:
\[
\begin{aligned}
d((X + Y)_t(x), (X + Y)_t(y)) &= d(Y_t X_t(x), Y_t X_t(y)) \le d(X_t(x), X_t(y)) (1 + |t| \Lambda_Y) \\
&\le d(x, y) (1 + |t| \Lambda_X)(1 + |t| \Lambda_Y) \le d(x, y) \bigl( 1 + |t| (\Lambda_X + \Lambda_Y) + t^2 \Lambda_X \Lambda_Y \bigr) \\
&\le d(x, y) (1 + |t| \Lambda_{X+Y})
\end{aligned}
\]
where $\Lambda_{X+Y} := \Lambda_X + \Lambda_Y + \Lambda_X \Lambda_Y < \infty$.
Condition E2:
\[
\begin{aligned}
& d\bigl( (X + Y)_{s+t}(x), (X + Y)_t (X + Y)_s (x) \bigr) = d(Y_{s+t} X_{s+t}(x), Y_t X_t Y_s X_s(x)) \\
&\le d(Y_{s+t} X_{s+t}(x), Y_t Y_s X_{s+t}(x)) + d(Y_t Y_s X_{s+t}(x), Y_t X_t Y_s X_s(x)) \\
&\le |st| \Omega_Y + d(Y_s X_{s+t}(x), X_t Y_s X_s(x)) (1 + |t| \Lambda_Y) \\
&\le |st| \Omega_Y + \bigl[ d(Y_s X_{s+t}(x), Y_s X_t X_s(x)) + d(Y_s X_t(y), X_t Y_s(y)) \bigr] (1 + |t| \Lambda_Y)
\end{aligned}
\tag{2.4}
\]
where $y := X_s(x)$. Notice
\[
\begin{aligned}
d(Y_s X_{s+t}(x), Y_s X_t X_s(x)) &\le d(X_{s+t}(x), X_t X_s(x)) (1 + |s| \Lambda_Y) \\
&\le |st| \Omega_X (1 + |s| \Lambda_Y) = O(|st|)
\end{aligned}
\]
and the last summand of (2.4) is also $O(|st|)$ since $X$ & $Y$ close, so E2 is satisfied.
When $X$ & $Y$ close and satisfy E1 and E2, we also have $(X + Y) \approx (Y + X)$ using (2.3) since
\[ \bigl( Y_{t/n} X_{t/n} \bigr)^{(n)} = Y_{t/n} \bigl( X_{t/n} Y_{t/n} \bigr)^{(n-1)} X_{t/n} \tag{2.5} \]
whence both arc fields $X + Y$ and $Y + X$ are (locally uniformly 2nd-order) tangent to the flow $H$.
Example 40 Under the conditions of Theorem 35 we can obviously see $F_t G_t = H_t$ where again $H$ is the flow generated by $X + Y$. In fact, line (1.17) in its proof gives a second (more tedious) verification of the fact that the Euler curves for $X + Y$ and $Y + X$ converge to each other.
Proposition 41 If $X$ satisfies E1 and E2 and $a : M \to \mathbb{R}$ is a locally Lipschitz function, then $aX$ satisfies E1 and E2.
Proof. It suffices, by localizing, to assume $a$ is globally Lipschitz.
E1:
\[
\begin{aligned}
d(aX_x(t), aX_y(t)) &= d(X_x(a(x)\, t), X_y(a(y)\, t)) \\
&\le d(X_x(a(x)\, t), X_x(a(y)\, t)) + d(X_x(a(y)\, t), X_y(a(y)\, t)) \\
&\le |a(x)\, t - a(y)\, t|\, \rho(x) + d(x, y) (1 + |a(y)|\, |t| \Lambda_X) \\
&\le d(x, y) (K_a |t| \rho(x) + 1 + |a(y)|\, |t| \Lambda_X) = d(x, y) (1 + |t| \Lambda_{aX})
\end{aligned}
\]
where $\Lambda_{aX} := K_a \rho(x) + |a(y)| \Lambda_X < \infty$.
E2: For all $x_0 \in M$ and $\delta > 0$ we know $a$ is bounded by some $A > 0$ on $B(x_0, \delta)$ since $a$ is Lipschitz.
\[
\begin{aligned}
& d\bigl( aX_x(s + t), aX_{aX_x(s)}(t) \bigr) = d\bigl( X_x(a(x)(s + t)), X_{X_x(a(x) s)}(a(X_x(a(x)\, s))\, t) \bigr) \\
&\le d\bigl( X_x(a(x)(s + t)), X_{X_x(a(x) s)}(a(x)\, t) \bigr) + d\bigl( X_{X_x(a(x) s)}(a(x)\, t), X_{X_x(a(x) s)}(a(X_x(a(x)\, s))\, t) \bigr) \\
&\le a(x) |s|\, a(x) |t|\, \Omega_X + \rho\, |a(x)\, t - a(X_x(a(x)\, s))\, t| \\
&\le |st|\, [a(x)]^2 \Omega_X + |t| \rho K_a\, d(x, X_x(a(x)\, s)) \\
&\le |st|\, [a(x)]^2 \Omega_X + |st| \rho^2 K_a\, a(x) \le |st| \Omega_{aX}
\end{aligned}
\]
where $\Omega_{aX} := A^2 \Omega_X + \rho^2 K_a A$.
Combining Propositions 39 and 41 gives
Theorem 42 If a and b are locally Lipschitz functions and X & Y close and
satisfy E1 and E2, then aX +bY is an arc ﬁeld which satisﬁes E1 and E2 and
so has a unique local ﬂow.
If in addition a and b are globally Lipschitz and X and Y have linear speed
growth, then aX +bY generates a unique ﬂow.
Proof. We haven’t proven aX and bY close, but this is a straightforward
deﬁnition check, as is the fact that aX +bY has linear speed growth.
Now we have the beginnings of a linear structure associated with M. For
instance, expressions such as X −Y make sense:
X −Y := X + (−1)Y
where −1 is a constant function on M. Further, 0 is an arc ﬁeld deﬁned as the
constant map
0 (x, t) := x.
Note the space of all Lipschitz functions a : M →R is a ring.
Theorem 43 (Module properties) Let X,Y , and Z be arc ﬁelds which satisfy
Conditions E1 and E2 and assume X & Y , Y & Z, and X & Z all close. Let
a : M →R and b : M →R be locally Lipschitz functions. Then
(i) 0 +X = X = X + 0 additive identity
(ii) X +−X ≈ 0 additive inverse
(iii) X + (Y +Z) = (X +Y ) +Z additive associativity
(iv) 1X = X scalar identity
(v) a(bX) = (ab) X scalar associativity
(vi) (ab) X = (ba) X scalar commutativity
(vii) X +Y ≈ Y +X additive commutativity
(viii) a(X +Y ) ≈ aX +aY additive distributivity
(ix) (a +b) X ≈ aX +bX scalar distributivity
Further, equivalence respects this linearity:
(ix) if X ∼ X
and Y ∼ Y
then aX +bY ∼ aX
+bY
(x) if X ≈ X
and Y ≈ Y
then aX +bY ≈ aX
+bY
Proof. All equalities, namely (i) and (iii)-(vi), are immediate from the definitions; (iii) and (v) particularly are due to the general associativity of composition of maps. (ii) follows immediately from Condition E2:
\[
d\bigl((X - X)_t(x),\ 0(x)\bigr) = d\bigl(X_{-t}X_t(x),\ X_{-t+t}(x)\bigr) = O(t^2).
\]
Additive commutativity (vii) was shown at line (2.5). For additive distributivity (viii) and scalar distributivity (ix) we may appeal, as we did for (vii), to the fact that both sides of the relations are 2nd order tangent to the same flows; but they are also easy to verify directly. Checking (viii):
\[
\begin{aligned}
d\bigl(a(X+Y)_t(x),\ (aX + aY)_t(x)\bigr)
&= d\bigl(Y_{a(x)t}X_{a(x)t}(x),\ Y_{a(X_{a(x)t}(x))t}X_{a(x)t}(x)\bigr)\\
&= d\bigl(Y_{a(x)t}(y),\ Y_{a(X_{a(x)t}(x))t}(y)\bigr)
\end{aligned}
\]
where $y := X_{a(x)t}(x)$. Then
\[
\begin{aligned}
d\bigl(Y_{a(x)t}(y),\ Y_{a(X_{a(x)t}(x))t}(y)\bigr)
&\le |t|\,\rho_Y(y)\,\bigl|a(x) - a(X_{a(x)t}(x))\bigr|\\
&\le |t|\,\rho_Y(y)\,K_a\, d\bigl(x, X_{a(x)t}(x)\bigr)\\
&\le t^2\,\rho_Y(y)\,\rho_X(x)\,K_a\,|a(x)| = O(t^2)
\end{aligned}
\]
locally uniformly. Checking (ix), notice that if $a$ and $b$ are constant, then this is simply Condition E2. Since we want the result for Lipschitz functions we carefully verify
\[
\begin{aligned}
d\bigl([(a+b)X]_t(x),\ (aX + bX)_t(x)\bigr)
&= d\bigl(X_{(a(x)+b(x))t}(x),\ X_{b(X_{a(x)t}(x))t}X_{a(x)t}(x)\bigr)\\
&\le d\bigl(X_{(a(x)+b(x))t}(x),\ X_{b(x)t}X_{a(x)t}(x)\bigr) + d\bigl(X_{b(x)t}X_{a(x)t}(x),\ X_{b(X_{a(x)t}(x))t}X_{a(x)t}(x)\bigr)\\
&\le t^2\,|ab(x)|\,\Omega + d\bigl(X_{b(x)t}(y),\ X_{b(X_{a(x)t}(x))t}(y)\bigr)
\end{aligned}
\]
where $y := X_{a(x)t}(x)$, and this final estimate is bounded by
\[
t^2\,|ab(x)|\,\Omega + \rho_X(y)\,|t|\,\bigl|b(x) - b(X_{a(x)t}(x))\bigr| \le t^2\bigl(|ab(x)|\,\Omega + \rho_X(y)\,\rho_X(x)\,K_b\,|a(x)|\bigr)
\]
which is $O(t^2)$ locally uniformly.
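The first of the two equivalence properties (the one for $\sim$) follows from the two facts
\[
aX \sim aX' \quad\text{and}\quad X + Y \sim X' + Y'
\]
which are verified easily:
\[
d\bigl([aX]_t(x),\ [aX']_t(x)\bigr) = d\bigl(X_{a(x)t}(x),\ X'_{a(x)t}(x)\bigr) = o(a(x)t) = o(t)
\]
locally uniformly, and
\[
\begin{aligned}
d\bigl([X+Y]_t(x),\ [X'+Y']_t(x)\bigr)
&= d\bigl(Y_tX_t(x),\ Y'_tX'_t(x)\bigr)\\
&\le d\bigl(Y_tX_t(x),\ Y_tX'_t(x)\bigr) + d\bigl(Y_tX'_t(x),\ Y'_tX'_t(x)\bigr)\\
&\le d\bigl(X_t(x),\ X'_t(x)\bigr)\,(1 + \Lambda_Y|t|) + d\bigl(Y_t(y),\ Y'_t(y)\bigr)
\end{aligned}
\]
where $y := X'_t(x)$, so the last estimate is $o(t)$ locally uniformly.
For the second equivalence property (the one for $\approx$), replace $o(t)$ in the verification above with $O(t^2)$.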
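The scalar distributivity estimate can be probed numerically. The following is a minimal sketch, not part of the original text: it assumes the definitions used in the proof above, $(aX)_t(x) = X_{a(x)t}(x)$ and $(X+Y)_t(x) = Y_tX_t(x)$, and the arc field `X` and the functions `a`, `b` are hypothetical choices on $M = \mathbb{R}$ made only for illustration.

```python
import math

def X(x, t):
    # Hypothetical arc field on R: the Euler arc of the vector field sin.
    return x + t * math.sin(x)

def scale(a, X):
    # (aX)(x, t) := X(x, a(x) t): rescale time by the function a.
    return lambda x, t: X(x, a(x) * t)

def add(X, Y):
    # (X + Y)(x, t) := Y(X(x, t), t): follow X for time t, then Y.
    return lambda x, t: Y(X(x, t), t)

a = lambda x: math.cos(x) + 2.0   # Lipschitz; illustrative choice
b = lambda x: x                   # Lipschitz; illustrative choice

lhs = scale(lambda x: a(x) + b(x), X)   # (a + b) X
rhs = add(scale(a, X), scale(b, X))     # aX + bX

def gap_ratio(x, t):
    # d(lhs_t(x), rhs_t(x)) / t^2, which should stay bounded as t -> 0.
    return abs(lhs(x, t) - rhs(x, t)) / t**2

ratios = [gap_ratio(1.0, t) for t in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

If the two arc fields are second-order tangent at the sample point, the printed ratios stay bounded rather than blowing up as $t$ shrinks; a single sample point is of course only a sanity check, not a proof of local uniformity.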
Consequently under the conditions of closure and E1 and E2, we may now
perform soaring feats of algebra such as
X +Y ∼ 0 ⇔ Y ∼ −X.
Local ﬂows have the following stronger linearity property, which is printed
here so it may be obliquely referred to in the depths of a long proof in the sequel.
Lemma 44 If $F$ is a local flow then interpreting $F$ as an arc field we can perform the following operations when both sides are defined:
1. for $a, b \in \mathbb{R}$, we have $aF + bF = (a + b)F$
2. if $a$ and $b$ are real functions then $(aF + bF)_t(x) = (a + b \circ (aF)_t)F_t(x)$.
Proof. This is another obvious definition check:
\[
\begin{aligned}
(2)\quad (aF + bF)_t(x) &= (bF)_t(aF)_t(x) = F_{b((aF)_t(x))t}\,F_{a(x)t}(x)\\
&= F_{(a(x) + (b\circ(aF)_t)(x))t}(x) = (a + b\circ(aF)_t)F_t(x)
\end{aligned}
\]
and (1) follows from (2).
Proposition 45 Let X,Y , and Z be arc ﬁelds which satisfy Conditions E1 and
E2 and assume X & Y, Y & Z, and X & Z all close. Let a, b : M → R be
locally Lipschitz functions. Then
(i) X & X close
(ii) Y & X close
(iii) aX & Y close
(iv) X and Y +Z close.
Proof. (i) follows immediately from Condition E2. The others are easy; let us do the most difficult here:
(iv)
\[
\begin{aligned}
d\bigl((Y+Z)_sX_t(x),\ X_t(Y+Z)_s(x)\bigr) &= d\bigl(Z_sY_sX_t(x),\ X_tZ_sY_s(x)\bigr)\\
&\le d\bigl(Z_sY_sX_t(x),\ Z_sX_tY_s(x)\bigr) + d\bigl(Z_sX_tY_s(x),\ X_tZ_sY_s(x)\bigr)\\
&\le d\bigl(Y_sX_t(x),\ X_tY_s(x)\bigr)\,(1 + |s|\Lambda_Z) + O(|st|) = O(|st|).
\end{aligned}
\]
Closure is not transitive lest all arc ﬁelds close, since all arc ﬁelds close with
the constant 0 arc ﬁeld. This prevents us from forming a natural local linear
structure fully analogous to the tangent bundle of a manifold via equivalence
classes under ≈ tangency. But by means of Proposition 45 and Theorem 42, we
can form successive linear combinations of arc ﬁelds which all close and have
unique solutions, making an object with properties akin to a linear subbundle
of the tangent bundle. We invite the reader to explore extra restrictions on
either arc ﬁelds or the space M which guarantee all arc ﬁelds close, giving a full
tangent bundle.
Examples for this section form the content of Chapter 6. With the module
properties of this section, a homological analysis of metric spaces would be an
interesting exercise.
2.2 Metric space Lie bracket
Review §0.3 from the Introduction for motivation from manifolds.
Definition 46 Given arc fields $X$ and $Y$ with local flows $F$ and $G$, define the bracket $[X, Y] : M \times [-1, 1] \to M$ as
\[
[X, Y](x, t) :=
\begin{cases}
G_{-\sqrt{t}}\,F_{-\sqrt{t}}\,G_{\sqrt{t}}\,F_{\sqrt{t}}(x) & \text{for } t \ge 0\\[2pt]
F_{-\sqrt{|t|}}\,G_{-\sqrt{|t|}}\,F_{\sqrt{|t|}}\,G_{\sqrt{|t|}}(x) & \text{for } t < 0.
\end{cases}
\tag{2.6}
\]
Here again, without spelling out the details, we implicitly use the trick from
Example 27 to force [X, Y ] to be well deﬁned if the local ﬂows are not deﬁned
for all t ∈ [−1, 1].
There are many different equivalent characterizations of the Lie bracket on a manifold. (2.6) uses the obvious choice of the asymptotic characterization to generalize the concept to metric spaces. $[X, Y](x, t)$ traces out a small “parallelogram” in $M$ starting at $x$, which hopefully almost returns to $x$. The bracket measures the failure of $F$ and $G$ to commute as will be made clear in Theorems 63 and 62. Notice
\[
[X, Y](x, t) :=
\begin{cases}
(F + G - F - G)\bigl(x, \sqrt{|t|}\bigr) & \text{for } t \ge 0\\[2pt]
(G + F - G - F)\bigl(x, \sqrt{|t|}\bigr) & \text{for } t < 0.
\end{cases}
\]
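To make the square-root time scaling in (2.6) concrete, here is a small sketch that evaluates the bracket for two explicitly known flows on $\mathbb{R}^2$. The flows `F` (translation) and `G` (rotation) are hypothetical examples chosen for illustration, not taken from the text; the function `bracket` follows (2.6) literally.

```python
import math

def F(x, t):
    # Hypothetical flow on R^2: translation along u = (1, 0).
    return (x[0] + t, x[1])

def G(x, t):
    # Hypothetical flow on R^2: rotation about the origin by angle t.
    c, s = math.cos(t), math.sin(t)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])

def bracket(F, G, x, t):
    # Definition (2.6): go around a small "parallelogram" of side sqrt(|t|).
    r = math.sqrt(abs(t))
    if t >= 0:
        return G(F(G(F(x, r), r), -r), -r)   # G_{-r} F_{-r} G_r F_r (x)
    return F(G(F(G(x, r), r), -r), -r)       # F_{-r} G_{-r} F_r G_r (x)

def speed_ratio(t, x=(0.0, 0.0)):
    # d([X, Y]_t(x), x) / |t|: bounded speed means this stays bounded near t = 0.
    p = bracket(F, G, x, t)
    return math.hypot(p[0] - x[0], p[1] - x[1]) / abs(t)

for t in (1e-2, 1e-4, -1e-4, -1e-2):
    print(t, speed_ratio(t))
```

For these two flows the displacement after time $t$ is of order $|t|$, not $\sqrt{|t|}$, which is the behaviour the bracket needs in order to be an arc field.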
We should very much have preferred to deﬁne the bracket of two arc ﬁelds X
and Y directly in terms of the arc ﬁelds themselves instead of using their ﬂows
F and G. This is not feasible if we want meaningful geometric information as
can be seen in Example 111 p. 119 and Example 112.
The bracket is not a priori an arc field since it is not clear whether the speed is bounded, as $\sqrt{t}$ is employed. This is remedied if the arc fields close:
Lemma 47 If $X$ & $Y$ close and satisfy E1 and E2 then
\[
d\bigl(Y_{-t}X_{-t}Y_tX_t(x),\ x\bigr) = O(t^2)
\]
locally uniformly for $x \in M$.
Proof.
\[
\begin{aligned}
d\bigl(Y_{-s}X_{-t}Y_sX_t(x),\ x\bigr)
&\le d\bigl(Y_{-s}X_{-t}Y_sX_t(x),\ Y_{-s}X_{-t}X_tY_s(x)\bigr) + d\bigl(Y_{-s}X_{-t}X_tY_s(x),\ Y_{-s}Y_s(x)\bigr) + d\bigl(Y_{-s}Y_s(x),\ x\bigr)\\
&\le d\bigl(Y_sX_t(x),\ X_tY_s(x)\bigr)(1 + |s|\Lambda_Y)(1 + |t|\Lambda_X) + t^2\,\Omega_X(1 + |s|\Lambda_Y) + s^2\,\Omega_Y\\
&\le C_{XY}\,|st|\,(1 + |s|\Lambda_Y)(1 + |t|\Lambda_X) + t^2\,\Omega_X(1 + |s|\Lambda_Y) + s^2\,\Omega_Y\\
&\le C\bigl(|st| + t^2 + s^2\bigr)
\end{aligned}
\]
where
\[
C := \max\bigl\{C_{XY}(1 + \Lambda_Y)(1 + \Lambda_X),\ \Omega_X(1 + \Lambda_Y),\ \Omega_Y\bigr\}.
\]
Letting $s = t$ gives the result.
Proposition 48 If X and Y satisfy E1 and E2, and F & G close (as arc ﬁelds)
then [X, Y ] is an arc ﬁeld.
Proof. We establish the local bound on speed. The main purpose of Lemma 47 is to give $d([X, Y]_t(x), x) = O(t)$ for $t \ge 0$:
\[
d\bigl([X, Y]_{t^2}(x),\ x\bigr) = d\bigl(G_{-t}F_{-t}G_tF_t(x),\ x\bigr) = O(t^2)
\]
since $F$ and $G$ satisfy E1 (Theorem 16) and E2 (1-parameter local group property). Similarly, for $t < 0$
\[
d\bigl(F_{-t}G_{-t}F_tG_t(x),\ x\bigr) \le d\bigl(F_tG_tF_{-t}G_{-t}(x),\ F_tF_{-t}(x)\bigr) \le d\bigl(G_tF_{-t}G_{-t}(x),\ F_{-t}(x)\bigr)\,e^{t\Lambda_X} + 0
\]
which, using this trick again, gives
\[
\le d\bigl(F_{-t}G_{-t}(x),\ G_{-t}F_{-t}(x)\bigr)\,e^{t(\Lambda_X + \Lambda_Y)} = O(t^2)
\]
since $F$ & $G$ close. Therefore
\[
d\bigl([X, Y]_t(x),\ x\bigr) = O(t)
\]
for both positive and negative $t$. Then since $\sqrt{|t|}$ is Lipschitz except at $t = 0$ we see $[X, Y]$ has bounded speed.
Example 49 We haven’t proven Lipschitz vector fields necessarily have flows which close, but their analog arc field bracket is still an arc field. The issue is subtle. Let $f$ and $g$ be two vector fields with associated arc fields $X$ and $Y$ (as in Example 11) and flows $F$ and $G$. Then
\[
\begin{aligned}
d\bigl(G_sF_t(x),\ F_tG_s(x)\bigr)
&\le d\bigl(G_sF_t(x),\ Y_sF_t(x)\bigr) + d\bigl(Y_sF_t(x),\ Y_sX_t(x)\bigr) + d\bigl(Y_sX_t(x),\ X_tY_s(x)\bigr)\\
&\quad + d\bigl(X_tY_s(x),\ X_tG_s(x)\bigr) + d\bigl(X_tG_s(x),\ F_tG_s(x)\bigr)\\
&\le O(s^2) + O(t^2) + O(st) + O(s^2) + O(t^2)
\end{aligned}
\]
which is not enough to give that $F$ & $G$ close. But it is enough to prove $[X, Y]$ is an arc field, referring to the proof of Proposition 48, which only uses $s = t$ from the closure condition. So even though Lipschitz vector fields may be nonsmooth, with undefined classical Lie bracket, their metric space bracket is meaningful and will give us geometric information on any Banach manifold, as we shall see in Theorem 62. It is an open question whether the arc field bracket of two Lipschitz vector fields is always well-posed.
Exercise 50 What conditions do we need to guarantee the Lie algebra properties for the bracket?
(i) $-[X, Y] = [Y, X]$
(ii) $[Y, X] + [X, Y] = 0$
(iii) $-[X, Y] \sim [-X, Y]$
(iv) $[X + Y, Z] \sim [X, Z] + [Y, Z]$
(v) $[aX, aY] = a^2[X, Y]$ for $a \in \mathbb{R}$
(vi) $[aX, Y] \sim a[X, Y]$ for $a \in \mathbb{R}$.
Hint: (i), (ii) and (v) are true for any arc fields with flows. Invent restrictions on the arc fields or conditions on the space $M$ which guarantee (iii), (iv) and (vi).
2.3 Covariance and contravariance
In this section we demonstrate the arithmetic on arc ﬁelds is natural from the
category theory point of view.
Let $\phi : M_1 \to M_2$ be a lipeomorphism, i.e., a Lipschitz map with a Lipschitz inverse. The pushforward of an arc field $X$ on $M_1$ is the arc field $\phi_*X$ on $M_2$ given by
\[
\phi_*X(x, t) := \phi\bigl(X(\phi^{-1}(x), t)\bigr).
\]
This is a direct analog of the pushforward of a vector field on a manifold. The pushforward of any curve or flow on $M_1$ is defined similarly, e.g., $\phi_*F(x, t) := \phi\bigl(F(\phi^{-1}(x), t)\bigr)$. The pushforward of a function $a : M_1 \to \mathbb{R}$ is the function $\phi_*a : M_2 \to \mathbb{R}$ defined as $\phi_*a(x) := a\bigl(\phi^{-1}(x)\bigr)$.
Proposition 51 If $\phi : M_1 \to M_2$ is a lipeomorphism and the arc field $X$ on $M_1$ has unique solutions then $\phi_*X$ has unique solutions. If $F$ is the local flow of $X$, then $\phi_*F$ is the local flow of $\phi_*X$.
Proof. This is not conceptually difficult, just notationally labyrinthine. That
\[
d\bigl(F_x(t + h),\ X_{F_x(t)}(h)\bigr) = o(h)
\]
for all $x \in M_1$ implies
\[
\begin{aligned}
d\bigl((\phi_*F)_x(t + h),\ (\phi_*X)(\phi_*F_x(t), h)\bigr)
&= d\Bigl(\phi\bigl(F_{\phi^{-1}(x)}(t + h)\bigr),\ \phi\bigl(X\bigl(\phi^{-1}\bigl(\phi\bigl(F_{\phi^{-1}(x)}(t)\bigr)\bigr), h\bigr)\bigr)\Bigr)\\
&\le K_\phi\, d\bigl(F_{\phi^{-1}(x)}(t + h),\ X\bigl(F_{\phi^{-1}(x)}(t), h\bigr)\bigr) = o(h)
\end{aligned}
\]
since $\phi^{-1}(x) \in M_1$.
The pushforward of a flow is still a flow, since it clearly satisfies the 1-parameter local group property and is the identity at $t = 0$.
The pullback of a map is defined similarly, e.g.,
\[
\phi^*F(x, t) := \phi^{-1}\bigl(F(\phi(x), t)\bigr)
\]
and Proposition 51 clearly holds with pullbacks in place of pushforwards, since pushforward and pullback are inverse operations (see part (vi) of the following theorem).
Theorem 52 (Algebraic properties of covariance and contravariance)
Let $M_i$ be metric spaces and let $\phi : M_1 \to M_2$ and $\psi : M_2 \to M_3$ be lipeomorphisms. Let $a, b : M_1 \to \mathbb{R}$ and $\acute{a}, \acute{b} : M_2 \to \mathbb{R}$ be Lipschitz functions. Then we have
(i) $\phi_*(ab) = (\phi_*a)(\phi_*b)$ and $\phi^*(ab) = (\phi^*a)(\phi^*b)$
(ii) $\phi_*(a + b) = \phi_*a + \phi_*b$ and $\phi^*(\acute{a} + \acute{b}) = \phi^*\acute{a} + \phi^*\acute{b}$
(iii) $(\psi \circ \phi)^* = \phi^* \circ \psi^*$ (contravariance flips the order)
and $(\psi \circ \phi)_* = \psi_* \circ \phi_*$ (covariance preserves the order).
Let $X$ and $Y$ be arc fields on $M_1$ and $\acute{X}$ and $\acute{Y}$ be arc fields on $M_2$; then
(iv) $\phi_*(X + Y) = \phi_*(X) + \phi_*(Y)$ and $\phi^*(\acute{X} + \acute{Y}) = \phi^*(\acute{X}) + \phi^*(\acute{Y})$
(v) $\phi_*(aX) = (a \circ \phi^{-1})\,\phi_*(X) = \phi_*(a)\,\phi_*(X)$
and $\phi^*(\acute{a}\acute{X}) = (\acute{a} \circ \phi)\,\phi^*(\acute{X}) = \phi^*(\acute{a})\,\phi^*(\acute{X})$.
(vi) $\phi^*\phi_* = \mathrm{Id}_{M_1}$ and $\phi_*\phi^* = \mathrm{Id}_{M_2}$.
(vii) $[\phi_*X, \phi_*Y] = \phi_*[X, Y]$ and $[\phi^*X, \phi^*Y] = \phi^*[X, Y]$.
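Proof. These are all obvious definition checks. Most hold in more general settings.
(i)
\[
\phi_*(ab)(x) = (ab)\bigl(\phi^{-1}(x)\bigr) = a\bigl(\phi^{-1}(x)\bigr)\,b\bigl(\phi^{-1}(x)\bigr) = (\phi_*a)(x)\,(\phi_*b)(x) = (\phi_*a)(\phi_*b)(x);
\]
for $\phi^*$ replace $\phi^{-1}$ with $\phi$.
(ii)
\[
\phi_*(a + b)(x) = (a + b)\bigl(\phi^{-1}(x)\bigr) = a\bigl(\phi^{-1}(x)\bigr) + b\bigl(\phi^{-1}(x)\bigr) = (\phi_*a)(x) + (\phi_*b)(x) = (\phi_*a + \phi_*b)(x).
\]
(iii) This is valid for functions, arc fields and flows. Essentially this follows from $(\psi \circ \phi)^{-1} = \phi^{-1} \circ \psi^{-1}$. Let us explicitly check arc fields:
\[
\begin{aligned}
(\psi \circ \phi)^*X_t(x) &= (\psi \circ \phi)^{-1}\bigl(X_t((\psi \circ \phi)(x))\bigr) = \phi^{-1}\psi^{-1}X_t\,\psi\phi(x)\\
&= \phi^{-1}(\psi^*X)_t(\phi(x)) = (\phi^*(\psi^*X))_t(x) = (\phi^* \circ \psi^*)X_t(x)
\end{aligned}
\]
and
\[
\begin{aligned}
(\psi \circ \phi)_*X_t(x) &= \psi \circ \phi\bigl(X_t\bigl((\psi \circ \phi)^{-1}(x)\bigr)\bigr) = \psi\bigl(\phi\bigl(X_t\bigl(\phi^{-1}\psi^{-1}(x)\bigr)\bigr)\bigr)\\
&= \psi\bigl((\phi_*X)_t\bigl(\psi^{-1}(x)\bigr)\bigr) = (\psi_* \circ \phi_*)X_t(x).
\end{aligned}
\]
(iv) We check only pullbacks:
\[
\begin{aligned}
\phi^*(X + Y)(x, t) &= \phi^{-1}(X + Y)(\phi(x), t) = \phi^{-1}Y_tX_t\phi(x)\\
&= \phi^{-1}Y_t\,\phi\phi^{-1}X_t\phi(x) = (\phi^*(X) + \phi^*(Y))(x, t).
\end{aligned}
\]
(v)
\[
\begin{aligned}
\phi^*(aX)(x, t) &= \phi^{-1}(aX)(\phi(x), t) = \phi^{-1}X\bigl(\phi(x), (a \circ \phi)(x)\,t\bigr)\\
&= \phi^*(X)\bigl(x, (a \circ \phi)(x)\,t\bigr) = (a \circ \phi)\,\phi^*(X)(x, t).
\end{aligned}
\]
(vi) Let’s check two cases; all others are similarly obvious.
\[
\phi^*\phi_*F(x, t) = \phi^*\bigl(\phi\bigl(F(\phi^{-1}(x), t)\bigr)\bigr) = \phi^{-1}\bigl(\phi\bigl(F(\phi\phi^{-1}(x), t)\bigr)\bigr) = F(x, t)
\]
and
\[
\phi_*\phi^*a(x) = \phi_*a(\phi(x)) = a\bigl(\phi^{-1}\phi(x)\bigr) = a(x).
\]
(vii) Automatic from the definition since $\phi_*F$ and $\phi_*G$ are the local flows of $\phi_*X$ and $\phi_*Y$. Checking $t \ge 0$:
\[
\begin{aligned}
[\phi_*X, \phi_*Y]_{t^2}(x) &= (\phi_*G)_{-t}(\phi_*F)_{-t}(\phi_*G)_t(\phi_*F)_t(x)\\
&= \phi G_{-t}\phi^{-1}\,\phi F_{-t}\phi^{-1}\,\phi G_t\phi^{-1}\,\phi F_t\phi^{-1}(x)\\
&= \phi G_{-t}F_{-t}G_tF_t\phi^{-1}(x) = \phi[X, Y]_{t^2}\phi^{-1}x = \phi_*[X, Y]_{t^2}(x).
\end{aligned}
\]
$t < 0$ is just as easy.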
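Parts (iii) and (vi) can be spot-checked on the real line. The sketch below is illustrative only: the arc field `X` and the affine lipeomorphisms `phi`, `psi` are hypothetical choices (with $M_1 = M_2 = M_3 = \mathbb{R}$), and `pushforward` and `pullback` simply transcribe the definitions of this section.

```python
import math

def X(x, t):
    # Hypothetical arc field on R.
    return x + t * math.sin(x)

def pushforward(phi, phi_inv, X):
    # (phi_* X)(x, t) := phi(X(phi^{-1}(x), t))
    return lambda x, t: phi(X(phi_inv(x), t))

def pullback(phi, phi_inv, X):
    # (phi^* X)(x, t) := phi^{-1}(X(phi(x), t))
    return lambda x, t: phi_inv(X(phi(x), t))

# Two affine lipeomorphisms of R and their composition (illustrative choices).
phi, phi_inv = (lambda x: 2.0 * x + 1.0), (lambda x: (x - 1.0) / 2.0)
psi, psi_inv = (lambda x: -3.0 * x), (lambda x: -x / 3.0)
comp = lambda x: psi(phi(x))
comp_inv = lambda x: phi_inv(psi_inv(x))

round_trip = pullback(phi, phi_inv, pushforward(phi, phi_inv, X))      # (vi)
pull_comp = pullback(comp, comp_inv, X)                                # (psi o phi)^* X
pull_iter = pullback(phi, phi_inv, pullback(psi, psi_inv, X))          # phi^* psi^* X
push_comp = pushforward(comp, comp_inv, X)                             # (psi o phi)_* X
push_iter = pushforward(psi, psi_inv, pushforward(phi, phi_inv, X))    # psi_* phi_* X

samples = [(-1.3, 0.2), (0.0, 0.5), (2.7, -0.1)]
for x, t in samples:
    print(round_trip(x, t) - X(x, t),
          pull_comp(x, t) - pull_iter(x, t),
          push_comp(x, t) - push_iter(x, t))
```

Each printed difference should vanish up to rounding, reflecting that the pullback flips the order of composition while the pushforward preserves it.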
Notice the formulas hold formally for arbitrary functions $a : M \to \mathbb{R}$, but we restrict ourselves to Lipschitz functions to guarantee $\phi_*(aX)$ still has bounded speed.
Evidently Theorem 52 shows lipeomorphisms on metric spaces engender maps quite similar to module homomorphisms in view of Theorem 43. How much of homology theory can be grafted onto this context?
Since pullback and linearity are established for arc fields, we can now explore another characterization of the bracket. In the context of $M$ being a smooth manifold, let $F$ and $G$ be local flows generated by smooth vector fields $f : M \to TM$ and $g : M \to TM$. There it is well known the following “dynamic” characterization of the traditionally defined Lie bracket is equivalent to the asymptotic characterization
\[
[f, g] = \left.\frac{d}{dt}(F_t)^*g\,\right|_{t=0} = \lim_{t \to 0}\frac{(F_t)^*g - g}{t}. \tag{2.7}
\]
Using this for inspiration, we return to the context of metric spaces with $F$ and $G$ again the local flows of arc fields $X$ and $Y$. We have
\[
F_t^*G_t(x) = (t[X, Y] + G)_t(x) \quad\text{for } t \ge 0 \text{ and} \tag{2.8}
\]
\[
F_s^*G_s(x) = (-s[-X, -Y] - G)_{-s}(x) \quad\text{for } s < 0 \tag{2.9}
\]
which hold because
\[
\begin{aligned}
(t[X, Y] + G)_t(x) &= G_t[X, Y]_{t^2}(x)\\
&= G_tG_{-t}F_{-t}G_tF_t(x) = F_{-t}G_tF_t(x) = F_t^*G_t(x)
\end{aligned}
\]
and
\[
\begin{aligned}
(-s[-X, -Y] - G)_{-s}(x) &= G_s[-X, -Y]_{s^2}(x) = G_s(-G)_{-s}(-F)_{-s}(-G)_s(-F)_s(x)\\
&= G_sG_sF_sG_{-s}F_{-s}(x) = F_{-s}G_sF_s(x) = F_s^*G_s(x).
\end{aligned}
\]
These facts will be used in Chapter 3 for a proof of the fundamental result on foliations of metric spaces, Theorem 62, as will the following
Proposition 53 If $X$ has local flow $F$ then $(F_s)^*X \sim X$.
If $X$ satisfies E1 and E2 then $(F_s)^*X \approx X$.
Proof. Using the properties of flows $F_t = F_{-s+t+s} = F_{-s}F_tF_s$ and $F_t^{-1} = F_{-t}$ we get
\[
\begin{aligned}
d\bigl((F_s)^*X_t(x),\ X_t(x)\bigr)
&\le d\bigl(F_{-s}X_tF_s(x),\ F_{-s}F_tF_s(x)\bigr) + d\bigl(F_t(x),\ X_t(x)\bigr)\\
&\le e^{s\Lambda_X}\, d\bigl(X_t(y),\ F_t(y)\bigr) + o(t) = o(t)
\end{aligned}
\]
where $y := F_s(x)$ and the exponential comes from Theorem 16.
If $X$ satisfies E1 and E2 then $o(t)$ may be replaced with $O(t^2)$ since then $X \approx F$ by Corollary 14.
Definition 54 Let $\phi : M_1 \to M_2$ be a lipeomorphism. The arc fields $X : M_1 \to AM_1$ and $Y : M_2 \to AM_2$ are called $\phi$-related, denoted $X \sim_\phi Y$, if $Y \sim \phi_*X$. If $Y \approx \phi_*X$ then $X$ and $Y$ are 2nd-order $\phi$-related, denoted $X \approx_\phi Y$. (Remember $\sim$ and $\approx$ have an implicit local uniformity condition.)
Proposition 53 may therefore be restated as $X \sim_{F_s} X$.
Proposition 55 Let $\phi$ be a lipeomorphism.
(i) If $X \sim Y$ then $\phi_*X \sim \phi_*Y$ and $\phi^*X \sim \phi^*Y$.
(ii) $Y \sim \phi_*X$ iff $\phi^*Y \sim X$.
(Consequently $\phi$-related may be equivalently written with the pullback instead of the pushforward, and then rewritten as an equivalence relation due to the following transitive property.)
(iii) $X \sim_\phi Y$ and $Y \sim_\psi Z$ implies $X \sim_{\psi \circ \phi} Z$.
Further (i), (ii), and (iii) hold with $\approx$ in place of $\sim$.
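Proof. (i)
\[
\begin{aligned}
d\bigl(\phi_*X_t(x),\ \phi_*Y_t(x)\bigr) &= d\bigl(\phi\bigl(X_t(\phi^{-1}(x))\bigr),\ \phi\bigl(Y_t(\phi^{-1}(x))\bigr)\bigr)\\
&\le K_\phi\, d\bigl(X_t(\phi^{-1}(x)),\ Y_t(\phi^{-1}(x))\bigr) = o(t).
\end{aligned}
\]
Similarly for $\phi^*$. (ii) follows from (i) since $\phi^*$ and $\phi_*$ are inverse:
\[
Y \sim \phi_*X \ \Rightarrow\ \phi^*Y \sim \phi^*\phi_*X = X.
\]
(iii) By definition $Y \sim \phi_*X$ and $Z \sim \psi_*Y$ so (i) implies $Z \sim \psi_*Y \sim \psi_*\phi_*X = (\psi \circ \phi)_*X$.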
Proposition 56 Assume $X$ and $Y$ are arc fields on $M$ which satisfy Conditions E1 and E2, and let $F$ and $G$ be the flows of $X$ and $Y$. Then
(i) $G_{t*}F$ (respectively $G_t^*F$) is the flow generated by $G_{t*}X$ (respectively $G_t^*X$).
(ii) $X \sim_\phi Y$ iff $G = \phi_*F$ iff $F \sim_\phi G$.
(iii) $F_{t*}X \sim X$.
(iv) $G_{t*}X$ and $G_t^*X$ satisfy Condition E2 (but not necessarily E1).
Proof. (i)
\[
\begin{aligned}
(G_{t*}F)_{r+s}(x) &= G_tF_{r+s}(G_{-t}x) = G_tF_rF_sG_{-t}x = G_tF_rG_{-t}G_tF_sG_{-t}x\\
&= (G_{t*}F_s)(G_{t*}F_r)(x)
\end{aligned}
\]
and
\[
(G_{t*}F)_0(x) = G_tF_0(G_{-t}x) = x;
\]
consequently $G_{t*}F$ is a flow and is tangent to $G_{t*}X$ since
\[
\begin{aligned}
d\bigl(G_{t*}F_s(x),\ G_{t*}X_s(x)\bigr) &= d\bigl(G_tF_s(G_{-t}x),\ G_tX_s(G_{-t}x)\bigr)\\
&\le e^{\Lambda_Y t}\, d\bigl(F_s(G_{-t}x),\ X_s(G_{-t}x)\bigr) = O(s^2).
\end{aligned}
\]
The claim is settled by Proposition 51, which also informs:
(ii)
\[
G \sim Y \sim \phi_*X \sim \phi_*F
\]
and so $G = \phi_*F$ since $\phi_*F$ is a local flow and $G$ is the unique flow tangent to $Y$.
(iii) We repeat this fact because (ii) now gives us an automatic proof with $\phi := F_t$.
(iv) E2:
\[
\begin{aligned}
d\bigl(G_{t*}X_{r+s}(x),\ G_{t*}X_s(G_{t*}X_r(x))\bigr)
&= d\bigl(G_tX_{r+s}(G_{-t}x),\ G_tX_s(G_{-t}(G_{t*}X_r(x)))\bigr)\\
&= d\bigl(G_tX_{r+s}(G_{-t}x),\ G_tX_s(G_{-t}(G_tX_r(G_{-t}x)))\bigr)\\
&= d\bigl(G_tX_{r+s}(G_{-t}x),\ G_tX_sX_r(G_{-t}x)\bigr)\\
&\le e^{\Lambda_Y t}\, d\bigl(X_{r+s}(G_{-t}x),\ X_sX_r(G_{-t}x)\bigr)\\
&\le e^{\Lambda_Y t}\,|rs|\,\Omega_X = |rs|\,\Omega
\end{aligned}
\]
where $\Omega := e^{\Lambda_Y t}\,\Omega_X$.
E1 is not quite satisfied, though: for $t \ge 0$
\[
\begin{aligned}
d\bigl(G_{t*}X_s(x),\ G_{t*}X_s(y)\bigr)
&= d\bigl(G_tX_s(G_{-t}x),\ G_tX_s(G_{-t}y)\bigr) \le e^{\Lambda_Y t}\, d\bigl(X_s(G_{-t}x),\ X_s(G_{-t}y)\bigr)\\
&\le e^{\Lambda_Y t}\, d(G_{-t}x, G_{-t}y)\,(1 + s\Lambda_X) \le e^{2\Lambda_Y t}\, d(x, y)\,(1 + s\Lambda_X).
\end{aligned}
\]
Proposition 57 Assume E1 and E2 are satisfied by $X$, $X'$, $Y$ and $Y'$. If $X \sim_\phi X'$ and $Y \sim_\phi Y'$ then $[X, Y] \sim_\phi [X', Y']$.
Proof. $\phi_*X \sim X'$ and $\phi_*Y \sim Y'$ so since $\phi_*F$ is the local flow of $\phi_*X$ it is also the local flow of $X'$, and $\phi_*G$ is similarly the local flow of $Y'$. Then
\[
\phi_*[X, Y] = [\phi_*X, \phi_*Y] = [X', Y']
\]
by definition of the bracket.
Chapter 3
Foliations
The geometry of a manifold $M$ may be understood more easily if we can foliate it. A foliation is a fundamental deconstruction of $M$, a partition of $M$ into a family of sets called the “leaves” of the foliation. The most elementary example is to partition $\mathbb{R}^n$ with the leaves $\mathbb{R}^p \times \{x\}$ where $x \in \mathbb{R}^q$ and $p + q = n$ with $p, q \ge 0$. In this manner $\mathbb{R}^n$ is partitioned into $\mathbb{R}^q$ many copies of $\mathbb{R}^p$. The leaves of more interesting, curved foliations may be constructed by first specifying sets of vector fields (called distributions) to which the leaves are tangent. If conversely a foliation is first established, then the dynamics of flows for vector fields in the distribution tangent to the foliation may be better understood in terms of the geometry of the foliation.
In this chapter we again substitute arc fields for vector fields and apply the limited Lie algebra developed in Chapter 2 to prove Frobenius’ Foliation Theorem on a metric space. These ideas lead to an infinite-dimensional control theory in §3.5, which gives new approximation schemes using previously unanticipated families of functions in Chapter 5.
3.1 Introduction
To give a quick impression of the very geometrical topic of this chapter, let’s look at a few drawings of leaves and foliations before we delve into the technical definitions. In $\mathbb{R}^3$ consider the set $\mathcal{S}$ consisting of the $x_3$-axis and the unit circle in the $x_1x_2$-plane,
\[
\mathcal{S} := \bigl\{x \in \mathbb{R}^3 \mid x_1 = 0 = x_2\bigr\} \cup \bigl\{x \in \mathbb{R}^3 \mid x_1^2 + x_2^2 = 1,\ x_3 = 0\bigr\}
\]
where $x = (x_1, x_2, x_3)$. Figures 3.1-3.5 demonstrate different foliations of the space $M := \mathbb{R}^3 \setminus \mathcal{S}$.
Figure 3.1 displays a foliation of $M$ by tori, generated by rotating the circles in Figure 3.2 about the $x_3$-axis (see Example 5 in §0.3 for the formula). As the tori shrink they converge to the circle $\{x \in \mathbb{R}^3 \mid x_2^2 + x_3^2 = 1\}$; as they grow they fill up the rest of $\mathbb{R}^3$ except the $z$-axis $\{x \in \mathbb{R}^3 \mid x_1 = 0 = x_2\}$. Then we can see the three-dimensional space $M$ is homeomorphic to a continuum’s worth of copies of the torus $T^2$, i.e., $M \cong T^2 \times (0, \infty)$. Each torus is a leaf, and the collection of leaves forms a foliation of $M$.
Figure 3.1: Torus foliation (4 leaves)
Figure 3.2: rotating circles
We don’t include the $x_3$-axis nor the $x_1x_2$-plane unit circle in $M$, because they are 1-dimensional instead of 2 and so cannot be leaves in the foliation. Technically the collection of all the 2D surfaces along with the two 1D curves is a stratification of $\mathbb{R}^3$, cf. [22]. You might wish to compare these figures with the stereographic projection to $\mathbb{R}^3 \cup \{\infty\}$ of the Hopf fibration of the unit three-sphere in $\mathbb{R}^4$
\[
S^3 := \bigl\{x \in \mathbb{R}^4 \mid \|x\| = 1\bigr\}
\]
(cf. [63, p. 103], e.g.) where the construction is more natural. Through this projection, these pictures give a hint on how to construct several topologically distinct fixed-point-free flows on $S^3$, and flows with extremely complex basins of attraction on this simplest of compact 3-dimensional manifolds.
Now if we instead have these circles grow as they rotate around the $x_3$-axis, we get an ouroboros (or snake-eating-its-own-tail) foliation, Figure 3.3. Each ouroboros leaf is homeomorphic to an infinite cylinder, or $S^1 \times (0, \infty)$, where $S^1$ denotes the circle. So $M \cong \bigl(S^1 \times (0, \infty)\bigr) \times S^1$. To see this start with an initial ouroboros and follow the foliation, expanding along “larger” leaves. As the leaves grow, filling out the foliation, we return in finite time to the initial leaf.
Figure 3.3: Ouroboros foliation (1 leaf)
Next a spiral is rotated around the $z$-axis, Figure 3.4. Again (but for slightly changed reasons) the leaf is homeomorphic to a cylinder and $M \cong \bigl((0, \infty) \times S^1\bigr) \times S^1$.
Figure 3.4: Spiral torus foliation (1 leaf)
You can test your geometric imagination by adding one more twist. Shrink the spiral as it rotates around the $z$-axis. This spiral may or may not intersect the previous copies of itself as it rotates and shrinks, depending on whether or not the ratio of the rotation rate and the shrinking rate is rational. For Figure 3.5 the rates are chosen so the spiral rotates twice around the $x_3$-axis before closing on itself perfectly, giving a surface homeomorphic to a cylinder as in the case of Figure 3.4, but now the cylinder is twisted twice, Figure 3.6.
This twisted cylinder may be disorienting, but it is still orientable. Try tracing an edge to see how the solution to a vector field tangent to one set of grid lines would be periodic; then in Figure 3.7 we see how an initial condition close to the center would follow a path that leads it far away before returning.
Figure 3.5: Ouroboros spiral with 720° rotation/dilation symmetry (1 leaf)
If the ratio of dilation to rotation is irrational the surface will not close with the rotation. Then a single leaf is homeomorphic to the plane, $\mathbb{R}^2$, and dense in $M$. See Figure 3.8. We get a situation reminiscent of the irrational flow on the torus $T^2$ whose flow lines are dense (Example 5).
In the general case, a local flow gives a local foliation with 1-dimensional leaves, namely the integral curves’ paths. In this sense Frobenius’ Foliation Theorem generalizes the Fundamental Theorem of ODEs. A flow without equilibria foliates the whole space (remember Example 5). The existence of a nontrivial foliation is not guaranteed on general spaces $M$ and depends on the global topology of $M$. For example, there is no 1-dimensional foliation of $S^n$ for even numbers $n$, nor for any compact surface except the torus and the Klein bottle.
Figure 3.6: Twisted cylinder
Figure 3.7: A “longer” twisted cylinder
Figure 3.8: Ouroboros spiral without rational rotation/dilation symmetry (1 leaf)
Higher-dimensional foliations are fundamental in many diverse subjects; the three areas that inspire our interest are differential geometry, dynamical systems, and control theory. The heuristic geometric idea is to generalize vector fields to “plane fields”, which may be algebraically defined. Plane fields of any dimension are used and are called distributions, which are not to be confused with the generalized functions from analysis and other mathematical concepts which unfortunately share the bland term.
Intuitively we might expect to be able to “integrate” a plane field to get a surface tangent to the plane field starting from any point; i.e., there should exist a foliation tangent to any distribution, giving a basic link between algebra and geometry. However, unlike the 1-dimensional case, where Lipschitz continuous vector fields always have integral curves, many well-defined smooth plane fields have no integral surfaces; such distributions are called nonholonomic. A nonholonomic plane field is an intuitively disturbing object in geometry (but is the starting point of the subject called “contact geometry”, fundamental in mechanics).
Example 58 The archetypical noninvolutive distribution is in $\mathbb{R}^3$ with the set of planes given by the linear spans of the vector fields $f, g : \mathbb{R}^3 \to \mathbb{R}^3$ where $f(x_1, x_2, x_3) := (1, 0, 0)$ and $g(x_1, x_2, x_3) := (0, 1, x_1)$. We readily verify this distribution has no surface whose tangent spaces coincide with the plane field, because the reachable set from any point is all of $\mathbb{R}^3$: we can move tangentially to the plane field by moving parallel to the $x_1$-axis at any time; at the $x_2x_3$-plane we can also move parallel to the $x_2$-axis; at any other point we can move either up or down diagonally. If there were an integral surface for this distribution, the reachable set from a point on the surface would be limited to the 2-dimensional surface (by Nagumo invariance, Theorem 33). Therefore this distribution is nonholonomic.
Figure 3.9: Noninvolutive distribution
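The failure of involutivity in Example 58 can be seen directly from the metric space bracket (2.6). The following sketch is an illustration added under stated assumptions: the closed-form flows `F` and `G` were obtained by solving $\dot{x} = f(x)$ and $\dot{x} = g(x)$ by hand (for $g$, the coordinate $x_1$ is constant along solutions), and the helper names are hypothetical.

```python
import math

def F(x, t):
    # Flow of f = (1, 0, 0): translate along the x1-axis.
    return (x[0] + t, x[1], x[2])

def G(x, t):
    # Flow of g = (0, 1, x1): x1 stays constant, so x3 grows at rate x1.
    return (x[0], x[1] + t, x[2] + t * x[0])

def bracket(x, t):
    # Definition (2.6) applied to the flows F and G.
    r = math.sqrt(abs(t))
    if t >= 0:
        return G(F(G(F(x, r), r), -r), -r)   # G_{-r} F_{-r} G_r F_r (x)
    return F(G(F(G(x, r), r), -r), -r)       # F_{-r} G_{-r} F_r G_r (x)

x = (0.3, -1.2, 2.0)
for t in (0.25, 1e-3, -0.25):
    print(t, bracket(x, t))
```

With these flows the bracket works out to the translation $x \mapsto x + t\,(0, 0, 1)$, and $(0, 0, 1)$ is never a linear combination of $(1, 0, 0)$ and $(0, 1, x_1)$, which agrees with the example's conclusion that the distribution is not involutive.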
In control theory, a nonholonomic plane field may be a boon: if a 2-dimensional distribution generated by a pair of vector fields is nonholonomic, then the reachable set is more than 2-dimensional. In Chapter 4 we’ll see how a 2-dimensional distribution in $L^2$ may have an infinite-dimensional reachable set.
The Local Frobenius Theorem (Theorem 62) gives an algebraic property which characterizes holonomic distributions: when the bracket of any two vector fields in the plane field is still in the plane field, then the distribution has a tangent foliation. The technical terminology is: involutive distributions are integrable. Extending the integral surfaces by continuation gives us the Global Frobenius Theorem (Theorem 75): each extended integral surface is a leaf, and the collection of leaves partitions $M$ into a foliation.
In §3.2 we will prove the Local Frobenius Theorem on a fully general metric space. A novel approach to the proof is needed in order to use the metric space bracket. This paragraph gives an outline of the proof, simplified to vector fields on a manifold. The terminology will be clarified in §3.2, and Figures 3.10 and 3.11 from §3.2 may aid your intuition. The crux of the Local Frobenius Theorem in two dimensions is as follows: Given two transverse vector fields $f, g : M \to TM$ there exists an integral surface (tangent to linear combinations of $f$ and $g$) through any point $x_0 \in M$, under the assumption that the Lie bracket satisfies $[f, g] = af + bg$ for some choice of functions $a, b : M \to \mathbb{R}$ (involutivity of $f$ and $g$). To prove this, define
\[
S := \bigl\{F_tG_s(x_0) \in M \mid |s|, |t| < \delta\bigr\}
\]
where $F$ and $G$ are the local flows of $f$ and $g$. Since $f$ and $g$ are transverse, we may choose $\delta > 0$ small enough for $S$ to be a well-defined surface. $S$ will be shown to be the desired integral surface through $x_0$. Notice $S$ is tangent to $f$ by construction, but it is not immediately clear $S$ is tangent to $a'f + b'g$ for arbitrarily chosen $a', b' \in \mathbb{R}$. Notice, though, that by construction $S$ is tangent to $g$ at any point $x = G_s(x_0)$, and also $S$ is tangent to $a'f + b'g$ at this same $x$ for functions $a'$ and $b'$. Therefore establishing
\[
(F_t)^*(a'f + b'g) = a''f + b''g \quad\text{at } x = G_s(x_0) \tag{3.1}
\]
for some functions $a''$ and $b''$, proves $S$ is tangent to $a'f + b'g$ at an arbitrary point $z = F_tG_s(x_0) \in S$, since the pushforward $(F_t)_*$ and the pullback $(F_t)^*$ are inverse to each other and preserve tangency since they are local lipeomorphisms. Next since the Lie bracket equals the Lie derivative,
\[
\lim_{h \to 0}\frac{F_h^*(g) - g}{h} = [f, g] = af + bg
\]
for some $a$ and $b$ by involutivity, so
\[
F_h^*(g) = g + h(af + bg) + o(h) = \bar{a}f + \bar{b}g + o(h).
\]
Using the fact that $F_h^*(f) = f$ for any $h$, and the linearity of pullback for fixed $t$, we have for functions $a_i$ and $b_i : M \to \mathbb{R}$
\[
F_{t/n}^*(a_if + b_ig) = (a_{i+1}f + b_{i+1}g) + o(1/n)
\]
for some functions $a_{i+1}$ and $b_{i+1}$. Then since
\[
F_t^* = \underbrace{F_{t/n}^*F_{t/n}^*\cdots F_{t/n}^*}_{\text{composition } n \text{ times}} = \bigl(F_{t/n}^*\bigr)^{(n)}
\]
we get (3.1) as follows:
\[
\begin{aligned}
F_t^*(a_0f + b_0g) &= \lim_{n \to \infty}\bigl(F_{t/n}^*\bigr)^{(n)}(a_0f + b_0g)\\
&= \lim_{n \to \infty} a_nf + b_ng + n\,o(1/n) = a_\infty f + b_\infty g + 0
\end{aligned}
\]
completing the sketch for manifolds.
[40], [22] and [64] are good introductions with deeper insights on foliations on finite-dimensional manifolds. Topological, analytical, and geometric questions have been explored voluminously; [64] has 263 pages of references up to 1996 and some examples in infinite dimensions.
3.2 Local integrability
In this section we prove the 2-dimensional local Frobenius Theorem on a metric space, Theorem 62.
Definition 59 Two arc fields $X$ and $Y$ are (locally uniformly) transverse if for each $x_0 \in M$ there exists a $\delta > 0$ such that
\[
d\bigl(X_s(x),\ Y_t(x)\bigr) \ge \delta\,(|s| + |t|)
\]
for $|t| < \delta$ for all $x \in B(x_0, \delta)$.
Example 60 On the plane $\mathbb{R}^2$ with Euclidean norm $\|\cdot\|$ any two linearly independent vectors $u, v \in \mathbb{R}^2$ give us the transverse arc fields
\[
X_t(x) := x + tu \quad\text{and}\quad Y_t(x) := x + tv.
\]
To check this, it is perhaps easiest to define a new norm on $\mathbb{R}^2$ by
\[
\|x\|_{uv} := |x_1| + |x_2|
\]
where $x = x_1u + x_2v$ and $x_1, x_2 \in \mathbb{R}$. Since all norms on $\mathbb{R}^2$ are metrically equivalent there must exist a constant $C > 0$ such that $\|x\|_{uv} \le C\|x\|$ for all $x \in \mathbb{R}^2$. Then taking $\delta := \frac{1}{C}$
\[
d\bigl(X_s(x),\ Y_t(x)\bigr) = \|su - tv\| \ge \delta\,\|su - tv\|_{uv} = \delta\,(|s| + |t|).
\]
Localization shows any pair of continuous vector fields $f$ and $g$ on a differentiable manifold (metrized in any manner) give transverse arc fields if $f$ and $g$ are non-colinear at each point.
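The estimate in Example 60 can be checked numerically for a concrete pair of vectors. The sketch below is only illustrative: the vectors `u`, `v` are hypothetical choices, and the constant $C$ is computed as the sharp norm-equivalence constant (an assumption of this sketch; the example only needs some valid $C$).

```python
import math
import random

u, v = (1.0, 0.0), (1.0, 0.5)   # hypothetical linearly independent vectors

def coeffs(x):
    # Coordinates (x1, x2) with x = x1*u + x2*v, by Cramer's rule.
    det = u[0] * v[1] - u[1] * v[0]
    return ((x[0] * v[1] - x[1] * v[0]) / det,
            (u[0] * x[1] - u[1] * x[0]) / det)

def norm_uv(x):
    c1, c2 = coeffs(x)
    return abs(c1) + abs(c2)

def equivalence_constant():
    # Smallest C with norm_uv(x) <= C * |x|. Since |c1| + |c2| is the max of
    # the four signed sums (+-c1 +- c2), each linear in x, C is the largest
    # Euclidean norm among those linear functionals.
    e1, e2 = coeffs((1.0, 0.0)), coeffs((0.0, 1.0))
    row1, row2 = (e1[0], e2[0]), (e1[1], e2[1])
    return max(math.hypot(row1[0] + row2[0], row1[1] + row2[1]),
               math.hypot(row1[0] - row2[0], row1[1] - row2[1]))

delta = 1.0 / equivalence_constant()

def transverse_gap(s, t):
    # d(X_s(x), Y_t(x)) - delta * (|s| + |t|); independent of the base point x.
    w = (s * u[0] - t * v[0], s * u[1] - t * v[1])
    return math.hypot(w[0], w[1]) - delta * (abs(s) + abs(t))

rng = random.Random(0)
worst = min(transverse_gap(rng.uniform(-1, 1), rng.uniform(-1, 1))
            for _ in range(10000))
print(delta, worst)
```

The smallest gap over the random samples should be nonnegative up to rounding, consistent with the transversality inequality of Definition 59 for these translation arc fields.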
A (2-dimensional) surface is a 2-dimensional topological manifold, i.e., locally homeomorphic to $\mathbb{R}^2$.
For any subset $N \subset M$ and element $x \in M$ the distance from $x$ to $N$ is defined (with an excusable overload of notation $d$) as
\[
d(x, N) := \inf\{d(x, y) : y \in N\}.
\]
This new function $d$ is not a metric, obviously, but it does satisfy a kind of triangle inequality:
\[
d(x, N) \le d(x, y) + d(y, N)
\]
for all $x, y \in M$, as is easy to verify.
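As a quick sanity check of this triangle-type inequality, here is a small sketch on the Euclidean plane. The finite set `N` and the sample points are hypothetical, and for a finite set the infimum is simply a minimum.

```python
import math

def d(x, y):
    # Euclidean metric on R^2.
    return math.hypot(x[0] - y[0], x[1] - y[1])

def dist_to_set(x, N):
    # d(x, N) := inf{ d(x, z) : z in N }; a minimum because N is finite here.
    return min(d(x, z) for z in N)

N = [(0.0, 0.0), (2.0, 1.0), (-1.0, 3.0)]
points = [(0.5, 0.5), (3.0, -2.0), (-4.0, 1.0), (2.0, 1.0)]

# Smallest value of d(x, y) + d(y, N) - d(x, N) over all sampled pairs.
slack = min(d(x, y) + dist_to_set(y, N) - dist_to_set(x, N)
            for x in points for y in points)
print(slack)
```

The slack is never negative, as the inequality predicts; the general argument is the usual one of applying the triangle inequality to each $z \in N$ and then taking the infimum.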
Definition 61 A surface $S \subset M$ is an integral surface for two transverse arc fields $X$ and $Y$ if given any Lipschitz functions $a, b : M \to \mathbb{R}$ we have $S$ locally uniformly tangent to $aX + bY$ restricted to $S$, i.e.,
\[
d\bigl((aX + bY)_t(x),\ S\bigr) = o(t)
\]
locally uniformly for $x \in S$. Locally uniform tangency is denoted $S \sim aX + bY$.
Theorem 62 Assume $X$ and $Y$ are transverse, and satisfy E1 and E2 on a locally complete metric space $M$. If $[X, Y] \sim aX + bY$ for some Lipschitz functions $a, b : M \to \mathbb{R}$, then for each $x_0 \in M$ there exists an integral surface $S$ through $x_0$.
Proof. The metric space analogs of the bracket and the pullback defined in §2.2 and §2.3 will now be inserted into the manifold outline given in §3.1. A rigorous verification of the analytic estimates requires voluminous, but straightforward, calculations painstakingly detailed in the next six pages.
Let $F$ and $G$ be the local flows of $X$ and $Y$. Define
\[
S := \bigl\{F_tG_s(x_0) \mid |s|, |t| < \delta\bigr\}
\]
where $\delta > 0$ is chosen small enough for $S$ to be a well-defined surface (Figure 3.10). I.e., $F_{t_1}G_{s_1}(x_0) = F_{t_2}G_{s_2}(x_0)$ implies $t_1 = t_2$ and $s_1 = s_2$, so
\[
\phi : (-\delta, \delta) \times (-\delta, \delta) \subset \mathbb{R}^2 \to S \subset M
\]
defined by $\phi(s, t) := F_tG_s(x_0)$ is a homeomorphism. Finding such a $\delta$ is possible since $X$ and $Y$ are transverse. To see this, assume the contrary. Then there are different choices of $s_i$ and $t_i$ which give $F_{t_1}G_{s_1}(x_0) = F_{t_2}G_{s_2}(x_0)$ which implies $G_{s_1}(x_0) = F_{t_3}G_{s_2}(x_0)$ and letting $y := G_{s_2}(x_0)$ we must also then have
\[
F_t(y) = G_s(y). \tag{3.2}
\]
If our current contrary assumption were true, then for all $\varepsilon > 0$ there would exist $s$ and $t$ with $|s|, |t| < \varepsilon$ such that (3.2) holds. This contradicts the fact that $X$ and $Y$ are transverse.
Figure 3.10: integral surface $S$
We will show S is a desired integral surface through $x_0$. Assume δ is also
chosen small enough so throughout S the functions |a| and |b| are bounded,
while the constants Λ, Ω, and ρ hold for X and Y uniformly, and the closure
of B(x, 2δ(ρ + 1)) is complete. This is possible because F and G have locally
bounded speeds, since X and Y do.
S ∼ X by construction, but it is not immediately clear that $S \sim a'X + b'Y$ for
arbitrarily chosen $a', b' \in \mathbb{R}$. We can use
$$a'X + b'Y \sim a'F + b'G$$
and so we will show $S \sim a'F + b'G$. We need to show this is true for an
arbitrary point z ∈ S, so assume $z := F_tG_s(x_0)$ for some s and t ∈ R. When
t = 0 however, i.e., at any $x := G_s(x_0)$, it is easy to see our desired result
holds: by the construction of S we have $S \sim a'F + b'G$, since
$a'F + b'G \sim b'G + a'F$ (Theorem 43 (vi)) and
$$(b'G + a'F)_h(x) = F_{a'(G_{b'(x)h}(x))h}\,G_{b'(x)h}(x) = F_{a'(G_{b'(x)h}(x))h}\,G_{b'(x)h}\,G_s(x_0) \in S$$
when h is small.
($x_0$, x, z, s and t are now fixed for the remainder of the proof; however, we
only explicitly check the case t > 0, indicating the changes where needed to
check the t < 0 case.)
If we prove
$$(F_t)^*(a'F + b'G) \sim S \quad \text{at } x = G_s(x_0) \tag{3.3}$$
this will prove $S \sim a'F + b'G$ at z, since the pushforward $(F_t)_*$ and the
pullback $(F_t)^*$ are inverse, and are local lipeomorphisms, and so preserve tangency.
(See Figure 3.11.)
Figure 3.11: pullback to $G_s(x_0)$
Recalling (2.8):
$$\big(F_t^*G\big)_t(x) = \big(t[X,Y] + G\big)_t(x)$$
so
$$\big(F_{t/n}^*G\big)_{t/n}(x) = \big(\tfrac{t}{n}[X,Y] + G\big)_{t/n}(x) \tag{3.4}$$
for our previously fixed small t ≥ 0 and arbitrary positive integer n ∈ N. (For
t < 0 use (2.9) instead.) Clearly for any arc fields $Z$ and $\bar Z$
$$d\big(Z_s(x), \bar Z_s(x)\big) = o(s) \quad\text{implies}\quad d\big((sZ)_s(x), (s\bar Z)_s(x)\big) = d\big(Z_{s^2}(x), \bar Z_{s^2}(x)\big) = o(s^2) \tag{3.5}$$
and so
$$[X,Y] \sim aF + bG \quad\text{implies}\quad d\Big(\big(\tfrac tn[X,Y]\big)_{t/n}(x),\ \big(\tfrac tn(aF+bG)\big)_{t/n}(x)\Big) = o\big(\tfrac{1}{n^2}\big) \tag{3.6}$$
since t is fixed.
We use these facts to establish (3.3), first checking
$$d\Big(\big(F_t^*(a'F+b'G)\big)_{t/n}(x),\, S\Big) = o\big(\tfrac1n\big)$$
as n → ∞. At the end of the proof we will replace t/n by arbitrary r → 0.
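Using the linearity of pullback (Theorem 52) we get
$$\begin{aligned}
d\Big(\big(F_t^*(a'F+b'G)\big)_{t/n}(x),\, S\Big)
&= d\Big(\big((a'\circ F_t)\,F_t^*(F) + (b'\circ F_t)\,F^{*(n)}_{t/n}(G)\big)_{t/n}(x),\, S\Big)\\
&= d\Big(\big(a_0F + b_0F^{*(n)}_{t/n}(G)\big)_{t/n}(x),\, S\Big)
\end{aligned}$$
where $a_0 := a'\circ F_t$ and $b_0 := b'\circ F_t$. Using (3.4) means this last estimate is
$$\begin{aligned}
&= d\Big(\big(a_0F + b_0F^{*(n-1)}_{t/n}\big(\tfrac tn[X,Y]+G\big)\big)_{t/n}(x),\, S\Big)\\
&\le d\Big(\big(a_0F + b_0F^{*(n-1)}_{t/n}\big(\tfrac tn[X,Y]+G\big)\big)_{t/n}(x),\ \big(a_0F + b_0F^{*(n-1)}_{t/n}\big(\tfrac tn(aF+bG)+G\big)\big)_{t/n}(x)\Big)\\
&\quad + d\Big(\big(a_0F + b_0F^{*(n-1)}_{t/n}\big(\tfrac tn(aF+bG)+G\big)\big)_{t/n}(x),\, S\Big).
\end{aligned}\tag{3.7}$$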
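The first summand of (3.7) is now analyzed as
$$\begin{aligned}
& d\Big(\big(a_0F + b_0F^{*(n-1)}_{t/n}\big(\tfrac tn[X,Y]+G\big)\big)_{t/n}(x),\ \big(a_0F + b_0F^{*(n-1)}_{t/n}\big(\tfrac tn(aF+bG)+G\big)\big)_{t/n}(x)\Big)\\
&= d\Big(\big(b_0F^{*}_{(n-1)t/n}\big(\tfrac tn[X,Y]+G\big)\big)_{t/n}(y),\ \big(b_0F^{*}_{(n-1)t/n}\big(\tfrac tn(aF+bG)+G\big)\big)_{t/n}(y)\Big)
\end{aligned}$$
where $y := (a_0F)_{t/n}(x)$
$$\begin{aligned}
&= d\Big(\big(F^{*}_{(n-1)t/n}\big(\tfrac tn[X,Y]+G\big)\big)_{b_0(y)t/n}(y),\ \big(F^{*}_{(n-1)t/n}\big(\tfrac tn(aF+bG)+G\big)\big)_{b_0(y)t/n}(y)\Big)\\
&= d\Big(F_{-(n-1)t/n}\big(\tfrac tn[X,Y]+G\big)_{b_0(y)t/n}F_{(n-1)t/n}(y),\ F_{-(n-1)t/n}\big(\tfrac tn(aF+bG)+G\big)_{b_0(y)t/n}F_{(n-1)t/n}(y)\Big)\\
&= d\Big(F_{-(n-1)t/n}\big(\tfrac tn[X,Y]+G\big)_{b_0(y)t/n}(z),\ F_{-(n-1)t/n}\big(\tfrac tn(aF+bG)+G\big)_{b_0(y)t/n}(z)\Big)
\end{aligned}\tag{3.8}$$
where $z := F_{(n-1)t/n}(y)$. Then by Theorem 16, (3.8) is
$$\begin{aligned}
&\le d\Big(\big(\tfrac tn[X,Y]+G\big)_{b_0(y)t/n}(z),\ \big(\tfrac tn(aF+bG)+G\big)_{b_0(y)t/n}(z)\Big)\,e^{\Lambda_X(n-1)t/n}\\
&= d\Big(G_{b_0(y)t/n}\big(\tfrac tn[X,Y]\big)_{b_0(y)t/n}(z),\ G_{b_0(y)t/n}\big(\tfrac tn(aF+bG)\big)_{b_0(y)t/n}(z)\Big)\,e^{\Lambda_X(n-1)t/n}\\
&\le d\Big(\big(\tfrac tn[X,Y]\big)_{b_0(y)t/n}(z),\ \big(\tfrac tn(aF+bG)\big)_{b_0(y)t/n}(z)\Big)\,e^{\Lambda_X(n-1)t/n}\,e^{\Lambda_Y b_0(y)t/n}\\
&\le r\Big(b_0(y)\big(\tfrac tn\big)^2\Big)\,e^{\Lambda_X(n-1)t/n+\Lambda_Y b_0(y)t/n} =: o_1\big(\tfrac{1}{n^2}\big)
\end{aligned}\tag{3.9}$$
where we define
$$r(s) := d\big([X,Y]_s(z),\ (aF+bG)_s(z)\big).$$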
By the main assumption of the theorem, r(s) = o(s), so we have
$o_1\big(\tfrac{1}{n^2}\big) = o\big(\tfrac{1}{n^2}\big)$, but we need to keep a careful record of this estimate as we will be
summing n terms like it; the subscript distinguishes $o_1$ as a specific function.
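Substituting (3.9) into (3.7) gives
$$\begin{aligned}
& d\Big(\big(F_t^*(a'F+b'G)\big)_{t/n}(x),\, S\Big)\\
&= d\Big(\big(a_0F + b_0F^{*(n)}_{t/n}G\big)_{t/n}(x),\, S\Big) \qquad (3.10)\\
&\le d\Big(\big(a_0F + b_0F^{*(n-1)}_{t/n}\big(\tfrac tn(aF+bG)+G\big)\big)_{t/n}(x),\, S\Big) + o_1\big(\tfrac1{n^2}\big)\\
&= d\Big(\Big[a_0F + b_0\tfrac tn\big(a\circ F_{(n-1)t/n}\big)F + b_0\big(\tfrac tn\, b\circ F_{(n-1)t/n}+1\big)F^{*(n-1)}_{t/n}G\Big]_{t/n}(x),\, S\Big) + o_1\big(\tfrac1{n^2}\big)\\
&= d\Big(\Big[\Big(a_0 + \big[b_0\tfrac tn\big(a\circ F_{(n-1)t/n}\big)\big]\circ(a_0F)_{t/n}\Big)F + b_0\big(\tfrac tn\, b\circ F_{(n-1)t/n}+1\big)F^{*(n-1)}_{t/n}G\Big]_{t/n}(x),\, S\Big) + o_1\big(\tfrac1{n^2}\big)\\
&= d\Big(\big(a_1F + b_1F^{*(n-1)}_{t/n}G\big)_{t/n}(x),\, S\Big) + o_1\big(\tfrac1{n^2}\big) \qquad (3.11)
\end{aligned}$$
where
$$a_1 := a_0 + \big[b_0\tfrac tn\big(a\circ F_{(n-1)t/n}\big)\big]\circ(a_0F)_{t/n}
\quad\text{and}\quad
b_1 := b_0\big(\tfrac tn\, b\circ F_{(n-1)t/n}+1\big).$$
Getting from the third line to the fourth line uses the linearity of pullback
(Theorem 52), while the fifth line is due to the linearity of F (Lemma 44).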
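After toiling through these many complicated estimates we can relax a bit,
since the rest of the proof follows more algebraically by iterating the result of
lines (3.10) and (3.11):
$$\begin{aligned}
d\Big(\big(a_0F + b_0F^{*(n)}_{t/n}G\big)_{t/n}(x),\, S\Big)
&\le d\Big(\big(a_1F + b_1F^{*(n-1)}_{t/n}G\big)_{t/n}(x),\, S\Big) + o_1\big(\tfrac1{n^2}\big)\\
&\le d\Big(\big(a_2F + b_2F^{*(n-2)}_{t/n}G\big)_{t/n}(x),\, S\Big) + o_1\big(\tfrac1{n^2}\big) + o_2\big(\tfrac1{n^2}\big)\\
&\le \cdots \le d\Big(\big(a_nF + b_nG\big)_{t/n}(x),\, S\Big) + \sum_{i=1}^{n} o_i\big(\tfrac1{n^2}\big)
\end{aligned}\tag{3.12}$$
where
$$a_2 := a_1 + \big[b_1\tfrac tn\big(a\circ F_{(n-2)t/n}\big)\big]\circ(a_1F)_{t/n},
\qquad
b_2 := b_1\big(\tfrac tn\, b\circ F_{(n-2)t/n}+1\big)$$
and in general
$$a_i := a_{i-1} + \big[b_{i-1}\tfrac tn\big(a\circ F_{(n-i)t/n}\big)\big]\circ(a_{i-1}F)_{t/n},
\qquad
b_i := b_{i-1}\big(\tfrac tn\, b\circ F_{(n-i)t/n}+1\big).$$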
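In the region of interest |a| and $|a_0|$ are bounded by some A ∈ R, and |b| and
$|b_0|$ are bounded by some B ∈ R, so
$$\begin{aligned}
|b_1| &= \big|b_0\big(\tfrac tn\, b\circ F_{(n-1)t/n}+1\big)\big| \le B\big(\tfrac tn B+1\big)\\
|b_2| &= \big|b_1\big(\tfrac tn\, b\circ F_{(n-2)t/n}+1\big)\big| \le B\big(\tfrac tn B+1\big)^2\\
|b_i| &\le B\big(\tfrac tn B+1\big)^i
\end{aligned}$$
and
$$\begin{aligned}
|a_1| &= \big|a_0 + b_0\tfrac tn\, a\circ F_{(n-1)t/n}\big| \le A + B\tfrac tn A\\
|a_2| &= \big|a_1 + b_1\tfrac tn\, a\circ F_{(n-2)t/n}\big| \le \big(A + B\tfrac tn A\big) + B\big(\tfrac tn B+1\big)\tfrac tn A\\
|a_3| &= \big|a_2 + b_2\tfrac tn\, a\circ F_{(n-3)t/n}\big| \le A + B\tfrac tn A + B\big(\tfrac tn B+1\big)\tfrac tn A + B\big(\tfrac tn B+1\big)^2\tfrac tn A\\
|a_i| &\le A + \tfrac tn AB\sum_{k=0}^{i-1}\big(\tfrac tn B+1\big)^k
= A + \tfrac tn AB\,\frac{\big(\tfrac tn B+1\big)^i - 1}{\tfrac tn B}
= A\big(\tfrac tn B+1\big)^i.
\end{aligned}$$
Therefore
$$|b_n| \le B\big(\tfrac tn B+1\big)^n \le Be^{tB}
\quad\text{and}\quad
|a_n| \le A\big(\tfrac tn B+1\big)^n \le Ae^{tB}.$$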
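The closed-form bounds on the coefficients can be sanity-checked numerically. The following is a minimal sketch, not part of the proof: the constants A, B, t, n and the helper name `worst_case_coefficients` are illustrative assumptions. It iterates the worst-case recursion $b_i = b_{i-1}(\tfrac tn B+1)$, $a_i = a_{i-1} + b_{i-1}\tfrac tn A$ from $a_0 = A$, $b_0 = B$ and compares the result with $A(\tfrac tn B+1)^n$, $B(\tfrac tn B+1)^n$ and the exponential bounds.

```python
import math

def worst_case_coefficients(A, B, t, n):
    """Iterate the worst-case recursion bounding |a_i| and |b_i| for i = 1..n."""
    a, b = A, B
    step = t / n
    for _ in range(n):
        # Tuple assignment evaluates the right-hand side first,
        # so both updates use the previous value of b.
        a, b = a + b * step * A, b * (step * B + 1)
    return a, b

# Hypothetical constants chosen only for illustration.
A, B, t, n = 2.0, 3.0, 0.5, 1000
a_n, b_n = worst_case_coefficients(A, B, t, n)
growth = (t / n * B + 1) ** n

print(a_n, A * growth, A * math.exp(t * B))
print(b_n, B * growth, B * math.exp(t * B))
```

The first two numbers printed on each line should agree up to rounding, and both should stay below the third.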
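Penultimately, we need to estimate the $o_i\big(\tfrac1{n^2}\big)$. Remember from line (3.9)
$$o_1\big(\tfrac1{n^2}\big) := r\Big(b_0(y)\big(\tfrac tn\big)^2\Big)\,e^{\Lambda_X(n-1)t/n+\Lambda_Y b_0(y)t/n}$$
where r(s) = o(s), so
$$\begin{aligned}
o_2\big(\tfrac1{n^2}\big) &= r\Big(b_1(y)\big(\tfrac tn\big)^2\Big)\,e^{\Lambda_X(n-2)t/n+\Lambda_Y b_1(y)t/n}\\
&\le B\big(\tfrac tn B+1\big)\,o\Big(\big(\tfrac tn\big)^2\Big)\,e^{\Lambda_X(n-2)t/n+\Lambda_Y B(\frac tn B+1)t/n}\\
o_i\big(\tfrac1{n^2}\big) &= r\Big(b_{i-1}(y)\big(\tfrac tn\big)^2\Big)\,e^{\Lambda_X(n-i)t/n+\Lambda_Y b_{i-1}(y)t/n}.
\end{aligned}$$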
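Consequently
$$\begin{aligned}
\sum_{i=1}^{n} o_i\big(\tfrac1{n^2}\big)
&\le \sum_{i=1}^{n} r\Big(b_{i-1}(y)\big(\tfrac tn\big)^2\Big)\,e^{\Lambda_X(n-i)t/n+\Lambda_Y B(\frac tn B+1)^{i-1}t/n}\\
&\le o\Big(\big(\tfrac tn\big)^2\Big)Be^{tB}\sum_{i=1}^{n} e^{\Lambda_X(n-i)t/n+\Lambda_Y B(\frac tn B+1)^{i-1}t/n}
\end{aligned}$$
since $r\big(b_{i-1}(y)(\tfrac tn)^2\big) = o\big((\tfrac tn)^2\big)Be^{tB}$ for all i. Therefore
$$\sum_{i=1}^{n} o_i\big(\tfrac1{n^2}\big) \le o\Big(\big(\tfrac tn\big)^2\Big)Be^{tB}\,n\,e^{\Lambda_X t+\Lambda_Y Be^{tB}t/n} = o\big(\tfrac1n\big)$$
as n → ∞. Putting this into (3.12) gives
$$d\Big(\big(F_t^*(a'F+b'G)\big)_{t/n}(x),\, S\Big) \le d\Big(\big(a_nF+b_nG\big)_{t/n}(x),\, S\Big) + o\big(\tfrac1n\big) = o\big(\tfrac1n\big)$$
because of the uniform bound on $|a_n|$ and $|b_n|$. To see this notice
$$d\Big(\big(a_*F+b_*G\big)_{t/n}(x),\, S\Big) = o\big(\tfrac1n\big)$$
uniformly for bounded $a_*$ and $b_*$, since $a_*F + b_*G \sim b_*G + a_*F$ and as before
$(b_*G+a_*F)_t(x) \in S$ using the uniform Λ and Ω derived in the proofs of
Propositions 39 and 41 (cf. Remark 15).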
Finally we need to check
$$d\Big(\big(F_t^*(a'F+b'G)\big)_r(x),\, S\Big) = o(r)$$
when r is not necessarily t/n. We may assume 0 < t < 1 and 0 < r < t, so
t = nr + ε for some 0 ≤ ε < r and integer n with $\tfrac tr - 1 < n \le \tfrac tr$. Therefore the
above calculations give
$$\begin{aligned}
d\Big(\big(F_t^*(a'F+b'G)\big)_r(x),\, S\Big)
&= d\Big(\big(F_\varepsilon^*F_r^{*(n)}(a'F+b'G)\big)_r(x),\, S\Big)\\
&\le d\Big(\big(F_\varepsilon^*(a_nF+b_nG)\big)_r(x),\, S\Big) + o(r) = o(r).
\end{aligned}$$
The n-dimensional corollary of this 2-dimensional version of Frobenius' Theorem
is given in Section 3.4.
3.3 Commutativity of ﬂows
Theorem 63 Assume X and Y satisfy E1 and E2 on a locally complete metric
space M. Let F and G be the local flows of X and Y. Then [X, Y] ∼ 0 if and
only if F and G commute, i.e.,
$$F_tG_s(x) = G_sF_t(x), \quad\text{i.e.,}\quad F_t^*(G) = G.$$
Proof. The assumption [X, Y ] ∼ aX +bY with a = b = 0 allows us to copy
the approach in the proof of Theorem 62. Let δ > 0 be chosen small enough so
1. the constants Λ, Ω, and ρ for X and Y hold uniformly, and
2. [X, Y ] ∼ 0 uniformly
all on S := B(x, 2δ(ρ + 1)) and that S is also complete. We check t > 0. Since
$F_t^*(G)$ and G are both local flows, we only need to show they are tangent to
each other and then they must be equal by uniqueness of solutions.
As motivation imagine being in the context of differentiable manifolds. There,
for vector fields f and g with local flows F and G, we would have
$$\lim_{h\to 0}\frac{F_h^*(g) - g}{h} = \mathcal{L}_f\, g = [f,g] = 0$$
and thus we expect
$$F_h^*(g) = g + o(h).$$
We might use this idea as before with the linearity of pullback (Theorem 52)
to get
$$F_t^*(g) = \lim_{n\to\infty} F^{*(n)}_{t/n}(g) = \lim_{n\to\infty}\big(g + n\,o(1/n)\big) = g$$
as desired.
Now in our context of metric spaces with t > 0, line (2.8) again gives
$$\big(F_{t/n}^*(G)\big)_{t/n}(x) = \big(\tfrac tn[X,Y] + G\big)_{t/n}(x).$$
For t < 0 one would use (2.9). Also we again have
$$[X,Y] \sim 0 \quad\text{implies}\quad d\Big(\big(\tfrac tn[X,Y]\big)_{t/n}(x),\ x\Big) = o\big(\tfrac1{n^2}\big).$$
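Using these tricks (and Theorem 16 in the fourth line following) gives
$$\begin{aligned}
& d\Big(\big(F_t^*(G)\big)_{t/n}(x),\ G_{t/n}(x)\Big)\\
&= d\Big(\big(F^{*(n-1)}_{t/n}F^{*}_{t/n}(G)\big)_{t/n}(x),\ G_{t/n}(x)\Big)\\
&= d\Big(\big(F^{*(n-1)}_{t/n}\big(\tfrac tn[X,Y]+G\big)\big)_{t/n}(x),\ G_{t/n}(x)\Big)\\
&\le d\Big(\big(F^{*(n-1)}_{t/n}\big(G_{t/n}\big(\tfrac tn[X,Y]\big)_{t/n}\big)\big)(x),\ \big(F^{*(n-1)}_{t/n}G\big)_{t/n}(x)\Big)
 + d\Big(\big(F^{*(n-1)}_{t/n}G\big)_{t/n}(x),\ G_{t/n}(x)\Big)\\
&\le d\Big(G_{t/n}\big(\tfrac tn[X,Y]\big)_{t/n}(y),\ G_{t/n}(y)\Big)\,e^{\Lambda_X t(n-1)/n}
 + d\Big(\big(F^{*(n-1)}_{t/n}G\big)_{t/n}(x),\ G_{t/n}(x)\Big)
\end{aligned}$$
where $y := F_{(n-1)t/n}(x)$
$$\le d\Big(\big(\tfrac tn[X,Y]\big)_{t/n}(y),\ y\Big)\,e^{\Lambda_Y t/n}\,e^{\Lambda_X t(n-1)/n}
 + d\Big(\big(F^{*(n-1)}_{t/n}G\big)_{t/n}(x),\ G_{t/n}(x)\Big)$$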
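and so
$$d\Big(\big(F_t^*(G)\big)_{t/n}(x),\ G_{t/n}(x)\Big) \le d\Big(\big(F^{*(n-1)}_{t/n}G\big)_{t/n}(x),\ G_{t/n}(x)\Big) + e^{\Lambda_Y t/n+\Lambda_X t(n-1)/n}\,o_1\big(\tfrac1{n^2}\big)$$
where $o_1\big(\tfrac1{n^2}\big) := d\Big(\big(\tfrac tn[X,Y]\big)_{t/n}(y),\ y\Big)$.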
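Iterating this result gives
$$\begin{aligned}
d\Big(\big(F^{*(n)}_{t/n}(G)\big)_{t/n}(x),\ G_{t/n}(x)\Big)
&\le d\Big(\big(F^{*(n-1)}_{t/n}G\big)_{t/n}(x),\ G_{t/n}(x)\Big) + e^{\Lambda_Y t/n+\Lambda_X t(n-1)/n}\,o_1\big(\tfrac1{n^2}\big)\\
&\le d\Big(\big(F^{*(n-2)}_{t/n}G\big)_{t/n}(x),\ G_{t/n}(x)\Big) + e^{\Lambda_Y t/n+\Lambda_X t(n-2)/n}\,o_2\big(\tfrac1{n^2}\big) + e^{\Lambda_Y t/n+\Lambda_X t(n-1)/n}\,o_1\big(\tfrac1{n^2}\big)\\
&\le \cdots \le d\Big(\big(F^{*(0)}_{t/n}G\big)_{t/n}(x),\ G_{t/n}(x)\Big) + e^{\Lambda_Y t/n}\sum_{i=1}^{n} o_i\big(\tfrac1{n^2}\big)\,e^{\Lambda_X t(n-i)/n}\\
&= e^{\Lambda_Y t/n}\sum_{i=1}^{n} o_i\big(\tfrac1{n^2}\big)\,e^{\Lambda_X t(n-i)/n}
\end{aligned}$$
where $o_i\big(\tfrac1{n^2}\big) := d\Big(\big(\tfrac tn[X,Y]\big)_{t/n}(y_i),\ y_i\Big)$ and $y_i := F_{(n-i)t/n}(x)$. Since
$$d\Big(\big(\tfrac tn[X,Y]\big)_{t/n}(y),\ y\Big) = o\big(\tfrac1{n^2}\big)$$
uniformly for y ∈ B(x, 2δ(ρ + 1)) we have
$$\begin{aligned}
d\Big(\big(F^{*(n)}_{t/n}(G)\big)_{t/n}(x),\ G_{t/n}(x)\Big)
&\le e^{\Lambda_Y t/n}\sum_{i=1}^{n} o_i\big(\tfrac1{n^2}\big)\,e^{\Lambda_X t(n-i)/n}
= o\big(\tfrac1{n^2}\big)\,e^{\Lambda_Y t/n}\sum_{i=1}^{n} e^{\Lambda_X t(n-i)/n}\\
&= o\big(\tfrac1{n^2}\big)\,e^{\Lambda_Y t/n}\,e^{\Lambda_X t}\sum_{i=1}^{n}\big(e^{-\Lambda_X t/n}\big)^i
\le o\big(\tfrac1{n^2}\big)\,e^{\Lambda_Y t/n+\Lambda_X t}\,\frac{1-\big(e^{-\Lambda_X t/n}\big)^{n+1}}{1-e^{-\Lambda_X t/n}}.
\end{aligned}$$
The last fraction is O(n) as n → ∞, since $1 - e^{-\Lambda_X t/n}$ is comparable to $\Lambda_X t/n$.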
So
$$d\Big(\big(F_t^*(G)\big)_{t/n}(x),\ G_{t/n}(x)\Big) = o\big(\tfrac1n\big)$$
and $F_t^*(G) \sim G$ by the same argument as in the last paragraph of the proof of
Theorem 62.
The converse is trivial.
Using Example 11, this theorem applies to the non-locally compact setting
with nonsmooth vector fields. [55], another paper which inspires this
monograph, obtains similar results with a very different approach.
Example 64 If [X, Y ] ∼ 0 and F, G, and H are the ﬂows of X, Y, and X+Y ,
respectively, then H = F ◦ G.
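Example 64 can be illustrated numerically on the plane, identified with the complex numbers. The sketch below is only an illustration under stated assumptions: X is taken to be the dilation arc field about 0, Y the Euler step of rotation, and the sum of arc fields is assumed to act by following X and then Y for the same time (the form used for $b'G + a'F$ in the proof of Theorem 62). The function names and the sample point are hypothetical. The flows F (dilation) and G (rotation) commute, and the Euler curves of X + Y approach F ∘ G.

```python
import cmath

def X(x, t):
    """Dilation arc field about 0: one Euler step of x' = x."""
    return (1 + t) * x

def Y(x, t):
    """Rotation arc field: one Euler step of x' = i x."""
    return x + t * 1j * x

def arc_sum(x, t):
    """Assumed form of (X + Y)_t(x): follow X for time t, then Y for time t."""
    return Y(X(x, t), t)

def euler_flow(arc, x, t, n):
    """n-fold composition of arc_{t/n}, approximating the flow at time t."""
    h = t / n
    for _ in range(n):
        x = arc(x, h)
    return x

def F(x, t):
    """Flow of the dilation: F_t(x) = e^t x."""
    return cmath.exp(t) * x

def G(x, t):
    """Flow of the rotation: G_t(x) = e^{it} x."""
    return cmath.exp(1j * t) * x

x0, t = 1 + 2j, 0.5
H_approx = euler_flow(arc_sum, x0, t, 10_000)
err_sum = abs(H_approx - F(G(x0, t), t))           # H versus F composed with G
err_comm = abs(F(G(x0, t), t) - G(F(x0, t), t))    # commutativity of F and G
print(err_sum, err_comm)
```

Both printed errors should be tiny; increasing n shrinks the first one further.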
3.4 The Global Frobenius Theorem
The goal of this section is to recast Theorem 62 in the language of distributions
and foliations, and so we begin with several deﬁnitions. As always M is a locally
complete metric space.
Deﬁnition 65 A distribution ∆ on M is a set of arc ﬁelds.
The following are archetypical examples of distributions. Using the addition
and multiplication operations defined for arc fields on M (§2.1) we may define
the linear span of arc fields:
$$\Delta\big({}^1X, \ldots, {}^nX\big) := \Big\{\sum_{i=1}^{n} a_i\,{}^iX \;\Big|\; a_i \in \mathrm{Lip}(M,\mathbb{R})\Big\}.$$
(Remember Lip(M, R) is the set of Lipschitz functions on M.)
The linear span of two (or more) distributions $\Delta_1$ and $\Delta_2$ on M is also
obviously defined to give another distribution
$$\Delta_1 + \Delta_2 := \{X + Y \mid X \in \Delta_1,\ Y \in \Delta_2\}.$$
Writing
$$\Delta(X) := \{aX \mid a \in \mathrm{Lip}(M,\mathbb{R})\}$$
we automatically have ∆(X, Y) = ∆(X) + ∆(Y). Associativity also holds for
this formal sum:
$$(\Delta_1 + \Delta_2) + \Delta_3 = \Delta_1 + (\Delta_2 + \Delta_3)$$
so we may write finite sums without confusion. Then without difficulty
we have
$$\Delta\big({}^1X, \ldots, {}^nX\big) = \sum_{i=1}^{n}\Delta\big({}^iX\big).$$
Commutativity is not generally valid, but it does hold up to tangency, defined
below.
For x ∈ M denote $\Delta_x := \{X(x,\bullet) \mid X \in \Delta\}$. I.e., $\Delta_x$ is a set of curves based
at x. For y ∈ M define (with another overload of d notation)
$$d(y, \Delta_x) := \inf\{d(y, X(x,t)) \mid X \in \Delta \text{ and } t \in [-1,1]\}.$$
Definition 66 An arc field X is (locally uniformly) tangent to ∆, denoted
X ∼ ∆, if for each x ∈ M there is an arc field $X_\Delta \in \Delta$ with $X \sim X_\Delta$ uniformly
in a neighborhood of x.
Two distributions $\Delta$ and $\bar\Delta$ are (locally uniformly) tangent, denoted $\Delta \sim \bar\Delta$,
if $X \sim \bar\Delta$ for each $X \in \Delta$ and $\bar X \sim \Delta$ for each $\bar X \in \bar\Delta$. Again, ∼ is an
equivalence relation.
By definition, then, $X \sim \Delta({}^1X, {}^2X, \ldots, {}^nX)$ if and only if there exist Lipschitz
functions $a_k : M \to \mathbb{R}$ such that $X \sim \sum_{k=1}^{n} a_k\,{}^kX$. Restating, tangency between
two distributions means a correspondence between tangent arc fields within the
distributions.
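Example 67 Using Theorem 43 we may check that when the arc fields $\{{}^iX\}_{i\in I}$
satisfy E1 and E2 and mutually close, then we have
$$\Delta\big({}^1X, {}^2X\big) \sim \Delta\big({}^2X, {}^1X\big) \quad\text{and}\quad \Delta(X) + \Delta(X) \sim \Delta(X)$$
or more generally, assuming |I| < ∞,
$$\Delta\big(\{{}^iX\}_{i\in I}\big) \sim \Delta\big(\{{}^iX\}_{i\in J}\big) + \Delta\big(\{{}^iX\}_{i\in K}\big) \quad\text{if } J \cup K = I.$$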
Definition 68 X is (locally uniformly) transverse to ∆ if for all $x_0 \in M$
there exists a δ > 0 such that for all $x \in B(x_0, \delta)$ we have
$$d\big(X_x(s), Y_x(t)\big) \ge \delta(|s| + |t|)$$
for all Y ∈ ∆ and all |s|, |t| < δ. In this case we have
$$d\big(X_x(t), \Delta\big) \ge \delta|t|.$$
The arc fields ${}^1X, {}^2X, \ldots, {}^nX$ are transverse to each other if for each i ∈ {1, ..., n}
we have ${}^iX$ transverse to
$$\Delta\big({}^1X, {}^2X, \ldots, {}^{i-1}X, {}^{i+1}X, \ldots, {}^nX\big).$$
Inspecting Example 60 shows this definition generalizes transversality in $\mathbb{R}^n$.
A set of transverse arc fields is meant to generalize linearly independent vector
fields.
Definition 69 Let $T := \{{}^1X, {}^2X, \ldots, {}^nX\}$ be a set of n transverse arc fields which
satisfy E1 and E2 on a neighborhood U ⊂ M and whose flows mutually close.
T is a local frame for a distribution ∆ if $\Delta \sim \Delta({}^1X, {}^2X, \ldots, {}^nX)$ on U. T is a
global frame for ∆ if local uniform tangency holds throughout M.
A distribution is n-dimensional if each point in M has a neighborhood with
a local frame with cardinality n.
Whether global frames of a particular dimension even exist on a space M
may be difficult to answer, even when M is a manifold, where the question falls
under the purview of topology and global analysis.
Definition 70 An n-dimensional distribution ∆ is involutive if each local
frame $\{{}^1X, {}^2X, \ldots, {}^nX\}$ has
$$\big[{}^iX, {}^jX\big] \sim \Delta$$
for all i, j ∈ {1, ..., n}.
A surface (or n-surface) is a topological manifold S (of dimension n). A
surface S ⊂ M is locally uniformly tangent to an arc field X, denoted
X ∼ S, if $d(X_t(x), S) = o(t)$ locally uniformly for x ∈ S.
An n-dimensional surface S is an integral surface for an n-dimensional
distribution if for any local frame $\{{}^1X, {}^2X, \ldots, {}^nX\}$ we have $\sum_{k=1}^{n} a_k\,{}^kX \sim S$ for any
choice of Lipschitz functions $a_k : M \to \mathbb{R}$.
An n-dimensional distribution ∆ is said to be integrable if there exists an
integral surface for ∆ through every point in M.
Theorem 62 then has the following corollary:
Proposition 71 An n-dimensional involutive distribution is integrable.
Proof. n = 1 is well-posedness of arc fields, Theorem 12. n = 2 is Theorem
62. Now proceed by induction. We do enough of the case n = 3 to suggest
the path, and much of this is copied from the proof of Theorem 62; if you've
understood that proof, this induction is easier to construct for yourself than to
read.
Choose $x_0 \in M$. Let X, Y, and Z be the transverse arc fields guaranteed in
the definition of a 3-dimensional distribution. If we find an integral surface S
for ∆(X, Y, Z) through $x_0$ then obviously S is an integral surface for ∆. Let
F, G, and H be the local flows of X, Y, and Z and define
$$S := \{F_tG_sH_r(x_0) \mid |r|, |s|, |t| < \delta\}$$
with δ > 0 chosen small enough as in the proof of Theorem 62 so S is a
3-dimensional manifold. Again we may assume δ is also chosen small enough so
that throughout S the functions $|a_k|$ are bounded by A, the constants Λ, Ω, and
ρ for X, Y and Z hold uniformly, and the closure of B(x, 3δ(ρ + 1)) is complete.
Notice
$$\tilde S := \{G_sH_r(x_0) \mid |r|, |s| < \delta\}$$
is an integral surface through $x_0$ for ∆(Y, Z) by the proof of Theorem 62. Now
S ∼ X by construction, but it is not immediately clear that $S \sim a'X + b'Y + c'Z$
for arbitrarily chosen $a', b', c' \in \mathbb{R}$. Again we really only need to show
$S \sim a'F + b'G + c'H$ for an arbitrary point $z := F_tG_sH_r(x_0) \in S$, and again it is
sufficient to prove
$$(F_t)^*(a'F + b'G + c'H) \sim S \quad\text{at } y = G_sH_r(x_0)$$
by the construction of S. Continue as above adapting the same tricks from the
proof of Theorem 62 to the extra dimension.
Proposition 72 If $S_1$ and $S_2$ are integral surfaces through x ∈ M, then
(i) $S_1 \cap S_2$ is an integral surface
(ii) $S_1 \cup S_2$ is an integral surface.
Further, there is a unique maximal integral surface S through x, meaning
$S \cap S_1 = S_1$ for any integral surface $S_1$ through x.
Proof. The case n = 1 is true by the uniqueness of integral curves.
For higher dimensions n, Theorem 33 from §1.3 guarantees $S_1$ and $S_2$ contain
local integral curves for $\sum_{k=1}^{n} a_k\,{}^kX$ for all choices of $a_k \in \mathbb{R}$ with initial condition
x. Since the ${}^kX$ are transverse, there is a small neighborhood of x on which all
the choices of the parameters $a_k$ give local non-intersecting curves in M which
fill up n dimensions, giving an integral surface in $S_1 \cap S_2$ (precisely the argument
in the second paragraph of the proof of Theorem 62).
For (ii), since $S_1 \cap S_2$ is an integral surface inside $S_1 \cup S_2$, the only question is
whether the union is still an n-dimensional manifold. Pick $x \in S_1 \cup S_2$ and for
i = 1, 2 let $U_i \subset S_i$ be the n-dimensional neighborhood of x guaranteed by the
fact that $S_i$ is an integral surface. As with (i), each of these neighborhoods is a
manifold filled by the flows of $\sum_{k=1}^{n} a_k\,{}^kX$. By Nagumo's invariance result,
Theorem 33, they coincide near x.
The maximal integral surface is the union of all integral surfaces through x.
Definition 74 A foliation is a partition of M into a set of subsets
$\Phi := \{L_i\}_{i\in I}$ for some indexing set I, where the subsets $L_i \subset M$ (called leaves) are
disjoint, connected topological manifolds each having the same dimension.
A foliation Φ is tangent to a distribution ∆ if the leaves are integral
surfaces; in this case we say ∆ foliates M.
Collecting all these results we have the following version of the Global
Frobenius Theorem.
Theorem 75 Let ∆ be an n-dimensional distribution on a locally complete
metric space M.
(i) If ∆ is involutive, then ∆ is integrable.
(ii) If ∆ is integrable, then ∆ foliates M.
(iii) If ∆ foliates M into $\Phi := \{L_x\}_{x\in M}$ then for any X, Y ∈ ∆, we have
$[X,Y]_x(t) \in L_x$ for t ∈ [−1, 1].
(iv) ∆ is involutive if and only if ∆ has a local frame at each x ∈ M with
commutative flows.
Proof. (i) is Proposition 71.
(ii) is Proposition 72.
(iii) follows from Theorem 33 and the definition of the bracket.
(iv) (⇐) This is automatic since the bracket is trivial if the flows commute.
(⇒) Pick any local frame $\{{}^iX\}$ at x ∈ M and construct a commutative frame
as follows. Let ${}^1\sigma_x : (\alpha, \omega) \to M$ be the solution of ${}^1X$. Define
${}^1\bar X := {}^2F_{t*} \circ {}^1\sigma_x$ and ${}^2\bar X|_{\text{surface}} := {}^2X$. This forces
(a) ${}^1\bar F$ and ${}^2\bar F$ commute
(b) ${}^1\bar X$ and ${}^2\bar X$ span a surface locally
(c) $\vee\{{}^1\bar F, {}^2\bar F\} \subset L_x$.
Continue with n = 3, etc., pushing forward span$\{{}^1\bar F, {}^2\bar F\}$ with ${}^3F$ to extend
${}^1\bar X$ and ${}^2\bar X$ on a local 3-D set and ${}^3\bar X|_{\text{3-D set}} := {}^3X$. In the end we have an
n-dimensional surface (property (b)) in $L_x$ (property (c), which is the key point
of the proof and requires the assumption of the statement of the theorem) and
so fills $L_x$ locally and commutes (property (a)).
Part (iii) of Theorem 75 is as close to a converse of (i) as we have been able
to achieve. The bracket is tangent to the distribution in the sense given in the
theorem, but not necessarily locally uniformly tangent to a single arc ﬁeld in
the distribution—which is the deﬁnition of ∼ required for involutivity.
The local frame with commutative ﬂows gives local coordinates on the leaves
of the foliation. If the foliation is trivial having only one leaf, then the ﬂows
give local coordinates near each point in M, called ﬂow coordinates in which
case M is a topological manifold.
Function space examples relevant to this chapter are the content of Chapters
4 and 5. Further, the idea of a connection from diﬀerential geometry is now
straightforward to generalize to metric spaces. Use the interpretation that a
connection is a choice of horizontal subspace of TTM, i.e., a distribution on
TTM. As on a manifold, each choice of a connection gives a precise deﬁnition
of curvature. On a metric space, however, the situation is complicated by the
choice of arcs which must be made to deﬁne the analogs of TM and TTM.
An open question is how to guarantee a connection related to the metric which
gives length minimizing geodesics—the analog of the Fundamental Theorem of
Riemannian Geometry. Progress in this direction is one of the successes of
Finsler geometry.
3.5 Control theory
In this section we explore some ideas from control theory and how the geometric
results from this chapter impact the subject. The culmination is a generalization
of Chow’s Theorem to metric spaces.
Often we are able to aﬀect a dynamical system at will, i.e., we can control,
directly or indirectly, some parameters independently of the evolution of a ﬂow.
We can intervene in the evolution of a system, instead of merely observing it.
The study of such a scenario is the purview of control theory. Some of the
original models are of mechanical systems, but the applications are extremely
diverse: from signal analysis to sociological models, any diﬀerential equation
can be coopted for study in control theory if we complicate the situation by
adding parameters. One of the most exciting contemporary applications of
control theory that requires geometric theory to apprehend is programming robot
motion, and we are immersed daily in more prosaic control systems: driving a
car, changing a thermostat, or adjusting the dosage levels of a patient's
medication.
The model for driving a car can be simplified to the action of turning the
steering wheel ($p_1 > 0$ means rotate the steering wheel right and $p_1 < 0$ means
left) and moving the car forward ($p_2 > 0$) or backward ($p_2 < 0$). Thinking
of a toy remote-control car with its two peg controls may help intuition. The
parameters $p_i$ in any system are the controls, which belong to subsets of R.
These subsets are often intervals, but with digital systems we need to be able
to use discrete subsets like {0, 1} and {−1, 0, 1}. The systems are modeled
on a space M of possible configurations of the system. In a basic example
“configuration” might mean the location of the object of interest, but this is
usually only one of the variables of interest. In the model of driving the car we
might take x ∈ M to represent location, in which case $M = \mathbb{R}^2$ (assuming we're
driving on a flat plane). But this is hardly adequate, as we will also be interested
in the direction of the car, so let the space of configurations be $M := \mathbb{R}^2 \times S^1$
where the $\mathbb{R}^2$ factor represents the location of the center of mass of the car,
and $S^1$ represents the orientation. But we may also represent the direction
the wheels are turned, an important part of the configuration, so the proper
configuration space is $M := \mathbb{R}^2 \times S^1 \times [-1, 1]$. But to make flows tractable on
M we will force the easier scenario of $M := \mathbb{R}^2 \times S^1 \times S^1 \ni (x_1, x_2, \theta_1, \theta_2)$. Even
in this simple example we are led to study a manifold M instead of a vector
space.
Since we are interested in how our system evolves in time, a differential
equation is typically used to model the behavior. Often a mechanical system's
evolution is determined by forces, and Newton's equation is used:
$$\ddot{x} = f(x, p) \tag{3.13}$$
where x ∈ M and $p = (p_i)_{i\in I}$ are the control parameters. Control theory's
fundamental questions are determining: 1) stability and 2) controllability of a
system.
For the ﬁrst question, there are many diﬀerent meanings of stability, but they
all center on whether the system evolves in a qualitatively similar (“topologically
conjugate”) manner if small changes are made to the right hand side, to x, p
or f. Changing x determines sensitivity to initial conditions, adjusting
p determines the stability of the control, and perturbing the function f
determines the structural stability of the system.
Our focus in this chapter is on the second question of the controllability
of a system, which asks whether we can steer any initial condition x ∈ M
to any other configuration¹ in M with a clever adjustment of parameters. In
infinite dimensions there is a difference between whether we can drive an initial
condition to a terminal condition, or whether we can only drive it arbitrarily
close to the terminal condition. Any imaginable practical application will not
distinguish these cases, though.
As a prerequisite for this full controllability, we must obviously have local
controllability, in which any initial condition has a neighborhood for which
the restriction of the system is controllable. Continuation then gives us full
controllability if M is path-connected. In this restricted local situation the
2nd-order ODE may be equivalently rewritten (as illustrated in Appendix B) as a
1st-order ODE
$$\dot{x} = g(x, p) = g_p(x)$$
and the flow $G^p$ generated by the vector field $g_p$ depends on the parameter p,
typically a member of $\mathbb{R}^n$.
A simpliﬁed presentation of the following terminology is suﬃcient for our
purposes in metric spaces.
Definition 76 Let $G = \{G^p \mid p \in P\}$ be a family of flows with indexing set P.
The reachable set of G from x ∈ M is denoted
$$R_G(x) := \big\{G^{p_n}_{t_n}G^{p_{n-1}}_{t_{n-1}}\cdots G^{p_1}_{t_1}(x) \;\big|\; t_i \in \mathbb{R},\ p_i \in P,\ n \in \mathbb{N}\big\} \subset M \tag{3.14}$$
where x ∈ M is the constant function. $R_G(x)$ is the set of all finite
compositions of $G^p_t$. The approximately reachable set from x is the closure $\overline{R_G(x)}$. If M is
approximately reachable, then the system is controllable.
Reachable sets partition M. Comparing Definition 76 of the reachable set
$R_G(x)$ with Definition 36 of linear combinations of arc fields we see $R_G(x)$ as
the configurations attainable from the initial configuration x using the flows $G^p$
successively, or with little practical difference using all linear combinations of
the $G^p$.
¹ We will concentrate on this basic controllability and not on optimal controllability, which
seeks to find solutions which optimize some quantity, such as the time needed to reach the
terminal condition using speed less than 1.
To clarify the terminology, consider again whether the “driving a car”
system is controllable. Now we are asking whether controlling our 2-dimensional
parameter space allows us to steer our car into any point of the 4-dimensional
configuration space M. Manipulations on the 2-dimensional parameter space
act on the configuration space through two simple flows: $F_t(x)$, steer, rotates
the steering wheel and $G_t(x)$, drive, moves the car forward and back with
the steering fixed. Any driver's intuition will promise us that a car can be
moved to any configuration using only F and G. Mathematically, though, it
seems unlikely a 2-dimensional parameter space is enough to control the entire
4-dimensional configuration space. $R_{(F,G)}(x)$ should be a 2-dimensional surface
inside of M, perhaps something like the surface
$$\{G_sF_t(x) \mid s, t \in \mathbb{R}\} \subset M.$$
But this naive mathematical intuition is incorrect, and stems from the fact
that the flows F and G do not commute, so the terms of the reachable set
(3.14) do not simplify to $G_sF_t(x)$. Following the course $G_{-t}F_{-t}G_tF_t(x)$ should
return us to our initial configuration x if the flows were to commute, but as
you can mentally check, the automobile will actually end up rotated with an
insignificant translation. This motivates us to introduce the bracket of two flows
and to consider the meaning of Frobenius' Theorem for control systems.
In this example [F, G] is called wriggle. Thinking of wriggle as rotation
(ignoring the minor translation) it is easy to verify that [[F, G], G] (right rotation,
forward drive, left rotation, backward drive) is effectively translation transverse
to drive. So [[F, G], G] has been traditionally dubbed slide, an algorithm for
parking your car in any space infinitesimally longer than your car [1]. Since
wriggle is not tangent to a simple sum of drive and steer, wriggle generates a 3rd
dimension of controllability/reachability. Similarly slide generates a 4th
dimension. Since wriggle is the combination of drive and steer 4 times (the bracket) as
is slide (iterated bracket), then drive and steer, using Euler approximation, are
enough to reach all points in the 4-dimensional configuration space.
(Drivers will object that slide is not the algorithm they use; parallel parking
is better described by the arc field H as follows:
$$H_t(x) := F_{\pi/2}\,G_{\sqrt t}\,F_{-\pi}\,G_{\sqrt t}\,F_{\pi}\,G_{-\sqrt t}\,F_{-\pi}\,G_{-\sqrt t}\,F_{\pi/2}(x)$$
for t > 0. H is its own flow, translation perpendicular to drive, when $\theta_2 = 0$.
Be careful here. If you replace the π/2 with $\sqrt t$ then
$$H = [F, -G] + [-G, -F] + [-F, -2G] + [-2G, F] + [F, -G] + [-G, -F]$$
and then using rules of Lie algebra we haven't yet proven, we get H ∼ 0, which
doesn't help us park a car. The use of π/2 reflects how drivers always turn their
wheels completely when shimmying into a space. Try writing out the formula
for slide explicitly to see the 4th root arise in the iterated bracket.)
Example 77 Consider an even simpler toy car restricted to moving forward
and backward or rotating, where the formulas are easy to intuit. The Segway,
the two-wheeled electric upright vehicle, is a good representative for this model.
Then the configuration space is simply $M = \mathbb{R}^2 \times S^1$, the $\mathbb{R}^2$ variable for the
center of mass and the $S^1$ variable representing its direction of orientation. Let
F be the flow moving the car forward or back in the direction of orientation,
and let G be rotation. For $x = (x_1, x_2, \theta)$
$$\begin{aligned}
F_t(x_1, x_2, \theta) &= (x_1 + t\cos\theta,\ x_2 + t\sin\theta,\ \theta) = (x_1, x_2, \theta) + t(\cos\theta, \sin\theta, 0)\\
G_t(x_1, x_2, \theta) &= (x_1,\ x_2,\ [\theta + t] \bmod 2\pi)
\end{aligned}$$
then check
$$G_{-t}F_{-t}G_tF_t(x) = \big(x_1 + t\cos\theta - t\cos(\theta + t),\ x_2 + t\sin\theta - t\sin(\theta + t),\ \theta\big) \ne x$$
assuming $t \ne 0$ and t is small enough to ignore the modulo 2π calculation. In
fact for t > 0
$$\begin{aligned}
[F, G]_t(x) &= G_{-\sqrt t}F_{-\sqrt t}G_{\sqrt t}F_{\sqrt t}(x)\\
&= (x_1, x_2, \theta) + t\left(\frac{\cos\theta - \cos(\theta + \sqrt t)}{\sqrt t},\ \frac{\sin\theta - \sin(\theta + \sqrt t)}{\sqrt t},\ 0\right)\\
&\sim (x_1, x_2, \theta) + t(\sin\theta, -\cos\theta, 0) =: H_t(x).
\end{aligned}$$
H is translation perpendicular to the orientation. $H \notin \Delta(F, G)$ and so
$M = L_x = R_{(F,G)}$ and the system is controllable. More explicitly, since H is tangent
to the bracket of F and G, H may be approximated by appropriate successive
compositions of F and G, which means $R_{(F,G)}$ is at least dense in M. So the
3-dimensional system is controllable with 2 parameters because the bracket is
nontrivial, i.e., the system is nonholonomic.
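The tangency of the bracket to H in Example 77 can be checked numerically. The sketch below is only an illustration: the metric `dist` on $\mathbb{R}^2 \times S^1$ (Euclidean distance in position plus angular gap) and the sample configuration are assumptions made here, not taken from the text. It evaluates $d([F,G]_t(x), H_t(x))/t$ for decreasing t; tangency means this ratio tends to 0.

```python
import math

def F(x, t):
    """Drive: move distance t in the direction of orientation."""
    x1, x2, theta = x
    return (x1 + t * math.cos(theta), x2 + t * math.sin(theta), theta)

def G(x, t):
    """Rotate the orientation by t (modulo 2*pi)."""
    x1, x2, theta = x
    return (x1, x2, (theta + t) % (2 * math.pi))

def bracket(x, t):
    """[F, G]_t(x) = G_{-s} F_{-s} G_s F_s (x) with s = sqrt(t), for t > 0."""
    s = math.sqrt(t)
    return G(F(G(F(x, s), s), -s), -s)

def H(x, t):
    """Translation perpendicular to the orientation."""
    x1, x2, theta = x
    return (x1 + t * math.sin(theta), x2 - t * math.cos(theta), theta)

def dist(p, q):
    """Assumed metric: Euclidean distance in position plus angular gap."""
    dtheta = abs(p[2] - q[2]) % (2 * math.pi)
    return math.hypot(p[0] - q[0], p[1] - q[1]) + min(dtheta, 2 * math.pi - dtheta)

x = (1.0, -2.0, 0.7)  # hypothetical configuration
ratios = [dist(bracket(x, t), H(x, t)) / t for t in (1e-2, 1e-4, 1e-6)]
print(ratios)
```

The ratios shrink roughly like $\sqrt t / 2$, consistent with $d([F,G]_t(x), H_t(x)) = o(t)$.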
[27, Chapter 5] gives an example demonstrating the difficulties of applying
the traditional vector field Lie bracket to infinite-dimensional spaces directly
with differential operators on PDEs; in the same chapter, fruitful work on
controllability in the infinite-dimensional context of Navier-Stokes and quantum
mechanics is cited. A careful use of the metric space Lie bracket and foliation
theorems means the approach can work in the greater generality of a metric
space.
Define the distribution bracket-generated by the set of arc fields $\{X_i\}$ to
be the distribution $\Delta[\{X_i\}]$ consisting of the linear combinations of the $X_i$ and
all finitely iterated brackets.
Theorem 78 Let ∆ be an n-dimensional distribution bracket-generated by the
$G^p$. Then $R_G(x) \subset L_x$ and $\overline{R_G(x)} = L_x$.
Proof. $\Delta(G^p)$ is involutive by definition and so has a foliation consisting of
the leaves $L_x$. That $R_G(x) \subset L_x$ is a corollary of Frobenius' Theorem (Theorem
75) and Theorem 33. Continuation on the connected leaves means you can move
between any two points in a leaf using the flows, proving $\overline{R_G(x)} = L_x$.
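For the Segway flows of Example 77 the reachable set can be exhibited directly, without appealing to brackets. The following sketch is an illustration only; the helper names `steer` and `run` and the sample configurations are hypothetical. It builds a finite composition of the flows F (drive) and G (rotate), of the form appearing in (3.14), that carries one configuration to another: rotate to face the target position, drive the distance, then rotate to the target orientation.

```python
import math

def F(x, t):
    """Drive: move distance t in the direction of orientation."""
    x1, x2, theta = x
    return (x1 + t * math.cos(theta), x2 + t * math.sin(theta), theta)

def G(x, t):
    """Rotate the orientation by t (modulo 2*pi)."""
    x1, x2, theta = x
    return (x1, x2, (theta + t) % (2 * math.pi))

def steer(start, goal):
    """Return (flow, time) pairs whose successive application sends start to goal."""
    x1, x2, theta = start
    g1, g2, gtheta = goal
    heading = math.atan2(g2 - x2, g1 - x1)     # direction of the target position
    distance = math.hypot(g1 - x1, g2 - x2)
    return [(G, heading - theta), (F, distance), (G, gtheta - heading)]

def run(plan, x):
    """Apply the flows of the plan in order."""
    for flow, t in plan:
        x = flow(x, t)
    return x

start = (0.0, 0.0, 0.3)    # hypothetical initial configuration
goal = (-1.5, 2.0, 5.0)    # hypothetical terminal configuration
end = run(steer(start, goal), start)
print(end)
```

Three flow segments suffice here because rotation and drive are both available; for the 4-dimensional car model the text relies on brackets instead.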
This is a generalization of Chow's Theorem to metric spaces (also called the
Chow-Rashevsky Theorem and Hermes' Theorem).
In Chapters 4 and 5 we investigate applications of these ideas. An
infinite-dimensional distribution version arises naturally using the same approach of
generalizing linear algebra to functional analysis, but we have not yet dwelled
on any potential pitfalls.
Part II
Examples
Chapter 4
Brackets on function spaces
Let’s explore how the ideas of Chapter 3 express themselves with the simplest
examples on function spaces.
Example 79 (Vector space translations and dilations) Let M be a Banach space with norm $\|\cdot\|$. First let X and Y be vector space translations in the directions of u and $v \in M$:
$$X_t(x) := x + tu, \qquad Y_t(x) := x + tv.$$
X and Y are their own flows (when extended to $|t| > 1$). Obviously $[X,Y] = 0$, and the flows commute.
Next consider the dilations X and Y about the respective centers u and $v \in M$:
$$X_t(x) := (1+t)(x-u)+u, \qquad Y_t(x) := (1+t)(x-v)+v$$
(u and v are usually taken equal to 0 for simplicity). The flows are computable by intuition, or with a little effort using Euler curves:
$$F_t(x) = \lim_{n\to\infty} X^{(n)}_{t/n}(x) = e^{t}x - \left(e^{t}-1\right)u.$$
(So dilation about u = 0 is the familiar $F_t(x) = e^{t}x$.)
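Then, writing G for the flow of Y (so $G_t(x) = e^{t}x - (e^{t}-1)v$), for $t \ge 0$
$$
\begin{aligned}
[X,Y]_{t^2}(x) &= G_{-t}F_{-t}G_tF_t(x) \\
&= e^{-t}\Big[e^{-t}\Big[e^{t}\big[e^{t}x-(e^{t}-1)u\big]-(e^{t}-1)v\Big]-(e^{-t}-1)u\Big]-(e^{-t}-1)v \\
&= x-u+e^{-t}u-e^{-t}v+e^{-2t}v-e^{-2t}u+e^{-t}u-e^{-t}v+v \\
&= x+(v-u)\left(e^{-t}-1\right)^2
\end{aligned}
$$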
so $[X,Y] \sim Z$ where Z is the translation $Z_t(x) := x + t(v-u)$ since, for instance with $t > 0$,
$$
d\left([X,Y]_t(x), Z_t(x)\right) = |v-u|\,\left|\left(e^{-\sqrt{t}}-1\right)^2 - t\right| = |t|\,|v-u|\,\left|\left(\frac{e^{-\sqrt{t}}-1}{\sqrt{t}}\right)^2 - 1\right| = o(t).
$$
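The closed form for the commutator of two dilation flows can be checked numerically. The following is a minimal sketch, assuming M = R so that points are plain floats; the helper names `dilation_flow` and `bracket_t2` and the sample values of u, v, x are illustrative choices, not part of the text.

```python
import math

def dilation_flow(center):
    """Flow of the dilation about `center`: F_t(x) = e^t x - (e^t - 1) center."""
    def flow(t, x):
        return math.exp(t) * x - (math.exp(t) - 1.0) * center
    return flow

def bracket_t2(F, G, t, x):
    """The composition G_{-t} F_{-t} G_t F_t (x), i.e. [X, Y] at time t^2."""
    return G(-t, F(-t, G(t, F(t, x))))

u, v, x = 0.3, -1.2, 2.0          # illustrative centers and starting point in M = R
F, G = dilation_flow(u), dilation_flow(v)

for t in (0.5, 0.1, 0.01):
    composed = bracket_t2(F, G, t, x)
    closed_form = x + (v - u) * (math.exp(-t) - 1.0) ** 2
    print(t, composed, closed_form)
```

The two printed columns should agree up to rounding, and both approach $x + t^2(v-u)$ as t shrinks, which is the tangency to the translation Z.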
Hence the distribution $\Delta(X,Y)$ is not involutive. However, the set of all dilations generates all translations using brackets. Using the same tricks we've just employed, it is easy to check the bracket of a dilation and a vector space translation is tangent to a vector space translation, e.g., if $F_t(x) := x + tu$ and $G_t(x) := e^{t}x$ (dilation about 0) then $[F,G] \sim F$ since for $t > 0$
$$
[F,G]_{t^2}(x) = G_{-t}F_{-t}G_tF_t(x) = e^{-t}\big[e^{t}[x+tu]-tu\big] = x + tu\left(1-e^{-t}\right)
$$
and so
$$
d\left([F,G]_t(x), F_t(x)\right) = |tu|\,\left|\frac{1-e^{-\sqrt{t}}}{\sqrt{t}} - 1\right| = o(t).
$$
Bracket-generated distributions (definition on p. 68) are involutive by definition. Example 79 shows the distribution bracket-generated by dilations is exactly the distribution consisting of all dilations and all translations.
With obvious modifications the results of Example 79 are valid on the metric space $(H(\mathbb{R}^n), d_H)$ where $H(\mathbb{R}^n)$ is the set of nonvoid compact subsets of $\mathbb{R}^n$ and $d_H$ is the Hausdorff metric. Theorem 75 gives foliations of $H(\mathbb{R}^n)$, which is of interest because this space is incapable of accepting any natural linear structure. $H(\mathbb{R}^n)$ is a particularly strange space topologically because, despite being locally compact, $H(\mathbb{R}^n)$ is infinite-dimensional by most any measure we can attempt to apply. We can even find infinitely many transverse flows.
Example 80 (two parameter decomposition of $L^2$) Now let M be real Hilbert space $L^2(\mathbb{R})$. Since M is Banach, the results of Example 79 hold. Let's explore the example from §0.2 in further detail. Let $F_t(f)(x) := f(x+t)$ denote function translation and let $G_t(f) = f + tg$ denote vector space translation by $g \in L^2(\mathbb{R})$. When $g \in C^1(\mathbb{R})$ with derivative $g' \in L^2(\mathbb{R})$, Example 97 below shows F & G close and gives the formula for the flow of the sum.
Let's compute their bracket. For $t > 0$
$$
\begin{aligned}
[F,G]_{t^2}(f)(x) &= G_{-t}F_{-t}G_tF_t(f)(x) = G_{-t}F_{-t}\left[f(x+t)+tg(x)\right] \\
&= f(x)+tg(x-t)-tg(x) = f(x) - t^2\left[\frac{g(x)-g(x-t)}{t}\right].
\end{aligned}
$$
Defining a new arc field $Z_t(f) := f + t(-g')$ we therefore have
$$
d\left([F,G]_t(f), Z_t(f)\right) = |t|\left(\int_{\mathbb{R}}\left|\frac{g(x)-g\left(x-\sqrt{t}\right)}{\sqrt{t}} - g'(x)\right|^2 dx\right)^{1/2} = o(t)
$$
when $g \in C^1(\mathbb{R})$ with $g' \in L^2(\mathbb{R})$. Thus $[F,G] \sim Z$, which we finish verifying by checking the case $t \le 0$. Let $t = -s^2$ for $s > 0$. Then
$$
\begin{aligned}
[F,G]_t(f)(x) &= F_{-s}G_{-s}F_sG_s(f)(x) = F_{-s}G_{-s}\left[f(x+s)+sg(x+s)\right] \\
&= f(x)+sg(x)-sg(x-s) = f(x) - s^2\left[-\frac{g(x)-g(x-s)}{s}\right] \\
&= f(x) + t\left[-\frac{g(x)-g(x-s)}{s}\right].
\end{aligned}
$$
So again $d\left([F,G]_t(f), Z_t(f)\right) = o(t)$. (See Figures 5.1 and 5.2 with $g(x) = e^{-x^2}$.)
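The tangency $[F,G] \sim Z$ can be illustrated pointwise. The sketch below is only a spot check at a single point, not the $L^2$ estimate used in the text; it assumes the Gaussian $g(x) = e^{-x^2}$ from the figures, and the starting function `f` and sample point `x0` are arbitrary illustrative choices.

```python
import math

def g(x):
    return math.exp(-x * x)                  # the Gaussian used in the figures

def dg(x):
    return -2.0 * x * math.exp(-x * x)       # its derivative g'

def F(t, f):
    """Function translation: F_t(f)(x) = f(x + t)."""
    return lambda x: f(x + t)

def G(t, f):
    """Vector space translation: G_t(f) = f + t g."""
    return lambda x: f(x) + t * g(x)

def bracket_t2(t, f):
    """[F, G] at time t^2, i.e. G_{-t} F_{-t} G_t F_t (f)."""
    return G(-t, F(-t, G(t, F(t, f))))

f = math.sin                                  # illustrative starting function
x0 = 0.7                                      # illustrative sample point
for t in (0.1, 0.01, 0.001):
    exact = bracket_t2(t, f)(x0)
    tangent = f(x0) - t * t * dg(x0)          # Z_{t^2}(f)(x0) with Z_t(f) = f - t g'
    print(t, abs(exact - tangent) / (t * t))  # ratio should shrink with t
```

The printed ratio is the pointwise gap divided by the bracket time $t^2$; its decay toward 0 is the pointwise analogue of the $o(t)$ estimate.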
[Figure 5.1, four panels: $G_1(0) = e^{-x^2}$; $F_1G_1(0) = e^{-(x+1)^2}$; $G_{-1}F_1G_1(0) = e^{-(x+1)^2} - e^{-x^2}$; $F_{-1}G_{-1}F_1G_1(0) \neq 0$.]
Figure 5.1: F and G do not commute.
Figure 5.2: $\left\| F_{-\sqrt{t}}\,G_{-\sqrt{t}}\,F_{\sqrt{t}}\,G_{\sqrt{t}}(0) - t\,\dfrac{d}{dx}e^{-x^2} \right\|_2 = o(t)$
This simple calculation has remarkable consequences. Using Theorem 78, if the (n+1)st derivative $g^{[n+1]}$ is not contained in
$$\operatorname{span}\left\{g^{[i]} \mid 0 \le i \le n\right\}$$
then iterating the process of bracketing F and G generates a large space reachable via repeated compositions. For instance when $g(x) = e^{-x^2}$ the derivatives generate the famous Hermite functions, a basis of $L^2(\mathbb{R})$. In this case $R_{(F,G)}(0) = L^2(\mathbb{R})$. We devote §5.1 to exploring the function approximation schemes this fact inspires.
Continuing the example, for other choices of g we may alternately have
$$g^{[n+1]} \in \operatorname{span}\left\{g^{[i]} \mid 0 \le i \le n\right\}.$$
Then the space reachable by F and G is precisely limited. E.g., if we limit ourselves to $M := L^2[a,b]$ then we may choose g to be a sine or cosine function; then $R_{(F,G)}(0)$ is two-dimensional, as are the leaves of the foliation generated by $\Delta(F,G)$. Similarly if g is an nth order polynomial then the parameter space is (n+1)-dimensional. These choices of g would also give finite-dimensional foliations of $L^2_G(\mathbb{R})$, which denotes the space of square integrable functions with Gaussian weight, i.e., with norm
$$\|f\|_G := \left(\int |f(x)|^2 e^{-x^2}\,dx\right)^{1/2} < \infty.$$
Restating these results in different terminology: when controlling amplitude and phase, the 2-parameter system is holonomically constrained. Controlling phase and superposition perturbation (F and G) generates a larger space of signals; how much F and G deviate from holonomy depends on the choice of perturbation function g.
Example 81 Let's continue Example 80 with $M = L^2(\mathbb{R})$ and
$$F_t(f)(x) := f(x+t) \quad\text{and}\quad G_t(f) := f + tg.$$
Now define the arc fields
$$V_t(f) := e^{t}f \quad\text{and}\quad W_t(f)(x) := f\left(e^{t}x\right)$$
which may be thought of as vector space dilation (about the point $0 \in M$) and function dilation (about the point $0 \in \mathbb{R}$). Again, V and W are their own flows. Using the same approach as in Examples 79 and 80 it is easy to check the brackets satisfy
$$
\begin{aligned}
[F,G]_t(f) &= f - tg' + o(t), & [G,V]_t &= G_t + o(t), \\
[G,W]_t(f)(x) &= f(x) + txg'(x) + o(t), & [F,V] &= 0, \\
[F,W]_t &= -F_t + o(t), & [V,W] &= 0,
\end{aligned}
$$
assuming for the $[F,G]$ and $[G,W]$ calculations that $g \in C^1(\mathbb{R})$ and $g' \in L^2(\mathbb{R})$.
Consequently
$\Delta(F,G)$ may be highly non-involutive depending on g,
$\Delta(G,V)$ is involutive, but G and V do not commute,
$\Delta(G,W)$ may be highly non-involutive depending on g,
$\Delta(F,V)$ is involutive; F and V commute,
$\Delta(F,W)$ is involutive, but F and W do not commute,
$\Delta(V,W)$ is involutive; V and W commute.
For many choices of g (e.g., $e^{ax}$ or $x^{c}$ for non-integer c) the flows G and W control many function spaces, similarly to F and G. This gives more opportunities to generate approximation schemes which are explored in Chapter 5. Foliations of $L^2(\mathbb{R})$ generated by these distributions are now precisely understood.
Example 82 Now consider
$$G_t(f) := f + tg \quad\text{and}\quad Z_t(f)(x) := e^{tcx}f(x)$$
with $g(x) = e^{-x^2}$ and c a nonzero constant. G and Z are their own flows.
$$
[G,Z]_{t^2}(f)(x) = Z_{-t}G_{-t}Z_tG_t(f)(x) = f(x) + t^2 g(x)\,\frac{1-e^{-tcx}}{t}
$$
so $[G,Z] \sim H$ where $H_t(f)(x) := f(x) + tcxg(x)$. Therefore the bracket is not involutive. In fact iterating the bracket generates polynomials of every degree and the system is controllable: the reachable set satisfies $R_{(G,Z)} = L^2(I)$ for any bounded interval $I \subset \mathbb{R}$.
This then means that terms of the form
$$
\begin{aligned}
Z_{s_n}G_{t_n}\cdots Z_{s_1}G_{t_1}(0) &= e^{s_ncx}\Big(\cdots e^{s_2cx}\big(e^{s_1cx}t_1g(x)+t_2g(x)\big)\cdots+t_ng(x)\Big) \\
&= g(x)\left(a_1e^{b_1x}+a_2e^{b_2x}+\cdots+a_ne^{b_nx}\right)
\end{aligned}
$$
are dense in $L^2(\mathbb{R})$. Therefore linear combinations of exponentials
$$\sum_{n=0}^{N} a_n e^{b_nx} \qquad (4.1)$$
are dense in $L^2(I)$ for any bounded interval $I \subset \mathbb{R}$. Notice the choice of g was nearly immaterial: any nonvanishing function in $L^2(\mathbb{R})$ will do.
In this example the Stone-Weierstrass Theorem gives us the density of series such as (4.1), since they form an algebra generated by the set
$$\left\{e^{rx} \mid r \in \mathbb{R}\right\}$$
which separates points. However Stone-Weierstrass cannot predict the results we will achieve extending this example in §5.2.
Example 83 The question arises: does Fourier analysis decompose $L^2(0,1)$ with just a few flows? Yes and no. The typical point of view would more likely be that Fourier analysis is so powerful because we need only a countable set of orthogonal functions $\{e^{inx} \mid n \in \mathbb{Z}\}$ or $\{\sin(nx) \mid n \in \mathbb{Z}\} \cup \{\cos(nx) \mid n \in \mathbb{Z}\}$ to represent any of the uncountable profusion of functions in $L^2(0,\pi)$. The series
$$f(x) = \sum_{n=1}^{\infty} a_n \sin nx + \sum_{n=1}^{\infty} b_n \cos nx$$
represents (countably) infinitely many operations of superposition (+). That is, you need 1 circuit for each n to synthesize a signal. Finitely many circuits means limited fidelity. The translations F and G from Example 81 where $G_t(f) := f + tg$ with $g(x) := \cos x$ will certainly not work since the distribution generated by the brackets of F and G is merely 2-dimensional (the derivative of the cosine is a function translation of the cosine).
But from another point of view, the answer is yes. Three flows are enough to generate any Fourier series with a simple algorithm. Let us examine the flows
$$G_t(f) := f + tg \quad\text{and}\quad W_t(f)(x) := f\left(e^{t}x\right)$$
from Example 81 more closely. Choosing $g(x) = \cos x$ notice any finite cosine series
$$f(x) := \sum_{k=1}^{n} b_k \cos kx$$
is in the reachable set for G and W from 0, because
$$f = G_{b_1}W_{\ln 2}\,G_{b_2}W_{\ln(3/2)}\,G_{b_3}\cdots W_{\ln(n/(n-1))}\,G_{b_n}(0).$$
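Let us illustrate this with a calculation for n = 3:
$$
\begin{aligned}
G_{b_1}W_{\ln 2}G_{b_2}W_{\ln(3/2)}G_{b_3}(0)(x)
&= G_{b_1}W_{\ln 2}G_{b_2}W_{\ln(3/2)}\left(b_3\cos(x)\right) = G_{b_1}W_{\ln 2}G_{b_2}\left(b_3\cos\left(e^{\ln(3/2)}x\right)\right) \\
&= G_{b_1}W_{\ln 2}\left(b_3\cos\left(e^{\ln(3/2)}x\right)+b_2\cos x\right) \\
&= G_{b_1}\left(b_3\cos\left(e^{\ln(3/2)}e^{\ln 2}x\right)+b_2\cos\left(e^{\ln 2}x\right)\right) \\
&= b_3\cos(3x)+b_2\cos(2x)+b_1\cos x.
\end{aligned}
$$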
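The composition above is mechanical enough to run. Here is a minimal sketch of the alternating application of G and W with $g = \cos$, representing functions as Python callables; the name `synthesize` and the sample coefficients are illustrative assumptions, not notation from the text.

```python
import math

def G(b):
    """Vector space translation by b*cos: G_b(f) = f + b cos."""
    return lambda f: (lambda x: f(x) + b * math.cos(x))

def W(t):
    """Function dilation: W_t(f)(x) = f(e^t x)."""
    return lambda f: (lambda x: f(math.exp(t) * x))

def synthesize(b):
    """Apply G_{b_1} W_{ln 2} G_{b_2} ... W_{ln(n/(n-1))} G_{b_n} to the zero function."""
    n = len(b)
    f = lambda x: 0.0
    f = G(b[n - 1])(f)                       # innermost flow G_{b_n}
    for k in range(n - 1, 0, -1):            # k = n-1, ..., 1
        f = W(math.log((k + 1) / k))(f)      # W_{ln((k+1)/k)} rescales all frequencies
        f = G(b[k - 1])(f)                   # then G_{b_k} adds a new cos x term
    return f

b = [0.5, -1.0, 2.0]                         # illustrative coefficients b_1, b_2, b_3
f = synthesize(b)
x = 0.37
direct = sum(bk * math.cos((k + 1) * x) for k, bk in enumerate(b))
print(f(x), direct)
```

Each dilation multiplies every frequency already present by $(k+1)/k$, so the term added by $G_{b_k}$ ends at frequency k after the remaining dilations, which is why the telescoping product of logarithms works.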
So these two flows generate any Fourier cosine series, which means on $[0,\pi]$ that $R_{G,W} = L^2[0,\pi]$.
This algorithm means we can make a universal synthesizer by putting a single variable-frequency oscillator (an LRC circuit whose capacitor has an adjustable plate separation) on a looped wire. There are, of course, many obstacles which need to be overcome in order to put this (or any) scheme into practice. The time it takes to load a signal onto the wire (the load time) gets larger as the fidelity is increased, so we need several circuits working in parallel. Looped signals degenerate in time due to friction and diffusion. Et cetera. There are, of course, answers to each problem, but knowing whether such fixes are practical is one of the gaping holes in the education of theorists such as myself.
Finally, using the ideas from Example 80 and a trick we will see in Theorem 94, there is another algorithm possible, since
$$[G,W]_t(f)(x) = f(x) + txg'(x) + o(t) \qquad (4.2)$$
which for a function $g(x)$ set equal to $e^{x}$ or $\cos x$ means successive brackets again generate an infinite set of linearly independent functions.
Example 84 Again choosing some function space, $M := L^2(\mathbb{R})$ for example, let us slightly generalize V from Example 81 and consider $X_t(f)(x) := e^{tg(x)}f(x)$, which is its own flow if g is some bounded function. Incidentally X is the solution to the differential equation $f_t = gf$. Let $Y_t(f)(x) := f(x+t)$ and let's calculate their bracket:
$$[X,Y]_{t^2}(f)(x) = e^{t^2\frac{g(x)-g(x-t)}{t}}f(x)$$
so $[X,Y] \sim Z$ where
$$Z_t(f) := e^{tg'}f$$
and iterating we have $[X^{n},Y] \sim {}^{n}Z$ where
$$
\begin{aligned}
[X^{n},Y] &:= \underbrace{[[\ldots[[X,Y],Y],\ldots,Y],Y]}_{n\text{ brackets}} \\
{}^{n}Z_t(f) &:= e^{tg^{[n]}}f.
\end{aligned}
$$
Then
$$\left(a_m[X^{m},Y]+a_n[X^{n},Y]\right)_t(f) \sim e^{a_mtg^{[m]}}e^{a_ntg^{[n]}}(f) = e^{t\left(a_mg^{[m]}+a_ng^{[n]}\right)}(f)$$
for $a_i \in \mathbb{R}$ so
$$\left(\sum_{k=0}^{n}a_k[X^{k},Y]\right)_t(f) \sim \exp\left[t\sum_{k=0}^{n}a_kg^{[k]}\right](f).$$
Again, we understand the reachable set depending on the choice of g.
Example 85 Modifying the previous example slightly gives us a technique for generating any probability distribution. Consider the new flow
$$X_t(f)(x) := e^{tg(x)}f(x)/\left\|e^{tg}f\right\|.$$
Now M is chosen to be the unit sphere in some Banach function space B from which we get the norm used above, i.e., $M := \{f \in B \mid \|f\| = 1\}$. Further, g is chosen to be some suitably bounded function (though we are thinking merely of the Gaussian at this moment). Obviously $\|X_t(f)\| = 1$ for all $t \in [-1,1]$. Again X is its own flow, which is possibly surprising, but not difficult to check:
$$
\begin{aligned}
X_tX_s(f)(x) &:= \frac{e^{tg(x)}\,e^{sg(x)}f(x)/\left\|e^{sg}f\right\|}{\left\|e^{tg}\left[(e^{sg}f)/\left\|e^{sg}f\right\|\right]\right\|} \\
&= e^{(s+t)g(x)}f(x)/\left\|e^{(s+t)g}f\right\| = X_{s+t}(f)(x).
\end{aligned}
$$
Calculating the bracket with the translation flow $Y_t(f)(x) := f(x+t)$ is only slightly more difficult than in the previous example, and we get $[X,Y] \sim Z$ where
$$Z_t(f) := e^{tg'}f/\left\|e^{tg'}f\right\|.$$
Again the previous techniques work to generate any probability distribution assuming g is derivative-generating and the initial condition $f(x)$ is sufficiently regular (e.g., strictly positive).
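The claim that the normalized flow X is its own flow (that is, $X_tX_s = X_{s+t}$) is easy to test on a grid. This is a minimal sketch under stated assumptions: functions are sampled on a uniform grid over $[-5,5]$, the norm is a Riemann-sum stand-in for the $L^2$ norm, g is the Gaussian, and the starting function `f0` is an arbitrary positive choice. None of these discretization details come from the text.

```python
import math

dx = 0.01
xs = [-5.0 + dx * i for i in range(1001)]      # uniform grid on [-5, 5]

def norm(f):
    """Riemann-sum approximation of the L^2 norm of a sampled function."""
    return math.sqrt(sum(v * v for v in f) * dx)

g = [math.exp(-x * x) for x in xs]             # bounded multiplier: the Gaussian

def X(t, f):
    """Normalized multiplication flow: X_t(f) = e^{tg} f / ||e^{tg} f||."""
    h = [math.exp(t * gi) * fi for gi, fi in zip(g, f)]
    n = norm(h)
    return [v / n for v in h]

f0 = [1.0 / (1.0 + x * x) for x in xs]         # illustrative strictly positive start
n0 = norm(f0)
f0 = [v / n0 for v in f0]                      # place it on the unit sphere

s, t = 0.4, -0.9
lhs = X(t, X(s, f0))                           # X_t X_s (f0)
rhs = X(s + t, f0)                             # X_{s+t} (f0)
max_gap = max(abs(a - b) for a, b in zip(lhs, rhs))
print(max_gap, norm(lhs))
```

The identity only uses homogeneity of the norm, so the same check passes for any norm substituted into `norm`.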
Example 86 Let $M := l^2(\mathbb{R})$, i.e., the Hilbert space of all square summable sequences. Let $S := \{x \in M \mid \text{even entries are } 0\}$, e.g., $x = \left(1, 0, 2^{-1}, 0, 2^{-2}, 0, \ldots\right) \in S$. Then S is an infinite-dimensional closed linear submanifold of M and its infinitely many distinct cosets $\Phi := \{L_x := x + S \mid x \in M\}$ foliate M.
The set $\Delta$ of vector fields which have 0 in all the even entries is a distribution with the same foliation $\Phi$. Also $\Delta$ is clearly involutive. This gives us motivation to generalize Frobenius' Theorem further to infinite-dimensional distributions. [64] gives examples of foliations in the following infinite-dimensional contexts: on the space of gauge fields on a principal bundle, on the space of Riemannian metrics and on the space of probability measures on a manifold.
Example 87 Let $M = L^2(\mathbb{R})$. Consider the following arc fields
$$X_t(f)(x) := f(x+t) \quad\text{and}\quad Y_t(f)(x) := e^{t/2}f\left(e^{t}x\right).$$
X and Y each give 1-dimensional foliations of $M \setminus \{0\}$ and since they are each invariant on the unit ball $B := B(0,1) \subset M$ they also foliate $B \setminus \{0\}$ by a family of isometries, and foliate the unit sphere
$$S^{\infty} := \left\{f \in L^2(\mathbb{R}) \mid \|f\|_2 = 1\right\}.$$
Interestingly the lengths of the integral curves are infinite (whenever the initial condition is not $f \equiv 0$) and the integral curves are far from being geodesics$^1$.
An insight into the infinite dimension of $L^2$ comes from comparing these foliations to the 1-dimensional foliation of $\mathbb{R}^2 \setminus \{0\}$ given by rotations, which also foliates the ball $B_{\mathbb{R}^2}(0,1) \setminus \{0\}$. These rotations are a family of isometries, where the integral curves have finite length $2\pi r$.
As before it is easy to check the bracket satisfies $[X,Y] \sim -X$ so $\Delta$ gives a 2-dimensional foliation of $M \setminus \{0\}$ and $B \setminus \{0\}$ and $S^{\infty}$. The area of each leaf is again infinite. Higher dimensional foliations are given by adding transverse arc fields.
Let $B^{+} := \{f \in M : f(x) \ge 0,\ \forall x \in \mathbb{R}\}$. Choose $g \in B^{+}$ and define the arc field Z on M by
$$Z_t(f) := (1-t)f + tg.$$
It is easy to check Z satisfies E1 and E2, e.g.,
$$Z_sZ_t(f) = (1-s)\left[(1-t)f+tg\right]+sg = Z_{s+t}(f)+st(f-g)$$
so E2 is satisfied. $B^{+}$ is an attracting invariant set under the forward flow of Z. In fact g is the unique attracting fixed point of the flow. However $B^{+}$ is not invariant under the reverse flow of Z, e.g., pick $f = \chi_{[0,1]}$ and $g = \chi_{[1,2]}$. $B^{+}$ is also invariant (forward and backward) under X and Y and their bracket is also in $B^{+}$ so X and Y give a 2D foliation of $B^{+} \setminus \{0\}$. Further, any positive linear combination $aX + bY + cZ$ for $a, b, c > 0$ also gives a flow which has $B^{+}$ as a forward invariant set. But the bracket of X and Z is not tangent to $B^{+}$ (since it uses $t < 0$ in $Z_t$). As before the reachable set for X and Z or for Y and Z is generally a higher-dimensional space.
1
locally minimal length curves
Chapter 5
Approximation with
nonorthogonal families
Chapter 4 furnished a particularly surprising result: two simple flows can control an infinite-dimensional space. Here we translate some of these results to the language of numerical analysis.
5.1 Gaussians
5.1.1 First approximation formula
The result of Example 80 (that successive sums and translations of Gaussians approximate any $L^2$ function) can be profitably rephrased. $f \approx_{\epsilon} g$ means $\|f-g\|_2 < \epsilon$.
Theorem 88 For any $f \in L^2(\mathbb{R})$ and any $\epsilon > 0$ there exists $t > 0$ and $N \in \mathbb{N}$ and $a_n \in \mathbb{R}$ such that
$$f \approx_{\epsilon} \sum_{n=0}^{N} a_n e^{-(x-nt)^2}. \qquad (5.1)$$
If $f(x)e^{x^2/2}$ is integrable, then one choice of coefficients is
$$a_n = \frac{(-1)^n}{n!\sqrt{\pi}}\sum_{k=n}^{N}\frac{1}{(k-n)!\,(2t)^k}\int_{\mathbb{R}} f(x)\,e^{x^2}\frac{d^k}{dx^k}e^{-x^2}\,dx.$$
If $f(x)e^{x^2/2}$ is not integrable, replace f in the above formula with $f\cdot\chi_{[-M,M]}$ where M is chosen large enough that $\left\|f - f\cdot\chi_{[-M,M]}\right\|_2 < \epsilon$.
Proof. Since the span of the Hermite functions is dense in $L^2(\mathbb{R})$, see [61], we have for some N
$$f \approx_{\epsilon/2} \sum_{n=0}^{N} b_n\frac{d^n}{dx^n}e^{-x^2}.$$
Now use finite backward differences to approximate the derivatives (Appendix C reviews the well-known facts; use Example 129). We have for some small $t > 0$
$$\sum_{n=0}^{N} b_n\frac{d^n}{dx^n}e^{-x^2} \approx_{\epsilon/2} \sum_{n=0}^{N} b_n\frac{1}{t^n}\sum_{k=0}^{n}(-1)^k\binom{n}{k}e^{-(x-kt)^2}.$$
The coefficients $a_n$ are achieved by simplifying this last expression and calculating the coefficients $b_n$ using orthogonal functions. These straightforward calculations are detailed in [20].
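The backward-difference step is what turns derivatives of the Gaussian into right translates $e^{-(x-kt)^2}$. A minimal sketch of that one step follows, checking the n = 2 case against the exact second derivative at a single point; the function names and the sample point are illustrative, and this is a pointwise check rather than the $L^2$ statement in the proof.

```python
import math

def gauss(x):
    return math.exp(-x * x)

def backward_difference(n, x, t):
    """n-th backward difference quotient: t^{-n} * sum_k (-1)^k C(n,k) gauss(x - k t)."""
    total = sum((-1) ** k * math.comb(n, k) * gauss(x - k * t) for k in range(n + 1))
    return total / t ** n

def second_derivative(x):
    """Exact second derivative of e^{-x^2}."""
    return (4.0 * x * x - 2.0) * math.exp(-x * x)

x0 = 0.5                                       # illustrative sample point
for t in (0.1, 0.01, 0.001):
    print(t, backward_difference(2, x0, t), second_derivative(x0))
```

The quotient divides by $t^n$, so pushing t toward 0 for larger n amplifies rounding error; this is the same mechanism behind the large coefficients discussed in §5.1.5.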
This result may be surprising; it promises we can approximate to any degree of accuracy a function such as the following characteristic function of an interval
$$\chi_{[-11,-10]}(x) := \begin{cases} 1 & \text{for } x \in [-11,-10] \\ 0 & \text{otherwise} \end{cases}$$
with support far from the means of the Gaussians $e^{-(x-nt)^2}$ which are located in $[0,\infty)$ at the points $x = nt$. The graphs of these functions $e^{-(x-nt)^2}$ are extremely simple geometrically, being Gaussians with the same variance. We only use the right translates, and they all shrink precipitously (exponentially) away from their means.
[Figure: $\sum a_n e^{-(x-nt)^2} \approx$ characteristic function?]
5.1.2 Signal synthesis
Interpreting Theorem 88 in terms of signal analysis, we see a Gaussian filter is a universal synthesizer with arbitrarily short load time. To clarify this claim, let $G(x) := \frac{1}{\sqrt{\pi}}e^{-x^2}$. A Gaussian filter is a linear time-invariant system represented by the operator
$$\mathcal{W}(f)(x) := (f * G)(x) = \frac{1}{\sqrt{\pi}}\int_{\mathbb{R}} f(y)\,e^{-(x-y)^2}\,dy.$$
The symbol $\mathcal{W}$ is used in reference to the Weierstrass transform. If you feed $\mathcal{W}$ a Dirac delta distribution $\delta_t$ (an ideal impulse at time $x = t$) you get $\mathcal{W}(\delta_t)(x) = G(x-t)$. The load time $\tau$ of a signal $\mathcal{W}(f)$ is the length of the support of the input function f. Theorem 88 gives
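Corollary 89 For any $f \in L^2(\mathbb{R})$ and any $\epsilon > 0$ and any $\tau > 0$ there exists $t > 0$ and $N \in \mathbb{N}$ with $tN < \tau$ such that
$$f \approx_{\epsilon} \mathcal{W}\left(\sum_{n=0}^{N} a_n\delta_{nt}\right)$$
for some choice of $a_n \in \mathbb{R}$, where $\mathcal{W}$ is the Gaussian filter (Weierstrass transform) above.
Feed a Gaussian filter a linear combination of impulses and we can synthesize any signal with arbitrarily small load time $\tau$. The design of physical approximations to an analog electronic Gaussian filter is detailed in [29] and [46].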
One of the principal deities in Hinduism is Shiva who generates, maintains, and destroys the universe by performing the Tandava (“dance of bliss”) in cyclic eternity. Said to be the supreme representation of Hindu art, the Chola Nataraja (Figure 5.1) depicts Shiva in the midst of Tandava. In this canonical representation fire is held in Shiva's upper left hand, symbolizing destruction, and a damaru drum in Shiva's upper right hand, with which Shiva (re)generates the cosmos. The damaru is a bifacial drum in the shape of an hourglass and is sometimes made from human skulls (Figure 5.2). Delivering alternate pulses to the opposite sides of a damaru is an excellent metaphor for the universal signal synthesis of Corollary 89.
Appropriately, Shiva is depicted in the Chola Nataraja breaking the back of
the demon Apasmara, who represents ignorance.
5.1.3 Deconvolution
The inverse problem of convolution is an important one for signal analysis: given a set of transformed signals from a fixed process, how do we find the signals that were originally transformed? In a dramatic instance, astronomers collected flawed images from the Hubble telescope's slightly defective mirror from 1990 when it was launched until 1993 when the aberration was corrected. These images were far from useless and were immediately improved with deconvolution. All imaging systems (cameras, telescopes, microscopes, televisions, eyeballs) have such flaws to a certain degree, due to inevitable imperfections in lenses and mirrors, and deconvolution is an important technique for improving their quality. (The bible of optics is [13].)
Let's frame the problem mathematically. We imagine the pictures that Hubble took are the image under the convolution
$$\Phi_T(f)(x) := (f * T)(x) := \int_{\mathbb{R}} f(y)\,T(x-y)\,dy.$$
f is the perfect picture, and the output $\Phi_T(f)$ is the transformed, flawed picture. The goal is to find the inverse transform $\Phi_T^{-1}$ given only a few outputs.
Figure 5.1: Chola Nataraja © Himalayan Academy Publications, Kapaa, Kauai, Hawaii
Figure 5.2: Damaru made from crania © National Music Museum: America's Shrine to Music
Determining T solves the problem. The fundamental trick is to find the image of a point source or Dirac delta distribution. Knowing $\Phi_T(\delta_0)$ gives us T since
$$\Phi_T(\delta_0)(x) = \int_{\mathbb{R}}\delta_0(y)\,T(x-y)\,dy = \int_{\mathbb{R}}\delta_0(x-y)\,T(y)\,dy = T(x).$$
Once the transfer function T is thus known, the Fourier transform $\mathcal{F}$ may theoretically be used to find the inverse
$$f = \mathcal{F}^{-1}\left(\mathcal{F}(f * T)/\mathcal{F}(T)\right)$$
using the Convolution Theorem $\mathcal{F}(f * T) = \mathcal{F}(f)\,\mathcal{F}(T)$. Division by 0 in this calculation may demand another approach, though. In astronomy, distant stars are regularly used as point sources. In microscopy, point sources are more difficult to obtain. In astronomy and microscopy T is called the point spread function; in signal analysis T is the impulse response; in control theory T is the transfer function; in seismology T is the earth-reflectivity function.
Theoretically we can find T given $\Phi_T(f)$ for any particular f since again
$$T = \mathcal{F}^{-1}\left(\mathcal{F}(f * T)/\mathcal{F}(f)\right)$$
but dividing by 0 again suggests this approach is not numerically feasible.
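A discrete analogue makes both the formula and its weakness concrete. The sketch below swaps the continuous Fourier transform for a finite discrete Fourier transform with circular convolution, which is an assumption made for illustration only; the test signal `f` and the kernel `T` are made-up data. It recovers T from the blurred signal by coefficient-wise division and refuses to divide when a Fourier coefficient of f is numerically zero.

```python
import cmath

def dft(a, sign=-1):
    """Plain O(n^2) discrete Fourier transform; sign=+1 gives the unnormalized inverse."""
    n = len(a)
    return [sum(a[j] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def idft(A):
    n = len(A)
    return [v / n for v in dft(A, sign=+1)]

def circular_convolve(f, T):
    """(f * T)[i] = sum_j f[j] T[(i - j) mod n]."""
    n = len(f)
    return [sum(f[j] * T[(i - j) % n] for j in range(n)) for i in range(n)]

f = [1.0, 3.0, -2.0, 0.5, 0.0, 4.0, 1.5, -1.0]     # known input (made-up data)
T = [0.6, 0.25, 0.05, 0.0, 0.0, 0.0, 0.0, 0.1]     # hypothetical point spread function

blurred = circular_convolve(f, T)
Fb, Ff = dft(blurred), dft(f)
if min(abs(c) for c in Ff) < 1e-9:
    raise ValueError("a Fourier coefficient of f vanishes; division is not possible")
T_recovered = [v.real for v in idft([b / a for b, a in zip(Fb, Ff)])]
print(max(abs(a - b) for a, b in zip(T_recovered, T)))
```

With noise-free data and no vanishing coefficients the recovery is accurate to rounding; small coefficients of f would amplify any noise in `blurred`, which is the numerical face of the division-by-0 objection.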
Theorem 88 allows us to use Gaussians as an alternative to point sources or Dirac delta distributions in constructing an approximation of T. Let $G_z(x) := \frac{1}{\sqrt{\pi}}e^{-(x-z)^2}$. Then
$$
\begin{aligned}
\Phi_T(G_0)(x) &= \int_{\mathbb{R}} G_0(y)\,T(x-y)\,dy = \int_{\mathbb{R}} G_0(y-x)\,T(y)\,dy \\
&= \int_{\mathbb{R}} G_x(y)\,T(y)\,dy = \langle G_x, T\rangle. \qquad (5.2)
\end{aligned}
$$
Now by Theorem 88 we can construct an orthonormal basis of $L^2$ with linear combinations of the $G_x$ and obtain the generalized Fourier coefficients of T from line (5.2). The most natural choice of orthonormal basis would be the Hermite polynomials. Specifically, given enough information about $\Phi_T(G_0)$ we can compute
$$\left.\frac{d^n}{dx^n}\Phi_T(G_0)(x)\right|_{x=0} = \left\langle\frac{d^nG}{dx^n}, T\right\rangle = (-1)^n\langle H_nG, T\rangle$$
which are essentially the Hermite coefficients of T with which we may reconstruct T as a Hermite series. This method is essentially inverting the Weierstrass transform [10] which has been largely neglected despite periodic rediscovery. One advantage to using the Weierstrass transform in astronomy (or microscopy) is that the light profile of a distant star (or quantum dot) is quite accurately represented by a Gaussian compared with a Dirac delta distribution.
As discussed in Examples 80 and 81 this method may be expanded to other choices of functions in place of the Gaussian, but the choice is precisely limited to derivative-generating functions. Those examples demonstrate why polynomials and sine functions, e.g., are poor choices.
5.1.4 Coeﬃcient formulas
Theorem 88 promises any $L^2$ function can be approximated
$$f(x) \approx \sum_{n=0}^{N} a_n e^{-(x-nt)^2} \qquad (5.3)$$
and gives a formula for the coefficients $a_n$. Unfortunately we cannot simply send $N \to \infty$:
$$f(x) = \sum_{n=0}^{\infty} a_n e^{-(x-nt)^2}.$$
The approximation, line (5.3), works because it is a weaker form of convergence than the typical Hilbert space series convergence. Consequently the coefficients $a_n$ are not unique, and in fact are not “best” according to the classical continuous least squares technique.
[Figure, two panels: least squares approximation, N = 5, t = .01; Theorem 88 formula, N = 5, t = .01.]
Least squares
With the least squares method we minimize the error function
$$E_2(a_0,\ldots,a_N) := \int_{\mathbb{R}}\left|f(x) - \sum_{n=0}^{N} a_n e^{-(x-nt)^2}\right|^2 dx$$
by setting $\partial E_2/\partial a_j = 0$ for $j = 0,\ldots,N$ and solving for the $a_n$. These N + 1 linear equations are called the normal equations. The matrix form of this system is $M\vec{v} = \vec{b}$ where M is the matrix
$$M = \left[\sqrt{\frac{\pi}{2}}\;e^{-\left(k^2+j^2-\frac{(k+j)^2}{2}\right)t^2}\right]_{j,k=0}^{N}$$
and
$$\vec{v} = [a_j]_{j=0}^{N} \quad\text{and}\quad \vec{b} = \left[\int_{\mathbb{R}} f(x)\,e^{-(x-jt)^2}\,dx\right]_{j=0}^{N}.$$
M is symmetric and invertible, so we can always solve for the $a_n$. But these least squares matrices are notorious for being ill-conditioned when using non-orthogonal approximating functions. The Hilbert matrix is the archetypical example. The current application is no exception since the matrix entries are very similar for most choices of N and t, so round-off error is extreme. Choosing N = 7 instead of 5 in the graphed example above requires almost 300 significant digits.
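The ill-conditioning can be made visible by eliminating the matrix in extended precision and looking at the pivots. The sketch below is an illustration under assumptions: it uses Python's `decimal` module at 80 digits, drops the common factor $\sqrt{\pi/2}$ (which does not affect conditioning), uses $k^2 + j^2 - (k+j)^2/2 = (j-k)^2/2$ to write the entries, and treats the ratio of the first to the last pivot as a rough proxy for the condition number rather than computing the condition number itself.

```python
from decimal import Decimal, getcontext

getcontext().prec = 80                          # work with 80 significant digits

def gram_matrix(N, t):
    """Entries exp(-(j-k)^2 t^2 / 2); the common factor sqrt(pi/2) is dropped."""
    t = Decimal(str(t))
    return [[(-(Decimal(j - k) ** 2) * t * t / 2).exp() for k in range(N + 1)]
            for j in range(N + 1)]

def pivots(A):
    """Pivots of Gaussian elimination without row exchanges."""
    A = [row[:] for row in A]
    n = len(A)
    out = []
    for c in range(n):
        out.append(A[c][c])
        for r in range(c + 1, n):
            m = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= m * A[c][k]
    return out

p = pivots(gram_matrix(5, 0.01))                # N = 5, t = .01 as in the figure
for q in p:
    print(f"{q:.3E}")
ratio = p[0] / p[-1]
```

When the last pivot falls below the relative precision of ordinary double-precision arithmetic (about $2.2\times10^{-16}$), the same elimination in floating point cannot be trusted, which is consistent with the need for many significant digits noted above.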
Lagrange multipliers formula
A third approach is to use Lagrange multipliers to minimize $\sum_{n=0}^{N} a_n^2$ subject to the constraint
$$\int_{\mathbb{R}}\left[f(x) - \sum_{n=0}^{N}\left(a_n\cos ntx + b_n\sin ntx\right)\right]^2 dx \le \epsilon.$$
Changing the translation factors
There are many other approaches as well. The question of which method works best is difficult, and the answer (if there is one) depends on the situation. A completely different approach that may be effected is to change the choice of Gaussians, since we don't need to use evenly spaced translations:
$$f(x) \approx \sum_{n=0}^{N} a_n e^{-(x-\alpha_nt)^2} \qquad (5.4)$$
for any suitably bounded sequence $\alpha$. The coefficients may be calculated by adjusting the proof of Theorem 88 with the n-point numerical differentiation formula appropriate to $\alpha$ (reviewed in Appendix C).
5.1.5 Instability
Be warned that the method is unstable for two reasons: the coefficients grow without bound as $\epsilon \to 0$; and as N increases all the coefficients need to be recalculated. These difficulties can be ameliorated using the numerical analysis canon (e.g., see Appendix C for one approach to improving numerical differentiation), but not eliminated. Instability is a catastrophic problem that precludes the use of this method in many situations. However, we can take comfort from the fact that we know precisely where the instability arises, and so we can anticipate the error and adjust for it.
5.2 Low-frequency trigonometric series
“Signal analysts do it with high frequency.”
bumper sticker
5.2.1 Density in $L^2$
Taking the Fourier transform of line (5.1) gives the startling fact that low-frequency trigonometric series are dense in $L^2[a,b]$. Let's go through the details. Define the norm
$$\|f\|_{2,G} := \left(\int_{\mathbb{R}} |f(x)|^2 e^{-x^2}\,dx\right)^{1/2}$$
with Gaussian weight function. Let $L^2_G(\mathbb{R})$ denote the set of functions f with finite norm $\|f\|_{2,G} < \infty$. Write $f \approx_{\epsilon,G} g$ to mean $\|f-g\|_{2,G} < \epsilon$.
Theorem 90 For every $f \in L^2(\mathbb{R},\mathbb{C})$ and $\epsilon > 0$ there exists $N \in \mathbb{N}$ and $t_0 > 0$ such that for any $t \neq 0$ with $|t| < t_0$
$$f(x) \approx_{\epsilon,G} \sum_{n=0}^{N} a_n e^{-intx}$$
for $a_n \in \mathbb{C}$ dependent on N and t.
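Proof. We use the Fourier transform with convention
$$\mathcal{F}[f](s) = \frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}} f(x)\,e^{-isx}\,dx.$$
$\mathcal{F}$ is a linear isometry of $L^2(\mathbb{R},\mathbb{C})$ with
$$\mathcal{F}\left[e^{-\alpha x^2}\right] = \frac{1}{\sqrt{2\alpha}}\,e^{-\frac{s^2}{4\alpha}},$$
$$\mathcal{F}[f(x+r)] = e^{irs}\,\mathcal{F}[f(x)] \quad\text{and}$$
$$\mathcal{F}[g * h] = \sqrt{2\pi}\,\mathcal{F}[g]\,\mathcal{F}[h]$$
where $*$ is convolution. (With this convention a translate $f(x - nt)$ transforms to $e^{-ints}\mathcal{F}[f]$, which is the form used below.)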
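Let $f \in L^2$; we now show $f_2(x) := \frac{1}{\sqrt{2\pi}}e^{-x^2} * \mathcal{F}^{-1}[f](x) \in L^2$, where $\mathcal{F}$ is the Fourier transform. Notice $g := \mathcal{F}^{-1}[f] \in L^2$ and
$$
\begin{aligned}
\|f_2\|_2^2 &= \int_{\mathbb{R}}\left|\int_{\mathbb{R}}\frac{1}{\sqrt{2\pi}}\,g(x-y)\,e^{-y^2}\,dy\right|^2 dx \le \frac{1}{2\pi}\int_{\mathbb{R}}\int_{\mathbb{R}}|g(x-y)|^2e^{-2y^2}\,dy\,dx \\
&= c\,\left\|\mathcal{W}_{t_0}\left[|g|^2\right]\right\|_1 = c\,\left\|g^2\right\|_1 = c\,\|g\|_2^2 = c\,\|f\|_2^2 < \infty
\end{aligned}
$$
for some $c > 0$. Here $\mathcal{W}_t[h]$ is the solution to the diffusion equation for time t and initial condition h. (The notation $\mathcal{W}$ refers to the Weierstrass transform.) The reason for the third equality in the previous calculation is that $\mathcal{W}_t$ maintains the $L^1$ integral of any positive initial condition h for all time $t > 0$ [66].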
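Now approximate the real and imaginary parts of $f_2$ with Theorem 88. Then we get
$$\frac{1}{\sqrt{2\pi}}\,e^{-x^2} * \mathcal{F}^{-1}[f](x) \approx_{\epsilon} \sum_{n=0}^{N} a_n e^{-(x-nt)^2}, \qquad a_n \in \mathbb{C}$$
and applying the Fourier transform $\mathcal{F}$ gives
$$\frac{1}{\sqrt{2}}\,e^{-s^2/4}f(s) \approx_{\epsilon} \sum_{n=0}^{N} a_n e^{-ints}\,\frac{1}{\sqrt{2}}\,e^{-s^2/4}.$$
Hence
$$f(s) \approx_{\sqrt{2}\epsilon,G} \sum_{n=0}^{N} a_n e^{-ints}$$
using the fact $e^{-s^2/4} \ge e^{-s^2}$.
Another proof is furnished in Example 93, below.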
Corollary 91 On any finite interval $[a,b]$ for any $\omega > 0$ the finite linear combinations of sine and cosine functions with frequency lower than $\omega$ are dense in $L^2([a,b],\mathbb{R})$.
Proof. On $[a,b]$ the Gaussian is bounded and so the norms with or without weight function are equivalent. Apply Theorem 90 to $f \in L^2([a,b],\mathbb{R})$ and choose t such that $Nt < \omega$ to get
$$f \approx_{\epsilon} \sum_{n=0}^{N}\operatorname{Re}(a_n)\cos(ntx) + \operatorname{Im}(a_n)\sin(ntx)$$
where, with $\mathcal{F}^{-1}$ the inverse Fourier transform,
$$a_n = \frac{(-1)^n}{n!\,2\pi}\sum_{k=n}^{N}\frac{1}{(k-n)!\,(2t)^k}\int_{\mathbb{R}}\left[e^{-x^2} * \mathcal{F}^{-1}[f](x)\right]e^{x^2}\frac{d^k}{dx^k}e^{-x^2}\,dx.$$
High frequency trigonometric series are of course dense in $L^2[0,2\pi]$, which is the basis of Fourier analysis; this idea was revolutionary in the 19th century, but it's common mathematical intuition today. Looking at the figures below we're not too surprised that linear combinations of trig functions $\sin kx$ and $\cos kx$ can represent most any function on $[0,2\pi]$. Perhaps, though, even jaded contemporaries will be surprised the set of uninteresting, flat functions $\sin\left(\frac{1}{k}x\right)$ and $\cos\left(\frac{1}{k}x\right)$ can combine linearly to give practically any function on $(-\infty,\infty)$.
The fact that low-frequency trig series can be constructed with a high-frequency “signal” casts into doubt our traditional interpretation of the fundamental facts of information theory. High channel capacity is possible with low bandwidth transmission. Spectral imaging techniques are theoretically possible with low-frequency electromagnetic waves.
[Figure, two panels: high frequency, $\omega \ge 1$: $\cos(kx)$, $k = 1,\ldots,6$; low frequency, $\omega \le 1$: $\cos\left(\frac{1}{k}x\right)$, $k = 1,\ldots,6$.]
Remark 92 The above development actually shows something a bit stronger. Just as Gaussians with centers near zero can approximate any $L^2$ function, we may pick any point $x_0 \in \mathbb{R}$ in place of 0 and Gaussians with centers near $x_0$ clearly may still approximate any $L^2$ function.
Then as before, taking the Fourier transform shows we may replace trig functions of near-0 frequency (low-frequency trig functions) with trig functions of frequency near any $x_0$ and still get a robust approximation technique. This means as long as we have subtle control over an oscillator in any narrow bandwidth we can synthesize every signal.
That low-frequency trigonometric series $\sum_{n=0}^{N} a_n e^{-intx}$ are dense in $L^2_G(\mathbb{R},\mathbb{C})$ may be superficially surprising in light of the penultimate paragraph of Example 80 which shows series of the form $\sum_{n=0}^{N} a_n e^{-i(x+r)}$ for all $r \in \mathbb{R}$ are far from dense, forming a 3-dimensional subset of the $\infty$-dimensional space. Use $F_t(f)(x) := f(x+t)$ and $G_t(f) = f + th$ with $h(x) = e^{ix}$.
5.2.2 Coeﬃcient formulas
One formula for the coefficients of the low-frequency approximation is explicit in the constructive proof given above for Corollary 91. Alternately we can directly use classical least squares approximation or Lagrange multipliers as detailed in §5.1. Let's look at a few more approaches.
Example 93 As in Example 82 with c set equal to the complex number i,
$$G_t(f) := f + tg \quad\text{and}\quad Z_t(f)(x) := e^{ixt}f(x)$$
with $g(x) = e^{-x^2}$ gives $R_{(G,Z)} = L^2(\mathbb{R},\mathbb{C})$. Z is effectively the Fourier transform of function translation. As before $[G,Z] \sim H$ where
$$H_t(f)(x) := f(x) + tixg(x).$$
Iterating the bracket generates polynomials of every degree, so the system is controllable. This is another proof of Theorem 90.
Example 93 inspires a third proof of Theorem 90:
Theorem 94 If f is represented by a power series
$$f(x) = \sum_{n=0}^{\infty} b_n x^n$$
with convergence in $L^2_G$ then
$$f \approx_{\epsilon,G} \sum_{k=0}^{N} a_k e^{iktx}$$
for large enough N and small $t \neq 0$ where
$$a_k = \sum_{n=k}^{N}(-1)^{n-k}\frac{b_n}{(it)^n}\binom{n}{k}.$$
Proof. Since
$$i^n x^n = \left.\frac{d^n}{dr^n}e^{irx}\right|_{r=0}$$
we may truncate the series for f to get
$$f(x) \approx_{\epsilon/2,G} \sum_{n=0}^{N} b_n i^{-n}\left.\frac{d^n}{dr^n}e^{irx}\right|_{r=0}.$$
Now use finite forward differences to approximate the derivatives (Appendix C). We have for some small $t > 0$
$$\sum_{n=0}^{N}\frac{b_n}{i^n}\left.\frac{d^n}{dr^n}e^{irx}\right|_{r=0} \approx_{\epsilon/2,G} \sum_{n=0}^{N}\frac{b_n}{i^n}\frac{1}{t^n}\sum_{k=0}^{n}(-1)^{n-k}\binom{n}{k}e^{iktx}.$$
Switching the order of summation gives the result.
Every $n$th order polynomial is then rewritten as a low-frequency trigonometric series with $n$ (complex) terms. We can also use different polynomial approximations of a function (e.g., orthogonal polynomials) to get different formulas for the coefficients of the low-frequency trig-series approximation.
Example 95 Three low-frequency sine functions are enough to approximate $x^3$ arbitrarily closely despite the fact that their graphs look linear around 0.

[Figure: a low-frequency sine curve plotted for $x \in [-60, 60]$ (not to scale).]

[Figure: $\sin(.01x)$, $\sin(.02x)$, $\sin(.03x)$ plotted for $x \in [-10, 10]$.]

Using the formula from Theorem 94, we can pick $t = .01$ to get
$$x^3 \approx \frac{-3}{(.01)^3}\sin(.01x) + \frac{3}{(.01)^3}\sin(.02x) - \frac{1}{(.01)^3}\sin(.03x)$$
and graphs show close visual agreement for $x \in [-20, 20]$.

[Figure: $x^3$ and low-frequency approximation; $t = 0.1$, $x \in [-3, 3]$.]

[Figure: $x^3$ and low-frequency approximation; $t = 0.01$, $x \in [-20, 20]$.]
As $t \to 0$ the graphs converge for every $x$. Notice this miracle comes at the cost of large coefficients; see the comments on instability in §5.1.5.
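The three-sine formula of Example 95 is easy to try out. The sketch below is illustrative only; the name `sine_cubic` and the sample points are assumptions made here, not part of the text.

```python
import math

def sine_cubic(x, t):
    """Three-sine approximation of x**3 from Example 95 with frequency scale t."""
    return (-3 * math.sin(t * x)
            + 3 * math.sin(2 * t * x)
            - math.sin(3 * t * x)) / t ** 3

for t in (0.1, 0.01):
    # Report the size of the largest coefficient next to the approximation at x = 2.
    print(f"t={t}: largest coefficient {3 / t ** 3:.0f}, "
          f"value at x=2: {sine_cubic(2.0, t):.4f}")
```

Shrinking $t$ by a factor of 10 improves the agreement with $x^3 = 8$ but multiplies the coefficients by 1000, which is the trade-off noted above.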
In Figures 5.3-5.5 a few more examples of Theorem 94 are displayed.

Figure 5.3: $e^x$ (solid), $\sum_{n=0}^{4} x^n/n!$ (dots) and its low-freq. trig. series approximation (dashes) with $t = 0.1$

Figure 5.4: $x(x+1)(x-1)$ and its low-freq. trig. approximation; $t = 0.1$

The instability we referred to above arises, for example, when we wish to get a better approximation for the graph in Figure 5.5. If we try to shrink $t$ from 0.1 to, say, 0.01, the coefficients, which include $1/t^n$, grow quickly. Depending on the limits of your machine precision, as $t$ shrinks you will eventually see a meaningless graph like Figure 5.6. As suggested in §5.1.5, implementing the more sophisticated numerical differentiation formulas derived in Appendix C avoids the roundoff error. We can leave $t = 0.1$ but add more terms in the $n$-point formula to improve the approximation while avoiding large coefficients, thereby sidestepping the instability (Figure 5.7).
It is not difficult to extend Remark 92, slightly generalizing Theorem 94 to give a similarly simple formula for power series conversion to trig series with frequency in any narrow range. One approach is to approximate $f(x)e^{-ixc}$ with frequencies near 0, then multiply by $e^{ixc}$.
Figure 5.5: $\sum_{n=0}^{17} \sin(n\pi/2)\, x^n/n!$ and its low-freq. trig. approximation; $t = 0.1$

Figure 5.6: Instability creeps in as $t \to 0.01$ with 30 digits precision.

Figure 5.7: Improved approximation of $\sum_{n=0}^{17} \sin(n\pi/2)\, x^n/n!$ keeping $t = 0.1$ but using a higher $n$-point formula.
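Axel Boldt and Pangyen Weng suggest another approach which yields the same coefficients.
2nd proof of Theorem 94. This time just use
$$x = \frac{1}{i}\left.\frac{d}{dr} e^{irx}\right|_{r=0} = \lim_{h\to 0} \frac{e^{ihx} - 1}{ih} \approx \frac{e^{itx} - 1}{it}$$
so that
$$x^n \approx \frac{\left(e^{itx} - 1\right)^n}{(it)^n} = \frac{1}{(it)^n} \sum_{k=0}^{n} (-1)^{n-k} \binom{n}{k} e^{iktx}.$$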
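Finally, the real part may be calculated in several ways to get the formula
$$f \approx_{\epsilon,G} \sum_{n=0}^{N} \left[ \frac{b_{2n}}{t^{2n}} \left( \frac{1}{2^{2n}} \binom{2n}{n} + \frac{(-1)^n}{2^{2n-1}} \sum_{k=0}^{n-1} (-1)^k \binom{2n}{k} \cos\big((2n-2k)tx\big) \right) + \frac{b_{2n+1}}{t^{2n+1}} \frac{(-1)^n}{2^{2n}} \sum_{k=0}^{n} (-1)^k \binom{2n+1}{k} \sin\big((2n+1-2k)tx\big) \right].$$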
Error bound
From the numerical analysis point of view, it is worth noting how Theorem 94 leads to an explicit error bound for low-frequency trig-series approximations:
1. Use Taylor's Theorem to get an error bound for the polynomial approximation.
2. Replace the monomials $x^n$ with $i^{-n} \left.\frac{d^n}{dr^n} e^{irx}\right|_{r=0}$.
3. Approximate the derivatives with finite differences, which have an explicit error bound (details in Appendix C).
5.2.3 Damping gives a stable family
Numerical differentiation formulas have led us above to new approximating families: shifted Gaussians and low-frequency trig series. Another approach to numerical differentiation gives a noteworthy pair of approximating families, low-frequency trig series with exponential and Gaussian damping.

[Figure: Exponential and Gaussian-damped trig functions, $e^{x}\cos x$ and $e^{-x^2}\cos x$.]
Simply choose complex nodes $\alpha_j := e^{\frac{j}{n}2\pi i}$ instead of real nodes for sampling in the derivative approximation formulas. Example 130 from Appendix C guarantees we have
$$\frac{d^m}{dz^m} g(z) \approx \frac{m!}{nt^m} \sum_{j=1}^{n} e^{\frac{(n-m)j}{n}2\pi i}\, g\!\left(z + te^{\frac{j}{n}2\pi i}\right).$$
Using this formula for the coefficients to approximate the derivative $\left.\frac{d^m}{dr^m} e^{irx}\right|_{r=0}$ as in Theorem 94 leads to
Theorem 96 If $f$ is represented by a power series
$$f(x) = \sum_{n=0}^{\infty} b_n x^n$$
with $L^2$ convergence on a bounded interval, then $f$ is approximated by damped, low-frequency trigonometric functions
$$f(x) \approx \sum_{k=0}^{n} a_k\, e^{itx\cos\left(\frac{k}{n}2\pi\right)} e^{-tx\sin\left(\frac{2\pi k}{n}\right)}$$
for large enough $M$ and $n$ and any fixed $t \neq 0$ where
$$a_k = \sum_{m=0}^{M} b_m \frac{m!}{(it)^m n}\, e^{2\pi i \frac{(n-m)k}{n}}.$$
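Proof.
$$\begin{aligned}
f(x) = \sum_{m=0}^{\infty} b_m x^m &\approx_{\epsilon/2} \sum_{m=0}^{M} b_m i^{-m} \left.\frac{d^m}{dr^m} e^{irx}\right|_{r=0} \\
&\approx_{\epsilon/2} \sum_{m=0}^{M} b_m \frac{m!}{(it)^m n} \sum_{k=0}^{n} e^{\frac{(n-m)k}{n}2\pi i}\, e^{itxe^{\frac{k}{n}2\pi i}} \\
&= \sum_{m=0}^{M} b_m \frac{m!}{(it)^m n} \sum_{k=0}^{n} e^{i\left(tx\cos\left(\frac{k}{n}2\pi\right) + \frac{(n-m)k}{n}2\pi\right)} e^{-tx\sin\left(\frac{2\pi k}{n}\right)} \\
&= \sum_{k=0}^{n} \left[ \sum_{m=0}^{M} b_m \frac{m!}{(it)^m n}\, e^{2\pi i\frac{(n-m)k}{n}} \right] e^{itx\cos\left(\frac{k}{n}2\pi\right)} e^{-tx\sin\left(\frac{2\pi k}{n}\right)}.
\end{aligned}$$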
Remarkably, there is no instability with this approximating family, because the $t$ may remain fixed at any value, even $t = 1$, and the $n$ may be increased arbitrarily. In this special instance, numerical differentiation is not unstable because the function we are approximating is analytic and entire. We can appreciate the stability of the method by noticing the terms do not grow uncontrollably as $n \to \infty$, but instead converge to Cauchy's integral. See Example 130.
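The stability claim can be illustrated with the complex-node derivative formula quoted above from Example 130. This is a minimal sketch under stated assumptions: the helper name `complex_node_derivative` is hypothetical, the nodes are indexed $j = 1, \dots, n$ as in that formula, and the test function $g = \exp$ at $z = 0$ is chosen only because its derivatives are all 1.

```python
import cmath
import math

def complex_node_derivative(g, z, m, n, t=1.0):
    """Approximate the m-th derivative of g at z using n nodes on a circle of radius t."""
    total = 0j
    for j in range(1, n + 1):
        w = cmath.exp(1j * 2 * math.pi * j / n)   # node e^{(j/n) 2 pi i}
        total += w ** (n - m) * g(z + t * w)
    return math.factorial(m) / (n * t ** m) * total

# Second derivative of exp at 0 is exactly 1; keep t = 1 fixed and raise n.
errors = {n: abs(complex_node_derivative(cmath.exp, 0.0, 2, n) - 1.0)
          for n in (4, 8, 16)}
print(errors)
```

With $t$ held at 1 the error falls rapidly as $n$ grows, and no large coefficients appear.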
With a more careful choice of nodes $\alpha_j$ the damping terms' coefficients may be manipulated by the modeler. Then, however, the coefficients $a_k$ are not explicitly given, but may be determined by solving the Vandermonde system (again, see Appendix C).
Applying this approach to the Gaussian approximation scheme in Theorem 88 gives an approximating family consisting of low-frequency trig functions $e^{it_k x}$ with Gaussian damping factors $e^{-(x+r_k)^2}$, which again boasts an explicit, stable formula for the coefficients.
The shape of these approximating functions is similar to the sinc function $\frac{\sin x}{x}$. Compare this scheme and the Gaussian translations of §5.1 with the classical Whittaker-Shannon interpolation formula, which uses translates of a single sinc function with carefully chosen frequency.
Chapter 6
Partial differential equations
Further applications of metric space algebra on function spaces lead to a new perspective on partial differential equations (PDEs) by rewriting them without derivatives. This is desirable because on any reasonably robust space of functions, a differential operator is unbounded. Looking at flows without resorting to such discontinuous vector fields on Banach spaces gives palpable advantages.
6.1 Metric space arithmetic
Example 97 $F_t(f)(x) := f(x+t)$ and $G_t(f) := f + tg$ on, say, $M = L^2$ has
$$\begin{aligned}
d\big(F_t G_s(f), G_s F_t(f)\big) &= \left( \int \big( f(x+t) + sg(x+t) - [f(x+t) + sg(x)] \big)^2 dx \right)^{1/2} \\
&= |st| \left( \int \left[ \frac{g(x+t) - g(x)}{t} \right]^2 dx \right)^{1/2} = O(st)
\end{aligned}$$
assuming $g \in C^1$ and $g' \in L^2$, so $F$ & $G$ close. Then computing the flow $H$ of $F + G$ with Euler curves we get
$$\left(G_{t/n} F_{t/n}\right)^{(n)}(f)(x) = f(x+t) + \frac{t}{n} \sum_{m=0}^{n-1} g\!\left(x + m\frac{t}{n}\right)$$
so
$$H_t(f)(x) = \lim_{n\to\infty} \left(G_{t/n} F_{t/n}\right)^{(n)}(f)(x) = f(x+t) + \int_x^{x+t} g(y)\,dy.$$
$H$, representing the equal combination of translation on the $x$-axis ($F$) and $L^2$ vector space translation ($G$), gives a dynamic which may be described as $x$-axis translation with a smeared $L^2$ vector space translation of $g$. The sum of these flows was introduced in [24], §5.2, with other interesting function space examples and a partial differential equations treatment.
$H$ inspires us to study a similar arc field $X_t(f)(x) := f(x) + \int_x^{x+t} g(y)\,dy$. We may readily check that $X \sim G$ so $X$ doesn't give a new dynamic, but it's still interesting as a new representation of a fundamental arc field.
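The Euler-curve limit in Example 97 can be reproduced with a few closures. The sketch below is illustrative: the names `F`, `G`, and `euler_curve` are chosen here, and the choices $f = \sin$, $g = \cos$ are assumptions made so that the integral $\int_x^{x+t} g$ has the closed form $\sin(x+t) - \sin(x)$.

```python
import math

def F(h):
    """x-axis translation arc field: f -> f(. + h)."""
    return lambda f: (lambda x: f(x + h))

def G(h, g):
    """Vector-space translation arc field: f -> f + h*g."""
    return lambda f: (lambda x: f(x) + h * g(x))

def euler_curve(f, g, t, n):
    """Apply G_{t/n} F_{t/n} to f a total of n times."""
    h = t / n
    for _ in range(n):
        f = G(h, g)(F(h)(f))
    return f

x, t = 0.3, 1.0
approx = euler_curve(math.sin, math.cos, t, 200)(x)
exact = math.sin(x + t) + (math.sin(x + t) - math.sin(x))  # f(x+t) + integral of g
print(approx, exact)
```

The Euler curve is a left Riemann sum for the integral, so the gap shrinks like $1/n$.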
Example 98 Consider the effect of multiplication $aF$ on $F_t(f)(x) := f(x+t)$. First, if $a$ is a constant,
$$(aF)_t(f)(x) = F_{at}(f)(x) = f(x + at)$$
which changes the speed of the $x$-axis translation.
If we want to apply the metric space arithmetic of §2.1 to a nonconstant function $a$ then it needs to be a function $a : M \to \mathbb{R}$. E.g., $a(f) := \|f\|_1$ gives $(aF)_t(f)(x) := f(x + \|f\|_1 t)$ which is its own flow and is the solution to the PDE $f_t = \|f\|_1 f_x$. This 1D PDE is equivalent to Newton's equation on a line (see [5, p. 2]).
Another possibility asserts itself here. Though it doesn't make sense in terms of the metric space arithmetic defined above, let us formally insert the function $a := x$. Then
$$(aF)_t(f)(x) = F_{at}(f)(x) = f(x + xt) = f([1+t]x).$$
$aF$ is not a flow now:
$$\begin{aligned}
(aF)_{s+t}(f)(x) &= f([1+s+t]x) \\
(aF)_t (aF)_s (f)(x) &= f([1+s][1+t]x).
\end{aligned}$$
The arc field $aF$ actually has flow given by $x$-axis dilation
$$H_t(f)(x) := f\left(e^t x\right).$$
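A quick numerical sketch shows the Euler curves of $aF$ converging to the dilation flow. It is illustrative only: the name `euler_dilation` is an assumption, and the code uses the observation that each step $(aF)_{t/n}$ replaces $x$ by $(1 + t/n)x$, so $n$ steps scale $x$ by $(1 + t/n)^n$.

```python
import math

def euler_dilation(f, t, n):
    """n-fold composition of (aF)_{t/n} with a(x) = x, returned as a function of x."""
    scale = 1.0
    for _ in range(n):
        scale *= 1.0 + t / n   # each arc-field step rescales the argument
    return lambda x: f(scale * x)

x, t = 0.5, 1.0
approx = euler_dilation(math.sin, t, 10_000)(x)
exact = math.sin(math.exp(t) * x)   # H_t(f)(x) = f(e^t x)
print(approx, exact)
```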
Checking well-posedness requires choosing a metric, e.g., pick $M := L^1(\mathbb{R})$. Condition E1 follows from
$$\begin{aligned}
d\big((aF)_t(f), (aF)_t(g)\big) &= \int_{-\infty}^{\infty} |f(x+xt) - g(x+xt)|\,dx \\
&= \frac{1}{1+t} \int_{-\infty}^{\infty} |f(u) - g(u)|\,du \le d(f,g) \sum_{n=0}^{\infty} |t|^n \le d(f,g)\,(1 + |t|\Lambda)
\end{aligned}$$
when $|t| < 1$, guaranteeing uniqueness of solutions. The solution $H$ exists, as given above, but checking E2 hits a snag:
$$\begin{aligned}
d\big((aF)_{s+t}(f), (aF)_t (aF)_s (f)\big) &= \int_{-\infty}^{\infty} |f([1+s+t]x) - f([1+s+t+st]x)|\,dx \\
&= \frac{1}{1+s+t} \int_{-\infty}^{\infty} |f(u) - f(u + stx)|\,du \le |st|\, \|xf'(x)\|_1 = O(st)
\end{aligned}$$
for $s, t > 0$, but only when $xf'(x)$ is integrable.
Picking $M := C^1_c(\mathbb{R})$, the set of $C^1$ functions with compact support, e.g., with metric $d(f,g) := \|f - g\|_\infty + \|f' - g'\|_\infty$ gives
$$d\big((aF)_t(f), (aF)_t(g)\big) = \sup_{x\in\mathbb{R}} |f(x+xt) - g(x+xt)| = d(f,g)$$
and
$$\begin{aligned}
d\big((aF)_{s+t}(f), (aF)_t (aF)_s (f)\big) &= \sup_{x\in\mathbb{R}} |f([1+s+t]x) - f([1+s+t]x + stx)| \\
&\le |st| \sup_{x\in\mathbb{R}} |xf'(x)| = O(st).
\end{aligned}$$
Example 99 Continuing the nonstandard arithmetic of Example 98, consider $aG$ for $G_t(f) := f + tg$ where $a(x)$ is a function:
$$(aG)_t(f)(x) = G_{a(x)t}(f)(x) = f(x) + ta(x)g(x)$$
so $(aG)_t(f) = f + t(a \cdot g)$. If the curve $G_t(f)$ is thought of as a vector in $L^2$ or $L^1$ starting at $f$ and moving in the direction of $g$, then multiplying $G$ by a suitably bounded function $a$ gives a new direction $aG$, akin to rotating the vector $G$.
$aG$ is clearly its own flow and satisfies E2; checking E1,
$$d\big((aG)_{f_1}(t), (aG)_{f_2}(t)\big) = d(f_1, f_2)$$
proves we have unique solutions.
Example 100 Let $X_t(f)(x) := f(x) + tf(x+t)$ for $a, b \in \mathbb{R}$. The Euler curves are
$$X^{(n)}_{t/n}(f)(x) = \sum_{k=0}^{n} \frac{t^k}{n^k} \binom{n}{k} f(x + kt) \quad\text{so}\quad F_t(f)(x) = \sum_{k=0}^{\infty} \frac{t^k}{k!} f(x + kt).$$
Notice on any Banach space of functions, using the notation of the translation operator $\tau_z : \mathbb{R} \to \mathbb{R}$ with $\tau_z(x) = x + z$, we have
$$\begin{aligned}
d\big(X_t(f), F_t(f)\big) &= \left\| f + t\tau_t f - \sum_{k=0}^{\infty} \frac{t^k}{k!} \tau_{kt} f \right\| = \left\| \sum_{k=2}^{\infty} \frac{t^k}{k!} \tau_{kt} f \right\| \\
&\le \sum_{k=2}^{\infty} \frac{t^k}{k!} \|\tau_{kt} f\| = \|f\| \sum_{k=2}^{\infty} \frac{t^k}{k!} = o(t).
\end{aligned}$$
Applying the formulas $F_{s+t} = F_t F_s$, etc., to various functions $f$ gives new series identities.
Try calculating limits of Euler curves for $F + G$ with various $G$, such as $G_t(f) := f + tg$ using the model of Example 97.
Similar results hold for the generalization $X_t(f)(x) := f(x) + tf(x + (at + b))$.
6.2 PDEs as arc ﬁelds
Consider the simplest PDE
$$f_t = f_x \qquad (6.1)$$
with initial condition $f_0 : \mathbb{R} \to \mathbb{R}$, i.e., $f(x,0) = f_0(x)$. The solution is translation $f(x,t) = f_0(x+t)$. To cast (6.1) in the idiom of arc fields we therefore may write $X_t(f_0)(x) := f_0(x+t)$ or equivalently $X_t(f)(x) := f(x+t)$, which is obviously its own flow. Alternately we may write (6.1) as
$$f_r = \frac{\partial f(x,r)}{\partial r} = \lim_{t\to 0} \frac{f(x,r+t) - f(x,r)}{t} = \frac{\partial f(x,r)}{\partial x} = f_x$$
or
$$\frac{f(x,r+t) - f(x,r)}{t} = f_x + O(t)$$
or
$$f(x,r+t) = f + tf_x + o(t)$$
which, as a second arc field, would be written simply $X'_t(f) := f + tf_x$ with $X \sim X'$. Diagrammatically the PDE-to-arc field translation is
$$\begin{array}{ccc}
f_t = f_x & \Longleftrightarrow & X_t(f)(x) := f(x+t) \\
\Updownarrow & & \wr \\
f(x,r+t) = f + tf_x + o(t) & \Longleftrightarrow & X'_t(f) := f + tf_x.
\end{array}$$
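Similarly consider the PDE
$$f_t = f_{xx} \qquad (6.2)$$
with initial condition $f_0 : \mathbb{R} \to \mathbb{R}$. The solution is diffusion
$$f(x,t) = \frac{1}{\sqrt{4\pi t}} \int_{-\infty}^{\infty} e^{-(x-y)^2/(4t)} f_0(y)\,dy.$$
Diagrammatically the PDE-to-arc field translation of (6.2) is
$$\begin{array}{ccc}
f_t = f_{xx} & \Longleftrightarrow & Y_t(f)(x) := \dfrac{1}{\sqrt{4\pi t}} \displaystyle\int_{-\infty}^{\infty} e^{-(x-y)^2/(4t)} f(y)\,dy \\
\Updownarrow & & \wr \\
f(x,r+t) = f + tf_{xx} + o(t) & \Longleftrightarrow & Y'_t(f) := f + tf_{xx}
\end{array}$$
where $Y$ is restricted to being a forward arc field.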
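The tangency of the two diffusion arc fields can be checked numerically at a point. This is a rough sketch, not the author's method: the names `heat_arc` and `tangent_arc` are hypothetical, the test function $f(x) = e^{-x^2}$ (with $f_{xx} = (4x^2 - 2)e^{-x^2}$) is an arbitrary choice, and the integral is evaluated by a midpoint rule after the substitution $y = x + 2\sqrt{t}\,u$.

```python
import math

def f(x):
    return math.exp(-x * x)

def f_xx(x):
    return (4 * x * x - 2) * math.exp(-x * x)

def heat_arc(func, x, t, half_width=8.0, steps=2000):
    """Y_t(func)(x): heat-kernel convolution via midpoint rule in u, y = x + 2*sqrt(t)*u."""
    du = 2 * half_width / steps
    total = 0.0
    for j in range(steps):
        u = -half_width + (j + 0.5) * du
        total += math.exp(-u * u) * func(x + 2.0 * math.sqrt(t) * u)
    return total * du / math.sqrt(math.pi)

def tangent_arc(x, t):
    """The derivative-based arc field f + t*f_xx evaluated at x."""
    return f(x) + t * f_xx(x)

def gap(t, x=0.3):
    return abs(heat_arc(f, x, t) - tangent_arc(x, t))

print(gap(0.01), gap(0.001))
```

Dividing $t$ by 10 shrinks the gap by roughly a factor of 100, consistent with an $o(t)$ difference at this point.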
It is easy to check both $X$ and $Y$ satisfy E1 and E2 on $L^1(\mathbb{R})$, and further $X$ & $Y$ close and even commute, $X_s Y_t = Y_t X_s$. Consequently we have existence and uniqueness of solutions to a family of PDEs
$$\begin{array}{ccc}
f_t = af_x + bf_{xx} & \Longleftrightarrow & Z = aX + bY \\
\Updownarrow & & \wr \\
f(x,r+t) = f + t\,(af_x + bf_{xx}) + o(t) & \Longleftrightarrow & Z_t = f + t\,(af_x + bf_{xx})
\end{array}$$
with
$$Z_t = f + t\,(af_x + bf_{xx}) = (f + atf_x) + btf_{xx} = Y_{bt} X_{at}(f) = (aX' + bY')(f)$$
where $a$ and $b$ may be any locally Lipschitz functions of $f$, i.e., Lipschitz continuous functionals $a, b : L^1(\mathbb{R}) \to \mathbb{R}$. When $a$ and $b$ are constants the solutions calculated with Euler curves are
$$f(t,x) = \lim_{n\to\infty} Z^{(n)}_{t/n}(f_0(x)) = (aX + bY)_t(f_0(x))$$
since $X$ and $Y$ commute. With nonconstant $a$ and $b$ the Euler curves still converge, but are not as easy to simplify.
Translating Example 11 to the current notation, let $W_t(f) := f + t\Psi(f)$. When $\Psi : M \to M$ is locally Lipschitz on a Banach space, E1 and E2 are satisfied. Now adding $W$ is straightforward and gives solutions to the family
$$f_t = af_x + bf_{xx} + c\Psi(f).$$
To extend these results to higher-dimensional, higher-order, time-dependent PDEs, apply Remark 13. To determine the existence of long-time solutions, apply Theorem 25 adjusted in accord with the forward-flows ideas of Section 1.2.
This proves well-posedness for a family of PDEs, but the most eagerly sought-after well-posedness results are for a wider choice of coefficients, not $a, b, c : M \to \mathbb{R}$ as above. Conflating the approach of this section with Examples 98 and 99 extends the family. E.g.,
$$\begin{array}{ccc}
f_t = xf_x & \Longleftrightarrow & X_t(f)(x) := f(x + xt) \\
\Updownarrow & & \wr \\
f(x,r+t) = f + txf_x + o(t) & \Longleftrightarrow & X'_t(f) := f + txf_x.
\end{array}$$
The claim $X \sim X'$ follows from $\lim_{t\to 0} \frac{f(x+xt) - f(x)}{t} = xf_x$. The solution is the flow $F_t(f)(x) := f(e^{t}x)$, so if $xf_x$ is a component of a PDE we call it dilation.
To see why nonlinear PDEs are notoriously difficult to solve, consider the simplest, which in arc field language is
$$\begin{array}{ccc}
f_t = f f_x & \Longleftrightarrow & X_t(f)(x) := f(x + tf(x)) \\
\Updownarrow & & \wr \\
f(x,r+t) = f + tf f_x + o(t) & \Longleftrightarrow & X'_t(f) := f + tf f_x.
\end{array}$$
The claim $X \sim X'$ follows from $\lim_{t\to 0} \frac{f(x + tf(x)) - f(x)}{t} = f f_x$. Checking Conditions E1 and E2 quickly becomes convoluted. This equation is related to the convective derivative from fluid mechanics.
Exercise 101 Apply the above results to a space of divergence-free vector fields and solve the Navier-Stokes equations. Finding the precise metric that works for both the $f f_x$ component and the diffusion $f_{xx}$ component, yet keeps the arc field at linear speed, may be a challenge; but if you act now, you'll get the coffee maker, the furniture, and... \$1,000,000!
Chapter 7
Flows on $H(\mathbb{R}^n)$
The idea of generalizing vector fields to arc fields to study flows on a metric space is natural and simple and was independently arrived at by several authors [51], [7], [18]. All of the instigators had the same space in mind, $H(\mathbb{R}^n)$. This space, with its rich modeling capabilities, was originally used by Hausdorff to give a new topology on function spaces by comparing the distance between the graphs of functions. Later the space became a successful environment for generating fractals as detailed in §7.1. In this chapter we will apply the metric space results to $H(\mathbb{R}^n)$, introducing novel dynamics with previously unimaginable modeling capabilities.
7.1 IFS
One route towards generating a fractal is via a so-called iterated function system (IFS). For the sake of completeness, we review the procedure; a more leisurely treatment is found in [8]. Let $\|\cdot\|$ denote the usual Euclidean norm on $\mathbb{R}^n$ and use the metric space $H(\mathbb{R}^n)$ from Example 6. Denote $\alpha \vee \beta := \max\{\alpha, \beta\}$ and $\alpha \wedge \beta := \min\{\alpha, \beta\}$.
We prove the following lemma, since one might expect the cross terms $d_H(a_1, b_2)$ and $d_H(a_2, b_1)$ to appear on the right-hand side as well.
Lemma 102 For any $a_1, a_2, b_1, b_2 \in H(\mathbb{R}^n)$,
$$d_H(a_1 \cup a_2,\, b_1 \cup b_2) \le d_H(a_1, b_1) \vee d_H(a_2, b_2).$$
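Proof. We have
$$\begin{aligned}
d_H(a_1 \cup a_2,\, b_1 \cup b_2) &= \max_{y \in b_1 \cup b_2} d(a_1 \cup a_2, y) \vee \max_{x \in a_1 \cup a_2} d(x, b_1 \cup b_2) \\
&= \Big[ \max_{y \in b_1} d(a_1 \cup a_2, y) \vee \max_{y \in b_2} d(a_1 \cup a_2, y) \Big] \vee \Big[ \max_{x \in a_1} d(x, b_1 \cup b_2) \vee \max_{x \in a_2} d(x, b_1 \cup b_2) \Big] \\
&\le \Big[ \max_{y \in b_1} d(a_1, y) \vee \max_{y \in b_2} d(a_2, y) \Big] \vee \Big[ \max_{x \in a_1} d(x, b_1) \vee \max_{x \in a_2} d(x, b_2) \Big] \\
&= \Big[ \max_{y \in b_1} d(a_1, y) \vee \max_{x \in a_1} d(x, b_1) \Big] \vee \Big[ \max_{y \in b_2} d(a_2, y) \vee \max_{x \in a_2} d(x, b_2) \Big] \\
&= d_H(a_1, b_1) \vee d_H(a_2, b_2).
\end{aligned}$$
More generally,
$$d_H\Big( \bigcup_{1 \le i \le k} a_i,\, \bigcup_{1 \le j \le k} b_j \Big) \le \max_{1 \le i \le k} d_H(a_i, b_i).$$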
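Lemma 102 can be tried out on finite point sets, which are compact and hence elements of $H(\mathbb{R}^2)$. The sketch below is illustrative; the function name `hausdorff` and the particular sets are assumptions made for the example.

```python
import math

def hausdorff(a, b):
    """Hausdorff distance between two finite, nonempty point sets."""
    fwd = max(min(math.dist(x, y) for y in b) for x in a)
    bwd = max(min(math.dist(x, y) for x in a) for y in b)
    return max(fwd, bwd)

a1, a2 = [(0.0, 0.0), (1.0, 0.0)], [(5.0, 5.0)]
b1, b2 = [(0.0, 1.0)], [(5.0, 6.0), (6.0, 5.0)]

lhs = hausdorff(a1 + a2, b1 + b2)                 # d_H(a1 ∪ a2, b1 ∪ b2)
rhs = max(hausdorff(a1, b1), hausdorff(a2, b2))   # d_H(a1, b1) ∨ d_H(a2, b2)
print(lhs, rhs)
```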
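Now let $f_i : \mathbb{R}^n \to \mathbb{R}^n$, for $1 \le i \le k$, be affine contractions with respective Lipschitz constants $c_i < 1$. Let $f : H(\mathbb{R}^n) \to H(\mathbb{R}^n)$ be given by $f(a) := \bigcup_{i=1}^{k} f_i(a)$ where $f_i(a) := \{f_i(z) : z \in a\}$. Using Lemma 102, the following shows $f$ is a contraction mapping on $H(\mathbb{R}^n)$ with contraction factor $c := \max_{1 \le i \le k} c_i < 1$:
$$\begin{aligned}
d_H(f(a), f(b)) &= d_H\Big( \bigcup_{i=1}^{k} f_i(a),\, \bigcup_{i=1}^{k} f_i(b) \Big) \le \max_{1 \le i \le k} d_H(f_i(a), f_i(b)) \\
&\le \max_{1 \le i \le k} \Big[ \max_{y \in f_i(b)} d(f_i(a), y) \vee \max_{x \in f_i(a)} d(x, f_i(b)) \Big] \\
&\le \max_{1 \le i \le k} \Big[ \max_{y \in b} \big( c_i\, d(a, y) \big) \vee \max_{x \in a} \big( c_i\, d(x, b) \big) \Big] \\
&\le \max_{1 \le i \le k} c_i\, d_H(a, b) = c\, d_H(a, b).
\end{aligned}$$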
By the Contraction Mapping Theorem, Theorem 119, the iterates of $f$ starting with any point in $H(\mathbb{R}^n)$ converge to a unique fixed point of $f$ in $H(\mathbb{R}^n)$. For many choices of the IFS, $f$, this fixed point is an interesting fractal. For example, if we choose $n = 2$ and
$$f_1(x) := \tfrac{1}{2}x, \qquad f_2(x) := \tfrac{1}{2}x + \left(\tfrac{1}{2}, 0\right), \qquad f_3(x) := \tfrac{1}{2}x + \left(\tfrac{1}{4}, \tfrac{1}{2}\right)$$
then the fixed point of $f$ is the famous Sierpinski triangle (Figure 7.1).

Figure 7.1: Fixed point of a discrete flow on $H(\mathbb{R}^2)$
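The contraction estimate can be observed directly by iterating the three maps of the Sierpinski example on a finite starting set. This is a minimal sketch; the names `ifs_step` and `SHIFTS`, the starting set $\{(0,0)\}$, and the number of iterations are illustrative assumptions.

```python
import math

def hausdorff(a, b):
    """Hausdorff distance between two finite, nonempty point sets."""
    fwd = max(min(math.dist(x, y) for y in b) for x in a)
    bwd = max(min(math.dist(x, y) for x in a) for y in b)
    return max(fwd, bwd)

# Translation parts of f_1, f_2, f_3; each map is x -> x/2 + shift.
SHIFTS = [(0.0, 0.0), (0.5, 0.0), (0.25, 0.5)]

def ifs_step(a):
    """f(a) = f_1(a) ∪ f_2(a) ∪ f_3(a) for a finite set a of points in the plane."""
    return {(0.5 * x + sx, 0.5 * y + sy) for (x, y) in a for (sx, sy) in SHIFTS}

sets = [{(0.0, 0.0)}]
for _ in range(5):
    sets.append(ifs_step(sets[-1]))

# Distances between successive iterates should shrink by at least the factor c = 1/2.
gaps = [hausdorff(sets[i], sets[i + 1]) for i in range(5)]
print(gaps)
```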
7.2 Continuous IFS
As an arc field application, we now create a continuous version of the above process. For $x, y \in \mathbb{R}^n$, let $\lambda_{xy} : \mathbb{R} \to \mathbb{R}^n$ be the line from $x$ to $y$ defined by
$$\lambda_{xy}(t) := (1-t)x + ty.$$
Define $\lambda_{ab} : \mathbb{R} \to H(\mathbb{R}^n)$ by
$$\lambda_{ab}(t) := \bigcup_{x \in a,\, y \in b} \{\lambda_{xy}(t)\} = \{\lambda_{xy}(t) \mid x \in a,\ y \in b\}$$
so $\lambda_{ab}(0) = a$ and $\lambda_{ab}(1) = b$. With regard to the black and white film clip analogy from §0.4, $\lambda_{ab}$ would be a morph from photo $a$ to photo $b$.
Let $\operatorname{diam}(a) := \max_{x, y \in a} \|x - y\|$.
Proposition 103 We have
$$d_H(\lambda_{ab}(t), \lambda_{ab}(t')) \le \operatorname{diam}(a \cup b)\, |t - t'|.$$
Consequently, for any $a, b \in H(\mathbb{R}^n)$, $\lambda_{ab} : \mathbb{R} \to H(\mathbb{R}^n)$ is Lipschitz continuous (i.e., has bounded speed), and the length $L(\lambda_{ab}|_{[0,1]})$ of $\lambda_{ab}$ restricted to the domain $[0,1]$ satisfies $d_H(a, b) \le L(\lambda_{ab}|_{[0,1]}) \le \operatorname{diam}(a \cup b)$.
Proof. Use Lemma 102 on the definition of $\lambda_{ab}$:
$$d_H(\lambda_{ab}(t), \lambda_{ab}(t')) \le \max_{x \in a,\, y \in b} \big\| (1-t)x + ty - [(1-t')x + t'y] \big\| = |t - t'| \max_{x \in a,\, y \in b} \|x - y\|.$$
Some more crucial properties of the curves $\lambda_{ab}$ are ultimately consequences of the estimates
$$d(\lambda_{xy}(t), \lambda_{x'y'}(t')) \le \min \left\{ \begin{array}{l} d(x,y)\,|t' - t| + |1 - t'|\, d(x, x') + |t'|\, d(y, y'), \\ d(x',y')\,|t' - t| + |1 - t|\, d(x, x') + |t|\, d(y, y') \end{array} \right\} \qquad (7.1)$$
and
$$d\big( \lambda_{xy}(s+h),\, \lambda_{\lambda_{xy}(s)\lambda_{yz}(s)}(h) \big) \le |sh|\, \big( d(x,y) + d(y,z) \big) \qquad (7.2)$$
whose proofs are straightforward. Using (7.1) it is then not hard to prove
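Proposition 104 For $a, b, a', b' \in H(\mathbb{R}^n)$ and $t, t' \in \mathbb{R}$, we have
$$\begin{aligned}
d_H(\lambda_{ab}(t), \lambda_{a'b'}(t')) \le{} & \big( \operatorname{diam}(a \cup b) \wedge \operatorname{diam}(a' \cup b') \big)\, |t' - t| \\
& + \big( 1 - (|t| \wedge |t'|) \big)\, d_H(a, a') + (|t| \vee |t'|)\, d_H(b, b'),
\end{aligned} \qquad (7.3)$$
where $\alpha \wedge \beta := \min\{\alpha, \beta\}$.
Corollary 105 For $a, a', b, b' \in H(\mathbb{R}^n)$ and $t \in \mathbb{R}$, we have
$$d_H(\lambda_{ab}(t), \lambda_{a'b'}(t)) \le |1 - t|\, d_H(a, a') + |t|\, d_H(b, b').$$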
The following proposition will be used in verifying Condition E2 for the arc field defined below in (7.4).
Proposition 106
$$d_H\big( \lambda_{ab}(s+h),\, \lambda_{\lambda_{ab}(s)\lambda_{bc}(s)}(h) \big) \le |sh|\, [\operatorname{diam}(a \cup b) + \operatorname{diam}(b \cup c)]$$
for any $a, b, c \in H(\mathbb{R}^n)$.
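Proof. We have
$$d_H\big( \lambda_{ab}(s+h),\, \lambda_{\lambda_{ab}(s)\lambda_{bc}(s)}(h) \big) = \Big[ \max_{z \in \lambda_{\lambda_{ab}(s)\lambda_{bc}(s)}(h)} d(\lambda_{ab}(s+h), z) \Big] \vee \Big[ \max_{z \in \lambda_{ab}(s+h)} d\big( z, \lambda_{\lambda_{ab}(s)\lambda_{bc}(s)}(h) \big) \Big].$$
Using (7.2), we have
$$\begin{aligned}
\max_{z \in \lambda_{\lambda_{ab}(s)\lambda_{bc}(s)}(h)} d(\lambda_{ab}(s+h), z) &= \max_{z \in \lambda_{\lambda_{ab}(s)\lambda_{bc}(s)}(h)} \Big\{ \min_{x \in a,\, y \in b} d(\lambda_{xy}(s+h), z) \Big\} \\
&= \max_{x' \in a,\, y' \in b,\, z' \in c} \Big\{ \min_{x \in a,\, y \in b} d\big( \lambda_{xy}(s+h),\, \lambda_{\lambda_{x'y'}(s)\lambda_{y'z'}(s)}(h) \big) \Big\} \\
&\le \max_{x' \in a,\, y' \in b,\, z' \in c} d\big( \lambda_{x'y'}(s+h),\, \lambda_{\lambda_{x'y'}(s)\lambda_{y'z'}(s)}(h) \big) \\
&\le |sh| \max_{x' \in a,\, y' \in b,\, z' \in c} \{ d(x', y') + d(y', z') \} \\
&\le |sh|\, [\operatorname{diam}(a \cup b) + \operatorname{diam}(b \cup c)]
\end{aligned}$$
while
$$\begin{aligned}
\max_{w \in \lambda_{ab}(s+h)} d\big( w, \lambda_{\lambda_{ab}(s)\lambda_{bc}(s)}(h) \big) &= \max_{w \in \lambda_{ab}(s+h)} \Big\{ \min_{x \in a,\, y \in b,\, z \in c} d\big( w, \lambda_{\lambda_{xy}(s)\lambda_{yz}(s)}(h) \big) \Big\} \\
&= \max_{x' \in a,\, y' \in b} \Big\{ \min_{x \in a,\, y \in b,\, z \in c} d\big( \lambda_{x'y'}(s+h),\, \lambda_{\lambda_{xy}(s)\lambda_{yz}(s)}(h) \big) \Big\} \\
&\le \max_{x' \in a,\, y' \in b} \Big\{ \min_{z \in c} d\big( \lambda_{x'y'}(s+h),\, \lambda_{\lambda_{x'y'}(s)\lambda_{y'z}(s)}(h) \big) \Big\} \\
&\le |sh| \max_{x' \in a,\, y' \in b} \Big\{ \min_{z \in c} \{ d(x', y') + d(y', z) \} \Big\} \\
&\le |sh|\, [\operatorname{diam}(a \cup b) + \operatorname{diam}(b \cup c)].
\end{aligned}$$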
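Estimate (7.2), which drives the proof above, follows from the identity $\lambda_{xy}(s+h) - \lambda_{\lambda_{xy}(s)\lambda_{yz}(s)}(h) = sh\,[(y - x) + (y - z)]$ obtained by expanding both sides. The sketch below spot-checks the inequality on random points in $\mathbb{R}^3$; the helper names `lam` and `check_72`, the seed, and the sampling ranges are assumptions made for illustration.

```python
import math
import random

def lam(p, q, t):
    """Point at parameter t on the line from p to q: (1 - t) p + t q."""
    return tuple((1 - t) * pi + t * qi for pi, qi in zip(p, q))

def check_72(x, y, z, s, h):
    """Return (left side, right side) of estimate (7.2)."""
    lhs = math.dist(lam(x, y, s + h), lam(lam(x, y, s), lam(y, z, s), h))
    rhs = abs(s * h) * (math.dist(x, y) + math.dist(y, z))
    return lhs, rhs

rng = random.Random(0)
results = []
for _ in range(1000):
    x, y, z = (tuple(rng.uniform(-5, 5) for _ in range(3)) for _ in range(3))
    s, h = rng.uniform(-1, 1), rng.uniform(-1, 1)
    results.append(check_72(x, y, z, s, h))

print(all(lhs <= rhs + 1e-9 for lhs, rhs in results))
```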
We define an arc field $X : H(\mathbb{R}^n) \times [-1, 1] \to H(\mathbb{R}^n)$ by
$$X_t(a) := \bigcup_{i=1}^{k} \lambda_{af_i(a)}(t) = \lambda_{af(a)}(t). \qquad (7.4)$$
The continuity of $X$ follows from (7.3) of Proposition 104, while Proposition 103 provides a speed function $\rho : H(\mathbb{R}^n) \to \mathbb{R}^+$, namely $\rho(a) = \operatorname{diam}(a \cup f(a))$. $X$ has linear speed growth in the sense of Definition 24. Indeed, for $b \in B_{d_H}(a, r)$, using the contractivity of the $f_i$, we have $\rho(b) = \operatorname{diam}(b \cup f(b)) \le \operatorname{diam}(a \cup f(a)) + 2r = \rho(a) + 2r$ and hence $\rho(a, r) \le \rho(a) + 2r$. Even without contractivity, if we assume the $f_i$ are globally Lipschitz we still have linear speed growth since $k$ is finite, and so Theorem 25 gives a global flow on $H(\mathbb{R}^n)$ once we verify E1 and E2 below.
With this choice of $X$, one might expect the points of $H(\mathbb{R}^n)$ to move under the flow toward the attractor of the IFS, but our aim is to show they flow toward the convex hull of the attractor. To this end we will employ Theorem 31, so we restrict $X$ to being a forward arc field. First let's check Condition E1. By Lemma 102,
$$\begin{aligned}
d_H(X_a(t), X_b(t)) &= d_H\big( \lambda_{af(a)}(t),\, \lambda_{bf(b)}(t) \big) \\
&= d_H\Big( \bigcup_{i=1}^{k} \lambda_{af_i(a)}(t),\, \bigcup_{i=1}^{k} \lambda_{bf_i(b)}(t) \Big) \\
&\le \max_{1 \le i \le k} d_H\big( \lambda_{af_i(a)}(t),\, \lambda_{bf_i(b)}(t) \big).
\end{aligned}$$
Using Corollary 105, for each $i \in \{1, \ldots, k\}$, we have
$$\begin{aligned}
d_H\big( \lambda_{af_i(a)}(t),\, \lambda_{bf_i(b)}(t) \big) &\le (1-t)\, d_H(a, b) + t\, d_H(f_i(a), f_i(b)) \\
&\le (1-t)\, d_H(a, b) + tc_i\, d_H(a, b) \\
&\le d_H(a, b)\, (1 + t(c_i - 1)).
\end{aligned}$$
Thus, with
$$\Lambda := \max_{1 \le i \le k} c_i - 1 < 0 \qquad (7.5)$$
Condition E1 is satisfied.
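We now verify Condition E2. As the $f_j$ send lines to lines in $\mathbb{R}^n$,
$$f\big( \lambda_{af(a)}(s) \big) = \bigcup_{i=1}^{k} f_i(\lambda_{af(a)}(s)) = \bigcup_{i=1}^{k} \lambda_{f_i(a)f_i(f(a))}(s) \subseteq \lambda_{f(a)f(f(a))}(s).$$
Then using Proposition 106, we have
$$\begin{aligned}
d_H\big( X_a(s+h),\, X_{X_a(s)}(h) \big) &= d_H\big( \lambda_{af(a)}(s+h),\, \lambda_{[\lambda_{af(a)}(s)][f(\lambda_{af(a)}(s))]}(h) \big) \\
&\le d_H\big( \lambda_{af(a)}(s+h),\, \lambda_{[\lambda_{af(a)}(s)][\lambda_{f(a)f(f(a))}(s)]}(h) \big) \\
&\le hs\, [\operatorname{diam}(a \cup f(a)) + \operatorname{diam}(f(a) \cup f(f(a)))].
\end{aligned}$$
Thus
$$\Omega(a) := 2\operatorname{diam}(a \cup f(a) \cup f(f(a)))$$
satisfies Condition E2.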
7.3 Fixed points
By Theorem 31, the negativity of $\Lambda$ at line (7.5) guarantees a unique fixed point for the forward flow $F$ of the arc field $X$. We now show this fixed point is the convex hull of the attractor of the associated IFS. The convex hull of a set $P \subset \mathbb{R}^n$ (denoted $C(P)$) is the (convex) intersection of all convex subsets of $\mathbb{R}^n$ containing $P$. The convex hull $C(\{x_0, \ldots, x_n\})$ of $\{x_0, \ldots, x_n\} \subseteq \mathbb{R}^n$ is an $n$-simplex, where we allow the volume of $C(\{x_0, \ldots, x_n\})$ to be 0. It is standard that
$$C(\{x_0, \ldots, x_n\}) = \left\{ \sum_{i=0}^{n} \alpha_i x_i \;\middle|\; \alpha_i \in [0, 1],\ \sum_{i=0}^{n} \alpha_i = 1 \right\}. \qquad (7.6)$$
For $f : \mathbb{R}^n \to \mathbb{R}^n$ affine and $\alpha_i$ as in (7.6), $f\left( \sum_{i=0}^{n} \alpha_i x_i \right) = \sum_{i=0}^{n} \alpha_i f(x_i)$. Thus,
$$f(C(\{x_0, \ldots, x_n\})) = C(f(\{x_0, \ldots, x_n\})).$$
In order to demonstrate that the unique fixed point of the flow $F$ generated by the arc field $X_a(t) = \lambda_{af(a)}(t)$ is the convex hull $C(A)$ of the attractor $A$ of the IFS (i.e., $A$ is the fixed point of $f(\cdot) := \cup_i f_i(\cdot)$), it suffices to prove $X_{C(A)}(t) = \lambda_{C(A)f(C(A))}(t)$ is actually constant for $0 \le t \le \frac{1}{n+1}$. Then, the constant curve $\alpha : \mathbb{R}^+ \to H(\mathbb{R}^n)$ defined by $\alpha(t) := C(A)$ is a solution curve of $X$, and $C(A)$ is the fixed point of $F$. We use the following theorem (proven, for instance, in [32, p. 10]) which is another characterization of convex hulls in $\mathbb{R}^n$.
Theorem 107 (Carathéodory's theorem) The convex hull of a set $P \subset \mathbb{R}^n$ is the union of all $n$-simplices with vertices in $P$.
Thus, for example, in the plane the convex hull of a set $P$ consists of the union of all filled triangles with vertices in $P$. In this case, every point $x \in C(P)$ is inside a triangle with vertices in $P$.
Proposition 108 For the attractor $A$ of an IFS $f(\cdot) = \cup_i f_i(\cdot)$, we have $f(C(A)) \subseteq C(A)$.
Proof. Let $p \in f(C(A))$. Then by Theorem 107, for some $i \in \{1, \ldots, k\}$, $p = f_i(q)$ for some $q$ in some $n$-simplex $C(\{x_0, \ldots, x_n\})$ with vertices $x_0, \ldots, x_n$ in $A$. Since $A = f(A) = \cup_i f_i(A)$, we have $f_i(\{x_0, \ldots, x_n\}) \subseteq f_i(A) \subseteq A$ and hence $C(f_i(\{x_0, \ldots, x_n\})) \subseteq C(A)$. Thus,
$$p = f_i(q) \in f_i(C(\{x_0, \ldots, x_n\})) = C(f_i(\{x_0, \ldots, x_n\})) \subseteq C(A).$$
The inclusion in Proposition 108 is often strict.
Proposition 109 For $\{x_0, \ldots, x_n\} \subseteq \mathbb{R}^n$, we have
$$\lambda_{C(\{x_0, \ldots, x_n\})\{x_0, \ldots, x_n\}}(t) = C(\{x_0, \ldots, x_n\}) \quad\text{for } 0 \le t \le \frac{1}{n+1}.$$
Proof. We have
$$\lambda_{C(\{x_0, \ldots, x_n\})\{x_0, \ldots, x_n\}}(t) = \bigcup_{i=0}^{n} \{ \lambda_{px_i}(t) \mid p \in C(\{x_0, \ldots, x_n\}) \}.$$
Thus, $\lambda_{C(\{x_0, \ldots, x_n\})\{x_0, \ldots, x_n\}}(t)$ is the union of $n+1$ $n$-simplices, each of which is shrinking into one of the vertices as $t \to 1$, and we need to show this union is all of $C(\{x_0, \ldots, x_n\})$ for $0 \le t \le \frac{1}{n+1}$; i.e., for any $t \in \left[0, \frac{1}{n+1}\right]$, every $q \in C(\{x_0, \ldots, x_n\})$ can be written as
$$\lambda_{px_i}(t) = (1-t)p + tx_i$$
for some $p \in C(\{x_0, \ldots, x_n\})$ and some $i \in \{0, \ldots, n\}$. Since $q = \sum_{i=0}^{n} \alpha_i x_i$ where $\sum_{i=0}^{n} \alpha_i = 1$, one of the $\alpha_i \ge 0$, say $\alpha_0$, is in $\left[\frac{1}{n+1}, 1\right]$. Then
$$q = \sum_{i=1}^{n} \alpha_i x_i + \alpha_0 x_0 = (1-t) \left[ \frac{\alpha_0 - t}{1-t} x_0 + \sum_{i=1}^{n} \frac{\alpha_i}{1-t} x_i \right] + tx_0.$$
Observe $\alpha_0 - t \ge 0$ for $t \in \left[0, \frac{1}{n+1}\right]$, and
$$\frac{\alpha_0 - t}{1-t} + \sum_{i=1}^{n} \frac{\alpha_i}{1-t} = \frac{\alpha_0 - t + \sum_{i=1}^{n} \alpha_i}{1-t} = \frac{1-t}{1-t} = 1,$$
so
$$p := \frac{\alpha_0 - t}{1-t} x_0 + \sum_{i=1}^{n} \frac{\alpha_i}{1-t} x_i \in C(\{x_0, \ldots, x_n\}),$$
and $q = \lambda_{px_0}(t)$ as required.
It remains to show
$$\lambda_{C(A)f(C(A))}(t) = C(A) \quad\text{for } 0 \le t \le \frac{1}{n+1}.$$
Since $f(C(A)) \subseteq C(A)$ by Proposition 108, we have
$$\lambda_{C(A)f(C(A))}(t) \subseteq \lambda_{C(A)C(A)}(t) = C(A).$$
For the reverse inclusion, note by Proposition 109 for $0 \le t \le \frac{1}{n+1}$
$$\begin{aligned}
\lambda_{C(A)f(C(A))}(t) &\supseteq \bigcup_{\{x_0, \ldots, x_n\} \subseteq A} \lambda_{C(\{x_0, \ldots, x_n\})\{x_0, \ldots, x_n\}}(t) \\
&= \bigcup_{\{x_0, \ldots, x_n\} \subseteq A} C(\{x_0, \ldots, x_n\}) = C(A).
\end{aligned}$$
In summary, we have proven
Theorem 110 Let $f(\cdot) = \cup_{i=1}^{k} f_i(\cdot)$ be the IFS determined by contractive affine maps $f_1, \ldots, f_k$ of $\mathbb{R}^n$, and let $A$ be the unique fixed point of $f$. The arc field $X$ on $H(\mathbb{R}^n)$ defined by $X_a = \lambda_{af(a)}$ generates a contractive forward flow $F : H(\mathbb{R}^n) \times [0, \infty) \to H(\mathbb{R}^n)$, whose (unique) fixed point is the convex hull $C(A)$ of $A$.
7.4 Cyclically attracted sets
Consider the following differential equations in $\mathbb{R}^n$ where $x(t)$, $y(t)$, and $z(t)$ represent curves in $\mathbb{R}^n$.
$$\begin{aligned}
dx/dt &= y - x \\
dy/dt &= z - y \\
dz/dt &= x - z.
\end{aligned} \qquad (7.7)$$
Its solutions are three curves spiraling toward each other; $x(t)$ moves toward $y(t)$, $y(t)$ moves toward $z(t)$, and $z(t)$ moves toward $x(t)$. Let's define an arc field that describes a similar situation on $H(\mathbb{R}^n)$.
On the Cartesian product
$$H^3(\mathbb{R}^n) := H(\mathbb{R}^n) \times H(\mathbb{R}^n) \times H(\mathbb{R}^n)$$
use the complete metric
$$d((a, b, c), (a', b', c')) := d_H(a, a') + d_H(b, b') + d_H(c, c').$$
Define the arc field $X$ on $H^3(\mathbb{R}^n)$ by
$$X_{(a,b,c)} := (\lambda_{ab}, \lambda_{bc}, \lambda_{ca}).$$
In §7.2 properties of $\lambda_{ab}$ were demonstrated that make it easy to check E1 and E2.
The projections of the solution onto each of its three coordinates give curves $a(t)$, $b(t)$, and $c(t)$ in $H(\mathbb{R}^n)$ beginning respectively at the initial points $a$, $b$, and $c$ and attracted to each other cyclically. As a special case, if $a$, $b$, and $c$ are individual points in $\mathbb{R}^n \subset H(\mathbb{R}^n)$, the projections of the solutions to the arc field $X$ with those initial conditions are identical to the solutions of the differential equation (7.7).
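In the single-point special case, an Euler curve of the arc field $X_{(a,b,c)} = (\lambda_{ab}, \lambda_{bc}, \lambda_{ca})$ is just Euler's method for (7.7). The sketch below is illustrative; the names `lam` and `step`, the starting triangle, and the step size are assumptions. The three points spiral into their common centroid, which each step preserves.

```python
import math

def lam(p, q, t):
    """Point at parameter t on the line from p to q."""
    return tuple((1 - t) * pi + t * qi for pi, qi in zip(p, q))

def step(state, h):
    """One arc-field step: a moves toward b, b toward c, c toward a."""
    a, b, c = state
    return (lam(a, b, h), lam(b, c, h), lam(c, a, h))

state = ((0.0, 0.0), (3.0, 0.0), (0.0, 3.0))
centroid = (1.0, 1.0)
for _ in range(2000):          # Euler curve up to time 20 with step h = 0.01
    state = step(state, 0.01)

print([math.dist(p, centroid) for p in state])
```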
7.5 Control theory
Let $\{f_i\}_{i \in I}$ be a free control system where $M$ is a manifold and $f_i : M \to TM$ are vector fields. Let $R_{\{f_i\}}$ denote the reachable set, i.e., the set of all points reachable by solutions of
$$\sum_{i \in I} p_i(t) f_i$$
where $p_i : \mathbb{R} \to \mathbb{R}$ are unconstrained control parameters. This suggests a set-valued approach, so we again use $H(M)$, the set of all nonvoid compact subsets of $M$, to find $R_{\{f_i\}}$.
Consider the map $\Phi : H(M) \to H(TM)$ defined by
$$\Phi(a) := \left\{ \sum_{i \in I} t_i f_i(a) \;\middle|\; \sum_{i \in I} |t_i| \le 1 \right\}$$
and the arc field on $H(M)$ given by
$$\begin{aligned}
X_t(a) &:= a + t\Phi(a) = \bigcup_{x \in a,\, y \in \Phi(a)} \{x + ty\} && \text{when } M \text{ is a linear space} \\
X_t(a) &:= \bigcup_{\{t_i\} :\, \sum_{i \in I} |t_i| \le 1} F^{\{t_i\}}_t(a) && \text{generally.}
\end{aligned}$$
Here $F^{\{t_i\}}$ is the flow of the vector field $\sum_{i \in I} t_i f_i$ and $F^{\{t_i\}}_t(a) = \left\{ F^{\{t_i\}}_t(x) \mid x \in a \right\}$.
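We check E1 and E2. Then the reachable set $R_{\{f_i\}}$ is the limit of the flow, and Euler curves give a constructible (if computationally impractical) means for approximating the reachable set. Denoting $a \vee b := \max\{a, b\}$ we check E1 on a linear space:
$$\begin{aligned}
d_H(X_t(a), X_t(b)) &= \Big[ \max_{z \in X_t(b)} d\Big( \bigcup_{x \in a,\, y \in \Phi(a)} \{x + ty\},\, z \Big) \Big] \vee \Big[ \max_{z \in X_t(a)} d\Big( z,\, \bigcup_{x' \in b,\, y' \in \Phi(b)} \{x' + ty'\} \Big) \Big] \\
&= \Big[ \max_{x' \in b,\, y' \in \Phi(b)}\ \min_{x \in a,\, y \in \Phi(a)} \| x + ty - (x' + ty') \| \Big] \vee \Big[ \max_{x \in a,\, y \in \Phi(a)}\ \min_{x' \in b,\, y' \in \Phi(b)} \| x + ty - (x' + ty') \| \Big] \\
&\le \Big[ \max_{x' \in b,\, y' \in \Phi(b)}\ \min_{x \in a,\, y \in \Phi(a)} \{ \|x - x'\| + |t|\, \|y - y'\| \} \Big] \vee \Big[ \max_{x \in a,\, y \in \Phi(a)}\ \min_{x' \in b,\, y' \in \Phi(b)} \{ \|x - x'\| + |t|\, \|y - y'\| \} \Big] \\
&\le d_H(a, b) + |t|\, d_H(\Phi(a), \Phi(b)) \le d_H(a, b) + |t| \max_i d_H(f_i(a), f_i(b)) \\
&\le d_H(a, b) + |t|\, d_H(a, b) \max_i K_i
\end{aligned}$$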
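and E2:
$$\begin{aligned}
X_t X_s(a) &= X_t\Big( \bigcup_{x \in a,\, y \in \Phi(a)} \{x + sy\} \Big) \\
&= \bigcup \Big\{ \{x' + ty'\} \;\Big|\; x' \in \bigcup_{x \in a,\, y \in \Phi(a)} \{x + sy\},\ \ y' \in \Phi\Big( \bigcup_{x \in a,\, y \in \Phi(a)} \{x + sy\} \Big) \Big\} \\
&= \bigcup_{x \in a,\, y \in \Phi(a)}\ \bigcup_{x' \in \Phi(a),\, y' \in \Phi^{(2)}(a)} \{ (x + sy) + t(x' + sy') \}
\end{aligned}$$
so
$$\begin{aligned}
d_H(X_{s+t}(a), X_t X_s(a)) &= \Big[ \max_{z \in X_t X_s(a)} d\Big( \bigcup_{x \in a,\, y \in \Phi(a)} \{x + (s+t)y\},\, z \Big) \Big] \\
&\qquad \vee \Big[ \max_{z \in X_{s+t}(a)} d\Big( z,\, \bigcup_{x \in a,\, y \in \Phi(a)}\ \bigcup_{x' \in \Phi(a),\, y' \in \Phi^{(2)}(a)} \{ (x + sy) + t(x' + sy') \} \Big) \Big] \\
&= \Big[ \max_{\substack{x \in a,\, x' \in \Phi(a) \\ y \in \Phi(a),\, y' \in \Phi^{(2)}(a)}}\ \min_{x \in a,\, y \in \Phi(a)} \| x + (s+t)y - ((x + sy) + t(x' + sy')) \| \Big] \\
&\qquad \vee \Big[ \max_{x \in a,\, y \in \Phi(a)}\ \min_{\substack{x \in a,\, x' \in \Phi(a) \\ y \in \Phi(a),\, y' \in \Phi^{(2)}(a)}} \| x + (s+t)y - ((x + sy) + t(x' + sy')) \| \Big] \\
&= |t| \left( \Big[ \max_{\substack{x \in a,\, x' \in \Phi(a) \\ y \in \Phi(a),\, y' \in \Phi^{(2)}(a)}}\ \min_{x \in a,\, y \in \Phi(a)} \| y - (x' + sy') \| \Big] \vee \Big[ \max_{x \in a,\, y \in \Phi(a)}\ \min_{\substack{x \in a,\, x' \in \Phi(a) \\ y \in \Phi(a),\, y' \in \Phi^{(2)}(a)}} \| y - (x' + sy') \| \Big] \right) \\
&\le |st| \left( \Big[ \max_{\substack{x \in a,\, x' \in \Phi(a) \\ y \in \Phi(a),\, y' \in \Phi^{(2)}(a)}}\ \min_{x \in a,\, y \in \Phi(a)} \| y' \| \Big] \vee \Big[ \max_{x \in a,\, y \in \Phi(a)}\ \min_{\substack{x \in a,\, x' \in \Phi(a) \\ y \in \Phi(a),\, y' \in \Phi^{(2)}(a)}} \| y' \| \Big] \right) \\
&\le |st| \max_{y' \in \Phi^{(2)}(a)} \| y' \| = O(st).
\end{aligned}$$
These calculations also verify E1 and E2 on a manifold via localization.
Examples similar to this one give part of the motivation for introducing mutational analysis [7] and quasidifferential equations [51].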
Chapter 8
Counterexamples
Example 111 For computational purposes, we would much prefer to use the original arc fields $X$ and $Y$ in the definition of the bracket $[X, Y]$ instead of their flows $F$ and $G$ (particularly for examples with PDEs). The current example, however, shows this is not generally feasible. Let us use the bracket
$$\{X, Y\}(x, t) := \begin{cases}
Y_{-\sqrt{t}}\, X_{-\sqrt{t}}\, Y_{\sqrt{t}}\, X_{\sqrt{t}}(x) & \text{for } t \ge 0 \\
X_{-\sqrt{|t|}}\, Y_{-\sqrt{|t|}}\, X_{\sqrt{|t|}}\, Y_{\sqrt{|t|}}(x) & \text{for } t < 0
\end{cases}$$
instead of
$$[X, Y](x, t) := \begin{cases}
G_{-\sqrt{t}}\, F_{-\sqrt{t}}\, G_{\sqrt{t}}\, F_{\sqrt{t}}(x) & \text{for } t \ge 0 \\
F_{-\sqrt{|t|}}\, G_{-\sqrt{|t|}}\, F_{\sqrt{|t|}}\, G_{\sqrt{|t|}}(x) & \text{for } t < 0.
\end{cases}$$
Let
$$X_t(f)(x) := f + tf_x \quad\text{and}\quad F_t(f) := f(x + t)$$
so that on, e.g., $M := L^1(\mathbb{R}) \cap C^1(\mathbb{R})$ the flow of $X$ is $F$ since
$$\begin{aligned}
d(X_t(f), F_t(f)) &= \int |f(x+t) - f(x) - tf'(x)|\,dx \\
&= |t| \int \left| \frac{f(x+t) - f(x)}{t} - f'(x) \right| dx = o(t)
\end{aligned}$$
($M$ is not complete with the $L^1$ norm/metric, but $F$ is still the flow of $X$). However, due to the presence of the square root of $t$ it is still conceivable that $\{X, X\} \nsim [X, X]$. In fact
$$\begin{aligned}
\{X, X\}_{t^2}(f) ={}& f + tf_x + t(f_x + tf_{xx}) - t\,[f_x + tf_{xx} + t(f_{xx} + tf_{xxx})] \\
& - t\,\big( f_x + tf_{xx} + t(f_{xx} + tf_{xxx}) - t\,[f_{xx} + tf_{xxx} + t(f_{xxx} + tf_{xxxx})] \big) \\
={}& f - t^2\, 2f_{xx} + t^4 f_{xxxx}
\end{aligned}$$
therefore
$$d(\{X, X\}_t(f), Y_t(f)) = o(t)$$
where $Y_t(f) := f - 2tf_{xx}$ for $f \in C^2$ and we see $\{X, X\} \nsim \{F, F\} = 0$. Consequently, if we want any geometric information about flows from the bracket, it is important not to interchange flows and arc fields in calculations.
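The expansion of $\{X, X\}_{t^2}$ can be double-checked by treating $X_h = I + hD$ as a polynomial in the derivative operator $D = \partial/\partial x$, which is legitimate for smooth $f$ since these constant-coefficient operators commute. This is a small sketch; the names `poly_mul`, `arc_step`, and `arc_bracket` are illustrative, and polynomials are stored as coefficient lists in increasing powers of $D$.

```python
def poly_mul(p, q):
    """Multiply two polynomials in D given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def arc_step(h):
    """X_h = I + h*D as a polynomial in D."""
    return [1, h]

def arc_bracket(t):
    """X_{-t} X_{-t} X_t X_t, i.e. {X, X} evaluated at parameter t**2."""
    result = [1]
    for h in (t, t, -t, -t):          # rightmost factor is applied first
        result = poly_mul(arc_step(h), result)
    return result

# With t = 2 we expect I - 2 t^2 D^2 + t^4 D^4 = I - 8 D^2 + 16 D^4.
print(arc_bracket(2))
```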
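Example 112 If $X \approx F$ and $Y \approx G$ then for $t > 0$
$$\begin{aligned}
& d(\{X, Y\}(x, t), [X, Y](x, t)) \\
&= d\big( Y_{-\sqrt{t}} X_{-\sqrt{t}} Y_{\sqrt{t}} X_{\sqrt{t}}(x),\ G_{-\sqrt{t}} F_{-\sqrt{t}} G_{\sqrt{t}} F_{\sqrt{t}}(x) \big) \\
&\le d\big( Y_{-\sqrt{t}} X_{-\sqrt{t}} Y_{\sqrt{t}} X_{\sqrt{t}}(x),\ Y_{-\sqrt{t}} F_{-\sqrt{t}} G_{\sqrt{t}} F_{\sqrt{t}}(x) \big) + d\big( Y_{-\sqrt{t}} F_{-\sqrt{t}} G_{\sqrt{t}} F_{\sqrt{t}}(x),\ G_{-\sqrt{t}} F_{-\sqrt{t}} G_{\sqrt{t}} F_{\sqrt{t}}(x) \big) \\
&\le d\big( X_{-\sqrt{t}} Y_{\sqrt{t}} X_{\sqrt{t}}(x),\ F_{-\sqrt{t}} G_{\sqrt{t}} F_{\sqrt{t}}(x) \big) \big( 1 + \Lambda_Y \sqrt{t} \big) + O\big( (\sqrt{t})^2 \big) \\
&\le \ldots \le d\big( X_{\sqrt{t}}(x),\ F_{\sqrt{t}}(x) \big) \big( 1 + \Lambda_Y \sqrt{t} \big)^2 \big( 1 + \Lambda_X \sqrt{t} \big) + 3\,O(t) \\
&= O(t) \neq o(t).
\end{aligned}$$
Since there are circumstances when all of these inequalities are equalities, this estimate is tight. Consequently even 2nd-order tangency is not enough to allow the indiscriminate use of arc fields to directly calculate the bracket.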
Example 113 (local uniformity necessary for Theorem 63)
Our results apply not only to all Lipschitz vector fields, but also to some discontinuous vector fields. Consider the vector field $f : \mathbb{R}^2 \to \mathbb{R}^2$ given by
\[
f(x) = f(x_1, x_2) :=
\begin{cases}
(1, 0) & x_1 < x_2 \\
(0, -1) & x_1 \ge x_2.
\end{cases}
\]
Though discontinuous, $f$ still has a unique (continuous) flow $F$ given by
\[
F_x(t) :=
\begin{cases}
(x_1 + t,\ x_2) & x_1 < x_2,\ x_1 + t \le x_2 \\
(x_2,\ x_2 - (t - (x_2 - x_1))) & x_1 < x_2,\ x_1 + t \ge x_2 \\
(x_1,\ x_2 - t) & x_1 \ge x_2,\ x_1 + t \ge x_2 \\
(x_1 + t - (x_2 - x_1),\ x_1) & x_1 \ge x_2,\ x_1 + t \le x_2.
\end{cases}
\]
Another vector field on $\mathbb{R}^2$ given by the constant function $g(x) := (1, 0)$ has flow $G_x(t) := x + t(1, 0)$. Calculate their arc field bracket to find it is tangent to the constant 0 flow
\[
d([X, Y]_t(x), 0_t(x)) = d([X, Y]_t(x), x) = o(t)
\]
at every $x \in M = \mathbb{R}^2$ but not locally uniformly $o(t)$ near the line of discontinuity $x_1 = x_2$. Consequently Theorem 63 on commutativity does not apply, and in fact commutativity does not hold since, e.g.,
\[
G_{10} F_{10}(-1, 0) = G_{10}(0, -9) = (10, -9)
\]
while
\[
F_{10} G_{10}(-1, 0) = F_{10}(9, 0) = (9, -10).
\]
$F$ and $G$ also satisfy
\[
d(G_s F_t(x), F_t G_s(x)) = O(|st|)
\]
at each point, though not locally uniformly, so they do not close. Their sum $F + G$ is well defined, though.
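The failure of commutativity in Example 113 is easy to reproduce numerically. The sketch below is our own illustration, not code from the text: it implements only the forward-time ($t \ge 0$) flow of $f$, obtained by following $(1, 0)$ until the diagonal $x_1 = x_2$ is reached and $(0, -1)$ afterwards, together with the translation flow $G$.

```python
def F(t, x):
    """Forward flow (t >= 0) of f = (1, 0) if x1 < x2, and (0, -1) if x1 >= x2."""
    x1, x2 = x
    if x1 < x2:
        hit = x2 - x1                    # time needed to reach the diagonal x1 = x2
        if t <= hit:
            return (x1 + t, x2)          # still moving right
        return (x2, x2 - (t - hit))      # reached the diagonal, then moves down
    return (x1, x2 - t)                  # on or below the diagonal: moves down

def G(t, x):
    """Flow of the constant vector field g = (1, 0)."""
    x1, x2 = x
    return (x1 + t, x2)

start = (-1, 0)
gf = G(10, F(10, start))   # first F, then G
fg = F(10, G(10, start))   # first G, then F
```

The two orders of composition land on different points, matching the values computed in the example.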
Example 114 Let $F$ be the flow derived from the discontinuous vector field $f$ as in Example 113, except extended to $\mathbb{R}^3$,
\[
f(x_1, x_2, x_3) :=
\begin{cases}
(1, 0, 0) & x_1 < x_2 \\
(0, -1, 0) & x_1 \ge x_2
\end{cases}
\]
and define a new flow $Z_t(x) := (x_1, x_2, x_3 + t)$. Then their bracket is identically 0 (and so locally uniformly tangent to 0), and these flows do commute and foliate $\mathbb{R}^3$ like pages in a book opened to a right angle, or stacked, bent sheet metal. So the locus of discontinuity matches up with a space of relative equilibrium (perpendicular flows).
Appendix A: Metric spaces
“Metric spaces are everywhere.” Mischa Gromov.
A metric space $(M, d)$ is a set $M$ with a function $d : M \times M \to \mathbb{R}$, called the metric, which is positive, definite, symmetric and satisfies the triangle inequality:
(i) $d(x, y) \ge 0$ positivity
(ii) $d(x, y) = 0$ iff $x = y$ definiteness (or nondegeneracy)
(iii) $d(x, y) = d(y, x)$ symmetry
(iv) $d(x, y) \le d(x, z) + d(z, y)$ triangle inequality
for all $x, y, z \in M$. Maurice Fréchet introduced metric spaces in 1906 [34], though the term was coined by Felix Hausdorff, who gave the first extensive exploration of their properties in 1914 [38].
.1 Examples
Example 115 For our purposes the most important metric space is $M = \mathbb{R}$, the real number line, with metric $d(x, y) = |x - y|$. Properties (i)-(iv) are easy to verify, and $\mathbb{R}$ is complete by definition. Next we might take $M = \mathbb{R}^n$ with $d(x, y) = \|x - y\|$ where $\|\cdot\|$ denotes the Euclidean norm, $\|z\| := \sqrt{\sum_{i=1}^n z_i^2}$ for $z = (z_1, \dots, z_n) \in M$.
More generally any vector space with a norm $\|\cdot\|$ gives a metric space by using $d(x, y) := \|x - y\|$. A normed vector space which is complete is called a Banach space. Conversely a metric $d$ on a vector space $M$ which is translation invariant
\[
d(x, y) = d(x + z, y + z)
\]
for all $z \in M$ and homogeneous
\[
d(rx, ry) = |r| \, d(x, y)
\]
for all $r \in \mathbb{R}$ gives a norm $\|x\| := d(x, 0)$ on $M$.
Other common examples of metrics on $\mathbb{R}^n$ which come from norms include the taxicab metric with
\[
d_1(x, y) := \sum_{i=1}^n |x_i - y_i|
\]
(compute $B(x, r)$ to see why it's also called the “diamond metric”) and the Chebyshev metric
\[
d_\infty(x, y) := \max_i |x_i - y_i| =: \|x - y\|_\infty
\]
which is also called the supremum metric.
[Figure: the unit spheres $\|x\|_1 = 1$, $\|x\|_2 = 1$ and $\|x\|_\infty = 1$.]
More generally, the Minkowski distance$^1$ of order $p \ge 1$ is defined as
\[
d_p(x, y) := \left( \sum_{i=1}^n |x_i - y_i|^p \right)^{1/p}
\]
and it happens that $\lim_{p \to \infty} d_p(x, y) = d_\infty(x, y)$ for any fixed $x, y \in \mathbb{R}^n$.
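As a small numerical illustration of the limit $d_p \to d_\infty$, the following sketch (function names `minkowski` and `chebyshev` are ours) evaluates the Minkowski distance for increasing $p$ on one pair of points in $\mathbb{R}^2$.

```python
def minkowski(x, y, p):
    """Minkowski distance of order p >= 1 between equal-length sequences."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def chebyshev(x, y):
    """Supremum (Chebyshev) distance."""
    return max(abs(a - b) for a, b in zip(x, y))

x, y = (0, 0), (3, 4)
# Distances for p = 1, 2, 8, 64; they decrease toward chebyshev(x, y) as p grows.
distances = [minkowski(x, y, p) for p in (1, 2, 8, 64)]
limit = chebyshev(x, y)
```

For this pair the taxicab, Euclidean and Chebyshev distances are 7, 5 and 4 respectively, and the values for larger $p$ sit between 5 and 4.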
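If we take $n \to \infty$ in $\mathbb{R}^n$ we get the set of real sequences
\[
\mathbb{R}^{\mathbb{N}} := \{ x = (x_i)_{i=1}^\infty = (x_1, x_2, \dots) \mid x_i \in \mathbb{R} \text{ for } i \in \mathbb{N} \}.
\]
The $l^p$ metrics ($p \ge 1$)
\[
d_p(x, y) := \left( \sum_{i=1}^\infty |x_i - y_i|^p \right)^{1/p} = \|x - y\|_p
\]
on the sets
\[
l^p(\mathbb{R}) := \{ x \in \mathbb{R}^{\mathbb{N}} \mid d_p(x, 0) < \infty \}
\]
also give metric spaces. Here 0 represents the constant sequence $0 = (0, 0, \dots) \in \mathbb{R}^{\mathbb{N}}$. Here again we have the supremum metric
\[
d_\infty(x, y) := \sup_{i \in \mathbb{N}} |x_i - y_i| = \|x - y\|_\infty
\]
and again $\lim_{p \to \infty} d_p(x, y) = d_\infty(x, y)$ is valid.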
Finally we can further generalize to the $L^p$ spaces on subsets of the space of real functions
\[
\mathbb{R}^{\mathbb{R}} := \{ f : \mathbb{R} \to \mathbb{R} \}.
\]
$^1$ Not to be confused with the pseudo-Riemannian Minkowski metric fundamental to special relativity theory.
The $L^p$ metrics ($p \ge 1$) are given by
\[
d_p(f, g) := \left( \int |f - g|^p \, d\mu \right)^{1/p} = \|f - g\|_p.
\]
Here $\mu$ is the Lebesgue measure, but there is a new difficulty: if $d_p(f, g) = 0$ we may still have $f \ne g$ on a set of measure 0 and so property (ii) is invalid. The standard trick is to introduce the equivalence relation $f \sim g$ when they agree on a set of full measure. Then $d_p$ is a genuine metric on the quotient space
\[
L^p(\mathbb{R}) := \{ f \in \mathbb{R}^{\mathbb{R}} \mid d_p(f, 0) < \infty \} / \sim
\]
where now 0 is the constant function $0(x) = 0$ for all $x \in \mathbb{R}$. Again we have the supremum metric $d_\infty(f, g) := \operatorname{ess\,sup} |f - g|$ where now ess sup refers to the essential supremum
\[
\operatorname{ess\,sup}(f) := \inf \{ r \in \mathbb{R} \mid f(x) < r \text{ for almost all } x \}
\]
on the set $L^\infty(\mathbb{R})$, so that $\|f\|_\infty = \operatorname{ess\,sup} |f|$. Again $\lim_{p \to \infty} d_p(f, g) = d_\infty(f, g)$ holds.
$L^2(\mathbb{R})$ is a particularly important space as it is a Hilbert space, i.e., a complete inner product space. The inner product $\langle \cdot, \cdot \rangle$ is given by
\[
\langle f, g \rangle := \int f(x) g(x) \, d\mu(x)
\]
and we have $\langle f, f \rangle = \|f\|_2^2$. Constructing the norm from the inner product in this way shows all Hilbert spaces are Banach spaces. Hilbert spaces are the starting point for the subject of functional analysis [26].
Another useful addition is to change the measure in $L^p$. Define the $L^p_w$ norm by
\[
d_p(f, g) := \left( \int |f(x) - g(x)|^p \, w(x) \, dx \right)^{1/p} = \|f - g\|_{p,w}.
\]
The set of functions with bounded $p, w$ norm is denoted $L^p_w(\mathbb{R})$. Here $w$ is the weight function, which is assumed to satisfy $w(x) > 0$ and $\int w(x) \, dx = 1$. In this book $L^2_G$ is particularly useful, where $G$ denotes the Gaussian $G(x) := e^{-x^2} / \sqrt{\pi}$.
In the spaces of the two previous examples, $\mathbb{R}$ can be replaced by the set of complex numbers $\mathbb{C}$, and the claims remain valid.
Example 116 All of the above examples derive from normed vector spaces, but metric spaces are much more general. E.g., the space
\[
C^\infty_b(\mathbb{R}) := \left\{ f : \mathbb{R} \to \mathbb{R} : \sup_{x \in \mathbb{R}} \left| \frac{d^n f(x)}{dx^n} \right| < \infty \ \ \forall n \in \mathbb{N} \right\}
\]
is a natural set to investigate for physical situations. Physicists usually assume their objects of concern are smooth and bounded. So it is extremely unsettling that $C^\infty_b(\mathbb{R})$ does not have a complete norm. However, it does have a few complete metrics, including
\[
d(f, g) := \sum_{n=0}^\infty 2^{-n} \frac{\|f - g\|_{[n]}}{1 + \|f - g\|_{[n]}}
\]
where
\[
\|f\|_{[n]} := \sup_{x \in \mathbb{R}} \left| \frac{d^n f(x)}{dx^n} \right|
\]
and the weights $2^{-n}$ guarantee that the series converges.
Here is another general situation where a space without a norm can still be given a metric: Begin with an arbitrary set $S$ and any complete metric space $(M, d)$. Consider the set of all bounded functions from $S$ to $M$, where $f : S \to M$ is bounded if
\[
f(S) := \{ f(x) \mid x \in S \} \subset B(x_0, r) \subset M
\]
for some $x_0 \in M$ and $0 < r < \infty$. Then this set of functions is complete under the supremum metric
\[
d_\infty(f, g) := \sup_{x \in S} d(f(x), g(x)).
\]
Example 117 Euclidean distance on $\mathbb{R}^2$ gives a metric on the 1-sphere
\[
S^1 := \{ x \in \mathbb{R}^2 \mid \|x\| = 1 \}
\]
by restriction. This is the extrinsic metric. The intrinsic metric $d_I(x, y)$ is defined to be the length of the shortest path in $S^1 \subset \mathbb{R}^2$ connecting $x$ and $y$. This example is immediately generalized to the $n$-sphere $S^n \subset \mathbb{R}^{n+1}$. A choice of metric on $\mathbb{R}^{n+1}$ other than the Euclidean distance will give new extrinsic and intrinsic metrics on $S^n$.
On the next simplest manifold, the torus $T^2$, there are three natural choices of metric. Viewing $T^2$ as embedded in $\mathbb{R}^3$ we have the extrinsic metric and the intrinsic metric defined in the same way as the metrics on $S^1$. The third metric is the flat metric. Remember
\[
T^2 := S^1 \times S^1 = \{ x = (x_1 \bmod 2\pi, x_2 \bmod 2\pi) = (x_1, x_2) \bmod 2\pi \mid x_1, x_2 \in \mathbb{R} \}
\]
then the flat metric is
\[
d_F(x, y) := \min_{j,k,m,n \in \mathbb{Z}} \| (x_1 + 2m\pi, x_2 + 2n\pi) - (y_1 + 2j\pi, y_2 + 2k\pi) \|.
\]
We call this the flat metric because the space is equivalent to a flat piece of paper for which opposite edges are identified (via the modulo $2\pi$ operation). This metric is geometrically inequivalent to the others since geodesics are different, as is the curvature.
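The three metrics of Example 117 can be compared numerically. The sketch below is our own illustration: points of $S^1$ are represented by angles, and the minimum over integer shifts in $d_F$ is computed by taking the shortest angular gap in each coordinate separately, which is valid because the Euclidean norm is increasing in the absolute value of each component. All function names are hypothetical.

```python
import math

def extrinsic_s1(a, b):
    """Chord length between the points at angles a, b on the unit circle in R^2."""
    return math.hypot(math.cos(a) - math.cos(b), math.sin(a) - math.sin(b))

def intrinsic_s1(a, b):
    """Length of the shortest arc between angles a, b on the unit circle."""
    gap = (a - b) % (2 * math.pi)
    return min(gap, 2 * math.pi - gap)

def flat_torus(x, y):
    """Flat metric d_F on T^2 for points given as pairs of angles."""
    return math.hypot(intrinsic_s1(x[0], y[0]), intrinsic_s1(x[1], y[1]))

# Two points close to each other across the identified edge x1 = 0 ~ 2*pi.
d = flat_torus((0.1, 0.1), (2 * math.pi - 0.1, 0.1))
```

On the circle the two metrics are related by chord $= 2 \sin(\text{arc}/2)$, so the extrinsic distance never exceeds the intrinsic one.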
Example 118 The metric space $H(\mathbb{R}^n)$ is the set of all nonempty compact subsets of $\mathbb{R}^n$. Using the simplifying notation $d(x, a) := \inf_{y \in a} \{ d(x, y) \} =: d(a, x)$ for $x \in \mathbb{R}^n$ and $a \subset \mathbb{R}^n$, the Hausdorff metric on $H(\mathbb{R}^n)$ is given by
\[
d_H(a, b) := \max \left\{ \sup_{x \in a} \{ d(x, b) \},\ \sup_{y \in b} \{ d(y, a) \} \right\}. \tag{1}
\]
An equivalent definition for $d_H$ may aid intuition: define
\[
B(a, r) := \bigcup_{x \in a} B(x, r)
\]
for $a \subset \mathbb{R}^n$; then
\[
d_H(a, b) = \inf \{ r \mid b \subset B(a, r) \wedge a \subset B(b, r) \}.
\]
$H(\mathbb{R}^n)$ has several useful topological properties in common with $\mathbb{R}^n$. It is separable, complete and even locally compact (separability is obvious by considering finite subsets of $\mathbb{R}^n$; for completeness, see [8]; for local compactness, see [33, p. 183]).
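For finite subsets of $\mathbb{R}^n$ the suprema and infima in (1) are maxima and minima, so the Hausdorff metric can be computed directly. A minimal sketch (names `dist_point_set` and `hausdorff` are ours; `math.dist` requires Python 3.8 or later):

```python
import math

def dist_point_set(x, a):
    """d(x, a): distance from the point x to the finite set a."""
    return min(math.dist(x, y) for y in a)

def hausdorff(a, b):
    """Hausdorff distance (1) between two nonempty finite subsets of R^n."""
    return max(max(dist_point_set(x, b) for x in a),
               max(dist_point_set(y, a) for y in b))

a = [(0.0, 0.0), (1.0, 0.0)]
b = [(0.0, 0.0), (1.0, 0.0), (1.0, 3.0)]
d_ab = hausdorff(a, b)
```

Here every point of `a` lies in `b`, so the distance is determined entirely by the extra point $(1, 3)$ of `b`, which is 3 away from `a`.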
For any metric space $(M, d)$ the space of nonvoid compact subsets $H(M)$ may be metrized with $d_H$ defined without change as (1). Again, if $M$ is complete (or separable, or locally compact) then $(H(M), d_H)$ is also.
An interesting generalization is to consider the space $M_{GH}$ of all nonvoid compact metric spaces. The Gromov-Hausdorff distance is a complete metric on $M_{GH}$ defined by
\[
d_{GH}(M_1, M_2) := \inf \{ d_H(f_1(M_1), f_2(M_2)) \}
\]
taken over all metric spaces $M$ and all isometric embeddings $f_i : M_i \to M$ for $i = 1, 2$. This space is complete and separable.
With Riemannian geometry we may prove Gromov's compactness theorem, which states that the set of Riemannian manifolds with Ricci curvature $\ge c$ and diameter $\le D$ is precompact in the Gromov-Hausdorff metric [37].
In 2003 Perelman [53] used continuous dynamics on the metric space $M_{GH}$, i.e., Ricci flow, to validate Thurston's geometrization conjecture. Several essential ideas come from metric geometry, including Gromov's compactness theorem, which is no surprise since his earlier accomplishments were in Aleksandrov metric geometry and his thesis advisor was Burago [14].
.2 Properties
Here we tersely list results from elementary point-set topology relevant to metric spaces. The choice of results covered includes facts used implicitly throughout this text and also facts appropriate for furthering the program of generalizing differential geometry and analysis theorems initiated in this book. Most of these ideas may be generalized to topological spaces; on metric spaces the presentation is more natural and simplified. Proofs of the facts in this section are commonly available in elementary topology texts; [49] is recommended.
A sequence $(x_n)$ in a metric space $M$ is a map $x : \mathbb{N} \to M$. The image of $n \in \mathbb{N}$ is typically denoted $x_n \in M$ instead of $x(n)$. The point $x^* \in M$ is the limit of $(x_n)$ if for any $\epsilon > 0$ there exists $N \in \mathbb{N}$ such that $d(x_n, x^*) < \epsilon$ for all $n > N$; this is denoted $x^* = \lim_{n \to \infty} x_n$ or $x_n \to x^*$. A sequence with a limit is convergent, and a sequence with no limit is divergent.
A subset $S \subset M$ has closure $\overline{S}$ in $M$ defined by
\[
\overline{S} := \left\{ x \in M \mid \exists (x_n) \subset S \text{ with } x = \lim_{n \to \infty} x_n \right\}.
\]
$S$ is a closed subset of $M$ if $S = \overline{S}$; and $S$ is an open subset if its complement $M \setminus S$ is closed. $S$ is dense in $M$ if $\overline{S} = M$. If $S$ is open and dense in $M$ then the property of belonging to $S$ is generic.
.2.1 Regularity
A map $f : M \to N$ between metric spaces is continuous at $x \in M$ if for any $\epsilon > 0$ there exists $\delta > 0$ such that $d_N(f(x), f(y)) < \epsilon$ for all $y \in M$ such that $d_M(x, y) < \delta$. The map $f$ is continuous if $f$ is continuous at every $x \in M$. If for every convergent sequence $x_n \to x^*$ we have $f(x_n) \to f(x^*)$ then $f$ is sequentially continuous. Continuity and sequential continuity are equivalent; also $f$ is continuous iff for any open set $U \subset N$ the set $f^{-1}(U)$ is open in $M$. A map $f$ is uniformly continuous if for any $\epsilon > 0$ there exists $\delta > 0$ such that $d_N(f(x), f(y)) < \epsilon$ for all $x, y \in M$ such that $d_M(x, y) < \delta$. A neighborhood of a point $x \in M$ is an open set containing $x$. A map $f$ is locally uniformly continuous if for each $x \in M$ there exists a neighborhood on which $f$ is uniformly continuous.
A sequence of functions $f_n : M \to N$ converges uniformly to a function $f : M \to N$ if for any $\epsilon > 0$ there exists $n_0$ such that
\[
d_N(f_n(x), f(x)) < \epsilon
\]
for all $n > n_0$ and all $x \in M$ (in other words $d_\infty(f_n, f) < \epsilon$ for all $n > n_0$, which is why $d_\infty$ is occasionally referred to as the uniform metric). The Uniform Limit Theorem states that if the $f_n$ are continuous and converge uniformly to $f$, then $f$ is continuous.
A map $f : (M, d_M) \to (N, d_N)$ between metric spaces is Lipschitz continuous if there exists $K \ge 0$ such that
\[
d_N(f(x_1), f(x_2)) \le K \, d_M(x_1, x_2) \tag{2}
\]
for all $x_1, x_2 \in M$. The number $K$ is a Lipschitz constant for $f$. The infimum of all such Lipschitz constants is denoted $K_f$. The map $f$ is locally Lipschitz continuous if for each $x \in M$ there exists a neighborhood on which $f$ is Lipschitz. $f$ is a contraction if it has a Lipschitz constant $K_f < 1$. In this case $K_f$ is called its contraction factor.
Lipschitz continuity is a natural regularity restriction on a metric space; it is closely related to smooth maps by Rademacher's Theorem: a locally Lipschitz map $f : \mathbb{R}^n \to \mathbb{R}^m$ is differentiable at almost every point $x \in \mathbb{R}^n$. Metric space versions of Rademacher's Theorem are also valid [3]. Smoothness on $\mathbb{R}^n$ is generally associated in some way with tangency to a linear map, or in the simplest case, a smooth curve is tangent to a line, a special choice of curve in $\mathbb{R}^n$. Since there is no automatic “special curve” to define smoothness in a general metric space, we sometimes rely on Lipschitz continuity instead. Fortunately, many important results on $\mathbb{R}^n$ hold under the more general Lipschitz regularity as well as smoothness. E.g., Lipschitz vector fields can be substituted for smooth vector fields in the Fundamental Theorem of ODEs (Theorem 128), the Flow-box Theorem [19], the definition of the Lie bracket [55] (or p. 32 above), Frobenius' Theorem (p. 51 above), the Inverse Function Theorem [7], etc. Even the Fundamental Theorem of Calculus works for Lipschitz functions $f : [a, b] \to \mathbb{R}$
\[
\int_a^b f'(x) \, d\mu(x) = f(b) - f(a)
\]
with the integral taken in the Lebesgue sense. (Use Rademacher's Theorem to define $f'$ almost everywhere, then note this holds more generally for any absolutely continuous $f$ [56].) For those interested in studying nonsmooth analysis an excellent starting point is the subject of geometric measure theory [33].
It is convenient to denote associated spaces of maps as follows:
\begin{align*}
C(M, N) &:= \{ f : M \to N \mid f \text{ is continuous} \} \\
\mathrm{Lip}(M, N) &:= \{ f : M \to N \mid f \text{ is Lipschitz} \} \\
\mathrm{Lip}_K(M, N) &:= \{ f : M \to N \mid f \text{ has Lipschitz constant} \le K \}.
\end{align*}
A homeomorphism is a continuous map with continuous inverse. A lipeomorphism is a Lipschitz map with Lipschitz inverse. An isometry is a map $f : M \to N$ such that $d_N(f(x), f(y)) = d_M(x, y)$ for all $x, y \in M$. An isometry is a lipeomorphism onto its image. $M$ and $N$ are isometric if there exists an isometry from $M$ onto $N$.
A Cauchy sequence is a sequence $(x_n)$ with the property that for any $\epsilon > 0$ there exists $N \in \mathbb{N}$ such that $d(x_n, x_m) < \epsilon$ for all $m, n > N$. A metric space $M$ is complete if every Cauchy sequence converges. $M$ is locally complete if every point $x \in M$ has a complete neighborhood. A closed subset of a complete space is complete. A locally complete metric space is isometrically isomorphic with an open subset of a complete metric space, and conversely every open subset of a complete metric space is locally complete. Every space in Examples 115-118 is complete.
A sequence $(y_n)$ is a subsequence of $(x_n)$ if there exists an increasing sequence $(n_m)$ in $\mathbb{N}$ such that $x_{n_m} = y_m$. A metric space $M$ is complete iff every Cauchy sequence has a convergent subsequence.
Every metric space $M$ has a metric completion $\overline{M}$, that is, a complete metric space into which $M$ may be isometrically embedded as a dense subset. The space $\overline{M}$ may be constructed as the set of all equivalence classes of Cauchy sequences in $M$, where two sequences $x = \{x_n\}$ and $y = \{y_n\} \subset M$ are equivalent if $d(x_n, y_n) \to 0$ as $n \to \infty$. Clearly $\overline{M}$ is unique up to isometry. A second approach to constructing $\overline{M}$ is to define $\phi_a(x) := d(x, a) - d(x, x_0)$ and $\Phi : M \to E(X, \mathbb{R})$ by $\Phi(a) := \phi_a$, then notice $\overline{M} := \overline{\Phi(M)}$ is complete. A third approach is to use Kuratowski's embedding theorem: every metric space can be embedded in a Banach space, whence the completion is the closure.
Theorem 119 (Contraction Mapping Theorem) A contraction $f : M \to M$ on a complete metric space $M$ has a unique fixed point $x^*$. For any $x \in M$ the iterates $f^{(n)}(x)$ converge to $x^*$ exponentially, i.e., if $K < 1$ is the contraction factor of $f$, then
\[
d\big(f^{(n)}(x), x^*\big) \le K^n \, d(x, x^*).
\]
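The exponential bound of Theorem 119 can be observed on a concrete contraction. The sketch below is our own illustration: it uses $f = \cos$ on the complete space $[0, 1]$, where $K = \sin 1 < 1$ bounds $|f'|$ and therefore serves as a Lipschitz constant; the helper `iterate` and the choice of 200 iterations are assumptions made only for this example.

```python
import math

def iterate(f, x, n):
    """Return the orbit [x, f(x), f(f(x)), ..., f^(n)(x)]."""
    orbit = [x]
    for _ in range(n):
        orbit.append(f(orbit[-1]))
    return orbit

f = math.cos              # maps [0, 1] into itself
K = math.sin(1.0)         # sup of |f'| on [0, 1], a Lipschitz constant below 1
orbit = iterate(f, 1.0, 200)
x_star = orbit[-1]        # numerical stand-in for the fixed point

# Check d(f^(n)(x), x*) <= K^n d(x, x*) along the first 50 iterates.
bound_holds = all(
    abs(orbit[n] - x_star) <= K**n * abs(orbit[0] - x_star) + 1e-12
    for n in range(50)
)
```

The small additive tolerance only absorbs floating-point rounding; the inequality itself is the one stated in the theorem.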
.2.2 Extensions
Theorem 120 (Tietze extension) If $S$ is a closed subset of a metric space $M$, and if $f : S \to \mathbb{R}$ is continuous, then there exists an extension $\bar{f} : M \to \mathbb{R}$ which is continuous.
Theorem 121 (McShane's Lipschitz extension) If $S$ is a subset of a metric space $M$ and $f : S \to \mathbb{R}$ is $K$-Lipschitz, then $\bar{f} : M \to \mathbb{R}$ defined by
\[
\bar{f}(x) := \sup \{ f(y) - K \, d(x, y) \mid y \in S \}
\]
equals $f$ on $S$ and is $K$-Lipschitz.
The proof of this is easy to check since $\bar{f}$ is given explicitly [47].
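Because McShane's extension is explicit, it is straightforward to compute when $S$ is finite. A minimal sketch under that assumption (the function name `mcshane_extension` and the sample data are ours):

```python
def mcshane_extension(points, values, K, dist):
    """McShane extension of a K-Lipschitz function known on finitely many points."""
    def f_bar(x):
        return max(v - K * dist(x, y) for y, v in zip(points, values))
    return f_bar

dist = lambda x, y: abs(x - y)   # the usual metric on the real line
S = [0.0, 1.0, 3.0]
vals = [0.0, 1.0, 2.0]           # slopes 1 and 1/2 between samples, so K = 1 works
f_bar = mcshane_extension(S, vals, 1.0, dist)

# Sample the extension on a grid reaching outside the convex hull of S.
grid = [i / 10 - 2 for i in range(80)]
samples = [f_bar(x) for x in grid]
```

Since `f_bar` is a maximum of finitely many $K$-Lipschitz functions, it is itself $K$-Lipschitz, which is the content of the theorem in this special case.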
The support of a function $\phi : M \to \mathbb{R}$ is defined to be $\operatorname{supp}(\phi) := \overline{\{ x \in M \mid \phi(x) \ne 0 \}}$, the closure of the set on which the function does not vanish.
Theorem 122 Let $\{U_i\}_{i \in I}$ be an arbitrary open covering of a metric space $M$, i.e., $U_i \subset M$ is open for all $i$ and $\bigcup_{i \in I} U_i = M$. Then there exists a partition of unity dominated by $\{U_i\}_{i \in I}$. This means there exist functions $\phi_i : M \to [0, 1]$ for all $i \in I$ such that
(i) $\operatorname{supp}(\phi_i) \subset U_i$ for all $i \in I$
(ii) $\{\operatorname{supp}(\phi_i)\}$ is locally finite
(iii) $\sum_{i \in I} \phi_i(x) = 1$ for each $x \in M$.
(ii) means every $x \in M$ has a neighborhood on which only finitely many $\phi_i$ are nonzero. This makes the sum in (iii) well-defined.
A partition of unity is used in manifold theory to demonstrate the existence of constructions whenever the desired object may be constructed on charts. For example, Riemannian metrics always exist on a manifold because the Euclidean inner product exists on the chart in $\mathbb{R}^n$; vector fields may be approximated with $C^\infty$ vector fields since they can be approximated in charts; and the integral of a form on a manifold, $\int d\omega$, is defined using charts.
Compactness is a fundamental property in metric spaces which is given short shrift in this book, since all our results hold in the more general setting of a complete metric space. One result worth mentioning, in keeping with our interest in generalizing results from manifolds to metric spaces, is the fact that a compact metric space with topological dimension $n$ can be embedded in $\mathbb{R}^{2n+1}$. Unlike the case with manifolds this value of $2n + 1$ is sharp for metric spaces.
We are focused on metric spaces and not topological spaces, so we do not wish to immerse ourselves in the dense terminology of topological spaces. Therefore we print the next two theorems without explanation. We've been able to forgive ourselves this shoddiness because they are not directly used anywhere in the manuscript, despite the perspective they impart.
Theorem 123 Every metric space is a Hausdorff, first countable, paracompact and perfectly normal topological space.
Theorem 124 (Nagata-Smirnov-Bing metrization) A topological space is metrizable if and only if it is regular and has a basis which is countably locally finite.
.3 Geometric objects
This book largely ignores the “static” objects listed in this dusty appendix in favor of more dynamic interests. We cannot ignore their fundamental importance completely, though, and recognize that further developments in this subject will benefit from the rich constructions generalized from Riemannian and Finsler geometry.
The diameter of a nonvoid subset $A$ of a metric space $M$ is the number
\[
D(A) := \sup_{x, y \in A} d(x, y).
\]
.3.1 Triangles
The cosine angle formula for a triangle in the plane with sides of length $A$, $B$, and $C$ is $C^2 = A^2 + B^2 - 2AB \cos\theta$ where $\theta$ is the angle between the sides of length $A$ and $B$. This may be immediately generalized to metric spaces to give a notion of angle. For three points $x$, $y$, and $z \in M$ let
\[
\bar{\angle} xyz := \arccos \frac{d(x, y)^2 + d(y, z)^2 - d(x, z)^2}{2 \, d(x, y) \, d(y, z)}
\]
denote the comparison angle $xyz$, which is used to define the angle between curves $c_1$ and $c_2 : [0, 1] \to M$ with common origin $c_1(0) = x = c_2(0)$ as
\[
\bar{\angle}(c_1, c_2) := \lim_{s, t \to 0^+} \bar{\angle} \, c_1(s) \, x \, c_2(t)
\]
when the limit exists. This gives the inspiration for defining a metric space inner product
\[
\langle c_1, c_2 \rangle_M := \|c_1\| \, \|c_2\| \cos \bar{\angle}(c_1, c_2)
\]
where the norm $\|c\|$ denotes the speed of $c$ at $t = 0$
\[
\|c\| := \sup_{t \ne 0} \frac{d(c(t), c(0))}{|t|}
\]
similar to Definition 10. Similarly curvature (defined with the angle given previously), convex sets, geodesics, gradients, etc., can be generalized profitably to metric spaces; see [37], [14] and [41]. Usually the metric space needs to be restricted (to length spaces or even locally Euclidean spaces) to get nontrivial results from these definitions. A different approach is to use the interpretation of a connection on a manifold $M$ as a distribution on $TM$, which may be generalized to metric spaces using the ideas of Chapter 3.
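Since the comparison angle depends only on three pairwise distances, it can be evaluated in any metric space. The following sketch (function name `comparison_angle` is ours) applies the law of cosines to the three distances; the clamp to $[-1, 1]$ is an added safeguard against floating-point rounding, not part of the definition.

```python
import math

def comparison_angle(d_xy, d_yz, d_xz):
    """Comparison angle at y, computed from the pairwise distances via the law of cosines."""
    cos_theta = (d_xy**2 + d_yz**2 - d_xz**2) / (2 * d_xy * d_yz)
    return math.acos(max(-1.0, min(1.0, cos_theta)))  # clamp against rounding

right = comparison_angle(3, 4, 5)        # 3-4-5 triangle: angle opposite the side 5
equilateral = comparison_angle(1, 1, 1)  # all sides equal
straight = comparison_angle(1, 1, 2)     # y lies midway between x and z
```

In the Euclidean plane this recovers the usual angle; in a general metric space the triangle inequality guarantees that the argument of arccos stays within $[-1, 1]$.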
.3.2 Metric coordinates
Let $C \subset M$ be a set of points and define for $x \in M$ and $c \in C$ the number $x_c := d(x, c)$, called the $c$-th metric coordinate of $x$. If for each pair of elements $x, y \in M$ with $x \ne y$ we have $(x_c)_{c \in C} \ne (y_c)_{c \in C} \in \mathbb{R}^C$, then $C$ coordinatizes $M$.
Example 125 Consider the open half-plane $H^2$ in the Euclidean plane $E^2$ with the Euclidean metric $d$. Pick any two distinct points $a$ and $b$ on the boundary. We can locate any point $x$ in $H^2$ if we know its distances to $a$ and $b$, say $x_a = d(x, a)$ and $x_b = d(x, b)$. Thus $\{a, b\}$ is a metric coordinatizing set for $H^2$.
Equations in metric coordinates obviously give different graphs from those in Cartesian or polar coordinates. E.g., for any $r > d(a, b)$, the locus of the equation
\[
x_a + x_b = r \tag{3}
\]
in metric coordinates is the set
\[
\{ x \in H^2 : d(x, a) + d(x, b) = r \}.
\]
The graph of (3) is half of an ellipse with foci at $a$ and $b$.
$E^2$, the plane, requires 3 noncolinear points for a metric coordinatizing set. $H^3$ (the open half-space) is metrically coordinatized with 3 noncolinear points on its boundary, and $E^3$ needs 4 noncoplanar points. Many geometric objects are readily described in metric coordinates on $E^3$:

Sphere (center $a$, radius $r$):   $x_a = r$,   $r \ge 0$
Ellipsoid (foci $a$, $b$):   $x_a + x_b = r$,   $r \ge d(a, b)$
Hyperboloid (foci $a$, $b$):   $|x_a - x_b| = r$,   $0 < r < d(a, b)$
Infinite cylinder (with axis $\overleftrightarrow{ab}$, radius $2r / d(a, b)$):   $\sqrt{s (s - x_a)(s - x_b)(s - d(a, b))} = r$,   where $s = \frac{x_a + x_b + d(a, b)}{2}$
Infinite cone (with axis $\overleftrightarrow{ab}$, vertex $a$, angle $\theta$):   $x_b^2 = d(a, b)^2 + x_a^2 - 2 x_a \, d(a, b) \cos\theta$
Plane ($\perp \overleftrightarrow{ab}$):   $x_a = x_b$
Segment $\overline{ab}$:   $x_a + x_b = d(a, b)$
Ray $\overrightarrow{ab}$:   $x_a \pm x_b = d(a, b)$
Line $\overleftrightarrow{ab}$:   $|x_a \pm x_b| = d(a, b)$

The equation for the cylinder comes from Heron's formula for the area of a triangle. The equation for the cone is simply the cosine angle formula for a triangle and represents only one half of a two-sided cone; the other half is given when $\theta$ is replaced with $\pi - \theta$. More general equations for lines and planes are available but are not so concise. Choosing the coordinates according to the problem simplifies the formulae.
Since each of the above formulae uses only metric coordinates, they may serve as definitions for the various geometric objects in general metric spaces.
.3.3 Conversion formulas
Choose a metric coordinatizing subset $\{a, b, c\}$ of the Euclidean plane $E^2$ so the rays $\overrightarrow{ca}$ and $\overrightarrow{cb}$ are perpendicular with $d(a, c) = 1 = d(b, c)$. Define a Cartesian coordinate system on the plane with the origin $(0, 0)$ at $c$, the positive $x$-axis along the ray $\overrightarrow{ca}$ and the positive $y$-axis along the ray $\overrightarrow{cb}$, so $E^2$ is given the structure of $\mathbb{R}^2$. The conversion formulae$^2$ are easy to find:
\[
\text{Metric } (w_a, w_b, w_c) = w = (x, y) \text{ Cartesian} \tag{4}
\]
\[
w_c = \sqrt{x^2 + y^2}, \qquad
w_b = \sqrt{x^2 + (y - 1)^2}, \qquad
w_a = \sqrt{(x - 1)^2 + y^2}.
\]
Solving these same equations for $x$ and $y$ yields the inverse formulae
\[
x = \frac{w_c^2 - w_a^2 + 1}{2} \quad \text{and} \quad y = \frac{w_c^2 - w_b^2 + 1}{2}. \tag{5}
\]
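A quick round trip confirms that (5) inverts (4). The sketch below assumes the normalization used above, $a = (1, 0)$, $b = (0, 1)$, $c = (0, 0)$; the function names are ours.

```python
import math

def to_metric(x, y):
    """Cartesian (x, y) -> metric coordinates (w_a, w_b, w_c), formula (4)."""
    return (math.hypot(x - 1, y), math.hypot(x, y - 1), math.hypot(x, y))

def to_cartesian(w_a, w_b, w_c):
    """Metric coordinates -> Cartesian (x, y), formula (5)."""
    return ((w_c**2 - w_a**2 + 1) / 2, (w_c**2 - w_b**2 + 1) / 2)

# Convert an arbitrary sample point to metric coordinates and back.
x, y = to_cartesian(*to_metric(0.3, -1.7))
```

Note that the inverse map uses only squares of the metric coordinates, so no square roots are needed in that direction.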
More generally, on a Hilbert space we have:
Theorem 126 Let $(H, \langle \cdot, \cdot \rangle)$ be a real Hilbert space with orthonormal basis $B$. The set $C := B \cup \{0\} \subset H$ is a metric coordinatizing set.
Proof. For $u, v \in H$ assume $d(u, c) = d(v, c)$ for all $c \in C$. Then since 0 is in $C$ we have $\langle u, u \rangle = \langle v, v \rangle$. Further
\begin{align*}
\langle u - c, u - c \rangle &= \langle v - c, v - c \rangle \\
\langle u, u \rangle - 2 \langle c, u \rangle + \langle c, c \rangle &= \langle v, v \rangle - 2 \langle c, v \rangle + \langle c, c \rangle \\
\langle c, u \rangle &= \langle c, v \rangle
\end{align*}
for all $c \in B$ so $u = v$.
Using the basis $B$ write an element $w \in H$ in orthonormal coordinates as $w = (\bar{w}_c)$ where $\bar{w}_c = \langle w, c \rangle$ for each $c \in B$. Any point $w \in H$ is given in metric coordinates by $w = (w_c)_{c \in B \cup \{0\}}$ where $w_c := \|w - c\| = d(w, c)$. With this, the conversion formulae are
\begin{align*}
\bar{w}_c &= \frac{w_0^2 - w_c^2 + 1}{2} & c \in B \tag{6} \\
w_c &= \left( \|w\|^2 - 2 \bar{w}_c + 1 \right)^{1/2} & c \in B \tag{7} \\
w_0 &= \|w\|
\end{align*}
a straightforward generalization of the finite-dimensional formulae, (5) and (4). (7) results from the easy calculation
\begin{align*}
w_c = \|w - c\| &= \langle w - c, w - c \rangle^{1/2} \\
&= \left( \langle w, w \rangle - \langle w, c \rangle - \langle c, w \rangle + \langle c, c \rangle \right)^{1/2} \\
&= \left( \|w\|^2 - 2 \bar{w}_c + 1 \right)^{1/2}.
\end{align*}
Solving this equation for $\bar{w}_c$ yields (6).
$^2$ To write $(w_a, w_b, w_c) = w = (x, y)$ is technically abuse of notation. $(w_a, w_b, w_c)$ and $(x, y)$ are actually representations of $w$, and in the sequel we write $w_C = (w_a, w_b, w_c)$ to make this distinction explicit.
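Example 127 One must be careful in applying these formulae. They do not necessarily apply on non-Hilbert vector spaces. The finite-dimensional Banach space $\mathbb{R}^2$ with the infinity norm has basis $\{(1, 0), (0, 1)\}$ which does not produce a coordinatizing set in the above manner. For instance, the distinct points $(5, 1)$ and $(5, 2)$ both lie at sup-norm distances 5, 4 and 5 from $(0, 0)$, $(1, 0)$ and $(0, 1)$, respectively.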
Appendix B: ODEs as vector fields
The most important source of flows is the subject of ordinary differential equations (ODEs). Let us demonstrate the elementary fact that for local questions, practically any $n$th-order ODE may be rewritten as a vector field by adding dependent variables.
Denoting higher-order derivatives with square brackets
\[
y^{[n]} := \frac{d^n y}{dt^n} = y^{\prime\prime\cdots\prime} \quad (n \text{ primes})
\]
consider for some $g : \mathbb{R}^{n+2} \to \mathbb{R}$ the ODE
\[
g\left( y^{[n]}, y^{[n-1]}, \dots, y', y, t \right) = 0. \tag{8}
\]
Using the implicit function theorem we can locally solve (8) for $y^{[n]}$ when $\frac{\partial g}{\partial x_1} \ne 0$. (In case this is not true, we are in the realm of the subject of singularity theory; see [6], [4].) And so we have
\[
y^{[n]} = h\left( y^{[n-1]}, \dots, y', y, t \right)
\]
for some $h : \mathbb{R}^{n+1} \to \mathbb{R}$. Substituting $x_1 := y$, $x_2 := y'$, $x_3 := y''$, ..., $x_n := y^{[n-1]}$ we get the equivalent 1st-order system
\begin{align*}
x_1' &= x_2 \\
&\ \ \vdots \\
x_{n-1}' &= x_n \\
x_n' &= h(x_n, \dots, x_2, x_1, t).
\end{align*}
Introducing a final variable $x_{n+1} := t$ eliminates the right-hand side's dependence on $t$ to get the autonomous system
\begin{align*}
x_1' &= x_2 \\
&\ \ \vdots \\
x_{n-1}' &= x_n \\
x_n' &= h(x_n, \dots, x_2, x_1, x_{n+1}) \\
x_{n+1}' &= 1
\end{align*}
which may be rewritten more concisely with vector notation as
\[
x' = f(x) \tag{9}
\]
where $x = (x_1, \dots, x_{n+1})$ and $f : \mathbb{R}^{n+1} \to \mathbb{R}^{n+1}$.
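The reduction to an autonomous first-order system translates directly into code. In the sketch below, `as_vector_field` (our name) builds $f$ from a given $h$, with the state ordered as $(y, y', \dots, y^{[n-1]}, t)$. To exercise it we integrate with the classical fourth-order Runge-Kutta scheme, which is our choice of integrator and is not discussed in the text; the test problem $y'' = -y$ is likewise only an illustration.

```python
import math

def as_vector_field(h, n):
    """Vector field f on R^(n+1) for the ODE y^[n] = h(y^[n-1], ..., y', y, t).

    The state is x = (x_1, ..., x_n, x_{n+1}) = (y, y', ..., y^[n-1], t).
    """
    def f(x):
        derivs, t = x[:n], x[n]
        top = h(*reversed(derivs), t)           # h(x_n, ..., x_2, x_1, x_{n+1})
        return list(derivs[1:]) + [top, 1.0]    # (x_2, ..., x_n, h, 1)
    return f

def rk4(f, x, dt, steps):
    """Classical Runge-Kutta integration of the autonomous system x' = f(x)."""
    for _ in range(steps):
        k1 = f(x)
        k2 = f([a + dt / 2 * b for a, b in zip(x, k1)])
        k3 = f([a + dt / 2 * b for a, b in zip(x, k2)])
        k4 = f([a + dt * b for a, b in zip(x, k3)])
        x = [a + dt / 6 * (p + 2 * q + 2 * r + s)
             for a, p, q, r, s in zip(x, k1, k2, k3, k4)]
    return x

# y'' = -y with y(0) = 1, y'(0) = 0, whose solution is y = cos t.
f = as_vector_field(lambda y1, y0, t: -y0, n=2)
x_end = rk4(f, [1.0, 0.0, 0.0], dt=0.001, steps=1000)
```

After integrating to $t = 1$, the first coordinate of `x_end` approximates $y(1)$, the second $y'(1)$, and the last coordinate simply tracks $t$.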
$f$ is called the vector field associated with the ODE (8). A solution to the vector field is a map $x : I \to \mathbb{R}^{n+1}$ for some interval $I \subset \mathbb{R}$ which satisfies (9). The point $x(0) = x_0 \in \mathbb{R}^{n+1}$ is called the initial condition of $x$. Such a function $x$ clearly gives a solution $y$ to (8) by retracing our steps, using the first coordinate of $x$. This can be immediately generalized, mutatis mutandis, by assuming $y$ is a vector quantity. Consequently we take solutions to vector fields as primary, and Theorem 128 below has come to be known as the Fundamental Theorem of ODEs$^3$, which uses the next two definitions.
Our geometric intuition is usually rooted in finite dimensions (typically $\mathbb{R}^2$ with the occasional stretch to $\mathbb{R}^3$). However, a routine experience in the “unreasonable effectiveness of mathematics” is how easily proofs inspired by low-dimensional intuition can be generalized to abstract spaces. Case in point, this next theorem and its proofs have the same form in dimension 1 or infinity.
Theorem 128 If $f : B \to B$ is a Lipschitz vector field on a Banach space $B$ then there exists a unique solution to the ODE $x' = f(x)$ for each initial condition $x_0 \in B$.
Proof. This is given in detail in introductory ODE texts, such as [39].
(Sketch) The solution $\sigma_{x_0}$ is the limit of the sequence $\{\phi_i\}$ defined by
\begin{align*}
\phi_0(t) &:= x_0 \\
\phi_{i+1}(t) &:= x_0 + \int_0^t f(\phi_i(s)) \, ds.
\end{align*}
The limit exists by the Contraction Mapping Theorem, Theorem 119. The detail that complicates matters is the domain of the solution. By continuity of $f$ we find $r$ and $M > 0$ such that $\|f(x)\| < M$ on $B(x_0, r)$. Then the domain is at least $(-r/M, r/M)$ and we use the supremum norm on $C((-r/M, r/M), B)$ to get the metric space used in the Contraction Mapping Theorem. Then Grönwall's lemma (a specialization of Theorem 16 to real functions) guarantees uniqueness.
Alternately, Theorem 12's proof is transferable, demonstrating the convergence of the sequence of Euler curves.
$^3$ Alternately referred to as the “Cauchy-Lipschitz Theorem”, “Picard-Lindelöf Theorem”, “Well-posedness Theorem”, “Existence and Uniqueness Theorem”, etc., but most often it's used without comment.
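The Picard iterates in the proof sketch can be computed exactly in simple cases. The following sketch is our own illustration for the scalar ODE $x' = x$ with $x_0 = 1$: each iterate is a polynomial in $t$, stored as a list of rational coefficients, and integration from 0 to $t$ acts term by term. The function name `picard_step` is hypothetical.

```python
from fractions import Fraction

def picard_step(phi, x0):
    """One Picard iterate for x' = x: phi_{i+1}(t) = x0 + integral_0^t phi_i(s) ds.

    Polynomials in t are stored as coefficient lists [a0, a1, a2, ...].
    """
    integral = [Fraction(0)] + [a / (k + 1) for k, a in enumerate(phi)]
    integral[0] += x0
    return integral

phi = [Fraction(1)]                 # phi_0(t) = x0 = 1
for _ in range(5):
    phi = picard_step(phi, Fraction(1))
```

After five steps `phi` holds the coefficients of $1 + t + t^2/2 + t^3/6 + t^4/24 + t^5/120$, the degree-5 Taylor polynomial of the solution $e^t$.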
This result may be carried to the slightly more general context of a smooth Banach manifold, $M$. A map $f : M \to TM$ is a vector field if $\pi \circ f = \mathrm{id}_M$ where $\pi : TM \to M$ is the natural projection. Remember, a tangent vector $(x, x') = v \in TM$ can be represented as an equivalence class $v = [c]$ of curves $c$ which are differentiable and tangent. Therefore a vector field $f$ may be represented as a family of curves on $M$ with $c_x \in [c_x] = f(x) \in TM$ requiring $c_x(0) = x$. Theorem 128 guarantees unique solutions when the transferred $f$ is locally Lipschitz on some (and therefore any) chart.
Appendix C: Numerical differentiation
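To approximate the derivative of a function $f : \mathbb{R} \to \mathbb{R}$ we may employ the difference quotient
\begin{align*}
\frac{df}{dx} = f^{[1]}(x) &\approx \frac{f(x + t) - f(x)}{t} \\
\frac{d^2 f}{dx^2} = f^{[2]}(x) &\approx \frac{\frac{f(x + 2t) - f(x + t)}{t} - \frac{f(x + t) - f(x)}{t}}{t}
= \frac{f(x + 2t) - 2 f(x + t) + f(x)}{t^2} \\
&\ \ \vdots \\
f^{[m]}(x) &\approx \frac{1}{t^m} \sum_{j=0}^m (-1)^{m-j} \binom{m}{j} f(x + jt).
\end{align*}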
In this appendix we derive an error estimate and generalize this formula to include more points $x + \alpha_j t$. In approximating the $m$th derivative with an $(n+1)$-point formula
\[
f^{[m]}(x) = \frac{1}{t^m} \sum_{j=0}^n c_{m,j} \, f(x + \alpha_j t) + \text{Error}
\]
we wish to calculate the coefficients $c_j = c_{m,j}$ and keep track of the Error. In the forward difference method, the $\alpha_j = j$, but keeping these values general allows us to find the coefficients for the central, backward, and other difference formulas just as easily. The following method for finding the $c_j$ was shown to us by Jeffrey Thornton who rediscovered the approach.
Taylor's Theorem has
\[
f(x + \alpha_j t) = \sum_{k=0}^n \frac{(\alpha_j t)^k}{k!} f^{[k]}(x) + \frac{(\alpha_j t)^{n+1}}{(n+1)!} f^{[n+1]}(\xi_j)
\]
for some $\xi_j$ between $x$ and $x + \alpha_j t$. From this it follows
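\[
\sum_{j=0}^n c_j f(x + \alpha_j t) =
\begin{bmatrix}
f(x) \\ t f'(x) \\ \vdots \\ t^n f^{[n]}(x) \\ t^{n+1}
\end{bmatrix}^T
\begin{bmatrix}
1 & 1 & \cdots & 1 \\
\alpha_0 & \alpha_1 & \cdots & \alpha_n \\
\frac{\alpha_0^2}{2!} & \frac{\alpha_1^2}{2!} & \cdots & \frac{\alpha_n^2}{2!} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\alpha_0^n}{n!} & \frac{\alpha_1^n}{n!} & \cdots & \frac{\alpha_n^n}{n!} \\
\frac{\alpha_0^{n+1} f^{[n+1]}(\xi_0)}{(n+1)!} & \frac{\alpha_1^{n+1} f^{[n+1]}(\xi_1)}{(n+1)!} & \cdots & \frac{\alpha_n^{n+1} f^{[n+1]}(\xi_n)}{(n+1)!}
\end{bmatrix}
\begin{bmatrix}
c_0 \\ c_1 \\ \vdots \\ c_n
\end{bmatrix}.
\]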
Now pick c = [c
j
] as a solution to
1 1 1
α
0
α
1
α
n
α
2
0
2!
α
2
1
2!
α
2
n
2!
.
.
.
.
.
.
.
.
.
.
.
.
α
n
0
n!
α
n
1
n!
α
n
n
n!
¸
¸
¸
¸
¸
¸
¸
¸
c
0
c
1
.
.
.
c
n
¸
¸
¸
¸
¸
=
0
.
.
.
1
.
.
.
0
¸
¸
¸
¸
¸
¸
¸
¸
←m
th
entry 1 (10)
which is possible whenever the $\alpha_j$ are distinct, because then the matrix is
invertible, as is seen using the Vandermonde determinant:
\[
\det = \frac{\prod_{0 \le j < k \le n} (\alpha_k - \alpha_j)}{\prod_{2 \le j \le n} j!} \neq 0.
\]
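System (10) is small enough to solve exactly. The sketch below is an illustration only: the helper name `difference_coefficients` is hypothetical, and plain Gauss-Jordan elimination over rationals is our choice here, not a method taken from the text. Rows are indexed $k = 0, \dots, n$, so the right-hand side has its 1 in row $k = m$.

```python
from fractions import Fraction
from math import factorial


def difference_coefficients(alphas, m):
    """Solve system (10): sum_j c_j * alpha_j**k / k! = (1 if k == m else 0), k = 0..n."""
    n = len(alphas) - 1
    if len(set(alphas)) != n + 1:
        raise ValueError("nodes must be distinct")
    if not 0 <= m <= n:
        raise ValueError("need 0 <= m <= n")
    # Augmented matrix [A | e_m] with exact rational entries.
    rows = [
        [Fraction(a) ** k / factorial(k) for a in alphas] + [Fraction(int(k == m))]
        for k in range(n + 1)
    ]
    # Gauss-Jordan elimination; distinct nodes guarantee a nonzero pivot exists.
    for col in range(n + 1):
        pivot = next(r for r in range(col, n + 1) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [entry / pivot_value for entry in rows[col]]
        for r in range(n + 1):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]


# Central nodes -1, 0, 1 with m = 1 give the weights -1/2, 0, 1/2.
print(difference_coefficients([-1, 0, 1], 1))
```

Exact rational arithmetic sidesteps the ill-conditioning of Vandermonde-type matrices for small $n$; a floating-point solver would be the natural substitute for non-rational nodes.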
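Then we must have
\[
\sum_{j=0}^{n} c_j f(x + \alpha_j t) =
\begin{bmatrix} f(x) \\ t f'(x) \\ \vdots \\ t^n f^{[n]}(x) \\ t^{n+1} \end{bmatrix}^{T}
\begin{bmatrix} 0 \\ \vdots \\ 1 \ (m\text{-th position}) \\ \vdots \\ 0 \\
\frac{1}{(n+1)!} \sum_{j=0}^{n} c_j \alpha_j^{n+1} f^{[n+1]}(\xi_j) \end{bmatrix}
= t^m f^{[m]}(x) + \frac{t^{n+1}}{(n+1)!} \sum_{j=0}^{n} c_j \alpha_j^{n+1} f^{[n+1]}(\xi_j).
\]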
Therefore
\[
f^{[m]}(x) = \frac{1}{t^m} \sum_{j=0}^{n} c_j f(x + \alpha_j t) + \text{Error}
\]
for $c_j$ which satisfy (10), where
\[
\text{Error} = -\frac{t^{n+1-m}}{(n+1)!} \sum_{j=0}^{n} c_j \alpha_j^{n+1} f^{[n+1]}(\xi_j).
\]
This Error formula shows how truncation error may be decreased by increasing
$n$ without shrinking $t$, thus combatting roundoff error at the expense of
increased computation of sums.
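A small experiment illustrates the trade-off. This sketch is not from the original text: it assumes the standard 3-point and 5-point central-difference weights for the first derivative (both satisfy (10) with $m = 1$), and the helper name `apply_formula` is hypothetical.

```python
from math import sin, cos


def apply_formula(f, x, t, alphas, coeffs, m):
    """Evaluate (1 / t**m) * sum_j c_j * f(x + alpha_j * t)."""
    return sum(c * f(x + a * t) for a, c in zip(alphas, coeffs)) / t ** m


x, t = 1.0, 0.1  # the step t is held fixed; only the number of nodes changes

# Standard central-difference weights for the first derivative (m = 1).
three_point = ([-1, 0, 1], [-1 / 2, 0, 1 / 2])                       # n = 2
five_point = ([-2, -1, 0, 1, 2], [1 / 12, -2 / 3, 0, 2 / 3, -1 / 12])  # n = 4

err3 = abs(apply_formula(sin, x, t, *three_point, 1) - cos(x))
err5 = abs(apply_formula(sin, x, t, *five_point, 1) - cos(x))
print(err3, err5)
```

With the same step $t$, the 5-point error comes out several orders of magnitude below the 3-point error, at the cost of two more function evaluations.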
Example 129 For $n = m$ and $\alpha_j = j$ the $c_j$ which satisfy (10) are
\[
c_j = (-1)^{m-j} \binom{m}{j}
\]
which gives the famous forward difference formula
\[
f^{[m]}(x) \approx \frac{1}{t^m} \sum_{j=0}^{m} (-1)^{m-j} \binom{m}{j} f(x + jt)
\]
and similarly we can derive the backward difference formula
\[
f^{[m]}(x) \approx \frac{1}{t^m} \sum_{j=0}^{m} (-1)^{j} \binom{m}{j} f(x - jt).
\]
Notice the $\alpha_j$ may be chosen as complex values when $f$ is analytic (as is
the case with our Gaussians in §5.1 and particularly with complex exponentials
for the low-frequency trigonometric series in §5.2.2). This gives us another
opportunity to mitigate roundoff error, since a greater quantity of regularly
spaced nodes $\alpha_j$ can be packed into an epsilon ball around zero in the complex
plane than on the real line. However, Taylor's remainder as used above needs
to be adjusted in the complex case to the familiar integral form.
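Example 130 Thornton chose the roots of unity in $\mathbb{C}$ as nodes
$\alpha_j := e^{\frac{j}{n} 2\pi i}$ and found
$c_j = \frac{m!}{n t^m} e^{\frac{(n-m)j}{n} 2\pi i}$ so
\[
f^{[m]}(z) \approx \frac{m!}{n t^m} \sum_{j=1}^{n} e^{\frac{(n-m)j}{n} 2\pi i}
f\left(z + t e^{\frac{j}{n} 2\pi i}\right). \tag{11}
\]
Controlling the error and taking $n \to \infty$ proves Cauchy's Integral Formula:
\[
f^{[m]}(z) = \frac{m!}{2\pi i} \int_{\gamma} \frac{f(w)}{(w - z)^{m+1}} \, dw
\]
where $\gamma$ is the circle centered at $z$ of radius $t$.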
Since line (11) is merely the Riemann sum approximation of the complex
integral, a more sophisticated numerical integration technique would beneﬁt a
practical implementation of the results of §5.2.3.
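Formula (11) is straightforward to evaluate directly. The following is a minimal sketch under the assumption that $f$ is analytic on and inside the circle of radius $t$ about $z$; the function name `roots_of_unity_derivative` is hypothetical.

```python
import cmath
from math import factorial, pi


def roots_of_unity_derivative(f, z, m, n, t):
    """Evaluate the right-hand side of (11) with n nodes on the circle |w - z| = t."""
    total = 0
    for k in range(1, n + 1):
        node = cmath.exp(2j * pi * k / n)               # k-th of the n-th roots of unity
        weight = cmath.exp(2j * pi * k * (n - m) / n)   # equals node ** (n - m)
        total += weight * f(z + t * node)
    return factorial(m) / (n * t ** m) * total


# Every derivative of exp at 0 equals 1.
print(roots_of_unity_derivative(cmath.exp, 0, 2, 16, 1.0))
```

Note that the radius here is $t = 1$, far larger than a real-line difference quotient could tolerate, which is the roundoff advantage the text describes.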
As a final note we mention there have been numerous advances to the present
day in inverting the Vandermonde matrix. We mention only the earliest
application to numerical differentiation [59], which gives a formula in terms of the
Stirling numbers.
Bibliography
[1] Ralph Abraham, Jerrold Marsden and Tudor Ratiu, “Manifolds, Tensor
Analysis, and Applications”, 2nd Ed., Springer-Verlag, 1988.
[2] Aleksandr Danilovich Aleksandrov, “Convex Polyhedra”, Springer-Verlag,
Berlin, 2005.
[3] Luigi Ambrosio, Nicola Gigli, Giuseppe Savaré, “Gradient Flows in Metric
Spaces and in the Space of Probability Measures”, 2nd ed., Birkhäuser,
2008.
[4] Vladimir Igorevich Arnol’d, “Geometrical Methods in the Theory of
Ordinary Differential Equations”, Springer-Verlag, Berlin, 1988.
[5] Vladimir Igorevich Arnol’d, “Lectures on Partial Differential Equations”,
Springer-Verlag, Berlin, 2004.
[6] Vladimir Igorevich Arnol’d, V. S. Afrajmovich, “Bifurcation Theory and
Catastrophe Theory”, Springer-Verlag, Berlin, 1999.
[7] Jean-Pierre Aubin, “Mutational and Morphological Analysis”, Birkhäuser,
Boston, 1999.
[8] Michael Barnsley, “Fractals Everywhere”, Academic Press Professional,
New York, 1993.
[9] Alain Bensoussan, Giuseppe Da Prato, Michel C. Delfour, Sanjoy K. Mitter,
“Representation and Control of Infinite Dimensional Systems”, 2nd Ed.,
webversion, April 2006.
[10] G. G. Bilodeau, The Weierstrass Transform and Hermite Polynomials,
Duke Mathematical Journal, Vol. 29, No. 2, 1962.
[11] Leonard M. Blumenthal, Karl Menger, “Studies in Geometry”, W. H.
Freeman and Co., 1970.
[12] William M. Boothby, “An Introduction to Differentiable Manifolds and
Riemannian Geometry”, 2nd Ed., Academic Press, Inc., 1986.
[13] Max Born and Emil Wolf, “Principles of Optics”, 7th ed., Cambridge
University Press, (corrected) 2002.
[14] Dmitri Burago, Yuri Burago and Sergei Ivanov, “A Course in Metric
Geometry”, American Mathematical Society 1984.
[15] Herbert Busemann, “The Geometry of Geodesics”, Academic Press Inc.,
New York, N. Y., 1955.
[16] Craig Calcaterra, “Arc Fields”, Ph. D. dissertation, University of Hawaii,
1999.
[17] Craig Calcaterra, Linear Combinations of Gaussians with a Single Variance
are dense in $L^2$, Proceedings of the World Congress on Engineering, 2008.
[18] Craig Calcaterra and David Bleecker, Generating Flows on a Metric Space,
Journal of Mathematical Analysis and Applications, 248, pp. 645–677, 2000.
[19] Craig Calcaterra and Axel Boldt, Lipschitz Flow-box Theorem, Journal of
Mathematical Analysis and Applications, 338, issue 2, pp. 1108–1115, 2008.
[20] Craig Calcaterra and Axel Boldt, Approximating with Gaussians,
arXiv:0805.3795v1 [math.CA]
[21] Craig Calcaterra, Axel Boldt, Michael Green, David Bleecker, Metric
Coordinate Systems, arXiv.org: math.DS/0206253
[22] César Camacho, Alcides Lins Neto, “Geometric Theory of Foliations”,
Birkhäuser Boston, 1985.
[23] Shiing-Shen Chern, W. H. Chen, K. S. Lam, “Lectures on Differential
Geometry”, World Scientific Publishing Co., 1999.
[24] Rinaldo M. Colombo and Andrea Corli, A Semilinear Structure on
Semigroups in a Metric Space, Semigroup Forum, 68, pp. 419–444, 2004.
[25] Charles C. Conley, Isolated Invariant Sets and the Morse Index, CBMS
Regional Conference, vol. 89, American Mathematical Society, 1978.
[26] John B. Conway, “A Course in Functional Analysis”, 2nd edition,
Springer-Verlag, 1994.
[27] Jean-Michel Coron, “Control and Nonlinearity”, American Mathematical
Society, 2007.
[28] S. Darlington, Synthesis and Reactance of 4-poles, J. Math. & Phys., 18,
pp. 257–353, 1939.
[29] Milton Dishal, Gaussian-Response Filter Design, Electrical
Communication, 36, no. 1, pp. 3–26, 1959.
[30] J. R. Dorroh and J. W. Neuberger, A Theory of Strongly Continuous
Semigroups in Terms of Generators, Journal of Functional Analysis, 136, pp.
114–126, 1996.
[31] James Dugundji, “Topology”, Allyn and Bacon, Boston, 1966.
[32] Günter Ewald, “Combinatorial Convexity and Algebraic Geometry”,
Springer-Verlag, New York, 1996.
[33] Herbert Federer, “Geometric Measure Theory”, Springer-Verlag, New York,
1996.
[34] Maurice Fréchet, Sur Quelques Points du Calcul Fonctionnel, Rendic. Circ.
Mat. Palermo 22, pp. 1–74, 1906.
[35] David Gottlieb and Steven Orszag, “Numerical Analysis of Spectral
Methods”, SIAM, 1977.
[36] Leslie Greengard and Xiaobai Sun, A New Version of the Fast Gauss
Transform, Documenta Mathematica, Extra Volume ICM, III, pp. 575–584, 1998.
[37] M. Gromov, “Metric structures for Riemannian and non-Riemannian
spaces”, Birkhäuser, 1999.
[38] Felix Hausdorff, “Grundzüge der Mengenlehre”, Leipzig: Veit, 1914.
(Reprinted in “Felix Hausdorff – Gesammelte Werke. Band II”,
Springer-Verlag, pp. 91–576, 2002.)
[39] Philip Hartman, “Ordinary Differential Equations”, 2nd Ed., SIAM, 2002.
[40] Gilbert Hector and Ulrich Hirsch, “Introduction to the Geometry of
Foliations, Part A”, Friedr. Vieweg & Sohn, 1981.
[41] Juha Heinonen, “Lectures on Analysis on Metric Spaces”, Springer-Verlag,
New York, 2001.
[42] John E. Hutchinson, Fractals and Self Similarity, Indiana University
Mathematics Journal, 30, pp. 713–747, 1981.
[43] Youssef Jabri, “The Mountain Pass Theorem”, Cambridge University
Press, 2003.
[44] Donald Knuth, “Concrete Mathematics”, 2nd ed., Addison-Wesley, 1994.
[45] Greg Leibon, Daniel Rockmore & Gregory Chirikjian, A Fast Hermite
Transform with Applications to Protein Structure Determination,
Proceedings of the 2007 International Workshop on Symbolic-Numeric
Computation, ACM, New York, NY, pp. 117–124, 2007.
[46] J. Madrenas, M. Verleysen, P. Thissen, and J. L. Voz, A CMOS Analog
Circuit for Gaussian Functions, IEEE Transactions on Circuits and
Systems-II: Analog and Digital Signal Processing, 43, no. 1, 1996.
[47] E. J. McShane, Extensions of Range of Functions, Bull. Am. Math. Soc.,
40, pp. 837–842, 1934.
[48] Karl Menger, “Géométrie Général”, Memor. Sci. Math. no. 124,
Gauthier-Villars, Paris, 1954.
[49] James R. Munkres, “Topology”, Prentice Hall, 2000.
[50] D. Motreanu and N. H. Pavel, “Tangency, Flow Invariance for Differential
Equations, and Optimization Problems”, Marcel Dekker, 1999.
[51] A. I. Panasyuk, Quasidifferential Equations in a Complete Metric Space
under Conditions of the Caratheodory Type. I, Differential Equations 31,
pp. 901–910, 1995.
[52] A. I. Panasyuk, Quasidifferential Equations in a Complete Metric Space
under Caratheodory-type Conditions. II, Differential Equations 31, no. 8,
pp. 1308–1317, 1995.
[53] Grisha Perel’man, Finite Extinction Time for the Solutions to the Ricci
Flow on Certain Three-manifolds, arXiv:math/0307245v1 [math.DG]
[54] Anthony Ralston and Philip Rabinowitz, “A First Course in Numerical
Analysis”, McGraw-Hill, 1978.
[55] Franco Rampazzo and Hector J. Sussmann, Commutators of Flow Maps of
Nonsmooth Vector Fields, Journal of Differential Equations, 232, no. 1, pp.
134–175, 2007.
[56] H. L. Royden, “Real Analysis”, Macmillan Publishing Company, 1988.
[57] Slobodan Simić, Lipschitz Distributions and Anosov Flows, Proceedings of
the AMS, 124, no. 6, pp. 1869–1877, 1996.
[58] Eduardo D. Sontag, “Mathematical Control Theory”, 2nd Ed.,
Springer-Verlag, 1998.
[59] A. Spitzbart and N. Macon, Numerical Differentiation Formulas, The
American Mathematical Monthly, Vol. 64, No. 10, pp. 721–723, 1957.
[60] M. H. Stone, Developments in Hermite Polynomials, The Annals of
Mathematics, 2nd Ser., Vol. 29, No. 1/4, pp. 1–13, 1927–1928.
[61] Gabor Szegö, “Orthogonal Polynomials”, American Mathematical Society,
3rd ed., 1967.
[62] E. C. Titchmarsh, “The Theory of Functions”, 2nd ed., Oxford University
Press, Fair Lawn, N.J., 1939.
[63] William P. Thurston, “Three-Dimensional Geometry and Topology”, vol.
1, Princeton University Press, 1997.
[64] Philippe Tondeur, “Geometry of Foliations”, Birkhäuser Verlag, 1991.
[65] Gilbert G. Walter, “Wavelets and Other Orthogonal Systems With
Applications”, CRC Press, 1994.
[66] David Widder, “The Heat Equation”, Pure and Applied Mathematics, Vol.
67. Academic Press, 1975.
List of notation
Exposition
Terms in bold are defined in that paragraph.
A := B A is defined by B, or B is denoted by A; used to distinguish the
case when = represents a step in a calculation.
A =: B means B := A
A :=: B means A and B are interchangeable notations for a concept
Square brackets for bibliographical citations, e.g., [1].
Round brackets refer to displayed lines, e.g., (2.3) refers to the third referenced
line in the second chapter.
Logic
⇒ implication
⇐ is a consequence of
⇔ equivalence
iff “if and only if”
∃ there exists
∀ for any
Set theory and algebra
$\mathbb{R}$ denotes the set of real numbers
$\mathbb{C}$ the complex numbers
$\mathbb{N} := \{1, 2, \dots\}$ the natural numbers
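$\mathbb{Z} := \{\dots, -1, 0, 1, 2, \dots\}$ the integers
$\mathbb{R}^n := \mathbb{R} \times \mathbb{R} \times \cdots \times \mathbb{R}$ Cartesian product
$\mathbb{R}^+ := [0, \infty) \subset \mathbb{R}$
$L^p$, $L^p_G$, $l^p$, $H(\mathbb{R}^n)$, etc., are various spaces defined in Appendix A.1.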
$B^{c}$ denotes the complement of the set $B$, i.e., $\{x \in M \mid x \notin B\}$
$A \setminus B := A \cap B^{c}$ set difference
$f : A \to B$ signifies a function with domain $A$ and codomain $B$
$f : A \times B \times C \to M$ may be denoted $f(x, y, z) :=: f(x)(y)(z) :=: f_x(y, z) :=: f_{x,y}(z)$
$f_1, f_2 : A \to B$ signifies two functions with the same domain and codomain
$f^{(n)}$ composition $n$ times: $f \circ f \circ \dots \circ f$
$f^{[n]}$ $n$-th derivative of $f$
$f^n$ usually distinguishes the $n$-th object $f$; alternatively may mean the $n$-th
multiplicative power
$\sup A$ supremum (least upper bound) of $A \subset \mathbb{R}$
$\inf A$ infimum (greatest lower bound) of $A \subset \mathbb{R}$
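$\operatorname{span}\{A_i\}$ linear span of objects $A_i$, e.g.,
\[
\operatorname{span}\{A_i\} := \left\{ \sum_{i=1}^{n} c_i A_i \;\middle|\; n \in \mathbb{N},\ c_i \in \mathbb{R} \right\}
\]
$\overline{\operatorname{span}}$ topological closure of span, i.e., closed linear span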
∨ a ∨ b := maximum of a and b
∧ a ∧ b := minimum of a and b
Metric spaces
$B(x, r) := \{y \in M \mid d(x, y) < r\} :=: B_M(x, r) :=: B_d(x, r)$ is the ball about
the center $x$ with radius $r$ in $M$.
$\overline{B}(x, r) := \{y \in M \mid d(x, y) \le r\} :=: \overline{B(x, r)}$ the closed ball.
$C(M, N) := \{f : M \to N \mid f \text{ is continuous}\}$
$\mathrm{Lip}(M, N) := \{f : M \to N \mid f \text{ is Lipschitz}\}$ (not the set of lipeomorphisms
from $M$ to $N$)
$\mathrm{Lip}_K(M, N) := \{f : M \to N \mid f \text{ with Lipschitz constant} \le K\}$
homeomorphism, lipeomorphism: see Appendix A.2
$\sim$, $\approx$ $A \sim B$ and $A \approx B$ denote 1st and 2nd-order local-uniform tangency.
$A$ and $B$ may be arc fields (p. 12), an arc field and a set (p. 21), 2D
integral surfaces (p. 51), and distributions (p. 60).
Be warned $\approx$ is alternately used for function approximation:
$f \overset{\epsilon}{\approx} g$ means $\|f - g\| < \epsilon$.
“Big oh” and “little oh”
$O$, $o$ For a real function $\Psi$ the statement
“$\Psi(t) = O(t^n)$ as $t \to 0$”
means there exist $K > 0$ and $\delta > 0$ such that $|\Psi(t)| < K|t^n|$ for
$0 < |t| < \delta$. The statement
“$\Psi(t) = o(t^n)$ as $t \to 0$”
means
\[
\lim_{t \to 0} \frac{\Psi(t)}{t^n} = 0.
\]
For a family of functions $\{\Psi_x : x \in M\}$ we say $\Psi_x(t) = O(t^n)$ locally
uniformly in $x$ when for each $x_0 \in M$ there are positive constants $r$, $\delta$ and
$K$ such that for all $x \in B(x_0, r)$ and $0 \neq t \in (-\delta, \delta)$ we have
\[
|\Psi_x(t)| < |t^n| K.
\]
$\Psi_x(t) = o(t^n)$ locally uniformly in $x$ when for each $x_0 \in M$ and any $\epsilon > 0$
there are positive constants $r$ and $\delta$ such that for all $x \in B(x_0, r)$ and
$0 \neq t \in (-\delta, \delta)$ we have
\[
|\Psi_x(t)| < |t^n| \epsilon.
\]
Index
$(\alpha_x, \omega_x)$, 14
angle, 132
approximately reachable set, 66
approximation, 83
Fourier synthesis, 78
probability distributions, 80
with exponentials, 77
with Gaussians, 76, 83
with low-frequency trig series, 90
coeﬃcients, 93
damping, 98
error bound, 97
approximation ﬁeld, see arc ﬁeld
arc, ix
arc bundle, 4
arc ﬁeld, x, 3
generalizes vector ﬁeld, x, 5
parallel parking example, 67
scalar multiple, 25
second order, 12
sum, 25
time dependent, 12
well posed result, 6
automorphism, viii
B(x, r), 152
Banach manifold, xv, 6
FTODE, 139
Banach space, 5, 27, 73, 123
embedding, 130
ﬂow on unit sphere, 80
foliation of unit sphere in $L^2$, 81
FTODE, 138
bounded speed, ix
bracket, see Lie bracket
bracket-generated distribution, see distribution, bracket-generated
C (M, N), 129
Cauchy sequence, 129
Cauchy-Lipschitz Theorem, 138
characteristic function, $\chi_S$, xi
Chebyshev metric, see supremum metric
Chow’s Theorem, xiv
manifold, xx
metric space, 68
close, 26
bracket is arc ﬁeld, 33
closure closes, 31
closed set, 128
closure, 128
commutativity of flows, 22, 58, 60, 63
complete, 129
complete ﬂow, see ﬂow
completion of a metric space, 130
Conditions E1 and E2, 4
E1 implies uniqueness, 14
forward arc ﬁelds, 19
implicit in frame deﬁnition, 61
imply module properties, 29
well posedness, 6
conﬁguration space, 65
Conley index, viii
continuity, 128
continuous dynamics, viii
contraction, 129
Contraction Mapping Theorem, 130
control theory, 65
$H(\mathbb{R}^n)$, 115
inﬁnite dimensional, xiii
controllability, xiii, 66
converge, 128
convex set, 132
convolution, 85
coordinates
ﬂow, 64
local on a manifold, xv
metric, 132
curvature, 132
curve, ix
d, vii
$d_\infty$, see supremum metric
deconvolution, 85
∆(X, Y ), 60
∆[X, Y ], xxi, 68
$\Delta_x$, 60
dense, 128
dilation
function dilation, 77
vector space dilation, 77
distance, see metric
distance to a set, 51
distribution
bracket generated, xxi
bracket-generated, 68, 74
distribution (geometric), xiv
manifold, xviii, 48
metric space, 60
n-dimensional, 61
diverge, 128
error
bound for low-freq. trig approx., 97
numerical diﬀerentiation, 143
Euler curve, 6
alternate deﬁnitions, 14
existence and uniqueness
arc ﬁeld solutions, 6
vector ﬁeld solutions, 138
exponential growth, 13
extrinsic metric, 126
F, viii
Finsler geometry, 64
ﬁxed point, 20
ﬂow, viii, 3
well posed, 17
foliation, xiv, 41
manifold, xvi
ouroboros, 43
spiral ouroboros, 44, 45
metric space, 63
unit sphere in $L^2$, 81
forward ﬂow, 3, 19, 21, 22
ﬁxed point, 20
Fourier
analysis, 78
transform, 87, 90, 92
frame, 64
local, 61
Frobenius’ Theorem
generalization of FTODE, 45
manifold, xx
proof, 49
metric space
global, 63
local, 62
local 2D, 51
FTODE, see Fundamental Theorem of ODEs
full flow, see flow
Fundamental Theorem of ODEs, x, 6, 138
forward flows, 19
generalized to Frobenius’ Theorem, 45
Gaussian ﬁlter, 84
generic, 128
geodesic, 132
geometric objects on metric spaces, 131, 133
global ﬂow, see ﬂow
global frame, 61
Global Frobenius Theorem, 63
gradient, 132
Hausdorﬀ distance, xxi, 127
Hausdorff metric, see Hausdorff distance
Hermite polynomial, 76, 83, 87
Hilbert space, 125
holonomy, 48, 76
homeomorphism, 129
$H(\mathbb{R}^n)$, xxi, 74, 107, 127
and reachable sets, 115
cyclically attracted sets, 114
ﬁxed point of discrete map, 108
ﬁxed point of ﬂow on, 114
initial condition
arc ﬁeld, 4
vector ﬁeld
Banach space, 5
manifold, xvi
$\mathbb{R}^n$, x
inner product (metric space), 132
integrability, 49, 51, 62, 63
integral surface
2-dimensional, 51
maximal, 63
metric space, 62
intrinsic metric, 126
invariant set, 21
involutivity, 48, 49, 74
metric space, 62
irrational ﬂow of the torus, 45
irrational foliation of the torus, xvii
isometry, 129
K, 128
$K_f$, 128
$L^2(\mathbb{R})$, xi, 74
foliations, 77
$L^2(\mathbb{R})$, 125
$L^2_G$, 125
Lagrange multipliers, 89
Λ, 4
< 0 implies ﬁxed point, 20
leaf, 63
least squares, 88
Lebesgue measure, 125
length of a curve, ix
length space, 132
Lie bracket, xiv
dynamic characterization, 36
manifold, xviii
metric space, 32
arc field vs. flow definition, 119
limit of a sequence, 128
linear span, see span
linear speed growth, 17, 29
ﬁxed point, 20
Lip (M, N), 129
lipeomorphism, 34, 129
$\mathrm{Lip}_K(M, N)$, 129
Lipschitz, 4, 128
constant, 128
locally, 129
vector ﬁeld, 27
load time, 85
local coordinates
manifold, xv
metric space, 64
local ﬂow, 15
continuity, 16
forward, 20
local ﬂows are arc ﬁelds, 18
local frame, see frame
local uniformity
bounded speed arc ﬁeld, 3
continuity, 128
tangency
arc ﬁelds, 12
distributions, 60
integral surface, 51
φ related, 37
surface, 21, 62
locally complete, 129
locally complete metric space, vii
low-frequency trigonometric series, 90
coefficients, 93
damping, 98
dense in $L^2[a, b]$, xiv
error bound, 97
$L^p$, 125
$l^p$, 124
M, vii
manifold, xv
chart, xv
metric, vii, 123
coordinate, 132
metric space, vii, 123
metric space arithmetic, 25
Conditions E1 and E2, 29
examples, 101
module properties of arc fields, 29
Minkowski distance, 124
module properties of arc ﬁelds, 29
n-dimensional distribution, 61
Nagumo’s Invariance Theorem, xxi
generates leaves of foliation, 63
metric space, 21
neighborhood, 128
norm (metric space), 132
normal equations, 88
numerical diﬀerentiation
backward diﬀerence, 143
forward diﬀerence, 143
n-point formula, 142
$O(t^n)$, 153
$o(t^n)$, 153
ODE, ordinary differential equation, 137
Ω, 4
open covering, 130
open subset, 128
ouroboros, 43
partition of unity, 130
path, ix, xvi
Picard-Lindelöf Theorem, 138
preﬂow, see arc ﬁeld
probability distribution, 80
pullback, 34
pushforward, 34
natural, 34
Rademacher’s Theorem, 129
reachable set, xiii, xx, 66
$H(\mathbb{R}^n)$, 115
Reeb component, xviii
ρ, ix, 4
Ricci ﬂow, 127
$\overline{S}$, see closure
semiﬂow, see forward ﬂow
semigroup, see forward ﬂow
sequence, 128
Shiva, 85
σ, 4
$\sigma_x$, 4
signal synthesis
low-frequency trig series, 91
Shiva’s damaru, 85
with Gaussians, 85
slide, 67
solution
arc ﬁeld, 4
$(\alpha_x, \omega_x)$, 14
maximal, 4
vector ﬁeld, xvi
Banach space, 5
solution curve, see solution
span, xiv, xx, 76
of arc ﬁelds, 60
of distributions, 60
speed of a curve, ix
locally uniformly bounded, 3
stratiﬁcation, xviii, 43
subsequence, 130
support, 85
arc ﬁeld, 18
function, 130
supremum metric
$L^\infty(\mathbb{R})$, 125
$\mathbb{R}^n$, 124
surface, 62
2-dimensional, 51
t, viii
tangency
arc ﬁeld to distribution, 60
arc ﬁeld to solution, 4
between curves, ix
distribution to distribution, 60
foliation, 63
forward tangency, 19
generalization from $\mathbb{R}^n$, x
second order, 12
surface, 62
∼, ≈, 37
arc ﬁelds, 12
distribution, 60
equivalence relations, 12
integral surface, 51
invariant set, 21
module properties, 29
φ related, 37
tangent bundle, TM, xvi
tangent space, $T_x M$, xvi
tangent vector, xvi
taxicab metric, 123
torus, $T^2$, xv
translation
bracketed with dilation, 77
function translation
bracketed with vector space translation, 74
foliates the Hilbert ball, 81
on probability distributions, 80
PDE, 104
vector space translation, xii, 73
transverse
arc ﬁelds, 50
distribution, 61
triangle inequality, vii
uniformity
$O(t^n)$ locally, 153
$o(t^n)$ locally, 153
locally uniformly continuous map, 128
uniqueness
solution to arc ﬁeld, 4, 13
Vandermonde determinant, 142
vector ﬁeld, xviii, 138
Banach space, 5
discontinuous examples, 120, 121
manifold, xvi
nonsmooth, 59
on $\mathbb{R}^n$, x
Weierstrass transform, 84, 87
weight function, 125
well posed
arc ﬁeld, 4, 6
distribution, 63
plane ﬁeld, 51
vector ﬁeld, 138
wriggle, 67
X, 3
x, viii
**check on the status of Colombo and Corli's new work
**Good spot for the nonunique solutions example $x' = \sqrt{x}$.
**Commuting figure
**add one more picture with the intermediate step approximation to Frobenius
proof
**infinite D distributions and Frobenius thm
**The picture on the cover gives the scaffolding for constructing a flow on
$S^3$ with no equilibria which alternately consists of open sets of points with
alpha limit set consisting of the first circle and omega limit set consisting of
the second circle and other open sets with alpha and omega limit sets reversed.
These open sets are intertwined in interesting complex ways separated by the
surfaces described before, consisting of periodic paths.
**integrate valid traditional foliation results
**Shannon-Hartley law/theorem: $C = B \log_2\left(1 + \frac{S}{N}\right)$ where
C is the channel capacity in bits per second;
B is the bandwidth of the channel in hertz;
S is the total signal power over the bandwidth, measured in watt or volt$^2$;
N is the total noise power over the bandwidth, measured in watt or volt$^2$.
So the information is limited (a lot) by how much power you can use, if
you are shrinking the bandwidth as suggested above. Or does this theorem not
apply?
Considering the cubic function approximation with 3 sine functions, Example
95, we have $3 = C$ and $\frac{S}{N} \sim \left(C \frac{1}{B}\right)^3$ so
$C \sim B \left(\frac{S}{N}\right)^{1/3}$, violating the Shannon-Hartley law.