Professional Documents
Culture Documents
Dierential Calculus
In this chapter, it is assumed that all linear spaces and at spaces under
consideration are nite-dimensional.
61 Dierentiation of Processes
Let c be a at space with translation space 1. A mapping p : I c from
some interval I SubR to c will be called a process. It is useful to think
of the value p(t) c as describing the state of some physical system at time
t. In special cases, the mapping p describes the motion of a particle and
p(t) is the place of the particle at time t. The concept of dierentiability for
real-valued functions (see Sect.08) extends without diculty to processes as
follows:
Denition 1: The process p : I c is said to be dierentiable at
t I if the limit
t
p := lim
s0
1
s
(p(t + s) p(t)) (61.1)
exists. Its value
t
p 1 is then called the derivative of p at t. We say
that p is dierentiable if it is dierentiable at all t I. In that case, the
mapping p : I 1 dened by (p)(t) :=
t
p for all t I is called the
derivative of p.
Given n N
1
p := p,
k+1
p := (
k
p) for all k (n 1)
]
. (61.2)
We say that p is of class C
n
if it is n times dierentiable and
n
p is
continuous. We say that p is of class C
if it is of class C
n
for all n N
.
209
210 CHAPTER 6. DIFFERENTIAL CALCULUS
As for a real-valued function, it is easily seen that a process p is contin-
uous at t Domp if it is dierentiable at t. Hence p is continuous if it is
dierentiable, but it may also be continuous without being dierentiable.
In analogy to (08.34) and (08.35), we also use the notation
p
(k)
:=
k
p for all k (n 1)
]
(61.3)
when p is an n-times dierentiable process, and we use
p
:= p
(1)
= p, p
:= p
(2)
=
2
p, p
:= p
(3)
=
3
p, (61.4)
if meaningful.
We use the term process also for a mapping p : I T from some
interval I into a subset T of the at space c. In that case, we use poetic
license and ascribe to p any of the properties dened above if p[
E
has that
property. Also, we write p instead of (p[
E
) if p is dierentiable, etc. (If T
is included in some at T, then one can take the direction space of T rather
than all of 1 as the codomain of p. This ambiguity will usually not cause
any diculty.)
The following facts are immediate consequences of Def.1, and Prop.5 of
Sect.56 and Prop.6 of Sect.57.
Proposition 1: The process p : I c is dierentiable at t I if and
only if, for each in some basis of 1
be at spaces and : c c
a at mapping.
If p : I c is a process that is dierentiable at t I, then p : I c
t
( p) = ()(
t
p). (61.5)
If p is dierentiable then (61.5) holds for all t I and we get
( p) = ()p. (61.6)
Let p : I c and q : I c
t
(p, q) = (
t
q,
t
q). (61.7)
61. DIFFERENTIATION OF PROCESSES 211
Both p and q are dierentiable if and only if (p, q) is, and, in that case,
(p, q) = (p, q). (61.8)
Let p and q be processes having the same domain I and the same
codomain c. Since the point-dierence (x, y) x y is a at mapping
from c c into 1 whose gradient is the vector-dierence (u, v) u v
from 1 1 into 1 we can apply Prop.2 to obtain
Proposition 3: If p : I c and q : I c are both dierentiable at
t I, so is the value-wise dierence p q : I 1, and
t
(p q) = (
t
p) (
t
q). (61.9)
If p and q are both dierentiable, then (61.9) holds for all t I and we
get
(p q) = p q. (61.10)
The following result generalizes the Dierence-Quotient Theorem stated
in Sect.08.
Dierence-Quotient Theorem: Let p : I c be a process and let
t
1
, t
2
I with t
1
< t
2
. If p[
[t
1
,t
2
]
is continuous and if p is dierentiable at
each t ]t
1
, t
2
[ then
p(t
2
) p(t
1
)
t
2
t
1
Clo Cxh
t
p [ t ]t
1
, t
2
[. (61.11)
Proof: Let a Flf c be given. Then (a p) [
[t
1
,t
2
]
is continuous and,
by Prop.1, a p is dierentiable at each t ]t
1
, t
2
[. By the elementary
Dierence-Quotient Theorem (see Sect.08) we have
(a p)(t
2
) (a p)(t
2
)
t
2
t
1
t
(a p) [ t ]t
1
, t
2
[.
Using (61.5) and (33.4), we obtain
a
_
p(t
2
) p(t
1
)
t
2
t
1
_
(a)
>
(o), (61.12)
where
o :=
t
p [ t ]t
1
, t
2
[.
Since (61.12) holds for all a Flf c we can conclude that b(
p(t
2
)p(t
1
)
t
2
t
1
) 0
holds for all those b Flf 1 that satisfy b
>
(o) P. Using the Half-Space
Intersection Theorem of Sect.54, we obtain the desired result (61.11).
Notes 61
(1) See Note (8) to Sect.08 concerning notations such as
t
p, p,
n
p, p
, p
(n)
, etc.
212 CHAPTER 6. DIFFERENTIAL CALCULUS
62 Small and Conned Mappings
Let 1 and 1
on 1 and 1
, respectively, we have
lim
u0
(n(u))
(u)
= 0. (62.1)
The set of all such small mappings will be denoted by Small(1, 1
).
Proposition 1: Let n be a mapping from a neighborhood of 0 in 1 to a
neighborhood of 0 in 1
).
(ii) n(0) = 0 and the limit-relation (62.1) holds for some norm on 1
and some norm
on 1
.
(iii) For every bounded subset o of 1 and every ^
Nhd
0
(1
) there is a
P
such that
n(sv) s^
Nhd
0
(1
) and a bounded
subset o of 1 be given. By Cor.1 to the Cell-Inclusion Theorem of Sect.52,
we can choose b P
such that
(v) b for all v o. (62.3)
By Prop.3 of Sect.53 we can choose P
such that
bCe(
) ^
. (62.4)
Applying Prop.4 of Sect.57 to the assumption (ii) we obtain P
such
that, for all u Domn,
)
for all s ], [ such that sv Domn. The desired conclusion (62.2) now
follows from (62.4).
(iii) (i): Assume that (iii) is valid. Let a norm on 1, a norm
on 1
, and P
:= Ce(
) and determine P
), which yields
(n(u))
(u)
< .
The assertion follows by applying Prop.4 of Sect.57.
The condition (iii) of Prop.1 states that
lim
s0
1
s
n(sv) = 0 (62.6)
for all v 1 and, roughly, that the limit is approached uniformly as v varies
in an arbitrary bounded set.
We also wish to make precise the intuitive idea that a mapping h from a
neighborhood of 0 in 1 to a neighborhood of 0 in 1
on 1
there is ^ Nhd
0
(1) and P
such that
).
Proposition 2: Let h be a mapping from a neighborhood of 0 in 1 to a
neighborhood of 0 in 1
).
214 CHAPTER 6. DIFFERENTIAL CALCULUS
(ii) There exists a norm on 1, a norm
on 1
, a neighborhood ^ of 0
in 1, and P
of 1
such that
h(sv) so
such that
(v) b for all v o.
By Prop.3 of Sect.53, we can determine P
)
for all s ], [ such that sv Domh. If we put o
:= bCe(
), we obtain
the desired conclusion (62.8).
(iii) (i): Assume that (iii) is valid. Let a norm on 1 and a norm
on 1
and
P
such that o
Ce(
).
We put ^ := Ce() Domh, which belongs to Nhd
0
(1). If we put s := 0
in (62.8) we obtain h(0) = 0, which shows that (62.7) holds for u := 0.
Now let u ^
(u)Ce(
),
which yields the assertion (62.7).
The following results are immediate consequences of the denition and
of the properties on linear mappings discussed in Sect.52:
62. SMALL AND CONFINED MAPPINGS 215
(I) Value-wise sums and value-wise scalar multiples of mappings that are
small [conned] near zero are again small [conned] near zero.
(II) Every mapping that is small near zero is also conned near zero, i.e.
Small(1, 1
) Conf(1, 1
).
(III) If h Conf(1, 1
) Conf(1, 1
).
(V) The only linear mapping that is small near zero is the zero-mapping,
i.e.
Lin(1, 1
) Small(1, 1
) = 0.
Proposition 3: Let 1, 1
, 1
) and k Conf(1
, 1
on 1, 1
, 1
and ^ Nhd
0
(1), ^
Nhd
0
(1
)
such that
((k h)(u)
(h(u))
(u) (62.10)
for all u ^ Domh such that h(u) ^
Domk) Nhd
0
(1) and hence ^ h
<
(^
Domk) Nhd
0
(1).
Thus, (62.7) remains satised when we replace h, and ^ by kh,
, and
^ h
<
(^
).
Assume now, that one of k and h, say h, is small. Let P
be given.
Then we can choose ^ Nhd
0
(1) such that
((kh)(u)) (u)
for all u ^ h
<
(^
Domk). Since P
((k h)(u))
(u)
= 0,
i.e. that k h is small near zero.
Now let c and c
, respec-
tively.
216 CHAPTER 6. DIFFERENTIAL CALCULUS
Denition 3: Let x c be given. We say that a mapping from a
neighborhood of x c to a neighborhood of 0 1
).
We say that a mapping from a neighborhood of x c to a neighborhood
of (x) c
on 1
there is ^ Nhd
x
(c) and
P
such that
) is conned near x.
(VIII) If a mapping is conned near x it is continuous at x.
(IX) A at mapping : c c
) and h Conf(1
, 1
).
(XIII) If is conned near x c and if is a mapping with Dom = Cod
that is small near (x) then Small
x
(c, 1
), where 1
is the
linear space for which Cod Nhd
0
(1
).
(XIV) An adjustment of a mapping that is small [conned] near x is again
small [conned] near x, provided only that the concept small [conned]
near x remains meaningful after the adjustment.
63. GRADIENTS, CHAIN RULE 217
Notes 62
(1) In the conventional treatments, the norms and
). The notation
h = O(), and the phrase h is big oh of , are often used to express the assertion
that h Conf(V, V
, respectively. We
consider a mapping : T T
of c
.
Proposition 1: Given x T, there can be at most one at mapping
: c c
is small
near x.
Proof: If the at mappings
1
,
2
both have this property, then the
value-wise dierence (
2
1
)[
D
= (
1
[
D
) (
2
[
D
) is small near
218 CHAPTER 6. DIFFERENTIAL CALCULUS
x c. Since
2
1
is at, it follows from (X) and (XIV) of Sect.62 that
1
is the zero mapping and hence that
1
=
2
.
Denition 1: The mapping : T T
is said to be dierentiable
at x T if there is a at mapping : c c
such that
[
D
Small
x
(c, 1
). (63.1)
This (unique) at mapping is then called the tangent to at x. The
gradient of at x is dened to be the gradient of and is denoted by
x
:= . (63.2)
We say that is dierentiable if it is dierentiable at all x T. If this
is the case, the mapping
: T Lin(1, 1
) (63.3)
dened by
()(x) :=
x
for all x T (63.4)
is called the gradient of . We say that is of class C
1
if it is dierentiable
and if its gradient is continuous. We say that is twice dierentiable
if it is dierentiable and if its gradient is also dierentiable. The gradient
of is then called the second gradient of and is denoted by
(2)
:= () : T Lin(1, Lin(1, 1
))
= Lin
2
(1
2
, 1
). (63.5)
We say that is of class C
2
if it is twice dierentiable and if
(2)
is
continuous.
If the subsets T and T
is dierentiable at x if [
E
Int D
is dierentiable
at x and we write
x
for
x
([
E
Int D
).
The dierentiability properties of a mapping remain unchanged if the
codomain of is changed to any open subset of c
in c
,
one may change the codomain to a subset that is open in T
. in that case,
the gradient of at a point x T must be replaced by the adjustment
x
[
U
of
x
, where |
.
The dierentiability and the gradient of a mapping at a point depend
only on the values of the mapping near that point. To be more precise, let
1
and
2
be two mappings whose domains are neighborhoods of a given x c
and whose codomains are open subsets of c
. Assume that
1
and
2
agree
63. GRADIENTS, CHAIN RULE 219
on some neighborhood of x, i.e. that
1
[
E
N
=
2
[
E
N
for some ^ Nhd
x
(c).
Then
1
is dierentiable at x if and only if
2
is dierentiable at x. If this
is the case, we have
x
1
=
x
2
.
Every at mapping : c c
)
whose value is the gradient of in the
sense of Sect.33. The gradient of a linear mapping is the constant whose
value is this linear mapping itself.
In the case when c := 1 := R and when T := I is an open interval,
dierentiability at t I of a process p : I T
to Lin(R, 1
), so that
t
p = (
t
p) and
t
p = (
t
p)1 (see Sect.25). If I is an interval that is not open and if t is an
endpoint of it, then the derivative of p : I T
at t, if it exists, cannot be
associated with a gradient.
If is dierentiable at x then it is conned and hence continuous at x.
This follows from the fact that its tangent , being at, is conned near x
and that the dierence [
D
, being small near x, is conned near x. The
converse is not true. For example, it is easily seen that the absolute-value
function (t [t[) : R R is conned near 0 R but not dierentiable at 0.
The following criterion is immediate from the denition:
Characterization of Gradients: The mapping : T T
is dif-
ferentiable at x T if and only if there is an L Lin(1, 1
) such that
n : (T x) 1, dened by
n(v) := ((x +v) (x)) Lv for all v T x, (63.6)
is small near 0 in 1. If this is the case, then
x
= L.
Let T, T
1
, T
2
be open subsets of at spaces c, c
1
, c
2
with translation
spaces 1, 1
1
, 1
2
, respectively. The following result follows immediately from
the denitions if we use the term-wise evaluations (04.13) and (14.12).
Proposition 2: The mapping (
1
,
2
) : T T
1
T
2
is dierentiable
at x T if and only if both
1
and
2
are dierentiable at x. If this is the
case, then
x
(
1
,
2
) = (
x
1
,
x
2
) Lin(1, 1
1
) Lin(1, 1
2
)
= Lin(1, 1
1
1
2
)
(63.7)
General Chain Rule: Let T, T
, T
, c
, 1
, respectively. If : T T
is
220 CHAPTER 6. DIFFERENTIAL CALCULUS
dierentiable at x T and if : T
x
( ) = (
(x)
)(
x
). (63.8)
If and are both dierentiable, so is , and we have
( ) = ( )(), (63.9)
where the product on the right is understood as value-wise composition.
Proof: Let be the tangent to at x and the tangent to at (x).
Then
:= [
D
Small
x
(c, 1
),
:= [
D
Small
(x)
(c
, 1
).
We have
= ( +) ( +)
= ( +) + ( +)
= + () + ( +)
where domain restriction symbols have been omitted to avoid clutter. It
follows from (VI), (IX), (XII), and (XIII) of Sect.62 that () + (+
) Small
x
(c, 1
).
If follows that is dierentiable at x with tangent . The assertion
(63.8) follows from the Chain Rule for Flat Mappings of Sect.33.
Let : T c
and : T c
is dierentiable at x and
x
( ) =
x
x
. (63.10)
This follows from Prop.2, the fact that the point-dierence (x
, y
) (x
)
is a at mapping from c
into 1
is dierentiable at x T and if L
Lin(1
x
(Lh) = L(
x
h). (63.11)
Using the fact that vector-addition, transpositions of linear mappings,
and the trace operations are all linear operations, we obtain the following
special cases of Prop.3:
(I) Let h : T 1
and k : T 1
both be dierentiable at x T.
Then the value-wise sum h +k : T 1
is dierentiable at x and
x
(h +k) =
x
h +
x
k. (63.12)
(II) Let and Z be linear spaces and let F : T Lin(, Z) be dif-
ferentiable at x. If F
: T Lin(Z
) is dened by value-wise
transposition, then F
is dierentiable at x T and
x
(F
)v = ((
x
F)v)
: I Lin(Z
), and
(F
= (F
. (63.14)
(III) Let be a linear space and let F : T Lin() be dierentiable at x.
If trF : T R is the value-wise trace of F, then trF is dierentiable
at x and
(
x
(trF))v = tr((
x
F)v) for all v 1. (63.15)
In particular, if I is an open interval and if F : I Lin() is dier-
entiable, so is trF : I R, and
(trF)
= tr(F
). (63.16)
We note three special cases of the General Chain Rule: Let I be an
open interval and let T and T
, respectively. If
p : I T and : T T
are dierentiable, so is p : I T
, and
( p)
= (() p)p
. (63.17)
222 CHAPTER 6. DIFFERENTIAL CALCULUS
If : T T
and f : T
((f) ) (63.18)
(see (21.3)). If f : T I and p : I T
f) f. (63.19)
Notes 63
(1) Other common terms for the concept of gradient of Def.1 are dierential,
Frechet dierential, derivative, and Frechet derivative. Some authors make
an articial distinction between gradient and dierential. We cannot use
derivative because, for processes, gradient and derivative are distinct though
related concepts.
(2) The conventional denitions of gradient depend, at rst view, on the prescription
of a norm. Many texts never even mention the fact that the gradient is a norm-
invariant concept. In some contexts, as when one deals with genuine Euclidean
spaces, this norm-invariance is perhaps not very important. However, when one
deals with mathematical models for space-time in the theory of relativity, the norm-
invariance is crucial because it shows that the concepts of dierential calculus have
a Lorentz-invariant meaning.
(3) I am introducing the notation
x
for the gradient of at x because the more
conventional notation (x) suggests, incorrectly, that (x) is necessarily the
value at x of a gradient-mapping . In fact, one cannot dene the gradient-
mapping without rst having a notation for the gradient at a point (see 63.4).
(4) Other notations for
x
in the literature are d(x), D(x), and
(x).
(5) I conjecture that the Chain of Chain Rule comes from an old terminology that
used chaining (in the sense of concatenation) for composition. The term
Composition Rule would need less explanation, but I retained Chain Rule
because it is more traditional and almost as good.
64 Constricted Mappings
In this section, T and T
, respectively.
Denition 1: We say that the mapping : T T
is constricted if
for every norm on 1 and every norm
on 1
there is P
such that
; it is
denoted by str(; ,
).
It is clear that
str(; ,
) = sup
_
((y) (x))
(y x)
x, y T, x ,= y
_
. (64.2)
Proposition 1: For every mapping : T T
on 1 and 1
, respectively, and P
such
that (64.1) holds for all x, y T.
(iii) For every bounded subset c of 1 and every ^
Nhd
0
(1
) there is
P
such that
x y sc = (x) (y) s^
(64.3)
for all x, y T and s P
.
Proof: (i) (ii): This implication is trivial.
(ii) (iii): Assume that (ii) holds, and let a bounded subset c of 1
and ^
Nhd
0
(1
such that
(u) b for all u c. (64.4)
By Prop.3 of Sect.53, we can choose P
) ^
. Now
let x, y T and s P
) s^
,
i.e. that (64.3) holds.
(iii) (i): Assume that (iii) holds and let a norm on 1 and a norm
on 1
:= Ce(
), which
implies
is constricted, it is conned
near every x T, as is evident from Prop.4 of Sect.62. The converse is not
true. For example, one can show that the function f : ]1, 1[R dened
by
f(t) :=
_
t sin(
1
t
) if t ]0, 1[
0 if t ]1, 0]
_
(64.5)
is not constricted, but is conned near every t ]1, 1[.
Every at mapping : c c
is constricted and
str(; ,
) = [[[[
,
,
where [[ [[
,
is the operator norm on Lin(1, 1
) corresponding to and
(see Sect.52).
Constrictedness and strictions remain unaected by a change of codo-
main. If the domain of a constricted mapping is restricted, then it remains
constricted and the striction of the restriction is less than or equal to the
striction of the original mapping. (Pardon the puns.)
Proposition 2: If : T T
Nhd
0
(1
) be given.
We choose a bounded neighborhood c of 0 in 1 and determine P
c Nhd
0
(1) and s :=
1
, we see that
(64.3) gives
x y ^ = (x) (y) ^
for all x, y T.
Pitfall: The converse of Prop.2 is not valid. A counterexample is the
square-root function
: P P, which is uniformly continuous but not
constricted. Another counterexample is the function dened by (64.5).
The following result is the most useful criterion for showing that a given
mapping is constricted.
Striction Estimate for Dierentiable Mappings: Assume that T
is an open convex subset of c, that : T T
) sup[[
z
[[
,
[ z T (64.6)
64. CONSTRICTED MAPPINGS 225
for all norms ,
on 1, 1
, respectively.
Proof: Let x, y T. Since T is convex, we have tx + (1 t)y T for
all t [0, 1] and hence we can dene a process
p : [0, 1] T
t
p = (
z
)(x y) with z := tx + (1 t)y T.
Applying the Dierence-Quotient Theorem of Sect.61 to p, we obtain
(x) (y) = p(1) p(0) Clo Cxh(
z
)(x y) [ z T. (64.7)
If ,
are norms on 1, 1
then, by (52.7),
((
z
)(x y)) [[
z
[[
,
(x y) (64.8)
for all z T. To say that has a bounded range is equivalent, by Cor.1
to the Cell-Inclusion Theorem of Sect.52, to
:= sup[[
z
[[
,
[ z T < . (64.9)
It follows from (64.8) that
((
z
)(xy) (xy) for all z T, which
can be expressed in the form
(
z
)(x y) (x y)Ce(
) for all z T.
Since the set on the right is closed and convex we get
Clo Cxh(
z
)(y x) [ z T (x y)Ce(
)
and hence, by (64.7), (x) (y) (xy)Ce(
on 1, 1
) =
226 CHAPTER 6. DIFFERENTIAL CALCULUS
0. Using (64.2), we conclude that
is of class C
1
. Let k be a compact subset of T. For every
norm on 1, every norm
on 1
, and every P
there exists P
such that k + Ce() T and such that, for every x k, the function
n
x
: Ce() 1
dened by
n
x
(v) := (x +v) (x) (
x
)v (64.10)
is constricted with
str(n
x
; ,
) . (64.11)
Proof: Let a norm on 1 be given. By Prop.6 of Sect.58, we can
obtain
1
P
such that k +
1
Ce() T. For each x k, we dene
m
x
:
1
Ce() 1
by
m
x
(v) := (x +v) (x) (
x
)v. (64.12)
Dierentiation gives
v
m
x
= (x +v) (x) (64.13)
for all x k and all v
1
Ce(). Since k +
1
Ce() is compact by Prop.6
of Sect.58 and since is continuous, it follows by the Uniform Continuity
Theorem of Sect.58 that [
k+
1
Ce()
is uniformly continuous. Now let a
norm
on 1
and P
such that
(y x) <
2
= [[(y) (x)[[
,
<
for all x, y k +
1
Ce(). In view of (64.13), it follows that
v
2
Ce() = [[
v
m
x
[[
,
<
for all x k and all v
1
Ce(). If we put := min
1
,
2
and if we dene
n
x
:= m
x
[
Ce()
for every x k, we see that [[
v
n
x
[[
,
[ v Ce()
is bounded by for all x k. By (64.12) and the Striction Estimate for
Dierentiable Mappings, the desired result follows.
64. CONSTRICTED MAPPINGS 227
Denition 2: Let : T T be a constricted mapping from a set T
into itself. Then
str() := infstr(; , ) [ a norm on 1 (64.14)
is called the absolute striction of . If str() < 1 we say that is a
contraction.
Contraction Fixed Point Theorem: Every contraction has at most
one xed point and, if its domain is closed and not empty, it has exactly one
xed point.
Proof: Let : T T be a contraction. We choose a norm on 1 such
that := str(; , ) < 1 and hence, by (64.1),
((x) (y)) (x y) for all x, y T. (64.15)
If x and y are xed points of , so that (x) = x, (y) = y, then
(64.15) gives (xy)(1 ) 0. Since 1 > 0 this is possible only when
(x y) = 0 and hence x = y. Therefore, can have at most one xed
point.
We now assume T ,= , choose s
0
T arbitrarily, and dene
s
n
:=
n
(s
0
) for all n N
kr
[
(s
n+k+1
s
n+k
)
and hence
(s
n+r
s
n
)
kr
[
n+k
(s
1
s
0
)
n
1
(s
1
s
0
).
Since < 1, we have lim
n
n
= 0, and it follows that for every P
from an open
subset T of c into an open subset T
of a at space c
with translation
space 1
. Given any x
2
c
2
, we dene (, x
2
) : c
1
c according to (04.21),
put
T
(,x
2
)
:= (, x
2
)
<
(T) = z c
1
[ (z, x
2
) T, (65.1)
which is an open subset of c
1
because (, x
1
) is at and hence continuous,
and dene (, x
2
) : T
(,x
2
)
T
according to (04.22). If (, x
2
) is dif-
ferentiable at x
1
for all (x
1
, x
2
) T we dene the partial 1-gradient
(1)
: T Lin(1
1
, 1
) of by
(1)
(x
1
, x
2
) :=
x
1
(, x
2
) for all (x
1
, x
2
) T. (65.2)
In the special case when c
1
:= R, we have 1
1
= R, and the partial
1-derivative
,1
: T 1
, dened by
,1
(t, x
2
) := ((, x
2
))(t) for all (t, x
2
) T, (65.3)
65. PARTIAL GRADIENTS, DIRECTIONAL DERIVATIVES 229
is related to
(1)
by
(1)
=
,1
and
,1
= (
(1)
)1 (value-wise).
Similar denitions are employed to dene T
(x
1
,)
, the mapping
(x
1
, ) : T
(x
1
,)
T
) and
the partial 2-derivative
,2
: T 1
.
The following result follows immediately from the denitions.
Proposition 1: If : T T
is dierentiable at x := (x
1
, x
2
) T,
then the gradients
x
1
(, x
2
) and
x
2
(x
1
, ) exist and
(
x
)v = (
x
1
(, x
2
))v
1
+ (
x
2
(x
1
, ))v
2
(65.4)
for all v := (v
1
, v
2
) 1.
If is dierentiable, then the partial gradients
(1)
and
(2)
exist
and we have
=
(1)
(2)
(65.5)
where the operation on the right is understood as value-wise application
of (14.13).
Pitfall: The converse of Prop.1 is not true: A mapping can have
partial gradients without being dierentiable. For example, the mapping
: R R R, dened by
(s, t) :=
_
st
s
2
+t
2
if (s, t) ,= (0, 0)
0 if (s, t) = (0, 0)
_
,
has partial derivatives at (0, 0) since both (, 0) and (0, ) are equal to the
constant 0. Since (t, t) =
1
2
for t R
and
(2)
exist and are continuous. The converse of this statement is true,
but the proof is highly nontrivial:
Proposition 2: Let c
1
, c
2
, c
an open subset of c
. A mapping : T T
is of class C
1
if
(and only if ) the partial gradients
(1)
and
(2)
exist and are continuous.
Proof: Let (x
1
, x
2
) T. Since T (x
1
, x
2
) is a neighborhood of (0, 0)
in 1
1
1
2
, we may choose /
1
Nhd
0
(1
1
) and /
2
Nhd
0
(1
2
) such that
/:= /
1
/
2
(T (x
1
, x
2
)).
We dene m : / 1
by
m(v
1
, v
2
) := ((x
1
, x
2
) + (v
1
, v
2
)) (x
1
, x
2
)
(
(1)
(x
1
, x
2
)v
1
+
(2)
(x
1
, x
2
)v
2
).
230 CHAPTER 6. DIFFERENTIAL CALCULUS
It suces to show that m Small(1
1
1
2
, 1
and k : / 1
are dened
by
h(v
1
, v
2
) := (x
1
, x
2
+v
2
) (x
1
, x
2
)
(2)
(x
1
, x
2
)v
2
,
k(v
1
, v
2
) := (x
1
+v
1
, x
2
+v
2
) (x
1
, x
2
+v
2
)
(1)
(x
1
, x
2
)v
1
.
_
(65.6)
The dierentiability of (x
1
, ) at x
2
insures that h belongs to
Small(1
1
1
2
, 1
).
We choose norms
1
,
2
, and
on 1
1
, 1
2
and 1
, respectively. Let
P
be given. Since
(1)
is continuous at (x
1
, x
2
), there are open convex
neighborhoods ^
1
and ^
2
of 0 in 1
1
and 1
2
, respectively, such that ^
1
/
1
, ^
2
/
2
, and
[[
(1)
((x
1
, x
2
) + (v
1
, v
2
))
(1)
(x
1
, x
2
) [[
1
,
<
for all (v
1
, v
2
) ^
1
^
2
. Noting (65.6), we see that this is equivalent to
[[
v
1
k(, v
2
) [[
1
,
< for all v
1
^
1
, v
2
^
2
.
Applying the Striction Estimate of Sect.64 to k(, v
2
)[
N
1
, we infer that
str(k(, v
2
)[
N
1
;
1
,
(k(v
1
, v
2
))
1
(v
1
) max
1
(v
1
),
2
(v
2
) for all v
1
^
1
, v
2
^
2
.
Since P
(k(v
1
, v
2
))
max
1
(v
1
),
2
(v
2
)
= 0,
which, in view of (62.1) and (51.23), shows that k Small(1
1
1
2
, 1
) as
required.
Remark: In examining the proof above one observes that one needs
merely the existence of the partial gradients and the continuity of one of
them in order to conclude that the mapping is dierentiable.
We now generalize Prop.2 to the case when c := (c
i
[ i I) is the
set-product of a nite family (c
i
[ i I) of at spaces. This product c is a
65. PARTIAL GRADIENTS, DIRECTIONAL DERIVATIVES 231
at space whose translation space is the set-product 1 := (1
i
[ i I) of
the translation spaces 1
i
of the c
i
, i I. Given any x c and j I, we
dene the mapping (x.j) : c
j
c according to (04.24).
We consider a mapping : T T
of c
) of by
(j)
(x) :=
x
j
(x.j) for all x T.
In the special case when c
j
:= R for some j I, the partial j-
derivative
,j
: T 1
, dened by
,j
(x) := ((x.j))(x
j
) for all x T, (65.8)
is related to
(j)
by
(j)
=
,j
and
,j
= (
(j)
)1.
In the case when I := 1, 2, these notations and concepts reduce to the
ones explained in the beginning (see (04.21)).
The following result generalizes Prop.1.
Proposition 3: If : T T
is dierentiable at x = (x
i
[ i I) T,
then the gradients
x
j
(x.j) exist for all j I and
(
x
)v =
jI
(
x
j
(x.j))v
j
(65.9)
for all v = (v
i
[ i I) 1.
If is dierentiable, then the partial gradients
(j)
exist for all j I
and we have
=
jI
(
(j)
), (65.10)
where the operation
. If moreover, c
:= R
K
with some nite index set K, then
232 CHAPTER 6. DIFFERENTIAL CALCULUS
lnc
(
,i
| iI)
can be identied with the function which assigns to each x T
the matrix
(
k,i
(x) [ (k, i) K I) R
KI
= Lin(R
I
, R
K
).
Hence can be identied with the matrix
= (
k,i
[ (k, i) K I). (65.12)
of the partial derivatives
k,i
: T R of the component functions. As we
have seen, the mere existence of these partial derivatives is not sucient
to insure the existence of . Only if is known a priori to exist is it
possible to use the identication (65.12).
Using the natural isomorphism between
(c
i
[ i I) and c
j
((c
i
[ i I j))
and Prop.2, one easily proves, by induction, the following generalization.
Partial Gradient Theorem: Let I be a nite index set and let c
i
, i
I and c
an open subset of c
. A mapping : T T
is of class C
1
if and only
if the partial gradients
(i)
exist and are continuous for all i I.
The following result is an immediate corollary:
Proposition 4: Let I and K be nite index sets and let T and T
be
open subsets of R
I
and R
K
, respectively. A mapping : T T
is of
class C
1
if and only if the partial derivatives
k,i
: T R exist and are
continuous for all k K and all i I.
Let c, c
, respectively.
Let T, T
if ,= 0
0 if = 0
_
at (0, 0). Using Prop.4 one easily shows that [
R
2
\{(0,0)}
is of class C
1
and
hence, by Prop.5, has directional derivatives in all directions at all (s, t)
R
2
. Nevertheless, since (s, s
2
) =
1
2
for all s R
is of class
C
1
if and only if the directional b-derivatives dd
b
: T 1
bb
b
b for all R
b
. Since b is a basis, is a at iso-
morphism. If we dene T :=
<
(T) R
b
and := [
D
D
: T T
, we
see that the directional b-derivatives of correspond to the partial deriva-
tives of . The assertion follows from the Partial Gradient Theorem, applied
to the case when I := b and c
i
:= R for all i I.
Combining Prop.5 and Prop.6, we obtain
Proposition 7: Assume that o Sub1 spans 1. The mapping
: T T
is of class C
1
if and only if the directional derivatives dd
v
:
T 1
v
2
) (Bv
1
) (66.1)
for all v
1
1
1
, v
2
1
2
.
Proof: Since B(v
1
, ) = Bv
1
: 1
2
(see (24.2)) is linear for each
v
1
1, the partial 2-gradient of B exists and is given by
(2)
B(v
1
, v
2
) =
Bv
1
for all (v
1
, v
2
) 1
1
1
2
. It is evident that
(2)
B : 1
1
1
2
Lin(1
2
, ) is linear and hence continuous. A similar argument shows that
(1)
B is given by
(1)
B(v
1
, v
2
) = B
v
2
for all (v
1
, v
2
) 1
1
1
2
and hence
is also linear and continuous. By the Partial Gradient Theorem of Sect.65, it
follows that B is of class C
1
. The formula (66.1) is a consequence of (65.5).
Let T be an open set in a at space c with translation space 1. If
h
1
: T 1
1
, h
2
: T 1
2
and B Lin
2
(1
1
1
2
, ) we write
B(h
1
, h
2
) := B (h
1
, h
2
) : T
so that
B(h
1
, h
2
)(x) = B(h
1
(x), h
2
(x)) for all x T.
66. THE GENERAL PRODUCT RULE 235
The following theorem, which follows directly from Prop.1 and the Gen-
eral Chain Rule of Sect.63, is called Product Rule because the term prod-
uct is often used for the bilinear forms to which the theorem is applied.
General Product Rule: Let 1
1
, 1
2
and be linear spaces and let
B Lin
2
(1
1
1
2
, ) be given. Let T be an open subset of a at space and
let mappings h
1
: T 1
1
and h
2
: T 1
2
be given. If h
1
and h
2
are both
dierentiable at x T so is B(h
1
, h
2
), and we have
x
B(h
1
, h
2
) = (Bh
1
(x))
x
h
2
+ (B
h
2
(x))
x
h
1
. (66.2)
If h
1
and h
2
are both dierentiable [of class C
1
], so is B(h
1
, h
2
), and we
have
B(h
1
, h
2
) = (Bh
1
)h
2
+ (B
h
2
)h
1
, (66.3)
where the products on the right are understood as value-wise compositions.
We now apply the General Product Rule to special bilinear mappings,
namely to the ordinary product in R and to the four examples given in
Sect.24. In the following list of results T denotes an open subset of a at
space having 1 as its translation space; ,
and
+ ()
h. (66.6)
(IV) If F : T Lin(,
x
(Fh)v = ((
x
F)v)h(x) + (F(x)
x
h)v (66.7)
for all v 1. If F and h are dierentiable [of class C
1
], so is Fh and
(66.7) holds for all x T, v 1.
236 CHAPTER 6. DIFFERENTIAL CALCULUS
(V) If F : T Lin(,
) and G : T Lin(
) are dierentiable
at x T, so is GF, dened by value-wise composition, and we have
x
(GF)v = ((
x
G)v)F(x) +G(x)((
x
F)v) (66.8)
for all v 1. If F and G are dierentiable [of class C
1
], so is GF, and
(66.8) holds for all x T, v 1.
If is an inner-product space, then
k + (k)
h. (66.9)
In the case when T reduces to an interval in R, (66.4) becomes
the Product Rule (08.32)
2
of elementary calculus. The following for-
mulas apply to dierentiable processes f, h, , F, and G with values in
R, ,
, Lin(,
), and Lin(
), respectively:
(fh)
= f
h + fh
, (66.10)
(h)
h +h
, (66.11)
(Fh)
= F
h +Fh
, (66.12)
(GF)
= G
F +GF
. (66.13)
If h and k are dierentiable processes with values in an inner product space,
then
(h k)
= h
k +h k
. (66.14)
Proposition 2: Let be an inner-product space and let R : T Lin
be given. If R is dierentiable at x T and if Rng R Orth then
R(x)
((
x
R)v) Skew for all v 1. (66.15)
Conversely, if R is dierentiable, if T is convex, if (66.15) holds for all
x T, and if R(q) Orth for some q T then Rng R Orth.
Proof: Assume that R is dierentiable at x T. Using (66.8) with the
choice F := R and G := R
we nd that
x
(R
R)v = ((
x
R
)v)R(x) +R
(x)((
x
R)v)
66. THE GENERAL PRODUCT RULE 237
holds for all v 1. By (63.13) we obtain
x
(R
R)v = W
+W with W := R
(x)((
x
R)v) (66.16)
for all v 1. Now, if Rng R Orth1 then R
R = 1
W
is constant (see
Prop.2 of Sect.43). Hence the left side of (66.16) is zero and W must be
skew for all v 1, i.e. (66.15) holds.
Conversely, if (66.15) holds for all x T, then, by (66.16)
x
(R
R) = 0
for all x T. Since T is convex, we can apply Prop.3 of Sect.64 to conclude
that R
R)(q) =
1
W
, then R
R is the constant 1
W
, i.e. Rng R Orth.
Remark: In the second part of Prop.2, the condition that T be con-
vex can be replaced by the one that T be connected as explained in the
Remark after Prop.3 of Sect.64.
Corollary: Let I be an open interval and let R : I Lin be a
dierentiable process. Then Rng R Orth if and only if R(t) Orth
for some t I and Rng (R
) Skew.
The following result shows that quadratic forms (see Sect.27) are of class
C
1
and hence continuous.
Proposition 3: Let 1 be a linear space. Every quadratic form Q : 1
R is of class C
1
and its gradient Q : 1 1
Q . Since Q is
symmetric, this reduces to (66.17).
Let 1 be a linear space. The lineonic nth power pow
n
: Lin1 Lin1
on the algebra Lin1 of lineons (see Sect.18) is dened by
pow
n
(L) := L
n
for all n N. (66.18)
The following result is a generalization of the familiar dierentiation rule
(
n
)
= n
n1
.
238 CHAPTER 6. DIFFERENTIAL CALCULUS
Proposition 4: For every n N the lineonic nth power pow
n
on
Lin1 dened by (66.18) is of class C
1
and, for each L Lin1, its gradi-
ent
L
pow
n
Lin(Lin1) at L Lin1 is given by
(
L
pow
n
)M =
kn
]
L
k1
ML
nk
for all M Lin1. (66.19)
Proof: For n = 0 the assertion is trivial. For n = 1, we have pow
1
=
1
LinV
and hence
L
pow
1
= 1
LinV
for all L Lin1, which is consistent with
(66.19). Also, since pow
1
is constant, pow
1
is of class C
1
. Assume, then,
that the assertion is valid for a given n R. Since pow
n+1
= pow
1
pow
n
holds in terms of value-wise composition, we can apply the result (V) above
to the case when F := pow
n
, G := pow
1
in order to conclude that pow
n+1
is of class C
1
and that
(
L
pow
n+1
)M = ((
L
pow
1
)M)pow
n
(L) + pow
1
(L)((
L
pow
n
)M)
for all M Lin1. Using (66.18) and (66.19) we get
(
L
pow
n+1
)M = ML
n
+L(
kn
]
L
k1
ML
nk
),
which shows that (66.19) remains valid when n is replaced by n + 1. The
desired result follows by induction.
If M commutes with L, then (66.19) reduces to
(
L
pow
n
)M = (nL
n1
)M.
Using Prop.4 and the form (63.17) of the Chain Rule, we obtain
Proposition 5: Let I be an interval and let F : I Lin1 be a process. If
F is dierentiable [of class C
1
], so is its value-wise nth power F
n
: I Lin1
and we have
(F
n
)
kn
]
F
k1
F
F
nk
for all n N. (66.20)
67 Divergence, Laplacian
In this section, T denotes an open subset of a at space c with translation
space 1, and denotes a linear space.
67. DIVERGENCE, LAPLACIAN 239
If h : T 1 is dierentiable at x T, we can form the trace (see
Sect.26) of the gradient
x
h Lin1. The result
div
x
h := tr(
x
h) (67.1)
is called the divergence of h at x. If h is dierentiable we can dene the
divergence div h : T R of h by
(div h)(x) := div
x
h for all x T. (67.2)
Using the product rule (66.5) and (26.3) we obtain
Proposition 1: Let h : T 1 and f : T R be dierentiable. Then
the divergence of fh : T 1 is given by
div (fh) = (f)h + fdiv h. (67.3)
Consider now a mapping H : T Lin(1
, ) that is dierentiable
at x T. For every
, R) = 1
= 1 (see Sect.22). Since H is dieren-
tiable at x for every
R.
It is clear that this mapping is linear and hence an element of
= .
Denition 1: Let H : T Lin(1
, ) be dierentiable at x T. Then
the divergence of H at x is dened to be the (unique) element div
x
H of
which satises
(div
x
H) = tr(
x
(H)) for all
. (67.4)
If H is dierentiable, then its divergence div H : T is dened by
(div H)(x) := div
x
H for all x T. (67.5)
In the case when := R, we also have
= R and Lin(1
, ) =
1
= 1. Thus, using (67.4) with := 1 R, we see that the denition
just given is consistent with (67.1) and (67.2).
If we replace h and L in Prop.3 of Sect.63 by H and
(K K) Lin(Lin(1
, ), 1),
respectively, we obtain
x
(H) =
x
H for all
, (67.6)
240 CHAPTER 6. DIFFERENTIAL CALCULUS
where the right side must be interpreted as the composite of
x
H Lin
2
(1 1
, R)
= Lin(1, Lin(1
, R)) = Lin(1, 1
)
= Lin1
to make (67.6) meaningful. Using (67.6), (67.4), and (67.1) we see that
div H satises
(div
x
H) = div
x
(H) = tr(
x
H) (67.7)
for all
.
The following results generalize Prop.1.
Proposition 2: Let H : T Lin(1
, ) and : T
, R)
= 1 is given by
div (H) = div H+ tr(H
), (67.8)
where value-wise evaluation and composition are understood.
Proof: Let x T and v 1 be given. Using (66.8) with the choices
G := and F := H we obtain
x
(H)v = ((
x
)v)H(x) +(x)((
x
H)v). (67.9)
Since (
x
)v
, (21.3) gives
((
x
)v)H(x) = (H(x)
x
)v.
On the other hand, we have
(x)((
x
H)v) = ((x)(
x
H))v
if we interpret
x
H on the right as an element of Lin
2
(1 1
, ). Hence,
since v 1 was arbitrary, (67.9) gives
x
(H) = (x)
x
H+H(x)
x
.
Taking the trace, using (67.1), and using (67.7) with := (x), we get
div
x
(H) = (x)div
x
H+ tr(H(x)
x
).
Since x T was arbitrary, the desired result (67.8) follows.
Proposition 3: Let h : T 1 and k : T be dierentiable. The
divergence of k h : T Lin(1
. Hence, by Prop.1,
div (k h) = ((k))h + (k)div h = ((k)h +k div h)
holds for all
, ) and f : T R be dieren-
tiable. The divergence of fH : T Lin(1
, ) is then given by
div (fH) = fdiv H+H(f). (67.11)
Proof: Let
(f)). (67.12)
Using (f) = f and using (25.9), (26.3), and (22.3), we obtain
tr(H
(f)) = tr(H
( f)) = f(H
was
arbitrary, (67.11) follows.
From now on we assume that c has the structure of a Euclidean space,
so that 1 becomes an inner-product space and we can use the identication
1
= 1 (see Sect.41). Thus, if k : T is twice dierentiable, we can
consider the divergence of k : T Lin(1, )
= Lin(1
, ).
Denition 2: Let k : T be twice dierentiable. Then the Lapla-
cian of k is dened to be
k := div (k). (67.13)
If k = 0, then k is called a harmonic function.
Remark: In the case when c is not a genuine Euclidean space, but one
whose translation space has index 1 (see Sect.47), the term DAlembertian
or Wave-Operator and the symbol are often used instead of Laplacian and
.
The following result is a direct consequence of (66.5) and Props.3 and 4.
Proposition 5: Let k : T and f : T R be twice dierentiable.
The Laplacian of fk : T is then given by
(fk) = (f)k + fk + 2(k)(f). (67.14)
242 CHAPTER 6. DIFFERENTIAL CALCULUS
The following result follows directly from the form (63.19) of the Chain
Rule and Prop.3.
Proposition 6: Let I be an open interval and let f : T I and
g : I both be twice dierentiable. The Laplacian of g f : T is
then given by
(g f) = (f)
2
(g
f) + (f)(g
f). (67.15)
We now discuss an important application of Prop.6. We consider the
case when f : T I is given by
f(x) := (x q)
2
for all x T, (67.16)
where q c is given. We wish to solve the following problem: How can T, I
and g : I be chosen such that g f is harmonic?
It follows directly from (67.16) and Prop.3 of Sect.66, applied to Q := sq,
that
x
f = 2(xq) for all x T and that hence (f)
2
= 4f. Also, (f)
is the constant with value 21
V
and hence, by (26.9),
f = div (f) = 2tr1
V
= 2 dim1 = 2 dimc.
Substitution of these results into (67.15) gives
(g f) = 4f(g
f) + 2(dimc)g
f. (67.17)
Hence, g f is harmonic if and only if
(2g
+ ng
+ ng
we may take T := x c [ (x
q)
2
> 0 and I := P
.
Notes 67
(1) In some of the literature on Vector Analysis, the notation h instead of div h
is used for the divergence of a vector eld h, and the notation
2
f instead of f
for the Laplacian of a function f. These notations come from a formalistic and
erroneous understanding of the meaning of the symbol and should be avoided.
68 Local Inversion, Implicit Mappings
In this section, T and T
with
translation spaces 1 and 1
, respectively.
To say that a mapping : T T
is dierentiable at x T means,
roughly, that may be approximated, near x, by a at mapping , the
tangent of at x. One might expect that if the tangent is invertible, then
itself is, in some sense, locally invertible near x. To decide whether
this expectation is justied, we must rst give a precise meaning to locally
invertible near x.
Denition: Let a mapping : T T
N
)
= Dom of T and T
N
= 1
N
and [
N
N
= 1
N
(68.1)
for suitable open subsets ^ = Cod and ^
= Dom of T and T
, respec-
tively.
If
1
and
2
are local inverses of and / := (Rng
1
) (Rng
2
), then
1
and
2
must agree on
>
(/) =
<
1
(/) =
<
2
(/), i.e.
1
[
M
>
(M)
=
2
[
M
>
(M)
if / := (Rng
1
) (Rng
2
). (68.2)
Pitfall: A mapping : T T
is dierentiable at x T and
locally invertible near x. Then, if some local inverse near x is dierentiable
at (x), so are all others and all have the same gradient, namely (
x
)
1
.
Proof: We choose a local inverse
1
of near x such that
1
is dier-
entiable at (x). Applying the Chain Rule to (68.1) gives
(
(x)
1
)(
x
) = 1
V
and (
x
)(
(x)
1
) = 1
V
,
which shows that
x
is invertible and
(x)
1
= (
x
)
1
.
Let now a local inverse
2
of near x be given. Let / be dened as in
(68.2). Since /is open and since
1
, being dierentiable at x, is continuous
at x, it follows that
>
(/) =
<
1
(/) is a neighborhood of (x) in c
. By
(68.2),
2
agrees with
1
on the neighborhood
>
(/) of (x). Hence
2
must also be dierentiable at (x), and
(x)
2
=
(x)
1
= (
x
)
1
.
Proposition 2: Assume that : T T
Dom, then
>
(/
) =
<
(/
>
(M
)
M
<
(M)
is again
a local inverse of near x.
Proof: Part (i) is an immediate consequence of Prop.3 of Sect.56. To
prove (ii), we observe rst that
<
(Rng () must be a neighborhood of
(x) =
of
<
(Rng () with
68. LOCAL INVERSION, IMPLICIT MAPPINGS 245
(x) /
).
Remark: It is in fact true that if : T T
0 if t = 0
_
. (68.3)
It is dierentiable and hence continuous. Since f
(s) < 0.
Let I be an open interval and let f : I R be a function of class C
1
.
If the tangent to f at a given t I is invertible, i.e. if f
be of class C
1
and let x T be such that the tangent to at
x is invertible. Then is locally invertible near x and every local inverse of
near x is dierentiable at (x).
Moreover, there exists a local inverse of near x that is of class C
1
and satises
y
= (
(y)
)
1
for all y Dom. (68.4)
Before proceeding with the proof, we state two important results that
are closely related to the Local Inversion Theorem.
Implicit Mapping Theorem: Let c, c
, c
be at spaces, let / be an
open subset of c c
and let : / c
be a mapping of class C
1
. Let
(x
o
, y
o
) /, z
o
:= (x
o
, y
o
), and assume that
y
o
(x
o
, ) is invertible.
Then there exist an open neighborhood T of x
o
and a mapping : T c
,
dierentiable at x
o
, such that (x
o
) = y
o
and Gr() /, and
(x, (x)) = z
o
for all x T. (68.5)
Moreover, T and can be chosen such that is of class C
1
and
(x) = (
(2)
(x, (x)))
1
(1)
(x, (x)) for all x T. (68.6)
246 CHAPTER 6. DIFFERENTIAL CALCULUS
Remark: The mapping is determined implicitly by an equation in
this sense: for any given x T, (x) is a solution of the equation
? y c
, (x, y) = z
o
. (68.7)
In fact, one can nd a neighborhood / of (x
o
, y
o
) in c c
[ (x, y) /.
Dierentiation Theorem for Inversion Mappings: Let 1 and 1
) of all linear
isomorphorisms from 1 onto 1
),
the inversion mapping inv : Lis(1, 1
) Lin(1
, 1) dened by inv(L) :=
L
1
is of class C
1
, and its gradient is given by
(
L
inv)M = L
1
ML
1
for all M Lin(1, 1
). (68.9)
The proof of these three theorems will be given in stages, which will be
designated as lemmas. After a basic preliminary lemma, we will prove a
weak version of the rst theorem and then derive from it weak versions of
the other two. The weak version of the last theorem will then be used, like
a bootstrap, to prove the nal version of the rst theorem; the nal versions
of the other two theorems will follow.
Lemma 1: Let 1 be a linear space, let be a norm on 1, and put
B := Ce(). Let f : B 1 be a mapping with f (0) = 0 such that f 1
BV
is constricted with
:= str(f 1
BV
; , ) < 1. (68.10)
Then the adjustment f [
(1)B
(see Sect.03) of f has an inverse and this
inverse is conned near 0.
Proof: We will show that for every w (1 )B, the equation
? z B, f (z) = w (68.11)
has a unique solution.
To prove uniqueness, suppose that z
1
, z
2
B are solutions of (68.11) for
a given w 1. We then have
(f 1
BV
)(z
1
) (f 1
BV
)(z
2
) = z
2
z
1
68. LOCAL INVERSION, IMPLICIT MAPPINGS 247
and hence, by (68.10), (z
2
z
1
) (z
2
z
1
), which amounts to
(1 )(z
2
z
1
) 0. This is compatible with < 1 only if (z
1
z
2
) = 0
and hence z
1
= z
2
.
To prove existence, we assume that w (1 )B is given and we put
:=
(w)
1
, so that [0, 1[. For every v B, we have (v) and
hence, by (68.10)
((f 1
BV
)(v)) (v) ,
which implies that
(w(f (v) v)) (w) + ((f 1
BV
)(v))
(1 ) + = .
Therefore, it makes sense to dene h
w
: B B by
h
w
(v) := w(f (v) v) for all v B. (68.12)
Since h
w
is the dierence of a constant with value w and a restriction of
f 1
BV
, it is constricted and has an absolute striction no greater than
. Therefore, it is a contraction, and since its domain B is closed, the
Contraction Fixed Point Theorem states that h
w
has a xed point z B.
It is evident from (68.12) that a xed point of h
w
is a solution of (68.11).
We proved that (68.11) has a unique solution z B and that this solution
satises
(z) :=
(w)
1
. (68.13)
If we dene g : (1 )B f
<
((1 )B) in such a way that g(w) is this
solution of (68.11), then g is the inverse we are seeking. It follows from
(68.13) that (g(w))
1
1
(w) for all w (1 )B. In view of part (ii)
of Prop.2 of Sect.62, this proves that g is conned.
Lemma 2: The assertion of the Local Inversion Theorem, except possi-
bly the statement starting with Moreover, is valid.
Proof: Let : c c
[
D
[
x+B
(x +1
V
)[
x+B
B
) x. (68.14)
Since is of class C
1
, so is f . Clearly, we have
f (0) = 0,
0
f = 1
V
. (68.15)
248 CHAPTER 6. DIFFERENTIAL CALCULUS
Put := no
B
, so that B = Ce(). Choose ]0, 1[ ( :=
1
2
would do). If we
apply Prop.4 of Sect.64 to the case when there is replaced by f and k by
0, we see that there is ]0, 1] such that
:= str(f [
B
1
BV
; , ) . (68.16)
Now, it is clear from (64.2) that a striction relative to and
remains
unchanged if and
. Since Ce(
1
) =
Ce() = B (see Prop.6 of Sect.51), it follows from (68.16) that Lemma
1 can be applied when , B, and f there are replaced by
1
, B and f [
B
,
respectively. We infer that if we put
/
o
:= (1 )B, ^
o
:= f
<
(/
o
),
then f [
M
o
N
o
has an inverse g : /
o
^
o
that is conned near zero. Note
that /
o
and ^
o
are both open neighborhoods of zero, the latter because it
is the pre-image of an open set under a continuous mapping. We evidently
have
g[
V
1
M
o
V
= (1
N
o
V
f [
N
o
) g.
In view of (68.15), 1
N
o
V
f [
N
o
is small near 0 1. Since g is conned near
0, it follows by Prop.3 of Sect.62 that g[
V
1
M
o
V
is small near zero. By the
Characterization of Gradients of Sect.63, we conclude that g is dierentiable
at 0 with
0
g = 1
V
.
We now dene
^ := x +^
o
, ^
:= (x) + ()
>
(/
o
).
These are open neighborhoods of x and (x), respectively. A simple calcu-
lation, based on (68.14) and g = (f [
M
o
N
o
)
, shows that
:= (x + 1
V
)[
N
N
o
g (
x)[
M
o
N
(68.17)
is the inverse of [
N
N
. Using the Chain Rule and the fact that g is dif-
ferentiable at 0, we conclude from (68.17) that is dierentiable at (x).
Lemma 3: The assertion of the Implicit Mapping Theorem, except pos-
sibly the statement starting with Moreover, is valid.
Proof: We dene : / c c
by
(x, y) := (x, (x, y)) for all (x, y) /. (68.18)
68. LOCAL INVERSION, IMPLICIT MAPPINGS 249
Of course, is of class C
1
. Using Prop.2 of Sect.63 and (65.4), we see that
(
(x
o
,y
o
)
)(u, v) = (u, (
x
o
(, y
o
))u + (
y
o
(x
o
, ))v)
for all (u, v) 1 1
. Since
y
o
(x
o
, ) is invertible, so is
(x
o
,y
o
)
. In-
deed,we have
(
(x
o
,y
o
)
)
1
(u, w) = (u, (
y
o
(x
o
, ))
1
(w(
x
o
(, y
o
))u))
for all (u, w) 11
, where T and T
c and
: T T
.
Then, by (68.18),
((x, y)) = ((x, z), ((x, z),
. (68.19)
Since (x
o
, z
o
) = ((x
o
, z
o
),
(x
o
, z
o
)) = (x
o
, y
o
), we have
(x
o
, z
o
) = y
o
.
Therefore, if we dene : T c
by (x) :=
(x, z
o
), we see that (x
o
) =
y
o
and (68.5) are satised. The dierentiability of at x
o
follows from the
dierentiablity of at (x
o
, z
o
).
If we dene / := Rng , it is immediate that (x) is the only solution
of (68.8).
Lemma 4: Let 1 and 1
) Lin(1
, 1) dened by inv(L) := L
1
is dierentiable.
Proof: We apply Lemma 3 with c replaced by Lin(1, 1
), c
by
Lin(1
, 1), and c
) Lin(1
).
It is clear that Ri
L
is invertible if and only if L is invertible. (We have
(Ri
L
)
1
= Ri
L
1 if this is the case.) Thus, given L
o
Lis(1, 1
), we can
250 CHAPTER 6. DIFFERENTIAL CALCULUS
apply Lemma 3 with L
o
, L
1
o
, and 1
V
playing the roles of x
o
, y
o
, and z
o
,
respectively. We conclude that there is an open neighborhood T of L
o
in
Lin(1, 1
, 1), LM = 1
V
has a solution for every L T. Since this solution can only be L
1
= inv(L),
it follows that the role of the mapping of Lemma 3 is played by inv[
D
.
Therefore, we must have T Lis(1, 1
) is open
and that inv is dierentiable.
Completion of Proofs: We assume that hypotheses of the Local Inver-
sion Theorem are satised. The assumption that has an invertible tangent
at x T is equivalent to
x
Lis(1, 1
) Lin(1
, 1) is
dierentiable and hence continuous by Lemma 4, since is continuous by
assumption, and since is continuous because it is dierentiable, it follows
that the composite inv [
Lis(V,V
)
Rng
is continuous. By (68.4), this com-
posite is none other than , and hence the proof of the Local Inversion
Theorem is complete.
Assume, now, that the hypotheses of the Implicit Mapping Theorem
are satised. By the Local Inversion Theorem, we can then choose a local
inverse of , as dened by (68.18), that is of class C
1
. It then follows that
the function : T c
be of class C
1
and
surjective. If f has an invertible tangent at every t I, then f is not only
69. EXTREME VALUES, CONSTRAINTS 251
locally invertible near every t I but (globally) invertible, as is easily seen.
This conclusion does not generalize to the case when I and I
are replaced by
connected open subsets of at spaces of higher dimension. The curvilinear
coordinate systems discussed in Sect.74 give rise to counterexamples. One
can even give counterexamples in which I and I
<
({(x)})
has an extremum at x is often expressed by
saying that f attains an extremum at x subject to the constraint that be
constant.
Constrained-Extremum Theorem: Assume that f : T R and
: T T are both of class C
1
and that
x
Lin(1, ) is surjective for
a given x T. If f[
<
( (x)})
attains a local extremum at x, then
x
f Rng (
x
)
. (69.1)
Proof: We put | := Null
x
and choose a supplement Z of | in 1.
Every neighborhood of 0 in 1 includes a set of the form ^ +/, where ^
is an open neighborhood of 0 in | and / is an open neighborhood of 0 in
Z. Since T is open, we may select ^ and / such that x + ^ + / T.
We now dene : ^ / T by
(u, z) := (x +u +z) for all u ^, z /. (69.2)
It is clear that is of class C
1
and we have
0
(0, ) =
x
[
Z
. (69.3)
Since
x
is surjective, it follows from Prop.5 of Sect.13 that
x
[
Z
is
invertible. Hence we can apply the Implicit Mapping Theorem of Sect.68 to
the case when / there is replaced by ^ /, the points x
o
, y
o
by 0, and z
o
by (x). We obtain an open neighborhood ( of 0 in | with ( ^ and a
mapping h : ( Z, of class C
1
, such that h(0) = 0 and
(u, h(u)) = (x) for all u (. (69.4)
Since
(1)
(0, 0) =
0
(, 0) =
x
[
U
= 0 by the denition
| := Null
x
, the formula (68.6) yields in our case that
0
h = 0, i.e.
we have
h(0) = 0,
0
h = 0. (69.5)
69. EXTREME VALUES, CONSTRAINTS 253
It follows from (69.2) and (69.4) that x + u + h(u)
<
((x)) for all
u ( and hence that
(u f(x +u +h(u))) : ( R
has a local extremum at 0. By the Extremum Theorem and the Chain Rule,
it follows that
0 =
0
(u f(x +u +h(u))) =
x
f(1
UV
+
0
h[
V
),
and therefore, by (69.5), that
x
f[
U
= (
x
f)1
UV
= 0, which means that
x
f |
= (Null
x
)
x
g ,= 0 for a given x T. If f[
g
<
({g(x)})
attains a local extremum at x,
then there is R such that
x
f =
x
g. (69.6)
In the case when T := R
I
, we can use (23.19) and obtain
Corollary 2: Assume that f : T R and all terms g
i
: T R
in a nite family g := (g
i
[ i I) : T R
I
are of class C
1
and that
(
x
g
i
[ i I) is linearly independent for a given x T. If f[
g
<
({g(x)})
attains a local extremum at x, then there is R
I
such that
x
f =
iI
x
g
i
. (69.7)
Remark 1: If c is two-dimensional then Cor.1 can be given a geo-
metrical interpretation as follows: The sets L
g
:= g
<
(g(x)) and L
f
:=
f
<
(f(x)) are the level-lines through x of f and g, respectively (see
Figure).
254 CHAPTER 6. DIFFERENTIAL CALCULUS
If these lines cross at x, then the value f(y) of f at y L
g
strictly
increases or decreases as y moves along L
g
from one side of the line L
f
to
the other and hence f[
L
g
cannot have an extremum at x. Hence, if f[
L
g
has an extremum at x, the level-lines must be tangent. The assertion (69.6)
expresses this tangency. The condition
x
g ,= 0 insures that the level-line
L
g
does not degenerate to a point.
Notes 69
(1) The number of (69.6) or the terms
i
occuring in (69.7) are often called Lagrange
multipliers.
610 Integral Representations
Let I SubR be some genuine interval and let h : I 1 be a continuous
process with values in a given linear space 1. For every 1
the composite
h : I R is then continuous. Given a, b I one can therefore form the
integral
_
b
a
h in the sense of elementary integral calculus (see Sect.08). It
is clear that the mapping
(
_
b
a
h) : 1
R
is linear and hence an element of 1
= 1.
Denition 1: Let h : I 1 be a continuous process with values in the
linear space 1. Given any a, b I the integral of h from a to b is dened
to be the unique element
_
b
a
h of 1 which satises
_
b
a
h =
_
b
a
h for all 1
. (610.1)
Proposition 1: Let h : I 1 be a continuous process and let L be a
linear mapping from 1 to a given linear space . Then Lh : I is a
continuous process and
L
_
b
a
h =
_
b
a
Lh for all a, b I. (610.2)
Proof: Let
_
b
a
h
. (610.3)
Proof: The continuity of h follows from the continuity of (Prop.7
of Sect.56) and the Composition Theorem for Continuity of Sect.56.
It follows from (52.14) that
[h[(t) = [h(t)[
()(h(t)) =
()( h)(t)
holds for all 1
).
Using this result and (610.1), (08.40), and (08.42), we obtain
_
b
a
h
_
b
a
h
_
a
b
[h[
_
b
a
h
= h.
Proof: Let 1
= p
.
Hence, the elementary Fundamental Theorem of Calculus (see Sect.08) gives
p
= h. Since 1
= h, which is
continuous by assumption.
In the remainder of this section, we assume that the following items are
given: (i) a at space c with translation space 1, (ii) An open subset T of
c, (iii) An open subset T of R c such that, for each x T,
T
(,x)
:= s R [ (s, x) T
is a non-empty open inteval, (iv) a linear space .
Consider now a mapping h : T such that h(, x) : T
(,x)
is
continuous for all x T. Given a, b R such that a, b T
(,x)
for all x T,
we can then dene k : T by
k(x) :=
_
b
a
h(, x) for all x T. (610.6)
We say that k is dened by an integral representation.
Proposition 3: If h : T is continuous, so is the mapping
k : T dened by the integral representation (610.6).
Proof: Let x T be given. Since T is open, we can choose a norm
on 1 such that x + Ce() T. We may assume, without loss of generality,
that a < b. Then [a, b] is a compact interval and hence, in view of Prop.4
of Sect.58, [a, b] (x + Ce()) is a compact subset of T. By the Uniform
Continuity Theorem of Sect. 58, it follows that the restriction of h to [a, b]
(x +Ce()) is uniformly continuous. Let be a norm on and let P
_
b
a
(h(, y) h(, x))
(b a)
b a
=
whenever (y x) < . Since P
such that
^ := ((a+], [ ) (b+], [ )) ^
is a subset of T and such that the restriction of h to ^ is bounded by
some P
__
a
s
h(, y) +
_
t
b
h(, y)
_
<
2
(610.9)
for all y ^, all s a + ], ], and all t b + ], ]. On the other hand,
by Prop.3, we can determine / Nhd
x
(T) such that
__
b
a
h(, y)
_
b
a
h(, x)
_
<
2
(610.10)
for all y /. By (610.4) we have
_
t
s
h(, y)
_
b
a
h(, x) =
_
b
a
h(, y)
_
b
a
h(, x)
+
_
a
s
h(, y) +
_
t
b
h(, y).
Hence it follows from (610.9) and (610.10) that
__
t
s
h(, y)
_
b
a
h(, x)
_
<
258 CHAPTER 6. DIFFERENTIAL CALCULUS
for all y /^ Nhd
x
(T), all s a+ ], ], and all t b+ ], [. Since
P
(2)
h : T Lin(1, ) is continuous.
Then the mapping k : T dened by the integral representation
(610.6) is of class C
1
and its gradient is given by the integral representation
x
k =
_
b
a
(2)
h(, x) for all x T. (610.11)
Roughly, this theorem states that if h satises the conditions (i) and (ii),
one can dierentiate (610.6) with respect to x by dierentiating under the
integral sign.
Proof: Let x T be given. As in the proof of Prop.3, we can choose
a norm on 1 such that x + Ce() T, and we may assume that a < b.
Then [a, b]
_
x + Ce()
_
is a compact subset of T. By the Uniform Conti-
nuity Theorem of Sect.58, the restriction of
(2)
h to [a, b]
_
x + Ce()
_
is
uniformly continuous. Hence, if a norm on and P
are given, we
can determine ]0, 1 ] such that
|
(2)
h(s, x +u)
(2)
h(s, x)|
,
<
b a
(610.12)
holds for all s [a, b] and u Ce().
We now dene n :
_
T (0, x)
_
by
n(s, u) := h(s, x +u) h(s, x)
_
(2)
h(s, x)
_
u. (610.13)
It is clear that
(2)
n exists and is given by
(2)
n(s, u) =
(2)
h(s, x +u)
(2)
h(s, x).
By (610.12), we hence have
|
(2)
n(s, u)|
,
<
b a
610. INTEGRAL REPRESENTATIONS 259
for all s [a, b] and u Ce(). By the Striction Estimate for Dierentiable
Mapping of Sect.64, it follows that n(s, )[
Ce()
is constricted for all s [a, b]
and that
str(n(s, )[
Ce()
; , )
b a
for all s [a, b].
Since n(s, 0) = 0 for all s [a, b], the denition (64.1) shows that
(n(s, u))
b a
(u) for all s [a, b]
and all u Ce(). Using Prop.2, we conclude that
__
b
a
n(, u)
_
_
b
a
(n(, u)) (u)
whenever u Ce(). Since P
(2)
h(, x)
_
u
for all u Tx. Therefore, by the Characterization of Gradients of Sect.63,
k is dierentiable at x and its gradient is given by (610.11). The continuity
of k follows from Prop.3.
The following corollary deals with generalizations of integral representa-
tions of the type (610.6).
Corollary: Assume that h : T satises the conditions (i) and
(ii) of the Theorem. Assume, further, that f : T R and g : T R are
dierentiable and satisfy
f(x), g(x) T
(,x)
for all x T.
Then k : T , dened by
k(x) :=
_
g(x)
f(x)
h(, x) for all x T, (610.14)
is of class C
1
and its gradient is given by
x
k = h(g(x), x)
x
g h(f(x), x)
x
f (610.15)
+
_
g(x)
f(x)
(2)
h(, x) for all x T.
260 CHAPTER 6. DIFFERENTIAL CALCULUS
Proof: Consider the set T R R c dened by (610.7), so that
a, b T
(,x)
for all (a, b, x) T. Since T
(,x)
is assumed to be an interval
for each x T, we can dene m : T by
m(a, b, x) :=
_
b
a
h(, x). (610.16)
By the Theorem,
(3)
m : T Lin(1, ) exists and is given by
(3)
m(a, b, x) =
_
b
a
(2)
h(, x). (610.17)
It follows from Prop.4 that
(3)
m is continuous. By the Fundamental The-
orem of Calculus and (610.16), the partial derivatives m,
1
and m,
2
exist,
are continuous, and are given by
m,
1
(a, b, x) = h(a, x), m,
2
(a, b, x) = h(b, x). (610.18)
By the Partial Gradient Theorem of Sect.65, it follows that m is of class C
1
.
Since
(1)
m = m,
1
and
(2)
m = m,
2
, it follows from (65.9), (610.17),
and (610.18) that
(
(a,b,x)
m)(, , u) = h(b, x) h(a, x) (610.19)
+
__
b
a
(2)
h(, x)
_
u
for all (a, b, x) T and all (, , u) R R 1.
Now, since f and g are dierentiable, so is (f, g, 1
DE
) : T RRc
and we have
x
(f, g, 1
DE
) = (
x
f,
x
g, 1
V
) (610.20)
for all x T. (See Prop.2 of Sect.63). By (610.14) and (610.16) we have
k = m (f, g, 1
DE
)[
D
. Therefore, by the General Chain Rule of Sect.63, k
is of class C
1
and we have
x
k = (
(f(x),g(x),x)
m)
x
(f, g, 1
DE
)
for all x T. Using (610.19) and (610.20), we obtain (610.15).
611. CURL, SYMMETRY OF SECOND GRADIENTS 261
611 Curl, Symmetry of Second Gradients
In this section, we assume that at spaces c and c
)
_
= Lin
2
(1
2
, 1
),
and hence we can apply the switching dened by (24.4) to the values of H.
Denition 1: The curl of a dierentiable mapping H : T Lin(1, 1
)
is the mapping Curl H : T Lin
2
(1
2
, 1
) dened by
(Curl H)(x) :=
x
H(
x
H)
).
The following result deals with conditions that are sucient or necessary
for the curl to be zero.
Curl-Gradient Theorem: Assume that H : T Lin(1, 1
) is of class
C
1
. If H = for some : T c
.
The proof will be based on the following
Lemma: Assume that T is convex. Let q T and q
be given and
let : T c
be dened by
(q +v) := q
+
__
1
0
H(q + sv)ds
_
v for all v T q. (611.2)
Then is of class C
1
and we have
(q +v) = H(q +v)
__
1
0
s(Curl H)(q + sv)ds
_
v (611.3)
for all v T q.
Proof: We dene T R 1 by
T := (s, u) [ u T q, q + su T.
Since T is open and convex, it follows that, for each u T q, T
(,u)
:=
s R [ q + su T is an open interval that includes [0, 1]. Since H is of
class C
1
, so is
((s, u) H(q + su)) : T Lin(1, 1
);
262 CHAPTER 6. DIFFERENTIAL CALCULUS
hence we can apply the Dierentiation Theorem for Integral Representations
of Sect.610 to conclude that u
_
1
0
H(q +su)ds is of class C
1
and that its
gradient is given by
v
(u
_
1
0
H(q + su)ds) =
_
1
0
sH(q + sv)ds
for all v T q. By the Product Rule (66.7), it follows that the mapping
dened by (611.2) is of class C
1
and that its gradient is given by
(
q+v
)u =
__
1
0
sH(q + sv)uds
_
v +
__
1
0
H(q + sv)ds
_
u
for all v Tq and all u 1. Applying the denitions (24.2), (24.4), and
(611.1), and using the linearity of the switching and Prop.1 of Sect.610, we
obtain
(
q+v
)u =
__
1
0
s(Curl H)(q + sv)ds
_
(u, v) (611.4)
+
__
1
0
(s(H(q + sv))v +H(q + sv))ds
_
u
for all v Tq and all u 1. By the Product Rule (66.10) and the Chain
Rule we have, for all s [0, 1],
s
(t tH(q + tv)) = H(q + sv) + s(H(q + sv))v.
Hence, by the Fundamental Theorem of Calculus, the second integral on the
right side of (611.4) reduces to H(q +v). Therefore, since Curl H has skew
values and since u 1 was arbitrary, (611.4) reduces to (611.3).
Proof of the Theorem: Assume, rst, that T is convex and that
Curl H = 0. Choose q T, q
and dene : T c
by (611.2). Then
H = by (611.3).
Conversely, assume that H = for some : T c
. Let q T
be given. Since T is open, we can choose a convex open neighborhood ^
of q with ^ T. Let v ^ q by given. Since ^ is convex, we have
q + sv ^ for all s [0, 1], and hence we can apply the Fundamental
Theorem of Calculus to the derivative of s (q + sv), with the result
(q +v) = (q) +
__
1
0
H(q + sv)ds
_
v.
611. CURL, SYMMETRY OF SECOND GRADIENTS 263
Since v ^ q was arbitrary, we see that the hypotheses of the Lemma are
satised for ^ instead of T, q
:= (q), [
N
instead of , and H[
N
= [
N
instead of H. Hence, by (611.3), we have
__
1
0
s(Curl H)(q + sv)ds
_
v = 0 (611.5)
for all v ^ q. Now let u 1 be given. Since ^ q is a neighborhood
of 0 1, there is a P
is of class C
2
, then its second gradient
(2)
: T Lin
2
(1
2
, 1
) has
symmetric values, i.e. Rng
(2)
Sym
2
(1
2
, 1
).
Remark 2: The assertion that
x
() is symmetric for a given x T
remains valid if one merely assumes that is dierentiable and that
is dierentiable at x. A direct proof of this fact, based on the results of
Sect.64, is straightforward although somewhat tedious.
We assume now that c := c
1
c
2
is the set-product of at spaces c
1
, c
2
with translation spaces 1
1
, 1
2
, respectively. Assume that : T c
is
twice dierentiable. Using Prop.1 of Sect.65 repeatedly, it is easily seen that
(2)
(x) =
(
(1)
(1)
)(x)(ev
1
, ev
1
) + (
(1)
(2)
)(x)(ev
1
, ev
2
) + (611.6)
(
(2)
(1)
)(x)(ev
2
, ev
1
) + (
(2)
(2)
)(x)(ev
2
, ev
2
)
for all x T, where ev
1
and ev
2
are the evaluation mappings from 1 :=
1
1
1
2
to 1
1
and 1
2
, respectively (see Sect.04). Evaluation of (611.6) at
264 CHAPTER 6. DIFFERENTIAL CALCULUS
(u, v) = ((u
1
, u
2
), (v
1
, v
2
)) 1
2
gives
(2)
(x)(u, v) =
(
(1)
(1)
)(x)(u
1
, v
1
) + (
(1)
(2)
)(x)(u
1
, v
2
) + (611.7)
(
(2)
(1)
)(x)(u
2
, v
1
) + (
(2)
(2)
)(x)(u
2
, v
2
).
The following is a corollary to the preceding theorem.
Theorem on the Interchange of Partial Gradients: Assume that
T is an open subset of a product space c := c
1
c
2
and that : T c
is of class C
2
. Then
(1)
(2)
= (
(2)
(1)
)
, (611.8)
where the operation is to be understood as value-wise switching.
Proof: Let x T and w 1
1
1
2
be given. If we apply (611.7) to the
case when u := (w
1
, 0), v := (0, w
2
) and then with u and v interchanged
we obtain
(2)
(x)(u, v) = (
(1)
(2)
)(x)(w
1
, w
2
),
(2)
(x)(v, u) = (
(2)
(1)
)(x)(w
2
, w
1
).
Since w 1
1
1
2
was arbitrary, the symmetry of
(2)
(x) gives (611.8).
Corollary: Let I be a nite index set and let T be an open subset
of R
I
. If : T c
is of class C
2
, then the second partial derivatives
,
i
,
k
: T 1
satisfy
,
i
,
k
= ,
k
,
i
for all i, k I. (611.9)
Remark 3: Assume T is an open subset of a Euclidean space c with
translation space 1. A mapping h : T 1
= 1
and t I. (612.1)
Then h is continuous and z converges locally uniformly to the process p :
I c dened by
p(t) := q +
_
t
a
h for all t I. (612.2)
Proof: The continuity of h follows from the Theorem on Continuity of
Uniform Limits of Sect.56. Hence (612.2) is meaningful. Let be a norm
on 1. By Prop.2 of Sect.610 and (612.1) and (612.2) we have
(z
n
(t) p(t)) =
__
t
a
(g
n
h)
_
_
t
a
(g
n
h)
(612.3)
for all n N
such that
(g
n
h)[
[b,c]
<
c b
for all n m +N.
Hence, by (612.3), we have
(z
n
(t) p(t)) < [t a[
c b
< for all n m +N
and all t [b, c]. Since P
kn
[
1
k!
L
k
(612.4)
for all L Lin1 and all n N
|L|
k
for all k N and all L Lin1 (see (52.9)).
Hence, given P
, we have
|
1
k!
L
k
|
k
k!
for all k N and all L Ce(||
k
k!
[ k N)
converges (to e
) converges
uniformly. Now, given L Lin1, Ce(| |
) is a neighborhood of L if
> |L|
_
t
0
|LD(s)|
ds |L|
_
t
0
|D(s)|
ds,
i.e., with the abbreviations
(t) := |D(t)|
, (612.7)
that
0 (t)
_
t
0
for all t P. (612.8)
612. LINEONIC EXPONENTIALS 267
We dene : P R by
(t) := e
t
_
t
0
for all t P. (612.9)
Using elementary calculus, we infer from (612.9) and (612.8) that
(t) = e
t
_
t
0
+ e
t
(t) 0
for all t P and hence that is antitone. On the other hand, it is evident
from (612.9) that (0) = 0 and 0. This can happen only if = 0,
which, in turn, can happen only if (t) = 0 for all t P and hence, by
(612.7), if D(t) = 0 for all t P. To show that D(t) = 0 for all t P, one
need only replace D by D () and L by L in (612.5) and then use the
result just proved.
Proposition 4: Let L Lin1 be given. Then the only dierentiable
process E : R Lin1 that satises
E
= LE and E(0) = 1
V
(612.10)
is the one given by
E := exp
V
(L). (612.11)
Proof: We consider the sequence G in Map(R, Lin1) dened by
G
n
(t) :=
kn
[
t
k
k!
L
k+1
(612.12)
for all n N
kn
[
t
k+1
(k + 1)!
L
k+1
= S
n+1
(tL)
for all n N
is given by
D
(t) = E(0)F(t) +
_
t
0
E
= LE
1
, E
2
= KE
2
, and E
1
(0) = E
2
(0) = 1
V
.
Taking the dierence and observing (612.17)
1
we see that D := E
2
E
1
satises
D
= KE
2
LE
1
= LD+ rME
2
, D(0) = 0.
Hence, if we apply Prop.5 with the choice F := rME
2
we obtain
(E
2
E
1
)(t) = r
_
t
0
E
1
(t s)ME
2
(s)ds for all t R.
612. LINEONIC EXPONENTIALS 269
For t := 1 we obtain, in view of (612.17),
1
r
(exp
V
(L + rM) exp
V
(L)) =
_
1
0
(exp
V
((1 s)L))Mexp
V
(s(L + rM))ds.
Since exp
V
is continuous by Prop.2, and since bilinear mappings are contin-
uous (see Sect.66), we see that the integrand on the right depends contin-
uously on (s, r) R
2
. Hence, by Prop.3 of Sect.610 and by (65.13), in the
limit r 0 we nd
(dd
M
exp
V
)(L) =
_
1
0
(exp
V
((1 s)L))Mexp
V
(sL)ds. (612.18)
Again, we see that the integrand on the right depends continuously on
(s, L) RLin1. Hence, using Prop.3 of Sect.610 again, we conclude that
the mapping dd
M
exp
V
: Lin1 Lin1 is continuous. Since M Lin1
was arbitrary, it follows from Prop.7 of Sect.65 that exp
V
is of class C
1
. The
formula (612.16) is the result of combining (65.14) with (612.18).
Proposition 6: Assume that L, M Lin1 commute, i.e. that LM =
ML. Then:
(i) M and exp
V
(L) commute.
(ii) We have
exp
V
(L +M) = (exp
V
(L))(exp
V
(M)). (612.19)
(iii) We have
(
L
exp
V
)M = (exp
V
(L))M. (612.20)
Proof: Put E := exp
v
(L) and D := EM ME. By Prop.4, we nd
D
= E
M ME
= E
F +EF
= LEF +EMF.
Since EM = ME by part (i), we obtain (EF)
= (L + M)EF. Since
(EF)(0) = E(0)F(0) = 1
V
, by Prop.4, EF = exp
v
((L +M)). Evaluation
at 1 gives (612.19), which proves (ii).
Part (iii) is an immediate consequence of (612.16) and of (i) and (ii).
270 CHAPTER 6. DIFFERENTIAL CALCULUS
Pitfalls: Of course, when 1 = R, then Lin1 = LinR
= R and the
lineonic exponential reduces to the ordinary exponential exp. Prop.6 shows
how the rule exp(s+t) = exp(s) exp(t) for all s, t R and the rule exp
= exp
generalize to the case when 1 , = R. The assumption, in Prop.6, that L and
M commute cannot be omitted. The formulas (612.19) and (612.20) need
not be valid when LM ,= ML.
If the codomain of the ordinary exponential is restricted to P
it becomes
invertible and its inverse log : P
R is of class C
1
. If dim1 > 1, then exp
V
does not have dierentiable local inverses near certain values of L Lin1
because for these values of L, the gradient
L
exp
V
fails to be injective (see
Problem 10). In fact, one can prove that exp
V
is not locally invertible near
the values of L in question. Therefore, there is no general lineonic analogue
of the logarithm. See, however, Sect.85.
613 Problems for Chapter 6
(1) Let I be a genuine interval, let c be a at space with translation space
1, and let p : I c be a dierentiable process. Dene h : I
2
1 by
h(s, t) :=
_
p(s)p(t)
st
if s ,= t
p
(s) if s = t
_
. (P6.1)
(a) Prove: If p
, dened by
[k[(y) := [k(y)[ for all y T, is dierentiable at x and that
x
[k[ =
1
|k(x)|
(
x
k)
k(x). (P6.3)
(b) Show that k/[k[ : T
x
_
k
|k|
_
=
1
|k(x)|
3
_
[k(x)[
2
x
k k(x) (
x
k)
k(x)
_
. (P6.4)
(4) Let c be a genuine inner-product space with translation space 1, let
q c be given, and dene r : c q 1
by
r(x) := x q for all x c q (P6.5)
and put r := [r[ (see Part (a) of Problem 3).
(a) Show that r/r : c q 1
is of class C
1
and that
_
r
r
_
=
1
r
3
(r
2
1
V
r r). (P6.6)
(Hint: Use Part (b) of Problem 3.)
(b) Let the function h : P
(2)
(h r) =
1
r
_
1
r
2
(r(h
r) (h
r))r r + (h
r)1
V
_
.
(P6.7)
(c) Evaluate the Laplacian (h r) and reconcile your result with
(67.17).
(5) A Euclidean space c of dimension n with n 2, a point q c, and a
linear space are assumed given.
(a) Let I be an open interval, let T be an open subset of c, let
f : T I be given by (67.16), let a be a at function on c, let
g : I be twice dierentiable, and put h := a[
D
(g f) : T
. Show that
h = 2a[
D
((2g
+ (n + 2)g
) f) 4a(q)(g
f). (P6.8)
272 CHAPTER 6. DIFFERENTIAL CALCULUS
(b) Assuming that the Euclidean space c is genuine, show that the
function h : c q given by
h(x) := ((e (x q))[x q[
n
)a +b (P6.9)
is harmonic for all e 1 := c c and all a, b . (Hint: Use
Part (a) with a determined by a = e, a(q) = 0.)
(6) Let c be a at space, let q c, let Q be a non-degenerate quadratic
form (see Sect.27) on 1 := c c, and dene f : c R by
f(y) := Q(y q) for all y c. (P6.10)
Let a be a non-constant at function on c (see Sect.36).
(a) Prove: If the restriction of f to the hyperplane T := a
<
(0)
attains an extremum at x T, then x must be given by
x = q
1
Q (a), where :=
a(q)
1
Q (a,a)
. (P6.11)
(Hint: Use the Constrained-Extremum Theorem.)
(b) Under what condition does f[
F
actually attain a maximum or a
minimum at the point x given by (P6.11)?
(7) Let c be a 2-dimensional Euclidean space with translation-space 1
and let J Orth1 Skew1 be given (see Problem 2 of Chapt.4). Let
T be an open subset of c and let h : T 1 be a vector-eld of class
C
1
.
(a) Prove that
Curl h = (div(Jh))J. (P6.12)
(b) Assuming that T is convex and that div h = 0, prove that h =
Jf for some function f : T R of class C
2
.
Note: If h is interpreted as the velocity eld of a volume-preserving ow, then
div h = 0 is valid and a function f as described in Part (b) is called a stream-
function of the ow.
613. PROBLEMS FOR CHAPTER 6 273
(8) Let a linear space 1, a lineon L Lin1, and u 1 be given. Prove:
The only dierentiable process h : R 1 that satises
h
is arbitrary.)
(9) Let a linear space 1 and a lineon J on 1 that satises J
2
= 1
V
be
given. (There are such J if dim1 is even; see Sect.89.)
(a) Show that there are functions c : R R and d : R R, of class
C
1
, such that
exp
V
(J) = c1
V
+ dJ. (P6.15)
(Hint: Apply the result of Problem 8 to the case when 1 is re-
placed by c := Lsp(1
V
, J) Lin1 and when L is replaced by
Le
J
Linc, dened by Le
J
U = JU for all U c.)
(b) Show that the functions c and d of Part (a) satisfy
d
= c, c
be a linear space.
(a) Let H : T Lin(1, 1
) be of class C
2
. Let x T and v 1 be
given. Note that
x
Curl H Lin(1, Lin(1, Lin(1, 1
)))
= Lin
2
(1
2
, Lin(1, 1
))
and hence
(
x
Curl H)
Lin
2
(1
2
, Lin(1, 1
))
= Lin(1, Lin(1, Lin(1, 1
))).
Therefore we have
G := (
x
Curl H)
v Lin(1, Lin(1, 1
))
= Lin
2
(1
2
, 1
).
Show that
GG
= (
x
Curl H)v. (P6.19)
(Hint: Use the Symmetry Theorem for Second Gradients.)
(b) Let : T 1
be of class C
2
and put
W := Curl : T Lin(1, 1
)
= Lin
2
(1
2
, R).
Prove: If h : T 1 is of class C
1
, then
Curl (Wh) = (W)h +Wh + (h)
W, (P6.20)
where value-wise evaluation and composition are understood.
(12) Let c and c
2, let : c c
be
a mapping of class C
1
and put
c := x c [
x
is not invertible.
613. PROBLEMS FOR CHAPTER 6 275
(a) Show that
>
(c) Rng Bdy (Rng ).
(b) Prove: If the pre-image under of every bounded subset of c
is
bounded and if Acc
>
(c) = (see Def.1 of Sect.57), then is
surjective. (Hint: Use Problem 13 of Chap.5.)
(13) By a complex polynomial p we mean an element of C
(N)
, i.e. a
sequence in C indexed on N and with nite support (see (07.10)). If
p ,= 0, we dene the degree of p by
deg p := max Supt p = maxk N [ p
k
,= 0.
By the derivative p
C
(N)
dened by
(p
)
k
:= (k + 1)p
k+1
for all k N. (P6.21)
The polynomial function p : C C of p is dened by
p(z) :=
kN
p
k
z
k
for all z C. (P6.22)
Let p (C
(N)
)
be given.
(a) Show that p
<
(0) is nite and p
<
(0) deg p.
(b) Regarding C as a two-dimensional linear space over R, show that
p is of class C
2
and that
(
z
p)w = p