The Gateaux and Hadamard variations and differentials
In this chapter X, Y and Z again denote normed spaces over either the
real or the complex field.
4.1 The Gateaux variation and the Gateaux differential

Let $f$ be a function from a set $A \subseteq X$ into $Y$, let $x_0 \in A$, and let $h \in X$. We say that $f$ has a Gateaux variation at $x_0$ for the increment $h$ if the function $t \mapsto f(x_0 + th)$ is defined on the interval $[0, \eta[$ for some $\eta > 0$ and has a right-hand derivative at $t = 0$. The value of this derivative is then called the Gateaux variation of $f$ at $x_0$ for the increment $h$, and we denote it by $\delta f(x_0)h$. Thus
\[ \delta f(x_0)h = \lim_{t \to 0+} \frac{f(x_0 + th) - f(x_0)}{t} \]
whenever the limit exists. Further, the function $h \mapsto \delta f(x_0)h$ with domain the set of $h \in X$ for which $\delta f(x_0)h$ exists is called the Gateaux variation of $f$ at $x_0$, and we denote it by $\delta f(x_0)$.† Obviously $\delta f(x_0)0$ exists whenever $x_0 \in A$, and is equal to $0$.
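The definition can be explored numerically. The following is a minimal sketch, not part of the original text: the helper names `gateaux_quotient` and `euclidean_norm` are illustrative assumptions, and the code only evaluates the one-sided difference quotient for a few small positive steps, here for the Euclidean norm on $\mathbf{R}^2$ at the origin, where the quotient equals $\|h\|$ for every $t > 0$.

```python
import math

def gateaux_quotient(f, x0, h, t):
    """Return (f(x0 + t*h) - f(x0)) / t for a step t > 0.

    x0 and h are sequences of floats; f maps such a sequence to a float.
    """
    if t <= 0:
        raise ValueError("the limit is one-sided, so t must be positive")
    shifted = [a + t * b for a, b in zip(x0, h)]
    return (f(shifted) - f(x0)) / t

def euclidean_norm(x):
    # Illustrative test function (an assumption of this sketch).
    return math.sqrt(sum(a * a for a in x))

x0 = [0.0, 0.0]
h = [3.0, -4.0]
for t in (1e-1, 1e-3, 1e-6):
    # Each quotient is close to ||h|| = 5, the variation of the norm at 0.
    print(t, gateaux_quotient(euclidean_norm, x0, h, t))
```

Replacing `h` by `[-3.0, 4.0]` gives the same value rather than its negative, which already suggests that a Gateaux variation need not be linear in the increment.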
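The Gateaux variation $\delta f(x_0)$ need not be linear (see Example (d) below), but we have the result that if $\delta f(x_0)h$ exists and $\alpha \ge 0$, then $\delta f(x_0)(\alpha h)$ exists and is equal to $\alpha\,\delta f(x_0)h$. This is obvious when $\alpha = 0$, while if $\alpha > 0$
\[ \delta f(x_0)(\alpha h) = \lim_{t \to 0+} \frac{f(x_0 + \alpha t h) - f(x_0)}{t} = \alpha \lim_{t \to 0+} \frac{f(x_0 + \alpha t h) - f(x_0)}{\alpha t} = \alpha\,\delta f(x_0)h. \]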
† We use the same bracketing convention as with $df(x_0)$.
‡ There is no really standard notation and terminology for the Gateaux variation. Some authors speak of the 'Gateaux variation of $f$ at $x_0$ for the increment $h$' only when it exists for all $h \in X$, and it is more usual to employ the ordinary derivative in the definition than the right-hand derivative. The most common notations for the Gateaux variation are $\delta f(x_0; h)$ and $Vf(x_0; h)$.
Downloaded from https://www.cambridge.org/core. Stockholm University Library, on 19 Dec 2018 at 02:16:38, subject to the Cambridge Core
terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9780511897191.006
252 [4.1] Gateaux and Hadamard differentials
The following examples illustrate the range of possible behaviour of the function $\delta f(x_0)$.
Example (a)
Let $\phi$ be a function from a set $A \subseteq \mathbf{R}$ into $Y$, and let $t_0 \in A$. If $h > 0$, then $\phi$ has a Gateaux variation at $t_0$ for the increment $h$ if and only if $A$ contains an interval of the form $[t_0, t_0 + \eta[$, where $\eta > 0$, and $\phi$ has a right-hand derivative $\phi'_+(t_0)$, and then $\delta\phi(t_0)h = h\phi'_+(t_0)$. Similarly, if $h < 0$, then $\phi$ has a Gateaux variation at $t_0$ for the increment $h$ if and only if $A$ contains an interval of the form $]t_0 - \eta, t_0]$ and $\phi$ has a left-hand derivative $\phi'_-(t_0)$, and then $\delta\phi(t_0)h = h\phi'_-(t_0)$. Thus $\delta\phi(t_0)h$ exists for all $h \in \mathbf{R}$ if and only if $t_0$ is an interior point of $A$ and both $\phi'_+(t_0)$ and $\phi'_-(t_0)$ exist.
There is, of course, no correspondence between the existence of $\phi'_\pm(t_0)$ and that of $\delta\phi(t_0)0$, since the latter exists whenever $t_0 \in A$.
Example (b)
If $f$ is a function from a set $A \subseteq X$ into $Y$ which is Frechet differentiable at $x_0$, then the Gateaux variation $\delta f(x_0)h$ exists and is equal to $df(x_0)h$ for all $h \in X$, so that the function $\delta f(x_0)$ belongs to $\mathcal{L}(X, Y)$. This follows directly from either Exercise 3.1.4 or (3.1.2).
Example (c)
If $f : \mathbf{R}^2 \to \mathbf{R}$ is given by
\[ f(x_1, x_2) = 1 \ \text{ if } x_2 = x_1^2 \ne 0, \qquad f(x_1, x_2) = 0 \ \text{ otherwise}, \]
then $\delta f(0)h$ exists and is equal to $0$ for all $h \in \mathbf{R}^2$, so that the function $\delta f(0)$ belongs to $\mathcal{L}(\mathbf{R}^2, \mathbf{R})$. However, $f$ is discontinuous at the origin, and is therefore not Frechet differentiable there.
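A small numerical sketch may make Example (c) concrete. It assumes the function of the example is the indicator of the parabola $x_2 = x_1^2$ with the origin removed; the names `f`, `quotient` and `directions` are illustrative only. Along each fixed direction the difference quotient at the origin is $0$ once $t$ is small, yet $f$ takes the value $1$ at points arbitrarily close to the origin.

```python
def f(x1, x2):
    """Indicator of the parabola x2 = x1**2 with the origin removed."""
    return 1.0 if (x1 != 0.0 and x2 == x1 * x1) else 0.0

def quotient(h, t):
    """Difference quotient (f(t*h) - f(0)) / t at the origin, t > 0."""
    return (f(t * h[0], t * h[1]) - f(0.0, 0.0)) / t

directions = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, -3.0)]

# Along each fixed direction the quotient vanishes once t is small enough.
print([quotient(h, 2.0 ** -20) for h in directions])

# Yet f equals 1 at points of the parabola arbitrarily close to the origin.
print([f(s, s * s) for s in (2.0 ** -k for k in range(1, 6))])
```

Powers of two are used for the step so that the floating-point comparison `x2 == x1 * x1` behaves exactly as the real-number condition would.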
Example (d)
Let $f$ be the (continuous) function from $\mathbf{R}^2$ into $\mathbf{R}$ given by
\[ f(x_1, x_2) = x_1 \ \text{ if } |x_2| \ge x_1^2, \qquad f(x_1, x_2) = |x_2|/x_1 \ \text{ otherwise}. \]
Then $f$ takes the value $0$ everywhere on the axes, so that $\delta f(0)h$ exists and is equal to $0$ when $h$ lies on either axis. Also, if $h = (h_1, h_2)$ does not lie on either axis, then
\[ (f(th_1, th_2) - f(0, 0))/t = h_1 \quad \text{whenever } 0 < t \le |h_2|/h_1^2, \]
so that for such $h$ we have $\delta f(0)h = h_1$. Hence $\delta f(0)$ is non-linear, and is discontinuous at each point of the $h_1$-axis other than the origin.
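The non-linearity in Example (d) can be checked with a short sketch. It assumes the function reads $f(x_1, x_2) = x_1$ where $|x_2| \ge x_1^2$ and $|x_2|/x_1$ elsewhere; the helper `variation_at_origin` and the fixed step $2^{-30}$ are illustrative choices, not part of the original text.

```python
def f(x1, x2):
    """f = x1 where |x2| >= x1**2, and |x2| / x1 elsewhere (x1 != 0 there)."""
    if abs(x2) >= x1 * x1:
        return x1
    return abs(x2) / x1

def variation_at_origin(h, t=2.0 ** -30):
    """Difference quotient approximating the Gateaux variation at (0, 0)."""
    return (f(t * h[0], t * h[1]) - f(0.0, 0.0)) / t

e1, e2, diag = (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)

# The variation is 0 on both axes but 1 along the diagonal e1 + e2,
# so it cannot be additive.
print(variation_at_origin(e1), variation_at_origin(e2), variation_at_origin(diag))

# Increments (1, s) with small s != 0 still give 1, while (1, 0) gives 0.
print(variation_at_origin((1.0, 1e-3)))
```

The last line reflects the discontinuity of the variation at points of the $h_1$-axis: the step must satisfy $t \le |h_2|/h_1^2$ for the quotient to equal $h_1$, which holds here.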
Example (e)
If $p$ is a convex function on a convex set $A \subseteq X$, and $x_0$ is an internal point of $A$, then $\delta p(x_0)h$ exists for all $h \in X$ and $\delta p(x_0)$ is sublinear.
To prove this we note first that if $h \in X$ then the function $t \mapsto p(x_0 + th)$ is convex on a neighbourhood of $0$ in $\mathbf{R}$, and therefore the limit
\[ \delta p(x_0)h = \lim_{t \to 0+} (p(x_0 + th) - p(x_0))/t \]
exists (cf. §1.1, Example (g)). It remains to prove that $\delta p(x_0)$ is sublinear, and since $\delta p(x_0)(\alpha h) = \alpha\,\delta p(x_0)h$ for all $\alpha \ge 0$, it is enough to show that $\delta p(x_0)$ is convex. Let $h, k \in X$, let $0 < \sigma < 1$, and let $l = \sigma h + (1 - \sigma)k$. Then for all sufficiently small positive $t$,
\[ p(x_0 + tl) = p(\sigma(x_0 + th) + (1 - \sigma)(x_0 + tk)) \le \sigma p(x_0 + th) + (1 - \sigma)p(x_0 + tk), \]
and on subtracting $p(x_0)$, dividing by $t$ and letting $t \to 0+$, this implies that
\[ \delta p(x_0)l \le \sigma\,\delta p(x_0)h + (1 - \sigma)\,\delta p(x_0)k. \]
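As an illustration of Example (e), the sketch below takes the maximum norm on $\mathbf{R}^2$ as a hypothetical convex function $p$ and the point $x_0 = (1, 1)$, where the variation is $h \mapsto \max(h_1, h_2)$. The names `p` and `variation`, and the step size, are assumptions of the sketch; the quotients are finite-step approximations, so comparisons are made up to a small tolerance.

```python
def p(x):
    """A convex function on R^2: the maximum norm."""
    return max(abs(x[0]), abs(x[1]))

def variation(p, x0, h, t=1e-6):
    """One-sided difference quotient approximating the Gateaux variation."""
    shifted = [a + t * b for a, b in zip(x0, h)]
    return (p(shifted) - p(x0)) / t

x0 = [1.0, 1.0]
h, k = [1.0, -1.0], [-1.0, 1.0]
hk = [a + b for a, b in zip(h, k)]      # h + k = 0

# Subadditive but not additive: the first two values are about 1,
# the third (for h + k) is 0.
print(variation(p, x0, h), variation(p, x0, k), variation(p, x0, hk))

# Positive homogeneity: doubling the increment doubles the variation.
print(variation(p, x0, [2.0, -2.0]))
```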
Theorem (3.1.3) applies without change to the Gateaux variation, i.e. we have:
(4.1.1) (i) If $f$ is a function from a set $A \subseteq X$ into $Y$ which has a Gateaux variation at $x_0$ for the increment $h$, then so does $\alpha f$ for every scalar $\alpha$, and $\delta(\alpha f)(x_0)h = \alpha\,\delta f(x_0)h$.
(ii) If $f$, $g$ are functions from sets $A, B \subseteq X$ into $Y$ which have Gateaux variations at $x_0$ for the increment $h$, then so does $f + g$, and $\delta(f + g)(x_0)h = \delta f(x_0)h + \delta g(x_0)h$.
The following version of the chain rule is an immediate consequence of Exercise 3.1.4.
(4.1.2) Let $f$ be a function from a set $A \subseteq X$ into $Y$ which has a Gateaux variation at $x_0$ for the increment $h$, and let $g$ be a function from a set $B \subseteq Y$ into $Z$ which is Frechet differentiable at the point $y_0 = f(x_0)$. Then $g \circ f$ has a Gateaux variation at $x_0$ for the increment $h$, equal to $dg(y_0)(\delta f(x_0)h)$. In particular, if $g \in \mathcal{L}(Y, Z)$, then the Gateaux variation of $g \circ f$ at $x_0$ for the increment $h$ is $g(\delta f(x_0)h)$.
In (4.1.2) we cannot replace the condition that $g$ is Frechet differentiable at $y_0$ by the condition that $g$ has a Gateaux variation $\delta g(y_0)k$ for every increment $k$, even if $\delta g(y_0) \in \mathcal{L}(Y, Z)$. For example, if $g : \mathbf{R}^2 \to \mathbf{R}$ is the function of Example (c) above, so that
\[ g(x_1, x_2) = 1 \ \text{ if } x_2 = x_1^2 \ne 0, \qquad g(x_1, x_2) = 0 \ \text{ otherwise}, \]
and $f : \mathbf{R} \to \mathbf{R}^2$ is given by $f(t) = (t, t^2)$, then $\delta g(0) \in \mathcal{L}(\mathbf{R}^2, \mathbf{R})$ and $\delta f(0) \in \mathcal{L}(\mathbf{R}, \mathbf{R}^2)$, but $g(f(t)) = 1$ except when $t = 0$, so that the Gateaux variation of $g \circ f$ at $0$ exists only for the increment $0$. This example shows also that we cannot interchange the Gateaux and Frechet conditions in (4.1.2).
It is useful to note that if $f$ is defined on the closed line segment with endpoints $x_0$, $x_0 + h$, and $\xi \in [0, 1[$, then the function $t \mapsto f(x_0 + th)$ has a right-hand derivative at $\xi$ equal to $l$ if and only if $\delta f(x_0 + \xi h)h$ exists and is equal to $l$ (for
\[ \lim_{t \to 0+} \frac{f(x_0 + (\xi + t)h) - f(x_0 + \xi h)}{t} = \lim_{t \to 0+} \frac{f((x_0 + \xi h) + th) - f(x_0 + \xi h)}{t} \]
whenever either limit exists). In particular, by combining this remark with (1.6.3. Corollary), we obtain the following mean value inequality for Gateaux variations.
(4.1.3) Let $x_0, x_0 + h \in X$, let $S$ be the closed line segment in $X$ with endpoints $x_0$, $x_0 + h$, and let $f : S \to Y$ be a continuous function such that $\delta f(x_0 + th)h$ exists for nearly all $t \in\, ]0, 1[$. Then there exist uncountably many $\xi \in\, ]0, 1[$ for which
\[ \|f(x_0 + h) - f(x_0)\| \le \|\delta f(x_0 + \xi h)h\|. \tag{1} \]
Moreover, either (1) holds with strict inequality for uncountably many $\xi$, or it holds whenever $\delta f(x_0 + \xi h)h$ exists (and with equality for nearly all $\xi$).
We give next a simple sufficient condition for the linearity of the Gateaux variation $\delta f(x_0)$ when the normed spaces involved are real. We observe that since $\delta f(x_0)(\alpha h) = \alpha\,\delta f(x_0)h$ for every $\alpha \ge 0$, to prove the linearity of $\delta f(x_0)$ over the real field it is enough to show that
\[ \delta f(x_0)(h + k) = \delta f(x_0)h + \delta f(x_0)k \quad \text{for all } h, k. \]
(4.1.4) If $X$, $Y$ are real normed spaces and $f$ is a function from a set $A \subseteq X$ into $Y$ such that
(i) for each $x$ in some neighbourhood of $x_0$ the Gateaux variation $\delta f(x)h$ exists for all $h \in X$,
(ii) for each $h \in X$ the function $x \mapsto \delta f(x)h$ is continuous at $x_0$,
then the function $\delta f(x_0)$ is linear.
This result is a direct consequence of the preceding remark and the following lemma.
(4.1.5) Lemma. Let $f$ be a function from a set $A \subseteq X$ into $Y$, and let $x_0 \in A$. Let also $h, k \in X$ and suppose that $x \mapsto \delta f(x)h$ is defined on some neighbourhood $V$ of $x_0$ and is continuous at $x_0$, and that $\delta f(x_0)k$ exists. Then $\delta f(x_0)(h + k)$ exists and is equal to $\delta f(x_0)h + \delta f(x_0)k$.
If $t$ is a point of $]0, \infty[$ for which $x_0 + th + tk$ and $x_0 + tk$ belong to $A$, then
\[ f(x_0 + th + tk) - f(x_0) - t\,\delta f(x_0)h - t\,\delta f(x_0)k = \bigl(f(x_0 + th + tk) - f(x_0 + tk) - t\,\delta f(x_0)h\bigr) + \bigl(f(x_0 + tk) - f(x_0) - t\,\delta f(x_0)k\bigr) = P(t) + Q(t), \]
say, and here $Q(t)/t \to 0$ as $t \to 0+$, since $\delta f(x_0)k$ exists.
We have now to show that $P(t)/t \to 0$ as $t \to 0+$, and to do this we consider the function
\[ s \mapsto f(x_0 + sh + tk) - f(x_0 + tk) - s\,\delta f(x_0)h \tag{2} \]
on the interval $I = [0, t]$. For all sufficiently small $t$, the closed line segment in $X$ joining the points $x_0 + th + tk$ and $x_0 + tk$ lies in $V$, and hence, by the remark preceding (4.1.3), for all such $t$ the right-hand derivative of the function (2) at the point $s$ of $I$ is
\[ \delta f(x_0 + sh + tk)h - \delta f(x_0)h. \]
By the mean value inequality (1.6.3. Corollary) applied to the function (2) on $I$, we deduce that for all sufficiently small $t$ there exist uncountably many $\xi \in I$ for which
\[ \|P(t)\| \le t\,\|\delta f(x_0 + \xi h + tk)h - \delta f(x_0)h\|, \]
and since the function $x \mapsto \delta f(x)h$ is continuous at $x_0$, this implies that $P(t)/t \to 0$ as $t \to 0+$.
We now define Gateaux differentiability. Let $f$ be a function from a set $A \subseteq X$ into $Y$ and let $x_0$ be an interior point of $A$. We say that $f$ is Gateaux differentiable at $x_0$ if the Gateaux variation $\delta f(x_0)h$ exists for all $h \in X$ and the function $\delta f(x_0)$ belongs to $\mathcal{L}(X, Y)$. This function $\delta f(x_0)$ is then called the Gateaux differential of $f$ at $x_0$.† Further, the function $x \mapsto \delta f(x)$ whose domain is the set of interior points of $A$ at which $f$ is Gateaux differentiable is called the Gateaux differential of $f$, and we denote it by $\delta f$.
† As with the Gateaux variation, there is no really standard notation for the Gateaux differential of $f$ at $x_0$; perhaps the most common notation is $Df(x_0)$.
If $f$ is Frechet differentiable at $x_0$, then by Example (b) above, $f$ is Gateaux differentiable at $x_0$ and $\delta f(x_0) = df(x_0)$. The converse is true when $X = \mathbf{R}$ (and $Y$ is real), for if $\phi$ is a function from a set $A \subseteq \mathbf{R}$ into $Y$, and $t_0$ is an interior point of $A$, then the Frechet differentiability of $\phi$ at $t_0$ and the Gateaux differentiability of $\phi$ at $t_0$ are each equivalent to the existence of the derivative $\phi'(t_0)$, and then $d\phi(t_0)h = \delta\phi(t_0)h = h\phi'(t_0)$ for all $h \in X$ (see Example (a) above and §3.1, Example (a) (p. 168)). If $\dim X \ge 2$, then Gateaux differentiability does not imply Frechet differentiability; this is easily seen from Example (c). This same example shows also that when $\dim X \ge 2$ the Gateaux differentiability of $f$ at $x_0$ does not imply the continuity of $f$ there.
When $X = \mathbf{R}^n$, where $n \ge 2$, the Gateaux differentiability of $f$ at $x_0$ implies the existence of the partial derivatives of $f$ at $x_0$, and in fact $D_j f(x_0) = \delta f(x_0)e_j$ $(j = 1, \ldots, n)$, where $e_1, \ldots, e_n$ is the natural basis in $\mathbf{R}^n$. Hence if $Y = \mathbf{R}^m$, and $f_1, \ldots, f_m$ are the components of $f$, then the matrix of the Gateaux differential $\delta f(x_0)$ with respect to the natural bases in $\mathbf{R}^n$ and $\mathbf{R}^m$ is the Jacobian matrix $[D_j f_i(x_0)]$.
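The relation $D_j f(x_0) = \delta f(x_0)e_j$ suggests a simple way of approximating the Jacobian matrix column by column. The following is only a sketch under stated assumptions: the function name `gateaux_jacobian`, the step size, and the smooth test map `f` are all hypothetical, and the quotients are one-sided finite differences rather than exact limits.

```python
def gateaux_jacobian(f, x0, t=1e-6):
    """Approximate [D_j f_i(x0)] by quotients along the natural basis vectors."""
    fx0 = f(x0)
    n, m = len(x0), len(fx0)
    jac = [[0.0] * n for _ in range(m)]
    for j in range(n):
        shifted = list(x0)
        shifted[j] += t                      # the point x0 + t * e_j
        fj = f(shifted)
        for i in range(m):
            jac[i][j] = (fj[i] - fx0[i]) / t  # i-th component of the quotient
    return jac

def f(x):
    # Hypothetical smooth test map from R^2 into R^2.
    return [x[0] * x[1], x[0] + x[1] ** 2]

# For this map the exact Jacobian at (1, 2) is [[2, 1], [1, 4]].
print(gateaux_jacobian(f, [1.0, 2.0]))
```

Row $i$, column $j$ of the result approximates $D_j f_i(x_0)$, matching the convention for the Jacobian matrix used in the text.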
It should be noted also that if $X = \mathbf{R}^n$, then the existence of the partial derivatives $D_j f(x_0)$ of $f$ at an interior point $x_0$ of its domain does not imply that $f$ is Gateaux differentiable at $x_0$. For instance, if $f$ is the function of Example (d), then $D_1 f(0, 0) = D_2 f(0, 0) = 0$, but $f$ is not Gateaux differentiable at $(0, 0)$.
The analogue of (3.1.2) for the Gateaux differential is that $f$ is Gateaux differentiable at an interior point $x_0$ of its domain if and only if there exists $T \in \mathcal{L}(X, Y)$ such that for each $h \in X$
\[ (f(x_0 + th) - f(x_0) - T(th))/t \to 0 \quad \text{as } t \to 0 \text{ in } \mathbf{R} \setminus \{0\}, \tag{3} \]
and then $\delta f(x_0) = T$. The argument used in §3.1 to prove the uniqueness of the Frechet differential shows that if there exists $T \in \mathcal{L}(X, Y)$ such that (3) holds for all $h \in X$, then $T$ is unique.
By (4.1.1) the Gateaux differential $\delta f(x_0)$ has the same linearity properties (as a function of $f$) as the Frechet differential (cf. (3.1.3)). Also, by (4.1.2), the chain rule here takes the form that if $f$ is Gateaux differentiable at $x_0$ and $g$ is Frechet differentiable at the point $y_0 = f(x_0)$, then $g \circ f$ is Gateaux differentiable at $x_0$ and its Gateaux differential there is $dg(y_0) \circ \delta f(x_0)$.† In particular, if $g \in \mathcal{L}(Y, Z)$, then the Gateaux differential of $g \circ f$ at $x_0$ is $g \circ \delta f(x_0)$. When $X = \mathbf{R}^n$, $Y = \mathbf{R}^m$, $Z = \mathbf{R}^l$, there is also a direct analogue of (3.4.3) involving the Jacobian matrices of $f$ and $g$.
We note in passing that the particular case of the above chain rule where $g \in \mathcal{L}(Y, Z)$ implies that the result of §3.1, Example (g) (p. 173), holds for Gateaux differentiable functions.
† We shall see in (4.2.5) that the Frechet differentiability of $g$ here can be replaced by the weaker property of Hadamard differentiability.
By virtue of (4.1.3), the result of (3.2.2) extends to Gateaux differentiability; more precisely, we have:
(4.1.6) Let $M > 0$, let $C$ be a closed convex set in $X$ with a non-empty interior $C^\circ$, and let $f : C \to Y$ be a continuous function such that, for nearly all $x \in C^\circ$, the Gateaux differential $\delta f(x)$ exists and satisfies $\|\delta f(x)\| \le M$. Then $f$ is $M$-Lipschitzian on $C$, i.e. for all $a, b \in C$
\[ \|f(b) - f(a)\| \le M\|b - a\|. \]
There is a simpler variant of (4.1.6) which is frequently useful, namely:
(4.1.7) If $C$ is an open convex set in $X$, $f : C \to Y$ is Gateaux differentiable on $C$, and $\|\delta f(x)\| \le M$ for all $x \in C$, then $f$ is $M$-Lipschitzian on $C$.
To prove this, we observe that if $a, b \in C$ and $S$ is the closed segment joining $a$ and $b$, then the Gateaux differentiability of $f$ implies that $f$ is continuous on $S$. We can therefore apply (4.1.3) to $f$ on $S$ to obtain the result.
(4.1.7. Corollary 1) If $f$ is a function from a set $A \subseteq X$ into $Y$ whose Gateaux differential $\delta f$ is defined on a neighbourhood of $x_0$ and is continuous at $x_0$, then $f$ is Frechet differentiable at $x_0$.
Given $\varepsilon > 0$, we can find an open ball $B$ with centre $x_0$ contained in $A$ such that $\delta f(x)$ is defined and satisfies $\|\delta f(x) - \delta f(x_0)\| \le \varepsilon$ for all $x \in B$. By (4.1.7) applied to the function
\[ x \mapsto f(x) - \delta f(x_0)x \quad (x \in B), \]
whose Gateaux differential at $x$ is $\delta f(x) - \delta f(x_0)$, we deduce that for all $x \in B$
\[ \|f(x) - f(x_0) - \delta f(x_0)(x - x_0)\| \le \varepsilon\|x - x_0\|, \]
whence $f$ is Frechet differentiable at $x_0$.
(4.1.7. Corollary 2) If $f$ is a function from an open set $E \subseteq X$ into $Y$, then $f$ is $C^1$ on $E$ if and only if its Gateaux differential $\delta f$ is continuous on $E$.
If $X$ is a product space, say $X = X_1 \times \cdots \times X_n$, we can define partial Gateaux differentials of $f$ exactly as in the case of Frechet differentials discussed in §3.3. Thus for $j = 1, \ldots, n$ let $\iota_j$ be the insertion map from $X_j$ into $X$ which maps the point $x_j$ of $X_j$ to the point of $X$ with the $j$th coordinate $x_j$ and all other coordinates $0$. Then the partial Gateaux differential of a function $f : A \to Y$ at the point $x_0 = (x_{1,0}, \ldots, x_{n,0})$ of $A$ with respect to the $j$th coordinate is the Gateaux differential of the function $x_j \mapsto f(x_0 + \iota_j(x_j - x_{j,0}))$ at $x_{j,0}$, provided that this latter differential exists. By analogy with §3.3, we denote this partial differential by $\delta_j f(x_0)$. It is readily verified that the results of (3.3.1,2) hold for partial Gateaux differentials.
It is also possible to define Gateaux differentials of higher order.
However, we make no use of such differentials, and we do not consider
them further.
We mention finally the upper and lower Gateaux variations $\overline{\delta} f(x_0)h$ and $\underline{\delta} f(x_0)h$ of a real-valued function $f$. These are respectively the upper and lower right-hand Dini derivatives of the function $t \mapsto f(x_0 + th)$ at $t = 0$, i.e.
\[ \overline{\delta} f(x_0)h = \limsup_{t \to 0+} (f(x_0 + th) - f(x_0))/t, \]
\[ \underline{\delta} f(x_0)h = \liminf_{t \to 0+} (f(x_0 + th) - f(x_0))/t \]
(the values $\pm\infty$ being allowed). These upper and lower variations will be used in §4.6.
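The upper and lower variations can differ even when $f$ is continuous. The sketch below is a rough numerical illustration, not a rigorous computation of a limsup: it takes the hypothetical function $\phi(t) = t\sin(1/t)$ (with $\phi(0) = 0$), for which the quotient at $0$ is $\sin(1/t)$, and samples the quotient at points $t = 1/s$ with $s$ running over a window beyond $1/\varepsilon$. The function name `dini_estimates` and all parameters are assumptions, and the sampling scheme is tailored to this particular oscillation.

```python
import math

def dini_estimates(phi, eps=1e-3, window=50.0, samples=20001):
    """Crude estimates of the upper and lower right-hand Dini derivatives at 0.

    The quotient (phi(t) - phi(0)) / t is sampled at t = 1/s, where s runs
    over [1/eps, 1/eps + window]; the max and min of the samples are returned.
    """
    base = 1.0 / eps
    quotients = []
    for i in range(samples):
        s = base + window * i / (samples - 1)
        t = 1.0 / s                          # satisfies 0 < t <= eps
        quotients.append((phi(t) - phi(0.0)) / t)
    return max(quotients), min(quotients)

def phi(t):
    # Hypothetical test function: continuous at 0, oscillating quotient.
    return t * math.sin(1.0 / t) if t != 0.0 else 0.0

upper, lower = dini_estimates(phi)
print(upper, lower)   # close to +1 and -1 respectively
```

For this $\phi$ the upper and lower right-hand Dini derivatives at $0$ are $1$ and $-1$, so the (ordinary) Gateaux variation for the increment $1$ does not exist.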
Exercises 4.1
1 Let $X$ be real and let $f$ be a function from a convex set $A \subseteq X$ into $\mathbf{R}$. Prove that if for each $x \in A$ the Gateaux variation $\delta f(x)$ of $f$ is defined and convex on $X$, and
\[ f(x) \ge f(x_0) + \delta f(x_0)(x - x_0) \]
for all $x, x_0 \in A$, then $f$ is convex.
[Hint. Cf. Exercise 3.1.8.]
2 Let $f$ be a function from a set $A \subseteq X$ into $Y$ which has a Gateaux variation at $x_0$ for all increments $h$, and suppose that $\|f(x_0)\| \le \|f(x)\|$ for all $x \in A$. Prove that, for all $h \in X$,
\[ \|f(x_0)\| \le \|f(x_0) + \delta f(x_0)h\|. \]
4.2 The Hadamard variation and the Hadamard differential

The notion of differentiability introduced in this section is intermediate between those of Frechet and Gateaux. We consider first the corresponding variation, and we begin by defining an auxiliary concept, the set of vectors directed into a set at a point.
Let $E \subseteq X$, and let $x_0 \in X$. We say that a vector $h \in X$ is directed into $E$ at $x_0$ if there exist a neighbourhood $W$ of $h$ in $X$ and a positive number $\varepsilon$ such that $x_0 + tk \in E$ whenever $k \in W$ and $0 < t < \varepsilon$. It is easy to see that the set of vectors directed into $E$ at $x_0$ is an open cone in $X$ with vertex $0$, and we denote this cone by $K_d(E; x_0)$.† For example, we show in (4.5.3) that if $E$ is non-empty and convex, and $x_0 \in \overline{E}$, then
\[ K_d(E; x_0) = \{h \in X : \text{there exists } \lambda > 0 \text{ for which } x_0 + \lambda h \in E^\circ\}. \]
The following statements are readily verified:
(a) if $K_d(E; x_0) \ne \emptyset$, then $x_0 \in \overline{E}$ (so that if $x_0 \notin \overline{E}$ then $K_d(E; x_0) = \emptyset$),
(b) $0 \in K_d(E; x_0)$ if and only if $x_0 \in E^\circ$, and then $K_d(E; x_0) = X$,
(c) $h \in K_d(E; x_0)$ if and only if, for each sequence $(h_n)$ in $X$ converging to $h$ and each sequence $(t_n)$ of positive numbers converging to $0$, the point $x_0 + t_n h_n$ belongs to $E$ for all sufficiently large $n$.‡
Now let $f$ be a function from a set $A \subseteq X$ into $Y$, and let $x_0 \in A$. We say that $f$ has a Hadamard variation at $x_0$ for the increment $h$ if $h$ is directed into $A$ at $x_0$ and there exists $l \in Y$ such that, for each sequence $(h_n)$ in $X$ converging to $h$ and each sequence $(t_n)$ of positive numbers converging to $0$, we have
\[ \lim_{n \to \infty} (f(x_0 + t_n h_n) - f(x_0))/t_n = l. \]
The element $l$ is then called the Hadamard variation of $f$ at $x_0$ for the increment $h$, and we denote it by $\partial f(x_0)h$. Further, the function $h \mapsto \partial f(x_0)h$ with domain the set of $h \in X$ for which $\partial f(x_0)h$ exists will be called the Hadamard variation of $f$ at $x_0$, and we denote it by $\partial f(x_0)$.
It is obvious that if $\partial f(x_0)h$ exists and $\alpha > 0$, then $\partial f(x_0)(\alpha h)$ exists and is equal to $\alpha\,\partial f(x_0)h$.
By taking $h_n = h$ for all $n$ we see that if $f$ has a Hadamard variation at $x_0$ for the increment $h$, then $f$ has a Gateaux variation at $x_0$ for $h$ and $\delta f(x_0)h = \partial f(x_0)h$. In particular, this implies that if $\partial f(x_0)0$ exists then it is equal to $0$. In the opposite direction, the existence of the Gateaux variation $\delta f(x_0)h$ does not in general imply the existence of the Hadamard variation $\partial f(x_0)h$ (see Example (c) below), but we have the result that if $f$ has a Gateaux variation at $x_0$ for $h$ and is Lipschitzian in a neighbourhood of $x_0$, then $f$ has a Hadamard variation at $x_0$ for $h$. This is a trivial consequence of the identity
\[ \frac{f(x_0 + t_n h_n) - f(x_0)}{t_n} = \frac{f(x_0 + t_n h) - f(x_0)}{t_n} + \frac{f(x_0 + t_n h_n) - f(x_0 + t_n h)}{t_n}, \tag{1} \]
since, if $f$ is $K$-Lipschitzian near $x_0$, the last term has norm at most $K\|h_n - h\|$ for all sufficiently large $n$.
† $K_d(E; x_0)$ is often called the cone of feasible directions at $x_0$. For the purposes of this section it is enough to consider the case where $x_0 \in E$, but we require the more general definition in §4.5.
‡ For such sequences $(h_n)$, $(t_n)$, the points $t_n h_n$ approach $0$ 'tangentially' to the vector $h$ (see §4.3).
Example (a)
Let $\phi$ be a function from a set $A \subseteq \mathbf{R}$ into $Y$, and let $t_0 \in A$. If $h > 0$, then $\phi$ has a Hadamard variation at $t_0$ for the increment $h$ if and only if $A$ contains an interval of the form $[t_0, t_0 + \eta[$, where $\eta > 0$, and $\phi$ has a right-hand derivative $\phi'_+(t_0)$, and then $\partial\phi(t_0)h = h\phi'_+(t_0)$. Similarly, if $h < 0$, then $\partial\phi(t_0)h$ exists if and only if $A$ contains an interval of the form $]t_0 - \eta, t_0]$ and $\phi$ has a left-hand derivative $\phi'_-(t_0)$, and then $\partial\phi(t_0)h = h\phi'_-(t_0)$. Further, if $t_0$ is an interior point of $A$ and both $\phi'_+(t_0)$ and $\phi'_-(t_0)$ exist, then $\partial\phi(t_0)0$ exists. Hence (as with the Gateaux variation) $\partial\phi(t_0)h$ exists for all $h \in \mathbf{R}$ if and only if $t_0$ is an interior point of $A$ and both $\phi'_+(t_0)$ and $\phi'_-(t_0)$ exist.
It should be noted that the existence of $\partial\phi(t_0)0$ does not imply the existence of $\phi'_\pm(t_0)$ (see (4.2.2) below).
Example (b)
If $f$ is a function from a subset of $X$ into $Y$ which is Frechet differentiable at a point $x_0$, then $\partial f(x_0)h$ exists for all $h \in X$ and is equal to $df(x_0)h$ (so that $\partial f(x_0) \in \mathcal{L}(X, Y)$). This follows easily from (3.1.2)(iii).
Example (c)
If $f : \mathbf{R}^2 \to \mathbf{R}$ is given by
\[ f(x_1, x_2) = 1 \ \text{ if } x_2 = x_1^2 \ne 0, \qquad f(x_1, x_2) = 0 \ \text{ otherwise}, \]
then $f$ has a Gateaux variation at $(0, 0)$ which belongs to $\mathcal{L}(\mathbf{R}^2, \mathbf{R})$ (§4.1, Example (c)). On the other hand, the Hadamard variation of $f$ at $(0, 0)$ for the increment $h = (1, 0)$ does not exist, for if $(t_n)$ is a sequence of positive numbers converging to $0$ and $h_n = (1, t_n)$, then $h_n \to h$, and $t_n h_n = (t_n, t_n^2)$, so that $(f(t_n h_n) - f(0, 0))/t_n = 1/t_n \to \infty$.
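The contrast between the two kinds of variation in this example can be seen in a short sketch. It assumes, as in §4.1, Example (c), that $f$ is the indicator of the parabola $x_2 = x_1^2$ with the origin removed, and it uses the hypothetical choices $t_n = 2^{-n}$ and $h_n = (1, t_n)$; the names `quotient`, `fixed` and `perturbed` are illustrative.

```python
def f(x1, x2):
    """Indicator of the parabola x2 = x1**2 with the origin removed."""
    return 1.0 if (x1 != 0.0 and x2 == x1 * x1) else 0.0

def quotient(t, h):
    """Difference quotient (f(t*h) - f(0, 0)) / t for t > 0."""
    return (f(t * h[0], t * h[1]) - f(0.0, 0.0)) / t

ts = [2.0 ** -n for n in range(1, 11)]

# Fixed increment h_n = h = (1, 0): every quotient is 0 (Gateaux variation 0).
fixed = [quotient(t, (1.0, 0.0)) for t in ts]

# Perturbed increments h_n = (1, t_n) -> h: the points t_n * h_n lie on the
# parabola, so the quotients equal 1 / t_n and are unbounded.
perturbed = [quotient(t, (1.0, t)) for t in ts]

print(fixed)
print(perturbed)
```

Since the quotients along $(h_n)$ are unbounded while those along the constant sequence are $0$, no single limit $l$ can serve for all admissible sequences, which is exactly the failure of the Hadamard condition.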
The following result gives an equivalent formulation of the definition of the Hadamard variation which is perhaps more illuminating, though less useful, than that above.
(4.2.1) Let $f$ be a function from a set $A \subseteq X$ into $Y$, let $x_0 \in A$, and let $h \in X$. Then the following statements are equivalent:
(i) the Hadamard variation $\partial f(x_0)h$ exists and is equal to $l$;
(ii) for each function $\phi$ from a subset $B$ of $\mathbf{R}$ containing $0$ into $X$ such that $\phi(0) = x_0$, $\phi'_+(0) = h$, the function $f \circ \phi$ is defined on $B \cap [0, \eta[$ for some $\eta > 0$ and has a right-hand derivative at $0$ equal to $l$.
Suppose first that (i) holds, let $\phi$ be a function from a subset $B$ of $\mathbf{R}$ containing $0$ into $X$ such that $\phi(0) = x_0$, $\phi'_+(0) = h$, let $(t_n)$ be a sequence of positive numbers in $B$ converging to $0$, and let $h_n = (\phi(t_n) - \phi(0))/t_n$. Then $h_n \to h$ as $n \to \infty$, and since $h$ is directed into $A$ at $x_0$, it follows that $\phi(t_n) = x_0 + t_n h_n \in A$ for all sufficiently large $n$. Further, since $\partial f(x_0)h = l$,
\[ \frac{f(\phi(t_n)) - f(\phi(0))}{t_n} = \frac{f(x_0 + t_n h_n) - f(x_0)}{t_n} \to l \quad \text{as } n \to \infty. \]
This implies that $(f \circ \phi)'_+(0)$ exists and is equal to $l$. Moreover, it implies that $f \circ \phi$ is defined on $B \cap [0, \eta[$ for some $\eta > 0$, for otherwise we can find a sequence $(t_n)$ of positive numbers in $B$ converging to $0$ such that, for each $n$, $\phi(t_n) \notin A$, and this gives a contradiction.
Conversely, suppose that (ii) holds, let $(h_n)$ be a sequence in $X$ converging to $h$, let $(t_n)$ be a sequence of positive numbers converging to $0$, and let $\phi : [0, \infty[ \to X$ be given by
\[ \phi(t_n) = x_0 + t_n h_n \ (n = 1, 2, \ldots), \qquad \phi(t) = x_0 + th \ \text{ otherwise}. \]
Then $\phi(0) = x_0$ and $(\phi(t) - \phi(0))/t \to h$ as $t \to 0+$, so that $\phi'_+(0) = h$. Hence $f \circ \phi$ is defined on the interval $[0, \eta[$ for some $\eta > 0$, and this implies that $x_0 + t_n h_n = \phi(t_n) \in A$ for all sufficiently large $n$, so that $h$ is directed into $A$ at $x_0$. Also
\[ \frac{f(x_0 + t_n h_n) - f(x_0)}{t_n} = \frac{f(\phi(t_n)) - f(\phi(0))}{t_n} \to l \]
as $n \to \infty$, so that $\partial f(x_0)h = l$.

The position of the increment 0 in the theory is anomalous, as is shown
by the following result.
(4.2.2) Let $f$ be a function from a set $A \subseteq X$ into $Y$, and let $x_0 \in A$. Then $\partial f(x_0)0$ exists if and only if $x_0$ is an interior point of $A$ and there exist $K > 0$ and a neighbourhood $V$ of $x_0$ in $A$ such that
\[ \|f(x) - f(x_0)\| \le K\|x - x_0\| \tag{2} \]
for all $x \in V$. (In particular, if $\partial f(x_0)0$ exists then $f$ is continuous at $x_0$.)
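Suppose first that $\partial f(x_0)0$ exists. Then $0$ is directed into $A$ at $x_0$, and therefore $x_0 \in A^\circ$, by (b) (p. 259). Further, $f$ is continuous at $x_0$, for if $(x_n)$ is a sequence in $A \setminus \{x_0\}$ converging to $x_0$, and $h_n$, $t_n$ are chosen so that $x_n = x_0 + t_n h_n$, for instance
\[ t_n = \|x_n - x_0\|^{1/2}, \qquad h_n = (x_n - x_0)/t_n, \]
then $h_n \to 0$ in $X$, $t_n \to 0$ in $]0, \infty[$, and therefore
\[ f(x_n) - f(x_0) = t_n \cdot \frac{f(x_0 + t_n h_n) - f(x_0)}{t_n} \to 0 \cdot \partial f(x_0)0 = 0. \]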
To prove the inequality (2), suppose on the contrary that no such $K$ and $V$ exist. Then we can find a sequence $(k_n)$ in $X \setminus \{0\}$ converging to $0$ such that
\[ \|f(x_0 + k_n) - f(x_0)\| / \|k_n\| \to \infty \quad \text{as } n \to \infty, \]
and obviously $\|f(x_0 + k_n) - f(x_0)\| > 0$ for all sufficiently large $n$. For such $n$ let
\[ t'_n = \|f(x_0 + k_n) - f(x_0)\|, \qquad h'_n = k_n / t'_n. \]
Then $h'_n \to 0$ in $X$, $t'_n \to 0$ in $]0, \infty[$, and
\[ \|f(x_0 + t'_n h'_n) - f(x_0)\| / t'_n = 1, \]
and again we obtain a contradiction, since $\partial f(x_0)0 = 0$.
It remains to prove the 'if', and this is simple, for if (2) holds for all $x \in V$ then
\[ \|f(x_0 + th) - f(x_0)\| / t \le K\|h\| \]
whenever $x_0 + th \in V$ and $t > 0$.
Theorem (3.1.3) on the linearity (as a mapping defined on a vector space of functions) of the Frechet differential applies without change to the Hadamard variation, just as it does for the Gateaux variation (cf. (4.1.1)).
(4.2.3) (i) If $f$ is a function from a set $A \subseteq X$ into $Y$ which has a Hadamard variation at $x_0$ for the increment $h$, then so does $\alpha f$ for every scalar $\alpha$, and $\partial(\alpha f)(x_0)h = \alpha\,\partial f(x_0)h$.
(ii) If $f$, $g$ are functions from sets $A, B \subseteq X$ into $Y$ which have Hadamard variations at $x_0$ for the increment $h$, then so does $f + g$, and $\partial(f + g)(x_0)h = \partial f(x_0)h + \partial g(x_0)h$.
In contrast to the Gateaux variation, the Hadamard variation obeys the chain rule, viz.
(4.2.4) Let $f$ be a function from a set $A \subseteq X$ into $Y$ which has a Hadamard variation at the point $x_0$ for the increment $h$, and let $g$ be a function from a set $B \subseteq Y$ into $Z$ which has a Hadamard variation at the point $y_0 = f(x_0)$ for the increment $k = \partial f(x_0)h$. Then $g \circ f$ has a Hadamard variation at $x_0$ for the increment $h$, equal to $\partial g(y_0)k = \partial g(y_0)(\partial f(x_0)h)$.
Let $(h_n)$ be a sequence in $X$ converging to $h$, and let $(t_n)$ be a sequence of positive numbers converging to $0$. Then $x_0 + t_n h_n \in A$ for all sufficiently large $n$, and if for such $n$ we write $k_n = (f(x_0 + t_n h_n) - f(x_0))/t_n$, then $k_n \to \partial f(x_0)h = k$ as $n \to \infty$. This implies in turn that $f(x_0 + t_n h_n) = y_0 + t_n k_n \in B$ for all sufficiently large $n$, whence $h$ is directed into the domain of $g \circ f$ at $x_0$. Further,
\[ \frac{g(f(x_0 + t_n h_n)) - g(f(x_0))}{t_n} = \frac{g(y_0 + t_n k_n) - g(y_0)}{t_n} \to \partial g(y_0)k \]
as $n \to \infty$, and this gives the result.
A similar but simpler proof gives the following companion result (cf. (4.1.2)).
(4.2.5) Let $f$ be a function from a set $A \subseteq X$ into $Y$ which has a Gateaux variation at the point $x_0$ for the increment $h$, and let $g$ be a function from a set $B \subseteq Y$ into $Z$ which has a Hadamard variation at the point $y_0 = f(x_0)$ for the increment $k = \delta f(x_0)h$. Then $g \circ f$ has a Gateaux variation at $x_0$ for the increment $h$, equal to $\partial g(y_0)k = \partial g(y_0)(\delta f(x_0)h)$.
The next group of results deals with the case where the variation $\partial f(x_0)h$ exists for all $h \in X$, and here there are marked contrasts with the Gateaux variation (cf. §4.1, Examples (c), (d)).
(4.2.6) Let $f$ be a function from a set $A \subseteq X$ into $Y$, let $x_0 \in A$, and suppose that the Hadamard variation $\partial f(x_0)h$ exists for all $h \in X$. Then $x_0$ is an interior point of $A$, $f$ is continuous at $x_0$ (and indeed (2) holds on some neighbourhood of $x_0$), and $\partial f(x_0)$ is continuous on $X$.
The first two statements and the parenthesis follow from the existence of $\partial f(x_0)0$. To prove the last statement, suppose on the contrary that $\partial f(x_0)$ is not continuous at a point $k \in X$. Then we can find a positive number $\varepsilon$ and a sequence $(k_m)$ in $X$ converging to $k$ such that
\[ \|\partial f(x_0)k_m - \partial f(x_0)k\| \ge \varepsilon \]
for all $m$. Further, since
\[ \lim_{n \to \infty} (f(x_0 + t_n k_m) - f(x_0))/t_n = \partial f(x_0)k_m \]
for each $m$ and every sequence $(t_n)$ of positive numbers converging to $0$, we can find a sequence $(s_m)$ of positive numbers converging to $0$ such that
\[ \left\| \frac{f(x_0 + s_m k_m) - f(x_0)}{s_m} - \partial f(x_0)k \right\| \ge \tfrac{1}{2}\varepsilon \]
for all $m$, and this gives a contradiction, since the expression on the left tends to $0$ as $m \to \infty$.
The most important example where the Hadamard variation is defined on $X$, but may be non-linear, is that of a convex function.
(4.2.7) If $p$ is a continuous convex function on a convex set $A \subseteq X$ and $x_0$ is an interior point of $A$, then $\partial p(x_0)h$ exists for all $h \in X$ and the function $\partial p(x_0)$ is (continuous and) sublinear.
By §4.1, Example (e), the Gateaux variation $\delta p(x_0)$ is defined and sublinear on $X$. Since $p$ is Lipschitzian on a neighbourhood of $x_0$ (A.3.11), it follows that the Hadamard variation $\partial p(x_0)$ is defined and sublinear on $X$.
By an obvious translation, we deduce from (4.2.1) the following corollary.
(4.2.7. Corollary 1) Let $p$ be a continuous convex function on a convex set $A \subseteq X$, let $x_0$ be an interior point of $A$, and let $\phi$ be a function from a set $B \subseteq \mathbf{R}$ into $X$ taking the value $x_0$ at $t_0$ and having there a right-hand derivative $\phi'_+(t_0)$. Then $p \circ \phi$ is defined on $B \cap [t_0, t_0 + \eta[$ for some $\eta > 0$ and has a right-hand derivative at $t_0$ equal to $\partial p(x_0)\phi'_+(t_0)$.
The hypotheses in (4.2.7) can be weakened. We recall (A.3.12) that if $p$ is a lower semicontinuous convex function on a convex set $A \subseteq X$, and $x_0$ is an internal point of $A$, then $x_0$ is an interior point of $A$, and $p$ is continuous on a neighbourhood of $x_0$. Moreover, it is clear from the definition of Gateaux variation that if $\delta p(x_0)h$ exists for all $h \in X$, then $x_0$ is an internal point of $A$. Hence we have:
(4.2.7. Corollary 2) Let $p$ be a lower semicontinuous convex function on a convex set $A \subseteq X$, and suppose that $p$ has a Gateaux variation $\delta p(x_0)h$ at a point $x_0 \in A$ for all $h \in X$. Then $x_0$ is an interior point of $A$, $p$ is continuous on a neighbourhood of $x_0$, $\partial p(x_0)h$ exists for all $h \in X$, and $\partial p(x_0)$ is continuous and sublinear.
We now define Hadamard differentiability. Let $f$ be a function from a set $A \subseteq X$ into $Y$, and let $x_0$ be an interior point of $A$. We say that $f$ is Hadamard differentiable at $x_0$ if the Hadamard variation $\partial f(x_0)h$ exists for all $h \in X$ and the function $\partial f(x_0)$ belongs to $\mathcal{L}(X, Y)$. This function $\partial f(x_0)$ is then called the Hadamard differential of $f$ at $x_0$. Further, the function $x \mapsto \partial f(x)$ whose domain is the set of interior points of $A$ at which $f$ is Hadamard differentiable is called the Hadamard differential of $f$, and we denote it by $\partial f$.
The following result is the analogue of (3.1.2) for Hadamard differentiability.
(4.2.8) Let $f$ be a function from a set $A \subseteq X$ into $Y$, let $x_0$ be an interior point of $A$, and let $T \in \mathcal{L}(X, Y)$. Then the following statements are equivalent:
(i) $f$ is Hadamard differentiable at $x_0$, with $\partial f(x_0) = T$, i.e. if $h \in X$ then
\[ (f(x_0 + t_n h_n) - f(x_0))/t_n \to Th \quad \text{as } n \to \infty \tag{3} \]
for every sequence $(h_n)$ in $X$ converging to $h$ and every sequence $(t_n)$ of positive numbers converging to $0$;
(ii) if $h \in X$ then (3) holds for every sequence $(h_n)$ in $X$ converging to $h$ and every sequence $(t_n)$ of non-zero real numbers converging to $0$;
(iii) for each function $\phi$ from a neighbourhood of $0$ in $\mathbf{R}$ into $X$ which takes the value $x_0$ at $0$ and has a derivative at $0$, the function $f \circ \phi$ is defined on a neighbourhood of $0$ and has a derivative at $0$ equal to $T(\phi'(0))$;
(iv) for each compact set $E \subseteq X$,
\[ (f(x_0 + th) - f(x_0))/t \to Th \]
as $t \to 0$, uniformly for $h$ in $E$.
Here the equivalence of (i) and (ii) is almost immediate. Also (i) implies (iii), by (4.2.1), and (iii) implies (ii), by an argument similar to that of (4.2.1).
Next, (ii) implies (iv). To prove this, suppose that (ii) holds, and let
\[ D(t, h) = (f(x_0 + th) - f(x_0))/t \quad (x_0 + th \in A,\ t \ne 0). \]
Then $D(t, h) \to Th$ as $t \to 0$, for each $h \in X$, and we have to show that the convergence is uniform on any compact $E \subseteq X$. If this is false, we can find a positive number $\varepsilon$, a sequence $(t_n)$ of non-zero real numbers converging to $0$, and a sequence $(h_n)$ in $E$, such that $\|D(t_n, h_n) - Th_n\| \ge \varepsilon$ for all $n$. Since $E$ is compact, the sequence $(h_n)$ has a subsequence $(h_{n_m})$ converging to a point $h$ of $E$, and since $T$ is continuous we have
\[ D(t_{n_m}, h_{n_m}) - Th_{n_m} \to Th - Th = 0 \quad \text{as } m \to \infty. \]
This gives a contradiction, and hence (ii) implies (iv).

We show finally that (iv) implies (ii). If $(h_n)$ is a sequence in $X$ converging to a point $h$, then the set consisting of $h$ together with the points $h_n$ is compact. Hence if (iv) holds, then $D(t, h_n) - Th_n \to 0$ as $t \to 0$, uniformly in $n$. It follows that if $(t_n)$ is a sequence of non-zero real numbers converging to $0$, then $D(t_n, h_n) - Th_n \to 0$ as $n \to \infty$, and since $T$ is continuous, this implies that (ii) holds.
For a function $\phi$ of a real variable taking values in a real normed space, the notions of Frechet, Hadamard and Gateaux differentiability at a point coincide, and each is equivalent to the property that the derivative of $\phi$ exists when the point in question is an interior point of the domain of $\phi$ (cf. Example (a) and §4.1, Example (a)).
For a function $f$ from a subset of $X$ into $Y$, the Hadamard differentiability of $f$ at $x_0$ obviously implies the Gateaux differentiability of $f$ there, and $\delta f(x_0) = \partial f(x_0)$. The converse is false when $\dim X \ge 2$, as is readily seen by considering an example similar to the function of Example (c). However, if in addition $f$ is Lipschitzian in a neighbourhood of $x_0$, then the Gateaux differentiability of $f$ at $x_0$ implies the Hadamard differentiability of $f$ there (see p. 259).
If $f$ is Frechet differentiable at $x_0$, then it is Hadamard differentiable there, and $\partial f(x_0) = df(x_0)$ (cf. Example (b)). From (4.2.8)(iv) and (3.1.2)(iv) we see further that the converse is true when $\dim X < \infty$, so that Frechet and Hadamard differentiability are then equivalent. In fact, the following example shows that Frechet and Hadamard differentiability are equivalent if and only if $\dim X < \infty$.
Example (d)
Let $\dim X = \infty$, let $(w_n)$ be a sequence on the unit sphere of $X$ with no convergent subsequence, let $c$ be a unit vector in $Y$, and let $f : X \to Y$ be given by
$$f(w_m/m) = c/m \quad (m = 1,2,\dots), \qquad f(x) = 0 \ \text{otherwise}.$$
Then $f$ is Hadamard differentiable at 0 with $\partial f(0) = 0$, but $f$ is not Frechet differentiable at 0.
To prove the first statement, let $h \in X$, let $(h_n)$ be a sequence in $X$ converging to $h$, and let $(t_n)$ be a sequence of positive numbers converging to 0. If $f(t_nh_n) = 0$ for all sufficiently large $n$, then obviously $f(t_nh_n)/t_n \to 0$ as $n \to \infty$. In the contrary case, let $(n_r)$ be the subsequence of the positive integers for which $f(t_nh_n) \ne 0$. We have to show that $f(t_{n_r}h_{n_r})/t_{n_r} \to 0$ as $r \to \infty$.
For each $r$ we have $t_{n_r}h_{n_r} = w_{m_r}/m_r$ for some positive integer $m_r$, and since $t_{n_r}h_{n_r} \to 0$, $m_r \to \infty$ as $r \to \infty$. If $h \ne 0$ and for each non-zero $x$ of $X$ we write $x^\wedge = x/\|x\|$, then
$$w_{m_r} = (t_{n_r}h_{n_r})^\wedge = h_{n_r}^\wedge \to h^\wedge$$
as $r \to \infty$, and this contradicts our choice of the $w_m$. Hence $h = 0$, and therefore
$$\|f(t_{n_r}h_{n_r})/t_{n_r}\| = \|c/(t_{n_r}m_r)\| = \|w_{m_r}/(t_{n_r}m_r)\| = \|h_{n_r}\| \to 0,$$
as required.
It follows now that if $f$ is Frechet differentiable at 0, then $df(0) = 0$. This, however, is impossible, since the sequence $(w_m)$ is bounded and $m(f(w_m/m) - f(0)) = c$ for all $m$ (cf. (3.1.2)(iii)).
From (4.2.6) we see that if a function $f$ is Hadamard differentiable at $x_0$ then it is continuous at $x_0$, and indeed there exist $K > 0$ and a neighbourhood $V$ of $x_0$ such that
$$\|f(x) - f(x_0)\| \le K\|x - x_0\|$$
for all $x \in V$. The function we have just considered in Example (d) shows that in this inequality we cannot take $K$ to be of the form $\|\partial f(x_0)\| + \varepsilon$ for arbitrarily small positive $\varepsilon$ (cf. (3.1.1)).
By (4.2.3), the Hadamard differential has the same linearity properties as the Frechet differential (cf. (3.1.3)). Further, by (4.2.4), the chain rule here takes the following form.
(4.2.9) Let $f$ be a function from a set $A \subseteq X$ into $Y$ which is Hadamard differentiable at a point $x_0$, and let $g$ be a function from a set $B \subseteq Y$ into $Z$ which is Hadamard differentiable at the point $y_0 = f(x_0)$. Then $g \circ f$ is Hadamard differentiable at $x_0$, and $\partial(g \circ f)(x_0) = \partial g(y_0) \circ \partial f(x_0)$.
From (4.2.9) and (4.2.8) we shall deduce that Hadamard differentiability is the most general form of differentiability for which the chain rule holds, the phrase 'most general' here being used in the following sense.
It is reasonable to require of any definition of differentiability, say $\mathfrak{X}$-differentiability, that
(i) if $f$ is a function from a set $A \subseteq X$ into $Y$, and $x_0$ is an interior point of $A$, then the $\mathfrak{X}$-differential of $f$ at $x_0$, if it exists, is a unique continuous linear function from $X$ into $Y$,
(ii) if $\phi$ is a function from a set $A \subseteq \mathbf{R}$ into $Y$ considered as a normed space over $\mathbf{R}$ and $t_0$ is an interior point of $A$, then $\phi$ is $\mathfrak{X}$-differentiable at $t_0$ if and only if the derivative $\phi'(t_0)$ exists, and the $\mathfrak{X}$-differential of $\phi$ at $t_0$ is the function $h \mapsto h\phi'(t_0)$ from $\mathbf{R}$ into $Y$.
We shall show that if the chain rule holds for $\mathfrak{X}$-differentiability (i.e. the result of (4.2.9) holds with $\mathfrak{X}$-differentiability throughout) then every function $\mathfrak{X}$-differentiable at a point is also Hadamard differentiable at that point and the two differentials are the same.
To prove this, suppose that $\mathfrak{X}$-differentiability satisfies these conditions, let $f : A \to Y$ be $\mathfrak{X}$-differentiable at $x_0$, and let $\phi$ be a function from a neighbourhood of 0 in $\mathbf{R}$ into $X$ such that $\phi(0) = x_0$ and that $\phi'(0)$ exists. Then, by applying the chain rule to $f$ and $\phi$, we see that $f$ satisfies the criterion of (4.2.8)(iii), whence $f$ is Hadamard differentiable at $x_0$.
The mean value inequalities of (4.1.6,7) and the results of (4.1.7, Corollaries 1,2) obviously hold with the Hadamard differential in place of the Gateaux differential. In particular, when we define the functions which are $C^1$ on an open set $E$, it is immaterial whether we use the Frechet, Gateaux, or Hadamard differentials.
Partial Hadamard differentials can be defined exactly as for Frechet
and Gateaux differentials (§§3.3,4.1). We can also define Hadamard
differentials of higher order, but since we make no use of these higher
order differentials we do not pursue this point.
We next consider some results concerning the upper and lower Hadamard variations of a real-valued function.
Let $f$ be a function from a set $A \subseteq X$ into $\mathbf{R}$, let $x_0 \in A$, and let $h$ be directed into $A$ at $x_0$. We define the upper Hadamard variation of $f$ at $x_0$ for the increment $h$ by the relation
$$\overline{\partial} f(x_0)h = \sup\Bigl(\limsup_{n\to\infty}\,(f(x_0 + t_nh_n) - f(x_0))/t_n\Bigr),$$
where the supremum is taken over all sequences $(h_n)$ in $X$ converging to $h$ and all sequences $(t_n)$ of positive numbers converging to 0 (the values $\pm\infty$ being permitted). The lower variation $\underline{\partial} f(x_0)h$ is then defined to be the negative of the upper variation of $(-f)$. Obviously if $f$ has a Hadamard variation at $x_0$ for the increment $h$, then $\underline{\partial} f(x_0)h = \partial f(x_0)h = \overline{\partial} f(x_0)h$.
We observe that there exist a sequence $(h_n)$ in $X$ converging to $h$ and a sequence $(t_n)$ of positive numbers converging to 0 such that
$$\overline{\partial} f(x_0)h = \lim_{n\to\infty}\,(f(x_0 + t_nh_n) - f(x_0))/t_n \tag{4}$$
(so that $\overline{\partial} f(x_0)h$ is the greatest extended real number with this property). To prove this, we may suppose that $\overline{\partial} f(x_0)h > -\infty$, otherwise the result is trivial. Let $(s_m)$ be a strictly increasing sequence of real numbers such that $s_m \to \overline{\partial} f(x_0)h$ as $m \to \infty$. Then for each positive integer $m$ we can find sequences $(h_n^{(m)})$ and $(t_n^{(m)})$ in $X$ and $]0,\infty[$ converging to $h$ and 0 respectively such that for all $m$ and $n$
$$(f(x_0 + t_n^{(m)}h_n^{(m)}) - f(x_0))/t_n^{(m)} > s_m.$$
If now for each $m$ we choose $n_m$ so large that $\|h_{n_m}^{(m)} - h\| < 1/m$ and $t_{n_m}^{(m)} < 1/m$, then $h_{n_m}^{(m)} \to h$ and $t_{n_m}^{(m)} \to 0$ as $m \to \infty$, and for all $m$
$$(f(x_0 + t_{n_m}^{(m)}h_{n_m}^{(m)}) - f(x_0))/t_{n_m}^{(m)} > s_m.$$
Hence
$$\limsup_{m\to\infty}\,(f(x_0 + t_{n_m}^{(m)}h_{n_m}^{(m)}) - f(x_0))/t_{n_m}^{(m)} \ge \overline{\partial} f(x_0)h,$$
and there is obviously equality here. On taking appropriate subsequences of $(h_{n_m}^{(m)})$ and $(t_{n_m}^{(m)})$ we therefore obtain the required result.
The following result is the analogue of (4.2.1).
(4.2.10) Let $f$ be a function from a set $A \subseteq X$ into $\mathbf{R}$, let $x_0 \in A$, and let $h$ be directed into $A$ at $x_0$. If $\phi$ is a function from a subset of $\mathbf{R}$ containing 0 into $X$ such that $\phi(0) = x_0$, $\phi'_+(0) = h$, and $\psi = f \circ \phi$, then
$$D^+\psi(0) \le \overline{\partial} f(x_0)h. \tag{5}$$
Moreover, there exists a function $\phi$ for which equality holds in (5).
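Let $(t_n)$ be a sequence of positive numbers converging to 0 such that $(\psi(t_n) - \psi(0))/t_n \to D^+\psi(0)$ as $n \to \infty$. Then $h_n = (\phi(t_n) - \phi(0))/t_n \to h$ as $n \to \infty$, whence
$$D^+\psi(0) = \lim_{n\to\infty}(\psi(t_n) - \psi(0))/t_n = \lim_{n\to\infty}(f(\phi(t_n)) - f(\phi(0)))/t_n = \lim_{n\to\infty}(f(x_0 + t_nh_n) - f(x_0))/t_n \le \overline{\partial} f(x_0)h.$$
Next, let $(h_n)$, $(t_n)$ be sequences for which (4) holds, let $\phi_1 : \mathbf{R} \to X$ be given by
$$\phi_1(t_n) = x_0 + t_nh_n \quad (n = 1,2,\dots), \qquad \phi_1(t) = x_0 + th \ \text{otherwise},$$
and let $\psi_1 = f \circ \phi_1$. Then $\phi_1(0) = x_0$, $\phi_1'(0) = h$, and
$$(\psi_1(t_n) - \psi_1(0))/t_n = (f(x_0 + t_nh_n) - f(x_0))/t_n \to \overline{\partial} f(x_0)h$$
as $n \to \infty$, so that $D^+\psi_1(0) \ge \overline{\partial} f(x_0)h$. Hence $D^+\psi_1(0) = \overline{\partial} f(x_0)h$, by (4).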
It is easily verified that if $h$ is directed into $A$ at $x_0$ and $\alpha > 0$, then $\overline{\partial} f(x_0)(\alpha h) = \alpha\,\overline{\partial} f(x_0)h$. In particular, this implies that if $x_0$ is an interior point of $A$ (so that 0 is directed into $A$ at $x_0$) then $\overline{\partial} f(x_0)0$ is one of $-\infty$, 0, $\infty$. The following result, which is the analogue of the last part of (4.2.6), shows in particular that if $\overline{\partial} f(x_0)0 = -\infty$ then $\overline{\partial} f(x_0)h = -\infty$ for all $h$.
(4.2.11) The upper Hadamard variation of a real-valued function at a point $x_0$ is upper semicontinuous.
Let $f$ be a real-valued function on a set $A \subseteq X$, and let $x_0 \in A$. The domain of the function $\overline{\partial} f(x_0)$ is then the set $K_d(A;x_0)$ of vectors directed into $A$ at $x_0$, and is open in $X$. If $\overline{\partial} f(x_0)$ is not upper semicontinuous at a point $k \in K_d(A;x_0)$, we can find a real number $L > \overline{\partial} f(x_0)k$ and a sequence $(k_m)$ of points of $K_d(A;x_0)$ converging to $k$ such that $\overline{\partial} f(x_0)k_m > L$ for all $m$. By (4), we can then choose sequences $(h_n^{(m)})$ and $(t_n^{(m)})$ in $X$ and $]0,\infty[$ converging to $k_m$ and 0 respectively such that for all $m$ and $n$
$$(f(x_0 + t_n^{(m)}h_n^{(m)}) - f(x_0))/t_n^{(m)} > L.$$
If now for each $m$ we choose $n_m$ so large that $\|h_{n_m}^{(m)} - k_m\| < 1/m$ and $t_{n_m}^{(m)} < 1/m$, then $h_{n_m}^{(m)} \to k$ and $t_{n_m}^{(m)} \to 0$ as $m \to \infty$, and
$$\limsup_{m\to\infty}\,(f(x_0 + t_{n_m}^{(m)}h_{n_m}^{(m)}) - f(x_0))/t_{n_m}^{(m)} \ge L > \overline{\partial} f(x_0)k,$$
and this contradicts the definition of $\overline{\partial} f(x_0)k$.
(4.2.11. Corollary) If $\overline{\partial} f(x_0)$ is convex† then it is continuous and sublinear on $X$.
Continuity is immediate. It implies that the relation $\alpha\,\overline{\partial} f(x_0)h = \overline{\partial} f(x_0)(\alpha h)$, which we know for $\alpha > 0$, holds also for $\alpha = 0$, and this relation together with convexity implies sublinearity.
It is obvious that if $\overline{\partial} f(x_0)0 < \infty$, then $f$ is upper semicontinuous at $x_0$. In particular, this implies that if $f$ is a convex function on a convex open set $A \subseteq X$, and $\overline{\partial} f(x_0)0 < \infty$ for some $x_0 \in A$, then $f$ is continuous on $A$.
Now let $X$ be a real normed space. We define the subvariation $\partial_* f(x_0)$ of the function $f : A \to \mathbf{R}$ at the interior point $x_0$ of $A$ to be the set
$$\{u \in X' : u(h) \le \overline{\partial} f(x_0)h \ (h \in X)\}.$$
(4.2.12) Let $X$ be real, let $f$ be a function from a set $A \subseteq X$ into $\mathbf{R}$, let $x_0$ be an interior point of $A$, and let $u \in X'$. Then the following statements are equivalent:
(i) $u \in \partial_* f(x_0)$;
(ii) for each $h \in X$ there exist a sequence $(h_n)$ in $X$ converging to $h$ and a sequence $(t_n)$ of positive numbers converging to 0 such that
$$\lim_{n\to\infty}\,(f(x_0 + t_nh_n) - f(x_0))/t_n \ge u(h).$$
If in addition $f$ is lower semicontinuous and convex on $A$ (and $A$ is convex), then each of (i) and (ii) is equivalent to
(iii) $f(x_0 + h) - f(x_0) \ge u(h)$ whenever $x_0 + h \in A$.‡
Clearly (ii) implies (i), while the converse follows from (4).
Suppose now that $f$ is lower semicontinuous and convex, so that the Hadamard variation $\partial f(x_0)h$ of $f$ exists for all $h \in X$. If $x_0 + h \in A$, then
$$f(x_0 + h) - f(x_0) \ge (f(x_0 + th) - f(x_0))/t$$
for $0 < t \le 1$, and on making $t \to 0+$ we obtain that
$$f(x_0 + h) - f(x_0) \ge \partial f(x_0)h.$$
Hence if $u \in \partial_* f(x_0)$, then (iii) holds. Conversely, if (iii) holds, then for all $h \in X$ and all sufficiently small positive $t$ we have
$$f(x_0 + th) - f(x_0) \ge u(th) = tu(h),$$
and this obviously implies that $\overline{\partial} f(x_0)h \ge u(h)$, so that $u \in \partial_* f(x_0)$.
† Recall that by our conventions a convex function is finite-valued.
‡ In books on convexity theory, the subvariation $\partial_* f(x_0)$ is defined (for convex $f$ only) as the set of all $u \in X'$ that satisfy (iii). It is usually called the subdifferential of $f$ at $x_0$, and denoted by $\partial f(x_0)$.
Example (e)
It is easy to see that if $f$ is Hadamard differentiable at $x_0$, then the subvariation of $f$ at $x_0$ consists of the single element $\partial f(x_0)$. The converse, that if $\partial_* f(x_0)$ consists of a single element then $f$ is Hadamard differentiable at $x_0$, is false. For example, if $X = \mathbf{R}$, and
$$f(x) = -|x|\sin^2(1/x) \quad (x \ne 0), \qquad f(0) = 0,$$
then $\overline{\partial} f(0)h = 0$ for all $h \in \mathbf{R}$, so that $\partial_* f(0)$ consists of the zero function, while $f$ is clearly not Hadamard differentiable at 0.
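The oscillation in this example can be observed numerically. The following is only a sketch: the names `quotient`, `zeros` and `troughs`, and the increment h = 2, are our own choices. Along one sequence of step lengths the difference quotients at 0 are essentially 0 (the value of the upper variation), while along another they stay near -|h|, so no Hadamard variation exists.

```python
import math

def f(x):
    """f(x) = -|x| sin^2(1/x) for x != 0, and f(0) = 0."""
    return 0.0 if x == 0 else -abs(x) * math.sin(1.0 / x) ** 2

def quotient(t, h):
    """Difference quotient (f(0 + t*h) - f(0)) / t."""
    return (f(t * h) - f(0.0)) / t

h = 2.0
# Step lengths with 1/(t*h) = n*pi: the sine vanishes, quotient ~ 0.
zeros = [quotient(1.0 / (n * math.pi * h), h) for n in range(1, 200)]
# Step lengths with 1/(t*h) = (n + 1/2)*pi: sin^2 = 1, quotient ~ -|h|.
troughs = [quotient(1.0 / ((n + 0.5) * math.pi * h), h) for n in range(1, 200)]

upper_estimate = max(zeros)     # approximates the upper variation at 0
lower_estimate = min(troughs)   # approximates the lower variation at 0
print(upper_estimate, lower_estimate)
```

Since every quotient is non-positive, the only linear functional u on R with u(h) bounded above by the upper variation for all h is u = 0, in agreement with the text.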
Example (f)
If $X$ is real, $N(x) = \|x\|$ and $x_0 \ne 0$, then
$$\partial_* N(x_0) = \{u \in X' : \|u\| = 1,\ u(x_0) = \|x_0\|\}.$$
To prove this let $u \in \partial_* N(x_0)$. Then for all $h \in X$ we have
$$u(h) \le N(x_0 + h) - N(x_0) = \|x_0 + h\| - \|x_0\| \le \|h\|,$$
so that $\|u\| \le 1$. Also, taking $h = -x_0$ we obtain that $u(-x_0) \le -\|x_0\|$, so that $u(x_0) \ge \|x_0\|$. Since also $u(x_0) \le \|x_0\|$, it follows that $u(x_0) = \|x_0\|$, and therefore $\|u\| = 1$. On the other hand, if $\|u\| = 1$ and $u(x_0) = \|x_0\|$, then
$$\|x_0\| + u(h) = u(x_0 + h) \le \|x_0 + h\|,$$
so that $u \in \partial_* N(x_0)$.
For the exceptional case where $x_0 = 0$ we have trivially $\partial_* N(0) = \{u \in X' : \|u\| \le 1\}$.
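The description of the subvariation of a norm can be spot-checked in a small case. This is a sketch only, with several choices of our own: we take X = R^2 with the maximum norm (whose dual norm is the l1 norm), the point x0 = (1, 1), and the helper names below. Every functional u = (a, 1 - a) with 0 <= a <= 1 has dual norm 1 and u(x0) = 1 = ||x0||, and the code tests the defining inequality u(h) <= N(x0 + h) - N(x0) on random increments.

```python
import random

def max_norm(x):
    """Maximum norm on R^2."""
    return max(abs(x[0]), abs(x[1]))

def apply(u, x):
    """Value u(x) of the linear functional u = (u1, u2)."""
    return u[0] * x[0] + u[1] * x[1]

def in_subvariation(u, x0, trials=2000, seed=0):
    """Test u(h) <= N(x0 + h) - N(x0) on random increments h (a spot check only)."""
    rng = random.Random(seed)
    for _ in range(trials):
        h = (rng.uniform(-3, 3), rng.uniform(-3, 3))
        rhs = max_norm((x0[0] + h[0], x0[1] + h[1])) - max_norm(x0)
        if apply(u, h) > rhs + 1e-12:
            return False
    return True

x0 = (1.0, 1.0)
# Functionals with dual (l1) norm 1 and u(x0) = 1 = ||x0||.
candidates = [(a / 4.0, 1.0 - a / 4.0) for a in range(5)]
results = [in_subvariation(u, x0) for u in candidates]
# A functional of dual norm 1.5; it should violate the inequality.
outsider = in_subvariation((1.0, 0.5), x0)
print(results, outsider)
```

A random test of this kind can only refute membership, not prove it; here it merely agrees with the characterization, and it shows that for this norm and this point the subvariation contains more than one element.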
The subvariation $\partial_* f(x_0)$ is of most interest when the upper variation $\overline{\partial} f(x_0)$ is convex, for by (4.2.11. Corollary) $\overline{\partial} f(x_0)$ is continuous and sublinear. From (A.4.1. Corollary 2) we therefore have:
(4.2.13) If the upper variation $\overline{\partial} f(x_0)$ of the function $f : A \to \mathbf{R}$ at $x_0$ is convex, then the subvariation $\partial_* f(x_0)$ is non-empty, and for all $h \in X$
$$\overline{\partial} f(x_0)h = \sup\{u(h) : u \in \partial_* f(x_0)\}. \tag{6}$$
(4.2.13. Corollary) If $f$ has a Hadamard variation at $x_0$ which is convex, then $f$ is Hadamard differentiable at $x_0$ if and only if the subvariation $\partial_* f(x_0)$ consists of a single element $u$, and in this case $u = \partial f(x_0)$.
Here the 'if' follows from (6), since now $\overline{\partial} f(x_0) = \partial f(x_0)$. The 'only if' is a restatement of the first part of Example (e).
Example (g)
If $N(x) = \|x\|$, then $N$ is Hadamard differentiable at a point $x_0 \ne 0$ if and only if the closed ball $B = \{x \in X : \|x\| \le \|x_0\|\}$ has exactly one supporting hyperplane at $x_0$. In fact, by (4.2.13. Corollary) and Example (f), $N$ is Hadamard differentiable at $x_0$ if and only if there exists exactly one $u \in X'$ such that $\|u\| = 1$ and $u(x_0) = \|x_0\|$. Obviously if $u$ has these properties then $u(x) \le \|u\|\,\|x_0\| = u(x_0)$ for all $x \in B$, so that $u(x) = u(x_0)$ is the equation of a supporting hyperplane to $B$ at $x_0$. Conversely, let $u(x) = u(x_0)$ be the equation of a supporting hyperplane to $B$ at $x_0$, so that $u(x) \le u(x_0)$ for all $x \in B$. Since the hyperplane is unchanged if we multiply $u$ by a positive constant, we may suppose that $\|u\| = 1$. The condition that $u(x) \le u(x_0)$ for all $x \in B$ implies that $\pm u(x) = u(\pm x) \le u(x_0)$ for all $x \in B$, so that also $|u(x)| \le u(x_0)$. Hence
$$1 = \|u\| = \sup_{\|x\| = \|x_0\|} |u(x)|/\|x\| \le u(x_0)/\|x_0\| \le \|u\| = 1,$$
so that $u(x_0) = \|x_0\|$. Hence there exists exactly one $u \in X'$ such that $\|u\| = 1$ and $u(x_0) = \|x_0\|$ if and only if $B$ has exactly one supporting hyperplane at $x_0$.
A norm which is Hadamard differentiable at every point of $X \setminus \{0\}$ is said to be smooth.
Exercises 4.2
1 Let $f$ be a function from a set $A \subseteq X$ into $Y$, let $x_0 \in A$, let $h$ be directed into $A$ at $x_0$, and let $l \in Y$. Prove that $f$ has a Hadamard variation at $x_0$ for the increment $h$, equal to $l$, if and only if for each $\varepsilon > 0$ there exist a neighbourhood $V$ of $h$ in $X$ and a positive number $\delta$ such that
$$\left\| \frac{f(x_0 + tk) - f(x_0)}{t} - l \right\| < \varepsilon$$
whenever $k \in V$ and $0 < t < \delta$.

2 Let $f$ be a function from a set $A \subseteq X$ into $Y$, and let $x_0$ be an interior point of $A$. Prove that
(i) if the Hadamard variation $\partial f(x_0)h$ exists for all $h \in X$, then, for each compact set $E \subseteq X$,
$$(f(x_0 + th) - f(x_0))/t \to \partial f(x_0)h$$
as $t \to 0+$, uniformly for $h$ in $E$,
(ii) if there exists a continuous function $T : X \to Y$ such that, for each compact set $E \subseteq X$,
$$(f(x_0 + th) - f(x_0))/t \to T(h)$$
as $t \to 0+$, uniformly for $h$ in $E$, then $\partial f(x_0)h$ exists and is equal to $T(h)$ for all $h \in X$.
3 Give a direct proof of (4.2.7) without using the Lipschitzian property of $p$.
[Hint. Since $\delta p(x_0)$ is defined and convex on $X$, it is enough to prove that if $h \in X$, $(h_n)$ is a sequence in $X$ converging to $h$, and $(t_n)$ is a sequence of positive numbers converging to 0, then
$$(p(x_0 + t_nh_n) - p(x_0 + t_nh))/t_n \to 0$$
as $n \to \infty$ (cf. (1)). Use the fact that if $y$, $y - z$, $y + z \in A$, then $t \mapsto p(y + tz)$ is convex on $[-1,1]$, whence for $0 < t \le 1$
$$p(y) - p(y - z) \le (p(y + tz) - p(y))/t \le p(y + z) - p(y)$$
(these inequalities are particular cases of those used in §1.1, Example (g) (p. 6)).]
4 Let $f$ be a function from a set $A \subseteq X$ into $Y$ which has a Hadamard variation at the point $x_0$ for the increment $h$, let $g$ be a function from a set $B \subseteq Y$ into $\mathbf{R}$, and let $k = \partial f(x_0)h$ be directed into $B$ at $y_0 = f(x_0)$. Prove that
$$\overline{\partial}(g \circ f)(x_0)h \le \overline{\partial} g(y_0)k = \overline{\partial} g(y_0)(\partial f(x_0)h).$$
5 Let $X$ be a real normed space. Prove that if $p(x) = \tfrac12\|x\|^2$, then $p$ is convex and
$$\partial_* p(x_0) = \{u \in X' : \|u\| = \|x_0\|,\ u(x_0) = \|u\|\,\|x_0\|\}.$$
4.3 The tangent cones to the graph and the level surfaces of a function
It has been shown in (1.2.1) that a function $f$ of a real variable has a derivative at an interior point $x_0$ of its domain if and only if the graph of $f$ has a tangent line not parallel to the $y$-axis at the point $(x_0, f(x_0))$. Further, the equation of this tangent line is
$$y = f(x_0) + (x - x_0)f'(x_0),$$
i.e. the tangent line is the graph of the approximating function $x \mapsto f(x_0) + (x - x_0)f'(x_0)$. In this section we investigate the analogue of this result
for a vector-valued function of a vector variable. We consider also the
corresponding problem for the level surfaces of a function. Our results
here have applications to constrained maxima and minima which are
given in the next two sections.
The definition of tangency which we employ is as follows. Let $E \subseteq X$, and let $x_0 \in X$. We say that a vector $h \in X$ is tangent to $E$ at $x_0$ if for each neighbourhood $W$ of $h$ in $X$ and each positive number $\varepsilon$ there exist $k \in W$ and $t \in\, ]0,\varepsilon[$ such that $x_0 + tk \in E$. It is easily verified that if $h$ is tangent to $E$ at $x_0$ and $\alpha > 0$, then $\alpha h$ is tangent to $E$ at $x_0$. The set of vectors tangent to $E$ at $x_0$ is therefore a cone in $X$ with vertex 0, and we denote this cone by $K_t(E;x_0)$. Clearly $0 \in K_t(E;x_0)$ for all $E \subseteq X$ and all $x_0 \in E$.
Example (a)
If $x_0$ is an interior point of $E$, then $K_t(E;x_0) = X$.
Example (b)
If $X = \mathbf{R}^n$, $E$ is the closed unit ball in $\mathbf{R}^n$, and $\|x_0\| = 1$, then $K_t(E;x_0)$ is a half-space.
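Example (b) can be illustrated in the plane. The sketch below makes several assumptions of our own: the Euclidean norm on R^2, the point x0 = (1, 0), and the helper names `in_E` and `step`. The vector h = (0, 1) lies on the boundary of the half-space, and it is tangent only because the definition allows increments k close to h; an outward vector such as (1, 0) admits no such increments.

```python
def in_E(p):
    """Membership in the closed unit disc of R^2 (Euclidean norm)."""
    return p[0] ** 2 + p[1] ** 2 <= 1.0

def step(x0, t, k):
    """The point x0 + t*k."""
    return (x0[0] + t * k[0], x0[1] + t * k[1])

x0 = (1.0, 0.0)

# For h = (0, 1): the increments k_n = (-t_n, 1) converge to h as t_n = 1/n -> 0,
# and x0 + t_n k_n = (1 - t_n^2, t_n) stays in the disc.
boundary_ok = all(in_E(step(x0, 1.0 / n, (-1.0 / n, 1.0))) for n in range(1, 101))

# For h = (1, 0): no k in a small neighbourhood of h and no small t > 0
# gives a point of the disc (checked here on a finite sample only).
ks = [(1.0 + 0.1 * i, 0.1 * j) for i in range(-4, 5) for j in range(-4, 5)]
ts = [10.0 ** (-e) for e in range(1, 7)]
outward_hits = [(t, k) for t in ts for k in ks if in_E(step(x0, t, k))]
print(boundary_ok, outward_hits)
```

Note that x0 + t(0, 1) itself lies outside the disc for every t > 0, which is why the perturbed increments k_n are needed for the boundary direction.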
The following result gives some equivalent formulations of the definition of $K_t(E;x_0)$ in terms of sequences. Here and later, for each non-zero $x \in X$ we use $x^\wedge$ to denote the unit vector $x/\|x\|$.
(4.3.1) Lemma. Let $E \subseteq X$, let $x_0 \in E$, and let $h \in X$. Then the following statements are equivalent:
(i) $h \in K_t(E;x_0)$;
(ii) there exists a sequence $(h_n)$ in $X$ converging to $h$ and a sequence $(t_n)$ of positive numbers converging to 0 such that $x_0 + t_nh_n \in E$ for all $n$ (or for all sufficiently large $n$).
If in addition $h \ne 0$, then each of (i) and (ii) is equivalent to
(iii) there exists a sequence $(x_n)$ in $E \setminus \{x_0\}$ converging to $x_0$ such that $(x_n - x_0)^\wedge \to h^\wedge$.
The equivalence of (i) and (ii) is immediate. If now $h \ne 0$, then (ii) implies (iii), for if $(h_n)$ and $(t_n)$ have the properties specified in (ii), and $x_n = x_0 + t_nh_n$, then $x_n \in E$ for all $n$ and $(x_n - x_0)^\wedge = (t_nh_n)^\wedge = h_n^\wedge \to h^\wedge$. Conversely, (iii) implies (ii), for if $(x_n)$ has the properties specified in (iii) and $h_n = \|h\|(x_n - x_0)^\wedge$, $t_n = \|x_n - x_0\|/\|h\|$, then $h_n \to h$, $t_n \to 0$, and $x_0 + t_nh_n = x_n \in E$ for all $n$.
The next two theorems deal with the cone of vectors tangent to the graph
of a function; as might be expected from the last lemma, the appropriate
form of differentiability to use here is that of Hadamard rather than that
of Frechet.
For any function $F$ from a subset of $X$ into $Y$ we denote the graph of $F$ in $X \times Y$ by $\mathcal{G}(F)$.
(4.3.2) Let $f$ be a function from a set $A \subseteq X$ into $Y$, let $x_0$ be an interior point of $A$, let $z_0 = (x_0, f(x_0))$, and let $\mathcal{T}$ be the cone of vectors in $X \times Y$ tangent to $\mathcal{G}(f)$ at $z_0$.
(i) If $f$ has a Gateaux variation at $x_0$ for each $h \in X$, then $\mathcal{G}(\delta f(x_0)) \subseteq \mathcal{T}$.
(ii) If $f$ has a Hadamard variation at $x_0$ for each $h \in X$, then $\mathcal{G}(\partial f(x_0)) = \mathcal{T}$.
We may suppose that $x_0 = 0$ and $f(x_0) = 0$ (for if this is not the case we replace $f$ by the function $x \mapsto f(x_0 + x) - f(x_0)$).
To prove (i), let $h \in X$. Then for all $t > 0$, the point $(th, f(th))$ belongs to $\mathcal{G}(f)$, and $(h, f(th)/t) \to (h, \delta f(0)h)$ as $t \to 0+$. Hence $(h, \delta f(0)h) \in \mathcal{T}$, so that $\mathcal{G}(\delta f(0)) \subseteq \mathcal{T}$.
To prove (ii), it is enough, by (i), to show that if $\partial f(0)h$ exists for each $h \in X$, then $\mathcal{T} \subseteq \mathcal{G}(\partial f(0))$. Let $(h,l) \in \mathcal{T}$. Then we can find a sequence $((h_n,l_n))$ in $X \times Y$ converging to $(h,l)$ and a sequence $(t_n)$ of positive numbers converging to 0 such that $(t_nh_n, t_nl_n) \in \mathcal{G}(f)$ for all $n$. But then $l_n = f(t_nh_n)/t_n \to \partial f(0)h$ as $n \to \infty$, whence $l = \partial f(0)h$, i.e. $(h,l) \in \mathcal{G}(\partial f(0))$.
(4.3.3) (i) If $\dim Y < \infty$, $f$ is a function from a set $A \subseteq X$ into $Y$, $x_0$ is an interior point of $A$ at which $f$ is continuous, and the cone $\mathcal{T}$ of vectors tangent to $\mathcal{G}(f)$ at the point $z_0 = (x_0, f(x_0))$ is contained in the graph of a function $T \in \mathcal{L}(X,Y)$, then $f$ is Hadamard differentiable at $x_0$ and $\partial f(x_0) = T$. If in addition $\dim X < \infty$, then $f$ is Frechet differentiable at $x_0$ and $df(x_0) = T$.
(ii) If $\dim Y = \infty$, there exists a function $f_1 : X \to Y$ which is continuous at 0 but not Gateaux differentiable there, with $f_1(0) = 0$, and such that the cone of vectors tangent to the graph of $f_1$ at $(0,0)$ is $X \times \{0\}$, i.e. it is the graph of the zero function from $X$ into $Y$.
(iii) If $\dim X = \infty$, $\dim Y < \infty$, there exists a function $f_2 : X \to Y$ which is Hadamard differentiable at 0 but not Frechet differentiable there, with $f_2(0) = 0$, and such that the cone of vectors tangent to the graph of $f_2$ at $(0,0)$ is $X \times \{0\}$.
We recall that if $\dim X < \infty$, then Hadamard and Frechet differentiability are equivalent (p. 266). Hence to prove (i) it is enough to prove the first part of the result concerning the Hadamard differentiability of $f$, and in doing this we may again assume that $x_0 = 0$ and $f(x_0) = 0$.
Suppose then that 0 is an interior point of $A$, that $f$ is continuous at 0 and $f(0) = 0$, and that the cone $\mathcal{T}$ of vectors tangent to $\mathcal{G}(f)$ at $(0,0)$ is contained in $\mathcal{G}(T)$, where $T \in \mathcal{L}(X,Y)$. If $f$ is Hadamard differentiable at 0, then $\mathcal{G}(\partial f(0)) \subseteq \mathcal{G}(T)$ by (4.3.2)(ii) so that $\partial f(0) = T$; and conversely, if $\partial f(0) = T$, then $f$ is obviously Hadamard differentiable at 0. Hence it is enough to show that for each $h \in X$, $Th$ is the Hadamard variation of $f$ at 0 for the increment $h$.
In the contrary case, there exist a positive number $\varepsilon$, a sequence $(h_n)$ in $X$ converging to a point $h$, and a sequence $(t_n)$ of positive numbers converging to 0 such that
$$\|f(t_nh_n)/t_n - Th\| \ge \varepsilon \quad (n = 1,2,\dots). \tag{1}$$
If the sequence of points $w_n = f(t_nh_n)/t_n$ is bounded, we can find a subsequence $(w_{n_m})$ of this sequence converging to a point $w$ (note that a bounded closed set in $Y$ is compact), and since $(h_{n_m}, w_{n_m}) \to (h,w)$ as $m \to \infty$ and $(t_{n_m}h_{n_m}, t_{n_m}w_{n_m}) = (t_{n_m}h_{n_m}, f(t_{n_m}h_{n_m})) \in \mathcal{G}(f)$ for all $m$, it follows that $(h,w) \in \mathcal{T}$. Hence $(h,w) \in \mathcal{G}(T)$, so that $w = Th$, and this contradicts the condition (1).
On the other hand, if the sequence $(w_n)$ is unbounded, we can select a subsequence $(w_{r_s})$ such that $\|w_{r_s}\| \to \infty$ as $s \to \infty$ and that the sequence $(w_{r_s}^\wedge)$ converges to a point $v$ (so that also $\|v\| = 1$). But then as $n$ tends to $\infty$ through the values $r_s$,
$$(t_nh_n, f(t_nh_n))^\wedge = (h_n, w_n)^\wedge \to (0,v)^\wedge,$$
and hence $(0,v) \in \mathcal{T}$, by (4.3.1)(iii). Since $(0,v) \notin \mathcal{G}(T)$, this again gives a contradiction, and completes the proof of (i).
Consider now the proof of (ii). Let $\dim Y = \infty$, let $(v_n)$ be a sequence on the unit sphere of $Y$ with no convergent subsequence, let $b$ be a unit vector in $X$, and let $f_1 : X \to Y$ be given by
$$f_1(b/n) = v_n/n \quad (n = 1,2,\dots), \qquad f_1(x) = 0 \ \text{otherwise}.$$
Clearly $f_1$ is continuous at 0. It is also obvious that $f_1$ is not Gateaux differentiable at 0, since the function $t \mapsto f_1(tb)$ from $\mathbf{R}$ into $Y$ does not have a right-hand derivative at 0.
Next, since $(x, f_1(x))^\wedge = (b, v_n)^\wedge$ if $x = b/n$, $n = 1,2,\dots$, it follows that if $(x_m)$ is a sequence in $X \setminus \{0\}$ converging to 0 such that the sequence of points $(x_m, f_1(x_m))^\wedge$ tends to a limit, there is only a finite number of $m$ for which $x_m$ is one of the points $b/n$. Hence $(x_m, f_1(x_m))^\wedge = (x_m, 0)^\wedge$ for all sufficiently large $m$, and therefore, by (4.3.1)(iii), the cone $\mathcal{T}$ of vectors tangent to the graph of $f_1$ at $(0,0)$ is contained in $X \times \{0\}$. On the other hand, if $x$ is a unit vector in $X$, then
$$(x/(n + \tfrac12), f_1(x/(n + \tfrac12)))^\wedge = (x,0)^\wedge = (x,0),$$
so that $(x,0) \in \mathcal{T}$. Hence also $(\alpha x, 0) \in \mathcal{T}$ for all $\alpha \ge 0$, and therefore $\mathcal{T} = X \times \{0\}$.
It remains now only to prove (iii), and here we can take f2 to be the
function of §4.2, Example (d) (p. 266).
We remark that (4.3.3)(i) shows in particular that the inclusion relation in (4.3.2)(i) cannot be replaced by equality (for otherwise Gateaux differentiability would imply Hadamard differentiability).
We consider next the cone of vectors tangent to the level surface of a function through a given point. If $f$ is a function from a set $A \subseteq X$ into $Y$, $x_0 \in A$, and $y_0 = f(x_0)$, then the level surface of $f$ through the point $x_0$ is the set $f^{-1}(\{y_0\})$ (this may, of course, consist of the single point $x_0$). We remark immediately that if $f$ is differentiable at $x_0$ in one of the three senses defined in Chapter 3 and §§4.1,2, $T$ is the appropriate differential of $f$ at $x_0$, and $g$ is the approximating function given by
$$g(x) = y_0 + T(x - x_0) \quad (x \in X),$$
then the level surface of $g$ through $x_0$ is the set where $T(x - x_0) = 0$, i.e. it is $x_0 + \ker T$.
We prove first an elementary result.
(4.3.4) Let $f$ be a function from a set $A \subseteq X$ into $Y$ which is Hadamard differentiable at an (interior) point $x_0$ of $A$, and let $E$ be the level surface of $f$ through $x_0$. Then $K_t(E;x_0) \subseteq \ker \partial f(x_0)$.
We may again obviously suppose that $x_0 = 0$ and $f(x_0) = 0$. If $h \in K_t(E;x_0)$, we can find a sequence $(h_n)$ in $X$ converging to $h$ and a sequence $(t_n)$ of positive numbers converging to 0 such that $t_nh_n \in E$ for all $n$, i.e. $f(t_nh_n) = 0$. Since $f$ is Hadamard differentiable at 0, $f(t_nh_n)/t_n \to \partial f(0)h$ as $n \to \infty$, and therefore $\partial f(0)h = 0$, i.e. $h \in \ker \partial f(0)$.
The inclusion relation here cannot be replaced by equality, even if $f$ is Frechet differentiable at $x_0$. This can easily be seen by considering the function $f : \mathbf{R} \to \mathbf{R}$ defined by $f(x) = x^2$ at the point $x_0 = 0$, for $df(0) = 0$ but the level surface through 0 is the singleton $\{0\}$. A more sophisticated example is given by defining $g : \mathbf{R}^2 \to \mathbf{R}$ by
$$g(x,y) = x + y^2 \quad (x \ge 0), \qquad g(x,y) = x - y^2 \quad (x < 0),$$
and taking $x_0 = (0,0)$. Then $dg(x_0)$ exists and, for all $(h_1,h_2) \in \mathbf{R}^2$, $dg(x_0)(h_1,h_2) = h_1$, so that $\ker dg(x_0) = \{0\} \times \mathbf{R}$; but again the level surface through $x_0$ is the singleton $\{x_0\}$.
In the first of these examples the crucial fact is that $df(x_0)$ is not surjective (see (4.3.7)). In the second, it is that $g$ is not continuous, as we shall now see.
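The second example can be checked numerically. This is a sketch under our own choices: the branch x >= 0 is taken to include the line x = 0, the function `quotient` and the perturbed increments `hn` are illustrative names, and the level surface is only sampled on a finite grid.

```python
def g(x, y):
    """g(x, y) = x + y^2 for x >= 0 and x - y^2 for x < 0."""
    return x + y * y if x >= 0 else x - y * y

def quotient(t, h):
    """(g(0 + t*h) - g(0)) / t for an increment h = (h1, h2)."""
    return (g(t * h[0], t * h[1]) - g(0.0, 0.0)) / t

# Quotients along perturbed increments h_n -> h approach h1 = dg(0)(h1, h2).
h = (0.0, 3.0)                      # an element of ker dg(0) = {0} x R
errors = []
for n in range(1, 8):
    t = 10.0 ** (-n)
    hn = (h[0] + t, h[1] - t)       # h_n -> h as t -> 0
    errors.append(abs(quotient(t, hn) - h[0]))

# On a grid around the origin, g vanishes only at the origin itself.
grid = [i / 20.0 for i in range(-20, 21)]
zeros = [(x, y) for x in grid for y in grid if g(x, y) == 0.0]
print(errors[-1], zeros)
```

So h = (0, 3) lies in the kernel of the differential, yet no point of the level surface other than the origin is available to approach the origin in that direction.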
(4.3.5) Let $f$ be a continuous function from a subset $A$ of the real space $X$ into $\mathbf{R}$ which is Hadamard differentiable at an interior point $x_0$ of $A$, with $\partial f(x_0) \ne 0$, and let $E$ be the level surface of $f$ through $x_0$. Then $K_t(E;x_0) = \ker \partial f(x_0)$.
By (4.3.4), it is enough to prove that $K_t(E;x_0) \supseteq \ker \partial f(x_0)$, and in doing this we suppose that $x_0 = 0$ and $f(x_0) = 0$. Let $h \in \ker \partial f(0)$ and let $\eta > 0$ be given. Take a point $k$ of $X$ such that $\partial f(0)k > 0$ and $\|k\| < \eta$. Then $\partial f(0)(h + k) > 0$ and therefore $f(t(h + k)) > 0$ for all sufficiently small $t > 0$. Similarly, $\partial f(0)(h - k) < 0$ so that $f(t(h - k)) < 0$ for all sufficiently small $t > 0$. Since $f$ is continuous, it follows that for each sufficiently small $t > 0$ there exists a point $k_t$ on the segment joining $k$ and $-k$ such that $f(t(h + k_t)) = 0$, i.e. $t(h + k_t) \in E$, and therefore $h \in K_t(E;0)$.
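The intermediate value step in this proof can be made concrete with a bisection search. The sketch below is an illustration under assumptions of our own: the function f(x, y) = y - x^2 on R^2 (continuous, with differential (a, b) -> b at the origin), the kernel vector h = (1, 0), the perturbation k = (0, eta/2), and the helper `zero_on_segment`.

```python
def f(p):
    """Hypothetical example: f(x, y) = y - x^2, so the differential at 0 is (a, b) -> b."""
    return p[1] - p[0] ** 2

def zero_on_segment(t, h, k, iters=60):
    """Find s in [-1, 1] with f(t*(h + s*k)) = 0 by bisection on the segment [-k, k]."""
    def phi(s):
        return f((t * (h[0] + s * k[0]), t * (h[1] + s * k[1])))
    lo, hi = -1.0, 1.0
    assert phi(lo) < 0.0 < phi(hi)   # the sign change holds once t is small enough
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if phi(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

h = (1.0, 0.0)            # in the kernel of the differential at the origin
eta = 0.1
k = (0.0, eta / 2.0)      # differential of k is positive and ||k|| < eta
found = []
for t in (1e-2, 1e-3, 1e-4):
    s = zero_on_segment(t, h, k)
    k_t = (s * k[0], s * k[1])
    point = (t * (h[0] + k_t[0]), t * (h[1] + k_t[1]))
    found.append((t, s, abs(f(point))))
print(found)
```

For this particular f the zero is at s = 2t/eta, so the points t(h + k_t) are exactly the points (t, t^2) of the parabola y = x^2, which approach the origin tangentially to h.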
We shall next prove a result in which equality holds and in which the
range of/ is a Banach space Y. The proof is a good deal less elementary,
and we first require a lemma which is a companion to (3.7.1).
(4.3.6) Lemma. Let $X$, $Y$ be Banach spaces, let $T$ be a continuous linear function from $X$ into $Y$, let $H = \ker T$, and for each $L \in X/H$ let $T_cL = Tx$, where $x$ is any point of $L$ (so that, by (4.2.2)(iii), $T_c \in \mathcal{LH}(X/H, Y)$). Let also $M > \|T_c^{-1}\|$, let $0 < \varepsilon < 1$, let $A$ be an open set in $X$ containing 0, and let $f : A \to Y$ be a function such that $f(0) = 0$ and that
$$\|f(x) - f(x') - T(x - x')\| \le \frac{\varepsilon}{M}\|x - x'\|$$
for all $x, x' \in A$. Then if $y$ is a point of $Y$ such that $A$ contains the closed ball $B$ in $X$ with centre 0 and radius $M\|y\|/(1 - \varepsilon)$, the equation $f(x) = y$ has a solution $x^*$ in $B$.
The proof is, in its essentials, the same as that of (3.7.1). The main idea is
to find a root of the equation/(x) = y using a sequence (xn) of successive
approximations satisfying the recurrence relation
T(xn+1-xn) = y-f(xn). (2)
In (3.7.1), $T$ was invertible, and in this case it would not merely have been easy to define $(x_n)$ inductively from (2), but it was possible to apply the contraction mapping principle and so avoid any explicit mention of (2). In the present proof we use (A.2.3), which shows that, under the hypotheses of this lemma, for each $z \in Y$ we can find $x \in X$ such that $Tx = z$ and that $\|x\| \le M\|z\|$.
Let $y$ and $B$ be as specified in the lemma. We prove first that there exists a sequence $(x_n)$ in $B^\circ$ with $x_0 = 0$ which satisfies both (2) and
$$\|x_{n+1} - x_n\| \le M\varepsilon^n\|y\|. \tag{3}$$
Let $x_0 = 0$. By the result of (A.2.3) mentioned above, with $z = y$, we can choose $x_1 \in X$ such that $Tx_1 = y$ and $\|x_1\| \le M\|y\|$; then $x_1 \in B^\circ$ and (2) and (3) are satisfied with $n = 0$. Suppose then that $n$ is a positive integer such that $x_0, \dots, x_n$ are defined and lie in $B^\circ$, and that for $r = 0, \dots, n - 1$,
$$T(x_{r+1} - x_r) = y - f(x_r) \quad\text{and}\quad \|x_{r+1} - x_r\| \le M\varepsilon^r\|y\|. \tag{4}$$
Again by (A.2.3), we can choose $x_{n+1}$ so that
$$T(x_{n+1} - x_n) = y - f(x_n) \quad\text{and}\quad \|x_{n+1} - x_n\| \le M\|y - f(x_n)\|. \tag{5}$$
Then
$$\|x_{n+1} - x_n\| \le M\|y - f(x_n)\| = M\|y - f(x_n) + T(x_n - x_{n-1}) - y + f(x_{n-1})\| \le \varepsilon\|x_n - x_{n-1}\| \le M\varepsilon^n\|y\|,$$
so that the conditions (4) are satisfied for $r = n$. Further,
$$\|x_{n+1}\| = \|x_{n+1} - x_0\| \le \|x_{n+1} - x_n\| + \dots + \|x_1 - x_0\| \le M\|y\|(\varepsilon^n + \dots + 1) < M\|y\|/(1 - \varepsilon),$$
so that $x_{n+1} \in B^\circ$. It therefore follows by induction that there exists a sequence $(x_n)$ in $B^\circ$ such that $x_0 = 0$ and that (2) and (3) hold for all $n$.
From (3) we now deduce that $(x_n)$ is a Cauchy sequence in $B^\circ$, and hence it converges to a point $x^* \in B$. On making $n \to \infty$ in (2) we deduce that $f(x^*) = y$, and this completes the proof.
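The successive approximations of this proof can be run on a small example. The sketch below rests on assumptions of our own: X = R^2 with the Euclidean norm, Y = R, T(x) = x1 + x2 (onto, with a one-dimensional kernel), the minimum-norm solution of Tx = z as the choice permitted by (A.2.3), the constants M = 1 and epsilon = 0.2, and the perturbed map f below. None of these names or values come from the text.

```python
import math

def T(x):
    """Surjective linear map T: R^2 -> R, T(x) = x1 + x2."""
    return x[0] + x[1]

def right_inverse(z):
    """Minimum-norm solution of T x = z; its norm is |z|/sqrt(2) <= M*|z| with M = 1."""
    return (z / 2.0, z / 2.0)

def f(x):
    """f = T + a perturbation with Lipschitz constant 0.2 = eps/M (eps = 0.2, M = 1)."""
    return T(x) + 0.2 * math.sin(x[0])

def solve(y, steps=60):
    """Iterate T(x_{n+1} - x_n) = y - f(x_n), starting from x_0 = 0."""
    x = (0.0, 0.0)
    for _ in range(steps):
        d = right_inverse(y - f(x))
        x = (x[0] + d[0], x[1] + d[1])
    return x

M, eps, y = 1.0, 0.2, 0.7
x_star = solve(y)
radius = M * abs(y) / (1.0 - eps)   # radius of the ball B in the lemma
print(x_star, f(x_star), math.hypot(*x_star), radius)
```

Because T is not injective here, the contraction mapping principle is not directly available, but the iteration still converges geometrically and the limit lies in the ball B of the lemma.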
(4.3.6. Corollary 1) Under the conditions of (4.3.6), $f(A)$ is open in $Y$. Further, if $A = X$, then $f(A) = Y$.
Let $y \in f(A)$, let $x$ be a point of $A$ such that $f(x) = y$, and let
$$g(w) = f(x + w) - y \quad (x + w \in A).$$
Clearly $g(0) = 0$, and
$$\|g(w) - g(w') - T(w - w')\| \le \frac{\varepsilon}{M}\|w - w'\|$$
for all $w$, $w'$ in the domain of $g$. If now $A$ contains the closed ball $C$ in $X$ with centre $x$ and radius $a$, and $D$ is the closed ball in $Y$ with centre $y$ and radius $a(1 - \varepsilon)/M$, then for all $y' \in D$ the equation $g(w) = y' - y$ has a solution $w$ such that $x + w \in C$, i.e. the equation $f(x) = y'$ has a solution $x \in C$. Hence $D \subseteq f(C) \subseteq f(A)$, so that $f(A)$ is open in $Y$. Further, if $A = X$, then by (4.3.6), $f(A)$ is clearly equal to $Y$.
(4.3.6. Corollary 2) If the hypotheses of (4.3.6) are satisfied, and $z$ is a point of $A$ such that $A$ contains the closed ball $C$ in $X$ with centre $z$ and radius $\varepsilon\|z\|/(1 - \varepsilon)$, then the equation $f(x) = Tz$ has a solution in $C$.
Let $y = Tz - f(z)$, and let $h(w) = f(z + w) - f(z)$ $(z + w \in A)$. As in Corollary 1, $h$ satisfies the hypotheses of (4.3.6), and
$$\|y\| = \|f(z) - f(0) - T(z - 0)\| \le \frac{\varepsilon}{M}\|z\|,$$
so that the domain of $h$ contains the closed ball $D$ in $X$ with centre 0 and radius $M\|y\|/(1 - \varepsilon)$. Hence the equation $h(w) = y$ has a solution $w$ in $D$, and this gives the result.
(4.3.7) Let $X$, $Y$ be Banach spaces, let $f$ be a function from a set $A \subseteq X$ into $Y$, let $x_0$ be an interior point of $A$, and let $E$ be the level surface of $f$ through $x_0$. Suppose further that $f$ is Hadamard differentiable on a neighbourhood of $x_0$, that $\partial f$ is continuous at $x_0$, and that $\partial f(x_0)$ is onto $Y$. Then $K_t(E;x_0) = \ker \partial f(x_0)$.
By (4.3.4) $K_t(E;x_0) \subseteq \ker \partial f(x_0)$, so that it remains to prove the reverse inclusion. Further, we may again suppose that $x_0 = 0$ and $f(x_0) = 0$, and hence it is enough to show that if $T = \partial f(0)$ and $h$ is a non-zero vector in $\ker T$, then there exist a sequence $(h_n)$ in $X$ converging to $h$ and a sequence $(t_n)$ of positive numbers converging to 0 such that $t_nh_n \in E$ for all $n$.
Let $T_c$ be defined as in (4.3.6), let $M > \|T_c^{-1}\|$, and let $0 < \varepsilon < 1$. Since $\partial f$ is continuous at 0, we can find an open ball $A_0$ in $X$ with centre 0, contained in $A$, such that $\|\partial f(x) - T\| \le \varepsilon/M$ for all $x \in A_0$. By the analogue of (4.1.6) for Hadamard differentiation applied to $f - T$, we then have
$$\|f(x) - f(x') - T(x - x')\| \le \varepsilon\|x - x'\|/M$$
for all $x, x' \in A_0$, so that the restriction of $f$ to $A_0$ satisfies the conditions of (4.3.6). It is obvious that for all sufficiently small positive $t$ the ball $A_0$ contains the closed ball $B$ in $X$ with centre $th$ and radius $\varepsilon\|th\|/(1 - \varepsilon)$, and then, by (4.3.6. Corollary 2), there exists $x \in B$ such that $f(x) = T(th) = 0$.
We now apply this result, taking $\varepsilon$ to be successively $\tfrac12, \tfrac13, \dots, 1/n, \dots$, and for $\varepsilon = 1/n$ choosing $t = t_n$ so that $t_n \to 0$ as $n \to \infty$. We thus obtain a sequence $(x_n)$ in $E$ such that $\|x_n - t_nh\| \le t_n\|h\|/(n - 1)$ for all $n$. If now $h_n = x_n/t_n$, then $t_nh_n = x_n \in E$ for all $n$ and $h_n = x_n/t_n \to h$ as $n \to \infty$.
Exercise 4.3
1 Let $X$, $Y$ be Banach spaces, let $T$ be a continuous linear function from $X$ onto $Y$, and let $T_c \in \mathcal{LH}(X/\ker T, Y)$ be defined as in (4.3.6). Let also $S$ be an element of $\mathcal{L}(X,Y)$ such that $\|S - T\| < 1/\|T_c^{-1}\|$. Prove that $S$ is onto $Y$, and that if $S_c \in \mathcal{LH}(X/\ker S, Y)$ is defined as in (4.3.6), then
$$\|S_c^{-1}\| \le \|T_c^{-1}\|/(1 - \|T_c^{-1}\|\,\|S - T\|).$$
[This is a generalization of (3.7.1. Corollary 1).
Hint. Choose $M$ such that $\|S - T\| < 1/M < 1/\|T_c^{-1}\|$, and let $\varepsilon = M\|S - T\|$. Then the conditions of (4.3.6) are satisfied with $A = X$ and $f = S$.]
† The result of (4.3.7) holds if $f$ is continuous and Gateaux differentiable on a neighbourhood of $x_0$, $\delta f$ is continuous at $x_0$, and $\delta f(x_0)$ is onto $Y$. Note that each set of conditions implies that $f$ is Frechet differentiable at $x_0$ (4.1.7. Corollary 1).
4.4 Constrained maxima and minima (equality constraints)

In this section and the next, we shall discuss real-valued functions. In order that these may be differentiable, the normed spaces involved will be over the real field. Let $X$, $Y$ be normed spaces, let $A \subseteq X$, and let $F : A \to \mathbf{R}$ and $G : A \to Y$ be given functions. We say that $F$ has a local minimum at a point $x_0 \in A$ subject to the condition $G(x) = 0$ if $x_0$ belongs to the level surface $S = G^{-1}(\{0\})$ of $G$ and there exists a neighbourhood $V$ of $x_0$ in $X$, contained in $A$, such that $F(x) \ge F(x_0)$ for all $x \in S \cap V$; if strict inequality holds except at $x_0$, then $x_0$ is a strict conditional local minimum. A conditional local maximum of $F$ is defined similarly.
In general, if $x_0$ is not a local maximum or minimum of $F$ (in the sense defined in §3.6), then as $x$ passes through $x_0$ crossing successive level surfaces of $F$, the value of $F(x)$ varies monotonically. It is therefore intuitive that if $x_0$ is a conditional local minimum or maximum of $F$ on $S$, then every vector tangent to $S$ at $x_0$ is also tangent to the level surface $\mathcal{S}$ of $F$ through $x_0$ (Fig. 4.1). In particular, this would imply that if $G$ satisfies the hypotheses of (4.3.7), and $F$ is Hadamard differentiable at $x_0$, then
$$\ker \partial G(x_0) = K_t(S;x_0) \subseteq K_t(\mathcal{S};x_0) = \ker \partial F(x_0). \tag{1}$$
The following result shows that this inclusion relation (1) does in fact hold when the spaces $X$, $Y$ are complete.
[Figure 4.1]
(4.4.1) Let $X$, $Y$ be real Banach spaces, let $F$ and $G$ be functions from a set $A \subseteq X$ into $\mathbf{R}$ and $Y$ respectively, and suppose that
(i) $F$ has a local minimum or local maximum at $x_0$ subject to the condition $G(x) = 0$,
(ii) $F$ is Hadamard differentiable at $x_0$,
(iii) $G$ is Hadamard differentiable on a neighbourhood of $x_0$, $dG$ is continuous at $x_0$, and $dG(x_0)$ is onto $Y$.
Then $\ker dG(x_0) \subseteq \ker dF(x_0)$.
Let $S = G^{-1}(\{0\})$ and suppose that $F$ has a local maximum at $x_0$ on $S$. By (4.3.7), $\ker dG(x_0) = K_t(S; x_0)$, and hence if $h \in \ker dG(x_0)$ we can find a sequence $(h_n)$ in $X$ converging to $h$ and a sequence $(t_n)$ of positive numbers converging to 0 such that $x_0 + t_n h_n \in S$ for all $n$. Then $F(x_0 + t_n h_n) \le F(x_0)$ for all sufficiently large $n$, whence
$$dF(x_0)h = \lim_{n \to \infty} (F(x_0 + t_n h_n) - F(x_0))/t_n \le 0.$$
But we can repeat this argument with $h$ replaced by $-h$, whence $dF(x_0)(-h) \le 0$, and therefore $dF(x_0)h = 0$, as required. The proof when $F$ has a conditional minimum is similar.
(4.4.1. Corollary 1) If $F$ and $G$ satisfy the hypotheses of the main theorem, there exists $\Lambda \in Y'$ such that $dF(x_0) = \Lambda \circ dG(x_0)$.
We apply (A.2.2)(ii). The kernel of the function $T = dF(x_0)$ contains the closed subspace $W = \ker dG(x_0)$. There is therefore $T_c \in \mathcal{L}(X/W, \mathbf{R})$ such that $T = T_c \circ \phi$, where $\phi$ is the canonical map from $X$ to $X/W$. Since $dG(x_0)$ is onto $Y$ and both $X$ and $Y$ are Banach spaces, there is a linear homeomorphism $\psi : Y \to X/W$ such that $\phi = \psi \circ dG(x_0)$. We take $\Lambda = T_c \circ \psi$, so that $\Lambda \circ dG(x_0) = T_c \circ \phi = T$.
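When $X = \mathbf{R}^n$, $Y = \mathbf{R}^m$, the corollary asserts that there exist real numbers $\lambda_1, \dots, \lambda_m$ (usually called Lagrange's multipliers) such that the Jacobian matrices $J_F(x_0)$ and $J_G(x_0)$ of $F$ and $G$ satisfy the relation
$$J_F(x_0) = (\lambda_1, \dots, \lambda_m)\, J_G(x_0).$$
Thus we have: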
(4.4.1. Corollary 2) (Lagrange's multipliers) If $F$ and $G$ satisfy the hypotheses of the main theorem, where now $X = \mathbf{R}^n$, $Y = \mathbf{R}^m$, there exist real numbers $\lambda_1, \dots, \lambda_m$ such that
$$D_j F(x_0) = \sum_{i=1}^{m} \lambda_i D_j G_i(x_0) \quad (j = 1, \dots, n), \tag{2}$$
where $G_1, \dots, G_m$ are the components of $G$.
The results of (4.4.1) and its corollaries become false if $dG(x_0)$ is not onto $Y$. For example, if $F, G : \mathbf{R}^2 \to \mathbf{R}$ are given by
$$F(x, y) = y, \qquad G(x, y) = (y - x^2)(2y - x^2),$$
then $F$ obviously has a minimum at the origin subject to the condition $G(x, y) = 0$. However, here $dG(0, 0) = 0$, and the equations (2), which in this case are
$$0 = \lambda(4x^3 - 6xy) \quad\text{and}\quad 1 = \lambda(4y - 3x^2),$$
have no solution $\lambda$ when $x = y = 0$.
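The counterexample can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the sampling grid are illustrative assumptions. It confirms that $F \ge 0$ on sampled points of the constraint set, that the gradient of $G$ vanishes at the origin, and that the second multiplier equation therefore cannot be solved there.

```python
# Counterexample to the multiplier rule when dG(x0) is not onto.
# F(x, y) = y,  G(x, y) = (y - x^2)(2y - x^2).

def F(x, y):
    return y

def G(x, y):
    return (y - x**2) * (2*y - x**2)

def grad_G(x, y):
    # Partial derivatives of G, expanded by hand from the product form.
    return (4*x**3 - 6*x*y, 4*y - 3*x**2)

def grad_F(x, y):
    return (0.0, 1.0)

# The constraint set G = 0 is the union of the parabolas y = x^2 and y = x^2/2.
samples = [k / 100.0 for k in range(-100, 101)]
constraint_points = [(x, x**2) for x in samples] + [(x, x**2 / 2) for x in samples]

# Largest |G| over the sampled constraint points (should be exactly zero).
max_abs_G = max(abs(G(x, y)) for x, y in constraint_points)

# F >= F(0, 0) = 0 on the sampled constraint set: the origin is a conditional minimum.
min_F_on_constraint = min(F(x, y) for x, y in constraint_points)

# At the origin grad G vanishes, so 1 = lambda * D_2 G(0, 0) has no solution.
gG0 = grad_G(0.0, 0.0)
gF0 = grad_F(0.0, 0.0)
second_equation_solvable = (gG0[1] != 0.0)
```

The sampling only illustrates the minimum property; the failure of the multiplier rule itself is exact, since both components of the gradient of $G$ are zero at the origin while $D_2F = 1$.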
In the particular case where $G$ is real-valued, (4.4.1. Corollary 1) shows that there exists $\lambda \in \mathbf{R}$ such that $dF(x_0) = \lambda\, dG(x_0)$. This case has an interesting application to the isoperimetric problem of the calculus of variations.
The isoperimetric problem can be stated as follows:
(IP) Find the function $\phi$ for which the functional
$$F(\phi) = \int_a^b f(t, \phi(t), \phi'(t))\,dt \tag{3}$$
has a local minimum or maximum, where the admissible functions belong to $C^1([a, b], Y)$, satisfy the endpoint conditions $\phi(a) = c$, $\phi(b) = d$, and are such that another functional
$$G(\phi) = \int_a^b g(t, \phi(t), \phi'(t))\,dt \tag{4}$$
takes a given value $l$.
The name 'isoperimetric' arises from the special case where $c = d = 0$,
$$F(\phi) = \int_a^b \phi(t)\,dt, \tag{5}$$
$$G(\phi) = \int_a^b (1 + \phi'(t)^2)^{1/2}\,dt, \tag{6}$$
since in this case the problem is that of finding the greatest among the areas bounded by the $t$-axis and a curve $y = \phi(t)$ of fixed length (observe that all such areas have the same perimeter).
To make the problem (IP) precise, let $J = [a, b]$, let $Y$ be a real Banach space, let $A$ be an open set in the metric space $J \times Y \times Y$, and, as in (3.11.1), let $f, g : A \to \mathbf{R}$ be continuous functions whose partial differentials $d_2 f, d_3 f, d_2 g, d_3 g$ are continuous on $A$. Let also $c, d \in Y$, let $S$ be the set of $\phi \in C^1(J, Y)$ such that $\phi(a) = c$, $\phi(b) = d$, and that $(t, \phi(t), \phi'(t)) \in A$ for all $t \in J$, and let $F_S$, $G_S$ be the restrictions to $S$ of the functionals defined by (3) and (4). If $\phi$ is a local minimum or maximum of $F_S$ subject to the condition that $G_S(\phi) = l$, and $F$, $G$ are given by
$$F(\psi) = F_S(\phi + \psi), \qquad G(\psi) = G_S(\phi + \psi),$$
where the domain of $F$ and $G$ is the set of $\psi \in C_0^1(J, Y)$ for which $\phi + \psi \in S$, then $\psi = 0$ is a local minimum or maximum of $F$ subject to the condition that $G(\psi) = l$. Since the Frechet differential of $F$ at 0 is given for all $\psi \in C_0^1(J, Y)$ by
$$dF(0)\psi = \int_a^b \bigl(d_2 f(\Phi(t))\psi(t) + d_3 f(\Phi(t))\psi'(t)\bigr)\,dt,$$
where $\Phi(t) = (t, \phi(t), \phi'(t))$ (cf. the remarks preceding (3.11.1)), and similarly for $G$, we deduce from (4.4.1. Corollary 1) that if $dG(0)$ is non-zero then there exists a real number $\lambda$ such that for all $\psi \in C_0^1(J, Y)$
$$\int_a^b \bigl((d_2 f(\Phi(t)) - \lambda\, d_2 g(\Phi(t)))\psi(t) + (d_3 f(\Phi(t)) - \lambda\, d_3 g(\Phi(t)))\psi'(t)\bigr)\,dt = 0.$$
By combining this with (3.11.4), we thus obtain:
(4.4.2) Let $J = [a, b]$, let $Y$ be a real Banach space, let $A$ be an open set in the metric space $J \times Y \times Y$, and let $f, g : A \to \mathbf{R}$ be continuous functions whose partial differentials $d_2 f, d_3 f, d_2 g, d_3 g$ are continuous on $A$. If $\phi$ is a solution of the isoperimetric problem (IP) which is not an extremal for the functional $G$, and $\Phi(t) = (t, \phi(t), \phi'(t))$, there exists a real number $\lambda$ such that the function $t \mapsto d_3 f(\Phi(t)) - \lambda\, d_3 g(\Phi(t))$ has a derivative at each $t \in J$ and
$$\frac{d}{dt}\bigl(d_3 f(\Phi(t)) - \lambda\, d_3 g(\Phi(t))\bigr) = d_2 f(\Phi(t)) - \lambda\, d_2 g(\Phi(t)). \tag{7}$$
For example, if $F$, $G$ are given by (5) and (6), so that
$$f(t, y, y') = y, \qquad g(t, y, y') = (1 + y'^2)^{1/2},$$
the equation (7) is
$$-\lambda \frac{d}{dt}\bigl(\phi'(t)(1 + \phi'(t)^2)^{-1/2}\bigr) = 1,$$
which is easily solved to show that the largest area is obtained when the curve is an arc of a circle.
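As a numerical sanity check of the circle solution, the sketch below takes the arc $\phi(t) = (R^2 - t^2)^{1/2} + \text{const}$ and verifies by central differences that $-\lambda\,\frac{d}{dt}\bigl(\phi'(1 + \phi'^2)^{-1/2}\bigr) = 1$ holds with $\lambda = R$. The radius, the sample points, and the helper names are illustrative assumptions, not taken from the text.

```python
import math

R = 2.0      # radius of the circular arc (illustrative value)
lam = R      # candidate multiplier; for this arc lambda equals the radius

def phi_prime(t):
    # Derivative of phi(t) = sqrt(R^2 - t^2) + const, valid for |t| < R.
    return -t / math.sqrt(R * R - t * t)

def momentum(t):
    # d3 f - lambda * d3 g along the arc, with f = y and g = sqrt(1 + y'^2).
    p = phi_prime(t)
    return -lam * p / math.sqrt(1.0 + p * p)

def ddt(func, t, h=1e-5):
    # Central finite-difference approximation of the derivative.
    return (func(t + h) - func(t - h)) / (2 * h)

# Residual of the equation d/dt(momentum) = d2 f - lambda * d2 g = 1.
residuals = [abs(ddt(momentum, t) - 1.0) for t in (-1.0, -0.5, 0.0, 0.5, 1.0)]
max_residual = max(residuals)
```

Along the arc the quantity `momentum(t)` reduces to $t$, which is why its derivative is identically 1.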
A somewhat more difficult example is given by taking
$$F(\phi) = \int_a^b \phi(t)(1 + (\phi'(t))^2)^{1/2}\,dt, \tag{8}$$
and retaining formula (6) for $G$. Except for the constant $2\pi$, the functional $F$ then represents the area of the surface obtained by rotating the curve $y = \phi(t)$ about the $t$-axis (cf. Example (b), §3.11, p. 232). We shall indicate how the solution to this problem may be obtained.
As in §3.11, write $L(t) = (1 + (\phi'(t))^2)^{1/2}$. If we suppress the variable $t$, (7) applied to the functionals in (8) and (6) gives
$$\frac{d}{dt}(L^{-1}\phi\phi' - \lambda L^{-1}\phi') = L.$$
If we carry out the differentiation and replace $\phi'^2$ by $L^2 - 1$, we find
$$-L^{-1} + L^{-3}(\phi - \lambda)\phi'' = 0.$$
On multiplying by $-\phi'$ this equation becomes
$$\frac{d}{dt}\bigl(L^{-1}(\phi - \lambda)\bigr) = 0.$$
If we now write $\chi(t) = \phi(t) - \lambda$, we see that for some constant $\alpha$
$$\chi(t) = \alpha(1 + (\chi'(t))^2)^{1/2},$$
and the solution to this problem was given in §3.11 (p. 233).
We now return to the ideas of (4.4.1). In that result we assumed the Hadamard differentiability of $F$ at $x_0$, but the argument still applies if we assume only that $F$ has a Hadamard variation $dF(x_0)h$ for all $h \in X$. More precisely, we have:
(4.4.3) Let $X$, $Y$ be real Banach spaces, let $A \subseteq X$, let $x_0 \in A$, and let $F : A \to \mathbf{R}$ and $G : A \to Y$ be functions such that $F$ has a Hadamard variation $dF(x_0)h$ for all $h \in X$, $G$ is Hadamard differentiable on a neighbourhood of $x_0$, $dG$ is continuous at $x_0$, and $dG(x_0)$ is onto $Y$.
(i) If $F$ has a local maximum at $x_0$ subject to the condition that $G(x) = 0$, then
$$\ker dG(x_0) \subseteq \{h \in X : dF(x_0)h \le 0,\ dF(x_0)(-h) \le 0\}.$$
In particular, if $dF(x_0)$ is sublinear, then
$$\ker dG(x_0) \subseteq \{h \in X : dF(x_0)h = dF(x_0)(-h) = 0\}.$$
(ii) If $F$ has a local minimum at $x_0$ subject to the condition that $G(x) = 0$, then
$$\ker dG(x_0) \subseteq \{h \in X : dF(x_0)h \ge 0,\ dF(x_0)(-h) \ge 0\}.$$
We note next a simple result concerning a convex function.
(4.4.4) Let $F$ be a convex function on a convex set $A \subseteq X$, let $x_0$ be an internal point of $A$ (so that $F$ has a Gateaux variation $\delta F(x_0)h$ for all $h \in X$), and let $B = \{h \in X : \delta F(x_0)h \ge 0\}$. Then $x_0 + B$ does not meet the set $Q = \{x \in A : F(x) < F(x_0)\}$.
Suppose on the contrary that there exists $h \in B$ such that $x_0 + h \in Q$. Then on the one hand
$$(F(x_0 + th) - F(x_0))/t \to \delta F(x_0)h \ge 0$$
as $t \to 0+$, while on the other hand, for $0 < t \le 1$ we have
$$(F(x_0 + th) - F(x_0))/t \le F(x_0 + h) - F(x_0) < 0.$$
†
If in addition to the hypotheses of (4.4.4) $F$ is continuous, then $F$ has a Hadamard variation $dF(x_0)h$ at $x_0$ for all $h \in X$, $dF(x_0) = \delta F(x_0)$, and $dF(x_0)$ is sublinear. Hence we have:
(4.4.4. Corollary) Let $X$, $Y$ be Banach spaces, let $F$ and $G$ be functions from a convex set $A \subseteq X$ into $\mathbf{R}$ and $Y$ respectively, and suppose that
(i) $F$ has a local maximum or minimum at $x_0$ subject to the condition that $G(x) = 0$,
(ii) $F$ is continuous and convex,
(iii) $G$ is Hadamard differentiable on a neighbourhood of $x_0$, $dG$ is continuous at $x_0$, and $dG(x_0)$ is onto $Y$.
Then $x_0 + \ker dG(x_0)$ does not meet the set $\{x \in A : F(x) < F(x_0)\}$.
As an application of (4.4.4. Corollary), we consider the case where $F(x) = \| x \|$, and we prove:
(4.4.5) Let $X$ be a Banach space, let $G$ be a function from a set $A \subseteq X$ into $\mathbf{R}$, and let $S = G^{-1}(\{0\})$. If $x_0$ is a local maximum or minimum of the norm function subject to the condition that $G(x) = 0$, $G$ is Hadamard differentiable on a neighbourhood of $x_0$ and $dG$ is continuous at $x_0$, then $| dG(x_0)x_0 | = \| dG(x_0) \| \, \| x_0 \|$.
The result is trivial if $x_0 = 0$ or $dG(x_0) = 0$, so that we may suppose $x_0 \ne 0$ and $dG(x_0) \ne 0$, whence $dG(x_0)$ is onto $\mathbf{R}$. Let $T = dG(x_0)$, and let $H = \ker T$. Then by (4.4.4. Corollary), $\| x_0 + h \| \ge \| x_0 \|$ for all $h \in H$, whence also $\| \hat{x}_0 + h \| \ge 1$ for all $h \in H$, where $\hat{x}_0 = x_0/\| x_0 \|$. This implies that the element $L = \hat{x}_0 + H$ of the quotient space $X/H$ has norm 1. If now $T_c \in \mathcal{LH}(X/H, \mathbf{R})$ is defined as in (A.2.2)(iii) then $\| T_c \| = \| T \|$, and since $\pm L$ are the only two points of the unit sphere in $X/H$ (for $X/H$ is one-dimensional) we have also that $\| T_c \| = | T_c L |$. Hence
$$\| T \| = \| T_c \| = | T_c L | = | T\hat{x}_0 | = | Tx_0 | / \| x_0 \|,$$
as required.
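A small finite-dimensional illustration of (4.4.5) may help. The sketch below is an assumption-laden example, not taken from the text: it uses $X = \mathbf{R}^2$ with the Euclidean norm and the affine constraint $G(x) = a \cdot x - 1$ with $a = (3, 4)$, so that $dG(x_0)$ is $v \mapsto a \cdot v$ with operator norm $\| a \|$. At the point of the constraint line nearest the origin the equality holds, while at other points of the line the inequality $| dG(x)x | \le \| dG(x) \| \, \| x \|$ is strict.

```python
import math

a = (3.0, 4.0)               # G(x) = a.x - 1, so dG(x0) is v -> a.v (illustrative choice)
norm_a = math.hypot(*a)      # operator norm of dG(x0) for the Euclidean norm

def dG(v):
    return a[0] * v[0] + a[1] * v[1]

# Point of the line a.x = 1 nearest to the origin: a conditional minimum of the norm.
x0 = (a[0] / norm_a**2, a[1] / norm_a**2)
lhs = abs(dG(x0))
rhs = norm_a * math.hypot(*x0)

# Other points of the constraint set, x0 + s * (-a2, a1) / 10 for s != 0.
others = [(x0[0] - 4.0 * s / 10, x0[1] + 3.0 * s / 10) for s in range(-20, 21) if s != 0]
on_constraint = all(abs(dG(x) - 1.0) < 1e-12 for x in others)
gaps = [norm_a * math.hypot(*x) - abs(dG(x)) for x in others]
min_norm_elsewhere = min(math.hypot(*x) for x in others)
```

Here `lhs` and `rhs` agree at `x0`, every entry of `gaps` is positive, and every sampled point other than `x0` has larger norm.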
† A slight adaptation of this argument shows further that $B$ does not meet the cone of vectors $h \in X$ such that there exists $\lambda > 0$ for which $F(x_0 + \lambda h) < F(x_0)$.
4.5 Constrained maxima and minima (inequality constraints)

We now widen the discussion of §4.4 and consider the minimum of a function $F$ subject to conditions of the form
$$F_i(x) \le 0 \quad (i = 1, \dots, n), \qquad G(x) = 0 \tag{1}$$
(the first $n$ conditions here are known as 'inequality constraints', while the last is an 'equality constraint'). Thus let $F_1, \dots, F_n$ be real-valued functions whose domains are contained in $X$, let $G$ be a function from a subset of $X$ into $Y$, let
$$Q_i = \{x \in X : F_i(x) \le 0\} \quad (i = 1, \dots, n), \qquad Q_{n+1} = \{x \in X : G(x) = 0\}, \qquad Q = \bigcap_{i=1}^{n+1} Q_i,$$
and let $F$ be a real-valued function whose domain contains $Q$ (in applications, each of the sets $Q_1, \dots, Q_n$ will have a non-empty interior, while $Q_{n+1}$ will in general have no interior points). We say that a point $x_0$ of the domain of $F$ is a local minimum of $F$ subject to the conditions (1) if $x_0 \in Q$ and there exists a neighbourhood $V$ of $x_0$ in $X$ such that $F(x) \ge F(x_0)$ for all $x \in Q \cap V$.
We note immediately that if $x_0$ is a local minimum of $F$ subject to the conditions (1), and
$$Q_0 = \{x \in X : F(x) < F(x_0)\},$$
then there exists a neighbourhood $V$ of $x_0$ in $X$ with the property that if $x \in \bigcap_{i=1}^{n+1} (Q_i \cap V)$ then $x \notin Q_0$, i.e. such that
$$\bigcap_{i=0}^{n+1} (Q_i \cap V) = \emptyset. \tag{2}$$
We shall see below that this property (2) implies a relation between the cones $K_d(Q_i; x_0)$ ($i = 1, \dots, n$) and the cone of vectors tangent to $Q_{n+1}$ at $x_0$. However, we must first investigate further the properties of these cones.
We begin by recalling from §4.2 (p. 258-9) that, for $E \subseteq X$ and $x_0 \in X$, the cone $K_d(E; x_0)$ of vectors directed into $E$ at $x_0$ consists of those elements $h \in X$ for which there exist a neighbourhood $W$ of $h$ in $X$ and $\varepsilon > 0$ such that $x_0 + tk \in E$ when $k \in W$ and $0 < t < \varepsilon$.
The following elementary properties of this cone are easily checked.
(4.5.1) Let $E \subseteq X$ and let $x_0 \in X$.
(i) $K_d(E; x_0)$ is an open cone with vertex 0.
(ii) If $K_d(E; x_0) \ne \emptyset$ then $x_0 \in \bar{E}$ (so that if $x_0 \notin \bar{E}$ then $K_d(E; x_0) = \emptyset$).
(iii) If $E \supseteq E'$ then $K_d(E; x_0) \supseteq K_d(E'; x_0)$.
(iv) $K_d(E; x_0) = K_d(E^\circ; x_0)$, so that if $E^\circ = \emptyset$ then $K_d(E; x_0) = \emptyset$.
(v) If $V$ is a neighbourhood of $x_0$ in $X$ then $K_d(E \cap V; x_0) = K_d(E; x_0)$.
(vi) $0 \in K_d(E; x_0)$ if and only if $x_0 \in E^\circ$, and then $K_d(E; x_0) = X$.
(vii) $h \in K_d(E; x_0)$ if and only if for each sequence $(h_n)$ in $X$ converging to $h$ and each sequence $(t_n)$ of positive numbers converging to 0, the points $x_0 + t_n h_n$ belong to $E$ for all sufficiently large $n$.†
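(viii) For $E_i \subseteq X$ ($1 \le i \le m$), $K_d(\bigcap_{i=1}^m E_i; x_0) = \bigcap_{i=1}^m K_d(E_i; x_0)$.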
We now turn to the cone $K_t(E; x_0)$ of vectors tangent to a set $E$ at the point $x_0$. This was defined in §4.3 (p. 273-4) to be
$\{h \in X$ : for each neighbourhood $W$ of $h$ in $X$ and for each $\varepsilon > 0$ there exist $k \in W$ and $t \in\, ]0, \varepsilon[$ such that $x_0 + tk \in E\}$.
Some simple properties of this cone are listed in the next proposition. Here and later, for any set $A \subseteq X$ we denote the complement $X \setminus A$ of $A$ with respect to $X$ by $A^c$.
(4.5.2) Let $E \subseteq X$ and $x_0 \in X$.
(i) $K_t(E; x_0) = K_d(E^c; x_0)^c$, so that $K_t(E; x_0)$ is a closed cone with vertex 0, and $0 \in K_t(E; x_0)$ whenever $K_t(E; x_0) \ne \emptyset$.
(ii) If $K_t(E; x_0) \ne \emptyset$ then $x_0 \in \bar{E}$ (so that if $x_0 \notin \bar{E}$ then $K_t(E; x_0) = \emptyset$).
(iii) If $E \supseteq E'$ then $K_t(E; x_0) \supseteq K_t(E'; x_0)$.
(iv) $K_t(E; x_0) = K_t(\bar{E}; x_0)$.
(v) If $V$ is a neighbourhood of $x_0$ in $X$ then $K_t(E \cap V; x_0) = K_t(E; x_0)$.
(vi) $h \in K_t(E; x_0)$ if and only if there exist a sequence $(h_n)$ in $X$ converging to $h$ and a sequence $(t_n)$ of positive numbers converging to 0 such that $x_0 + t_n h_n \in E$ for all $n$.‡
(vii) For $E_i \subseteq X$ ($1 \le i \le m$), $K_t(\bigcup_{i=1}^m E_i; x_0) = \bigcup_{i=1}^m K_t(E_i; x_0)$.
For the special case where $E$ is convex, the relationship between $K_d(E; x_0)$ and $K_t(E; x_0)$ is particularly simple.
(4.5.3) Let $E$ be a non-empty convex set in $X$, let $x_0 \in E$, let
$$D = \{h \in X : \text{there exists } \lambda > 0 \text{ for which } x_0 + \lambda h \in E\},$$
$$D_0 = \{h \in X : \text{there exists } \lambda > 0 \text{ for which } x_0 + \lambda h \in E^\circ\},$$
and let $E^-$ be the set of support functionals to $E$ at $x_0$, i.e.
$$E^- = \{u \in X' : u(x) \ge u(x_0) \text{ for all } x \in E\}.$$
Then
(i) $K_t(E; x_0) = K_t(\bar{E}; x_0) = \bar{D}$ and $K_d(E; x_0) = K_d(E^\circ; x_0) = D_0$,
(ii) $K_t(E; x_0)$ and $K_d(E; x_0)$ are convex,
(iii) the dual cone $K_t^*(E; x_0)$ of $K_t(E; x_0)$ is equal to $E^-$.
If in addition $E^\circ \ne \emptyset$, then
(iv) $K_t(E; x_0) = \overline{K_d(E; x_0)}$, so that the dual cone $K_d^*(E; x_0)$ of $K_d(E; x_0)$ is $E^-$.
† This condition is given in §4.2.
‡ This is just (4.3.1)(ii).
To prove the first statement in (i), let $h \in K_t(E; x_0)$. Then, given a neighbourhood $W$ of $h$, there are $k \in W$ and $\lambda > 0$ with $x_0 + \lambda k \in E$. Thus $k \in D$, and we conclude $h \in \bar{D}$. On the other hand, if $h \in \bar{D}$, then for each neighbourhood $W$ of $h$ there exist $k \in W$ and $\lambda > 0$ with $x_0 + \lambda k \in E$. Since $x_0 \in E$, $x_0 + tk \in E$ for $0 < t \le \lambda$, whence $h \in K_t(E; x_0)$. This, together with (4.5.2)(iv) gives the result.
In proving the second part of (i) we may obviously discard the case in which $E^\circ = \emptyset$. Let $h \in D_0$, so that there exists $\lambda > 0$ for which $x_0 + \lambda h \in E^\circ$, and let $W = -\lambda^{-1}x_0 + \lambda^{-1}E^\circ$. Then $W$ is a neighbourhood of $h$ and if $k \in W$ then $x_0 + \lambda k \in E^\circ$. The convexity of $E$ gives that if $k \in W$ and $0 < t \le \lambda$ then $x_0 + tk \in E^\circ$, whence $h \in K_d(E^\circ; x_0)$. Conversely, if $h \in K_d(E^\circ; x_0)$ then $x_0 + th \in E^\circ$ for all sufficiently small positive $t$, so that $h \in D_0$. This and (4.5.1)(iv) completes the proof of (i).
To prove (ii) it is now enough to show that $\bar{D}$ and $D_0$ are convex. If we prove $D$ is convex, then the convexity of $\bar{D}$ will follow, and the same proof will suffice to show $D_0$ is convex (for $E^\circ$ is convex when $E$ is). Let $h, k \in D$, let $\lambda, \mu$ be positive numbers such that $x_0 + \lambda h, x_0 + \mu k \in E$, and let $\nu = \min\{\lambda, \mu\}$. Then $x_0 + \nu h$ and $x_0 + \nu k$ belong to $E$, and since $E$ is convex it follows that
$$x_0 + \nu(\sigma h + (1 - \sigma)k) = \sigma(x_0 + \nu h) + (1 - \sigma)(x_0 + \nu k) \in E$$
for $0 \le \sigma \le 1$. Hence $\sigma h + (1 - \sigma)k \in D$, so that $D$ is convex.
To prove (iii), we recall (A.5.3)(iv) that the dual of a cone $K$ in $X$ is identical to the dual of the closure $\bar{K}$ of $K$, so that it is enough to prove that $D^* = E^-$. Let $u \in D^*$ and let $x \in E$. Then $x - x_0 \in D$, whence $u(x) - u(x_0) = u(x - x_0) \ge 0$, so that $u \in E^-$. Conversely, if $u \in E^-$ and $h \in D$, there exists $\lambda > 0$ such that $x_0 + \lambda h \in E$ and then $u(h) = \lambda^{-1}(u(x_0 + \lambda h) - u(x_0)) \ge 0$, whence $u \in D^*$.
It remains to prove (iv), and it is enough to show that if $E^\circ \ne \emptyset$ then $\bar{D}_0 \supseteq D$, for then clearly $\bar{D}_0 = \bar{D}$. Let $h \in D$, let $\lambda$ be a positive number such that $x_0 + \lambda h \in E$, and let $x_0 + k$ be a point of $E^\circ$. Then for $0 < \sigma < 1$
$$x_0 + \sigma k + \lambda(1 - \sigma)h = \sigma(x_0 + k) + (1 - \sigma)(x_0 + \lambda h) \in E^\circ,$$
so that $h + \sigma(1 - \sigma)^{-1}\lambda^{-1}k \in D_0$, and therefore $h \in \bar{D}_0$.
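The next result contains the kernel of our theory.
(4.5.4) If $E_0, \dots, E_{n+1}$ are subsets of $X$ such that $\bigcap_{i=0}^{n+1} E_i = \emptyset$, then
$$\Bigl(\bigcap_{i=0}^{n} K_d(E_i; x_0)\Bigr) \cap K_t(E_{n+1}; x_0) = \emptyset. \tag{3}$$
If in addition the cones $K_d(E_i; x_0)$ ($i = 0, \dots, n$) and $K_t(E_{n+1}; x_0)$ are convex and at least one is neither $\emptyset$ nor $X$ and their dual cones are $K_d^*(E_i; x_0)$ and $K_t^*(E_{n+1}; x_0)$ then there exist $u_i \in K_d^*(E_i; x_0)$ ($i = 0, \dots, n$) and $u_{n+1} \in K_t^*(E_{n+1}; x_0)$, not all 0, such that
$$u_0 + u_1 + \dots + u_n + u_{n+1} = 0. \tag{4}$$
If $\bigcap_{i=0}^{n+1} E_i = \emptyset$ then $E_{n+1} \subseteq \bigcup_{i=0}^{n} E_i^c$, and using (4.5.2) we find
$$K_t(E_{n+1}; x_0) \subseteq K_t\Bigl(\bigcup_{i=0}^{n} E_i^c; x_0\Bigr) = \bigcup_{i=0}^{n} K_t(E_i^c; x_0) = \bigcup_{i=0}^{n} K_d(E_i; x_0)^c = \Bigl(\bigcap_{i=0}^{n} K_d(E_i; x_0)\Bigr)^c.$$
This gives (3), and from (3) and (A.5.9) we deduce the existence of $u_0, \dots, u_{n+1}$, not all 0, with property (4).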
When the sets $E_i$ are convex, (4.5.4) has the following partial converse.
(4.5.5) Let $E_0, \dots, E_{n+1}$ be convex subsets of $X$ such that $E_0$ is open and that $E_1^\circ \cap E_2^\circ \cap \dots \cap E_n^\circ \cap E_{n+1}$ is non-empty. Suppose also that there exists $x_0 \in \bigcap_{i=0}^{n+1} \bar{E}_i$ such that (3) holds. Then the set $E = \bigcap_{i=0}^{n+1} E_i$ is empty.
Suppose on the contrary that $E \ne \emptyset$, let $x \in E$, let $x' \in E_1^\circ \cap \dots \cap E_n^\circ \cap E_{n+1}$, and for $0 < \sigma < 1$ let $x_\sigma = \sigma x' + (1 - \sigma)x$. Then $x_\sigma \in E_1^\circ \cap \dots \cap E_n^\circ \cap E_{n+1}$. Moreover, since $x \in E_0$ and $E_0$ is open, $x_\sigma \in E_0$ for all sufficiently small $\sigma$. Choose such a $\sigma$ and let $h = x_\sigma - x_0$. By (4.5.3)(i), $h \in K_d(E_i; x_0)$ for $i = 0, \dots, n$ and $h \in K_t(E_{n+1}; x_0)$, and this contradicts (3).
We now return to the problem of constrained minima. From (2), (4.5.4), (4.5.1)(v) and (4.5.2)(v) we obtain the following theorem.
(4.5.6) Let $F_1, \dots, F_n$ be real-valued functions whose domains are contained in the real normed space $X$, let $G$ be a function from a subset of $X$ into the real normed space $Y$, and let
$$Q_i = \{x \in X : F_i(x) \le 0\} \quad (i = 1, \dots, n), \qquad Q_{n+1} = \{x \in X : G(x) = 0\}, \qquad Q = \bigcap_{i=1}^{n+1} Q_i.$$
Let also $F$ be a real-valued function whose domain contains the set $Q$, let $x_0$ be a local minimum of $F$ subject to the condition that $x \in Q$ (so that $x_0 \in Q$) and let
$$Q_0 = \{x \in X : F(x) < F(x_0)\}.$$
Then
$$\Bigl(\bigcap_{i=0}^{n} K_d(Q_i; x_0)\Bigr) \cap K_t(Q_{n+1}; x_0) = \emptyset.$$
If in addition the cones $K_d(Q_0; x_0), \dots, K_d(Q_n; x_0), K_t(Q_{n+1}; x_0)$ are convex and at least one is neither $\emptyset$ nor $X$, there exist $u_i \in K_d^*(Q_i; x_0)$ ($i = 0, \dots, n$) and $u_{n+1} \in K_t^*(Q_{n+1}; x_0)$, not all 0, such that
$$u_0 + u_1 + \dots + u_n + u_{n+1} = 0.$$
In (4.3.7) we saw how to determine $K_t(Q_{n+1}; x_0)$ in certain cases. We now turn to the corresponding problem for the cones $K_d$.
(4.5.7) Let $f$ be a real-valued function with domain contained in the real normed space $X$, let $x_0$ be an interior point of its domain and suppose that the upper Hadamard variation $\bar{d}f(x_0)h$ exists for all $h \in X$ (see §4.2, p. 268). Let also
$$Q = \{x \in X : f(x) < f(x_0)\}, \qquad R = \{x \in X : f(x) \le f(x_0)\}.$$
Then
$$\{h \in X : \bar{d}f(x_0)h < 0\} \subseteq K_d(Q; x_0) \subseteq K_d(R; x_0) \subseteq \{h \in X : \bar{d}f(x_0)h \le 0\}. \tag{5}$$
Suppose first that $\bar{d}f(x_0)h < 0$. Then for each sequence $(h_n)$ in $X$ converging to $h$ and each sequence $(t_n)$ of positive numbers converging to 0 we have
$$\limsup_{n \to \infty} (f(x_0 + t_n h_n) - f(x_0))/t_n < 0.$$
Hence $f(x_0 + t_n h_n) - f(x_0) < 0$ for all sufficiently large $n$, whence $x_0 + t_n h_n \in Q$, i.e. $h \in K_d(Q; x_0)$.
Next let $h \in K_d(R; x_0)$. Then for each sequence $(h_n)$ in $X$ converging to $h$ and each sequence $(t_n)$ of positive numbers converging to 0 we have $f(x_0 + t_n h_n) - f(x_0) \le 0$ for all sufficiently large $n$. Hence
$$\limsup_{n \to \infty} (f(x_0 + t_n h_n) - f(x_0))/t_n \le 0,$$
which implies that $\bar{d}f(x_0)h \le 0$. Since the central inclusion in (5) is obvious, the proof is complete.
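The outer inclusions in (4.5.7) can be strict, as a simple example in $\mathbf{R}^2$ suggests. The sketch below is illustrative only: the function $f(x_1, x_2) = x_1^2 + x_2$ at $x_0 = 0$ (where the variation is $h \mapsto h_2$), the sampling radius, and the helper `sampled_directed_into` are all assumptions made here. The helper tests the defining property of $K_d$ on a finite sample, so a result of `True` is only evidence, not a proof.

```python
def f(x1, x2):
    return x1 * x1 + x2

def in_Q(x1, x2):
    # Q = {x : f(x) < f(0)}
    return f(x1, x2) < 0.0

def in_R(x1, x2):
    # R = {x : f(x) <= f(0)}
    return f(x1, x2) <= 0.0

def sampled_directed_into(member, h, radius=0.1, eps=0.5, n=20):
    """Finite-sample test that t*k lies in the set for all sampled k within
    `radius` of h (sup norm) and all sampled t in ]0, eps]; here x0 = 0."""
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            k1 = h[0] + radius * i / n
            k2 = h[1] + radius * j / n
            for m in range(1, n + 1):
                t = eps * m / n
                if not member(t * k1, t * k2):
                    return False
    return True

h_down = (0.0, -1.0)   # variation of f at 0 in this direction is -1 < 0
h_side = (1.0, 0.0)    # variation of f at 0 in this direction is 0

down_in_Kd_Q = sampled_directed_into(in_Q, h_down)
side_in_Kd_R = sampled_directed_into(in_R, h_side)

# Directions arbitrarily close to h_side with positive second component leave R at once.
witness = all(f(t * 1.0, t * 1e-3) > 0.0 for t in (1e-1, 1e-3, 1e-6))
```

For `h_side` the variation is 0, so `h_side` lies in the right-hand set of (5), yet every neighbourhood of it contains directions $k$ with $k_2 > 0$ along which $f(tk) > 0$ for all $t > 0$; hence it is not directed into $R$.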
(4.5.7. Corollary) Suppose that in addition to the hypotheses of (4.5.7) the upper variation $\bar{d}f(x_0)$ is convex on $X$ and the set $\{h \in X : \bar{d}f(x_0)h < 0\}$ is non-empty. Then
$$K_d^*(Q; x_0) = K_d^*(R; x_0) = \bigcup_{\lambda \le 0} \lambda\, d_* f(x_0),$$
where $d_* f(x_0)$ is the Hadamard subvariation of $f$ at $x_0$ (see §4.2, p. 270).
This corollary is simply established by combining (5), (4.2.11. Corollary) and (A.5.6).
It is worth recalling from §4.2 that under the conditions of the corollary, $d_* f(x_0)$ is non-empty, and that if $f$ is Hadamard differentiable at $x_0$, then $d_* f(x_0)$ consists of the single element $df(x_0)$.
We can now reinterpret (4.5.6). For simplicity, we take the case when there are only inequality constraints.
(4.5.8) Let $F_1, \dots, F_n$ be continuous real-valued functions whose domains are subsets of the real normed space $X$ and let $F_0$ be a continuous real-valued function whose domain contains $\{x \in X : F_i(x) \le 0\ (i = 1, \dots, n)\}$. Let also $x_0$ be a local minimum of $F_0$ subject to the conditions $F_i(x) \le 0$ ($i = 1, \dots, n$), and suppose that $F_0, \dots, F_n$ possess Hadamard upper variations at $x_0$, each defined and convex on $X$. Then there exist non-negative numbers $\lambda_0, \dots, \lambda_n$, not all 0, such that
$$\lambda_i F_i(x_0) = 0 \quad (i = 1, \dots, n), \qquad 0 \in \sum_{i=0}^{n} \lambda_i\, d_* F_i(x_0). \tag{6}$$
Moreover, if there exists $h \in X$ such that $\bar{d}F_i(x_0)h < 0$ for every integer $i \ge 1$ for which $F_i(x_0) = 0$, then $\lambda_0 \ne 0$.
We begin by discarding some trivial cases. If $0 \in d_* F_0(x_0)$, we can take $\lambda_0 = 1$ and $\lambda_i = 0$ for all $i \ne 0$. Again, if there is $j \ge 1$ such that $F_j(x_0) = 0$ and $0 \in d_* F_j(x_0)$, we can take $\lambda_j = 1$ and $\lambda_i = 0$ for all $i \ne j$. We can therefore assume that $0 \notin d_* F_0(x_0)$ and that for each $i \ge 1$, either $F_i(x_0) \ne 0$ or $0 \notin d_* F_i(x_0)$.
Let
$$Q_0 = \{x \in X : F_0(x) < F_0(x_0)\}, \qquad Q_i = \{x \in X : F_i(x) \le 0\} \quad (i = 1, \dots, n). \tag{7}$$
Then by (4.5.6) (with $G = 0$) we have $\bigcap_{i=0}^{n} K_d(Q_i; x_0) = \emptyset$. Moreover, since $0 \notin d_* F_0(x_0)$, the set $\{h \in X : \bar{d}F_0(x_0)h < 0\}$ is non-empty, so that $K_d(Q_0; x_0)$ is non-empty (4.5.7). Further, since $x_0 \notin Q_0$, it follows from (4.5.1)(vi) that $0 \notin K_d(Q_0; x_0)$ so that $K_d(Q_0; x_0) \ne X$. From (4.5.6) we infer the existence of $v_i \in K_d^*(Q_i; x_0)$ ($i = 0, \dots, n$), not all 0, such that $v_0 + v_1 + \dots + v_n = 0$. If, for any $i \ge 1$, $F_i(x_0) \ne 0$, so that $F_i(x_0) < 0$ and $x_0$ is an interior point of $Q_i$, then by (4.5.1)(vi) $K_d(Q_i; x_0) = X$, so that $K_d^*(Q_i; x_0) = \{0\}$; hence $v_i = 0$, and if we put $\lambda_i = 0$ and take any $u_i \in d_* F_i(x_0)$, we have $-\lambda_i u_i = v_i$. On the other hand, if, for any $i \ge 0$, $0 \notin d_* F_i(x_0)$, then $\{h \in X : \bar{d}F_i(x_0)h < 0\}$ is non-empty, so by (4.5.7. Corollary), $K_d^*(Q_i; x_0) = \bigcup_{\lambda \le 0} \lambda\, d_* F_i(x_0)$; thus we can find $\lambda_i \ge 0$ and $u_i \in d_* F_i(x_0)$ with $-\lambda_i u_i = v_i$. We have assumed that for every $i$ one of these possibilities holds, and hence
$$\lambda_0 u_0 + \lambda_1 u_1 + \dots + \lambda_n u_n = -v_0 - v_1 - \dots - v_n = 0.$$
Since not every $v_i$ is 0, neither is every $\lambda_i$ equal to 0, and (6) is established.
Suppose next that the additional hypothesis is satisfied, let $\lambda_0, \dots, \lambda_n$ be non-negative numbers, not all 0, such that the conditions (6) hold, and let $J$ be the set of integers $i \ge 1$ such that $F_i(x_0) = 0$. If $\lambda_0 = 0$, then $0 \in \sum_{i \in J} \lambda_i\, d_* F_i(x_0)$, and by (4.5.7. Corollary) and (A.5.9) it follows that $\bigcap_{i \in J} K_d(Q_i; x_0) = \emptyset$. An application of (4.5.7) now yields a contradiction to the hypothesis.
For the case in which the $F_i$ are Hadamard differentiable, we have:
(4.5.8. Corollary 1) Let $F_1, \dots, F_n$ be continuous real-valued functions whose domains are contained in $X$, and let $F_0$ be a continuous real-valued function whose domain contains the set $\{x \in X : F_i(x) \le 0\ (i = 1, \dots, n)\}$. Let also $x_0$ be a local minimum of $F_0$ subject to the conditions $F_i(x) \le 0$ ($i = 1, \dots, n$), and suppose that $F_0, \dots, F_n$ are Hadamard differentiable at $x_0$. Then there exist non-negative numbers $\lambda_0, \dots, \lambda_n$, not all 0, such that
$$\lambda_i F_i(x_0) = 0 \quad (i = 1, \dots, n), \qquad \sum_{i=0}^{n} \lambda_i\, dF_i(x_0) = 0.$$
Moreover, if there exists $h \in X$ such that $dF_i(x_0)h < 0$ for every integer $i \ge 1$ for which $F_i(x_0) = 0$, then $\lambda_0 \ne 0$.
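To make the multiplier conditions concrete, here is a minimal sketch for one smooth problem in $\mathbf{R}^2$. The problem itself (minimise $F_0(x) = x_1^2 + x_2^2$ subject to $F_1(x) = 1 - x_1 - x_2 \le 0$), the candidate point, the multipliers, and the grid are illustrative assumptions, not taken from the text. The code checks the two conditions of the corollary, exhibits a direction $h$ with $dF_1(x_0)h < 0$ for the active constraint, and compares against a brute-force search over feasible grid points.

```python
# Minimise F0 subject to F1 <= 0 and check the multiplier conditions at x0.

def F0(x):
    return x[0]**2 + x[1]**2

def F1(x):
    return 1.0 - x[0] - x[1]

def grad_F0(x):
    return (2.0 * x[0], 2.0 * x[1])

def grad_F1(x):
    return (-1.0, -1.0)

x0 = (0.5, 0.5)          # candidate constrained minimum
lam0, lam1 = 1.0, 1.0    # candidate multipliers: non-negative, not all zero

g0, g1 = grad_F0(x0), grad_F1(x0)
# lam0 * dF0(x0) + lam1 * dF1(x0), written as a gradient vector.
stationarity = tuple(lam0 * u + lam1 * v for u, v in zip(g0, g1))
# lam1 * F1(x0): the constraint is active, so this vanishes.
complementarity = lam1 * F1(x0)

# Direction h with dF1(x0)h < 0 for the active constraint, which forces lam0 != 0.
h = (1.0, 1.0)
dF1_h = g1[0] * h[0] + g1[1] * h[1]

# Brute-force comparison on a grid of feasible points in [-2, 2]^2.
grid = [(i / 50.0, j / 50.0) for i in range(-100, 101) for j in range(-100, 101)]
feasible = [x for x in grid if F1(x) <= 0.0]
min_feasible = min(F0(x) for x in feasible)
```

With these choices `stationarity` is the zero vector and `complementarity` is zero, so both conditions of the corollary hold with $\lambda_0 = \lambda_1 = 1$.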
When the $F_i$ are convex, the additional hypothesis in the last part of (4.5.8) can be put in a more convenient form.
(4.5.8. Corollary 2) Let $F_1, \dots, F_n$ be continuous convex functions whose domains are subsets of $X$ and let $F_0$ be a continuous convex function whose domain contains the set $\{x \in X : F_i(x) \le 0\ (i = 1, \dots, n)\}$. Let $x_0$ be a local minimum of $F_0$ subject to the conditions $F_i(x) \le 0$ ($i = 1, \dots, n$). Then there exist non-negative numbers $\lambda_0, \dots, \lambda_n$, not all 0, such that
$$\lambda_i F_i(x_0) = 0 \quad (i = 1, \dots, n), \qquad 0 \in \sum_{i=0}^{n} \lambda_i\, d_* F_i(x_0). \tag{8}$$
Moreover, if there exists $x \in X$ such that $F_i(x) < 0$ for each integer $i \ge 1$ for which $F_i(x_0) = 0$, then $\lambda_0 \ne 0$.
Here it is obviously enough to prove the last part, and to do this we have only to note that if $F_i$ is convex then for $0 < t \le 1$
$$(F_i(x_0 + t(x - x_0)) - F_i(x_0))/t \le F_i(x) - F_i(x_0),$$
so that if $F_i(x_0) = 0$ and $F_i(x) < 0$ then
$$dF_i(x_0)(x - x_0) = \delta F_i(x_0)(x - x_0) \le F_i(x) < 0.$$
From (4.5.5) we have the following converse of (4.5.8. Corollary 2).
(4.5.9) Let $F_1, \dots, F_n$ be continuous convex functions whose domains are subsets of the real normed space $X$, and suppose that there exists $x \in X$ such that $F_i(x) < 0$ for $i = 1, \dots, n$. Let also $F_0$ be a continuous convex function whose domain contains the set $Q = \{x \in X : F_i(x) \le 0\ (i = 1, \dots, n)\}$, and let $x_0$ be a point of $Q$ with the property that there exist non-negative numbers $\lambda_0, \dots, \lambda_n$, not all 0, such that (8) holds. Then $F_0(x) \ge F_0(x_0)$ for all $x \in Q$.
Let $Q_0, \dots, Q_n$ be defined as in (7). If $Q_0 = \emptyset$ there is nothing to prove. If $Q_0 \ne \emptyset$, then evidently $x_0 \in \bar{Q}_0$; moreover, if $x \in Q_0$, then (exactly as above) $dF_0(x_0)(x - x_0) < 0$, so that by (4.5.7. Corollary) $K_d^*(Q_0; x_0) = \bigcup_{\mu \le 0} \mu\, d_* F_0(x_0)$. Next, if $i \ge 1$ and $F_i(x_0) = 0$, then
$$dF_i(x_0)(x - x_0) < 0,$$
so that $K_d^*(Q_i; x_0) = \bigcup_{\mu \le 0} \mu\, d_* F_i(x_0)$. On the other hand, if $F_i(x_0) < 0$ then $\lambda_i = 0$ and $K_d(Q_i; x_0) = X$, so that $\lambda_i\, d_* F_i(x_0) = \{0\} = K_d^*(Q_i; x_0)$. From (8) and (A.5.9) it follows that $\bigcap_{i=0}^{n} K_d(Q_i; x_0) = \emptyset$, whence $\bigcap_{i=0}^{n} Q_i = \emptyset$ by (4.5.5) and this is the required result.
4.6 Theorems of Lyapunov type for differential equations

In this section we discuss some applications of Lyapunov's 'second method' to questions of stability, uniqueness, and existence of solutions of ordinary differential equations.†
The idea underlying Lyapunov's 'second method' goes back to a theorem first stated by Lagrange, namely that if in a certain position of a conservative mechanical system the potential energy has a strict minimum, then this position is one of stable equilibrium.
Consider, for simplicity, a particle of unit mass moving under a conservative system of forces in $\mathbf{R}^3$, where the potential energy of the particle when it is at the point $x = (x_1, x_2, x_3)$ of $\mathbf{R}^3$ is $U(x)$ (the potential energy is independent of the time $t$, since the system is conservative). We suppose that $U$ has a strict local minimum at 0. Moreover, since $U$ is determined only up to an additive constant, we can suppose that $U(0) = 0$.
† In his work on stability theory, Lyapunov used the term 'second method' to cover methods that did not require a knowledge of the form of the solution of the equation concerned (cf. (4) below).
The equations of motion of the particle are
$$x_1'' = -\frac{\partial U(x)}{\partial x_1}, \qquad x_2'' = -\frac{\partial U(x)}{\partial x_2}, \qquad x_3'' = -\frac{\partial U(x)}{\partial x_3},$$
or in vector notation (cf. Exercise 3.5.1),
$$x'' = -\nabla U(x), \tag{1}$$
and since $U$ has a minimum at 0, $\nabla U(0) = 0$, so that $x = 0$ is a solution. We wish to show that this solution is stable, and for this purpose we have to consider the pair of first-order equations equivalent to (1), namely
$$x' = y, \qquad y' = -\nabla U(x),$$
whose solution is of the form $x = \psi(t)$, $y = \psi'(t)$, where $\psi$ is a solution of (1). Writing $z = \phi(t) = (\psi(t), \psi'(t))$, we have to show that for each $\varepsilon > 0$ we can find $\delta > 0$ such that if $\| \phi(t_0) \| < \delta$ and $\phi$ reaches to the boundary of $[t_0, \infty[ \times \mathbf{R}^6$ on the right, then $\phi$ is defined and satisfies $\| \phi(t) \| < \varepsilon$ for all $t \ge t_0$.† To do this, we use the fact that the total energy of the system is constant along each solution curve, i.e. that
$$U(\psi(t)) + \tfrac12 \| \psi'(t) \|^2 = \text{constant}.$$
Let
$$H(z) = H(x, y) = U(x) + \tfrac12 \| y \|^2 \quad (x, y \in \mathbf{R}^3).$$
‡
Then $H$ is continuous on $\mathbf{R}^6$, $H(0) = 0$, and $H$ has a strict local minimum at 0, so that there exists $\rho > 0$ such that $H(z) > 0$ for $0 < \| z \| < \rho$. For $r > 0$ let $E(r)$ be the component of the set $\{z \in \mathbf{R}^6 : H(z) < r\}$ containing the origin. Clearly $E(r)$ is open. Moreover, $E(r)$ tends to $\{0\}$ as $r \to 0+$, for if $0 < \varepsilon < \rho$, then $H$ has a positive infimum $m$ on the sphere $\{z : \| z \| = \varepsilon\}$, and hence if $0 < r < m$, then $E(r)$ does not meet the sphere, and therefore lies inside it. If now we take such an $r$ and choose $\delta > 0$ so that the ball $\{z : \| z \| < \delta\}$ is contained in $E(r)$, then if $\| \phi(t_0) \| < \delta$, $\phi(t) \in E(r)$ for all $t$ in its domain for which $t \ge t_0$, so that $\| \phi(t) \| < \varepsilon$. Since $\varepsilon < \rho$, we infer from (2.4.3)(iv) that $\phi$ has domain $[t_0, \infty[$, and this is the required result.
What is crucial here is the existence of the (continuous) function $H$ which has a local minimum at 0 and which is constant along each solution curve $z = \phi(t)$. More generally, the argument still applies if the derivative of $H$ along each solution curve is less than or equal to 0, for this ensures that if the solution starts in $E(r)$ then it remains in $E(r)$.
† See Exercises 2.8; the argument actually shows that the zero solution of (1) is uniformly stable.
‡ $H$ is the Hamiltonian of the system (see §3.11, Example (d), p. 236).
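The energy argument can be observed numerically. The following sketch is an illustration under stated assumptions: the potential $U(x) = \tfrac12 \| x \|^2 + \tfrac14 \| x \|^4$ (which has a strict minimum at 0), the initial data, the step size, and the use of the velocity Verlet scheme are all choices made here, not part of the text. For this $U$ we have $H(z) \ge \tfrac12 \| z \|^2$, so conservation of $H$ confines the orbit to the ball of radius $(2H(z_0))^{1/2}$.

```python
import math

def grad_U(x):
    # U(x) = |x|^2/2 + |x|^4/4 has a strict minimum at 0; grad U(x) = (1 + |x|^2) x.
    s = 1.0 + sum(c * c for c in x)
    return [s * c for c in x]

def H(x, y):
    # Total energy H(x, y) = U(x) + |y|^2 / 2.
    n2 = sum(c * c for c in x)
    return 0.5 * n2 + 0.25 * n2 * n2 + 0.5 * sum(c * c for c in y)

def simulate(x, y, dt=1e-3, steps=20000):
    """Integrate x' = y, y' = -grad U(x) with the velocity Verlet scheme,
    recording the energy and the norm of z = (x, y) after every step."""
    x, y = list(x), list(y)
    energies = [H(x, y)]
    norms = [math.sqrt(sum(c * c for c in x + y))]
    g = grad_U(x)
    for _ in range(steps):
        y = [yi - 0.5 * dt * gi for yi, gi in zip(y, g)]   # half kick
        x = [xi + dt * yi for xi, yi in zip(x, y)]         # drift
        g = grad_U(x)
        y = [yi - 0.5 * dt * gi for yi, gi in zip(y, g)]   # half kick
        energies.append(H(x, y))
        norms.append(math.sqrt(sum(c * c for c in x + y)))
    return energies, norms

energies, norms = simulate([0.1, 0.0, 0.0], [0.0, 0.05, 0.0])
energy_drift = max(abs(e - energies[0]) for e in energies)
max_norm = max(norms)
bound = math.sqrt(2.0 * energies[0])   # from H(z) >= |z|^2 / 2
```

The computed energy stays constant up to discretisation error, and `max_norm` stays below `bound`, which mirrors the role played by the sets $E(r)$ in the argument above.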
In the preceding example, the right-hand side of the equation (1) is independent of $t$. For a general first-order equation
$$y' = f(t, y), \tag{2}$$
the corresponding approach is to find a function $(t, y) \mapsto V(t, y)$ which is zero for $y = 0$, and whose derivative along any solution curve $y = \phi(t)$ of the equation is less than or equal to 0. Such a function is called a Lyapunov function for the equation (2). The Hadamard variation of $V$ enters naturally here, for if $V$ has a Hadamard variation at the point $(t, \phi(t))$ for every increment, then, by (4.2.4),
$$\frac{d}{dt} V(t, \phi(t)) = dV(t, \phi(t))(1, \phi'(t)) = dV(t, \phi(t))(1, f(t, \phi(t))),$$
so that the required condition is that, for all $t$,
$$dV(t, \phi(t))(1, f(t, \phi(t))) \le 0. \tag{3}$$
This condition (3) is in turn obviously implied by the condition that, for all $(t, y)$,
$$dV(t, y)(1, f(t, y)) \le 0, \tag{4}$$
and this last condition (4) has the advantage that it does not require any knowledge of the solution $\phi$.
More generally, (4) can be replaced by the condition that, for all $(t, y)$,
$$\bar{d}V(t, y)(1, f(t, y)) \le 0, \tag{5}$$
where $\bar{d}V$ denotes the upper Hadamard variation of $V$ (§4.2), for, by Exercise 4.2.4, this implies that
$$D^+ V(t, \phi(t)) \le \bar{d}V(t, \phi(t))(1, \phi'(t)) = \bar{d}V(t, \phi(t))(1, f(t, \phi(t))) \le 0, \tag{6}$$
so that again $t \mapsto V(t, \phi(t))$ is decreasing.
In the theorems below we make a still further generalization, in the spirit of the results of §2.5 and §2.11, by comparing $V(t, \phi(t))$ with the solution of a scalar equation $x' = g(t, x)$ whose behaviour is prescribed. In the first instance we employ the condition
$$\bar{d}V(t, y)(1, f(t, y)) \le g(t, V(t, y)),$$
which is the natural extension of (5). Other conditions will be mentioned later.
We prove first a simple differential inequality which underlies the subsequent results on stability.
(4.6.1) Let $I = [a, \infty[$, let $B$ be the closed ball in $Y$ with centre 0 and radius $\rho > 0$, let $f : I \times B \to Y$, $g : I \times [0, \infty[ \to \mathbf{R}$, and $V : I \times B \to [0, \infty[$ be continuous, and let $t_0 \in I$, $x_0 \ge 0$. Suppose further that
(i) the maximal solution $\Psi$ of the equation $x' = g(t, x)$ satisfying $\Psi(t_0) = x_0$ and reaching to the boundary of $I \times [0, \infty[$ on the right is defined on $[t_0, \infty[$,
(ii) for all $(t, y)$ for which $t \in\, ]a, \infty[$, $\| y \| < \rho$, and $V(t, y) > \Psi(t)$,
$$\bar{d}V(t, y)(1, f(t, y)) \le g(t, V(t, y)), \tag{7}$$
(iii) $\phi$ is a solution of $y' = f(t, y)$ on an interval $[t_0, b[$, where $t_0 < b \le \infty$, such that $V(t_0, \phi(t_0)) \le x_0$.
Then $V(t, \phi(t)) \le \Psi(t)$ for all $t \in [t_0, b[$.
Let $\chi(t) = V(t, \phi(t))$ ($t_0 \le t < b$). Then $\chi(t_0) \le x_0$, and, as in (6),
$$D^+ \chi(t) \le \bar{d}V(t, \phi(t))(1, f(t, \phi(t))) \le g(t, \chi(t)) \tag{8}$$
for all $t \in\, ]t_0, b[$ for which $\chi(t) > \Psi(t)$. The result therefore follows from (1.5.2).
Since $\Psi(t) \ge 0$, the condition (ii) is obviously satisfied if (7) holds whenever $t \in\, ]a, \infty[$ and $0 < \| y \| < \rho$.
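A scalar example shows the kind of comparison that (4.6.1) provides. In the sketch below, all concrete choices are assumptions made for illustration: $Y = \mathbf{R}$, $f(t, y) = -y - y^3$, $V(t, y) = y^2$, and $g(t, x) = -2x$, so that the derivative of $V$ along solutions is $2y(-y - y^3) = -2V - 2y^4 \le g(t, V)$, and the comparison solution with $\Psi(t_0) = x_0$ is $\Psi(t) = x_0 e^{-2(t - t_0)}$. The classical fourth-order Runge-Kutta scheme is used only as a convenient integrator.

```python
import math

def f(t, y):
    # Right-hand side of y' = f(t, y) (illustrative choice).
    return -y - y**3

def V(t, y):
    # Candidate Lyapunov function; dV(1, f) = 2*y*f(t, y) <= -2*V(t, y).
    return y * y

def rk4_step(t, y, dt):
    # One step of the classical fourth-order Runge-Kutta scheme.
    k1 = f(t, y)
    k2 = f(t + dt / 2, y + dt * k1 / 2)
    k3 = f(t + dt / 2, y + dt * k2 / 2)
    k4 = f(t + dt, y + dt * k3)
    return y + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6

t0, y0, dt, steps = 0.0, 1.0, 0.01, 500
x0 = V(t0, y0)            # so that V(t0, phi(t0)) <= x0 holds with equality

t, y = t0, y0
worst_gap = 0.0           # largest observed value of V(t, phi(t)) - Psi(t), floored at 0
for _ in range(steps):
    y = rk4_step(t, y, dt)
    t += dt
    psi = x0 * math.exp(-2.0 * (t - t0))   # solution of x' = -2x with Psi(t0) = x0
    worst_gap = max(worst_gap, V(t, y) - psi)

final_V = V(t, y)
final_psi = x0 * math.exp(-2.0 * (t - t0))
```

Throughout the run $V(t, \phi(t))$ stays below $\Psi(t)$, in line with the conclusion of (4.6.1); since $\Psi(t) \to 0$, the same comparison forces $\phi(t) \to 0$.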
We consider now a number of results concerning stability; in these, we require $Y$ to be finite-dimensional. We recall (Exercises 2.8) that if $I = [a, \infty[$, $E$ is a subset of $I \times Y$ open in $I \times Y$, $f : E \to Y$ is continuous, and the zero function is a solution of $y' = f(t, y)$ on $I$, then this zero solution is
(S) stable if for each $\varepsilon > 0$ and each $t_0 \in I$ there exists $\delta > 0$ such that, if $\phi$ is a solution of $y' = f(t, y)$ satisfying $\| \phi(t_0) \| < \delta$ and reaching to the boundary of $E$ on the right, then $\phi$ is defined on $[t_0, \infty[$ and $\| \phi(t) \| < \varepsilon$ for all $t \ge t_0$,
(AS) asymptotically stable if it is stable and in addition the $\delta$ in (S) can be chosen so that $\phi(t) \to 0$ as $t \to \infty$,
(US) uniformly stable if it is stable and the $\delta$ in (S) can be chosen to be independent of $t_0$.
The proof of the first result is modelled on the argument used above for the proof of Lagrange's theorem.
(4.6.2) Let $Y$ be finite-dimensional, let $I = [a, \infty[$, let $B$ be the closed ball in $Y$ with centre 0 and radius $\rho > 0$, and let $f : I \times B \to Y$, $g : I \times [0, \infty[ \to \mathbf{R}$, and $V : I \times B \to [0, \infty[$ be continuous. Suppose further that
(i) the equations $y' = f(t, y)$ and $x' = g(t, x)$ have $y = 0$ and $x = 0$ as solutions, i.e. $f(t, 0) = 0$ and $g(t, 0) = 0$ for all $t \in I$,
(ii) $V(t, 0) = 0$ for all $t \in I$, and for each $\varepsilon > 0$ we can find $r > 0$ such that, if $E(t, r)$ is the component of the set $\{y \in B : V(t, y) < r\}$ that contains 0, then $E(t, r) \subseteq \{y \in Y : \| y \| < \varepsilon\}$ for all $t \in I$,†
(iii) for all $t \in I^\circ$ and $0 < \| y \| < \rho$,
$$\bar{d}V(t, y)(1, f(t, y)) \le g(t, V(t, y)),$$
where $\bar{d}V$ denotes the upper Hadamard variation of $V$.
Then if the zero solution of $x' = g(t, x)$ is stable, so is the zero solution of $y' = f(t, y)$.
Suppose that the zero solution of $x' = g(t, x)$ is stable, let $t_0 \in I$, let $0 < \varepsilon < \rho$, and choose $r > 0$ so that $E(t, r) \subseteq \{y \in Y : \| y \| < \varepsilon\}$ for all $t \in I$. We can then find $\eta > 0$ so that each solution $\psi$ of $x' = g(t, x)$ satisfying $\psi(t_0) < \eta$ and reaching to the boundary of $I \times [0, \infty[$ on the right is defined on $[t_0, \infty[$ and satisfies $\psi(t) < r$ for all $t \ge t_0$. Since $y \mapsto V(t_0, y)$ is continuous and $E(t_0, r)$ is open, we can then find $\delta > 0$ such that $V(t_0, y) < \eta$ and $y \in E(t_0, r)$ whenever $\| y \| < \delta$. Let $\phi$ be a solution of $y' = f(t, y)$ satisfying $\| \phi(t_0) \| < \delta$ and reaching to the boundary of $I \times B$ on the right, and suppose that the domain of $\phi$ is $[t_0, b[$, where $t_0 < b \le \infty$. Then $V(t_0, \phi(t_0)) < \eta$, and hence the maximal solution $\Psi$ of $x' = g(t, x)$ satisfying $\Psi(t_0) = V(t_0, \phi(t_0))$ is defined on $[t_0, \infty[$, and $\Psi(t) < r$ for all $t \ge t_0$. By (4.6.1), $V(t, \phi(t)) \le \Psi(t) < r$ for all $t \in [t_0, b[$, and since $\phi(t_0) \in E(t_0, r)$ and the set $\{(t, y) : t \ge t_0,\ y \in E(t, r)\}$ is connected in $I \times Y$, this implies that $\phi(t) \in E(t, r)$ for all $t \in [t_0, b[$, whence $\| \phi(t) \| < \varepsilon$. Since $\varepsilon < \rho$, we infer from (2.4.3)(iv) that $b = \infty$, and hence the zero solution of $y' = f(t, y)$ is stable.
(4.6.3) Suppose that the conditions of (4.6.2) are satisfied, and that in
addition $y \mapsto V(t,y)$ is continuous at 0, uniformly for t in I, and 0 is an interior
point of the set $\bigcap_{t \in I} E(t,r)$ for each $r > 0$. Then if the zero solution of $x' =
g(t,x)$ is uniformly stable, so is the zero solution of $y' = f(t,y)$.
If the zero solution of $x' = g(t,x)$ is uniformly stable, the $\eta$ in the preceding
argument can be chosen to be independent of $t_0$. The additional
hypotheses then imply that the $\delta$ can also be chosen to be independent
of $t_0$, and this gives the result.
(4.6.4) Suppose that the conditions of (4.6.2) are satisfied, and that in
addition for $0 < \gamma < \rho$ there exist $\beta(\gamma) > 0$ and $\tau(\gamma) \in I$ such that $V(t,y) \ge \beta(\gamma)$
whenever $t \ge \tau(\gamma)$ and $\gamma \le \|y\| \le \rho$. Then if the zero solution of $x' = g(t,x)$
is asymptotically stable, so is the zero solution of $y' = f(t,y)$.
† Since the sets $E(t,r)$ shrink as r decreases, this is equivalent to the condition that $E(t,r)$
converges to $\{0\}$ as $r \to 0+$, uniformly for t in I. The condition is strictly weaker than the
condition that the component of the set $\{(t,y) \in I \times B : V(t,y) < r\}$ containing $I \times \{0\}$ is
contained in the cylinder $\{(t,y) \in I \times Y : \|y\| < \varepsilon\}$ for all sufficiently small r.
If the zero solution of $x' = g(t,x)$ is asymptotically stable, then, exactly
as in the proof of (4.6.2), $V(t, \phi(t)) \le \Psi(t)$ and $\|\phi(t)\| < \varepsilon$ for all $t \ge t_0$,
and in addition we can choose the $\eta$ so that $\Psi(t) \to 0$ as $t \to \infty$. This
implies that $\phi(t) \to 0$ as $t \to \infty$, for otherwise we can find $\gamma > 0$ and a
sequence $(t_n)$ tending to $\infty$ such that $\|\phi(t_n)\| \ge \gamma$ for all n. Then $\Psi(t_n) \ge
V(t_n, \phi(t_n)) \ge \beta(\gamma) > 0$ for all sufficiently large n, and this gives a contradiction.
The next result, which we deduce from (4.6.2-4), imposes conditions
on V which, although more restrictive than those in (4.6.2-4), are easier
to verify.
We use $\mathcal{K}$ to denote the set of all real-valued continuous, strictly
increasing functions on $[0, \infty[$ which take the value 0 at 0.
(4.6.5) Suppose that the hypotheses of (4.6.2) are satisfied, except that (ii) is
replaced by
(ii)' $V(t,0) = 0$ for all $t \in I$, and there exists $\kappa \in \mathcal{K}$ such that $V(t,y) \ge \kappa(\|y\|)$
whenever $t \in I$ and $\|y\| \le \rho$.
Then if the zero solution of $x' = g(t,x)$ is respectively stable or asymptotically
stable, so is the zero solution of $y' = f(t,y)$. If in addition there exists $\mu \in \mathcal{K}$
such that $V(t,y) \le \mu(\|y\|)$ whenever $t \in I$ and $\|y\| \le \rho$, then the uniform
stability of the zero solution of $x' = g(t,x)$ implies that of the zero solution of
$y' = f(t,y)$.
Here (ii)' implies that the set $E(t,r)$ is contained in the ball $\{y \in B : \|y\| <
\kappa^{-1}(r)\}$ and since $\kappa^{-1}$ is continuous at 0 and $\kappa^{-1}(0) = 0$, this implies (ii) of
(4.6.2). Since (ii)' trivially implies the additional hypothesis in (4.6.4), this
proves the statements concerning stability and asymptotic stability.
Finally, the existence of $\mu$ with the specified properties implies the
additional hypotheses in (4.6.3), and this gives the result concerning
uniform stability.
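To make the two-sided bound in (4.6.5) concrete, here is a small example of a time-dependent V squeezed between two functions of the class introduced before (4.6.5) (written $\mathcal{K}$ below). The particular V, $\kappa$ and $\mu$ are assumptions chosen for illustration, not taken from the text.

```latex
% Hypothetical example: a time-dependent V with explicit lower and upper bounds.
\[
  V(t,y) = \Bigl(1 + \frac{1}{1 + t^2}\Bigr)\,\|y\|^2 ,
  \qquad \kappa(r) = r^2 , \qquad \mu(r) = 2r^2 .
\]
% Since 1 \le 1 + 1/(1+t^2) \le 2 for all real t,
\[
  \kappa(\|y\|) \le V(t,y) \le \mu(\|y\|)
  \qquad (t \in I,\ \|y\| \le \rho),
\]
% and \kappa, \mu are continuous, strictly increasing on [0,\infty[ and vanish at 0,
% so both belong to \mathcal{K}. Here \kappa^{-1}(r) = \sqrt{r}, so that
\[
  E(t,r) \subseteq \{\, y \in B : \|y\| < \sqrt{r} \,\},
\]
% and condition (ii) of (4.6.2) holds with r = \varepsilon^2.
```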
The case where V does not depend on t is particularly simple,
since here the condition (ii) of (4.6.2) and the additional hypotheses of
(4.6.3,4) are satisfied if $V(0) = 0$ and $V(y) > 0$ for all non-zero $y \in B$. This
result is in fact contained in (4.6.5),‡ but it is worth while to state it
separately.
‡ Take $\kappa$ to be a continuous, strictly increasing function satisfying
$\kappa(r) \le \inf\{V(y) : r \le \|y\| \le \rho\}$ for $0 < r \le \rho$.
(4.6.6) Let Y be finite-dimensional, let $I = [a, \infty[$, let B be the closed
ball in Y with centre 0 and radius $\rho > 0$, and let $f : I \times B \to Y$,
$g : I \times [0, \infty[ \to \mathbf{R}$, and $V : B \to \mathbf{R}$ be continuous. Suppose further that
(i) the equations $y' = f(t,y)$ and $x' = g(t,x)$ have $y = 0$ and $x = 0$ as
solutions, i.e. $f(t,0) = 0$ and $g(t,0) = 0$ for all $t \in I$,
(ii) $V(0) = 0$, and $V(y) > 0$ for all non-zero $y \in B$,
(iii) for all $t \in I^\circ$ and $0 < \|y\| < \rho$,
$$\bar{\partial} V(y) f(t,y) \le g(t, V(y)).$$
Then if the zero solution of $x' = g(t,x)$ is respectively stable, asymptotically
stable, or uniformly stable, so is the zero solution of $y' = f(t,y)$.
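A worked instance may help to show how (4.6.6) is applied. The system below, with $Y = \mathbf{R}^2$ and the Euclidean norm, is a hypothetical example chosen so that every hypothesis can be checked by hand; since this V is Frechet differentiable, the variation in (iii) is the ordinary differential.

```latex
% Hypothetical example for (4.6.6): Y = R^2 with the Euclidean norm.
\[
  f(t,y) = (-y_1 + y_2,\; -y_1 - y_2), \qquad
  V(y) = \|y\|^2 = y_1^2 + y_2^2, \qquad
  g(t,x) = -2x .
\]
% (i)  f(t,0) = 0 and g(t,0) = 0.
% (ii) V(0) = 0 and V(y) > 0 for y \ne 0.
% (iii) V is differentiable with dV(y)h = 2\langle y, h\rangle, hence
\begin{align*}
  dV(y) f(t,y)
    &= 2\bigl( y_1(-y_1 + y_2) + y_2(-y_1 - y_2) \bigr) \\
    &= -2\,(y_1^2 + y_2^2) = g(t, V(y)).
\end{align*}
% The comparison equation x' = -2x has solutions x(t) = x_0 e^{-2(t - t_0)},
% so its zero solution is stable, asymptotically stable and uniformly stable;
% by (4.6.6) the same holds for the zero solution of y' = f(t,y).
```

The conclusion agrees with a direct computation, since along solutions $\|y(t)\|^2 = \|y(t_0)\|^2 e^{-2(t - t_0)}$.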
We note next some special cases of the condition (iii) of (4.6.2-5), and
also some alternatives to this condition.
(a) If V is Frechet differentiable (which requires Y to be real) on the set
$C = \{(t,y) \in I^\circ \times Y : 0 < \|y\| < \rho\}$, then the condition (iii) is equivalent
to the condition that for all $(t,y) \in C$
$$D_1 V(t,y) + d_2 V(t,y) f(t,y) \le g(t, V(t,y)). \tag{9}$$
For instance, if Y is a Hilbert space, and $V(t,y) = e^{-2Mt} \|y\|^2$, then
for all $y, h \in Y$
$$d_2 V(t,y) h = 2 e^{-2Mt} \langle y, h \rangle,$$
so that here (9) with $g = 0$ becomes
$$\langle y, f(t,y) \rangle \le M \|y\|^2.$$
Again, if Y is a Hilbert space, and $V(y) = \|y\|$ (so that V is independent
of t), then for all non-zero $y \in Y$ and all $h \in Y$
$$dV(y) h = \langle y, h \rangle / \|y\|,$$
so that here (9) becomes
$$\langle y, f(t,y) \rangle \le \|y\|\, g(t, \|y\|)$$
(cf. Exercises 2.5.4,5).
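The two Hilbert-space instances in (a) can be checked by direct differentiation. The sketch below assumes a real Hilbert space with inner product $\langle \cdot, \cdot \rangle$ and writes $D_1 V$ for the partial derivative of V with respect to t; this notation for the t-derivative is our assumption.

```latex
% Case 1: V(t,y) = e^{-2Mt} \|y\|^2.
\begin{align*}
  D_1 V(t,y)   &= -2M e^{-2Mt}\,\|y\|^2 , \\
  d_2 V(t,y) h &= 2 e^{-2Mt}\,\langle y, h \rangle .
\end{align*}
% Inequality (9) with g = 0 therefore reads
\[
  -2M e^{-2Mt}\,\|y\|^2 + 2 e^{-2Mt}\,\langle y, f(t,y) \rangle \le 0
  \iff
  \langle y, f(t,y) \rangle \le M \|y\|^2 .
\]
% Case 2: V(y) = \|y\| = \langle y, y \rangle^{1/2}, with y \ne 0.
\[
  dV(y) h = \frac{\langle y, h \rangle}{\|y\|} ,
\]
% so (9), which now has no t-derivative term, reads
\[
  \frac{\langle y, f(t,y) \rangle}{\|y\|} \le g(t, \|y\|)
  \iff
  \langle y, f(t,y) \rangle \le \|y\|\, g(t, \|y\|) .
\]
```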
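(b) If $V(t,y)$ is locally Lipschitzian in y on C, then the condition (iii) can be
replaced by the condition that for all $(t,y) \in C$
$$\underline{\delta} V(t,y)(1, f(t,y)) \le g(t, V(t,y)), \tag{10}$$
where $\underline{\delta} V$ is the lower Gateaux variation of V.
In fact, if $\phi$ is a solution of $y' = f(t,y)$, then (10) implies that
$$\underline{\delta} V(t, \phi(t))(1, \phi'(t)) \le g(t, V(t, \phi(t))),$$
and therefore that
$$\liminf_{\tau \to 0+} \bigl( V(t + \tau, \phi(t) + \tau \phi'(t)) - V(t, \phi(t)) \bigr)/\tau \le g(t, V(t, \phi(t))). \tag{11}$$
Since, by the Lipschitzian property,
$$\bigl( V(t + \tau, \phi(t + \tau)) - V(t + \tau, \phi(t) + \tau \phi'(t)) \bigr)/\tau
= O( \|\phi(t + \tau) - \phi(t) - \tau \phi'(t)\| / \tau ) = o(1),$$
the inequality (11) implies in turn that
$$\liminf_{\tau \to 0+} \bigl( V(t + \tau, \phi(t + \tau)) - V(t, \phi(t)) \bigr)/\tau \le g(t, V(t, \phi(t))),$$
and this can be used in the proof of (4.6.1) instead of (8).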
We note in passing that when $V(t,y)$ is locally Lipschitzian in y, then
for all $h \in Y$
$$\bar{\partial} V(t,y)(1,h) = \bar{\delta} V(t,y)(1,h). \tag{12}$$
To prove this, let $h \in Y$, let $(h_n)$ be a sequence in Y converging to h, and
let $(\tau_n)$, $(k_n)$ be sequences of positive numbers converging to 0 and 1
respectively. Then, by the Lipschitzian property,
$$\bigl( V(t + \tau_n k_n, y + \tau_n h_n) - V(t + \tau_n k_n, y + \tau_n k_n h) \bigr)/(\tau_n k_n) = O( \|h_n - k_n h\| ) = o(1).$$
We then have
$$\limsup_{n \to \infty} \bigl( V(t + \tau_n k_n, y + \tau_n h_n) - V(t,y) \bigr)/\tau_n
= \lim_{n \to \infty} k_n \cdot \limsup_{n \to \infty} \bigl( V(t + \tau_n k_n, y + \tau_n k_n h) - V(t,y) \bigr)/(\tau_n k_n)
\le \bar{\delta} V(t,y)(1,h).$$
Hence $\bar{\partial} V(t,y)(1,h) \le \bar{\delta} V(t,y)(1,h)$, and since the reversed inequality is
trivial, we obtain (12).
We mention finally a case where V is independent of t.
(c) If $V : Y \to [0, \infty[$ is continuous and sublinear, then the condition (iii)
can be replaced by the condition that for all $(t,y) \in C$
$$V(f(t,y)) \le g(t, V(y))$$
(cf. (2.5.1-3) and (2.5.5)). For, if $\phi$ is a solution of $y' = f(t,y)$, then,
by (1.6.1),
$$D^{+} V(\phi(t)) \le V(\phi'(t)) = V(f(t, \phi(t))) \le g(t, V(\phi(t))),$$
and this is the inequality employed in the proof of (4.6.1) for the case
where V is independent of t.†
† In fact, (c) implies condition (iii) itself, for V is locally Lipschitzian, and therefore, by (12),
$\bar{\partial} V(y) f(t,y) = \bar{\delta} V(y) f(t,y) \le V(f(t,y)) \le g(t, V(y))$.
We consider next conditions for the uniqueness of solutions of the
equation $y' = f(t,y)$. If we were to follow the pattern set by (2.11.2), we
should consider
$$\chi(t) = V(t, \phi_1(t) - \phi_2(t)),$$
where $\phi_1, \phi_2$ are solutions of $y' = f(t,y)$ such that $\phi_1(t_0) = \phi_2(t_0)$, and
$V(t,y) = 0$ if and only if $y = 0$. However, a further generalization is possible
here, in that we can consider
$$\omega(t) = V(t, \phi_1(t), \phi_2(t)),$$
where now $V(t,y,z) = 0$ if and only if $y = z$. Since
$$D^{+}\omega(t) \le \bar{\partial} V(t, \phi_1(t), \phi_2(t))(1, \phi_1'(t), \phi_2'(t))
= \bar{\partial} V(t, \phi_1(t), \phi_2(t))(1, f(t, \phi_1(t)), f(t, \phi_2(t))), \tag{13}$$
the condition we have to use is that
$$\bar{\partial} V(t,y,z)(1, f(t,y), f(t,z)) \le g(t, V(t,y,z)), \tag{14}$$
where g satisfies an appropriate uniqueness condition.
In (4.6.7) below, we suppose that g satisfies the Iyanaga-Kamke condition
(I-K) of §2.11, viz.:
(I-K) g is a continuous function from the rectangle $]t_0, t_0 + \alpha] \times [0, \beta]$ in
$\mathbf{R}^2$ into $[0, \infty[$, satisfying $g(t,0) = 0$ for all $t \in ]t_0, t_0 + \alpha]$, with the property
that, for each $t_1 \in ]t_0, t_0 + \alpha]$, $x = 0$ is the only solution of $x' = g(t,x)$ on
$]t_0, t_1]$ such that
$$x(t_0+) = 0 \quad\text{and}\quad \lim_{t \to t_0+} x(t)/(t - t_0) = 0.$$
We recall (cf. (2.11.1)) that if g satisfies condition (I-K) and
$\omega : [t_0, t_0 + \alpha] \to [0, \beta]$ is a continuous function such that $\omega(t_0) = \omega'(t_0) = 0$
and that $D^{+}\omega(t) \le g(t, \omega(t))$ for nearly all $t \in ]t_0, t_0 + \alpha[$, then $\omega = 0$.
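Two standard comparison functions may help to fix ideas about condition (I-K). They are offered as illustrations and are not taken from the text; the constant $L > 0$ is an assumption.

```latex
% Example 1 (Lipschitz-type comparison function), with L > 0 assumed:
\[
  g(t,x) = Lx , \qquad
  x' = Lx \;\Longrightarrow\; x(t) = c\, e^{L(t - t_0)} .
\]
% The requirement x(t_0+) = 0 forces c = 0, so x = 0 is the only such solution.
%
% Example 2 (Nagumo-type comparison function):
\[
  g(t,x) = \frac{x}{t - t_0} , \qquad
  x' = \frac{x}{t - t_0} \;\Longrightarrow\; x(t) = c\,(t - t_0) .
\]
% Every one of these solutions satisfies x(t_0+) = 0, so the first limit
% condition alone does not single out x = 0; it is the second condition,
\[
  \lim_{t \to t_0+} \frac{x(t)}{t - t_0} = c = 0 ,
\]
% that forces x = 0.
```

The second example shows why (I-K) asks for both limit conditions rather than only $x(t_0+) = 0$.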
(4.6.7) Let Y be a Banach space, let $J = [t_0, t_0 + \alpha]$, let E be a subset of
$J \times Y$ containing $(t_0, y_0)$, and let $f : E \to Y$ be continuous at $(t_0, y_0)$. Suppose
further that
(i) $V : ]t_0, t_0 + \alpha] \times Y^2 \to [0, \infty[$ is continuous, and for $t_0 < t \le t_0 + \alpha$,
$V(t,y,z) = 0$ if and only if $y = z$,
(ii) there exist $L > 0$ and $\lambda \in ]0, \alpha]$ such that $V(t,y,z) \le L \|y - z\|$ whenever
$t_0 < t \le t_0 + \lambda$, $\|y - y_0\| \le \lambda$ and $\|z - y_0\| \le \lambda$,
(iii) g satisfies condition (I-K) and (14) holds whenever $t \in J^\circ$, $(t,y), (t,z) \in E$,
$y \ne z$, and $V(t,y,z) \le \beta$.
Then the equation $y' = f(t,y)$ has at most one solution on J taking the
value $y_0$ at $t_0$.
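Let $\phi_1, \phi_2$ be solutions of $y' = f(t,y)$ on J such that $\phi_1(t_0) = \phi_2(t_0) = y_0$,
and let
$$\omega(t) = V(t, \phi_1(t), \phi_2(t)) \quad (t_0 < t \le t_0 + \alpha), \qquad \omega(t_0) = 0.$$
Since $\phi_1, \phi_2$ are continuous at $t_0$, we can find $\mu \in ]0, \lambda]$ such that
$\|\phi_1(t) - y_0\| \le \lambda$ and $\|\phi_2(t) - y_0\| \le \lambda$ whenever $t_0 \le t \le t_0 + \mu$, whence,
by (ii),
$$\omega(t) \le L \|\phi_1(t) - \phi_2(t)\|$$
for $t_0 < t \le t_0 + \mu$. In particular, this shows that $\omega(t) \to 0$ as $t \to t_0+$,
so that $\omega$ is continuous at $t_0$ and therefore is continuous on J. Further,
since f is continuous at $(t_0, y_0)$, given $\varepsilon > 0$ we can find $\delta \in ]0, \mu]$ such
that
$$\|f(t, \phi_1(t)) - f(t, \phi_2(t))\| \le \varepsilon$$
whenever $t_0 < t \le t_0 + \delta$. Hence for all such t we have
$$\|\phi_1(t) - \phi_2(t)\| \le \varepsilon (t - t_0),$$
and therefore, by (ii) again, $0 \le \omega(t) \le L\varepsilon(t - t_0)$, which in turn implies
that $\omega'(t_0) = 0$.
We observe now that if $\phi_1 \ne \phi_2$, we can find $t_1 \in J$ such that $\omega(t_1) > 0$.
Hence we can find a subinterval $[\sigma, \tau]$ of $[t_0, t_1]$ such that $\omega(\sigma) = 0$ and that
$0 < \omega(t) \le \beta$ for all $t \in ]\sigma, \tau]$; by (13) and (14) we then have
$D^{+}\omega(t) \le g(t, \omega(t))$ for all $t \in ]\sigma, \tau[$. If $\sigma = t_0$, then $\omega = 0$ on $[\sigma, \tau]$, by (2.11.1).
Similarly, if $\sigma > t_0$, and $\tilde{\omega}(t) = 0$ $(t_0 \le t \le \sigma)$, $\tilde{\omega}(t) = \omega(t)$ $(\sigma < t \le \tau)$,
then $\tilde{\omega} = 0$, again by (2.11.1). Thus in either case we obtain a contradiction,
and this completes the proof.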
The condition (14) can be replaced by various alternatives, as in (a)-(c)
above. For example, if $V(t,y,z)$ is locally Lipschitzian in $(y,z)$, then (14)
can be replaced by the condition that
$$\underline{\delta} V(t,y,z)(1, f(t,y), f(t,z)) \le g(t, V(t,y,z)).$$
We can also subsume condition (ii) of (4.6.7) in further differentiability
conditions, as in the following corollary.
(4.6.7. Corollary 1) Let D be the diagonal $\{(y,y) : y \in Y\}$ in $Y^2$. Then
the result of (4.6.7) continues to hold if the conditions (ii) and (iii) are replaced
by
(ii)' V is Gateaux differentiable on $J^\circ \times (Y^2 \setminus D)$ (which implies that Y
must be real), and for each bounded set A in $Y^2$ the partial differentials
$\delta_2 V$ and $\delta_3 V$ are bounded on $J^\circ \times (A \setminus D)$,
(iii)' g satisfies condition (I-K) and
$$\delta V(t,y,z)(1, f(t,y), f(t,z)) \le g(t, V(t,y,z))$$
whenever $t \in J^\circ$, $(t,y), (t,z) \in E$, $y \ne z$, and $V(t,y,z) \le \beta$.

Let $R > 0$, and let A be the set of points $(t,y,z)$ such that $t \in J^\circ$, $\|y\| \le R$,
$\|z\| \le R$. It is enough to prove that if V satisfies (i) and (ii)', then $V(t,y,z)$
is Lipschitzian in $(y,z)$ in A, for this trivially implies the condition (ii) of
(4.6.7); also, by an argument similar to that used in the proof of (12), it
implies that V is Hadamard differentiable on $J^\circ \times (Y^2 \setminus D)$, so that
(iii)' implies (iii).
Let K be the greater of the suprema of $\|\delta_2 V(t,u,v)\|$ and $\|\delta_3 V(t,u,v)\|$
for $(t,u,v) \in A$, $(u,v) \notin D$, and let $(t,y,z), (t,y',z') \in A$. If both $(y,z), (y',z') \in D$,
then
$$V(t,y,z) - V(t,y',z') = 0.$$
On the other hand, if at least one of $(y,z), (y',z')$ is not in D, then, by (4.1.3),
$$|V(t,y,z) - V(t,y',z')| \le |V(t,y,z) - V(t,y',z)| + |V(t,y',z) - V(t,y',z')|
\le |\delta_2 V(t,\xi,z)(y - y')| + |\delta_3 V(t,y',\eta)(z - z')|$$
for some $\xi, \eta$ on the open segments in Y joining y to y' and z to z' such that
$(\xi, z), (y', \eta) \notin D$ (note that there is at most one point of D on each of the
segments in $Y^2$ joining $(y,z)$ to $(y',z)$ and $(y',z)$ to $(y',z')$). Since $(t,\xi,z)$,
$(t,y',\eta) \in A$ we therefore have
$$|V(t,y,z) - V(t,y',z')| \le K( \|y - y'\| + \|z - z'\| ),$$
as required.
As a further corollary, we have the following result when $V(t,y,z) =
V(t, y - z)$.
(4.6.7. Corollary 2) Let $J = [t_0, t_0 + \alpha]$, let E be a subset of $J \times Y$
containing $(t_0, y_0)$, and let $f : E \to Y$ be continuous at $(t_0, y_0)$. Suppose further
that
(i) $V : ]t_0, t_0 + \alpha] \times Y \to [0, \infty[$ is continuous, and for $t_0 < t \le t_0 + \alpha$,
$V(t,y) = 0$ if and only if $y = 0$,
(ii) V is Gateaux differentiable on $J^\circ \times (Y \setminus \{0\})$, and for each bounded
set A in Y the partial differential $\delta_2 V$ is bounded on $J^\circ \times (A \setminus \{0\})$,
(iii) g satisfies condition (I-K) and
$$\delta V(t, y - z)(1, f(t,y) - f(t,z)) \le g(t, V(t, y - z))$$
whenever $t \in J^\circ$, $(t,y), (t,z) \in E$, $y \ne z$, and $V(t, y - z) \le \beta$.
Then the equation $y' = f(t,y)$ has at most one solution on J taking the
value $y_0$ at $t_0$.
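As an illustration of Corollary 2, the following sketch takes Y to be a real Hilbert space and $V(t,y) = \|y\|$. The choice of V and the constant $L > 0$ are assumptions made for the example; the resulting condition is a one-sided Lipschitz condition on f.

```latex
% Hypothetical example: Y a real Hilbert space, V(t,y) = \|y\|.
% (i)  V is continuous and V(t,y) = 0 if and only if y = 0.
% (ii) For y \ne 0 the Gateaux differential is
\[
  \delta V(t,y)(s,h) = \frac{\langle y, h \rangle}{\|y\|} ,
  \qquad \|\delta_2 V(t,y)\| = 1 ,
\]
% so \delta_2 V is bounded on J^\circ \times (A \setminus \{0\}) for every bounded A.
% (iii) With increment (1, f(t,y) - f(t,z)) at the point (t, y - z):
\[
  \frac{\langle y - z,\; f(t,y) - f(t,z) \rangle}{\|y - z\|}
  \le g(t, \|y - z\|) .
\]
% Taking g(t,x) = Lx with L > 0 (which satisfies (I-K)), this becomes
\[
  \langle y - z,\; f(t,y) - f(t,z) \rangle \le L\, \|y - z\|^2 ,
\]
% and the corollary then gives at most one solution through (t_0, y_0).
```

Unlike an ordinary Lipschitz condition, this inequality bounds the difference $f(t,y) - f(t,z)$ only in the direction of $y - z$.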
We consider finally existence theorems corresponding to (4.6.7).
As in (2.11.3), we require that f is continuous, and therefore our results
are of interest only when Y is infinite-dimensional.
In the following result, we suppose for simplicity that V is independent
of t; the resulting loss in generality is in fact small, since the comparison
function g itself depends on t.
(4.6.8) Let Y be a real Banach space, let $J = [t_0, t_0 + \alpha]$, let B be the closed
ball in Y with centre $y_0$ and radius $\rho > 0$, and let $f : J \times B \to Y$ be a continuous
function such that $\|f(t,y)\| \le M$ for all $(t,y) \in J \times B$. Let also D be
the diagonal $\{(y,y) : y \in Y\}$ in $Y^2$, and suppose that
(i) $V : Y^2 \to [0, \infty[$ is continuous, and $V(y,y) = 0$ for all $y \in Y$,
(ii) V is Gateaux differentiable on $Y^2 \setminus D$, and $\delta V$ is bounded on $A \setminus D$ for
each bounded set A in $Y^2$,
(iii) for each $\varepsilon > 0$ there exists $\delta > 0$ such that if $V(y,z) < \delta$ then
$\|y - z\| < \varepsilon$,†
(iv) g satisfies condition (I-K) and
$$\delta V(y,z)(f(t,y), f(t,z)) \le g(t, V(y,z)) \tag{15}$$
whenever $(t,y), (t,z) \in J^\circ \times B^\circ$, $y \ne z$, and $V(y,z) \le \beta$.
Then if $\eta = \min\{\alpha, \rho/M\}$, the equation $y' = f(t,y)$ has a solution on
$[t_0, t_0 + \eta]$ taking the value $y_0$ at $t_0$. Moreover, if in addition $V(y,z) > 0$
whenever $y \ne z$, this solution is unique.
Let K be the supremum of $\|\delta V(y,z)\|$ for $(y,z) \in B^2 \setminus D$. Then, by (4.1.6),
for all $(y,z), (y',z') \in B^2$ we have
$$|V(y,z) - V(y',z')| \le K( \|y - y'\|^2 + \|z - z'\|^2 )^{1/2} \tag{16}$$
(note that the segment in $Y^2$ joining $(y,z)$ to $(y',z')$ contains at most one
point of D unless it lies wholly in D, in which case $V(y,z) = V(y',z') = 0$).
In particular, we deduce from (16) that
$$0 \le V(y,z) = V(y,z) - V(z,z) \le K \|y - z\| \tag{17}$$
whenever $(y,z) \in B^2$. We observe also that (16) and (ii) together imply
that V is Hadamard differentiable on the set specified in (iv) where (15)
holds.
Let $I = [t_0, t_0 + \eta]$, and let $(\varepsilon_n)$ be a decreasing sequence of positive
numbers converging to 0. By (2.3.1), for each positive integer n we can
find an $\varepsilon_n$-approximate solution $\psi_n$ of the equation $y' = f(t,y)$ on I,
satisfying $\psi_n(t_0) = y_0$, and with the property that for all $s, t \in I$
$$\|\psi_n(s) - \psi_n(t)\| \le M |s - t| \tag{18}$$
(so that $\psi_n(t) \in B^\circ$ whenever $t_0 < t < t_0 + \eta$).
† This is equivalent to the condition that if $(y_n)$, $(z_n)$ are sequences such that $V(y_n, z_n) \to 0$,
then $y_n - z_n \to 0$.
For $m > n \ge 1$ let
$$\sigma_{m,n}(t) = V(\psi_m(t), \psi_n(t)) \quad (t_0 < t \le t_0 + \eta), \qquad \sigma_{m,n}(t_0) = 0.$$
Then $\sigma_{m,n}$ is obviously continuous on $I \setminus \{t_0\}$. Further, by (17) we have
$$0 \le \sigma_{m,n}(t) \le K \|\psi_m(t) - \psi_n(t)\| \le 2KM(t - t_0) \tag{19}$$
for all $t \in I$, and hence $\sigma_{m,n}$ is continuous at $t_0$ and therefore on I.
Moreover, by (16) and (18), for all $s, t \in I$ we have
$$|\sigma_{m,n}(s) - \sigma_{m,n}(t)| \le KM\sqrt{2}\, |s - t|. \tag{20}$$
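Next, since f is continuous at $(t_0, y_0)$, given $\varepsilon > 0$ we can find $\lambda \in ]0, \eta]$
such that $\|f(t,y) - f(t,z)\| \le \varepsilon$ whenever $t_0 \le t \le t_0 + \lambda$, $\|y - y_0\| \le \lambda$,
$\|z - y_0\| \le \lambda$. Hence if $t_0 < t < t_0 + \min\{\lambda, \lambda/M\}$, then
$$\|f(t, \psi_m(t)) - f(t, \psi_n(t))\| \le \varepsilon,$$
whence also
$$\|\psi_m'(t) - \psi_n'(t)\| \le \varepsilon + \varepsilon_m + \varepsilon_n \le \varepsilon + 2\varepsilon_n.$$
It follows that for such t we have
$$\|\psi_m(t) - \psi_n(t)\| \le (\varepsilon + 2\varepsilon_n)(t - t_0),$$
whence, by (19),
$$0 \le \sigma_{m,n}(t) \le K(\varepsilon + 2\varepsilon_n)(t - t_0). \tag{21}$$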
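Now let $\mu = \min\{\eta, \beta/(2KM)\}$, and let $I_1 = [t_0, t_0 + \mu]$. Then, by (19),
$0 \le \sigma_{m,n}(t) \le \beta$ for all $t \in I_1$. Moreover, since $\psi_m$ and $\psi_n$ are respectively
$\varepsilon_m$- and $\varepsilon_n$-approximate solutions and $\varepsilon_m \le \varepsilon_n$, the bound $\|\delta V\| \le K$ shows
that replacing the increment $(\psi_m'(t), \psi_n'(t))$ of $\delta V(\psi_m(t), \psi_n(t))$ by
$(f(t, \psi_m(t)), f(t, \psi_n(t)))$ changes its value by at most $2K\varepsilon_n$.
Hence, by (15), for all $t \in I_1^\circ$ we have
$$D^{+}\sigma_{m,n}(t) \le g(t, \sigma_{m,n}(t)) + 2K\varepsilon_n. \tag{22}$$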
For each positive integer n let $\omega_n = \sup_{m > n} \sigma_{m,n}$. Then $\omega_n(t_0) = 0$, and, by
(20), (22) and (2.11.5),
$$|\omega_n(s) - \omega_n(t)| \le KM\sqrt{2}\, |s - t|$$
for all $s, t \in I_1$, and
$$D^{+}\omega_n(t) \le g(t, \omega_n(t)) + 2K\varepsilon_n \tag{23}$$
for all $t \in I_1^\circ$. Moreover, by (21),
$$0 \le \omega_n(t) \le K(\varepsilon + 2\varepsilon_n)(t - t_0) \tag{24}$$
whenever $t_0 < t < t_0 + \min\{\lambda, \lambda/M\}$. The sequence $(\omega_n)$ is therefore
equicontinuous and uniformly bounded on $I_1$, and hence (A.1.2) it has a subsequence
$(\omega_{n_r})$ converging uniformly on $I_1$ to a function $\omega$, and clearly $\omega(t_0) = 0$.
By (23) and (2.11.4), for all $t \in I_1^\circ$ we have
$$D^{+}\omega(t) \le g(t, \omega(t)).$$
Further, by (24), $0 \le \omega(t) \le K\varepsilon(t - t_0)$ whenever $t_0 < t < t_0 +
\min\{\lambda, \lambda/M\}$. Hence $\omega'(t_0) = 0$, and therefore $\omega = 0$, by (2.11.1).
Now let $\varepsilon' > 0$, and choose $\delta > 0$ so that if $V(y,z) < \delta$ then $\|y - z\| < \varepsilon'$.
Since $\omega = 0$, we can find an integer N such that $\sigma_{m,n_r}(t) < \delta$ whenever
$m > n_r$, $r \ge N$, and $t \in I_1$, and hence we have also $\|\psi_m(t) - \psi_{n_r}(t)\| < \varepsilon'$.
This implies in turn that $\|\psi_m(t) - \psi_n(t)\| < 2\varepsilon'$ whenever $m, n > n_N$ and
$t \in I_1$, i.e. $(\psi_n)$ converges uniformly on $I_1$. Hence $\phi = \lim \psi_n$ is a solution of
$y' = f(t,y)$ on $I_1$ such that $\phi(t_0) = y_0$. If $\mu = \eta$, the proof is now complete.
On the other hand, if $\mu < \eta$, we can repeat the argument starting from the
point $(t_0 + \mu, \phi(t_0 + \mu))$, and after a finite number of such repetitions
we obtain the required solution on I.
Uniqueness is immediate from (4.6.7. Corollary 1).
We conclude by stating without proof an extension of (4.6.8) in which
V may depend on t.
(4.6.9) Let Y be a real Banach space, let $J = [t_0, t_0 + \alpha]$, let B be the closed
ball in Y with centre $y_0$ and radius $\rho > 0$, and let $f : J \times B \to Y$ be a continuous
function such that $\|f(t,y)\| \le M$ for all $(t,y) \in J \times B$. Let also
D be the diagonal $\{(y,y) : y \in Y\}$ in $Y^2$, and suppose that
(i) $V : ]t_0, t_0 + \alpha] \times Y^2 \to [0, \infty[$ is continuous, and $V(t,y,y) = 0$ for all
$t \in ]t_0, t_0 + \alpha]$ and $y \in Y$,
(ii) V is Gateaux differentiable on $J^\circ \times (Y^2 \setminus D)$, and for each bounded
set A in $Y^2$ and each $\tau \in ]t_0, t_0 + \alpha[$, $\delta V$ is bounded on $[\tau, t_0 + \alpha[ \times (A \setminus D)$,
(iii) there exist $L > 0$ and $\lambda \in ]0, \alpha]$ such that $V(t,y,z) \le L \|y - z\|$ whenever
$t_0 < t \le t_0 + \lambda$, $\|y - y_0\| \le \lambda$ and $\|z - y_0\| \le \lambda$,
(iv) for each $\varepsilon > 0$ there exists $\delta > 0$ such that if $V(t,u,v) < \delta$ then
$\|u - v\| < \varepsilon$,
(v) g satisfies condition (I-K) and
$$\delta V(t,y,z)(1, f(t,y), f(t,z)) \le g(t, V(t,y,z))$$
whenever $(t,y), (t,z) \in J^\circ \times B$, $y \ne z$, and $V(t,y,z) \le \beta$.
Then if $\eta = \min\{\alpha, \rho/M\}$, the equation $y' = f(t,y)$ has a solution on
$[t_0, t_0 + \eta]$ taking the value $y_0$ at $t_0$. Moreover, if in addition $V(t,y,z) > 0$
whenever $t \in J^\circ$ and $y \ne z$, this solution is unique.
4.7 Historical note on differentials

The motivation for differentiating functions of a vector variable came
from the calculus of variations. If a good way of finding maxima or minima
of ordinary functions was through the differential calculus, then perhaps
a similar calculus would offer the same advantages for functions of func-
tions. Volterra began the study from this viewpoint in 1887 (though
soon, using a geometrical perspective, he came to call the subject 'functions
of lines'). Work in this field, which is still continuing, formed a substantial
part of the inspiration for the formation of functional analysis.
In his paper of 1887, Volterra (1887a) considers a function f defined on a
subset of the space $C[a,b]$ of continuous real-valued functions defined on
the interval $[a,b]$, and in effect calculates the Gateaux variation of f at
$\phi$ for the increment $\psi$. If we assume that f is Frechet differentiable at $\phi$
(which Volterra's conditions imply), this variation is simply $df(\phi)\psi$.
Now $df(\phi)$ belongs to the dual space of $C[a,b]$ and so by the Riesz
representation theorem corresponds to a measure; we assume that this
measure is absolutely continuous, and denote it by $f'(\phi, s)\,ds$. The variation
then appears in the form
$$\int_a^b f'(\phi, s)\,\psi(s)\,ds,$$
which is the formula given by Volterra. Of course, Volterra lacked the
advantages conferred by measure theory. His strategy was first to proceed
locally: fix $s \in [a,b]$, let $\varepsilon > 0$, take $\psi \ge 0$ with support in $[s - \varepsilon, s + \varepsilon]$,
and define $f'(\phi, s)$ to be the limit of the expression
$$\frac{f(\phi + \psi) - f(\phi)}{\int_a^b \psi(u)\,du}$$
as $\sup \psi \to 0$ and $\varepsilon \to 0$. He then proceeded to obtain his result for 'global'
variations $\psi$.
Volterra (1887a) also gave a Taylor expansion for a function of functions.
This necessitated producing a higher variation and this appeared in the
form
$$\int_a^b \cdots \int_a^b f^{(n)}(\phi, s_1, \ldots, s_n)\,\psi(s_1) \cdots \psi(s_n)\,ds_1 \cdots ds_n.$$
(Here, as with $f'$ above, $f^{(n)}$ is not an ordinary derivative.) Summaries of
this part of his work are given in Volterra's book (1913).
In 1911, Frechet took a decisive step by declaring that a real-valued
function f of two real variables was differentiable at a point $(a,b)$ if and
only if its graph had a tangent plane at that point. Later that year he
conceded that the analytic expression asserting this,
$$f(a + h, b + k) - f(a,b) = D_1 f(a,b)h + D_2 f(a,b)k + \varepsilon(h,k)(h^2 + k^2)^{1/2} \tag{1}$$
where $\varepsilon(h,k) \to 0$ as $h, k \to 0$, had been given by Young (1909a). Young, as
stated in his book (1910), was aiming at 'rigidity of proof and novelty
of treatment', a target which he must be judged to have hit, but he himself
concedes priority in using (1) to Stolz (1893).
In his note of 1911, Frechet points out that the formula (1) can be
generalized to infinite-dimensional spaces, provided a linear functional
is used to replace the first two terms on the right-hand side and a metric
is given on the space to provide the possibility of convergence.† He also
asserts that the elementary properties of differentials can easily be
transferred to the infinite-dimensional case.
† Perhaps Frechet allowed his enthusiasm to carry him away here. As Nashed (1971, p. 116)
points out, this definition is not what is required in a linear metric space, even if the
dimension is 1. However, it was probably a normed space which Frechet had in mind.
Two years later, a new differential appeared in a note by Gateaux
(1913). Unfortunately, Gateaux was killed very early in the Great War,
and it was not until 1922 that a full version of his work appeared. In
this latter paper, the importance of the linearity of the variation is stressed,
and the conclusion of (4.1.4) is obtained under stronger hypotheses
(joint continuity in x and h). Gateaux's papers were prepared for publication
by Levy, who in his book (1922) also emphasizes the value of the
linearity of the variation and points out that, even with linearity imposed
as an additional requirement, the Gateaux differential is more general
than that of Frechet.
Until 1925, the functions involved were all scalar-valued (and the
word 'functional' was used to describe such a mapping). In that year,
Frechet recognized that his definition of 1911 needed very little modifica-
tion to apply to functions between normed spaces. The announcement of
this in Frechet (1925a) is discussed at length in Frechet (1925b).
With Hildebrandt and Graves (1927), calculus in normed spaces really
becomes a subject of its own. It is interesting that they consider it worth
while to begin with a list of axioms for a normed space (the algebraic as
well as the analytic); the basic concepts were not yet common knowledge.
They work with the Frechet differential and prove its uniqueness, the
result of (3.1.1), and that a function with a bounded Frechet differential in
an open set satisfies a Lipschitz condition (3.2.2); they define partial
differentials on a product of normed spaces and show that the existence
and continuity of the partial differentials implies the existence and
continuity of the differential (3.3.3); their higher order differentials are
multilinear (§3.5); and they give a complicated form of the chain rule.
The highlight of the paper is a local implicit function theorem which they
establish using the contraction mapping principle (a result which they
also prove). They establish that the implicitly defined function is of
class Cr if the requisite conditions are satisfied (cf. (3.8.1)). (The finite-
dimensional version of the latter result is due to Young (1909b).)
In a later paper in the same volume of the Transactions of the American
Mathematical Society, Graves (1927a) gives a version of Taylor's theorem
with an integral form of the remainder (cf. Exercise 3.6.3). He generally
works with the Gateaux differential but does also prove that the higher
order Frechet differential is symmetric (3.5.7).
Perhaps the oldest notion of derivative for a function of a vector
variable is that of the gradient of a real-valued function defined on a
Euclidean space. This idea was placed in an abstract (Hilbert space)
setting by Golomb (1935), who essentially gives the definition of Exercise
3.5.1, although he does not mention the Frechet differential explicitly.
An earlier gradient in function spaces had been described by Courant
and Hilbert (1930) in their chapter on the calculus of variations. (This
material was not in the first German edition of the book but is translated
in the first English edition of 1953.) Unfortunately, the text is not very
precise about the domains of definition of its functional or the conditions

under which the gradient is defined, but it is clear that the underlying
concept of differential is of Hadamard rather than Frechet type.
The rigorous definition of the Hadamard differential appears in the
work of Frechet (1937). Frechet complains that the Gateaux differential,
even when required to be linear as advocated by Levy, is seriously defective
in that it fails to satisfy the chain rule. He therefore proposes a new
differential for which the condition of (4.2.8)(iii) is taken as definition, i.e.
that the chain rule should hold for compositions with functions whose
domains are intervals. This idea was used by Hadamard (1923) to make
sense of the formula
$$dz = \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy$$
and so Frechet names this new differential after him.
Frechet proves that for finite-dimensional spaces, the Frechet and
Hadamard differentials coincide (see above, p. 266). He also considers
the example $M(x) = \max_{0 \le t \le 1} |x(t)|$ for x in $C[0,1]$ to show that these two
differentials may be distinct in infinite dimensions. Perhaps a little
strangely, Frechet does not go on to prove that the Hadamard differential
satisfies the general chain rule (4.2.4). This task is left to Ky Fan (1942)
who also shows that a Hadamard differentiable function is continuous
(4.2.6).
In the second version of his book, Levy (1951) accepts Frechet's criticism
of the Gateaux differential. He points out that the weaker the definition
of differential, the fewer will be its properties, and he gives a long dis-
cussion of the merits of various possible definitions (pp. 37-40). His
conclusion is that, despite the drawback that it depends on the norm in
the space, the Frechet differential is preferable because it is simpler and
more natural.
Levy's conclusion seems to be shared by most 'pure' analysts, and
it is the Frechet differential which forms the subject of textbooks on
calculus (see Cartan (1967) or Dieudonne (1960)). Those who work in the
calculus of variations, however, take a different view. Vainberg (1956)
finds the Gateaux differential valuable and discusses its relationship
with the Frechet differential; he proves (4.1.7. Corollary 1) and gives an
example of a function everywhere Gateaux differentiable but nowhere
Frechet differentiable. Nashed (1971) finds all three differentials of
importance and includes many others. He gives a detailed study of the
connections between various differentials and has extensive information
on the history of the subject. Most of the results on Gateaux and

Hadamard variations which have not been ascribed to their originators
here may be found in Nashed's paper, though it should be mentioned that
Sova (1964) was responsible for the characterizations of Hadamard
differentiability in (4.2.8).
The subject is now developing in the direction of greater abstraction,
principally to differentiation in linear topological spaces. The Gateaux
differential has a great advantage here, since it is independent of the
topology placed on the spaces involved. However, all the differentials
mentioned have several generalizations to topological vector spaces:
see, for example, the paper by Nashed (1971) already cited, or Averbukh
and Smolyanov (1967).
