
1 Introduction to optimization

1.1 Notations and definitions


An optimization problem is defined by:

1. a criterion (or cost function, or objective function): a function $J$ defined on $V$ with values in $\mathbb{R}$, where $V$ is the space (a normed vector space) in which the problem is set, also called the space of control variables.
2. some constraints, for example:
   (a) $v \in K$, where $K$ is a subset of $V$;
   (b) equality constraints: $F(v) = 0$, where $F : V \to \mathbb{R}^m$ ($m$ real constraints $F_i(v) = 0$);
   (c) inequality constraints: $G(v) \le 0$, where $G : V \to \mathbb{R}^p$ ($p$ constraints $G_i(v) \le 0$). If $G_i(v) = 0$, the constraint is active, or saturated; if $G_i(v) < 0$, the constraint is inactive;
   (d) a functional equation: the constraint is given by an ODE or a PDE to be satisfied, as in the optimal control of an evolution problem.

The different types of constraints can be combined: $v \in K$ and $F(v) = 0$ and $G(v) \le 0$.

We denote by U the set of admissible elements of the problem:

\[ U = \{ v \in V : v \text{ satisfies all the constraints} \}. \]

Minimization problem:

\[ (P) \quad \text{Find } u \in U \text{ such that } J(u) \le J(v), \quad \forall v \in U. \]

Maximization problem: defined in the same way, with $J(u) \ge J(v)$.

u is the optimal solution, or the solution of the optimization problem. J(u) is the optimal
value of the criterion.

Local optimum: $\bar u$ is a local optimum if there exists a neighborhood $\mathcal{V}(\bar u)$ such that $J(\bar u) \le J(v)$, $\forall v \in \mathcal{V}(\bar u)$.

Note that a global optimum is a local optimum, but the converse is not true
(except in the convex case, see Proposition 2.5 below).
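
To illustrate (a made-up example, not from the notes): a non-convex criterion can have a local minimum that is not global, so a local search can get trapped depending on its starting point.

```python
import numpy as np
from scipy.optimize import minimize

# Made-up non-convex criterion with two local minima.
J = lambda x: x**4 / 4 - x**2 / 2 + 0.1 * x

# Local minimization from two different starting points.
for x0 in (1.0, -1.0):
    res = minimize(J, np.array([x0]))
    print(res.x, res.fun)

# The run started at x0 = 1.0 stops at a local (non-global) minimum;
# for a convex criterion both runs would reach the same value.
```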

1.2 Examples
• Finite dimension: $J : \mathbb{R}^n \to \mathbb{R}$
• Infinite dimension: $J : V \to \mathbb{R}$

Finite dimension:

1.2.1 Example 1: linear problem


Food rationing (e.g. during wars):
$n$ types of food,
$m$ food components (proteins, vitamins, ...),
$c_j$: unit price of food $j$,
$v_j$: quantity of food $j$,
$a_{ij}$: quantity of component $i$ per unit of food $j$,
$b_i$: minimal (vital) quantity of component $i$.

Food ration of minimal cost: minimize
\[ J(v) = \sum_{j=1}^{n} c_j v_j, \]
under the constraints $v_j \ge 0$ and $\displaystyle\sum_{j=1}^{n} a_{ij} v_j \ge b_i$, $i = 1, \dots, m$.
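
This is a linear program, which a standard solver handles directly. A minimal sketch (the data values below are made up for illustration, and the $\ge$ constraints are rewritten as $\le$ ones):

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative data: n = 3 foods, m = 2 components (made-up values).
c = np.array([2.0, 3.5, 1.0])          # c_j: unit price of food j
A = np.array([[10.0, 20.0, 5.0],       # a_ij: component i per unit of food j
              [1.0, 2.0, 8.0]])
b = np.array([50.0, 25.0])             # b_i: vital quantity of component i

# linprog minimizes c^T v subject to A_ub v <= b_ub and bounds v_j >= 0,
# so sum_j a_ij v_j >= b_i is passed as -A v <= -b.
res = linprog(c, A_ub=-A, b_ub=-b, bounds=[(0, None)] * 3)
print(res.x, res.fun)                  # optimal ration and its cost J(v)
```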

1.2.2 Example 2: least squares problem


Overdetermined linear system $Av = b$, where $A$ is an $m \times n$ matrix of rank $n$, with $n < m$, and $v \in \mathbb{R}^n$. Minimize
\[ J(v) = \|Av - b\|^2_{\mathbb{R}^m}. \]
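
Since $A$ has full column rank, the minimizer is the unique solution of the normal equations $A^\top A v = A^\top b$. A small numerical sketch with made-up data:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 8, 3                              # made-up sizes with n < m
A = rng.standard_normal((m, n))          # full column rank almost surely
b = rng.standard_normal(m)

# Minimizer of J(v) = ||Av - b||^2 (lstsq avoids forming A^T A explicitly).
v, *_ = np.linalg.lstsq(A, b, rcond=None)

# Check against the normal equations A^T A v = A^T b.
print(np.allclose(A.T @ A @ v, A.T @ b))   # True
```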


Infinite dimension:

1.2.3 Calculus of variations


$V$ functional space,
\[ J(v) = \int L(x, v(x), P v(x), \dots)\, dx, \]
where $P$ is a differential operator.
Problem: $\displaystyle\inf_{v \in V} J(v)$.

1.2.4 Example 3: optimal trajectory

From the point $(a, y_0)$, reach $(b, y_1)$ in minimal time. The trajectory is the graph of a function $y(x)$, and the speed at the point $(x, y(x))$ is $c(x, y(x))$.
Boundary conditions: $y(a) = y_0$, $y(b) = y_1$.

\[ c(x, y(x)) = \frac{\sqrt{dx^2 + y'^2\, dx^2}}{dt} = \frac{dx}{dt} \sqrt{1 + y'^2}, \]
hence
\[ dt = \frac{\sqrt{1 + y'^2}}{c(x, y(x))}\, dx, \]
and then the travel time
\[ J(y) = \int_a^b \frac{\sqrt{1 + y'^2}}{c(x, y(x))}\, dx \]
is to be minimized under the constraints $y(a) = y_0$, $y(b) = y_1$, and $y \in V = C^1([a, b])$.
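
A hedged numerical sketch (illustrative speed field and endpoint data, midpoint quadrature on piecewise-linear paths; none of this is prescribed by the notes):

```python
import numpy as np
from scipy.optimize import minimize

a, b, y0, y1 = 0.0, 1.0, 0.0, 1.0      # made-up endpoints

def c(x, y):
    return 1.0 + 0.5 * y               # illustrative speed field c(x, y)

N = 50
x = np.linspace(a, b, N + 1)

def travel_time(y_int):
    # Piecewise-linear path through the fixed endpoints (the constraints).
    y = np.concatenate(([y0], y_int, [y1]))
    yp = np.diff(y) / np.diff(x)       # slope y' on each subinterval
    xm, ym = 0.5 * (x[:-1] + x[1:]), 0.5 * (y[:-1] + y[1:])
    # Midpoint rule for J(y) = int_a^b sqrt(1 + y'^2) / c(x, y) dx.
    return np.sum(np.sqrt(1.0 + yp**2) / c(xm, ym) * np.diff(x))

# Start from the straight line joining the endpoints.
res = minimize(travel_time, np.linspace(y0, y1, N + 1)[1:-1])
print(res.fun)                         # approximate minimal travel time
```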

1.2.5 Example 4: geodesic
A geodesic is the shortest path between points on a given space (e.g. on the Earth’s surface).
With $x = x(u, v)$, $y = y(u, v)$, $z = z(u, v)$,
\[ ds^2 = dx^2 + dy^2 + dz^2 = (x_u\, du + x_v\, dv)^2 + (y_u\, du + y_v\, dv)^2 + (z_u\, du + z_v\, dv)^2 = e\, du^2 + 2f\, du\, dv + g\, dv^2, \]
where $e = x_u^2 + y_u^2 + z_u^2$, $f = x_u x_v + y_u y_v + z_u z_v$, $g = x_v^2 + y_v^2 + z_v^2$.
The shortest path between the points $(u_0, v_0)$ and $(u_1, v_1)$ is given by a function $v : [u_0, u_1] \to \mathbb{R}$ solution of
\[ \inf J(v) = \int_{u_0}^{u_1} \sqrt{e + 2 f v' + g v'^2}\; du \]
under the constraints $v(u_0) = v_0$, $v(u_1) = v_1$, and $v \in C^1([u_0, u_1])$.
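
As a sanity check (a concrete surface chosen for illustration, not taken from the notes), the coefficients $e, f, g$ can be computed symbolically for the unit sphere with $u$ the colatitude and $v$ the longitude; the metric reduces to $ds^2 = du^2 + \sin^2 u\, dv^2$, whose minimizing paths are the great circles.

```python
import sympy as sp

u, v = sp.symbols('u v')
# Unit sphere: u = colatitude, v = longitude (illustrative choice).
x = sp.sin(u) * sp.cos(v)
y = sp.sin(u) * sp.sin(v)
z = sp.cos(u)

xu, yu, zu = sp.diff(x, u), sp.diff(y, u), sp.diff(z, u)
xv, yv, zv = sp.diff(x, v), sp.diff(y, v), sp.diff(z, v)

e = sp.simplify(xu**2 + yu**2 + zu**2)   # -> 1
f = sp.simplify(xu*xv + yu*yv + zu*zv)   # -> 0
g = sp.simplify(xv**2 + yv**2 + zv**2)   # -> sin(u)**2
print(e, f, g)                           # ds^2 = du^2 + sin(u)^2 dv^2
```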

1.2.6 Other examples


• Energy principle (or variational principle) for a PDE (see the sketch after this list):
\[ \inf J(v) = \frac{1}{2} \int_\Omega |\nabla v|^2\, dx - \int_\Omega f v\, dx \]

• Inverse problems: identification and estimation of parameters, e.g.
\[ -\operatorname{div}(K(x) \nabla u) = f, \]
where $K$ is unknown:
\[ J = \sum_i \left[ u(x_i) - u_i \right]^2. \]
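
To make the energy principle concrete, here is a minimal sketch assuming a 1D setting: $\Omega = (0, 1)$, homogeneous Dirichlet conditions, and $f = 1$ (all illustrative assumptions). Minimizing the discretized energy recovers the finite-difference solution of $-u'' = f$:

```python
import numpy as np
from scipy.optimize import minimize

# Assumed setting: Omega = (0, 1), v(0) = v(1) = 0, f = 1 (made up).
N = 40
x = np.linspace(0.0, 1.0, N + 1)
h = x[1] - x[0]
f = np.ones_like(x)

def energy(v_int):
    v = np.concatenate(([0.0], v_int, [0.0]))   # Dirichlet boundary values
    grad = np.diff(v) / h                       # v' on each cell
    return 0.5 * h * np.sum(grad**2) - h * np.sum(f * v)

res = minimize(energy, np.zeros(N - 1))
u = np.concatenate(([0.0], res.x, [0.0]))

# For f = 1 the exact minimizer is u(x) = x(1 - x)/2.
print(np.max(np.abs(u - x * (1 - x) / 2)))      # small discretization error
```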

2 Optimality conditions
Elementary remark: consider
\[ \min_{[a, b] \subset \mathbb{R}} J. \]

If $x_0$ is a local minimum of $J$, then (necessary condition):

• $J'(x_0) \ge 0$ if $x_0 = a$,
• $J'(x_0) = 0$ if $x_0 \in\, ]a, b[$,
• $J'(x_0) \le 0$ if $x_0 = b$.
Indeed, if $x_0 \in [a, b[$, we can consider $x = x_0 + h$ for some $h > 0$ small enough:
\[ J(x) = J(x_0) + h J'(x_0) + o(h) \ge J(x_0) \ \Rightarrow\ J'(x_0) \ge 0. \]
If $x_0 \in\, ]a, b]$, take $x = x_0 - h$, and then $J'(x_0) \le 0$. Note that if $x_0 \in\, ]a, b[$, then $J'(x_0) = 0$, so
\[ J(x) = J(x_0) + \frac{h^2}{2} J''(x_0) + o(h^2) \ge J(x_0), \]
and then $J''(x_0) \ge 0$.
One must take the constraint ($x \in [a, b]$) into account when testing optimality: only admissible directions can be used. For instance, $J(x) = (x - 2)^2$ on $[0, 1]$ attains its minimum at $x_0 = b = 1$, where $J'(1) = -2 \le 0$.

2.1 Fréchet and Gâteaux differentiability


For clarity, we may denote by $dJ$ the Fréchet derivative and by $J'$ the Gâteaux derivative.

Definition 2.1 $J : V \to H$ has a directional derivative at the point $u \in V$ in the direction $v \in V$ if
\[ \lim_{\theta \to 0^+} \frac{J(u + \theta v) - J(u)}{\theta} \]
exists, and we denote this limit by $J'(u; v)$.

Example: $V = \mathbb{R}^n$ and $H = \mathbb{R}$; the directions $e_i$ give the partial derivatives of $J$:
\[ \frac{J(u + \theta e_i) - J(u)}{\theta} \to \frac{\partial J}{\partial x_i}(u). \]
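
A quick numerical illustration (the function and data are made up): the one-sided difference quotient approximates $J'(u; v)$, and with $v = e_1$ it recovers $\partial J / \partial x_1$.

```python
import numpy as np

def directional_derivative(J, u, v, theta=1e-6):
    """One-sided difference quotient (J(u + theta*v) - J(u)) / theta."""
    return (J(u + theta * v) - J(u)) / theta

J = lambda w: w[0]**2 + 3 * w[0] * w[1]   # illustrative J on R^2
u = np.array([1.0, 2.0])
v = np.array([1.0, 0.0])                  # direction e_1

# dJ/dx_1 = 2x + 3y = 8 at u = (1, 2).
print(directional_derivative(J, u, v))    # ~ 8.0
```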
Definition 2.2 $J : V \to H$ is Gâteaux differentiable at the point $u$ if
• $J'(u; v)$ exists, $\forall v \in V$,
• and $v \mapsto J'(u; v)$ is a linear continuous map:
\[ J'(u; v) = (J'(u), v), \qquad J'(u) \in \mathcal{L}(V, H), \]
and we denote by $J'(u)$ the Gâteaux derivative of $J$ at $u$.

Definition 2.3 $J : V \to H$ is Fréchet differentiable at the point $u$ if there exists $J'(u) \in \mathcal{L}(V, H)$ such that
\[ J(u + v) = J(u) + (J'(u), v) + \|v\|\, \varepsilon(v), \]
where $\varepsilon(v) \to 0$ when $v \to 0$.

Proposition 2.1 If $J$ is Fréchet differentiable at $u$, then $J$ is Gâteaux differentiable at $u$, and the two derivatives are equal.

Proof
$J(u + \theta v) = J(u) + (J'(u), \theta v) + \|\theta v\|\, \varepsilon(\theta v)$, and then
\[ \frac{J(u + \theta v) - J(u)}{\theta} = (J'(u), v) + \|v\|\, \varepsilon(\theta v), \]
and $\varepsilon(\theta v) \to 0$ when $\theta \to 0^+$. Then the directional derivative exists in every direction, and $J'(u)$ is the Gâteaux derivative. □

Remark: the converse is not true. Consider the following function on $\mathbb{R}^2$:
\[ J(x, y) = 1 \ \text{if } y = x^2 \text{ and } x \ne 0; \qquad J(x, y) = 0 \ \text{otherwise}. \]
This function is Gâteaux differentiable at $0$ (with $J'(0) = 0$). But it is not continuous at $0$, and hence cannot have a Fréchet derivative at $0$.
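
A small numerical check of this remark (pure illustration): every directional difference quotient at the origin vanishes for $\theta$ small, yet $J = 1$ on the parabola arbitrarily close to $0$.

```python
def J(x, y):
    # 1 on the parabola y = x^2 (except at the origin), 0 elsewhere.
    return 1.0 if (y == x * x and x != 0.0) else 0.0

# Directional difference quotients at 0 all vanish for small theta ...
theta = 1e-4
for v in [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, -3.0)]:
    print((J(theta * v[0], theta * v[1]) - J(0.0, 0.0)) / theta)  # 0.0

# ... yet J is not continuous at 0: J = 1 on the parabola near the origin.
x = 1e-6
print(J(x, x * x))   # 1.0
```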

Finite-dimensional case: $V = \mathbb{R}^n$ and $H = \mathbb{R}$. If $J$ is Gâteaux differentiable at $u$, then all partial derivatives of $J$ exist at $u$ and
\[ (J'(u), v) = \sum_{i=1}^{n} \frac{\partial J}{\partial x_i}(u)\, v_i = \langle \nabla J(u), v \rangle. \]

Example 1: $J(v) = Av$ where $A \in \mathcal{L}(V, H)$. Then $J'(u) = A$, $\forall u$.


Example 2: $J(v) = \frac{1}{2} a(v, v) - L(v)$, where $a$ is a continuous symmetric bilinear form, and $L$ is a continuous linear form. By bilinearity and symmetry of $a$,
\[ J(u + v) = J(u) + a(u, v) - L(v) + \frac{1}{2} a(v, v). \]
As $a$ is continuous, $|a(v, v)| \le M \|v\|^2$, so the last term is of the form $\|v\|\, \varepsilon(v)$, and then
\[ (J'(u), v) = a(u, v) - L(v). \]
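
In $\mathbb{R}^n$, taking $a(u, v) = u^\top A v$ with $A$ symmetric and $L(v) = b^\top v$ gives $\nabla J(u) = Au - b$; a small numerical check with made-up data:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
M = rng.standard_normal((n, n))
A = M + M.T                        # symmetric matrix: a(u, v) = u^T A v
b = rng.standard_normal(n)         # linear form: L(v) = b^T v

J = lambda v: 0.5 * v @ A @ v - b @ v
u, v = rng.standard_normal(n), rng.standard_normal(n)

# The difference quotient should approach a(u, v) - L(v) = u^T A v - b^T v.
theta = 1e-6
fd = (J(u + theta * v) - J(u)) / theta
print(fd, u @ A @ v - b @ v)       # the two values agree up to O(theta)
```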

Second (Gâteaux) derivative: if
\[ \frac{J'(u + \theta v; w) - J'(u; w)}{\theta} \]
has a finite limit when $\theta \to 0^+$, $\forall u, v, w \in V$, then we denote this limit by $J''(u; v, w)$, and it is the second directional derivative at the point $u$ in the directions $v$ and $w$.
If $(v, w) \mapsto J''(u; v, w)$ is a continuous bilinear form, then $J''(u)$ is called the second derivative of $J$, or Hessian of $J$.

Finite increments and Taylor formula:

Proposition 2.2 If $J : V \to \mathbb{R}$ is Gâteaux differentiable at $u + \theta v$ for all $\theta \in [0, 1]$, then $\exists \theta_0 \in\, ]0, 1[$ such that
\[ J(u + v) = J(u) + (J'(u + \theta_0 v), v). \]

Proof
Let $f(\theta) = J(u + \theta v)$, $f : \mathbb{R} \to \mathbb{R}$. Then
\[ f'(\theta) = \lim_{\varepsilon \to 0} \frac{f(\theta + \varepsilon) - f(\theta)}{\varepsilon} = \lim_{\varepsilon \to 0} \frac{J(u + \theta v + \varepsilon v) - J(u + \theta v)}{\varepsilon} = (J'(u + \theta v), v). \]
Moreover, by the mean value theorem, $\exists \theta_0 \in\, ]0, 1[$ such that $f(1) = f(0) + f'(\theta_0)$. □

Proposition 2.3 If $J : V \to \mathbb{R}$ is twice Gâteaux differentiable at $u + \theta v$ for all $\theta \in [0, 1]$, then $\exists \theta_0 \in\, ]0, 1[$ such that
\[ J(u + v) = J(u) + (J'(u), v) + \frac{1}{2} J''(u + \theta_0 v; v, v). \]

Proof
Same as before, with $f''(\theta) = J''(u + \theta v; v, v)$. □

2.2 Convexity and Gâteaux differentiability


Let $V$ be a Banach space over $\mathbb{R}$, and $K \subset V$ a convex subset.

Definition 2.4 $J : K \to \mathbb{R}$ is a convex function if
\[ J((1 - \theta) u + \theta v) \le (1 - \theta) J(u) + \theta J(v), \quad \forall \theta \in [0, 1],\ \forall u, v \in K. \]
$J$ is called strictly convex if
\[ J((1 - \theta) u + \theta v) < (1 - \theta) J(u) + \theta J(v), \quad \forall \theta \in\, ]0, 1[,\ \forall u \ne v \in K. \]
$J$ is $\alpha$-convex ($\alpha > 0$), or strongly convex, if
\[ J((1 - \theta) u + \theta v) \le (1 - \theta) J(u) + \theta J(v) - \frac{\alpha}{2}\, \theta (1 - \theta) \|u - v\|^2, \quad \forall \theta \in [0, 1],\ \forall u, v \in K. \]
Proposition 2.4 $\alpha$-convex $\Rightarrow$ strictly convex $\Rightarrow$ convex.
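
A quick numerical illustration (made-up example): $J(v) = \|v\|^2$ is $\alpha$-convex with $\alpha = 2$, and the strong-convexity inequality of Definition 2.4 holds, here with equality for every $\theta$.

```python
import numpy as np

rng = np.random.default_rng(2)
J = lambda v: float(v @ v)        # J(v) = ||v||^2 is alpha-convex, alpha = 2

u, v = rng.standard_normal(3), rng.standard_normal(3)
alpha = 2.0
for theta in np.linspace(0.0, 1.0, 5):
    lhs = J((1 - theta) * u + theta * v)
    rhs = ((1 - theta) * J(u) + theta * J(v)
           - 0.5 * alpha * theta * (1 - theta) * float((u - v) @ (u - v)))
    assert lhs <= rhs + 1e-12     # here lhs == rhs for every theta
print("alpha-convexity inequality verified")
```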

Proposition 2.5 Let $J : K \to \mathbb{R}$ be a convex function; then every local minimum of $J$ is a global minimum.

Proof
Let $u$ be a local minimum of $J$: $J(u) \le J(v)$, $\forall v \in \mathcal{V}(u) \cap K$. If $u$ is not a global minimum, $\exists w \in K$ such that $J(w) < J(u)$. Then, for $\theta \in\, ]0, 1]$, by convexity,
\[ J((1 - \theta) u + \theta w) \le (1 - \theta) J(u) + \theta J(w) < J(u). \]
For $\theta > 0$ small enough, $(1 - \theta) u + \theta w \in \mathcal{V}(u) \cap K$, and there is a contradiction. □

Proposition 2.6 If $J$ is strictly convex on a convex set $K$, then $J$ has at most one minimum on $K$.

Proof
Let $u_1 \ne u_2$ be two (local $\Rightarrow$ global) minima of $J$. Then $J(u_1) = J(u_2)$. By strict convexity, $J(w) < J(u_1)$ for all $w \in\, ]u_1, u_2[$, which contradicts the minimality of $u_1$. □

Proposition 2.7 If $J : V \to \mathbb{R}$ is Gâteaux differentiable in $V$, then:

(i) $J$ is convex in $V$ $\iff$ $J(v) \ge J(u) + (J'(u), v - u)$, $\forall u, v$;
(ii) $J$ is strictly convex in $V$ $\iff$ $J(v) > J(u) + (J'(u), v - u)$, $\forall u \ne v$;
(iii) $J$ is $\alpha$-convex in $V$ $\iff$ $J(v) \ge J(u) + (J'(u), v - u) + \frac{\alpha}{2} \|v - u\|^2$, $\forall u, v$.

Proof
