
Chapter 2. Unconstrained Optimization

2.1 Introduction
Many engineering, economic and planning problems can be posed as optimization problems, i.e. as the problem of determining the points of minimum of a function (possibly in the presence of conditions on the decision variables). Moreover, numerical problems, such as solving systems of equations or inequalities, can also be posed as optimization problems.
We start with the study of optimization problems in which the decision variables are defined in $\mathbb{R}^n$: unconstrained optimization problems. More precisely, we study the problem of determining local minima for differentiable functions. Although these methods are seldom used on their own in applications, since in real problems the decision variables are subject to constraints, the techniques of unconstrained optimization are instrumental in solving more general problems: the knowledge of good methods for local unconstrained minimization is a necessary prerequisite for the solution of constrained and global minimization problems.
The methods that will be studied can be classified from various points of view. The most interesting classification is based on the information available on the function to be optimized, namely:

- methods without derivatives (direct search, finite differences; see the sketch after this list);
- methods based on the knowledge of the first derivatives (gradient, conjugate directions, quasi-Newton);
- methods based on the knowledge of the first and second derivatives (Newton).
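To make the first class concrete, here is a minimal sketch (not from the text; the test function, the point and the step $h$ are illustrative choices) of the forward finite-difference approximation of the gradient, the basic ingredient used when $\nabla f$ is not available analytically:

```python
import numpy as np

def finite_difference_gradient(f, x, h=1e-6):
    """Approximate the gradient of f at x by forward differences."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    fx = f(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - fx) / h  # i-th partial derivative
    return g

# Example: f(x) = (x^1)^2 + 2 (x^2)^2, with exact gradient [2 x^1, 4 x^2]'.
f = lambda x: x[0]**2 + 2.0 * x[1]**2
print(finite_difference_gradient(f, [1.0, 1.0]))  # approximately [2. 4.]
```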

2.2 Definitions and existence conditions


Consider the optimization problem:

Problem 1 Minimize
$$f(x) \quad \text{subject to} \quad x \in F,$$
in which $f : \mathbb{R}^n \to \mathbb{R}$ and$^1$ $F \subseteq \mathbb{R}^n$.
With respect to this problem we introduce the following definitions.

Definition 1 A point $x \in F$ is a global minimum$^2$ for Problem 1 if
$$f(x) \le f(y)$$
for all $y \in F$.
A point $x \in F$ is a strict (or isolated) global minimum (or minimiser) for Problem 1 if
$$f(x) < f(y)$$
for all $y \in F$ such that $y \neq x$.


A point $x \in F$ is a local minimum (or minimiser) for Problem 1 if there exists $\varepsilon > 0$ such that
$$f(x) \le f(y)$$
for all $y \in F$ such that $\|y - x\| < \varepsilon$.
A point $x \in F$ is a strict (or isolated) local minimum (or minimiser) for Problem 1 if there exists $\varepsilon > 0$ such that
$$f(x) < f(y)$$
for all $y \in F$ such that $\|y - x\| < \varepsilon$ and $y \neq x$.

$^1$The set $F$ may be specified by equations of the form (1.1) and/or (1.2).
$^2$Alternatively, the term global minimiser can be used to denote a point at which the function $f$ attains its global minimum.
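The distinction between local and global minima can be seen on a simple example (added for illustration; the function and the grid are arbitrary choices): $f(x) = x^4 - 3x^2 + x$ has two strict local minima, only one of which is global. A rough numerical scan locates both:

```python
import numpy as np

f = lambda x: x**4 - 3.0 * x**2 + x

xs = np.linspace(-3.0, 3.0, 60001)   # fine grid on an interval
ys = f(xs)

# An interior grid point lower than both neighbours approximates a local minimum.
is_loc_min = (ys[1:-1] < ys[:-2]) & (ys[1:-1] < ys[2:])
print("local minima near:", xs[1:-1][is_loc_min])  # approx. -1.30 and 1.13
print("global minimum near:", xs[np.argmin(ys)])   # approx. -1.30
```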

Definition 2 If $x \in F$ is a local minimum for Problem 1 and if $x$ is in the interior of $F$, then $x$ is an unconstrained local minimum of $f$ in $F$.

The following result provides a sufficient, but not necessary, condition for the existence of
a global minimum for Problem 1.

Proposition 1 Let $f : \mathbb{R}^n \to \mathbb{R}$ be a continuous function and let $F \subset \mathbb{R}^n$ be a compact set$^3$. Then there exists a global minimum of $f$ in $F$.

In unconstrained optimization problems the set $F$ coincides with $\mathbb{R}^n$, hence the above statement cannot be used to establish the existence of global minima. To address the existence problem it is necessary to consider the structure of the level sets of the function $f$. See also Section 1.2.3.

Definition 3 Let $f : \mathbb{R}^n \to \mathbb{R}$. A level set of $f$ is any non-empty set described by
$$L(\alpha) = \{x \in \mathbb{R}^n : f(x) \le \alpha\},$$
with $\alpha \in \mathbb{R}$.

For convenience, if $x_0 \in \mathbb{R}^n$ we denote with $L_0$ the level set $L(f(x_0))$. Using the concept of level sets it is possible to establish a simple sufficient condition for the existence of global solutions for an unconstrained optimization problem.

Proposition 2 Let $f : \mathbb{R}^n \to \mathbb{R}$ be a continuous function. Assume there exists $x_0 \in \mathbb{R}^n$ such that the level set $L_0$ is compact. Then there exists a point of global minimum of $f$ in $\mathbb{R}^n$.

Proof. By Proposition 1 there exists a global minimum $x^\star$ of $f$ in $L_0$, i.e. $f(x^\star) \le f(x)$ for all $x \in L_0$. However, if $x \notin L_0$ then $f(x) > f(x_0) \ge f(x^\star)$, hence $x^\star$ is a global minimum of $f$ in $\mathbb{R}^n$. $\square$
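For instance, consider $f(x) = x^2$ on $\mathbb{R}$: for any $x_0$, the level set $L_0 = \{x : x^2 \le x_0^2\} = [-|x_0|, |x_0|]$ is compact, and indeed the global minimum $x^\star = 0$ exists. By contrast, $f(x) = e^x$ is bounded below but attains no minimum: its level sets
$$L(\alpha) = \{x \in \mathbb{R} : e^x \le \alpha\} = (-\infty, \ln \alpha], \qquad \alpha > 0,$$
are closed but unbounded, so the proposition does not apply.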

It is obvious that the structure of the level sets of the function $f$ plays a fundamental role in the solution of Problem 1. The following result provides a necessary and sufficient condition for the compactness of all level sets of $f$.
$^3$A compact set is a bounded and closed set.

Proposition 3 Let $f : \mathbb{R}^n \to \mathbb{R}$ be a continuous function. All level sets of $f$ are compact if and only if, for any sequence $\{x_k\}$, one has
$$\lim_{k \to \infty} \|x_k\| = \infty \;\Rightarrow\; \lim_{k \to \infty} f(x_k) = \infty.$$

Remark. In general $x_k \in \mathbb{R}^n$, namely
$$x_k = \begin{bmatrix} x_k^1 \\ x_k^2 \\ \vdots \\ x_k^n \end{bmatrix},$$
i.e. we use superscripts to denote components of a vector. $\square$

A function that satisfies the condition of the above proposition is said to be radially unbounded. For instance, $f(x) = \|x\|^2$ is radially unbounded, whereas $f(x_1, x_2) = (x_1 - x_2)^2$ is not, since $f$ is constant along the unbounded line $x_1 = x_2$.
Proof. We only prove the necessity. Suppose all level sets of $f$ are compact. Then, proceeding by contradiction, suppose there exist a sequence $\{x_k\}$ such that $\lim_{k \to \infty} \|x_k\| = \infty$ and a number $\gamma > 0$ such that $f(x_k) < \gamma$ for all $k$. As a result
$$\{x_k\} \subset L(\gamma).$$
However, by compactness of $L(\gamma)$ it is not possible that $\lim_{k \to \infty} \|x_k\| = \infty$. $\square$
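As a concrete check (an added sketch, using the two functions from the example above), one can evaluate both functions along a sequence whose norm diverges; the radially unbounded one diverges with it, the other does not:

```python
import numpy as np

f = lambda x: x @ x                      # ||x||^2, radially unbounded
g = lambda x: (x[0] - x[1])**2           # not radially unbounded

for k in (1, 10, 100, 1000):
    xk = np.array([float(k), float(k)])  # ||xk|| grows without bound
    print(k, f(xk), g(xk))               # f(xk) diverges, g(xk) stays 0
```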

Definition 4 Let $f : \mathbb{R}^n \to \mathbb{R}$. A vector $d \in \mathbb{R}^n$ is said to be a descent direction for $f$ at $x^\star$ if there exists $\bar{\alpha} > 0$ such that
$$f(x^\star + \alpha d) < f(x^\star)$$
for all $\alpha \in (0, \bar{\alpha})$.

If the function $f$ is differentiable it is possible to give a simple condition guaranteeing that a certain direction is a descent direction.

Proposition 4 Let $f : \mathbb{R}^n \to \mathbb{R}$ and assume$^4$ $\nabla f$ exists and is continuous. Let $x^\star$ and $d$ be given. Then, if $\nabla f(x^\star)'d < 0$, the direction $d$ is a descent direction for $f$ at $x^\star$.

Proof. Note that $\nabla f(x^\star)'d$ is the directional derivative of $f$ (which is differentiable by hypothesis) at $x^\star$ along $d$, i.e.
$$\nabla f(x^\star)'d = \lim_{\alpha \to 0^+} \frac{f(x^\star + \alpha d) - f(x^\star)}{\alpha},$$

and this is negative by hypothesis. As a result, for $\alpha > 0$ sufficiently small,
$$f(x^\star + \alpha d) - f(x^\star) < 0,$$
hence the claim. $\square$

$^4$We denote with $\nabla f$ the gradient of the function $f$, i.e. $\nabla f = \left[ \frac{\partial f}{\partial x^1}, \ldots, \frac{\partial f}{\partial x^n} \right]'$. Note that $\nabla f$ is a column vector.

[Figure 2.1: Geometrical interpretation of the anti-gradient. The level curves $f(x) = f(x^\star)$, $f(x) = f(x_1) > f(x^\star)$ and $f(x) = f(x_2) > f(x_1)$ are shown, with $f$ increasing across them, together with the anti-gradient and a descent direction at $x^\star$.]

The proposition establishes that if $\nabla f(x^\star)'d < 0$ then, for sufficiently small positive displacements along $d$ starting at $x^\star$, the function $f$ is decreasing. It is also obvious that if $\nabla f(x^\star)'d > 0$ then $d$ is a direction of ascent, i.e. the function $f$ is increasing for sufficiently small positive displacements from $x^\star$ along $d$. If $\nabla f(x^\star)'d = 0$ then $d$ is orthogonal to $\nabla f(x^\star)$ and it is not possible to establish, without further knowledge of the function $f$, the nature of the direction $d$.
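As a quick numerical check of Proposition 4, the following sketch (the function, the point and the direction are illustrative choices, not taken from the text) verifies that a direction with $\nabla f(x^\star)'d < 0$ yields decrease for small positive steps:

```python
import numpy as np

f = lambda x: x[0]**2 + 2.0 * x[1]**2                  # illustrative function
grad_f = lambda x: np.array([2.0 * x[0], 4.0 * x[1]])  # its exact gradient

x = np.array([1.0, 1.0])
d = np.array([-1.0, 0.0])                # candidate direction

print("grad_f(x)' d =", grad_f(x) @ d)   # -2.0 < 0: d is a descent direction

for alpha in (1e-1, 1e-2, 1e-3):
    print(alpha, f(x + alpha * d) < f(x))  # True for all these step lengths
```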
From a geometrical point of view (see also Figure 2.1), the sign of the directional derivative $\nabla f(x^\star)'d$ gives information on the angle between $d$ and the direction of the gradient at $x^\star$, provided $\nabla f(x^\star) \neq 0$. If $\nabla f(x^\star)'d > 0$ the angle between $\nabla f(x^\star)$ and $d$ is acute. If $\nabla f(x^\star)'d < 0$ the angle between $\nabla f(x^\star)$ and $d$ is obtuse. Finally, if $\nabla f(x^\star)'d = 0$, and $\nabla f(x^\star) \neq 0$, then $\nabla f(x^\star)$ and $d$ are orthogonal. Note that the gradient $\nabla f(x^\star)$, if it is nonzero, is a direction orthogonal to the level surface $\{x : f(x) = f(x^\star)\}$ and it is a direction of ascent, hence the anti-gradient $-\nabla f(x^\star)$ is a descent direction.
Remark. The scalar product $x'y$ between two vectors $x$ and $y$ can be used to define the angle between $x$ and $y$. To this end, define the angle between $x$ and $y$ as the number $\theta \in [0, \pi]$ such that$^5$
$$\cos\theta = \frac{x'y}{\|x\|_E \, \|y\|_E}.$$
If $x'y = 0$ one has $\cos\theta = 0$ and the vectors are orthogonal, whereas if $x$ and $y$ have the same direction, i.e. $x = \lambda y$ with $\lambda > 0$, $\cos\theta = 1$. $\square$
$^5$ $\|x\|_E$ denotes the Euclidean norm of the vector $x$, i.e. $\|x\|_E = \sqrt{x'x}$.
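The formula above is straightforward to evaluate numerically; the sketch below (an illustration with arbitrarily chosen vectors) computes $\theta$ and confirms that a descent direction forms an obtuse angle with the gradient:

```python
import numpy as np

def angle(x, y):
    """Angle theta in [0, pi] between vectors x and y."""
    c = (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return np.arccos(np.clip(c, -1.0, 1.0))  # clip guards against round-off

grad = np.array([2.0, 4.0])   # gradient at some point (illustrative values)
d = np.array([-1.0, 0.0])     # a direction with grad' d = -2 < 0
print(angle(grad, d) > np.pi / 2)  # True: obtuse angle, descent direction
```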
