Lec 20: Multivariable Cauchy Method
In this Lecture
In this lecture we will discuss two important methods for multivariable unconstrained optimization:
Cauchy's Steepest Ascent Method
Newton's Method
Example: maximize the function
f(x, y) = 2xy + 2x − x² − 2y²

Its partial derivatives are
∂f/∂x = 2y + 2 − 2x
∂f/∂y = 2x − 4y

At the point (−1, 1):
∂f/∂x = 2(1) + 2 − 2(−1) = 6
∂f/∂y = 2(−1) − 4(1) = −6

so ∇f = (6, −6)ᵀ at (−1, 1).
Along the gradient direction, points can be written as (−1 + 6h, 1 − 6h), so
g(h) = f(−1 + 6h, 1 − 6h) = −180h² + 72h − 7

Setting g′(h) = 0 yields
72 − 360h = 0, i.e. h = 0.2

If h = 0.2 maximizes g(h), then x = −1 + 6(0.2) = 0.2 and y = 1 − 6(0.2) = −0.2 maximize f(x, y) along this line.
So, moving along the direction of the gradient from the point (−1, 1), we reach the optimum along that direction (which is our next point) at (0.2, −0.2).
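This single steepest-ascent step can be reproduced numerically. A minimal MATLAB sketch (our own illustration, not from the lecture; fminbnd stands in for the analytic maximization of g(h)):

% One steepest-ascent step from (-1, 1) for f = 2xy + 2x - x^2 - 2y^2
f     = @(x, y) 2*x.*y + 2*x - x.^2 - 2*y.^2;
gradf = @(x, y) [2*y + 2 - 2*x; 2*x - 4*y];      % analytic gradient
x0 = [-1; 1];
g  = gradf(x0(1), x0(2));                        % = [6; -6]
% Maximizing g(h) = f(x0 + h*g) is minimizing its negative over h:
h  = fminbnd(@(h) -f(x0(1) + h*g(1), x0(2) + h*g(2)), 0, 1);  % h = 0.2
x1 = x0 + h*g                                    % = [0.2; -0.2]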
EXERCISE 3.4.2
Consider the Himmelblau function:
Minimize f(x, y) = (x² + y − 11)² + (x + y² − 7)²

Step 1: In order to ensure proper convergence, a large limit on the number of iterations, M = 100, is chosen.
EXERCISE 3.4.2
[Figure: 3D surface and contour plot of the Himmelblau function, with the minimum point marked.]

Contour graph:
% Matlab program to draw the contour of the function
[X, Y] = meshgrid(0:.1:5);
Z = (X.*X + Y - 11).^2 + (X + Y.*Y - 7).^2;
contour(X, Y, Z, 150);
colormap(jet);
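For the 3D graph referenced above, a minimal sketch reusing the same grid (our own addition; it assumes X, Y, Z from the program above are in the workspace):

figure;
surf(X, Y, Z);       % 3D surface of the Himmelblau function
shading interp;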
EXERCISE 3.4.2
Step 2: The derivative at x(0) = (0, 0)ᵀ is first calculated numerically, using the central-difference formula

∂f/∂xᵢ ≈ [f(xᵢ + h) − f(xᵢ − h)] / (2h)

and found to be (−14, −22)ᵀ, which is identical to the exact derivative at that point (Figure 3.12).
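A minimal MATLAB sketch of this central-difference calculation at x(0) = (0, 0)ᵀ (our own illustration; the step size h = 1e-5 is an assumption, since the lecture does not state its value):

f = @(x) (x(1)^2 + x(2) - 11)^2 + (x(1) + x(2)^2 - 7)^2;
x = [0; 0];  h = 1e-5;                      % assumed perturbation size
g = zeros(2, 1);
for i = 1:2
    e = zeros(2, 1);  e(i) = h;
    g(i) = (f(x + e) - f(x - e)) / (2*h);   % central difference
end
g                                           % = [-14; -22]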
Step 3: The magnitude of the derivative vector is not small, and k = 0 < M = 100. Thus, we do not terminate; rather, we proceed to Step 4.
EXERCISE 3.4.2
Step 4: At this step, we need to perform a line search from x(0) in the direction −∇f(x(0)) such that the function value is minimum. Along that direction, any point can be expressed by fixing a value of the parameter α(0) in the equation x = x(0) − α(0)∇f(x(0)). With

f(x, y) = (x² + y − 11)² + (x + y² − 7)²

this gives
g(α) = f(14α, 22α) = (196α² + 22α − 11)² + (484α² + 14α − 7)²
g′(α) = 2(392α + 22)(196α² + 22α − 11) + 2(968α + 14)(484α² + 14α − 7)

Setting g′(α) = 0 in the region (0, 1) yields
α(0) ≈ 0.127
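A minimal sketch of this unidirectional search (our own; fminbnd stands in for the golden-section search used in the lecture):

g  = @(a) (196*a.^2 + 22*a - 11).^2 + (484*a.^2 + 14*a - 7).^2;
a  = fminbnd(g, 0, 1)                  % ~0.127
x1 = [14*a; 22*a]                      % ~[1.788; 2.810]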
EXERCISE 3.4.2
Thus, the minimum point along the search direction is x(1) = (1.788, 2.810)ᵀ.
Step 5: Since x(1) and x(0) are quite different, we do not terminate; rather, we move back to Step 2. This completes one iteration of Cauchy's method. The total number of function evaluations required in this iteration is equal to 30.
Step 2: The derivative vector at this point, computed numerically, is (−30.7, 18.8)ᵀ.
EXERCISE 3.4.2
Step 3: The magnitude of this derivative vector is not small enough to terminate. Thus, we continue with Step 4.
Step 4: Another unidirectional search along (30.70, −18.80)ᵀ from the point x(1) = (1.788, 2.810)ᵀ using the golden-section search finds the new point
x(2) = (3.00, 1.99)ᵀ, with a function value equal to 0.018.
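Putting the five steps together, a minimal MATLAB sketch of the whole of Cauchy's method on this problem (our own illustration; the tolerance 1e-3 is an assumption, and fminbnd again replaces the golden-section search):

f = @(x) (x(1)^2 + x(2) - 11)^2 + (x(1) + x(2)^2 - 7)^2;
gradf = @(x) [4*x(1)*(x(1)^2 + x(2) - 11) + 2*(x(1) + x(2)^2 - 7);
              2*(x(1)^2 + x(2) - 11) + 4*x(2)*(x(1) + x(2)^2 - 7)];
x = [0; 0];                             % Step 1: x(0) and iteration limit M
M = 100;
for k = 0:M-1
    g = gradf(x);                       % Step 2: derivative vector
    if norm(g) < 1e-3, break; end       % Step 3: termination check (assumed tol)
    a = fminbnd(@(a) f(x - a*g), 0, 1); % Step 4: line search along -grad
    x = x - a*g;                        % Step 5: new point
end
x                                       % -> [3; 2], the minimum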
Newton's Method
This method uses second-order derivatives to create search directions.
This allows faster convergence to the minimum point.
Considering the first three terms in the Taylor series expansion of a multivariable function, it can be shown that the first-order optimality condition will be satisfied if the following search direction is used:

s(k) = −[∇²f(x(k))]⁻¹ ∇f(x(k))
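The omitted reasoning step, sketched here for completeness (the standard Taylor-expansion argument, not reproduced from the lecture): truncating the series after the quadratic term,

f(x(k) + s) ≈ f(x(k)) + ∇f(x(k))ᵀ s + ½ sᵀ ∇²f(x(k)) s

Setting the gradient of this quadratic model with respect to s to zero gives

∇f(x(k)) + ∇²f(x(k)) s = 0  ⟹  s(k) = −[∇²f(x(k))]⁻¹ ∇f(x(k))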
Newton's Method

                 One-dimensional optimization       Multi-dimensional optimization
At the optimum:  f′(xᵢ) = 0                         ∇f(x) = 0
Update:          xᵢ₊₁ = xᵢ − f′(xᵢ)/f″(xᵢ)          xᵢ₊₁ = xᵢ − Hᵢ⁻¹∇f(xᵢ)
                      = xᵢ − [f″(xᵢ)]⁻¹ f′(xᵢ)
Newton's Method
xᵢ₊₁ = xᵢ − Hᵢ⁻¹∇f(xᵢ)
Converges in a quadratic fashion.
May diverge if the starting point is not close enough to the optimum point.
Costly to evaluate H⁻¹.
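A minimal MATLAB sketch of this pure Newton iteration on the Himmelblau function (our own illustration, not the lecture's code; the analytic gradient and Hessian match the derivation in Exercise 3.4.3 below, and the tolerance 1e-6 is an assumption):

gradf = @(x) [4*x(1)*(x(1)^2 + x(2) - 11) + 2*(x(1) + x(2)^2 - 7);
              2*(x(1)^2 + x(2) - 11) + 4*x(2)*(x(1) + x(2)^2 - 7)];
hessf = @(x) [12*x(1)^2 + 4*x(2) - 42,  4*x(1) + 4*x(2);
              4*x(1) + 4*x(2),          4*x(1) + 12*x(2)^2 - 26];
x = [2; 1];                        % starting point close to the optimum
for k = 1:20
    g = gradf(x);
    if norm(g) < 1e-6, break; end  % assumed termination tolerance
    x = x - hessf(x) \ g;          % Newton step x - H^{-1}*grad
end
x                                  % -> [3; 2]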
EXERCISE 3.4.3
Consider the Himmelblau function. Minimize it using Newton's method:
f(x, y) = (x² + y − 11)² + (x + y² − 7)²
EXERCISE 3.4.3
[Figure: 3D surface of the Himmelblau function, with the minimum point marked.]
f(x, y) = (x² + y − 11)² + (x + y² − 7)²

∂f/∂x = 4x(x² + y − 11) + 2(x + y² − 7) = 4x³ + 4xy − 42x + 2y² − 14
∂f/∂y = 2(x² + y − 11) + 4y(x + y² − 7) = 2x² + 4xy + 4y³ − 26y − 22

H = [ 12x² + 4y − 42 , 4x + 4y ; 4x + 4y , 4x + 12y² − 26 ]

At (0, 0), ∇f = (−14, −22)ᵀ and H = [ −42 , 0 ; 0 , −26 ].
x(1) = x(0) − αH⁻¹∇f
     = (0, 0)ᵀ − α (1/1092) [ −26 , 0 ; 0 , −42 ] (−14, −22)ᵀ
     = (−0.333α, −0.846α)ᵀ
Repeating the computation from a starting point closer to the minimum, x(0) = (2, 1):

At (2, 1), ∇f = (−56, −28)ᵀ and H = [ 10 , 12 ; 12 , −6 ].
x(1) = x(0) − αH⁻¹∇f
     = (2, 1)ᵀ − α (1/−204) [ −6 , −12 ; −12 , 10 ] (−56, −28)ᵀ
     = (2 + 3.29α, 1 + 1.92α)ᵀ
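A quick numerical check of this step (our own sketch, plugging in the values just computed):

g0 = [-56; -28];          % grad f at (2, 1), from above
H0 = [10 12; 12 -6];      % Hessian at (2, 1), from above
d  = H0 \ g0              % = [-3.29; -1.92]
% hence x(1) = [2; 1] - a*d = [2 + 3.29a; 1 + 1.92a]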
EXERCISE 3.4.3
This method is effective for initial points close to the optimum. This demands some knowledge of the optimum point.
Computation of the Hessian matrix and its inverse is computationally expensive.