
Chapter 6
UNCONSTRAINED MULTIVARIABLE OPTIMIZATION
Methods
1. Function values only (grid search)
2. First derivatives of f (gradient and conjugate direction methods)
3. Second derivatives of f (e.g., Newton's method)
4. Quasi-Newton methods
Grid Search
• The function is evaluated at every point of a grid.
• The extreme value found at one particular grid point is taken as the optimum.
Gradient Method
(1) Calculate a search direction $s^k$
(2) Select a step length $\alpha^k$ in that direction to reduce $f(x)$

$$x^{k+1} = x^k + \alpha^k s^k = x^k + \Delta x^k$$
Gradient Method: Steepest Descent (Search Direction)

$$s^k = -\nabla f(x^k)$$

(No need to normalize.) The method terminates at any stationary point. Why? Because

$$\nabla f(x) = 0$$

there, so the procedure can stop at a saddle point. We need to show that $H(x^*)$ is positive definite for a minimum.
Gradient Method: Step Length

How to pick the step length $\alpha$:
• analytically
• numerically
Analytical Method
How does one minimize a function in a search direction using an analytical method?

It means $s$ is fixed and you want to pick $\alpha$, the step length that minimizes $f(x)$. Note that $\Delta x^k = \alpha s^k$.

$$f(x^k + \alpha s^k) \approx f(x^k) + \nabla^T f(x^k)\,\Delta x^k + \tfrac{1}{2}(\Delta x^k)^T H(x^k)\,\Delta x^k$$

$$\frac{df(x^k + \alpha s^k)}{d\alpha} = 0 = \nabla^T f(x^k)\,s^k + \alpha\,(s^k)^T H(x^k)\,s^k$$

Solve for $\alpha$:

$$\alpha = -\,\frac{\nabla^T f(x^k)\,s^k}{(s^k)^T H(x^k)\,s^k} \qquad (6.9)$$

This yields a minimum of the approximating (quadratic) function.
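To make the analytical step concrete, here is a minimal steepest-descent sketch that uses Eq. (6.9) for the step length at every iteration. It assumes NumPy; the quadratic test function $f(x) = (x_1-3)^2 + 9(x_2-5)^2$ is the one used in the conjugate-gradient example later in the deck, and the helper names grad and H are illustrative.

```python
import numpy as np

# Quadratic test function (also used in the conjugate-gradient example later):
# f(x) = (x1 - 3)^2 + 9*(x2 - 5)^2
def grad(x):
    return np.array([2.0 * (x[0] - 3.0), 18.0 * (x[1] - 5.0)])

H = np.diag([2.0, 18.0])              # constant Hessian of this quadratic

x = np.array([1.0, 1.0])
for k in range(200):
    g = grad(x)
    if np.linalg.norm(g) < 1e-8:      # stop at a stationary point
        break
    s = -g                            # steepest-descent direction
    alpha = -(g @ s) / (s @ H @ s)    # exact step length from Eq. (6.9)
    x = x + alpha * s
print(x)                              # converges (slowly, zig-zagging) to [3, 5]
```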
Numerical Method
Use a coarse search first:
(1) Fixed $\alpha$ ($\alpha = 1$) or variable $\alpha$ ($\alpha$ = 1, 2, ½, etc.)

Options for optimizing $\alpha$:
(1) Interpolation, such as quadratic or cubic
(2) Region elimination (golden section search)
(3) Newton, secant, quasi-Newton
(4) Random search
(5) Analytical optimization

(1), (3), and (5) are preferred. However, it may not be desirable to optimize $\alpha$ exactly (better to generate new search directions); see the line-search sketch below.
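As an illustration of option (1), here is a rough sketch (assuming NumPy) of a quadratic-interpolation line search: it repeatedly fits a parabola through three trial values of phi(alpha) = f(x + alpha*s) and jumps to the vertex. The bracketing update is deliberately crude; a production routine would maintain a proper bracket around the minimum.

```python
import numpy as np

def quadratic_line_search(phi, a=0.0, b=0.5, c=1.0, iters=10):
    """Successive parabolic interpolation on phi(alpha) = f(x + alpha*s)."""
    for _ in range(iters):
        fa, fb, fc = phi(a), phi(b), phi(c)
        num = (b - a)**2 * (fb - fc) - (b - c)**2 * (fb - fa)
        den = (b - a) * (fb - fc) - (b - c) * (fb - fa)
        if abs(den) < 1e-14:
            break
        alpha = b - 0.5 * num / den          # vertex of the fitted parabola
        # keep the three best points (crude update, not a guaranteed bracket)
        pts = sorted([a, b, c, alpha], key=phi)[:3]
        a, b, c = sorted(pts)
    return min([a, b, c], key=phi)

# Example: search along the steepest-descent direction of
# f(x) = (x1-3)^2 + 9*(x2-5)^2 from x = [1, 1]
f = lambda x: (x[0] - 3.0)**2 + 9.0 * (x[1] - 5.0)**2
x0 = np.array([1.0, 1.0])
s0 = np.array([4.0, 72.0])                   # -grad f(x0)
phi = lambda alpha: f(x0 + alpha * s0)
print(quadratic_line_search(phi))            # ~0.0557, matching the worked example
```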
Suppose we calculate the gradient at the point $x^T = [2\;\;2]$.
Gradient Method: Termination Criteria

[Figure: two sketches of f(x) versus x. Left: a big change in f(x) with little change in x, so the code will stop prematurely if the change in x is the sole criterion. Right: a big change in x with little change in f(x), so the code will stop prematurely if the change in f(x) is the sole criterion.]

For minimization you can use up to three criteria for termination:

(1) $\dfrac{\left|f(x^k) - f(x^{k+1})\right|}{\left|f(x^k)\right|} < \varepsilon_1$, except when $f(x^k) = 0$; then use $\left|f(x^k) - f(x^{k+1})\right| < \varepsilon_2$

(2) $\dfrac{\left|x_i^{k+1} - x_i^k\right|}{\left|x_i^k\right|} < \varepsilon_3$, except when $x^k = 0$; then use $\left|x^{k+1} - x^k\right| < \varepsilon_4$

(3) $\left\|\nabla f(x^k)\right\| < \varepsilon_5$ or $\left\|s^k\right\| < \varepsilon_6$
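A small sketch of how the three criteria above might be combined in code (assuming NumPy; the tolerance values and function name are illustrative, not taken from the slides):

```python
import numpy as np

def converged(x_old, x_new, f_old, f_new, grad_new,
              eps=(1e-6, 1e-6, 1e-6, 1e-6, 1e-6)):
    """Combined termination test using criteria (1)-(3); eps are illustrative."""
    e1, e2, e3, e4, e5 = eps
    # (1) relative change in f, or absolute change when f(x^k) is near zero
    crit_f = (abs(f_new - f_old) < e2 if abs(f_old) < 1e-12
              else abs(f_new - f_old) / abs(f_old) < e1)
    # (2) relative change in each x_i, or absolute change when x^k is near zero
    denom = np.where(np.abs(x_old) < 1e-12, 1.0, np.abs(x_old))
    step = np.abs(x_new - x_old)
    crit_x = np.all(np.where(np.abs(x_old) < 1e-12, step < e4,
                             step / denom < e3))
    # (3) gradient norm
    crit_g = np.linalg.norm(grad_new) < e5
    return crit_f and crit_x and crit_g
```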
Gradient Method: Conjugate Search Directions

An improvement over the gradient method for general quadratic functions, and the basis for many NLP techniques.

Two search directions are conjugate relative to $Q$ if

$$(s^i)^T Q\, s^j = 0$$

To minimize $f(x)$, where $x$ is an $n \times 1$ vector, when $H$ is a constant matrix ($= Q$), you are guaranteed to reach the optimum in $n$ conjugate-direction stages if you minimize exactly at each stage (one-dimensional search). A quick numerical check of conjugacy is sketched below.
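A quick numerical check of the conjugacy condition, using the constant Hessian Q of the quadratic example worked later in the deck and the two search directions computed there (the numerical values are taken from that example):

```python
import numpy as np

# Q is the (constant) Hessian of f = (x1-3)^2 + 9*(x2-5)^2 used later.
Q = np.diag([2.0, 18.0])
s0 = np.array([4.0, 72.0])            # first (steepest-descent) direction
s1 = np.array([3.564, -0.022])        # second direction from the worked example
print(s0 @ Q @ s1)                    # ~0, so s0 and s1 are (nearly) conjugate
```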
Conjugate Gradient Method

Step 1. At $x^0$ calculate $f(x^0)$. Let
$$s^0 = -\nabla f(x^0)$$

Step 2. Save $\nabla f(x^0)$ and compute
$$x^1 = x^0 + \alpha^0 s^0$$
by minimizing $f(x)$ with respect to $\alpha$ in the $s^0$ direction (i.e., carry out a unidimensional search for $\alpha^0$).

Step 3. Calculate $f(x^1)$ and $\nabla f(x^1)$. The new search direction is a linear combination of $s^0$ and $\nabla f(x^1)$:
$$s^1 = -\nabla f(x^1) + s^0 \, \frac{\nabla^T f(x^1)\,\nabla f(x^1)}{\nabla^T f(x^0)\,\nabla f(x^0)}$$
For the $k$th iteration the relation is
$$s^{k+1} = -\nabla f(x^{k+1}) + s^k \, \frac{\nabla^T f(x^{k+1})\,\nabla f(x^{k+1})}{\nabla^T f(x^k)\,\nabla f(x^k)} \qquad (6.6)$$
For a quadratic function it can be shown that these successive search directions are conjugate. After $n$ iterations ($k = n$), the quadratic function is minimized. For a nonquadratic function, the procedure cycles again with $x^{n+1}$ becoming $x^0$.

Step 4. Test for convergence to the minimum of $f(x)$. If convergence is not attained, return to step 3.

Step n. Terminate the algorithm when $\left\|\nabla f(x^k)\right\|$ is less than some prescribed tolerance.
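A minimal sketch of the algorithm above (assuming NumPy). For simplicity the one-dimensional search in step 2 uses the analytical step of Eq. (6.9), which is exact here only because the test function is quadratic with a constant Hessian; a general implementation would substitute a numerical line search.

```python
import numpy as np

# Quadratic test problem: f(x) = (x1-3)^2 + 9*(x2-5)^2
grad = lambda x: np.array([2.0 * (x[0] - 3.0), 18.0 * (x[1] - 5.0)])
H = np.diag([2.0, 18.0])            # constant Hessian, so the line search is exact

def conjugate_gradient(x0, tol=1e-8, n_cycles=10):
    x = np.array(x0, dtype=float)
    n = x.size
    for _ in range(n_cycles):                       # restart every n steps
        g = grad(x)
        s = -g                                      # Step 1
        for k in range(n):
            alpha = -(g @ s) / (s @ H @ s)          # exact 1-D search (Eq. 6.9)
            x = x + alpha * s                       # Step 2
            g_new = grad(x)
            if np.linalg.norm(g_new) < tol:         # Step n: gradient test
                return x
            beta = (g_new @ g_new) / (g @ g)        # weighting factor of Eq. (6.6)
            s = -g_new + beta * s                   # Step 3: new conjugate direction
            g = g_new
    return x

print(conjugate_gradient([1.0, 1.0]))               # -> [3., 5.] in two steps
```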
Minimize $f = (x_1 - 3)^2 + 9(x_2 - 5)^2$ using the method of conjugate gradients with $x_1^0 = 1$ and $x_2^0 = 1$ as the initial point.

In vector notation, $x^0 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$, and
$$\nabla f(x^0) = \begin{bmatrix} -4 \\ -72 \end{bmatrix}$$

For steepest descent,
$$s^0 = -\nabla f(x^0) = \begin{bmatrix} 4 \\ 72 \end{bmatrix}$$

Steepest Descent Step (1-D Search)
$$x^1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix} + \alpha^0 \begin{bmatrix} 4 \\ 72 \end{bmatrix}, \qquad \alpha^0 \ge 0$$

The objective function can be expressed as a function of $\alpha^0$ as follows:
$$f(\alpha^0) = (4\alpha^0 - 2)^2 + 9(72\alpha^0 - 4)^2$$

Minimizing $f(\alpha^0)$, we obtain $f = 3.1594$ at $\alpha^0 = 0.0555$. Hence
$$x^1 = \begin{bmatrix} 1.223 \\ 5.011 \end{bmatrix}$$
Calculate Weighting of Previous Step

The new gradient can now be determined as
$$\nabla f(x^1) = \begin{bmatrix} -3.554 \\ 0.197 \end{bmatrix}$$
and $\beta^0$ can be computed as
$$\beta^0 = \frac{(-3.554)^2 + (0.197)^2}{(-4)^2 + (-72)^2} = 0.00244$$

Generate New (Conjugate) Search Direction
$$s^1 = \begin{bmatrix} 3.554 \\ -0.197 \end{bmatrix} + 0.00244\begin{bmatrix} 4 \\ 72 \end{bmatrix} = \begin{bmatrix} 3.564 \\ -0.022 \end{bmatrix}$$
and
$$x^2 = \begin{bmatrix} 1.223 \\ 5.011 \end{bmatrix} + \alpha^1 \begin{bmatrix} 3.564 \\ -0.022 \end{bmatrix}$$

One-Dimensional Search

Solving for $\alpha^1$ as before [i.e., expressing $f(x^1 + \alpha^1 s^1)$ as a function of $\alpha^1$ and minimizing with respect to $\alpha^1$] yields $f = 5.91 \times 10^{-10}$ at $\alpha^1 = 0.4986$. Hence
$$x^2 = \begin{bmatrix} 3.0000 \\ 5.0000 \end{bmatrix}$$
which is the optimum (reached in 2 steps, in agreement with the theory).
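The two conjugate-gradient steps above can be checked numerically; the following sketch (assuming NumPy) reproduces the slide's numbers up to rounding.

```python
import numpy as np

f    = lambda x: (x[0] - 3.0)**2 + 9.0 * (x[1] - 5.0)**2
grad = lambda x: np.array([2.0 * (x[0] - 3.0), 18.0 * (x[1] - 5.0)])
H    = np.diag([2.0, 18.0])                 # constant Hessian

x0 = np.array([1.0, 1.0])
g0 = grad(x0)                               # [-4, -72]
s0 = -g0

alpha0 = -(g0 @ s0) / (s0 @ H @ s0)         # exact 1-D search: ~0.0557
x1 = x0 + alpha0 * s0                       # ~[1.223, 5.011]

g1 = grad(x1)                               # ~[-3.554, 0.197]
beta0 = (g1 @ g1) / (g0 @ g0)               # ~0.00244
s1 = -g1 + beta0 * s0                       # ~[3.564, -0.022]

alpha1 = -(g1 @ s1) / (s1 @ H @ s1)         # ~0.499
x2 = x1 + alpha1 * s1
print(x2, f(x2))                            # -> [3., 5.], f ~ 0
```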
For a quadratic function the gradients at successive points satisfy $\nabla f(x^{k+1}) - \nabla f(x^k) = H\,\Delta x^k = \alpha^k H s^k$, so

$$s^k = \frac{1}{\alpha^k}\,H^{-1}\left[\nabla f(x^{k+1}) - \nabla f(x^k)\right]$$

$$(s^k)^T = \frac{1}{\alpha^k}\left[\nabla f(x^{k+1}) - \nabla f(x^k)\right]^T H^{-1}$$

Using the definition of conjugate directions, $(s^k)^T H s^{k+1} = 0$:

$$\frac{1}{\alpha^k}\left[\nabla f(x^{k+1}) - \nabla f(x^k)\right]^T H^{-1} H \left[-\nabla f(x^{k+1}) + \beta^k s^k\right] = 0$$

Using $\nabla^T f(x^{k+1})\,\nabla f(x^k) = 0$ and $\nabla^T f(x^{k+1})\,s^k = 0$, and solving for the weighting factor:

$$\beta^k = \frac{\nabla^T f(x^{k+1})\,\nabla f(x^{k+1})}{\nabla^T f(x^k)\,\nabla f(x^k)}$$

$$s^{k+1} = -\nabla f(x^{k+1}) + \beta^k s^k$$
Newton Method: Linear vs. Quadratic Approximation of f(x)

$$f(x) \approx f(x^k) + \nabla^T f(x^k)(x - x^k) + \tfrac{1}{2}(x - x^k)^T H(x^k)(x - x^k)$$
$$\Delta x = x - x^k = \alpha^k s^k$$

(1) Using a linear approximation of $f(x)$:
$$\frac{df(x)}{d(\Delta x)} = 0 = \nabla f(x^k), \quad \text{so we cannot solve for } \Delta x!$$

(2) Using a quadratic approximation of $f(x)$:
$$\frac{df(x)}{d(\Delta x)} = 0 = \nabla f(x^k) + H(x^k)(x - x^k)$$
or
$$x = x^k - H^{-1}(x^k)\,\nabla f(x^k), \qquad \text{with } x = x^{k+1}$$
Newton's method solves one of these two forms (simultaneous equation solving).
Note: Both the direction and the step length are determined.
- Requires second derivatives (the Hessian)
- $H$ and $H^{-1}$ must be positive definite (for a minimum) to guarantee convergence
- Iterate if $f(x)$ is not quadratic

Modified Newton's procedure:
$$x^{k+1} = x^k - \alpha^k H^{-1}(x^k)\,\nabla f(x^k)$$
$\alpha^k = 1$ for Newton's method. (If $H = I$, you have steepest descent.)

Example
$$f(x) = x_1^2 + 20 x_2^2$$
Minimize $f$ starting at $x^0 = [1 \;\; 1]^T$; a short numerical sketch follows below.
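A short numerical sketch of the example (assuming NumPy). Because f is quadratic, a single full Newton step (alpha = 1) lands exactly on the minimum.

```python
import numpy as np

# Example from the slide: f(x) = x1^2 + 20*x2^2, started at x0 = [1, 1]
grad = lambda x: np.array([2.0 * x[0], 40.0 * x[1]])
H    = np.array([[ 2.0,  0.0],
                 [ 0.0, 40.0]])     # constant, positive-definite Hessian

x = np.array([1.0, 1.0])
step = -np.linalg.solve(H, grad(x)) # solve H*dx = -grad rather than inverting H
x = x + step                        # full Newton step (alpha = 1)
print(x, grad(x))                   # [0, 0]: one step suffices for a quadratic
```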
Marquardt's Method

If $H(x)$ or $H^{-1}(x)$ is not always positive definite, make it positive definite.

Let $\tilde{H}(x) = \left[H(x) + \beta I\right]$; similarly for $\tilde{H}^{-1}(x)$.

$\beta$ is a positive constant large enough to shift all the negative eigenvalues of $H(x)$.

Example
At the start of the search, $H(x)$ is evaluated at $x^0$ and found to be
$$H(x^0) = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$$
which is not positive definite, as the eigenvalues are $e_1 = 3$, $e_2 = -1$.

Modify $H(x^0)$ (with $\beta = 2$) to be
$$\tilde{H} = \begin{bmatrix} 1 + 2 & 2 \\ 2 & 1 + 2 \end{bmatrix}$$
which is positive definite, as the eigenvalues are $e_1 = 5$, $e_2 = 1$.

$\beta$ is adjusted as the search proceeds.
Step 1
Pick $x^0$, the starting point. Let $\varepsilon$ = convergence criterion.

Step 2
Set $k = 0$. Let $\beta^0 = 10^3$.

Step 3
Calculate $\nabla f(x^k)$.

Step 4
Is $\left\|\nabla f(x^k)\right\| < \varepsilon$? If yes, terminate. If no, continue.
Step 5
Calculate $s(x^k) = -\left[H^k + \beta^k I\right]^{-1} \nabla f(x^k)$.

Step 6
Calculate $x^{k+1} = x^k + s(x^k)$.

Step 7
Is $f(x^{k+1}) < f(x^k)$? If yes, go to step 8. If no, go to step 9.

Step 8
Set $\beta^{k+1} = \tfrac{1}{4}\beta^k$ and $k = k + 1$. Go to step 3.

Step 9
Set $\beta^k = 2\beta^k$. Go to step 5.

A compact sketch of the full loop follows below.
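A compact sketch of steps 1 through 9 (assuming NumPy). The callables obj, grad, and hess and the default tolerances are illustrative assumptions; $\beta^0 = 10^3$ is taken from step 2.

```python
import numpy as np

def marquardt(obj, grad, hess, x0, eps=1e-6, beta0=1.0e3, max_iter=200):
    """Minimal sketch of the loop in steps 1-9 above (helper names assumed)."""
    x = np.asarray(x0, dtype=float)          # Step 1
    beta = beta0                             # Step 2
    for _ in range(max_iter):
        g = grad(x)                          # Step 3
        if np.linalg.norm(g) < eps:          # Step 4
            break
        while True:
            H_mod = hess(x) + beta * np.eye(x.size)
            s = -np.linalg.solve(H_mod, g)   # Step 5
            x_new = x + s                    # Step 6
            if obj(x_new) < obj(x):          # Step 7
                beta *= 0.25                 # Step 8: beta^{k+1} = beta^k / 4
                x = x_new
                break
            beta *= 2.0                      # Step 9: beta^k = 2*beta^k
    return x

# Example usage on f(x) = x1^2 + 20*x2^2 (the test function from the Newton example)
obj  = lambda x: x[0]**2 + 20.0 * x[1]**2
grad = lambda x: np.array([2.0 * x[0], 40.0 * x[1]])
hess = lambda x: np.diag([2.0, 40.0])
print(marquardt(obj, grad, hess, [1.0, 1.0]))   # -> approximately [0, 0]
```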
Secant Methods

Recall that for a one-dimensional search the secant method uses only values of $f(x)$ and $f'(x)$:
$$x^{k+1} = x^k - \left[\frac{f'(x^k) - f'(x^p)}{x^k - x^p}\right]^{-1} f'(x^k)$$
That is, $f''(x)$ is approximated by the slope of a straight line (the secant) through the derivatives at two points; hence the name "quasi-Newton" method.

The basic idea (for a quadratic function):
$$\nabla f(x^k) + H\,\Delta x = 0 \quad \text{or} \quad (x^{k+1} - x^k) = -H^{-1}\nabla f(x^k)$$

Pick two points to start ($x^k$ = reference point):
$$\nabla f(x^2) = \nabla f(x^k) + H(x^2 - x^k)$$
$$\nabla f(x^1) = \nabla f(x^k) + H(x^1 - x^k)$$
$$\nabla f(x^2) - \nabla f(x^1) \equiv y = H(x^2 - x^1)$$
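A one-dimensional illustration of the secant update above, applied to f'(x) = 0 for a simple quadratic (the function and starting points are assumptions chosen for illustration):

```python
# One-dimensional secant iteration for f'(x) = 0 (a sketch of the update above).
def secant_min(fprime, x_prev, x_curr, tol=1e-10, max_iter=50):
    for _ in range(max_iter):
        g_prev, g_curr = fprime(x_prev), fprime(x_curr)
        if abs(g_curr) < tol or g_curr == g_prev:
            break
        # the finite-difference slope stands in for f''(x^k)
        x_next = x_curr - g_curr * (x_curr - x_prev) / (g_curr - g_prev)
        x_prev, x_curr = x_curr, x_next
    return x_curr

# Example: minimize f(x) = (x - 2)^2 + 1, i.e. solve f'(x) = 2(x - 2) = 0
print(secant_min(lambda x: 2.0 * (x - 2.0), 0.0, 1.0))   # -> 2.0
```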
For a nonquadratic function, an approximation $\tilde{H}$ would be calculated after taking a step from $x^k$ to $x^{k+1}$ by solving the secant equations
$$y^k = \tilde{H}\,\Delta x^k \quad \text{or} \quad \Delta x^k = \tilde{H}^{-1} y^k$$

- An infinite number of candidates exist for $\tilde{H}$ when $n > 1$.
- We want to choose $\tilde{H}$ (or $\tilde{H}^{-1}$) close to $H$ (or $H^{-1}$) in some sense. Several methods can be used to update $\tilde{H}$.
• Probably the best update formula is the BFGS update (Broyden-Fletcher-Goldfarb-Shanno), ca. 1970.

• BFGS is the basis for the unconstrained optimizer in the Excel Solver.

• It does not require inverting the Hessian matrix; instead it approximates the inverse using values of $\nabla f$. A sketch of the inverse update is given below.
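The deck does not show the BFGS formula itself. The following sketch uses the standard BFGS update of the inverse-Hessian approximation (a textbook formula, not taken from these slides), together with a crude backtracking line search; the function names and tolerances are illustrative assumptions.

```python
import numpy as np

def bfgs_inverse_update(H_inv, dx, dy):
    """Standard BFGS update of the inverse-Hessian approximation.
    dx = x^{k+1} - x^k,  dy = grad_f(x^{k+1}) - grad_f(x^k)."""
    rho = 1.0 / (dy @ dx)                 # requires dy^T dx > 0 (curvature condition)
    I = np.eye(len(dx))
    V = I - rho * np.outer(dx, dy)
    return V @ H_inv @ V.T + rho * np.outer(dx, dx)

def bfgs(obj, grad, x0, iters=100, tol=1e-8):
    """Quasi-Newton iteration using only function and gradient values."""
    x = np.asarray(x0, dtype=float)
    H_inv = np.eye(x.size)                # start from the identity
    g = grad(x)
    for _ in range(iters):
        if np.linalg.norm(g) < tol:
            break
        s = -H_inv @ g                    # quasi-Newton search direction
        alpha = 1.0
        while obj(x + alpha * s) >= obj(x) and alpha > 1e-12:
            alpha *= 0.5                  # crude backtracking line search
        x_new = x + alpha * s
        g_new = grad(x_new)
        if (g_new - g) @ (x_new - x) > 0: # keep H_inv positive definite
            H_inv = bfgs_inverse_update(H_inv, x_new - x, g_new - g)
        x, g = x_new, g_new
    return x

# Example on f(x) = x1^2 + 20*x2^2
obj  = lambda x: x[0]**2 + 20.0 * x[1]**2
grad = lambda x: np.array([2.0 * x[0], 40.0 * x[1]])
print(bfgs(obj, grad, [1.0, 1.0]))        # approaches the minimum at [0, 0]
```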
