Chapter 6
UNCONSTRAINED MULTIVARIABLE
OPTIMIZATION
6.1 Function Values Only (direct search methods)
General Strategy for Gradient Methods

(1) Calculate a search direction $s^k$.
(2) Select a step length $\alpha^k$ in that direction that reduces $f(x)$:

$x^{k+1} = x^k + \alpha^k s^k = x^k + \Delta x^k$
Steepest Descent

Search Direction

$s^k = -\nabla f(x^k)$   (don’t need to normalize)

The gradient also vanishes at a saddle point, so the procedure can stop there. Need to show $H(x^*)$ is positive definite for a minimum.
Step Length

How to pick $\alpha^k$:
• analytically
• numerically
Analytical Method

How does one minimize a function in a search direction using an analytical method?

$f(x^k + \alpha s^k) \approx f(x^k) + \nabla^T f(x^k)\,\Delta x^k + \tfrac{1}{2}(\Delta x^k)^T H(x^k)\,\Delta x^k$,  where $\Delta x^k = \alpha s^k$

$\dfrac{d f(x^k + \alpha s^k)}{d\alpha} = 0 = \nabla^T f(x^k)\,s^k + \alpha\,(s^k)^T H(x^k)\,s^k$

Solve for $\alpha$:

$\alpha^k = -\dfrac{\nabla^T f(x^k)\,s^k}{(s^k)^T H(x^k)\,s^k}$   (6.9)
Termination Criteria

[Sketch: f(x) vs. x — big change in f(x) but little change in x; the code will stop prematurely if the change in x is the sole criterion.]

[Sketch: f(x) vs. x — big change in x but little change in f(x); the code will stop prematurely if the change in f(x) is the sole criterion.]

For minimization you can use up to three criteria for termination:

(1) a small change in f(x) between successive iterations
(2) a small change in the elements of x between successive iterations
(3) $\|\nabla f(x^k)\| \le \varepsilon_5$  or  $|s_i^k| \le \varepsilon_6$
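A minimal sketch of combining such criteria in code, so that neither pathological case triggers a premature stop; the tolerance names (eps_f, eps_x, eps_g) and the relative-change form are illustrative assumptions:

```python
import numpy as np

# Combined termination test: require a small relative change in f, a small
# relative change in x, AND a small gradient norm before declaring convergence.

def converged(f_old, f_new, x_old, x_new, g_new,
              eps_f=1e-8, eps_x=1e-8, eps_g=1e-6):
    df_ok = abs(f_new - f_old) <= eps_f * max(1.0, abs(f_old))       # change in f
    dx_ok = (np.linalg.norm(x_new - x_old)
             <= eps_x * max(1.0, np.linalg.norm(x_old)))             # change in x
    g_ok = np.linalg.norm(g_new) <= eps_g                            # gradient test
    return df_ok and dx_ok and g_ok
```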
Conjugate Search Directions

$(s^i)^T Q\, s^j = 0$

• To minimize $f(x)$, with $x$ an $n \times 1$ vector, when $H$ is a constant matrix ($= Q$), you are guaranteed to reach the optimum in $n$ conjugate-direction stages if you minimize exactly at each stage (one-dimensional search).
Conjugate Gradient Method

Step 1. At $x^0$ calculate $f(x^0)$. Let

$s^0 = -\nabla f(x^0)$

Step 2. Save $\nabla f(x^0)$ and compute

$x^1 = x^0 + \alpha^0 s^0$

by minimizing $f(x)$ with respect to $\alpha$ in the $s^0$ direction (i.e., carry out a unidimensional search for $\alpha^0$).

Step 3. Calculate the new gradient and form the new search direction. For the kth iteration the relation is

$s^{k+1} = -\nabla f(x^{k+1}) + s^k\,\dfrac{\nabla^T f(x^{k+1})\,\nabla f(x^{k+1})}{\nabla^T f(x^k)\,\nabla f(x^k)}$   (6.6)

For a quadratic function it can be shown that these successive search directions are conjugate. After $n$ iterations ($k = n$), the quadratic function is minimized. For a nonquadratic function, the procedure cycles again with $x^{n+1}$ becoming $x^0$.

Step 4. Test for convergence to the minimum of $f(x)$. If convergence is not attained, return to step 3.

Step n. Terminate the algorithm when $\|\nabla f(x^k)\|$ is less than some prescribed tolerance.
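The steps above can be sketched for a quadratic, where the unidimensional search for $\alpha$ can be done analytically with Eq. (6.9); the test quadratic and all names are illustrative assumptions:

```python
import numpy as np

# Conjugate gradient on f(x) = 0.5 x^T Q x - b^T x, with the exact line
# search alpha = -grad^T s / (s^T Q s) and the weighting factor of Eq. (6.6).

def conjugate_gradient(Q, b, x0, tol=1e-10):
    x = x0.astype(float)
    g = Q @ x - b                       # gradient of the quadratic
    s = -g                              # Step 1: s^0 = -grad f(x^0)
    for _ in range(len(b) * 10):
        if np.linalg.norm(g) < tol:     # Step n: gradient tolerance
            break
        alpha = -(g @ s) / (s @ Q @ s)  # Step 2: exact unidimensional search
        x = x + alpha * s
        g_new = Q @ x - b
        gamma = (g_new @ g_new) / (g @ g)   # weighting factor in Eq. (6.6)
        s = -g_new + gamma * s              # Step 3: new conjugate direction
        g = g_new
    return x

Q = np.array([[2.0, 0.0], [0.0, 18.0]])
b = np.array([6.0, 90.0])
x_star = conjugate_gradient(Q, b, np.array([1.0, 1.0]))
# reaches the minimizer [3, 5] in at most n = 2 stages for this quadratic
```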
Minimize $f(x) = (x_1 - 3)^2 + 9(x_2 - 5)^2$ using the method of conjugate gradients with, in vector notation,

$x^0 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$,   $\nabla f(x^0) = \begin{bmatrix} -4 \\ -72 \end{bmatrix}$

so that $s^0 = -\nabla f(x^0)$. A one-dimensional search along $s^0$ gives $x^1 = [1.223 \;\; 5.011]^T$ with $\nabla f(x^1) = [-3.554 \;\; 0.197]^T$, so

$s^1 = \begin{bmatrix} 3.554 \\ -0.197 \end{bmatrix} + 0.00244 \begin{bmatrix} 4 \\ 72 \end{bmatrix} = \begin{bmatrix} 3.564 \\ -0.022 \end{bmatrix}$

and

$x^2 = \begin{bmatrix} 1.223 \\ 5.011 \end{bmatrix} + \alpha^1 \begin{bmatrix} 3.564 \\ -0.022 \end{bmatrix}$
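The example's numbers can be checked numerically; the helper names below are illustrative, and the exact line search uses Eq. (6.9) with the constant Hessian of this quadratic:

```python
import numpy as np

# Check of the worked example: f(x) = (x1-3)^2 + 9(x2-5)^2 from x^0 = [1, 1],
# with H = diag(2, 18) constant.

H = np.diag([2.0, 18.0])

def grad(x):
    return np.array([2.0 * (x[0] - 3.0), 18.0 * (x[1] - 5.0)])

x0 = np.array([1.0, 1.0])
g0 = grad(x0)                       # [-4, -72], as in the example
s0 = -g0

alpha0 = -(g0 @ s0) / (s0 @ H @ s0)
x1 = x0 + alpha0 * s0               # approximately [1.223, 5.011]

g1 = grad(x1)
gamma = (g1 @ g1) / (g0 @ g0)       # approximately 0.00244
s1 = -g1 + gamma * s0               # approximately [3.564, -0.022]

alpha1 = -(g1 @ s1) / (s1 @ H @ s1)
x2 = x1 + alpha1 * s1               # reaches the optimum [3, 5] in two stages
```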
One-Dimensional Search
Fletcher–Reeves Conjugate Gradient Method

Let $s^0 = -\nabla f(x^0)$

$s^1 = -\nabla f(x^1) + \gamma^1 s^0$

$s^2 = -\nabla f(x^2) + \gamma^2 s^1$

The $\gamma^k$ are chosen to make $(s^k)^T H s^{k-1} = 0$ (conjugate directions).

Derivation (let $H^k = H$):

$\nabla f(x^{k+1}) = \nabla f(x^k) + \nabla^2 f(x^k)(x^{k+1} - x^k)$

$\nabla f(x^{k+1}) - \nabla f(x^k) = H\,\Delta x^k = H\,\alpha^k s^k$

$s^k = H^{-1}\left[\nabla f(x^{k+1}) - \nabla f(x^k)\right]/\alpha^k$

$(s^k)^T = \left[\nabla f(x^{k+1}) - \nabla f(x^k)\right]^T H^{-1}/\alpha^k$

Using the definition of conjugate directions, $(s^k)^T H s^{k+1} = 0$:

$\left[\nabla f(x^{k+1}) - \nabla f(x^k)\right]^T H^{-1} H \left[-\nabla f(x^{k+1}) + \gamma^{k+1} s^k\right]/\alpha^k = 0$

Using $\nabla^T f(x^k)\,\nabla f(x^{k+1}) = 0$ and $\nabla^T f(x^{k+1})\,s^k = 0$, and solving for the weighting factor:

$\gamma^{k+1} = \dfrac{\nabla^T f(x^{k+1})\,\nabla f(x^{k+1})}{\nabla^T f(x^k)\,\nabla f(x^k)}$

$s^{k+1} = -\nabla f(x^{k+1}) + \gamma^{k+1} s^k$
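A quick numerical check that successive Fletcher–Reeves directions are indeed H-conjugate, on an illustrative diagonal quadratic (the matrix and names are assumptions):

```python
import numpy as np

# Verify (s^0)^T H s^1 = 0 for a quadratic with gradient H x - b.

H = np.diag([2.0, 18.0])
b = np.array([6.0, 90.0])

x = np.array([1.0, 1.0])
g0 = H @ x - b
s0 = -g0
alpha = -(g0 @ s0) / (s0 @ H @ s0)  # exact line search along s^0
x = x + alpha * s0

g1 = H @ x - b
gamma = (g1 @ g1) / (g0 @ g0)       # Fletcher-Reeves weighting factor
s1 = -g1 + gamma * s0

conjugacy = s0 @ H @ s1             # should be ~0
```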
Linear vs. Quadratic Approximation of f(x)

$f(x) \approx f(x^k) + (x - x^k)^T \nabla f(x^k) + \tfrac{1}{2}(x - x^k)^T H(x^k)(x - x^k)$

$\Delta x^k = x - x^k = \alpha^k s^k$

(1) Using a linear approximation of $f(x)$:

$\dfrac{d f(x)}{d(\Delta x)} = 0 = \nabla^T f(x^k)$,  so we cannot solve for $\Delta x$!

(2) Using a quadratic approximation for $f(x)$:

$\dfrac{d f(x)}{d(\Delta x)} = 0 = \nabla f(x^k) + H(x^k)(x - x^k)$

or, with $x = x^{k+1}$,

$x^{k+1} = x^k - H^{-1}(x^k)\,\nabla f(x^k)$

Newton’s method solves one of these two forms: the first as a set of simultaneous equations, the second via the inverse Hessian.

Note: Both direction and step length are determined.
- Requires second derivatives (Hessian)
- $H$, $H^{-1}$ must be positive definite (for a minimum) to guarantee convergence
- Iterate if $f(x)$ is not quadratic
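A minimal sketch of the Newton iteration on an illustrative non-quadratic function (so the step must be iterated); the function and names are assumptions:

```python
import numpy as np

# Newton's method, x^{k+1} = x^k - H^{-1}(x^k) grad f(x^k), applied to
# f(x1, x2) = (x1-3)^4 + (x2-5)^2, which is not quadratic in x1.

def grad(x):
    return np.array([4.0 * (x[0] - 3.0) ** 3, 2.0 * (x[1] - 5.0)])

def hessian(x):
    return np.array([[12.0 * (x[0] - 3.0) ** 2, 0.0],
                     [0.0, 2.0]])

x = np.array([1.0, 1.0])
for _ in range(60):
    # solve H dx = -grad rather than forming H^{-1} explicitly
    dx = np.linalg.solve(hessian(x), -grad(x))
    x = x + dx
# x approaches the minimizer [3, 5]
```

Because the quartic term makes the quadratic model only approximate, each Newton step removes a fixed fraction of the remaining error in $x_1$, illustrating the "iterate if $f(x)$ is not quadratic" note.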
Marquardt’s Method

If $H(x)$ or $H^{-1}(x)$ is not always positive definite, make it positive definite:

Let $\tilde{H}(x) = H(x) + \beta I$; similarly $\tilde{H}^{-1}(x) = H^{-1}(x) + \beta I$.

$\beta$ is a positive constant large enough to shift all the negative eigenvalues of $H(x)$.

Example

At the start of the search, $H(x)$ is evaluated at $x^0$ and found to be

$H(x^0) = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$ — not positive definite, as the eigenvalues are $e_1 = 3$, $e_2 = -1$.

Modify $H(x^0)$ with $\beta = 2$:

$\tilde{H} = \begin{bmatrix} 1+2 & 2 \\ 2 & 1+2 \end{bmatrix}$ — positive definite, as the eigenvalues are $e_1 = 5$, $e_2 = 1$.

$\beta$ is adjusted as the search proceeds.
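The example can be verified with an eigenvalue computation:

```python
import numpy as np

# Shift the indefinite Hessian from the example by beta*I and check the
# eigenvalues before and after.

H = np.array([[1.0, 2.0],
              [2.0, 1.0]])
eigs = np.linalg.eigvalsh(H)          # ascending: [-1, 3] -> not positive definite

beta = 2.0
H_mod = H + beta * np.eye(2)          # [[3, 2], [2, 3]]
eigs_mod = np.linalg.eigvalsh(H_mod)  # ascending: [1, 5] -> positive definite
```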
Step 1. Pick $x^0$, the starting point. Let $\varepsilon$ = convergence criterion.

Step 2. Set $k = 0$. Let $\beta^0 = 10^3$.

Step 3. Calculate $\nabla f(x^k)$.

Step 4. Is $\|\nabla f(x^k)\| < \varepsilon$? If yes, terminate. If no, continue.

Step 5. Calculate $s(x^k) = -\left[H^k + \beta^k I\right]^{-1} \nabla f(x^k)$.

Step 6. Calculate $x^{k+1} = x^k + s(x^k)$.

Step 7. Is $f(x^{k+1}) < f(x^k)$? If yes, go to step 8. If no, go to step 9.

Step 8. Set $\beta^{k+1} = \tfrac{1}{4}\beta^k$ and $k = k + 1$. Go to step 3.

Step 9. Set $\beta^k = 2\beta^k$. Go to step 5.
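Steps 1–9 can be sketched as a loop; the test function (chosen so its Hessian is indefinite away from the minimum), the iteration caps, and all names are illustrative assumptions:

```python
import numpy as np

# Marquardt's method on f(x1, x2) = x1^4 - 2 x1^2 + x1 + x2^2, whose Hessian
# entry 12 x1^2 - 4 is negative near x1 = 0, so the beta*I shift is needed.

def f(x):
    return x[0] ** 4 - 2.0 * x[0] ** 2 + x[0] + x[1] ** 2

def grad(x):
    return np.array([4.0 * x[0] ** 3 - 4.0 * x[0] + 1.0, 2.0 * x[1]])

def hessian(x):
    return np.array([[12.0 * x[0] ** 2 - 4.0, 0.0],
                     [0.0, 2.0]])

x = np.array([0.0, 1.0])            # Step 1: starting point
eps = 1e-6                          # convergence criterion
beta = 1e3                          # Step 2: beta^0 = 10^3
for _ in range(200):
    g = grad(x)                     # Step 3
    if np.linalg.norm(g) < eps:     # Step 4: terminate on a small gradient
        break
    while True:
        # Step 5: s = -[H + beta I]^{-1} grad f, via a linear solve
        s = np.linalg.solve(hessian(x) + beta * np.eye(2), -g)
        x_new = x + s               # Step 6
        if f(x_new) < f(x):         # Step 7
            x, beta = x_new, beta / 4.0   # Step 8: accept, shrink beta
            break
        beta = 2.0 * beta           # Step 9: reject, retry with larger beta
# x settles at a local minimum where the Hessian is positive definite
```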
Secant Methods

Recall that for a one-dimensional search the secant method only uses values of $f(x)$ and $f'(x)$:

$x^{k+1} = x^k - f'(x^k)\,\dfrac{x^k - x^p}{f'(x^k) - f'(x^p)}$

where $x^p$ is a second (previous) point, so that the second derivative is approximated by a finite difference.
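A sketch of the secant recursion on an illustrative one-dimensional function (the function and starting points are assumptions):

```python
# One-dimensional secant method: f''(x^k) is replaced by the finite
# difference [f'(x^k) - f'(x^p)] / (x^k - x^p).
# Illustrative problem: minimize f(x) = x^4 - 3x, so f'(x) = 4x^3 - 3,
# with stationary point x* = (3/4)^(1/3).

def fprime(x):
    return 4.0 * x ** 3 - 3.0

x_prev, x_curr = 0.0, 2.0          # two starting points with f' of opposite sign
for _ in range(50):
    denom = fprime(x_curr) - fprime(x_prev)
    if denom == 0.0:               # derivative values coincide: stop
        break
    x_next = x_curr - fprime(x_curr) * (x_curr - x_prev) / denom
    x_prev, x_curr = x_curr, x_next
# x_curr approximates (3/4)**(1/3)
```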
• Probably the best update formula is the BFGS update (Broyden–Fletcher–Goldfarb–Shanno), ca. 1970.
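A minimal sketch of the BFGS update applied to a quadratic; the update formula shown is the standard inverse-Hessian form (not reproduced in the slides), and the test function and names are illustrative assumptions:

```python
import numpy as np

# BFGS update for the approximate inverse Hessian B:
#   B+ = (I - rho s y^T) B (I - rho y s^T) + rho s s^T,  rho = 1/(y^T s),
# where s = x^{k+1} - x^k and y = grad f(x^{k+1}) - grad f(x^k).
# Applied to the quadratic f(x) = 0.5 x^T Q x - b^T x with exact line search.

Q = np.array([[2.0, 0.0], [0.0, 18.0]])
b = np.array([6.0, 90.0])

def grad(x):
    return Q @ x - b

x = np.array([1.0, 1.0])
B = np.eye(2)                       # initial inverse-Hessian approximation
g = grad(x)
for _ in range(100):
    if np.linalg.norm(g) < 1e-10:   # stop before a degenerate zero step
        break
    d = -B @ g                      # quasi-Newton search direction
    alpha = -(g @ d) / (d @ Q @ d)  # exact line search on the quadratic
    s = alpha * d
    x_new = x + s
    g_new = grad(x_new)
    y = g_new - g
    rho = 1.0 / (y @ s)
    I = np.eye(2)
    B = (I - rho * np.outer(s, y)) @ B @ (I - rho * np.outer(y, s)) \
        + rho * np.outer(s, s)      # BFGS inverse-Hessian update
    x, g = x_new, g_new
# x converges to [3, 5]; for this quadratic B approaches Q^{-1}
```

With exact line searches on a quadratic, the BFGS iterates terminate in $n$ steps and the final $B$ equals the true inverse Hessian, which is what makes the update attractive when second derivatives are unavailable.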