LECTURE NOTE II
Chapter 3
min_x f(x),   x ∈ R^N
Δf = f(x) − f(x*) ≥ 0 for x ≠ x*   (x* is a local minimum)
Δf = f(x) − f(x*) ≤ 0 for x ≠ x*   (x* is a local maximum)
- For Q(z) = z^T A z:
A is positive definite if Q(z) > 0 for all z ≠ 0.
A is positive semidefinite if Q(z) ≥ 0 for all z, and Q(z) = 0 for some z ≠ 0.
A is negative definite if Q(z) < 0 for all z ≠ 0.
A is negative semidefinite if Q(z) ≤ 0 for all z, and Q(z) = 0 for some z ≠ 0.
A is indefinite if Q(z) > 0 for some z and Q(z) < 0 for other z.
Test for positive definite matrices
1. If any one of the diagonal elements is not positive, then A is not p.d.
2. All the leading principal determinants must be positive.
3. All eigenvalues of A are positive.
Test for negative definite matrices
1. If any one of the diagonal elements is not negative, then A is not n.d.
2. The leading principal determinants must alternate in sign, starting from
D1 < 0 (D2 > 0, D3 < 0, D4 > 0, …).
3. All eigenvalues of A are negative.
Test for positive semidefinite matrices
1. If any one of the diagonal elements is negative, then A is not p.s.d.
2. All the principal determinants are nonnegative.
Test for negative semidefinite matrices
1. If any one of the diagonal elements is positive, then A is not n.s.d.
2. All the k-th order principal determinants are nonpositive if k is odd, and
nonnegative if k is even.
Remark 1: A principal minor of order k of an NxN matrix Q is a kxk submatrix
obtained by deleting any N−k rows and the corresponding columns from the matrix Q.
Remark 2: The leading principal minor of order k of an NxN matrix Q is the kxk submatrix
obtained by deleting the last N−k rows and the corresponding columns.
Remark 3: The determinant of a principal minor is called a principal determinant. For an
NxN matrix, there are 2^N − 1 principal determinants in all.
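cf) A minimal NumPy sketch of tests 2 and 3 above; the function names and the 2x2 example are illustrative, not part of these notes.

```python
import numpy as np

def leading_principal_determinants(A):
    """D_k = determinant of the upper-left k-by-k submatrix, k = 1..N."""
    return [np.linalg.det(A[:k, :k]) for k in range(1, A.shape[0] + 1)]

def is_positive_definite(A):
    # Test 2: all leading principal determinants must be positive.
    return all(d > 0 for d in leading_principal_determinants(A))

def is_negative_definite(A):
    # Test 2: determinants alternate in sign, starting with D1 < 0.
    return all((d < 0) if k % 2 == 1 else (d > 0)
               for k, d in enumerate(leading_principal_determinants(A), start=1))

A = np.array([[2.0, -1.0], [-1.0, 2.0]])                  # eigenvalues 1 and 3
print(is_positive_definite(A), is_negative_definite(A))   # True False
print(np.linalg.eigvalsh(A))                              # Test 3: all eigenvalues positive
```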
v) Contraction: If f(x_r) ≥ f(x_h), x_t = x_c + β(x_h − x_c).
(β = 0.4–0.6)  Else if f(x_r) > f(x_g), x_t = x_c − β(x_h − x_c).
Then set x_h = x_t and go to i).
vi) If the simplex is small enough, then stop. Otherwise,
Reduction: x_i = x_l + 0.5(x_i − x_l) for i = 1, 2, …, n+1, and go to i).
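cf) A minimal sketch of the contraction and reduction steps v) and vi). It assumes the earlier steps (not shown in this excerpt) have already produced the reflection point x_r, centroid x_c, worst vertex x_h, second-worst value f_g, and best-vertex index l; the value β = 0.5 and the size test are only illustrative.

```python
import numpy as np

def contraction(x_c, x_h, x_r, f, f_g, beta=0.5):
    # Step v): if the reflection is worse than the worst vertex, contract toward x_h;
    # if it is only worse than the second-worst value, contract on the reflected side.
    if f(x_r) >= f(x_h):
        return x_c + beta * (x_h - x_c)
    elif f(x_r) > f_g:
        return x_c - beta * (x_h - x_c)
    return x_r

def reduction(simplex, l):
    # Step vi): shrink every vertex halfway toward the best vertex x_l.
    x_l = simplex[l]
    return np.array([x_l + 0.5 * (x_i - x_l) for x_i in simplex])

def simplex_is_small(simplex, tol=1e-6):
    # One possible "small enough" test: spread of the vertices about the centroid.
    centroid = simplex.mean(axis=0)
    return np.max(np.linalg.norm(simplex - centroid, axis=1)) < tol
```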
B. If f(x_p^(k)) < f(x_b^(k)), then set x^(k+1) = x_p^(k) and go to i).
Remark 4: More complicated methods can be derived. However, Powell's Conjugate
Direction Method, presented next, is the better choice if a more sophisticated algorithm is to be used.
If the objective function of n variables is quadratic and in the form of a perfect square,
then the optimum can be found after exactly n single-variable searches.
Quadratic functions:
q(x) = a + b^T x + 0.5 x^T C x
Similarity transform (diagonalization): Find T with x = Tz so that
Q(x) = x^T C x = z^T T^T C T z = z^T D z   (D is a diagonal matrix)
cf) If C is diagonalizable, the columns of T are the eigenvectors of C.
For optimization, however, the matrix C of the objective function is generally not available.
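cf) A small numerical check (illustrative, not part of the notes) that x = Tz, with T the eigenvector matrix of a symmetric C, turns x^T C x into z^T D z:

```python
import numpy as np

C = np.array([[4.0, 1.0], [1.0, 3.0]])
eigvals, T = np.linalg.eigh(C)     # columns of T are orthonormal eigenvectors of C
D = T.T @ C @ T                    # similarity transform: should equal diag(eigvals)
print(np.round(D, 10))             # off-diagonal entries vanish

z = np.array([1.0, -2.0])
x = T @ z
print(x @ C @ x, z @ D @ z)        # the two quadratic forms agree
```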
- Conjugate directions
Definition:
Given an nxn symmetric matrix C, the directions s_1, s_2, …, s_r (r ≤ n) are said to be
C-conjugate if the directions are linearly independent and
s_i^T C s_j = 0 for all i ≠ j.
Remark 1: If s_i^T s_j = 0 for all i ≠ j, they are orthogonal.
At the minimum of q along a direction d, the gradient is orthogonal to d:
∇q(x)^T d = (b^T + x^T C) d = 0 at x = z_1 and at x = z_2.
Subtracting the two conditions gives (z_1 − z_2)^T C d = 0, i.e., z_1 − z_2 and d are C-conjugate.
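cf) A small numerical check of this parallel-subspace property; the quadratic data and starting points are arbitrary examples, not from the notes.

```python
import numpy as np

# q(x) = b^T x + 0.5 x^T C x (the constant term does not matter)
C = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([-1.0, 2.0])

def line_min(x0, d):
    # Exact step for a quadratic: choose lambda so that (b + C x)^T d = 0.
    lam = -(b + C @ x0) @ d / (d @ C @ d)
    return x0 + lam * d

d = np.array([1.0, 0.0])
z1 = line_min(np.array([0.0, 0.0]), d)
z2 = line_min(np.array([5.0, -3.0]), d)
print((z1 - z2) @ C @ d)   # ~0: (z1 - z2) and d are C-conjugate
```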
Gram–Schmidt generation of conjugate directions: starting from linearly independent
vectors u_1, u_2, …, u_n with z_1 = u_1,

z_j = u_j − Σ_{k=1}^{j−1} [(u_j^T A z_k) / (z_k^T A z_k)] z_k    for j = 2, 3, …, n.

Alternatively, generating the sequence with A itself,

z_2 = A z_1 − [(z_1^T A^2 z_1) / (z_1^T A z_1)] z_1,
z_{j+1} = A z_j − [(z_j^T A^2 z_j) / (z_j^T A z_j)] z_j − [(z_{j−1}^T A^2 z_j) / (z_{j−1}^T A z_{j−1})] z_{j−1}    for j = 2, 3, …, n−1.
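cf) A minimal NumPy sketch of the Gram–Schmidt conjugation formula above; the function name and the 2x2 example are illustrative.

```python
import numpy as np

def conjugate_directions(U, A):
    """Columns of U are linearly independent u_j; returns columns z_j with z_i^T A z_j = 0."""
    n = U.shape[1]
    Z = np.zeros_like(U, dtype=float)
    Z[:, 0] = U[:, 0]
    for j in range(1, n):
        z = U[:, j].copy()
        for k in range(j):
            zk = Z[:, k]
            # subtract the A-projection of u_j onto each previous direction
            z -= (U[:, j] @ A @ zk) / (zk @ A @ zk) * zk
        Z[:, j] = z
    return Z

A = np.array([[4.0, 1.0], [1.0, 3.0]])
Z = conjugate_directions(np.eye(2), A)
print(Z[:, 0] @ A @ Z[:, 1])   # ~0: the directions are A-conjugate
```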
A. Modification by Powell
Check whether the replacement criterion is satisfied; it compares the total decrease
f(x^(k)) − f(x^(k+1)) over the cycle with the largest decrease f(x_m) − f(x_{m+1})
obtained along any single direction (with a factor of 0.5).
If yes, use old directions again. Else delete s_m and add s_{n+1}.
B. Modification by Zangwill
If yes, use old directions again. Else delete sm and add sn+1.
Remark 3: This method will converge to a local minimum at a superlinear convergence
rate.
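cf) A minimal sketch of one Powell-style cycle. It replaces the exact line search by SciPy's scalar minimizer and uses the simplest replacement rule (drop the oldest direction) instead of the Powell/Zangwill safeguards above; the test function is only an example.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def powell_cycle(f, x, S):
    """Minimize f along each column of S in turn, then append the overall move."""
    x0 = x.copy()
    for i in range(S.shape[1]):
        s = S[:, i]
        alpha = minimize_scalar(lambda a: f(x + a * s)).x
        x = x + alpha * s
    s_new = x - x0                          # candidate new direction s_{n+1}
    S = np.column_stack([S[:, 1:], s_new])  # simplest rule: drop the oldest direction
    return x, S

f = lambda x: (x[0] - 1.0)**2 + 10.0 * (x[1] - x[0]**2)**2
x, S = np.array([0.0, 0.0]), np.eye(2)
for _ in range(20):
    x, S = powell_cycle(f, x, S)
print(x)   # approaches the minimizer (1, 1)
```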
cf) Rate of convergence: let lim_{k→∞} ε^(k+1) / (ε^(k))^r = C, where ε^(k) = ‖x^(k) − x*‖;
r is the order of convergence and C is the convergence ratio.
Termination: ‖∇f(x^(k))‖ ≤ ε_f and/or ‖x^(k+1) − x^(k)‖ / ‖x^(k)‖ ≤ ε_x.
Descent property: ∇f(x^(k))^T s(x^(k)) < 0.
For the one-variable case (Newton's method applied to f′(x) = 0),

x^(k+1) − x* = x^(k) − x* − [f′(x^(k)) − f′(x*)] / f″(x^(k))   (since f′(x*) = 0).

Expanding f′ about x^(k): f′(x^(k)) + f″(x^(k))(x* − x^(k)) + 0.5 f‴(ξ)(x* − x^(k))^2 = f′(x*) = 0,
so that

x^(k+1) − x* = [f‴(ξ) / (2 f″(x^(k)))] (x* − x^(k))^2 ≈ C (x* − x^(k))^2,

i.e., the convergence is quadratic (r = 2).
Also, if the initial condition is chosen such that ε^(0) < 1/C, the method will
converge. This implies that if the initial condition is chosen poorly, the method may diverge.
Remark 4: The family of Newton’s methods does not possess the descent property.
∇f(x^(k))^T s(x^(k)) = −∇f(x^(k))^T [∇²f(x^(k))]^(−1) ∇f(x^(k)) < 0 only if the Hessian
is positive definite.
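cf) A minimal sketch of the pure Newton iteration x^(k+1) = x^(k) − [∇²f(x^(k))]^(−1) ∇f(x^(k)); the quadratic example is illustrative. There is no step-size control, so a poor starting point can cause divergence, as noted above.

```python
import numpy as np

def newton(grad, hess, x, iters=10):
    for _ in range(iters):
        x = x - np.linalg.solve(hess(x), grad(x))   # full Newton step
    return x

# Example: f(x, y) = (x - 1)^2 + 2(y + 3)^2, a strictly convex quadratic
grad = lambda x: np.array([2 * (x[0] - 1), 4 * (x[1] + 3)])
hess = lambda x: np.array([[2.0, 0.0], [0.0, 4.0]])
print(newton(grad, hess, np.array([10.0, -10.0])))   # -> [1, -3] in one step
```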
iii) Calculate g(x^(k), α_k) = [f(x^(k)) − f(x^(k) + α_k s(x^(k)))] / [−α_k ∇f(x^(k))^T s(x^(k))],
the ratio of the actual decrease to the decrease predicted from the gradient.
If g(x^(k), 1) is not acceptable, select α_k such that g(x^(k), α_k) ≈ 1.
Else, α_k = 1.
iv) Let Q = [Q_1 Q_2 … Q_n]   (approximation of the Hessian),
where Q_i = [∇f(x^(k) + ‖∇f(x^(k−1))‖ e_i) − ∇f(x^(k))] / ‖∇f(x^(k−1))‖.
If Q(x^(k)) is singular or ∇f(x^(k))^T [Q(x^(k))]^(−1) ∇f(x^(k)) ≤ 0,
then s(x^(k)) = −∇f(x^(k)). Else s(x^(k)) = −[Q(x^(k))]^(−1) ∇f(x^(k)).
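cf) A minimal sketch of step iv) above. It uses a fixed difference step h instead of the gradient-norm step of the notes, and the function name and example gradient are illustrative.

```python
import numpy as np

def search_direction(grad, x, h=1e-5):
    g = grad(x)
    n = x.size
    Q = np.zeros((n, n))
    for i in range(n):
        e = np.zeros(n); e[i] = 1.0
        Q[:, i] = (grad(x + h * e) - g) / h    # column Q_i of the Hessian estimate
    try:
        s = -np.linalg.solve(Q, g)             # Newton-like step
    except np.linalg.LinAlgError:
        return -g                              # Q singular: fall back to steepest descent
    if g @ s >= 0:                             # not a descent direction
        return -g
    return s

grad = lambda x: np.array([2 * (x[0] - 1), 10 * (x[1] + 2)])   # example quadratic gradient
print(search_direction(grad, np.array([0.0, 0.0])))            # ~ [1, -2]
```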
- Search direction: s^(k) = −g^(k) + Σ_{i=0}^{k−1} γ^(i) s^(i), with s^(0) = −g^(0), where g^(k) = ∇f(x^(k)).
ii) Choose γ^(0) and γ^(1) such that s^(2)T C s^(1) = 0 and s^(2)T C s^(0) = 0.
cf) Use when the objective and gradient evaluations are very inexpensive.
ii) Daniel
s^(k) = −∇f(x^(k)) + [s^(k−1)T ∇²f(x^(k)) ∇f(x^(k)) / s^(k−1)T ∇²f(x^(k)) s^(k−1)] s^(k−1)
iii) Fletcher and Reeves (FR)
s^(k) = −∇f(x^(k)) + [‖g(x^(k))‖^2 / ‖g(x^(k−1))‖^2] s^(k−1), where g(x) = ∇f(x)
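cf) A minimal sketch of the Fletcher–Reeves recursion for a quadratic f(x) = 0.5 x^T C x − b^T x with an exact line search; the example data are arbitrary.

```python
import numpy as np

def fletcher_reeves(C, b, x, iters=None):
    g = C @ x - b                             # gradient of the quadratic
    s = -g
    for _ in range(iters or len(b)):
        alpha = -(g @ s) / (s @ C @ s)        # exact step along s
        x = x + alpha * s
        g_new = C @ x - b
        beta = (g_new @ g_new) / (g @ g)      # ||g^(k)||^2 / ||g^(k-1)||^2
        s = -g_new + beta * s
        g = g_new
    return x

C = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(fletcher_reeves(C, b, np.zeros(2)), np.linalg.solve(C, b))   # both ~ C^{-1} b
```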
Quasi-Newton condition on the correction matrix: A_c^(k) Δg^(k) = Δx^(k) − A^(k) Δg^(k),
where Δx^(k) = x^(k+1) − x^(k) and Δg^(k) = ∇f(x^(k+1)) − ∇f(x^(k)).
Family of solutions:
A_c^(k) = [Δx^(k) y^T] / [y^T Δg^(k)] − [A^(k) Δg^(k) z^T] / [z^T Δg^(k)]   (y and z are arbitrary vectors)
i) Δx^(k−1)T Δg^(k−1) = Δx^(k−1)T g^(k) − Δx^(k−1)T g^(k−1) = −Δx^(k−1)T g^(k−1) > 0,
since the exact line search gives Δx^(k−1)T g^(k) = 0 and Δx^(k−1) lies along a descent direction.
ii) (a^T a)(b^T b) − (a^T b)^2 ≥ 0   (Schwarz inequality), with a = [A^(k−1)]^(1/2) z and b = [A^(k−1)]^(1/2) Δg^(k−1), so that
z^T A^(k) z = [(a^T a)(b^T b) − (a^T b)^2] / (b^T b) + (z^T Δx^(k−1))^2 / (Δx^(k−1)T Δg^(k−1)) > 0,
i.e., z^T A z > 0 for z ≠ 0 and the updated matrix remains positive definite.
This method has the descent property.
Δf ≈ ∇f(x^(k))^T Δx^(k) = −α^(k) ∇f(x^(k))^T A^(k) ∇f(x^(k)) < 0 for α^(k) > 0.
- Variations
McCormick (Pearson No. 2)
A^(k) = A^(k−1) + [(Δx^(k−1) − A^(k−1) Δg^(k−1)) Δx^(k−1)T] / [Δx^(k−1)T Δg^(k−1)]
Broyden–Fletcher–Shanno (BFS)
A^(k) = [I − (Δx^(k−1) Δg^(k−1)T)/(Δx^(k−1)T Δg^(k−1))] A^(k−1) [I − (Δg^(k−1) Δx^(k−1)T)/(Δx^(k−1)T Δg^(k−1))]
        + (Δx^(k−1) Δx^(k−1)T)/(Δx^(k−1)T Δg^(k−1))
Davidon–Fletcher–Powell (DFP)
A^(k) = A^(k−1) − [A^(k−1) Δg^(k−1) Δg^(k−1)T A^(k−1)] / [Δg^(k−1)T A^(k−1) Δg^(k−1)]
        + [Δx^(k−1) Δx^(k−1)T] / [Δx^(k−1)T Δg^(k−1)]
Both are rank-two updates and can be written as
A^(k) = A^(k−1) + [Δx^(k−1)  A^(k−1) Δg^(k−1)] B^(k−1) [Δx^(k−1)  A^(k−1) Δg^(k−1)]^T,
where B^(k−1) is 2x2.
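cf) A minimal sketch of a quasi-Newton loop using the DFP-form update above, with a crude backtracking line search in place of an exact one; the test function and tolerances are illustrative.

```python
import numpy as np

def dfp(f, grad, x, iters=50):
    A = np.eye(x.size)                          # initial metric A^(0) = I
    g = grad(x)
    for _ in range(iters):
        s = -A @ g                              # search direction s = -A g
        alpha = 1.0
        while f(x + alpha * s) > f(x) and alpha > 1e-12:
            alpha *= 0.5                        # backtrack until f decreases
        dx = alpha * s
        x_new = x + dx
        dg = grad(x_new) - g
        if dx @ dg > 1e-12:                     # only update when positive definiteness is preserved
            A = (A - np.outer(A @ dg, A @ dg) / (dg @ A @ dg)
                   + np.outer(dx, dx) / (dx @ dg))
        x, g = x_new, grad(x_new)
    return x

f = lambda x: (x[0] - 1)**2 + 5 * (x[1] + 2)**2
grad = lambda x: np.array([2 * (x[0] - 1), 10 * (x[1] + 2)])
print(dfp(f, grad, np.array([0.0, 0.0])))       # ~ [1, -2]
```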
- Test results
Himmelblau (1972): BFS, DFP and Powell’s direct search methods are superior.
Sargent and Sebastian (1971): BFS is best among the BFS, DFP, and FR methods.
Shanno and Phua (1980): BFS
Reklaitis (1983): FR is best among the Cauchy, FR, DFP, and BFS methods.