Professional Documents
Culture Documents
Unconstrained Optimization
Shirish Shevade
Computer Science and Automation
Indian Institute of Science
Bangalore 560 012, India.
min f (x)
x
2
Let f ∈ C and f be bounded below.
Newton method uses quadratic approximation of f at a
given point, xk
T 1
f (x) ≈ fqk (x) = f (xk ) + gk (x − xk ) + (x − xk )T Hk (x − xk )
2
xk+1 is the minimizer of fqk (x)
What is the region of trust in which this approximation is
reliable?
Ωk = {x : kx − xk k ≤ ∆k }
q (x
k k+1 )
q
T 1 −1
yk (x) = f (xk ) + gk (x − xk ) + (x − xk )T Bk (x − xk )
2
−1
(Bk ) is either Hk or its approximation
xk+1 = xk + αk dkQN = xk − αk Bk gk
Given xk , xk+1 , gk , gk+1 and Bk , how to update Bk to get a
symmetric positive definite matrix Bk+1 ?
Is Bk ≈ (Hk )−1 ?
Are there any conditions that Bk+1 should satisfy?
T 1
yk+1 (x) = f (xk+1 )+gk+1 (x−xk+1 )+ (x−xk+1 )T (Bk+1 )−1 (x−xk+1 )
2
Require
Therefore, we require,
k+1 Bk+1 γ k = δ k
B should be positive definite
T T
γ k Bk+1 γ k = γ k δ k > 0 ∀ γ k 6= 0
∴ (Bk + αuuT )γ k = δ k
∴ αuT γ k u = δ k − Bk γ k
Let u = δ k − Bk γ k .
T
Therefore, αuT γ k = 1 gives α−1 = δ k − Bk γ k γ k .
T
δ k − Bk γ k δ k − Bk γ k
∴ Bk+1
SR1
k
=B + T
δ k − Bk γ k γ k
1
x2
−1
−2
−3
−3 −2 −1 0 1 2 3
x1
1
x2
−1
−2
−3
−3 −2 −1 0 1 2 3
x1
1
x2
−1
−2
−3
−3 −2 −1 0 1 2 3
x1
∗ T 8 −2
x = (0, 0) , H = ,
−2 2
−1 0.1667 0.1667
H =
0.1667 0.6667
k = 0, B1 γ 0 = δ 0
k = 1, B2 γ 1 = δ 1
k = 2, B3 γ 2 = δ 2
.. ..
. .
k = n − 1, Bn γ n−1 = δ n−1
k = 0, B1 γ 0 = δ 0
k = 1, B2 γ 1 = δ 1 , B2 γ 0 = δ 0
k = 2, B3 γ 2 = δ 2 , B3 γ 1 = δ 1 , B3 γ 0 = δ 0
.. ..
. .
k = n − 1, Bn γ n−1 = δ n−1 , Bn γ n−2 = δ n−2 , . . . Bn γ 0 = δ 0
Hereditary Property
Bk γ j = δ j , j = 0, . . . , k − 1
Proof.
Note that Hδ k = γ k ∀ k.
For k = 1, B1 γ 0 = δ 0 . (Quasi-Newton condition)
k j j
Suppose B γ = δ , j = 0, . . . , k − 1.
Using rank-one correction and using j = 0, . . . , k − 1,
T
δ k − Bk γ k δ k − Bk γ k
k+1 k
B = B + T
δ k − Bk γ k γ k
(δ k − Bk γ k )((δ k − Bk γ k )T
k+1 j k
∴B γ = B + γj
(δ k − Bk γ k )T γ k
Shirish Shevade Numerical Optimization
Proof. (continued)
δ k − Bk γ k
T
∴B k+1
γ j
= Bγ +k j
T δ k − Bk γ k γj
δ k − Bk γ k γ k
δ k − Bk γ k
T T
k j
= Bγ + T δ k γ j − γ k Bk γ j
δ k − Bk γ k γ k
k k k
δ − B γ T T
= Bk γ j + T δ k Hδ j − δ k Hδ j
δ k − Bk γ k γ k
= Bk γ j
= δ j ∀ j = 0, . . . , k − 1
∆ 1
min f (x) = xT Hx + cT x
x 2
where H is a symmetric positive definite matrix.
If the rank-one correction is well defined and δ 0 , δ 1 , . . . , δ n−1
are linearly independent, then the rank-one correction method
applied to minimize f (x) terminates in at most n + 1 iterations,
with Bn = H−1 .
T T T
δ k γ k = δ k (gk+1 − gk ) = −gk δ k = αk gk Bk gk > 0
xT Bk+1
DFP x > 0, x 6= 0
T
We have already shown that δ k γ k > 0.
Suppose xT Bk+1
DFP x = 0, x 6= 0.
T
Therefore, (aT a)(bT b) = (aT b)2 and (δ k x)2 = 0.
T T T
(δ k x)2 = 0 ⇒ µδ k γ k = 0 ⇒ δ k γ k = 0 (contradiction)
Therefore, xT Bk+1 k+1
DFP x > 0, x 6= 0 ⇒ BDFP is positive definite.
Shirish Shevade Numerical Optimization
Quasi-Newton Algorithm (DFP Method)
(1) Initialize x0 , and symmetric positive definite B0 , set
k := 0.
(2) while kgk k >
(a) dk = −Bk gk
(b) Find αk (> 0) along dk such that
(i) f (xk + αk dk ) < f (xk )
(ii) αk satisfies Armijo-Wolfe (or Armijo-Goldstein)
conditions
(c) xk+1 = xk + αk dk
(d) Find Bk+1 using DFP method
(e) k := k + 1
endwhile
Output : x∗ = xk , a stationary point of f (x).
1
x2
−1
−2
−3
−3 −2 −1 0 1 2 3
x1
1
x2
−1
−2
−3
−3 −2 −1 0 1 2 3
x1
1.2
0.8
0.6
x2
0.4
0.2
−0.2
−0.4
−1 −0.5 0 0.5 1
x1
0.8
0.6
0.4
x2
0.2
−0.2
−0.4
T T
δk δk
Bk γ k γ k Bk
Bk+1
DFP
k
=B + T − (DFP Method)
δk γ k γ k T Bk γ k
T T
γkγk Gk δ k δ k Gk
Gk+1
BFGS
k
=G + T k − T (BFGS Method)
γk δ δ k Gk δ k
Shirish Shevade Numerical Optimization
T T
γkγk Gk δ k δ k Gk
Gk+1
BFGS
k
=G + T k − T (BFGS Method)
γk δ δ k Gk δ k
Bk+1 k+1
BFGS GBFGS = I
to get
γ T Bγ δδ T δγ T B + Bγδ T
Bk+1
BFGS =B+ 1+ T −
δ γ δT γ δT γ
1.2
0.8
0.6
x2
0.4
0.2
−0.2
−0.4
−1 −0.5 0 0.5 1
x1
0.8
0.6
0.4
x2
0.2
−0.2
−0.4
DFP method
T T
δk δk
Bk γ k γ k Bk
Bk+1
DFP =B + T −k
δk γ k γ k T Bk γ k
BGFS method
γ T Bγ δδ T δγ T B + Bγδ T
Bk+1
BFGS =B+ 1+ T −
δ γ δT γ δT γ
Broyden Family