Unconstrained Optimization
Shirish Shevade
Computer Science and Automation
Indian Institute of Science
Bangalore 560 012, India.
min_x f(x) = (1/2) x^T H x − c^T x
where H is a symmetric positive definite matrix.
The condition number of the Hessian matrix controls the
convergence rate of the steepest descent method.
Convergence is fastest when the Hessian matrix is I.
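The effect of conditioning can be illustrated with a short pure-Python sketch (the example values are my own, not from the slides): steepest descent with exact line search on f(x) = (1/2) x^T H x for a diagonal H = diag(h1, h2), comparing condition number κ = 1 with κ = 100.

```python
# Example values are my own, not from the slides: steepest descent with
# exact line search on f(x) = (1/2)(h1*x1^2 + h2*x2^2), i.e. a diagonal
# Hessian H = diag(h1, h2) with condition number kappa = h2/h1.

def steepest_descent_iters(h1, h2, x0, tol=1e-8, max_iter=10000):
    """Number of iterations until ||grad f|| < tol."""
    x1, x2 = x0
    for k in range(max_iter):
        g1, g2 = h1*x1, h2*x2                    # gradient g = H x
        if (g1*g1 + g2*g2) ** 0.5 < tol:
            return k
        # exact line search: alpha = g^T g / g^T H g
        alpha = (g1*g1 + g2*g2) / (h1*g1*g1 + h2*g2*g2)
        x1, x2 = x1 - alpha*g1, x2 - alpha*g2
    return max_iter

iters_well = steepest_descent_iters(1.0, 1.0, (1.0, 1.0))      # kappa = 1
iters_ill  = steepest_descent_iters(1.0, 100.0, (100.0, 1.0))  # kappa = 100
```

With κ = 1 the first exact step lands on the minimizer; with κ = 100, started at the classic zig-zag point (κ, 1), the iterates contract by only about (κ−1)/(κ+1) per step, so over a thousand iterations are needed for the same tolerance.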
Let H = LL^T be the Cholesky decomposition of H.
Define y = L^T x. The function f(x) is then transformed into the function h(y):
h(y) = f(L^{-T} y)
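A minimal numerical check of this transformation (the 2x2 values of H and c are my own illustration, not from the slides): with H = LL^T and y = L^T x, h(y) = f(L^{-T} y) reduces to (1/2) y^T y − (L^{-1}c)^T y, whose Hessian is the identity.

```python
import math

# H and c are a hypothetical 2x2 example (mine, not from the slides);
# H must be symmetric positive definite for H = L L^T to exist.
H = [[4.0, 2.0], [2.0, 3.0]]
c = [1.0, -1.0]

def f(x):
    # f(x) = (1/2) x^T H x - c^T x
    hx0 = H[0][0]*x[0] + H[0][1]*x[1]
    hx1 = H[1][0]*x[0] + H[1][1]*x[1]
    return 0.5*(x[0]*hx0 + x[1]*hx1) - (c[0]*x[0] + c[1]*x[1])

# Cholesky factor L of H (lower triangular), written out for the 2x2 case
l11 = math.sqrt(H[0][0])
l21 = H[1][0] / l11
l22 = math.sqrt(H[1][1] - l21*l21)

def h(y):
    # h(y) = f(L^{-T} y): solve L^T x = y by back substitution
    x2 = y[1] / l22
    x1 = (y[0] - l21*x2) / l11
    return f([x1, x2])

# c_tilde = L^{-1} c by forward substitution; in the y variables
# h(y) = (1/2) y^T y - c_tilde^T y, so the Hessian of h is I
ct1 = c[0] / l11
ct2 = (c[1] - l21*ct1) / l22
```

In the transformed variables the condition number is 1, which is exactly the situation in which steepest descent converges fastest.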
min_x f(x)
Let f ∈ C^2 and f be bounded below.
Use second-order information to find a descent direction.
At every iteration, use the Taylor series to approximate f at x^k
by a quadratic function, and find the minimizer of this
quadratic function to get x^{k+1}.
f(x) ≈ f_q(x) = f(x^k) + g_k^T (x − x^k) + (1/2)(x − x^k)^T H_k (x − x^k)
x^{k+1} = arg min_x f_q(x)
∇f_q(x) = 0 ⇒ x^{k+1} = x^k − (H_k)^{-1} g_k (assuming H_k is invertible)
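The iteration above can be sketched in pure Python; the Rosenbrock test function and the starting point are my choices, not from the slides, and the 2x2 Newton system is solved with the explicit inverse.

```python
# A sketch of the pure Newton iteration x^{k+1} = x^k - (H_k)^{-1} g_k on the
# Rosenbrock function f(x1, x2) = (1 - x1)^2 + 100*(x2 - x1^2)^2, whose
# minimizer is (1, 1).

def grad(x1, x2):
    return (-2.0*(1.0 - x1) - 400.0*x1*(x2 - x1*x1),
            200.0*(x2 - x1*x1))

def hess(x1, x2):
    return (2.0 - 400.0*x2 + 1200.0*x1*x1,   # H11
            -400.0*x1,                       # H12 = H21
            200.0)                           # H22

x1, x2 = 1.2, 1.2                            # start near the minimizer (1, 1)
for k in range(20):
    g1, g2 = grad(x1, x2)
    h11, h12, h22 = hess(x1, x2)
    det = h11*h22 - h12*h12                  # assumes H_k is invertible
    d1 = -( h22*g1 - h12*g2) / det           # Newton direction -(H_k)^{-1} g_k
    d2 = -(-h12*g1 + h11*g2) / det
    x1, x2 = x1 + d1, x2 + d2
```

Each iterate minimizes the local quadratic model f_q exactly; close to the solution convergence is quadratic, so a handful of iterations suffice. Far from the solution H_k may be singular or indefinite, which is what the modifications later in the section address.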
[Figures: contour plots in the (x1, x2) plane]
[Figure: graph of g(x) with the starting point x0]
[Figure: graph of g(x) with the iterates x1, x2, x3, x4]
Then
|x^{k+1} − x*| ≤ (α_1/(2α_2)) |x^k − x*|^2 (order-two convergence if x^k → x*)
Note that
|x^{k+1} − x*| ≤ [(α_1/(2α_2)) |x^k − x*|] |x^k − x*|
where the bracketed factor is required to be < 1.
α_2 = min_{x ∈ (x* − η, x* + η)} g'(x)
Therefore,
(1/2) g''(x̄^k)/g'(x^k) ≤ α_1/(2α_2) = β, say.
Therefore,
lim_{k→∞} |x^k − x*| = 0
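A quick illustration of this order-two convergence (the function g(x) = x^3 − 1, with root x* = 1, is my own example, not from the slides):

```python
# My own 1-D example: Newton's iteration x^{k+1} = x^k - g(x^k)/g'(x^k)
# for the root x* = 1 of g(x) = x^3 - 1.

def g(x):
    return x**3 - 1.0

def dg(x):
    return 3.0*x*x

x, errors = 2.0, []
for k in range(8):
    errors.append(abs(x - 1.0))
    x = x - g(x) / dg(x)
# once x^k is close to x*, each error is roughly beta*(previous error)^2
```

The recorded errors fall roughly as 1, 0.42, 0.11, 0.011, 1.1e-4, … — each about the square of the previous one, as the bound predicts.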
Modifications:
Given x^k and the Newton direction d_N^k = −(H_k)^{-1} g_k:
Fix some constant δ > 0.
Find the smallest ζ_k ≥ 0 such that the smallest eigenvalue
of the matrix (H_k + ζ_k I) is greater than δ.
Then d^k = −(H_k + ζ_k I)^{-1} g_k is a descent direction.
Given x^k and d^k, use line search techniques to determine α_k and set
x^{k+1} = x^k + α_k d^k
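The modified iteration can be sketched as follows (the quartic test function, δ, and the Armijo constant are my own choices, not from the slides): shift H_k just enough that its smallest eigenvalue clears δ, then backtrack along d^k.

```python
import math

def f(x1, x2):
    # Bounded-below quartic whose Hessian is indefinite near the origin;
    # its global minimizers are (1, 1) and (-1, -1) with f = -2.
    return x1**4 + x2**4 - 4.0*x1*x2

def grad(x1, x2):
    return 4.0*x1**3 - 4.0*x2, 4.0*x2**3 - 4.0*x1

delta, c1 = 1e-3, 1e-4                       # eigenvalue floor, Armijo constant
x1, x2 = 0.2, 0.1                            # the Hessian is indefinite here
for k in range(100):
    g1, g2 = grad(x1, x2)
    h11, h12, h22 = 12.0*x1*x1, -4.0, 12.0*x2*x2
    # smallest eigenvalue of the symmetric 2x2 Hessian
    lam_min = 0.5*((h11 + h22) - math.hypot(h11 - h22, 2.0*h12))
    zeta = max(0.0, delta - lam_min)         # smallest shift with lam_min + zeta >= delta
    a11, a22 = h11 + zeta, h22 + zeta
    det = a11*a22 - h12*h12                  # (H_k + zeta_k I) is positive definite
    d1 = -( a22*g1 - h12*g2) / det           # d^k = -(H_k + zeta_k I)^{-1} g_k
    d2 = -(-h12*g1 + a11*g2) / det
    alpha, slope = 1.0, g1*d1 + g2*d2        # slope <= 0: descent direction
    while alpha > 1e-12 and f(x1 + alpha*d1, x2 + alpha*d2) > f(x1, x2) + c1*alpha*slope:
        alpha *= 0.5                         # backtracking (Armijo) line search
    x1, x2 = x1 + alpha*d1, x2 + alpha*d2
```

From this start the shift ζ_k is active and the first direction is close to a scaled negative gradient; once the iterates reach a region where H_k is positive definite, ζ_k = 0 and the pure Newton step is recovered, converging here to the minimizer (1, 1).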