

10.2 CONSTRUCTION OF THE INVERSE


The fundamental idea behind most quasi-Newton methods is to try to construct
the inverse Hessian, or an approximation of it, using information gathered as the
descent process progresses. The current approximation $H_k$ is then used at each stage
to define the next descent direction by setting $S_k = H_k$ in the modified Newton
method. Ideally, the approximations converge to the inverse of the Hessian at the
solution point and the overall method behaves somewhat like Newton's method.
In this section we show how the inverse Hessian can be built up from gradient
information obtained at various points.
Let $f$ be a function on $E^n$ that has continuous second partial derivatives.
If for two points $x_{k+1}$, $x_k$ we define $g_{k+1} = \nabla f(x_{k+1})^T$, $g_k = \nabla f(x_k)^T$ and $p_k = x_{k+1} - x_k$, then

$$g_{k+1} - g_k \simeq F(x_k)\, p_k. \tag{6}$$

If the Hessian, $F$, is constant, then we have

$$q_k \equiv g_{k+1} - g_k = F p_k, \tag{7}$$

and we see that evaluation of the gradient at two points gives information about $F$.
If $n$ linearly independent directions $p_0, p_1, p_2, \ldots, p_{n-1}$ and the corresponding $q_k$'s
are known, then $F$ is uniquely determined. Indeed, letting $P$ and $Q$ be the $n \times n$
matrices with columns $p_k$ and $q_k$ respectively, we have

$$F = Q P^{-1}. \tag{8}$$
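As a concrete check of (7) and (8), the following sketch (with illustrative names; the use of NumPy is an assumption, not part of the text) builds a fixed symmetric $F$, forms the matrices $P$ and $Q$ from $n$ independent directions, and recovers $F$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# A fixed symmetric F, as in the constant-Hessian case of (7).
A = rng.standard_normal((n, n))
F = A + A.T

# n linearly independent directions p_0, ..., p_{n-1} as columns of P.
P = np.eye(n) + 0.1 * rng.standard_normal((n, n))
Q = F @ P                      # column k satisfies q_k = F p_k, i.e. (7)

# Recover F from step/gradient-difference data alone, as in (8).
assert np.allclose(Q @ np.linalg.inv(P), F)
```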

It is natural to attempt to construct successive approximations $H_k$ to $F^{-1}$ based
on data obtained from the first $k$ steps of a descent process, in such a way that if
$F$ were constant the approximation would be consistent with (7) for these steps.
Specifically, if $F$ were constant, $H_{k+1}$ would satisfy

$$H_{k+1} q_i = p_i, \qquad 0 \le i \le k. \tag{9}$$

After $n$ linearly independent steps we would then have $H_n = F^{-1}$.


For any $k < n$ the problem of constructing a suitable $H_k$, which in general serves as
an approximation to the inverse Hessian and which in the case of constant $F$ satisfies (9),
admits an infinity of solutions, since there are more degrees of freedom than there are
constraints. Thus a particular method can take into account additional considerations.
We discuss below one of the simplest schemes that has been proposed.

Rank One Correction


Since $F$ and $F^{-1}$ are symmetric, it is natural to require that $H_k$, the approximation
to $F^{-1}$, be symmetric. We investigate the possibility of defining a recursion of
the form

$$H_{k+1} = H_k + a_k z_k z_k^T, \tag{10}$$

which preserves symmetry. The vector $z_k$ and the constant $a_k$ define a matrix of
(at most) rank one, by which the approximation to the inverse is updated. We select
them so that (9) is satisfied. Setting $i$ equal to $k$ in (9) and substituting (10) we obtain

$$p_k = H_{k+1} q_k = H_k q_k + a_k z_k z_k^T q_k. \tag{11}$$

Taking the inner product with $q_k$ we have

$$q_k^T p_k - q_k^T H_k q_k = a_k \left( z_k^T q_k \right)^2. \tag{12}$$

On the other hand, using (11) we may write (10) as

$$H_{k+1} = H_k + \frac{(p_k - H_k q_k)(p_k - H_k q_k)^T}{a_k \left( z_k^T q_k \right)^2},$$

which in view of (12) leads finally to

$$H_{k+1} = H_k + \frac{(p_k - H_k q_k)(p_k - H_k q_k)^T}{q_k^T (p_k - H_k q_k)}. \tag{13}$$
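In code, a direct transcription of the correction (13) might read as follows (a minimal sketch; the function name and NumPy usage are my own, not the text's):

```python
import numpy as np

def rank_one_update(H, p, q):
    """One rank one correction of the inverse-Hessian approximation,
    a direct transcription of (13)."""
    v = p - H @ q                        # p_k - H_k q_k
    return H + np.outer(v, v) / (q @ v)  # denominator q_k^T (p_k - H_k q_k)
```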

We have determined what a rank one correction must be if it is to satisfy (9)
for $i = k$. It remains to be shown that, for the case where $F$ is constant, (9) is also
satisfied for $i < k$. This in turn will imply that the rank one recursion converges to
$F^{-1}$ after at most $n$ steps.
Theorem. Let $F$ be a fixed symmetric matrix and suppose that $p_0, p_1, p_2, \ldots, p_k$
are given vectors. Define the vectors $q_i = F p_i$, $i = 0, 1, 2, \ldots, k$.
Starting with any initial symmetric matrix $H_0$, let

$$H_{i+1} = H_i + \frac{(p_i - H_i q_i)(p_i - H_i q_i)^T}{q_i^T (p_i - H_i q_i)}. \tag{14}$$

Then

$$p_i = H_{k+1} q_i \qquad \text{for } i \le k. \tag{15}$$

Proof. The proof is by induction. Suppose it is true for $H_k$ and $i \le k - 1$. The
relation was shown above to be true for $H_{k+1}$ and $i = k$. For $i < k$,

$$H_{k+1} q_i = H_k q_i + y_k \left( p_k^T q_i - q_k^T H_k q_i \right), \tag{16}$$

where

$$y_k = \frac{p_k - H_k q_k}{q_k^T (p_k - H_k q_k)}.$$
By the induction hypothesis, (16) becomes

$$H_{k+1} q_i = p_i + y_k \left( p_k^T q_i - q_k^T p_i \right).$$

From the calculation

$$q_k^T p_i = p_k^T F p_i = p_k^T q_i,$$

it follows that the second term vanishes.
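The theorem can be checked numerically under its hypotheses; in this sketch the directions are drawn at random, which generically keeps them linearly independent and the denominators in (14) away from zero (all names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
A = rng.standard_normal((n, n))
F = A + A.T + n * np.eye(n)            # a fixed symmetric matrix F
H = np.eye(n)                          # any initial symmetric H_0

for _ in range(n):
    p = rng.standard_normal(n)         # next direction p_i
    q = F @ p                          # q_i = F p_i
    v = p - H @ q
    H = H + np.outer(v, v) / (q @ v)   # the recursion (14)

assert np.allclose(H, np.linalg.inv(F))   # H_n = F^{-1}
```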


To incorporate the approximate inverse Hessian in a descent procedure while
simultaneously improving it, we calculate the direction $d_k$ from

$$d_k = -H_k g_k$$

and then minimize $f(x_k + \alpha d_k)$ with respect to $\alpha \ge 0$. This determines $x_{k+1} =
x_k + \alpha_k d_k$, $p_k = \alpha_k d_k$, and $g_{k+1}$. Then $H_{k+1}$ can be calculated according to (13).
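Assembled into a loop, the procedure might be sketched as below; the bounded one-dimensional search via scipy.optimize.minimize_scalar, the iteration cap, and all identifiers are illustrative assumptions rather than part of the text:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def rank_one_descent(f, grad, x0, iters=20):
    """Descent using d_k = -H_k g_k with the rank one correction (13).
    A bare sketch: no convergence test and no safeguards."""
    x = np.asarray(x0, dtype=float)
    H = np.eye(len(x))                 # initial symmetric H_0
    g = grad(x)
    for _ in range(iters):
        d = -H @ g
        # minimize f(x + alpha d) over alpha >= 0 (bounded 1-D search)
        alpha = minimize_scalar(lambda a: f(x + a * d),
                                bounds=(0.0, 10.0), method="bounded").x
        p = alpha * d                  # p_k = alpha_k d_k
        x = x + p
        g_new = grad(x)
        q = g_new - g                  # q_k = g_{k+1} - g_k
        v = p - H @ q
        H = H + np.outer(v, v) / (q @ v)   # the correction (13)
        g = g_new
    return x
```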
There are some difficulties with this simple rank one procedure. First, the
updating formula (13) preserves positive definiteness only if $q_k^T (p_k - H_k q_k) > 0$,
which cannot be guaranteed (see Exercise 6). Also, even if $q_k^T (p_k - H_k q_k)$ is positive,
it may be small, which can lead to numerical difficulties. Thus, although it is an excellent
simple example of how information gathered during the descent process can in
principle be used to update an approximation to the inverse Hessian, the rank one
method possesses some limitations.
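One common safeguard, drawn from standard quasi-Newton practice rather than from the text, is to skip the update whenever the denominator in (13) is small relative to the vectors involved:

```python
import numpy as np

def safe_rank_one_update(H, p, q, r=1e-8):
    """Apply (13) only when q^T (p - H q) is not negligibly small;
    otherwise leave H unchanged. r is a small relative tolerance."""
    v = p - H @ q
    denom = q @ v
    if abs(denom) >= r * np.linalg.norm(q) * np.linalg.norm(v):
        return H + np.outer(v, v) / denom
    return H
```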

10.3 DAVIDON–FLETCHER–POWELL METHOD


The earliest scheme for constructing the inverse Hessian, and certainly one of the
cleverest, was originally proposed by Davidon and later developed by Fletcher and
Powell. It has the fascinating and desirable property that, for a quadratic objective,
it simultaneously generates the directions of the conjugate gradient method while
constructing the inverse Hessian. At each step the inverse Hessian is updated
by the sum of two symmetric rank one matrices, and this scheme is therefore
often referred to as a rank two correction procedure. The method is also often
referred to as the variable metric method, the name originally suggested by Davidon.

The procedure is this: Starting with any symmetric positive definite matrix $H_0$,
any point $x_0$, and with $k = 0$,
Step 1. Set $d_k = -H_k g_k$.

Step 2. Minimize $f(x_k + \alpha d_k)$ with respect to $\alpha \ge 0$ to obtain $x_{k+1}$, $p_k = \alpha_k d_k$,
and $g_{k+1}$.

Step 3. Set $q_k = g_{k+1} - g_k$ and

$$H_{k+1} = H_k + \frac{p_k p_k^T}{p_k^T q_k} - \frac{H_k q_k q_k^T H_k}{q_k^T H_k q_k}. \tag{17}$$

Update $k$ and return to Step 1.
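A minimal sketch of the whole procedure with the update (17) follows; the bounded line search via scipy.optimize.minimize_scalar, the stopping rule, and all identifiers are illustrative assumptions, not a definitive implementation:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def dfp(f, grad, x0, iters=50, tol=1e-8):
    """Davidon-Fletcher-Powell method, following Steps 1-3 and (17)."""
    x = np.asarray(x0, dtype=float)
    H = np.eye(len(x))                 # symmetric positive definite H_0
    g = grad(x)
    for _ in range(iters):
        d = -H @ g                                       # Step 1
        alpha = minimize_scalar(lambda a: f(x + a * d),
                                bounds=(0.0, 10.0), method="bounded").x
        p = alpha * d                                    # Step 2
        x = x + p
        g_new = grad(x)
        q = g_new - g                                    # Step 3: update (17)
        H = (H + np.outer(p, p) / (p @ q)
               - (H @ np.outer(q, q) @ H) / (q @ H @ q))
        g = g_new
        if np.linalg.norm(g) < tol:
            break
    return x
```

On a quadratic such as $f(x) = \frac{1}{2} x^T Q x - b^T x$ with $Q$ positive definite, this loop should, up to line-search accuracy, terminate in about $n$ iterations, consistent with the conjugate-direction property claimed above.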
