
1 Unconstrained optimization problems:

2 Conjugate Gradient Method


Remark 1 The solution $x^*$ of the system of linear equations $Qx = b$ minimizes the quadratic form
$$f(x) = \frac{1}{2} x^T Q x - x^T b + c$$
with $Q$ a symmetric positive definite matrix (the solution of the system solves the optimization problem).
Proof. Let $x^*$ be the vector that satisfies $Qx^* = b$ and let $e$ be any nonzero vector in $\mathbb{R}^n$. Then
\begin{align*}
f(x^* + e) &= \frac{1}{2}(x^* + e)^T Q (x^* + e) - (x^* + e)^T b + c \\
&= \frac{1}{2}\big((x^*)^T + e^T\big)\big(Q x^* + Q e\big) - (x^*)^T b - e^T b + c \\
&= \frac{1}{2}(x^*)^T Q x^* + \frac{1}{2}\big[(x^*)^T Q e + e^T Q x^*\big] + \frac{1}{2} e^T Q e - (x^*)^T b - e^T b + c \\
&= \Big[\frac{1}{2}(x^*)^T Q x^* - (x^*)^T b + c\Big] + \frac{1}{2}\big[(x^*)^T Q e + e^T Q x^*\big] + \frac{1}{2} e^T Q e - e^T b \\
&= f(x^*) + e^T b + \frac{1}{2} e^T Q e - e^T b \\
&= f(x^*) + \frac{1}{2} e^T Q e.
\end{align*}
Note that $(x^*)^T Q e = \big((x^*)^T Q e\big)^T = e^T Q^T x^* = e^T Q x^* = e^T b$, so that $\frac{1}{2}\big[(x^*)^T Q e + e^T Q x^*\big] = \frac{1}{2}(2 e^T b) = e^T b$. Since $Q$ is positive definite, $\frac{1}{2} e^T Q e > 0$, and therefore $f(x^*) < f(x^* + e)$ for every nonzero vector $e$ in $\mathbb{R}^n$.


The identity above is also how we express a quadratic function $f$ in a form built around $x^*$ (its minimum point): substituting $e = x - x^*$ gives $f(x) = f(x^*) + \frac{1}{2}(x - x^*)^T Q (x - x^*)$.
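This claim is easy to check numerically. Below is a minimal sketch (assuming NumPy is available; the variable names are illustrative) that builds a random symmetric positive definite $Q$, solves $Qx = b$, and confirms that perturbing the solution in any direction increases $f$ by exactly $\frac{1}{2}e^T Q e$:

```python
import numpy as np

rng = np.random.default_rng(0)

# A random symmetric positive definite Q and right-hand side b.
A = rng.standard_normal((5, 5))
Q = A @ A.T + 5 * np.eye(5)      # A A^T + 5I is symmetric positive definite
b = rng.standard_normal(5)
c = 2.0

def f(x):
    return 0.5 * x @ Q @ x - x @ b + c

x_star = np.linalg.solve(Q, b)   # the solution of Qx = b

# f(x* + e) = f(x*) + (1/2) e^T Q e > f(x*) for every nonzero e.
for _ in range(100):
    e = rng.standard_normal(5)
    assert np.isclose(f(x_star + e) - f(x_star), 0.5 * e @ Q @ e)
    assert f(x_star + e) > f(x_star)
print("f is minimized exactly at the solution of Qx = b")
```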
Theorem 2 Let $x^{(0)}$ be any starting vector in $\mathbb{R}^n$. The basic conjugate direction algorithm converges to the unique $x^*$ (that solves $Qx = b$) in $n$ steps; that is, $x^{(n)} = x^*$.

Proof. Consider $x^* - x^{(0)} \in \mathbb{R}^n$. The vectors $\{d_0, d_1, d_2, \ldots, d_{n-1}\}$ are linearly independent in $\mathbb{R}^n$. Hence
$$x^* - x^{(0)} = \beta_0 d_0 + \beta_1 d_1 + \beta_2 d_2 + \cdots + \beta_{n-1} d_{n-1}.$$
Premultiplying both sides of the above equation by $d_k^T Q$, $0 \le k < n$, we have
$$d_k^T Q (x^* - x^{(0)}) = d_k^T Q (\beta_0 d_0 + \beta_1 d_1 + \cdots + \beta_{n-1} d_{n-1}) = \beta_k\, d_k^T Q d_k,$$
where the terms $d_k^T Q d_j = 0$, $k \ne j$, by the $Q$-conjugacy property. Hence $d_k^T Q (x^* - x^{(0)}) = \beta_k\, d_k^T Q d_k$ gives
$$\beta_k = \frac{d_k^T Q (x^* - x^{(0)})}{d_k^T Q d_k} \qquad (\text{indeed } d_k^T Q d_k > 0).$$
Note that
$$x^{(1)} = x^{(0)} + \alpha_0 d_0 \quad \text{and} \quad x^{(2)} = x^{(1)} + \alpha_1 d_1 = x^{(0)} + \alpha_0 d_0 + \alpha_1 d_1,$$
$$x^{(3)} = x^{(2)} + \alpha_2 d_2 = x^{(0)} + \alpha_0 d_0 + \alpha_1 d_1 + \alpha_2 d_2.$$
Continuing,
$$x^{(k)} = x^{(0)} + \alpha_0 d_0 + \alpha_1 d_1 + \alpha_2 d_2 + \cdots + \alpha_{k-1} d_{k-1}.$$
Therefore
$$x^{(k)} - x^{(0)} = \alpha_0 d_0 + \alpha_1 d_1 + \alpha_2 d_2 + \cdots + \alpha_{k-1} d_{k-1}.$$
Thus
$$x^* - x^{(0)} = (x^* - x^{(k)}) + (x^{(k)} - x^{(0)}).$$
Premultiplying both sides of the above equation by $d_k^T Q$, we have
\begin{align*}
d_k^T Q (x^* - x^{(0)}) &= d_k^T Q (x^* - x^{(k)}) + d_k^T Q (x^{(k)} - x^{(0)}) \\
&= d_k^T Q (x^* - x^{(k)}) + d_k^T Q (\alpha_0 d_0 + \alpha_1 d_1 + \cdots + \alpha_{k-1} d_{k-1}) \\
&= d_k^T [Q x^* - Q x^{(k)}] = -d_k^T g_k,
\end{align*}
because $g_k = Q x^{(k)} - b$, $Q x^* = b$, and $d_k^T Q (\alpha_0 d_0 + \alpha_1 d_1 + \cdots + \alpha_{k-1} d_{k-1}) = 0$ by $Q$-conjugacy. Thus
$$\beta_k = \frac{d_k^T Q (x^* - x^{(0)})}{d_k^T Q d_k} = -\frac{d_k^T g_k}{d_k^T Q d_k} = \alpha_k.$$
Hence
\begin{align*}
x^{(n)} &= x^{(n-1)} + \alpha_{n-1} d_{n-1} \\
&= x^{(0)} + \alpha_0 d_0 + \alpha_1 d_1 + \alpha_2 d_2 + \cdots + \alpha_{n-1} d_{n-1} \\
&= x^{(0)} + \beta_0 d_0 + \beta_1 d_1 + \beta_2 d_2 + \cdots + \beta_{n-1} d_{n-1} \\
&= x^{(0)} + x^* - x^{(0)} = x^*.
\end{align*}
Thus $x^{(n)}$ solves $Qx = b$; that is, $\nabla f(x^{(n)}) = 0$, and hence we have
$$f(x^{(n)}) = \min_{x \in \mathbb{R}^n} f(x).$$
For a quadratic function of $n$ variables, the conjugate direction method reaches the solution after $n$ steps.

Remark 3 The basic idea is that the minimization over $\mathbb{R}^n$ of the quadratic function $f(x) = \frac{1}{2} x^T Q x - x^T b + c$, with $Q$ symmetric positive definite, can be split into $n$ minimizations over $\mathbb{R}$. This is done with the help of $n$ directions conjugate with respect to $Q$; along each direction an exact line search is performed in closed form.
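As a concrete illustration of this splitting, here is a minimal sketch (assuming NumPy; the use of eigenvectors of $Q$ is just one convenient way to obtain $Q$-conjugate directions, not the method developed later in this lecture) that performs $n$ one-dimensional exact line searches and lands on the minimizer, as Theorem 2 predicts:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n))
Q = A @ A.T + n * np.eye(n)          # symmetric positive definite
b = rng.standard_normal(n)

# The eigenvectors of a symmetric Q are one convenient set of Q-conjugate
# directions: v_i^T Q v_j = lambda_j * v_i^T v_j = 0 for i != j.
_, V = np.linalg.eigh(Q)
directions = V.T                     # each row is a direction d_k

x = np.zeros(n)                      # starting vector x^(0)
for d in directions:
    g = Q @ x - b                    # gradient at the current iterate
    alpha = -(g @ d) / (d @ Q @ d)   # closed-form exact line search
    x = x + alpha * d                # one minimization over R

# After n one-dimensional searches, x solves Qx = b (up to round-off).
assert np.allclose(x, np.linalg.solve(Q, b))
print("minimizer reached in", n, "one-dimensional searches")
```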

Remark 4 Suppose that we start at $x^{(0)}$ and search in the direction $d_0$ to obtain
$$x^{(1)} = x^{(0)} + \alpha_0 d_0,$$
where the step size is calculated with the help of the formula
$$\alpha_0 = -\frac{\langle g^{(0)}, d_0 \rangle}{\langle d_0, d_0 \rangle_Q}.$$
We claim that
$$\langle \nabla f(x^{(1)}), d_0 \rangle = \langle g^{(1)}, d_0 \rangle = 0.$$
To see this,
\begin{align*}
\langle g^{(1)}, d_0 \rangle &= (g^{(1)})^T d_0 = (Q x^{(1)} - b)^T d_0 \\
&= (x^{(1)})^T Q d_0 - b^T d_0 \\
&= (x^{(0)} + \alpha_0 d_0)^T Q d_0 - b^T d_0 \\
&= (x^{(0)})^T Q d_0 + \alpha_0\, d_0^T Q d_0 - b^T d_0 \\
&= (x^{(0)})^T Q d_0 - \frac{\langle g^{(0)}, d_0 \rangle}{\langle d_0, d_0 \rangle_Q}\, d_0^T Q d_0 - b^T d_0 \\
&= (x^{(0)})^T Q d_0 - \langle g^{(0)}, d_0 \rangle - b^T d_0 \\
&= \big[(x^{(0)})^T Q - b^T\big] d_0 - \langle g^{(0)}, d_0 \rangle = \big[Q x^{(0)} - b\big]^T d_0 - \langle g^{(0)}, d_0 \rangle \\
&= (g^{(0)})^T d_0 - \langle g^{(0)}, d_0 \rangle = 0.
\end{align*}
The equation $\langle g^{(1)}, d_0 \rangle = 0$ implies that $\alpha_0$ has the property that
$$\alpha_0 = \arg\min_{\alpha} f(x^{(0)} + \alpha d_0) = \arg\min_{\alpha} \phi_0(\alpha);$$
that is,
$$f(x^{(1)}) = f(x^{(0)} + \alpha_0 d_0) = \min_{\alpha \in \mathbb{R}} f(x^{(0)} + \alpha d_0),$$
where
$$\phi_0(\alpha) = f(x^{(0)} + \alpha d_0).$$
That is, the value $\alpha_0$ is the exact minimizer of $\phi_0(\alpha) = f(x^{(0)} + \alpha d_0)$. Apply the chain rule to obtain
$$\phi_0'(\alpha) = \nabla f(x^{(0)} + \alpha d_0)^T d_0.$$
Evaluating at $\alpha_0$, we have
$$\phi_0'(\alpha_0) = \nabla f(x^{(0)} + \alpha_0 d_0)^T d_0 = \nabla f(x^{(1)})^T d_0 = \langle g^{(1)}, d_0 \rangle = 0.$$
Since $\phi_0$ is a quadratic function of $\alpha$ (see the remark at the start of this lecture, replacing $e$ by $\alpha d_0$ and $x^*$ by $x^{(0)}$) and the coefficient of the $\alpha^2$ term in $\phi_0$ is $\frac{1}{2} d_0^T Q d_0 > 0$, the above implies that
$$\alpha_0 = \arg\min_{\alpha \in \mathbb{R}} \phi_0(\alpha).$$
Using similar arguments, we can show that
$$\langle \nabla f(x^{(k+1)}), d_k \rangle = \langle g^{(k+1)}, d_k \rangle = 0$$
and
$$\alpha_k = \arg\min_{\alpha \in \mathbb{R}} \phi_k(\alpha).$$

Proposition 5 Let $d_0, d_1, d_2, \ldots, d_{n-1}$ be $n$ mutually conjugate directions with respect to a positive definite symmetric matrix $Q$ in $\mathbb{R}^n$ and let $x^{(0)}$ be any starting vector in $\mathbb{R}^n$. Let $x^{(1)}, x^{(2)}, x^{(3)}, \ldots, x^{(n)}$ be recursively defined by
$$x^{(k+1)} = x^{(k)} + \alpha_k d_k,$$
where the step size at each step is calculated with the help of the formula
$$\alpha_k = -\frac{\langle g_k, d_k \rangle}{\langle d_k, d_k \rangle_Q}.$$
Then
$$f(x^{(k+1)}) = f(x^{(k)} + \alpha_k d_k) = \min_{\alpha \in \mathbb{R}} f(x^{(k)} + \alpha d_k) = \min_{\alpha \in \mathbb{R}} \phi_k(\alpha),$$
where
$$\phi_k(\alpha) = f(x^{(k)} + \alpha d_k).$$
That is, the value $\alpha_k$ is the exact minimizer of $\phi_k(\alpha) = f(x^{(k)} + \alpha d_k)$; the step size is found with the help of an exact line search. Note that
$$g_k = \nabla f(x^{(k)}) = Q x^{(k)} - b.$$
With the search direction $d_k$ chosen, we need to compute the step size $\alpha_k$. To this end, we consider
$$\phi(\alpha) = f(x^{(k)} + \alpha d_k) = \frac{1}{2}(x^{(k)} + \alpha d_k)^T Q (x^{(k)} + \alpha d_k) - (x^{(k)} + \alpha d_k)^T b.$$
This is a quadratic function of $\alpha$. Now
$$\phi'(\alpha) = \nabla f(x^{(k)} + \alpha d_k)^T d_k = [Q(x^{(k)} + \alpha d_k) - b]^T d_k = (x^{(k)})^T Q d_k + \alpha\, d_k^T Q d_k - b^T d_k,$$
and $\phi''(\alpha) = d_k^T Q d_k > 0$. Its minimizer $\alpha_k$ is the value of $\alpha$ where $\phi'(\alpha) = \frac{d}{d\alpha} f(x^{(k)} + \alpha d_k)$ vanishes. Therefore
$$\left. \frac{d}{d\alpha} f(x^{(k)} + \alpha d_k) \right|_{\alpha = \alpha_k} = \nabla f(x^{(k+1)})^T d_k = (g_{k+1})^T d_k = \langle g_{k+1}, d_k \rangle = 0.$$
Note that $\phi'(\alpha_k) = (x^{(k)})^T Q d_k + \alpha_k\, d_k^T Q d_k - b^T d_k = 0$ gives
$$\alpha_k = \frac{b^T d_k - (x^{(k)})^T Q d_k}{d_k^T Q d_k} = -\frac{\big[(x^{(k)})^T Q - b^T\big] d_k}{d_k^T Q d_k} = -\frac{\nabla f(x^{(k)})^T d_k}{d_k^T Q d_k} = -\frac{\langle g_k, d_k \rangle}{d_k^T Q d_k}.$$
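In code, this closed-form exact line search is a one-liner. A minimal sketch (assuming NumPy; the function name exact_step is illustrative):

```python
import numpy as np

def exact_step(Q, b, x, d):
    """Exact line search for f(x) = 0.5 x^T Q x - x^T b along direction d:
    alpha_k = -<g_k, d_k> / <d_k, d_k>_Q."""
    g = Q @ x - b                    # g_k = grad f(x^(k))
    return -(g @ d) / (d @ Q @ d)

# Usage check: after the step, the new gradient is orthogonal to d.
Q = np.array([[2.0, 0.0], [0.0, 10.0]])
b = np.array([1.0, 1.0])
x = np.array([3.0, -2.0])
d = -(Q @ x - b)                     # steepest-descent direction at x
x_new = x + exact_step(Q, b, x, d) * d
print(np.dot(Q @ x_new - b, d))      # ~ 0, i.e. <g_{k+1}, d_k> = 0
```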

The following is an even stronger condition satisfied by $g_{k+1}$.

Proposition 6 In the conjugate direction algorithm,
$$g_{k+1}^T d_j = 0$$
for all $k$, $0 \le k \le n-1$, and $0 \le j \le k$. That is, $g_{k+1}$ is orthogonal to each of the directions $d_0, d_1, \ldots, d_k$.

Proof. The result is true for $k = 0$; that is, $g_1^T d_0 = 0$. Indeed, we know that $x^{(k+1)} = x^{(k)} + \alpha_k d_k$, where $\alpha_k$ is found such that $f(x^{(k+1)}) = \min_{\alpha \in \mathbb{R}} f(x^{(k)} + \alpha d_k) = \min_{\alpha \in \mathbb{R}} \phi(\alpha)$, where $\phi(\alpha) = f(x^{(k)} + \alpha d_k)$; that is, $\alpha_k$ is chosen to satisfy $f(x^{(k)} + \alpha_k d_k) = \min_{\alpha \in \mathbb{R}} f(x^{(k)} + \alpha d_k)$. So we must have $\phi'(\alpha)\big|_{\alpha = \alpha_k} = 0$. Thus $g_{k+1}^T d_k = 0$ for every $k$, and in particular $g_1^T d_0 = 0$.

Assume that the result is true for $k - 1$ (that is, $g_k^T d_j = 0$, $0 \le j \le k-1$); we will show that it is true for $k$ (that is, $g_{k+1}^T d_j = 0$, $0 \le j \le k$). Note that
$$x^{(k+1)} = x^{(k)} + \alpha_k d_k, \qquad x^{(k+1)} - x^{(k)} = \alpha_k d_k.$$
Now,
$$Q x^{(k+1)} - Q x^{(k)} = [Q x^{(k+1)} - b] - [Q x^{(k)} - b],$$
so $Q[x^{(k+1)} - x^{(k)}] = g^{(k+1)} - g^{(k)}$, and $Q[\alpha_k d_k] = g^{(k+1)} - g^{(k)}$ gives
$$g^{(k+1)} = g^{(k)} + \alpha_k Q d_k, \qquad [g^{(k+1)}]^T = [g^{(k)}]^T + \alpha_k d_k^T Q.$$
Now, for $0 \le j < k$,
$$[g^{(k+1)}]^T d_j = [g^{(k)}]^T d_j + \alpha_k\, d_k^T Q d_j = 0 + 0.$$
The first term on the RHS is zero by the inductive hypothesis and the second term is zero by $Q$-conjugacy. We also know that $g_{k+1}^T d_k = 0$ (by the exact line search, as above). Hence
$$(g^{(k+1)})^T d_j = 0 \quad \text{for all } k \text{ with } 0 \le k \le n-1 \text{ and } 0 \le j \le k.$$
Thus $g^{(k+1)}$ is orthogonal to any vector from the subspace spanned by $\{d_0, d_1, d_2, \ldots, d_k\}$.

Proposition 7
$$f(x^{(k+1)}) = \min_{x \in V_k} f(x),$$
where $V_k = x^{(0)} + \operatorname{span}\{d_0, d_1, d_2, \ldots, d_k\}$. Thus not only do we have
$$f(x^{(k+1)}) = \min_{\alpha \in \mathbb{R}} f(x^{(k)} + \alpha d_k)$$
but also
$$f(x^{(k+1)}) = \min_{\alpha_0, \alpha_1, \alpha_2, \ldots, \alpha_k \in \mathbb{R}} f\Big(x^{(0)} + \sum_{i=0}^{k} \alpha_i d_i\Big).$$
As $k$ increases, the subspace generated by $\{d_0, d_1, d_2, \ldots, d_k\}$ expands and will eventually fill the whole of $\mathbb{R}^n$ (provided the vectors $d_0, d_1, d_2, \ldots$ are linearly independent). Therefore, for some sufficiently large $k$, $x^*$ will lie in $V_k$. For this reason, the above result is sometimes called the expanding subspace theorem.

Proof. Define the matrix $D_k = [d_0, d_1, d_2, \ldots, d_k]$; that is, $d_i$ is the $i$th column of $D_k$. Note that $x^{(0)} + \mathcal{R}(D_k) = V_k$, where $\mathcal{R}(D_k)$ is the column space of $D_k$. Also
$$x^{(k+1)} = x^{(0)} + \sum_{i=0}^{k} \alpha_i d_i = x^{(0)} + D_k \alpha,$$
where $\alpha = [\alpha_0, \alpha_1, \ldots, \alpha_k]^T$. Hence $x^{(k+1)} \in x^{(0)} + \mathcal{R}(D_k) = V_k$. Now consider any vector $x \in V_k$. There exists a vector $a$ such that
$$x = x^{(0)} + D_k a.$$
Define
$$\phi_k(a) = f(x^{(0)} + D_k a).$$
Note that $\phi_k(a)$ is a quadratic function and has a unique minimizer that satisfies the FONC. By the chain rule,
$$D\phi_k(a) = \nabla f(x^{(0)} + D_k a)^T D_k,$$
so that, at $a = \alpha$,
$$D\phi_k(\alpha) = [\nabla f(x^{(k+1)})]^T D_k = [g^{(k+1)}]^T D_k.$$
We know that $[g^{(k+1)}]^T D_k = 0$ by Proposition 6. Hence
$$D\phi_k(\alpha) = [g^{(k+1)}]^T D_k = 0.$$
Therefore $\alpha$ satisfies the FONC for the quadratic function $\phi_k$; hence $\alpha$ is the minimizer of $\phi_k$, that is,
$$f(x^{(k+1)}) = \min_{a} f(x^{(0)} + D_k a) = \min_{x \in V_k} f(x).$$

How to generate $Q$-conjugate directions:

The conjugate gradient algorithm does not use prespecified conjugate directions, but instead computes the directions as the algorithm progresses. At each stage, the direction is calculated as a linear combination of the previous direction and the current gradient, in such a way that all the directions are mutually $Q$-conjugate. Consider the quadratic form
$$f(x) = \frac{1}{2} x^T Q x - x^T b + c$$
with $Q$ a symmetric positive definite matrix. The algorithm proceeds as follows (a code sketch is given after this list):

1. Let $x^{(0)}$ be an initial guess, then compute $g^{(0)} = \nabla f(x^{(0)}) = Q x^{(0)} - b$.

2. If $g^{(0)} = 0$, then stop; else go to the next step.

3. Take $d_0 = -g^{(0)}$; that is, the starting step is a selection of steepest descent.

4. Thus
$$x^{(1)} = x^{(0)} + \alpha_0 d_0, \quad \text{where} \quad \alpha_0 = \arg\min_{\alpha} f(x^{(0)} + \alpha d_0) = -\frac{\langle g^{(0)}, d_0 \rangle}{\langle d_0, d_0 \rangle_Q}.$$

5. In the next stage, we search in a direction $d_1$ that is $Q$-conjugate to $d_0$. We choose $d_1$ as a linear combination of $g^{(1)}$ and $d_0$.

6. We then look for a direction $d_2$ that is conjugate to the previous directions $d_0$ and $d_1$ with respect to the matrix $Q$.

7. Likewise we continue, and thus in general:

8. At the $(k+1)$th stage, we choose $d_{k+1}$ as a linear combination of $g^{(k+1)}$ and $d_k$ as follows:
$$d_{k+1} = -g^{(k+1)} + \beta_k d_k.$$
The coefficients $\beta_k$ are chosen in such a way that $d_{k+1}$ is $Q$-conjugate to $d_0, d_1, d_2, \ldots, d_k$. This is done by choosing $\beta_k$ as
$$\beta_k = \frac{\langle g_{k+1}, d_k \rangle_Q}{\langle d_k, d_k \rangle_Q}.$$
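The steps above translate directly into code. Here is a minimal sketch of the quadratic-case conjugate gradient iteration (assuming NumPy; the function name cg_quadratic is illustrative, not from the notes):

```python
import numpy as np

def cg_quadratic(Q, b, x0, tol=1e-12):
    """Conjugate gradient for f(x) = 0.5 x^T Q x - x^T b + c with Q
    symmetric positive definite. Returns the final iterate x^(n)."""
    x = np.asarray(x0, dtype=float)
    g = Q @ x - b                      # step 1: g^(0) = Q x^(0) - b
    d = -g                             # step 3: d_0 = -g^(0), steepest descent
    for _ in range(len(b)):
        if np.linalg.norm(g) < tol:    # step 2: stop when g^(k) = 0
            break
        Qd = Q @ d
        alpha = -(g @ d) / (d @ Qd)    # step 4: exact line search
        x = x + alpha * d              # x^(k+1) = x^(k) + alpha_k d_k
        g = Q @ x - b                  # g^(k+1)
        beta = (g @ Qd) / (d @ Qd)     # beta_k = <g_{k+1}, d_k>_Q / <d_k, d_k>_Q
        d = -g + beta * d              # step 8: d_{k+1} = -g^(k+1) + beta_k d_k
    return x

# Usage: on an n-dimensional quadratic, at most n steps are needed.
Q = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = cg_quadratic(Q, b, np.zeros(2))
print(x, np.linalg.solve(Q, b))        # the two should agree
```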

2.1 Justification of the above process:

1. Note that
$$d_0 = -g^{(0)} \quad \text{and} \quad d_1 = -g^{(1)} + \beta_0 d_0,$$
which gives
$$x^{(1)} = x^{(0)} + \alpha_0 d_0,$$
where $\alpha_0$ is found such that $f(x^{(0)} + \alpha d_0)$ is minimized. That is,
$$[\nabla f(x^{(0)} + \alpha_0 d_0)]^T d_0 = 0, \quad [\nabla f(x^{(1)})]^T d_0 = 0, \quad \text{that is,} \quad (g^{(1)})^T g^{(0)} = 0.$$
That is, $g^{(1)}$ and $g^{(0)}$ are orthogonal to each other: $\langle g_1, g_0 \rangle = g_1^T g_0 = 0$. Equivalently, $\langle g^{(1)}, d_0 \rangle = 0$; that is, $\langle g^{(k+1)}, d_k \rangle = 0$ holds for $k = 0$.

We now show that if $d_0$ and $d_1$ are conjugate with respect to $Q$, then $\beta_0$ must be chosen equal to $\dfrac{\langle g^{(1)}, d_0 \rangle_Q}{\langle d_0, d_0 \rangle_Q}$. Note that
$$g = \nabla f(x) = Qx - b, \quad \text{and in particular} \quad g^{(0)} = \nabla f(x^{(0)}) = Q x^{(0)} - b \quad \text{and} \quad g^{(1)} = \nabla f(x^{(1)}) = Q x^{(1)} - b.$$
If $d_0$ and $d_1$ are conjugate with respect to $Q$, then we must have
$$d_1^T Q d_0 = 0, \quad \text{that is,} \quad (-g^{(1)} + \beta_0 d_0)^T Q d_0 = 0,$$
that is, $-(g^{(1)})^T Q d_0 + \beta_0\, d_0^T Q d_0 = 0$; that is,
$$\beta_0 = \frac{(g^{(1)})^T Q d_0}{d_0^T Q d_0} = \frac{\langle g^{(1)}, d_0 \rangle_Q}{\langle d_0, d_0 \rangle_Q}.$$

Proposition 8 In the conjugate gradient algorithm, the directions $d_0, d_1, \ldots, d_{n-1}$ are $Q$-conjugate.

Proof. We shall prove the result by induction. First we show that
$$d_0^T Q d_1 = 0.$$
To this end,
$$d_0^T Q d_1 = d_0^T Q (-g^{(1)} + \beta_0 d_0).$$
Substituting the value of $\beta_0$, we have
\begin{align*}
d_0^T Q d_1 &= d_0^T Q \Big[-g^{(1)} + \frac{\langle g^{(1)}, d_0 \rangle_Q}{\langle d_0, d_0 \rangle_Q}\, d_0\Big] \\
&= d_0^T Q (-g^{(1)}) + \frac{\langle g^{(1)}, d_0 \rangle_Q}{\langle d_0, d_0 \rangle_Q}\, d_0^T Q d_0 \\
&= d_0^T Q (-g^{(1)}) + \langle g^{(1)}, d_0 \rangle_Q = -d_0^T Q g^{(1)} + d_0^T Q g^{(1)} = 0.
\end{align*}
Assume that $d_0, d_1, \ldots, d_k$, $k < n-1$, are $Q$-conjugate directions. We now show that $d_{k+1}$ is $Q$-conjugate to the directions $d_0, d_1, \ldots, d_k$; that is,
$$d_{k+1}^T Q d_j = 0 \quad \text{for } j \in \{0, 1, 2, \ldots, k\}.$$
Consider
$$d_{k+1}^T Q d_j = [-g^{(k+1)} + \beta_k d_k]^T Q d_j = -(g^{(k+1)})^T Q d_j + \beta_k\, d_k^T Q d_j.$$
For $j < k$ we have $d_k^T Q d_j = 0$ by virtue of the induction hypothesis. Also,
$$x^{(j+1)} = x^{(j)} + \alpha_j d_j, \qquad \frac{x^{(j+1)} - x^{(j)}}{\alpha_j} = d_j.$$
Thus, for $j < k$,
\begin{align*}
d_{k+1}^T Q d_j = -(g^{(k+1)})^T Q d_j &= -(g^{(k+1)})^T Q \Big(\frac{x^{(j+1)} - x^{(j)}}{\alpha_j}\Big) \\
&= -\frac{1}{\alpha_j} (g^{(k+1)})^T Q (x^{(j+1)} - x^{(j)}) \\
&= -\frac{1}{\alpha_j} (g^{(k+1)})^T [Q x^{(j+1)} - Q x^{(j)}] \\
&= -\frac{1}{\alpha_j} \big[(g^{(k+1)})^T g^{(j+1)} - (g^{(k+1)})^T g^{(j)}\big] = 0,
\end{align*}
where the last equality holds because each $g^{(i)}$ with $i \le k$ lies in $\operatorname{span}\{d_{i-1}, d_i\}$ (from $d_i = -g^{(i)} + \beta_{i-1} d_{i-1}$, with $g^{(0)} = -d_0$), and $g^{(k+1)}$ is orthogonal to $d_0, d_1, \ldots, d_k$ by Proposition 6 together with the induction hypothesis. Hence
$$d_{k+1}^T Q d_j = 0 \quad \text{for } j = 0, 1, 2, \ldots, k-1.$$
Does $d_{k+1}^T Q d_k = 0$? This is precisely the condition that was used to find $\beta_k$. We know that
$$d_{k+1} = -g_{k+1} + \beta_k d_k, \quad \text{which implies} \quad d_{k+1}^T = -g_{k+1}^T + \beta_k d_k^T.$$
Note that
\begin{align*}
(d_{k+1})^T Q d_k = 0 &\iff (-g_{k+1}^T + \beta_k d_k^T) Q d_k = 0 \\
&\iff -g_{k+1}^T Q d_k + \beta_k\, d_k^T Q d_k = 0 \\
&\iff \beta_k\, d_k^T Q d_k = g_{k+1}^T Q d_k \\
&\iff \beta_k = \frac{g_{k+1}^T Q d_k}{d_k^T Q d_k},
\end{align*}
which is exactly how $\beta_k$ was chosen. Hence $d_{k+1}^T Q d_k = 0$, completing the induction.

Proposition 9 $g^{(k+1)}$ is orthogonal to $g_j$ for $0 \le j \le k$; that is,
$$g_{k+1}^T g_j = 0.$$

Proof. We know that $d_{k+1} = -g^{(k+1)} + \beta_k d_k$. Fix $j \in \{0, 1, 2, \ldots, k\}$ and write
$$d_j = -g_j + \beta_{j-1} d_{j-1}$$
(for $j = 0$, simply $d_0 = -g_0$). This gives
$$(g^{(k+1)})^T d_j = -(g^{(k+1)})^T g_j + \beta_{j-1} (g^{(k+1)})^T d_{j-1},$$
which implies $0 = -(g^{(k+1)})^T g_j + 0$, since $(g^{(k+1)})^T d_j = 0$ and $(g^{(k+1)})^T d_{j-1} = 0$ by Proposition 6. Hence $(g^{(k+1)})^T g_j = 0$ for $j \in \{0, 1, 2, \ldots, k\}$.
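Propositions 8 and 9 are easy to confirm numerically. Here is a minimal sketch (assuming NumPy) that runs the conjugate gradient recursion above and checks that the directions are mutually $Q$-conjugate and the gradients mutually orthogonal:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
A = rng.standard_normal((n, n))
Q = A @ A.T + n * np.eye(n)          # symmetric positive definite
b = rng.standard_normal(n)

x = np.zeros(n)
g = Q @ x - b
d = -g
grads, dirs = [g], [d]
for _ in range(n - 1):
    Qd = Q @ d
    alpha = -(g @ d) / (d @ Qd)      # exact line search
    x = x + alpha * d
    g = Q @ x - b
    beta = (g @ Qd) / (d @ Qd)
    d = -g + beta * d
    grads.append(g)
    dirs.append(d)

G, D = np.array(grads), np.array(dirs)
QG = D @ Q @ D.T                     # Gram matrix in the Q-inner product
GG = G @ G.T                         # ordinary Gram matrix of the gradients
# Proposition 8: off-diagonal entries of D Q D^T vanish (Q-conjugacy).
assert np.allclose(QG - np.diag(np.diag(QG)), 0, atol=1e-6)
# Proposition 9: off-diagonal entries of G G^T vanish (orthogonality).
assert np.allclose(GG - np.diag(np.diag(GG)), 0, atol=1e-6)
print("Q-conjugacy of directions and orthogonality of gradients verified")
```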

Example 10 Consider the following problem:
$$\min \ \frac{1}{2}(x_1^2 + 9x_2^2)$$
Here
$$f(x) = \frac{1}{2}(x_1^2 + 9x_2^2) = \frac{1}{2}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}^T \begin{bmatrix} 1 & 0 \\ 0 & 9 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} - \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}^T \begin{bmatrix} 0 \\ 0 \end{bmatrix} = \frac{1}{2} x^T Q x - x^T b, \quad \text{where } Q = \begin{bmatrix} 1 & 0 \\ 0 & 9 \end{bmatrix} \text{ and } b = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.$$
Note that
$$\nabla f(x) = \begin{bmatrix} \partial f/\partial x_1 \\ \partial f/\partial x_2 \end{bmatrix} = \begin{bmatrix} x_1 \\ 9x_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 9 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} - \begin{bmatrix} 0 \\ 0 \end{bmatrix} = Qx - b,$$
and
$$\nabla^2 f(x) = \begin{bmatrix} \partial^2 f/\partial x_1^2 & \partial^2 f/\partial x_2 \partial x_1 \\ \partial^2 f/\partial x_1 \partial x_2 & \partial^2 f/\partial x_2^2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 9 \end{bmatrix} = Q.$$
Suppose we start with $x^{(0)} = \begin{bmatrix} 9 \\ 1 \end{bmatrix}$. Then
\begin{align*}
g_0 = \nabla f(x^{(0)}) &= \begin{bmatrix} 9 \\ 9 \end{bmatrix}
\;\Rightarrow\; d_0 = -g^{(0)} = \begin{bmatrix} -9 \\ -9 \end{bmatrix} \\
\Rightarrow\; Q d_0 &= \begin{bmatrix} 1 & 0 \\ 0 & 9 \end{bmatrix} \begin{bmatrix} -9 \\ -9 \end{bmatrix} = \begin{bmatrix} -9 \\ -81 \end{bmatrix}, \qquad
d_0^T Q d_0 = \begin{bmatrix} -9 \\ -9 \end{bmatrix}^T \begin{bmatrix} -9 \\ -81 \end{bmatrix} = 810 \\
\Rightarrow\; \alpha_0 &= -\frac{\langle g^{(0)}, d_0 \rangle}{\langle d_0, Q d_0 \rangle} = \frac{162}{810} = \frac{1}{5} \\
\Rightarrow\; x^{(1)} &= x^{(0)} + \alpha_0 d_0 = \begin{bmatrix} 9 \\ 1 \end{bmatrix} - \frac{1}{5} \begin{bmatrix} 9 \\ 9 \end{bmatrix} = \frac{4}{5} \begin{bmatrix} 9 \\ -1 \end{bmatrix}.
\end{align*}
Now
\begin{align*}
g^{(1)} = \nabla f(x^{(1)}) &= \frac{36}{5} \begin{bmatrix} 1 \\ -1 \end{bmatrix} \\
\Rightarrow\; d_1 = -g^{(1)} + \beta_0 d_0 &= -\frac{36}{5} \begin{bmatrix} 1 \\ -1 \end{bmatrix} + \frac{\langle g^{(1)}, d_0 \rangle_Q}{\langle d_0, d_0 \rangle_Q}\, d_0 \\
&= -\frac{36}{5} \begin{bmatrix} 1 \\ -1 \end{bmatrix} + \frac{(36/5)(72)}{810} \begin{bmatrix} -9 \\ -9 \end{bmatrix} = -\frac{36}{5} \begin{bmatrix} 1 \\ -1 \end{bmatrix} + \frac{16}{25} \begin{bmatrix} -9 \\ -9 \end{bmatrix} = \frac{36}{25} \begin{bmatrix} -9 \\ 1 \end{bmatrix}.
\end{align*}
Now
\begin{align*}
g_1^T d_1 &= \frac{36}{5} \begin{bmatrix} 1 \\ -1 \end{bmatrix}^T \frac{36}{25} \begin{bmatrix} -9 \\ 1 \end{bmatrix} = \Big(\frac{36}{5}\Big)\Big(\frac{36}{25}\Big)(-10) \\
Q d_1 &= \frac{36}{25} \begin{bmatrix} 1 & 0 \\ 0 & 9 \end{bmatrix} \begin{bmatrix} -9 \\ 1 \end{bmatrix} = \frac{324}{25} \begin{bmatrix} -1 \\ 1 \end{bmatrix} \\
d_1^T Q d_1 &= \Big(\frac{36}{25}\Big)\Big(\frac{324}{25}\Big) \begin{bmatrix} -9 \\ 1 \end{bmatrix}^T \begin{bmatrix} -1 \\ 1 \end{bmatrix} = \frac{23328}{125} \\
\Rightarrow\; \alpha_1 &= -\frac{\langle g_1, d_1 \rangle}{\langle d_1, Q d_1 \rangle} = -\frac{(36/5)(36/25)(-10)}{23328/125} = \frac{5}{9}
\end{align*}
and
$$x^{(2)} = x^{(1)} + \alpha_1 d_1 = \frac{4}{5} \begin{bmatrix} 9 \\ -1 \end{bmatrix} + \frac{5}{9} \cdot \frac{36}{25} \begin{bmatrix} -9 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix},$$
which is the minimizer, reached in $n = 2$ steps.
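A quick numerical replay of Example 10 (assuming NumPy), which reproduces $\alpha_0 = 1/5$, $\beta_0 = 16/25$, $\alpha_1 = 5/9$, and $x^{(2)} = (0, 0)^T$:

```python
import numpy as np

Q = np.array([[1.0, 0.0], [0.0, 9.0]])
b = np.zeros(2)
x = np.array([9.0, 1.0])           # x^(0)

g = Q @ x - b                      # g^(0) = (9, 9)
d = -g                             # d_0 = (-9, -9)
for k in range(2):
    Qd = Q @ d
    alpha = -(g @ d) / (d @ Qd)    # alpha_0 = 1/5, alpha_1 = 5/9
    x = x + alpha * d
    g = Q @ x - b
    beta = (g @ Qd) / (d @ Qd)     # beta_0 = 16/25
    d = -g + beta * d
    print(f"alpha_{k} = {alpha}, x^({k+1}) = {x}")
# x^(2) = (0, 0), the minimizer, after n = 2 steps.
```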

Example 11 Consider the following problem:
$$\min \ 4x_1^2 + 4x_2^2 - 4x_1 x_2 - 12x_2$$
Here
\begin{align*}
f(x) &= 4x_1^2 + 4x_2^2 - 4x_1 x_2 - 12x_2 \\
&= \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}^T \begin{bmatrix} 4 & -2 \\ -2 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} - \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}^T \begin{bmatrix} 0 \\ 12 \end{bmatrix} \\
&= \frac{1}{2} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}^T \begin{bmatrix} 8 & -4 \\ -4 & 8 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} - \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}^T \begin{bmatrix} 0 \\ 12 \end{bmatrix} = \frac{1}{2} x^T Q x - x^T b.
\end{align*}
Note that
$$\nabla f(x) = \begin{bmatrix} \partial f/\partial x_1 \\ \partial f/\partial x_2 \end{bmatrix} = \begin{bmatrix} 8x_1 - 4x_2 \\ -12 + 8x_2 - 4x_1 \end{bmatrix} = \begin{bmatrix} 8 & -4 \\ -4 & 8 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} - \begin{bmatrix} 0 \\ 12 \end{bmatrix}$$
and
$$\nabla^2 f(x) = \begin{bmatrix} \partial^2 f/\partial x_1^2 & \partial^2 f/\partial x_2 \partial x_1 \\ \partial^2 f/\partial x_1 \partial x_2 & \partial^2 f/\partial x_2^2 \end{bmatrix} = \begin{bmatrix} 8 & -4 \\ -4 & 8 \end{bmatrix} = Q.$$
We now generate two conjugate directions $d_0$ and $d_1$. Suppose that we choose $d_0 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$. Then $d_1 = \begin{bmatrix} a \\ b \end{bmatrix}$ must satisfy the condition
$$d_0^T Q d_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}^T \begin{bmatrix} 8 & -4 \\ -4 & 8 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = 8a - 4b = 0.$$
In particular, we may choose $a = 1$ and then $b = 2$; hence $d_1 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$. It may be noted that conjugate directions are not unique. We minimize the objective function $f$ starting from $x^{(0)} = \begin{bmatrix} -1/2 \\ 1 \end{bmatrix}$ along the direction $d_0 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$. Then
$$x^{(1)} = x^{(0)} + \alpha d_0 = \begin{bmatrix} -1/2 \\ 1 \end{bmatrix} + \alpha \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} -1/2 + \alpha \\ 1 \end{bmatrix}.$$
Now
$$\phi(\alpha) = f(x^{(0)} + \alpha d_0) = 4\Big(\alpha - \frac{1}{2}\Big)^2 + 4 - 4\Big(\alpha - \frac{1}{2}\Big) - 12 = 4\alpha^2 - 8\alpha - 5,$$
which attains its minimum at $\alpha_0 = 1$. Hence $x^{(1)} = \begin{bmatrix} 1/2 \\ 1 \end{bmatrix}$. Now start from $x^{(1)} = \begin{bmatrix} 1/2 \\ 1 \end{bmatrix}$ and minimize the objective function along the direction $d_1 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$; we get
$$x^{(2)} = x^{(1)} + \alpha d_1 = \begin{bmatrix} 1/2 + \alpha \\ 1 + 2\alpha \end{bmatrix}.$$
Here $f(x^{(1)} + \alpha d_1) = 12\alpha^2 - 12\alpha - 9$ attains its minimum at $\alpha_1 = 1/2$, and we get $x^{(2)} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$, which is the minimum point of $f$:
$$\nabla f(x^{(2)}) = \begin{bmatrix} 8(1) - 4(2) \\ -12 + 8(2) - 4(1) \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.$$
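A short numerical replay of Example 11 (assuming NumPy), checking the conjugacy condition $d_0^T Q d_1 = 0$ and the two exact line searches:

```python
import numpy as np

Q = np.array([[8.0, -4.0], [-4.0, 8.0]])
b = np.array([0.0, 12.0])

d0 = np.array([1.0, 0.0])
d1 = np.array([1.0, 2.0])
print(d0 @ Q @ d1)                  # 0: d0 and d1 are Q-conjugate

x = np.array([-0.5, 1.0])           # x^(0)
for d in (d0, d1):
    g = Q @ x - b                   # gradient at the current iterate
    alpha = -(g @ d) / (d @ Q @ d)  # exact line search: alpha_0 = 1, alpha_1 = 1/2
    x = x + alpha * d
print(x, Q @ x - b)                 # x^(2) = (1, 2) with zero gradient
```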

