Olof Runborg
June 1, 2011
1 Goal
We want to find xk ∈ Kk (b, A) which minimizes ||Axk − b||A−1 when A is a
symmetric positive definite matrix. Normally one would need to first compute
a basis for Kk (b, A) and then do a least squares fit to find the solution. However,
for symmetric positive definite matrices we can do it iteratively with the
conjugate gradient method.
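As a concrete illustration of the naive approach, here is a minimal numpy sketch (all names ours): it builds the Krylov basis explicitly and solves the small normal-equations system that the A−1-norm least squares problem reduces to. This is exactly the work the conjugate gradient method will avoid.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)   # symmetric positive definite test matrix
b = rng.standard_normal(n)

def krylov_minimizer(A, b, k):
    """Minimize ||Ax - b||_{A^{-1}} over x in K_k(b, A) = span{b, Ab, ..., A^{k-1}b}.

    Since ||Ax - b||_{A^{-1}}^2 = x^T A x - 2 x^T b + const, writing x = K c
    gives the small normal-equations system (K^T A K) c = K^T b.
    """
    K = np.column_stack([np.linalg.matrix_power(A, j) @ b for j in range(k)])
    c = np.linalg.solve(K.T @ A @ K, K.T @ b)
    return K @ c

# After n steps the Krylov space is (generically) all of R^n,
# so the minimizer coincides with the exact solution of Ax = b.
x_exact = np.linalg.solve(A, b)
assert np.allclose(krylov_minimizer(A, b, n), x_exact)
```

Forming the basis costs k matrix products and the basis vectors become nearly parallel for larger k, which is why this is only an illustration of the minimization property, not a practical method.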
xk = α1 p1 + · · · + αk pk =: Sk ak , k ≥ 1,
where
Sk = [p1 p2 · · · pk ], ak = [α1 α2 · · · αk ]T .
We also define the residuals
r k = Axk − b.
The method stops if rk = 0 for some k. We say that the method fails if
pk = 0 for some k. We will assume for now that the method has not failed
before iteration k, i.e. that pℓ ≠ 0 for ℓ ≤ k. At the end we will then show that
this implies that it will not fail in iteration k + 1. Since p1 = b ≠ 0 it follows
by induction that the method will never fail.
1.2 Choice of pk-directions
Since we look for xk ∈ Kk (b, A) and xk = Sk ak we want to pick pk such that
range(Sk ) = Kk (b, A). We will show here that this follows if we take
pk+1 = r k − ASk z k , (3)
where z k is chosen such that it minimizes ||pk+1 ||2 . It is thus the least squares
solution given by the normal equations
SkT AT ASk z k = SkT Ar k . (4)
Note that in the steepest descent method we take just pk+1 = r k . The sub-
traction of ASk z k makes the search much more efficient. The construction has
several interesting consequences.
1. The pk-directions are A-conjugate (or just conjugate), which means that
pTℓ Apj = 0 for ℓ ≠ j. (5)
This follows from the fact that SkT AT pk+1 = SkT AT (r k − ASk z k ) = 0 for
all k. It also implies that
SkT ASk = Dk , Dk is diagonal with elements dℓℓ = pTℓ Apℓ . (6)
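A quick numerical check of this consequence, assuming numpy (the construction follows (3)–(4) literally, with a least-squares solve playing the role of the normal equations):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)   # symmetric positive definite test matrix
b = rng.standard_normal(n)

# Build the directions exactly as in (3)-(4): p_{k+1} = r_k - A S_k z_k,
# where z_k is the least-squares solution of A S_k z ~ r_k.
P = [b]                                    # p_1 = b
for k in range(1, n):
    S = np.column_stack(P)
    # x_k minimizes ||Ax - b||_{A^{-1}} over range(S_k): (S^T A S) a = S^T b
    a = np.linalg.solve(S.T @ A @ S, S.T @ b)
    r = A @ (S @ a) - b                    # residual r_k = A x_k - b
    z, *_ = np.linalg.lstsq(A @ S, r, rcond=None)   # normal equations (4)
    P.append(r - A @ S @ z)                # new direction, eq. (3)

S = np.column_stack(P)
G = S.T @ A @ S
# (6): S^T A S is diagonal, i.e. the directions are A-conjugate.
assert np.allclose(G, np.diag(np.diag(G)))
```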
1.3 Simplified minimization problem
Since {pj } span the Krylov space Kk (b, A) we can simplify the minimization
problem
pTk b = pTk (Axk−1 − r k−1 ) = pTk (ASk−1 ak−1 − r k−1 ) = −pTk r k−1 ,
since pTk ASk−1 = 0 by (5), and with αk ≠ 0 as defined above. Then
From the fact that pTk+1 Apk = 0 by (5) we get an expression for βk ,
βk = −(rTk Apk )/(pTk Apk ).
γk = ||r k ||2 /(||r k ||2 + ||pk ||2 ).
However, as we will see below, γk is actually not needed in the final optimized
algorithm.
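The reason γk can be dropped is that rescaling a search direction by any nonzero factor is absorbed by the step length: αk scales by the reciprocal, so the update αk pk is unchanged. A small numpy check of this invariance (the data here is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)   # symmetric positive definite
r = rng.standard_normal(n)
p = rng.standard_normal(n)

def alpha(p, r):
    # step length minimizing the error along the direction p
    return -(p @ r) / (p @ A @ p)

c = 0.37   # any nonzero rescaling factor, e.g. (1 - gamma_k)
# The actual update alpha * p is invariant under p -> c p:
assert np.allclose(alpha(p, r) * p, alpha(c * p, r) * (c * p))
```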
1.5 Non-failure
We have left to show that the method does not fail. Recall that in Section 1.1
we assumed that pℓ ≠ 0 and rℓ ≠ 0 for 0 ≤ ℓ ≤ k. The results we have shown so
far therefore only hold up to ℓ = k. We can now finish the induction argument
by noting that whenever pk ≠ 0 and r k ≠ 0 we must have pk+1 ≠ 0 by (8).
The results in the previous sections therefore hold for all ℓ, until r ℓ = 0 and the
method stops.
3. Update pk+1 = (1 − γk )(r k + βk pk ).
4. Compute αk+1 = −(pTk+1 r k )/(pTk+1 Apk+1 ).
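Putting the steps together, here is a sketch of the resulting method in numpy, using the sign conventions of these notes (r k = Axk − b, p1 = b) and dropping the γk rescaling, which only changes the length of pk+1:

```python
import numpy as np

def cg(A, b, tol=1e-12, maxiter=None):
    """Conjugate gradients with the conventions of these notes:
    r_k = A x_k - b, p_1 = b, p_{k+1} = r_k + beta_k p_k
    (the (1 - gamma_k) factor is dropped; it only rescales p_{k+1})."""
    n = len(b)
    maxiter = maxiter or n
    x = np.zeros(n)
    r = -b                                 # r_0 = A x_0 - b with x_0 = 0
    p = b.copy()                           # p_1 = b
    for _ in range(maxiter):
        Ap = A @ p
        alpha = -(p @ r) / (p @ Ap)        # step 4
        x = x + alpha * p
        r = r + alpha * Ap                 # keeps r = A x - b
        if np.linalg.norm(r) < tol:
            break
        beta = -(r @ Ap) / (p @ Ap)        # beta_k as derived above
        p = r + beta * p                   # step 3, without gamma_k
    return x

rng = np.random.default_rng(3)
n = 8
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)   # symmetric positive definite test matrix
b = rng.standard_normal(n)
assert np.allclose(cg(A, b), np.linalg.solve(A, b))
```

In exact arithmetic the loop terminates after at most n iterations with r = 0, in agreement with the non-failure argument of Section 1.5.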