This action might not be possible to undo. Are you sure you want to continue?
and Nonlinear Equations
C. T. Kelley
North Carolina State University
Society for Industrial and Applied Mathematics
Philadelphia 1995
Untitled1 9/20/2004, 2:59 PM 3
To Polly H. Thomas, 19061994, devoted mother and grandmother
1
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
2
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
Contents
Preface xi
How to Get the Software xiii
CHAPTER 1. Basic Concepts and Stationary Iterative Methods 3
1.1 Review and notation . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 The Banach Lemma and approximate inverses . . . . . . . . . . 5
1.3 The spectral radius . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.4 Matrix splittings and classical stationary iterative methods . . 7
1.5 Exercises on stationary iterative methods . . . . . . . . . . . . 10
CHAPTER 2. Conjugate Gradient Iteration 11
2.1 Krylov methods and the minimization property . . . . . . . . . 11
2.2 Consequences of the minimization property . . . . . . . . . . . 13
2.3 Termination of the iteration . . . . . . . . . . . . . . . . . . . . 15
2.4 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.5 Preconditioning . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.6 CGNR and CGNE . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.7 Examples for preconditioned conjugate iteration . . . . . . . . 26
2.8 Exercises on conjugate gradient . . . . . . . . . . . . . . . . . . 30
CHAPTER 3. GMRES Iteration 33
3.1 The minimization property and its consequences . . . . . . . . 33
3.2 Termination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.3 Preconditioning . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.4 GMRES implementation: Basic ideas . . . . . . . . . . . . . . . 37
3.5 Implementation: Givens rotations . . . . . . . . . . . . . . . . . 43
3.6 Other methods for nonsymmetric systems . . . . . . . . . . . . 46
3.6.1 BiCG. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.6.2 CGS. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.6.3 BiCGSTAB. . . . . . . . . . . . . . . . . . . . . . . . . 50
vii
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
viii CONTENTS
3.6.4 TFQMR. . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.7 Examples for GMRES iteration . . . . . . . . . . . . . . . . . . 54
3.8 Examples for CGNR, BiCGSTAB, and TFQMR iteration . . . 55
3.9 Exercises on GMRES . . . . . . . . . . . . . . . . . . . . . . . . 60
CHAPTER 4. Basic Concepts and FixedPoint Iteration 65
4.1 Types of convergence . . . . . . . . . . . . . . . . . . . . . . . . 65
4.2 Fixedpoint iteration . . . . . . . . . . . . . . . . . . . . . . . . 66
4.3 The standard assumptions . . . . . . . . . . . . . . . . . . . . . 68
CHAPTER 5. Newton’s Method 71
5.1 Local convergence of Newton’s method . . . . . . . . . . . . . . 71
5.2 Termination of the iteration . . . . . . . . . . . . . . . . . . . . 72
5.3 Implementation of Newton’s method . . . . . . . . . . . . . . . 73
5.4 Errors in the function and derivative . . . . . . . . . . . . . . . 75
5.4.1 The chord method. . . . . . . . . . . . . . . . . . . . . . 76
5.4.2 Approximate inversion of F
. . . . . . . . . . . . . . . . 77
5.4.3 The Shamanskii method. . . . . . . . . . . . . . . . . . 78
5.4.4 Diﬀerence approximation to F
. . . . . . . . . . . . . . . 79
5.4.5 The secant method. . . . . . . . . . . . . . . . . . . . . 82
5.5 The Kantorovich Theorem . . . . . . . . . . . . . . . . . . . . . 83
5.6 Examples for Newton’s method . . . . . . . . . . . . . . . . . . 86
5.7 Exercises on Newton’s method . . . . . . . . . . . . . . . . . . 91
CHAPTER 6. Inexact Newton Methods 95
6.1 The basic estimates . . . . . . . . . . . . . . . . . . . . . . . . . 95
6.1.1 Direct analysis. . . . . . . . . . . . . . . . . . . . . . . . 95
6.1.2 Weighted norm analysis. . . . . . . . . . . . . . . . . . . 97
6.1.3 Errors in the function. . . . . . . . . . . . . . . . . . . . 100
6.2 Newtoniterative methods . . . . . . . . . . . . . . . . . . . . . 100
6.2.1 Newton GMRES. . . . . . . . . . . . . . . . . . . . . . . 101
6.2.2 Other Newtoniterative methods. . . . . . . . . . . . . . 104
6.3 NewtonGMRES implementation . . . . . . . . . . . . . . . . . 104
6.4 Examples for NewtonGMRES . . . . . . . . . . . . . . . . . . 106
6.4.1 Chandrasekhar Hequation. . . . . . . . . . . . . . . . . 107
6.4.2 Convectiondiﬀusion equation. . . . . . . . . . . . . . . 108
6.5 Exercises on inexact Newton methods . . . . . . . . . . . . . . 110
CHAPTER 7. Broyden’s method 113
7.1 The Dennis–Mor´e condition . . . . . . . . . . . . . . . . . . . . 114
7.2 Convergence analysis . . . . . . . . . . . . . . . . . . . . . . . . 116
7.2.1 Linear problems. . . . . . . . . . . . . . . . . . . . . . . 118
7.2.2 Nonlinear problems. . . . . . . . . . . . . . . . . . . . . 120
7.3 Implementation of Broyden’s method . . . . . . . . . . . . . . . 123
7.4 Examples for Broyden’s method . . . . . . . . . . . . . . . . . . 127
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
CONTENTS ix
7.4.1 Linear problems. . . . . . . . . . . . . . . . . . . . . . . 127
7.4.2 Nonlinear problems. . . . . . . . . . . . . . . . . . . . . 128
7.5 Exercises on Broyden’s method . . . . . . . . . . . . . . . . . . 132
CHAPTER 8. Global Convergence 135
8.1 Single equations . . . . . . . . . . . . . . . . . . . . . . . . . . 135
8.2 Analysis of the Armijo rule . . . . . . . . . . . . . . . . . . . . 138
8.3 Implementation of the Armijo rule . . . . . . . . . . . . . . . . 141
8.3.1 Polynomial line searches. . . . . . . . . . . . . . . . . . 142
8.3.2 Broyden’s method. . . . . . . . . . . . . . . . . . . . . . 144
8.4 Examples for Newton–Armijo . . . . . . . . . . . . . . . . . . . 146
8.4.1 Inverse tangent function. . . . . . . . . . . . . . . . . . 146
8.4.2 Convectiondiﬀusion equation. . . . . . . . . . . . . . . 146
8.4.3 Broyden–Armijo. . . . . . . . . . . . . . . . . . . . . . . 148
8.5 Exercises on global convergence . . . . . . . . . . . . . . . . . . 151
Bibliography 153
Index 163
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
x CONTENTS
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
Preface
This book on iterative methods for linear and nonlinear equations can be used
as a tutorial and a reference by anyone who needs to solve nonlinear systems
of equations or large linear systems. It may also be used as a textbook for
introductory courses in nonlinear equations or iterative methods or as source
material for an introductory course in numerical analysis at the graduate level.
We assume that the reader is familiar with elementary numerical analysis,
linear algebra, and the central ideas of direct methods for the numerical
solution of dense linear systems as described in standard texts such as [7],
[105], or [184].
Our approach is to focus on a small number of methods and treat them
in depth. Though this book is written in a ﬁnitedimensional setting, we
have selected for coverage mostly algorithms and methods of analysis which
extend directly to the inﬁnitedimensional case and whose convergence can be
thoroughly analyzed. For example, the matrixfree formulation and analysis for
GMRES and conjugate gradient is almost unchanged in an inﬁnitedimensional
setting. The analysis of Broyden’s method presented in Chapter 7 and
the implementations presented in Chapters 7 and 8 are diﬀerent from the
classical ones and also extend directly to an inﬁnitedimensional setting. The
computational examples and exercises focus on discretizations of inﬁnite
dimensional problems such as integral and diﬀerential equations.
We present a limited number of computational examples. These examples
are intended to provide results that can be used to validate the reader’s own
implementations and to give a sense of how the algorithms perform. The
examples are not designed to give a complete picture of performance or to be
a suite of test problems.
The computational examples in this book were done with MATLAB
(version 4.0a on various SUN SPARCstations and version 4.1 on an Apple
Macintosh Powerbook 180) and the MATLAB environment is an excellent one
for getting experience with the algorithms, for doing the exercises, and for
smalltomedium scale production work.
1
MATLAB codes for many of the
algorithms are available by anonymous ftp. A good introduction to the latest
1
MATLAB is a registered trademark of The MathWorks, Inc.
xi
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
xii PREFACE
version (version 4.2) of MATLAB is the MATLAB Primer [178]; [43] is also
a useful resource. If the reader has no access to MATLAB or will be solving
very large problems, the general algorithmic descriptions or even the MATLAB
codes can easily be translated to another language.
Parts of this book are based upon work supported by the National
Science Foundation and the Air Force Oﬃce of Scientiﬁc Research over
several years, most recently under National Science Foundation Grant Nos.
DMS9024622 and DMS9321938. Any opinions, ﬁndings, and conclusions or
recommendations expressed in this material are those of the author and do not
necessarily reﬂect the views of the National Science Foundation or of the Air
Force Oﬃce of Scientiﬁc Research.
Many of my students and colleagues discussed various aspects of this
project with me and provided important corrections, ideas, suggestions, and
pointers to the literature. I am especially indebted to Jim Banoczi, Jeﬀ Butera,
Steve Campbell, Tony Choi, Moody Chu, Howard Elman, Jim Epperson,
Andreas Griewank, Laura Helfrich, Ilse Ipsen, Lea Jenkins, Vickie Kearn,
Belinda King, Debbie Lockhart, Carl Meyer, Casey Miller, Ekkehard Sachs,
Jeﬀ Scroggs, Joseph Skudlarek, Mike Tocci, Gordon Wade, Homer Walker,
Steve Wright, Zhaqing Xue, Yue Zhang, and an anonymous reviewer for their
contributions and encouragement.
Most importantly, I thank ChungWei Ng and my parents for over one
hundred and ten years of patience and support.
C. T. Kelley
Raleigh, North Carolina
January, 1998
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
How to get the software
A collection of MATLAB codes has been written to accompany this book. The
MATLAB codes can be obtained by anonymous ftp from the MathWorks server
ftp.mathworks.com in the directory pub/books/kelley, from the MathWorks
World Wide Web site,
http://www.mathworks.com
or from SIAM’s World Wide Web site
http://www.siam.org/books/kelley/kelley.html
One can obtain MATLAB from
The MathWorks, Inc.
24 Prime Park Way
Natick, MA 01760,
Phone: (508) 6531415
Fax: (508) 6532997
Email: info@mathworks.com
WWW: http://www.mathworks.com
xiii
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
Chapter 1
Basic Concepts and Stationary Iterative Methods
1.1. Review and notation
We begin by setting notation and reviewing some ideas from numerical linear
algebra that we expect the reader to be familiar with. An excellent reference
for the basic ideas of numerical linear algebra and direct methods for linear
equations is [184].
We will write linear equations as
Ax = b, (1.1)
where A is a nonsingular N N matrix, b ∈ R
N
is given, and
x
∗
= A
−1
b ∈ R
N
is to be found.
Throughout this chapter x will denote a potential solution and ¦x
k
¦
k≥0
the
sequence of iterates. We will denote the ith component of a vector x by (x)
i
(note the parentheses) and the ith component of x
k
by (x
k
)
i
. We will rarely
need to refer to individual components of vectors.
In this chapter   will denote a norm on R
N
as well as the induced matrix
norm.
Definition 1.1.1. Let   be a norm on R
N
. The induced matrix norm
of an N N matrix A is deﬁned by
A = max
x=1
Ax.
Induced norms have the important property that
Ax ≤ Ax.
Recall that the condition number of A relative to the norm   is
κ(A) = AA
−1
,
where κ(A) is understood to be inﬁnite if A is singular. If   is the l
p
norm
x
p
=
_
_
N
j=1
[(x)
i
[
p
_
_
1/p
3
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
4 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
we will write the condition number as κ
p
.
Most iterative methods terminate when the residual
r = b −Ax
is suﬃciently small. One termination criterion is
r
k

r
0

< τ, (1.2)
which can be related to the error
e = x −x
∗
in terms of the condition number.
Lemma 1.1.1. Let b, x, x
0
∈ R
N
. Let A be nonsingular and let x
∗
= A
−1
b.
e
e
0

≤ κ(A)
r
r
0

. (1.3)
Proof. Since
r = b −Ax = −Ae
we have
e = A
−1
Ae ≤ A
−1
Ae = A
−1
r
and
r
0
 = Ae
0
 ≤ Ae
0
.
Hence
e
e
0

≤
A
−1
r
A
−1
r
0

= κ(A)
r
r
0

,
as asserted.
The termination criterion (1.2) depends on the initial iterate and may result
in unnecessary work when the initial iterate is good and a poor result when the
initial iterate is far from the solution. For this reason we prefer to terminate
the iteration when
r
k

b
< τ. (1.4)
The two conditions (1.2) and (1.4) are the same when x
0
= 0, which is a
common choice, particularly when the linear iteration is being used as part of
a nonlinear solver.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
BASIC CONCEPTS 5
1.2. The Banach Lemma and approximate inverses
The most straightforward approach to an iterative solution of a linear system
is to rewrite (1.1) as a linear ﬁxedpoint iteration. One way to do this is to
write Ax = b as
x = (I −A)x +b, (1.5)
and to deﬁne the Richardson iteration
x
k+1
= (I −A)x
k
+b. (1.6)
We will discuss more general methods in which ¦x
k
¦ is given by
x
k+1
= Mx
k
+c. (1.7)
In (1.7) M is an NN matrix called the iteration matrix. Iterative methods of
this form are called stationary iterative methods because the transition from x
k
to x
k+1
does not depend on the history of the iteration. The Krylov methods
discussed in Chapters 2 and 3 are not stationary iterative methods.
All our results are based on the following lemma.
Lemma 1.2.1. If M is an N N matrix with M < 1 then I − M is
nonsingular and
(I −M)
−1
 ≤
1
1 −M
. (1.8)
Proof. We will show that I − M is nonsingular and that (1.8) holds by
showing that the series
∞
l=0
M
l
= (I −M)
−1
.
The partial sums
S
k
=
k
l=0
M
l
form a Cauchy sequence in R
N×N
. To see this note that for all m > k
S
k
−S
m
 ≤
m
l=k+1
M
l
.
Now, M
l
 ≤ M
l
because   is a matrix norm that is induced by a vector
norm. Hence
S
k
−S
m
 ≤
m
l=k+1
M
l
= M
k+1
_
1 −M
m−k
1 −M
_
→ 0
as m, k → ∞. Hence the sequence S
k
converges, say to S. Since MS
k
+ I =
S
k+1
, we must have MS +I = S and hence (I −M)S = I. This proves that
I −M is nonsingular and that S = (I −M)
−1
.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
6 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
Noting that
(I −M)
−1
 ≤
∞
l=0
M
l
= (1 −M)
−1
.
proves (1.8) and completes the proof.
The following corollary is a direct consequence of Lemma 1.2.1.
Corollary 1.2.1. If M < 1 then the iteration (1.7) converges to
x = (I −M)
−1
c for all initial iterates x
0
.
A consequence of Corollary 1.2.1 is that Richardson iteration (1.6) will
converge if I − A < 1. It is sometimes possible to precondition a linear
equation by multiplying both sides of (1.1) by a matrix B
BAx = Bb
so that convergence of iterative methods is improved. In the context of
Richardson iteration, the matrices B that allow us to apply the Banach lemma
and its corollary are called approximate inverses.
Definition 1.2.1. B is an approximate inverse of A if I −BA < 1.
The following theorem is often referred to as the Banach Lemma.
Theorem 1.2.1. If A and B are NN matrices and B is an approximate
inverse of A, then A and B are both nonsingular and
A
−1
 ≤
B
1 −I −BA
, B
−1
 ≤
A
1 −I −BA
, (1.9)
and
A
−1
−B ≤
BI −BA
1 −I −BA
, A−B
−1
 ≤
AI −BA
1 −I −BA
. (1.10)
Proof. Let M = I −BA. By Lemma 1.2.1 I −M = I −(I −BA) = BA is
nonsingular. Hence both A and B are nonsingular. By (1.8)
A
−1
B
−1
 = (I −M)
−1
 ≤
1
1 −M
=
1
1 −I −BA
. (1.11)
Since A
−1
= (I −M)
−1
B, inequality (1.11) implies the ﬁrst part of (1.9). The
second part follows in a similar way from B
−1
= A(I −M)
−1
.
To complete the proof note that
A
−1
−B = (I −BA)A
−1
, A−B
−1
= B
−1
(I −BA),
and use (1.9).
Richardson iteration, preconditioned with approximate inversion, has the
form
x
k+1
= (I −BA)x
k
+Bb. (1.12)
If the norm of I − BA is small, then not only will the iteration converge
rapidly, but, as Lemma 1.1.1 indicates, termination decisions based on the
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
BASIC CONCEPTS 7
preconditioned residual Bb − BAx will better reﬂect the actual error. This
method is a very eﬀective technique for solving diﬀerential equations, integral
equations, and related problems [15], [6], [100], [117], [111]. Multigrid methods
[19], [99], [126], can also be interpreted in this light. We mention one other
approach, polynomial preconditioning, which tries to approximate A
−1
by a
polynomial in A [123], [179], [169].
1.3. The spectral radius
The analysis in ¸ 1.2 related convergence of the iteration (1.7) to the norm of
the matrix M. However the norm of M could be small in some norms and
quite large in others. Hence the performance of the iteration is not completely
described by M. The concept of spectral radius allows us to make a complete
description.
We let σ(A) denote the set of eigenvalues of A.
Definition 1.3.1. The spectral radius of an N N matrix A is
ρ(A) = max
λ∈σ(A)
[λ[ = lim
n→∞
A
n

1/n
. (1.13)
The term on the righthand side of the second equality in (1.13) is the limit
used by the radical test for convergence of the series
A
n
.
The spectral radius of M is independent of any particular matrix norm of
M. It is clear, in fact, that
ρ(A) ≤ A (1.14)
for any induced matrix norm. The inequality (1.14) has a partial converse that
allows us to completely describe the performance of iteration (1.7) in terms of
spectral radius. We state that converse as a theorem and refer to [105] for a
proof.
Theorem 1.3.1. Let A be an N N matrix. Then for any > 0 there is
a norm   on R
N
such that
ρ(A) > A −.
A consequence of Theorem 1.3.1, Lemma 1.2.1, and Exercise 1.5.1 is a
characterization of convergent stationary iterative methods. The proof is left
as an exercise.
Theorem 1.3.2. Let M be an NN matrix. The iteration (1.7) converges
for all c ∈ R
N
if and only if ρ(M) < 1.
1.4. Matrix splittings and classical stationary iterative methods
There are ways to convert Ax = b to a linear ﬁxedpoint iteration that are
diﬀerent from (1.5). Methods such as Jacobi, Gauss–Seidel, and sucessive
overrelaxation (SOR) iteration are based on splittings of A of the form
A = A
1
+A
2
,
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
8 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
where A
1
is a nonsingular matrix constructed so that equations with A
1
as
coeﬃcient matrix are easy to solve. Then Ax = b is converted to the ﬁxed
point problem
x = A
−1
1
(b −A
2
x).
The analysis of the method is based on an estimation of the spectral radius of
the iteration matrix M = −A
−1
1
A
2
.
For a detailed description of the classical stationary iterative methods the
reader may consult [89], [105], [144], [193], or [200]. These methods are usually
less eﬃcient than the Krylov methods discussed in Chapters 2 and 3 or the more
modern stationary methods based on multigrid ideas. However the classical
methods have a role as preconditioners. The limited description in this section
is intended as a review that will set some notation to be used later.
As a ﬁrst example we consider the Jacobi iteration that uses the splitting
A
1
= D, A
2
= L +U,
where D is the diagonal of A and L and U are the (strict) lower and upper
triangular parts. This leads to the iteration matrix
M
JAC
= −D
−1
(L +U).
Letting (x
k
)
i
denote the ith component of the kth iterate we can express Jacobi
iteration concretely as
(x
k+1
)
i
= a
−1
ii
_
_
b
i
−
j=i
a
ij
(x
k
)
j
_
_
. (1.15)
Note that A
1
is diagonal and hence trivial to invert.
We present only one convergence result for the classical stationary iterative
methods.
Theorem 1.4.1. Let A be an N N matrix and assume that for all
1 ≤ i ≤ N
0 <
j=i
[a
ij
[ < [a
ii
[. (1.16)
Then A is nonsingular and the Jacobi iteration (1.15) converges to x
∗
= A
−1
b
for all b.
Proof. Note that the ith row sum of M = M
JAC
satisﬁes
N
j=1
[m
ij
[ =
j=i
[a
ij
[
[a
ii
[
< 1.
Hence M
JAC

∞
< 1 and the iteration converges to the unique solution of
x = Mx + D
−1
b. Also I − M = D
−1
A is nonsingular and therefore A is
nonsingular.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
BASIC CONCEPTS 9
Gauss–Seidel iteration overwrites the approximate solution with the new
value as soon as it is computed. This results in the iteration
(x
k+1
)
i
= a
−1
ii
_
_
b
i
−
j<i
a
ij
(x
k+1
)
j
−
j>i
a
ij
(x
k
)
j
_
_
,
the splitting
A
1
= D +L, A
2
= U,
and iteration matrix
M
GS
= −(D +L)
−1
U.
Note that A
1
is lower triangular, and hence A
−1
1
y is easy to compute for vectors
y. Note also that, unlike Jacobi iteration, the iteration depends on the ordering
of the unknowns. Backward Gauss–Seidel begins the update of x with the Nth
coordinate rather than the ﬁrst, resulting in the splitting
A
1
= D +U, A
2
= L,
and iteration matrix
M
BGS
= −(D +U)
−1
L.
A symmetric Gauss–Seidel iteration is a forward Gauss–Seidel iteration
followed by a backward Gauss–Seidel iteration. This leads to the iteration
matrix
M
SGS
= M
BGS
M
GS
= (D +U)
−1
L(D +L)
−1
U.
If A is symmetric then U = L
T
. In that event
M
SGS
= (D +U)
−1
L(D +L)
−1
U = (D +L
T
)
−1
L(D +L)
−1
L
T
.
From the point of view of preconditioning, one wants to write the stationary
method as a preconditioned Richardson iteration. That means that one wants
to ﬁnd B such that M = I − BA and then use B as an approximate inverse.
For the Jacobi iteration,
B
JAC
= D
−1
. (1.17)
For symmetric Gauss–Seidel
B
SGS
= (D +L
T
)
−1
D(D +L)
−1
. (1.18)
The successive overrelaxation iteration modiﬁes Gauss–Seidel by adding a
relaxation parameter ω to construct an iteration with iteration matrix
M
SOR
= (D +ωL)
−1
((1 −ω)D −ωU).
The performance can be dramatically improved with a good choice of ω but
still is not competitive with Krylov methods. A further disadvantage is that
the choice of ω is often diﬃcult to make. References [200], [89], [193], [8], and
the papers cited therein provide additional reading on this topic.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
10 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
1.5. Exercises on stationary iterative methods
1.5.1. Show that if ρ(M) ≥ 1 then there are x
0
and c such that the iteration
(1.7) fails to converge.
1.5.2. Prove Theorem 1.3.2.
1.5.3. Verify equality (1.18).
1.5.4. Show that if A is symmetric and positive deﬁnite (that is A
T
= A and
x
T
Ax > 0 for all x ,= 0) that B
SGS
is also symmetric and positive
deﬁnite.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
Chapter 2
Conjugate Gradient Iteration
2.1. Krylov methods and the minimization property
In the following two chapters we describe some of the Krylov space methods
for linear equations. Unlike the stationary iterative methods, Krylov methods
do not have an iteration matrix. The two such methods that we’ll discuss in
depth, conjugate gradient and GMRES, minimize, at the kth iteration, some
measure of error over the aﬃne space
x
0
+/
k
,
where x
0
is the initial iterate and the kth Krylov subspace /
k
is
/
k
= span(r
0
, Ar
0
, . . . , A
k−1
r
0
)
for k ≥ 1.
The residual is
r = b −Ax.
So ¦r
k
¦
k≥0
will denote the sequence of residuals
r
k
= b −Ax
k
.
As in Chapter 1, we assume that A is a nonsingular N N matrix and let
x
∗
= A
−1
b.
There are other Krylov methods that are not as well understood as CG or
GMRES. Brief descriptions of several of these methods and their properties
are in ¸ 3.6, [12], and [78].
The conjugate gradient (CG) iteration was invented in the 1950s [103] as a
direct method. It has come into wide use over the last 15 years as an iterative
method and has generally superseded the Jacobi–Gauss–Seidel–SOR family of
methods.
CG is intended to solve symmetric positive deﬁnite (spd) systems. Recall
that A is symmetric if A = A
T
and positive deﬁnite if
x
T
Ax > 0 for all x ,= 0.
11
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
12 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
In this section we assume that A is spd. Since A is spd we may deﬁne a norm
(you should check that this is a norm) by
x
A
=
√
x
T
Ax. (2.1)
 
A
is called the Anorm. The development in these notes is diﬀerent from
the classical work and more like the analysis for GMRES and CGNR in [134].
In this section, and in the section on GMRES that follows, we begin with a
description of what the algorithm does and the consequences of the minimiza
tion property of the iterates. After that we describe termination criterion,
performance, preconditioning, and at the very end, the implementation.
The kth iterate x
k
of CG minimizes
φ(x) =
1
2
x
T
Ax −x
T
b (2.2)
over x
0
+/
k
.
Note that if φ(˜ x) is the minimal value (in R
N
) then
∇φ(˜ x) = A˜ x −b = 0
and hence ˜ x = x
∗
.
Minimizing φ over any subset of R
N
is the same as minimizing x −x
∗

A
over that subset. We state this as a lemma.
Lemma 2.1.1. Let o ⊂ R
N
. If x
k
minimizes φ over o then x
k
also
minimizes x
∗
−x
A
= r
A
−1 over o.
Proof. Note that
x −x
∗

2
A
= (x −x
∗
)
T
A(x −x
∗
) = x
T
Ax −x
T
Ax
∗
−(x
∗
)
T
Ax + (x
∗
)
T
Ax
∗
.
Since A is symmetric and Ax
∗
= b
−x
T
Ax
∗
−(x
∗
)
T
Ax = −2x
T
Ax
∗
= −2x
T
b.
Therefore
x −x
∗

2
A
= 2φ(x) + (x
∗
)
T
Ax
∗
.
Since (x
∗
)
T
Ax
∗
is independent of x, minimizing φ is equivalent to minimizing
x −x
∗

2
A
and hence to minimizing x −x
∗

A
.
If e = x −x
∗
then
e
2
A
= e
T
Ae = (A(x −x
∗
))
T
A
−1
(A(x −x
∗
)) = b −Ax
2
A
−1
and hence the Anorm of the error is also the A
−1
norm of the residual.
We will use this lemma in the particular case that o = x
0
+/
k
for some k.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
CONJUGATE GRADIENT ITERATION 13
2.2. Consequences of the minimization property
Lemma 2.1.1 implies that since x
k
minimizes φ over x
0
+/
k
x
∗
−x
k

A
≤ x
∗
−w
A
(2.3)
for all w ∈ x
0
+/
k
. Since any w ∈ x
0
+/
k
can be written as
w =
k−1
j=0
γ
j
A
j
r
0
+x
0
for some coeﬃcients ¦γ
j
¦, we can express x
∗
−w as
x
∗
−w = x
∗
−x
0
−
k−1
j=0
γ
j
A
j
r
0
.
Since Ax
∗
= b we have
r
0
= b −Ax
0
= A(x
∗
−x
0
)
and therefore
x
∗
−w = x
∗
−x
0
−
k−1
j=0
γ
j
A
j+1
(x
∗
−x
0
) = p(A)(x
∗
−x
0
),
where the polynomial
p(z) = 1 −
k−1
j=0
γ
j
z
j+1
has degree k and satisﬁes p(0) = 1. Hence
x
∗
−x
k

A
= min
p∈P
k
,p(0)=1
p(A)(x
∗
−x
0
)
A
. (2.4)
In (2.4) P
k
denotes the set of polynomials of degree k.
The spectral theorem for spd matrices asserts that
A = UΛU
T
,
where U is an orthogonal matrix whose columns are the eigenvectors of A and
Λ is a diagonal matrix with the positive eigenvalues of A on the diagonal. Since
UU
T
= U
T
U = I by orthogonality of U, we have
A
j
= UΛ
j
U
T
.
Hence
p(A) = Up(Λ)U
T
.
Deﬁne A
1/2
= UΛ
1/2
U
T
and note that
x
2
A
= x
T
Ax = A
1/2
x
2
2
. (2.5)
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
14 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
Hence, for any x ∈ R
N
and
p(A)x
A
= A
1/2
p(A)x
2
≤ p(A)
2
A
1/2
x
2
≤ p(A)
2
x
A
.
This, together with (2.4) implies that
x
k
−x
∗

A
≤ x
0
−x
∗

A
min
p∈P
k
,p(0)=1
max
z∈σ(A)
[p(z)[. (2.6)
Here σ(A) is the set of all eigenvalues of A.
The following corollary is an important consequence of (2.6).
Corollary 2.2.1. Let A be spd and let ¦x
k
¦ be the CG iterates. Let k be
given and let ¦¯ p
k
¦ be any kth degree polynomial such that ¯ p
k
(0) = 1. Then
x
k
−x
∗

A
x
0
−x
∗

A
≤ max
z∈σ(A)
[ ¯ p
k
(z)[. (2.7)
We will refer to the polynomial ¯ p
k
as a residual polynomial [185].
Definition 2.2.1. The set of kth degree residual polynomials is
T
k
= ¦p [ p is a polynomial of degree k and p(0) = 1.¦ (2.8)
In speciﬁc contexts we try to construct sequences of residual polynomials,
based on information on σ(A), that make either the middle or the right term
in (2.7) easy to evaluate. This leads to an upper estimate for the number of
CG iterations required to reduce the Anorm of the error to a given tolerance.
One simple application of (2.7) is to show how the CG algorithm can be
viewed as a direct method.
Theorem 2.2.1. Let A be spd. Then the CG algorithm will ﬁnd the
solution within N iterations.
Proof. Let ¦λ
i
¦
N
i=1
be the eigenvalues of A. As a test polynomial, let
¯ p(z) =
N
i=1
(λ
i
−z)/λ
i
.
¯ p ∈ T
N
because ¯ p has degree N and ¯ p(0) = 1. Hence, by (2.7) and the fact
that ¯ p vanishes on σ(A),
x
N
−x
∗

A
≤ x
0
−x
∗

A
max
z∈σ(A)
[ ¯ p(z)[ = 0.
Note that our test polynomial had the eigenvalues of A as its roots. In
that way we showed (in the absence of all roundoﬀ error!) that CG terminated
in ﬁnitely many iterations with the exact solution. This is not as good as it
sounds, since in most applications the number of unknowns N is very large,
and one cannot aﬀord to perform N iterations. It is best to regard CG as an
iterative method. When doing that we seek to terminate the iteration when
some speciﬁed error tolerance is reached.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
CONJUGATE GRADIENT ITERATION 15
In the two examples that follow we look at some other easy consequences
of (2.7).
Theorem 2.2.2. Let A be spd with eigenvectors ¦u
i
¦
N
i=1
. Let b be a linear
combination of k of the eigenvectors of A
b =
k
l=1
γ
l
u
i
l
.
Then the CG iteration for Ax = b with x
0
= 0 will terminate in at most k
iterations.
Proof. Let ¦λ
i
l
¦ be the eigenvalues of A associated with the eigenvectors
¦u
i
l
¦
k
l=1
. By the spectral theorem
x
∗
=
k
l=1
(γ
l
/λ
i
l
)u
i
l
.
We use the residual polynomial,
¯ p(z) =
k
l=1
(λ
i
l
−z)/λ
i
l
.
One can easily verify that ¯ p ∈ T
k
. Moreover, ¯ p(λ
i
l
) = 0 for 1 ≤ l ≤ k and
hence
¯ p(A)x
∗
=
k
l=1
¯ p(λ
i
l
)γ
l
/λ
i
l
u
i
l
= 0.
So, we have by (2.4) and the fact that x
0
= 0 that
x
k
−x
∗

A
≤ ¯ p(A)x
∗

A
= 0.
This completes the proof.
If the spectrum of A has fewer than N points, we can use a similar technique
to prove the following theorem.
Theorem 2.2.3. Let A be spd. Assume that there are exactly k ≤ N
distinct eigenvalues of A. Then the CG iteration terminates in at most k
iterations.
2.3. Termination of the iteration
In practice we do not run the CG iteration until an exact solution is found, but
rather terminate once some criterion has been satisﬁed. One typical criterion is
small (say ≤ η) relative residuals. This means that we terminate the iteration
after
b −Ax
k

2
≤ ηb
2
. (2.9)
The error estimates that come from the minimization property, however, are
based on (2.7) and therefore estimate the reduction in the relative Anorm of
the error.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
16 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
Our next task is to relate the relative residual in the Euclidean norm to
the relative error in the Anorm. We will do this in the next two lemmas and
then illustrate the point with an example.
Lemma 2.3.1. Let A be spd with eigenvalues λ
1
≥ λ
2
≥ . . . λ
N
. Then for
all z ∈ R
N
,
A
1/2
z
2
= z
A
(2.10)
and
λ
1/2
N
z
A
≤ Az
2
≤ λ
1/2
1
z
A
. (2.11)
Proof. Clearly
z
2
A
= z
T
Az = (A
1/2
z)
T
(A
1/2
z) = A
1/2
z
2
2
which proves (2.10).
Let u
i
be a unit eigenvector corresponding to λ
i
. We may write A = UΛU
T
as
Az =
N
i=1
λ
i
(u
T
i
z)u
i
.
Hence
λ
N
A
1/2
z
2
2
= λ
N
N
i=1
λ
i
(u
T
i
z)
2
≤ Az
2
2
=
N
i=1
λ
2
i
(u
T
i
z)
2
≤ λ
1
N
i=1
λ
i
(u
T
i
z)
2
= λ
1
A
1/2
z
2
2
.
Taking square roots and using (2.10) complete the proof.
Lemma 2.3.2.
b
2
r
0

2
b −Ax
k

2
b
2
=
b −Ax
k

2
b −Ax
0

2
≤
_
κ
2
(A)
x
k
−x
∗

A
x
∗
−x
0

A
(2.12)
and
b −Ax
k

2
b
2
≤
_
κ
2
(A)r
0

2
b
2
x
k
−x
∗

A
x
∗
−x
0

A
. (2.13)
Proof. The equality on the left of (2.12) is clear and (2.13) follows directly
from (2.12). To obtain the inequality on the right of (2.12), ﬁrst recall that if
A = UΛU
T
is the spectral decomposition of A and we order the eigenvalues
such that λ
1
≥ λ
2
≥ . . . λ
N
> 0, then A
2
= λ
1
and A
−1

2
= 1/λ
N
. So
κ
2
(A) = λ
1
/λ
N
.
Therefore, using (2.10) and (2.11) twice,
b −Ax
k

2
b −Ax
0

2
=
A(x
∗
−x
k
)
2
A(x
∗
−x
0
)
2
≤
¸
λ
1
λ
N
x
∗
−x
k

A
x
∗
−x
0

A
as asserted.
So, to predict the performance of the CG iteration based on termination on
small relative residuals, we must not only use (2.7) to predict when the relative
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
CONJUGATE GRADIENT ITERATION 17
Anorm error is small, but also use Lemma 2.3.2 to relate small Anorm errors
to small relative residuals.
We consider a very simple example. Assume that x
0
= 0 and that the
eigenvalues of A are contained in the interval (9, 11). If we let ¯ p
k
(z) =
(10 −z)
k
/10
k
, then ¯ p
k
∈ T
k
. This means that we may apply (2.7) to get
x
k
−x
∗

A
≤ x
∗

A
max
9≤z≤11
[ ¯ p
k
(z)[.
It is easy to see that
max
9≤z≤11
[ ¯ p
k
(z)[ = 10
−k
.
Hence, after k iterations
x
k
−x
∗

A
≤ x
∗

A
10
−k
. (2.14)
So, the size of the Anorm of the error will be reduced by a factor of 10
−3
when
10
−k
≤ 10
−3
,
that is, when
k ≥ 3.
To use Lemma 2.3.2 we simply note that κ
2
(A) ≤ 11/9. Hence, after k
iterations we have
Ax
k
−b
2
b
2
≤
√
11 10
−k
/3.
So, the size of the relative residual will be reduced by a factor of 10
−3
when
10
−k
≤ 3 10
−3
/
√
11,
that is, when
k ≥ 4.
One can obtain a more precise estimate by using a polynomial other than
p
k
in the upper estimate for the righthand side of (2.7). Note that it is always
the case that the spectrum of a spd matrix is contained in the interval [λ
N
, λ
1
]
and that κ
2
(A) = λ
1
/λ
N
. A result from [48] (see also [45]) that is, in one
sense, the sharpest possible, is
x
k
−x
∗

A
≤ 2x
0
−x
∗

A
_
_
κ
2
(A) −1
_
κ
2
(A) + 1
_
k
. (2.15)
In the case of the above example, we can estimate κ
2
(A) by κ
2
(A) ≤ 11/9.
Hence, since (
√
x −1)/(
√
x + 1) is an increasing function of x on the interval
(1, ∞).
_
κ
2
(A) −1
_
κ
2
(A) + 1
≤
√
11 −3
√
11 + 3
≈ .05.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
18 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
Therefore (2.15) would predict a reduction in the size of the Anorm error by
a factor of 10
−3
when
2 .05
k
< 10
−3
or when
k > −log
10
(2000)/ log
10
(.05) ≈ 3.3/1.3 ≈ 2.6,
which also predicts termination within three iterations.
We may have more precise information than a single interval containing
σ(A). When we do, the estimate in (2.15) can be very pessimistic. If the
eigenvalues cluster in a small number of intervals, the condition number can
be quite large, but CG can perform very well. We will illustrate this with an
example. Exercise 2.8.5 also covers this point.
Assume that x
0
= 0 and the eigenvalues of A lie in the two intervals (1, 1.5)
and (399, 400). Based on this information the best estimate of the condition
number of A is κ
2
(A) ≤ 400, which, when inserted into (2.15) gives
x
k
−x
∗

A
x
∗

A
≤ 2 (19/21)
k
≈ 2 (.91)
k
.
This would indicate fairly slow convergence. However, if we use as a residual
polynomial ¯ p
3k
∈ T
3k
¯ p
3k
(z) =
(1.25 −z)
k
(400 −z)
2k
(1.25)
k
400
2k
.
It is easy to see that
max
z∈σ(A)
[ ¯ p
3k
(z)[ ≤ (.25/1.25)
k
= (.2)
k
,
which is a sharper estimate on convergence. In fact, (2.15) would predict that
x
k
−x
∗

A
≤ 10
−3
x
∗

A
,
when 2 (.91)
k
< 10
−3
or when
k > −log
10
(2000)/ log
10
(.91) ≈ 3.3/.04 = 82.5.
The estimate based on the clustering gives convergence in 3k iterations when
(.2)
k
≤ 10
−3
or when
k > −3/ log
10
(.2) = 4.3.
Hence (2.15) predicts 83 iterations and the clustering analysis 15 (the smallest
integer multiple of 3 larger than 3 4.3 = 12.9).
From the results above one can see that if the condition number of A is near
one, the CG iteration will converge very rapidly. Even if the condition number
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
CONJUGATE GRADIENT ITERATION 19
is large, the iteration will perform well if the eigenvalues are clustered in a few
small intervals. The transformation of the problem into one with eigenvalues
clustered near one (i.e., easier to solve) is called preconditioning. We used
this term before in the context of Richardson iteration and accomplished the
goal by multiplying A by an approximate inverse. In the context of CG, such
a simple approach can destroy the symmetry of the coeﬃcient matrix and a
more subtle implementation is required. We discuss this in ¸ 2.5.
2.4. Implementation
The implementation of CG depends on the amazing fact that once x
k
has been
determined, either x
k
= x
∗
or a search direction p
k+1
,= 0 can be found very
cheaply so that x
k+1
= x
k
+ α
k+1
p
k+1
for some scalar α
k+1
. Once p
k+1
has
been found, α
k+1
is easy to compute from the minimization property of the
iteration. In fact
dφ(x
k
+αp
k+1
)
dα
= 0 (2.16)
for the correct choice of α = α
k+1
. Equation (2.16) can be written as
p
T
k+1
Ax
k
+αp
T
k+1
Ap
k+1
−p
T
k+1
b = 0
leading to
α
k+1
=
p
T
k+1
(b −Ax
k
)
p
T
k+1
Ap
k+1
=
p
T
k+1
r
k
p
T
k+1
Ap
k+1
. (2.17)
If x
k
= x
k+1
then the above analysis implies that α = 0. We show that
this only happens if x
k
is the solution.
Lemma 2.4.1. Let A be spd and let ¦x
k
¦ be the conjugate gradient iterates.
Then
r
T
k
r
l
= 0 for all 0 ≤ l < k. (2.18)
Proof. Since x
k
minimizes φ on x
0
+/
k
, we have, for any ξ ∈ /
k
dφ(x
k
+tξ)
dt
= ∇φ(x
k
+tξ)
T
ξ = 0
at t = 0. Recalling that
∇φ(x) = Ax −b = −r
we have
∇φ(x
k
)
T
ξ = −r
T
k
ξ = 0 for all ξ ∈ /
k
. (2.19)
Since r
l
∈ /
k
for all l < k (see Exercise 2.8.1), this proves (2.18).
Now, if x
k
= x
k+1
, then r
k
= r
k+1
. Lemma 2.4.1 then implies that
r
k

2
2
= r
T
k
r
k
= r
T
k
r
k+1
= 0 and hence x
k
= x
∗
.
The next lemma characterizes the search direction and, as a side eﬀect,
proves that (if we deﬁne p
0
= 0) p
T
l
r
k
= 0 for all 0 ≤ l < k ≤ n, unless the
iteration terminates prematurely.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
20 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
Lemma 2.4.2. Let A be spd and let ¦x
k
¦ be the conjugate gradient iterates.
If x
k
,= x
∗
then x
k+1
= x
k
+ α
k+1
p
k+1
and p
k+1
is determined up to a scalar
multiple by the conditions
p
k+1
∈ /
k+1
, p
T
k+1
Aξ = 0 for all ξ ∈ /
k
. (2.20)
Proof. Since /
k
⊂ /
k+1
,
∇φ(x
k+1
)
T
ξ = (Ax
k
+α
k+1
Ap
k+1
−b)
T
ξ = 0 (2.21)
for all ξ ∈ /
k
. (2.19) and (2.21) then imply that for all ξ ∈ /
k
,
α
k+1
p
T
k+1
Aξ = −(Ax
k
−b)
T
ξ = −∇φ(x
k
)
T
ξ = 0. (2.22)
This uniquely speciﬁes the direction of p
k+1
as (2.22) implies that p
k+1
∈ /
k+1
is Aorthogonal (i.e., in the scalar product (x, y) = x
T
Ay) to /
k
, a subspace
of dimension one less than /
k+1
.
The condition p
T
k+1
Aξ = 0 is called Aconjugacy of p
k+1
to /
k
. Now, any
p
k+1
satisfying (2.20) can, up to a scalar multiple, be expressed as
p
k+1
= r
k
+w
k
with w
k
∈ /
k
. While one might think that w
k
would be hard to compute, it
is, in fact, trivial. We have the following theorem.
Theorem 2.4.1. Let A be spd and assume that r
k
,= 0. Deﬁne p
0
= 0.
Then
p
k+1
= r
k
+β
k+1
p
k
for some β
k+1
and k ≥ 0. (2.23)
Proof. By Lemma 2.4.2 and the fact that /
k
= span(r
0
, . . . , r
k−1
), we need
only verify that a β
k+1
can be found so that if p
k+1
is given by (2.23) then
p
T
k+1
Ar
l
= 0
for all 0 ≤ l ≤ k −1.
Let p
k+1
be given by (2.23). Then for any l ≤ k
p
T
k+1
Ar
l
= r
T
k
Ar
l
+β
k+1
p
T
k
Ar
l
.
If l ≤ k −2, then r
l
∈ /
l+1
⊂ /
k−1
. Lemma 2.4.2 then implies that
p
T
k+1
Ar
l
= 0 for 0 ≤ l ≤ k −2.
It only remains to solve for β
k+1
so that p
T
k+1
Ar
k−1
= 0. Trivially
β
k+1
= −r
T
k
Ar
k−1
/p
T
k
Ar
k−1
(2.24)
provided p
T
k
Ar
k−1
,= 0. Since
r
k
= r
k−1
−α
k
Ap
k
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
CONJUGATE GRADIENT ITERATION 21
we have
r
T
k
r
k−1
= r
k−1

2
2
−α
k
p
T
k
Ar
k−1
.
Since r
T
k
r
k−1
= 0 by Lemma 2.4.1 we have
p
T
k
Ar
k−1
= r
k−1

2
2
/α
k
,= 0. (2.25)
This completes the proof.
The common implementation of conjugate gradient uses a diﬀerent form
for α
k
and β
k
than given in (2.17) and (2.24).
Lemma 2.4.3. Let A be spd and assume that r
k
,= 0. Then
α
k+1
=
r
k

2
2
p
T
k+1
Ap
k+1
(2.26)
and
β
k+1
=
r
k

2
2
r
k−1

2
2
. (2.27)
Proof. Note that for k ≥ 0
p
T
k+1
r
k+1
= r
T
k
r
k+1
+β
k+1
p
T
k
r
k+1
= 0 (2.28)
by Lemma 2.4.2. An immediate consequence of (2.28) is that p
T
k
r
k
= 0 and
hence
p
T
k+1
r
k
= (r
k
+β
k+1
p
k
)
T
r
k
= r
k

2
2
. (2.29)
Taking scalar products of both sides of
r
k+1
= r
k
−α
k+1
Ap
k+1
with p
k+1
and using (2.29) gives
0 = p
T
k+1
r
k
−α
k+1
p
T
k+1
Ap
k+1
= r
T
k

2
2
−α
k+1
p
T
k+1
Ap
k+1
,
which is equivalent to (2.26).
To get (2.27) note that p
T
k+1
Ap
k
= 0 and hence (2.23) implies that
β
k+1
=
−r
T
k
Ap
k
p
T
k
Ap
k
. (2.30)
Also note that
p
T
k
Ap
k
= p
T
k
A(r
k−1
+β
k
p
k−1
)
= p
T
k
Ar
k−1
+β
k
p
T
k
Ap
k−1
= p
T
k
Ar
k−1
.
(2.31)
Now combine (2.30), (2.31), and (2.25) to get
β
k+1
=
−r
T
k
Ap
k
α
k
r
k−1

2
2
.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
22 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
Now take scalar products of both sides of
r
k
= r
k−1
−α
k
Ap
k
with r
k
and use Lemma 2.4.1 to get
r
k

2
2
= −α
k
r
T
k
Ap
k
.
Hence (2.27) holds.
The usual implementation reﬂects all of the above results. The goal is to
ﬁnd, for a given , a vector x so that b−Ax
2
≤ b
2
. The input is the initial
iterate x, which is overwritten with the solution, the right hand side b, and a
routine which computes the action of A on a vector. We limit the number of
iterations to kmax and return the solution, which overwrites the initial iterate
x and the residual norm.
Algorithm 2.4.1. cg(x, b, A, , kmax)
1. r = b −Ax, ρ
0
= r
2
2
, k = 1.
2. Do While
√
ρ
k−1
> b
2
and k < kmax
(a) if k = 1 then p = r
else
β = ρ
k−1
/ρ
k−2
and p = r +βp
(b) w = Ap
(c) α = ρ
k−1
/p
T
w
(d) x = x +αp
(e) r = r −αw
(f) ρ
k
= r
2
2
(g) k = k + 1
Note that the matrix A itself need not be formed or stored, only a routine
for matrixvector products is required. Krylov space methods are often called
matrixfree for that reason.
Now, consider the costs. We need store only the four vectors x, w, p, and r.
Each iteration requires a single matrixvector product (to compute w = Ap),
two scalar products (one for p
T
w and one to compute ρ
k
= r
2
2
), and three
operations of the form ax +y, where x and y are vectors and a is a scalar.
It is remarkable that the iteration can progress without storing a basis for
the entire Krylov subspace. As we will see in the section on GMRES, this is
not the case in general. The spd structure buys quite a lot.
2.5. Preconditioning
To reduce the condition number, and hence improve the performance of the
iteration, one might try to replace Ax = b by another spd system with the
same solution. If M is a spd matrix that is close to A
−1
, then the eigenvalues
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
CONJUGATE GRADIENT ITERATION 23
of MA will be clustered near one. However MA is unlikely to be spd, and
hence CG cannot be applied to the system MAx = Mb.
In theory one avoids this diﬃculty by expressing the preconditioned
problem in terms of B, where B is spd, A = B
2
, and by using a twosided
preconditioner, S ≈ B
−1
(so M = S
2
). Then the matrix SAS is spd and its
eigenvalues are clustered near one. Moreover the preconditioned system
SASy = Sb
has y
∗
= S
−1
x
∗
as a solution, where Ax
∗
= b. Hence x
∗
can be recovered
from y
∗
by multiplication by S. One might think, therefore, that computing
S (or a subroutine for its action on a vector) would be necessary and that a
matrixvector multiply by SAS would incur a cost of one multiplication by A
and two by S. Fortunately, this is not the case.
If y
k
, ˆ r
k
, ˆ p
k
are the iterate, residual, and search direction for CG applied
to SAS and we let
x
k
= Sˆ y
k
, r
k
= S
−1
ˆ r
k
, p
k
= Sˆ p
k
, and z
k
= Sˆ r
k
,
then one can perform the iteration directly in terms of x
k
, A, and M. The
reader should verify that the following algorithm does exactly that. The input
is the same as that for Algorithm cg and the routine to compute the action of
the preconditioner on a vector. Aside from the preconditioner, the arguments
to pcg are the same as those to Algorithm cg.
Algorithm 2.5.1. pcg(x, b, A, M, , kmax)
1. r = b −Ax, ρ
0
= r
2
2
, k = 1
2. Do While
√
ρ
k−1
> b
2
and k < kmax
(a) z = Mr
(b) τ
k−1
= z
T
r
(c) if k = 1 then β = 0 and p = z
else
β = τ
k−1
/τ
k−2
, p = z +βp
(d) w = Ap
(e) α = τ
k−1
/p
T
w
(f) x = x +αp
(g) r = r −αw
(h) ρ
k
= r
T
r
(i) k = k + 1
Note that the cost is identical to CG with the addition of
• the application of the preconditioner M in step 2a and
• the additional inner product required to compute τ
k
in step 2b.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
24 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
Of these costs, the application of the preconditioner is usually the larger. In
the remainder of this section we brieﬂy mention some classes of preconditioners.
A more complete and detailed discussion of preconditioners is in [8] and a
concise survey with many pointers to the literature is in [12].
Some eﬀective preconditioners are based on deep insight into the structure
of the problem. See [124] for an example in the context of partial diﬀerential
equations, where it is shown that certain discretized secondorder elliptic
problems on simple geometries can be very well preconditioned with fast
Poisson solvers [99], [188], and [187]. Similar performance can be obtained from
multigrid [99], domain decomposition, [38], [39], [40], and alternating direction
preconditioners [8], [149], [193], [194]. We use a Poisson solver preconditioner
in the examples in ¸ 2.7 and ¸ 3.7 as well as for nonlinear problems in ¸ 6.4.2
and ¸ 8.4.2.
One commonly used and easily implemented preconditioner is Jacobi
preconditioning, where M is the inverse of the diagonal part of A. One can also
use other preconditioners based on the classical stationary iterative methods,
such as the symmetric Gauss–Seidel preconditioner (1.18). For applications to
partial diﬀerential equations, these preconditioners may be somewhat useful,
but should not be expected to have dramatic eﬀects.
Another approach is to apply a sparse Cholesky factorization to the
matrix A (thereby giving up a fully matrixfree formulation) and discarding
small elements of the factors and/or allowing only a ﬁxed amount of storage
for the factors. Such preconditioners are called incomplete factorization
preconditioners. So if A = LL
T
+ E, where E is small, the preconditioner
is (LL
T
)
−1
and its action on a vector is done by two sparse triangular solves.
We refer the reader to [8], [127], and [44] for more detail.
One could also attempt to estimate the spectrum of A, ﬁnd a polynomial
p such that 1 −zp(z) is small on the approximate spectrum, and use p(A) as a
preconditioner. This is called polynomial preconditioning. The preconditioned
system is
p(A)Ax = p(A)b
and we would expect the spectrum of p(A)A to be more clustered near z = 1
than that of A. If an interval containing the spectrum can be found, the
residual polynomial q(z) = 1 − zp(z) of smallest L
∞
norm on that interval
can be expressed in terms of Chebyshev [161] polynomials. Alternatively
q can be selected to solve a least squares minimization problem [5], [163].
The preconditioning p can be directly recovered from q and convergence rate
estimates made. This technique is used to prove the estimate (2.15), for
example. The cost of such a preconditioner, if a polynomial of degree K is
used, is K matrixvector products for each application of the preconditioner
[5]. The performance gains can be very signiﬁcant and the implementation is
matrixfree.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
CONJUGATE GRADIENT ITERATION 25
2.6. CGNR and CGNE
If A is nonsingular and nonsymmetric, one might consider solving Ax = b by
applying CG to the normal equations
A
T
Ax = A
T
b. (2.32)
This approach [103] is called CGNR [71], [78], [134]. The reason for this name
is that the minimization property of CG as applied to (2.32) asserts that
x
∗
−x
2
A
T
A
= (x
∗
−x)
T
A
T
A(x
∗
−x)
= (Ax
∗
−Ax)
T
(Ax
∗
−Ax) = (b −Ax)
T
(b −Ax) = r
2
is minimized over x
0
+/
k
at each iterate. Hence the name Conjugate Gradient
on the Normal equations to minimize the Residual.
Alternatively, one could solve
AA
T
y = b (2.33)
and then set x = A
T
y to solve Ax = b. This approach [46] is now called CGNE
[78], [134]. The reason for this name is that the minimization property of CG
as applied to (2.33) asserts that if y
∗
is the solution to (2.33) then
y
∗
−y
2
AA
T
= (y
∗
−y)
T
(AA
T
)(y
∗
−y) = (A
T
y
∗
−A
T
y)
T
(A
T
y
∗
−A
T
y)
= x
∗
−x
2
2
is minimized over y
0
+/
k
at each iterate. Conjugate Gradient on the Normal
equations to minimize the Error.
The advantages of this approach are that all the theory for CG carries over
and the simple implementation for both CG and PCG can be used. There
are three disadvantages that may or may not be serious. The ﬁrst is that the
condition number of the coeﬃcient matrix A
T
A is the square of that of A.
The second is that two matrixvector products are needed for each CG iterate
since w = A
T
Ap = A
T
(Ap) in CGNR and w = AA
T
p = A(A
T
p) in CGNE.
The third, more important, disadvantage is that one must compute the action
of A
T
on a vector as part of the matrixvector product involving A
T
A. As we
will see in the chapter on nonlinear problems, there are situations where this
is not possible.
The analysis with residual polynomials is similar to that for CG. We
consider the case for CGNR, the analysis for CGNE is essentially the same.
As above, when we consider the A
T
A norm of the error, we have
x
∗
−x
2
A
T
A
= (x
∗
−x)
T
A
T
A(x
∗
−x) = A(x
∗
−x)
2
2
= r
2
2
.
Hence, for any residual polynomial ¯ p
k
∈ T
k
,
r
k

2
≤ ¯ p
k
(A
T
A)r
0

2
≤ r
0

2
max
z∈σ(A
T
A)
[ ¯ p
k
(z)[. (2.34)
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
26 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
There are two major diﬀerences between (2.34) and (2.7). The estimate is
in terms of the l
2
norm of the residual, which corresponds exactly to the
termination criterion, hence we need not prove a result like Lemma 2.3.2. Most
signiﬁcantly, the residual polynomial is to be maximized over the eigenvalues
of A
T
A, which is the set of the squares of the singular values of A. Hence the
performance of CGNR and CGNE is determined by the distribution of singular
values.
2.7. Examples for preconditioned conjugate iteration
In the collection of MATLAB codes we provide a code for preconditioned
conjugate gradient iteration. The inputs, described in the comment lines,
are the initial iterate, x
0
, the right hand side vector b, MATLAB functions for
the matrixvector product and (optionally) the preconditioner, and iteration
parameters to specify the maximum number of iterations and the termination
criterion. On return the code supplies the approximate solution x and the
history of the iteration as the vector of residual norms.
We consider the discretization of the partial diﬀerential equation
−∇
˙
(a(x, y)∇u) = f(x, y) (2.35)
on 0 < x, y < 1 subject to homogeneous Dirichlet boundary conditions
u(x, 0) = u(x, 1) = u(0, y) = u(1, y) = 0, 0 < x, y < 1.
One can verify [105] that the diﬀerential operator is positive deﬁnite in the
Hilbert space sense and that the ﬁvepoint discretization described below is
positive deﬁnite if a > 0 for all 0 ≤ x, y ≤ 1 (Exercise 2.8.10).
We discretize with a ﬁvepoint centered diﬀerence scheme with n
2
points
and mesh width h = 1/(n + 1). The unknowns are
u
ij
≈ u(x
i
, x
j
)
where x
i
= ih for 1 ≤ i ≤ n. We set
u
0j
= u
(n+1)j
= u
i0
= u
i(n+1)
= 0,
to reﬂect the boundary conditions, and deﬁne
α
ij
= −a(x
i
, x
j
)h
−2
/2.
We express the discrete matrixvector product as
(Au)
ij
= (α
ij
+α
(i+1)j
)(u
(i+1)j
−u
ij
)
−(α
(i−1)j
+α
ij
)(u
ij
−u
(i−1)j
) + (α
i(j+1)
+α
ij
)(u
i(j+1)
−u
ij
)
−(α
ij
+α
i(j−1)
)(u
ij
−u
i(j−1)
)
(2.36)
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
CONJUGATE GRADIENT ITERATION 27
for 1 ≤ i, j ≤ n.
For the MATLAB implementation we convert freely from the representa
tion of u as a twodimensional array (with the boundary conditions added),
which is useful for computing the action of A on u and applying fast solvers,
and the representation as a onedimensional array, which is what pcgsol ex
pects to see. See the routine fish2d in collection of MATLAB codes for an
example of how to do this in MATLAB.
For the computations reported in this section we took a(x, y) = cos(x) and
took the right hand side so that the exact solution was the discretization of
10xy(1 −x)(1 −y) exp(x
4.5
).
The initial iterate was u
0
= 0.
In the results reported here we took n = 31 resulting in a system with
N = n
2
= 961 unknowns. We expect secondorder accuracy from the method
and accordingly we set termination parameter = h
2
= 1/1024. We allowed
up to 100 CG iterations. The initial iterate was the zero vector. We will report
our results graphically, plotting r
k

2
/b
2
on a semilog scale.
In Fig. 2.1 the solid line is a plot of r
k
/b and the dashed line a plot of
u
∗
−u
k

A
/u
∗
−u
0

A
. Note that the reduction in r is not monotone. This is
consistent with the theory, which predicts decrease in e
A
but not necessarily
in r as the iteration progresses. Note that the unpreconditioned iteration is
slowly convergent. This can be explained by the fact that the eigenvalues are
not clustered and
κ(A) = O(1/h
2
) = O(n
2
) = O(N)
and hence (2.15) indicates that convergence will be slow. The reader is asked
to quantify this in terms of execution times in Exercise 2.8.9. This example
illustrates the importance of a good preconditioner. Even the unpreconditioned
iteration, however, is more eﬃcient that the classical stationary iterative
methods.
For a preconditioner we use a Poisson solver. By this we mean an operator
G such that v = Gw is the solution of the discrete form of
−v
xx
−v
yy
= w,
subject to homogeneous Dirichlet boundary conditions. The eﬀectiveness of
such a preconditioner has been analyzed in [124] and some of the many ways
to implement the solver eﬃciently are discussed in [99], [188], [186], and [187].
The properties of CG on the preconditioned problem in the continuous
case have been analyzed in [48]. For many types of domains and boundary
conditions, Poisson solvers can be designed to take advantage of vector and/or
parallel architectures or, in the case of the MATLAB environment used in
this book, designed to take advantage of fast MATLAB builtin functions.
Because of this their execution time is less than a simple count of ﬂoating
point operations would indicate. The fast Poisson solver in the collection of
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
28 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
0 10 20 30 40 50 60
10
4
10
3
10
2
10
1
10
0
10
1
Iterations
R
e
l
a
t
i
v
e
R
e
s
i
d
u
a
l
a
n
d
A

n
o
r
m
o
f
E
r
r
o
r
Fig. 2.1. CG for 2D elliptic equation.
0 10 20 30 40 50 60
10
4
10
3
10
2
10
1
10
0
10
1
Iterations
R
e
l
a
t
i
v
e
R
e
s
i
d
u
a
l
Fig. 2.2. PCG for 2D elliptic equation.
codes fish2d is based on the MATLAB fast Fourier transform, the builtin
function fft.
In Fig. 2.2 the solid line is the graph of r
k

2
/b
2
for the preconditioned
iteration and the dashed line for the unpreconditioned. The preconditioned
iteration required 5 iterations for convergence and the unpreconditioned
iteration 52. Not only does the preconditioned iteration converge more
rapidly, but the number of iterations required to reduce the relative residual
by a given amount is independent of the mesh spacing [124]. We caution
the reader that the preconditioned iteration is not as much faster than the
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
CONJUGATE GRADIENT ITERATION 29
unpreconditioned one as the iteration count would suggest. The MATLAB
flops command indicates that the unpreconditioned iteration required roughly
1.2 million ﬂoatingpoint operations while the preconditioned iteration required
.87 million ﬂoatingpoint operations. Hence, the cost of the preconditioner is
considerable. In the MATLAB environment we used, the execution time of
the preconditioned iteration was about 60% of that of the unpreconditioned.
As we remarked above, this speed is a result of the eﬃciency of the MATLAB
fast Fourier transform. In Exercise 2.8.11 you are asked to compare execution
times for your own environment.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
30 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
2.8. Exercises on conjugate gradient
2.8.1. Let ¦x
k
¦ be the conjugate gradient iterates. Prove that r
l
∈ /
k
for all
l < k.
2.8.2. Let A be spd. Show that there is a spd B such that B
2
= A. Is B
unique?
2.8.3. Let Λ be a diagonal matrix with Λ
ii
= λ
i
and let p be a polynomial.
Prove that p(Λ) = max
i
[p(λ
i
)[ where   is any induced matrix norm.
2.8.4. Prove Theorem 2.2.3.
2.8.5. Assume that A is spd and that
σ(A) ⊂ (1, 1.1) ∪ (2, 2.2).
Give upper estimates based on (2.6) for the number of CG iterations
required to reduce the A norm of the error by a factor of 10
−3
and for
the number of CG iterations required to reduce the residual by a factor
of 10
−3
.
2.8.6. For the matrix A in problem 5, assume that the cost of a matrix vector
multiply is 4N ﬂoatingpoint multiplies. Estimate the number of ﬂoating
point operations reduce the A norm of the error by a factor of 10
−3
using
CG iteration.
2.8.7. Let A be a nonsingular matrix with all singular values in the interval
(1, 2). Estimate the number of CGNR/CGNE iterations required to
reduce the relative residual by a factor of 10
−4
.
2.8.8. Show that if A has constant diagonal then PCG with Jacobi precondi
tioning produces the same iterates as CG with no preconditioning.
2.8.9. Assume that A is N N, nonsingular, and spd. If κ(A) = O(N), give
a rough estimate of the number of CG iterates required to reduce the
relative residual to O(1/N).
2.8.10. Prove that the linear transformation given by (2.36) is symmetric and
positive deﬁnite on R
n
2
if a(x, y) > 0 for all 0 ≤ x, y ≤ 1.
2.8.11. Duplicate the results in ¸ 2.7 for example, in MATLAB by writing the
matrixvector product routines and using the MATLAB codes pcgsol
and fish2d. What happens as N is increased? How are the performance
and accuracy aﬀected by changes in a(x, y)? Try a(x, y) =
√
.1 +x and
examine the accuracy of the result. Explain your ﬁndings. Compare
the execution times on your computing environment (using the cputime
command in MATLAB, for instance).
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
CONJUGATE GRADIENT ITERATION 31
2.8.12. Use the Jacobi and symmetric Gauss–Seidel iterations from Chapter 1
to solve the elliptic boundary value problem in ¸ 2.7. How does the
performance compare to CG and PCG?
2.8.13. Implement Jacobi (1.17) and symmetric Gauss–Seidel (1.18) precondi
tioners for the elliptic boundary value problem in ¸ 2.7. Compare the
performance with respect to both computer time and number of itera
tions to preconditioning with the Poisson solver.
2.8.14. Modify pcgsol so that φ(x) is computed and stored at each iterate
and returned on output. Plot φ(x
n
) as a function of n for each of the
examples.
2.8.15. Apply CG and PCG to solve the ﬁvepoint discretization of
−u
xx
(x, y) −u
yy
(x, y) +e
x+y
u(x, y) = 1, 0 < x, y, < 1,
subject to the inhomogeneous Dirichlet boundary conditions
u(x, 0) = u(x, 1) = u(1, y) = 0, u(0, y) = 1, 0 < x, y < 1.
Experiment with diﬀerent mesh sizes and preconditioners (Fast Poisson
solver, Jacobi, and symmetric Gauss–Seidel).
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
32 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
Chapter 3
GMRES Iteration
3.1. The minimization property and its consequences
The GMRES (Generalized Minimum RESidual) was proposed in 1986 in [167]
as a Krylov subspace method for nonsymmetric systems. Unlike CGNR,
GMRES does not require computation of the action of A
T
on a vector. This is
a signiﬁcant advantage in many cases. The use of residual polynomials is made
more complicated because we cannot use the spectral theorem to decompose
A. Moreover, one must store a basis for /
k
, and therefore storage requirements
increase as the iteration progresses.
The kth (k ≥ 1) iteration of GMRES is the solution to the least squares
problem
minimize
x∈x
0
+K
k
b −Ax
2
. (3.1)
The beginning of this section is much like the analysis for CG. Note that
if x ∈ x
0
+/
k
then
x = x
0
+
k−1
j=0
γ
j
A
j
r
0
and so
b −Ax = b −Ax
0
−
k−1
j=0
γ
j
A
j+1
r
0
= r
0
−
k
j=1
γ
j−1
A
j
r
0
.
Hence if x ∈ x
0
+/
k
then r = ¯ p(A)r
0
where ¯ p ∈ T
k
is a residual polynomial.
We have just proved the following result.
Theorem 3.1.1. Let A be nonsingular and let x
k
be the kth GMRES
iteration. Then for all ¯ p
k
∈ T
k
r
k

2
= min
p∈P
k
¯ p(A)r
0

2
≤ ¯ p
k
(A)r
0

2
. (3.2)
From this we have the following corollary.
Corollary 3.1.1. Let A be nonsingular and let x
k
be the kth GMRES
iteration. Then for all ¯ p
k
∈ T
k
r
k

2
r
0

2
≤ ¯ p
k
(A)
2
. (3.3)
33
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
34 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
We can apply the corollary to prove ﬁnite termination of the GMRES
iteration.
Theorem 3.1.2. Let A be nonsingular. Then the GMRES algorithm will
ﬁnd the solution within N iterations.
Proof. The characteristic polynomial of A is p(z) = det(A − zI). p has
degree N, p(0) = det(A) ,= 0 since A is nonsingular, and so
¯ p
N
(z) = p(z)/p(0) ∈ T
N
is a residual polynomial. It is well known [141] that p(A) = ¯ p
N
(A) = 0. By
(3.3), r
N
= b −Ax
N
= 0 and hence x
N
is the solution.
In Chapter 2 we applied the spectral theorem to obtain more precise infor
mation on convergence rates. This is not an option for general nonsymmetric
matrices. However, if A is diagonalizable we may use (3.2) to get information
from clustering of the spectrum just like we did for CG. We pay a price in that
we must use complex arithmetic for the only time in this book. Recall that
A is diagonalizable if there is a nonsingular (possibly complex!) matrix V such
that
A = V ΛV
−1
.
Here Λ is a (possibly complex!) diagonal matrix with the eigenvalues of A on
the diagonal. If A is a diagonalizable matrix and p is a polynomial then
p(A) = V p(Λ)V
−1
A is normal if the diagonalizing transformation V is orthogonal. In that case
the columns of V are the eigenvectors of A and V
−1
= V
H
. Here V
H
is the
complex conjugate transpose of V . In the remainder of this section we must
use complex arithmetic to analyze the convergence. Hence we will switch to
complex matrices and vectors. Recall that the scalar product in C
N
, the space
of complex Nvectors, is x
H
y. In particular, we will use the l
2
norm in C
N
.
Our use of complex arithmetic will be implicit for the most part and is needed
only so that we may admit the possibility of complex eigenvalues of A.
We can use the structure of a diagonalizable matrix to prove the following
result.
Theorem 3.1.3. Let A = V ΛV
−1
be a nonsingular diagonalizable matrix.
Let x
k
be the kth GMRES iterate. Then for all ¯ p
k
∈ T
k
r
k

2
r
0

2
≤ κ
2
(V ) max
z∈σ(A)
[ ¯ p
k
(z)[. (3.4)
Proof. Let ¯ p
k
∈ T
k
. We can easily estimate ¯ p
k
(A)
2
by
¯ p
k
(A)
2
≤ V 
2
V
−1

2
¯ p
k
(Λ)
2
≤ κ
2
(V ) max
z∈σ(A)
[ ¯ p
k
(z)[,
as asserted.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
GMRES ITERATION 35
It is not clear how one should estimate the condition number of the
diagonalizing transformation if it exists. If A is normal, of course, κ
2
(V ) = 1.
As we did for CG, we look at some easy consequences of (3.3) and (3.4).
Theorem 3.1.4. Let A be a nonsingular diagonalizable matrix. Assume
that A has only k distinct eigenvalues. Then GMRES will terminate in at most
k iterations.
Theorem 3.1.5. Let A be a nonsingular normal matrix. Let b be a linear
combination of k of the eigenvectors of A
b =
k
l=1
γ
l
u
i
l
.
Then the GMRES iteration, with x
0
= 0, for Ax = b will terminate in at most
k iterations.
3.2. Termination
As is the case with CG, GMRES is best thought of as an iterative method.
The convergence rate estimates for the diagonalizable case will involve κ
2
(V ),
but will otherwise resemble those for CG. If A is not diagonalizable, rate
estimates have been derived in [139], [134], [192], [33], and [34]. As the set of
nondiagonalizable matrices has measure zero in the space of N N matrices,
the chances are very high that a computed matrix will be diagonalizable. This
is particularly so for the ﬁnite diﬀerence Jacobian matrices we consider in
Chapters 6 and 8. Hence we conﬁne our attention to diagonalizable matrices.
As was the case with CG, we terminate the iteration when
r
k

2
≤ ηb
2
(3.5)
for the purposes of this example. We can use (3.3) and (3.4) directly to estimate
the ﬁrst k such that (3.5) holds without requiring a lemma like Lemma 2.3.2.
Again we look at examples. Assume that A = V ΛV
−1
is diagonalizable,
that the eigenvalues of A lie in the interval (9, 11), and that κ
2
(V ) = 100.
We assume that x
0
= 0 and hence r
0
= b. Using the residual polynomial
¯ p
k
(z) = (10 −z)
k
/10
k
we ﬁnd
r
k

2
r
0

2
≤ (100)10
−k
= 10
2−k
.
Hence (3.5) holds when 10
2−k
< η or when
k > 2 + log
10
(η).
Assume that I − A
2
≤ ρ < 1. Let ¯ p
k
(z) = (1 − z)
k
. It is a direct
consequence of (3.2) that
r
k

2
≤ ρ
k
r
0

2
. (3.6)
The estimate (3.6) illustrates the potential beneﬁts of a good approximate
inverse preconditioner.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
36 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
The convergence estimates for GMRES in the nonnormal case are much
less satisfying that those for CG, CGNR, CGNE, or GMRES in the normal
case. This is a very active area of research and we refer to [134], [33], [120],
[34], and [36] for discussions of and pointers to additional references to several
questions related to nonnormal matrices.
3.3. Preconditioning
Preconditioning for GMRES and other methods for nonsymmetric problems is
diﬀerent from that for CG. There is no concern that the preconditioned system
be spd and hence (3.6) essentially tells the whole story. However there are two
diﬀerent ways to view preconditioning. If one can ﬁnd M such that
I −MA
2
= ρ < 1,
then applying GMRES to MAx = Mb allows one to apply (3.6) to the
preconditioned system. Preconditioning done in this way is called left
preconditioning. If r = MAx − Mb is the residual for the preconditioned
system, we have, if the product MA can be formed without error,
e
k

2
e
0

2
≤ κ
2
(MA)
r
k

2
r
0

2
,
by Lemma 1.1.1. Hence, if MA has a smaller condition number than A, we
might expect the relative residual of the preconditioned system to be a better
indicator of the relative error than the relative residual of the original system.
If
I −AM
2
= ρ < 1,
one can solve the system AMy = b with GMRES and then set x = My. This is
called right preconditioning. The residual of the preconditioned problem is the
same as that of the unpreconditioned problem. Hence, the value of the relative
residuals as estimators of the relative error is unchanged. Right preconditioning
has been used as the basis for a method that changes the preconditioner as the
iteration progresses [166].
One important aspect of implementation is that, unlike PCG, one can
apply the algorithm directly to the system MAx = Mb (or AMy = b). Hence,
one can write a single matrixvector product routine for MA (or AM) that
includes both the application of A to a vector and that of the preconditioner.
Most of the preconditioning ideas mentioned in ¸ 2.5 are useful for GMRES
as well. In the examples in ¸ 3.7 we use the Poisson solver preconditioner for
nonsymmetric partial diﬀerential equations. Multigrid [99] and alternating
direction [8], [182] methods have similar performance and may be more
generally applicable. Incomplete factorization (LU in this case) preconditioners
can be used [165] as can polynomial preconditioners. Some hybrid algorithms
use the GMRES/Arnoldi process itself to construct polynomial preconditioners
for GMRES or for Richardson iteration [135], [72], [164], [183]. Again we
mention [8] and [12] as a good general references for preconditioning.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
GMRES ITERATION 37
3.4. GMRES implementation: Basic ideas
Recall that the least squares problem deﬁning the kth GMRES iterate is
minimize
x∈x
0
+K
k
b −Ax
2
.
Suppose one had an orthogonal projector V
k
onto /
k
. Then any z ∈ /
k
can
be written as
z =
k
l=1
y
l
v
k
l
,
where v
k
l
is the lth column of V
k
. Hence we can convert (3.1) to a least squares
problem in R
k
for the coeﬃcient vector y of z = x −x
0
. Since
x −x
0
= V
k
y
for some y ∈ R
k
, we must have x
k
= x
0
+V
k
y where y minimizes
b −A(x
0
+V
k
y)
2
= r
0
−AV
k
y
2
.
Hence, our least squares problem in R
k
is
minimize
y∈R
kr
0
−AV
k
y
2
. (3.7)
This is a standard linear least squares problem that could be solved by a QR
factorization, say. The problem with such a direct approach is that the matrix
vector product of A with the basis matrix V
k
must be taken at each iteration.
If one uses Gram–Schmidt orthogonalization, however, one can represent
(3.7) very eﬃciently and the resulting least squares problem requires no extra
multiplications of A with vectors. The Gram–Schmidt procedure for formation
of an orthonormal basis for /
k
is called the Arnoldi [4] process. The data are
vectors x
0
and b, a map that computes the action of A on a vector, and a
dimension k. The algorithm computes an orthonormal basis for /
k
and stores
it in the columns of V .
Algorithm 3.4.1. arnoldi(x
0
, b, A, k, V )
1. Deﬁne r
0
= b −Ax
0
and v
1
= r
0
/r
0

2
.
2. For i = 1, . . . , k −1
v
i+1
=
Av
i
−
i
j=1
((Av
i
)
T
v
j
)v
j
Av
i
−
i
j=1
((Av
i
)
T
v
j
)v
j

2
If there is never a division by zero in step 2 of Algorithm arnoldi, then
the columns of the matrix V
k
are an orthonormal basis for /
k
. A division by
zero is referred to as breakdown and happens only if the solution to Ax = b is
in x
0
+/
k−1
.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
38 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
Lemma 3.4.1. Let A be nonsingular, let the vectors v
j
be generated by
Algorithm arnoldi, and let i be the smallest integer for which
Av
i
−
i
j=1
((Av
i
)
T
v
j
)v
j
= 0.
Then x = A
−1
b ∈ x
0
+/
i
.
Proof. By hypothesis Av
i
∈ /
i
and hence A/
i
⊂ /
i
. Since the columns of
V
i
are an orthonormal basis for /
i
, we have
AV
i
= V
i
H,
where H is an i i matrix. H is nonsingular since A is. Setting β = r
0

2
and e
1
= (1, 0, . . . , 0)
T
∈ R
i
we have
r
i

2
= b −Ax
i

2
= r
0
−A(x
i
−x
0
)
2
.
Since x
i
− x
0
∈ /
i
there is y ∈ R
i
such that x
i
− x
0
= V
i
y. Since r
0
= βV
i
e
1
and V
i
is an orthogonal matrix
r
i

2
= V
i
(βe
1
−Hy)
2
= βe
1
−Hy
R
i+1,
where  
R
k+1 denotes the Euclidean norm in R
k+1
.
Setting y = βH
−1
e
1
proves that r
i
= 0 by the minimization property.
The upper Hessenberg structure can be exploited to make the solution of
the least squares problems very eﬃcient [167].
If the Arnoldi process does not break down, we can use it to implement
GMRES in an eﬃcient way. Set h
ji
= (Av
j
)
T
v
i
. By the Gram–Schmidt
construction, the k+1k matrix H
k
whose entries are h
ij
is upper Hessenberg,
i.e., h
ij
= 0 if i > j +1. The Arnoldi process (unless it terminates prematurely
with a solution) produces matrices ¦V
k
¦ with orthonormal columns such that
AV
k
= V
k+1
H
k
. (3.8)
Hence, for some y
k
∈ R
k
,
r
k
= b −Ax
k
= r
0
−A(x
k
−x
0
) = V
k+1
(βe
1
−H
k
y
k
).
Hence x
k
= x
0
+V
k
y
k
, where y
k
minimizes βe
1
−H
k
y
2
over R
k
. Note that
when y
k
has been computed, the norm of r
k
can be found without explicitly
forming x
k
and computing r
k
= b −Ax
k
. We have, using the orthogonality of
V
k+1
,
r
k

2
= V
k+1
(βe
1
−H
k
y
k
)
2
= βe
1
−H
k
y
k

R
k+1. (3.9)
The goal of the iteration is to ﬁnd, for a given , a vector x so that
b −Ax
2
≤ b
2
.
The input is the initial iterate, x, the righthand side b, and a map that
computes the action of A on a vector. We limit the number of iterations
to kmax and return the solution, which overwrites the initial iterate x and the
residual norm.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
GMRES ITERATION 39
Algorithm 3.4.2. gmresa(x, b, A, , kmax, ρ)
1. r = b −Ax, v
1
= r/r
2
, ρ = r
2
, β = ρ, k = 0
2. While ρ > b
2
and k < kmax do
(a) k = k + 1
(b) for j = 1, . . . , k
h
jk
= (Av
k
)
T
v
j
(c) v
k+1
= Av
k
−
k
j=1
h
jk
v
j
(d) h
k+1,k
= v
k+1

2
(e) v
k+1
= v
k+1
/v
k+1

2
(f) e
1
= (1, 0, . . . , 0)
T
∈ R
k+1
Minimize βe
1
−H
k
y
k

R
k+1 over R
k
to obtain y
k
.
(g) ρ = βe
1
−H
k
y
k

R
k+1.
3. x
k
= x
0
+V
k
y
k
.
Note that x
k
is only computed upon termination and is not needed within
the iteration. It is an important property of GMRES that the basis for the
Krylov space must be stored as the iteration progress. This means that in
order to perform k GMRES iterations one must store k vectors of length N.
For very large problems this becomes prohibitive and the iteration is restarted
when the available room for basis vectors is exhausted. One way to implement
this is to set kmax to the maximum number m of vectors that one can store,
call GMRES and explicitly test the residual b−Ax
k
if k = m upon termination.
If the norm of the residual is larger than , call GMRES again with x
0
= x
k
,
the result from the previous call. This restarted version of the algorithm is
called GMRES(m) in [167]. There is no general convergence theorem for the
restarted algorithm and restarting will slow the convergence down. However,
when it works it can signiﬁcantly reduce the storage costs of the iteration. We
discuss implementation of GMRES(m) later in this section.
Algorithm gmresa can be implemented very straightforwardly in MAT
LAB. Step 2f can be done with a single MATLAB command, the backward
division operator, at a cost of O(k
3
) ﬂoatingpoint operations. There are more
eﬃcient ways to solve the least squares problem in step 2f, [167], [197], and we
use the method of [167] in the collection of MATLAB codes. The savings are
slight if k is small relative to N, which is often the case for large problems, and
the simple oneline MATLAB approach can be eﬃcient for such problems.
A more serious problem with the implementation proposed in Algo
rithm gmresa is that the vectors v
j
may become nonorthogonal as a result of
cancellation errors. If this happens, (3.9), which depends on this orthogonality,
will not hold and the residual and approximate solution could be inaccurate. A
partial remedy is to replace the classical Gram–Schmidt orthogonalization in
Algorithm gmresa with modiﬁed Gram–Schmidt orthogonalization. We replace
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
40 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
the loop in step 2c of Algorithm gmresa with
v
k+1
= Av
k
for j = 1, . . . k
v
k+1
= v
k+1
−(v
T
k+1
v
j
)v
j
.
While modiﬁed Gram–Schmidt and classical Gram–Schmidt are equivalent in
inﬁnite precision, the modiﬁed form is much more likely in practice to maintain
orthogonality of the basis.
We illustrate this point with a simple example from [128], doing the
computations in MATLAB. Let δ = 10
−7
and deﬁne
A =
_
_
_
1 1 1
δ δ 0
δ 0 δ
_
_
_.
We orthogonalize the columns of A with classical Gram–Schmidt to obtain
V =
_
_
_
1.0000e + 00 1.0436e −07 9.9715e −08
1.0000e −07 1.0456e −14 −9.9905e −01
1.0000e −07 −1.0000e + 00 4.3568e −02
_
_
_.
The columns of V
U
are not orthogonal at all. In fact v
T
2
v
3
≈ −.004. For
modiﬁed Gram–Schmidt
V =
_
_
_
1.0000e + 00 1.0436e −07 1.0436e −07
1.0000e −07 1.0456e −14 −1.0000e + 00
1.0000e −07 −1.0000e + 00 4.3565e −16
_
_
_.
Here [v
T
i
v
j
−δ
ij
[ ≤ 10
−8
for all i, j.
The versions we implement in the collection of MATLAB codes use modi
ﬁed Gram–Schmidt. The outline of our implementation is Algorithm gmresb.
This implementation solves the upper Hessenberg least squares problem using
the MATLAB backward division operator, and is not particularly eﬃcient. We
present a better implementation in Algorithm gmres. However, this version is
very simple and illustrates some important ideas. First, we see that x
k
need
only be computed after termination as the least squares residual ρ can be used
to approximate the norm of the residual (they are identical in exact arithmetic).
Second, there is an opportunity to compensate for a loss of orthogonality in
the basis vectors for the Krylov space. One can take a second pass through the
modiﬁed Gram–Schmidt process and restore lost orthogonality [147], [160].
Algorithm 3.4.3. gmresb(x, b, A, , kmax, ρ)
1. r = b −Ax, v
1
= r/r
2
, ρ = r
2
, β = ρ, k = 0
2. While ρ > b
2
and k < kmax do
(a) k = k + 1
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
GMRES ITERATION 41
(b) v
k+1
= Av
k
for j = 1, . . . k
i. h
jk
= v
T
k+1
v
j
ii. v
k+1
= v
k+1
−h
jk
v
j
(c) h
k+1,k
= v
k+1

2
(d) v
k+1
= v
k+1
/v
k+1

2
(e) e
1
= (1, 0, . . . , 0)
T
∈ R
k+1
Minimize βe
1
−H
k
y
k

R
k+1 to obtain y
k
∈ R
k
.
(f) ρ = βe
1
−H
k
y
k

R
k+1.
3. x
k
= x
0
+V
k
y
k
.
Even if modiﬁed Gram–Schmidt orthogonalization is used, one can still
lose orthogonality in the columns of V . One can test for loss of orthogonality
[22], [147], and reorthogonalize if needed or use a more stable means to create
the matrix V [195]. These more complex implementations are necessary if A is
ill conditioned or many iterations will be taken. For example, one can augment
the modiﬁed Gram–Schmidt process
• v
k+1
= Av
k
for j = 1, . . . k
h
jk
= v
T
k+1
v
j
v
k+1
= v
k+1
−h
jk
v
j
• h
k+1,k
= v
k+1

2
• v
k+1
= v
k+1
/v
k+1

2
with a second pass (reorthogonalization). One can reorthogonalize in every
iteration or only if a test [147] detects a loss of orthogonality. There is nothing
to be gained by reorthogonalizing more than once [147].
The modiﬁed Gram–Schmidt process with reorthogonalization looks like
• v
k+1
= Av
k
for j = 1, . . . , k
h
jk
= v
T
k+1
v
j
v
k+1
= v
k+1
−h
jk
v
j
• h
k+1,k
= v
k+1

2
• If loss of orthogonality is detected
For j = 1, . . . , k
h
tmp
= v
T
k+1
v
j
h
jk
= h
jk
+h
tmp
v
k+1
= v
k+1
−h
tmp
v
j
• h
k+1,k
= v
k+1

2
• v
k+1
= v
k+1
/v
k+1

2
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
42 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
One approach to reorthogonalization is to reorthogonalize in every step.
This doubles the cost of the computation of V and is usually unnecessary.
More eﬃcient and equally eﬀective approaches are based on other ideas. A
variation on a method from [147] is used in [22]. Reorthogonalization is done
after the Gram–Schmidt loop and before v
k+1
is normalized if
Av
k

2
+δv
k+1

2
= Av
k

2
(3.10)
to working precision. The idea is that if the new vector is very small relative
to Av
k
then information may have been lost and a second pass through
the modiﬁed Gram–Schmidt process is needed. We employ this test in the
MATLAB code gmres with δ = 10
−3
.
To illustrate the eﬀects of loss of orthogonality and those of reorthogonal
ization we apply GMRES to the diagonal system Ax = b where b = (1, 1, 1)
T
,
x
0
= (0, 0, 0)
T
, and
A =
_
_
_
.001 0 0
0 .0011 0
0 0 10
4
_
_
_. (3.11)
While in inﬁnite precision arithmetic only three iterations are needed to solve
the system exactly, we ﬁnd in the MATLAB environment that a solution to full
precision requires more than three iterations unless reorthogonalization is ap
plied after every iteration. In Table 3.1 we tabulate relative residuals as a func
tion of the iteration counter for classical Gram–Schmidt without reorthogonal
ization (CGM), modiﬁed Gram–Schmidt without reorthogonalization (MGM),
reorthogonalization based on the test (3.10) (MGMPO), and reorthogonaliza
tion in every iteration (MGMFO). While classical Gram–Schmidt fails, the
reorthogonalization strategy based on (3.10) is almost as eﬀective as the much
more expensive approach of reorthogonalizing in every step. The method based
on (3.10) is the default in the MATLAB code gmresa.
The kth GMRES iteration requires a matrixvector product, k scalar
products, and the solution of the Hessenberg least squares problem in step 2e.
The k scalar products require O(kN) ﬂoatingpoint operations and the cost
of a solution of the Hessenberg least squares problem, by QR factorization or
the MATLAB backward division operator, say, in step 2e of gmresb is O(k
3
)
ﬂoatingpoint operations. Hence the total cost of the m GMRES iterations is
m matrixvector products and O(m
4
+m
2
N) ﬂoatingpoint operations. When
k is not too large and the cost of matrixvector products is high, a brute
force solution to the least squares problem using the MATLAB backward
division operator is not terribly ineﬃcient. We provide an implementation
of Algorithm gmresb in the collection of MATLAB codes. This is an appealing
algorithm, especially when implemented in an environment like MATLAB,
because of its simplicity. For large k, however, the bruteforce method can be
very costly.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
GMRES ITERATION 43
Table 3.1
Eﬀects of reorthogonalization.
k CGM MGM MGMPO MGMFO
0 1.00e+00 1.00e+00 1.00e+00 1.00e+00
1 8.16e01 8.16e01 8.16e01 8.16e01
2 3.88e02 3.88e02 3.88e02 3.88e02
3 6.69e05 6.42e08 6.42e08 6.34e34
4 4.74e05 3.70e08 5.04e24
5 3.87e05 3.04e18
6 3.35e05
7 3.00e05
8 2.74e05
9 2.53e05
10 2.37e05
3.5. Implementation: Givens rotations
If k is large, implementations using Givens rotations [167], [22], Householder
reﬂections [195], or a shifted Arnoldi process [197] are much more eﬃcient
than the bruteforce approach in Algorithm gmresb. The implementation in
Algorithm gmres and in the MATLAB code collection is from [167]. This
implementation maintains the QR factorization of H
k
in a clever way so that
the cost for a single GMRES iteration is O(Nk) ﬂoatingpoint operations. The
O(k
2
) cost of the triangular solve and the O(kN) cost of the construction of
x
k
are incurred after termination.
A 2 2 Givens rotation is a matrix of the form
G =
_
c −s
s c
_
, (3.12)
where c = cos(θ), s = sin(θ) for θ ∈ [−π, π]. The orthogonal matrix G rotates
the vector (c, −s), which makes an angle of −θ with the xaxis through an
angle θ so that it overlaps the xaxis.
G
_
c
−s
_
=
_
1
0
_
.
An N N Givens rotation replaces a 2 2 block on the diagonal of the
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
44 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
N N identity matrix with a 2 2 Givens rotation.
G =
_
_
_
_
_
_
_
_
_
_
_
_
_
_
_
_
1 0 . . . 0
0
.
.
.
.
.
.
.
.
. c −s
.
.
. s c 0
.
.
.
0 1
.
.
.
.
.
.
.
.
. 0
0 . . . 0 1
_
_
_
_
_
_
_
_
_
_
_
_
_
_
_
_
. (3.13)
Our notation is that G
j
(c, s) is an N N givens rotation of the form (3.13)
with a 2 2 Givens rotation in rows and columns j and j + 1.
Givens rotations are used to annihilate single nonzero elements of matrices
in reduction to triangular form [89]. They are of particular value in reducing
Hessenberg matrices to triangular form and thereby solving Hessenberg least
squares problems such as the ones that arise in GMRES. This reduction can be
accomplished in O(N) ﬂoatingpoint operations and hence is far more eﬃcient
than a solution by a singular value decomposition or a reduction based on
Householder transformations. This method is also used in the QR algorithm
for computing eigenvalues [89], [184].
Let H be an N M (N ≥ M) upper Hessenberg matrix with rank M.
We reduce H to triangular form by ﬁrst multiplying the matrix by a Givens
rotation that annihilates h
21
(and, of course, changes h
11
and the subsequent
columns). We deﬁne G
1
= G
1
(c
1
, s
1
) by
c
1
= h
11
/
_
h
2
11
+h
2
21
and s
1
= −h
21
/
_
h
2
11
+h
2
21
. (3.14)
If we replace H by G
1
H, then the ﬁrst column of H now has only a single
nonzero element h
11
. Similarly, we can now apply G
2
(c
2
, s
2
) to H, where
c
2
= h
22
/
_
h
2
22
+h
2
32
and s
2
= −h
32
/
_
h
2
22
+h
2
32
. (3.15)
and annihilate h
32
. Note that G
2
does not aﬀect the ﬁrst column of H.
Continuing in this way and setting
Q = G
N
. . . G
1
we see that QH = R is upper triangular.
A straightforward application of these ideas to Algorithm gmres would
solve the least squares problem by computing the product of k Givens rotations
Q, setting g = βQe
1
, and noting that
βe
1
−H
k
y
k

R
k+1 = Q(βe
1
−H
k
y
k
)
R
k+1 = g −R
k
y
k

R
k+1,
where R
k
is the k + 1 k triangular factor of the QR factorization of H
k
.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
GMRES ITERATION 45
In the context of GMRES iteration, however, we can incrementally perform
the QR factorization of H as the GMRES iteration progresses [167]. To see
this, note that if R
k
= Q
k
H
k
and, after orthogonalization, we add the new
column h
k+2
to H
k
, we can update both Q
k
and R
k
by ﬁrst multiplying h
k+2
by Q
k
(that is applying the ﬁrst k Givens rotations to h
k+2
), then computing
the Givens rotation G
k+1
that annihilates the (k + 2)nd element of Q
k
h
k+2
,
and ﬁnally setting Q
k+1
= G
k+1
Q
k
and forming R
k+1
by augmenting R
k
with
G
k+1
Q
k
h
k+2
.
The MATLAB implementation of Algorithm gmres stores Q
k
by storing
the sequences ¦c
j
¦ and ¦s
j
¦ and then computing the action of Q
k
on a vector
x ∈ R
k+1
by applying G
j
(c
j
, s
j
) in turn to obtain
Q
k
x = G
k
(c
k
, s
k
) . . . G
1
(c
1
, s
1
)x.
We overwrite the upper triangular part of H
k
with the triangular part of the
QR factorization of H
k
in the MATLAB code. The MATLAB implementation
of Algorithm gmres uses (3.10) to test for loss of orthogonality.
Algorithm 3.5.1. gmres(x, b, A, , kmax, ρ)
1. r = b −Ax, v
1
= r/r
2
, ρ = r
2
, β = ρ,
k = 0; g = ρ(1, 0, . . . , 0)
T
∈ R
kmax+1
2. While ρ > b
2
and k < kmax do
(a) k = k + 1
(b) v
k+1
= Av
k
for j = 1, . . . k
i. h
jk
= v
T
k+1
v
j
ii. v
k+1
= v
k+1
−h
jk
v
j
(c) h
k+1,k
= v
k+1

2
(d) Test for loss of orthogonality and reorthogonalize if necessary.
(e) v
k+1
= v
k+1
/v
k+1

2
(f) i. If k > 1 apply Q
k−1
to the kth column of H.
ii. ν =
_
h
2
k,k
+h
2
k+1,k
.
iii. c
k
= h
k,k
/ν, s
k
= −h
k+1,k
/ν
h
k,k
= c
k
h
k,k
−s
k
h
k+1,k
, h
k+1,k
= 0
iv. g = G
k
(c
k
, s
k
)g.
(g) ρ = [(g)
k+1
[.
3. Set r
i,j
= h
i,j
for 1 ≤ i, j ≤ k.
Set (w)
i
= (g)
i
for 1 ≤ i ≤ k.
Solve the upper triangular system Ry
k
= w.
4. x
k
= x
0
+V
k
y
k
.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
46 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
We close with an example of an implementation of GMRES(m) . This
implementation does not test for success and may, therefore, fail to terminate.
You are asked to repair this in exercise 7. Aside from the parameter m, the
arguments to Algorithm gmresm are the same as those for Algorithm gmres.
Algorithm 3.5.2. gmresm(x, b, A, , kmax, m, ρ)
1. gmres(x, b, A, , m, ρ)
2. k = m
3. While ρ > b
2
and k < kmax do
(a) gmres(x, b, A, , m, ρ)
(b) k = k +m
The storage costs of m iterations of gmres or of gmresm are the m + 2
vectors b, x, and ¦v
k
¦
m
k=1
.
3.6. Other methods for nonsymmetric systems
The methods for nonsymmetric linear systems that receive most attention in
this book, GMRES, CGNR, and CGNE, share the properties that they are easy
to implement, can be analyzed by a common residual polynomial approach, and
only terminate if an acceptable approximate solution has been found. CGNR
and CGNE have the disadvantages that a transposevector product must be
computed for each iteration and that the coeﬃcient matrix A
T
A or AA
T
has
condition number the square of that of the original matrix. In ¸ 3.8 we give
an example of how this squaring of the condition number can lead to failure
of the iteration. GMRES needs only matrixvector products and uses A alone,
but, as we have seen, a basis for /
k
must be stored to compute x
k
. Hence,
the storage requirements increase as the iteration progresses. For large and
illconditioned problems, it may be impossible to store enough basis vectors
and the iteration may have to be restarted. Restarting can seriously degrade
performance.
An ideal method would, like CG, only need matrixvector products, be
based on some kind of minimization principle or conjugacy, have modest
storage requirements that do not depend on the number of iterations needed
for convergence, and converge in N iterations for all nonsingular A. However,
[74], methods based on shortterm recurrences such as CG that also satisfy
minimization or conjugacy conditions cannot be constructed for general
matrices. The methods we describe in this section fall short of the ideal, but
can still be quite useful. We discuss only a small subset of these methods and
refer the reader to [12] and [78] for pointers to more of the literature on this
subject. All the methods we present in this section require two matrixvector
products for each iteration.
Consistently with our implementation of GMRES, we take the view that
preconditioners will be applied externally to the iteration. However, as with
CG, these methods can also be implemented in a manner that incorporates
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
GMRES ITERATION 47
the preconditioner and uses the residual for the original system to control
termination.
3.6.1. BiCG. The earliest such method BiCG (Biconjugate gradient)
[122], [76], does not enforce a minimization principle; instead, the kth residual
must satisfy the biorthogonality condition
r
T
k
w = 0 for all w ∈ /
k
, (3.16)
where
/
k
= span(ˆ r
0
, A
T
ˆ r
0
, . . . , (A
T
)
k−1
ˆ r
0
)
is the Krylov space for A
T
and the vector ˆ r
0
. ˆ r
0
is a usersupplied vector and is
often set to r
0
. The algorithm gets its name because it produces sequences of
residuals ¦r
k
¦, ¦ˆ r
k
¦ and search directions ¦p
k
¦, ¦ˆ p
k
¦ such that biorthogonality
holds, i. e. ˆ r
T
k
r
l
= 0 if k ,= l and the search directions ¦p
k
¦ and ¦ˆ p
k
¦ satisfy
the biconjugacy property
ˆ p
T
k
Ap
l
= 0 if k ,= l.
In the symmetric positive deﬁnite case (3.16) is equivalent to the minimization
principle (2.2) for CG [89].
Using the notation of Chapter 2 and [191] we give an implementation of Bi
CG making the choice ˆ r
0
= r
0
. This algorithmic description is explained and
motivated in more detail in [122], [76], [78], and [191]. We do not recommend
use of BiCG and present this algorithm only as a basis for discussion of some
of the properties of this class of algorithms.
Algorithm 3.6.1. bicg(x, b, A, , kmax)
1. r = b −Ax, ˆ r = r, ρ
0
= 1, ˆ p = p = 0, k = 0
2. Do While r
2
> b
2
and k < kmax
(a) k = k + 1
(b) ρ
k
= ˆ r
T
r, β = ρ
k
/ρ
k−1
(c) p = r +βp, ˆ p = ˆ r +βˆ p
(d) v = Ap
(e) α = ρ
k
/(ˆ p
T
v)
(f) x = x +αp
(g) r = r −αv; ˆ r = ˆ r −αA
T
ˆ p
Note that if A = A
T
is spd, BiCG produces the same iterations as CG
(but computes everything except x twice). If, however, A is not spd, there is
no guarantee that ρ
k
in step 2b or ˆ p
T
Ap, the denominator in step 2e, will not
vanish. If either ρ
k−1
= 0 or ˆ p
T
Ap = 0 we say that a breakdown has taken
place. Even if these quantities are nonzero but very small, the algorithm can
become unstable or produce inaccurate results. While there are approaches
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
48 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
[80], [78], [148], [79], [81], to avoid some breakdowns in many variations of
BiCG, there are no methods that both limit storage and completely eliminate
the possibility of breakdown [74]. All of the methods we present in this section
can break down and should be implemented with that possibility in mind.
Once breakdown has occurred, one can restart the BiCG iteration or pass
the computation to another algorithm, such as GMRES. The possibility of
breakdown is small and certainly should not eliminate the algorithms discussed
below from consideration if there is not enough storage for GMRES to perform
well.
Breakdowns aside, there are other problems with BiCG. A transpose
vector product is needed, which at the least will require additional program
ming and may not be possible at all. The performance of the algorithm can
be erratic or even unstable with residuals increasing by several orders of mag
nitude from one iteration to the next. Moreover, the eﬀort in computing ˆ r
at each iteration is wasted in that ˆ r makes no contribution to x. However,
BiCG sometimes performs extremely well and the remaining algorithms in
this section represent attempts to capture this good performance and damp
the erratic behavior when it occurs.
We can compare the bestcase performance of BiCG with that of GMRES.
Note that there is ¯ p
k
∈ T
k
such that both
r
k
= ¯ p
k
(A)r
0
and ˆ r
k
= ¯ p
k
(A
T
)ˆ r
0
. (3.17)
Hence, by the minimization property for GMRES
r
GMRES
k

2
≤ r
Bi−CG
k

2
,
reminding us that GMRES, if suﬃcient storage is available, will always
reduce the residual more rapidly than BiCG (in terms of iterations, but not
necessarily in terms of computational work). One should also keep in mind that
a single BiCG iteration requires two matrixvector products and a GMRES
iterate only one, but that the cost of the GMRES iteration increases (in terms
of ﬂoatingpoint operations) as the iteration progresses.
3.6.2. CGS. A remedy for one of the problems with BiCG is the Conjugate
Gradient Squared (CGS) algorithm [180]. The algorithm takes advantage of
the fact that (3.17) implies that the scalar product ˆ r
T
r in BiCG (step 2b of
Algorithm bicg) can be represented without using A
T
as
r
T
k
ˆ r
k
= (¯ p
k
(A)r
0
)
T
(¯ p
k
(A
T
)ˆ r
0
) = (¯ p
k
(A)
2
r
0
)
T
ˆ r
0
.
The other references to A
T
can be eliminated in a similar fashion and an
iteration satisfying
r
k
= ¯ p
k
(A)
2
r
0
(3.18)
is produced, where ¯ p
k
is the same polynomial produced by BiCG and used in
(3.17). This explains the name, Conjugate Gradient Squared.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
GMRES ITERATION 49
The work used in BiCG to compute ˆ r is now used to update x. CGS
replaces the transposevector product with an additional matrixvector product
and applies the square of the BiCG polynomial to r
0
to produce r
k
. This may,
of course, change the convergence properties for the worse and either improve
good convergence or magnify erratic behavior [134], [191]. CGS has the same
potential for breakdown as BiCG. Since ¯ p
2
k
∈ T
2k
we have
r
GMRES
2k

2
≤ r
CGS
k

2
. (3.19)
We present an algorithmic description of CGS that we will refer to in our
discussion of TFQMR in ¸ 3.6.4. In our description of CGS we will index the
vectors and scalars that would be overwritten in an implementation so that we
can describe some of the ideas in TFQMR later on. This description is taken
from [77].
Algorithm 3.6.2. cgs(x, b, A, , kmax)
1. x
0
= x; p
0
= u
0
= r
0
= b −Ax
2. v
0
= Ap
0
; ˆ r
0
= r
0
3. ρ
0
= ˆ r
T
0
r
0
; k = 0
4. Do While r
2
> b
2
and k < kmax
(a) k = k + 1
(b) σ
k−1
= ˆ r
T
0
v
k−1
(c) α
k−1
= ρ
k−1
/σ
k−1
(d) q
k
= u
k−1
−α
k−1
v
k−1
(e) x
k
= x
k−1
+α
k−1
(u
k−1
+q
k
)
r
k
= r
k−1
−α
k−1
A(u
k−1
+q
k
)
(f) ρ
k
= ˆ r
T
0
r
k
; β
k
= ρ
k
/ρ
k−1
u
k
= r
k
+β
k
q
k
p
k
= u
k
+β
k
(q
k
+β
k
p
k−1
)
v
k
= Ap
k
Breakdowns take place when either ρ
k−1
or σ
k−1
vanish. If the algorithm
does not break down, then α
k−1
,= 0 for all k. One can show [180], [77], that
if ¯ p
k
is the residual polynomial from (3.17) and (3.18) then
¯ p
k
(z) = ¯ p
k−1
(z) −α
k−1
z¯ q
k−1
(z) (3.20)
where the auxiliary polynomials ¯ q
k
∈ T
k
are given by ¯ q
0
= 1 and for k ≥ 1 by
¯ q
k
(z) = ¯ p
k
(z) +β
k
¯ q
k−1
(z). (3.21)
We provide no MATLAB implementation of BiCG or CGS. BiCGSTAB
is the most eﬀective representative of this class of algorithms.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
50 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
3.6.3. BiCGSTAB. The Biconjugate gradient stabilized method (Bi
CGSTAB) [191] attempts to smooth the convergence of CGS by replacing
(3.18) with
r
k
= q
k
(A)p
k
(A)r
0
(3.22)
where
q
k
(z) =
k
i=1
(1 −ω
i
z).
The constant ω
i
is selected to minimize r
i
= q
i
(A)p
i
(A)r
0
as a function of ω
i
.
The examples in [191] indicate the eﬀectiveness of this approach, which can
be thought of [12] as a blend of BiCG and GMRES(1). The performance in
cases where the spectrum has a signiﬁcant imaginary part can be improved by
constructing q
k
to have complex conjugate pairs of roots, [98].
There is no convergence theory for BiCGSTAB and no estimate of the
residuals that is better than that for CGS, (3.19). In our description of the
implementation, which is described and motivated fully in [191], we use ˆ r
0
= r
0
and follow the notation of [191].
Algorithm 3.6.3. bicgstab(x, b, A, , kmax)
1. r = b −Ax, ˆ r
0
= ˆ r = r, ρ
0
= α = ω = 1, v = p = 0, k = 0, ρ
1
= ˆ r
T
0
r
2. Do While r
2
> b
2
and k < kmax
(a) k = k + 1
(b) β = (ρ
k
/ρ
k−1
)(α/ω)
(c) p = r +β(p −ωv)
(d) v = Ap
(e) α = ρ
k
/(ˆ r
T
0
v)
(f) s = r −αv, t = As
(g) ω = t
T
s/t
2
2
, ρ
k+1
= −ωˆ r
T
0
t
(h) x = x +αp +ωs
(i) r = s −ωt
Note that the iteration can break down in steps 2b and 2e. We provide an
implementation of Algorithm bicgstab in the collection of MATLAB codes.
The cost in storage and in ﬂoatingpoint operations per iteration re
mains bounded for the entire iteration. One must store seven vectors
(x, b, r, p, v, ˆ r
0
, t), letting s overwrite r when needed. A single iteration re
quires four scalar products. In a situation where many GMRES iterations are
needed and matrixvector product is fast, BiCGSTAB can have a much lower
average cost per iterate than GMRES. The reason for this is that the cost of
orthogonalization in GMRES can be much more than that of a matrixvector
product if the dimension of the Krylov space is large. We will present such a
case in ¸ 3.8.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
GMRES ITERATION 51
3.6.4. TFQMR. Now we consider the quasiminimal residual (QMR)
family of algorithms [80], [81], [77], [37], [202]. We will focus on the transpose
free algorithm TFQMR proposed in [77], to illustrate the quasiminimization
idea. All algorithms in this family minimize the norm of an easily computable
quantity q = f − Hz, a quasiresidual over z ∈ R
N
. The quasiresidual is
related to the true residual by a fullrank linear transformation r = Lq and
reduction of the residual can be approximately measured by reduction in q.
The speciﬁc formulation of q and L depends on the algorithm.
Returning to Algorithm cgs we deﬁne sequences
y
m
=
_
¸
_
¸
_
u
k−1
if m = 2k −1 is odd,
q
k
if m = 2k is even,
and
w
m
=
_
¸
_
¸
_
(¯ p
k
(A))
2
r
0
if m = 2k −1 is odd,
¯ p
k
(A)¯ p
k−1
(A)r
0
if m = 2k is even.
Here the sequences ¦¯ p
k
¦ and ¦¯ q
k
¦ are given by (3.20) and (3.21). Hence, the
CGS residual satisﬁes
r
CGS
k
= w
2k+1
.
We assume that CGS does not break down and, therefore, α
k
,= 0 for all
k ≥ 0. If we let ¸r be the nearest integer less than or equal to a real r we
have
Ay
m
= (w
m
−w
m+1
)/α
(m−1)/2
, (3.23)
where the denominator is nonzero by assumption. We express (3.23) in matrix
form as
AY
m
= W
m+1
B
m
, (3.24)
where Y
m
is the N m matrix with columns ¦y
j
¦
m
j=1
, W
m+1
the N (m+1)
matrix with columns ¦w
j
¦
m+1
j=1
, and B
m
is the (m+ 1) m matrix given by
B
m
=
_
_
_
_
_
_
_
_
_
_
1 0 . . . 0
−1 1
.
.
.
.
.
.
0
.
.
.
.
.
. 0
.
.
.
.
.
. −1 1
0 . . . 0 −1
_
_
_
_
_
_
_
_
_
_
diag(α
0
, α
0
, . . . , α
(m−1)/2
)
−1
.
Our assumptions that no breakdown takes place imply that /
m
is the span
of ¦y
j
¦
m
j=1
and hence
x
m
= x
0
+Y
m
z (3.25)
for some z ∈ R
m
. Therefore,
r
m
= r
0
−AY
m
z = W
m+1
(e
1
−B
m
z), (3.26)
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
52 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
where, as in our discussion of GMRES, e
1
= (1, 0, . . . , 0)
T
∈ R
m
. In a sense,
the conditioning of the matrix W
m+1
can be improved by scaling. Let
Ω = diag(ω
1
, ω
2
, . . . , ω
m
) (3.27)
and rewrite (3.26) as
r
m
= r
0
−AY
m
z = W
m+1
Ω
−1
m+1
(f
m+1
−H
m
z) (3.28)
where f
m+1
= Ω
m+1
e
1
= ω
m+1
e
1
and H
m
= Ω
m+1
B
m
is bidiagonal and hence
upper Hessenberg. The diagonal entries in the matrix Ω
m+1
are called weights.
The weights make it possible to derive straightforward estimates of the residual
in terms of easily computed terms in the iteration.
If W
m+1
Ω
−1
m+1
were an orthogonal matrix, we could minimize r
m
by solving
the least squares problem
minimize
z∈R
mf
m+1
−H
m
z
2
. (3.29)
The quasiminimization idea is to solve (3.29) (thereby quasiminimizing the
residual) to obtain z
m
and then set
x
m
= x
0
+Y
m
z
m
. (3.30)
Note that if k corresponds to the iteration counter for CGS, each k produces
two approximate solutions (corresponding to m = 2k −1 and m = 2k).
The solution to the least squares problem (3.29) can be done by using
Givens rotations to update the factorization of the upper Hessenberg matrix
H
m
[77] and this is reﬂected in the implementation description below. As
one can see from that description, approximate residuals are not computed in
Algorithm tfqmr. Instead we have the quasiresidual norm
τ
m
= f
m+1
−H
m
z
m

2
.
Now if we pick the weights ω
i
= w
i

2
then the columns of W
m+1
Ω
−1
m+1
have
norm one. Hence
r
m

2
= W
m+1
Ω
−1
m+1
(f
m+1
−H
m
z)
2
≤ τ
m
√
m+ 1. (3.31)
We can, therefore, base our termination criterion on τ
m
, and we do this in
Algorithm tfqmr.
One can [80], [78], also use the quasiminimization condition to estimate
the TFQMR residual in terms of residual polynomials.
Theorem 3.6.1. Let A be an N N matrix and let x
0
, b ∈ R
N
be given.
Assume that the TFQMR iteration does not break down or terminate with
the exact solution for k iterations and let 1 ≤ m ≤ 2k. Let r
GMRES
m
be the
residual for the mth GMRES iteration. Let ξ
m
be the smallest singular value
of W
m+1
Ω
−1
m+1
. Then if ξ
m
> 0
τ
m
≤ r
GMRES
m

2
/ξ
m
. (3.32)
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
GMRES ITERATION 53
Proof. We may write
r
GMRES
m
= r
0
+Y
m
z
GMRES
m
and obtain
r
GMRES
m

2
= W
m+1
Ω
−1
m+1
(f
m+1
−H
m
z
GMRES
m
)
2
≥ ξ
m
f
m+1
−H
m
z
GMRES
m

2
.
Hence, by the quasiminimization property
r
GMRES
m

2
≥ ξ
m
τ
m
as desired.
As a corollary we have a ﬁnite termination result and a convergence
estimate for diagonalizable matrices similar to Theorem 3.1.3.
Corollary 3.6.1. Let A be an N N matrix and let x
0
, b ∈ R
N
be given.
Then within (N +1)/2 iterations the TFQMR iteration will either break down
or terminate with the solution.
We apply Theorem 3.1.3 to (3.32) and use (3.31) to obtain the following
result.
Theorem 3.6.2. Let A = V ΛV
−1
be a nonsingular diagonalizable matrix.
Assume that TFQMR does not break down or terminate with the solution for k
iterations. For 1 ≤ m ≤ 2k let ξ
m
be the smallest singular value of W
m+1
Ω
−1
m+1
and let x
m
be given by (3.30). Then, if ξ
m
> 0,
r
m

2
r
0

2
≤
√
m+ 1ξ
−1
m
κ(V ) max
z∈σ(A)
[φ(z)[
for all φ ∈ T
m
.
The implementation below follows [77] with ˆ r
0
= r
0
used throughout.
Algorithm 3.6.4. tfqmr(x, b, A, , kmax)
1. k = 0, w
1
= y
1
= r
0
= b −Ax, u
1
= v = Ay
1
, d = 0
ρ
0
= r
T
0
r
0
, τ = r
2
, θ = 0, η = 0
2. Do While k < kmax
(a) k = k + 1
(b) σ
k−1
= r
T
0
v, α = ρ
k−1
/σ
k−1
, y
2
= y
1
−α
k−1
v, u
2
= Ay
2
(c) For j = 1, 2 (m = 2k −2 +j)
i. w = w −α
k−1
u
j
ii. d = y
j
+ (θ
2
η/α
k−1
)d
iii. θ = w
2
/τ, c = 1/
√
1 +θ
2
iv. τ = τθc, η = c
2
α
k−1
v. x = x +ηd
vi. If τ
√
m+ 1 ≤ b
2
terminate successfully
(d) ρ
k
= r
T
0
w, β = ρ
k
/ρ
k−1
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
54 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
(e) y
1
= w +βy
2
, u
1
= Ay
1
(f) v = u
1
+β(u
2
+βv)
Note that y
2
and u
2
= Ay
2
need only be computed if the loop in step 2c
does not terminate when j = 1. We take advantage of this in the MATLAB
code tfqmr to potentially reduce the matrixvector product cost by one.
3.7. Examples for GMRES iteration
These examples use the code gmres from the collection of MATLAB codes. The
inputs, described in the comment lines, are the initial iterate x
0
, the righthand
side vector b, a MATLAB function for the matrixvector product, and iteration
parameters to specify the maximum number of iterations and the termination
criterion. We do not pass the preconditioner to the GMRES code, rather
we pass the preconditioned problem. In all the GMRES examples, we limit
the number of GMRES iterations to 60. For methods like GMRES(m), Bi
CGSTAB, CGNR, and TFQMR, whose storage requirements do not increase
with the number of iterations, we allow for more iterations.
We consider the discretization of the partial diﬀerential equation
(Lu)(x, y) = −(u
xx
(x, y) +u
yy
(x, y)) +a
1
(x, y)u
x
(x, y)
+a
2
(x, y)u
y
(x, y) +a
3
(x, y)u(x, y) = f(x, y)
(3.33)
on 0 < x, y < 1 subject to homogeneous Dirichlet boundary conditions
u(x, 0) = u(x, 1) = u(0, y) = u(1, y) = 0, 0 < x, y < 1.
For general coeﬃcients ¦a
j
¦ the operator L is not selfadjoint and its discretiza
tion is not symmetric. As in ¸ 2.7 we discretize with a ﬁvepoint centered dif
ference scheme with n
2
points and mesh width h = 1/(n + 1). The unknowns
are
u
ij
≈ u(x
i
, x
j
)
where x
i
= ih for 1 ≤ i ≤ n. We compute Lu in a matrixfree manner
as we did in ¸ 2.7. We let n = 31 to create a system with 961 unknowns.
As before, we expect secondorder accuracy and terminate the iteration when
r
k

2
≤ h
2
b
2
≈ 9.8 10
−4
b
2
.
As a preconditioner we use the fast Poisson solver fish2d described in ¸ 2.7.
The motivation for this choice is similar to that for the conjugate gradient
computations and has been partially explained theoretically in [33], [34], [121],
and [139]. If we let Gu denote the action of the Poisson solver on u, the
preconditioned system is GLu = Gf.
For the computations reported in this section we took
a
1
(x, y) = 1, a
2
(x, y) = 20y, and a
3
(x, y) = 1.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
GMRES ITERATION 55
u
0
= 0 was the initial iterate. As in ¸ 2.7 we took the right hand side so that
the exact solution was the discretization of
10xy(1 −x)(1 −y) exp(x
4.5
).
In Figs. 3.1 and 3.2 we plot iteration histories corresponding to precondi
tioned or unpreconditioned GMRES. For the restarted iteration, we restarted
the algorithm after three iterations, GMRES(3) in the terminology of ¸ 3.4.
Note that the plots are semilog plots of r
2
/b
2
and that the righthand
sides in the preconditioned and unpreconditioned cases are not the same. In
both ﬁgures the solid line is the history of the preconditioned iteration and the
dashed line that of the unpreconditioned. Note the importance of precondi
tioning. The MATLAB flops command indicates that the unpreconditioned
iteration required 7.6 million ﬂoating point operations and converged in 56
iterations. The preconditioned iteration required 1.3 million ﬂoating point op
erations and terminated after 8 iterations. Restarting after 3 iterations is more
costly in terms of the iteration count than not restarting. However, even in
the unpreconditioned case, the restarted iteration terminated successfully. In
this case the MATLAB flops command indicates that the unpreconditioned
iteration terminated after 223 iterations and 9.7 million ﬂoating point opera
tions and the preconditioned iteration required 13 iterations and 2.6 million
ﬂoating point operations.
The execution times in our environment indicate that the preconditioned
iterations are even faster than the diﬀerence in operation counts indicates.
The reason for this is that the FFT routine is a MATLAB intrinsic and is not
interpreted MATLAB code. In other computing environments preconditioners
which vectorize or parallelize particularly well might perform in the same way.
3.8. Examples for CGNR, BiCGSTAB, and TFQMR iteration
When a transpose is available, CGNR (or CGNE) is an alternative to GMRES
that does not require storage of the iteration history. For elliptic problems like
(3.33), the transpose (adjoint) of L is given by
L
∗
u = −u
xx
−u
yy
−a
1
u
x
−a
2
u
y
+a
3
u
as one can derive by integration by parts. To apply CGNR to the precondi
tioned problem, we require the transpose of GL, which is given by
(GL)
∗
u = L
∗
G
∗
u = L
∗
Gu.
The cost of the application of the transpose of L or GL is the same as that
of the application of L or GL itself. Hence, in the case where the cost of the
matrixvector product dominates the computation, a single CGNR iteration
costs roughly the same as two GMRES iterations.
However, CGNR has the eﬀect of squaring the condition number, this eﬀect
has serious consequences as Fig. 3.3 shows. The dashed line corresponds to
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
56 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
0 10 20 30 40 50 60
10
4
10
3
10
2
10
1
10
0
Iterations
R
e
l
a
t
i
v
e
R
e
s
i
d
u
a
l
Fig. 3.1. GMRES for 2D elliptic equation.
0 50 100 150 200 250
10
4
10
3
10
2
10
1
10
0
Iterations
R
e
l
a
t
i
v
e
R
e
s
i
d
u
a
l
Fig. 3.2. GMRES(3) for 2D elliptic equation.
CGNR on the unpreconditioned problem, i.e., CG applied to L
∗
Lu = L
∗
f.
The matrix corresponding to the discretization of L
∗
L has a large condition
number. The solid line corresponds to the CGNR applied to the preconditioned
problem, i.e., CG applied to L
∗
G
2
Lu = L
∗
G
2
f. We limited the number of
iterations to 310 = 10m and the unpreconditioned formulation of CGNR had
made very little progress after 310 iterations. This is a result of squaring
the large condition number. The preconditioned formulation did quite well,
converging in eight iterations. The MATLAB flops command indicates that
the unpreconditioned iteration required 13.7 million ﬂoatingpoint operations
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
GMRES ITERATION 57
and the preconditioned iteration 2.4 million. However, as you are asked to
investigate in Exercise 3.9.8, the behavior of even the preconditioned iteration
can also be poor if the coeﬃcients of the ﬁrst derivative terms are too large.
CGNE has similar performance; see Exercise 3.9.9.
0 50 100 150 200 250 300 350
10
4
10
3
10
2
10
1
10
0
Iterations
R
e
l
a
t
i
v
e
R
e
s
i
d
u
a
l
Fig. 3.3. CGNR for 2D elliptic equation.
The examples for BiCGSTAB use the code bicgstab from the collection
of MATLAB codes. This code has the same input/output arguments as gmres.
We applied BiCGSTAB to both the preconditioned and unpreconditioned
forms of (3.33) with the same data and termination criterion as for the GMRES
example.
We plot the iteration histories in Fig. 3.4. As in the previous examples, the
solid line corresponds to the preconditioned iteration and the dashed line to
the unpreconditioned iteration. Neither convergence history is monotonic, with
the unpreconditioned iteration being very irregular, but not varying by many
orders of magnitude as a CGS or BiCG iteration might. The preconditioned
iteration terminated in six iterations (12 matrixvector products) and required
roughly 1.7 million ﬂoatingpoint operations. The unpreconditioned iteration
took 40 iterations and 2.2 million ﬂoatingpoint operations. We see a
considerable improvement in cost over CGNR and, since our unpreconditioned
matrixvector product is so inexpensive, a better cost/iterate than GMRES in
the unpreconditioned case.
We repeat the experiment with TFQMR as our linear solver. We use
the code tfqmr that is included in our collection. In Fig. 3.5 we plot the
convergence histories in terms of the approximate residual
√
2k + 1τ
2k
given
by (3.31). We plot only full iterations (m = 2k) except, possibly, for the
ﬁnal iterate. The approximate residual is given in terms of the quasiresidual
norm τ
m
rather than the true residual, and the graphs are monotonic. Our
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
58 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
0 5 10 15 20 25 30 35 40
10
4
10
3
10
2
10
1
10
0
10
1
Iterations
R
e
l
a
t
i
v
e
R
e
s
i
d
u
a
l
Fig. 3.4. BiCGSTAB for 2D elliptic equation.
0 10 20 30 40 50 60 70
10
5
10
4
10
3
10
2
10
1
10
0
Iterations
R
e
l
a
t
i
v
e
A
p
p
r
o
x
i
m
a
t
e
R
e
s
i
d
u
a
l
Fig. 3.5. TFQMR for 2D elliptic equation.
termination criterion is based on the quasiresidual and (3.31) as described in
¸ 3.6.4, stopping the iteration when τ
m
√
m+ 1/b
2
≤ h
−2
. This led to actual
relative residuals of 7 10
−5
for the unpreconditioned iteration and 2 10
−4
for the preconditioned iteration, indicating that it is reasonable to use the
quasiresidual estimate (3.31) to make termination decisions.
The unpreconditioned iteration required roughly 4.1 million ﬂoatingpoint
operations and 67 iterations for termination. The preconditioned iteration
was much more eﬃcient, taking 1.9 million ﬂoatingpoint operations and 7
iterations. The plateaus in the graph of the convergence history for the
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
GMRES ITERATION 59
unpreconditioned iteration are related (see [196]) to spikes in the corresponding
graph for a CGS iteration.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
60 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
3.9. Exercises on GMRES
3.9.1. Give an example of a matrix that is not diagonalizable.
3.9.2. Prove Theorem 3.1.4 and Theorem 3.1.5.
3.9.3. Prove that the Arnoldi process (unless it terminates prematurely with a
solution) produces a basis for /
k+1
that satisﬁes (3.8).
3.9.4. Duplicate Table 3.1 and add two more columns corresponding to classical
Gram–Schmidt orthogonalization with reorthogonalization at every step
and determine when the test in (3.10) detects loss of orthogonality. How
do your conclusions compare with those in [160]?
3.9.5. Let A = J
λ
be an elementary Jordan block of size N corresponding to
an eigenvalue λ ,= 0. This means that
J
λ
=
_
_
_
_
_
_
_
_
_
_
λ 1 0
0 λ 1 0
.
.
.
.
.
.
.
.
.
.
.
.
0 λ 1 0
0 λ 1
0 λ
_
_
_
_
_
_
_
_
_
_
.
Give an example of x
0
, b ∈ R
N
, and λ ∈ R such that the GMRES
residuals satisfy r
k

2
= r
0

2
for all k < N.
3.9.6. Consider the centered diﬀerence discretization of
−u
+u
+u = 1, u(0) = u(1) = 0.
Solve this problem with GMRES (without preconditioning) and then
apply as a preconditioner the map M deﬁned by
−(Mf)
= f, (Mf)(0) = (Mf)(1) = 0.
That is, precondition with a solver for the highorder term in the
diﬀerential equation using the correct boundary conditions. Try this
for meshes with 50, 100, 200, . . . points. How does the performance of
the iteration depend on the mesh? Try other preconditioners such as
Gauss–Seidel and Jacobi. How do other methods such as CGNR, CGNE,
BiCGSTAB, and TFQMR perform?
3.9.7. Modify the MATLAB GMRES code to allow for restarts. One way to do
this is to call the gmres code with another code which has an outer loop
to control the restarting as in Algorithm gmresm. See [167] for another
example. How would you test for failure in a restarted algorithm? What
kinds of failures can occur? Repeat Exercise 6 with a limit of three
GMRES iterations before a restart.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
GMRES ITERATION 61
3.9.8. Duplicate the results in ¸ 3.7 and ¸ 3.8. Compare the execution times
in your computing environment. Experiment with diﬀerent values of m.
Why might GMRES(m) be faster than GMRES? Vary the dimension and
the coeﬃcient a
y
. Try a
y
= 5y. Why is the performance of the methods
sensitive to a
y
and a
x
?
3.9.9. Apply CGNE to the PDE in ¸ 3.7. How does the performance diﬀer from
CGNR and GMRES?
3.9.10. Consider preconditioning from the right. For the PDE in ¸ 3.7, apply
GMRES, CGNR, BiCGSTAB, TFQMR, and/or CGNE, to the equation
LGw = f and then recover the solution from u = Gw. Compare timings
and solution accuracy with those obtained by left preconditioning.
3.9.11. Are the solutions to (3.33) in ¸ 3.7 and ¸ 3.8 equally accurate? Compare
the ﬁnal results with the known solution.
3.9.12. Implement BiCG and CGS and try them on some of the examples in
this chapter.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
62 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
Chapter 4
Basic Concepts and FixedPoint Iteration
This part of the book is about numerical methods for the solution of systems
of nonlinear equations. Throughout this part, we will let   denote a norm
on R
N
as well as the induced matrix norm.
We begin by setting the notation. We seek to solve
F(x) = 0. (4.1)
Here F : R
N
→ R
N
. We denote the ith component of F by f
i
. If the
components of F are diﬀerentiable at x ∈ R
N
we deﬁne the Jacobian matrix
F
(x) by
F
(x)
ij
=
∂f
i
∂x
j
(x).
The Jacobian matrix is the vector analog of the derivative. We may express
the fundamental theorem of calculus as follows.
Theorem 4.0.1. Let F be diﬀerentiable in an open set Ω ⊂ R
N
and let
x
∗
∈ Ω. Then for all x ∈ Ω suﬃciently near x
∗
F(x) −F(x
∗
) =
_
1
0
F
(x
∗
+t(x −x
∗
))(x −x
∗
) dt.
4.1. Types of convergence
Iterative methods can be classiﬁed by the rate of convergence.
Definition 4.1.1. Let ¦x
n
¦ ⊂ R
N
and x
∗
∈ R
N
. Then
• x
n
→ x
∗
qquadratically if x
n
→ x
∗
and there is K > 0 such that
x
n+1
−x
∗
 ≤ Kx
n
−x
∗

2
.
• x
n
→ x
∗
qsuperlinearly with qorder α > 1 if x
n
→ x
∗
and there is
K > 0 such that
x
n+1
−x
∗
 ≤ Kx
n
−x
∗

α
.
• x
n
→ x
∗
qsuperlinearly if
lim
n→∞
x
n+1
−x
∗

x
n
−x
∗

= 0.
65
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
66 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
• x
n
→ x
∗
qlinearly with qfactor σ ∈ (0, 1) if
x
n+1
−x
∗
 ≤ σx
n
−x
∗

for n suﬃciently large.
Definition 4.1.2. An iterative method for computing x
∗
is said to be
locally (qquadratically, qsuperlinearly, qlinearly, etc.) convergent if the
iterates converge to x
∗
(qquadratically, qsuperlinearly, qlinearly, etc.) given
that the initial data for the iteration is suﬃciently good.
Note that a qsuperlinearly convergent sequence is also qlinearly conver
gent with qfactor σ for any σ > 0. A qquadratically convergent sequence is
qsuperlinearly convergent with qorder 2.
In general, a qsuperlinearly convergent method is preferable to a qlinearly
convergent one if the cost of a single iterate is the same for both methods. We
will see that often a method that is more slowly convergent, in terms of the
convergence types deﬁned above, can have a cost/iterate so low that the slow
iteration is more eﬃcient.
Sometimes errors are introduced into the iteration that are independent of
the errors in the approximate solution. An example of this would be a problem
in which the nonlinear function F is the output of another algorithm that
provides error control, such as adjustment of the mesh size in an approximation
of a diﬀerential equation. One might only compute F to low accuracy in the
initial phases of the iteration to compute a solution and tighten the accuracy
as the iteration progresses. The rtype convergence classiﬁcation enables us to
describe that idea.
Definition 4.1.3. Let ¦x
n
¦ ⊂ R
N
and x
∗
∈ R
N
. Then ¦x
n
¦ converges
to x
∗
r(quadratically, superlinearly, linearly) if there is a sequence ¦ξ
n
¦ ⊂ R
converging q(quadratically, superlinearly, linearly) to zero such that
x
n
−x
∗
 ≤ ξ
n
.
We say that ¦x
n
¦ converges rsuperlinearly with rorder α > 1 if ξ
n
→ 0
qsuperlinearly with qorder α.
4.2. Fixedpoint iteration
Many nonlinear equations are naturally formulated as ﬁxedpoint problems
x = K(x) (4.2)
where K, the ﬁxedpoint map, may be nonlinear. A solution x
∗
of (4.2) is
called a ﬁxed point of the map K. Such problems are nonlinear analogs of the
linear ﬁxedpoint problems considered in Chapter 1. In this section we analyze
convergence of ﬁxedpoint iteration, which is given by
x
n+1
= K(x
n
). (4.3)
This iterative method is also called nonlinear Richardson iteration, Picard
iteration, or the method of successive substitution.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
FIXEDPOINT ITERATION 67
Before discussing convergence of ﬁxedpoint iteration we make two deﬁni
tions.
Definition 4.2.1. Let Ω ⊂ R
N
and let G : Ω → R
M
. G is Lipschitz
continuous on Ω with Lipschitz constant γ if
G(x) −G(y) ≤ γx −y
for all x, y ∈ Ω.
Definition 4.2.2. Let Ω ⊂ R
N
. K : Ω → R
N
is a contraction mapping
on Ω if K is Lipschitz continuous on Ω with Lipschitz constant γ < 1.
The standard result for ﬁxedpoint iteration is the Contraction Mapping
Theorem [9]. Compare the proof to that of Lemma 1.2.1 and Corollary 1.2.1.
Theorem 4.2.1. Let Ω be a closed subset of R
N
and let K be a contraction
mapping on Ω with Lipschitz constant γ < 1 such that K(x) ∈ Ω for all x ∈ Ω.
Then there is a unique ﬁxed point of K, x
∗
∈ Ω, and the iteration deﬁned by
(4.3) converges qlinearly to x
∗
with qfactor γ for all initial iterates x
0
∈ Ω.
Proof. Let x
0
∈ Ω. Note that ¦x
n
¦ ⊂ Ω because x
0
∈ Ω and K(x) ∈ Ω
whenever x ∈ Ω. The sequence ¦x
n
¦ remains bounded since for all i ≥ 1
x
i+1
−x
i
 = K(x
i
) −K(x
i−1
) ≤ γx
i
−x
i−1
 . . . ≤ γ
i
x
1
−x
0
,
and therefore
x
n
−x
0
 = 
n−1
i=0
x
i+1
−x
i

≤
n−1
i=0
x
i+1
−x
i
 ≤ x
1
−x
0

n−1
i=0
γ
i
≤ x
1
−x
0
/(1 −γ).
Now, for all n, k ≥ 0,
x
n+k
−x
n
 = K(x
n+k−1
) −K(x
n−1
)
≤ γx
n+k−1
−x
n−1

≤ γK(x
n+k−2
) −K(x
n−2
)
≤ γ
2
x
n+k−2
−x
n−2
 ≤ . . . ≤ γ
n
x
k
−x
0

≤ γ
n
x
1
−x
0
/(1 −γ).
Hence
lim
n,k→∞
x
n+k
−x
n
 = 0
and therefore the sequence ¦x
n
¦ is a Cauchy sequence and has a limit x
∗
[162].
If K has two ﬁxed points x
∗
and y
∗
in Ω, then
x
∗
−y
∗
 = K(x
∗
) −K(y
∗
) ≤ γx
∗
−y
∗
.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
68 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
Since γ < 1 this can only hold if x
∗
−y
∗
 = 0, i. e. x
∗
= y
∗
. Hence the ﬁxed
point is unique.
Finally we note that
x
n+1
−x
∗
 = K(x
n
) −K(x
∗
) ≤ γx
n
−x
∗
,
which completes the proof.
One simple application of this theorem is to numerical integration of
ordinary diﬀerential equations by implicit methods. For example, consider
the initial value problem
y
= f(y), y(t
0
) = y
0
and its solution by the backward Euler method with time step h,
y
n+1
−y
n
= hf(y
n+1
). (4.4)
In (4.4) y
k
denotes the approximation of y(t
0
+ kh). Advancing in time from
y
n
to y
n+1
requires solution of the ﬁxedpoint problem
y = K(y) = y
n
+hf(y).
For simplicity we assume that f is bounded and Lipschitz continuous, say
[f(y)[ ≤ M and [f(x) −f(y)[ ≤ M[x −y[
for all x, y.
If hM < 1, we may apply the contraction mapping theorem with Ω = R
since for all x, y
K(x) −K(y) = h[f(x) −f(y)[ ≤ hM[x −y[.
From this we conclude that for h suﬃciently small we can integrate forward
in time and that the time step h can be chosen independently of y. This time
step may well be too small to be practical, and the methods based on Newton’s
method which we present in the subsequent chapters are used much more often
[83]. This type of analysis was used by Picard to prove existence of solutions
of initial value problems [152]. Nonlinear variations of the classical stationary
iterative methods such as Gauss–Seidel have also been used; see [145] for a
more complete discussion of these algorithms.
4.3. The standard assumptions
We will make the standard assumptions on F.
Assumption 4.3.1.
1. Equation 4.1 has a solution x
∗
.
2. F
: Ω → R
N×N
is Lipschitz continuous with Lipschitz constant γ.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
FIXEDPOINT ITERATION 69
3. F
(x
∗
) is nonsingular.
These assumptions can be weakened [108] without sacriﬁcing convergence
of the methods we consider here. However the classical result on quadratic
convergence of Newton’s method requires them and we make them throughout.
Exercise 5.7.1 illustrates one way to weaken the standard assumptions.
Throughout this part, we will always denote a root of F by x
∗
. We let
B(r) denote the ball of radius r about x
∗
B(r) = ¦x[ e < r¦,
where
e = x −x
∗
.
The notation introduced above will be used consistently. If x
n
is the nth iterate
of a sequence, e
n
= x
n
−x
∗
is the error in that iterate.
Lemma 4.3.1 is an important consequence of the standard assumptions.
Lemma 4.3.1. Assume that the standard assumptions hold. Then there is
δ > 0 so that for all x ∈ B(δ)
F
(x) ≤ 2F
(x
∗
), (4.5)
F
(x)
−1
 ≤ 2F
(x
∗
)
−1
, (4.6)
and
F
(x
∗
)
−1

−1
e/2 ≤ F(x) ≤ 2F
(x
∗
)e. (4.7)
Proof. Let δ be small enough so that B(δ) ⊂ Ω. For all x ∈ B(δ) we have
F
(x) ≤ F
(x
∗
) +γe.
Hence (4.5) holds if γδ < F
(x
∗
).
The next result (4.6) is a direct consequence of the Banach Lemma if
I −F
(x
∗
)
−1
F
(x) < 1/2.
This will follow from
δ <
F
(x
∗
)
−1

−1
2γ
since then
I −F
(x
∗
)
−1
F
(x) = F
(x
∗
)
−1
(F
(x
∗
) −F
(x))
≤ γF
(x
∗
)
−1
e ≤ γδF
(x
∗
)
−1
 < 1/2.
(4.8)
To prove the ﬁnal inequality (4.7), we note that if x ∈ B(δ) then x
∗
+te ∈
B(δ) for all 0 ≤ t ≤ 1. We use (4.5) and Theorem 4.0.1 to obtain
F(x) ≤
_
1
0
F
(x
∗
+te)e dt ≤ 2F
(x
∗
)e
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
70 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
which is the right inequality in (4.7).
To prove the left inequality in (4.7) note that
F
(x
∗
)
−1
F(x) = F
(x
∗
)
−1
_
1
0
F
(x
∗
+te)e dt
= e −
_
1
0
(I −F
(x
∗
)
−1
F
(x
∗
+te))e dt,
and hence, by (4.8)
F
(x
∗
)
−1
F(x) ≥ e(1 −
_
1
0
I −F
(x
∗
)
−1
F
(x
∗
+te)dt) ≥ e/2.
Therefore
e/2 ≤ F
(x
∗
)
−1
F(x) ≤ F
(x
∗
)
−1
F(x),
which completes the proof.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
Chapter 5
Newton’s Method
5.1. Local convergence of Newton’s method
We will describe iterative methods for nonlinear equations in terms of the
transition from a current iterate x
c
to a new iterate x
+
. In this language,
Newton’s method is
x
+
= x
c
−F
(x
c
)
−1
F(x
c
). (5.1)
We may also view x
+
as the root of the twoterm Taylor expansion or linear
model of F about x
c
M
c
(x) = F(x
c
) +F
(x
c
)(x −x
c
).
In the context of single systems this method appeared in the 17th century
[140], [156], [13], [137].
The convergence result on Newton’s method follows from Lemma 4.3.1.
Theorem 5.1.1. Let the standard assumptions hold. Then there are K > 0
and δ > 0 such that if x
c
∈ B(δ) the Newton iterate from x
c
given by (5.1)
satisﬁes
e
+
 ≤ Ke
c

2
. (5.2)
Proof. Let δ be small enough so that the conclusions of Lemma 4.3.1 hold.
By Theorem 4.0.1
e
+
= e
c
−F
(x
c
)
−1
F(x
c
) = F
(x
c
)
−1
_
1
0
(F
(x
c
) −F
(x
∗
+te
c
))e
c
dt.
By Lemma 4.3.1 and the Lipschitz continuity of F
e
+
 ≤ (2F
(x
∗
)
−1
)γe
c

2
/2.
This completes the proof of (5.2) with K = γF
(x
∗
)
−1
.
The proof of convergence of the complete Newton iteration will be complete
if we reduce δ if needed so that Kδ < 1.
Theorem 5.1.2. Let the standard assumptions hold. Then there is δ such
that if x
0
∈ B(δ) the Newton iteration
x
n+1
= x
n
−F
(x
n
)
−1
F(x
n
)
converges qquadratically to x
∗
.
71
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
72 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
Proof. Let δ be small enough so that the conclusions of Theorem 5.1.1
hold. Reduce δ if needed so that Kδ = η < 1. Then if n ≥ 0 and x
n
∈ B(δ),
Theorem 5.1.1 implies that
e
n+1
 ≤ Ke
n

2
≤ ηe
n
 < e
n
 (5.3)
and hence x
n+1
∈ B(ηδ) ⊂ B(δ). Therefore if x
n
∈ B(δ) we may continue the
iteration. Since x
0
∈ B(δ) by assumption, the entire sequence ¦x
n
¦ ⊂ B(δ).
(5.3) then implies that x
n
→ x
∗
qquadratically.
The assumption that the initial iterate be “suﬃciently near” the solution
(x
0
∈ B(δ)) may seem artiﬁcial at ﬁrst look. There are, however, many
situations in which the initial iterate is very near the root. Two examples are
implicit integration of ordinary diﬀerential equations and diﬀerential algebraic
equations, [105], [83], [16], where the initial iterate is derived from the solution
at the previous time step, and solution of discretizations of partial diﬀerential
equations, where the initial iterate is an interpolation of a solution from a
coarser computational mesh [99], [126]. Moreover, when the initial iterate is
far from the root and methods such as those discussed in Chapter 8 are needed,
local convergence results like those in this and the following two chapters
describe the terminal phase of the iteration.
5.2. Termination of the iteration
We can base our termination decision on Lemma 4.3.1. If x ∈ B(δ), where δ
is small enough so that the conclusions of Lemma 4.3.1 hold, then if F
(x
∗
) is
well conditioned, we may terminate the iteration when the relative nonlinear
residual F(x)/F(x
0
) is small. We have, as a direct consequence of applying
Lemma 4.3.1 twice, a nonlinear form of Lemma 1.1.1.
Lemma 5.2.1. Assume that the standard assumptions hold. Let δ > 0 be
small enough so that the conclusions of Lemma 4.3.1 hold for x ∈ B(δ). Then
for all x ∈ B(δ)
e
4e
0
κ(F
(x
∗
))
≤
F(x)
F(x
0
)
≤
4κ(F
(x
∗
))e
e
0

,
where κ(F
(x
∗
)) = F
(x
∗
)F
(x
∗
)
−1
 is the condition number of F
(x
∗
)
relative to the norm  .
From Lemma 5.2.1 we conclude that if F
(x
∗
) is well conditioned, the size
of the relative nonlinear residual is a good indicator of the size of the error.
However, if there is error in the evaluation of F or the initial iterate is near a
root, a termination decision based on the relative residual may be made too
late in the iteration or, as we will see in the next section, the iteration may
not terminate at all. We raised this issue in the context of linear equations
in Chapter 1 when we compared (1.2) and (1.4). In the nonlinear case, there
is no “righthand side” to use as a scaling factor and we must balance the
relative and absolute size of the nonlinear residuals in some other way. In all
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
NEWTON’S METHOD 73
of our MATLAB codes and algorithms for nonlinear equations our termination
criterion is to stop the iteration if
F(x) ≤ τ
r
F(x
0
) +τ
a
, (5.4)
where the relative error tolerance τ
r
and absolute error tolerance τ
a
are input
to the algorithm. Combinations of relative and absolute error tolerances are
commonly used in numerical methods for ordinary diﬀerential equations or
diﬀerential algebraic equations [16], [21], [23], [151].
Another way to decide whether to terminate is to look at the Newton step
s = −F
(x
c
)
−1
F(x
c
) = x
+
−x
c
,
and terminate the iteration when s is suﬃciently small. This criterion is
based on Theorem 5.1.1, which implies that
e
c
 = s +O(e
c

2
).
Hence, near the solution s and e
c
are essentially the same size.
For methods other than Newton’s method, the relation between the step
and the error is less clear. If the iteration is qlinearly convergent, say, then
e
+
 ≤ σe
c

implies that
(1 −σ)e
c
 ≤ s ≤ (1 +σ)e
c
.
Hence the step is a reliable indicator of the error provided σ is not too near
1. The diﬀerential equations codes discussed in [16], [21], [23], and [151], make
use of this in the special case of the chord method.
However, in order to estimate e
c
 this way, one must do most of the work
needed to compute x
+
, whereas by terminating on small relative residuals or
on a condition like (5.4) the decision to stop the iteration can be made before
computation and factorization of F
(x
c
). If computation and factorization of
the Jacobian are inexpensive, then termination on small steps becomes more
attractive.
5.3. Implementation of Newton’s method
In this section we assume that direct methods will be used to solve the linear
equation for the Newton step. Our examples and discussions of operation
counts make the implicit assumption that the Jacobian is dense. However,
the issues for sparse Jacobians when direct methods are used to solve linear
systems are very similar. In Chapter 6 we consider algorithms in which iterative
methods are used to solve the equation for the Newton step.
In order to compute the Newton iterate x
+
from a current point x
c
one
must ﬁrst evaluate F(x
c
) and decide whether to terminate the iteration. If
one decides to continue, the Jacobian F
(x
c
) must be computed and factored.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
74 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
Then the step is computed as the solution of F
(x
c
)s = −F(x
c
) and the iterate
is updated x
+
= x
c
+s. Of these steps, the evaluation and factorization of F
are the most costly. Factorization of F
in the dense case costs O(N
3
) ﬂoating
point operations. Evaluation of F
by ﬁnite diﬀerences should be expected to
cost N times the cost of an evaluation of F because each column in F
requires
an evaluation of F to form the diﬀerence approximation. Hence the cost of a
Newton step may be roughly estimated as N + 1 evaluations of F and O(N
3
)
ﬂoatingpoint operations. In many cases F
can be computed more eﬃciently,
accurately, and directly than with diﬀerences and the analysis above for the
cost of a Newton iterate is very pessimistic. See Exercise 5.7.21 for an example.
For general nonlinearities or general purpose codes such as the MATLAB code
nsol from the collection, there may be little alternative to diﬀerence Jacobians.
In the future, automatic diﬀerentiation (see [94] for a collection of articles on
this topic) may provide such an alternative.
For deﬁniteness in the description of Algorithm newton we use an LU
factorization of F
. Any other appropriate factorization such as QR or
Cholesky could be used as well. The inputs of the algorithm are the initial
iterate x, the nonlinear map F, and a vector of termination tolerances
τ = (τ
r
, τ
a
) ∈ R
2
.
Algorithm 5.3.1. newton(x, F, τ)
1. r
0
= F(x)
2. Do while F(x) > τ
r
r
0
+τ
a
(a) Compute F
(x)
(b) Factor F
(x) = LU.
(c) Solve LUs = −F(x)
(d) x = x +s
(e) Evaluate F(x).
One approach to reducing the cost of items 2a and item 2b in the Newton
iteration is to move them outside of the main loop. This means that the linear
approximation of F(x) = 0 that is solved at each iteration has derivative
determined by the initial iterate.
x
+
= x
c
−F
(x
0
)
−1
F(x
c
).
This method is called the chord method. The inputs are the same as for
Newton’s method.
Algorithm 5.3.2. chord(x, F, τ)
1. r
0
= F(x)
2. Compute F
(x)
3. Factor F
(x) = LU.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
NEWTON’S METHOD 75
4. Do while F(x) > τ
r
r
0
+τ
a
(a) Solve LUs = −F(x)
(b) x = x +s
(c) Evaluate F(x).
The only diﬀerence in implementation from Newton’s method is that the
computation and factorization of the Jacobian are done before the iteration
is begun. The diﬀerence in the iteration itself is that an approximation to
F
(x
c
) is used. Similar diﬀerences arise if F
(x
c
) is numerically approximated
by diﬀerences. We continue in the next section with an analysis of the eﬀects
of errors in the function and Jacobian in a general context and then consider
the chord method and diﬀerence approximation to F
as applications.
5.4. Errors in the function and derivative
Suppose that F and F
are computed inaccurately so that F + and F
+ ∆
are used instead of F and F
in the iteration. If ∆ is suﬃciently small, the
resulting iteration can return a result that is an O() accurate approximation
to x
∗
. This is diﬀerent from convergence and was called “local improvement”
in [65]. These issues are discussed in [201] as well. If, for example, is entirely
due to ﬂoatingpoint roundoﬀ, there is no reason to expect that F(x
n
) will
ever be smaller than in general. We will refer to this phase of the iteration
in which the nonlinear residual is no longer being reduced as stagnation of the
iteration.
Theorem 5.4.1. Let the standard assumptions hold. Then there are
¯
K > 0, δ > 0, and δ
1
> 0 such that if x
c
∈ B(δ) and ∆(x
c
) < δ
1
then
x
+
= x
c
−(F
(x
c
) + ∆(x
c
))
−1
(F(x
c
) +(x
c
))
is deﬁned (i.e., F
(x
c
) + ∆(x
c
) is nonsingular) and satisﬁes
e
+
 ≤
¯
K(e
c

2
+∆(x
c
)e
c
 +(x
c
)). (5.5)
Proof. Let δ be small enough so that the conclusions of Lemma 4.3.1 and
Theorem 5.1.1 hold. Let
x
N
+
= x
c
−F
(x
c
)
−1
F(x
c
)
and note that
x
+
= x
N
+
+ (F
(x
c
)
−1
−(F
(x
c
) + ∆(x
c
))
−1
)F(x
c
) −(F
(x
c
) + ∆(x
c
))
−1
(x
c
).
Lemma 4.3.1 and Theorem 5.1.1 imply
e
+
 ≤ Ke
c

2
+ 2F
(x
c
)
−1
−(F
(x
c
) + ∆(x
c
))
−1
)F
(x
∗
)e
c

+(F
(x
c
) + ∆(x
c
))
−1
(x
c
).
(5.6)
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
76 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
If
∆(x
c
) ≤ F
(x
∗
)
−1

−1
/4
then Lemma 4.3.1 implies that
∆(x
c
) ≤ F
(x
c
)
−1

−1
/2
and the Banach Lemma implies that F
(x
c
) + ∆(x
c
) is nonsingular and
(F
(x
c
) + ∆(x
c
))
−1
 ≤ 2F
(x
c
)
−1
 ≤ 4F
(x
∗
)
−1
.
Hence
F
(x
c
)
−1
−(F
(x
c
) + ∆(x
c
))
−1
 ≤ 8F
(x
∗
)
−1

2
∆(x
c
).
We use these estimates and (5.6) to obtain
e
+
 ≤ Ke
c

2
+ 16F
(x
∗
)
−1

2
F
(x
∗
)∆(x
c
)e
c
 + 4F
(x
∗
)
−1
(x
c
).
Setting
¯
K = K + 16F
(x
∗
)
−1

2
F
(x
∗
) + 4F
(x
∗
)
−1

completes the proof.
The remainder of this section is devoted to applications of Theorem 5.4.1.
5.4.1. The chord method. Recall that the chord method is given by
x
+
= x
c
−F
(x
0
)
−1
F(x
c
).
In the language of Theorem 5.4.1
(x
c
) = 0, ∆(x
c
) = F
(x
0
) −F
(x
c
).
Hence, if x
c
, x
0
∈ B(δ) ⊂ Ω
∆(x
c
) ≤ γx
0
−x
c
 ≤ γ(e
0
 +e
c
). (5.7)
We apply Theorem 5.4.1 to obtain the following result.
Theorem 5.4.2. Let the standard assumptions hold. Then there are
K
C
> 0 and δ > 0 such that if x
0
∈ B(δ) the chord iterates converge qlinearly
to x
∗
and
e
n+1
 ≤ K
C
e
0
e
n
. (5.8)
Proof. Let δ be small enough so that B(δ) ⊂ Ω and the conclusions of
Theorem 5.4.1 hold. Assume that x
n
∈ B(δ). Combining (5.7) and (5.5)
implies
e
n+1
 ≤
¯
K(e
n
(1 +γ) +γe
0
)e
n
 ≤
¯
K(1 + 2γ)δe
n
.
Hence if δ is small enough so that
¯
K(1 + 2γ)δ = η < 1
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
NEWTON’S METHOD 77
then the chord iterates converge qlinearly to x
∗
. Qlinear convergence implies
that e
n
 < e
0
 and hence (5.8) holds with K
C
=
¯
K(1 + 2γ).
Another variation of the chord method is
x
+
= x
c
−A
−1
F(x
c
),
where A ≈ F
(x
∗
). Methods of this type may be viewed as preconditioned
nonlinear Richardson iteration. Since
∆(x
c
) = A−F
(x
c
) ≤ A−F
(x
∗
) +F
(x
∗
) −F
(x
c
),
if x
c
∈ B(δ) ⊂ Ω then
∆(x
c
) ≤ A−F
(x
∗
) +γe
c
 ≤ A−F
(x
∗
) +γδ.
Theorem 5.4.3. Let the standard assumptions hold. Then there are
K
A
> 0, δ > 0, and δ
1
> 0, such that if x
0
∈ B(δ) and A−F
(x
∗
) < δ
1
then
the iteration
x
n+1
= x
n
−A
−1
F(x
n
)
converges qlinearly to x
∗
and
e
n+1
 ≤ K
A
(e
0
 +A−F
(x
∗
))e
n
. (5.9)
5.4.2. Approximate inversion of F
. Another way to implement chord
type methods is to provide an approximate inverse of F
. Here we replace
F
(x)
−1
by B(x), where the action of B on a vector is less expensive to compute
than a solution using the LU factorization. Rather than express the iteration
in terms of B
−1
(x) −F
(x) and using Theorem 5.4.3 one can proceed directly
from the deﬁnition of approximate inverse.
Note that if B is constant (independent of x) then the iteration
x
+
= x
c
−BF(x
c
)
can be viewed as a preconditioned nonlinear Richardson iteration.
We have the following result.
Theorem 5.4.4. Let the standard assumptions hold. Then there are
K
B
> 0, δ > 0, and δ
1
> 0, such that if x
0
∈ B(δ) and the matrixvalued
function B(x) satisﬁes
I −B(x)F
(x
∗
) = ρ(x) < δ
1
(5.10)
for all x ∈ B(δ) then the iteration
x
n+1
= x
n
−B(x
n
)F(x
n
)
converges qlinearly to x
∗
and
e
n+1
 ≤ K
B
(ρ(x
n
) +e
n
)e
n
. (5.11)
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
78 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
Proof. On one hand, this theorem could be regarded as a corollary of the
Banach Lemma and Theorem 5.4.1. We give a direct proof.
First, by (5.10) we have
B(x) = B(x)F
(x
∗
)F
(x
∗
)
−1
 ≤ B(x)F
(x
∗
)F
(x
∗
)
−1

≤ M
B
= (1 +δ
1
)F
(x
∗
)
−1
.
(5.12)
Using (5.12) and
e
+
= e
c
−B(x
c
)F(x
c
) =
_
1
0
(I −B(x
c
)F
(x
∗
+te
c
))e
c
dt
= (I −B(x
c
)F
(x
∗
))e
c
+B(x
c
)
_
1
0
(F
(x
∗
) −F
(x
∗
+te
c
))e
c
dt
we have
e
+
 ≤ ρ(x
c
)e
c
 +M
B
γe
c

2
/2.
This completes the proof with K
B
= 1 +M
B
γ/2.
5.4.3. The Shamanskii method. Alternation of a Newton step with a
sequence of chord steps leads to a class of highorder methods, that is, methods
that converge qsuperlinearly with qorder larger than 2. We follow [18] and
name this method for Shamanskii [174], who considered the ﬁnite diﬀerence
Jacobian formulation of the method from the point of view of eﬃciency. The
method itself was analyzed in [190]. Other highorder methods, especially for
problems in one dimension, are described in [190] and [146].
We can describe the transition from x
c
to x
+
by
y
1
= x
c
−F
(x
c
)
−1
F(x
c
),
y
j+1
= y
j
−F
(x
c
)
−1
F(y
j
) for 1 ≤ j ≤ m−1,
x
+
= y
m
.
(5.13)
Note that m = 1 is Newton’s method and m = ∞ is the chord method with
¦y
j
¦ playing the role of the chord iterates.
Algorithmically a second loop surrounds the factor/solve step. The inputs
for the Shamanskii method are the same as for Newton’s method or the chord
method except for the addition of the parameter m. Note that we can overwrite
x
c
with the sequence of y
j
’s. Note that we apply the termination test after
computation of each y
j
.
Algorithm 5.4.1. sham(x, F, τ, m)
1. r
0
= F(x)
2. Do while F(x) > τ
r
r
0
+τ
a
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
NEWTON’S METHOD 79
(a) Compute F
(x)
(b) Factor F
(x) = LU.
(c) for j = 1, . . . m
i. Solve LUs = −F(x)
ii. x = x +s
iii. Evaluate F(x).
iv. If F(x) ≤ τ
r
r
0
+τ
a
exit.
The convergence result is a simple application of Theorem 5.4.2.
Theorem 5.4.5. Let the standard assumptions hold and let m ≥ 1 be
given. Then there are K
S
> 0 and δ > 0 such that if x
0
∈ B(δ) the Shamanskii
iterates converge qsuperlinearly to x
∗
with qorder m+ 1 and
e
n+1
 ≤ K
S
e
n

m+1
. (5.14)
Proof. Let δ be small enough so that B(δ) ⊂ Ω and that the conclusions
of Theorem 5.4.2 hold. Then if x
n
∈ B(δ) all the intermediate iterates ¦y
j
¦
are in B(δ) by Theorem 5.4.2. In fact, if we set y
1
= x
n
, (5.8) implies that for
1 ≤ j ≤ m
y
j
−x
∗
 ≤ K
C
x
n
−x
∗
y
j−1
−x
∗
 = K
C
e
n
y
j−1
−x
∗
 ≤ . . . ≤ K
j
C
e
n

j+1
.
Hence x
n+1
∈ B(δ). Setting j = m in the above inequality completes the
proof.
The advantage of the Shamanskii method over Newton’s method is
that high qorders can be obtained with far fewer Jacobian evaluations or
factorizations. Optimal choices of m as a function of N can be derived
under assumptions consistent with dense matrix algebra [18] by balancing the
O(N
3
) cost of a matrix factorization with the cost of function and Jacobian
evaluation. The analysis in [18] and the expectation that, if the initial iterate is
suﬃciently accurate, only a small number of iterations will be taken, indicate
that the chord method is usually the best option for very large problems.
Algorithm nsol is based on this idea and using an idea from [114] computes
and factors the Jacobian only until an estimate of the qfactor for the linear
convergence rate is suﬃciently low.
5.4.4. Diﬀerence approximation to F
. Assume that we compute
F(x) + (x) instead of F(x) and attempt to approximate the action of F
on a vector by a forward diﬀerence. We would do this, for example, when
building an approximate Jacobian. Here the jth column of F
(x) is F
(x)e
j
where e
j
is the unit vector with jth component 1 and other components 0.
A forward diﬀerence approximation to F
(x)w would be
F(x +hw) +(x +hw) −F(x) −(x)
h
.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
80 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
Assume that (x) ≤ ¯ . Then
F
(x)w −
F(x +hw) +(x +hw) −F(x) −(x)
h
= O(h + ¯ /h).
The quantity inside the Oterm is minimized when h =
√
¯ . This means, for
instance, that very small diﬀerence steps h can lead to inaccurate results. In
the special case where (x) is a result of ﬂoatingpoint roundoﬀ in full precision
(¯ ≈ 10
−15
), the analysis here indicates that h ≈ 10
−7
is a reasonable choice.
The choice h ≈ 10
−15
in this case can lead to disaster (try it!). Here we have
made the implicit assumption that x and w are of roughly the same size. If
this is not the case, h should be scaled to reﬂect that. The choice
h = ¯
1/2
x/w
reﬂects all the discussion above.
A more important consequence of this analysis is that the choice of step
size in a forward diﬀerence approximation to the action of the Jacobian on
a vector must take into account the error in the evaluation of F. This can
become very important if part of F is computed from measured data.
If F(x) has already been computed, the cost of a forward diﬀerence
approximation of F
(x)w is an additional evaluation of F (at the point x+hw).
Hence the evaluation of a full ﬁnite diﬀerence Jacobian would cost N function
evaluations, one for each column of the Jacobian. Exploitation of special
structure could reduce this cost.
The forward diﬀerence approximation to the directional derivative is not
a linear operator in w. The reason for this is that the derivative has been
approximated, and so linearity has been lost. Because of this we must
carefully specify what we mean by a diﬀerence approximation to the Jacobian.
Deﬁnition 5.4.1 below speciﬁes the usual choice.
Definition 5.4.1. Let F be deﬁned in a neighborhood of x ∈ R
N
.
(∇
h
F)(x) is the N N matrix whose jth column is given by
(∇
h
F)(x)
j
=
_
¸
¸
¸
¸
_
¸
¸
¸
¸
_
F(x +hxe
j
) −F(x)
hx
x ,= 0
F(he
j
) −F(x)
h
x = 0
We make a similar deﬁnition of the diﬀerence approximation of the
directional derivative.
Definition 5.4.2. Let F be deﬁned in a neighborhood of x ∈ R
N
and let
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
NEWTON’S METHOD 81
w ∈ R
N
. We have
D
h
F(x : w) =
_
¸
¸
¸
¸
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
¸
¸
¸
¸
_
0, w = 0,
w
F(x +hxw/w) −F(x)
hx
, w, x ,= 0,
w
F(hw/w) −F(x)
h
, x = 0, w ,= 0.
(5.15)
The standard assumptions and the Banach Lemma imply the following
result.
Lemma 5.4.1. Let the standard assumptions hold. Then there are δ, ¯ ,
¯
h >
0 such that if x ∈ B(δ), h ≤
¯
h, and (x) ≤ ¯ for all x ∈ B(δ), then
∇
h
(F +)(x) is nonsingular for all x ∈ B(δ) and there is M
F
> 0 such that
F
(x) −∇
h
(F +)(x) ≤ M
F
(h + ¯ /h).
If we assume that the forward diﬀerence step has been selected so that
h = O(
√
¯ ), then ∆(x) = O(
√
¯ ) by Lemma 5.4.1. Theorem 5.4.1 implies the
following result.
Theorem 5.4.6. Let the standard assumptions hold. Then there are
positive δ, ¯ , and K
D
such that if x
c
∈ B(δ), (x) ≤ ¯ for all x ∈ B(δ),
and there is M
−
> 0 such that
h
c
> M
−
√
¯
then
x
+
= x
c
−∇
hc
(F +)(x
c
)
−1
(F(x
c
) +(x
c
))
satisﬁes
e
+
 ≤ K
D
(¯ + (e
c
 +h
c
)e
c
).
Note that Theorem 5.4.6 does not imply that an iteration will converge to
x
∗
. Even if is entirely due to ﬂoatingpoint roundoﬀ, there is no guarantee
that (x
n
) → 0 as x
n
→ x
∗
. Hence, one should expect the sequence ¦e
n
¦
to stagnate and cease to decrease once e
n
 ≈ ¯ . Therefore in the early
phases of the iteration, while e
n
 >>
√
¯ , the progress of the iteration will
be indistinguishable from Newton’s method as implemented with an exact
Jacobian. When e
n
 ≤
√
¯ , Theorem 5.4.6 says that e
n+1
 = O(¯ ). Hence
one will see quadratic convergence until the error in F admits no further
reduction.
A diﬀerence approximation to a full Jacobian does not use any special
structure of the Jacobian and will probably be less eﬃcient than a hand coded
analytic Jacobian. If information on the sparsity pattern of the Jacobian is
available it is sometimes possible [47], [42], to evaluate ∇
h
F with far fewer than
N evaluations of F. If the sparsity pattern is such that F
can be reduced to
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
82 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
a block triangular form, the entire solution process can take advantage of this
structure [59].
For problems in which the Lipschitz constant of F
is large, the error in
the diﬀerence approximation will be large, too. Moreover, D
h
F(x : w) is not,
in general, a linear function of w and it is usually the case that
(∇
h
F(x))w ,= D
h
F(x : w).
If F
(x
∗
) is ill conditioned or the Lipschitz constant of F
is large, the
diﬀerence between them may be signiﬁcant and a diﬀerence approximation to
a directional derivative, which uses the size of w, is likely to be more accurate
than ∇
h
F(x)w.
While we used full matrix approximations to numerical Jacobians in all
the examples reported here, they should be used with caution.
5.4.5. The secant method. The results in this section are for single
equations f(x) = 0, where f is a realvalued function of one real variable.
We assume for the sake of simplicity that there are no errors in the evaluation
of f, i.e., (x) = 0. The standard assumptions, then, imply that f
(x
∗
) ,= 0.
There is no reason to use a diﬀerence approximation to f
for equations in a
single variable because one can use previously computed data to approximate
f
(x
n
) by
a
n
=
f(x
n
) −f(x
n−1
)
x
n
−x
n−1
.
The resulting method is called the secant method and is due to Newton
[140], [137]. We will prove that the secant method is locally qsuperlinearly
convergent if the standard assumptions hold.
In order to start, one needs two approximations to x
∗
, x
0
and x
−1
. The
local convergence theory can be easily described by Theorem 5.4.1. Let x
c
be
the current iterate and x
−
the previous iterate. We have (x
c
) = 0 and
∆(x
c
) =
f(x
c
) −f(x
−
)
x
c
−x
−
−f
(x
c
). (5.16)
We have the following theorem.
Theorem 5.4.7. Let the standard assumptions hold with N = 1. Then
there is δ > 0 such that if x
0
, x
−1
∈ B(δ) and x
0
,= x
−1
then the secant iterates
converge qsuperlinearly to x
∗
.
Proof. Let δ be small enough so that B(δ) ⊂ Ω and the conclusions of
Theorem 5.4.1 hold. Assume that x
−1
, x
0
, . . . , x
n
∈ B(δ). If x
n
= x
∗
for some
ﬁnite n, we are done. Otherwise x
n
,= x
n−1
. If we set s = x
n
−x
n−1
we have
by (5.16) and the fundamental theorem of calculus
∆(x
n
) =
_
1
0
(f
(x
n−1
+ts) −f
(x
n
)) dt
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
NEWTON’S METHOD 83
and hence
[∆(x
n
)[ ≤ γ[x
n−1
−x
n
[/2 ≤
γ([e
n
[ +[e
n−1
[)
2
.
Applying Theorem 5.4.1 implies
[e
n+1
[ ≤
¯
K((1 +γ/2)[e
n
[
2
+γ[e
n
[[e
n−1
[/2). (5.17)
If we reduce δ so that
¯
K(1 +γ)δ = η < 1
then x
n
→ x
∗
qlinearly. Therefore
[e
n+1
[
[e
n
[
≤
¯
K((1 +γ/2)[e
n
[ +γ[e
n−1
[/2) → 0
as n → ∞. This completes the proof.
The secant method and the classical bisection (see Exercise 5.7.8) method
are the basis for the popular method of Brent [17] for the solution of single
equations.
5.5. The Kantorovich Theorem
In this section we present the result of Kantorovich [107], [106], which states
that if the standard assumptions “almost” hold at a point x
0
, then there
is a root and the Newton iteration converges rquadratically to that root.
This result is of use, for example, in proving that discretizations of nonlinear
diﬀerential and integral equations have solutions that are near to that of the
continuous problem.
We require the following assumption.
Assumption 5.5.1. There are β, η, ¯ r, and γ with βηγ ≤ 1/2 and x
0
∈ R
N
such that
1. F is diﬀerentiable at x
0
,
F
(x
0
)
−1
 ≤ β, and F
(x
0
)
−1
F(x
0
) ≤ η.
2. F
is Lipschitz continuous with Lipschitz constant γ in a ball of radius
¯ r ≥ r
−
about x
0
where
r
−
=
1 −
√
1 −2βηγ
βγ
.
We will let B
0
be the closed ball of radius r
−
about x
0
B
0
= ¦x[ x −x
0
 ≤ r
−
¦.
We do not prove the result in its full generality and refer to [106], [145],
[57], and [58] for a complete proof and for discussions of related results.
However, we give a simpliﬁed version of a Kantorovichlike result for the chord
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
84 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
method to show how existence of a solution can be derived from estimates
on the function and Jacobian. Unlike the chord method results in [106],
[145], and [57], Theorem 5.5.1 assumes a bit more than Assumption 5.5.1 but
has a simple proof that uses only the contraction mapping theorem and the
fundamental theorem of calculus and obtains qlinear convergence instead of
rlinear convergence.
Theorem 5.5.1. Let Assumption 5.5.1 hold and let
βγη < 1/2. (5.18)
Then there is a unique root x
∗
of F in B
0
, the chord iteration with x
0
as the
initial iterate converges to x
∗
qlinearly with qfactor βγr
−
, and x
n
∈ B
0
for
all n. Moreover x
∗
is the unique root of F in the ball of radius
min
_
¯ r,
1 +
√
1 −2βηγ
βγ
_
about x
0
.
Proof. We will show that the map
φ(x) = x −F
(x
0
)
−1
F(x)
is a contraction on the set
o = ¦x[ x −x
0
 ≤ r
−
¦.
By the fundamental theorem of calculus and Assumption 5.5.1 we have for
all x ∈ o
F
(x
0
)
−1
F(x) = F
(x
0
)
−1
F(x
0
)
+F
(x
0
)
−1
_
1
0
F
(x
0
+t(x −x
0
))(x −x
0
) dt
= F
(x
0
)
−1
F(x
0
) +x −x
0
+F
(x
0
)
−1
_
1
0
(F
(x
0
+t(x −x
0
)) −F
(x
0
))(x −x
0
) dt
(5.19)
and
φ(x) −x
0
= −F
(x
0
)
−1
F(x
0
)
−F
(x
0
)
−1
_
1
0
(F
(x
0
+t(x −x
0
)) −F
(x
0
))(x −x
0
) dt.
Hence,
φ(x) −x
0
 ≤ η +βγr
2
−
/2 = r
−
.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
NEWTON’S METHOD 85
and φ maps o into itself.
To show that φ is a contraction on o and to prove the qlinear convergence
we will show that
φ(x) −φ(y) ≤ βγr
−
x −y
for all x, y ∈ o. If we do this, the contraction mapping theorem (Theo
rem 4.2.1) will imply that there is a unique ﬁxed point x
∗
of φ in o and
that x
n
→ x
∗
qlinearly with qfactor βγr
−
. Clearly x
∗
is a root of F because
it is a ﬁxed point of φ. We know that βγr
−
< 1 by our assumption that
βηγ < 1/2.
Note that for all x ∈ B
0
φ
(x) = I −F
(x
0
)
−1
F
(x) = F
(x
0
)
−1
(F
(x
0
) −F
(x))
and hence
φ
(x) ≤ βγx −x
0
 ≤ βγr
−
< 1.
Therefore, for all x, y ∈ B
0
,
φ(x) −φ(y) =
_
_
_
_
_
1
0
φ
(y +t(x −y))(x −y) dt
_
_
_
_
≤ βγr
−
x −y.
This proves the convergence result and the uniqueness of x
∗
in B
0
.
To prove the remainder of the uniqueness assertion let
r
+
=
1 +
√
1 −2βηγ
βγ
and note that r
+
> r
−
because βηγ < 1/2. If ¯ r = r
−
, then we are done by the
uniqueness of x
∗
in B
0
. Hence we may assume that ¯ r > r
−
. We must show
that if x is such that
r
−
< x −x
0
 < min(¯ r, r
+
)
then F(x) ,= 0. Letting r = x −x
0
 we have by (5.19)
F
(x
0
)
−1
F(x) ≥ r −η −βγr
2
/2 > 0
because r
−
< r < r
+
. This completes the proof.
The Kantorovich theorem is more precise than Theorem 5.5.1 and has a
very diﬀerent proof. We state the theorem in the standard way [145], [106],
[63], using the rquadratic estimates from [106].
Theorem 5.5.2. Let Assumption 5.5.1 hold. Then there is a unique root
x
∗
of F in B
0
, the Newton iteration with x
0
as the initial iterate converges to
x
∗
, and x
n
∈ B
0
for all n. x
∗
is the unique root of F in the ball of radius
min
_
¯ r,
1 +
√
1 −2βηγ
βγ
_
about x
0
and the errors satisfy
e
n
 ≤
(2βηγ)
2
n
2
n
βγ
. (5.20)
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
86 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
If βηγ < 1/2 then (5.20) implies that the convergence is rquadratic (see
Exercise 5.7.16).
5.6. Examples for Newton’s method
In the collection of MATLAB codes we provide an implementation nsol
of Newton’s method, the chord method, the Shamanskii method, or a
combination of all of them based on the reduction in the nonlinear residual.
This latter option decides if the Jacobian should be recomputed based on the
ratio of successive residuals. If
F(x
+
)/F(x
c
)
is below a given threshold, the Jacobian is not recomputed. If the ratio is too
large, however, we compute and factor F
(x
+
) for use in the subsequent chord
steps. In addition to the inputs of Algorithm sham, a threshold for the ratio
of successive residuals is also input. The Jacobian is recomputed and factored
if either the ratio of successive residuals exceeds the threshold 0 < ρ < 1 or
the number of iterations without an update exceeds m. The output of the
MATLAB implementation of nsol is the solution, the history of the iteration
stored as the vector of l
∞
norms of the nonlinear residuals, and an error ﬂag.
nsol uses diffjac to approximate the Jacobian by ∇
h
F with h = 10
−7
.
While our examples have dense Jacobians, the ideas in the implementation
of nsol also apply to situations in which the Jacobian is sparse and direct
methods for solving the equation for the Newton step are necessary. One would
still want to minimize the number of Jacobian computations (by the methods
of [47] or [42], say) and sparse matrix factorizations. In Exercise 5.7.25, which
is not easy, the reader is asked to create a sparse form of nsol.
Algorithm 5.6.1. nsol(x, F, τ, m, ρ)
1. r
c
= r
0
= F(x)
2. Do while F(x) > τ
r
r
0
+τ
a
(a) Compute ∇
h
F(x) ≈ F
(x)
(b) Factor ∇
h
F(x) = LU.
(c) i
s
= 0, σ = 0
(d) Do While i
s
< m and σ ≤ ρ
i. i
s
= i
s
+ 1
ii. Solve LUs = −F(x)
iii. x = x +s
iv. Evaluate F(x)
v. r
+
= F(x), σ = r
+
/r
c
, r
c
= r
+
vi. If F(x) ≤ τ
r
r
0
+τ
a
exit.
In the MATLAB code the iteration is terminated with an error condition
if σ ≥ 1 at any point. This indicates that the local convergence analysis in this
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
NEWTON’S METHOD 87
section is not applicable. In Chapter 8 we discuss how to recover from this.
nsol is based on the assumption that a diﬀerence Jacobian will be used. In
Exercise 5.7.21 you are asked to modify the code to accept an analytic Jacobian.
Many times the Jacobian can be computed eﬃciently by reusing results that
are already available from the function evaluation. The parameters m and ρ
are assigned the default values of 1000 and .5. With these default parameters
the decision to update the Jacobian is made based entirely on the reduction in
the norm of the nonlinear residual. nsol allows the user to specify a maximum
number of iterations; the default is 40.
The Chandrasekhar Hequation. The Chandrasekhar Hequation, [41],
[30],
F(H)(µ) = H(µ) −
_
1 −
c
2
_
1
0
µH(ν) dν
µ +ν
_
−1
= 0, (5.21)
is used to solve exit distribution problems in radiative transfer.
We will discretize the equation with the composite midpoint rule. Here we
approximate integrals on [0, 1] by
_
1
0
f(µ) dµ ≈
1
N
N
j=1
f(µ
j
),
where µ
i
= (i −1/2)/N for 1 ≤ i ≤ N. The resulting discrete problem is
F(x)
i
= x
i
−
_
_
1 −
c
2N
N
j=1
µ
i
x
j
µ
i
+µ
j
_
_
−1
. (5.22)
It is known [132] that both (5.21) and the discrete analog (5.22) have two
solutions for c ∈ (0, 1). Only one of these solutions has physical meaning,
however. Newton’s method, with the 0 function or the constant function with
value 1 as initial iterate, will ﬁnd the physically meaningful solution [110].
The standard assumptions hold near either solution [110] for c ∈ [0, 1). The
discrete problem has its own physical meaning [41] and we will attempt to
solve it to high accuracy. Because H has a singularity at µ = 0, the solution of
the discrete problem is not even a ﬁrstorder accurate approximation to that
of the continuous problem.
In the computations reported here we set N = 100 and c = .9. We used the
function identically one as initial iterate. We computed all Jacobians with the
MATLAB code diffjac. We set the termination parameters τ
r
= τ
a
= 10
−6
.
We begin with a comparison of Newton’s method and the chord method.
We present some iteration statistics in both tabular and graphical form. In
Table 5.1 we tabulate the iteration counter n, F(x
n
)
∞
/F(x
0
)
∞
, and the
ratio
R
n
= F(x
n
)
∞
/F(x
n−1
)
∞
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
88 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
Table 5.1
Comparison of Newton and chord iteration.
n F(x
n
)
∞
/F(x
0
)
∞
R
n
F(x
n
)
∞
/F(x
0
)
∞
R
n
0 1.000e+00 1.000e+00
1 1.480e01 1.480e01 1.480e01 1.480e01
2 2.698e03 1.823e02 3.074e02 2.077e01
3 7.729e07 2.865e04 6.511e03 2.118e01
4 1.388e03 2.132e01
5 2.965e04 2.136e01
6 6.334e05 2.136e01
7 1.353e05 2.136e01
8 2.891e06 2.136e01
for n ≥ 1 for both Newton’s method and the chord method. In Fig. 5.1
the solid curve is a plot of F(x
n
)
∞
/F(x
0
)
∞
for Newton’s method and
the dashed curve a plot of F(x
n
)
∞
/F(x
0
)
∞
for the chord method. The
concave iteration track is the signature of superlinear convergence, the linear
track indicates linear convergence. See Exercise 5.7.15 for a more careful
examination of this concavity. The linear convergence of the chord method
can also be seen in the convergence of the ratios R
n
to a constant value.
For this example the MATLAB flops command returns an fairly accurate
indicator of the cost. The Newton iteration required over 8.8 million ﬂoating
point operations while the chord iteration required under 3.3 million. In
Exercise 5.7.18 you are asked to obtain timings in your environment and will
see how the ﬂoatingpoint operation counts are related to the actual timings.
The good performance of the chord method is no accident. If we think
of the chord method as a preconditioned nonlinear Richardson iteration with
the initial Jacobian playing the role of preconditioner, then the chord method
should be much more eﬃcient than Newton’s method if the equations for the
steps can be solved cheaply. In the case considered here, the cost of a single
evaluation and factorization of the Jacobian is far more than the cost of the
entire remainder of the chord iteration.
The ideas in Chapters 6 and 7 show how the low iteration cost of the
chord method can be combined with the small number of iterations of a
superlinearly convergent nonlinear iterative method. In the context of this
particular example, the cost of the Jacobian computation can be reduced by
analytic evaluation and reuse of data that have already been computed for the
evaluation of F. The reader is encouraged to do this in Exercise 5.7.21.
In the remaining examples, we plot, but do not tabulate, the iteration
statistics. In Fig. 5.2 we compare the Shamanskii method with m = 2 with
Newton’s method. The Shamanskii computation required under 6 million
ﬂoatingpoint operations, as compared with over 8.8 for Newton’s method.
However, for this problem, the simple chord method is the most eﬃcient.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
NEWTON’S METHOD 89
0 1 2 3 4 5 6 7 8
10
7
10
6
10
5
10
4
10
3
10
2
10
1
10
0
Iterations
R
e
l
a
t
i
v
e
N
o
n
l
i
n
e
a
r
R
e
s
i
d
u
a
l
Fig. 5.1. Newton and chord methods for Hequation, c = .9.
0 0.5 1 1.5 2 2.5 3 3.5 4
10
7
10
6
10
5
10
4
10
3
10
2
10
1
10
0
Iterations
R
e
l
a
t
i
v
e
N
o
n
l
i
n
e
a
r
R
e
s
i
d
u
a
l
Fig. 5.2. Newton and Shamanskii method for Hequation, c = .9.
The hybrid approach in nsol with the default parameters would be the chord
method for this problem.
We now reconsider the Hequation with c = .9999. For this problem the
Jacobian at the solution is nearly singular [110]. Aside from c, all iteration
parameters are the same as in the previous example. While Newton’s method
converges in 7 iterations and requires 21 million ﬂoatingpoint operations, the
chord method requires 188 iterations and 10.6 million ﬂoatingpoint operations.
The qfactor for the chord method in this computation was over .96, indicating
very slow convergence. The hybrid method in nsol with the default parameters
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
90 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
computes the Jacobian four times during the course of the iteration, converges
in 14 iterations, and requires 12 million ﬂoatingpoint operations. In Fig. 5.3
we compare the Newton iteration with the hybrid.
0 2 4 6 8 10 12 14
10
6
10
5
10
4
10
3
10
2
10
1
10
0
Iterations
R
e
l
a
t
i
v
e
N
o
n
l
i
n
e
a
r
R
e
s
i
d
u
a
l
Fig. 5.3. Newton and hybrid methods for Hequation c = .9999.
One should approach plots of convergence histories with some caution. The
theory for the chord method asserts that the error norms converge qlinearly to
zero. This implies only that the nonlinear residual norms ¦F(x
n
)¦ converge
rlinearly to zero. While the plots indicate qlinear convergence of the nonlinear
residuals, the theory does not, in general, support this. In practice, however,
plots like those in this section are common.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
NEWTON’S METHOD 91
5.7. Exercises on Newton’s method
In some of the exercises here, and in the rest of this part of the book, you will
be asked to plot or tabulate iteration statistics. When asked to do this, for
each iteration tabulate the iteration counter, the norm of F, and the ratio
of F(x
n
)/F(x
n−1
) for n ≥ 1. A better alternative in the MATLAB
environment is to use the semilogy command to plot the norms. When
one does this, one can visually inspect the plot to determine superlinear
convergence (concavity) without explicitly computing the ratios. Use the l
∞
norm.
5.7.1. A function G is said to be H¨older continuous with exponent α in Ω if
G(x) −G(y) ≤ Kx −y
α
for all x, y ∈ Ω. Show that if the Lipschitz
continuity condition on F
in the standard assumptions is replaced by
H¨older continuity with exponent α > 0 that the Newton iterates converge
with qorder 1 +α.
5.7.2. Can the performance of the Newton iteration be improved by a linear
change of variables? That is, for nonsingular N N matrices A and
B, can the Newton iterates for F(x) = 0 and AF(Bx) = 0 show any
performance diﬀerence when started at the same initial iterate? What
about the chord method?
5.7.3. Let ∇
h
F be given by Deﬁnition 5.4.1. Given the standard assumptions,
prove that the iteration given by
x
n+1
= x
n
−(∇
hn
F(x
n
))
−1
F(x
n
)
is locally qsuperlinearly convergent if h
n
→ 0. When is it locally q
quadratically convergent?
5.7.4. Suppose F
(x
n
) is replaced by ∇
hn
F(x
n
) in the Shamanskii method.
Discuss how h
n
must be decreased as the iteration progresses to preserve
the qorder of m+ 1 [174].
5.7.5. In what way is the convergence analysis of the secant method changed if
there are errors in the evaluation of f?
5.7.6. Prove that the secant method converges rsuperlinearly with rorder
(
√
5 + 1)/2. This is easy.
5.7.7. Show that the secant method converges qsuperlinearly with qorder
(
√
5 + 1)/2.
5.7.8. The bisection method produces a sequence of intervals (x
k
l
, x
k
r
) that
contain a root of a function of one variable. Given (x
k
l
, x
k
r
) with
f(x
k
l
)f(x
k
r
) < 0 we replace one of the endpoints with y = (x
k
l
+ x
k
r
)/2
to force f(x
k+1
l
)f(x
k+1
r
) < 0. If f(y) = 0, we terminate the iteration.
Show that the sequence x
k
= (x
k
l
+ x
k
r
)/2 is rlinearly convergent if f is
continuous and there exists (x
0
l
, x
0
r
) such that f(x
0
l
)f(x
0
r
) < 0.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
92 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
5.7.9. Write a program that solves single nonlinear equations with Newton’s
method, the chord method, and the secant method. For the secant
method, use x
−1
= .99x
0
. Apply your program to the following
function/initial iterate combinations, tabulate or plot iteration statistics,
and explain your results.
1. f(x) = cos(x) −x, x
0
= .5
2. f(x) = tan
−1
(x), x
0
= 1
3. f(x) = sin(x), x
0
= 3
4. f(x) = x
2
, x
0
= .5
5. f(x) = x
2
+ 1, x
0
= 10
5.7.10. For scalar functions of one variable a quadratic model of f about a point
x
c
is
m
c
(x) = f(x
c
) +f
(x
c
)(x −x
c
) +f
(x
c
)(x −x
c
)
2
/2.
Consider the iterative method that deﬁnes x
+
as a root of m
c
. What
problems might arise in such a method? Resolve these problems and
implement your algorithm. Test it on the problems in Exercise 5.7.9. In
what cases is this algorithm superior to Newton’s method? For hints and
generalizations of this idea to functions of several variables, see [170]. Can
the ideas in [170] be applied to some of the highorder methods discussed
in [190] for equations in one unknown?
5.7.11. Show that if f is a real function of one real variable, f
is Lipschitz
continuous, and f(x
∗
) = f
(x
∗
) = 0 but f
(x
∗
) ,= 0 then the iteration
x
n+1
= x
n
−2f(x
n
)/f
(x
n
)
converges locally qquadratically to x
∗
provided x
0
is suﬃciently near to
x
∗
, but not equal to x
∗
[171].
5.7.12. Show that if f is a real function of one real variable, the standard
assumptions hold, f
(x
∗
) = 0 and f
(x) is Lipschitz continuous, then
the Newton iteration converges cubically (qsuperlinear with qorder 3).
5.7.13. Show that if f is a real function of one real variable, the standard
assumptions hold, x
0
is suﬃciently near x
∗
, and f(x
0
)f
(x
0
) > 0, then
the Newton iteration converges monotonically to x
∗
. What if f
(x
∗
) = 0?
How can you extend this result if f
(x
∗
) = f
(x
∗
) = . . . f
(k)
(x
∗
) = 0 ,=
f
(k+1)
(x
∗
)? See [131].
5.7.14. How would you estimate the qorder of a qsuperlinearly convergent
sequence? Rigorously justify your answer.
5.7.15. If x
n
→ x
∗
qsuperlinearly with qorder α > 1, show that log(e
n
) is a
concave function of n in the sense that
log(e
n+1
) −log(e
n
)
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
NEWTON’S METHOD 93
is a decreasing function of n for n suﬃciently large. Is this still the case
if the convergence is qsuperlinear but there is no qorder? What if the
convergence is qlinear?
5.7.16. Show that the sequence on the right side of (5.20) is rquadratically (but
not qquadratically) convergent to 0 if βηγ < 1/2.
5.7.17. Typically (see [89], [184]) the computed LU factorization of F
(x) is the
exact LU factorization of a nearby matrix.
LU = F
(x) +E.
Use the theory in [89], [184] or another book on numerical linear algebra
and Theorem 5.4.1 to describe how this factorization error aﬀects the
convergence of the Newton iteration. How does the analysis for the QR
factorization diﬀer?
5.7.18. Duplicate the results on the Hequation in ¸ 5.6. Try diﬀerent values of
N and c. How does the performance of the methods diﬀer as N and c
change? Compare your results to the tabulated values of the H function
in [41] or [14]. Compare execution times in your environment. Tabulate
or plot iteration statistics. Use the MATLAB cputime command to see
how long the entire iteration took for each N, c combination and the
MATLAB flops command to get an idea for the number of ﬂoating
point operations required by the various methods.
5.7.19. Solve the Hequation with c = 1 and N = 100. Explain your results
(see [50] and [157]). Does the trick in Exercise 5.7.11 improve the
convergence? Try one of the approaches in [51], [49], or [118] for
acceleration of the convergence. You might also look at [90], [170], [75],
[35] and [155] for more discussion of this. Try the chord method and
compare the results to part 4 of Exercise 5.7.9 (see [52]).
5.7.20. Solve the Hequation with N = 100, c = .9 and c = .9999 with ﬁxedpoint
iteration. Compare timings and ﬂop counts with Newton’s method, the
chord method, and the algorithm in nsol.
5.7.21. Compute by hand the Jacobian of F, the discretized nonlinearity from
the H equation in (5.22). Prove that F
can be computed at a cost of
less than twice that of an evaluation of F. Modify nsol to accept this
analytic Jacobian rather than evaluate a diﬀerence Jacobian. How do
the execution times and ﬂop counts change?
5.7.22. Modify dirder by setting the forward diﬀerence step h = 10
−2
. How
does this change aﬀect the convergence rate observations that you have
made?
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
94 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
5.7.23. Add noise to your function evaluation with a random number generator,
such as the the MATLAB rand function. If the size of the noise is
O(10
−4
) and you make the appropriate adjustments to dirder, how are
the convergence rates diﬀerent? What is an appropriate termination
criterion? At what point does the iteration stagnate?
5.7.24. Assume that the standard assumptions hold, that the cost of a function
evaluation is O(N
2
) ﬂoatingpoint operations, the cost of a Jacobian is
O(N) function evaluations, and that x
0
is near enough to x
∗
so that the
Newton iteration converges qquadratically to x
∗
. Show that the number
n of Newton iterations needed to obtain e
n
 ≤ e
0
 is O(log()) and
that the number of ﬂoatingpoint operations required is O(N
3
log()).
What about the chord method?
5.7.25. Develop a sparse version of nsol for problems with tridiagonal Jacobian.
In MATLAB, for example, you could modify diffjac to use the
techniques of [47] and nsol to use the sparse matrix factorization in
MATLAB. Apply your code to the central diﬀerence discretization of the
twopoint boundary value problem
−u
= sin(u) +f(x), u(0) = u(1) = 0.
Here, the solution is u
∗
(x) = x(1 −x), and
f(x) = 2 −sin(x(1 −x)).
Let u
0
= 0 be the initial iterate.
5.7.26. Suppose you try to ﬁnd an eigenvectoreigenvalue pair (φ, λ) of an NN
matrix A by solving the system of N + 1 nonlinear equations
F(φ, λ) =
_
Aφ −λφ
φ
T
φ −1
_
=
_
0
0
_
(5.23)
for the vector x = (φ
T
, λ)
T
∈ R
N+1
. What is the Jacobian of this
system? If x = (φ, λ) ∈ R
N+1
is an eigenvectoreigenvalue pair, when is
F
(x) nonsingular? Relate the application of Newton’s method to (5.23)
to the inverse power method. See [150] for more development of this
idea.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
Chapter 6
Inexact Newton Methods
Theorem 5.4.1 describes how errors in the derivative/function aﬀect the
progress in the Newton iteration. Another way to look at this is to ask how
an approximate solution of the linear equation for the Newton step aﬀects the
iteration. This was the view taken in [55] where inexact Newton methods in
which the step satisﬁes
F
(x
c
)s +F(x
c
) ≤ η
c
F(x
c
) (6.1)
are considered. Any approximate step is accepted provided that the relative
residual of the linear equation is small. This is quite useful because conditions
like (6.1) are precisely the small linear residual termination conditions for
iterative solution of the linear system for the Newton step. Such methods are
not new. See [145] and [175], for example, for discussion of these ideas in the
context of the classical stationary iterative methods. In most of this chapter
we will focus on the approximate solution of the equation for the Newton step
by GMRES, but the other Krylov subspace methods discussed in Chapters 2
and 3 and elsewhere can also be used. We follow [69] and refer to the term η
c
on the right hand side of (6.1) as the forcing term.
6.1. The basic estimates
We will discuss speciﬁc implementation issues in ¸ 6.2. Before that we will
give the basic result from [55] to show how a step that satisﬁes (6.1) aﬀects
the convergence. We will present this result in two parts. In ¸ 6.1.1 we give a
straightforward analysis that uses the techniques of Chapter 5. In ¸ 6.1.2 we
show how the requirements of the simple analysis can be changed to admit a
more aggressive (i.e., larger) choice of the parameter η.
6.1.1. Direct analysis. The proof and application of Theorem 6.1.1 should
be compared to that of Theorem 5.1.1 and the other results in Chapter 5.
Theorem 6.1.1. Let the standard assumptions hold. Then there are δ and
K
I
such that if x
c
∈ B(δ), s satisﬁes (6.1), and x
+
= x
c
+s then
e
+
 ≤ K
I
(e
c
 +η
c
)e
c
. (6.2)
95
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
96 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
Proof. Let δ be small enough so that the conclusions of Lemma 4.3.1 and
Theorem 5.1.1 hold. To prove the ﬁrst assertion (6.2) note that if
2
r = −F
(x
c
)s −F(x
c
)
is the linear residual then
s +F
(x
c
)
−1
F(x
c
) = −F
(x
c
)
−1
r
and
e
+
= e
c
+s = e
c
−F
(x
c
)
−1
F(x
c
) −F
(x
c
)
−1
r. (6.3)
Now, (6.1), (4.7), and (4.6) imply that
s +F
(x
c
)
−1
F(x
c
) ≤ F
(x
c
)
−1
η
c
F(x
c
)
≤ 4κ(F
(x
∗
))η
c
e
c
.
Hence, using (6.3) and Theorem 5.1.1
e
+
 ≤ e
c
−F
(x
c
)
−1
F(x
c
) + 4κ(F
(x
∗
))η
c
e
c

≤ Ke
c

2
+ 4κ(F
(x
∗
))η
c
e
c
,
where K is the constant from (5.2). If we set
K
I
= K + 4κ(F
(x
∗
)),
the proof is complete.
We summarize the implications of Theorem 6.1.1 for iterative methods in
the next result, which is a direct consequence of (6.2) and will suﬃce to explain
most computational observations.
Theorem 6.1.2. Let the standard assumptions hold. Then there are δ and
¯ η such that if x
0
∈ B(δ), ¦η
n
¦ ⊂ [0, ¯ η], then the inexact Newton iteration
x
n+1
= x
n
+s
n
,
where
F
(x
n
)s
n
+F(x
n
) ≤ η
n
F(x
n
)
converges qlinearly to x
∗
. Moreover
• if η
n
→ 0 the convergence is qsuperlinear, and
• if η
n
≤ K
η
F(x
n
)
p
for some K
η
> 0 the convergence is qsuperlinear
with qorder 1 +p.
2
Our deﬁnition of r is consistent with the idea that the residual in a linear equation is b −Ax. In
[55] r = F
(xc)s +F(xc).
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
INEXACT NEWTON METHODS 97
Proof. Let δ be small enough so that (6.2) holds for x
c
∈ B(δ). Reduce δ
and ¯ η if needed so that
K
I
(δ + ¯ η) < 1,
where K
I
is from (6.2). Then if n ≥ 0 and x
n
∈ B(δ) we have
e
n+1
 ≤ K
I
(e
n
 +η
n
)e
n
 ≤ K
I
(δ + ¯ η)e
n
 < e
n
.
This proves qlinear convergence with a qfactor of K
I
(δ + ¯ η).
If η
n
→ 0 then qsuperlinear convergence follows from the deﬁnition. If
η
n
≤ K
η
F(x
n
)
p
then we may use (4.7) and (6.2) to conclude
e
n+1
 ≤ K
I
(e
n

1−p
+K
η
2
p
F
(x
∗
)
p
)e
n

1+p
which completes the proof.
6.1.2. Weighted norm analysis. Since K
I
= O(κ(F
(x
∗
))), one might
conclude from Theorem 6.1.2 that if F
(x
∗
) is ill conditioned very small forcing
terms must be used. This is not the case and the purpose of this subsection
is to describe more accurately the necessary restrictions on the forcing terms.
The results here diﬀer from those in ¸ 6.1.1 in that no restriction is put on
the sequence ¦η
n
¦ ⊂ [0, 1) other than requiring that 1 not be an accumulation
point.
Theorem 6.1.3. Let the standard assumptions hold. Then there is δ such
that if x
c
∈ B(δ), s satisﬁes (6.1), x
+
= x
c
+s, and η
c
≤ η < ¯ η < 1, then
F
(x
∗
)e
+
 ≤ ¯ ηF
(x
∗
)e
c
. (6.4)
Proof. To prove (6.4) note that Theorem 4.0.1 implies that
F(x
c
) ≤ F
(x
∗
)e
c
 +
γe
c

2
2
.
Since
e
c
 = F
(x
∗
)
−1
F
(x
∗
)e
c
 ≤ F
(x
∗
)
−1
F
(x
∗
)e
c

we have, with
M
0
=
γF
(x
∗
)
−1

2
F(x
c
) ≤ (1 +M
0
δ)F
(x
∗
)e
c
. (6.5)
Now,
F
(x
∗
)e
+
= F
(x
∗
)(e
c
+s)
= F
(x
∗
)(e
c
−F
(x
c
)
−1
F(x
c
) −F
(x
c
)
−1
r).
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
98 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
By Theorem 5.1.1
F
(x
∗
)(e
c
−F
(x
c
)
−1
F(x
c
)) ≤ KF
(x
∗
)e
c

2
.
Hence,
F
(x
∗
)e
+
 ≤ F
(x
∗
)F
(x
c
)
−1
r +KF
(x
∗
)e
c

2
. (6.6)
Since
F
(x
∗
)F
(x
c
)
−1
r ≤ r +(F
(x
∗
) −F
(x
c
))F
(x
c
)
−1
r
≤ (1 + 2γF
(x
∗
)
−1
e
c
)r,
we may set
M
1
= 2γF
(x
∗
)
−1
, and M
2
= KF
(x
∗
)
and obtain, using (4.7) and (6.5),
F
(x
∗
)e
+
 ≤ (1 +M
1
δ)r +M
2
δe
c

≤ (1 +M
1
δ)(1 +M
0
δ)η
c
F
(x
∗
)e
c

+M
2
δF
(x
∗
)
−1
F
(x
∗
)e
c

≤ ((1 +M
1
δ)(1 +M
0
δ)η +M
2
δF
(x
∗
)
−1
)F
(x
∗
)e
c
.
(6.7)
Now let δ be small enough so that
(1 +M
1
δ)(1 +M
0
δ)η +M
2
δF
(x
∗
)
−1
 ≤ ¯ η
and the proof is complete.
Note that the distance δ from the initial iterate to the solution may have
to be smaller for (6.4) to hold than for (6.2). However (6.4) is remarkable in its
assertion that any method that produces a step with a linear residual less than
that of the zero step will reduce the norm of the error if the error is measured
with the weighted norm
 
∗
= F
(x
∗
) .
In fact, Theorem 6.1.3 asserts qlinear convergence in the weighted norm if the
initial iterate is suﬃciently near x
∗
. This is made precise in Theorem 6.1.4.
The application of Theorem 6.1.3 described in Theorem 6.1.4 diﬀers from
Theorem 6.1.2 in that we do not demand that ¦η
n
¦ be bounded away from 1
by a suﬃciently large amount, just that 1 not be an accumulation point. The
importance of this theorem for implementation is that choices of the sequence
of forcing terms ¦η
n
¦ (such as η
n
= .5 for all n) that try to minimize the
number of inner iterations are completely justiﬁed by this result if the initial
iterate is suﬃciently near the solution. We make such a choice in one of the
examples considered in ¸ 6.4 and compare it to a modiﬁcation of a choice from
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
INEXACT NEWTON METHODS 99
[69] that decreases η
n
rapidly with a view toward minimizing the number of
outer iterations.
Theorem 6.1.4. Let the standard assumptions hold. Then there is δ such
that if x
0
∈ B(δ), ¦η
n
¦ ⊂ [0, η] with η < ¯ η < 1, then the inexact Newton
iteration
x
n+1
= x
n
+s
n
,
where
F
(x
n
)s
n
+F(x
n
) ≤ η
n
F(x
n
)
converges qlinearly with respect to  
∗
to x
∗
. Moreover
• if η
n
→ 0 the convergence is qsuperlinear, and
• if η
n
≤ K
η
F(x
n
)
p
for some K
η
> 0 the convergence is qsuperlinear
with qorder 1 +p.
Proof. The proof uses both (6.4) and (6.2). Our ﬁrst task is to relate the
norms   and  
∗
so that we can estimate δ. Let δ
0
be such that (6.4) holds
for e
c
 < δ
0
.
For all x ∈ R
N
,
x
∗
≤ F
(x
∗
)x ≤ κ(F
(x
∗
))x
∗
, (6.8)
Note that e < δ
0
if
e
∗
< δ
∗
= F
(x
∗
)δ
0
Set δ = F
(x
∗
)
−1
δ
∗
. Then e
0

∗
< δ
∗
if e
0
 < δ.
Our proof does not rely on the (possibly false) assertions that e
n
 < δ for
all n or that x
n
→ x
∗
qlinearly with respect to the unweighted norm. Rather
we note that (6.4) implies that if e
n

∗
< δ
∗
then
e
n+1

∗
≤ ¯ ηe
n

∗
< δ
∗
(6.9)
and hence x
n
→ x
∗
qlinearly with respect to the weighted norm. This proves
the ﬁrst assertion.
To prove the assertions on qsuperlinear convergence, note that since
x
n
→ x
∗
, eventually e
n
 will be small enough so that the conclusions of
Theorem 6.1.1 hold. The superlinear convergence results then follow exactly
as in the proof of Theorem 6.1.2.
We close this section with some remarks on the two types of results in
Theorems 6.1.2 and 6.1.4. The most signiﬁcant diﬀerence is the norm in which
qlinear convergence takes place. qlinear convergence with respect to the
weighted norm  
∗
is equivalent to qlinear convergence of the sequence of
nonlinear residuals ¦F(x
n
)¦. We state this result as Proposition 6.1.1 and
leave the proof to the reader in Exercise 6.5.2. The implications for the choice of
the forcing terms ¦η
n
¦ are also diﬀerent. While Theorem 6.1.2 might lead one
to believe that a small value of η
n
is necessary for convergence, Theorem 6.1.4
shows that the sequence of forcing terms need only be kept bounded away from
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
100 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
1. A constant sequence such as η
n
= .5 may well suﬃce for linear convergence,
but at a price of a greater number of outer iterations. We will discuss other
factors in the choice of forcing terms in ¸ 6.3.
Proposition 6.1.1. Let the standard assumptions hold and let x
n
→ x
∗
.
Then F(x
n
) converges qlinearly to 0 if and only if e
n

∗
does.
6.1.3. Errors in the function. In this section, we state two theorems on
inexact local improvement. We leave the proofs, which are direct analogs to
the proofs of Theorems 6.1.1 and 6.1.3, to the reader as exercises. Note that
errors in the derivative, such as those arising from a diﬀerence approximation
of the action of F
(x
c
) on a vector, can be regarded as part of the inexact
solution of the equation for the Newton step. See [55] or Proposition 6.2.1 and
its proof for an illustration of this point.
Theorem 6.1.5. Let the standard assumptions hold. Then there are δ and
K
I
such that if x
c
∈ B(δ), s satisﬁes
F
(x
c
)s +F(x
c
) +(x
c
) ≤ η
c
F(x
c
) +(x
c
) (6.10)
and x
+
= x
c
+s, then
e
+
 ≤ K
I
((e
c
 +η
c
)e
c
 +(x
c
)). (6.11)
Theorem 6.1.6. Let the standard assumptions hold. Then there is δ such
that if x
c
∈ B(δ), s satisﬁes (6.10), x
+
= x
c
+ s, and η
c
≤ η < ¯ η < 1, then
there is K
E
such that
F
(x
∗
)e
+
 ≤ ¯ ηF
(x
∗
)e
c
 +K
E
(x
c
). (6.12)
6.2. Newtoniterative methods
A Newtoniterative method realizes (6.1) with an iterative method for the linear
system for the Newton step, terminating when the relative linear residual is
smaller than η
c
(i.e, when (6.1) holds). The name of the method indicates
which linear iterative method is used, for example, NewtonSOR, NewtonCG,
NewtonGMRES, would use SOR, CG, or GMRES to solve the linear system.
This naming convention is taken from [175] and [145]. These methods have also
been called truncated Newton methods [56], [136] in the context of optimization.
Typically the nonlinear iteration that generates the sequence ¦x
n
¦ is called
the outer iteration and the linear iteration that generates the approximations
to the steps is called the inner iteration.
In this section we use the l
2
norm to measure nonlinear residuals. The
reason for this is that the Krylov methods use the scalar product and the
estimates in Chapters 2 and 3 are in terms of the Euclidean norm. In the
examples discussed in ¸ 6.4 we will scale the l
2
norm by a factor of 1/N so
that the results for the diﬀerential and integral equations will be independent
of the computational mesh.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
INEXACT NEWTON METHODS 101
6.2.1. Newton GMRES. We provide a detailed analysis of the Newton
GMRES iteration. We begin by discussing the eﬀects of a forward diﬀerence
approximation to the action of the Jacobian on a vector.
If the linear iterative method is any one of the Krylov subspace methods
discussed in Chapters 2 and 3 then each inner iteration requires at least one
evaluation of the action of F
(x
c
) on a vector. In the case of CGNR and CGNE,
an evaluation of the action of F
(x
c
)
T
on a vector is also required. In many
implementations [20], [24], the action of F
(x
c
) on a vector w is approximated
by a forward diﬀerence, (5.15), D
h
F(x : w) for some h. It is important to note
that this is entirely diﬀerent from forming the ﬁnite diﬀerence Jacobian ∇
h
F(x)
and applying that matrix to w. In fact, as pointed out in [20], application
of GMRES to the linear equation for the Newton step with matrixvector
products approximated by ﬁnite diﬀerences is the same as the application of
GMRES to the matrix G
h
F(x) whose last k −1 columns are the vectors
v
k
= D
h
F(x : v
k−1
)
The sequence ¦v
k
¦ is formed in the course of the forwarddiﬀerence GMRES
iteration.
To illustrate this point, we give the forwarddiﬀerence GMRES algorithm
for computation of the Newton step, s = −F
(x)
−1
F(x). Note that matrix
vector products are replaced by forward diﬀerence approximations to F
(x), a
sequence of approximate steps ¦s
k
¦ is produced, that b = −F(x), and that the
initial iterate for the linear problem is the zero vector. We give a MATLAB
implementation of this algorithm in the collection of MATLAB codes.
Algorithm 6.2.1. fdgmres(s, x, F, h, η, kmax, ρ)
1. s = 0, r = −F(x), v
1
= r/r
2
, ρ = r
2
, β = ρ, k = 0
2. While ρ > ηF(x)
2
and k < kmax do
(a) k = k + 1
(b) v
k+1
= D
h
F(x : v
k
)
for j = 1, . . . k
i. h
jk
= v
T
k+1
v
j
ii. v
k+1
= v
k+1
−h
jk
v
j
(c) h
k+1,k
= v
k+1

2
(d) v
k+1
= v
k+1
/v
k+1

2
(e) e
1
= (1, 0, . . . , 0)
T
∈ R
k+1
Minimize βe
1
−H
k
y
k

R
k+1 to obtain y
k
∈ R
k
.
(f) ρ = βe
1
−H
k
y
k

R
k+1.
3. s = V
k
y
k
.
In our MATLAB implementation we solve the least squares problem in
step 2e by the Givens rotation approach used in Algorithm gmres.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
102 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
As should be clear from inspection of Algorithm gmres, the diﬀerence
between ∇
h
F and G
h
F is that the columns are computed by taking directional
derivatives based on two diﬀerent orthonormal bases: ∇
h
F using the basis of
unit vectors in coordinate directions and G
h
F the basis for the Krylov space /
k
constructed by algorithm fdgmres. The action of G
h
F on /
⊥
k
, the orthogonal
complement of /
k
, is not used by the algorithm and, for the purposes of
analysis, could be speciﬁed by the action of ∇
h
F on any basis for /
⊥
k
. Hence
G
h
F is also ﬁrst order accurate in h. Assuming that there is no error in the
evaluation of F we may summarize the observations above in the following
proposition, which is a special case of the more general results in [20].
Proposition 6.2.1. Let the standard assumptions hold. Let η ∈ (0, 1).
Then there are C
G
,
¯
h, and δ such that if x ∈ B(δ), h ≤
¯
h, and Algo
rithm fdgmres terminates with k < kmax then the computed step s satisﬁes
F
(x)s +F(x)
2
< (η +C
G
h)F(x)
2
. (6.13)
Proof. First let δ be small enough so that B(δ) ⊂ Ω and the conclusions
to Lemma 4.3.1 hold. Let ¦u
j
¦ be any orthonormal basis for R
N
such that
u
j
= v
j
for 1 ≤ j ≤ k, where ¦v
j
¦
k
j=1
is the orthonormal basis for /
k
generated
by Algorithm fdgmres.
Consider the linear map deﬁned by its action on ¦u
j
¦
Bu
j
= D
h
F(x : u
j
).
Note that
D
h
F(x : u
j
) =
_
1
0
F
(x +thx
2
u
j
)u
j
dt
= F
(x)u
j
+
_
1
0
(F
(x +thx
2
u
j
) −F
(x))u
j
dt.
Since F
is Lipschitz continuous and ¦u
j
¦ is an orthonormal basis for R
N
we
have, with ¯ γ = γ(x
∗

2
+δ),
B −F
(x)
2
≤ hx
2
γ/2 ≤ h¯ γ/2 (6.14)
Since the linear iteration (fdgmres) terminates in k < kmax iterations, we
have, since B and G
h
F agree on /
k
,
Bs +F(x)
2
≤ ηF(x)
2
and therefore
F
(x)s +F(x)
2
≤ ηF(x)
2
+h¯ γs
2
/2. (6.15)
Assume that
¯
h¯ γ ≤ F
(x
∗
)
−1

−1
2
/2
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
INEXACT NEWTON METHODS 103
Then Lemma 4.3.1 and (6.15) imply
F
(x
∗
)
−1

−1
2
s
2
/2 ≤ F
(x)
−1

−1
2
s
2
≤ F
(x)s
2
≤ (1 +η)F(x)
2
+
¯
h¯ γs
2
/2.
Therefore,
s
2
≤ 4(1 +η)F
(x
∗
)
−1

2
F(x)
2
. (6.16)
Combining (6.16) with (6.15) completes the proof with C
G
= 4¯ γ(1 +
η)F
(x
∗
)
−1

2
.
Proposition 6.2.1 implies that a ﬁnite diﬀerence implementation will not
aﬀect the performance of NewtonGMRES if the steps in the forward diﬀerence
approximation of the derivatives are suﬃciently small. In an implementation,
therefore, we must specify not only the sequence ¦η
n
¦, but also the steps ¦h
n
¦
used to compute forward diﬀerences. Note also that Proposition 6.2.1 applies
to a restarted GMRES because upon successful termination (6.15) will hold.
We summarize our results so far in the analogs of Theorem 6.1.4 and
Theorem 6.1.2. The proofs are immediate consequences of Theorems 6.1.2,
6.1.4, and Proposition 6.2.1 and are left as (easy) exercises.
Theorem 6.2.1. Assume that the assumptions of Proposition 6.2.1 hold.
Then there are δ, ¯ σ such that if x
0
∈ B(δ) and the sequences ¦η
n
¦ and ¦h
n
¦
satisfy
σ
n
= η
n
+C
G
h
n
≤ ¯ σ
then the forward diﬀerence NewtonGMRES iteration
x
n+1
= x
n
+s
n
where s
n
is computed by Algorithm fdgmres with arguments
(s
n
, x
n
, F, h
n
, η
n
, kmax, ρ)
converges qlinearly and s
n
satisﬁes
F
(x
n
)s
n
+F(x
n
)
2
< σ
n
F(x
n
)
2
. (6.17)
Moreover,
• if σ
n
→ 0 the convergence is qsuperlinear, and
• if σ
n
≤ K
η
F(x
n
)
p
2
for some K
η
> 0 the convergence is qsuperlinear
with qorder 1 +p.
Theorem 6.2.2. Assume that the assumptions of Proposition 6.2.1 hold.
Then there is δ such that if x
0
∈ B(δ) and the sequences ¦η
n
¦ and ¦h
n
¦ satisfy
σ
n
= η
n
+C
G
h
n
∈ [0, ¯ σ]
with 0 < ¯ σ < 1 then the forward diﬀerence NewtonGMRES iteration
x
n+1
= x
n
+s
n
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
104 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
where s
n
is computed by Algorithm fdgmres with arguments
(s
n
, x
n
, F, h
n
, η
n
, kmax, ρ)
converges qlinearly with respect to  
∗
and s
n
satisﬁes (6.17). Moreover
• if σ
n
→ 0 the convergence is qsuperlinear, and
• if σ
n
≤ K
η
F(x
n
)
p
2
for some K
η
> 0 the convergence is qsuperlinear
with qorder 1 +p.
In the case where h
n
is approximately the square root of machine roundoﬀ,
σ
n
≈ η
n
if η
n
is not too small and Theorem 6.2.2 states that the observed
behavior of the forward diﬀerence implementation will be the same as that of
an implementation with exact derivatives.
6.2.2. Other Newtoniterative methods. Other iterative methods may
be used as the solver for the linear equation for the Newton step. Newton
multigrid [99] is one approach using a stationary iterative method. See also
[6], [111]. If F
is symmetric and positive deﬁnite, as it is in the context of
unconstrained minimization, [63], NewtonCG is a possible approach.
When storage is restricted or the problem is very large, GMRES may not
be practical. GMRES(m), for a small m, may not converge rapidly enough
to be useful. CGNR and CGNE oﬀer alternatives, but require evaluation of
transposevector products. BiCGSTAB and TFQMR should be considered in
all cases where there is not enough storage for GMRES(m) to perform well.
If a transpose is needed, as it will be if CGNR or CGNE is used as the
iterative method, a forward diﬀerence formulation is not an option because
approximation of the transposevector product by diﬀerences is not possible.
Computation of ∇
h
F(x) and its transpose analytically is one option as is
diﬀerentiation of F automatically or by hand. The reader may explore this
further in Exercise 6.5.13.
The reader may also be interested in experimenting with the other Krylov
subspace methods described in ¸ 3.6, [78], and [12].
6.3. NewtonGMRES implementation
We provide a MATLAB code nsolgm in the collection that implements
a forwarddiﬀerence NewtonGMRES algorithm. This algorithm requires
diﬀerent inputs than nsol. As input we must give the initial iterate, the
function, the vector of termination tolerances as we did for nsol. In addition,
we must provide a method for forming the sequence of forcing terms ¦η
n
¦.
We adjust η as the iteration progresses with a variation of a choice from
[69]. This issue is independent of the particular linear solver and our discussion
is in the general inexact Newton method setting. Setting η to a constant for
the entire iteration is often a reasonable strategy as we see in ¸ 6.4, but the
choice of that constant depends on the problem. If a constant η is too small,
much eﬀort can be wasted in the initial stages of the iteration. The choice in
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
INEXACT NEWTON METHODS 105
[69] is motivated by a desire to avoid such over solving. Over solving means
that the linear equation for the Newton step is solved to a precision far beyond
what is needed to correct the nonlinear iteration. As a measure of the degree
to which the nonlinear iteration approximates the solution we begin with
η
A
n
= γF(x
n
)
2
/F(x
n−1
)
2
,
where γ ∈ (0, 1] is a parameter. If η
A
n
is uniformly bounded away from 1,
then setting η
n
= η
A
n
for n > 0 would guarantee qquadratic convergence by
Theorem 6.1.1. In this way, the most information possible would be extracted
from the inner iteration. In order to specify the choice at n = 0 and bound
the sequence away from 1 we set
η
B
n
=
_
¸
_
¸
_
η
max
, n = 0,
min(η
max
, η
A
n
), n > 0.
(6.18)
In (6.18) the parameter η
max
is an upper limit on the sequence ¦η
n
¦. In [69]
the choices γ = .9 and η
max
= .9999 are used.
It may happen that η
B
n
is small for one or more iterations while x
n
is still
far from the solution. A method of safeguarding was suggested in [69] to avoid
volatile decreases in η
n
. The idea is that if η
n−1
is suﬃciently large we do not
let η
n
decrease by much more than a factor of η
n−1
.
η
C
n
=
_
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
_
η
max
, n = 0,
min(η
max
, η
A
n
), n > 0, γη
2
n−1
≤ .1,
min(η
max
, max(η
A
n
, γη
2
n−1
)), n > 0, γη
2
n−1
> .1.
(6.19)
The constant .1 is somewhat arbitrary. This safeguarding does improve the
performance of the iteration.
There is a chance that the ﬁnal iterate will reduce F far beyond the
desired level and that the cost of the solution of the linear equation for the last
step will be more accurate than is really needed. This oversolving on the ﬁnal
step can be controlled comparing the norm of the current nonlinear residual
F(x
n
) to the nonlinear residual norm at which the iteration would terminate
τ
t
= τ
a
+τ
r
F(x
0
)
and bounding η
n
from below by a constant multiple of τ
t
/F(x
n
). The
algorithm nsolgm does this and uses
η
n
= min(η
max
, max(η
C
n
, .5τ
t
/F(x
n
))). (6.20)
Exercise 6.5.9 asks the reader to modify nsolgm so that other choices of
the sequence ¦η
n
¦ can be used.
We use dirder to approximate directional derivatives and use the default
value of h = 10
−7
. In all the examples in this book we use the value γ = .9 as
recommended in [69].
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
106 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
Algorithm 6.3.1. nsolgm(x, F, τ, η)
1. r
c
= r
0
= F(x)
2
/
√
N
2. Do while F(x)
2
/
√
N > τ
r
r
0
+τ
a
(a) Select η.
(b) fdgmres(s, x, F, η)
(c) x = x +s
(d) Evaluate F(x)
(e) r
+
= F(x)
2
/
√
N, σ = r
+
/r
c
, r
c
= r
+
(f) If F(x)
2
≤ τ
r
r
0
+τ
a
exit.
Note that the cost of an outer iterate is one evaluation of F to compute
the value at the current iterate and other evaluations of F to compute the
forward diﬀerences for each inner iterate. Hence if a large value of η can be
used, the cost of the entire iteration can be greatly reduced. If a maximum
of m GMRES iterations is needed (or GMRES(m) is used as a linear solver)
the storage requirements of Algorithm nsolgm are the m+ 5 vectors x, F(x),
x +hv, F(x +hv), s, and the Krylov basis ¦v
k
¦
m
k=1
.
Since GMRES forms the inner iterates and makes termination decisions
based on scalar products and l
2
norms, we also terminate the outer iteration on
small l
2
norms of the nonlinear residuals. However, because of our applications
to diﬀerential and integral equations, we scale the l
2
norm of the nonlinear
residual by a factor of 1/
√
N so that constant functions will have norms that
are independent of the computational mesh.
For large and poorly conditioned problems, GMRES will have to be
restarted. We illustrate the consequences of this in ¸ 6.4 and ask the reader to
make the necessary modiﬁcations to nsolgm in Exercise 6.5.9.
The preconditioners we use in ¸ 6.4 are independent of the outer iteration.
Since the Newton steps for MF(x) = 0 are the same as those for F(x) = 0, it
is equivalent to precondition the nonlinear equation before calling nsolgm.
6.4. Examples for NewtonGMRES
In this section we consider two examples. We revisit the Hequation and solve
a preconditioned nonlinear convectiondiﬀusion equation.
In all the ﬁgures we plot the relative nonlinear residual F(x
n
)
2
/F(x
0
)
2
against the number of function evaluations required by all inner and outer
iterations to compute x
n
. Counts of function evaluations corresponding to
outer iterations are indicated by circles. From these plots one can compare
not only the number of outer iterations, but also the total cost. This enables
us to directly compare the costs of diﬀerent strategies for selection of η. Note
that if only a single inner iteration is needed, the total cost of the outer iteration
will be two function evaluations since F(x
c
) will be known. One new function
evaluation will be needed to approximate the action of F
(x
c
) on a vector
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
INEXACT NEWTON METHODS 107
and then F(x
+
) will be evaluated to test for termination. We also count
the evaluation of F(x
0
) as an additional cost in the evaluation of x
1
, which
therefore has a minimum cost of three function evaluations. For each example
we compare a constant value of η
n
with the choice given in (6.20) and used as
the default in nsolgm.
6.4.1. Chandrasekhar Hequation. We solve the Hequation on a 100
point mesh with c = .9 using two schemes for selection of the parameter η.
The initial iterate is the function identically one. We set the parameters in
(6.20) to γ = .9 and η
max
= .25.
We used τ
r
= τ
a
= 10
−6
in this example.
In Fig. 6.1 we plot the progress of the iteration using η = .1 with the solid
line and using the sequence given by (6.20) with the dashed line. We set the
parameters in (6.20) to γ = .9 and η
max
= .25. This choice of η
max
was the
one that did best overall in our experiments on this problem.
We see that the constant η iteration terminates after 12 function evalua
tions, 4 outer iterations, and roughly 275 thousand ﬂoatingpoint operations.
This is a slightly higher overall cost than the other approach in which ¦η
n
¦ is
given by (6.20), which terminated after 10 function evaluations, 3 outer itera
tions, and 230 thousand ﬂoatingpoint operations. The chord method with the
diﬀerent (but very similar for this problem) l
∞
termination condition for the
same problem, reported in ¸ 5.6 incurred a cost of 3.3 million ﬂoatingpoint
operations, slower by a factor of over 1000. This cost is entirely a result of the
computation and factorization of a single Jacobian.
0 2 4 6 8 10 12
10
8
10
7
10
6
10
5
10
4
10
3
10
2
10
1
10
0
Function Evaluations
R
e
l
a
t
i
v
e
N
o
n
l
i
n
e
a
r
R
e
s
i
d
u
a
l
Fig. 6.1. NewtonGMRES for the Hequation, c = .9.
In Fig. 6.2 the results change for the more illconditioned problem with
c = .9999. Here the iteration with η
n
= .1 performed slightly better and
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
108 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
0 5 10 15 20 25
10
7
10
6
10
5
10
4
10
3
10
2
10
1
10
0
Function Evaluations
R
e
l
a
t
i
v
e
N
o
n
l
i
n
e
a
r
R
e
s
i
d
u
a
l
Fig. 6.2. NewtonGMRES for the Hequation, c = .9999
0 2 4 6 8 10 12 14 16 18
10
4
10
3
10
2
10
1
10
0
Function Evaluations
R
e
l
a
t
i
v
e
N
o
n
l
i
n
e
a
r
R
e
s
i
d
u
a
l
Fig. 6.3. NewtonGMRES for the PDE, C = 20.
terminated after 22 function evaluations and 7 outer iterations, and roughly
513 thousand ﬂoatingpoint operations. The iteration with the decreasing ¦η
n
¦
given by (6.20) required 23 function evaluations, 7 outer iterations, and 537
thousand ﬂoatingpoint operations.
6.4.2. Convectiondiﬀusion equation. We consider the partial diﬀeren
tial equation
−∇
2
u +Cu(u
x
+u
y
) = f (6.21)
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
INEXACT NEWTON METHODS 109
with homogeneous Dirichlet boundary conditions on the unit square (0, 1)
(0, 1). f has been constructed so that the exact solution was the discretization
of
10xy(1 −x)(1 −y) exp(x
4.5
).
We set C = 20 and u
0
= 0. As in ¸ 3.7 we discretized (6.21) on a 31 31 grid
using centered diﬀerences. To be consistent with the secondorder accuracy of
the diﬀerence scheme we used τ
r
= τ
a
= h
2
, where h = 1/32.
We compare the same possibilities for ¦η
n
¦ as in the previous example and
report the results in Fig. 6.3. In (6.20) we set γ = .9 and η
max
= .5. The
computation with constant η terminated after 19 function evaluations and 4
outer iterates at a cost of roughly 3 million ﬂoatingpoint operations. The
iteration with ¦η
n
¦ given by (6.20) required 16 function evaluations, 4 outer
iterations, and 2.6 million ﬂoatingpoint operations. In the computation we
preconditioned (6.21) with the fast Poisson solver fish2d.
In this case, the choice of ¦η
n
¦ given by (6.20) reduced the number of inner
iterations in the early stages of the iteration and avoided oversolving. In cases
where the initial iterate is very good and only a single Newton iterate might be
needed, however, a large choice of η
max
or constant η may be the more eﬃcient
choice. Examples of such cases include (1) implicit time integration of nonlinear
parabolic partial diﬀerential equations or ordinary diﬀerential equations in
which the initial iterate is either the converged solution from the previous
time step or the output of a predictor and (2) solution of problems in which
the initial iterate is an interpolant of the solution from a coarser mesh.
Preconditioning is crucial for good performance. When preconditioning
was not done, the iteration for C = 20 required more than 80 function
evaluations to converge. In Exercise 6.5.9 the reader is asked to explore this.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
110 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
6.5. Exercises on inexact Newton methods
6.5.1. Verify (6.8).
6.5.2. Prove Proposition 6.1.1 by showing that if the standard assumptions
hold, 0 < < 1, and x is suﬃciently near x
∗
then
F(x)/(1 +) ≤ F
(x
∗
)e ≤ (1 +)F(x).
6.5.3. Verify (6.14).
6.5.4. Prove Theorems 6.2.2 and 6.2.1.
6.5.5. Prove Theorems 6.1.5 and 6.1.6.
6.5.6. Can anything like Proposition 6.2.1 be proved for a ﬁnitediﬀerence Bi
CGSTAB or TFQMR linear solver? If not, what are some possible
implications?
6.5.7. Give an example in which F(x
n
) → 0 qlinearly but x
n
does not converge
to x
∗
qlinearly.
6.5.8. In the DASPK code for integration of diﬀerential algebraic equations,
[23], preconditioned NewtonGMRES is implemented in such a way
that the data needed for the preconditioner is only computed when
needed, which may be less often that with each outer iteration. Discuss
this strategy and when it might be useful. Is there a relation to the
Shamanskii method?
6.5.9. Duplicate the results in ¸ 6.4. For the Hequation experiment with
various values of c, such as c = .5, .99999, .999999, and for the convection
diﬀusion equation various values of C such as C = 1, 10, 40 and how
preconditioning aﬀects the iteration. Experiment with the sequence of
forcing terms ¦η
n
¦. How do the choices η
n
= 1/(n + 1), η
n
= 10
−4
[32], η
n
= 2
1−n
[24], η
n
= min(F(x
n
)
2
, (n + 2)
−1
) [56], η
n
= .05, and
η
n
= .75 aﬀect the results? Present the results of your experiments as
graphical iteration histories.
6.5.10. For the Hequation example in ¸ 6.4, vary the parameter c and see how
the performance of NewtonGMRES with the various choices of ¦η
n
¦ is
aﬀected. What happens when c = 1? See [119] for an explanation of
this.
6.5.11. Are the converged solutions for the preconditioned and unpreconditioned
convectiondiﬀusion example in ¸ 6.4 equally accurate?
6.5.12. For the convectiondiﬀusion example in ¸ 6.4, how is the performance
aﬀected if GMRES(m) is used instead of GMRES?
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
INEXACT NEWTON METHODS 111
6.5.13. Apply NewtonCGNR, NewtonBiCGSTAB, or NewtonTFQMR to the
two examples in ¸ 6.4. How will you deal with the need for a transpose in
the case of CGNR? How does the performance (in terms of time, ﬂoating
point operations, function evaluations) compare to NewtonGMRES?
Experiment with the forcing terms. The MATLAB codes fdkrylov,
which allows you to choose from a variety of forward diﬀerence Krylov
methods, and nsola, which we describe more fully in Chapter 8, might
be of use in this exercise.
6.5.14. If you have done Exercise 5.7.25, apply nsolgm to the same twopoint
boundary value problem and compare the performance.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
112 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
Chapter 7
Broyden’s method
QuasiNewton methods maintain approximations of the solution x
∗
and the
Jacobian at the solution F
(x
∗
) as the iteration progresses. If x
c
and B
c
are
the current approximate solution and Jacobian, then
x
+
= x
c
−B
−1
c
F(x
c
). (7.1)
After the computation of x
+
, B
c
is updated to form B
+
. The construction of
B
+
determines the quasiNewton method.
The advantages of such methods, as we shall see in ¸ 7.3, is that solution of
equations with the quasiNewton approximation to the Jacobian is often much
cheaper than using F
(x
n
) as the coeﬃcient matrix. In fact, if the Jacobian is
dense, the cost is little more than that of the chord method. The method we
present in this chapter, Broyden’s method, is locally superlinearly convergent,
and hence is a very powerful alternative to Newton’s method or the chord
method.
In comparison with Newtoniterative methods, quasiNewton methods
require only one function evaluation for each nonlinear iterate; there is no cost
in function evaluations associated with an inner iteration. Hence, if a good
preconditioner (initial approximation to F
(x
∗
)) can be found, these methods
could have an advantage in terms of function evaluation cost over Newton
iterative methods.
Broyden’s method [26] computes B
+
by
B
+
= B
c
+
(y −B
c
s)s
T
s
T
s
= B
c
+
F(x
+
)s
T
s
T
s
. (7.2)
In (7.2) y = F(x
+
) −F(x
c
) and s = x
+
−x
c
.
Broyden’s method is an example of a secant update. This means that the
updated approximation to F
(x
∗
) satisﬁes the secant equation
B
+
s = y. (7.3)
In one dimension, (7.3) uniquely speciﬁes the classical secant method. For
equations in several variables (7.3) alone is a system of N equations in N
2
113
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
114 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
unknowns. The role of the secant equation is very deep. See [62] and [64] for
discussion of this topic. In this book we focus on Broyden’s method and our
methods of analysis are not as general as those in [62] and [64]. In this chapter
we will assume that the function values are accurate and refer the reader to [66],
[65], and [114] for discussions of quasiNewton methods with inaccurate data.
Locally superlinearly convergent secant methods can be designed to maintain
the sparsity pattern, symmetry, or positive deﬁniteness, of the approximate
Jacobian, and we refer the reader to [28], [63], [62], and [64] for discussion and
analysis of several such methods. More subtle structural properties can also
be preserved by secant methods. Some examples can be found in [101], [96],
[95], [113], and [116].
Broyden’s method is also applicable to linear equations [29], [82], [84],
[104], [143], [130], Ax = b, where B ≈ A is updated. One can also apply the
method to linear least squares problems [84], [104]. The analysis is somewhat
clearer in the linear case and the plan of this chapter is to present results for
the linear case for both the theory and the examples. One interesting result,
which we will not discuss further, is that for linear problems Broyden’s method
converges in 2N iterations [29], [82], [84], [143].
We will express our results in terms of the error in the Jacobian
E = B −F
(x
∗
) (7.4)
and the step s = x
+
−x
c
. While we could use E = B−F
(x), (7.4) is consistent
with our use of e = x − x
∗
. When indexing of the steps is necessary, we will
write
s
n
= x
n+1
−x
n
.
In this chapter we show that if the data x
0
and B
0
are suﬃciently good,
the Broyden iteration will converge qsuperlinearly to the root.
7.1. The Dennis–Mor´e condition
In this section we consider general iterative methods of the form
x
n+1
= x
n
−B
−1
n
F(x
n
) (7.5)
where B
n
= F
(x
∗
)+E
n
≈ F
(x
∗
) is generated by some method (not necessarily
Broyden’s method).
Veriﬁcation of qsuperlinear convergence cannot be done by the direct
approaches used in Chapters 5 and 6. We must relate the superlinear
convergence condition that η
n
→ 0 in Theorem 6.1.2 to a more subtle, but
easier to verify condition in order to analyze Broyden’s method. This technique
is based on veriﬁcation of the Dennis–Mor´e condition [61], [60] on the sequence
of steps ¦s
n
¦ and errors in the Jacobian ¦E
n
¦
lim
n→∞
E
n
s
n

s
n

= 0. (7.6)
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
BROYDEN’S METHOD 115
The main result in this section, Theorem 7.1.1 and its corollary for linear
problems, are used to prove superlinear convergence for Broyden’s method in
this chapter and many other quasiNewton methods. See [61], [28], [62], and
[64] for several more examples. Our formulation of the Dennis–Mor´e result is
a bit less general than that in [60] or [63].
Theorem 7.1.1. Let the standard assumptions hold, let ¦B
n
¦ be a sequence
of nonsingular N N matrices, let x
0
∈ R
N
be given and let ¦x
n
¦
∞
n=1
be given
by (7.5). Assume that x
n
,= x
∗
for any n. Then x
n
→ x
∗
qsuperlinearly if
and only if x
n
→ x
∗
and the Dennis–Mor´e condition (7.6) holds.
Proof. Since
−F(x
n
) = B
n
s
n
= F
(x
∗
)s
n
+E
n
s
n
we have
E
n
s
n
= −F
(x
∗
)s
n
−F(x
n
) = −F
(x
∗
)e
n+1
+F
(x
∗
)e
n
−F(x
n
). (7.7)
We use the fundamental theorem of calculus and the standard assumptions to
obtain
F
(x
∗
)e
n
−F(x
n
) =
_
1
0
(F
(x
∗
) −F
(x
∗
+te
n
))e
n
dt
and hence
F
(x
∗
)e
n
−F(x
n
) ≤ γe
n

2
/2.
Therefore, by (7.7)
E
n
s
n
 ≤ F
(x
∗
)e
n+1
 +γe
n

2
/2. (7.8)
Now, if x
n
→ x
∗
qsuperlinearly, then for n suﬃciently large
s
n
/2 ≤ e
n
 ≤ 2s
n
. (7.9)
The assumption that x
n
,= x
∗
for any n implies that the sequence
ν
n
= e
n+1
/e
n
 (7.10)
is deﬁned and and ν
n
→ 0 by superlinear convergence of ¦x
n
¦. By (7.9) we
have
e
n+1
 = ν
n
e
n
 ≤ 2ν
n
s
n
,
and hence (7.8) implies that
E
n
s
n
 ≤ (2F
(x
∗
)ν
n
+γe
n
)s
n

which implies the DennisMor´e condition (7.6).
Conversely, assume that x
n
→ x
∗
and that (7.6) holds. Let
µ
n
=
E
n
s
n

s
n

.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
116 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
We have, since
E
n
s
n
= (B
n
−F
(x
∗
))s
n
= −F(x
n
) −F
(x
∗
)s
n
, (7.11)
s
n
F
(x
∗
)
−1

−1
≤ F
(x
∗
)s
n
 ≤ E
n
s
n
 +F(x
n
)
= µ
n
s
n
 +F(x
n
).
(7.12)
Since µ
n
→ 0 by assumption,
µ
n
≤ F
(x
∗
)
−1

−1
/2, (7.13)
for n suﬃciently large. We assume now that n is large enough so that (7.13)
holds. Hence,
s
n
 ≤ 2F
(x
∗
)
−1
F(x
n
). (7.14)
Moreover, using the standard assumptions and (7.11) again,
F(x
n
) +F
(x
n
)s
n
 ≤ F(x
n
) +F
(x
∗
)s
n
 +(F
(x
∗
) −F
(x
n
))s
n

≤ (µ
n
+γe
n
)s
n
.
(7.15)
Since x
n
→ x
∗
by assumption, η
n
→ 0 where
η
n
= 2F
(x
∗
)
−1
(µ
n
+γe
n
).
Combining (7.14) and (7.15) implies that
F(x
n
) +F
(x
n
)s
n
 ≤ η
n
F(x
n
), (7.16)
which is the inexact Newton condition. Since η
n
→ 0, the proof is complete
by Theorem 6.1.2.
7.2. Convergence analysis
Our convergence analysis is based on an approach to inﬁnitedimensional
problems from [97]. That paper was the basis for [168], [104], and [115], and
we employ notation from all four of these references. We will use the l
2
norm
throughout our discussion. We begin with a lemma from [104].
Lemma 7.2.1. Let 0 <
ˆ
θ < 1 and let
¦θ
n
¦
∞
n=0
⊂ (
ˆ
θ, 2 −
ˆ
θ).
Let ¦
n
¦
∞
n=0
⊂ R
N
be such that
n

n

2
< ∞,
and let ¦η
n
¦
∞
n=0
be a set of vectors in R
N
such that η
n

2
is either 1 or 0 for
all n. Let ψ
0
∈ R
N
be given. If ¦ψ
n
¦
∞
n=1
is given by
ψ
n+1
= ψ
n
−θ
n
(η
T
n
ψ
n
)η
n
+
n
(7.17)
then
lim
n→∞
η
T
n
ψ
n
= 0. (7.18)
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
BROYDEN’S METHOD 117
Proof. We ﬁrst consider the case in which
n
= 0 for all n. In that case, we
can use the fact that
θ
n
(2 −θ
n
) >
ˆ
θ
2
> 0
to show that the sequence ¦ψ
n
¦ is bounded in l
2
norm by ψ
0

2
and, in fact,
ψ
n+1

2
2
≤ ψ
n

2
2
−θ
n
(2 −θ
n
)(η
T
n
ψ
n
)
2
≤ ψ
n

2
2
−
ˆ
θ
2
(η
T
n
ψ
n
)
2
≤ ψ
n

2
2
.
Therefore, for any M > 0,
M
n=0
(η
T
n
ψ
n
)
2
≤
ψ
0

2
2
−ψ
M+1

2
2
ˆ
θ
2
≤
ψ
0

2
ˆ
θ
2
.
We let M → ∞ to obtain
∞
n=0
(η
T
n
ψ
n
)
2
< ∞. (7.19)
Convergence of the series in (7.19) implies convergence to zero of the terms in
the series, which is (7.18).
To prove the result for
n
,= 0 we use the inequality
_
a
2
−b
2
≤ a −
b
2
2a
, (7.20)
which is valid for a > 0 and [b[ ≤ a. This inequality is used often in the analysis
of quasiNewton methods [63]. From (7.20) we conclude that if ψ
n
,= 0 then
ψ
n
−θ
n
(η
T
n
ψ
n
)η
n

2
≤
_
ψ
n

2
2
−θ
n
(2 −θ
n
)(η
T
n
ψ
n
)
2
≤ ψ
n

2
−
θ
n
(2 −θ
n
)(η
T
n
ψ
n
)
2
2ψ
n

2
.
Hence if ψ
n
,= 0
ψ
n+1

2
≤ ψ
n

2
−
θ
n
(2 −θ
n
)(η
T
n
ψ
n
)
2
2ψ
n

2
+
n

2
. (7.21)
Hence
(η
T
n
ψ
n
)
2
≤
2ψ
n

2
θ
n
(2 −θ
n
)
(ψ
n

2
−ψ
n+1

2
+
n

2
), (7.22)
which holds even if ψ
n
= 0.
From (7.21) we conclude that
ψ
n+1

2
≤ µ,
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
118 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
where
µ =
∞
i=0

i

2
+ψ
0

2
.
Hence
M
n=0
(η
T
n
ψ
n
)
2
≤
2µ
ˆ
θ
2
M
n=0
(ψ
n

2
−ψ
n+1

2
+
n

2
)
=
2µ
ˆ
θ
2
_
ψ
0

2
−ψ
M+1

2
+
M
n=0

n

2
_
≤
2µ
2
ˆ
θ
2
.
Hence (7.19) holds and the proof is complete.
7.2.1. Linear problems. We may use Lemma 7.2.1 to prove a result from
[130] on the convergence of Broyden’s method for linear problems. The role of
the parameter
ˆ
θ and the sequence ¦θ
n
¦ in Lemma 7.2.1 is nicely illustrated by
this application of the lemma.
In this section F(x) = Ax−b and B
n
= A+E
n
. The standard assumptions
in this case reduce to nonsingularity of A. Our ﬁrst task is to show that
the errors in the Broyden approximations to A do not grow. This property,
called bounded deterioration [28], is an important part of the convergence
analysis. The reader should compare the statement of the result with that
in the nonlinear case.
In the linear case, the Broyden update has a very simple form. We will
consider a modiﬁed update
B
+
= B
c
+θ
c
(y −B
c
s)s
T
s
T
s
= B
c
+θ
c
(Ax
+
−b)s
T
s
T
s
. (7.23)
If x
c
and B
c
are the current approximations to x
∗
= A
−1
b and A and
s = x
+
−x
c
then
y −B
c
s = (Ax
+
−Ax
c
) −B
c
(x
+
−x
c
) = −E
c
s
and therefore
B
+
= B
c
+θ
c
(y −B
c
s)s
T
s
2
2
= B
c
−θ
c
(E
c
s)s
T
s
2
2
. (7.24)
Lemma 7.2.2. Let A be nonsingular, θ
c
∈ [0, 2], and x
c
∈ R
N
be given. Let
B
c
be nonsingular and let B
+
be formed by the Broyden update (7.23). Then
E
+

2
≤ E
c

2
. (7.25)
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
BROYDEN’S METHOD 119
Proof. By subtracting A from both sides of (7.24) we have
E
+
= E
c
−θ
c
(E
c
s)s
T
s
2
2
= E
c
(I −θ
c
P
s
),
(7.26)
where P
s
is the orthogonal projector
P
s
=
ss
T
s
2
2
. (7.27)
This completes the proof as
E
+

2
≤ E
c

2
I −θ
c
P
s

2
and I −θ
c
P
s

2
≤ 1 by orthogonality of P
s
and the fact that 0 ≤ θ
c
≤ 2.
Now, θ
c
can always be selected to make B
+
nonsingular. In [130] one
suggestion was
θ
c
=
_
_
_
1, [γ
c
[ ≥ σ,
1 −sign(γ
c
)σ
1 −γ
c
, otherwise,
where
γ
c
=
(B
−1
c
y)
T
s
s
2
2
=
(B
−1
c
As)
T
s
s
2
2
and σ ∈ (0, 1) is ﬁxed. However the results in [130] assume only that the
sequence ¦θ
n
¦ satisﬁes the hypotheses of Lemma 7.2.1 for some
ˆ
θ ∈ (0, 1) and
that θ
c
is always chosen so that B
+
is nonsingular.
For linear problems one can show that the Dennis–Mor´e condition implies
convergence and hence the assumption that convergence takes place is not
needed. Speciﬁcally
Proposition 7.2.1. Let A be nonsingular. Assume that ¦θ
n
¦ satisﬁes
the hypotheses of Lemma 7.2.1 for some
ˆ
θ ∈ (0, 1). If ¦θ
n
¦ is such that
the matrices B
n
obtained by (7.23) are nonsingular and ¦x
n
¦ is given by the
modiﬁed Broyden iteration
x
n+1
= x
n
−B
−1
n
(Ax
n
−b) (7.28)
then the Dennis–Mor´e condition implies convergence of ¦x
n
¦ to x
∗
= A
−1
b.
We leave the proof as Exercise 7.5.1.
The superlinear convergence result is remarkable in that the initial iterate
need not be near the root.
Theorem 7.2.1. Let A be nonsingular. Assume that ¦θ
n
¦ satisﬁes the
hypotheses of Lemma 7.2.1 for some
ˆ
θ ∈ (0, 1). If ¦θ
n
¦ is such the matrices B
n
obtained by (7.23) are nonsingular then the modiﬁed Broyden iteration (7.28)
converges qsuperlinearly to x
∗
= A
−1
b for every initial iterate x
0
.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
120 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
Proof. Our approach is to show that for every φ ∈ R
N
that
lim
n→∞
φ
T
_
E
n
s
n
s
n

2
_
= 0. (7.29)
Assuming that (7.29) holds, setting φ = e
j
, the unit vector in the jth
coordinate direction, shows that the j component of
E
n
s
n
s
n

2
has limit zero.
As j was arbitrary, we have
lim
n→∞
E
n
s
n
s
n

2
= 0
which is the Dennis–Mor´e condition. This will imply qsuperlinear convergence
by Theorem 7.1.1 and Proposition 7.2.1.
Let φ ∈ R
N
be given. Let
P
n
=
s
n
s
T
n
s
n

2
2
.
Since for all v ∈ R
N
and all n
v
T
(E
T
n
φ) = φ
T
(E
n
v)
and, by (7.26)
E
T
n+1
φ = (I −θ
n
P
n
)E
T
n
φ,
we may invoke Lemma 7.2.1 with
η
n
= s
n
/s
n

2
, ψ
n
= E
T
n
φ, and
n
= 0
to conclude that
η
T
n
ψ
n
=
(E
T
n
φ)
T
s
n
s
n

2
= φ
T
E
n
s
n
s
n

2
→ 0. (7.30)
This completes the proof.
The result here is not a local convergence result. It is global in the sense
that qsuperlinear convergence takes place independently of the initial choices
of x and B. It is known [29], [82], [84], [143], that the Broyden iteration will
terminate in 2N steps in the linear case.
7.2.2. Nonlinear problems. Our analysis of the nonlinear case is diﬀerent
from the classical work [28], [63]. As indicated in the preface, we provide a
proof based on ideas from [97], [115], and [104], that does not use the Frobenius
norm and extends to various inﬁnitedimensional settings.
For nonlinear problems our analysis is local and we set θ
n
= 1. However
we must show that the sequence ¦B
n
¦ of Broyden updates exists and remains
nonsingular. We make a deﬁnition that allows us to compactly express this
fact.
Definition 7.2.1. We say that the Broyden sequence (¦x
n
¦, ¦B
n
¦) for the
data (F, x
0
, B
0
) exists if F is deﬁned at x
n
for all n, B
n
is nonsingular for all
n, and x
n+1
and B
n+1
can be computed from x
n
and B
n
using (7.1) and (7.2).
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
BROYDEN’S METHOD 121
We will show that if the standard assumptions hold and x
0
and B
0
are
suﬃciently good approximations to x
∗
and F
(x
∗
) then the Broyden sequence
exists for the data (F, x
0
, B
0
) and that the Broyden iterates converge q
superlinearly to x
∗
. The development of the proof is complicated. We ﬁrst
prove a bounded deterioration result like Lemma 7.2.2. However in the
nonlinear case, the errors in the Jacobian approximation can grow as the
iteration progresses. We then show that if the initial data is suﬃciently good,
this growth can be limited and therefore the Broyden sequence exists and
the Broyden iterates converges to x
∗
at least qlinearly. Finally we verify the
Dennis–Mor´e condition (7.6), which, together with the qlinear convergence
proved earlier completes the proof of local qsuperlinear convergence.
Bounded deterioration. Our ﬁrst task is to show that bounded deteriora
tion holds. Unlike the linear case, the sequence ¦E
n
¦ need not be monotone
nonincreasing. However, the possible increase can be bounded in terms of the
errors in the current and new iterates.
Theorem 7.2.2. Let the standard assumptions hold. Let x
c
∈ Ω and a
nonsingular matrix B
c
be given. Assume that
x
+
= x
c
−B
−1
c
F(x
c
) = x
c
+s ∈ Ω
and B
+
is given by (7.2). Then
E
+

2
≤ E
c
 +γ(e
c

2
+e
+

2
)/2 (7.31)
Proof. Note that (7.7) implies that
E
c
s = −F(x
+
) + (F(x
+
) −F(x
c
) −F
(x
∗
)s)
= −F(x
+
) +
_
1
0
(F
(x
c
+ts) −F
(x
∗
))s dt.
(7.32)
We begin by writing (7.32) as
F(x
+
) = −E
c
s +
_
1
0
(F
(x
c
+ts) −F
(x
∗
))s dt
and then using (7.2) to obtain
E
+
= E
c
(I −P
s
) +
(∆
c
s)s
T
s
2
2
where P
s
is is given by (7.27) and
∆
c
=
_
1
0
(F
(x
c
+ts) −F
(x
∗
)) dt.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
122 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
Hence, by the standard assumptions
∆
c

2
≤ γ(e
c

2
+e
+

2
)/2.
Just as in the proof of Lemma 7.2.2 for the linear case
E
+

2
≤ E
c

2
+∆
c

2
and the proof is complete.
Local qlinear convergence. Theorem 7.2.3 states that Broyden’s method
is locally qlinearly convergent. The proof is more complex than similar results
in Chapters 5 and 6. We must explore the relation between the size of E
0

needed to enforce qlinear convergence and the qfactor.
Theorem 7.2.3. Let the standard assumptions hold. Let r ∈ (0, 1) be
given. Then there are δ and δ
B
such that if x
0
∈ B(δ) and E
0

2
< δ
B
the
Broyden sequence for the data (F, x
0
, B
0
) exists and x
n
→ x
∗
qlinearly with
qfactor at most r.
Proof. Let δ and δ
1
be small enough so that the conclusions of Theo
rem 5.4.3 hold. Then, reducing δ and δ
1
further if needed, the qfactor is at
most
K
A
(δ +δ
1
) ≤ r,
where K
A
is from the statement of Theorem 5.4.3.
We will prove the result by choosing δ
B
and reducing δ if necessary so that
E
0

2
≤ δ
B
will imply that E
n

2
≤ δ
1
for all n, which will prove the result.
To do this reduce δ if needed so that
δ
2
=
γ(1 +r)δ
2(1 −r)
< δ
1
(7.33)
and set
δ
B
= δ
1
−δ
2
. (7.34)
We show that E
n

2
≤ δ
1
by induction. Since E
0

2
< δ
B
< δ
1
we may
begin the induction. Now assume that E
k

2
< δ
1
for all 0 ≤ k ≤ n. By
Theorem 7.2.2,
E
n+1

2
≤ E
n

2
+γ(e
n+1

2
+e
n

2
)/2.
Since E
n

2
< δ
1
, e
n+1

2
≤ re
n

2
and therefore
E
n+1

2
≤ E
n

2
+γ(1 +r)e
n

2
/2 ≤ E
n

2
+γ(1 +r)r
n
δ/2.
We can estimate E
n

2
, E
n−1

2
, . . . in the same way to obtain
E
n+1

2
≤ E
0

2
+
(1 +r)γδ
2
n
j=0
r
j
≤ δ
B
+
(1 +r)γδ
2(1 −r)
≤ δ
1
,
which completes the proof.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
BROYDEN’S METHOD 123
Veriﬁcation of the Dennis–Mor´e condition. The ﬁnal task is to verify
the Dennis–Mor´e condition.
Theorem 7.2.4. Let the standard assumptions hold. Then there are δ and
δ
B
such that if x
0
∈ B(δ) and E
0

2
< δ
B
the Broyden sequence for the data
(F, x
0
, B
0
) exists and x
n
→ x
∗
qsuperlinearly.
Proof. Let δ and δ
B
be such that the conclusions of Theorem 7.2.3 hold.
By Theorem 7.1.1 we need only show that Dennis–Mor´e condition (7.6) holds
to complete the proof.
As in the proof of Theorem 7.2.1 let
P
n
=
s
n
s
T
n
s
n

2
2
.
Set
∆
n
=
_
1
0
(F
(x
n
+ts
n
) −F
(x
∗
)) dt.
Let φ ∈ R
N
be arbitrary. Note that
E
T
n+1
φ = (I −P
n
)E
T
n
φ +P
n
∆
T
n
φ.
We wish to apply Lemma 7.2.1 with
ψ
n
= E
T
n
φ, η
n
= s
n
/s
n

2
, and
n
= P
n
∆
T
n
φ.
The hypothesis in the lemma that
n

n

2
< ∞
holds since Theorem 7.2.3 and the standard assumptions imply
∆
n

2
≤ γ(1 +r)r
n
δ/2.
Hence Lemma 7.2.1 implies that
η
T
n
ψ
n
=
(E
T
n
φ)
T
s
n
s
n

2
= φ
T
E
n
s
n
s
n

2
→ 0.
This, as in the proof of Theorem 7.2.1, implies the Dennis–Mor´e condition and
completes the proof.
7.3. Implementation of Broyden’s method
The implementation of Broyden’s method that we advocate here is related
to one that has been used in computational ﬂuid mechanics [73]. Some
such methods are called limited memory formulations in the optimization
literature [138], [142], [31]. In these methods, after the storage available for
the iteration is exhausted, the oldest of the stored vectors is replaced by the
most recent. Another approach, restarting, clears the storage and starts over.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
124 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
This latter approach is similar to the restarted GMRES iteration. Our basic
implementation is a nonlinear version of the implementation for linear problems
in [67]. We use a restarted approach, though a limited memory approach could
be used as well.
We begin by showing how the initial approximation to F
(x
∗
), B
0
, may be
regarded as a preconditioner and incorporated into the deﬁntion of F just as
preconditioners were in the implementation of NewtonGMRES in ¸ 6.4.
Lemma 7.3.1. Assume that the Broyden sequence (¦x
n
¦, ¦B
n
¦) for the data
(F, x
0
, B
0
) exists. Then the Broyden sequence (¦y
n
¦, ¦C
n
¦) exists for the data
(B
−1
0
F, x
0
, I) and
x
n
= y
n
, B
n
= B
0
C
n
(7.35)
for all n.
Proof. We proceed by induction. Equation (7.35) holds by deﬁnition when
n = 0. If (7.35) holds for a given n, then
y
n+1
= y
n
−C
−1
n
B
−1
0
F(y
n
)
= x
n
−(B
0
C
n
)
−1
F(x
n
) = x
n
−B
−1
n
F(x
n
)
= x
n+1
.
(7.36)
Now s
n
= x
n+1
−x
n
= y
n+1
−y
n
by (7.36). Hence
B
n+1
= B
n
+
F(x
n+1
)s
T
n
s
n

2
2
= B
0
C
n
+B
0
B
−1
0
F(y
n+1
)s
T
n
s
n

2
2
= B
0
C
n+1
.
This completes the proof.
The consequence of this proof for implementation is that we can assume
that B
0
= I. If a better choice for B
0
exists, we can incorporate that into the
function evaluation. Our plan will be to store B
−1
n
in a compact and easy to
apply form. The basis for this is the following simple formula [68], [176], [177],
[11].
Proposition 7.3.1. Let B be a nonsingular N N matrix and let
u, v ∈ R
N
. Then B + uv
T
is invertible if and only if 1 + v
T
B
−1
u ,= 0. In
this case,
(B +uv
T
)
−1
=
_
I −
(B
−1
u)v
T
1 +v
T
B
−1
u
_
B
−1
. (7.37)
The expression (7.37) is called the Sherman–Morrison formula. We leave
the proof to Exercise 7.5.2.
In the context of a sequence of Broyden updates ¦B
n
¦ we have for n ≥ 0,
B
n+1
= B
n
+u
n
v
T
n
,
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
BROYDEN’S METHOD 125
where
u
n
= F(x
n+1
)/s
n

2
and v
n
= s
n
/s
n

2
.
Setting
w
n
= (B
−1
n
u
n
)/(1 +v
T
n
B
−1
n
u
n
)
we see that, if B
0
= I,
B
−1
n
= (I −w
n−1
v
T
n−1
)(I −w
n−2
v
T
n−2
) . . . (I −w
0
v
T
0
)
=
n−1
j=0
(I −w
j
v
T
j
).
(7.38)
Since the empty matrix product is the identity, (7.38) is valid for n ≥ 0.
Hence the action of B
−1
n
on F(x
n
) (i.e., the computation of the Broyden
step) can be computed from the 2n vectors ¦w
j
, v
j
¦
n−1
j=0
at a cost of O(Nn)
ﬂoatingpoint operations. Moreover, the Broyden step for the following
iteration is
s
n
= −B
−1
n
F(x
n
) = −
n−1
j=0
(I −w
j
v
T
j
)F(x
n
). (7.39)
Since the product
n−2
j=0
(I −w
j
v
T
j
)F(x
n
)
must also be computed as part of the computation of w
n−1
we can combine
the computation of w
n−1
and s
n
as follows:
w =
n−2
j=0
(I −w
j
v
T
j
)F(x
n
)
w
n−1
= C
w
w where C
w
= (s
n−1

2
+v
T
n−1
w)
−1
s
n
= −(I −w
n−1
v
T
n−1
)w.
(7.40)
One may carry this one important step further [67] and eliminate the need
to store the sequence ¦w
n
¦. Note that (7.40) implies that for n ≥ 1
s
n
= −w +C
w
w(v
T
n−1
w) = w(−1 +C
w
(C
−1
w
−s
n−1

2
))
= −C
w
ws
n−1

2
= −s
n−1

2
w
n−1
.
Hence, for n ≥ 0,
w
n
= −s
n+1
/s
n

2
. (7.41)
Therefore, one need only store the steps ¦s
n
¦ and their norms to construct
the sequence ¦w
n
¦. In fact, we can write (7.38) as
B
−1
n
=
n−1
j=0
_
I +
s
j+1
s
T
j
s
j

2
2
_
. (7.42)
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
126 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
We cannot use (7.42) and (7.39) directly to compute s
n+1
because s
n+1
appears
on both sides of the equation
s
n+1
= −
_
I +
s
n+1
s
T
n
s
n

2
2
_
n−1
j=0
_
I +
s
j+1
s
T
j
s
j

2
2
_
F(x
n+1
)
= −
_
I +
s
n+1
s
T
n
s
n

2
2
_
B
−1
n
F(x
n+1
).
(7.43)
Instead, we solve (7.43) for s
n+1
to obtain
s
n+1
= −
B
−1
n
F(x
n+1
)
1 +s
T
n
B
−1
n
F(x
n+1
)/s
n

2
2
. (7.44)
By Proposition 7.3.1, the denominator in (7.44)
1 +s
T
n
B
−1
n
F(x
n+1
)/s
n

2
2
= 1 +v
T
n
B
−1
n
u
n
is nonzero unless B
n+1
is singular.
The storage requirement, of n +O(1) vectors for the nth iterate, is of the
same order as GMRES, making Broyden’s method competitive in storage with
GMRES as a linear solver [67]. Our implementation is similar to GMRES(m),
restarting with B
0
when storage is exhausted.
We can use (7.42) and (7.44) to describe an algorithm. The termination
criteria are the same as for Algorithm nsol in ¸ 5.6. We use a restarting
approach in the algorithm, adding to the input list a limit nmax on the number
of Broyden iterations taken before a restart and a limit maxit on the number
of nonlinear iterations. Note that B
0
= I is implicit in the algorithm. Our
looping convention is that the loop in step 2(e)ii is skipped if n = 0; this is
consistent with setting the empty matrix product to the identity. Our notation
and algorithmic framework are based on [67].
Algorithm 7.3.1. brsol(x, F, τ, maxit, nmax)
1. r
0
= F(x)
2
, n = −1,
s
0
= −F(x), itc = 0.
2. Do while itc < maxit
(a) n = n + 1; itc = itc + 1
(b) x = x +s
n
(c) Evaluate F(x)
(d) If F(x)
2
≤ τ
r
r
0
+τ
a
exit.
(e) if n < nmax then
i. z = −F(x)
ii. for j = 0, n −1
z = z +s
j+1
s
T
j
z/s
j

2
2
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
BROYDEN’S METHOD 127
iii. s
n+1
= z/(1 −s
T
n
z/s
n

2
2
)
(f) if n = nmax then
n = −1; s = −F(x);
If n < nmax, then the nth iteration of brsol, which computes x
n
and
s
n+1
, requires O(nN) ﬂoatingpoint operations and storage of n + 3 vectors,
the steps ¦s
j
¦
n
j=0
, x, z, and F(x), where z and F(x) may occupy the same
storage location. This requirement of one more vector than the algorithm
in [67] needed is a result of the nonlinearity. If n = nmax, the iteration
will restart with x
nmax
, not compute s
nmax+1
, and therefore need storage for
nmax+2 vectors. Our implementation of brsol in the collection of MATLAB
codes stores z and F separately in the interest of clarity and therefore requires
nmax + 3 vectors of storage.
The Sherman–Morrison approach in Algorithm brsol is more eﬃcient, in
terms of both time and storage, than the dense matrix approach proposed
in [85]. However the dense matrix approach makes it possible to detect
ill conditioning in the approximate Jacobians. If the data is suﬃciently
good, bounded deterioration implies that the Broyden matrices will be well
conditioned and superlinear convergence implies that only a few iterates will
be needed. In view of all this, we feel that the approach of Algorithm brsol
is the best approach, especially for large problems.
A MATLAB implementation of brsol is provided in the collection of
MATLAB codes.
7.4. Examples for Broyden’s method
In all the examples in this section we use the l
2
norm multiplied by 1/
√
N.
We use the MATLAB code brsol.
7.4.1. Linear problems. The theory in this chapter indicates that Broy
den’s method can be viewed as an alternative to GMRES as an iterative method
for nonsymmetric linear equations. This point has been explored in some detail
in [67], where more elaborate numerical experiments that those given here are
presented, and in several papers on inﬁnitedimensional problems as well [91],
[104], [120]. As an example we consider the linear PDE (3.33) from ¸ 3.7. We
use the same mesh, righthand side, coeﬃcients, and solution as the example
in ¸ 3.7.
We compare Broyden’s method without restarts (solid line) to Broyden’s
method with restarts every three iterates (dashed line). For linear problems,
we allow for increases in the nonlinear residual. We set τ
a
= τ
r
= h
2
where
h = 1/32. In Fig. 7.1 we present results for (3.33) preconditioned with the
Poisson solver fish2d.
Broyden’s method terminated after 9 iterations at a cost of roughly 1.5
million ﬂoating point operations, a slightly higher cost than the GMRES
solution. The restarted iteration required 24 iterations and 3.5 million ﬂoating
point operations. One can also see highly irregular performance from the
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
128 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
restarted method.
In the linear case [67] one can store only the residuals and recover the
terminal iterate upon exit. Storage of the residual, the vector z, and the
sequence of steps saves a vector over the nonlinear case.
0 5 10 15 20 25
10
3
10
2
10
1
10
0
10
1
Iterations
R
e
l
a
t
i
v
e
R
e
s
i
d
u
a
l
Fig. 7.1. Broyden’s method for the linear PDE.
7.4.2. Nonlinear problems. We consider the Hequation from Chapters 5
and 6 and a nonlinear elliptic partial diﬀerential equation. Broyden’s method
and NewtonGMRES both require more storage as iterations progress, Newton
GMRES as the inner iteration progresses and Broyden’s method as the outer
iteration progresses. The cost in function evaluations for Broyden’s method is
one for each nonlinear iteration and for NewtonGMRES one for each outer
iteration and one for each inner iteration. Hence, if function evaluations are
very expensive, the cost of a solution by Broyden’s method could be much less
than that of one by NewtonGMRES. If storage is at a premium, however,
and the number of nonlinear iterations is large, Broyden’s method could be at
a disadvantage as it might need to be restarted many times. The examples
in this section, as well as those in ¸ 8.4 illustrate the diﬀerences in the two
methods. Our general recommendation is to try both.
Chandrasekhar Hequation. As before we solve the Hequation on a 100
point mesh with c = .9 (Fig. 7.2) and c = .9999 (Fig. 7.3). We compare
Broyden’s method without restarts (solid line) to Broyden’s method with
restarts every three iterates (dashed line). For c = .9 both methods required six
iterations for convergence. The implementation without restarts required 153
thousand ﬂoatingpoint operations and the restarted method 150, a substantial
improvement over the NewtonGMRES implementation.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
BROYDEN’S METHOD 129
For c = .9999, the restarted iteration has much more trouble. The un
restarted iteration converges in 10 iterations at a cost of roughly 249 thousand
ﬂoatingpoint operations. The restarted iteration requires 18 iterations and
407 ﬂoatingpoint operations. Even so, both implementations performed bet
ter than NewtonGMRES.
0 1 2 3 4 5 6
10
8
10
7
10
6
10
5
10
4
10
3
10
2
10
1
10
0
Iterations
R
e
l
a
t
i
v
e
N
o
n
l
i
n
e
a
r
R
e
s
i
d
u
a
l
Fig. 7.2. Broyden’s method for the Hequation, c = .9.
0 2 4 6 8 10 12 14 16 18
10
7
10
6
10
5
10
4
10
3
10
2
10
1
10
0
Iterations
R
e
l
a
t
i
v
e
N
o
n
l
i
n
e
a
r
R
e
s
i
d
u
a
l
Fig. 7.3. Broyden’s method for the Hequation, c = .9999.
One should take some care in drawing general conclusions. While Broyden’s
method on this example performed better than NewtonGMRES, the storage
in NewtonGMRES is used in the inner iteration, while that in Broyden in the
nonlinear iteration. If many nonlinear iterations are needed, it may be the case
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
130 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
that Broyden’s method may need restarts while the GMRES iteration for the
linear problem for the Newton step may not.
Convectiondiﬀusion equation. We consider the nonlinear convection
diﬀusion equation from ¸ 6.4.2. We use the same mesh width h = 1/32 on
a uniform square grid. We set C = 20, u
0
= 0, and τ
r
= τ
a
= h
2
. We
preconditioned with the Poisson solver fish2d.
We allowed for an increase in the residual, which happened on the second
iterate. This is not supported by the local convergence theory for nonlinear
problems; however it can be quite successful in practice. One can do this in
brsol by setting the third entry in the parameter list to 1, as one would if the
problem were truly linear. In ¸ 8.4.3 we will return to this example.
In Fig. 7.4 we compare Broyden’s method without restarts (solid line) to
Broyden’s method with restarts every 8 iterates (dashed line). The unrestarted
Broyden iteration converges in 12 iterations at a cost of roughly 2.2 million
ﬂoatingpoint operations. This is a modest improvement over the Newton
GMRES cost (reported in ¸ 6.4.2) of 2.6 million ﬂoatingpoint operations.
However, the residual norms do not monotonically decrease in the Broyden
iteration (we ﬁx this in ¸ 8.3.2) and more storage was used.
0 2 4 6 8 10 12 14 16
10
4
10
3
10
2
10
1
10
0
10
1
Iterations
R
e
l
a
t
i
v
e
N
o
n
l
i
n
e
a
r
R
e
s
i
d
u
a
l
Fig. 7.4. Broyden’s method for nonlinear PDE.
At most ﬁve inner GMRES iterations were required. Therefore the GMRES
inner iteration for the NewtonGMRES approach needed storage for at most
10 vectors (the Krylov basis, s, x
c
, F(x
c
), F(x
c
+ hv)) to accommodate the
NewtonGMRES iteration. The Broyden iteration when restarted every n
iterates, requires storage of n+2 vectors ¦s
j
¦
n−1
j=0
, x, and z (when stored in the
same place as F(x)). Hence restarting the Broyden iteration every 8 iterations
most closely corresponds to the NewtonGMRES requirements. This approach
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
BROYDEN’S METHOD 131
took 15 iterations to terminate at a cost of 2.4 million ﬂoatingpoint operations,
but the convergence was extremely irregular after the restart.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
132 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
7.5. Exercises on Broyden’s method
7.5.1. Prove Proposition 7.2.1.
7.5.2. Prove Proposition 7.3.1.
7.5.3. Extend Proposition 7.3.1 by showing that if A is a nonsingular N N
matrix, U is N M, V is M N, then A + UV is nonsingular if and
only if I + V A
−1
U is a nonsingular M M matrix. If this is the case,
then
(A+UV )
−1
= A
−1
−A
−1
U(I +V A
−1
U)
−1
V A
−1
.
This is the Sherman–Morrison–Woodbury formula [68], [199]. See [102]
for further generalizations.
7.5.4. Duplicate the results reported in ¸ 7.4. Vary the parameters in the
equations and the number of vectors stored by Broyden’s method and
report on their eﬀects on performance. What happens in the diﬀerential
equation examples if preconditioning is omitted?
7.5.5. Solve the Hequation with Broyden’s method and c = 1. Set τ
a
= τ
r
=
10
−8
. What qfactor for linear convergence do you see? Can you account
for this by applying the secant method to the equation x
2
= 0? See [53]
for a complete explanation.
7.5.6. Try to solve the nonlinear convectiondiﬀusion equation in ¸ 6.4 with
Broyden’s method using various values for the parameter C. How does
the performance of Broyden’s method diﬀer from NewtonGMRES?
7.5.7. Modify nsol to use Broyden’s method instead of the chord method after
the initial Jacobian evaluation. How does this compare with nsol on the
examples in Chapter 5.
7.5.8. Compare the performance of Broyden’s method and NewtonGMRES on
the Modiﬁed Bratu Problem
−∇
2
u +du
x
+e
u
= 0
on (0, 1) (0, 1) with homogeneous Dirichlet boundary conditions.
Precondition with the Poisson solver fish2d. Experiment with various
mesh widths, initial iterates, and values for the convection coeﬃcient d.
7.5.9. The bad Broyden method [26], so named because of its inferior perfor
mance in practice [63], updates an approximation to the inverse of the
Jacobian at the root so that B
−1
≈ F
(x
∗
)
−1
satisﬁes the inverse secant
equation
B
−1
+
y = s (7.45)
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
BROYDEN’S METHOD 133
with the rankone update
B
−1
+
= B
−1
c
+
(s −B
−1
c
y)y
T
y
2
2
.
Show that the bad Broyden method is locally superlinearly convergent
if the standard assumptions hold. Try to implement the bad Broyden
method in a storageeﬃcient manner using the ideas from [67] and ¸ 7.3.
Does anything go wrong? See [67] for more about this.
7.5.10. The Schubert or sparse Broyden algorithm [172], [27] is a quasiNewton
update for sparse matrices that enforces both the secant equation and the
sparsity pattern. For problems with tridiagonal Jacobian, for example,
the update is
(B
+
)
ij
= (B
c
)
ij
+
(y −B
c
s)
i
s
j
i+1
k=i−1
s
2
k
for 1 ≤ i, j ≤ N and [i − j[ ≤ 1. Note that only the subdiagonal, su
perdiagonal, and main diagonal are updated. Read about the algorithm
in [172] and prove local superlinear convergence under the standard as
sumptions and the additional assumption that the sparsity pattern of B
0
is the same as that of F
(x
∗
). How would the Schubert algorithm be
aﬀected by preconditioning? Compare your analysis with those in [158],
[125], and [189].
7.5.11. Implement the bad Broyden method, apply it to the examples in ¸ 7.4,
and compare the performance to that of Broyden’s method. Discuss the
diﬀerences in implementation from Broyden’s method.
7.5.12. Implement the Schubert algorithm on an unpreconditioned discretized
partial diﬀerential equation (such as the Bratu problem from Exer
cise 7.5.8) and compare it to NewtonGMRES and Broyden’s method.
Does the relative performance of the methods change as h is decreased?
This is interesting even in one space dimension where all matrices are
tridiagonal. References [101], [93], [92], and [112], are relevant to this
exercise.
7.5.13. Use your result from Exercise 5.7.14 in Chapter 5 to numerically estimate
the qorder of Broyden’s method for some of the examples in this
section (both linear and nonlinear). What can you conclude from your
observations?
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
134 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
Chapter 8
Global Convergence
By a globally convergent algorithm we mean an algorithm with the property
that for any initial iterate the iteration either converges to a root of F or fails
to do so in one of a small number of ways. Of the many such algorithms we
will focus on the class of line search methods and the Armijo rule [3], [88],
implemented inexactly [24], [25], [70].
Other methods, such as trust region [153], [154], [181], [129], [63], [24],
[25], [173], [70], and continuation/homotopy methods [1], [2], [109], [159],
[198], can be used to accomplish the same objective. We select the line search
paradigm because of its simplicity, and because it is trivial to add a line search
to an existing locally convergent implementation of Newton’s method. The
implementation and analysis of the line search method we develop in this
chapter do not depend on whether iterative or direct methods are used to
compute the Newton step or upon how or if the Jacobian is computed and
stored. We begin with single equations, where the algorithms can be motivated
and intuition developed with a very simple example.
8.1. Single equations
If we apply Newton’s method to ﬁnd the root x
∗
= 0 of the function
f(x) = arctan(x) with initial iterate x
0
= 10 we ﬁnd that the initial iterate
is too far from the root for the local convergence theory in Chapter 5 to be
applicable. The reason for this is that f
(x) = (1 + x
2
)
−1
is small for large
x; f(x) = arctan(x) ≈ ±π/2, and hence the magnitude of the Newton step
is much larger than that of the iterate itself. We can see this eﬀect in the
sequence of iterates:
10, −138, 2.9 10
4
, −1.5 10
9
, 9.9 10
17
,
a failure of Newton’s method that results from an inaccurate initial iterate.
If we look more closely, we see that the Newton step s = −f(x
c
)/f
(x
c
) is
pointing toward the root in the sense that sx
c
< 0, but the length of s is too
large. This observation motivates the simple ﬁx of reducing the size of the step
until the size of the nonlinear residual is decreased. A prototype algorithm is
given below.
135
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
136 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
Algorithm 8.1.1. lines1(x, f, τ)
1. r
0
= [f(x)[
2. Do while [f(x)[ > τ
r
r
0
+τ
a
(a) If f
(x) = 0 terminate with failure.
(b) s = −f(x)/f
(x) (search direction)
(c) x
t
= x +s (trial point)
(d) If [f(x
t
)[ < [f(x)[ then x = x
t
(accept the step)
else
s = s/2 goto 2c (reject the step)
When we apply Algorithm lines1 to our simple example, we obtain the
sequence
10, −8.5, 4.9, −3.8, 1.4, −1.3, 1.2, −.99, .56, −0.1, 9 10
−4
, −6 10
−10
.
The quadratic convergence behavior of Newton’s method is only apparent
very late in the iteration. Iterates 1, 2, 3, and 4 required 3, 3, 2, and 2
steplength reductions. After the fourth iterate, decrease in the size of the
nonlinear residual was obtained with a full Newton step. This is typical of the
performance of the algorithms we discuss in this chapter.
We plot the progress of the iteration in Fig. 8.1. We plot
[arctan(x)/arctan(x
0
)[
as a function of the number of function evaluations with the solid line. The
outer iterations are identiﬁed with circles as in ¸ 6.4. One can see that most of
the work is in the initial phase of the iteration. In the terminal phase, where
the local theory is valid, the minimum number of two function evaluations per
iterate is required. However, even in this terminal phase rapid convergence
takes place only in the ﬁnal three outer iterations.
Note the diﬀerences between Algorithm lines1 and Algorithm nsol. One
does not set x
+
= x +s without testing the step to see if the absolute value of
the nonlinear residual has been decreased. We call this method a line search
because we search along the line segment
(x
c
, x
c
−f(x
c
)/f
(x
c
))
to ﬁnd a decrease in [f(x
c
)[. As we move from the right endpoint to the left,
some authors [63] refer to this as backtracking.
Algorithm lines1 almost always works well. In theory, however, there is
the possibility of oscillation about a solution without convergence [63]. To
remedy this and to make analysis possible we replace the test for simple
decrease with one for suﬃcient decrease. The algorithm is to compute a search
direction d which for us will be the Newton direction
d = −f(x
c
)/f
(x
c
)
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
GLOBAL CONVERGENCE 137
0 5 10 15 20 25 30 35
10
10
10
9
10
8
10
7
10
6
10
5
10
4
10
3
10
2
10
1
10
0
Function Evaluations
R
e
l
a
t
i
v
e
N
o
n
l
i
n
e
a
r
R
e
s
i
d
u
a
l
Fig. 8.1. Newton–Armijo iteration for inverse tangent.
and then test steps of the form s = λd, with λ = 2
−j
for some j ≥ 0, until
f(x +s) satisﬁes
[f(x
c
+λd)[ < (1 −αλ)[f(x
c
)[. (8.1)
The condition in (8.1) is called suﬃcient decrease of [f[. The parameter
α ∈ (0, 1) is a small, but positive, number intended to make (8.1) as easy
as possible to satisfy. We follow the recent optimization literature [63] and
set α = 10
−4
. Once suﬃcient decrease has been obtained we accept the step
s = λd. This strategy, from [3] (see also [88]) is called the Armijo rule. Note
that we must now distinguish between the step and the Newton direction,
something we did not have to do in the earlier chapters.
Algorithm 8.1.2. nsola1(x, f, τ)
1. r
0
= [f(x)[
2. Do while [f(x)[ > τ
r
r
0
+τ
a
(a) If f
(x) = 0 terminate with failure.
(b) d = −f(x)/f
(x) (search direction)
(c) λ = 1
i. x
t
= x +λd (trial point)
ii. If [f(x
t
)[ < (1 −αλ)[f(x)[ then x = x
t
(accept the step)
else
λ = λ/2 goto 2(c)i (reject the step)
This is essentially the algorithm implemented in the MATLAB code nsola.
We will analyze Algorithm nsola in ¸ 8.2. We remark here that there is
nothing critical about the reduction factor of 2 in the line search. A factor of
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
138 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
10 could well be better in situations in which small values of λ are needed for
several consecutive steps (such as our example with the arctan function). In
that event a reduction of a factor of 8 would require three passes through the
loop in nsola1 but only a single reduction by a factor of 10. This could be
quite important if the search direction were determined by an inexact Newton
method or an approximation to Newton’s method such as the chord method
or Broyden’s method.
On the other hand, reducing λ by too much can be costly as well. Taking
full Newton steps ensures fast local convergence. Taking as large a fraction as
possible helps move the iteration into the terminal phase in which full steps
may be taken and fast convergence expected. We will return to the issue of
reduction in λ in ¸ 8.3.
8.2. Analysis of the Armijo rule
In this section we extend the onedimensional algorithm in two ways before
proceeding with the analysis. First, we accept any direction that satisﬁes the
inexact Newton condition (6.1). We write this for the Armijo sequence as
F
(x
n
)d
n
+F(x
n
) ≤ η
n
F(x
n
). (8.2)
Our second extension is to allow more ﬂexibility in the reduction of λ. We
allow for any choice that produces a reduction that satisﬁes
σ
0
λ
old
≤ λ
new
≤ σ
1
λ
old
,
where 0 < σ
0
< σ
1
< 1. One such method, developed in ¸ 8.3, is to minimize
an interpolating polynomial based on the previous trial steps. The danger is
that the minimum may be too near zero to be of much use and, in fact, the
iteration may stagnate as a result. The parameter σ
0
safeguards against that.
Safeguarding is an important aspect of many globally convergent methods and
has been used for many years in the context of the polynomial interpolation
algorithms we discuss in ¸ 8.3.1 [54], [63], [86], [133], [87]. Typical values of σ
0
and σ
1
are .1 and .5. Our algorithm nsola incorporates these ideas.
While (8.2) is motivated by the Newtoniterative paradigm, the analysis
here applies equally to methods that solve the equations for the Newton step
directly (so η
n
= 0), approximate the Newton step by a quasiNewton method,
or even use the chord method provided that the resulting direction d
n
satisﬁes
(8.2). In Exercise 8.5.9 you are asked to implement the Armijo rule in the
context of Algorithm nsol.
Algorithm 8.2.1. nsola(x, F, τ, η).
1. r
0
= F(x)
2. Do while F(x) > τ
r
r
0
+τ
a
(a) Find d such that F
(x)d +F(x) ≤ ηF(x)
If no such d can be found, terminate with failure.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
GLOBAL CONVERGENCE 139
(b) λ = 1
i. x
t
= x +λd
ii. If F(x
t
) < (1 −αλ)F(x) then x = x
t
else
Choose σ ∈ [σ
0
, σ
1
]
λ = σλ
goto 2(b)i
Note that step 2a must allow for the possibility that F
is illconditioned
and that no direction can be found that satisﬁes (8.2). If, for example, step
2a were implemented with a direct factorization method, illconditioning of F
might be detected as part of the factorization.
Let ¦x
n
¦ be the iteration produced by Algorithm nsola with initial iterate
x
0
. The algorithm can fail in some obvious ways:
1. F
(x
n
) is singular for some n. The inner iteration could terminate with
a failure.
2. x
n
→ ¯ x, a local minimum of F which is not a root.
3. x
n
 → ∞.
Clearly if F has no roots, the algorithm must fail. We will see that if F has no
roots then either F
(x
n
) has a limit point which is singular or ¦x
n
¦ becomes
unbounded.
The convergence theorem is the remarkable statement that if ¦x
n
¦ and
¦F
(x
n
)
−1
¦ are bounded (thereby eliminating premature failure and local
minima of F that are not roots) and F
is Lipschitz continuous in a
neighborhood of the entire sequence ¦x
n
¦, then the iteration converges to a
root of F at which the standard assumptions hold, full steps are taken in the
terminal phase, and the convergence is qquadratic.
We begin with a formal statement of our assumption that F
is uniformly
Lipschitz continuous and bounded away from zero on the sequence ¦x
n
¦.
Assumption 8.2.1. There are r, γ, m
f
> 0 such that F is deﬁned, F
is
Lipschitz continuous with Lipschitz constant γ, and F
(x)
−1
 ≤ m
f
on the
set
Ω(¦x
n
¦, r) =
∞
_
n=0
¦x[ x −x
n
 ≤ r¦.
Assumption 8.2.1 implies that the line search phase (step 2b in Algorithm
nsola) will terminate in a number of cycles that is independent of n. We show
this, as do [25] and [70], by giving a uniform lower bound on λ
n
(assuming
that F(x
0
) ,= 0). Note how the safeguard bound σ
0
enters this result. Note
also the very modest conditions on η
n
.
Lemma 8.2.1. Let x
0
∈ R
N
and α ∈ (0, 1) be given. Assume that ¦x
n
¦ is
given by Algorithm nsola with
¦η
n
¦ ⊂ (0, ¯ η] ⊂ (0, 1 −α) (8.3)
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
140 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
and that Assumption 8.2.1 holds. Then either F(x
0
) = 0 or
λ
n
≥
¯
λ = σ
0
min
_
r
m
f
F(x
0
)
,
2(1 −α − ¯ η)
m
2
f
F(x
0
)(1 + ¯ η)
2
γ
_
. (8.4)
Proof. Let n ≥ 0. By the fundamental theorem of calculus and (8.2) we
have, for any λ ∈ [0, 1]
F(x
n
+λd
n
) = F(x
n
) +λF
(x
n
)d
n
+
_
1
0
(F
(x
n
+tλd
n
) −F
(x
n
))λd
n
dt
= (1 −λ)F(x
n
) +λξ
n
+
_
1
0
(F
(x
n
+tλd
n
) −F
(x
n
))λd
n
dt,
where, by the bound η
n
≤ ¯ η,
ξ
n
 ≤ η
n
F(x
n
) ≤ ¯ ηF(x
n
).
The step acceptance condition in nsola implies that ¦F(x
n
)¦ is a decreasing
sequence and therefore
d
n
 = F
(x
n
)
−1
(ξ
n
−F(x
n
)) ≤ m
f
(1 + ¯ η)F(x
n
).
Therefore, by Assumption 8.2.1 F
is Lipschitz continuous on the line segment
[x
n
, x
n
+λd
n
] whenever
λ ≤
¯
λ
1
= r/(m
f
F(x
0
)).
Lipschitz continuity of F
implies that if λ ≤
¯
λ
1
then
F(x
n
+λd
n
) ≤ (1 −λ)F(x
n
) +λ¯ ηF(x
n
) +
γ
2
λ
2
d
n

2
≤ (1 −λ)F(x
n
)[ +λ¯ ηF(x
n
) +
m
2
f
γ(1 + ¯ η)
2
λ
2
F(x
n
)
2
2
≤ (1 −λ + ¯ ηλ)F(x
n
) +λF(x
n
)
m
2
f
γλ(1 + ¯ η)
2
F(x
0
)
2
< (1 −αλ)F(x
n
),
which will be implied by
λ ≤
¯
λ
2
= min
_
¯
λ
1
,
2(1 −α − ¯ η)
(1 + ¯ η)
2
m
2
f
γF(x
0
)
_
.
This shows that λ can be no smaller than σ
0
¯
λ
2
, which completes the proof.
Now we can show directly that the sequence of Armijo iterates either is
unbounded or converges to a solution. Theorem 8.2.1 says even more. The
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
GLOBAL CONVERGENCE 141
sequence converges to a root at which the standard assumptions hold, so in the
terminal phase of the iteration the step lengths are all 1 and the convergence
is qquadratic.
Theorem 8.2.1. Let x
0
∈ R
N
and α ∈ (0, 1) be given. Assume that ¦x
n
¦
is given by Algorithm nsola, ¦η
n
¦ satisﬁes (8.3), ¦x
n
¦ is bounded, and that
Assumption 8.2.1 holds. Then ¦x
n
¦ converges to a root x
∗
of F at which the
standard assumptions hold, full steps are taken for n suﬃciently large, and the
convergence behavior in the ﬁnal phase of the iteration is that given by the local
theory for inexact Newton methods (Theorem 6.1.2).
Proof. If F(x
n
) = 0 for some n, then we are done because Assump
tion 8.2.1 implies that the standard assumptions hold at x
∗
= x
n
. Otherwise
Lemma 8.2.1 implies that F(x
n
) converges qlinearly to zero with qfactor at
most (1 −α
¯
λ).
The boundedness of the sequence ¦x
n
¦ implies that a subsequence ¦x
n
k
¦
converges, say to x
∗
. Since F is continuous, F(x
∗
) = 0. Eventually
[x
n
k
− x
∗
[ < r, where r is the radius in Assumption 8.2.1 and therefore the
standard assumptions hold at x
∗
.
Since the standard assumptions hold at x
∗
, there is δ such that if x ∈ B(δ),
the Newton iteration with x
0
= x will remain in B(δ) and converge q
quadratically to x
∗
. Hence as soon as x
n
k
∈ B(δ), the entire sequence, not
just the subsequence, will remain in B(δ) and converge to x
∗
. Moreover,
Theorem 6.1.2 applies and hence full steps can be taken.
At this stage we have shown that if the Armijo iteration fails to converge
to a root either the continuity properties of F or nonsingularity of F
break
down as the iteration progresses (Assumption 8.2.1 is violated) or the iteration
becomes unbounded. This is all that one could ask for in terms of robustness
of such an algorithm.
Thus far we have used for d the inexact Newton direction. It could well be
advantageous to use a chord or Broyden direction, for example. All that one
needs to make the theorems and proofs in ¸ 8.2 hold is (8.2), which is easy to
verify, in principle, via a forward diﬀerence approximation to F
(x
n
)d.
8.3. Implementation of the Armijo rule
The MATLAB code nsola is a modiﬁcation of nsolgm which provides for a
choice of several Krylov methods for computing an inexact Newton direction
and globalizes Newton’s method with the Armijo rule. We use the l
2
norm in
that code and in the numerical results reported in ¸ 8.4. If GMRES is used
as the linear solver, then the storage requirements are roughly the same as for
Algorithm nsolgm if one reuses the storage location for the ﬁnite diﬀerence
directional derivative to test for suﬃcient decrease.
In Exercise 8.5.9 you are asked to do this with nsol and, with each
iteration, numerically determine if the chord direction satisﬁes the inexact
Newton condition. In this case the storage requirements are roughly the same
as for Algorithm nsol. The iteration can be very costly if the Jacobian must be
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
142 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
evaluated and factored many times during the phase of the iteration in which
the approximate solution is far from the root.
We consider two topics in this section. The ﬁrst, polynomial interpolation
as a way to choose λ, is applicable to all methods. The second, how Broyden’s
method must be implemented, is again based on [67] and is more specialized.
Our Broyden–Armijo code does not explicitly check the inexact Newton
condition and thereby saves an evaluation of F. Evaluation of F
(x
c
)d
c
if the
full step is not accepted (here d
c
is the Broyden direction) would not only verify
the inexact Newton condition but provide enough information for a twopoint
parabolic line search. The reader is asked to pursue this in Exercise 8.5.10.
8.3.1. Polynomial line searches. In this section we consider an elabora
tion of the simple line search scheme of reduction of the steplength by a ﬁxed
factor. The motivation for this is that some problems respond well to one or
two reductions in the steplength by modest amounts (such as 1/2) and others
require many such reductions, but might respond well to a more aggressive
steplength reduction (by factors of 1/10, say). If one can model the decrease
in F
2
as the steplength is reduced, one might expect to be able to better
estimate the appropriate reduction factor. In practice such methods usually
perform better than constant reduction factors.
If we have rejected k steps we have in hand the values
F(x
n
)
2
, F(x
n
+λ
1
d
n
)
2
, . . . F(x
n
+λ
k−1
d
n
)
2
.
We can use this iteration history to model the scalar function
f(λ) = F(x
n
+λd
n
)
2
2
with a polynomial and use the minimum of that polynomial as the next
steplength. We consider two ways of doing this that use second degree
polynomials which we compute using previously computed information.
After λ
c
has been rejected and a model polynomial computed, we compute
the minimum λ
t
of that polynomial analytically and set
λ
+
=
_
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
_
σ
0
λ
c
if λ
t
< σ
0
λ
c
,
σ
1
λ
c
if λ
t
> σ
1
λ
c
,
λ
t
otherwise.
(8.5)
Twopoint parabolic model. In this approach we use values of f and f
at λ = 0 and the value of f at the current value of λ to construct a 2nd degree
interpolating polynomial for f.
f(0) = F(x
n
)
2
2
is known. Clearly f
(0) can be computed as
f
(0) = 2(F
(x
n
)
T
d
n
)
T
F(x
n
) = 2F(x
n
)
T
(F
(x
n
)d
n
). (8.6)
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
GLOBAL CONVERGENCE 143
Use of (8.6) requires evaluation of F
(x
n
)d
n
, which can be obtained by
examination of the ﬁnal residual from GMRES or by expending an additional
function evaluation to compute F
(x
n
)d
n
with a forward diﬀerence. In any
case, our approximation to f
(0) should be negative. If it is not, it may
be necessary to compute a new search direction. This is a possibility with
directions other than inexact Newton directions, such as the Broyden direction.
The polynomial p(λ) such that p, p
agree with f, f
at 0 and p(λ
c
) = f(λ
c
)
is
p(λ) = f(0) +f
(0)λ +cλ
2
,
where
c =
f(λ
c
) −f(0) −f
(0)λ
c
λ
2
c
.
Our approximation of f
(0) < 0, so if f(λ
c
) > f(0), then c > 0 and p(λ) has a
minimum at
λ
t
= −f
(0)/(2c) > 0.
We then compute λ
+
with (8.5).
Threepoint parabolic model. An alternative to the twopoint model that
avoids the need to approximate f
(0) is a threepoint model, which uses f(0)
and the two most recently rejected steps to create the parabola. The MATLAB
code parab3p implements the threepoint parabolic line search and is called
by the two codes nsola and brsola.
In this approach one evaluates f(0) and f(1) as before. If the full step is
rejected, we set λ = σ
1
and try again. After the second step is rejected, we
have the values
f(0), f(λ
c
), and f(λ
−
),
where λ
c
and λ
−
are the most recently rejected values of λ. The polynomial
that interpolates f at 0, λ
c
, λ
−
is
p(λ) = f(0) +
λ
λ
c
−λ
−
_
(λ −λ
−
)(f(λ
c
) −f(0))
λ
c
+
(λ
c
−λ)(f(λ
−
) −f(0))
λ
−
_
.
We must consider two situations. If
p
(0) =
2
λ
c
λ
−
(λ
c
−λ
−
)
(λ
−
(f(λ
c
) −f(0)) −λ
c
(f(λ
−
) −f(0)))
is positive, then we set λ
t
to the minimum of p
λ
t
= −p
(0)/p
(0)
and apply the safeguarding step (8.5) to compute λ
+
. If p
(0) ≤ 0 one could
either set λ
+
to be the minimum of p on the interval [σ
0
λ, σ
1
λ] or reject the
parabolic model and simply set λ
+
to σ
0
λ
c
or σ
1
λ
c
. We take the latter approach
and set λ
+
= σ
1
λ
c
. In the MATLAB code nsola from the collection, we
implement this method with σ
0
= .1, σ
1
= .5.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
144 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
Interpolation with a cubic polynomial has also been suggested [87], [86],
[63], [133]. Clearly, the polynomial modeling approach is heuristic. The
safeguarding and Theorem 8.2.1 provide insurance that progress will be made
in reducing F.
8.3.2. Broyden’s method. In Chapter 7 we implemented Broyden’s
method at a storage cost of one vector for each iteration. The relation
y − B
c
s = F(x
+
) was not critical to this and we may also incorporate a line
search at a cost of a bit of complexity in the implementation. As in ¸ 7.3, our
approach is a nonlinear version of the method in [67].
The diﬀerence from the development in ¸ 7.3 is that the simple relation
between the sequence ¦w
n
¦ and ¦v
n
¦ is changed. If a line search is used, then
s
n
= −λ
n
B
−1
n
F(x
n
)
and hence
y
n
−B
n
s
n
= F(x
n+1
) −(1 −λ
n
)F(x
n
). (8.7)
If, using the notation in ¸ 7.3, we set
u
n
=
y
n
−B
n
s
n
s
n

2
, v
n
=
s
n
s
n

2
, and w
n
= (B
−1
n
u
n
)/(1 +v
T
n
B
−1
n
u
n
),
we can use (8.7) and (7.38) to obtain
w
n
= λ
n
_
−d
n+1
s
n

2
+ (λ
−1
n
−1)
s
n
s
n

2
_
= λ
n
_
−s
n+1
/λ
n+1
s
n

2
+ (λ
−1
n
−1)
s
n
s
n

2
_
,
(8.8)
where
d
n+1
= −B
−1
n+1
F(x
n+1
)
is the search direction used to compute x
n+1
. Note that (8.8) reduces to (7.41)
when λ
n
= 1 and λ
n+1
= 1 (so d
n+1
= s
n+1
).
We can use the ﬁrst equality in (8.8) and the relation
d
n+1
= −
_
I −
w
n
s
T
n
s
n

2
_
B
−1
n
F(x
n+1
)
to solve for d
n+1
and obtain an analog of (7.44)
d
n+1
= −
s
n

2
2
B
−1
n
F(x
n+1
) −(1 −λ
n
)s
T
n
B
−1
n
F(x
n+1
)s
n
s
n

2
2
+λ
n
s
T
n
B
−1
n
F(x
n+1
)
(8.9)
We then use the second equality in (8.8) to construct B
−1
c
F(x
+
) as we did in
Algorithm brsol. Veriﬁcation of (8.8) and (8.9) is left to Exercise 8.5.5.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
GLOBAL CONVERGENCE 145
Algorthm brsola uses these ideas. We do not include details of the line
search or provide for restarts or a limited memory implementation. The
MATLAB implementation, provided in the collection of MATLAB codes, uses
a threepoint parabolic line search and performs a restart when storage is
exhausted like Algorithm brsol does.
It is possible that the Broyden direction fails to satisfy (8.2). In fact, there
is no guarantee that the Broyden direction is a direction of decrease for F
2
.
The MATLAB code returns with an error message if more than a ﬁxed number
of reductions in the step length are needed. Another approach would be to
compute F
(x
c
)d numerically if a full step is rejected and then test (8.2) before
beginning the line search. As a side eﬀect, an approximation of f
(0) can be
obtained at no additional cost and a parabolic line search based on f(0), f
(0),
and f(1) can be used. In Exercise 8.5.10 the reader is asked to fully develop
and implement these ideas.
Algorithm 8.3.1. brsola(x, F, τ, maxit, nmax)
1. Initialize:
F
c
= F(x) r
0
= F(x)
2
, n = −1,
d = −F(x), compute λ
0
, s
0
= λ
0
d
2. Do while n ≤ maxit
(a) n = n + 1
(b) x = x +s
n
(c) Evaluate F(x)
(d) If F(x)
2
≤ τ
r
r
0
+τ
a
exit.
(e) i. z = −F(x)
ii. for j = 0, n −1
a = −λ
j
/λ
j+1
, b = 1 −λ
j
z = z + (as
j+1
+bs
j
)s
T
j
z/s
j

2
2
(f) d = (s
n

2
2
z + (1 −λ
n
)s
T
n
zs
n
)/(s
n

2
2
−λ
n
s
T
n
z)
(g) Compute λ
n+1
with a line search.
(h) Set s
n+1
= λ
n+1
d, store s
n+1

2
and λ
n+1
.
Since d
n+1
and s
n+1
can occupy the same location, the storage require
ments of Algorithm brsola would appear to be essentially the same as those
of brsol. However, computation of the step length requires storage of both the
current point x and the trial point x + λs before a step can be accepted. So,
the storage of the globalized methods exceeds that of the local method by one
vector. The MATLAB implementation of brsola in the collection illustrates
this point.
8.4. Examples for Newton–Armijo
The MATLAB code nsola is a modiﬁcation of nsolgm that incorporates the
threepoint parabolic line search from ¸ 8.3.1 and also changes η using (6.20)
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
146 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
once the iteration is near the root. We compare this strategy, using GMRES
as the linear solver, with the constant η method as we did in ¸ 6.4 with γ = .9.
As before, in all the ﬁgures we plot F(x
n
)
2
/F(x
0
)
2
against the number
of function evaluations required by all line searches and inner and outer
iterations to compute x
n
. Counts of function evaluations corresponding to
outer iterations are indicated by circles. We base our absolute error criterion
on the norm  
2
/
√
N as we did in ¸ 6.4.2 and 7.4.
In ¸ 8.4.3 we compare the Broyden–Armijo method with NewtonGMRES
for those problems for which both methods were successful. We caution the
reader now, and will repeat the caution later, that if the initial iterate is far
from the solution, an inexact Newton method such as NewtonGMRES can
succeed in many case in which a quasiNewton method can fail because the
quasiNewton direction may not be an inexact Newton direction. However,
when both methods succeed, the quasiNewton method, which requires a single
function evaluation for each outer iteration when full steps can be taken, may
well be most eﬃcient.
8.4.1. Inverse tangent function. Since we have easy access to analytic
derivatives in this example, we can use the twopoint parabolic line search.
We compare the twopoint parabolic line search with the constant reduction
search (σ
0
= σ
1
= .5) for the arctan function. In Fig. 8.2 we plot the iteration
history corresponding to the parabolic line search with the solid line and that
for the constant reduction with the dashed line. We use x
0
= 10 as the initial
iterate with τ
r
= τ
a
= 10
−8
. The parabolic line search required 7 outer iterates
and 21 function evaluations in contrast with the constant reduction searches 11
outer iterates and 33 function evaluations. In the ﬁrst outer iteration, both line
searches take three steplength reductions. However, the parabolic line search
takes only one reduction in the next three iterations and none thereafter. The
constant reduction line search took three reductions in the ﬁrst two outer
iterations and two each in following two.
8.4.2. Convectiondiﬀusion equation. We consider a much more diﬃ
cult problem. We take (6.21) from ¸ 6.4.2,
−∇
2
u +Cu(u
x
+u
y
) = f
with the same right hand side f, initial iterate u
0
= 0, 31 31 mesh and
homogeneous Dirichlet boundary conditions on the unit square (0, 1) (0, 1)
that we considered before. Here, however, we set C = 100. This makes a
globalization strategy critical (Try it without one!). We set τ
r
= τ
a
= h
2
/10.
This is a tighter tolerance that in ¸ 6.4.2 and, because of the large value of C
and resulting illconditioning, is needed to obtain an accurate solution.
In Fig. 8.3 we show the progress of the iteration for the unpreconditioned
equation. For this problem we plot the progress of the iteration using η = .25
with the solid line and using the sequence given by (6.20) with the dashed
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
GLOBAL CONVERGENCE 147
0 5 10 15 20 25 30 35
10
12
10
10
10
8
10
6
10
4
10
2
10
0
Function Evaluations
R
e
l
a
t
i
v
e
N
o
n
l
i
n
e
a
r
R
e
s
i
d
u
a
l
Fig. 8.2. Newton–Armijo for the arctan function.
line. We set the parameters in (6.20) to γ = .9 and η
max
= .25. This problem
is very poorly conditioned and much tighter control on the inner iteration is
needed than for the other problems. The two approaches to selection of the
forcing term performed very similarly. The iterations required 75.3 (constant
η) and 75.4 (varying η) million ﬂoating point operations, 759 (constant η) and
744 (varying η) function evaluations, and 25 (constant η) and 22 (varying η)
outer iterations.
0 100 200 300 400 500 600 700 800
10
5
10
4
10
3
10
2
10
1
10
0
Function Evaluations
R
e
l
a
t
i
v
e
N
o
n
l
i
n
e
a
r
R
e
s
i
d
u
a
l
Fig. 8.3. Newton–Armijo for the PDE, C = 100.
We consider the preconditioned problem in Fig. 8.4. In the computation we
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
148 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
preconditioned (6.21) with the fast Poisson solver fish2d. In Fig. 8.4 we show
the progress of the iteration for the preconditioned equation. For this problem
we plot the progress of the iteration using η = .25 with the solid line and using
the sequence given by (6.20) with the dashed line. We set the parameters in
(6.20) to γ = .9 and η
max
= .99.
The constant η iteration terminated after 79 function evaluations, 9 outer
iterations, and roughly 13.5 million ﬂoatingpoint operations. The line search
in this case reduced the step length three times on the ﬁrst iteration and twice
on the second and third.
The iteration in which ¦η
n
¦ is given by (6.20) terminated after 70 function
evaluations, 9 outer iterations, and 12.3 million ﬂoatingpoint operations. The
line search in the nonconstant η case reduced the step length three times on
the ﬁrst iteration and once on the second. The most eﬃcient iteration, with
the forcing term given by (6.20), required at most 16 inner iterations while the
constant η approach needed at most 10.
0 10 20 30 40 50 60 70 80
10
5
10
4
10
3
10
2
10
1
10
0
Function Evaluations
R
e
l
a
t
i
v
e
N
o
n
l
i
n
e
a
r
R
e
s
i
d
u
a
l
Fig. 8.4. Newton–Armijo for the PDE, C = 100.
8.4.3. Broyden–Armijo. We found that the Broyden–Armijo line search
failed on all the unpreconditioned convection diﬀusion equation examples. In
these cases the Jacobian is not well approximated by a low rank perturbation
of the identity [112] and the ability of the inexact Newton iteration to ﬁnd a
good search direction was the important factor.
We begin by revisiting the example in ¸ 6.4.2 with C = 20. As reported
in ¸ 6.4.2, the NewtonGMRES iteration always took full steps, terminating
in 4 outer iterations, 16 function evaluations, and 2.6 million ﬂoatingpoint
operations. When compared to the Broyden’s method results in ¸ 6.4.2,
when increases in the residual were allowed, the Broyden–Armijo costs reﬂect
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
GLOBAL CONVERGENCE 149
an improvement in eﬃciency. The Broyden iteration in ¸ 6.4.2 took 12
iterations and 2.8 million ﬂoatingpoint operations while the Broyden–Armijo
took 9 iterations and required 2 million ﬂoatingpoint operations. We set
τ
a
= τ
r
= h
2
, η
max
= .5, and γ = .9 as we did in ¸ 6.4.2.
In our implementation of brsola the threepoint parabolic line search was
used. The steplength was reduced on the second iteration (twice) and the third
(once). Figure 8.5 compares the Broyden–Armijo iteration (solid line) to the
GMRES iteration (dashed line).
0 2 4 6 8 10 12 14 16
10
3
10
2
10
1
10
0
Function Evaluations
R
e
l
a
t
i
v
e
N
o
n
l
i
n
e
a
r
R
e
s
i
d
u
a
l
Fig. 8.5. NewtonGMRES and Broyden–Armijo for the PDE, C = 20.
For the more diﬃcult problem with C = 100, the performance of the
two methods is more similar. In Fig. 8.6 we compare our implementation
of Newton–GMRES–Armijo (dashed line) to Broyden–Armijo (solid line). We
set τ
a
= τ
r
= h
2
/10, η
max
= .99, and γ = .9. The Broyden–Armijo iteration
required 34 nonlinear iterations, roughly 16 million ﬂoatingpoint operations,
and 85 function evaluations. In terms of storage, the Broyden–Armijo iteration
required storage for 37 vectors while the NewtonGMRES iteration needed at
most 16 inner iterations, needing to store 21 vectors, and therefore was much
more eﬃcient in terms of storage.
Restarting the Broyden iteration every 19 iterations would equalize the
storage requirements with Newton–GMRES–Armijo. Broyden’s method suf
fers under these conditions as one can see from the comparison of Broyden–
Armijo (solid line) and Newton–GMRES–Armijo (dashed line) in Fig. 8.7. The
restarted Broyden–Armijo iteration took 19.5 million ﬂoatingpoint operations,
42 outer iterates, and 123 function evaluations.
From these examples we see that Broyden–Armijo can be very eﬃcient pro
vided only a small number of iterations are needed. The storage requirements
increase as the nonlinear iteration progresses and this fact puts Broyden’s
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
150 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
0 10 20 30 40 50 60 70 80 90
10
5
10
4
10
3
10
2
10
1
10
0
Function Evaluations
R
e
l
a
t
i
v
e
N
o
n
l
i
n
e
a
r
R
e
s
i
d
u
a
l
Fig. 8.6. NewtonGMRES and Broyden–Armijo for the PDE, C = 100.
0 20 40 60 80 100 120 140
10
5
10
4
10
3
10
2
10
1
10
0
Function Evaluations
R
e
l
a
t
i
v
e
N
o
n
l
i
n
e
a
r
R
e
s
i
d
u
a
l
Fig. 8.7. NewtonGMRES and restarted Broyden–Armijo for the PDE, C = 100.
method at a disadvantage to NewtonGMRES when the number of nonlinear
iterations is large.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
GLOBAL CONVERGENCE 151
8.5. Exercises on global convergence
8.5.1. How does the method proposed in [10] diﬀer from the one implemented in
nsola? What could be advantages and disadvantages of that approach?
8.5.2. In what ways are the results in [25] and [70] more general than those in
this section? What ideas do these papers share with the analysis in this
section and with [3] and [88]?
8.5.3. Implement the Newton–Armijo method for single equations. Apply your
code to the following functions with x
0
= 10. Explain your results.
1. f(x) = arctan(x) (i.e., Duplicate the results in ¸ 8.1.)
2. f(x) = arctan(x
2
)
3. f(x) = .9 + arctan(x)
4. f(x) = x(1 + sin(1/x))
5. f(x) = e
x
6. f(x) = 2 + sin(x)
7. f(x) = 1 +x
2
8.5.4. A numerical analyst buys a German sports car for $50,000. He puts
$10,000 down and takes a 7 year installment loan to pay the balance. If
the monthly payments are $713.40, what is the interest rate? Assume
monthly compounding.
8.5.5. Verify (8.8) and (8.9).
8.5.6. Show that if F
is Lipschitz continuous and the iteration ¦x
n
¦ produced
by Algorithm nsola converges to x
∗
with F(x
∗
) ,= 0, then F
(x
∗
) is
singular.
8.5.7. Use nsola to duplicate the results in ¸ 8.4.2. Vary the convection
coeﬃcient in the convectiondiﬀusion equation and the mesh size and
report the results.
8.5.8. Experiment with other linear solvers such as GMRES(m), BiCGSTAB,
and TFQMR. This is interesting in the locally convergent case as well.
You might use the MATLAB code nsola to do this.
8.5.9. Modify nsol, the hybrid Newton algorithm from Chapter 5, to use the
Armijo rule. Try to do it in such a way that the chord direction is used
whenever possible.
8.5.10. Modify brsola to test the Broyden direction for the descent property
and use a twopoint parabolic line search. What could you do if the
Broyden direction is not a descent direction? Apply your code to the
examples in ¸ 8.4.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
152 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
8.5.11. Modify nsola to use a cubic polynomial and constant reduction line
searches instead of the quadratic polynomial line search. Compare the
results with the examples in ¸ 8.4.
8.5.12. Does the secant method for equations in one variable always give a
direction that satisﬁes (8.2) with η
n
bounded away from 1? If not, when
does it? How would you implement a secantArmijo method in such a
way that the convergence theorem 8.2.1 is applicable?
8.5.13. Under what conditions will the iteration given by nsola converge to a
root x
∗
that is independent of the initial iterate?
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
Bibliography
[1] E. L. Allgower and K. Georg, Simplicial and continuation methods for
approximating ﬁxed points and solutions to systems of equations, SIAM Rev., 22
(1980), pp. 28–85.
[2] , Numerical Continuation Methods : An Introduction, SpringerVerlag, New
York, 1990.
[3] L. Armijo, Minimization of functions having Lipschitzcontinuous ﬁrst partial
derivatives, Paciﬁc J. Math., 16 (1966), pp. 1–3.
[4] W. E. Arnoldi, The principle of minimized iterations in the solution of the
matrix eigenvalue problem, Quart. Appl. Math., 9 (1951), pp. 17–29.
[5] S. F. Ashby, T. A. Manteuffel, and J. S. Otto, A comparison of
adaptive Chebyshev and least squares polynomial preconditioning for Hermetian
positive deﬁnite linear systems, SIAM J. Sci. Statist. Comput., 13 (1992), pp. 1–
29.
[6] K. E. Atkinson, Iterative variants of the Nystr¨ om method for the numerical
solution of integral equations, Numer. Math., 22 (1973), pp. 17–31.
[7] , An Introduction to Numerical Analysis, 2nd. ed., John Wiley, New York,
1989.
[8] O. Axelsson, Iterative Solution Methods, Cambridge University Press, Cam
bridge, 1994.
[9] S. Banach, Sur les op´erations dans les ensembles abstraits et leur applications
aux ´equations int´egrales, Fund. Math, 3 (1922), pp. 133–181.
[10] R. E. Bank and D. J. Rose, Global approximate Newton methods, Numer.
Math., 37 (1981), pp. 279–295.
[11] M. S. Barlett, An inverse matrix adjustment arising in discriminant analysis,
Ann. Math. Statist., 22 (1951), pp. 107–111.
[12] R. Barrett, M. Berry, T. F. Chan, J. Demmel, J. Donato, J. Don
garra, V. Eijkhout, R. Pozo, C. Romine, and H. van der Vorst, Templates
for the Solution of Linear Systems: Building Blocks for Iterative Methods, Society
for Industrial and Applied Mathematics, Philadelphia, PA, 1993.
[13] N. Bi´ cani´ c and K. H. Johnson, Who was ‘Raphson’?, Internat. J. Numer.
Methods. Engrg., 14 (1979), pp. 148–152.
[14] P. B. Bosma and W. A. DeRooij, Eﬃcient methods to calculate Chan
drasekhar’s Hfunctions, Astron. Astrophys., 126 (1983), p. 283.
[15] H. Brakhage,
¨
Uber die numerische Behandlung von Integralgleichungen nach
der Quadraturformelmethode, Numer. Math., 2 (1960), pp. 183–196.
[16] K. E. Brenan, S. L. Campbell, and L. R. Petzold, Numerical Solution
153
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
154 BIBLIOGRAPHY
of Initial Value Problems in DiﬀerentialAlgebraic Equations, no. 14 in Classics in
Applied Mathematics, SIAM, Philadelphia, 1996.
[17] R. Brent, Algorithms for Minimization Without Deriviatives, PrenticeHall,
Englewood Cliﬀs, NJ, 1973.
[18] , Some eﬃcient algorithms for solving systems of nonlinear equations, SIAM
J. Numer. Anal., 10 (1973), pp. 327–344.
[19] W. Briggs, A Multigrid Tutorial, Society for Industrial and Applied Mathemat
ics, Philadelphia, PA, 1987.
[20] P. N. Brown, A local convergence theory for combined inexact–Newton/ ﬁnite–
diﬀerence projection methods, SIAM J. Numer. Anal., 24 (1987), pp. 407–434.
[21] P. N. Brown, G. D. Byrne, and A. C. Hindmarsh, VODE: A variable
coeﬃcient ode solver, SIAM J. Sci. Statist. Comput., 10 (1989), pp. 1038–1051.
[22] P. N. Brown and A. C. Hindmarsh, Reduced storage matrix methods in stiﬀ
ODE systems, J. Appl. Math. Comput., 31 (1989), pp. 40–91.
[23] P. N. Brown, A. C. Hindmarsh, and L. R. Petzold, Using Krylov methods
in the solution of largescale diﬀerentialalgebraic systems, SIAM J. Sci. Comput., 15
(1994), pp. 1467–1488.
[24] P. N. Brown and Y. Saad, Hybrid Krylov methods for nonlinear systems of
equations, SIAM J. Sci. Statist. Comput., 11 (1990), pp. 450–481.
[25] , Convergence theory of nonlinear NewtonKrylov algorithms, SIAM J.
Optim., 4 (1994), pp. 297–330.
[26] C. G. Broyden, A class of methods for solving nonlinear simultaneous equa
tions, Math. Comput., 19 (1965), pp. 577–593.
[27] , The convergence of an algorithm for solving sparse nonlinear systems,
Math. Comput., 25 (1971), pp. 285–294.
[28] C. G. Broyden, J. E. Dennis, and J. J. Mor´ e, On the local and superlinear
convergence of quasiNewton methods, J. Inst. Maths. Applic., 12 (1973), pp. 223–
246.
[29] W. Burmeister, Zur Konvergenz einiger verfahren der konjugierten Richtun
gen, in Proceedings of Internationaler Kongreß ¨ uber Anwendung der Mathematik in
dem Ingenieurwissenschaften, Weimar, 1975.
[30] I. W. Busbridge, The Mathematics of Radiative Transfer, Cambridge Tracts,
No. 50, Cambridge Univ. Press, Cambridge, 1960.
[31] R. H. Byrd, J. Nocedal, and R. B. Schnabel, Representation of quasi
Newton matrices and their use in limited memory methods, Math. Programming, 63
(1994), pp. 129–156.
[32] X.C. Cai, W. D. Gropp, D. E. Keyes, and M. D. Tidriri, NewtonKrylov
Schwarz methods in CFD, in Proceedings of the International Workshop on the
NavierStokes Equations, R. Rannacher, ed., Notes in Numerical Fluid Mechanics,
Braunschweig, 1994, Vieweg Verlag.
[33] S. L. Campbell, I. C. F. Ipsen, C. T. Kelley, and C. D. Meyer, GMRES
and the minimal polynomial, Tech. Report CRSCTR9410, North Carolina State
University, Center for Research in Scientiﬁc Computation, July 1994. BIT, to appear.
[34] S. L. Campbell, I. C. F. Ipsen, C. T. Kelley, C. D. Meyer, and Z. Q.
Xue, Convergence estimates for solution of integral equations with GMRES, Tech.
Report CRSCTR9513, North Carolina State University, Center for Research in
Scientiﬁc Computation, March 1995. Journal of Integral Equations and Applications,
to appear.
[35] R. Cavanaugh, Diﬀerence Equations and Iterative Processes, PhD thesis,
University of Maryland, 1970.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
BIBLIOGRAPHY 155
[36] F. ChaitinChatelin, Is nonnormality a serious diﬃculty ?, Tech. Report
TR/PA/94/18, CERFACS, December 1994.
[37] T. Chan, E. Gallopoulos, V. Simoncini, T. Szeto, and C. Tong, A quasi
minimal residual variant of the BiCGSTAB algorithm for nonsymmetric systems,
SIAM J. Sci. Comput., 15 (1994), p. 338.
[38] T. Chan, R. Glowinski, J. P´ eriaux, and O. Widlund, eds., Domain
Decomposition Methods,Proceedings of the Second International Symposium on
Domain Decomposition Methods, Los Angeles, CA, January 14–16, 1988; Society
for Industrial and Applied Mathematics, Philadelphia, PA, 1989.
[39] , eds., Domain Decomposition Methods, Proceedings of the Third Interna
tional Symposium on Domain Decomposition Methods, Houston, TX, 1989; Society
for Industrial and Applied Mathematics, Philadelphia, PA, 1990.
[40] , eds., Domain Decomposition Methods, Proceedings of the Fourth Interna
tional Symposium on Domain Decomposition Methods, Moscow, USSR, 1990; Soci
ety for Industrial and Applied Mathematics, Philadelphia, PA, 1991.,
[41] S. Chandrasekhar, Radiative Transfer, Dover, New York, 1960.
[42] T. F. Coleman and J. J. Mor´ e, Estimation of sparse Jacobian matrices and
graph coloring problems, SIAM J. Numer. Anal., 20 (1983), pp. 187–209.
[43] T. F. Coleman and C. VanLoan, Handbook for Matrix Computations,
Frontiers in Appl. Math., No. 4, Society for Industrial and Applied Mathematics,
Philadelphia, PA, 1988.
[44] P. Concus, G. H. Golub, and G. Meurant, Block preconditioning for the
conjugate gradient method, SIAM J. Sci. Statist. Comput., 6 (1985), pp. 220–252.
[45] P. Concus, G. H. Golub, and D. P. O’Leary, A generalized conjugate
gradient method for the numerical solution of elliptic partial diﬀerential equations,
in Sparse Matrix Computations, J. R. Bunch and D. J. Rose, eds., Academic Press,
1976, pp. 309–332.
[46] E. J. Craig, The Nstep iteration procedures, J. Math. Phys., 34 (1955), pp. 64–
73.
[47] A. R. Curtis, M. J. D. Powell, and J. K. Reid, On the estimation of sparse
Jacobian matrices, J. Inst. Math. Appl., 13 (1974), pp. 117–119.
[48] J. W. Daniel, The conjugate gradient method for linear and nonlinear operator
equations, SIAM J. Numer. Anal., 4 (1967), pp. 10–26.
[49] D. W. Decker, H. B. Keller, and C. T. Kelley, Convergence rates for
Newton’s method at singular points, SIAM J. Numer. Anal., 20 (1983), pp. 296–314.
[50] D. W. Decker and C. T. Kelley, Newton’s method at singular points I, SIAM
J. Numer. Anal., 17 (1980), pp. 66–70.
[51] , Convergence acceleration for Newton’s method at singular points, SIAM J.
Numer. Anal., 19 (1982), pp. 219–229.
[52] , Sublinear convergence of the chord method at singular points, Numer.
Math., 42 (1983), pp. 147–154.
[53] , Broyden’s method for a class of problems having singular Jacobian at the
root, SIAM J. Numer. Anal., 22 (1985), pp. 566–574.
[54] T. J. Dekker, Finding a zero by means of successive linear interpolation, in
Constructive Aspects of the Fundamental Theorem of Algebra, P. Henrici, ed., 1969,
pp. 37–48.
[55] R. Dembo, S. Eisenstat, and T. Steihaug, Inexact Newton methods, SIAM
J. Numer. Anal., 19 (1982), pp. 400–408.
[56] R. Dembo and T. Steihaug, Truncated Newton algorithms for largescale
optimization, Math. Programming, 26 (1983), pp. 190–212.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
156 BIBLIOGRAPHY
[57] J. E. Dennis, On the Kantorovich hypothesis for Newton’s method, SIAM J.
Numer. Anal., 6 (1969), pp. 493–507.
[58] , Toward a uniﬁed convergence theory for Newtonlike methods, in Nonlinear
Functional Analysis and Applications, L. B. Rall, ed., Academic Press, New York,
1971, pp. 425–472.
[59] J. E. Dennis, J. M. Martinez, and X. Zhang, Triangular decomposition
methods for solving reducible nonlinear systems of equations, SIAM J. Optim., 4
(1994), pp. 358–382.
[60] J. E. Dennis and J. J. Mor´ e, A characterization of superlinear convergence
and its application to quasiNewton methods, Math. Comput., 28 (1974), pp. 549–560.
[61] , QuasiNewton methods, methods, motivation and theory, SIAM Rev., 19
(1977), pp. 46–89.
[62] J. E. Dennis and R. B. Schnabel, Least change secant updates for quasi
Newton methods, SIAM Rev., 21 (1979), pp. 443–459.
[63] , Numerical Methods for Unconstrained Optimization and Nonlinear Equa
tions, no. 16 in Classics in Applied Mathematics, SIAM, Philadelphia, 1996.
[64] J. E. Dennis and H. F. Walker, Convergence theorems for least change secant
update methods, SIAM J. Numer. Anal., 18 (1981), pp. 949–987.
[65] , Inaccuracy in quasiNewton methods: Local improvement theorems, in
Mathematical Programming Study 22: Mathematical programming at Oberwolfach
II, North–Holland, Amsterdam, 1984, pp. 70–85.
[66] , Leastchange sparse secant updates with inaccurate secant conditions,
SIAM J. Numer. Anal., 22 (1985), pp. 760–778.
[67] P. Deuflhard, R. W. Freund, and A. Walter, Fast secant methods for
the iterative solution of large nonsymmetric linear systems, Impact of Computing in
Science and Engineering, 2 (1990), pp. 244–276.
[68] W. J. Duncan, Some devices for the solution of large sets of simultaneous
linear equations (with an appendix on the reciprocation of partitioned matrices), The
London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science,
Seventh Series, 35 (1944), pp. 660–670.
[69] S. C. Eisenstat and H. F. Walker, Choosing the forcing terms in an inexact
Newton method, SIAM J. Sci. Comput., 17 (1996), pp. 16–32.
[70] , Globally convergent inexact Newton methods, SIAM J. Optim., 4 (1994),
pp. 393–422.
[71] H. C. Elman, Iterative Methods for Large, Sparse, Nonsymmetric Systems of
Linear Equations, PhD thesis, Yale University, New Haven, CT, 1982.
[72] H. C. Elman, Y. Saad, and P. E. Saylor, A hybrid ChebyshevKrylov
subspace algorithm for solving nonsymmetric systems of linear equations, SIAM J.
Sci. Statist. Comput., 7 (1986), pp. 840–855.
[73] M. Engelman, G. Strang, and K. J. Bathe, The application of quasi
Newton methods in ﬂuid mechanics, Internat. J. Numer. Methods Engrg., 17 (1981),
pp. 707–718.
[74] V. Faber and T. A. Manteuffel, Necessary and suﬃcient conditions for the
existence of a conjugate gradient method, SIAM J. Numer. Anal., 21 (1984), pp. 352–
362.
[75] D. Feng, P. D. Frank, and R. B. Schnabel, Local convergence analysis of
tensor methods for nonlinear equations, Math. Programming, 62 (1993), pp. 427–459.
[76] R. Fletcher, Conjugate gradient methods for indeﬁnite systems, in Numerical
Analysis Dundee 1975, G. Watson, ed., SpringerVerlag, Berlin, New York, 1976,
pp. 73–89.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
BIBLIOGRAPHY 157
[77] R. W. Freund, A transposefree quasiminimal residual algorithm for non
Hermitian linear systems, SIAM J. Sci. Comput., 14 (1993), pp. 470–482.
[78] R. W. Freund, G. H. Golub, and N. M. Nachtigal, Iterative solution of
linear systems, Acta Numerica, 1 (1991), pp. 57–100.
[79] R. W. Freund, M. H. Gutknecht, and N. M. Nachtigal, An implemen
tation of the lookahead Lanczos algorithm for nonHermitian matrices, SIAM J. Sci.
Comput., 14 (1993), pp. 137–158.
[80] R. W. Freund and N. M. Nachtigal, QMR: a quasiminimal residual
algorithm for nonhermitian linear systems, Numer. Math., 60 (1991), pp. 315–339.
[81] , An implementation of the QMR method based on coupled twoterm
recurrences, SIAM J. Sci. Comput., 15 (1994), pp. 313–337.
[82] D. M. Gay, Some convergence properties of Broyden’s method, SIAM J. Numer.
Anal., 16 (1979), pp. 623–630.
[83] C. W. Gear, Numerical Initial Value Problems in Ordinary Diﬀerential Equa
tions, PrenticeHall, Englewood Cliﬀs, NJ, 1971.
[84] R. R. Gerber and F. T. Luk, A generalized Broyden’s method for solving
simultaneous linear equations, SIAM J. Numer. Anal., 18 (1981), pp. 882–890.
[85] P. E. Gill and W. Murray, QuasiNewton methods for unconstrained
optimization, J. I. M. A., 9 (1972), pp. 91–108.
[86] , Safeguarded steplength algorithms for optimization using descent methods,
Tech. Report NAC 37, National Physical Laboratory Report, Teddington, England,
1974.
[87] P. E. Gill, W. Murray, and M. H. Wright, Practical Optimization,
Academic Press, London, 1981.
[88] A. A. Goldstein, Constructive Real Analysis, Harper and Row, New York,
1967.
[89] G. H. Golub and C. G. VanLoan, Matrix Computations, Johns Hopkins
University Press, Baltimore, 1983.
[90] A. Griewank, Analysis and modiﬁcation of Newton’s method at singularities,
PhD thesis, Australian National University, 1981.
[91] , Rates of convergence for secant methods on nonlinear problems in Hilbert
space, in Numerical Analysis, Proceedings Guanajuato , Mexico 1984, Lecture Notes
in Math., No, 1230, J. P. Hennart, ed., SpringerVerlag, Heidelberg, 1986, pp. 138–
157.
[92] , The solution of boundary value problems by Broyden based secant methods,
in Computational Techniques and Applications: CTAC 85, Proceedings of CTAC,
Melbourne, August 1985, J. Noye and R. May, eds., North Holland, Amsterdam,
1986, pp. 309–321.
[93] , On the iterative solution of diﬀerential and integral equations using secant
updating techniques, in The State of the Art in Numerical Analysis, A. Iserles and
M. J. D. Powell, eds., Clarendon Press, Oxford, 1987, pp. 299–324.
[94] A. Griewank and G. F. Corliss, eds., Automatic Diﬀerentiation of Algo
rithms: Theory, Implementation, and Application, Society for Industrial and Applied
Mathematics, Philadelphia, PA, 1991.
[95] A. Griewank and P. L. Toint, Local convergence analysis for partitioned
quasinewton updates, Numer. Math., 39 (1982), pp. 429–448.
[96] , Partitioned variable metric methods for large sparse optimization problems,
Numer. Math., 39 (1982), pp. 119–137.
[97] W. A. Gruver and E. Sachs, Algorithmic Methods In Optimal Control,
Pitman, London, 1980.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
158 BIBLIOGRAPHY
[98] M. H. Gutknecht, Variants of BICGSTAB for matrices with complex spectrum,
SIAM J. Sci. Comput., 14 (1993), pp. 1020–1033.
[99] W. Hackbusch, MultiGrid Methods and Applications, vol. 4 of Springer Series
in Computational Mathematics, SpringerVerlag, New York, 1985.
[100] , Multigrid methods of the second kind, in Multigrid Methods for Integral
and Diﬀerential Equations, Oxford University Press, Oxford, 1985.
[101] W. E. Hart and S. O. W. Soul, QuasiNewton methods for discretized
nonlinear boundary problems, Journal of the Institute of Applied Mathematics, 11
(1973), pp. 351–359.
[102] H. V. Henderson and S. R. Searle, On deriving the inverse of a sum of
matrices, SIAM Rev., 23 (1981), pp. 53–60.
[103] M. R. Hestenes and E. Steifel, Methods of conjugate gradient for solving
linear systems, J. of Res. Nat. Bureau Standards, 49 (1952), pp. 409–436.
[104] D. M. Hwang and C. T. Kelley, Convergence of Broyden’s method in Banach
spaces, SIAM J. Optim., 2 (1992), pp. 505–532.
[105] E. Isaacson and H. B. Keller, Analysis of numerical methods, John Wiley,
New York, 1966.
[106] L. Kantorovich and G. Akilov, Functional Analysis, 2nd ed., Pergamon
Press, New York, 1982.
[107] L. V. Kantorovich, Functional analysis and applied mathematics, Uspehi Mat.
Nauk., 3 (1948), pp. 89–185. translation by C. Benster as Nat. Bur. Standards Report
1509. Washington, D. C., 1952.
[108] H. B. Keller, Newton’s method under mild diﬀerentiability conditions, J.
Comput. Sys. Sci, 4 (1970), pp. 15–28.
[109] , Lectures on Numerical Methods in Bifurcation Theory, Tata Institute of
Fundamental Research, Lectures on Mathematics and Physics, SpringerVerlag, New
York, 1987.
[110] C. T. Kelley, Solution of the Chandrasekhar Hequation by Newton’s method,
J. Math. Phys., 21 (1980), pp. 1625–1628.
[111] , A fast multilevel algorithm for integral equations, SIAM J. Numer. Anal.,
32 (1995), pp. 501–513.
[112] C. T. Kelley and E. W. Sachs, A quasiNewton method for elliptic boundary
value problems, SIAM J. Numer. Anal., 24 (1987), pp. 516–531.
[113] , A pointwise quasiNewton method for unconstrained optimal control
problems, Numer. Math., 55 (1989), pp. 159–176.
[114] , Fast algorithms for compact ﬁxed point problems with inexact function
evaluations, SIAM J. Sci. Statist. Comput., 12 (1991), pp. 725–742.
[115] , A new proof of superlinear convergence for Broyden’s method in Hilbert
space, SIAM J. Optim., 1 (1991), pp. 146–150.
[116] , Pointwise Broyden methods, SIAM J. Optim., 3 (1993), pp. 423–441.
[117] , Multilevel algorithms for constrained compact ﬁxed point problems, SIAM
J. Sci. Comput., 15 (1994), pp. 645–667.
[118] C. T. Kelley and R. Suresh, A new acceleration method for Newton’s method
at singular points, SIAM J. Numer. Anal., 20 (1983), pp. 1001–1009.
[119] C. T. Kelley and Z. Q. Xue, Inexact Newton methods for singular problems,
Optimization Methods and Software, 2 (1993), pp. 249–267.
[120] , GMRES and integral operators, SIAM J. Sci. Comput., 17 (1996), pp. 217–
226.
[121] T. Kerkhoven and Y. Saad, On acceleration methods for coupled nonlinear
elliptic systems, Numerische Mathematik, 60 (1992), pp. 525–548.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
BIBLIOGRAPHY 159
[122] C. Lanczos, Solution of linear equations by minimized iterations, J. Res. Natl.
Bur. Stand., 49 (1952), pp. 33–53.
[123] T. A. Manteuffel, Adaptive procedure for estimating parameters for the
nonsymmetric Tchebychev iteration, Numer. Math., 31 (1978), pp. 183–208.
[124] T. A. Manteuffel and S. Parter, Preconditioning and boundary conditions,
SIAM J. Numer. Anal., 27 (1990), pp. 656–694.
[125] E. S. Marwil, Convergence results for Schubert’s method for solving sparse
nonlinear equations, SIAM J. Numer. Anal., (1979), pp. 588–604.
[126] S. McCormick, Multilevel Adaptive Methods for Partial Diﬀerential Equations,
Society for Industrial and Applied Mathematics, Philadelphia, PA, 1989.
[127] J. A. Meijerink and H. A. van der Vorst, An iterative solution method
for linear systems of which the coeﬃcient matrix is a symmetric Mmatrix, Math.
Comput., 31 (1977), pp. 148–162.
[128] C. D. Meyer, Matrix Analysis and Applied Linear Algebra, forthcoming.
[129] J. J. Mor´ e, Recent developments in algorithms and software for trust region
methods, in Mathematical Programming: The State of the Art, A. Bachem,
M. Gr¨ oschel, and B. Korte, eds., SpringerVerlag, Berlin, 1983, pp. 258–287.
[130] J. J. Mor´ e and J. A. Trangenstein, On the global convergence of Broyden’s
method, Math. Comput., 30 (1976), pp. 523–540.
[131] T. E. Mott, Newton’s method and multiple roots, Amer. Math. Monthly, 64
(1957), pp. 635–638.
[132] T. W. Mullikin, Some probability distributions for neutron transport in a half
space, J. Appl. Probab., 5 (1968), pp. 357–374.
[133] W. Murray and M. L. Overton, Steplength algorithms for minimizing a
class of nondiﬀerentiable functions, Computing, 23 (1979), pp. 309–331.
[134] N. M. Nachtigal, S. C. Reddy, and L. N. Trefethen, How fast are
nonsymmetric matrix iterations?, SIAM J. Matrix Anal. Appl., 13 (1992), pp. 778–
795.
[135] N. M. Nachtigal, L. Reichel, and L. N. Trefethen, A hybrid gmres
algorithm for nonsymmetric linear systems, SIAM J. Matrix Anal. Appl., 13 (1992).
[136] S. G. Nash, Preconditioning of truncated Newton methods, SIAM J. Sci. Statist.
Comput., 6 (1985), pp. 599–616.
[137] , Who was Raphson? (an answer). Electronic Posting to NADigest,
v92n23, 1992.
[138] J. L. Nazareth, Conjugate gradient methods less dependent on conjugacy,
SIAM Rev., 28 (1986), pp. 501–512.
[139] O. Nevanlinna, Convergence of Iterations for Linear Equations, Birkh¨ auser,
Basel, 1993.
[140] I. Newton, The Mathematical Papers of Isaac Newton (7 volumes), D. T.
Whiteside, ed., Cambridge University Press, Cambridge, 1967–1976.
[141] B. Noble, Applied Linear Algebra, Prentice Hall, Englewood Cliﬀs, NJ, 1969.
[142] J. Nocedal, Theory of algorithms for unconstrained optimization, Acta Numer
ica, 1 (1991), pp. 199–242.
[143] D. P. O’Leary, Why Broyden’s nonsymmetric method terminates on linear
equations, SIAM J. Optim., 4 (1995), pp. 231–235.
[144] J. M. Ortega, Numerical Analysis A Second Course, Classics in Appl. Math.,
No. 3, Society for Industrial and Applied Mathematics, Philadelphia, PA, 1990.
[145] J. M. Ortega and W. C. Rheinboldt, Iterative Solution of Nonlinear
Equations in Several Variables, Academic Press, New York, 1970.
[146] A. M. Ostrowski, Solution of Equations and Systems of Equations, Academic
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
160 BIBLIOGRAPHY
Press, New York, 1960.
[147] B. N. Parlett, The Symmetric Eigenvalue Problem, Prentice Hall, Englewood
Cliﬀs, NJ, 1980.
[148] B. N. Parlett, D. R. Taylor, and Z. A. Liu, A lookahead Lanczos
algorithm for unsymmetric matrices, Math. Comput., 44 (1985), pp. 105–124.
[149] D. W. Peaceman and H. H. Rachford, The numerical solution of parabolic
and elliptic diﬀerential equations, J. Soc. Indus. Appl. Math., 11 (1955), pp. 28–41.
[150] G. Peters and J. Wilkinson, Inverse iteration, illconditioned equations and
Newton’s method, SIAM Rev., 29 (1979), pp. 339–360.
[151] L. R. Petzold, A description of DASSL: a diﬀerential/algebraic system solver,
in Scientiﬁc Computing, R. S. Stepleman et al., eds., North Holland, Amsterdam,
1983, pp. 65–68.
[152] E. Picard, M´emoire sur la th´eorie des ´equations aux d´eriv´ees partielles et la
m´ethode des approximations successives, J. de Math. ser 4, 6 (1890), pp. 145–210.
[153] M. J. D. Powell, A hybrid method for nonlinear equations, in Numerical
Methods for Nonlinear Algebraic Equations, Gordon and Breach, New York, 1970,
pp. 87–114.
[154] , Convergence properties of a class of minimization algorithms, in Nonlinear
Programming 2, O. L. Mangasarian, R. R. Meyer, and S. M. Robinson, eds.,
Academic Press, New York, 1975, pp. 1–27.
[155] L. B. Rall, Convergence of the Newton process to multiple solutions, Numer.
Math., 9 (1961), pp. 23–37.
[156] J. Raphson, Analysis aequationum universalis seu ad aequationes algebraicas
resolvendas methodus generalis, et expedita, ex nova inﬁnitarum serierum doctrina,
deducta ac demonstrata. Original in British Library, London, 1690.
[157] G. W. Reddien, On Newton’s method for singular problems, SIAM J. Numer.
Anal., 15 (1978), pp. 993–986.
[158] J. K. Reid, Least squares solution of sparse systems of nonlinear equations by a
modiﬁed Marquardt algorithm, in Proc. NATO Conf. at Cambridge, July 1972, North
Holand, Amsterdam, 1973, pp. 437–445.
[159] W. C. Rheinboldt, Numerical Analysis of Parametrized Nonlinear Equations,
John Wiley, New York, 1986.
[160] J. R. Rice, Experiments on GramSchmidt orthogonalization, Math. Comput.,
20 (1966), pp. 325–328.
[161] T. J. Rivlin, The Chebyshev Polynomials, John Wiley, New York, 1974.
[162] H. L. Royden, Real Analysis, 2nd ed., Macmillan, New York, 1968.
[163] Y. Saad, Practical use of polynomial preconditionings for the conjugate gradient
method, SIAM J. Sci. Statist. Comput., 6 (1985), pp. 865–881.
[164] , Least squares polynomials in the complex plane and their use for solving
nonsymmetric linear systems, SIAM J. Numer. Anal., 24 (1987), pp. 155–169.
[165] , ILUT: A dual threshold incomplete LU factorization, Tech. Report 9238,
Computer Science Department, University of Minnesota, 1992.
[166] , A ﬂexible innerouter preconditioned GMRES algorithm, SIAM J. Sci.
Comput., (1993), pp. 461–469.
[167] Y. Saad and M. Schultz, GMRES a generalized minimal residual algorithm
for solving nonsymmetric linear systems, SIAM J. Sci. Statist. Comput., 7 (1986),
pp. 856–869.
[168] E. Sachs, Broyden’s method in Hilbert space, Math. Programming, 35 (1986),
pp. 71–82.
[169] P. E. Saylor and D. C. Smolarski, Implementation of an adaptive algorithm
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
BIBLIOGRAPHY 161
for Richardson’s method, Linear Algebrra Appl., 154/156 (1991), pp. 615–646.
[170] R. B. Schnabel and P. D. Frank, Tensor methods for nonlinear equations,
SIAM J. Numer. Anal., 21 (1984), pp. 815–843.
[171] E. Schr¨ oder,
¨
Uber unendlich viele Algorithmen zur Auﬂosung der Gleichungen,
Math. Ann., 2 (1870), pp. 317–365.
[172] L. K. Schubert, Modiﬁcation of a quasiNewton method for nonlinear equations
with sparse Jacobian, Math. Comput., 24 (1970), pp. 27–30.
[173] G. A. Schultz, R. B. Schnabel, and R. H. Byrd, A family of trustregion
based algorithms for unconstrained minimization with strong global convergence
properties, SIAM J. Numer. Anal., 22 (1985), pp. 47–67.
[174] V. E. Shamanskii, A modiﬁcation of Newton’s method, Ukran. Mat. Zh., 19
(1967), pp. 133–138. (In Russian.)
[175] A. H. Sherman, On Newtoniterative methods for the solution of systems of
nonlinear equations, SIAM J. Numer. Anal., 14 (1978), pp. 755–774.
[176] J. Sherman and W. J. Morrison, Adjustment of an inverse matrix corre
sponding to changes in the elements of a given column or a given row of the original
matrix (abstract), Ann. Math. Statist., 20 (1949), p. 621.
[177] , Adjustment of an inverse matrix corresponding to a change in one element
of a given matrix, Ann. Math. Statist., 21 (1950), pp. 124–127.
[178] K. Sigmon, MATLAB Primer, Fourth Edition, CRC Press, Boca Raton, FL,
1994.
[179] D. C. Smolarski and P. E. Saylor, An optimum iterative method for solving
any linear system with a square matrix, BIT, 28 (1988), pp. 163–178.
[180] P. Sonneveld, CGS, a fast Lanczostype solver for nonsymmetric linear
systems, SIAM J. Sci. Statist. Comput., 10 (1989), pp. 36–52.
[181] D. C. Sorensen, Newton’s method with a model trust region modiﬁcation, SIAM
J. Numer. Anal., 19 (1982), pp. 409–426.
[182] G. Starke, Alternating direction preconditioning for nonsymmetric systems of
linear equations, SIAM J. Sci. Comput., 15 (1994), pp. 369–385.
[183] G. Starke and R. S. Varga, A hybrid ArnoldiFaber iterative method for
nonsymmetric systems of linear equations, Numer. Math., 64 (1993), pp. 213–239.
[184] G. W. Stewart, Introduction to matrix computations, Academic Press, New
York, 1973.
[185] E. L. Stiefel, Kernel polynomials in linear algebra and their numerical
applications, U.S. National Bureau of Standards, Applied Mathematics Series, 49
(1958), pp. 1–22.
[186] P. N. Swarztrauber, The methods of cyclic reduction, Fourier analysis and
the FACR algorithm for the discrete solution of Poisson’s equation on a rectangle,
SIAM Rev., 19 (1977), pp. 490–501.
[187] , Approximate cyclic reduction for solving Poisson’s equation, SIAM J. Sci.
Statist. Comput., 8 (1987), pp. 199–209.
[188] P. N. Swarztrauber and R. A. Sweet, Algorithm 541: Eﬃcient FORTRAN
subprograms for the solution of elliptic partial diﬀerential equations, ACM Trans.
Math. Software, 5 (1979), pp. 352–364.
[189] P. L. Toint, On sparse and symmetric matrix updating subject to a linear
equation, Math. Comput., 31 (1977), pp. 954–961.
[190] J. F. Traub, Iterative Methods for the Solution of Equations, Prentice Hall,
Englewood Cliﬀs, NJ, 1964.
[191] H. A. van der Vorst, BiCGSTAB: A fast and smoothly converging variant to
BiCG for the solution of nonsymmetric systems, SIAM J. Sci. Statist. Comput., 13
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
162 BIBLIOGRAPHY
(1992), pp. 631–644.
[192] H. A. van der Vorst and C. Vuik, The superlinear convergence behaviour
of GMRES, Journal Comput. Appl. Math., 48 (1993), pp. 327–341.
[193] R. S. Varga, Matrix Iterative Analysis, Prentice Hall, Englewood Cliﬀs, NJ,
1962.
[194] E. L. Wachspress, Iterative Solution of Elliptic Systems and Applications to
the Neutron Diﬀusion Equations of Reactor Physics, Prentice Hall, Englewood Cliﬀs,
NJ, 1966.
[195] H. F. Walker, Implementation of the GMRES method using Householder
transformations, SIAM J. Sci. Statist. Comput., 9 (1989), pp. 815–825.
[196] , Residual smoothing and peak/plateau behavior in Krylov subspace methods,
Applied Numer. Math., 19 (1995), pp. 279–286.
[197] H. F. Walker and L. Zhou, A simpler GMRES, J. Numer. Lin. Alg. Appl.,
6 (1994), pp. 571–581.
[198] L. T. Watson, S. C. Billups, and A. P. Morgan, Algorithm 652:
HOMPACK: A suite of codes for globally convergent homotopy algorithms, ACM
Trans. Math. Software, 13 (1987), pp. 281–310.
[199] M. A. Woodbury, Inverting modiﬁed matrices, Memorandum Report 42,
Statistical Research Group, Princeton NJ, 1950.
[200] D. M. Young, Iterative Solution of Large Linear Systems, Academic Press, New
York, 1971.
[201] T. J. Ypma, The eﬀect of rounding errors on Newtonlike methods, IMA J.
Numer. Anal., 3 (1983), pp. 109–118.
[202] L. Zhou and H. F. Walker, Residual smoothing techniques for iterative
methods, SIAM J. Sci. Comput., 15 (1994), pp. 297–312.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
Index
Aconjugacy, 20
Anorm, 12
Approximate inverse, 6, 77
Armijo rule, 137
Arnoldi process, 37
Backtracking, 136
Bad Broyden method, 132
Banach Lemma, 6
BiCG, 47
algorithm, 47
BiCGSTAB, 50
algorithm, 50
ﬁnite diﬀerence, 110
Biconjugacy, 47
BiConjugate Gradient, 47
algorithm, 47
Biorthogonality, 47
Bounded deterioration, 118, 121
Breakdown, 47
Broyden sequence, 120
Broyden’s method, 113
convergence
linear equations, 118
nonlinear equations, 120
implementation, 123
limited memory, 123
restarting, 123
sparse, 133
Brsol
algorithm, 126
Brsola
algorithm, 145
Cauchy sequence, 67
CGNE, 25
CGNR, 25
CGS, 48
algorithm, 49
Chandrasekhar Hequation, 87
solution
Broyden’s method, 128
Newton’s method, 87
NewtonGMRES, 107
Characteristic polynomial, 34
Chord method, 74
algorithm, 74
convergence, 76
Condition number, 3
Conjugate Gradient
algorithm, 22
ﬁnite termination, 14
minimization property, 12
use of residual polynomials, 14
Conjugate Gradient Squared, 48
algorithm, 49
Conjugate transpose, 34
Contraction mapping, 67
Contraction Mapping Theorem, 67
ConvectionDiﬀusion Equation
Broyden’s method, 130
Newton–Armijo, 146
NewtonGMRES, 108
solution
Broyden’s method, 130
Convergence theorem
Broyden’s method
linear equations, 119
163
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
164 INDEX
nonlinear equations, 123
chord method, 76, 77
contraction mapping, 67
ﬁnite diﬀerence Newton, 81
inexact Newton’s method, 96,
99
Newton’s method, 71
Newton–Armijo, 141
NewtonGMRES, 103
secant method, 82
Shamanskii method, 79
DASPK, 110
Dennis–Mor´e Condition, 114
superlinear convergence, 115
Diagonalizable matrix, 34
Diagonalizing transformation, 34
Diﬀerence approximation, 79
choice of h, 80
directional derivative, 80
Jacobian, 80
nonlinearity, 80
Elementary Jordan block, 60
Fixed point, 66
Fixedpoint iteration, 66
Forcing term, 95
choice of, 104
Gauss–Seidel
algorithm, 9
Givens rotation, 43
Globally convergent algorithm, 135
GMRES
algorithm, 40, 45
diagonalizable matrices, 34
ﬁnite diﬀerences, 101
ﬁnite termination, 34
GMRES(m), 39, 46
algorithm, 46
loss of orthogonality, 41
minimization property, 33
restarts, 39, 46
use of residual polynomials, 33
Gram–Schmidt process, 37
loss of orthogonality, 40, 41
modiﬁed, 40
Hequation, 87
solution
Broyden’s method, 128
Newton’s method, 87
NewtonGMRES, 107
H¨older Continuity, 91
High order methods, 78
Hybrid algorithms, 36
Induced matrix norm
deﬁnition, 3
Inexact Newton method, 95
Inner iteration, 100
Inverse power method, 94
Inverse secant equation, 132
Inverse Tangent
Newton–Armijo, 146
Iteration matrix, 5
Iteration statistics, 91
Jacobi
algorithm, 8
convergence result, 8
preconditioning, 9
Jacobi preconditioning, 24
Jordan block, 60
Kantorovich Theorem, 83
Krylov subspace, 11
Left preconditioning, 36
Limited memory, 123
Line Search
parabolic
threepoint, 143
twopoint, 142
polynomial, 142
Line search, 135, 136
Linear PDE
BiCGSTAB, 57
Broyden’s method, 127
CGNR, 55
GMRES, 54
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
INDEX 165
solution
Broyden’s method, 127
TFQMR, 58
Lipschitz continuous
deﬁnition, 67
Local improvement, 75
inexact, 100
Matrix norm, 3
Matrix splitting, 7
Modiﬁed Bratu Problem, 132
Newton direction, 136
Newton’s method, 71
algorithm, 74
convergence rate, 71
ﬁnite diﬀerences, 81
stagnation, 81
inexact, 95
monotone convergence, 92
termination, 72
NewtonGMRES, 101
convergence rates, 103
ﬁnite diﬀerences, 101
safeguarding, 105
Newtoniterative method, 100
Normal matrix, 34
Nsol
algorithm, 86
Nsola
algorithm, 138
Nsolgm
algorithm, 105
Outer iteration, 100
Over solving, 104
Picard iteration, 66
Poisson solver, 24, 27
preconditioner
for CG, 24, 27
for GMRES, 36, 54
Polynomial preconditioning, 24, 36
Richardson iteration, 7
Positive deﬁnite matrix, 11
Preconditioned CG
algorithm, 23
Preconditioning
conjugate gradient, 19
GMRES, 36
incomplete
factorization, 24, 36
Jacobi, 24
Poisson solver, 27, 36, 54
polynomial, 24, 36
Richardson iteration, 6
Quasiminimization, 52
QuasiNewton methods, 113
Quasiresidual, 51
Quasiresidual norm, 52
Relaxation parameter, 9
Reorthogonalization, 41
Residual
linear, 11
as error estimate, 4
nonlinear
as error estimate, 72
relative nonlinear, 72
Residual polynomials
application to CG, 14
application to CGNR, 25
application to GMRES, 33
application to TFQMR, 52
deﬁnition, 14
Richardson iteration, 5
nonlinear, 66
Right preconditioning, 36
Schubert algorithm, 133
Search direction, 136
Secant equation, 113
Secant method, 82
convergence, 82
qorder, 91
rorder, 91
Secant update, 113
Shamanskii method, 78
algorithm, 78
convergence, 79
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
166 INDEX
ﬁnite diﬀerences, 91
Sherman–Morrison
formula, 124, 132
Sherman–Morrison–Woodbury for
mula, 132
Simple decrease, 136
SOR
algorithm, 9
SPD matrix, 12
Spectral radius, 7
Stagnation, 94
Newton’s method, 75
Stationary iterative methods, 5
as preconditioners, 24
Successive substitution, 66
Suﬃcient decrease, 137
Symmetric Gauss–Seidel
algorithm, 9
preconditioning, 9
Symmetric matrix, 11
Symmetric positive deﬁnite matrix,
12
Termination
CG, 16
GMRES, 35, 38
linear equations, 4
nonlinear equations, 72
TFQMR, 52
TFQMR, 51
algorithm, 53
ﬁnite diﬀerence, 110
weights, 52
Weighted norm, 98
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
To Polly H. Thomas, 19061994, devoted mother and grandmother
1
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
2
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
. 3. . . . . . . . . . . . .html. 11 11 13 15 19 22 25 26 30 33 33 35 36 37 43 46 47 48 50 . . . .6. . . . . .4 Matrix splittings and classical stationary iterative methods . . .8 Exercises on conjugate gradient . . . 2. . . . . . . . GMRES Iteration 3. . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 CHAPTER 2. . .2 Termination . . . . . . . . 7 1. . . . . . .4 GMRES implementation: Basic ideas .Copyright ©1995 by the Society for Industrial and Applied Mathematics. . . . . . . . . . . 3 1. . 3. . . . . . . . . . .6. . . . .3 BiCGSTAB. . . . . . .3 Preconditioning . 3. .6 CGNR and CGNE . . . . . . . . . . . . . . . . . . . 3. . . . . . . Basic Concepts and Stationary Iterative Methods 3 1. . . . . 3. . . . . . . . . . .5 Exercises on stationary iterative methods . . . . . . . . . . . . . . . . . . . . . . . .3 Termination of the iteration . . . . . . . . . . .5 Preconditioning . . . . . . . . . . . . . 2. . . . . . . . . .ecsecurehost. . . . . . . . . . . . 3. 2. . . . . . . . . . vii Buy this book from SIAM at http://www. .7 Examples for preconditioned conjugate iteration 2. 2. . 2. . . . . . . . . . . .1 Review and notation . . . .6. . . . . . . . . . Conjugate Gradient Iteration 2. . . . . . . . . . . . . . . . . . . . .2 The Banach Lemma and approximate inverses . . . . . . . . . . . . . . . . . . . . . . .4 Implementation . . . . . . . This electronic version is for personal use and may not be duplicated or distributed. . . . . . . . . . . . 5 1. .5 Implementation: Givens rotations . . . . Contents Preface xi How to Get the Software xiii CHAPTER 1. 3. . . . . . . . . . . . . 2. . . . . . . .3 The spectral radius . . . . . . . . .2 CGS. .1 Krylov methods and the minimization property . . . . . . . . . . . . . . . . . . . 7 1. . . .2 Consequences of the minimization property . . . . . . . . . . . . . . . . .com/SIAM/FR16.1 BiCG. . . CHAPTER 3. . . . . . .1 The minimization property and its consequences 3. . . . . . . . . .6 Other methods for nonsymmetric systems . .
. . . . . . . . . . . . . . . 6. . . . . . 5. . . . .1 The chord method. . . . Examples for CGNR. .2 Nonlinear problems. .1. . . . .6 Examples for Newton’s method .2 Other Newtoniterative methods. . . . . and TFQMR iteration . . . . . .2 Convectiondiﬀusion equation. . . . . . . . . 51 54 55 60 65 65 66 68 71 71 72 73 75 76 77 78 79 82 83 86 91 95 95 95 97 100 100 101 104 104 106 107 108 110 113 114 116 118 120 123 127 CHAPTER 4. . . Broyden’s method 7. . . . . . . . . .3 Implementation of Broyden’s method 7. . . . . . . . . . . . . . . . . . . e 7. . . . . . . . . . . Exercises on GMRES . . 6. . . . Buy this book from SIAM at http://www. 6. . . . .4 Diﬀerence approximation to F . . . . . .com/SIAM/FR16. 6. . . . . . . . . .2 Fixedpoint iteration . . . .4. . . . 7. . . . . 6. . . . . . . . . . . . . 5. . . . . . . 5. . . .4. . . . . . . . . . .ecsecurehost. . . . . . . 5. . . . . . . . 4. . . . . . . . .2 Newtoniterative methods . .2 Convergence analysis . . . . . . . . . . . . .3 The Shamanskii method. . . . . . . . . . . . . . .3 Implementation of Newton’s method .3 The standard assumptions . . . . . . . . viii 3. . .3 Errors in the function. . . . . 5. . . . . . . . . . . . . . . . . . 6. . . . . . . . .1 Local convergence of Newton’s method 5. . . . . . . . . . .1 Types of convergence . . . . . . . . . . . . . CONTENTS 3. .2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .1 Chandrasekhar Hequation. . . . . . . .2. . . .8 3. . . . . . . . . . . . . . . . . . . .5 Exercises on inexact Newton methods . . .1. . .6. . . Examples for GMRES iteration . . . . . . . . . .Copyright ©1995 by the Society for Industrial and Applied Mathematics. . .4. . . . . .5 The secant method. . . . . .html. . . . . Inexact Newton Methods 6. . 5.9 . . . . . . . . . . . . . .2 Approximate inversion of F . . . . . . . . BiCGSTAB. . . . . . . . 6. . . . . . . . . . . . . . . . . . 5. . . . . . . . . . 7. . . . . . . . . . . . . . . . . . . CHAPTER 6. CHAPTER 7. . . . . . . . . . .4. . .1 The basic estimates .4. . . . . . . . . . . . . . . . . Basic Concepts and FixedPoint Iteration 4. . Newton’s Method 5. . . . . .7 3. . . . . . . . . .4 Errors in the function and derivative . . .4. . . . . . . . 4. .4 TFQMR. . This electronic version is for personal use and may not be duplicated or distributed. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6. . . . . .3 NewtonGMRES implementation . . . . . . .4. . . . . . . .2 Weighted norm analysis.2. . . . . . . . . . CHAPTER 5. .7 Exercises on Newton’s method .5 The Kantorovich Theorem .2 Termination of the iteration . . . . . . . . . . . . . . .1 The Dennis–Mor´ condition . . . . . . . . . . . . . . . . . .4 Examples for NewtonGMRES . . . . . . . . 6. . . . . . . . . . . . 5. . .1 Newton GMRES. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5. . . . . .2. 5. .1 Linear problems. 7. . . . 6. . . . .4 Examples for Broyden’s method . . . . . . . . . . . . . . . . 6. . . . .1 Direct analysis. . . . . . . .1.
. . . . . . . . . . 135 135 138 141 142 144 146 146 146 148 151 153 163 CHAPTER 8. .5 7. . . . . . 8. . . . . . . Global Convergence 8. . . . . . .Copyright ©1995 by the Society for Industrial and Applied Mathematics. . . . . . . . . . . . 8. . . . . . . . . . . . . . . . . .1 Inverse tangent function. . . . . . . . . . . . 127 7. .3 Broyden–Armijo. .3 Implementation of the Armijo rule . 8. . . . . . . . . . . .com/SIAM/FR16.4 Examples for Newton–Armijo . . . .2 Broyden’s method.3. .ecsecurehost. . . . . . . . . 128 Exercises on Broyden’s method . . . . . . . . . . .4. .4. . . . . . .4. . .2 Convectiondiﬀusion equation. . . . .html. . . . . . . . . . . . . . . . .2 Analysis of the Armijo rule . . . . . . . . . 8. . . . . . . . .5 Exercises on global convergence . . . . . .2 Nonlinear problems. 132 . . . . . . . . CONTENTS ix 7. . . . .1 Linear problems. . . . . This electronic version is for personal use and may not be duplicated or distributed. 8. . . . . . . . . . . . .3. . . . . . . . . . . . 8.4. 8. . . . . . . . . . . . . . . . . . . 8. . Bibliography Index Buy this book from SIAM at http://www. . . . . . . . . . . . . . .4. .1 Polynomial line searches. . . . . . . . . . . .1 Single equations . 8. . . . . . . . . .
This electronic version is for personal use and may not be duplicated or distributed.html.ecsecurehost.com/SIAM/FR16. . x CONTENTS Buy this book from SIAM at http://www.Copyright ©1995 by the Society for Industrial and Applied Mathematics.
. [105]. We assume that the reader is familiar with elementary numerical analysis. we have selected for coverage mostly algorithms and methods of analysis which extend directly to the inﬁnitedimensional case and whose convergence can be thoroughly analyzed.Copyright ©1995 by the Society for Industrial and Applied Mathematics. Inc.1 on an Apple Macintosh Powerbook 180) and the MATLAB environment is an excellent one for getting experience with the algorithms. Our approach is to focus on a small number of methods and treat them in depth. It may also be used as a textbook for introductory courses in nonlinear equations or iterative methods or as source material for an introductory course in numerical analysis at the graduate level. The computational examples in this book were done with MATLAB (version 4.0a on various SUN SPARCstations and version 4.html. for doing the exercises. xi Buy this book from SIAM at http://www. For example. and for smalltomedium scale production work. or [184]. This electronic version is for personal use and may not be duplicated or distributed.1 MATLAB codes for many of the algorithms are available by anonymous ftp. and the central ideas of direct methods for the numerical solution of dense linear systems as described in standard texts such as [7]. The analysis of Broyden’s method presented in Chapter 7 and the implementations presented in Chapters 7 and 8 are diﬀerent from the classical ones and also extend directly to an inﬁnitedimensional setting. Preface This book on iterative methods for linear and nonlinear equations can be used as a tutorial and a reference by anyone who needs to solve nonlinear systems of equations or large linear systems. the matrixfree formulation and analysis for GMRES and conjugate gradient is almost unchanged in an inﬁnitedimensional setting. The computational examples and exercises focus on discretizations of inﬁnitedimensional problems such as integral and diﬀerential equations. Though this book is written in a ﬁnitedimensional setting. We present a limited number of computational examples.com/SIAM/FR16. These examples are intended to provide results that can be used to validate the reader’s own implementations and to give a sense of how the algorithms perform. The examples are not designed to give a complete picture of performance or to be a suite of test problems. A good introduction to the latest 1 MATLAB is a registered trademark of The MathWorks.ecsecurehost. linear algebra.
Moody Chu. [43] is also a useful resource. 1998 Buy this book from SIAM at http://www. xii PREFACE version (version 4. Belinda King. Yue Zhang. DMS9024622 and DMS9321938. Homer Walker. North Carolina January.Copyright ©1995 by the Society for Industrial and Applied Mathematics. Debbie Lockhart. Andreas Griewank. and pointers to the literature. Gordon Wade. . I am especially indebted to Jim Banoczi. Steve Campbell. Zhaqing Xue. Jeﬀ Scroggs. Ekkehard Sachs.com/SIAM/FR16. Steve Wright. and conclusions or recommendations expressed in this material are those of the author and do not necessarily reﬂect the views of the National Science Foundation or of the Air Force Oﬃce of Scientiﬁc Research.2) of MATLAB is the MATLAB Primer [178]. Howard Elman. Ilse Ipsen. Casey Miller. T. Parts of this book are based upon work supported by the National Science Foundation and the Air Force Oﬃce of Scientiﬁc Research over several years. Jim Epperson. Joseph Skudlarek. Lea Jenkins. Tony Choi. Vickie Kearn. ﬁndings. and an anonymous reviewer for their contributions and encouragement. This electronic version is for personal use and may not be duplicated or distributed. Most importantly. I thank ChungWei Ng and my parents for over one hundred and ten years of patience and support. suggestions. Kelley Raleigh.html.ecsecurehost. Carl Meyer. C. Laura Helfrich. Many of my students and colleagues discussed various aspects of this project with me and provided important corrections. the general algorithmic descriptions or even the MATLAB codes can easily be translated to another language. If the reader has no access to MATLAB or will be solving very large problems. Any opinions. Mike Tocci. most recently under National Science Foundation Grant Nos. Jeﬀ Butera. ideas.
ecsecurehost. How to get the software A collection of MATLAB codes has been written to accompany this book.com xiii Buy this book from SIAM at http://www. The MATLAB codes can be obtained by anonymous ftp from the MathWorks server ftp.com/SIAM/FR16. http://www.mathworks. This electronic version is for personal use and may not be duplicated or distributed.com or from SIAM’s World Wide Web site http://www. from the MathWorks World Wide Web site.html.siam.Copyright ©1995 by the Society for Industrial and Applied Mathematics.html One can obtain MATLAB from The MathWorks. 24 Prime Park Way Natick. MA 01760.com in the directory pub/books/kelley. Phone: (508) 6531415 Fax: (508) 6532997 Email: info@mathworks.mathworks.org/books/kelley/kelley. Inc. .com WWW: http://www.mathworks.
1. where κ(A) is understood to be inﬁnite if A is singular.1.1. Let · be a norm on RN .com/SIAM/FR16. Recall that the condition number of A relative to the norm κ(A) = A A−1 . Chapter 1 Basic Concepts and Stationary Iterative Methods 1. x =1 where A is a nonsingular N × N matrix. x∗ = A−1 b ∈ RN is to be found. We will denote the ith component of a vector x by (x)i (note the parentheses) and the ith component of xk by (xk )i . In this chapter · will denote a norm on RN as well as the induced matrix norm. b ∈ RN is given. We will write linear equations as (1. The induced matrix norm of an N × N matrix A is deﬁned by A = max Ax .ecsecurehost.1) Ax = b. Review and notation We begin by setting notation and reviewing some ideas from numerical linear algebra that we expect the reader to be familiar with. This electronic version is for personal use and may not be duplicated or distributed. Definition 1. .html.Copyright ©1995 by the Society for Industrial and Applied Mathematics. Throughout this chapter x will denote a potential solution and {xk }k≥0 the sequence of iterates. An excellent reference for the basic ideas of numerical linear algebra and direct methods for linear equations is [184]. and Induced norms have the important property that Ax ≤ A x . We will rarely need to refer to individual components of vectors. If · · is is the lp norm x p = N j=1 1/p (x)i  p 3 Buy this book from SIAM at http://www.
4 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS we will write the condition number as κp . particularly when the linear iteration is being used as part of a nonlinear solver. r0 as asserted.2) depends on the initial iterate and may result in unnecessary work when the initial iterate is good and a poor result when the initial iterate is far from the solution.2) which can be related to the error e = x − x∗ in terms of the condition number. Let b.html. e0 r0 rk < τ. Most iterative methods terminate when the residual r = b − Ax is suﬃciently small.3) Proof. One termination criterion is (1. The termination criterion (1.4) < τ. Buy this book from SIAM at http://www. (1. . e0 A −1 r0 r0 Ae = A−1 r e r ≤ κ(A) . Let A be nonsingular and let x∗ = A−1 b.com/SIAM/FR16. Since r = b − Ax = −Ae we have e = A−1 Ae ≤ A−1 and r0 = Ae0 ≤ A e0 . Lemma 1.ecsecurehost.Copyright ©1995 by the Society for Industrial and Applied Mathematics.4) are the same when x0 = 0. Hence e A−1 r r ≤ = κ(A) .1. x0 ∈ RN . b The two conditions (1. which is a common choice.1. This electronic version is for personal use and may not be duplicated or distributed. x. For this reason we prefer to terminate the iteration when rk (1.2) and (1.
If M is an N × N matrix with M < 1 then I − M is nonsingular and 1 (I − M )−1 ≤ (1. Iterative methods of this form are called stationary iterative methods because the transition from xk to xk+1 does not depend on the history of the iteration.7) M is an N ×N matrix called the iteration matrix.7) xk+1 = M xk + c.1) as a linear ﬁxedpoint iteration. Since M Sk + I = Sk+1 .ecsecurehost. The Banach Lemma and approximate inverses The most straightforward approach to an iterative solution of a linear system is to rewrite (1. l=0 The partial sums Sk = form a Cauchy sequence in RN ×N . All our results are based on the following lemma. M l ≤ M norm. This electronic version is for personal use and may not be duplicated or distributed. Lemma 1.2. and to deﬁne the Richardson iteration (1.8) . In (1.Copyright ©1995 by the Society for Industrial and Applied Mathematics. . because m l=k+1 · is a matrix norm that is induced by a vector 1 − M m−k 1− M Sk − Sm ≤ M l = M k+1 →0 as m.html. Buy this book from SIAM at http://www. BASIC CONCEPTS 5 1. k → ∞.1.5) x = (I − A)x + b. we must have M S + I = S and hence (I − M )S = I. We will discuss more general methods in which {xk } is given by (1. say to S. 1− M Proof. We will show that I − M is nonsingular and that (1.6) xk+1 = (I − A)xk + b. One way to do this is to write Ax = b as (1.2. Hence the sequence Sk converges. This proves that I − M is nonsingular and that S = (I − M )−1 .8) holds by showing that the series ∞ M l = (I − M )−1 . Hence l Ml .com/SIAM/FR16. k l=0 Ml To see this note that for all m > k m l=k+1 Sk − Sm ≤ Now. The Krylov methods discussed in Chapters 2 and 3 are not stationary iterative methods.
ecsecurehost. If M < 1 then the iteration (1. inequality (1. termination decisions based on the Buy this book from SIAM at http://www. If A and B are N × N matrices and B is an approximate inverse of A. Let M = I − BA. 6 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS Noting that (I − M )−1 ≤ ∞ l=0 M l = (1 − M )−1 .1. 1 − I − BA B I − BA . 1 − I − BA B −1 ≤ A . 1 − I − BA A − B −1 ≤ Proof. (1. 1 − I − BA A I − BA . Definition 1. B is an approximate inverse of A if I − BA < 1.1) by a matrix B BAx = Bb so that convergence of iterative methods is improved.com/SIAM/FR16. Hence both A and B are nonsingular.2. proves (1. the matrices B that allow us to apply the Banach lemma and its corollary are called approximate inverses.1.8) (1. By (1.2.1. has the form xk+1 = (I − BA)xk + Bb.8) and completes the proof.1 indicates.10) A−1 − B ≤ A−1 ≤ B . A − B −1 = B −1 (I − BA).html. By Lemma 1.12) If the norm of I − BA is small.Copyright ©1995 by the Society for Industrial and Applied Mathematics.1.9). but.1 is that Richardson iteration (1. Richardson iteration. as Lemma 1.2.11) A−1 B −1 = (I − M )−1 ≤ 1 1− M = 1 .2.9). This electronic version is for personal use and may not be duplicated or distributed. The following theorem is often referred to as the Banach Lemma. then not only will the iteration converge rapidly. 1 − I − BA Since A−1 = (I − M )−1 B. To complete the proof note that A−1 − B = (I − BA)A−1 .7) converges to x = (I − M )−1 c for all initial iterates x0 . It is sometimes possible to precondition a linear equation by multiplying both sides of (1. The following corollary is a direct consequence of Lemma 1.2.9) and (1. In the context of Richardson iteration. Corollary 1. Theorem 1. The second part follows in a similar way from B −1 = A(I − M )−1 . then A and B are both nonsingular and (1. A consequence of Corollary 1. and use (1. . preconditioned with approximate inversion.1.2.11) implies the ﬁrst part of (1.1 I − M = I − (I − BA) = BA is nonsingular.6) will converge if I − A < 1.
[111]. Let M be an N ×N matrix. and Exercise 1. The spectral radius of an N × N matrix A is (1. The iteration (1. and sucessive overrelaxation (SOR) iteration are based on splittings of A of the form A = A1 + A2 .2.com/SIAM/FR16. The proof is left as an exercise. [126]. Theorem 1. We state that converse as a theorem and refer to [105] for a proof. and related problems [15]. Hence the performance of the iteration is not completely described by M .2.14) has a partial converse that allows us to completely describe the performance of iteration (1.14) ρ(A) ≤ A for any induced matrix norm.1.13) ρ(A) = max λ = lim An λ∈σ(A) n→∞ 1/n . Multigrid methods [19]. can also be interpreted in this light.2 related convergence of the iteration (1. However the norm of M could be small in some norms and quite large in others.7) in terms of spectral radius.5. The spectral radius of M is independent of any particular matrix norm of M . polynomial preconditioning. The term on the righthand side of the second equality in (1.3.1 is a characterization of convergent stationary iterative methods.Copyright ©1995 by the Society for Industrial and Applied Mathematics. Theorem 1.7) converges for all c ∈ RN if and only if ρ(M ) < 1.3. BASIC CONCEPTS 7 preconditioned residual Bb − BAx will better reﬂect the actual error. It is clear. Lemma 1.5). Matrix splittings and classical stationary iterative methods There are ways to convert Ax = b to a linear ﬁxedpoint iteration that are diﬀerent from (1.3. Methods such as Jacobi. [169].1. that (1.ecsecurehost. in fact. . 1. Let A be an N × N matrix. The spectral radius The analysis in § 1. We let σ(A) denote the set of eigenvalues of A. Buy this book from SIAM at http://www. 1. A consequence of Theorem 1.3. [6]. We mention one other approach.13) is the limit used by the radical test for convergence of the series An .4. This electronic version is for personal use and may not be duplicated or distributed. [179].html. [100]. Then for any > 0 there is a norm · on RN such that ρ(A) > A − . [99]. The concept of spectral radius allows us to make a complete description. The inequality (1. which tries to approximate A−1 by a polynomial in A [123].7) to the norm of the matrix M . [117].1. Definition 1. Gauss–Seidel.3.1. This method is a very eﬀective technique for solving diﬀerential equations. integral equations.
Note that A1 is diagonal and hence trivial to invert. Proof. Hence MJAC ∞ < 1 and the iteration converges to the unique solution of x = M x + D−1 b. or [200]. Letting (xk )i denote the ith component of the kth iterate we can express Jacobi iteration concretely as (1. The limited description in this section is intended as a review that will set some notation to be used later. This electronic version is for personal use and may not be duplicated or distributed.4. Note that the ith row sum of M = MJAC satisﬁes N j=1 mij  = j=i aij  aii  < 1. Then Ax = b is converted to the ﬁxedpoint problem x = A−1 (b − A2 x). A2 = L + U.ecsecurehost. This leads to the iteration matrix MJAC = −D−1 (L + U ).15) converges to x∗ = A−1 b for all b. [105].html. [193]. 1 For a detailed description of the classical stationary iterative methods the reader may consult [89].Copyright ©1995 by the Society for Industrial and Applied Mathematics. These methods are usually less eﬃcient than the Krylov methods discussed in Chapters 2 and 3 or the more modern stationary methods based on multigrid ideas.1. As a ﬁrst example we consider the Jacobi iteration that uses the splitting A1 = D. Theorem 1. However the classical methods have a role as preconditioners.15) (xk+1 )i = a−1 bi ii − j=i aij (xk )j . Buy this book from SIAM at http://www. 8 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS where A1 is a nonsingular matrix constructed so that equations with A1 as coeﬃcient matrix are easy to solve. We present only one convergence result for the classical stationary iterative methods. where D is the diagonal of A and L and U are the (strict) lower and upper triangular parts. 0< j=i Then A is nonsingular and the Jacobi iteration (1. . 1 The analysis of the method is based on an estimation of the spectral radius of the iteration matrix M = −A−1 A2 . Also I − M = D−1 A is nonsingular and therefore A is nonsingular. Let A be an N × N matrix and assume that for all 1≤i≤N (1. [144].16) aij  < aii .com/SIAM/FR16.
This results in the iteration (xk+1 )i = a−1 bi − ii the splitting and iteration matrix aij (xk+1 )j − j<i j>i aij (xk )j . BASIC CONCEPTS 9 Gauss–Seidel iteration overwrites the approximate solution with the new value as soon as it is computed. A2 = L.html.17) For symmetric Gauss–Seidel (1. [8]. and iteration matrix MBGS = −(D + U )−1 L. For the Jacobi iteration. Backward Gauss–Seidel begins the update of x with the N th coordinate rather than the ﬁrst.Copyright ©1995 by the Society for Industrial and Applied Mathematics. and the papers cited therein provide additional reading on this topic. A symmetric Gauss–Seidel iteration is a forward Gauss–Seidel iteration followed by a backward Gauss–Seidel iteration. resulting in the splitting A1 = D + U. If A is symmetric then U = LT . References [200]. and hence A−1 y is easy to compute for vectors 1 y. (1.ecsecurehost. Buy this book from SIAM at http://www. MGS = −(D + L)−1 U. one wants to write the stationary method as a preconditioned Richardson iteration. The performance can be dramatically improved with a good choice of ω but still is not competitive with Krylov methods. A2 = U. the iteration depends on the ordering of the unknowns. Note also that. BJAC = D−1 . . Note that A1 is lower triangular. From the point of view of preconditioning.com/SIAM/FR16. The successive overrelaxation iteration modiﬁes Gauss–Seidel by adding a relaxation parameter ω to construct an iteration with iteration matrix MSOR = (D + ωL)−1 ((1 − ω)D − ωU ).18) BSGS = (D + LT )−1 D(D + L)−1 . This electronic version is for personal use and may not be duplicated or distributed. That means that one wants to ﬁnd B such that M = I − BA and then use B as an approximate inverse. A further disadvantage is that the choice of ω is often diﬃcult to make. [193]. [89]. A1 = D + L. This leads to the iteration matrix MSGS = MBGS MGS = (D + U )−1 L(D + L)−1 U. In that event MSGS = (D + U )−1 L(D + L)−1 U = (D + LT )−1 L(D + L)−1 LT . unlike Jacobi iteration.
Exercises on stationary iterative methods 1.3.2. Buy this book from SIAM at http://www.ecsecurehost.5.html.1.4. 10 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS 1. Show that if ρ(M ) ≥ 1 then there are x0 and c such that the iteration (1.7) fails to converge.5.Copyright ©1995 by the Society for Industrial and Applied Mathematics. 1. 1. 1. .com/SIAM/FR16.5. Prove Theorem 1.18).2. Verify equality (1. Show that if A is symmetric and positive deﬁnite (that is AT = A and xT Ax > 0 for all x = 0) that BSGS is also symmetric and positive deﬁnite.5. This electronic version is for personal use and may not be duplicated or distributed.5.3.
and [78]. Krylov methods and the minimization property In the following two chapters we describe some of the Krylov space methods for linear equations. Krylov methods do not have an iteration matrix. Ar0 . It has come into wide use over the last 15 years as an iterative method and has generally superseded the Jacobi–Gauss–Seidel–SOR family of methods. As in Chapter 1.6. minimize. Brief descriptions of several of these methods and their properties are in § 3. some measure of error over the aﬃne space x0 + Kk . Recall that A is symmetric if A = AT and positive deﬁnite if xT Ax > 0 for all x = 0.Copyright ©1995 by the Society for Industrial and Applied Mathematics. So {rk }k≥0 will denote the sequence of residuals rk = b − Axk .com/SIAM/FR16. The conjugate gradient (CG) iteration was invented in the 1950s [103] as a direct method.1. Unlike the stationary iterative methods. 11 Buy this book from SIAM at http://www. There are other Krylov methods that are not as well understood as CG or GMRES. This electronic version is for personal use and may not be duplicated or distributed. CG is intended to solve symmetric positive deﬁnite (spd) systems.ecsecurehost. .html. . . . . we assume that A is a nonsingular N × N matrix and let x∗ = A−1 b. where x0 is the initial iterate and the kth Krylov subspace Kk is Kk = span(r0 . at the kth iteration. [12]. conjugate gradient and GMRES. Ak−1 r0 ) for k ≥ 1. The residual is r = b − Ax. Chapter 2 Conjugate Gradient Iteration 2. The two such methods that we’ll discuss in depth.
We state this as a lemma. Note that x − x∗ 2 A = (x − x∗ )T A(x − x∗ ) = xT Ax − xT Ax∗ − (x∗ )T Ax + (x∗ )T Ax∗ .1. Since A is symmetric and Ax∗ = b −xT Ax∗ − (x∗ )T Ax = −2xT Ax∗ = −2xT b. and at the very end. If xk minimizes φ over S then xk also minimizes x∗ − x A = r A−1 over S.com/SIAM/FR16. Since A is spd we may deﬁne a norm (you should check that this is a norm) by (2. ˜ Minimizing φ over any subset of RN is the same as minimizing x − x∗ A over that subset.html. After that we describe termination criterion. the implementation.1) x A = √ xT Ax. Proof.ecsecurehost. Buy this book from SIAM at http://www. and in the section on GMRES that follows. . Lemma 2.1. Therefore x − x∗ 2 A = 2φ(x) + (x∗ )T Ax∗ . In this section. Since (x∗ )T Ax∗ is independent of x. We will use this lemma in the particular case that S = x0 + Kk for some k. 12 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS In this section we assume that A is spd.Copyright ©1995 by the Society for Industrial and Applied Mathematics. Let S ⊂ RN . we begin with a description of what the algorithm does and the consequences of the minimization property of the iterates.2) 1 φ(x) = xT Ax − xT b 2 over x0 + Kk . This electronic version is for personal use and may not be duplicated or distributed. minimizing φ is equivalent to minimizing x − x∗ 2 and hence to minimizing x − x∗ A . · A is called the Anorm. The development in these notes is diﬀerent from the classical work and more like the analysis for GMRES and CGNR in [134]. A If e = x − x∗ then e 2 A = eT Ae = (A(x − x∗ ))T A−1 (A(x − x∗ )) = b − Ax 2 A−1 and hence the Anorm of the error is also the A−1 norm of the residual. preconditioning. performance. Note that if φ(˜) is the minimal value (in RN ) then x ∇φ(˜) = A˜ − b = 0 x x and hence x = x∗ . The kth iterate xk of CG minimizes (2.
The spectral theorem for spd matrices asserts that A = U ΛU T .com/SIAM/FR16.3) x∗ − xk A ≤ x∗ − w A for all w ∈ x0 + Kk . where U is an orthogonal matrix whose columns are the eigenvectors of A and Λ is a diagonal matrix with the positive eigenvalues of A on the diagonal. Consequences of the minimization property Lemma 2.5) x = xT Ax = A1/2 x 2 . 2 Buy this book from SIAM at http://www.2.Copyright ©1995 by the Society for Industrial and Applied Mathematics. we can express x∗ − w as x∗ − w = x∗ − x0 − Since Ax∗ = b we have r0 = b − Ax0 = A(x∗ − x0 ) and therefore x∗ − w = x∗ − x0 − where the polynomial p(z) = 1 − k−1 j=0 k−1 j=0 k−1 j=0 γj Aj r0 .4) x∗ − xk A = p∈Pk . .1. Since U U T = U T U = I by orthogonality of U . γj z j+1 has degree k and satisﬁes p(0) = 1. CONJUGATE GRADIENT ITERATION 13 2.4) Pk denotes the set of polynomials of degree k. 2 A Deﬁne A1/2 = U Λ1/2 U T and note that (2.1 implies that since xk minimizes φ over x0 + Kk (2. This electronic version is for personal use and may not be duplicated or distributed.ecsecurehost. In (2.html. Hence p(A) = U p(Λ)U T . γj Aj+1 (x∗ − x0 ) = p(A)(x∗ − x0 ). Since any w ∈ x0 + Kk can be written as w= k−1 j=0 γj Aj r0 + x0 for some coeﬃcients {γj }. Hence (2. we have Aj = U Λj U T .p(0)=1 min p(A)(x∗ − x0 ) A.
based on information on σ(A).1.com/SIAM/FR16. It is best to regard CG as an iterative method. Buy this book from SIAM at http://www. and one cannot aﬀord to perform N iterations. together with (2.1. Let A be spd and let {xk } be the CG iterates.2. p Note that our test polynomial had the eigenvalues of A as its roots. Theorem 2. This electronic version is for personal use and may not be duplicated or distributed. In that way we showed (in the absence of all roundoﬀ error!) that CG terminated in ﬁnitely many iterations with the exact solution. One simple application of (2. Let k be given and let {¯k } be any kth degree polynomial such that pk (0) = 1.7) easy to evaluate. ¯ xN − x∗ A ≤ x0 − x∗ A z∈σ(A) max ¯(z) = 0. When doing that we seek to terminate the iteration when some speciﬁed error tolerance is reached.1. p z∈σ(A) We will refer to the polynomial pk as a residual polynomial [185].2.7) xk − x∗ x0 − x∗ A A ≤ max ¯k (z). for any x ∈ RN and p(A)x A = A1/2 p(A)x 2 ≤ p(A) 2 A1/2 x 2 ≤ p(A) 2 x A.6) xk − x∗ A ≤ x0 − x∗ A p∈Pk .4) implies that (2. This is not as good as it sounds. Let A be spd. that make either the middle or the right term in (2. Proof. by (2.Copyright ©1995 by the Society for Industrial and Applied Mathematics. Then the CG algorithm will ﬁnd the solution within N iterations. since in most applications the number of unknowns N is very large. As a test polynomial. Hence.8) Pk = {p  p is a polynomial of degree k and p(0) = 1. ¯ ¯ p ∈ PN because p has degree N and p(0) = 1.7) and the fact ¯ that p vanishes on σ(A). Then p ¯ (2.2.7) is to show how the CG algorithm can be viewed as a direct method.6).html.ecsecurehost. The set of kth degree residual polynomials is (2. The following corollary is an important consequence of (2.} In speciﬁc contexts we try to construct sequences of residual polynomials. let i=1 p(z) = ¯ N i=1 (λi − z)/λi . This leads to an upper estimate for the number of CG iterations required to reduce the Anorm of the error to a given tolerance. This.p(0)=1 z∈σ(A) min max p(z). ¯ Definition 2. . Here σ(A) is the set of all eigenvalues of A. 14 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS Hence. Let {λi }N be the eigenvalues of A. Corollary 2.
Then the CG iteration terminates in at most k iterations. 2. Theorem 2.9) b − Axk 2 ≤ η b 2 . CONJUGATE GRADIENT ITERATION 15 In the two examples that follow we look at some other easy consequences of (2. One typical criterion is small (say ≤ η) relative residuals. but rather terminate once some criterion has been satisﬁed.4) and the fact that x0 = 0 that xk − x∗ A ≤ p(A)x∗ ¯ A = 0.3. . Termination of the iteration In practice we do not run the CG iteration until an exact solution is found. are based on (2. If the spectrum of A has fewer than N points.com/SIAM/FR16. By the spectral theorem l=1 x∗ = k l=1 (γl /λil )uil . Moreover. ¯ So.2. we have by (2. Theorem 2. This electronic version is for personal use and may not be duplicated or distributed.2. Let A be spd with eigenvectors {ui }N .ecsecurehost. Proof. Then the CG iteration for Ax = b with x0 = 0 will terminate in at most k iterations. This means that we terminate the iteration after (2. Let {λil } be the eigenvalues of A associated with the eigenvectors {uil }k . Let b be a linear i=1 combination of k of the eigenvectors of A b= k l=1 γl uil . Let A be spd. Buy this book from SIAM at http://www. p(z) = ¯ k l=1 (λil − z)/λil .Copyright ©1995 by the Society for Industrial and Applied Mathematics.html. We use the residual polynomial. p(λil ) = 0 for 1 ≤ l ≤ k and ¯ hence p(A)x∗ = ¯ k l=1 p(λil )γl /λil uil = 0. however. This completes the proof.3.2. we can use a similar technique to prove the following theorem.7). ¯ One can easily verify that p ∈ Pk . Assume that there are exactly k ≤ N distinct eigenvalues of A.7) and therefore estimate the reduction in the relative Anorm of the error. The error estimates that come from the minimization property.
= z T Az = (A1/2 z)T (A1/2 z) = A1/2 z 2 2 which proves (2. (2. then A 2 = λ1 and A−1 2 = 1/λN .10) and (2. λN .12). Proof. i = λN ≤ Az ≤ λ1 N T 2 i=1 λi (ui z) 2 2 = N 2 T 2 i=1 λi (ui z) N T 2 i=1 λi (ui z) = λ1 A1/2 z 2 .11) Proof. . We will do this in the next two lemmas and then illustrate the point with an example. 16 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS Our next task is to relate the relative residual in the Euclidean norm to the relative error in the Anorm.11) twice. we must not only use (2.10) complete the proof.13) follows directly from (2.Copyright ©1995 by the Society for Industrial and Applied Mathematics.2.com/SIAM/FR16. Lemma 2. Therefore. A1/2 z 2 = z A (2. This electronic version is for personal use and may not be duplicated or distributed. Then for all z ∈ RN .12) and (2. The equality on the left of (2. . To obtain the inequality on the right of (2.7) to predict when the relative Buy this book from SIAM at http://www. Let ui be a unit eigenvector corresponding to λi . b − Axk b − Ax0 2 2 = A(x∗ − xk ) A(x∗ − x0 ) 2 2 ≤ λ1 x∗ − xk λN x∗ − x0 A A as asserted.10).html. ﬁrst recall that if A = U ΛU T is the spectral decomposition of A and we order the eigenvalues such that λ1 ≥ λ2 ≥ .1. Clearly z 2 A λN 1/2 z A ≤ Az 2 ≤ λ1 1/2 z A. We may write A = U ΛU T as Az = Hence λN A1/2 z 2 2 N i=1 λi (uT z)ui . 2 Taking square roots and using (2. to predict the performance of the CG iteration based on termination on small relative residuals. λN > 0.3.13) b r0 2 2 b − Axk b 2 b − Axk b 2 2 = b − Axk b − Ax0 2 2 ≤ κ2 (A) xk − x∗ x∗ − x0 xk − x∗ x∗ − x0 A A A A 2 ≤ κ2 (A) r0 b 2 2 .ecsecurehost. Let A be spd with eigenvalues λ1 ≥ λ2 ≥ . So κ2 (A) = λ1 /λN . .12) is clear and (2.12).10) and (2. using (2. So.3. . Lemma 2. .
05. that is. CONJUGATE GRADIENT ITERATION 17 Anorm error is small.ecsecurehost. the sharpest possible.3.com/SIAM/FR16. So. If we let pk (z) = ¯ (10 − z)k /10k . √ κ2 (A) − 1 11 − 3 ≈ . ∞). Assume that x0 = 0 and that the eigenvalues of A are contained in the interval (9. We consider a very simple example. To use Lemma 2. One can obtain a more precise estimate by using a polynomial other than pk in the upper estimate for the righthand side of (2.7) to get ¯ xk − x∗ It is easy to see that Hence. √ √ Hence. that is. the size of the Anorm of the error will be reduced by a factor of 10−3 when 10−k ≤ 10−3 . then pk ∈ Pk .html. Note that it is always the case that the spectrum of a spd matrix is contained in the interval [λN . p 9≤z≤11 max ¯k (z) = 10−k . p ≤ x∗ A 10 −k . b 2 So. after k iterations we have Axk − b 2 √ ≤ 11 × 10−k /3.2 we simply note that κ2 (A) ≤ 11/9. is (2. A result from [48] (see also [45]) that is.15) xk − x ∗ A ≤ 2 x0 − x ∗ A κ2 (A) − 1 κ2 (A) + 1 k .Copyright ©1995 by the Society for Industrial and Applied Mathematics. 11). This electronic version is for personal use and may not be duplicated or distributed. when k ≥ 3. In the case of the above example. Hence.3. ≤√ κ2 (A) + 1 11 + 3 Buy this book from SIAM at http://www. but also use Lemma 2.14) xk − x∗ A A ≤ x∗ A 9≤z≤11 max ¯k (z). we can estimate κ2 (A) by κ2 (A) ≤ 11/9. after k iterations (2.7). in one sense.2 to relate small Anorm errors to small relative residuals. . This means that we may apply (2. when k ≥ 4. since ( x − 1)/( x + 1) is an increasing function of x on the interval (1. the size of the relative residual will be reduced by a factor of 10−3 when √ 10−k ≤ 3 × 10−3 / 11. λ1 ] and that κ2 (A) = λ1 /λN .
3 = 12. 1. the condition number can be quite large. Exercise 2. 400).91)k < 10−3 or when k > − log10 (2000)/ log10 (. (2.91)k .5.5 also covers this point.ecsecurehost.04 = 82. This would indicate fairly slow convergence. We may have more precise information than a single interval containing σ(A).6.15) gives xk − x∗ x∗ A A ≤ 2 × (19/21)k ≈ 2 × (. Based on this information the best estimate of the condition number of A is κ2 (A) ≤ 400.91) ≈ 3.2) = 4. However. but CG can perform very well.15) can be very pessimistic. Assume that x0 = 0 and the eigenvalues of A lie in the two intervals (1.15) would predict that xk − x∗ when 2 × (. which also predicts termination within three iterations. If the eigenvalues cluster in a small number of intervals. When we do. p which is a sharper estimate on convergence.9).05) ≈ 3. . the CG iteration will converge very rapidly. The estimate based on the clustering gives convergence in 3k iterations when (. (1.5) and (399.Copyright ©1995 by the Society for Industrial and Applied Mathematics.2)k ≤ 10−3 or when k > −3/ log10 (.3. which. A ≤ 10−3 x∗ A.25)k × 4002k max ¯3k (z) ≤ (. This electronic version is for personal use and may not be duplicated or distributed. the estimate in (2. From the results above one can see that if the condition number of A is near one.3 ≈ 2.25/1. Even if the condition number Buy this book from SIAM at http://www. when inserted into (2. 18 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS Therefore (2.html.8.15) would predict a reduction in the size of the Anorm error by a factor of 10−3 when 2 × .25)k = (. In fact.com/SIAM/FR16.05k < 10−3 or when k > − log10 (2000)/ log10 (. if we use as a residual polynomial p3k ∈ P3k ¯ p3k (z) = ¯ It is easy to see that z∈σ(A) (1.3/1.3/.2)k .15) predicts 83 iterations and the clustering analysis 15 (the smallest integer multiple of 3 larger than 3 × 4.25 − z)k (400 − z)2k . We will illustrate this with an example. Hence (2.
as a side eﬀect.1).com/SIAM/FR16.ecsecurehost. αk+1 is easy to compute from the minimization property of the iteration.Copyright ©1995 by the Society for Industrial and Applied Mathematics.17) αk+1 = pT (b − Axk ) p T rk k+1 = T k+1 .16) can be written as pT Axk + αpT Apk+1 − pT b = 0 k+1 k+1 k+1 leading to (2. the iteration will perform well if the eigenvalues are clustered in a few small intervals. Buy this book from SIAM at http://www. Implementation The implementation of CG depends on the amazing fact that once xk has been determined. CONJUGATE GRADIENT ITERATION 19 is large. pT Apk+1 pk+1 Apk+1 k+1 If xk = xk+1 then the above analysis implies that α = 0. We show that this only happens if xk is the solution. In the context of CG. Recalling that ∇φ(x) = Ax − b = −r we have (2. Equation (2. Let A be spd and let {xk } be the conjugate gradient iterates.. Now. then rk = rk+1 .4. we have.16) =0 dα for the correct choice of α = αk+1 . proves that (if we deﬁne p0 = 0) pT rk = 0 for all 0 ≤ l < k ≤ n. We discuss this in § 2. 2. We used this term before in the context of Richardson iteration and accomplished the goal by multiplying A by an approximate inverse.4.8. if xk = xk+1 . . 2 The next lemma characterizes the search direction and.5.1. Once pk+1 has been found. Since rl ∈ Kk for all l < k (see Exercise 2.4. Since xk minimizes φ on x0 + Kk .e. Lemma 2.19) T ∇φ(xk )T ξ = −rk ξ = 0 for all ξ ∈ Kk . This electronic version is for personal use and may not be duplicated or distributed.18) Proof. either xk = x∗ or a search direction pk+1 = 0 can be found very cheaply so that xk+1 = xk + αk+1 pk+1 for some scalar αk+1 .18). Then T rk rl = 0 for all 0 ≤ l < k. (2. for any ξ ∈ Kk dφ(xk + tξ) = ∇φ(xk + tξ)T ξ = 0 dt at t = 0. Lemma 2. this proves (2. The transformation of the problem into one with eigenvalues clustered near one (i. In fact dφ(xk + αpk+1 ) (2. easier to solve) is called preconditioning.html. such a simple approach can destroy the symmetry of the coeﬃcient matrix and a more subtle implementation is required. unless the l iteration terminates prematurely.1 then implies that T T rk 2 = rk rk = rk rk+1 = 0 and hence xk = x∗ .
The condition pT Aξ = 0 is called Aconjugacy of pk+1 to Kk . (2. it is.21) then imply that for all ξ ∈ Kk .2 and the fact that Kk = span(r0 . If xk = x∗ then xk+1 = xk + αk+1 pk+1 and pk+1 is determined up to a scalar multiple by the conditions (2. Let A be spd and let {xk } be the conjugate gradient iterates.2 then implies that pT Arl = 0 for 0 ≤ l ≤ k − 2. k+1 It only remains to solve for βk+1 so that pT Ark−1 = 0. k+1 k If l ≤ k − 2. . Let A be spd and assume that rk = 0. Now.com/SIAM/FR16. trivial.19) and (2. pT Aξ = 0 for all ξ ∈ Kk .4.24) T βk+1 = −rk Ark−1 /pT Ark−1 k provided pT Ark−1 = 0. Trivially k+1 (2. Let pk+1 be given by (2. Since Kk ⊂ Kk+1 . Since k rk = rk−1 − αk Apk Buy this book from SIAM at http://www. By Lemma 2. k+1 This uniquely speciﬁes the direction of pk+1 as (2.20) pk+1 ∈ Kk+1 .Copyright ©1995 by the Society for Industrial and Applied Mathematics. k+1 Proof. This electronic version is for personal use and may not be duplicated or distributed. up to a scalar multiple.23) then pT Arl = 0 k+1 for all 0 ≤ l ≤ k − 1.ecsecurehost.e. . then rl ∈ Kl+1 ⊂ Kk−1 .4. Deﬁne p0 = 0. a subspace of dimension one less than Kk+1 .. 20 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS Lemma 2. be expressed as pk+1 = rk + wk with wk ∈ Kk . .22) αk+1 pT Aξ = −(Axk − b)T ξ = −∇φ(xk )T ξ = 0. (2.html. we need only verify that a βk+1 can be found so that if pk+1 is given by (2. Then pk+1 = rk + βk+1 pk for some βk+1 and k ≥ 0. Then for any l ≤ k T pT Arl = rk Arl + βk+1 pT Arl .20) can. . any k+1 pk+1 satisfying (2. rk−1 ).23). in the scalar product (x. y) = xT Ay) to Kk .23) Proof.22) implies that pk+1 ∈ Kk+1 is Aorthogonal (i. (2. Lemma 2.2.1. We have the following theorem. . in fact. While one might think that wk would be hard to compute.4.4.21) ∇φ(xk+1 )T ξ = (Axk + αk+1 Apk+1 − b)T ξ = 0 for all ξ ∈ Kk . (2. Theorem 2.
com/SIAM/FR16. CONJUGATE GRADIENT ITERATION 21 2 2 we have T rk rk−1 = rk−1 − αk pT Ark−1 . . To get (2.4.1 we have (2.ecsecurehost. and (2.26). (2. An immediate consequence of (2.31) pT Apk = pT A(rk−1 + βk pk−1 ) k k = pT Ark−1 + βk pT Apk−1 = pT Ark−1 . Lemma 2. k 2 This completes the proof. Let A be spd and assume that rk = 0.2. This electronic version is for personal use and may not be duplicated or distributed. The common implementation of conjugate gradient uses a diﬀerent form for αk and βk than given in (2.27) note that pT Apk = 0 and hence (2.27) Proof. Then (2. k T Since rk rk−1 = 0 by Lemma 2.4.28) T pT rk+1 = rk rk+1 + βk+1 pT rk+1 = 0 k+1 k αk+1 = rk 2 2 pT Apk+1 k+1 rk rk−1 2 2 2.28) is that pT rk = 0 and k hence pT rk = (rk + βk+1 pk )T rk = rk 2 .html. Note that for k ≥ 0 (2.24).3.23) implies that k+1 (2. k k k βk+1 = T −rk Apk .29) k+1 2 Taking scalar products of both sides of rk+1 = rk − αk+1 Apk+1 with pk+1 and using (2.Copyright ©1995 by the Society for Industrial and Applied Mathematics.30) Also note that (2.29) gives T 0 = pT rk − αk+1 pT Apk+1 = rk k+1 k+1 2 2 − αk+1 pT Apk+1 . pT Apk k Now combine (2.25) pT Ark−1 = rk−1 2 /αk = 0. (2. 2 βk+1 = by Lemma 2.31). k+1 which is equivalent to (2. rk−1 2 2 Buy this book from SIAM at http://www.25) to get βk+1 = T −rk Apk αk .26) and (2.17) and (2.30).4.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
22
ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
Now take scalar products of both sides of rk = rk−1 − αk Apk with rk and use Lemma 2.4.1 to get rk
2 2 T = −αk rk Apk .
Hence (2.27) holds. The usual implementation reﬂects all of the above results. The goal is to ﬁnd, for a given , a vector x so that b−Ax 2 ≤ b 2 . The input is the initial iterate x, which is overwritten with the solution, the right hand side b, and a routine which computes the action of A on a vector. We limit the number of iterations to kmax and return the solution, which overwrites the initial iterate x and the residual norm. Algorithm 2.4.1. cg(x, b, A, , kmax) 1. r = b − Ax, ρ0 = r 2 , k = 1. 2 √ 2. Do While ρk−1 > b 2 and k < kmax (a) if k = 1 then p = r else β = ρk−1 /ρk−2 and p = r + βp (b) w = Ap (c) α = ρk−1 /pT w (d) x = x + αp (e) r = r − αw (f) ρk = r
2 2
(g) k = k + 1 Note that the matrix A itself need not be formed or stored, only a routine for matrixvector products is required. Krylov space methods are often called matrixfree for that reason. Now, consider the costs. We need store only the four vectors x, w, p, and r. Each iteration requires a single matrixvector product (to compute w = Ap), two scalar products (one for pT w and one to compute ρk = r 2 ), and three 2 operations of the form ax + y, where x and y are vectors and a is a scalar. It is remarkable that the iteration can progress without storing a basis for the entire Krylov subspace. As we will see in the section on GMRES, this is not the case in general. The spd structure buys quite a lot. 2.5. Preconditioning To reduce the condition number, and hence improve the performance of the iteration, one might try to replace Ax = b by another spd system with the same solution. If M is a spd matrix that is close to A−1 , then the eigenvalues
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
CONJUGATE GRADIENT ITERATION
23
of M A will be clustered near one. However M A is unlikely to be spd, and hence CG cannot be applied to the system M Ax = M b. In theory one avoids this diﬃculty by expressing the preconditioned problem in terms of B, where B is spd, A = B 2 , and by using a twosided preconditioner, S ≈ B −1 (so M = S 2 ). Then the matrix SAS is spd and its eigenvalues are clustered near one. Moreover the preconditioned system SASy = Sb has y ∗ = S −1 x∗ as a solution, where Ax∗ = b. Hence x∗ can be recovered from y ∗ by multiplication by S. One might think, therefore, that computing S (or a subroutine for its action on a vector) would be necessary and that a matrixvector multiply by SAS would incur a cost of one multiplication by A and two by S. Fortunately, this is not the case. If y k , rk , pk are the iterate, residual, and search direction for CG applied ˆ ˆ to SAS and we let ˆ ˆ ˆ ˆ xk = S y k , rk = S −1 rk , pk = S pk , and zk = S rk , then one can perform the iteration directly in terms of xk , A, and M . The reader should verify that the following algorithm does exactly that. The input is the same as that for Algorithm cg and the routine to compute the action of the preconditioner on a vector. Aside from the preconditioner, the arguments to pcg are the same as those to Algorithm cg. Algorithm 2.5.1. pcg(x, b, A, M, , kmax) 1. r = b − Ax, ρ0 = r 2 , k = 1 2 √ 2. Do While ρk−1 > b 2 and k < kmax (a) z = M r (b) τk−1 = z T r (c) if k = 1 then β = 0 and p = z else β = τk−1 /τk−2 , p = z + βp (d) w = Ap (e) α = τk−1 /pT w (f) x = x + αp (g) r = r − αw (h) ρk = rT r (i) k = k + 1 Note that the cost is identical to CG with the addition of • the application of the preconditioner M in step 2a and • the additional inner product required to compute τk in step 2b.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
24
ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
Of these costs, the application of the preconditioner is usually the larger. In the remainder of this section we brieﬂy mention some classes of preconditioners. A more complete and detailed discussion of preconditioners is in [8] and a concise survey with many pointers to the literature is in [12]. Some eﬀective preconditioners are based on deep insight into the structure of the problem. See [124] for an example in the context of partial diﬀerential equations, where it is shown that certain discretized secondorder elliptic problems on simple geometries can be very well preconditioned with fast Poisson solvers [99], [188], and [187]. Similar performance can be obtained from multigrid [99], domain decomposition, [38], [39], [40], and alternating direction preconditioners [8], [149], [193], [194]. We use a Poisson solver preconditioner in the examples in § 2.7 and § 3.7 as well as for nonlinear problems in § 6.4.2 and § 8.4.2. One commonly used and easily implemented preconditioner is Jacobi preconditioning, where M is the inverse of the diagonal part of A. One can also use other preconditioners based on the classical stationary iterative methods, such as the symmetric Gauss–Seidel preconditioner (1.18). For applications to partial diﬀerential equations, these preconditioners may be somewhat useful, but should not be expected to have dramatic eﬀects. Another approach is to apply a sparse Cholesky factorization to the matrix A (thereby giving up a fully matrixfree formulation) and discarding small elements of the factors and/or allowing only a ﬁxed amount of storage for the factors. Such preconditioners are called incomplete factorization preconditioners. So if A = LLT + E, where E is small, the preconditioner is (LLT )−1 and its action on a vector is done by two sparse triangular solves. We refer the reader to [8], [127], and [44] for more detail. One could also attempt to estimate the spectrum of A, ﬁnd a polynomial p such that 1 − zp(z) is small on the approximate spectrum, and use p(A) as a preconditioner. This is called polynomial preconditioning. The preconditioned system is p(A)Ax = p(A)b and we would expect the spectrum of p(A)A to be more clustered near z = 1 than that of A. If an interval containing the spectrum can be found, the residual polynomial q(z) = 1 − zp(z) of smallest L∞ norm on that interval can be expressed in terms of Chebyshev [161] polynomials. Alternatively q can be selected to solve a least squares minimization problem [5], [163]. The preconditioning p can be directly recovered from q and convergence rate estimates made. This technique is used to prove the estimate (2.15), for example. The cost of such a preconditioner, if a polynomial of degree K is used, is K matrixvector products for each application of the preconditioner [5]. The performance gains can be very signiﬁcant and the implementation is matrixfree.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
there are situations where this is not possible.33) then y∗ − y 2 AAT = (y ∗ − y)T (AAT )(y ∗ − y) = (AT y ∗ − AT y)T (AT y ∗ − AT y) = x∗ − x 2 2 is minimized over y0 + Kk at each iterate. for any residual polynomial pk ∈ Pk .Copyright ©1995 by the Society for Industrial and Applied Mathematics. The third. The advantages of this approach are that all the theory for CG carries over and the simple implementation for both CG and PCG can be used. one might consider solving Ax = b by applying CG to the normal equations (2. CGNR and CGNE If A is nonsingular and nonsymmetric.html. Hence the name Conjugate Gradient on the Normal equations to minimize the Residual.6. disadvantage is that one must compute the action of AT on a vector as part of the matrixvector product involving AT A.32) asserts that x∗ − x 2 AT A = (x∗ − x)T AT A(x∗ − x) = (Ax∗ − Ax)T (Ax∗ − Ax) = (b − Ax)T (b − Ax) = r 2 is minimized over x0 + Kk at each iterate. The ﬁrst is that the condition number of the coeﬃcient matrix AT A is the square of that of A. The reason for this name is that the minimization property of CG as applied to (2. There are three disadvantages that may or may not be serious. The reason for this name is that the minimization property of CG as applied to (2.32) AT Ax = AT b.33) asserts that if y ∗ is the solution to (2.33) AAT y = b and then set x = AT y to solve Ax = b. more important. [134]. we have x∗ − x 2 AT A = (x∗ − x)T AT A(x∗ − x) = A(x∗ − x) 2 2 = r 2. The analysis with residual polynomials is similar to that for CG. the analysis for CGNE is essentially the same. As above. when we consider the AT A norm of the error. . As we will see in the chapter on nonlinear problems. This approach [46] is now called CGNE [78]. 2 Hence. This electronic version is for personal use and may not be duplicated or distributed. [134]. This approach [103] is called CGNR [71].ecsecurehost. one could solve (2. We consider the case for CGNR. Alternatively. [78]. Conjugate Gradient on the Normal equations to minimize the Error. p Buy this book from SIAM at http://www. ¯ (2.com/SIAM/FR16. The second is that two matrixvector products are needed for each CG iterate since w = AT Ap = AT (Ap) in CGNR and w = AAT p = A(AT p) in CGNE. CONJUGATE GRADIENT ITERATION 25 2.34) rk 2 ≤ pk (AT A)r0 ¯ 2 ≤ r0 2 z∈σ(AT A) max ¯k (z).
described in the comment lines.ecsecurehost. MATLAB functions for the matrixvector product and (optionally) the preconditioner. Most signiﬁcantly. We discretize with a ﬁvepoint centered diﬀerence scheme with n2 points and mesh width h = 1/(n + 1). 1) = u(0. y < 1. The inputs. y ≤ 1 (Exercise 2. y) = 0. The unknowns are uij ≈ u(xi .html. y)∇u) = f (x. 26 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS There are two major diﬀerences between (2. Hence the performance of CGNR and CGNE is determined by the distribution of singular values. are the initial iterate. We consider the discretization of the partial diﬀerential equation (2.7. 2. which is the set of the squares of the singular values of A. and iteration parameters to specify the maximum number of iterations and the termination criterion. . hence we need not prove a result like Lemma 2. and deﬁne αij = −a(xi .34) and (2.3. 0 < x. Examples for preconditioned conjugate iteration In the collection of MATLAB codes we provide a code for preconditioned conjugate gradient iteration.36) = (αij + α(i+1)j )(u(i+1)j − uij ) −(α(i−1)j + αij )(uij − u(i−1)j ) + (αi(j+1) + αij )(ui(j+1) − uij ) −(αij + αi(j−1) )(uij − ui(j−1) ) Buy this book from SIAM at http://www. On return the code supplies the approximate solution x and the history of the iteration as the vector of residual norms. which corresponds exactly to the termination criterion. the right hand side vector b.35) ˙ −∇(a(x. x0 . y) = u(1. This electronic version is for personal use and may not be duplicated or distributed. One can verify [105] that the diﬀerential operator is positive deﬁnite in the Hilbert space sense and that the ﬁvepoint discretization described below is positive deﬁnite if a > 0 for all 0 ≤ x. The estimate is in terms of the l2 norm of the residual. the residual polynomial is to be maximized over the eigenvalues of AT A. We express the discrete matrixvector product as (Au)ij (2.2.Copyright ©1995 by the Society for Industrial and Applied Mathematics.7). xj ) where xi = ih for 1 ≤ i ≤ n. to reﬂect the boundary conditions.10). We set u0j = u(n+1)j = ui0 = ui(n+1) = 0. xj )h−2 /2. y < 1 subject to homogeneous Dirichlet boundary conditions u(x. 0) = u(x.8. y) on 0 < x.com/SIAM/FR16.
In Fig. designed to take advantage of fast MATLAB builtin functions. For the computations reported in this section we took a(x.5 ). The properties of CG on the preconditioned problem in the continuous case have been analyzed in [48].Copyright ©1995 by the Society for Industrial and Applied Mathematics. Note that the reduction in r is not monotone. [186]. The initial iterate was u0 = 0. We will report our results graphically. For many types of domains and boundary conditions. j ≤ n. and [187]. [188]. We expect secondorder accuracy from the method and accordingly we set termination parameter = h2 = 1/1024. See the routine fish2d in collection of MATLAB codes for an example of how to do this in MATLAB. By this we mean an operator G such that v = Gw is the solution of the discrete form of −vxx − vyy = w. For a preconditioner we use a Poisson solver. We allowed up to 100 CG iterations. however.html. . which is useful for computing the action of A on u and applying fast solvers. The reader is asked to quantify this in terms of execution times in Exercise 2. 2. which predicts decrease in e A but not necessarily in r as the iteration progresses. This is consistent with the theory. The initial iterate was the zero vector. Even the unpreconditioned iteration.1 the solid line is a plot of rk / b and the dashed line a plot of u∗ −uk A / u∗ −u0 A . The eﬀectiveness of such a preconditioner has been analyzed in [124] and some of the many ways to implement the solver eﬃciently are discussed in [99]. For the MATLAB implementation we convert freely from the representation of u as a twodimensional array (with the boundary conditions added). This example illustrates the importance of a good preconditioner. is more eﬃcient that the classical stationary iterative methods. Note that the unpreconditioned iteration is slowly convergent. This can be explained by the fact that the eigenvalues are not clustered and κ(A) = O(1/h2 ) = O(n2 ) = O(N ) and hence (2.15) indicates that convergence will be slow. This electronic version is for personal use and may not be duplicated or distributed. which is what pcgsol expects to see. and the representation as a onedimensional array. in the case of the MATLAB environment used in this book. y) = cos(x) and took the right hand side so that the exact solution was the discretization of 10xy(1 − x)(1 − y) exp(x4. The fast Poisson solver in the collection of Buy this book from SIAM at http://www. plotting rk 2 / b 2 on a semilog scale.ecsecurehost. Poisson solvers can be designed to take advantage of vector and/or parallel architectures or.com/SIAM/FR16. subject to homogeneous Dirichlet boundary conditions. Because of this their execution time is less than a simple count of ﬂoatingpoint operations would indicate. CONJUGATE GRADIENT ITERATION 27 for 1 ≤ i.8.9. In the results reported here we took n = 31 resulting in a system with N = n2 = 961 unknowns.
codes fish2d is based on the MATLAB fast Fourier transform. the builtin function fft. . The preconditioned iteration required 5 iterations for convergence and the unpreconditioned iteration 52. 2. but the number of iterations required to reduce the relative residual by a given amount is independent of the mesh spacing [124]. 10 1 10 0 Relative Residual 10 1 10 2 10 3 10 4 0 10 20 30 Iterations 40 50 60 Fig. 28 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS 1 10 Relative Residual and Anorm of Error 10 0 10 1 10 2 10 3 10 4 0 10 20 30 Iterations 40 50 60 Fig.2.com/SIAM/FR16. 2.1. Not only does the preconditioned iteration converge more rapidly. 2. In Fig.Copyright ©1995 by the Society for Industrial and Applied Mathematics.2 the solid line is the graph of rk 2 / b 2 for the preconditioned iteration and the dashed line for the unpreconditioned. PCG for 2D elliptic equation. CG for 2D elliptic equation.html.ecsecurehost. We caution the reader that the preconditioned iteration is not as much faster than the Buy this book from SIAM at http://www. This electronic version is for personal use and may not be duplicated or distributed.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. As we remarked above. the execution time of the preconditioned iteration was about 60% of that of the unpreconditioned. the cost of the preconditioner is considerable. this speed is a result of the eﬃciency of the MATLAB fast Fourier transform. In the MATLAB environment we used.8. . The MATLAB flops command indicates that the unpreconditioned iteration required roughly 1.87 million ﬂoatingpoint operations. This electronic version is for personal use and may not be duplicated or distributed. Buy this book from SIAM at http://www.com/SIAM/FR16.2 million ﬂoatingpoint operations while the preconditioned iteration required . CONJUGATE GRADIENT ITERATION 29 unpreconditioned one as the iteration count would suggest.ecsecurehost. In Exercise 2.11 you are asked to compare execution times for your own environment. Hence.html.
give a rough estimate of the number of CG iterates required to reduce the relative residual to O(1/N ). 2.4. in MATLAB by writing the matrixvector product routines and using the MATLAB codes pcgsol and fish2d.8. 2.3.8. 2. Assume that A is spd and that σ(A) ⊂ (1. Let Λ be a diagonal matrix with Λii = λi and let p be a polynomial. . 2).6.8.1 + x and examine the accuracy of the result. Estimate the number of CGNR/CGNE iterations required to reduce the relative residual by a factor of 10−4 .8. What happens as N is increased? How are the performance √ and accuracy aﬀected by changes in a(x. 2. 2. 2. assume that the cost of a matrix vector multiply is 4N ﬂoatingpoint multiplies.com/SIAM/FR16. This electronic version is for personal use and may not be duplicated or distributed. Show that there is a spd B such that B 2 = A.7 for example. Assume that A is N × N .8. Is B unique? 2.Copyright ©1995 by the Society for Industrial and Applied Mathematics. 2.8.36) is symmetric and 2 positive deﬁnite on Rn if a(x. Prove Theorem 2. 2. and spd. Compare the execution times on your computing environment (using the cputime command in MATLAB. Buy this book from SIAM at http://www.10.1.8. Estimate the number of ﬂoatingpoint operations reduce the A norm of the error by a factor of 10−3 using CG iteration.6) for the number of CG iterations required to reduce the A norm of the error by a factor of 10−3 and for the number of CG iterations required to reduce the residual by a factor of 10−3 .html. Prove that p(Λ) = maxi p(λi ) where · is any induced matrix norm. Prove that rl ∈ Kk for all l < k.7.2). Let {xk } be the conjugate gradient iterates.ecsecurehost. for instance).8.8. Exercises on conjugate gradient 2.8.1) ∪ (2.3.2.8. 2. Let A be spd. 1. y) = . y)? Try a(x. For the matrix A in problem 5. Give upper estimates based on (2. y) > 0 for all 0 ≤ x. Show that if A has constant diagonal then PCG with Jacobi preconditioning produces the same iterates as CG with no preconditioning. Explain your ﬁndings. Let A be a nonsingular matrix with all singular values in the interval (1.11. nonsingular. y ≤ 1.9. 30 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS 2.2. Prove that the linear transformation given by (2.5. If κ(A) = O(N ).8. Duplicate the results in § 2. 2.8.
8. and symmetric Gauss–Seidel). y < 1. y) + ex+y u(x.ecsecurehost.8.13. How does the performance compare to CG and PCG? 2. y. Modify pcgsol so that φ(x) is computed and stored at each iterate and returned on output.17) and symmetric Gauss–Seidel (1.18) preconditioners for the elliptic boundary value problem in § 2.7.7. This electronic version is for personal use and may not be duplicated or distributed. < 1. 0 < x.12.Copyright ©1995 by the Society for Industrial and Applied Mathematics. Jacobi. 2. 2. Implement Jacobi (1. subject to the inhomogeneous Dirichlet boundary conditions u(x.html. Plot φ(xn ) as a function of n for each of the examples. Apply CG and PCG to solve the ﬁvepoint discretization of −uxx (x. Experiment with diﬀerent mesh sizes and preconditioners (Fast Poisson solver.14. u(0. 1) = u(1. y) = 1. Use the Jacobi and symmetric Gauss–Seidel iterations from Chapter 1 to solve the elliptic boundary value problem in § 2. 0 < x. . Compare the performance with respect to both computer time and number of iterations to preconditioning with the Poisson solver.8. 0) = u(x. CONJUGATE GRADIENT ITERATION 31 2.8. y) − uyy (x. Buy this book from SIAM at http://www. y) = 0.com/SIAM/FR16.15. y) = 1.
com/SIAM/FR16. 32 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS Buy this book from SIAM at http://www. .html.Copyright ©1995 by the Society for Industrial and Applied Mathematics.ecsecurehost. This electronic version is for personal use and may not be duplicated or distributed.
com/SIAM/FR16. Then for all pk ∈ Pk ¯ (3. one must store a basis for Kk . The use of residual polynomials is made more complicated because we cannot use the spectral theorem to decompose A.1. The minimization property and its consequences The GMRES (Generalized Minimum RESidual) was proposed in 1986 in [167] as a Krylov subspace method for nonsymmetric systems.2) rk 2 = min p(A)r0 ¯ p∈Pk 2 ≤ pk (A)r0 2 . ¯ 33 Buy this book from SIAM at http://www. Note that if x ∈ x0 + Kk then x = x0 + and so b − Ax = b − Ax0 − k−1 j=0 k−1 j=0 k j=1 γj Aj r0 γj A j+1 r0 = r0 − γj−1 Aj r0 . We have just proved the following result.1. Then for all pk ∈ Pk ¯ (3.1. and therefore storage requirements increase as the iteration progresses. Corollary 3. (3.1. Let A be nonsingular and let xk be the kth GMRES iteration. The kth (k ≥ 1) iteration of GMRES is the solution to the least squares problem minimizex∈x0 +Kk b − Ax 2 . GMRES does not require computation of the action of AT on a vector. Moreover. Unlike CGNR. This electronic version is for personal use and may not be duplicated or distributed.Copyright ©1995 by the Society for Industrial and Applied Mathematics. ¯ ¯ Hence if x ∈ x0 + Kk then r = p(A)r0 where p ∈ Pk is a residual polynomial. Let A be nonsingular and let xk be the kth GMRES iteration. ¯ From this we have the following corollary.ecsecurehost. This is a signiﬁcant advantage in many cases. .html.1.1) The beginning of this section is much like the analysis for CG. Chapter 3 GMRES Iteration 3.3) rk r0 2 2 ≤ pk (A) 2 . Theorem 3.
html. is xH y. In particular. Hence we will switch to complex matrices and vectors. Let pk ∈ Pk . This is not an option for general nonsymmetric matrices.4) rk r0 2 2 ≤ κ2 (V ) max ¯k (z). p z∈σ(A) Buy this book from SIAM at http://www. Proof. Recall that the scalar product in C N . Here V H is the complex conjugate transpose of V . p has degree N . p(0) = det(A) = 0 since A is nonsingular. The characteristic polynomial of A is p(z) = det(A − zI).com/SIAM/FR16.1.1. 2 by ≤ V 2 V −1 2 pk (Λ) ¯ 2 ≤ κ2 (V ) max ¯k (z).2) to get information from clustering of the spectrum just like we did for CG. We can use the structure of a diagonalizable matrix to prove the following result. Then for all pk ∈ Pk ¯ (3. rN = b − AxN = 0 and hence xN is the solution. In Chapter 2 we applied the spectral theorem to obtain more precise information on convergence rates. Our use of complex arithmetic will be implicit for the most part and is needed only so that we may admit the possibility of complex eigenvalues of A. Let A be nonsingular.ecsecurehost. Theorem 3.3. we will use the l2 norm in C N . Here Λ is a (possibly complex!) diagonal matrix with the eigenvalues of A on the diagonal. and so pN (z) = p(z)/p(0) ∈ PN ¯ is a residual polynomial. p z∈σ(A) 2 ¯ Proof. Let A = V ΛV −1 be a nonsingular diagonalizable matrix. Theorem 3. Recall that A is diagonalizable if there is a nonsingular (possibly complex!) matrix V such that A = V ΛV −1 . Let xk be the kth GMRES iterate. This electronic version is for personal use and may not be duplicated or distributed. However. If A is a diagonalizable matrix and p is a polynomial then p(A) = V p(Λ)V −1 A is normal if the diagonalizing transformation V is orthogonal.Copyright ©1995 by the Society for Industrial and Applied Mathematics. By ¯ (3.2. 34 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS We can apply the corollary to prove ﬁnite termination of the GMRES iteration. . the space of complex N vectors. It is well known [141] that p(A) = pN (A) = 0. We pay a price in that we must use complex arithmetic for the only time in this book. In the remainder of this section we must use complex arithmetic to analyze the convergence. We can easily estimate pk (A) ¯ pk (A) ¯ as asserted. Then the GMRES algorithm will ﬁnd the solution within N iterations. In that case the columns of V are the eigenvectors of A and V −1 = V H .3). if A is diagonalizable we may use (3.
11). Assume that A = V ΛV −1 is diagonalizable. we look at some easy consequences of (3.2) that (3.6) 2 ≤ ρ < 1. Let b be a linear combination of k of the eigenvectors of A b= k l=1 γl uil . As the set of nondiagonalizable matrices has measure zero in the space of N × N matrices. Then GMRES will terminate in at most k iterations. Using the residual polynomial pk (z) = (10 − z)k /10k we ﬁnd ¯ rk r0 2 2 ≤ (100)10−k = 102−k . [134]. The estimate (3. Again we look at examples.4). the chances are very high that a computed matrix will be diagonalizable. [192].3. Assume that A has only k distinct eigenvalues. GMRES ITERATION 35 It is not clear how one should estimate the condition number of the diagonalizing transformation if it exists. Let A be a nonsingular normal matrix.2. We can use (3. Hence (3. As we did for CG. . Let pk (z) = (1 − z)k . rate estimates have been derived in [139].5) rk 2 ≤η b 2 for the purposes of this example. κ2 (V ) = 1. Let A be a nonsingular diagonalizable matrix. Theorem 3. If A is not diagonalizable. we terminate the iteration when (3.1. and that κ2 (V ) = 100. Hence we conﬁne our attention to diagonalizable matrices. The convergence rate estimates for the diagonalizable case will involve κ2 (V ).ecsecurehost.5) holds when 102−k < η or when k > 2 + log10 (η). Termination As is the case with CG.html.Copyright ©1995 by the Society for Industrial and Applied Mathematics.2.3) and (3. but will otherwise resemble those for CG. Then the GMRES iteration. [33]. 3. Theorem 3.com/SIAM/FR16. that the eigenvalues of A lie in the interval (9. We assume that x0 = 0 and hence r0 = b. and [34]. As was the case with CG. Assume that I − A consequence of (3. for Ax = b will terminate in at most k iterations.4) directly to estimate the ﬁrst k such that (3.4. of course. GMRES is best thought of as an iterative method.5) holds without requiring a lemma like Lemma 2. If A is normal.6) illustrates the potential beneﬁts of a good approximate inverse preconditioner. with x0 = 0. This is particularly so for the ﬁnite diﬀerence Jacobian matrices we consider in Chapters 6 and 8.3) and (3.1.5. It is a direct ¯ rk 2 ≤ ρk r0 2 . Buy this book from SIAM at http://www. This electronic version is for personal use and may not be duplicated or distributed.
7 we use the Poisson solver preconditioner for nonsymmetric partial diﬀerential equations. . the value of the relative residuals as estimators of the relative error is unchanged. Hence.1. This is a very active area of research and we refer to [134]. or GMRES in the normal case.html. Preconditioning done in this way is called left preconditioning. [120]. we have. If r = M Ax − M b is the residual for the preconditioned system. If one can ﬁnd M such that I − MA 2 = ρ < 1. [183]. Again we mention [8] and [12] as a good general references for preconditioning. 3.ecsecurehost. and [36] for discussions of and pointers to additional references to several questions related to nonnormal matrices. However there are two diﬀerent ways to view preconditioning. 36 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS The convergence estimates for GMRES in the nonnormal case are much less satisfying that those for CG. by Lemma 1. then applying GMRES to M Ax = M b allows one to apply (3. Incomplete factorization (LU in this case) preconditioners can be used [165] as can polynomial preconditioners.Copyright ©1995 by the Society for Industrial and Applied Mathematics.1. one can write a single matrixvector product routine for M A (or AM ) that includes both the application of A to a vector and that of the preconditioner.com/SIAM/FR16. There is no concern that the preconditioned system be spd and hence (3. one can apply the algorithm directly to the system M Ax = M b (or AM y = b). Right preconditioning has been used as the basis for a method that changes the preconditioner as the iteration progresses [166]. Buy this book from SIAM at http://www. Most of the preconditioning ideas mentioned in § 2. The residual of the preconditioned problem is the same as that of the unpreconditioned problem. Preconditioning Preconditioning for GMRES and other methods for nonsymmetric problems is diﬀerent from that for CG. Multigrid [99] and alternating direction [8]. Hence. CGNR. One important aspect of implementation is that. one can solve the system AM y = b with GMRES and then set x = M y.5 are useful for GMRES as well. Hence. This is called right preconditioning. [33]. Some hybrid algorithms use the GMRES/Arnoldi process itself to construct polynomial preconditioners for GMRES or for Richardson iteration [135]. [164]. [182] methods have similar performance and may be more generally applicable. [72]. if M A has a smaller condition number than A. CGNE. if the product M A can be formed without error. In the examples in § 3.6) to the preconditioned system. ek e0 2 2 ≤ κ2 (M A) rk r0 2 2 . [34]. If I − AM 2 = ρ < 1. This electronic version is for personal use and may not be duplicated or distributed. we might expect the relative residual of the preconditioned system to be a better indicator of the relative error than the relative residual of the original system.3.6) essentially tells the whole story. unlike PCG.
. The problem with such a direct approach is that the matrix vector product of A with the basis matrix Vk must be taken at each iteration. 2.com/SIAM/FR16.1. GMRES ITERATION 37 3.html. Deﬁne r0 = b − Ax0 and v1 = r0 / r0 2 . This electronic version is for personal use and may not be duplicated or distributed.7) very eﬃciently and the resulting least squares problem requires no extra multiplications of A with vectors.4. V ) 1. our least squares problem in Rk is (3. .Copyright ©1995 by the Society for Industrial and Applied Mathematics. Hence. If one uses Gram–Schmidt orthogonalization. A division by zero is referred to as breakdown and happens only if the solution to Ax = b is in x0 + Kk−1 . Algorithm 3. . Suppose one had an orthogonal projector Vk onto Kk . A. arnoldi(x0 . k.4.7) minimizey∈Rk r0 − AVk y 2 .1) to a least squares where problem in Rk for the coeﬃcient vector y of z = x − x0 . k − 1 vi+1 = Avi − Avi − i T j=1 ((Avi ) vj )vj i T j=1 ((Avi ) vj )vj 2 If there is never a division by zero in step 2 of Algorithm arnoldi. a map that computes the action of A on a vector. Since x − x0 = Vk y for some y ∈ Rk . . The Gram–Schmidt procedure for formation of an orthonormal basis for Kk is called the Arnoldi [4] process. say. we must have xk = x0 + Vk y where y minimizes b − A(x0 + Vk y) 2 = r0 − AVk y 2 . GMRES implementation: Basic ideas Recall that the least squares problem deﬁning the kth GMRES iterate is minimizex∈x0 +Kk b − Ax 2 . The data are vectors x0 and b. is the lth column of Vk . one can represent (3. however. Then any z ∈ Kk can be written as z= vlk k l=1 yl vlk . Buy this book from SIAM at http://www.ecsecurehost. Hence we can convert (3. For i = 1. and a dimension k. The algorithm computes an orthonormal basis for Kk and stores it in the columns of V . . b. This is a standard linear least squares problem that could be solved by a QR factorization. then the columns of the matrix Vk are an orthonormal basis for Kk .
Then x = A−1 b ∈ x0 + Ki .com/SIAM/FR16.. a vector x so that b − Ax 2 AVk = Vk+1 Hk .9) The goal of the iteration is to ﬁnd. This electronic version is for personal use and may not be duplicated or distributed. using the orthogonality of Vk+1 . rk 2 = Vk+1 (βe1 − Hk y k ) 2 = βe1 − Hk y k Rk+1 .1. . rk = b − Axk = r0 − A(xk − x0 ) = Vk+1 (βe1 − Hk y k ). ≤ b 2. the k +1×k matrix Hk whose entries are hij is upper Hessenberg. (3. Buy this book from SIAM at http://www. where y k minimizes βe1 − Hk y 2 over Rk . and let i be the smallest integer for which Avi − i j=1 ((Avi )T vj )vj = 0. Since xi − x0 ∈ Ki there is y ∈ Ri such that xi − x0 = Vi y. . Hence xk = x0 + Vk y k . Setting y = βH −1 e1 proves that ri = 0 by the minimization property. . 0)T ∈ Ri we have ri 2 2 = b − Axi 2 = r0 − A(xi − x0 ) 2 . x. Setting β = r0 and e1 = (1. for some y k ∈ Rk . .Copyright ©1995 by the Society for Industrial and Applied Mathematics. Proof. where H is an i × i matrix. hij = 0 if i > j + 1.html. H is nonsingular since A is. Note that when y k has been computed. let the vectors vj be generated by Algorithm arnoldi.4. which overwrites the initial iterate x and the residual norm. the norm of rk can be found without explicitly forming xk and computing rk = b − Axk . where · Rk+1 denotes the Euclidean norm in Rk+1 .ecsecurehost. We limit the number of iterations to kmax and return the solution.e. i. the righthand side b. The input is the initial iterate. By hypothesis Avi ∈ Ki and hence AKi ⊂ Ki . for a given .8) Hence. we can use it to implement GMRES in an eﬃcient way. We have. Since r0 = βVi e1 and Vi is an orthogonal matrix ri 2 = Vi (βe1 − Hy) 2 = βe1 − Hy Ri+1 . If the Arnoldi process does not break down. Set hji = (Avj )T vi . and a map that computes the action of A on a vector. Let A be nonsingular. The Arnoldi process (unless it terminates prematurely with a solution) produces matrices {Vk } with orthonormal columns such that (3. 0. 38 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS Lemma 3. we have AVi = Vi H. . Since the columns of Vi are an orthonormal basis for Ki . The upper Hessenberg structure can be exploited to make the solution of the least squares problems very eﬃcient [167]. By the Gram–Schmidt construction.
and the simple oneline MATLAB approach can be eﬃcient for such problems. β = ρ. gmresa(x. which depends on this orthogonality. call GMRES and explicitly test the residual b−Axk if k = m upon termination. This means that in order to perform k GMRES iterations one must store k vectors of length N . . . ρ) 1. . v1 = r/ r 2 . A partial remedy is to replace the classical Gram–Schmidt orthogonalization in Algorithm gmresa with modiﬁed Gram–Schmidt orthogonalization. ρ = r 2 . . k hjk = (Avk )T vj (c) vk+1 = Avk − (d) hk+1. (3. For very large problems this becomes prohibitive and the iteration is restarted when the available room for basis vectors is exhausted. We discuss implementation of GMRES(m) later in this section. [197]. kmax.k = vk+1 k j=1 hjk vj 2 2 (e) vk+1 = vk+1 / vk+1 (1. One way to implement this is to set kmax to the maximum number m of vectors that one can store. This restarted version of the algorithm is called GMRES(m) in [167]. will not hold and the residual and approximate solution could be inaccurate. While ρ > b 2 and k < kmax do (a) k = k + 1 (b) for j = 1. If this happens.2. . . (g) ρ = βe1 − Hk y k 3. when it works it can signiﬁcantly reduce the storage costs of the iteration. r = b − Ax. . the backward division operator. k = 0 2. [167]. 0. Step 2f can be done with a single MATLAB command. A more serious problem with the implementation proposed in Algorithm gmresa is that the vectors vj may become nonorthogonal as a result of cancellation errors. call GMRES again with x0 = xk . the result from the previous call. The savings are slight if k is small relative to N . There is no general convergence theorem for the restarted algorithm and restarting will slow the convergence down. . This electronic version is for personal use and may not be duplicated or distributed. and we use the method of [167] in the collection of MATLAB codes. at a cost of O(k 3 ) ﬂoatingpoint operations.Copyright ©1995 by the Society for Industrial and Applied Mathematics. Algorithm gmresa can be implemented very straightforwardly in MATLAB. xk = x0 + Vk y k . GMRES ITERATION 39 Algorithm 3.html. 0)T (f) e1 = ∈ Rk+1 Minimize βe1 − Hk y k Rk+1 over Rk to obtain y k .4.com/SIAM/FR16. b.9). However. which is often the case for large problems. Buy this book from SIAM at http://www. We replace Rk+1 . It is an important property of GMRES that the basis for the Krylov space must be stored as the iteration progress. . There are more eﬃcient ways to solve the least squares problem in step 2f. Note that xk is only computed upon termination and is not needed within the iteration.ecsecurehost. A. . If the norm of the residual is larger than .
. gmresb(x.004. While ρ > b 2 and k < kmax do (a) k = k + 1 Buy this book from SIAM at http://www. there is an opportunity to compensate for a loss of orthogonality in the basis vectors for the Krylov space.3568e − 02 T The columns of VU are not orthogonal at all.Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed. [160]. . First.0456e − 14 1. v1 = r/ r 2 .4.0000e + 00 4. we see that xk need only be computed after termination as the least squares residual ρ can be used to approximate the norm of the residual (they are identical in exact arithmetic). β = ρ. kmax. Let δ = 10−7 and deﬁne 1 1 1 A = δ δ 0 . .ecsecurehost.3.com/SIAM/FR16.3565e − 16 T Here vi vj − δij  ≤ 10−8 for all i. and is not particularly eﬃcient. The outline of our implementation is Algorithm gmresb. V = 1.0000e − 07 1. . Algorithm 3. We illustrate this point with a simple example from [128]. δ 0 δ We orthogonalize the columns of A with classical Gram–Schmidt to obtain 1.9905e − 01 . ρ = r 2 .9715e − 08 −9. r = b − Ax. j. . While modiﬁed Gram–Schmidt and classical Gram–Schmidt are equivalent in inﬁnite precision. A. One can take a second pass through the modiﬁed Gram–Schmidt process and restore lost orthogonality [147].0000e + 00 1. In fact v2 v3 ≈ −. However.0000e − 07 1.0000e + 00 . This implementation solves the upper Hessenberg least squares problem using the MATLAB backward division operator.html.0000e − 07 −1. We present a better implementation in Algorithm gmres. k = 0 2. b.0456e − 14 1.0000e − 07 −1.0436e − 07 −1. this version is very simple and illustrates some important ideas. V = 1.0436e − 07 1. Second.0000e + 00 4. The versions we implement in the collection of MATLAB codes use modiﬁed Gram–Schmidt. k T vk+1 = vk+1 − (vk+1 vj )vj . For modiﬁed Gram–Schmidt 1. 40 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS the loop in step 2c of Algorithm gmresa with vk+1 = Avk for j = 1. doing the computations in MATLAB. ρ) 1.0436e − 07 9. the modiﬁed form is much more likely in practice to maintain orthogonality of the basis.0000e + 00 1.
k = vk+1 2 • If loss of orthogonality is detected For j = 1.Copyright ©1995 by the Society for Industrial and Applied Mathematics. k T i. . vk+1 = vk+1 − hjk vj (c) hk+1. one can still lose orthogonality in the columns of V . .k = vk+1 2 2 (d) vk+1 = vk+1 / vk+1 (1.k = vk+1 2 Rk+1 .com/SIAM/FR16. One can reorthogonalize in every iteration or only if a test [147] detects a loss of orthogonality. xk = x0 + Vk y k . k T hjk = vk+1 vj vk+1 = vk+1 − hjk vj • hk+1. . . One can test for loss of orthogonality [22]. . 0)T (e) e1 = ∈ Rk+1 Minimize βe1 − Hk y k Rk+1 to obtain y k ∈ Rk . .k = vk+1 2 2 • vk+1 = vk+1 / vk+1 Buy this book from SIAM at http://www. k T htmp = vk+1 vj hjk = hjk + htmp vk+1 = vk+1 − htmp vj • hk+1. and reorthogonalize if needed or use a more stable means to create the matrix V [195]. There is nothing to be gained by reorthogonalizing more than once [147]. one can augment the modiﬁed Gram–Schmidt process • vk+1 = Avk for j = 1. . . These more complex implementations are necessary if A is ill conditioned or many iterations will be taken. GMRES ITERATION 41 (b) vk+1 = Avk for j = 1.html. . . Even if modiﬁed Gram–Schmidt orthogonalization is used. . This electronic version is for personal use and may not be duplicated or distributed. . . • vk+1 = vk+1 / vk+1 2 with a second pass (reorthogonalization). k T hjk = vk+1 vj vk+1 = vk+1 − hjk vj • hk+1. 0. [147]. . .ecsecurehost. . . (f) ρ = βe1 − Hk y k 3. . For example. . The modiﬁed Gram–Schmidt process with reorthogonalization looks like • vk+1 = Avk for j = 1. hjk = vk+1 vj ii.
A= 0 0 0 104 While in inﬁnite precision arithmetic only three iterations are needed to solve the system exactly. by QR factorization or the MATLAB backward division operator. say. We employ this test in the MATLAB code gmres with δ = 10−3 . The kth GMRES iteration requires a matrixvector product. In Table 3. We provide an implementation of Algorithm gmresb in the collection of MATLAB codes. in step 2e of gmresb is O(k3 ) ﬂoatingpoint operations. we ﬁnd in the MATLAB environment that a solution to full precision requires more than three iterations unless reorthogonalization is applied after every iteration. Buy this book from SIAM at http://www. The method based on (3. k scalar products. Hence the total cost of the m GMRES iterations is m matrixvector products and O(m4 + m2 N ) ﬂoatingpoint operations.html. a bruteforce solution to the least squares problem using the MATLAB backward division operator is not terribly ineﬃcient. especially when implemented in an environment like MATLAB.10) (MGMPO). 1. A variation on a method from [147] is used in [22]. . and the solution of the Hessenberg least squares problem in step 2e.11) .10) is the default in the MATLAB code gmresa. x0 = (0. however. More eﬃcient and equally eﬀective approaches are based on other ideas. This is an appealing algorithm. 0. and reorthogonalization in every iteration (MGMFO). the bruteforce method can be very costly. The idea is that if the new vector is very small relative to Avk then information may have been lost and a second pass through the modiﬁed Gram–Schmidt process is needed. For large k. To illustrate the eﬀects of loss of orthogonality and those of reorthogonalization we apply GMRES to the diagonal system Ax = b where b = (1. 42 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS One approach to reorthogonalization is to reorthogonalize in every step. and (3. 1)T . 0)T . reorthogonalization based on the test (3. This doubles the cost of the computation of V and is usually unnecessary.0011 0 . the reorthogonalization strategy based on (3.ecsecurehost.Copyright ©1995 by the Society for Industrial and Applied Mathematics. While classical Gram–Schmidt fails. because of its simplicity. This electronic version is for personal use and may not be duplicated or distributed.1 we tabulate relative residuals as a function of the iteration counter for classical Gram–Schmidt without reorthogonalization (CGM). modiﬁed Gram–Schmidt without reorthogonalization (MGM).10) is almost as eﬀective as the much more expensive approach of reorthogonalizing in every step.001 0 0 . The k scalar products require O(kN ) ﬂoatingpoint operations and the cost of a solution of the Hessenberg least squares problem.10) Avk 2 + δ vk+1 2 = Avk 2 to working precision.com/SIAM/FR16. When k is not too large and the cost of matrixvector products is high. Reorthogonalization is done after the Gram–Schmidt loop and before vk+1 is normalized if (3.
00e05 2.88e02 6.16e01 3. −s).16e01 3.42e08 3. The O(k 2 ) cost of the triangular solve and the O(kN ) cost of the construction of xk are incurred after termination.1 Eﬀects of reorthogonalization. s = sin(θ) for θ ∈ [−π. Implementation: Givens rotations If k is large.00e+00 8.70e08 3.00e+00 8. Householder reﬂections [195]. GMRES ITERATION Table 3. This implementation maintains the QR factorization of Hk in a clever way so that the cost for a single GMRES iteration is O(N k) ﬂoatingpoint operations. implementations using Givens rotations [167]. The implementation in Algorithm gmres and in the MATLAB code collection is from [167]. A 2 × 2 Givens rotation is a matrix of the form c −s s c (3.html. .42e08 5. π].69e05 4.88e02 6.12) G= .5.00e+00 8.37e05 MGM 1. which makes an angle of −θ with the xaxis through an angle θ so that it overlaps the xaxis.87e05 3.16e01 3.com/SIAM/FR16. [22]. 43 k 0 1 2 3 4 5 6 7 8 9 10 CGM 1.ecsecurehost.Copyright ©1995 by the Society for Industrial and Applied Mathematics.53e05 2.34e34 3. The orthogonal matrix G rotates the vector (c.74e05 3.88e02 6.88e02 6. G c −s = 1 0 . An N × N Givens rotation replaces a 2 × 2 block on the diagonal of the Buy this book from SIAM at http://www.74e05 2. where c = cos(θ).04e24 MGMFO 1.16e01 3. or a shifted Arnoldi process [197] are much more eﬃcient than the bruteforce approach in Algorithm gmresb. This electronic version is for personal use and may not be duplicated or distributed.35e05 3.00e+00 8.04e18 MGMPO 1.
. Buy this book from SIAM at http://www. 11 21 11 21 If we replace H by G1 H..13) 0 . . then the ﬁrst column of H now has only a single nonzero element h11 . Givens rotations are used to annihilate single nonzero elements of matrices in reduction to triangular form [89]. G= . s1 ) by (3. This method is also used in the QR algorithm for computing eigenvalues [89].Copyright ©1995 by the Society for Industrial and Applied Mathematics. Continuing in this way and setting Q = GN . s c 1 0 . . 22 32 22 32 and annihilate h32 . where Rk is the k + 1 × k triangular factor of the QR factorization of Hk . We reduce H to triangular form by ﬁrst multiplying the matrix by a Givens rotation that annihilates h21 (and.15) c2 = h22 / h2 + h2 and s2 = −h32 / h2 + h2 . This reduction can be accomplished in O(N ) ﬂoatingpoint operations and hence is far more eﬃcient than a solution by a singular value decomposition or a reduction based on Householder transformations. . . .ecsecurehost. s2 ) to H.. 44 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS N × N identity matrix with a 2 × 2 Givens rotation. . where (3.. 0 1 0 0 . 0 . (3. .. [184]. s) is an N × N givens rotation of the form (3. . c −s .com/SIAM/FR16. and noting that βe1 − Hk y k Rk+1 = Q(βe1 − Hk y k ) Rk+1 = g − Rk y k Rk+1 ..14) c1 = h11 / h2 + h2 and s1 = −h21 / h2 + h2 .. Note that G2 does not aﬀect the ﬁrst column of H. G1 we see that QH = R is upper triangular. .. 0 . . They are of particular value in reducing Hessenberg matrices to triangular form and thereby solving Hessenberg least squares problems such as the ones that arise in GMRES. 0 . Our notation is that Gj (c. .. We deﬁne G1 = G1 (c1 . 1 . . changes h11 and the subsequent columns).. we can now apply G2 (c2 . . Similarly. of course.html. . setting g = βQe1 .13) with a 2 × 2 Givens rotation in rows and columns j and j + 1. . A straightforward application of these ideas to Algorithm gmres would solve the least squares problem by computing the product of k Givens rotations Q. Let H be an N × M (N ≥ M ) upper Hessenberg matrix with rank M . This electronic version is for personal use and may not be duplicated or distributed.
then computing the Givens rotation Gk+1 that annihilates the (k + 2)nd element of Qk hk+2 . we add the new column hk+2 to Hk . . While ρ > b 2 and k < kmax do (a) k = k + 1 (b) vk+1 = Avk for j = 1.ecsecurehost. xk = x0 + Vk y k .html. .k = vk+1 2 (d) Test for loss of orthogonality and reorthogonalize if necessary. note that if Rk = Qk Hk and. 4. To see this.k . Solve the upper triangular system Ry k = w. ρ) 1. hk+1.com/SIAM/FR16. (g) ρ = (g)k+1 . kmax. hjk = vk+1 vj ii.k − sk hk+1. This electronic version is for personal use and may not be duplicated or distributed. . after orthogonalization.k = ck hk. . Set (w)i = (g)i for 1 ≤ i ≤ k. k = 0. GMRES ITERATION 45 In the context of GMRES iteration. A. b. s1 )x. vk+1 = vk+1 − hjk vj (c) hk+1. ρ = r 2 . h2 + h2 k. we can update both Qk and Rk by ﬁrst multiplying hk+2 by Qk (that is applying the ﬁrst k Givens rotations to hk+2 ). . . k T i. We overwrite the upper triangular part of Hk with the triangular part of the QR factorization of Hk in the MATLAB code. The MATLAB implementation of Algorithm gmres stores Qk by storing the sequences {cj } and {sj } and then computing the action of Qk on a vector x ∈ Rk+1 by applying Gj (cj . ν = 2 i. v1 = r/ r 2 . sk )g.k = 0 iv.Copyright ©1995 by the Society for Industrial and Applied Mathematics. r = b − Ax. The MATLAB implementation of Algorithm gmres uses (3. g = ρ(1.5.10) to test for loss of orthogonality. gmres(x.k k+1. 0.j for 1 ≤ i. g = Gk (ck . . sk ) . 3. If k > 1 apply Qk−1 to the kth column of H. G1 (c1 .k /ν. j ≤ k.1.j = hi. . sk = −hk+1. Algorithm 3. (e) vk+1 = vk+1 / vk+1 (f) ii. Set ri. and ﬁnally setting Qk+1 = Gk+1 Qk and forming Rk+1 by augmenting Rk with Gk+1 Qk hk+2 . we can incrementally perform the QR factorization of H as the GMRES iteration progresses [167]. however. iii.k /ν hk.k . 0)T ∈ Rkmax+1 2. . ck = hk. sj ) in turn to obtain Qk x = Gk (ck . Buy this book from SIAM at http://www. . . β = ρ.
like CG. We discuss only a small subset of these methods and refer the reader to [12] and [78] for pointers to more of the literature on this subject. kmax. and only terminate if an acceptable approximate solution has been found. gmres(x. GMRES. as we have seen. All the methods we present in this section require two matrixvector products for each iteration.ecsecurehost. . m. k=1 3. b. but. Algorithm 3. ρ) 1. Consistently with our implementation of GMRES. While ρ > b 2 and k < kmax do (a) gmres(x. Other methods for nonsymmetric systems The methods for nonsymmetric linear systems that receive most attention in this book. have modest storage requirements that do not depend on the number of iterations needed for convergence. only need matrixvector products. A. ρ) (b) k = k + m The storage costs of m iterations of gmres or of gmresm are the m + 2 vectors b. In § 3. This electronic version is for personal use and may not be duplicated or distributed. Aside from the parameter m. as with CG. m. Hence. k = m 3. and {vk }m . be based on some kind of minimization principle or conjugacy. but can still be quite useful. gmresm(x. .2.8 we give an example of how this squaring of the condition number can lead to failure of the iteration. share the properties that they are easy to implement.5.6. A. For large and illconditioned problems.com/SIAM/FR16. GMRES needs only matrixvector products and uses A alone. the arguments to Algorithm gmresm are the same as those for Algorithm gmres. therefore. and CGNE. b. the storage requirements increase as the iteration progresses.Copyright ©1995 by the Society for Industrial and Applied Mathematics. The methods we describe in this section fall short of the ideal. and converge in N iterations for all nonsingular A. . a basis for Kk must be stored to compute xk . . A. x. b. methods based on shortterm recurrences such as CG that also satisfy minimization or conjugacy conditions cannot be constructed for general matrices. fail to terminate. CGNR and CGNE have the disadvantages that a transposevector product must be computed for each iteration and that the coeﬃcient matrix AT A or AAT has condition number the square of that of the original matrix. CGNR. can be analyzed by a common residual polynomial approach. we take the view that preconditioners will be applied externally to the iteration. An ideal method would. [74]. Restarting can seriously degrade performance. these methods can also be implemented in a manner that incorporates Buy this book from SIAM at http://www. However. You are asked to repair this in exercise 7. it may be impossible to store enough basis vectors and the iteration may have to be restarted. However.html. This implementation does not test for success and may. m. ρ) 2. 46 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS We close with an example of an implementation of GMRES(m) .
r = r. . {ˆk } such that biorthogonality r p holds. While there are approaches Buy this book from SIAM at http://www. [76]. A. GMRES ITERATION 47 the preconditioner and uses the residual for the original system to control termination. bicg(x. Algorithm 3. b. (AT )k−1 r0 ) r ˆ ˆ is the Krylov space for AT and the vector r0 .16) is equivalent to the minimization principle (2. p = p = 0. instead. Do While r 2 > b 2 and k < kmax (a) k = k + 1 (b) ρk = rT r. 3.ecsecurehost. . the kth residual must satisfy the biorthogonality condition (3. If. This electronic version is for personal use and may not be duplicated or distributed.html.6. there is no guarantee that ρk in step 2b or pT Ap. Using the notation of Chapter 2 and [191] we give an implementation of BiCG making the choice r0 = r0 . does not enforce a minimization principle. AT r0 . k = 0 ˆ 2. i. the algorithm can become unstable or produce inaccurate results.2) for CG [89]. rk rl = 0 if k = l and the search directions {pk } and {ˆk } satisfy ˆT p the biconjugacy property pT Apl = 0 if k = l. kmax) ˆ 1. BiCG produces the same iterations as CG (but computes everything except x twice).com/SIAM/FR16. This algorithmic description is explained and ˆ motivated in more detail in [122]. however.Copyright ©1995 by the Society for Industrial and Applied Mathematics. r = r − αAT p ˆ ˆ ˆ Note that if A = AT is spd. ρ0 = 1.1. Even if these quantities are nonzero but very small. We do not recommend use of BiCG and present this algorithm only as a basis for discussion of some of the properties of this class of algorithms. ˆk In the symmetric positive deﬁnite case (3. and [191]. The earliest such method BiCG (Biconjugate gradient) [122]. Kk = span(ˆ0 . The algorithm gets its name because it produces sequences of residuals {rk }. the denominator in step 2e. . . [78]. A is not spd.16) where T rk w = 0 for all w ∈ Kk . r0 is a usersupplied vector and is ˆ ˆ often set to r0 .1. {ˆk } and search directions {pk }. e. p = r + β p ˆ ˆ ˆ (d) v = Ap (e) α = ρk /(ˆT v) p (f) x = x + αp (g) r = r − αv. r = b − Ax. will not ˆ T Ap = 0 we say that a breakdown has taken vanish. BiCG. . [76].6. β = ρk /ρk−1 ˆ (c) p = r + βp. If either ρk−1 = 0 or p ˆ place. .
¯ ˆ ¯ r Hence. [148]. The possibility of breakdown is small and certainly should not eliminate the algorithms discussed below from consideration if there is not enough storage for GMRES to perform well. ˆ BiCG sometimes performs extremely well and the remaining algorithms in this section represent attempts to capture this good performance and damp the erratic behavior when it occurs. The performance of the algorithm can be erratic or even unstable with residuals increasing by several orders of magnitude from one iteration to the next. [81]. 3. CGS. [79]. .17) rk = pk (A)r0 and rk = pk (AT )ˆ0 .com/SIAM/FR16. Once breakdown has occurred.ecsecurehost. One should also keep in mind that a single BiCG iteration requires two matrixvector products and a GMRES iterate only one. by the minimization property for GMRES GM rk RES 2 Bi−CG ≤ rk 2.17). ˆ p p r p ˆ The other references to AT can be eliminated in a similar fashion and an iteration satisfying rk = pk (A)2 r0 (3. Conjugate Gradient Squared. This electronic version is for personal use and may not be duplicated or distributed. where pk is the same polynomial produced by BiCG and used in ¯ (3.18) ¯ is produced.6. Note that there is pk ∈ Pk such that both ¯ (3.17) implies that the scalar product rT r in BiCG (step 2b of ˆ Algorithm bicg) can be represented without using AT as T rk rk = (¯k (A)r0 )T (¯k (AT )ˆ0 ) = (¯k (A)2 r0 )T r0 . but that the cost of the GMRES iteration increases (in terms of ﬂoatingpoint operations) as the iteration progresses. Buy this book from SIAM at http://www. The algorithm takes advantage of the fact that (3.html.2. A transposevector product is needed. A remedy for one of the problems with BiCG is the Conjugate Gradient Squared (CGS) algorithm [180]. All of the methods we present in this section can break down and should be implemented with that possibility in mind. This explains the name. which at the least will require additional programming and may not be possible at all. there are no methods that both limit storage and completely eliminate the possibility of breakdown [74]. reminding us that GMRES. We can compare the bestcase performance of BiCG with that of GMRES. will always reduce the residual more rapidly than BiCG (in terms of iterations. [78]. there are other problems with BiCG. Breakdowns aside. but not necessarily in terms of computational work). such as GMRES. 48 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS [80]. the eﬀort in computing r ˆ at each iteration is wasted in that r makes no contribution to x. However.Copyright ©1995 by the Society for Industrial and Applied Mathematics. if suﬃcient storage is available. to avoid some breakdowns in many variations of BiCG. Moreover. one can restart the BiCG iteration or pass the computation to another algorithm.
If the algorithm does not break down. cgs(x.6. ρ0 = r0 r0 . A. that if pk is the residual polynomial from (3. Since p2 ∈ P2k we have ¯k (3.18) then ¯ (3. [191]. then αk−1 = 0 for all k.17) and (3. of course. BiCGSTAB is the most eﬀective representative of this class of algorithms. This electronic version is for personal use and may not be duplicated or distributed.2.html. change the convergence properties for the worse and either improve good convergence or magnify erratic behavior [134]. βk = ρk /ρk−1 ˆT uk = rk + βk qk pk = uk + βk (qk + βk pk−1 ) vk = Apk Breakdowns take place when either ρk−1 or σk−1 vanish.6. CGS has the same potential for breakdown as BiCG. CGS ˆ replaces the transposevector product with an additional matrixvector product and applies the square of the BiCG polynomial to r0 to produce rk .4. . Do While r 2 > b 2 and k < kmax (a) k = k + 1 ˆT (b) σk−1 = r0 vk−1 (c) αk−1 = ρk−1 /σk−1 (d) qk = uk−1 − αk−1 vk−1 (e) xk = xk−1 + αk−1 (uk−1 + qk ) rk = rk−1 − αk−1 A(uk−1 + qk ) (f) ρk = r0 rk . We present an algorithmic description of CGS that we will refer to in our discussion of TFQMR in § 3. x0 = x.19) GM r2k RES 2 CGS ≤ rk 2. ¯ We provide no MATLAB implementation of BiCG or CGS. b. p0 = u0 = r0 = b − Ax 2.20) ¯ ¯ pk (z) = pk−1 (z) − αk−1 z qk−1 (z) ¯ ¯ where the auxiliary polynomials qk ∈ Pk are given by q0 = 1 and for k ≥ 1 by ¯ (3. . kmax) 1. r0 = r0 ˆ 3.com/SIAM/FR16. Algorithm 3. [77]. GMRES ITERATION 49 The work used in BiCG to compute r is now used to update x. This may. v0 = Ap0 . One can show [180]. In our description of CGS we will index the vectors and scalars that would be overwritten in an implementation so that we can describe some of the ideas in TFQMR later on. Buy this book from SIAM at http://www. This description is taken from [77].ecsecurehost. k = 0 ˆT 4.21) ¯ ¯ qk (z) = pk (z) + βk qk−1 (z).Copyright ©1995 by the Society for Industrial and Applied Mathematics.
v.6.6.18) with rk = qk (A)pk (A)r0 (3. we use r0 = r0 ˆ and follow the notation of [191]. .com/SIAM/FR16. In a situation where many GMRES iterations are needed and matrixvector product is fast. r. In our description of the implementation. r0 = r = r. We will present such a case in § 3. letting s overwrite r when needed. 50 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS 3. b. v = p = 0. Algorithm 3. which is described and motivated fully in [191].Copyright ©1995 by the Society for Industrial and Applied Mathematics. A. The Biconjugate gradient stabilized method (BiCGSTAB) [191] attempts to smooth the convergence of CGS by replacing (3. The constant ωi is selected to minimize ri = qi (A)pi (A)r0 as a function of ωi . ρ1 = r0 r ˆ 2. There is no convergence theory for BiCGSTAB and no estimate of the residuals that is better than that for CGS. ρ0 = α = ω = 1. The reason for this is that the cost of orthogonalization in GMRES can be much more than that of a matrixvector product if the dimension of the Krylov space is large. ρk+1 = −ωˆ0 t 2 (h) x = x + αp + ωs (i) r = s − ωt Note that the iteration can break down in steps 2b and 2e. b.ecsecurehost. One must store seven vectors (x. t).3. Buy this book from SIAM at http://www. The examples in [191] indicate the eﬀectiveness of this approach. which can be thought of [12] as a blend of BiCG and GMRES(1). [98].19). p. (3. . t = As rT (g) ω = tT s/ t 2 .22) where qk (z) = k i=1 (1 − ωi z). r = b − Ax. bicgstab(x. We provide an implementation of Algorithm bicgstab in the collection of MATLAB codes. BiCGSTAB can have a much lower average cost per iterate than GMRES. BiCGSTAB. This electronic version is for personal use and may not be duplicated or distributed.html.3. The performance in cases where the spectrum has a signiﬁcant imaginary part can be improved by constructing qk to have complex conjugate pairs of roots. A single iteration reˆ quires four scalar products. r0 .8. Do While r 2 > b 2 and k < kmax (a) k = k + 1 (b) β = (ρk /ρk−1 )(α/ω) (c) p = r + β(p − ωv) (d) v = Ap rT (e) α = ρk /(ˆ0 v) (f) s = r − αv. The cost in storage and in ﬂoatingpoint operations per iteration remains bounded for the entire iteration. kmax) ˆ ˆT 1. k = 0.
α0 . . . [81]. The speciﬁc formulation of q and L depends on the algorithm. This electronic version is for personal use and may not be duplicated or distributed.html. (3. therefore. if m = 2k − 1 is odd. 1 0 1 . if m = 2k is even.26) rm = r0 − AYm z = Wm+1 (e1 − Bm z). αk = 0 for all k ≥ 0. Buy this book from SIAM at http://www. (¯k (A))2 r0 p p (A)¯ ¯k pk−1 (A)r0 q Here the sequences {¯k } and {¯k } are given by (3.com/SIAM/FR16.ecsecurehost.. . The quasiresidual is related to the true residual by a fullrank linear transformation r = Lq and reduction of the residual can be approximately measured by reduction in q.. (3. α (m−1)/2 ) . −1 .6. 0 . Returning to Algorithm cgs we deﬁne sequences ym = and wm = uk−1 q k if m = 2k − 1 is odd. . . . to illustrate the quasiminimization idea. .20) and (3.23) in matrix form as AYm = Wm+1 Bm . [37]. . Wm+1 the N × (m + 1) j=1 matrix with columns {wj }m+1 . Now we consider the quasiminimal residual (QMR) family of algorithms [80].4.24) where Ym is the N × m matrix with columns {yj }m . Therefore.25) for some z ∈ Rm . . We express (3. . . 0 .21). . . (3. All algorithms in this family minimize the norm of an easily computable quantity q = f − Hz. a quasiresidual over z ∈ RN . if m = 2k is even. . and Bm is the (m + 1) × m matrix given by j=1 −1 Bm = 0 . . TFQMR. We will focus on the transposefree algorithm TFQMR proposed in [77]. the p CGS residual satisﬁes CGS rk = w2k+1 . 1 Our assumptions that no breakdown takes place imply that Km is the span of {yj }m and hence j=1 xm = x0 + Ym z (3. . GMRES ITERATION 51 3.23) where the denominator is nonzero by assumption. . . [202]. . We assume that CGS does not break down and. Hence. If we let r be the nearest integer less than or equal to a real r we have Aym = (wm − wm+1 )/α (m−1)/2 . 0 −1 −1 0 diag(α0 . ..Copyright ©1995 by the Society for Industrial and Applied Mathematics. [77]..
Then if ξm > 0 m+1 (3.29) minimizez∈Rm fm+1 − Hm z 2 . The quasiminimization idea is to solve (3.com/SIAM/FR16. we could minimize rm by solving m+1 the least squares problem (3.32) GM τm ≤ rm RES 2 /ξm .ecsecurehost. The solution to the least squares problem (3. [78]. One can [80]. 0)T ∈ Rm .Copyright ©1995 by the Society for Industrial and Applied Mathematics. b ∈ RN be given. . .6. If Wm+1 Ω−1 were an orthogonal matrix. As one can see from that description. This electronic version is for personal use and may not be duplicated or distributed. Let rm RES be the residual for the mth GMRES iteration. Let ξm be the smallest singular value of Wm+1 Ω−1 .html. Let (3. Instead we have the quasiresidual norm τm = fm+1 − Hm zm 2 . Now if we pick the weights ωi = wi norm one. 52 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS where. Hence (3. . 0.28) rm = r0 − AYm z = Wm+1 Ω−1 (fm+1 − Hm z) m+1 Ω = diag(ω1 . each k produces two approximate solutions (corresponding to m = 2k − 1 and m = 2k). . .29) (thereby quasiminimizing the residual) to obtain zm and then set (3. ω2 . . e1 = (1.1. ωm ) where fm+1 = Ωm+1 e1 = ωm+1 e1 and Hm = Ωm+1 Bm is bidiagonal and hence upper Hessenberg. base our termination criterion on τm .26) as (3. Theorem 3. as in our discussion of GMRES. Let A be an N × N matrix and let x0 . and we do this in Algorithm tfqmr. In a sense. therefore.31) rm 2 2 then the columns of Wm+1 Ω−1 have m+1 2 = Wm+1 Ω−1 (fm+1 − Hm z) m+1 √ ≤ τm m + 1. . Assume that the TFQMR iteration does not break down or terminate with GM the exact solution for k iterations and let 1 ≤ m ≤ 2k. The weights make it possible to derive straightforward estimates of the residual in terms of easily computed terms in the iteration. We can. Note that if k corresponds to the iteration counter for CGS.30) xm = x0 + Ym zm . .27) and rewrite (3. the conditioning of the matrix Wm+1 can be improved by scaling. approximate residuals are not computed in Algorithm tfqmr. The diagonal entries in the matrix Ωm+1 are called weights. Buy this book from SIAM at http://www. also use the quasiminimization condition to estimate the TFQMR residual in terms of residual polynomials. .29) can be done by using Givens rotations to update the factorization of the upper Hessenberg matrix Hm [77] and this is reﬂected in the implementation description below.
3 to (3.4. d = 0 T ρ0 = r0 r0 . by the quasiminimization property GM rm RES 2 ≥ ξm τm as desired. Then.6. Let A = V ΛV −1 be a nonsingular diagonalizable matrix. ˆ Algorithm 3. A.2. w1 = y1 = r0 = b − Ax. j = 1.1.30). if ξm > 0. ii.1. tfqmr(x.3. We apply Theorem 3.html. The implementation below follows [77] with r0 = r0 used throughout. b. η = 0 2. k = 0. GMRES ITERATION 53 Proof. As a corollary we have a ﬁnite termination result and a convergence estimate for diagonalizable matrices similar to Theorem 3. . y2 = y1 − αk−1 v. We may write GM GM rm RES = r0 + Ym zm RES and obtain GM rm RES 2 GM = Wm+1 Ω−1 (fm+1 −Hm zm RES ) m+1 2 GM ≥ ξm fm+1 −Hm zm RES 2. This electronic version is for personal use and may not be duplicated or distributed. Theorem 3. Hence.32) and use (3.6. Do While k < kmax (a) k = k + 1 T (b) σk−1 = r0 v.Copyright ©1995 by the Society for Industrial and Applied Mathematics. b ∈ RN be given. η = c2 αk−1 x = x + ηd √ If τ m + 1 ≤ b 2 terminate successfully T (d) ρk = r0 w. Then within (N + 1)/2 iterations the TFQMR iteration will either break down or terminate with the solution. vi.ecsecurehost. kmax) 1. For 1 ≤ m ≤ 2k let ξm be the smallest singular value of Wm+1 Ω−1 m+1 and let xm be given by (3. v. iii. Assume that TFQMR does not break down or terminate with the solution for k iterations.6. iv. c = 1/ 1 + θ2 τ = τ θc. θ = 0.31) to obtain the following result. Let A be an N × N matrix and let x0 . rm r0 2 2 ≤ √ −1 m + 1ξm κ(V ) max φ(z) z∈σ(A) for all φ ∈ Pm . 2 (m = 2k − 2 + j) w = w − αk−1 uj d = yj + (θ2 η/αk−1 )d √ θ = w 2 /τ . . τ = r 2 . α = ρk−1 /σk−1 . β = ρk /ρk−1 Buy this book from SIAM at http://www. u2 = Ay2 (c) For i.1. Corollary 3. u1 = v = Ay1 .com/SIAM/FR16.
whose storage requirements do not increase with the number of iterations. described in the comment lines. and [139]. If we let Gu denote the action of the Poisson solver on u. 0) = u(x. we expect secondorder accuracy and terminate the iteration when rk 2 ≤ h2 b 2 ≈ 9. We do not pass the preconditioner to the GMRES code. y) = −(uxx (x. y) = u(1. y)u(x. y) = 20y. and a3 (x. The motivation for this choice is similar to that for the conjugate gradient computations and has been partially explained theoretically in [33]. As a preconditioner we use the fast Poisson solver fish2d described in § 2. BiCGSTAB. and iteration parameters to specify the maximum number of iterations and the termination criterion. As before. y) = 0. 3.com/SIAM/FR16. xj ) where xi = ih for 1 ≤ i ≤ n. The inputs.7. a2 (x. In all the GMRES examples. we limit the number of GMRES iterations to 60. and TFQMR. rather we pass the preconditioned problem.33) (Lu)(x. This electronic version is for personal use and may not be duplicated or distributed.7 we discretize with a ﬁvepoint centered difference scheme with n2 points and mesh width h = 1/(n + 1). .7.Copyright ©1995 by the Society for Industrial and Applied Mathematics. We consider the discretization of the partial diﬀerential equation (3. Examples for GMRES iteration These examples use the code gmres from the collection of MATLAB codes. y)ux (x. y)uy (x.html. [121]. y) = 1. y) = f (x. we allow for more iterations. y) on 0 < x. the righthand side vector b. y) +a2 (x. y < 1 subject to homogeneous Dirichlet boundary conditions u(x. We take advantage of this in the MATLAB code tfqmr to potentially reduce the matrixvector product cost by one. For general coeﬃcients {aj } the operator L is not selfadjoint and its discretization is not symmetric. the preconditioned system is GLu = Gf . are the initial iterate x0 .8 × 10−4 b 2 . y) = 1. y) + uyy (x. We compute Lu in a matrixfree manner as we did in § 2. [34]. y < 1. 1) = u(0. We let n = 31 to create a system with 961 unknowns. CGNR. For the computations reported in this section we took a1 (x. y)) + a1 (x. y) + a3 (x. a MATLAB function for the matrixvector product. 54 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS (e) y1 = w + βy2 . 0 < x. For methods like GMRES(m). The unknowns are uij ≈ u(xi . As in § 2. u1 = Ay1 (f) v = u1 + β(u2 + βv) Note that y2 and u2 = Ay2 need only be computed if the loop in step 2c does not terminate when j = 1.ecsecurehost.7. Buy this book from SIAM at http://www.
1 and 3. To apply CGNR to the preconditioned problem. we require the transpose of GL. Note that the plots are semilog plots of r 2 / b 2 and that the righthand sides in the preconditioned and unpreconditioned cases are not the same. In both ﬁgures the solid line is the history of the preconditioned iteration and the dashed line that of the unpreconditioned. However. 3. Restarting after 3 iterations is more costly in terms of the iteration count than not restarting.33).6 million ﬂoating point operations and converged in 56 iterations. This electronic version is for personal use and may not be duplicated or distributed. 3. The cost of the application of the transpose of L or GL is the same as that of the application of L or GL itself. . CGNR (or CGNE) is an alternative to GMRES that does not require storage of the iteration history. 3.Copyright ©1995 by the Society for Industrial and Applied Mathematics.7 million ﬂoating point operations and the preconditioned iteration required 13 iterations and 2.7 we took the right hand side so that the exact solution was the discretization of 10xy(1 − x)(1 − y) exp(x4. GMRES ITERATION 55 u0 = 0 was the initial iterate. The reason for this is that the FFT routine is a MATLAB intrinsic and is not interpreted MATLAB code. Hence.4.8. In other computing environments preconditioners which vectorize or parallelize particularly well might perform in the same way.6 million ﬂoating point operations. The preconditioned iteration required 1.com/SIAM/FR16. Examples for CGNR. As in § 2. GMRES(3) in the terminology of § 3. which is given by (GL)∗ u = L∗ G∗ u = L∗ Gu. the transpose (adjoint) of L is given by L∗ u = −uxx − uyy − a1 ux − a2 uy + a3 u as one can derive by integration by parts. and TFQMR iteration When a transpose is available. For elliptic problems like (3. However.ecsecurehost. BiCGSTAB. a single CGNR iteration costs roughly the same as two GMRES iterations.html. The dashed line corresponds to Buy this book from SIAM at http://www.3 shows.5 ). In this case the MATLAB flops command indicates that the unpreconditioned iteration terminated after 223 iterations and 9. in the case where the cost of the matrixvector product dominates the computation. CGNR has the eﬀect of squaring the condition number.2 we plot iteration histories corresponding to preconditioned or unpreconditioned GMRES. the restarted iteration terminated successfully. For the restarted iteration. even in the unpreconditioned case.3 million ﬂoating point operations and terminated after 8 iterations. we restarted the algorithm after three iterations. this eﬀect has serious consequences as Fig. In Figs. The MATLAB flops command indicates that the unpreconditioned iteration required 7. Note the importance of preconditioning. The execution times in our environment indicate that the preconditioned iterations are even faster than the diﬀerence in operation counts indicates.
The preconditioned formulation did quite well. converging in eight iterations. 10 0 10 Relative Residual 1 10 2 10 3 10 4 0 50 100 Iterations 150 200 250 Fig. GMRES(3) for 2D elliptic equation.. i.Copyright ©1995 by the Society for Industrial and Applied Mathematics.7 million ﬂoatingpoint operations Buy this book from SIAM at http://www.com/SIAM/FR16. 56 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS 0 10 10 Relative Residual 1 10 2 10 3 10 4 0 10 20 30 Iterations 40 50 60 Fig.ecsecurehost. 3.1.html. i.e.. . This is a result of squaring the large condition number. The MATLAB flops command indicates that the unpreconditioned iteration required 13. CGNR on the unpreconditioned problem. This electronic version is for personal use and may not be duplicated or distributed. We limited the number of iterations to 310 = 10m and the unpreconditioned formulation of CGNR had made very little progress after 310 iterations. GMRES for 2D elliptic equation. CG applied to L∗ Lu = L∗ f .2. 3.e. The solid line corresponds to the CGNR applied to the preconditioned problem. CG applied to L∗ G2 Lu = L∗ G2 f . The matrix corresponding to the discretization of L∗ L has a large condition number.
8. We repeat the experiment with TFQMR as our linear solver. This electronic version is for personal use and may not be duplicated or distributed. The examples for BiCGSTAB use the code bicgstab from the collection of MATLAB codes.4.5 we plot the convergence histories in terms of the approximate residual 2k + 1τ2k given by (3. .33) with the same data and termination criterion as for the GMRES example.9. see Exercise 3. 10 0 10 Relative Residual 1 10 2 10 3 10 4 0 50 100 150 200 Iterations 250 300 350 Fig. The preconditioned iteration terminated in six iterations (12 matrixvector products) and required roughly 1. We use the code tfqmr that is included in our collection.Copyright ©1995 by the Society for Industrial and Applied Mathematics. 3. since our unpreconditioned matrixvector product is so inexpensive.√3. In Fig. the behavior of even the preconditioned iteration can also be poor if the coeﬃcients of the ﬁrst derivative terms are too large. Neither convergence history is monotonic. We see a considerable improvement in cost over CGNR and.31). However. GMRES ITERATION 57 and the preconditioned iteration 2.html. The approximate residual is given in terms of the quasiresidual norm τm rather than the true residual. for the ﬁnal iterate.9. This code has the same input/output arguments as gmres. CGNR for 2D elliptic equation. possibly.3. as you are asked to investigate in Exercise 3.9. CGNE has similar performance. We plot the iteration histories in Fig. Our Buy this book from SIAM at http://www. 3. the solid line corresponds to the preconditioned iteration and the dashed line to the unpreconditioned iteration. We applied BiCGSTAB to both the preconditioned and unpreconditioned forms of (3. with the unpreconditioned iteration being very irregular.4 million.ecsecurehost. but not varying by many orders of magnitude as a CGS or BiCG iteration might. As in the previous examples.com/SIAM/FR16.2 million ﬂoatingpoint operations. and the graphs are monotonic. a better cost/iterate than GMRES in the unpreconditioned case.7 million ﬂoatingpoint operations. The unpreconditioned iteration took 40 iterations and 2. We plot only full iterations (m = 2k) except.
indicating that it is reasonable to use the quasiresidual estimate (3. taking 1. termination criterion is based on the quasiresidual and (3.html. BiCGSTAB for 2D elliptic equation.9 million ﬂoatingpoint operations and 7 iterations. 3. The preconditioned iteration was much more eﬃcient. stopping the iteration when τm m + 1/ b 2 ≤ h−2 .Copyright ©1995 by the Society for Industrial and Applied Mathematics. The plateaus in the graph of the convergence history for the Buy this book from SIAM at http://www. .4.com/SIAM/FR16.5.4. This electronic version is for personal use and may not be duplicated or distributed.6.31) to make termination decisions. The unpreconditioned iteration required roughly 4. 58 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS 1 10 10 0 Relative Residual 10 1 10 2 10 3 10 4 0 5 10 15 20 Iterations 25 30 35 40 Fig. 10 0 10 Relative Approximate Residual 1 10 2 10 3 10 4 10 5 0 10 20 30 40 Iterations 50 60 70 Fig.1 million ﬂoatingpoint operations and 67 iterations for termination. TFQMR for 2D elliptic equation.31) as described in √ § 3.ecsecurehost. This led to actual relative residuals of 7 × 10−5 for the unpreconditioned iteration and 2 × 10−4 for the preconditioned iteration. 3.
Buy this book from SIAM at http://www. .com/SIAM/FR16. GMRES ITERATION 59 unpreconditioned iteration are related (see [196]) to spikes in the corresponding graph for a CGS iteration. This electronic version is for personal use and may not be duplicated or distributed.html.Copyright ©1995 by the Society for Industrial and Applied Mathematics.ecsecurehost.
Solve this problem with GMRES (without preconditioning) and then apply as a preconditioner the map M deﬁned by −(M f ) = f. and λ ∈ R such that the GMRES residuals satisfy rk 2 = r0 2 for all k < N .9. . Modify the MATLAB GMRES code to allow for restarts.7. Try this for meshes with 50. Prove that the Arnoldi process (unless it terminates prematurely with a solution) produces a basis for Kk+1 that satisﬁes (3.8). Exercises on GMRES 3. This electronic version is for personal use and may not be duplicated or distributed.1 and add two more columns corresponding to classical Gram–Schmidt orthogonalization with reorthogonalization at every step and determine when the test in (3. Consider the centered diﬀerence discretization of −u + u + u = 1. . (M f )(0) = (M f )(1) = 0. 0 1 Give an example of x0 . points.. Buy this book from SIAM at http://www. precondition with a solver for the highorder term in the diﬀerential equation using the correct boundary conditions. Prove Theorem 3. CGNE.. 100.10) detects loss of orthogonality.. How do other methods such as CGNR.html.4. 60 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS 3.9. 3.1.9. 3. . .5.3.9. 0 1 . 1 λ 0 λ . . 200. 3. See [167] for another example. λ 0 .9. That is. ..6. 0 0 .2. b ∈ RN . One way to do this is to call the gmres code with another code which has an outer loop to control the restarting as in Algorithm gmresm. How do your conclusions compare with those in [160]? 3.Copyright ©1995 by the Society for Industrial and Applied Mathematics. Duplicate Table 3. Give an example of a matrix that is not diagonalizable.1.com/SIAM/FR16. BiCGSTAB.9. 3. and TFQMR perform? 3.ecsecurehost. Let A = Jλ be an elementary Jordan block of size N corresponding to an eigenvalue λ = 0. How would you test for failure in a restarted algorithm? What kinds of failures can occur? Repeat Exercise 6 with a limit of three GMRES iterations before a restart. .9.1. This means that Jλ = λ 0 1 λ . How does the performance of the iteration depend on the mesh? Try other preconditioners such as Gauss–Seidel and Jacobi. u(0) = u(1) = 0.9. .5.4 and Theorem 3.
9. Apply CGNE to the PDE in § 3. to the equation LGw = f and then recover the solution from u = Gw. Experiment with diﬀerent values of m.9.Copyright ©1995 by the Society for Industrial and Applied Mathematics. Why is the performance of the methods sensitive to ay and ax ? 3. Why might GMRES(m) be faster than GMRES? Vary the dimension and the coeﬃcient ay . and/or CGNE. BiCGSTAB.11.33) in § 3.7. Are the solutions to (3. For the PDE in § 3. TFQMR. 3. 3.8 equally accurate? Compare the ﬁnal results with the known solution.8.9. This electronic version is for personal use and may not be duplicated or distributed.9. Try ay = 5y.7 and § 3. Consider preconditioning from the right.com/SIAM/FR16. How does the performance diﬀer from CGNR and GMRES? 3. Implement BiCG and CGS and try them on some of the examples in this chapter.9. Duplicate the results in § 3. GMRES ITERATION 61 3.9. Compare timings and solution accuracy with those obtained by left preconditioning.7. Compare the execution times in your computing environment.html. apply GMRES. Buy this book from SIAM at http://www.8.10.ecsecurehost.12.7 and § 3. . CGNR.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
62
ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
Chapter 4
Basic Concepts and FixedPoint Iteration
This part of the book is about numerical methods for the solution of systems of nonlinear equations. Throughout this part, we will let · denote a norm on RN as well as the induced matrix norm. We begin by setting the notation. We seek to solve (4.1) F (x) = 0. Here F : RN → RN . We denote the ith component of F by fi . If the components of F are diﬀerentiable at x ∈ RN we deﬁne the Jacobian matrix F (x) by ∂fi F (x)ij = (x). ∂xj The Jacobian matrix is the vector analog of the derivative. We may express the fundamental theorem of calculus as follows. Theorem 4.0.1. Let F be diﬀerentiable in an open set Ω ⊂ RN and let ∗ ∈ Ω. Then for all x ∈ Ω suﬃciently near x∗ x F (x) − F (x∗ ) = 4.1. Types of convergence Iterative methods can be classiﬁed by the rate of convergence. Definition 4.1.1. Let {xn } ⊂ RN and x∗ ∈ RN . Then • xn → x∗ qquadratically if xn → x∗ and there is K > 0 such that xn+1 − x∗ ≤ K xn − x∗ 2 . • xn → x∗ qsuperlinearly with qorder α > 1 if xn → x∗ and there is K > 0 such that xn+1 − x∗ ≤ K xn − x∗ α . • xn → x∗ qsuperlinearly if
n→∞ 1 0
F (x∗ + t(x − x∗ ))(x − x∗ ) dt.
lim
xn+1 − x∗ = 0. xn − x∗
65
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
66
ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
• xn → x∗ qlinearly with qfactor σ ∈ (0, 1) if xn+1 − x∗ ≤ σ xn − x∗ for n suﬃciently large. Definition 4.1.2. An iterative method for computing x∗ is said to be locally (qquadratically, qsuperlinearly, qlinearly, etc.) convergent if the iterates converge to x∗ (qquadratically, qsuperlinearly, qlinearly, etc.) given that the initial data for the iteration is suﬃciently good. Note that a qsuperlinearly convergent sequence is also qlinearly convergent with qfactor σ for any σ > 0. A qquadratically convergent sequence is qsuperlinearly convergent with qorder 2. In general, a qsuperlinearly convergent method is preferable to a qlinearly convergent one if the cost of a single iterate is the same for both methods. We will see that often a method that is more slowly convergent, in terms of the convergence types deﬁned above, can have a cost/iterate so low that the slow iteration is more eﬃcient. Sometimes errors are introduced into the iteration that are independent of the errors in the approximate solution. An example of this would be a problem in which the nonlinear function F is the output of another algorithm that provides error control, such as adjustment of the mesh size in an approximation of a diﬀerential equation. One might only compute F to low accuracy in the initial phases of the iteration to compute a solution and tighten the accuracy as the iteration progresses. The rtype convergence classiﬁcation enables us to describe that idea. Definition 4.1.3. Let {xn } ⊂ RN and x∗ ∈ RN . Then {xn } converges to x∗ r(quadratically, superlinearly, linearly) if there is a sequence {ξn } ⊂ R converging q(quadratically, superlinearly, linearly) to zero such that xn − x∗ ≤ ξn . We say that {xn } converges rsuperlinearly with rorder α > 1 if ξn → 0 qsuperlinearly with qorder α. 4.2. Fixedpoint iteration Many nonlinear equations are naturally formulated as ﬁxedpoint problems (4.2) x = K(x) where K, the ﬁxedpoint map, may be nonlinear. A solution x∗ of (4.2) is called a ﬁxed point of the map K. Such problems are nonlinear analogs of the linear ﬁxedpoint problems considered in Chapter 1. In this section we analyze convergence of ﬁxedpoint iteration, which is given by (4.3) xn+1 = K(xn ).
This iterative method is also called nonlinear Richardson iteration, Picard iteration, or the method of successive substitution.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
2.2. . Let Ω ⊂ RN and let G : Ω → RM . for all n. If K has two ﬁxed points x∗ and y ∗ in Ω. .2.k→∞ and therefore the sequence {xn } is a Cauchy sequence and has a limit x∗ [162]. and the iteration deﬁned by (4. Definition 4. ≤ γ i x1 − x0 .3) converges qlinearly to x∗ with qfactor γ for all initial iterates x0 ∈ Ω. FIXEDPOINT ITERATION 67 Before discussing convergence of ﬁxedpoint iteration we make two deﬁnitions. . y ∈ Ω. Then there is a unique ﬁxed point of K. Theorem 4. Proof. and therefore xn − x0 = ≤ n−1 i=0 xi+1 n−1 i=0 − xi n−1 i i=0 γ xi+1 − xi ≤ x1 − x0 ≤ x1 − x0 /(1 − γ). xn+k − xn = K(xn+k−1 ) − K(xn−1 ) ≤ γ xn+k−1 − xn−1 ≤ γ K(xn+k−2 ) − K(xn−2 ) ≤ γ 2 xn+k−2 − xn−2 ≤ . Let x0 ∈ Ω. Let Ω ⊂ RN . . G is Lipschitz continuous on Ω with Lipschitz constant γ if G(x) − G(y) ≤ γ x − y for all x. k ≥ 0.2. x∗ ∈ Ω. then x∗ − y ∗ = K(x∗ ) − K(y ∗ ) ≤ γ x∗ − y ∗ . Let Ω be a closed subset of RN and let K be a contraction mapping on Ω with Lipschitz constant γ < 1 such that K(x) ∈ Ω for all x ∈ Ω.com/SIAM/FR16.1. Now. This electronic version is for personal use and may not be duplicated or distributed.2. . K : Ω → RN is a contraction mapping on Ω if K is Lipschitz continuous on Ω with Lipschitz constant γ < 1. Definition 4. ≤ γ n xk − x0 ≤ γ n x1 − x0 /(1 − γ). Note that {xn } ⊂ Ω because x0 ∈ Ω and K(x) ∈ Ω whenever x ∈ Ω.html. Compare the proof to that of Lemma 1.1. Hence lim xn+k − xn = 0 n.ecsecurehost. The sequence {xn } remains bounded since for all i ≥ 1 xi+1 − xi = K(xi ) − K(xi−1 ) ≤ γ xi − xi−1 . Buy this book from SIAM at http://www.2.Copyright ©1995 by the Society for Industrial and Applied Mathematics.1. The standard result for ﬁxedpoint iteration is the Contraction Mapping Theorem [9].1 and Corollary 1.
One simple application of this theorem is to numerical integration of ordinary diﬀerential equations by implicit methods. and the methods based on Newton’s method which we present in the subsequent chapters are used much more often [83]. i.4) y k denotes the approximation of y(t0 + kh).ecsecurehost. F : Ω → RN ×N is Lipschitz continuous with Lipschitz constant γ. we may apply the contraction mapping theorem with Ω = R since for all x. This type of analysis was used by Picard to prove existence of solutions of initial value problems [152].html.Copyright ©1995 by the Society for Industrial and Applied Mathematics. For example. 1. The standard assumptions We will make the standard assumptions on F . Advancing in time from y n to y n+1 requires solution of the ﬁxedpoint problem y = K(y) = y n + hf (y). 2. Assumption 4. This time step may well be too small to be practical.com/SIAM/FR16. see [145] for a more complete discussion of these algorithms. From this we conclude that for h suﬃciently small we can integrate forward in time and that the time step h can be chosen independently of y. This electronic version is for personal use and may not be duplicated or distributed. If hM < 1.1. Finally we note that xn+1 − x∗ = K(xn ) − K(x∗ ) ≤ γ xn − x∗ . (4. Equation 4. x∗ = y ∗ . Buy this book from SIAM at http://www. y K(x) − K(y) = hf (x) − f (y) ≤ hM x − y.4) y n+1 − y n = hf (y n+1 ).3. 68 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS Since γ < 1 this can only hold if x∗ − y ∗ = 0. Nonlinear variations of the classical stationary iterative methods such as Gauss–Seidel have also been used.1 has a solution x∗ . 4. . y. say f (y) ≤ M and f (x) − f (y) ≤ M x − y for all x. which completes the proof. In (4. For simplicity we assume that f is bounded and Lipschitz continuous. y(t0 ) = y0 and its solution by the backward Euler method with time step h.3. e. Hence the ﬁxed point is unique. consider the initial value problem y = f (y).
3.5) and Theorem 4. This will follow from δ< since then (4.html.8) I − F (x∗ )−1 F (x) = F (x∗ )−1 (F (x∗ ) − F (x)) ≤ γ F (x∗ )−1 e ≤ γδ F (x∗ )−1 < 1/2.ecsecurehost. We use (4. For all x ∈ B(δ) we have F (x) ≤ F (x∗ ) + γ e . FIXEDPOINT ITERATION 69 3. F (x∗ ) is nonsingular.1 is an important consequence of the standard assumptions.1 to obtain F (x) ≤ 1 0 F (x∗ + te) e dt ≤ 2 F (x∗ ) e Buy this book from SIAM at http://www. Proof.5) holds if γδ < F (x∗ ) .1.0. The notation introduced above will be used consistently. where e = x − x∗ .1 illustrates one way to weaken the standard assumptions. we will always denote a root of F by x∗ . This electronic version is for personal use and may not be duplicated or distributed. The next result (4.Copyright ©1995 by the Society for Industrial and Applied Mathematics. Then there is δ > 0 so that for all x ∈ B(δ) (4. If xn is the nth iterate of a sequence. Lemma 4. Assume that the standard assumptions hold.com/SIAM/FR16. F (x∗ )−1 2γ −1 To prove the ﬁnal inequality (4.7. Hence (4.7) F (x∗ )−1 F (x) ≤ 2 F (x∗ ) .6) is a direct consequence of the Banach Lemma if I − F (x∗ )−1 F (x) < 1/2. Let δ be small enough so that B(δ) ⊂ Ω.6) and (4. .5) (4. However the classical result on quadratic convergence of Newton’s method requires them and we make them throughout. en = xn − x∗ is the error in that iterate. We let B(r) denote the ball of radius r about x∗ B(r) = {x  e < r}.3. we note that if x ∈ B(δ) then x∗ + te ∈ B(δ) for all 0 ≤ t ≤ 1. Exercise 5. −1 e /2 ≤ F (x) ≤ 2 F (x∗ ) e . F (x)−1 ≤ 2 F (x∗ )−1 . Throughout this part. Lemma 4. These assumptions can be weakened [108] without sacriﬁcing convergence of the methods we consider here.7).
by (4.Copyright ©1995 by the Society for Industrial and Applied Mathematics. e /2 ≤ F (x∗ )−1 F (x) ≤ F (x∗ )−1 F (x) . Buy this book from SIAM at http://www.8) F (x∗ )−1 F (x) ≥ e (1 − Therefore 1 0 1 0 1 0 F (x∗ + te)e dt (I − F (x∗ )−1 F (x∗ + te))e dt. which completes the proof.ecsecurehost.7) note that F (x∗ )−1 F (x) = F (x∗ )−1 =e− and hence. I − F (x∗ )−1 F (x∗ + te)dt ) ≥ e /2.html. To prove the left inequality in (4.com/SIAM/FR16. . 70 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS which is the right inequality in (4. This electronic version is for personal use and may not be duplicated or distributed.7).
1 and the Lipschitz continuity of F e+ ≤ (2 F (x∗ )−1 )γ ec 2 /2. Newton’s method is x+ = xc − F (xc )−1 F (xc ).1. (5. By Lemma 4. [156]. Theorem 5. The proof of convergence of the complete Newton iteration will be complete if we reduce δ if needed so that Kδ < 1.0.1. This electronic version is for personal use and may not be duplicated or distributed. Local convergence of Newton’s method We will describe iterative methods for nonlinear equations in terms of the transition from a current iterate xc to a new iterate x+ . Let the standard assumptions hold. In the context of single systems this method appeared in the 17th century [140].com/SIAM/FR16. Chapter 5 Newton’s Method 5. In this language.Copyright ©1995 by the Society for Industrial and Applied Mathematics. (5. By Theorem 4.1) We may also view x+ as the root of the twoterm Taylor expansion or linear model of F about xc Mc (x) = F (xc ) + F (xc )(x − xc ). The convergence result on Newton’s method follows from Lemma 4. Let the standard assumptions hold.3. [13].1. Let δ be small enough so that the conclusions of Lemma 4.2) with K = γ F (x∗ )−1 .3.3. [137].html.1 hold.2) Proof.2. 71 Buy this book from SIAM at http://www. .1.1 e+ = ec − F (xc )−1 F (xc ) = F (xc )−1 1 0 (F (xc ) − F (x∗ + tec ))ec dt. Theorem 5. Then there is δ such that if x0 ∈ B(δ) the Newton iteration xn+1 = xn − F (xn )−1 F (xn ) converges qquadratically to x∗ . Then there are K > 0 and δ > 0 such that if xc ∈ B(δ) the Newton iterate from xc given by (5.1.1) satisﬁes e+ ≤ K ec 2 .ecsecurehost. This completes the proof of (5.
html. there is no “righthand side” to use as a scaling factor and we must balance the relative and absolute size of the nonlinear residuals in some other way. Two examples are implicit integration of ordinary diﬀerential equations and diﬀerential algebraic equations. many situations in which the initial iterate is very near the root.ecsecurehost. Termination of the iteration We can base our termination decision on Lemma 4.1. [126]. where the initial iterate is derived from the solution at the previous time step.1 we conclude that if F (x∗ ) is well conditioned.2.Copyright ©1995 by the Society for Industrial and Applied Mathematics. we may terminate the iteration when the relative nonlinear residual F (x) / F (x0 ) is small. Let δ be small enough so that the conclusions of Theorem 5. Lemma 5. and solution of discretizations of partial diﬀerential equations. Assume that the standard assumptions hold. the iteration may not terminate at all. 5.3.1.1 hold.com/SIAM/FR16. In the nonlinear case.1.1 hold. Let δ > 0 be small enough so that the conclusions of Lemma 4.1 implies that (5. Then for all x ∈ B(δ) e F (x) 4κ(F (x∗ )) e ≤ ≤ . a termination decision based on the relative residual may be made too late in the iteration or. However. as a direct consequence of applying Lemma 4.3. [83]. [105]. where the initial iterate is an interpolation of a solution from a coarser computational mesh [99]. We raised this issue in the context of linear equations in Chapter 1 when we compared (1.3. where δ is small enough so that the conclusions of Lemma 4. If x ∈ B(δ).2. then if F (x∗ ) is well conditioned. This electronic version is for personal use and may not be duplicated or distributed.1. a nonlinear form of Lemma 1.1 twice. Theorem 5. when the initial iterate is far from the root and methods such as those discussed in Chapter 8 are needed. Moreover. if there is error in the evaluation of F or the initial iterate is near a root.3.2. We have.1. In all Buy this book from SIAM at http://www. Since x0 ∈ B(δ) by assumption. however.4). Then if n ≥ 0 and xn ∈ B(δ). as we will see in the next section. [16]. From Lemma 5. Reduce δ if needed so that Kδ = η < 1. The assumption that the initial iterate be “suﬃciently near” the solution (x0 ∈ B(δ)) may seem artiﬁcial at ﬁrst look. local convergence results like those in this and the following two chapters describe the terminal phase of the iteration.3) then implies that xn → x∗ qquadratically. the entire sequence {xn } ⊂ B(δ).1 hold for x ∈ B(δ). 72 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS Proof.3) en+1 ≤ K en 2 ≤ η en < en and hence xn+1 ∈ B(ηδ) ⊂ B(δ). There are. .2) and (1. ∗ )) 4 e0 κ(F (x F (x0 ) e0 where κ(F (x∗ )) = F (x∗ ) F (x∗ )−1 is the condition number of F (x∗ ) relative to the norm · . (5. Therefore if xn ∈ B(δ) we may continue the iteration. the size of the relative nonlinear residual is a good indicator of the size of the error.1.
say.4) the decision to stop the iteration can be made before computation and factorization of F (xc ). the Jacobian F (xc ) must be computed and factored. Our examples and discussions of operation counts make the implicit assumption that the Jacobian is dense.Copyright ©1995 by the Society for Industrial and Applied Mathematics. If computation and factorization of the Jacobian are inexpensive. then termination on small steps becomes more attractive. If the iteration is qlinearly convergent. 5. Buy this book from SIAM at http://www. Combinations of relative and absolute error tolerances are commonly used in numerical methods for ordinary diﬀerential equations or diﬀerential algebraic equations [16]. in order to estimate ec this way. This electronic version is for personal use and may not be duplicated or distributed. If one decides to continue.4) F (x) ≤ τr F (x0 ) + τa . [151]. NEWTON’S METHOD 73 of our MATLAB codes and algorithms for nonlinear equations our termination criterion is to stop the iteration if (5. near the solution s and ec are essentially the same size.1. However. [21]. . In order to compute the Newton iterate x+ from a current point xc one must ﬁrst evaluate F (xc ) and decide whether to terminate the iteration. This criterion is based on Theorem 5.1. Hence. Implementation of Newton’s method In this section we assume that direct methods will be used to solve the linear equation for the Newton step. then e+ ≤ σ ec implies that (1 − σ) ec ≤ s ≤ (1 + σ) ec . the issues for sparse Jacobians when direct methods are used to solve linear systems are very similar. For methods other than Newton’s method. The diﬀerential equations codes discussed in [16]. [23]. [21].3.ecsecurehost. make use of this in the special case of the chord method. and [151].com/SIAM/FR16. one must do most of the work needed to compute x+ . Another way to decide whether to terminate is to look at the Newton step s = −F (xc )−1 F (xc ) = x+ − xc . the relation between the step and the error is less clear. and terminate the iteration when s is suﬃciently small. whereas by terminating on small relative residuals or on a condition like (5. where the relative error tolerance τr and absolute error tolerance τa are input to the algorithm. [23]. In Chapter 6 we consider algorithms in which iterative methods are used to solve the equation for the Newton step.html. However. which implies that ec = s + O( ec 2 ). Hence the step is a reliable indicator of the error provided σ is not too near 1.
For deﬁniteness in the description of Algorithm newton we use an LU factorization of F . One approach to reducing the cost of items 2a and item 2b in the Newton iteration is to move them outside of the main loop.1. (c) Solve LU s = −F (x) (d) x = x + s (e) Evaluate F (x). For general nonlinearities or general purpose codes such as the MATLAB code nsol from the collection. Do while F (x) > τr r0 + τa (a) Compute F (x) (b) Factor F (x) = LU . x+ = xc − F (x0 )−1 F (xc ). τ ) 1. Hence the cost of a Newton step may be roughly estimated as N + 1 evaluations of F and O(N 3 ) ﬂoatingpoint operations. τ ) 1. τa ) ∈ R2 . Factorization of F in the dense case costs O(N 3 ) ﬂoatingpoint operations.Copyright ©1995 by the Society for Industrial and Applied Mathematics.21 for an example. Algorithm 5. r0 = F (x) 2. Buy this book from SIAM at http://www. Algorithm 5.3. Any other appropriate factorization such as QR or Cholesky could be used as well. The inputs of the algorithm are the initial iterate x. This electronic version is for personal use and may not be duplicated or distributed. Compute F (x) 3. This method is called the chord method. and a vector of termination tolerances τ = (τr . This means that the linear approximation of F (x) = 0 that is solved at each iteration has derivative determined by the initial iterate. r0 = F (x) 2.2. the nonlinear map F .com/SIAM/FR16.ecsecurehost. The inputs are the same as for Newton’s method. 74 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS Then the step is computed as the solution of F (xc )s = −F (xc ) and the iterate is updated x+ = xc + s. In many cases F can be computed more eﬃciently. Evaluation of F by ﬁnite diﬀerences should be expected to cost N times the cost of an evaluation of F because each column in F requires an evaluation of F to form the diﬀerence approximation.html. and directly than with diﬀerences and the analysis above for the cost of a Newton iterate is very pessimistic. newton(x. accurately. Factor F (x) = LU . In the future. F.3. See Exercise 5. there may be little alternative to diﬀerence Jacobians. . F.7. automatic diﬀerentiation (see [94] for a collection of articles on this topic) may provide such an alternative. chord(x. the evaluation and factorization of F are the most costly. Of these steps.
5. These issues are discussed in [201] as well.1 imply (5.1.6) e+ ≤ K ec 2 + 2 F (xc )−1 − (F (xc ) + ∆(xc ))−1 ) F (x∗ ) ec (xc ) .ecsecurehost.3.1. the resulting iteration can return a result that is an O( ) accurate approximation to x∗ . Proof. . We will refer to this phase of the iteration in which the nonlinear residual is no longer being reduced as stagnation of the iteration. Similar diﬀerences arise if F (xc ) is numerically approximated by diﬀerences. and δ1 > 0 such that if xc ∈ B(δ) and ∆(xc ) < δ1 then x+ = xc − (F (xc ) + ∆(xc ))−1 (F (xc ) + (xc )) is deﬁned (i. + Lemma 4. for example. + (F (xc ) + ∆(xc ))−1 Buy this book from SIAM at http://www. The diﬀerence in the iteration itself is that an approximation to F (xc ) is used.1. is entirely due to ﬂoatingpoint roundoﬀ. This is diﬀerent from convergence and was called “local improvement” in [65].e. The only diﬀerence in implementation from Newton’s method is that the computation and factorization of the Jacobian are done before the iteration is begun. This electronic version is for personal use and may not be duplicated or distributed.1 and Theorem 5.com/SIAM/FR16. F (xc ) + ∆(xc ) is nonsingular) and satisﬁes (5.3.Copyright ©1995 by the Society for Industrial and Applied Mathematics. Errors in the function and derivative Suppose that F and F are computed inaccurately so that F + and F + ∆ are used instead of F and F in the iteration.4. Let the standard assumptions hold.1 and Theorem 5. Then there are ¯ K > 0. If ∆ is suﬃciently small. NEWTON’S METHOD 75 4. Let δ be small enough so that the conclusions of Lemma 4.html. Do while F (x) > τr r0 + τa (a) Solve LU s = −F (x) (b) x = x + s (c) Evaluate F (x). there is no reason to expect that F (xn ) will ever be smaller than in general. If. δ > 0.. Let xN = xc − F (xc )−1 F (xc ) + and note that x+ = xN + (F (xc )−1 − (F (xc ) + ∆(xc ))−1 )F (xc ) − (F (xc ) + ∆(xc ))−1 (xc ).1 hold.5) ¯ e+ ≤ K( ec 2 + ∆(xc ) ec + (xc ) ). Theorem 5.4. We continue in the next section with an analysis of the eﬀects of errors in the function and Jacobian in a general context and then consider the chord method and diﬀerence approximation to F as applications.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
76 If
ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
∆(xc ) ≤ F (x∗ )−1 ∆(xc ) ≤ F (xc )−1
−1
/4
then Lemma 4.3.1 implies that
−1
/2
and the Banach Lemma implies that F (xc ) + ∆(xc ) is nonsingular and (F (xc ) + ∆(xc ))−1 ≤ 2 F (xc )−1 ≤ 4 F (x∗ )−1 . Hence F (xc )−1 − (F (xc ) + ∆(xc ))−1 ≤ 8 F (x∗ )−1 We use these estimates and (5.6) to obtain e+ ≤ K ec Setting
2 2
∆(xc ) .
+ 16 F (x∗ )−1
2
F (x∗ ) ∆(xc ) ec + 4 F (x∗ )−1
2
(xc ) .
¯ K = K + 16 F (x∗ )−1
F (x∗ ) + 4 F (x∗ )−1
completes the proof. The remainder of this section is devoted to applications of Theorem 5.4.1. 5.4.1. The chord method. Recall that the chord method is given by
x+ = xc − F (x0 )−1 F (xc ). In the language of Theorem 5.4.1 (xc ) = 0, ∆(xc ) = F (x0 ) − F (xc ). Hence, if xc , x0 ∈ B(δ) ⊂ Ω (5.7) ∆(xc ) ≤ γ x0 − xc ≤ γ( e0 + ec ).
We apply Theorem 5.4.1 to obtain the following result. Theorem 5.4.2. Let the standard assumptions hold. Then there are KC > 0 and δ > 0 such that if x0 ∈ B(δ) the chord iterates converge qlinearly to x∗ and (5.8) en+1 ≤ KC e0 en . Proof. Let δ be small enough so that B(δ) ⊂ Ω and the conclusions of Theorem 5.4.1 hold. Assume that xn ∈ B(δ). Combining (5.7) and (5.5) implies ¯ ¯ en+1 ≤ K( en (1 + γ) + γ e0 ) en ≤ K(1 + 2γ)δ en . Hence if δ is small enough so that ¯ K(1 + 2γ)δ = η < 1
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
NEWTON’S METHOD
77
then the chord iterates converge qlinearly to x∗ . Qlinear convergence implies ¯ that en < e0 and hence (5.8) holds with KC = K(1 + 2γ). Another variation of the chord method is x+ = xc − A−1 F (xc ), where A ≈ F (x∗ ). Methods of this type may be viewed as preconditioned nonlinear Richardson iteration. Since ∆(xc ) = A − F (xc ) ≤ A − F (x∗ ) + F (x∗ ) − F (xc ) , if xc ∈ B(δ) ⊂ Ω then ∆(xc ) ≤ A − F (x∗ ) + γ ec ≤ A − F (x∗ ) + γδ. Theorem 5.4.3. Let the standard assumptions hold. Then there are KA > 0, δ > 0, and δ1 > 0, such that if x0 ∈ B(δ) and A − F (x∗ ) < δ1 then the iteration xn+1 = xn − A−1 F (xn ) converges qlinearly to x∗ and (5.9) en+1 ≤ KA ( e0 + A − F (x∗ ) ) en .
5.4.2. Approximate inversion of F . Another way to implement chordtype methods is to provide an approximate inverse of F . Here we replace F (x)−1 by B(x), where the action of B on a vector is less expensive to compute than a solution using the LU factorization. Rather than express the iteration in terms of B −1 (x) − F (x) and using Theorem 5.4.3 one can proceed directly from the deﬁnition of approximate inverse. Note that if B is constant (independent of x) then the iteration x+ = xc − BF (xc ) can be viewed as a preconditioned nonlinear Richardson iteration. We have the following result. Theorem 5.4.4. Let the standard assumptions hold. Then there are KB > 0, δ > 0, and δ1 > 0, such that if x0 ∈ B(δ) and the matrixvalued function B(x) satisﬁes (5.10) I − B(x)F (x∗ ) = ρ(x) < δ1
for all x ∈ B(δ) then the iteration xn+1 = xn − B(xn )F (xn ) converges qlinearly to x∗ and (5.11) en+1 ≤ KB (ρ(xn ) + en ) en .
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
78
ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS
Proof. On one hand, this theorem could be regarded as a corollary of the Banach Lemma and Theorem 5.4.1. We give a direct proof. First, by (5.10) we have (5.12) B(x) = B(x)F (x∗ )F (x∗ )−1 ≤ B(x)F (x∗ ) F (x∗ )−1 ≤ MB = (1 + δ1 ) F (x∗ )−1 .
Using (5.12) and e+ = ec − B(xc )F (xc ) =
1 0
(I − B(xc )F (x∗ + tec ))ec dt
1 0
= (I − B(xc )F (x∗ ))ec + B(xc ) we have
(F (x∗ ) − F (x∗ + tec ))ec dt
e+ ≤ ρ(xc ) ec + MB γ ec 2 /2.
This completes the proof with KB = 1 + MB γ/2. 5.4.3. The Shamanskii method. Alternation of a Newton step with a sequence of chord steps leads to a class of highorder methods, that is, methods that converge qsuperlinearly with qorder larger than 2. We follow [18] and name this method for Shamanskii [174], who considered the ﬁnite diﬀerence Jacobian formulation of the method from the point of view of eﬃciency. The method itself was analyzed in [190]. Other highorder methods, especially for problems in one dimension, are described in [190] and [146]. We can describe the transition from xc to x+ by y1 (5.13) = xc − F (xc )−1 F (xc ),
yj+1 = yj − F (xc )−1 F (yj ) for 1 ≤ j ≤ m − 1, x+ = ym .
Note that m = 1 is Newton’s method and m = ∞ is the chord method with {yj } playing the role of the chord iterates. Algorithmically a second loop surrounds the factor/solve step. The inputs for the Shamanskii method are the same as for Newton’s method or the chord method except for the addition of the parameter m. Note that we can overwrite xc with the sequence of yj ’s. Note that we apply the termination test after computation of each yj . Algorithm 5.4.1. sham(x, F, τ, m) 1. r0 = F (x) 2. Do while F (x) > τr r0 + τa
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
.8) implies that for 1≤j≤m yj −x∗ ≤ KC xn −x∗ yj−1 −x∗ = KC en j yj−1 −x∗ ≤ . indicate that the chord method is usually the best option for very large problems. x = x + s iii. In fact.ecsecurehost.com/SIAM/FR16. Setting j = m in the above inequality completes the proof. Let δ be small enough so that B(δ) ⊂ Ω and that the conclusions of Theorem 5. Evaluate F (x).14) en+1 ≤ KS en m+1 . . Theorem 5. m i. Proof.4. Optimal choices of m as a function of N can be derived under assumptions consistent with dense matrix algebra [18] by balancing the O(N 3 ) cost of a matrix factorization with the cost of function and Jacobian evaluation. A forward diﬀerence approximation to F (x)w would be F (x + hw) + (x + hw) − F (x) − (x) .html. if we set y1 = xn .4. . iv. Solve LU s = −F (x) ii. Assume that we compute F (x) + (x) instead of F (x) and attempt to approximate the action of F on a vector by a forward diﬀerence. The analysis in [18] and the expectation that. Algorithm nsol is based on this idea and using an idea from [114] computes and factors the Jacobian only until an estimate of the qfactor for the linear convergence rate is suﬃciently low.4. for example. We would do this. Here the jth column of F (x) is F (x)ej where ej is the unit vector with jth component 1 and other components 0. Hence xn+1 ∈ B(δ). This electronic version is for personal use and may not be duplicated or distributed. Then there are KS > 0 and δ > 0 such that if x0 ∈ B(δ) the Shamanskii iterates converge qsuperlinearly to x∗ with qorder m + 1 and (5. ≤ KC en j+1 . h Buy this book from SIAM at http://www. 5.2 hold. only a small number of iterations will be taken. The convergence result is a simple application of Theorem 5. if the initial iterate is suﬃciently accurate. Diﬀerence approximation to F .2.5. If F (x) ≤ τr r0 + τa exit.4. .4. when building an approximate Jacobian. . (5. (c) for j = 1. NEWTON’S METHOD 79 (a) Compute F (x) (b) Factor F (x) = LU . .2. The advantage of the Shamanskii method over Newton’s method is that high qorders can be obtained with far fewer Jacobian evaluations or factorizations. Let the standard assumptions hold and let m ≥ 1 be given. Then if xn ∈ B(δ) all the intermediate iterates {yj } are in B(δ) by Theorem 5.Copyright ©1995 by the Society for Industrial and Applied Mathematics.4.
2. Because of this we must carefully specify what we mean by a diﬀerence approximation to the Jacobian. Definition 5. Let F be deﬁned in a neighborhood of x ∈ RN . If this is not the case.Copyright ©1995 by the Society for Industrial and Applied Mathematics. Deﬁnition 5. Hence the evaluation of a full ﬁnite diﬀerence Jacobian would cost N function evaluations. Exploitation of special structure could reduce this cost. Definition 5. (∇h F )(x) is the N × N matrix whose jth column is given by F (x + h x ej ) − F (x) h x F (he ) − F (x) j x=0 x=0 (∇h F )(x)j = h We make a similar deﬁnition of the diﬀerence approximation of the directional derivative. The choice h ≈ 10−15 in this case can lead to disaster (try it!).1. .4. h should be scaled to reﬂect that. This electronic version is for personal use and may not be duplicated or distributed.1 below speciﬁes the usual choice.4. In the special case where (x) is a result of ﬂoatingpoint roundoﬀ in full precision (¯ ≈ 10−15 ). Then F (x + hw) + (x + hw) − F (x) − (x) = O(h + ¯/h). Let F be deﬁned in a neighborhood of x ∈ RN and let Buy this book from SIAM at http://www. This means. The forward diﬀerence approximation to the directional derivative is not a linear operator in w.4. one for each column of the Jacobian. that very small diﬀerence steps h can lead to inaccurate results. This can become very important if part of F is computed from measured data. The reason for this is that the derivative has been approximated. the cost of a forward diﬀerence approximation of F (x)w is an additional evaluation of F (at the point x+hw). 80 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS Assume that (x) ≤ ¯. Here we have made the implicit assumption that x and w are of roughly the same size. h F (x)w − √ The quantity inside the Oterm is minimized when h = ¯. If F (x) has already been computed. the analysis here indicates that h ≈ 10−7 is a reasonable choice. for instance. A more important consequence of this analysis is that the choice of step size in a forward diﬀerence approximation to the action of the Jacobian on a vector must take into account the error in the evaluation of F .html. and so linearity has been lost.com/SIAM/FR16.ecsecurehost. The choice h = ¯1/2 x / w reﬂects all the discussion above.
.6. We have (5. h ≤ h. Note that Theorem 5.html.Copyright ©1995 by the Society for Industrial and Applied Mathematics.4. Theorem 5. NEWTON’S METHOD 81 w ∈ RN . If information on the sparsity pattern of the Jacobian is available it is sometimes possible [47].4. ¯. one should expect the sequence { en } to stagnate and cease to decrease once√ en ≈ ¯. then ∆(x) = O( ¯) by Lemma 5. while en >> ¯. F (x + h x w/ w ) − F (x) .6 says that en+1 = O(¯). w = 0. h x F (hw/ w ) − F (x) .4. Hence.com/SIAM/FR16. there is no guarantee that (xn ) → 0 as xn → x∗ . If the sparsity pattern is such that F can be reduced to x∗ .4. ¯ Lemma 5. the progress of the iteration will be indistinguishable from Newton’s method as implemented with an exact √ Jacobian. Hence one will see quadratic convergence until the error in F admits no further reduction.6 does not imply that an iteration will converge to Even if is entirely due to ﬂoatingpoint roundoﬀ.4.1. When en ≤ ¯. Then there are δ. x = 0. [42]. and KD such that if xc ∈ B(δ). x+ = xc − ∇hc (F + )(xc )−1 (F (xc ) + (xc )) Buy this book from SIAM at http://www. If we assume that the forward diﬀerence step has been selected so that √ √ h = O( ¯). w = 0. h x = 0. Then there are positive δ. Therefore in the early phases of the iteration.1.15)Dh F (x : w) = 0. Theorem 5. Let the standard assumptions hold. then ∇h (F + )(x) is nonsingular for all x ∈ B(δ) and there is MF > 0 such that F (x) − ∇h (F + )(x) ≤ MF (h + ¯/h). Theorem 5. (x) ≤ ¯ for all x ∈ B(δ). Let the standard assumptions hold. w w The standard assumptions and the Banach Lemma imply the following result.4. This electronic version is for personal use and may not be duplicated or distributed.ecsecurehost. h > ¯ 0 such that if x ∈ B(δ). and there is M− > 0 such that √ hc > M− ¯ then satisﬁes e+ ≤ KD (¯ + ( ec + hc ) ec ). A diﬀerence approximation to a full Jacobian does not use any special structure of the Jacobian and will probably be less eﬃcient than a hand coded analytic Jacobian. w.1 implies the following result. to evaluate ∇h F with far fewer than N evaluations of F . and (x) ≤ ¯ for all x ∈ B(δ). ¯.
We have (xc ) = 0 and (5. If xn = x∗ for some ﬁnite n. one needs two approximations to x∗ . then. which uses the size of w. xc − x− We have the following theorem. For problems in which the Lipschitz constant of F is large. Moreover. xn ∈ B(δ). Assume that x−1 . they should be used with caution. Let δ be small enough so that B(δ) ⊂ Ω and the conclusions of Theorem 5. The local convergence theory can be easily described by Theorem 5. a linear function of w and it is usually the case that (∇h F (x))w = Dh F (x : w).4. x0 . is likely to be more accurate than ∇h F (x)w. The results in this section are for single equations f (x) = 0. an = xn − xn−1 The resulting method is called the secant method and is due to Newton [140].16) ∆(xc ) = f (xc ) − f (x− ) − f (xc ). Then there is δ > 0 such that if x0 . 82 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS a block triangular form. Let the standard assumptions hold with N = 1..ecsecurehost. imply that f (x∗ ) = 0. If F (x∗ ) is ill conditioned or the Lipschitz constant of F is large.e.Copyright ©1995 by the Society for Industrial and Applied Mathematics.html. The secant method. We will prove that the secant method is locally qsuperlinearly convergent if the standard assumptions hold. [137]. x−1 ∈ B(δ) and x0 = x−1 then the secant iterates converge qsuperlinearly to x∗ . the diﬀerence between them may be signiﬁcant and a diﬀerence approximation to a directional derivative. Theorem 5. If we set s = xn − xn−1 we have by (5. in general. While we used full matrix approximations to numerical Jacobians in all the examples reported here.16) and the fundamental theorem of calculus ∆(xn ) = 1 0 (f (xn−1 + ts) − f (xn )) dt Buy this book from SIAM at http://www. . This electronic version is for personal use and may not be duplicated or distributed. the entire solution process can take advantage of this structure [59]. Proof. 5.com/SIAM/FR16. . There is no reason to use a diﬀerence approximation to f for equations in a single variable because one can use previously computed data to approximate f (xn ) by f (xn ) − f (xn−1 ) . we are done.4. the error in the diﬀerence approximation will be large. Let xc be the current iterate and x− the previous iterate. (x) = 0. In order to start. x0 and x−1 . The standard assumptions.1 hold.4. too. where f is a realvalued function of one real variable.1. . . We assume for the sake of simplicity that there are no errors in the evaluation of f . Dh F (x : w) is not.7. i. Otherwise xn = xn−1 .4. .5.
html. . [145]. r− = βγ We will let B0 be the closed ball of radius r− about x0 B0 = {x  x − x0 ≤ r− }. r. [57]. There are β. F (x0 )−1 ≤ β. we give a simpliﬁed version of a Kantorovichlike result for the chord Buy this book from SIAM at http://www. This electronic version is for personal use and may not be duplicated or distributed. The Kantorovich Theorem In this section we present the result of Kantorovich [107].5.4. η.com/SIAM/FR16. NEWTON’S METHOD 83 γ(en  + en−1 ) .5. 2 and hence ∆(xn ) ≤ γxn−1 − xn /2 ≤ Applying Theorem 5. 2.ecsecurehost.17) ¯ en+1  ≤ K((1 + γ/2)en 2 + γen en−1 /2). This result is of use. This completes the proof. We do not prove the result in its full generality and refer to [106]. ¯ K(1 + γ)δ = η < 1 If we reduce δ so that then xn → x∗ qlinearly. F is Lipschitz continuous with Lipschitz constant γ in a ball of radius r ≥ r− about x0 where ¯ √ 1 − 1 − 2βηγ . in proving that discretizations of nonlinear diﬀerential and integral equations have solutions that are near to that of the continuous problem.1. then there is a root and the Newton iteration converges rquadratically to that root.7. The secant method and the classical bisection (see Exercise 5.Copyright ©1995 by the Society for Industrial and Applied Mathematics. which states that if the standard assumptions “almost” hold at a point x0 .1 implies (5. [106]. Assumption 5. for example. and γ with βηγ ≤ 1/2 and x0 ∈ RN ¯ such that 1. However. F is diﬀerentiable at x0 . We require the following assumption. 5. Therefore en+1  ¯ ≤ K((1 + γ/2)en  + γen−1 /2) → 0 en  as n → ∞. and [58] for a complete proof and for discussions of related results.8) method are the basis for the popular method of Brent [17] for the solution of single equations. and F (x0 )−1 F (x0 ) ≤ η.
This electronic version is for personal use and may not be duplicated or distributed. We will show that the map φ(x) = x − F (x0 )−1 F (x) is a contraction on the set S = {x  x − x0 ≤ r− }. Then there is a unique root x∗ of F in B0 . Moreover x∗ is the unique root of F in the ball of radius √ 1 + 1 − 2βηγ ¯ min r. 1 0 (F (x0 + t(x − x0 )) − F (x0 ))(x − x0 ) dt.1 assumes a bit more than Assumption 5. By the fundamental theorem of calculus and Assumption 5. and xn ∈ B0 for all n. and [57]. Let Assumption 5. .1 but has a simple proof that uses only the contraction mapping theorem and the fundamental theorem of calculus and obtains qlinear convergence instead of rlinear convergence. Theorem 5.19) 1 0 F (x0 + t(x − x0 ))(x − x0 ) dt = F (x0 )−1 F (x0 ) + x − x0 +F (x0 )−1 1 0 (F (x0 + t(x − x0 )) − F (x0 ))(x − x0 ) dt and φ(x) − x0 = −F (x0 )−1 F (x0 ) −F (x0 )−1 Hence.5. Theorem 5. Proof.1 hold and let (5.ecsecurehost.18) βγη < 1/2. [145].5.Copyright ©1995 by the Society for Industrial and Applied Mathematics.html.1 we have for all x ∈ S F (x0 )−1 F (x) = F (x0 )−1 F (x0 ) +F (x0 )−1 (5.com/SIAM/FR16. the chord iteration with x0 as the initial iterate converges to x∗ qlinearly with qfactor βγr− .5.5.1. Unlike the chord method results in [106]. βγ about x0 . Buy this book from SIAM at http://www.5. 2 φ(x) − x0 ≤ η + βγr− /2 = r− . 84 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS method to show how existence of a solution can be derived from estimates on the function and Jacobian.
19) F (x0 )−1 F (x) ≥ r − η − βγr2 /2 > 0 because r− < r < r+ . using the rquadratic estimates from [106]. x∗ is the unique root of F in the ball of radius √ 1 + 1 − 2βηγ ¯ min r. βγ about x0 and the errors satisfy (5.5. NEWTON’S METHOD 85 and φ maps S into itself. and xn ∈ B0 for all n. To show that φ is a contraction on S and to prove the qlinear convergence we will show that φ(x) − φ(y) ≤ βγr− x − y for all x. Note that for all x ∈ B0 φ (x) = I − F (x0 )−1 F (x) = F (x0 )−1 (F (x0 ) − F (x)) and hence φ (x) ≤ βγ x − x0 ≤ βγr− < 1. We must show ¯ that if x is such that r− < x − x0 < min(¯.5. [106]. This proves the convergence result and the uniqueness of x∗ in B0 . Letting r = x − x0 we have by (5. then we are done by the uniqueness of x∗ in B0 .Copyright ©1995 by the Society for Industrial and Applied Mathematics. Hence we may assume that r > r− .1 and has a very diﬀerent proof. . To prove the remainder of the uniqueness assertion let √ 1 + 1 − 2βηγ r+ = βγ ¯ and note that r+ > r− because βηγ < 1/2. This electronic version is for personal use and may not be duplicated or distributed. r+ ) r then F (x) = 0. y ∈ B0 . The Kantorovich theorem is more precise than Theorem 5. the contraction mapping theorem (Theorem 4.ecsecurehost. Clearly x∗ is a root of F because it is a ﬁxed point of φ.1 hold. If we do this.com/SIAM/FR16.2. We state the theorem in the standard way [145]. Theorem 5. [63]. We know that βγr− < 1 by our assumption that βηγ < 1/2. 1 0 Therefore. the Newton iteration with x0 as the initial iterate converges to x∗ .html. This completes the proof.5.1) will imply that there is a unique ﬁxed point x∗ of φ in S and that xn → x∗ qlinearly with qfactor βγr− . φ(x) − φ(y) = φ (y + t(x − y))(x − y) dt ≤ βγr− x − y . ≤ 2n βγ n Buy this book from SIAM at http://www. y ∈ S. for all x. Then there is a unique root x∗ of F in B0 . Let Assumption 5. If r = r− .20) en (2βηγ)2 .2.
Do while F (x) > τr r0 + τa (a) Compute ∇h F (x) ≈ F (x) (b) Factor ∇h F (x) = LU . . the reader is asked to create a sparse form of nsol. a threshold for the ratio of successive residuals is also input. 5. Examples for Newton’s method In the collection of MATLAB codes we provide an implementation nsol of Newton’s method. (c) is = 0. say) and sparse matrix factorizations. rc = r0 = F (x) 2.16).html. nsol uses diffjac to approximate the Jacobian by ∇h F with h = 10−7 . 86 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS If βηγ < 1/2 then (5. σ = r+ /rc .1.ecsecurehost. While our examples have dense Jacobians. the chord method. however. The Jacobian is recomputed and factored if either the ratio of successive residuals exceeds the threshold 0 < ρ < 1 or the number of iterations without an update exceeds m. the Jacobian is not recomputed. If the ratio is too large. The output of the MATLAB implementation of nsol is the solution. x = x + s iv. If F (x+ ) / F (xc ) is below a given threshold. the Shamanskii method.Copyright ©1995 by the Society for Industrial and Applied Mathematics.20) implies that the convergence is rquadratic (see Exercise 5.7. Evaluate F (x) v. r+ = F (x) .25.6. Algorithm 5. is = is + 1 ii. ρ) 1. This electronic version is for personal use and may not be duplicated or distributed. and an error ﬂag. In addition to the inputs of Algorithm sham. m. we compute and factor F (x+ ) for use in the subsequent chord steps. nsol(x. or a combination of all of them based on the reduction in the nonlinear residual. In the MATLAB code the iteration is terminated with an error condition if σ ≥ 1 at any point. the history of the iteration stored as the vector of l∞ norms of the nonlinear residuals.6. the ideas in the implementation of nsol also apply to situations in which the Jacobian is sparse and direct methods for solving the equation for the Newton step are necessary. F. τ. which is not easy.7.com/SIAM/FR16. This indicates that the local convergence analysis in this Buy this book from SIAM at http://www. If F (x) ≤ τr r0 + τa exit. Solve LU s = −F (x) iii. This latter option decides if the Jacobian should be recomputed based on the ratio of successive residuals. rc = r+ vi. One would still want to minimize the number of Jacobian computations (by the methods of [47] or [42]. In Exercise 5. σ = 0 (d) Do While is < m and σ ≤ ρ i.
The parameters m and ρ are assigned the default values of 1000 and . F (xn ) ∞ / F (x0 ) ∞ . In Exercise 5. The resulting discrete problem is (5. will ﬁnd the physically meaningful solution [110]. and the ratio Rn = F (xn ) ∞ / F (xn−1 ) ∞ Buy this book from SIAM at http://www.ecsecurehost. This electronic version is for personal use and may not be duplicated or distributed.21 you are asked to modify the code to accept an analytic Jacobian. where µi = (i − 1/2)/N for 1 ≤ i ≤ N . The Chandrasekhar Hequation.com/SIAM/FR16. nsol allows the user to specify a maximum number of iterations. is used to solve exit distribution problems in radiative transfer. The standard assumptions hold near either solution [110] for c ∈ [0. We set the termination parameters τr = τa = 10−6 .7. The discrete problem has its own physical meaning [41] and we will attempt to solve it to high accuracy.1 we tabulate the iteration counter n.21) F (H)(µ) = H(µ) − 1 − The Chandrasekhar Hequation. 1). the solution of the discrete problem is not even a ﬁrstorder accurate approximation to that of the continuous problem. We used the function identically one as initial iterate. NEWTON’S METHOD 87 section is not applicable.5. In Table 5. c 2 1 µH(ν) dν 0 −1 µ+ν = 0. [30]. (5. We will discretize the equation with the composite midpoint rule. Newton’s method. Only one of these solutions has physical meaning. . 1). In Chapter 8 we discuss how to recover from this.9. [41]. It is known [132] that both (5. In the computations reported here we set N = 100 and c = . however. With these default parameters the decision to update the Jacobian is made based entirely on the reduction in the norm of the nonlinear residual. Here we approximate integrals on [0.Copyright ©1995 by the Society for Industrial and Applied Mathematics. Because H has a singularity at µ = 0. 1] by 1 0 f (µ) dµ ≈ 1 N N j=1 f (µj ).22) F (x)i = xi − 1 − c 2N µi xj µi + µj j=1 N −1 . Many times the Jacobian can be computed eﬃciently by reusing results that are already available from the function evaluation. with the 0 function or the constant function with value 1 as initial iterate.22) have two solutions for c ∈ (0.html. We begin with a comparison of Newton’s method and the chord method. We present some iteration statistics in both tabular and graphical form.21) and the discrete analog (5. nsol is based on the assumption that a diﬀerence Jacobian will be used. the default is 40. We computed all Jacobians with the MATLAB code diffjac.
136e01 2. The concave iteration track is the signature of superlinear convergence.000e+00 1.15 for a more careful examination of this concavity.823e02 2.132e01 2. 5. then the chord method should be much more eﬃcient than Newton’s method if the equations for the steps can be solved cheaply. as compared with over 8. 5.2 we compare the Shamanskii method with m = 2 with Newton’s method.ecsecurehost. In the case considered here.8 million ﬂoatingpoint operations while the chord iteration required under 3. This electronic version is for personal use and may not be duplicated or distributed.8 for Newton’s method. The Shamanskii computation required under 6 million ﬂoatingpoint operations. In the remaining examples.965e04 6. The linear convergence of the chord method can also be seen in the convergence of the ratios Rn to a constant value. The Newton iteration required over 8. In the context of this particular example. the iteration statistics. the linear track indicates linear convergence. Buy this book from SIAM at http://www. The good performance of the chord method is no accident.077e01 2.480e01 2. See Exercise 5.html. . In Exercise 5. the cost of the Jacobian computation can be reduced by analytic evaluation and reuse of data that have already been computed for the evaluation of F .7. The reader is encouraged to do this in Exercise 5.511e03 1.891e06 ∞ Rn 1.698e03 7.21.136e01 2.3 million.com/SIAM/FR16.18 you are asked to obtain timings in your environment and will see how the ﬂoatingpoint operation counts are related to the actual timings.136e01 for n ≥ 1 for both Newton’s method and the chord method. However.353e05 2.118e01 2. In Fig.865e04 F (xn ) ∞ / F (x0 ) 1.334e05 1. we plot.388e03 2.1 the solid curve is a plot of F (xn ) ∞ / F (x0 ) ∞ for Newton’s method and the dashed curve a plot of F (xn ) ∞ / F (x0 ) ∞ for the chord method.7. If we think of the chord method as a preconditioned nonlinear Richardson iteration with the initial Jacobian playing the role of preconditioner.136e01 2. for this problem. The ideas in Chapters 6 and 7 show how the low iteration cost of the chord method can be combined with the small number of iterations of a superlinearly convergent nonlinear iterative method. but do not tabulate.729e07 ∞ Rn 1. the cost of a single evaluation and factorization of the Jacobian is far more than the cost of the entire remainder of the chord iteration. 88 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS Table 5.074e02 6.7. In Fig. For this example the MATLAB flops command returns an fairly accurate indicator of the cost.000e+00 1.480e01 3. the simple chord method is the most eﬃcient.Copyright ©1995 by the Society for Industrial and Applied Mathematics.1 Comparison of Newton and chord iteration. n 0 1 2 3 4 5 6 7 8 F (xn ) ∞ / F (x0 ) 1.480e01 1.480e01 2.
96. c = . indicating very slow convergence.6 million ﬂoatingpoint operations.html.5 1 1. Aside from c. Newton and chord methods for Hequation.9. For this problem the Jacobian at the solution is nearly singular [110]. 5. The hybrid approach in nsol with the default parameters would be the chord method for this problem. NEWTON’S METHOD 0 89 10 10 Relative Nonlinear Residual 1 10 2 10 3 10 4 10 5 10 6 10 7 0 1 2 3 4 Iterations 5 6 7 8 Fig. 10 0 10 Relative Nonlinear Residual 1 10 2 10 3 10 4 10 5 10 6 10 7 0 0.5 4 Fig. 5.5 2 Iterations 2.1.5 3 3.2. Newton and Shamanskii method for Hequation.ecsecurehost. .9999.9. all iteration parameters are the same as in the previous example.com/SIAM/FR16.Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed. While Newton’s method converges in 7 iterations and requires 21 million ﬂoatingpoint operations. the chord method requires 188 iterations and 10. The qfactor for the chord method in this computation was over . The hybrid method in nsol with the default parameters Buy this book from SIAM at http://www. We now reconsider the Hequation with c = . c = .
com/SIAM/FR16. One should approach plots of convergence histories with some caution. support this.html.3 we compare the Newton iteration with the hybrid. This implies only that the nonlinear residual norms { F (xn ) } converge rlinearly to zero. In Fig. 90 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS computes the Jacobian four times during the course of the iteration. .Copyright ©1995 by the Society for Industrial and Applied Mathematics. and requires 12 million ﬂoatingpoint operations. in general. 5. Buy this book from SIAM at http://www. converges in 14 iterations. In practice. however.9999. This electronic version is for personal use and may not be duplicated or distributed. While the plots indicate qlinear convergence of the nonlinear residuals. plots like those in this section are common. Newton and hybrid methods for Hequation c = . 5. The theory for the chord method asserts that the error norms converge qlinearly to zero.ecsecurehost. 10 0 10 Relative Nonlinear Residual 1 10 2 10 3 10 4 10 5 10 6 0 2 4 6 8 Iterations 10 12 14 Fig. the theory does not.3.
r r l l Buy this book from SIAM at http://www. When is it locally qquadratically convergent? 5. Given (xk . Exercises on Newton’s method In some of the exercises here. Given the standard assumptions.7. xk ) with r l f (xk )f (xk ) < 0 we replace one of the endpoints with y = (xk + xk )/2 r r l l to force f (xk+1 )f (xk+1 ) < 0. 5. Let ∇h F be given by Deﬁnition 5. The bisection method produces a sequence of intervals (xk .6. A better alternative in the MATLAB environment is to use the semilogy command to plot the norms.4.7. This is easy. y ∈ Ω. x0 ) such that f (x0 )f (x0 ) < 0.7. can the Newton iterates for F (x) = 0 and AF (Bx) = 0 show any performance diﬀerence when started at the same initial iterate? What about the chord method? 5. r l Show that the sequence xk = (xk + xk )/2 is rlinearly convergent if f is r l continuous and there exists (x0 .7. If f (y) = 0.8. A function G is said to be H¨lder continuous with exponent α in Ω if o α for all x. This electronic version is for personal use and may not be duplicated or distributed. 5.5. 5. and in the rest of this part of the book.1. 5. . When one does this. for each iteration tabulate the iteration counter. the norm of F . for nonsingular N × N matrices A and B.Copyright ©1995 by the Society for Industrial and Applied Mathematics.1. and the ratio of F (xn ) / F (xn−1 ) for n ≥ 1. we terminate the iteration.7. When asked to do this.ecsecurehost.3.7.7. Suppose F (xn ) is replaced by ∇hn F (xn ) in the Shamanskii method. Show that the secant method converges qsuperlinearly with qorder √ ( 5 + 1)/2. 5.4. NEWTON’S METHOD 91 5. Prove that the secant method converges rsuperlinearly with rorder √ ( 5 + 1)/2. you will be asked to plot or tabulate iteration statistics.html. prove that the iteration given by xn+1 = xn − (∇hn F (xn ))−1 F (xn ) is locally qsuperlinearly convergent if hn → 0. In what way is the convergence analysis of the secant method changed if there are errors in the evaluation of f ? 5.2. one can visually inspect the plot to determine superlinear convergence (concavity) without explicitly computing the ratios.com/SIAM/FR16.7.7. Use the l∞ norm. Discuss how hn must be decreased as the iteration progresses to preserve the qorder of m + 1 [174]. Show that if the Lipschitz G(x) − G(y) ≤ K x − y continuity condition on F in the standard assumptions is replaced by H¨lder continuity with exponent α > 0 that the Newton iterates converge o with qorder 1 + α. xk ) that r l contain a root of a function of one variable.7. Can the performance of the Newton iteration be improved by a linear change of variables? That is.
9. then the Newton iteration converges cubically (qsuperlinear with qorder 3). 5. use x−1 = .html. 5.10. Apply your program to the following function/initial iterate combinations. Show that if f is a real function of one real variable. 1. f (x) = x2 . the standard assumptions hold. Test it on the problems in Exercise 5. f (x) = x2 + 1. Write a program that solves single nonlinear equations with Newton’s method. then the Newton iteration converges monotonically to x∗ .7. What if f (x∗ ) = 0? How can you extend this result if f (x∗ ) = f (x∗ ) = . f (x) = cos(x) − x. the standard assumptions hold.7. . What problems might arise in such a method? Resolve these problems and implement your algorithm. and explain your results.5 5. How would you estimate the qorder of a qsuperlinearly convergent sequence? Rigorously justify your answer.15.5 2. Show that if f is a real function of one real variable. and f (x0 )f (x0 ) > 0.14. x0 = . f (x) = tan−1 (x).7. show that log( en ) is a concave function of n in the sense that log( en+1 ) − log( en ) Buy this book from SIAM at http://www.13.12.7.com/SIAM/FR16. the chord method.Copyright ©1995 by the Society for Industrial and Applied Mathematics.99x0 . x0 = 1 3. If xn → x∗ qsuperlinearly with qorder α > 1. 92 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS 5. x0 is suﬃciently near x∗ .7.7. tabulate or plot iteration statistics. x0 = . For the secant method. . For scalar functions of one variable a quadratic model of f about a point xc is mc (x) = f (xc ) + f (xc )(x − xc ) + f (xc )(x − xc )2 /2. f is Lipschitz continuous. x0 = 3 4. In what cases is this algorithm superior to Newton’s method? For hints and generalizations of this idea to functions of several variables.11. This electronic version is for personal use and may not be duplicated or distributed. f (x) = sin(x).ecsecurehost. x0 = 10 5. but not equal to x∗ [171]. f (k) (x∗ ) = 0 = f (k+1) (x∗ )? See [131]. .7.7. and f (x∗ ) = f (x∗ ) = 0 but f (x∗ ) = 0 then the iteration xn+1 = xn − 2f (xn )/f (xn ) converges locally qquadratically to x∗ provided x0 is suﬃciently near to x∗ .9. Show that if f is a real function of one real variable. and the secant method. Consider the iterative method that deﬁnes x+ as a root of mc . see [170]. f (x∗ ) = 0 and f (x) is Lipschitz continuous. Can the ideas in [170] be applied to some of the highorder methods discussed in [190] for equations in one unknown? 5. 5. 5.
Compare timings and ﬂop counts with Newton’s method. or [118] for acceleration of the convergence.7.7.19. Modify dirder by setting the forward diﬀerence step h = 10−2 . This electronic version is for personal use and may not be duplicated or distributed.4. Explain your results (see [50] and [157]). [75].20) is rquadratically (but not qquadratically) convergent to 0 if βηγ < 1/2. 5. Compare execution times in your environment. Modify nsol to accept this analytic Jacobian rather than evaluate a diﬀerence Jacobian. Try diﬀerent values of N and c. c combination and the MATLAB flops command to get an idea for the number of ﬂoatingpoint operations required by the various methods. LU = F (x) + E.7.7. [35] and [155] for more discussion of this.9 (see [52]). and the algorithm in nsol. [170]. the chord method. 5. Prove that F can be computed at a cost of less than twice that of an evaluation of F .9999 with ﬁxedpoint iteration.21.html. [49]. Typically (see [89]. .22). Solve the Hequation with N = 100. [184]) the computed LU factorization of F (x) is the exact LU factorization of a nearby matrix. Tabulate or plot iteration statistics.11 improve the convergence? Try one of the approaches in [51].1 to describe how this factorization error aﬀects the convergence of the Newton iteration. How does the analysis for the QR factorization diﬀer? 5. c = .Copyright ©1995 by the Society for Industrial and Applied Mathematics. Solve the Hequation with c = 1 and N = 100.20.7. NEWTON’S METHOD 93 is a decreasing function of n for n suﬃciently large.17. [184] or another book on numerical linear algebra and Theorem 5. Use the theory in [89].7. How does the performance of the methods diﬀer as N and c change? Compare your results to the tabulated values of the H function in [41] or [14].16. Try the chord method and compare the results to part 4 of Exercise 5.7. Does the trick in Exercise 5.22. the discretized nonlinearity from the H equation in (5.18. How do the execution times and ﬂop counts change? 5. Use the MATLAB cputime command to see how long the entire iteration took for each N.9 and c = .com/SIAM/FR16. 5. How does this change aﬀect the convergence rate observations that you have made? Buy this book from SIAM at http://www.7. Is this still the case if the convergence is qsuperlinear but there is no qorder? What if the convergence is qlinear? 5. Compute by hand the Jacobian of F . 5.ecsecurehost.7. Show that the sequence on the right side of (5. You might also look at [90]. Duplicate the results on the Hequation in § 5.6.
λ) = Aφ − λφ φT φ − 1 = 0 0 for the vector x = (φT . 94 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS 5. the solution is u∗ (x) = x(1 − x).24.7. such as the the MATLAB rand function. 5. In MATLAB.7. If the size of the noise is O(10−4 ) and you make the appropriate adjustments to dirder. Show that the number n of Newton iterations needed to obtain en ≤ e0 is O(log( )) and that the number of ﬂoatingpoint operations required is O(N 3 log( )). and f (x) = 2 − sin(x(1 − x)). for example.7. Let u0 = 0 be the initial iterate.23) F (φ. λ) ∈ RN +1 is an eigenvectoreigenvalue pair. This electronic version is for personal use and may not be duplicated or distributed.7. you could modify diffjac to use the techniques of [47] and nsol to use the sparse matrix factorization in MATLAB.26.23) to the inverse power method. and that x0 is near enough to x∗ so that the Newton iteration converges qquadratically to x∗ . See [150] for more development of this idea.25. What about the chord method? 5. u(0) = u(1) = 0. Develop a sparse version of nsol for problems with tridiagonal Jacobian. Apply your code to the central diﬀerence discretization of the twopoint boundary value problem −u = sin(u) + f (x). λ)T ∈ RN +1 .ecsecurehost. Buy this book from SIAM at http://www. Add noise to your function evaluation with a random number generator. the cost of a Jacobian is O(N ) function evaluations. that the cost of a function evaluation is O(N 2 ) ﬂoatingpoint operations. when is F (x) nonsingular? Relate the application of Newton’s method to (5. . Suppose you try to ﬁnd an eigenvectoreigenvalue pair (φ.com/SIAM/FR16. λ) of an N ×N matrix A by solving the system of N + 1 nonlinear equations (5. Here. Assume that the standard assumptions hold.23. how are the convergence rates diﬀerent? What is an appropriate termination criterion? At what point does the iteration stagnate? 5.html. What is the Jacobian of this system? If x = (φ.Copyright ©1995 by the Society for Industrial and Applied Mathematics.
1). See [145] and [175].2 we show how the requirements of the simple analysis can be changed to admit a more aggressive (i.1. This electronic version is for personal use and may not be duplicated or distributed. 95 Buy this book from SIAM at http://www. This was the view taken in [55] where inexact Newton methods in which the step satisﬁes (6. The basic estimates We will discuss speciﬁc implementation issues in § 6. Chapter 6 Inexact Newton Methods Theorem 5.1 we give a straightforward analysis that uses the techniques of Chapter 5.1 describes how errors in the derivative/function aﬀect the progress in the Newton iteration. 6.2. . The proof and application of Theorem 6.1) aﬀects the convergence.2) e+ ≤ KI ( ec + ηc ) ec . Before that we will give the basic result from [55] to show how a step that satisﬁes (6. and x+ = xc + s then (6. for example.1. but the other Krylov subspace methods discussed in Chapters 2 and 3 and elsewhere can also be used.1.1. Another way to look at this is to ask how an approximate solution of the linear equation for the Newton step aﬀects the iteration. 6. Direct analysis. We will present this result in two parts. We follow [69] and refer to the term ηc on the right hand side of (6. Then there are δ and KI such that if xc ∈ B(δ). In § 6.4.html.1) F (xc )s + F (xc ) ≤ ηc F (xc ) are considered.1. Let the standard assumptions hold.1 should be compared to that of Theorem 5. Any approximate step is accepted provided that the relative residual of the linear equation is small.1. In most of this chapter we will focus on the approximate solution of the equation for the Newton step by GMRES.1. This is quite useful because conditions like (6.1) are precisely the small linear residual termination conditions for iterative solution of the linear system for the Newton step. In § 6.ecsecurehost. s satisﬁes (6.1 and the other results in Chapter 5.1) as the forcing term.e.1. for discussion of these ideas in the context of the classical stationary iterative methods. Theorem 6. larger) choice of the parameter η.com/SIAM/FR16.Copyright ©1995 by the Society for Industrial and Applied Mathematics.. Such methods are not new.1.
p for some Kη > 0 the convergence is qsuperlinear 2 Our deﬁnition of r is consistent with the idea that the residual in a linear equation is b − Ax.1 for iterative methods in the next result. Let the standard assumptions hold.2) note that if2 r = −F (xc )s − F (xc ) is the linear residual then s + F (xc )−1 F (xc ) = −F (xc )−1 r and (6. where K is the constant from (5.2.1. the proof is complete. Theorem 6.2) and will suﬃce to explain most computational observations.7). which is a direct consequence of (6. and • if ηn ≤ Kη F (xn ) with qorder 1 + p.2).1).6) imply that s + F (xc )−1 F (xc ) ≤ F (xc )−1 ηc F (xc ) ≤ 4κ(F (x∗ ))ηc ec .3. .3) e+ = ec + s = ec − F (xc )−1 F (xc ) − F (xc )−1 r.1. Let δ be small enough so that the conclusions of Lemma 4.1 e+ ≤ ec − F (xc )−1 F (xc ) + 4κ(F (x∗ ))ηc ec ≤ K ec 2 + 4κ(F (x∗ ))ηc ec .1. To prove the ﬁrst assertion (6. then the inexact Newton iteration ¯ ¯ xn+1 = xn + sn .com/SIAM/FR16.3) and Theorem 5. In [55] r = F (xc )s + F (xc ). η ].1 hold. (4.1. (6. Now. Then there are δ and η such that if x0 ∈ B(δ). Moreover • if ηn → 0 the convergence is qsuperlinear. This electronic version is for personal use and may not be duplicated or distributed.1 and Theorem 5. {ηn } ⊂ [0. where F (xn )sn + F (xn ) ≤ ηn F (xn ) converges qlinearly to x∗ . We summarize the implications of Theorem 6. and (4. Buy this book from SIAM at http://www.ecsecurehost.Copyright ©1995 by the Society for Industrial and Applied Mathematics. Hence.html. If we set KI = K + 4κ(F (x∗ )). using (6. 96 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS Proof.
This is not the case and the purpose of this subsection is to describe more accurately the necessary restrictions on the forcing terms. Then if n ≥ 0 and xn ∈ B(δ) we have en+1 ≤ KI ( en + ηn ) en ≤ KI (δ + η ) en < en . ¯ This proves qlinear convergence with a qfactor of KI (δ + η ). If ηn ≤ Kη F (xn ) then we may use (4. Then there is δ such that if xc ∈ B(δ).1). Theorem 6.1. Since KI = O(κ(F (x∗ ))). with γ ec 2 2 . and ηc ≤ η < η < 1. F (x∗ )ec ec = F (x∗ )−1 F (x∗ )ec ≤ F (x∗ )−1 γ F (x∗ )−1 2 F (xc ) ≤ (1 + M0 δ) F (x∗ )ec . . Buy this book from SIAM at http://www. INEXACT NEWTON METHODS 97 Proof.2) holds for xc ∈ B(δ).0.7) and (6. ¯ If ηn → 0 then qsuperlinear convergence follows from the deﬁnition. one might conclude from Theorem 6.2).4) note that Theorem 4. M0 = (6. x+ = xc + s.ecsecurehost.Copyright ©1995 by the Society for Industrial and Applied Mathematics.com/SIAM/FR16.1 implies that F (xc ) ≤ F (x∗ )ec + Since we have.1. s satisﬁes (6. ¯ where KI is from (6. Let δ be small enough so that (6. then ¯ (6.5) Now. Let the standard assumptions hold.2. 6. ¯ 1−p p + Kη 2p F (x∗ ) p ) en 1+p Proof.3.2) to conclude en+1 ≤ KI ( en which completes the proof.1.1. Reduce δ and η if needed so that ¯ KI (δ + η ) < 1. 1) other than requiring that 1 not be an accumulation point. This electronic version is for personal use and may not be duplicated or distributed.1 in that no restriction is put on the sequence {ηn } ⊂ [0. F (x∗ )e+ = F (x∗ )(ec + s) = F (x∗ )(ec − F (xc )−1 F (xc ) − F (xc )−1 r). Weighted norm analysis.2 that if F (x∗ ) is ill conditioned very small forcing terms must be used. To prove (6.html.4) F (x∗ )e+ ≤ η F (x∗ )ec . The results here diﬀer from those in § 6.
4 diﬀers from Theorem 6. We make such a choice in one of the examples considered in § 6. In fact.3 described in Theorem 6. Hence. This electronic version is for personal use and may not be duplicated or distributed. using (4. The application of Theorem 6.2).7) ≤ (1 + M1 δ) r + M2 δ ec ≤ (1 + M1 δ)(1 + M0 δ)ηc F (x∗ )ec +M2 δ F (x∗ )−1 F (x∗ )ec ≤ ((1 + M1 δ)(1 + M0 δ)η + M2 δ F (x∗ )−1 ) F (x∗ )ec . F (x∗ )e+ (6.4) is remarkable in its assertion that any method that produces a step with a linear residual less than that of the zero step will reduce the norm of the error if the error is measured with the weighted norm · ∗ = F (x∗ ) · .1. just that 1 not be an accumulation point. Now let δ be small enough so that ¯ (1 + M1 δ)(1 + M0 δ)η + M2 δ F (x∗ )−1 ≤ η and the proof is complete.1.ecsecurehost.4.html.1. Note that the distance δ from the initial iterate to the solution may have to be smaller for (6.Copyright ©1995 by the Society for Industrial and Applied Mathematics.5). and M2 = K F (x∗ ) and obtain.5 for all n) that try to minimize the number of inner iterations are completely justiﬁed by this result if the initial iterate is suﬃciently near the solution.3 asserts qlinear convergence in the weighted norm if the initial iterate is suﬃciently near x∗ . M1 = 2γ F (x∗ )−1 .1. The importance of this theorem for implementation is that choices of the sequence of forcing terms {ηn } (such as ηn = . .4 and compare it to a modiﬁcation of a choice from Buy this book from SIAM at http://www.1.1 F (x∗ )(ec − F (xc )−1 F (xc )) ≤ K F (x∗ ) ec 2 .com/SIAM/FR16.6) Since F (x∗ )F (xc )−1 r ≤ r + (F (x∗ ) − F (xc ))F (xc )−1 r ≤ (1 + 2γ F (x∗ )−1 we may set ec ) r . This is made precise in Theorem 6. 98 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS By Theorem 5. However (6.4) to hold than for (6.1.2 in that we do not demand that {ηn } be bounded away from 1 by a suﬃciently large amount. (6. F (x∗ )e+ ≤ F (x∗ )F (xc )−1 r + K F (x∗ ) ec 2 . Theorem 6.7) and (6.
then the inexact Newton ¯ iteration xn+1 = xn + sn .2). To prove the assertions on qsuperlinear convergence.2 and 6. e < δ∗ = F (x∗ ) δ0 ∗ Set δ = F (x∗ ) −1 δ∗ .4. Let δ0 be such that (6. Proof.com/SIAM/FR16. This proves the ﬁrst assertion.1. Theorem 6. Then e0 ∗ < δ∗ if e0 < δ. {ηn } ⊂ [0.5. .1.1.2 might lead one to believe that a small value of ηn is necessary for convergence.ecsecurehost. While Theorem 6. Rather we note that (6. Let the standard assumptions hold. The proof uses both (6. qlinear convergence with respect to the weighted norm · ∗ is equivalent to qlinear convergence of the sequence of nonlinear residuals { F (xn ) }. We state this result as Proposition 6.1. The most signiﬁcant diﬀerence is the norm in which qlinear convergence takes place. The implications for the choice of the forcing terms {ηn } are also diﬀerent. We close this section with some remarks on the two types of results in Theorems 6.8) Note that e < δ0 if x ∗ ≤ F (x∗ ) x ≤ κ(F (x∗ )) x ∗ .1 hold. The superlinear convergence results then follow exactly as in the proof of Theorem 6. For all x ∈ RN . (6.1. and • if ηn ≤ Kη F (xn ) p for some Kη > 0 the convergence is qsuperlinear with qorder 1 + p.4) and (6. INEXACT NEWTON METHODS 99 [69] that decreases ηn rapidly with a view toward minimizing the number of outer iterations. η] with η < η < 1. eventually en will be small enough so that the conclusions of Theorem 6. Theorem 6. Our ﬁrst task is to relate the norms · and · ∗ so that we can estimate δ.1 and leave the proof to the reader in Exercise 6.1.9) en+1 ∗ ≤ η en ¯ ∗ < δ∗ and hence xn → x∗ qlinearly with respect to the weighted norm.1. Then there is δ such that if x0 ∈ B(δ).html.4.Copyright ©1995 by the Society for Industrial and Applied Mathematics. where F (xn )sn + F (xn ) ≤ ηn F (xn ) converges qlinearly with respect to · ∗ to x∗ .1.4) implies that if en ∗ < δ∗ then (6.4) holds for ec < δ0 .2. note that since xn → x∗ . Our proof does not rely on the (possibly false) assertions that en < δ for all n or that xn → x∗ qlinearly with respect to the unweighted norm. Moreover • if ηn → 0 the convergence is qsuperlinear.4 shows that the sequence of forcing terms need only be kept bounded away from Buy this book from SIAM at http://www.2. This electronic version is for personal use and may not be duplicated or distributed.
s satisﬁes (6. NewtonGMRES. In this section. Let the standard assumptions hold.ecsecurehost.e.1. terminating when the relative linear residual is smaller than ηc (i. to the reader as exercises. we state two theorems on inexact local improvement.5. when (6.1 and its proof for an illustration of this point. [136] in the context of optimization.1) with an iterative method for the linear system for the Newton step. Then F (xn ) converges qlinearly to 0 if and only if en ∗ does. ¯ 6.3. Theorem 6. Buy this book from SIAM at http://www. In this section we use the l2 norm to measure nonlinear residuals. The name of the method indicates which linear iterative method is used. NewtonCG. for example. 100 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS 1. Errors in the function. 6. This electronic version is for personal use and may not be duplicated or distributed. Proposition 6. Typically the nonlinear iteration that generates the sequence {xn } is called the outer iteration and the linear iteration that generates the approximations to the steps is called the inner iteration. x+ = xc + s. . Then there is δ such that if xc ∈ B(δ).1.com/SIAM/FR16.5 may well suﬃce for linear convergence.11) e+ ≤ KI (( ec + ηc ) ec + (xc ) ).Copyright ©1995 by the Society for Industrial and Applied Mathematics. s satisﬁes (6.html. then ¯ there is KE such that (6. The reason for this is that the Krylov methods use the scalar product and the estimates in Chapters 2 and 3 are in terms of the Euclidean norm. can be regarded as part of the inexact solution of the equation for the Newton step.12) F (x∗ )e+ ≤ η F (x∗ )ec + KE (xc ) . Let the standard assumptions hold.1.2. We will discuss other factors in the choice of forcing terms in § 6. would use SOR. These methods have also been called truncated Newton methods [56]. Then there are δ and KI such that if xc ∈ B(δ). or GMRES to solve the linear system.1.4 we will scale the l2 norm by a factor of 1/N so that the results for the diﬀerential and integral equations will be independent of the computational mesh.6.1) holds). Newtoniterative methods A Newtoniterative method realizes (6. NewtonSOR.1 and 6.3.10). which are direct analogs to the proofs of Theorems 6. but at a price of a greater number of outer iterations. A constant sequence such as ηn = . In the examples discussed in § 6.1.1. and ηc ≤ η < η < 1.10) F (xc )s + F (xc ) + (xc ) ≤ ηc F (xc ) + (xc ) and x+ = xc + s. Let the standard assumptions hold and let xn → x∗ . This naming convention is taken from [175] and [145].2. Theorem 6. CG.1. such as those arising from a diﬀerence approximation of the action of F (xc ) on a vector. We leave the proofs.3. Note that errors in the derivative. then (6. See [55] or Proposition 6.
Algorithm 6. Buy this book from SIAM at http://www. k T i. Dh F (x : w) for some h. Newton GMRES. hjk = vk+1 vj ii.2. . (f) ρ = βe1 − Hk y k 3. To illustrate this point. fdgmres(s. h. as pointed out in [20]. While ρ > η F (x) (a) k = k + 1 (b) vk+1 = Dh F (x : vk ) for j = 1. . an evaluation of the action of F (xc )T on a vector is also required. . .html. β = ρ. a sequence of approximate steps {sk } is produced.2. We give a MATLAB implementation of this algorithm in the collection of MATLAB codes. application of GMRES to the linear equation for the Newton step with matrixvector products approximated by ﬁnite diﬀerences is the same as the application of GMRES to the matrix Gh F (x) whose last k − 1 columns are the vectors vk = Dh F (x : vk−1 ) The sequence {vk } is formed in the course of the forwarddiﬀerence GMRES iteration. x. 0)T (e) e1 = ∈ Rk+1 Minimize βe1 − Hk y k Rk+1 to obtain y k ∈ Rk . . In our MATLAB implementation we solve the least squares problem in step 2e by the Givens rotation approach used in Algorithm gmres.1. kmax. Rk+1 . We begin by discussing the eﬀects of a forward diﬀerence approximation to the action of the Jacobian on a vector. In the case of CGNR and CGNE. ρ = r 2 . We provide a detailed analysis of the NewtonGMRES iteration. that b = −F (x). . F. INEXACT NEWTON METHODS 101 6. vk+1 = vk+1 − hjk vj (c) hk+1. It is important to note that this is entirely diﬀerent from forming the ﬁnite diﬀerence Jacobian ∇h F (x) and applying that matrix to w. and that the initial iterate for the linear problem is the zero vector. (5. In many implementations [20]. 0. Note that matrixvector products are replaced by forward diﬀerence approximations to F (x). the action of F (xc ) on a vector w is approximated by a forward diﬀerence. η. r = −F (x). If the linear iterative method is any one of the Krylov subspace methods discussed in Chapters 2 and 3 then each inner iteration requires at least one evaluation of the action of F (xc ) on a vector. . s = 0. we give the forwarddiﬀerence GMRES algorithm for computation of the Newton step.Copyright ©1995 by the Society for Industrial and Applied Mathematics. ρ) 1.com/SIAM/FR16. In fact. v1 = r/ r 2 .15). k = 0 2.ecsecurehost. [24]. s = −F (x)−1 F (x). .k = vk+1 2 2 2 and k < kmax do (d) vk+1 = vk+1 / vk+1 (1. This electronic version is for personal use and may not be duplicated or distributed. s = Vk y k .1.
Since F is Lipschitz continuous and {uj } is an orthonormal basis for RN we have.13) F (x)s + F (x) 2 < (η + CG h) F (x) 2 . for the purposes of ⊥ analysis.1.com/SIAM/FR16.ecsecurehost. Assuming that there is no error in the evaluation of F we may summarize the observations above in the following proposition. h ≤ h. we have.1 hold. and δ such that if x ∈ B(δ). with γ = γ( x∗ 2 + δ). This electronic version is for personal use and may not be duplicated or distributed. where {vj }k is the orthonormal basis for Kk generated j=1 by Algorithm fdgmres. Proposition 6.Copyright ©1995 by the Society for Industrial and Applied Mathematics. Note that Dh F (x : uj ) = 1 0 F (x + th x 2 uj )uj dt 1 0 (F = F (x)uj + (x + th x 2 uj ) − F (x))uj dt. which is a special case of the more general results in [20]. Consider the linear map deﬁned by its action on {uj } Buj = Dh F (x : uj ).html. Hence Gh F is also ﬁrst order accurate in h. 102 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS As should be clear from inspection of Algorithm gmres. could be speciﬁed by the action of ∇h F on any basis for Kk .14) B − F (x) 2 ≤ h x 2 γ/2 ≤ h¯ /2 γ Since the linear iteration (fdgmres) terminates in k < kmax iterations. and Algorithm fdgmres terminates with k < kmax then the computed step s satisﬁes (6. γ ¯γ h¯ ≤ F (x∗ )−1 −1 2 /2 Buy this book from SIAM at http://www. Proof.2. the diﬀerence between ∇h F and Gh F is that the columns are computed by taking directional derivatives based on two diﬀerent orthonormal bases: ∇h F using the basis of unit vectors in coordinate directions and Gh F the basis for the Krylov space Kk ⊥ constructed by algorithm fdgmres. The action of Gh F on Kk . ¯ ¯ Then there are CG .3. ¯ (6. 1). Let the standard assumptions hold. Bs + F (x) and therefore (6. Let {uj } be any orthonormal basis for RN such that uj = vj for 1 ≤ j ≤ k. h. Let η ∈ (0. is not used by the algorithm and.15) Assume that F (x)s + F (x) 2 2 ≤ η F (x) 2 ≤ η F (x) 2 + h¯ s 2 /2. First let δ be small enough so that B(δ) ⊂ Ω and the conclusions to Lemma 4. . since B and Gh F agree on Kk . the orthogonal complement of Kk .
Theorem 6. Then there is δ such that if x0 ∈ B(δ) and the sequences {ηn } and {hn } satisfy σn = ηn + CG hn ∈ [0.1.1.ecsecurehost. hn .1 and (6.2.1 applies to a restarted GMRES because upon successful termination (6.1.4 and Theorem 6.2. • if σn → 0 the convergence is qsuperlinear.1 and are left as (easy) exercises. 6.2. σ such that if x0 ∈ B(δ) and the sequences {ηn } and {hn } ¯ satisfy σn = ηn + CG hn ≤ σ ¯ then the forward diﬀerence NewtonGMRES iteration xn+1 = xn + sn where sn is computed by Algorithm fdgmres with arguments (sn . and • if σn ≤ Kη F (xn ) p for some Kη > 0 the convergence is qsuperlinear 2 with qorder 1 + p.2.2. Theorem 6.15) completes the proof with CG = 4¯ (1 + ∗ )−1 . xn .1 hold. σ ] ¯ with 0 < σ < 1 then the forward diﬀerence NewtonGMRES iteration ¯ xn+1 = xn + sn Buy this book from SIAM at http://www. we must specify not only the sequence {ηn }. Moreover. ηn .html. The proofs are immediate consequences of Theorems 6.4.1 implies that a ﬁnite diﬀerence implementation will not aﬀect the performance of NewtonGMRES if the steps in the forward diﬀerence approximation of the derivatives are suﬃciently small.15) imply F (x∗ )−1 −1 2 s 2 /2 ≤ F (x)−1 ≤ F (x)s 2 −1 2 s 2 2 ≤ (1 + η) F (x) 2 ¯γ + h¯ s 2 /2. but also the steps {hn } used to compute forward diﬀerences.16) with (6. Therefore. ρ) converges qlinearly and sn satisﬁes (6. . γ Combining (6. (6. kmax.2. In an implementation.1.16) s 2 ≤ 4(1 + η) F (x∗ )−1 F (x) 2 .2. This electronic version is for personal use and may not be duplicated or distributed. INEXACT NEWTON METHODS 103 Then Lemma 4. Assume that the assumptions of Proposition 6.15) will hold. We summarize our results so far in the analogs of Theorem 6.1 hold. therefore. and Proposition 6.1.2.2.com/SIAM/FR16.3.Copyright ©1995 by the Society for Industrial and Applied Mathematics. Assume that the assumptions of Proposition 6.2.17) F (xn )sn + F (xn ) 2 < σn F (xn ) 2 . Note also that Proposition 6. F. η) F (x 2 Proposition 6. Then there are δ.
Moreover • if σn → 0 the convergence is qsuperlinear. but the choice of that constant depends on the problem. [111].com/SIAM/FR16. Computation of ∇h F (x) and its transpose analytically is one option as is diﬀerentiation of F automatically or by hand. This issue is independent of the particular linear solver and our discussion is in the general inexact Newton method setting. xn . 6.2. F. GMRES(m). 6. The reader may explore this further in Exercise 6. Setting η to a constant for the entire iteration is often a reasonable strategy as we see in § 6. ρ) converges qlinearly with respect to · ∗ and sn satisﬁes (6. In the case where hn is approximately the square root of machine roundoﬀ. [63]. NewtonCG is a possible approach. The choice in Buy this book from SIAM at http://www.4. we must provide a method for forming the sequence of forcing terms {ηn }. as it will be if CGNR or CGNE is used as the iterative method.17).2.html. and • if σn ≤ Kη F (xn ) p for some Kη > 0 the convergence is qsuperlinear 2 with qorder 1 + p. 104 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS where sn is computed by Algorithm fdgmres with arguments (sn .ecsecurehost. If a transpose is needed. We adjust η as the iteration progresses with a variation of a choice from [69]. This electronic version is for personal use and may not be duplicated or distributed. the function. Other Newtoniterative methods. Other iterative methods may be used as the solver for the linear equation for the Newton step. ηn . a forward diﬀerence formulation is not an option because approximation of the transposevector product by diﬀerences is not possible. If F is symmetric and positive deﬁnite. GMRES may not be practical. kmax.13.5.6. [78].3. as it is in the context of unconstrained minimization. . When storage is restricted or the problem is very large. may not converge rapidly enough to be useful. and [12]. If a constant η is too small. for a small m. BiCGSTAB and TFQMR should be considered in all cases where there is not enough storage for GMRES(m) to perform well. NewtonGMRES implementation We provide a MATLAB code nsolgm in the collection that implements a forwarddiﬀerence NewtonGMRES algorithm. hn . Newtonmultigrid [99] is one approach using a stationary iterative method. See also [6]. In addition. As input we must give the initial iterate.2 states that the observed behavior of the forward diﬀerence implementation will be the same as that of an implementation with exact derivatives. the vector of termination tolerances as we did for nsol. σn ≈ ηn if ηn is not too small and Theorem 6. CGNR and CGNE oﬀer alternatives. The reader may also be interested in experimenting with the other Krylov subspace methods described in § 3. much eﬀort can be wasted in the initial stages of the iteration.2.Copyright ©1995 by the Society for Industrial and Applied Mathematics. but require evaluation of transposevector products. This algorithm requires diﬀerent inputs than nsol.
ηn ).9 as recommended in [69]. γηn−1 > . γηn−1 ≤ . . max(ηn .9999 are used. the most information possible would be extracted from the inner iteration.18) the parameter ηmax is an upper limit on the sequence {ηn }.1. In all the examples in this book we use the value γ = . If ηn is uniformly bounded away from 1. This electronic version is for personal use and may not be duplicated or distributed.20) C ηn = min(ηmax . The constant . The idea is that if ηn−1 is suﬃciently large we do not let ηn decrease by much more than a factor of ηn−1 . A method of safeguarding was suggested in [69] to avoid volatile decreases in ηn .5. In order to specify the choice at n = 0 and bound the sequence away from 1 we set (6. We use dirder to approximate directional derivatives and use the default value of h = 10−7 . γηn−1 )). max(ηn .19) C ηn = A min(ηmax . In [69] the choices γ = . This safeguarding does improve the performance of the iteration. There is a chance that the ﬁnal iterate will reduce F far beyond the desired level and that the cost of the solution of the linear equation for the last step will be more accurate than is really needed. In this way. Buy this book from SIAM at http://www.Copyright ©1995 by the Society for Industrial and Applied Mathematics.1. A where γ ∈ (0.5τt / F (xn ) )).9 and ηmax = . In (6. 2 n > 0. B It may happen that ηn is small for one or more iterations while xn is still far from the solution.ecsecurehost. (6.1. n > 0. Over solving means that the linear equation for the Newton step is solved to a precision far beyond what is needed to correct the nonlinear iteration. A for n > 0 would guarantee qquadratic convergence by then setting ηn = ηn Theorem 6. This oversolving on the ﬁnal step can be controlled comparing the norm of the current nonlinear residual F (xn ) to the nonlinear residual norm at which the iteration would terminate τt = τa + τr F (x0 ) and bounding ηn from below by a constant multiple of τt / F (xn ) . n = 0. ηn ). n > 0. As a measure of the degree to which the nonlinear iteration approximates the solution we begin with A ηn = γ F (xn ) 2 / F (xn−1 ) 2 . Exercise 6. INEXACT NEWTON METHODS 105 [69] is motivated by a desire to avoid such over solving.html. n = 0. 1] is a parameter.18) B ηn = ηmax . 2 A 2 min(ηmax .com/SIAM/FR16. ηmax .1. The algorithm nsolgm does this and uses (6.9 asks the reader to modify nsolgm so that other choices of the sequence {ηn } can be used. min(η A max . .1 is somewhat arbitrary.
html. We illustrate the consequences of this in § 6. We revisit the Hequation and solve a preconditioned nonlinear convectiondiﬀusion equation. Examples for NewtonGMRES In this section we consider two examples. Hence if a large value of η can be used. s.ecsecurehost. GMRES will have to be restarted. One new function evaluation will be needed to approximate the action of F (xc ) on a vector Buy this book from SIAM at http://www. η) √ 1. nsolgm(x. we scale the l2 norm of the nonlinear √ residual by a factor of 1/ N so that constant functions will have norms that are independent of the computational mesh. x + hv. If a maximum of m GMRES iterations is needed (or GMRES(m) is used as a linear solver) the storage requirements of Algorithm nsolgm are the m + 5 vectors x. τ. rc = r0 = F (x) 2 / N √ 2.5. σ = r+ /rc . it is equivalent to precondition the nonlinear equation before calling nsolgm.4 and ask the reader to make the necessary modiﬁcations to nsolgm in Exercise 6. and the Krylov basis {vk }m .3. k=1 Since GMRES forms the inner iterates and makes termination decisions based on scalar products and l2 norms.9. . Counts of function evaluations corresponding to outer iterations are indicated by circles.com/SIAM/FR16. F.4. Note that if only a single inner iteration is needed. rc = r+ ≤ τr r0 + τa exit.Copyright ©1995 by the Society for Industrial and Applied Mathematics.4 are independent of the outer iteration. Do while F (x) 2 / N > τr r0 + τa (a) Select η. the cost of the entire iteration can be greatly reduced. This electronic version is for personal use and may not be duplicated or distributed. F (x + hv).1. Note that the cost of an outer iterate is one evaluation of F to compute the value at the current iterate and other evaluations of F to compute the forward diﬀerences for each inner iterate. 106 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS Algorithm 6. (b) fdgmres(s. However. x. In all the ﬁgures we plot the relative nonlinear residual F (xn ) 2 / F (x0 ) 2 against the number of function evaluations required by all inner and outer iterations to compute xn . Since the Newton steps for M F (x) = 0 are the same as those for F (x) = 0. η) (c) x = x + s (d) Evaluate F (x) (f) If F (x) 2 √ (e) r+ = F (x) 2 / N . From these plots one can compare not only the number of outer iterations. For large and poorly conditioned problems. the total cost of the outer iteration will be two function evaluations since F (xc ) will be known. F (x). F. The preconditioners we use in § 6. we also terminate the outer iteration on small l2 norms of the nonlinear residuals. because of our applications to diﬀerential and integral equations. but also the total cost. 6. This enables us to directly compare the costs of diﬀerent strategies for selection of η.
We set the parameters in (6.9 and ηmax = .25.html. We see that the constant η iteration terminates after 12 function evaluations. 6. . In Fig. This electronic version is for personal use and may not be duplicated or distributed.25.20) and used as the default in nsolgm.Copyright ©1995 by the Society for Industrial and Applied Mathematics. 3 outer iterations. We also count the evaluation of F (x0 ) as an additional cost in the evaluation of x1 . which terminated after 10 function evaluations. reported in § 5. and roughly 275 thousand ﬂoatingpoint operations.4. 10 10 10 10 10 10 10 10 10 0 1 2 Relative Nonlinear Residual 3 4 5 6 7 8 0 2 4 6 8 Function Evaluations 10 12 Fig.20) to γ = .1 performed slightly better and Buy this book from SIAM at http://www.com/SIAM/FR16. which therefore has a minimum cost of three function evaluations. We set the parameters in (6. The initial iterate is the function identically one. 6. We used τr = τa = 10−6 in this example.9999. Chandrasekhar Hequation. NewtonGMRES for the Hequation. c = .9 using two schemes for selection of the parameter η. This is a slightly higher overall cost than the other approach in which {ηn } is given by (6. Here the iteration with ηn = .ecsecurehost. 6. In Fig. 4 outer iterations.9. and 230 thousand ﬂoatingpoint operations.1.20) to γ = .1.20) with the dashed line. 6. We solve the Hequation on a 100point mesh with c = . INEXACT NEWTON METHODS 107 and then F (x+ ) will be evaluated to test for termination.1 with the solid line and using the sequence given by (6.1 we plot the progress of the iteration using η = .20). This choice of ηmax was the one that did best overall in our experiments on this problem. This cost is entirely a result of the computation and factorization of a single Jacobian. For each example we compare a constant value of ηn with the choice given in (6.9 and ηmax = .2 the results change for the more illconditioned problem with c = .6 incurred a cost of 3.3 million ﬂoatingpoint operations. slower by a factor of over 1000. The chord method with the diﬀerent (but very similar for this problem) l∞ termination condition for the same problem.
4. 6.html. 6.2.Copyright ©1995 by the Society for Industrial and Applied Mathematics. and roughly 513 thousand ﬂoatingpoint operations.ecsecurehost. 7 outer iterations. The iteration with the decreasing {ηn } given by (6.21) We consider the partial diﬀeren − ∇2 u + Cu(ux + uy ) = f Buy this book from SIAM at http://www.20) required 23 function evaluations. c = . NewtonGMRES for the PDE. tial equation (6.9999 10 0 Relative Nonlinear Residual 10 1 10 2 10 3 10 4 0 2 4 6 8 10 12 Function Evaluations 14 16 18 Fig. 6.3. This electronic version is for personal use and may not be duplicated or distributed. and 537 thousand ﬂoatingpoint operations. NewtonGMRES for the Hequation. Convectiondiﬀusion equation.2. C = 20.com/SIAM/FR16. . terminated after 22 function evaluations and 7 outer iterations. 108 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS 0 10 10 Relative Nonlinear Residual 1 10 2 10 3 10 4 10 5 10 6 10 7 0 5 10 15 Function Evaluations 20 25 Fig.
. f has been constructed so that the exact solution was the discretization of 10xy(1 − x)(1 − y) exp(x4. In Exercise 6. and 2. where h = 1/32.5 ). the choice of {ηn } given by (6.com/SIAM/FR16.21) with the fast Poisson solver fish2d. This electronic version is for personal use and may not be duplicated or distributed. 1). Buy this book from SIAM at http://www.20) reduced the number of inner iterations in the early stages of the iteration and avoided oversolving.20) required 16 function evaluations.Copyright ©1995 by the Society for Industrial and Applied Mathematics.6 million ﬂoatingpoint operations. the iteration for C = 20 required more than 80 function evaluations to converge. 4 outer iterations. As in § 3.5.9 the reader is asked to explore this. INEXACT NEWTON METHODS 109 with homogeneous Dirichlet boundary conditions on the unit square (0. The computation with constant η terminated after 19 function evaluations and 4 outer iterates at a cost of roughly 3 million ﬂoatingpoint operations. however.7 we discretized (6. In (6.21) on a 31 × 31 grid using centered diﬀerences. 6. Examples of such cases include (1) implicit time integration of nonlinear parabolic partial diﬀerential equations or ordinary diﬀerential equations in which the initial iterate is either the converged solution from the previous time step or the output of a predictor and (2) solution of problems in which the initial iterate is an interpolant of the solution from a coarser mesh.5. To be consistent with the secondorder accuracy of the diﬀerence scheme we used τr = τa = h2 .20) we set γ = . 1) × (0.html. The iteration with {ηn } given by (6. When preconditioning was not done. We compare the same possibilities for {ηn } as in the previous example and report the results in Fig. We set C = 20 and u0 = 0.9 and ηmax = .3. In this case. Preconditioning is crucial for good performance.ecsecurehost. In cases where the initial iterate is very good and only a single Newton iterate might be needed. a large choice of ηmax or constant η may be the more eﬃcient choice. In the computation we preconditioned (6.
Discuss this strategy and when it might be useful. 6. 10.11.5.Copyright ©1995 by the Society for Industrial and Applied Mathematics. Verify (6. 6. what are some possible implications? 6.4.2 and 6. ηn = .5.5.4. Exercises on inexact Newton methods 6. vary the parameter c and see how the performance of NewtonGMRES with the various choices of {ηn } is aﬀected. 40 and how preconditioning aﬀects the iteration.6. For the convectiondiﬀusion example in § 6.4.9.99999. Can anything like Proposition 6.5. How do the choices ηn = 1/(n + 1).2.2. Prove Theorems 6. For the Hequation experiment with various values of c.2.5. ηn = 10−4 [32]. ηn = 21−n [24]. . 6.ecsecurehost.5. . preconditioned NewtonGMRES is implemented in such a way that the data needed for the preconditioner is only computed when needed. Is there a relation to the Shamanskii method? 6.1. and ηn = .4 equally accurate? 6.5.3. 6. and x is suﬃciently near x∗ then F (x) /(1 + ) ≤ F (x∗ )e ≤ (1 + ) F (x) . . how is the performance aﬀected if GMRES(m) is used instead of GMRES? Buy this book from SIAM at http://www.5.5. such as c = .1 by showing that if the standard assumptions hold. 6.10.5. which may be less often that with each outer iteration. This electronic version is for personal use and may not be duplicated or distributed. 0 < < 1. In the DASPK code for integration of diﬀerential algebraic equations.5. For the Hequation example in § 6.75 aﬀect the results? Present the results of your experiments as graphical iteration histories. 110 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS 6.12.05.999999.1.5.14). (n + 2)−1 ) [56]. Prove Theorems 6.1. 6. Verify (6.1.html.5. and for the convectiondiﬀusion equation various values of C such as C = 1.8.2. Give an example in which F (xn ) → 0 qlinearly but xn does not converge to x∗ qlinearly. Are the converged solutions for the preconditioned and unpreconditioned convectiondiﬀusion example in § 6.7.com/SIAM/FR16.4.6. Experiment with the sequence of forcing terms {ηn }. ηn = min( F (xn ) 2 .1. Duplicate the results in § 6. Prove Proposition 6.1 be proved for a ﬁnitediﬀerence BiCGSTAB or TFQMR linear solver? If not. 6. 6. [23].5. What happens when c = 1? See [119] for an explanation of this.8).5.5 and 6.
This electronic version is for personal use and may not be duplicated or distributed. 6. which allows you to choose from a variety of forward diﬀerence Krylov methods. Buy this book from SIAM at http://www.com/SIAM/FR16. INEXACT NEWTON METHODS 111 6. Apply NewtonCGNR.html. or NewtonTFQMR to the two examples in § 6.4.ecsecurehost. might be of use in this exercise. . function evaluations) compare to NewtonGMRES? Experiment with the forcing terms. The MATLAB codes fdkrylov.25. which we describe more fully in Chapter 8.13. ﬂoatingpoint operations. NewtonBiCGSTAB. How will you deal with the need for a transpose in the case of CGNR? How does the performance (in terms of time. If you have done Exercise 5.5.5. apply nsolgm to the same twopoint boundary value problem and compare the performance.7.Copyright ©1995 by the Society for Industrial and Applied Mathematics. and nsola.14.
ecsecurehost.Copyright ©1995 by the Society for Industrial and Applied Mathematics.com/SIAM/FR16. .html. This electronic version is for personal use and may not be duplicated or distributed. 112 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS Buy this book from SIAM at http://www.
the cost is little more than that of the chord method.ecsecurehost. if the Jacobian is dense.Copyright ©1995 by the Society for Industrial and Applied Mathematics. as we shall see in § 7. is locally superlinearly convergent. If xc and Bc are the current approximate solution and Jacobian. Broyden’s method [26] computes B+ by (7. Bc is updated to form B+ . .2) y = F (x+ ) − F (xc ) and s = x+ − xc . The method we present in this chapter. Hence.3) B+ s = y. Chapter 7 Broyden’s method QuasiNewton methods maintain approximations of the solution x∗ and the Jacobian at the solution F (x∗ ) as the iteration progresses. then (7.2) B+ = Bc + (y − Bc s)sT F (x+ )sT = Bc + .3. The advantages of such methods. After the computation of x+ . Broyden’s method is an example of a secant update. This means that the updated approximation to F (x∗ ) satisﬁes the secant equation (7. In one dimension.3) uniquely speciﬁes the classical secant method. and hence is a very powerful alternative to Newton’s method or the chord method. For equations in several variables (7.1) −1 x+ = xc − Bc F (xc ). these methods could have an advantage in terms of function evaluation cost over Newtoniterative methods. In fact.3) alone is a system of N equations in N 2 113 Buy this book from SIAM at http://www. sT s sT s In (7.html. is that solution of equations with the quasiNewton approximation to the Jacobian is often much cheaper than using F (xn ) as the coeﬃcient matrix. (7. there is no cost in function evaluations associated with an inner iteration. The construction of B+ determines the quasiNewton method.com/SIAM/FR16. Broyden’s method. This electronic version is for personal use and may not be duplicated or distributed. In comparison with Newtoniterative methods. if a good preconditioner (initial approximation to F (x∗ )) can be found. quasiNewton methods require only one function evaluation for each nonlinear iterate.
See [62] and [64] for discussion of this topic. The analysis is somewhat clearer in the linear case and the plan of this chapter is to present results for the linear case for both the theory and the examples.ecsecurehost. [113].4) E = B − F (x∗ ) and the step s = x+ −xc . We must relate the superlinear convergence condition that ηn → 0 in Theorem 6.html.2 to a more subtle. [84]. In this book we focus on Broyden’s method and our methods of analysis are not as general as those in [62] and [64].com/SIAM/FR16. we will write sn = xn+1 − xn . which we will not discuss further. Some examples can be found in [101]. is that for linear problems Broyden’s method converges in 2N iterations [29]. In this chapter we will assume that the function values are accurate and refer the reader to [66].Copyright ©1995 by the Society for Industrial and Applied Mathematics. and we refer the reader to [28]. One can also apply the method to linear least squares problems [84].6) n→∞ lim En sn = 0. The role of the secant equation is very deep. but easier to verify condition in order to analyze Broyden’s method. and [116]. and [64] for discussion and analysis of several such methods. [63]. the Broyden iteration will converge qsuperlinearly to the root. When indexing of the steps is necessary. (7. Ax = b. [130]. [95]. In this chapter we show that if the data x0 and B0 are suﬃciently good. [143]. [143]. [104]. We will express our results in terms of the error in the Jacobian (7. 7.4) is consistent with our use of e = x − x∗ . [84]. sn Buy this book from SIAM at http://www. Veriﬁcation of qsuperlinear convergence cannot be done by the direct approaches used in Chapters 5 and 6. Locally superlinearly convergent secant methods can be designed to maintain the sparsity pattern. One interesting result. [60] on the sequence e of steps {sn } and errors in the Jacobian {En } (7.1. . [62]. [82]. of the approximate Jacobian. This electronic version is for personal use and may not be duplicated or distributed. and [114] for discussions of quasiNewton methods with inaccurate data. [82]. More subtle structural properties can also be preserved by secant methods. This technique is based on veriﬁcation of the Dennis–Mor´ condition [61]. Broyden’s method is also applicable to linear equations [29]. While we could use E = B −F (x). symmetry. [65]. where B ≈ A is updated. or positive deﬁniteness. [96]. [104]. 114 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS unknowns.5) −1 xn+1 = xn − Bn F (xn ) where Bn = F (x∗ )+En ≈ F (x∗ ) is generated by some method (not necessarily Broyden’s method).1. The Dennis–Mor´ condition e In this section we consider general iterative methods of the form (7.
sn Buy this book from SIAM at http://www.6) holds. Theorem 7. Since −F (xn ) = Bn sn = F (x∗ )sn + En sn we have (7. and [64] for several more examples. if xn → x∗ qsuperlinearly.1.7) En sn = −F (x∗ )sn − F (xn ) = −F (x∗ )en+1 + F (x∗ )en − F (xn ). then for n suﬃciently large (7.com/SIAM/FR16. and hence (7.1. BROYDEN’S METHOD 115 The main result in this section.7) (7.6). 1 0 (F (x∗ ) − F (x∗ + ten ))en dt F (x∗ )en − F (xn ) ≤ γ en 2 /2. e Conversely. are used to prove superlinear convergence for Broyden’s method in this chapter and many other quasiNewton methods. Assume that xn = x∗ for any n. let {Bn } be a sequence of nonsingular N × N matrices. We use the fundamental theorem of calculus and the standard assumptions to obtain F (x∗ )en − F (xn ) = and hence Therefore.8) implies that En sn ≤ (2 F (x∗ ) νn + γ en ) sn which implies the DennisMor´ condition (7. By (7. e Proof. Theorem 7.Copyright ©1995 by the Society for Industrial and Applied Mathematics. See [61]. The assumption that xn = x∗ for any n implies that the sequence (7.9) sn /2 ≤ en ≤ 2 sn . Now.1. . assume that xn → x∗ and that (7. This electronic version is for personal use and may not be duplicated or distributed.8) En sn ≤ F (x∗ )en+1 + γ en 2 /2.6) holds. Then xn → x∗ qsuperlinearly if and only if xn → x∗ and the Dennis–Mor´ condition (7.5).9) we have en+1 = νn en ≤ 2νn sn . Let µn = En sn . let x0 ∈ RN be given and let {xn }∞ be given n=1 by (7.10) νn = en+1 / en is deﬁned and and νn → 0 by superlinear convergence of {xn }. [62]. [28].html.1 and its corollary for linear problems. Our formulation of the Dennis–Mor´ result is e a bit less general than that in [60] or [63]. by (7.ecsecurehost. Let the standard assumptions hold.
ecsecurehost. Since ηn → 0. which is the inexact Newton condition.12) sn En sn = (Bn − F (x∗ ))sn = −F (xn ) − F (x∗ )sn .2. We will use the l2 norm throughout our discussion.11) (7. n=0 Let { n }∞ ⊂ RN be such that n=0 n n 2 Since xn → x∗ by assumption.18) T ψn+1 = ψn − θn (ηn ψn )ηn + n is either 1 or 0 for lim η T ψn n→∞ n = 0. Hence. ηn → 0 where < ∞. and [115].15) implies that (7. F (x∗ )−1 −1 ≤ F (x∗ )sn ≤ En sn + F (xn ) = µn sn + F (xn ) . Convergence analysis Our convergence analysis is based on an approach to inﬁnitedimensional problems from [97]. 2 − θ).2. We begin with a lemma from [104]. (7. and we employ notation from all four of these references. Combining (7. for n suﬃciently large.2.1. This electronic version is for personal use and may not be duplicated or distributed. sn ≤ 2 F (x∗ )−1 F (xn ) .11) again. Let 0 < θ < 1 and let ˆ ˆ {θn }∞ ⊂ (θ. Let ψ0 ∈ RN be given. . since (7. 7. F (xn ) + F (xn )sn (7.1. 2 and let {ηn }∞ be a set of vectors in RN such that ηn n=0 all n. [104].html. 116 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS We have.com/SIAM/FR16.13) µn ≤ F (x∗ )−1 −1 /2. using the standard assumptions and (7. (7. Buy this book from SIAM at http://www. We assume now that n is large enough so that (7. Since µn → 0 by assumption. ˆ Lemma 7. If {ψn }∞ is given by n=1 (7.15) ≤ F (xn ) + F (x∗ )sn + (F (x∗ ) − F (xn ))sn ≤ (µn + γ en ) sn .Copyright ©1995 by the Society for Industrial and Applied Mathematics. ηn = 2 F (x∗ )−1 (µn + γ en ). That paper was the basis for [168].13) holds.14) Moreover.16) F (xn ) + F (xn )sn ≤ ηn F (xn ) . the proof is complete by Theorem 6.14) and (7.17) then (7.
≤ ψn ≤ ψn 2 2 2 2 T − θn (2 − θn )(ηn ψn )2 ˆ T − θ2 (ηn ψn )2 ≤ ψn 2 .19) ∞ n=0 T (ηn ψn )2 < ∞. which is (7. we can use the fact that ˆ θn (2 − θn ) > θ2 > 0 to show that the sequence {ψn } is bounded in l2 norm by ψ0 ψn+1 2 2 2 and. . 2 ψn 2 ( ψn θn (2 − θn ) 2 − ψn+1 2 + n 2 ). which holds even if ψn = 0. 2a which is valid for a > 0 and b ≤ a. From (7. To prove the result for n = 0 we use the inequality (7.21) Hence (7. Buy this book from SIAM at http://www.18). This electronic version is for personal use and may not be duplicated or distributed.22) T (ηn ψn )2 ≤ − ψn+1 2 ≤ ψn 2 − T θn (2 − θn )(ηn ψn )2 + 2 ψn 2 n 2. In that case. M n=0 T (ηn ψn )2 ≤ ψ0 2 2 − ψM +1 ˆ θ2 2 2 ≤ ψ0 2 . 2 Therefore. 2 ψn 2 ≤ ψn Hence if ψn = 0 (7. ˆ θ2 We let M → ∞ to obtain (7.20) a2 − b2 ≤ a − b2 . From (7. for any M > 0.Copyright ©1995 by the Society for Industrial and Applied Mathematics. We ﬁrst consider the case in which n = 0 for all n.html. This inequality is used often in the analysis of quasiNewton methods [63].19) implies convergence to zero of the terms in the series.20) we conclude that if ψn = 0 then T ψn − θn (ηn ψn )ηn 2 ≤ ψn 2 2 2 T − θn (2 − θn )(ηn ψn )2 T θn (2 − θn )(ηn ψn )2 .ecsecurehost.com/SIAM/FR16. in fact. BROYDEN’S METHOD 117 Proof. Convergence of the series in (7.21) we conclude that ψn+1 2 ≤ µ.
118 where ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS ∞ i 2 i=0 µ= Hence M T 2 n=0 (ηn ψn ) + ψ0 2 .2. ˆ θ2 Hence (7. We may use Lemma 7. We will consider a modiﬁed update (7.23). This property. the Broyden update has a very simple form.1 to prove a result from [130] on the convergence of Broyden’s method for linear problems. . 2].Copyright ©1995 by the Society for Industrial and Applied Mathematics.23) B+ = Bc + θc (y − Bc s)sT (Ax+ − b)sT = Bc + θc .24) B+ = Bc + θc (y − Bc s)sT (Ec s)sT = Bc − θc . s 2 s 2 2 2 Lemma 7. In the linear case.ecsecurehost. called bounded deterioration [28].1 is nicely illustrated by this application of the lemma. sT s sT s If xc and Bc are the current approximations to x∗ = A−1 b and A and s = x+ − xc then y − Bc s = (Ax+ − Axc ) − Bc (x+ − xc ) = −Ec s and therefore (7.19) holds and the proof is complete. Our ﬁrst task is to show that the errors in the Broyden approximations to A do not grow. ≤ 2µ ˆ θ2 2µ ˆ θ2 M n=0 ( ψn 2 − ψn+1 2 + M n 2) = ψ0 2 − ψM +1 2+ n 2 n=0 ≤ 2µ2 . The role of ˆ the parameter θ and the sequence {θn } in Lemma 7. and xc ∈ RN be given.2.com/SIAM/FR16. Let A be nonsingular. is an important part of the convergence analysis. Then (7. Linear problems. 7.2. The standard assumptions in this case reduce to nonsingularity of A.2. Buy this book from SIAM at http://www.1. θc ∈ [0. This electronic version is for personal use and may not be duplicated or distributed. In this section F (x) = Ax−b and Bn = A+En .html.2.25) E+ 2 ≤ Ec 2 . Let Bc be nonsingular and let B+ be formed by the Broyden update (7. The reader should compare the statement of the result with that in the nonlinear case.
Assume that {θn } satisﬁes the ˆ hypotheses of Lemma 7. otherwise.28) converges qsuperlinearly to x∗ = A−1 b for every initial iterate x0 .27) This completes the proof as E+ 2 Ps = ssT . Let A be nonsingular.2.2. In [130] one suggestion was γc  ≥ σ. If {θn } is such the matrices Bn obtained by (7.23) are nonsingular then the modiﬁed Broyden iteration (7. where Ps is the orthogonal projector (7. The superlinear convergence result is remarkable in that the initial iterate need not be near the root. e We leave the proof as Exercise 7.2. By subtracting A from both sides of (7. Let A be nonsingular. .23) are nonsingular and {xn } is given by the modiﬁed Broyden iteration (7. 1). Speciﬁcally Proposition 7. 1 − γc where γc = −1 −1 (Bc y)T s (Bc As)T s = 2 s 2 s 2 2 and σ ∈ (0. Theorem 7.Copyright ©1995 by the Society for Industrial and Applied Mathematics. Buy this book from SIAM at http://www. If {θn } is such that the matrices Bn obtained by (7. 1).1 for some θ ∈ (0.24) we have E+ = E c − θ c (Ec s)sT s 2 2 (7.1.1 for some θ ∈ (0.html. For linear problems one can show that the Dennis–Mor´ condition implies e convergence and hence the assumption that convergence takes place is not needed. However the results in [130] assume only that the ˆ sequence {θn } satisﬁes the hypotheses of Lemma 7.2. 1) is ﬁxed. 1) and that θc is always chosen so that B+ is nonsingular.com/SIAM/FR16.5. Now.26) = Ec (I − θc Ps ).1.2. This electronic version is for personal use and may not be duplicated or distributed. θc can always be selected to make B+ nonsingular.28) −1 xn+1 = xn − Bn (Axn − b) then the Dennis–Mor´ condition implies convergence of {xn } to x∗ = A−1 b. 1 − sign(γc )σ θc = .1.ecsecurehost.1 for some θ ∈ (0. s 2 2 ≤ Ec 2 I − θ c Ps 2 and I − θc Ps 2 ≤ 1 by orthogonality of Ps and the fact that 0 ≤ θc ≤ 2. BROYDEN’S METHOD 119 Proof. 1. Assume that {θn } satisﬁes ˆ the hypotheses of Lemma 7.
we provide a proof based on ideas from [97].29) holds. Let φ ∈ RN be given. T ηn = sn / sn 2 . Our approach is to show that for every φ ∈ RN that (7. and we may invoke Lemma 7.29) n→∞ lim φT En sn sn 2 = 0.2. Buy this book from SIAM at http://www.2).Copyright ©1995 by the Society for Industrial and Applied Mathematics. Nonlinear problems.1) and (7. sn 2 sn 2 This completes the proof. [82]. Definition 7. setting φ = ej . It is global in the sense that qsuperlinear convergence takes place independently of the initial choices of x and B. . [63].1. For nonlinear problems our analysis is local and we set θn = 1. B0 ) exists if F is deﬁned at xn for all n.30) T ηn ψ n = T En sn (En φ)T sn = φT → 0.1. x0 . and xn+1 and Bn+1 can be computed from xn and Bn using (7. It is known [29]. Assuming that (7. Let Pn = Since for all v ∈ RN and all n T v T (En φ) = φT (En v) sn sT n . we have n→∞ lim En sn =0 sn 2 which is the Dennis–Mor´ condition. As indicated in the preface. the unit vector in the jth En sn coordinate direction.com/SIAM/FR16. We say that the Broyden sequence ({xn }. [143].1. The result here is not a local convergence result. This electronic version is for personal use and may not be duplicated or distributed.2. [84]. We make a deﬁnition that allows us to compactly express this fact.1 with n =0 to conclude that (7. This will imply qsuperlinear convergence e by Theorem 7. sn 2 2 and. However we must show that the sequence {Bn } of Broyden updates exists and remains nonsingular. 7.html. ψn = En φ. that the Broyden iteration will terminate in 2N steps in the linear case. Our analysis of the nonlinear case is diﬀerent from the classical work [28].2. sn 2 As j was arbitrary.2. [115].ecsecurehost. by (7. {Bn }) for the data (F.26) T T En+1 φ = (I − θn Pn )En φ. shows that the j component of has limit zero. 120 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS Proof.2.1 and Proposition 7. that does not use the Frobenius norm and extends to various inﬁnitedimensional settings. Bn is nonsingular for all n. and [104].
which. Let xc ∈ Ω and a nonsingular matrix Bc be given.32) as F (x+ ) = −Ec s + and then using (7.27) and ∆c = 1 0 1 0 1 0 (F (xc + ts) − F (x∗ ))s dt.Copyright ©1995 by the Society for Industrial and Applied Mathematics. the errors in the Jacobian approximation can grow as the iteration progresses.2). Let the standard assumptions hold.2. B0 ) and that the Broyden iterates converge qsuperlinearly to x∗ .html.2) to obtain E+ = Ec (I − Ps ) + where Ps is is given by (7. (F (xc + ts) − F (x∗ ))s dt (∆c s)sT s 2 2 (F (xc + ts) − F (x∗ )) dt.32) = −F (x+ ) + We begin by writing (7. However.2. x0 .6).com/SIAM/FR16. Assume that −1 x+ = xc − Bc F (xc ) = xc + s ∈ Ω and B+ is given by (7. the sequence { En } need not be monotone nonincreasing.2.7) implies that Ec s = −F (x+ ) + (F (x+ ) − F (xc ) − F (x∗ )s) (7. . Bounded deterioration. BROYDEN’S METHOD 121 We will show that if the standard assumptions hold and x0 and B0 are suﬃciently good approximations to x∗ and F (x∗ ) then the Broyden sequence exists for the data (F. We then show that if the initial data is suﬃciently good. Theorem 7. The development of the proof is complicated. Buy this book from SIAM at http://www. Unlike the linear case.31) E+ 2 ≤ Ec + γ( ec 2 + e+ 2 )/2 Proof.2. We ﬁrst prove a bounded deterioration result like Lemma 7. together with the qlinear convergence e proved earlier completes the proof of local qsuperlinear convergence. Then (7. Finally we verify the Dennis–Mor´ condition (7. However in the nonlinear case. this growth can be limited and therefore the Broyden sequence exists and the Broyden iterates converges to x∗ at least qlinearly. the possible increase can be bounded in terms of the errors in the current and new iterates. Note that (7. This electronic version is for personal use and may not be duplicated or distributed.ecsecurehost. Our ﬁrst task is to show that bounded deterioration holds.
3. We must explore the relation between the size of E0 needed to enforce qlinear convergence and the qfactor. 2(1 − r) Buy this book from SIAM at http://www. .2. x0 . By Theorem 7. Local qlinear convergence. 2 2 2 We show that En 2 ≤ δ1 by induction. This electronic version is for personal use and may not be duplicated or distributed. 1) be given.2.2 for the linear case E+ and the proof is complete.3 hold. B0 ) exists and xn → x∗ qlinearly with qfactor at most r. (1 + r)γδ ≤ δ1 .34) δB = δ1 − δ 2 .html. n j=0 < δ1 . Now assume that Ek 2 < δ1 for all 0 ≤ k ≤ n. reducing δ and δ1 further if needed. where KA is from the statement of Theorem 5. Proof. Theorem 7. in the same way to obtain En+1 2 ≤ E0 2+ rj ≤ δB + which completes the proof. Let the standard assumptions hold.2. The proof is more complex than similar results in Chapters 5 and 6. Let δ and δ1 be small enough so that the conclusions of Theorem 5.ecsecurehost.3 states that Broyden’s method is locally qlinearly convergent.4.com/SIAM/FR16. + γ(1 + r)rn δ/2. Since E0 2 < δB < δ1 we may begin the induction.3. En+1 Since En 2 2 2 ≤ En ≤ r en 2 + γ( en+1 2 + en 2 )/2. the qfactor is at most KA (δ + δ1 ) ≤ r. by the standard assumptions ∆c 2 ≤ γ( ec ≤ Ec 2 + e+ 2 )/2. En−1 2 . + ∆c Just as in the proof of Lemma 7.4. . 122 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS Hence.33) δ2 = 2(1 − r) and set (7. Then there are δ and δB such that if x0 ∈ B(δ) and E0 2 < δB the Broyden sequence for the data (F.2. . We will prove the result by choosing δB and reducing δ if necessary so that E0 2 ≤ δB will imply that En 2 ≤ δ1 for all n. Then. Let r ∈ (0. en+1 2 2 and therefore 2 En+1 ≤ En 2 + γ(1 + r) en 2 /2 ≤ En (1 + r)γδ 2 We can estimate En 2 .Copyright ©1995 by the Society for Industrial and Applied Mathematics.2. To do this reduce δ if needed so that γ(1 + r)δ < δ1 (7. . Theorem 7. which will prove the result.
as in the proof of Theorem 7. n The hypothesis in the lemma that n n 2 <∞ holds since Theorem 7. Let φ ∈ RN be arbitrary.1 we need only show that Dennis–Mor´ condition (7. Note that T T En+1 φ = (I − Pn )En φ + Pn ∆T φ. Implementation of Broyden’s method The implementation of Broyden’s method that we advocate here is related to one that has been used in computational ﬂuid mechanics [73].2. 7.2.2. BROYDEN’S METHOD 123 Veriﬁcation of the Dennis–Mor´ condition. The ﬁnal task is to verify e the Dennis–Mor´ condition. Then there are δ and δB such that if x0 ∈ B(δ) and E0 2 < δB the Broyden sequence for the data (F. Another approach. Hence Lemma 7.html. n We wish to apply Lemma 7. x0 . after the storage available for the iteration is exhausted.2. [142].1 implies that T ηn ψn = T En sn (En φ)T sn = φT → 0. B0 ) exists and xn → x∗ qsuperlinearly. .3 hold. Proof.3 and the standard assumptions imply ∆n 2 ≤ γ(1 + r)rn δ/2.ecsecurehost. clears the storage and starts over.3. ηn = sn / sn 2 . Let δ and δB be such that the conclusions of Theorem 7. By Theorem 7. e Theorem 7.1 let Pn = Set ∆n = 1 0 sn sT n . and n = Pn ∆T φ. [31]. implies the Dennis–Mor´ condition and e completes the proof.4.com/SIAM/FR16. As in the proof of Theorem 7. In these methods. Let the standard assumptions hold. This electronic version is for personal use and may not be duplicated or distributed. sn 2 2 (F (xn + tsn ) − F (x∗ )) dt. Buy this book from SIAM at http://www.Copyright ©1995 by the Society for Industrial and Applied Mathematics.2. restarting.1. sn 2 sn 2 This.2.1. Some such methods are called limited memory formulations in the optimization literature [138]. the oldest of the stored vectors is replaced by the most recent.6) holds e to complete the proof.1 with T ψn = En φ.2.
x0 . Now sn = xn+1 − xn = yn+1 − yn by (7. B0 ) exists.4. Equation (7.1. x0 . Then the Broyden sequence ({yn }.html.1. Proof.35) holds by deﬁnition when n = 0. The consequence of this proof for implementation is that we can assume that B0 = I. We begin by showing how the initial approximation to F (x∗ ). (B + uv T )−1 = I − 1 + v T B −1 u The expression (7. (B −1 u)v T (7. {Bn }) for the data (F. [177]. [176]. We leave the proof to Exercise 7.36) −1 = xn − (B0 Cn )−1 F (xn ) = xn − Bn F (xn ) = xn+1 . {Cn }) exists for the data −1 (B0 F.2. This electronic version is for personal use and may not be duplicated or distributed. The basis for this is the following simple formula [68]. B0 . Assume that the Broyden sequence ({xn }. Lemma 7. We use a restarted approach. I) and xn = yn . Let B be a nonsingular N × N matrix and let u. though a limited memory approach could be used as well. Proposition 7.37) B −1 .35) for all n. Our plan will be to store Bn in a compact and easy to apply form.37) is called the Sherman–Morrison formula. Hence Bn+1 = Bn + F (xn+1 )sT n sn 2 2 −1 B0 F (yn+1 )sT n = B0 Cn+1 . may be regarded as a preconditioner and incorporated into the deﬁntion of F just as preconditioners were in the implementation of NewtonGMRES in § 6.Copyright ©1995 by the Society for Industrial and Applied Mathematics.3. . Bn = B0 Cn (7. In this case. T Bn+1 = Bn + un vn . 124 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS This latter approach is similar to the restarted GMRES iteration.36). sn 2 2 = B0 Cn + B0 This completes the proof. v ∈ RN . [11].3. If (7. We proceed by induction.com/SIAM/FR16. If a better choice for B0 exists. In the context of a sequence of Broyden updates {Bn } we have for n ≥ 0.ecsecurehost. Our basic implementation is a nonlinear version of the implementation for linear problems in [67]. Then B + uv T is invertible if and only if 1 + v T B −1 u = 0.5. then −1 −1 yn+1 = yn − Cn B0 F (yn ) (7. we can incorporate that into the −1 function evaluation. Buy this book from SIAM at http://www.35) holds for a given n.
42) −1 Bn = n−1 j=0 I+ sj+1 sT j sj 2 2 . if B0 = I. (I − w0 v0 ) 2 and vn = sn / sn 2 .40) implies that for n ≥ 1 T −1 sn = −w + Cw w(vn−1 w) = w(−1 + Cw (Cw − sn−1 2 )) = −Cw w sn−1 Hence. (7. vj }n−1 at a cost of O(N n) j=0 ﬂoatingpoint operations. Therefore.38) = n−1 j=0 T (I − wj vj ). .com/SIAM/FR16. −1 T −1 wn = (Bn un )/(1 + vn Bn un ) (7.e. wn = −sn+1 / sn 2 .html. This electronic version is for personal use and may not be duplicated or distributed. the computation of the Broyden step) can be computed from the 2n vectors {wj . we can write (7.38) is valid for n ≥ 0. the Broyden step for the following iteration is n−1 −1 T (7. (7.39) (I − wj vj )F (xn ). Buy this book from SIAM at http://www.41) 2 = − sn−1 2 wn−1 . sn = −Bn F (xn ) = − j=0 Since the product n−2 j=0 T (I − wj vj )F (xn ) must also be computed as part of the computation of wn−1 we can combine the computation of wn−1 and sn as follows: w (7. .38) as (7.Copyright ©1995 by the Society for Industrial and Applied Mathematics.. −1 = (I − w T T T Bn n−1 vn−1 )(I − wn−2 vn−2 ) . Since the empty matrix product is the identity. −1 Hence the action of Bn on F (xn ) (i. Note that (7. One may carry this one important step further [67] and eliminate the need to store the sequence {wn }. In fact. Moreover. one need only store the steps {sn } and their norms to construct the sequence {wn }.40) = n−2 j=0 T (I − wj vj )F (xn ) 2 T + vn−1 w)−1 wn−1 = Cw w where Cw = ( sn−1 sn T = −(I − wn−1 vn−1 )w. for n ≥ 0. BROYDEN’S METHOD 125 where un = F (xn+1 )/ sn Setting we see that.ecsecurehost. .
Instead. r0 = F (x) 2 .43) =− I+ sn+1 sT n sn 2 2 sn+1 sT n sn 2 2 n−1 j=0 I+ sj+1 sT j 2 sj 2 F (xn+1 ) −1 Bn F (xn+1 ). is of the same order as GMRES. This electronic version is for personal use and may not be duplicated or distributed. Our looping convention is that the loop in step 2(e)ii is skipped if n = 0. itc = itc + 1 (b) x = x + sn (c) Evaluate F (x) (d) If F (x) 2 ≤ τr r0 + τa exit. Our notation and algorithmic framework are based on [67].3.44) to describe an algorithm. The storage requirement.html. We use a restarting approach in the algorithm. we solve (7.Copyright ©1995 by the Society for Industrial and Applied Mathematics.39) directly to compute sn+1 because sn+1 appears on both sides of the equation sn+1 = − I + (7.6. making Broyden’s method competitive in storage with GMRES as a linear solver [67]. Do while itc < maxit (a) n = n + 1. τ. n − 1 z = z + sj+1 sT z/ sj j 2 2 Buy this book from SIAM at http://www.ecsecurehost. adding to the input list a limit nmax on the number of Broyden iterations taken before a restart and a limit maxit on the number of nonlinear iterations. maxit. The termination criteria are the same as for Algorithm nsol in § 5.42) and (7. Note that B0 = I is implicit in the algorithm.44) −1 1 + sT Bn F (xn+1 )/ sn n 2 2 T −1 = 1 + vn Bn un is nonzero unless Bn+1 is singular.3.42) and (7.43) for sn+1 to obtain (7. this is consistent with setting the empty matrix product to the identity. F.1. for j = 0. 2. of n + O(1) vectors for the nth iterate. nmax) 1. By Proposition 7.44) sn+1 = − −1 Bn F (xn+1 ) −1 1 + sT Bn F (xn+1 )/ sn n 2 2 . n = −1. 126 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS We cannot use (7. brsol(x.1. Algorithm 7. We can use (7. z = −F (x) ii. s0 = −F (x).com/SIAM/FR16. Our implementation is similar to GMRES(m). (e) if n < nmax then i. . the denominator in (7. restarting with B0 when storage is exhausted. itc = 0.
As an example we consider the linear PDE (3. 7. sn+1 = z/(1 − sT z/ sn 2 ) n 2 (f) if n = nmax then n = −1. where z and F (x) may occupy the same j=0 storage location. which computes xn and sn+1 .5 million ﬂoating point operations.33) preconditioned with the Poisson solver fish2d.4. We set τa = τr = h2 where h = 1/32. the steps {sj }n . x. z. and therefore need storage for nmax + 2 vectors.5 million ﬂoating point operations. If n < nmax. The Sherman–Morrison approach in Algorithm brsol is more eﬃcient. and F (x).1 we present results for (3. √ In all the examples in this section we use the l2 norm multiplied by 1/ N . In view of all this. 7. The theory in this chapter indicates that Broyden’s method can be viewed as an alternative to GMRES as an iterative method for nonsymmetric linear equations.1. We compare Broyden’s method without restarts (solid line) to Broyden’s method with restarts every three iterates (dashed line). Linear problems.com/SIAM/FR16. then the nth iteration of brsol. This requirement of one more vector than the algorithm in [67] needed is a result of the nonlinearity.ecsecurehost. . the iteration will restart with xnmax . and in several papers on inﬁnitedimensional problems as well [91]. One can also see highly irregular performance from the 7. where more elaborate numerical experiments that those given here are presented.33) from § 3. BROYDEN’S METHOD 127 iii.7. In Fig. We use the same mesh. A MATLAB implementation of brsol is provided in the collection of MATLAB codes. than the dense matrix approach proposed in [85]. not compute snmax+1 . coeﬃcients. especially for large problems. The restarted iteration required 24 iterations and 3. in terms of both time and storage. If n = nmax. [104]. This point has been explored in some detail in [67]. However the dense matrix approach makes it possible to detect ill conditioning in the approximate Jacobians.Copyright ©1995 by the Society for Industrial and Applied Mathematics. For linear problems.7.html. we feel that the approach of Algorithm brsol is the best approach. a slightly higher cost than the GMRES solution. [120]. If the data is suﬃciently good. s = −F (x). This electronic version is for personal use and may not be duplicated or distributed. righthand side. we allow for increases in the nonlinear residual. Our implementation of brsol in the collection of MATLAB codes stores z and F separately in the interest of clarity and therefore requires nmax + 3 vectors of storage. Examples for Broyden’s method Buy this book from SIAM at http://www. and solution as the example in § 3. bounded deterioration implies that the Broyden matrices will be well conditioned and superlinear convergence implies that only a few iterates will be needed.4. requires O(nN ) ﬂoatingpoint operations and storage of n + 3 vectors. Broyden’s method terminated after 9 iterations at a cost of roughly 1. We use the MATLAB code brsol.
9 both methods required six iterations for convergence.4 illustrate the diﬀerences in the two methods. Our general recommendation is to try both. as well as those in § 8. and the sequence of steps saves a vector over the nonlinear case.9 (Fig.1. We consider the Hequation from Chapters 5 and 6 and a nonlinear elliptic partial diﬀerential equation. As before we solve the Hequation on a 100point mesh with c = . If storage is at a premium. Hence. 128 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS restarted method. . In the linear case [67] one can store only the residuals and recover the terminal iterate upon exit. the vector z. a substantial improvement over the NewtonGMRES implementation. if function evaluations are very expensive. Chandrasekhar Hequation. For c = .com/SIAM/FR16. Broyden’s method and NewtonGMRES both require more storage as iterations progress. the cost of a solution by Broyden’s method could be much less than that of one by NewtonGMRES. 7.9999 (Fig. The examples in this section.2. Buy this book from SIAM at http://www. however. 7. The cost in function evaluations for Broyden’s method is one for each nonlinear iteration and for NewtonGMRES one for each outer iteration and one for each inner iteration. NewtonGMRES as the inner iteration progresses and Broyden’s method as the outer iteration progresses. This electronic version is for personal use and may not be duplicated or distributed. 10 1 10 Relative Residual 0 10 1 10 2 10 3 0 5 10 Iterations 15 20 25 Fig.Copyright ©1995 by the Society for Industrial and Applied Mathematics. The implementation without restarts required 153 thousand ﬂoatingpoint operations and the restarted method 150. Broyden’s method for the linear PDE.2) and c = .html.4. Storage of the residual. Nonlinear problems.3). and the number of nonlinear iterations is large. 7. Broyden’s method could be at a disadvantage as it might need to be restarted many times.ecsecurehost. We compare Broyden’s method without restarts (solid line) to Broyden’s method with restarts every three iterates (dashed line). 7.
2.Copyright ©1995 by the Society for Industrial and Applied Mathematics. BROYDEN’S METHOD 129 For c = . the storage in NewtonGMRES is used in the inner iteration. while that in Broyden in the nonlinear iteration.3. c = . it may be the case Buy this book from SIAM at http://www. This electronic version is for personal use and may not be duplicated or distributed. c = . .9999. If many nonlinear iterations are needed.ecsecurehost. 10 0 10 Relative Nonlinear Residual 1 10 2 10 3 10 4 10 5 10 6 10 7 0 2 4 6 8 10 Iterations 12 14 16 18 Fig. 10 10 10 10 10 10 10 10 10 0 1 2 Relative Nonlinear Residual 3 4 5 6 7 8 0 1 2 3 Iterations 4 5 6 Fig. The restarted iteration requires 18 iterations and 407 ﬂoatingpoint operations. the restarted iteration has much more trouble. both implementations performed better than NewtonGMRES. 7. 7.com/SIAM/FR16.html.9. One should take some care in drawing general conclusions. While Broyden’s method on this example performed better than NewtonGMRES. Broyden’s method for the Hequation. The unrestarted iteration converges in 10 iterations at a cost of roughly 249 thousand ﬂoatingpoint operations.9999. Even so. Broyden’s method for the Hequation.
4. s.4.2) and more storage was used. 130 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS that Broyden’s method may need restarts while the GMRES iteration for the linear problem for the Newton step may not.6 million ﬂoatingpoint operations.com/SIAM/FR16. which happened on the second iterate. We consider the nonlinear convectiondiﬀusion equation from § 6. This is not supported by the local convergence theory for nonlinear problems. 7. 7. One can do this in brsol by setting the third entry in the parameter list to 1.4. u0 = 0. The Broyden iteration when restarted every n iterates. requires storage of n + 2 vectors {sj }n−1 .2 million ﬂoatingpoint operations. This is a modest improvement over the NewtonGMRES cost (reported in § 6. xc . Hence restarting the Broyden iteration every 8 iterations most closely corresponds to the NewtonGMRES requirements. 10 1 10 Relative Nonlinear Residual 0 10 1 10 2 10 3 10 4 0 2 4 6 8 Iterations 10 12 14 16 Fig. the residual norms do not monotonically decrease in the Broyden iteration (we ﬁx this in § 8.2) of 2. We preconditioned with the Poisson solver fish2d. This electronic version is for personal use and may not be duplicated or distributed. and τr = τa = h2 . In § 8.2. x. However. however it can be quite successful in practice.4. Broyden’s method for nonlinear PDE.html. F (xc + hv)) to accommodate the NewtonGMRES iteration. F (xc ). At most ﬁve inner GMRES iterations were required. We set C = 20. Therefore the GMRES inner iteration for the NewtonGMRES approach needed storage for at most 10 vectors (the Krylov basis. and z (when stored in the j=0 same place as F (x)). We use the same mesh width h = 1/32 on a uniform square grid.4 we compare Broyden’s method without restarts (solid line) to Broyden’s method with restarts every 8 iterates (dashed line). This approach Buy this book from SIAM at http://www. In Fig. as one would if the problem were truly linear.Copyright ©1995 by the Society for Industrial and Applied Mathematics. Convectiondiﬀusion equation.3. The unrestarted Broyden iteration converges in 12 iterations at a cost of roughly 2. .3 we will return to this example.ecsecurehost. We allowed for an increase in the residual.
This electronic version is for personal use and may not be duplicated or distributed.4 million ﬂoatingpoint operations. but the convergence was extremely irregular after the restart. Buy this book from SIAM at http://www.com/SIAM/FR16.html.Copyright ©1995 by the Society for Industrial and Applied Mathematics. . BROYDEN’S METHOD 131 took 15 iterations to terminate at a cost of 2.ecsecurehost.
What happens in the diﬀerential equation examples if preconditioning is omitted? 7. V is M × N . Vary the parameters in the equations and the number of vectors stored by Broyden’s method and report on their eﬀects on performance. Exercises on Broyden’s method 7.5. 1) × (0. Prove Proposition 7. Precondition with the Poisson solver fish2d. Solve the Hequation with Broyden’s method and c = 1. then (A + U V )−1 = A−1 − A−1 U (I + V A−1 U )−1 V A−1 .5. Compare the performance of Broyden’s method and NewtonGMRES on the Modiﬁed Bratu Problem −∇2 u + dux + eu = 0 on (0. 7.5. Set τa = τr = 10−8 . Extend Proposition 7. 7.4. How does this compare with nsol on the examples in Chapter 5.Copyright ©1995 by the Society for Industrial and Applied Mathematics. initial iterates. 7. [199].3.3. .4. Modify nsol to use Broyden’s method instead of the chord method after the initial Jacobian evaluation.4 with Broyden’s method using various values for the parameter C.1.1.7. updates an approximation to the inverse of the Jacobian at the root so that B −1 ≈ F (x∗ )−1 satisﬁes the inverse secant equation −1 B+ y = s (7. 7. 7.5.45) Buy this book from SIAM at http://www. 1) with homogeneous Dirichlet boundary conditions. Try to solve the nonlinear convectiondiﬀusion equation in § 6.2. 132 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS 7. This is the Sherman–Morrison–Woodbury formula [68]. The bad Broyden method [26].5.5.5. and values for the convection coeﬃcient d.1 by showing that if A is a nonsingular N × N matrix.9. What qfactor for linear convergence do you see? Can you account for this by applying the secant method to the equation x2 = 0? See [53] for a complete explanation.1. Experiment with various mesh widths. then A + U V is nonsingular if and only if I + V A−1 U is a nonsingular M × M matrix. U is N × M . Duplicate the results reported in § 7.com/SIAM/FR16.html.8.5. 7. This electronic version is for personal use and may not be duplicated or distributed.2. If this is the case. See [102] for further generalizations. How does the performance of Broyden’s method diﬀer from NewtonGMRES? 7.ecsecurehost.5. so named because of its inferior performance in practice [63]. Prove Proposition 7.5.5.3.6.
are relevant to this exercise.5. [93]. for example.5. Read about the algorithm in [172] and prove local superlinear convergence under the standard assumptions and the additional assumption that the sparsity pattern of B0 is the same as that of F (x∗ ).13.5. . [125].5.com/SIAM/FR16.5. Implement the bad Broyden method.12.14 in Chapter 5 to numerically estimate the qorder of Broyden’s method for some of the examples in this section (both linear and nonlinear).3. 7. The Schubert or sparse Broyden algorithm [172].7. Does anything go wrong? See [67] for more about this. Implement the Schubert algorithm on an unpreconditioned discretized partial diﬀerential equation (such as the Bratu problem from Exercise 7.10. For problems with tridiagonal Jacobian. How would the Schubert algorithm be aﬀected by preconditioning? Compare your analysis with those in [158]. What can you conclude from your observations? Buy this book from SIAM at http://www. y 2 2 Show that the bad Broyden method is locally superlinearly convergent if the standard assumptions hold. [92]. and compare the performance to that of Broyden’s method.ecsecurehost.4. apply it to the examples in § 7. BROYDEN’S METHOD 133 with the rankone update −1 −1 B+ = Bc + −1 (s − Bc y)y T . 7. [27] is a quasiNewton update for sparse matrices that enforces both the secant equation and the sparsity pattern. This electronic version is for personal use and may not be duplicated or distributed. 7. and main diagonal are updated. Does the relative performance of the methods change as h is decreased? This is interesting even in one space dimension where all matrices are tridiagonal.Copyright ©1995 by the Society for Industrial and Applied Mathematics. 7. superdiagonal.11. Try to implement the bad Broyden method in a storageeﬃcient manner using the ideas from [67] and § 7. Discuss the diﬀerences in implementation from Broyden’s method. and [189]. j ≤ N and i − j ≤ 1.8) and compare it to NewtonGMRES and Broyden’s method. Use your result from Exercise 5. and [112]. the update is (y − Bc s)i sj (B+ )ij = (Bc )ij + i+1 2 k=i−1 sk for 1 ≤ i. References [101].html. Note that only the subdiagonal.
.ecsecurehost. 134 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS Buy this book from SIAM at http://www.com/SIAM/FR16.html.Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
. The reason for this is that f (x) = (1 + x2 )−1 is small for large x.9 × 1017 .com/SIAM/FR16. can be used to accomplish the same objective. [173]. [154]. where the algorithms can be motivated and intuition developed with a very simple example. a failure of Newton’s method that results from an inaccurate initial iterate. but the length of s is too large. [70]. Of the many such algorithms we will focus on the class of line search methods and the Armijo rule [3]. and because it is trivial to add a line search to an existing locally convergent implementation of Newton’s method. If we look more closely. [198]. [129]. −1. [25]. Single equations If we apply Newton’s method to ﬁnd the root x∗ = 0 of the function f (x) = arctan(x) with initial iterate x0 = 10 we ﬁnd that the initial iterate is too far from the root for the local convergence theory in Chapter 5 to be applicable. and continuation/homotopy methods [1].ecsecurehost. We begin with single equations. This electronic version is for personal use and may not be duplicated or distributed. [70].Copyright ©1995 by the Society for Industrial and Applied Mathematics. we see that the Newton step s = −f (xc )/f (xc ) is pointing toward the root in the sense that sxc < 0. We select the line search paradigm because of its simplicity. A prototype algorithm is given below. 9. Chapter 8 Global Convergence By a globally convergent algorithm we mean an algorithm with the property that for any initial iterate the iteration either converges to a root of F or fails to do so in one of a small number of ways. We can see this eﬀect in the sequence of iterates: 10. [88]. f (x) = arctan(x) ≈ ±π/2. [159].html.1.9 × 104 . implemented inexactly [24]. [25]. [63]. [109]. and hence the magnitude of the Newton step is much larger than that of the iterate itself. 135 Buy this book from SIAM at http://www. [2].5 × 109 . The implementation and analysis of the line search method we develop in this chapter do not depend on whether iterative or direct methods are used to compute the Newton step or upon how or if the Jacobian is computed and stored. 2. such as trust region [153]. This observation motivates the simple ﬁx of reducing the size of the step until the size of the nonlinear residual is decreased. 8. [24]. [181]. Other methods. −138.
−6 × 10−10 . even in this terminal phase rapid convergence takes place only in the ﬁnal three outer iterations. 2. This electronic version is for personal use and may not be duplicated or distributed. 1.5. The quadratic convergence behavior of Newton’s method is only apparent very late in the iteration. (b) s = −f (x)/f (x) (search direction) (c) xt = x + s (trial point) (d) If f (xt ) < f (x) then x = xt (accept the step) else s = s/2 goto 2c (reject the step) When we apply Algorithm lines1 to our simple example. One does not set x+ = x + s without testing the step to see if the absolute value of the nonlinear residual has been decreased. We call this method a line search because we search along the line segment (xc .99.4. −3. The outer iterations are identiﬁed with circles as in § 6. As we move from the right endpoint to the left.56. However. 3. In the terminal phase. there is the possibility of oscillation about a solution without convergence [63]. −. some authors [63] refer to this as backtracking. lines1(x. Note the diﬀerences between Algorithm lines1 and Algorithm nsol. 8. xc − f (xc )/f (xc )) to ﬁnd a decrease in f (xc ). r0 = f (x) 2. however. In theory. 136 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS Algorithm 8. and 2 steplength reductions. 4.8.4. the minimum number of two function evaluations per iterate is required. . decrease in the size of the nonlinear residual was obtained with a full Newton step.1. The algorithm is to compute a search direction d which for us will be the Newton direction d = −f (xc )/f (xc ) Buy this book from SIAM at http://www. . Iterates 1. After the fourth iterate.1. where the local theory is valid. −0.1. 1. τ ) 1.1. f.3.Copyright ©1995 by the Society for Industrial and Applied Mathematics. We plot the progress of the iteration in Fig. we obtain the sequence 10. 9 × 10−4 .html. To remedy this and to make analysis possible we replace the test for simple decrease with one for suﬃcient decrease.com/SIAM/FR16. This is typical of the performance of the algorithms we discuss in this chapter.2. and 4 required 3. −8. −1. One can see that most of the work is in the initial phase of the iteration.9.ecsecurehost. 2. Do while f (x) > τr r0 + τa (a) If f (x) = 0 terminate with failure. Algorithm lines1 almost always works well. We plot arctan(x)/arctan(x0 ) as a function of the number of function evaluations with the solid line. 3.
A factor of Buy this book from SIAM at http://www. from [3] (see also [88]) is called the Armijo rule.Copyright ©1995 by the Society for Industrial and Applied Mathematics. Once suﬃcient decrease has been obtained we accept the step s = λd. and then test steps of the form s = λd. We will analyze Algorithm nsola in § 8. This electronic version is for personal use and may not be duplicated or distributed. with λ = 2−j for some j ≥ 0. f. We remark here that there is nothing critical about the reduction factor of 2 in the line search.com/SIAM/FR16. something we did not have to do in the earlier chapters.1. nsola1(x. GLOBAL CONVERGENCE 0 137 10 10 10 Relative Nonlinear Residual 10 10 10 10 10 10 10 10 1 2 3 4 5 6 7 8 9 10 0 5 10 15 20 Function Evaluations 25 30 35 Fig. If f (xt ) < (1 − αλ)f (x) then x = xt (accept the step) else λ = λ/2 goto 2(c)i (reject the step) This is essentially the algorithm implemented in the MATLAB code nsola. The parameter α ∈ (0.1) as easy as possible to satisfy. This strategy.2.html. Newton–Armijo iteration for inverse tangent. but positive.1) f (xc + λd) < (1 − αλ)f (xc ). .2. The condition in (8. τ ) 1. xt = x + λd (trial point) ii. number intended to make (8.1. until f (x + s) satisﬁes (8. (b) d = −f (x)/f (x) (search direction) (c) λ = 1 i. Note that we must now distinguish between the step and the Newton direction. Algorithm 8. 8. Do while f (x) > τr r0 + τa (a) If f (x) = 0 terminate with failure.ecsecurehost. We follow the recent optimization literature [63] and set α = 10−4 . r0 = f (x) 2.1) is called suﬃcient decrease of f . 1) is a small.
In that event a reduction of a factor of 8 would require three passes through the loop in nsola1 but only a single reduction by a factor of 10. where 0 < σ0 < σ1 < 1. Our algorithm nsola incorporates these ideas.ecsecurehost. τ. η). in fact.1 [54]. Taking as large a fraction as possible helps move the iteration into the terminal phase in which full steps may be taken and fast convergence expected.1. On the other hand. the iteration may stagnate as a result. We will return to the issue of reduction in λ in § 8.2). approximate the Newton step by a quasiNewton method.5. The parameter σ0 safeguards against that.Copyright ©1995 by the Society for Industrial and Applied Mathematics.2) F (xn )dn + F (xn ) ≤ ηn F (xn ) . nsola(x. terminate with failure.1 and . The danger is that the minimum may be too near zero to be of much use and. or even use the chord method provided that the resulting direction dn satisﬁes (8. Do while F (x) > τr r0 + τa (a) Find d such that F (x)d + F (x) ≤ η F (x) If no such d can be found. Buy this book from SIAM at http://www. In Exercise 8. [63]. Our second extension is to allow more ﬂexibility in the reduction of λ. [87]. This could be quite important if the search direction were determined by an inexact Newton method or an approximation to Newton’s method such as the chord method or Broyden’s method. 8. r0 = F (x) 2.3. [133]. 138 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS 10 could well be better in situations in which small values of λ are needed for several consecutive steps (such as our example with the arctan function).2) is motivated by the Newtoniterative paradigm. We write this for the Armijo sequence as (8.5. Safeguarding is an important aspect of many globally convergent methods and has been used for many years in the context of the polynomial interpolation algorithms we discuss in § 8. Algorithm 8.9 you are asked to implement the Armijo rule in the context of Algorithm nsol. the analysis here applies equally to methods that solve the equations for the Newton step directly (so ηn = 0). Analysis of the Armijo rule In this section we extend the onedimensional algorithm in two ways before proceeding with the analysis. developed in § 8. .3. reducing λ by too much can be costly as well. First. Taking full Newton steps ensures fast local convergence.1).3. is to minimize an interpolating polynomial based on the previous trial steps. While (8.html. [86]. We allow for any choice that produces a reduction that satisﬁes σ0 λold ≤ λnew ≤ σ1 λold . 1.2. One such method.2. Typical values of σ0 and σ1 are .com/SIAM/FR16. we accept any direction that satisﬁes the inexact Newton condition (6. F. This electronic version is for personal use and may not be duplicated or distributed.
by giving a uniform lower bound on λn (assuming that F (x0 ) = 0).2). for example.com/SIAM/FR16. If. Assume that {xn } is given by Algorithm nsola with (8. F (xn ) is singular for some n.1. The convergence theorem is the remarkable statement that if {xn } and { F (xn )−1 } are bounded (thereby eliminating premature failure and local minima of F that are not roots) and F is Lipschitz continuous in a neighborhood of the entire sequence {xn }. then the iteration converges to a root of F at which the standard assumptions hold. Let x0 ∈ RN and α ∈ (0.ecsecurehost. r) = ∞ {x  x − xn ≤ r}. We will see that if F has no roots then either F (xn ) has a limit point which is singular or {xn } becomes unbounded.1 implies that the line search phase (step 2b in Algorithm nsola) will terminate in a number of cycles that is independent of n. Assumption 8. 1 − α) Buy this book from SIAM at http://www. illconditioning of F might be detected as part of the factorization.2.3) ¯ {ηn } ⊂ (0. We show this. and F (x)−1 ≤ mf on the set Ω({xn }. Note how the safeguard bound σ0 enters this result. xn → x. The inner iteration could terminate with a failure. η ] ⊂ (0.1. We begin with a formal statement of our assumption that F is uniformly Lipschitz continuous and bounded away from zero on the sequence {xn }. Clearly if F has no roots.Copyright ©1995 by the Society for Industrial and Applied Mathematics. mf > 0 such that F is deﬁned.html. and the convergence is qquadratic. a local minimum of F which is not a root. the algorithm must fail.2. There are r. γ. .2. 1) be given. Lemma 8. F is Lipschitz continuous with Lipschitz constant γ. step 2a were implemented with a direct factorization method. xn → ∞. This electronic version is for personal use and may not be duplicated or distributed. GLOBAL CONVERGENCE 139 (b) λ = 1 i. Let {xn } be the iteration produced by Algorithm nsola with initial iterate x0 . as do [25] and [70]. The algorithm can fail in some obvious ways: 1. Note also the very modest conditions on ηn . If F (xt ) < (1 − αλ) F (x) then x = xt else Choose σ ∈ [σ0 . full steps are taken in the terminal phase. σ1 ] λ = σλ goto 2(b)i Note that step 2a must allow for the possibility that F is illconditioned and that no direction can be found that satisﬁes (8. 2. n=0 Assumption 8. ¯ 3. xt = x + λd ii.
1 F is Lipschitz continuous on the line segment [xn . Theorem 8. for any λ ∈ [0. ¯ Lipschitz continuity of F implies that if λ ≤ λ1 then F (xn + λdn ) ≤ (1 − λ) F (xn ) + λ¯ F (xn ) + η ≤ (1 − λ) F (xn ) + λ¯ F (xn ) + η γ 2 λ dn 2 2 1 0 (F (xn + tλdn ) − F (xn ))λdn dt (F (xn + tλdn ) − F (xn ))λdn dt.2. Let n ≥ 0. Then either F (x0 ) = 0 or (8. by Assumption 8. By the fundamental theorem of calculus and (8.html.Copyright ©1995 by the Society for Industrial and Applied Mathematics. The Buy this book from SIAM at http://www. ¯ Therefore. xn + λdn ] whenever ¯ λ ≤ λ1 = r/(mf F (x0 ) ). .ecsecurehost. Proof. ¯ This shows that λ can be no smaller than σ0 λ2 .1 holds. This electronic version is for personal use and may not be duplicated or distributed. Now we can show directly that the sequence of Armijo iterates either is unbounded or converges to a solution. mf F (x0 ) m2 F (x0 ) (1 + η )2 γ ¯ f . by the bound ηn ≤ η . 140 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS and that Assumption 8. 1 0 ¯ m2 γ(1 + η )2 λ2 F (xn ) f 2 ¯ m2 γλ(1 + η )2 F (x0 ) f 2 2 ≤ (1 − λ + η λ) F (xn ) + λ F (xn ) ¯ < (1 − αλ) F (xn ) .4) ¯ λn ≥ λ = σ0 min r 2(1 − α − η ) ¯ . ¯ The step acceptance condition in nsola implies that { F (xn ) } is a decreasing sequence and therefore dn = F (xn )−1 (ξn − F (xn )) ≤ mf (1 + η ) F (xn ) . ¯ ξn ≤ ηn F (xn ) ≤ η F (xn ) . which will be implied by ¯ ¯ λ ≤ λ2 = min λ1 . 1] F (xn + λdn ) = F (xn ) + λF (xn )dn + = (1 − λ)F (xn ) + λξn + where. which completes the proof. 2(1 − α − η ) ¯ (1 + η )2 m2 γ F (x0 ) ¯ f .com/SIAM/FR16.2.2) we have.1 says even more.2.
Hence as soon as xnk ∈ B(δ).1 is violated) or the iteration becomes unbounded. All that one needs to make the theorems and proofs in § 8. Otherwise Lemma 8. Thus far we have used for d the inexact Newton direction. 1) be given.2.com/SIAM/FR16. say to x∗ . Since the standard assumptions hold at x∗ .1. then the storage requirements are roughly the same as for Algorithm nsolgm if one reuses the storage location for the ﬁnite diﬀerence directional derivative to test for suﬃcient decrease.2). The iteration can be very costly if the Jacobian must be Buy this book from SIAM at http://www. Assume that {xn } is given by Algorithm nsola. {xn } is bounded. Theorem 8.html. GLOBAL CONVERGENCE 141 sequence converges to a root at which the standard assumptions hold. At this stage we have shown that if the Armijo iteration fails to converge to a root either the continuity properties of F or nonsingularity of F break down as the iteration progresses (Assumption 8.1 holds.3). If GMRES is used as the linear solver.2. where r is the radius in Assumption 8. and that Assumption 8. the entire sequence.2 applies and hence full steps can be taken.2. Then {xn } converges to a root x∗ of F at which the standard assumptions hold.1 and therefore the standard assumptions hold at x∗ . Let x0 ∈ RN and α ∈ (0.2. Theorem 6.2. there is δ such that if x ∈ B(δ).9 you are asked to do this with nsol and. Moreover. will remain in B(δ) and converge to x∗ . It could well be advantageous to use a chord or Broyden direction. If F (xn ) = 0 for some n. F (x∗ ) = 0. The boundedness of the sequence {xn } implies that a subsequence {xnk } converges. then we are done because Assumption 8. Implementation of the Armijo rule The MATLAB code nsola is a modiﬁcation of nsolgm which provides for a choice of several Krylov methods for computing an inexact Newton direction and globalizes Newton’s method with the Armijo rule. {ηn } satisﬁes (8.1 implies that the standard assumptions hold at x∗ = xn . so in the terminal phase of the iteration the step lengths are all 1 and the convergence is qquadratic. Proof.1 implies that F (xn ) converges qlinearly to zero with qfactor at ¯ most (1 − αλ). the Newton iteration with x0 = x will remain in B(δ) and converge qquadratically to x∗ .2 hold is (8. via a forward diﬀerence approximation to F (xn )d. Since F is continuous. which is easy to verify. not just the subsequence. . 8.3. This electronic version is for personal use and may not be duplicated or distributed. with each iteration. full steps are taken for n suﬃciently large. Eventually xnk − x∗  < r. in principle. In this case the storage requirements are roughly the same as for Algorithm nsol.Copyright ©1995 by the Society for Industrial and Applied Mathematics.1.4. and the convergence behavior in the ﬁnal phase of the iteration is that given by the local theory for inexact Newton methods (Theorem 6.2.1. numerically determine if the chord direction satisﬁes the inexact Newton condition. In Exercise 8. We use the l2 norm in that code and in the numerical results reported in § 8.ecsecurehost. This is all that one could ask for in terms of robustness of such an algorithm.2). for example.5.
html.ecsecurehost. otherwise. In practice such methods usually perform better than constant reduction factors. say). Evaluation of F (xc )dc if the full step is not accepted (here dc is the Broyden direction) would not only verify the inexact Newton condition but provide enough information for a twopoint parabolic line search. . Our Broyden–Armijo code does not explicitly check the inexact Newton condition and thereby saves an evaluation of F . polynomial interpolation as a way to choose λ. After λc has been rejected and a model polynomial computed. In this section we consider an elaboration of the simple line search scheme of reduction of the steplength by a ﬁxed factor. The ﬁrst. but might respond well to a more aggressive steplength reduction (by factors of 1/10. 8. F (xn + λ1 dn ) 2 .10. Buy this book from SIAM at http://www. In this approach we use values of f and f at λ = 0 and the value of f at the current value of λ to construct a 2nd degree interpolating polynomial for f . .3. one might expect to be able to better estimate the appropriate reduction factor.5.5) λ+ = σ 1 λc λt Twopoint parabolic model. If one can model the decrease in F 2 as the steplength is reduced. . If we have rejected k steps we have in hand the values F (xn ) 2 .Copyright ©1995 by the Society for Industrial and Applied Mathematics. is applicable to all methods. We can use this iteration history to model the scalar function f (λ) = F (xn + λdn ) 2 2 with a polynomial and use the minimum of that polynomial as the next steplength. if λt > σ1 λc . how Broyden’s method must be implemented. This electronic version is for personal use and may not be duplicated or distributed. The motivation for this is that some problems respond well to one or two reductions in the steplength by modest amounts (such as 1/2) and others require many such reductions. F (xn + λk−1 dn ) 2 . (8. Clearly f (0) can be computed as 2 (8. we compute the minimum λt of that polynomial analytically and set σ0 λc if λt < σ0 λc . is again based on [67] and is more specialized. f (0) = F (xn ) 2 is known. We consider two topics in this section. 142 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS evaluated and factored many times during the phase of the iteration in which the approximate solution is far from the root. . The second.com/SIAM/FR16.6) f (0) = 2(F (xn )T dn )T F (xn ) = 2F (xn )T (F (xn )dn ).1. The reader is asked to pursue this in Exercise 8. We consider two ways of doing this that use second degree polynomials which we compute using previously computed information. Polynomial line searches.
p agree with f. Buy this book from SIAM at http://www. In any case. This electronic version is for personal use and may not be duplicated or distributed. λ− is p(λ) = f (0) + λ λc − λ− (λ − λ− )(f (λc ) − f (0)) (λc − λ)(f (λ− ) − f (0)) + . This is a possibility with directions other than inexact Newton directions. .5. f (λc ). and f (λ− ). we implement this method with σ0 = .1.ecsecurehost. In this approach one evaluates f (0) and f (1) as before. If the full step is rejected. We take the latter approach and set λ+ = σ1 λc . If p (0) ≤ 0 one could either set λ+ to be the minimum of p on the interval [σ0 λ. λ2 c Our approximation of f (0) < 0.Copyright ©1995 by the Society for Industrial and Applied Mathematics.com/SIAM/FR16. An alternative to the twopoint model that avoids the need to approximate f (0) is a threepoint model. GLOBAL CONVERGENCE 143 Use of (8. σ1 λ] or reject the parabolic model and simply set λ+ to σ0 λc or σ1 λc . We then compute λ+ with (8. The MATLAB code parab3p implements the threepoint parabolic line search and is called by the two codes nsola and brsola. In the MATLAB code nsola from the collection.5). λc . which can be obtained by examination of the ﬁnal residual from GMRES or by expending an additional function evaluation to compute F (xn )dn with a forward diﬀerence. we have the values f (0). we set λ = σ1 and try again. After the second step is rejected. so if f (λc ) > f (0). σ1 = . If p (0) = 2 (λ− (f (λc ) − f (0)) − λc (f (λ− ) − f (0))) λc λ− (λc − λ− ) is positive. f at 0 and p(λc ) = f (λc ) is p(λ) = f (0) + f (0)λ + cλ2 . Threepoint parabolic model. where c= f (λc ) − f (0) − f (0)λc . then we set λt to the minimum of p λt = −p (0)/p (0) and apply the safeguarding step (8.html. λc λ− We must consider two situations.5) to compute λ+ . where λc and λ− are the most recently rejected values of λ. then c > 0 and p(λ) has a minimum at λt = −f (0)/(2c) > 0.6) requires evaluation of F (xn )dn . The polynomial p(λ) such that p. which uses f (0) and the two most recently rejected steps to create the parabola. it may be necessary to compute a new search direction. our approximation to f (0) should be negative. The polynomial that interpolates f at 0. If it is not. such as the Broyden direction.
[63]. This electronic version is for personal use and may not be duplicated or distributed. and wn = (Bn un )/(1 + vn Bn un ). The diﬀerence from the development in § 7.8) and the relation dn+1 = − I − wn sT n sn 2 −1 Bn F (xn+1 ) to solve for dn+1 and obtain an analog of (7.2. the polynomial modeling approach is heuristic. [133].3.7) yn − Bn sn = F (xn+1 ) − (1 − λn )F (xn ).8) to construct Bc F (x+ ) as we did in Algorithm brsol. using the notation in § 7. As in § 7.2.38) to obtain wn = λn (8.8) = λn where −dn+1 sn + (λ−1 − 1) n sn 2 sn 2 −sn+1 /λn+1 sn + (λ−1 − 1) n sn 2 sn −1 dn+1 = −Bn+1 F (xn+1 ) 2 .ecsecurehost.5. is the search direction used to compute xn+1 .7) and (7.5.8) reduces to (7. Broyden’s method. The safeguarding and Theorem 8. sn 2 sn 2 If. In Chapter 7 we implemented Broyden’s method at a storage cost of one vector for each iteration.44) (8. then −1 sn = −λn Bn F (xn ) and hence (8. we set un = we can use (8.3 is that the simple relation between the sequence {wn } and {vn } is changed. [86].41) when λn = 1 and λn+1 = 1 (so dn+1 = sn+1 ). 8. Clearly. our approach is a nonlinear version of the method in [67]. yn − Bn sn sn −1 T −1 . vn = .3. The relation y − Bc s = F (x+ ) was not critical to this and we may also incorporate a line search at a cost of a bit of complexity in the implementation.html.8) and (8.9) dn+1 = − −1 −1 sn 2 Bn F (xn+1 ) − (1 − λn )sT Bn F (xn+1 )sn n 2 −1 sn 2 + λn sT Bn F (xn+1 ) n 2 −1 We then use the second equality in (8.Copyright ©1995 by the Society for Industrial and Applied Mathematics. Note that (8.com/SIAM/FR16.9) is left to Exercise 8.1 provide insurance that progress will be made in reducing F . 144 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS Interpolation with a cubic polynomial has also been suggested [87]. If a line search is used. We can use the ﬁrst equality in (8. Veriﬁcation of (8.3. . Buy this book from SIAM at http://www.
Do while n ≤ maxit (a) n = n + 1 (b) x = x + sn (c) Evaluate F (x) (d) If F (x) (e) 2 ≤ τr r0 + τa exit. In Exercise 8. store sn+1 2 − λn sT z) n and λn+1 . In fact.5.3.3.Copyright ©1995 by the Society for Industrial and Applied Mathematics. Initialize: Fc = F (x) r0 = F (x) 2 .1 and also changes η using (6. nmax) 1. uses a threepoint parabolic line search and performs a restart when storage is exhausted like Algorithm brsol does. We do not include details of the line search or provide for restarts or a limited memory implementation. d = −F (x). s0 = λ0 d 2. n = −1. The MATLAB implementation of brsola in the collection illustrates this point. Another approach would be to compute F (xc )d numerically if a full step is rejected and then test (8.1. F. computation of the step length requires storage of both the current point x and the trial point x + λs before a step can be accepted. brsola(x. 8.20) Buy this book from SIAM at http://www. As a side eﬀect. . GLOBAL CONVERGENCE 145 Algorthm brsola uses these ideas. f (0).ecsecurehost.com/SIAM/FR16. Since dn+1 and sn+1 can occupy the same location. an approximation of f (0) can be obtained at no additional cost and a parabolic line search based on f (0). i. So. τ. there is no guarantee that the Broyden direction is a direction of decrease for F 2 . Examples for Newton–Armijo The MATLAB code nsola is a modiﬁcation of nsolgm that incorporates the threepoint parabolic line search from § 8. Algorithm 8.2) before beginning the line search. The MATLAB implementation. n − 1 a = −λj /λj+1 . However. (h) Set sn+1 = λn+1 d.10 the reader is asked to fully develop and implement these ideas. maxit. for j = 0. and f (1) can be used. compute λ0 . z = −F (x) ii. the storage of the globalized methods exceeds that of the local method by one vector. provided in the collection of MATLAB codes. The MATLAB code returns with an error message if more than a ﬁxed number of reductions in the step length are needed.4. It is possible that the Broyden direction fails to satisfy (8.2). This electronic version is for personal use and may not be duplicated or distributed. b = 1 − λj z = z + (asj+1 + bsj )sT z/ sj j 2 2 2 2 (f) d = ( sn 2 z + (1 − λn )sT zsn )/( sn n 2 (g) Compute λn+1 with a line search. the storage requirements of Algorithm brsola would appear to be essentially the same as those of brsol.html.
We compare this strategy. However. For this problem we plot the progress of the iteration using η = . Since we have easy access to analytic derivatives in this example. however.4 with γ = . initial iterate u0 = 0.3 we show the progress of the iteration for the unpreconditioned equation. 8.4. may well be most eﬃcient.2 and 7. The constant reduction line search took three reductions in the ﬁrst two outer iterations and two each in following two. using GMRES as the linear solver. with the constant η method as we did in § 6.25 with the solid line and using the sequence given by (6. the quasiNewton method.4. As before. In Fig. 146 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS once the iteration is near the root.5) for the arctan function. We consider a much more diﬃ −∇2 u + Cu(ux + uy ) = f with the same right hand side f . We base our absolute error criterion on the norm · 2 / N as we did in § 6.4. This makes a globalization strategy critical (Try it without one!).3 we compare the Broyden–Armijo method with NewtonGMRES for those problems for which both methods were successful. which requires a single function evaluation for each outer iteration when full steps can be taken. We take (6. the parabolic line search takes only one reduction in the next three iterations and none thereafter. The parabolic line search required 7 outer iterates and 21 function evaluations in contrast with the constant reduction searches 11 outer iterates and 33 function evaluations.4. We set τr = τa = h2 /10.4. We use x0 = 10 as the initial iterate with τr = τa = 10−8 . We compare the twopoint parabolic line search with the constant reduction search (σ0 = σ1 = . In the ﬁrst outer iteration. 8.9. Convectiondiﬀusion equation. we can use the twopoint parabolic line search. that if the initial iterate is far from the solution. We caution the reader now. This electronic version is for personal use and may not be duplicated or distributed. This is a tighter tolerance that in § 6.21) from § 6. In Fig. Counts of function evaluations corresponding to outer iterations are√ indicated by circles. 8. .2 and.4. an inexact Newton method such as NewtonGMRES can succeed in many case in which a quasiNewton method can fail because the quasiNewton direction may not be an inexact Newton direction.com/SIAM/FR16. we set C = 100.20) with the dashed Buy this book from SIAM at http://www. both line searches take three steplength reductions. 8. because of the large value of C and resulting illconditioning. when both methods succeed. However. 1) × (0. Inverse tangent function.ecsecurehost.2 we plot the iteration history corresponding to the parabolic line search with the solid line and that for the constant reduction with the dashed line.Copyright ©1995 by the Society for Industrial and Applied Mathematics.4.2.1. is needed to obtain an accurate solution. in all the ﬁgures we plot F (xn ) 2 / F (x0 ) 2 against the number of function evaluations required by all line searches and inner and outer iterations to compute xn . cult problem.html. In § 8. 1) that we considered before. Here. and will repeat the caution later. 31 × 31 mesh and homogeneous Dirichlet boundary conditions on the unit square (0.2.
4 (varying η) million ﬂoating point operations.3. This electronic version is for personal use and may not be duplicated or distributed. In the computation we Buy this book from SIAM at http://www.20) to γ = . We set the parameters in (6. 8. GLOBAL CONVERGENCE 0 147 10 10 Relative Nonlinear Residual 2 10 4 10 6 10 8 10 10 10 12 0 5 10 15 20 Function Evaluations 25 30 35 Fig. and 25 (constant η) and 22 (varying η) outer iterations. 759 (constant η) and 744 (varying η) function evaluations.3 (constant η) and 75. The two approaches to selection of the forcing term performed very similarly.Copyright ©1995 by the Society for Industrial and Applied Mathematics. Newton–Armijo for the PDE.com/SIAM/FR16. 8.ecsecurehost. This problem is very poorly conditioned and much tighter control on the inner iteration is needed than for the other problems. We consider the preconditioned problem in Fig. .html.25. The iterations required 75.9 and ηmax = .2. 8. 10 0 10 Relative Nonlinear Residual 1 10 2 10 3 10 4 10 5 0 100 200 300 400 500 Function Evaluations 600 700 800 Fig.4. line. Newton–Armijo for the arctan function. C = 100.
2.20) terminated after 70 function evaluations.9 and ηmax = .ecsecurehost.21) with the fast Poisson solver fish2d. 148 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS preconditioned (6. required at most 16 inner iterations while the constant η approach needed at most 10.5 million ﬂoatingpoint operations.20) with the dashed line.4.6 million ﬂoatingpoint operations. As reported in § 6.20) to γ = . terminating in 4 outer iterations. The iteration in which {ηn } is given by (6. This electronic version is for personal use and may not be duplicated or distributed. Broyden–Armijo. When compared to the Broyden’s method results in § 6. The constant η iteration terminated after 79 function evaluations. 8. . In Fig. We set the parameters in (6. the NewtonGMRES iteration always took full steps. 8. and roughly 13.4 we show the progress of the iteration for the preconditioned equation.4.20). 8.Copyright ©1995 by the Society for Industrial and Applied Mathematics. In these cases the Jacobian is not well approximated by a low rank perturbation of the identity [112] and the ability of the inexact Newton iteration to ﬁnd a good search direction was the important factor.3 million ﬂoatingpoint operations.com/SIAM/FR16. with the forcing term given by (6. For this problem we plot the progress of the iteration using η = . 9 outer iterations.4. 10 0 10 Relative Nonlinear Residual 1 10 2 10 3 10 4 10 5 0 10 20 30 40 50 Function Evaluations 60 70 80 Fig. The line search in this case reduced the step length three times on the ﬁrst iteration and twice on the second and third. the Broyden–Armijo costs reﬂect Buy this book from SIAM at http://www.99.4. The most eﬃcient iteration.2. 16 function evaluations. We found that the Broyden–Armijo line search failed on all the unpreconditioned convection diﬀusion equation examples.4. when increases in the residual were allowed. We begin by revisiting the example in § 6. and 2.3. Newton–Armijo for the PDE.25 with the solid line and using the sequence given by (6.2 with C = 20. and 12. C = 100. 9 outer iterations.html. The line search in the nonconstant η case reduced the step length three times on the ﬁrst iteration and once on the second.
In Fig. Restarting the Broyden iteration every 19 iterations would equalize the storage requirements with Newton–GMRES–Armijo. 42 outer iterates.2. In terms of storage. This electronic version is for personal use and may not be duplicated or distributed.ecsecurehost. 8. GLOBAL CONVERGENCE 149 an improvement in eﬃciency. For the more diﬃcult problem with C = 100.9. NewtonGMRES and Broyden–Armijo for the PDE. From these examples we see that Broyden–Armijo can be very eﬃcient provided only a small number of iterations are needed.7. needing to store 21 vectors.5. We set τa = τr = h2 . ηmax = . Broyden’s method suffers under these conditions as one can see from the comparison of Broyden– Armijo (solid line) and Newton–GMRES–Armijo (dashed line) in Fig. 8. and γ = . C = 20.99. The steplength was reduced on the second iteration (twice) and the third (once). The Broyden–Armijo iteration required 34 nonlinear iterations.4.4.8 million ﬂoatingpoint operations while the Broyden–Armijo took 9 iterations and required 2 million ﬂoatingpoint operations. The storage requirements increase as the nonlinear iteration progresses and this fact puts Broyden’s Buy this book from SIAM at http://www.Copyright ©1995 by the Society for Industrial and Applied Mathematics.2 took 12 iterations and 2. The restarted Broyden–Armijo iteration took 19.html.5 compares the Broyden–Armijo iteration (solid line) to the GMRES iteration (dashed line). 10 0 Relative Nonlinear Residual 10 1 10 2 10 3 0 2 4 6 8 10 Function Evaluations 12 14 16 Fig. . ηmax = . 8. and 123 function evaluations. We set τa = τr = h2 /10. roughly 16 million ﬂoatingpoint operations. and therefore was much more eﬃcient in terms of storage. the performance of the two methods is more similar. In our implementation of brsola the threepoint parabolic line search was used. Figure 8.5 million ﬂoatingpoint operations. and 85 function evaluations.com/SIAM/FR16. The Broyden iteration in § 6. the Broyden–Armijo iteration required storage for 37 vectors while the NewtonGMRES iteration needed at most 16 inner iterations.5. and γ = .6 we compare our implementation of Newton–GMRES–Armijo (dashed line) to Broyden–Armijo (solid line).9 as we did in § 6.
8. 8. NewtonGMRES and Broyden–Armijo for the PDE. method at a disadvantage to NewtonGMRES when the number of nonlinear iterations is large. C = 100. Buy this book from SIAM at http://www.ecsecurehost.html. 10 0 10 Relative Nonlinear Residual 1 10 2 10 3 10 4 10 5 0 20 40 60 80 Function Evaluations 100 120 140 Fig. 150 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS 0 10 10 Relative Nonlinear Residual 1 10 2 10 3 10 4 10 5 0 10 20 30 40 50 60 Function Evaluations 70 80 90 Fig. This electronic version is for personal use and may not be duplicated or distributed. .6.Copyright ©1995 by the Society for Industrial and Applied Mathematics. NewtonGMRES and restarted Broyden–Armijo for the PDE.com/SIAM/FR16. C = 100.7.
4.5.ecsecurehost. f (x) = 1 + x2 8.5. 8. Experiment with other linear solvers such as GMRES(m). f (x) = .4.40.5.7. Modify brsola to test the Broyden direction for the descent property and use a twopoint parabolic line search. Show that if F is Lipschitz continuous and the iteration {xn } produced by Algorithm nsola converges to x∗ with F (x∗ ) = 0.1. then F (x∗ ) is singular. f (x) = 2 + sin(x) 7. f (x) = arctan(x) (i. f (x) = x(1 + sin(1/x)) 5.5.com/SIAM/FR16. 8. 8. Explain your results. BiCGSTAB. Use nsola to duplicate the results in § 8. He puts $10.8) and (8.4.9 + arctan(x) 4..3.e.000. f (x) = ex 6. f (x) = arctan(x2 ) 3. What could you do if the Broyden direction is not a descent direction? Apply your code to the examples in § 8.2. 8. Modify nsol.9). and TFQMR. A numerical analyst buys a German sports car for $50.5.000 down and takes a 7 year installment loan to pay the balance.1.5. .5. 8.2. Try to do it in such a way that the chord direction is used whenever possible. GLOBAL CONVERGENCE 151 8. the hybrid Newton algorithm from Chapter 5.9. Verify (8.5. 8.5. If the monthly payments are $713.5.8. Apply your code to the following functions with x0 = 10.) 2.10. Duplicate the results in § 8.5.Copyright ©1995 by the Society for Industrial and Applied Mathematics. How does the method proposed in [10] diﬀer from the one implemented in nsola? What could be advantages and disadvantages of that approach? 8. In what ways are the results in [25] and [70] more general than those in this section? What ideas do these papers share with the analysis in this section and with [3] and [88]? 8.5. to use the Armijo rule. Vary the convection coeﬃcient in the convectiondiﬀusion equation and the mesh size and report the results.html. 1. You might use the MATLAB code nsola to do this. This electronic version is for personal use and may not be duplicated or distributed. This is interesting in the locally convergent case as well. what is the interest rate? Assume monthly compounding.6. Implement the Newton–Armijo method for single equations. Exercises on global convergence 8. Buy this book from SIAM at http://www.
html. Modify nsola to use a cubic polynomial and constant reduction line searches instead of the quadratic polynomial line search.5. Under what conditions will the iteration given by nsola converge to a root x∗ that is independent of the initial iterate? Buy this book from SIAM at http://www.5.13. when does it? How would you implement a secantArmijo method in such a way that the convergence theorem 8. This electronic version is for personal use and may not be duplicated or distributed.com/SIAM/FR16.12.2) with ηn bounded away from 1? If not. 8.11. Compare the results with the examples in § 8.5.ecsecurehost.1 is applicable? 8. Does the secant method for equations in one variable always give a direction that satisﬁes (8.Copyright ©1995 by the Society for Industrial and Applied Mathematics. 152 ITERATIVE METHODS FOR LINEAR AND NONLINEAR EQUATIONS 8. .2.4.
Numerical Solution 153 Buy this book from SIAM at http://www. B. pp. Barlett. . S. Dongarra. pp. SIAM Rev. ¨ [15] H. Cambridge. 14 (1979). E. Cambridge University Press. [8] O. PA. 28–85. Internat. Astrophys. [12] R. The principle of minimized iterations in the solution of the matrix eigenvalue problem. [16] K. T. Methods.html. Society for Industrial and Applied Mathematics. S. Bibliography [1] E. Petzold. and H. An inverse matrix adjustment arising in discriminant analysis.. [2] .. Math. Math. Bicanic and K. 183–196. [9] S. J. E. p. pp. M.. 133–181. Georg. ´ ´ [13] N. Eﬃcient methods to calculate Chandrasekhar’s Hfunctions. pp. Math. New York. 22 (1951). 1–3. Bank and D. Romine. [14] P. 22 (1973). 148–152. Engrg. A. Manteuffel. H. Eijkhout. Sur les op´rations dans les ensembles abstraits et leur applications e aux ´quations int´grales. SpringerVerlag. pp. Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods. L. pp. Math. J. Iterative Solution Methods. [11] M. Numer. and J. Pozo.. [6] K. 1990. 2nd. Iterative variants of the Nystr¨m method for the numerical o solution of integral equations. pp. Rose. Simplicial and continuation methods for approximating ﬁxed points and solutions to systems of equations. Who was ‘Raphson’?. 3 (1922). E. S. Comput. Numer. J. ed. Fund. Global approximate Newton methods. 17–29. 1993.. Appl. Sci. [4] W. 16 (1966). This electronic version is for personal use and may not be duplicated or distributed. R. Math. [3] L. and L. Paciﬁc J. 2 (1960). 1994. DeRooij. 9 (1951). Statist. Allgower and K. Numer. Ashby. An Introduction to Numerical Analysis. Bosma and W.. 279–295. 17–31. Math. Donato. Banach. pp. Ann. Brenan. Berry. 37 (1981). 22 (1980). e e [10] R. Quart. Numer. [5] S. Astron. Uber die numerische Behandlung von Integralgleichungen nach der Quadraturformelmethode. A comparison of adaptive Chebyshev and least squares polynomial preconditioning for Hermetian positive deﬁnite linear systems.. C. SIAM J. 1989. T. New York. F.ecsecurehost. pp. Minimization of functions having Lipschitzcontinuous ﬁrst partial derivatives. Campbell. F. van der Vorst. Axelsson. J. 283. Math. John Wiley. V. J. Numerical Continuation Methods : An Introduction. Statist. Armijo.. 13 (1992). [7] . Johnson. Arnoldi.. Barrett.Copyright ©1995 by the Society for Industrial and Applied Mathematics. Otto. Philadelphia. L. Demmel. pp.. Brakhage.. R. 126 (1983). Chan. E. Atkinson. 1– 29. A. 107–111.com/SIAM/FR16.
1994. Report CRSCTR9513. 50. Ipsen. Rannacher. 1467–1488. Reduced storage matrix methods in stiﬀ ODE systems. D. Brown and A. 154 BIBLIOGRAPHY of Initial Value Problems in DiﬀerentialAlgebraic Equations. F. C. T. Burmeister. Diﬀerence Equations and Iterative Processes. GMRES and the minimal polynomial. Algorithms for Minimization Without Deriviatives. Q. N. Statist. C. Campbell. 1960. Comput. pp. 24 (1987). Tidriri. B. Statist. J. Dennis. ´ [28] C. no.ecsecurehost. Some eﬃcient algorithms for solving systems of nonlinear equations. I. This electronic version is for personal use and may not be duplicated or distributed. 577–593. VODE: A variable coeﬃcient ode solver. D.. D. Convergence theory of nonlinear NewtonKrylov algorithms. J. pp.. Brown. C. Using Krylov methods in the solution of largescale diﬀerentialalgebraic systems. J. pp.C. [21] P. W. 129–156. BIT. and L. C. Englewood Cliﬀs. in Proceedings of Internationaler Kongreß uber Anwendung der Mathematik in ¨ dem Ingenieurwissenschaften. Numer. [32] X. R. Programming. Tech. SIAM J. F. The convergence of an algorithm for solving sparse nonlinear systems. W. Brown. pp. Campbell. Brown. J. T. ed. E. 31 (1989). Kelley. A class of methods for solving nonlinear simultaneous equations. 1975.. Anal. Math. 10 (1989). SIAM J. Maths. D. Appl. Hindmarsh. SIAM J. A local convergence theory for combined inexact–Newton/ ﬁnite– diﬀerence projection methods. Center for Research in Scientiﬁc Computation. On the local and superlinear convergence of quasiNewton methods. Cavanaugh. Meyer. H. Tech. 19 (1965). and M. Math. Center for Research in Scientiﬁc Computation. 11 (1990). 40–91. 1987. PhD thesis. L. Brown and Y. [27] . pp. N. 407–434. Ipsen. 1996. and J. 297–330. Xue. [35] R. E. March 1995. Schnabel. N. J. pp. 4 (1994). NJ. Hindmarsh. Cambridge Tracts. PA. to appear. . Comput. Gropp. N. A. Hindmarsh. pp. Broyden. and C. North Carolina State University. Philadelphia. 1038–1051.. SIAM. Meyer. Kelley. Buy this book from SIAM at http://www. Comput. Comput. [31] R. Journal of Integral Equations and Applications. Weimar.Copyright ©1995 by the Society for Industrial and Applied Mathematics. 15 (1994). Numer.. Byrne. L. Zur Konvergenz einiger verfahren der konjugierten Richtungen. 10 (1973). Report CRSCTR9410.. Petzold. pp. [18] . in Proceedings of the International Workshop on the NavierStokes Equations. Optim. Cambridge.html. PrenticeHall. Cambridge Univ. 12 (1973). Broyden. N. Notes in Numerical Fluid Mechanics. [26] C. Byrd. to appear. and R. No. Convergence estimates for solution of integral equations with GMRES. D. Cai... Comput. North Carolina State University. [20] P.. More. NewtonKrylovSchwarz methods in CFD. Vieweg Verlag. Press. Math. Hybrid Krylov methods for nonlinear systems of equations.. Sci. pp. G. SIAM J. [30] I. Math. [34] S. A Multigrid Tutorial. 1970. [24] P. 14 in Classics in Applied Mathematics. [17] R. Busbridge. [33] S. Saad. Brent. University of Maryland. SIAM J.. 25 (1971). 450–481. C. I. 1973. D. 327–344. Philadelphia. Society for Industrial and Applied Mathematics. Inst. C. 223– 246. Sci. Applic. 63 (1994). pp. Keyes. [25] . [23] P. C. R. G. [19] W. The Mathematics of Radiative Transfer. Sci.com/SIAM/FR16. and Z. Braunschweig. and A. 285–294. [22] P. Nocedal. Briggs. [29] W. Anal. SIAM J. July 1994. Comput. pp. G. C. Representation of quasiNewton matrices and their use in limited memory methods.
Curtis. V.. CA. [54] T. D. SIAM J. Truncated Newton algorithms for largescale optimization. in Sparse Matrix Computations. eds. [48] J.. Kelley. Eisenstat. Philadelphia. J. Convergence acceleration for Newton’s method at singular points. Simoncini. Golub. G. 6 (1985). This electronic version is for personal use and may not be duplicated or distributed. 190–212. B. Steihaug. W. Anal. [43] T. December 1994. ´ [42] T. Domain Decomposition Methods. Society for Industrial and Applied Mathematics. Houston. Newton’s method at singular points I. 1976. Anal. pp. J.. and C. Steihaug. [47] A. Keller. 1990. [51] . Concus. and G. Kelley. R.. [39] . J. Daniel. The conjugate gradient method for linear and nonlinear operator equations. [55] R. Bunch and D. and D. SIAM J. [40] . Inexact Newton methods. F. 34 (1955). Proceedings of the Fourth International Symposium on Domain Decomposition Methods. pp. ed. Numer. Programming. SIAM J. 566–574. 296–314. T. 4 (1967). Numer. Proceedings of the Third International Symposium on Domain Decomposition Methods. 1969. Report TR/PA/94/18. Math. Meurant. 1989. New York. Society for Industrial and Applied Mathematics. 1988. More. [45] P. and C.. . pp. Math. Los Angeles.Copyright ©1995 by the Society for Industrial and Applied Mathematics. J. Chandrasekhar. J. Inst. [41] S. 66–70.. O’Leary. and T. P. Numer. P. 4. pp. Society for Industrial and Applied Mathematics. On the estimation of sparse Jacobian matrices. J. Comput. Tech. pp. [49] D. Numer. pp. Numer. SIAM J. SIAM J. BIBLIOGRAPHY 155 [36] F. Numer. [37] T. Frontiers in Appl. pp. Powell. Broyden’s method for a class of problems having singular Jacobian at the root. K. 20 (1983). Domain Decomposition Methods. in Constructive Aspects of the Fundamental Theorem of Algebra. [44] P. pp. Anal.. TX. 220–252. SIAM J. 117–119. [56] R. 1988. Philadelphia.. H. pp. Philadelphia. Convergence rates for Newton’s method at singular points. Anal. p. 19 (1982). M. 22 (1985).. Coleman and C. Glowinski. Estimation of sparse Jacobian matrices and graph coloring problems. PA. Widlund. R. Anal. J. [53] . PA. Dembo. Tong. Dekker. and O. S. A quasiminimal residual variant of the BiCGSTAB algorithm for nonsymmetric systems. Sci.. PA. H. Math. 42 (1983). Chan. Golub. T. Statist. [46] E. Block preconditioning for the conjugate gradient method. 219–229. 13 (1974). SIAM J. Chan. 26 (1983). The Nstep iteration procedures. eds. Appl. Academic Press. SIAM J. Moscow. Math. H. 1989. J. and J. ´ [38] T. January 14–16. Math. pp.. T. pp. A generalized conjugate gradient method for the numerical solution of elliptic partial diﬀerential equations. Craig. W. 187–209. VanLoan. 10–26. Concus. Philadelphia. eds. Henrici. [52] . Finding a zero by means of successive linear interpolation. 64– 73. Phys...com/SIAM/FR16. Periaux. Anal. [50] D. Handbook for Matrix Computations. Buy this book from SIAM at http://www. 147–154. 400–408.. Anal. E. pp. Dembo and T. ChaitinChatelin. Decker. 37–48. J. 1990... R. SIAM J. Numer. F. 1960.Proceedings of the Second International Symposium on Domain Decomposition Methods. 338. Numer. Decker and C. 20 (1983). 17 (1980). Rose. G.. Comput.ecsecurehost. 15 (1994). Reid. Szeto.. pp. Sci. 1991. Dover.. Sublinear convergence of the chord method at singular points. eds. 309–332. Gallopoulos. pp. Society for Industrial and Applied Mathematics. Coleman and J. PA. Radiative Transfer. 19 (1982). USSR. Domain Decomposition Methods. W.html. Is nonnormality a serious diﬃculty ?. No. CERFACS.
Math. SIAM Rev. Least change secant updates for quasiNewton methods. New Haven. [73] M. E. 19 (1977). pp. 1971. 760–778. 16–32. Bathe. [74] V. J. methods. 2 (1990). [68] W.. Choosing the forcing terms in an inexact Newton method.. pp. B. Optim. Dennis. Amsterdam. in Mathematical Programming Study 22: Mathematical programming at Oberwolfach II. Feng. [61] . Impact of Computing in Science and Engineering. Yale University. C. Conjugate gradient methods for indeﬁnite systems. 22 (1985). pp. Y. in Numerical Analysis Dundee 1975. Rall. pp. C. 443–459. SIAM J. J. Comput. A.. SIAM J. Sci.. pp. Programming. L. Engelman. [69] S. B. Anal. PhD thesis. Martinez. 1982. Numer. M. 21 (1984). [70] . and Dublin Philosophical Magazine and Journal of Science. [58] . pp. Dennis and J. 358–382. [63] . Fletcher. SIAM J. 62 (1993). 21 (1979). The London.. 427–459. Some devices for the solution of large sets of simultaneous linear equations (with an appendix on the reciprocation of partitioned matrices). Faber and T. no. G.. and A. pp. . Fast secant methods for the iterative solution of large nonsymmetric linear systems.Copyright ©1995 by the Society for Industrial and Applied Mathematics. 17 (1981). P. [59] J. pp. 393–422. J. pp. Dennis and R. 1976. R. Numerical Methods for Unconstrained Optimization and Nonlinear Equations. Freund. Local convergence analysis of tensor methods for nonlinear equations. Eisenstat and H. Leastchange sparse secant updates with inaccurate secant conditions. 73–89. Numer. C. Anal. Anal. Numer. Numer.ecsecurehost. pp. E. in Nonlinear Functional Analysis and Applications. Frank. Optim. B. Sci. [72] H. Anal. SIAM J.. pp. E. CT.. Watson. Manteuffel. Nonsymmetric Systems of Linear Equations. ed. Elman. Seventh Series. D. 156 BIBLIOGRAPHY [57] J. 46–89. 70–85. Numer. Philadelphia. The application of quasiNewton methods in ﬂuid mechanics. 4 (1994). F. QuasiNewton methods. J. ed. E. Statist.. Berlin. SpringerVerlag. pp.. Internat. motivation and theory. This electronic version is for personal use and may not be duplicated or distributed. Comput. 1996.html. North–Holland. 6 (1969). 352– 362. SIAM J. Math. Inaccuracy in quasiNewton methods: Local improvement theorems. 549–560. More. Schnabel. 4 (1994). G. and R.. ´ [60] J. Triangular decomposition methods for solving reducible nonlinear systems of equations. Duncan. SIAM Rev. 707–718. 244–276. Saad. A hybrid ChebyshevKrylov subspace algorithm for solving nonsymmetric systems of linear equations. A characterization of superlinear convergence and its application to quasiNewton methods.. Zhang. and P. Schnabel. 949–987. W. Elman. SIAM J. New York. E. Walker. Toward a uniﬁed convergence theory for Newtonlike methods. J. [64] J. Academic Press. Convergence theorems for least change secant update methods. Globally convergent inexact Newton methods.. Walker. [66] . Buy this book from SIAM at http://www. [67] P. pp. [76] R. Dennis and H. Strang. [75] D. Iterative Methods for Large.. pp. Sparse. Dennis. 16 in Classics in Applied Mathematics. and X. 28 (1974). SIAM J. On the Kantorovich hypothesis for Newton’s method. 425–472. 35 (1944). 18 (1981). [62] J. and K. Deuflhard. 17 (1996). SIAM J. Walter. pp. 840–855. pp. pp. 7 (1986). 493–507. Edinburgh. Comput. pp. Methods Engrg. [71] H. SIAM. Saylor. 1984. New York. 660–670. E. Necessary and suﬃcient conditions for the existence of a conjugate gradient method.com/SIAM/FR16. [65] . F.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
BIBLIOGRAPHY
157
[77] R. W. Freund, A transposefree quasiminimal residual algorithm for nonHermitian linear systems, SIAM J. Sci. Comput., 14 (1993), pp. 470–482. [78] R. W. Freund, G. H. Golub, and N. M. Nachtigal, Iterative solution of linear systems, Acta Numerica, 1 (1991), pp. 57–100. [79] R. W. Freund, M. H. Gutknecht, and N. M. Nachtigal, An implementation of the lookahead Lanczos algorithm for nonHermitian matrices, SIAM J. Sci. Comput., 14 (1993), pp. 137–158. [80] R. W. Freund and N. M. Nachtigal, QMR: a quasiminimal residual algorithm for nonhermitian linear systems, Numer. Math., 60 (1991), pp. 315–339. [81] , An implementation of the QMR method based on coupled twoterm recurrences, SIAM J. Sci. Comput., 15 (1994), pp. 313–337. [82] D. M. Gay, Some convergence properties of Broyden’s method, SIAM J. Numer. Anal., 16 (1979), pp. 623–630. [83] C. W. Gear, Numerical Initial Value Problems in Ordinary Diﬀerential Equations, PrenticeHall, Englewood Cliﬀs, NJ, 1971. [84] R. R. Gerber and F. T. Luk, A generalized Broyden’s method for solving simultaneous linear equations, SIAM J. Numer. Anal., 18 (1981), pp. 882–890. [85] P. E. Gill and W. Murray, QuasiNewton methods for unconstrained optimization, J. I. M. A., 9 (1972), pp. 91–108. [86] , Safeguarded steplength algorithms for optimization using descent methods, Tech. Report NAC 37, National Physical Laboratory Report, Teddington, England, 1974. [87] P. E. Gill, W. Murray, and M. H. Wright, Practical Optimization, Academic Press, London, 1981. [88] A. A. Goldstein, Constructive Real Analysis, Harper and Row, New York, 1967. [89] G. H. Golub and C. G. VanLoan, Matrix Computations, Johns Hopkins University Press, Baltimore, 1983. [90] A. Griewank, Analysis and modiﬁcation of Newton’s method at singularities, PhD thesis, Australian National University, 1981. [91] , Rates of convergence for secant methods on nonlinear problems in Hilbert space, in Numerical Analysis, Proceedings Guanajuato , Mexico 1984, Lecture Notes in Math., No, 1230, J. P. Hennart, ed., SpringerVerlag, Heidelberg, 1986, pp. 138– 157. [92] , The solution of boundary value problems by Broyden based secant methods, in Computational Techniques and Applications: CTAC 85, Proceedings of CTAC, Melbourne, August 1985, J. Noye and R. May, eds., North Holland, Amsterdam, 1986, pp. 309–321. [93] , On the iterative solution of diﬀerential and integral equations using secant updating techniques, in The State of the Art in Numerical Analysis, A. Iserles and M. J. D. Powell, eds., Clarendon Press, Oxford, 1987, pp. 299–324. [94] A. Griewank and G. F. Corliss, eds., Automatic Diﬀerentiation of Algorithms: Theory, Implementation, and Application, Society for Industrial and Applied Mathematics, Philadelphia, PA, 1991. [95] A. Griewank and P. L. Toint, Local convergence analysis for partitioned quasinewton updates, Numer. Math., 39 (1982), pp. 429–448. [96] , Partitioned variable metric methods for large sparse optimization problems, Numer. Math., 39 (1982), pp. 119–137. [97] W. A. Gruver and E. Sachs, Algorithmic Methods In Optimal Control, Pitman, London, 1980.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
158
BIBLIOGRAPHY
[98] M. H. Gutknecht, Variants of BICGSTAB for matrices with complex spectrum, SIAM J. Sci. Comput., 14 (1993), pp. 1020–1033. [99] W. Hackbusch, MultiGrid Methods and Applications, vol. 4 of Springer Series in Computational Mathematics, SpringerVerlag, New York, 1985. [100] , Multigrid methods of the second kind, in Multigrid Methods for Integral and Diﬀerential Equations, Oxford University Press, Oxford, 1985. [101] W. E. Hart and S. O. W. Soul, QuasiNewton methods for discretized nonlinear boundary problems, Journal of the Institute of Applied Mathematics, 11 (1973), pp. 351–359. [102] H. V. Henderson and S. R. Searle, On deriving the inverse of a sum of matrices, SIAM Rev., 23 (1981), pp. 53–60. [103] M. R. Hestenes and E. Steifel, Methods of conjugate gradient for solving linear systems, J. of Res. Nat. Bureau Standards, 49 (1952), pp. 409–436. [104] D. M. Hwang and C. T. Kelley, Convergence of Broyden’s method in Banach spaces, SIAM J. Optim., 2 (1992), pp. 505–532. [105] E. Isaacson and H. B. Keller, Analysis of numerical methods, John Wiley, New York, 1966. [106] L. Kantorovich and G. Akilov, Functional Analysis, 2nd ed., Pergamon Press, New York, 1982. [107] L. V. Kantorovich, Functional analysis and applied mathematics, Uspehi Mat. Nauk., 3 (1948), pp. 89–185. translation by C. Benster as Nat. Bur. Standards Report 1509. Washington, D. C., 1952. [108] H. B. Keller, Newton’s method under mild diﬀerentiability conditions, J. Comput. Sys. Sci, 4 (1970), pp. 15–28. [109] , Lectures on Numerical Methods in Bifurcation Theory, Tata Institute of Fundamental Research, Lectures on Mathematics and Physics, SpringerVerlag, New York, 1987. [110] C. T. Kelley, Solution of the Chandrasekhar Hequation by Newton’s method, J. Math. Phys., 21 (1980), pp. 1625–1628. [111] , A fast multilevel algorithm for integral equations, SIAM J. Numer. Anal., 32 (1995), pp. 501–513. [112] C. T. Kelley and E. W. Sachs, A quasiNewton method for elliptic boundary value problems, SIAM J. Numer. Anal., 24 (1987), pp. 516–531. [113] , A pointwise quasiNewton method for unconstrained optimal control problems, Numer. Math., 55 (1989), pp. 159–176. [114] , Fast algorithms for compact ﬁxed point problems with inexact function evaluations, SIAM J. Sci. Statist. Comput., 12 (1991), pp. 725–742. [115] , A new proof of superlinear convergence for Broyden’s method in Hilbert space, SIAM J. Optim., 1 (1991), pp. 146–150. [116] , Pointwise Broyden methods, SIAM J. Optim., 3 (1993), pp. 423–441. [117] , Multilevel algorithms for constrained compact ﬁxed point problems, SIAM J. Sci. Comput., 15 (1994), pp. 645–667. [118] C. T. Kelley and R. Suresh, A new acceleration method for Newton’s method at singular points, SIAM J. Numer. Anal., 20 (1983), pp. 1001–1009. [119] C. T. Kelley and Z. Q. Xue, Inexact Newton methods for singular problems, Optimization Methods and Software, 2 (1993), pp. 249–267. [120] , GMRES and integral operators, SIAM J. Sci. Comput., 17 (1996), pp. 217– 226. [121] T. Kerkhoven and Y. Saad, On acceleration methods for coupled nonlinear elliptic systems, Numerische Mathematik, 60 (1992), pp. 525–548.
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
Copyright ©1995 by the Society for Industrial and Applied Mathematics. This electronic version is for personal use and may not be duplicated or distributed.
BIBLIOGRAPHY
159
[122] C. Lanczos, Solution of linear equations by minimized iterations, J. Res. Natl. Bur. Stand., 49 (1952), pp. 33–53. [123] T. A. Manteuffel, Adaptive procedure for estimating parameters for the nonsymmetric Tchebychev iteration, Numer. Math., 31 (1978), pp. 183–208. [124] T. A. Manteuffel and S. Parter, Preconditioning and boundary conditions, SIAM J. Numer. Anal., 27 (1990), pp. 656–694. [125] E. S. Marwil, Convergence results for Schubert’s method for solving sparse nonlinear equations, SIAM J. Numer. Anal., (1979), pp. 588–604. [126] S. McCormick, Multilevel Adaptive Methods for Partial Diﬀerential Equations, Society for Industrial and Applied Mathematics, Philadelphia, PA, 1989. [127] J. A. Meijerink and H. A. van der Vorst, An iterative solution method for linear systems of which the coeﬃcient matrix is a symmetric Mmatrix, Math. Comput., 31 (1977), pp. 148–162. [128] C. D. Meyer, Matrix Analysis and Applied Linear Algebra, forthcoming. ´ [129] J. J. More, Recent developments in algorithms and software for trust region methods, in Mathematical Programming: The State of the Art, A. Bachem, M. Gr¨schel, and B. Korte, eds., SpringerVerlag, Berlin, 1983, pp. 258–287. o ´ [130] J. J. More and J. A. Trangenstein, On the global convergence of Broyden’s method, Math. Comput., 30 (1976), pp. 523–540. [131] T. E. Mott, Newton’s method and multiple roots, Amer. Math. Monthly, 64 (1957), pp. 635–638. [132] T. W. Mullikin, Some probability distributions for neutron transport in a half space, J. Appl. Probab., 5 (1968), pp. 357–374. [133] W. Murray and M. L. Overton, Steplength algorithms for minimizing a class of nondiﬀerentiable functions, Computing, 23 (1979), pp. 309–331. [134] N. M. Nachtigal, S. C. Reddy, and L. N. Trefethen, How fast are nonsymmetric matrix iterations?, SIAM J. Matrix Anal. Appl., 13 (1992), pp. 778– 795. [135] N. M. Nachtigal, L. Reichel, and L. N. Trefethen, A hybrid gmres algorithm for nonsymmetric linear systems, SIAM J. Matrix Anal. Appl., 13 (1992). [136] S. G. Nash, Preconditioning of truncated Newton methods, SIAM J. Sci. Statist. Comput., 6 (1985), pp. 599–616. [137] , Who was Raphson? (an answer). Electronic Posting to NADigest, v92n23, 1992. [138] J. L. Nazareth, Conjugate gradient methods less dependent on conjugacy, SIAM Rev., 28 (1986), pp. 501–512. [139] O. Nevanlinna, Convergence of Iterations for Linear Equations, Birkh¨user, a Basel, 1993. [140] I. Newton, The Mathematical Papers of Isaac Newton (7 volumes), D. T. Whiteside, ed., Cambridge University Press, Cambridge, 1967–1976. [141] B. Noble, Applied Linear Algebra, Prentice Hall, Englewood Cliﬀs, NJ, 1969. [142] J. Nocedal, Theory of algorithms for unconstrained optimization, Acta Numerica, 1 (1991), pp. 199–242. [143] D. P. O’Leary, Why Broyden’s nonsymmetric method terminates on linear equations, SIAM J. Optim., 4 (1995), pp. 231–235. [144] J. M. Ortega, Numerical Analysis A Second Course, Classics in Appl. Math., No. 3, Society for Industrial and Applied Mathematics, Philadelphia, PA, 1990. [145] J. M. Ortega and W. C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, 1970. [146] A. M. Ostrowski, Solution of Equations and Systems of Equations, Academic
Buy this book from SIAM at http://www.ecsecurehost.com/SIAM/FR16.html.
Statist. K. de Math. [153] M. The Symmetric Eigenvalue Problem. 437–445. [149] D. Rall. 1968.. at Cambridge. 865–881. Sci.. [162] H. Amsterdam. New York. SIAM Rev. A ﬂexible innerouter preconditioned GMRES algorithm. [160] J. 1980. 87–114. . pp. in Nonlinear Programming 2. J. Analysis aequationum universalis seu ad aequationes algebraicas resolvendas methodus generalis.html. Peaceman and H. Robinson. W. Tech. ILUT: A dual threshold incomplete LU factorization. Inverse iteration. W. Comput. 856–869. Math. 461–469. Computer Science Department. North Holand. Comput. Mangasarian. [161] T. 1973. 71–82. et expedita. Original in British Library... D. 1690. pp. North Holland. New York. eds. Powell. 6 (1890). pp. C. [147] B. E. R. Academic Press. pp. in Proc. 160 BIBLIOGRAPHY Press. 1–27. pp. Wilkinson. Appl. 1986. J. Stepleman et al.. N. 993–986. [150] G. B. London. [158] J. Math. Englewood Cliﬀs. J. Statist. Picard. A description of DASSL: a diﬀerential/algebraic system solver. Implementation of an adaptive algorithm Buy this book from SIAM at http://www. pp. 29 (1979). R. NATO Conf. Experiments on GramSchmidt orthogonalization. SIAM J.. Comput. Peters and J. Sci. ser 4. [154] . pp. 1983. Rheinboldt. Comput. Macmillan. Gordon and Breach. [169] P. 15 (1978). SIAM J. pp.. and Z. A hybrid method for nonlinear equations. [152] E. 1970. Parlett. Math. 325–328. R. 1960. 20 (1966). pp. L. Real Analysis. Saylor and D. [157] G. Reid. Amsterdam. 9 (1961). 23–37. On Newton’s method for singular problems. 11 (1955).. GMRES a generalized minimal residual algorithm for solving nonsymmetric linear systems. John Wiley. D.com/SIAM/FR16. [156] J. and S. Math. Saad. [159] W. New York. pp. 24 (1987). Convergence of the Newton process to multiple solutions. 155–169.. pp. 339–360. [148] B. [164] . Rivlin. 1992. SIAM J. Liu. Taylor. This electronic version is for personal use and may not be duplicated or distributed. [167] Y. [166] . 1975. Report 9238. New York. Anal. [151] L. Least squares solution of sparse systems of nonlinear equations by a modiﬁed Marquardt algorithm. Raphson. [165] .. A.. New York. Comput.. [168] E. R. deducta ac demonstrata. Numer. Numer. illconditioned equations and Newton’s method. Programming. Royden. The Chebyshev Polynomials. O.. Schultz. Petzold. John Wiley.ecsecurehost. University of Minnesota. M´moire sur la th´orie des ´quations aux d´riv´es partielles et la e e e e e e m´thode des approximations successives. Sachs. 7 (1986). 145–210. Practical use of polynomial preconditionings for the conjugate gradient method.. 28–41. A lookahead Lanczos algorithm for unsymmetric matrices. Math. 6 (1985). 105–124. S. 35 (1986). (1993). New York. SIAM J. Parlett. pp. The numerical solution of parabolic and elliptic diﬀerential equations. M. Indus. Soc. C. Numer. L. Rice. Numerical Analysis of Parametrized Nonlinear Equations. Rachford. pp.Copyright ©1995 by the Society for Industrial and Applied Mathematics. pp. 2nd ed. Saad and M. 1974. pp. N. Broyden’s method in Hilbert space. eds. Prentice Hall. [155] L. in Numerical Methods for Nonlinear Algebraic Equations. ex nova inﬁnitarum serierum doctrina. Smolarski. 65–68. NJ. Least squares polynomials in the complex plane and their use for solving nonsymmetric linear systems. J. SIAM J. R. pp. Convergence properties of a class of minimization algorithms. [163] Y. R. 44 (1985). Anal. July 1972. Sci. in Scientiﬁc Computing. Meyer. Reddien. H.
Varga. Newton’s method with a model trust region modiﬁcation. Software. Adjustment of an inverse matrix corresponding to changes in the elements of a given column or a given row of the original matrix (abstract). A. Schnabel and P. Statist. pp. U. Saylor. Math. Anal. Shamanskii. 14 (1978). ¨ ¨ [171] E. A hybrid ArnoldiFaber iterative method for nonsymmetric systems of linear equations. Sweet. A modiﬁcation of Newton’s method.. [172] L. Comput.. F. Ukran. pp. H.. pp. 154/156 (1991). SIAM J. 47–67.. Anal. Frank. 36–52. E. [179] D. pp. [178] K. Schroder. CRC Press. 124–127. Sci. pp.. Ann. 15 (1994). 21 (1950).. B. BIBLIOGRAPHY 161 for Richardson’s method. Anal. 19 (1982). N. The methods of cyclic reduction. pp. Comput. Linear Algebrra Appl.. 20 (1949). pp. S.) [175] A. SIAM J. Anal. R. 19 (1977). 28 (1988). A family of trustregionbased algorithms for unconstrained minimization with strong global convergence properties. Numer. L. Modiﬁcation of a quasiNewton method for nonlinear equations with sparse Jacobian.. p. [180] P. Sherman and W. Swarztrauber and R. 13 Buy this book from SIAM at http://www.S. CGS. Academic Press.. SIAM Rev. pp. pp. B. ACM Trans.. Sci. [187] . pp. Math. 19 (1967). Introduction to matrix computations. 1–22. Applied Mathematics Series. H. pp. 24 (1970). Morrison. 815–843. Statist. Numer. Uber unendlich viele Algorithmen zur Auﬂosung der Gleichungen. Statist. This electronic version is for personal use and may not be duplicated or distributed. Schnabel. Sorensen. FL. [188] P. Sci. Englewood Cliﬀs. Swarztrauber. SIAM J. 409–426. [184] G. E. 2 (1870). Comput. Toint. Boca Raton. 8 (1987). (In Russian. pp. Math. Math. Math. L. Numer. [191] H. 1964. [177] . Schultz. A. Comput. 213–239.html. 369–385.ecsecurehost. 317–365. Ann. pp. MATLAB Primer. 5 (1979).. J. pp. 31 (1977). Prentice Hall. pp. and R. van der Vorst. a fast Lanczostype solver for nonsymmetric linear systems. Adjustment of an inverse matrix corresponding to a change in one element of a given matrix. Comput. D. [176] J. Zh. [181] D. 10 (1989). Fourth Edition. 490–501. Alternating direction preconditioning for nonsymmetric systems of linear equations. . SIAM J. A. C. Statist. [190] J. 755–774. NJ. Stewart. SIAM J. [185] E. W. An optimum iterative method for solving any linear system with a square matrix. On sparse and symmetric matrix updating subject to a linear equation. K. 21 (1984). Numer. 49 (1958). BIT. pp.. SIAM J. Tensor methods for nonlinear equations. Starke and R.. Math. 1994. Iterative Methods for the Solution of Equations. [186] P.. [189] P. [174] V. Sonneveld. Sigmon. Stiefel. Starke. Traub. 352–364. 64 (1993). C.. 1973. [182] G. 954–961. National Bureau of Standards. Smolarski and P.. 27–30. Statist. Byrd. [173] G.com/SIAM/FR16. 621. New York. Comput. SIAM J. 199–209. N. Approximate cyclic reduction for solving Poisson’s equation. On Newtoniterative methods for the solution of systems of nonlinear equations. Schubert. pp. 22 (1985). Algorithm 541: Eﬃcient FORTRAN subprograms for the solution of elliptic partial diﬀerential equations. pp. 615–646. Ann. 133–138. Math. [170] R. Sci..Copyright ©1995 by the Society for Industrial and Applied Mathematics. Numer. Kernel polynomials in linear algebra and their numerical applications. Fourier analysis and the FACR algorithm for the discrete solution of Poisson’s equation on a rectangle. SIAM J. 163–178. BiCGSTAB: A fast and smoothly converging variant to BiCG for the solution of nonsymmetric systems. Mat. [183] G. Sherman.
815–825. 15 (1994). 279–286. Sci. Watson.. [202] L. pp... [192] H. Journal Comput. 109–118. S. Inverting modiﬁed matrices. Vuik. 1962. 6 (1994). IMA J. Comput. pp. Appl. Algorithm 652: HOMPACK: A suite of codes for globally convergent homotopy algorithms. T. Englewood Cliﬀs. [196] . Residual smoothing techniques for iterative methods.. Morgan. Numer. van der Vorst and C. pp. S. [201] T. Walker and L.. [194] E. Buy this book from SIAM at http://www. Numer. NJ. [197] H. A. Math. Ypma. Varga. 162 BIBLIOGRAPHY (1992). Prentice Hall. 281–310. 3 (1983). 1950. Woodbury. NJ. This electronic version is for personal use and may not be duplicated or distributed. [199] M. F. pp. 19 (1995). M. SIAM J. Wachspress. Zhou and H. Young. Math.. Academic Press. Software. and A. [193] R. pp. 9 (1989). Billups. F. C. The superlinear convergence behaviour of GMRES. Statistical Research Group. Alg. . J. New York. [200] D. L.com/SIAM/FR16. ACM Trans. Appl.html. Iterative Solution of Elliptic Systems and Applications to the Neutron Diﬀusion Equations of Reactor Physics. Residual smoothing and peak/plateau behavior in Krylov subspace methods. Walker. Statist. Sci. Anal. 1971. The eﬀect of rounding errors on Newtonlike methods. Applied Numer.Copyright ©1995 by the Society for Industrial and Applied Mathematics. pp. Lin. P. Implementation of the GMRES method using Householder transformations. A simpler GMRES. pp. Walker. Zhou. J. Memorandum Report 42. 1966. 297–312. Math. 631–644. A. Iterative Solution of Large Linear Systems. [195] H. pp. 571–581.ecsecurehost. Matrix Iterative Analysis. Prentice Hall. 327–341. F. Englewood Cliﬀs. 48 (1993). SIAM J. Comput. Princeton NJ. [198] L. 13 (1987).
Index Aconjugacy.com/SIAM/FR16. 110 Biconjugacy. 47 Biorthogonality. 47 BiCGSTAB. 34 Chord method. 49 Chandrasekhar Hequation. 145 163 Cauchy sequence. 6 BiCG. 74 convergence. . 47 Bounded deterioration. 126 Brsola algorithm. 87 solution Broyden’s method. 87 NewtonGMRES. 47 BiConjugate Gradient. 137 Arnoldi process. 133 Brsol algorithm. 47 algorithm. 67 ConvectionDiﬀusion Equation Broyden’s method. 130 Convergence theorem Broyden’s method linear equations.Copyright ©1995 by the Society for Industrial and Applied Mathematics. 74 algorithm. 118. 120 implementation. 123 limited memory. 12 use of residual polynomials. 37 Backtracking. 50 ﬁnite diﬀerence. 130 Newton–Armijo. 12 Approximate inverse.ecsecurehost. 120 Broyden’s method. 67 CGNE. 77 Armijo rule. 49 Conjugate transpose. 123 restarting. 67 Contraction Mapping Theorem. This electronic version is for personal use and may not be duplicated or distributed. 132 Banach Lemma. 107 Characteristic polynomial. 50 algorithm. 3 Conjugate Gradient algorithm. 6. 146 NewtonGMRES. 47 algorithm. 119 Buy this book from SIAM at http://www. 25 CGNR. 121 Breakdown. 48 algorithm. 14 minimization property. 48 algorithm.html. 128 Newton’s method. 22 ﬁnite termination. 136 Bad Broyden method. 123 sparse. 34 Contraction mapping. 14 Conjugate Gradient Squared. 47 Broyden sequence. 118 nonlinear equations. 20 Anorm. 76 Condition number. 108 solution Broyden’s method. 25 CGS. 113 convergence linear equations.
67 ﬁnite diﬀerence Newton. 77 contraction mapping. 36 Limited memory. 95 Inner iteration. 79 DASPK. 99 Newton’s method. 80 Jacobian. 96. 104 Gauss–Seidel algorithm. 81 inexact Newton’s method. 37 INDEX loss of orthogonality. 40. 8 preconditioning. This electronic version is for personal use and may not be duplicated or distributed. 142 Line search. 40. 141 NewtonGMRES. 3 Inexact Newton method. 34 Diﬀerence approximation.com/SIAM/FR16. 43 Globally convergent algorithm. 135 GMRES algorithm. 100 Inverse power method. 83 Krylov subspace. 91 o High order methods. 11 Left preconditioning. 136 Linear PDE BiCGSTAB. 123 Line Search parabolic threepoint. 46 algorithm. 115 Diagonalizable matrix. 80 directional derivative. 128 Newton’s method. 39. 5 Iteration statistics. 9 Jacobi preconditioning. 127 CGNR. 110 Dennis–Mor´ Condition. 87 NewtonGMRES. 66 Forcing term. 66 Fixedpoint iteration. 9 Givens rotation.ecsecurehost. 146 Iteration matrix. 60 Fixed point. 94 Inverse secant equation. 142 polynomial. 107 H¨lder Continuity. 34 GMRES(m). 71 Newton–Armijo. 80 Elementary Jordan block. 57 Broyden’s method. 132 Inverse Tangent Newton–Armijo. 82 Shamanskii method. 46 use of residual polynomials. 114 e superlinear convergence. 34 Diagonalizing transformation. 46 loss of orthogonality. 41 modiﬁed. 55 GMRES.Copyright ©1995 by the Society for Industrial and Applied Mathematics. 135. 36 Induced matrix norm deﬁnition. 24 Jordan block. 123 chord method. 103 secant method. 60 Kantorovich Theorem. . 143 twopoint. 87 solution Broyden’s method. 80 nonlinearity. 8 convergence result. 33 restarts. 91 Jacobi algorithm. 41 minimization property. 33 Gram–Schmidt process. 34 ﬁnite diﬀerences. 95 choice of. 101 ﬁnite termination. 164 nonlinear equations. 54 Buy this book from SIAM at http://www. 45 diagonalizable matrices.html. 79 choice of h. 40 Hequation. 76. 78 Hybrid algorithms. 39.
9 Reorthogonalization. 7 Modiﬁed Bratu Problem. 79 solution Broyden’s method. 66 Right preconditioning. 92 termination. 78 convergence. 72 relative nonlinear. 51 Quasiresidual norm. 52 QuasiNewton methods. 132 Newton direction. 127 TFQMR. 91 rorder. 133 Search direction. 23 Preconditioning conjugate gradient. 104 Picard iteration. 105 Newtoniterative method. 4 nonlinear as error estimate. 71 algorithm. 113 Quasiresidual. This electronic version is for personal use and may not be duplicated or distributed. INDEX 165 algorithm. 3 Matrix splitting. 11 as error estimate. 67 Local improvement. 36 Jacobi. 86 Nsola algorithm. 136 Newton’s method. 74 convergence rate.html. 52 Relaxation parameter. 72 NewtonGMRES. 103 ﬁnite diﬀerences. 24. 24. 82 convergence. 33 application to TFQMR. 66 Poisson solver. . 101 safeguarding. 14 Richardson iteration.Copyright ©1995 by the Society for Industrial and Applied Mathematics. 82 qorder. 100 Over solving. 36. 36 incomplete factorization. 100 Matrix norm. 6 Quasiminimization. 36 Richardson iteration. 27 for GMRES. 105 Outer iteration. 14 application to CGNR.com/SIAM/FR16. 81 stagnation. 113 Secant method. 101 convergence rates. 138 Nsolgm algorithm. 24. 41 Residual linear. 81 inexact. 24. 19 GMRES. 5 nonlinear. 54 polynomial. 75 inexact. 72 Residual polynomials application to CG. 27 preconditioner for CG. 100 Normal matrix. 136 Secant equation. 34 Nsol algorithm. 36 Richardson iteration. 95 monotone convergence.ecsecurehost. 27. 11 Preconditioned CG Buy this book from SIAM at http://www. 78 algorithm. 52 deﬁnition. 113 Shamanskii method. 36. 25 application to GMRES. 58 Lipschitz continuous deﬁnition. 24. 54 Polynomial preconditioning. 7 Positive deﬁnite matrix. 91 Secant update. 24 Poisson solver. 71 ﬁnite diﬀerences. 36 Schubert algorithm.
98 INDEX Buy this book from SIAM at http://www. 66 Suﬃcient decrease. 24 Successive substitution. 124. 52 TFQMR. 52 Weighted norm. 91 Sherman–Morrison formula. 38 linear equations. . 5 as preconditioners. 12 Spectral radius. 9 preconditioning. 9 SPD matrix.com/SIAM/FR16. 53 ﬁnite diﬀerence. 166 ﬁnite diﬀerences. 136 SOR algorithm. 4 nonlinear equations. 75 Stationary iterative methods. 16 GMRES.ecsecurehost. This electronic version is for personal use and may not be duplicated or distributed. 72 TFQMR. 132 Simple decrease. 9 Symmetric matrix. 110 weights.Copyright ©1995 by the Society for Industrial and Applied Mathematics. 11 Symmetric positive deﬁnite matrix. 132 Sherman–Morrison–Woodbury formula. 12 Termination CG.html. 137 Symmetric Gauss–Seidel algorithm. 35. 7 Stagnation. 94 Newton’s method. 51 algorithm.
This action might not be possible to undo. Are you sure you want to continue?
We've moved you to where you read on your other device.
Get the full title to continue reading from where you left off, or restart the preview.