Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods^1

Richard Barrett^2, Michael Berry^3, Tony F. Chan^4, James Demmel^5, June M. Donato^6, Jack Dongarra^{3,2}, Victor Eijkhout^7, Roldan Pozo^8, Charles Romine^9, and Henk Van der Vorst^10
This document is the electronic version of the 2nd edition of the Templates book,
which is available for purchase from the Society for Industrial and Applied
Mathematics (http://www.siam.org/books).
1. This work was supported in part by DARPA and ARO under contract number DAAL03-91-C-0047, the National Science Foundation Science and Technology Center Cooperative Agreement No. CCR-8809615, the Applied Mathematical Sciences subprogram of the Office of Energy Research, U.S. Department of Energy, under Contract DE-AC05-84OR21400, and the Stichting Nationale Computer Faciliteit (NCF) by Grant CRG 92.03.
2. Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN 37830-6173.
3. Department of Computer Science, University of Tennessee, Knoxville, TN 37996.
4. Applied Mathematics Department, University of California, Los Angeles, CA 90024-1555.
5. Computer Science Division and Mathematics Department, University of California, Berkeley, CA 94720.
6. Science Applications International Corporation, Oak Ridge, TN 37831.
7. Texas Advanced Computing Center, The University of Texas at Austin, Austin, TX 78758.
8. National Institute of Standards and Technology, Gaithersburg, MD.
9. Office of Science and Technology Policy, Executive Office of the President.
10. Department of Mathematics, Utrecht University, Utrecht, the Netherlands.
How to Use This Book
We have divided this book into five main chapters. Chapter 1 gives the motivation for this book
and the use of templates.
Chapter 2 describes stationary and nonstationary iterative methods. In this chapter we present
both historical development and state-of-the-art methods for solving some of the most challenging
computational problems facing researchers.
Chapter 3 focuses on preconditioners. Many iterative methods depend in part on precondition-
ers to improve performance and ensure fast convergence.
Chapter 4 provides a glimpse of issues related to the use of iterative methods. This chapter,
like the preceding, is especially recommended for the experienced user who wishes to have further
guidelines for tailoring a specific code to a particular machine. It includes information on complex
systems, stopping criteria, data storage formats, and parallelism.
Chapter 5 includes overviews of related topics such as the close connection between the Lanc-
zos algorithm and the Conjugate Gradient algorithm, block iterative methods, red/black orderings,
domain decomposition methods, multigrid-like methods, and row-projection schemes.
The Appendices contain information on how the templates and BLAS software can be obtained.
A glossary of important terms used in the book is also provided.
The field of iterative methods for solving systems of linear equations is in constant flux, with
new methods and approaches continually being created, modified, tuned, and some eventually
discarded. We expect the material in this book to undergo changes from time to time as some
of these new approaches mature and become the state-of-the-art. Therefore, we plan to update
the material included in this book periodically for future editions. We welcome your comments
and criticisms of this work to help us in that updating process. Please send your comments and
questions by email to templates@cs.utk.edu.
List of Symbols
A, ..., Z           matrices
a, ..., z           vectors
α, β, ..., ω        scalars
A^T                 matrix transpose
A^H                 conjugate transpose (Hermitian) of A
A^{-1}              matrix inverse
A^{-T}              the inverse of A^T
a_{i,j}             matrix element
a_{.,j}             jth matrix column
A_{i,j}             matrix subblock
a_i                 vector element
u_x, u_{xx}         first, second derivative with respect to x
(x, y), x^T y       vector dot product (inner product)
x_j^{(i)}           jth component of vector x in the ith iteration
diag(A)             diagonal of matrix A
diag(α, β, ...)     diagonal matrix constructed from scalars α, β, ...
span(a, b, ...)     spanning space of vectors a, b, ...
R                   set of real numbers
R^n                 real n-space
||x||_2             2-norm
||x||_p             p-norm
||x||_A             the "A-norm", defined as (Ax, x)^{1/2}
λ_max(A), λ_min(A)  eigenvalues of A with maximum (resp. minimum) modulus
σ_max(A), σ_min(A)  largest and smallest singular values of A
κ_2(A)              spectral condition number of matrix A
L                   linear operator
ᾱ                   complex conjugate of the scalar α
max{S}              maximum value in set S
min{S}              minimum value in set S
Σ                   summation
O(·)                "big-oh" asymptotic bound

Conventions Used in this Book

D                   diagonal matrix
L                   lower triangular matrix
U                   upper triangular matrix
Q                   orthogonal matrix
M                   preconditioner
I, I_{n×n}          n × n identity matrix
x̂                   typically, the exact solution to Ax = b
h                   discretization mesh width
Acknowledgements
The authors gratefully acknowledge the valuable assistance of many people who commented on
preliminary drafts of this book. In particular, we thank Loyce Adams, Bill Coughran, Matthew
Fishler, Peter Forsyth, Roland Freund, Gene Golub, Eric Grosse, Mark Jones, David Kincaid,
Steve Lee, Tarek Mathew, Noël Nachtigal, Jim Ortega, and David Young for their insightful com-
ments. We also thank Geoffrey Fox for initial discussions on the concept of templates, and Karin
Remington for designing the front cover.
This work was supported in part by DARPA and ARO under contract number DAAL03-91-
C-0047, the National Science Foundation Science and Technology Center Cooperative Agreement
No. CCR-8809615, the Applied Mathematical Sciences subprogram of the Office of Energy Re-
search, U.S. Department of Energy, under Contract DE-AC05-84OR21400, and the Stichting Na-
tionale Computer Faciliteit (NCF) by Grant CRG 92.03.
Contents
List of Symbols iv
List of Figures ix
1 Introduction 1
1.1 Why Use Templates? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 What Methods Are Covered? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
2 Iterative Methods 5
2.1 Overview of the Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2 Stationary Iterative Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.2.1 The Jacobi Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.2.2 The Gauss-Seidel Method . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2.3 The Successive Overrelaxation Method . . . . . . . . . . . . . . . . . . . 9
2.2.4 The Symmetric Successive Overrelaxation Method . . . . . . . . . . . . . 11
2.2.5 Notes and References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.3 Nonstationary Iterative Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3.1 Conjugate Gradient Method (CG) . . . . . . . . . . . . . . . . . . . . . . 12
2.3.2 MINRES and SYMMLQ . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.3.3 CG on the Normal Equations, CGNE and CGNR . . . . . . . . . . . . . . 16
2.3.4 Generalized Minimal Residual (GMRES) . . . . . . . . . . . . . . . . . . 17
2.3.5 BiConjugate Gradient (BiCG) . . . . . . . . . . . . . . . . . . . . . . . . 19
2.3.6 Quasi-Minimal Residual (QMR) . . . . . . . . . . . . . . . . . . . . . . . 20
2.3.7 Conjugate Gradient Squared Method (CGS) . . . . . . . . . . . . . . . . . 21
2.3.8 BiConjugate Gradient Stabilized (Bi-CGSTAB) . . . . . . . . . . . . . . . 24
2.3.9 Chebyshev Iteration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.4 Computational Aspects of the Methods . . . . . . . . . . . . . . . . . . . . . . . . 26
2.5 A short history of Krylov methods . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.6 Survey of recent Krylov methods . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3 Preconditioners 35
3.1 The why and how . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.1.1 Cost trade-off . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.1.2 Left and right preconditioning . . . . . . . . . . . . . . . . . . . . . . . . 36
3.2 Jacobi Preconditioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.2.1 Block Jacobi Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.2.2 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.3 SSOR preconditioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.4 Incomplete Factorization Preconditioners . . . . . . . . . . . . . . . . . . . . . . 38
3.4.1 Creating an incomplete factorization . . . . . . . . . . . . . . . . . . . . . 38
3.4.2 Point incomplete factorizations . . . . . . . . . . . . . . . . . . . . . . . . 39
3.4.3 Block factorization methods . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.4.4 Blocking over systems of partial differential equations . . . . . . . . . . . 46
3.4.5 Incomplete LQ factorizations . . . . . . . . . . . . . . . . . . . . . . . . 46
3.5 Polynomial preconditioners . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.6 Other preconditioners . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.6.1 Preconditioning by the symmetric part . . . . . . . . . . . . . . . . . . . . 47
3.6.2 The use of fast solvers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.6.3 Alternating Direction Implicit methods . . . . . . . . . . . . . . . . . . . 48
4 Related Issues 51
4.1 Complex Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.2 Stopping Criteria . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.2.1 More Details about Stopping Criteria . . . . . . . . . . . . . . . . . . . . 52
4.2.2 When r^{(i)} or ||r^{(i)}|| is not readily available . . . . . . . . . . . . . . . . . 55
4.2.3 Estimating ||A^{-1}|| . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.2.4 Stopping when progress is no longer being made . . . . . . . . . . . . . . 56
4.2.5 Accounting for floating point errors . . . . . . . . . . . . . . . . . . . . . 56
4.3 Data Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.3.1 Survey of Sparse Matrix Storage Formats . . . . . . . . . . . . . . . . . . 57
4.3.2 Matrix vector products . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.3.3 Sparse Incomplete Factorizations . . . . . . . . . . . . . . . . . . . . . . 63
4.4 Parallelism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.4.1 Inner products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.4.2 Vector updates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.4.3 Matrix-vector products . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.4.4 Preconditioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.4.5 Wavefronts in the Gauss-Seidel and Conjugate Gradient methods . . . . . 70
4.4.6 Blocked operations in the GMRES method . . . . . . . . . . . . . . . . . 71
5 Remaining topics 73
5.1 The Lanczos Connection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.2 Block and s-step Iterative Methods . . . . . . . . . . . . . . . . . . . . . . . . . . 74
5.3 Reduced System Preconditioning . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
5.4 Domain Decomposition Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.4.1 Overlapping Subdomain Methods . . . . . . . . . . . . . . . . . . . . . . 76
5.4.2 Non-overlapping Subdomain Methods . . . . . . . . . . . . . . . . . . . . 77
5.4.3 Further Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
5.5 Multigrid Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
5.6 Row Projection Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
A Obtaining the Software 83
B Overview of the BLAS 85
C Glossary 87
C.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
D Flowchart of iterative methods 107
List of Figures
2.1 The Jacobi Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2 The Gauss-Seidel Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.3 The SOR Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.4 The SSOR Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.5 The Preconditioned Conjugate Gradient Method . . . . . . . . . . . . . . . . . . . 13
2.6 The Preconditioned GMRES(m) Method . . . . . . . . . . . . . . . . . . . . . . 18
2.7 The Preconditioned BiConjugate Gradient Method . . . . . . . . . . . . . . . . . 20
2.8 The Preconditioned Quasi Minimal Residual Method without Look-ahead . . . . . 22
2.9 The Preconditioned Conjugate Gradient Squared Method . . . . . . . . . . . . . . 23
2.10 The Preconditioned BiConjugate Gradient Stabilized Method . . . . . . . . . . . . 24
2.11 The Preconditioned Chebyshev Method . . . . . . . . . . . . . . . . . . . . . . . 26
3.1 Preconditioner solve of a system Mx = y, with M = LU . . . . . . . . . . . . . 39
3.2 Preconditioner solve of a system Mx = y, with M = (D + L)D^{-1}(D + U) =
      (D + L)(I + D^{-1}U). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.3 Construction of a D-ILU incomplete factorization preconditioner, storing the in-
      verses of the pivots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.4 Wavefront solution of (D + L)x = u from a central difference problem on a domain
      of n × n points. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.5 Preconditioning step algorithm for a Neumann expansion M^{(p)} ≈ M^{-1} of an
      incomplete factorization M = (I + L)D(I + U). . . . . . . . . . . . . . . . . . . 43
3.6 Block version of a D-ILU factorization . . . . . . . . . . . . . . . . . . . . . . . 44
3.7 Algorithm for approximating the inverse of a banded matrix . . . . . . . . . . . . 45
3.8 Incomplete block factorization of a block tridiagonal matrix . . . . . . . . . . . . 45
4.1 Profile of a nonsymmetric skyline or variable-band matrix. . . . . . . . . . . . . . 61
4.2 A rearrangement of Conjugate Gradient for parallelism . . . . . . . . . . . . . . . 68
Chapter 1
Introduction
Which of the following statements is true?
• Users want “black box” software that they can use with complete confidence for general
problem classes without having to understand the fine algorithmic details.
• Users want to be able to tune data structures for a particular application, even if the software
is not as reliable as that provided for general methods.
It turns out both are true, for different groups of users.
Traditionally, users have asked for and been provided with black box software in the form of
mathematical libraries such as LAPACK, LINPACK, NAG, and IMSL. More recently, the high-
performance community has discovered that they must write custom software for their problem.
Their reasons include inadequate functionality of existing software libraries, data structures that
are not natural or convenient for a particular problem, and overly general software that sacrifices
too much performance when applied to a special case of interest.
Can we meet the needs of both groups of users? We believe we can. Accordingly, in this book,
we introduce the use of templates. A template is a description of a general algorithm rather than the
executable object code or the source code more commonly found in a conventional software library.
Nevertheless, although templates are general descriptions of key algorithms, they offer whatever
degree of customization the user may desire. For example, they can be configured for the specific
data structure of a problem or for the specific computing system on which the problem is to run.
We focus on the use of iterative methods for solving large sparse systems of linear equations.
Many methods exist for solving such problems. The trick is to find the most effective method
for the problem at hand. Unfortunately, a method that works well for one problem type may not
work as well for another. Indeed, it may not work at all.
Thus, besides providing templates, we suggest how to choose and implement an effective
method, and how to specialize a method to specific matrix types. We restrict ourselves to iter-
ative methods, which work by repeatedly improving an approximate solution until it is accurate
enough. These methods access the coefficient matrix A of the linear system only via the matrix-
vector product y = Ax (and perhaps z = A^T x). Thus the user need only supply a subroutine
for computing y (and perhaps z) given x, which permits full exploitation of the sparsity or other
special structure of A.
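As a concrete illustration of this interface (a sketch assuming NumPy and SciPy, not one of the book's templates), the fragment below applies SciPy's conjugate gradient routine to a matrix that is never stored: only a subroutine computing y = Ax is supplied, and the problem size and 1-D Laplacian stencil are illustrative choices.

import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

n = 1000

def matvec(x):
    # y = A x for the 1-D Laplacian stencil [-1, 2, -1]; A itself is never stored
    y = 2.0 * x
    y[:-1] -= x[1:]
    y[1:] -= x[:-1]
    return y

A = LinearOperator((n, n), matvec=matvec)
b = np.ones(n)
x, info = cg(A, b)                                 # info == 0 signals convergence
print(info, np.linalg.norm(matvec(x) - b))

Any other sparse or structured operator can be plugged in the same way, which is exactly the freedom the templates are meant to preserve.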
We believe that after reading this book, applications developers will be able to use templates to
get their program running on a parallel machine quickly. Nonspecialists will know how to choose
and implement an approach to solve a particular problem. Specialists will be able to assemble
and modify their codes—without having to make the huge investment that has, up to now, been
required to tune large-scale applications for each particular machine. Finally, we hope that all users
will gain a better understanding of the algorithms employed. While education has not been one of
the traditional goals of mathematical software, we believe that our approach will go a long way in
providing such a valuable service.
1.1 Why Use Templates?
Templates offer three significant advantages. First, templates are general and reusable. Thus, they
can simplify ports to diverse machines. This feature is important given the diversity of parallel
architectures.
Second, templates exploit the expertise of two distinct groups. The expert numerical analyst
creates a template reflecting in-depth knowledge of a specific numerical technique. The compu-
tational scientist then provides “value-added” capability to the general template description, cus-
tomizing it for specific contexts or applications needs.
And third, templates are not language specific. Rather, they are displayed in an Algol-like
structure, which is readily translatable into the target language such as FORTRAN (with the use
of the Basic Linear Algebra Subprograms, or BLAS, whenever possible) and C. By using these
familiar styles, we believe that the users will have trust in the algorithms. We also hope that users
will gain a better understanding of numerical techniques and parallel programming.
For each template, we provide some or all of the following:
• a mathematical description of the flow of the iteration;
• discussion of convergence and stopping criteria;
• suggestions for applying a method to special matrix types (e.g., banded systems);
• advice for tuning (for example, which preconditioners are applicable and which are not);
• tips on parallel implementations; and
• hints as to when to use a method, and why.
For each of the templates, the following can be obtained via electronic mail.
• a MATLAB implementation based on dense matrices;
• a FORTRAN-77 program with calls to BLAS (for a discussion of BLAS as building blocks, see [68, 69, 70, 143] and LAPACK routines [3]; also see Appendix B).
See Appendix A for details.
1.2 What Methods Are Covered?
Many iterative methods have been developed and it is impossible to cover them all. We chose
the methods below either because they illustrate the historical development of iterative methods,
or because they represent the current state-of-the-art for solving large sparse linear systems. The
methods we discuss are:
1. Jacobi
2. Gauss-Seidel
3. Successive Over-Relaxation (SOR)
4. Symmetric Successive Over-Relaxation (SSOR)
5. Conjugate Gradient (CG)
6. Minimal Residual (MINRES) and Symmetric LQ (SYMMLQ)
7. Conjugate Gradients on the Normal Equations (CGNE and CGNR)
8. Generalized Minimal Residual (GMRES)
9. Biconjugate Gradient (BiCG)
10. Quasi-Minimal Residual (QMR)
11. Conjugate Gradient Squared (CGS)
12. Biconjugate Gradient Stabilized (Bi-CGSTAB)
13. Chebyshev Iteration
For each method we present a general description, including a discussion of the history of the
method and numerous references to the literature. We also give the mathematical conditions for
selecting a given method.
We do not intend to write a “cookbook”, and have deliberately avoided the words “numerical
recipes”, because these phrases imply that our algorithms can be used blindly without knowledge
of the system of equations. The state of the art in iterative methods does not permit this: some
knowledge about the linear system is needed to guarantee convergence of these algorithms, and
generally the more that is known the more the algorithm can be tuned. Thus, we have chosen
to present an algorithmic outline, with guidelines for choosing a method and implementing it on
particular kinds of high-performance machines. We also discuss the use of preconditioners and
relevant data storage issues.
Chapter 2
Iterative Methods
The term “iterative method” refers to a wide range of techniques that use successive approximations
to obtain more accurate solutions to a linear system at each step. In this book we will cover two
types of iterative methods. Stationary methods are older, simpler to understand and implement, but
usually not as effective. Nonstationary methods are a relatively recent development; their analysis
is usually harder to understand, but they can be highly effective. The nonstationary methods we
present are based on the idea of sequences of orthogonal vectors. (An exception is the Chebyshev
iteration method, which is based on orthogonal polynomials.)
The rate at which an iterative method converges depends greatly on the spectrum of the co-
efficient matrix. Hence, iterative methods usually involve a second matrix that transforms the
coefficient matrix into one with a more favorable spectrum. The transformation matrix is called
a preconditioner. A good preconditioner improves the convergence of the iterative method, suffi-
ciently to overcome the extra cost of constructing and applying the preconditioner. Indeed, without
a preconditioner the iterative method may even fail to converge.
2.1 Overview of the Methods
Below are short descriptions of each of the methods to be discussed, along with brief notes on the
classification of the methods in terms of the class of matrices for which they are most appropriate.
In later sections of this chapter more detailed descriptions of these methods are given.
• Stationary Methods
– Jacobi.
The Jacobi method is based on solving for every variable locally with respect to the
other variables; one iteration of the method corresponds to solving for every variable
once. The resulting method is easy to understand and implement, but convergence is
slow.
– Gauss-Seidel.
The Gauss-Seidel method is like the Jacobi method, except that it uses updated values as
soon as they are available. In general, if the Jacobi method converges, the Gauss-Seidel
method will converge faster than the Jacobi method, though still relatively slowly.
– SOR.
Successive Overrelaxation (SOR) can be derived from the Gauss-Seidel method by
introducing an extrapolation parameter ω. For the optimal choice of ω, SOR may con-
verge faster than Gauss-Seidel by an order of magnitude.
– SSOR.
Symmetric Successive Overrelaxation (SSOR) has no advantage over SOR as a stand-
alone iterative method; however, it is useful as a preconditioner for nonstationary meth-
ods.
• Nonstationary Methods
– Conjugate Gradient (CG).
The conjugate gradient method derives its name from the fact that it generates a se-
quence of conjugate (or orthogonal) vectors. These vectors are the residuals of the iter-
ates. They are also the gradients of a quadratic functional, the minimization of which
is equivalent to solving the linear system. CG is an extremely effective method when
the coefficient matrix is symmetric positive definite, since storage for only a limited
number of vectors is required.
– Minimum Residual (MINRES) and Symmetric LQ (SYMMLQ).
These methods are computational alternatives for CG for coefficient matrices that are
symmetric but possibly indefinite. SYMMLQ will generate the same solution iterates
as CG if the coefficient matrix is symmetric positive definite.
– Conjugate Gradient on the Normal Equations: CGNE and CGNR.
These methods are based on the application of the CG method to one of two forms of
the normal equations for Ax = b. CGNE solves the system (AA^T)y = b for y and then
computes the solution x = A^T y. CGNR solves (A^T A)x = b̃ for the solution vector
x where b̃ = A^T b. When the coefficient matrix A is nonsymmetric and nonsingular,
the normal equations matrices AA^T and A^T A will be symmetric and positive definite,
and hence CG can be applied. The convergence may be slow, since the spectrum of the
normal equations matrices will be less favorable than the spectrum of A.
– Generalized Minimal Residual (GMRES).
The Generalized Minimal Residual method computes a sequence of orthogonal vectors
(like MINRES), and combines these through a least-squares solve and update. How-
ever, unlike MINRES (and CG) it requires storing the whole sequence, so that a large
amount of storage is needed. For this reason, restarted versions of this method are used.
In restarted versions, computation and storage costs are limited by specifying a fixed
number of vectors to be generated. This method is useful for general nonsymmetric
matrices.
– BiConjugate Gradient (BiCG).
The Biconjugate Gradient method generates two CG-like sequences of vectors, one
based on a system with the original coefficient matrix A, and one on A^T. Instead of
orthogonalizing each sequence, they are made mutually orthogonal, or “bi-orthogonal”.
This method, like CG, uses limited storage. It is useful when the matrix is nonsymmet-
ric and nonsingular; however, convergence may be irregular, and there is a possibility
that the method will break down. BiCG requires a multiplication with the coefficient
matrix and with its transpose at each iteration.
– Quasi-Minimal Residual (QMR).
The Quasi-Minimal Residual method applies a least-squares solve and update to the
BiCG residuals, thereby smoothing out the irregular convergence behavior of BiCG.
Also, QMR largely avoids the breakdown that can occur in BiCG. On the other hand,
it does not effect a true minimization of either the error or the residual, and while it
converges smoothly, it does not essentially improve on the BiCG.
– Conjugate Gradient Squared (CGS).
The Conjugate Gradient Squared method is a variant of BiCG that applies the updating
operations for the A-sequence and the A^T-sequence both to the same vectors. Ideally,
this would double the convergence rate, but in practice convergence may be much more
irregular than for BiCG. A practical advantage is that the method does not need the
multiplications with the transpose of the coefficient matrix.
– Biconjugate Gradient Stabilized (Bi-CGSTAB).
The Biconjugate Gradient Stabilized method is a variant of BiCG, like CGS, but using
different updates for the A^T-sequence in order to obtain smoother convergence than
CGS.
– Chebyshev Iteration.
The Chebyshev Iteration recursively determines polynomials with coefficients chosen
to minimize the norm of the residual in a min-max sense. The coefficient matrix must
be positive definite and knowledge of the extremal eigenvalues is required. This method
has the advantage of requiring no inner products.
2.2 Stationary Iterative Methods
Iterative methods that can be expressed in the simple form

   x^{(k)} = B x^{(k−1)} + c                                                      (2.1)

(where neither B nor c depend upon the iteration count k) are called stationary iterative methods.
In this section, we present the four main stationary iterative methods: the Jacobi method, the
Gauss-Seidel method, the Successive Overrelaxation (SOR) method and the Symmetric Successive
Overrelaxation (SSOR) method. In each case, we summarize their convergence behavior and their
effectiveness, and discuss how and when they should be used. Finally, in §2.2.5, we give some
historical background and further notes and references.
2.2.1 The Jacobi Method
The Jacobi method is easily derived by examining each of the n equations in the linear system
Ax = b in isolation. If in the ith equation
   Σ_{j=1}^{n} a_{i,j} x_j = b_i,

we solve for the value of x_i while assuming the other entries of x remain fixed, we obtain

   x_i = (b_i − Σ_{j≠i} a_{i,j} x_j)/a_{i,i}.                                     (2.2)
This suggests an iterative method defined by

   x_i^{(k)} = (b_i − Σ_{j≠i} a_{i,j} x_j^{(k−1)})/a_{i,i},                       (2.3)
which is the Jacobi method. Note that the order in which the equations are examined is irrelevant,
since the Jacobi method treats them independently. For this reason, the Jacobi method is also
known as the method of simultaneous displacements, since the updates could in principle be done
simultaneously.
In matrix terms, the definition of the Jacobi method in (2.3) can be expressed as
   x^{(k)} = D^{-1}(L + U)x^{(k−1)} + D^{-1}b,                                    (2.4)
Choose an initial guess x^{(0)} to the solution x.
for k = 1, 2, ...
   for i = 1, 2, ..., n
      x̄_i = 0
      for j = 1, 2, ..., i−1, i+1, ..., n
         x̄_i = x̄_i + a_{i,j} x_j^{(k−1)}
      end
      x̄_i = (b_i − x̄_i)/a_{i,i}
   end
   x^{(k)} = x̄
   check convergence; continue if necessary
end

Figure 2.1: The Jacobi Method
where the matrices D, −L and −U represent the diagonal, the strictly lower-triangular, and the
strictly upper-triangular parts of A, respectively.
The pseudocode for the Jacobi method is given in Figure 2.1. Note that an auxiliary storage
vector, x̄, is used in the algorithm. It is not possible to update the vector x in place, since values
from x^{(k−1)} are needed throughout the computation of x^{(k)}.
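For readers who want running code next to the pseudocode, here is a minimal NumPy sketch of the Jacobi iteration of Figure 2.1 (dense storage; the function name, tolerance, and iteration limit are illustrative choices, not part of the template):

import numpy as np

def jacobi(A, b, x0, tol=1e-8, maxiter=1000):
    # Jacobi iteration (2.3): every component of the new iterate uses only the old iterate
    x = x0.copy()
    D = np.diag(A)                        # the diagonal entries a_ii
    R = A - np.diagflat(D)                # off-diagonal part of A, i.e. -(L + U) in (2.4)
    for k in range(1, maxiter + 1):
        x_new = (b - R @ x) / D           # x^(k) = D^{-1}((L + U)x^(k-1) + b)
        if np.linalg.norm(b - A @ x_new) <= tol * np.linalg.norm(b):
            return x_new, k
        x = x_new
    return x, maxiter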
Convergence of the Jacobi method
Iterative methods are often used for solving discretized partial differential equations. In that context
a rigorous analysis of the convergence of simple methods such as the Jacobi method can be given.
As an example, consider the boundary value problem
   Lu = −u_{xx} = f   on (0, 1),   u(0) = u_0,   u(1) = u_1,

discretized by

   Lu(x_i) = 2u(x_i) − u(x_{i−1}) − u(x_{i+1}) = f(x_i)/N^2   for x_i = i/N,  i = 1 . . . N − 1.

The eigenfunctions of the continuous operator L and of its discretization are the same: for n = 1 . . . N − 1 the function u_n(x) = sin nπx is an eigenfunction corresponding to λ = 4 sin^2(nπ/(2N)). The eigenvalues of the Jacobi iteration matrix B are then λ(B) = 1 − (1/2)λ(L) = 1 − 2 sin^2(nπ/(2N)).
From this it is easy to see that the high frequency modes (i.e., eigenfunctions u_n with n large) are damped quickly, whereas the damping factor for modes with n small is close to 1. The spectral radius of the Jacobi iteration matrix is ≈ 1 − 10/N^2, and it is attained for the eigenfunction u(x) = sin πx.
The type of analysis applied to this example can be generalized to higher dimensions and other
stationary iterative methods. For both the Jacobi and Gauss-Seidel method (below) the spectral
radius is found to be 1 − O(h^2) where h is the discretization mesh width, i.e., h = N^{−1/d} where N is the number of variables and d is the number of space dimensions.
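A quick numerical check of the eigenvalue formula above (an illustration assuming NumPy, not taken from the book) builds the Jacobi iteration matrix for the one-dimensional model problem and compares its spectrum with 1 − 2 sin^2(nπ/(2N)):

import numpy as np

N = 32
A = 2 * np.eye(N - 1) - np.eye(N - 1, k=1) - np.eye(N - 1, k=-1)   # tridiag(-1, 2, -1)
B = np.eye(N - 1) - A / 2.0             # Jacobi iteration matrix D^{-1}(L + U), with D = 2I
computed = np.linalg.eigvalsh(B)        # ascending eigenvalues of the symmetric matrix B
n = np.arange(1, N)
predicted = np.sort(1 - 2 * np.sin(n * np.pi / (2 * N)) ** 2)
print(np.max(np.abs(computed - predicted)))                        # agrees to rounding error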
2.2.2 The Gauss-Seidel Method
Consider again the linear equations in (2.2). If we proceed as with the Jacobi method, but now
assume that the equations are examined one at a time in sequence, and that previously computed
Choose an initial guess x^{(0)} to the solution x.
for k = 1, 2, ...
   for i = 1, 2, ..., n
      σ = 0
      for j = 1, 2, ..., i−1
         σ = σ + a_{i,j} x_j^{(k)}
      end
      for j = i+1, ..., n
         σ = σ + a_{i,j} x_j^{(k−1)}
      end
      x_i^{(k)} = (b_i − σ)/a_{i,i}
   end
   check convergence; continue if necessary
end

Figure 2.2: The Gauss-Seidel Method
results are used as soon as they are available, we obtain the Gauss-Seidel method:
   x_i^{(k)} = (b_i − Σ_{j<i} a_{i,j} x_j^{(k)} − Σ_{j>i} a_{i,j} x_j^{(k−1)})/a_{i,i}.       (2.5)
Two important facts about the Gauss-Seidel method should be noted. First, the computations
in (2.5) appear to be serial. Since each component of the new iterate depends upon all previ-
ously computed components, the updates cannot be done simultaneously as in the Jacobi method.
Second, the new iterate x^{(k)} depends upon the order in which the equations are examined. The
Gauss-Seidel method is sometimes called the method of successive displacements to indicate the
dependence of the iterates on the ordering. If this ordering is changed, the components of the new
iterate (and not just their order) will also change.
These two points are important because if A is sparse, the dependency of each component
of the new iterate on previous components is not absolute. The presence of zeros in the matrix
may remove the influence of some of the previous components. Using a judicious ordering of
the equations, it may be possible to reduce such dependence, thus restoring the ability to make
updates to groups of components in parallel. However, reordering the equations can affect the rate
at which the Gauss-Seidel method converges. A poor choice of ordering can degrade the rate of
convergence; a good choice can enhance the rate of convergence. For a practical discussion of this
tradeoff (parallelism versus convergence rate) and some standard reorderings, the reader is referred
to Chapter 3 and §4.4.
In matrix terms, the definition of the Gauss-Seidel method in (2.5) can be expressed as
   x^{(k)} = (D − L)^{-1}(Ux^{(k−1)} + b).                                        (2.6)

As before, D, −L and −U represent the diagonal, lower-triangular, and upper-triangular parts of A, respectively.
The pseudocode for the Gauss-Seidel algorithm is given in Figure 2.2.
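A minimal NumPy sketch of the Gauss-Seidel sweep of Figure 2.2 follows (dense storage, illustrative stopping test); in contrast to the Jacobi sketch, the iterate is updated in place so that new values are used as soon as they are computed:

import numpy as np

def gauss_seidel(A, b, x0, tol=1e-8, maxiter=1000):
    # Gauss-Seidel sweep (2.5): new values x_j^(k) are used as soon as they are available
    x = x0.copy()
    n = len(b)
    for k in range(1, maxiter + 1):
        for i in range(n):
            sigma = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]   # j < i uses new, j > i old values
            x[i] = (b[i] - sigma) / A[i, i]
        if np.linalg.norm(b - A @ x) <= tol * np.linalg.norm(b):
            return x, k
    return x, maxiter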
2.2.3 The Successive Overrelaxation Method
The Successive Overrelaxation Method, or SOR, is devised by applying extrapolation to the Gauss-
Seidel method. This extrapolation takes the form of a weighted average between the previous iterate
Choose an initial guess x^{(0)} to the solution x.
for k = 1, 2, ...
   for i = 1, 2, ..., n
      σ = 0
      for j = 1, 2, ..., i−1
         σ = σ + a_{i,j} x_j^{(k)}
      end
      for j = i+1, ..., n
         σ = σ + a_{i,j} x_j^{(k−1)}
      end
      σ = (b_i − σ)/a_{i,i}
      x_i^{(k)} = x_i^{(k−1)} + ω(σ − x_i^{(k−1)})
   end
   check convergence; continue if necessary
end

Figure 2.3: The SOR Method
and the computed Gauss-Seidel iterate successively for each component:
   x_i^{(k)} = ω x̄_i^{(k)} + (1 − ω)x_i^{(k−1)}

(where x̄ denotes a Gauss-Seidel iterate, and ω is the extrapolation factor). The idea is to choose a
value for ω that will accelerate the rate of convergence of the iterates to the solution.
In matrix terms, the SOR algorithm can be written as follows:
   x^{(k)} = (D − ωL)^{-1}(ωU + (1 − ω)D)x^{(k−1)} + ω(D − ωL)^{-1}b.             (2.7)
The pseudocode for the SOR algorithm is given in Figure 2.3.
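The corresponding NumPy sketch of the SOR sweep of Figure 2.3 differs from Gauss-Seidel only in the weighted update; setting omega = 1 recovers Gauss-Seidel (parameter defaults are illustrative):

import numpy as np

def sor(A, b, x0, omega, tol=1e-8, maxiter=1000):
    # SOR sweep of Figure 2.3; omega = 1 reduces to the Gauss-Seidel method
    x = x0.copy()
    n = len(b)
    for k in range(1, maxiter + 1):
        for i in range(n):
            sigma = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]
            gs = (b[i] - sigma) / A[i, i]          # the Gauss-Seidel value for component i
            x[i] += omega * (gs - x[i])            # equivalently omega*gs + (1 - omega)*x[i]
        if np.linalg.norm(b - A @ x) <= tol * np.linalg.norm(b):
            return x, k
    return x, maxiter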
Choosing the Value of ω
If ω = 1, the SOR method simplifies to the Gauss-Seidel method. A theorem due to Kahan [129]
shows that SOR fails to converge if ω is outside the interval (0, 2). Though technically the term
underrelaxation should be used when 0 < ω < 1, for convenience the term overrelaxation is now
used for any value of ω ∈ (0, 2).
In general, it is not possible to compute in advance the value of ω that is optimal with respect
to the rate of convergence of SOR. Even when it is possible to compute the optimal value for ω, the
expense of such computation is usually prohibitive. Frequently, some heuristic estimate is used,
such as ω = 2 −O(h) where h is the mesh spacing of the discretization of the underlying physical
domain.
If the coefficient matrix A is symmetric and positive definite, the SOR iteration is guaranteed to
converge for any value of ω between 0 and 2, though the choice of ω can significantly affect the rate
at which the SOR iteration converges. Sophisticated implementations of the SOR algorithm (such
as that found in ITPACK [139]) employ adaptive parameter estimation schemes to try to home in
on the appropriate value of ω by estimating the rate at which the iteration is converging.
For coefficient matrices of a special class called consistently ordered with property A (see
Young [215]), which includes certain orderings of matrices arising from the discretization of el-
liptic PDEs, there is a direct relationship between the spectra of the Jacobi and SOR iteration
matrices. In principle, given the spectral radius ρ of the Jacobi iteration matrix, one can determine
a priori the theoretically optimal value of ω for SOR:
   ω_opt = 2 / (1 + √(1 − ρ^2)).                                                  (2.8)
This is seldom done, since calculating the spectral radius of the Jacobi matrix requires an imprac-
tical amount of computation. However, relatively inexpensive rough estimates of ρ (for example,
from the power method, see Golub and Van Loan [108, p. 351]) can yield reasonable estimates for
the optimal value of ω.
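As a hypothetical illustration of that remark (not an algorithm from the book), a few power-method steps applied to the Jacobi iteration matrix B = I − D^{-1}A give a rough estimate of ρ, which can then be inserted into (2.8); the function name and step count are arbitrary:

import numpy as np

def estimate_omega_opt(A, steps=50, seed=0):
    # Rough power-method estimate of rho(B), with B = I - D^{-1}A, then formula (2.8)
    rng = np.random.default_rng(seed)
    d_inv = 1.0 / np.diag(A)
    v = rng.standard_normal(A.shape[0])
    rho = 0.0
    for _ in range(steps):
        w = v - d_inv * (A @ v)                    # apply the Jacobi iteration matrix B
        rho = np.linalg.norm(w) / np.linalg.norm(v)
        v = w / np.linalg.norm(w)
    return 2.0 / (1.0 + np.sqrt(max(0.0, 1.0 - rho ** 2)))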
2.2.4 The Symmetric Successive Overrelaxation Method
If we assume that the coefficient matrix A is symmetric, then the Symmetric Successive Overre-
laxation method, or SSOR, combines two SOR sweeps together in such a way that the resulting
iteration matrix is similar to a symmetric matrix. Specifically, the first SOR sweep is carried out as
in (2.7), but in the second sweep the unknowns are updated in the reverse order. That is, SSOR is
a forward SOR sweep followed by a backward SOR sweep. The similarity of the SSOR iteration
matrix to a symmetric matrix permits the application of SSOR as a preconditioner for other iter-
ative schemes for symmetric matrices. Indeed, this is the primary motivation for SSOR since its
convergence rate, with an optimal value of ω, is usually slower than the convergence rate of SOR
with optimal ω (see Young [215, page 462]). For details on using SSOR as a preconditioner, see
Chapter 3.
In matrix terms, the SSOR iteration can be expressed as follows:
   x^{(k)} = B_1 B_2 x^{(k−1)} + ω(2 − ω)(D − ωU)^{-1}D(D − ωL)^{-1}b,            (2.9)

where

   B_1 = (D − ωU)^{-1}(ωL + (1 − ω)D),

and

   B_2 = (D − ωL)^{-1}(ωU + (1 − ω)D).
Note that B_2 is simply the iteration matrix for SOR from (2.7), and that B_1 is the same, but with
the roles of L and U reversed.
The pseudocode for the SSOR algorithm is given in Figure 2.4.
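Because one SSOR iteration is just a forward SOR sweep followed by a backward sweep over the same unknowns, a single SSOR step can be sketched in NumPy as follows (dense storage; the vector x is overwritten in place; this is an illustration, not the book's template):

import numpy as np

def ssor_step(A, b, x, omega):
    # One SSOR iteration as in Figure 2.4: forward SOR sweep, then backward SOR sweep
    n = len(b)
    for i in range(n):
        sigma = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]
        x[i] += omega * ((b[i] - sigma) / A[i, i] - x[i])
    for i in range(n - 1, -1, -1):                 # unknowns updated in reverse order
        sigma = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]
        x[i] += omega * ((b[i] - sigma) / A[i, i] - x[i])
    return x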
2.2.5 Notes and References
The modern treatment of iterative methods dates back to the relaxation method of Southwell [192].
This was the precursor to the SOR method, though the order in which approximations to the un-
knowns were relaxed varied during the computation. Specifically, the next unknown was chosen
based upon estimates of the location of the largest error in the current approximation. Because
of this, Southwell’s relaxation method was considered impractical for automated computing. It is
interesting to note that the introduction of multiple-instruction, multiple data-stream (MIMD) paral-
lel computers has rekindled an interest in so-called asynchronous, or chaotic iterative methods (see
Chazan and Miranker [53], Baudet [30], and Elkin [91]), which are closely related to Southwell’s
original relaxation method. In chaotic methods, the order of relaxation is unconstrained, thereby
eliminating costly synchronization of the processors, though the effect on convergence is difficult
to predict.
The notion of accelerating the convergence of an iterative method by extrapolation predates
the development of SOR. Indeed, Southwell used overrelaxation to accelerate the convergence
of his original relaxation method. More recently, the ad hoc SOR method, in which a different
Choose an initial guess x^{(0)} to the solution x.
Let x^{(1/2)} = x^{(0)}.
for k = 1, 2, ...
   for i = 1, 2, ..., n
      σ = 0
      for j = 1, 2, ..., i−1
         σ = σ + a_{i,j} x_j^{(k−1/2)}
      end
      for j = i+1, ..., n
         σ = σ + a_{i,j} x_j^{(k−1)}
      end
      σ = (b_i − σ)/a_{i,i}
      x_i^{(k−1/2)} = x_i^{(k−1)} + ω(σ − x_i^{(k−1)})
   end
   for i = n, n−1, ..., 1
      σ = 0
      for j = 1, 2, ..., i−1
         σ = σ + a_{i,j} x_j^{(k−1/2)}
      end
      for j = i+1, ..., n
         σ = σ + a_{i,j} x_j^{(k)}
      end
      σ = (b_i − σ)/a_{i,i}
      x_i^{(k)} = x_i^{(k−1/2)} + ω(σ − x_i^{(k−1/2)})
   end
   check convergence; continue if necessary
end

Figure 2.4: The SSOR Method
relaxation factor ω is used for updating each variable, has given impressive results for some classes
of problems (see Ehrlich [82]).
The three main references for the theory of stationary iterative methods are Varga [209],
Young [215] and Hageman and Young [119]. The reader is referred to these books (and the
references therein) for further details concerning the methods described in this section.
2.3 Nonstationary Iterative Methods
Nonstationary methods differ from stationary methods in that the computations involve informa-
tion that changes at each iteration. Typically, constants are computed by taking inner products of
residuals or other vectors arising from the iterative method.
2.3.1 Conjugate Gradient Method (CG)
The Conjugate Gradient method is an effective method for symmetric positive definite systems. It
is the oldest and best known of the nonstationary methods discussed here. The method proceeds by
generating vector sequences of iterates (i.e., successive approximations to the solution), residuals
Compute r^{(0)} = b − Ax^{(0)} for some initial guess x^{(0)}
for i = 1, 2, ...
   solve Mz^{(i−1)} = r^{(i−1)}
   ρ_{i−1} = r^{(i−1)T} z^{(i−1)}
   if i = 1
      p^{(1)} = z^{(0)}
   else
      β_{i−1} = ρ_{i−1}/ρ_{i−2}
      p^{(i)} = z^{(i−1)} + β_{i−1} p^{(i−1)}
   endif
   q^{(i)} = Ap^{(i)}
   α_i = ρ_{i−1}/(p^{(i)T} q^{(i)})
   x^{(i)} = x^{(i−1)} + α_i p^{(i)}
   r^{(i)} = r^{(i−1)} − α_i q^{(i)}
   check convergence; continue if necessary
end

Figure 2.5: The Preconditioned Conjugate Gradient Method
corresponding to the iterates, and search directions used in updating the iterates and residuals.
Although the length of these sequences can become large, only a small number of vectors needs
to be kept in memory. In every iteration of the method, two inner products are performed in order
to compute update scalars that are defined to make the sequences satisfy certain orthogonality
conditions. On a symmetric positive definite linear system these conditions imply that the distance
to the true solution is minimized in some norm.
The iterates x^{(i)} are updated in each iteration by a multiple (α_i) of the search direction vector p^{(i)}:

   x^{(i)} = x^{(i−1)} + α_i p^{(i)}.
Correspondingly the residuals r^{(i)} = b − Ax^{(i)} are updated as

   r^{(i)} = r^{(i−1)} − α q^{(i)}   where   q^{(i)} = Ap^{(i)}.                  (2.10)

The choice α = α_i = r^{(i−1)T} r^{(i−1)} / (p^{(i)T} A p^{(i)}) minimizes r^{(i)T} A^{-1} r^{(i)} over all possible choices for α in equation (2.10).
The search directions are updated using the residuals

   p^{(i)} = r^{(i)} + β_{i−1} p^{(i−1)},                                         (2.11)

where the choice β_i = r^{(i)T} r^{(i)} / (r^{(i−1)T} r^{(i−1)}) ensures that p^{(i)} and Ap^{(i−1)} – or equivalently, r^{(i)} and r^{(i−1)} – are orthogonal. In fact, one can show that this choice of β_i makes p^{(i)} and r^{(i)} orthogonal to all previous Ap^{(j)} and r^{(j)} respectively.
The pseudocode for the Preconditioned Conjugate Gradient Method is given in Figure 2.5. It
uses a preconditioner M; for M = I one obtains the unpreconditioned version of the Conjugate
Gradient Algorithm. In that case the algorithm may be further simplified by skipping the “solve”
line, and replacing z^{(i−1)} by r^{(i−1)} (and z^{(0)} by r^{(0)}).
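A direct NumPy transcription of Figure 2.5 is sketched below; the preconditioner is passed as a callable msolve that returns the solution z of Mz = r, and the default identity solve gives the unpreconditioned method (names, tolerance, and iteration limit are illustrative, not part of the template):

import numpy as np

def pcg(A, b, x0, msolve=lambda r: r, tol=1e-8, maxiter=1000):
    # Preconditioned CG as in Figure 2.5; msolve(r) must return the solution z of Mz = r
    x = x0.copy()
    r = b - A @ x
    rho_prev, p = 0.0, None
    for i in range(1, maxiter + 1):
        z = msolve(r)                     # solve Mz = r
        rho = r @ z
        if i == 1:
            p = z.copy()
        else:
            beta = rho / rho_prev
            p = z + beta * p
        q = A @ p
        alpha = rho / (p @ q)
        x = x + alpha * p
        r = r - alpha * q
        rho_prev = rho
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, i
    return x, maxiter

A Jacobi preconditioner, for instance, would be supplied as msolve = lambda r: r / np.diag(A).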
Theory
The unpreconditioned conjugate gradient method constructs the ith iterate x^{(i)} as an element of x^{(0)} + span{r^{(0)}, . . . , A^{i−1} r^{(0)}} so that (x^{(i)} − x̂)^T A (x^{(i)} − x̂) is minimized, where x̂ is the exact
solution of Ax = b. This minimum is guaranteed to exist in general only if A is symmetric positive
definite. The preconditioned version of the method uses a different subspace for constructing the
iterates, but it satisfies the same minimization property, although over this different subspace. It
requires in addition that the preconditioner M is symmetric and positive definite.
The above minimization of the error is equivalent to the residuals r^{(i)} = b − Ax^{(i)} being M^{-1}-orthogonal (that is, r^{(i)T} M^{-1} r^{(j)} = 0 if i ≠ j). Since for symmetric A an orthogonal basis for the Krylov subspace span{r^{(0)}, . . . , A^{i−1} r^{(0)}} can be constructed with only three-term recurrences,
such a recurrence also suffices for generating the residuals. In the Conjugate Gradient method two
coupled two-term recurrences are used; one that updates residuals using a search direction vector,
and one updating the search direction with a newly computed residual. This makes the Conjugate
Gradient Method quite attractive computationally.
There is a close relationship between the Conjugate Gradient method and the Lanczos method
for determining eigensystems, since both are based on the construction of an orthogonal basis for
the Krylov subspace, and a similarity transformation of the coefficient matrix to tridiagonal form.
The coefficients computed during the CG iteration then arise from the LU factorization of this
tridiagonal matrix. From the CG iteration one can reconstruct the Lanczos process, and vice versa;
see Paige and Saunders [167] and Golub and Van Loan [108, §10.2.6]. This relationship can be
exploited to obtain relevant information about the eigensystem of the (preconditioned) matrix A;
see §5.1.
Convergence
Accurate predictions of the convergence of iterative methods are difficult to make, but useful
bounds can often be obtained. For the Conjugate Gradient method, the error can be bounded in
terms of the spectral condition number κ_2 of the matrix M^{-1}A. (Recall that if λ_max and λ_min are the largest and smallest eigenvalues of a symmetric positive definite matrix B, then the spectral condition number of B is κ_2(B) = λ_max(B)/λ_min(B)). If x̂ is the exact solution of the linear system Ax = b, with symmetric positive definite matrix A, then for CG with symmetric positive definite preconditioner M, it can be shown that
   ||x^{(i)} − x̂||_A ≤ 2α^i ||x^{(0)} − x̂||_A                                    (2.12)

where α = (√κ_2 − 1)/(√κ_2 + 1) (see Golub and Van Loan [108, §10.2.8], and Kaniel [130]), and ||y||_A^2 ≡ (y, Ay). From this relation we see that the number of iterations to reach a relative reduction of ε in the error is proportional to √κ_2.
In some cases, practical application of the above error bound is straightforward. For example,
elliptic second order partial differential equations typically give rise to coefficient matrices A with
κ_2(A) = O(h^{−2}) (where h is the discretization mesh width), independent of the order of the finite elements or differences used, and of the number of space dimensions of the problem (see for instance Axelsson and Barker [14, §5.5]). Thus, without preconditioning, we expect a number of iterations proportional to h^{−1} for the Conjugate Gradient method.
Other results concerning the behavior of the Conjugate Gradient algorithm have been obtained.
If the extremal eigenvalues of the matrix M^{-1}A are well separated, then one often observes so-
called superlinear convergence (see Concus, Golub and O’Leary [57]); that is, convergence at a
rate that increases per iteration. This phenomenon is explained by the fact that CG tends to elimi-
nate components of the error in the direction of eigenvectors associated with extremal eigenvalues
first. After these have been eliminated, the method proceeds as if these eigenvalues did not exist
in the given system, i.e., the convergence rate depends on a reduced system with a (much) smaller
condition number (for an analysis of this, see Van der Sluis and Van der Vorst [197]). The effective-
ness of the preconditioner in reducing the condition number and in separating extremal eigenvalues
can be deduced by studying the approximated eigenvalues of the related Lanczos process.
Implementation
The Conjugate Gradient method involves one matrix-vector product, three vector updates, and two
inner products per iteration. Some slight computational variants exist that have the same structure
(see Reid [178]). Variants that cluster the inner products, a favorable property on parallel machines,
are discussed in §4.4.
For a discussion of the Conjugate Gradient method on vector and shared memory computers,
see Dongarra, et al. [70, 165]. For discussions of the method for more general parallel architectures
see Demmel, Heath and Van der Vorst [66] and Ortega [165], and the references therein.
Further references
A more formal presentation of CG, as well as many theoretical properties, can be found in the
textbook by Hackbusch [117]. Shorter presentations are given in Axelsson and Barker [14] and
Golub and Van Loan [108]. An overview of papers published in the first 25 years of existence of
the method is given in Golub and O’Leary [107].
2.3.2 MINRES and SYMMLQ
The Conjugate Gradient method can be viewed as a special variant of the Lanczos method (see §5.1)
for positive definite symmetric systems. The MINRES and SYMMLQ methods are variants that
can be applied to symmetric indefinite systems.
The vector sequences in the Conjugate Gradient method correspond to a factorization of a tridi-
agonal matrix similar to the coefficient matrix. Therefore, a breakdown of the algorithm can occur
corresponding to a zero pivot if the matrix is indefinite. Furthermore, for indefinite matrices the
minimization property of the Conjugate Gradient method is no longer well-defined. The MINRES
and SYMMLQ methods are variants of the CG method that avoid the LU-factorization and do not
suffer from breakdown. MINRES minimizes the residual in the 2-norm. SYMMLQ solves the
projected system, but does not minimize anything (it keeps the residual orthogonal to all previous
ones). The convergence behavior of Conjugate Gradients and MINRES for indefinite systems was
analyzed by Paige, Parlett, and Van der Vorst [166].
Theory
When A is not positive definite, but symmetric, we can still construct an orthogonal basis for the
Krylov subspace by three term recurrence relations. Eliminating the search directions in equa-
tions (2.10) and (2.11) gives a recurrence
   A r^{(i)} = r^{(i+1)} t_{i+1,i} + r^{(i)} t_{i,i} + r^{(i−1)} t_{i−1,i},

which can be written in matrix form as

   A R_i = R_{i+1} T̄_i,

where T̄_i is an (i + 1) × i tridiagonal matrix, with nonzero entries only on its main diagonal and its first sub- and superdiagonals.
In this case we have the problem that (·, ·)_A no longer defines an inner product. However we can still try to minimize the residual in the 2-norm by obtaining

   x^{(i)} ∈ {r^{(0)}, Ar^{(0)}, . . . , A^{i−1} r^{(0)}},   x^{(i)} = R_i ȳ

that minimizes

   ||Ax^{(i)} − b||_2 = ||A R_i ȳ − b||_2 = ||R_{i+1} T̄_i ȳ − b||_2.
Now we exploit the fact that if D_{i+1} ≡ diag(||r^{(0)}||_2, ||r^{(1)}||_2, ..., ||r^{(i)}||_2), then R_{i+1} D_{i+1}^{-1} is an orthonormal transformation with respect to the current Krylov subspace:

   ||Ax^{(i)} − b||_2 = ||D_{i+1} T̄_i ȳ − ||r^{(0)}||_2 e^{(1)}||_2
and this final expression can simply be seen as a minimum norm least squares problem.
The element in the (i + 1, i) position of T̄_i can be annihilated by a simple Givens rotation, and the resulting upper bidiagonal system (the other subdiagonal elements having been removed in previous iteration steps) can simply be solved, which leads to the MINRES method (see Paige and Saunders [167]).

Another possibility is to solve the system T_i y = ||r^{(0)}||_2 e^{(1)}, as in the CG method (T_i is the upper i × i part of T̄_i). Unlike in CG, we cannot rely on the existence of a Cholesky decomposition (since A is not positive definite). An alternative is then to decompose T_i by an LQ-decomposition. This again leads to simple recurrences, and the resulting method is known as SYMMLQ (see Paige and Saunders [167]).
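For orientation only (this uses SciPy rather than anything defined in the book), a MINRES implementation can be applied directly to a small symmetric indefinite system; the matrix below is an arbitrary example with both positive and negative eigenvalues:

import numpy as np
from scipy.sparse.linalg import minres

A = np.array([[4.0, 1.0,  0.0],
              [1.0, 0.0,  2.0],
              [0.0, 2.0, -3.0]])        # symmetric but indefinite
b = np.array([1.0, 2.0, 3.0])

x, info = minres(A, b)
print(info, np.linalg.norm(A @ x - b))  # info == 0 on convergence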
2.3.3 CG on the Normal Equations, CGNE and CGNR
The CGNE and CGNR methods are the simplest methods for nonsymmetric or indefinite systems.
Since other methods for such systems are in general rather more complicated than the Conjugate
Gradient method, transforming the system to a symmetric definite one and then applying the Con-
jugate Gradient method is attractive for its coding simplicity.
Theory
If a system of linear equations Ax = b has a nonsymmetric, possibly indefinite (but nonsingular),
coefficient matrix, one obvious attempt at a solution is to apply Conjugate Gradient to a related
symmetric positive definite system, A^T A x = A^T b. While this approach is easy to understand
and code, the convergence speed of the Conjugate Gradient method now depends on the square of
the condition number of the original coefficient matrix. Thus the rate of convergence of the CG
procedure on the normal equations may be slow.
Several proposals have been made to improve the numerical stability of this method. The best
known is by Paige and Saunders [168] and is based upon applying the Lanczos method to the
auxiliary system
   [ I     A ] [ r ]   =   [ b ]
   [ A^T   0 ] [ x ]       [ 0 ].

A clever execution of this scheme delivers the factors L and U of the LU-decomposition of the tridiagonal matrix that would have been computed by carrying out the Lanczos procedure with A^T A.
Another means for improving the numerical stability of this normal equations approach is sug-
gested by Björck and Elfving in [34]. The observation that the matrix A^T A is used in the construction of the iteration coefficients through an inner product like (p, A^T Ap) leads to the suggestion
that such an inner product be replaced by (Ap, Ap).
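A brief matrix-free sketch of CGNR (assuming SciPy; not from the book): the normal-equations operator is supplied as the routine p ↦ A^T(Ap), so A^T A is never formed explicitly, and CG is applied to A^T A x = A^T b; the random test matrix is illustrative only.

import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

rng = np.random.default_rng(0)
A = rng.standard_normal((80, 80)) + 5.0 * np.eye(80)   # a nonsymmetric, nonsingular example
b = rng.standard_normal(80)

normal_op = LinearOperator((80, 80), matvec=lambda p: A.T @ (A @ p))  # p -> A^T A p
x, info = cg(normal_op, A.T @ b)                                      # CG on A^T A x = A^T b
print(info, np.linalg.norm(A @ x - b))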
2.3.4 Generalized Minimal Residual (GMRES)
The Generalized Minimal Residual method is an extension of MINRES (which is only applicable to
symmetric systems) to unsymmetric systems. Like MINRES, it generates a sequence of orthogonal
vectors, but in the absence of symmetry this can no longer be done with short recurrences; instead,
all previously computed vectors in the orthogonal sequence have to be retained. For this reason,
“restarted” versions of the method are used.
In the Conjugate Gradient method, the residuals form an orthogonal basis for the space
span{r^{(0)}, Ar^{(0)}, A^2 r^{(0)}, . . .}. In GMRES, this basis is formed explicitly:
   w^{(i)} = Av^{(i)}
   for k = 1, . . . , i
      w^{(i)} = w^{(i)} − (w^{(i)}, v^{(k)}) v^{(k)}
   end
   v^{(i+1)} = w^{(i)}/||w^{(i)}||
The reader may recognize this as a modified Gram-Schmidt orthogonalization. Applied to the
Krylov sequence {A^k r^{(0)}} this orthogonalization is called the “Arnoldi method” [6]. The inner product coefficients (w^{(i)}, v^{(k)}) and ||w^{(i)}|| are stored in an upper Hessenberg matrix.
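A compact NumPy sketch of this Arnoldi process with modified Gram-Schmidt (a didactic version, not the production loop of Figure 2.6) returns the orthonormal basis V, with m + 1 columns, and the (m + 1) × m Hessenberg matrix H satisfying A V[:, :m] = V H:

import numpy as np

def arnoldi(A, r0, m):
    # Modified Gram-Schmidt Arnoldi: A V[:, :m] = V @ H, with V orthonormal, H upper Hessenberg
    n = len(r0)
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = r0 / np.linalg.norm(r0)
    for i in range(m):
        w = A @ V[:, i]
        for k in range(i + 1):                     # orthogonalize against v^(1), ..., v^(i+1)
            H[k, i] = w @ V[:, k]
            w = w - H[k, i] * V[:, k]
        H[i + 1, i] = np.linalg.norm(w)
        if H[i + 1, i] == 0.0:                     # the Krylov subspace became invariant
            return V[:, :i + 1], H[:i + 1, :i]
        V[:, i + 1] = w / H[i + 1, i]
    return V, H

The GMRES iterate is then x^{(i)} = x^{(0)} + V_i y, where y solves the small least squares problem of minimizing || ||r^{(0)}||_2 e_1 − H̄_i y ||_2.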
The GMRES iterates are constructed as
   x^{(i)} = x^{(0)} + y_1 v^{(1)} + · · · + y_i v^{(i)},

where the coefficients y_k have been chosen to minimize the residual norm ||b − Ax^{(i)}||. The
GMRES algorithm has the property that this residual norm can be computed without the iterate
having been formed. Thus, the expensive action of forming the iterate can be postponed until the
residual norm is deemed small enough.
The pseudocode for the restarted GMRES(m) algorithm with preconditioner M is given in
Figure 2.6.
Theory
The Generalized Minimum Residual (GMRES) method is designed to solve nonsymmetric linear
systems (see Saad and Schultz [188]). The most popular form of GMRES is based on the modified
Gram-Schmidt procedure, and uses restarts to control storage requirements.
If no restarts are used, GMRES (like any orthogonalizing Krylov-subspace method) will con-
verge in no more than n steps (assuming exact arithmetic). Of course this is of no practical value
when n is large; moreover, the storage and computational requirements in the absence of restarts
are prohibitive. Indeed, the crucial element for successful application of GMRES(m) revolves
around the decision of when to restart; that is, the choice of m. Unfortunately, there exist examples
for which the method stagnates and convergence takes place only at the nth step. For such systems,
any choice of m less than n fails to converge.
Saad and Schultz [188] have proven several useful results. In particular, they show that if the
coefficient matrix A is real and nearly positive definite, then a “reasonable” value for m may be
selected. Implications of the choice of m are discussed below.
Implementation
A common implementation of GMRES is suggested by Saad and Schultz in [188] and relies on
using modified Gram-Schmidt orthogonalization. Householder transformations, which are rela-
tively costly but stable, have also been proposed. The Householder approach results in a three-fold
increase in work; however, convergence may be better, especially for ill-conditioned systems (see
Walker [212]). From the point of view of parallelism, Gram-Schmidt orthogonalization may be
x^{(0)} is an initial guess
for j = 1, 2, . . .
   Solve r from Mr = b − Ax^{(0)}
   v^{(1)} = r/||r||_2
   s := ||r||_2 e_1
   for i = 1, 2, ..., m
      Solve w from Mw = Av^{(i)}
      for k = 1, ..., i
         h_{k,i} = (w, v^{(k)})
         w = w − h_{k,i} v^{(k)}
      end
      h_{i+1,i} = ||w||_2
      v^{(i+1)} = w/h_{i+1,i}
      apply J_1, ..., J_{i−1} on (h_{1,i}, ..., h_{i+1,i})
      construct J_i, acting on ith and (i + 1)st component
         of h_{.,i}, such that (i + 1)st component of J_i h_{.,i} is 0
      s := J_i s
      if s(i + 1) is small enough then (UPDATE(x̃, i) and quit)
   end
   UPDATE(x̃, m)
end

In this scheme UPDATE(x̃, i)
replaces the following computations:

   Compute y as the solution of Hy = s̃, in which
   the upper i × i triangular part of H has h_{i,j} as
   its elements (in least squares sense if H is singular),
   s̃ represents the first i components of s
   x̃ = x^{(0)} + y_1 v^{(1)} + y_2 v^{(2)} + ... + y_i v^{(i)}
   s^{(i+1)} = ||b − Ax̃||_2
   if x̃ is an accurate enough approximation then quit
   else x^{(0)} = x̃

Figure 2.6: The Preconditioned GMRES(m) Method
preferred, giving up some stability for better parallelization properties (see Demmel, Heath and
Van der Vorst [66]). Here we adopt the Modified Gram-Schmidt approach.
The major drawback to GMRES is that the amount of work and storage required per iteration
rises linearly with the iteration count. Unless one is fortunate enough to obtain extremely fast con-
vergence, the cost will rapidly become prohibitive. The usual way to overcome this limitation is by
restarting the iteration. After a chosen number (m) of iterations, the accumulated data are cleared
and the intermediate results are used as the initial data for the next m iterations. This procedure is
repeated until convergence is achieved. The difficulty is in choosing an appropriate value for m. If
m is “too small”, GMRES(m) may be slow to converge, or fail to converge entirely. A value of m
that is larger than necessary involves excessive work (and uses more storage). Unfortunately, there
are no definite rules governing the choice of m—choosing when to restart is a matter of experience.
For a discussion of GMRES for vector and shared memory computers see Dongarra et al. [70];
2.3. NONSTATIONARY ITERATIVE METHODS 19
for more general architectures, see Demmel, Heath and Van der Vorst [66].
2.3.5 BiConjugate Gradient (BiCG)
The Conjugate Gradient method is not suitable for nonsymmetric systems because the residual
vectors cannot be made orthogonal with short recurrences (for proof of this see Voevodin [211]
or Faber and Manteuffel [95]). The GMRES method retains orthogonality of the residuals by
using long recurrences, at the cost of a larger storage demand. The BiConjugate Gradient method
takes another approach, replacing the orthogonal sequence of residuals by two mutually orthogonal
sequences, at the price of no longer providing a minimization.
The update relations for residuals in the Conjugate Gradient method are augmented in the
BiConjugate Gradient method by relations that are similar but based on A
T
instead of A. Thus
we update two sequences of residuals
r
(i)
= r
(i−1)
−α
i
Ap
(i)
, ˜ r
(i)
= ˜ r
(i−1)
−α
i
A
T
˜ p
(i)
,
and two sequences of search directions
p
(i)
= r
(i−1)

i−1
p
(i−1)
, ˜ p
(i)
= ˜ r
(i−1)

i−1
˜ p
(i−1)
.
The choices
α
i
=
˜ r
(i−1)
T
r
(i−1)
˜ p
(i)
T
Ap
(i)
, β
i
=
˜ r
(i)
T
r
(i)
˜ r
(i−1)
T
r
(i−1)
ensure the bi-orthogonality relations
˜ r
(i)
T
r
(j)
= ˜ p
(i)
T
Ap
(j)
= 0 if i = j.
The pseudocode for the Preconditioned BiConjugate Gradient Method with preconditioner M
is given in Figure 2.7.
Convergence
Few theoretical results are known about the convergence of BiCG. For symmetric positive defi-
nite systems the method delivers the same results as CG, but at twice the cost per iteration. For
nonsymmetric matrices it has been shown that in phases of the process where there is significant
reduction of the norm of the residual, the method is more or less comparable to full GMRES (in
terms of numbers of iterations) (see Freund and Nachtigal [101]). In practice this is often con-
firmed, but it is also observed that the convergence behavior may be quite irregular, and the method
may even break down. The breakdown situation due to the possible event that z
(i−1)
T
˜ r
(i−1)
≈ 0
can be circumvented by so-called look-ahead strategies (see Parlett, Taylor and Liu [171]). This
leads to complicated codes and is beyond the scope of this book. The other breakdown situation,
˜ p
(i)
T
q
(i)
≈ 0, occurs when the LU-decomposition fails (see the theory subsection of '2.3.1), and
can be repaired by using another decomposition. This is done in QMR (see '2.3.6).
Sometimes, breakdown or near-breakdown situations can be satisfactorily avoided by a restart
at the iteration step immediately before the (near-) breakdown step. Another possibility is to switch
to a more robust (but possibly more expensive) method, like GMRES.
Implementation
BiCG requires computing a matrix-vector product Ap
(k)
and a transpose product A
T
˜ p
(k)
. In some
applications the latter product may be impossible to perform, for instance if the matrix is not formed
explicitly and the regular product is only given in operation form, for instance as a function call
evaluation.
20 CHAPTER 2. ITERATIVE METHODS
Compute r
(0)
= b −Ax
(0)
for some initial guess x
(0)
.
Choose ˜ r
(0)
(for example, ˜ r
(0)
= r
(0)
).
for i = 1, 2, . . .
solve Mz
(i−1)
= r
(i−1)
solve M
T
˜ z
(i−1)
= ˜ r
(i−1)
ρ
i−1
= z
(i−1)
T
˜ r
(i−1)
if ρ
i−1
= 0, method fails
if i = 1
p
(i)
= z
(i−1)
˜ p
(i)
= ˜ z
(i−1)
else
β
i−1
= ρ
i−1

i−2
p
(i)
= z
(i−1)

i−1
p
(i−1)
˜ p
(i)
= ˜ z
(i−1)

i−1
˜ p
(i−1)
endif
q
(i)
= Ap
(i)
˜ q
(i)
= A
T
˜ p
(i)
α
i
= ρ
i−1
/˜ p
(i)
T
q
(i)
x
(i)
= x
(i−1)

i
p
(i)
r
(i)
= r
(i−1)
−α
i
q
(i)
˜ r
(i)
= ˜ r
(i−1)
−α
i
˜ q
(i)
check convergence; continue if necessary
end
Figure 2.7: The Preconditioned BiConjugate Gradient Method
In a parallel environment, the two matrix-vector products can theoretically be performed si-
multaneously; however, in a distributed-memory environment, there will be extra communication
costs associated with one of the two matrix-vector products, depending upon the storage scheme
for A. A duplicate copy of the matrix will alleviate this problem, at the cost of doubling the storage
requirements for the matrix.
Care must also be exercised in choosing the preconditioner, since similar problems arise during
the two solves involving the preconditioning matrix.
It is difficult to make a fair comparison between GMRES and BiCG. GMRES really minimizes
a residual, but at the cost of increasing work for keeping all residuals orthogonal and increasing
demands for memory space. BiCG does not minimize a residual, but often its accuracy is com-
parable to GMRES, at the cost of twice the amount of matrix vector products per iteration step.
However, the generation of the basis vectors is relatively cheap and the memory requirements are
modest. Several variants of BiCG have been proposed that increase the effectiveness of this class
of methods in certain circumstances. These variants (CGS and Bi-CGSTAB) will be discussed in
coming subsections.
2.3.6 Quasi-Minimal Residual (QMR)
The BiConjugate Gradient method often displays rather irregular convergence behavior. Moreover,
the implicit LU decomposition of the reduced tridiagonal system may not exist, resulting in break-
down of the algorithm. A related algorithm, the Quasi-Minimal Residual method of Freund and
Nachtigal [101], [102] attempts to overcome these problems. The main idea behind this algorithm
2.3. NONSTATIONARY ITERATIVE METHODS 21
is to solve the reduced tridiagonal system in a least squares sense, similar to the approach fol-
lowed in GMRES. Since the constructed basis for the Krylov subspace is bi-orthogonal, rather than
orthogonal as in GMRES, the obtained solution is viewed as a quasi-minimal residual solution,
which explains the name. Additionally, QMR uses look-ahead techniques to avoid breakdowns in
the underlying Lanczos process, which makes it more robust than BiCG.
Convergence
The convergence behavior of QMR is typically much smoother than for BiCG. Freund and Nachti-
gal [101] present quite general error bounds which show that QMR may be expected to converge
about as fast as GMRES. From a relation between the residuals in BiCG and QMR (Freund and
Nachtigal [101, relation (5.10)]) one may deduce that at phases in the iteration process where BiCG
makes significant progress, QMR has arrived at about the same approximation for ˆ x. On the other
hand, when BiCG makes no progress at all, QMR may still show slow convergence.
The look-ahead steps in the QMR method prevent breakdown in all cases but the so-called
“incurable breakdown”, where no number of look-ahead steps would yield a next iterate.
Implementation
The pseudocode for the Preconditioned Quasi Minimal Residual Method, with preconditioner
M = M
1
M
2
, is given in Figure 2.8. This algorithm follows the two term recurrence version
without look-ahead, presented by Freund and Nachtigal [102] as Algorithm 7.1. This version of
QMR is simpler to implement than the full QMR method with look-ahead, but it is susceptible
to breakdown of the underlying Lanczos process. (Other implementational variations are whether
to scale Lanczos vectors or not, or to use three-term recurrences instead of coupled two-term re-
currences. Such decisions usually have implications for the stability and the efficiency of the algo-
rithm.) A professional implementation of QMR with look-ahead is given in Freund and Nachtigal’s
QMRPACK, which is available through netlib; see Appendix A.
We have modified the algorithm to include a relatively inexpensive recurrence relation for the
computation of the residual vector. This requires a few extra vectors of storage and vector update
operations per iteration, but it avoids expending a matrix-vector product on the residual calculation.
Also, the algorithm has been modified so that only two full preconditioning steps are required
instead of three.
Computation of the residual is done for the convergence test. If one uses right (or post) precon-
ditioning, that is M
1
= I, then a cheap upper bound for |r
(i)
| can be computed in each iteration,
avoiding the recursions for r
(i)
. For details, see Freund and Nachtigal [101, proposition 4.1]. This
upper bound may be pessimistic by a factor of at most

i + 1.
QMR has roughly the same problems with respect to vector and parallel implementation as
BiCG. The scalar overhead per iteration is slightly more than for BiCG. In all cases where the
slightly cheaper BiCG method converges irregularly (but fast enough), there seems little reason to
avoid QMR.
2.3.7 Conjugate Gradient Squared Method (CGS)
In BiCG, the residual vector r
(i)
can be regarded as the product of r
(0)
and an ith degree polynomial
in A, that is
r
(i)
= P
i
(A)r
(0)
. (2.13)
This same polynomial satisfies ˜ r
(i)
= P
i
(A)˜ r
(0)
so that
ρ
i
= (˜ r
(i)
, r
(i)
) = (P
i
(A
T
)˜ r
(0)
, P
i
(A)r
(0)
) = (˜ r
(i)
, P
2
i
(A)r
(0)
). (2.14)
22 CHAPTER 2. ITERATIVE METHODS
Compute r
(0)
= b −Ax
(0)
for some initial guess x
(0)
˜ v
(1)
= r
(0)
; solve M
1
y = ˜ v
(1)
; ρ
1
= |y|
2
Choose ˜ w
(1)
, for example ˜ w
(1)
= r
(0)
solve M
t
2
z = ˜ w
(1)
; ξ
1
= |z|
2
γ
0
= 1; η
0
= −1
for i = 1, 2, . . .
if ρ
i
= 0 or ξ
i
= 0 method fails
v
(i)
= ˜ v
(i)

i
; y = y/ρ
i
w
(i)
= ˜ w
(i)

i
; z = z/ξ
i
δ
i
= z
T
y; if δ
i
= 0 method fails
solve M
2
˜ y = y
solve M
T
1
˜ z = z
if i = 1
p
(1)
= ˜ y; q
(1)
= ˜ z
else
p
(i)
= ˜ y −(ξ
i
δ
i
/
i−1
)p
(i−1)
q
(i)
= ˜ z −(ρ
i
δ
i
/
i−1
)q
(i−1)
endif
˜ p = Ap
(i)

i
= q
(i)
T
˜ p; if
i
= 0 method fails
β
i
=
i

i
; if β
i
= 0 method fails
˜ v
(i+1)
= ˜ p −β
i
v
(i)
solve M
1
y = ˜ v
(i+1)
ρ
i+1
= |y|
2
˜ w
(i+1)
= A
T
q
(i)
−β
i
w
(i)
solve M
T
2
z = ˜ w
(i+1)
ξ
i+1
= |z|
2
θ
i
= ρ
i+1
/(γ
i−1

i
[); γ
i
= 1/

1 +θ
2
i
; if γ
i
= 0 method fails
η
i
= −η
i−1
ρ
i
γ
2
i
/(β
i
γ
2
i−1
)
if i = 1
d
(1)
= η
1
p
(1)
; s
(1)
= η
1
˜ p
else
d
(i)
= η
i
p
(i)
+ (θ
i−1
γ
i
)
2
d
(i−1)
s
(i)
= η
i
˜ p + (θ
i−1
γ
i
)
2
s
(i−1)
endif
x
(i)
= x
(i−1)
+d
(i)
r
(i)
= r
(i−1)
−s
(i)
check convergence; continue if necessary
end
Figure 2.8: The Preconditioned Quasi Minimal Residual Method without Look-ahead
2.3. NONSTATIONARY ITERATIVE METHODS 23
Compute r
(0)
= b −Ax
(0)
for some initial guess x
(0)
Choose ˜ r (for example, ˜ r = r
(0)
)
for i = 1, 2, . . .
ρ
i−1
= ˜ r
T
r
(i−1)
if ρ
i−1
= 0 method fails
if i = 1
u
(1)
= r
(0)
p
(1)
= u
(1)
else
β
i−1
= ρ
i−1

i−2
u
(i)
= r
(i−1)

i−1
q
(i−1)
p
(i)
= u
(i)

i−1
(q
(i−1)

i−1
p
(i−1)
)
endif
solve Mˆ p = p
(i)
ˆ v = Aˆ p
α
i
= ρ
i−1
/˜ r
T
ˆ v
q
(i)
= u
(i)
−α
i
ˆ v
solve Mˆ u = u
(i)
+q
(i)
x
(i)
= x
(i−1)

i
ˆ u
ˆ q = Aˆ u
r
(i)
= r
(i−1)
−α
i
ˆ q
check convergence; continue if necessary
end
Figure 2.9: The Preconditioned Conjugate Gradient Squared Method
This suggests that if P
i
(A) reduces r
(0)
to a smaller vector r
(i)
, then it might be advantageous
to apply this “contraction” operator twice, and compute P
2
i
(A)r
(0)
. Equation (2.14) shows that the
iteration coefficients can still be recovered from these vectors, and it turns out to be easy to find
the corresponding approximations for x. This approach leads to the Conjugate Gradient Squared
method (see Sonneveld [191]).
Convergence
Often one observes a speed of convergence for CGS that is about twice as fast as for BiCG, which is
in agreement with the observation that the same “contraction” operator is applied twice. However,
there is no reason that the “contraction” operator, even if it really reduces the initial residual r
(0)
,
should also reduce the once reduced vector r
(k)
= P
k
(A)r
(0)
. This is evidenced by the often highly
irregular convergence behavior of CGS. One should be aware of the fact that local corrections to
the current solution may be so large that cancellation effects occur. This may lead to a less accurate
solution than suggested by the updated residual (see Van der Vorst [205]). The method tends to
diverge if the starting guess is close to the solution.
Implementation
CGS requires about the same number of operations per iteration as BiCG, but does not involve
computations with A
T
. Hence, in circumstances where computation with A
T
is impractical, CGS
may be attractive.
24 CHAPTER 2. ITERATIVE METHODS
Compute r
(0)
= b −Ax
(0)
for some initial guess x
(0)
Choose ˜ r (for example, ˜ r = r
(0)
)
for i = 1, 2, . . .
ρ
i−1
= ˜ r
T
r
(i−1)
if ρ
i−1
= 0 method fails
if i = 1
p
(i)
= r
(i−1)
else
β
i−1
= (ρ
i−1

i−2
)(α
i−1

i−1
)
p
(i)
= r
(i−1)

i−1
(p
(i−1)
−ω
i−1
v
(i−1)
)
endif
solve Mˆ p = p
(i)
v
(i)
= Aˆ p
α
i
= ρ
i−1
/˜ r
T
v
(i)
s = r
(i−1)
−α
i
v
(i)
check norm of s; if small enough: set x
(i)
= x
(i−1)

i
ˆ p and stop
solve Mˆ s = s
t = Aˆ s
ω
i
= t
T
s/t
T
t
x
(i)
= x
(i−1)

i
ˆ p +ω
i
ˆ s
r
(i)
= s −ω
i
t
check convergence; continue if necessary
for continuation it is necessary that ω
i
= 0
end
Figure 2.10: The Preconditioned BiConjugate Gradient Stabilized Method
The pseudocode for the Preconditioned Conjugate Gradient Squared Method with precondi-
tioner M is given in Figure 2.9.
2.3.8 BiConjugate Gradient Stabilized (Bi-CGSTAB)
The BiConjugate Gradient Stabilized method (Bi-CGSTAB) was developed to solve nonsymmet-
ric linear systems while avoiding the often irregular convergence patterns of the Conjugate Gra-
dient Squared method (see Van der Vorst [205]). Instead of computing the CGS sequence i →
P
2
i
(A)r
(0)
, Bi-CGSTAB computes i → Q
i
(A)P
i
(A)r
(0)
where Q
i
is an ith degree polynomial
describing a steepest descent update.
Convergence
Bi-CGSTAB often converges about as fast as CGS, sometimes faster and sometimes not. CGS can
be viewed as a method in which the BiCG “contraction” operator is applied twice. Bi-CGSTAB
can be interpreted as the product of BiCG and repeatedly applied GMRES(1). At least locally, a
residual vector is minimized, which leads to a considerably smoother convergence behavior. On
the other hand, if the local GMRES(1) step stagnates, then the Krylov subspace is not expanded,
and Bi-CGSTAB will break down. This is a breakdown situation that can occur in addition to the
other breakdown possiblities in the underlying BiCG algorithm. This type of breakdown may be
avoided by combining BiCG with other methods, i.e., by selecting other values for ω
i
(see the
2.3. NONSTATIONARY ITERATIVE METHODS 25
algorithm). One such alternative is Bi-CGSTAB2 (see Gutknecht [114]); more general approaches
are suggested by Sleijpen and Fokkema in [189].
Note that Bi-CGSTAB has two stopping tests: if the method has already converged at the first
test on the norm of s, the subsequent update would be numerically questionable. Additionally,
stopping on the first test saves a few unnecessary operations, but this is of minor importance.
Implementation
Bi-CGSTAB requires two matrix-vector products and four inner products, i.e., two inner products
more than BiCG and CGS.
The pseudocode for the Preconditioned BiConjugate Gradient Stabilized Method with precon-
ditioner M is given in Figure 2.10.
2.3.9 Chebyshev Iteration
Chebyshev Iteration is another method for solving nonsymmetric problems (see Golub and
Van Loan [108, '10.1.5] and Varga [209, Chapter 5]). Chebyshev Iteration avoids the computation
of inner products as is necessary for the other nonstationary methods. For some distributed memory
architectures these inner products are a bottleneck with respect to efficiency. The price one pays
for avoiding inner products is that the method requires enough knowledge about the spectrum
of the coefficient matrix A that an ellipse enveloping the spectrum can be identified; however
this difficulty can be overcome via an adaptive construction developed by Manteuffel [145], and
implemented by Ashby [7]. Chebyshev iteration is suitable for any nonsymmetric linear system
for which the enveloping ellipse does not include the origin.
Comparison with other methods
Comparing the pseudocode for Chebyshev Iteration with the pseudocode for the Conjugate Gra-
dient method shows a high degree of similarity, except that no inner products are computed in
Chebyshev Iteration.
Scalars c and d must be selected so that they define a family of ellipses with common center
d > 0 and foci d + c and d − c which contain the ellipse that encloses the spectrum (or more
general, field of values) of A and for which the rate r of convergence is minimal:
r =
a +

a
2
−c
2
d +

d
2
−c
2
, (2.15)
where a is the length of the x-axis of the ellipse.
We provide code in which it is assumed that c and d are known. For code including the adap-
tive detemination of these iteration parameters the reader is referred to Ashby [7]. The Chebyshev
method has the advantage over GMRES that only short recurrences are used. On the other hand,
GMRES is guaranteed to generate the smallest residual over the current search space. The BiCG
methods, which also use short recurrences, do not minimize the residual in a suitable norm; how-
ever, unlike Chebyshev iteration, they do not require estimation of parameters (the spectrum of A).
Finally, GMRES and BiCG may be more effective in practice, because of superlinear convergence
behavior, which cannot be expected for Chebyshev.
For symmetric positive definite systems the “ellipse” enveloping the spectrum degenerates to
the interval [λ
min
, λ
max
] on the positive x-axis, where λ
min
and λ
max
are the smallest and largest
eigenvalues of M
−1
A. In circumstances where the computation of inner products is a bottleneck,
it may be advantageous to start with CG, compute estimates of the extremal eigenvalues from the
CG coefficients, and then after sufficient convergence of these approximations switch to Chebyshev
Iteration. A similar strategy may be adopted for a switch from GMRES, or BiCG-type methods, to
Chebyshev Iteration.
26 CHAPTER 2. ITERATIVE METHODS
Compute r
(0)
= b −Ax
(0)
for some initial guess x
(0)
.
d = (λ
max

min
)/2, c = (λ
max
−λ
min
)/2.
for i = 1, 2, . . .
solve Mz
(i−1)
= r
(i)
.
if i = 1
p
(1)
= z
(0)
α
1
= 2/d
else
β
i−1
= (cα
i−1
/2)
2
α
i
= 1/(d −β
i−1
)
p
(i)
= z
(i−1)

i−1
p
(i−1)
.
endif
x
(i)
= x
(i−1)

i
p
(i)
.
r
(i)
= b −Ax
(i)
(= r
(i−1)
−α
i
Ap
(i)
).
check convergence; continue if necessary
end
Figure 2.11: The Preconditioned Chebyshev Method
Convergence
In the symmetric case (where A and the preconditioner M are both symmetric) for the Chebyshev
Iteration we have the same upper bound as for the Conjugate Gradient method, provided c and d are
computed from λ
min
and λ
max
(the extremal eigenvalues of the preconditioned matrix M
−1
A).
There is a severe penalty for overestimating or underestimating the field of values. For ex-
ample, if in the symmetric case λ
max
is underestimated, then the method may diverge; if it is
overestimated then the result may be very slow convergence. Similar statements can be made for
the nonsymmetric case. This implies that one needs fairly accurate bounds on the spectrum of
M
−1
A for the method to be effective (in comparison with CG or GMRES).
Implementation
In Chebyshev Iteration the iteration parameters are known as soon as one knows the ellipse con-
taining the eigenvalues (or rather, the field of values) of the operator. Therefore the computation
of inner products, as is necessary in methods like GMRES or CG, is avoided. This avoids the
synchronization points required of CG-type methods, so machines with hierarchical or distributed
memory may achieve higher performance (it also suggests strong parallelization properties; for a
discussion of this see Saad [184], and Dongarra, et al. [70]). Specifically, as soon as some segment
of w is computed, we may begin computing, in sequence, corresponding segments of p, x, and r.
The pseudocode for the Preconditioned Chebyshev Method with preconditioner M is given
in Figure 2.11. It handles the case of a symmetric positive definite coefficient matrix A. The
eigenvalues of M
−1
A are assumed to be all real and in the interval [λ
min
, λ
max
], which does not
include zero.
2.4 Computational Aspects of the Methods
Efficient solution of a linear system is largely a function of the proper choice of iterative method.
However, to obtain good performance, consideration must also be given to the computational ker-
2.4. COMPUTATIONAL ASPECTS OF THE METHODS 27
Matrix-
Inner Vector Precond
Method Product SAXPY Product Solve
JACOBI 1
a
GS 1 1
a
SOR 1 1
a
CG 2 3 1 1
GMRES i + 1 i + 1 1 1
BiCG 2 5 1/1 1/1
QMR 2 8+4
bc
1/1 1/1
CGS 2 6 2 2
Bi-CGSTAB 4 6 2 2
CHEBYSHEV 2 1 1
Table 2.1: Summary of Operations for Iteration i. “a/b” means “a” multiplications with the matrix
and “b” with its transpose.
a
This method performs no real matrix vector product or preconditioner solve, but the number of operations is equivalent
to a matrix-vector multiply.
b
True SAXPY operations + vector scalings.
c
Less for implementations that do not recursively update the residual.
nels of the method and how efficiently they can be executed on the target architecture. This point
is of particular importance on parallel architectures; see '4.4.
Iterative methods are very different from direct methods in this respect. The performance of
direct methods, both for dense and sparse systems, is largely that of the factorization of the matrix.
This operation is absent in iterative methods (although preconditioners may require a setup phase),
and with it, iterative methods lack dense matrix suboperations. Since such operations can be exe-
cuted at very high efficiency on most current computer architectures, we expect a lower flop rate for
iterative than for direct methods. (Dongarra and Van der Vorst [73] give some experimental results
about this, and provide a benchmark code for iterative solvers.) Furthermore, the basic operations
in iterative methods often use indirect addressing, depending on the data structure. Such operations
also have a relatively low efficiency of execution.
However, this lower efficiency of execution does not imply anything about the total solution
time for a given system. Furthermore, iterative methods are usually simpler to implement than
direct methods, and since no full factorization has to be stored, they can handle much larger sytems
than direct methods.
In this section we summarize for each method
• Matrix properties. Not every method will work on every problem type, so knowledge of
matrix properties is the main criterion for selecting an iterative method.
• Computational kernels. Different methods involve different kernels, and depending on the
problem or target computer architecture this may rule out certain methods.
Table 2.2 lists the storage required for each method (without preconditioning). Note that we
are not including the original system Ax = b and we ignore scalar storage.
1. Jacobi Method
• Extremely easy to use, but unless the matrix is “strongly” diagonally dominant, this
method is probably best only considered as an introduction to iterative methods or as a
preconditioner in a nonstationary method.
• Trivial to parallelize.
28 CHAPTER 2. ITERATIVE METHODS
Method Storage
Reqmts
JACOBI matrix + 3n
SOR matrix + 2n
CG matrix + 6n
GMRES matrix + (i + 5)n
BiCG matrix + 10n
CGS matrix + 11n
Bi-CGSTAB matrix + 10n
QMR matrix + 16n
c
CHEBYSHEV matrix + 5n
Table 2.2: Storage Requirements for the Methods in iteration i: n denotes the order of the matrix.
c
Less for implementations that do not recursively update the residual.
2. Gauss-Seidel Method
• Typically faster convergence than Jacobi, but in general not competitive with the non-
stationary methods.
• Applicable to strictly diagonally dominant, or symmetric positive definite matrices.
• Parallelization properties depend on structure of the coefficient matrix. Different order-
ings of the unknowns have different degrees of parallelism; multi-color orderings may
give almost full parallelism.
• This is a special case of the SOR method, obtained by choosing ω = 1.
3. Successive Over-Relaxation (SOR)
• Accelerates convergence of Gauss-Seidel (ω > 1, over-relaxation); may yield conver-
gence when Gauss-Seidel fails (0 < ω < 1, under-relaxation).
• Speed of convergence depends critically on ω; the optimal value for ω may be estimated
from the spectral radius of the Jacobi iteration matrix under certain conditions.
• Parallelization properties are the same as those of the Gauss-Seidel method.
4. Conjugate Gradient (CG)
• Applicable to symmetric positive definite systems.
• Speed of convergence depends on the condition number; if extremal eigenvalues are
well-separated then superlinear convergence behavior can result.
• Inner products act as synchronization points in a parallel environment.
• Further parallel properties are largely independent of the coefficient matrix, but depend
strongly on the structure the preconditioner.
5. Generalized Minimal Residual (GMRES)
• Applicable to nonsymmetric matrices.
• GMRES leads to the smallest residual for a fixed number of iteration steps, but these
steps become increasingly expensive.
• In order to limit the increasing storage requirments and work per iteration step, restart-
ing is necessary. When to do so depends on A and the right-hand side; it requires skill
and experience.
• GMRES requires only matrix-vector products with the coefficient matrix.
2.4. COMPUTATIONAL ASPECTS OF THE METHODS 29
• The number of inner products grows linearly with the iteration number, up to the restart
point. In an implementation based on a simple Gram-Schmidt process the inner prod-
ucts are independent, so together they imply only one synchronization point. A more
stable implementation based on modified Gram-Schmidt orthogonalization has one
synchronization point per inner product.
6. Biconjugate Gradient (BiCG)
• Applicable to nonsymmetric matrices.
• Requires matrix-vector products with the coefficient matrix and its transpose. This dis-
qualifies the method for cases where the matrix is only given implicitly as an operator,
since usually no corresponding transpose operator is available in such cases.
• Parallelization properties are similar to those for CG; the two matrix vector products
(as well as the preconditioning steps) are independent, so they can be done in parallel,
or their communication stages can be packaged.
7. Quasi-Minimal Residual (QMR)
• Applicable to nonsymmetric matrices.
• Designed to avoid the irregular convergence behavior of BiCG, it avoids one of the two
breakdown situations of BiCG.
• If BiCG makes significant progress in one iteration step, then QMR delivers about the
same result at the same step. But when BiCG temporarily stagnates or diverges, QMR
may still further reduce the residual, albeit very slowly.
• Computational costs per iteration are similar to BiCG, but slightly higher. The method
requires the transpose matrix-vector product.
• Parallelization properties are as for BiCG.
8. Conjugate Gradient Squared (CGS)
• Applicable to nonsymmetric matrices.
• Converges (diverges) typically about twice as fast as BiCG.
• Convergence behavior is often quite irregular, which may lead to a loss of accuracy in
the updated residual. Tends to diverge if the starting guess is close to the solution.
• Computational costs per iteration are similar to BiCG, but the method doesn’t require
the transpose matrix.
• Unlike BiCG, the two matrix-vector products are not independent, so the number of
synchronization points in a parallel environment is larger.
9. Biconjugate Gradient Stabilized (Bi-CGSTAB)
• Applicable to nonsymmetric matrices.
• Computational costs per iteration are similar to BiCG and CGS, but the method doesn’t
require the transpose matrix.
• An alternative for CGS that avoids the irregular convergence patterns of CGS while
maintaining about the same speed of convergence; as a result we often observe less
loss of accuracy in the updated residual.
10. Chebyshev Iteration
• Applicable to nonsymmetric matrices (but presented in this book only for the symmetric
case).
30 CHAPTER 2. ITERATIVE METHODS
• This method requires some explicit knowledge of the spectrum (or field of values); in
the symmetric case the iteration parameters are easily obtained from the two extremal
eigenvalues, which can be estimated either directly from the matrix, or from applying
a few iterations of the Conjugate Gradient Method.
• The computational structure is similar to that of CG, but there are no synchronization
points.
• The Adaptive Chebyshev method can be used in combination with methods as CG or
GMRES, to continue the iteration once suitable bounds on the spectrum have been
obtained from these methods.
Selecting the “best” method for a given class of problems is largely a matter of trial and error.
It also depends on how much storage one has available (GMRES), on the availability of A
T
(BiCG
and QMR), and on how expensive the matrix vector products (and Solve steps with M) are in
comparison to SAXPYs and inner products. If these matrix vector products are relatively expensive,
and if sufficient storage is available then it may be attractive to use GMRES and delay restarting as
much as possible.
Table 2.1 shows the type of operations performed per iteration. Based on the particular prob-
lem or data structure, the user may observe that a particular operation could be performed more
efficiently.
2.5 A short history of Krylov methods
1
Methods based on orthogonalization were developed by a number of authors in the early ’50s.
Lanczos’ method [141] was based on two mutually orthogonal vector sequences, and his motivation
came from eigenvalue problems. In that context, the most prominent feature of the method is that it
reduces the original matrix to tridiagonal form. Lanczos later applied his method to solving linear
systems, in particular symmetric ones [142]. An important property for proving convergence of
the method when solving linear systems is that the iterates are related to the initial residual by
multiplication with a polynomial in the coefficient matrix.
The joint paper by Hestenes and Stiefel [121], after their independent discovery of the same
method, is the classical description of the conjugate gradient method for solving linear systems.
Although error-reduction properties are proved, and experiments showing premature convergence
are reported, the conjugate gradient method is presented here as a direct method, rather than an
iterative method.
This Hestenes/Stiefel method is closely related to a reduction of the Lanczos method to sym-
metric matrices, reducing the two mutually orthogonal sequences to one orthogonal sequence, but
there is an important algorithmic difference. Whereas Lanczos used three-term recurrences, the
method by Hestenes and Stiefel uses coupled two-term recurrences. By combining the two two-
term recurrences (eliminating the “search directions”) the Lanczos method is obtained.
A paper by Arnoldi [6] further discusses the Lanczos biorthogonalization method, but it also
presents a new method, combining features of the Lanczos and Hestenes/Stiefel methods. Like
the Lanczos method it is applied to nonsymmetric systems, and it does not use search directions.
Like the Hestenes/Stiefel method, it generates only one, self-orthogonal sequence. This last fact,
combined with the asymmetry of the coefficient matrix means that the method no longer effects a
reduction to tridiagonal form, but instead one to upper Hessenberg form. Presented as “minimized
iterations in the Galerkin method” this algorithm has become known as the Arnoldi algorithm.
The conjugate gradient method received little attention as a practical method for some time,
partly because of a misperceived importance of the finite termination property. Reid [178] pointed
1
For a more detailed account of the early history of CG methods, we refer the reader to Golub and O’Leary [107] and
Hestenes [122].
2.6. SURVEY OF RECENT KRYLOV METHODS 31
out that the most important application area lay in sparse definite systems, and this renewed the
interest in the method.
Several methods have been developed in later years that employ, most often implicitly, the
upper Hessenberg matrix of the Arnoldi method. For an overview and characterization of these
orthogonal projection methods for nonsymmetric systems see Ashby, Manteuffel and Saylor [10],
Saad and Schultz [187], and Jea and Young [124].
Fletcher [97] proposed an implementation of the Lanczos method, similar to the Conjugate Gra-
dient method, with two coupled two-term recurrences, which he named the bi-conjugate gradient
method (BiCG).
2.6 Survey of recent Krylov methods
Research into the design of Krylov subspace methods for solving nonsymmetric linear systems is
an active field of research and new methods are still emerging. In this book, we have included only
the best known and most popular methods, and in particular those for which extensive computa-
tional experience has been gathered. In this section, we shall briefly highlight some of the recent
developments and other methods not treated here. A survey of methods up to about 1991 can be
found in Freund, Golub and Nachtigal [105]. Two more recent reports by Meier-Yang [150] and
Tong [195] have extensive numerical comparisons among various methods, including several more
recent ones that have not been discussed in detail in this book.
Several suggestions have been made to reduce the increase in memory and computational costs
in GMRES. An obvious one is to restart (this one is included in '2.3.4): GMRES(m). Another
approach is to restrict the GMRES search to a suitable subspace of some higher-dimensional Krylov
subspace. Methods based on this idea can be viewed as preconditioned GMRES methods. The
simplest ones exploit a fixed polynomial preconditioner (see Johnson, Micchelli and Paul [125],
Saad [182], and Nachtigal, Reichel and Trefethen [158]). In more sophisticated approaches, the
polynomial preconditioner is adapted to the iterations (Saad [186]), or the preconditioner may
even be some other (iterative) method of choice (Van der Vorst and Vuik [207], Axelsson and
Vassilevski [24]). Stagnation is prevented in the GMRESR method (Van der Vorst and Vuik [207])
by including LSQR steps in some phases of the process. In De Sturler and Fokkema [63], part of
the optimality of GMRES is maintained in the hybrid method GCRO, in which the iterations of
the preconditioning method are kept orthogonal to the iterations of the underlying GCR method.
All these approaches have advantages for some problems, but it is far from clear a priori which
strategy is preferable in any given case.
Recent work has focused on endowing the BiCG method with several desirable properties: (1)
avoiding breakdown; (2) avoiding use of the transpose; (3) efficient use of matrix-vector products;
(4) smooth convergence; and (5) exploiting the work expended in forming the Krylov space with
A
T
for further reduction of the residual.
As discussed before, the BiCG method can have two kinds of breakdown: Lanczos breakdown
(the underlying Lanczos process breaks down), and pivot breakdown (the tridiagonal matrix T
implicitly generated in the underlying Lanczos process encounters a zero pivot when Gaussian
elimination without pivoting is used to factor it). Although such exact breakdowns are very rare in
practice, near breakdowns can cause severe numerical stability problems.
The pivot breakdown is the easier one to overcome and there have been several approaches
proposed in the literature. It should be noted that for symmetric matrices, Lanczos breakdown
cannot occur and the only possible breakdown is pivot breakdown. The SYMMLQ and QMR
methods discussed in this book circumvent pivot breakdown by solving least squares systems.
Other methods tackling this problem can be found in Fletcher [97], Saad [180], Gutknecht [113],
and Bank and Chan [29, 28].
Lanczos breakdown is much more difficult to eliminate. Recently, considerable attention
has been given to analyzing the nature of the Lanczos breakdown (see Parlett [171], and
32 CHAPTER 2. ITERATIVE METHODS
Gutknecht [112, 115]), as well as various look-ahead techniques for remedying it (see Brezinski
and Sadok [39], Brezinski, Zaglia and Sadok [40, 41], Freund and Nachtigal [101], Parlett [171],
Nachtigal [159], Freund, Gutknecht and Nachtigal [100], Joubert [128], Freund, Golub and
Nachtigal [105], and Gutknecht [112, 115]). However, the resulting algorithms are usually too
complicated to give in template form (some codes of Freund and Nachtigal are available on
netlib.) Moreover, it is still not possible to eliminate breakdowns that require look-ahead
steps of arbitrary size (incurable breakdowns). So far, these methods have not yet received much
practical use but some form of look-ahead may prove to be a crucial component in future methods.
In the BiCG method, the need for matrix-vector multiplies with A
T
can be inconvenient as
well as doubling the number of matrix-vector multiplies compared with CG for each increase in
the degree of the underlying Krylov subspace. Several recent methods have been proposed to over-
come this drawback. The most notable of these is the ingenious CGS method by Sonneveld [191]
discussed earlier, which computes the square of the BiCG polynomial without requiring A
T
– thus
obviating the need for A
T
. When BiCG converges, CGS is often an attractive, faster converging
alternative. However, CGS also inherits (and often magnifies) the breakdown conditions and the
irregular convergence of BiCG (see Van der Vorst [205]).
CGS also generated interest in the possibility of product methods, which generate iterates
corresponding to a product of the BiCG polynomial with another polynomial of the same de-
gree, chosen to have certain desirable properties but computable without recourse to A
T
. The
Bi-CGSTAB method of Van der Vorst [205] is such an example, in which the auxiliary polynomial
is defined by a local minimization chosen to smooth the convergence behavior. Gutknecht [114]
noted that Bi-CGSTAB could be viewed as a product of BiCG and GMRES(1), and he suggested
combining BiCG with GMRES(2) for the even numbered iteration steps. This was anticipated to
lead to better convergence for the case where the eigenvalues of A are complex. A more efficient
and more robust variant of this approach has been suggested by Sleijpen and Fokkema in [189],
where they describe how to easily combine BiCG with any GMRES(m), for modest m.
Many other basic methods can also be squared. For example, by squaring the Lanczos pro-
cedure, Chan, de Pillis and Van der Vorst [45] obtained transpose-free implementations of BiCG
and QMR. By squaring the QMR method, Freund and Szeto [103] derived a transpose-free QMR
squared method which is quite competitive with CGS but with much smoother convergence. Un-
fortunately, these methods require an extra matrix-vector product per step (three instead of two)
which makes them less efficient.
In addition to Bi-CGSTAB, several recent product methods have been designed to smooth the
convergence of CGS. One idea is to use the quasi-minimal residual (QMR) principle to obtain
smoothed iterates from the Krylov subspace generated by other product methods. Freund [104]
proposed such a QMR version of CGS, which he called TFQMR. Numerical experiments show
that TFQMR in most cases retains the desirable convergence features of CGS while correcting its
erratic behavior. The transpose free nature of TFQMR, its low computational cost and its smooth
convergence behavior make it an attractive alternative to CGS. On the other hand, since the BiCG
polynomial is still used, TFQMR breaks down whenever CGS does. One possible remedy would be
to combine TFQMR with a look-ahead Lanczos technique but this appears to be quite complicated
and no methods of this kind have yet appeared in the literature. Recently, Chan et. al. [46] derived
a similar QMR version of Van der Vorst’s Bi-CGSTAB method, which is called QMRCGSTAB.
These methods offer smoother convergence over CGS and Bi-CGSTAB with little additional cost.
There is no clear best Krylov subspace method at this time, and there will never be a best
overall Krylov subspace method. Each of the methods is a winner in a specific problem class, and
the main problem is to identify these classes and to construct new methods for uncovered classes.
The paper by Nachtigal, Reddy and Trefethen [157] shows that for any of a group of methods (CG,
BiCG, GMRES, CGNE, and CGS), there is a class of problems for which a given method is the
winner and another one is the loser. This shows clearly that there will be no ultimate method. The
best we can hope for is some expert system that guides the user in his/her choice. Hence, iterative
methods will never reach the robustness of direct methods, nor will they beat direct methods for
2.6. SURVEY OF RECENT KRYLOV METHODS 33
all problems. For some problems, iterative schemes will be most attractive, and for others, direct
methods (or multigrid). We hope to find suitable methods (and preconditioners) for classes of very
large problems that we are yet unable to solve by any known method, because of CPU-restrictions,
memory, convergence problems, ill-conditioning, et cetera.
34 CHAPTER 2. ITERATIVE METHODS
Chapter 3
Preconditioners
3.1 The why and how
The convergence rate of iterative methods depends on spectral properties of the coefficient matrix.
Hence one may attempt to transform the linear system into one that is equivalent in the sense that it
has the same solution, but that has more favorable spectral properties. A preconditioner is a matrix
that effects such a transformation.
For instance, if a matrix M approximates the coefficient matrix Ain some way, the transformed
system
M
−1
Ax = M
−1
b
has the same solution as the original system Ax = b, but the spectral properties of its coefficient
matrix M
−1
A may be more favorable.
In devising a preconditioner, we are faced with a choice between finding a matrix M that
approximates A, and for which solving a system is easier than solving one with A, or finding
a matrix M that approximates A
−1
, so that only multiplication by M is needed. The majority
of preconditioners falls in the first category; a notable example of the second category will be
discussed in '3.5.
3.1.1 Cost trade-off
Since using a preconditioner in an iterative method incurs some extra cost, both initially for the
setup, and per iteration for applying it, there is a trade-off between the cost of constructing and
applying the preconditioner, and the gain in convergence speed. Certain preconditioners need little
or no construction phase at all (for instance the SSOR preconditioner), but for others, such as
incomplete factorizations, there can be substantial work involved. Although the work in scalar
terms may be comparable to a single iteration, the construction of the preconditioner may not be
vectorizable/parallelizable even if application of the preconditioner is. In that case, the initial cost
has to be amortized over the iterations, or over repeated use of the same preconditioner in multiple
linear systems.
Most preconditioners take in their application an amount of work proportional to the number
of variables. This implies that they multiply the work per iteration by a constant factor. On the
other hand, the number of iterations as a function of the matrix size is usually only improved by a
constant. Certain preconditioners are able to improve on this situation, most notably the modified
incomplete factorizations and preconditioners based on multigrid techniques.
On parallel machines there is a further trade-off between the efficacy of a preconditioner in
the classical sense, and its parallel efficiency. Many of the traditional preconditioners have a large
sequential component.
35
36 CHAPTER 3. PRECONDITIONERS
3.1.2 Left and right preconditioning
The above transformation of the linear system A → M
−1
A is often not what is used in practice.
For instance, the matrix M
−1
A is not symmetric, so, even if A and M are, the conjugate gradients
method is not immediately applicable to this system. The method as described in figure 2.5 reme-
dies this by employing the M
−1
-inner product for orthogonalization of the residuals. The theory
of the cg method is then applicable again.
All cg-type methods in this book, with the exception of QMR, have been derived with such a
combination of preconditioned iteration matrix and correspondingly changed inner product.
Another way of deriving the preconditioned conjugate gradients method would be to split the
preconditioner as M = M
1
M
2
and to transform the system as
M
−1
1
AM
−1
2
(M
2
x) = M
−1
1
b.
If M is symmetric and M
1
= M
t
2
, it is obvious that we now have a method with a symmetric
iteration matrix, hence the conjugate gradients method can be applied.
Remarkably, the splitting of M is in practice not needed. By rewriting the steps of the method
(see for instance Axelsson and Barker [14, pgs. 16,29] or Golub and Van Loan [108, '10.3]) it is
usually possible to reintroduce a computational step
solve u from Mu = v,
that is, a step that applies the preconditioner in its entirety.
There is a different approach to preconditioning, which is much easier to derive. Consider again
the system.
M
−1
1
AM
−1
2
(M
2
x) = M
−1
1
b.
The matrices M
1
and M
2
are called the left- and right preconditioners, respectively, and we can
simpy apply an unpreconditioned iterative method to this system. Only two additional actions
r
0
←M
−1
1
r
0
before the iterative process and x
n
←M
−1
2
x
n
after are necessary.
Thus we arrive at the following schematic for deriving a left/right preconditioned iterative
method from any of the symmetrically preconditioned methods in this book.
1. Take a preconditioned iterative method, and replace every occurence of M by I.
2. Remove any vectors from the algorithm that have become duplicates in the previous step.
3. Replace every occurrence of A in the method by M
−1
1
AM
−1
2
.
4. After the calculation of the initial residual, add the step
r
0
←M
−1
1
r
0
.
5. At the end of the method, add the step
x ←M
−1
2
x,
where x is the final calculated solution.
It should be noted that such methods cannot be made to reduce to the algorithms given in section 2.3
by such choices as M
1
= M
t
2
or M
1
= I.
3.2. JACOBI PRECONDITIONING 37
3.2 Jacobi Preconditioning
The simplest preconditioner consists of just the diagonal of the matrix:
m
i,j
=

a
i,i
if i = j
0 otherwise.
This is known as the (point) Jacobi preconditioner.
It is possible to use this preconditioner without using any extra storage beyond that of the matrix
itself. However, division operations are usually quite costly, so in practice storage is allocated for
the reciprocals of the matrix diagonal. This strategy applies to many preconditioners below.
3.2.1 Block Jacobi Methods
Block versions of the Jacobi preconditioner can be derived by a partitioning of the variables. If the
index set S = ¦1, . . . , n¦ is partitioned as S =
¸
i
S
i
with the sets S
i
mutually disjoint, then
m
i,j
=

a
i,j
if i and j are in the same index subset
0 otherwise.
The preconditioner is now a block-diagonal matrix.
Often, natural choices for the partitioning suggest themselves:
• In problems with multiple physical variables per node, blocks can be formed by grouping the
equations per node.
• In structured matrices, such as those from partial differential equations on regular grids, a
partitioning can be based on the physical domain. Examples are a partitioning along lines in
the 2D case, or planes in the 3D case. This will be discussed further in '3.4.3.
• On parallel computers it is natural to let the partitioning coincide with the division of vari-
ables over the processors.
3.2.2 Discussion
Jacobi preconditioners need very little storage, even in the block case, and they are easy to imple-
ment. Additionally, on parallel computers they don’t present any particular problems.
On the other hand, more sophisticated preconditioners usually yield a larger improvement.
1
3.3 SSOR preconditioning
The SSOR preconditioner
2
like the Jacobi preconditioner, can be derived from the coefficient ma-
trix without any work.
If the original, symmetric, matrix is decomposed as
A = D +L +L
T
1
Under certain conditions, one can show that the point Jacobi algorithm is optimal, or close to optimal, in the sense
of reducing the condition number, among all preconditioners of diagonal form. This was shown by Forsythe and Strauss
for matrices with Property A [98], and by van der Sluis [196] for general sparse matrices. For extensions to block Jacobi
preconditioners, see Demmel [65] and Elsner [94].
2
The SOR and Gauss-Seidel matrices are never used as preconditioners, for a rather technical reason. SOR-
preconditioning with optimal ω maps the eigenvalues of the coefficient matrix to a circle in the complex plane; see Hageman
and Young [119, §9.3]. In this case no polynomial acceleration is possible, i.e., the accelerating polynomial reduces to the
trivial polynomial Pn(x) = x
n
, and the resulting method is simply the stationary SOR method. Recent research by Eier-
mann and Varga [83] has shown that polynomial acceleration of SOR with suboptimal ω will yield no improvement over
simple SOR with optimal ω.
38 CHAPTER 3. PRECONDITIONERS
in its diagonal, lower, and upper triangular part, the SSOR matrix is defined as
M = (D +L)D
−1
(D +L)
T
,
or, parametrized by ω
M(ω) =
1
2 −ω
(
1
ω
D +L)(
1
ω
D)
−1
(
1
ω
D +L)
T
.
The optimal value of the ω parameter, like the parameter in the SOR method, will reduce the
number of iterations to a lower order. Specifically, for second order eliptic problems a spectral
condition number κ(M
−1
ωopt
A) = O(

κ(A)) is attainable (see Axelsson and Barker [14, '1.4]).
In practice, however, the spectral information needed to calculate the optimal ω is prohibitively
expensive to compute.
The SSOR matrix is given in factored form, so this preconditioner shares many properties of
other factorization-based methods (see below). For instance, its suitability for vector processors or
parallel architectures depends strongly on the ordering of the variables. On the other hand, since
this factorization is given a priori, there is no possibility of breakdown as in the construction phase
of incomplete factorization methods.
3.4 Incomplete Factorization Preconditioners
A broad class of preconditioners is based on incomplete factorizations of the coefficient matrix.
We call a factorization incomplete if during the factorization process certain fill elements, nonzero
elements in the factorization in positions where the original matrix had a zero, have been ignored.
Such a preconditioner is then given in factored formM = LU with L lower and U upper triangular.
The efficacy of the preconditioner depends on how well M
−1
approximates A
−1
.
3.4.1 Creating an incomplete factorization
Incomplete factorizations are the first preconditioners we have encountered so far for which there
is a non-trivial creation stage. Incomplete factorizations may break down (attempted division by
zero pivot) or result in indefinite matrices (negative pivots) even if the full factorization of the same
matrix is guaranteed to exist and yield a positive definite matrix.
An incomplete factorization is guaranteed to exist for many factorization strategies if the orig-
inal matrix is an M-matrix. This was originally proved by Meijerink and Van der Vorst [151]; see
further Beauwens and Quenon [33], Manteuffel [146], and Van der Vorst [198].
In cases where pivots are zero or negative, strategies have been proposed such as substituting an
arbitrary positive number (see Kershaw [131]), or restarting the factorization on A + αI for some
positive value of α (see Manteuffel [146]).
An important consideration for incomplete factorization preconditioners is the cost of the fac-
torization process. Even if the incomplete factorization exists, the number of operations involved in
creating it is at least as much as for solving a system with such a coefficient matrix, so the cost may
equal that of one or more iterations of the iterative method. On parallel computers this problem is
aggravated by the generally poor parallel efficiency of the factorization.
Such factorization costs can be amortized if the iterative method takes many iterations, or if the
same preconditioner will be used for several linear systems, for instance in successive time steps
or Newton iterations.
Solving a system with an incomplete factorization preconditioner
Incomplete factorizations can be given in various forms. If M = LU (with L and U nonsingular
triangular matrices), solving a system proceeds in the usual way (figure 3.1), but often incomplete
3.4. INCOMPLETE FACTORIZATION PRECONDITIONERS 39
Let M = LU and y be given.
for i = 1, 2, . . .
z
i
=
−1
ii
(y
i

¸
j<i

ij
z
j
)
for i = n, n −1, n −2, . . .
x
i
= u
−1
ii
(z
i

¸
j>i
u
ij
x
j
)
Figure 3.1: Preconditioner solve of a system Mx = y, with M = LU
Let M = (D +L)(I +D
−1
U) and y be given.
for i = 1, 2, . . .
z
i
= d
−1
ii
(y
i

¸
j<i

ij
z
j
)
for i = n, n −1, n −2, . . .
x
i
= z
i
−d
−1
ii
¸
j>i
u
ij
x
j
Figure 3.2: Preconditioner solve of a system Mx = y, with M = (D + L)D
−1
(D + U) =
(D +L)(I +D
−1
U).
factorizations are given as M = (D+L)D
−1
(D+U) (with D diagonal, and L and U now strictly
triangular matrices, determined through the factorization process). In that case, one could use either
of the following equivalent formulations for Mx = y:
(D +L)z = y, (I +D
−1
U)x = z
or
(I +LD
−1
)z = y, (D +U)x = z.
In either case, the diagonal elements are used twice (not three times as the formula for M would
lead one to expect), and since only divisions with D are performed, storing D
−1
explicitly is the
practical thing to do. At the cost of some extra storage, one could store LD
−1
or D
−1
U, thereby
saving some computation. Solving a system using the first formulation is outlined in figure 3.2.
The second formulation is slightly harder to implement.
3.4.2 Point incomplete factorizations
The most common type of incomplete factorization is based on taking a set S of matrix positions,
and keeping all positions outside this set equal to zero during the factorization. The resulting
factorization is incomplete in the sense that fill is supressed.
The set S is usually chosen to encompass all positions (i, j) for which a
i,j
= 0. A position that
is zero in A but not so in an exact factorization
3
is called a fill position, and if it is outside S, the
fill there is said to be “discarded”. Often, S is chosen to coincide with the set of nonzero positions
in A, discarding all fill. This factorization type is called the ILU(0) factorization: the Incomplete
LU factorization of level zero
4
.
^3 To be precise, if we make an incomplete factorization M = LU, we refer to positions in L and U when we talk of positions in the factorization. The matrix M will have more nonzeros than L and U combined.
^4 The zero refers to the fact that only “level zero” fill is permitted, that is, nonzero elements of the original matrix. Fill levels are defined by calling an element of level k + 1 if it is caused by elements at least one of which is of level k. The first fill level is that caused by the original matrix elements.
Let S be the nonzero set {(i, j): a_{ij} ≠ 0}
for i = 1, 2, . . .
    set d_{ii} ← a_{ii}
for i = 1, 2, . . .
    set d_{ii} ← 1/d_{ii}
    for j = i + 1, i + 2, . . .
        if (i, j) ∈ S and (j, i) ∈ S then
            set d_{jj} ← d_{jj} − a_{ji} d_{ii} a_{ij}
Figure 3.3: Construction of a D-ILU incomplete factorization preconditioner, storing the inverses
of the pivots
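As an illustration, here is a minimal Python sketch of the construction in figure 3.3. It assumes A is given as a SciPy sparse matrix and takes S to be the nonzero set of A; the names are illustrative, and the loop is written for clarity rather than efficiency.

import scipy.sparse as sp

def dilu_pivots(A):
    # Returns the diagonal of D^{-1} for the D-ILU preconditioner of figure 3.3.
    A = A.tocsr()
    n = A.shape[0]
    d = A.diagonal().copy()                  # first loop: d_ii <- a_ii
    for i in range(n):
        d[i] = 1.0 / d[i]                    # store the inverse of the pivot
        for j in A.getrow(i).indices:        # (i, j) in S
            if j > i and A[j, i] != 0:       # and (j, i) in S
                d[j] -= A[j, i] * d[i] * A[i, j]
    return d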
We can describe an incomplete factorization formally as

for each k, i, j > k:
    a_{i,j} ← a_{i,j} − a_{i,k} a_{k,k}^{-1} a_{k,j}   if (i, j) ∈ S,
    a_{i,j} ← a_{i,j}                                  otherwise.
Meijerink and Van der Vorst [151] proved that, if A is an M-matrix, such a factorization exists for
any choice of S, and gives a symmetric positive definite matrix if A is symmetric positive definite.
Guidelines for allowing levels of fill were given by Meijerink and Van der Vorst in [152].
Fill-in strategies
There are two major strategies for accepting or discarding fill-in, one structural, and one numerical.
The structural strategy is that of accepting fill-in only to a certain level. As was already pointed out
above, any zero location (i, j) in A filling in (say in step k) is assigned a fill level value
    fill(i, j) = 1 + max{fill(i, k), fill(k, j)}.    (3.1)
If a_{ij} was already nonzero, the level value is not changed.
The numerical fill strategy is that of ‘drop tolerances’: fill is ignored if it is too small, for
a suitable definition of ‘small’. Although this definition makes more sense mathematically, it is
harder to implement in practice, since the amount of storage needed for the factorization is not
easy to predict. See [20, 156] for discussions of preconditioners using drop tolerances.
Simple cases: ILU(0) and D-ILU
For the ILU(0) method, the incomplete factorization produces no nonzero elements beyond the
sparsity structure of the original matrix, so that the preconditioner at worst takes exactly as much
space to store as the original matrix. In a simplified version of ILU(0), called D-ILU (Pommerell [173]), even less is needed. If, in addition to prohibiting fill-in elements, we alter only
the diagonal elements (that is, any alterations of off-diagonal elements are ignored^5), we have the
following situation.
Splitting the coefficient matrix into its diagonal, lower triangular, and upper triangular parts as
A = D_A + L_A + U_A, the preconditioner can be written as M = (D + L_A)D^{-1}(D + U_A), where D is
the diagonal matrix containing the pivots generated. Generating this preconditioner is described in
figure 3.3. Since we use the upper and lower triangle of the matrix unchanged, only storage space
for D is needed. In fact, in order to avoid division operations during the preconditioner solve stage
we store D^{-1} rather than D.
^5 In graph theoretical terms, ILU(0) and D-ILU coincide if the matrix graph contains no triangles.
Remark: the resulting lower and upper factors of the preconditioner have only nonzero elements
in the set S, but this fact is in general not true for the preconditioner M itself.
The fact that the D-ILU preconditioner contains the off-diagonal parts of the original matrix
was used by Eisenstat [90] to derive a more efficient implementation of preconditioned CG.
This new implementation merges the application of the triangular factors of the matrix and the
preconditioner, thereby saving a substantial number of operations per iteration.
Special cases: central differences
We will now consider the special case of a matrix derived from central differences on a Cartesian
product grid. In this case the ILU(0) and D-ILU factorizations coincide, and, as remarked above,
we only have to calculate the pivots of the factorization; other elements in the triangular factors are
equal to off-diagonal elements of A.
In the following we will assume a natural, line-by-line, ordering of the grid points.
Letting i,j be coordinates in a regular 2D grid, it is easy to see that the pivot on grid point (i, j)
is only determined by pivots on points (i −1, j) and (i, j −1). If there are n points on each of m
grid lines, we get the following generating relations for the pivots:
    d_{i,i} =
        a_{1,1}                                                           if i = 1,
        a_{i,i} − a_{i,i−1} d_{i−1}^{-1} a_{i−1,i}                         if 1 < i ≤ n,
        a_{i,i} − a_{i,i−n} d_{i−n}^{-1} a_{i−n,i}                         if i = kn + 1 with k ≥ 1,
        a_{i,i} − a_{i,i−1} d_{i−1}^{-1} a_{i−1,i} − a_{i,i−n} d_{i−n}^{-1} a_{i−n,i}   otherwise.
Conversely, we can describe the factorization algorithmically as
Initially: d_{i,i} = a_{i,i} for all i
for i = 1..nm do:
    d_{i+1,i+1} = d_{i+1,i+1} − a_{i+1,i} d_{i,i}^{-1} a_{i,i+1}   if there is no k such that i = kn
    d_{i+n,i+n} = d_{i+n,i+n} − a_{i+n,i} d_{i,i}^{-1} a_{i,i+n}   if i + n ≤ nm
In the above we have assumed that the variables in the problem are ordered according to the
so-called “natural ordering”: a sequential numbering of the grid lines and the points within each
grid line. Below we will encounter different orderings of the variables.
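As an illustration, the recurrence above can be coded directly. The following minimal Python sketch assumes A is a SciPy sparse matrix for a five-point stencil in the natural ordering, with n points per grid line; it uses zero-based indices and illustrative names.

import scipy.sparse as sp

def dilu_pivots_5pt(A, n):
    # Pivots of the D-ILU (= ILU(0)) factorization for a central difference matrix.
    A = A.tocsr()
    N = A.shape[0]
    d = A.diagonal().copy()                      # initially d_i = a_{i,i}
    for i in range(N):
        if (i + 1) % n != 0 and i + 1 < N:       # i+1 lies on the same grid line
            d[i + 1] -= A[i + 1, i] * A[i, i + 1] / d[i]
        if i + n < N:                            # coupling to the next grid line
            d[i + n] -= A[i + n, i] * A[i, i + n] / d[i]
    return d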
Modified incomplete factorizations
One modification to the basic idea of incomplete factorizations is as follows: If the product
a_{i,k} a_{k,k}^{-1} a_{k,j} is nonzero, and fill is not allowed in position (i, j), instead of simply discarding this
fill quantity, subtract it from the diagonal element a_{i,i}. Such a factorization scheme is usually
called a “modified incomplete factorization”.
Mathematically this corresponds to forcing the preconditioner to have the same row sums as
the original matrix. One reason for considering modified incomplete factorizations is the behavior
of the spectral condition number of the preconditioned system. It was mentioned above that for
second order elliptic equations the condition number of the coefficient matrix is O(h^{-2}) as a function
of the discretization mesh width. This order of magnitude is preserved by simple incomplete
factorizations, although usually a reduction by a large constant factor is obtained.
Modified factorizations are of interest because, in combination with small perturbations, the
spectral condition number of the preconditioned system can be of a lower order. It was first proved
by Dupont, Kendall and Rachford [80] that a modified incomplete factorization of A + O(h^2)D_A
gives κ(M^{-1}A) = O(h^{-1}) for the central difference case. More general proofs are given by
Gustafsson [111], Axelsson and Barker [14, §7.2], and Beauwens [31, 32].
for k = 1, . . . , 2n − 1
    do in parallel for i = max(1, k + 1 − n), . . . , min(n, k)
        j = k − i + 1
        x_{i+(j−1)n} ← D_{i+(j−1)n} ( y_{i+(j−1)n} − L_{i+(j−1)n, i−1+(j−1)n} x_{i−1+(j−1)n} − L_{i+(j−1)n, i+(j−2)n} x_{i+(j−2)n} )
Figure 3.4: Wavefront solution of (D + L)x = y from a central difference problem on a domain
of n × n points.
Instead of keeping row sums constant, one can also keep column sums constant. In compu-
tational fluid mechanics this idea is justified with the argument that the material balance stays
constant over all iterates. (Equivalently, one wishes to avoid ‘artificial diffusion’.) Appleyard and
Cheshire [4] observed that if A and M have the same column sums, the choice
    x_0 = M^{-1}b
guarantees that the sum of the elements in r_0 (the material balance error) is zero, and that all further
r_i have elements summing to zero.
Modified incomplete factorizations can break down, especially when the variables are num-
bered other than in the natural row-by-row ordering. This was noted by Chan and Kuo [50], and a
full analysis was given by Eijkhout [85] and Notay [160].
A slight variant of modified incomplete factorizations consists of the class of “relaxed incom-
plete factorizations”. Here the fill is multiplied by a parameter 0 < α < 1 before it is subtracted
from the diagonal; see Ashcraft and Grimes [11], Axelsson and Lindskog [18, 19], Chan [44],
Eijkhout [85], Notay [161], Stone [193], and Van der Vorst [202]. For the dangers of MILU in the
presence of rounding error, see Van der Vorst [204].
Vectorization of the preconditioner solve
At first it may appear that the sequential time of solving a factorization is of the order of the number
of variables, but things are not quite that bad. Consider the special case of central differences on a
regular domain of n × n points. The variables on any diagonal in the domain, that is, in locations
(i, j) with i + j = k, depend only on those on the previous diagonal, that is, with i + j = k − 1.
Therefore it is possible to process the operations on such a diagonal, or ‘wavefront’, in parallel (see
figure 3.4), or have a vector computer pipeline them; see Van der Vorst [201, 203].
Another way of vectorizing the solution of the triangular factors is to use some form of expan-
sion of the inverses of the factors. Consider for a moment a lower triangular matrix, normalized
to the form I − L (where L is strictly lower triangular). Its inverse can be given as either of the
following two series:
    (I − L)^{-1} = I + L + L^2 + L^3 + · · ·
                 = (I + L)(I + L^2)(I + L^4) · · ·                    (3.2)
(The first series is called a “Neumann expansion”, the second an “Euler expansion”. Both series are
finite, but their length prohibits practical use of this fact.) Parallel or vectorizable preconditioners
can be derived from an incomplete factorization by taking a small number of terms in either series.
Experiments indicate that a small number of terms, while giving high execution rates, yields almost
the full precision of the more recursive triangular solution (see Axelsson and Eijkhout [15] and
Van der Vorst [199]).
Let M = (I +L)D(I +U) and y be given.
t ←y
for k = 1, . . . , p
t ←y −Lt
x ← D^{-1} t, t ← x
for k = 1, . . . , p
x ←t −Ux
Figure 3.5: Preconditioning step algorithm for a Neumann expansion M^{(p)} ≈ M^{-1} of an incomplete factorization M = (I + L)D(I + U).
There are some practical considerations in implementing these expansion algorithms. For in-
stance, because of the normalization the L in equation (3.2) is not L_A. Rather, if we have a preconditioner (as described in section 3.4.2) described by
    A = D_A + L_A + U_A,   M = (D + L_A)D^{-1}(D + U_A),
then we write
    M = (I + L)D(I + U),   where L = L_A D^{-1}, U = D^{-1} U_A.
Now we can choose whether or not to store the product L_A D^{-1}. Doing so doubles the storage
requirements for the matrix; not doing so means that separate multiplications by L_A and D^{-1} have
to be performed in the expansion.
Suppose then that the products L = L_A D^{-1} and U = D^{-1} U_A have been stored. We then
define M^{(p)} ≈ M^{-1} by
    M^{(p)} = ( Σ_{k=0}^{p} (−U)^k ) D^{-1} ( Σ_{k=0}^{p} (−L)^k ),    (3.3)
and replace solving a system Mx = y for x by computing x = M^{(p)} y. This algorithm is given in
figure 3.5.
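As an illustration, here is a minimal Python sketch of the preconditioning step in figure 3.5 (Horner evaluation of the two truncated series). It assumes L and U are stored as SciPy sparse matrices and dinv holds the diagonal of D^{-1}; the names are illustrative.

import numpy as np
import scipy.sparse as sp

def neumann_apply(L, U, dinv, y, p):
    # Compute x = M_(p) y, an approximation to M^{-1} y for M = (I + L) D (I + U).
    t = y.copy()
    for _ in range(p):           # t <- sum_{k=0..p} (-L)^k y
        t = y - L @ t
    x = dinv * t                 # x <- D^{-1} t
    t = x.copy()
    for _ in range(p):           # x <- sum_{k=0..p} (-U)^k t
        x = t - U @ x
    return x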
Parallelizing the preconditioner solve
The algorithms for vectorization outlined above can be used on parallel computers. For instance,
variables on a wavefront can be processed in parallel, by dividing the wavefront over processors.
More radical approaches for increasing the parallelism in incomplete factorizations are based on
a renumbering of the problem variables. For instance, on rectangular domains one could start
numbering the variables from all four corners simultaneously, thereby creating four simultaneous
wavefronts, and therefore four-fold parallelism (see Dongarra, et al. [70], Van der Vorst [200, 202]).
The most extreme case is the red/black ordering (or for more general matrices the multi-color
ordering) which gives the absolute minimum number of sequential steps.
Multi-coloring is also an attractive method for vector computers. Since points of one color
are uncoupled, they can be processed as one vector; see Doi [67], Melhem [153], and Poole and
Ortega [175].
However, for such ordering strategies there is usually a trade-off between the degree of paral-
lelism and the resulting number of iterations. The reason for this is that a different ordering may
give rise to a different error matrix, in particular the norm of the error matrix may vary considerably
between orderings. See experimental results by Duff and Meurant [78] and a partial explanation of
them by Eijkhout [84].
for i = 1, 2, . . .
    X_i ← A_{ii}
for i = 1, 2, . . .
    Let Y_i ≈ X_i^{-1}
    for j = i + 1, i + 2, . . .
        if A_{ij} ≠ 0 and A_{ji} ≠ 0,
            then X_j ← X_j − A_{ji} Y_i A_{ij}
Figure 3.6: Block version of a D-ILU factorization
3.4.3 Block factorization methods
We can also consider block variants of preconditioners for accelerated methods. Block methods
are normally feasible if the problem domain is a Cartesian product grid; in that case a natural
division in lines (or planes in the 3-dimensional case) can be used for blocking, though incomplete
factorizations are not as effective in the 3-dimensional case; see for instance Kettler [133]. In such
a blocking scheme for Cartesian product grids, both the size and the number of the blocks increase
with the overall problem size.
The idea behind block factorizations
The starting point for an incomplete block factorization is a partitioning of the matrix, as mentioned
in §3.2.1. Then an incomplete factorization is performed using the matrix blocks as basic entities
(see Axelsson [12] and Concus, Golub and Meurant [56] as basic references).
The most important difference with point methods arises in the inversion of the pivot blocks.
Whereas inverting a scalar is easily done, in the block case two problems arise. First, inverting the
pivot block is likely to be a costly operation. Second, initially the diagonal blocks of the matrix
are likely to be sparse and we would like to maintain this type of structure throughout the
factorization. Hence the need for approximations of inverses arises.
In addition to this, often fill-in in off-diagonal blocks is discarded altogether. Figure 3.6
describes an incomplete block factorization that is analogous to the D-ILU factorization (sec-
tion 3.4.2) in that it only updates the diagonal blocks.
As in the case of incomplete point factorizations, the existence of incomplete block methods is
guaranteed if the coefficient matrix is an M-matrix. For a general proof, see Axelsson [13].
Approximate inverses
In block factorizations a pivot block is generally forced to be sparse, typically of banded form,
and we need an approximation to its inverse that has a similar structure. Furthermore, this
approximation should be easily computable, so we rule out the option of calculating the full inverse
and taking a banded part of it.
The simplest approximation to A^{-1} is the diagonal matrix D of the reciprocals of the diagonal
of A: d_{i,i} = 1/a_{i,i}.
Other possibilities were considered by Axelsson and Eijkhout [15], Axelsson and Polman [21],
Concus, Golub and Meurant [56], Eijkhout and Vassilevski [89], Kolotilina and Yeremin [140],
and Meurant [154]. One particular example is given in figure 3.7. It has the attractive theoretical
property that, if the original matrix is symmetric positive definite and a factorization with positive
diagonal D can be made, the approximation to the inverse is again symmetric positive definite.
Let X be a banded matrix,
factor X = (I + L)D^{-1}(I + U),
let Y = (I − U)D(I − L)
Figure 3.7: Algorithm for approximating the inverse of a banded matrix
X_1 = A_{1,1}
for i ≥ 1
    let Y_i ≈ X_i^{-1}
    X_{i+1} = A_{i+1,i+1} − A_{i+1,i} Y_i A_{i,i+1}
Figure 3.8: Incomplete block factorization of a block tridiagonal matrix
Banded approximations to the inverse of banded matrices have a theoretical justification. In
the context of partial differential equations the diagonal blocks of the coefficient matrix are usually
strongly diagonally dominant. For such matrices, the elements of the inverse have a size that is ex-
ponentially decreasing in their distance from the main diagonal. See Demko, Moss and Smith [64]
for a general proof, and Eijkhout and Polman [88] for a more detailed analysis in the M-matrix
case.
The special case of block tridiagonality
In many applications, a block tridiagonal structure can be found in the coefficient matrix. Examples
are problems on a 2D regular grid if the blocks correspond to lines of grid points, and problems
on a regular 3D grid, if the blocks correspond to planes of grid points. Even if such a block
tridiagonal structure does not arise naturally, it can be imposed by renumbering the variables in a
Cuthill-McKee ordering [59].
Such a matrix has incomplete block factorizations of a particularly simple nature: since no fill
can occur outside the diagonal blocks (A_{i,i}), all properties follow from our treatment of the pivot
blocks. The generating recurrence for the pivot blocks also takes a simple form, see figure 3.8.
After the factorization we are left with sequences of X_i blocks forming the pivots, and of Y_i blocks
approximating their inverses.
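As an illustration, here is a minimal Python sketch of the recurrence in figure 3.8 for a block tridiagonal matrix. It assumes the blocks are given as lists of dense NumPy arrays and, for the approximate inverses Y_i, simply uses the diagonal approximation mentioned above; the names are illustrative.

import numpy as np

def block_tridiag_factor(Adiag, Alower, Aupper):
    # Adiag[i] = A_{i,i}, Alower[i] = A_{i+1,i}, Aupper[i] = A_{i,i+1}.
    X = [Adiag[0].copy()]
    Y = []
    for i in range(len(Alower)):
        Y.append(np.diag(1.0 / np.diag(X[i])))               # Y_i ~ X_i^{-1}
        X.append(Adiag[i + 1] - Alower[i] @ Y[i] @ Aupper[i])
    Y.append(np.diag(1.0 / np.diag(X[-1])))
    return X, Y                                              # pivots and approximate inverses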
Two types of incomplete block factorizations
One reason that block methods are of interest is that they are potentially more suitable for vector
computers and parallel architectures. Consider the block factorization
    A = (D + L)D^{-1}(D + U) = (D + L)(I + D^{-1}U)
where D is the block diagonal matrix of pivot blocks,^6 and L, U are the block lower and upper
triangle of the factorization; they coincide with L_A, U_A in the case of a block tridiagonal matrix.
We can turn this into an incomplete factorization by replacing the block diagonal matrix of
pivots D by the block diagonal matrix of incomplete factorization pivots X = diag(X_i), giving
    M = (X + L)(I + X^{-1}U)
^6 Writing (I + L_A D^{-1})(D + U_A) is equally valid, but in practice harder to implement.
For factorizations of this type (which covers all methods in Concus, Golub and Meurant [56] and
Kolotilina and Yeremin [140]) solving a linear system means solving smaller systems with the X_i
matrices.
Alternatively, we can replace D by the inverse of the block diagonal matrix of the approximations to the inverses of the pivots, Y = diag(Y_i), giving
    M = (Y^{-1} + L)(I + Y U).
For this second type (which was discussed by Meurant [154], Axelsson and Polman [21] and Axelsson and Eijkhout [15]) solving a system with M entails multiplying by the Y_i blocks. Therefore,
the second type has a much higher potential for vectorizability. Unfortunately, such a factorization
is theoretically more troublesome; see the above references or Eijkhout and Vassilevski [89].
3.4.4 Blocking over systems of partial differential equations
If the physical problem has several variables per grid point, that is, if there are several coupled
partial differential equations, it is possible to introduce blocking in a natural way.
Blocking of the equations (which gives a small number of very large blocks) was used by
Axelsson and Gustafsson [17] for the equations of linear elasticity, and blocking of the variables
per node (which gives many very small blocks) was used by Aarden and Karlsson [1] for the
semiconductor equations. A systematic comparison of the two approaches was made by Bank, et
al. [26].
3.4.5 Incomplete LQ factorizations
Saad [183] proposes to construct an incomplete LQ factorization of a general sparse matrix. The
idea is to orthogonalize the rows of the matrix by a Gram-Schmidt process (note that in sparse
matrices, most rows are typically orthogonal already, so that standard Gram-Schmidt may be not so
bad as in general). Saad suggests dropping strategies for the fill-in produced in the orthogonalization
process. It turns out that the resulting incomplete L factor can be viewed as the incomplete Choleski
factor of the matrix AA^T. Experiments show that using L in a CG process for the normal equations
L^{-1}AA^T L^{-T} y = b is effective for some relevant problems.
3.5 Polynomial preconditioners
So far, we have described preconditioners in only one of two classes: those that approximate the
coefficient matrix, and where linear systems with the preconditioner as coefficient matrix are easier
to solve than the original system. Polynomial preconditioners can be considered as members of the
second class of preconditioners: direct approximations of the inverse of the coefficient matrix.
Suppose that the coefficient matrix A of the linear system is normalized to the form A = I − B,
and that the spectral radius of B is less than one. Using the Neumann series, we can write the
inverse of A as A^{-1} = Σ_{k=0}^{∞} B^k, so an approximation may be derived by truncating this infinite
series. Since the iterative methods we are considering are already based on the idea of applying
polynomials in the coefficient matrix to the initial residual, there are analytic connections between
the basic method and the polynomially accelerated one.
Dubois, Greenbaum and Rodrigue [76] investigated the relationship between a basic method
using a splitting A = M −N, and a polynomially preconditioned method with
    M_p^{-1} = ( Σ_{i=0}^{p-1} (I − M^{-1}A)^i ) M^{-1}.
The basic result is that for classical methods, k steps of the polynomially preconditioned method
are exactly equivalent to kp steps of the original method; for accelerated methods, specifically the
Chebyshev method, the preconditioned iteration can improve the number of iterations by at most a
factor of p.
Although there is no gain in the number of times the coefficient matrix is applied, polynomial
preconditioning does eliminate a large fraction of the inner products and update operations, so there
may be an overall increase in efficiency.
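To make this concrete, here is a minimal Python sketch that applies the polynomially preconditioned operator M_p^{-1} defined above to a vector. It assumes Minv_apply(v) applies M^{-1} of the underlying splitting (for instance a Jacobi sweep) and A is a sparse or dense matrix; the names are illustrative.

import numpy as np

def poly_prec_apply(A, Minv_apply, r, p):
    # z = M_p^{-1} r = ( sum_{i<p} (I - M^{-1}A)^i ) M^{-1} r, evaluated by Horner's rule.
    c = Minv_apply(r)
    z = c.copy()
    for _ in range(p - 1):
        z = c + z - Minv_apply(A @ z)        # z <- c + (I - M^{-1}A) z
    return z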
Let us define a polynomial preconditioner more abstractly as any polynomial M = P_n(A)
normalized to P(0) = 1. Now the choice of the best polynomial preconditioner becomes that of
choosing the best polynomial that minimizes ‖I − M^{-1}A‖. For the choice of the infinity norm we
thus obtain Chebyshev polynomials, and they require estimates of both a lower and upper bound
on the spectrum of A. These estimates may be derived from the conjugate gradient iteration itself;
see §5.1.
Since an accurate lower bound on the spectrum of A may be hard to obtain, Johnson, Micchelli
and Paul [125] and Saad [182] propose least squares polynomials based on several weight functions.
These functions only require an upper bound and this is easily computed, using for instance the
“Gerschgorin bound” max_i Σ_j |A_{i,j}|; see [209, §1.4]. Experiments comparing Chebyshev and
least squares polynomials can be found in Ashby, Manteuffel and Otto [8].
Application of polynomial preconditioning to symmetric indefinite problems is described by
Ashby, Manteuffel and Saylor [9]. There the polynomial is chosen so that it transforms the system
into a definite one.
3.6 Preconditioners from properties of the differential equation
A number of preconditioners exist that derive their justification from properties of the underlying
partial differential equation. We will cover some of them here (see also §5.5 and §5.4). These
preconditioners usually involve more work than the types discussed above; however, they allow for
specialized faster solution methods.
3.6.1 Preconditioning by the symmetric part
In §2.3.4 we pointed out that conjugate gradient methods for non-selfadjoint systems require the
storage of previously calculated vectors. Therefore it is somewhat remarkable that preconditioning
by the symmetric part (A + A^T)/2 of the coefficient matrix A leads to a method that does not need
this extended storage. Such a method was proposed by Concus and Golub [55] and Widlund [214].
However, solving a system with the symmetric part of a matrix may be no easier than solving
a system with the full matrix. This problem may be tackled by imposing a nested iterative method,
where a preconditioner based on the symmetric part is used. Vassilevski [210] proved that the
efficiency of this preconditioner for the symmetric part carries over to the outer method.
3.6.2 The use of fast solvers
In many applications, the coefficient matrix is symmetric and positive definite. The reason for this
is usually that the partial differential operator from which it is derived is self-adjoint, coercive,
and bounded (see Axelsson and Barker [14, §3.2]). It follows that for the coefficient matrix A the
following relation holds for any matrix B from a similar differential equation:
    c_1 ≤ (x^T A x)/(x^T B x) ≤ c_2   for all x,
where c_1, c_2 do not depend on the matrix size. The importance of this is that the use of B as a
preconditioner gives an iterative method with a number of iterations that does not depend on the
preconditioner gives an iterative method with a number of iterations that does not depend on the
matrix size.
Thus we can precondition our original matrix by one derived from a different PDE, if one can be
found that has attractive properties as preconditioner. The most common choice is to take a matrix
from a separable PDE. A system involving such a matrix can be solved with various so-called
“fast solvers”, such as FFT methods, cyclic reduction, or the generalized marching algorithm (see
Dorr [74], Swarztrauber [194], Bank [25] and Bank and Rose [27]).
As a simplest example, any elliptic operator can be preconditioned with the Poisson operator,
giving the iterative method
    −∆(u_{n+1} − u_n) = −(Lu_n − f).
In Concus and Golub [58] a transformation of this method is considered to speed up the conver-
gence. As another example, if the original matrix arises from
    −(a(x, y)u_x)_x − (b(x, y)u_y)_y = f,
the preconditioner can be formed from
    −(ã(x)u_x)_x − (b̃(y)u_y)_y = f.
An extension to the non-self adjoint case is considered by Elman and Schultz [93].
Fast solvers are attractive in that the number of operations they require is (slightly higher than)
of the order of the number of variables. Coupled with the fact that the number of iterations in the
resulting preconditioned iterative methods is independent of the matrix size, such methods are close
to optimal. However, fast solvers are usually only applicable if the physical domain is a rectangle
or other Cartesian product structure. (For a domain consisting of a number of such pieces, domain
decomposition methods can be used; see §5.4).
3.6.3 Alternating Direction Implicit methods
The Poisson differential operator can be split in a natural way as the sum of two operators:
    L = L_1 + L_2,   where L_1 = −∂²/∂x²,   L_2 = −∂²/∂y².
Now let L_1, L_2 be discretized representations of these operators. Based on the observation that L_1 + L_2 =
(I + L_1)(I + L_2) − I − L_1 L_2, iterative schemes such as
    (1 + αL_1)(1 + αL_2)u^{(m+1)} = [(1 + βL_1)(1 + βL_2)] u^{(m)}
with suitable choices of α and β have been proposed.
This alternating direction implicit, or ADI, method was first proposed as a solution method
for parabolic equations. The u^{(m)} are then approximations on subsequent time steps. However,
it can also be used for the steady state, that is, for solving elliptic equations. In that case, the
u^{(m)} become subsequent iterates; see D’Yakonov [81], Fairweather, Gourlay and Mitchell [96],
Hadjidimos [118], and Peaceman and Rachford [172]. Generalization of this scheme to variable
coefficients or fourth order elliptic problems is relatively straightforward.
The above method is implicit since it requires the solution of linear systems, and it alternates the x and y
(and if necessary z) directions. It is attractive from a practical point of view (although mostly on
tensor product grids), since solving a system with, for instance, a matrix I + αL_1 entails only a
number of uncoupled tridiagonal solutions. These need very little storage over that needed for the
matrix, and they can be executed in parallel, or one can vectorize over them.
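As an illustration, here is a minimal Python sketch of such an x-direction solve. It assumes a standard second-order discretization of L_1 on an n × n grid with mesh width h and the unknowns stored as an n × n array (one grid line per row); the parameter layout and names are illustrative, not the book's software.

import numpy as np
from scipy.linalg import solve_banded

def adi_x_solve(w, alpha, h):
    # Solve (I + alpha*L_1) v = w: one uncoupled tridiagonal system per grid line.
    n = w.shape[1]
    c = alpha / h**2
    ab = np.zeros((3, n))                    # banded storage of the tridiagonal matrix
    ab[0, 1:] = -c                           # superdiagonal
    ab[1, :] = 1.0 + 2.0 * c                 # main diagonal
    ab[2, :-1] = -c                          # subdiagonal
    return solve_banded((1, 1), ab, w.T).T   # all grid lines solved at once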
A theoretical reason that ADI preconditioners are of interest is that they can be shown to be
spectrally equivalent to the original coefficient matrix. Hence the number of iterations is bounded
independent of the condition number.
However, there is a problem of data distribution. For vector computers, either the system solution with L_1 or with L_2 will involve very large strides: if columns of variables in the grid are stored
contiguously, only the solution with L_1 will involve contiguous data. For the L_2 solve the stride equals
the number of variables in a column.
On parallel machines an efficient solution is possible if the processors are arranged in a P_x × P_y
grid. During, e.g., the L_1 solve, every processor row then works independently of other rows.
Inside each row, the processors can work together, for instance using a Schur complement method.
Inside each row, the processors can work together, for instance using a Schur complement method.
With sufficient network bandwidth this will essentially reduce the time to that for solving any of
the subdomain systems plus the time for the interface system. Thus, this method will be close to
optimal.
Chapter 4
Related Issues
4.1 Complex Systems
Conjugate gradient methods for real symmetric systems can be applied to complex Hermitian sys-
tems in a straightforward manner. For non-Hermitian complex systems we distinguish two cases.
In general, for any coefficient matrix a CGNE method is possible, that is, a conjugate gradient
method on the normal equations A^H Ax = A^H b, or one can split the system into real and imaginary
parts and use a method such as GMRES on the resulting real nonsymmetric system. However, in
certain practical situations the complex system is non-Hermitian but symmetric.
Complex symmetric systems can be solved by a classical conjugate gradient or Lanczos
method, that is, with short recurrences, if the complex inner product (x, y) = x̄^T y is replaced by
(x, y) = x^T y. Like the BiConjugate Gradient method, this method is susceptible to breakdown,
that is, it can happen that x^T x = 0 for x ≠ 0. A look-ahead strategy can remedy this in most
cases (see Freund [99] and Van der Vorst and Melissen [206]).
4.2 Stopping Criteria
An iterative method produces a sequence {x^{(i)}} of vectors converging to the vector x satisfying
the n × n system Ax = b. To be effective, a method must decide when to stop. A good stopping
criterion should
1. identify when the error e^{(i)} ≡ x^{(i)} − x is small enough to stop,
2. stop if the error is no longer decreasing or decreasing too slowly, and
3. limit the maximum amount of time spent iterating.
For the user wishing to read as little as possible, the following simple stopping criterion will
likely be adequate. The user must supply the quantities maxit, ‖b‖, stop tol, and preferably also
‖A‖:
• The integer maxit is the maximum number of iterations the algorithm will be permitted to
perform.
• The real number ‖A‖ is a norm of A. Any reasonable (order of magnitude) approximation
of the absolute value of the largest entry of the matrix A will do.
• The real number ‖b‖ is a norm of b. Again, any reasonable approximation of the absolute
value of the largest entry of b will do.
• The real number stop tol measures how small the user wants the residual r^{(i)} = Ax^{(i)} −
b of the ultimate solution x^{(i)} to be. One way to choose stop tol is as the approximate
uncertainty in the entries of A and b relative to ‖A‖ and ‖b‖, respectively. For example,
choosing stop tol = 10^{-6} means that the user considers the entries of A and b to have errors in
the range ±10^{-6}‖A‖ and ±10^{-6}‖b‖, respectively. The algorithm will compute x no more
accurately than its inherent uncertainty warrants. The user should choose stop tol less than
one and greater than the machine precision ε.^1
Here is the algorithm:
i = 0
repeat
    i = i + 1
    Compute the approximate solution x^{(i)}.
    Compute the residual r^{(i)} = Ax^{(i)} − b.
    Compute ‖r^{(i)}‖ and ‖x^{(i)}‖.
until i ≥ maxit or ‖r^{(i)}‖ ≤ stop tol (‖A‖ ‖x^{(i)}‖ + ‖b‖).
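As an illustration, here is a minimal Python sketch of this loop. It assumes step(x) performs one iteration of whatever method is being used and that estimates normA of ‖A‖ and normb of ‖b‖ are supplied; the names are illustrative.

import numpy as np

def run_until_converged(A, b, x0, step, maxit, stop_tol, normA, normb):
    i, x = 0, x0
    while True:
        i += 1
        x = step(x)                              # compute the approximate solution x^(i)
        r = A @ x - b                            # compute the residual r^(i)
        normr = np.linalg.norm(r, np.inf)
        normx = np.linalg.norm(x, np.inf)
        if i >= maxit or normr <= stop_tol * (normA * normx + normb):
            return x, i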
Note that if x^{(i)} does not change much from step to step, which occurs near convergence, then
‖x^{(i)}‖ need not be recomputed. If ‖A‖ is not available, the stopping criterion may be replaced with
the generally stricter criterion
    until i ≥ maxit or ‖r^{(i)}‖ ≤ stop tol ‖b‖.
In either case, the final error bound is ‖e^{(i)}‖ ≤ ‖A^{-1}‖ ‖r^{(i)}‖. If an estimate of ‖A^{-1}‖ is
available, one may also use the stopping criterion
    until i ≥ maxit or ‖r^{(i)}‖ ≤ stop tol ‖x^{(i)}‖/‖A^{-1}‖,
which guarantees that the relative error ‖e^{(i)}‖/‖x^{(i)}‖ in the computed solution is bounded by
stop tol.
4.2.1 More Details about Stopping Criteria
Ideally we would like to stop when the magnitudes of entries of the error e^{(i)} = x^{(i)} − x fall below a
user-supplied threshold. But e^{(i)} is hard to estimate directly, so we use the residual r^{(i)} = Ax^{(i)} − b
instead, which is more readily computed. The rest of this section describes how to measure the sizes
of vectors e^{(i)} and r^{(i)}, and how to bound e^{(i)} in terms of r^{(i)}.
We will measure errors using vector and matrix norms. The most common vector norms are:
    ‖x‖_∞ ≡ max_j |x_j|,
    ‖x‖_1 ≡ Σ_j |x_j|, and
    ‖x‖_2 ≡ ( Σ_j |x_j|^2 )^{1/2}.
For some algorithms we may also use the norm ‖x‖_{B,α} ≡ ‖Bx‖_α, where B is a fixed nonsingular
matrix and α is one of ∞, 1, or 2. Corresponding to these vector norms are three matrix norms:
    ‖A‖_∞ ≡ max_j Σ_k |a_{j,k}|,
    ‖A‖_1 ≡ max_k Σ_j |a_{j,k}|, and
    ‖A‖_F ≡ ( Σ_{jk} |a_{j,k}|^2 )^{1/2},
as well as ‖A‖_{B,α} ≡ ‖BAB^{-1}‖_α. We may also use the matrix norm ‖A‖_2 = (λ_max(AA^T))^{1/2},
where λ_max denotes the largest eigenvalue. Henceforth ‖x‖ and ‖A‖ will refer to any mutually
^1 On a machine with IEEE Standard Floating Point Arithmetic, ε = 2^{-24} ≈ 10^{-7} in single precision, and ε = 2^{-53} ≈ 10^{-16} in double precision.
consistent pair of the above. (‖x‖_2 and ‖A‖_F, as well as ‖x‖_2 and ‖A‖_2, both form mutually
consistent pairs.) All these norms satisfy the triangle inequality ‖x + y‖ ≤ ‖x‖ + ‖y‖ and ‖A +
B‖ ≤ ‖A‖ + ‖B‖, as well as ‖Ax‖ ≤ ‖A‖ ‖x‖ for mutually consistent pairs. (For more details
on the properties of norms, see Golub and Van Loan [108].)
One difference between these norms is their dependence on dimension. A vector x of length n
with entries uniformly distributed between 0 and 1 will satisfy ‖x‖_∞ ≤ 1, but ‖x‖_2 will grow like
√n and ‖x‖_1 will grow like n. Therefore a stopping criterion based on ‖x‖_1 (or ‖x‖_2) may have
to be permitted to grow proportional to n (or √n) in order that it does not become much harder to
satisfy for large n.
There are two approaches to bounding the inaccuracy of the computed solution to Ax = b.
Since ‖e^{(i)}‖, which we will call the forward error, is hard to estimate directly, we introduce the
backward error, which allows us to bound the forward error. The normwise backward error is defined as the smallest possible value of max{‖δA‖/‖A‖, ‖δb‖/‖b‖} where x^{(i)} is the exact solution
of (A + δA)x^{(i)} = (b + δb) (here δA denotes a general matrix, not δ times A; the same goes for
δb). The backward error may be easily computed from the residual r^{(i)} = Ax^{(i)} − b; we show how
below. Provided one has some bound on the inverse of A, one can bound the forward error in terms
of the backward error via the simple equality
    e^{(i)} = x^{(i)} − x = A^{-1}(Ax^{(i)} − b) = A^{-1}r^{(i)},
which implies ‖e^{(i)}‖ ≤ ‖A^{-1}‖ ‖r^{(i)}‖. Therefore, a stopping criterion of the form “stop when
‖r^{(i)}‖ ≤ τ” also yields an upper bound on the forward error ‖e^{(i)}‖ ≤ τ ‖A^{-1}‖. (Sometimes we
may prefer to use the stricter but harder to estimate bound ‖e^{(i)}‖ ≤ ‖ |A^{-1}| |r^{(i)}| ‖; see §4.2.3.
Here |X| is the matrix or vector of absolute values of components of X.)
The backward error also has a direct interpretation as a stopping criterion, in addition to
supplying a bound on the forward error. Recall that the backward error is the smallest change
max{‖δA‖/‖A‖, ‖δb‖/‖b‖} to the problem Ax = b that makes x^{(i)} an exact solution of (A +
δA)x^{(i)} = b + δb. If the original data A and b have errors from previous computations or measurements, then it is usually not worth iterating until δA and δb are even smaller than these errors.
For example, if the machine precision is ε, it is not worth making ‖δA‖ ≤ ε‖A‖ and ‖δb‖ ≤ ε‖b‖,
because just rounding the entries of A and b to fit in the machine creates errors this large.
Based on this discussion, we will now consider some stopping criteria and their properties.
Above we already mentioned
Criterion 1. ‖r^{(i)}‖ ≤ S_1 ≡ stop tol (‖A‖ ‖x^{(i)}‖ + ‖b‖). This is equivalent to asking that
the backward error δA and δb described above satisfy ‖δA‖ ≤ stop tol ‖A‖ and ‖δb‖ ≤
stop tol ‖b‖. This criterion yields the forward error bound
    ‖e^{(i)}‖ ≤ ‖A^{-1}‖ ‖r^{(i)}‖ ≤ stop tol ‖A^{-1}‖ (‖A‖ ‖x^{(i)}‖ + ‖b‖).
The second stopping criterion we discussed, which does not require ‖A‖, may be much more
stringent than Criterion 1:
Criterion 2. ‖r^{(i)}‖ ≤ S_2 ≡ stop tol ‖b‖. This is equivalent to asking that the backward error
δA and δb satisfy δA = 0 and ‖δb‖ ≤ tol ‖b‖. One difficulty with this method is that if
‖A‖ ‖x‖ ≫ ‖b‖, which can only occur if A is very ill-conditioned and x nearly lies in the
null space of A, then it may be difficult for any method to satisfy the stopping criterion. To
see that A must be very ill-conditioned, note that
    1 ≪ (‖A‖ ‖x‖)/‖b‖ = (‖A‖ ‖A^{-1}b‖)/‖b‖ ≤ ‖A‖ ‖A^{-1}‖.
This criterion yields the forward error bound
    ‖e^{(i)}‖ ≤ ‖A^{-1}‖ ‖r^{(i)}‖ ≤ stop tol ‖A^{-1}‖ ‖b‖.
If an estimate of ‖A^{-1}‖ is available, one can also just stop when the upper bound on the error
‖A^{-1}‖ ‖r^{(i)}‖ falls below a threshold. This yields the third stopping criterion:
Criterion 3. ‖r^{(i)}‖ ≤ S_3 ≡ stop tol ‖x^{(i)}‖/‖A^{-1}‖. This stopping criterion guarantees that
    ‖e^{(i)}‖/‖x^{(i)}‖ ≤ (‖A^{-1}‖ ‖r^{(i)}‖)/‖x^{(i)}‖ ≤ stop tol,
permitting the user to specify the desired relative accuracy stop tol in the computed solution
x^{(i)}.
One drawback to Criteria 1 and 2 is that they usually treat backward errors in each component of
δA and δb equally, since most norms ‖δA‖ and ‖δb‖ measure each entry of δA and δb equally.
For example, if A is sparse and δA is dense, this loss of possibly important structure will not be
reflected in ‖δA‖. In contrast, the following stopping criterion gives one the option of scaling each
component δa_{j,k} and δb_j differently, including the possibility of insisting that some entries be zero.
The cost is an extra matrix-vector multiply:
Criterion 4. S_4 ≡ max_j ( |r^{(i)}|_j / (E |x^{(i)}| + f)_j ) ≤ stop tol. Here E is a user-defined matrix
of nonnegative entries, f is a user-defined vector of nonnegative entries, and |z| denotes the
vector of absolute values of the entries of z. If this criterion is satisfied, it means there are a
δA and a δb such that (A + δA)x^{(i)} = b + δb, with |δa_{j,k}| ≤ tol e_{j,k}, and |δb_j| ≤ tol f_j
for all j and k. By choosing E and f, the user can vary the way the backward error is
measured in the stopping criterion. For example, choosing e_{j,k} = ‖A‖_∞ and f_j = ‖b‖_∞
makes the stopping criterion ‖r^{(i)}‖_∞/(n‖A‖_∞‖x^{(i)}‖_∞ + ‖b‖_∞), which is essentially the
same as Criterion 1. Choosing e_{j,k} = |a_{j,k}| and f_j = |b_j| makes the stopping criterion
measure the componentwise relative backward error, i.e., the smallest relative perturbations
in any component of A and b which is necessary to make x^{(i)} an exact solution. This tighter
stopping criterion requires, among other things, that δA have the same sparsity pattern as A.
Other choices of E and f can be used to reflect other structured uncertainties in A and b.
This criterion yields the forward error bound
    ‖e^{(i)}‖_∞ ≤ ‖ |A^{-1}| |r^{(i)}| ‖_∞ ≤ S_4 ‖ |A^{-1}|(E|x^{(i)}| + f) ‖_∞
where |A^{-1}| is the matrix of absolute values of entries of A^{-1}.
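As an illustration, here is a minimal Python sketch of evaluating Criterion 4 for a computed iterate, assuming E is supplied as a (dense or sparse) nonnegative matrix and f as a nonnegative vector; the names are illustrative.

import numpy as np

def criterion4_satisfied(r, x, E, f, stop_tol):
    # S_4 = max_j |r|_j / (E|x| + f)_j, compared against stop_tol.
    s4 = np.max(np.abs(r) / (E @ np.abs(x) + f))
    return s4 <= stop_tol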
Finally, we mention one more criterion, not because we recommend it, but because it is widely
used. We mention it in order to explain its potential drawbacks:
Dubious Criterion 5. ‖r^{(i)}‖ ≤ S_5 ≡ stop tol ‖r^{(0)}‖. This commonly used criterion has the
disadvantage of depending too strongly on the initial solution x^{(0)}. If x^{(0)} = 0, a common
choice, then r^{(0)} = b. Then this criterion is equivalent to Criterion 2 above, which may be
difficult to satisfy for any algorithm if ‖b‖ ≪ ‖A‖ ‖x‖. On the other hand, if x^{(0)} is very
large and very inaccurate, then ‖r^{(0)}‖ will be very large and S_5 will be artificially large;
this means the iteration may stop too soon. This criterion yields the forward error bound
‖e^{(i)}‖ ≤ S_5 ‖A^{-1}‖.
4.2.2 When r^{(i)} or ‖r^{(i)}‖ is not readily available
It is possible to design an iterative algorithm for which r^{(i)} = Ax^{(i)} − b or ‖r^{(i)}‖ is not directly
available, although this is not the case for any algorithms in this book. For completeness, however,
we discuss stopping criteria in this case.
For example, if one “splits” A = M − N to get the iterative method x^{(i)} = M^{-1}Nx^{(i−1)} +
M^{-1}b ≡ Gx^{(i−1)} + c, then the natural residual to compute is r̂^{(i)} = x^{(i)} − Gx^{(i)} − c =
M^{-1}(Ax^{(i)} − b) = M^{-1}r^{(i)}. In other words, the residual r̂^{(i)} is the same as the residual
of the preconditioned system M^{-1}Ax = M^{-1}b. In this case, it is hard to interpret r̂^{(i)} as
a backward error for the original system Ax = b, so we may instead derive a forward error
bound ‖e^{(i)}‖ = ‖A^{-1}M r̂^{(i)}‖ ≤ ‖A^{-1}M‖ ‖r̂^{(i)}‖. Using this as a stopping criterion requires
an estimate of ‖A^{-1}M‖. In the case of methods based on splitting A = M − N, we have
A^{-1}M = (M − N)^{-1}M = (I − G)^{-1}, and ‖A^{-1}M‖ = ‖(I − G)^{-1}‖ ≤ 1/(1 − ‖G‖).
Another example is an implementation of the preconditioned conjugate gradient algorithm
which computes ‖r^{(i)}‖_{M^{-1/2},2} = (r^{(i)T} M^{-1} r^{(i)})^{1/2} instead of ‖r^{(i)}‖_2 (the implementation
in this book computes the latter). Such an implementation could use the stopping criterion
‖r^{(i)}‖_{M^{-1/2},2}/‖r^{(0)}‖_{M^{-1/2},2} ≤ tol as in Criterion 5. We may also use it to get the forward error
bound ‖e^{(i)}‖ ≤ ‖A^{-1}M^{1/2}‖ ‖r^{(i)}‖_{M^{-1/2},2}, which could also be used in a stopping criterion.
4.2.3 Estimating ‖A^{-1}‖
Bounds on the error ‖e^{(i)}‖ inevitably rely on bounds for A^{-1}, since e^{(i)} = A^{-1}r^{(i)}. There is a
large number of problem dependent ways to estimate A^{-1}; we mention a few here.
When a splitting A = M − N is used to get an iteration
    x^{(i)} = M^{-1}Nx^{(i−1)} + M^{-1}b = Gx^{(i−1)} + c,
then the matrix whose inverse norm we need is I − G. Often, we know how to estimate ‖G‖ if the
splitting is a standard one such as Jacobi or SOR, and the matrix A has special characteristics such
as Property A. Then we may estimate ‖(I − G)^{-1}‖ ≤ 1/(1 − ‖G‖).
When A is symmetric positive definite, and Chebyshev acceleration with adaptation of parameters is being used, then at each step the algorithm estimates the largest and smallest eigenvalues
λ_max(A) and λ_min(A) of A anyway. Since A is symmetric positive definite, ‖A^{-1}‖_2 = λ_min^{-1}(A).
This adaptive estimation is often done using the Lanczos algorithm (see section 5.1), which can
usually provide good estimates of the largest (rightmost) and smallest (leftmost) eigenvalues of a
symmetric matrix at the cost of a few matrix-vector multiplies. For general nonsymmetric A, we
may apply the Lanczos method to AA^T or A^T A, and use the fact that ‖A^{-1}‖_2 = 1/λ_min^{1/2}(AA^T) =
1/λ_min^{1/2}(A^T A).
It is also possible to estimate ‖A^{-1}‖_∞ provided one is willing to solve a few systems of linear
equations with A and A^T as coefficient matrices. This is often done with dense linear system
solvers, because the extra cost of these systems is O(n^2), which is small compared to the cost
O(n^3) of the LU decomposition (see Hager [120], Higham [123] and Anderson, et al. [3]). This is
not the case for iterative solvers, where the cost of these solves may well be several times as much
as the original linear system. Still, if many linear systems with the same coefficient matrix and
differing right-hand-sides are to be solved, it is a viable method.
The approach in the last paragraph also lets us estimate the alternate error bound ‖e^{(i)}‖_∞ ≤
‖ |A^{-1}| |r^{(i)}| ‖_∞. This may be much smaller than the simpler ‖A^{-1}‖_∞ ‖r^{(i)}‖_∞ in the case
where the rows of A are badly scaled; consider the case of a diagonal matrix A with widely varying
diagonal entries. To compute ‖ |A^{-1}| |r^{(i)}| ‖_∞, let R denote the diagonal matrix with diagonal
entries equal to the entries of |r^{(i)}|; then ‖ |A^{-1}| |r^{(i)}| ‖_∞ = ‖A^{-1}R‖_∞ (see Arioli, Demmel and
Duff [5]). ‖A^{-1}R‖_∞ can be estimated using the technique in the last paragraph since multiplying
by A^{-1}R or (A^{-1}R)^T = R^T A^{-T} is no harder than multiplying by A^{-1} and A^{-T} and also by R,
a diagonal matrix.
4.2.4 Stopping when progress is no longer being made
In addition to limiting the total amount of work by limiting the maximum number of iterations one
is willing to do, it is also natural to consider stopping when no apparent progress is being made.
Some methods, such as Jacobi and SOR, often exhibit nearly monotone linear convergence, at least
after some initial transients, so it is easy to recognize when convergence degrades. Other methods,
like the conjugate gradient method, exhibit “plateaus” in their convergence, with the residual norm
stagnating at a constant value for many iterations before decreasing again; in principle there can
be many such plateaus (see Greenbaum and Strakos [109]) depending on the problem. Still other
methods, such as CGS, can appear wildly nonconvergent for a large number of steps before the
residual begins to decrease; convergence may continue to be erratic from step to step.
In other words, while it is a good idea to have a criterion that stops when progress towards a
solution is no longer being made, the form of such a criterion is both method and problem depen-
dent.
4.2.5 Accounting for floating point errors
The error bounds discussed in this section are subject to floating point errors, most of which are
innocuous, but which deserve some discussion.
The infinity norm ‖x‖_∞ = max_j |x_j| requires the fewest floating point operations to compute,
and cannot overflow or cause other exceptions if the x_j are themselves finite^2. On the other hand,
computing ‖x‖_2 = (Σ_j |x_j|^2)^{1/2} in the most straightforward manner can easily overflow or lose
accuracy to underflow even when the true result is far from either the overflow or underflow thresholds. For this reason, a careful implementation for computing ‖x‖_2 without this danger is available
(subroutine snrm2 in the BLAS [71] [143]), but it is more expensive than computing ‖x‖_∞.
Now consider computing the residual r^{(i)} = Ax^{(i)} − b by forming the matrix-vector product
Ax^{(i)} and then subtracting b, all in floating point arithmetic with relative precision ε. A standard
error analysis shows that the error δr^{(i)} in the computed r^{(i)} is bounded by ‖δr^{(i)}‖ ≤ O(ε)(‖A‖
‖x^{(i)}‖ + ‖b‖), where O(ε) is typically bounded by nε, and usually closer to √n ε. This is why
one should not choose stop tol ≤ ε in Criterion 1, and why Criterion 2 may not be satisfied by any
method. This uncertainty in the value of r^{(i)} induces an uncertainty in the error e^{(i)} = A^{-1}r^{(i)} of
at most O(ε)‖A^{-1}‖ (‖A‖ ‖x^{(i)}‖ + ‖b‖). A more refined bound is that the error (δr^{(i)})_j in the
jth component of r^{(i)} is bounded by O(ε) times the jth component of |A| |x^{(i)}| + |b|, or more
tersely |δr^{(i)}| ≤ O(ε)(|A| |x^{(i)}| + |b|). This means the uncertainty in e^{(i)} is really bounded by
O(ε)‖ |A^{-1}|(|A| |x^{(i)}| + |b|) ‖. This last quantity can be estimated inexpensively provided solving
systems with A and A^T as coefficient matrices is inexpensive (see the last paragraph of §4.2.3).
Both these bounds can be severe overestimates of the uncertainty in e^{(i)}, but examples exist where
they are attainable.
4.3 Data Structures
The efficiency of any of the iterative methods considered in previous sections is determined primar-
ily by the performance of the matrix-vector product and the preconditioner solve, and therefore on
the storage scheme used for the matrix and the preconditioner. Since iterative methods are typically
used on sparse matrices, we will review here a number of sparse storage formats. Often, the storage
scheme used arises naturally from the specific application problem.
In this section we will review some of the more popular sparse matrix formats that are used in
numerical software packages such as ITPACK [139] and NSPCG [164]. After surveying the various
formats, we demonstrate how the matrix-vector product and an incomplete factorization solve are
formulated using two of the sparse matrix formats.
^2 IEEE standard floating point arithmetic permits computations with ±∞ and NaN, or Not-a-Number, symbols.
4.3.1 Survey of Sparse Matrix Storage Formats
If the coefficient matrix A is sparse, large-scale linear systems of the form Ax = b can be most
efficiently solved if the zero elements of A are not stored. Sparse storage schemes allocate con-
tiguous storage in memory for the nonzero elements of the matrix, and perhaps a limited number
of zeros. This, of course, requires a scheme for knowing where the elements fit into the full matrix.
There are many methods for storing the data (see for instance Saad [185] and Eijkhout [86]).
Here we will discuss Compressed Row and Column Storage, Block Compressed Row Storage,
Diagonal Storage, Jagged Diagonal Storage, and Skyline Storage.
Compressed Row Storage (CRS)
The Compressed Row and Column (in the next section) Storage formats are the most general: they
make absolutely no assumptions about the sparsity structure of the matrix, and they don’t store any
unnecessary elements. On the other hand, they are not very efficient, needing an indirect addressing
step for every single scalar operation in a matrix-vector product or preconditioner solve.
The Compressed Row Storage (CRS) format puts the subsequent nonzeros of the matrix rows
in contiguous memory locations. Assuming we have a nonsymmetric sparse matrix A, we cre-
ate 3 vectors: one for floating-point numbers (val), and the other two for integers (col ind,
row ptr). The val vector stores the values of the nonzero elements of the matrix A, as they
are traversed in a row-wise fashion. The col ind vector stores the column indexes of the el-
ements in the val vector. That is, if val(k) = a_{i,j} then col ind(k) = j. The row ptr
vector stores the locations in the val vector that start a row, that is, if val(k) = a_{i,j} then
row ptr(i) ≤ k < row ptr(i + 1). By convention, we define row ptr(n + 1) = nnz + 1,
where nnz is the number of nonzeros in the matrix A. The storage savings for this approach is
significant. Instead of storing n^2 elements, we need only 2nnz + n + 1 storage locations.
As an example, consider the nonsymmetric matrix A defined by

    A = ( 10   0   0   0  −2   0 )
        (  3   9   0   0   0   3 )
        (  0   7   8   7   0   0 )
        (  3   0   8   7   5   0 )
        (  0   8   0   9   9  13 )
        (  0   4   0   0   2  −1 )                                    (4.1)

The CRS format for this matrix is then specified by the arrays {val, col ind, row ptr}
given below

    val       10  −2   3   9   3   7   8   7   3  · · ·   9  13   4   2  −1
    col ind    1   5   1   2   6   2   3   4   1  · · ·   5   6   2   5   6
    row ptr    1   3   6   9  13  17  20
If the matrix A is symmetric, we need only store the upper (or lower) triangular portion of the
matrix. The trade-off is a more complicated algorithm with a somewhat different pattern of data
access.
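As an illustration, here is a minimal Python sketch that builds the three CRS arrays from a dense array, using the 1-based index convention of the example above; the names are illustrative.

import numpy as np

def dense_to_crs(A):
    val, col_ind, row_ptr = [], [], [1]
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            if A[i, j] != 0:
                val.append(A[i, j])
                col_ind.append(j + 1)            # 1-based column index
        row_ptr.append(len(val) + 1)             # start of the next row in val
    return np.array(val), np.array(col_ind), np.array(row_ptr)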
Compressed Column Storage (CCS)
Analogous to Compressed Row Storage there is Compressed Column Storage (CCS), which is also
called the Harwell-Boeing sparse matrix format [77]. The CCS format is identical to the CRS
format except that the columns of A are stored (traversed) instead of the rows. In other words, the
CCS format is the CRS format for A^T.
The CCS format is specified by the 3 arrays {val, row ind, col ptr}, where row ind
stores the row indices of each nonzero, and col ptr stores the index of the elements in val
which start a column of A. The CCS format for the matrix A in (4.1) is given by

    val       10   3   3   9   7   8   4   8   8  · · ·   9   2   3  13  −1
    row ind    1   2   4   2   3   5   6   3   4  · · ·   5   6   2   5   6
    col ptr    1   4   8  10  13  17  20
Block Compressed Row Storage (BCRS)
If the sparse matrix A is comprised of square dense blocks of nonzeros in some regular pattern,
we can modify the CRS (or CCS) format to exploit such block patterns. Block matrices typically
arise from the discretization of partial differential equations in which there are several degrees of
freedom associated with a grid point. We then partition the matrix in small blocks with a size equal
to the number of degrees of freedom, and treat each block as a dense matrix, even though it may
have some zeros.
If n_b is the dimension of each block and nnzb is the number of nonzero blocks in the n × n
matrix A, then the total storage needed is nnz = nnzb × n_b^2. The block dimension n_d of A is then
defined by n_d = n/n_b.
Similar to the CRS format, we require 3 arrays for the BCRS format: a rectangular array for
floating-point numbers (val(1 : nnzb, 1 : n_b, 1 : n_b)) which stores the nonzero blocks in
(block) row-wise fashion, an integer array (col ind(1 : nnzb)) which stores the actual column
indices in the original matrix A of the (1, 1) elements of the nonzero blocks, and a pointer array
(row blk(1 : n_d + 1)) whose entries point to the beginning of each block row in val(:,:,:)
and col ind(:). The savings in storage locations and reduction in indirect addressing for BCRS
over CRS can be significant for matrices with a large n_b.
Compressed Diagonal Storage (CDS)
If the matrix A is banded with bandwidth that is fairly constant from row to row, then it is worth-
while to take advantage of this structure in the storage scheme by storing subdiagonals of the
matrix in consecutive locations. Not only can we eliminate the vector identifying the column and
row, we can pack the nonzero elements in such a way as to make the matrix-vector product more
efficient. This storage scheme is particularly useful if the matrix arises from a finite element or
finite difference discretization on a tensor product grid.
We say that the matrix A = (a_{i,j}) is banded if there are nonnegative constants p, q, called the
left and right halfbandwidth, such that a_{i,j} ≠ 0 only if i − p ≤ j ≤ i + q. In this case, we can
allocate for the matrix A an array val(1:n,-p:q). The declaration with reversed dimensions
(-p:q,n) corresponds to the LINPACK band format [72], which unlike CDS, does not allow for
an efficiently vectorizable matrix-vector multiplication if p + q is small.
Usually, band formats involve storing some zeros. The CDS format may even contain some ar-
ray elements that do not correspond to matrix elements at all. Consider the nonsymmetric matrix A
defined by
    A = ( 10  −3   0   0   0   0 )
        (  3   9   6   0   0   0 )
        (  0   7   8   7   0   0 )
        (  0   0   8   7   5   0 )
        (  0   0   0   9   9  13 )
        (  0   0   0   0   2  −1 )                                    (4.2)
Using the CDS format, we store this matrix A in an array of dimension (6,-1:1) using the
mapping
    val(i, j) = a_{i,i+j}.    (4.3)
Hence, the rows of the val(:,:) array are
val(:,-1) 0 3 7 8 9 2
val(:, 0) 10 9 8 7 9 -1
val(:,+1) -3 6 7 5 13 0
.
Notice the two zeros corresponding to non-existing matrix elements.
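As an illustration, here is a minimal Python sketch of building the CDS array with the mapping (4.3), assuming left and right halfbandwidths p and q and zeros stored for positions that fall outside the matrix; the names are illustrative.

import numpy as np

def dense_to_cds(A, p, q):
    n = A.shape[0]
    val = np.zeros((n, p + q + 1))               # columns correspond to offsets j = -p, ..., q
    for i in range(n):
        for j in range(-p, q + 1):
            if 0 <= i + j < n:
                val[i, j + p] = A[i, i + j]      # val(i, j) = a_{i, i+j}
    return val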
A generalization of the CDS format more suitable for manipulating general sparse matrices on
vector supercomputers is discussed by Melhem in [153]. This variant of CDS uses a stripe data
structure to store the matrix A. This structure is more efficient in storage in the case of varying
bandwidth, but it makes the matrix-vector product slightly more expensive, as it involves a gather
operation.
As defined in [153], a stripe in the n × n matrix A is a set of positions S = {(i, σ(i)); i ∈ I ⊆
I_n}, where I_n = {1, . . . , n} and σ is a strictly increasing function. Specifically, if (i, σ(i)) and
(j, σ(j)) are in S, then
    i < j → σ(i) < σ(j).
When computing the matrix-vector product y = Ax using stripes, each (i, σ_k(i)) element of A in
stripe S_k is multiplied with both x_i and x_{σ_k(i)}, and these products are accumulated in y_{σ_k(i)} and y_i,
respectively. For the nonsymmetric matrix A defined by
    A = ( 10  −3   0   1   0   0 )
        (  0   9   6   0  −2   0 )
        (  3   0   8   7   0   0 )
        (  0   6   0   7   5   4 )
        (  0   0   0   0   9  13 )
        (  0   0   0   0   5  −1 ),                                   (4.4)
the 4 stripes of the matrix A stored in the rows of the val(:,:) array would be
val(:,-1) 0 0 3 6 0 5
val(:, 0) 10 9 8 7 9 -1
val(:,+1) 0 -3 6 7 5 13
val(:,+2) 0 1 -2 0 4 0
.
Jagged Diagonal Storage (JDS)
The Jagged Diagonal Storage format can be useful for the implementation of iterative methods on
parallel and vector processors (see Saad [184]). Like the Compressed Diagonal format, it gives a
vector length essentially of the size of the matrix. It is more space-efficient than CDS at the cost of
a gather/scatter operation.
A simplified form of JDS, called ITPACK storage or Purdue storage, can be described as fol-
lows. In the matrix from (4.4) all elements are shifted left:

    ( 10  −3   0   1   0   0 )            ( 10  −3   1     )
    (  0   9   6   0  −2   0 )            (  9   6  −2     )
    (  3   0   8   7   0   0 )    −→      (  3   8   7     )
    (  0   6   0   7   5   4 )            (  6   7   5   4 )
    (  0   0   0   0   9  13 )            (  9  13         )
    (  0   0   0   0   5  −1 )            (  5  −1         )
after which the columns are stored consecutively. All rows are padded with zeros on the right to
give them equal length. Corresponding to the array of matrix elements val(:,:), an array of
column indices, col ind(:,:) is also stored:
val(:, 1) 10 9 3 6 9 5
val(:, 2) −3 6 8 7 13 −1
val(:, 3) 1 −2 7 5 0 0
val(:, 4) 0 0 0 4 0 0
,
col_ind(:, 1)    1   2   1   2   5   5
col_ind(:, 2)    2   3   3   4   6   6
col_ind(:, 3)    4   5   4   5   0   0
col_ind(:, 4)    0   0   0   6   0   0
It is clear that the padding zeros in this structure may be a disadvantage, especially if the bandwidth of the matrix varies strongly. Therefore, in the JDS format proper, we reorder the rows of the matrix decreasingly according to the number of nonzeros per row. The compressed and permuted diagonals are then stored in a linear array. The new data structure is called jagged diagonals.
The number of jagged diagonals is equal to the number of nonzeros in the first (reordered) row, i.e., the largest number of nonzeros in any row of A. The data structure to represent the n × n matrix A therefore consists of a permutation array (perm(1:n)) which reorders the rows, a floating-point array (jdiag(:)) containing the jagged diagonals in succession, an integer array (col_ind(:)) containing the corresponding column indices, and finally a pointer array (jd_ptr(:)) whose elements point to the beginning of each jagged diagonal. The advantages of JDS for matrix multiplications are discussed by Saad in [184].
The JDS format for the above matrix A, using the linear arrays {perm, jdiag, col_ind, jd_ptr}, is given below (jagged diagonals are separated by semicolons):

jdiag     1 3 7 8 10 2;  9 9 8 -1;  9 6 7 5;  13
col_ind   1 1 2 3 1 5;   4 2 3 6;   5 3 4 5;  6
perm      5 2 3 4 1 6
jd_ptr    1 7 13 17
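A matrix-vector product in JDS traverses the jagged diagonals one at a time, which yields long vectorizable loops. The following C sketch is an illustration only, not one of the Templates routines: it uses zero-based indices and assumes that jd_ptr carries one extra final entry so that the length of jagged diagonal d is jd_ptr[d+1] - jd_ptr[d], a slight extension of the convention described above.

/* y = A*x with A in jagged diagonal storage (zero-based indices).
   njd is the number of jagged diagonals; element i of every jagged
   diagonal belongs to permuted row i, i.e. to original row perm[i]. */
void jds_matvec(int n, int njd, const int *perm, const double *jdiag,
                const int *col_ind, const int *jd_ptr,
                const double *x, double *y)
{
    for (int i = 0; i < n; i++)
        y[i] = 0.0;
    for (int d = 0; d < njd; d++) {
        int len = jd_ptr[d+1] - jd_ptr[d];
        for (int i = 0; i < len; i++) {
            int k = jd_ptr[d] + i;
            y[perm[i]] += jdiag[k] * x[col_ind[k]];
        }
    }
}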
Skyline Storage (SKS)
The final storage scheme we consider is for skyline matrices, which are also called variable band
or profile matrices (see Duff, Erisman and Reid [79]). It is mostly of importance in direct solution
methods, but it can be used for handling the diagonal blocks in block matrix factorization methods.
A major advantage of solving linear systems having skyline coefficient matrices is that when piv-
oting is not necessary, the skyline structure is preserved during Gaussian elimination. If the matrix
is symmetric, we only store its lower triangular part. A straightforward approach in storing the el-
ements of a skyline matrix is to place all the rows (in order) into a floating-point array (val(:)),
and then keep an integer array (row_ptr(:)) whose elements point to the beginning of each row.
The column indices of the nonzeros stored in val(:) are easily derived and are not stored.
For a nonsymmetric skyline matrix such as the one illustrated in Figure 4.1, we store the lower
triangular elements in SKS format, and store the upper triangular elements in a column-oriented
SKS format (transpose stored in row-wise SKS format). These two separated substructures can be
linked in a variety of ways. One approach, discussed by Saad in [185], is to store each row of the
lower triangular part and each column of the upper triangular part contiguously into the floating-
point array (val(:)). An additional pointer is then needed to determine where the diagonal
elements, which separate the lower triangular elements from the upper triangular elements, are
located.
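For the symmetric, row-wise SKS scheme described above, the "easily derived" column indices follow from the row lengths alone. The following C sketch illustrates this under the usual profile convention (each row stored from its first nonzero up to and including the diagonal, zero-based indices); the function name is made up here and the fragment is not part of the Templates codes.

/* Element (i,j), j <= i, of a symmetric skyline matrix stored row-wise.
   The first stored column of row i is i - (row_ptr[i+1]-row_ptr[i]) + 1;
   entries outside the skyline are zero.                                 */
double sks_get(int i, int j, const double *val, const int *row_ptr)
{
    int len   = row_ptr[i+1] - row_ptr[i];   /* stored entries in row i */
    int first = i - len + 1;                 /* first stored column     */
    if (j < first || j > i)
        return 0.0;
    return val[row_ptr[i] + (j - first)];
}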
4.3.2 Matrix vector products
In many of the iterative methods discussed earlier, both the product of a matrix and that of its
transpose times a vector are needed, that is, given an input vector x we want to compute the products y = Ax and y = A^T x. We will present these algorithms for two of the storage formats from §4.3: CRS and CDS.
+ x x
x + x x
x x + x x
  x x + x x x x
      x + x x x
    x x x + x x
        x x + x x x
        x x x + x x x x
        x x x x + x x x
                x + x x x
            x x x x + x x
                  x x + x x
                  x x x + x x x
                      x x + x x
                            + x x
                        x x x + x
                                +
Figure 4.1: Profile of a nonsymmetric skyline or variable-band matrix.
CRS Matrix-Vector Product
The matrix vector product y = Ax using CRS format can be expressed in the usual way:
y_i = Σ_j a_{i,j} x_j ,

since this traverses the rows of the matrix A. For an n × n matrix A, the matrix-vector multiplication is given by
for i = 1, n
y(i) = 0
for j = row_ptr(i), row_ptr(i+1) - 1
y(i) = y(i) + val(j) * x(col_ind(j))
end;
end;
Since this method only multiplies nonzero matrix entries, the operation count is 2 times the number of nonzero elements in A, which is a significant savings over the dense operation requirement of 2n^2.
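For readers following along in C rather than in the pseudocode above, a direct translation is given below. This is a sketch with zero-based indices (so row_ptr has n+1 entries and row_ptr[n] is one past the last stored element); it is not the actual Templates matrix-vector routine.

/* y = A*x for A in CRS format (zero-based indices). */
void crs_matvec(int n, const double *val, const int *col_ind,
                const int *row_ptr, const double *x, double *y)
{
    for (int i = 0; i < n; i++) {
        double sum = 0.0;
        for (int j = row_ptr[i]; j < row_ptr[i+1]; j++)
            sum += val[j] * x[col_ind[j]];
        y[i] = sum;
    }
}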
For the transpose product y = A^T x we cannot use the equation

y_i = Σ_j (A^T)_{i,j} x_j = Σ_j a_{j,i} x_j ,
since this implies traversing columns of the matrix, an extremely inefficient operation for matrices stored in CRS format. Hence, we switch indices to

for all j, do for all i:  y_i ← y_i + a_{j,i} x_j .
The matrix-vector multiplication involving A^T is then given by
for i = 1, n
y(i) = 0
end;
for j = 1, n
for i = row_ptr(j), row_ptr(j+1)-1
y(col_ind(i)) = y(col_ind(i)) + val(i) * x(j)
end;
end;
Both matrix-vector products above have largely the same structure, and both use indirect addressing. Hence, their vectorizability properties are the same on any given computer. However, the first product (y = Ax) has a more favorable memory access pattern in that (per iteration of the outer loop) it reads two vectors of data (a row of matrix A and the input vector x) and writes one scalar. The transpose product (y = A^T x) on the other hand reads one element of the input vector, one row of matrix A, and both reads and writes the result vector y. Unless the machine on which these methods are implemented has three separate memory paths (e.g., Cray Y-MP), the memory traffic will then limit the performance. This is an important consideration for RISC-based architectures.
CDS Matrix-Vector Product
If the n × n matrix A is stored in CDS format, it is still possible to perform a matrix-vector product y = Ax by either rows or columns, but this does not take advantage of the CDS format. The idea is to make a change in coordinates in the doubly-nested loop. Replacing j → i + j we get

y_i ← y_i + a_{i,j} x_j   ⇒   y_i ← y_i + a_{i,i+j} x_{i+j} .

With the index i in the inner loop we see that the expression a_{i,i+j} accesses the jth diagonal of the matrix (where the main diagonal has number 0).
The algorithm will now have a doubly-nested loop with the outer loop enumerating the diago-
nals diag=-p,q with p and q the (nonnegative) numbers of diagonals to the left and right of the
main diagonal. The bounds for the inner loop follow from the requirement that
1 ≤ i, i + j ≤ n.
The algorithm becomes
for i = 1, n
y(i) = 0
end;
for diag = -diag_left, diag_right
for loc = max(1,1-diag), min(n,n-diag)
y(loc) = y(loc) + val(loc,diag) * x(loc+diag)
end;
end;
The transpose matrix-vector product y = A^T x is a minor variation of the algorithm above. Using the update formula

y_i ← y_i + a_{i+j,i} x_{i+j} = y_i + a_{i+j,(i+j)−j} x_{i+j}

we obtain
for i = 1, n
y(i) = 0
end;
for diag = -diag_right, diag_left
for loc = max(1,1-diag), min(n,n-diag)
y(loc) = y(loc) + val(loc+diag, -diag) * x(loc+diag)
end;
end;
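A C rendering of the forward product y = Ax in CDS format makes the contiguous, diagonal-wise access explicit. It is a sketch only, not a Templates routine; the two-dimensional array val(1:n,-p:q) of the text is assumed here to be flattened so that each diagonal occupies n consecutive entries, and indices are zero-based.

/* y = A*x for A in CDS format.  Diagonal d (d = -p..q) is stored
   contiguously in val[(d+p)*n .. (d+p)*n + n - 1].                */
void cds_matvec(int n, int p, int q, const double *val,
                const double *x, double *y)
{
    for (int i = 0; i < n; i++)
        y[i] = 0.0;
    for (int d = -p; d <= q; d++) {
        const double *diag = val + (d + p) * n;
        int lo = (d < 0) ? -d : 0;           /* max(0, -d)  */
        int hi = (d > 0) ? n - d : n;        /* min(n, n-d) */
        for (int loc = lo; loc < hi; loc++)
            y[loc] += diag[loc] * x[loc + d];
    }
}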
The memory access for the CDS-based matrix-vector product y = Ax (or y = A^T x) is three
vectors per inner iteration. On the other hand, there is no indirect addressing, and the algorithm
is vectorizable with vector lengths of essentially the matrix order n. Because of the regular data
access, most machines can perform this algorithm efficiently by keeping three base registers and
using simple offset addressing.
4.3.3 Sparse Incomplete Factorizations
Efficient preconditioners for iterative methods can be found by performing an incomplete factor-
ization of the coefficient matrix. In this section, we discuss the incomplete factorization of an n × n matrix A stored in the CRS format, and routines to solve a system with such a factorization. At first we only consider a factorization of the D-ILU type, that is, the simplest type of factorization in which no “fill” is allowed, even if the matrix has a nonzero in the fill position (see section 3.4.2). Later we will consider factorizations that allow higher levels of fill. Such factorizations are considerably more complicated to code, but they are essential for complicated differential equations. The solution routines are applicable in both cases.
For iterative methods, such as QMR, that involve a transpose matrix-vector product, we also need to consider solving a system with the transpose of a factorization.
Generating a CRS-based D-ILU Incomplete Factorization
In this subsection we will consider a matrix split as A = D_A + L_A + U_A in diagonal, lower and upper triangular part, and an incomplete factorization preconditioner of the form (D_A + L_A) D_A^{-1} (D_A + U_A). In this way, we only need to store a diagonal matrix D containing the pivots of the factorization.
Hence, it suffices to allocate for the preconditioner only a pivot array of length n (pivots(1:n)). In fact, we will store the inverses of the pivots rather than the pivots themselves. This implies that during the system solution no divisions have to be performed.
Additionally, we assume that an extra integer array diag_ptr(1:n) has been allocated that contains the column (or row) indices of the diagonal elements in each row, that is, val(diag_ptr(i)) = a_{i,i}.
The factorization begins by copying the matrix diagonal:
for i = 1, n
pivots(i) = val(diag_ptr(i))
end;
Each elimination step starts by inverting the pivot
for i = 1, n
pivots(i) = 1 / pivots(i)
For all nonzero elements a_{i,j} with j > i, we next check whether a_{j,i} is a nonzero matrix element, since this is the only element that can cause fill with a_{i,j}.
for j = diag_ptr(i)+1, row_ptr(i+1)-1
found = FALSE
for k = row_ptr(col_ind(j)), diag_ptr(col_ind(j))-1
if(col_ind(k) = i) then
found = TRUE
element = val(k)
endif
end;
If so, we update the pivot of row col_ind(j):
if ( found = TRUE )
pivots(col_ind(j)) = pivots(col_ind(j)) - element * pivots(i) * val(j)
endif
end;
end;
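Putting the pieces above together, a compact C version of the D-ILU setup reads as follows. It is a sketch (zero-based CRS arrays, function name chosen here for illustration), not the Templates code itself; note that only the pivot array is written to and the matrix is left untouched.

/* Compute the inverted D-ILU pivots for A = D_A + L_A + U_A in CRS. */
void dilu_setup(int n, const double *val, const int *col_ind,
                const int *row_ptr, const int *diag_ptr, double *pivots)
{
    for (int i = 0; i < n; i++)
        pivots[i] = val[diag_ptr[i]];          /* copy the diagonal */

    for (int i = 0; i < n; i++) {
        pivots[i] = 1.0 / pivots[i];           /* invert the pivot  */
        /* for every a(i,j) with j > i, look for a nonzero a(j,i)   */
        for (int j = diag_ptr[i] + 1; j < row_ptr[i+1]; j++) {
            int r = col_ind[j];                /* row whose pivot is updated */
            for (int k = row_ptr[r]; k < diag_ptr[r]; k++)
                if (col_ind[k] == i) {
                    pivots[r] -= val[k] * pivots[i] * val[j];
                    break;
                }
        }
    }
}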
CRS-based Factorization Solve
The system LUy = x can be solved in the usual manner by introducing a temporary vector z:
Lz = x, Uy = z.
We have a choice between several equivalent ways of solving the system:

LU = (D + L_A) D^{-1} (D + U_A)
   = (I + L_A D^{-1}) (D + U_A)
   = (D + L_A) (I + D^{-1} U_A)
   = (I + L_A D^{-1}) D (I + D^{-1} U_A)
The first and fourth formulae are not suitable since they require both multiplication and division
with D; the difference between the second and third is only one of ease of coding. In this section we
use the third formula; in the next section we will use the second for the transpose system solution.
Both halves of the solution have largely the same structure as the matrix vector multiplication.
for i = 1, n
sum = 0
for j = row_ptr(i), diag_ptr(i)-1
sum = sum + val(j) * z(col_ind(j))
end;
z(i) = pivots(i) * (x(i)-sum)
end;
for i = n, 1, (step -1)
sum = 0
for j = diag_ptr(i)+1, row_ptr(i+1)-1
sum = sum + val(j) * y(col_ind(j))
end;
y(i) = z(i) - pivots(i) * sum
end;
The temporary vector z can be eliminated by reusing the space for y; algorithmically, z can even
overwrite x, but overwriting input data is in general not recommended.
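The same solve, written out in C with the temporary vector kept for clarity, might look as follows. This is a sketch under the conventions used above (zero-based CRS arrays, pivots holding the inverted pivots), not the distributed Templates routine.

/* Solve (D + L_A) z = x followed by (I + D^{-1} U_A) y = z. */
void dilu_solve(int n, const double *val, const int *col_ind,
                const int *row_ptr, const int *diag_ptr,
                const double *pivots, const double *x,
                double *z, double *y)
{
    for (int i = 0; i < n; i++) {              /* forward sweep  */
        double sum = 0.0;
        for (int j = row_ptr[i]; j < diag_ptr[i]; j++)
            sum += val[j] * z[col_ind[j]];
        z[i] = pivots[i] * (x[i] - sum);
    }
    for (int i = n - 1; i >= 0; i--) {         /* backward sweep */
        double sum = 0.0;
        for (int j = diag_ptr[i] + 1; j < row_ptr[i+1]; j++)
            sum += val[j] * y[col_ind[j]];
        y[i] = z[i] - pivots[i] * sum;
    }
}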
CRS-based Factorization Transpose Solve
Solving the transpose system (LU)^T y = x is slightly more involved. In the usual formulation we traverse rows when solving a factored system, but here we can only access columns of the matrices L^T and U^T (at less than prohibitive cost). The key idea is to distribute each newly computed component of a triangular solve immediately over the remaining right-hand side.
For instance, if we write a lower triangular matrix as L = (l_{*1}, l_{*2}, . . .), then the system Ly = x can be written as x = l_{*1} y_1 + l_{*2} y_2 + · · ·. Hence, after computing y_1 we modify x ← x − l_{*1} y_1, and so on. Upper triangular systems are treated in a similar manner. With this algorithm we only access columns of the triangular systems. Solving a transpose system with a matrix stored in CRS format essentially means that we access rows of L and U.
The algorithm now becomes
for i = 1, n
x_tmp(i) = x(i)
end;
for i = 1, n
z(i) = x_tmp(i)
tmp = pivots(i) * z(i)
for j = diag_ptr(i)+1, row_ptr(i+1)-1
x_tmp(col_ind(j)) = x_tmp(col_ind(j)) - tmp * val(j)
end;
end;
for i = n, 1, (step -1)
y(i) = pivots(i) * z(i)
for j = row_ptr(i), diag_ptr(i)-1
z(col_ind(j)) = z(col_ind(j)) - val(j) * y(i)
end;
end;
The extra temporary x_tmp is used only for clarity, and can be overlapped with z. Both x_tmp and z can be considered to be equivalent to y. Overall, a CRS-based preconditioner solve uses short vector lengths, indirect addressing, and has essentially the same memory traffic patterns as those of the matrix-vector product.
Generating a CRS-based ILU(k) Incomplete Factorization
Incomplete factorizations with several levels of fill allowed are more accurate than the D-ILU
factorization described above. On the other hand, they require more storage, and are considerably
harder to implement (much of this section is based on algorithms for a full factorization of a sparse
matrix as found in Duff, Erisman and Reid [79]).
As a preliminary, we need an algorithm for adding two vectors x and y, both stored in sparse
storage. Let lx be the number of nonzero components in x, let x be stored in x, and let xind be
an integer array such that
if xind(j) = i then x(j) = x_i .
Similarly, y is stored as ly, y, yind.
We now add x ← x + y by first copying y into a full vector w and then adding w to x. The total number of operations will be O(lx + ly), not counting the initial zeroing of the w array:
% copy y into w
for i=1,ly
w( yind(i) ) = y(i)
% add w to x wherever x is already nonzero
for i=1,lx
if w( xind(i) ) <> 0
x(i) = x(i) + w( xind(i) )
w( xind(i) ) = 0
% add w to x by creating new components
% wherever x is still zero
for i=1,ly
if w( yind(i) ) <> 0 then
lx = lx+1
xind(lx) = yind(i)
x(lx) = w( yind(i) )
endif
In order to add a sequence of vectors x ← x + Σ_k y^{(k)}, we add the y^{(i)} vectors into w before executing the writes into x. A different implementation would be possible, where w is allocated as a sparse vector and its sparsity pattern is constructed during the additions. We will not discuss this possibility any further.
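The addition algorithm translates almost line for line into C. The sketch below (illustrative only, zero-based indices) assumes that the arrays x and xind have room for the new components and that w is a full-length work array that is zero on entry; it is restored to zero before returning so that it can be reused.

/* x <- x + y for two sparsely stored vectors; returns the new lx. */
int sparse_add(int *lx, double *x, int *xind,
               int ly, const double *y, const int *yind, double *w)
{
    for (int i = 0; i < ly; i++)               /* copy y into w           */
        w[yind[i]] = y[i];
    for (int i = 0; i < *lx; i++)              /* update existing entries */
        if (w[xind[i]] != 0.0) {
            x[i] += w[xind[i]];
            w[xind[i]] = 0.0;
        }
    for (int i = 0; i < ly; i++)               /* create new entries      */
        if (w[yind[i]] != 0.0) {
            x[*lx] = w[yind[i]];
            xind[*lx] = yind[i];
            w[yind[i]] = 0.0;                  /* leave w clean for reuse */
            (*lx)++;
        }
    return *lx;
}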
For a slight refinement of the above algorithm, we will add levels to the nonzero components:
we assume integer vectors xlev and ylev of length lx and ly respectively, and a full length
level vector wlev corresponding to w. The addition algorithm then becomes:
% copy y into w
for i=1,ly
w( yind(i) ) = y(i)
wlev( yind(i) ) = ylev(i)
% add w to x wherever x is already nonzero;
% don’t change the levels
for i=1,lx
if w( xind(i) ) <> 0
x(i) = x(i) + w( xind(i) )
w( xind(i) ) = 0
% add w to x by creating new components
% wherever x is still zero;
% carry over levels
for i=1,ly
if w( yind(i) ) <> 0 then
lx = lx+1
x(lx) = w( yind(i) )
xind(lx) = yind(i)
xlev(lx) = wlev( yind(i) )
endif
We can now describe the ILU(k) factorization. The algorithm starts out with the matrix A, and gradually builds up a factorization M of the form M = (D + L)(I + D^{-1}U), where L, D^{-1}, and D^{-1}U are stored in the lower triangle, diagonal and upper triangle of the array M, respectively. The particular form of the factorization is chosen to minimize the number of times that the full vector w is copied back to sparse form.
Specifically, we use a sparse form of the following factorization scheme:
for k=1,n
for j=1,k-1
for i=j+1,n
a(k,i) = a(k,i) - a(k,j) * a(j,i)
for j=k+1,n
a(k,j) = a(k,j)/a(k,k)
This is a row-oriented version of the traditional ‘left-looking’ factorization algorithm.
We will describe an incomplete factorization that controls fill-in through levels (see equation (3.1)). Alternatively we could use a drop tolerance (section 3.4.2), but this is less attractive from the point of view of implementation. With fill levels we can perform the factorization symbolically at first, determining storage demands and reusing this information through a number of linear systems of the same sparsity structure. Such preprocessing and reuse of information is not possible with fill controlled by a drop tolerance criterion.
The matrix arrays A and M are assumed to be in compressed row storage, with no particular ordering
of the elements inside each row, but arrays adiag and mdiag point to the locations of the diagonal
elements.
for row=1,n
% go through elements A(row,col) with col<row
COPY ROW row OF A() INTO DENSE VECTOR w
for col=aptr(row),aptr(row+1)-1
if aind(col) < row then
acol = aind(col)
MULTIPLY ROW acol OF M() BY A(col)
SUBTRACT THE RESULT FROM w
ALLOWING FILL-IN UP TO LEVEL k
endif
INSERT w IN ROW row OF M()
% invert the pivot
M(mdiag(row)) = 1/M(mdiag(row))
% normalize the row of U
for col=mptr(row),mptr(row+1)-1
if mind(col) > row
M(col) = M(col) * M(mdiag(row))
The structure of a particular sparse matrix is likely to apply to a sequence of problems, for
instance on different time-steps, or during a Newton iteration. Thus it may pay off to perform the
above incomplete factorization first symbolically to determine the amount and location of fill-in
and use this structure for the numerically different but structurally identical matrices. In this case,
the array for the numerical values can be used to store the levels during the symbolic factorization
phase.
4.4 Parallelism
In this section we discuss aspects of parallelism in the iterative methods discussed in this book.
Since the iterative methods share most of their computational kernels we will discuss these
independent of the method. The basic time-consuming kernels of iterative schemes are:
• inner products,
• vector updates,
• matrix–vector products, e.g., Ap^{(i)} (for some methods also A^T p^{(i)}),
• preconditioner solves.
We will examine each of these in turn. We will conclude this section by discussing two par-
ticular issues, namely computational wavefronts in the SOR method, and block operations in the
GMRES method.
4.4.1 Inner products
The computation of an inner product of two vectors can be easily parallelized; each processor
computes the inner product of corresponding segments of each vector (local inner products or
LIPs). On distributed-memory machines the LIPs then have to be sent to other processors to be
combined for the global inner product. This can be done either with an all-to-all send where every
processor performs the summation of the LIPs, or by a global accumulation in one processor,
followed by a broadcast of the final result. Clearly, this step requires communication.
For shared-memory machines, the accumulation of LIPs can be implemented as a critical sec-
tion where all processors add their local result in turn to the global result, or as a piece of serial
code, where one processor performs the summations.
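On a distributed-memory machine this amounts to a local dot product followed by a single global reduction. The C/MPI fragment below is a minimal sketch; the local segment length n_loc and the segment arrays are assumptions of this example, not Templates interfaces.

#include <mpi.h>

/* Global inner product from local inner products (LIPs). */
double parallel_dot(int n_loc, const double *x_loc, const double *y_loc,
                    MPI_Comm comm)
{
    double lip = 0.0, gip;
    for (int i = 0; i < n_loc; i++)
        lip += x_loc[i] * y_loc[i];            /* local inner product      */
    MPI_Allreduce(&lip, &gip, 1, MPI_DOUBLE, MPI_SUM, comm);
    return gip;                                /* every processor gets gip */
}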
Overlapping communication and computation
Clearly, in the usual formulation of conjugate gradient-type methods the inner products induce a synchronization of the processors, since they cannot progress until the final result has been computed: updating x^{(i+1)} and r^{(i+1)} can only begin after completing the inner product for α_i. Since on a distributed-memory machine communication is needed for the inner product, we cannot overlap this communication with useful computation. The same observation applies to updating p^{(i)}, which can only begin after completing the inner product for β_{i-1}.
Figure 4.2 shows a variant of CG, in which all communication time may be overlapped with
useful computations. This is just a reorganized version of the original CG scheme, and is therefore
precisely as stable. Another advantage over other approaches (see below) is that no additional
operations are required.
This rearrangement is based on two tricks. The first is that updating the iterate is delayed to mask the communication stage of the p^{(i)T} Ap^{(i)} inner product. The second trick relies on splitting the (symmetric) preconditioner as M = LL^T, so one first computes L^{-1} r^{(i)}, after which the inner product r^{(i)T} M^{-1} r^{(i)} can be computed as s^T s where s = L^{-1} r^{(i)}. The computation of L^{-T} s will then mask the communication stage of the inner product.
x^{(-1)} = x^{(0)} = initial guess; r^{(0)} = b − Ax^{(0)};
p^{(-1)} = 0; β_{-1} = 0; α_{-1} = 0;
s = L^{-1} r^{(0)};
ρ_0 = (s, s)
for i = 0, 1, 2, ....
    w^{(i)} = L^{-T} s;
    p^{(i)} = w^{(i)} + β_{i-1} p^{(i-1)};
    q^{(i)} = Ap^{(i)};
    γ = (p^{(i)}, q^{(i)});
    x^{(i)} = x^{(i-1)} + α_{i-1} p^{(i-1)};
    α_i = ρ_i / γ;
    r^{(i+1)} = r^{(i)} − α_i q^{(i)};
    s = L^{-1} r^{(i+1)};
    ρ_{i+1} = (s, s);
    if ||r^{(i+1)}|| small enough then
        x^{(i+1)} = x^{(i)} + α_i p^{(i)}
        quit;
    endif
    β_i = ρ_{i+1} / ρ_i;
end;

Figure 4.2: A rearrangement of Conjugate Gradient for parallelism
Under the assumptions that we have made, CG can be efficiently parallelized as follows:
1. The communication required for the reduction of the inner product for γ can be overlapped with the update for x^{(i)} (which could in fact have been done in the previous iteration step).

2. The reduction of the inner product for ρ_{i+1} can be overlapped with the remaining part of the preconditioning operation at the beginning of the next iteration.

3. The computation of a segment of p^{(i)} can be followed immediately by the computation of a segment of q^{(i)}, and this can be followed by the computation of a part of the inner product. This saves on load operations for segments of p^{(i)} and q^{(i)}.
For a more detailed discussion see Demmel, Heath and Van der Vorst [66]. This algorithm can be extended trivially to preconditioners of LDL^T form, and to nonsymmetric preconditioners in the Biconjugate Gradient Method.
Fewer synchronization points
Several authors have found ways to eliminate some of the synchronization points induced by the
inner products in methods such as CG. One strategy has been to replace one of the two inner prod-
ucts typically present in conjugate gradient-like methods by one or two others in such a way that
all inner products can be performed simultaneously. The global communication can then be pack-
aged. A first such method was proposed by Saad [181] with a modification to improve its stability
suggested by Meurant [155]. Recently, related methods have been proposed by Chronopoulos and
Gear [54], D’Azevedo and Romine [61], and Eijkhout [87]. These schemes can also be applied to
nonsymmetric methods such as BiCG. The stability of such methods is discussed by D’Azevedo,
Eijkhout and Romine [60].
Another approach is to generate a number of successive Krylov vectors (see §2.3.4) and orthogonalize these as a block (see Van Rosendale [208], and Chronopoulos and Gear [54]).
4.4.2 Vector updates
Vector updates are trivially parallelizable: each processor updates its own segment.
4.4.3 Matrix-vector products
The matrix–vector products are often easily parallelized on shared-memory machines by split-
ting the matrix in strips corresponding to the vector segments. Each processor then computes the
matrix–vector product of one strip. For distributed-memory machines, there may be a problem if
each processor has only a segment of the vector in its memory. Depending on the bandwidth of
the matrix, we may need communication for other elements of the vector, which may lead to com-
munication bottlenecks. However, many sparse matrix problems arise from a network in which
only nearby nodes are connected. For example, matrices stemming from finite difference or finite
element problems typically involve only local connections: matrix element a
i,j
is nonzero only if
variables i and j are physically close. In such a case, it seems natural to subdivide the network, or
grid, into suitable blocks and to distribute them over the processors. When computing Ap
i
, each
processor requires the values of p
i
at some nodes in neighboring blocks. If the number of con-
nections to these neighboring blocks is small compared to the number of internal nodes, then the
communication time can be overlapped with computational work. For more detailed discussions on
implementation aspects for distributed memory systems, see De Sturler [62] and Pommerell [174].
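The overlap of this communication with computation is typically organized with nonblocking message passing. The following C/MPI fragment is a schematic sketch only; the neighbor lists, buffers, and the split of the local rows into interior rows (which need no off-processor data) and boundary rows are hypothetical names introduced here for illustration.

#include <mpi.h>

#define MAX_NBR 64                       /* assumed neighbor limit (nnbr <= MAX_NBR) */

/* One distributed matrix-vector product with communication overlap. */
void halo_matvec(int nnbr, const int *nbr,
                 double **sendbuf, const int *sendcnt,
                 double **recvbuf, const int *recvcnt,
                 MPI_Comm comm,
                 void (*spmv_interior)(void), void (*spmv_boundary)(void))
{
    MPI_Request req[2 * MAX_NBR];
    int nreq = 0;

    for (int k = 0; k < nnbr; k++) {     /* start the halo exchange       */
        MPI_Irecv(recvbuf[k], recvcnt[k], MPI_DOUBLE, nbr[k], 0, comm,
                  &req[nreq++]);
        MPI_Isend(sendbuf[k], sendcnt[k], MPI_DOUBLE, nbr[k], 0, comm,
                  &req[nreq++]);
    }
    spmv_interior();                     /* rows using only local data    */
    MPI_Waitall(nreq, req, MPI_STATUSES_IGNORE);
    spmv_boundary();                     /* rows using received values    */
}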
4.4.4 Preconditioning
Preconditioning is often the most problematic part of parallelizing an iterative method. We will
mention a number of approaches to obtaining parallelism in preconditioning.
Discovering parallelism in sequential preconditioners. Certain preconditioners were not de-
veloped with parallelism in mind, but they can be executed in parallel. Some examples are domain
decomposition methods (see §5.4), which provide a high degree of coarse grained parallelism, and polynomial preconditioners (see §3.5), which have the same parallelism as the matrix-vector product.
Incomplete factorization preconditioners are usually much harder to parallelize: using wave-
fronts of independent computations (see for instance Paolini and Radicati di Brozolo [169]) a
modest amount of parallelism can be attained, but the implementation is complicated. For in-
stance, a central difference discretization on regular grids gives wavefronts that are hyperplanes
(see Van der Vorst [201, 203]).
More parallel variants of sequential preconditioners. Variants of existing sequential incom-
plete factorization preconditioners with a higher degree of parallelism have been devised, though
they are perhaps less efficient in purely scalar terms than their ancestors. Some examples are: re-
orderings of the variables (see Duff and Meurant [78] and Eijkhout [84]), expansion of the factors
in a truncated Neumann series (see Van der Vorst [199]), various block factorization methods (see
Axelsson and Eijkhout [15] and Axelsson and Polman [21]), and multicolor preconditioners.
Multicolor preconditioners have optimal parallelism among incomplete factorization methods,
since the minimal number of sequential steps equals the color number of the matrix graphs. For
theory and applications to parallelism see Jones and Plassman [126, 127].
Fully decoupled preconditioners. If all processors execute their part of the preconditioner solve
without further communication, the overall method is technically a block Jacobi preconditioner
(see §3.2.1). While the parallel execution of such decoupled preconditioners is very efficient, they may not be as effective as more
complicated, less parallel preconditioners, since improvement in the number of iterations may be
only modest. To get a bigger improvement while retaining the efficient parallel execution, Radi-
cati di Brozolo and Robert [177] suggest that one construct incomplete decompositions on slightly
overlapping domains. This requires communication similar to that for matrix–vector products.
4.4.5 Wavefronts in the Gauss-Seidel and Conjugate Gradient methods
At first sight, the Gauss-Seidel method (and the SOR method which has the same basic structure)
seems to be a fully sequential method. A more careful analysis, however, reveals a high degree of
parallelism if the method is applied to sparse matrices such as those arising from discretized partial
differential equations.
We start by partitioning the unknowns in wavefronts. The first wavefront contains those un-
knowns that (in the directed graph of D − L) have no predecessor; subsequent wavefronts are
then sets (this definition is not necessarily unique) of successors of elements of the previous wave-
front(s), such that no successor/predecessor relations hold among the elements of this set. It is clear
that all elements of a wavefront can be processed simultaneously, so the sequential time of solving
a system with D −L can be reduced to the number of wavefronts.
Next, we observe that the unknowns in a wavefront can be computed as soon as all wavefronts
containing its predecessors have been computed. Thus we can, in the absence of tests for con-
vergence, have components from several iterations being computed simultaneously. Adams and
Jordan [2] observe that in this way the natural ordering of unknowns gives an iterative method that
is mathematically equivalent to a multi-color ordering.
In the multi-color ordering, all wavefronts of the same color are processed simultaneously. This
reduces the number of sequential steps for solving the Gauss-Seidel matrix to the number of colors,
which is the smallest number d such that wavefront i contains no elements that are a predecessor
of an element in wavefront i +d.
As demonstrated by O’Leary [163], SOR theory still holds in an approximate sense for multi-
colored matrices. The above observation that the Gauss-Seidel method with the natural ordering
is equivalent to a multicoloring cannot be extended to the SSOR method or wavefront-based in-
complete factorization preconditioners for the Conjugate Gradient method. In fact, tests by Duff
and Meurant [78] and an analysis by Eijkhout [84] show that multicolor incomplete factorization
preconditioners in general may take a considerably larger number of iterations to converge than
preconditioners based on the natural ordering. Whether this is offset by the increased parallelism
depends on the application and the computer architecture.
4.4.6 Blocked operations in the GMRES method
In addition to the usual matrix-vector product, inner products and vector updates, the preconditioned GMRES method (see §2.3.4) has a kernel where one new vector, M^{-1}Av^{(j)}, is orthogonalized against the previously built orthogonal set {v^{(1)}, v^{(2)}, . . . , v^{(j)}}. In our version, this is done
using Level 1 BLAS, which may be quite inefficient. To incorporate Level 2 BLAS we can apply
either Householder orthogonalization or classical Gram-Schmidt twice (which mitigates classical
Gram-Schmidt’s potential instability; see Saad [184]). Both approaches significantly increase the
computational work, but using classical Gram-Schmidt has the advantage that all inner products can
be performed simultaneously; that is, their communication can be packaged. This may increase the
efficiency of the computation significantly.
Another way to obtain more parallelism and data locality is to generate a basis {v^{(1)}, Av^{(1)}, . . . , A^m v^{(1)}} for the Krylov subspace first, and to orthogonalize this set afterwards; this is called m-step GMRES(m) (see Kim and Chronopoulos [138]). (Compare this to the GMRES method in §2.3.4, where each new vector is immediately orthogonalized to all previous vectors.) This approach does not increase the computational work and, in contrast to CG, the numerical instability due to generating a possibly near-dependent set is not necessarily a drawback.
Chapter 5
Remaining topics
5.1 The Lanczos Connection
As discussed by Paige and Saunders in [167] and by Golub and Van Loan in [108], it is straightfor-
ward to derive the conjugate gradient method for solving symmetric positive definite linear systems
from the Lanczos algorithm for solving symmetric eigensystems and vice versa. As an example,
let us consider how one can derive the Lanczos process for symmetric eigensystems from the (un-
preconditioned) conjugate gradient method.
Suppose we define the n × k matrix R_k by

R_k = [r^{(0)}, r^{(1)}, . . . , r^{(k-1)}],
and the k × k upper bidiagonal matrix B_k by

       [ 1  −β_1                       ]
       [      1   −β_2                 ]
B_k =  [           .      .            ]
       [             .        .        ]
       [                1   −β_{k-1}   ]
       [                        1      ] ,
where the sequences {r^{(k)}} and {β_k} are defined by the standard conjugate gradient algorithm discussed in §2.3.1. From the equations
p^{(j)} = r^{(j-1)} + β_{j-1} p^{(j-1)},    j = 2, 3, . . . , k,

and p^{(1)} = r^{(0)}, we have R_k = P_k B_k, where

P_k = [p^{(1)}, p^{(2)}, . . . , p^{(k)}].
Assuming the elements of the sequence {p^{(j)}} are A-conjugate, it follows that

T̂_k = R_k^T A R_k = B_k^T Λ̂_k B_k

is a tridiagonal matrix since
Λ̂_k = diag( p^{(1)T}Ap^{(1)}, p^{(2)T}Ap^{(2)}, . . . , p^{(k)T}Ap^{(k)} ).
Since span{p^{(1)}, p^{(2)}, . . . , p^{(j)}} = span{r^{(0)}, r^{(1)}, . . . , r^{(j-1)}} and since the elements of {r^{(j)}} are mutually orthogonal, it can be shown that the columns of the n × k matrix Q_k = R_k Δ^{-1} form an orthonormal basis for the subspace span{b, Ab, . . . , A^{k-1}b}, where Δ is a diagonal matrix whose ith diagonal element is ||r^{(i)}||_2. The columns of the matrix Q_k are the Lanczos vectors (see Parlett [170]) whose associated projection of A is the tridiagonal matrix

T_k = Δ^{-1} B_k^T Λ̂_k B_k Δ^{-1} .                    (5.1)
The extremal eigenvalues of T_k approximate those of the matrix A. Hence, the diagonal and subdiagonal elements of T_k in (5.1), which are readily available during iterations of the conjugate gradient algorithm (§2.3.1), can be used to construct T_k after k CG iterations. This allows us to obtain good approximations to the extremal eigenvalues (and hence the condition number) of the matrix A while we are generating approximations, x^{(k)}, to the solution of the linear system Ax = b.
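For reference, the entries of T_k are often written directly in terms of the CG coefficients α_j and β_j of §2.3.1. The expressions below are the ones commonly quoted in the literature and are stated here only as a convenience; they are not derived in this section, and the sign of the off-diagonal entries depends on the normalization of the Lanczos vectors (which does not affect the eigenvalue estimates):

(T_k)_{j,j} = 1/α_j + β_{j-1}/α_{j-1},     (T_k)_{j,j+1} = (T_k)_{j+1,j} = √β_j / α_j,

with the term β_0/α_0 taken to be zero, so that (T_k)_{1,1} = 1/α_1.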
For a nonsymmetric matrix A, an equivalent nonsymmetric Lanczos algorithm (see Lanczos [141]) would produce a nonsymmetric matrix T_k in (5.1) whose extremal eigenvalues (which may include complex-conjugate pairs) approximate those of A. The nonsymmetric Lanczos method is equivalent to the BiCG method discussed in §2.3.5.
5.2 Block and s-step Iterative Methods
The methods discussed so far are all subspace methods, that is, in every iteration they extend the
dimension of the subspace generated. In fact, they generate an orthogonal basis for this subspace,
by orthogonalizing the newly generated vector with respect to the previous basis vectors.
However, in the case of nonsymmetric coefficient matrices the newly generated vector may be
almost linearly dependent on the existing basis. To prevent break-down or severe numerical error in
such instances, methods have been proposed that perform a look-ahead step (see Freund, Gutknecht
and Nachtigal [100], Parlett, Taylor and Liu [171], and Freund and Nachtigal [101]).
Several new, unorthogonalized, basis vectors are generated and are then orthogonalized with
respect to the subspace already generated. Instead of generating a basis, such a method generates a
series of low-dimensional orthogonal subspaces.
The s-step iterative methods of Chronopoulos and Gear [54] use this strategy of generating
unorthogonalized vectors and processing them as a block to reduce computational overhead and
improve processor cache behaviour.
If conjugate gradient methods are considered to generate a factorization of a tridiagonal re-
duction of the original matrix, then look-ahead methods generate a block factorization of a block
tridiagonal reduction of the matrix.
A block tridiagonal reduction is also effected by the Block Lanczos algorithm and the Block
Conjugate Gradient method (see O’Leary [162]). Such methods operate on multiple linear systems
with the same coefficient matrix simultaneously, for instance with multiple right hand sides, or the
same right hand side but with different initial guesses. Since these block methods use multiple
search directions in each step, their convergence behavior is better than for ordinary methods. In
fact, one can show that the spectrum of the matrix is effectively reduced by the n_b − 1 smallest eigenvalues, where n_b is the block size.
5.3 Reduced System Preconditioning
As we have seen earlier, a suitable preconditioner for CG is a matrix M such that the system

M^{-1} A x = M^{-1} f

requires fewer iterations to solve than Ax = f does, and for which systems Mz = r can be solved
efficiently. The first property is independent of the machine used, while the second is highly ma-
chine dependent. Choosing the best preconditioner involves balancing those two criteria in a way
that minimizes the overall computation time. One balancing approach used for matrices A arising
from 5-point finite difference discretization of second order elliptic partial differential equations
(PDEs) with Dirichlet boundary conditions involves solving a reduced system. Specifically, for an
n × n grid, we can use a point red-black ordering of the nodes to get

Ax = [ D_R   C   ] [ x_R ]  =  [ f_R ]
     [ C^T   D_B ] [ x_B ]     [ f_B ] ,                    (5.2)
where D_R and D_B are diagonal, and C is a well-structured sparse matrix with 5 nonzero diagonals if n is even and 4 nonzero diagonals if n is odd. Applying one step of block Gaussian elimination (or computing the Schur complement; see Golub and Van Loan [108]) we have
[ I               O ] [ D_R   C   ] [ x_R ]     [ I               O ] [ f_R ]
[ −C^T D_R^{-1}   I ] [ C^T   D_B ] [ x_B ]  =  [ −C^T D_R^{-1}   I ] [ f_B ]
which reduces to

[ D_R    C                     ] [ x_R ]     [ f_R                     ]
[ O      D_B − C^T D_R^{-1} C  ] [ x_B ]  =  [ f_B − C^T D_R^{-1} f_R  ] .
With proper scaling (left and right multiplication by D_B^{-1/2}), we obtain from the second block equation the reduced system

(I − H^T H) y = g ,                    (5.3)

where H = D_R^{-1/2} C D_B^{-1/2}, y = D_B^{1/2} x_B, and g = D_B^{-1/2} (f_B − C^T D_R^{-1} f_R). The linear system (5.3) is of order n^2/2 for even n and of order (n^2 − 1)/2 for odd n. Once y is determined, the solution x is easily retrieved from y. The values on the black points are those that would be obtained from a red/black ordered SSOR preconditioner on the full system, so we expect faster convergence.
The number of nonzero coefficients is reduced, although the coefficient matrix in (5.3) has
nine nonzero diagonals. Therefore it has higher density and offers more data locality. Meier and
Sameh [149] demonstrate that the reduced system approach on hierarchical memory machines such
as the Alliant FX/8 is over 3 times faster than unpreconditioned CG for Poisson’s equation on n × n
grids with n ≥ 250.
For 3-dimensional elliptic PDEs, the reduced system approach yields a block tridiagonal ma-
trix C in (5.2) having diagonal blocks of the structure of C from the 2-dimensional case and off-
diagonal blocks that are diagonal matrices. Computing the reduced system explicitly leads to an
unreasonable increase in the computational complexity of solving Ax = f. The matrix products
required to solve (5.3) would therefore be performed implicitly which could significantly decrease
performance. However, as Meier and Sameh show [149], the reduced system approach can still
be about 2-3 times as fast as the conjugate gradient method with Jacobi preconditioning for 3-
dimensional problems.
5.4 Domain Decomposition Methods
In recent years, much attention has been given to domain decomposition methods for linear el-
liptic problems that are based on a partitioning of the domain of the physical problem. Since the
subdomains can be handled independently, such methods are very attractive for coarse-grain par-
allel computers. On the other hand, it should be stressed that they can be very effective even on
sequential computers.
In this brief survey, we shall restrict ourselves to the standard second order self-adjoint scalar
elliptic problems in two dimensions of the form:
−∇ · (a(x, y)∇u) = f(x, y)                    (5.4)
where a(x, y) is a positive function on the domain Ω, on whose boundary the value of u is pre-
scribed (the Dirichlet problem). For more general problems, and a good set of references, the reader
is referred to the series of proceedings [47, 48, 49, 106, 134, 176].
For the discretization of (5.4), we shall assume for simplicity that Ω is triangulated by a set T_H of nonoverlapping coarse triangles (subdomains) Ω_i, i = 1, ..., p, with n_H internal vertices. The Ω_i's are in turn further refined into a set of smaller triangles T_h with n internal vertices in total. Here H, h denote the coarse and fine mesh size respectively. By a standard Ritz-Galerkin method using piecewise linear triangular basis elements on (5.4), we obtain an n × n symmetric positive definite linear system Au = f.
Generally, there are two kinds of approaches depending on whether the subdomains overlap
with one another (Schwarz methods) or are separated from one another by interfaces (Schur Com-
plement methods, iterative substructuring).
We shall present domain decomposition methods as preconditioners for the linear system Au = f or for a reduced (Schur Complement) system S_B u_B = g_B defined on the interfaces in the non-overlapping formulation. When used with the standard Krylov subspace methods discussed elsewhere in this book, the user has to supply a procedure for computing Av or Sw given v or w, and the algorithms to be described herein compute K^{-1}v. The computation of Av is a simple sparse matrix-vector multiply, but Sw may require subdomain solves, as will be described later.
5.4.1 Overlapping Subdomain Methods
In this approach, each substructure Ω_i is extended to a larger substructure Ω'_i containing n'_i internal vertices and all the triangles T'_i ⊂ T_h within a distance δ from Ω_i, where δ refers to the amount of overlap.
Let A'_i, A_H denote the discretizations of (5.4) on the subdomain triangulation T'_i and the coarse triangulation T_H respectively.
Let R_i^T denote the extension operator which extends (by zero) a function on T'_i to T_h, and R_i the corresponding pointwise restriction operator. Similarly, let R_H^T denote the interpolation operator which maps a function on the coarse grid T_H onto the fine grid T_h by piecewise linear interpolation, and R_H the corresponding weighted restriction operator.
With these notations, the Additive Schwarz Preconditioner K_as for the system Au = f can be compactly described as:

K_as^{-1} v = R_H^T A_H^{-1} R_H v + Σ_{i=1}^{p} R_i^T (A'_i)^{-1} R_i v.
Note that the right hand side can be computed using p subdomain solves using the A'_i's, plus a coarse grid solve using A_H, all of which can be computed in parallel. Each term R_i^T (A'_i)^{-1} R_i v should be evaluated in three steps: (1) Restriction: v_i = R_i v, (2) Subdomain solves for w_i: A'_i w_i = v_i, (3) Interpolation: y_i = R_i^T w_i. The coarse grid solve is handled in the same manner.
The theory of Dryja and Widlund [75] shows that the condition number of K_as^{-1} A is bounded independently of both the coarse grid size H and the fine grid size h, provided there is “sufficient” overlap between Ω_i and Ω'_i (essentially it means that the ratio δ/H of the distance δ of the boundary ∂Ω'_i to ∂Ω_i should be uniformly bounded from below as h → 0). If the coarse grid solve term is left out, then the condition number grows as O(1/H^2), reflecting the lack of global coupling provided by the coarse grid.
For the purpose of implementations, it is often useful to interpret the definition of K_as in matrix notation. Thus the decomposition of Ω into the Ω'_i's corresponds to a partitioning of the components of the vector u into p overlapping groups of index sets I_i's, each with n'_i components. The n'_i × n'_i matrix A'_i is simply a principal submatrix of A corresponding to the index set I_i. R_i^T is an n × n'_i matrix defined by its action on a vector u_i defined on T'_i as: (R_i^T u_i)_j = (u_i)_j if j ∈ I_i but is zero otherwise. Similarly, the action of its transpose R_i u forms an n'_i-vector by picking off
the components of u corresponding to I_i. Analogously, R_H^T is an n × n_H matrix with entries corresponding to piecewise linear interpolation, and its transpose can be interpreted as a weighted restriction matrix. If T_h is obtained from T_H by nested refinement, the action of R_H, R_H^T can be efficiently computed as in a standard multigrid algorithm. Note that the matrices R_i^T, R_i, R_H^T, R_H are defined by their actions and need not be stored explicitly.
We also note that in this algebraic formulation, the preconditioner K_as can be extended to any matrix A, not necessarily one arising from a discretization of an elliptic problem. Once we have the partitioning index sets I_i's, the matrices R_i, A'_i are defined. Furthermore, if A is positive definite, then A'_i is guaranteed to be nonsingular. The difficulty is in defining the “coarse grid” matrices A_H, R_H, which inherently depends on knowledge of the grid structure. This part of the preconditioner can be left out, at the expense of a deteriorating convergence rate as p increases. Radicati and Robert [177] have experimented with such an algebraic overlapping block Jacobi preconditioner.
5.4.2 Non-overlapping Subdomain Methods
The easiest way to describe this approach is through matrix notation. The set of vertices of T_h can be divided into two groups: the set of interior vertices I of all Ω_i and the set of vertices B which lie on the boundaries ∪_i ∂Ω_i of the coarse triangles Ω_i in T_H. We shall re-order u and f as u ≡ (u_I, u_B)^T and f ≡ (f_I, f_B)^T corresponding to this partition. In this ordering, equation (5.4) can be written as follows:

[ A_I      A_IB ] [ u_I ]     [ f_I ]
[ A_IB^T   A_B  ] [ u_B ]  =  [ f_B ] .                    (5.5)
We note that since the subdomains are uncoupled by the boundary vertices, A_I = blockdiagonal(A_i) is block-diagonal with each block A_i being the stiffness matrix corresponding to the unknowns belonging to the interior vertices of subdomain Ω_i.
By a block LU-factorization of A, the system in (5.5) can be written as:

[ I                  0 ] [ A_I   A_IB ] [ u_I ]     [ f_I ]
[ A_IB^T A_I^{-1}    I ] [ 0     S_B  ] [ u_B ]  =  [ f_B ] ,                    (5.6)
where

S_B ≡ A_B − A_IB^T A_I^{-1} A_IB

is the Schur complement of A_B in A.
By eliminating u_I in (5.6), we arrive at the following equation for u_B:

S_B u_B = g_B ≡ f_B − A_IB^T A_I^{-1} f_I.                    (5.7)
We note the following properties of this Schur Complement system:

1. S_B inherits the symmetric positive definiteness of A.

2. S_B is dense in general and computing it explicitly requires as many solves on each subdomain as there are points on each of its edges.

3. The condition number of S_B is O(h^{-1}), an improvement over the O(h^{-2}) growth for A.

4. Given a vector v_B defined on the boundary vertices B of T_H, the matrix-vector product S_B v_B can be computed according to A_B v_B − A_IB^T (A_I^{-1} (A_IB v_B)), where A_I^{-1} involves p independent subdomain solves using A_i^{-1}.

5. The right hand side g_B can also be computed using p independent subdomain solves.
These properties make it possible to apply a preconditioned iterative method to (5.7), which is
the basic method in the nonoverlapping substructuring approach. We will also need some good
preconditioners to further improve the convergence of the Schur system.
We shall first describe the Bramble-Pasciak-Schatz preconditioner [36]. For this, we need to
further decompose B into two non-overlapping index sets:
B = E ∪ V_H                    (5.8)
where V_H = ∪_k V_k denotes the set of nodes corresponding to the vertices V_k's of T_H, and E = ∪_i E_i denotes the set of nodes on the edges E_i's of the coarse triangles in T_H, excluding the vertices belonging to V_H.
In addition to the coarse grid interpolation and restriction operators R_H, R_H^T defined before, we shall also need a new set of interpolation and restriction operators for the edges E_i's. Let R_{E_i} be the pointwise restriction operator (an n_{E_i} × n matrix, where n_{E_i} is the number of vertices on the edge E_i) onto the edge E_i, defined by its action (R_{E_i} u_B)_j = (u_B)_j if j ∈ E_i but is zero otherwise. The action of its transpose extends by zero a function defined on E_i to one defined on B.
Corresponding to this partition of B, S can be written in the block form:

S_B = [ S_E      S_EV ]
      [ S_EV^T   S_V  ] .                    (5.9)
The block S_E can again be block partitioned, with most of the subblocks being zero. The diagonal blocks S_{E_i} of S_E are the principal submatrices of S corresponding to E_i. Each S_{E_i} represents the coupling of nodes on interface E_i separating two neighboring subdomains.
In defining the preconditioner, the action of S_{E_i}^{-1} is needed. However, as noted before, in general S_{E_i} is a dense matrix which is also expensive to compute, and even if we had it, it would be expensive to compute its action (we would need to compute its inverse or a Cholesky factorization). Fortunately, many efficiently invertible approximations to S_{E_i} have been proposed in the literature (see Keyes and Gropp [135]) and we shall use these so-called interface preconditioners for S_{E_i} instead. We mention one specific preconditioner:
M_{E_i} = α_{E_i} K^{1/2} ,

where K is an n_{E_i} × n_{E_i} one dimensional Laplacian matrix, namely a tridiagonal matrix with 2's down the main diagonal and −1's down the two off-diagonals, and α_{E_i} is taken to be some average of the coefficient a(x, y) of (5.4) on the edge E_i. We note that since the eigen-decomposition of K is known and computable by the Fast Sine Transform, the action of M_{E_i} can be efficiently computed.
With these notations, the Bramble-Pasciak-Schatz preconditioner is defined by its action on a vector v_B defined on B as follows:

K_BPS^{-1} v_B = R_H^T A_H^{-1} R_H v_B + Σ_{E_i} R_{E_i}^T M_{E_i}^{-1} R_{E_i} v_B.                    (5.10)
Analogous to the additive Schwarz preconditioner, the computation of each term consists of the
three steps of restriction-inversion-interpolation and is independent of the others and thus can be
carried out in parallel.
Bramble, Pasciak and Schatz [36] prove that the condition number of K_BPS^{-1} S_B is bounded by O(1 + log^2(H/h)). In practice, there is a slight growth in the number of iterations as h becomes small (i.e., as the fine grid is refined) or as H becomes large (i.e., as the coarse grid becomes coarser).
The log^2(H/h) growth is due to the coupling of the unknowns on the edges incident on a common vertex V_k, which is not accounted for in K_BPS. Smith [190] proposed a vertex space modification to K_BPS which explicitly accounts for this coupling and therefore eliminates the dependence on H and h. The idea is to introduce further subsets of B called vertex spaces X = ∪_k X_k, with X_k consisting of a small set of vertices on the edges incident on the vertex V_k and adjacent to it.
Note that X overlaps with E and V_H. Let S_{X_k} be the principal submatrix of S_B corresponding to X_k, and R_{X_k}, R_{X_k}^T be the corresponding restriction (pointwise) and extension (by zero) matrices. As before, S_{X_k} is dense and expensive to compute and factor/solve, but efficiently invertible approximations (some using variants of the K^{1/2} operator defined before) have been developed in the literature (see Chan, Mathew and Shao [51]). We shall let M_{X_k} be such a preconditioner for S_{X_k}. Then Smith’s Vertex Space preconditioner is defined by:
K_VS^{-1} v_B = R_H^T A_H^{-1} R_H v_B + Σ_{E_i} R_{E_i}^T M_{E_i}^{-1} R_{E_i} v_B + Σ_{X_k} R_{X_k}^T M_{X_k}^{-1} R_{X_k} v_B.                    (5.11)
Smith [190] proved that the condition number of K_VS^{-1} S_B is bounded independent of H and h, provided there is sufficient overlap of X_k with B.
5.4.3 Further Remarks
Multiplicative Schwarz Methods
As mentioned before, the Additive Schwarz preconditioner can be viewed as an overlapping block
Jacobi preconditioner. Analogously, one can define a multiplicative Schwarz preconditioner which
corresponds to a symmetric block Gauss-Seidel version. That is, the solves on each subdomain are
performed sequentially, using the most current iterates as boundary conditions from neighboring
subdomains. Even without conjugate gradient acceleration, the multiplicative method can take
many fewer iterations than the additive version. However, the multiplicative version is not as
parallelizable, although the degree of parallelism can be increased by trading off the convergence
rate through multi-coloring the subdomains. The theory can be found in Bramble, et al. [37].
Inexact Solves
The exact solves involving (A'_i)^{-1}, A_i^{-1} and A_H^{-1} in K_as, K_BPS, K_VS can be replaced by inexact solves (Ã'_i)^{-1}, Ã_i^{-1} and Ã_H^{-1}, which can be standard elliptic preconditioners themselves (e.g. multigrid, ILU, SSOR, etc.).
For the Schwarz methods, the modification is straightforward and the Inexact Solve Additive Schwarz Preconditioner is simply:

K̃_as^{-1} v = R_H^T Ã_H^{-1} R_H v + Σ_{i=1}^{p} R_i^T (Ã'_i)^{-1} R_i v.
The Schur Complement methods require more changes to accommodate inexact solves. By replacing A_H^{-1} by Ã_H^{-1} in the definitions of K_BPS and K_VS, we can easily obtain inexact preconditioners K̃_BPS and K̃_VS for S_B. The main difficulty is, however, that the evaluation of the product S_B v_B requires exact subdomain solves in A_I^{-1}. One way to get around this is to use an inner iteration using Ã_i as a preconditioner for A_i in order to compute the action of A_I^{-1}. An alternative is to perform the iteration on the larger system (5.5) and construct a preconditioner from the factorization in (5.6) by replacing the terms A_I, S_B by Ã_I, S̃_B respectively, where S̃_B can be either K̃_BPS or K̃_VS. Care must be taken to scale Ã_H and Ã_i so that they are as close to A_H and A_i as possible, respectively; it is not sufficient that the condition number of Ã_H^{-1} A_H and Ã_i^{-1} A_i be close to unity, because the scaling of the coupling matrix A_IB may be wrong.
Nonsymmetric Problems
The preconditioners given above extend naturally to nonsymmetric A’s (e.g., convection-diffusion
problems), at least when the nonsymmetric part is not too large. The nice theoretical convergence
rates can be retained provided that the coarse grid size H is chosen small enough (depending
on the size of the nonsymmetric part of A) (see Cai and Widlund [43]). Practical implementations
(especially for parallelism) of nonsymmetric domain decomposition methods are discussed in [136,
137].
Choice of Coarse Grid Size H
Given h, it has been observed empirically (see Gropp and Keyes [110]) that there often exists an
optimal value of H which minimizes the total computational time for solving the problem. A small
H provides a better, but more expensive, coarse grid approximation, and requires more, but smaller, subdomain solves. A large H has the opposite effect. For model problems, the optimal H
can be determined for both sequential and parallel implementations (see Chan and Shao [52]). In
practice, it may pay to determine a near optimal value of H empirically if the preconditioner is to
be re-used many times. However, there may also be geometric constraints on the range of values
that H can take.
5.5 Multigrid Methods
Simple iterative methods (such as the Jacobi method) tend to damp out high frequency components
of the error fastest (see §2.2.1). This has led people to develop methods based on the following
heuristic:
1. Perform some steps of a basic method in order to smooth out the error.
2. Restrict the current state of the problem to a subset of the grid points, the so-called “coarse
grid”, and solve the resulting projected problem.
3. Interpolate the coarse grid solution back to the original grid, and perform a number of steps
of the basic method again.
Steps 1 and 3 are called “pre-smoothing” and “post-smoothing” respectively; by applying this
method recursively to step 2 it becomes a true “multigrid” method. Usually the generation of
subsequently coarser grids is halted at a point where the number of variables becomes small enough
that direct solution of the linear system is feasible.
The method outlined above is said to be a “V-cycle” method, since it descends through a se-
quence of subsequently coarser grids, and then ascends this sequence in reverse order. A “W-cycle”
method results from visiting the coarse grid twice, with possibly some smoothing steps in between.
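The recursive structure of the V-cycle can be sketched in a few lines of C. The fragment below only illustrates the control flow of steps 1-3 above; the grid hierarchy, the smoother, the transfer operators, and all names in the structure are hypothetical and would be supplied by an actual multigrid code.

#include <string.h>

typedef struct {
    int nlevels;                       /* number of grids, 0 = finest       */
    int *size;                         /* number of unknowns per level      */
    double **u, **f, **r;              /* per-level solution, rhs, residual */
    void (*smooth)(int lvl, double *u, const double *f, int steps);
    void (*residual)(int lvl, const double *u, const double *f, double *r);
    void (*restrict_to_coarse)(int lvl, const double *r_fine, double *f_coarse);
    void (*prolong_and_add)(int lvl, const double *e_coarse, double *u_fine);
    void (*solve_coarsest)(double *u, const double *f);
} mg_hierarchy;

void v_cycle(mg_hierarchy *mg, int lvl)
{
    if (lvl == mg->nlevels - 1) {              /* coarsest grid: solve directly */
        mg->solve_coarsest(mg->u[lvl], mg->f[lvl]);
        return;
    }
    mg->smooth(lvl, mg->u[lvl], mg->f[lvl], 2);            /* 1. pre-smoothing     */
    mg->residual(lvl, mg->u[lvl], mg->f[lvl], mg->r[lvl]);
    mg->restrict_to_coarse(lvl, mg->r[lvl], mg->f[lvl+1]); /* 2. coarse problem    */
    memset(mg->u[lvl+1], 0, mg->size[lvl+1] * sizeof(double));
    v_cycle(mg, lvl + 1);                                  /*    solved recursively */
    mg->prolong_and_add(lvl, mg->u[lvl+1], mg->u[lvl]);    /* 3. correct and        */
    mg->smooth(lvl, mg->u[lvl], mg->f[lvl], 2);            /*    post-smooth        */
}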
An analysis of multigrid methods is relatively straightforward in the case of simple differential
operators such as the Poisson operator on tensor product grids. In that case, each next coarse grid
is taken to have the double grid spacing of the previous grid. In two dimensions, a coarse grid
will have one quarter of the number of points of the corresponding fine grid. Since the coarse grid
is again a tensor product grid, a Fourier analysis (see for instance Briggs [42]) can be used. For
the more general case of self-adjoint elliptic operators on arbitrary domains a more sophisticated
analysis is needed (see Hackbusch [116], McCormick [147]). Many multigrid methods can be
shown to have an (almost) optimal number of operations, that is, the work involved is proportional
to the number of variables.
From the above description it is clear that iterative methods play a role in multigrid theory as
smoothers (see Kettler [132]). Conversely, multigrid-like methods can be used as preconditioners
in iterative methods. The basic idea here is to partition the matrix on a given grid to a 2 × 2 structure

A^{(i)} = [ A^{(i)}_{1,1}   A^{(i)}_{1,2} ]
          [ A^{(i)}_{2,1}   A^{(i)}_{2,2} ]
with the variables in the second block row corresponding to the coarse grid nodes. The matrix on
the next grid is then an incomplete version of the Schur complement
A^{(i+1)} ≈ S^{(i)} = A^{(i)}_{2,2} − A^{(i)}_{2,1} (A^{(i)}_{1,1})^{-1} A^{(i)}_{1,2} .
The coarse grid is typically formed based on a red-black or cyclic reduction ordering; see for
instance Rodrigue and Wolitzer [179], and Elman [92].
Some multigrid preconditioners try to obtain optimality results similar to those for the full
multigrid method. Here we will merely supply some pointers to the literature: Axelsson and Eijk-
hout [16], Axelsson and Vassilevski [23, 22], Braess [35], Maitre and Musy [144], McCormick and
Thomas [148], Yserentant [216] and Wesseling [213].
5.6 Row Projection Methods
Most iterative methods depend on spectral properties of the coefficient matrix, for instance some
require the eigenvalues to be in the right half plane. A class of methods without this limitation is
that of row projection methods (see Björck and Elfving [34], and Bramley and Sameh [38]). They
are based on a block row partitioning of the coefficient matrix
\[
A^T = [A_1, \ldots, A_m],
\]
and iterative application of orthogonal projectors
\[
P_i x = A_i (A_i^T A_i)^{-1} A_i^T x.
\]
These methods have good parallel properties and seem to be robust in handling nonsymmetric and
indefinite problems.
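As a concrete illustration, the classical Kaczmarz iteration is the special case in which every block A_i consists of a single row of A; one sweep over the rows can be sketched in C as follows. This is only a sketch of that single-row special case, not of the block methods referenced above, and it assumes A is stored densely by rows.

      #include <stddef.h>

      /* One sweep of the (row-by-row) Kaczmarz method for A x = b, with the
         m-by-n matrix A stored by rows in a[i*n + j]. Each step orthogonally
         projects the current iterate onto the hyperplane a_i^T y = b_i.      */
      void kaczmarz_sweep(size_t m, size_t n, const double *a,
                          const double *b, double *x)
      {
          for (size_t i = 0; i < m; i++) {
              const double *ai = a + i * n;
              double dot = 0.0, nrm2 = 0.0;
              for (size_t j = 0; j < n; j++) {
                  dot  += ai[j] * x[j];          /* a_i^T x      */
                  nrm2 += ai[j] * ai[j];         /* ||a_i||^2    */
              }
              if (nrm2 == 0.0) continue;         /* skip zero rows */
              double alpha = (b[i] - dot) / nrm2;
              for (size_t j = 0; j < n; j++)
                  x[j] += alpha * ai[j];
          }
      }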
Row projection methods can be used as preconditioners in the conjugate gradient method. In
that case, there is a theoretical connection with the conjugate gradient method on the normal equa-
tions (see §2.3.3).
Appendix A
Obtaining the Software
A large body of numerical software is freely available 24 hours a day via an electronic service called
Netlib. In addition to the template material, there are dozens of other libraries, technical reports on
various parallel computers and software, test data, facilities to automatically translate FORTRAN
programs to C, bibliographies, names and addresses of scientists and mathematicians, and so on.
One can communicate with Netlib in one of a number of ways: by email, (much more easily) via
an X-window interface called XNetlib, or through anonymous ftp (netlib2.cs.utk.edu).
To get started using netlib, one sends a message of the form “send index” to netlib@ornl.gov.
A description of the entire library should be sent to you within minutes (providing all the interven-
ing networks as well as the netlib server are up).
FORTRAN and C versions of the templates for each method described in this book are available
from Netlib. A user sends a request by electronic mail as follows:
mail netlib@ornl.gov
On the subject line or in the body, single or multiple requests (one per line) may be made as follows:
send index from linalg
send sftemplates.shar from linalg
The first request results in a return e-mail message containing the index from the library linalg,
along with brief descriptions of its contents. The second request results in a return e-mail message
consisting of a shar file containing the single precision FORTRAN routines and a README file. The
versions of the templates that are available are listed in the table below:
shar filename        contents
sctemplates.shar     Single precision C routines
dctemplates.shar     Double precision C routines
sftemplates.shar     Single precision Fortran 77 routines
dftemplates.shar     Double precision Fortran 77 routines
mltemplates.shar     MATLAB routines
Save the mail message to a file called templates.shar. Edit the mail header from this file
and delete the lines down to and including << Cut Here >>. In the directory containing the
shar file, type
sh templates.shar
No subdirectory will be created. The unpacked files will stay in the current directory. Each
method is written as a separate subroutine in its own file, named after the method (e.g., CG.f,
BiCGSTAB.f, GMRES.f). The argument parameters are the same for each, with the exception of
the required matrix-vector products and preconditioner solvers (some require the transpose matrix).
Also, the amount of workspace needed varies. The details are documented in each routine.
Note that the vector-vector operations are accomplished using the BLAS [143] (many manu-
facturers have assembly coded these kernels for maximum performance), although a mask file is
provided to link to user defined routines.
The README file gives more details, along with instructions for a test routine.
Appendix B
Overview of the BLAS
The BLAS give us a standardized set of basic codes for performing operations on vectors and ma-
trices. BLAS take advantage of the Fortran storage structure and the structure of the mathematical
system wherever possible. Additionally, many computers have the BLAS library optimized to their
system. Here we use five routines:
1. SCOPY: copies a vector onto another vector
2. SAXPY: adds vector x (multiplied by a scalar) to vector y
3. SGEMV: general matrix vector product
4. STRMV: matrix vector product when the matrix is triangular
5. STRSV: solves Tx = b for triangular matrix T
The prefix “S” denotes single precision. This prefix may be changed to “D”, “C”, or “Z”, giving
the routine double, complex, or double complex precision. (Of course, the declarations would also
have to be changed.) It is important to note that putting double precision into single variables
works, but single into double will cause errors.
If we define a_{i,j} = a(i,j) and x_i = x(i), we can see what the code is doing:
• ALPHA = SDOT( N, X, 1, Y, 1 ) computes the inner product of two vectors x and
y, putting the result in scalar α.
The corresponding Fortran segment is

      ALPHA = 0.0
      DO I = 1, N
         ALPHA = ALPHA + X(I)*Y(I)
      ENDDO
• CALL SAXPY( N, ALPHA, X, 1, Y, 1 ) multiplies a vector x of length n by the scalar
α, then adds the result to the vector y, putting the result in y.
The corresponding Fortran segment is

      DO I = 1, N
         Y(I) = ALPHA*X(I) + Y(I)
      ENDDO
• CALL SGEMV( 'N', M, N, ONE, A, LDA, X, 1, ONE, B, 1 ) computes
the matrix-vector product plus vector Ax + b, putting the resulting vector in b.
The corresponding Fortran segment:

      DO J = 1, N
         DO I = 1, M
            B(I) = A(I,J)*X(J) + B(I)
         ENDDO
      ENDDO

This illustrates a feature of the BLAS that often requires close attention. For example, we
will use this routine to compute the residual vector b − Ax̂, where x̂ is our current approxi-
mation to the solution x (merely change the fourth argument to -1.0E0). Vector b will be
overwritten with the residual vector; thus, if we need it later, we will first copy it to temporary
storage.
• CALL STRMV( 'U', 'N', 'N', N, A, LDA, X, 1 ) computes the matrix-
vector product Ax, putting the resulting vector in x, for upper triangular matrix A.
The corresponding Fortran segment is

      DO J = 1, N
         TEMP = X(J)
         X(J) = TEMP*A(J,J)
         DO I = 1, J-1
            X(I) = X(I) + TEMP*A(I,J)
         ENDDO
      ENDDO
Note that the parameters in single quotes are for descriptions such as ’U’ for ‘UPPER TRI-
ANGULAR’, ’N’ for ‘No Transpose’. This feature will be used extensively, resulting in storage
savings (among other advantages).
The variable LDA is critical for addressing the array correctly. LDA is the leading dimension
of the two-dimensional array A, that is, LDA is the declared (or allocated) number of rows of the
two-dimensional array A.
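As a small illustration (in C rather than Fortran, and not part of the BLAS themselves), the address computation implied by LDA for a column-major array is:

      /* Illustration only: fetch element a(i,j) (1-based indices, as in Fortran)
         of a matrix stored column by column in an array whose declared number
         of rows -- the leading dimension -- is lda.                            */
      double get_aij(const double *a, int lda, int i, int j)
      {
          return a[(i - 1) + (j - 1) * lda];   /* successive columns are lda apart */
      }

Passing the wrong value of LDA therefore shifts every column after the first, a common source of errors when a BLAS routine is applied to a submatrix of a larger array.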
Appendix C
Glossary
Adaptive methods Iterative methods that collect information about the coefficient matrix during
the iteration process, and use this to speed up convergence.
Backward error The size of perturbations δA of the coefficient matrix and δb of the right hand
side of a linear system Ax = b, such that the computed iterate x^{(i)} is the solution of (A + δA)x^{(i)} = b + δb.
Band matrix A matrix A for which there are nonnegative constants p, q such that a_{i,j} = 0 if j < i − p or j > i + q. The two constants p, q are called the left and right halfbandwidth, respectively.
Black box A piece of software that can be used without knowledge of its inner workings; the user
supplies the input, and the output is assumed to be correct.
BLAS Basic Linear Algebra Subprograms; a set of commonly occurring vector and matrix oper-
ations for dense linear algebra. Optimized (usually assembly coded) implementations of the
BLAS exist for various computers; these will give a higher performance than implementation
in high level programming languages.
Block factorization See: Block matrix operations.
Block matrix operations Matrix operations expressed in terms of submatrices.
Breakdown The occurrence of a zero divisor in an iterative method.
Choleski decomposition Expressing a symmetric matrix A as a product of a lower triangular matrix L and its transpose L^T, that is, A = LL^T.
Condition number See: Spectral condition number.
Convergence Whether, and the rate at which, an iterative method approaches the solution of a linear system. Convergence can be
• Linear: some measure of the distance to the solution decreases by a constant factor in
each iteration.
• Superlinear: the measure of the error decreases by a growing factor.
• Smooth: the measure of the error decreases in all or most iterations, though not neces-
sarily by the same factor.
• Irregular: the measure of the error decreases in some iterations and increases in others.
This observation unfortunately does not imply anything about the ultimate convergence
of the method.
• Stalled: the measure of the error stays more or less constant during a number of iter-
ations. As above, this does not imply anything about the ultimate convergence of the
method.
Dense matrix Matrix for which the number of zero elements is too small to warrant specialized
algorithms to exploit these zeros.
Diagonally dominant matrix See: Matrix properties
Direct method An algorithm that produces the solution to a system of linear equations in a number
of operations that is determined a priori by the size of the system. In exact arithmetic, a direct
method yields the true solution to the system. See: Iterative method.
Distributed memory See: Parallel computer.
Divergence An iterative method is said to diverge if it does not converge in a reasonable number
of iterations, or if some measure of the error grows unacceptably. However, growth of the
error as such is no sign of divergence: a method with irregular convergence behavior may
ultimately converge, even though the error grows during some iterations.
Domain decomposition method Solution method for linear systems based on a partitioning of
the physical domain of the differential equation. Domain decomposition methods typically
involve (repeated) independent system solution on the subdomains, and some way of com-
bining data from the subdomains on the separator part of the domain.
Field of values Given a matrix A, the field of values is the set {x^T Ax : x^T x = 1}. For symmetric matrices this is the range [λ_min(A), λ_max(A)].
Fill A position that is zero in the original matrix A but not in an exact factorization of A. In an
incomplete factorization, some fill elements are discarded.
Forward error The difference between a computed iterate and the true solution of a linear system,
measured in some vector norm.
Halfbandwidth See: Band matrix.
Ill-conditioned system A linear system for which the coefficient matrix has a large condition num-
ber. Since in many applications the condition number is proportional to (some power of) the
number of unknowns, this should be understood as the constant of proportionality being
large.
Incomplete factorization A factorization obtained by discarding certain elements during the fac-
torization process (‘modified’ and ‘relaxed’ incomplete factorization compensate in some
way for discarded elements). Thus an incomplete LU factorization of a matrix A will in
general satisfy A ≠ LU; however, one hopes that the factorization LU will be close enough
to A to function as a preconditioner in an iterative method.
Iterate Approximation to the solution of a linear system in any iteration of an iterative method.
Iterative method An algorithm that produces a sequence of approximations to the solution of a
linear system of equations; the length of the sequence is not given a priori by the size of the
system. Usually, the longer one iterates, the closer one is able to get to the true solution. See:
Direct method.
Krylov sequence For a given matrix A and vector x, the sequence of vectors {A^i x}_{i≥0}, or a finite initial part of this sequence.
Krylov subspace The subspace spanned by a Krylov sequence.
LAPACK A mathematical library of linear algebra routines for the solution of dense systems and eigen-
value calculations.
Lower triangular matrix Matrix A for which a_{i,j} = 0 if j > i.
LQ factorization A way of writing a matrix A as a product of a lower triangular matrix L and a
unitary matrix Q, that is, A = LQ.
LU factorization / LU decomposition Expressing a matrix A as a product of a lower triangular
matrix L and an upper triangular matrix U, that is, A = LU.
M-Matrix See: Matrix properties.
Matrix norms See: Norms.
Matrix properties We call a square matrix A
Symmetric if a_{i,j} = a_{j,i} for all i, j.
Positive definite if it satisfies x^T Ax > 0 for all nonzero vectors x.
Diagonally dominant if a_{i,i} > \sum_{j \neq i} |a_{i,j}|; the excess amount \min_i \{ a_{i,i} - \sum_{j \neq i} |a_{i,j}| \} is called the diagonal dominance of the matrix.
An M-matrix if a_{i,j} ≤ 0 for i ≠ j, and it is nonsingular with (A^{-1})_{i,j} ≥ 0 for all i, j.
Message passing See: Parallel computer.
Multigrid method Solution method for linear systems based on restricting and extrapolating so-
lutions between a series of nested grids.
Modified incomplete factorization See: Incomplete factorization.
Mutually consistent norms See: Norms.
Natural ordering See: Ordering of unknowns.
Nonstationary iterative method Iterative method that has iteration-dependent coefficients.
Normal equations For a nonsymmetric or indefinite (but nonsingular) system of equations Ax = b, either of the related symmetric systems (A^T Ax = A^T b) and (AA^T y = b; x = A^T y). For complex A, A^T is replaced with A^H in the above expressions.
Norms A function f : R^n → R is called a vector norm if
• f(x) ≥ 0 for all x, and f(x) = 0 only if x = 0.
• f(αx) = |α| f(x) for all α, x.
• f(x + y) ≤ f(x) + f(y) for all x, y.
The same properties hold for matrix norms. A matrix norm and a vector norm (both denoted ‖·‖) are called a mutually consistent pair if for all matrices A and vectors x
‖Ax‖ ≤ ‖A‖ ‖x‖.
Ordering of unknowns For linear systems derived from a partial differential equation, each un-
known corresponds to a node in the discretization mesh. Different orderings of the unknowns
correspond to permutations of the coefficient matrix. The convergence speed of iterative
methods may depend on the ordering used, and often the parallel efficiency of a method on a
parallel computer is strongly dependent on the ordering used. Some common orderings for
rectangular domains are:
• The natural ordering; this is the consecutive numbering by rows and columns.
• The red/black ordering; this is the numbering where all nodes with coordinates (i, j)
for which i +j is odd are numbered before those for which i +j is even.
• The ordering by diagonals; this is the ordering where nodes are grouped in levels for
which i + j is constant. All nodes in one level are numbered before the nodes in the
next level.
For matrices from problems on less regular domains, some common orderings are:
• The Cuthill-McKee ordering; this starts from one point, then numbers its neighbors,
and continues numbering points that are neighbors of already numbered points. The Re-
verse Cuthill-McKee ordering then reverses the numbering; this may reduce the amount
of fill in a factorization of the matrix.
• The Minimum Degree ordering; this orders the matrix rows by increasing numbers of
nonzeros.
Parallel computer Computer with multiple independent processing units. If the processors have
immediate access to the same memory, the memory is said to be shared; if processors have
private memory that is not immediately visible to other processors, the memory is said to be
distributed. In that case, processors communicate by message-passing.
Pipelining See: Vector computer.
Positive definite matrix See: Matrix properties.
Preconditioner An auxiliary matrix in an iterative method that approximates in some sense the
coefficient matrix or its inverse. The preconditioner, or preconditioning matrix, is applied in
every step of the iterative method.
Red/black ordering See: Ordering of unknowns.
Reduced system Linear system obtained by eliminating certain variables from another linear sys-
tem. Although the number of variables is smaller than for the original system, the matrix of
a reduced system generally has more nonzero entries. If the original matrix was symmetric
and positive definite, then the reduced system has a smaller condition number.
Relaxed incomplete factorization See: Incomplete factorization.
Residual If an iterative method is employed to solve for x in a linear system Ax = b, then the
residual corresponding to a vector y is Ay −b.
Search direction Vector that is used to update an iterate.
Shared memory See: Parallel computer.
Simultaneous displacements, method of Jacobi method.
Sparse matrix Matrix for which the number of zero elements is large enough that algorithms
avoiding operations on zero elements pay off. Matrices derived from partial differential
equations typically have a number of nonzero elements that is proportional to the matrix
size, while the total number of matrix elements is the square of the matrix size.
Spectral condition number The product
\[
\|A\|_2 \, \|A^{-1}\|_2 = \frac{\lambda_{\max}^{1/2}(A^T A)}{\lambda_{\min}^{1/2}(A^T A)},
\]
where λ_max and λ_min denote the largest and smallest eigenvalues, respectively. For linear
systems derived from partial differential equations in 2D, the condition number is propor-
tional to the number of unknowns.
Spectral radius The spectral radius of a matrix A is max{|λ(A)|}.
Spectrum The set of all eigenvalues of a matrix.
Stationary iterative method Iterative method that performs in each iteration the same operations
on the current iteration vectors.
Stopping criterion Since an iterative method computes successive approximations to the solution
of a linear system, a practical test is needed to determine when to stop the iteration. Ideally
this test would measure the distance of the last iterate to the true solution, but this is not
possible. Instead, various other metrics are used, typically involving the residual.
Storage scheme The way elements of a matrix are stored in the memory of a computer. For dense
matrices, this can be the decision to store rows or columns consecutively. For sparse matrices,
common storage schemes avoid storing zero elements; as a result they involve indices, stored
as integer data, that indicate where the stored elements fit into the global matrix.
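For example, one widely used scheme for sparse matrices is compressed row storage (CRS), which keeps only the nonzero values of each row together with their column indices. The C sketch below is an illustration of the idea only, not the exact format used by the template codes.

      /* Sketch of compressed row storage (CRS) for an n-by-n sparse matrix:
         val[] holds the nonzero entries row by row, col[] their column
         indices, and ptr[i] .. ptr[i+1]-1 are the positions of row i.      */
      typedef struct {
          int     n;      /* matrix dimension              */
          int    *ptr;    /* n+1 row pointers              */
          int    *col;    /* column index of each nonzero  */
          double *val;    /* value of each nonzero         */
      } crs_matrix;

      /* y = A x for a matrix stored in CRS format */
      void crs_matvec(const crs_matrix *A, const double *x, double *y)
      {
          for (int i = 0; i < A->n; i++) {
              double sum = 0.0;
              for (int k = A->ptr[i]; k < A->ptr[i + 1]; k++)
                  sum += A->val[k] * x[A->col[k]];
              y[i] = sum;
          }
      }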
Successive displacements, method of Gauss-Seidel method.
Symmetric matrix See: Matrix properties.
Template Description of an algorithm, abstracting away from implementational details.
Tune Adapt software for a specific application and computing environment in order to obtain better performance in that case only.
Upper triangular matrix Matrix A for which a
i,j
= 0 if j < i.
Vector computer Computer that is able to process consecutive identical operations (typically ad-
ditions or multiplications) several times faster than intermixed operations of different types.
Processing identical operations this way is called ‘pipelining’ the operations.
Vector norms See: Norms.
C.1 Notation
In this section, we present some of the notation we use throughout the book. We have tried to use
standard notation that would be found in any current publication on the subjects covered.
Throughout, we follow several conventions:
• Matrices are denoted by capital letters.
• Vectors are denoted by lowercase letters.
• Lowercase Greek letters usually denote scalars.
• Matrix elements are denoted by doubly indexed lowercase letters.
• Matrix subblocks are denoted by doubly indexed uppercase letters.
We define matrix A of dimension m × n and block dimension m′ × n′ as follows:
\[
A = \begin{pmatrix} a_{1,1} & \cdots & a_{1,n} \\ \vdots & & \vdots \\ a_{m,1} & \cdots & a_{m,n} \end{pmatrix} \quad (a_{i,j} \in \mathbb{R}),
\qquad
A = \begin{pmatrix} A_{1,1} & \cdots & A_{1,n'} \\ \vdots & & \vdots \\ A_{m',1} & \cdots & A_{m',n'} \end{pmatrix} \quad (\text{each } A_{i,j} \text{ a real submatrix}).
\]
We define vector x of dimension n as follows:
\[
x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}, \quad x_i \in \mathbb{R}.
\]
Other notation is as follows:
• I^{n×n} (or simply I if the size is clear from the context) denotes the identity matrix.
• A = diag(a_{i,i}) denotes that matrix A has elements a_{i,i} on its diagonal, and zeros everywhere else.
• x_i^{(k)} denotes the ith element of vector x during the kth iteration.
Bibliography
[1] J. AARDEN AND K.-E. KARLSSON, Preconditioned CG-type methods for solving the cou-
pled systems of fundamental semiconductor equations, BIT, 29 (1989), pp. 916–937.
[2] L. ADAMS AND H. JORDAN, Is SOR color-blind?, SIAM J. Sci. Statist. Comput., 7 (1986),
pp. 490–506.
[3] E. ANDERSON, ET. AL., LAPACK Users Guide, SIAM, Philadelphia, 1992.
[4] J. APPLEYARD AND I. CHESHIRE, Nested factorization, in Reservoir Simulation Sympo-
sium of the SPE, 1983. Paper 12264.
[5] M. ARIOLI, J. DEMMEL, AND I. DUFF, Solving sparse linear systems with sparse backward
error, SIAM J. Matrix Anal. Appl., 10 (1989), pp. 165–190.
[6] W. ARNOLDI, The principle of minimized iterations in the solution of the matrix eigenvalue
problem, Quart. Appl. Math., 9 (1951), pp. 17–29.
[7] S. ASHBY, CHEBYCODE: A Fortran implementation of Manteuffel’s adaptive Chebyshev
algorithm, Tech. Rep. UIUCDCS-R-85-1203, University of Illinois, 1985.
[8] S. ASHBY, T. MANTEUFFEL, AND J. OTTO, A comparison of adaptive Chebyshev and least
squares polynomial preconditioning for Hermitian positive definite linear systems, SIAM J.
Sci. Statist. Comput., 13 (1992), pp. 1–29.
[9] S. ASHBY, T. MANTEUFFEL, AND P. SAYLOR, Adaptive polynomial preconditioning for
Hermitian indefinite linear systems, BIT, 29 (1989), pp. 583–609.
[10] S. F. ASHBY, T. A. MANTEUFFEL, AND P. E. SAYLOR, A taxonomy for conjugate gradient
methods, SIAM J. Numer. Anal., 27 (1990), pp. 1542–1568.
[11] C. ASHCRAFT AND R. GRIMES, On vectorizing incomplete factorizations and SSOR pre-
conditioners, SIAM J. Sci. Statist. Comput., 9 (1988), pp. 122–151.
[12] O. AXELSSON, Incomplete block matrix factorization preconditioning methods. The ulti-
mate answer?, J. Comput. Appl. Math., 12&13 (1985), pp. 3–18.
[13] , A general incomplete block-matrix factorization method, Linear Algebra Appl., 74
(1986), pp. 179–190.
[14] O. AXELSSON AND A. BARKER, Finite element solution of boundary value problems. The-
ory and computation, Academic Press, Orlando, Fl., 1984.
[15] O. AXELSSON AND V. EIJKHOUT, Vectorizable preconditioners for elliptic difference equa-
tions in three space dimensions, J. Comput. Appl. Math., 27 (1989), pp. 299–321.
[16] , The nested recursive two-level factorization method for nine-point difference matri-
ces, SIAM J. Sci. Statist. Comput., 12 (1991), pp. 1373–1400.
[17] O. AXELSSON AND I. GUSTAFSSON, Iterative solution for the solution of the Navier equa-
tions of elasticity, Comput. Methods Appl. Mech. Engrg., 15 (1978), pp. 241–258.
[18] O. AXELSSON AND G. LINDSKOG, On the eigenvalue distribution of a class of precondi-
tioning matrices, Numer. Math., 48 (1986), pp. 479–498.
[19] , On the rate of convergence of the preconditioned conjugate gradient method, Numer.
Math., 48 (1986), pp. 499–523.
[20] O. AXELSSON AND N. MUNKSGAARD, Analysis of incomplete factorizations with fixed
storage allocation, in Preconditioning Methods – Theory and Applications, D. Evans, ed.,
Gordon and Breach, New York, 1983, pp. 265–293.
[21] O. AXELSSON AND B. POLMAN, On approximate factorization methods for block-matrices
suitable for vector and parallel processors, Linear Algebra Appl., 77 (1986), pp. 3–26.
[22] O. AXELSSON AND P. VASSILEVSKI, Algebraic multilevel preconditioning methods, I, Nu-
mer. Math., 56 (1989), pp. 157–177.
[23] , Algebraic multilevel preconditioning methods, II, SIAM J. Numer. Anal., 57 (1990),
pp. 1569–1590.
[24] O. AXELSSON AND P. S. VASSILEVSKI, A black box generalized conjugate gradient solver
with inner iterations and variable-step preconditioning, SIAM J. Matrix Anal. Appl., 12
(1991), pp. 625–644.
[25] R. BANK, Marching algorithms for elliptic boundary value problems; II: The variable co-
efficient case, SIAM J. Numer. Anal., 14 (1977), pp. 950–970.
[26] R. BANK, T. CHAN, W. COUGHRAN JR., AND R. SMITH, The Alternate-Block-
Factorization procedure for systems of partial differential equations, BIT, 29 (1989),
pp. 938–954.
[27] R. BANK AND D. ROSE, Marching algorithms for elliptic boundary value problems. I: The
constant coefficient case, SIAM J. Numer. Anal., 14 (1977), pp. 792–829.
[28] R. E. BANK AND T. F. CHAN, A composite step bi-conjugate gradient algorithm for non-
symmetric linear systems, Tech. Rep. CAM 92-, UCLA, Dept. of Math., Los Angeles, CA
90024-1555, 1992.
[29] R. E. BANK AND T. F. CHAN, An analysis of the composite step Biconjugate gradient
method, Numerische Mathematik, 66 (1993), pp. 295–319.
[30] G. BAUDET, Asynchronous iterative methods for multiprocessors, J. Assoc. Comput. Mach.,
25 (1978), pp. 226–244.
[31] R. BEAUWENS, On Axelsson’s perturbations, Linear Algebra Appl., 68 (1985), pp. 221–
242.
[32] , Approximate factorizations with S/P consistently ordered M-factors, BIT, 29 (1989),
pp. 658–681.
[33] R. BEAUWENS AND L. QUENON, Existence criteria for partial matrix factorizations in
iterative methods, SIAM J. Numer. Anal., 13 (1976), pp. 615–643.
[34] A. BJÖRCK AND T. ELFVING, Accelerated projection methods for computing pseudo-
inverse solutions of systems of linear equations, BIT, 19 (1979), pp. 145–163.
[35] D. BRAESS, The contraction number of a multigrid method for solving the Poisson equation,
Numer. Math., 37 (1981), pp. 387–404.
[36] J. H. BRAMBLE, J. E. PASCIAK, AND A. H. SCHATZ, The construction of preconditioners
for elliptic problems by substructuring, I, Mathematics of Computation, 47 (1986), pp. 103–
134.
[37] J. H. BRAMBLE, J. E. PASCIAK, J. WANG, AND J. XU, Convergence estimates for product
iterative methods with applications to domain decompositions and multigrid, Math. Comp.,
to appear.
[38] R. BRAMLEY AND A. SAMEH, Row projection methods for large nonsymmetric linear sys-
tems, SIAM J. Sci. Statist. Comput., 13 (1992), pp. 168–193.
[39] C. BREZINSKI AND H. SADOK, Avoiding breakdown in the CGS algorithm, Numer. Alg.,
1 (1991), pp. 199–206.
[40] C. BREZINSKI, M. ZAGLIA, AND H. SADOK, Avoiding breakdown and near breakdown in
Lanczos type algorithms, Numer. Alg., 1 (1991), pp. 261–284.
[41] , A breakdown free Lanczos type algorithm for solving linear systems, Numer. Math.,
63 (1992), pp. 29–38.
[42] W. BRIGGS, A Multigrid Tutorial, SIAM, Philadelphia, 1987.
[43] X.-C. CAI AND O. WIDLUND, Multiplicative Schwarz algorithms for some nonsymmetric
and indefinite problems, SIAM J. Numer. Anal., 30 (1993), pp. 936–952.
[44] T. CHAN, Fourier analysis of relaxed incomplete factorization preconditioners, SIAM J.
Sci. Statist. Comput., 12 (1991), pp. 668–680.
[45] T. CHAN, L. DE PILLIS, AND H. VAN DER VORST, A transpose-free squared Lanczos
algorithm and application to solving nonsymmetric linear systems, Tech. Rep. CAM 91-17,
UCLA, Dept. of Math., Los Angeles, CA 90024-1555, 1991.
[46] T. CHAN, E. GALLOPOULOS, V. SIMONCINI, T. SZETO, AND C. TONG, A quasi-minimal
residual variant of the Bi-CGSTAB algorithm for nonsymmetric systems, Tech. Rep. CAM
92-26, UCLA, Dept. of Math., Los Angeles, CA 90024-1555, 1992. SIAM J. Sci. Comput.,
to appear.
[47] T. CHAN, R. GLOWINSKI, J. PÉRIAUX, AND O. WIDLUND, eds., Domain Decomposition
Methods, Philadelphia, 1989, SIAM. Proceedings of the Second International Symposium
on Domain Decomposition Methods, Los Angeles, CA, January 14 - 16, 1988.
[48] , eds., Domain Decomposition Methods, Philadelphia, 1990, SIAM. Proceedings of the
Third International Symposium on Domain Decomposition Methods, Houston, TX, 1989.
[49] , eds., Domain Decomposition Methods, SIAM, Philadelphia, 1991. Proceedings of
the Fourth International Symposium on Domain Decomposition Methods, Moscow, USSR,
1990.
[50] T. CHAN AND C.-C. J. KUO, Two-color Fourier analysis of iterative algorithms for elliptic
problems with red/black ordering, SIAM J. Sci. Statist. Comput., 11 (1990), pp. 767–793.
[51] T. F. CHAN, T. P. MATHEW, AND J. P. SHAO, Efficient variants of the vertex space domain
decomposition algorithm, Tech. Rep. CAM 92-07, UCLA, Dept. of Math., Los Angeles, CA
90024-1555, 1992. SIAM J. Sci. Comput., to appear.
[52] T. F. CHAN AND J. SHAO, On the choice of coarse grid size in domain decomposition
methods, tech. rep., UCLA, Dept. of Math., Los Angeles, CA 90024-1555, 1993. to appear.
[53] D. CHAZAN AND W. MIRANKER, Chaotic relaxation, Linear Algebra Appl., 2 (1969),
pp. 199–222.
[54] A. CHRONOPOULOS AND C. GEAR, s-step iterative methods for symmetric linear systems,
J. Comput. Appl. Math., 25 (1989), pp. 153–168.
[55] P. CONCUS AND G. GOLUB, A generalized conjugate gradient method for nonsymmet-
ric systems of linear equations, in Computer methods in Applied Sciences and Engineering,
Second International Symposium, Dec 15–19, 1975; Lecture Notes in Economics and Math-
ematical Systems, Vol. 134, Berlin, New York, 1976, Springer-Verlag.
[56] P. CONCUS, G. GOLUB, AND G. MEURANT, Block preconditioning for the conjugate gra-
dient method, SIAM J. Sci. Statist. Comput., 6 (1985), pp. 220–252.
[57] P. CONCUS, G. GOLUB, AND D. O’LEARY, A generalized conjugate gradient method for
the numerical solution of elliptic partial differential equations, in Sparse Matrix Computa-
tions, J. Bunch and D. Rose, eds., Academic Press, New York, 1976, pp. 309–332.
[58] P. CONCUS AND G. H. GOLUB, Use of fast direct methods for the efficient numerical solu-
tion of nonseparable elliptic equations, SIAM J. Numer. Anal., 10 (1973), pp. 1103–1120.
[59] E. CUTHILL AND J. MCKEE, Reducing the bandwidth of sparse symmetric matrices, in
ACM Proceedings of the 24th National Conference, 1969.
[60] E. D’AZEVEDO, V. EIJKHOUT, AND C. ROMINE, LAPACK working note 56: Reducing
communication costs in the conjugate gradient algorithm on distributed memory multipro-
cessor, tech. report, Computer Science Department, University of Tennessee, Knoxville,
TN, 1993.
[61] E. D’AZEVEDO AND C. ROMINE, Reducing communication costs in the conjugate gradient
algorithm on distributed memory multiprocessors, Tech. Rep. ORNL/TM-12192, Oak Ridge
National Lab, Oak Ridge, TN, 1992.
[62] E. DE STURLER, A parallel restructured version of GMRES(m), Tech. Rep. 91-85, Delft
University of Technology, Delft, The Netherlands, 1991.
[63] E. DE STURLER AND D. R. FOKKEMA, Nested Krylov methods and preserving the orthog-
onality, Tech. Rep. Preprint 796, Utrecht University, Utrecht, The Netherlands, 1993.
[64] S. DEMKO, W. MOSS, AND P. SMITH, Decay rates for inverses of band matrices, Mathe-
matics of Computation, 43 (1984), pp. 491–499.
[65] J. DEMMEL, The condition number of equivalence transformations that block diagonalize
matrix pencils, SIAM J. Numer. Anal., 20 (1983), pp. 599–610.
[66] J. DEMMEL, M. HEATH, AND H. VAN DER VORST, Parallel linear algebra, in Acta Nu-
merica, Vol. 2, Cambridge Press, New York, 1993.
[67] S. DOI, On parallelism and convergence of incomplete LU factorizations, Appl. Numer.
Math., 7 (1991), pp. 417–436.
[68] J. DONGARRA, J. DUCROZ, I. DUFF, AND S. HAMMARLING, A set of level 3 Basic Linear
Algebra Subprograms, ACM Trans. Math. Soft., 16 (1990), pp. 1–17.
[69] J. DONGARRA, J. DUCROZ, S. HAMMARLING, AND R. HANSON, An extended set of
FORTRAN Basic Linear Algebra Subprograms, ACM Trans. Math. Soft., 14 (1988), pp. 1–
32.
[70] J. DONGARRA, I. DUFF, D. SORENSEN, AND H. VAN DER VORST, Solving Linear Systems
on Vector and Shared Memory Computers, SIAM, Philadelphia, PA, 1991.
[71] J. DONGARRA AND E. GROSSE, Distribution of mathematical software via electronic mail,
Comm. ACM, 30 (1987), pp. 403–407.
[72] J. DONGARRA, C. MOLER, J. BUNCH, AND G. STEWART, LINPACK Users’ Guide, SIAM,
Philadelphia, 1979.
[73] J. DONGARRA AND H. VAN DER VORST, Performance of various computers using stan-
dard sparse linear equations solving techniques, in Computer Benchmarks, J. Dongarra and
W. Gentzsch, eds., Elsevier Science Publishers B.V., New York, 1993, pp. 177–188.
[74] F. DORR, The direct solution of the discrete Poisson equation on a rectangle, SIAM Rev.,
12 (1970), pp. 248–263.
[75] M. DRYJA AND O. B. WIDLUND, Towards a unified theory of domain decomposition algo-
rithms for elliptic problems, Tech. Rep. 486, also Ultracomputer Note 167, Department of
Computer Science, Courant Institute, 1989.
[76] D. DUBOIS, A. GREENBAUM, AND G. RODRIGUE, Approximating the inverse of a matrix
for use in iterative algorithms on vector processors, Computing, 22 (1979), pp. 257–268.
[77] I. DUFF, R. GRIMES, AND J. LEWIS, Sparse matrix test problems, ACM Trans. Math. Soft.,
15 (1989), pp. 1–14.
[78] I. DUFF AND G. MEURANT, The effect of ordering on preconditioned conjugate gradients,
BIT, 29 (1989), pp. 635–657.
[79] I. S. DUFF, A. M. ERISMAN, AND J. K. REID, Direct methods for sparse matrices, Oxford
University Press, London, 1986.
[80] T. DUPONT, R. KENDALL, AND H. RACHFORD, An approximate factorization proce-
dure for solving self-adjoint elliptic difference equations, SIAM J. Numer. Anal., 5 (1968),
pp. 559–573.
[81] E. D’YAKONOV, The method of variable directions in solving systems of finite difference
equations, Soviet Math. Dokl., 2 (1961), pp. 577–580. TOM 138, 271–274.
[82] L. EHRLICH, An Ad-Hoc SOR method, J. Comput. Phys., 43 (1981), pp. 31–45.
[83] M. EIERMANN AND R. VARGA, Is the optimal ω best for the SOR iteration method?, Linear
Algebra Appl., 182 (1993), pp. 257–277.
[84] V. EIJKHOUT, Analysis of parallel incomplete point factorizations, Linear Algebra Appl.,
154–156 (1991), pp. 723–740.
[85] , Beware of unperturbed modified incomplete point factorizations, in Proceedings of
the IMACS International Symposium on Iterative Methods in Linear Algebra, Brussels, Bel-
gium, R. Beauwens and P. de Groen, eds., 1992.
[86] , LAPACK working note 50: Distributed sparse data structures for linear algebra
operations, Tech. Rep. CS 92-169, Computer Science Department, University of Tennessee,
Knoxville, TN, 1992.
[87] , LAPACK working note 51: Qualitative properties of the conjugate gradient and Lanc-
zos methods in a matrix framework, Tech. Rep. CS 92-170, Computer Science Department,
University of Tennessee, Knoxville, TN, 1992.
[88] V. EIJKHOUT AND B. POLMAN, Decay rates of inverses of banded M-matrices that are
near to Toeplitz matrices, Linear Algebra Appl., 109 (1988), pp. 247–277.
[89] V. EIJKHOUT AND P. VASSILEVSKI, Positive definiteness aspects of vectorizable precondi-
tioners, Parallel Computing, 10 (1989), pp. 93–100.
[90] S. EISENSTAT, Efficient implementation of a class of preconditioned conjugate gradient
methods, SIAM J. Sci. Statist. Comput., 2 (1981), pp. 1–4.
[91] R. ELKIN, Convergence theorems for Gauss-Seidel and other minimization algorithms,
Tech. Rep. 68-59, Computer Science Center, University of Maryland, College Park, MD,
Jan. 1968.
[92] H. ELMAN, Approximate Schur complement preconditioners on serial and parallel comput-
ers, SIAM J. Sci. Statist. Comput., 10 (1989), pp. 581–605.
[93] H. ELMAN AND M. SCHULTZ, Preconditioning by fast direct methods for non self-adjoint
nonseparable elliptic equations, SIAM J. Numer. Anal., 23 (1986), pp. 44–57.
[94] L. ELSNER, A note on optimal block-scaling of matrices, Numer. Math., 44 (1984), pp. 127–
128.
[95] V. FABER AND T. MANTEUFFEL, Necessary and sufficient conditions for the existence of a
conjugate gradient method, SIAM J. Numer. Anal., 21 (1984), pp. 315–339.
[96] G. FAIRWEATHER, A. GOURLAY, AND A. MITCHELL, Some high accuracy difference
schemes with a splitting operator for equations of parabolic and elliptic type, Numer. Math.,
10 (1967), pp. 56–66.
[97] R. FLETCHER, Conjugate gradient methods for indefinite systems, in Numerical Analysis
Dundee 1975, G. Watson, ed., Berlin, New York, 1976, Springer Verlag, pp. 73–89.
[98] G. FORSYTHE AND E. STRAUSS, On best conditioned matrices, Proc. Amer. Math. Soc., 6
(1955), pp. 340–345.
[99] R. FREUND, Conjugate gradient-type methods for linear systems with complex symmetric
coefficient matrices, SIAM J. Sci. Statist. Comput., 13 (1992), pp. 425–448.
[100] R. FREUND, M. GUTKNECHT, AND N. NACHTIGAL, An implementation of the look-ahead
Lanczos algorithm for non-Hermitian matrices, SIAM J. Sci. Comput., 14 (1993), pp. 137–
158.
[101] R. FREUND AND N. NACHTIGAL, QMR: A quasi-minimal residual method for non-
Hermitian linear systems, Numer. Math., 60 (1991), pp. 315–339.
[102] , An implementation of the QMR method based on coupled two-term recurrences, Tech.
Rep. 92.15, RIACS, NASA Ames, Ames, CA, 1992.
[103] R. FREUND AND T. SZETO, A quasi-minimal residual squared algorithm for non-Hermitian
linear systems, tech. rep., RIACS, NASA Ames, Ames, CA, 1991.
[104] R. W. FREUND, A transpose-free quasi-minimum residual algorithm for non-Hermitian lin-
ear systems, SIAM J. Sci. Comput., 14 (1993), pp. 470–482.
[105] R. W. FREUND, G. H. GOLUB, AND N. M. NACHTIGAL, Iterative solution of linear sys-
tems, Acta Numerica, (1992), pp. 57–100.
[106] R. GLOWINSKI, G. H. GOLUB, G. A. MEURANT, AND J. PÉRIAUX, eds., Domain Decom-
position Methods for Partial Differential Equations, SIAM, Philadelphia, 1988. Proceedings
of the First International Symposium on Domain Decomposition Methods for Partial Differ-
ential Equations, Paris, France, January 1987.
[107] G. GOLUB AND D. O’LEARY, Some history of the conjugate gradient and Lanczos meth-
ods, SIAM Rev., 31 (1989), pp. 50–102.
[108] G. GOLUB AND C. VAN LOAN, Matrix Computations, second edition, The Johns Hopkins
University Press, Baltimore, 1989.
[109] A. GREENBAUM AND Z. STRAKOS, Predicting the behavior of finite precision Lanczos and
conjugate gradient computations, SIAM J. Mat. Anal. Appl., 13 (1992), pp. 121–137.
[110] W. D. GROPP AND D. E. KEYES, Domain decomposition with local mesh refinement, SIAM
J. Sci. Statist. Comput., 13 (1992), pp. 967–993.
[111] I. GUSTAFSSON, A class of first-order factorization methods, BIT, 18 (1978), pp. 142–156.
[112] M. H. GUTKNECHT, A completed theory of the unsymmetric Lanczos process and related
algorithms, part II, Tech. Rep. 90-16, IPS Research Report, ETH Zürich, Switzerland, 1990.
[113] , The unsymmetric Lanczos algorithms and their relations to Padé approximation, con-
tinued fractions and the QD algorithm, in Proceedings of the Copper Mountain Conference
on Iterative Methods, 1990.
[114] , Variants of Bi-CGSTAB for matrices with complex spectrum, Tech. Rep. 91-14, IPS
ETH, Zürich, Switzerland, 1991.
[115] , A completed theory of the unsymmetric Lanczos process and related algorithms, part
I, SIAM J. Matrix Anal. Appl., 13 (1992), pp. 594–639.
[116] W. HACKBUSCH, Multi-Grid Methods and Applications, Springer-Verlag, Berlin, New
York, 1985.
[117] , Iterative Lösung großer schwachbesetzter Gleichungssysteme, Teubner, Stuttgart,
1991.
[118] A. HADJIDIMOS, On some high accuracy difference schemes for solving elliptic equations,
Numer. Math., 13 (1969), pp. 396–403.
[119] L. HAGEMAN AND D. YOUNG, Applied Iterative Methods, Academic Press, New York,
1981.
[120] W. HAGER, Condition estimators, SIAM J. Sci. Statist. Comput., 5 (1984), pp. 311–316.
[121] M. HESTENES AND E. STIEFEL, Methods of conjugate gradients for solving linear systems,
J. Res. Nat. Bur. Stand., 49 (1952), pp. 409–436.
[122] M. R. HESTENES, Conjugacy and gradients, in A History of Scientific Computing,
Addison-Wesley, Reading, MA, 1990, pp. 167–179.
[123] N. HIGHAM, Experience with a matrix norm estimator, SIAM J. Sci. Statist. Comput., 11
(1990), pp. 804–809.
[124] K. JEA AND D. YOUNG, Generalized conjugate-gradient acceleration of nonsymmetriz-
able iterative methods, Linear Algebra Appl., 34 (1980), pp. 159–194.
[125] O. JOHNSON, C. MICCHELLI, AND G. PAUL, Polynomial preconditioning for conjugate
gradient calculation, SIAM J. Numer. Anal., 20 (1983), pp. 362–376.
[126] M. JONES AND P. PLASSMANN, Parallel solution of unstructured, sparse systems of linear
equations, in Proceedings of the Sixth SIAM conference on Parallel Processing for Scien-
tific Computing, R. Sincovec, D. Keyes, M. Leuze, L. Petzold, and D. Reed, eds., SIAM,
Philadelphia, pp. 471–475.
[127] , A parallel graph coloring heuristic, SIAM J. Sci. Statist. Comput., 14 (1993),
pp. 654–669.
[128] W. JOUBERT, Lanczos methods for the solution of nonsymmetric systems of linear equations,
SIAM J. Matrix Anal. Appl., 13 (1992), pp. 926–943.
[129] W. KAHAN, Gauss-Seidel methods of solving large systems of linear equations, PhD thesis,
University of Toronto, 1958.
[130] S. KANIEL, Estimates for some computational techniques in linear algebra, Mathematics
of Computation, 20 (1966), pp. 369–378.
[131] D. KERSHAW, The incomplete Cholesky-conjugate gradient method for the iterative solu-
tion of systems of linear equations, J. Comput. Phys., 26 (1978), pp. 43–65.
[132] R. KETTLER, Analysis and comparison of relaxation schemes in robust multigrid and pre-
conditioned conjugate gradient methods, in Multigrid Methods, Lecture Notes in Mathemat-
ics 960, W. Hackbusch and U. Trottenberg, eds., Springer-Verlag, Berlin, New York, 1982,
pp. 502–534.
[133] , Linear multigrid methods in numerical reservoir simulation, PhD thesis, Delft Uni-
versity of Technology, Delft, The Netherlands, 1987.
[134] D. E. KEYES, T. F. CHAN, G. MEURANT, J. S. SCROGGS, AND R. G. VOIGT, eds.,
Domain Decomposition Methods For Partial Differential Equations, SIAM, Philadelphia,
1992. Proceedings of the Fifth International Symposium on Domain Decomposition Meth-
ods, Norfolk, VA, 1991.
[135] D. E. KEYES AND W. D. GROPP, A comparison of domain decomposition techniques for
elliptic partial differential equations and their parallel implementation, SIAM J. Sci. Statist.
Comput., 8 (1987), pp. s166 – s202.
[136] , Domain decomposition for nonsymmetric systems of equations: Examples from com-
putational fluid dynamics, in Domain Decomposition Methods, proceedings of the Sec-
ond International Symposium, Los Angeles, California, January 14–16, 1988, T. F. Chan,
R. Glowinski, J. Periaux, and O. B. Widlund, eds., Philadelphia, 1989, SIAM, pp. 373–384.
[137] , Domain decomposition techniques for the parallel solution of nonsymmetric systems
of elliptic boundary value problems, Applied Num. Math., 6 (1989/1990), pp. 281–301.
[138] S. K. KIM AND A. T. CHRONOPOULOS, A class of Lanczos-like algorithms implemented
on parallel computers, Parallel Comput., 17 (1991), pp. 763–778.
[139] D. R. KINCAID, J. R. RESPESS, D. M. YOUNG, AND R. G. GRIMES, ITPACK 2C: A
Fortran package for solving large sparse linear systems by adaptive accelerated iterative
methods, ACM Trans. Math. Soft., 8 (1982), pp. 302–322. Algorithm 586.
[140] L. Y. KOLOTILINA AND A. Y. YEREMIN, On a family of two-level preconditionings of the
incomplete block factorization type, Sov. J. Numer. Anal. Math. Modelling, (1986), pp. 293–
320.
[141] C. LANCZOS, An iteration method for the solution of the eigenvalue problem of linear dif-
ferential and integral operators, J. Res. Nat. Bur. Stand., 45 (1950), pp. 255–282.
[142] , Solution of systems of linear equations by minimized iterations, J. Res. Nat. Bur.
Stand., 49 (1952), pp. 33–53.
[143] C. LAWSON, R. HANSON, D. KINCAID, AND F. KROGH, Basic Linear Algebra Subpro-
grams for FORTRAN usage, ACM Trans. Math. Soft., 5 (1979), pp. 308–325.
[144] J. MAITRE AND F. MUSY, The contraction number of a class of two-level methods; an exact
evaluation for some finite element subspaces and model problems, in Multigrid methods,
Proceedings, Köln-Porz, 1981, W. Hackbusch and U. Trottenberg, eds., vol. 960 of Lecture
Notes in Mathematics, 1982, pp. 535–544.
[145] T. MANTEUFFEL, The Tchebychev iteration for nonsymmetric linear systems, Numer. Math.,
28 (1977), pp. 307–327.
[146] , An incomplete factorization technique for positive definite linear systems, Mathemat-
ics of Computation, 34 (1980), pp. 473–497.
[147] S. MCCORMICK, Multilevel Adaptive Methods for Partial Differential Equations, SIAM,
Philadelphia, 1989.
[148] S. MCCORMICK AND J. THOMAS, The Fast Adaptive Composite grid (FAC) method for
elliptic equations, Mathematics of Computation, 46 (1986), pp. 439–456.
[149] U. MEIER AND A. SAMEH, The behavior of conjugate gradient algorithms on a multivector
processor with a hierarchical memory, J. Comput. Appl. Math., 24 (1988), pp. 13–32.
[150] U. MEIER-YANG, Preconditioned conjugate gradient-like methods for nonsymmetric linear
systems, tech. rep., CSRD, University of Illinois, Urbana, IL, April 1992.
[151] J. MEIJERINK AND H. VAN DER VORST, An iterative solution method for linear systems
of which the coefficient matrix is a symmetric M-matrix, Mathematics of Computation, 31
(1977), pp. 148–162.
[152] , Guidelines for the usage of incomplete decompositions in solving sets of linear equa-
tions as they occur in practical problems, J. Comput. Phys., 44 (1981), pp. 134–155.
[153] R. MELHEM, Toward efficient implementation of preconditioned conjugate gradient meth-
ods on vector supercomputers, Internat. J. Supercomput. Appls., 1 (1987), pp. 77–98.
[154] G. MEURANT, The block preconditioned conjugate gradient method on vector computers,
BIT, 24 (1984), pp. 623–633.
[155] , Multitasking the conjugate gradient method on the CRAY X-MP/48, Parallel Comput.,
5 (1987), pp. 267–280.
[156] N. MUNKSGAARD, Solving sparse symmetric sets of linear equations by preconditioned
conjugate gradients, ACM Trans. Math. Software, 6 (1980), pp. 206–219.
[157] N. NACHTIGAL, S. REDDY, AND L. TREFETHEN, How fast are nonsymmetric matrix iter-
ations?, SIAM J. Matrix Anal. Appl., 13 (1992), pp. 778–795.
[158] N. NACHTIGAL, L. REICHEL, AND L. TREFETHEN, A hybrid GMRES algorithm for non-
symmetric matrix iterations, Tech. Rep. 90-7, MIT, Cambridge, MA, 1990.
[159] N. M. NACHTIGAL, A Look-Ahead Variant of the Lanczos Algorithm and its Application to
the Quasi-Minimal Residual Methods for Non-Hermitian Linear Systems, PhD thesis, MIT,
Cambridge, MA, 1991.
[160] Y. NOTAY, Solving positive (semi)definite linear systems by preconditioned iterative meth-
ods, in Preconditioned Conjugate Gradient Methods, O. Axelsson and L. Kolotilina, eds.,
vol. 1457 of Lecture Notes in Mathematics, Nijmegen, 1989, pp. 105–125.
[161] , On the robustness of modified incomplete factorization methods, Internat. J. Comput.
Math., 40 (1992), pp. 121–141.
[162] D. O’LEARY, The block conjugate gradient algorithm and related methods, Linear Algebra
Appl., 29 (1980), pp. 293–322.
[163] , Ordering schemes for parallel processing of certain mesh problems, SIAM J. Sci.
Statist. Comput., 5 (1984), pp. 620–632.
[164] T. C. OPPE, W. D. JOUBERT, AND D. R. KINCAID, NSPCG user’s guide, version 1.0: A
package for solving large sparse linear systems by various iterative methods, Tech. Rep.
CNA–216, Center for Numerical Analysis, University of Texas at Austin, Austin, TX, April
1988.
[165] J. M. ORTEGA, Introduction to Parallel and Vector Solution of Linear Systems, Plenum
Press, New York and London, 1988.
[166] C. PAIGE, B. PARLETT, AND H. VAN DER VORST, Approximate solutions and eigenvalue
bounds from Krylov subspaces, Numer. Lin. Alg. Appls., to appear.
[167] C. PAIGE AND M. SAUNDERS, Solution of sparse indefinite systems of linear equations,
SIAM J. Numer. Anal., 12 (1975), pp. 617–629.
[168] C. C. PAIGE AND M. A. SAUNDERS, LSQR: An algorithm for sparse linear equations and
sparse least squares, ACM Trans. Math. Soft., 8 (1982), pp. 43–71.
[169] G. PAOLINI AND G. RADICATI DI BROZOLO, Data structures to vectorize CG algorithms
for general sparsity patterns, BIT, 29 (1989), pp. 703–718.
[170] B. PARLETT, The symmetric eigenvalue problem, Prentice-Hall, London, 1980.
[171] B. N. PARLETT, D. R. TAYLOR, AND Z. A. LIU, A look-ahead Lanczos algorithm for
unsymmetric matrices, Mathematics of Computation, 44 (1985), pp. 105–124.
[172] D. PEACEMAN AND H. H. RACHFORD, JR., The numerical solution of parabolic and elliptic
differential equations, J. Soc. Indust. Appl. Math., 3 (1955), pp. 28–41.
[173] C. POMMERELL, Solution of Large Unsymmetric Systems of Linear Equations, vol. 17 of
Series in Micro-electronics, Hartung-Gorre Verlag, Konstanz, 1992.
[174] , Solution of large unsymmetric systems of linear equations, PhD thesis, Swiss Federal
Institute of Technology, Zürich, Switzerland, 1992.
[175] E. POOLE AND J. ORTEGA, Multicolor ICCG methods for vector computers, Tech. Rep.
RM 86-06, Department of Applied Mathematics, University of Virginia, Charlottesville,
VA, 1986.
[176] A. QUARTERONI, ed., Domain Decomposition Methods, Proceedings of the Sixth Interna-
tional Symposium on Domain Decomposition Methods, Como, Italy, Providence, RI, 1993,
AMS. to appear.
[177] G. RADICATI DI BROZOLO AND Y. ROBERT, Vector and parallel CG-like algorithms for
sparse non-symmetric systems, Tech. Rep. 681-M, IMAG/TIM3, Grenoble, France, 1987.
[178] J. REID, On the method of conjugate gradients for the solution of large sparse systems of
linear equations, in Large Sparse Sets of Linear Equations, J. Reid, ed., Academic Press,
London, 1971, pp. 231–254.
[179] G. RODRIGUE AND D. WOLITZER, Preconditioning by incomplete block cyclic reduction,
Mathematics of Computation, 42 (1984), pp. 549–565.
[180] Y. SAAD, The Lanczos biorthogonalization algorithm and other oblique projection methods
for solving large unsymmetric systems, SIAM J. Numer. Anal., 19 (1982), pp. 485–506.
[181] , Practical use of some Krylov subspace methods for solving indefinite and nonsym-
metric linear systems, SIAM J. Sci. Statist. Comput., 5 (1984), pp. 203–228.
[182] , Practical use of polynomial preconditionings for the conjugate gradient method,
SIAM J. Sci. Statist. Comput., 6 (1985), pp. 865–881.
[183] , Preconditioning techniques for indefinite and nonsymmetric linear systems, J. Com-
put. Appl. Math., 24 (1988), pp. 89–105.
[184] , Krylov subspace methods on supercomputers, SIAM J. Sci. Statist. Comput., 10
(1989), pp. 1200–1232.
[185] , SPARSKIT: A basic tool kit for sparse matrix computation, Tech. Rep. CSRD TR
1029, CSRD, University of Illinois, Urbana, IL, 1990.
[186] , A flexible inner-outer preconditioned GMRES algorithm, SIAM J. Sci. Comput., 14
(1993), pp. 461–469.
[187] Y. SAAD AND M. SCHULTZ, Conjugate gradient-like algorithms for solving nonsymmetric
linear systems, Mathematics of Computation, 44 (1985), pp. 417–424.
[188] , GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear
systems, SIAM J. Sci. Statist. Comput., 7 (1986), pp. 856–869.
[189] G. L. G. SLEIJPEN AND D. R. FOKKEMA, Bi-CGSTAB(ℓ) for linear equations involving
unsymmetric matrices with complex spectrum, Tech. Rep. 772, University of Utrecht, De-
partment of Mathematics, Utrecht, The Netherlands, 1993.
[190] B. F. SMITH, Domain decomposition algorithms for partial differential equations of linear
elasticity, Tech. Rep. 517, Department of Computer Science, Courant Institute, 1990.
[191] P. SONNEVELD, CGS, a fast Lanczos-type solver for nonsymmetric linear systems, SIAM J.
Sci. Statist. Comput., 10 (1989), pp. 36–52.
[192] R. SOUTHWELL, Relaxation Methods in Theoretical Physics, Clarendon Press, Oxford,
1946.
[193] H. STONE, Iterative solution of implicit approximations of multidimensional partial differ-
ential equations, SIAM J. Numer. Anal., 5 (1968), pp. 530–558.
[194] P. SWARZTRAUBER, The methods of cyclic reduction, Fourier analysis and the FACR algo-
rithm for the discrete solution of Poisson’s equation on a rectangle, SIAM Rev., 19 (1977),
pp. 490–501.
[195] C. TONG, A comparative study of preconditioned Lanczos methods for nonsymmetric linear
systems, Tech. Rep. SAND91-8240, Sandia Nat. Lab., Livermore, CA, 1992.
[196] A. VAN DER SLUIS, Condition numbers and equilibration of matrices, Numer. Math., 14
(1969), pp. 14–23.
[197] A. VAN DER SLUIS AND H. VAN DER VORST, The rate of convergence of conjugate gradi-
ents, Numer. Math., 48 (1986), pp. 543–560.
[198] H. VAN DER VORST, Iterative solution methods for certain sparse linear systems with a
non-symmetric matrix arising from PDE-problems, J. Comput. Phys., 44 (1981), pp. 1–19.
[199] , A vectorizable variant of some ICCG methods, SIAM J. Sci. Statist. Comput., 3
(1982), pp. 350–356.
[200] , Large tridiagonal and block tridiagonal linear systems on vector and parallel com-
puters, Parallel Comput., 5 (1987), pp. 45–54.
[201] , (M)ICCG for 2D problems on vector computers, in Supercomputing, A. Lichnewsky
and C. Saguez, eds., North-Holland, 1988. Also as Report No. A-17, Data Processing Center,
Kyoto University, Kyoto, Japan, December 17, 1986.
[202] , High performance preconditioning, SIAM J. Sci. Statist. Comput., 10 (1989),
pp. 1174–1185.
[203] , ICCG and related methods for 3D problems on vector computers, Computer Physics
Communications, (1989), pp. 223–235. Also as Report No.A-18, Data Processing Center,
Kyoto University, Kyoto, Japan, May 30, 1987.
[204] , The convergence behavior of preconditioned CG and CG-S in the presence of round-
ing errors, in Preconditioned Conjugate Gradient Methods, O. Axelsson and L. Y. Kolotilina,
eds., vol. 1457 of Lecture Notes in Mathematics, Berlin, New York, 1990, Springer-Verlag.
[205] , Bi-CGSTAB: A fast and smoothly converging variant of Bi-CG for the solution of
nonsymmetric linear systems, SIAM J. Sci. Statist. Comput., 13 (1992), pp. 631–644.
[206] H. VAN DER VORST AND J. MELISSEN, A Petrov-Galerkin type method for solving Ax = b
where A is symmetric complex, IEEE Trans. Magnetics, 26 (1990), pp. 706–708.
[207] H. VAN DER VORST AND C. VUIK, GMRESR: A family of nested GMRES methods, Tech.
Rep. 91-80, Delft University of Technology, Faculty of Tech. Math., Delft, The Netherlands,
1991.
[208] J. VAN ROSENDALE, Minimizing inner product data dependencies in conjugate gradient
iteration, Tech. Rep. 172178, ICASE, NASA Langley Research Center, 1983.
[209] R. VARGA, Matrix Iterative Analysis, Prentice-Hall Inc., Englewood Cliffs, NJ, 1962.
[210] P. VASSILEVSKI, Preconditioning nonsymmetric and indefinite finite element matrices, J.
Numer. Alg. Appl., 1 (1992), pp. 59–76.
[211] V. VOEVODIN, The problem of non-self-adjoint generalization of the conjugate gradient
method is closed, U.S.S.R. Comput. Maths. and Math. Phys., 23 (1983), pp. 143–144.
[212] H. F. WALKER, Implementation of the GMRES method using Householder transformations,
SIAM J. Sci. Statist. Comput., 9 (1988), pp. 152–163.
[213] P. WESSELING, An Introduction to Multigrid Methods, Wiley, Chichester, 1991.
[214] O. WIDLUND, A Lanczos method for a class of non-symmetric systems of linear equations,
SIAM J. Numer. Anal., 15 (1978), pp. 801–812.
[215] D. YOUNG, Iterative solution of large linear systems, Academic Press, New York, 1971.
[216] H. YSERENTANT, On the multilevel splitting of finite element spaces, Numer. Math., 49
(1986), pp. 379–412.
Appendix D
Flowchart of iterative methods
[The flowchart cannot be reproduced in this text version. Its decision points are: Is the matrix symmetric? Is the matrix definite? Are the outer eigenvalues known? Is the transpose available? Is storage at a premium? Depending on the answers, the chart suggests trying CG, Chebyshev or CG, MinRES or CG, GMRES with a long restart, QMR, or CGS/Bi-CGSTAB.]

ii

How to Use This Book
We have divided this book into five main chapters. Chapter 1 gives the motivation for this book and the use of templates. Chapter 2 describes stationary and nonstationary iterative methods. In this chapter we present both historical development and state-of-the-art methods for solving some of the most challenging computational problems facing researchers. Chapter 3 focuses on preconditioners. Many iterative methods depend in part on preconditioners to improve performance and ensure fast convergence. Chapter 4 provides a glimpse of issues related to the use of iterative methods. This chapter, like the preceding, is especially recommended for the experienced user who wishes to have further guidelines for tailoring a specific code to a particular machine. It includes information on complex systems, stopping criteria, data storage formats, and parallelism. Chapter 5 includes overviews of related topics such as the close connection between the Lanczos algorithm and the Conjugate Gradient algorithm, block iterative methods, red/black orderings, domain decomposition methods, multigrid-like methods, and row-projection schemes. The Appendices contain information on how the templates and BLAS software can be obtained. A glossary of important terms used in the book is also provided. The field of iterative methods for solving systems of linear equations is in constant flux, with new methods and approaches continually being created, modified, tuned, and some eventually discarded. We expect the material in this book to undergo changes from time to time as some of these new approaches mature and become the state-of-the-art. Therefore, we plan to update the material included in this book periodically for future editions. We welcome your comments and criticisms of this work to help us in that updating process. Please send your comments and questions by email to templates@cs.utk.edu.



A good preconditioner improves the convergence of the iterative method. The Jacobi method is based on solving for every variable locally with respect to the other variables. but convergence is slow. 5 . The transformation matrix is called a preconditioner. except that it uses updated values as soon as they are available. SOR may converge faster than Gauss-Seidel by an order of magnitude. the Gauss-Seidel method will converge faster than the Jacobi method. one iteration of the method corresponds to solving for every variable once. The nonstationary methods we present are based on the idea of sequences of orthogonal vectors. Hence. In this book we will cover two types of iterative methods. The resulting method is easy to understand and implement. which is based on orthogonal polynomials. though still relatively slowly.Chapter 2 Iterative Methods The term “iterative method” refers to a wide range of techniques that use successive approximations to obtain more accurate solutions to a linear system at each step. iterative methods usually involve a second matrix that transforms the coefficient matrix into one with a more favorable spectrum. In later sections of this chapter more detailed descriptions of these methods are given. – SOR. without a preconditioner the iterative method may even fail to converge. Successive Overrelaxation (SOR) can be derived from the Gauss-Seidel method by introducing an extrapolation parameter ω. along with brief notes on the classification of the methods in terms of the class of matrices for which they are most appropriate. their analysis is usually harder to understand. simpler to understand and implement.) The rate at which an iterative method converges depends greatly on the spectrum of the coefficient matrix. Stationary methods are older. Nonstationary methods are a relatively recent development. • Stationary Methods – Jacobi. Indeed. but they can be highly effective. (An exception is the Chebyshev iteration method. The Gauss-Seidel method is like the Jacobi method.1 Overview of the Methods Below are short descriptions of each of the methods to be discussed. In general. sufficiently to overcome the extra cost of constructing and applying the preconditioner. – Gauss-Seidel. if the Jacobi method converges. 2. but usually not as effective. For the optimal choice of ω.

However. and combines these through a least-squares solve and update. or “bi-orthogonal”. computation and storage costs are limited by specifying a fixed number of vectors to be generated. SYMMLQ will generate the same solution iterates as CG if the coefficient matrix is symmetric positive definite. it is useful as a preconditioner for nonstationary methods. and while it converges smoothly. On the other hand. however. so that a large amount of storage is needed. however. – Conjugate Gradient Squared (CGS). and one on AT . one based on a system with the original coefficient matrix A. CG is an extremely effective method when the coefficient matrix is symmetric positive definite. and there is a possibility that the method will break down. it does not effect a true minimization of either the error or the residual. CGNR solves (AT A)x = ˜ for the solution vector b x where ˜ = AT b. They are also the gradients of a quadratic functional. the minimization of which is equivalent to solving the linear system. These methods are based on the application of the CG method to one of two forms of the normal equations for Ax = b. BiCG requires a multiplication with the coefficient matrix and with its transpose at each iteration. like CG. This method is useful for general nonsymmetric matrices. – Minimum Residual (MINRES) and Symmetric LQ (SYMMLQ). Also. The Quasi-Minimal Residual method applies a least-squares solve and update to the BiCG residuals. since the spectrum of the normal equations matrices will be less favorable than the spectrum of A. This method. It is useful when the matrix is nonsymmetric and nonsingular. These vectors are the residuals of the iterates. The Generalized Minimal Residual method computes a sequence of orthogonal vectors (like MINRES). – Generalized Minimal Residual (GMRES). convergence may be irregular. they are made mutually orthogonal. The Biconjugate Gradient method generates two CG-like sequences of vectors.6 CHAPTER 2. Symmetric Successive Overrelaxation (SSOR) has no advantage over SOR as a standalone iterative method. restarted versions of this method are used. unlike MINRES (and CG) it requires storing the whole sequence. – BiConjugate Gradient (BiCG). In restarted versions. since storage for only a limited number of vectors is required. • Nonstationary Methods – Conjugate Gradient (CG). ITERATIVE METHODS – SSOR. Instead of orthogonalizing each sequence. it does not essentially improve on the BiCG. For this reason. and hence CG can be applied. QMR largely avoids the breakdown that can occur in BiCG. – Conjugate Gradient on the Normal Equations: CGNE and CGNR. The conjugate gradient method derives its name from the fact that it generates a sequence of conjugate (or orthogonal) vectors. thereby smoothing out the irregular convergence behavior of BiCG. – Quasi-Minimal Residual (QMR). The convergence may be slow. CGNE solves the system (AAT )y = b for y and then computes the solution x = AT y. When the coefficient matrix A is nonsymmetric and nonsingular. . b the normal equations matrices AAT and AT A will be symmetric and positive definite. These methods are computational alternatives for CG for coefficient matrices that are symmetric but possibly indefinite. uses limited storage.

2. we summarize their convergence behavior and their effectiveness.j xj (k−1) )/ai. j=1 we solve for the value of xi while assuming the other entries of x remain fixed.j xj )/ai. A practical advantage is that the method does not need the multiplications with the transpose of the coefficient matrix. Note that the order in which the equations are examined is irrelevant. – Biconjugate Gradient Stabilized (Bi-CGSTAB).i .2) This suggests an iterative method defined by xi (k) = (bi − j=i ai.3) which is the Jacobi method. the Successive Overrelaxation (SOR) method and the Symmetric Successive Overrelaxation (SSOR) method. but in practice convergence may be much more irregular than for BiCG. In this section.1 The Jacobi Method The Jacobi method is easily derived by examining each of the n equations in the linear system Ax = b in isolation. (2.2.3) can be expressed as x(k) = D−1 (L + U )x(k−1) + D−1 b. Finally.2 Stationary Iterative Methods Iterative methods that can be expressed in the simple form x(k) = Bx(k−1) + c (2. (2. this would double the convergence rate. In matrix terms. – Chebyshev Iteration. the definition of the Jacobi method in (2. we present the four main stationary iterative methods: the Jacobi method.4) .2. STATIONARY ITERATIVE METHODS 7 The Conjugate Gradient Squared method is a variant of BiCG that applies the updating operations for the A-sequence and the AT -sequences both to the same vectors. The Biconjugate Gradient Stabilized method is a variant of BiCG. Ideally.2. For this reason. The Chebyshev Iteration recursively determines polynomials with coefficients chosen to minimize the norm of the residual in a min-max sense.1) (where neither B nor c depend upon the iteration count k) are called stationary iterative methods. and discuss how and when they should be used. since the updates could in principle be done simultaneously. since the Jacobi method treats them independently. the Gauss-Seidel method. If in the ith equation n ai.5. we give some historical background and further notes and references. This method has the advantage of requiring no inner products. The coefficient matrix must be positive definite and knowledge of the extremal eigenvalues is required. 2. In each case. the Jacobi method is also known as the method of simultaneous displacements.j xj = bi . (2. like CGS. in §2. but using different updates for the AT -sequence in order to obtain smoother convergence than CGS. 2.i . we obtain xi = (bi − j=i ai.

8 CHAPTER 2. . Convergence of the Jacobi method Iterative methods are often used for solving discretized partial differential equations. N − 1 the function un (x) = sin nπx is an eigenfunction corresponding to λ = 4 sin2 nπ/(2N ). 2. . u(1) = u1 . for i = 1.1: The Jacobi Method where the matrices D. whereas the damping factor for modes with n small is close to 1. .. . The eigenvalues of the Jacobi iteration matrix B are then λ(B) = 1 − 1/2λ(L) = 1 − 2 sin2 nπ/(2N ).j xj ¯ ¯ end xi = (bi − xi )/ai. 2. and that previously computed . . It is not possible to update the vector x in place. The pseudocode for the Jacobi method is given in Figure 2. In that context a rigorous analysis of the convergence of simple methods such as the Jacobi method can be given. for k = 1. . . i = 1 . . 2. the strictly lower-triangular. and the strictly upper-triangular parts of A. . i.2 The Gauss-Seidel Method Consider again the linear equations in (2. since values ¯ from x(k−1) are needed throughout the computation of x(k) .i ¯ ¯ end x(k) = x ¯ check convergence. From this it is easy to see that the high frequency modes (i. . . . . The eigenfunctions of the L and L operator are the same: for n = 1 . but now assume that the equations are examined one at a time in sequence.e. N − 1. u(0) = u0 . i + 1. 1). . ITERATIVE METHODS Choose an initial guess x(0) to the solution x.e. h = N −d where N is the number of variables and d is the number of space dimensions. The type of analysis applied to this example can be generalized to higher dimensions and other stationary iterative methods.1.2. If we proceed as with the Jacobi method. . and it is attained for the eigenfunction u(x) = sin πx. 2. −L and −U represent the diagonal. consider the boundary value problem Lu = −uxx = f discretized by Lu(xi ) = 2u(xi ) − u(xi−1 ) − u(xi+1 ) = f (xi )/N 2 for xi = i/N . Note that an auxiliary storage vector. The spectral radius of the Jacobi iteration matrix is ≈ 1 − 10/N 2 . i − 1. continue if necessary end Figure 2. respectively. .2). . . x is used in the algorithm. For both the Jacobi and Gauss-Seidel method (below) the spectral radius is found to be 1 − O(h2 ) where h is the discretization mesh width. As an example. n xi = 0 ¯ for j = 1. eigenfunction un with n large) are damped quickly. .. on (0. n (k−1) xi = xi + ai.

the reader is referred to Chapter 3 and §4. 2. Since each component of the new iterate depends upon all previously computed components.j xj end (k) xi = (bi − σ)/ai. the dependency of each component of the new iterate on previous components is not absolute. continue if necessary end Figure 2. thus restoring the ability to make updates to groups of components in parallel. . reordering the equations can affect the rate at which the Gauss-Seidel method converges. . Second. . the updates cannot be done simultaneously as in the Jacobi method.j xj (k−1) )/ai. is devised by applying extrapolation to the GaussSeidel method. lower-triangular. The Gauss-Seidel method is sometimes called the method of successive displacements to indicate the dependence of the iterates on the ordering.3 The Successive Overrelaxation Method The Successive Overrelaxation Method. . STATIONARY ITERATIVE METHODS 9 Choose an initial guess x(0) to the solution x. n (k−1) σ = σ + ai.i end check convergence.2.i . . First. . . n σ=0 for j = 1. we obtain the Gauss-Seidel method: xi (k) = (bi − j<i ai. These two points are important because if A is sparse. or SOR.5) can be expressed as x(k) = (D − L)−1 (U x(k−1) + b). This extrapolation takes the form of a weighted average between the previous iterate . . the new iterate x(k) depends upon the order in which the equations are examined. . D. . For a practical discussion of this tradeoff (parallelism versus convergence rate) and some standard reorderings. .2. respectively. for k = 1.2. . it may be possible to reduce such dependence. The presence of zeros in the matrix may remove the influence of some of the previous components.2: The Gauss-Seidel Method results are used as soon as they are available.6) As before. 2. for i = 1. Using a judicious ordering of the equations. In matrix terms. i − 1 (k) σ = σ + ai.j xj end for j = i + 1. (2. . 2. and upper-triangular parts of A. (2. 2. . . A poor choice of ordering can degrade the rate of convergence. the components of the new iterate (and not just their order) will also change. The pseudocode for the Gauss-Seidel algorithm is given in Figure 2.2. However. the definition of the Gauss-Seidel method in (2. a good choice can enhance the rate of convergence. the computations in (2.5) Two important facts about the Gauss-Seidel method should be noted. If this ordering is changed. −L and −U represent the diagonal.j xj (k) − j>i ai.5) appear to be serial.4.

some heuristic estimate is used. though the choice of ω can significantly affect the rate at which the SOR iteration converges. . Even when it is possible to compute the optimal value for ω.7) . . If the coefficient matrix A is symmetric and positive definite. . continue if necessary end Figure 2. For coefficient matrices of a special class called consistently ordered with property A (see Young [215]). the SOR algorithm can be written as follows: x(k) = (D − ωL)−1 (ωU + (1 − ω)D)x(k−1) + ω(D − ωL)−1 b. In matrix terms. . Frequently. for k = 1. 2.3.j xj end for j = i + 1. In general. . it is not possible to compute in advance the value of ω that is optimal with respect to the rate of convergence of SOR. such as ω = 2 − O(h) where h is the mesh spacing of the discretization of the underlying physical domain.3: The SOR Method and the computed Gauss-Seidel iterate successively for each component: xi (k) = ω¯i x (k) + (1 − ω)xi (k−1) (where x denotes a Gauss-Seidel iterate.10 CHAPTER 2. A theorem due to Kahan [129] shows that SOR fails to converge if ω is outside the interval (0. 2. n σ=0 for j = 1. and ω is the extrapolation factor). i − 1 (k) σ = σ + ai. the SOR method simplifies to the Gauss-Seidel method. . for convenience the term overrelaxation is now used for any value of ω ∈ (0. there is a direct relationship between the spectra of the Jacobi and SOR iteration (2. the expense of such computation is usually prohibitive. the SOR iteration is guaranteed to converge for any value of ω between 0 and 2.i (k) (k−1) (k−1) xi = xi + ω(σ − xi ) end check convergence. The idea is to choose a ¯ value for ω that will accelerate the rate of convergence of the iterates to the solution. . 2. .j xj end σ = (bi − σ)/ai. n (k−1) σ = σ + ai. . . . Choosing the Value of ω If ω = 1. for i = 1. 2). which includes certain orderings of matrices arising from the discretization of elliptic PDEs. . . The pseudocode for the SOR algorithm is given in Figure 2. Sophisticated implementations of the SOR algorithm (such as that found in ITPACK [139]) employ adaptive parameter estimation schemes to try to home in on the appropriate value of ω by estimating the rate at which the iteration is converging. Though technically the term underrelaxation should be used when 0 < ω < 1. . 2). ITERATIVE METHODS Choose an initial guess x(0) to the solution x. .

in which a different . The pseudocode for the SSOR algorithm is given in Figure 2. given the spectral radius ρ of the Jacobi iteration matrix. Southwell’s relaxation method was considered impractical for automated computing. However.2. then the Symmetric Successive Overrelaxation method.7). That is. In chaotic methods. STATIONARY ITERATIVE METHODS 11 matrices. but with the roles of L and U reversed. which are closely related to Southwell’s original relaxation method. and B2 = (D − ωL)−1 (ωU + (1 − ω)D). It is interesting to note that the introduction of multiple-instruction.4 The Symmetric Successive Overrelaxation Method If we assume that the coefficient matrix A is symmetric. Specifically.2. p. see Golub and Van Loan [108.2. Indeed. Baudet [30]. Southwell used overrelaxation to accelerate the convergence of his original relaxation method. the first SOR sweep is carried out as in (2. In matrix terms. thereby eliminating costly synchronization of the processors. one can determine a priori the theoretically optimal value of ω for SOR: ωopt = 2 1+ 1 − ρ2 . page 462]). In principle. (2. More recently. and Elkin [91]). and that B1 is the same.7). the order of relaxation is unconstrained. the ad hoc SOR method. 351]) can yield reasonable estimates for the optimal value of ω. This was the precursor to the SOR method. The notion of accelerating the convergence of an iterative method by extrapolation predates the development of SOR. this is the primary motivation for SSOR since its convergence rate. Specifically. though the order in which approximations to the unknowns were relaxed varied during the computation. from the power method. or SSOR. is usually slower than the convergence rate of SOR with optimal ω (see Young [215. For details on using SSOR as a preconditioner.8) This is seldom done.2. (2. 2. with an optimal value of ω. combines two SOR sweeps together in such a way that the resulting iteration matrix is similar to a symmetric matrix.9) 2. see Chapter 3. Because of this. multiple data-stream (MIMD) parallel computers has rekindled an interest in so-called asynchronous. the SSOR iteration can be expressed as follows: x(k) = B1 B2 x(k−1) + ω(2 − ω)(D − ωU )−1 D(D − ωL)−1 b. or chaotic iterative methods (see Chazan and Miranker [53]. where B1 = (D − ωU )−1 (ωL + (1 − ω)D). though the effect on convergence is difficult to predict. Note that B2 is simply the iteration matrix for SOR from (2. Indeed.5 Notes and References The modern treatment of iterative methods dates back to the relaxation method of Southwell [192]. since calculating the spectral radius of the Jacobi matrix requires an impractical amount of computation.4. relatively inexpensive rough estimates of ρ (for example. the next unknown was chosen based upon estimates of the location of the largest error in the current approximation. SSOR is a forward SOR sweep followed by a backward SOR sweep. but in the second sweep the unknowns are updated in the reverse order. The similarity of the SSOR iteration matrix to a symmetric matrix permits the application of SSOR as a preconditioner for other iterative schemes for symmetric matrices.

. . The reader is referred to these books (and the references therein) for further details concerning the methods described in this section.j xj (k− 1 ) 2 end for j = i + 1. . . . . . 1 σ=0 for j = 1. ITERATIVE METHODS Choose an initial guess x(0) to the solution x. . for k = 1. successsive approximations to the solution). n (k−1) σ = σ + ai.i xi (k− 1 ) 2 = xi (k−1) + ω(σ − xi (k−1) ) end for i = n. The method proceeds by generating vector sequences of iterates (i. 2. 2. .e. . n (k) σ = σ + ai.j xj end σ = (bi − σ)/ai. . . . . . .1 Conjugate Gradient Method (CG) The Conjugate Gradient method is an effective method for symmetric positive definite systems.12 CHAPTER 2. 2. . 2. residuals . for i = 1. .4: The SSOR Method relaxation factor ω is used for updating each variable.j xj (k− 1 ) 2 end for j = i + 1. .3. constants are computed by taking inner products of residuals or other vectors arising from the iterative method. It is the oldest and best known of the nonstationary methods discussed here. 2. 2.j xj end ) end check convergence. . 1 Let x( 2 ) = x(0) . The three main references for the theory of stationary iterative methods are Varga [209]. . . n σ=0 for j = 1. . i − 1 σ = σ + ai. continue if necessary end xi (k) = xi (k− 1 ) 2 + ω(σ − xi (k− 1 ) 2 Figure 2. i − 1 σ = σ + ai..3 Nonstationary Iterative Methods Nonstationary methods differ from stationary methods in that the computations involve information that changes at each iteration. . n − 1. . Typically. Young [215] and Hageman and Young [119]. has given impressive results for some classes of problems (see Ehrlich [82]). . .

. for M = I one obtains the unpreconditioned version of the Conjugate Gradient Algorithm. In that case the algorithm may be further simplified by skipping the “solve” line. and search directions used in updating the iterates and residuals. and replacing z (i−1) by r(i−1) (and z (0) by r(0) ). In every iteration of the method. .5. It uses a preconditioner M . 2. In fact. where x is the exact ˆ ˆ ˆ .5: The Preconditioned Conjugate Gradient Method corresponding to the iterates. . r(i) and r(i−1) – are orthogonal.10) The choice α = αi = r(i−1) r(i−1) /p(i) Ap(i) minimizes r(i) A−1 r(i) over all possible choices for α in equation (2. The pseudocode for the Preconditioned Conjugate Gradient Method is given in Figure 2. Theory The unpreconditioned conjugate gradient method constructs the ith iterate x(i) as an element of x(0) + span{r(0) . continue if necessary end Figure 2.11) where the choice βi = r r /r r ensures that p and Ap – or equivalently. two inner products are performed in order to compute update scalars that are defined to make the sequences satisfy certain orthogonality conditions. . T (2. only a small number of vectors needs to be kept in memory. .2. . Although the length of these sequences can become large.3. one can show that this choice of βi makes p(i) and r(i) orthogonal to all previous Ap(j) and r(j) respectively. Correspondingly the residuals r(i) = b − Ax(i) are updated as r(i) = r(i−1) − αq (i) T where T q (i) = Ap(i) . On a symmetric positive definite linear system these conditions imply that the distance to the true solution is minimized in some norm. The iterates x(i) are updated in each iteration by a multiple (αi ) of the search direction vector p(i) : x(i) = x(i−1) + αi p(i) . solve M z (i−1) = r(i−1) T ρi−1 = r(i−1) z (i−1) if i = 1 p(1) = z (0) else βi−1 = ρi−1 /ρi−2 p(i) = z (i−1) + βi−1 p(i−1) endif q (i) = Ap(i) T αi = ρi−1 /p(i) q (i) (i) (i−1) x =x + αi p(i) r(i) = r(i−1) − αi q (i) check convergence. NONSTATIONARY ITERATIVE METHODS 13 Compute r(0) = b − Ax(0) for some initial guess x(0) for i = 1. (i)T (i) (i−1)T (i−1) (i) (i−1) (2. The search directions are updated using the residuals p(i) = r(i) + βi−1 p(i−1) . Ai−1 r(0) } so that (x(i) − x)T A(x(i) − x) is minimized.10). .

ITERATIVE METHODS solution of Ax = b. §5.12) √ √ where α = ( κ2 − 1)/( κ2 + 1) (see Golub and Van Loan [108. . .1. For example. In the Conjugate Gradient method two coupled two-term recurrences are used. then the spectral ˆ condition number of B is κ2 (B) = λmax (B)/λmin (B)). For the Conjugate Gradient method. and a similarity transformation of the coefficient matrix to tridiagonal form. The above minimization of the error is equivalent to the residuals r(i) = b − Ax(i) being M −1 T orthogonal (that is. that is. and y 2 ≡ (y. and one updating the search direction with a newly computed residual. Golub and O’Leary [57]). r(i) M −1 r(j) = 0 if i = j). although over this different subspace. .8]. This makes the Conjugate Gradient Method quite attractive computationally. but it satisfies the same minimization property. (Recall that if λmax and λmin are the largest and smallest eigenvalues of a symmetric positive definite matrix B. we expect a number of iterations proportional to h−1 for the Conjugate Gradient method. §10. then one often observes socalled superlinear convergence (see Concus. see Paige and Saunders [167] and Golub and Van Loan [108. independent of the order of the finite elements or differences used. the convergence rate depends on a reduced system with a (much) smaller condition number (for an analysis of this. In some cases. This phenomenon is explained by the fact that CG tends to eliminate components of the error in the direction of eigenvectors associated with extremal eigenvalues first. It requires in addition that the preconditioner M is symmetric and positive definite. see §5. . This minimum is guaranteed to exist in general only if A is symmetric positive definite. since both are based on the construction of an orthogonal basis for the Krylov subspace.5]). see Van der Sluis and Van der Vorst [197]). such a recurrence also suffices for generating the residuals. The coefficients computed during the CG iteration then arise from the LU factorization of this tridiagonal matrix. and of the number of space dimensions of the problem (see for instance Axelsson and Barker [14. The preconditioned version of the method uses a different subspace for constructing the iterates. convergence at a rate that increases per iteration.e.14 CHAPTER 2. If x is the exact solution of the linear system Ax = b. This relationship can be exploited to obtain relevant information about the eigensystem of the (preconditioned) matrix A. but useful bounds can often be obtained. From the CG iteration one can reconstruct the Lanczos process. Thus. The effectiveness of the preconditioner in reducing the condition number and in separating extremal eigenvalues can be deduced by studying the approximated eigenvalues of the related Lanczos process. There is a close relationship between the Conjugate Gradient method and the Lanczos method for determining eigensystems. Other results concerning the behavior of the Conjugate Gradient algorithm have been obtained. i. the method proceeds as if these eigenvalues did not exist in the given system. practical application of the above error bound is straightforward.6]. After these have been eliminated. and vice versa. elliptic second order partial differential equations typically give rise to coefficient matrices A with κ2 (A) = O(h−2 ) (where h is the discretization mesh width). From this relation we see that the number of iterations to reach a relative A √ reduction of in the error is proportional to κ2 . and Kaniel [130]).2.2. Ay). 
Convergence Accurate predictions of the convergence of iterative methods are difficult to make. If the extremal eigenvalues of the matrix M −1 A are well separated. one that updates residuals using a search direction vector. then for CG with symmetric positive definite preconditioner M . §10. . Ai−1 r(0) } can be constructed with only three-term recurrences. without preconditioning. it can be shown that x(i) − x ˆ A ≤ 2αi x(0) − x ˆ A (2. Since for symmetric A an orthogonal basis for the Krylov subspace span{r(0) . with symmetric positive definite matrix A. the error can be bounded in terms of the spectral condition number κ2 of the matrix M −1 A..

.. SYMMLQ solves the projected system. .     ↑ i+1 ↓ . . .. The vector sequences in the Conjugate Gradient method correspond to a factorization of a tridiagonal matrix similar to the coefficient matrix. For discussions of the method for more general parallel architectures see Demmel. . . but symmetric. The MINRES and SYMMLQ methods are variants of the CG method that avoid the LU -factorization and do not suffer from breakdown. which can be written in matrix form as ¯ ARi = Ri+1 Ti . . The convergence behavior of Conjugate Gradients and MINRES for indefinite systems was analyzed by Paige.. Further references A more formal presentation of CG. For a discussion of the Conjugate Gradient method on vector and shared memory computers. .10) and (2. An overview of papers published in the first 25 years of existence of the method is given in Golub and O’Leary [107]. we can still construct an orthogonal basis for the Krylov subspace by three term recurrence relations. and two inner products per iteration.       . .2 MINRES and SYMMLQ The Conjugate Gradient method can be viewed as a special variant of the Lanczos method (see §5...4. are discussed in §4. The MINRES and SYMMLQ methods are variants that can be applied to symmetric indefinite systems. Eliminating the search directions in equations (2. .3. et al.1) for positive definite symmetric systems.i + r(i) ti. . a breakdown of the algorithm can occur corresponding to a zero pivot if the matrix is indefinite. .3. 2..i + r(i−1) ti−1. Theory When A is not positive definite. Shorter presentations are given in Axelsson and Barker [14] and Golub and Van Loan [108]. Some slight computational variants exist that have the same structure (see Reid [178]). . three vector updates..i . . Parlett. as well as many theoretical properties. Furthermore. for indefinite matrices the minimization property of the Conjugate Gradient method is no longer well-defined... . .2. [70.. and Van der Vorst [166]. ¯ where Ti is an (i + 1) × i tridiagonal matrix      ¯ Ti =       ← i → .11) gives a recurrence Ar(i) = r(i+1) ti+1. 165].. . . Therefore. . but does not minimize anything (it keeps the residual orthogonal to all previous ones). Heath and Van der Vorst [66] and Ortega [165]. can be found in the textbook by Hackbusch [117]. Variants that cluster the inner products. see Dongarra. MINRES minimizes the residual in the 2-norm. . and the references therein. . NONSTATIONARY ITERATIVE METHODS Implementation 15 The Conjugate Gradient method involves one matrix-vector product. a favorable property on parallel machines.

3. This again leads to simple recurrences and the resulting method is known as SYMMLQ (see Paige and Saunders [167]). coefficient matrix. Ar(0) . Another possibility is to solve the system Ti y = r(0) 2 e(1) . A clever execution of this scheme delivers the factors L and U of the LU -decomposition of the tridiagonal matrix that would have been computed by carrying out the Lanczos procedure with AT A. r(i) 2 ). Ap). possibly indefinite (but nonsingular). Ai−1 r(0) }. Thus the rate of convergence of the CG procedure on the normal equations may be slow. . −1 Now we exploit the fact that if Di+1 ≡ diag( r(0) 2 . An alternative is then to decompose Ti by an LQ-decomposition.. . AT Ax = AT b.. ¯ The element in the (i + 1. Another means for improving the numerical stability of this normal equations approach is suggested by Bj¨ rck and Elfving in [34]. Since other methods for such systems are in general rather more complicated than the Conjugate Gradient method. 2. . r(1) 2 . which leads to the MINRES method (see Paige and Saunders [167]). AT Ap) leads to the suggestion that such an inner product be replaced by (Ap. then Ri+1 Di+1 is an orthonormal transformation with respect to the current Krylov subspace: Ax(i) − b 2 ¯ = Di+1 Ti y − r(0) 2 e(1) 2 and this final expression can simply be seen as a minimum norm least squares problem. .. . the convergence speed of the Conjugate Gradient method now depends on the square of the condition number of the original coefficient matrix. . as in the CG method (Ti is ¯ the upper i × i part of Ti ). CGNE and CGNR The CGNE and CGNR methods are the simplest methods for nonsymmetric or indefinite systems.3 CG on the Normal Equations. The best known is by Paige and Saunders [168] and is based upon applying the Lanczos method to the auxiliary system I AT A 0 r x = b 0 . Several proposals have been made to improve the numerical stability of this method. ·)A no longer defines an inner product.16 CHAPTER 2. i) position of Ti can be annihilated by a simple Givens rotation and the resulting upper bidiagonal system (the other subdiagonal elements having been removed in previous iteration steps) can simply be solved. The observation that the matrix AT A is used in the construco tion of the iteration coefficients through an inner product like (p. However we can still try to minimize the residual in the 2-norm by obtaining x(i) ∈ {r(0) . Other than in CG we cannot rely on the existence of a Choleski decomposition (since A is not positive definite). ITERATIVE METHODS In this case we have the problem that (·. transforming the system to a symmetric definite one and then applying the Conjugate Gradient method is attractive for its coding simplicity. one obvious attempt at a solution is to apply Conjugate Gradient to a related symmetric positive definite system. x(i) = Ri y ¯ that minimizes Ax(i) − b 2 = ARi y − b ¯ 2 ¯ = Ri+1 Ti y − b 2 . Theory If a system of linear equations Ax = b has a nonsymmetric. While this approach is easy to understand and code.

however. The Householder approach results in a three-fold increase in work. especially for ill-conditioned systems (see Walker [212]). In the Conjugate Gradient method. instead. where the coefficients yk have been chosen to minimize the residual norm b − Ax(i) .3. which are relatively costly but stable. . Applied to the Krylov sequence {Ak r(0) } this orthogonalization is called the “Arnoldi method” [6]. they show that if the coefficient matrix A is real and nearly positive definite. convergence may be better. Saad and Schultz [188] have proven several useful results. . . the choice of m. this basis is formed explicitly: w(i) = Av (i) for k = 1.3. the storage and computational requirements in the absence of restarts are prohibitive. the expensive action of forming the iterate can be postponed until the residual norm is deemed small enough. For this reason. there exist examples for which the method stagnates and convergence takes place only at the nth step.}. Like MINRES. “restarted” versions of the method are used. and uses restarts to control storage requirements. . A2 r(0) . the crucial element for successful application of GMRES(m) revolves around the decision of when to restart. . Theory The Generalized Minimum Residual (GMRES) method is designed to solve nonsymmetric linear systems (see Saad and Schultz [188]). v (k) )v (k) end v (i+1) = w(i) / w(i) The reader may recognize this as a modified Gram-Schmidt orthogonalization. The most popular form of GMRES is based on the modified Gram-Schmidt procedure. Implications of the choice of m are discussed below. In GMRES. GMRES (like any orthogonalizing Krylov-subspace method) will converge in no more than n steps (assuming exact arithmetic). Implementation A common implementation of GMRES is suggested by Saad and Schultz in [188] and relies on using modified Gram-Schmidt orthogonalization. all previously computed vectors in the orthogonal sequence have to be retained. Gram-Schmidt orthogonalization may be . the residuals form an orthogonal basis for the space span{r(0) . i w(i) = w(i) − (w(i) . Ar(0) . If no restarts are used. then a “reasonable” value for m may be selected. Householder transformations. From the point of view of parallelism.4 Generalized Minimal Residual (GMRES) The Generalized Minimal Residual method is an extension of MINRES (which is only applicable to symmetric systems) to unsymmetric systems. Thus. For such systems. NONSTATIONARY ITERATIVE METHODS 17 2. that is. The GMRES algorithm has the property that this residual norm can be computed without the iterate having been formed. but in the absence of symmetry this can no longer be done with short recurrences. The pseudocode for the restarted GMRES(m) algorithm with preconditioner M is given in Figure 2. . In particular. Indeed. moreover. Of course this is of no practical value when n is large. The GMRES iterates are constructed as x(i) = x(0) + y1 v (1) + · · · + yi v (i) . The inner product coefficients (w(i) . Unfortunately. any choice of m less than n fails to converge.2. . it generates a sequence of orthogonal vectors.6. v (k) ) and w(i) are stored in an upper Hessenberg matrix. have also been proposed.

18 CHAPTER 2.. Solve r from M r = b − Ax(0) v (1) = r/ r 2 s := r 2 e1 for i = 1. This procedure is repeated until convergence is achieved. The difficulty is in choosing an appropriate value for m. such that (i + 1)st component of Ji h. Unfortunately.. acting on ith and (i + 1)st component of h. 2. The usual way to overcome this limitation is by restarting the iteration....i ) construct Ji . s represents the first i components of s ˜ x = x(0) + y1 v (1) + y2 v (2) + . . in which ˜ the upper i × i triangular part of H has hi. i) x replaces the following computations: Compute y as the solution of Hy = s.i v (k) end hi+1.. .. hi+1. there are no definite rules governing the choice of m—choosing when to restart is a matter of experience. The major drawback to GMRES is that the amount of work and storage required per iteration rises linearly with the iteration count.... m) x end In this scheme UPDATE(˜.. or fail to converge entirely.i .i = (w. Unless one is fortunate enough to obtain extremely fast convergence.. If m is “too small”. A value of m that is larger than necessary involves excessive work (and uses more storage).i . GMRES(m) may be slow to converge.j as its elements (in least squares sense if H is singular). i hk. . i) and quit) x end UPDATE(˜. m Solve w from M w = Av (i) for k = 1. Heath and Van der Vorst [66]).i is 0 s := Ji s if s(i + 1) is small enough then (UPDATE(˜... [70]. the cost will rapidly become prohibitive. + yi v (i) ˜ s(i+1) = b − A˜ 2 x if x is an accurate enough approximation then quit ˜ else x(0) = x ˜ Figure 2..i = w 2 v (i+1) = w/hi+1. Here we adopt the Modified Gram-Schmidt approach.. . .. For a discussion of GMRES for vector and shared memory computers see Dongarra et al.. v (k) ) w = w − hk. After a chosen number (m) of iterations. 2. the accumulated data are cleared and the intermediate results are used as the initial data for the next m iterations. ITERATIVE METHODS x(0) is an initial guess for j = 1..i apply J1 . . Ji−1 on (h1.6: The Preconditioned GMRES(m) Method preferred. giving up some stability for better parallelization properties (see Demmel.

Implementation BiCG requires computing a matrix-vector product Ap(k) and a transpose product AT p(k) . ˜ ˜ ˜ βi = r(i) r(i) ˜ (i−1)T r (i−1) r ˜ T ensure the bi-orthogonality relations r(i) r(j) = p(i) Ap(j) = 0 ˜ ˜ T T if i = j. . breakdown or near-breakdown situations can be satisfactorily avoided by a restart at the iteration step immediately before the (near-) breakdown step. p(i)T Ap(i) ˜ T p(i) = r(i−1) + βi−1 p(i−1) .7. The update relations for residuals in the Conjugate Gradient method are augmented in the BiConjugate Gradient method by relations that are similar but based on AT instead of A. r(i) = r(i−1) − αi AT p(i) .5 BiConjugate Gradient (BiCG) The Conjugate Gradient method is not suitable for nonsymmetric systems because the residual vectors cannot be made orthogonal with short recurrences (for proof of this see Voevodin [211] or Faber and Manteuffel [95]). T p(i) q (i) ≈ 0. at the cost of a larger storage demand.3. and ˜ can be repaired by using another decomposition. This is done in QMR (see §2.1). like GMRES. occurs when the LU -decomposition fails (see the theory subsection of §2. Taylor and Liu [171]). Thus we update two sequences of residuals r(i) = r(i−1) − αi Ap(i) . ˜ ˜ ˜ and two sequences of search directions p(i) = r(i−1) + βi−1 p(i−1) . see Demmel. NONSTATIONARY ITERATIVE METHODS for more general architectures. but it is also observed that the convergence behavior may be quite irregular. For symmetric positive definite systems the method delivers the same results as CG. Another possibility is to switch to a more robust (but possibly more expensive) method. for instance as a function call evaluation. at the price of no longer providing a minimization. The other breakdown situation. For nonsymmetric matrices it has been shown that in phases of the process where there is significant reduction of the norm of the residual. The BiConjugate Gradient method takes another approach. In some ˜ applications the latter product may be impossible to perform. and the method T may even break down.3. 19 2.3. Sometimes.3. The pseudocode for the Preconditioned BiConjugate Gradient Method with preconditioner M is given in Figure 2. This leads to complicated codes and is beyond the scope of this book. Heath and Van der Vorst [66].2. The GMRES method retains orthogonality of the residuals by using long recurrences. The breakdown situation due to the possible event that z (i−1) r(i−1) ≈ 0 ˜ can be circumvented by so-called look-ahead strategies (see Parlett. replacing the orthogonal sequence of residuals by two mutually orthogonal sequences. but at twice the cost per iteration. for instance if the matrix is not formed explicitly and the regular product is only given in operation form. Convergence Few theoretical results are known about the convergence of BiCG. In practice this is often confirmed. The choices αi = r(i−1) r(i−1) ˜ .6). the method is more or less comparable to full GMRES (in terms of numbers of iterations) (see Freund and Nachtigal [101]).

It is difficult to make a fair comparison between GMRES and BiCG. the two matrix-vector products can theoretically be performed simultaneously. r(0) = r(0) ). the generation of the basis vectors is relatively cheap and the memory requirements are modest.3. A related algorithm. 2. These variants (CGS and Bi-CGSTAB) will be discussed in coming subsections. A duplicate copy of the matrix will alleviate this problem. at the cost of doubling the storage requirements for the matrix. Moreover. The main idea behind this algorithm . continue if necessary end Figure 2. BiCG does not minimize a residual.7: The Preconditioned BiConjugate Gradient Method In a parallel environment. in a distributed-memory environment.20 CHAPTER 2. [102] attempts to overcome these problems. but at the cost of increasing work for keeping all residuals orthogonal and increasing demands for memory space. GMRES really minimizes a residual. ITERATIVE METHODS Compute r(0) = b − Ax(0) for some initial guess x(0) . the implicit LU decomposition of the reduced tridiagonal system may not exist. at the cost of twice the amount of matrix vector products per iteration step. Several variants of BiCG have been proposed that increase the effectiveness of this class of methods in certain circumstances. solve M z (i−1) = r(i−1) solve M T z (i−1) = r(i−1) ˜ ˜ (i−1)T (i−1) ρi−1 = z r ˜ if ρi−1 = 0. 2. method fails if i = 1 p(i) = z (i−1) p(i) = z (i−1) ˜ ˜ else βi−1 = ρi−1 /ρi−2 p(i) = z (i−1) + βi−1 p(i−1) p(i) = z (i−1) + βi−1 p(i−1) ˜ ˜ ˜ endif q (i) = Ap(i) q (i) = AT p(i) ˜ ˜ T αi = ρi−1 /˜(i) q (i) p x(i) = x(i−1) + αi p(i) r(i) = r(i−1) − αi q (i) r(i) = r(i−1) − αi q (i) ˜ ˜ ˜ check convergence. Care must also be exercised in choosing the preconditioner. but often its accuracy is comparable to GMRES. . Choose r(0) (for example. since similar problems arise during the two solves involving the preconditioning matrix. the Quasi-Minimal Residual method of Freund and Nachtigal [101]. depending upon the storage scheme for A. . . however. ˜ ˜ for i = 1. there will be extra communication costs associated with one of the two matrix-vector products. resulting in breakdown of the algorithm. However.6 Quasi-Minimal Residual (QMR) The BiConjugate Gradient method often displays rather irregular convergence behavior.

The main idea behind this algorithm is to solve the reduced tridiagonal system in a least squares sense, similar to the approach followed in GMRES. Since the constructed basis for the Krylov subspace is bi-orthogonal, rather than orthogonal as in GMRES, the obtained solution is viewed as a quasi-minimal residual solution, which explains the name. Additionally, QMR uses look-ahead techniques to avoid breakdowns in the underlying Lanczos process, which makes it more robust than BiCG. The look-ahead steps in the QMR method prevent breakdown in all cases but the so-called "incurable breakdown", where no number of look-ahead steps would yield a next iterate.

Convergence

The convergence behavior of QMR is typically much smoother than for BiCG. Freund and Nachtigal [101] present quite general error bounds which show that QMR may be expected to converge about as fast as GMRES. From a relation between the residuals in BiCG and QMR (Freund and Nachtigal [101, relation (5.10)]) one may deduce that at phases in the iteration process where BiCG makes significant progress, QMR has arrived at about the same approximation for x. On the other hand, when BiCG makes no progress at all, QMR may still show slow convergence. In all cases where the slightly cheaper BiCG method converges irregularly (but fast enough), there seems little reason to avoid QMR.

Implementation

The pseudocode for the Preconditioned Quasi Minimal Residual Method, with preconditioner M = M1 M2, is given in Figure 2.8. This algorithm follows the two-term recurrence version without look-ahead, presented by Freund and Nachtigal [102] as Algorithm 7.1. This version of QMR is simpler to implement than the full QMR method with look-ahead, but it is susceptible to breakdown of the underlying Lanczos process. (Other implementational variations are whether to scale Lanczos vectors or not, or to use three-term recurrences instead of coupled two-term recurrences. Such decisions usually have implications for the stability and the efficiency of the algorithm.) A professional implementation of QMR with look-ahead is given in Freund and Nachtigal's QMRPACK, which is available through netlib; see Appendix A.1.

We have modified the algorithm to include a relatively inexpensive recurrence relation for the computation of the residual vector. Computation of the residual is done for the convergence test, but this recurrence avoids expending a matrix-vector product on the residual calculation. It requires a few extra vectors of storage and vector update operations per iteration; the scalar overhead per iteration is slightly more than for BiCG. Also, the algorithm has been modified so that only two full preconditioning steps are required instead of three. If one uses right (or post) preconditioning, that is M1 = I, then a cheap upper bound for the norm of r(i) can be computed in each iteration, avoiding the recursions for r(i) (see Freund and Nachtigal [101, proposition 4.1]). This upper bound may be pessimistic by a factor of at most √(i + 1).

QMR has roughly the same problems with respect to vector and parallel implementation as BiCG.

2.3.7 Conjugate Gradient Squared Method (CGS)

In BiCG, the residual vector r(i) can be regarded as the product of r(0) and an ith degree polynomial in A, that is

    r(i) = Pi(A) r(0).                                                        (2.13)

This same polynomial satisfies r̃(i) = Pi(A^T) r̃(0) so that

    ρi = (r̃(i), r(i)) = (Pi(A^T) r̃(0), Pi(A) r(0)) = (r̃(0), Pi²(A) r(0)).    (2.14)

    Compute r(0) = b - Ax(0) for some initial guess x(0)
    ṽ(1) = r(0); solve M1 y = ṽ(1); ρ1 = ‖y‖2
    Choose w̃(1), for example w̃(1) = r(0)
    solve M2^T z = w̃(1); ξ1 = ‖z‖2
    γ0 = 1, η0 = -1
    for i = 1, 2, . . .
        if ρi = 0 or ξi = 0, method fails
        v(i) = ṽ(i)/ρi;  y = y/ρi
        w(i) = w̃(i)/ξi;  z = z/ξi
        δi = z^T y; if δi = 0, method fails
        solve M2 ỹ = y
        solve M1^T z̃ = z
        if i = 1
            p(1) = ỹ;  q(1) = z̃
        else
            p(i) = ỹ - (ξi δi/εi-1) p(i-1)
            q(i) = z̃ - (ρi δi/εi-1) q(i-1)
        endif
        p̃ = A p(i)
        εi = q(i)T p̃; if εi = 0, method fails
        βi = εi/δi; if βi = 0, method fails
        ṽ(i+1) = p̃ - βi v(i)
        solve M1 y = ṽ(i+1)
        ρi+1 = ‖y‖2
        w̃(i+1) = A^T q(i) - βi w(i)
        solve M2^T z = w̃(i+1)
        ξi+1 = ‖z‖2
        θi = ρi+1/(γi-1 |βi|);  γi = 1/√(1 + θi²); if γi = 0, method fails
        ηi = -ηi-1 ρi γi² / (βi γi-1²)
        if i = 1
            d(1) = η1 p(1);  s(1) = η1 p̃
        else
            d(i) = ηi p(i) + (θi-1 γi)² d(i-1)
            s(i) = ηi p̃ + (θi-1 γi)² s(i-1)
        endif
        x(i) = x(i-1) + d(i)
        r(i) = r(i-1) - s(i)
        check convergence; continue if necessary
    end

    Figure 2.8: The Preconditioned Quasi Minimal Residual Method without Look-ahead
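For experimentation one need not transcribe Figure 2.8 by hand: an existing QMR implementation can be used instead. The sketch below assumes SciPy's qmr solver in scipy.sparse.linalg (the solver name, its default arguments, and the returned (x, info) pair are SciPy conventions, not part of this book's templates).

    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.linalg import qmr

    # small nonsymmetric test problem: a convection-diffusion-like tridiagonal matrix
    n = 200
    A = sp.diags([-1.2, 2.0, -0.8], offsets=[-1, 0, 1], shape=(n, n), format="csr")
    b = np.ones(n)

    x, info = qmr(A, b)        # info == 0 signals convergence
    print("converged:", info == 0, " residual:", np.linalg.norm(b - A @ x))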

However. in circumstances where computation with AT is impractical. and it turns out to be easy to find the corresponding approximations for x.9: The Preconditioned Conjugate Gradient Squared Method This suggests that if Pi (A) reduces r(0) to a smaller vector r(i) . CGS may be attractive. even if it really reduces the initial residual r(0) . continue if necessary end Figure 2. . This approach leads to the Conjugate Gradient Squared method (see Sonneveld [191]). One should be aware of the fact that local corrections to the current solution may be so large that cancellation effects occur. This may lead to a less accurate solution than suggested by the updated residual (see Van der Vorst [205]). . . Convergence Often one observes a speed of convergence for CGS that is about twice as fast as for BiCG. Hence. but does not involve computations with AT .3. Implementation CGS requires about the same number of operations per iteration as BiCG. there is no reason that the “contraction” operator. The method tends to diverge if the starting guess is close to the solution. should also reduce the once reduced vector r(k) = Pk (A)r(0) . NONSTATIONARY ITERATIVE METHODS 23 Compute r(0) = b − Ax(0) for some initial guess x(0) Choose r (for example. which is in agreement with the observation that the same “contraction” operator is applied twice.2. Equation (2. 2. .14) shows that the iteration coefficients can still be recovered from these vectors. ρi−1 = rT r(i−1) ˜ if ρi−1 = 0 method fails if i = 1 u(1) = r(0) p(1) = u(1) else βi−1 = ρi−1 /ρi−2 u(i) = r(i−1) + βi−1 q (i−1) p(i) = u(i) + βi−1 (q (i−1) + βi−1 p(i−1) ) endif solve M p = p(i) ˆ v = Aˆ ˆ p αi = ρi−1 /˜T v r ˆ q (i) = u(i) − αi v ˆ solve M u = u(i) + q (i) ˆ x(i) = x(i−1) + αi u ˆ q = Aˆ ˆ u r(i) = r(i−1) − αi q ˆ check convergence. and compute Pi2 (A)r(0) . then it might be advantageous to apply this “contraction” operator twice. This is evidenced by the often highly irregular convergence behavior of CGS. r = r(0) ) ˜ ˜ for i = 1.

    Compute r(0) = b - Ax(0) for some initial guess x(0)
    Choose r̃ (for example, r̃ = r(0))
    for i = 1, 2, . . .
        ρi-1 = r̃^T r(i-1)
        if ρi-1 = 0, method fails
        if i = 1
            p(i) = r(i-1)
        else
            βi-1 = (ρi-1/ρi-2)(αi-1/ωi-1)
            p(i) = r(i-1) + βi-1 (p(i-1) - ωi-1 v(i-1))
        endif
        solve M p̂ = p(i)
        v(i) = A p̂
        αi = ρi-1 / r̃^T v(i)
        s = r(i-1) - αi v(i)
        check norm of s; if small enough: set x(i) = x(i-1) + αi p̂ and stop
        solve M ŝ = s
        t = A ŝ
        ωi = t^T s / t^T t
        x(i) = x(i-1) + αi p̂ + ωi ŝ
        r(i) = s - ωi t
        check convergence; continue if necessary
        for continuation it is necessary that ωi ≠ 0
    end

    Figure 2.10: The Preconditioned BiConjugate Gradient Stabilized Method

2.3.8 BiConjugate Gradient Stabilized (Bi-CGSTAB)

The BiConjugate Gradient Stabilized method (Bi-CGSTAB) was developed to solve nonsymmetric linear systems while avoiding the often irregular convergence patterns of the Conjugate Gradient Squared method (see Van der Vorst [205]). Instead of computing the CGS sequence i → Pi²(A)r(0), Bi-CGSTAB computes i → Qi(A)Pi(A)r(0) where Qi is an ith degree polynomial describing a steepest descent update.
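The following NumPy sketch transcribes Figure 2.10 above. As with the CGS sketch, the preconditioner enters only through an assumed M_solve callback, and both stopping tests of the algorithm are included.

    import numpy as np

    def bicgstab_sketch(A, b, M_solve=lambda v: v, x0=None, max_iter=100, tol=1e-8):
        """Preconditioned Bi-CGSTAB, following Figure 2.10."""
        x = np.zeros_like(b) if x0 is None else x0.copy()
        r = b - A @ x
        rt = r.copy()                            # shadow vector r~
        bnrm = np.linalg.norm(b)
        rho_old = alpha = omega = 1.0
        v = p = np.zeros_like(b)
        for i in range(1, max_iter + 1):
            rho = rt @ r
            if rho == 0.0:
                raise RuntimeError("Bi-CGSTAB breakdown: rho = 0")
            if i == 1:
                p = r.copy()
            else:
                beta = (rho / rho_old) * (alpha / omega)
                p = r + beta * (p - omega * v)
            p_hat = M_solve(p)
            v = A @ p_hat
            alpha = rho / (rt @ v)
            s = r - alpha * v
            if np.linalg.norm(s) <= tol * bnrm:      # first stopping test, on s
                return x + alpha * p_hat
            s_hat = M_solve(s)
            t = A @ s_hat
            omega = (t @ s) / (t @ t)
            x = x + alpha * p_hat + omega * s_hat
            r = s - omega * t
            if np.linalg.norm(r) <= tol * bnrm:      # second stopping test, on r
                return x
            if omega == 0.0:                         # continuation requires omega != 0
                raise RuntimeError("Bi-CGSTAB breakdown: omega = 0")
            rho_old = rho
        return x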

Convergence

Bi-CGSTAB often converges about as fast as CGS, sometimes faster and sometimes not. CGS can be viewed as a method in which the BiCG "contraction" operator is applied twice. Bi-CGSTAB can be interpreted as the product of BiCG and repeatedly applied GMRES(1). At least locally, a residual vector is minimized, which leads to a considerably smoother convergence behavior. On the other hand, if the local GMRES(1) step stagnates, then the Krylov subspace is not expanded, and Bi-CGSTAB will break down. This is a breakdown situation that can occur in addition to the other breakdown possibilities in the underlying BiCG algorithm. This type of breakdown may be avoided by combining BiCG with other methods, i.e., by selecting other values for ωi (see the algorithm). One such alternative is Bi-CGSTAB2 (see Gutknecht [114]); more general approaches are suggested by Sleijpen and Fokkema in [189].

Implementation

Bi-CGSTAB requires two matrix-vector products and four inner products, i.e., two inner products more than BiCG and CGS, but this is of minor importance.

Note that Bi-CGSTAB has two stopping tests: if the method has already converged at the first test on the norm of s, the subsequent update would be numerically questionable. Additionally, stopping on the first test saves a few unnecessary operations.

The pseudocode for the Preconditioned BiConjugate Gradient Stabilized Method with preconditioner M is given in Figure 2.10.

2.3.9 Chebyshev Iteration

Chebyshev Iteration is another method for solving nonsymmetric problems (see Golub and Van Loan [108, §10.1.5] and Varga [209, Chapter 5]). Chebyshev Iteration avoids the computation of inner products as is necessary for the other nonstationary methods. For some distributed memory architectures these inner products are a bottleneck with respect to efficiency. The price one pays for avoiding inner products is that the method requires enough knowledge about the spectrum of the coefficient matrix A that an ellipse enveloping the spectrum can be identified; however this difficulty can be overcome via an adaptive construction developed by Manteuffel [145], and implemented by Ashby [7]. Chebyshev iteration is suitable for any nonsymmetric linear system for which the enveloping ellipse does not include the origin.

Comparison with other methods

Comparing the pseudocode for Chebyshev Iteration with the pseudocode for the Conjugate Gradient method shows a high degree of similarity, except that no inner products are computed in Chebyshev Iteration.

Scalars c and d must be selected so that they define a family of ellipses with common center d > 0 and foci d + c and d - c which contain the ellipse that encloses the spectrum (or more generally, the field of values) of A and for which the rate r of convergence is minimal:

    r = (a + √(a² - c²)) / (d + √(d² - c²)),                              (2.15)

where a is the length of the x-axis of the ellipse. We provide code in which it is assumed that c and d are known; for code including the adaptive determination of these iteration parameters the reader is referred to Ashby [7].

The Chebyshev method has the advantage over GMRES that only short recurrences are used. On the other hand, GMRES is guaranteed to generate the smallest residual over the current search space. The BiCG methods, which also use short recurrences, do not minimize the residual in a suitable norm; however, unlike Chebyshev iteration, they do not require estimation of parameters (the spectrum of A). Finally, GMRES and BiCG may be more effective in practice, because of superlinear convergence behavior, which cannot be expected for Chebyshev.

For symmetric positive definite systems the "ellipse" enveloping the spectrum degenerates to the interval [λmin, λmax] on the positive x-axis, where λmin and λmax are the smallest and largest eigenvalues of M^-1 A. In circumstances where the computation of inner products is a bottleneck, it may be advantageous to start with CG, compute estimates of the extremal eigenvalues from the CG coefficients, and then after sufficient convergence of these approximations switch to Chebyshev Iteration. A similar strategy may be adopted for a switch from GMRES, or BiCG-type methods, to Chebyshev Iteration.
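The parameters of the method are just the two scalars c and d; for the symmetric positive definite case they follow directly from estimates of λmin and λmax, and the iteration itself (written out in pseudocode in figure 2.11 in the next subsection) takes a few lines of NumPy. The helper names, the M_solve callback and the eigenvalue estimates are assumptions of this sketch.

    import math
    import numpy as np

    def cheb_rate(lmin, lmax):
        """Convergence rate r of equation (2.15) for a spectrum in [lmin, lmax] (a = c)."""
        d = (lmax + lmin) / 2.0           # center of the (degenerate) ellipse
        c = (lmax - lmin) / 2.0           # focal distance
        return c / (d + math.sqrt(d * d - c * c))

    def chebyshev_sketch(A, b, lmin, lmax, M_solve=lambda v: v,
                         x0=None, max_iter=200, tol=1e-8):
        """Preconditioned Chebyshev iteration for the SPD case (cf. figure 2.11)."""
        x = np.zeros_like(b) if x0 is None else x0.copy()
        r = b - A @ x
        bnrm = np.linalg.norm(b)
        d = (lmax + lmin) / 2.0
        c = (lmax - lmin) / 2.0
        alpha = 0.0
        p = np.zeros_like(b)
        for i in range(1, max_iter + 1):
            z = M_solve(r)
            if i == 1:
                p = z.copy()
                alpha = 2.0 / d
            else:
                beta = (c * alpha / 2.0) ** 2
                alpha = 1.0 / (d - beta)
                p = z + beta * p
            x = x + alpha * p
            r = r - alpha * (A @ p)       # equivalent to r = b - A x
            if np.linalg.norm(r) <= tol * bnrm:
                break
        return x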

    Compute r(0) = b - Ax(0) for some initial guess x(0).
    d = (λmax + λmin)/2,  c = (λmax - λmin)/2.
    for i = 1, 2, . . .
        solve M z(i-1) = r(i-1)
        if i = 1
            p(1) = z(0)
            α1 = 2/d
        else
            βi-1 = (c αi-1/2)²
            αi = 1/(d - βi-1)
            p(i) = z(i-1) + βi-1 p(i-1)
        endif
        x(i) = x(i-1) + αi p(i)
        r(i) = b - Ax(i)    (= r(i-1) - αi Ap(i))
        check convergence; continue if necessary
    end

    Figure 2.11: The Preconditioned Chebyshev Method

Convergence

In the symmetric case (where A and the preconditioner M are both symmetric) for the Chebyshev Iteration we have the same upper bound as for the Conjugate Gradient method, provided c and d are computed from λmin and λmax (the extremal eigenvalues of the preconditioned matrix M^-1 A).

There is a severe penalty for overestimating or underestimating the field of values. For example, if in the symmetric case λmax is underestimated, then the method may diverge; if it is overestimated then the result may be very slow convergence. Similar statements can be made for the nonsymmetric case. This implies that one needs fairly accurate bounds on the spectrum of M^-1 A for the method to be effective (in comparison with CG or GMRES).

Implementation

The pseudocode for the Preconditioned Chebyshev Method with preconditioner M is given in Figure 2.11. It handles the case of a symmetric positive definite coefficient matrix A. The eigenvalues of M^-1 A are assumed to be all real and in the interval [λmin, λmax], which does not include zero.

In Chebyshev Iteration the iteration parameters are known as soon as one knows the ellipse containing the eigenvalues (or rather, the field of values) of the operator. Therefore the computation of inner products, as is necessary in methods like GMRES or CG, is avoided. This avoids the synchronization points required of CG-type methods, so machines with hierarchical or distributed memory may achieve higher performance (it also suggests strong parallelization properties; for a discussion of this see Saad [184], and Dongarra, et al. [70]). Specifically, as soon as some segment of w is computed, we may begin computing, in sequence, corresponding segments of p, x, and r.

2.4 Computational Aspects of the Methods

Efficient solution of a linear system is largely a function of the proper choice of iterative method. However, to obtain good performance, consideration must also be given to the computational kernels of the method and how efficiently they can be executed on the target architecture.

Iterative methods are very different from direct methods in this respect. The performance of direct methods, both for dense and sparse systems, is largely that of the factorization of the matrix. This operation is absent in iterative methods (although preconditioners may require a setup phase), and with it, iterative methods lack dense matrix suboperations. Since such operations can be executed at very high efficiency on most current computer architectures, we expect a lower flop rate for iterative than for direct methods. (Dongarra and Van der Vorst [73] give some experimental results about this, and provide a benchmark code for iterative solvers.) Furthermore, the basic operations in iterative methods often use indirect addressing, depending on the data structure (see §4.3). Such operations also have a relatively low efficiency of execution. However, this lower efficiency of execution does not imply anything about the total solution time for a given system. Furthermore, iterative methods are usually simpler to implement than direct methods, and since no full factorization has to be stored, they can handle much larger systems than direct methods. This point is of particular importance on parallel architectures.

In this section we summarize for each method

• Matrix properties. Not every method will work on every problem type, so knowledge of matrix properties is the main criterion for selecting an iterative method.

• Computational kernels. Different methods involve different kernels, and depending on the problem or target computer architecture this may rule out certain methods.

Table 2.2 lists the storage required for each method (without preconditioning). Note that we are not including the original system Ax = b and we ignore scalar storage.

    Method      Inner Product   SAXPY     Matrix-Vector Product   Precond Solve
    JACOBI                                1 a
    GS                          1         1 a
    SOR                         1         1 a
    CG          2               3         1                       1
    GMRES       i+1             i+1       1                       1
    BiCG        2               5         1/1                     1/1
    QMR         2               8+4 b,c   1/1                     1/1
    CGS         2               6         2                       2
    Bi-CGSTAB   4               6         2                       2
    CHEBYSHEV                   2         1                       1

    Table 2.1: Summary of Operations for Iteration i. "a/b" means "a" multiplications with the matrix and "b" with its transpose.
    a This method performs no real matrix vector product or preconditioner solve, but the number of operations is equivalent to a matrix-vector multiply.
    b True SAXPY operations + vector scalings.
    c Less for implementations that do not recursively update the residual.

1. Jacobi Method

   • Extremely easy to use, but unless the matrix is "strongly" diagonally dominant, this method is probably best only considered as an introduction to iterative methods or as a preconditioner in a nonstationary method.

   • Trivial to parallelize.

    Method      Storage Reqmts
    JACOBI      matrix + 3n
    SOR         matrix + 2n
    CG          matrix + 6n
    GMRES       matrix + (i+5)n
    BiCG        matrix + 10n
    CGS         matrix + 11n
    Bi-CGSTAB   matrix + 10n
    QMR         matrix + 16n c
    CHEBYSHEV   matrix + 5n

    Table 2.2: Storage Requirements for the Methods in iteration i: n denotes the order of the matrix.
    c Less for implementations that do not recursively update the residual.

2. Gauss-Seidel Method

   • Typically faster convergence than Jacobi, but in general not competitive with the nonstationary methods.

   • Applicable to strictly diagonally dominant, or symmetric positive definite matrices.

   • Parallelization properties depend on structure of the coefficient matrix. Different orderings of the unknowns have different degrees of parallelism; multi-color orderings may give almost full parallelism.

   • This is a special case of the SOR method, obtained by choosing ω = 1.

3. Successive Over-Relaxation (SOR)

   • Accelerates convergence of Gauss-Seidel (ω > 1, over-relaxation); may yield convergence when Gauss-Seidel fails (0 < ω < 1, under-relaxation).

   • Speed of convergence depends critically on ω; the optimal value for ω may be estimated from the spectral radius of the Jacobi iteration matrix under certain conditions.

   • Parallelization properties are the same as those of the Gauss-Seidel method.

4. Conjugate Gradient (CG)

   • Applicable to symmetric positive definite systems.

   • Speed of convergence depends on the condition number; if extremal eigenvalues are well-separated then superlinear convergence behavior can result.

   • Inner products act as synchronization points in a parallel environment.

   • Further parallel properties are largely independent of the coefficient matrix, but depend strongly on the structure of the preconditioner.

5. Generalized Minimal Residual (GMRES)

   • Applicable to nonsymmetric matrices.

   • GMRES leads to the smallest residual for a fixed number of iteration steps, but these steps become increasingly expensive.

   • In order to limit the increasing storage requirements and work per iteration step, restarting is necessary. When to do so depends on A and the right-hand side; it requires skill and experience.

   • GMRES requires only matrix-vector products with the coefficient matrix.

   • The number of inner products grows linearly with the iteration number, up to the restart point. In an implementation based on a simple Gram-Schmidt process the inner products are independent, so together they imply only one synchronization point. A more stable implementation based on modified Gram-Schmidt orthogonalization has one synchronization point per inner product.

6. Biconjugate Gradient (BiCG)

   • Applicable to nonsymmetric matrices.

   • Requires matrix-vector products with the coefficient matrix and its transpose. This disqualifies the method for cases where the matrix is only given implicitly as an operator, since usually no corresponding transpose operator is available in such cases.

   • Parallelization properties are similar to those for CG; the two matrix vector products (as well as the preconditioning steps) are independent, so they can be done in parallel, or their communication stages can be packaged.

7. Quasi-Minimal Residual (QMR)

   • Applicable to nonsymmetric matrices.

   • Designed to avoid the irregular convergence behavior of BiCG; it avoids one of the two breakdown situations of BiCG.

   • If BiCG makes significant progress in one iteration step, then QMR delivers about the same result at the same step. But when BiCG temporarily stagnates or diverges, QMR may still further reduce the residual, albeit very slowly.

   • Computational costs per iteration are similar to BiCG, but slightly higher. The method requires the transpose matrix-vector product.

   • Parallelization properties are as for BiCG.

8. Conjugate Gradient Squared (CGS)

   • Applicable to nonsymmetric matrices.

   • Converges (diverges) typically about twice as fast as BiCG.

   • Convergence behavior is often quite irregular, which may lead to a loss of accuracy in the updated residual. Tends to diverge if the starting guess is close to the solution.

   • Computational costs per iteration are similar to BiCG, but the method doesn't require the transpose matrix.

   • Unlike BiCG, the two matrix-vector products are not independent, so the number of synchronization points in a parallel environment is larger.

9. Biconjugate Gradient Stabilized (Bi-CGSTAB)

   • Applicable to nonsymmetric matrices.

   • An alternative for CGS that avoids the irregular convergence patterns of CGS while maintaining about the same speed of convergence; as a result we often observe less loss of accuracy in the updated residual.

   • Computational costs per iteration are similar to BiCG and CGS, but the method doesn't require the transpose matrix.

10. Chebyshev Iteration

   • Applicable to nonsymmetric matrices (but presented in this book only for the symmetric case).

• The computational structure is similar to that of CG. Table 2. Like the Lanczos method it is applied to nonsymmetric systems. combined with the asymmetry of the coefficient matrix means that the method no longer effects a reduction to tridiagonal form. but there is an important algorithmic difference. In that context. the user may observe that a particular operation could be performed more efficiently. but it also presents a new method. If these matrix vector products are relatively expensive. This Hestenes/Stiefel method is closely related to a reduction of the Lanczos method to symmetric matrices. Lanczos later applied his method to solving linear systems. on the availability of AT (BiCG and QMR).30 CHAPTER 2. By combining the two twoterm recurrences (eliminating the “search directions”) the Lanczos method is obtained. and if sufficient storage is available then it may be attractive to use GMRES and delay restarting as much as possible.1 shows the type of operations performed per iteration. in the symmetric case the iteration parameters are easily obtained from the two extremal eigenvalues. to continue the iteration once suitable bounds on the spectrum have been obtained from these methods. This last fact. partly because of a misperceived importance of the finite termination property. Like the Hestenes/Stiefel method. Although error-reduction properties are proved. the method by Hestenes and Stiefel uses coupled two-term recurrences. 2. and it does not use search directions. . Reid [178] pointed 1 For a more detailed account of the early history of CG methods. in particular symmetric ones [142]. • The Adaptive Chebyshev method can be used in combination with methods as CG or GMRES. It also depends on how much storage one has available (GMRES). Whereas Lanczos used three-term recurrences. which can be estimated either directly from the matrix. after their independent discovery of the same method. rather than an iterative method. is the classical description of the conjugate gradient method for solving linear systems. An important property for proving convergence of the method when solving linear systems is that the iterates are related to the initial residual by multiplication with a polynomial in the coefficient matrix. the conjugate gradient method is presented here as a direct method. and his motivation came from eigenvalue problems. Selecting the “best” method for a given class of problems is largely a matter of trial and error.5 A short history of Krylov methods1 Methods based on orthogonalization were developed by a number of authors in the early ’50s. Presented as “minimized iterations in the Galerkin method” this algorithm has become known as the Arnoldi algorithm. reducing the two mutually orthogonal sequences to one orthogonal sequence. but there are no synchronization points. we refer the reader to Golub and O’Leary [107] and Hestenes [122]. or from applying a few iterations of the Conjugate Gradient Method. combining features of the Lanczos and Hestenes/Stiefel methods. Lanczos’ method [141] was based on two mutually orthogonal vector sequences. Based on the particular problem or data structure. it generates only one. self-orthogonal sequence. A paper by Arnoldi [6] further discusses the Lanczos biorthogonalization method. The conjugate gradient method received little attention as a practical method for some time. ITERATIVE METHODS • This method requires some explicit knowledge of the spectrum (or field of values). but instead one to upper Hessenberg form. 
and on how expensive the matrix vector products (and Solve steps with M ) are in comparison to SAXPYs and inner products. and experiments showing premature convergence are reported. The joint paper by Hestenes and Stiefel [121]. the most prominent feature of the method is that it reduces the original matrix to tridiagonal form.

the BiCG method can have two kinds of breakdown: Lanczos breakdown (the underlying Lanczos process breaks down).3. An obvious one is to restart (this one is included in §2. (3) efficient use of matrix-vector products. and pivot breakdown (the tridiagonal matrix T implicitly generated in the underlying Lanczos process encounters a zero pivot when Gaussian elimination without pivoting is used to factor it).4): GMRES(m). but it is far from clear a priori which strategy is preferable in any given case. Saad [182]. It should be noted that for symmetric matrices. In this book. and Bank and Chan [29. Manteuffel and Saylor [10]. Other methods tackling this problem can be found in Fletcher [97]. which he named the bi-conjugate gradient method (BiCG). Several methods have been developed in later years that employ. (2) avoiding use of the transpose. The simplest ones exploit a fixed polynomial preconditioner (see Johnson. In more sophisticated approaches. most often implicitly.6. with two coupled two-term recurrences. near breakdowns can cause severe numerical stability problems. and Nachtigal. Methods based on this idea can be viewed as preconditioned GMRES methods. the polynomial preconditioner is adapted to the iterations (Saad [186]). In this section. All these approaches have advantages for some problems. 2. The pivot breakdown is the easier one to overcome and there have been several approaches proposed in the literature. (4) smooth convergence. in which the iterations of the preconditioning method are kept orthogonal to the iterations of the underlying GCR method. and (5) exploiting the work expended in forming the Krylov space with AT for further reduction of the residual. similar to the Conjugate Gradient method. and Jea and Young [124]. A survey of methods up to about 1991 can be found in Freund. In De Sturler and Fokkema [63]. considerable attention has been given to analyzing the nature of the Lanczos breakdown (see Parlett [171]. 28]. Fletcher [97] proposed an implementation of the Lanczos method.2. Saad and Schultz [187]. SURVEY OF RECENT KRYLOV METHODS 31 out that the most important application area lay in sparse definite systems. Golub and Nachtigal [105]. part of the optimality of GMRES is maintained in the hybrid method GCRO. we shall briefly highlight some of the recent developments and other methods not treated here. Stagnation is prevented in the GMRESR method (Van der Vorst and Vuik [207]) by including LSQR steps in some phases of the process. Lanczos breakdown cannot occur and the only possible breakdown is pivot breakdown. Two more recent reports by Meier-Yang [150] and Tong [195] have extensive numerical comparisons among various methods. Gutknecht [113]. or the preconditioner may even be some other (iterative) method of choice (Van der Vorst and Vuik [207]. For an overview and characterization of these orthogonal projection methods for nonsymmetric systems see Ashby. and in particular those for which extensive computational experience has been gathered. Reichel and Trefethen [158]). the upper Hessenberg matrix of the Arnoldi method. Micchelli and Paul [125]. and . Lanczos breakdown is much more difficult to eliminate. Saad [180].6 Survey of recent Krylov methods Research into the design of Krylov subspace methods for solving nonsymmetric linear systems is an active field of research and new methods are still emerging. including several more recent ones that have not been discussed in detail in this book. we have included only the best known and most popular methods. 
Another approach is to restrict the GMRES search to a suitable subspace of some higher-dimensional Krylov subspace. The SYMMLQ and QMR methods discussed in this book circumvent pivot breakdown by solving least squares systems. Axelsson and Vassilevski [24]). and this renewed the interest in the method. Recent work has focused on endowing the BiCG method with several desirable properties: (1) avoiding breakdown. As discussed before. Although such exact breakdowns are very rare in practice. Recently. Several suggestions have been made to reduce the increase in memory and computational costs in GMRES.

these methods have not yet received much practical use but some form of look-ahead may prove to be a crucial component in future methods. The transpose free nature of TFQMR. which computes the square of the BiCG polynomial without requiring AT – thus obviating the need for AT . the resulting algorithms are usually too complicated to give in template form (some codes of Freund and Nachtigal are available on netlib. and CGS). 41]. Recently. and there will never be a best overall Krylov subspace method. Freund. 115]). Gutknecht [114] noted that Bi-CGSTAB could be viewed as a product of BiCG and GMRES(1). However. faster converging alternative. BiCG. since the BiCG polynomial is still used. The Bi-CGSTAB method of Van der Vorst [205] is such an example. chosen to have certain desirable properties but computable without recourse to AT . CGS is often an attractive. Reddy and Trefethen [157] shows that for any of a group of methods (CG. Hence. Unfortunately. Gutknecht and Nachtigal [100]. several recent product methods have been designed to smooth the convergence of CGS. There is no clear best Krylov subspace method at this time. By squaring the QMR method. The most notable of these is the ingenious CGS method by Sonneveld [191] discussed earlier. Chan. In addition to Bi-CGSTAB. This was anticipated to lead to better convergence for the case where the eigenvalues of A are complex. This shows clearly that there will be no ultimate method. So far. One idea is to use the quasi-minimal residual (QMR) principle to obtain smoothed iterates from the Krylov subspace generated by other product methods. Joubert [128]. Many other basic methods can also be squared. Nachtigal [159]. Numerical experiments show that TFQMR in most cases retains the desirable convergence features of CGS while correcting its erratic behavior. These methods offer smoother convergence over CGS and Bi-CGSTAB with little additional cost. which generate iterates corresponding to a product of the BiCG polynomial with another polynomial of the same degree. iterative methods will never reach the robustness of direct methods. the need for matrix-vector multiplies with AT can be inconvenient as well as doubling the number of matrix-vector multiplies compared with CG for each increase in the degree of the underlying Krylov subspace. Brezinski. Golub and Nachtigal [105]. nor will they beat direct methods for . Freund and Szeto [103] derived a transpose-free QMR squared method which is quite competitive with CGS but with much smoother convergence. which is called QMRCGSTAB. its low computational cost and its smooth convergence behavior make it an attractive alternative to CGS. these methods require an extra matrix-vector product per step (three instead of two) which makes them less efficient. On the other hand. When BiCG converges. For example. and the main problem is to identify these classes and to construct new methods for uncovered classes. for modest m. Freund and Nachtigal [101]. The paper by Nachtigal. it is still not possible to eliminate breakdowns that require look-ahead steps of arbitrary size (incurable breakdowns). and he suggested combining BiCG with GMRES(2) for the even numbered iteration steps. Freund. which he called TFQMR. al. ITERATIVE METHODS Gutknecht [112. where they describe how to easily combine BiCG with any GMRES(m). by squaring the Lanczos procedure. Several recent methods have been proposed to overcome this drawback.32 CHAPTER 2. 
de Pillis and Van der Vorst [45] obtained transpose-free implementations of BiCG and QMR. TFQMR breaks down whenever CGS does. However. Freund [104] proposed such a QMR version of CGS.) Moreover. 115]). Each of the methods is a winner in a specific problem class. One possible remedy would be to combine TFQMR with a look-ahead Lanczos technique but this appears to be quite complicated and no methods of this kind have yet appeared in the literature. GMRES. In the BiCG method. CGNE. and Gutknecht [112. there is a class of problems for which a given method is the winner and another one is the loser. Parlett [171]. Chan et. A more efficient and more robust variant of this approach has been suggested by Sleijpen and Fokkema in [189]. CGS also inherits (and often magnifies) the breakdown conditions and the irregular convergence of BiCG (see Van der Vorst [205]). in which the auxiliary polynomial is defined by a local minimization chosen to smooth the convergence behavior. as well as various look-ahead techniques for remedying it (see Brezinski and Sadok [39]. [46] derived a similar QMR version of Van der Vorst’s Bi-CGSTAB method. CGS also generated interest in the possibility of product methods. Zaglia and Sadok [40. The best we can hope for is some expert system that guides the user in his/her choice.

because of CPU-restrictions. convergence problems. ill-conditioning. et cetera.6. and for others. iterative schemes will be most attractive. SURVEY OF RECENT KRYLOV METHODS 33 all problems. direct methods (or multigrid). We hope to find suitable methods (and preconditioners) for classes of very large problems that we are yet unable to solve by any known method. For some problems.2. . memory.


Chapter 3

Preconditioners

3.1 The why and how

The convergence rate of iterative methods depends on spectral properties of the coefficient matrix. Hence one may attempt to transform the linear system into one that is equivalent in the sense that it has the same solution, but that has more favorable spectral properties. A preconditioner is a matrix that effects such a transformation. For instance, if a matrix M approximates the coefficient matrix A in some way, the transformed system

    M^-1 A x = M^-1 b

has the same solution as the original system Ax = b, but the spectral properties of its coefficient matrix M^-1 A may be more favorable.

In devising a preconditioner, we are faced with a choice between finding a matrix M that approximates A, and for which solving a system is easier than solving one with A, or finding a matrix M that approximates A^-1, so that only multiplication by M is needed. The majority of preconditioners falls in the first category; a notable example of the second category will be discussed in §3.5.

3.1.1 Cost trade-off

Since using a preconditioner in an iterative method incurs some extra cost, both initially for the setup, and per iteration for applying it, there is a trade-off between the cost of constructing and applying the preconditioner, and the gain in convergence speed. Certain preconditioners need little or no construction phase at all (for instance the SSOR preconditioner), but for others, such as incomplete factorizations, there can be substantial work involved. Although the work in scalar terms may be comparable to a single iteration, the construction of the preconditioner may not be vectorizable/parallelizable even if application of the preconditioner is. Hence the initial cost has to be amortized over the iterations, or over repeated use of the same preconditioner in multiple linear systems.

Most preconditioners take in their application an amount of work proportional to the number of variables. This implies that they multiply the work per iteration by a constant factor. On the other hand, the number of iterations as a function of the matrix size is usually only improved by a constant. Certain preconditioners are able to improve on this situation, most notably the modified incomplete factorizations and preconditioners based on multigrid techniques.

On parallel machines there is a further trade-off between the efficacy of a preconditioner in the classical sense, and its parallel efficiency. Many of the traditional preconditioners have a large sequential component.
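As a concrete illustration of the transformation M^-1 A x = M^-1 b, one can wrap both the coefficient matrix and the preconditioner solve in operator form. The sketch below assumes SciPy's LinearOperator and cg, and uses a simple diagonal M purely as a placeholder for "a matrix for which solving a system is easier than solving one with A".

    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.linalg import LinearOperator, cg

    n = 300
    A = sp.diags([-1.0, 4.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csr")
    b = np.ones(n)

    M_diag = A.diagonal()                                      # placeholder M = diag(A)
    M = LinearOperator((n, n), matvec=lambda v: v / M_diag)    # action of M^{-1}

    x, info = cg(A, b, M=M)      # CG applied with the preconditioner
    print("converged:", info == 0, "residual:", np.linalg.norm(b - A @ x))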

3.1.2 Left and right preconditioning

The above transformation of the linear system A → M^-1 A is often not what is used in practice. For instance, the matrix M^-1 A is not symmetric, even if A and M are, so the conjugate gradients method is not immediately applicable to this system. The method as described in figure 2.5 remedies this by employing the M^-1-inner product for orthogonalization of the residuals. The theory of the cg method is then applicable again. All cg-type methods in this book, with the exception of QMR, have been derived with such a combination of preconditioned iteration matrix and correspondingly changed inner product.

Another way of deriving the preconditioned conjugate gradients method would be to split the preconditioner as M = M1 M2 and to transform the system as

    M1^-1 A M2^-1 (M2 x) = M1^-1 b.

If M is symmetric and M1 = M2^T, it is obvious that we now have a method with a symmetric iteration matrix, hence the conjugate gradients method can be applied.

Remarkably, the splitting of M is in practice not needed. By rewriting the steps of the method (see for instance Axelsson and Barker [14, pgs. 16,29] or Golub and Van Loan [108, §10.3]) it is usually possible to reintroduce a computational step

    solve u from Mu = v,

that is, a step that applies the preconditioner in its entirety.

There is a different approach to preconditioning, which is much easier to derive. Consider again the system

    M1^-1 A M2^-1 (M2 x) = M1^-1 b.

The matrices M1 and M2 are called the left- and right preconditioners, respectively, and we can simply apply an unpreconditioned iterative method to this system. Only two additional actions, r0 ← M1^-1 r0 before the iterative process and xn ← M2^-1 xn after it, are necessary.

Thus we arrive at the following schematic for deriving a left/right preconditioned iterative method from any of the symmetrically preconditioned methods in this book.

1. Take a preconditioned iterative method, and replace every occurrence of M by I.

2. Remove any vectors from the algorithm that have become duplicates in the previous step.

3. Replace every occurrence of A in the method by M1^-1 A M2^-1.

4. After the calculation of the initial residual, add the step

       r0 ← M1^-1 r0.

5. At the end of the method, add the step

       x ← M2^-1 x,

   where x is the final calculated solution.

It should be noted that such methods cannot be made to reduce to the algorithms given in section 2.3 by such choices as M1 = M2^T or M1 = I.
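The schematic can be mimicked in code by wrapping an unpreconditioned solver: iterate with the operator M1^-1 A M2^-1, transform the right-hand side once at the start, and map the computed iterate back at the end. The sketch below assumes SciPy's LinearOperator and gmres, and uses two assumed callbacks M1_solve and M2_solve for the left and right preconditioner solves.

    import numpy as np
    from scipy.sparse.linalg import LinearOperator, gmres

    def left_right_preconditioned_solve(A, b, M1_solve, M2_solve):
        """Solve Ax = b by applying an unpreconditioned method to M1^{-1} A M2^{-1} y = M1^{-1} b."""
        n = b.shape[0]
        # the preconditioned operator M1^{-1} A M2^{-1}
        op = LinearOperator((n, n), matvec=lambda v: M1_solve(A @ M2_solve(v)))
        b_hat = M1_solve(b)            # with x0 = 0 this is the step r0 <- M1^{-1} r0
        y, info = gmres(op, b_hat)     # unpreconditioned iteration on the transformed system
        x = M2_solve(y)                # final step x <- M2^{-1} y
        return x, info

    # smoke test with trivial (identity) preconditioners
    A = np.diag(np.arange(1.0, 51.0))
    b = np.ones(50)
    x, info = left_right_preconditioned_solve(A, b, lambda v: v, lambda v: v)
    print(info, np.linalg.norm(b - A @ x))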

3.2 Jacobi Preconditioning

The simplest preconditioner consists of just the diagonal of the matrix:

    mi,j = ai,i if i = j,  0 otherwise.

This is known as the (point) Jacobi preconditioner.

It is possible to use this preconditioner without using any extra storage beyond that of the matrix itself. However, division operations are usually quite costly, so in practice storage is allocated for the reciprocals of the matrix diagonal.

3.2.1 Block Jacobi Methods

Block versions of the Jacobi preconditioner can be derived by a partitioning of the variables. If the index set S = {1, . . . , n} is partitioned as S = ∪i Si with the sets Si mutually disjoint, then

    mi,j = ai,j if i and j are in the same index subset,  0 otherwise.

The preconditioner is now a block-diagonal matrix.

Often, natural choices for the partitioning suggest themselves:

• In problems with multiple physical variables per node, blocks can be formed by grouping the equations per node.

• In structured matrices, such as those from partial differential equations on regular grids, a partitioning can be based on the physical domain. Examples are a partitioning along lines in the 2D case, or planes in the 3D case. This will be discussed further in §3.4.3.

• On parallel computers it is natural to let the partitioning coincide with the division of variables over the processors. This strategy applies to many preconditioners below.

3.2.2 Discussion

Jacobi preconditioners need very little storage, even in the block case, and they are easy to implement. Additionally, on parallel computers they don't present any particular problems. On the other hand, more sophisticated preconditioners usually yield a larger improvement.1

3.3 SSOR preconditioning

The SSOR preconditioner2, like the Jacobi preconditioner, can be derived from the coefficient matrix without any work. If the original, symmetric, matrix is decomposed as A = D + L + L^T

1 Under certain conditions, one can show that the point Jacobi algorithm is optimal, or close to optimal, in the sense of reducing the condition number, among all preconditioners of diagonal form. This was shown by Forsythe and Strauss for matrices with Property A [98], and by van der Sluis [196] for general sparse matrices. For extensions to block Jacobi preconditioners, see Demmel [65] and Elsner [94].
2 The SOR and Gauss-Seidel matrices are never used as preconditioners, for a rather technical reason. SOR-preconditioning with optimal ω maps the eigenvalues of the coefficient matrix to a circle in the complex plane; see Hageman and Young [119, §9.3]. In this case no polynomial acceleration is possible, i.e., the accelerating polynomial reduces to the trivial polynomial Pn(x) = x^n, and the resulting method is simply the stationary SOR method. Recent research by Eiermann and Varga [83] has shown that polynomial acceleration of SOR with suboptimal ω will yield no improvement over simple SOR with optimal ω.

2−ω ω ω ω The optimal value of the ω parameter.4]). parametrized by ω M (ω) = 1 1 1 1 ( D + L)( D)−1 ( D + L)T . but often incomplete . Specifically.1).38 CHAPTER 3. In cases where pivots are zero or negative. We call a factorization incomplete if during the factorization process certain fill elements. Incomplete factorizations may break down (attempted division by zero pivot) or result in indefinite matrices (negative pivots) even if the full factorization of the same matrix is guaranteed to exist and yield a positive definite matrix. will reduce the number of iterations to a lower order. Even if the incomplete factorization exists. The SSOR matrix is given in factored form. 3. Solving a system with an incomplete factorization preconditioner Incomplete factorizations can be given in various forms. §1. have been ignored. strategies have been proposed such as substituting an arbitrary positive number (see Kershaw [131]). The efficacy of the preconditioner depends on how well M −1 approximates A−1 . PRECONDITIONERS in its diagonal. For instance. its suitability for vector processors or parallel architectures depends strongly on the ordering of the variables. the SSOR matrix is defined as M = (D + L)D−1 (D + L)T . for second order eliptic problems a spectral −1 condition number κ(Mωopt A) = O( κ(A)) is attainable (see Axelsson and Barker [14.4. however. since this factorization is given a priori. An important consideration for incomplete factorization preconditioners is the cost of the factorization process. the spectral information needed to calculate the optimal ω is prohibitively expensive to compute. or if the same preconditioner will be used for several linear systems. see further Beauwens and Quenon [33]. or restarting the factorization on A + αI for some positive value of α (see Manteuffel [146]).1 Creating an incomplete factorization Incomplete factorizations are the first preconditioners we have encountered so far for which there is a non-trivial creation stage.4 Incomplete Factorization Preconditioners A broad class of preconditioners is based on incomplete factorizations of the coefficient matrix. On parallel computers this problem is aggravated by the generally poor parallel efficiency of the factorization. lower. 3. solving a system proceeds in the usual way (figure 3. This was originally proved by Meijerink and Van der Vorst [151]. Such factorization costs can be amortized if the iterative method takes many iterations. or. Such a preconditioner is then given in factored form M = LU with L lower and U upper triangular. so the cost may equal that of one or more iterations of the iterative method. there is no possibility of breakdown as in the construction phase of incomplete factorization methods. In practice. the number of operations involved in creating it is at least as much as for solving a system with such a coefficient matrix. On the other hand. like the parameter in the SOR method. for instance in successive time steps or Newton iterations. Manteuffel [146]. An incomplete factorization is guaranteed to exist for many factorization strategies if the original matrix is an M -matrix. and Van der Vorst [198]. If M = LU (with L and U nonsingular triangular matrices). and upper triangular part. so this preconditioner shares many properties of other factorization-based methods (see below). nonzero elements in the factorization in positions where the original matrix had a zero.

Fill levels are defined by calling an element of level k + 1 if it is caused by elements at least one of which is of level k. 3. .4.1: Preconditioner solve of a system M x = y. The resulting factorization is incomplete in the sense that fill is supressed. zi = d−1 (yi − j<i ij zj ) ii for i = n. the diagonal elements are used twice (not three times as the formula for M would lead one to expect). (D + U )x = z. . The matrix M will have more nonzeros than L and U combined. and L and U now strictly triangular matrices.2: Preconditioner solve of a system M x = y. one could store LD−1 or D−1 U . The second formulation is slightly harder to implement. S is chosen to coincide with the set of nonzero positions in A. 3 To be precise. for i = 1. one could use either of the following equivalent formulations for M x = y: (D + L)z = y.2. n − 2. This factorization type is called the ILU (0) factorization: the Incomplete LU factorization of level zero4 . 2. j) for which ai. The first fill level is that caused by the original matrix elements. At the cost of some extra storage. that is. n − 2. xi = zi − d−1 j>i uij xj ii Figure 3.4. the fill there is said to be “discarded”. discarding all fill. . and if it is outside S. we refer to positions in L and U when we talk of positions in the factorization. . with M = (D + L)D−1 (D + U ) = (D + L)(I + D−1 U ). zi = −1 (yi − j<i ij zj ) ii for i = n. determined through the factorization process). or (I + LD−1 )z = y.j = 0. The set S is usually chosen to encompass all positions (i. if we make an incomplete factorization M = LU . . factorizations are given as M = (D + L)D−1 (D + U ) (with D diagonal. A position that is zero in A but not so in an exact factorization3 is called a fill position. . . .2 Point incomplete factorizations The most common type of incomplete factorization is based on taking a set S of matrix positions. . xi = u−1 (zi − j>i uij xj ) ii Figure 3.3. n − 1. thereby saving some computation. Often. Solving a system using the first formulation is outlined in figure 3. n − 1. for i = 1. and since only divisions with D are performed. with M = LU Let M = (D + L)(I + D−1 U ) and y be given. INCOMPLETE FACTORIZATION PRECONDITIONERS 39 Let M = LU and y be given. In that case. (I + D−1 U )x = z In either case. 4 The zero refers to the fact that only “level zero” fill is permitted. 2. nonzero elements of the original matrix. storing D−1 explicitly is the practical thing to do. and keeping all positions outside this set equal to zero during the factorization. . . . .
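For experimentation one rarely codes the two triangular solves of figures 3.1 and 3.2 by hand: an incomplete LU factorization and its solve are available, for example, through SciPy. The sketch below assumes scipy.sparse.linalg.spilu (whose drop_tol and fill_factor knobs correspond only loosely to the numerical and structural fill strategies discussed in this section) and wraps its solve as a preconditioner.

    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.linalg import spilu, LinearOperator, bicgstab

    n = 500
    A = sp.diags([-1.0, 4.0, -1.2], offsets=[-1, 0, 1], shape=(n, n), format="csc")
    b = np.ones(n)

    ilu = spilu(A, drop_tol=1e-4, fill_factor=5)       # incomplete factorization M = LU
    M = LinearOperator((n, n), matvec=ilu.solve)       # applying M^{-1}: two triangular solves

    x, info = bicgstab(A, b, M=M)
    print("converged:", info == 0, "residual:", np.linalg.norm(b - A @ x))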

the level value is not changed. j) in A filling in (say in step k) is assigned a fill level value fill(i. ILU (0) and D-ILU coincide if the matrix graph contains no triangles. only storage space for D is needed. even less is needed. j > k: ai. the preconditioner can be written as M = (D+LA )D−1 (D+UA ) where D is the diagonal matrix containing the pivots generated. . in order to avoid division operations during the preconditioner solve stage we store D−1 rather than D. 156] for discussions of preconditioners using drop tolerances. . set dii ← 1/dii for j = i + 1. such a factorization exists for any choice of S.40 CHAPTER 3. . i + 2.k a−1 ak. j) = 1 + max{fill(i. it is harder to implement in practice. the incomplete factorization produces no nonzero elements beyond the sparsity structure of the original matrix. The structural strategy is that of accepting fill-in only to a certain level. but we also alter only the diagonal elements (that is. set dii ← aii for i = 1. and upper triangular parts as A = DA +LA +UA . Generating this preconditioner is described in figure 3. i. (3. . In fact. Simple cases: ILU (0) and D-ILU For the ILU (0) method. fill(k. and one numerical. Meijerink and Van der Vorst [151] proved that.k ai.3: Construction of a D-ILU incomplete factorization preconditioner.1) If aij was already nonzero. If not only we prohibit fill-in elements. j): aij = 0} for i = 1.3. . . if A is an M -matrix. any zero location (i. Splitting the coefficient matrix into its diagonal. See [20. As was already pointed out above.j − ai. called D-ILU (Pommerell [173]). i)}. so that the preconditioner at worst takes exactly as much space to store as the original matrix. .j ← ai. . 2.j k. 5 In graph theoretical terms. . The numerical fill strategy is that of ‘drop tolerances’: fill is ignored if it is too small. j) ∈ S and (j. for a suitable definition of ‘small’. j) ∈ S otherwise. Guidelines for allowing levels of fill were given by Meijerink and Van der Vorst in [152]. Since we use the upper and lower triangle of the matrix unchanged. one structural. since the amount of storage needed for the factorization is not easy to predict. i) ∈ S then set djj ← djj − aji dii aij Figure 3. Although this definition makes more sense mathematically. . PRECONDITIONERS Let S be the nonzero set {(i.j if (i. In a simplified version of ILU (0). we have the following situation. Fill-in strategies There are two major strategies for accepting or discarding fill-in. if (i. k). storing the inverses of the pivots We can describe an incomplete factorization formally as for each k. any alterations of off-diagonal elements are ignored5 ). and gives a symmetric positive definite matrix if A is symmetric positive definite. lower triangular. 2.
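A compact sketch of the D-ILU construction of figure 3.3, which computes only (the reciprocals of) the pivots and leaves the off-diagonal parts of A untouched; the sparsity set S is taken to be the nonzero pattern of A, and the row-by-row sparse indexing is chosen for clarity rather than speed.

    import numpy as np
    import scipy.sparse as sp

    def dilu_pivots(A):
        """D-ILU: return 1/d_ii for M = (D + L_A) D^{-1} (D + U_A), updating diagonals only."""
        A = sp.csr_matrix(A)
        n = A.shape[0]
        d = A.diagonal().copy()
        for i in range(n):
            d[i] = 1.0 / d[i]                 # store the reciprocal of the pivot
            row_i = A.getrow(i)
            for j, a_ij in zip(row_i.indices, row_i.data):
                # for j > i with (i,j) and (j,i) in S: d_jj <- d_jj - a_ji * (1/d_ii) * a_ij
                if j > i:
                    a_ji = A[j, i]
                    if a_ji != 0.0:
                        d[j] -= a_ji * d[i] * a_ij
        return d    # reciprocals of the pivots of D

    # tiny test on a tridiagonal matrix
    A = sp.diags([-1.0, 4.0, -1.0], offsets=[-1, 0, 1], shape=(8, 8))
    print(dilu_pivots(A))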

k fill quantity subtract it from the diagonal element ai.i if 1 < i ≤ n  i−1 ai.2]. ordering of the grid points. although usually a reduction by a large constant factor is obtained.i = i−n   ai. In this case the ILU (0) and D-ILU factorizations coincide. Modified incomplete factorizations One modification to the basic idea of incomplete factorizations is as follows: If the product ai.i − ai.i+n i.i if there is no k such that i = kn if i + n ≤ nm In the above we have assumed that the variables in the problem are ordered according to the so-called “natural ordering”: a sequential numbering of the grid lines and the points within each grid line. line-by-line.i i−n Conversely. it is easy to see that the pivot on grid point (i. Letting i. Below we will encounter different orderings of the variables. and. j − 1). j) and (i.  − ai.i−n d−1 ai−n.i − ai. INCOMPLETE FACTORIZATION PRECONDITIONERS 41 Remark: the resulting lower and upper factors of the preconditioner have only nonzero elements in the set S.i+1 − ai+1.i ai.j is nonzero. The fact that the D-ILU preconditioner contains the off-diagonal parts of the original matrix was used by Eisenstat [90] to derive at a more efficient implementation of preconditioned CG. More general proofs are given by Gustafsson [111]. This new implementation merges the application of the tridiagonal factors of the matrix and the preconditioner.i+1 = di+1. Such a factorization scheme is usually called a “modified incomplete factorization”.i  i−1  otherwise. as remarked above. It was first proved by Dupont. we get the following generating relations for the pivots:  if i = 1  a1. §7.4.i di. and fill is not allowed in position (i..i+1  di+n. Mathematically this corresponds to forcing the preconditioner to have the same rowsums as the original matrix.i for all i for i = 1.i = ai. we can describe the factorization algorithmically as Initially: di.1    ai. Special cases: central differences We will now consider the special case of a matrix derived from central differences on a Cartesian product grid. but this fact is in general not true for the preconditioner M itself.i . This order of magnitude is preserved by simple incomplete factorizations. If there are n points on each of m grid lines.i if i = kn + 1 with k ≥ 1 di. Modified factorizations are of interest because. . in combination with small perturbations. and Beauwens [31.i+n − ai+n. 32].i−1 d−1 ai−1. thereby saving a substantial number of operations per iteration.i−n d−1 ai−n. the spectral condition number of the preconditioned system can be of a lower order.i − ai. j) is only determined by pivots on points (i − 1. other elements in the triangular factors are equal to off-diagonal elements of A. we only have to calculate the pivots of the factorization. Kendall and Rachford [80] that a modified incomplete factorization of A + O(h2 )DA gives κ(M −1 A) = O(h−1 ) for the central difference case.k a−1 ak.j be coordinates in a regular 2D grid. One reason for considering modified incomplete factorizations is the behavior of the spectral condition number of the preconditioned system.i d−1 ai.i−1 d−1 ai−1. In the following we will assume a natural.nm do:  −1  di+1. instead of simply discarding this k. j).i+n = di+n. It was mentioned above that for second order elliptic equations the condition number of the coefficient matrix is O(h−2 ) as a function of the discretization mesh width.3. Axelsson and Barker [14.
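For the five-point central-difference matrix on an n by m grid discussed here, the generating relations for the pivots reduce to a single sweep under the natural ordering. The sketch below assumes a symmetric stencil (so that ai,i-1 = ai-1,i and ai,i-n = ai-n,i) and omits, as usual, the terms that refer to points outside the grid.

    import numpy as np

    def dilu_pivots_5pt(a_diag, a_west, a_south, n, m):
        """Pivots d_i for D-ILU on an n-by-m grid, natural ordering, symmetric 5-point stencil."""
        N = n * m
        d = np.empty(N)
        for i in range(N):
            d[i] = a_diag[i]
            if i % n != 0:                 # not the first point of a grid line
                d[i] -= a_west[i] ** 2 / d[i - 1]
            if i >= n:                     # not on the first grid line
                d[i] -= a_south[i] ** 2 / d[i - n]
        return d

    # Poisson five-point stencil: diagonal 4, off-diagonals -1
    n = m = 4
    N = n * m
    print(dilu_pivots_5pt(4.0 * np.ones(N), -np.ones(N), -np.ones(N), n, m))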

. Both series are finite. 2n − 1 do in parallel for i = max(1. . one wishes to avoid ‘artificial diffusion’. or have a vector computer pipeline them. . see Ashcraft and Grimes [11].) Appleyard and Cheshire [4] observed that if A and M have the same column sums. In computational fluid mechanics this idea is justified with the argument that the material balance stays constant over all iterates. Chan [44].2) (The first series is called a “Neumann expansion”. that is. one can also keep column sums constant. the second an “Euler expansion”. Stone [193]. Modified incomplete factorizations can break down. Here the fill is multiplied by a parameter 0 < α < 1 before it is subtracted from the diagonal. in locations (i. This was noted by Chan and Kuo [50]. Axelsson and Lindskog [18. see Van der Vorst [201. but their length prohibits practical use of this fact.) Parallel or vectorizable preconditioners can be derived from an incomplete factorization by taking a small number of terms in either series. normalized to the form I − L where L is strictly lower triangular). Consider for a moment a lower triangular matrix. PRECONDITIONERS for k = 1.42 CHAPTER 3. the choice x0 = M −1 b guarantees that the sum of the elements in r0 (the material balance error) is zero. especially when the variables are numbered other than in the natural row-by-row ordering. k) j =k−i+1 xi+(j−1)n ← Di+(j−1)n yi+(j−1)n −Li+(j−1)n i − 1 + (j − 1)nxi−1+(j−1)n −Li+(j−1)n i + (j − 2)nxi+(j−2)n Figure 3. Experiments indicate that a small number of terms. Consider the special case of central differences on a regular domain of n × n points. Its inverse can be given as either of the following two series: (I − L)−1 = I + L + L2 + L3 + · · · (I + L)(I + L2 )(I + L4 ) · · · (3. and that all further ri have elements summing to zero. and a full analysis was given by Eijkhout [85] and Notay [160]. see Van der Vorst [204]. while giving high execution rates. or ‘wavefront’. Instead of keeping row sums constant. Another way of vectorizing the solution of the triangular factors is to use some form of expansion of the inverses of the factors. A slight variant of modified incomplete factorizations consists of the class of “relaxed incomplete factorizations”. j) with i + j = k. Notay [161]. . For the dangers of MILU in the presence of rounding error.4). Vectorization of the preconditioner solve At first it may appear that the sequential time of solving a factorization is of the order of the number of variables. depend only on those on the previous diagonal. . Eijkhout [85]. 19].4: Wavefront solution of (D + L)x = u from a central difference problem on a domain of n × n points. The variables on any diagonal in the domain. 203]. Therefore it is possible to process the operations on such a diagonal. k + 1 − n). and Van der Vorst [202]. (Equivalently. in parallel (see figure 3. min(n. but things are not quite that bad. with i + j = k − 1. yields almost the full precision of the more recursive triangular solution (see Axelsson and Eijkhout [15] and Van der Vorst [199]). that is.

For instance. not doing so means that separate multiplications by LA and D−1 have to be performed in the expansion.3. . Multi-coloring is also an attractive method for vector computers. Suppose then that the products L = LA D−1 and U = D−1 UA have been stored.5: Preconditioning step algorithm for a Neumann expansion M (p) ≈ M −1 of an incomplete factorization M = (I + L)D(I + U ). . This algorithm is given in figure 3. for such ordering strategies there is usually a trade-off between the degree of parallelism and the resulting number of iterations. Doing so doubles the storage requirements for the matrix. The most extreme case is the red/black ordering (or for more general matrices the multi-color ordering) which gives the absolute minimum number of sequential steps.4. Melhem [153]. then we write M = (I + L)D(I + U ). t←y for k = 1. and Poole and Ortega [175].5. .2) described by A = DA + LA + UA . . 202]). U = D−1 UA .3) and replace solving a system M x = y for x by computing x = M (p) y. on rectangular domains one could start numbering the variables from all four corners simultaneously. variables on a wavefront can be processed in parallel. INCOMPLETE FACTORIZATION PRECONDITIONERS 43 Let M = (I + L)D(I + U ) and y be given. they can be processed as one vector. thereby creating four simultaneous wavefronts.4. M (p) = k=0 (−U )p D−1 k=0 (−L)p . For instance. by dividing the wavefront over processors. Now we can choose whether or not to store the product LA D−1 . . (3. Since points of one color are uncoupled. There are some practical considerations in implementing these expansion algorithms. because of the normalization the L in equation (3. in particular the norm of the error matrix may vary considerably between orderings. However.2) is not LA . . . p x ← t − Ux Figure 3. et al. and therefore four-fold parallelism (see Dongarra. [70]. Parallelizing the preconditioner solve The algorithms for vectorization outlined above can be used on parallel computers. Van der Vorst [200. We then define M (p) ≈ M −1 by p p M = (D + LA )D−1 (D + UA ). More radical approaches for increasing the parallelism in incomplete factorizations are based on a renumbering of the problem variables. t ← x for k = 1. The reason for this is that a different ordering may give rise to a different error matrix. For instance. . . p t ← y − Lt x ← D−1 t. where L = LA D−1 . see Doi [67]. if we have a preconditioner (as described in section 3. Rather. See experimental results by Duff and Meurant [78] and a partial explanation of them by Eijkhout [84].
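A direct transcription of the algorithm in figure 3.5 might look as follows in Python. This is a sketch only; the argument conventions, with L and U the normalized strictly triangular factors and dinv the reciprocals of the diagonal of D, are assumptions of this example.

    def neumann_precond_apply(L, U, dinv, y, p):
        # Apply M(p) ~= M^{-1} for M = (I + L) D (I + U), following figure 3.5.
        # L, U: numpy arrays or sparse matrices; dinv, y: numpy vectors.
        t = y.copy()
        for _ in range(p):
            t = y - L @ t        # t -> sum_{k<=p} (-L)^k y ~= (I + L)^{-1} y
        x = dinv * t             # multiply by D^{-1}
        t = x.copy()
        for _ in range(p):
            x = t - U @ x        # x ~= (I + U)^{-1} D^{-1} (I + L)^{-1} y
        return x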

PRECONDITIONERS for i = 1. For a general proof. Figure 3. often fill-in in off-diagonal blocks is discarded altogether. see Axelsson [13]. Then an incomplete factorization is performed using the matrix blocks as basic entities (see Axelsson [12] and Concus. and Meurant [154].3 Block factorization methods We can also consider block variants of preconditioners for accelerated methods. . The idea behind block factorizations The starting point for an incomplete block factorization is a partitioning of the matrix. typically of banded form. both the size and number of the blocks increases with the overall problem size.i = 1/ai. As in the case of incomplete point factorizations.2. see for instance Kettler [133]. . . in that case a natural division in lines (or planes in the 3-dimensional case). if Aij = 0 and Aji = 0. this approximation should be easily computable. Other possibilities were considered by Axelsson and Eijkhout [15]. Axelsson and Polman [21]. 2. the approximation to the inverse is again symmetric positive definite. the existence of incomplete block methods is guaranteed if the coefficient matrix is an M -matrix. It has the attractive theoretical property that. Golub and Meurant [56] as basic references). though incomplete factorizations are not as effective in the 3-dimensional case. then Xj ← Xj − Aji Yi Aij Figure 3. One particular example is given in figure 3. Hence the need for approximations of inverses arises. Concus. i + 2. Golub and Meurant [56]. Approximate inverses In block factorizations a pivot block is generally forced to be sparse. In addition to this. Second. . Eijkhout and Vassilevski [89]. so we rule out the option of calculating the full inverse and taking a banded part of it. . Furthermore. inverting the pivot block is likely to be a costly operation. Kolotilina and Yeremin [140]. can be used for blocking. if the original matrix is symmetric positive definite and a factorization with positive diagonal D can be made. The most important difference with point methods arises in the inversion of the pivot blocks.i . initially the diagonal blocks of the matrix are likely to be be sparse and we would like to maintain this type of structure throughout the factorization. First.6 describes an incomplete block factorization that is analogous to the D-ILU factorization (section 3. In such a blocking scheme for Cartesian product grids. Block methods are normally feasible if the problem domain is a Cartesian product grid. in the block case two problems arise.7. Xi ← Aii for i = 1.4. Whereas inverting a scalar is easily done.2) in that it only updates the diagonal blocks.44 CHAPTER 3. 2.1. as mentioned in §3.6: Block version of a D-ILU factorization 3. . .4. −1 Let Yi ≈ Xi for j = i + 1. . . The simplest approximation to A−1 is the diagonal matrix D of the reciprocals of the diagonal of A: di. and that we need an approximation to its inverse that has a similar structure. .
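The algorithm of figure 3.6 can be sketched as follows (illustrative Python, not from the book; the blocks are assumed to be given as a dictionary of small dense matrices keyed by block indices, and the pivot blocks are inverted exactly here, purely for brevity, where in practice a sparse approximate inverse would be used).

    import numpy as np

    def block_dilu(A, nblocks):
        # Block D-ILU: only the diagonal blocks are updated.
        X = {i: A[i, i].copy() for i in range(nblocks)}     # X_i <- A_ii
        Y = {}
        for i in range(nblocks):
            Y[i] = np.linalg.inv(X[i])                      # Y_i ~= X_i^{-1}
            for j in range(i + 1, nblocks):
                if (i, j) in A and (j, i) in A:             # A_ij and A_ji nonzero
                    X[j] = X[j] - A[j, i] @ Y[i] @ A[i, j]
        return X, Y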

After the factorization we are left with sequences of Xi block forming the pivots. We can turn this into an incomplete factorization by replacing the block diagonal matrix of pivots D by the block diagonal matrix of incomplete factorization pivots X = diag(Xi ). UQ in the case of a block tridiagonal matrix. and Eijkhout and Polman [88] for a more detailed analysis in the M -matrix case.4.i+1 − Ai+1. it can be imposed by renumbering the variables in a Cuthill-McKee ordering [59]. U are the block lower and upper triangle of the factorization. they coincide with LA .i Yi Ai.7: Algorithm for approximating the inverse of a banded matrix X1 = A1.8. Such a matrix has incomplete block factorizations of a particularly simple nature: since no fill can occur outside the diagonal blocks (Ai. see figure 3. and problems on a regular 3D grid. . if the blocks correspond to planes of grid points.1 for i ≥ 1 −1 let Yi ≈ Xi Xi+1 = Ai+1.i+1 Figure 3. For such matrices. and of Yi blocks approximating their inverses. The special case of block tridiagonality In many applications.i ). Consider the block factorization A = (D + L)D−1 (D + U ) = (D + L)(I + D−1 U ) where D is the block diagonal matrix of pivot blocks.3. let Y = (I − U )D(I − L) Figure 3. INCOMPLETE FACTORIZATION PRECONDITIONERS 45 Let X be a banded matrix. but in practice harder to implement. In the context of partial differential equations the diagonal blocks of the coefficient matrix are usually strongly diagonally dominant. Two types of incomplete block factorizations One reason that block methods are of interest is that they are potentially more suitable for vector computers and parallel architectures. giving M = (X + L)(I + X −1 U ) 6 Writing (I + LA D−1 )(D + UA ) is equally valid. a block tridiagonal structure can be found in the coefficient matrix. factor X = (I + L)D−1 (I + U ). Moss and Smith [64] for a general proof. The generating recurrence for the pivot blocks also takes a simple form. all properties follow from our treatment of the pivot blocks.6 and L. Even if such a block tridiagonal structure does not arise naturally.8: Incomplete block factorization of a block tridiagonal matrix Banded approximations to the inverse of banded matrices have a theoretical justification. Examples are problems on a 2D regular grid if the blocks correspond to lines of grid points. the elements of the inverse have a size that is exponentially decreasing in their distance from the main diagonal. See Demko.
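For the block tridiagonal case the recurrence of figure 3.8 reduces to a few lines. The sketch below is an illustration under assumptions made here (not the book's code): it uses the simple diagonal approximate inverse mentioned above, with Ad, Al and Au lists of the diagonal blocks A_{i,i}, the subdiagonal blocks A_{i+1,i} and the superdiagonal blocks A_{i,i+1} as dense arrays.

    import numpy as np

    def block_tridiag_pivots(Ad, Al, Au):
        X = [Ad[0].copy()]                                # X_1 = A_{1,1}
        Y = []
        for i in range(len(Ad) - 1):
            Y.append(np.diag(1.0 / np.diag(X[i])))        # Y_i ~= X_i^{-1}
            X.append(Ad[i + 1] - Al[i] @ Y[i] @ Au[i])    # X_{i+1}
        return X, Y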

and that the spectral radius of B is less than one. we can write the ∞ inverse of A as A−1 = k=0 B k .4. and a polynomially preconditioned method with p−1 −1 Mp = ( i=0 (I − M −1 A)i )M −1 . A systematic comparison of the two approaches was made by Bank. the second type has a much higher potential for vectorizability. Therefore.4 Blocking over systems of partial differential equations If the physical problem has several variables per grid point.5 Incomplete LQ factorizations Saad [183] proposes to construct an incomplete LQ factorization of a general sparse matrix. so an approximation may be derived by truncating this infinite series. and where linear systems with the preconditioner as coefficient matrix are easier to solve than the original system. [26]. so that standard Gram-Schmidt may be not so bad as in general). Golub and Meurant [56] and Kolotilina and Yeremin [140]) solving a linear system means solving smaller systems with the Xi matrices. we can replace D by a the inverse of the block diagonal matrix of the approximations to the inverses of the pivots. et al. Since the iterative methods we are considering are already based on the idea of applying polynomials in the coefficient matrix to the initial residual. see the above references or Eijkhout and Vassilevski [89]. Suppose that the coefficient matrix A of the linear system is normalized to the form A = I − B. For this second type (which was discussed by Meurant [154]. such a factorization is theoretically more troublesome. we have described preconditioners in only one of two classes: those that approximate the coefficient matrix. Axelsson and Polman [21] and Axelsson and Eijkhout [15]) solving a system with M entails multiplying by the Yi blocks. Blocking of the equations (which gives a small number of very large blocks) was used by Axelsson and Gustafsson [17] for the equations of linear elasticity. for accelerated methods. Dubois. most rows are typically orthogonal already. specifically the . and blocking of the variables per node (which gives many very small blocks) was used by Aarden and Karlsson [1] for the semiconductor equations. giving M = (Y −1 + L)(I + Y U ).5 Polynomial preconditioners So far. Saad suggest dropping strategies for the fill-in produced in the orthogonalization process.4. there are analytic connections between the basic method and polynomially accelerated one. Y = diag(Yi ). The basic result is that for classical methods. Unfortunately. Using the Neumann series. PRECONDITIONERS For factorizations of this type (which covers all methods in Concus.46 CHAPTER 3. Experiments show that using L in a CG process for the normal equations: L−1 AAT L−T y = b is effective for some relevant problems. It turns out that the resulting incomplete L factor can be viewed as the incomplete Choleski factor of the matrix AAT . k steps of the polynomially preconditioned method are exactly equivalent to kp steps of the original method. that is. Alternatively. 3. it is possible to introduce blocking in a natural way. if there are several coupled partial differential equations. 3. 3. The idea is to orthogonalize the rows of the matrix by a Gram-Schmidt process (note that in sparse matrices. Greenbaum and Rodrigue [76] investigated the relationship between a basic method using a splitting A = M − N . Polynomial preconditioners can be considered as members of the second class of preconditioners: direct approximations of the inverse of the coefficient matrix.
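A truncated Neumann series is easily applied without forming any powers of B explicitly. The following sketch (illustrative only, not part of the templates) computes (I + B + · · · + B^{p−1})v for A = I − B using only products with A.

    def neumann_poly_apply(A, v, p):
        # Approximate A^{-1} v by the first p terms of the Neumann series,
        # using B v = v - A v so that only products with A are needed.
        x = v.copy()
        t = v.copy()
        for _ in range(p - 1):
            t = t - A @ t        # t <- B t, since B = I - A
            x = x + t            # accumulate v + B v + ... + B^{p-1} v
        return x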

and they require estimates of both a lower and upper bound on the spectrum of A. they allow for specialized faster solution methods. where a preconditioner based on the symmetric part is used.3. solving a system with the symmetric part of a matrix may be no easier than solving a system with the full matrix. We will cover some of them here (see also §5. OTHER PRECONDITIONERS 47 Chebyshev method. Let us define a polynomial preconditioner more abstractly as any polynomial M = Pn (A) normalized to P (0) = 1. where c1 .j |. the coefficient matrix is symmetric and positive definite. see §5.4). polynomial preconditioning does eliminate a large fraction of the inner products and update operations.6. Experiments comparing Chebyshev and least squares polynomials can be found in Ashby.4]. Manteuffel and Saylor [9]. Since an accurate lower bound on the spectrum of A may be hard to obtain.1. These estimates may be derived from the conjugate gradient iteration itself. Johnson. The reason for this is usually that the partial differential operator from which it is derived is self-adjoint. c2 do not depend on the matrix size. These functions only require an upper bound and this is easily computed. 3.4 we pointed out that conjugate gradient methods for non-selfadjoint systems require the storage of previously calculated vectors.2]). These preconditioners usually involve more work than the types discussed above. Application of polynomial preconditioning to symmetric indefinite problems is described by Ashby. . coercive. however. However.3. Now the choice of the best polynomial preconditioner becomes that of choosing the best polynomial that minimizes I − M −1 A . see [209.6.1 Preconditioning by the symmetric part In §2. so there may be an overall increase in efficiency. 3.5 and §5. Therefore it is somewhat remarkable that preconditioning by the symmetric part (A + AT )/2 of the coefficient matrix A leads to a method that does not need this extended storage. §1. The importance of this is that the use of B as a preconditioner gives an iterative method with a number of iterations that does not depend on the matrix size. It follows that for the coefficient matrix A the following relation holds for any matrix B from a similar differential equation: c1 ≤ xT Ax ≤ c2 xT Bx for all x.6 Preconditioners from properties of the differential equation A number of preconditioners exist that derive their justification from properties of the underlying partial differential equation. and bounded (see Axelsson and Barker [14. Manteuffel and Otto [8]. Micchelli and Paul [125] and Saad [182] propose least squares polynomials based on several weight functions. For the choice of the infinity norm we thus obtain Chebyshev polynomials.2 The use of fast solvers In many applications. Such a method was proposed by Concus and Golub [55] and Widlund [214]. This problem may be tackled by imposing a nested iterative method. using for instance the “Gerschgorin bound” maxi j |Ai. Although there is no gain in the number of times the coefficient matrix is applied. §3.6. Vassilevski [210] proved that the efficiency of this preconditioner for the symmetric part carries over to the outer method. There the polynomial is chosen so that it transforms the system into a definite one. 3. the preconditioned iteration can improve the number of iterations by at most a factor of p.
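For reference, the Gerschgorin upper bound mentioned above amounts to one line of code; the sketch below is for a dense numpy array, but the same row-sum computation applies to any sparse format.

    import numpy as np

    def gerschgorin_upper_bound(A):
        # max_i sum_j |a_ij| bounds the spectrum of A from above.
        return np.max(np.sum(np.abs(A), axis=1))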

fast solvers are usually only applicable if the physical domain is a rectangle or other Cartesian product structure. that is. An extension to the non-self adjoint case is considered by Elman and Schultz [93]. or one can vectorize over them. y)ux )x − (b(x. there is a problem of data distribution. For vector computers. either the system solution with L1 or with L2 will involve very large strides: if columns of variables in the grid are stored . As another example. it can also be used for the steady state. and they can be executed in parallel. ∂ ∂ where L1 = − ∂x2 . A system involving such a matrix can be solved with various so-called “fast solvers”. domain decomposition methods can be used. for instance. Generalization of this scheme to variable coefficients or fourth order elliptic problems is relatively straightforward. the u(m) become subsequent iterates. and Peaceman and Rachford [172]. A theoretical reason that ADI preconditioners are of interest is that they can be shown to be spectrally equivalent to the original coefficient matrix. if the original matrix arises from −(a(x. a matrix I + αL1 entails only a number of uncoupled tridiagonal solutions.4). Fairweather. In Concus and Golub [58] a transformation of this method is considered to speed up the convergence. The above method is implicit since it requires systems solutions. L2 = − ∂y2 . any elliptic operator can be preconditioned with the Poisson operator. Swarztrauber [194]. see D’Yakonov [81]. Coupled with the fact that the number of iterations in the resulting preconditioned iterative methods is independent of the matrix size. Based on the observation that L1 + L2 = (I + L1 )(I + L2 ) − I − L1 L2 . if one can be found that has attractive properties as preconditioner. y)uy )y = f. L2 be discretized representations of L1 . and it alternates the x and y (and if necessary z) directions. or the generalized marching algorithm (see Dorr [74]. Hadjidimos [118]. 3. This alternating direction implicit. The most common choice is to take a matrix from a separable PDE. As a simplest example. such methods are close to optimal. see §5. for solving elliptic equations. Hence the number of iterations is bounded independent of the condition number. L2 .3 Alternating Direction Implicit methods The Poisson differential operator can be split in a natural way as the sum of two operators: L = L1 + L2 . such as FFT methods. In that case. These need very little storage over that needed for the matrix. PRECONDITIONERS Thus we can precondition our original matrix by one derived from a different PDE. However. giving the iterative method −∆(un+1 − un ) = −(Lun − f ). iterative schemes such as (1 + αL1 )(1 + αL2 )u(m+1) = [(1 + βL1 )(1 + βL2 )] u(m) with suitable choices of α and β have been proposed. However. Bank [25] and Bank and Rose [27]). Gourlay and Mitchell [96]. method was first proposed as a solution method for parabolic equations. 2 2 Now let L1 . (For a domain consisting of a number of such pieces. The u(m) are then approximations on subsequent time steps. It is attractive from a practical point of view (although mostly on tensor product grids).48 CHAPTER 3. since solving a system with.6. the preconditioner can be formed from −(˜(x)ux )x − (˜ a b(y)uy )y = f. or ADI. Fast solvers are attractive in that the number of operations they require is (slightly higher than) of the order of the number of variables. However. cyclic reduction.
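To make the structure of one ADI sweep concrete, the following sketch (our own illustration, with several simplifying assumptions: the 2D model problem on an n × n grid with Dirichlet boundaries, dense 1D operators in place of tridiagonal solvers, and the homogeneous form of the displayed relation) applies one sweep of the iteration above to a grid function stored as an n × n array.

    import numpy as np

    def adi_sweep(u, a, b):
        n = u.shape[0]
        T = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # 1D second-difference operator
        I = np.eye(n)
        rhs = (I + b * T) @ u @ (I + b * T)        # (I + b L1)(I + b L2) u_old
        v = np.linalg.solve(I + a * T, rhs)        # uncoupled solves in the x direction
        u_new = np.linalg.solve(I + a * T, v.T).T  # then in the y direction
        return u_new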

. On parallel machines an efficient solution is possible if the processors are arranged in a Px ×Py grid. every processor row then works independently of other rows. for instance using a Schur complement method. only the solution with L1 will involve contiguous data.6. Inside each row. During.3. the processors can work together. this method will be close to optimal. For the L2 the stride equals the number of variables in a column. Thus. e. OTHER PRECONDITIONERS 49 contiguously. the L1 solve.g. With sufficient network bandwidth this will essentially reduce the time to that for solving any of the subdomain systems plus the time for the interface system. .


in certain practical situations the complex system is non-Hermitian but symmetric. this method is susceptible to breakdown. A look-ahead strategy can remedy this in most cases (see Freund [99] and Van der Vorst and Melissen [206]). The user must supply the quantities maxit. the following simple stopping criterion will likely be adequate. stop if the error is no longer decreasing or decreasing too slowly. For non-Hermitian complex systems we distinguish two cases. However. • The real number b is a norm of b. for any coefficient matrix a CGNE method is possible. y) = xT y.1 Complex Systems Conjugate gradient methods for real symmetric systems can be applied to complex Hermitian systems in a straightforward manner. 2. y) = xT y is replaced by ¯ (x.Chapter 4 Related Issues 4. In general. that is. a method must decide when to stop. with short recurrences. To be effective. that is. and 3. any reasonable approximation of the absolute value of the largest entry of b will do. and preferably also A : • The integer maxit is the maximum number of iterations the algorithm will be permitted to perform.2 Stopping Criteria An iterative method produces a sequence {x(i) } of vectors converging to the vector x satisfying the n × n system Ax = b. b . For the user wishing to read as little as possible. it can happen that xT x = 0 for x = 0. Complex symmetric systems can be solved by a classical conjugate gradient or Lanczos method. or one can split the system into real and complex parts and use a method such as GMRES on the resulting real nonsymmetric system. Again. identify when the error e(i) ≡ x(i) − x is small enough to stop. • The real number A is a norm of A. a conjugate gradients method on the normal equations AH Ax = AH b. that is. if the complex inner product (x. Like the BiConjugate Gradient method. A good stopping criterion should 1. limit the maximum amount of time spent iterating. stop tol. 4. 51 . Any reasonable (order of magnitude) approximation of the absolute value of the largest entry of the matrix A will do.

For some algorithms we may also use the norm x B.1 Here is the algorithm: i=0 repeat i=i+1 Compute the approximate solution x(i) . which occurs near convergence.2. and ε = 2−53 ≈ 10−16 in double precision. 4. as well as A B. ≡ j |xj | .k |2 )1/2 . the stopping criterion may be replaced with the generally stricter criterion until i ≥ maxit or r(i) ≤ stop tol · b . and how to bound e(i) in terms of r(i) . Note that if x(i) does not change much from step to step. Compute r(i) and x(i) . respectively. Henceforth x and A will refer to any mutually 1 On a machine with IEEE Standard Floating Point Arithmetic.k | . The most common vector norms are: x x x ∞ 1 2 ≡ maxj |xj | .α ≡ Bx α . respectively. The rest of this section describes how to measure the sizes of vectors e(i) and r(i) . or 2. Compute the residual r(i) = Ax(i) − b. In either case.52 CHAPTER 4. where B is a fixed nonsingular matrix and α is one of ∞. until i ≥ maxit or r(i) ≤ stop tol · ( A · x(i) + b ). Corresponding to these vector norms are three matrix norms: A A A ∞ 1 F ≡ maxj k |aj. so we use the residual r(i) = Ax(i) −b instead.1 More Details about Stopping Criteria Ideally we would like to stop when the magnitudes of entries of the error e(i) = x(i) −x fall below a user-supplied threshold. then x(i) need not be recomputed. and ≡ ( j |xj |2 )1/2 . The user should choose stop tol less than one and greater than the machine precision ε. which is more readily computed. the final error bound is e(i) ≤ A−1 · r(i) . If an estimate of A−1 is available. We may also use the matrix norm A 2 = (λmax (AAT ))1/2 . 1. .k | . If A is not available. choosing stop tol 10−6 means that the user considers the entries of A and b to have errors in the range ±10−6 A and ±10−6 b . where λmax denotes the largest eigenvalue. which guarantees that the relative error e(i) / x(i) in the computed solution is bounded by stop tol. one may also use the stopping criterion until i ≥ maxit or r(i) ≤ stop tol · x(i) / A−1 . One way to choose stop tol is as the approximate uncertainty in the entries of A and b relative to A and b . ε = 2−24 ≈ 10−7 in single precision. ≡ maxk j |aj. The algorithm will compute x no more accurately than its inherent uncertainty warrants. We will measure errors using vector and matrix norms. RELATED ISSUES • The real number stop tol measures how small the user wants the residual r(i) = Ax(i) − b of the ultimate solution x(i) to be.α ≡ BAB −1 α . But e(i) is hard to estimate directly. and ≡ ( jk |aj. For example.
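In code, the loop above might be organized as follows. This is a sketch only; the function step, standing for one iteration of whichever method is used, and the estimate norm_A of A are supplied by the user and are assumptions of this example.

    import numpy as np

    def iterate_until_converged(step, A, b, x0, norm_A, stop_tol, maxit):
        x = x0
        for i in range(1, maxit + 1):
            x = step(x)                                   # compute x^(i)
            r = A @ x - b                                 # residual r^(i)
            if np.linalg.norm(r, np.inf) <= stop_tol * (
                    norm_A * np.linalg.norm(x, np.inf) + np.linalg.norm(b, np.inf)):
                break
        return x, i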

One difficulty with this method is that if A · x b . which implies e(i) ≤ A−1 · r(i) . For example. both form mutually consistent pairs. then it is usually not worth iterating until δA and δb are even smaller than these errors. ( x 2 and A F .3. which allows us to bound the forward error. is hard to estimate directly. because just rounding the entries of A and b to fit in the machine creates errors this large. the same goes for δb). This criterion yields the forward error bound e(i) ≤ A−1 · r(i) ≤ stop tol · A−1 · ( A · x(i) + b ) . To see that A must be very ill-conditioned. Based on this discussion. r(i) ≤ S2 ≡ stop tol · b .2. then it may be difficult for any method to satisfy the stopping criterion.) All these norms satisfy the triangle inequality x + y ≤ x + y and A + B ≤ A + B . A vector x of length n with entries uniformly distributed between 0 and 1 will satisfy x ∞ ≤ 1. δb / b } to the problem Ax = b that makes x(i) an exact solution of (A + δA)x(i) = b + δb. see §4. but x 2 will grow like √ n and x 1 will grow like n. we show how below. which we will call the forward error. Therefore.2. Provided one has some bound on the inverse of A. as well as Ax ≤ A · x for mutually consistent pairs. This is equivalent to asking that the backward error δA and δb satisfy δA = 0 and δb ≤ tol · b . This is equivalent to asking that the backward error δA and δb described above satisfy δA ≤ stop tol · A and δb ≤ stop tol · b . if the machine precision is ε. it is not worth making δA ≤ ε A and δb ≤ ε b . The second stopping criterion we discussed. Therefore a stopping criterion based on x 1 (or x 2 ) may have √ to be permitted to grow proportional to n (or n) in order that it does not become much harder to satisfy for large n. If the original data A and b have errors from previous computations or measurements.) One difference between these norms is their dependence on dimension. STOPPING CRITERIA 53 consistent pair of the above. There are two approaches to bounding the inaccuracy of the computed solution to Ax = b. r(i) ≤ S1 ≡ stop tol · ( A · x(i) + b ). we will now consider some stopping criteria and their properties. Here |X| is the matrix or vector of absolute values of components of X. may be much more stringent than Criterion 1: Criterion 2. δb / b } where x(i) is the exact solution of (A + δA)x(i) = (b + δb) (here δA denotes a general matrix. (For more details on the properties of norms. as well as x 2 and A 2 . one can bound the forward error in terms of the backward error via the simple equality e(i) = x(i) − x = A−1 (Ax(i) − b) = A−1 r(i) . not δ times A. we introduce the backward error. see Golub and Van Loan [108]. Since e(i) . Recall that the backward error is the smallest change max{ δA / A . Above we already mentioned Criterion 1. which can only occur if A is very ill-conditioned and x nearly lies in the null space of A. note that 1 A · x A · A−1 b = ≤ A · A−1 b b . a stopping criterion of the form “stop when r(i) ≤ τ ” also yields an upper bound on the forward error e(i) ≤ τ · A−1 . The backward error may be easily computed from the residual r(i) = Ax(i) − b.4. The normwise backward error is defined as the smallest possible value of max{ δA / A . which does not require A . This criterion yields the forward error bound e(i) ≤ A−1 · r(i) ≤ stop tol · A−1 · b . in addition to supplying a bound on the forward error.) The backward error also has a direct interpretation as a stopping criterion. 
(Sometimes we may prefer to use the stricter but harder to estimate bound e(i) ≤ |A−1 | · |r(i) | .)

then r(0) = b. Finally. By choosing E and f . This yields the third stopping criterion: Criterion 3. which may be difficult to satisfy for any algorithm if b A · x . it means there are a δA and a δb such that (A + δA)x(i) = b + δb. but because it is widely used. we mention one more criterion. This commonly used criterion has the disadvantage of depending too strongly on the initial solution x(0) . this loss of possibly important structure will not be reflected in δA . this means the iteration may stop too soon. one can also just stop when the upper bound on the error A−1 · r(i) falls below a threshold.k | and fj = |bj | makes the stopping criterion measure the componentwise relative backward error. The cost is an extra matrix-vector multiply: Criterion 4. On the other hand. If this criterion is satisfied.k and δbj differently. and |z| denotes the vector of absolute values of the entries of z. not because we recommend it. This criterion yields the forward error bound e(i) ≤ S5 · A−1 . the user can vary the way the backward error is measured in the stopping criterion. the smallest relative perturbations in any component of A and b which is necessary to make x(i) an exact solution. including the possibility of insisting that some entries be zero.. the following stopping criterion gives one the option of scaling each component δaj. We mention it in order to explain its potential drawbacks: Dubious Criterion 5. (i) x x(i) permitting the user to specify the desired relative accuracy stop tol in the computed solution x(i) .k = |aj. S4 ≡ maxj (|r(i) |j /(E · |x(i) | + f )j ) ≤ stop tol. Choosing ej.k . choosing ej. among other things.54 CHAPTER 4. since most norms δA and δb measure each entry of δA and δb equally. r(i) ≤ S3 ≡ stop tol · x(i) / A−1 . f is a user-defined vector of nonnegative entries. Here E is a user-defined matrix of nonnegative entries. For example. RELATED ISSUES If an estimate of A−1 is available. if x(0) is very (0) large and very inaccurate. which is essentially the same as Criterion 1.k | ≤ tol · ej. This criterion yields the forward error bound e(i) ∞ ≤ |A−1 | · |r(i) | ≤ S4 · |A−1 |(E|x(i) | + f ) ∞ where |A−1 | is the matrix of absolute values of entries of A−1 . with |δaj. . and |δbj | ≤ tol · fj for all j and k. i.e. then r will be very large and S5 will be artificially large. This stopping criterion guarantees that e(i) A−1 · r(i) ≤ ≤ stop tol . One drawback to Criteria 1 and 2 is that they usually treat backward errors in each component of δA and δb equally. if A is sparse and δA is dense. This tighter stopping criterion requires. Other choices of E and f can be used to reflect other structured uncertainties in A and b.k = A ∞ and fj = b ∞ makes the stopping criterion r(i) ∞ /(n A ∞ x(i) ∞ + b ∞ ). In contrast. that δA have the same sparsity pattern as A. For example. r(i) ≤ S5 ≡ stop tol · r(0) . If x(0) = 0. a common choice. Then this criterion is equivalent to Criterion 2 above.
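With the common choice E = |A| and f = |b|, Criterion 4 can be evaluated as follows (an illustrative sketch for a dense matrix; it assumes the denominator is nonzero in every component).

    import numpy as np

    def componentwise_backward_error(A, x, b):
        # S4 = max_j |r_j| / (|A| |x| + |b|)_j, the componentwise relative
        # backward error of the approximate solution x.
        r = A @ x - b
        denom = np.abs(A) @ np.abs(x) + np.abs(b)
        return np.max(np.abs(r) / denom)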

we have A−1 M = (M − N )−1 M = (I − G)−1 .4. A−1 2 = λ−1 (A). This is often done with dense linear system solvers. and A−1 M = (I − G)−1 ≤ 1/(1 − G ).2. it is a viable method. Demmel and Duff [5]). and use the fact that A−1 2 = 1/λmin (AAT ) = 1/2 1/λmin (AT A). then the natural residual to compute is r(i) = x(i) − Gx(i) − c = ˆ M −1 (Ax(i) − b) = M −1 r(i) . which is small compared to the cost O(n3 ) of the LU decomposition (see Hager [120]. because the extra cost of these systems is O(n2 ). where the cost of these solves may well be several times as much as the original linear system. et al.2 .2. Another example is an implementation of the preconditioned conjugate gradient algorithm which computes r(i) M −1/2 . the residual r(i) is the same as the residual ˆ of the preconditioned system M −1 Ax = M −1 b.2. so we may instead derive a forward error bound e(i) = A−1 M r(i) ≤ A−1 M · r(i) . consider the case of a diagonal matrix A with widely varying diagonal entries. and Chebyshev acceleration with adaptation of parameters is being used. There is a large number of problem dependent ways to estimate A−1 . if many linear systems with the same coefficient matrix and differing right-hand-sides are to be solved. although this is not the case for any algorithms in this book. Such an implementation could use the stopping criterion r(i) M −1/2 . Using this as a stopping criterion requires ˆ ˆ an estimate of A−1 M . then the matrix whose inverse norm we need is I − G.3 Estimating A−1 Bounds on the error e(i) inevitably rely on bounds for A−1 . Still. When A is symmetric positive definite. since e(i) = A−1 r(i) . we mention a few here.2 = (r(i)T M −1 r(i) )1/2 instead of r(i) 2 (the implementation in this book computes the latter). . then |A−1 | · |r(i) | ∞ = A−1 R ∞ (see Arioli. however. let R denote the diagonal matrix with diagonal entries equal to the entries of |r(i) |. [3]). we know how to estimate G if the splitting is a standard one such as Jacobi or SOR. we 1/2 may apply the Lanczos method to AAT or AT A. Since A is symmetric positive definite. a diagonal matrix. For completeness. if ones “splits” A = M − N to get the iterative method x(i) = M −1 N x(i−1) + −1 M b ≡ Gx(i−1) + c.2 / r(0) M −1/2 . In other words. This is not the case for iterative solvers. 4. it is hard to interpret r(i) as ˆ a backward error for the original system Ax = b. It is also possible to estimate A−1 ∞ provided one is willing to solve a few systems of linear equations with A and AT as coefficient matrices. In this case. In the case of methods based on splitting A = M − N . which could also be used in a stopping criterion. This may be much smaller than the simpler A−1 ∞ · r(i) ∞ in the case where the rows of A are badly scaled. which can usually provide good estimates of the largest (rightmost) and smallest (leftmost) eigenvalues of a symmetric matrix at the cost of a few matrix-vector multiplies. Then we may estimate (I − G)−1 ≤ 1/(1 − G ). When a splitting A = M − N is used to get an iteration x(i) = M −1 N x(i−1) + M −1 b = Gx(i−1) + c.2 ≤ tol as in Criterion 5. For general nonsymmetric A. we discuss stopping criteria in this case. Higham [123] and Anderson. and the matrix A has special characteristics such as Property A. A−1 R ∞ can be estimated using the technique in the last paragraph since multiplying by A−1 R or (A−1 R)T = RT A−T is no harder than multiplying by A−1 and A−T and also by R. min This adaptive estimation is often done using the Lanczos algorithm (see section 5.1). For example. 
then at each step the algorithm estimates the largest and smallest eigenvalues λmax (A) and λmin (A) of A anyway.2 When r(i) or r(i) is not readily available It is possible to design an iterative algorithm for which r(i) = Ax(i) − b or r(i) is not directly available. STOPPING CRITERIA 55 4. We may also use it to get the forward error bound e(i) ≤ A−1 M 1/2 · r(i) M −1/2 . The approach in the last paragraph also lets us estimate the alternate error bound e(i) ∞ ≤ |A−1 | · |r(i) | ∞ . Often. To compute |A−1 | · |r(i) | ∞ .

On the other hand. we will review here a number of sparse storage formats.3). Since iterative methods are typically used on sparse matrices. A more refined bound is that the error (δr(i) )j in the jth component of r(i) is bounded by O(ε) times the jth component of |A| · |x(i) | + |b|. we demonstrate how the matrix-vector product and an incomplete factorization solve are formulated using two of the sparse matrix formats. the form of such a criterion is both method and problem dependent. In this section we will review some of the more popular sparse matrix formats that are used in numerical software packages such as ITPACK [139] and NSPCG [164]. This is why one should not choose stop tol ≤ ε in Criterion 1. After surveying the various formats. while it is a good idea to have a criterion that stops when progress towards a solution is no longer being made. convergence may continue to be erratic from step to step. and usually closer to nε. The infinity norm x ∞ = maxj |xj | requires the fewest floating point operations to compute. Now consider computing the residual r(i) = Ax(i) − b by forming the matrix-vector product (i) Ax and then subtracting b. . can appear wildly nonconvergent for a large number of steps before the residual begins to decrease. often exhibit nearly monotone linear convergence. Often. such as Jacobi and SOR. This means the uncertainty in e(i) is really bounded by O(ε) |A−1 |·(|A|·|x(i) |+|b|) . such as CGS. like the conjugate gradient method. 4. For this reason.4 Stopping when progress is no longer being made In addition to limiting the total amount of work by limiting the maximum number of iterations one is willing to do.56 CHAPTER 4. computing x 2 = ( j |xj |2 )1/2 in the most straightforward manner can easily overflow or lose accuracy to underflow even when the true result is far from either the overflow or underflow thresholds. Other methods. Still other methods.5 Accounting for floating point errors The error bounds discussed in this section are subject to floating point errors. it is also natural to consider stopping when no apparent progress is being made. A standard (i) error analysis shows that the error δr(i) in the computed r(i) is bounded by δr√ ≤ O(ε)( A · (i) x + b ). but which deserve some discussion. This last quantity can be estimated inexpensively provided solving systems with A and AT as coefficient matrices is inexpensive (see the last paragraph of §4. symbols. most of which are innocuous. and why Criterion 2 may not be satisfied by any method. and therefore on the storage scheme used for the matrix and the preconditioner. 2 IEEE standard floating point arithmetic permits computations with ±∞ and NaN. or more tersely |δr(i) | ≤ O(ε)(|A| · |x(i) | + |b|). in principle there can be many such plateaus (see Greenbaum and Strakos [109]) depending on the problem.2. at least after some initial transients. or Not-a-Number. the storage scheme used arises naturally from the specific application problem. but it is more expensive than computing x ∞ . Some methods. In other words. with the residual norm stagnating at a constant value for many iterations before decreasing again. so it is easy to recognize when convergence degrades. where O(ε) is typically bounded by nε. Both these bounds can be severe overestimates of the uncertainty in e(i) . RELATED ISSUES 4. 4. exhibit “plateaus” in their convergence.2. a careful implementation for computing x 2 without this danger is available (subroutine snrm2 in the BLAS [71] [143]). 
and cannot overflow or cause other exceptions if the xj are themselves finite2 .3 Data Structures The efficiency of any of the iterative methods considered in previous sections is determined primarily by the performance of the matrix-vector product and the preconditioner solve. but examples exist where they are attainable. This uncertainty in the value of r(i) induces an uncertainty in the error e(i) = A−1 r(i) of at most O(ε) A−1 · ( A · x(i) + b ). all in floating point arithmetic with relative precision ε.2.
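The scaling idea used by snrm2 can be sketched as follows (an illustration only, not the BLAS code itself).

    import numpy as np

    def safe_norm2(x):
        # Compute ||x||_2 without intermediate overflow or underflow by
        # scaling with the largest entry first.
        s = np.max(np.abs(x))
        if s == 0.0:
            return 0.0
        y = x / s                      # now |y_j| <= 1, so the dot product is safe
        return s * np.sqrt(np.dot(y, y))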

we define row ptr(n + 1) = nnz + 1. and the other two for integers (col ind. Block Compressed Row Storage. they are not very efficient. requires a scheme for knowing where the elements fit into the full matrix. Compressed Row Storage (CRS) The Compressed Row and Column (in the next section) Storage formats are the most general: they make absolutely no assumptions about the sparsity structure of the matrix. where row ind stores the row indices of each nonzero. Assuming we have a nonsymmetric sparse matrix A. which is also called the Harwell-Boeing sparse matrix format [77]. row ptr} given below val col ind 10 1 -2 5 3 1 9 2 1 3 6 3 7 2 6 8 3 9 7 4 13 3 ··· 9 1 ··· 5 17 20 13 6 . row ind. and perhaps a limited number of zeros. The trade-off is a more complicated algorithm with a somewhat different pattern of data access. needing an indirect addressing step for every single scalar operation in a matrix-vector product or preconditioner solve. The storage savings for this approach is significant. This. large-scale linear systems of the form Ax = b can be most efficiently solved if the zero elements of A are not stored. that is. col ind. The col ind vector stores the column indexes of the elements in the val vector. col ptr}. of course.1) is given by . as they are traversed in a row-wise fashion. and they don’t store any unnecessary elements. 4 2 2 5 -1 6 row ptr If the matrix A is symmetric. There are many methods for storing the data (see for instance Saad [185] and Eijkhout [86]). The CCS format for the matrix A in (4. The Compressed Row Storage (CRS) format puts the subsequent nonzeros of the matrix rows in contiguous memory locations. By convention. In other words. Jagged Diagonal Storage. row ptr). and col ptr stores the index of the elements in val which start a column of A.3. if val(k) = ai. As an example.1) A= 5 0   3 0 8 7   0 8 0 9 9 13  0 4 0 0 2 −1 The CRS format for this matrix is then specified by the arrays {val. DATA STRUCTURES 57 4.1 Survey of Sparse Matrix Storage Formats If the coefficient matrix A is sparse. (4. Instead of storing n2 elements. That is. consider the nonsymmetric matrix A defined by   10 0 0 0 −2 0  3 9 0 0 0 3     0 7 8 7 0 0    . the CCS format is the CRS format for AT . we create 3 vectors: one for floating-point numbers (val).j then col ind(k) = j. The val vector stores the values of the nonzero elements of the matrix A. On the other hand. we need only 2nnz + n + 1 storage locations. where nnz is the number of nonzeros in the matrix A. Here we will discuss Compressed Row and Column Storage. if val(k) = ai.4. The CCS format is specified by the 3 arrays {val. we need only store the upper (or lower) triangular portion of the matrix. and Skyline Storage.3. The CCS format is identical to the CRS format except that the columns of A are stored (traversed) instead of the rows. Sparse storage schemes allocate contiguous storage in memory for the nonzero elements of the matrix. Diagonal Storage. Compressed Column Storage (CCS) Analogous to Compressed Row Storage there is Compressed Column Storage (CCS). The row ptr vector stores the locations in the val vector that start a row.j then row ptr(i) ≤ k < row ptr(i + 1).
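As an illustration (not part of the original text), the CRS arrays can be generated from a dense matrix as follows; with the matrix A of (4.1) this should reproduce the arrays shown above. One-based indices are kept to match the text.

    def dense_to_crs(A):
        # Build {val, col_ind, row_ptr} from a dense matrix (list of rows
        # or numpy array), storing nonzeros row by row.
        val, col_ind, row_ptr = [], [], [1]
        for row in A:
            for j, a in enumerate(row):
                if a != 0:
                    val.append(a)
                    col_ind.append(j + 1)        # 1-based column index
            row_ptr.append(len(val) + 1)         # so that row_ptr(n+1) = nnz + 1
        return val, col_ind, row_ptr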

This storage scheme is particularly useful if the matrix arises from a finite element or finite difference discretization on a tensor product grid. q. Block matrices typically arise from the discretization of partial differential equations in which there are several degrees of freedom associated with a grid point. which unlike CDS. RELATED ISSUES 8 ··· 9 4 ··· 5 17 2 6 20 .j ) is banded if there are nonnegative constants p. we store this matrix A in an array of dimension (6. called the left and right halfbandwidth.:) array are (4.-p:q). We then partition the matrix in small blocks with a size equal to the number of degrees of freedom.i+j . an integer array (col ind(1 : nnzb)) which stores the actual column indices in the original matrix A of the (1. even though it may have some zeros.1 : nb . Usually. Compressed Diagonal Storage (CDS) If the matrix A is banded with bandwidth that is fairly constant from row to row.-1:1) using the mapping val(i.n) corresponds to the LINPACK band format [72].1 : nb )) which stores the nonzero blocks in (block) row-wise fashion. We say that the matrix A = (ai. we can modify the CRS (or CCS) format to exploit such block patterns. (4.:) and col ind(:).3) . the rows of the val(:.j = 0 only if i − p ≤ j ≤ i + q. 1) elements of the nonzero blocks. In this case. band formats involve storing some zeros. such that ai. we can pack the nonzero elements in such a way as to make the matrix-vector product more efficient. If nb is the dimension of each block and nnzb is the number of nonzero blocks in the n × n matrix A. The CDS format may even contain some array elements that do not correspond to matrix elements at all. then it is worthwhile to take advantage of this structure in the storage scheme by storing subdiagonals of the matrix in consecutive locations. The savings in storage locations and reduction in indirect addressing for BCRS over CRS can be significant for matrices with a large nb . Similar to the CRS format.2) A=  0 0 8 7 5 0     0 0 0 9 9 13  0 0 0 0 2 −1 Using the CDS format. we require 3 arrays for the BCRS format: a rectangular array for floating-point numbers ( val(1 : nnzb. we can allocate for the matrix A an array val(1:n.58 val row ind 10 1 3 2 3 4 9 2 1 7 3 4 8 5 8 4 6 10 8 3 13 CHAPTER 4. and treat each block as a dense matrix. The declaration with reversed dimensions (-p:q. j) = ai. 3 2 13 5 -1 6 col ptr Block Compressed Row Storage (BCRS) If the sparse matrix A is comprised of square dense blocks of nonzeros in some regular pattern. then the total storage needed is nnz = nnzb × n2 . Consider the nonsymmetric matrix A defined by   10 −3 0 0 0 0  3 9 6 0 0 0     0 7 8 7 0 0   . does not allow for an efficiently vectorizable matrix-vector multiplication if p + q is small. The block dimension nd of A is then b defined by nd = n/nb .:. Hence. Not only can we eliminate the vector identifying the column and row. and a pointer array (row blk(1 : nd + 1)) whose entries point to the beginning of each block row in val(:.
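The mapping val(i, j) = a_{i,i+j} can be sketched as follows; this is an illustration with 0-based indexing, and the array layout, with diagonal j stored in column j + p, is an assumption of this example.

    import numpy as np

    def dense_to_cds(A, p, q):
        # Pack a banded matrix with left halfbandwidth p and right
        # halfbandwidth q into the CDS array of shape (n, p+q+1).
        n = A.shape[0]
        val = np.zeros((n, p + q + 1))
        for d in range(-p, q + 1):                     # diagonal offset
            for i in range(max(0, -d), min(n, n - d)): # keep 0 <= i+d < n
                val[i, d + p] = A[i, i + d]
        return val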

:) array would be val(:. In the matrix from (4.4) 6 0 7 5 4   0   0 0 0 0 9 13  0 0 0 0 5 −1 the 4 stripes of the matrix A stored in the rows of the val(:. 3) 1 val(:. As defined in [153].:) is also stored: val(:. respectively. called ITPACK storage or Purdue storage. 0) val(:. For the nonsymmetric matrix A defined by   10 −3 0 1 0 0  0 9 6 0 −2 0     3 0 8 7 0 0   . σ(j)) are in S. Like the Compressed Diagonal format. but it makes the matrix-vector product slightly more expensive.-1) val(:. then i < j → σ(i) < σ(j). σk (i)) element of A in stripe Sk is multiplied with both xi and xσk (i) and these products are accumulated in yσk (i) and yi . Corresponding to the array of matrix elements val(:. . an array of column indices. σ(i)) and (j. A generalization of the CDS format more suitable for manipulating general sparse matrices on vector supercomputers is discussed by Melhem in [153]. . Notice the two zeros corresponding to non-existing matrix elements. 0 0 0 10 0 0 0 9 -3 1 3 8 6 -2 6 7 7 0 0 9 5 4 5 -1 13 0 .+1) val(:.  A= (4. This variant of CDS uses a stripe data structure to store the matrix A. All rows are padded with zeros on the right to give them equal length. 1) 10 val(:.3. where In = {1.+2) Jagged Diagonal Storage (JDS) The Jagged Diagonal Storage format can be useful for the implementation of iterative methods on parallel and vector processors (see Saad [184]). if (i.4) all elements are shifted left:     10 −3 1 10 −3 0 1 0 0  9   0 6 −2 9 6 0 −2 0        3  3 8 7 0 8 7 0 0    −→    6  0 7 5 4  6 0 7 5 4        9 13  0 0 0 0 9 13  0 0 0 0 5 −1 5 −1 after which the columns are stored consecutively. Specifically.:).4. . i ∈ I ⊆ In }.-1) val(:. as it involves a gather operation.+1) 0 10 -3 3 9 6 7 8 7 8 7 5 9 9 13 2 -1 0 59 . col ind(:. σ(i)). 4) 0 9 6 −2 0 3 8 7 0 6 9 7 13 5 0 4 0 5 −1 . 0) val(:. . This structure is more efficient in storage in the case of varying bandwidth. . DATA STRUCTURES val(:. It is more space-efficient than CDS at the cost of a gather/scatter operation. 2) −3 val(:. it gives a vector length essentially of the size of the matrix. A simplified form of JDS. each (i. can be described as follows. n} and σ is a strictly increasing function. a stripe in the n × n matrix A is a set of positions S = {(i. When computing the matrix-vector product y = Ax using stripes.

The JDS format for the above matrix A in using the linear arrays {perm. 2) ind(:. A major advantage of solving linear systems having skyline coefficient matrices is that when pivoting is not necessary. the largest number of nonzeros in any row of A. given an input vector x we want to compute products y = Ax and y = AT x.60 col col col col ind(:. An additional pointer is then needed to determine where the diagonal elements. discussed by Saad in [185]. an integer array (col ind(:)) containing the corresponding column indices indices. 13 6 perm jd ptr Skyline Storage (SKS) The final storage scheme we consider is for skyline matrices. For a nonsymmetric skyline matrix such as the one illustrated in Figure 4. in the CRS format.3. The data structure to represent the n × n matrix A therefore consists of a permutation array (perm(1:n)) which reorders the rows. 5. Therefore. which are also called variable band or profile matrices (see Duff. especially if the bandwidth of the matrix varies strongly. If the matrix is symmetric. 3 · · · 6. The advantages of JDS for matrix multiplications are discussed by Saad in [184]. that is. the skyline structure is preserved during Gaussian elimination. A straightforward approach in storing the elements of a skyline matrix is to place all the rows (in order) into a floating-point array (val(:)).e. It is mostly of importance in direct solution methods. col ind. we store the lower triangular elements in SKS format. The compressed and permuted diagonals are then stored in a linear array. both the product of a matrix and that of its transpose times a vector are needed. but it can be used for handling the diagonal blocks in block matrix factorization methods. The number of jagged diagonals is equal to the number of nonzeros in the first row. and then keep an integer array (row ptr(:)) whose elements point to the beginning of each row. 6 9 4 9 2 8 · · · -1. and finally a pointer array (jd ptr(:)) whose elements point to the beginning of each jagged diagonal.1. jdiag. 1) ind(:. 4) 1 2 4 0 2 1 3 3 5 4 0 0 2 4 5 6 5 6 0 0 5 6 . 4. are located. and store the upper triangular elements in a column-oriented SKS format (transpose stored in row-wise SKS format).3: CRS and CDS. a floating-point array (jdiag(:)) containing the jagged diagonals in succession. 0 0 CHAPTER 4. We will present these algorithms for two of the storage formats from §4. These two separated substructures can be linked in a variety of ways. .2 Matrix vector products In many of the iterative methods discussed earlier. Erisman and Reid [79]). is to store each row of the lower triangular part and each column of the upper triangular part contiguously into the floatingpoint array (val(:)). The new data structure is called jagged diagonals. 5. 1 7 9 5 6 3 13 7 4 17 5. we reorder the rows of the matrix decreasingly according to the number of nonzeros per row. One approach. . 3) ind(:. which separate the lower triangular elements from the upper triangular elements. i. RELATED ISSUES It is clear that the padding zeros in this structure may be a disadvantage. jd ptr} is given below (jagged diagonals are separated by semicolons) jdiag col ind 1 1 3 1 5 7 2 2 8 3 3 10 1 4 1 2. we only store its lower triangular part.. The column indices of the nonzeros stored in val(:) are easily derived and are not stored.

Since this method only multiplies nonzero matrix entries.4. The matrix-vector multiplication involving AT is then given by for i = 1. for j = 1.i xj . Hence.i xj . the matrix-vector multiplication is given by for i = 1.j xj = j aj. For the transpose product y = AT x we cannot use the equation yi = j (AT )i. n . DATA STRUCTURES + x x x + x x x x + x x 61 x x + x x x + x x x x x x + x x x x x x x + x x x x x x x + x x x + x x x x + x x x x x x x + x x x x + x x x x x x x x + x x x + x x x + x x x x x + x x + Figure 4. row_ptr(i+1) . CRS Matrix-Vector Product The matrix vector product y = Ax using CRS format can be expressed in the usual way: yi = j ai. since this implies traversing columns of the matrix. For an n×n matrix A.3. n y(i) = 0 end. we switch indices to for all j.j xj . the operation count is 2 times the number of nonzero elements in A.1 y(i) = y(i) + val(j) * x(col_ind(j)) end. which is a significant savings over the dense operation requirement of 2n2 . do for all i: yi ← yi + aj. end. n y(i) = 0 for j = row_ptr(i). since this traverses the rows of the matrix A. an extremely inefficient operation for matrices stored in CRS format.1: Profile of a nonsymmetric skyline or variable-band matrix.
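For concreteness, here are 0-based Python counterparts of the two CRS products above (illustrative sketches that mirror the pseudocode; they are not meant to be fast).

    import numpy as np

    def crs_matvec(val, col_ind, row_ptr, x, n):
        # y = A x: gather along each row of A.
        y = np.zeros(n)
        for i in range(n):
            for k in range(row_ptr[i], row_ptr[i + 1]):
                y[i] += val[k] * x[col_ind[k]]
        return y

    def crs_matvec_transpose(val, col_ind, row_ptr, x, n):
        # y = A^T x: scatter the contribution of each row of A.
        y = np.zeros(n)
        for j in range(n):
            for k in range(row_ptr[j], row_ptr[j + 1]):
                y[col_ind[k]] += val[k] * x[j]
        return y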

n y(i) = 0 end.q with p and q the (nonnegative) numbers of diagonals to the left and right of the main diagonal.i+j−j xi+j . end.62 CHAPTER 4. row_ptr(j+1)-1 y(col_ind(i)) = y(col_ind(i)) + val(i) * x(j) end. i + j ≤ n. diag_left for loc = max(1. for diag = -diag_left. and both reads and writes the result vector y. the first product (y = Ax) has a more favorable memory access pattern in that (per iteration of the outer loop) it reads two vectors of data (a row of matrix A and the input vector x) and writes one scalar..i xj yi + ai+j.n-diag) y(loc) = y(loc) + val(loc. min(n.g. The idea is to make a change in coordinates in the doubly-nested loop. n y(i) = 0 end. Cray Y-MP). but this does not take advantage of the CDS format. With the index i in the inner loop we see that the expression ai.1-diag). RELATED ISSUES for i = row_ptr(j).i+j accesses the jth diagonal of the matrix (where the main diagonal has number 0). end.diag) * x(loc+diag) end. the memory traffic will then limit the performance. diag_right for loc = max(1.i+j xi+j . end. Using the update formula yi we obtain for i = 1. The algorithm will now have a doubly-nested loop with the outer loop enumerating the diagonals diag=-p.n-diag) y(loc) = y(loc) + val(loc+diag. min(n. Hence. ← = yi + ai+j. The bounds for the inner loop follow from the requirement that 1 ≤ i. for diag = -diag_right. one row of matrix A. Both matrix-vector products above have largely the same structure.1-diag). it is still possible to perform a matrix-vector product y = Ax by either rows or columns. and both use indirect addressing. The transpose product (y = AT x) on the other hand reads one element of the input vector. their vectorizability properties are the same on any given computer. This is an important consideration for RISC-based architectures. -diag) * x(loc+diag) end. However. CDS Matrix-Vector Product If the n × n matrix A is stored in CDS format. Replacing j → i + j we get yi ← yi + ai. The algorithm becomes for i = 1. Unless the machine on which these methods are implemented has three separate memory paths (e.j xj ⇒ yi ← yi + ai. The transpose matrix-vector product y = AT x is a minor variation of the algorithm above.
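A 0-based counterpart of the CDS product above may be sketched as follows (illustrative only; val is assumed to have shape (n, p + q + 1), with diagonal d stored in column d + p).

    import numpy as np

    def cds_matvec(val, p, q, x):
        n = val.shape[0]
        y = np.zeros(n)
        for d in range(-p, q + 1):
            lo, hi = max(0, -d), min(n, n - d)       # keep 0 <= i+d < n
            i = np.arange(lo, hi)
            y[i] += val[i, d + p] * x[i + d]         # y_i += a_{i,i+d} x_{i+d}
        return y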

such as QM R.2). diag_ptr(col_ind(j))-1 if(col_ind(k) = i) then found = TRUE element = val(k) endif end. we assume that an extra integer array diag ptr(1:n) has been allocated that contains the column (or row) indices of the diagonal elements in each row. that involve a transpose matrix vector product we need to consider solving a system with the transpose of as factorization as well. and an incomplete factorization preconditioner of the form (DA + −1 LA )DA (DA + UA ). n pivots(i) = val(diag_ptr(i)) end. row_ptr(i+1)-1 found = FALSE for k = row_ptr(col_ind(j)). but they are essential for complicated differential equations. Generating a CRS-based D-ILU Incomplete Factorization In this subsection we will consider a matrix split as A = DA + LA + UA in diagonal. Such factorizations considerably more complicated to code. and routines to solve a system with such a factorization. and the algorithm is vectorizable with vector lengths of essentially the matrix order n. Hence. For iterative methods. . we next check whether aj.3 Sparse Incomplete Factorizations Efficient preconditioners for iterative methods can be found by performing an incomplete factorization of the coefficient matrix. Additionally. val(diag ptr(i)) = ai.4.3. Each elimination step starts by inverting the pivot for i = 1. since this is the only element that can cause fill with ai. lower and upper triangular part. we will store the inverses of the pivots rather than the pivots themselves.4. DATA STRUCTURES 63 The memory access for the CDS-based matrix-vector product y = Ax (or y = AT x) is three vectors per inner iteration. for j = diag_ptr(i)+1.3.it suffices to allocate for the preconditioner only a pivot array of length n (pivots(1:n)). that is. This implies that during the system solution no divisions have to be performed. that is. In fact. Later we will consider factorizations that allow higher levels of fill. On the other hand.j with j > i.i is a nonzero matrix element. In this section. Because of the regular data access. In this way. 4.j . we only need to store a diagonal matrix D containing the pivots of the factorization.i . there is no indirect addressing. the simplest type of factorization in which no “fill” is allowed. even if the matrix has a nonzero in the fill position (see section 3. The solution routines are applicable in both cases. most machines can perform this algorithm efficiently by keeping three base registers and using simple offset addressing. n pivots(i) = 1 / pivots(i) For all nonzero elements ai. At first we only consider a factorization of the D-ILU type. we discuss the incomplete factorization of an n × n matrix A stored in the CRS format. The factorization begins by copying the matrix diagonal for i = 1.
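The D-ILU construction just described can be transcribed as follows. This is a 0-based sketch, not the book's code; diag_ptr[i] is assumed to hold the position of a_{ii} in val, column indices within a row are assumed sorted, and the returned array holds the inverted pivots.

    import numpy as np

    def dilu_crs(val, col_ind, row_ptr, diag_ptr, n):
        pivots = np.array([val[diag_ptr[i]] for i in range(n)], dtype=float)
        for i in range(n):
            pivots[i] = 1.0 / pivots[i]                       # invert pivot i
            for j in range(diag_ptr[i] + 1, row_ptr[i + 1]):  # a_ik with k > i
                k = col_ind[j]
                for m in range(row_ptr[k], diag_ptr[k]):      # search a_ki
                    if col_ind[m] == i:
                        pivots[k] -= val[m] * pivots[i] * val[j]
                        break
        return pivots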

(D + LA )D−1 (D + UA ) (I + LA D−1 )(D + UA ) (D + LA )(I + D−1 UA ) (I + LA D−1 )D(I + D−1 UA ) We have a choice between several equivalent ways of solving the system: The first and fourth formulae are not suitable since they require both multiplication and division with D. . Upper triangular systems are treated in a similar manner. . CRS-based Factorization Solve The system LU y = x can be solved in the usual manner by introducing a temporary vector z: Lz = x. z(i) = pivots(i) * (x(i)-sum) end. and so on.pivots(i) * sum end. n sum = 0 for j = row_ptr(i). we update aj. The temporary vector z can be eliminated by reusing the space for y. CRS-based Factorization Transpose Solve Solving the transpose system (LU )T y = x is slightly more involved.j . In the usual formulation we traverse rows when solving a factored system. algorithmically. (step -1) sum = 0 for j = diag(i)+1. z can even overwrite x. the difference between the second and third is only one of ease of coding. 1. end. LU = = = = U y = z. for i = n. For instance. for i = 1.64 If so. row_ptr(i+1)-1 sum = sum + val(j) * y(col_ind(j)) y(i) = z(i) . Both halves of the solution have largely the same structure as the matrix vector multiplication. RELATED ISSUES if (found = TRUE) val(diag_ptr(col_ind(j))) = val(diag_ptr(col_ind(j))) . in the next section we will use the second for the transpose system solution. l∗2 . The key idea is to distribute each newly computed component of a triangular solve immediately over the remaining right-hand-side. The algorithm now becomes . if we write a lower triangular matrix as L = (l∗1 . but here we can only access columns of the matrices LT and U T (at less than prohibitive cost).element * pivots(i) * val(j) end. after computing y1 we modify x ← x − l∗1 y1 . Hence. but overwriting input data is in general not recommended. With this algorithm we only access columns of the triangular systems. In this section we use the third formula. . Solving a transpose system with a matrix stored in CRS format essentially means that we access rows of L and U . diag_ptr(i)-1 sum = sum + val(j) * z(col_ind(j)) end. then the system Ly = x can be written as x = l∗1 y1 + l∗2 y2 + · · ·. CHAPTER 4. ). end.
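A 0-based sketch of the corresponding solve with M = (D + L_A)(I + D^{-1} U_A) is given below (illustrative only; pivots is the array of inverted pivots produced above).

    import numpy as np

    def dilu_solve(val, col_ind, row_ptr, diag_ptr, pivots, x):
        n = len(pivots)
        z = np.zeros(n)
        for i in range(n):                                   # (D + L_A) z = x
            s = sum(val[j] * z[col_ind[j]] for j in range(row_ptr[i], diag_ptr[i]))
            z[i] = pivots[i] * (x[i] - s)
        y = np.zeros(n)
        for i in range(n - 1, -1, -1):                       # (I + D^{-1} U_A) y = z
            s = sum(val[j] * y[col_ind[j]] for j in range(diag_ptr[i] + 1, row_ptr[i + 1]))
            y[i] = z[i] - pivots[i] * s
        return y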

y is stored as ly.ly if w( yind(i) ) <> 0 then lx = lx+1 3 This is not counting the initial zeroing of the w array. end. Let lx be the number of nonzero components in x. they require more storage.3. for i = 1. and has essentially the same memory traffic patterns as that of the matrix-vector product. indirect addressing. and are considerably harder to implement (much of this section is based on algorithms for a full factorization of a sparse matrix as found in Duff. we need an algorithm for adding two vectors x and y.4. yind.tmp * val(j) end. and let xind be an integer array such that if xind(j)=i then x(j) = xi . On the other hand. Erisman and Reid [79]). Similarly.val(j) * y(i) end. As a preliminary. n z(i) = x_tmp(i) tmp = pivots(i) * z(i) for j = diag_ptr(i)+1.ly w( yind(i) ) = y(i) % add w to x wherever x is already nonzero for i=1. We now add x ← x + y by first copying y into a full vector w then adding w to x. n x_tmp(i) = x(i) end. end. Overall. diag_ptr(i)-1 z(col_ind(j)) = z(col_ind(j)) .lx if w( xind(i) ) <> 0 x(i) = x(i) + w( xind(i) ) w( xind(i) ) = 0 % add w to x by creating new components % wherever x is still zero for i=1. 1 (step -1) y(i) = pivots(i) * z(i) for j = row_ptr(i). both stored in sparse storage. let x be stored in x. Both x tmp and z can be considered to be equivalent to y. for i = n. row_ptr(i+1)-1 x_tmp(col_ind(j)) = x_tmp(col_ind(j)) . y. . DATA STRUCTURES for i = 1. and can be overlapped with z. a CRS-based preconditioner solve uses short vector lengths. 65 The extra temporary x tmp is used only for clarity. Generating a CRS-based ILU (k) Incomplete Factorization Incomplete factorizations with several levels of fill allowed are more accurate than the D-ILU factorization described above. The total number of operations will be O(lx + ly)3 : % copy y into w for i=1.

we will add levels to the nonzero components: we assume integer vectors xlev and ylev of length lx and ly respectively.i) for j=k+1. we add the y (i) vectors into w before executing the writes into x. % carry over levels for i=1. we use a sparse form of the following factorization scheme: for k=1.1)). The particular form of the factorization is chosen to minimize the number of times that the full vector w is copied back to sparse form. where L.n for j=1. . and a full length level vector wlev corresponding to w. D−1 . We will not discuss this possibility any further. RELATED ISSUES In order to add a sequence of vectors x ← x + k y (k) .n a(k.i) = a(k. where w is allocated as a sparse vector and its sparsity pattern is constructed during the additions.ly if w( yind(i) ) <> 0 then lx = lx+1 x(lx) = w( yind(i) ) xind(lx) = yind(i) xlev(lx) = wlev( yind(i) ) endif We can now describe the ILU (k) factorization. The addition algorithm then becomes: % copy y into w for i=1.2).j) = a(k. Such preprocessing and reuse of information is not possible with fill controlled by a drop tolerance criterium. % don’t change the levels for i=1. We will describe an incomplete factorization that controls fill-in through levels (see equation (3.i) .66 xind(lx) = yind(i) x(lx) = w( yind(i) ) endif CHAPTER 4.lx if w( xind(i) ) <> 0 x(i) = x(i) + w( xind(i) ) w( xind(i) ) = 0 % add w to x by creating new components % wherever x is still zero. and gradually builds up a factorization M of the form M = (D + L)(I + D−1 U ). Specifically.k) This is a row-oriented version of the traditional ‘left-looking’ factorization algorithm.k-1 for i=j+1. A different implementation would be possible. The algorithm starts out with the matrix A. diagonal and upper triangle of the array M respectively. and D−1 U are stored in the lower triangle.a(k.n a(k. determining storage demands and reusing this information through a number of linear systems of the same sparsity structure.j)/a(k. For a slight refinement of the above algorithm.ly w( yind(i) ) = y(i) wlev( yind(i) ) = ylev(i) % add w to x wherever x is already nonzero.4. With fill levels we can perform the factorization symbolically at first. but this is less attractive from a point of implementation. Alternatively we could use a drop tolerance (section 3.j)*a(j.

• preconditioner solves.4 Parallelism In this section we discuss aspects of parallelism in the iterative methods discussed in this book. The basic time-consuming kernels of iterative schemes are: • inner products.mptr(row+1)-1 if mind(col) > row M(col) = M(col) * M(mdiag(row)) The structure of a particular sparse matrix is likely to apply to a sequence of problems. but arrays adiag and mdiag point to the locations of the diagonal elements.g.aptr(row+1)-1 if aind(col) < row then acol = aind(col) MULTIPLY ROW acol OF M() BY A(col) SUBTRACT THE RESULT FROM w ALLOWING FILL-IN UP TO LEVEL k endif INSERT w IN ROW row OF M() % invert the pivot M(mdiag(row)) = 1/M(mdiag(row)) % normalize the row of U for col=mptr(row). 4. • vector updates. Thus it may pay off to perform the above incomplete factorization first symbolically to determine the amount and location of fill-in and use this structure for the numerically different but structurally identical matrices.4. and block operations in the GMRES method. namely computational wavefronts in the SOR method.4. e. This can be done either with an all-to-all send where every . each processor computes the inner product of corresponding segments of each vector (local inner products or LIPs). PARALLELISM 67 The matrix arrays A and M are assumed to be in compressed row storage. the array for the numerical values can be used to store the levels during the symbolic factorization phase. We will examine each of these in turn. On distributed-memory machines the LIPs then have to be sent to other processors to be combined for the global inner product. 4. with no particular ordering of the elements inside each row. for instance on different time-steps. or during a Newton iteration.1 Inner products The computation of an inner product of two vectors can be easily parallelized.4. In this case. Ap(i) (for some methods also AT p(i) ). We will conclude this section by discussing two particular issues. Since the iterative methods share most of their computational kernels we will discuss these independent of the method.col) with col<row COPY ROW row OF A() INTO DENSE VECTOR w for col=aptr(row). • matrix–vector products..n % go through elements A(row. for row=1.

Since on a distributed-memory machine communication is needed for the inner product. r(i+1) = r(i) − αi q (i) . in the usual formulation of conjugate gradient-type methods the inner products induce a synchronization of the processors. Clearly. so one first computes L−1 r(i) . Another advantage over other approaches (see below) is that no additional operations are required. Figure 4. w(i) = L−T s.2: A rearrangement of Conjugate Gradient for parallelism Under the assumptions that we have made.2 shows a variant of CG. ρi+1 = (s.. which can only begin after completing the inner product for βi−1 . Figure 4. this step requires communication. where one processor performs the summations. The computation of L−T s will then mask the communication stage of the inner product. γ = (p(i) . q (i) = Ap(i) . we cannot overlap this communication with useful computation. after which the inner T product r(i) M −1 r(i) can be computed as sT s where s = L−1 r(i) .68 CHAPTER 4. s) for i = 0. or as a piece of serial code. r(0) = b − Ax(0) .. or by a global accumulation in one processor. The same observation applies to updating p(i) . ρ0 = (s. p(−1) = 0. This is just a reorganized version of the original CG scheme. p(i) = w(i) + βi−1 p(i−1) . s). followed by a broadcast of the final result. s = L−1 r(i+1) . and is therefore precisely as stable. Overlapping communication and computation Clearly. x(−1) = x(0) = initial guess. q (i) ). end. the accumulation of LIPs can be implemented as a critical section where all processors add their local result in turn to the global result. . For shared-memory machines. This rearrangement is based on two tricks. 2. x(i) = x(i−1) + αi−1 p(i−1) . in which all communication time may be overlapped with useful computations. s = L−1 r(0) . endif βi = ρi+1 /ρi . The first is that updating the iterate is delayed to T mask the communication stage of the p(i) Ap(i) inner product.. CG can be efficiently parallelized as follows: . β−1 = 0. RELATED ISSUES processor performs the summation of the LIPs. if r(i+1) small enough then x(i+1) = x(i) + αi p(i) quit. 1. The second trick relies on splitting the (symmetric) preconditioner as M = LLT . since they cannot progress until the final result has been computed: updating x(i+1) and r(i+1) can only begin after completing the inner product for αi . α−1 = 0. αi = ρi /γ.

3.4. We will mention a number of approaches to obtaining parallelism in preconditioning. PARALLELISM 69 1. and this can be followed by the computation of a part of the inner product. we may need communication for other elements of the vector. A first such method was proposed by Saad [181] with a modification to improve its stability suggested by Meurant [155]. 2.3 Matrix-vector products The matrix–vector products are often easily parallelized on shared-memory machines by splitting the matrix in strips corresponding to the vector segments. matrices stemming from finite difference or finite element problems typically involve only local connections: matrix element ai. Recently. This algorithm can be extended trivially to preconditioners of LDLT form.2 Vector updates Vector updates are trivially parallelizable: each processor updates its own segment. If the number of connections to these neighboring blocks is small compared to the number of internal nodes. Heath and Van der Vorst [66]. which may lead to communication bottlenecks.4) and orthogonalize these as a block (see Van Rosendale [208]. For more detailed discussions on implementation aspects for distributed memory systems. each processor requires the values of pi at some nodes in neighboring blocks. into suitable blocks and to distribute them over the processors.4 Preconditioning Preconditioning is often the most problematic part of parallelizing an iterative method. then the communication time can be overlapped with computational work. and Chronopoulos and Gear [54]). For distributed-memory machines.4. Each processor then computes the matrix–vector product of one strip. and Eijkhout [87]. 3. The reduction of the inner product for ρi+1 can be overlapped with the remaining part of the preconditioning operation at the beginning of the next iteration. D’Azevedo and Romine [61]. For example. 4. The computation of a segment of p(i) can be followed immediately by the computation of a segment of q (i) . In such a case. many sparse matrix problems arise from a network in which only nearby nodes are connected. related methods have been proposed by Chronopoulos and Gear [54].4. (which could in fact have been done in the previous iteration step). The stability of such methods is discussed by D’Azevedo. When computing Api . see De Sturler [62] and Pommerell [174]. there may be a problem if each processor has only a segment of the vector in its memory. For a more detailed discussion see Demmel. 4.j is nonzero only if variables i and j are physically close. However. 4. One strategy has been to replace one of the two inner products typically present in conjugate gradient-like methods by one or two others in such a way that all inner products can be performed simultaneously. and nonsymmetric preconditioners in the Biconjugate Gradient Method. The communication required for the reduction of the inner product for γ can be overlapped with the update for x(i) .4. it seems natural to subdivide the network. This saves on load operations for segments of p(i) and q (i) . Depending on the bandwidth of the matrix. These schemes can also be applied to nonsymmetric methods such as BiCG. or grid. The global communication can then be packaged. Eijkhout and Romine [60].4. . Fewer synchronization points Several authors have found ways to eliminate some of the synchronization points induced by the inner products in methods such as CG. Another approach is to generate a number of successive Krylov vectors (see §2.

This requires communication similar to that for matrix–vector products. Radicati di Brozolo and Robert [177] suggest that one construct incomplete decompositions on slightly overlapping domains. Incomplete factorization preconditioners are usually much harder to parallelize: using wavefronts of independent computations (see for instance Paolini and Radicati di Brozolo [169]) a modest amount of parallelism can be attained. but the implementation is complicated. 203]). Fully decoupled preconditioners. Next.4). RELATED ISSUES Discovering parallelism in sequential preconditioners. Thus we can. Certain preconditioners were not developed with parallelism in mind. 4. It is clear that all elements of a wavefront can be processed simultaneously. We start by partitioning the unknowns in wavefronts. though they are perhaps less efficient in purely scalar terms than their ancestors. which provide a high degree of coarse grained parallelism. the overall method is technically a block Jacobi preconditioner (see §3. all wavefronts of the same color are processed simultaneously. a central difference discretization on regular grids gives wavefronts that are hyperplanes (see Van der Vorst [201. since the minimal number of sequential steps equals the color number of the matrix graphs. subsequent wavefronts are then sets (this definition is not necessarily unique) of successors of elements of the previous wavefront(s). such that no successor/predecessor relations hold among the elements of this set. and polynomial preconditioners (see §3. Variants of existing sequential incomplete factorization preconditioners with a higher degree of parallelism have been devised. expansion of the factors in a truncated Neumann series (see Van der Vorst [199]).5 Wavefronts in the Gauss-Seidel and Conjugate Gradient methods At first sight. In the multi-color ordering. This reduces the number of sequential steps for solving the Gauss-Seidel matrix to the number of colors. since improvement in the number of iterations may be only modest. we observe that the unknowns in a wavefront can be computed as soon as all wavefronts containing its predecessors have been computed. and multicolor preconditioners. however. 127]. For instance. so the sequential time of solving a system with D − L can be reduced to the number of wavefronts.1).2. More parallel variants of sequential preconditioners. For theory and appplications to parallelism see Jones and Plassman [126. . they may not be as effective as more complicated. which have the same parallelism as the matrix-vector product. While their parallel execution is very efficient. reveals a high degree of parallelism if the method is applied to sparse matrices such as those arising from discretized partial differential equations. To get a bigger improvement while retaining the efficient parallel execution. Some examples are: reorderings of the variables (see Duff and Meurant [78] and Eijkhout [84]). Multicolor preconditioners have optimal parallelism among incomplete factorization methods. various block factorization methods (see Axelsson and Eijkhout [15] and Axelsson and Polman [21]). the Gauss-Seidel method (and the SOR method which has the same basic structure) seems to be a fully sequential method. which is the smallest number d such that wavefront i contains no elements that are a predecessor of an element in wavefront i + d. have components from several iterations being computed simultaneously.4. less parallel preconditioners. in the absence of tests for convergence. 
A more careful analysis.70 CHAPTER 4. Adams and Jordan [2] observe that in this way the natural ordering of unknowns gives an iterative method that is mathematically equivalent to a multi-color ordering.5). but they can be executed in parallel. The first wavefront contains those unknowns that (in the directed graph of D − L) have no predecessor. Some examples are domain decomposition methods (see §5. If all processors execute their part of the preconditioner solve without further communication.

Both approaches significantly increase the computational work.. M −1 Av (j) . . . This may increase the efficiency of the computation significantly..4.. this is called m-step GMRES(m) (see Kim and Chronopoulos [138]). 4.3. and to orthogonalize this set afterwards. Av (1) . that is. Whether this is offset by the increased parallelism depends on the application and the computer architecture. To incorporate Level 2 BLAS we can apply either Householder orthogonalization or classical Gram-Schmidt twice (which mitigates classical Gram-Schmidt’s potential instability.4. inner products and vector updates. v (2) .4. their communication can be packaged. In our version. m (1) A v } for the Krylov subspace first. Another way to obtain more parallelism and data locality is to generate a basis {v (1) . .6 Blocked operations in the GMRES method In addition to the usual matrix-vector product. the numerical instability due to generating a possibly near-dependent set is not necessarily a drawback. In fact. is orthogonalized against the previously built orthogonal set {v (1) . the preconditioned GMRES method (see §2. where each new vector is immediately orthogonalized to all previous vectors. The above observation that the Gauss-Seidel method with the natural ordering is equivalent to a multicoloring cannot be extended to the SSOR method or wavefront-based incomplete factorization preconditioners for the Conjugate Gradient method. . which may be quite inefficient.4. . tests by Duff and Meurant [78] and an analysis by Eijkhout [84] show that multicolor incomplete factorization preconditioners in general may take a considerably larger number of iterations to converge than preconditioners based on the natural ordering.4) has a kernel where one new vector. SOR theory still holds in an approximate sense for multicolored matrices.3. in contrast to CG.. but using classical Gram-Schmidt has the advantage that all inner products can be performed simultaneously.) This approach does not increase the computational work and. this is done using Level 1 BLAS. see Saad [184]). v (j) }. PARALLELISM 71 As demonstrated by O’Leary [163]. (Compare this to the GMRES method in §2.

RELATED ISSUES .72 CHAPTER 4.

. we have Rk = Pk Bk . .   . ..1 The Lanczos Connection As discussed by Paige and Saunders in [167] and by Golub and Van Loan in [108]. From the equations p(j) = r(j−1) + βj−1 p(j−1) .  1 −β2 . . p (k)T . p(k) ]. . and the k × k upper bidiagonal matrix Bk by  1 −β1 ··· 0  . . .. 3. it is straightforward to derive the conjugate gradient method for solving symmetric positive definite linear systems from the Lanczos algorithm for solving symmetric eigensystems and vice versa. Suppose we define the n × k matrix R(k) by Rk = [r(0) . . . p(2) . −βk−1 . . 0 0 Ap(k) 73 .     ..   . 0 ··· 1      . . let us consider how one can derive the Lanczos process for symmetric eigensystems from the (unpreconditioned) conjugate gradient method. r(k−1) ]. . 0 . . . where Pk = [p(1) .. r(1) .   ..    where the sequences {r(k) } and {βk } are defined by the standard conjugate gradient algorithm discussed in §2.. it follows that T Tˆ ˆ Tk = Rk ARk = Bk Λk Bk is a tridiagonal matrix since  T p(1) Ap(1) 0  T  0 p(2) Ap(2)   . . and p(1) = r(0) . j = 2. . . k . . Bk =  .  ..      . . Assuming the elements of the sequence {p(j) } are A-conjugate.  . . . ˆ Λk =  . .1. . . . . As an example. . .  0 ··· ··· .Chapter 5 Remaining topics 5.3. . ...

74

CHAPTER 5. REMAINING TOPICS

Since span{p(1) , p(2) , . . . , p(j) } = span{r(0) , r(1) , . . . , r(j−1) } and since the elements of {r } are mutually orthogonal, it can be shown that the columns of n × k matrix Qk = Rk ∆−1 form an orthonormal basis for the subspace span{b, Ab, . . . , Ak−1 b}, where ∆ is a diagonal matrix whose ith diagonal element is r(i) 2 . The columns of the matrix Qk are the Lanczos vectors (see Parlett [170]) whose associated projection of A is the tridiagonal matrix Tˆ (5.1) Tk = ∆−1 Bk Λk Bk ∆−1 .
(j)

The extremal eigenvalues of Tk approximate those of the matrix A. Hence, the diagonal and subdiagonal elements of Tk in (5.1), which are readily available during iterations of the conjugate gradient algorithm (§2.3.1), can be used to construct Tk after k CG iterations. This allows us to obtain good approximations to the extremal eigenvalues (and hence the condition number) of the matrix A while we are generating approximations, x(k) , to the solution of the linear system Ax = b. For a nonsymmetric matrix A, an equivalent nonsymmetric Lanczos algorithm (see Lanczos [141]) would produce a nonsymmetric matrix Tk in (5.1) whose extremal eigenvalues (which may include complex-conjugate pairs) approximate those of A. The nonsymmetric Lanczos method is equivalent to the BiCG method discussed in §2.3.5.

5.2

Block and s-step Iterative Methods

The methods discussed so far are all subspace methods, that is, in every iteration they extend the dimension of the subspace generated. In fact, they generate an orthogonal basis for this subspace, by orthogonalizing the newly generated vector with respect to the previous basis vectors. However, in the case of nonsymmetric coefficient matrices the newly generated vector may be almost linearly dependent on the existing basis. To prevent break-down or severe numerical error in such instances, methods have been proposed that perform a look-ahead step (see Freund, Gutknecht and Nachtigal [100], Parlett, Taylor and Liu [171], and Freund and Nachtigal [101]). Several new, unorthogonalized, basis vectors are generated and are then orthogonalized with respect to the subspace already generated. Instead of generating a basis, such a method generates a series of low-dimensional orthogonal subspaces. The s-step iterative methods of Chronopoulos and Gear [54] use this strategy of generating unorthogonalized vectors and processing them as a block to reduce computational overhead and improve processor cache behaviour. If conjugate gradient methods are considered to generate a factorization of a tridiagonal reduction of the original matrix, then look-ahead methods generate a block factorization of a block tridiagonal reduction of the matrix. A block tridiagonal reduction is also effected by the Block Lanczos algorithm and the Block Conjugate Gradient method (see O’Leary [162]). Such methods operate on multiple linear systems with the same coefficient matrix simultaneously, for instance with multiple right hand sides, or the same right hand side but with different initial guesses. Since these block methods use multiple search directions in each step, their convergence behavior is better than for ordinary methods. In fact, one can show that the spectrum of the matrix is effectively reduced by the nb − 1 smallest eigenvalues, where nb is the block size.

5.3

Reduced System Preconditioning
M −1 Ax = M −1 f

As we have seen earlier, a suitable preconditioner for CG is a matrix M such that the system

requires fewer iterations to solve than Ax = f does, and for which systems M z = r can be solved efficiently. The first property is independent of the machine used, while the second is highly machine dependent. Choosing the best preconditioner involves balancing those two criteria in a way

5.4. DOMAIN DECOMPOSITION METHODS

75

that minimizes the overall computation time. One balancing approach used for matrices A arising from 5-point finite difference discretization of second order elliptic partial differential equations (PDEs) with Dirichlet boundary conditions involves solving a reduced system. Specifically, for an n × n grid, we can use a point red-black ordering of the nodes to get Ax = DR CT C DB xR xB = fR fB , (5.2)

where DR and DB are diagonal, and C is a well-structured sparse matrix with 5 nonzero diagonals if n is even and 4 nonzero diagonals if n is odd. Applying one step of block Gaussian elimination (or computing the Schur complement; see Golub and Van Loan [108]) we have I −1 −C T DR which reduces to DR O C DB − C T DR −1 C xR xB = fR fB − C T DR −1 fR . O I DR CT C DB xR xB = I −1 −C T DR O I fR fB

With proper scaling (left and right multiplication by DB −1/2 ), we obtain from the second block equation the reduced system (I − H T H)y = g ,
−1/2 −1/2 1/2 −1/2

(5.3)

−1 where H = DR CDB , y = DB xB , and g = DB (fB − C T DR fR ). The linear system 2 2 (5.3) is of order n /2 for even n and of order (n − 1)/2 for odd n. Once y is determined, the solution x is easily retrieved from y. The values on the black points are those that would be obtained from a red/black ordered SSOR preconditioner on the full system, so we expect faster convergence. The number of nonzero coefficients is reduced, although the coefficient matrix in (5.3) has nine nonzero diagonals. Therefore it has higher density and offers more data locality. Meier and Sameh [149] demonstrate that the reduced system approach on hierarchical memory machines such as the Alliant FX/8 is over 3 times faster than unpreconditioned CG for Poisson’s equation on n×n grids with n ≥ 250. For 3-dimensional elliptic PDEs, the reduced system approach yields a block tridiagonal matrix C in (5.2) having diagonal blocks of the structure of C from the 2-dimensional case and offdiagonal blocks that are diagonal matrices. Computing the reduced system explicitly leads to an unreasonable increase in the computational complexity of solving Ax = f . The matrix products required to solve (5.3) would therefore be performed implicitly which could significantly decrease performance. However, as Meier and Sameh show [149], the reduced system approach can still be about 2-3 times as fast as the conjugate gradient method with Jacobi preconditioning for 3dimensional problems.

5.4 Domain Decomposition Methods
In recent years, much attention has been given to domain decomposition methods for linear elliptic problems that are based on a partitioning of the domain of the physical problem. Since the subdomains can be handled independently, such methods are very attractive for coarse-grain parallel computers. On the other hand, it should be stressed that they can be very effective even on sequential computers. In this brief survey, we shall restrict ourselves to the standard second order self-adjoint scalar elliptic problems in two dimensions of the form: − · (a(x, y) u) = f (x, y) (5.4)

76

CHAPTER 5. REMAINING TOPICS

where a(x, y) is a positive function on the domain Ω, on whose boundary the value of u is prescribed (the Dirichlet problem). For more general problems, and a good set of references, the reader is referred to the series of proceedings [47, 48, 49, 106, 134, 176]. For the discretization of (5.4), we shall assume for simplicity that Ω is triangulated by a set TH of nonoverlapping coarse triangles (subdomains) Ωi , i = 1, ..., p with nH internal vertices. The Ωi ’s are in turn further refined into a set of smaller triangles Th with n internal vertices in total. Here H, h denote the coarse and fine mesh size respectively. By a standard Ritz-Galerkin method using piecewise linear triangular basis elements on (5.4), we obtain an n × n symmetric positive definite linear system Au = f . Generally, there are two kinds of approaches depending on whether the subdomains overlap with one another (Schwarz methods) or are separated from one another by interfaces (Schur Complement methods, iterative substructuring). We shall present domain decomposition methods as preconditioners for the linear system Au = f or to a reduced (Schur Complement) system SB uB = gB defined on the interfaces in the nonoverlapping formulation. When used with the standard Krylov subspace methods discussed elsewhere in this book, the user has to supply a procedure for computing Av or Sw given v or w and the algorithms to be described herein computes K −1 v. The computation of Av is a simple sparse matrix-vector multiply, but Sw may require subdomain solves, as will be described later.

5.4.1 Overlapping Subdomain Methods
In this approach, each substructure Ωi is extended to a larger substructure Ωi containing ni internal vertices and all the triangles Ti ⊂ Th , within a distance δ from Ωi , where δ refers to the amount of overlap. Let Ai , AH denote the the discretizations of (5.4) on the subdomain triangulation Ti and the coarse triangulation TH respectively. T Let Ri denote the extension operator which extends (by zero) a function on Ti to Th and Ri the T corresponding pointwise restriction operator. Similarly, let RH denote the interpolation operator which maps a function on the coarse grid TH onto the fine grid Th by piecewise linear interpolation and RH the corresponding weighted restriction operator. With these notations, the Additive Schwarz Preconditioner Kas for the system Au = f can be compactly described as:
p −1 Kas v

=

T RH A−1 RH v H

+
i=1

T Ri A i Ri v.

−1

Note that the right hand side can be computed using p subdomain solves using the Ai ’s, plus −1 T a coarse grid solve using AH , all of which can be computed in parallel. Each term Ri A i Ri v should be evaluated in three steps: (1) Restriction: vi = Ri v, (2) Subdomain solves for wi : Ai wi = T vi , (3) Interpolation: yi = Ri wi . The coarse grid solve is handled in the same manner. −1 The theory of Dryja and Widlund [75] shows that the condition number of Kas A is bounded independently of both the coarse grid size H and the fine grid size h, provided there is “sufficient” overlap between Ωi and Ωi (essentially it means that the ratio δ/H of the distance δ of the boundary ∂Ωi to ∂Ωi should be uniformly bounded from below as h → 0.) If the coarse grid solve term is left out, then the condition number grows as O(1/H 2 ), reflecting the lack of global coupling provided by the coarse grid. For the purpose of implementations, it is often useful to interpret the definition of Kas in matrix notation. Thus the decomposition of Ω into Ωi ’s corresponds to partitioning of the components of the vector u into p overlapping groups of index sets Ii ’s, each with ni components. The ni × ni T matrix Ai is simply a principal submatrix of A corresponding to the index set Ii . Ri is a n × ni T matrix defined by its action on a vector ui defined on Ti as: (Ri ui )j = (ui )j if j ∈ Ii but is zero otherwise. Similarly, the action of its transpose Ri u forms an ni -vector by picking off

6) . I We note the following properties of this Schur Complement system: 1. the matrices Ri . then Ai is guaranteed to be nonsingular. The difficulty is in defining the “coarse grid” matrices AH . The set of interior vertices I of all Ωi and the set of vertices B which lies on the boundaries i ∂Ωi of the coarse triangles Ωi in TH . Analogously. (5.5. not necessarily one arising from a discretization of an elliptic problem. RH can be T T efficiently computed as in a standard multigrid algorithm. equation (5. the preconditioner Kas can be extended to any matrix A. RH are defined by their actions and need not be stored explicitly. SB is dense in general and computing it explicitly requires as many solves on each subdomain as there are points on each of its edges. fB )T corresponding to this partition. In this ordering. Furthermore. 3. the system in (5. The condition number of SB is O(h−1 ). we arrive at the following equation for uB : SB uB = gB ≡ fB − AIB A−1 fI . which inherently depends on knowledge of the grid structure. (5. i 5. Given a vector vB defined on the boundary vertices B of TH .6). We shall re-order u and f as u ≡ (uI . 2. RH .5) can be written as: I AT A−1 IB I where SB ≡ AB − AT A−1 AIB IB I is the Schur complement of AB in A. We also note that in this algebraic formulation. By eliminating uI in (5. Radicati and Robert [177] have experimented with such an algebraic overlapping block Jacobi preconditioner. uB )T and f ≡ (fI . (5.5) We note that since the subdomains are uncoupled by the boundary vertices.4. Note that the matrices Ri . RH . By a block LU-factorization of A. the matrix-vector product SB vB can be computed according to AB vB − AT (A−1 (AIB vB )) where A−1 involves p IB I I independent subdomain solves using A−1 . SB inherits the symmetric positive definiteness of A.4.7) 0 I AI 0 AIB SB uI uB = fI fB . if A is positive definite. RH is an n × nH matrix with entries corresponding to piecewise linear interpolation and its transpose can be interpreted as a weighted T restriction matrix. an improvement over the O(h−2 ) growth for A. at the expense of a deteriorating convergence rate as p increases. This part of the preconditioner can be left out. 5. DOMAIN DECOMPOSITION METHODS 77 T the components of u corresponding to Ii . The right hand side gB can also be computed using p independent subdomain solves. If Th is obtained from TH by nested refinement. AI = blockdiagonal(Ai ) is block-diagonal with each block Ai being the stiffness matrix corresponding to the unknowns belonging to the interior vertices of subdomain Ωi .2 Non-overlapping Subdomain Methods The easiest way to describe this approach is through matrix notation.4) can be written as follows: AI AT IB AIB AB uI uB = fI fB . Ri . the action of RH . Once we have the partitioning index sets Ii ’s. 4. Ai are defined. The set of vertices of Th can be divided into two groups.

the action of SEi is needed. Smith [190] proposed a vertex space modification to KBP S which explicitly accounts for this coupling and therefore eliminates the dependence on H and h. Corresponding to this partition of B. as the coarse grid becomes coarser). the action of MEi can be efficiently computed. we need to further decompose B into two non-overlapping index sets: B = E ∪ VH (5.e. In practice. there is a slight growth in the number of iterations as h becomes small (i. S can be written in the block form: SB = SE T SEV SEV SV . namely a tridiagonal matrix with 2’s down the main diagonal and −1’s down the two off-diagonals. the Bramble-Pasciak-Schatz preconditioner is defined by its action on a vector vB defined on B as follows: −1 T KBP S vB = RH A−1 RH vB + H Ei −1 T REi MEi REi vB . which is the basic method in the nonoverlapping substructuring approach. However. For this.7). excluding the vertices belonging to VH . and even if we had it. Pasciak and Schatz [36] prove that the condition number of KBP S SB is bounded by 2 O(1 + log (H/h)). and αEi is taken to be some average of the coefficient a(x. T In addition to the coarse grid interpolation and restriction operators RH .9) The block SE can again be block partitioned. The diagonal blocks SEi of SE are the principal submatrices of S corresponding to Ei . as noted before. We note that since the eigen-decomposition of K is known and computable by the Fast Sine Transform. The idea is to introduce further subsets of B called vertex spaces X = k Xk with .8) where VH = k Vk denote the set of nodes corresponding to the vertices Vk ’s of TH . Fortunately.4) on the edge Ei . many efficiently invertible approximations to SEi have been proposed in the literature (see Keyes and Gropp [135]) and we shall use these so-called interface preconditioners for SEi instead. Each SEi represents the coupling of nodes on interface Ei separating two neighboring subdomains. it would be expensive to compute its action (we would need to compute its inverse or a Cholesky factorization). The action of its transpose extends by zero a function defined on Ei to one defined on B.10) Analogous to the additive Schwarz preconditioner. (5. We mention one specific preconditioner: MEi = αEi K 1/2 where K is an nEi × nEi one dimensional Laplacian matrix.78 CHAPTER 5. we shall also need a new set of interpolation and restriction operators for the edges Ei ’s. RH defined before. and E = i Ei denote the set of nodes on the edges Ei ’s of the coarse triangles in TH .. as the fine grid is refined) or as H becomes large (i. in general SEi is a dense matrix which is also expensive to compute.e. Let REi be the pointwise restriction operator (an nEi × n matrix. y) of (5. We will also need some good preconditioners to further improve the convergence of the Schur system. We shall first describe the Bramble-Pasciak-Schatz preconditioner [36]. With these notations. where nEi is the number of vertices on the edge Ei ) onto the edge Ei defined by its action (REi uB )j = (uB )j if j ∈ Ei but is zero otherwise. with most of the subblocks being zero. the computation of each term consists of the three steps of restriction-inversion-interpolation and is independent of the others and thus can be carried out in parallel. REMAINING TOPICS These properties make it possible to apply a preconditioned iterative method to (5. The log2 (H/h) growth is due to the coupling of the unknowns on the edges incident on a common vertex Vk . −1 Bramble.. (5. which is not accounted for in KBP S . 
−1 In defining the preconditioner.

That is. By ˜ replacing A−1 by A−1 in the definitions of KBP S and KV S . although the degree of parallelism can be increased by trading off the convergence rate through multi-coloring the subdomains. A−1 and A−1 in Kas .3 Further Remarks Multiplicative Schwarz Methods As mentioned before. KV S can be replaced by inexact i H ˜ −1 ˜ ˜ solves A i . .). For the Schwarz methods. and RXk . However.4. which can be standard elliptic preconditioners themselves (e. The theory can be found in Bramble. et al. One way to get around this is to use an I ˜ inner iteration using Ai as a preconditioner for Ai in order to compute the action of A−1 . (5. ILU. multii H grid. −1 The Schur Complement methods require more changes to accommodate inexact solves. [37]. SB respectively. An I alternative is to perform the iteration on the larger system (5. etc.g. we can easily obtain inexact preH H ˜ ˜ conditioners KBP S and KV S for SB . Mathew and Shao [51]). We shall let MXk be such a preconditioner for SXk . SB by AI . DOMAIN DECOMPOSITION METHODS 79 Xk consisting of a small set of vertices on the edges incident on the vertex Vk and adjacent to it. Inexact Solves The exact solves involving A i .11) −1 Smith [190] proved that the condition number of KV S SB is bounded independent of H and h. Analogously. the solves on each subdomain are performed sequentially. Even without conjugate gradient acceleration. Let SXk be the principal submatrix of SB corresponding T to Xk . Care must be taken to scale AH and Ai so that they are as close to AH and ˜ ˜ Ai as possible respectively — it is not sufficient that the condition number of A−1 AH and A−1 Ai i H be close to unity. the multiplicative version is not as parallelizable.5) and construct a preconditioner from ˜ ˜ ˜ the factorization in (5. As before.6) by replacing the terms AI . the modification is straightforward and the Inexact Solve Additive Schwarz Preconditioner is simply: p T ˜ ˜ −1 Kas v = RH A−1 RH v + H i=1 T ˜ −1 Ri A i Ri v. 5. one can define a multiplicative Schwarz preconditioner which corresponds to a symmetric block Gauss-Seidel version. Then Smith’s Vertex Space preconditioner is defined by: −1 KV S vB T = RH A−1 RH vB + H Ei −1 T REi MEi REi vB + Xk −1 T RXk MXk RXk vB . SSOR. that the evaluation of the product SB vB requires exact subdomain solves in A−1 . RXk be the corresponding restriction (pointwise) and extension (by zero) matrices.5.4. provided there is sufficient overlap of Xk with B. the multiplicative method can take many fewer iterations than the additive version. where SB can be ˜ ˜ ˜ ˜ either KBP S or KV S . the Additive Schwarz preconditioner can be viewed as an overlapping block Jacobi preconditioner. Note that X overlaps with E and VH . using the most current iterates as boundary conditions from neighboring subdomains. however. A−1 and A−1 . KBP S . because the scaling of the coupling matrix AIB may be wrong. The main difficulty is. SXk is dense and expensive to compute and factor/solve but efficiently invertable approximations (some using variants of the K 1/2 operator defined before) have been developed in the literature (see Chan.

but smaller. coarse grid approximation. multigrid-like methods can be used as preconditioners . Perform some steps of a basic method in order to smooth out the error. since it descends through a sequence of subsequently coarser grids. REMAINING TOPICS The preconditioners given above extend naturally to nonsymmetric A’s (e. Conversely. In that case. From the above description it is clear that iterative methods play a role in multigrid theory as smoothers (see Kettler [132]). Restrict the current state of the problem to a subset of the grid points. and then ascends this sequence in reverse order. subdomain solves. A large H has the opposite effect. the optimal H can be determined for both sequential and parallel implementations (see Chan and Shao [52]). 5. a coarse grid will have one quarter of the number of points of the corresponding fine grid. Since the coarse grid is again a tensor product grid. 137]. Practical implementations (especially for parallelism) of nonsymmetric domain decomposition methods are discussed in [136. it has been observed empirically (see Gropp and Keyes [110]) that there often exists an optimal value of H which minimizes the total computational time for solving the problem. Choice of Coarse Grid Size H Given h. and perform a number of steps of the basic method again. Usually the generation of subsequently coarser grids is halted at a point where the number of variables becomes small enough that direct solution of the linear system is feasible.. the so-called “coarse grid”. Interpolate the coarse grid solution back to the original grid. The nice theoretical convergence rates can be retained provided that the coarse grid size H is chosen small enough (depending on the size of the nonsymmetric part of A) (see Cai and Widlund [43]). An analysis of multigrid methods is relatively straightforward in the case of simple differential operators such as the Poisson operator on tensor product grids. and solve the resulting projected problem. A small H provides a better. that is. For model problems. 3. A “W-cycle” method results from visiting the coarse grid twice. it may pay to determine a near optimal value of H empirically if the preconditioner is to be re-used many times.5 Multigrid Methods Simple iterative methods (such as the Jacobi method) tend to damp out high frequency components of the error fastest (see §2. This has led people to develop methods based on the following heuristic: 1. McCormick [147]). by applying this method recursively to step 2 it becomes a true “multigrid” method. each next coarse grid is taken to have the double grid spacing of the previous grid. and requires solving more.g. at least when the nonsymmetric part is not too large. In practice. Steps 1 and 3 are called “pre-smoothing” and “post-smoothing” respectively. there may also be geometric constraints on the range of values that H can take. convection-diffusion problems). a Fourier analysis (see for instance Briggs [42]) can be used. Many multigrid methods can be shown to have an (almost) optimal number of operations. However. For the more general case of self-adjoint elliptic operators on arbitrary domains a more sophisticated analysis is needed (see Hackbusch [116]. but more expensive.1). The method outlined above is said to be a “V-cycle” method.80 Nonsymmetric Problems CHAPTER 5. with possibly some smoothing steps in between. In two dimensions. 2. the work involved is proportional to the number of variables.2.

5. Axelsson and Vassilevski [23. . Yserentant [216] and Wesseling [213]. .1 A2.2 A2. .1 (i) (i) A1. The basic idea here is to partition the matrix on a given grid to a 2×2 structure A(i) = A1. A class of methods without this limitation is that of row projection methods (see Bj¨ rck and Elfving [34]. They o are based on a block row partitioning of the coefficient matrix AT = [A1 .1 A1. In that case. and Elman [92]. .1 A1.2 − A2. Some multigrid preconditioners try to obtain optimality results similar to those for the full multigrid method. (i) (i) (i)−1 (i) 5. i i These methods have good parallel properties and seem to be robust in handling nonsymmetric and indefinite problems. there is a theoretical connection with the conjugate gradient method on the normal equations (see §2. The coarse grid is typically formed based on a red-black or cyclic reduction ordering.2 . Maitre and Musy [144]. for instance some require the eigenvalues to be in the right half plane. The matrix on the next grid is then an incomplete version of the Schur complement A(i+1) ≈ S (i) = A2.2 (i) (i) with the variables in the second block row corresponding to the coarse grid nodes. McCormick and Thomas [148].6.3. .3). Braess [35]. see for instance Rodrigue and Wolitzer [179]. Here we will merely supply some pointers to the literature: Axelsson and Eijkhout [16]. Am ] and iterative application of orthogonal projectors Pi x = Ai (AT Ai )−1 AT x. 22]. and Bramley and Sameh [38]). Row projection methods can be used as preconditioners in the conjugate gradient method.6 Row Projection Methods Most iterative methods depend on spectral properties of the coefficient matrix. ROW PROJECTION METHODS 81 in iterative methods.

REMAINING TOPICS .82 CHAPTER 5.

technical reports on various parallel computers and software. The versions of the templates that are available are listed in the table below: shar filename sctemplates. One can communicate with Netlib in one of a number of ways: by email.gov On the subject line or in the body. FORTRAN and C versions of the templates for each method described in this book are available from Netlib. bibliographies.edu). A description of the entire library should be sent to you within minutes (providing all the intervening networks as well as the netlib server are up).shar mltemplates. A user sends a request by electronic mail as follows: mail netlib@ornl. (much more easily) via an X-window interface called XNetlib.shar contents Single precision C routines Double precision C routines Single Precision Fortran 77 routines Double Precision Fortran 77 routines MATLAB routines Save the mail message to a file called templates. and so on.shar dctemplates. type 83 . To get started using netlib. Edit the mail header from this file and delete the lines down to and including << Cut Here >>. one sends a message of the form send index to netlib@ornl.shar sftemplates. The second request results in a return e-mail message consisting of a shar file containing the single precision FORTRAN routines and a README file. names and addresses of scientists and mathematicians. single or multiple requests (one per line) may be made as follows: send index from linalg send sftemplates. In addition to the template material. there are dozens of other libraries.shar dftemplates.gov.shar from linalg The first request results in a return e-mail message containing the index from the library linalg.shar.utk.Appendix A Obtaining the Software A large body of numerical software is freely available 24 hours a day via an electronic service called Netlib. or through anonymous ftp (netlib2. test data.cs. along with brief descriptions of its contents. facilities to automatically translate FORTRAN programs to C. In the directory containing the shar file.

the amount of workspace needed varies. OBTAINING THE SOFTWARE sh templates. . The details are documented in each routine.f). Note that the vector-vector operations are accomplished using the BLAS [143] (many manufacturers have assembly coded these kernels for maximum performance)..shar No subdirectory will be created.f. named after the method (e. Each method is written as a separate subroutine in its own file. The unpacked files will stay in the current directory. along with instructions for a test routine. although a mask file is provided to link to user defined routines. GMRES. The argument parameters are the same for each. The README file gives more details.f.84 APPENDIX A. with the exception of the required matrix-vector products and preconditioner solvers (some require the transpose matrix).g. BiCGSTAB. CG. Also.

) It is important to note that putting double precision into single variables works. 1. many computers have the BLAS library optimized to their system. 1. giving the routine double. (Of course. The corresponding Fortran segment is ALPHA = 0. or double complex precision. STRMV: matrix vector product when the matrix is triangular 5. the declarations would also have to be changed. putting the result in scalar α. The corresponding Fortran segment is DO I = 1. 1 ) computes the inner product of two vectors x and y. Y ) multiplies a vector x of length n by the scalar α. putting the result in y. Additionally. This prefix may be changed to “D”. N ALPHA = ALPHA ENDDO + X(I)*Y(I) • CALL SAXPY( N. SGEMV: general matrix vector product 4. “C”.Appendix B Overview of the BLAS The BLAS give us a standardized set of basic codes for performing operations on vectors and matrices. SCOPY: copies a vector onto another vector 2.0 DO I = 1. ALPHA. or “Z”. BLAS take advantage of the Fortran storage structure and the structure of the mathematical system wherever possible.j) and xi = x(i).j = a(i. but single into double will cause errors. Here we use five routines: 1. complex. SAXPY: adds vector x (multiplied by a scalar) to vector y 3. Y. we can see what the code is doing: • ALPHA = SDOT( N. STRSV: solves T x = b for triangular matrix T The prefix “S” denotes single precision. N Y(I) = ALPHA*X(I) + Y(I) ENDDO 85 . If we define ai. X. then adds the result to the vector y. X.

J) ENDDO ENDDO computes Note that the parameters in single quotes are for descriptions such as ’U’ for ‘UPPER TRIANGULAR’. B. ONE. ’N’ for ‘No Transpose’. we will first copy it to temporary storage. X. LDA is the leading dimension of the two-dimensional array A. J X(I) = X(I) + TEMP*A(I. LDA. The variable LDA is critical for addressing the array correctly.J)*X(J) + B(I) ENDDO ENDDO This illustrates a feature of the BLAS that often requires close attention. Vector b will be overwritten with the residual vector. N TEMP = X(J) DO I = 1. 1. putting the resulting vector in b. A. 1 ) the matrix-vector product plus vector Ax + b. M B(I) = A(I. putting the resulting vector in x. LDA. .0E0). This feature will be used extensively. M. ’N’. we will use this routine to compute the residual vector b − Aˆ. 1 ) computes the matrixvector product Ax. N. for upper triangular matrix A. that is. The corresponding Fortran segment is DO J = 1. if we need it later. LDA is the declared (or allocated) number of rows of the two-dimensional array A. OVERVIEW OF THE BLAS • CALL SGEMV( ’N’. resulting in storage savings (among other advantages). ’N’. where x is our current approxix ˆ mation to the solution x (merely change the fourth argument to -1. • CALL STRMV( ’U’. X.86 APPENDIX B. N. N DO I = 1. A. thus. ONE. For example. The corresponding Fortran segment: DO J = 1.

Condition number See: Spectral condition number. such that the computed iterate x(i) is the solution of (A + δA)x(i) = b + δb. and the output is assumed to be correct. 87 . Choleski decomposition Expressing a symmetric matrix A as a product of a lower triangular matrix L and its transpose LT . and use this to speed up convergence.Appendix C Glossary Adaptive methods Iterative methods that collect information about the coefficient matrix during the iteration process. that is. Backward error The size of perturbations δA of the coefficient matrix and δb of the right hand side of a linear system Ax = b. Black box A piece of software that can be used without knowledge of its inner workings. BLAS Basic Linear Algebra Subprograms. though not necessarily by the same factor. Breakdown The occurrence of a zero divisor in an iterative method. • Irregular: the measure of the error decreases in some iterations and increases in others. A = LLT . Block matrix operations Matrix operations expressed in terms of submatrices. The two constants p. an iterative method approaches the solution of a linear system. q such that ai. or the rate at which.j = 0 if j < i − p or j > i + q. Convergence can be • Linear: some measure of the distance to the solution decreases by a constant factor in each iteration. Optimized (usually assembly coded) implementations of the BLAS exist for various computers. Band matrix A matrix A for which there are nonnegative constants p. a set of commonly occurring vector and matrix operations for dense linear algebra. Convergence The fact whether or not. • Superlinear: the measure of the error decreases by a growing factor. these will give a higher performance than implementation in high level programming languages. the user supplies the input. • Smooth: the measure of the error decreases in all or most iterations. Block factorization See: Block matrix operations. q are called the left and right halfbandwidth respectively. This observation unfortunately does not imply anything about the ultimate convergence of the method.

Since in many applications the condition number is proportional to (some power of) the number of unknowns. the field of values is the set {xT Ax : xT x = 1}. Ill-conditioned system A linear system for which the coefficient matrix has a large condition number. this does not imply anything about the ultimate convergence of the method. the length of the sequence is not given a priori by the size of the system. one hopes that the factorization LU will be close enough to A to function as a preconditioner in an iterative method. the longer one iterates. For symmetric matrices this is the range [λmin (A). Divergence An iterative method is said to diverge if it does not converge in a reasonable number of iterations. growth of the error as such is no sign of divergence: a method with irregular convergence behavior may ultimately converge. λmax (A)]. Field of values Given a matrix A. some fill elements are discarded. Krylov sequence For a given matrix A and vector x. this should be understood as the constant of proportionality being large. GLOSSARY • Stalled: the measure of the error stays more or less constant during a number of iterations. See: Iterative method. See: Direct method. However. . however. a direct method yields the true solution to the system. Diagonally dominant matrix See: Matrix properties Direct method An algorithm that produces the solution to a system of linear equations in a number of operations that is determined a priori by the size of the system. measured in some vector norm. Incomplete factorization A factorization obtained by discarding certain elements during the factorization process (‘modified’ and ‘relaxed’ incomplete factorization compensate in some way for discarded elements). the closer one is able to get to the true solution. In an incomplete factorization. even though the error grows during some iterations. Domain decomposition method Solution method for linear systems based on a partitioning of the physical domain of the differential equation. and some way of combining data from the subdomains on the separator part of the domain. the sequence of vectors {Ai x}i≥0 . or if some measure of the error grows unacceptably. As above. Halfbandwidth See: Band matrix. Iterate Approximation to the solution of a linear system in any iteration of an iterative method. Dense matrix Matrix for which the number of zero elements is too small to warrant specialized algorithms to exploit these zeros. Distributed memory See: Parallel computer.88 APPENDIX C. Iterative method An algorithm that produces a sequence of approximations to the solution of a linear system of equations. Domain decomposition methods typically involve (repeated) independent system solution on the subdomains. Thus an incomplete LU factorization of a matrix A will in general satisfy A = LU . In exact arithmetic. Forward error The difference between a computed iterate and the true solution of a linear system. Usually. Fill A position that is zero in the original matrix A but not in an exact factorization of A. or a finite initial part of this sequence.

j. Norms A function f : Rn → R is called a vector norm if • f (x) ≥ 0 for all x. Matrix norms See: Norms. Mutually consistent norms See: Norms.j = 0 if j > i. Lower triangular matrix Matrix A for which ai. The convergence speed of iterative methods may depend on the ordering used.j |} is called the diagonal dominance of the matrix. The same properties hold for matrix norms. LQ factorization A way of writing a matrix A as a product of a lower triangular matrix L and a unitary matrix Q. For complex A. each unknown corresponds to a node in the discretization mesh.j = aj. LU factorization / LU decomposition Expressing a matrix A as a product of a lower triangular matrix L and an upper triangular matrix U . LAPACK A mathematical library of linear algebra routine for dense systems solution and eigenvalue calculations.i > j=i |ai. Natural ordering See: Ordering of unknowns. An M -matrix if ai. that is. • f (αx) = |α|f (x) for all α.i for all i. Some common orderings for rectangular domains are: . Ordering of unknowns For linear systems derived from a partial differential equation. Modified incomplete factorization See: Incomplete factorization. A matrix norm and a vector norm (both denoted · ) are called a mutually consistent pair if for all matrices A and vectors x Ax ≤ A x . that is. Multigrid method Solution method for linear systems based on restricting and extrapolating solutions between a series of nested grids. x = AT y).j |.j ≤ 0 for i = j.89 Krylov subspace The subspace spanned by a Krylov sequence. Matrix properties We call a square matrix A Symmetric if ai. M -Matrix See: Matrix properties. the excess amount mini {ai. A = LQ. Normal equations For a nonsymmetric or indefinite (but nonsingular) system of equations Ax = b. Different orderings of the unknowns correspond to permutations of the coefficient matrix. A = LU . j. y. and often the parallel efficiency of a method on a parallel computer is strongly dependent on the ordering used. x. Diagonally dominant if ai.j ≥ 0 for all i. AT is replaced with AH in the above expressions. • f (x + y) ≤ f (x) + f (y) for all x. and f (x) = 0 only if x = 0. Message passing See: Parallel computer.i − j=i |ai. Nonstationary iterative method Iterative method that has iteration-dependent coefficients. Positive definite if it satisfies xT Ax > 0 for all nonzero vectors x. and it is nonsingular with (A−1 )i. either of the related symmetric systems (AT Ax = AT b) and (AAT y = b.

Ordering of unknowns: For linear systems derived from a partial differential equation, each unknown corresponds to a node in the discretization mesh. Different orderings of the unknowns correspond to permutations of the coefficient matrix. The convergence speed of iterative methods may depend on the ordering used, and often the parallel efficiency of a method on a parallel computer is strongly dependent on the ordering used. Some common orderings for rectangular domains are:
• The natural ordering; this is the consecutive numbering by rows and columns.
• The red/black ordering; this is the numbering where all nodes with coordinates (i, j) for which i + j is odd are numbered before those for which i + j is even.
• The ordering by diagonals; this is the ordering where nodes are grouped in levels for which i + j is constant. All nodes in one level are numbered before the nodes in the next level.
For matrices from problems on less regular domains, some common orderings are:
• The Cuthill-McKee ordering; this starts from one point, then numbers its neighbors, and continues numbering points that are neighbors of already numbered points. The Reverse Cuthill-McKee ordering then reverses the numbering; this may reduce the amount of fill in a factorization of the matrix.
• The Minimum Degree ordering; this orders the matrix rows by increasing numbers of nonzeros.

Parallel computer: Computer with multiple independent processing units. If the processors have immediate access to the same memory, the memory is said to be shared; if processors have private memory that is not immediately visible to other processors, the memory is said to be distributed. In that case, processors communicate by message-passing.

Pipelining: See Vector computer.

Positive definite matrix: See Matrix properties.

Preconditioner: An auxiliary matrix in an iterative method that approximates in some sense the coefficient matrix or its inverse. The preconditioner, or preconditioning matrix, is applied in every step of the iterative method.

Red/black ordering: See Ordering of unknowns.

Reduced system: Linear system obtained by eliminating certain variables from another linear system. Although the number of variables is smaller than for the original system, the matrix of a reduced system generally has more nonzero entries. If the original matrix was symmetric and positive definite, then the reduced system has a smaller condition number.

Relaxed incomplete factorization: See Incomplete factorization.

Residual: If an iterative method is employed to solve for x in a linear system Ax = b, then the residual corresponding to a vector y is Ay − b.

Search direction: Vector that is used to update an iterate.

Shared memory: See Parallel computer.

Simultaneous displacements, method of: Jacobi method.

Sparse matrix: Matrix for which the number of zero elements is large enough that algorithms avoiding operations on zero elements pay off. Matrices derived from partial differential equations typically have a number of nonzero elements that is proportional to the matrix size, while the total number of matrix elements is the square of the matrix size.

Spectral condition number: The product ||A||_2 · ||A^{-1}||_2 = (λ_max(A^T A) / λ_min(A^T A))^{1/2}, where λ_max and λ_min denote the largest and smallest eigenvalues, respectively. For linear systems derived from partial differential equations in 2D, the condition number is proportional to the number of unknowns.
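The grid orderings above are simply permutations of the unknowns, and applying an ordering to a linear system means symmetrically permuting its coefficient matrix. The sketch below (Python with NumPy, an illustration rather than part of the book's software; the nx-by-ny grid and the dense test matrix are hypothetical examples) builds the red/black permutation relative to the natural ordering and applies it to a system.

    import numpy as np

    def red_black_permutation(nx, ny):
        """Permutation of the naturally ordered unknowns of an nx-by-ny grid:
        nodes with i + j odd are numbered before nodes with i + j even,
        as in the red/black ordering described above."""
        natural = [(i, j) for j in range(ny) for i in range(nx)]   # natural ordering
        odd  = [k for k, (i, j) in enumerate(natural) if (i + j) % 2 == 1]
        even = [k for k, (i, j) in enumerate(natural) if (i + j) % 2 == 0]
        return np.array(odd + even)

    def permute_system(A, b, p):
        """Symmetrically permute the coefficient matrix and right-hand side."""
        return A[np.ix_(p, p)], b[p]

    # Example with an arbitrary dense matrix standing in for a discretization:
    nx, ny = 4, 3
    n = nx * ny
    p = red_black_permutation(nx, ny)
    A = np.random.default_rng(1).standard_normal((n, n))
    b = np.ones(n)
    A_rb, b_rb = permute_system(A, b, p)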

Spectral radius: The spectral radius of a matrix A is max{|λ(A)|}.

Spectrum: The set of all eigenvalues of a matrix.

Stationary iterative method: Iterative method that performs in each iteration the same operations on the current iteration vectors.

Stopping criterion: Since an iterative method computes successive approximations to the solution of a linear system, a practical test is needed to determine when to stop the iteration. Ideally this test would measure the distance of the last iterate to the true solution, but this is not possible. Instead, various other metrics are used, typically involving the residual.

Storage scheme: The way elements of a matrix are stored in the memory of a computer. For dense matrices, this can be the decision to store rows or columns consecutively. For sparse matrices, common storage schemes avoid storing zero elements; as a result they involve indices, stored as integer data, that indicate where the stored elements fit into the global matrix.

Successive displacements, method of: Gauss-Seidel method.

Symmetric matrix: See Matrix properties.

Template: Description of an algorithm, abstracting away from implementational details.

Tune: Adapt software for a specific application and computing environment in order to obtain better performance in that case only.

Upper triangular matrix: Matrix A for which a_{i,j} = 0 if j < i.

Vector computer: Computer that is able to process consecutive identical operations (typically additions or multiplications) several times faster than intermixed operations of different types. Processing identical operations this way is called 'pipelining' the operations.

Vector norms: See Norms.
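To make the sparse storage schemes mentioned above concrete, here is a minimal sketch (Python with NumPy, for illustration only; it shows one common general-purpose layout, compressed row storage, and is not presented as the scheme used by the book's templates). Only the nonzero values are stored, together with integer index arrays, and the matrix-vector product runs over the stored entries only.

    import numpy as np

    def to_compressed_rows(A):
        """Store only the nonzeros of A, row by row.

        Returns (val, col, row_ptr): val holds the nonzero values, col the
        column index of each stored value, and row_ptr[i]:row_ptr[i+1] the
        slice of val/col belonging to row i.
        """
        val, col, row_ptr = [], [], [0]
        for i in range(A.shape[0]):
            for j in range(A.shape[1]):
                if A[i, j] != 0.0:
                    val.append(A[i, j])
                    col.append(j)
            row_ptr.append(len(val))
        return np.array(val), np.array(col), np.array(row_ptr)

    def matvec(val, col, row_ptr, x):
        """y = A x using only the stored nonzeros."""
        n = len(row_ptr) - 1
        y = np.zeros(n)
        for i in range(n):
            for k in range(row_ptr[i], row_ptr[i + 1]):
                y[i] += val[k] * x[col[k]]
        return y

    # Example: a small tridiagonal matrix.
    A = np.diag(2.0 * np.ones(4)) - np.diag(np.ones(3), 1) - np.diag(np.ones(3), -1)
    val, col, row_ptr = to_compressed_rows(A)
    x = np.arange(1.0, 5.0)
    assert np.allclose(matvec(val, col, row_ptr, x), A @ x)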

C.1 Notation

In this section we present some of the notation we use throughout the book. We have tried to use standard notation that would be found in any current publication on the subjects covered. Throughout, we follow several conventions:
• Matrices are denoted by capital letters.
• Matrix elements are denoted by doubly indexed lowercase letters.
• Matrix subblocks are denoted by doubly indexed uppercase letters.
• Vectors are denoted by lowercase letters.
• Lowercase Greek letters usually denote scalars.

We define matrix A of dimension m × n and block dimension m̄ × n̄ as follows:

        [ a_{1,1}  ···  a_{1,n} ]
    A = [    ⋮            ⋮     ]      (a_{i,j} ∈ R),
        [ a_{m,1}  ···  a_{m,n} ]

        [ A_{1,1}  ···  A_{1,n̄} ]
    A = [    ⋮            ⋮      ]     (the A_{i,j} are real matrix subblocks).
        [ A_{m̄,1}  ···  A_{m̄,n̄} ]

We define vector x of dimension n as follows:

        [ x_1 ]
    x = [  ⋮  ],    x_i ∈ R.
        [ x_n ]

Other notation is as follows:
• I^{n×n} (or simply I if the size is clear from the context) denotes the identity matrix.
• A = diag(a_{i,i}) denotes that matrix A has elements a_{i,i} on its diagonal, and zeros everywhere else.
• x_i^{(k)} denotes the ith element of vector x during the kth iteration.
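Several of the glossary entries above (stationary iterative method, iterate, residual, stopping criterion) come together in even the simplest solver. The sketch below (Python with NumPy, a minimal illustration and not one of the book's templates) runs the method of simultaneous displacements (Jacobi) and stops when the relative residual ||b − A x^(k)|| / ||b|| falls below a tolerance, with the iterate x^(k) written in the notation just introduced.

    import numpy as np

    def jacobi(A, b, tol=1e-8, max_iter=1000):
        """Method of simultaneous displacements (Jacobi), a stationary
        iterative method: every step applies the same update
        x <- D^{-1} (b - (A - D) x), with D the diagonal of A.

        The stopping criterion is the relative residual ||b - A x^(k)|| / ||b||.
        Convergence is guaranteed, for example, for strictly diagonally
        dominant A; for other matrices the iteration may diverge.
        """
        D = np.diag(A)                  # diagonal of A
        R = A - np.diagflat(D)          # off-diagonal part
        x = np.zeros_like(b, dtype=float)
        bnorm = np.linalg.norm(b)
        for k in range(1, max_iter + 1):
            x = (b - R @ x) / D         # the iterate x^(k)
            residual = b - A @ x
            if np.linalg.norm(residual) <= tol * bnorm:
                return x, k             # converged by the residual test
        return x, max_iter              # stalled or diverged: caller should check

    # Example on a diagonally dominant system:
    A = np.array([[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]])
    b = np.array([1.0, 2.0, 3.0])
    x, its = jacobi(A, b)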


Appendix D

Flowchart of iterative methods

Is the matrix symmetric?
• Yes. Is the matrix definite?
  • Yes. Are the outer eigenvalues known?
    • Yes: Try Chebyshev or CG.
    • No: Try CG.
  • No: Try MinRES or CG.
• No. Is the transpose available?
  • Yes: Try QMR.
  • No. Is storage at a premium?
    • Yes: Try CGS or Bi-CGSTAB.
    • No: Try GMRES with long restart.
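The decision tree above can be written down directly as code. The sketch below (Python; the function and argument names are ours and purely illustrative) walks the same questions in the same order and returns the suggested method for a given set of answers.

    def suggest_method(symmetric, definite=False, outer_eigenvalues_known=False,
                       transpose_available=False, storage_at_premium=False):
        """Walk the flowchart above and return the suggested method(s).

        Only the questions on the path actually taken matter; the remaining
        arguments are ignored, just as in the flowchart.
        """
        if symmetric:
            if definite:
                if outer_eigenvalues_known:
                    return "Try Chebyshev or CG"
                return "Try CG"
            return "Try MinRES or CG"
        if transpose_available:
            return "Try QMR"
        if storage_at_premium:
            return "Try CGS or Bi-CGSTAB"
        return "Try GMRES with long restart"

    # Example: a nonsymmetric system for which products with the transpose
    # are not available and memory is tight.
    print(suggest_method(symmetric=False, storage_at_premium=True))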
