Symposium On Sparse Matrices
INTRODUCTION
dw/dt = f(t,w,p) , (1)
(3)
the current guess for w, then methods like [Broyden (1969A)] are
necessary to ensure that the amount of work at each step is of
order n^2 rather than n^3, where n is the number of components of w.
Fortunately, when n is large in Computational Design problems,
the Jacobian is typically sparse.† The sparseness structure can be
highly irregular, but computational efficiency is achieved, in
spite of this generality, by exploiting the fixed sparseness
structure for the Jacobian.
System (3) is of the form

Ax = b . (4)
c_ij = (a_ij − Σ_{k=1}^{m−1} c_ik c_kj) d , (5)

where m = min(i,j), and d = 1 for i ≤ j and d = c_jj^{-1} for i > j.
If a_ij = 0 and, for 1 ≤ k < m, c_ik c_kj = 0, then c_ij is
"logically zero." Otherwise, a reduced formula defines c_ij. In this
formula only nonzero numbers occur.
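As a concrete (dense, purely illustrative) rendering of formula (5), the following Python sketch fills C stage by stage in the natural order; the function name and list-of-lists representation are ours, and a real sparse code would of course visit only the logically nonzero entries:

```python
def factor(a):
    # Dense illustration of formula (5): c_ij = (a_ij - sum_{k<m} c_ik c_kj) d,
    # m = min(i,j); d = 1 for i <= j (rows of U on and above the diagonal)
    # and d = 1/c_jj for i > j (columns of L stored below the diagonal).
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for j in range(r, n):        # c_rj with r <= j: d = 1
            c[r][j] = a[r][j] - sum(c[r][k] * c[k][j] for k in range(r))
        for i in range(r + 1, n):    # c_ir with i > r: d = 1/c_rr
            c[i][r] = (a[i][r] - sum(c[i][k] * c[k][r] for k in range(r))) / c[r][r]
    return c
```

With A = [[4, 2], [2, 3]] this yields C = [[4, 2], [0.5, 2]], i.e., U = [[4, 2], [0, 2]] and L with unit diagonal and l_21 = 0.5.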
In 1966, a highly efficient symbolic factorization program,
GNSO (GeNerate SOlve) was created [Gustavson et al (1970A)]. GNSO
generates a linear (loop-free) code SOLVE, which is specifically
tailored to the zero-nonzero structure of A. The SOLVE program
represents a machine language code for computing the reduced formula
for each c_ij ≠ 0. The program SOLVE can be very long, and as an
alternative, two programs SFACT and NFACT were created [Chang
(1968A)]. SFACT generates the sparseness structure of C in the
context of Tinney's row Gaussian elimination.†† The program NFACT
uses the sparseness information for C to enhance the speed of
execution of the numerical elimination.
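The two-phase idea can be sketched as follows (this is our own minimal reconstruction of the concept, not Chang's code): a symbolic pass propagates the zero-nonzero pattern once, and a numeric pass then evaluates the reduced formula only at the recorded positions, so repeated factorizations with the same structure pay no pattern-discovery cost.

```python
def sfact(pattern, n):
    # Symbolic phase: given the set of (i, j) positions where a_ij may be
    # nonzero, compute the pattern of C for elimination in natural order.
    # Entry (i, j) is (potentially) nonzero if a_ij != 0 or if c_ik and
    # c_kj are both nonzero for some k < min(i, j).
    nz = set(pattern)
    for r in range(n):
        for j in range(r, n):
            if any((r, k) in nz and (k, j) in nz for k in range(r)):
                nz.add((r, j))
        for i in range(r + 1, n):
            if any((i, k) in nz and (k, r) in nz for k in range(r)):
                nz.add((i, r))
    return nz

def nfact(a, nz):
    # Numeric phase: evaluate the reduced formula (5) only at positions the
    # symbolic phase marked (a structurally nonzero diagonal is assumed).
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for j in range(r, n):
            if (r, j) in nz:
                c[r][j] = a[r][j] - sum(c[r][k] * c[k][j] for k in range(r)
                                        if (r, k) in nz and (k, j) in nz)
        for i in range(r + 1, n):
            if (i, r) in nz:
                c[i][r] = (a[i][r] - sum(c[i][k] * c[k][r] for k in range(r)
                                         if (i, k) in nz and (k, r) in nz)) / c[r][r]
    return c
```

For a tridiagonal pattern the symbolic phase records no fill at all, and the numeric phase touches only the original nonzero positions.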
Gustavson's paper in this Proceedings is a fundamental exposition
of the main programming concepts involved in extensions††† to
his own GNSO program, Chang's programs SFACT and NFACT, and those
of Tinney et al. Row Gaussian elimination, with diagonal pivoting
in the natural order, is treated both for the unsymmetric and
symmetric cases. An ordering program OPTORD is also described.
In the design problem associated with system (1) there is a
nested set of computation loops. Because of this, the elements a ij
of A have a hierarchy of "variability types" (i.e., ≡ 0; ≡ constant;
or dependent on p, t, or w respectively). The last three variability
types imply increasing frequency of change of the numerical value
of the element a_ij with that variability type. If the inner loop
calculations are not memory-limited, then it is desirable to segment
the calculations. This is done in such a way as to avoid
repeated calculation of quantities which are constant within those
inner loops.
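The segmentation idea can be made concrete with a small sketch (the tags and helper below are hypothetical, purely illustrative of the bookkeeping): each nonzero a_ij is tagged with its variability type, and on entering a loop at a given level only the entries at least that volatile are recomputed.

```python
# Hypothetical variability tags, ordered by increasing frequency of change.
ZERO, CONST, P_DEP, T_DEP, W_DEP = range(5)

def entries_to_refresh(tags, level):
    # On entering a loop at `level` (P_DEP: outer design loop, T_DEP: time
    # loop, W_DEP: inner Newton loop), only entries at least that volatile
    # must be recomputed; less volatile entries are reused from the
    # enclosing loop.
    return [pos for pos, tag in tags.items() if tag >= level]
```

For example, entering the time loop refreshes only the t- and w-dependent entries, while the constants and the p-dependent entries keep the values computed in the enclosing loops.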
† Σ_{k=a}^{S} s_k = 0 by definition if S < a.
†† Tinney and his colleagues have developed an extensive sparse
matrix technology for problems in the field of Power Generation and
Distribution [Sato and Tinney (1963A); Tinney and Walker (1967B);
Tinney (1968A); Ogbuobiri, Tinney, and Walker (1970A); Ogbuobiri (1970B)].
††† FORTRAN subroutines, based on these concepts, are in the IBM
program product SL-MATH, which was announced recently by IBM World
Trade Corporation.
8 D. J. ROSE AND R. A. WILLOUGHBY
(6)
† Each of the papers in the Proceedings has its own set of notations
and these are, in general, different from the notation here in the
introduction. The superscript T refers to the transpose operation.
Ax = λBx (7)
† Having too many special purpose codes, on the other hand, tends to
create a high level of human inefficiency.
†† As indicated earlier, it is also used to denote the transpose
operation for rectangular arrays.
ℓ_ki ← v_i , (9a)

v^T ← v^T − v_i u_i^T . (9b)

Then u_k^T = p_k v^T, where p_k = v_k^{-1} = ℓ_kk^{-1}. Note that the resulting
component, v_i, on the left of the assignment symbol in (9b) is zero.
For sparse matrices (9a) and (9b) are done only for those i where
v_i ≠ 0.
In this elimination or factorization stage, a given row of U
may be needed repeatedly to form subsequent rows of U. In computations
which are memory-bound†††, it may be desirable to order
the equations and unknowns so that a given row of U is used only
for a small number of nearby rows of U. Thus one would need
only a reasonably sized locally active storage, and other quantities
could be brought up from backup store as needed or placed there
when generated. This aspect of ordering is an essential feature
of bandwidth minimization and is discussed in Cuthill's paper.
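As a flavor of such orderings, here is a minimal Cuthill-McKee sketch (our simplification: one connected component and a caller-chosen start vertex); it numbers each vertex soon after its neighbors, so that a given row of U is needed only for nearby rows. Reversing the resulting order gives the reverse Cuthill-McKee ordering.

```python
from collections import deque

def cuthill_mckee(adj, start):
    # Breadth-first ordering over the adjacency structure of A; at each
    # vertex the unvisited neighbors are enqueued in order of increasing
    # degree, which tends to keep the bandwidth small.
    order, seen, queue = [], {start}, deque([start])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in sorted(adj[v], key=lambda u: len(adj[u])):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return order
```

On a path graph the ordering simply walks the path, which is the minimum-bandwidth numbering for that structure.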
Row Gaussian elimination can be adapted to a form of pivoting-
for-size. One can, for example, first arrange the rows of A in
order of increasing density of nonzeros in each row, and then apply
threshold pivoting [Curtis and Reid (1971A)]. Pivoting-for-size
can then be achieved by generating a column index array J such
that, for row k, the pivot column is J(k). Procedure (9b) is still
applied, but now i = J(ν) for ν = 1,2,...,k−1. This yields a vector
v^T with v_i = 0 for the previous pivot components. As before,
u_k^T = p_k v^T, but now p_k = v_μ^{-1}, μ = J(k). The pivot component
μ = J(k) satisfies |v_μ| ≥ τ max_ν |v_ν|, where the threshold, τ, is in
the range 0 ≤ τ ≤ 1.
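A sketch of this threshold test (the helper name and the used-set bookkeeping are ours): among the columns not yet chosen as pivots, any candidate within a factor τ of the largest magnitude is acceptable, which leaves room to prefer a sparser column.

```python
def choose_pivot(v, used, tau):
    # Threshold pivoting: among columns not yet used as pivots, accept the
    # first candidate whose magnitude is within a factor tau of the largest.
    # tau = 1 reproduces strict pivoting-for-size; smaller tau admits more
    # candidates, among which a sparse code would pick the one giving the
    # least fill-in.
    candidates = [j for j in range(len(v)) if j not in used and v[j] != 0]
    vmax = max(abs(v[j]) for j in candidates)
    for j in candidates:
        if abs(v[j]) >= tau * vmax:
            return j
```

For v = (0, 3, −5, 4) with column 1 already used and τ = 0.9, only column 2 passes the test and becomes the pivot column J(k).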
This is just another way of writing z^T L = w^T, and from the lower
triangular property of L it follows from (11) that

z_k = p_k w_k , (12a)

where p_k = ℓ_kk^{-1}. The algorithm then proceeds, as long as k > 1, via
the assignments

w^T ← w^T − z_k ℓ_k^T , (12b)

k ← k−1 . (12c)
Linear Programming
There are two sparse matrix algorithms which are widely used
in the field of Linear Programming; namely, PFI (Product Form of
the Inverse) and EFI (Elimination Form of the Inverse). The
† Extended precision inner product accumulation is important
because of the inherent numerical cancellation.
†† A similar algorithm applies to w^T U = c^T.
(13)
matrix with p_k in
If c = T_k^{-1} b, then c can be calculated as follows: Let α = p_k b_k;
then c_k = α, and, for j ≠ k, c_j = b_j − α t_jk. If b_k = 0, then
α = 0 and c = b. In particular, T_k^{-1} e_j = e_j, for j ≠ k, but
T_k^{-1} t'_k = e_k (these are, in fact, the basic characteristics of T_k^{-1}).
Let c^T = b^T T_k^{-1}; then c can be calculated as follows:
c_k = (b_k − b^T t'_{*k}) p_k and c_j = b_j for j ≠ k. Thus one does not need
to form the kth column of T_k^{-1}. This column is often referred to
as an η-vector.
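In code, applying T_k^{-1} to a column vector b needs only the stored η-column t (here a dense Python list for clarity; the function name is ours):

```python
def apply_eta_inverse(t, k, b):
    # Compute c = T_k^{-1} b without forming T_k^{-1}: T_k is the identity
    # with column k replaced by the eta column t. With p_k = 1/t[k],
    # alpha = p_k * b_k; then c_k = alpha and c_j = b_j - alpha * t_j (j != k).
    alpha = b[k] / t[k]
    c = [bj - alpha * tj for bj, tj in zip(b, t)]
    c[k] = alpha
    return c
```

If b_k = 0 then α = 0 and c = b without any arithmetic on the other components, which is what makes the sparse case cheap.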
It is now an easy matter to present the factorization stage
of both the PFI and EFI algorithms. The PFI method is as follows:
t'_1 = a_{*1}, and, for 2 ≤ k ≤ n, t'_k = T_{k−1}^{-1} ··· T_1^{-1} a_{*k}. One then has
the identity T_n^{-1} ··· T_1^{-1} A = I which yields the following form of
inverse:

A^{-1} = T_n^{-1} ··· T_1^{-1} . (14)

A^{-1} = U_1^{-1} U_2^{-1} ··· U_{n−1}^{-1} L_{n−1}^{-1} ··· L_1^{-1} . (15)
First of all, in the former field the matrices are normally posi-
tive definite symmetric. Also, the coefficient matrix has a
highly structured numerical and sparseness pattern. Since it is
clearly beyond the scope of this introduction to systematically
survey this field, only a few remarks of a general character will
be presented along with a summary of the papers by Evans, George,
Guymon-King, and Widlund. However, the bibliography contains a
number of basic references.†
Finite difference techniques have been a major tool for gener-
ating the coefficient matrix for system (4), but more recently,
finite element methods have been actively studied and developed
[ Zlamal (1968A), (1970A); Felippa and Clough (1970A); Babuska
(1971A); Fix and Larsen (1971A)]. A part of the paper by George
concerns the analysis, developed in his Ph.D. thesis at Stanford
in 1971, of finite element methods from the point of view of
computational efficiency.
The application paper by Guymon-King also deals with the finite
element approach. Their field of application is environmental
pollution; that is, they are dealing with the quantification of the
growth, decay, and movement of man-made and natural substances in
the environment. The particular problem discussed in the paper
is the development of a mathematical model capable of simulating
or predicting the movement and fate of man-made pollutants in large
fresh water lakes or reservoirs. This is clearly a very ambitious
project, but it represents a systematic approach to problem modeling
in an important new area of research.
As is always the case in new application areas, the main
focus of attention is on the problem modeling rather than the effi-
cient solution of well known and properly posed problems. General
purpose algorithms and the computer programs which implement these
algorithms should be aimed at providing a computational tool; that
is, to facilitate the formulation and testing of mathematical
models in new application areas.
There has always been a set of tradeoffs in comparing the
computational complexity of various numerical methods for solving
partial differential equations. There is, first of all, the com-
plexity question associated with the generation, storage, and
fetching of the nonzero coefficients in the coefficient matrix.
In the usual finite difference approach, the order n of the matrix
can be very large (10,000-100,000). However, the matrix has only
a small number of nonzero elements in each row (5-50) and this
number is independent of n. Moreover, in simple geometries, the
generation of the coefficients is straightforward. In sophisticated
finite element methods, the size of the matrix is orders of magnitude
smaller, but the matrix is denser and the generation of the
nonzero elements is much more complex.
† [Birkhoff (1971A); Young (1971A)] are two recent surveys which
analyze much of the current research in this area.
AX + XB = C. (18)
The matrices† A(M×N), B(N×N), and C(M×N) are given and one seeks a
matrix X(M×N) which satisfies (18). Equation (18) can also be
expressed in tensor product form [Lynch, Rice, and Thomas (1964A,B)]
and is related to an analysis of alternating direction methods.
Direct solution for certain classes of finite difference equations
have been studied via (18) [Bickley and McNamee (1960A); Osborne
(1965A)]. Notice that the assignment A ← TAT^{-1}, B ← S^{-1}BS,
C ← TCS, and X ← TXS leaves (18) invariant. Thus, by means of
similarity transformations on A and B, one can precondition the system
for easy solution. For example, if A and B are diagonal then (18)
becomes (a_ii + b_jj) x_ij = c_ij.
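In the diagonal case the solution is immediate, since AX + XB = C decouples entrywise into (a_ii + b_jj) x_ij = c_ij; a one-line Python sketch (names ours):

```python
def solve_diagonal_sylvester(a_diag, b_diag, c):
    # With A = diag(a_diag) and B = diag(b_diag), AX + XB = C gives
    # (a_ii + b_jj) x_ij = c_ij, solvable entry by entry (assuming
    # a_ii + b_jj != 0 for every pair).
    return [[c[i][j] / (a_diag[i] + b_diag[j])
             for j in range(len(b_diag))]
            for i in range(len(a_diag))]
```

This is the payoff of the similarity preconditioning described above: once A and B are reduced to diagonal form, the MN coupled equations separate completely.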
Special Topics
This group consists of the two papers by Glaser-Saliba and
Hoernes. The Hoernes paper concerns Data Base Systems, a technology
which will have a growing impact on problem modeling and computa-
tion. In particular, many of the coefficient matrices and vectors
used in sparse matrix applications will be generated from data bases.
Hoernes presents a brief introduction to this important field. An
in-depth treatment of this technology would involve a sophisticated
interaction of the fields of Systems Programming and Computer
Hardware. These subjects are beyond the scope of this Proceedings.
The Glaser-Saliba paper involves, besides a description of
analytical photogrammetry, the important topic of matrix partitioning
[Householder (1953A), (1964A)]. Let A = (A_μν) be an M×M partitioned
matrix where A_νν is a square matrix of order k(ν), and
k(1) + k(2) + ··· + k(M) = n. Assume that A_11 is nonsingular and that
the first k(1) pivots in Gaussian elimination are chosen from the
first k(1) rows and columns. Then the reduced matrix of order
n−k(1) is given†† in block form, for 2 ≤ μ,ν ≤ M, by

A′_μν = A_μν − A_μ1 A_11^{-1} A_1ν . (19)
Notice that (19) is invariant under any nonsingular transformation
of the first k(1) rows and columns of A.
Let M = 2, k(1) = 1, and k(2) = n−1; then (19) is the outer
product representation of a single pivot step. Also, in this case,
A_11^{-1} = t_11^{-1} = p_1, A_21 is a column vector, and A_12 is a row vector.
The outer product approach is used by Gustavson and by Hachtel
relative to symbolic preprocessing and to a priori ordering.
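A small sketch of the block pivot step (19), in the spirit of the outer product approach; `solve11` stands in for whatever factorization of A_11 is available, so the inverse is never formed explicitly (names and dense representation are ours):

```python
def reduce_first_block(A, k1, solve11):
    # One block pivot step, formula (19): A'_uv = A_uv - A_u1 A_11^{-1} A_1v.
    # solve11(col) returns A_11^{-1} col; one block substitution per column
    # of A_12 replaces any explicit inversion of A_11.
    n = len(A)
    # columns of A_11^{-1} A_12, each obtained by a substitution with A_11
    W = [solve11([A[i][j] for i in range(k1)]) for j in range(k1, n)]
    # reduced matrix A_22 - A_21 (A_11^{-1} A_12)
    return [[A[i][j] - sum(A[i][m] * W[j - k1][m] for m in range(k1))
             for j in range(k1, n)]
            for i in range(k1, n)]
```

With k(1) = 1 this collapses to the ordinary outer product pivot step: for A = [[2, 1], [4, 5]] the reduced matrix is the scalar 5 − 4·(1/2)·1 = 3.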
If k(1) >> 1 then formula (19) is to be viewed symbolically,
since one does not usually form the inverse of A_11 explicitly in
† The dimensions are given for each of the arrays in the parentheses
after the symbol for the array.
†† To within roundoff errors.
matrices and also directed graphs. They also study the role of
tearing and modification in sparse systems. Here the use of
graphs allows precise statements about the complexity of the
solution.
If a matrix is block triangular, then only the diagonal blocks
need be factored, and the solution is determined via block substi-
tution. Rose-Bunch point out that, if A is irreducible, so are
all the matrices obtained from A via Gaussian reduction. However,
A_11 can still be highly reducible, and they show how one can take
advantage of the reducibility of A_11. Again, one can create a
modified algorithm in which only the diagonal blocks of A_11 are
factored† in the formation of Ã_22. For example, let
Ã_12 = A_11^{-1} A_12. Now, Ã_12 is formed via a set of block substi-