IEEE TRANSACTIONS ON POWER APPARATUS AND SYSTEMS, VOL. PAS-89, NO. 1, JANUARY 1970
Sparsity Programming
E. C. OGBUOBIRI,
MEMBER, IEEE
is obtained. Thus, even when one has plenty of space for a given
class of problems, one should also consider sparsity coding in
preference to a conventional coding as a means of reducing computer time.
The author is not immune to the reservations of prospective
sparsity programmers. The idea is always there. The goal is
around the corner. However, the path to it is complicated. The
difficulty with sparsity coding in scientific programming appears
to lie in the following:
1) there is more to it than knowledge of FORTRAN language,
2) the program logic is more involved, and hence more debugging time is expected.
Certain applications require that the creator of a problem or
the inventor of an algorithm play the role of a programmer and
vice versa. But this is an ideal combination of talents. Sometimes
the creator knows just enough conventional programming to be
able to code his problem on a smaller scale. Would it not be helpful to find a programmer who does not understand the totality of
the program logic but who has the skill to expand the size of the
existing program? This would, in fact, amount to a division of
labor that does not involve any loss in efficiency.
This paper purports to formalize sparsity programming in such
a way that the subject will become less mysterious to most conventional coders. By introducing and emphasizing a systematic
approach, the task of sparsity programming will be much easier
both conceptually and in practice. Although rectangular arrays
are implied in the text, the extension to higher dimensional
arrays is obvious. In the interest of the majority of users, the
FORTRAN language is assumed throughout the text. The works of
Byrnes [2], Randell and Kuehner [3], and Jodeit [4] are very
inspiring.
Fig. 1. [figure: standard column-by-column storage of an M × N array A in one block — A(1,1) through A(M,1) in words 1 to M, A(1,2) at word M + 1, and so on through A(M,N); also indicated are row starting addresses, end of row 2 through end of row M, diagonals, off-diagonals, and the end of block A.]
STORAGE ORGANIZATION
Definition
A block of memory is a contiguous set of storage locations.
Consider a rectangular matrix A of dimensions M X N. The
standard FORTRAN practice is to store this matrix column by
column in a block of memory named A. The first column is
followed by the second column, and so on, until the N columns are
exhausted (see Fig. 1). Thus the double-subscripted variable A is
stored as a single-subscripted variable. The element aij is interpreted as A(Mj + i - M). Consequently, the expansion rule for
the ordered pair (i,j) denoting (row index, column index) is

    (i,j) = Mj + i - M.                                      (1)
This rule is built into the FORTRAN compiler, and the translation
is automatic whenever the ordered pair is encountered in a coding.
If, in a DIMENSION statement, one had coded A(MN) in place of
A(M,N), then the compiler would be perfectly happy if within the
body of the program one codes A(Mj + i - M) in place of A(i,j).
Surely the coding A(i,j) is simpler to use than the coding A(Mj + i - M), which involves a preliminary computation. To take advantage of such simplicity we naturally sacrifice flexibility.
Therefore, when the matrix A is sparse and we can no longer
afford to store all the zero elements, we must devise a special-purpose storage scheme for which we must also know the associated translation for the ordered pair (i,j). This is the essence of
sparsity programming.
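The expansion rule (1) can be illustrated with a short sketch. Python is used here purely for concreteness (the paper's own setting is FORTRAN); the function name loc is hypothetical:

```python
def loc(i, j, M):
    """Column-major (FORTRAN-style) expansion rule of (1):
    element a(i,j) of an M x N array maps to word M*j + i - M
    of the one-dimensional block A (addresses are 1-based)."""
    return M * j + i - M

# For a 4 x 3 array: column 1 occupies words 1..4, column 2 words 5..8.
assert loc(1, 1, 4) == 1
assert loc(4, 1, 4) == 4
assert loc(1, 2, 4) == 5
assert loc(3, 2, 4) == 7
```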
In many applications, it is easier and more natural for an input
routine to build a system matrix row by row than to build it
column by column. The attributes stored for each element of a row
are
1) column index c,
2) value v of the element (optional),
3) address p of the next element in the row.
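These three attributes can be sketched as a small record, here in Python rather than the paper's FORTRAN; the class name Element is hypothetical (cf. Fig. 3 for the printed format):

```python
class Element:
    """One element of a chained row: column index c, value v,
    and pointer p to the next element in the row (p = None
    marks the terminal element, cf. Fig. 3(b))."""
    def __init__(self, c, v, p=None):
        self.c = c   # column index
        self.v = v   # value (optional, e.g. omitted for binary matrices)
        self.p = p   # address of the next element in the row

# A row with two chained elements: a(i,1) = 2.5 followed by a(i,3) = 7.0.
last = Element(3, 7.0)           # terminal element of the row
first = Element(1, 2.5, last)    # nonterminal element, chained to it
assert first.p.c == 3 and first.p.p is None
```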
Fig. 3. Format for element storage. (a) Nonterminal element
of row. (b) Terminal element of row.
Retrieval of an Element
In order to retrieve a given element of a working array A, the
following procedure may be employed:
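The printed procedure is obscured here by the figure region. A minimal sketch of the typical retrieval, assuming the chained-row format of Fig. 3 with each element modeled as a (c, v, p) triple, is:

```python
def retrieve(row_head, j):
    """Scan the chained elements of one row for column index j.
    Returns the stored value, or 0.0 if no element carries that
    column index (absent elements of a sparse row are zeros)."""
    node = row_head
    while node is not None:
        c, v, p = node            # (column index, value, next pointer)
        if c == j:
            return v
        node = p                  # follow the chain to the next element
    return 0.0

# Row containing a(i,2) = 5.0 and a(i,5) = -1.0, chained in that order.
row = (2, 5.0, (5, -1.0, None))
assert retrieve(row, 5) == -1.0
assert retrieve(row, 3) == 0.0
```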
Fig. 4. List modification. (a) List before addition of new element. (b) List after addition of new element.
Fig. 5. [figure: the list of available addresses, kept as a stack with pointers FAA and NAA.]
Fig. 6. Inclusion of new available address q in list of available addresses. (a) Stack before chaining. (b) Stack after chaining.
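The chaining step of Fig. 6 can be sketched as follows. This is an illustrative model, not the paper's code: the storage block is a Python list, each free cell's pointer field holds the address of the next free cell, and FAA names the first available address:

```python
def release(block, faa, q):
    """Chain newly available address q onto the stack of available
    addresses (cf. Fig. 6): cell q points at the old first available
    address, and q becomes the new first available address."""
    block[q] = faa   # q's pointer field -> previous head of the free list
    return q         # new FAA

def acquire(block, faa):
    """Pop the first available address off the stack; returns the
    granted address and the new FAA."""
    assert faa is not None, "storage block exhausted"
    return faa, block[faa]

block = [None] * 8
faa = None
faa = release(block, faa, 5)     # free list: 5
faa = release(block, faa, 2)     # free list: 2 -> 5
addr, faa = acquire(block, faa)
assert addr == 2 and faa == 5
```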
Garbage Collection
There are some applications in which nonzero elements of an
array become zero in the course of program execution. A procedure by which the space occupied by these elements can be reused as soon as they become available is referred to as condensation or garbage collection. The essence of condensation is to move
active elements to one side of the storage block and the available
addresses to the other side.
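The essence of condensation admits a compact sketch. This is a simplified model (Python, with freed cells marked None), not the paper's procedure; in a real sparsity code the row pointers referencing the moved cells must also be updated, a bookkeeping step omitted here:

```python
def condense(block):
    """Compact a storage block in place: active (non-None) cells move
    to the low end, freed cells collect at the high end. Returns the
    index of the first available address after condensation."""
    front = 0
    for cell in list(block):
        if cell is not None:
            block[front] = cell   # move active element toward one side
            front += 1
    for i in range(front, len(block)):
        block[i] = None           # the other side is now all available
    return front

block = [('c1', 1.0), None, ('c3', 2.0), None, ('c5', 3.0)]
first_free = condense(block)
assert block[:3] == [('c1', 1.0), ('c3', 2.0), ('c5', 3.0)]
assert first_free == 3 and block[3:] == [None, None]
```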
Fig. 8. Values of CM for N = 1000 and typical values of k.

    k     c) symmetric,  d) symmetric,  a) unsymmetric,  b) unsymmetric,
          real           complex        real             complex
   1.5      4250           6000            6500             9000
   2.0      5000           7000            8000            11000
   2.5      5750           8000            9500            13000
   3.0      6500           9000           11000            15000
   3.5      7250          10000           12500            17000
   4.0      8000          11000           14000            19000
For binary matrices such as incidence matrices, further compactification of storage is possible if the (c,p) attribute of each
element in a row is binary coded so that one or more element
attributes can be packed into one full word. However, the retrieval costs for such packing are unduly high in high-level
languages like FORTRAN. In basic machine languages, this reduction in storage space can be achieved without any sacrifice in computation time.
PROGRAMMING STRATEGY
The approach to sparsity programming should be a two-stage
process: the first stage is conventional programming using
formal subscripts, while the second stage is a program conversion.
For a stored unsymmetric matrix:
a) the total memory space for the real case is 2N + 3n;
b) the total memory space for the complex case is 3N + 4n.
For a stored symmetric matrix:
c) the total memory space for the real case is 2N + 1.5n;
d) the total memory space for the complex case is 3N + 2n.
If we let k be the average number of off-diagonal elements per
row, then n = kN. Let CM denote the total central memory requirement. With these definitions, the formulas a)-d) will
become
a) CM = (3k + 2)N
b) CM = (4k + 3)N
c) CM = (1.5k + 2)N                                          (2)
d) CM = (2k + 3)N.
Fig. 8 shows the values of CM for a typical value of N and typical
values of k for rectangular arrays. For any value of N, the value of CM can be
computed from Fig. 8 by multiplying the value of CM corresponding to the given k by N/1000.
(a) [figure: block A of dimension 20 000, with subblock SA starting at address 1, DIAG at N + 1, and OFFD at 2N + 1]
C     MAIN PROGRAM
      COMMON / BLOCK / A(20000)
      COMMON / BLANK / ISA, IDIAG, IOFFD
      ISA = 1
      IDIAG = N + 1
      IOFFD = 2*N + 1
      RETURN
      END
(b)
Fig. 9. Block design and base address initialization. (a) Block A
of dimension 20 000 and with subblocks SA, DIAG, and OFFD.
(b) Sample coding to illustrate modularity and relocatability.
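The base-address idea of Fig. 9 can be restated in a language-neutral sketch (Python here; only ISA, IDIAG, and IOFFD come from the sample coding, the rest is illustrative): every routine addresses a subblock element as block[base + offset], so the whole layout is relocated by changing the bases alone.

```python
BLOCK_SIZE = 20000
A = [0.0] * BLOCK_SIZE   # the single working block of Fig. 9(a)

def base_addresses(N):
    """Starting addresses of the subblocks, 1-based as in the
    FORTRAN coding: ISA = 1, IDIAG = N + 1, IOFFD = 2N + 1."""
    return {'SA': 1, 'DIAG': N + 1, 'OFFD': 2 * N + 1}

bases = base_addresses(100)
assert bases['DIAG'] == 101 and bases['OFFD'] == 201
```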
The coding

      DO label 1 I = m1,m2,m3
      DO label 2 J = n1,n2,n3
      A(I,J) = X
label 2 CONTINUE
label 1 CONTINUE
calls for a direct translation, in accordance with 3), of the statement A (I,J) = X.
      DO label 1 I = m1,m2,m3
      DO label 2 J = n1,n2,n3
X = A(I,J)
label 2 CONTINUE
label 1 CONTINUE
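The program-conversion step for such loops can be sketched as follows. This is an illustration, not the paper's translation rule: the dictionary rows stands in for the chained rows of the text, and the function a replaces the subscripted reference A(I,J).

```python
# Hypothetical stand-in for the linked-row storage: rows[i] lists the
# (column index, value) pairs of row i; absent entries are zeros.
rows = {1: [(2, 5.0)], 2: [(1, -3.0), (3, 4.0)]}

def a(i, j):
    """Sparsity-coded replacement for the conventional reference A(I,J)."""
    for c, v in rows.get(i, []):
        if c == j:
            return v
    return 0.0

# The conventional double loop "X = A(I,J)" becomes a scan of each row.
total = sum(a(i, j) for i in (1, 2) for j in (1, 2, 3))
assert a(2, 3) == 4.0 and a(1, 1) == 0.0
assert total == 6.0
```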
CONCLUSION
In outlining the essence of sparsity programming in a problemindependent context, the complete separability between the
functions of a conventional coder and those of a sparsity coder
has been demonstrated and emphasized. The need for this separability of tasks becomes more evident in large complex programs and is not diminished by the fact that the two coders may
be one and the same person since the overall programming
efficiency and debugging speed are enhanced by the staging of the
two tasks. The method of storage and retrieval of array elements
has been formalized in a manner providing automatic garbage
collection and a rule-of-thumb calculation of core-storage requirements; sample formulas and memory budget tables are
given. The examples of the recompiling techniques for the key
conventional codings constitute a program conversion guide;
this conversion guide lends itself to a software implementation.
REFERENCES
[1] W. F. Tinney and C. E. Hart, "Power flow solution by Newton's method," IEEE Trans. Power Apparatus and Systems, vol. PAS-86, pp. 1449-1460, November 1967.
Discussion
Jen Hsu (P.O. Box 361, Berkeley, Calif. 94701): The paper essentially
pointed out the utilization of linked lists to minimize storage for
sparse matrices. Linked lists, a well-known programming technique [5], are composed of items which do not usually occupy
contiguous locations in memory. Therefore, an item contains secondary information, i.e., a pointer to where the next item in the chain is
located.
For more effective programming, I would suggest the use of a
circular list instead of the linear list which the author described. We
know that a circular list can be used not only to represent an inherently circular structure but also to represent a linear structure,
i.e., a circular list with one pointer to the rear node. Although a circular list is essentially equivalent to a straight linear list with two
pointers to the front and rear, it will be possible to have access to all
of the list starting at any given point, and it will be more effective to
move the leftmost to the right [5].
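Mr. Hsu's suggestion can be sketched briefly. This is a minimal illustration (Python; the Node and append names are hypothetical): a circular list kept by a single pointer to its rear node still reaches the front in one step (rear.next), and a traversal may start at any node.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = self   # a single node points at itself

def append(rear, value):
    """Insert value at the rear of a circular list held by one pointer
    to its rear node; returns the new rear."""
    node = Node(value)
    if rear is None:
        return node
    node.next = rear.next   # new rear points at the front node
    rear.next = node
    return node

rear = None
for v in (1, 2, 3):
    rear = append(rear, v)
assert rear.value == 3 and rear.next.value == 1   # front reached from rear
assert rear.next.next.value == 2
```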
REFERENCES
[5] W. D. Maurer, Programming: An Introduction to Computer Languages and Techniques. San Francisco: Holden-Day, 1968.
E. C. Ogbuobiri: The answer to Mr. Hsu's remarks is in the paper itself. However, the following additional comments are offered.
The paper claims that there is a simple methodology for sparsity
programming and describes this methodology in detail. It is emphasized, even in the abstract, that this methodology involves no more
than a replacement of the FORTRAN storage convention for multidimensional arrays by special-purpose storage organizations. For a
class of problems, Gaussian elimination, for instance, the storage organization must be designed prior to translation into sparsity code only
after the conventional code, which assumes arrays with no sparsity,
is fully debugged. A concrete general-purpose demonstration is given
for the simple problem of storage and retrieval of elements of real
unsymmetric arrays together with a method for garbage collection.
The paper also claims that the two-stage approach described fosters
division of labor and also enhances overall gains in programming
efficiency.
The design of storage organization takes into account Mr. Hsu's
observation. An array is stored as a tree of lists. The terminal lists
are linear. Hence a vector becomes the building block of an array.
A vector together with the use (and the sequence of creation and extinction) of its elements in the course of computation determine the
most efficient list structure. While some I/O buffer may lend itself to
a circular list, most storage and retrieval problems of numerical
computation exploiting sparsity do not. While, in Gaussian elimination, implicit chaining of row elements is always possible, rows of
certain incidence matrices whose row elements arrive randomly at
input time require explicit chaining. Thus there is room, in each
problem, to optimize the storage organization. However, if the programmer were simply to implement a general-purpose storage and
retrieval scheme for multidimensional arrays without due regard to
numerical operations on the array elements, as is done in IBM System/360 MATLAN (October 1968 edition), he will still achieve overall
gains in both storage space and computation time for reasons given
in the paper; the bottleneck problem of memory requirements will at
least be solved. He can always improve on the gains by improved
storage organization that takes into account the process of computations to which an array is subjected.
It should be self-evident that this author is not selling a particular
list structure (since the most efficient list can be found for each specific
application) but a methodology which is an outgrowth of his experience in sparsity programming. If any user understands and applies
this methodology in his problems, his sparsity code and programming effort in a given computer language can, in that language,
only be equaled but not excelled.