Programming Considerations: Tianhe-2 (China)

CHAPTER 2
PROGRAMMING CONSIDERATIONS
2.1 Introduction
The use of digital computers imposes two main limitations: time and
memory. In some applications, it is possible to trade time (speed) for
memory or vice versa. A programming technique is considered efficient
when both time and memory requirements can be improved.
There are three factors that affect the performance of any programming
technique:
- computer type (Tianhe-2 (China), Titan (US), SUN, DC-Station,
PC, etc.),
- programming language (FORTRAN, BASIC, C++, Java, etc.), and
- application (power systems, VLSI, NASA, etc.).
The main issues for any programming technique are
- sparsity programming (to compress the data stored),
- triangular factorization (to solve linear algebraic equations), and
- data management (to improve processing and accessing of data).
Overlaying is a programming technique widely used in FORTRAN and
BASIC. In this method, certain portions of a computer program are held in
high speed memory during all phases of the execution, and the other
portions (stored on slower media) are brought into high speed memory only
when needed.
2.2 Sparsity Programming
Sparsity programming is a technique that stores sparse matrices in a
compact form. The basis of this technique is to store the nonzero entries
only while keeping track of the location of these entries in the matrix
(it is also called a storage/access programming technique).
Ibrahim Omar Habiballah Sep. 2013 (Updated Sep. 2014)

Sparsity programming can also be used to process nonzero data only
(as in the inversion of nonsingular square matrices). In this way it not
only reduces the memory requirements but also increases the speed.
The performance of any sparsity technique depends on

- matrix symmetry,
- storage order,
- percent sparsity, and
- occurrence of blocks of zeros.
Two broad classes of sparsity programming techniques are:
1) Entry-Row-Column Storage Method.
2) Chained Data Structure Method.
2.2.1) Entry-Row-Column Storage Method
In this method, the nonzeros and the corresponding row/column locations
are stored in three parallel arrays:
- STO (for storing nonzero entries),
- IR (for storing corresponding row locations), and
- IC (for storing corresponding column locations).
Therefore, the total required storage for an n x n sparse matrix with ne
nonzeros is 3 ne entries (compared to n^2 entries when storing the entire
matrix).
Example (1):
a) Consider a 3 x 3 matrix with 4 nonzeros, then
no. of entries using normal storage method is 9
no. of entries using first storage method is 12
b) Consider a 4 x 4 matrix with 6 nonzeros, then
no. of entries using normal storage method is 16
no. of entries using first storage method is 18
c) Consider a 1000 x 1000 matrix with 5000 nonzeros, then
no. of entries using normal storage method is 1 000 000
no. of entries using first storage method is 15 000
The above example shows that sparsity programming pays off for large-scale
sparse matrices (case c), whereas for small matrices (cases a and b) the
compact form needs more entries than normal storage.
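The counts in Example (1) can be reproduced with a few lines of Python. This is only a sketch of the arithmetic; the function names are illustrative and do not come from the text.

```python
def dense_entries(n):
    """Entries needed to store every element of an n x n matrix."""
    return n * n


def entry_row_column_entries(ne):
    """Entries needed for STO, IR and IC when there are ne nonzeros."""
    return 3 * ne


# The three cases of Example (1): (matrix order n, number of nonzeros ne).
for n, ne in [(3, 4), (4, 6), (1000, 5000)]:
    dense = dense_entries(n)
    compact = entry_row_column_entries(ne)
    verdict = "saves storage" if compact < dense else "no saving"
    print(f"n={n}, ne={ne}: normal={dense}, compact={compact} ({verdict})")
```

The compact form saves storage only when 3 ne is smaller than n^2, which is why the two small cases show no saving.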
The nonzero entries of a sparse matrix can be stored in either row or
column fashion. The nonzeros can also be stored in arbitrary order; in this
case, the IR and IC arrays must be scanned every time an access is
required.
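The scan needed for arbitrary-order storage might look as follows. This is a minimal sketch; the function name get_entry and the small data set are hypothetical, and subscripts are 1-based as in the text.

```python
def get_entry(sto, ir, ic, row, col):
    """Return the value at (row, col), scanning IR and IC from the top."""
    for k in range(len(sto)):
        if ir[k] == row and ic[k] == col:
            return sto[k]
    return 0.0  # position not stored, so the entry is zero


# Hypothetical 3 x 3 matrix with three nonzeros stored in arbitrary order.
STO = [2.0, 1.0, 3.0]
IR = [3, 1, 2]
IC = [3, 2, 1]

print(get_entry(STO, IR, IC, 2, 1))  # stored entry
print(get_entry(STO, IR, IC, 2, 2))  # not stored, hence zero
```

In the worst case every access visits all ne stored entries, which is the cost of not keeping the list in row or column order.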
Insertion of a new entry requires pushing down the entries located below
the insertion point. Similarly, deletion of an existing entry requires
pushing up the entries located below the deleted one.
A major drawback of this method is therefore the long access time needed
whenever an entry is inserted into or deleted from the list.
Example (2):
Consider the following 3 x 3 sparse matrix

      | 0  1  0 |
  A = | 3  0  0 |
      | 0  0  2 |

The row fashion storage method is
Counter STO IR IC
1 1.0 1 2
2 3.0 2 1
3 2.0 3 3
Assume the insertion of entry 4 in row 1 and column 3.

      | 0  1  4 |
  A = | 3  0  0 |
      | 0  0  2 |

Storing this new entry pushes down the entries of the STO, IR, and IC
arrays that follow it:
Counter STO IR IC
1 1.0 1 2
2 4.0 1 3
3 3.0 2 1
4 2.0 3 3
Assume, now, the deletion of entry 1 from row 1 and column 2.

      | 0  0  4 |
  A = | 3  0  0 |
      | 0  0  2 |

Deleting this entry pushes up the entries of the STO, IR, and IC arrays
that follow it:
Counter STO IR IC
1 4.0 1 3
2 3.0 2 1
3 2.0 3 3
Such insertion/deletion requires long access time.
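The insertion and deletion of Example (2) can be sketched with Python lists. The helper names are illustrative only; list.insert and del shift every later element by one position, which is exactly the pushing down and pushing up described above.

```python
import bisect


def insert_entry(sto, ir, ic, row, col, value):
    """Insert in row-fashion order; later entries are pushed down."""
    keys = list(zip(ir, ic))
    k = bisect.bisect_left(keys, (row, col))
    sto.insert(k, value)
    ir.insert(k, row)
    ic.insert(k, col)


def delete_entry(sto, ir, ic, row, col):
    """Delete the entry at (row, col); later entries are pushed up."""
    k = list(zip(ir, ic)).index((row, col))
    del sto[k], ir[k], ic[k]


# Arrays of Example (2), row fashion, 1-based subscripts.
STO = [1.0, 3.0, 2.0]
IR = [1, 2, 3]
IC = [2, 1, 3]

insert_entry(STO, IR, IC, 1, 3, 4.0)
print(STO, IR, IC)

delete_entry(STO, IR, IC, 1, 2)
print(STO, IR, IC)
```

Both operations move, on average, a number of elements proportional to ne in each of the three arrays.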
2.2.2) Chained Data Structure Method
In this method, the nonzeros and the corresponding row (or column)
locations are stored in three parallel arrays:
- STO (for storing nonzero entries),
- IC (for storing corresponding column locations, if row fashion is
used) or IR (for storing corresponding row locations, if column fashion
is used), and
- NX (for telling how far down the list the next entry will be found).
It is also necessary to employ a fourth array, NFIRST, which tells where a
certain row/column starts in the list. The size of the NFIRST array is equal
to the number of rows (if row fashion is used) or the number of columns (if
column fashion is used).
The total required storage for an n x n sparse matrix with ne nonzeros is
3 ne + n entries (as opposed to n^2 entries when storing the entire matrix).
Example (3):
a) Consider a 3 x 3 matrix with 4 nonzeros, then
no. of entries using second storage method is 15
b) Consider a 4 x 4 matrix with 6 nonzeros, then
no. of entries using second storage method is 22
c) Consider a 1000 x 1000 matrix with 5000 nonzeros, then
no. of entries using normal storage method is 1 000 000
no. of entries using second storage method is 16 000
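As a quick check of Example (3), the following sketch evaluates 3 ne + n for the same three cases and prints it next to the normal (n^2) and entry-row-column (3 ne) counts. The function name is illustrative.

```python
def chained_entries(n, ne):
    """STO, IC (or IR) and NX hold ne entries each; NFIRST holds n."""
    return 3 * ne + n


# (matrix order n, number of nonzeros ne) for cases a), b) and c).
for n, ne in [(3, 4), (4, 6), (1000, 5000)]:
    print(f"n={n}, ne={ne}: normal={n * n}, "
          f"entry-row-column={3 * ne}, chained={chained_entries(n, ne)}")
```

The chained method costs n entries more than the entry-row-column method; that extra NFIRST array is what allows a row or column to be located without scanning the whole list.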
The nonzeros in this method can be stored in either row or column fashion.
A major advantage of this method is the short access time needed whenever
an entry is inserted into or deleted from the list.
Insertion of a new entry requires no pushing down of other entries. In this
case, the NX array may need modification. Similarly, deletion of an entry
requires no pushing up of other entries. In this case, the NX array and/or
the NFIRST array may need modification.
Example (4):
Consider the 3 x 3 sparse matrix of Example (2). The four arrays required
by this storage method, using the row fashion approach, are
Counter     STO  IC  NX  NFIRST
      1     1.0   2   0       1
      2     3.0   1   0       2
      3     2.0   3   0       3
Assume the insertion of entry 4 in row 1 and column 3. The new entry is
stored at the end of the list, and NX(1) is changed from 0 to 3 so that the
existing entry of row 1 points to it.
Counter     STO  IC  NX  NFIRST
      1     1.0   2   3       1
      2     3.0   1   0       2
      3     2.0   3   0       3
      4     4.0   3   0
Assume, now, the deletion of entry 1 from row 1 and column 2. NFIRST(1) is
changed from 1 to 4, so that row 1 now starts at the entry 4.0.
Counter     STO  IC  NX  NFIRST
      1     1.0   2   3       4
      2     3.0   1   0       2
      3     2.0   3   0       3
      4     4.0   3   0
Notice, in this example, that NX remains untouched.
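The bookkeeping of Example (4) can be sketched as a small Python class. This is one possible reading of the method, not a definitive implementation: the class and method names are hypothetical, NX is taken to be a relative offset with 0 marking the end of a chain (consistent with the tables), and NFIRST = 0 is assumed to mark an empty row, which the text does not specify.

```python
class ChainedMatrix:
    """Row-fashion chained storage with 1-based positions.

    Index 0 of every list is unused padding so that list positions match
    the Counter column. NFIRST = 0 marks an empty row (an assumption).
    """

    def __init__(self, n_rows):
        self.sto = [None]
        self.ic = [None]
        self.nx = [None]
        self.nfirst = [0] * (n_rows + 1)

    def _chain(self, row):
        """Yield the list positions of the entries belonging to `row`."""
        k = self.nfirst[row]
        while k:
            yield k
            k = k + self.nx[k] if self.nx[k] else 0

    def insert(self, row, col, value):
        """Append the entry at the end of the list and link it to its row."""
        self.sto.append(value)
        self.ic.append(col)
        self.nx.append(0)
        new = len(self.sto) - 1
        last = 0
        for last in self._chain(row):
            pass
        if last:
            self.nx[last] = new - last  # offset from old tail to new entry
        else:
            self.nfirst[row] = new      # first entry of this row

    def delete(self, row, col):
        """Unlink the entry; STO and IC keep the stale value."""
        prev = 0
        for k in self._chain(row):
            if self.ic[k] == col:
                nxt = k + self.nx[k] if self.nx[k] else 0
                if prev:
                    self.nx[prev] = nxt - prev if nxt else 0
                else:
                    self.nfirst[row] = nxt
                return
            prev = k
        raise KeyError((row, col))


m = ChainedMatrix(3)
for row, col, value in [(1, 2, 1.0), (2, 1, 3.0), (3, 3, 2.0)]:
    m.insert(row, col, value)
m.insert(1, 3, 4.0)
print(m.nx[1:], m.nfirst[1:])
m.delete(1, 2)
print(m.nx[1:], m.nfirst[1:])
```

Neither operation moves any stored value: insertion appends one element and rewrites at most one pointer, and deletion rewrites exactly one pointer.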
2.2.3) Updating of an Entry
In this case, the old value is replaced with the new value directly in the
STO array (using either of the two storage methods) without affecting the
other arrays.
2.2.4) Increasing/Decreasing the Size of the Matrix
Increasing/decreasing the size of a sparse matrix using the entry-row-column
storage method is treated exactly as insertion/deletion of entries in
the matrix.
Increasing/decreasing the size of a sparse matrix using the chained data
structure storage method is treated similarly to the insertion/deletion of
entries in the matrix; however, the size of the NFIRST array changes
accordingly.

Example (5):
The following 9 x 7 sparse matrix has 15 nonzero entries out of 63.

      |  1  0  0 -1  0  0  0 |
      |  0  2  0  0  0  0 -2 |
      |  0  0  3  0  0  0  0 |
      |  0  0  0  4 -4  0  0 |
  A = |  0  0  0  0  5  0  0 |
      | -6  0  0  0  0  6  0 |
      |  0  0  0  0  0  0  7 |
      |  0  0  0  0  0  8 -8 |
      | -9  0  0  0  0  0  9 |

Using row-fashion of chained data structure storage method, the following
arrays can be recorded:
Counter     STO  IC  NX  NFIRST
      1     1.0   1   1       1
      2    -1.0   4   0       3
      3     2.0   2   1       5
      4    -2.0   7   0       6
      5     3.0   3   0       8
      6     4.0   4   1       9
      7    -4.0   5   0      11
      8     5.0   5   0      12
      9    -6.0   1   1      14
     10     6.0   6   0
     11     7.0   7   0
     12     8.0   6   1
     13    -8.0   7   0
     14    -9.0   1   1
     15     9.0   7   0
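The four arrays above can be generated mechanically from the full matrix. The sketch below is illustrative only: the function name build_chained is hypothetical, the signs of the matrix entries are taken from the STO column of the table, and NFIRST = 0 for an empty row is an assumption that this example never exercises.

```python
def build_chained(dense):
    """Return (STO, IC, NX, NFIRST) in row fashion, using 1-based values."""
    sto, ic, nx, nfirst = [], [], [], []
    for row in dense:
        cols = [j for j, v in enumerate(row, start=1) if v != 0]
        # Position in the list where this row starts (0 if the row is empty).
        nfirst.append(len(sto) + 1 if cols else 0)
        for idx, j in enumerate(cols):
            sto.append(float(row[j - 1]))
            ic.append(j)
            # Entries of a row are adjacent, so the offset is 1 until the
            # last entry of the row, which ends the chain with 0.
            nx.append(1 if idx < len(cols) - 1 else 0)
    return sto, ic, nx, nfirst


A = [
    [1, 0, 0, -1, 0, 0, 0],
    [0, 2, 0, 0, 0, 0, -2],
    [0, 0, 3, 0, 0, 0, 0],
    [0, 0, 0, 4, -4, 0, 0],
    [0, 0, 0, 0, 5, 0, 0],
    [-6, 0, 0, 0, 0, 6, 0],
    [0, 0, 0, 0, 0, 0, 7],
    [0, 0, 0, 0, 0, 8, -8],
    [-9, 0, 0, 0, 0, 0, 9],
]

STO, IC, NX, NFIRST = build_chained(A)
print(STO)
print(IC)
print(NX)
print(NFIRST)
```

Immediately after construction every NX value is 0 or 1; larger offsets appear only once entries are appended to the end of the list later on.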

Assuming the entry -2.0 (in row 2 and column 7) is changed to -2.1, and the
entry -9.0 (in row 9 and column 1) is changed to -9.1; then the sizes of all
four arrays remain the same. Only the values of revised entries are
changed in the STO array.

Counter     STO  IC  NX  NFIRST
      1     1.0   1   1       1
      2    -1.0   4   0       3
      3     2.0   2   1       5
      4    -2.1   7   0       6
      5     3.0   3   0       8
      6     4.0   4   1       9
      7    -4.0   5   0      11
      8     5.0   5   0      12
      9    -6.0   1   1      14
     10     6.0   6   0
     11     7.0   7   0
     12     8.0   6   1
     13    -8.0   7   0
     14    -9.1   1   1
     15     9.0   7   0
Assume now the addition of a new entry -5.0 (in row 5 and column 1).
Only the parallel arrays are updated: the new entry is appended at position
16, and NX(8) changes from 0 to 8 so that the existing entry of row 5 points
to it. NFIRST is unchanged.
Counter     STO  IC  NX  NFIRST
      1     1.0   1   1       1
      2    -1.0   4   0       3
      3     2.0   2   1       5
      4    -2.1   7   0       6
      5     3.0   3   0       8
      6     4.0   4   1       9
      7    -4.0   5   0      11
      8     5.0   5   8      12
      9    -6.0   1   1      14
     10     6.0   6   0
     11     7.0   7   0
     12     8.0   6   1
     13    -8.0   7   0
     14    -9.1   1   1
     15     9.0   7   0
     16    -5.0   1   0

Assume now the removal of the existing entry -2.1 (in row 2 and column 7).
Only the NX array is updated. Notice that in this case the update is very
fast [only NX(3) changes from 1 to 0]. The removed value still occupies
position 4 of the list, but no chain reaches it any more.
Counter     STO  IC  NX  NFIRST
      1     1.0   1   1       1
      2    -1.0   4   0       3
      3     2.0   2   0       5
      4    -2.1   7   0       6
      5     3.0   3   0       8
      6     4.0   4   1       9
      7    -4.0   5   0      11
      8     5.0   5   8      12
      9    -6.0   1   1      14
     10     6.0   6   0
     11     7.0   7   0
     12     8.0   6   1
     13    -8.0   7   0
     14    -9.1   1   1
     15     9.0   7   0
     16    -5.0   1   0
Assume now the addition of a new row with two entries: -10 (in row 10 and
column 3), and 10 (in row 10 and column 6). All arrays are updated as
follows; in particular, NFIRST grows by one element, NFIRST(10) = 17.
Counter     STO  IC  NX  NFIRST
      1     1.0   1   1       1
      2    -1.0   4   0       3
      3     2.0   2   0       5
      4    -2.1   7   0       6
      5     3.0   3   0       8
      6     4.0   4   1       9
      7    -4.0   5   0      11
      8     5.0   5   8      12
      9    -6.0   1   1      14
     10     6.0   6   0      17
     11     7.0   7   0
     12     8.0   6   1
     13    -8.0   7   0
     14    -9.1   1   1
     15     9.0   7   0
     16    -5.0   1   0
     17   -10.0   3   1
     18    10.0   6   0
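To see how the final arrays of Example (5) are read back, the sketch below follows the chain of a given row. It assumes NX is a relative offset with 0 marking the end of the chain, as the tables suggest, and that every row has at least one entry (true here). The function name row_entries is hypothetical, and index 0 of each list is padding so that positions match the Counter column.

```python
STO = [None, 1.0, -1.0, 2.0, -2.1, 3.0, 4.0, -4.0, 5.0, -6.0,
       6.0, 7.0, 8.0, -8.0, -9.1, 9.0, -5.0, -10.0, 10.0]
IC = [None, 1, 4, 2, 7, 3, 4, 5, 5, 1, 6, 7, 6, 7, 1, 7, 1, 3, 6]
NX = [None, 1, 0, 0, 0, 0, 1, 0, 8, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
NFIRST = [None, 1, 3, 5, 6, 8, 9, 11, 12, 14, 17]


def row_entries(row):
    """Return {column: value} for one row by following NFIRST and NX."""
    entries = {}
    k = NFIRST[row]
    while True:
        entries[IC[k]] = STO[k]
        if NX[k] == 0:      # end of this row's chain
            break
        k += NX[k]          # jump NX(k) positions down the list
    return entries


print(row_entries(2))   # the removed entry in column 7 is not reached
print(row_entries(5))   # the appended entry at position 16 is reached
print(row_entries(10))  # the new row
```

Row 5 shows that entries of one row need not be adjacent, or even in column order, once insertions have been made.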

2.2.5) Symmetric Matrix Storage
In symmetric matrices, only the upper right (or lower left) triangle,
including the diagonal entries, is stored. If the upper right triangle entries
are stored, a lower left triangle position can be accessed by interchanging
the row and column subscripts.
Example (6):
The following 4 x 4 symmetric sparse matrix has 8 nonzero entries out of
16.

      |  1  0 -1  0 |
  A = |  0  2  0 -2 |
      | -1  0  3  0 |
      |  0 -2  0  4 |

Using row-fashion of chained data structure storage method, the following
arrays can be recorded. It can be noticed that only the 6 nonzero entries of
the upper right triangle are recorded because of the symmetry.
Counter     STO  IC  NX  NFIRST
      1     1.0   1   1       1
      2    -1.0   3   0       3
      3     2.0   2   1       5
      4    -2.0   4   0       6
      5     3.0   3   0
      6     4.0   4   0
If the entry -1.0 (in row 3 and column 1) needs to be accessed, then the
corresponding entry from the upper right triangle in row 1 and column 3 is
used (i.e., the subscript a31 is changed to a13).
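The subscript interchange can be sketched as a small lookup routine over the arrays of Example (6). The function name get is hypothetical; index 0 of each list is padding so that positions match the Counter column, and every row is assumed to hold at least its diagonal entry, as is the case here.

```python
STO = [None, 1.0, -1.0, 2.0, -2.0, 3.0, 4.0]
IC = [None, 1, 3, 2, 4, 3, 4]
NX = [None, 1, 0, 1, 0, 0, 0]
NFIRST = [None, 1, 3, 5, 6]


def get(i, j):
    """Return a(i, j) from the stored upper right triangle."""
    if i > j:
        i, j = j, i         # lower triangle: interchange the subscripts
    k = NFIRST[i]
    while True:
        if IC[k] == j:
            return STO[k]
        if NX[k] == 0:      # chain exhausted, so the entry is zero
            return 0.0
        k += NX[k]


print(get(3, 1))  # read through a13
print(get(4, 2))  # read through a24
print(get(1, 2))  # not stored, hence zero
```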
