
28,2651-2679 (1989)

**A FORTRAN PROGRAM FOR PROFILE AND WAVEFRONT REDUCTION**

S. W. SLOAN

Department of Civil Engineering and Surveying, The University of Newcastle, N.S.W., 2308, Australia

SUMMARY

A FORTRAN 77 program for reducing the profile and wavefront of a sparse matrix with a symmetric structure is described. The implementation is based on an algorithm published previously by the Author and appears in response to a large number of enquiries for the source code. Extensive testing of the scheme suggests that its performance is consistently superior to that of the widely used reverse Cuthill-McKee and Gibbs-King methods. In addition to presenting a complete listing of the program, we also describe how to interface it with a typical finite element code. The scheme is especially useful in finite element analysis where it can be employed to derive efficient orderings for both profile and frontal solution schemes.

INTRODUCTION

Many problems in engineering require the solution of a set of linear equations of the form $Ax = b$, where $A$ is a sparse positive definite $N \times N$ matrix with a symmetric structure, $b$ is a prescribed vector of length $N$ and $x$ is a vector of length $N$ which is sought. Two methods for solving these types of equations, which have found wide use in finite element analysis, are the profile and frontal solution schemes. In order to compute the solution efficiently, both of these algorithms require the equations to be processed in a certain order. For the profile method to be effective it is necessary to order the equations, which in finite element analysis corresponds to labelling the nodes, so that the sum of the row bandwidths is small. More formally, we define the sum of the row bandwidths to be equal to the profile, $P$, according to

$$P = \sum_{i=1}^{N} b_i$$

where the row bandwidth, $b_i$, is the difference between $i + 1$ and the column index of the first nonzero in row $i$. In profile and frontal solution schemes the efficiency of an ordering is related to the number of equations which are active during each step of the factorization process. Formally, we define row $j$ to be active during the elimination of column $i$ if $j \ge i$ and there exists $a_{jk} \neq 0$ with $k \le i$. Thus, at the $i$th stage of the factorization, the number of active equations is simply the number of rows of the profile of $A$, ignoring the rows already eliminated, which intersect column $i$. Letting $f_i$ denote the number of equations which are active during the elimination of the variable $x_i$, it follows from the symmetric structure of $A$ that

$$P = \sum_{i=1}^{N} f_i = \sum_{i=1}^{N} b_i$$

In the finite element literature, the quantity $f_i$ is known as the current wavefront or frontwidth.
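As a small illustration of these definitions, the sketch below computes the row bandwidths, the profile and the wavefronts directly from a sparsity pattern, and checks that the sum of the wavefronts equals the profile. It is not part of the published program: the function names (`row_bandwidths`, `profile`, `wavefronts`), the 0-based indexing and the 5 x 5 example pattern are assumptions made purely for the example.

```python
def row_bandwidths(rows):
    """rows[i] holds the 0-based column indices of the nonzeros in row i
    (lower triangle, diagonal included). In 0-based terms the row
    bandwidth is i - first + 1, where first is the first nonzero column."""
    return [i - min(cols) + 1 for i, cols in enumerate(rows)]


def profile(rows):
    """Profile P: the sum of the row bandwidths."""
    return sum(row_bandwidths(rows))


def wavefronts(rows):
    """Wavefront f_i: the number of rows j >= i whose first nonzero
    lies in column i or earlier, i.e. the rows active at step i."""
    n = len(rows)
    first = [min(cols) for cols in rows]
    return [sum(1 for j in range(i, n) if first[j] <= i) for i in range(n)]


# Hypothetical 5 x 5 symmetric pattern (lower triangle only).
pattern = [{0}, {0, 1}, {1, 2}, {0, 3}, {2, 4}]

if __name__ == "__main__":
    print("row bandwidths:", row_bandwidths(pattern))
    print("profile:", profile(pattern))
    print("wavefronts:", wavefronts(pattern))
```

Each row contributes one unit to the wavefront at every step between its first nonzero column and its own diagonal, which is why the two sums coincide.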

0029-5981/89/122651-29$14.50 © 1989 by John Wiley & Sons, Ltd.

Received 23 September 1988. Revised 5 April 1989


Assuming that $N$ and the average value of $f_i$ are reasonably large, it may be shown that a complete profile or frontal factorization requires $O(NF^2)$ operations, where $F$ is the root-mean-square wavefront which is defined by

$$F = \left( \frac{1}{N} \sum_{i=1}^{N} f_i^2 \right)^{1/2}$$

When using the profile method we thus wish to minimize the profile and root-mean-square wavefront in order to minimize, respectively, the storage requirement and solution time. In finite element analysis this is achieved by a judicious labelling of the nodes. For the usual form of the frontal algorithm, however, the order of the nodes does not determine the efficiency of the solution process owing to the manner in which the assembly and elimination phases are interleaved. Rather than assembling the matrix $A$ in total before commencing the elimination procedure, the frontal scheme eliminates any equations that are fully summed immediately after they are assembled for each element in turn. Owing to this method of assembly and elimination it is the order of the elements that dictates the efficiency of a frontal solution. In practice, a good element ordering can be derived from an efficient nodal ordering by processing the elements in ascending sequence of their lowest numbered nodes.

This paper describes a FORTRAN 77 implementation of an algorithm¹ for reducing the profile and root-mean-square wavefront of a sparse matrix with a symmetric pattern of zeros. The program has been tested on a broad range of problems, including those of Everstine,² and the results suggest that it is superior to the widely used reverse Cuthill-McKee³ and Gibbs-King methods. In addition to giving a complete listing of the program, we also describe how it may be incorporated in a typical finite element code.

NOTATION

As noted by Cuthill and McKee,³ the ordering of a sparse symmetric matrix for minimum profile and wavefront is equivalent to the labelling of an undirected graph. The terminology of graph theory provides a succinct means of discussing various ordering schemes, and they are usually described in this framework. Before outlining the algorithm and its implementation, it is thus appropriate to state some basic definitions.

A graph consists of a set of members called nodes, together with a set of unordered pairs of distinct nodes called edges. A graph satisfying this definition is said to be undirected since each nodal pair defining the set of edges is unordered. Note that loops, where edges connect nodes to themselves, and multiple edges, which are pairs of nodes connected by more than one edge, are excluded. The degree of a node is the number of edges that are incident to it and each node must be labelled with an integer in the range $1, 2, \ldots, N$. Any pair of nodes that is connected by an edge is said to be adjacent, whilst a path is defined by a sequence of edges such that consecutive edges share a common node. Two nodes are said to be connected if there is a path joining them and the graph itself is connected if each pair of nodes is connected. The distance between a pair of nodes is defined as the number of edges on the shortest path connecting them, whilst the diameter of a graph is the maximum distance between all pairs of nodes. In general, a diameter may be defined by several pairs of nodes and all of these are said to be peripheral nodes. Following the accepted notation for describing ordering schemes,⁴ a pseudo-diameter is any estimate of the diameter which is found by some approximate algorithm. Nodes which define a pseudo-diameter are known as pseudo-peripheral nodes.

A concept which is central to many successful ordering algorithms is the rooted level structure.
This is a partitioning of the nodes such that each node is assigned to one of the levels $l_1, l_2, \ldots, l_{h(r)}$ in accordance with its distance from a specified root node $r$. All nodes which lie at the same distance from the root node belong to the same level. More formally, the partitioning may be written as $L(r) = \{l_1, l_2, \ldots, l_{h(r)}\}$ and is defined by

1. $l_1 = \{r\}$.
2. For $i > 1$, $l_i$ is the set of all nodes which lie at a distance of $i - 1$ from the root node $r$.

The depth of a rooted level structure, $h(r)$, is simply its total number of levels, whilst its width, $w(r)$, is the maximum number of nodes which belong to a single level. Letting $\omega_i(r)$ denote the number of nodes on level $i$, the width of the level structure is defined by

$$w(r) = \max_{1 \le i \le h(r)} \omega_i(r)$$

Clearly, the maximum distance of any node from the root node is equal to $h(r) - 1$.

Any sparse matrix with a symmetric structure may be viewed as an undirected graph, as shown in Figure 1. Each node in the graph corresponds to a row in the matrix and the degree of a node gives the number of off-diagonal non-zeros in its row. Since any pair of nodes $i$ and $j$ are connected by an edge only if $a_{ij}$ and $a_{ji}$ are both non-zero, it follows that the total number of non-zeros in the matrix is equal to $2E + N$, where $E$ is the number of edges and $N$ is the number of nodes. In finite element applications, the graph corresponding to a mesh is constructed such that all nodes which share an element are connected by an edge. (Note that for the two-dimensional grid shown in Figure 1, where each node is associated with two equations, the quantity $2E + N$ actually corresponds to the total number of non-zero $2 \times 2$ submatrices.)

[Figure 1. Graph representation of a sparse matrix with a symmetric structure. Panels: a grid of 4-noded quadrilaterals; the structure of the corresponding matrix (X = nonzero entry); the corresponding graph.]
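The definition of a rooted level structure amounts to a breadth-first sweep from the root. The following sketch is illustrative only and is not the published ROOTLS routine: the function names, the dictionary-based adjacency and the six-node example graph (two 4-noded quadrilaterals with nodes numbered 1 to 3 along one row and 4 to 6 along the other) are assumptions introduced for the example.

```python
def rooted_level_structure(adj, root):
    """Return the levels l_1, l_2, ... as lists of nodes, where level i
    holds every node at distance i - 1 from the root.
    adj maps each node to a list of its neighbours."""
    levels = []
    frontier = [root]
    seen = {root}
    while frontier:
        levels.append(frontier)
        nxt = []
        for u in frontier:
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    nxt.append(v)
        frontier = nxt
    return levels


def depth(levels):
    """h(r): the total number of levels."""
    return len(levels)


def width(levels):
    """w(r): the largest number of nodes on any single level."""
    return max(len(level) for level in levels)


# Assumed example: all nodes sharing a quadrilateral are joined by an edge.
grid = {
    1: [2, 4, 5],
    2: [1, 3, 4, 5, 6],
    3: [2, 5, 6],
    4: [1, 2, 5],
    5: [1, 2, 3, 4, 6],
    6: [2, 3, 5],
}

if __name__ == "__main__":
    L1 = rooted_level_structure(grid, 1)
    print(L1, depth(L1), width(L1))
```

For root 1 the sweep yields the levels {1}, {2, 4, 5} and {3, 6}, so the depth and the width are both three and the furthest node lies at a distance of two.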

In order to illustrate some of the foregoing definitions, we note that the graph shown in Figure 1 is defined by the set of nodes {1, 2, 3, 4, 5, 6} together with the edges {1, 2}, {1, 4}, {1, 5}, {2, 3}, {2, 4}, {2, 5}, {2, 6}, {3, 5}, {3, 6}, {4, 5}, {5, 6}. The degrees of nodes 1, 3, 4 and 6 are equal to three, whilst the degrees of nodes 2 and 5 are equal to five. We note that the sum of the nodal degrees is equal to twice the number of edges. Since all nodes in the graph are connected, the graph itself is, by definition, connected.

In order to illustrate the concept of a level structure, it is necessary to understand the notion of distance. The distance between nodes 1 and 3, for example, is two, and these nodes are peripheral since they define the diameter of the graph. Nodes 4 and 6 are also peripheral for the same reason. The level structure rooted at node 1 is shown schematically in Figure 2, where the boxes denote level numbers. More formally, this level structure may be represented by the set $L(1) = \{l_1, l_2, l_3\}$ where $l_1 = \{1\}$, $l_2 = \{2, 4, 5\}$ and $l_3 = \{3, 6\}$. The width and depth of this level structure are both equal to three.

[Figure 2. A rooted level structure]

THE ALGORITHM

The labelling algorithm is comprised of two distinct steps which we will briefly outline. A more detailed description of the scheme may be found in Sloan.¹

Selection of pseudo-peripheral nodes

Before commencing the node labelling phase we first find a pair of pseudo-peripheral nodes which lie at opposite ends of a pseudo-diameter. These serve as starting and end nodes for the node labelling and are determined as follows:

1. (First guess for starting node.) Scan all nodes in the graph and select a node $s$ with the smallest degree.
2. (Generate rooted level structure.) Form the level structure rooted at node $s$, i.e. $L(s) = \{l_1, l_2, \ldots, l_{h(s)}\}$.
3. (Sort the last level.) Sort the nodes in $l_{h(s)}$ in ascending sequence of their degrees.
4. (Shrink the last level.) Scan the sorted level $l_{h(s)}$ and form a list of nodes, $q$, containing only one node of each degree.
5. (Initialize.) Set $w(e) = \infty$.
6. (Test for termination.) For each node $i \in q$, in order of ascending degree, generate $L(i) = \{l_1, l_2, \ldots, l_{h(i)}\}$. If $h(i) > h(s)$ and $w(i) < w(e)$, set $s = i$ and go to step 3. Else, if $w(i) < w(e)$, set $e = i$ and $w(e) = w(i)$.
7. (Exit.) Exit with starting node $s$ and end node $e$.
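The selection of pseudo-peripheral nodes can be sketched compactly as follows. This is a minimal illustration under stated assumptions, not the published DIAMTR routine: the names `level_structure` and `pseudo_peripheral_pair` are hypothetical, the graph is assumed to be connected and stored as a dictionary of neighbour lists, and after a restart the level structure is simply regenerated rather than reused.

```python
import math


def level_structure(adj, root, max_width=math.inf):
    """Breadth-first levels from root. Returns None if any level has
    max_width or more nodes (the short-circuit used in step 6)."""
    levels, frontier, seen = [], [root], {root}
    while frontier:
        if len(frontier) >= max_width:
            return None
        levels.append(frontier)
        nxt = []
        for u in frontier:
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    nxt.append(v)
        frontier = nxt
    return levels


def pseudo_peripheral_pair(adj):
    """Return (s, e): starting and end nodes of a pseudo-diameter."""
    deg = {v: len(adj[v]) for v in adj}
    s = min(adj, key=lambda v: deg[v])                    # step 1
    while True:
        levels = level_structure(adj, s)                  # step 2
        h_s = len(levels)
        last = sorted(levels[-1], key=lambda v: deg[v])   # step 3
        q, seen_deg = [], set()                           # step 4
        for v in last:
            if deg[v] not in seen_deg:
                seen_deg.add(deg[v])
                q.append(v)
        e, w_e = None, math.inf                           # step 5
        restarted = False
        for i in q:                                       # step 6
            li = level_structure(adj, i, max_width=w_e)
            if li is None:
                continue          # aborted: w(i) >= w(e), cannot improve
            h_i, w_i = len(li), max(len(level) for level in li)
            if h_i > h_s and w_i < w_e:
                s, restarted = i, True
                break
            if w_i < w_e:
                e, w_e = i, w_i
        if not restarted:
            return s, e                                   # step 7
```

Each restart strictly increases the depth of the level structure rooted at the starting node, so the loop terminates; in this sketch ties in degree are broken by the iteration order of the dictionary, which is a choice of the example rather than of the algorithm.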

This algorithm is similar to that described by Sloan,¹ but the shrinking strategy in step 4 has been modified slightly following a suggestion of Scott. Computational experiments indicate that many of the nodes in $l_{h(s)}$ at step 3 have identical degrees, particularly if the size of this level is large. Removing nodes with duplicate degrees in this list, so that only one node of each degree is present, often minimizes the number of level structures that need to be generated in step 6 without affecting the quality of the resulting pseudo-peripheral nodes. This shrinking strategy performs better than the one used originally in Sloan,¹ especially for 'square' grids which may have large numbers of nodes in $l_{h(s)}$ in step 3. A slight penalty for this improved performance is that the overall algorithm requires an additional $N/2$ words of memory to relabel an arbitrary graph. As in the original algorithm, the short circuiting strategy in step 6 is retained. Imposing the condition that $w(i) < w(e)$ allows some of the level structures $L(i)$ to be aborted during their assembly, and is based on the premise that deep level structures are usually narrow.

Node labelling algorithm

After a pair of pseudo-peripheral nodes has been found using the preceding algorithm, we then generate the node labels in a single pass. To describe the algorithm succinctly, we define any node which has been assigned a new label as postactive. Nodes which are adjacent to a postactive node, but do not have a postactive status, are said to be active. Any node which is adjacent to an active node, but is not postactive or active, is said to be preactive, while each node which is not postactive, active or preactive is said to be inactive.

As the labelling progresses, the growth in the current wavefront is measured by a quantity called the current degree. Each inactive node has a current degree which is equal to its degree plus one. The current degree of a preactive node is equal to its number of preactive and inactive neighbours plus one, whilst the current degree of an active node is equal to its number of preactive neighbours. All postactive nodes have current degrees of zero. An example which illustrates the above terminology is shown in Figure 3. The current degrees of nodes $x$, $y$ and $z$ are, respectively, 1, 2 and 3. Prior to labelling, each node in the graph is inactive and hence has a current degree equal to its degree plus one. After the labelling is completed, all the nodes are postactive and have current degrees of zero.

[Figure 3. Terminology for node-labelling algorithm, showing the sets of nodes with an inactive, preactive, active and postactive status.]

To begin the node labelling procedure we require a starting node and an end node which define a pseudo-diameter. We then relabel the starting node as node one and form a list of nodes which

are eligible to be labelled next. This list is comprised of nodes which are either active or preactive and constitutes a priority queue. Each node in the queue has a priority which is determined so that a low current degree and a large distance from the end node ensures a high priority. The node with the highest priority is chosen as the next node to be labelled and then deleted from the queue. The queue is then updated using the connectivity information for the graph and the whole process is repeated until all the nodes have been labelled. The overall algorithm may be summarized as follows:

1. (Entry.) Enter with the endpoints of a pseudo-diameter, nodes $s$ and $e$.
2. (Compute distances.) Form the level structure rooted at the end node, $L(e) = \{l_1, l_2, \ldots, l_{h(e)}\}$, and compute the distance, $\delta_i$, of each node $i$ from the end node. Note that if node $i$ belongs to $l_j$, then $\delta_i = j - 1$.
3. (Assign initial status and priority.) Assign each node in the graph an inactive status and compute its initial priority, $p_i$, according to $p_i = W_1 \delta_i - W_2 (d_i + 1)$, where $W_1$ and $W_2$ are integer weights and $d_i$ is the degree of node $i$.
4. (Initialize node count and priority queue.) Set $l = 0$. Let $q$ denote a priority queue of length $n$, and let $\phi$ denote the list of new node numbers such that $\phi_i$ is the new node number for node $i$. Insert $s$ in the priority queue by setting $n = 1$ and $q_n = s$, and assign node $s$ a preactive status.
5. (Test for termination.) While the priority queue is not empty, which is signified by $n > 0$, do steps 6-9.
6. (Select node to be labelled.) Search the priority queue and locate the node $i$ which has the maximum priority according to $p_i = \max_{j \in q} \{p_j\}$. Let $m$ be the index of node $i$ such that $q_m = i$.
7. (Update queue and priorities.) Delete node $i$ from the priority queue by setting $q_m = q_n$ and decrementing $n$ according to $n \leftarrow n - 1$. If node $i$ is not preactive go to step 8. Else, examine each node $j$ which is adjacent to node $i$ and increment its priority according to $p_j \leftarrow p_j + W_2$ (this corresponds to decreasing the current degree of node $j$ by unity). If node $j$ is inactive, then insert it in the priority queue with a preactive status by setting $n \leftarrow n + 1$ and $q_n = j$.
8. (Label the next node.) Label node $i$ with its new number by incrementing the node count according to $l \leftarrow l + 1$ and setting $\phi_i = l$, where $l$ is the count of nodes that have been labelled. Assign node $i$ a postactive status.
9. (Update priorities and queue.) Examine each node $j$ which is adjacent to node $i$. If node $j$ is not preactive, take no action. Else, set $p_j = p_j + W_2$, assign node $j$ an active status, and examine each node $k$ which is adjacent to node $j$. If node $k$ is not postactive, increment its priority according to $p_k \leftarrow p_k + W_2$. If node $k$ is inactive, insert it in the priority queue with a preactive status by setting $n \leftarrow n + 1$ and $q_n = k$.
10. (Exit.) Exit with the new node labels, $\phi$.

The values assigned to the weights $W_1$ and $W_2$ in step 3 reflect the importance of distance from the end node relative to that of current degree. Empirical tests on large collections of problems suggest that these criteria should be weighted with $W_1 = 1$ and $W_2 = 2$ to obtain good orderings. The above algorithm differs from that described by Sloan¹ in that we no longer insist that all priorities are non-negative. This does not affect the labellings produced. If we set $W_1 = 0$ and $W_2 = 1$, the node labelling algorithm is similar to one of the schemes proposed by King (except that

King gave no systematic procedure for determining the starting node and employed a criterion for breaking ties). Because it uses current degree as the principal measure of priority, which is a purely local criterion, the performance of King's strategy is erratic and it may yield very poor labellings. Indeed, increasing the size of $W_2$ relative to $W_1$ places more emphasis on current degree and forces the algorithm to produce labellings which are similar to those produced by King's method. If, on the other hand, we increase the size of $W_1$ relative to $W_2$, this places more emphasis on the global criterion of distance from the end node and forces the nodes to be labelled level by level. In order to avoid excessive growth in the wavefront, which may occur if insufficient weight is given to current degree, it is generally prudent to insist that $W_2 > W_1$.

Element labelling algorithm

The schemes described in the previous sections may be used to generate efficient node or element orderings for finite element analysis. For a profile solution strategy, where the solution efficiency is dictated by the order of the nodes, the elements may be numbered in an arbitrary fashion. With a frontal algorithm, however, it is necessary to determine an efficient element ordering since the equations are assembled and eliminated for each element in turn. A good element labelling may be derived from an optimized nodal ordering by sorting the elements in ascending sequence of their lowest numbered nodes.¹ This ensures that the equations are eliminated in a similar order to that dictated by the optimized nodal ordering and has proved most effective in practice.

For grids comprised of high order elements, Sloan and Randolph⁹ have noted that it is necessary to consider the corner nodes only when relabelling the elements for a frontal solution scheme. This follows from the observation that an element ordering which is efficient for a grid of low order elements is also efficient for an equivalent grid of high order elements. Since a mesh of high order elements may have a small number of corner nodes, but a large number of nodes in total, this approach leads to considerable economies in the ordering phase.

An efficient method for determining the new element labels is as follows:

1. (Entry.) Enter with the list of nodes defining each element and the new node ordering, $\phi$, such that $\phi_i$ is the new node number for node $i$. Let the number of nodes for each element be $n$ and the list of element nodes be given by the set $\{v_1, v_2, \ldots, v_n\}$.
2. (Initialize.) Compute and store the lowest numbered node, $m_i$, for each element $i$ according to $m_i = \min_{1 \le j \le n} \{\phi(v_j)\}$. Also initialize the list of element numbers, $\varepsilon$, by setting $\varepsilon_i = i$.
3. (Sort elements.) Sort the list $\varepsilon$ in ascending sequence of its keys held in $m$.
4. (Exit.) Exit with the list of element numbers $\varepsilon$ such that $\varepsilon_i$ holds the old number for element $i$.

Note that the inverse permutation vector, which gives the new element numbers, can easily be generated from $\varepsilon$. Once the element labels have been determined, the new node labelling held in $\phi$ can be discarded. For large grids with many elements, it is desirable to use an efficient sorting algorithm in step 3. A fast quicksort routine, which is written in FORTRAN 77 and requires only slight modification for the present application, may be found in Houlsby and Sloan.¹⁰ Alternatively, if an additional $N$ storage locations are available, it is possible to execute this step using a linear bin sort. As opposed to quicksort, which requires $O(e \log_2 e)$ operations for a mesh with $e$ elements, this approach has the advantage that it takes only $O(e)$ operations.
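The node labelling steps and the element sort described above can be sketched together, since the element ordering consumes the node labels. This is an illustrative sketch rather than the published NUMBER routine: the names `sloan_number` and `element_order`, the dictionary-based graph, the 0-based element indices and the use of Python's built-in sort in place of quicksort or a bin sort are all assumptions of the example. The default weights follow the values $W_1 = 1$ and $W_2 = 2$ quoted in the text.

```python
INACTIVE, PREACTIVE, ACTIVE, POSTACTIVE = range(4)


def sloan_number(adj, s, e, w1=1, w2=2):
    """Label the component containing s; returns {old node: new label}."""
    dist = {e: 0}                       # step 2: distances from end node
    frontier = [e]
    while frontier:
        nxt = []
        for u in frontier:
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    nxt.append(v)
        frontier = nxt
    status = {v: INACTIVE for v in dist}                        # step 3
    prio = {v: w1 * dist[v] - w2 * (len(adj[v]) + 1) for v in dist}
    label, count = {}, 0                                        # step 4
    queue = [s]
    status[s] = PREACTIVE
    while queue:                                                # step 5
        # step 6: linear search of the unordered queue
        m = max(range(len(queue)), key=lambda t: prio[queue[t]])
        i = queue[m]
        queue[m] = queue[-1]            # step 7: overwrite and shrink
        queue.pop()
        if status[i] == PREACTIVE:
            for j in adj[i]:
                prio[j] += w2
                if status[j] == INACTIVE:
                    status[j] = PREACTIVE
                    queue.append(j)
        count += 1                                              # step 8
        label[i] = count
        status[i] = POSTACTIVE
        for j in adj[i]:                                        # step 9
            if status[j] != PREACTIVE:
                continue
            prio[j] += w2
            status[j] = ACTIVE
            for k in adj[j]:
                if status[k] != POSTACTIVE:
                    prio[k] += w2
                    if status[k] == INACTIVE:
                        status[k] = PREACTIVE
                        queue.append(k)
    return label


def element_order(elements, label):
    """Old element indices (0-based) sorted by lowest new node number."""
    keys = [min(label[v] for v in nodes) for nodes in elements]
    return sorted(range(len(elements)), key=lambda i: keys[i])
```

A node enters the queue only when its status changes from inactive to preactive, so each node is labelled exactly once. On a simple chain of bar elements the sketch numbers the nodes consecutively from the starting node, and the elements are then taken in the order in which their lowest numbered nodes appear.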

IMPLEMENTATION

A complete implementation of the algorithm for reducing profile and wavefront is given in Appendix II. An illustrative subroutine, which generates a node graph from typical finite element data, is given in Appendix I. All of the codes contain numerous comments in the hope that they are self-documenting. To the best of the Author's knowledge, these codes obey the syntax of standard FORTRAN 77 and ought to be portable.

Following George and Liu, the graph is stored as an adjacency list which is accessed through a pointer vector. To illustrate this scheme, the adjacency list for the graph shown in Figure 1 is stored as ADJ = (2, 4, 5, 1, 3, 4, 5, 6, 2, 5, 6, 1, 2, 5, 1, 2, 3, 4, 6, 2, 3, 5) and the corresponding pointer vector is XADJ = (1, 4, 9, 12, 15, 20, 23). If ADJ and XADJ denote the adjacency list and pointer vector, respectively, the nodes adjacent to node I are found in ADJ(J) for J = XADJ(I), XADJ(I) + 1, ..., XADJ(I + 1) - 1, whilst the degree of node I is simply XADJ(I + 1) - XADJ(I). This data structure stores each edge twice for speed of access and requires 2E + N + 1 words of memory.

The algorithm for generating a node graph from finite element data is implemented in a single subroutine named GRAPH. The node labelling program is comprised of six subroutines (LABEL, DIAMTR, ROOTLS, ISORTI, NUMBER and PROFIL) and deals with both connected and disconnected graphs. Each of the routines will be discussed in turn, to illustrate the detail of the implementation.

Subroutine GRAPH

This routine generates the node graph which corresponds to a finite element mesh. It is based on the rule that all nodes which share an element are connected by an edge and requires the node lists for each element to be input. A similar data structure, which has proved convenient in practical finite element programs, is used to store the nodal lists for each element (the routine may easily be modified to deal with other storage conventions).

The algorithm used to generate the node graph is an enhanced version of that given by Collins.¹¹ Although fast, Collins's scheme requires excessive storage if the maximum nodal degree is significantly greater than the average nodal degree and is thus unsuitable for grids of high order elements. The proposed algorithm avoids this limitation by first scanning the element node lists to compute an upper bound estimate for the degree of each node. This information is then used to allocate space in the adjacency vector ADJ which is initially filled with zeros and has a specified length of IADJ. The adjacency list ADJ is formed by looping over each element and connecting all pairs of nodes in its node list. Any pair of nodes that has already been connected by a scan of a previous element is ignored. Vacant entries in ADJ are signified by zeros which allow new entries to be inserted in their correct positions. After the node lists for all the elements have been examined, any zeros remaining in ADJ are removed by compression and the pointer vector XADJ is updated.

For a mesh comprised of NE elements, where each element is of the same type and has NEN nodes, sufficient space in ADJ is guaranteed by setting IADJ = NE * NEN * (NEN - 1). For a general mesh of elements where each element i has NEN(i) nodes, IADJ needs to be set equal to the sum over i = 1, ..., NE of NEN(i) * (NEN(i) - 1). The amount of 'elbow room' required, however, is dependent on the type of element used in the mesh. A fine grid of 3-noded triangles, for example, requires that IADJ/2E = 2, whilst for a grid of 2-noded bar elements IADJ/2E = 1. On average, IADJ needs to be about 25 per cent greater than the final length of ADJ (which is equal to 2E).

Subroutine LABEL

Once the node graph has been generated from the finite element data, this is the only routine that needs to be called by the user in order to form the list of node labels. It controls the overall

flow of the program and, excluding the 2E + N + 1 words of memory needed to hold the structure of the graph, requires 4N + 1 words of memory to label an arbitrary graph. It deals with disconnected graphs by labelling each component in turn. Following George and Liu, we flag components which have already been considered (labelled) by using a masking vector called MASK. The only nodes that are visible to DIAMTR are those which have MASK = 0. This convention allows the arrays MASK and NNN, which are both of length N, to share the same storage. Subroutine LABEL calls three subroutines: DIAMTR to select the starting and end nodes, NUMBER to label the nodes and PROFIL to compute the profiles for the old and new node labels. The new node labels are held in a permutation vector called NNN, such that NNN(I) is the new node number for node I. Note that if the profile for the original ordering is less than the profile for the new ordering, then the old node numbers are retained by setting NNN(I) = I.

Subroutine DIAMTR

This routine computes a pair of nodes which define the endpoints of a pseudo-diameter. It uses the algorithm described in the section 'Selection of pseudo-peripheral nodes' and operates on each component of the graph one at a time. After the starting and end nodes have been selected, which typically requires only two or three iterations, we use MASK to hold the distance of each node from the end node. The other output parameters are SNODE, the node at which the labelling is to be started, and NC, the total number of nodes in the same component as SNODE. Subroutine DIAMTR calls two other subroutines, ISORTI to perform an insertion sort and ROOTLS to generate a rooted level structure.

Subroutine ROOTLS

This subroutine generates a level structure rooted at the node ROOT and is a modification of the code given by George and Liu. Upon exit, the level structure is held in the list LS and pointer vector XLS. The nodes on level I are found in LS(J), where J = XLS(I), XLS(I) + 1, ..., XLS(I + 1) - 1. In all cases the length of LS is equal to NC. The length required for XLS, however, depends on the number of levels in LS. Since there are a maximum of N levels for a graph with N nodes, an upper bound on the storage required for XLS is N + 1. Note that the assembly of the level structure is aborted if any level is found to have a width which is greater than or equal to the parameter MAXWID. In this case, all output from ROOTLS should be ignored. The depth and width of the rooted level structure, assuming that it has not been aborted during assembly, are given by the output parameters DEPTH and WIDTH.

Subroutine ISORTI

This code is a straightforward implementation of insertion sort and orders an integer list in ascending sequence of its integer keys.

Subroutine NUMBER

The scheme for labelling the nodes, as described previously in the section entitled 'Node labelling algorithm', is implemented in this subroutine. A key feature of the labelling process is the list of active and preactive nodes, which is maintained as a priority queue. The nodes currently in the queue are stored in the array Q, whilst a separate list named P is used to record the priorities of all of the nodes. The status of each node during the ordering process (i.e. whether it is postactive, active, preactive or inactive) is kept in the array S. The list Q has a maximum length of NC, which has an upper bound of N, whilst the lists P and S have lengths equal to N. Since the list Q is kept unordered, locating the node with

the maximum priority requires O(n) operations for a queue of length n. Deleting, inserting or changing the priority of a node, however, is trivial. Although simple, this implementation of the priority queue has proved highly efficient, at least for small and medium sized problems involving several thousand nodes. Its success follows from the fact that the number of nodes in the queue is typically a small fraction of the total number of nodes in the graph (or component).

Another elegant method for implementing a priority queue, which is efficient for large queues, is based on a variant of a binary tree data structure called a heap.¹² A heap data structure permits an item to be deleted, inserted or have its priority changed in O(log₂ n) operations, whilst the searching step is trivial. In order to change the priority of a node in a heap efficiently, however, it is necessary to keep track of its position. Consequently, the heap implementation requires an additional array of length N. Detailed comparisons of the two methods for managing a priority queue suggest that the simple unordered list implementation is generally superior to the heap implementation. Preliminary numerical experiments suggest that the unordered list data structure is to be preferred for problems where the root-mean-square wavefront does not exceed several hundred nodes. For larger graphs, where the priority queue grows to an appreciable size, the relative performance of the heap data structure improves and it eventually becomes the method of choice.

In subroutine NUMBER, we note that the status array S is also used to hold the new node numbers, and thus shares the same storage as the permutation array NNN. This economy of storage is achieved by adopting the convention that all postactive nodes have a status which is greater than zero. Note that, in step 9 of the node labelling algorithm, it is unnecessary to update the priority of node k if it is postactive (since the node has already been assigned a new label and its priority is no longer of interest). In NUMBER, we choose to update the priority of node k regardless of its status, as this leads to cleaner code which avoids many redundant tests.

Subroutine PROFIL

This subroutine is straightforward and uses the original and new node numbers to compute the original and new profiles.

RESULTS

In order to provide direct comparisons between various algorithms, we employ the thirty test matrices collected by Everstine.² These correspond to a diverse range of finite element grids and have been widely studied. Of particular interest to the present work are the results given by Armstrong,¹³ who obtained near-optimal profiles and wavefronts for Everstine's problems by using a simulated annealing algorithm. Although it is much too slow for practical use, this scheme establishes valuable benchmarks for gauging the performance of faster but more approximate algorithms. Armstrong's results, together with those obtained from the code of Appendix II, are shown in Tables I and II. These tables also list results for the reverse Cuthill-McKee algorithm, as implemented in the FORTRAN program of George and Liu, and the Gibbs-King algorithm as described by Lewis.⁶ The code for Lewis's implementation is distributed by the Association for Computing Machinery as Algorithm 582 and is available on request.

From Table I it is clear that the reverse Cuthill-McKee algorithm gives poorer results than any of the other schemes since, on average, it yields a profile which is 31 per cent in excess of the simulated annealing profile. This performance does not compare well with that of the Sloan algorithm where, typically, the computed profile is just 10 per cent above the simulated annealing profile. The average reduction achieved by the Gibbs-King scheme is between those of the

oo 1-11 1.10 1.29 1. respectively.07 1.06 1.31 1.31 1. on another 3 occasions.25 1.29 1.12 1.13 1.51 L93 1.PROFILE AND WAVEFRONT REDUCTION 266 1 Table I. As practical applications often deal with a single type of matrix which has a particular sparsity pattern.. Pa.17 1.01 1.37 1. Moreover.21 1.02 1. which are not bettered by any of the three algorithms tested. these two schemes give profiles which are at least 50 per cent above the optimum.11 1.21 1.67 1.11 1-03 1.25 1.32 1. Pilo=profile for Sloan. is much less erratic and in the worst case its profile exceeds the simulated annealing profile by 24 per cent.24 1.26 1.15 1. = profile for reverse Cuthill McKee. P.01 I .26 1.33 2.15 1.02 1.22 1-05 1.32 1. The results for the reverse Cuthill-McKee and Gibbs King schemes are not encouraging in this regard since they show worst case profiles which are.19 1.50 1.14 1.05 1... P.16 1.13 1..28 1.24 1. on the other hand.60 1.35 1. =profile for Armstrong. it is not only the average performance of an algorithm which is of interest but also its worst performance.51 1.07 1.20 I .21 1.14 1.07 1.13 1.08 1.76 1.22 1. Excluding the results for the simulated annealing scheme.24 1-16 1-01 1-10 1.02 1.23 1.24 1.42 1. * = no improvement reverse Cuthill-McKee and Sloan schemes.10 1.1 I I .11 1.17 1.01 1.23 1.33 1.45 1. the Sloan scheme yields the lowest profile on 24 occasions whilst the Gibbs-King scheme gives the lowest profilc on 4 occasions. .02 1. at least 1 14 per cent and 76 per cent above the optimum profiles.10 1. =profile for Gibbs-King.05 1.oo 1.15 1 .20 1. The performance of the Sloan algorithm.24 1.13 1.15 1.17 1.24 1.07 1.02 1.11 1.46 1. 
Results for profile reduction on Everstine's test problems 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 59 66 72 87 162 193 198 209 221 234 245 307 310 346 361 419 492 503 512 592 607 758 869 878 918 992 100s 1007 1242 2680 464 640 244 2336 2806 7953 5817 9712 10131 1999 41 79 8132 3006 9054 5445 40145 34282 36417 6530 29397 3061 5 23871 20397 26933 109273 263298 122075 26793 11I430 590543 314 217 244' 696 1641 5505 1417 3819 2225 1539 4179* 8132* 3006* 8034 5075 8649 7067 15319 5319 1 1440 17868 8580 19293 22391 23105 38128 43068 24703 50052 105663 314 193 244* 682 1579 4609 1313 4032 2154 1349 3813 8132* 3006* 768 1 5060 8073 5513 15042 4970 10925 14760 8175 15728 19696 20498 34068 40141 22465 52952 9927 1 294 193 244* 525 1554 4618 1297 3316 2052 1089 2676 7550 2982 6726 5062 7255 3412 14436 4680 10073 14412 7598 14909 205 37 17236 33928 36396 22669 36822 89686 273 193 219 515 1272 4409 1287 2693 1848 1016 2161 6535 2940 6136 4992 6512 3304 11958 4384 9417 13065 7123 13207 17835 15949 32528 32513 19913 33098 84900 1.10 1.24 1.08 1-04 1.10 Average Notes: N =number of nodes: Po =original profile. The reverse Cuthill -McKee scheme is unable to achieve the lowest profile for any of the examples.
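As an illustration of how a profile of the kind compared in Table I can be evaluated, the following is a minimal Python sketch of the calculation that subroutine PROFIL performs (diagonal terms included). It is not part of the FORTRAN listing; the function name `profile` and the argument `new_number` are hypothetical, and the arrays are assumed to follow the XADJ/ADJ convention of the appendices, with nodes numbered from 1 and `xadj[i-1]` holding XADJ(I).

```python
def profile(xadj, adj, new_number=None):
    """Profile (sum of row bandwidths, diagonal included) of a graph.

    xadj, adj  : adjacency structure with 1-based node numbers;
                 neighbours of node i are adj[xadj[i-1]-1 : xadj[i]-1].
    new_number : optional list with new_number[i] = new label of node i
                 (index 0 unused). Identity numbering if omitted.
    """
    n = len(xadj) - 1
    if new_number is None:
        new_number = list(range(n + 1))
    total = 0
    for i in range(1, n + 1):
        lowest = new_number[i]
        for j in range(xadj[i - 1] - 1, xadj[i] - 1):
            lowest = min(lowest, new_number[adj[j]])
        # Row bandwidth: distance back to the lowest numbered neighbour,
        # plus one for the diagonal term.
        total += new_number[i] - lowest + 1
    return total


# Star graph: node 1 joined to nodes 2, 3 and 4.
xadj = [1, 4, 5, 6, 7]
adj = [2, 3, 4, 1, 1, 1]
print(profile(xadj, adj))                    # original numbering: 10
print(profile(xadj, adj, [0, 4, 1, 2, 3]))   # centre numbered last: 7
```

Unlike PROFIL, this sketch starts the search for the lowest label from the node's own label, so a node with no neighbours contributes only its diagonal term.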

The root-mean-square wavefront reductions for the various algorithms, given in Table II, exhibit similar trends to those shown in the profile results. The reverse Cuthill-McKee algorithm again performs less favourably than the Gibbs-King and Sloan schemes and, on average, yields a root-mean-square wavefront which is some 35 per cent above that of the simulated annealing algorithm. In the worst case, its root-mean-square wavefront is 124 per cent in excess of the simulated annealing root-mean-square wavefront. The Sloan scheme gives the most consistent reductions, and compared with the reverse Cuthill-McKee and Gibbs-King schemes it yields the lowest root-mean-square wavefront on 24 occasions. Its average and worst case results are respectively 11 per cent and 27 per cent above those given by simulated annealing.

Table II. Results for root-mean-square wavefront reduction on Everstine's test problems

[Table entries are not recoverable from the source text.]

Notes: N = number of nodes; F_o = original root-mean-square wavefront; F_rcm = root-mean-square wavefront for reverse Cuthill-McKee; F_gk = root-mean-square wavefront for Gibbs-King; F_s = root-mean-square wavefront for Sloan; F_a = root-mean-square wavefront for Armstrong; * = no improvement

Including the storage needed to hold the structure of the graph, the reverse Cuthill-McKee and Sloan implementations require, respectively, 2E + 4N + 2 and 2E + 5N + 2 words of integer memory to compute their orderings. As implemented by Lewis,⁶ the storage required by the Gibbs-King algorithm depends on the problem at hand. Empirical evidence suggests that its average storage requirement is roughly 2E + 7N words, but this increases to 2E + 9N + 3 for the worst case.

With regard to efficiency, detailed timing statistics indicate that the reverse Cuthill-McKee algorithm is typically 140 per cent faster than the Gibbs-King algorithm and 50 per cent faster than the Sloan algorithm. All the schemes, however, are easily fast enough for most practical applications. We note in closing that a more comprehensive set of results and timing statistics, for a larger range of test problems which include those of Everstine, may be found in Sloan and Ng.¹⁴

CONCLUSION

A FORTRAN 77 program for reducing the profile and root-mean-square wavefront of a sparse matrix has been described. The code has been tested extensively on a broad range of finite element problems and found to be superior to existing methods. It can be used to provide efficient orderings for both profile and frontal solution schemes.

APPENDIX I

      SUBROUTINE GRAPH(N,NE,INPN,NPN,XNPN,IADJ,ADJ,XADJ)
************************************************************************
*
*     PURPOSE:
*     --------
*
*     Form adjacency list for a graph corresponding to a finite element
*     mesh
*
*     INPUT:
*     ------
*
*     N      - Number of nodes in graph (finite element mesh)
*     NE     - Number of elements in finite element mesh
*     INPN   - Length of NPN = XNPN(NE+1)-1
*     NPN    - List of node numbers for each element
*     XNPN   - Index vector for NPN
*            - nodes for element I are found in NPN(J), where
*              J = XNPN(I), XNPN(I)+1, ..., XNPN(I+1)-1
*     IADJ   - Length of vector ADJ
*            - Set IADJ=NE*NEN*(NEN-1) for a mesh of a single type of
*              element with NEN nodes
*            - IADJ=(NEN(1)*(NEN(1)-1)+,.....,+NEN(NE)*(NEN(NE)-1))
*              for a mesh of elements with varying numbers of nodes
*     ADJ    - Undefined
*     XADJ   - Undefined
*
*     OUTPUT:
*     -------
*
*     N      - Unchanged
*     NE     - Unchanged
*     INPN   - Unchanged
*     NPN    - Unchanged
*     XNPN   - Unchanged
*     IADJ   - Unchanged
*     ADJ    - Adjacency list for all nodes in graph
*            - List of length 2E where E is the number of edges in
*              the graph (note that 2E = XADJ(N+1)-1 )
*     XADJ   - Index vector for ADJ
*            - Nodes adjacent to node I are found in ADJ(J), where
*              J = XADJ(I), XADJ(I)+1, ..., XADJ(I+1)-1
*            - Degree of node I given by XADJ(I+1)-XADJ(I)
*
*     NOTES:
*     ------
*
*     This routine typically requires about 25 percent elbow room for
*     assembling the ADJ list (i.e. IADJ/2E is typically around 1.25).
*     In some cases, the elbow room may be larger (IADJ/2E is slightly
*     less than 2 for the 3-noded triangle) and in other cases it may
*     be zero (IADJ/2E = 1 for bar elements)
*
*     PROGRAMMER:             Scott Sloan
*     -----------
*
*     LAST MODIFIED:          10 March 1989        Scott Sloan
*     --------------
*
************************************************************************
      INTEGER N,NE,INPN,IADJ,I,J,K,L,M,NEN1,JSTRT,JSTOP,LSTRT,LSTOP,
     +        MSTRT,MSTOP,NODEJ,NODEK
      INTEGER NPN(INPN),XNPN(NE+1),ADJ(IADJ),XADJ(N+1)
*
*     Initialise the adjacency list and its index vector
*
      DO 5 I=1,IADJ
        ADJ(I)=0
    5 CONTINUE
      DO 10 I=1,N
        XADJ(I)=0
   10 CONTINUE
*
*     Estimate the degree of each node (always an overestimate)
*
      DO 30 I=1,NE
        JSTRT=XNPN(I)
        JSTOP=XNPN(I+1)-1
        NEN1 =JSTOP-JSTRT
        DO 20 J=JSTRT,JSTOP
          NODEJ=NPN(J)
          XADJ(NODEJ)=XADJ(NODEJ)+NEN1
   20   CONTINUE
   30 CONTINUE
*
*     Reconstruct XADJ to point to start of each set of neighbours
*
      L=1
      DO 40 I=1,N
        L=L+XADJ(I)
        XADJ(I)=L-XADJ(I)
   40 CONTINUE
      XADJ(N+1)=L
*
*     Form adjacency list (which may contain zeros)
*
      DO 90 I=1,NE
        JSTRT=XNPN(I)
        JSTOP=XNPN(I+1)-1
        DO 80 J=JSTRT,JSTOP-1
          NODEJ=NPN(J)
          LSTRT=XADJ(NODEJ)
          LSTOP=XADJ(NODEJ+1)-1
          DO 70 K=J+1,JSTOP
            NODEK=NPN(K)
            DO 50 L=LSTRT,LSTOP
              IF(ADJ(L).EQ.NODEK)GOTO 70
              IF(ADJ(L).EQ.0)GOTO 55
   50       CONTINUE
            WRITE(6,1000)
            STOP
   55       CONTINUE
            ADJ(L)=NODEK
            MSTRT=XADJ(NODEK)
            MSTOP=XADJ(NODEK+1)-1
            DO 60 M=MSTRT,MSTOP
              IF(ADJ(M).EQ.0)GOTO 65
   60       CONTINUE
            WRITE(6,1000)
            STOP
   65       CONTINUE
            ADJ(M)=NODEJ
   70     CONTINUE
   80   CONTINUE
   90 CONTINUE
*
*     Strip any zeros from adjacency list
*
      K=0
      JSTRT=1
      DO 110 I=1,N
        JSTOP=XADJ(I+1)-1
        DO 100 J=JSTRT,JSTOP
          IF(ADJ(J).EQ.0)GOTO 105
          K=K+1
          ADJ(K)=ADJ(J)
  100   CONTINUE
  105   CONTINUE
        XADJ(I+1)=K+1
        JSTRT=JSTOP+1
  110 CONTINUE
*
*     Error message
*
 1000 FORMAT(//,1X,'***ERROR IN GRAPH***',
     +       //,1X,'CANNOT ASSEMBLE NODE ADJACENCY LIST',
     +       //,1X,'CHECK NPN AND XNPN ARRAYS')
      END
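The result produced by GRAPH can be illustrated with a short Python sketch. This is not a translation of the FORTRAN routine: it collects per-node neighbour lists directly instead of over-allocating ADJ and stripping zeros afterwards, and the name `build_adjacency` is hypothetical. Nodes are assumed to be numbered from 1, and the returned lists follow the XADJ/ADJ convention described in the header above, with `xadj[i-1]` holding XADJ(I).

```python
def build_adjacency(n, elements):
    """Return (xadj, adj) for a mesh with nodes 1..n.

    elements : sequence of node-number sequences, one per element.
    Every pair of nodes sharing an element is treated as adjacent.
    """
    nbrs = [[] for _ in range(n + 1)]          # nbrs[0] is unused
    for elem in elements:
        for pos, a in enumerate(elem):
            for b in elem[pos + 1:]:
                # Record each edge once, in both directions.
                if a != b and b not in nbrs[a]:
                    nbrs[a].append(b)
                    nbrs[b].append(a)
    xadj = [1]
    adj = []
    for node in range(1, n + 1):
        adj.extend(nbrs[node])
        xadj.append(len(adj) + 1)              # 1-based pointer, as in XADJ
    return xadj, adj


# Two 3-noded triangles sharing the edge 2-3.
xadj, adj = build_adjacency(4, [(1, 2, 3), (2, 3, 4)])
print(xadj)   # [1, 3, 6, 9, 11]
print(adj)    # [2, 3, 1, 3, 4, 1, 2, 4, 2, 3]
```

For this small mesh 2E = XADJ(N+1)-1 = 10, whereas the allocation rule IADJ = NE*NEN*(NEN-1) gives 12, which shows the elbow room mentioned in the notes to GRAPH.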

APPENDIX II

      SUBROUTINE LABEL(N,E2,ADJ,XADJ,NNN,IW,OLDPRO,NEWPRO)
************************************************************************
*
*     PURPOSE:
*     --------
*
*     Label a graph for small profile and rms wavefront
*
*     INPUT:
*     ------
*
*     N      - Total number of nodes in graph
*     E2     - Twice the number of edges in the graph = XADJ(N+1)-1
*     ADJ    - Adjacency list for all nodes in graph
*            - List of length 2E where E is the number of edges in
*              the graph and 2E = XADJ(N+1)-1
*     XADJ   - Index vector for ADJ
*            - Nodes adjacent to node I are found in ADJ(J), where
*              J = XADJ(I), XADJ(I)+1, ..., XADJ(I+1)-1
*            - Degree of node I given by XADJ(I+1)-XADJ(I)
*     NNN    - Undefined
*     IW     - Undefined
*     OLDPRO - Undefined
*     NEWPRO - Undefined
*
*     OUTPUT:
*     -------
*
*     N      - Unchanged
*     E2     - Unchanged
*     ADJ    - Unchanged
*     XADJ   - Unchanged
*     NNN    - List of new node numbers
*            - New number for node I given by NNN(I)
*            - If original node numbers give a smaller profile then
*              NNN is set so that NNN(I)=I for I=1,N
*     IW     - Not used
*     OLDPRO - Profile using original node numbering
*     NEWPRO - Profile for new node numbering
*            - If original profile is smaller than new profile, then
*              original node numbers are used and NEWPRO=OLDPRO
*
*     SUBROUTINES CALLED:  DIAMTR, NUMBER, PROFIL
*     -------------------
*
*     PROGRAMMER:             Scott Sloan
*     -----------
*
*     LAST MODIFIED:          10 March 1989        Scott Sloan
*     --------------
*
************************************************************************
      INTEGER N,I1,I2,I3,I,SNODE,LSTNUM,NC,OLDPRO,NEWPRO,E2
      INTEGER XADJ(N+1),ADJ(E2),NNN(N),IW(3*N+1)
*
*     Set all new node numbers = 0
*     This is used to denote all visible nodes
*
      DO 10 I=1,N
        NNN(I)=0
   10 CONTINUE
*
*     Define offsets
*
      I1=1
      I2=I1+N
      I3=I2+N+1
*
*     Loop while some nodes remain unnumbered
*
      LSTNUM=0
   20 IF(LSTNUM.LT.N)THEN
*
*       Find end points of p-diameter for nodes in this component
*       Compute distances of nodes from end node
*
        CALL DIAMTR(N,E2,ADJ,XADJ,NNN,IW(I1),IW(I2),IW(I3),
     +              SNODE,NC)
*
*       Number nodes in this component
*
        CALL NUMBER(N,NC,SNODE,LSTNUM,E2,ADJ,XADJ,NNN,IW(I1),
     +              IW(I2))
        GOTO 20
      END IF
*
*     Compute profiles for old and new node numbers
*
      CALL PROFIL(N,NNN,E2,ADJ,XADJ,OLDPRO,NEWPRO)
*
*     Use original numbering if it gives a smaller profile
*
      IF(OLDPRO.LT.NEWPRO)THEN
        DO 30 I=1,N
          NNN(I)=I
   30   CONTINUE
        NEWPRO=OLDPRO
      END IF
      END

      SUBROUTINE DIAMTR(N,E2,ADJ,XADJ,MASK,LS,XLS,HLEVEL,SNODE,NC)
************************************************************************
*
*     PURPOSE:
*     --------
*
*     Find nodes which define a pseudo-diameter of a graph and store
*     distances from end node
*
*     INPUT:
*     ------
*
*     N      - The total number of nodes in the graph
*     E2     - Twice the number of edges in the graph = XADJ(N+1)-1
*     ADJ    - Adjacency list for all nodes in the graph
*            - List of length 2E where E is the number of edges in
*              the graph and 2E = XADJ(N+1)-1
*     XADJ   - Index vector for ADJ
*            - Nodes adjacent to node I are found in ADJ(J), where
*              J = XADJ(I), XADJ(I)+1, ..., XADJ(I+1)-1
*            - Degree of node I given by XADJ(I+1)-XADJ(I)
*     MASK   - Masking vector for graph
*            - Visible nodes have MASK = 0, node invisible otherwise
*     LS     - Undefined
*     XLS    - Undefined
*     HLEVEL - Undefined
*     SNODE  - Undefined
*     NC     - Undefined
*
*     OUTPUT:
*     -------
*
*     N      - Unchanged
*     E2     - Unchanged
*     ADJ    - Unchanged
*     XADJ   - Unchanged
*     MASK   - List of distances of nodes from the end node
*     LS     - List of nodes in the component
*     XLS    - Not used
*     HLEVEL - Not used
*     SNODE  - Starting node for numbering
*     NC     - The number of nodes in this component of graph
*
*     SUBROUTINES CALLED:  ROOTLS, ISORTI
*     -------------------
*
*     NOTE:      SNODE and ENODE define a pseudo-diameter
*     -----
*
*     PROGRAMMER:             Scott Sloan
*     -----------
*
*     LAST MODIFIED:          10 March 1989        Scott Sloan
*     --------------
*
************************************************************************
      INTEGER NC,J,SNODE,DEGREE,MINDEG,ISTRT,ISTOP,HSIZE,NODE,JSTRT,
     +        JSTOP,EWIDTH,I,WIDTH,DEPTH,ENODE,N,SDEPTH,E2
      INTEGER XADJ(N+1),ADJ(E2),XLS(N+1),LS(N),MASK(N),HLEVEL(N)
*
*     Choose first guess for starting node by min degree
*     Ignore nodes that are invisible (MASK ne 0)
*
      MINDEG=N
      DO 10 I=1,N
        IF(MASK(I).EQ.0)THEN
          DEGREE=XADJ(I+1)-XADJ(I)
          IF(DEGREE.LT.MINDEG)THEN
            SNODE=I
            MINDEG=DEGREE
          END IF
        END IF
   10 CONTINUE
*
*     Generate level structure for node with min degree
*
      CALL ROOTLS(N,SNODE,N+1,E2,ADJ,XADJ,MASK,LS,XLS,SDEPTH,
     +            WIDTH)
*
*     Store number of nodes in this component
*
      NC=XLS(SDEPTH+1)-1
*
*     Iterate to find start and end nodes
*
   15 CONTINUE
*
*     Store list of nodes that are at max distance from starting node
*     Store their degrees in XLS
*
      HSIZE=0
      ISTRT=XLS(SDEPTH)
      ISTOP=XLS(SDEPTH+1)-1
      DO 20 I=ISTRT,ISTOP
        NODE=LS(I)
        HSIZE=HSIZE+1
        HLEVEL(HSIZE)=NODE
        XLS(NODE)=XADJ(NODE+1)-XADJ(NODE)
   20 CONTINUE
*
*     Sort list of nodes in ascending sequence of their degree
*     Use insertion sort algorithm
*
      IF(HSIZE.GT.1)CALL ISORTI(HSIZE,HLEVEL,N,XLS)
*
*     Remove nodes with duplicate degrees
*
      ISTOP=HSIZE
      HSIZE=1
      DEGREE=XLS(HLEVEL(1))
      DO 25 I=2,ISTOP
        NODE=HLEVEL(I)
        IF(XLS(NODE).NE.DEGREE)THEN
          DEGREE=XLS(NODE)
          HSIZE=HSIZE+1
          HLEVEL(HSIZE)=NODE
        ENDIF
   25 CONTINUE
*
*     Loop over nodes in shrunken level
*
      EWIDTH=NC+1
      DO 30 I=1,HSIZE
        NODE=HLEVEL(I)
*
*       Form rooted level structures for each node in shrunken level
*
        CALL ROOTLS(N,NODE,EWIDTH,E2,ADJ,XADJ,MASK,LS,XLS,DEPTH,
     +              WIDTH)
        IF(WIDTH.LT.EWIDTH)THEN
*
*         Level structure was not aborted during assembly
*
          IF(DEPTH.GT.SDEPTH)THEN
*
*           Level structure of greater depth found
*           Store new starting node, new max depth, and begin
*           a new iteration
*
            SNODE=NODE
            SDEPTH=DEPTH
            GOTO 15
          ENDIF
*
*         Level structure width for this end node is smallest so far
*         store end node and new min width
*
          ENODE=NODE
          EWIDTH=WIDTH
        END IF
   30 CONTINUE
*
*     Generate level structure rooted at end node if necessary
*
      IF(NODE.NE.ENODE)THEN
        CALL ROOTLS(N,ENODE,NC+1,E2,ADJ,XADJ,MASK,LS,XLS,DEPTH,
     +              WIDTH)
      ENDIF
*
*     Store distances of each node from end node
*
      DO 50 I=1,DEPTH
        JSTRT=XLS(I)
        JSTOP=XLS(I+1)-1
        DO 40 J=JSTRT,JSTOP
          MASK(LS(J))=I-1
   40   CONTINUE
   50 CONTINUE
      END

      SUBROUTINE ROOTLS(N,ROOT,MAXWID,E2,ADJ,XADJ,MASK,LS,XLS,DEPTH,
     +                  WIDTH)
************************************************************************
*
*     PURPOSE:
*     --------
*
*     Generate rooted level structure using a FORTRAN 77 implementation
*     of the algorithm given by George and Liu
*
*     INPUT:
*     ------
*
*     N      - Number of nodes
*     ROOT   - Root node for level structure
*     MAXWID - Max permissible width of rooted level structure
*            - Abort assembly of level structure if width is ge MAXWID
*            - Assembly ensured by setting MAXWID = N+1
*     E2     - Twice the number of edges in the graph = XADJ(N+1)-1
*     ADJ    - Adjacency list for all nodes in graph
*            - List of length 2E where E is the number of edges in
*              the graph and 2E = XADJ(N+1)-1
*     XADJ   - Index vector for ADJ
*            - Nodes adjacent to node I are found in ADJ(J), where
*              J = XADJ(I), XADJ(I)+1, ..., XADJ(I+1)-1
*            - Degree of node I is XADJ(I+1)-XADJ(I)
*     MASK   - Masking vector for graph
*            - Visible nodes have MASK = 0
*     LS     - Undefined
*     XLS    - Undefined
*     DEPTH  - Undefined
*     WIDTH  - Undefined
*
*     OUTPUT:
*     -------
*
*     N      - Unchanged
*     ROOT   - Unchanged
*     MAXWID - Unchanged
*     E2     - Unchanged
*     ADJ    - Unchanged
*     XADJ   - Unchanged
*     MASK   - Unchanged
*     LS     - List containing a rooted level structure
*            - List of length NC
*     XLS    - Index vector for LS
*            - Nodes in level I are found in LS(J), where
*              J = XLS(I), XLS(I)+1, ..., XLS(I+1)-1
*            - List of max length NC+1
*     DEPTH  - Number of levels in rooted level structure
*     WIDTH  - Width of rooted level structure
*
*     NOTE:      If WIDTH ge MAXWID then assembly has been aborted
*     -----
*
*     PROGRAMMER:             Scott Sloan
*     -----------
*
*     LAST MODIFIED:          10 March 1989        Scott Sloan
*     --------------
*
************************************************************************
      INTEGER ROOT,DEPTH,NBR,MAXWID,LSTRT,LSTOP,LWDTH,NODE,NC,WIDTH,N,
     +        JSTRT,JSTOP,I,J,E2
      INTEGER XADJ(N+1),ADJ(E2),MASK(N),XLS(N+1),LS(N)
*
*     Initialisation
*
      MASK(ROOT)=1
      LS(1)=ROOT
      NC   =1
      WIDTH=1
      DEPTH=0
      LSTOP=0
      LWDTH=1
   10 IF(LWDTH.GT.0)THEN
*
*       LWDTH is the width of the current level
*       LSTRT points to start of current level
*       LSTOP points to end of current level
*       NC counts the nodes in component
*
        LSTRT=LSTOP+1
        LSTOP=NC
        DEPTH=DEPTH+1
        XLS(DEPTH)=LSTRT
*
*       Generate next level by finding all visible neighbours
*       of nodes in current level
*
        DO 30 I=LSTRT,LSTOP
          NODE=LS(I)
          JSTRT=XADJ(NODE)
          JSTOP=XADJ(NODE+1)-1
          DO 20 J=JSTRT,JSTOP
            NBR=ADJ(J)
            IF(MASK(NBR).EQ.0)THEN
              NC=NC+1
              LS(NC)=NBR
              MASK(NBR)=1
            END IF
   20     CONTINUE
   30   CONTINUE
*
*       Compute width of level just assembled and the width of the
*       level structure so far
*
        LWDTH=NC-LSTOP
        WIDTH=MAX(LWDTH,WIDTH)
*
*       Abort assembly if level structure is too wide
*
        IF(WIDTH.GE.MAXWID)GOTO 35
        GOTO 10
      END IF
      XLS(DEPTH+1)=LSTOP+1
*
*     Reset MASK=0 for nodes in the level structure
*
   35 CONTINUE
      DO 40 I=1,NC
        MASK(LS(I))=0
   40 CONTINUE
      END

      SUBROUTINE ISORTI(NL,LIST,NK,KEY)
************************************************************************
*
*     PURPOSE:
*     --------
*
*     Order a list of integers in ascending sequence of their keys
*     using insertion sort
*
*     INPUT:
*     ------
*
*     NL   - Length of LIST
*     LIST - A list of integers
*     NK   - Length of KEY (NK must be ge NL)
*     KEY  - A list of integer keys
*
*     OUTPUT:
*     -------
*
*     NL   - Unchanged
*     LIST - A list of integers sorted in ascending sequence of KEY
*     NK   - Unchanged
*     KEY  - Unchanged
*
*     NOTE:      Efficient for short lists only (NL lt 20)
*     -----
*
*     PROGRAMMER:             Scott Sloan
*     -----------
*
*     LAST MODIFIED:          10 March 1989        Scott Sloan
*     --------------
*
************************************************************************
      INTEGER NL,NK,I,J,T,VALUE
      INTEGER LIST(NL),KEY(NK)
      DO 20 I=2,NL
        T=LIST(I)
        VALUE=KEY(T)
        DO 10 J=I-1,1,-1
          IF(VALUE.GE.KEY(LIST(J)))THEN
            LIST(J+1)=T
            GOTO 20
          ENDIF
          LIST(J+1)=LIST(J)
   10   CONTINUE
        LIST(1)=T
   20 CONTINUE
      END

      SUBROUTINE NUMBER(N,NC,SNODE,LSTNUM,E2,ADJ,XADJ,S,Q,P)
************************************************************************
*
*     PURPOSE:
*     --------
*
*     Number nodes in component of graph for small profile and rms
*     wavefront
*
*     INPUT:
*     ------
*
*     N      - Number of nodes in graph
*     NC     - Number of nodes in component of graph
*     SNODE  - Node at which numbering starts
*     LSTNUM - Count of nodes which have already been numbered
*     E2     - Twice the number of edges in the graph = XADJ(N+1)-1
*     ADJ    - Adjacency list for all nodes in graph
*            - List of length 2E where E is the number of edges in
*              the graph and 2E = XADJ(N+1)-1
*     XADJ   - Index vector for ADJ
*            - Nodes adjacent to node I are found in ADJ(J), where
*              J = XADJ(I), XADJ(I)+1, ..., XADJ(I+1)-1
*     S      - List giving the distance of each node in this
*              component from the end node
*     Q      - List of nodes which are in this component
*            - Also used to store queue of active or preactive nodes
*     P      - Undefined
*
*     OUTPUT:
*     -------
*
*     N      - Unchanged
*     NC     - Unchanged
*     SNODE  - Unchanged
*     LSTNUM - Count of numbered nodes (input value incremented by NC)
*     E2     - Unchanged
*     ADJ    - Unchanged
*     XADJ   - Unchanged
*     S      - List of new node numbers
*            - New number for node I is S(I)
*     Q      - Not used
*     P      - Not used
*
*     NOTES:
*     ------
*
*     S also serves as a list giving the status of the nodes
*     during the numbering process:
*     S(I) gt 0 indicates node I is postactive
*     S(I) =  0 indicates node I is active
*     S(I) = -1 indicates node I is preactive
*     S(I) = -2 indicates node I is inactive
*     P is used to hold the priorities for each node
*
*     PROGRAMMER:             Scott Sloan
*     -----------
*
*     LAST MODIFIED:          10 March 1989        Scott Sloan
*     --------------
*
************************************************************************
      INTEGER NC,LSTNUM,JSTRT,JSTOP,ISTOP,NBR,NABOR,I,J,NEXT,ADDRES,NN,
     +        NODE,SNODE,ISTRT,MAXPRT,PRTY,N,W1,W2,E2
      INTEGER Q(NC),XADJ(N+1),ADJ(E2),P(N),S(N)
*
      PARAMETER(W1=1,
     +          W2=2)
*
*     Initialise priorities and status for each node in this component
*     Initial priority = W1*DIST - W2*DEGREE     where:
*     W1     = a positive weight
*     W2     = a positive weight
*     DEGREE = initial current degree for node
*     DIST   = distance of node from end node
*
      DO 10 I=1,NC
        NODE=Q(I)
        P(NODE)=W1*S(NODE)-W2*(XADJ(NODE+1)-XADJ(NODE)+1)
        S(NODE)=-2
   10 CONTINUE
*
*     Insert starting node in queue and assign it a preactive status
*     NN is the size of queue
*
      NN=1
      Q(NN)=SNODE
      S(SNODE)=-1
*
*     Loop while queue is not empty
*
   30 IF(NN.GT.0)THEN
*
*       Scan queue for node with max priority
*
        ADDRES=1
        MAXPRT=P(Q(1))
        DO 35 I=2,NN
          PRTY=P(Q(I))
          IF(PRTY.GT.MAXPRT)THEN
            ADDRES=I
            MAXPRT=PRTY
          END IF
   35   CONTINUE
*
*       NEXT is the node to be numbered next
*
        NEXT=Q(ADDRES)
*
*       Delete node NEXT from queue
*
        Q(ADDRES)=Q(NN)
        NN=NN-1
        ISTRT=XADJ(NEXT)
        ISTOP=XADJ(NEXT+1)-1
        IF(S(NEXT).EQ.-1)THEN
*
*         Node NEXT is preactive, examine its neighbours
*
          DO 50 I=ISTRT,ISTOP
*
*           Decrease current degree of neighbour by 1
*
            NBR=ADJ(I)
            P(NBR)=P(NBR)+W2
*
*           Add neighbour to queue if it is inactive
*           assign it a preactive status
*
            IF(S(NBR).EQ.-2)THEN
              NN=NN+1
              Q(NN)=NBR
              S(NBR)=-1
            END IF
   50     CONTINUE
        END IF
*
*       Store new node number for node NEXT
*       Status for node NEXT is now postactive
*
        LSTNUM=LSTNUM+1
        S(NEXT)=LSTNUM
*
*       Search for preactive neighbours of node NEXT
*
        DO 80 I=ISTRT,ISTOP
          NBR=ADJ(I)
          IF(S(NBR).EQ.-1)THEN
*
*           Decrease current degree of preactive neighbour by 1
*           assign neighbour an active status
*
            P(NBR)=P(NBR)+W2
            S(NBR)=0
*
*           Loop over nodes adjacent to preactive neighbour
*
            JSTRT=XADJ(NBR)
            JSTOP=XADJ(NBR+1)-1
            DO 60 J=JSTRT,JSTOP
              NABOR=ADJ(J)
*
*             Decrease current degree of adjacent node by 1
*
              P(NABOR)=P(NABOR)+W2
              IF(S(NABOR).EQ.-2)THEN
*
*               Insert inactive node in queue with a preactive status
*
                NN=NN+1
                Q(NN)=NABOR
                S(NABOR)=-1
              END IF
   60       CONTINUE
          END IF
   80   CONTINUE
        GOTO 30
      END IF
      END

      SUBROUTINE PROFIL(N,NNN,E2,ADJ,XADJ,OLDPRO,NEWPRO)
************************************************************************
*
*     PURPOSE:
*     --------
*
*     Compute the profiles using both original and new node numbers
*
*     INPUT:
*     ------
*
*     N      - Total number of nodes in graph
*     NNN    - List of new node numbers for graph
*            - New node number for node I is given by NNN(I)
*     E2     - Twice the number of edges in the graph = XADJ(N+1)-1
*     ADJ    - Adjacency list for all nodes in graph
*            - List of length 2E where E is the number of edges in
*              the graph and 2E = XADJ(N+1)-1
*     XADJ   - Index vector for ADJ
*            - Nodes adjacent to node I are found in ADJ(J), where
*              J = XADJ(I), XADJ(I)+1, ..., XADJ(I+1)-1
*            - Degree of node I given by XADJ(I+1)-XADJ(I)
*     OLDPRO - Undefined
*     NEWPRO - Undefined
*
*     OUTPUT:
*     -------
*
*     N      - Unchanged
*     NNN    - Unchanged
*     E2     - Unchanged
*     ADJ    - Unchanged
*     XADJ   - Unchanged
*     OLDPRO - Profile with original node numbering
*     NEWPRO - Profile with new node numbering
*
*     NOTE:      Profiles include diagonal terms
*     -----
*
*     PROGRAMMER:             Scott Sloan
*     -----------
*
*     LAST MODIFIED:          10 March 1989        Scott Sloan
*     --------------
*
************************************************************************
      INTEGER NEWPRO,I,J,N,JSTRT,JSTOP,OLDPRO,NEWMIN,OLDMIN,E2
      INTEGER NNN(N),XADJ(N+1),ADJ(E2)
*
*     Set profiles and loop over each node in graph
*
      OLDPRO=0
      NEWPRO=0
      DO 20 I=1,N
        JSTRT=XADJ(I)
        JSTOP=XADJ(I+1)-1
        OLDMIN=ADJ(JSTRT)
        NEWMIN=NNN(ADJ(JSTRT))
*
*       Find lowest numbered neighbour of node I
*       (using both old and new node numbers)
*
        DO 10 J=JSTRT+1,JSTOP
          OLDMIN=MIN(OLDMIN,ADJ(J))
          NEWMIN=MIN(NEWMIN,NNN(ADJ(J)))
   10   CONTINUE
*
*       Update profiles
*
        OLDPRO=OLDPRO+DIM(I,OLDMIN)
        NEWPRO=NEWPRO+DIM(NNN(I),NEWMIN)
   20 CONTINUE
*
*     Add diagonal terms to profiles
*
      OLDPRO=OLDPRO+N
      NEWPRO=NEWPRO+N
      END

REFERENCES

1. E. Cuthill and J. McKee, 'Reducing the bandwidth of sparse symmetric matrices', Proc. ACM National Conference, Association for Computing Machinery, New York, 1969.
2. G. C. Everstine, 'A comparison of three resequencing algorithms for the reduction of matrix profile and wavefront', Int. j. numer. methods eng., 14, 837-853 (1979).
3. I. P. King, 'An automatic reordering scheme for simultaneous equations derived from network systems', Int. j. numer. methods eng., 2, 523-533 (1970).
4. A. George and J. W. Liu, Computer Solution of Large Sparse Positive Definite Systems, Prentice-Hall, Englewood Cliffs, N.J., 1981.
5. N. E. Gibbs, 'A hybrid profile reduction algorithm', ACM Trans. Math. Software, 2, 378-387 (1976).
6. J. G. Lewis, 'Implementation of the Gibbs-Poole-Stockmeyer and Gibbs-King algorithms', ACM Trans. Math. Software, 8, 180-189 (1982).
7. S. W. Sloan, 'An algorithm for profile and wavefront reduction of sparse matrices', Int. j. numer. methods eng., 23, 239-251 (1986).
8. J. A. Scott, Private communication from Harwell, 1988.
9. S. W. Sloan and M. F. Randolph, 'Automatic element reordering for finite element analysis with frontal solution schemes', Int. j. numer. methods eng., 19, 1153-1181 (1983).
10. R. J. Collins, 'Bandwidth reduction by automatic renumbering', Int. j. numer. methods eng., 6, 345-356 (1973).
11. G. T. Houlsby and S. W. Sloan, 'Efficient sorting routines in FORTRAN 77', Adv. Eng. Software, 6, 198-203 (1984).
12. R. Sedgewick, Algorithms, Addison-Wesley, New York, 1983.
13. B. A. Armstrong, 'Near-minimal matrix profiles and wavefronts for testing nodal resequencing algorithms', Int. j. numer. methods eng., 21, 1785-1790 (1985).
14. S. W. Sloan and W. S. Ng, 'A direct comparison of three algorithms for reducing profile and wavefront' (Civil Engineering Report No. 026.07.1988, University of Newcastle). To appear in Comp. Struct.
