This action might not be possible to undo. Are you sure you want to continue?

# ri)URh.

kL

CF CC!.IFLTATION.AL

PHYSICS

87. 231-236

(

19901

**Note An Algorithm to Fin between Two Nodes in a Gra
**

The problem of finding paths connecting two nodes in a given graph is of great interest for several applications in different fields; Whenever complete information on all paths (e.g., total number of pahts, length, and cost) is needed. a heuristicsearch is useless. If the size of a given application is manageable in terms of available CPU time and,/or memory limits, an exhaustive search (that is a COL‘R-~ prehensive analysis) is still the only way to solve this problem [I 1. The algorithm presented in this note has been successfully used to analyze data from TvIetropolis-Monte Carlo [2a] and molecular dynamics [Zb] compute: simuiations of 125 water molecules interacting through MCY [3] or ST2 [4< potentials. However, it could be easily used virtually without modifications 1~ an:. case that requires an exhaustive search of an undirecred graph, A schematic flow chart of the algorithm is presented in Fig. i. The graph ;Z be examined. G, the start, and end nodes are given as input data (block I ). Ths undirected graph C includes i%’nodes and it is represented by an adjacency matrix in which the irh row lists the nodes adjacent to node i. The order in. which nodes are explored is unessential for our purposes. This problem is ideally suited for a recursive algorithm, owever, sir,ce we wanI to use languages that do not support recursion (e.g., FORTWJd. OCCAM2), WE decided to implement backtracking as an alternative approach. The progress consists of an exhaustive depth-first graph search fer al/ solrrtions. A stack is used, where the program stores/retrieves the appropriale context, 1.0 go one step forward/backward in the graph, by updating a stack index. The program starts pointing IO the adjacency list for the start-node (block 2 ), selects an adjacent r,ode Ipot vet explored (blocks 4-6), tests for solution (i.e.. :he end--node), a dead end, o!‘ a cycle (blocks 7 and 9): and stores intermediate results (block Iill. This process is accomplished by pushing the working adjacency matrix in the srack and modifying by zeroing the adjacency just explored. At this poict the program is ready ro foIio:t, a pattern in the graph structure by successive exploration of one of the adjacens rtodes. i (block 12): until a solution, a dead-end or a cycle is found. When a solution is found the node list is printed (block 8 ) and the program goes OG: step back (block 10) to repeat the cycle for all the previous adjacenciesnot yet explcred. The back travel in the proper way along the graph is granted by retrieving the appropriate adjacency matrix from the stack. Memory requiremenrs are of rhe order of N x K x L. where N is the number of nodes. K is the maximum numhes or

AND SCIORTINO FIG.232 MIGLIORE. MARTORANA. 1. Schematic flowchart of the algorithm. .

all the secondary paths to a node already in the list generated during the search process. As one can see. 2 with its adjacency matrix. 3 are schematically shown the steps followed by the program. 1&6-332-l. 2).ALGORITHM FOR EXHAUSTWE SEARCH FIG. The program exits when the last adjacency of the last adjacent node to the start-node is solved (node 5 fro-m node 3 in the example of Fig. 3. Cycles (e.g. FIG. As an illustrative example. and suppose we want to find ail the pathways connecting node 1 with node 5. Exampleof a simple graph and the corresponding adjaceccy mauix. and L is the maximum depth to be reached in the exploration of 6. by the program Dashed lines represent aljacencies . Indeed. are identified and ignored.. let us consider the simple graph shown in Fig.. adjacencies for each node. an Fig. 2. Diagram of the search process followed that generate cycies. In Fig.. ali branchings of a given node are systematically explored each time selecting the deepest node in the graph. 3) are avoided since the program maintains an updated list of the nodes already visikd along the path.

‘48 1’0 MHz-T800 (FORTRAN) 120 MHz-T800 (OCCAMZ) 125 (water mol.4 s 2040 s 95 s on and IVote. In this work we use the present algorithm to analyze the pathways between two. In Table I we report CPU time values for a VAX 11/750.-.234 MIGLIORE. The adjacency matrix was constructed assuming that two water molecules are “bonded” if their interaction energy is less than .) 2. Schematic representation of the S-TSOO Transputers system.!750 with floating point accelerator. TABLE I .1 8 22 2114 306920 3360 s 6. CRAY X-MP/48. in terms of their length. .!48 l-T800 Transputer (FORTRAN and OCCAMZ). \ \ ‘_ -r _d /’ . and the underlying H-bond network is intricate enough to present virtually all kinds of possible branchings.12kJ. 4. At this energy threshold. and data referring to the specific snapshot selected . water molecules belonging to a Monte Carlo configuration. and topology. the system is well above the percolation limit. chosen only for demonstration purposes. I / FIG. number. Execution times are of the program running VAX 11. MARTORANA.---. In our computer simulations on aqueous systems [S] we are interested in characterizing the H-bond pathways between two distant water molecules. I L. therefore most likely a connectivity pathway exists between any given couple of water molecule. arbitrarily chosen. CRAY X-MP. a Transputer-based system [6].4ND SCIORTfNO Characteristic Parameters of the Graph Generated from a Monte Carlo Configuration Number of nodes Average H-bond/water Start node End node Number of paths found Number of steps VAX 1 I/750 with FPU CRAY X-MP.

J.ct. it runs in 95 s: that is only ‘i5 ~~IXFS slower than on CRAY (which is more than 1000 times more expensive). Berm (Pfenum. The btg difference in performance between the FORTRAN and 8CCAM2 impiementatmns for Transputers is essentially due to the smart matrix assignment instructi. CRRNSM. S. 3. p. H. Inform. P/!~. New York.RAHhiAN. KUSHKK AND B. 1977). 581!87.RCH .oi OCCAMZ. ACKNOWLEDGMENTS We *ish to than-k Professor S. YOSHIMINE. CLE~ENTI. L.J. Fin. “Molecular Dynamics Merhods: Continuous Potentials. J. The same FORTRAN program runs on a system which includes one 20MHz-T800 Transputer [7]. S&t. the program on CRAY runs over 500 times faster than on VAX without modifications or restructuring. 3. J Chrni. 60. and it grows iinearly with the number of processors. Fornili for many useful discttissions. edited by B. P/fyS. 53. The present work ha. 1545 j 197’1.1-16 . I.> method [9]. In f<t. i = 103 (we are interested m pathways up to 100 molecules iongi. F. ia) J.cns . Mr. M. 1983 ). 41. p. VALLEAU AND S. REFERENCES 1.i:triples qf Airtjiicial bi:rilfgmce (riopa. 1. Pappalardo for technical assistance. For a general treatment of graphs search strategies see: (a) “z RICH. since it is automa. Parallelo funding. 5. s 5 from our simulation. Palo Alto. As one can see from Table I.P.s been carried or_2 at IAIF-CNR and supported also by local MPf 60%. p. and Mr. i37. where each processor analyzes a different configuration. 2. . “-4 Guide to Monte Carlo for Statistical Mechamcs.rt$icial Ifiiiei!igmrr (McGraw-Hill. BERNE. G. running ihe betiveeri same program on independent sets of data. P. is the so-called procesor _!ir~. E. (0) J. 6. in almost half of the VAX time. CA. New York. 64. 0. K = 5 (maximum number of H-bonds per water). Vol. Vol. G. 1351 tl976!.T. and the information exchange through the communication channels would degrade severely the overal: system performance Another way to take advantage of parallelism._&pendent on the particular network topology. Mr. and Prog. edited by B. 197i). Lapis.ALGORITHM FOR EXHAUSTIVE SEP.A. Thus no communications processors occur other than those needed to send the pathways list to the masse: processor. Highways. The overali speed of the farm is essentially i.Woderrl i”/mwe!icai Chemisrry. Berm (Plenum. When implemented In OCCAM2 [Xl? thai Is rhe native high-level ianguage for Transputers. New York. MATSL~KA. CkWi. Bn the present case we used a ST800 Transputer array configured as schematically shown in Fig. Wtr-rtrxxox. In this case we have N= 125 {the number of water molecuies in the simulated sample). P. STILLIVGER AND . AND M. e CA.ticalh:J recrorizPd directly by the CRAY FORTRAN compiler. La Gatrir:e.” Modem Theorrticnl Chemistry. 4. 55: (b) N. 4. NILSSON. 1980). J. this kind of graph search cannot be efficiently ~rn~~ernc~ted in parallel processes: since information on all nodes and cycles already explored have to be shared in real rime among all concurrent processes.“ . The analysis of a single configuration cannot be efficientiy paralieiized.

MIGLIORF v. MARTORANA.-Institu[e . University of Palermo via . p. London. Trmspurer Developnzent S~~sfum (Prentice-Hall.4ND SCIORTINO 5. Phys. i-viARTORAN. L. SCIORTINO. MIGLIORE.4rchiraJ. AND S. L. Bristol.4M Progrunzmirig (INMOS. . 309 (1988 j. V. POUNTAIN. Italy F. 1. FORNILI. F. 1. 9.I. 35. BRUCE'. L. R. SCIORTINO Deparlment qf Physics. I-90123 Palermo. Phyx C~mtnun. 90. R. Bristol. Technical Note No. F. 2786 (1989). 6 (INMOS. Chem. 1987). 1988). Simd. Simul. RECEIVED: December 16. REVISED: March 8. 1986). F. bfof. N. INMOS. D. 129. 41. Conzpur. FORNILI. 8. JESSHOPE.for Interdisciplinar~~ Applications of Physics via Archirafi 36. C. FORNILI. 1989 M. 125 (19881. 36.4 Tutorial Inrroduction to OCC. I-90123 Palermo.4 C. 1988. SCIORTINO AND S. M. 363 (1986).236 MIGLIORE. Italy . p. 7. INMOS. . 6. MARTORANA. . R. Mol. NOTO. AND S.