
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 5, NO. 9, SEPTEMBER 1994, p. 995

**Efficient EREW PRAM Algorithms for Parentheses-Matching**

Sushil K. Prasad, Member, IEEE, Sajal K. Das, Member, IEEE, and Calvin C.-Y. Chen

Abstract—We present four polylog-time parallel algorithms for matching parentheses on an exclusive-read and exclusive-write (EREW) parallel random-access machine (PRAM) model. These algorithms provide new insights into the parentheses-matching problem. The first algorithm has a time complexity of O(log² n) employing O(n/log n) processors for an input string containing n parentheses. Although this algorithm is not cost-optimal, it is extremely simple to implement. The remaining three algorithms, which are based on a different approach, achieve O(log n) time complexity in each case, and represent successive improvements. The second algorithm requires O(n) processors and working space, and it is comparable to the first algorithm in its ease of implementation. The third algorithm uses O(n/log n) processors and O(n log n) space. Thus, it is cost-optimal, but uses extra space compared to the standard stack-based sequential algorithm. The last algorithm reduces the space complexity to O(n) while maintaining the same processor and time complexities. Compared to other existing time-optimal algorithms for the parentheses-matching problem that either employ extensive pipelining or use linked lists and comparable data structures, and employ sorting or a linked-list-ranking algorithm as a subroutine, our last two algorithms have two distinct advantages. First, these algorithms employ arrays as their basic data structures, and second, they do not use any pipelining, sorting, or linked-list-ranking algorithms.

Index Terms—EREW PRAM, optimal algorithms, parallel algorithm, parentheses-matching, parsing

I. INTRODUCTION

THE parentheses-matching problem is to determine the mate of each parenthesis in a balanced string of parentheses. A string of n parentheses can be matched sequentially in O(n) time using a stack by scanning the input string once. Since parentheses-matching is an integral subproblem of parsing and evaluating expressions by computation-tree generation [3], [4], [16], [20], several parallel algorithms have been proposed on the parallel random-access machine (PRAM) model [31], [33], and some on distributed memory models, such as mesh and shuffle-exchange [32], and hypercube networks [29]. Interestingly, none of these algorithms makes explicit use of a stack. For additional literature related to this problem, readers may refer to an annotated bibliography on parallel parsing [1] and an in-depth survey of parallel parentheses-matching algorithms [28].

The PRAM model consists of several processors and a global shared memory, in which each processor can access a shared memory cell in constant time. Arithmetic and logical operations can also be performed in unit time, and there is a global clock. All synchronization and communication take place via the shared memory. Among the different variants of the PRAM family, we consider the weakest, albeit the most feasible, model, namely the exclusive-read and exclusive-write (EREW) variant, in which no two processors can simultaneously read from or write into a memory cell. In the concurrent-read and exclusive-write (CREW) model, simultaneous reading from a memory cell by more than one processor is allowed, but not simultaneous writing. The concurrent-read and concurrent-write (CRCW) model allows both simultaneous reading and simultaneous writing.

Although the sequential algorithm for parentheses-matching is straightforward and optimal, designing a parallel algorithm is nontrivial, which has given rise to a number of fast algorithms for the CREW and EREW PRAMs, several of which are cost-optimal. In this paper, we present four EREW algorithms for parentheses-matching that provide new insights into this interesting problem and that are easier to implement than the competing algorithms. They are presented in the order of successive improvements in their resource requirements (i.e., time, processors, and working space).

Algorithm I is based on a partitioning strategy, exploiting a simple characterization of a balanced string of parentheses. Compared to the existing parallel algorithms for parentheses-matching, Algorithm I is the simplest, though costwise it is nonoptimal. This algorithm segregates parentheses by nesting levels, placing odd-level parentheses to the left of even-level parentheses, and packing the result. Each such partitioning requires O(log n) time, and log n − 1 iterations are sufficient to bring the matching pairs of parentheses adjacent to each other. Thus, the time required is O(log² n), and the space used is O(n). A preliminary version of this result is available in [14]. Algorithm I is the content of Section II.

Section III describes Algorithm II as our first example of a match/copy EREW algorithm, implementing the central idea in its simplest form. Although not cost-optimal, this algorithm requires O(log n) time employing O(n) processors, and it serves as a helpful introduction to the more involved cost-optimal EREW algorithms described later.

Manuscript received April 30, 1992; revised June 25, 1993. This work was supported in part by a Texas Advanced Research Program Grant under Award 003594003, and in part, during 1992 and 1993, by Georgia State University under Research Initiation/Enhancement grants.
S. K. Prasad is with the Department of Mathematics and Computer Science, Georgia State University, Atlanta, GA 30303 USA; e-mail: matskp@sunshine.cs.gsu.edu.
S. K. Das is with the Department of Computer Science, University of North Texas, Denton, TX 76203 USA; e-mail: das@ponder.csci.unt.edu.
C. C.-Y. Chen is with the Department of Computer and Information Engineering, Tamkang University, Tamsui, Taipei, Taiwan, Republic of China.
IEEE Log Number 9401214.
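The stack-based sequential algorithm mentioned above is the baseline the paper measures cost-optimality against. As a reference point, it can be sketched as follows (our illustration, not the paper's code; 0-based indices):

```python
def match_parens(s):
    """Match a balanced parenthesis string in O(n) time with a stack.

    Returns MATCH as a 0-indexed list: MATCH[i] = j means s[i] and s[j]
    are mates.  Raises ValueError if the string is unbalanced.
    """
    match = [None] * len(s)
    stack = []
    for i, c in enumerate(s):
        if c == '(':
            stack.append(i)                 # remember the open position
        elif c == ')':
            if not stack:
                raise ValueError("unbalanced: extra ')' at %d" % i)
            j = stack.pop()                 # mate is the most recent '('
            match[i], match[j] = j, i
        else:
            raise ValueError("not a parenthesis: %r" % c)
    if stack:
        raise ValueError("unbalanced: unmatched '('")
    return match
```

For example, `match_parens("(())()")` yields `[3, 2, 1, 0, 5, 4]`.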


1045-9219/94$04.00 © 1994 IEEE

Algorithm III is a relatively simple, cost-optimal EREW algorithm with complexities of O(log n) time and O(n log n) working space using O(n/log n) processors. It builds a "copying/matching" tree whose nodes represent the unmatched parentheses in their spanned substrings using an encoding scheme, and it ensures "EREWness" by redundant positional address calculation via a parallel prefix-sum algorithm. A preliminary version of this result appeared in [8], and preliminary versions of this algorithm previously appeared in [17], [12]. This algorithm is detailed here in Section IV.

Section V describes Algorithm IV, which is obtained by combining previous methods. By using the positional matching trick of Tsang, Lam, and Chin [33], we avoid actually building the matching tree of Algorithm III. A "virtual" copying/matching tree is produced that is encoded as in Algorithm III, but is stored in a distributed fashion among the various processors, requiring O(log n) time and O(n) space using O(n/log n) processors, thus achieving greater simplicity in less space.

Furthermore, our algorithms use only arrays for storage; they do not employ pipelining, parallel sorting [10], linked list ranking [11], [26], or involved subalgorithms with large constants of proportionality. Thus, our solutions also serve to demonstrate that the parentheses-matching problem is easier than both the sorting and the linked-list-ranking problems on the EREW PRAM model. Section VI briefly surveys the existing time-optimal EREW algorithms for the parentheses-matching problem and compares them with our time-optimal algorithms. Section VII contains some concluding remarks.

Before proceeding further, we outline a few conventions. Let the n input parentheses be available in an array PAREN[1..n] at the beginning of each algorithm, where n is even. Output is to be obtained in an array MATCH[1..n] such that MATCH[i] = j implies that the parentheses PAREN[i] and PAREN[j] are mates. The symbols (i and )j denote a left and a right parenthesis at the ith and jth positions, respectively, for 1 ≤ i, j ≤ n. We will use C language conventions, and all logarithms are assumed to be in base 2. The well-known prefix-sum problem will occur as a subproblem on several occasions: given an array of n elements x_1, ..., x_n, this problem computes all the partial sums s_k = Σ_{i=1}^{k} x_i, for 1 ≤ k ≤ n. It can be optimally solved in O(log n) time employing O(n/log n) processors on the EREW PRAM model [11].

II. ALGORITHM I: AN ODD-EVEN SEGREGATION ALGORITHM

Consider the following two observations concerning a balanced input string.
1) The mate of a parenthesis at an odd position in a balanced input string lies at an even position.
2) If a balanced string does not have any left parenthesis at an even position (or, equivalently, a right parenthesis at an odd position), then the mate of each left parenthesis in the string lies immediately to its right.

One can show that any string satisfying the second observation is of the form ()()...(), which we call form F. By the first observation, the set of left parentheses at odd positions have their mates in the set of right parentheses at even positions. These parentheses can be matched independently of the matching of the set of left parentheses at even positions and the set of right parentheses at odd positions. Thus, the input string can be partitioned into two substrings, with the first substring containing the left and the right parentheses at odd and even positions, respectively, and the second substring containing the left and the right parentheses at even and odd positions, respectively. If this partitioning fails to split the input string (i.e., if the right substring is empty), then, by the second observation, the input string already has form F, and all the mates can easily be found in parallel. Otherwise, the partitioning scheme can be repeatedly applied until form F is obtained.

Algorithm I is formulated by using this partitioning approach as follows.
1) for i = 1 to ⌈log n⌉ − 1 do
   a) Mark each left parenthesis at an odd position and each right parenthesis at an even position. This is done in O(1) time by means of a suitable array data structure.
   b) Use a parallel prefix algorithm to pack the marked parentheses, followed by the unmarked parentheses.
2) Check whether the input string has been converted to form F. If not, then the input string is unbalanced. Otherwise, match the parentheses, and store the results in the output array MATCH.

Fig. 1 illustrates Algorithm I.

[Fig. 1. Illustration of Algorithm I on a string of 16 parentheses.]
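As a concreteness check, Algorithm I can be simulated sequentially; in the sketch below (our illustration, 1-indexed as in the paper), a stable partition stands in for the parallel prefix packing of step 1b:

```python
import math

def algorithm_one(s):
    """Sequential simulation of Algorithm I (odd-even segregation).

    Each pass marks left parentheses at odd positions and right
    parentheses at even positions, then stably packs the marked before
    the unmarked (the paper does this pack with a parallel prefix sum).
    After ceil(log2 n) - 1 passes a balanced string has form
    F = ()()...() and adjacent pairs are mates.  Returns the 1-indexed
    MATCH values as a list (entry 0 corresponds to position 1).
    """
    n = len(s)
    items = list(zip(s, range(1, n + 1)))   # (symbol, original position)
    for _ in range(max(0, math.ceil(math.log2(n)) - 1)):
        marked, unmarked = [], []
        for pos, (sym, orig) in enumerate(items, start=1):
            if (sym == '(' and pos % 2 == 1) or (sym == ')' and pos % 2 == 0):
                marked.append((sym, orig))
            else:
                unmarked.append((sym, orig))
        items = marked + unmarked           # stable pack: marked first
    match = [0] * (n + 1)
    for k in range(0, n, 2):                # form F: adjacent pairs mate
        (a, i), (b, j) = items[k], items[k + 1]
        if (a, b) != ('(', ')'):
            raise ValueError("input string is unbalanced")
        match[i], match[j] = j, i
    return match[1:]
```

For example, `algorithm_one("()(())")` returns `[2, 1, 6, 5, 4, 3]`: position 1 mates with 2, position 3 with 6, and position 4 with 5.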

The effect of the partitioning is best seen through nesting levels. The level of a left parenthesis at position i is given by the difference in the number of left and right parentheses in the substring PAREN[1..i]; likewise, the level of a right parenthesis at position j is one more than the difference in the number of left and right parentheses in PAREN[1..j]. Thus, one can determine the level of each parenthesis by using a parallel prefix algorithm, and the parentheses belonging to a level can be matched independently of any other level. Consider the graphical representation of an input string, as shown in Fig. 2. The partitioning scheme places the parentheses belonging to the odd levels of each substring in the left partition, and those belonging to the even levels in the right partition. Therefore, if the input string has l levels, its left partition would have ⌈l/2⌉ levels, whereas its right partition would have ⌊l/2⌋ levels. This phenomenon is illustrated in Fig. 2.

[Fig. 2. Graphical representation showing the reduction in the number of levels by partitioning.]

We informally show that ⌈log n⌉ − 1 iterations are sufficient for Algorithm I. The maximum number of levels in a balanced string of n = 2^k parentheses is 2^(k−1) (attained by a string of 2^(k−1) left parentheses followed by 2^(k−1) right parentheses), and the number of levels in the left and right partitions of a string cannot exceed 2^(k−2). Subsequent (k − 2) repeated partitionings of the substrings would reduce the number of levels in any substring to 2^0 = 1, thus converting the input string to the desired form F. For a rigorous proof of this claim, see [15]. Since each substep of Algorithm I can be implemented in O(log n) time using n/log n processors on an EREW PRAM, and since there are ⌈log n⌉ − 1 iterations, Algorithm I requires O(log² n) time employing n/log n processors. It can also be shown that the average number of partitioning steps is O(log n); interested readers may obtain a proof by using a result on the average height of planted plane trees [6].

Remarks: Can we make Algorithm I any faster? The answer lies in the observation that Algorithm I eventually converts an input string to form F by partitioning the input string into its constituent level-1 substrings. Instead of repeated partitioning to obtain form F, one can determine the level of each parenthesis by a parallel prefix algorithm, and then stable sort the parentheses according to their levels. Using a parallel bucket sort algorithm [11], this can be accomplished in O(log n) time using O(n) processors on an EREW PRAM model. A reduction in the number of processors would be possible if n integers, each representable in O(log n) binary bits, could be (stable) sorted in O(log n) time using o(n) processors. To our knowledge, however, there is no algorithm that sorts n integers in [1..n] in O(log n) time employing O(n/log n) processors: applying Cole's parallel merge sort algorithm [10], the time complexity of such a parentheses-matching algorithm would be O(log n) using O(n) processors; an algorithm by Rajasekaran and Sen [30] requires a word length of n^ε, for any ε > 0; and a result [21] sorts n integers drawn from [1..n] in O(log n) time employing O(n/log n) processors only if the basic operations on O(n^ε log n)-bit integers are assumed to require unit time. But the close relationship between matching parentheses and sorting integers has led us to the following result. Using an optimal parentheses-matching algorithm (such as the ones presented later in this paper) as a subroutine, we have developed an O(log n)-time, O(n/log n)-processor algorithm on the EREW PRAM to sort a special class of integers in which any two successive integers in the input sequence differ by at most 1 [7]. A similar result has been independently reported by Diks and Rytter [19].

III. ALGORITHM II: AN EFFICIENT MATCH-AND-COPY ALGORITHM

We now discuss an alternate approach to the parentheses-matching problem that turns out to be more fruitful, although by itself it still falls short of cost-optimality. In a substring of parentheses, when all the locally matched pairs of parentheses are found and removed, the substring is reduced to a standard form consisting of the unmatched right parentheses followed by the unmatched left parentheses [20]. If a substring is balanced, its reduction will leave it empty.
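The level-based alternative described in the Remarks can likewise be sketched sequentially (our illustration, 0-indexed; valid for a balanced input). A dictionary of per-level buckets stands in for the stable bucket sort: within each level the parentheses alternate ( ) ( ) ..., so consecutive bucket entries are mates:

```python
from itertools import accumulate

def match_by_levels(s):
    """Compute each parenthesis's level with a prefix sum, then match
    the parentheses of each level pairwise, left to right.

    Grouping by level (a stable bucket sort in the parallel setting)
    makes mates adjacent within their bucket, for a balanced string.
    """
    n = len(s)
    # prefix[i] = (#left - #right) in s[0..i]
    prefix = list(accumulate(+1 if c == '(' else -1 for c in s))
    # level of '(' at i: prefix[i]; level of ')' at i: prefix[i] + 1
    level = [prefix[i] if s[i] == '(' else prefix[i] + 1 for i in range(n)]
    buckets = {}
    for i in range(n):
        buckets.setdefault(level[i], []).append(i)
    match = [None] * n
    for idxs in buckets.values():
        for a, b in zip(idxs[::2], idxs[1::2]):   # consecutive ( ) pairs
            match[a], match[b] = b, a
    return match
```

For example, `match_by_levels("()(())")` returns `[1, 0, 5, 4, 3, 2]`.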

Consider the following recursive algorithm to reduce an input string.
1) Split the string into two halves. Recursively reduce the left and the right substrings.
2) Match the reduced left and right substrings: if the left substring contains r1 right parentheses followed by l1 left parentheses, and the right substring contains r2 right parentheses followed by l2 left parentheses, match the rightmost z = min(l1, r2) left parentheses of the left substring with the leftmost z right parentheses of the right substring. Remove those z matched pairs, leaving r = r1 + r2 − z right parentheses followed by l = l1 + l2 − z left parentheses.

Since the left and the right substrings in Step 1 are reduced recursively, the algorithm conceptually builds a binary matching tree in which two siblings containing reduced substrings are matched to obtain the reduced substring of their parent. For example, in the last level of the matching tree of Fig. 4, the left sibling has six unmatched left parentheses, (1 (2 (3 (6 (15 (16, which can each be matched by six processors in O(1) time with the matching right parentheses of the right sibling. Thus, in the worst case, the work performed at each level is O(n); the total time required, T(n), is given by the recurrence relation T(n) = T(n/2) + O(1), so this conceptually straightforward algorithm runs in O(log n) time using O(n) processors.

A. Implementation

Suppose the reduced left and right substrings after the first step are formatted in the input array PAREN[1..n] as shown in Fig. 3. Further, let there be n processors, wherein processor Pi starts with the parenthesis PAREN[i]; at the end of Step 1, each processor of the left substring knows the values r1 and l1, and each processor of the right substring knows the values r2 and l2.

[Fig. 3. Matching left and right siblings.]

Step 2 is easily implemented in place in O(1) time as follows. Processor Pi of the left substring sends its r1 and l1 values to P_{i+n/2} and receives r2 and l2 values in return. Then, for i = 0, ..., min(l1, r2) − 1, processor P_{n/2−i} matches the left parenthesis in PAREN[n/2 − i] with the right parenthesis in PAREN[n/2 + i + 1], and, correspondingly, processor P_{n/2+i+1} performs this matching as well; this avoids any read-write conflict while matching. The surviving parentheses are then packed into the standard reduced form: for instance, if l1 < r2, processor P_{n/2−i} moves PAREN[n/2 − i] to PAREN[n − l2 + r2 − i], and the case l1 ≥ r2 is handled analogously by the processors of the right substring, which move PAREN[n/2 + i + 1] to a position computed from r1 and l1. The details can be filled in easily.

Remarks: One can clearly state Algorithm II iteratively; after each matching iteration, the size of a substring doubles. Fig. 4 shows the working of Algorithm II as an iterative algorithm (after unfolding the recursion to substrings of size 1) with an example string of 32 parentheses. We use the notation <i, j> to denote that the left parenthesis i is matched with the right parenthesis j. Algorithm II is an in-place, O(log n)-time algorithm using O(n) processors. After modifications, Algorithm II leads to a cost-optimal algorithm, which is described next.

IV. ALGORITHM III: AN OPTIMAL MATCH-AND-COPY ALGORITHM

In Algorithm II, all the unmatched parentheses are carried from level to level in the binary matching tree; in the worst case, the work performed at each level is O(n). In Algorithm III, we use only n/log n processors instead of n, and we reduce this work from O(n) to O(n/log n) per level, yet keep the time spent at each level at O(1). We briefly indicate the key ideas for Algorithm III first, and give pertinent details later.
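A sequential sketch of the reduce-and-match recursion may help fix the idea (our illustration; the paper's parallel in-place moves are replaced by list bookkeeping). A reduced substring is represented by a pair (R, L): the indices of its unmatched right parentheses followed by its unmatched left parentheses:

```python
def reduce_and_match(s):
    """Run Algorithm II's match-and-copy recursion sequentially.

    Matching two siblings pairs the rightmost z = min(l1, r2) left
    parentheses of the left child with the leftmost z right parentheses
    of the right child, innermost pair first.  Returns the 0-indexed
    MATCH list; raises ValueError if the string is unbalanced.
    """
    match = [None] * len(s)

    def reduce(lo, hi):                 # reduce s[lo:hi], return (R, L)
        if hi - lo == 1:
            return ([], [lo]) if s[lo] == '(' else ([lo], [])
        mid = (lo + hi) // 2
        r1, l1 = reduce(lo, mid)
        r2, l2 = reduce(mid, hi)
        z = min(len(l1), len(r2))
        # innermost first: rightmost surviving '(' meets leftmost ')'
        for i, j in zip(reversed(l1[len(l1) - z:]), r2[:z]):
            match[i], match[j] = j, i
        # parent's reduced form: remaining rights, then remaining lefts
        return (r1 + r2[z:], l1[:len(l1) - z] + l2)

    rights, lefts = reduce(0, len(s))
    if rights or lefts:
        raise ValueError("input string is unbalanced")
    return match
```

For example, `reduce_and_match("()(())")` returns `[1, 0, 5, 4, 3, 2]`, agreeing with the stack-based result.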

[Fig. 4. Illustration of Algorithm II.]

Suppose the left parentheses (1 (2 (3, with (1 being the leftmost, are collectively represented by the notation (1^3 (the superscript is rendered here with a caret), which denotes these three left parentheses with (1 as their leftmost. In general, let a representative parenthesis (i^j denote j left parentheses, with (i being the leftmost; similarly, let j)i denote j right parentheses, with )i being the rightmost.² Following such a scheme, the left sibling of the last level of Fig. 4 may be represented by (1^3 (6^1 (15^2, and the right sibling by 1)17 2)27 3)32. Such encoded siblings can be sequentially matched by a single processor by starting with the leftmost left representative parenthesis, (1^3, and the rightmost right representative parenthesis, 3)32: matching (1 with )32, then (2 with )31, and (3 with )30, using the representative parentheses (1^3 and 3)32; then matching (6 with )27, using (6^1 and 2)27; matching (15 with )26; and, finally, matching (16 with )17, using (15^2 and 1)17.

Next, the left sibling can be partitioned into two segments, (1^3 and (6^1 (15^2, each representing three parentheses. In doing so, two processors can be employed to concurrently match the corresponding left and right segments: (1^3 with 3)32, and (6^1 (15^2 with 1)17 2)27. Similarly, the left sibling can be partitioned into three segments representing two parentheses each: (1^2, (3^1 (6^1, and (15^2; the right sibling is correspondingly partitionable into the segments 1)17 1)26, 1)27 1)30, and 2)32. Now three processors can independently match (1^2 with 2)32, (3^1 (6^1 with 1)27 1)30, and (15^2 with 1)17 1)26. Notice that, to create the three left segments, the representative parenthesis (1^3 had to be split into (1^2 and (3^1; likewise, 2)27 was split into 1)26 and 1)27, and 3)32 was split into 1)30 and 2)32.

As illustrated by this matching procedure, a representative parenthesis can be used to index the parentheses it represents. In general, to create the leftmost representative parenthesis of a left partition, a representative parenthesis of the preceding partition might be required to be split; similarly, a representative parenthesis of a right partition might be required to be split to create the rightmost representative parenthesis of the succeeding right partition. We note two points here.
1) Multiple processors are employable to match a pair of encoded siblings if the siblings can be kept partitioned appropriately.
2) Since matching a pair of left and right partitions starts with the leftmost representative parenthesis of the left partition and the rightmost representative parenthesis of the right partition, these two representative parentheses in particular are essential for their respective partitions.

Algorithm III employs this encoding/indexing scheme to avoid carrying all the unmatched parentheses from level to level in the binary matching tree, thus reducing the work from O(n) to O(n/log n) per level. It also keeps the encoded nodes partitioned into log n-size segments so that, after the entire matching tree has been constructed and all the matching segment pairs have been identified, each pair of matching left and right segments can be matched independently in O(log n) time.

²An interval encoding scheme of a similar nature has been independently employed by Diks and Rytter [19].
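The run-splitting arithmetic above can be sketched without ever expanding the runs. In the sketch below (ours, not the paper's), a left run (i, c) stands for c left parentheses whose representative is (i, and a right run (c, j) stands for c right parentheses with )j rightmost; on a split we simply keep the old representative's index on each piece, whereas the paper computes a fresh representative for each piece by indexing through the record:

```python
def pair_encoded_runs(lefts, rights):
    """Pair two run-length-encoded siblings.

    The z = min(l, r) rightmost lefts match the z leftmost rights,
    outermost pair first, so we walk the left runs from the left and
    the right runs from the right, splitting whichever run is larger.
    Returns a list of (left_run, right_run, size) triples.
    """
    ltot = sum(c for _, c in lefts)
    rtot = sum(c for c, _ in rights)
    z = min(ltot, rtot)

    # Keep only the z rightmost lefts, splitting the boundary run.
    drop, ql = ltot - z, []
    for i, c in lefts:
        if drop >= c:
            drop -= c
        else:
            ql.append((i, c - drop))
            drop = 0
    # Keep only the z leftmost rights, splitting the boundary run.
    drop, qr = rtot - z, []
    for c, j in reversed(rights):
        if drop >= c:
            drop -= c
        else:
            qr.append((c - drop, j))
            drop = 0
    qr.reverse()

    # Outermost pairs first: front of ql against back of qr.
    triples = []
    while ql and qr:
        (i, cl), (cr, j) = ql[0], qr[-1]
        m = min(cl, cr)
        triples.append(((i, m), (m, j), m))
        if cl == m:
            ql.pop(0)
        else:
            ql[0] = (i, cl - m)      # split; old representative kept
        if cr == m:
            qr.pop()
        else:
            qr[-1] = (cr - m, j)     # split; old representative kept
    return triples
```

Run on the text's example, `pair_encoded_runs([(1, 3), (6, 1), (15, 2)], [(1, 17), (2, 27), (3, 32)])` begins by matching all of (1^3 against 3)32, as described above, and accounts for all six pairs.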

This algorithm consists of three phases, each requiring O(log n) time. Assuming an input string of length n = 2^m, Algorithm III uses p = 2^q processors, where q = ⌈log⌈n/m⌉⌉. In Phase 1, the input string is partitioned into p substrings, each of size k = 2^(m−q), and processor Pi is assigned to the ith substring, for 1 ≤ i ≤ p. Each processor reduces its substring sequentially: it finds the locally matched parentheses, and the resultant reduced substrings are made available in the input array PAREN. In Phase 2, all 2^q reduced substrings are matched iteratively to construct a binary matching tree of height q. In Phase 3, after all the matching segment pairs have been identified, each pair of matching left and right segments is matched.

At the beginning of Phase 2, processor Pi is assigned to the ith reduced substring produced by Phase 1. A processor encodes its substring into two representative records: the right representative record, rrep, contains the index of the rightmost right parenthesis of the substring in the array PAREN (rrep.index), and also the total number of right parentheses of the substring (rrep.size); likewise, the index of the leftmost left parenthesis and the total number of left parentheses form the left representative record, lrep. A representative record contains the information for the conceptual representative parentheses described earlier. Each processor Pi is responsible for carrying its representative parentheses along the matching tree; it keeps track of the current location of its lrep by a local variable lpos, and that of its rrep by rpos. Depending on the location of its representative records, a processor is associated with either a left or a right sibling. For a left sibling node, processor Pi also keeps a local copy of the total number of unmatched right parentheses in r1, and that of the left parentheses in l1; a processor Pi associated with a right sibling keeps the corresponding values in local variables r2 and l2.

A. Data Structure and Operations

The matching tree is stored in an array TREE, where TREE[i] stores the ith level of the tree. TREE[1] stores the leaves such that the ith leaf occupies the subarray TREE[1][(i − 1)k + 1 .. ik]; thus, each leaf has a capacity of k. Each parent at a level i + 1 is double the size of its children at level i, occupying those positions of TREE[i + 1] that both its children occupy in TREE[i]. Each node of the tree is conceptually kept partitioned into k-size segments, as shown in Fig. 5.

[Fig. 5. Illustration of Algorithm III.]
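Phase 1's per-processor reduction is just the sequential stack scan restricted to a subarray, summarized into the two representative records. A sketch (ours, 0-indexed, with None standing for an absent record):

```python
def reduce_substring(paren, lo, hi):
    """One processor's Phase 1 work, run sequentially.

    Finds the locally matched pairs in paren[lo:hi] with a stack, and
    summarizes the unmatched parentheses as two representative records:
    rrep = (index of rightmost unmatched ')', total unmatched ')') and
    lrep = (index of leftmost unmatched '(', total unmatched '(').
    """
    stack, pairs, unmatched_r = [], [], []
    for i in range(lo, hi):
        if paren[i] == '(':
            stack.append(i)
        elif stack:
            pairs.append((stack.pop(), i))   # locally matched pair
        else:
            unmatched_r.append(i)            # ')' with no local mate
    rrep = (unmatched_r[-1], len(unmatched_r)) if unmatched_r else None
    lrep = (stack[0], len(stack)) if stack else None
    return pairs, rrep, lrep
```

For the substring ")()((", this yields one local pair and the records rrep = (0, 1) and lrep = (3, 2).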

Each representative record occupies the same position in each node that the corresponding parenthesis would occupy if Algorithm II were used. A record lrep of a processor Pj placed at a position TREE[i][lpos] represents the left parentheses corresponding to locations lpos through lpos + lrep.size − 1 of TREE[i]; the range of an rrep extends to its left, correspondingly. Pi keeps two private variables: 1) lu, the number of left parentheses represented by its lrep and by those to its right in its node; and 2) ru, the total number of right parentheses represented by its rrep and by those to its left. The relative position (from left) of the rrep of a processor in its segment is z = ((ru − 1) mod k) + 1; likewise, the relative position of its lrep in its segment, from the right, is z = ((lu − 1) mod k) + 1.

A processor of a right sibling has to match the right segment containing its rrep, and, likewise, a processor of a left sibling has to match the left segment containing its lrep, because these records are required to initiate the matching of those segments. If a record's range crosses the boundary of its segment, the associated processor becomes responsible for matching that segment with the symmetrically opposite segment of the sibling; accordingly, a processor is assigned to match at most two pairs of left and right segments, one pair for each of its representative records. To avoid access conflicts between the processors of left and right segments during Phase 3, the leftmost position of each left segment is to be filled with a record: if z < lrep.size, the processor fills in the leftmost location of the succeeding left segment with an appropriate temporary record, with size = lrep.size − z and index = lrep.index + z, stored at TREE[level + 1][lpos + z]; similarly, if z < rrep.size, a temporary record with size = rrep.size − z and index = rrep.index − z is filled in the last position, TREE[level + 1][rpos − z], of the preceding segment.

B. Detailed Steps of Algorithm III

Phase 1: For i = 1, 2, ..., p, processor Pi sequentially finds the locally matched parentheses in the subarray PAREN[(i − 1)k + 1 .. ik]. The reduced substring is left in its subarray by packing the right parentheses at the beginning and the left parentheses at the end of the subarray.

Phase 2:
1) Initialize the leaf nodes in TREE[1]: Pi sets rpos to the index of the rightmost right parenthesis in the array PAREN and lpos to the index of the leftmost left parenthesis, constructs its representative records, stores rrep at TREE[1][rpos] and lrep at TREE[1][lpos], and initializes its local variables r1 and l1 if i is odd (node j at a level is a left sibling if j is odd); otherwise, it initializes r2 and l2.
2) Construct the binary matching tree: For level = 1 to q do {match the nodes at TREE[level] and construct TREE[level + 1]}:
   a) Exchange r's and l's between the processors of siblings: processor Pi of a left sibling sends its r1 and l1 to the corresponding processor of the right sibling, and receives r2 and l2 in return.
   b) Copy the unmatched representative records into the parent: Pi belonging to a left sibling copies its rrep into exactly the same location in TREE[level + 1] as it was in TREE[level]; Pi of a right sibling copies its lrep similarly. The calculation to place the remaining representative records (i.e., updating lpos and rpos) from a child to its parent is similar; for example, a left-sibling processor whose lrep survives updates lu = lu − r2 + l2 and copies lrep to its new position, and a right-sibling processor whose rrep survives updates rpos = rpos − l1 + r1.
   c) Fill in the temporary records, if necessary, as described above.
   d) Assign processors to matching segments: if r2 < l1, the processors assigned to the right sibling will perform the matching task during Phase 3, with the matching left segments being in the left sibling; otherwise, the processors of the left sibling are assigned. Pi saves a copy of all its current variables, together with the variable level, for use in Phase 3.

Analogously to Algorithm II, as the construction proceeds, the sizes of the representative records are reduced to represent increasingly fewer unmatched parentheses; fewer unmatched parentheses are carried along, leaving behind those parentheses that have matched. If a processor finds that its representative record is completely matched, it discontinues carrying that record further along the matching tree. Fig. 5 shows the entire matching tree constructed during Phase 2: the shaded regions of each pair of siblings have been found to be matching; above each representative record, the processor responsible for copying it is shown; and the processors that would match specific segment pairs during Phase 3 are shown encircled.
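Phase 2's bottom-up combination of record sizes can be sketched with plain counts (our illustration; the real algorithm also tracks positions and segment assignments, which this sketch omits, and the number of leaves is assumed to be a power of two):

```python
def build_match_tree(leaves):
    """Summarize the matching tree bottom-up.

    Each node is summarized by (r, l) = (#unmatched right, #unmatched
    left).  Siblings are combined exactly as in Algorithm II's Step 2;
    matched[level] records z = min(l1, r2) for each sibling pair, i.e.,
    how many pairs must be reported for that parent in Phase 3.
    """
    levels, matched = [list(leaves)], []
    while len(levels[-1]) > 1:
        cur, nxt, ms = levels[-1], [], []
        for a in range(0, len(cur), 2):
            (r1, l1), (r2, l2) = cur[a], cur[a + 1]
            z = min(l1, r2)                  # pairs matched between siblings
            ms.append(z)
            nxt.append((r1 + r2 - z, l1 + l2 - z))
        levels.append(nxt)
        matched.append(ms)
    return levels, matched
```

For the string "(())" cut into four 1-character leaves, the root summary is (0, 0), with both matches occurring at the top level.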

1) Find the positions of the rightmost right and the leftmost left records in the right and the left segments, respectively.

2) Find the left and right representative records at symmetrically opposite locations: The position symmetric to position r of the right segment in the left segment is l' = 2⌈r/t⌉t − r + 1, where t = 2^level is the size of a node at this level. Assuming that l < l', a processor belonging to a right sibling computes its r2 and l2 analogously.

3) Match the segments, starting with lrep = TREE[level][l] and rrep = TREE[level][r]:
While l < r do
  a) Match the right parenthesis in PAREN[rrep.index] with the left parenthesis in PAREN[lrep.index]; decrement rrep.size, increment lrep.index, and decrement lrep.size. If lrep.size = 0, read the new lrep from TREE[level][l] and increment l; if rrep.size = 0, read the new rrep from TREE[level][r] and decrement r.
endwhile

e) Compute new r's and l's.

Remarks: Notice that the private variables rn and ru can be derived from rpos and lpos as rn = ((rpos − 1) mod (k·2^(level−1))) + 1 and lu = k·2^(level−1) − ((lpos − 1) mod (k·2^(level−1))); these two variables have been used only to clarify the presentation.

Phase 3: For i = 1, ..., n/log n, employing n/log n processors: if Pi has been assigned to match a pair of segments, Pi sequentially matches it.

It is easy to conclude that each of Phases 1, 2, and 3 requires O(log n) time using n/log n processors; Algorithm III is therefore a cost-optimal EREW PRAM algorithm. Its space complexity, however, is O(n log n), because each level of TREE requires O(n) space. This is not optimal compared to the O(n) stack space used by a sequential algorithm. The next algorithm cleverly modifies the implementation aspects of Algorithm III to bring the space complexity back down to O(n) while maintaining the parallel time at O(log n).

V. ALGORITHM IV: A TIME- AND SPACE-OPTIMAL ALGORITHM

The key idea of Algorithm IV is not to construct the binary matching tree explicitly, because there are many empty spaces in the matching tree that remain unused (Fig. 5). The portions of the matching tree that are of use during Phase 3 are the matched portions of the different nodes; the unmatched portions serve only to calculate the parent nodes during Phase 2. Although each processor carries its two representative records of the "virtual" matching tree level by level, it does so only in its private memory, while keeping track of the current level of the matching tree and the particular node at that level.

In contrast with Algorithm III, Algorithm IV uses two arrays, LMATCH[1..n/2] and RMATCH[1..n/2], to store the matching portions of the left and right siblings, respectively. The sizes of the arrays LMATCH and RMATCH are precisely n/2 each, so together they can accommodate all the left and right parentheses. In these arrays, the matching portions of left and right siblings are stored aligned, in such a way that the left parenthesis at LMATCH[i] is the match of the right parenthesis at RMATCH[i]. Whenever the whole or a part of a representative record is found to be matched during Phase 2 of Algorithm IV, the representative record (or a temporary record, if only a part has matched) is stored at an appropriate location in LMATCH (RMATCH) if the current node is a left (right) sibling. (The calculation of the appropriate location is explained in later paragraphs.) During Phase 3, equal portions of the arrays LMATCH and RMATCH are allocated to all the processors, and each processor sequentially matches the parentheses in its subarrays.

Fig. 6. (a) Number of parentheses matched at each node. (b) Calculation of INDEX[level][p].
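The alignment invariant just described, that the left parenthesis stored at LMATCH[i] is the mate of the right parenthesis stored at RMATCH[i], can be illustrated with a small sequential sketch. The helper name, the 0-based base offset, and the plain-list representation are our assumptions:

```python
def align_match(left_parens, right_parens, LMATCH, RMATCH, base):
    """Store the matched portions of a pair of siblings aligned, so that
    the '(' written at LMATCH[base+i] is the mate of the ')' written at
    RMATCH[base+i].  left_parens: positions of unmatched '(' in the left
    sibling (ascending); right_parens: positions of unmatched ')' in the
    right sibling (ascending).  Returns the leftover positions that
    survive to the parent node."""
    m = min(len(left_parens), len(right_parens))
    for i in range(m):
        # the rightmost remaining '(' pairs with the leftmost remaining ')'
        LMATCH[base + i] = left_parens[len(left_parens) - 1 - i]
        RMATCH[base + i] = right_parens[i]
    return left_parens[:len(left_parens) - m], right_parens[m:]
```

For the string "(())" split into halves "((" and "))", the two arrays come out as LMATCH = [1, 0] and RMATCH = [2, 3], i.e., position 1 matches 2 and position 0 matches 3, so Phase 3 can read matches off by array position with no concurrent access.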

Algorithm IV uses p = 2^q processors, where q = ⌈log(n/k)⌉, and it also consists of three O(log n)-time phases. Phase 1 is the same as in the previous algorithm. As in Algorithm III, the matching tree is constructed conceptually level by level during Phase 2, and the matching left and right representative records of each pair of siblings are stored into the arrays LMATCH and RMATCH.

A processor associated with node i of the matching tree would need to know the exact number of parentheses matched at nodes 1 through i − 1 in order to store the parentheses matched at node i into the arrays LMATCH and RMATCH. For this purpose, let us number the nodes of the matching tree starting from the level containing the leaves down to the level of the root, and within each level from left to right. Assuming an input string of length n = 2^m, the leftmost leaf is node 1 and the rightmost leaf is node p at level 1; the leftmost node at level 2 is node p + 1, and the rightmost is node p + p/2; likewise, the root is node 2p − 1. Thus, an odd-numbered node (except the root) is a left sibling.

A 2-D array INDEX[1..q][1..p] is used to precalculate and store the above information, such that each processor Pi associated with the jth node (from left to right) at level l, for (j − 1)2^(l−1) + 1 ≤ i ≤ j·2^(l−1), has a private copy INDEX[l][i] as the index into the arrays LMATCH and RMATCH, thus avoiding any concurrent read for this purpose.(3)

To calculate and fill in the array INDEX, the number of parentheses matched at each pair of left and right siblings, i.e., at the (2j − 1)th and the (2j)th node at level l, is calculated and stored in INDEX[l][2^l(2j − 1) + 1], for 1 ≤ j ≤ p/2^l and 1 ≤ l ≤ q. Other entries of INDEX are initialized to zeros beforehand. Next, a parallel prefix-sum algorithm is used on the array INDEX treated as a row-major linear array. The effect of this step is that the entire subarray INDEX[l][2^l(2j − 1) + 1 .. 2^l(2j)] now contains the required index for the processors associated with the left and the right siblings. This step does not require actual building of the matching tree. The first element of the tuple in Fig. 6(a) corresponds to the number of right parentheses, and the second element to the number of left parentheses, with each processor. Fig. 6(b) illustrates this process with the same input string as before.

(3) A similar preprocessing is carried out by Anderson et al. [2].

Fig. 7. The conceptual matching tree and the arrays LMATCH and RMATCH.
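The indexing step above reduces to an exclusive prefix-sums pass over the per-node match counts taken in row-major order. On an EREW PRAM the same computation is done by a standard O(log n)-time parallel prefix algorithm; the sequential sketch below (names are ours) only shows what the resulting offsets mean:

```python
def exclusive_prefix_sums(counts):
    """offsets[i] = counts[0] + ... + counts[i-1]; offsets[0] = 0."""
    offsets, running = [], 0
    for c in counts:
        offsets.append(running)
        running += c
    return offsets

# hypothetical per-node numbers of matched pairs, in node order
counts = [2, 0, 3, 1]
offsets = exclusive_prefix_sums(counts)
# node i then stores its matched records starting at LMATCH[offsets[i] + 1]
# (1-based, as in the paper's arrays), so no two nodes' writes collide
```

Since every node writes into a disjoint range of LMATCH and RMATCH, both the writes in Phase 2 and the reads in Phase 3 are exclusive, as the EREW model requires.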

Detailed Steps of Algorithm IV

Each processor has variables r1, r2, l1, and l2, and records lrep and rrep. Each such record now has three fields: index, size, and u; an lrep with fields index, size, and u is shown as ⟨index, size, u⟩ (and analogously for an rrep). The fields size and index are as in the previous algorithm, keeping the number of parentheses represented by the record and the index of the leftmost (or, for rrep, the rightmost) parenthesis in the input array PAREN. Additionally, the third field lrep.u keeps the total number of left parentheses represented by the record lrep and by the records to the right of lrep in its node; rrep.u, likewise, contains the total number of right parentheses represented by rrep and by those to its left in its node. The field u serves as the variable lu of Algorithm III. Initially, at a leaf node, lrep is the only record in the node, so lrep.u = lrep.size. The arrays LMATCH[1..n/2] and RMATCH[1..n/2] are initialized.

Phase 1 is as in Algorithm III.

Phase 2: Initialize the entries of the array INDEX to zeros. Construct the conceptual matching tree, compute the number of matched parentheses at each pair of siblings and store it into the array INDEX, and store the matching representative records into the arrays LMATCH and RMATCH using indices from the array INDEX. {Each processor begins with the same local variables that it had at the end of Phase 1.}

for level = 1 to q do

a) Exchange r's and l's among the processors of siblings: There are t = 2^level processors, P_{t(j−1)+1} through P_{tj}, belonging to node j at this level of the tree; node j is a left sibling if j is odd. Processor Pi of a left sibling sends its r1 and l1 to P_{i+t} of the right sibling and receives r2 and l2 in return; analogously, a processor belonging to a right sibling computes its r2 and l2. The number of matched parentheses is z = min(l1, r2). For the parent node, P_{t(j−1)+1} of a left sibling gets the r2 and l2 values from P_{tj+1} of its right sibling in parallel, calculates the new values r1 = r2 = r1 + r2 − min(l1, r2) and l1 = l2 = l1 + l2 − min(l1, r2), and stores the number of parentheses matched into INDEX[level][t(j − 1) + 1].

b) Copy the matched records into the arrays LMATCH and RMATCH, using the indices available in the array INDEX: Consider a processor Pi of a left sibling. If lrep.u ≤ z, its lrep is matched completely, and Pi copies its lrep into LMATCH[INDEX[level][i] − lrep.u + 1]. If lrep.u > z but lrep.u − lrep.size < z, then a part of this lrep has matched, and a temporary record with size = lrep.size − (lrep.u − z) is copied into LMATCH[INDEX[level][i] − z + 1]. A Pi of a right sibling fills in a temporary right representative record in RMATCH similarly.

c) Update the unmatched records for the parent: The records rrep of a left sibling and lrep of a right sibling remain unchanged. Those records which remain unmatched, or which are only partly matched, are updated for the parent node; for Pi of a left sibling, if lrep.u > r2, its lrep survives with lrep.u = lrep.u − z + l2. {Analogous processing for a right sibling.}

d) Compute new r's and l's as in Algorithm III.

Next, a parallel prefix-sum algorithm is used on the array INDEX[1..q][1..p] treated as a row-major linear array. After Phase 2, each left and right subarray of LMATCH and RMATCH holds its leftmost and rightmost representative records; Fig. 7 illustrates this stage. In Fig. 7, for instance, P1 and P4 fill in temporary records at the third and the ninth locations of the arrays LMATCH and RMATCH, respectively.

Phase 3: For 1 ≤ i ≤ p, processor Pi is assigned to the subarrays LMATCH[k(i − 1) + 1 .. ki] and RMATCH[k(i − 1) + 1 .. ki]. Fig. 8 illustrates Phase 3; the records matched are shown enclosed within "⟨" and "⟩", and the vertical dotted lines separate the private variables of each processor.

Fig. 8. Phase 3 of Algorithm IV.
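Phase 3 gives each processor a k-slot window of LMATCH and RMATCH; a representative record whose run of parentheses spills past its window's last slot ki must be split, with the overflow written as a temporary record at slot ki + 1. A sketch of this splitting arithmetic follows; the (slot, index, size) tuple layout and the function name are our assumptions:

```python
def split_at_boundary(j, index, size, ki):
    """Split a left representative record that spills past its processor's
    subarray boundary ki (slots are 1-based).  The record sits at slot j
    of LMATCH, represents `size` consecutive left parentheses, and `index`
    is the PAREN position of its leftmost parenthesis.  Returns the
    trimmed record and the temporary record for slot ki + 1 (or None)."""
    if size + j - 1 <= ki:                  # no spill: nothing to split
        return (j, index, size), None
    head = (j, index, ki - j + 1)           # part kept up to the boundary
    tail = (ki + 1,                         # temporary record's slot
            index + (ki - j + 1),           # skip the parentheses in head
            size + j - 1 - ki)              # overflow size
    return head, tail
```

The two pieces together still represent the original size parentheses, so each processor's window afterward represents exactly k parentheses and can be matched independently of the others.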

1) Write temporary records in the succeeding subarrays: For 1 ≤ i ≤ p − 1, each Pi (except Pp) reads its rightmost left representative record lrep in the subarray LMATCH[k(i − 1) + 1 .. ki]; let this record be at location j. If lrep.size + j − 1 > ki, the record spills past the subarray boundary: Pi writes a new temporary record at LMATCH[ki + 1] with size = lrep.size + j − 1 − ki and index = lrep.index + ki − j + 1, and the record at location j is reduced to represent parentheses only through the subarray boundary (lrep.size = ki − j + 1). Analogously, Pi scans its subarray of RMATCH from right to left, finds the rightmost right representative record, and processes its rrep similarly.

2) Match the subarrays: Each Pi (including Pp) starts with lrep = LMATCH[k(i − 1) + 1] and rrep = RMATCH[k(i − 1) + 1], and matches the k left and right parentheses represented by the matching subarrays LMATCH[k(i − 1) + 1 .. ki] and RMATCH[k(i − 1) + 1 .. ki], as in Phase 3 of Algorithm III.

Remarks: Algorithms III and IV are both time-optimal; Algorithm IV is also space-optimal.

VI. COMPARISON WITH THE EXISTING COST-OPTIMAL EREW ALGORITHMS

We briefly sketch the existing cost-optimal, O(log n)-time algorithms for parentheses-matching on the EREW PRAM, focusing on the underlying techniques and the data structures employed. To our knowledge, in addition to the proposed Algorithms III and IV, there are four other parentheses-matching algorithms that require O(log n) time and use n/log n processors: the algorithms due to Anderson, Mayr, and Warmuth [2], Diks and Rytter [19], Levcopoulos and Petersson [27], and Tsang, Lam, and Chin [33]. For completeness, we also discuss Kim's algorithm [24]. We must clarify that these four algorithms and our Algorithm III have been developed independently of each other. In this context, we point out that Levcopoulos and Petersson [27] have shown that O(log n) time is a lower bound for the parentheses-matching problem on the EREW PRAM model.

We begin with Anderson et al. [2], because this was the first such PRAM algorithm. In their algorithm, a binary matching tree with n/log n leaves is constructed; each leaf processor is assigned to a substring of 2 log n parentheses and begins by reducing its substring sequentially. Each node v contains c(v) and o(v), the numbers of unmatched closing (right) and opening (left) parentheses remaining at node v, and a parent node is easily computed from its children. Separately, a parallel prefix algorithm is employed to calculate b(v), the sum of m(w) over the nodes w preceding v, where m(v) is the number of matching pairs found at node v. Next, each matching pair at node v is assigned a unique index (b(v), i), with 0 ≤ i ≤ m(v) − 1. Once these indices are communicated to each pair at the leaf level, a left parenthesis is written at the unique location (b + i) of an array, and then it is read by its matching right parenthesis. Communication of the indices requires pipelining; however, the paper describing this algorithm contains few details of this pipelining process.

The algorithm by Tsang et al. [33] is similar to that of Anderson et al. [2]; it also employs n/log n processors and comprises three O(log n)-time phases, of which the first and third are essentially the same as in [2], but its second phase takes a different approach. The first phase builds a binary tree; the second phase finds the mates for the leftmost left and the rightmost right parentheses of each log n-size substring by performing searches on that tree: each processor starts at its leaf and climbs up to a node that "minimally" spans the subsequence containing the mate of the leftmost left parenthesis of its substring, and then climbs down to the appropriate leaf to find the mate. This process is repeated for the rightmost right parenthesis. In the third phase, each processor uses the matching found in the second phase to sequentially scan and match the other parentheses in its substring. A possible reason for this different approach is that Tsang et al. do not precalculate the b(v)'s, because the pipelining process used to assign unique identifiers to the matching pairs is too complicated. Among the advantages of this algorithm are its underlying array data structure to implement the binary matching tree and the ease with which the algorithm can be comprehended at an abstract level.

The Diks and Rytter algorithm [19] also has three phases, and its second phase uses extensive pipelining. A binary tree is built in which each node records the nesting levels of the leftmost left and the rightmost right parentheses in the subsequence spanned by that node. Each matching pair is assigned a unique index, where i is the nesting level of that pair. To distribute these indices, each node v passes a packet containing (b(v), o(v)) down to its two children; an internal node splits the packet into two by dividing the nesting interval appropriately and passing the parts down, and the leaf processors collect all the packets in their arrival order; pipelining ensures an O(log n) time bound.

Besides the packets, the Diks and Rytter algorithm maintains at each node a left list for the unmatched left parentheses and a right list containing the first and last indices of the unmatched right parentheses. At a parent node, the matching portions of the left list of the left child and of the right list of the right child are "cut out" and stored to be matched later; an updated left list is created at the parent, and the right list of the left child is linked with the unmatched portion of the right list of the right child. After the binary tree is constructed, the left list cut-outs stored at each internal node are concatenated (in some order) to obtain a final left list; likewise, a final right list is formed. The matching left and right parentheses in these two final lists are aligned by using a linked list ranking algorithm on each of these lists, and this is followed by the actual matching of pairs; the two-stage matching avoids any concurrent access. The splitting of the matching portions at each parent node must be performed in constant time. For this purpose, a preprocessing stage computes two splitting pairs for each node: one of these pairs is the leftmost left parenthesis of the left subtree together with its matching right parenthesis in the right subtree, and the other is for the rightmost right parenthesis of the right subtree. These splitting pairs are computed by using the tree-navigation technique of the Bar-On and Vishkin algorithm [4], in which a concurrent read is avoided by simply killing the contending process trying to climb up from a right child. Diks and Rytter are, however, unclear about how the splitting pairs are employed to compute each parent node in O(1) time.

The Levcopoulos and Petersson algorithm [27] is unique in not relying on a binary matching tree, although it consists of numerous O(log n)-time stages. First, a parallel prefix algorithm is employed to find the nesting depth of each parenthesis, and those parentheses whose nesting depths are multiples of log n are separated out. These special parentheses partition the original string into several substrings, referred to as blocks; the boundaries of the matching block are easily identified by the two matching parentheses of the two special parentheses that define a block. Each block is reduced through a series of processings, including linked list ranking and parallel sorting. Reduction of each block is carried out by first reducing the constituent subblocks of size log n using a processor each, then reducing a group of log n consecutive subblocks using log n processors, and finally stable sorting those blocks that are large enough to contain more than one group. Reduction of a group requires a pipelined scan by a processor assigned to a subblock through each of the subblocks to its right, to first "book" and then match the right parentheses with the left parentheses of its own subblock. Thereafter, the unmatched parentheses in a reduced block are matched with those in the matching blocks. Each of these steps requires O(log n) time. For the stable sorting, this algorithm has used Cole's optimal sorting algorithm [10], which has a large constant of proportionality. This algorithm has some similarity to our Algorithms III and IV in that it has used intervals to represent reduced substrings.

There also exists an optimal EREW algorithm, due to Kim [24], for a related problem, called the all tallest neighbors (ATN) problem. The ATN problem consists of n vertical line segments S = {s_i = (x_i, y_i) | 1 ≤ i ≤ n}, such that x_i < x_{i+1} for 1 ≤ i ≤ n − 1. The right tallest neighbor (RTN) of the ith segment in set S is the jth segment, i < j, such that y_i < y_j and j is the minimum such integer (denoted by RTN_S(i) = j); if no such right neighbor exists in S, then RTN_S(i) = 0. The left tallest neighbor, LTN_S(i), is analogously defined. The solution of the ATN problem consists of finding the left and the right tallest neighbors of each segment. To solve the parentheses-matching problem using an algorithm for the ATN problem, assign a −1 to each left parenthesis and a 1 to each right parenthesis, and then find all prefix-sums; an instance of the ATN problem is created by replacing the ith parenthesis with the ordered pair (i, I), where I is its prefix sum. Then, for each left parenthesis, its RTN corresponds to its matching right parenthesis. Berkman et al. [5] have solved a related problem, called the all nearest smaller values (ANSV) problem, on the CREW PRAM with the same time and processor complexities, and on the CRCW PRAM in O(log log n) time.

Kim [24] has solved the ATN problem on the EREW PRAM in O(log n) time employing n processors; this algorithm, however, needs parallel sorting. We sketch Kim's O(log n)-time, n-processor algorithm first, and then discuss how it is extended to bring the processor complexity down to n/log n. A √n divide-and-conquer approach is used. The input set is partitioned into √n equal subsets, T_1, ..., T_√n, and the ATN problem is recursively solved on each subset. The list H = {h_i | 1 ≤ i ≤ √n} of the tallest segment h_i of each subset is formed, and the ATN problem is next solved for the set H. Let H_i consist of the segments h in H with LTN_H(h) = h_i, and let R_i (or L_i) be the set of segments in partition T_i such that RTN_{T_i}(s) = 0 (or LTN_{T_i}(s) = 0), sorted by their indices. Let F_i be the set of segments of increasing height taken from L_h for each segment h in H_i, together with the segment RTN_H(h_i). Now, for each s in R_i, RTN_S(s) = t, where t is the shortest segment taller than s and can be obtained from the merged list R_i ∪ F_i. Then |H_i| copies of R_i are made, stable sorted, and a processor is assigned to each such copy; each copy of R_i is merged with an appropriate portion of F_i. A total of √n mergings, one for each partition, can be shown to require O(log n) time using n processors. Calculation of the individual |H_i|'s, and the intricate processor assignments and replications, must be carried out so as to avoid concurrent reads and to ensure logarithmic time for these steps. Now let us suppose that only n/log n processors are available for the ATN problem: the input set is partitioned into n/log n subsets, the ATN problem is sequentially solved for each subset, and the reduced problem is then finished in O(log n) time with n/log n processors using the previous algorithm.

In comparison with these algorithms, our algorithms consist of relatively fewer steps, each implementable in a straightforward fashion, and do not employ pipelining or other involved techniques. Furthermore, whereas the Diks and Rytter algorithm uses linked lists as the underlying data structure and employs a linked list ranking algorithm, our algorithms use only arrays.
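The ATN-based reduction sketched above can be illustrated sequentially: the ±1 prefix sums give each position a "height" (we keep the text's convention of −1 for a left parenthesis), and each left parenthesis's nearest strictly taller right neighbor is its mate. The stack sweep below is the standard sequential analogue of the parallel ATN/ANSV computations; the function names are ours:

```python
def heights(s):
    """Assign -1 to each '(' and +1 to each ')' and take prefix sums."""
    h, run = [], 0
    for ch in s:
        run += -1 if ch == '(' else 1
        h.append(run)
    return h

def right_tallest_neighbors(h):
    """RTN(i) = smallest j > i with h[j] > h[i], or None if no such j.
    Classic O(n) stack sweep; the parallel algorithms discussed in the
    text solve the same problem in O(log n) time."""
    rtn = [None] * len(h)
    stack = []                      # indices awaiting a taller neighbor
    for j in range(len(h)):
        while stack and h[stack[-1]] < h[j]:
            rtn[stack.pop()] = j    # h[j] is the first strictly taller value
        stack.append(j)
    return rtn

def match_via_atn(s):
    """For each '(', its RTN is its matching ')'."""
    rtn = right_tallest_neighbors(heights(s))
    return {i: rtn[i] for i, ch in enumerate(s) if ch == '('}
```

Inside a matched pair the height never exceeds the height recorded at the opening parenthesis, and the closing parenthesis is the first position where it does, which is why the nearest strictly taller right neighbor is exactly the mate.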

VII. CONCLUSION

We have presented four polylogarithmic-time parallel algorithms for the parentheses-matching problem on the EREW PRAM model, in the order of successive improvements in their resource (i.e., time, processor, and working space) requirements. Algorithm I runs in O(log² n) time, whereas Algorithm II requires O(log n) time with O(n) processors; Algorithms I and II are not cost-optimal, although there is a trade-off between their time and processor requirements, and both are very easy to implement. Algorithms III and IV are both cost-optimal and run in O(log n) time, employing O(n/log n) processors. Except for Algorithm III, all of the algorithms make use of optimal, i.e., O(n), working space; Algorithm IV is the most sophisticated. The continued innovation of new algorithms for the parentheses-matching problem thus points to the richness in the structure of this important problem.

Other than its natural applications in parsing, parentheses-matching also arises in processor scheduling and task-partitioning schemes in a parallel environment. For example, Anderson et al. [2] have employed a parentheses-matching algorithm to design parallel approximation algorithms for bin packing. Using parentheses-matching as a subalgorithm, we have recently designed cost-optimal parallel algorithms for breadth-first traversal of general trees and for sorting a special class of integers [7], and for problems on interval graphs, such as minimum coloring [13].

We point out that the set of balanced strings of parentheses is a context-free language, referred to as the one-sided Dyck language over one letter. In general, a one-sided Dyck language over k letters, for k ≥ 1, is the set of balanced strings of k different kinds of parentheses [22]; for example, ([{}]) is a balanced string over three different kinds of parentheses. The algorithms presented here can easily be extended to strings of parentheses of more than one kind: before matching a pair of left and right parentheses, an additional check is needed to ensure that the pair contains parentheses of the same kind. Ibarra et al. [23] have shown that one-sided Dyck languages over k letters can be recognized in O(log n) time using a polynomial number of processors. In addition, they show how their algorithm can be employed to determine whether a string of parentheses of length O(n log n) is balanced in O(log n) time using n processors; the required processor-time product is O(n log n), as compared to the O(n) operations for a sequential algorithm.

Mayr and Werchner [29] have presented an algorithm to match n parentheses in O(log n) time using n processors on the hypercube model, which is a distributed-memory, message-passing parallel machine. Future work can be aimed at designing cost-optimal parallel algorithms for parentheses-matching on hypercube architectures. Such a result will imply, in light of our results in [7], that several other problems can possibly be solved cost-optimally on hypercube computers. Currently, we are investigating other classes of problems that can be efficiently solved in parallel.

ACKNOWLEDGMENT

We would like to thank S. Sahni and the anonymous referees for suggesting the inclusion of some important references.

REFERENCES

[1] R. op den Akker, H. Alblas, A. Nijholt, and P. Oude Luttighuis, "An annotated bibliography on parallel parsing," Memoranda Informatica 89-67, Univ. of Twente, the Netherlands, 1989.
[2] R. Anderson, E. Mayr, and M. Warmuth, "Parallel approximation algorithms for bin packing," Inform. and Computation, vol. 82, pp. 262-277, 1989.
[3] F. Baccelli and T. Fleury, "On parsing arithmetic expressions in a multiprocessing environment," Acta Informatica, vol. 17, pp. 267-310, 1982.
[4] I. Bar-On and U. Vishkin, "Optimal parallel generation of a computation tree form," ACM Trans. Programming Lang. Syst., vol. 7, pp. 348-357, 1985.
[5] O. Berkman, B. Schieber, and U. Vishkin, "Some doubly logarithmic optimal parallel algorithms based on finding all nearest smaller values," Tech. Rep. UMIACS-TR-88-79 (CS-TR-2133), Univ. of Maryland, College Park, 1988.
[6] N. G. de Bruijn, D. E. Knuth, and S. O. Rice, "The average height of planted plane trees," in Graph Theory and Computing, R. C. Read, Ed. Orlando, FL: Academic Press, 1972, pp. 15-22.
[7] C. C.-Y. Chen and S. K. Das, "Breadth-first traversal of trees and integer sorting in parallel," Inform. Processing Lett., vol. 41, pp. 39-49, Feb. 1992.
[8] C. C.-Y. Chen and S. K. Das, "Two EREW algorithms for parentheses matching," in Proc. 5th Int. Parallel Processing Symp., 1991, pp. 126-131.
[9] C. C.-Y. Chen and S. K. Das, "A cost-optimal parallel algorithm for the parentheses matching problem on an EREW PRAM," Tech. Rep. CRPDC-91-5, Univ. of North Texas, Denton, 1991.
[10] R. Cole, "Parallel merge sort," SIAM J. Comput., vol. 17, pp. 770-785, Aug. 1988.
[11] R. Cole and U. Vishkin, "Deterministic coin tossing with applications to optimal parallel list ranking," Inform. Control, vol. 70, pp. 32-53, 1986.
[12] R. Cole and U. Vishkin, "Approximate parallel scheduling. Part I: The basic techniques with applications to optimal parallel list ranking in logarithmic time," SIAM J. Comput., vol. 17, pp. 128-142, 1988.
[13] S. K. Das and C. C.-Y. Chen, "Efficient parallel algorithms on interval graphs," in Proc. Int. Conf. Parallel Architectures and Languages Europe (PARLE '92), Paris, France, Lecture Notes in Computer Science 605. Berlin, Germany: Springer-Verlag, June 1992, pp. 131-143.
[14] S. K. Das, "Efficient parallel algorithms and data structures related to trees," Ph.D. dissertation, Univ. of Central Florida, Orlando, 1988.
[15] S. K. Das, N. Deo, and S. Prasad, "A divide-and-conquer algorithm for parentheses matching," Tech. Rep. CS-TR-89-18, Univ. of Central Florida, Orlando, 1989.
[16] E. Dekel and S. Sahni, "Parallel generation of post-fix and tree forms," ACM Trans. Programming Lang. Syst., vol. 5, 1983.
[17] N. Deo and S. Prasad, "Some fast parallel algorithms for parentheses matching," Tech. Rep., Univ. of Central Florida, Orlando.
[18] N. Deo and S. Prasad, "Parallel heap: An optimal parallel priority queue," in Proc. Int. Conf. Supercomputing, 1990.
[19] K. Diks and W. Rytter, "On optimal parallel computations for sequences of brackets," Theoretical Comput. Sci., vol. 87, pp. 251-262, 1991.
[20] A. Gibbons and W. Rytter, Efficient Parallel Algorithms. Cambridge, UK: Cambridge Univ. Press, 1988.
[21] T. Hagerup and H. Shen, "Improved nonconstructive sequential and parallel integer sorting," Inform. Processing Lett., vol. 36, pp. 57-63, 1990.
[22] M. Harrison, Introduction to Formal Language Theory. Reading, MA: Addison-Wesley, 1978.
[23] O. H. Ibarra, T. Jiang, and B. Ravikumar, "Some subclasses of context-free languages in NC1," Inform. Processing Lett., vol. 29, pp. 111-118, 1988.

REFERENCES

[2] R. J. Anderson, E. W. Mayr, and M. K. Warmuth, "Parallel approximation algorithms for bin packing," Inform. and Computation, vol. 82, pp. 262-277, 1989.
[7] C. C.-Y. Chen and S. K. Das, "Two EREW algorithms for parentheses matching," Tech. Rep., Dept. of Computer Sciences, Univ. of North Texas, Denton, TX, 1990.
[13] K. Diks and W. Rytter, "On optimal parallel computations for sequences of brackets," Theoret. Comput. Sci., vol. 87, pp. 251-262, 1991.
[21] T. Hagerup and H. Shen, "Improved nonconstructive sequential and parallel integer sorting," Inform. Processing Lett., vol. 36, 1990.
[24] R. E. Ladner and M. J. Fischer, "Parallel prefix computation," J. ACM, vol. 27, pp. 831-838, 1980.
[27] C. Levcopoulos and O. Petersson, "Matching parentheses in parallel," Discrete Applied Math., vol. 40, pp. 423-431, 1992.
[29] E. W. Mayr and R. Werchner, "Optimal routing of parentheses on the hypercube," in Proc. 4th Ann. ACM Symp. on Parallel Algorithms and Architectures, 1992.
[30] S. Rajasekaran and S. Sen, "On parallel integer sorting," ACTA Informatica, vol. 29, pp. 1-15, 1992.
[31] D. Sarkar and N. Deo, "Parallel algorithms for parentheses matching and generation of random balanced sequences of parentheses," in Proc. 1st Int. Conf. Supercomputing (Lecture Notes in Computer Science 297), Athens, Greece. Berlin, Germany: Springer-Verlag, 1988.
[33] W. W. Tsang, T. W. Lam, and F. Y. Chin, "An optimal EREW parallel algorithm for parentheses matching," in Proc. Int. Conf. Parallel Processing, 1989.

Sushil K. Prasad (S'86-M'91) received the B.Tech. degree in computer science and engineering from the Indian Institute of Technology, Kharagpur, India, in 1985, the M.S. degree in computer science from Washington State University, Pullman, in 1986, and the Ph.D. degree in computer science from the University of Central Florida, Orlando, in 1990.
Since 1990, he has been an Assistant Professor at Georgia State University, Atlanta, GA. His current research interests are parallel algorithms and data structures, parallel discrete event simulation, and multiprocessor interconnection networks.
Dr. Prasad has published more than 20 papers in journals and refereed proceedings in his areas of research interest and has presented his research work at many international and national conferences. He is a member of the IEEE Computer Society.

Sajal K. Das (M'94) received the B.S. degree in computer science from Calcutta University, India, in 1983, the M.S. degree in computer science from the Indian Institute of Science, Bangalore, in 1984, and the Ph.D. degree in computer science from the University of Central Florida, Orlando, in 1988.
From 1988 to 1993, he was an Assistant Professor in the Department of Computer Science, University of North Texas, Denton, where he is currently a tenured Associate Professor and a faculty member of the Center for Research in Parallel and Distributed Computing. In 1993, he received a scholarship to visit the Leonardo Fibonacci Institute in Trento, Italy. His current research interests include design and analysis of parallel algorithms, parallel graph algorithms, applied graph theory and combinatorics, interconnection networks, and performance evaluation. He has published more than 40 journal papers in his areas of research interest and serves on the Editorial Board of Parallel Processing Letters and Journal of Parallel Algorithms and Applications. He has been a member of program committees for several conferences.
Dr. Das is a recipient of the Cambridge Nehru Scholarship in 1986 and an Honor Professor Award at the University of North Texas in 1991. He is a member of ACM, the IEEE Computer Society, the New York Academy of Sciences, and Sigma Xi.

Calvin C.-Y. Chen received the M.S. degree in computer science from the University of Central Florida, Orlando, in 1988, and the Ph.D. degree in computer science from the University of North Texas, Denton, TX, in 1992.
He was previously with Xerox Corp. and IBM Corp., working in the areas of operating system support and performance evaluation. He is currently an Associate Professor in the Department of Computer Science and Information Engineering, Tamkang University, Taiwan, Republic of China. His current research interests include operating system supports for multimedia applications, neural networks, very large scale integration (VLSI) layout algorithms, parallel compilation, distributed and parallel systems, and computer interconnection networks.
Dr. Chen is a member of ACM and the IEEE Computer Society.
