
Bidirectional Parsing for Natural Language Processing

Ştefan ANDREI
Faculty of Informatics, “Al.I.Cuza” University
Str. Berthelot, nr. 16, 6600, Iaşi, România.
E-mail: stefan@infoiasi.ro

Abstract

The paper emphasizes some subclasses of context free grammars for which there exists a parallel approach useful for solving the membership problem. We combine the classical style of LR parsers attached to a grammar G with a "mirror" process for G. The input word will be analysed from both sides using two processors. We present the general bidirectional parser (for any context free language) using a nondeterministic device. A general MIMD model for describing the bidirectional parsing is presented. Our general bidirectional parser can also be used as a deterministic model for the known LR(k) and RL(k) parsers. Accordingly, the membership problem may be solved in linear time complexity with a parallel algorithm. Finally, we present an example of a context free grammar useful for describing syntactic dependencies for the English language.

1 Introduction

In recent years, recognition algorithms, both sequential and parallel, for context free grammars (with applications to parsing programming languages) or for non context free languages (with applications to computational linguistics) have been the subject of study by many computer scientists (Aho and Ullman, 1972; Kulkarni and Shankar, 1996; Rus and Jones, 1998; Satta and Stock, 1994; Sikkel, 1998). In (Satta and Stock, 1994), a formal framework for bidirectional tabular parsing of general context free languages was described, and some applications to natural language processing were studied. Moreover, an algorithm for head-driven parsing and a general algorithm for island-driven parsing were studied.

The main purpose of this paper is to provide parallel parsing for (subclasses of) context free grammars using two processors. Similar ideas concerning parsing of the input word from both sides were presented in (Tseytlin and Yushchenko, 1977) and (Loka, 1984). However, the mentioned papers do not effectively construct any parsers and do not mention any parallel model of computation for such parsing.

In this paper, we define a new parser for the class of context free languages. The input word is analysed from both sides, but the parse strategy is in the up-to-up sense (LR and RL styles are combined for this parallel strategy). Some of the results from this paper were reported previously in (Andrei and Kudlek, 1999; Andrei, Grigoraş and Kudlek, 1999).

The associated derivation tree corresponding to the input word w is traversed down to up by both processors, i.e. from the leaves to its root. Only one processor will be strongly active after the parallel algorithm "meets" the "middle" of the input word.

[Figure 1. Up-to-up bidirectional parsing: the derivation tree consists of two subtree forests ST1 and ST2 under a common upper subtree ST3.]

The "derivation forests" ST1 and ST2 will be parsed in parallel, and, finally, the subtree ST3 will be parsed in a sequential way.

The second section contains the definition of the general up-to-up bidirectional parser. The correctness of this parser is proven as the main result of this section (Theorem 2.1). The third section presents a very convenient parallel approach for describing the general up-to-up bidirectional parsing strategy. The chosen model is a MIMD computer. Two processors P1 and P2 analyse the input word from both sides. Theorems 3.1 and 3.2 ensure the finiteness and correctness of our parallel algorithm. The next section points out the definition of some new subclasses of context free grammars. We called them LR-RL grammars. These grammars can be viewed as combinations between classical LR grammars and the mirror ones. The fifth section presents an example of a simple context free grammar useful for describing syntactic dependencies for the English language. Some open problems are emphasized in the Conclusions.

2 The General Bidirectional Parser

We suppose the reader familiar with notions such as context free grammars and words (Aho and Ullman, 1972; Salomaa, 1973). Let α = α1 α2 ... αk be a word over V. Then

(m)α = α1 α2 ... αm, if m ≤ k, and (m)α = α, otherwise;
α(n) = α_{k-n+1} α_{k-n+2} ... α_k, if n ≤ k, and α(n) = α, otherwise.

Notation 2.1 Let G = (VN, VT, S, P) be a context free grammar and V' be the set VN × N × N. Let h : V ∪ V' → V be a function given by h(X) = X, for all X ∈ V, and h(X_{b,e}) = X, for all X_{b,e} ∈ V'. This function can easily be extended to a homomorphism h : (V ∪ V')* → V* such that h(ε) = ε and h(X1 X2 ... Xn) = h(X1) · h(X2) · ... · h(Xn), for all X1, X2, ..., Xn ∈ V ∪ V', for all n ≥ 2 (· being the catenation operation).

The occurrence of A_{b,e} signifies that A is the label of the root v of a derivation subtree in the forests ST1 or ST2 (Figure 1), and the frontier has the corresponding rightmost derivation [r_b, r_{b+1}, ..., r_e].

Definition 2.1 Let G = (VN, VT, S, P) be a context free grammar. Let

C ⊆ {s1, s2} × {1, 2, ..., |P|}* × #(V ∪ V')* × #VT*# × (V ∪ V')*# × {1, 2, ..., |P|}* × {1, 2, ..., |P|}* ∪ {ACC, REJ}

be the set of all possible configurations, where # is a special character (a new terminal symbol). The general up-to-up bidirectional parser (denoted by Gu-BP(G)) is the pair (C0, ⊢), where C0 = {(s1, ε, #, #w#, #, ε, ε) | w ∈ VT*} ⊆ C is called the set of initial configurations. The first component is the state; the second and the last but one components of a configuration are for storing the partial syntactic analysis. The last component is for storing the final syntactic analysis. The third and the fifth components are the work stacks. The fourth component represents the current content of the input word. The transition relation (⊢ ⊆ C × C, sometimes denoted by ⊢_{Gu-BP(G)}) between configurations is given by:

1' Shift-Shift: (s1, π1, #α, #a u b#, β#, π2, ε) ⊢ (s1, π1, #α a, #u#, b β#, π2, ε);

2' Reduce-Shift: (s1, π1, #α γ, #u b#, β#, π2, ε) ⊢ (s1, r1 π1, #α A_{b',e'}, #u#, b β#, π2, ε), where r1 = A → h(γ) ∈ P, e' = |π1| + 1, b' = min{|π1| + 1, min{b'' | C_{b'',e''} ∈ γ}};

3' Shift-Reduce: (s1, π1, #α, #a u#, γ β#, π2' π2'', ε) ⊢ (s1, π1, #α a, #u#, B_{b,e} β#, π2' r2 π2'', ε), where r2 = B → h(γ) ∈ P, e = |π2' π2''| + 1, b = min{|π2' π2''| + 1, min{b' | D_{b',e'} ∈ γ}};

4' Reduce-Reduce: (s1, π1, #α γ, #u#, δ β#, π2' π2'', ε) ⊢ (s1, r1 π1, #α A_{b1,e1}, #u#, B_{b2,e2} β#, π2' r2 π2'', ε), where r1 = A → h(γ) ∈ P, r2 = B → h(δ) ∈ P, e1 = |π1| + 1, e2 = |π2' π2''| + 1, b1 = min{|π1| + 1, min{b1' | C_{b1',e1'} ∈ γ}} and b2 = min{|π2' π2''| + 1, min{b2' | D_{b2',e2'} ∈ δ}};

5' Shift-Stay: (s1, π1, #α, #a u#, β#, π2, ε) ⊢ (s1, π1, #α a, #u#, β#, π2, ε);

6' Reduce-Stay: (s1, π1, #α γ, #u#, β#, π2, ε) ⊢ (s1, r1 π1, #α A_{b',e'}, #u#, β#, π2, ε), where r1 = A → h(γ) ∈ P, e' = |π1| + 1, b' = min{|π1| + 1, min{b'' | C_{b'',e''} ∈ γ}};

7' Stay-Shift: (s1, π1, #α, #u b#, β#, π2, ε) ⊢ (s1, π1, #α, #u#, b β#, π2, ε);

8' Stay-Reduce: (s1, π1, #α, #u#, γ β#, π2' π2'', ε) ⊢ (s1, π1, #α, #u#, B_{b,e} β#, π2' r2 π2'', ε), where r2 = B → h(γ) ∈ P, e = |π2' π2''| + 1, b = min{|π2' π2''| + 1, min{b' | D_{b',e'} ∈ γ}};
9' Possible-accept: (s1, π1, #α, ##, β#, π2, ε) ⊢ (s2, π1, #α, ##, β#, π2, ε);

10' Shift-Terminal: (s2, π1, #α, ##, a β#, π2, π3) ⊢ (s2, π1, #α a, ##, β#, π2, π3);

11' Shift-Nonterminal: (s2, π1, #α, ##, A_{b,e} β#, π2' π2'', π3) ⊢ (s2, π1, #α A, ##, β#, π2'', π2' π3), where π2' = [r_b, r_{b+1}, ..., r_e] and |π2'| = e - b + 1;

12' Reduce: (s2, π1' π1'', #α γ, ##, β#, π2, π3) ⊢ (s2, π1'', #α A, ##, β#, π2, r1 π3 π1'), where r1 = A → h(γ) ∈ P, γ = u1 B_{b,e} ... um C_{b',e'}, u1, ..., um ∈ VT*, |π1'| = e' - b + 1;

13' Accept: (s2, ε, #S, ##, #, ε, π) ⊢ ACC;

14' Reject: (s1, π1, #α, #u#, β#, π2, ε) ⊢ REJ and (s2, π1, #α, ##, β#, π2, π3) ⊢ REJ if no transitions of type 1', 2', ..., 13' can be applied.

Our model is a two-stack machine (Hopcroft and Ullman, 1979). The differences consist in the existence of two heads (instead of only one) which may read the input tape, and of two output tapes which can be accessed only in write style; moreover, the model has only two states. Therefore, our model can simulate a Turing machine.

Theorem 2.1 (Andrei, Grigoraş and Kudlek, 1999) Let G be a context free grammar, S being its start symbol. Then

(s1, ε, #, #w#, #, ε, ε) ⊢*_{Gu-BP(G)} (s2, ε, #S, ##, #, ε, π) ⊢^{13'}_{Gu-BP(G)} ACC iff S ⇒*_{G,rm} w.

3 Parallel Approach for General Bidirectional Parsing

In this section, we present a very convenient parallel approach for describing the general up-to-up bidirectional parsing strategy. Our model is a MIMD (multiple instruction stream and multiple data stream) computer. We consider two processors P1 and P2 which operate asynchronously and share a common memory (Akl, 1997).

In the following, we shall present a parallel algorithm which describes the general up-to-up bidirectional parsing strategy. In fact, our parallel algorithm (denoted by (PAR-UUBP)) clearly follows Definition 2.1 of the general up-to-up bidirectional parser. Our algorithm uses the following variables:

- w ∈ VT* the input word, n = |w|;
- i1, i2 two counters for the current positions in w;
- accept a boolean variable which takes the true value iff w ∈ L(G);
- Stack1, Stack2 two working stacks for P1 and P2;
- Output_tape1, Output_tape2 the output tapes of P1 and P2 for storing the partial syntactic analysis;
- Output_tape the output tape for storing the global syntactic analysis;
- exit a boolean variable which is true iff P1 or P2 detects the non-acceptance of w.

The variables w, i1, i2, accept, Output_tape, exit are stored in the common memory. We use some predefined procedures, such as:

- pop(Stack, γ) - the value of γ will be the string of length |γ| starting from the first symbol of Stack; after that, the string is removed from Stack;
- push_first(Stack, A) - add to the content of Stack the symbol A; A will be the new top of Stack;
- push_last(Stack, γ) - add to the content of Stack, starting from the last symbol of Stack, the string γ; Stack will have the same top.

Now, the method of the parallel algorithm (PAR-UUBP) can be pointed out (we suppose that the context free grammar G = (VN, VT, S, P) is already read):

begin
  read(n); read(w); i1 := 1; i2 := n;
  accept := false; exit := false;
  repeat in parallel
    action1(P1); action2(P2)
  until (i1 >= i2) or (exit = true);
  if (exit = true) then
    accept := false
  else
    repeat
      action3(P1,P2)
    until (exit = true);
  if (accept = true) then
    write('w is accepted and has the right syntactical analysis', Output_tape)
  else
    write('w is not accepted');
end.
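The control structure of (PAR-UUBP) can be sketched in Python. This is a minimal sketch, not the paper's implementation: action1 and action2 are reduced to pure shift stubs (no reduce actions, no backtracking), and the asynchronous "repeat in parallel" loop of the MIMD model is simulated by a round-robin interleaving; all function and variable names are illustrative.

```python
def parallel_shift_skeleton(w):
    """Control skeleton of (PAR-UUBP): two pointers consume the input
    word from both ends until they meet in the middle.  Here action1
    and action2 only shift; in the full algorithm they also reduce,
    and action3 then finishes the parse sequentially."""
    n = len(w)
    i1, i2 = 1, n                    # 1-based counters, as in the paper
    stack1, stack2 = [], []          # top of stack = end of list
    while i1 < i2:                   # 'repeat ... until (i1 >= i2)'
        if i1 <= i2:                 # action1: shift from the left end
            stack1.append(w[i1 - 1])
            i1 += 1
        if i1 < i2:                  # action2: shift from the right end
            stack2.append(w[i2 - 1])
            i2 -= 1
    return stack1, stack2, i1, i2

w = "the big brown dog with fleas watched the birds beside the hunter"
s1, s2, i1, i2 = parallel_shift_skeleton(w.split())
# the two pointers meet at i1 = i2 = 7, as in the trace of Section 5
```

On the example of Section 5 the stubs shift "the big brown dog with fleas" on the left and "hunter, the, beside, birds, the" on the right; the remaining middle symbol is consumed during the second phase of the full algorithm.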
It remains to describe the procedures action1(P1), action2(P2) and action3(P1,P2).

procedure action1(P1);
begin
  case
    if (∃ r1 = A → h(γ) ∈ P, γ is in Stack1 starting from the top)
      then begin
        /* reduce action */
        let γ := u1 C_{b,e} ... um D_{b',e'}, where u1, ..., um ∈ VT*;
        e'' := |Output_tape1| + 1;
        if (γ does not contain any symbol from V')
          then b'' := e''
          else b'' := min{e'', b};  /* b = min of the first indices of the V'-symbols of γ */
        pop(Stack1, γ);
        push_first(Stack1, A_{b'',e''});
        push_first(Output_tape1, r1);
      end;
    if (i1 <= i2) then begin
      /* shift action */
      push_first(Stack1, w[i1]);
      i1 := i1+1;
    end
    otherwise: begin
      /* backtrack is needed */
      if (all the backtrack steps are over) and (still no reduce or shift action could be made)
        then exit := true;
    end
  end;

The description of procedure action2(P2) is very similar to action1(P1).

procedure action2(P2);
begin
  case
    if (∃ r2 = B → h(γ) ∈ P, γ is in Stack2 starting from the top)
      then begin
        /* reduce action */
        let γ := u1 C_{b,e} ... um D_{b',e'}, where u1, ..., um ∈ VT*;
        e'' := |Output_tape2| + 1;
        if (γ does not contain any symbol from V')
          then b'' := e''
          else b'' := min{e'', b};
        pop(Stack2, γ);
        push_first(Stack2, B_{b'',e''});
        push_first(Output_tape2, r2);
      end;
    if (i1 < i2) then begin
      /* shift action */
      push_first(Stack2, w[i2]);
      i2 := i2-1;
    end
    otherwise: begin
      /* backtrack is needed */
      if (all the backtrack steps are over) and (still no reduce or shift action could be made)
        then exit := true;
    end
  end;

Finally, we describe the procedure action3(P1,P2) in a sequential way. The goal is to simulate the transitions 10', 11', 12' and 13' of Gu-BP(G) from Definition 2.1. The input tape is now empty, i.e. w has already been read (of course, if exit has the value false), but we read symbols from Stack2 (sent by processor P2), modifying the content of Output_tape1 and Output_tape2 and putting the results in Output_tape.

procedure action3(P1,P2);
begin
  case
    if (∃ r1 = A → h(γ) ∈ P, γ is in Stack1 starting from the top)
      then begin
        /* reduce action */
        let γ = u1 C_{b,e} ... um D_{b',e'}, where u1, ..., um ∈ VT*;
        let π1' from Output_tape1 such that |π1'| = e' - b + 1;
        pop(Output_tape1, π1');
        pop(Stack1, γ);
        push_first(Stack1, A);
        push_first(Output_tape, r1);
        push_last(Output_tape, π1');
      end;
    if (top of Stack2 is a terminal symbol)
      then begin
        /* shift-terminal action */
        pop(Stack2, a), where a ∈ VT;
        push_first(Stack1, a);
      end;
    if (top of Stack2 is from V')
      then begin
        /* shift-nonterminal action */
        pop(Stack2, A_{b,e});
        push_first(Stack1, A);
        let π2' from Output_tape2 such that |π2'| = e - b + 1;
        pop(Output_tape2, π2');
        push_first(Output_tape, π2');
      end;
    if (Output_tape1 = ∅) and (Output_tape2 = ∅) and (Stack1 = S) and (Stack2 = ∅)
      then begin
        accept := true; exit := true
      end;
    otherwise: begin
      /* backtrack is needed */
      if (all the backtrack steps are over) and (still no reduce or shift action could be made)
        then exit := true
    end
  end;

Theorem 3.1 (Andrei, Grigoraş and Kudlek, 1999 - finiteness of Algorithm (PAR-UUBP)) The Algorithm (PAR-UUBP) performs a finite number of steps until it terminates its execution.

Theorem 3.2 (Andrei, Grigoraş and Kudlek, 1999 - correctness of Algorithm (PAR-UUBP)) For a given context free grammar G = (VN, VT, S, P) and w ∈ VT* as its input, Algorithm (PAR-UUBP) gives the answer "w is accepted and has the right syntactical analysis π" if S ⇒*_{G,rm} w, and "w is not accepted." otherwise.

4 Deterministic Bidirectional Parsing for LR-RL Grammars

Deterministic (and linear) parallel algorithms (as particular cases of the general up-to-up bidirectional parsing algorithm) for solving the membership problem can be derived for some "combinations" of subclasses of context free languages. The deterministic up-to-up bidirectional parser has the same device as the general model. The only difference is the uniqueness of choosing the production r from the set of the given productions of the input grammar (i.e. no backtrack step is needed). We use the definitions of LR(k) grammars (Knuth, 1965).

Definition 4.1 Let G be a context free grammar and k be a natural number. We say that G is RL(k) if the mirror grammar G̃ is a LR(k) grammar. A language L is RL(k) if there exists a RL(k) grammar which generates L.

In a similar way to Definition 4.1, we can define the "mirror" of classical subclasses of LR(k) grammars, very useful in compiler techniques. Now, we combine the LR(k) and RL(k) styles for obtaining the deterministic up-to-up bidirectional parsing for context free languages.

Definition 4.2 Let G be a context free grammar and k1, k2 ∈ N. We say that G is a LR(k1)-RL(k2) grammar iff G is both a LR(k1) and a RL(k2) grammar. A language L is called LR(k1)-RL(k2) if there exists a LR(k1)-RL(k2) grammar G for which L = L(G).

We can remark that G is a LR(k1)-RL(k2) grammar iff G is a RL(k2)-LR(k1) grammar. Of course, because the class of LR(k) languages, for k ≥ 1, equals the class of LR(1) languages (Knuth, 1965), the above definition has practical interest for k1, k2 ∈ {0, 1}.
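Definition 4.1 refers to the mirror grammar G̃. Assuming the standard construction (G̃ keeps the nonterminals and terminals of G and reverses the right-hand side of every production, so that G̃ generates exactly the reversals of the words of L(G)), the construction can be sketched as follows; the representation of productions as (lhs, rhs) pairs is ours.

```python
def mirror(productions):
    """Mirror grammar: reverse the right-hand side of every production.
    If G generates a word w, the mirrored grammar generates w reversed."""
    return [(lhs, tuple(reversed(rhs))) for lhs, rhs in productions]

# a small example using productions 6 and 1 of Section 5
g = [("NP", ("D", "N")), ("S", ("NP", "VP"))]
g_mirror = mirror(g)   # NP -> N D,  S -> VP NP
# mirroring is an involution: applying it twice gives back G
```

This is the sense in which the RL(k) parser of the paper is the LR(k) parser of G̃ run on the reversed input.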
Particularly, the deterministic up-to-up bidirectional parsing is similar to the general parallel approach, the only difference being the lack of backtrack steps. The procedures action1(P1) and action2(P2) are related to classical sequential syntactical analysis algorithms for LR(1) (or the subclasses LR(0), SLR(1) and LALR(1)) and RL(1). We also get a linear running time for the procedures action1(P1) and action2(P2), instead of an exponential sequential running time (backtrack steps are no longer required).

Obviously, the procedure action3(P1,P2) has no backtrack steps for the subclasses of grammars given in Definition 4.2. Consequently, the procedure action3(P1,P2) is deterministic and has a linear running time.

The correctness of the deterministic parallel algorithms is ensured by the correctness of the general parallel algorithm and the correctness of each of the sequential syntactic analysers for the mentioned subclasses of context free grammars (such as LR(1), RL(1)).

Theorem 4.1 (Andrei, Grigoraş and Kudlek, 1999 - the complexity of the deterministic parallel algorithms) Let us denote by T1(n), T2(n) and T3(n) the running time of the sequential procedures action1(P1), action2(P2) and action3(P1,P2), where n is the length of the input word. Supposing that the routing time is zero, the parallel running time t(n) satisfies the relations: min{T1(n), T2(n)} + T3(n) ≤ t(n) ≤ max{T1(n), T2(n)} + T3(n), and t(n) ∈ O(n).

5 A Natural Language Example

We present a modified context free grammar similar to the one presented in (Sag and Wasow, 1997) for describing syntactic dependencies for the English language.

The term 'grammatical category' covers the parts of speech and types of phrases, such as noun phrase and prepositional phrase. For convenience, we will abbreviate them, so that 'NOUN' becomes 'N', 'NOUN PHRASE' becomes 'NP', 'DETERMINER' becomes 'D', etc. Let us consider the following context free grammar G = ({S, NP, VP, D, A, N, PP, V, P}, VT, S, Prod), where VT ⊆ Σ*, Σ being the English alphabet and Prod being the following set of productions:

1: S → NP VP          7: NP → N
2: NP → D A A N PP    8: VP → V NP PP
3: NP → A A N PP      9: VP → V NP
4: NP → D A N PP     10: VP → V PP
5: NP → A N PP       11: VP → V
6: NP → D N          12: PP → P NP

13: D → the          21: N → hunter
14: D → some         22: V → attack
15: A → big          23: V → ate
16: A → brown        24: V → watched
17: A → old          25: P → for
18: N → birds        26: P → beside
19: N → fleas        27: P → with
20: N → dog

In (Sag and Wasow, 1997), the productions 1, ..., 12 are called rules, and 13, ..., 27 are called the lexicon.

Using a JAVA implementation of our bidirectional parsing algorithm, we determine the viable prefix automaton for grammar G, which has 88 states corresponding to LR(1) items with no conflicts (reduce-reduce, reduce-shift), so G is a LR(1) grammar. Moreover, we construct the viable prefix automaton for the mirror grammar G̃, which has 70 states corresponding to RL(1) items with no conflicts (reduce-reduce, reduce-shift), so G̃ is a RL(1) grammar. According to Definition 4.2, it follows that grammar G is a LR(1)-RL(1) grammar. We consider now the input word:

w = the big brown dog with fleas watched the birds beside the hunter

We shall simulate a possible execution of the bidirectional parsing algorithm for this example (processors P1 and P2 work synchronously). We have n = |w|, and i1 and i2 two pointers between 1 and 12. We shall point out the iterations of the procedures action1(P1) and action2(P2). The execution of these procedures will finish when i1 > i2.
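Before tracing the execution, the lexical productions 13-27 can be encoded as a small dictionary mapping each word to its category; this encoding is only an illustration (the names are ours), but it shows the category string that the two processors build bottom-up for w.

```python
# productions 13-27 of the grammar G (the lexicon)
LEXICON = {
    "the": "D", "some": "D",
    "big": "A", "brown": "A", "old": "A",
    "birds": "N", "fleas": "N", "dog": "N", "hunter": "N",
    "attack": "V", "ate": "V", "watched": "V",
    "for": "P", "beside": "P", "with": "P",
}

def categories(sentence):
    """Map every word of the input to its grammatical category."""
    return [LEXICON[word] for word in sentence.split()]

w = "the big brown dog with fleas watched the birds beside the hunter"
# categories(w) -> ['D', 'A', 'A', 'N', 'P', 'N', 'V', 'D', 'N', 'P', 'D', 'N']
```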
Action: Initial
  P1: Stack1 = [ ], i1 = 1, Output_tape1 = [ ]
  P2: Stack2 = [ ], i2 = 12, Output_tape2 = [ ]

Action: Shift (P1), Shift (P2)
  P1: Stack1 = [the], i1 = 2, Output_tape1 = [ ]
  P2: Stack2 = [hunter], i2 = 11, Output_tape2 = [ ]

Action: Reduce (P1), Reduce (P2)
  P1: Stack1 = [D_{1,1}], i1 = 2, Output_tape1 = [13]
  P2: Stack2 = [N_{1,1}], i2 = 11, Output_tape2 = [21]

Action: Shift (P1), Shift (P2)
  P1: Stack1 = [D_{1,1}, big], i1 = 3, Output_tape1 = [13]
  P2: Stack2 = [the, N_{1,1}], i2 = 10, Output_tape2 = [21]

Action: Reduce (P1), Reduce (P2)
  P1: Stack1 = [D_{1,1}, A_{2,2}], i1 = 3, Output_tape1 = [15, 13]
  P2: Stack2 = [D_{2,2}, N_{1,1}], i2 = 10, Output_tape2 = [21, 13]

Action: Shift (P1), Reduce (P2)
  P1: Stack1 = [D_{1,1}, A_{2,2}, brown], i1 = 4, Output_tape1 = [15, 13]
  P2: Stack2 = [NP_{1,3}], i2 = 10, Output_tape2 = [6, 21, 13]

Action: Reduce (P1), Shift (P2)
  P1: Stack1 = [D_{1,1}, A_{2,2}, A_{3,3}], i1 = 4, Output_tape1 = [16, 15, 13]
  P2: Stack2 = [beside, NP_{1,3}], i2 = 9, Output_tape2 = [6, 21, 13]

Action: Shift (P1), Reduce (P2)
  P1: Stack1 = [D_{1,1}, A_{2,2}, A_{3,3}, dog], i1 = 5, Output_tape1 = [16, 15, 13]
  P2: Stack2 = [P_{4,4}, NP_{1,3}], i2 = 9, Output_tape2 = [6, 21, 13, 26]

Action: Reduce (P1), Reduce (P2)
  P1: Stack1 = [D_{1,1}, A_{2,2}, A_{3,3}, N_{4,4}], i1 = 5, Output_tape1 = [20, 16, 15, 13]
  P2: Stack2 = [PP_{1,5}], i2 = 9, Output_tape2 = [12, 6, 21, 13, 26]

Action: Shift (P1), Shift (P2)
  P1: Stack1 = [D_{1,1}, A_{2,2}, A_{3,3}, N_{4,4}, with], i1 = 6, Output_tape1 = [20, 16, 15, 13]
  P2: Stack2 = [birds, PP_{1,5}], i2 = 8, Output_tape2 = [12, 6, 21, 13, 26]

Action: Reduce (P1), Reduce (P2)
  P1: Stack1 = [D_{1,1}, A_{2,2}, A_{3,3}, N_{4,4}, P_{5,5}], i1 = 6, Output_tape1 = [27, 20, 16, 15, 13]
  P2: Stack2 = [N_{6,6}, PP_{1,5}], i2 = 8, Output_tape2 = [12, 6, 21, 13, 26, 18]

Action: Shift (P1), Shift (P2)
  P1: Stack1 = [D_{1,1}, A_{2,2}, A_{3,3}, N_{4,4}, P_{5,5}, fleas], i1 = 7, Output_tape1 = [27, 20, 16, 15, 13]
  P2: Stack2 = [the, N_{6,6}, PP_{1,5}], i2 = 7, Output_tape2 = [12, 6, 21, 13, 26, 18]

Processor P1 (continuation):

Action: Reduce
  Stack1 = [D_{1,1}, A_{2,2}, A_{3,3}, N_{4,4}, P_{5,5}, N_{6,6}], i1 = 7, Output_tape1 = [19, 27, 20, 16, 15, 13]
Action: Reduce
  Stack1 = [D_{1,1}, A_{2,2}, A_{3,3}, N_{4,4}, P_{5,5}, NP_{6,7}], i1 = 7, Output_tape1 = [7, 19, 27, 20, 16, 15, 13]
Action: Reduce
  Stack1 = [D_{1,1}, A_{2,2}, A_{3,3}, N_{4,4}, PP_{5,8}], i1 = 7, Output_tape1 = [12, 7, 19, 27, 20, 16, 15, 13]
Action: Reduce
  Stack1 = [NP_{1,9}], i1 = 7, Output_tape1 = [2, 12, 7, 19, 27, 20, 16, 15, 13]

Processor P2 (continuation):

Action: Reduce
  Stack2 = [D_{7,7}, N_{6,6}, PP_{1,5}], i2 = 7, Output_tape2 = [12, 6, 21, 13, 26, 18, 13]
Action: Reduce
  Stack2 = [NP_{6,8}, PP_{1,5}], i2 = 7, Output_tape2 = [12, 6, 21, 13, 26, 6, 18, 13]
Action: Shift
  Stack2 = [watched, NP_{6,8}, PP_{1,5}], i2 = 6, Output_tape2 = [12, 6, 21, 13, 26, 6, 18, 13]
Action: Reduce
  Stack2 = [V_{9,9}, NP_{6,8}, PP_{1,5}], i2 = 6, Output_tape2 = [12, 6, 21, 13, 26, 6, 18, 13, 24]

Now, the processors P1 and P2 meet in the "middle" of the input word, and only processor P1 will be strongly active. Processor P2 only sends the data (i.e. Stack2, Output_tape2) to the internal memory of P1. In fact, we shall illustrate the execution steps made by the procedure action3(P1,P2).

Action: Shift-Nonterminal
  Stack1 = [NP_{1,9}, V], Stack2 = [NP_{6,8}, PP_{1,5}]
  Output_tape1 = [2, 12, 7, 19, 27, 20, 16, 15, 13]
  Output_tape2 = [12, 6, 21, 13, 26, 6, 18, 13]
  Output_tape = [24]

Action: Shift-Nonterminal
  Stack1 = [NP_{1,9}, V, NP], Stack2 = [PP_{1,5}]
  Output_tape1 = [2, 12, 7, 19, 27, 20, 16, 15, 13]
  Output_tape2 = [12, 6, 21, 13, 26]
  Output_tape = [6, 18, 13, 24]

Action: Shift-Nonterminal
  Stack1 = [NP_{1,9}, V, NP, PP], Stack2 = [ ]
  Output_tape1 = [2, 12, 7, 19, 27, 20, 16, 15, 13]
  Output_tape2 = [ ]
  Output_tape = [12, 6, 21, 13, 26, 6, 18, 13, 24]

Action: Reduce
  Stack1 = [NP_{1,9}, VP], Stack2 = [ ]
  Output_tape1 = [2, 12, 7, 19, 27, 20, 16, 15, 13]
  Output_tape2 = [ ]
  Output_tape = [8, 12, 6, 21, 13, 26, 6, 18, 13, 24]

Action: Reduce
  Stack1 = [S], Stack2 = [ ]
  Output_tape1 = [ ]
  Output_tape2 = [ ]
  Output_tape = [1, 8, 12, 6, 21, 13, 26, 6, 18, 13, 24, 2, 12, 7, 19, 27, 20, 16, 15, 13]

In conclusion, the word w is accepted by the bidirectional parser and it has the right syntactic analysis:

π_rm = [1, 8, 12, 6, 21, 13, 26, 6, 18, 13, 24, 2, 12, 7, 19, 27, 20, 16, 15, 13].
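The final analysis can be checked mechanically: replaying π_rm as a rightmost derivation from S with the production table of Section 5 must reproduce w. A short sketch (the table encoding and the replay helper are ours):

```python
# the productions of G, numbered as in Section 5
PRODS = {
    1: ("S", ["NP", "VP"]), 2: ("NP", ["D", "A", "A", "N", "PP"]),
    3: ("NP", ["A", "A", "N", "PP"]), 4: ("NP", ["D", "A", "N", "PP"]),
    5: ("NP", ["A", "N", "PP"]), 6: ("NP", ["D", "N"]), 7: ("NP", ["N"]),
    8: ("VP", ["V", "NP", "PP"]), 9: ("VP", ["V", "NP"]),
    10: ("VP", ["V", "PP"]), 11: ("VP", ["V"]), 12: ("PP", ["P", "NP"]),
    13: ("D", ["the"]), 14: ("D", ["some"]), 15: ("A", ["big"]),
    16: ("A", ["brown"]), 17: ("A", ["old"]), 18: ("N", ["birds"]),
    19: ("N", ["fleas"]), 20: ("N", ["dog"]), 21: ("N", ["hunter"]),
    22: ("V", ["attack"]), 23: ("V", ["ate"]), 24: ("V", ["watched"]),
    25: ("P", ["for"]), 26: ("P", ["beside"]), 27: ("P", ["with"]),
}
NONTERMINALS = {"S", "NP", "VP", "D", "A", "N", "PP", "V", "P"}

def replay_rightmost(analysis):
    """Apply each production to the rightmost nonterminal of the
    current sentential form, starting from S."""
    form = ["S"]
    for r in analysis:
        lhs, rhs = PRODS[r]
        # index of the rightmost nonterminal in the sentential form
        i = max(j for j, x in enumerate(form) if x in NONTERMINALS)
        assert form[i] == lhs        # the production must apply here
        form[i:i + 1] = rhs
    return form

pi_rm = [1, 8, 12, 6, 21, 13, 26, 6, 18, 13, 24,
         2, 12, 7, 19, 27, 20, 16, 15, 13]
w = "the big brown dog with fleas watched the birds beside the hunter"
# replay_rightmost(pi_rm) yields exactly the twelve words of w
```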
6 Conclusions

We have seen that our bidirectional parser model is in fact a two-stack machine (Hopcroft and Ullman, 1979) and therefore our model can simulate a Turing machine. In this paper the bidirectional parsing uses only the LR(k) style (and its "mirror" RL(k) style, of course).

Open problems and future work: to get a more precise estimation of the running time of the deterministic parallel algorithm presented in Section 4; to find further closure and inclusion properties of LR-RL grammars and languages; to extend our bidirectional approach to non context free languages, with applications to computational linguistics and natural language processing.

References:

Şt. Andrei, Gh. Grigoraş and M. Kudlek. 1999. Up-to-up Bidirectional Parsing for Context Free Languages. Fachbereich Informatik, Universität Hamburg, Bericht-221: 1-24

A. V. Aho and J. D. Ullman. 1972. The Theory of Parsing, Translation, and Compiling. Volume I, II, Prentice Hall

S. Akl. 1997. Parallel Computation. Models and Methods. Prentice Hall

Şt. Andrei and M. Kudlek. 1999. Bidirectional Parsing for Linear Languages. Developments in Language Theory, Fourth International Conference, July 6-9, 1999, Aachen, Germany, Preproceedings, W. Thomas Ed.: 331-344

J. Allen. 1995. Natural Language Understanding. Second Edition, The Benjamin/Cummings Publishing Company, Inc.

J. E. Hopcroft and J. D. Ullman. 1979. Introduction to Automata Theory, Languages and Computation. Addison-Wesley Publishing Company

D. E. Knuth. 1965. On the translation of languages from left to right. Information and Control. 8: 607-639

S. R. Kulkarni and P. Shankar. 1996. Linear time parsers for classes of non context free languages. Theoretical Computer Science. 165: 355-390

R. R. Loka. 1984. A note on parallel parsing. SIGPLAN Notices. 19 (1): 57-59

T. Rus and J. Jones. 1998. PHRASE parsers from multi-axiom grammars. Theoretical Computer Science. 199: 199-229

A. Salomaa. 1973. Formal Languages. Academic Press. New York

G. Satta and O. Stock. 1994. Bidirectional context-free grammar parsing for natural language processing. Artificial Intelligence. 69: 87-103

I. A. Sag and T. Wasow. 1997. Syntactic Theory: A Formal Introduction. CSLI Publications. Leland Stanford Junior University

K. Sikkel. 1998. Parsing Schemata and Correctness of Parsing Algorithms. Theoretical Computer Science. 199: 87-103

G. E. Tseytlin and E. L. Yushchenko. 1977. Several aspects of theory of parametric models of languages and parallel syntactic analysis. In: Methods of Algorithmic Language Implementation. A. Ershov, C. H. A. Koster (Eds.), LNCS 47, Springer Verlag, Berlin, Germany: 231-245
