Professional Documents
Culture Documents
parsers
1. build I , the canonical collection of sets of LR(1) items (a) I0 closure(f S 0 ! S; eof]g) (b) repeat until no sets are added for I 2 I and X 2 NT T if goto(I ; X ) is a new set, add it to I 2. iterate through I 2 I , lling in the ACTION table 3. ll in the GOTO table
j j j
would be valid
are also valid, where ) recognizing Y takes parser to A ! XY Z; ] I represents all the simultaneously valid states
j
; ]
CPSC 434
LR(1)
parser example
The Grammar
1 2 3
E T
::= ::=
j
T+E T
id
0 1 2 3 Symbol S0 E T
S0 E T
E T+E T
id
f id g f id g f id g
FOLLOW
CPSC 434
S3: T ::= id , + ]
T ::= id , $ ]
FIRST( $ ) = $ FIRST( $ ) = $ FIRST( + E $ ) = + FIRST( $ ) = $
S4: E ::= T + E , $ ],
S5: E ::= T + E , $ ]
CPSC 434
Lecture 12, Page 3
S0
closure ( f S ::= E ]g )
Iteration 1 goto(S0; E ) = S1 goto(S0; T ) = S2 goto(S0, id) = S3 Iteration 2 goto(S2, +) = S4 Iteration 3 goto(S4, id) = S3 goto(S4; E ) = S5 goto(S4; T ) = S2
CPSC 434
Lecture 12, Page 4
0 1 2 3
S0 E T
ACTION +
E T+E T
id
S0 S1 S2 S3 S4 S5
id
expr 1 | | | 5 |
GOTO
term 2 | | | 2 |
The "reduce" actions are determined by the lookahead entries in the LR(1) items (instead of FOLLOW as in SLR parsers) The dfa, ACTION and GOTO tables have the exact same format for both SLR(1) and LR(1) parsers
CPSC 434
Lecture 12, Page 5
Shift-Reduce
, ] , ]
Reduce-Reduce
A ::= B ::=
, ] , ]
con ict
FOLLOW(A) FIRST( )
\ FIRST( \
con ict
FOLLOW(A) FOLLOW(B)
\
SLR(1)
LR(1)
CPSC 434
SLR(1)
parsing example
The Grammar
S0 G E T
G E = E j id E+Tj T T * f j id
S0: S0 ::= G ]
G ::= G ::= E ::= E ::= T ::= T ::=
E=E] id ] E+T] T] T * id ] id ]
FOLLOW(G) = f $ g FOLLOW(T) = f $, *, +, = g
S1: G ::= id ]
T ::= id ]
LR(1)
parsing example
The Grammar
G E = E j id E+Tj T T * id j id
S0: S0 ::= G , f $ g ]
E=E,f$g] id , f $ g ] E + T , f =, + g ] T , f =, + g ] T * id , f =, +, * g ] id , f =, +, * g ] ]
S1: G ::= id , f $ g ]
T ::= id , f =,
+, * g
LALR(1)
parsing
lookahead in states.
De ne the core of a set of LR(1) items to be the set of LR(0) items derived by ignoring the lookahead symbols. Thus, the two sets fA) ; a]; A ) fA) ; c]; A ) have the same core.
Key Idea:
i
If two sets of LR(1) items, I and I , have the same core, we can merge the states that represent them in the ACTION and GOTO tables.
j
CPSC 434
LALR(1)
table construction
merge.
CPSC 434
LALR(1)
table construction
A more space e cient algorithm can be derived by observing that: we can represent I by its kernel, those items that are either the initial item S 0 ! S; eof] or do not have the at the left end of the rhs. we can compute shift, reduce, and goto actions for the state derived from I directly from kernel(I ).
i i i
This method avoids building the complete canonical collection of sets of LR(1) items. generate lookahead information.
LALR(1)
properties
LALR(1) parsers have same number of states as SLR(1) parsers (core LR(0) items are the same)
May perform reduce rather than error. But will catch error before more input is processed.
LALR derived from LR with no shift-reduce con ict
LALR may create reduce-reduce con ict not in LR from which LALR is derived.
CPSC 434
LR(k )
languages
SLR(1) LALR(1) LR(1) LR(k) Grammars SLR(1) LALR(1) LR(1) LR(k ) Languages
CPSC 434
Lecture 12, Page 13
S1 > S 2 S 1 = S2 S1 < S 2
A handle is thus composed of: <>, < = >, < = = >, ... To decide whether to shift or reduce, compare top of stack with lookahead (ignoring nonterminals): Shift if < or = Reduce if > Left end of handle is marked by rst < found
CPSC 434
Lecture 12, Page 14
Parsing example
E ::= E + E j E * E j id
+ * id
The Grammar
+ * id
< < > > < > > > < < >
Input
Stack
$ $ < id $<E $<E+ $ < E + < id $<E+<E $<E+<E* $ < E + < E * < id $<E+<E*E $<E+E $<E
CPSC 434
id + id * id + id * id + id * id id * id * id * id id
$ $ $ $ $ $ $ $ $ $ $
Precedence
> $ < + < id > + < * < id > * > + > $ >
id
$ <
$ $ $ $
id + + id * * id
descent parser directly encodes a grammar (typically an LL(1) grammar) into a series of mutually recursive procedures. It has most of the linguistic limitations of LL(1). the use of a production after seeing only the rst k symbols of its right hand side.
the occurrence of the right hand side of a production after having seen all that is derived from that right hand side with k symbols of lookahead.
CPSC 434
Parsing review
top-down recursive descent
LL(1)
operator precedence
LR(1)
Advantages fast locality simplicity error detection simple method fast automatable simple method very fast small table associativity fast det. langs. early error det. automatable associativity
Disadvantages hand-coded maintenance no left recursion associativity LL(1) LR(1) no left recursion associativity L(G) 6 = L(parser) error detection no ambiguity working sets table size error recovery
CPSC 434