You are on page 1of 17

LR(1)

parsers

LR(1) table construction algorithm

1. build I , the canonical collection of sets of LR(1) items (a) I0 closure(f S 0 ! S; eof]g) (b) repeat until no sets are added for I 2 I and X 2 NT T if goto(I ; X ) is a new set, add it to I 2. iterate through I 2 I , lling in the ACTION table 3. ll in the GOTO table
j j j

What does I \mean"?


j

would be valid

A ! X Y Z; ] ) have recognized X & Y Z A ! X Y Z; ] ) Y


! ; ]& ; 2 FIRST(Z

are also valid, where ) recognizing Y takes parser to A ! XY Z; ] I represents all the simultaneously valid states
j

; ]

CPSC 434

Lecture 12, Page 1

LR(1)

parser example
The Grammar

1 2 3

E T

::= ::=
j

T+E T
id

The Augmented Grammar

0 1 2 3 Symbol S0 E T

S0 E T

::= ::= ::=


FIRST
j

E T+E T
id

f id g f id g f id g

f eof g f eof g f +, eof g

FOLLOW

CPSC 434

Lecture 12, Page 2

Example LR(1) states


S0: S0 ::= E , $ ],
E ::= T + E , $ ], E ::= T , $ ], T ::= id , + ] T ::= id , $ ]
FIRST( $ ) = $ FIRST( $ ) = $ FIRST( + E $ ) = + FIRST( $ ) = $

S1: S0 ::= E , $ ] S2: E ::= T


E , $ ], E ::= T , $ ]
+

S3: T ::= id , + ]

T ::= id , $ ]
FIRST( $ ) = $ FIRST( $ ) = $ FIRST( + E $ ) = + FIRST( $ ) = $

S4: E ::= T + E , $ ],

E ::= T + E , $ ], E ::= T , $ ], T ::= id , + ] T ::= id , $ ]

S5: E ::= T + E , $ ]
CPSC 434
Lecture 12, Page 3

Example GOTO function


Start

S0

closure ( f S ::= E ]g )

Iteration 1 goto(S0; E ) = S1 goto(S0; T ) = S2 goto(S0, id) = S3 Iteration 2 goto(S2, +) = S4 Iteration 3 goto(S4, id) = S3 goto(S4; E ) = S5 goto(S4; T ) = S2
CPSC 434
Lecture 12, Page 4

Example ACTION and GOTO tables


The Augmented Grammar

0 1 2 3

S0 E T
ACTION +

::= ::= ::=


$
j

E T+E T
id

S0 S1 S2 S3 S4 S5

shift 3 | | | | accept | shift 4 reduce 2 | reduce 3 reduce 3 shift 3 | | | | reduce 1

id

expr 1 | | | 5 |

GOTO

term 2 | | | 2 |

The "reduce" actions are determined by the lookahead entries in the LR(1) items (instead of FOLLOW as in SLR parsers) The dfa, ACTION and GOTO tables have the exact same format for both SLR(1) and LR(1) parsers
CPSC 434
Lecture 12, Page 5

Resolving parse con icts


Parse con icts possible when certain LR items are found in the same state. Depending on parser, may choose between LR items using lookahead. Legal lookahead for LR items must be disjoint, else con ict exists. A ::= B ::=
LR(0)

Shift-Reduce

, ] , ]

Reduce-Reduce

A ::= B ::=

, ] , ]

con ict
FOLLOW(A) FIRST( )
\ FIRST( \

con ict
FOLLOW(A) FOLLOW(B)
\

SLR(1)

LR(1)

CPSC 434

Lecture 12, Page 6

SLR(1)

parsing example
The Grammar

S0 G E T

::= ::= ::= ::=

G E = E j id E+Tj T T * f j id

S0: S0 ::= G ]
G ::= G ::= E ::= E ::= T ::= T ::=

E=E] id ] E+T] T] T * id ] id ]
FOLLOW(G) = f $ g FOLLOW(T) = f $, *, +, = g

S1: G ::= id ]
T ::= id ]

Reduce-reduce con ict in S1 for lookahead $ !


CPSC 434
Lecture 12, Page 7

LR(1)

parsing example
The Grammar

S0 G E T G ::= G ::= E ::= E ::= T ::= T ::=

::= ::= ::= ::=

G E = E j id E+Tj T T * id j id

S0: S0 ::= G , f $ g ]

E=E,f$g] id , f $ g ] E + T , f =, + g ] T , f =, + g ] T * id , f =, +, * g ] id , f =, +, * g ] ]

S1: G ::= id , f $ g ]
T ::= id , f =,
+, * g

Reduce-reduce con ict in S1 resolved by lookahead!


CPSC 434
Lecture 12, Page 8

LALR(1)

parsing

LR(1) parsers have many more states than SLR(1)

parsers (approximately factor of ten for Pascal).


LALR(1) parsers have same number of states as SLR(1) parsers, but with more power due to

lookahead in states.

De ne the core of a set of LR(1) items to be the set of LR(0) items derived by ignoring the lookahead symbols. Thus, the two sets fA) ; a]; A ) fA) ; c]; A ) have the same core.
Key Idea:
i

; b]g, and ; d]g

If two sets of LR(1) items, I and I , have the same core, we can merge the states that represent them in the ACTION and GOTO tables.
j

CPSC 434

Lecture 12, Page 9

LALR(1)

table construction

There are two approaches to constructing LALR(1) parsing tables

merge.

Approach 1: Build LR(1) sets of items, then


1. For each core present among the set of LR(1) items, nd all sets having that core and replace these sets by their union 2. Update the goto function to re ect the replacement sets

The resulting algorithm has large space requirements

CPSC 434

Lecture 12, Page 10

LALR(1)

table construction

A more space e cient algorithm can be derived by observing that: we can represent I by its kernel, those items that are either the initial item S 0 ! S; eof] or do not have the at the left end of the rhs. we can compute shift, reduce, and goto actions for the state derived from I directly from kernel(I ).
i i i

This method avoids building the complete canonical collection of sets of LR(1) items. generate lookahead information.

Approach 2: Build LR(0) sets of items, then


1. Construct kernels of LR(0) sets of items 2. Initialize lookaheads of each kernel item 3. Compute when lookaheads propagate 4. Propagate lookaheads
CPSC 434
Lecture 12, Page 11

LALR(1)

properties

LALR(1) parsers have same number of states as SLR(1) parsers (core LR(0) items are the same)

May perform reduce rather than error. But will catch error before more input is processed.
LALR derived from LR with no shift-reduce con ict

will also have no shift-reduce con ict.

LALR may create reduce-reduce con ict not in LR from which LALR is derived.

CPSC 434

Lecture 12, Page 12

LR(k )

languages

SLR(1) LALR(1) LR(1) LR(k) Grammars SLR(1) LALR(1) LR(1) LR(k ) Languages
CPSC 434
Lecture 12, Page 13

Operator precedence parsers


Another approach to shift-reduce parsing is to use operator precedence. Given S ) S1S2 , there are three possible precedence relations between S1 and S2. 1. S1 in handle, S2 not (S1 reduced before S2) 2. both in handle (reduced at same time) 3. S2 in handle, S1 not (S2 reduced before S1)

S1 > S 2 S 1 = S2 S1 < S 2

A handle is thus composed of: <>, < = >, < = = >, ... To decide whether to shift or reduce, compare top of stack with lookahead (ignoring nonterminals): Shift if < or = Reduce if > Left end of handle is marked by rst < found
CPSC 434
Lecture 12, Page 14

Parsing example
E ::= E + E j E * E j id
+ * id

The Grammar
+ * id

> > > $ <

< < > > < > > > < < >
Input

Stack

$ $ < id $<E $<E+ $ < E + < id $<E+<E $<E+<E* $ < E + < E * < id $<E+<E*E $<E+E $<E
CPSC 434

id + id * id + id * id + id * id id * id * id * id id

$ $ $ $ $ $ $ $ $ $ $

Precedence

> $ < + < id > + < * < id > * > + > $ >
id

$ <

$ $ $ $

id + + id * * id

Lecture 12, Page 15

Parsing review Recursive Descent A hand coded recursive

descent parser directly encodes a grammar (typically an LL(1) grammar) into a series of mutually recursive procedures. It has most of the linguistic limitations of LL(1). the use of a production after seeing only the rst k symbols of its right hand side.

LL(k) An LL(k) parser must be able to recognize

LR(k) An LR(k) parser must be able to recognize

the occurrence of the right hand side of a production after having seen all that is derived from that right hand side with k symbols of lookahead.

CPSC 434

Lecture 12, Page 16

Parsing review
top-down recursive descent
LL(1)

operator precedence

LR(1)

Advantages fast locality simplicity error detection simple method fast automatable simple method very fast small table associativity fast det. langs. early error det. automatable associativity

Disadvantages hand-coded maintenance no left recursion associativity LL(1) LR(1) no left recursion associativity L(G) 6 = L(parser) error detection no ambiguity working sets table size error recovery

CPSC 434

Lecture 12, Page 17

You might also like