Bottomup Parser PDF

Bottom-up parsing
Bottom-up parsing
Recall
Goal:
For a grammar G, with start symbol S, any string

such that S is called a sentential form
Given an input string w and a grammar G, construct

a parse tree by starting at the leaves and working to
the root.
If Vt , then is called a sentence in L(G)

Otherwise it is just a sentential form (not a
sentence in L(G))
A left-sentential form is a sentential form that occurs
in the leftmost derivation of some sentence.
A right-sentential form is a sentential form that
occurs in the rightmost derivation of some sentence.
The parser repeatedly matches a right-sentential form

from the language against the trees upper frontier.
At each match, it applies a reduction to build on
the frontier:
each reduction matches an upper frontier of the
partially built tree to the RHS of some production
each reduction adds a node on top of the frontier
The nal result is a rightmost derivation, in reverse.
Compiler Construction 1
CA448
Bottom-Up Parsing
CA448
Example
S
A
B
What are we trying to nd?
aABe
Abc
b
d
A substring of the trees upper frontier that:

matches some production A where reducing
to A is one step in the reverse of a rightmost
derivation
and the input string abbcde

Prodn.
Bottom-Up Parsing
Handles
Consider the grammar

1
2
3
4
We call such a string a handle.
Sentential Form
a b bcde
a Abc de
aA d e
aABe
S
Formally:
a handle of a right-sentential form is a production
A and a position in where may be found
and replaced by A to produce the previous rightsentential form in a rightmost derivation of
The trick appears to be scanning the input and

nding valid sentential forms.
i.e., if S rm Aw rm w then A in
the position following is a handle of w
Because is a right-sentential form, the substring to
the right of a handle contains only terminal symbols.
Handles
Handles
Theorem:
If G is unambiguous then every right-sentential form

has a unique handle.
Proof: (by denition)
1. G is unambiguous rightmost derivation is
unique
2. a unique production A applied to take

i1 to i
3. a unique position k at which A is applied
4. a unique handle A
The handle A in the parse tree for w
CA448
Bottom-Up Parsing
CA448
Bottom-Up Parsing
Example
Handle-pruning
The left-recursive expression grammar (original form)
The process to construct a bottom-up parse is called

handle-pruning.
1
2
3
4
5
6
7
8
9
<goal>
<expr>
<term>
<factor>
Prodn.
1
3
5
9
7
8
4
7
9
::=
::=
|
|
::=
|
|
::=
|
<expr>
<expr> + <term>
<expr> - <term>
<term>
<term> * <factor>
<term> / <factor>
<factor>
num
id
To construct a rightmost derivation

S = 0 1 2 ... n
we set i to n and apply the following simple algorithm
for i = n downto 1
1. find the handle Ai i in i
2. replace i with Ai to generate i1
Sentential Form
<goal>
<expr>
<expr> - <term>
<expr> - <term> * <factor>
<expr> - <term> * id
<expr> - <factor> * id
<expr> - num * id
<term> - num * id
<factor> - num * id
id - num * id
This takes 2n steps, where n is the length of the

derivation
Stack implementation
Example: back to x - 2 * y
1
2
3
4
5
6
7
8
9
One scheme to implement a handle-pruning, bottomup parser is called a shift-reduce parser.

Shift-reduce parsers use a stack and an input buer
1. initialize stack with $
2. Repeat until the top of the stack is the goal
symbol and the input token is $
(a) nd the handle
if we dont have a handle on top of the stack,
shift an input symbol onto the stack
(b) prune the handle
if we have a handle A on the stack, reduce
i. pop || symbols o the stack
ii. push A onto the stack
CA448
Bottom-Up Parsing
<goal>
<expr>
<term>
<factor>
::=
::=
|
|
::=
|
|
::=
|
<expr>
<expr> + <term>
<expr> - <term>
<term>
<term> * <factor>
<term> / <factor>
<factor>
num
id
Stack
$
$id
$<factor>
$<term>
$<expr>
$<expr> $<expr> $<expr> $<expr> $<expr> $<expr> $<expr> $<expr> $<expr>
$<goal>
id num
<factor>
<term>
<term> *
<term> * id
<term> * <factor>
<term>
Input
Action
*
*
*
*
*
*
*
*
*
shift
reduce
reduce
reduce
shift
shift
reduce
reduce
shift
shift
reduce
reduce
reduce
reduce
accept
id
id
id
id
id
id
id
id
id
id
9
7
4
8
7
9
5
3
1
10
CA448
Shift-reduce parsing
num
num
num
num
num
num
Bottom-Up Parsing
LR parsing
Shift-reduce parsers are simple to understand
The skeleton parser:
A shift-reduce parser has just four canonical actions:
push s0
token = next_token()
repeat forever
s = top of stack
if action[s,token] == " shift si" then
push si
token = next_token()
else if action[s,token] == " reduce A "
then
pop | | states
s = top of stack
push goto[s , A]
else if action[s, token] == " accept" then
return
else error()
1. shift next input symbol is shifted onto the top

of the stack
2. reduce right end of handle is on top of stack;
locate left end of handle within the stack;
pop handle o stack and push appropriate nonterminal LHS
3. accept terminate parsing and signal success
4. error call an error recovery routine
Key insight: recognize handles with a DFA:
This takes k shifts, l reduces, and 1 accept, where k

is the length of the input string and l is the length
of the reverse rightmost derivation
DFA transitions shift states instead of symbols

accepting states trigger reductions
11
12
Example tables
state
id
s4
s4
s4
0
1
2
3
4
5
6
7
8
1
2
3
4
5
6
ACTION
+ *
s5
r5 s6
r6 r6
r4
acc
r3
r5
r6
r2
r4
Example using the tables

Stack
$0
$04
$03
$036
$036
$036
$036
$02
$025
$025
$025
$025
$025
$01
GOTO
<expr> <term> <factor>
1
2
3
7
2
3
8
3
The Grammar
::= <expr>
::= <term> + <expr>
|
<term>
<term>
::= <factor> * <term>
|
<factor>
<factor> ::= id
<goal>
<expr>
4
3
8
4
3
2
7
Input
id * id + id$
* id + id$
* id + id$
id + id$
+ id $
+ id $
+ id $
+ id $
id $
$
$
$
$
$
Action
s4
r6
s6
s4
r6
r5
r4
s5
s4
r6
r5
r3
r2
acc
Note: This is a simple little right-recursive grammar.

It is not the same grammar as in previous lectures.
CA448
13
Bottom-Up Parsing
14
CA448
Why study LR grammars?
Bottom-Up Parsing
LR parsing
LR(1) grammars are often used to construct parsers.

We call these parsers LR(1) parsers.
Three commonly used algorithms are used to build

tables for an LR parser:
used to be everyones favourite parser (but topdown is making a comeback with JavaCC)
virtually all context-free programming language
constructs can be expressed in an LR(1) form
LR grammars are the most general grammars
parsable by a deterministic, bottom-up parser
ecient parsers can be implemented for LR(1)
grammars
LR parsers detect an error as soon as possible in
a left-to-right scan of the input
LR grammars describe a proper superset of the
languages recognized by predictive (i.e., LL)
parsers
1. SLR(1)
smallest class of grammars
smallest tables (number of states)
simple, fast construction
2. LR(1)
full set of LR(1) grammars
largest tables (number of states)
slow, large construction
3. LALR(1)
intermediate sized set of grammars
same number of states as SLR(1)
canonical construction is slow and large
better construction techniques exist
LL(k): recognize use of a production A

seeing rst k symbols of
LR(k): recognize occurrence of (the handle)
having seen all of what is derived from plus k
symbols of lookahead
15
An LR(1) parser for either Algol or Pascal has several

thousand states, while an SLR(1) or LALR(1) parser
for the same language may have several hundred
states.
16
LR(k) items
Example
The table construction algorithms use sets of LR(k)

items or congurations to represent the possible
states in a parse.
The indicates how much of an item we have seen

at a given state in the parse:
An LR(k) item is a pair [, ], where

is a production from G with a at some position
in the RHS, marking how much of the RHS of a
production has already been seen
is a lookahead string containing k symbols
(terminals or $)
Two cases of interest are k = 0 and k = 1:
LR(1) items play a key role in the LR(1) and LALR(1)

table construction algorithms.
CA448
[A XY Z] indicates that the parser has seen

a string derived from XY and is looking for one
derivable from Z
LR(0) items: (no lookahead)
A XY Z generates 4 LR(0) items:
1. [A XY Z]
LR(0) items play a key role in the SLR(1) table

construction algorithm.
[A XY Z] indicates that the parser is looking

for a string that can be derived from XY Z
17
Bottom-Up Parsing
2. [A X Y Z]
3. [A XY Z]
4. [A XY Z]
CA448
The characteristic finite state machine

(CFSM)
The CFSM for a grammar is a DFA which recognizes
viable prexes of right-sentential forms:
A viable prex is any prex that does not extend
beyond the handle.
It accepts when a handle has been discovered and
needs to be reduced.
To construct the CFSM we need two functions:
closure0(I) to build its states
18
Bottom-Up Parsing
closure0
Given an item [A B], its closure contains
the item and any other items that can generate legal
substrings to follow .
Thus, if the parser has viable prex on its stack,
the input should reduce to B (or for some other
item [B ] in the closure).
function closure0(I)
repeat
if [A B ] I
add [B ] to I
until no more items can be added to I
return I
goto0(I, X) to determine its transitions
19
20
goto0
Building the LR(0) item sets
Let I be a set of LR(0) items and X be a grammar

symbol.
We start the construction with the item [S S$],

where
Then, GOTO(I, X) is the closure of the set of all

items
S is the start symbol of the augmented grammar G

S is the start symbol of G
[A X ] such that [A X] I
If I is the set of valid items for some viable prex
, then GOTO(I, X) is the set of valid items for the
viable prex X.
GOTO(I, X) represents state after recognizing X
in state I.
function goto0(I, X)
let J be the set of items [A X ]
such that [A X ] I
return closure0(J)
21
CA448
Bottom-Up Parsing
$ represents EOF
To compute the collection of sets of LR(0) items
function items(G )
s0 = closure0({[S S$]})
S = {s0}
repeat
for each set of items s S
for each grammar symbol X
if goto0(s, X) = and goto0(s, X)
/ S
add goto0(s, X) to S
until no more item sets can be added to S
return S
CA448
22
Bottom-Up Parsing
LR(0) example
Constructing the LR(0) parsing table
1. construct the collection of sets of LR(0) items for

S
1
2
3
4
5
S
E
T
S E$
I4
E E + T
I5
I6
E T
T id
T (E)
I1 : S E$
E E +T
I2 : S E$
I7
I3 : E E + T
T id
I8
I9
T (E)
The corresponding CFSM:
I0 :
E$
E+T
T
id
(E)
2. state i of the CFSM is constructed from Ii
E E + T
T id
T (E)
E E + T
E T
T id
T (E)
T (E)
E E +T
T (E)
E T
:
:
:
:
:
:
(a) [A a] Ii and goto0(Ii, a) = Ij

ACTION[i, a] = shift j
(b) [A ] Ii, A = S
ACTION[i, a] = reduce A , a
(c) [S S$] Ii
ACTION[i, a] = accept, a
3. goto0(Ii, A) = Ij
GOTO[i, A] = j
4. set undened entries in ACTION and GOTO to
error
9
T
(
id
0
E
id
+
7
)
T
4
5. initial state of parser s0 is closure0([S S$])
(
E
(
+
$
2
T
id
23
24
Conflicts in the ACTION table
LR(0) example
If the LR(0) parsing table contains any multiplydened ACTION entries then G is not LR(0)
9
T
(
id
0
E
1
5
id
state
0
1
2
3
4
5
6
7
8
9
id
s5
acc
s5
r2
r4
s5
r5
r3
6
E
ACTION
(
)
+
s6
s3
acc acc acc
s6
r2
r2
r2
r4
r4
r4
s6
s8
s3
r5
r5
r5
r3
r3
r3
shift-reduce : both shift and reduce possible in

same item set
7
)
T
4
Two conicts arise:
(
+
$
2
T
id
reduce-reduce : more than one distinct reduce

action possible in same item set
s2
acc
r2
r4
r5
r3
GOTO
E T
1 9

4

7 9

CA448
Conicts can be resolved through lookahead in

ACTION. Consider:
A |a
shift-reduce conict
a:=b+c*d
requires lookahead to avoid shift-reduce conict
after shifting c (need to see * to give precedence
over +)
25
Bottom-Up Parsing
26
CA448
Bottom-Up Parsing
A simple approach to adding

lookahead: SLR(1)
From previous example

9
1
2
3
4
5
Add lookaheads after building LR(0) item sets

Constructing the SLR(1) parsing table:
G
2. state i of the CFSM is constructed from Ii
(a) [A a] Ii and goto0(Ii, a) = Ij
ACTION[i, a] = shift j, a = $
(b) [A ] Ii, A = S
ACTION[i, a] = reduce A ,
a FOLLOW(A)
(c) [S S $] Ii
ACTION[i, $] = accept
GOTO[i, A] = j
error
5. initial state of parser s0 is closure0([S S$])
27
S
E
T
E$
E+T
T
id
(E)
(
id
0
E
id
+
T
id
7
)
$
2
(
+
FOLLOW(E) = FOLLOW(T ) = {$,+,)}

state
0
1
2
3
4
5
6
7
8
9
id
s5
s5
s5
(
s6
s6
s6
ACTION
) +
s3
r2 r2
r4 r4
s8 s3
r5 r5
r3 r3
acc
r2
r4
r5
r3
GOTO
E T
1 9

4

7 9

28
Example: A grammar that is not LR(0)

1
2
3
4
5
6
7
S
E
T
F
E$
E+T
T
T F
F
id
(E)
FOLLOW
{+,),$}
{+,*,),$}
{+,*,),$}
E
T
F
state
0
1
2
3
4
5
6
7
8
9
10
11
12
LR(0) item sets:

I0 :
I1 :
I2 :
I3 :
I4 :
I5 :
S E $
E E + T
E T
T T F
T F
F id
F (E)
S E$
E E +T
S E$
E E + T
T T F
T F
F id
F (E)
T F
F id
I6 :
I7 :
I8 :
I9 :
I10 :
I11 :
I12 :
F
E
E
T
T
F
F
E
T
T
F
F
T
F
E
T
F
E
(E)
E + T
T
T F
F
id
(E)
T
T F
T F
id
(E)
T F
(E)
E + T
T F
(E)
E +T
29
CA448
Example: But it is SLR(1)
Bottom-Up Parsing
s3
r5
r6
r3
r4
r7
r2
s3
r5
r6
s8
r4
r7
s8
ACTION
id (
)
s5 s6
s5 s6
r5
r6
s5 s6
r3
s5 s6
r4
r7
r2
s10
acc
r5
r6
r3
r4
r7
r2
30
CA448
Example: A grammar that is not

SLR(1)
GOTO
E
T
1
7
11
12 7
Bottom-Up Parsing
LR(1) items
An LR(1) item is one in which
Consider:
S
L
R
L=R
R
R
id
L
All the lookahead strings are constrained to have

length 1
Look something like [A X Y Z, a]
Whats the point of the lookahead symbols?
Its LR(0) item sets:

I0 :
I1 :
I2 :
I3 :
S S$
S L = R
S R
LR
L id
R L
S S $
S L = R
R L
S R
I4 :
I5 :
I6 :
I7 :
I8 :
I9 :
carry along to choose correct reduction when there

is a choice
lookaheads are bookkeeping, unless item has at
right end:
in [A X Y Z, a], a has no direct use
in [A XY Z, a], a is useful
allows use of grammars that are not uniquely
invertible1
LR
R L
LR
L id
L id
S L = R
R L
LR
L id
L R
R L
S L = R
Consider I2:
The point: For [A , a] and [B , b], we

can decide between reducing to A or B by looking
at limited right context
= FOLLOW(R) (S L = R R = R)
RHS
31
a grammar is uniquely invertible if no two productions have the same
32
F
4
closure1(I)
goto1(I)
Given an item [A B, a], its closure contains

the item and any other items that can generate legal
substrings to follow .
Let I be a set of LR(1) items and X be a grammar

symbol.
Thus, if the parser has viable prex on its stack,

the input should reduce to B (or for some other
item [B , b] in the closure).
function closure1(I)
repeat
if [A B, a] I
add [B , b] to I, where b FIRST( a)
until no more items can be added to I
return I
Then, GOTO(I, X) is the closure of the set of all

items
[A X , a] such that [A X, a] I
If I is the set of valid items for some viable prex
, then GOTO(I, X) is the set of valid items for the
viable prex X.
GOTO(I, X) represents state after recognizing X
in state I.
function goto1(I, X)
let J be the set of items
[A X, a] such that [A X, a] I
return closure1(J)
CA448
33
Bottom-Up Parsing
Building the LR(1) item sets for

grammar G
CA448
34
Bottom-Up Parsing
Constructing the LR(1) parsing table

Build lookahead into the DFA to begin with
We start the construction with the item [S S, $],

where
S is the start symbol of the augmented grammar G
2. state i of the LR(1) machine is constructed from

Ii
S is the start symbol of G

$ represents EOF
To compute the collection of sets of LR(1) items
function items(G )
s0 = closure1({[S S, $])
S = {s0}
repeat
for each set of items s S
for each grammar symbol X
if goto1(s, X) = and goto1(s, X)
/ S
add goto1(s, X) to S
until no more item sets can be added to S
return S

G
35
(a) [A a, b] Ii and goto1(Ii, a) = Ij

(b) [A , a] Ii, A = S
ACTION[i, a] = reduce A
(c) [S S, $] Ii
GOTO[i, A] = j
error
5. initial state of parser s0 is closure1([S S, $])
36
Back to previous example (

/ SLR(1))
S
L
R
Example: back to the SLR(1)

expression grammar
L=R
R
R
id
L
In general, LR(1) has many more states than

LR(0)/SLR(1):
I0 :
I1 :
I2 :
I3 :
I4 :
S S ,
S L = R,
S R,
L R,
L id,
R L,
L R,
L id,
S S,
S L = R,
R L,
S R,
L R,
R L,
L R,
L id,
$
$
$
=
=
$
$
$
$
$
$
$
=$
=$
=$
=$
L id ,
S L = R,
R L,
L R,
L id,
L R,
R L,
S L = R,
R L,
L R,
R L,
L R,
L id,
L id ,
L R,
I5 :
I6 :
I7 :
I8 :
I9 :
I10 :
I11 :
I12 :
I13 :
=$
$
$
$
$
=$
=$
$
$
$
$
$
$
$
$
S
E
1
2
3
4
5
6
7
Its LR(1) item sets:
T
F
E
E+T
T
T F
F
id
(E)
LR(1) item sets:

I0 :
S E ,
E E + T ,
E T ,
T T F ,
T F ,
F id,
F (E),
$
+$
+$
*+$
*+$
*+$
*+$
I0 : shifting (
S (E),
E E + T ,
E T ,
T T F ,
T F ,
F id,
F (E),
*+$
+)
+)
*+)
*+)
*+)
*+)
I0 : shifting (
S (E),
E E + T ,
E T ,
T T F ,
T F ,
F id,
F (E),
*+)
+)
+)
*+)
*+)
*+)
*+)
I2 no longer has shift-reduce conict:

reduce on $, shift on =
37
CA448
Bottom-Up Parsing
38
CA448
Another example
Bottom-Up Parsing
LALR(1) parsing
Consider:
0
1
2
3
S
S
C
S S ,
S CC ,
C cC ,
C d,
S S,
S C C,
C cC ,
C d,
C c C,
C cC ,
C d,
C d,
S CC,
C c C,
C cC ,
C d,
C d,
C cC,
C cC,
$
$
cd
cd
$
$
$
$
cd
cd
cd
cd
$
$
$
$
$
cd
$
Dene the core of a set of LR(1) items to be the

set of LR(0) items derived by ignoring the lookahead
symbols.
S
CC
cC
d
Thus, the two sets
LR(1) item sets:

I0 :
I1 :
I2 :
I3 :
I4 :
I5 :
I6 :
I7 :
I8 :
I9 :
{[A , a], [A , b]}, and

state
0
1
2
3
4
5
6
7
8
9
ACTION
c
d
$
s3 s4
acc
s6 s7
s3 s4
r3 r3
r1
s6 s7
r3
r2 r2
r2
GOTO
S C
1 2

5
8

9

39
{[A , c], [A , d]}

have the same core.
Key idea:
If two sets of LR(1) items, Ii and Ij , have the
same core, we can merge the states that represent
them in the ACTION and GOTO tables.
40
LALR(1) table construction
LALR(1) table construction
To construct LALR(1) parsing tables, we can insert

a single step into the LR(1) algorithm
(1.5) For each core present among the set of LR(1)
items, nd all sets having that core and replace
these sets by their union.
The goto function must be updated to reect the
replacement sets.
The resulting algorithm has large space requirements.
41
CA448
Bottom-Up Parsing
Example
The revised (and renumbered) algorithm

G
2. for each core present among the set of LR(1)
items, nd all sets having that core and replace
these sets by their union. (Update the goto
function incrementally)
3. state i of the LALR(1) machine is constructed
from Ii
(a) [A a, b] Ii and goto1(Ii, a) = Ij
(b) [A , a] Ii, A = S
ACTION[i, a] = reduce A
(c) [S S, $] Ii
GOTO[i, A] = j
error
6. initial state of parser s0 is closure1([S S, $])
42
CA448
Bottom-Up Parsing
The role of precedence
Reconsider:
0
1
2
3
S
S
C
Precedence and associativity can be used to resolve

shift/reduce conicts in ambiguous grammars.
S
CC
cC
d
lookahead with higher precedence shift

same precedence, left associative reduce
LR(1) item sets:
I0 : S S ,
S CC ,
C cC ,
C d,
I1 : S S,
I2 : S C C ,
C cC ,
C d,
I3 : C c C ,
C cC ,
C d,
I4 : C d,
I5 : S CC,
I6 : C c C ,
C cC ,
C d,
I7 : C d,
I8 : C cC,
I9 : C cC,
$
$
cd
cd
$
$
$
$
cd
cd
cd
cd
$
$
$
$
$
cd
$
Advantages:
Merged states:
I36 : C
C
C
I47 : C
I89 : C
state
0
1
2
36
47
5
89
c C,
cC ,
d,
d,
cC,
cd$
cd$
cd$
cd$
cd$
ACTION
c
d
$
s36 s47
acc
s36 s47
s36 s47
r3
r3
r3
r1
r2
r2
r2
more concise, albeit ambiguous, grammars

shallower parse trees fewer reductions
GOTO
S C
1 2

5
89

43
Classic application: expression grammars

With precedence and associativity, we can use:
E
|
|
|
|
|
|
|
EE
E/E
E+E
EE
(E)
E
id
num
44
Error recovery in shift-reduce parsers
Left versus right recursion
The problem
Right Recursion:
encounter an invalid token
needed for termination in predictive parsers
bad pieces of tree hanging from stack
requires more stack space
incorrect entries in symbol table
right associative operators

Left Recursion:
We want to parse the rest of the le
works ne in bottom-up parsers
Restarting the parser
limits required stack space
nd a restartable state on the stack
left associative operators
move to a consistent place in the input

print an informative message (including line
number)
Rule of thumb:
right recursion for top-down parsers
left recursion for bottom-up parsers
45
CA448
Bottom-Up Parsing
Parsing review
Recursive descent
A hand coded recursive descent parser directly
encodes a grammar (typically an LL(1) grammar)
into a series of mutually recursive procedures. It has
most of the linguistic limitations of LL(1).
LL(k)
An LL(k) parser must be able to recognize the use
of a production after seeing only the rst k symbols
of its right hand side.
LR(k)
An LR(k) parser must be able to recognize the
occurrence of the right hand side of a production
after having seen all that is derived from that right
hand side with k symbols of lookahead.
47
46

Bottomup Parser PDF

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Bottomup Parser PDF

Uploaded by

Copyright:

Available Formats

Bottom-up parsing

For a grammar G, with start symbol S, any string

Given an input string w and a grammar G, construct

If Vt , then is called a sentence in L(G)

The parser repeatedly matches a right-sentential form

What are we trying to nd?

A substring of the trees upper frontier that:

and the input string abbcde

Consider the grammar

We call such a string a handle.

The trick appears to be scanning the input and

If G is unambiguous then every right-sentential form

2. a unique production A applied to take

3. a unique position k at which A is applied

The handle A in the parse tree for w

The left-recursive expression grammar (original form)

The process to construct a bottom-up parse is called

To construct a rightmost derivation

This takes 2n steps, where n is the length of the

One scheme to implement a handle-pruning, bottomup parser is called a shift-reduce parser.

Shift-reduce parsers are simple to understand

The skeleton parser:

A shift-reduce parser has just four canonical actions:

1. shift next input symbol is shifted onto the top

This takes k shifts, l reduces, and 1 accept, where k

DFA transitions shift states instead of symbols

Example using the tables

Note: This is a simple little right-recursive grammar.

Why study LR grammars?

LR(1) grammars are often used to construct parsers.

Three commonly used algorithms are used to build

LL(k): recognize use of a production A

An LR(1) parser for either Algol or Pascal has several

The table construction algorithms use sets of LR(k)

The indicates how much of an item we have seen

An LR(k) item is a pair [, ], where

LR(1) items play a key role in the LR(1) and LALR(1)

[A XY Z] indicates that the parser has seen

LR(0) items play a key role in the SLR(1) table

[A XY Z] indicates that the parser is looking

The characteristic finite state machine

goto0(I, X) to determine its transitions

Building the LR(0) item sets

Let I be a set of LR(0) items and X be a grammar

We start the construction with the item [S S$],

Then, GOTO(I, X) is the closure of the set of all

S is the start symbol of the augmented grammar G

Constructing the LR(0) parsing table

1. construct the collection of sets of LR(0) items for

2. state i of the CFSM is constructed from Ii

(a) [A a] Ii and goto0(Ii, a) = Ij

5. initial state of parser s0 is closure0([S S$])

Conflicts in the ACTION table

shift-reduce : both shift and reduce possible in

Two conicts arise:

reduce-reduce : more than one distinct reduce

Conicts can be resolved through lookahead in

A simple approach to adding

From previous example

Add lookaheads after building LR(0) item sets

FOLLOW(E) = FOLLOW(T ) = {$,+,)}

Example: A grammar that is not LR(0)