Bottom up parsing
• Goal of parser : build a derivation
– top-down parser : build a derivation by working from the start
symbol towards the input.
• builds parse tree from root to leaves
• builds leftmost derivation
– bottom-up parser : build a derivation by working from the input
back toward the start symbol
• builds parse tree from leaves to root
• builds reverse rightmost derivation
• a string is reduced to the start symbol
Types of bottom-up parsing
Bottom Up Parsing
• “Shift-Reduce” Parsing
• Reduce a string to the start symbol of the grammar.
• At every step a particular substring is matched (in left-to-right fashion)
to the right side of some production and replaced by the LHS of that
production (called a reduction).
• If the substring is chosen correctly at each step, the sequence of
reductions traces out a rightmost derivation in reverse.
Consider the grammar:
S → aABe
A → Abc | b
B → d
Reductions of the input abbcde (the rightmost derivation in reverse order):
abbcde
aAbcde
aAde
aABe
S
Rightmost derivation:
S ⇒ aABe ⇒ aAde ⇒ aAbcde ⇒ abbcde
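The reduction sequence above can be replayed mechanically. The following is a minimal sketch, not a parser: the handle positions in `STEPS` are chosen by hand (an assumption of this example), which is exactly the decision a real shift-reduce parser has to automate. The names `reduce_at` and `trace_reductions` are hypothetical helpers.

```python
# Grammar from the slide: S -> aABe, A -> Abc | b, B -> d
# Each step is (position of the handle, LHS, RHS). The positions are
# picked by hand for this illustration.
STEPS = [
    (1, "A", "b"),
    (1, "A", "Abc"),
    (2, "B", "d"),
    (0, "S", "aABe"),
]


def reduce_at(form, pos, lhs, rhs):
    """Replace the handle rhs found at index pos of form by lhs."""
    if form[pos:pos + len(rhs)] != rhs:
        raise ValueError(f"{rhs!r} is not at position {pos} of {form!r}")
    return form[:pos] + lhs + form[pos + len(rhs):]


def trace_reductions(sentence, steps):
    """Return every sentential form produced while applying the reductions."""
    forms = [sentence]
    for pos, lhs, rhs in steps:
        forms.append(reduce_at(forms[-1], pos, lhs, rhs))
    return forms


if __name__ == "__main__":
    print(" <= ".join(trace_reductions("abbcde", STEPS)))
```

Reading the returned list from right to left gives the rightmost derivation of abbcde.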
Handle
• A Handle of a string
– A substring that matches the RHS of some production and whose
reduction represents one step of a rightmost derivation in reverse
– So we scan tokens from left to right, find the handle, and replace it
by corresponding LHS
• Formally:
– a handle of a right-sentential form γ is a production A → β and a
location of β in γ that satisfy the above property.
– i.e. A → β is a handle of αβω at the location immediately after the
end of α, if:
S ⇒* αAω ⇒ αβω (both by rightmost derivation; ω contains only terminals)
Handle
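• If the grammar is ambiguous, a right-sentential form may have several different handles; if the grammar is unambiguous, every right-sentential form has exactly one handle.
An Example of Bottom-Up Parsing
• S → aABe
• A → Abc | b
• B → d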
Handle Pruning
The process of discovering a handle and reducing it to the appropriate left-hand side is called handle pruning.
Handle pruning forms the basis for a bottom-up parsing method.
• Two problems:
– locate a handle and
– decide which production to use (if there is more than one candidate
production).
Shift Reduce Parser using Stack
• General Construction: using a stack:
– “shift” input symbols into the stack until a handle is found on top of
it.
– “reduce” the handle to the corresponding non-terminal.
– other operations:
• “accept” when the input is consumed and only the start symbol is
on the stack, also: “error”
Shift-Reduce Parser Example
Grammar:
Expr → Expr Op Expr
Expr → ( Expr )
Expr → - Expr
Expr → num
Op → +
Op → -
Op → *
Input string: num * ( num + num )
Stack                        Remaining input          Action
(empty)                      num * ( num + num )      SHIFT
num                          * ( num + num )          REDUCE Expr → num
Expr                         * ( num + num )          SHIFT
Expr *                       ( num + num )            REDUCE Op → *
Expr Op                      ( num + num )            SHIFT
Expr Op (                    num + num )              SHIFT
Expr Op ( num                + num )                  REDUCE Expr → num
Expr Op ( Expr               + num )                  SHIFT
Expr Op ( Expr +             num )                    REDUCE Op → +
Expr Op ( Expr Op            num )                    SHIFT
Expr Op ( Expr Op num        )                        REDUCE Expr → num
Expr Op ( Expr Op Expr       )                        REDUCE Expr → Expr Op Expr
Expr Op ( Expr               )                        SHIFT
Expr Op ( Expr )             (empty)                  REDUCE Expr → ( Expr )
Expr Op Expr                 (empty)                  REDUCE Expr → Expr Op Expr
Expr                         (empty)                  ACCEPT!
Basic Idea
• Goal: construct parse tree for input string
• Read input from left to right
• Build tree in a bottom-up fashion
• Use stack to hold pending sequences of terminals and nonterminals
Example, Corresponding Parse Tree
[Figure: parse tree fragment Expr → Expr – Term with leaves <id,x> and <num,2>]
1. Shift until top-of-stack is the right end of a handle
2. Pop the right end of the handle & reduce
Grammar:
• S → EXP
• EXP → EXP + TERM | TERM
• TERM → TERM * FACT | FACT
• FACT → ( EXP ) | ID | NUM
Conflicts During Shift-Reduce Parsing
• There are context-free grammars for which shift-reduce parsers cannot
be used.
• Stack contents and the next input symbol may not decide action:
– shift/reduce conflict: the parser cannot decide whether to shift or to reduce.
– reduce/reduce conflict: the parser cannot decide which of several
reductions to make.
• If a shift-reduce parser cannot be used for a grammar, that grammar is
called a non-LR(k) grammar.
Conflicts
“shift/reduce” or “reduce/reduce”
Example:
Stack              Input       Conflict
if … then stmt     else …      shift/reduce
Conflict Resolution
• Shift-reduce conflict
– Resolve in favor of shift
• Reduce-reduce conflict
– Use the production that appears earlier
Shift-Reduce Parsers
• There are two main categories of shift-reduce parsers
1. Operator-Precedence Parser
– simple, but handles only a small class of grammars.
2. LR-Parsers
– cover a wide range of grammars.
• SLR – simple LR parser
• LR – most general LR parser
• LALR – intermediate LR parser (lookahead LR parser)
– SLR, LR and LALR work the same way; only their parsing tables are different.
[Figure: nested grammar classes, SLR inside LALR inside LR inside CFG]
• Consider the grammar
• S' → S
• S → ( S ) S | ε
• Show the actions of a shift-reduce parser for the input string ( ) using the
above grammar
Operator-Precedence Parser
• Operator grammar
– small, but an important class of grammars
– we may have an efficient operator precedence parser (a shift-reduce
parser) for an operator grammar.
• In an operator grammar, no production rule can have:
– ε at the right side
– two adjacent non-terminals at the right side.
• Ex:
E → AB, A → a, B → b (not an operator grammar)
E → EOE, E → id, O → + | * | / (not an operator grammar)
E → E+E | E*E | E/E | id (an operator grammar)
Precedence Relations
• In operator-precedence parsing, we define three disjoint precedence
relations between certain pairs of terminals.
Using Operator-Precedence Relations
• The intention of the precedence relations is to find the handle of
a right-sentential form, with
<. marking the left end,
=· appearing in the interior of the handle, and
.> marking the right end.
Using Operator-Precedence Relations
E → E+E | E-E | E*E | E/E | E^E | (E) | -E | id
The partial operator-precedence table for this grammar:
      id    +     *     $
id          .>    .>    .>
+     <.    .>    <.    .>
• Then the input string id+id*id with the precedence relations inserted
will be:
$ <. id .> + <. id .> * <. id .> $
To Find The Handles
1. Scan the string from left end until the first .> is encountered.
2. Then scan backwards (to the left) over any =· until a <. is encountered.
3. The handle contains everything to the left of the first .> and to the right of
the <. encountered in step 2.
Operator-Precedence Parsing Algorithm
• The input string is w$, the initial stack is $, and a table holds the precedence relations between
certain terminals.
Algorithm:
set p to point to the first symbol of w$;
repeat forever
  if ( $ is on top of the stack and p points to $ ) then return
  else {
    let a be the topmost terminal symbol on the stack and let b be the symbol pointed to by p;
    if ( a <. b or a =· b ) then { /* SHIFT */
      push b onto the stack;
      advance p to the next input symbol;
    }
    else if ( a .> b ) then /* REDUCE */
      repeat pop the stack
      until ( the top-of-stack terminal is related by <. to the terminal most recently popped );
    else error();
  }
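The algorithm above can be sketched in Python as follows. This is a simplified illustration, not a full parser: the stack holds terminals only (nonterminals are not tracked), and the `*` and `$` rows of `PREC` are assumptions filled in from the usual precedence of `*` over `+`, since the slides show only the `id` and `+` rows. The function name `op_precedence_parse` is hypothetical.

```python
LT, EQ, GT = "<", "=", ">"

# PREC[a][b] is the relation between stack terminal a and input terminal b.
# Rows "id" and "+" follow the partial table in the slides; rows "*" and "$"
# are assumed for this sketch.
PREC = {
    "id": {"+": GT, "*": GT, "$": GT},
    "+": {"id": LT, "+": GT, "*": LT, "$": GT},
    "*": {"id": LT, "+": GT, "*": GT, "$": GT},
    "$": {"id": LT, "+": LT, "*": LT},
}


def op_precedence_parse(tokens):
    """Return the terminals of each handle, in the order they are reduced."""
    stack = ["$"]
    inp = list(tokens) + ["$"]
    p = 0
    handles = []
    while True:
        a, b = stack[-1], inp[p]
        if a == "$" and b == "$":
            return handles
        rel = PREC[a].get(b)
        if rel in (LT, EQ):          # SHIFT
            stack.append(b)
            p += 1
        elif rel == GT:              # REDUCE: pop back to the matching <.
            popped = []
            while True:
                t = stack.pop()
                popped.append(t)
                if PREC[stack[-1]].get(t) == LT:
                    break
            handles.append(popped[::-1])
        else:                        # empty table entry
            raise SyntaxError(f"no precedence relation between {a!r} and {b!r}")


if __name__ == "__main__":
    print(op_precedence_parse(["id", "+", "id", "*", "id"]))
```

For id+id*id the three id handles are reduced first, then the `*` handle, then the `+` handle, which reflects `*` binding tighter than `+`.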
Operator-Precedence Parsing Algorithm -- Example
Precedence table (partial):
      id    +     *     $
id          .>    .>    .>
+     <.    .>    <.    .>
stack     input        action
$         id+id*id$    $ <. id, shift
$id       +id*id$      id .> +, reduce E → id
$         +id*id$      shift
How to Create Operator-Precedence Relations
• We use associativity and precedence relations among operators.
4. Also, let
( =· )      $ <. (      id .> )      ) .> $
( <. (      $ <. id     id .> $      ) .> )
( <. id
Operator-Precedence Relations
+ - * / ^ id ( ) $
+ .> .> <. <. <. <. <. .> .>
Handling Unary Minus
• Operator-precedence parsing cannot handle the unary minus when we
also have the binary minus in our grammar.
• The best approach to this problem is to let the lexical analyzer handle it.
– The lexical analyzer will return two different operators for the unary minus and the binary
minus.
– The lexical analyzer will need a lookahead to distinguish the binary minus from the unary
minus.
• Then, we make
O <. unary-minus for any operator O
unary-minus .> O if unary-minus has higher precedence than O
unary-minus <. O if unary-minus has lower (or equal) precedence than O
Precedence Functions
• Compilers using operator precedence parsers do not need to store the
table of precedence relations.
• The table can be encoded by two precedence functions f and g that map
terminal symbols to integers.
• For symbols a and b.
f(a) < g(b) whenever a <. b
f(a) = g(b) whenever a =· b
f(a) > g(b) whenever a .> b
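As a small illustration of the encoding, the sketch below checks a pair of precedence functions against a relation table for the terminals id, +, * and $. The integer values in `F` and `G` and the `*` and `$` rows of `TABLE` are assumptions chosen for this example, not values taken from the slides.

```python
# (a, b) -> relation, where a is on the stack and b is the next input symbol.
# The "*" and "$" rows are assumed for this sketch.
TABLE = {
    ("id", "+"): ">", ("id", "*"): ">", ("id", "$"): ">",
    ("+", "id"): "<", ("+", "+"): ">", ("+", "*"): "<", ("+", "$"): ">",
    ("*", "id"): "<", ("*", "+"): ">", ("*", "*"): ">", ("*", "$"): ">",
    ("$", "id"): "<", ("$", "+"): "<", ("$", "*"): "<",
}

# Candidate precedence functions (assumed values).
F = {"id": 4, "+": 2, "*": 4, "$": 0}
G = {"id": 5, "+": 1, "*": 3, "$": 0}


def relation_from_functions(a, b):
    """Recover the relation between a and b from f(a) and g(b)."""
    if F[a] < G[b]:
        return "<"
    if F[a] > G[b]:
        return ">"
    return "="


def functions_agree_with_table():
    """True if f and g reproduce every non-empty table entry."""
    return all(relation_from_functions(a, b) == rel
               for (a, b), rel in TABLE.items())


if __name__ == "__main__":
    print(functions_agree_with_table())
```

One trade-off of this encoding is that empty (error) entries are lost: with these values `relation_from_functions("id", "id")` yields `<` even though the table has no relation for that pair, so errors are detected later than with the full table.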
Disadvantages of Operator Precedence Parsing
• Disadvantages:
– It cannot handle the unary minus (the lexical analyzer should handle
the unary minus).
– Small class of grammars.
– Difficult to decide which language is recognized by the grammar.
• Advantages:
– simple
– powerful enough for expressions in programming languages
Error Recovery in Operator-Precedence Parsing
Error Cases:
1. No relation holds between the terminal on the top of stack and the
next input symbol.
2. A handle is found (reduction step), but there is no production with
this handle as a right side
Error Recovery:
1. Each empty entry is filled with a pointer to an error routine.
2. The error routine decides which right-hand side the popped handle “looks like”
and tries to recover from that situation.
Handling Shift/Reduce Errors
Example
(e1 to e4 denote error routines)
      id    (     )     $
id    e3    e3    .>    .>
(     <.    <.    =·    e4
)     e3    e3    .>    .>
$     <.    <.    e2    e1
LR Parsers
LR(k) parsing.
LL(k) vs. LR(k)
• LL(k): must predict which production to use having seen only first k
tokens of RHS
– Works only with some grammars
– But simple algorithm (can construct by hand)
More on LR(k)
• Can recognize virtually all programming language constructs (if a CFG
can be given)
• Most general non-backtracking shift-reduce method known, yet it can be
implemented efficiently
• The class of grammars that can be parsed is a proper superset of the grammars parsed by
LL(k)
• Can detect syntax errors as soon as possible in a left-to-right scan of the input
LR Parsers
• LR-Parsers
– cover a wide range of grammars.
– SLR – simple LR parser
– LR – most general LR parser
– LALR – intermediate LR parser (lookahead LR parser)
– SLR, LR and LALR work the same way (they use the same algorithm);
only their parsing tables are different.
LR Parsing Algorithm
Entries in Transition Table
Entry Meaning
LR(0) Item
An LR(0) item is a production with a position in its RHS marked by a dot
(e.g., A → α · β)
• The dot tells how much of the RHS we have seen so far. For example,
for a production S → XYZ,
– S → ·XYZ: we hope to see a string derivable from XYZ
– S → X·YZ: we have just seen a string derivable from X and we hope
to see a string derivable from YZ
– S → XY·Z: we have just seen a string derivable from XY and we
hope to see a string derivable from Z
– S → XYZ·: we have seen a string derivable from XYZ and are going to
reduce it to S
– (X, Y, Z are grammar symbols)
SLR PARSING
Augmented Grammar
Constructing Sets of LR(0) Items
1. Create a new nonterminal S' and a new production S' → S where S is
the start symbol.
2. Put the item S' → ·S into a start state called state 0.
3. Closure: If A → α·Bβ is in state s, then add B → ·γ to state s for
every production B → γ in the grammar.
4. Creating a new state from an old state [goto operation]: Look for an
item of the form A → α·xβ where x is a single terminal or
nonterminal and build a new state from A → αx·β. Include in the
new state all items with ·x in the old state. A new state is created for
each different x.
5. Repeat steps 3 and 4 until no new states are created. A state is new if it
is not identical to an old state.
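Steps 1 to 5 can be sketched in Python for the expression grammar used in the later slides (E → E+T | T, T → T*F | F, F → (E) | id). This is a minimal illustration under two assumptions of this example: an item is represented as a pair (production index, dot position), and the helper names `closure`, `goto` and `canonical_collection` are hypothetical.

```python
# Augmented expression grammar; production 0 is E' -> E (step 1).
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """Step 3: for every item with the dot before B, add all B -> .gamma."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for i, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (i, 0) not in result:
                    result.add((i, 0))
                    work.append((i, 0))
    return frozenset(result)


def goto(state, symbol):
    """Step 4: move the dot over symbol, then take the closure."""
    moved = {(p, d + 1) for p, d in state
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved) if moved else frozenset()


def canonical_collection():
    """Step 5: keep applying goto until no new state appears."""
    states = [closure({(0, 0)})]          # step 2: state 0
    symbols = sorted({s for _, rhs in GRAMMAR for s in rhs})
    for state in states:                  # the list grows while we iterate
        for x in symbols:
            nxt = goto(state, x)
            if nxt and nxt not in states:
                states.append(nxt)
    return states


if __name__ == "__main__":
    print(len(canonical_collection()), "states")
```

Representing states as frozensets makes the "is it identical to an old state" test of step 5 a plain membership check. The states are numbered in discovery order here, which need not match the I0 to I11 numbering of the slides.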
The Closure Operation (Example)
Grammar:
E → E + T | T
T → T * F | F
F → ( E )
F → id
closure({[E' → •E]}) is built up step by step:
1. Start with { [E' → •E] }
2. Add the E-productions: [E → •E + T], [E → •T]
3. Add the T-productions: [T → •T * F], [T → •F]
4. Add the F-productions: [F → •( E )], [F → •id]
Result:
{ [E' → •E]
[E → •E + T]
[E → •T]
[T → •T * F]
[T → •F]
[F → •( E )]
[F → •id] }
Formal Definition of GOTO operation for constructing
LR(0) Items
1. For each item [A → α•Xβ] ∈ I, add the set of items
closure({[A → αX•β]}) to goto(I,X) if not already
there
2. Repeat step 1 until no more items can be added to
goto(I,X)
The Goto Operation (Example 1)
State 0
Creating State 1 From State 0 [ goto(I0,E)]
• Final version of state 0:
I0: {
E' → •E
E → •E + T
E → •T
T → •T * F
T → •F
F → •( E )
F → •id
}
• Using step 4, we create new state 1 from items E' → •E and E → •E + T
[Figure: DFA edge I0 --E--> I1]
State 1
• State 1 starts with the items E' → E• and E → E• + T. These items are formed from
items E' → •E and E → •E + T by moving the "•" one grammar symbol to the
right. In each case, the grammar symbol is E.
• Closure does not add any new items, so state 1 ends up with the 2 items:
I1: {
E' → E•
E → E• + T
}
[Figure: DFA edge I0 --E--> I1]
Creating State 2 From State 0 [ goto(I0,T)]
• Using step 4, we create state 2 from items E → •T
and T → •T * F by moving the "•" past the “T”.
• State 2 starts with 2 items,
I2: {
E → T•
T → T• * F
}
• Closure does not add additional items to state 2.
[Figure: DFA edge I0 --T--> I2]
Creating State 3 From State 0 [ goto(I0,F)]
• Using step 4, we create state 3 from item T → •F.
• State 3 starts (and ends up) with one item:
I3: {
T → F•
}
• Since the only item in state 3 is a complete item, there will be no transitions out of state 3.
• The figure on the next slide shows the DFA of viable prefixes to this point.
[Figure: DFA edge I0 --F--> I3]
DFA After Creation of State 3
Creating State 4 From State 0 [ goto(I0, ( ) ]
• Using step 4, we create state 4 from item F → •( E ).
• State 4 begins with one item:
F → ( •E )
• Applying closure to this item, we add the items
E → •E + T
E → •T
State 4
• Applying closure to E → •T, we add items T → •T * F and T → •F to state 4, giving
F → ( •E )
E → •E + T
E → •T
T → •T * F
T → •F
State 4
• Applying step 3 to T → •F, we add items F → •( E ) and F → •id to state 4, giving the
final set of items
I4: {
F → ( •E )
E → •E + T
E → •T
T → •T * F
T → •F
F → •( E )
F → •id
}
[Figure: DFA edge I0 --(--> I4]
• The next slide shows the DFA to this point.
DFA After Creation of State 4
Creating State 5 From State 0 [ goto(I0,id)]
DFA After Creation of State 5
Creating State 6 From State 1 [ goto(I1,+)]
• State 1 consists of 2 items
E' → E•
E → E• + T
• Create state 6 from item E → E• + T, giving the item E → E + •T.
• Closure results in the set of items
I6: {
E → E + •T
T → •T * F
T → •F
F → •( E )
F → •id
}
[Figure: DFA edge I1 --+--> I6]
DFA After Creation of State 6
Creating State 7 From State 2 [ goto(I2,*)]
• State 2 has two items,
E → T•
T → T• * F
• We create state 7 from T → T• * F, giving the initial item T → T * •F.
Using closure, we end up with
I7: {
T → T * •F
F → •( E )
F → •id
}
[Figure: DFA edge I2 --*--> I7]
DFA After Creation of State 7
Creating State 8 From State 4 [ goto(I4,E)]
Other Transitions From State 4 [ goto(I4,T),
goto(I4,F), goto(I4,( ), goto(I4,id)]
• If we use the items E → •T and T → •T * F from state 4 to start a
new state, we begin with items
E → T•
T → T• * F
• This set is identical to state 2.
• Similarly, the items
– T → •F will produce state 3
– F → •( E ) will produce state 4
– F → •id will produce state 5
DFA After Creation of State 8
Creating State 9 From State 6 [ goto(I6,T)]
• We use items E → E + •T and T → •T * F from state 6 to
create state 9:
I9: {
E → E + T•
T → T• * F
}
• All other transitions from state 6 go to
existing states. The next slide shows the
DFA to this point.
[Figure: DFA edge I6 --T--> I9]
DFA After Creation of State 9
Creating State 10 From State 7 [ goto(I7,F)]
[Figure: DFA edge I7 --F--> I10]
DFA After Creation of State 10
Creation of State 11 From State 8 [ goto(I8, ) ) ]
• We use item F → ( E• ) from state 8
to create state 11:
I11: {
F → ( E )•
}
• All other transitions from state 8 go to existing states.
• State 9 has one transition to an existing state (7). No other new states can be
added, so we are done.
• The next slide shows the final DFA for viable prefixes.
[Figure: DFA edge I8 --)--> I11]
DFA for Viable Prefixes
(SLR) Parsing Tables for Expression Grammar
Productions:
1) E → E+T
2) E → T
3) T → T*F
4) T → F
5) F → (E)
6) F → id
Action table (columns id to $) and goto table (columns E, T, F):
state   id    +     *     (     )     $     E    T    F
0       s5                s4                1    2    3
1             s6                      acc
2             r2    s7          r2    r2
3             r4    r4          r4    r4
4       s5                s4                8    2    3
5             r6    r6          r6    r6
6       s5                s4                     9    3
7       s5                s4                          10
8             s6                s11
9             r1    s7          r1    r1
10            r3    r3          r3    r3
11            r5    r5          r5    r5
Constructing Parse Table
• Construct the DFA (state graph) as in LR(0)
• Action Table
– If there is a transition from state i to state j on a terminal a,
ACTION[i, a] = shift j
– If there is a reduce item A → α· in state i (for production #k), then for
each a ∈ FOLLOW(A),
ACTION[i, a] = reduce k
– If the item S' → S· is in state i,
ACTION[i, $] = accept
– Otherwise, error
• GOTO
– Write GOTO for nonterminals: for terminals it is already embedded
in the action table
Algorithm – Construction of SLR Parsing Table
1. Construct the canonical collection of sets of LR(0) items for G'.
C ← {I0,...,In}
2. Create the parsing action table as follows
• If a is a terminal, A → α.aβ is in Ii and goto(Ii,a)=Ij, then action[i,a] is
shift j.
• If A → α. is in Ii, then action[i,a] is reduce A → α for all a in
FOLLOW(A), where A ≠ S'.
• If S' → S. is in Ii, then action[i,$] is accept.
• If any conflicting actions are generated by these rules, the grammar is
not SLR(1).
3. Create the parsing goto table
• for all non-terminals A, if goto(Ii,A)=Ij then goto[i,A]=j
• All entries not defined by (2) and (3) are errors.
4. The initial state of the parser is the one containing S' → .S
• We use the partial DFA at right
to fill in row 0 of the parse table.
– By rule 2a,
• action[ 0, ( ] = shift 4
• action[ 0, id ] = shift 5
– By rule 3,
• goto[ 0, E ] = 1
• goto[ 0, T ] = 2
• goto[ 0, F ] = 3
Parse table after filling in row 0 (rows 1 to 11 are still empty):
state   id    +     *     (     )     $     E    T    F
0       s5                s4                1    2    3
• We use the partial DFA at right
to fill in row 1 of the parse table.
– By rule 2a,
• action [ 1, + ] = shift 6
– By rule 2c
• action [ 1, $ ] = accept
Parse table after filling in rows 0 and 1 (rows 2 to 11 are still empty):
state   id    +     *     (     )     $     E    T    F
0       s5                s4                1    2    3
1             s6                      acc
• We use the partial DFA at right
to fill in row 5 of the parse table.
– By rule 2b, we set
action[ 5, x ] = reduce F → id
for each x ∈ Follow(F).
– Since Follow(F) = { ), +, *, $ }
we have
• action[ 5, ) ] = reduce F → id
• action[ 5, + ] = reduce F → id
• action[ 5, * ] = reduce F → id
• action[ 5, $ ] = reduce F → id
Parse table after filling in rows 0, 1 and 5:
state   id    +     *     (     )     $     E    T    F
0       s5                s4                1    2    3
1             s6                      acc
5             r6    r6          r6    r6
Use the DFA to Finish the SLR Table
The complete SLR parse table for the expression grammar is given on the next slide.
Parse Table For Expression Grammar
Notation:
s5 = “shift 5”
r2 = “reduce by E → T”
state   id    +     *     (     )     $     E    T    F
0       s5                s4                1    2    3
1             s6                      acc
2             r2    s7          r2    r2
3             r4    r4          r4    r4
4       s5                s4                8    2    3
5             r6    r6          r6    r6
6       s5                s4                     9    3
7       s5                s4                          10
8             s6                s11
9             r1    s7          r1    r1
10            r3    r3          r3    r3
11            r5    r5          r5    r5
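Example SLR Grammar and LR(0) Items
Grammar:
1. C' → C
2. C → A B
3. A → a
4. B → a
LR(0) item sets:
State I0: C' → •C, C → •A B, A → •a
State I1: C' → C•
State I2: C → A•B, B → •a
State I3: A → a•
State I4: C → A B•
State I5: B → a•
[Figure: DFA with start state 0 and edges 0 --C--> 1, 0 --A--> 2, 0 --a--> 3, 2 --B--> 4, 2 --a--> 5]
Parsing table:
state   a     $     C    A    B
0       s3          1    2
1             acc
2       s5                    4
3       r3
4             r2
5             r4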
Actions of A LR-Parser
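1. shift s -- shifts the next input symbol and the state s onto the stack
( S0 X1 S1 ... Xm Sm, ai ai+1 ... an $ ) → ( S0 X1 S1 ... Xm Sm ai s, ai+1 ... an $ )
2. reduce A → β -- pops 2|β| entries (r = |β| symbols and r states) from the stack, then pushes A and s, where s = goto[Sm-r, A]
( S0 X1 S1 ... Xm Sm, ai ai+1 ... an $ ) → ( S0 X1 S1 ... Xm-r Sm-r A s, ai ai+1 ... an $ )
3. accept -- parsing completed successfully
4. error -- the parser detected an error (an empty entry in the action table)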
LR Parsing Algorithm
• Refer Text:
• Compilers: Principles, Techniques, and Tools by Alfred V. Aho, Ravi
Sethi, Jeffrey D. Ullman
• Page No. 218-219
Actions of A (S)LR-Parser -- Example
stack          input        action                  output
0              id*id+id$    shift 5
0id5           *id+id$      reduce by F → id        F → id
0F3            *id+id$      reduce by T → F         T → F
0T2            *id+id$      shift 7
0T2*7          id+id$       shift 5
0T2*7id5       +id$         reduce by F → id        F → id
0T2*7F10       +id$         reduce by T → T*F       T → T*F
0T2            +id$         reduce by E → T         E → T
0E1            +id$         shift 6
0E1+6          id$          shift 5
0E1+6id5       $            reduce by F → id        F → id
0E1+6F3        $            reduce by T → F         T → F
0E1+6T9        $            reduce by E → E+T       E → E+T
0E1            $            accept
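The trace above can be reproduced with a small table-driven driver. The sketch below hard-codes the SLR table of the expression grammar from the earlier slide. As a simplification of this example, the stack holds states only (the grammar symbols shown in the trace are omitted), and the name `lr_parse` is hypothetical.

```python
# Production number -> (LHS, length of RHS), numbered as in the slides.
PRODS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3),
         4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

ACTION = {}
for s in (0, 4, 6, 7):                       # states that shift id and (
    ACTION[(s, "id")] = ("s", 5)
    ACTION[(s, "(")] = ("s", 4)
ACTION[(1, "+")] = ("s", 6)
ACTION[(1, "$")] = ("acc", 0)
ACTION[(8, "+")] = ("s", 6)
ACTION[(8, ")")] = ("s", 11)
for s, k in ((2, 2), (9, 1)):                # reduce, except shift on *
    ACTION[(s, "*")] = ("s", 7)
    for t in "+)$":
        ACTION[(s, t)] = ("r", k)
for s, k in ((3, 4), (5, 6), (10, 3), (11, 5)):
    for t in "+*)$":
        ACTION[(s, t)] = ("r", k)

GOTO = {(0, "E"): 1, (0, "T"): 2, (0, "F"): 3,
        (4, "E"): 8, (4, "T"): 2, (4, "F"): 3,
        (6, "T"): 9, (6, "F"): 3, (7, "F"): 10}


def lr_parse(tokens):
    """Return the production numbers in the order they are reduced."""
    stack = [0]                              # state stack
    inp = list(tokens) + ["$"]
    pos = 0
    output = []
    while True:
        act = ACTION.get((stack[-1], inp[pos]))
        if act is None:                      # empty entry means error
            raise SyntaxError(f"unexpected {inp[pos]!r} in state {stack[-1]}")
        kind, n = act
        if kind == "s":
            stack.append(n)
            pos += 1
        elif kind == "r":
            lhs, length = PRODS[n]
            del stack[len(stack) - length:]  # pop one state per RHS symbol
            stack.append(GOTO[(stack[-1], lhs)])
            output.append(n)
        else:
            return output


if __name__ == "__main__":
    print(lr_parse(["id", "*", "id", "+", "id"]))
```

The returned list corresponds to the output column of the trace: F → id, T → F, F → id, T → T*F, E → T, F → id, T → F, E → E+T.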
Exercise
• Consider the following grammar of simplified statement sequences:
• stmt_sequence → stmt_sequence ; stmt | stmt
• stmt → s
a) Construct the DFA of LR(0) items of this grammar.
b) Construct SLR parsing table.
c) Show the parsing stack and the actions of the SLR parsing for the input
string s;s;s
shift/reduce and reduce/reduce conflicts
• If a state does not know whether it will make a shift operation or
reduction for a terminal, we say that there is a shift/reduce conflict.
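• If a state does not know which of two or more reductions to make for a
terminal, we say that there is a reduce/reduce conflict.
• If the SLR parsing table of a grammar G has a conflict, we say that the
grammar is not an SLR grammar.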
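Conflict Example
Grammar:
S → L=R
S → R
L → *R
L → id
R → L
LR(0) item sets:
I0: S' → .S, S → .L=R, S → .R, L → .*R, L → .id, R → .L
I1: S' → S.
I2: S → L.=R, R → L.
I3: S → R.
I6: S → L=.R, R → .L, L → .*R, L → .id
I9: S → L=R.
In I2, on input = the parser can shift (item S → L.=R) or reduce by R → L (because = is in FOLLOW(R)): a shift/reduce conflict.
Conflict Example 2
Grammar:
S → AaAb
S → BbBa
A → ε
B → ε
I0: S' → .S, S → .AaAb, S → .BbBa, A → ., B → .
Problem:
FOLLOW(A) = {a,b}
FOLLOW(B) = {a,b}
On a: reduce by A → ε or reduce by B → ε (reduce/reduce conflict)
On b: reduce by A → ε or reduce by B → ε (reduce/reduce conflict)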
SLR(1)
• There is an easy fix for some of the shift/reduce or reduce/reduce conflicts
– it requires looking one token ahead (called the lookahead token)
• Steps to resolve the conflict of an itemset:
1) for each shift item Y → b . c find FIRST(c)
2) for each reduction item X → a . find FOLLOW(X)
3) if each FOLLOW(X) does not overlap with any of the other sets, the conflict is resolved
• e.g., for the itemset with E → T . and T → T . * F
– FOLLOW(E) = { $, +, ) }
– FIRST(* F) = { * }
– no overlap
• A grammar for which this works is an SLR(1) grammar; SLR(1) is more powerful than LR(0)
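The FOLLOW set used in the example can be computed with a simple fixed-point iteration. The sketch below is a simplified illustration for the expression grammar and assumes the grammar has no ε-productions (true for this grammar, but not in general). The names `first_sets` and `follow_sets` are hypothetical.

```python
# Expression grammar; E is the start symbol. Assumption: no epsilon rules.
GRAMMAR = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["(", "E", ")"], ["id"]],
}


def first_sets(grammar):
    """FIRST of each nonterminal (valid only without epsilon-productions)."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                head = rhs[0]
                add = first[head] if head in grammar else {head}
                if not add <= first[nt]:
                    first[nt] |= add
                    changed = True
    return first


def follow_sets(grammar, start):
    """FOLLOW of each nonterminal, with $ as the end marker."""
    first = first_sets(grammar)
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                for i, sym in enumerate(rhs):
                    if sym not in grammar:
                        continue                 # terminals have no FOLLOW
                    if i + 1 < len(rhs):
                        nxt = rhs[i + 1]
                        add = first[nxt] if nxt in grammar else {nxt}
                    else:
                        add = follow[nt]         # sym ends the production
                    if not add <= follow[sym]:
                        follow[sym] |= add
                        changed = True
    return follow


if __name__ == "__main__":
    follow = follow_sets(GRAMMAR, "E")
    print(sorted(follow["E"]), "*" in follow["E"])
```

Since `*` is not in FOLLOW(E), the state containing E → T . and T → T . * F reduces on `$`, `+` and `)` and shifts on `*`, so there is no conflict.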
General LR(1) Parsing
• The SLR(1) trick doesn't always work
• The difficulty with the SLR(1) method is that it applies lookaheads after
the construction of the DFA of LR(0) items.
• The power of general LR(1) method is that it uses a new DFA that has
the lookaheads built into its construction from the start.
• This DFA uses an extension of LR(0) items, i.e. LR(1) items.
• A single lookahead token is attached to each item.
• An LR(1) item is a pair consisting of an LR(0) item and a lookahead
token, of the form
[A → α.β, a] where A → α.β is an LR(0) item and a is a token.
LR(1) items
• An LR(1) item is:
A → α.β, a     where a is the lookahead of the LR(1) item
(a is a terminal or the end-marker.)
• When β (in the LR(1) item A → α.β, a) is not empty, the lookahead
does not have any effect.
• When β is empty (A → α., a), we do the reduction by A → α only if
the next input symbol is a (not for any terminal in FOLLOW(A)).
• A state will contain
A → α., a1
...
A → α., an
where {a1,...,an} ⊆ FOLLOW(A)
Canonical Collection of Sets of LR(1) Items
• The construction of the canonical collection of the sets of LR(1) items
is similar to the construction of the canonical collection of the sets of
LR(0) items, except that the closure and goto operations work a little
differently.
goto operation
Construction of The Canonical LR(1) Collection
• Algorithm:
C is { closure({[S' → .S, $]}) }
repeat the following until no more sets of LR(1) items can be added to C:
  for each I in C and each grammar symbol X
    if goto(I,X) is not empty and not in C
      add goto(I,X) to C
A Short Notation for The Sets of LR(1) Items
• A set of LR(1) items containing the following items
A → α.β, a1
...
A → α.β, an
can be written as
A → α.β, a1/a2/.../an
LR(1) Items - Example
S → CC
C → cC | d
• Augmented Grammar
S' → S
S → CC
C → cC | d
• Start with closure({[S' → .S, $]})
• I0:
{
S' → .S, $
S → .CC, $
C → .cC, c/d
C → .d, c/d
}
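The item set I0 can be reproduced with a sketch of the LR(1) closure operation. This is a simplified illustration: an item is a triple (production index, dot position, lookahead), the `FIRST` sets are written out by hand, and the grammar is assumed to have no ε-productions. The helper names `first_of` and `closure` are hypothetical.

```python
# Augmented grammar: 0: S' -> S, 1: S -> C C, 2: C -> c C, 3: C -> d
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("C", "C")),
    ("C", ("c", "C")),
    ("C", ("d",)),
]
NONTERMINALS = {"S'", "S", "C"}
# FIRST sets computed by hand for this small grammar (assumption of the sketch).
FIRST = {"S": {"c", "d"}, "C": {"c", "d"}}


def first_of(symbols, lookahead):
    """FIRST of the string symbols followed by lookahead (no epsilon rules)."""
    if not symbols:
        return {lookahead}
    head = symbols[0]
    return FIRST[head] if head in NONTERMINALS else {head}


def closure(items):
    """For [A -> alpha . B beta, a], add [B -> . gamma, b] for b in FIRST(beta a)."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot, la = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for b in first_of(rhs[dot + 1:], la):
                for i, (lhs, _) in enumerate(GRAMMAR):
                    if lhs == rhs[dot] and (i, 0, b) not in result:
                        result.add((i, 0, b))
                        work.append((i, 0, b))
    return frozenset(result)


if __name__ == "__main__":
    for prod, dot, la in sorted(closure({(0, 0, "$")})):
        lhs, rhs = GRAMMAR[prod]
        print(lhs, "->", " ".join(rhs[:dot]) + "." + " ".join(rhs[dot:]), ",", la)
```

The result has six (item, lookahead) pairs, which the short notation of the slides writes as four lines with lookaheads `$`, `$`, `c/d` and `c/d`.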
State 1 from State 0 ( goto(I0,S) )
• I1: {
S' → S., $
}
[Figure: DFA edge I0 --S--> I1]
State 2 from State 0 ( GOTO(I0,C) )
State 3 from State 0 ( GOTO(I0,c) )
State 4 from State 0 ( GOTO(I0,d) )
• I2:
{
S → C.C, $
C → .cC, $
C → .d, $
}
• I3:
{
C → c.C, c/d
C → .cC, c/d
C → .d, c/d
}
• I4:
{
C → d., c/d
}
The DFA up to this point is shown on the next slide.
[Figure: DFA up to this point, with edges I0 --S--> I1, I0 --C--> I2, I0 --c--> I3 and I0 --d--> I4; each node shows the item set of its state]
New states from State 2 ( GOTO(I2,C), GOTO(I2,c), GOTO(I2,d) )
• I5:
{
S → CC., $
}
• I6:
{
C → c.C, $
C → .cC, $
C → .d, $
}
• I7:
{
C → d., $
}
Construction of LR(1) Parsing Tables
1. Construct the canonical collection of sets of LR(1) items for G‟.
C ← {I0,...,In}
LALR(1)
• If the lookaheads s1 and s2 are different, then the items A → α, s1
and A → α, s2 are different
– this results in a large number of states, since the number of combinations of expected lookahead
symbols can be very large.
• We can combine the two states into one by creating an item A → α, s3
where s3 is the union of s1 and s2
• LALR(1) is weaker than LR(1) but more powerful than SLR(1)
• LALR(1) and LR(0) have the same number of states
• Most parser generators are LALR(1), including CUP (Constructor of
Useful Parsers)
Practical Considerations
• How to avoid reduce/reduce and shift/reduce conflicts:
– left recursion is good, right recursion is bad
• Most shift/reduce conflicts are easy to remove by assigning precedence and
associativity to operators
LALR Parsing Tables
• LALR stands for LookAhead LR.
• LALR parsers are often used in practice because LALR parsing tables
are smaller than LR(1) parsing tables.
• The number of states in the SLR and LALR parsing tables for a grammar G
is equal.
• But LALR parsers recognize more grammars than SLR parsers.
• yacc creates an LALR parser for the given grammar.
• A state of an LALR parser will again be a set of LR(1) items.
Creating LALR Parsing Tables
The Core of A Set of LR(1) Items
• We will find the states (sets of LR(1) items) in a canonical LR(1) parser
with the same cores. Then we will merge them into a single state.
I4: C → d., c/d
I7: C → d., $
These have the same core, so we merge them into a new state:
I47: C → d., c/d/$
• We will do this for all states of a canonical LR(1) parser to get the states
of the LALR parser.
• In fact, the number of the states of the LALR parser for a grammar will
be equal to the number of states of the SLR parser for that grammar.
• I3 and I6 are merged in the same way:
I3: C → c.C, c/d
C → .cC, c/d
C → .d, c/d
A new state:
I36: C → c.C, c/d/$
C → .cC, c/d/$
C → .d, c/d/$
• I8 and I9 are replaced by their union
• I89 :
{
C → cC., c/d/$
}
Creation of LALR Parsing Tables
• Create the canonical LR(1) collection of the sets of LR(1) items for
the given grammar.
• Find each core; find all sets having that same core; replace those sets
having same cores with a single set which is their union.
C = {I0,...,In} → C' = {J1,...,Jm} where m ≤ n
• Create the parsing tables (action and goto tables) in the same way as
the parsing tables of the LR(1) parser.
– Note that: if J = I1 ∪ ... ∪ Ik, then since I1,...,Ik have the same cores, the
cores of goto(I1,X),...,goto(Ik,X) must also be the same.
– So, goto(J,X)=K where K is the union of all sets of items having the same core as goto(I1,X).
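The merging step can be sketched as a grouping of LR(1) states by their core. The example below is an illustration only: it models an LR(1) item as a pair ((lhs, rhs, dot), lookahead), encodes states I3, I4, I6, I7, I8 and I9 of the S → CC grammar by hand, and uses the hypothetical helpers `core`, `merge_by_core` and `item`.

```python
def core(state):
    """The core of a state: its LR(0) items, with lookaheads dropped."""
    return frozenset(lr0 for lr0, _ in state)


def merge_by_core(states):
    """Union all states that share a core; merged names are concatenated."""
    merged = {}
    for name, state in states.items():
        key = core(state)
        old_name, old_items = merged.get(key, ("", frozenset()))
        merged[key] = (old_name + name, old_items | state)
    return {"I" + name: items for name, items in merged.values()}


def item(lhs, rhs, dot, lookaheads):
    """Build the LR(1) items [lhs -> rhs with dot, a] for each lookahead a."""
    return frozenset(((lhs, rhs, dot), a) for a in lookaheads)


# Canonical LR(1) states of S -> CC, C -> cC | d that share cores.
STATES = {
    "3": item("C", "cC", 1, "cd") | item("C", "cC", 0, "cd") | item("C", "d", 0, "cd"),
    "4": item("C", "d", 1, "cd"),
    "6": item("C", "cC", 1, "$") | item("C", "cC", 0, "$") | item("C", "d", 0, "$"),
    "7": item("C", "d", 1, "$"),
    "8": item("C", "cC", 2, "cd"),
    "9": item("C", "cC", 2, "$"),
}

if __name__ == "__main__":
    for name, items in merge_by_core(STATES).items():
        print(name, sorted({a for _, a in items}))
```

Merging never introduces a new shift/reduce conflict, because shift actions depend only on the core, but it can introduce reduce/reduce conflicts when the united lookahead sets of two reductions overlap.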
LALR Parsing Table
Productions:
1. S → CC
2. C → cC
3. C → d
state   c     d     $     S    C
0       s36   s47         1    2
1                   acc
2       s36   s47              5
36      s36   s47              89
47      r3    r3    r3
5                   r1
89      r2    r2    r2
Exercises
Q1. Show that the following grammar
S → Aa | bAc | dc | dba
A → d
is LALR(1) but not SLR(1).