
TABLE DRIVEN PREDICTIVE PARSING

PREDICTIVE PARSING

RECONSTRUCTING THE PARSE TREE

Example (Stack Moves)
Input string: id + id * id

EXAMPLE: THE “DANGLING ELSE” GRAMMAR

LL(1) GRAMMAR

• LL(1) grammars:
  • are never ambiguous;
  • never have left recursion.
• Furthermore, if we are looking for an “A” and the next input symbol is “b”, then only one production must be applicable.
• Although elimination of left recursion and left factoring is easy, some grammars can never be transformed into LL(1) grammars.
PROPERTIES OF LL(1) GRAMMAR

• A grammar G is LL(1) if and only if, whenever A → α | β are two distinct productions of G, the following conditions hold:

1. For no terminal a do both α and β derive strings beginning with a.
2. At most one of α and β can derive the empty string.
3. If β ⇒* ε, then α does not derive any string beginning with a terminal in FOLLOW(A).
Examples

S → A a A b | B b B a
A → ε
B → ε

Rule 1. FIRST(A a A b) ∩ FIRST(B b B a) = {a} ∩ {b} = ∅
Rule 2. Neither A a A b nor B b B a derives ε.
Rule 3. Not applicable, since neither alternative derives ε.

Grammar is LL(1).
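The check above can be sketched in code. Below is a minimal Python sketch (my illustration, not part of the slides): a recursive FIRST computation over a string of symbols, with "ε" marking the empty string, applied to the two alternatives of S. It assumes the grammar is not left-recursive, since the recursion would otherwise not terminate.

```python
# Sketch (not from the slides): FIRST over a symbol string, used to
# verify LL(1) condition 1 for S -> A a A b | B b B a.
GRAMMAR = {
    "S": [["A", "a", "A", "b"], ["B", "b", "B", "a"]],
    "A": [[]],  # A -> ε (empty right-hand side)
    "B": [[]],  # B -> ε
}

def first(symbols, grammar):
    """FIRST of a string of symbols; 'ε' marks the empty string.
    Assumption: the grammar is not left-recursive."""
    result = set()
    for X in symbols:
        if X not in grammar:          # X is a terminal
            result.add(X)
            return result
        firsts = set()
        for rhs in grammar[X]:        # union of FIRST over X's alternatives
            firsts |= first(rhs, grammar)
        result |= firsts - {"ε"}
        if "ε" not in firsts:         # X cannot vanish: stop here
            return result
    result.add("ε")                   # every symbol could derive ε
    return result

alpha = first(["A", "a", "A", "b"], GRAMMAR)   # {'a'}
beta = first(["B", "b", "B", "a"], GRAMMAR)    # {'b'}
print(alpha & beta)                            # prints set(): condition 1 holds
```

Because A and B only derive ε, FIRST of each alternative collapses to the first terminal in it, which is exactly the {a} ∩ {b} = ∅ computation above.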
Examples

S → 1 A B | ε
A → 1 A C | 0 C
B → 0 S
C → 1

Rule 1. FIRST(1 A B) ∩ FIRST(ε) = {1} ∩ {ε} = ∅
Rule 2. Only the alternative ε derives the empty string; 1 A B does not.
Rule 3. Since ε ⇒* ε, check FOLLOW(S): FOLLOW(S) = {$}, and
        FOLLOW(S) ∩ FIRST(1 A B) = {$} ∩ {1} = ∅

Grammar is LL(1).
ERROR RECOVERY
When do errors occur? Recall the predictive parser model: a stack (X Y Z ... $), an input buffer (a + b $), the predictive parsing program producing the output, and the parsing table M[A,a].

An error is detected:
1. if X, a terminal on top of the stack, does not match the current input symbol, or
2. if M[X, input] is empty — there is no allowable action.
Error Recovery in Predictive Parsing (cont.)
• Panic-mode error recovery
  • It is based on the idea of skipping symbols on the input until a token in a selected set of synchronizing tokens appears.
  • Its effectiveness depends on the choice of the synchronizing set.
Error Recovery in Predictive Parsing (cont.)
Some heuristics are as follows.
1. Place all symbols in FOLLOW(A) into the synchronizing set for nonterminal A. Skip tokens until an element of FOLLOW(A) is seen, then pop A from the stack; it is likely that parsing can continue.
2. There is a hierarchical structure on constructs in a language; e.g., expressions within blocks, and so on. We can add to the synchronizing set of a lower construct the symbols that begin higher constructs.
3. If we add symbols in FIRST(A) to the synchronizing set of nonterminal A, then it may be possible to resume parsing according to A if a symbol in FIRST(A) appears in the input.
4. If a nonterminal can generate the empty string, then the production deriving ε can be used as a default. Doing so may postpone some error detection, but cannot cause an error to be missed.
5. If a terminal on top of the stack cannot be matched, a simple idea is to pop the terminal, issue a message saying that the terminal was inserted, and continue parsing.
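Heuristics 1 and 5 can be made concrete with a small Python sketch (my illustration, not part of the slides) for the toy grammar S → ( S ) | a, whose FOLLOW(S) = { ), $ } entries are marked "sync"; the grammar, table, and message strings are assumptions chosen for the example.

```python
# Sketch (not from the slides): panic-mode recovery in a table-driven
# parser for the toy grammar S -> ( S ) | a.
def parse(tokens):
    TABLE = {
        ("S", "("): ["(", "S", ")"],
        ("S", "a"): ["a"],
        ("S", ")"): "sync",   # ) is in FOLLOW(S)
        ("S", "$"): "sync",   # $ is in FOLLOW(S)
    }
    stack = ["$", "S"]
    tokens = tokens + ["$"]
    i, errors = 0, []
    while stack[-1] != "$":
        top, a = stack[-1], tokens[i]
        if top != "S":                      # terminal on top of the stack
            if top == a:
                stack.pop(); i += 1         # match
            else:                           # heuristic 5: pop it, report insertion
                errors.append(f"inserted missing '{top}'")
                stack.pop()
        else:
            entry = TABLE.get((top, a))
            if entry == "sync":             # heuristic 1: pop the nonterminal
                errors.append(f"unexpected '{a}', discarding {top}")
                stack.pop()
            elif entry is None:             # empty entry: skip the input token
                errors.append(f"skipping '{a}'")
                i += 1
            else:                           # normal expansion
                stack.pop()
                stack.extend(reversed(entry))
    return errors
```

Parsing the erroneous input ( a recovers by pretending the missing ) was present, so parsing continues instead of aborting.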
Example

Add “sync” entries to the parsing table to indicate synchronizing tokens obtained from FOLLOW sets. The recovery rules: if M[A,a] is empty, skip the input symbol a; if M[A,a] = sync, pop A; if a token on top of the stack does not match the input, pop it.

STACK     INPUT           REMARK
$E        ) id * + id $   error, skip )
$E        id * + id $     id is in FIRST(E)
$E'T      id * + id $
$E'T'F    id * + id $
$E'T'id   id * + id $
$E'T'     * + id $
$E'T'F*   * + id $
$E'T'F    + id $          error, M[F,+] = sync, F has been popped
$E'T'     + id $
$E'       + id $
$E'T+     + id $
$E'T      id $
$E'T'F    id $
$E'T'id   id $
$E'T'     $
$E'       $
$         $
Bottom-Up Parsing
• Shift-reduce parsing is a general style of bottom-up parsing.
• It attempts to construct a parse tree for an input string beginning at the leaves and working up towards the root.
• At each reduction step a particular substring matching the right side of a production is replaced by the nonterminal on the left side of that production.
• If the substring is chosen correctly at each reduction step, a rightmost derivation is traced out in reverse.
Example
Consider the following grammar:
S → aABe
A → Abc | b
B → d
The sentence “a b b c d e” can be reduced to S by the following reduction steps:
1. abbcde   (A → b; handle at position 2)
2. aAbcde   (A → Abc)
3. aAde     (B → d)
4. aABe     (S → aABe)
5. S

Rightmost derivation: S ⇒rm aABe ⇒rm aAde ⇒rm aAbcde ⇒rm abbcde
Leftmost derivation:  S ⇒lm aABe ⇒lm aAbcBe ⇒lm abbcBe ⇒lm abbcde
Example
• The reductions trace out the rightmost derivation in reverse.
• The main task is to find the appropriate substring for reduction; this substring is known as a handle.
Handles
• A handle is a substring (of the string being parsed) that matches the right side of a production rule.
  • But not every substring that matches the right side of a production rule is a handle.
  • Reducing the handle to the nonterminal on the LHS must represent one step along the reverse of a rightmost derivation.
• Formal definition: a handle of a right-sentential form γ is a production rule A → β and a position of γ where the string β may be found and replaced by A to produce the previous right-sentential form in a rightmost derivation of γ:
  S ⇒*rm αAw ⇒rm αβw
Handles (cont.)
If S ⇒*rm αAw ⇒rm αβw, then A → β in the position following α is a handle of αβw.

Note:
1. The string w to the right of a handle contains only terminal symbols.
2. If a grammar is unambiguous, then every right-sentential form of the grammar has exactly one handle; otherwise, some right-sentential forms may have more than one handle.
Example
Consider the following ambiguous grammar:
E → E+E | E*E | (E) | id
Two rightmost derivations of id1+id2*id3:

1. E ⇒ E+E              2. E ⇒ E*E
     ⇒ E+E*E                 ⇒ E*id3
     ⇒ E+E*id3               ⇒ E+E*id3
     ⇒ E+id2*id3             ⇒ E+id2*id3
     ⇒ id1+id2*id3           ⇒ id1+id2*id3

There are two possible handles for the right-sentential form E + E * id3.
Handle Pruning
• A rightmost derivation in reverse can be obtained by handle pruning.

input string w = n-th right-sentential form:
S = γ0 ⇒rm γ1 ⇒rm γ2 ⇒rm ... ⇒rm γn-1 ⇒rm γn = w

• Start from γn: find a handle An → βn in γn, and replace βn by An to get γn-1.
• Then find a handle An-1 → βn-1 in γn-1, and replace βn-1 by An-1 to get γn-2.
• Repeat this until we reach S.
Handle Pruning

Disadvantages:
1. Locating the substring to be reduced in a right-sentential form.
2. Determining which production to choose when more than one production has that substring on its right-hand side.
Stack Implementation of Shift-Reduce Parsing
• A stack holds grammar symbols.
• An input buffer holds the input string w.
• We use $ to mark the bottom of the stack and the end of the input buffer.

Initial configuration:
STACK    INPUT
$        w$

• The parser shifts input symbols onto the stack until a handle β is on top of the stack.
• It then reduces β to A, the left side of the production A → β.
• It repeats this cycle until it hits an error, or until the stack contains S and the input buffer is empty:

STACK    INPUT
$S       $
Shift-Reduce Parsing
The handle always appears at the top of the stack just before it is identified as the handle.

The parser performs the following operations:
1. Shift: move a symbol from the input buffer onto the stack.
2. Reduce: if a handle appears on top of the stack, reduce it by the appropriate rule; the R.H.S. of the rule is popped off and the L.H.S. is pushed on.
3. Accept: if the stack contains only the start symbol and the input buffer is empty at the same time, that action is called accept.
4. Error: a situation in which the parser can neither shift nor reduce the symbols, and cannot perform the accept action.
EXAMPLE SHIFT-REDUCE PARSING
Consider the expression grammar, with the rule numbering used in the actions below:
1. E → E+T   2. E → T   3. T → T*F   4. T → F   5. F → (E)   6. F → id

Input: id1 + id2

Stack     Input        Action
$         id1 + id2$   shift
$id1      + id2$       reduce 6 (F → id)
$F        + id2$       reduce 4 (T → F)
$T        + id2$       reduce 2 (E → T)
$E        + id2$       shift
$E+       id2$         shift
$E+id2    $            reduce 6 (F → id)
$E+F      $            reduce 4 (T → F)
$E+T      $            reduce 1 (E → E+T)
$E        $            accept
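The moves above can be reproduced with a short Python sketch (my illustration, not part of the slides). It uses a deliberately naive strategy — reduce whenever some right-hand side matches the top of the stack, preferring the longest match — which happens to pick the correct handles for this input; a real parser decides shift vs. reduce with a parsing table, as later sections show.

```python
# Naive shift-reduce sketch (assumption: greedy longest-suffix reduction,
# which finds the right handles for this particular input).
RULES = [
    (1, "E", ["E", "+", "T"]),
    (2, "E", ["T"]),
    (3, "T", ["T", "*", "F"]),
    (4, "T", ["F"]),
    (5, "F", ["(", "E", ")"]),
    (6, "F", ["id"]),
]
BY_LENGTH = sorted(RULES, key=lambda r: -len(r[2]))  # longest RHS first

def shift_reduce(tokens):
    stack, actions = [], []
    pos = 0
    while True:
        # Reduce greedily while some RHS matches the top of the stack.
        reduced = True
        while reduced:
            reduced = False
            for num, head, rhs in BY_LENGTH:
                if len(stack) >= len(rhs) and stack[len(stack) - len(rhs):] == rhs:
                    del stack[len(stack) - len(rhs):]   # pop the handle
                    stack.append(head)                  # push the LHS
                    actions.append(f"reduce {num}")
                    reduced = True
                    break
        if pos == len(tokens):
            actions.append("accept" if stack == ["E"] else "error")
            return actions
        stack.append(tokens[pos])                       # shift
        actions.append("shift")
        pos += 1
```

Running it on id + id yields exactly the action column of the table above.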
SHIFT-REDUCE PARSING
• The handle will always appear on top of the stack, never inside it.
• Consider the possible forms of two successive steps in any rightmost derivation.

• CASE 1: S ⇒*rm αAz ⇒rm αβByz ⇒rm αβγyz

  STACK      INPUT
  $αβγ       yz$     handle γ is on top of the stack
  $αβB       yz$     after reducing the handle γ to B
  $αβBy      z$      after shifting y from the input
  $αA        z$      after reducing the handle βBy to A
SHIFT-REDUCE PARSING
• CASE 2: S ⇒*rm αBxAz ⇒rm αBxyz ⇒rm αγxyz

  STACK      INPUT
  $αγ        xyz$    handle γ is on top of the stack
  $αB        xyz$    after reducing the handle γ to B
  $αBxy      z$      after shifting x and y from the input
  $αBxA      z$      after reducing the handle y to A

• In both cases the handle appears on top of the stack.
CONFLICTS DURING SHIFT-REDUCE PARSING
• There are context-free grammars for which shift-reduce parsers cannot be used.
• The stack contents and the next input symbol may not decide the action:
  – shift/reduce conflict: whether to make a shift operation or a reduction.
  – reduce/reduce conflict: the parser cannot decide which of several reductions to make.
• If a shift-reduce parser cannot be used for a grammar, that grammar is called a non-LR(k) grammar.
  LR(k): L = left-to-right scanning, R = rightmost derivation in reverse, k = number of lookahead symbols.
• An ambiguous grammar can never be an LR grammar.
SHIFT-REDUCE CONFLICT IN AMBIGUOUS GRAMMAR
stmt → if expr then stmt
     | if expr then stmt else stmt
     | other

STACK                      INPUT
... if expr then stmt      else ... $

• We cannot decide whether to shift else or to reduce if expr then stmt to stmt.
• At each point: (i) should the parser shift or reduce? (ii) if it reduces, which rule should it use?
REDUCE-REDUCE CONFLICT IN AMBIGUOUS GRAMMAR
1. stmt → id ( parameter_list )
2. stmt → expr := expr
3. parameter_list → parameter_list , parameter
4. parameter_list → parameter
5. parameter_list → id
6. expr → id ( expr_list )
7. expr → id
8. expr_list → expr_list , expr
9. expr_list → expr

A statement such as p(i, j) reaches the parser as id(id, id):

STACK            INPUT
... id ( id      , id ) ... $

We cannot decide which production to use to reduce the id on top of the stack: parameter_list → id or expr → id.
SHIFT-REDUCE PARSERS
There are two main categories of shift-reduce parsers:

1. Operator-Precedence Parsers
   – simple, but handle only a small class of grammars.

2. LR Parsers
   – cover a wide range of grammars; the grammar classes nest as CFG ⊃ LR ⊃ LALR ⊃ SLR.
   • SLR – simple LR parser
   • CLR – canonical LR – the most general LR parser
   • LALR – intermediate LR parser (lookahead LR parser)
   – SLR, CLR and LALR parsers work the same way; only their parsing tables are different.
Operator Precedence Parser
Operator grammars are used for the implementation of operator-precedence parsers.

Operator grammar: a context-free grammar with the properties that
1. there are no ε-productions (A → ε), and
2. no right-hand side has two adjacent nonterminals.

Not an operator grammar (EAE has two adjacent nonterminals):
E → EAE | -E | (E) | id
A → + | - | * | /

The corresponding operator grammar:
E → E + E | E - E | E * E | E / E | -E | (E) | id
Precedence Relations
• If a has higher precedence than b: a ·> b
• If a has lower precedence than b: a <· b
• If a and b have equal precedence: a =· b

Note:
– id has higher precedence than any other symbol.
– $ has the lowest precedence.
– If two operators have equal precedence, we check the associativity of that particular operator.
Example
E → E + E | E * E | id

Precedence relations for this grammar (* has higher precedence than +; both are left-associative):

       id    +    *    $
  id        ·>   ·>   ·>
  +    <·   ·>   <·   ·>
  *    <·   ·>   ·>   ·>
  $    <·   <·   <·

Input string w: id + id * id

Insert the relations between consecutive terminals, then repeatedly reduce the handle found between <· and ·>:

$ <· id ·> + <· id ·> * <· id ·> $
$ E + id * id $
$ <· + <· id ·> * <· id ·> $
$ E + E * id $
$ <· + <· * <· id ·> $
$ E + E * E $
$ <· + <· * ·> $
$ E + E $
$ <· + ·> $
$ E $
Basic Principle
• Scan the input string left to right and try to detect ·>; put a pointer on its location.
• Now scan backwards over any =· until reaching <·.
• The string between <· and ·> is the handle, including any intervening or surrounding nonterminals.
• Replace the handle by the head of the respective production.
• Repeat until reaching the start symbol.
• Disadvantage: the entire right-sentential form needs to be scanned.
Implementation with Stack
• Use a stack to store the input symbols already seen; the precedence relations are used to guide the actions of the shift-reduce parser.
• Let a be the topmost terminal on the stack, and b the symbol pointed to by ip.

1. If a <· b or a =· b, then push b onto the stack and advance ip to the next input symbol (shift operation).
2. If a ·> b, then perform a reduce operation.
Example
Input string w: id + id * id
E → E + E | E * E | id
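The stack algorithm can be sketched in Python (my illustration, not part of the slides). The relation table is the standard one for this grammar; each reduction replaces a handle by the nonterminal E, and the sketch assumes well-formed input (no error handling).

```python
# Operator-precedence parsing sketch. REL[(a, b)] is the relation between
# the topmost stack terminal a and the lookahead b ('<' is <., '>' is .>).
REL = {
    ("$", "id"): "<", ("$", "+"): "<", ("$", "*"): "<",
    ("id", "+"): ">", ("id", "*"): ">", ("id", "$"): ">",
    ("+", "id"): "<", ("+", "+"): ">", ("+", "*"): "<", ("+", "$"): ">",
    ("*", "id"): "<", ("*", "+"): ">", ("*", "*"): ">", ("*", "$"): ">",
}

def top_terminal(stack):
    return next(s for s in reversed(stack) if s != "E")

def op_precedence_parse(tokens):
    stack = ["$"]
    tokens = tokens + ["$"]
    i, reductions = 0, 0
    while True:
        a, b = top_terminal(stack), tokens[i]
        if a == "$" and b == "$":
            return stack == ["$", "E"], reductions   # accept iff one E remains
        if REL[(a, b)] in ("<", "="):                # shift
            stack.append(b)
            i += 1
        else:                                        # a .> b: reduce
            # Pop the handle: pop until the terminal below is <.-related
            # to the most recently popped terminal.
            while True:
                popped = stack.pop()
                if popped != "E" and REL[(top_terminal(stack), popped)] == "<":
                    break
            if stack[-1] == "E":    # a leading nonterminal belongs to the handle
                stack.pop()
            stack.append("E")       # replace the handle by E
            reductions += 1
```

On id + id * id it performs five reductions (three id → E, then E * E → E, then E + E → E), matching the trace above.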
Precedence Functions
• An operator-precedence parser needs to store the precedence relation table.
• Instead, operator-precedence parsers can use a table encoded by two precedence functions f and g that map terminal symbols to integers.
• We attempt to select f and g so that, for symbols a and b:
1. f(a) < g(b) whenever a <· b
2. f(a) > g(b) whenever a ·> b
3. f(a) = g(b) whenever a =· b
Precedence Functions
Algorithm: constructing precedence functions
Input: an operator precedence matrix.
Output: precedence functions, or an indication that they do not exist.

1. Create symbols fa and ga for each grammar terminal a and for the end-of-string symbol.
2. Partition the symbols into groups so that fa and gb are in the same group if a =· b (there can be symbols in the same group even if they are not connected by this relation).
3. Create a directed graph whose nodes are the groups; for each pair of symbols a and b, place an edge from the group of gb to the group of fa if a <· b, and an edge from the group of fa to that of gb if a ·> b.
4. If the constructed graph has a cycle, then no precedence functions exist. If there are no cycles, let f(a) be the length of the longest path beginning at the group of fa, and g(a) the length of the longest path beginning at the group of ga.
Consider the precedence table for the grammar E → E + E | E * E | id. The algorithm builds a graph on the groups (nodes fid, gid, f+, g+, f*, g*, f$, g$); the longest path lengths give the precedence functions:

        id   +   *   $
   f     4   2   4   0
   g     5   1   3   0
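These values can be checked in code. A small Python sketch (my illustration, not part of the slides): encode f and g as dictionaries and recover each precedence relation by comparing integers, as in conditions 1–3 above.

```python
# The precedence functions from the table above.
f = {"id": 4, "+": 2, "*": 4, "$": 0}
g = {"id": 5, "+": 1, "*": 3, "$": 0}

def rel(a, b):
    """Recover the precedence relation between terminals a and b."""
    if f[a] < g[b]:
        return "<"      # a <. b
    if f[a] > g[b]:
        return ">"      # a .> b
    return "="          # equal values (here only f($) = g($))

# * binds tighter than +, and both are left-associative:
print(rel("+", "*"), rel("*", "+"), rel("+", "+"))   # prints: < > >
```

Two integer arrays thus replace the whole relation matrix, which is the point of the encoding.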
LR PARSERS
LR parsing is attractive because:
• It is the most general non-backtracking shift-reduce parsing method, yet it is still efficient.
• The class of grammars that can be parsed using LR methods is a proper superset of the class of grammars that can be parsed with predictive parsers:
  LL(1) grammars ⊂ LR(1) grammars
• An LR parser can detect a syntactic error as soon as it is possible to do so on a left-to-right scan of the input.
• LR parsers can be constructed to recognize virtually all programming-language constructs for which CFGs can be written.

Drawbacks of the LR method:
• It is too much work to construct an LR parser by hand.
• Fortunately, tools (LR parser generators) are available.
LL VS. LR
• LR (shift-reduce) is more powerful than LL (predictive parsing).
• LR can detect a syntactic error as soon as possible.
• LR is difficult to do by hand (unlike LL).
LR PARSING ALGORITHM
[Figure: LR parser model.] The parser reads input a1 ... ai ... an $ and maintains a stack of alternating states and grammar symbols, S0 X1 S1 ... Xm Sm, with Sm on top. The LR parsing program produces the output by consulting two tables: the ACTION table, indexed by states and terminals (including $), where each entry is one of four actions, and the GOTO table, indexed by states and nonterminals, where each entry is a state number.
CONSTRUCTING SLR PARSING TABLES – LR(0) ITEMS
• An LR(0) item of a grammar G is a production of G with a dot at some position of the right side.
• Ex: the production A → aBb gives four possible LR(0) items:
  A → ·aBb
  A → a·Bb
  A → aB·b
  A → aBb·
• Sets of LR(0) items will be the states of the ACTION and GOTO tables of the SLR parser.
  – States represent sets of “items”.
• The LR parser makes shift-reduce decisions by maintaining states that keep track of where we are in the parsing process.
CONSTRUCTING SLR PARSING TABLES – LR(0) ITEMS
• An item indicates how much of a production we have seen at a given point in the parsing process.
• The item A → X·YZ means:
  – we have already seen on the input a string derivable from X;
  – we hope to see a string derivable from YZ.
• The item A → ·XYZ means:
  – we hope to see a string derivable from XYZ.
• The item A → XYZ· means:
  – we have already seen on the input a string derivable from XYZ;
  – it is possibly time to reduce XYZ to A.
• Special case: the rule A → ε yields only one item, A → ·
CONSTRUCTING SLR PARSING TABLES
• A collection of sets of LR(0) items (the canonical LR(0) collection) is the basis for constructing SLR parsers.
• The canonical LR(0) collection provides the basis for constructing a DFA called the LR(0) automaton.
  – This DFA is used to make parsing decisions.
• Each state of the LR(0) automaton represents a set of items in the canonical LR(0) collection.
• To construct the canonical LR(0) collection for a grammar we need:
  – the augmented grammar,
  – the CLOSURE function,
  – the GOTO function.
AUGMENTED GRAMMAR
If G is a grammar with start symbol S, then G', the augmented grammar for G, is G with a new start symbol S' and an added production S' → S.

The grammar:
E → E+T | T
T → T*F | F
F → (E) | id

The augmented grammar:
E' → E
E → E+T | T
T → T*F | F
F → (E) | id
THE CLOSURE OPERATION
• If I is a set of LR(0) items for a grammar G, then closure(I) is the set of LR(0) items constructed from I by two rules:

1. Initially, every LR(0) item in I is added to closure(I).
2. If A → α·Bβ is in closure(I) and B → γ is a production rule of G, then B → ·γ will be in closure(I).
   We apply this rule until no more new LR(0) items can be added to closure(I).
THE CLOSURE OPERATION - EXAMPLE

For the augmented expression grammar E' → E, E → E+T | T, T → T*F | F, F → (E) | id:

closure({E' → ·E}) = { E' → ·E      (kernel item)
                       E → ·E+T
                       E → ·T
                       T → ·T*F
                       T → ·F
                       F → ·(E)
                       F → ·id }

Kernel items: the initial item S' → ·S and all items whose dots are not at the left end.
Non-kernel items: all items whose dots are at the left end.
GOTO OPERATION
If I is a set of items and X is a grammar symbol, then goto(I,X) is the closure of the set of all items [A → αX·β] such that [A → α·Xβ] is in I.

If I = {[E' → E·], [E → E·+T]}, then goto(I,+) consists of:
E → E+·T
T → ·T*F
T → ·F
F → ·(E)
F → ·id
CONSTRUCTION OF THE CANONICAL LR(0) COLLECTION (CC)
• To create the SLR parsing tables for a grammar G, we create the canonical LR(0) collection of the augmented grammar G'.

• Algorithm:
  C := { closure({S' → ·S}) }
  repeat until no more sets of LR(0) items can be added to C:
    for each I in C and each grammar symbol X:
      if GOTO(I,X) is not empty and not in C, add GOTO(I,X) to C

• The GOTO function is a DFA on the sets in C.
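The CLOSURE, GOTO, and collection-building steps can be sketched directly in Python (my illustration, not part of the slides), using the augmented expression grammar; items are (head, rhs, dot) tuples. For this grammar the loop discovers the twelve item sets I0–I11 computed in the example that follows.

```python
# Sketch (not from the slides): canonical LR(0) collection for the
# augmented grammar E' -> E, E -> E+T | T, T -> T*F | F, F -> (E) | id.
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}
SYMBOLS = {"E", "T", "F", "+", "*", "(", ")", "id"}

def closure(items):
    items = set(items)
    changed = True
    while changed:
        changed = False
        for head, rhs, dot in list(items):
            if dot < len(rhs) and rhs[dot] in GRAMMAR:  # dot before a nonterminal
                for prod in GRAMMAR[rhs[dot]]:
                    if (rhs[dot], prod, 0) not in items:
                        items.add((rhs[dot], prod, 0))
                        changed = True
    return frozenset(items)

def goto(I, X):
    # Advance the dot over X in every item where X follows the dot.
    return closure({(h, r, d + 1) for h, r, d in I if d < len(r) and r[d] == X})

def canonical_collection():
    C = {closure({("E'", ("E",), 0)})}
    changed = True
    while changed:
        changed = False
        for I in list(C):
            for X in SYMBOLS:
                J = goto(I, X)
                if J and J not in C:
                    C.add(J)
                    changed = True
    return C

print(len(canonical_collection()))   # prints 12: the item sets I0 ... I11
```

The frozensets make item sets hashable, so membership in C is a simple set test.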


Example
The augmented grammar:
E' → E
E → E+T | T
T → T*F | F
F → (E) | id

closure({[E' → ·E]}) = I0: E' → ·E
                           E → ·E+T
                           E → ·T
                           T → ·T*F
                           T → ·F
                           F → ·(E)
                           F → ·id

goto(I0, E) = I1: E' → E·
                  E → E·+T
Example (cont.)

goto(I0, T) = I2: E → T·
                  T → T·*F

goto(I0, F) = I3: T → F·

goto(I0, () = I4: F → (·E)
                  E → ·E+T
                  E → ·T
                  T → ·T*F
                  T → ·F
                  F → ·(E)
                  F → ·id

goto(I0, id) = I5: F → id·


Example (cont.)

goto(I1, +) = I6: E → E+·T
                  T → ·T*F
                  T → ·F
                  F → ·(E)
                  F → ·id

goto(I2, *) = I7: T → T*·F
                  F → ·(E)
                  F → ·id
Example (cont.)
goto(I4, E) = I8: F → (E·)
                  E → E·+T
goto(I4, T) = I2
goto(I4, F) = I3
goto(I4, () = I4
goto(I4, id) = I5

goto(I6, T) = I9: E → E+T·
                  T → T·*F
Example (cont.)
goto(I6, F) = I3
goto(I6, () = I4
goto(I6, id) = I5
goto(I7, F) = I10: T → T*F·
goto(I7, () = I4
goto(I7, id) = I5
goto(I8, )) = I11: F → (E)·
goto(I8, +) = I6
goto(I9, *) = I7
The Transition Diagram
The goto functions for the canonical collection of sets of items can be shown as a transition diagram.
The Parsing Table
State i is constructed from Ii.
1) The parsing actions for state i are determined as follows:
   a) If [A → α·aβ] is in Ii and goto(Ii, a) = Ij, set action[i,a] to “shift j”. Here a must be a terminal.
   b) If [A → α·] is in Ii, set action[i,a] to “reduce A → α” for all a in FOLLOW(A). Here A must not be S'.
   c) If [S' → S·] is in Ii, set action[i,$] to “accept”.
2) The goto transitions for state i are constructed for all nonterminals A using the rule: if goto(Ii, A) = Ij, then goto[i,A] = j.
3) All entries not defined by 1) and 2) are set to “error”.
4) The start state of the parser is the one constructed from the set of items containing [S' → ·S].
PARSING TABLES OF EXPRESSION GRAMMAR

            ACTION                      GOTO
state   id    +    *    (    )    $     E   T   F
0       s5              s4              1   2   3
1             s6                  acc
2             r2   s7        r2   r2
3             r4   r4        r4   r4
4       s5              s4              8   2   3
5             r6   r6        r6   r6
6       s5              s4                  9   3
7       s5              s4                      10
8             s6             s11
9             r1   s7        r1   r1
10            r3   r3        r3   r3
11            r5   r5        r5   r5

Key to notation: s4 = “shift the input symbol and push state 4”; r5 = “reduce by rule 5”; acc = accept; blank = syntax error.
A CONFIGURATION OF LR PARSING ALGORITHM
• A configuration of an LR parser is:

  ( S0 X1 S1 ... Xm Sm,  ai ai+1 ... an $ )
        stack               rest of input

• Sm and ai decide the parser action by consulting the parsing action table. (The initial stack contains just S0.)
• A configuration of an LR parser represents the right-sentential form:

  X1 ... Xm ai ai+1 ... an
ACTIONS OF A LR-PARSER
1. shift s: shift the next input symbol and the state s onto the stack
   ( S0 X1 S1 ... Xm Sm, ai ai+1 ... an $ ) ⊢ ( S0 X1 S1 ... Xm Sm ai s, ai+1 ... an $ )
2. reduce A → β (or rN, where N is a production number):
   – pop 2|β| = 2r items from the stack;
   – then push A and s, where s = goto[Sm-r, A]
   ( S0 X1 S1 ... Xm Sm, ai ai+1 ... an $ ) ⊢ ( S0 X1 S1 ... Xm-r Sm-r A s, ai ai+1 ... an $ )
   – the output is the reducing production A → β
3. accept: parsing successfully completed
4. error: the parser detected an error (an empty entry in the action table)
ACTIONS OF A LR-PARSER
Reduce Action
• Pop 2|β| = 2r items from the stack; let us assume that β = Y1Y2...Yr.
• Then push A and s, where s = goto[Sm-r, A]:

  ( S0 X1 S1 ... Xm-r Sm-r Y1 Sm-r+1 ... Yr Sm, ai ai+1 ... an $ )
  ⊢ ( S0 X1 S1 ... Xm-r Sm-r A s, ai ai+1 ... an $ )

• In fact, Y1Y2...Yr is a handle: X1 ... Xm-r A ai ... an is the previous right-sentential form, obtained from X1 ... Xm-r Y1...Yr ai ai+1 ... an by the reduction.
EXAMPLE 1: ACTIONS OF A (S)LR-PARSER

Input: id * id + id, parsed with the SLR parsing table of the expression grammar given above.

stack               input        action              output
0                   id*id+id$    shift 5
0 id 5              *id+id$      reduce by F → id    F → id
0 F 3               *id+id$      reduce by T → F     T → F
0 T 2               *id+id$      shift 7
0 T 2 * 7           id+id$       shift 5
0 T 2 * 7 id 5      +id$         reduce by F → id    F → id
0 T 2 * 7 F 10      +id$         reduce by T → T*F   T → T*F
0 T 2               +id$         reduce by E → T     E → T
0 E 1               +id$         shift 6
0 E 1 + 6           id$          shift 5
0 E 1 + 6 id 5      $            reduce by F → id    F → id
0 E 1 + 6 F 3       $            reduce by T → F     T → F
0 E 1 + 6 T 9       $            reduce by E → E+T   E → E+T
0 E 1               $            accept
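The trace above can be reproduced mechanically. Below is a Python sketch (my illustration, not part of the slides) of the LR driver loop, with the ACTION/GOTO tables transcribed from the SLR parsing table of the expression grammar; it returns the sequence of rule numbers used in reductions.

```python
# LR driver sketch. Tables transcribed from the SLR table above;
# rules: 1: E->E+T  2: E->T  3: T->T*F  4: T->F  5: F->(E)  6: F->id.
RULES = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3),
         4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}   # head, length of RHS

ACTION = {
    (0, "id"): "s5", (0, "("): "s4",
    (1, "+"): "s6", (1, "$"): "acc",
    (2, "*"): "s7", (9, "*"): "s7",
    (4, "id"): "s5", (4, "("): "s4",
    (6, "id"): "s5", (6, "("): "s4",
    (7, "id"): "s5", (7, "("): "s4",
    (8, "+"): "s6", (8, ")"): "s11",
}
for state, rule in [(2, 2), (3, 4), (5, 6), (9, 1), (10, 3), (11, 5)]:
    for sym in ("+", ")", "$") if state in (2, 9) else ("+", "*", ")", "$"):
        ACTION[(state, sym)] = f"r{rule}"
GOTO = {(0, "E"): 1, (0, "T"): 2, (0, "F"): 3, (4, "E"): 8, (4, "T"): 2,
        (4, "F"): 3, (6, "T"): 9, (6, "F"): 3, (7, "F"): 10}

def lr_parse(tokens):
    stack = [0]                      # state stack; grammar symbols are implicit
    tokens = tokens + ["$"]
    i, output = 0, []
    while True:
        act = ACTION.get((stack[-1], tokens[i]))
        if act is None:
            return None              # empty entry: syntax error
        if act == "acc":
            return output
        if act[0] == "s":            # shift: push the next state
            stack.append(int(act[1:]))
            i += 1
        else:                        # reduce by rule n: pop |rhs| states, goto
            n = int(act[1:])
            head, size = RULES[n]
            del stack[len(stack) - size:]
            stack.append(GOTO[(stack[-1], head)])
            output.append(n)
```

For id * id + id the returned rule sequence 6, 4, 6, 3, 2, 6, 4, 1 is exactly the output column of the trace.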
Example: an unambiguous grammar that is not SLR(1)
S → L=R
S → R
L → *R
L → id
R → L
Example: an unambiguous grammar that is not SLR(1) (cont.)

Yu-Chen Kuo
Consider the set of items I2:
I2: S → L·=R
    R → L·
Since goto(I2, =) exists, action[2, =] = “shift 6”.
Since = is in FOLLOW(R), the SLR rule also sets action[2, =] = “reduce R → L”.
• This is a shift/reduce conflict, but the grammar is not ambiguous.
• In fact, no right-sentential form begins with R = ..., so the reduction is wrong when the viable prefix is just L (it would be valid for the viable prefix *L).
• Remedy: split the state according to the lookahead information rather than the whole FOLLOW set.
Construction Canonical LR Parsing Tables
• An LR(1) item is of the form [A → α·β, a], where A → αβ is a production and a is a terminal or $.
• The “1” refers to the length of the second component, called the lookahead of the item.
• The lookahead has no effect in an item of the form [A → α·β, a] where β is not ε, but an item of the form [A → α·, a] calls for a reduction by A → α only if the next input symbol is a.
Construction Canonical LR Parsing Tables
• Thus, we are compelled to reduce by A → α only on those input symbols a for which [A → α·, a] is an LR(1) item in the state on top of the stack.
• The set of such a's will always be a subset of FOLLOW(A), but it could be a proper subset.
• The method for constructing the collection of sets of LR(1) items is essentially the same as the way we built the collection of sets of LR(0) items; we only need to modify the two procedures closure and goto.
Construction Canonical LR Parsing Tables (closure function)
Construction Canonical LR Parsing Tables (item & goto)
Example
Consider the following augmented grammar:
S' → S
S → CC
C → cC | d
Example (cont.)

closure({[S' → ·S, $]}) = I0: S' → ·S, $
                              S → ·CC, $
                              C → ·cC, c/d
                              C → ·d, c/d

goto(I0, S) = I1: S' → S·, $

goto(I0, C) = I2: S → C·C, $
                  C → ·cC, $
                  C → ·d, $

goto(I0, c) = I3: C → c·C, c/d
                  C → ·cC, c/d
                  C → ·d, c/d

goto(I0, d) = I4: C → d·, c/d

goto(I2, C) = I5: S → CC·, $

goto(I2, c) = I6: C → c·C, $
                  C → ·cC, $
                  C → ·d, $

goto(I2, d) = I7: C → d·, $

goto(I3, C) = I8: C → cC·, c/d
goto(I3, c) = I3
goto(I3, d) = I4

goto(I6, C) = I9: C → cC·, $
goto(I6, c) = I6
goto(I6, d) = I7
Example (Transition Diagram)
Compare I6 with I3: they contain the same productions but different lookaheads.
CONSTRUCTION OF LR(1) PARSING TABLES
1. Construct the canonical collection of sets of LR(1) items for G': C = {I0,...,In}.
2. Create the parsing action table as follows:
   • If a is a terminal, [A → α·aβ, b] is in Ii and goto(Ii,a) = Ij, then action[i,a] is shift j.
   • If [A → α·, a] is in Ii and A ≠ S', then action[i,a] is reduce A → α.
   • If [S' → S·, $] is in Ii, then action[i,$] is accept.
   • If any conflicting actions are generated by these rules, the grammar is not LR(1).
3. Create the parsing goto table: for all nonterminals A, if goto(Ii,A) = Ij, then goto[i,A] = j.
4. All entries not defined by (2) and (3) are errors.
5. The initial state of the parser is the one containing [S' → ·S, $].

Example (Parsing Table)
Canonical LR Parser vs. SLR Parser
Every SLR(1) grammar is an LR(1) grammar.
A canonical LR parser may have more states than the SLR parser built from the same grammar.

Exercise: check whether the following grammar is an LR(1) grammar.
S → L=R
S → R
L → *R
L → id
R → L
LALR Parsing Tables
• LALR stands for LookAhead LR.
• LALR parsers are often used in practice because LALR parsing tables are smaller than LR(1) parsing tables.
• The numbers of states in the SLR and LALR parsing tables for a grammar G are equal.
• But LALR parsers recognize more grammars than SLR parsers.
• A state of an LALR parser will again be a set of LR(1) items.
Creating LALR Parsing Tables

Canonical LR(1) parser → (shrink the number of states) → LALR parser

• This shrinking process may introduce a reduce/reduce conflict in the resulting LALR parser; in that case the grammar is NOT LALR.
• But this shrinking process cannot produce a shift/reduce conflict.
The Core of A Set of LR(1) Items
• The core of a set of LR(1) items is the set of its first components.
  Ex: the set { [S → L·=R, $], [R → L·, $] } has the core { S → L·=R, R → L· }.
• We find the states (sets of LR(1) items) in a canonical LR(1) parser with the same cores, and then merge them into a single state.
  Ex: I1: [L → id·, =] and I2: [L → id·, $] have the same core; merging them gives a new state I12: [L → id·, =/$].
• We do this for all states of the canonical LR(1) parser to get the states of the LALR parser.
• In fact, the number of states of the LALR parser for a grammar will be equal to the number of states of the SLR parser for that grammar.
Creation of LALR Parsing Tables
• Create the canonical LR(1) collection of the sets of LR(1) items for the given grammar.
• Find each core; find all sets having that same core; replace those sets by a single set which is their union:
  C = {I0,...,In} → C' = {J1,...,Jm}, where m ≤ n
• Create the parsing tables (action and goto) exactly as in the construction of the parsing tables of an LR(1) parser.
  – Note: if J = I1 ∪ ... ∪ Ik, then, since I1,...,Ik have the same cores, the cores of goto(I1,X),...,goto(Ik,X) must also be the same.
  – So goto(J,X) = K, where K is the union of all sets of items having the same core as goto(I1,X).
• If no conflict is introduced, the grammar is an LALR(1) grammar. (We may only introduce reduce/reduce conflicts; we cannot introduce a shift/reduce conflict.)
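The merge-by-core construction can be sketched in Python (my illustration, not part of the slides), using the grammar S' → S, S → CC, C → cC | d from the example that follows. LR(1) items are (head, rhs, dot, lookahead) tuples; since this grammar has no ε-productions, FIRST of a nonempty string is just FIRST of its first symbol.

```python
# Sketch: canonical LR(1) collection, then LALR merging by core.
GRAMMAR = {"S'": [("S",)], "S": [("C", "C")], "C": [("c", "C"), ("d",)]}

def first(symbols):
    """FIRST of a symbol string (works here: no ε-productions)."""
    X = symbols[0]
    if X not in GRAMMAR:
        return {X}                      # terminal (or $)
    out = set()
    for rhs in GRAMMAR[X]:
        out |= first(rhs)
    return out

def closure(items):
    items = set(items)
    changed = True
    while changed:
        changed = False
        for head, rhs, dot, la in list(items):
            if dot < len(rhs) and rhs[dot] in GRAMMAR:
                for prod in GRAMMAR[rhs[dot]]:
                    for b in first(rhs[dot + 1:] + (la,)):  # new lookaheads
                        if (rhs[dot], prod, 0, b) not in items:
                            items.add((rhs[dot], prod, 0, b))
                            changed = True
    return frozenset(items)

def goto(I, X):
    return closure({(h, r, d + 1, la) for h, r, d, la in I
                    if d < len(r) and r[d] == X})

def canonical_lr1():
    C = {closure({("S'", ("S",), 0, "$")})}
    changed = True
    while changed:
        changed = False
        for I in list(C):
            for X in {"S", "C", "c", "d"}:
                J = goto(I, X)
                if J and J not in C:
                    C.add(J)
                    changed = True
    return C

def lalr_merge(C):
    merged = {}
    for I in C:
        core = frozenset((h, r, d) for h, r, d, la in I)
        merged[core] = merged.get(core, frozenset()) | I   # union lookaheads
    return set(merged.values())

C = canonical_lr1()
print(len(C), len(lalr_merge(C)))   # prints: 10 7
```

The ten LR(1) states I0–I9 shrink to seven LALR states, because I3/I6, I4/I7, and I8/I9 share cores and are merged.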
Example
Grammar: S' → S, S → CC, C → cC | d

I0: S' → ·S, $
    S → ·CC, $
    C → ·cC, c/d
    C → ·d, c/d

goto(I0, C) = I2: S → C·C, $
                  C → ·cC, $
                  C → ·d, $
goto(I0, c) = I3: C → c·C, c/d
                  C → ·cC, c/d
                  C → ·d, c/d
goto(I0, d) = I4: C → d·, c/d
goto(I2, C) = I5: S → CC·, $
goto(I2, c) = I6: C → c·C, $
                  C → ·cC, $
                  C → ·d, $
goto(I2, d) = I7: C → d·, $
goto(I3, C) = I8: C → cC·, c/d
goto(I3, c) = I3
goto(I3, d) = I4
goto(I6, C) = I9: C → cC·, $
goto(I6, c) = I6
goto(I6, d) = I7

States with the same core are merged: I3 and I6 into I36, I4 and I7 into I47, I8 and I9 into I89.
LALR Parsing Table
QUESTIONS?