
TABLE DRIVEN PREDICTIVE PARSING

PREDICTIVE PARSING

RECONSTRUCTING THE PARSE TREE

Example (Stack Moves)
Input string: id + id * id

EXAMPLE: THE “DANGLING ELSE” GRAMMAR

LL(1) GRAMMAR

• LL(1) grammars:
  • are never ambiguous;
  • never have left recursion.
• Furthermore, if we are looking for an “A” and the next input symbol is “b”, then only one production must be applicable.
• Although elimination of left recursion and left factoring is easy, some grammars can never be transformed into LL(1) grammars.
PROPERTIES OF LL(1) GRAMMAR

• A grammar G is LL(1) if and only if, whenever A → α | β are two distinct productions of G, the following conditions hold:

1. For no terminal a do both α and β derive strings beginning with a.
2. At most one of α and β can derive the empty string.
3. If β ⇒* ε, then α does not derive any string beginning with a terminal in FOLLOW(A).
Examples

S → A a A b | B b B a
A → ε
B → ε

Rule 1. FIRST(A a A b) ∩ FIRST(B b B a) = {a} ∩ {b} = ∅
Rule 2. Neither A a A b nor B b B a derives ε.
Rule 3. Not applicable, since neither alternative derives ε.

Grammar is LL(1).
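The check above can be sketched in code. Below is a minimal Python sketch (my illustration, not part of the slides): a recursive FIRST computation over a string of symbols, with "ε" marking the empty string, applied to the two alternatives of S. It assumes the grammar is not left-recursive, since the recursion would otherwise not terminate.

```python
# Sketch (not from the slides): FIRST over a symbol string, used to
# verify LL(1) condition 1 for S -> A a A b | B b B a.
GRAMMAR = {
    "S": [["A", "a", "A", "b"], ["B", "b", "B", "a"]],
    "A": [[]],  # A -> ε (empty right-hand side)
    "B": [[]],  # B -> ε
}

def first(symbols, grammar):
    """FIRST of a string of symbols; 'ε' marks the empty string.
    Assumption: the grammar is not left-recursive."""
    result = set()
    for X in symbols:
        if X not in grammar:          # X is a terminal
            result.add(X)
            return result
        firsts = set()
        for rhs in grammar[X]:        # union of FIRST over X's alternatives
            firsts |= first(rhs, grammar)
        result |= firsts - {"ε"}
        if "ε" not in firsts:         # X cannot vanish: stop here
            return result
    result.add("ε")                   # every symbol could derive ε
    return result

alpha = first(["A", "a", "A", "b"], GRAMMAR)   # {'a'}
beta = first(["B", "b", "B", "a"], GRAMMAR)    # {'b'}
print(alpha & beta)                            # prints set(): condition 1 holds
```

Because A and B only derive ε, FIRST of each alternative collapses to the first terminal in it, which is exactly the {a} ∩ {b} = ∅ computation above.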
Examples

S → 1 A B | ε
A → 1 A C | 0 C
B → 0 S
C → 1

Rule 1. FIRST(1 A B) ∩ FIRST(ε) = {1} ∩ {ε} = ∅
Rule 2. Only the alternative ε derives the empty string; 1 A B does not.
Rule 3. Since ε ⇒* ε, check FOLLOW(S): FOLLOW(S) = {$}, and
        FOLLOW(S) ∩ FIRST(1 A B) = {$} ∩ {1} = ∅

Grammar is LL(1).
ERROR RECOVERY
When do errors occur? Recall the predictive parser model: a stack (X Y Z ... $), an input buffer (a + b $), the predictive parsing program producing the output, and the parsing table M[A,a].

An error is detected:
1. if X, a terminal on top of the stack, does not match the current input symbol, or
2. if M[X, input] is empty — there is no allowable action.
Error Recovery in Predictive Parsing (cont.)
• Panic-mode error recovery
  • It is based on the idea of skipping symbols on the input until a token in a selected set of synchronizing tokens appears.
  • Its effectiveness depends on the choice of the synchronizing set.
Error Recovery in Predictive Parsing (cont.)
Some heuristics are as follows.
1. Place all symbols in FOLLOW(A) into the synchronizing set for nonterminal A. Skip tokens until an element of FOLLOW(A) is seen, then pop A from the stack; it is likely that parsing can continue.
2. There is a hierarchical structure on constructs in a language; e.g., expressions within blocks, and so on. We can add to the synchronizing set of a lower construct the symbols that begin higher constructs.
3. If we add symbols in FIRST(A) to the synchronizing set of nonterminal A, then it may be possible to resume parsing according to A if a symbol in FIRST(A) appears in the input.
4. If a nonterminal can generate the empty string, then the production deriving ε can be used as a default. Doing so may postpone some error detection, but cannot cause an error to be missed.
5. If a terminal on top of the stack cannot be matched, a simple idea is to pop the terminal, issue a message saying that the terminal was inserted, and continue parsing.
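Heuristics 1 and 5 can be made concrete with a small Python sketch (my illustration, not part of the slides) for the toy grammar S → ( S ) | a, whose FOLLOW(S) = { ), $ } entries are marked "sync"; the grammar, table, and message strings are assumptions chosen for the example.

```python
# Sketch (not from the slides): panic-mode recovery in a table-driven
# parser for the toy grammar S -> ( S ) | a.
def parse(tokens):
    TABLE = {
        ("S", "("): ["(", "S", ")"],
        ("S", "a"): ["a"],
        ("S", ")"): "sync",   # ) is in FOLLOW(S)
        ("S", "$"): "sync",   # $ is in FOLLOW(S)
    }
    stack = ["$", "S"]
    tokens = tokens + ["$"]
    i, errors = 0, []
    while stack[-1] != "$":
        top, a = stack[-1], tokens[i]
        if top != "S":                      # terminal on top of the stack
            if top == a:
                stack.pop(); i += 1         # match
            else:                           # heuristic 5: pop it, report insertion
                errors.append(f"inserted missing '{top}'")
                stack.pop()
        else:
            entry = TABLE.get((top, a))
            if entry == "sync":             # heuristic 1: pop the nonterminal
                errors.append(f"unexpected '{a}', discarding {top}")
                stack.pop()
            elif entry is None:             # empty entry: skip the input token
                errors.append(f"skipping '{a}'")
                i += 1
            else:                           # normal expansion
                stack.pop()
                stack.extend(reversed(entry))
    return errors
```

Parsing the erroneous input ( a recovers by pretending the missing ) was present, so parsing continues instead of aborting.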
Example

Add “sync” entries to the parsing table to indicate synchronizing tokens obtained from FOLLOW sets. The recovery rules: if M[A,a] is empty, skip the input symbol a; if M[A,a] = sync, pop A; if a token on top of the stack does not match the input, pop it.

STACK     INPUT           REMARK
$E        ) id * + id $   error, skip )
$E        id * + id $     id is in FIRST(E)
$E'T      id * + id $
$E'T'F    id * + id $
$E'T'id   id * + id $
$E'T'     * + id $
$E'T'F*   * + id $
$E'T'F    + id $          error, M[F,+] = sync, F has been popped
$E'T'     + id $
$E'       + id $
$E'T+     + id $
$E'T      id $
$E'T'F    id $
$E'T'id   id $
$E'T'     $
$E'       $
$         $
Bottom-Up Parsing
• Shift-reduce parsing is a general style of bottom-up parsing.
• It attempts to construct a parse tree for an input string beginning at the leaves and working up towards the root.
• At each reduction step a particular substring matching the right side of a production is replaced by the nonterminal on the left side of that production.
• If the substring is chosen correctly at each reduction step, a rightmost derivation is traced out in reverse.
Example
Consider the following grammar:
S → aABe
A → Abc | b
B → d
The sentence “a b b c d e” can be reduced to S by the following reduction steps:
1. abbcde   (A → b; handle at position 2)
2. aAbcde   (A → Abc)
3. aAde     (B → d)
4. aABe     (S → aABe)
5. S

Rightmost derivation: S ⇒rm aABe ⇒rm aAde ⇒rm aAbcde ⇒rm abbcde
Leftmost derivation:  S ⇒lm aABe ⇒lm aAbcBe ⇒lm abbcBe ⇒lm abbcde
Example
• The reductions trace out the rightmost derivation in reverse.
• The main task is to find the appropriate substring for reduction; this substring is known as a handle.
Handles
• A handle is a substring (of the string being parsed) that matches the right side of a production rule.
  • But not every substring that matches the right side of a production rule is a handle.
  • Reducing the handle to the nonterminal on the LHS must represent one step along the reverse of a rightmost derivation.
• Formal definition: a handle of a right-sentential form γ is a production rule A → β and a position of γ where the string β may be found and replaced by A to produce the previous right-sentential form in a rightmost derivation of γ:
  S ⇒*rm αAw ⇒rm αβw
Handles (cont.)
If S ⇒*rm αAw ⇒rm αβw, then A → β in the position following α is a handle of αβw.

Note:
1. The string w to the right of a handle contains only terminal symbols.
2. If a grammar is unambiguous, then every right-sentential form of the grammar has exactly one handle; otherwise, some right-sentential forms may have more than one handle.
Example
Consider the following ambiguous grammar:
E → E+E | E*E | (E) | id
Two rightmost derivations of id1+id2*id3:

1. E ⇒ E+E              2. E ⇒ E*E
     ⇒ E+E*E                 ⇒ E*id3
     ⇒ E+E*id3               ⇒ E+E*id3
     ⇒ E+id2*id3             ⇒ E+id2*id3
     ⇒ id1+id2*id3           ⇒ id1+id2*id3

There are two possible handles for the right-sentential form E + E * id3.
Handle Pruning
• A rightmost derivation in reverse can be obtained by handle pruning.

input string w = n-th right-sentential form:
S = γ0 ⇒rm γ1 ⇒rm γ2 ⇒rm ... ⇒rm γn-1 ⇒rm γn = w

• Start from γn: find a handle An → βn in γn, and replace βn by An to get γn-1.
• Then find a handle An-1 → βn-1 in γn-1, and replace βn-1 by An-1 to get γn-2.
• Repeat this until we reach S.
Handle Pruning

Disadvantages:
1. Locating the substring to be reduced in a right-sentential form.
2. Determining which production to choose when more than one production has that substring on its right-hand side.
Stack Implementation of Shift-Reduce Parsing
• A stack holds grammar symbols.
• An input buffer holds the input string w.
• We use $ to mark the bottom of the stack and the end of the input buffer.

Initial configuration:
STACK    INPUT
$        w$

• The parser shifts input symbols onto the stack until a handle β is on top of the stack.
• It then reduces β to A, the left side of the production A → β.
• It repeats this cycle until it hits an error, or until the stack contains S and the input buffer is empty:

STACK    INPUT
$S       $
Shift-Reduce Parsing
The handle always appears at the top of the stack just before it is identified as the handle.

The parser performs the following operations:
1. Shift: move a symbol from the input buffer onto the stack.
2. Reduce: if a handle appears on top of the stack, reduce it by the appropriate rule; the R.H.S. of the rule is popped off and the L.H.S. is pushed on.
3. Accept: if the stack contains only the start symbol and the input buffer is empty at the same time, that action is called accept.
4. Error: a situation in which the parser can neither shift nor reduce the symbols, and cannot perform the accept action.
EXAMPLE SHIFT-REDUCE PARSING
Consider the expression grammar, with the rule numbering used in the actions below:
1. E → E+T   2. E → T   3. T → T*F   4. T → F   5. F → (E)   6. F → id

Input: id1 + id2

Stack     Input        Action
$         id1 + id2$   shift
$id1      + id2$       reduce 6 (F → id)
$F        + id2$       reduce 4 (T → F)
$T        + id2$       reduce 2 (E → T)
$E        + id2$       shift
$E+       id2$         shift
$E+id2    $            reduce 6 (F → id)
$E+F      $            reduce 4 (T → F)
$E+T      $            reduce 1 (E → E+T)
$E        $            accept
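The moves above can be reproduced with a short Python sketch (my illustration, not part of the slides). It uses a deliberately naive strategy — reduce whenever some right-hand side matches the top of the stack, preferring the longest match — which happens to pick the correct handles for this input; a real parser decides shift vs. reduce with a parsing table, as later sections show.

```python
# Naive shift-reduce sketch (assumption: greedy longest-suffix reduction,
# which finds the right handles for this particular input).
RULES = [
    (1, "E", ["E", "+", "T"]),
    (2, "E", ["T"]),
    (3, "T", ["T", "*", "F"]),
    (4, "T", ["F"]),
    (5, "F", ["(", "E", ")"]),
    (6, "F", ["id"]),
]
BY_LENGTH = sorted(RULES, key=lambda r: -len(r[2]))  # longest RHS first

def shift_reduce(tokens):
    stack, actions = [], []
    pos = 0
    while True:
        # Reduce greedily while some RHS matches the top of the stack.
        reduced = True
        while reduced:
            reduced = False
            for num, head, rhs in BY_LENGTH:
                if len(stack) >= len(rhs) and stack[len(stack) - len(rhs):] == rhs:
                    del stack[len(stack) - len(rhs):]   # pop the handle
                    stack.append(head)                  # push the LHS
                    actions.append(f"reduce {num}")
                    reduced = True
                    break
        if pos == len(tokens):
            actions.append("accept" if stack == ["E"] else "error")
            return actions
        stack.append(tokens[pos])                       # shift
        actions.append("shift")
        pos += 1
```

Running it on id + id yields exactly the action column of the table above.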
SHIFT-REDUCE PARSING
• The handle will always appear on top of the stack, never inside it.
• Consider the possible forms of two successive steps in any rightmost derivation.

• CASE 1: S ⇒*rm αAz ⇒rm αβByz ⇒rm αβγyz

  STACK      INPUT
  $αβγ       yz$     handle γ is on top of the stack
  $αβB       yz$     after reducing the handle γ to B
  $αβBy      z$      after shifting y from the input
  $αA        z$      after reducing the handle βBy to A
SHIFT-REDUCE PARSING
• CASE 2: S ⇒*rm αBxAz ⇒rm αBxyz ⇒rm αγxyz

  STACK      INPUT
  $αγ        xyz$    handle γ is on top of the stack
  $αB        xyz$    after reducing the handle γ to B
  $αBxy      z$      after shifting x and y from the input
  $αBxA      z$      after reducing the handle y to A

• In both cases the handle appears on top of the stack.
CONFLICTS DURING SHIFT-REDUCE PARSING
• There are context-free grammars for which shift-reduce parsers cannot be used.
• The stack contents and the next input symbol may not decide the action:
  – shift/reduce conflict: whether to make a shift operation or a reduction.
  – reduce/reduce conflict: the parser cannot decide which of several reductions to make.
• If a shift-reduce parser cannot be used for a grammar, that grammar is called a non-LR(k) grammar.
  LR(k): L = left-to-right scanning, R = rightmost derivation in reverse, k = number of lookahead symbols.
• An ambiguous grammar can never be an LR grammar.
SHIFT-REDUCE CONFLICT IN AMBIGUOUS GRAMMAR
stmt → if expr then stmt
     | if expr then stmt else stmt
     | other

STACK                      INPUT
... if expr then stmt      else ... $

• We cannot decide whether to shift else or to reduce if expr then stmt to stmt.
• At each point: (i) should the parser shift or reduce? (ii) if it reduces, which rule should it use?
REDUCE-REDUCE CONFLICT IN AMBIGUOUS GRAMMAR
1. stmt → id ( parameter_list )
2. stmt → expr := expr
3. parameter_list → parameter_list , parameter
4. parameter_list → parameter
5. parameter_list → id
6. expr → id ( expr_list )
7. expr → id
8. expr_list → expr_list , expr
9. expr_list → expr

A statement such as p(i, j) reaches the parser as id(id, id):

STACK            INPUT
... id ( id      , id ) ... $

We cannot decide which production to use to reduce the id on top of the stack: parameter_list → id or expr → id.
SHIFT-REDUCE PARSERS
There are two main categories of shift-reduce parsers:

1. Operator-Precedence Parsers
   – simple, but handle only a small class of grammars.

2. LR Parsers
   – cover a wide range of grammars; the grammar classes nest as CFG ⊃ LR ⊃ LALR ⊃ SLR.
   • SLR – simple LR parser
   • CLR – canonical LR – the most general LR parser
   • LALR – intermediate LR parser (lookahead LR parser)
   – SLR, CLR and LALR parsers work the same way; only their parsing tables are different.
Operator Precedence Parser
Operator grammars are used for the implementation of operator-precedence parsers.

Operator grammar: a context-free grammar with the properties that
1. there are no ε-productions (A → ε), and
2. no right-hand side has two adjacent nonterminals.

Not an operator grammar (EAE has two adjacent nonterminals):
E → EAE | -E | (E) | id
A → + | - | * | /

The corresponding operator grammar:
E → E + E | E - E | E * E | E / E | -E | (E) | id
Precedence Relations
• If a has higher precedence than b: a ·> b
• If a has lower precedence than b: a <· b
• If a and b have equal precedence: a =· b

Note:
– id has higher precedence than any other symbol.
– $ has the lowest precedence.
– If two operators have equal precedence, we check the associativity of that particular operator.
Example
E → E + E | E * E | id

Precedence relations for this grammar (* has higher precedence than +; both are left-associative):

       id    +    *    $
  id        ·>   ·>   ·>
  +    <·   ·>   <·   ·>
  *    <·   ·>   ·>   ·>
  $    <·   <·   <·

Input string w: id + id * id

Insert the relations between consecutive terminals, then repeatedly reduce the handle found between <· and ·>:

$ <· id ·> + <· id ·> * <· id ·> $
$ E + id * id $
$ <· + <· id ·> * <· id ·> $
$ E + E * id $
$ <· + <· * <· id ·> $
$ E + E * E $
$ <· + <· * ·> $
$ E + E $
$ <· + ·> $
$ E $
Basic Principle
• Scan the input string left to right and try to detect ·>; put a pointer on its location.
• Now scan backwards over any =· until reaching <·.
• The string between <· and ·> is the handle, including any intervening or surrounding nonterminals.
• Replace the handle by the head of the respective production.
• Repeat until reaching the start symbol.
• Disadvantage: the entire right-sentential form needs to be scanned.
Implementation with Stack
• Use a stack to store the input symbols already seen; the precedence relations are used to guide the actions of the shift-reduce parser.
• Let a be the topmost terminal on the stack, and b the symbol pointed to by ip.

1. If a <· b or a =· b, then push b onto the stack and advance ip to the next input symbol (shift operation).
2. If a ·> b, then perform a reduce operation.
Example
Input string w: id + id * id
E → E + E | E * E | id
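The stack algorithm can be sketched in Python (my illustration, not part of the slides). The relation table is the standard one for this grammar; each reduction replaces a handle by the nonterminal E, and the sketch assumes well-formed input (no error handling).

```python
# Operator-precedence parsing sketch. REL[(a, b)] is the relation between
# the topmost stack terminal a and the lookahead b ('<' is <., '>' is .>).
REL = {
    ("$", "id"): "<", ("$", "+"): "<", ("$", "*"): "<",
    ("id", "+"): ">", ("id", "*"): ">", ("id", "$"): ">",
    ("+", "id"): "<", ("+", "+"): ">", ("+", "*"): "<", ("+", "$"): ">",
    ("*", "id"): "<", ("*", "+"): ">", ("*", "*"): ">", ("*", "$"): ">",
}

def top_terminal(stack):
    return next(s for s in reversed(stack) if s != "E")

def op_precedence_parse(tokens):
    stack = ["$"]
    tokens = tokens + ["$"]
    i, reductions = 0, 0
    while True:
        a, b = top_terminal(stack), tokens[i]
        if a == "$" and b == "$":
            return stack == ["$", "E"], reductions   # accept iff one E remains
        if REL[(a, b)] in ("<", "="):                # shift
            stack.append(b)
            i += 1
        else:                                        # a .> b: reduce
            # Pop the handle: pop until the terminal below is <.-related
            # to the most recently popped terminal.
            while True:
                popped = stack.pop()
                if popped != "E" and REL[(top_terminal(stack), popped)] == "<":
                    break
            if stack[-1] == "E":    # a leading nonterminal belongs to the handle
                stack.pop()
            stack.append("E")       # replace the handle by E
            reductions += 1
```

On id + id * id it performs five reductions (three id → E, then E * E → E, then E + E → E), matching the trace above.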
Precedence Functions
• An operator-precedence parser needs to store the precedence relation table.
• Instead, operator-precedence parsers can use a table encoded by two precedence functions f and g that map terminal symbols to integers.
• We attempt to select f and g so that, for symbols a and b:
1. f(a) < g(b) whenever a <· b
2. f(a) > g(b) whenever a ·> b
3. f(a) = g(b) whenever a =· b
Precedence Functions
Algorithm: constructing precedence functions
Input: an operator precedence matrix.
Output: precedence functions, or an indication that they do not exist.

1. Create symbols fa and ga for each grammar terminal a and for the end-of-string symbol.
2. Partition the symbols into groups so that fa and gb are in the same group if a =· b (there can be symbols in the same group even if they are not connected by this relation).
3. Create a directed graph whose nodes are the groups; for each pair of symbols a and b, place an edge from the group of gb to the group of fa if a <· b, and an edge from the group of fa to that of gb if a ·> b.
4. If the constructed graph has a cycle, then no precedence functions exist. If there are no cycles, let f(a) be the length of the longest path beginning at the group of fa, and g(a) the length of the longest path beginning at the group of ga.
Consider the precedence table for the grammar E → E + E | E * E | id. The algorithm builds a graph on the groups (nodes fid, gid, f+, g+, f*, g*, f$, g$); the longest path lengths give the precedence functions:

        id   +   *   $
   f     4   2   4   0
   g     5   1   3   0
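These values can be checked in code. A small Python sketch (my illustration, not part of the slides): encode f and g as dictionaries and recover each precedence relation by comparing integers, as in conditions 1–3 above.

```python
# The precedence functions from the table above.
f = {"id": 4, "+": 2, "*": 4, "$": 0}
g = {"id": 5, "+": 1, "*": 3, "$": 0}

def rel(a, b):
    """Recover the precedence relation between terminals a and b."""
    if f[a] < g[b]:
        return "<"      # a <. b
    if f[a] > g[b]:
        return ">"      # a .> b
    return "="          # equal values (here only f($) = g($))

# * binds tighter than +, and both are left-associative:
print(rel("+", "*"), rel("*", "+"), rel("+", "+"))   # prints: < > >
```

Two integer arrays thus replace the whole relation matrix, which is the point of the encoding.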
LR PARSERS
LR parsing is attractive because:
• It is the most general non-backtracking shift-reduce parsing method, yet it is still efficient.
• The class of grammars that can be parsed using LR methods is a proper superset of the class of grammars that can be parsed with predictive parsers:
  LL(1) grammars ⊂ LR(1) grammars
• An LR parser can detect a syntactic error as soon as it is possible to do so on a left-to-right scan of the input.
• LR parsers can be constructed to recognize virtually all programming-language constructs for which CFGs can be written.

Drawbacks of the LR method:
• It is too much work to construct an LR parser by hand.
• Fortunately, tools (LR parser generators) are available.
LL VS. LR
• LR (shift-reduce) is more powerful than LL (predictive parsing).
• LR can detect a syntactic error as soon as possible.
• LR is difficult to do by hand (unlike LL).
LR PARSING ALGORITHM
[Figure: LR parser model.] The parser reads input a1 ... ai ... an $ and maintains a stack of alternating states and grammar symbols, S0 X1 S1 ... Xm Sm, with Sm on top. The LR parsing program produces the output by consulting two tables: the ACTION table, indexed by states and terminals (including $), where each entry is one of four actions, and the GOTO table, indexed by states and nonterminals, where each entry is a state number.
CONSTRUCTING SLR PARSING TABLES – LR(0) ITEMS
• An LR(0) item of a grammar G is a production of G with a dot at some position of the right side.
• Ex: the production A → aBb gives four possible LR(0) items:
  A → ·aBb
  A → a·Bb
  A → aB·b
  A → aBb·
• Sets of LR(0) items will be the states of the ACTION and GOTO tables of the SLR parser.
  – States represent sets of “items”.
• The LR parser makes shift-reduce decisions by maintaining states that keep track of where we are in the parsing process.
CONSTRUCTING SLR PARSING TABLES – LR(0) ITEMS
• An item indicates how much of a production we have seen at a given point in the parsing process.
• The item A → X·YZ means:
  – we have already seen on the input a string derivable from X;
  – we hope to see a string derivable from YZ.
• The item A → ·XYZ means:
  – we hope to see a string derivable from XYZ.
• The item A → XYZ· means:
  – we have already seen on the input a string derivable from XYZ;
  – it is possibly time to reduce XYZ to A.
• Special case: the rule A → ε yields only one item, A → ·
CONSTRUCTING SLR PARSING TABLES
• A collection of sets of LR(0) items (the canonical LR(0) collection) is the basis for constructing SLR parsers.
• The canonical LR(0) collection provides the basis for constructing a DFA called the LR(0) automaton.
  – This DFA is used to make parsing decisions.
• Each state of the LR(0) automaton represents a set of items in the canonical LR(0) collection.
• To construct the canonical LR(0) collection for a grammar we need:
  – the augmented grammar,
  – the CLOSURE function,
  – the GOTO function.
AUGMENTED GRAMMAR
If G is a grammar with start symbol S, then G', the augmented grammar for G, is G with a new start symbol S' and an added production S' → S.

The grammar:
E → E+T | T
T → T*F | F
F → (E) | id

The augmented grammar:
E' → E
E → E+T | T
T → T*F | F
F → (E) | id
THE CLOSURE OPERATION
• If I is a set of LR(0) items for a grammar G, then closure(I) is the set of LR(0) items constructed from I by two rules:

1. Initially, every LR(0) item in I is added to closure(I).
2. If A → α·Bβ is in closure(I) and B → γ is a production rule of G, then B → ·γ will be in closure(I).
   We apply this rule until no more new LR(0) items can be added to closure(I).
THE CLOSURE OPERATION - EXAMPLE

For the augmented expression grammar E' → E, E → E+T | T, T → T*F | F, F → (E) | id:

closure({E' → ·E}) = { E' → ·E      (kernel item)
                       E → ·E+T
                       E → ·T
                       T → ·T*F
                       T → ·F
                       F → ·(E)
                       F → ·id }

Kernel items: the initial item S' → ·S and all items whose dots are not at the left end.
Non-kernel items: all items whose dots are at the left end.
GOTO OPERATION
If I is a set of items and X is a grammar symbol, then goto(I,X) is the closure of the set of all items [A → αX·β] such that [A → α·Xβ] is in I.

If I = {[E' → E·], [E → E·+T]}, then goto(I,+) consists of:
E → E+·T
T → ·T*F
T → ·F
F → ·(E)
F → ·id
CONSTRUCTION OF THE CANONICAL LR(0) COLLECTION (CC)
• To create the SLR parsing tables for a grammar G, we create the canonical LR(0) collection of the augmented grammar G'.

• Algorithm:
  C := { closure({S' → ·S}) }
  repeat until no more sets of LR(0) items can be added to C:
    for each I in C and each grammar symbol X:
      if GOTO(I,X) is not empty and not in C, add GOTO(I,X) to C

• The GOTO function is a DFA on the sets in C.
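The CLOSURE, GOTO, and collection-building steps can be sketched directly in Python (my illustration, not part of the slides), using the augmented expression grammar; items are (head, rhs, dot) tuples. For this grammar the loop discovers the twelve item sets I0–I11 computed in the example that follows.

```python
# Sketch (not from the slides): canonical LR(0) collection for the
# augmented grammar E' -> E, E -> E+T | T, T -> T*F | F, F -> (E) | id.
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}
SYMBOLS = {"E", "T", "F", "+", "*", "(", ")", "id"}

def closure(items):
    items = set(items)
    changed = True
    while changed:
        changed = False
        for head, rhs, dot in list(items):
            if dot < len(rhs) and rhs[dot] in GRAMMAR:  # dot before a nonterminal
                for prod in GRAMMAR[rhs[dot]]:
                    if (rhs[dot], prod, 0) not in items:
                        items.add((rhs[dot], prod, 0))
                        changed = True
    return frozenset(items)

def goto(I, X):
    # Advance the dot over X in every item where X follows the dot.
    return closure({(h, r, d + 1) for h, r, d in I if d < len(r) and r[d] == X})

def canonical_collection():
    C = {closure({("E'", ("E",), 0)})}
    changed = True
    while changed:
        changed = False
        for I in list(C):
            for X in SYMBOLS:
                J = goto(I, X)
                if J and J not in C:
                    C.add(J)
                    changed = True
    return C

print(len(canonical_collection()))   # prints 12: the item sets I0 ... I11
```

The frozensets make item sets hashable, so membership in C is a simple set test.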


Example
The augmented grammar:
E' → E
E → E+T | T
T → T*F | F
F → (E) | id

closure({[E' → ·E]}) = I0: E' → ·E
                           E → ·E+T
                           E → ·T
                           T → ·T*F
                           T → ·F
                           F → ·(E)
                           F → ·id

goto(I0, E) = I1: E' → E·
                  E → E·+T
Example (cont.)

goto(I0, T) = I2: E → T·
                  T → T·*F

goto(I0, F) = I3: T → F·

goto(I0, () = I4: F → (·E)
                  E → ·E+T
                  E → ·T
                  T → ·T*F
                  T → ·F
                  F → ·(E)
                  F → ·id

goto(I0, id) = I5: F → id·


Example (cont.)

goto(I1, +) = I6: E → E+·T
                  T → ·T*F
                  T → ·F
                  F → ·(E)
                  F → ·id

goto(I2, *) = I7: T → T*·F
                  F → ·(E)
                  F → ·id
Example (cont.)
goto(I4, E) = I8: F → (E·)
                  E → E·+T
goto(I4, T) = I2
goto(I4, F) = I3
goto(I4, () = I4
goto(I4, id) = I5

goto(I6, T) = I9: E → E+T·
                  T → T·*F
Example (cont.)
goto(I6, F) = I3
goto(I6, () = I4
goto(I6, id) = I5
goto(I7, F) = I10: T → T*F·
goto(I7, () = I4
goto(I7, id) = I5
goto(I8, )) = I11: F → (E)·
goto(I8, +) = I6
goto(I9, *) = I7
The Transition Diagram
The goto functions for the canonical collection of sets of items can be shown as a transition diagram.
The Parsing Table
State i is constructed from Ii.
1) The parsing actions for state i are determined as follows:
   a) If [A → α·aβ] is in Ii and goto(Ii, a) = Ij, set action[i,a] to “shift j”. Here a must be a terminal.
   b) If [A → α·] is in Ii, set action[i,a] to “reduce A → α” for all a in FOLLOW(A). Here A must not be S'.
   c) If [S' → S·] is in Ii, set action[i,$] to “accept”.
2) The goto transitions for state i are constructed for all nonterminals A using the rule: if goto(Ii, A) = Ij, then goto[i,A] = j.
3) All entries not defined by 1) and 2) are set to “error”.
4) The start state of the parser is the one constructed from the set of items containing [S' → ·S].
PARSING TABLES OF EXPRESSION GRAMMAR

            ACTION                      GOTO
state   id    +    *    (    )    $     E   T   F
0       s5              s4              1   2   3
1             s6                  acc
2             r2   s7        r2   r2
3             r4   r4        r4   r4
4       s5              s4              8   2   3
5             r6   r6        r6   r6
6       s5              s4                  9   3
7       s5              s4                      10
8             s6             s11
9             r1   s7        r1   r1
10            r3   r3        r3   r3
11            r5   r5        r5   r5

Key to notation: s4 = “shift the input symbol and push state 4”; r5 = “reduce by rule 5”; acc = accept; blank = syntax error.
A CONFIGURATION OF LR PARSING ALGORITHM
• A configuration of an LR parser is:

  ( S0 X1 S1 ... Xm Sm,  ai ai+1 ... an $ )
        stack               rest of input

• Sm and ai decide the parser action by consulting the parsing action table. (The initial stack contains just S0.)
• A configuration of an LR parser represents the right-sentential form:

  X1 ... Xm ai ai+1 ... an
ACTIONS OF A LR-PARSER
1. shift s: shift the next input symbol and the state s onto the stack
   ( S0 X1 S1 ... Xm Sm, ai ai+1 ... an $ ) ⊢ ( S0 X1 S1 ... Xm Sm ai s, ai+1 ... an $ )
2. reduce A → β (or rN, where N is a production number):
   – pop 2|β| = 2r items from the stack;
   – then push A and s, where s = goto[Sm-r, A]
   ( S0 X1 S1 ... Xm Sm, ai ai+1 ... an $ ) ⊢ ( S0 X1 S1 ... Xm-r Sm-r A s, ai ai+1 ... an $ )
   – the output is the reducing production A → β
3. accept: parsing successfully completed
4. error: the parser detected an error (an empty entry in the action table)
ACTIONS OF A LR-PARSER
Reduce Action
• Pop 2|β| = 2r items from the stack; let us assume that β = Y1Y2...Yr.
• Then push A and s, where s = goto[Sm-r, A]:

  ( S0 X1 S1 ... Xm-r Sm-r Y1 Sm-r+1 ... Yr Sm, ai ai+1 ... an $ )
  ⊢ ( S0 X1 S1 ... Xm-r Sm-r A s, ai ai+1 ... an $ )

• In fact, Y1Y2...Yr is a handle: X1 ... Xm-r A ai ... an is the previous right-sentential form, obtained from X1 ... Xm-r Y1...Yr ai ai+1 ... an by the reduction.
EXAMPLE 1: ACTIONS OF A (S)LR-PARSER

Input: id * id + id, parsed with the SLR parsing table of the expression grammar given above.

stack               input        action              output
0                   id*id+id$    shift 5
0 id 5              *id+id$      reduce by F → id    F → id
0 F 3               *id+id$      reduce by T → F     T → F
0 T 2               *id+id$      shift 7
0 T 2 * 7           id+id$       shift 5
0 T 2 * 7 id 5      +id$         reduce by F → id    F → id
0 T 2 * 7 F 10      +id$         reduce by T → T*F   T → T*F
0 T 2               +id$         reduce by E → T     E → T
0 E 1               +id$         shift 6
0 E 1 + 6           id$          shift 5
0 E 1 + 6 id 5      $            reduce by F → id    F → id
0 E 1 + 6 F 3       $            reduce by T → F     T → F
0 E 1 + 6 T 9       $            reduce by E → E+T   E → E+T
0 E 1               $            accept
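The trace above can be reproduced mechanically. Below is a Python sketch (my illustration, not part of the slides) of the LR driver loop, with the ACTION/GOTO tables transcribed from the SLR parsing table of the expression grammar; it returns the sequence of rule numbers used in reductions.

```python
# LR driver sketch. Tables transcribed from the SLR table above;
# rules: 1: E->E+T  2: E->T  3: T->T*F  4: T->F  5: F->(E)  6: F->id.
RULES = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3),
         4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}   # head, length of RHS

ACTION = {
    (0, "id"): "s5", (0, "("): "s4",
    (1, "+"): "s6", (1, "$"): "acc",
    (2, "*"): "s7", (9, "*"): "s7",
    (4, "id"): "s5", (4, "("): "s4",
    (6, "id"): "s5", (6, "("): "s4",
    (7, "id"): "s5", (7, "("): "s4",
    (8, "+"): "s6", (8, ")"): "s11",
}
for state, rule in [(2, 2), (3, 4), (5, 6), (9, 1), (10, 3), (11, 5)]:
    for sym in ("+", ")", "$") if state in (2, 9) else ("+", "*", ")", "$"):
        ACTION[(state, sym)] = f"r{rule}"
GOTO = {(0, "E"): 1, (0, "T"): 2, (0, "F"): 3, (4, "E"): 8, (4, "T"): 2,
        (4, "F"): 3, (6, "T"): 9, (6, "F"): 3, (7, "F"): 10}

def lr_parse(tokens):
    stack = [0]                      # state stack; grammar symbols are implicit
    tokens = tokens + ["$"]
    i, output = 0, []
    while True:
        act = ACTION.get((stack[-1], tokens[i]))
        if act is None:
            return None              # empty entry: syntax error
        if act == "acc":
            return output
        if act[0] == "s":            # shift: push the next state
            stack.append(int(act[1:]))
            i += 1
        else:                        # reduce by rule n: pop |rhs| states, goto
            n = int(act[1:])
            head, size = RULES[n]
            del stack[len(stack) - size:]
            stack.append(GOTO[(stack[-1], head)])
            output.append(n)
```

For id * id + id the returned rule sequence 6, 4, 6, 3, 2, 6, 4, 1 is exactly the output column of the trace.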
Example: an unambiguous grammar that is not SLR(1)
S → L=R
S → R
L → *R
L → id
R → L
Example: an unambiguous grammar that is not SLR(1) (cont.)

Yu-Chen Kuo
Consider the set of items I2:
I2: S → L·=R
    R → L·
Since goto(I2, =) exists, action[2, =] = “shift 6”.
Since = is in FOLLOW(R), the SLR rule also sets action[2, =] = “reduce R → L”.
• This is a shift/reduce conflict, but the grammar is not ambiguous.
• In fact, no right-sentential form begins with R = ..., so the reduction is wrong when the viable prefix is just L (it would be valid for the viable prefix *L).
• Remedy: split the state according to the lookahead information rather than the whole FOLLOW set.
Construction Canonical LR Parsing Tables
• An LR(1) item is of the form [A → α·β, a], where A → αβ is a production and a is a terminal or $.
• The “1” refers to the length of the second component, called the lookahead of the item.
• The lookahead has no effect in an item of the form [A → α·β, a] where β is not ε, but an item of the form [A → α·, a] calls for a reduction by A → α only if the next input symbol is a.
Construction Canonical LR Parsing Tables
• Thus, we are compelled to reduce by A → α only on those input symbols a for which [A → α·, a] is an LR(1) item in the state on top of the stack.
• The set of such a's will always be a subset of FOLLOW(A), but it could be a proper subset.
• The method for constructing the collection of sets of LR(1) items is essentially the same as the way we built the collection of sets of LR(0) items; we only need to modify the two procedures closure and goto.
Construction Canonical LR Parsing Tables (closure function)
Construction Canonical LR Parsing Tables (item & goto)
Example
Consider the following augmented grammar:
S' → S
S → CC
C → cC | d
Example (cont.)

closure({[S' → ·S, $]}) = I0: S' → ·S, $
                              S → ·CC, $
                              C → ·cC, c/d
                              C → ·d, c/d

goto(I0, S) = I1: S' → S·, $

goto(I0, C) = I2: S → C·C, $
                  C → ·cC, $
                  C → ·d, $

goto(I0, c) = I3: C → c·C, c/d
                  C → ·cC, c/d
                  C → ·d, c/d

goto(I0, d) = I4: C → d·, c/d

goto(I2, C) = I5: S → CC·, $

goto(I2, c) = I6: C → c·C, $
                  C → ·cC, $
                  C → ·d, $

goto(I2, d) = I7: C → d·, $

goto(I3, C) = I8: C → cC·, c/d
goto(I3, c) = I3
goto(I3, d) = I4

goto(I6, C) = I9: C → cC·, $
goto(I6, c) = I6
goto(I6, d) = I7
Example (Transition Diagram)
Compare I6 with I3: they contain the same productions but different lookaheads.
CONSTRUCTION OF LR(1) PARSING TABLES
1. Construct the canonical collection of sets of LR(1) items for G': C = {I0,...,In}.
2. Create the parsing action table as follows:
   • If a is a terminal, [A → α·aβ, b] is in Ii and goto(Ii,a) = Ij, then action[i,a] is shift j.
   • If [A → α·, a] is in Ii and A ≠ S', then action[i,a] is reduce A → α.
   • If [S' → S·, $] is in Ii, then action[i,$] is accept.
   • If any conflicting actions are generated by these rules, the grammar is not LR(1).
3. Create the parsing goto table: for all nonterminals A, if goto(Ii,A) = Ij, then goto[i,A] = j.
4. All entries not defined by (2) and (3) are errors.
5. The initial state of the parser is the one containing [S' → ·S, $].

Example (Parsing Table)
Canonical LR Parser vs. SLR Parser
Every SLR(1) grammar is an LR(1) grammar.
A canonical LR parser may have more states than the SLR parser built from the same grammar.

Exercise: check whether the following grammar is an LR(1) grammar.
S → L=R
S → R
L → *R
L → id
R → L
LALR Parsing Tables
• LALR stands for LookAhead LR.
• LALR parsers are often used in practice because LALR parsing tables are smaller than LR(1) parsing tables.
• The numbers of states in the SLR and LALR parsing tables for a grammar G are equal.
• But LALR parsers recognize more grammars than SLR parsers.
• A state of an LALR parser will again be a set of LR(1) items.
Creating LALR Parsing Tables

Canonical LR(1) parser → (shrink the number of states) → LALR parser

• This shrinking process may introduce a reduce/reduce conflict in the resulting LALR parser; in that case the grammar is NOT LALR.
• But this shrinking process cannot produce a shift/reduce conflict.
The Core of A Set of LR(1) Items
• The core of a set of LR(1) items is the set of its first components.
  Ex: the set { [S → L·=R, $], [R → L·, $] } has the core { S → L·=R, R → L· }.
• We find the states (sets of LR(1) items) in a canonical LR(1) parser with the same cores, and then merge them into a single state.
  Ex: I1: [L → id·, =] and I2: [L → id·, $] have the same core; merging them gives a new state I12: [L → id·, =/$].
• We do this for all states of the canonical LR(1) parser to get the states of the LALR parser.
• In fact, the number of states of the LALR parser for a grammar will be equal to the number of states of the SLR parser for that grammar.
Creation of LALR Parsing Tables
• Create the canonical LR(1) collection of the sets of LR(1) items for the given grammar.
• Find each core; find all sets having that same core; replace those sets by a single set which is their union:
  C = {I0,...,In} → C' = {J1,...,Jm}, where m ≤ n
• Create the parsing tables (action and goto) exactly as in the construction of the parsing tables of an LR(1) parser.
  – Note: if J = I1 ∪ ... ∪ Ik, then, since I1,...,Ik have the same cores, the cores of goto(I1,X),...,goto(Ik,X) must also be the same.
  – So goto(J,X) = K, where K is the union of all sets of items having the same core as goto(I1,X).
• If no conflict is introduced, the grammar is an LALR(1) grammar. (We may only introduce reduce/reduce conflicts; we cannot introduce a shift/reduce conflict.)
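The merge-by-core construction can be sketched in Python (my illustration, not part of the slides), using the grammar S' → S, S → CC, C → cC | d from the example that follows. LR(1) items are (head, rhs, dot, lookahead) tuples; since this grammar has no ε-productions, FIRST of a nonempty string is just FIRST of its first symbol.

```python
# Sketch: canonical LR(1) collection, then LALR merging by core.
GRAMMAR = {"S'": [("S",)], "S": [("C", "C")], "C": [("c", "C"), ("d",)]}

def first(symbols):
    """FIRST of a symbol string (works here: no ε-productions)."""
    X = symbols[0]
    if X not in GRAMMAR:
        return {X}                      # terminal (or $)
    out = set()
    for rhs in GRAMMAR[X]:
        out |= first(rhs)
    return out

def closure(items):
    items = set(items)
    changed = True
    while changed:
        changed = False
        for head, rhs, dot, la in list(items):
            if dot < len(rhs) and rhs[dot] in GRAMMAR:
                for prod in GRAMMAR[rhs[dot]]:
                    for b in first(rhs[dot + 1:] + (la,)):  # new lookaheads
                        if (rhs[dot], prod, 0, b) not in items:
                            items.add((rhs[dot], prod, 0, b))
                            changed = True
    return frozenset(items)

def goto(I, X):
    return closure({(h, r, d + 1, la) for h, r, d, la in I
                    if d < len(r) and r[d] == X})

def canonical_lr1():
    C = {closure({("S'", ("S",), 0, "$")})}
    changed = True
    while changed:
        changed = False
        for I in list(C):
            for X in {"S", "C", "c", "d"}:
                J = goto(I, X)
                if J and J not in C:
                    C.add(J)
                    changed = True
    return C

def lalr_merge(C):
    merged = {}
    for I in C:
        core = frozenset((h, r, d) for h, r, d, la in I)
        merged[core] = merged.get(core, frozenset()) | I   # union lookaheads
    return set(merged.values())

C = canonical_lr1()
print(len(C), len(lalr_merge(C)))   # prints: 10 7
```

The ten LR(1) states I0–I9 shrink to seven LALR states, because I3/I6, I4/I7, and I8/I9 share cores and are merged.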
Example
Grammar: S' → S, S → CC, C → cC | d

I0: S' → ·S, $
    S → ·CC, $
    C → ·cC, c/d
    C → ·d, c/d

goto(I0, C) = I2: S → C·C, $
                  C → ·cC, $
                  C → ·d, $
goto(I0, c) = I3: C → c·C, c/d
                  C → ·cC, c/d
                  C → ·d, c/d
goto(I0, d) = I4: C → d·, c/d
goto(I2, C) = I5: S → CC·, $
goto(I2, c) = I6: C → c·C, $
                  C → ·cC, $
                  C → ·d, $
goto(I2, d) = I7: C → d·, $
goto(I3, C) = I8: C → cC·, c/d
goto(I3, c) = I3
goto(I3, d) = I4
goto(I6, C) = I9: C → cC·, $
goto(I6, c) = I6
goto(I6, d) = I7

States with the same core are merged: I3 and I6 into I36, I4 and I7 into I47, I8 and I9 into I89.
LALR Parsing Table
QUESTIONS?