Unit 2

UNIT-2
(BASIC PARSING TECHNIQUES)

1. PARSERS:
The parser is that phase of the compiler which takes a token string as input and with
the help of existing grammar, converts it into the corresponding Intermediate
Representation(IR). The parser is also known as Syntax Analyzer.
[Figure: Classification of Parser]
Types of Parser:
The parser is mainly classified into two categories, i.e. Top-down Parser, and Bottom-
up Parser. These are explained below:
a. Top-Down Parser:
The top-down parser is the parser that generates the parse tree for the given input string
with the help of grammar productions by expanding the non-terminals, i.e. it starts from the
start symbol and ends at the terminals. It uses leftmost derivation.
Problems with top-down parsing are :
1. Backtracking
a. Backtracking is a technique in which, to expand a non-terminal symbol, we choose one
alternative and, if a mismatch occurs, we try another alternative, if any.
b. If a non-terminal has multiple production rules beginning with the same input symbol,
then to get the correct derivation we need to try all these alternatives.
Mr. Abhinav Gupta (CS&E Department) 1

Consider the input string 'abd' provided by the lexical analyzer for the following
grammar.
S → aAd
A → bc | b
The top-down parser will parse the input string ‘abd’ and will start creating the parse
tree with the starting symbol S.
Now the first input symbol ‘a’ matches the first leaf node of the tree. So the parser will
move ahead and find a match for the second input symbol ‘b‘.
Possibility 1:
A → bc
REJECTED
Possibility 2:
A → b
ACCEPTED
With A → bc, the leaf 'b' matches the second input symbol 'b', but the next leaf 'c' does
not match the third input symbol 'd'. We get an error, so we go back to A to see whether
there is another production for A. With A → b, the leaf 'b' matches the second input
symbol 'b', and the third input symbol 'd' matches the last leaf 'd' of the tree. The
parse tree is therefore the one in possibility 2, and we halt and announce successful
completion of parsing.
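The backtracking described above can be sketched in a few lines of Python. This is only an illustrative sketch for the grammar S → aAd, A → bc | b; the names GRAMMAR, parse and accepts are assumptions made for this example and are not part of the notes.

```python
# Grammar from the example: S -> a A d, A -> b c | b
GRAMMAR = {
    "S": [["a", "A", "d"]],
    "A": [["b", "c"], ["b"]],
}

def parse(symbols, text, pos=0):
    """Yield every input position reachable after matching `symbols` from `pos`."""
    if not symbols:
        yield pos
        return
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:
        # Non-terminal: try each alternative in order. Moving on to the
        # next alternative after a failed one is the backtracking step.
        for alternative in GRAMMAR[head]:
            yield from parse(alternative + rest, text, pos)
    elif pos < len(text) and text[pos] == head:
        # Terminal: it must match the current input symbol.
        yield from parse(rest, text, pos + 1)

def accepts(text):
    """True if the whole input can be derived from the start symbol S."""
    return any(end == len(text) for end in parse(["S"], text))

print(accepts("abd"))
```

For 'abd' the alternative A → bc fails on 'd', so the parser falls back to A → b, exactly as in possibility 2. Note that this sketch would loop forever on a left-recursive grammar.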
2. Left Recursion
A grammar G (V, T, P, S) is left recursive if it has a production of the form
A → Aα | β
If left recursion is present in the grammar then top-down parser can enter into infinite
loop.

Elimination of Left Recursion
If we have a left-recursive pair of productions such as A → Aα | β, then the left recursion
can be eliminated by replacing the pair with
A → β A′
A′ → α A′ | ϵ
Example:
i) E → E+T|T
ii) T → T*F|F
iii) F → (E)|id
Productions (i) and (ii) are left recursive: the non-terminal on the left-hand side (E and
T respectively) is also the leftmost symbol of a right-hand side. Production (iii) is not
left recursive. To eliminate the left recursion, we apply the rule above to (i) and (ii):
i) E → E+T | T, with α = +T and β = T, gives
E → TE′
E′ → +TE′ | ϵ
ii) T → T*F | F, with α = *F and β = F, gives
T → FT′
T′ → *FT′ | ϵ
After eliminating the left recursion, the final production rules are as follows:
E → TE′
E′→ +TE′|ϵ
T → FT′
T′→ *FT′|ϵ
F → (E)|id
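The rewriting rule A → Aα | β ⇒ A → βA′, A′ → αA′ | ε can be sketched as a small Python function. This is a minimal sketch that handles only immediate left recursion; the function name and the list-of-symbols representation are assumptions made for illustration.

```python
def eliminate_left_recursion(nonterminal, alternatives):
    """Rewrite A -> A a1 | ... | b1 | ... as A -> b A', A' -> a A' | ε.

    Each alternative is a list of grammar symbols.
    """
    # alpha parts: alternatives that start with the non-terminal itself
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # beta parts: all other alternatives
    others = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not recursive:
        return {nonterminal: alternatives}  # nothing to do
    new = nonterminal + "'"
    return {
        nonterminal: [beta + [new] for beta in others],
        new: [alpha + [new] for alpha in recursive] + [["ε"]],
    }

print(eliminate_left_recursion("E", [["E", "+", "T"], ["T"]]))
print(eliminate_left_recursion("T", [["T", "*", "F"], ["F"]]))
```

Applied to E → E+T | T and T → T*F | F, it yields the same E′ and T′ productions as derived by hand above.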
3. Left factoring
Left factoring is used to avoid backtracking when it is not clear which of the productions
should be used to expand a non-terminal because their right-hand sides begin with the same
prefix. It is applied to a grammar that has no left recursion.
If A → αβ | αγ are two productions, then after seeing only input derived from α it is not
possible to decide whether to select A → αβ or A → αγ.
Removing the common prefix (left factoring):

The equivalent left factored grammar will be –
A → α A'
A' → β | γ

Problem-01:
Do left factoring in the following grammar-
S → iEtS | iEtSeS | a
E → b
The left factored grammar is-
S → iEtSS’ | a
S’ → eS | ε
E → b
Problem-02:
Do left factoring in the following grammar-
S → bSSaaS | bSSaSb | bSb | a
Step-01:
S → bSS’ | a
S’ → SaaS | SaSb | b
Again, this is a grammar with common prefixes.
Step-02:
S → bSS’ | a
S’ → SaD | b
D → aS | Sb
This is a left factored grammar.
Problem-03:
Do left factoring in the following grammar-
S → aSSbS | aSaSb | abb | b
Step-01:
S → aS’ | b
S’ → SSbS | SaSb | bb
Again, this is a grammar with common prefixes.
Step-02:
S → aS’ | b
S’ → SD | bb
D → SbS | aSb
This is a left factored grammar.
Problem-04:
Do left factoring in the following grammar-
S → a | ab | abc | abcd
Step-01:
S → aS’
S’ → b | bc | bcd | ε
Again, this is a grammar with common prefixes.
Step-02:
S → aS’
S’ → bD | ε
D → c | cd | ε
Again, this is a grammar with common prefixes.
Step-03:
S → aS’
S’ → bD | ε
D → cD’ | ε
D’ → d | ε
This is a left factored grammar.
Problem-05:
Do left factoring in the following grammar-
S → aAd | aB
A → a | ab
B → ccd | ddc
The left factored grammar is-
S → aS’
S’ → Ad | B
A → aA’
A’ → b | ε
B → ccd | ddc
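One step of left factoring, as used in the problems above, can be sketched in Python. This is a rough sketch under stated assumptions: it performs a single pass (so it corresponds to one "Step" of the worked problems), and the names common_prefix and left_factor_once are hypothetical.

```python
def common_prefix(sequences):
    """Longest prefix shared by all the given symbol lists."""
    prefix = []
    for column in zip(*sequences):
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return prefix

def left_factor_once(nonterminal, alternatives):
    """Factor out the common prefix of alternatives that start with the same symbol."""
    groups = {}
    for alt in alternatives:
        groups.setdefault(alt[0] if alt else "ε", []).append(alt)
    result = {nonterminal: []}
    count = 0
    for group in groups.values():
        if len(group) == 1:
            result[nonterminal].append(group[0])  # no common prefix to remove
            continue
        count += 1
        new = nonterminal + "'" * count           # fresh non-terminal: S', S'', ...
        prefix = common_prefix(group)
        result[nonterminal].append(prefix + [new])
        # what remains after the prefix; an empty remainder becomes ε
        result[new] = [alt[len(prefix):] or ["ε"] for alt in group]
    return result

print(left_factor_once("S", [list("iEtS"), list("iEtSeS"), ["a"]]))
```

For Problem-01 this produces S → iEtSS’ | a and S’ → ε | eS. Repeating the pass on the new non-terminals gives the later steps of Problems 02 to 04.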

a. Top-down parsers are classified into 2 types:
the recursive descent parser, and the predictive parser (LL(1)), which works without
backtracking.
a.1 Recursive descent parser

To implement a recursive descent parser, the grammar must hold the following
properties:
o It should not be left recursive.
o It should be left-factored. (Alternative productions should not have common
prefixes.)
a.2 Predictive parser or LL(1) parser

The predictive parser is another method that implements the technique of top-down
parsing without backtracking.
A predictive parser is an efficient way of carrying out recursive-descent parsing by
managing the stack of activation records explicitly.
A predictive parser has the following components −
o Input Buffer − The input buffer holds the string to be parsed, followed by an
end marker $ to denote the end of the string. For example, in the input a + b $,
the symbols a, +, b are terminal symbols.
o Stack − It contains a sequence of grammar symbols with $ at the bottom of
the stack. At the start of parsing, the stack contains the start symbol of the
grammar on top of $.
o Parsing Table − It is a two-dimensional array or matrix M[A, a], where A is a
non-terminal and 'a' is a terminal symbol.

Following are the steps to perform Predictive Parsing or LL(1)
i. Elimination of Left Recursion (Already done)
ii. Left Factoring (Already done)
iii. Computation of FIRST & FOLLOW
iv. Construction of Predictive or LL(1) Parsing Table
(iii) Computation of FIRST & FOLLOW
Rules for Calculating FIRST Function-

Note:
We calculate the FIRST function of a non-terminal from the production rules that have it
on the LHS.
ε may appear in the FIRST function of a non-terminal.
Rule-01:
For a production rule X → ε, add ε to FIRST(X).
For any terminal symbol ‘a’, FIRST(a) = { a }
Rule-02:
For a production rule X → aY, add a to FIRST(X).
Rule-03:
For a production rule X → Y1Y2Y3,
Calculating FIRST(X)
o If ε ∉ FIRST(Y1), then FIRST(X) = FIRST(Y1)
o If ε ∈ FIRST(Y1) (i.e. Y1 can derive ε), then FIRST(X) = { FIRST(Y1) – ε } ∪ FIRST(Y2Y3)
Calculating FIRST(Y2Y3)
o If ε ∉ FIRST(Y2), then FIRST(Y2Y3) = FIRST(Y2)
o If ε ∈ FIRST(Y2), then FIRST(Y2Y3) = { FIRST(Y2) – ε } ∪ FIRST(Y3)
This production contributes ε to FIRST(X) only if every one of Y1, Y2, Y3 can derive ε.
Similarly, we can extend this to any production rule X → Y1Y2Y3…Yn.
Rules for Calculating FOLLOW Function-

Note:
We calculate the FOLLOW function of a non-terminal from its occurrences on the RHS of
production rules.
ε never appears in the FOLLOW function of a non-terminal.
Rule-01:
For the start symbol S, place $ in FOLLOW(S).
Rule-02:
For any production rule A → αB, add everything in FOLLOW(A) to FOLLOW(B).
Rule-03:
For any production rule A → αBβ,
o If ε ∉ FIRST(β), then add FIRST(β) to FOLLOW(B)
o If ε ∈ FIRST(β) (i.e. β can derive ε), then add { FIRST(β) – ε } ∪ FOLLOW(A) to FOLLOW(B)

Example 1:
Production Rules: E -> TE’
E’ -> +T E’|Є
T -> F T’
T’ -> *F T’ | Є
F -> (E) | id
FIRST set
FIRST(E) = FIRST(T) = FIRST(F) = { ( , id }
FIRST(E’) = { +, Є }
FIRST(T) = FIRST(F) = { ( , id }
FIRST(T’) = { *, Є }
FIRST(F) = { ( , id }
FOLLOW Set
FOLLOW(E) = { $ , ) }
FOLLOW(E’) = FOLLOW(E) = { $, ) }
FOLLOW(T) = { FIRST(E’) – Є } U FOLLOW(E’) U FOLLOW(E) = { + , $ , ) }
FOLLOW(T’) = FOLLOW(T) = { + , $ , ) }
FOLLOW(F) = { FIRST(T’) – Є } U FOLLOW(T’) U FOLLOW(T) = { *, +, $, ) }
Now, we represent FIRST and FOLLOW as:
Non-terminal | FIRST     | FOLLOW
E            | { (, id } | { $, ) }
E’           | { +, ε }  | { $, ) }
T            | { (, id } | { +, $, ) }
T’           | { *, ε }  | { +, $, ) }
F            | { (, id } | { *, +, $, ) }
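The FIRST and FOLLOW rules can be applied mechanically by repeating them until no set changes. The following is a minimal Python sketch for the grammar of Example 1; the names GRAMMAR, first_of_sequence, compute_first and compute_follow are assumptions made for illustration.

```python
EPS = "ε"
GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F": [["(", "E", ")"], ["id"]],
}

def first_of_sequence(symbols, first):
    """FIRST of a string of grammar symbols (Rule-03 for FIRST)."""
    result = set()
    for symbol in symbols:
        if symbol == EPS:
            continue
        symbol_first = first[symbol] if symbol in GRAMMAR else {symbol}
        result |= symbol_first - {EPS}
        if EPS not in symbol_first:
            return result
    result.add(EPS)  # every symbol of the string can derive ε
    return result

def compute_first():
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:  # repeat until a full pass adds nothing new
        changed = False
        for nt, alternatives in GRAMMAR.items():
            for alt in alternatives:
                new = first_of_sequence(alt, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

def compute_follow(first, start="E"):
    follow = {nt: set() for nt in GRAMMAR}
    follow[start].add("$")  # Rule-01
    changed = True
    while changed:
        changed = False
        for nt, alternatives in GRAMMAR.items():
            for alt in alternatives:
                for i, symbol in enumerate(alt):
                    if symbol not in GRAMMAR:
                        continue
                    tail = first_of_sequence(alt[i + 1:], first)
                    new = tail - {EPS}
                    if EPS in tail:  # Rule-02, and the ε case of Rule-03
                        new |= follow[nt]
                    if not new <= follow[symbol]:
                        follow[symbol] |= new
                        changed = True
    return follow

FIRST = compute_first()
FOLLOW = compute_follow(FIRST)
print(FIRST)
print(FOLLOW)
```

The computed sets agree with the FIRST and FOLLOW table of Example 1.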
iv. Construction of Predictive or LL(1) Parsing Table
For each production A → α:
1. Find FIRST(α) and, for each terminal in FIRST(α), make the entry A → α in the table.
2. If FIRST(α) contains ε, then find FOLLOW(A) and, for each terminal in FOLLOW(A),
make the entry A → α in the table.
3. If FIRST(α) contains ε and FOLLOW(A) contains $, then make the entry A → α in the
table for $.
4. Mark the remaining entries as error.
For production: E → T E’ (here A = E and α = T E’)
FIRST(α) = FIRST( TE’ ) = FIRST(T) = { ( , id }
Then
M[ E, ( ] = E → T E’
M[ E, id ] = E → T E’
For production: E’ → +T E’
FIRST(α) = FIRST( +TE’ ) = FIRST(+) = { + }
Then
M[ E’, + ] = E’ → +T E’
For production: E’ → ε
FIRST(α) = FIRST( ε ) = { ε } and FOLLOW(E’) = { ), $ }
Then
M[ E’, ) ] = E’ → ε
M[ E’, $ ] = E’ → ε
For production: T → F T’
FIRST(α) = FIRST( F T’ ) = FIRST(F) = { ( , id }
Then
M[ T, ( ] = T → F T’
M[ T, id ] = T → F T’
For production: T’ → *F T’
FIRST(α) = FIRST( *FT’ ) = FIRST(*) = { * }
Then
M[ T’, * ] = T’ → *F T’
For production: T’ → ε
FIRST(α) = FIRST( ε ) = { ε } and FOLLOW(T’) = { +, ), $ }
Then
M[ T’, + ] = T’ → ε
M[ T’, ) ] = T’ → ε
M[ T’, $ ] = T’ → ε
For production: F → (E)
FIRST(α) = FIRST( (E) ) = { ( }
Then
M[ F, ( ] = F → (E)
For production: F → id
FIRST(α) = FIRST( id ) = { id }
Then
M[ F, id ] = F → id
Predictive or LL(1) Parsing table:
     | id       | +          | *          | (        | )       | $
E    | E → T E’ |            |            | E → T E’ |         |
E’   |          | E’ → +T E’ |            |          | E’ → ε  | E’ → ε
T    | T → F T’ |            |            | T → F T’ |         |
T’   |          | T’ → ε     | T’ → *F T’ |          | T’ → ε  | T’ → ε
F    | F → id   |            |            | F → (E)  |         |
NOTE: If table contains multiple entries at same position then grammar is not
predictive or LL(1).
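To see how the parsing table drives the parser, here is a minimal Python sketch of a table-driven predictive parser. The table entries are copied from the LL(1) table of Example 1; the names TABLE, NONTERMINALS and ll1_parse are assumptions made for illustration, and ε-productions are stored as empty lists.

```python
# M[A, a] from the LL(1) table above; an empty list stands for an ε-production.
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

def ll1_parse(tokens):
    """Return True if the token list is accepted by the predictive parser."""
    tokens = tokens + ["$"]      # end marker on the input buffer
    stack = ["$", "E"]           # $ at the bottom, start symbol on top
    pos = 0
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            body = TABLE.get((top, look))
            if body is None:     # blank table entry: syntax error
                return False
            stack.extend(reversed(body))  # leftmost symbol ends up on top
        elif top == look:
            pos += 1             # terminal matches: advance the input
        else:
            return False
    return pos == len(tokens)

print(ll1_parse("id + id * id".split()))
```

Because every table cell holds at most one production, the parser never needs to backtrack.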

Example 2: Find the FIRST and FOLLOW of the following grammar.
Production Rules: S -> aBDh
B -> cC
C -> bC | Є
D -> EF
E -> g | Є
F -> f | Є
FIRST set
FIRST(S) = { a }
FIRST(B) = { c }
FIRST(C) = { b , Є }
FIRST(D) = FIRST(E) U FIRST(F) = { g, f, Є }
FIRST(E) = { g , Є }
FIRST(F) = { f , Є }
FOLLOW Set
FOLLOW(S) = { $ }
FOLLOW(B) = { FIRST(D) – Є } U FIRST(h) = { g , f , h }
FOLLOW(C) = FOLLOW(B) = { g , f , h }
FOLLOW(D) = FIRST(h) = { h }
FOLLOW(E) = { FIRST(F) – Є } U FOLLOW(D) = { f , h }
FOLLOW(F) = FOLLOW(D) = { h }
Example 3: Find the FIRST and FOLLOW of the following grammar.

Production Rules: S -> ACB|Cbb|Ba
A -> da|BC
B-> g|Є
C-> h| Є
FIRST set
FIRST(S) = FIRST(ACB) U FIRST(Cbb) U FIRST(Ba) = { d, g, h, Є, b, a }
FIRST(A) = { d } U {FIRST(B)-Є} U FIRST(C) = { d, g, h, Є }
FIRST(B) = { g, Є }
FIRST(C) = { h, Є }
FOLLOW Set
FOLLOW(S) = { $ }
FOLLOW(A) = { h, g, $ }
FOLLOW(B) = { a, $, h, g }
FOLLOW(C) = { b, g, $, h }
Example 4: Check whether the given grammar is Predictive/LL(1) or not.

Production Rules: S -> iCtSS’ | a
S’-> eS | Є
C -> b [Do Practice]

b. Bottom-up Parser or Shift Reduce Parsers:
A shift-reduce parser is a parser that generates the parse tree for the given input string
with the help of grammar productions by reducing the input string, i.e. it starts
from the terminals (the input string) and ends at the start symbol.
e.g.
[Figure: bottom-up parsing example]
NOTE: The substrings b, Abc, d and aABe are known as handles.
Stack Implementation of Shift Reduce Parsing

The steps of shift-reduce parsing are as follows −
o It uses a stack and an input buffer.
o Insert $ at the bottom of the stack and at the right end of the input string in the
input buffer.
o Shift: The parser shifts zero or more input symbols onto the stack until a handle is
on top of the stack.
o Reduce: The parser replaces the handle on top of the stack with the left side of the
production, i.e., the R.H.S. of the production is popped and the L.H.S. is pushed.
o Accept: The shift and reduce steps are repeated until an error is detected or until
the stack contains only the start symbol (S) and the input buffer is empty, i.e., it
contains only $.
o Error: The parser signals the discovery of a syntax error and calls an error
recovery routine.

Example − To see how the stack implementation of shift-reduce parsing works, consider
the grammar −
o E → E + E
o E → E ∗ E
o E → (E)
o E → id
and the input string “id1 + id2 ∗ id3”.
Stack          | Input String       | Action
$              | id1 + id2 * id3 $  | Shift
$ id1          | + id2 * id3 $      | Reduce by E → id
$ E            | + id2 * id3 $      | Shift
$ E +          | id2 * id3 $        | Shift
$ E + id2      | * id3 $            | Reduce by E → id
$ E + E        | * id3 $            | Shift
$ E + E *      | id3 $              | Shift
$ E + E * id3  | $                  | Reduce by E → id
$ E + E * E    | $                  | Reduce by E → E * E
$ E + E        | $                  | Reduce by E → E + E
$ E            | $                  | Accept
Further, bottom-up parsers are classified into two types:

b.1 LR parser: LR(0), SLR(1), CLR(1), LALR(1)
LR parsing is one type of bottom-up parsing. It is used to parse a large class of
grammars. In LR parsing, "L" stands for left-to-right scanning of the input and "R" stands
for constructing a rightmost derivation in reverse.
b.2 Operator precedence parser:

An operator precedence parser is a bottom-up parser that interprets an operator
grammar. This parser is only used for operator grammars. Ambiguous grammars are not
allowed in any parser except the operator precedence parser.

LR(0) Parser: Canonical Collection of LR(0) Items
An LR(0) item of a grammar G is a production of G with a dot at some position on
the right side.
S → ABC generates four items
S -> •ABC
S -> A•BC
S -> AB•C
S -> ABC•
NOTE: The production A -> ε generates only one item, written A -> •ε or A -> •
Steps for constructing the LR parsing table :

1. Writing augmented grammar
2. We need two functions – Closure() and Goto()
3. LR(0) collection of items to be found
Augmented Grammar:
If G is a grammar with starting symbol S, then G’ (the augmented grammar for G) is G
with a new starting symbol S’ and the production S’ → S.
The purpose of this new starting production is to indicate to the parser when it should
stop parsing.
Example
Given grammar
S → AA
A → aA | b
The Augment grammar G` is represented by
S`→ S
S → AA
A → aA | b
Closure:
If I is a set of items for a grammar G, then closure(I) is the set of items constructed
from I by the two rules:
1. Initially every item in I is added to closure(I).
2. If A → α•Bβ is in closure(I) and B → γ is a production, then add the item B → •γ to
closure(I), if it is not already there. We apply this rule until no more items can be
added to closure(I).
Eg: Given grammar
S → AA
A → aA
A → b
If I is the set of one item { S → •AA } then Closure(I) contains the items:
S → •AA
A → •aA
A → •b

Goto:
If the item A → α•Bβ is in I, then Goto(I, B) contains A → αB•β.
If the dot is now followed by a non-terminal, take the closure of the result.
e.g.
Given grammar
S → AA
A → aA
A → b
If I is the set of one item { A → •aA }

then Goto(I,a): A → a•A // So, add Closure(A)
A → •aA
A → •b
Question: Construct the LR(0) parser for the following Grammar:

Given grammar:
S → AA
A → aA | b
I0 State:
Add Augment production to the I0 State and Compute the Closure
I0 = S` → •S
Add all productions starting with S in to I0 State because "•" is followed by the non-
terminal. So, the I0 State becomes
I0 = S` → •S
S → •AA
Add all productions starting with "A" in modified I0 State because "•" is followed by the
non-terminal. So, the I0 State becomes.
I0= S` → •S
S → •AA
A → •aA
A → •b
I1= Goto(I0, S) = S` → S•
Here, the Production is reduced so close the State.
I1= S` → S•
I2= Goto(I0, A) = S → A•A

So, Add closure (A) to I2 State because "•" is followed by the non-terminal. So, the I2
State becomes
I2= S → A•A
A → •aA
A → •b

I3= Goto (I0,a) = A → a•A
So, Add Closure (A) to I3 States because "•" is followed by the non-terminal. So, the I3
State becomes
I3= A → a•A
A → •aA
A → •b
I4= Goto (I0, b) = A → b•
I5= Goto (I2, A) = S → AA•
Goto (I2,a) = A → a•A
So, Add Closure (A) because "•" is followed by the non-terminal which becomes
A → a•A
A → •aA
A → •b
(same as I3)
Goto (I2, b) = A → b•
(same as I4)
I6= Goto (I3, A) = A → aA•
Goto (I3, a) = A → a•A
So, Add Closure (A) because "•" is followed by the non-terminal which becomes
A → a•A
A → •aA
A → •b
(same as I3)
Goto (I3, b) = A → b•
(same as I4)
Drawing DFA:
The DFA contains the 7 states I0 to I6.
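The Closure and Goto functions, and the construction of the canonical collection, can be sketched in Python as follows. This is an illustrative sketch for the augmented grammar S` → S, S → AA, A → aA | b; an item is represented as a pair (production index, dot position), and all names here are assumptions made for this example.

```python
# Augmented grammar; index 0 is the new start production S' -> S.
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("A", "A")),
    ("A", ("a", "A")),
    ("A", ("b",)),
]
NONTERMINALS = {"S'", "S", "A"}

def closure(items):
    """Closure of a set of LR(0) items, each item being (production index, dot)."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot = work.pop()
        body = GRAMMAR[prod][1]
        if dot < len(body) and body[dot] in NONTERMINALS:
            # the dot is followed by a non-terminal B: add every B -> •γ
            for index, (head, _) in enumerate(GRAMMAR):
                if head == body[dot] and (index, 0) not in result:
                    result.add((index, 0))
                    work.append((index, 0))
    return frozenset(result)

def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(prod, dot + 1) for prod, dot in items
             if dot < len(GRAMMAR[prod][1]) and GRAMMAR[prod][1][dot] == symbol}
    return closure(moved)

def canonical_collection():
    states = [closure({(0, 0)})]          # I0
    for state in states:                  # the list grows while we iterate over it
        for symbol in ("S", "A", "a", "b"):
            target = goto(state, symbol)
            if target and target not in states:
                states.append(target)
    return states

STATES = canonical_collection()
print(len(STATES))
```

With the symbols tried in the order S, A, a, b, the states are discovered in the same order as I0 to I6 above.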

LR(0) Table
o If a state goes to some other state on a terminal, it corresponds to a shift
move.
o If a state goes to some other state on a non-terminal, it corresponds to a
Goto move.
o If a state contains a final item, write the reduce entry in the entire row of that
state.
Explanation:
o I0 on S goes to I1, so write it as 1.
o I0 on A goes to I2, so write it as 2.
o I0, I2 and I3 on a go to I3, so write it as S3, which means shift 3.
o I0, I2 and I3 on b go to I4, so write it as S4, which means shift 4.
o I4, I5 and I6 all contain a final item, because they contain • at the rightmost
end. So write the reduce entry using the production number.
Productions are numbered as follows:
S → AA ... (1)
A → aA ... (2)
A → b ... (3)
o I1 contains the final item S` → S•, so action {I1, $} = Accept.
o I4 contains the final item A → b•, and that production corresponds to production
number 3, so write r3 in the entire row.
o I5 contains the final item S → AA•, and that production corresponds to production
number 1, so write r1 in the entire row.
o I6 contains the final item A → aA•, and that production corresponds to production
number 2, so write r2 in the entire row.
Question: Construct the Canonical Collection of LR(0) items and LR table.

Given grammar:
E → E+T
E → T
T → T*F
T → F
F → (E)
F → id
(Do Practice)

SLR(1) Parser:
SLR (1) refers to simple LR parsing. It is the same as LR(0) parsing; the only difference
is in the parsing table. To construct the SLR (1) parsing table, we use the canonical
collection of LR (0) items.
Steps for constructing the SLR(1) parsing table :
1. Write the augmented grammar
2. Create the canonical collection of LR (0) items
3. Find FOLLOW of the LHS of each production
4. Construct the SLR (1) parsing table. It defines 2 functions: action[list of terminals]
and goto[list of non-terminals]
SLR (1) Table Construction
The steps used to construct the SLR (1) table are given below:
If a state (Ii) goes to some other state (Ij) on a terminal, it corresponds to a shift
move in the action part.
If a state (Ii) goes to some other state (Ij) on a variable, it corresponds to a goto
move in the goto part.

If a state (Ii) contains a final item like A → ab•, which has no transition to a next
state, then the production is known as a reduce production. For all terminals X in
FOLLOW(A), write the reduce entry along with the production number.
Example
S -> •Aa
A->αβ•
Follow(S) = {$}
Follow (A) = {a}
EXAMPLE – Construct the SLR(1) parsing table for the given context-free grammar:
S –> AA
A –> aA
A –> b
STEP1 – Writing augmented grammar
The augmented grammar of the given grammar is:-
S’–>•S
S–>•AA
A–>•aA
A–>•b
STEP2 – Figure showing the Canonical collection of LR (0) items. (Previous example)

STEP3 –Find FOLLOW of LHS of production
FOLLOW(S) = {$}
FOLLOW(A) = {a,b,$}
STEP 4 – Construct the SLR (1) parsing table. It defines 2 functions: action[list of
terminals] and goto[list of non-terminals]
Below is the SLR parsing table.
States | a  | b  | $      | S | A
I0     | S3 | S4 |        | 1 | 2
I1     |    |    | Accept |   |
I2     | S3 | S4 |        |   | 5
I3     | S3 | S4 |        |   | 6
I4     | r3 | r3 | r3     |   |
I5     |    |    | r1     |   |
I6     | r2 | r2 | r2     |   |
Explanation:
o I1 contains the final item S’ → S• and FOLLOW(S’) = {$}, so action {I1, $} = Accept.
o I0 gives A in I2, so 2 is added to the (0 row and A column).
o I0 gives S in I1, so 1 is added to the (0 row and S column).
o Similarly, 5 is written in (2 row and A column), and 6 is written in (3 row and A column).
o I0 gives a in I3, so S3 (shift 3) is added to the (0 row and a column).
o I0 gives b in I4, so S4 (shift 4) is added to the (0 row and b column).
o Similarly, S3 (shift 3) is added to (2, 3 rows and a column), and S4 (shift 4) is added to
(2, 3 rows and b column).
S → AA ... (1)
A → aA ... (2)
A → b ... (3)
o I4 is a reduce state, as ‘•‘ is at the end. I4 holds the 3rd production of the grammar
(A → b•). The LHS of this production is A. FOLLOW(A) = {a, b, $}. Write r3 (reduce 3) in
the (4th row and columns of a, b, $).
o I5 is a reduce state, as ‘•‘ is at the end. I5 holds the 1st production of the grammar
(S → AA•). The LHS of this production is S.
FOLLOW(S) = {$}. Write r1 (reduce 1) in the (5th row and column of $).
o I6 is a reduce state, as ‘•‘ is at the end. I6 holds the 2nd production of the grammar
(A → aA•). The LHS of this production is A.
FOLLOW(A) = {a, b, $}. Write r2 (reduce 2) in the (6th row and columns of a, b, $).

Question: Construct SLR(1) parsing table for the given context-
free grammar:
E→E+T|T
T→T*F|F
F → id
I0 State:
I0 = Closure (S` → •E)
Add all productions starting with E in to I0 State because "•" is followed by the
non-terminal. So, the I0 State becomes
I0 = S` → •E
E → •E + T
E → •T
Add all productions starting with T and F in modified I0 State because "•" is followed by
the non-terminal. So, the I0 State becomes.
I0= S` → •E
E → •E + T
E → •T
T → •T * F
T → •F
F → •id
I1= Goto (I0, E):
S` → E•
E → E• + T
I2= Goto (I0, T):
E → T•
T → T• * F
I3= Goto (I0, F):
T → F•
I4= Goto (I0, id):
F → id•
I5= Goto (I1, +):
E → E + •T
Add all productions starting with T and F in I5 State because "•" is followed by the
non-terminal. So, the I5 State becomes
I5 = E → E + •T
T → •T * F
T → •F
F → •id
I6= Goto (I2, *):
T → T * •F
Add all productions starting with F in I6 State because "•" is followed by the
non-terminal. So, the I6 State becomes
I6 = T → T * •F
F → •id
I7= Goto (I5, T):
E → E + T•
T → T• * F
Goto (I5, F) = T → F• (same as I3)
Goto (I5, id) = F → id• (same as I4)
I8= Goto (I6, F):
T → T * F•
Goto (I6, id) = F → id• (same as I4)
Goto (I7, *):
T → T * •F
F → •id
(same as I6)
Drawing DFA:
SLR (1) Table

Explanation:
First (E) = First (E + T) ∪ First (T)
First (T) = First (T * F) ∪ First (F)
First (F) = {id}
First (T) = {id}
First (E) = {id}
Follow (E) = First (+T) ∪ {$} = {+, $}
Follow (T) = First (*F) ∪ Follow (E)
= {*, +, $}
Follow (F) = {*, +, $}
E→E+T …(1)
E→T …(2)
T→T*F …(3)
T→F …(4)
F → id …(5)
o I1 contains the final item S` → E• and follow (S`) = {$}, so action {I1, $} = Accept
o I2 contains the final item E → T• and follow (E) = {+, $}, so action {I2, +} = R2,
action {I2, $} = R2
o I3 contains the final item T → F• and follow (T) = {+, *, $}, so action {I3, +} = R4,
action {I3, *} = R4, action {I3, $} = R4
o I4 contains the final item F → id• and follow (F) = {+, *, $}, so action {I4, +} = R5,
action {I4, *} = R5, action {I4, $} = R5
o I7 contains the final item E → E + T• and follow (E) = {+, $}, so action {I7, +} = R1,
action {I7, $} = R1
o I8 contains the final item T → T * F• and follow (T) = {+, *, $}, so action {I8, +} = R3,
action {I8, *} = R3, action {I8, $} = R3.
Example:
Construct canonical collection and SLR table:
S → iSeS
S → iS
S → a (Do Practice)

CLR (1) Parsing
CLR refers to canonical LR. CLR parsing uses the canonical collection of LR (1)
items to build the CLR (1) parsing table. The CLR (1) parsing table has more states
than the SLR (1) parsing table.
In CLR (1), we place the reduce entries only under the lookahead symbols.
LR (1) item
An LR (1) item is an LR (0) item together with a lookahead symbol.

LR (1) item = LR (0) item + look ahead
The general form is [A → α•β, a]
where A → αβ is a production and a is a terminal or the right end marker $.
The lookahead determines where the reduce entry for a final item is placed.
The lookahead of the augmented production is always $.
Steps for constructing CLR parsing table :

1. Writing augmented grammar
2. LR(1) collection of items to be found
3. Defining 2 functions: action[list of terminals] and goto[list of non-terminals] in
the CLR parsing table
Constructing Canonical LR (CLR) or LR(1) Parsing table
Step 1:
For the grammar G, initially add the item [S’ → •S, $] to the set of items.
Step 2:
CLOSURE FUNCTION
For each item [A → α•Xβ, a] in the set and each production X → γ, add the item
[X → •γ, b] for every terminal b in FIRST(βa).
Step 3:
GOTO FUNCTION
For each item [A → α•Xβ, a] in the set I,
GOTO(I, X) contains [A → αX•β, a], with the same lookahead a,
and the closure of the result is then taken as in Step 2.

Example
Question: Construct CLR ( 1 ) parsing table for the Grammar.
1. S → AA
2. A → aA
3. A → b
Add the augmented production, insert the '•' symbol at the first position of every
production in G, and also add the lookahead.
S` → •S, $
S → •AA, $       lookahead: FIRST(βa) = FIRST($) = { $ }
A → •aA, a/b     lookahead: FIRST(βa) = FIRST(A$) = FIRST(A) = { a, b }
A → •b, a/b
I0 State:
I0 = S` → •S, $
Add all productions starting with S in to I0 State because "•" is followed by the
non-terminal. So, the I0 State becomes
I0 = S` → •S, $
S → •AA, $
Add all productions starting with A in modified I0 State because "•" is followed by the
non-terminal. So, the I0 State becomes
I0= S` → •S, $
S → •AA, $
A → •aA, a/b
A → •b, a/b
I1= Goto (I0, S) = S` → S•, $
I2= Goto (I0, A) = S → A•A, $

Add all productions starting with A in I2 State because "•" is followed by the
non-terminal. Here β is empty, so the lookahead stays the same ($). So, the I2 State becomes
I2= S → A•A, $
A → •aA, $
A → •b, $
I3= Goto (I0, a) = A → a•A, a/b
Add all productions starting with A; the lookahead stays the same (a/b). So, the I3 State
becomes
I3= A → a•A, a/b
A → •aA, a/b
A → •b, a/b
I4= Goto (I0, b) = A → b•, a/b
I5= Goto (I2, A) = S → AA•, $
I6= Goto (I2, a) = A → a•A, $
Add all productions starting with A; the lookahead stays the same ($). So, the I6 State
becomes
I6 = A → a•A, $
A → •aA, $
A → •b, $
I7= Goto (I2, b) = A → b•, $
I8= Goto (I3, A) = A → aA•, a/b
Goto (I3, a) = A → a•A, a/b = (same as I3)
Goto (I3, b) = A → b•, a/b = (same as I4)
I9= Goto (I6, A) = A → aA•, $
Goto (I6, a) = A → a•A, $ = (same as I6)
Goto (I6, b) = A → b•, $ = (same as I7)
Drawing DFA:

CLR (1) Parsing table:
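States | a  | b  | $      | S | A
I0     | S3 | S4 |        | 1 | 2
I1     |    |    | ACCEPT |   |
I2     | S6 | S7 |        |   | 5
I3     | S3 | S4 |        |   | 8
I4     | R3 | R3 |        |   |
I5     |    |    | R1     |   |
I6     | S6 | S7 |        |   | 9
I7     |    |    | R3     |   |
I8     | R2 | R2 |        |   |
I9     |    |    | R2     |   |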

1. S → AA ... (1)
2. A → aA ...(2)
3. A → b ... (3)
NOTE: The placement of shift entries in the CLR (1) parsing table is the same as in the
SLR (1) parsing table. The only difference is in the placement of reduce entries.
I4 contains the final item which drives ( A → b•, a/b), so action {I4, a} = R3, action {I4, b}
= R3.
I5 contains the final item which drives ( S → AA•, $), so action {I5, $} = R1.
I7 contains the final item which drives ( A → b•,$), so action {I7, $} = R3.
I8 contains the final item which drives ( A → aA•, a/b), so action {I8, a} = R2, action {I8,
b} = R2.
I9 contains the final item which drives ( A → aA•, $), so action {I9, $} = R2.
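Once the table is built, the LR parsing loop itself is the same for LR(0), SLR(1), CLR(1) and LALR(1). The following Python sketch runs that loop on the CLR(1) table above. The dictionaries ACTION, GOTO and PRODUCTIONS are simply a transcription of the table and the numbered productions; their names and the function lr_parse are assumptions made for illustration.

```python
# ACTION part of the CLR(1) table: ("s", j) = shift j, ("r", k) = reduce by production k
ACTION = {
    (0, "a"): ("s", 3), (0, "b"): ("s", 4),
    (1, "$"): ("acc", None),
    (2, "a"): ("s", 6), (2, "b"): ("s", 7),
    (3, "a"): ("s", 3), (3, "b"): ("s", 4),
    (4, "a"): ("r", 3), (4, "b"): ("r", 3),
    (5, "$"): ("r", 1),
    (6, "a"): ("s", 6), (6, "b"): ("s", 7),
    (7, "$"): ("r", 3),
    (8, "a"): ("r", 2), (8, "b"): ("r", 2),
    (9, "$"): ("r", 2),
}
# GOTO part of the table (non-terminal columns)
GOTO = {(0, "S"): 1, (0, "A"): 2, (2, "A"): 5, (3, "A"): 8, (6, "A"): 9}
# production number -> (LHS, length of RHS)
PRODUCTIONS = {1: ("S", 2), 2: ("A", 2), 3: ("A", 1)}

def lr_parse(text):
    """Return True if the string of a's and b's is accepted."""
    tokens = list(text) + ["$"]
    stack = [0]                  # stack of state numbers, starting in I0
    pos = 0
    while True:
        entry = ACTION.get((stack[-1], tokens[pos]))
        if entry is None:        # blank entry: syntax error
            return False
        kind, arg = entry
        if kind == "s":          # shift: push the new state, consume the input symbol
            stack.append(arg)
            pos += 1
        elif kind == "r":        # reduce: pop |RHS| states, then follow GOTO on the LHS
            head, length = PRODUCTIONS[arg]
            del stack[-length:]
            stack.append(GOTO[(stack[-1], head)])
        else:                    # accept
            return True

print(lr_parse("aabb"))
```

For example, 'aabb' is accepted because S → AA with the first A deriving 'aab' and the second A deriving 'b'.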

LALR (1) Parsing
LALR refers to lookahead LR. To construct the LALR (1) parsing table, we use the
canonical collection of LR (1) items.
In LALR (1) parsing, the sets of LR (1) items which have the same productions (LR (0)
items) but different lookaheads are combined to form a single set of items.
LALR (1) parsing is the same as CLR (1) parsing; the only difference is in the parsing table.
Example
Question: Construct LALR( 1 ) parsing table for the Grammar.
1. S → AA
2. A → aA
3. A → b
Add the augmented production, insert the '•' symbol at the first position of every
production in G, and also add the lookahead.
S` → •S, $
S → •AA, $       lookahead: FIRST(βa) = FIRST($) = { $ }
A → •aA, a/b     lookahead: FIRST(βa) = FIRST(A$) = FIRST(A) = { a, b }
A → •b, a/b
The collection of LR (1) items is the same as for CLR (1):
I0 State:
I0 = S` → •S, $
Add all productions starting with S in to I0 State because "•" is followed by the
non-terminal. So, the I0 State becomes
I0 = S` → •S, $
S → •AA, $
Add all productions starting with A in modified I0 State because "•" is followed by the
non-terminal. So, the I0 State becomes
I0= S` → •S, $
S → •AA, $
A → •aA, a/b
A → •b, a/b
I1= Goto (I0, S) = S` → S•, $
I2= Goto (I0, A) = S → A•A, $
Adding the closure (the lookahead stays the same, $), the I2 State becomes
I2= S → A•A, $
A → •aA, $
A → •b, $
I3= Goto (I0, a) = A → a•A, a/b
Adding the closure (the lookahead stays the same, a/b), the I3 State becomes
I3= A → a•A, a/b
A → •aA, a/b
A → •b, a/b
I4= Goto (I0, b) = A → b•, a/b
I5= Goto (I2, A) = S → AA•, $
I6= Goto (I2, a) = A → a•A, $
Adding the closure (the lookahead stays the same, $), the I6 State becomes
I6 = A → a•A, $
A → •aA, $
A → •b, $
I7= Goto (I2, b) = A → b•, $
I8= Goto (I3, A) = A → aA•, a/b
Goto (I3, a) = A → a•A, a/b = (same as I3)
Goto (I3, b) = A → b•, a/b = (same as I4)
I9= Goto (I6, A) = A → aA•, $
Goto (I6, a) = A → a•A, $ = (same as I6)
Goto (I6, b) = A → b•, $ = (same as I7)
If we compare the states, the LR (0) items of I3 and I6 are the same; they differ
only in their lookaheads.
I3 = { A → a•A, a/b
A → •aA, a/b
A → •b, a/b }
I6= { A → a•A, $
A → •aA, $
A → •b, $ }
Since I3 and I6 have the same LR (0) items and differ only in their lookaheads,
we can combine them into one state called I36.
I36 = { A → a•A, a/b/$
A → •aA, a/b/$
A → •b, a/b/$ }

I4 and I7 are the same and differ only in their lookaheads, so we can
combine them into one state called I47.
I47 = {A → b•, a/b/$}
I8 and I9 are the same and differ only in their lookaheads, so we can
combine them into one state called I89.
I89 = {A → aA•, a/b/$}
Drawing DFA:
LALR (1) Parsing table:
States | a   | b   | $      | S | A
I0     | S36 | S47 |        | 1 | 2
I1     |     |     | Accept |   |
I2     | S36 | S47 |        |   | 5
I36    | S36 | S47 |        |   | 89
I47    | r3  | r3  | r3     |   |
I5     |     |     | r1     |   |
I89    | r2  | r2  | r2     |   |
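The merging step that turns the CLR (1) states into LALR (1) states can be sketched in Python. This is a minimal sketch, not a full LALR generator: the LR (1) states are typed in by hand from the example, each as a mapping from an LR (0) item (written as a string with '.' for the dot) to its set of lookaheads, and the names LR1_STATES and merge_by_core are hypothetical.

```python
# Some LR(1) states of the example: LR(0) item -> set of lookaheads
LR1_STATES = {
    "I3": {"A -> a.A": {"a", "b"}, "A -> .aA": {"a", "b"}, "A -> .b": {"a", "b"}},
    "I6": {"A -> a.A": {"$"}, "A -> .aA": {"$"}, "A -> .b": {"$"}},
    "I4": {"A -> b.": {"a", "b"}},
    "I7": {"A -> b.": {"$"}},
    "I8": {"A -> aA.": {"a", "b"}},
    "I9": {"A -> aA.": {"$"}},
    "I5": {"S -> AA.": {"$"}},
}

def merge_by_core(states):
    """Combine states whose LR(0) items (the core) are identical."""
    groups = {}
    for name, items in states.items():
        # the core is the set of LR(0) items with the lookaheads ignored
        groups.setdefault(frozenset(items), []).append(name)
    merged = {}
    for core, names in groups.items():
        new_name = "I" + "".join(n[1:] for n in names)   # e.g. I3 + I6 -> I36
        merged[new_name] = {
            item: set().union(*(states[n][item] for n in names)) for item in core
        }
    return merged

MERGED = merge_by_core(LR1_STATES)
print(sorted(MERGED))
```

The merged states I36, I47 and I89 carry the union of the lookaheads, a/b/$, which is why the r3 and r2 rows of the LALR (1) table are filled under all three terminals.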

(b.2) Operator Precedence Parsing
Operator precedence parsing is a kind of shift-reduce parsing method. It is applied to a
small class of operator grammars.
A grammar is said to be an operator precedence grammar if it has two properties:

o No R.H.S. of any production has a ∈.
o No two non-terminals are adjacent.
Examples –
This is an example of operator grammar:
E->E+E/E*E/id
Operator precedence can only be established between the terminals of the grammar. It
ignores the non-terminals.
However, the grammar given below is not an operator grammar because two non-
terminals are adjacent to each other:
S->SAS/a
A->bSb/b
We can convert it into an operator grammar, though:
S->SbSbS/SbS/a
A->bSb/b
There are the three operator precedence relations:

a ⋗ b means that terminal "a" has the higher precedence than terminal "b".
a ⋖ b means that terminal "a" has the lower precedence than terminal "b".
a ≐ b means that the terminal "a" and "b" both have same precedence.
EXAMPLE:
Grammar:
1. E → E+T/T
2. T → T*F/F
3. F → id
Given string: w = id + id * id
Precedence table:

Now let us process the string with the help of the above precedence table:
Parsing Action
o Add the $ symbol at both ends of the given input string.
o Scan the input string from left to right until the first ⋗ is encountered.
o Scan backwards (to the left) over all the equal-precedence relations until the first ⋖
is encountered.
o Everything between that ⋖ and the ⋗ is a handle.
o $ START SYMBOL $ means parsing is successful.
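The parsing action above can be sketched in Python. The precedence relations in PREC are an assumption: they are the usual relations for id, +, * and $ (with * binding tighter than + and both left-associative), written here as "<" for ⋖ and ">" for ⋗, and are not taken from the notes' own table. Following the remark that non-terminals are ignored, the stack holds terminals only, so this sketch finds handles but does not check that operands are present.

```python
# Assumed precedence relations: PREC[(left, right)] is "<" (yields) or ">" (takes).
PREC = {
    ("id", "+"): ">", ("id", "*"): ">", ("id", "$"): ">",
    ("+", "id"): "<", ("+", "+"): ">", ("+", "*"): "<", ("+", "$"): ">",
    ("*", "id"): "<", ("*", "+"): ">", ("*", "*"): ">", ("*", "$"): ">",
    ("$", "id"): "<", ("$", "+"): "<", ("$", "*"): "<",
}

def operator_precedence_parse(tokens):
    """Return the terminals of each handle, in the order the handles are reduced."""
    tokens = tokens + ["$"]
    stack = ["$"]                # terminals only; non-terminals are ignored
    pos = 0
    handles = []
    while True:
        top, look = stack[-1], tokens[pos]
        if top == "$" and look == "$":
            return handles       # $ START SYMBOL $: parsing is successful
        relation = PREC.get((top, look))
        if relation in ("<", "="):
            stack.append(look)   # shift
            pos += 1
        elif relation == ">":
            # reduce: pop back to the nearest "<", the popped terminals form the handle
            handle = []
            while True:
                popped = stack.pop()
                handle.append(popped)
                if PREC.get((stack[-1], popped)) == "<":
                    break
            handles.append(" ".join(reversed(handle)))
        else:
            raise SyntaxError(f"no precedence relation between {top!r} and {look!r}")

print(operator_precedence_parse("id + id * id".split()))
```

For w = id + id * id the three id handles are reduced first, then the handle containing *, and finally the handle containing +, which reflects * having higher precedence than +.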
Advantages –
1. It can easily be constructed by hand.
2. It is simple to implement this type of parsing.
Disadvantages –
1. It is hard to handle tokens like the minus sign (-), which has two different
precedence (depending on whether it is unary or binary).
2. It is applicable only to a small class of grammars.
Automatic Parser Generator

YACC is an automatic tool that generates the parser program.
YACC was discussed in the first unit; go through those concepts again to make things
clearer.
