Chapter No. 4

The Role of the Parser: The parser obtains a string of tokens from the lexical analyzer and verifies that the string can be
generated by the grammar of the source language. The parser also reports syntax errors. There are two primary parsing
techniques: top-down and bottom-up.
Top-down parsers: A top-down parser starts at the root of the parse tree and grows it toward the leaves. At each node, the
parser picks a production and tries to match the input. The parser may pick the wrong production, in which
case it must backtrack. Some grammars are backtrack-free.
Bottom-up parsers: A bottom-up parser starts at the leaves and grows the tree toward the root. As input is
consumed, the parser encodes the possibilities in an internal state. The bottom-up parser starts in a state valid for legal first
tokens. Bottom-up parsers handle a large class of grammars.
How to eliminate ambiguity? If a grammar has more than one leftmost derivation for a single sentence, the
grammar is ambiguous. Consider the grammar
S → if E then S
  | if E then S else S
  | other
With this grammar, the following statement has two derivations, because the else can be attached to either if:
if E1 then if E2 then S1 else S2

The convention in most programming languages is to match the else with the most recent if. We can rewrite the grammar to
avoid the problem and match each else to the innermost unmatched if:
S → if E then S
  | if E then Withelse else S
  | assignment
Withelse → if E then Withelse else Withelse
  | assignment
With the rewritten grammar, the following statement has only one derivation, in which the else belongs to the inner if:
if E1 then if E2 then A1 else A2

Left Recursion:
A grammar G = (V, T, P, S) is left recursive if it has a production of the form
A → A α | β, where α and β are sequences of terminals and nonterminals that do not start with A. Top-down parsing
methods cannot handle left-recursive grammars, so a transformation that eliminates left recursion is needed.
Eliminating Left Recursion
Example 1: A → A α | β generates the language βα*. We transform the grammar so that it
generates the same language without left recursion:
A → β A'          // where A' is a new nonterminal
A' → α A' | ɛ
Example 2: Consider the following grammar for arithmetic expressions and eliminate the left recursion.
E → E + T | T      // β = T, α = +T, new nonterminal E'
T → T * F | F      // β = F, α = *F, new nonterminal T'
F → ( E ) | id
Solution:
E → T E'
E' → + T E' | ɛ
T → F T'
T' → * F T' | ɛ
F → ( E ) | id
Example 3: Eliminate the left recursion.
S → ABC            // unchanged, because S is not left recursive
A → Aa | Ad | b    // A → bA',  A' → aA' | dA' | ɛ
B → Bb | c         // B → cB',  B' → bB' | ɛ
C → Cc | g         // C → gC',  C' → cC' | ɛ
Left Factoring: Left factoring is a grammar transformation that is useful for producing a grammar suitable for predictive
parsing. The basic idea is that when it is not clear which of two alternative productions to use to expand a nonterminal
A, we may be able to rewrite the A-productions to defer the decision until we have seen enough of the input to make
the right choice.
Example 1: A → αβ | αγ, where α is the longest prefix common to the two alternatives and β and γ are the
remainders that follow it.
Left-factored grammar:
A → α A'
A' → β | γ
Example 2:
S → iEtS | iEtSeS | a
E → b
Solution:
S → iEtSS' | a
S' → eS | ɛ
E → b
Top-down Parsing: A top-down parser starts with the root of the parse tree. The root node is labeled with the goal
(start) symbol of the grammar. The top-down parsing algorithm proceeds as follows:
1. Construct the root node of the parse tree.
2. Repeat until the fringe of the parse tree matches the input string:
   a. At a node labeled A, select a production with A on its lhs.
   b. For each symbol on its rhs, construct the appropriate child.
   c. When a terminal symbol is added to the fringe and it does not match the input, backtrack.
The key is picking the right production in step a. That choice should be guided by the input string. Let's try parsing with this
algorithm using the expression grammar.
Grammar:
1. Goal → expr
2. expr → expr + term
3.      | expr – term
4.      | term
5. term → term * factor
6.      | term / factor
7.      | factor
8. factor → number
9.        | id
Input: x – 2 * y
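Suppose the parser first expands expr with production 2 (expr → expr + term) and derives x from the leftmost expr.
This works well except that the “–” in the input does not match the “+” in the sentential form. The parser made the wrong
choice of production, so it must backtrack and use a different production.

With production 3 (expr → expr – term), the “–” in the input matches the “–” in the sentential form. We can advance
past “–” to look at “2”. Now we need to expand term.

If term is expanded to a single factor, the 2's match, but the expansion terminates too soon: there is still unconsumed
input and there are no nonterminals left to expand in the sentential form, so the parser needs to backtrack.

With production 5 (term → term * factor) used for term instead, the parser meets with success: all of the input is matched.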

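Recursive descent parsing: A recursive descent parser is a kind of top-down parser built from a set of mutually
recursive procedures, where each procedure implements one of the nonterminals of the grammar.
Example:
E → i E'
E' → + i E' | ɛ          // input: i+i$ (typed without spaces)

    #include <stdio.h>

    int l;                          /* lookahead character */
    int ok = 1;                     /* cleared when a syntax error is seen */

    void match(int t) {             /* consume t, or record an error */
        if (l == t)
            l = getchar();
        else
            ok = 0;
    }

    void Eprime(void) {             /* E' → + i E' | ɛ */
        if (l == '+') {
            match('+');
            match('i');
            Eprime();
        }
        else return;                /* E' → ɛ */
    }

    void E(void) {                  /* E → i E' */
        if (l == 'i') {
            match('i');
            Eprime();
        }
        else ok = 0;
    }

    int main(void) {
        l = getchar();              /* read the first lookahead character */
        E();
        if (ok && l == '$')
            printf("Success\n");
        else
            printf("error\n");
        return 0;
    }

For the input i+i$, the sequence of calls is main → E() → E'() → E'(). The corresponding parse tree has E at the root
with children i and E', that E' has children +, i and E', and the last E' derives ɛ.
To match a nonterminal symbol, a procedure simply calls the corresponding procedure for that nonterminal
(which may be a recursive call, hence the name of the technique). The main difference between recursive descent
parsing and predictive parsing is that recursive descent parsing may or may not require backtracking, while predictive
parsing does not require any backtracking.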
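The same technique can be applied to the expression grammar obtained by eliminating left recursion (E → T E', E' → + T E' | ɛ, T → F T', T' → * F T' | ɛ, F → ( E ) | id). The following is a sketch, not a definitive implementation. It assumes that the token id is written as the single character i, that the input is a C string ending in $, and that errors are recorded in a flag rather than printed. The function names (`parse_E`, `accepts`, and so on) are hypothetical.

```c
#include <stddef.h>

/* Parser state (assumption of this sketch: the whole input is in memory). */
static const char *src;   /* input string, expected to end with '$' */
static size_t pos;        /* index of the lookahead character       */
static int ok;            /* cleared on the first syntax error      */

static void parse_E(void);

/* Consume the lookahead if it equals t, otherwise record an error. */
static void match(char t)
{
    if (src[pos] == t)
        pos++;
    else
        ok = 0;
}

static void parse_F(void)            /* F -> ( E ) | i */
{
    if (src[pos] == '(') {
        match('(');
        parse_E();
        match(')');
    } else {
        match('i');
    }
}

static void parse_Tprime(void)       /* T' -> * F T' | epsilon */
{
    if (src[pos] == '*') {
        match('*');
        parse_F();
        parse_Tprime();
    }
}

static void parse_T(void)            /* T -> F T' */
{
    parse_F();
    parse_Tprime();
}

static void parse_Eprime(void)       /* E' -> + T E' | epsilon */
{
    if (src[pos] == '+') {
        match('+');
        parse_T();
        parse_Eprime();
    }
}

static void parse_E(void)            /* E -> T E' */
{
    parse_T();
    parse_Eprime();
}

/* Returns 1 if input is a sentence of the grammar followed by '$'. */
int accepts(const char *input)
{
    src = input;
    pos = 0;
    ok = 1;
    parse_E();
    return ok && src[pos] == '$' && src[pos + 1] == '\0';
}
```

Each procedure decides which production to use by inspecting only the current lookahead character, so this parser never backtracks. That is possible because the grammar is no longer left recursive and its alternatives begin with different symbols.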
Predictive parsing: The goal of predictive parsing is to construct a top-down parser that never backtracks. To do so, we
must transform a grammar in two ways:
1. eliminate left recursion, and
2. perform left factoring.
The basic idea in predictive parsing is: given A → α | β, the parser should be able to choose between α and β by looking
only at the next input symbol. To accomplish this, the parser needs FIRST and FOLLOW sets.
First and Follow:
The construction of a predictive parser is aided by two functions associated with a grammar G. These functions, FIRST
and FOLLOW, allow us to fill in the entries of a predictive parsing table for G.
To compute FIRST(X) for every grammar symbol X, apply the following rules until no more terminals or ɛ can be added to
any FIRST set.
1. If X is a terminal, then FIRST(X) = {X}.                 // FIRST(terminal) = {terminal}
2. If X → ɛ is a production, then add ɛ to FIRST(X).        // FIRST(ɛ) = {ɛ}
3. If X is a nonterminal and X → Y1 Y2 ... Yk is a production, then add everything in FIRST(Y1) except ɛ to FIRST(X).
   If Y1 can derive ɛ, also add FIRST(Y2) except ɛ, and so on. If every Yi can derive ɛ, add ɛ to FIRST(X).
In other words, FIRST(A) contains every terminal that can appear in first place in a string derived from A.
Example #1.
S → abc | def | ghi
FIRST(S) = {a, d, g}
Example #2.
Productions of Grammar     FIRST of Nonterminals
S → ABC | ghi | jkl        FIRST(S) = {a, b, c, g, j}    // a nonterminal in first place contributes its FIRST set, here FIRST(A)
A → a | b | c              FIRST(A) = {a, b, c}
B → b                      FIRST(B) = {b}
D → d                      FIRST(D) = {d}
Example #3.
Productions of Grammar     FIRST of Nonterminals
S → ABC                    FIRST(S) = {a, b, c, d, e, f, ɛ}
A → a | b | ɛ              FIRST(A) = {a, b, ɛ}
B → c | d | ɛ              FIRST(B) = {c, d, ɛ}
C → e | f | ɛ              FIRST(C) = {e, f, ɛ}
Here the right-hand side of S consists only of nonterminals. FIRST(S) first receives FIRST(A) without ɛ. Because A → ɛ,
we put ɛ in place of A and add FIRST(B) without ɛ. FIRST(B) also contains ɛ, so we put ɛ in place of B and add FIRST(C)
without ɛ. FIRST(C) contains ɛ as well, so putting ɛ in place of C gives S → ɛ. Hence FIRST(S) = {a, b, c, d, e, f, ɛ}.

Example #4.
Productions of Grammar     FIRST of Nonterminals
E → T E'                   FIRST(E) = FIRST(T) = {id, (}
E' → * T E' | ɛ            FIRST(E') = {*, ɛ}
T → F T'                   FIRST(T) = {id, (}
T' → ɛ | + F T'            FIRST(T') = {ɛ, +}
F → id | ( E )             FIRST(F) = {id, (}
Example #5.
Productions of Grammar     FIRST of Nonterminals
S → ABCDE                  FIRST(S) = {a, b, c, d, e, ɛ}
A → a | ɛ                  FIRST(A) = {a, ɛ}
B → b | ɛ                  FIRST(B) = {b, ɛ}
C → c | ɛ                  FIRST(C) = {c, ɛ}
D → d | ɛ                  FIRST(D) = {d, ɛ}
E → e | ɛ                  FIRST(E) = {e, ɛ}
Example #6.
Productions of Grammar     FIRST of Nonterminals
S → Bb | Cd                FIRST(S) = {a, b, c, d}
B → aB | ɛ                 FIRST(B) = {a, ɛ}
C → cC | ɛ                 FIRST(C) = {c, ɛ}
Example #7.
Productions of Grammar     FIRST of Nonterminals
S → ACB | CbB | Ba         FIRST(S) = {d, g, h, ɛ, b, a}
A → da | BC                FIRST(A) = {d, g, h, ɛ}
B → g | ɛ                  FIRST(B) = {g, ɛ}
C → h | ɛ                  FIRST(C) = {h, ɛ}

FOLLOW Sets.
Define FOLLOW(A), for a nonterminal A, to be the set of terminals a that can appear immediately to the right of A in some
sentential form. Note that at some point during the derivation there may have been symbols between A and a, but if so
they derived ɛ and disappeared. If A can be the rightmost symbol in some sentential form, then $ is in FOLLOW(A).
To compute FOLLOW(A) for every nonterminal A, apply the following rules until nothing can be added to any FOLLOW set.
1. Place $ in FOLLOW(S), where S is the start symbol and $ is the input right endmarker.
2. If there is a production A → αBβ, then everything in FIRST(β) except ɛ is placed in FOLLOW(B).
3. If there is a production A → αB, or a production A → αBβ where FIRST(β) contains ɛ, then everything in
   FOLLOW(A) is in FOLLOW(B).
In short, FOLLOW(A) contains all terminals that can appear immediately to the right of A. For example:
1. FOLLOW(start symbol) contains $.
2. Given the productions
   S → ACD
   C → a | b
   FOLLOW(A) = FIRST(C) = {a, b}      // the symbol to the right of A is the nonterminal C, so we take FIRST(C) (rule 2)
   FOLLOW(D) = FOLLOW(S) = {$}        // D is the rightmost symbol of the production, so rule 3 applies
Example 1.
S → aSbS | bSaS | ɛ        FOLLOW(S) = {$, b, a}    // FOLLOW never contains ɛ
Example 2.
Productions of Grammar     FOLLOW of Nonterminals
S → AaAb | BbBa            FOLLOW(S) = {$}
A → ɛ                      FOLLOW(A) = {a, b}
B → ɛ                      FOLLOW(B) = {b, a}

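Example 3.
Productions of Grammar     FOLLOW of Nonterminals
S → ABC                    FOLLOW(S) = {$}
A → DEF                    FOLLOW(A) = {$}    // FIRST(B) and FIRST(C) contain only ɛ, so FOLLOW(S) is added
B → ɛ                      FOLLOW(B) = {$}    // FIRST(C) contains only ɛ, so FOLLOW(S) is added
C → ɛ                      FOLLOW(C) = FOLLOW(S) = {$}
D → ɛ                      FOLLOW(D) = {$}    // FIRST(E) and FIRST(F) contain only ɛ, so FOLLOW(A) is added
E → ɛ                      FOLLOW(E) = {$}    // FIRST(F) contains only ɛ, so FOLLOW(A) is added
F → ɛ                      FOLLOW(F) = FOLLOW(A) = {$}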
Example 4.
Productions of Grammar     FIRST of Nonterminals                          FOLLOW of Nonterminals
S → aBDh                   FIRST(S) = {a}                                 FOLLOW(S) = {$}
B → cC                     FIRST(B) = {c}                                 FOLLOW(B) = {g, f, h}
C → bC | ɛ                 FIRST(C) = {b, ɛ}                              FOLLOW(C) = {g, f, h}
D → EF                     FIRST(D) = FIRST(E) ∪ FIRST(F) = {g, ɛ, f}     FOLLOW(D) = {h}
E → g | ɛ                  FIRST(E) = {g, ɛ}                              FOLLOW(E) = {f, h}
F → f | ɛ                  FIRST(F) = {f, ɛ}                              FOLLOW(F) = {h}
Example 5.
Productions of Grammar     FIRST of Nonterminals      FOLLOW of Nonterminals
S → ABCDE                  FIRST(S) = {a, b, c}       FOLLOW(S) = {$}
A → a | ɛ                  FIRST(A) = {a, ɛ}          FOLLOW(A) = {b, c}
B → b | ɛ                  FIRST(B) = {b, ɛ}          FOLLOW(B) = {c}
C → c                      FIRST(C) = {c}             FOLLOW(C) = {d, e, $}
D → d | ɛ                  FIRST(D) = {d, ɛ}          FOLLOW(D) = {e, $}
E → e | ɛ                  FIRST(E) = {e, ɛ}          FOLLOW(E) = {$}
Because C cannot derive ɛ in this grammar, FIRST(S) stops at FIRST(C) and contains neither d, e nor ɛ.
Example 6.
Productions of Grammar     FIRST of Nonterminals             FOLLOW of Nonterminals
S → Bb | Cd                FIRST(S) = {a, b, c, d}           FOLLOW(S) = {$}
B → aB | ɛ                 FIRST(B) = {a, ɛ}                 FOLLOW(B) = {b}
C → cC | ɛ                 FIRST(C) = {c, ɛ}                 FOLLOW(C) = {d}
Example 7.
Productions of Grammar     FIRST of Nonterminals             FOLLOW of Nonterminals
S → ACB | CbB | Ba         FIRST(S) = {d, g, h, ɛ, b, a}     FOLLOW(S) = {$}
A → da | BC                FIRST(A) = {d, g, h, ɛ}           FOLLOW(A) = {h, g, $}
B → g | ɛ                  FIRST(B) = {g, ɛ}                 FOLLOW(B) = {a, $, h, g}
C → h | ɛ                  FIRST(C) = {h, ɛ}                 FOLLOW(C) = {g, $, b, h}

Construction of a Predictive Parsing Table


Here is the algorithm to construct a predictive parsing table M.
1. For each production A → α:
   i. For each terminal a in FIRST(α), add A → α to M[A, a].
   ii. If ɛ is in FIRST(α), add A → α to M[A, b] for each terminal b in FOLLOW(A). If ɛ is in FIRST(α) and $ is in
       FOLLOW(A), add A → α to M[A, $].
2. Make each undefined entry of M an error.
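The table-filling steps can be sketched in C as follows. To keep the sketch short, it assumes that FIRST(α) for each production and FOLLOW(A) for each nonterminal have already been computed and are passed in as strings. It also assumes single-character symbols, uppercase nonterminals, and `#` for ɛ. None of these conventions come from the notes, and the names `struct rule` and `build_table` are hypothetical.

```c
#include <string.h>

#define EPS '#'   /* stands for epsilon (assumption of this sketch) */

struct rule {
    char lhs;            /* nonterminal A                          */
    const char *rhs;     /* alpha, "#" for epsilon                 */
    const char *first;   /* FIRST(alpha), supplied by the caller   */
};

/* Fill the predictive parsing table.
 * table[A - 'A'][c] is the index of the production to apply when A is to be
 * expanded and c is the next input symbol, or -1 for an error entry.
 * follow[A - 'A'] is FOLLOW(A) as a string of terminals (may contain '$'),
 * or NULL if it is not needed.
 * Returns the number of conflicts, i.e. cells that would need two
 * productions; on a conflict this sketch keeps only the later production. */
int build_table(const struct rule *g, int n, const char *follow[26],
                int table[26][128])
{
    int conflicts = 0;

    /* step 2: every entry starts out as an error entry */
    for (int a = 0; a < 26; a++)
        for (int c = 0; c < 128; c++)
            table[a][c] = -1;

    for (int p = 0; p < n; p++) {
        int *row = table[g[p].lhs - 'A'];

        /* step 1.i uses FIRST(alpha); step 1.ii also uses FOLLOW(A)
         * when epsilon is in FIRST(alpha) */
        const char *sets[2] = { g[p].first, NULL };
        if (strchr(g[p].first, EPS))
            sets[1] = follow[g[p].lhs - 'A'];

        for (int k = 0; k < 2; k++) {
            for (const char *t = sets[k]; t && *t; t++) {
                if (*t == EPS)
                    continue;              /* epsilon is not a table column */
                if (row[(int)*t] != -1 && row[(int)*t] != p)
                    conflicts++;           /* multiple entries in one cell */
                row[(int)*t] = p;
            }
        }
    }
    return conflicts;
}
```

A return value of 0 means that no cell received two productions, so the grammar is LL(1). For the grammar S → aABb, A → c | ɛ, B → d | ɛ the sketch reports no conflicts, while for S → aSbS | bSaS | ɛ it reports two, one in column a and one in column b.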
Example #1.
Grammar              First                   Follow
S → aABb             First(S) = {a}          Follow(S) = {$}
A → c | ɛ            First(A) = {c, ɛ}       Follow(A) = {d, b}
B → d | ɛ            First(B) = {d, ɛ}       Follow(B) = {b}
Parsing table:
      a            b          c          d          $
S     S → aABb
A                  A → ɛ      A → c      A → ɛ
B                  B → ɛ                 B → d

Example #2.
Grammar                   First                    Follow
S → aSbS | bSaS | ɛ       First(S) = {a, b, ɛ}     Follow(S) = {$, a, b}
Parsing table:
      a            b            $
S     S → aSbS     S → bSaS     S → ɛ
      S → ɛ        S → ɛ
The grammar is not LL(1): an LL(1) parsing table cannot have multiple entries in one cell.

Example #3.
Grammar              First                   Follow
S → AaAb | BbBa      First(S) = {a, b}       Follow(S) = {$}
A → ɛ                First(A) = {ɛ}          Follow(A) = {a, b}
B → ɛ                First(B) = {ɛ}          Follow(B) = {a, b}
Parsing table:
      a            b            $
S     S → AaAb     S → BbBa
A     A → ɛ        A → ɛ
B     B → ɛ        B → ɛ

Example #4.
Grammar              First                    Follow
E → TX               First(E) = {int, (}      Follow(E) = {$, )}
X → +E | ɛ           First(X) = {+, ɛ}        Follow(X) = {$, )}
T → int Y | (E)      First(T) = {int, (}      Follow(T) = {+, $, )}
Y → *T | ɛ           First(Y) = {*, ɛ}        Follow(Y) = {+, $, )}
Parsing table:
      int          +          *          (          )          $
E     E → TX                             E → TX
X                  X → +E                           X → ɛ      X → ɛ
T     T → int Y                          T → (E)
Y                  Y → ɛ      Y → *T                Y → ɛ      Y → ɛ

Example #5.
Grammar              First                   Follow
S → aB | ɛ           First(S) = {a, ɛ}       Follow(S) = {$}
B → bC | ɛ           First(B) = {b, ɛ}       Follow(B) = {$}
C → cS | ɛ           First(C) = {c, ɛ}       Follow(C) = {$}
Parsing table:
      a          b          c          $
S     S → aB                           S → ɛ
B                B → bC                B → ɛ
C                           C → cS     C → ɛ

Example #6.
Grammar              First                   Follow
S → aAa | ɛ          First(S) = {a, ɛ}       Follow(S) = {$, a}
A → abS | ɛ          First(A) = {a, ɛ}       Follow(A) = {a}
Parsing table:
      a           b          $
S     S → aAa                S → ɛ
      S → ɛ
A     A → abS
      A → ɛ

Example #7.
Grammar              First                   Follow
E → T E'
E' → + T E' | ɛ
T → F T'
T' → * F T' | ɛ
F → id | ( E )
