
UNIT-2 Syntax Analysis

PART-B
ESSAY QUESTIONS WITH SOLUTIONS

2.1 INTRODUCTION

Q10. Discuss about,
(a) Role of the parser
(b) Representative grammars.
Model Paper-, Q4

Answer:

(a) Role of the Parser
Parser or syntax analyzer is the second phase of a compiler. It takes the tokens (such as keywords, identifiers, etc.) from the scanner or lexical analyzer and checks the syntax of the input stream (string of tokens) against the rules of the grammar defined by the programming language.

Source program → Lexical analyzer → Parser → Rest of front end → Intermediate representation
(The lexical analyzer, the parser and the rest of the front end share the symbol table.)

Figure: Role of Parser

(b) Representative Grammars
Some of the grammars that are suitable for specific methods are discussed below,

(i) Expression Grammar
The grammar is called expression grammar, if it describes expressions, terms and factors. For instance, the following grammar is expression grammar.

E → E + T | T
T → T * F | F
F → (E) | id

Where,
E is the set of expressions. It contains terms which are isolated by + operator.
T is the set of terms. It contains factors which are isolated by * operator.
F is the set of factors. It contains either identifier (id) or parenthesized expression.

It is a type of LR grammar suitable for bottom-up parsing. Apart from this, it is also useful for dealing with more number of operators and higher levels of precedence. This grammar follows associativity and precedence. However, it cannot be applied for top-down parsing, because the productions for E and T are left recursive.

(ii) Constructs
Constructs are grammars that start with keywords such as while, int etc. Such keywords are helpful in selecting the production of the grammar which needs to be considered for matching input. This simplifies the parsing process.
SPECTRUM ALL-IN-ONE JOURNAL FOR ENGINEERING STUDENTS - SIA GROUP
COMPILER DESIGN [JNTU-HYDERABAD]
(iii) Non-left Recursive Form of Expression Grammar
The expression grammar,

E → E + T | T
T → T * F | F
F → (E) | id

has the following non-left recursive variant,

E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id

This grammar is suitable for top-down parsing.

(iv) Handling Ambiguous Grammar
The following grammar considers plus (+) operator and multiplication (*) operator as identical. Thus, it is used for dealing with ambiguity that arises during parsing.

E → E + E
E → E * E
E → (E)
E → id

Where, E represents all types of expressions.

Q11. Explain in detail syntax error handling.

Answer

Syntax Error Handling

1. Syntax Error
Syntax or syntactic errors are the errors that arise during syntax analysis. These errors can be incorrect usage of semicolons, additional or missing braces etc. In C or Java, a syntactic error could be the presence of a case statement without an enclosing switch.

2. Syntax Error Detection
Syntax errors can be detected through the use of precise parsing methods. The parsing methods such as LL and LR are capable of detecting the errors immediately after their occurrence. They possess the viable-prefix property, according to which an error is triggered whenever a prefix of the input is not a prefix of any string of the language. Alternatively, an error is triggered upon encountering a sequence of tokens from lexical analysis that cannot be parsed any more according to the grammar.

However, the error handler has to achieve the following goals during parsing.
(i) Addressing the error clearly and precisely. Addressing must involve at least the location of the error within the source program.
(ii) Recover from an error as soon as possible so as to detect the next errors.
(iii) Put low overhead on the processing of error-free programs.

Syntax Error Recovery
The various strategies employed by a parser to recover from syntactic error are as follows,
(i) Panic mode recovery
(ii) Phrase level recovery
(iii) Error production
(iv) Global correction.

(i) Panic Mode Recovery
Panic mode recovery is a simple strategy employed by most of the parsing methods to recover from syntactic error. The idea behind it is to discard input symbols one at a time after discovering an error until one of a set of synchronizing tokens, like semicolon or end (representing the end of a statement or block), is found. Hence this strategy skips a considerable amount of input without checking it for any additional error. The advantage of this strategy is that it ensures that the parser will not enter into an infinite loop while performing the skips. Besides this, the method is mainly used when there are less number of errors encountered in a statement.

(ii) Phrase Level Recovery
Phrase level recovery strategy is employed by most of the error repairing compilers to recover from syntactic error. The main idea in this method is to perform local correction on the input that remains unchecked on discovering an error. This correction can be done by substituting a string in place of a prefix of the remaining input. Some of the local corrections done by compilers are as follows,
Replacing comma by semicolon
Deleting extra semicolon
Inserting missing semicolon
The main drawback of this method is that it faces difficulty when the actual error occurred before the point of detection.

(iii) Error Production
Error production strategy helps in generating appropriate error messages for the errors encountered while parsing a given input string. The grammar is extended with productions that describe common erroneous constructs, so the parser can continue parsing the given input string after reporting the error. One of the main disadvantages of this method is that it faces difficulty in maintaining the grammar. This is because if the grammar changes, the corresponding error productions need to be changed as well.

(iv) Global Correction
Global correction is a theoretical strategy in which the parser uses an algorithm that performs insertion and deletion on tokens to recover from erroneous input with as few changes as possible. Such algorithms are generally too costly in time and space to use in practice, so the strategy is mainly of theoretical interest.
Look for the SIA GROUP LOGO on the TITLE COVER before you buy
2.2 CONTEXT FREE GRAMMARS

Q12. What are context free grammars? Also give the formal definition of context free grammars.
Model Paper-II, Q4(a)

Answer

Context Free Grammar (CFG)
A context free grammar (or simply a grammar) describes the programming language constructs. It consists of terminals, non-terminals, a start symbol and productions. It is denoted by G = (V_N, V_T, P, S).

Terminals (V_T)
Terminals are the finite set of symbols from which the strings of the language are formed. These are generally represented by small letters a, b, c, ...

Non-terminals (V_N)
Non-terminals are the finite set of variables that represent a language i.e., a set of strings. These are generally represented by capital letters A, B, C, ...

Start Symbol (S)
In a grammar, one of the non-terminals is used as a start symbol. It represents the language being defined by the grammar. It is generally denoted by S.

Productions (P)
Productions are the set of rules that describe the recursive definition of a language. All productions are of the form,
α → β
Where,
α: A non-terminal i.e., α ∈ V_N
β: A combination of terminals and non-terminals i.e., β ∈ (V_N ∪ V_T)*.

Examples
1. The grammar that generates strings having equal number of a's and b's is,
G = ({S}, {a, b}, P, S)
Where, P is,
S → aSbS
S → bSaS
S → ε

2. The grammar that generates the set of all strings with exactly one 1 over Σ = {0, 1} is,
G = ({S, A}, {0, 1}, P, S)
Where, P is,
S → A1A
A → 0A | ε

The language generated by a CFG is called as Context Free Language (CFL).

CFL for Given Grammar
Given grammar, G is,
S → abB
A → aaBb
B → bbAa
A → ε

Let us derive the strings generated by G. We start with start symbol S.
S ⇒ abB
⇒ abbbAa
⇒ abbbaaBba
⇒ abbbaabbAaba
⇒ abbbaabbεaba
⇒ abbbaabbaba

Clearly, the grammar generates strings containing n number of a's and (n + 1) number of b's.
The CFL is given by,
L = {ab(bbaa)^n bba(ba)^n | n ≥ 0}.

Q13. Discuss about the notational conventions of context free grammars.

Answer:
The notational conventions of context free grammars are listed below,
1. The following symbols are treated as non-terminals,
(i) The initial uppercase letters of the alphabet. For instance, A, B, C.
(ii) The italic words in lower case like expr or stmt.
(iii) The start symbol of the grammar i.e., S.
(iv) In case of programming constructs, uppercase letters denote the constructs. For instance, E represents expressions, T represents terms and F represents factors respectively.
2. The following symbols are treated as terminals,
(i) The initial letters of the alphabet in lowercase like a, b, c.
(ii) The digits 0 to 9 (0, 1, ..., 9).
(iii) Strings in bold representing terminal symbols. For instance id, if.
(iv) Operators like +, *, etc.
(v) Punctuations like comma, semicolon, parentheses.
3. The last letters of the alphabet in uppercase like X, Y, Z denote the grammar symbols (either terminals or non-terminals).
4. The Greek letters in lowercase like α, β, γ denote the strings of grammar symbols, which may be empty. This helps in writing the generic production as A → α, where A represents the head of the production and α represents the body of the production.
5. The last letters of the alphabet in lowercase (usually u, v, w, ..., z) denote strings of terminals. These strings may also be empty.
6. The head of the initial production is considered as the start symbol if the start symbol is not specified.
7. The productions of the form A → α1, A → α2, ..., A → αk are called as A-productions because all these productions have A as their head. They can also be represented as A → α1 | α2 | ... | αk. Where, α1, α2, ..., αk are the alternatives for A.
Q14. What are derivations? List and explain different types of derivations.

Answer

Derivations
Derivation is a method of repeatedly replacing a non-terminal in a string by the RHS of one of its production rules until only terminals remain. This process of deriving an input string from a set of production rules is called as derivation.

There are two types of derivations. The examples below assume the grammar S → SbS | a.

(i) Left Most Derivation
In the leftmost derivation of a string, the leftmost non-terminal is replaced first at each step in a derivation.
The leftmost derivation of string abababa is,

S ⇒lm SbS
⇒lm SbSbS
⇒lm SbSbSbS
⇒lm abSbSbS
⇒lm ababSbS
⇒lm abababS
⇒lm abababa

(ii) Right Most Derivation
In the rightmost derivation of a string, the rightmost non-terminal is replaced first at each step in a derivation.
The rightmost derivation of string abababa is,

S ⇒rm SbS
⇒rm SbSbS
⇒rm SbSbSbS
⇒rm SbSbSba
⇒rm SbSbaba
⇒rm Sbababa
⇒rm abababa
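
The two derivation orders can be replayed mechanically. The following is a minimal sketch, assuming the grammar S → SbS | a and a hypothetical helper `derive` that expands either the leftmost or the rightmost S at every step; the sequence of chosen production bodies is supplied by hand.

```python
def derive(start, choices, leftmost=True):
    """Apply one production body per step to the leftmost (or rightmost) S."""
    form = start
    steps = [form]
    for body in choices:
        # Locate the non-terminal to expand: first S for leftmost, last S for rightmost.
        pos = form.find("S") if leftmost else form.rfind("S")
        if pos < 0:
            raise ValueError("no non-terminal left to expand")
        form = form[:pos] + body + form[pos + 1:]
        steps.append(form)
    return steps

# Assumed grammar: S -> SbS | a. Three expansions, then four replacements by a.
choices = ["SbS", "SbS", "SbS", "a", "a", "a", "a"]
leftmost_steps = derive("S", choices, leftmost=True)
rightmost_steps = derive("S", choices, leftmost=False)

print(" => ".join(leftmost_steps))
print(" => ".join(rightmost_steps))
```

Both runs end in the same sentence abababa; only the order in which the non-terminals are replaced differs.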

Q15. Explain the relationship between parse trees and derivations.

Answer

A parse tree or a derivation tree shows a graphical representation of a derivation. It is a labelled tree in which the interior nodes are labelled by non-terminals and the leaves are labelled by terminals or ε. A parse tree for a context free grammar G has the following properties,

1. The root of a parse tree is the start symbol of the CFG.
2. The interior nodes of a parse tree are always non-terminals.
3. Every leaf of a parse tree is a terminal or ε.
4. If the grammar has a production of the form A → y1 y2 ... yn and an interior node of the parse tree contains the non-terminal A, then its children must be labelled by y1, y2, ..., yn from left to right.
5. If the grammar has a production of the form A → ε and an interior node of the parse tree contains the non-terminal A, then ε must be the only child of A.
6. The leaves of a parse tree contain terminals or ε which, when read from left to right, form a string derived from the root.

For example, consider the context free grammar for arithmetic expressions.

E → E + E | E * E | (E) | -E | id

The leftmost derivation for the string -(id + id) is,

E ⇒ -E ⇒ -(E) ⇒ -(E + E) ⇒ -(id + E) ⇒ -(id + id)

and the corresponding parse tree is shown below,

Figure: Parse Tree for String -(id + id)

The parse tree for -(id + id) is constructed using the following steps.

Step 1: The root node of the tree is the start symbol of G, that is E.
Step 2: To apply the production E → -E, two children - and E are added to the start symbol.
Step 3: The non-terminal E is further expanded using the production E → (E), by adding three children to E as (, E and ).

The successive steps in the construction of the parse tree are as follows,

Step 4: The inner E is expanded using E → E + E, giving -(E + E).
Step 5: The left E is replaced by id, giving -(id + E).
Step 6: The right E is replaced by id, giving -(id + id).

Figure: Steps 1 to 6 in the Construction of the Parse Tree
Q16. Discuss about the following,
(a) Ambiguity
(b) Verifying the language generated by a grammar.

Answer

(a) Ambiguity
A grammar is said to be ambiguous if it produces more than one parse tree for some sentence. In other words there is more than one leftmost derivation or more than one rightmost derivation for some sentence. A common cause of ambiguity is that the same non-terminal appears twice at the right hand side of a production, although this alone does not guarantee that a grammar is ambiguous.

Ambiguous grammars are not suitable for some parsers because with more than one parse tree for a sentence, it is difficult to select a parse tree. However, some parser generators like yacc can use ambiguous grammars together with disambiguating rules that discard all the unnecessary parse trees, leaving only one parse tree for each sentence.

Thus, an ambiguous grammar typically contains productions of the form,

S → SαS

Where, α may be a string of terminals or non-terminals i.e., the same non-terminal appears twice on R.H.S of a production.

Example
Consider the following grammar,
S → SS | a | b
Let us derive a string from this grammar,
S ⇒ SS
⇒ SSS
⇒ aSS
⇒ abS
⇒ aba

Now, we construct derivation trees for the string aba. In the first tree the root splits into S (deriving ab) and S (deriving a); in the second tree the root splits into S (deriving a) and S (deriving ba).

Figure: Derivation Trees for aba

Since there are two derivation trees for the string aba, the given grammar is ambiguous.
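
The number of derivation trees can also be counted by program. The sketch below is illustrative only: it assumes the grammar S → SS | a | b and a hypothetical function `count_parse_trees` that tries every way of splitting a substring between the two S's of S → SS.

```python
from functools import lru_cache

def count_parse_trees(s):
    """Count the parse trees of s under the grammar S -> SS | a | b."""
    @lru_cache(maxsize=None)
    def trees(i, j):
        # Trees deriving s[i:j] from S: one if it is a single terminal a or b.
        total = 1 if j - i == 1 and s[i] in "ab" else 0
        # S -> SS: split the substring into two non-empty parts.
        for k in range(i + 1, j):
            total += trees(i, k) * trees(k, j)
        return total
    return trees(0, len(s)) if s else 0

print(count_parse_trees("aba"))  # more than one tree means the grammar is ambiguous
```

Any count greater than 1 for some sentence is enough to show ambiguity; for aba the two trees correspond to the splits (ab)(a) and (a)(ba).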
(b) Verifying the Language Generated by a Grammar
A grammar G generates a language L if the following conditions are satisfied,

1. Every string generated by grammar G must be present in language L.
2. Every string present in the language L can be generated by the grammar G.

To illustrate the above cases consider the following example,

Example

Consider a grammar

A → (A)A | ε

which is capable of generating only strings of balanced parentheses.

For the above grammar, the following statements are true,

(a) Every string derived from A is a string with balanced parentheses.
(b) Every string with balanced parentheses can be derived from A.

Proof for Case (a)

Basis
In a single step (i.e., n = 1) only the empty string (ε) can be derived from A, which is considered as balanced.

Induction Hypothesis
Suppose that all strings which are derived in less than n steps are balanced. Also consider an n-step leftmost derivation. The derivation is of the following form,

A ⇒lm (A)A ⇒*lm (x)A ⇒*lm (x)y

Proof
From the above derivation, it is clear that the derivations of x and y are carried out in less than n steps. Thus, by the induction hypothesis x and y are balanced strings. Hence the string (x)y must also be balanced.

Proof for Case (b)

Basis
If the length of the string is zero, the string must be ε, which is a balanced string.

Induction Hypothesis
It is a fact that the length of a balanced string is even. Suppose that every balanced string having length less than 2n can be derived from A. Also consider a balanced string w having length 2n where n ≥ 1, and let (x) be the smallest (nonempty) prefix of w with balanced parentheses.
From the above assumptions w can be written as,

w = (x)y

Where x and y are balanced strings whose lengths are less than 2n.

Therefore, x and y can be derived from A. Thus, the derivation is of the form,

A ⇒ (A)A ⇒* (x)A ⇒* (x)y

This means w = (x)y can be derived from A.

Hence it is proved that the grammar G generates the language L.
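
The two directions of the proof can be spot-checked by brute force. The sketch below is not a substitute for the induction: it merely compares, for every string of parentheses up to length 8, a hypothetical recognizer `derives` for A → (A)A | ε with an independent depth-counting test `balanced`.

```python
from itertools import product

def derives(s):
    """Return True if s can be derived from A under A -> (A)A | epsilon."""
    def parse_A(i):
        # On '(' the only usable production is A -> (A)A.
        if i < len(s) and s[i] == "(":
            j = parse_A(i + 1)                      # inner A
            if j is None or j >= len(s) or s[j] != ")":
                return None                         # no matching ')'
            return parse_A(j + 1)                   # trailing A
        return i                                    # A -> epsilon
    return parse_A(0) == len(s)

def balanced(s):
    """Independent definition: depth never negative and ends at zero."""
    depth = 0
    for ch in s:
        depth += 1 if ch == "(" else -1
        if depth < 0:
            return False
    return depth == 0

agree = all(
    derives("".join(w)) == balanced("".join(w))
    for n in range(9)
    for w in product("()", repeat=n)
)
print(agree)
```

Note that a string such as `)(` has an equal number of opening and closing parentheses but is not balanced, and the grammar correctly rejects it.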


Q17. Discuss how context free grammar is more powerful than Regular Expression.

Answer:

Context-Free Grammars Versus Regular Expressions
Context-free Grammars (CFG) are more powerful than regular expressions (RE). This is because of the following features of CFG.

(i) CFGs are capable of describing all the constructs that are described by REs. However, REs cannot describe all the constructs that can be described by CFGs.
(ii) The languages which are regular languages are context free languages. However, the languages which are CFLs are not necessarily regular.

The above facts can be illustrated through the following example.
Consider a language consisting of the set of strings of a's and b's ending in abb, as described by the RE (a|b)*abb. The same language can be described by a CFG as,

S0 → aS0 | bS0 | aS1
S1 → bS2
S2 → bS3
S3 → ε

A grammar can be developed mechanically which accepts the language that is accepted by an NFA. The NFA accepting the above language is given below,

Figure: NFA Accepting Strings of a's and b's Ending in abb

Now, a CFG can be constructed as follows,

(i) A non-terminal S_i is created for every state i of the NFA.
(ii) The production S_i → aS_j is introduced for the transition of state i to state j on input a.
(iii) The production S_i → ε is introduced if the accepting state is i.
(iv) The start symbol is made as S_i for the start state i.
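
The construction steps above can be sketched in a few lines. The transition list below is an assumption: it encodes the usual four-state NFA for (a|b)*abb with states 0 to 3, start state 0 and accepting state 3, and `nfa_to_grammar` is a hypothetical helper name.

```python
def nfa_to_grammar(transitions, start, accepting):
    """Build productions S_i -> a S_j from NFA transitions (i, a, j)."""
    productions = {}
    for i, symbol, j in transitions:
        # Step (ii): one production per transition.
        productions.setdefault(f"S{i}", []).append(f"{symbol}S{j}")
    for i in sorted(accepting):
        # Step (iii): accepting states may derive the empty string.
        productions.setdefault(f"S{i}", []).append("ε")
    # Step (iv): the start symbol corresponds to the start state.
    return f"S{start}", productions

# Assumed NFA for (a|b)*abb.
transitions = [(0, "a", 0), (0, "b", 0), (0, "a", 1), (1, "b", 2), (2, "b", 3)]
start_symbol, grammar = nfa_to_grammar(transitions, start=0, accepting={3})

for head, bodies in grammar.items():
    print(head, "->", " | ".join(bodies))
```

The printed productions are the right-linear grammar for strings ending in abb, with one non-terminal per NFA state.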

In addition, the language of x's followed by an equal number of y's [i.e., L = {x^n y^n | n ≥ 1}] is a language which can be generated by context free grammars but not by any RE.
The reason why this language cannot be generated by an RE is discussed below,
Assume that the language L = {x^n y^n | n ≥ 1} is generated by some regular expression. Then it is possible to construct a DFA D accepting L with some k states. For an input having more x's than DFA states (i.e., more than k), the DFA D has to enter some state (say A_i) two times. Name the path from state A_i to itself x^(j-i). Now, there must also be a path labelled y^i from A_i to an accepting state f (as x^i y^i ∈ L). Moreover, there exists a path from the start state A_0 through A_i to the accepting state which is labelled x^j y^i. This means that D also accepts x^j y^i, which is not in the language. This contradicts the assumption that L is the language accepted by DFA D.

Figure: DFA Accepting x^i y^i and x^j y^i

Thus, it is concluded that finite automata cannot accept a language such as L = {x^n y^n | n ≥ 1}, because that demands counting the number of x's before the y's are read.
2.3 WRITING A GRAMMAR

Q18. Discuss in detail about eliminating ambiguity and left recursion.
Model Paper-III, Q4(a)

Answer

Eliminating Ambiguity
The ambiguity from a grammar can be eliminated by rewriting the production rules. In general, the ambiguity from the grammars of the form

A → αAβAγ | α1 | α2 | ... | αn

can be eliminated by rewriting the productions as follows,

A → αAβA'γ | A'
A' → α1 | α2 | ... | αn

If more than one production of a grammar is ambiguous then it is modified by applying these transformations repetitively. For example, consider the context free grammar G for arithmetic expressions given as,

E → E + E | E * E | (E) | id

This grammar is ambiguous because the non-terminal E appears twice at the right hand side of a production. Each arithmetic operator has a precedence. Therefore the precedence of operators must be preserved while writing the grammar rules for expressions involving these operators.

To eliminate the ambiguity from the above grammar, we introduce a new non-terminal (T) to move the rightmost of these non-terminals further down the parse tree. So, we get,

E → E + T | T
T → E * E | (E) | id

This grammar is still ambiguous because by substituting E with T (using production E → T) we get,

T → T * T

Therefore, we introduce another non-terminal F as given below,

T → T * F | F
F → (E) | id

Therefore, the unambiguous grammar for the above grammar is,

E → E + T | T
T → T * F | F
F → (E) | id

The operators - and / are added at the same precedence levels as + and * respectively, which gives,

E → E + T | E - T | T
T → T * F | T / F | F
F → (E) | id

A grammar defines the features of a programming language as recursive rules. The recursion may be left recursion or right recursion.

Eliminating Left Recursion
A grammar is said to be left recursive if the same non-terminal on the left hand side of a production appears as the first symbol in the right hand side. In other words, the grammar is left recursive if there is a derivation of the form,

A ⇒+ Aα

for some string α. We must eliminate left recursion since top-down parsers cannot handle left recursive grammars.

An immediate left recursion is a production of the form,

A → Aα | β

It can be eliminated by rewriting the productions as,

A → βA'
A' → αA' | ε

which does not change the set of strings that can be derived from A.

Example
Consider the CFG for arithmetic expressions as,

E → E + T | T
T → T * F | F
F → (E) | id

The productions for E and T are left recursive. Eliminating the left recursion from these productions, we get the grammar as,

E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id

In general, if there are many productions for A, we must combine all A-productions as given below,

A → Aα1 | Aα2 | ... | Aαm | β1 | β2 | ... | βn

Where no βi can start with an A.

Then, we replace them by the following productions,

A → β1A' | β2A' | ... | βnA'
A' → α1A' | α2A' | ... | αmA' | ε

Q19. Discuss in detail about,
(a) Left factoring
(b) Non-context-free language construct.

Answer

(a) Left Factoring
Left factoring is a grammar transformation in which the common parts of two productions are isolated into a single production. A left factored grammar is suitable for predictive parsing.

Any production of the form A → αβ1 | αβ2 (where α is common) can be replaced by the following productions,

A → αA'
A' → β1 | β2
Left factoring is required because, when the input starts with a non-empty string derived from α, it is difficult to decide whether to expand A to αβ1 or to αβ2. In left factoring we defer this decision by expanding A to αA' until we have seen enough input to make the right choice.

(b) Non-context-free Language Construct
Some syntactic constructs of programming languages are difficult to specify using only grammars. These difficulties can be illustrated with the following examples.

Example 1
Consider the abstract language L = {wcw | w is in (a|b)*}.
In this language, the first w is the identifier declaration, c is the intervening piece of program code and the second w is the use of the identifier, i.e., language L consists of strings like aabbcaabb, in which the same sequence of a's and b's is repeated and separated by 'c'. Such a non-context-free language is directly related to programming languages like C and Java. This is because these programming languages demand the declaration of an identifier prior to its usage. Moreover, identifiers can have arbitrary length.
The grammars of these programming languages represent identifiers using tokens like 'id' rather than distinguishing identifiers based on their character strings. The condition of declaration of identifiers before their usage is verified or checked by the semantic-analysis phase of the compiler.

Example 2
Consider a non-context-free language L = {a^n b^m c^n d^m | n ≥ 1 and m ≥ 1}. This language consists of strings matched by the regular expression a*b*c*d* where,
Number of a's = number of c's and number of b's = number of d's.
In this language, a^n represents that a is repeated n number of times. Here a^n and b^m represent the formal parameter lists of two functions having n and m formal parameters respectively, whereas c^n and d^m represent the actual parameter lists involved in calling the functions.
This language L abstracts the difficulty involved in verifying that the number of formal parameters in a function declaration must coincide with the number of actual parameters in the usage of the function.
In programming languages like C, the syntax of function declaration does not involve the count of parameters. Therefore, this check is done in the semantic-analysis phase.

2.4 TOP-DOWN PARSING

Q20. What is top-down parsing? Explain different types of top-down parsing techniques.
Model Paper-I, Q4(b)

Answer

Top-down Parsing
Top-down parsing may be considered as an attempt to build a parse tree for an input string starting from the root and creating the nodes in preorder. It can also be considered as an attempt to construct a leftmost derivation for an input string.
In this, the top-down parser constructs the leftmost derivation from the start symbol of the grammar. Then it selects a suitable production rule such that it can match the input string from left to right in the sentential form. If there is more than one production rule for the leftmost non-terminal then the selection of the production rule depends on whether the parser can backtrack or not. If the parser can backtrack, it can scan the input string repetitively and try out all the possibilities in any order until it has succeeded in parsing the string. On the other hand, if backtracking is not permitted then the parser has to make the right selection of the production rule, which is the crucial task in such parsers.
Top-down parsers are popular since it is easy to construct efficient parsers by hand using top-down methods.

Types of Top-down Parsing
The different types of top-down parsing techniques are,
(i) Recursive Descent Parsing
For answer refer Unit-II, Q21.
(ii) Non-recursive Predictive Parsing
For answer refer Unit-II, Q24.

Q21. Discuss in detail Recursive-Descent parsing.

Answer

Recursive Descent Parsing
Parsing means generating a parse tree. In top-down parsing a parse tree is generated by taking the starting variable of the grammar as the parent node (root) and constructing the nodes in preorder.
The recursive descent parsing method is the general form of top-down parsing. In this type of parser, repeated scans of the input may be used. Making repeated scans or performing repeated reading of the input is called as backtracking.

Steps
Every non-terminal has a separate procedure. The body of the procedure corresponds to the RHS of the production rules of that non-terminal. The RHS of the production rule is converted to code symbol by symbol.
The basic steps to be followed while constructing a recursive descent parser are as follows,

1. If a terminal is the current symbol of the RHS then it is matched with the lookahead symbol. If the match is successful then the lookahead pointer is incremented by 1 to point to the next input symbol.
2. If a non-terminal is the current symbol of the RHS then the procedure of the corresponding non-terminal is called.
3. If there are many alternatives for the production rule then the alternatives are combined to form a single procedure body.
4. Finally, the procedure corresponding to the start symbol activates the parser.

Let us take an example for recursive descent parsing, which needs backtracking.

Example
G = ({I, B}, {a, b, c, d}, R, {I})
R is given by,
I → cBd
B → ab | a
and the input string that is to be parsed is cad.

To parse an input string we construct a syntax tree, using the start variable and creating nodes from left to right, in preorder.

The parser has a pointer, pointing to the first symbol of the input string.

Figure: Structure of a Recursive Descent Parser (input string with end marker, error handler, stack with bottom-of-stack marker, recursive descent parsing procedure)

The parser starts parsing from the initial or start variable i.e., I and generates the syntax tree for,

I → cBd

Now, the parser checks the first input symbol i.e., 'c' and compares it with the nodes of the tree from left to right. As the first input symbol 'c' matches the leftmost node 'c' in the syntax tree, the parser increments the pointer to the next symbol.

The parser is now pointing to the input symbol 'a'. It compares 'a' to the next node in the tree i.e., B, which is a non-terminal. The parser expands B with its yield; if more than one yield exists then priority is given to the first yield from the left.

B → ab | a

Now, the parser is pointing to the input symbol 'a' and the leftmost child of variable B is 'a'. It finds a match and the pointer is incremented to the next input symbol 'd'. The next child of B is 'b', which does not match with 'd'. At this stage, the parser does backtracking for the variable B. It checks whether B has any more yields which can be used. If there are no more yields it generates an error. If yes, then it replaces B with that yield and constructs a new tree.

I → cBd with B → a

The pointer is reset to the input symbol 'a', which now matches the child 'a'; hence the parser increments the pointer. Now, the parser is pointing to the input symbol 'd'.
Now, the rightmost child of variable I matches the input symbol 'd'.

Now, the parser halts and declares that the input string is syntactically correct.
There are grammars that can cause a recursive descent parser to enter into an infinite loop. One such kind of grammar is known as left-recursive grammar.
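
The backtracking behaviour described in the example can be sketched as follows. This is an illustrative implementation, not the procedure-per-non-terminal form described above: it assumes the grammar I → cBd, B → ab | a with single-character symbols, and uses a hypothetical generator `parse` that yields every input position reachable after matching a sequence of symbols, so that a failed alternative is abandoned and the next one is tried.

```python
GRAMMAR = {"I": ["cBd"], "B": ["ab", "a"]}  # I -> cBd, B -> ab | a

def parse(symbols, text, pos):
    """Yield every input position reachable after matching `symbols` from `pos`."""
    if not symbols:
        yield pos
        return
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:
        # Non-terminal: try each alternative in order. A later alternative is
        # tried when an earlier one fails further on (backtracking).
        for body in GRAMMAR[head]:
            for mid in parse(body, text, pos):
                yield from parse(rest, text, mid)
    elif pos < len(text) and text[pos] == head:
        # Terminal: match it against the lookahead and advance the pointer.
        yield from parse(rest, text, pos + 1)

def accepts(text):
    """True if the whole input can be derived from the start symbol I."""
    return any(end == len(text) for end in parse("I", text, 0))

print(accepts("cad"))   # B -> ab fails on 'd', so the parser falls back to B -> a
print(accepts("cabd"))  # the first alternative B -> ab succeeds directly
```

A left-recursive rule such as B → Ba would make `parse` call itself at the same input position without consuming anything, which is the infinite loop mentioned above.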
Q22. Explain in detail first and follow functions. Give examples.

Answer

FIRST Function
FIRST(α) is the set of terminal symbols that are the first symbols in the strings derived from α, where α is any string of grammar symbols. If α can derive ε, then ε is also in FIRST(α).
The rules for computing FIRST(X) for all grammar symbols X are,
1. If X is a terminal symbol then FIRST(X) = {X}.
2. If X is a non-terminal having a production rule X → ε then ε is added to FIRST(X).
3. If X is a non-terminal having a production rule X → A1 A2 ... Ak then add a to FIRST(X) if for some i,
a is in FIRST(Ai) and
ε is in FIRST of all of A1, ..., A(i-1).
If ε is in FIRST(Aj) for all j = 1, 2, ..., k then ε is added to FIRST(X).

That is, whatever is in FIRST(A1) (other than ε) is in FIRST(X). If ε is not in FIRST(A1) then nothing more is added to FIRST(X). However, if A1 derives ε then we also add FIRST(A2), and so on.

The above rules for FIRST(X) are applied until there are no more terminals or ε that can be added to any FIRST set.

FOLLOW Function
FOLLOW(A) is the set of terminal symbols that can appear immediately after A in some sentential form.
The rules for computing FOLLOW(A) for all non-terminals A of the grammar are,
1. For the start symbol S, $ is added to FOLLOW(S), where $ is used as the end marker for the input.
2. If A has a production A → αBβ, where FIRST(β) does not contain ε, then everything in FIRST(β) is added to FOLLOW(B).
3. If A has a production A → αB then everything in FOLLOW(A) is added to FOLLOW(B).
4. If A has a production A → αBβ, where FIRST(β) contains ε, then everything in (FIRST(β) - {ε}) ∪ FOLLOW(A) is added to FOLLOW(B).
5. If A is the rightmost symbol in some sentential form then $ is added to FOLLOW(A).
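
The FIRST and FOLLOW rules are fixed-point computations, which the following sketch makes explicit. It is a minimal illustration under stated assumptions: the grammar S → aAB | bA | ε, A → aAb | ε, B → bB | ε is encoded as lists of symbols (an empty list stands for ε), and the function names `first_of`, `compute_first` and `compute_follow` are hypothetical.

```python
EPS, END = "ε", "$"

# Assumed grammar: S -> aAB | bA | ε, A -> aAb | ε, B -> bB | ε
GRAMMAR = {
    "S": [["a", "A", "B"], ["b", "A"], []],
    "A": [["a", "A", "b"], []],
    "B": [["b", "B"], []],
}

def first_of(symbols, first):
    """FIRST of a string of grammar symbols, given the FIRST sets of non-terminals."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in first else {sym}  # a terminal is its own FIRST
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)  # every symbol can vanish, or the string is empty
    return result

def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:  # repeat until no FIRST set grows
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                new = first_of(body, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first

def compute_follow(grammar, start, first):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)  # rule 1: end marker follows the start symbol
    changed = True
    while changed:  # repeat until no FOLLOW set grows
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in grammar:
                        continue  # FOLLOW is defined only for non-terminals
                    tail = first_of(body[i + 1:], first)
                    new = tail - {EPS}
                    if EPS in tail:  # rest can vanish: inherit FOLLOW of the head
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

FIRST = compute_first(GRAMMAR)
FOLLOW = compute_follow(GRAMMAR, "S", FIRST)
print(FIRST)
print(FOLLOW)
```

For this grammar the sketch gives FIRST(S) = {a, b, ε}, FIRST(A) = {a, ε}, FIRST(B) = {b, ε} and FOLLOW(S) = {$}, FOLLOW(A) = {b, $}, FOLLOW(B) = {$}.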

Example
Consider the grammar given below,
G: S → aAB
S → bA
S → ε
A → aAb
A → ε
B → bB
B → ε

The above grammar (G) is free from left recursion and is left factored.

Computing FIRST and FOLLOW

FIRST
FIRST(S) = FIRST(a) ∪ FIRST(b) ∪ FIRST(ε)
= {a, b, ε} (using rule 1 and 2)
FIRST(A) = FIRST(a) ∪ FIRST(ε)
= {a, ε} (using rule 1 and 2)
FIRST(B) = FIRST(b) ∪ FIRST(ε)
= {b, ε} (using rule 1 and 2)

Therefore FIRST of S, A and B are as follows,
FIRST(S) = {a, b, ε}
FIRST(A) = {a, ε}
FIRST(B) = {b, ε}

FOLLOW
FOLLOW(S) = {$} [As S is the start symbol]

FOLLOW(A)
The production involving A, S → aAB, is of the form A → αBβ. Using rule (2) for FOLLOW, we get,
S → a A B
A → α B β
Then FIRST(β) - ε
= FIRST(B) - ε
= {b, ε} - ε
= {b}
Consider the production S → bA.
The production involving A is S → bA, which is of the form A → αB, so
FOLLOW(A) ⊇ FOLLOW(S) = {$}
FOLLOW(A) = {b, $}

FOLLOW(B)
The production involving B is S → aAB, which is of the form A → αBβ with β = ε. Using rule (3) for FOLLOW we get,
S → a A B
A → α B β
Hence β = ε
Hence by using rule (3) we get,
FOLLOW(B) = FOLLOW(S)
FOLLOW(B) = {$}

Hence we get,
FOLLOW(S) = {$}
FOLLOW(A) = {b, $}
FOLLOW(B) = {$}

Q23. What are LL(1) grammars? Explain in detail.

Answer

LL(1) Grammar
A grammar is said to be an LL(1) grammar if its parsing table does not contain multiply-defined entries, that is, an entry M[A, a] always contains at most a single production A → α from the grammar.

Here,
L: Stands for left to right scanning of the input.
L: Stands for producing the leftmost derivation for the input.
1: Stands for using one lookahead symbol for making parsing action decisions.

An LL(1) grammar has the following properties.
1. A grammar which is ambiguous or left recursive cannot be an LL(1) grammar.
2. The parsing table of an LL(1) grammar does not have multiply-defined entries.
3. If A → α | β are two distinct productions, then an LL(1) grammar must satisfy the following conditions,
(i) The strings derived from α and β should not begin with the same terminal a i.e., FIRST(α) ∩ FIRST(β) contains no terminal.
(ii) At most one of α and β can derive the empty string i.e., ε is not in both FIRST(α) and FIRST(β).
(iii) If β derives the empty string then α should not derive strings that begin with the terminals in FOLLOW(A) i.e., FIRST(α) ∩ FOLLOW(A) = ∅.
The same argument holds if α derives an empty string.
An LL(1) parser provides detailed information on the internals of the parsing process. It is also capable of catching errors because the situations of syntax errors are recorded explicitly in the table.

An example of an LL(1) grammar is the non-left-recursive grammar for arithmetic expressions. Non-LL(1) grammars have multiply-defined entries in their parsing table. An example of such grammars is the dangling else grammar as given below.

S → iEtS | iEtSeS | a
E → b

Left factoring this grammar, we get,

S → iEtSS' | a
S' → eS | ε
E → b

The parsing table for this grammar is given below.

Non-terminal   Input Symbol
               a        b        e                   i             t    $
S              S → a                                 S → iEtSS'
S'                               S' → ε, S' → eS                        S' → ε
E                       E → b

Table: Parsing Table for Dangling else Grammar

In the above parsing table the entry M[S', e] defines multiple entries, for the productions S' → eS and S' → ε.
This grammar is ambiguous, and the ambiguity shows up as the choice of which one of the productions for S' is to be chosen upon seeing the input symbol e.

If the parser chooses S' → ε then this prevents e from ever being put on the stack or removed from the input. If the second choice S' → eS is made then it resolves the ambiguity. This choice associates each else with the closest previous then. So the parser can parse the string by making the multiply-defined entry a single valued entry i.e., M[S', e] = S' → eS.

The multiply-defined entries of a parsing table can often be removed by first eliminating the left recursion and left factoring the grammar, then constructing a parsing table for the transformed grammar.
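
The multiply-defined entry can be located mechanically. The sketch below is illustrative: it assumes the left-factored dangling else grammar, takes the FIRST set of each production body and the FOLLOW sets as hand-computed inputs (FOLLOW(S) = FOLLOW(S') = {e, $}, FOLLOW(E) = {t}), and fills the table with the standard rule that A → α goes into M[A, a] for every a in FIRST(α), and into M[A, b] for every b in FOLLOW(A) when α can derive ε. The name `build_table` is hypothetical.

```python
from collections import defaultdict

EPS = "ε"

# (head, body, FIRST of the body), worked out by hand for this grammar.
PRODUCTIONS = [
    ("S", "iEtSS'", {"i"}),
    ("S", "a", {"a"}),
    ("S'", "eS", {"e"}),
    ("S'", EPS, {EPS}),
    ("E", "b", {"b"}),
]
FOLLOW = {"S": {"e", "$"}, "S'": {"e", "$"}, "E": {"t"}}

def build_table(productions, follow):
    """Fill M[A, a] with every production that could be chosen on lookahead a."""
    table = defaultdict(list)
    for head, body, first in productions:
        for terminal in first - {EPS}:
            table[head, terminal].append(f"{head} -> {body}")
        if EPS in first:
            # The body can vanish, so the lookahead comes from FOLLOW(head).
            for terminal in follow[head]:
                table[head, terminal].append(f"{head} -> {body}")
    return table

table = build_table(PRODUCTIONS, FOLLOW)
conflicts = {cell: rules for cell, rules in table.items() if len(rules) > 1}
print(conflicts)
```

The only cell with more than one production is M[S', e], so the grammar is not LL(1); keeping only S' → eS in that cell is the disambiguating choice discussed above.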

Q24. Explain the procedure for non-recursive predictive parsing table with an example.
Model Paper-, Q4

Answer

Non-recursive Predictive Parsing

A non-recursive predictive parser uses a parsing table which shows which production rule to select from the several alternatives available for expanding a given non-terminal, based on the first terminal symbol that should be produced by that non-terminal. The parsing table consists of rows and columns where there is a row for each non-terminal and a column for each terminal symbol, including $, the end marker for the input string. Each entry M[A, a] in the table is either a production rule or an error. The parser also uses a stack containing a sequence of grammar symbols, with the $ symbol placed at the bottom indicating the bottom of the stack. Initially, the start symbol resides on top. The stack is used to keep track of all the non-terminals for which no prediction has been made yet.

The parser also uses an input buffer and an output stream. The string to be parsed is stored in the input buffer. The end of the buffer uses a $ symbol to indicate the end of the input string.

The configuration of the non-recursive predictive parser is shown in the figure.
