UNIT-2 Syntax Analysis
PART-B
ESSAY QUESTIONS WITH SOLUTIONS
2.1 INTRODUCTION
(a) Role of the parser
(b) Representative grammars.
Model Paper-, 94
Answer:
Figure: Role of Parser (labels recovered from the figure: lexical analyzer, parser, symbol table, intermediate representation)
(b) Representative Grammars
Some of the grammars that are suitable for specific methods are discussed below,
Expression Grammar
The grammar is called expression grammar, if it describes expressions, terms and factors. For instance, the following grammar is an expression grammar.
E -> E + T | T
T -> T * F | F
F -> (E) | id
Where,
E is the set of expressions. It contains terms which are isolated by + operator.
T is the set of terms. It contains factors which are isolated by * operator.
F is the set of factors. It contains either identifier (id) or parenthesized expression.
This grammar follows associativity and precedence. It is a type of LR grammar suitable for bottom-up parsing. However, it cannot be used for top-down parsing because it is left recursive.
An error handler in a parser should at least,
(i) Report the location of an error within the program.
(ii) Recover from an error as soon as possible so as to detect the next errors.
(iii) Put low overhead on the processing of error-free programs.
Look for the SIA GROUP LOGO on the TITLE COVER before you buy
2.2 CONTEXT FREE GRAMMARS
Q12. What are context free grammars? Also give the formal definition of context free grammars.
Answer:
Model Paper-II, Q4(a)
Context Free Grammar (CFG)
A context free grammar (or simply a grammar) describes the programming language constructs. It consists of terminals, non-terminals, a start symbol and productions. It is denoted by G = (V_n, V_t, P, S).
Terminals (V_t)
Terminals are the finite set of symbols from which the strings for the language are formed. These are generally represented by small letters a, b, c, ...
Non-terminals (V_n)
Non-terminals are the finite set of variables that represent a language i.e., a set of strings. These are generally represented by capital letters A, B, C, ...
Start Symbol (S)
In a grammar, one of the non-terminals is used as a start symbol. It represents the language being defined by the grammar. It is generally denoted by S.
Productions (P)
Productions are the set of rules that describe the recursive definition of a language. All productions are of the form,
α -> β
Where,
α : A non-terminal i.e., α ∈ V_n
β : A combination of terminals and non-terminals i.e., β ∈ (V_n ∪ V_t)*
Examples
1. The grammar that generates strings having equal number of a's and b's is,
G = ({S}, {a, b}, P, S)
Where, P is,
S -> aSbS
S -> bSaS
S -> ε
2. The grammar that generates the set of all strings with exactly one 1 over Σ = {0, 1} is,
G = ({S, A}, {0, 1}, P, S)
Where, P is,
S -> A1A
A -> 0A | ε
The language generated by a CFG is called a Context Free Language (CFL).
CFL for Given Grammar
Given grammar, G is,
S -> abB
A -> aaBb
B -> bbAa
A -> ε
Let us derive the strings generated by G. We start with start symbol S.
S => abB
=> abbbAa
=> abbbaaBba
=> abbbaabbAaba
=> abbbaabbεaba
=> abbbaabbaba
Clearly, the grammar generates strings containing n number of a's and (n + 1) number of b's.
The CFL is given by,
L = {ab (bbaa)^n bba (ba)^n | n ≥ 0}.
Q13. Discuss about the notational conventions of context free grammars.
Answer:
The notational conventions of context free grammars are listed below,
1. The following symbols are treated as non-terminals,
(i) The initial uppercase letters of the alphabet. For instance, A, B, C.
(ii) The italic words in lower case like expr or stmt.
(iii) The start symbol of the grammar i.e., S.
(iv) In case of programming constructs, uppercase letters denote the constructs. For instance, E represents expressions, T represents terms and F represents factors respectively.
2. The following symbols are treated as terminals,
(i) The initial letters of the alphabet in lowercase like a, b, c.
(ii) The digits 0 to 9 (0, 1, ..., 9).
(iii) Strings in bold representing terminal symbols. For instance id, if.
(iv) Operators like +, *, etc.
(v) Punctuations like comma, semicolon, parentheses.
3. The last letters of the alphabet in uppercase like X, Y, Z denote the grammar symbols (either terminals or non-terminals).
COMPILER DESIGN [JNTU-HYDERABAD]
4. The Greek letters in lowercase like α, β, γ denote the strings of grammar symbols which may be empty. This helps in writing the generic production as A -> α, where A represents the head of the production and α represents the body.
5. The last letters of the alphabet in lowercase like u, v, ..., z denote strings of terminals.
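As a rough illustration of the formal definition of a CFG given under Q12, the sketch below stores the example grammar for strings with equal numbers of a's and b's (S -> aSbS | bSaS | ε) as a dictionary and enumerates its short sentences by breadth-first leftmost expansion. The dictionary layout and the helper name `sentences` are assumptions made only for this example; they do not come from the text.

```python
from collections import deque

# Example grammar: S -> aSbS | bSaS | epsilon (the empty string "" stands for epsilon).
GRAMMAR = {
    "nonterminals": {"S"},
    "terminals": {"a", "b"},
    "productions": {"S": ["aSbS", "bSaS", ""]},
    "start": "S",
}

def sentences(grammar, max_len):
    """Return every terminal string with at most max_len symbols derivable from the start symbol."""
    found = set()
    seen = {grammar["start"]}
    queue = deque([grammar["start"]])
    while queue:
        form = queue.popleft()
        # Locate the leftmost non-terminal of the sentential form, if any.
        pos = next((i for i, s in enumerate(form)
                    if s in grammar["nonterminals"]), None)
        if pos is None:
            found.add(form)          # only terminals left: this is a sentence
            continue
        for body in grammar["productions"][form[pos]]:
            new_form = form[:pos] + body + form[pos + 1:]
            terminal_count = sum(1 for s in new_form if s in grammar["terminals"])
            # Prune forms that already contain too many terminals.
            if terminal_count <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return found

short_words = sentences(GRAMMAR, 4)
```

Every string in `short_words` can then be checked for an equal count of a's and b's, which is the property the example grammar is meant to capture.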
Answer:
Derivations
Derivation is a method of repeatedly replacing a non-terminal symbol with the RHS of one of its production rules until only terminals remain. The process of deriving an input string from a set of production rules in this way is called a derivation.
Leftmost derivation,
S =>lm SbS
=>lm SbSbS
=>lm SbSbSbS
=>lm abSbSbS
=>lm ababSbS
=>lm abababS
=>lm abababa
Rightmost derivation,
S =>rm SbS
=>rm SbSbS
=>rm SbSbSbS
=>rm SbSbSba
=>rm SbSbaba
=>rm Sbababa
=>rm abababa
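The leftmost and rightmost derivations of abababa can be replayed mechanically. The sketch below assumes the grammar S -> SbS | a, which is inferred from the derivation steps rather than stated in the text, and the helper name `derive` is hypothetical.

```python
def derive(start, choices, nonterminals, leftmost=True):
    """Apply the production bodies in `choices` one by one, each time rewriting
    the leftmost (or rightmost) non-terminal of the current sentential form."""
    form = start
    steps = [form]
    for body in choices:
        positions = [i for i, s in enumerate(form) if s in nonterminals]
        pos = positions[0] if leftmost else positions[-1]
        form = form[:pos] + body + form[pos + 1:]
        steps.append(form)
    return steps

# Three uses of S -> SbS followed by four uses of S -> a.
choices = ["SbS", "SbS", "SbS", "a", "a", "a", "a"]
leftmost_steps = derive("S", choices, {"S"}, leftmost=True)
rightmost_steps = derive("S", choices, {"S"}, leftmost=False)

for step in leftmost_steps:
    print(step)
```

Both lists end in the same sentence; only the order in which the non-terminals are replaced differs.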
The root of a parse tree is the start symbol of the CFG.
The parse tree is shown below,
Figure: Parse Tree for String (id + id)
The parse tree for (id + id) is constructed using the following steps,
Figure: Step 1 and Step 2
E is expanded using the production E -> (E), which adds children to E as (, E and ).
The successive steps in the construction of parse tree are shown below,
Figure: Step 4, Step 5 and Step 6
Q16. Discuss about the following,
(a) Ambiguity
(b) Verifying the language generated by a grammar.
Answer
(a) Ambiguity
A grammar is said to be ambiguous if it produces more than one parse tree for some sentence. In other words there is more than one leftmost derivation or more than one rightmost derivation for some sentence. A grammar typically becomes ambiguous when the same non-terminal appears twice at the right hand side of a production.
Ambiguous grammars are not suitable for some parsers because with more than one parse tree for a sentence, it is difficult to select a parse tree. However, some parsers like yacc can use ambiguous grammars with disambiguating rules that discard all the unnecessary parse trees by leaving only one parse tree for each sentence.
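S -> SαS
Where, α may be a string of terminals or non-terminals i.e., the same non-terminal appears twice on the R.H.S of a production.
Example
Consider the following grammar,
S -> SS | a | b
Let us derive a string from this grammar,
S => SS
=> SSS
=> aSS
=> abS
=> aba
Now, we construct derivation trees for the string aba.
Figure: Derivation trees for the string aba
The string aba has two derivation trees, one grouping it as (ab)a and the other as a(ba), so the grammar is ambiguous.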
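To make the ambiguity concrete, the following sketch counts the parse trees of a string for the example grammar S -> SS | a | b. The function name `count_parse_trees` is hypothetical, and the counting scheme (splitting the string at every position for S -> SS) is one possible way to do it, not a method taken from the text.

```python
from functools import lru_cache

def count_parse_trees(word):
    """Count the parse trees of `word` for the grammar S -> SS | a | b."""
    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways S can derive word[i:j].
        if j - i == 1:
            return 1 if word[i] in "ab" else 0   # S -> a or S -> b
        # S -> SS: split word[i:j] into two non-empty parts.
        return sum(count(i, k) * count(k, j) for k in range(i + 1, j))
    return count(0, len(word)) if word else 0

trees_for_aba = count_parse_trees("aba")
```

A count greater than one for any sentence means the grammar is ambiguous; for aba the two trees correspond to the groupings (ab)a and a(ba).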
(b) Verifying the Language Generated by a Grammar
Example
Consider a grammar,
A -> (A)A | ε
Which is capable of generating only strings of balanced parentheses. To verify this, two cases are proved,
(a) Every string derived from A is balanced.
(b) Every string with balanced parentheses can be derived from A.
Proof for Case (a)
Basis
In a single step (i.e., n = 1) only the empty string (ε) can be derived from A, which is considered as balanced.
Induction Hypothesis
Suppose that strings which are derived in less than n steps are balanced. Also consider an n-step leftmost derivation. The derivation is of the following form,
A =>lm (A)A =>*lm (x)A =>*lm (x)y
Proof
From the above derivation, it is clear that the derivations of x and y are carried out in less than n steps. Thus, by induction hypothesis x and y are balanced strings. Hence the string (x)y also needs to be balanced.
Proof for Case (b)
Basis
In case the length of the string is zero, the string must be ε which is a balanced string.
Induction Hypothesis
It is a fact that the length of balanced strings is even. Consider that balanced strings having length less than 2n can be derived from A. Also consider a string w which is a balanced string having length 2n where n ≥ 1 and (x) as the smallest (nonempty) prefix of string w with balanced parentheses.
From the above assumptions w can be defined as,
w = (x)y
Where x and y are balanced strings whose lengths are less than 2n.
Therefore, it can be said that x and y can be derived from A. Thus, it is now clear that the derivation is of the form,
A =>lm (A)A =>*lm (x)A =>*lm (x)y
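The two cases of the proof can be spot-checked for short strings. The sketch below builds the strings generated by A -> (A)A | ε using the decomposition w = (x)y from the proof, and compares them with all strings of parentheses that pass an independent balance check. It is only a finite sanity check under these assumptions, not a replacement for the induction; the helper names are hypothetical.

```python
from itertools import product

def derivable(max_len):
    """Strings of length <= max_len generated by A -> (A)A | epsilon."""
    by_len = {0: {""}}                       # A -> epsilon
    for n in range(2, max_len + 1, 2):
        words = set()
        # A -> (A)A gives w = (x)y with len(x) + len(y) = n - 2.
        for len_x in range(0, n - 1, 2):
            for x in by_len[len_x]:
                for y in by_len[n - 2 - len_x]:
                    words.add("(" + x + ")" + y)
        by_len[n] = words
    return set().union(*by_len.values())

def is_balanced(word):
    """Independent check: the nesting depth never goes negative and ends at zero."""
    depth = 0
    for ch in word:
        depth += 1 if ch == "(" else -1
        if depth < 0:
            return False
    return depth == 0

def balanced(max_len):
    """All balanced strings over '(' and ')' of length <= max_len, by brute force."""
    return {"".join(p)
            for n in range(max_len + 1)
            for p in product("()", repeat=n)
            if is_balanced("".join(p))}

same_language = derivable(8) == balanced(8)
```

If `same_language` is true, every derived string up to length 8 is balanced (case a) and every balanced string up to length 8 is derived (case b).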
(i) CFGs are capable of describing all the constructs that are described by REs. However, REs cannot describe all the constructs that can be described by CFGs.
(ii) The languages which are regular languages are context free languages. However, the languages which are CFLs are not necessarily regular.
The above facts can be illustrated through the following example.
Consider a language consisting of a set of strings of a's and b's ending in abb as described by a RE (a|b)*abb. The same language can be described by a CFG as,
S0 -> aS0 | bS0 | aS1
S1 -> bS2
S2 -> bS3
S3 -> ε
A grammar can be developed mechanically which can accept the language that is accepted by an NFA. Thus, the NFA accepting the above language is given below,
Figure: NFA accepting (a|b)*abb
In addition, a language with equal number of x's and y's [i.e., L = {x^n y^n | n ≥ 1}] is a language which can be generated by context free grammars but not by a RE.
The reason why this language cannot be generated by a RE is discussed below,
Assume that the language L = {x^n y^n | n ≥ 1} is generated by some regular expression. So it is possible to construct a DFA D accepting L with some k states. In case of an input having more x's than DFA states (i.e., k), the DFA D has to enter into some state (say A_i) two times, once after reading x^i and again after reading x^j where i < j. Name the path from state A_i to itself as x^(j-i). Now, there must also be a path labelled y^i from A_i to an accepting state f (as x^i y^i ∈ L). Moreover, there exists a path from the start state A_0 through A_i to the accepting state which is labelled x^j y^i. This means that D also accepts x^j y^i, which is not in the language since j ≠ i. This contradicts the assumption. Thus, L = {x^n y^n | n ≥ 1} is not accepted by any DFA D.
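The claim that a right-linear grammar and the regular expression (a|b)*abb describe the same language can be checked on short strings. The sketch below assumes the grammar S0 -> aS0 | bS0 | aS1, S1 -> bS2, S2 -> bS3, S3 -> ε and simulates it like an NFA whose states are the non-terminals; the helper names are hypothetical.

```python
import re
from itertools import product

# Each body is (terminal, next non-terminal); S3 -> epsilon makes S3 accepting.
PRODUCTIONS = {
    "S0": [("a", "S0"), ("b", "S0"), ("a", "S1")],
    "S1": [("b", "S2")],
    "S2": [("b", "S3")],
    "S3": [],
}
ACCEPTING = {"S3"}

def grammar_accepts(word):
    """Track the set of non-terminals that can still derive the rest of the word."""
    current = {"S0"}
    for ch in word:
        current = {nxt
                   for nt in current
                   for (terminal, nxt) in PRODUCTIONS[nt]
                   if terminal == ch}
    return bool(current & ACCEPTING)

def regex_accepts(word):
    return re.fullmatch(r"(a|b)*abb", word) is not None

def agree_up_to(max_len):
    """Compare both descriptions on every string over {a, b} up to max_len."""
    return all(grammar_accepts("".join(p)) == regex_accepts("".join(p))
               for n in range(max_len + 1)
               for p in product("ab", repeat=n))

agree = agree_up_to(7)
```

No comparable finite-state simulation exists for L = {x^n y^n | n ≥ 1}, because matching the counts needs unbounded memory, which is the point of the DFA argument above.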
2.3 WRITING A GRAMMAR
Q18. Discuss in detail about eliminating ambiguity and left recursion.
Answer:
Model Paper-III, Q4(a)
Eliminating Ambiguity
The ambiguity from a grammar can be eliminated by rewriting the production rules. In general, the ambiguity from the grammars of the form,
T -> T * T
can be eliminated by rewriting. Therefore, we introduce another non-terminal F as given below,
T -> T * F | F
F -> (E) | id
Therefore, the unambiguous grammar for the above grammar is,
E -> E + T | E - T | T
T -> T * F | F
F -> (E) | id
Eliminating Left Recursion
We must eliminate left recursion since top-down parsers cannot handle left recursive grammars.
An immediate left recursion from the production of the form is,
A -> Aα | β
It can be eliminated by rewriting the productions as,
A -> βA'
A' -> αA' | ε
which does not change the set of strings that can be derived from A.
In general, the A-productions are grouped as,
A -> Aα1 | Aα2 | ... | Aαm | β1 | β2 | ... | βn
where no βi begins with A. Then, we replace them by following productions,
A -> β1A' | β2A' | ... | βnA'
A' -> α1A' | α2A' | ... | αmA' | ε
Q19. Discuss in detail about,
(a) Left factoring
(b) Non-context-free language construct.
Answer:
(a) Left Factoring
In left factoring, when it is not clear which of the alternative productions is to be used for expanding a non-terminal A, the decision is deferred by expanding A to αA' until enough input has been seen to make the right choice.
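The rule for removing immediate left recursion (A -> Aα | β becomes A -> βA' and A' -> αA' | ε) can be sketched as a small function. The representation of a production body as a tuple of symbols and the function name are assumptions made for this example only; indirect left recursion is not handled.

```python
def eliminate_immediate_left_recursion(head, bodies):
    """Rewrite the productions of `head` so that none starts with `head` itself.

    `bodies` is a list of tuples of grammar symbols; the empty tuple () is epsilon.
    Returns a dict mapping each non-terminal to its new list of bodies.
    """
    alphas = [body[1:] for body in bodies if body and body[0] == head]   # A -> A alpha
    betas = [body for body in bodies if not body or body[0] != head]     # A -> beta
    if not alphas:
        return {head: list(bodies)}      # nothing to do
    new_head = head + "'"
    return {
        head: [beta + (new_head,) for beta in betas],                  # A  -> beta A'
        new_head: [alpha + (new_head,) for alpha in alphas] + [()],    # A' -> alpha A' | epsilon
    }

# E -> E + T | E - T | T
rewritten = eliminate_immediate_left_recursion(
    "E", [("E", "+", "T"), ("E", "-", "T"), ("T",)])
for nt, new_bodies in rewritten.items():
    print(nt, "->", " | ".join(" ".join(b) or "ε" for b in new_bodies))
```

Applied to E -> E + T | E - T | T, this yields E -> T E' and E' -> + T E' | - T E' | ε, which a top-down parser can handle.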
(b) Non-context-free Language Construct
Some syntactic constructs of programming languages are difficult to be specified using only grammars. These difficulties can be illustrated with the following examples.
Example 1
Consider the abstract language L = {wcw | w is in (a|b)*}. In this language, the first w is the identifier declaration, c is the inclusion of a piece of program code and the second w is the use of the identifier, i.e., language L consists of strings like aabbcaabb in which a repeated number of a's and b's is separated by 'c'. Such a non-context-free language is directly related to programming languages like C and Java. This is because programming languages demand identifier declaration prior to its usage. Moreover, arbitrary length can be allotted to identifiers.
The grammars of these programming languages represent identifiers using tokens like 'id' rather than distinguishing identifiers based on the type of character strings. The condition of declaration of identifiers before their usage is verified or checked by the semantic-analysis phase in the compilers.
Example 2
Consider a non-context-free language L = {a^n b^m c^n d^m | n ≥ 1 and m ≥ 1}. This language consists of strings generated by the regular expression a*b*c*d*,
Where,
Number of a's = number of c's and number of b's = number of d's.
In this language, a^n represents that the variable a can be repeated n number of times, and a^n and b^m represent the formal parameter lists of two functions.
2.4 TOP-DOWN PARSING
Q20. What is top-down parsing? Explain different types of top-down parsing techniques.
Answer:
Model Paper-I, Q4
Top-down Parsing
Top-down parsing may be considered as an attempt to build a parse tree for an input string from the root and then creating the nodes in preorder. It can also be considered as an attempt to construct a leftmost derivation for an input string.
In this, the top-down parser constructs the leftmost derivation from the start symbol of the grammar. Then it selects a suitable production rule such that it can move the input string from left to right in the sentential form. If there are more than one production rules for a leftmost non-terminal then the selection of the production rule is dependent on whether the parser can backtrack or not. If the parser can backtrack, it can scan the input string repetitively and can try out all the possibilities in any order until it has succeeded in parsing the string. On the other hand, if the backtracking is not permitted then the parser has to make the right selection of the production rule, which is the crucial task in such parsers.
Top-down parsers are popular since it is easy to construct efficient parsers by hand using top-down methods.
Types of Top-down Parsing
The different types of top-down parsing techniques are,
(i) Recursive Descent Parsing
For answer refer Unit-II, Q21.
(ii) Non-recursive Predictive Parsing
For answer refer Unit-II, Q24.
Q21. Discuss in detail Recursive-Descent parsing.
Answer:
Recursive Descent Parsing
Parsing means generating a parse tree. Now, in top-down parsing a parse tree is generated by taking the starting variable of the grammar as the parent node or root and constructing the nodes in preorder.
When a terminal matches the input symbol, the lookahead pointer is incremented by 1 to point to the next input symbol.
If a non-terminal is the symbol to be expanded then the procedure of the corresponding non-terminal is called.
Let us take an example of recursive descent parsing which needs backtracking.
Example
G = ({S, B}, {a, b, c, d}, R, {S})
Such that R is given by,
S -> cBd
B -> ab | a
and the input string that is to be parsed is cad.
As we know, to parse an input string we construct a syntax tree using the start variable S, reading the input from left to right and creating nodes in preorder.
Figure: Structure of a Recursive Descent Parser (labels recovered from the figure: recursive descent parsing procedure, stack, bottom of stack)
The parser starts parsing from the initial or start variable i.e., S and generates the syntax tree with the children c, B and d. As the first input 'c' matches the leftmost node 'c' in the syntax tree, it compares 'a' to the next node in the tree i.e., B, which is a non-terminal.
SPECTRUM ALL-IN-ONE JOURNAL FOR ENGINEERING STUDENTS - SIA GROUP
Now, the parser expands B using its first alternative B -> ab. It is pointing to input symbol 'a' and the left-most child of variable B is 'a'. The parser finds a match and the pointer is incremented to the next input symbol 'd'. The next child of B is 'b' which does not match with 'd'. At this stage, the parser does backtracking for the variable B. It checks whether B has any more yields which can be used and finds one more yield, B -> a.
The input symbol 'a' now matches the child 'a' hence the parser then increments the pointer. Now, the parser is pointing to input symbol 'd'.
Now, the right most child of variable S matches the input symbol 'd'.
Now, the parser halts and declares that the input string is syntactically correct.
There are grammars that can cause a recursive descent parser to enter into an infinite loop. One of such grammars is known as left-recursive grammar.
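The backtracking walk-through for the input cad can be sketched in a few lines. The code below assumes the grammar S -> cBd, B -> ab | a (the start symbol S is an assumption where the text is unclear) and uses generators so that a failed alternative simply produces no result, which plays the role of backtracking. The function names are hypothetical.

```python
# Alternatives are tried in the listed order, so B -> ab is attempted before B -> a.
GRAMMAR = {
    "S": [["c", "B", "d"]],
    "B": [["a", "b"], ["a"]],
}

def parse(symbols, text, pos):
    """Yield every input position reachable after matching `symbols` from `pos`."""
    if not symbols:
        yield pos
        return
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:
        # Non-terminal: call its "procedure" by trying each alternative in turn.
        # An alternative that fails yields nothing, so the next one is tried
        # from the same input position (backtracking).
        for body in GRAMMAR[first]:
            for mid in parse(body, text, pos):
                yield from parse(rest, text, mid)
    elif pos < len(text) and text[pos] == first:
        # Terminal: match it and advance the lookahead pointer by 1.
        yield from parse(rest, text, pos + 1)

def accepts(text):
    """The input is syntactically correct if S can consume all of it."""
    return any(end == len(text) for end in parse(["S"], text, 0))

print(accepts("cad"))
```

With a left-recursive production such as B -> Ba this function would call itself without consuming input, which is the infinite loop mentioned above.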
Q22. Explain in detail first and follow functions. Give examples.
Answer:
FIRST Function
FIRST(α) is a set of terminal symbols that are the first symbols in the strings derived from α. Where α is any string of grammar symbols.
The rules for computing FIRST(X) for all grammar symbols X are,
1. If X is a terminal symbol then FIRST(X) = {X}.
2. If X is a non-terminal having a production rule as,
X -> A1 A2 ... Ak then add a to FIRST(X) if for some i, there is,
a in FIRST(Ai) and
ε in FIRST of all A1, ..., Ai-1 i.e., FIRST(A1), ..., FIRST(Ai-1).
If ε is in FIRST(Aj) for all j = 1, 2, ..., k then ε is added to FIRST(X).
That is, whatever is in FIRST(A1) is in FIRST(X). If ε is not in FIRST(A1) then nothing more is added to FIRST(X). However, if A1 derives ε in zero or more steps then we add FIRST(A2), and so on.
The above rules for FIRST(X) are applied until there are no more terminals or ε that can be added to any FIRST set.
FOLLOW Function
FOLLOW(A) is the set of terminals that can appear immediately to the right of A in some sentential form.
1. If S is the start symbol then $ is added to FOLLOW(S).
2. If A has a production as A -> αBβ, where FIRST(β) does not contain ε, then FOLLOW(B) includes FIRST(β) - {ε}.
3. If A has a production as A -> αBβ, where FIRST(β) contains ε, then FOLLOW(B) includes (FIRST(β) - {ε}) ∪ FOLLOW(A).
4. If A has a production as A -> αB then FOLLOW(B) includes FOLLOW(A).
5. If A is the rightmost symbol in some sentential form then $ is added to FOLLOW(A).
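The FIRST and FOLLOW rules amount to a fixed-point computation: keep applying them until no set grows. The sketch below does this for the grammar S -> iEtSS' | a, S' -> eS | ε, E -> b. The function names and the grammar encoding are assumptions made for the example, and $ is placed in FOLLOW of the start symbol, which is the usual way to implement the end-marker rule.

```python
EPS, END = "ε", "$"

GRAMMAR = {
    "S": [["i", "E", "t", "S", "S'"], ["a"]],
    "S'": [["e", "S"], []],          # [] is the empty body (epsilon)
    "E": [["b"]],
}
START = "S"

def first_of_string(symbols, first):
    """FIRST of a string of grammar symbols, given the FIRST sets so far."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in GRAMMAR else {sym}   # terminal: FIRST(X) = {X}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result            # this symbol cannot vanish, so stop here
    result.add(EPS)                  # every symbol can derive epsilon
    return result

def compute_first():
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:                   # repeat until nothing more can be added
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                new = first_of_string(body, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first

def compute_follow(first):
    follow = {nt: set() for nt in GRAMMAR}
    follow[START].add(END)
    changed = True
    while changed:
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in GRAMMAR:
                        continue     # FOLLOW is defined for non-terminals only
                    beta_first = first_of_string(body[i + 1:], first)
                    new = beta_first - {EPS}
                    if EPS in beta_first:        # beta is empty or can vanish
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

FIRST = compute_first()
FOLLOW = compute_follow(FIRST)
```

For this grammar the sets come out as FIRST(S) = {i, a}, FIRST(S') = {e, ε}, FIRST(E) = {b}, FOLLOW(S) = FOLLOW(S') = {e, $} and FOLLOW(E) = {t}.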
Example
Consider the grammar given below,
G: S -> aAB
S -> bA
A -> aAb
B -> bB
FOLLOW(B)
The production involving B is S -> aAB, which is of the form A -> αBβ. Using rule (3) for FOLLOW we get,
S -> a A B
A -> α B β
Hence β = ε.
Hence by using rule (3) we get,
FOLLOW(A) = FOLLOW(S)
FOLLOW(A) = {b, $}
(1) If β derives an empty string then α should not derive strings that begin with the terminals in FOLLOW(A) i.e., FIRST(α) ∩ FOLLOW(A) = ∅.
The same argument holds if α derives an empty string.
An LL(1) parser provides detailed information on the internals of the parsing process. It is also capable of catching errors because the situations of syntax errors are recorded explicitly in the table.
S -> iEtSS' | a
S' -> eS | ε
E -> b
The parsing table for this grammar is given below.
                 Input Symbol
Non-terminal     a         b         e                    i               t     $
S                S -> a                                   S -> iEtSS'
S'                                   S' -> ε, S' -> eS                          S' -> ε
E                          E -> b
If the parser chooses S' -> ε then this prevents e from ever being put on the stack or removed from the input. If the second choice S' -> eS is made then it resolves the ambiguity. This choice associates the else's with the closest previous then's.
So the parser can parse the string by making the multiple defined entry a single valued entry i.e., S' -> eS.
The multiple-defined entries from a parsing table can, in many cases, be removed by first eliminating the left recursion and left factoring the grammar, then constructing a parsing table for the transformed grammar.
The parsing table consists of rows and columns where there is a row for each non-terminal and a column for each terminal symbol including the end marker for the input string. Each entry M[A, a] in the table is either a production rule or an error. The parser also uses a stack containing a sequence of grammar symbols with the $ symbol placed on the bottom indicating the bottom of the stack. Initially, the start symbol resides on top. The stack is used to keep track of all the non-terminals for which no prediction has been made yet.
The parser also uses an input buffer and an output stream. The string to be parsed is stored in the input buffer. The end of the buffer uses a $ symbol to indicate the end of the input string.
The configuration of the non-recursive predictive parser is shown in the figure.