Compilers
A compiler is software that translates code written in a high-level language (the source language) into a target language.
Example: source languages like C, Java, etc. Compilers are user friendly.
The target language is typically machine language, which is efficient for hardware.

Analysis: It breaks up the source program into pieces and creates an intermediate representation of the source program. This part is more language specific.

Synthesis: It constructs the desired target program from the intermediate representation. The target program is more machine specific, dealing with registers and memory locations.
6.4 | Unit 6 • Compiler Design
Loader/linker
The loader/linker takes relocatable object files and library files as input and produces the absolute machine code.

Phases
The compilation process is partitioned into subprocesses called phases. To translate high-level code to machine code we go phase by phase, with each phase performing a particular task and passing its output to the next phase.

Lexical analysis or scanning
It is the first phase of a compiler. The lexical analyzer reads the stream of characters making up the source program and groups the characters into meaningful sequences called lexemes.

Example: Consider the statement: if (a < b)
In this statement the tokens are if, (, a, <, b, ).
Number of tokens = 6
Identifiers: a, b
Keywords: if
Operators: <, (, )

Example 2 (a semantic check):
int a;
bool b;
char c;
c = a + b;
We cannot add an integer to a Boolean variable and assign the result to a character variable.

Intermediate code generation
The intermediate representation should have two important properties:
(i) It should be easy to produce.
(ii) It should be easy to translate into the target program.
'Three address code' is one of the common forms of intermediate code. Three address code consists of a sequence of instructions, each of which has at most three operands.

Example:
id1 = id2 + id3 × 10;
t1 := inttoreal(10)
t2 := id3 × t1
t3 := id2 + t2
id1 := t3

Code optimization
The output of this phase results in faster-running machine code.

Example: For the above intermediate code the optimized code will be
t1 := id3 × 10.0
id1 := id2 + t1
Here we eliminated the temporaries t2 and t3.

Code generation
•• In this phase, the target code is generated.
•• Generally the target code is either relocatable machine code or assembly code.
•• Each intermediate instruction is translated into a sequence of machine instructions.
•• Assignment of registers is also done.

Example: MOVF id3, R2
MULF #10.0, R2
MOVF id2, R1
ADDF R2, R1
MOVF R1, id1

Symbol table management
A symbol table is a data structure containing a record for each variable name, with fields for the attributes of the name.

What is the use of a symbol table?
1. To record the identifiers used in the source program.
2. To record each identifier's type and scope.
3. If it is a procedure name, to record the number of arguments, the types of the arguments, the method of passing (e.g., by reference) and the type returned.

Error detection and reporting
(i) The lexical phase can detect errors where the characters remaining in the input 'do not form any token'.
(ii) Errors of the type 'violation of syntax' of the language are detected by syntax analysis.
(iii) The semantic phase tries to detect constructs that have the right syntactic structure but no meaning.
Example: adding two array names, etc.

Lexical Analysis
Lexical analysis is the first phase in compiler design. The main task of the lexical analyzer is to read the input characters of the source program, group them into lexemes, and produce as output a sequence of tokens, one for each lexeme in the source program. The stream of tokens is sent to the parser for syntax analysis. There is interaction with the symbol table as well.

(Figure: the lexical analyzer reads the source program and returns a token each time the parser issues a "get next token" request; both components consult the symbol table and report to the error handler.)

Lexeme: A sequence of characters in the source program that matches the pattern for a token. It is the smallest logical unit of a program.
Example: 10, x, y, <, >, =

Tokens: These are the classes of similar lexemes.
Example:
Operators: <, >, =
Identifiers: x, y
Constants: 10
Keywords: if, else, int

Operations performed by the lexical analyzer
1. Identification of lexemes and spelling check.
2. Stripping out comments and white space (blank, newline, tab, etc.).
3. Correlating error messages generated by the compiler with the source program.
4. If the source program uses a macro-preprocessor, the expansion of macros may also be performed by the lexical analyzer.

Example 1: Take the following example from Fortran
DO 5 I = 1.25
Number of tokens = 5
The 1st lexeme is the keyword DO.
The tokens are DO, 5, I, =, 1.25.

Example 2: An example from a C program
for (int i = 1; i <= 10; i++)
Here the tokens are for, (, int, i, =, 1, ;, i, <=, 10, ;, i, ++, )
Number of tokens = 14

LEX compiler
The lexical analyzer divides the source code into tokens. To implement a lexical analyzer we have two techniques: hand coding it, or using the LEX tool. LEX is an automated tool which generates a lexical analyzer from rules given as regular expressions. These rules are also called pattern recognizing rules.
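The token counts in the examples above can be checked with a small scanner sketch. This is a minimal illustration, not the LEX tool itself; the pattern list below is an assumption chosen to cover just this fragment, with multi-character operators listed before their single-character prefixes (longest match first, as real lexers do):

```python
import re

# Longest-match-first token patterns: '++' and '<=' must come
# before '+' and '<' so the alternation prefers them.
TOKEN_RE = re.compile(r"\+\+|<=|>=|==|[A-Za-z_]\w*|\d+|[();=<>+\-*/,]")

def tokenize(source: str):
    """Split a source fragment into lexemes, skipping white space."""
    return TOKEN_RE.findall(source)

tokens = tokenize("for (int i = 1; i <= 10; i++)")
print(tokens)       # ['for', '(', 'int', 'i', '=', '1', ';', 'i', '<=', ...]
print(len(tokens))  # 14
```

Note that `<=` and `++` each count as a single token because the scanner always takes the longest possible match.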
Syntax Analysis
This is the 2nd phase of the compiler; it checks the syntax and constructs the syntax/parse tree. The input of the parser is the stream of tokens and its output is a parse/syntax tree.

Constructing a parse tree
The derivation tree constructed for a given input string by using the productions of the grammar is called a parse tree.
Consider the grammar
S → E + E/E * E
E → id
The parse tree for the string ω = id + id * id is

          S
        / | \
       E  +  E
       |   / | \
      id  E  *  E
          |     |
          id    id

Role of the parser
1. Construct a parse tree.
2. Error reporting and correction (or recovery). A parser can be modeled by using a CFG (Context Free Grammar), recognized by a pushdown automaton/table-driven parser.
3. A CFG will only check the correctness of a sentence with respect to syntax, not its meaning.

(Figure: the lexical analyzer reads the source program and hands over a token whenever the parser requests "get next token"; the parser produces the parse tree. Lexical errors are reported by the lexical analyzer, syntax errors by the parser.)

How to construct a parse tree?
Parse trees can be constructed in two ways:
(i) Top-down parser: builds the parse tree from the top (root) to the bottom (leaves).
(ii) Bottom-up parser: starts from the leaves and works up to the root.
In both cases, the input to the parser is scanned from left to right, one symbol at a time.

Parser generator
A parser generator is a tool which creates a parser.
Example: compiler-compiler, YACC
The input of such a parser generator is the grammar we use and the output is the parser code. The parser generator is used for construction of the compiler's front end.

Scope of declarations
The scope of a declaration is the portion of the program text, determined by the rules of the language, within which the declared entities can be legally accessed. The scope of a declaration always contains its immediate scope: the region of the declarative portion that immediately encloses the declaration. The scope starts at the beginning of the declaration and continues till the end of the declaration; for an overloadable declaration, however, the immediate scope begins only once the profile of the callable entity has been determined. The visible part is the text portion of the declaration that is visible from outside.

Syntax Error Handling
The error handler in a parser should:
1. Report the presence of errors clearly and accurately.
2. Recover from each error quickly.
3. Not slow down the processing of correct programs.

Error Recovery Strategies
The four common strategies are panic mode, phrase-level recovery, error productions and global correction.

Panic mode: On discovering an error, the parser discards input symbols one at a time until one of a designated set of synchronizing tokens is found.

Phrase level: A parser may perform local correction on the remaining input; for example, it may replace a prefix of the remaining input.

Error productions: The grammar is augmented with productions for common errors, so the parser can generate appropriate error messages to indicate the erroneous construct that has been recognized in the input.

Global correction: There are algorithms for choosing a minimal sequence of changes to obtain a globally least-cost correction.

Context Free Grammars and Ambiguity
A grammar is a set of rules or productions which generates a collection of finite/infinite strings. It is a 4-tuple defined as G = (V, T, P, S), where
V = set of variables
T = set of terminals
P = set of production rules
S = start symbol
Example: a parse tree for the sentence −(id + id):

      E
     / \
    −   E
      / | \
     (  E  )
      / | \
     E  +  E
     |     |
     id    id

Ambiguity
A grammar that produces more than one parse tree for some sentence is said to be ambiguous. Equivalently, a grammar that produces more than one leftmost or more than one rightmost derivation for some sentence is ambiguous.

For example, consider the following grammar:
String → String + String/String − String/0/1/2/…/9
The sentence 9 − 5 + 2 has two parse trees, as shown below.

(Figure 1: Two leftmost derivations of 9 − 5 + 2: one tree groups (9 − 5) + 2, the other groups 9 − (5 + 2).)

Left recursion
Left recursion can take the parser into an infinite loop, so we need to remove it.

Elimination of left recursion
A → Aα/β is left recursive. It can be replaced by the non-recursive grammar:
A → βA′
A′ → αA′/ε

In general
A → Aα1/Aα2/…/Aαm/β1/β2/…/βn
can be replaced by the productions
A → β1A′/β2A′/…/βnA′
A′ → α1A′/α2A′/…/αmA′/ε

Example 3: Eliminate left recursion from
E → E + T/T
T → T * F/F
F → (E)/id

Solution: E → E + T/T is in the form A → Aα/β,
so we can write it as
E → TE′
E′ → +TE′/ε
Similarly the other productions are written as
T → FT′
T′ → *FT′/ε
F → (E)/id
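The elimination rule above lends itself to a short routine. A minimal sketch: the function name and the list-of-lists grammar encoding are illustrative choices, not from the text, and only immediate left recursion (A → Aα/β) is handled:

```python
def eliminate_left_recursion(nt, productions):
    """Replace A -> A a1 | ... | b1 | ... with
    A -> b1 A' | ..., A' -> a1 A' | ... | epsilon."""
    alphas = [p[1:] for p in productions if p and p[0] == nt]  # left-recursive tails
    betas = [p for p in productions if not p or p[0] != nt]    # the rest
    if not alphas:
        return {nt: productions}            # nothing to do
    new_nt = nt + "'"
    return {
        nt: [beta + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[]],  # [] stands for epsilon
    }

g = eliminate_left_recursion("E", [["E", "+", "T"], ["T"]])
print(g)   # {'E': [['T', "E'"]], "E'": [['+', 'T', "E'"], []]}
```

Running it on E → E + T/T reproduces the E → TE′, E′ → +TE′/ε result of Example 3.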
Example 4: Eliminate left recursion from the grammar
S → (L)/a
L → L, S/b

Solution: S → (L)/a
L → bL′
L′ → ,SL′/ε

Left factoring
A grammar with common prefixes is called a non-deterministic grammar. To make it deterministic we need to remove the common prefixes. This process is called left factoring.
The grammar A → αβ1/αβ2 can be transformed into:
A → αA′
A′ → β1/β2

Example 5: What is the resultant grammar after left factoring the following grammar?
S → iEtS/iEtSeS/a
E → b

Solution: S → iEtSS′/a
S′ → eS/ε
E → b

Types of Parsing

Parsers
├─ Top-down parsers (predictive parsers)
│  ├─ Recursive descent parsing
│  └─ Non-recursive descent parsing (table-driven parsing)
└─ Bottom-up parsers
   ├─ Operator precedence parsing
   └─ LR parsers: SLR, CLR, LALR

Topdown Parsing
A parse tree is constructed for the input starting from the root, creating the nodes of the parse tree in preorder. It simulates the leftmost derivation.

Backtracking Parsing
If we make a sequence of erroneous expansions and subsequently discover a mismatch, we undo the effects and roll back the input pointer. This method is also known as brute-force parsing.

Example: S → cAd
A → ab/a
Let the string w = cad be the string to generate:

      S
    / | \
   c  A  d
     / \
    a   b

The string generated from the above parse tree is cabd, but w = cad; the third symbol does not match. So we report the error and go back to A. Now consider the other alternative for production A:

      S
    / | \
   c  A  d
      |
      a

The string generated is 'cad' and w = cad; now it is successful. Here we have used backtracking. It is a costly and time-consuming approach, and thus an outdated one.

Predictive Parsers
By eliminating left recursion and by left factoring the grammar, we can have a parse tree without backtracking. To construct a predictive parser, we must know:
1. The current input symbol
2. The non-terminal which is to be expanded
A procedure is associated with each non-terminal of the grammar.

Recursive descent parsing
In recursive descent parsing, we execute a set of recursive procedures to process the input. The sequence of procedures called implicitly defines a parse tree for the input.

Non-recursive predictive parsing (table-driven parsing)
•• It maintains a stack explicitly, rather than implicitly via recursive calls.
•• A table-driven predictive parser has:
→ An input buffer
→ A stack
→ A parsing table
→ An output stream

(Figure: the input buffer holds a + b $; the stack holds x, y, z, $; the predictive parsing program consults parsing table M and produces the output.)
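The backtracking trace above can be written as a tiny recursive parser for S → cAd, A → ab/a. The function names are illustrative, not from the text; the point is that a mismatch after one alternative of A makes the parser retry the next alternative:

```python
def parse_A(s, i):
    """Try each alternative of A -> ab | a in order; yield the position after each."""
    if s[i:i + 2] == "ab":
        yield i + 2          # first alternative consumed 'ab'
    if s[i:i + 1] == "a":
        yield i + 1          # backtracking point: second alternative consumes 'a'

def parse_S(s):
    """S -> cAd: succeed if some choice for A lets the whole input match."""
    if not s.startswith("c"):
        return False
    for j in parse_A(s, 1):
        if s[j:] == "d":     # remainder must be exactly 'd'
            return True
    return False             # every alternative for A failed

print(parse_S("cad"))    # True  (uses A -> a after A -> ab fails)
print(parse_S("cabd"))   # True  (uses A -> ab)
print(parse_S("cab"))    # False
```

Left factoring A to A → aA′, A′ → b/ε would remove the need for this backtracking entirely.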
Constructing a parsing table
To construct a parsing table, we have to learn about two functions:
1. FIRST()
2. FOLLOW()

FIRST(X): To compute FIRST(X) for all grammar symbols X, apply the following rules until no more terminals or ε can be added to any FIRST set.
1. If X is a terminal, then FIRST(X) is {X}.
2. If X → ε is a production, then add ε to FIRST(X).
3. If X → Y1Y2…Yk is a production, then add a to FIRST(X) if a is in FIRST(Yi) and ε is in all of FIRST(Y1), …, FIRST(Yi−1); if ε is in FIRST(Yj) for all j, add ε to FIRST(X).

By applying these rules (together with the FOLLOW sets) to the grammar of Example 3, we get the following parsing table.

Non-terminal | id       | +          | *          | (        | )       | $
E            | E → TE′  |            |            | E → TE′  |         |
E′           |          | E′ → +TE′  |            |          | E′ → ε  | E′ → ε
T            | T → FT′  |            |            | T → FT′  |         |
T′           |          | T′ → ε     | T′ → *FT′  |          | T′ → ε  | T′ → ε
F            | F → id   |            |            | F → (E)  |         |

Right sentential forms of an unambiguous grammar have one unique handle.

Example: For the grammar
S → aABe
A → Abc/b
B → d
the rightmost derivation of abbcde is
S ⇒ aABe ⇒ aAde ⇒ aAbcde ⇒ abbcde
where in each step the handle is the part just rewritten.
In bottom-up parsing the string 'abbcde' is verified in reverse order:
abbcde → aAbcde → aAde → aABe → S

Handle pruning: The process of discovering a handle and reducing it to the appropriate left-hand side is called handle pruning. Handle pruning forms the basis for bottom-up parsing.
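The FIRST rules given above can be computed by iterating until no set changes (a fixed point). A minimal sketch for the left-factored expression grammar of Example 3, with the string 'eps' standing for ε; the dictionary encoding is an assumption for illustration:

```python
EPS = "eps"
grammar = {                      # Example 3 grammar after left-recursion removal
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F":  [["(", "E", ")"], ["id"]],
}

def first_sets(grammar):
    """Apply the three FIRST rules repeatedly until nothing changes."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, prods in grammar.items():
            for prod in prods:
                for sym in prod:
                    if sym == EPS:
                        add = {EPS}                 # rule 2: X -> eps
                    elif sym not in grammar:
                        add = {sym}                 # rule 1: terminal
                    else:
                        add = first[sym] - {EPS}    # rule 3: nonterminal
                    before = len(first[nt])
                    first[nt] |= add
                    changed |= len(first[nt]) != before
                    if sym in grammar and EPS in first[sym]:
                        continue                    # sym nullable: look further right
                    break
                else:
                    first[nt].add(EPS)              # all symbols were nullable
    return first

f = first_sets(grammar)
print(sorted(f["E"]))   # ['(', 'id']
```

The computed sets match the parsing table above: FIRST(E) = FIRST(T) = FIRST(F) = {(, id}, FIRST(E′) = {+, ε}, FIRST(T′) = {*, ε}.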
Stack implementation of a shift–reduce parser
The shift–reduce parser consists of an input buffer, a stack and a parse table.
The input buffer holds the input string, with each cell containing only one input symbol.
The stack contains grammar symbols: symbols are pushed by the shift operation and are reduced by the reduce operation once a handle appears on top of the stack.
The parse table consists of 2 parts, goto and action, which are constructed using terminals, non-terminals and items.

To construct the rightmost derivation
S = γ0 ⇒ γ1 ⇒ γ2 … ⇒ γn = w
apply the following simple algorithm:
For i ← n to 1
  Find the handle Ai → βi in γi
  Replace βi with Ai to generate γi−1

Consider the cut of a parse tree of a certain right sentential form whose frontier is αβw, with A the parent of β. Here A → β is a handle of αβw.

Shift–reduce parsing with a stack: there are 2 problems with this technique:
(i) To locate the handle
(ii) To decide which production to use

General construction using a stack
1. 'Shift' input symbols onto the stack until a handle is found on top of it.
2. 'Reduce' the handle to the corresponding non-terminal.
3. 'Accept' when the input is consumed and only the start symbol is on the stack.
4. On errors, call an error reporting/recovery routine.

Let us illustrate the stack implementation. Let the grammar be
S → AA
A → aA
A → b
and let the input string w be abab$.

Stack | Input  | Action
$     | abab$  | Shift
$a    | bab$   | Shift
$ab   | ab$    | Reduce (A → b)
$aA   | ab$    | Reduce (A → aA)
$A    | ab$    | Shift
$Aa   | b$     | Shift
$Aab  | $      | Reduce (A → b)
$AaA  | $      | Reduce (A → aA)
$AA   | $      | Reduce (S → AA)
$S    | $      | Accept
Viable prefixes The set of prefixes of right sentential forms that can appear on the stack of a shift–reduce parser are called viable prefixes.

Let the operator precedence table for an expression grammar over id, + and × be:

     | id | +  | ×  | $
id   |    | ⋗  | ⋗  | ⋗
+    | ⋖  | ⋗  | ⋖  | ⋗
×    | ⋖  | ⋗  | ⋗  | ⋗
$    | ⋖  | ⋖  | ⋖  | accept

Conflicts

Consider the expression grammar with numbered productions:
1. E → E + T    2. E → T
3. T → T * F    4. T → F
5. F → (E)      6. F → id

Its SLR parsing table is:

State | id  | +   | ×   | (   | )   | $   || E | T | F
0     | S5  |     |     | S4  |     |     || 1 | 2 | 3
1     |     | S6  |     |     |     | acc ||   |   |
2     |     | r2  | S7  |     | r2  | r2  ||   |   |
3     |     | r4  | r4  |     | r4  | r4  ||   |   |
4     | S5  |     |     | S4  |     |     || 8 | 2 | 3
5     |     | r6  | r6  |     | r6  | r6  ||   |   |
6     | S5  |     |     | S4  |     |     ||   | 9 | 3
7     | S5  |     |     | S4  |     |     ||   |   | 10
8     |     | S6  |     |     | S11 |     ||   |   |
9     |     | r1  | S7  |     | r1  | r1  ||   |   |
10    |     | r3  | r3  |     | r3  | r3  ||   |   |
11    |     | r5  | r5  |     | r5  | r5  ||   |   |

The moves of the LR parser on the input string id*id+id are shown below:

Stack          | Input     | Action
0              | id*id+id$ | shift 5
0 id 5         | *id+id$   | reduce by 6, i.e. F → id; goto[0, F] = 3
0 F 3          | *id+id$   | reduce by 4, i.e. T → F; goto[0, T] = 2
0 T 2          | *id+id$   | shift 7
0 T 2 * 7      | id+id$    | shift 5
0 T 2 * 7 id 5 | +id$      | reduce by 6, i.e. F → id; goto[7, F] = 10
0 T 2 * 7 F 10 | +id$      | reduce by 3, i.e. T → T * F; goto[0, T] = 2
0 T 2          | +id$      | reduce by 2, i.e. E → T; goto[0, E] = 1
0 E 1          | +id$      | shift 6
0 E 1 + 6      | id$       | shift 5
0 E 1 + 6 id 5 | $         | reduce by 6, i.e. F → id; goto[6, F] = 3
0 E 1 + 6 F 3  | $         | reduce by 4, i.e. T → F; goto[6, T] = 9
0 E 1 + 6 T 9  | $         | reduce by 1, i.e. E → E + T; goto[0, E] = 1
0 E 1          | $         | accept

Constructing the SLR parsing table
LR(0) item: An LR(0) item of a grammar G is a production of G with a dot at some position of the right side.

Example: For A → BCD the possible LR(0) items are
A → .BCD
A → B.CD
A → BC.D
A → BCD.

A → B.CD means we have seen an input string derivable from B and hope to see a string derivable from CD.
The LR(0) items are constructed as a DFA from the grammar to recognize viable prefixes; the items can be viewed as the states of an NFA. The canonical LR(0) collection provides the basis for constructing the SLR parser.
To construct the LR(0) items, define
(a) an augmented grammar
(b) the closure and goto operations

Augmented grammar (G′): If G is a grammar with start symbol S, then G′, the augmented grammar for G, has a new start symbol S′ and the additional production S′ → S. The purpose of G′ is to indicate when to stop parsing and announce acceptance of the input.

Closure operation: Closure(I) is computed as follows:
1. Initially, every item in I is added to closure(I).
2. If A → α.Bβ is in closure(I) and B → γ is a production, then add B → .γ to closure(I).

Goto operation: Goto(I, X) is defined to be the closure of the set of all items [A → αX.β] such that [A → α.Xβ] is in I.

Items
Kernel items: S′ → .S and all items whose dots are not at the left end.
Non-kernel items: items which have their dots at the left end.

Construction of the sets of items
Procedure items(G′)
begin
  C := closure({[S′ → .S]});
  repeat
    for each set of items I in C and each grammar symbol X
    such that goto(I, X) is not empty and not in C do
      add goto(I, X) to C;
  until no more sets of items can be added to C;
end

Example: The LR(0) items for the grammar
E′ → E
E → E + T/T
T → T * F/F
F → (E)/id
are given below:
I0: E′ → .E
    E → .E + T
    E → .T
    T → .T * F
    T → .F
    F → .(E)
    F → .id
Solution (for the grammar S → CC, C → cC/d): The initial set of items is
I0: S′ → .S, $
    S → .CC, $
For an item of the form [A → α.Bβ, a], here A = S, α = ε, B = C, β = C and a = $.
First(βa) = First(C$) = First(C) = {c, d}
So, add the items [C → .cC, c/d] and [C → .d, c/d].
∴ Our first set is
I0: S′ → .S, $
    S → .CC, $
    C → .cC, c/d
    C → .d, c/d

I1: goto(I0, S)
    S′ → S., $
I2: goto(I0, C)
    S → C.C, $
    C → .cC, $
    C → .d, $
I3: goto(I0, c)
    C → c.C, c/d
    C → .cC, c/d
    C → .d, c/d
I4: goto(I0, d)
    C → d., c/d
I5: goto(I2, C)
    S → CC., $
I6: goto(I2, c)
    C → c.C, $
    C → .cC, $
    C → .d, $
I7: goto(I2, d)
    C → d., $
I8: goto(I3, C)
    C → cC., c/d
I9: goto(I6, C)
    C → cC., $

The CLR table is:

State | c   | d   | $   || S | C
I0    | S3  | S4  |     || 1 | 2
I1    |     |     | acc ||   |
I2    | S6  | S7  |     ||   | 5
I3    | S3  | S4  |     ||   | 8
I4    | r3  | r3  |     ||   |
I5    |     |     | r1  ||   |
I6    | S6  | S7  |     ||   | 9
I7    |     |     | r3  ||   |
I8    | r2  | r2  |     ||   |
I9    |     |     | r2  ||   |

Consider the string 'dcd' with rightmost derivation:
S ⇒ CC ⇒ CcC ⇒ Ccd ⇒ dcd

Stack         | Input | Action
0             | dcd$  | shift 4
0 d 4         | cd$   | reduce by C → d; goto[0, C] = 2
0 C 2         | cd$   | shift 6
0 C 2 c 6     | d$    | shift 7
0 C 2 c 6 d 7 | $     | reduce by C → d; goto[6, C] = 9
0 C 2 c 6 C 9 | $     | reduce by C → cC; goto[2, C] = 5
0 C 2 C 5     | $     | reduce by S → CC; goto[0, S] = 1
0 S 1         | $     | accept

Example 8: Construct the CLR parsing table for the grammar:
S → L = R
S → R
L → *R
L → id
R → L

Solution: The canonical set of items is
I0: S′ → .S, $
    S → .L = R, $
    S → .R, $
    L → .*R, =/$   [= comes from First(= R$); $ comes from R → .L, $]
    L → .id, =/$
    R → .L, $

I1: goto(I0, S)
    S′ → S., $

I2: goto(I0, L)
    S → L. = R, $
    R → L., $

I3: goto(I0, R)
    S → R., $

I4: goto(I0, *)
    L → *.R, =/$
    R → .L, =/$
    L → .*R, =/$
    L → .id, =/$

I5: goto(I0, id)
    L → id., =/$

I6: goto(I7, L)
    R → L., $

I7: goto(I2, =)
    S → L = .R, $
    R → .L, $
    L → .*R, $
    L → .id, $

I8: goto(I4, R)
    L → *R., =/$
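The moves on 'dcd' can be reproduced by a table-driven LR driver, with the ACTION and GOTO tables transcribed from the CLR table above. The tuple encoding of entries ('s' = shift, 'r' = reduce, with the head and body length of the production inlined) is an assumption for illustration:

```python
# CLR tables for S -> CC, C -> cC | d (states as in the worked example).
ACTION = {
    (0, "c"): ("s", 3), (0, "d"): ("s", 4),
    (1, "$"): ("acc",),
    (2, "c"): ("s", 6), (2, "d"): ("s", 7),
    (3, "c"): ("s", 3), (3, "d"): ("s", 4),
    (4, "c"): ("r", "C", 1), (4, "d"): ("r", "C", 1),   # C -> d
    (5, "$"): ("r", "S", 2),                            # S -> CC
    (6, "c"): ("s", 6), (6, "d"): ("s", 7),
    (7, "$"): ("r", "C", 1),                            # C -> d
    (8, "c"): ("r", "C", 2), (8, "d"): ("r", "C", 2),   # C -> cC
    (9, "$"): ("r", "C", 2),                            # C -> cC
}
GOTO = {(0, "S"): 1, (0, "C"): 2, (2, "C"): 5, (3, "C"): 8, (6, "C"): 9}

def lr_parse(w):
    """Run the LR driver; return True iff w is accepted."""
    stack, tokens = [0], list(w) + ["$"]
    while True:
        move = ACTION.get((stack[-1], tokens[0]))
        if move is None:
            return False                 # blank entry: error
        if move[0] == "acc":
            return True
        if move[0] == "s":               # shift: push state, consume token
            stack.append(move[1])
            tokens.pop(0)
        else:                            # reduce A -> beta: pop |beta| states
            _, head, size = move
            del stack[-size:]
            stack.append(GOTO[(stack[-1], head)])

print(lr_parse("dcd"))   # True
print(lr_parse("d"))     # False
```

The driver visits exactly the states shown in the trace table for 'dcd'.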
Exercises

Practice Problems 1
Directions for questions 1 to 15: Select the correct alternative from the given choices.

1. Consider the grammar
   S → a
   S → ab
   The given grammar is:
   (A) LR(1) only
   (B) LL(1) only
   (C) Both LR(1) and LL(1)
   (D) LR(1) but not LL(1)

2. Which of the following is an unambiguous grammar, that is not LR(1)?
   (A) S → Uab | Vac
       U → d
       V → d
   (B) S → Uab/Vab/Vac
       U → d
       V → d
   (C) S → AB
       A → a
       B → b
   (D) S → Ab
       A → a/c

Common data for questions 3 and 4: Consider the grammar:
   S → T; S/ε
   T → UR
   U → x/y/[S]
   R → .T/ε

3. Which of the following are correct FIRST and FOLLOW sets for the above grammar?
   (i) FIRST(S) = FIRST(T) = FIRST(U) = {x, y, [, ε}
   (ii) FIRST(R) = {., ε}
   (iii) FOLLOW(S) = {], $}
   (iv) FOLLOW(T) = FOLLOW(R) = {;}
   (v) FOLLOW(U) = {., ;}
   (A) (i) and (ii) only
   (B) (ii), (iii), (iv) and (v) only
   (C) (ii), (iii) and (iv) only
   (D) All the five

4. If an LL(1) parsing table is constructed for the above grammar, the parsing table entry for [S, [ ] is
   (A) S → T; S
   (B) S → ε
   (C) T → UR
   (D) U → [S]

Common data for questions 5 to 7: Consider the augmented grammar
   S → X
   X → (X)/a

5. If a DFA is constructed for the LR(1) items of the above grammar, then the number of states present in it is:
   (A) 8   (B) 9
   (C) 7   (D) 10

6. The given grammar is
   (A) Only LR(1)
   (B) Only LL(1)
   (C) Both LR(1) and LL(1)
   (D) Neither LR(1) nor LL(1)

7. What is the number of shift-reduce steps for the input (a)?
   (A) 15   (B) 14
   (C) 13   (D) 16

8. Consider the following two sets of LR(1) items of a grammar:
   X → c.X, c/d     X → c.X, $
   X → .cX, c/d     X → .cX, $
   X → .d, c/d      X → .d, $
   Which of the following statements related to merging of the two sets in the corresponding LALR parser is/are FALSE?
   1. Cannot be merged since the lookaheads are different.
   2. Can be merged but will result in S–R conflict.
   3. Can be merged but will result in R–R conflict.
   4. Cannot be merged since goto on c will lead to two different sets.
   (A) 1 only   (B) 2 only
   (C) 1 and 4 only   (D) 1, 2, 3 and 4

9. Which of the following grammar rules violate the requirements of an operator grammar?
   (i) A → BcC   (ii) A → dBC
   (iii) A → C/ε   (iv) A → cBdC
   (A) (i) only   (B) (i) and
   (C) (ii) and (iii) only   (D) (i) and (iv) only

10. The FIRST and FOLLOW sets for the grammar
    S → SS + /SS*/a
    are:
    (A) First(S) = {a}, Follow(S) = {+, *, $}
    (B) First(S) = {+}, Follow(S) = {+, *, $}
    (C) First(S) = {a}, Follow(S) = {+, *}
    (D) First(S) = {+, *}, Follow(S) = {+, *, $}

11. A shift-reduce parser carries out the actions specified within braces immediately after reducing with the corresponding rule of the grammar:
    S → xxW {print '1'}
    S → y {print '2'}
    W → Sz {print '3'}
    What is the translation of 'x x x x y z z'?
    (A) 1231   (B) 1233
    (C) 2131   (D) 2321

12. After constructing the predictive parsing table for the following grammar:
    Z → d
    Z → XYZ
    Y → c/ε
    X → Y
    X → a
    the entry/entries for [Z, d] is/are
    (A) Z → d
    (B) Z → XYZ
    (C) Both (A) and (B)
    (D) X → Y

13. The following grammar
    S → AaAb/BbBa
    A → ε
    B → ε
    is
    (A) LL(1)   (B) Not LL(1)
    (C) Recursive   (D) Ambiguous

14. Compute FIRST(P) for the grammar below:
    P → AQRbe/mn/DE
    A → ab/ε
    Q → q1q2/ε
    R → r1r2/ε
    D → d
    E → e
    (A) {m, a}   (B) {m, a, q1, r1, b, d}
    (C) {d, e}   (D) {m, n, a, b, d, e, q1, r1}

15. After constructing the LR(1) parsing table for the augmented grammar
    S′ → S
    S → BB
    B → aB/c
    what will be the action [I3, a]?
    (A) Accept   (B) S7
    (C) r2   (D) S5
8. LR parsers are attractive because
   (A) They can be constructed to recognize CFGs corresponding to almost all programming constructs.
   (B) There is no need of backtracking.
   (C) Both (A) and (B)
   (D) None of these

9. YACC builds up
   (A) SLR parsing table
   (B) Canonical LR parsing table
   (C) LALR parsing table
   (D) None of these

10. Languages which have many types, but where the type of every name and expression must be calculated at compile time, are
    (A) Strongly typed languages
    (B) Weakly typed languages
    (C) Loosely typed languages
    (D) None of these

11. Consider the grammar shown below:
    S → iEtSS′/a/b
    S′ → eS/ε
    In the predictive parsing table M of this grammar, the entries M[S′, e] and M[S′, $] respectively are
    (A) {S′ → eS} and {S′ → ε}
    (B) {S′ → eS} and { }
    (C) {S′ → ε} and {S′ → ε}
    (D) {S′ → eS, S′ → ε} and {S′ → ε}

12. Consider the grammar S → CC, C → cC/d.
    The grammar is
    (A) LL(1)
    (B) SLR(1) but not LL(1)
    (C) LALR(1) but not SLR(1)
    (D) LR(1) but not LALR(1)

13. Consider the grammar
    E → E + n/E – n/n
    For the sentence n + n – n, the handles in the right-sentential forms of the reduction are
    (A) n, E + n and E + n – n
    (B) n, E + n and E + E – n
    (C) n, n + n and n + n – n
    (D) n, E + n and E – n

14. A top-down parser uses ___ derivation.
    (A) Leftmost derivation
    (B) Leftmost derivation in reverse
    (C) Rightmost derivation
    (D) Rightmost derivation in reverse

15. Which of the following statements is false?
    (A) An unambiguous grammar has a single leftmost derivation.
    (B) An LL(1) parser is top-down.
    (C) LALR is more powerful than SLR.
    (D) An ambiguous grammar can never be LR(k) for any k.

16. Merging states with a common core may produce ___ conflicts in an LALR parser.
    (A) Reduce–reduce
    (B) Shift–reduce
    (C) Both (A) and (B)
    (D) None of these

17. An LL(k) grammar
    (A) Has to be a CFG
    (B) Has to be unambiguous
    (C) Cannot have left recursion
    (D) All of these

18. The I0 state of the LR(0) items for the grammar
    S → AS/b
    A → SA/a
    is
    (A) S′ → .S
        S → .AS
        S → .b
        A → .SA
        A → .a
    (B) S → .AS
        S → .b
        A → .SA
        A → .a
    (C) S → .AS
        S → .b
    (D) S → A
        A → .SA
        A → .a

19. In the predictive parsing table for the grammar
    S → FR
    R → ×S/ε
    F → id
    what will be the entry for [S, id]?
    (A) S → FR
    (B) F → id
    (C) Both (A) and (B)
    (D) None of these
11. Which of the following strings is generated by the grammar? [2007]
    (A) aaaabb   (B) aabbbb
    (C) aabbab   (D) abbbba

12. For the correct answer strings to Q.78, how many derivation trees are there? [2007]
    (A) 1   (B) 2
    (C) 3   (D) 4

13. Which of the following describes a handle (as applicable to LR-parsing) appropriately? [2008]
    (A) It is the position in a sentential form where the next shift or reduce operation will occur.
    (B) It is a non-terminal whose production will be used for reduction in the next step.
    (C) It is a production that may be used for reduction in a future step, along with a position in the sentential form where the next shift or reduce operation will occur.
    (D) It is the production p that will be used for reduction in the next step, along with a position in the sentential form where the right-hand side of the production may be found.

14. Which of the following statements are true? [2008]
    (i) Every left-recursive grammar can be converted to a right-recursive grammar and vice-versa
    (ii) All ε-productions can be removed from any context-free grammar by suitable transformations
    (iii) The language generated by a context-free grammar all of whose productions are of the form X → w or X → wY (where w is a string of terminals and Y is a non-terminal) is always regular
    (iv) The derivation trees of strings generated by a context-free grammar in Chomsky Normal Form are always binary trees
    (A) (i), (ii), (iii) and (iv)   (B) (ii), (iii) and (iv) only
    (C) (i), (iii) and (iv) only   (D) (i), (ii) and (iv) only

15. An LALR(1) parser for a grammar G can have shift-reduce (S–R) conflicts if and only if [2008]
    (A) The SLR(1) parser for G has S–R conflicts
    (B) The LR(1) parser for G has S–R conflicts
    (C) The LR(0) parser for G has S–R conflicts
    (D) The LALR(1) parser for G has reduce–reduce conflicts

16. S → aSa | bSb | a | b;
    The language generated by the above grammar over the alphabet {a, b} is the set of [2009]
    (A) All palindromes.
    (B) All odd-length palindromes.
    (C) Strings that begin and end with the same symbol.
    (D) All even-length palindromes.

17. Which data structure in a compiler is used for managing information about variables and their attributes? [2010]
    (A) Abstract syntax tree
    (B) Symbol table
    (C) Semantic stack
    (D) Parse table

18. The grammar S → aSa|bS|c is [2010]
    (A) LL(1) but not LR(1)
    (B) LR(1) but not LL(1)
    (C) Both LL(1) and LR(1)
    (D) Neither LL(1) nor LR(1)

19. The lexical analysis for a modern computer language such as Java needs the power of which one of the following machine models in a necessary and sufficient sense? [2011]
    (A) Finite state automata
    (B) Deterministic pushdown automata
    (C) Non-deterministic pushdown automata
    (D) Turing machine

Common data for questions 20 and 21: For the grammar below, a partial LL(1) parsing table is also presented along with the grammar. Entries that need to be filled are indicated as E1, E2, and E3. ε is the empty string, $ indicates end of input, and | separates alternate right-hand sides of productions.
    S → aAbB | bAaB | ε
    A → S
    B → S

        a       | b       | $
    S   E1      | E2      | S → ε
    A   A → S   | A → S   | Error
    B   B → S   | B → S   | E3

20. The FIRST and FOLLOW sets for the non-terminals A and B are [2012]
    (A) FIRST(A) = {a, b, ε} = FIRST(B)
        FOLLOW(A) = {a, b}
        FOLLOW(B) = {a, b, $}
    (B) FIRST(A) = {a, b, $}
        FIRST(B) = {a, b, ε}
        FOLLOW(A) = {a, b}
        FOLLOW(B) = {$}
    (C) FIRST(A) = {a, b, ε} = FIRST(B)
        FOLLOW(A) = {a, b}
        FOLLOW(B) = ∅
    (D) FIRST(A) = {a, b} = FIRST(B)
        FOLLOW(A) = {a, b}
        FOLLOW(B) = {a, b}

21. The appropriate entries for E1, E2, and E3 are [2012]
    (A) E1: S → aAbB, A → S
        E2: S → bAaB, B → S
        E3: B → S
    (B) E1: S → aAbB, S → ε
        E2: S → bAaB, S → ε
        E3: S → ε
    (C) E1: S → aAbB, S → ε
        E2: S → bAaB, S → ε
        E3: B → S
    (D) E1: A → S, S → ε
        E2: B → S, S → ε
        E3: B → S

22. What is the maximum number of reduce moves that can be taken by a bottom-up parser for a grammar with no epsilon- and unit-productions (i.e., of type A → ε and A → a) to parse a string with n tokens? [2013]
    (A) n/2   (B) n − 1
    (C) 2n − 1   (D) 2n

23. Which of the following is/are undecidable? [2013]
    (i) G is a CFG. Is L(G) = φ?
    (ii) G is a CFG. Is L(G) = Σ*?
    (iii) M is a Turing machine. Is L(M) regular?
    (iv) A is a DFA and N is an NFA. Is L(A) = L(N)?
    (A) (iii) only
    (B) (iii) and (iv) only
    (C) (i), (ii) and (iii) only
    (D) (ii) and (iii) only

24. Consider the following two sets of LR(1) items of an LR(1) grammar. [2013]
    X → c.X, c/d     X → c.X, $
    X → .cX, c/d     X → .cX, $
    X → .d, c/d      X → .d, $
    Which of the following statements related to merging of the two sets in the corresponding LALR parser is/are FALSE?
    (i) Cannot be merged since the lookaheads are different.
    (ii) Can be merged but will result in S–R conflict.
    (iii) Can be merged but will result in R–R conflict.
    (iv) Cannot be merged since goto on c will lead to two different sets.
    (A) (i) only   (B) (ii) only
    (C) (i) and (iv) only   (D) (i), (ii), (iii) and (iv)

25. A canonical set of items is given below
    S → L. > R
    Q → R.
    On input symbol < the set has [2014]
    (A) A shift–reduce conflict and a reduce–reduce conflict.
    (B) A shift–reduce conflict but not a reduce–reduce conflict.
    (C) A reduce–reduce conflict but not a shift–reduce conflict.
    (D) Neither a shift–reduce nor a reduce–reduce conflict.

26. Consider the grammar defined by the following production rules, with two operators * and +
    S → T * P
    T → U | T * U
    P → Q + P | Q
    Q → Id
    U → Id
    Which one of the following is TRUE? [2014]
    (A) + is left associative, while * is right associative
    (B) + is right associative, while * is left associative
    (C) Both + and * are right associative.
    (D) Both + and * are left associative.

27. Which one of the following problems is undecidable? [2014]
    (A) Deciding if a given context-free grammar is ambiguous.
    (B) Deciding if a given string is generated by a given context-free grammar.
    (C) Deciding if the language generated by a given context-free grammar is empty.
    (D) Deciding if the language generated by a given context-free grammar is finite.

28. Which one of the following is TRUE at any valid state in shift-reduce parsing? [2015]
    (A) Viable prefixes appear only at the bottom of the stack and not inside.
    (B) Viable prefixes appear only at the top of the stack and not inside.
    (C) The stack contains only a set of viable prefixes.
    (D) The stack never contains viable prefixes.

29. Among simple LR (SLR), canonical LR, and look-ahead LR (LALR), which of the following pairs identify the method that is very easy to implement and the method that is the most powerful, in that order? [2015]
    (A) SLR, LALR
    (B) Canonical LR, LALR
    (C) SLR, canonical LR
    (D) LALR, canonical LR

30. Consider the following grammar G
    S → F | H
    F → p | c
    H → d | c
    where S, F and H are non-terminal symbols and p, d and c are terminal symbols. Which of the following statement(s) is/are correct? [2015]
    S1: LL(1) can parse all strings that are generated using grammar G
    S2: LR(1) can parse all strings that are generated using grammar G
Answer Keys
Exercises
Practice Problems 1
1. D 2. A 3. B 4. A 5. D 6. C 7. C 8. D 9. C 10. A
11. C 12. C 13. A 14. B 15. D
Practice Problems 2
1. D 2. C 3. B 4. A 5. B 6. B 7. A 8. C 9. C 10. A
11. D 12. A 13. D 14. A 15. D 16. A 17. C 18. A 19. A