
PART-A

1. In which phase is the parse tree generated?
a) Semantic analysis
b) Syntax analysis
c) Intermediate code generation
d) Code generation

2. When the expression total=5+3 is tokenized, what is the token category of total?
a) Assignment operator
b) Identifier
c) Integer literal
d) Addition operator

3. What do leaf nodes in a parse tree indicate?
a) Sub-terminals
b) Half-terminals
c) Non-terminals
d) Terminals

4. In which phase of the compiler is a Finite State Automaton (FSA) used?
a) Code optimization
b) Code generation
c) Lexical analysis
d) Parsing

5. What does the lexical analyzer take as input?
a) Tokens
b) Parse tree
c) Source code
d) Machine code

6. In which derivation is the rightmost non-terminal symbol replaced at each step?
a) Right look-ahead
b) Right claim
c) Rightmost
d) Right non-terminal

7. Which of the following are labeled by operator symbols?
a) Root
b) Interior nodes
c) Leaves
d) Nodes

8. Which derivation is produced by a bottom-up parser?
a) Rightmost derivation in reverse
b) Leftmost derivation in reverse
c) Rightmost derivation
d) Leftmost derivation

9. A form of recursive-descent parsing that does not require any backtracking is known as?
a) Predictive parsing
b) Non-predictive parsing
c) Recursive parsing
d) Non-recursive parsing

10. In which phase of the compiler is the grammar checked?
a) Syntax analysis
b) Code optimization
c) Semantic analysis
d) Code generation

11. What is the use of a symbol table in compiler design?
a) Finding a name's scope
b) Type checking
c) Keeping the names of all entities in one place
d) Correcting errors

12. Which part of a compiler takes as input a stream of characters and produces as output a stream of words along with their associated syntactic categories?
a) Optimizer
b) Scanner
c) Parser
d) Sentinels

13. How can the lexical analysis phase be sped up using input buffering?
1. Double the buffer size
2. Introduce one more buffer
3. Use a sentinel character at the end of the buffer
a. 1, 2, 3
b. 1, 2
c. 2, 3
d. 0, 0

14. Which RE gives zero or more instances of x or y?
a) (x+y)
b) (x+y)*
c) (x* + y)
d) (xy)*

15. In which phase of the compiler are characters grouped into tokens?
a) Code generator
b) Lexical analyzer
c) Parser
d) Code optimization

16. Which one of the following is a bottom-up parser?
a) Predictive parser
b) Recursive-descent parser
c) Non-recursive descent parser
d) Shift-reduce parser

17. Which of these does not belong to a CFG?
a) Terminal symbol
b) Non-terminal symbol
c) Start symbol
d) End symbol

18. Which derivation does a top-down parser use while parsing an input string?
a) Leftmost derivation
b) Leftmost derivation in reverse
c) Rightmost derivation
d) Rightmost derivation in reverse

19. What does LR stand for?
a) Right to left
b) Left to right
c) Left-to-right scan, rightmost derivation in reverse
d) Left-to-right reduction

20. In which phase of the compiler is the grammar checked?
a) Syntax analysis
b) Code optimization
c) Semantic analysis
d) Code generation
PART-B
1.What are the two parts of a compilation? Explain briefly.

Analysis and Synthesis are the two parts of compilation.

· The analysis part breaks up the source program into constituent pieces and creates an intermediate representation of the source program.

· The synthesis part constructs the desired target program from the intermediate representation.

2.Differentiate Lexeme, Token, Pattern with example.

Token - a sequence of characters that has a collective meaning.

Pattern - a rule describing the set of strings in the input for which the same token is produced as output.

Lexeme - a sequence of characters in the source program that is matched by the pattern for a token.

3. Construct NFA for the regular expression: (a+b)*
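The NFA is normally drawn as a figure. As a stand-in, here is a Python sketch of one possible Thompson-style ε-NFA for (a+b)*; the state numbering (start state 0, accepting state 3) is illustrative, not canonical:

```python
# ε-NFA for (a+b)*: 0 --ε--> {1, 3}; 1 --a/b--> 2; 2 --ε--> {1, 3}; accept {3}.
EPS = {0: {1, 3}, 2: {1, 3}}          # ε-transitions
MOVE = {(1, "a"): {2}, (1, "b"): {2}}  # transitions on input symbols
ACCEPT = {3}

def eps_closure(states):
    """All states reachable from `states` via ε-transitions alone."""
    stack, seen = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in EPS.get(s, ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def nfa_accepts(s):
    current = eps_closure({0})
    for ch in s:
        nxt = set()
        for st in current:
            nxt |= MOVE.get((st, ch), set())
        current = eps_closure(nxt)
    return bool(current & ACCEPT)

print(nfa_accepts(""), nfa_accepts("abba"), nfa_accepts("abc"))  # True True False
```

Since (a+b)* denotes every string over {a, b} including ε, the simulation accepts exactly those strings.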

4.Define a Context Free Grammar (CFG)

A context-free grammar G consists of the following:

· V, a set of non-terminals

· T, a set of terminals

· S, a start symbol

· P, a set of production rules

G is represented as G = (V, T, S, P).

Production rules are given in the following form:

Non-terminal → (V ∪ T)*

5.Eliminate the left recursion for the following grammar

E → E + T | T

T → T * F | F

F → (E) | id

After eliminating left recursion:

E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id
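Immediate left recursion of the form A → Aα | β can be removed mechanically. As an illustrative sketch (the function name and list-of-symbols encoding are assumptions, with "" denoting ε):

```python
def eliminate_immediate_left_recursion(nt, productions):
    """A -> A alpha | beta  becomes  A -> beta A',  A' -> alpha A' | epsilon.
    Productions are lists of symbols; "" denotes epsilon."""
    recursive = [p[1:] for p in productions if p and p[0] == nt]   # the alpha parts
    others = [p for p in productions if not p or p[0] != nt]       # the beta parts
    if not recursive:
        return {nt: productions}                                   # nothing to do
    new = nt + "'"                                                 # fresh non-terminal A'
    return {
        nt: [beta + [new] for beta in others],
        new: [alpha + [new] for alpha in recursive] + [[""]],
    }

print(eliminate_immediate_left_recursion("E", [["E", "+", "T"], ["T"]]))
```

Applied to E → E+T | T this reproduces E → TE', E' → +TE' | ε.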

6.Draw a transition diagram to represent relational operators.


7. List the operations on string languages.

1. Union

Union is the most common set operation. Consider two languages L and M. The union of these two languages is denoted by:

L ∪ M = { s | s is in L or s is in M }

That is, a string s in the union of the two languages comes either from language L or from language M.

If L = {a, b} and M = {c, d}, then L ∪ M = {a, b, c, d}.

2. Concatenation

Concatenation links the string from one language to the string of another language
in a series in all possible ways. The concatenation of two different languages is
denoted by:

L ⋅ M = { st | s is in L and t is in M }

If L = {a, b} and M = {c, d}, then L ⋅ M = {ac, ad, bc, bd}.

3. Kleene Closure

The Kleene closure of a language L is the set of strings obtained by concatenating L zero or more times. It is denoted by:

L* = L⁰ ∪ L¹ ∪ L² ∪ …

If L = {a, b}, then L* = {ε, a, b, aa, ab, ba, bb, aaa, …}

4. Positive Closure

The positive closure of a language L is the set of strings obtained by concatenating L one or more times. It is denoted by:

L⁺ = L¹ ∪ L² ∪ L³ ∪ …

It is the same as the Kleene closure except that it omits the term L⁰, i.e. L⁺ excludes ε unless ε is in L itself.

If L = {a, b}, then L⁺ = {a, b, aa, ab, ba, bb, aaa, …}
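The four operations can be checked mechanically. A small Python sketch over finite sets (the closures are infinite, so the sketch bounds the number of concatenations with a parameter n, which is an assumption of this illustration):

```python
def union(L, M):
    """L ∪ M: strings in either language."""
    return L | M

def concat(L, M):
    """L ⋅ M: every string of L followed by every string of M."""
    return {s + t for s in L for t in M}

def closure(L, n):
    """Strings of L* built from at most n concatenations (L* itself is infinite)."""
    result, level = {""}, {""}        # "" is the empty string ε
    for _ in range(n):
        level = concat(level, L)
        result |= level
    return result

L, M = {"a", "b"}, {"c", "d"}
print(sorted(union(L, M)))    # ['a', 'b', 'c', 'd']
print(sorted(concat(L, M)))   # ['ac', 'ad', 'bc', 'bd']
print(sorted(closure(L, 2)))  # ['', 'a', 'aa', 'ab', 'b', 'ba', 'bb']
```

The positive closure is obtained the same way but starting from L instead of {ε}.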

8. Construct a parse tree for -(id + id).


9.Compute FIRST for all the non-terminals for the following grammar.
S→ (L) | a
L→ L, S | S

Remove Left Recursion for production L

L-> SL’

L’ -> ,SL’ | ∈

The grammar after eliminating left recursion is-

S → (L) / a

L → SL’

L’ → ,SL’ / ∈

First:

FIRST(S) = { ( , a }

FIRST(L) = { ( , a }

FIRST(L') = { ',' , ε }
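The FIRST sets above can be reproduced with a small fixpoint iteration. A Python sketch (the grammar encoding, with "" standing for ε, is an assumption of this illustration):

```python
# Grammar after left-recursion elimination; "" denotes ε.
GRAMMAR = {
    "S": [["(", "L", ")"], ["a"]],
    "L": [["S", "L'"]],
    "L'": [[",", "S", "L'"], [""]],
}

def first_of_seq(seq, first, grammar):
    """FIRST of a sequence of grammar symbols."""
    out = set()
    for sym in seq:
        if sym == "":                     # explicit ε-production
            out.add("")
            break
        f = first[sym] if sym in grammar else {sym}   # non-terminal vs terminal
        out |= f - {""}
        if "" not in f:                   # stop unless sym is nullable
            break
    else:
        out.add("")                       # every symbol was nullable
    return out

def first_sets(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                        # iterate to a fixpoint
        changed = False
        for nt, prods in grammar.items():
            for prod in prods:
                new = first_of_seq(prod, first, grammar)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

fs = first_sets(GRAMMAR)
print(sorted(fs["S"]))   # ['(', 'a']
print(sorted(fs["L'"]))  # ['', ',']
```

The computed sets agree with the hand-derived FIRST(S), FIRST(L), and FIRST(L') above.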

10.Eliminate Left Recursion from the following grammar.

S -> Sab | T

After eliminating left recursion:

S → TS'

S' → abS' | ε
PART-C

1.Discuss about the recognition of tokens.

How tokens are recognized:

The lexical analyzer reads the source program character by character and produces a stream of tokens.

A token may be an identifier, an operator, a constant, or a keyword.

Tokens are specified using regular expressions.

Tokens are recognized with the help of transition diagrams.

Recognition of tokens is done to separate out the different tokens.

Example: assume the following grammar fragment generates a specific language, where the terminals if, then, else, relop, id, and num generate sets of strings given by regular definitions, with letter and digit as defined previously.

For this language fragment the lexical analyzer will recognize the keywords if, then, else, as well as the lexemes denoted by relop, id, and num. To simplify matters, we assume keywords are reserved; that is, they cannot be used as identifiers. num represents the unsigned integer and real numbers of Pascal. In addition, we assume lexemes are separated by white space, consisting of non-null sequences of blanks, tabs, and newlines. The lexical analyzer strips out white space by comparing a string against the regular definition ws. If a match for ws is found, the lexical analyzer does not return a token to the parser.
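As a sketch of this behaviour, here is a small regex-driven scanner. The token patterns below are assumed Pascal-style regular definitions (the original figure with the exact definitions is not reproduced here), and the (token-name, lexeme) output format is illustrative:

```python
import re

# Assumed Pascal-style regular definitions; ws is stripped, keywords are reserved.
TOKEN_SPEC = [
    ("ws",    r"[ \t\n]+"),
    ("num",   r"\d+(\.\d+)?(E[+-]?\d+)?"),
    ("id",    r"[A-Za-z][A-Za-z0-9]*"),
    ("relop", r"<=|<>|>=|<|=|>"),
]
KEYWORDS = {"if", "then", "else"}

def tokenize(src):
    tokens, pos = [], 0
    while pos < len(src):
        for name, pattern in TOKEN_SPEC:
            m = re.match(pattern, src[pos:])
            if m:
                lexeme = m.group()
                if name == "id" and lexeme in KEYWORDS:
                    tokens.append((lexeme, lexeme))  # reserved keyword
                elif name != "ws":                   # white space is not returned
                    tokens.append((name, lexeme))
                pos += len(lexeme)
                break
        else:
            raise ValueError(f"illegal character at position {pos}")
    return tokens

print(tokenize("if x1 <= 42 then y"))
```

Note how the scanner returns no token for ws matches, exactly as described above.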

Transition Diagram

Tokens can be recognized by Finite Automata

A Finite automaton(FA) is a simple idealized machine used to recognize patterns


within input taken from some character set(or Alphabet) C. The job of FA is to
accept or reject an input depending on whether the pattern defined by the FA
occurs in the input.

There are two notations for representing Finite Automata. They are

Transition Diagram

Transition Table

A transition diagram is a directed labeled graph containing nodes and edges. Nodes represent states and edges represent transitions between states. Every transition diagram has exactly one initial state, indicated by an incoming arrow (-->), and zero or more final states, represented by double circles.

Example:

Here state 1 is the initial state and state 3 is the final state.

As an intermediate step in the construction of a lexical analyzer, we first produce a flowchart called a transition diagram. Transition diagrams depict the actions that take place when the lexical analyzer is called by the parser to get the next token. The transition diagram is used to keep track of information about characters that are seen as the forward pointer scans the input; it does this by moving from position to position in the diagram as characters are read.

Components of Transition Diagram

Finite Automata for recognizing identifiers


Finite Automata for recognizing keywords

Finite Automata for recognizing numbers

Finite Automata for relational operators

Finite Automata for recognizing white spaces


2. What are the phases of a compiler? Explain the phases in detail. Write down the output of each phase for the expression a := b + c * 60.

Phases of a compiler: A compiler operates in phases. A phase is a logically interrelated operation that takes the source program in one representation and produces output in another representation.

The phases include:


1. Lexical analysis (“scanning”)
Reads in program, groups characters into “tokens”
2. Syntax analysis (“parsing”)
Structures token sequence according to grammar rules of the language.
3. Semantic analysis
Checks semantic constraints of the language.
4. Intermediate code generation
Translates to “lower level” representation.
5. Code optimization
Improves code quality.
6. Code generation.

Phase-1: Lexical Analysis


The lexical analyzer reads the stream of characters making up the source program and groups the characters into meaningful sequences called lexemes.
• For each lexeme, the lexical analyzer produces a token of the form (token-name, attribute-value) that it passes on to the subsequent phase, syntax analysis.
• Token-name: an abstract symbol used during syntax analysis.
• Attribute-value: points to an entry in the symbol table for this token.
Example:
newval := oldval + 12
Tokens:
newval Identifier
:= Assignment operator
oldval Identifier
+ Add operator
12 Number
The lexical analyzer also strips out white space and comments.

Phase-2: Syntax Analysis


• Also called parsing.
• The parser uses the first components of the tokens produced by the lexical
analyzer to create a tree-like intermediate representation that depicts the
grammatical structure of the token stream.
• A typical representation is a syntax tree in which each interior node represents an
operation and the children of the node represent the arguments of the operation

Phase-3: Semantic Analysis


• The semantic analyzer uses the syntax tree and the information in the symbol
table to check the source program for semantic consistency with the language
definition.
• Gathers type information and saves it in either the syntax tree or the symbol table,
for subsequent use during intermediate-code generation.
• An important part of semantic analysis is type checking, where the compiler
checks that each operator has matching operands.
• For example, many programming language definitions require an array index to
be an integer; the compiler must report an error if a floating-point number is
used to index an array.
• Example:
newval := oldval+12

The type of the identifier newval must match with the type of expression
(oldval+12).
Example: syntactically correct, but semantically incorrect:

int a;
double sum;
char b;
sum = a + b;   /* data type mismatch */

Phase-4: Intermediate Code Generation


After syntax and semantic analysis of the source program, many compilers
generate an explicit low-level or machine-like intermediate representation(a
program for an abstract machine). This intermediate representation should have
two important properties:
• it should be easy to produce and
• it should be easy to translate into the target machine.
The intermediate form considered here is called three-address code, which consists of a sequence of assembly-like instructions with three operands per instruction. Each operand can act like a register.
This phase bridges the analysis and synthesis phases of translation.

Example: for a := b + c * 60, where id1, id2, id3 denote the symbol-table entries for a, b, c, and b and c are real while 60 is an integer literal:

temp1 := inttoreal(60)
temp2 := id3 * temp1
temp3 := id2 + temp2
id1 := temp3
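Three-address code can be produced mechanically from the expression tree. A Python sketch using the standard-library ast module to parse the expression (the temporary names t1, t2 and the output format are illustrative assumptions):

```python
import ast

def gen_tac(expr):
    """Emit three-address code for an arithmetic expression (a sketch)."""
    code, counter = [], [0]

    def new_temp():
        counter[0] += 1
        return f"t{counter[0]}"

    def walk(node):
        if isinstance(node, ast.BinOp):            # inner node: emit an instruction
            left, right = walk(node.left), walk(node.right)
            op = {ast.Add: "+", ast.Mult: "*"}[type(node.op)]
            t = new_temp()
            code.append(f"{t} = {left} {op} {right}")
            return t
        if isinstance(node, ast.Name):             # leaf: identifier
            return node.id
        if isinstance(node, ast.Constant):         # leaf: literal
            return str(node.value)

    walk(ast.parse(expr, mode="eval").body)
    return code

for line in gen_tac("b + c * 60"):
    print(line)
# t1 = c * 60
# t2 = b + t1
```

The multiplication is emitted before the addition because its subtree is evaluated first, matching the order of the three-address sequence above.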

Phase-5: Code Optimization


• The compiler looks at large segments of the program to decide how to improve
performance
• The machine-independent code-optimization phase attempts to improve the
intermediate code so that better target code will result.
• Usually better means:
• faster, shorter code, or target code that consumes less power.
• There are simple optimizations that significantly improve the running time of the
target program without slowing down compilation too much.
• Optimization cannot make an inefficient algorithm efficient - “only makes an
efficient algorithm more efficient”

Example:
The above intermediate code will be optimized as:
temp1 := id3 * 60.0
id1 := id2 + temp1
Phase-6: Code Generation
• The last phase of translation is code generation.
• Takes as input an intermediate representation of the source program and maps it
into the target language
• If the target language is machine code, registers or memory locations are
selected for each of the variables used by the program.
• Then, the intermediate instructions are translated into sequences of machine
instructions that perform the same task.
• A crucial aspect of code generation is the judicious assignment of registers to
hold variables.

Example (a typical register-machine sequence for the optimized code above):

MOVF id3, R2
MULF #60.0, R2
MOVF id2, R1
ADDF R2, R1
MOVF R1, id1

3. Construct Deterministic Finite Automata for the given regular expression.


(0+1)* 01
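A minimal Python sketch of the resulting DFA, using states q0 (start), q1 ("last symbol was 0"), and q2 ("just saw 01", accepting); the state names are illustrative:

```python
# DFA for (0+1)*01: accept exactly the strings ending in "01".
DELTA = {
    ("q0", "0"): "q1", ("q0", "1"): "q0",
    ("q1", "0"): "q1", ("q1", "1"): "q2",
    ("q2", "0"): "q1", ("q2", "1"): "q0",
}

def dfa_accepts(s):
    state = "q0"
    for ch in s:
        state = DELTA[(state, ch)]   # deterministic: exactly one move per symbol
    return state == "q2"

print(dfa_accepts("01"), dfa_accepts("1101"), dfa_accepts("010"))  # True True False
```

Three states suffice because the DFA only needs to remember how much of the suffix "01" it has just seen.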
4.Construct Parsing table for the grammar and find states made by predictive
parser on input “id + id * id” and find FIRST and FOLLOW.
E -> E + T | T
T -> T * F | F
F -> (E)
F-> id

Step1 : Eliminate Left Recursion:

After eliminating left-recursion the grammar is


E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → (E) | id

Step 2: Find First and Follow

First( ) :
FIRST(E) = { ( , id}
FIRST(E’) ={+ , ε }
FIRST(T) = { ( , id}
FIRST(T’) = {*, ε }
FIRST(F) = { ( , id }
Follow( ):
FOLLOW(E) = { $, ) }
FOLLOW(E’) = { $, ) }
FOLLOW(T) = { +, $, ) }
FOLLOW(T’) = { +, $, ) }
FOLLOW(F) = {+, * , $ , ) }

Step 3: Construct Parsing Table

Predictive Parsing Table:

          id         +           *           (          )          $
E         E → TE'                            E → TE'
E'                   E' → +TE'                          E' → ε     E' → ε
T         T → FT'                            T → FT'
T'                   T' → ε      T' → *FT'              T' → ε     T' → ε
F         F → id                             F → (E)

Step 4 : Parse the given Input string: id+id*id
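The table-driven predictive parsing algorithm can be sketched in Python; the table below encodes the grammar obtained in Step 1 (the dictionary encoding is an assumption of this illustration):

```python
# LL(1) table for E → TE', E' → +TE' | ε, T → FT', T' → *FT' | ε, F → (E) | id.
# An empty right-hand side list stands for ε.
TABLE = {
    ("E", "id"): ["T", "E'"],      ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"],      ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"],           ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

def parse(tokens):
    stack = ["$", "E"]               # start symbol on top of the end marker
    tokens = tokens + ["$"]
    i = 0
    while stack:
        top = stack.pop()
        if top == "$" and tokens[i] == "$":
            return True              # input consumed, stack empty: accept
        if top in NONTERMINALS:
            rule = TABLE.get((top, tokens[i]))
            if rule is None:
                return False         # blank table entry: syntax error
            stack.extend(reversed(rule))
        elif top == tokens[i]:
            i += 1                   # terminal on stack matches lookahead
        else:
            return False
    return False

print(parse(["id", "+", "id", "*", "id"]))  # True
```

Each step either expands the non-terminal on top of the stack using the table entry selected by the lookahead, or matches a terminal, exactly as in the hand trace.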

5. Construct the leftmost derivation, rightmost derivation, and derivation tree for the
following grammar with respect to the string "aaabbabbba".
S → aB | bA
A → aS | bAA | a
B → bS | aBB | b

Leftmost Derivation
The process of deriving a string by expanding the leftmost non-terminal at each step is called leftmost derivation.
The geometrical representation of a leftmost derivation is called a leftmost derivation tree.

S → aB
→ aaBB (Using B → aBB)
→ aaaBBB (Using B → aBB)
→ aaabBB (Using B → b)
→ aaabbB (Using B → b)
→ aaabbaBB (Using B → aBB)
→ aaabbabB (Using B → b)
→ aaabbabbS (Using B → bS)
→ aaabbabbbA (Using S → bA)
→ aaabbabbba (Using A → a)
Derivation Tree

Rightmost Derivation
The process of deriving a string by expanding the rightmost non-terminal at each step is called rightmost derivation.
The geometrical representation of a rightmost derivation is called a rightmost derivation tree.

S → aB
→ aaBB (Using B → aBB)
→ aaBaBB (Using B → aBB)
→ aaBaBbS (Using B → bS)
→ aaBaBbbA (Using S → bA)
→ aaBaBbba (Using A → a)
→ aaBabbba (Using B → b)
→ aaaBBabbba (Using B → aBB)
→ aaaBbabbba (Using B → b)
→ aaabbabbba (Using B → b)
Derivation Tree
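Derivations like these can be verified mechanically. A Python sketch that replays the leftmost derivation above step by step (the "X->rhs" production encoding, with uppercase letters as non-terminals, is an assumption of this illustration):

```python
# The production applied at each step of the leftmost derivation above.
PRODS = ["S->aB", "B->aBB", "B->aBB", "B->b", "B->b",
         "B->aBB", "B->b", "B->bS", "S->bA", "A->a"]

def leftmost_derive(start, prods):
    sent = start
    for p in prods:
        lhs, rhs = p.split("->")
        # Find the leftmost non-terminal (uppercase letter by convention).
        i = min(j for j, c in enumerate(sent) if c.isupper())
        assert sent[i] == lhs, f"leftmost non-terminal is {sent[i]}, not {lhs}"
        sent = sent[:i] + rhs + sent[i + 1:]
    return sent

print(leftmost_derive("S", PRODS))  # aaabbabbba
```

The replay confirms that the ten steps listed above derive exactly "aaabbabbba".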

6. Check whether the following grammar can be implemented using a predictive parser.
Check whether the string "abfg" is accepted or not using predictive parsing.
S → A
A → aB | Ad
B → bBC | f
C → g
Step 1: Eliminate Left Recursion
S → A
A → aBA'
A' → dA' | ε
B → bBC | f
C → g

Step 2: Find First and Follow

FIRST(S) = FIRST(A) = { a }
FIRST(A') = { d, ε }
FIRST(B) = { b, f }
FIRST(C) = { g }

FOLLOW(S) = { $ }
FOLLOW(A) = { $ }
FOLLOW(A') = { $ }
FOLLOW(B) = { d, g, $ }
FOLLOW(C) = { d, g, $ }

Step 3 : Construct Parsing Table

Step 4 : Parsing the Input String ==> “abfg”


7.Construct LR (0) parsing table for the grammar. Check whether the input
string “aabb” is accepted or not.
S → AA
A → aA | b

Augmented Grammar:

S' → S
S → AA
A → aA | b

Canonical collection of LR(0) items (initial state I0):

S' → .S
S → .AA
A → .aA
A → .b

Constructing the DFA of item sets:

Construction of parsing table:

Parsing Input string:
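The shift-reduce parse of "aabb" can be sketched in Python with ACTION/GOTO tables hand-built from the LR(0) item sets; the state numbering (0 = initial item set, 1 = accept state) is illustrative:

```python
# LR(0) tables for S' -> S, S -> AA, A -> aA | b, hand-built from the item sets.
SHIFT = {(0, "a"): 3, (0, "b"): 4, (2, "a"): 3, (2, "b"): 4,
         (3, "a"): 3, (3, "b"): 4}
REDUCE = {4: ("A", 1), 5: ("S", 2), 6: ("A", 2)}   # state -> (lhs, rhs length)
GOTO = {(0, "A"): 2, (0, "S"): 1, (2, "A"): 5, (3, "A"): 6}

def lr0_parse(inp):
    stack = [0]                       # stack of states
    tokens = list(inp) + ["$"]
    i = 0
    while True:
        state = stack[-1]
        if state == 1 and tokens[i] == "$":
            return True               # accept
        if state in REDUCE:           # LR(0): reduce regardless of lookahead
            lhs, n = REDUCE[state]
            del stack[-n:]            # pop one state per RHS symbol
            stack.append(GOTO[(stack[-1], lhs)])
        elif (state, tokens[i]) in SHIFT:
            stack.append(SHIFT[(state, tokens[i])])
            i += 1
        else:
            return False              # error entry

print(lr0_parse("aabb"))  # True
```

Tracing "aabb" reproduces the expected reductions A → b, A → aA (twice), and finally S → AA, after which the parser accepts.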
