COMPILER DESIGN
ECX6235
ANSWERS FOR TMA 01
NAME : N.G.P.R PREMACHANDRA
REG. No : 511078618
CENTRE : COLOMBO
DUE DATE : 30/03/2017
[Q1]
(a)
A grammar G can be formally written as a 4-tuple (N, T, S, P), where:
• N or VN is a set of variables or non-terminal symbols.
• T or ∑ is a set of Terminal symbols.
• S is a special variable called the Start symbol, S ∈ N
• P is the set of production rules over terminals and non-terminals. A production rule has the form α
→ β, where α and β are strings over VN ∪ ∑ and at least one symbol of α belongs to VN.
Types of grammars:
Grammar type | Grammar accepted | Language accepted | Automaton | Production rules
Type 0 | Unrestricted grammar | Recursively enumerable language | Turing machine | α → β
Type 1 | Context-sensitive grammar | Context-sensitive language | Linear-bounded automaton | αAβ → αγβ
Type 2 | Context-free grammar | Context-free language | Pushdown automaton | A → γ
Type 3 | Regular grammar | Regular language | Finite state automaton | X → a or X → aY
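The production-rule shapes in the table can be checked mechanically. The following is a minimal sketch, not a definitive classifier: it assumes each symbol is a single character, that upper-case letters are non-terminals, and that a production is given as a (left, right) pair of strings. The function names are illustrative only.

```python
def is_nonterminal(symbol):
    # Assumption for this sketch: upper-case letters are non-terminals.
    return symbol.isupper()


def is_regular(lhs, rhs):
    # Type 3 shape: X -> a  or  X -> aY
    if len(lhs) != 1 or not is_nonterminal(lhs):
        return False
    if len(rhs) == 1:
        return not is_nonterminal(rhs)
    if len(rhs) == 2:
        return not is_nonterminal(rhs[0]) and is_nonterminal(rhs[1])
    return False


def is_context_free(lhs, rhs):
    # Type 2 shape: A -> gamma (exactly one non-terminal on the left)
    return len(lhs) == 1 and is_nonterminal(lhs)


def is_context_sensitive(lhs, rhs):
    # Type 1 shape: alpha A beta -> alpha gamma beta, with gamma non-empty
    for i, symbol in enumerate(lhs):
        if not is_nonterminal(symbol):
            continue
        alpha, beta = lhs[:i], lhs[i + 1:]
        if (len(rhs) >= len(lhs) and rhs.startswith(alpha)
                and rhs.endswith(beta)):
            return True
    return False


def classify(productions):
    """Return the most restrictive type (3, 2, 1 or 0) that fits every rule."""
    if all(is_regular(l, r) for l, r in productions):
        return 3
    if all(is_context_free(l, r) for l, r in productions):
        return 2
    if all(is_context_sensitive(l, r) for l, r in productions):
        return 1
    if all(any(is_nonterminal(s) for s in l) for l, _ in productions):
        return 0
    raise ValueError("every left-hand side needs at least one non-terminal")
```

For example, classify([("S", "aS"), ("S", "b")]) gives 3, while classify([("S", "aSb"), ("S", "ab")]) gives 2.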
(b)
A compiler is a program that translates a high-level language program into a functionally
equivalent low-level language program. So, a compiler is basically a translator whose source
language is a high-level language and whose target language is a low-level language. In other
words, given a source program as input, a compiler produces an object program as output.
Lexical Analysis
In the lexical analysis phase, the compiler scans the characters of the source program, one
character at a time. Whenever it gets a sufficient number of characters to constitute a token of
the specified language, it outputs that token. In order to perform this task, the lexical analyzer
must know the keywords, identifiers, operators, delimiters, and punctuation symbols of the
language to be implemented. So, when it scans the source program, it will be able to return a
suitable token whenever it encounters a token lexeme.
The lexical analyzer identifies each identifier (id) by using regular expressions.
Therefore, the lexical analyzer design must:
1. Specify the tokens of the language, and
2. Suitably recognize the tokens.
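The two design tasks above (specifying the tokens, then recognizing them) can be sketched with regular expressions. This is only an illustration: the token names, the keyword set and the sample input are assumptions, not part of any particular language definition.

```python
import re

# Assumed keyword set for this sketch.
KEYWORDS = {"while", "print"}

# Step 1: specify the tokens as (name, regular expression) pairs.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("OP", r"[<=+]"),
    ("DELIM", r"[(){};]"),
    ("SKIP", r"\s+"),
    ("MISMATCH", r"."),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})"
                             for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Step 2: scan the source and return a list of (kind, lexeme) tokens."""
    tokens = []
    for match in MASTER.finditer(source):
        kind, lexeme = match.lastgroup, match.group()
        if kind == "SKIP":
            continue  # whitespace separates tokens but is not a token
        if kind == "MISMATCH":
            raise SyntaxError(f"unknown character {lexeme!r}")
        if kind == "ID" and lexeme in KEYWORDS:
            kind = "KEYWORD"  # keywords match the ID pattern, so relabel them
        tokens.append((kind, lexeme))
    return tokens
```

Calling tokenize("while (x < 7)") returns the keyword, the delimiters, the identifier x, the operator < and the number 7 as separate tokens.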
Syntax Analyzer
Syntax analysis or parsing is the second phase of a compiler. The parser analyzes the source
code (token stream) against the production rules to detect any errors in the code. The output of
this phase is a parse tree. The syntax analyzer takes the tokens one by one and checks them
against the grammar. Parsers are expected to parse the whole code even if some errors exist in
the program, so they use error-recovery strategies.
Semantic Analyzer
The semantic analyzer takes the parse tree and verifies that it is semantically valid.
Intermediate Code Generator
Compilers generate an explicit intermediate code representation of the source program. The
intermediate code can take a variety of forms, for example three-address code.
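As an illustration of three-address code, the sketch below translates an assignment into instructions with at most one operator each. The tuple-based expression tree and the temporary names t1, t2, ... are assumptions made for this example.

```python
import itertools


def gen_tac(node, code, temps):
    """Emit three-address instructions for `node`; return the name holding its value."""
    if isinstance(node, str):
        return node  # a plain variable name needs no instruction
    op, left, right = node
    left_name = gen_tac(left, code, temps)
    right_name = gen_tac(right, code, temps)
    temp = f"t{next(temps)}"  # fresh temporary for this sub-expression
    code.append(f"{temp} = {left_name} {op} {right_name}")
    return temp


def translate_assignment(target, expr):
    """Translate `target = expr` into a list of three-address instructions."""
    code = []
    temps = itertools.count(1)
    result = gen_tac(expr, code, temps)
    code.append(f"{target} = {result}")
    return code


# a = b * c + d
for line in translate_assignment("a", ("+", ("*", "b", "c"), "d")):
    print(line)
# prints:
# t1 = b * c
# t2 = t1 + d
# a = t2
```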
Code Optimization
In the optimization phase, the compiler performs various transformations in order to improve
the intermediate code. These transformations will result in faster-running machine code.
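One common transformation of this kind is constant folding, where sub-expressions made only of constants are evaluated at compile time. The sketch below is a simplified illustration on an assumed tuple-based expression tree; real optimizers work on richer intermediate forms.

```python
import operator

# Operators supported by this sketch.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold_constants(node):
    """Replace constant sub-expressions of the form (op, int, int) by their value."""
    if not isinstance(node, tuple):
        return node  # a variable name or a constant is already as simple as possible
    op, left, right = node
    left, right = fold_constants(left), fold_constants(right)
    if isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)  # both operands known: compute now
    return (op, left, right)


# 2 * 3 + x  becomes  6 + x
print(fold_constants(("+", ("*", 2, 3), "x")))
```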
Code Generation
The final phase in the compilation process is the generation of target code. This process
involves selecting memory locations for each variable used by the program. Then, each
intermediate instruction is translated into a sequence of machine instructions that performs the
same task.
(c)
Compilation phase | Software tools
Lexical analyzer | Flex, ANTLR, DFASTAR, Ragel, re2c, PLY (Python Lex-Yacc)
Syntax analyzer | Berkeley Yacc (LALR parser generator), Bison, MKS Yacc, PLY; these are normally paired with Lex or Flex as the lexical analyser and take a BNF-style grammar as input
[Q2]
(a)
c. Production rules are:
S → while <NT1> <NT3>
<NT1> → ( <NT2> )
<NT2> → id(x) [<] c(7)
<NT3> → { <NT4> }
<NT4> → <NT5> ; <NT6> ;
<NT5> → print id(x)
<NT6> → id(x) [=] <NT7>
<NT7> → id(x) [+] c(1)
Terminals: while, (, ), id(x), <, c, {, }, print, ;, =, +
Let’s take
while as w
; as a
( as b
) as d
id(x) as i
any bracketed operator ([<], [=], [+]) as o
c(x) as c
{ as f
} as g
print as p
every <NTx> as N
Grammar for the statement, G = (N, T, S, P)
Non-terminals = {S, N}
T = {w, a, b, d, i, o, c, f, g, p}
Start symbol = S
P = {S → wNN, N → bNd, N → ioc, N → fNg, N → NaNa, N → pi, N → ioN}
S ⇒ wNN
⇒ wbNdN
⇒ wbiocdN
⇒ wbiocdfNg
⇒ wbiocdfNaNag
⇒ wbiocdfpiaNag
⇒ wbiocdfpiaioNag
⇒ wbiocdfpiaioiocag
In general, because N → ioN can be applied repeatedly, S ⇒* wbiocdfpia(io)+cag
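A derivation like this can be checked step by step: each sentential form must follow from the previous one by rewriting exactly one non-terminal with one production. The sketch below assumes the single-letter encoding introduced above and lists the sentential forms without an extra g after f, since the rule N → pi cannot produce one. The helper names are illustrative.

```python
# Productions in the single-letter encoding.
PRODUCTIONS = {
    "S": ["wNN"],
    "N": ["bNd", "ioc", "fNg", "NaNa", "pi", "ioN"],
}


def is_single_step(before, after):
    """True if `after` results from rewriting one non-terminal of `before`."""
    for i, symbol in enumerate(before):
        for rhs in PRODUCTIONS.get(symbol, []):
            if before[:i] + rhs + before[i + 1:] == after:
                return True
    return False


def check_derivation(forms):
    """True if every consecutive pair of sentential forms is a valid step."""
    return all(is_single_step(a, b) for a, b in zip(forms, forms[1:]))


DERIVATION = [
    "S", "wNN", "wbNdN", "wbiocdN", "wbiocdfNg", "wbiocdfNaNag",
    "wbiocdfpiaNag", "wbiocdfpiaioNag", "wbiocdfpiaioiocag",
]
print(check_derivation(DERIVATION))  # prints True
```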
[Q3]
a)
Wireless Home Automation using IoT
b)
Grammar
Terminals – (on, off, L1, L2, L3)
Non Terminals – (START, SWITCH, L)
Production rules
START → SWITCH
SWITCH → L on | L off
L → L1 | L2 | L3
Start symbol – START
String L1 on
START ⇒ SWITCH
⇒ L on
⇒ L1 on
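The derivation above can also be run in the other direction, as a small recursive-descent recognizer with one function per non-terminal. This is a sketch under assumptions: the terminals are taken to be L1, L2, L3, on and off, commands are split on whitespace, and the function names are made up for the example.

```python
LAMPS = {"L1", "L2", "L3"}   # assumed lamp terminals
STATES = {"on", "off"}       # assumed switch-state terminals


def parse_lamp(tokens, pos):
    # L -> L1 | L2 | L3
    if pos < len(tokens) and tokens[pos] in LAMPS:
        return pos + 1
    raise SyntaxError("expected a lamp name")


def parse_switch(tokens, pos):
    # SWITCH -> L on | L off
    pos = parse_lamp(tokens, pos)
    if pos < len(tokens) and tokens[pos] in STATES:
        return pos + 1
    raise SyntaxError("expected 'on' or 'off'")


def accepts(command):
    """START -> SWITCH: accept only if the whole command is consumed."""
    tokens = command.split()
    try:
        return parse_switch(tokens, 0) == len(tokens)
    except SyntaxError:
        return False


print(accepts("L1 on"))   # prints True
print(accepts("on L1"))   # prints False
```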
c)
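DFA
[Figure: DFA state diagram; recoverable edge labels: L1, on]
NDFA
[Figure: NDFA state diagram; recoverable edge labels: L2, on, L]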
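A DFA for commands such as "L1 on" can be simulated with a transition table. The table below is a hypothetical reconstruction, not the author's diagram: the state names q0, q1, q2 and the exact set of transitions are assumptions.

```python
# Hypothetical transition table: (state, input symbol) -> next state.
TRANSITIONS = {
    ("q0", "L1"): "q1", ("q0", "L2"): "q1", ("q0", "L3"): "q1",
    ("q1", "on"): "q2", ("q1", "off"): "q2",
}
START_STATE = "q0"
ACCEPTING = {"q2"}


def dfa_accepts(symbols):
    """Run the DFA over a list of input symbols."""
    state = START_STATE
    for symbol in symbols:
        state = TRANSITIONS.get((state, symbol))
        if state is None:
            return False  # no transition defined: reject
    return state in ACCEPTING


print(dfa_accepts(["L1", "on"]))  # prints True
print(dfa_accepts(["L1"]))        # prints False
```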
[Q4]
Lexical Analysis
Lex file - home_automation.l
/* Lexical Analysis */
%{
#include <stdlib.h>
#include "y.tab.h" /* token codes generated by yacc -d */
void yyerror(char *);
%}
%%
lamp1 { return(lamp1); } /* lamp names */
lamp2 { return(lamp2); }
on { return(on); } /* switch states */
off { return(off); }
[ \t\n] ; /* skip whitespace */
. yyerror("Unknown character");
%%
int yywrap(void) {
    return 1;
}
Syntax Analysis
yacc file - home_automation.y
/* Syntax Analysis */
%{
#include <stdio.h> /* For I/O */
int errors; /* Error Count */
void yyerror(char *);
int yylex(void);
%}
/* Terminals */
%token lamp1
%token lamp2
%token on
%token off
%%
START:
    SWITCH
    ;
SWITCH:
    LAMP on { printf("switch on\n"); }
    | LAMP off { printf("switch off\n"); }
    ;
LAMP:
    lamp1 { printf("obtained lamp1 status\n"); }
    | lamp2 { printf("obtained lamp2 status\n"); }
    ;
%%
void yyerror(char *s) {
    errors++; /* count every reported error */
    fprintf(stderr, "%s\n", s);
}
int main(void) {
    yyparse();
    return 0;
}