BCSE307P - Compiler Lab Manual

SCHOOL OF COMPUTER SCIENCE AND ENGINEERING
BCSE307P – COMPILER DESIGN LABORATORY
REGISTER NUMBER : 21BCE1294
NAME OF THE STUDENT : VISHAL BHOOMA KANNAN

Table of Contents
Expt. No   Date         Name of the Experiment
1          13.12.2022   Study of Phases of Compiler
2          20.12.2022   Implementation of Token Separation (Lexical Analyzer)
3          12.01.2023   Study of LEX and YACC Tool
4          31.01.2023   Implementation of Lexical Analyzer using Lex Tool
5          07.02.2023   Implementation using YACC Tool
6          14.02.2023   Implementation of Simple Calculator using Lex and YACC
7          28.02.2023   Implementation of Abstract Syntax Tree - Infix to Postfix
8          07.03.2023   Implementation of Backend
9          21.03.2023   Implementation of Code Optimization (Constant Folding)
10         28.03.2023   Study of LLVM

Experiment 1
Aim: Study of Phases of Compiler
COMPILER
A compiler is a program that translates source code written in a high-level programming
language into machine code that can be executed by a computer.
The process of translating the source code into machine code involves several distinct
phases, each of which performs a specific task. Here are the main phases of a compiler:
1. Lexical Analysis:
The first phase of a compiler is lexical analysis, which is also known as scanning.
The lexical analyser, often generated by a tool like LEX, reads the source code character by
character and groups the characters into meaningful units called tokens.
Tokens are the smallest units of syntax in a programming language and can be keywords,
identifiers, operators, or constants.
2. Syntax Analysis:
The second phase of a compiler is syntax analysis, also known as parsing.
The parser, often generated by a tool like YACC, takes the tokens generated by the lexical
analyser and groups them into a hierarchical structure called a parse tree.
The parse tree represents the syntactic structure of the source code and can be used to
check for syntax errors.
3. Semantic Analysis:
The third phase of a compiler is semantic analysis.
The semantic analyser checks the source code for semantic errors, such as type mismatches
or undeclared variables. It also assigns meanings to the symbols used in the source code,
such as variables, functions, and classes.
4. Intermediate Code Generation:

The fourth phase of a compiler is intermediate code generation.
The compiler generates an intermediate representation of the source code that is more
abstract than the source code itself. The intermediate code can be optimized for efficiency
and translated into different target languages.
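As a rough illustration of intermediate code, the sketch below turns a postfix expression into three-address code. The function name emit_tac, the restriction to single-letter operands and the choice of postfix input are assumptions made to keep the example short; a real compiler would normally generate this code while walking the parse tree.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAX_STACK 64

/* Translate a postfix expression with single-letter operands into
 * three-address code, writing one line per operator into `out`.
 * Returns the number of temporaries created, or -1 on malformed input.
 * (emit_tac is a hypothetical helper, not part of any library.) */
int emit_tac(const char *postfix, char *out, size_t cap)
{
    char stack[MAX_STACK][16]; /* operand names such as "a" or "t1" */
    int top = 0, temps = 0;
    size_t used = 0;

    if (cap == 0)
        return -1;
    out[0] = '\0';
    for (const char *p = postfix; *p != '\0'; p++) {
        if (isspace((unsigned char)*p))
            continue;
        if (isalpha((unsigned char)*p)) {
            /* operand: push its name */
            if (top == MAX_STACK)
                return -1;
            snprintf(stack[top++], sizeof stack[0], "%c", *p);
        } else if (strchr("+-*/", *p) != NULL) {
            /* operator: pop two operands, emit "tN = lhs op rhs" */
            char rhs[16], lhs[16];
            if (top < 2)
                return -1;
            strcpy(rhs, stack[--top]);
            strcpy(lhs, stack[--top]);
            temps++;
            int n = snprintf(out + used, cap - used, "t%d = %s %c %s\n",
                             temps, lhs, *p, rhs);
            if (n < 0 || (size_t)n >= cap - used)
                return -1;
            used += (size_t)n;
            /* the new temporary becomes an operand for later operators */
            snprintf(stack[top++], sizeof stack[0], "t%d", temps);
        } else {
            return -1;
        }
    }
    return top == 1 ? temps : -1;
}
```

Calling emit_tac("abc*+", buf, sizeof buf) on a sufficiently large buffer fills it with the two lines "t1 = b * c" and "t2 = a + t1".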
5. Code Optimization:
The fifth phase of a compiler is code optimization.
The compiler analyses the intermediate code and applies transformations that improve the
efficiency of the generated code. This can include eliminating redundant operations,
reordering instructions, and simplifying expressions.
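One of the transformations named above, simplifying expressions, can be sketched on a single three-address instruction. The struct layout and the function name simplify are assumptions for illustration only; the sketch covers just the identities x+0, 0+x, x-0, x*1, 1*x and x/1.

```c
#include <string.h>

/* One three-address instruction: result = arg1 op arg2.
 * op is '+', '-', '*', '/' or '=' for a plain copy (arg2 unused). */
struct instr {
    char result[8];
    char arg1[8];
    char op;
    char arg2[8];
};

/* Rewrite an algebraic identity into a plain copy.
 * Returns 1 if the instruction was simplified, 0 if it was left alone. */
int simplify(struct instr *in)
{
    const char *keep = NULL; /* the operand that survives */

    if (in->op == '+' && strcmp(in->arg2, "0") == 0)
        keep = in->arg1;
    else if (in->op == '+' && strcmp(in->arg1, "0") == 0)
        keep = in->arg2;
    else if (in->op == '-' && strcmp(in->arg2, "0") == 0)
        keep = in->arg1;
    else if (in->op == '*' && strcmp(in->arg2, "1") == 0)
        keep = in->arg1;
    else if (in->op == '*' && strcmp(in->arg1, "1") == 0)
        keep = in->arg2;
    else if (in->op == '/' && strcmp(in->arg2, "1") == 0)
        keep = in->arg1;

    if (keep == NULL)
        return 0;
    if (keep != in->arg1)
        strcpy(in->arg1, keep); /* arg1 and arg2 are separate arrays */
    in->op = '=';
    in->arg2[0] = '\0';
    return 1;
}
```

For example, the instruction t1 = x * 1 becomes the copy t1 = x, while t3 = x - y is returned unchanged.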
6. Code Generation:
The final phase of a compiler is code generation.
The compiler generates the target code, which is the actual machine code that can be
executed by a computer. The target code can be optimized for the specific platform or
architecture for which it is intended.
“The phases of a compiler work together to translate high-level programming languages
into efficient machine code that can be executed by a computer. Each phase performs a
specific task and contributes to the overall performance and correctness of the generated
code.”
Experiment 2
Aim: Implementation of Token Separation (Lexical Analyzer)
Algorithm:
For every space-separated string (token)
    For each character in token
        If isOperator(character)
            StoreOperator(character)
    If isKeyword(token)
        StoreKeyword(token)
    Else if isConstant(token)
        StoreConstant(token)
    Else
        StoreIdentifier(token)
Source Code:
// identifiers, constants, operators, keywords
#include <stdio.h>
#include <string.h>
int isOperator(char c)
{
    if (c == '+' || c == '-' || c == '/' || c == '*' || c == '=' || c == ';' ||
        c == '(' || c == ')' || c == '{' || c == '}')
    {
        return 1;
    }
    return 0;
}
int isKeyword(char *buffer, int buff_count)
{
    if (buff_count == 0)
    {
        return 0;
    }
    const char *keywords[] = {"print", "int", "void", "main"};
    int n = sizeof(keywords) / sizeof(keywords[0]);
    for (int i = 0; i < n; i++)
    {
        // compare whole strings so that a prefix such as "in" is not taken for "int"
        if (!strcmp(buffer, keywords[i]))
        {
            return 1;
        }
    }
    return 0;
}
int isConstant(char *buffer, int count)
{
    if (count == 0)
    {
        return 0;
    }
    for (int i = 0; i < count; i++)
    {
        if (!(buffer[i] >= '0' && buffer[i] <= '9'))
        {
            return 0;
        }
    }
    return 1;
}
int main()
{
    char keywordList[20][255];
    int keyword_count = 0;
    char identList[20][255];
    int ident_count = 0;
    char constList[20][255];
    int const_count = 0;
    char symbols[20];
    int symb_count = 0;
    FILE *ptr = fopen("test.txt", "r");
    if (ptr == NULL)
    {
        printf("File Not Found\n");
        return 1;
    }
    int c;
    char buffer[255];
    int buff_count = 0;
    do
    {
        c = fgetc(ptr);
        // ordinary characters extend the current token
        if (c != EOF && c != ' ' && c != '\t' && c != '\n' && !isOperator((char)c))
        {
            if (buff_count < 254)
            {
                buffer[buff_count++] = (char)c;
            }
            continue;
        }
        // an operator is stored as a symbol and also ends the current token
        if (c != EOF && isOperator((char)c) && symb_count < 20)
        {
            symbols[symb_count++] = (char)c;
        }
        buffer[buff_count] = '\0';
        if (buff_count == 0)
        {
            continue;
        }
        if (isKeyword(buffer, buff_count))
        {
            if (keyword_count < 20)
                strcpy(keywordList[keyword_count++], buffer);
        }
        else if (isConstant(buffer, buff_count))
        {
            if (const_count < 20)
                strcpy(constList[const_count++], buffer);
        }
        else
        {
            if (ident_count < 20)
                strcpy(identList[ident_count++], buffer);
        }
        buff_count = 0;
    } while (c != EOF);
    printf("\nKeyword list\n");
    for (int i = 0; i < keyword_count; i++)
    {
        printf("%s\n", keywordList[i]);
    }
    printf("\n\nIdentifier list\n");
    for (int i = 0; i < ident_count; i++)
    {
        printf("%s\n", identList[i]);
    }
    printf("\n\nConstant list\n");
    for (int i = 0; i < const_count; i++)
    {
        printf("%s\n", constList[i]);
    }
    printf("\n\nSymbol list\n");
    for (int i = 0; i < symb_count; i++)
    {
        printf("%c\n", symbols[i]);
    }
    fclose(ptr);
    return 0;
}
Input Text:
int a;
int b;
a = 5;
b = 10;
int c = a + b;
print(c);
Output:
Experiment 3
Aim: Study of Lex and Yacc tools
“LEX and YACC are tools that are commonly used in the construction of compilers. LEX is
a lexical analyser generator, while YACC is a parser generator. Together, they generate
the scanner and parser that form the front end of a compiler for a programming language.”
LEX:
LEX is used to generate a lexical analyser, which is responsible for breaking up the input
source code into individual tokens.
The lexical analyser
● reads the input character by character.
● recognizes sequences of characters that match the regular expressions given in the LEX
specification.
LEX generates a C program that performs this analysis on the input source code.
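To give a feel for what such a generated program does, here is a small hand-written recogniser for the regular expression [+-]?[0-9]+. It is only a sketch: a real LEX-generated scanner is table-driven and applies longest-match rules across all patterns, and the names step and match_integer are made up for this example.

```c
/* States of a finite automaton for the regular expression [+-]?[0-9]+ */
enum state { START, SIGN, DIGITS, REJECT };

/* Transition function: next state for the current state and input character. */
static enum state step(enum state s, char c)
{
    int is_digit = (c >= '0' && c <= '9');

    switch (s) {
    case START:
        if (is_digit)
            return DIGITS;
        return (c == '+' || c == '-') ? SIGN : REJECT;
    case SIGN:
        return is_digit ? DIGITS : REJECT;
    case DIGITS:
        return is_digit ? DIGITS : REJECT;
    default:
        return REJECT; /* once rejected, always rejected */
    }
}

/* Returns 1 if the whole string matches [+-]?[0-9]+, otherwise 0. */
int match_integer(const char *s)
{
    enum state st = START;

    for (; *s != '\0'; s++)
        st = step(st, *s);
    return st == DIGITS; /* DIGITS is the only accepting state */
}
```

With this sketch, match_integer("-42") returns 1, while match_integer("+") and match_integer("4a") return 0.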
YACC
YACC, on the other hand, is used to generate a parser.
A parser is responsible for analysing the sequence of tokens generated by the lexical
analyzer and checking whether it conforms to the grammar of the programming language
being compiled.
YACC generates a C program that implements a parsing algorithm that is based on the
grammar of the programming language.
LEX and YACC
The steps for using LEX and YACC to construct a compiler are:
1. Defining the lexical structure of the programming language using regular
expressions. This involves identifying the various types of tokens that can appear in
the source code of the programming language and defining regular expressions to
match those tokens.
2. Defining the grammar of the programming language. This involves specifying the
rules that govern the syntax of the programming language.
3. Writing the LEX input file that defines the regular expressions for the various types
of tokens in the programming language.
4. Writing the YACC input file that defines the grammar of the programming language
and specifies the actions that should be taken when a valid sequence of tokens is
recognized.
5. Running LEX and YACC on their respective input files to generate the C programs
that will form the basis of the compiler.
6. Writing additional code to handle semantic analysis, code generation, and other
tasks that are necessary to transform the input source code into executable code.
LEX VS YACC -
“In summary, LEX is used to generate a lexical analyzer, while YACC is used to generate a
parser, and together they form the basis of a compiler for a programming language.”
The resulting compiler can then be used to compile programs written in the
programming language for which it was designed.
SUMMARY:-
LEX and YACC are powerful tools that can be used to generate the lexical analyzer and
parser components of a compiler.
They allow for the construction of compilers that can handle complex programming
languages and automate much of the process of transforming source code into executable
code.
Experiment 4
Aim: Implementation of Lexical analyser using Lex Tool
a.) Program to recognize Integer, Real and Exponential
%{
#include <stdio.h>
%}
sign [+-]?
digit [0-9]+
exp ([eE]{sign}{digit})
%%
\+?[0-9]+ {printf("Positive number");}
\-[0-9]+ {printf("Negative number");}
{sign}({digit}\.{digit}?|\.{digit}) {printf("Fractional value");}
{sign}{digit}(\.{digit}?)?{exp} {printf("Exponential value");}
%%
int yywrap(void){
return 1;
}
int main(){
yylex();
return 0;
}
b.) C program to implement lexical analyser using lex tool for a simple statement
%{
#include <stdio.h>
#include <stdlib.h>
int count = 1;
%}
%%
.*\n {printf("line %d: %s", count, yytext); count++;}

%%
int yywrap(void){
return 1;
}
int main(){
yyin=fopen("sample.c","r");
yylex();
return 0;
}
c.) Design a lexical analyser for C and C++ programs
%{
#include <stdio.h>
int l_count = 1;
%}
sign [+-]?
digit [0-9\.]+
alpha [a-zA-Z_]+
op [\+\=\-\/\<\>\*\%\^]
identifier {alpha}+[a-zA-Z_0-9.]*
keyword (include|void|main|int|float|for|define|scanf|if|printf|else|pow)
%%
[ \t]
\{ {printf("%s \t: block start\n", yytext);}
\} {printf("%s \t: block end\n", yytext);}
^# {printf("%s \t: packaging delimiter \n", yytext);}
{keyword} {printf("%s \t: keyword\n", yytext);}
{op} {printf("%s \t: operator\n", yytext);}
\"[^"\n]*\" {printf("%s \t: string\n", yytext);}
[,;&]
{identifier} {printf("%s \t: identifier\n", yytext);}
{digit} {printf("%s \t: constant\n", yytext);}
[\n] {printf("End of line %d \n\n", l_count); l_count++; }
%%
int yywrap(void){
return 1;
}
int main(){
yyin=fopen("sample.c","r");
yylex();
printf("Line count : %d\n", l_count - 1);
return 0;
}
Experiment 5: Recognizing strings using grammar in yacc programming.
Q1. Recognizing a^n b^n (n ≥ 1)
Algorithm:
The grammar
S -> A Q B N
Q-> A Q B | ε
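As a cross-check on the grammar above, the same language can be recognised by simply counting characters. This is a minimal sketch that does not use a parser at all; the function name is_anbn is an assumption, and upper-case letters are accepted so that the result can be compared against the Lex rules [aA] and [bB] of this experiment.

```c
/* Returns 1 if s has the form a^n b^n for some n >= 1, otherwise 0.
 * Both lower-case and upper-case letters are accepted. */
int is_anbn(const char *s)
{
    int as = 0, bs = 0;

    /* count the leading run of a's */
    while (*s == 'a' || *s == 'A') {
        as++;
        s++;
    }
    /* count the run of b's that follows */
    while (*s == 'b' || *s == 'B') {
        bs++;
        s++;
    }
    /* nothing may follow, and both runs must be non-empty and equal */
    return *s == '\0' && as >= 1 && as == bs;
}
```

Here is_anbn("aabb") and is_anbn("ab") return 1, while is_anbn("aab"), is_anbn("abab") and is_anbn("") return 0.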
Yacc code:
%{
#include <stdio.h>
#include <stdlib.h>
#include "lex.yy.c"
int yywrap();
int yylex();
int yyerror(char *msg);
%}
%token A B N
%%
S: A Q B N { printf("valid string\n");
exit(0); }
;
Q: A Q B |
;
%%
int yyerror(char *msg)

{
printf("invalid string\n");
exit(0);
}
int main()
{
printf("enter the string\n");
yyparse();
}
Lex code:
%{
#include "y.tab.h"
%}
%%
[aA] {return A;}
[bB] {return B;}
\n {return N;}
. {return yytext[0];}
%%
int yywrap()
{
return 1;
}
Output:
Q2. Recognizing a^n b (n ≥ 1)

Algorithm:
The grammar:
S -> A Q B N
Q -> A Q | ε
Yacc code:
%{
#include <stdio.h>
#include <stdlib.h>
#include "lex.yy.c"
int yywrap();
int yylex();
int yyerror(char *msg);
%}
%token A B N
%%
S: A Q B N { printf("valid string\n");
exit(0); }
;
Q: A Q |
;
%%

int yyerror(char *msg)
{
printf("invalid string\n");
exit(0);
}
int main()
{
yyparse();
}
Lex code:
%{
#include "y.tab.h"
%}
%%
[aA] {return A;}
[bB] {return B;}
\n {return N;}
. {return yytext[0];}
%%
int yywrap()
{
return 1;
}
Output:
Q3. Recognizing a b^n (n ≥ 0)

Algorithm:
Grammar:
S -> A Q N
Q -> Q B | ε
Yacc code:
%{
#include <stdio.h>
#include <stdlib.h>
#include "lex.yy.c"
int yywrap();
int yylex();
int yyerror(char *msg);
%}
%token A B N
%%
S: A Q N { printf("valid string\n");
exit(0); }
;
Q: Q B |
;
%%

int yyerror(char *msg)
{
printf("invalid string\n");
exit(0);
}
int main()
{
yyparse();
}
Lex Code:
%{
#include "y.tab.h"
%}
%%
[aA] {return A;}
[bB] {return B;}
\n {return N;}
. {return yytext[0];}
%%
int yywrap()
{
return 1;
}
Output:
Q4. Recognizing (ab)^n (n ≥ 1)

Algorithm:
Grammar:
S -> A B Q N
Q -> A B Q | ε
Yacc code:
%{
#include <stdio.h>
#include <stdlib.h>
#include "lex.yy.c"
int yywrap();
int yylex();
int yyerror(char *msg);
%}
%token A B N
%%
S: A B Q N { printf("valid string\n");
exit(0); }
;
Q: A B Q |
;
%%

int yyerror(char *msg)
{
printf("invalid string\n");
exit(0);
}
int main()
{
yyparse();
}
Lex Code:
%{
#include "y.tab.h"
%}
%%
[aA] {return A;}
[bB] {return B;}
\n {return N;}
. {return yytext[0];}
%%
int yywrap()
{
return 1;
}
Output:
Experiment 6: Implementation of Simple calculator using Lex and Yacc
Algorithm:
S -> E
E -> E + E {E.val = E1.val + E2.val}
E -> E - E {E.val = E1.val - E2.val}
E -> E * E {E.val = E1.val * E2.val}
E -> E / E {E.val = E1.val / E2.val}
E -> E % E {E.val = E1.val % E2.val}
Yacc code:
%{
#include <stdio.h>
#include <stdlib.h>
#include "lex.yy.c"
int yywrap();
int yylex();
int yyerror(char *msg);
int flag = 0;
%}
%token N
%left '+' '-'
%left '*' '/' '%'
%%
S: E { printf("Answer is : %d\n", $1); return 0; }
;
E: E'+'E {$$=$1+$3;}
| E'-'E {$$=$1-$3;}
| E'*'E {$$=$1*$3;}
| E'/'E {$$=$1/$3;}
| E'%'E {$$=$1%$3;}
| '(' E ')' {$$=$2;}
| N {$$=$1;}
;
%%

int yyerror(char *msg)
{
printf("invalid arithmetic expression\n");
flag = 1;
exit(0);
}
int main()
{
printf("enter the arithmetic operation\n");
yyparse();
if(flag==0){
printf("Valid arithmetic operation\n");
}
}
Lex code:
%{
#include "y.tab.h"
#include <stdio.h>
#include <stdlib.h>
%}
%%
[0-9]+ {yylval=atoi(yytext); return N;}
[ \t] ;
[\n] {return 0;}
. {return yytext[0];}
%%
int yywrap()
{
return 1;
}
Output:
Experiment 7: Converting Infix to Postfix using yacc and lex.
Yacc code:
%{
#include <stdio.h>
#include <stdlib.h>
#include "lex.yy.c"
int yywrap();
int yylex();
int yyerror(char *msg);
int flag = 0;
char st[100];
int top=0;
void A1()
{
st[top++]=yytext[0];
}
void A2()
{
printf("%c ", st[--top]);
}
void A3()
{
/* print the whole operand, not only its first character */
printf("%s ", yytext);
}
%}
%token ID
%left '+' '-' '*' '/' '%' '(' ')' UMINUS
%%
S: E
;
E: E'+'{A1();}T{A2();}
| E'-'{A1();}T{A2();}
| T
;
T: T'*'{A1();}F{A2();}
| T'/'{A1();}F{A2();}
| F
;
F: '('E')'
| '-'{A1();}F{A2();}
| ID{A3();}
;
%%

int yyerror(char *msg)
{
printf("invalid arithmetic expression\n");
flag = 1;
exit(0);
}
//driver code
int main()
{
printf("enter the arithmetic operation\n");
yyparse();
printf("\n");
}
Lex code:
%{
#include "y.tab.h"
#include <stdio.h>
#include <stdlib.h>
%}
alpha [A-Za-z]
digit [0-9]
%%
{alpha}+({alpha}|{digit})* {return ID;}
{digit}+ {yylval=atoi(yytext); return ID;}
[ \t] ;
[\n] {return 0;}
. {return yytext[0];}
%%
int yywrap()
{
return 1;
}
Output:
Experiment 8: Implementation of Backend, converting infix expression to quadruple and machine instructions.
Source Code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#define MAX_EXPR_SIZE 100

struct node
{
    int info;
    struct node *ptr;
} *top, *temp;
void push(char data)
{
    if (top == NULL)
    {
        top = (struct node *)malloc(1 * sizeof(struct node));
        top->ptr = NULL;
        top->info = data;
    }
    else
    {
        temp = (struct node *)malloc(1 * sizeof(struct node));
        temp->ptr = top;
        temp->info = data;
        top = temp;
    }
}
char pop()
{
    temp = top;
    if (temp == NULL)
        return -1;
    else
        temp = temp->ptr;
    int popped = top->info;
    free(top);
    top = temp;
    return popped;
}
char peek()
{
    return top->info;
}
int isEmpty()
{
    if (top == NULL)
        return 1;
    return 0;
}
int precedence(char operator)
{
    switch (operator)
    {
    case '+':
    case '-':
        return 1;
    case '*':
    case '/':
        return 2;
    case '^':
        return 3;
    default:
        return -1;
    }
}
int isOperator(char ch)
{
    return (ch == '+' || ch == '-' || ch == '*' || ch == '/' || ch == '^');
}
// returns the postfix string, or NULL if the parentheses do not match
char *infixToPostfix(char *infix)
{
    int i, j;
    int len = strlen(infix);
    char *postfix = (char *)malloc(sizeof(char) * (len + 2));
    for (i = 0, j = 0; i < len; i++)
    {
        if (infix[i] == ' ' || infix[i] == '\t')
            continue;
        if (isalnum((unsigned char)infix[i]))
        {
            postfix[j++] = infix[i];
        }
        else if (infix[i] == '(')
        {
            push(infix[i]);
        }
        else if (infix[i] == ')')
        {
            while (!isEmpty() && peek() != '(')
                postfix[j++] = pop();
            // an empty stack here means there was no matching '('
            if (isEmpty())
            {
                free(postfix);
                return NULL;
            }
            pop();
        }
        else if (isOperator(infix[i]))
        {
            // pop operators of higher or equal precedence, then push this one
            while (!isEmpty() && precedence(peek()) >= precedence(infix[i]))
                postfix[j++] = pop();
            push(infix[i]);
        }
    }
    while (!isEmpty())
    {
        if (peek() == '(')
        {
            free(postfix);
            return NULL;
        }
        postfix[j++] = pop();
    }
    postfix[j] = '\0';
    return postfix;
}
void printInstruction(char oper)
{
    if (oper == '+')
    {
        printf("ADD ");
    }
    else if (oper == '-')
    {
        printf("SUB ");
    }
    else if (oper == '*')
    {
        printf("MUL ");
    }
    else if (oper == '/')
    {
        printf("DIV ");
    }
    else if (oper == '^')
    {
        printf("EXP ");
    }
}
int isAlpha(char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}
void printVal(char c)
{
    if (isAlpha(c))
    {
        printf("%c ", c);
    }
    else
    {
        printf("t%d ", c + 1);
    }
}
int main()
{
    char infix[MAX_EXPR_SIZE];
    printf("Enter an infix expression: ");
    if (fgets(infix, MAX_EXPR_SIZE, stdin) == NULL)
        return 1;
    char *postfix = infixToPostfix(infix);
    if (postfix == NULL)
    {
        printf("Invalid Expression\n");
        return 1;
    }
    printf("Postfix expression : %s\n", postfix);
    int n = strlen(postfix);
    int count = 0;
    char t1, t2;
    char quad[50][4];
    top = NULL;
    // operands are expected to be single letters; a temporary is stored as its index
    for (int i = 0; i < n; i++)
    {
        if (isalnum((unsigned char)postfix[i]))
        {
            push(postfix[i]);
        }
        else
        {
            t1 = pop();
            t2 = pop();
            if (t1 == (char)-1 || t2 == (char)-1)
            {
                printf("Invalid Expression\n");
                return 1;
            }
            quad[count][0] = (count);
            quad[count][1] = postfix[i];
            quad[count][2] = t2;
            quad[count][3] = t1;
            push(count);
            count++;
        }
    }
    printf("Quadruple table :\n");
    for (int i = 0; i < count; i++)
    {
        printf("%d) ", i);
        printf("%c ", quad[i][1]);
        printVal(quad[i][2]);
        printVal(quad[i][3]);
        printf("t%d\n", i + 1);
    }
    int reg_count = 0;
    int reg_of[50]; // register that holds the result of each quadruple
    for (int i = 0; i < count; i++)
    {
        int r2, r3;
        if (isAlpha(quad[i][2]))
        {
            printf("MOV R%d, %c\n", reg_count, quad[i][2]);
            r2 = reg_count++;
        }
        else
        {
            r2 = reg_of[(int)quad[i][2]];
        }
        if (isAlpha(quad[i][3]))
        {
            printf("MOV R%d, %c\n", reg_count, quad[i][3]);
            r3 = reg_count++;
        }
        else
        {
            r3 = reg_of[(int)quad[i][3]];
        }
        // looking registers up per operand keeps the order right for SUB and DIV
        printInstruction(quad[i][1]);
        printf("R%d, R%d, R%d\n", reg_count, r2, r3);
        reg_of[i] = reg_count++;
    }
    free(postfix);
    return 0;
}
Output:
Experiment 9: Implementation of code optimization (constant folding)

Source Code:
#include <cctype>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
using namespace std;
int calc(int n1, int n2, string op)

{
if (op == "+")
{
return n1 + n2;
}
else if (op == "-")
{
return n1 - n2;
}
else if (op == "*")
{
return n1 * n2;
}
else if (op == "/")
{
return n1 / n2;
}
else
{
return -1;
}
}
int main()
{
vector<vector<string>> vs;
int u;
cin >> u;
for (int i = 0; i < u + 1; i++)
{
string S, T;
getline(cin, S);
stringstream X(S);
vector<string> v1;
while (getline(X, T, ' '))
v1.push_back(T);
vs.push_back(v1);
}
vs.erase(vs.begin());
vector<int> buff(u, -1);
int i = 0;
for (auto &v : vs)
{
if (isdigit(v[1][0]))
{
int n = stoi(v[1]);
if (buff[n] != -1)
v[1] = "#" + to_string(buff[n]);
}
if (isdigit(v[2][0]))
{
int n = stoi(v[2]);
if (buff[n] != -1)
v[2] = "#" + to_string(buff[n]);
}
if ((v[1].rfind("#", 0) == 0) && (v[2].rfind("#", 0) == 0))
{
v[1].erase(v[1].begin());
v[2].erase(v[2].begin());
buff[i] = calc(stoi(v[1]), stoi(v[2]), v[3]);
cout << buff[i] << endl;
}
i++;
}
vector<vector<string>> cp;
for (int i = 0; i < u; i++)
{
if (buff[i] == -1)
{
cp.push_back(vs[i]);
}
}
for (auto v : cp)
{
for (auto v1 : v)
cout << v1 << " ";
cout << "\n";
}
return 0;
}
Output:
Experiment 10: Study of LLVM

LLVM:
What is LLVM?
The name LLVM originally stood for Low Level Virtual Machine, although the project no longer
treats it as an acronym. It refers to the LLVM project, which is a collection of modular and
reusable compiler and toolchain technologies.
LLVM is a compiler infrastructure designed to optimise code and generate high-performance
machine code. It is an open-source project that is used in a variety of software, including
the Clang C/C++ compiler and the Swift programming language.
We will explore the role of LLVM in modern compiler design, the architecture of LLVM, the
LLVM intermediate representation (IR), and the benefits of using LLVM in compiler design.
Role of LLVM in Modern Compiler Design

LLVM has become an important tool for modern compiler design due to its powerful
optimization features and its ability to generate code for multiple architectures. LLVM
provides a flexible and modular framework that allows developers to easily add new
optimizations or target new architectures.
LLVM is also designed to work with other tools in the compiler toolchain, such as the Clang
C/C++ compiler or the GCC compiler, allowing for seamless integration of different
components in the compilation process.
The Architecture of LLVM

The architecture of LLVM is composed of several components:
● the front-end
● the optimizer
● the back end
FRONT END:
The front-end is responsible for parsing the source code and generating an abstract syntax
tree (AST). The AST is then transformed into LLVM IR by the front-end. LLVM IR is a
low-level intermediate representation that is designed to be portable and
architecture-independent.
THE OPTIMIZER:
The optimizer is the core component of LLVM and is responsible for performing a wide range
of optimizations on the LLVM IR. These optimizations include dead code elimination,
constant propagation, loop optimization, and many others. LLVM uses a modular
architecture that allows developers to easily add new optimizations or modify existing ones.
THE BACK END:

The back end is responsible for generating machine code for a specific target architecture.
LLVM supports a wide range of architectures, including x86, ARM, MIPS, and many others.
The back end uses the LLVM IR as input and generates machine code for the target
architecture. LLVM supports several code generation strategies, including just-in-time (JIT)
compilation and ahead-of-time (AOT) compilation.
LLVM Intermediate Representation (IR)

LLVM IR is a low-level intermediate representation that is used by LLVM to optimize code
and generate machine code.
● Type-safe, SSA-based representation
● Designed to be portable and architecture-independent
● Similar to assembly language, but with a higher level of abstraction.
LLVM IR is designed to be easy to work with and understand. The syntax of LLVM IR is
simple and concise, and it is easy to generate LLVM IR from other languages. LLVM IR is
also designed to be easy to optimize.
LLVM IR includes a rich set of instructions, and its structure supports a wide range of
optimizations such as:
● loop optimization
● constant propagation
● and many others.
(A simple "Hello, world!" program in the LLVM IR format)

Benefits of Using LLVM in Compiler Design

There are several benefits to using LLVM in compiler design.
1. LLVM provides a powerful set of optimization features that can be used to optimize code
for performance and size. LLVM's modular architecture makes it easy to add new
optimizations or modify existing ones.
2. LLVM is designed to be portable and architecture-independent. This means that
compilers built with LLVM can generate code for multiple architectures without the need for
architecture-specific code.
3. LLVM is open-source and has a large and active community. This means that developers
can easily find help and support when using LLVM.
4. LLVM is designed to work well with other tools in the compiler toolchain, such as Clang
or GCC. This means that compilers built with LLVM can easily integrate with other tools in
the toolchain.
5. LLVM provides a flexible and modular framework that allows developers to easily add
new optimizations or target new architectures.
Conclusion
LLVM is a powerful and flexible compiler infrastructure that is used in a wide range of
software, including compilers for C/C++, Swift, and many others.
LLVM helps build new computer languages and improve existing languages. It automates
many of the difficult and unpleasant tasks involved in language creation, such as porting the
outputted code to multiple platforms and architectures.
