Professional Documents
Culture Documents
Engineering
&Technology
LAB RECORD
III B.TECH – II SEM – R20 REGULATION
Name:
Roll No:
This is to certify that this is the bonafide record of the work done by
in the Compiler Design Laboratory of III Year II Semester of B. Tech Coursein Computer
Science & Information Technology Branch during the Academic Year 2022 – 2023.
INSTITUTE VISION
Producing globally competent and quality technocrats with human values for the holistic needs of
industry and society.
INSTITUTE MISSION
Creating an outstanding infrastructure and platform for enhancement of skills, knowledge and
behavior of students towards employment and higher studies.
Providing a healthy environment for research, development and entrepreneurship, to meet the
expectations of industry and society.
Transforming the graduates tocontribute tothe socio-economic development and welfare of the society
through value-based education.
DEPARTMENT VISION
To excel in computing arena and to produce globally competent Computer Science and Information
Technology graduates with Ethical and Human values to serve the society.
DEPARTMENT MISSION
To impart strong theoretical and practical background in computer science and information
technology discipline with an emphasis on software development.
To provide an open environment to the students and faculty that promotes professional
growth.
To inculcate the skills necessary to continue their education and research for contribution to
society.
Institute of
Engineering
&Technology
PSO 2: To acquire the knowledge on latest tools and technologies to provide technical
solutions.
PEO 3: Graduates will engage in life-long learning and professional development to adapt to
dynamically computing environment.
Institute of
Engineering
&Technology
Approved by AICTE & Permanently Affiliated to JNTUGV, Vizianagaram
(An Autonomous Institution)
PO 12 : Life-Long Learning: Recognize the need for, and have the preparation and ability to engage in
independent and life-long learning in the broadest context of technological change.
INDEX
COMPILER DESIGN LAB MANUAL
Aim: Design a lexical analyzer for given language and the lexical analyzer should ignore redundant
spaces, tabs and new lines.
Algorithm:
FILE *f1,*f2,*f3;
char c,str[10],st1[10];
int num[100],lineno=0,tokenvalue=0,i=0,j=0,k=0; printf("\nEnter the c program");/*gets(st1);*/
f1=fopen("input","w"); while((c=getchar())!=EOF) putc(c,f1);
fclose(f1); f1=fopen("input","r"); f2=fopen("identifier","w"); f3=fopen("specialchar","w");
while((c=getc(f1))!=EOF)
{
if(isdigit(c))
{
tokenvalue=c-'0'; c=getc(f1); while(isdigit(c))
{
tokenvalue*=10+c-'0'; c=getc(f1); }
}
num[i++]=tokenvalue; ungetc(c,f1);
}
else
if(isalpha(c))
{
PRAGNATHA (Asst Professor, LIET) 1|P ag e
putc(c,f2); c=getc(f1);
while(isdigit(c)||isalpha(c)||c=='_'||c=='$')
{
putc(c,f2); c=getc(f1);
}
putc(' ',f2); ungetc(c,f1);
}
else
if(c==' '||c=='\t')
printf(" ");
else if(c=='\n')
lineno++;
else
putc(c,f3);
}
fclose(f2);
fclose(f3); fclose(f1);
printf("\nThe no's in the program are"); for(j=0;j<i;j++)
printf("%d",num[j]); printf("\n"); f2=fopen("identifier","r"); k=0;
printf("The keywords and identifiersare:"); while((c=getc(f2))!=EOF)
{
if(c!=' ')
str[k++]=c;
else
{
str[k]='\0'; keyword(str); k=0;
}
}
Expected Output:
Output :
C tokens:
C tokens are the basic buildings blocks in C language which are constructed together to write a C program.
Each and every smallest individual units in a C program are known as C tokens. C tokens are of six types.
They are,
individual units in a C program are known as C tokens. C tokens are of six types. They are,
Keywords (eg: int, while),
Each program elements in a C program are given a name called identifiers. Names given to identify
Variables, functions and arrays are examples for identifiers. eg. x is a name given to integer variable in
above program.
Rules for constructing identifier name in C?
Keywords in C language?
Function Follow :
Step1:First put $ (the end of input marker) in Follow(S) (S is the start symbol)
Step2:If there is a production A → aBb, (where a can be a whole string) then everything in FIRST(b)
except for ε is placed in FOLLOW(B).
Step3:If there is a production A → aB, then everything in FOLLOW(A) is in FOLLOW(B)
Step4 :If there is a production A → aBb, where FIRST(b) contains ε, then everything in FOLLOW(A) is
in FOLLOW(B)
Source code:
#include<stdio.h>
#include<string.h>
int i,j,l,m,n=0,o,p,nv,z=0,x=0;
char str[10],temp,temp2[10],temp3[20],*ptr; struct prod
{
char lhs[10],rhs[10][10],ft[10],fol[10]; int n;
}pro[10]; void findter()
{
int k,t; for(k=0;k<n;k++)
{
if(temp==pro[k].lhs[0])
{
for(i=0;i<10;i++) pro[i].n=0;
f=fopen("tab5.txt","r"); while(!feof(f))
{
fscanf(f,"%s",pro[n].lhs); if(n>0)
{
if( strcmp(pro[n].lhs,pro[n-1].lhs) == 0 )
{
pro[n].lhs[0]='\0';
fscanf(f,"%s",pro[n-1].rhs[pro[n-1].n]); pro[n-1].n++;
continue;
}
}
fscanf(f,"%s",pro[n].rhs[pro[n].n]); pro[n].n++;
n++;
}
printf("\n\nTHE GRAMMAR IS AS FOLLOWS\n\n"); for(i=0;i<n;i++)
for(j=0;j<pro[i].n;j++)
printf("%s -> %s\n",pro[i].lhs,pro[i].rhs[j]);
pro[0].ft[0]='#';
for(i=0;i<n;i++)
{
for(j=0;j<pro[i].n;j++)
{
if( pro[i].rhs[j][0]<65 || pro[i].rhs[j][0]>90 )
{
pro[i].ft[strlen(pro[i].ft)]=pro[i].rhs[j][0];
}
else if ( pro[i].rhs[j][0]>=65 && pro[i].rhs[j][0]<=90 )
{
temp=pro[i].rhs[j][0]; if(temp=='S')
pro[i].ft[strlen(pro[i].ft)]='#'; findter();
}
}
}
printf("\n\nFIRST\n"); for(i=0;i<n;i++)
{
printf("\n%s -> ",pro[i].lhs); for(j=0;j<strlen(pro[i].ft);j++)
{
for(l=j-1;l>=0;l--)
printf("\n\nFOLLOW\n"); for(i=0;i<n;i++)
{
printf("\n%s -> ",pro[i].lhs); for(j=0;j<strlen(pro[i].fol);j++)
{
for(l=j-1;l>=0;l--) if(pro[i].fol[l]==pro[i].fol[j])
break; if(l==-1)
printf("%c",pro[i].fol[j]);
}
}
printf("\n"); getch();
}
First put $ (the end of input marker) in Follow(S) (S is the start symbol)
If there is a production A → aBb, (where a can be a whole string) then everything in FIRST(b) except
for ε is placed in FOLLOW(B).
If there is a production A → aB, then everything in FOLLOW(A) is in FOLLOW(B)
FIRST(E) = {'(',id}
FIRST(E') = {+,ε}FIRST(T) = {'(',id}
FIRSTFOLLOW(E') = {$,)}
FOLLOW(T) = {+,$,)}
FOLLOW(T') = {+,$,)}
FOLLOW(F) = {*,+,$,)}
(T') = {*,ε}
FIRST(F) = {'(',id}
Find FOLLOW(E) terms for above grammer?
Ans: FOLLOW(E) = {$,)}
Algorithm :
Source code:
+ - * / ^ i ( ) $
> > < < < < < > >
> > < < < < < > >
> > > > < < < > >
> > > > < < < > >
> > > > < < < > >
> > > > > e e > >
< < < < < < < = e
> > > > > e e > >
< < < < < <
Output:
For example, the following operator precedence relations canbe introduced for simple expressions
id + * $
The input string: id1 + id2 * id3after inserting precedence relations becomes?
Ans : $ <· id1 ·> + <· id2·>* <· id3 ·> $
Having precedence relations allows to identify handles as follows?
Ans : - scan the string from left until seeing ·>
scan backwards the string from right to left until seeing <·
everything between the two relations <· and ·> forms the handle
7. Operator Precedence Parsing Algorithm ?
Ans : Initialize: Set ip to point to the first symbol of w$
Repeat: Let X be the top stack symbol, and a the symbol pointed to by ip
if $ is on the top of the stack and ip points to $ then return else
Let a be the top terminal on the stack, and b the symbol pointed to by ip
ifa <· b or a =· b then
push b onto the stack
advance ip to the next input symbol
else if a ·>bthen repeat
pop the stack
until the top stack terminal is related by <· to the terminal most recently popped
else error()
end
Theory:
For factoring:
Give a set of productions of the form
A-> αβ1|αβ2|…|αβn
Invent a new nonterminal Z and replace this set by the collection:A->αZ
Z-> β1|β2|…βn For substitution:
If we are given a grammar containing A Bα and if all of the productions with B on the left side are:
B-> β1|β2|…βnα
ALGORITHM:
Step 1: start.
Step 2: Declare the prototype functions E() , EP(),T(), TP(),F() Step 3: Read the string to be parsed.
Step 4: Check the productions
Step 5: Compare the terminals and Non-terminals Step 6: Read the parse string.
Step 7: stop the production
void f()
{
int i,n=0,l; for(i=0;i<=strlen(op);i++)
if(op[i]!='e')
tmp[n++]=op[i]; strcpy(op,tmp); l=strlen(op);
for(n=0;n < l && op[n]!='F';n++); if((ip_sym[ip_ptr]=='i')||(ip_sym[ip_ptr]=='I'))
{
op[n]='i';
printf("E=%-25s",op);
printf("F->i\n"); advance();
}
else
{
if(ip_sym[ip_ptr]=='(')
{
advance(); e();
if(ip_sym[ip_ptr]==')')
{
advance(); i=n+2;
do
{
op[i+2]=op[i]; i++;
}
while(i<=l); op[n++]='(';
op[n++]='E';
op[n++]=')';
printf("E=%-25s",op);
printf("F->(E)\n");
}
}
else
{
void main()
{
int i; clrscr();
printf("\nGrammar without left recursion"); printf("\n\t\t E->TE' \n\t\t E'->+TE'|e \n\t\t T->FT' ");
printf("\n\t\t T'->*FT'|e \n\t\t F->(E)|i");
printf("\n Enter the input expression:"); gets(ip_sym);
printf("Expressions");
printf("\t Sequence of production rules\n"); e();
for(i=0;i < strlen(ip_sym);i++)
{
if(ip_sym[i]!='+'&&ip_sym[i]!='*'&&ip_sym[i]!='('&&
ip_sym[i]!=')'&&ip_sym[i]!='i'&&ip_sym[i]!='I')
{
printf("\nSyntax error"); break;
}
for(i=0;i<=strlen(op);i++) if(op[i]!='e')
tmp[n++]=op[i]; strcpy(op,tmp); printf("\nE=%-25s",op);
}
If(flag==1) Printf(“accepted”); getch();
}
Expected output :
Output :
Algorithm :
Source code:
}
else{ for(l=0;l<noter;l++)
{
if(ter[l]==first[i][j])
{
row=l;
}
}
for(k=0;k<npro;k++){ if(nt[i]==pro[k][0])
{
col=i; if((pro[k][3]==first[i][j])&&(pro[k][0]==nt[col]))
{
strcpy(res[col][row],pro[k]); count[col][row]+=1;
}
else
{
if((isupper(pro[k][3]))&&(pro[k][0]==nt[col]))
{
flag=0; for(m=0;m<nont;m++)
{
if(nt[m]==pro[k][3]){index=m;flag=1;}
}
if(flag==1){ for(m=0;m<strlen(first[index]);m++)
{if(first[i][j]==first[index][m])
{strcpy(res[col][row],pro[k]);
count[col][row]+=1;}
}
}
}}}}}
}}
printf("LL1 Table\n\n"); flag=0; for(i=0;i<noter;i++)
PRAGNATHA (Asst Professor, LIET) 22 | P a g e
{
printf("\t%c",ter[i]);
}
for(j=0;j<nont;j++)
{
printf("\n\n%c",nt[j]); for(k=0;k<noter;k++)
{
printf("\t%s",res[j][k]);
if(count[j][k]>1){flag=1;}
}
}
if(flag==1){printf("\nThe given grammar is not LL1");} else{printf("\nThe given grammar is LL1");}
getch();
}
Expected Output:
Ener the follow values: Follow values(S):$ e Follow values(A):$ e Follow values(E):t
LL1 Table
a $ b t
S S- > aA
E E->b
The given grammar is LL1
Output :
What is LL(1)?
Ans: The parsing table entries are single entries. So each location has not more than one entry.
This type of grammar is called LL(1) grammar.
How to element left factoring for give grammer?
Ans : Consider this following grammar: S → iEtS | iEtSeS | a
E→b
After eliminating left factoring, we have S → iEtSS’ | a
S’→ eS | ε E → b
Find first set of terminals for above grammer?
Ans: FIRST(S) = { i, a }
FIRST(S’) = {e, ε }
FIRST(E) = { b}
Find FOLLOW set of terminals of above grammer?
Ans :FOLLOW(S) = { $ ,e }
FOLLOW(S’) = { $ ,e } FOLLOW(E) = {t}
Is the parsing TABLE is given bellow either LL(1) or not?
Algorithm :
Source code:
#include<stdio.h>
#include<stdlib.h>
int stack[20],top=-1;
void push(int item)
{
if(top>=20)
{
printf("stack overflow");
exit(1);
}
stack[++top]=item;
}
int pop()
{
int ch;
if(top<=-1)
{
printf("stack underflow");
exit(1);
}
ch=stack[top--];
return ch;
}
char convert(int item)
{
char ch;
switch(item)
{
case 0:return'E';
case 1:return'e';
case 2:return'T';
case 3:return't';
PRAGNATHA (Asst Professor, LIET) 26 | P a g e
case 4:return'F';
case 5:return'i';
case 6:return'+';
case 7:return'*';
case 8:return'(';
case 9:return')';
case 10:return'$';
}
}
void main()
{
int m[10][10],i,j,k;
char ips[20];
int ip[10],a,b,t;
m[0][0]=m[0][3]=21;
m[1][1]=621;
m[1][4]=m[1][5]=-2;
m[2][0]=m[2][3]=43;
m[3][1]=m[3][4]=m[3][5]=-2;
m[3][2]=743;
m[4][0]=5;
m[4][3]=809;
printf("enter the input string(end with $):");
scanf("%s",ips);
for(i=0;i,i<ips[i];i++)
{
switch(ips[i])
{
case 'E':k=0;
break;
case 'e':k=1;
break;
case 'T':k=2;
break;
case 't':k=3;
break;
case 'F':k=4;
break;
case 'i':k=5;
break;
case '+':k=6;
break;
case '*':k=7;
break;
case '(':k=8;
break;
case ')':k=9;
break;
PRAGNATHA (Asst Professor, LIET) 27 | P a g e
case '$':k=10;
break;
}
ip[i]=k;
}
ip[i]=-1;
push(10);
push(0);
i=0;
printf("STACK \t INPUT \n");
while(1)
{
//printf("\t");
for(j=0;j<=top;j++)
{
printf("%c",convert(stack[j]));
}
printf("\t");
for(k=i;ip[k]!=-1;k++)
{
printf("%c",convert(ip[k]));
}
printf("\n");
if(stack[top]==ip[i])
{
if(ip[i]==10)
{
printf("\t success");
return;
}
else
{
top--;
i++;
}
}
else if(stack[top]<=4&&stack[top]>=0)
{
a=stack[top];
b=ip[i]-5;
t=m[a][b];
top--;
while(t>0)
{
push(t%10);
t=t/10;
}
}
PRAGNATHA (Asst Professor, LIET) 28 | P a g e
else
{
printf("error");
return;
}
}
}
Output 1:
Output 2:
Algorithm :
STEP1: Initial State: the stack consists of the single state, s0; ip points to the first character in w.
STEP 2: For top-of-stack symbol, s, and next input symbol, a case action of T[s,a]
STEP 3: shift x: (x is a STATE number) push a, then x on the top of the stack and advance ip to point to
the next input symbol.
STEP 4: reduce y: (y is a PRODUCTION number) Assume that the production is of the form A ==>
beta pop 2 * |beta| symbols of the stack.
STEP 5: At this point the top of the stack should be a state number, say s’. push A, then goto of T[s’,A]
(a state number) on the top of the stack.
Source code:
check(); st_ptr++;
}
st_ptr++; check();
}
void check()
{
if(!strcmpi(stack,"E")&&ip_ptr==len)
{
printf("\n $%s\t\t%s$\t\t\tACCEPT",stack,ip_sym); getch();
exit(0);
}
if(flag==0)
{
printf("\n%s\t\t\t%s\t\t reject",stack,ip_sym); exit(0);
}
return;
}
Expected Output:
$ a+a$ --
$a +a$ Shift a
$E +a$ E
$E+a $ Shift a
$E+E $ E a
$E $ E E*E
$E $ ACCEPT
Output:
Algorithm :
STEP 1: Represent Ii by its CLOSURE, those items that are either the initial item[S’->.S; eof] or do not
have the . at the left end of the rhs.
STEP 2 : Compute shift, reduce, and goto actions for the state derived from Ii directly from
CLOSURE(Ii)
Source code:
#include<stdio.h> #include<conio.h> #include<stdlib.h> #include<string.h>
void push(char *,int *,char); char stacktop(char *);
void isproduct(char,char); int ister(char);
int isnter(char); int isstate(char); void error();
void isreduce(char,char); char pop(char *,int *);
void printt(char *,int *,char [],int); void rep(char [],int);
struct action
{
char row[6][5];
};
const struct action A[12]={
{"sf","emp","emp","se","emp","emp"},
{"emp","sg","emp","emp","emp","acc"},
{"emp","rc","sh","emp","rc","rc"},
{"emp","re","re","emp","re","re"},
{"sf","emp","emp","se","emp","emp"},
{emp","rg","rg","emp","rg","rg"},
{"sf","emp","emp","se","emp","emp"},
{"sf","emp","emp","se","emp","emp"},
{"emp","sg","emp","emp","sl","emp"},
{"emp","rb","sh","emp","rb","rb"},
{"emp","rb","rd","emp","rd","rd"},
{"emp","rf","rf","emp","rf","rf"}
};
struct gotol
{
char r[3][4];
};
default
:printf("%c",t[r]); break;
}
}
Aim :Implement the lexical analyzer using JLex, flex or lex or other lexical analyzer generating tools.
Algorithm :
\< |
== |
\> {if(!COMMENT) printf("\n\t%s is a RELATIONAL OPERATOR",yytext);}
%%
Expected Output:
$lex lex.l
$cc lex.yy.c
$./a.out var.c
#include<stdio.h> is a PREPROCESSOR DIRECTIVE FUNCTION
main (
)
BLOCK BEGINS
int is a KEYWORD a IDENTIFIER
b IDENTIFIER BLOCK ENDS
how many sections a JLex input file is organized? Ans : A JLex input file is organized into three
sections
what are the sections in Jlex?
Ans : 1)user code 2)JLex directives 3)regular expression rules
Algorithm :
Source code:
Enter a value:5
Loop rolling output:2 Loop unrolling output:2
Output:
Aim :Convert the BNF rules into YACC form and write code to generate abstract Syntax tree.
Algorithm :
Step1: To specify the syntax of a language : CFG and BNF Step2: ex: if-else statement in C has the
form of statement →if (
expression )
statement else statement
Step3: An alphabet of a language is a set of symbols.
step4: ex:{0,1} for a binary number system(language)={0,1,100,101,...} Step 5:{a,b,c} for
language={a,b,c, ac,abcc..}
Step6: {if,(,),else ...} for a if statements={if(a==1)goto10, if--}
Step7: A string over an alphabet
is a sequence of zero or more symbols from the alphabet.
Step8: Ex: 0,1,10,00,11,111,0202 ... strings for a alphabet {0,1}
Step9: Null string is a string which does not have any symbol of alphabetLanguage.
o Is a subset of all the strings over a given alphabet.
Step10: Alphabets Ai Languages Li for Ai A0={0,1} L0={0,1,100,101,...}
A1={a,b,c} L1={a,b,c, ac, abcc..}
A2={all of C tokens} L2= {all sentences of C program }
Source code:
<int.l>
%{
#include"y.tab.h"
#include<stdio.h> #include<string.h> int LineNo=1; %} identifier [a-zA-Z][_a-zA-Z0-9]* number [0-
9]+|([0-9]*\.[0-9]+)
%%
main\(\) return MAIN; if return IF; else return ELSE; while return WHILE;
int |
char | float return TYPE; {identifier} {strcpy(yylval.var,yytext); return VAR;} {number}
{strcpy(yylval.var,yytext);
return NUM;}
\< |
\> |
\>= |
\<= |
== {strcpy(yylval.var,yytext); Compiler design Lab programs 12
return RELOP;} [ \t] ;
\n LineNo++;
. return yytext[0];
%%
<int.y>
%{
Expected Output:
$lex int.l
$yacc –d int.y
$gcc lex.yy.c y.tab.c –ll –lm
$./a.out test.c
Output:
THEORY:
The algorithm we shall present basically tries to find for every statement in the program a mapping
between variables, and values of N T 𝖴⊥ { , } . If a variable is mapped to a constant number, that
number is the variables value in that statement on every execution. If a variable is mapped to T (top), its
value in the statement is not known to be constant, and in the variable is mapped to ⊥ (bottom), its value
is not initialized on every execution, or the statement is unreachable. The algorithm for assigning the
mappings to the statements is an iterative algorithm, that traverses the control flow graph of the
algorithm, and updates each mapping according to the mapping of the previous statement, and the
functionality of the statement. The traversal is iterative, because non-trivial programs have circles in
their control flow graphs, and it ends when a “fixed-point” is reached – i.e., further iterations don’t
change the mappings.
ALGORITHM:
Step1: start
Step2: declare a=30,b=3,c
Step3: display result of propagation Step4:end
Source code:
int constantpropagation()
{
int a = 30; int b = 3; int c;
c = b * 4; if (c > 10) {
c = c - 10;
}
return c * 2;
}
Expected Output:
Output:
What is an interpreter?
a program that reads an executable program and produces the results of running that program
usually, this involves executing the source program in some fashion