0% found this document useful (0 votes)
45 views17 pages

Spring 2024 Compiler Constructoin A Lab 7

Uploaded by

neha
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
45 views17 pages

Spring 2024 Compiler Constructoin A Lab 7

Uploaded by

neha
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd

Lab 07: This experiment is designed giving the concept of YACC (Bison) with FLEX and then implement

STM
using parsing tool YACC.

Usman Institute of Technology


Department of Computer Science
Course Code: CS412
Course Title: Compiler Construction SPRING 2024

Lab 07

Objective:

This experiment is designed for giving the concepts of YACC(Bison) with FLEX and then
implement Symbol Table Manager with parsing tool YACC.

Student Information

Student Name

Student ID

Date

Assessment

Marks Obtained

Remarks

Signature

Usman Institute of Technology


Department of Computer Science
CS412 – Compiler Construction

Zulfiqar Ali SPRING 2024 CS412-Compiler Construction


Lab 07: This experiment is designed giving the concept of YACC (Bison) with FLEX and then implement STM
using parsing tool YACC.

Lab 07

Instructions

• Come to the lab in time. Students who are late more than 15 minutes, will not be allowed to
attend the lab.
• Students have to perform the examples and exercises by themselves.
• Raise your hand if you face any difficulty in understanding and solving the examples or exercises.
• Lab work must be submitted on or before the submission date.
• Do not copy the work of other students otherwise both will get zero marks.

1. Objective

Introduction
The UNIX utility YACC (Yet Another Compiler Compiler) parses a stream of token, typically generated
by lex, according to a user-specified grammar.

Structure of a YACC file


A YACC file looks much like a lex file:

...definitions...

%%

...rules...
%%

...code...

In the example just saw, all three sections are present:

Definitions Section
There are three things that can go in the definitions section:

C Code: Any code between %{ and %} is copied to the C file. This is typically used for defining
file variables, and for prototypes of routines that are defined in the code segment.

Definitions: The definitions section of a lex file was concerned with characters; in YACC this is tokens.
These token definitions are written to a .h file when yacc compiles this file.

Associativity Rules: These handle associativity and priority of operators’ definitions

Zulfiqar Ali SPRING 2024 CS412-Compiler Construction


Lab 07: This experiment is designed giving the concept of YACC (Bison) with FLEX and then implement STM
using parsing tool YACC.

Lex YACC Interaction


Conceptually, lex parses a file of characters and outputs a stream of tokens; yacc accepts a stream of
tokens and parses it, performing actions as appropriate. In practice, they are more tightly coupled. If lex
program is supplying a tokenizer, the yacc program will repeatedly call the yylex routine. The lex rules
will probably function by calling return every time they have parsed a token.

The Shared Header File of Return Codes


If lex is to return tokens that YACC will process, they have to agree on what tokens there are, This is done
as follows.
• The YACC file will have token definitions %token NUMBER in the definitions section.
• When the YACC file is translated with YACC -d, a header file y.tab.h is created that has
definitions like #define NUMBER 258.
• This file can then be included in both the lex and yacc program.
• The lex file can then call return NUMBER, and the YACC program can match on this token.
• The return codes that are defined from %TOKEN definitions typically start at around 258,so that
single characters can simply be returned as their integer value:
/* in the lex program */
[0-9]+ {return NUMBER} [-+*/] {return *yytext}
/* in the YACC program */
sum : TERMS ’+’ TERM

Rules section
The rules section contains the grammar of the language we want to parse. This looks like:

name1 : THING something OTHERTHING {action}1


| other something THING {other action}
name2 : .....

This is the general form of context-free grammars, with a set of actions associated with each matching
righthand side. It is a good convention to keep non-terminals (names that can be expanded further) in
lower case and terminals (the symbols that are finally matched) in upper case. The terminal symbols get
matched with return codes from the lex tokenizer.

User Code Section


The minimal main program is:

int main()
{ yyparse()
; return 0;
}

Extensions to more ambitious programs should be self-evident. In addition to the main program, the code
section will usually also contain subroutines, to be used either in the YACC or the lex program.

Zulfiqar Ali SPRING 2024 CS412-Compiler Construction


Lab 07: This experiment is designed giving the concept of YACC (Bison) with FLEX and then implement STM
using parsing tool YACC.

Algorithm:

YACC Code:
1. Define tokens corresponding to Terminals in the given grammar.
2. Write the grammar in the YACC format.
3. Define main().
3.1. Call yyparse() inside that.
4. Define yywrap().
5. Define yyerror().

LEX Code:
1. Include y.tab.h to get the token definitions in the YACC file.
2. Define patterns corresponding to the tokens (Grammar or Rules).
3. Return tokens when the patterns are matched.

Compilation Order:
1. Translate LEX file *.l using the command LEX *.l or *.flex using command FLEX *.flex.
2. Now a c file named lex.yy.c is generated.
3. Now Translate YACC file *.y using the command YACC -d *.y.
3.1. 2 files are generated y.tab.c and y.tab.h.
4. Compile the 2 C files generated using the command gcc lex.yy.c y.tab.c
5. Now the executable file a.exe is generated.
6. Execute the program using the command a.exe.

Example 7.1:
To construct a simple desk calculator that reads an arithmetic expression in infix form, and valuates it, and
then prints its numeric value, we need following grammar:
E->E+T|T
T->T*F|F
F - >( E ) | digit
The token digit is a single digit between 0 and 9.

calc.flex File:
%{
#include "calc.tab.h"
#include <stdlib.h>
%}

Zulfiqar Ali SPRING 2024 CS412-Compiler Construction


Lab 07: This experiment is designed giving the concept of YACC (Bison) with FLEX and then implement STM
using parsing tool YACC.
%%

[0-9]+ { yylval =atoi(yytext);


return INTEGER;
}
[-+\n] return *yytext;
[ \t] ; /* skip whitespace */
. yyerror("invalid character");
%%
int yywrap(void) { return 1; }

clac.y File
%{
#include <stdio.h>
void yyerror(char *s);
%}

%token INTEGER

%%

program:
programexpr '\n' { printf("%d\n", $2); }
|
;
expr:
INTEGER { $$ = $1; }
| expr '+' expr { $$ = $1 + $3; }
;

%%

void yyerror(char *s) { printf("%s\n", s); }

int main(void) {
yyparse();
return 0;
}

Compilation Steps:
Commands to create our compiler, calc.exe, are listed below:
win_flexcalc.flex # create lex.yy.c win_bison –d
calc.y # create calc.tab.h, calc.tab.c mingw32-gcc
lex.yy.c y.tab.c –o calc # compile/link

Zulfiqar Ali SPRING 2024 CS412-Compiler Construction


Lab 07: This experiment is designed giving the concept of YACC (Bison) with FLEX and then implement STM
using parsing tool YACC.

Example 7.2 - Algorithm for Scientific Calculator

LEX program
1. Declare header files y.tab.h which contains information of the tokens and also declare variable
yylval within %{ and %}.
2. End of declaration section with %%
3. Write the Regular Expression for NUM, SIN, COS, TAN, SQRT, LOG
4. If match found for regular expression then write action that store token in yylval where p is
pointer declared in YACC and return the valve of token. 5. End rule-action section by %%
6. Declare main function.
7. End.
YACC program
1. Declaration of header files in %{ and %} symbol.
2. Declare pointer of type double in union
3. Declare type of token and non terminal by using union %union
{ double p; }
4. Declare tokens NUM, SIN, COS, TAN, SQRT, LOG 5. Declare start symbol
by using %start.
6. Declare type of nonterminal exp of type p using %type
7. Give precedence and associativity of operator. Using %left, %right and %nonassoc.
8. End of declaration section by %%
9. Write CFG for arithmetic operation like addition, subtraction, division and multiplication and
appropriate action for the same.

S : exp {printf("=%g\n",$1);}
exp : exp'+'exp { $$=$1+$3; }
| exp'/'exp { if($3==0)
{
printf("Divide By Zero");
exit(0);

}
Else
$$=$1/$3;
}
|'-'exp %prec UMINUS {$$=-$2;}
|'('exp')' {$$=$2;}
|SIN'('exp')' {$$=sin($3);}
|SQRT'('exp')' {$$=sqrt($3);}
|NUM ;

10. End the translation rule section by %%


11. Define main() function to call yyparse() function to parse an input stream

Zulfiqar Ali SPRING 2024 CS412-Compiler Construction


Lab 07: This experiment is designed giving the concept of YACC (Bison) with FLEX and then implement STM
using parsing tool YACC.
12. Define yyerror() function to displaying any error message when a yacc detects a syntax error.
yyerror(const char *msg)
{

if(flag==0);
printf("\n\n Syntax is Wrong");
}
13. End.

Compilation Steps:
Commands to create our compiler, calc.exe, are listed below:
win_flex*.flex # create lex.yy.c
win_bison –d *.y # create calc.tab.h, calc.tab.c mingw32-
gcclex.yy.cy.tab.c –o abc # compile/link

The Symbol Table


Symbol Table is an important data structure created and maintained by the compiler in order to keep track
of semantics of variables i.e. it stores information about the scope and binding information about names,
information about instances of various entities such as variable and function names, classes, objects, etc.

 It is built-in lexical and syntax analysis phases.


 The information is collected by the analysis phases of the compiler and is used by the synthesis
phases of the compiler to generate code.
 It is used by the compiler to achieve compile-time efficiency.
It is used by various phases of the compiler as follows:-
Lexical Analysis: Creates new table entries in the table, for example like entries about tokens.
Syntax Analysis: Adds information regarding attribute type, scope, dimension, line of reference, use, etc in
the table.
Semantic Analysis: Uses available information in the table to check for semantics i.e. to verify that
expressions and assignments are semantically correct(type checking) and update it accordingly.
Intermediate Code generation: Refers symbol table for knowing how much and what type of run-time is
allocated and table helps in adding temporary variable information.
Code Optimization: Uses information present in the symbol table for machine-dependent optimization.
Target Code generation: Generates code by using address information of identifier present in the table.

Symbol Table entries – Each entry in the symbol table is associated with attributes that support the
compiler in different phases.

Use of Symbol Table-


The symbol tables are typically used in compilers.
During this scan compiler stores the identifiers of that application program in the symbol table. These
identifiers are stored in the form of name, value address, type.

Zulfiqar Ali SPRING 2024 CS412-Compiler Construction


Lab 07: This experiment is designed giving the concept of YACC (Bison) with FLEX and then implement STM
using parsing tool YACC.

Here the name represents the name of identifier, value represents the value stored in an identifier, the
address represents memory location of that identifier and type represents the data type of identifier.

Thus compiler can keep track of all the identifiers with all the necessary information.

Items stored in Symbol table:

-Keywords - Data Types


- Operators - Functions
- Variables - Procedures
- Constants - Literals

Contents of the Symbol Table


The symbol table will contain the following types of information for the input strings in a source
program:
- The lexeme (input string) itself
- Corresponding token
- Its semantic component (e.g., variable, operator,
- constant, functions, procedure, etc.)
- Data type
- Pointers to other entries (when necessary)

Information used by the compiler from Symbol table:

 Data type and name


 Declaring procedures
 Offset in storage
 If structure or record then, a pointer to structure table.
 For parameters, whether parameter passing by value or by reference
 Number and type of arguments passed to function
 Base Address

Looking things up in the Symbol Table


Example - How would the scanner read a Pascal program header.

int main(void)

After reading the first 4 characters, it assembles the lexeme Pushing back the blank into the input stream.
Since int is a reserved word, it is already in the symbol table, which returns the token int

Next, the scanner assembles the lexeme

Zulfiqar Ali SPRING 2024 CS412-Compiler Construction


Lab 07: This experiment is designed giving the concept of YACC (Bison) with FLEX and then implement STM
using parsing tool YACC.

pushing back the ( following it The Symbol Table manager can’t find it when it looks it up. Therefore, it
installs the main in the Symbol Table with the token identifier.
During semantic analysis, it will be given the semantic property of int.

Data stored in the Symbol Table


Let’s take a look at the kinds of information stored in the symbol table for different types of elements
within a program.

Operations of Symbol table:

Following operations can be performed on symbol table-

 Insert() Insertion of an item in the symbol table.


 Delete() Deletion of any item from the symbol table.
 Search() Searching of desired item from symbol table.

Implementation of Symbol table:


Following are commonly used data structures for implementing symbol table
 List
 Linked List
 Hash Table
 Binary Search Tree

Lets discuss each Data structure individually

List
we use a single array or equivalently several arrays, to store names and their associated information ,New
names are added to the list in the order in which they are encountered . The position of the end of the array
is marked by the pointer available, pointing to where the next symbol-table entry will go. The search for a

Zulfiqar Ali SPRING 2024 CS412-Compiler Construction


Lab 07: This experiment is designed giving the concept of YACC (Bison) with FLEX and then implement STM
using parsing tool YACC.

name proceeds backwards from the end of the array to the beginning. when the name is located the
associated information can be found in the words following next.
 In this method, an array is used to store names and associated information.
 A pointer “available” is maintained at end of all stored records and new names are added in the
order as they arrive
 To search for a name we start from the beginning of the list till available pointer and if not found
we get an error “use of the undeclared name”
 While inserting a new name we must ensure that it is not already present otherwise an error occurs
i.e. “Multiple defined names”
 Insertion is fast O(1), but lookup is slow for large tables – O(n) on average
 The advantage is that it takes a minimum amount of space.

Linked List
- This implementation is using a linked list. A link field is added to each record. Searching of names
is done in order pointed by the link of the link field.
- A pointer “First” is maintained to point to the first record of the symbol table. Insertion is fast O(1),
but lookup is slow for large tables – O(n) on average

Hash Table:
- In hashing scheme, two tables are maintained – a hash table and symbol table and are the most
commonly used method to implement symbol tables.
- A hash table is an array with an index range: 0 to table size – 1. These entries are pointers pointing
to the names of the symbol table.
- To search for a name, we use a hash function that will result in an integer between 0 to table size –
1.
- Insertion and lookup can be made very fast – O(1).
- The advantage is quick to search is possible and the disadvantage is that hashing is complicated to
implement.
-
Binary Search Tree :
Another approach to implementing a symbol table is to use a binary search tree i.e. we add two link
fields i.e. left and right child.
All names are created as child of the root node that always follows the property of the binary search
tree.
Insertion and lookup are O(log2 n) on average.

The Syntax Tree


The Syntax Tree is a tree representation of the program structure. We use such representations in the
intermediate code generation phase. It makes the compiler more transparent.

Example:

Zulfiqar Ali SPRING 2024 CS412-Compiler Construction


Lab 07: This experiment is designed giving the concept of YACC (Bison) with FLEX and then implement STM
using parsing tool YACC.

Scope Management

A compiler maintains two types of symbol tables: a global symbol table which can be accessed by all the
procedures and scope symbol tables that are created for each scope in the program.

To determine the scope of a name, symbol tables are arranged in hierarchical structure as shown in the
example below:

The above program can be represented in a hierarchical structure of symbol tables:

Zulfiqar Ali SPRING 2024 CS412-Compiler Construction


Lab 07: This experiment is designed giving the concept of YACC (Bison) with FLEX and then implement STM
using parsing tool YACC.

The global symbol table contains names for one global variable (int value) and two procedure names,
which should be available to all the child nodes shown above. The names mentioned in the pro_one
symbol table (and all its child tables) are not available for pro_two symbols and its child tables.

This symbol table data structure hierarchy is stored in the semantic analyzer and whenever a name needs
to be searched in a symbol table, it is searched using the following algorithm:

• first a symbol will be searched in the current scope, i.e. current symbol table.
• if a name is found, then search is completed, else it will be searched in the parent symbol table
until,
• either the name is found or global symbol table has been searched for the name.

C++ Program to Implement Symbol Table

Zulfiqar Ali SPRING 2024 CS412-Compiler Construction


Lab 07: This experiment is designed giving the concept of YACC (Bison) with FLEX and then implement STM
using parsing tool YACC.

Zulfiqar Ali SPRING 2024 CS412-Compiler Construction


Lab 07: This experiment is designed giving the concept of YACC (Bison) with FLEX and then implement STM
using parsing tool YACC.

Zulfiqar Ali SPRING 2024 CS412-Compiler Construction


Lab 07: This experiment is designed giving the concept of YACC (Bison) with FLEX and then implement STM
using parsing tool YACC.

Zulfiqar Ali SPRING 2024 CS412-Compiler Construction


Lab 07: This experiment is designed giving the concept of YACC (Bison) with FLEX and then implement STM
using parsing tool YACC.

Zulfiqar Ali SPRING 2024 CS412-Compiler Construction


Lab 07: This experiment is designed giving the concept of YACC (Bison) with FLEX and then implement STM using
parsing tool YACC.

Answer the following Questions:

1. List the few differences between:


• *.tab.c and *.tab.h
• *.y and *.l
• lex.yy.c and *.tab.c
2. Which data structure you would recommend for your STM and why?
3. What is hash table.
4. Describe the importance of YACC.

Programming Exercise

1. Write a program that constructs the SYMBOL TABLE for the keywords, operators, identifiers and
digits. For the keywords store the keyword and line numbers where it appeared. For identifiers
store identifiers and its type. Use following structure to construct STM.

Name Type Size Dimension Line of Line Address Scope Value


Declaration of
Usage

Zulfiqar Ali SPRING 204 CS412-Compiler Construction

You might also like