You are on page 1of 13

Symbol Tables

CSC 3205: Compiler Design

Marriette Katarahweire

26th February 2020

CSC 3205: Compiler Design 1/13


Symbol Tables

Symbol tables are data structures that are used by compilers


to hold information about source-program constructs.
It is an important data structure created and maintained by
the compiler in order to keep track of semantics of variable
The information is collected incrementally by the analysis
phases of a compiler
Used by the synthesis phases to generate the target code.
Entries in the symbol table contain information about an
identifier eg. its character string (or lexeme), type,position in
storage and any other relevant information.
Symbol tables typically need to support multiple declarations
of the same identifier within a program.
It is used by compiler to achieve compile time efficiency.

CSC 3205: Compiler Design 2/13


Symbol Table and other Compiler Phases

Lexical Analysis: Creates new table entries in the table,


example entries about token.
Syntax Analysis: Adds information regarding attribute type,
scope, dimension, line of reference, use, etc in the table.
Semantic Analysis: Uses available information in the table to
check for semantics i.e. to verify that expressions and
assignments are semantically correct(type checking) and
update it accordingly.
Intermediate Code generation: Refers symbol table for
knowing how much and what type of run-time is allocated
and table helps in adding temporary variable information.
Code Optimization: Uses information present in symbol table
for machine dependent optimization.
Target Code generation: Generates code by using address
information of identifier present in the table.
CSC 3205: Compiler Design 3/13
Items stored in Symbol table

Variable names and constants


Procedure and function names
Literal constants and strings
Compiler generated temporaries
Labels in source languages
What information is to be associated with the name?
How do we access this information?
Some symbol tables also include the keywords in the same
table while others use a separate table for keywords.

CSC 3205: Compiler Design 4/13


Symbol Tables: Scopes

The scope of a declaration is the portion of a program to


which the declaration applies.
Scopes are implemented by setting up a separate symbol table
for each scope.
A program block with declarations will have its own symbol
table with an entry for each declaration in the block.
This approach also works for other constructs that set up
scopes; for example, a class would have its own table, with an
entry for each field and method.
Declarations may be constants, type declarations, variable
declarations, procedure declarations, class declarations

CSC 3205: Compiler Design 5/13


Symbol Table Purpose

A symbol table may serve the following purposes depending upon


the language
To store the names of all entities in a structured form at one
place.
To verify if a variable has been declared.
To implement type checking, by verifying assignments and
expressions in the source code are semantically correct.
To determine the scope of a name (scope resolution)

CSC 3205: Compiler Design 6/13


Who creates the Symbol Table?
Who creates the symbol table, the lexical, syntax or semantic
analysis phase?
- It is the semantic analysis phase which creates the symbol table
because it is not until the semantic analysis phase that enough
information is known about a name to describe it.
- Many compilers set up a table at lexical analysis time for the
various variables in the program, and fill in information about the
symbol later during semantic analysis when more information
about the variable is known.
- The parser can create entries. With its knowledge of the
syntactic structure of a program, a parser is often in a better
position than the lexical analyzer to distinguish among different
declarations of an identifier.
- In some cases, a lexical analyzer can create a symbol-table entry
as soon as it sees the characters that make up a lexeme.
- More often, the lexical analyzer can only return to the parser a
token, say id, along with a pointer to the lexeme.
CSC 3205: Compiler Design 7/13
Class and Related Attributes
A name in a program can represent a variable, a constant, a
parameter, a record or union type, a field in a record or union,
a procedure or function, a macro, an array, a label, or a file
etc. These are values for a symbol’s class.
Not all languages have all these possibilities. The possible
classes for a name vary with the language, as do the other
attributes. Once a name’s class is known, the rest of the
information may vary depending on the value of the class
attribute.
If a name is of class variable, or constant, or parameter, there
is another attribute which records the name’s type.
For a name whose class is procedure or function, other
attributes indicate the number of parameters, the parameters,
themselves, and the result type for functions.
For a name whose class is array, other attributes are the
number of dimensions and the array bounds (if known)
CSC 3205: Compiler Design 8/13
Symbol Table Operations

There are two main operations on symbol tables: Insert (or


Enter ) and Lookup (or Retrieval)
Most languages require declaration of names. When the
declaration is processed, the name is inserted into the symbol
table.
For languages not requiring declarations, e.g., FORTRAN and
SNOBOL, a name is inserted into the symbol table on its first
use in the program. Each subsequent use of a name causes a
symbol table lookup operation.
Given the characters of a name, searching finds the attributes
associated with that name. The search time depends on the
data structure used to represent the symbol table.

CSC 3205: Compiler Design 9/13


Implementation of Symbol Tables
An unordered list would enter each name sequentially as it is
declared. The lookup operation must then search linearly.
Inefficient unless it is known that the number of entries will be
exceedingly small.
An Ordered List: ordered according to the characters in the
name. Can be implemented as an array or as a linked list.
A Binary Tree: combines the fast search time of an ordered
array with the insertion ease of a linked list.
Hash tables are the best method for efficiency. Used by most
production-quality compilers. More space is generally needed
so it is not as space-efficient as binary trees or list structures
A stack where a pointer is kept to the top of the stack for
each block. Names are pushed onto the stack as they are
encountered. This is an easy structure to implement, but
relatively inefficient in operation.
Combinations of the data structures may be used. eg a
hashed symbol table implemented as a stack.
CSC 3205: Compiler Design 10/13
Scope Management
. . .

int value=10;

void pro_one(){
int one_1;
int one_2;

{
int one_3; // inner scope 1
int one_4;
}

int one_5;

{
int one_6; //inner scope 2\\
int one_7;
}
}

void pro_two(){
int two_1;
int two_2;

{
int two_3; // inner scope 3
int two_4;
}

int two_5;
}

CSC 3205: Compiler Design 11/13


Hierarchical Symbol Table

CSC 3205: Compiler Design 12/13


Hierarchical Symbol Table

whenever a name needs to be searched in a symbol table, it is


searched using the following algorithm:
first a symbol will be searched in the current scope, i.e.
current symbol table.
if a name is found, then search is completed, else it will be
searched in the parent symbol table until, either the name is
found or global symbol table has been searched for the name.

CSC 3205: Compiler Design 13/13

You might also like