Compiler Construction CS F363
Semantic Analysis
Shashank Gupta
BITS Pilani Assistant Professor
Pilani Campus
Department of Computer Science and Information Systems
BITS Pilani, Pilani Campus
Topics to be Covered……
• Abstract Syntax Tree
• L-Attributed Definitions
CS F363 Compiler Construction 2
BITS Pilani, Pilani Campus
Abstract Syntax Tree (AST)……
• Consider the grammar
E → int | ( E ) | E + E
• And the string
5 + (2 + 3)
• After lexical analysis (a list of tokens)
int5 ‘+’ ‘(‘ int2 ‘+’ int3 ‘)’
• During parsing we build a parse tree …
CS F363 Compiler Construction 3
BITS Pilani, Pilani Campus
Abstract Syntax Tree (AST)……
• Traces the operation of
the parser
• Captures the nesting
structure
• But too much info
• Parentheses
• Single-successor
nodes
CS F363 Compiler Construction 4
BITS Pilani, Pilani Campus
Abstract Syntax Tree (AST)……
• Also captures the nesting structure
• But abstracts from the concrete
syntax
• a more compact and easier to use
• An important data structure in a
compiler
CS F363 Compiler Construction 5
BITS Pilani, Pilani Campus
Abstract Syntax Tree (AST)
The parse tree generated during syntax checking is
not suitable a representation for further processing.
A modified form, known as abstract syntax tree
(AST) is required.
CS F363 Compiler Construction 6
BITS Pilani, Pilani Campus
Abstract Syntax Tree (AST)……
AST is condensed form of parse tree.
• Like Parse Trees but ignore some details
• An abstract syntax tree (AST) is the procedure’s parse tree with the
nodes for most non-terminal symbols removed.
Chain of single productions may be collapsed, and
operators move to the parent nodes
CS F363 Compiler Construction 7
BITS Pilani, Pilani Campus
Abstract Syntax Tree (AST)……
Abstract Syntax tree
Syntax tree CS F363 Compiler Construction 8
BITS Pilani, Pilani Campus
Construction of AST……
We use attribute grammar rules to construct the
abstract syntax tree.
But in order to do that we first require two
procedures for tree construction.
CS F363 Compiler Construction 9
BITS Pilani, Pilani Campus
Construction of AST…..
Each node can be represented as a record
• Operators: one field for operator, remaining fields ptrs to operands
• mknode (op, left, right )
• Creates a node with label op (with fields which point to left and right) and returns a
pointer to the newly created node.
Identifier: one field with label id and another ptr to symbol table
mkleaf(id,entry)
• Number: one field with label num and another to keep the value of the number
mkleaf(num,val)
CS F363 Compiler Construction 10
BITS Pilani, Pilani Campus
Construction of AST…..
The following sequence of function calls creates a tree for a - 4 + c
P1 = mkleaf (id, entry.a)
P2 = mkleaf (num, 4)
P3 = mknode (-, P1 , P2 )
P4 = mkleaf (id, entry.c)
P5 = mknode (+, P3 , P4 )
CS F363 Compiler Construction 11
BITS Pilani, Pilani Campus
Syntax Directed Definition for Constructing
AST
E → E1 + T
E → T
T → T1 * F
T → F
F → (E)
F → id
F → num
CS F363 Compiler Construction 12
BITS Pilani, Pilani Campus
Expression Grammar
• E → E1 + T [Link] = mknode(+, E1 .ptr, [Link])
• E→T [Link] = [Link]
• T → T1 * F [Link] := mknode(*, T1 .ptr, [Link])
• T→F [Link] := [Link]
• F → (E) [Link] := [Link]
• F → id [Link] := mkleaf(id, [Link])
• F → num [Link] := mkleaf(num,val)
CS F363 Compiler Construction 13
BITS Pilani, Pilani Campus
L-Attributed Definitions
L-Attributed Definitions contain both synthesized and inherited attributes
A syntax directed definition is L-Attributed if each inherited attribute of Xj in a
production
A → X1 . . . Xj . . . Xn, depends only on
• The attributes of the symbols to the left (this is what L in L-Attributed stands for) of Xj , i.e., X1 X2 . .
. Xj − 1, and
• The inherited attributes of A.
• A → BC {B.S = A.S} is an L-attributed definition. (where S is an inherited attribute)
CS F363 Compiler Construction 14
BITS Pilani, Pilani Campus
Few Examples
A → LM
• [Link] = f1 ([Link])
L-Attributed Definitions
• [Link] = f2 ([Link])
• [Link] = f3 ([Link])
A → QR
• [Link] = f4 ([Link])
Not L-Attributed Definitions
• [Link] = f5 ([Link])
• [Link] = f6 ([Link])
CS F363 Compiler Construction 15
BITS Pilani, Pilani Campus
L-Attributed Definitions
When translation takes place during parsing, order of
evaluation is linked to the order in which nodes are created
• In S-attributed definitions, parent’s attribute evaluated after child’s
attribute.
A natural order in both top-down and bottom-up parsing is
depth first-order
• L-attributed definition: where attributes can be evaluated in depth-first
order
CS F363 Compiler Construction 16
BITS Pilani, Pilani Campus
Compiler Construction CS F363
Semantic Analysis
Shashank Gupta
BITS Pilani Assistant Professor
Pilani Campus
Department of Computer Science and Information Systems
BITS Pilani, Pilani Campus
Translation Schemes
Translation Schemes are more implementation oriented than syntax
directed definitions since they indicate the order in which semantic rules
and attributes are to be evaluated.
A Translation Scheme is a context-free grammar in which
• Attributes are associated with grammar symbols;
• Semantic Actions are enclosed between braces {} and are inserted within the right-hand
side of productions.
CS F363 Compiler Construction 2
BITS Pilani, Pilani Campus
Translation Schemes……
Translation Schemes deal with both synthesized and inherited
attributes.
• Semantic Actions are treated as terminal symbols
• Annotated parse-trees contain semantic actions as children of the node
standing for the corresponding production.
Translation Schemes are useful to evaluate L-Attributed
definitions at parsing time
• An L-Attributed Syntax-Directed Definition can be transformed into a
Translation Scheme.
CS F363 Compiler Construction 3
BITS Pilani, Pilani Campus
Translation Schemes……
A CFG where semantic actions occur
within the RHS of production
• E→ T R
• R→ addop T R | ε
• T→ num
• addop → + | –
CS F363 Compiler Construction 4
BITS Pilani, Pilani Campus
Parse Tree for 9-5+2
CS F363 Compiler Construction 5
BITS Pilani, Pilani Campus
Translation Schemes
A CFG where semantic actions occur within the RHS of production
• E→ T R
• R→ addop T R | ε
• T→ num
• addop → + | –
Example: A translation scheme to map infix to postfix
• E→ T R
• R→ addop T {print(addop)}R | ε
• T→ num {print(num)}
• addop → + | –
CS F363 Compiler Construction 6
BITS Pilani, Pilani Campus
Parse Tree for 9-5+2
E→ T R
R→ addop T {print(addop)}R | ε
T→ num {print(num)}
addop → + | –
CS F363 Compiler Construction 7
BITS Pilani, Pilani Campus
Parse Tree for 9-5+2
E→ T R
R→ addop T {print(addop)}R | ε
T→ num {print(num)}
addop → + | –
CS F363 Compiler Construction 8
BITS Pilani, Pilani Campus
Evaluation of Translation Schemes
Assume attribute equations are terminal symbols
• Perform depth first order traversal to obtain 9 5 – 2 +
When designing translation scheme, ensure
attribute value is available when referred to
CS F363 Compiler Construction 9
BITS Pilani, Pilani Campus
CS F363 Compiler Construction 10
BITS Pilani, Pilani Campus
An inherited attribute for a symbol on RHS of a production must
be computed in an action before that symbol
• S → A1 A2 {A1 .in = 1,A2 .in = 2}
• A → a {print([Link])}
Depth first order traversal gives error (undefined)
• A synthesized attribute for the non terminal on the LHS can be computed after
all attributes it references, have been computed. The action normally should be
placed at the end of RHS.
CS F363 Compiler Construction 11
BITS Pilani, Pilani Campus
Syntax Directed Definition
• D→TL [Link] = [Link]
• T → real [Link] = real
• T → int [Link] = int
• L → L1 , id L1 .in = [Link]
addtype([Link], [Link])
• L → id addtype ([Link],[Link]) 12
CS F363 Compiler Construction BITS Pilani, Pilani Campus
Translation Schemes: An Example
Consider the Translation Scheme for the L-Attributed Definition for “type declarations”:
D → T {[Link] := T .type } L
T → int {T .type :=int }
T → real {T .type :=real }
L → { L [Link] := [Link] } L1, id {addtype([Link], [Link]) }
L → id {addtype([Link], [Link]) }
CS F363 Compiler Construction 13
BITS Pilani, Pilani Campus
Translation Schemes: An Example (Cont.)
The parse-tree with semantic actions for the input real id1, id2, id3 is:
D D → T {[Link] := T .type } L
T → int {T .type :=int }
T → real {T .type :=real }
L → { L [Link] := [Link] } L1, id {addtype([Link], [Link]) }
T [Link]:=[Link] L
L → id {addtype([Link], [Link]) }
real [Link] := real[Link]:=[Link] L1 , id 3 addtype(id 3 . entry , L. in )
L2 .in:=L1 .in L2
, id 2 addtype(id2 .entry , L1. in)
id1 addtype(id1 .entry , L2 .in )
Traversing the Parse-Tree in depth-first order we can evaluate the attributes.
CS F363 Compiler Construction 14
BITS Pilani, Pilani Campus
Design of Translation Schemes
When designing a Translation Scheme we must be sure that an
attribute value is available when a semantic action is executed.
• When the semantic action involves only synthesized attributes.
• The action can be put at the end of the production
Example: The following Production and Semantic Rule:
• T →T1∗F [Link]:=[Link]∗[Link] yield the translation scheme
• T →T1∗F {[Link] := [Link]∗[Link]}
CS F363 Compiler Construction 15
BITS Pilani, Pilani Campus
Design of Translation Schemes……
Rules for Implementing L-Attributed SDD’s: If we have an L-Attributed
Syntax-Directed Definition we must enforce the following restrictions:
• Attributes in a Translation Scheme following the previous rules can be computed at
compile time similarly to the evaluation of S-Attributed Definitions.
An inherited attribute for a symbol in the right-hand side of a production
must be computed in an action before the symbol;
• A synthesized attribute for the non terminal on the left-hand side can only be computed
when all the attributes it references have been computed.
• The action is usually put at the end of the production.
CS F363 Compiler Construction 16
BITS Pilani, Pilani Campus
Bottom-up Evaluation of Inherited Attributes
Eliminate attribute equations from the translation
scheme.
• Make transformation so that embedded actions occur only at the
end of their productions.
Replace each action by a distinct marker non-
terminal M and attach action at end of M → ∈
CS F363 Compiler Construction 17
BITS Pilani, Pilani Campus
L-Attributed Definitions to S-Attributed Definitions
E→TR E→TR
R → + T {print (+)} R R→+TMR
R → - T {print (-)} R R→-TNR
R→∈ R→∈
T→num{print([Link])} T → num {print([Link])}
M→∈ {print(+)}
N→∈ {print(-)}
Translation Schemes S-Attributed Definitions
18
CS F363 Compiler Construction BITS Pilani, Pilani Campus
Compiler Construction CS F363
Type Checking
Shashank Gupta
BITS Pilani Assistant Professor
Pilani Campus
Department of Computer Science and Information Systems
BITS Pilani, Pilani Campus
Type Checking
A type is a set of values and operations on those values.
• A language’s type system specifies which operations are valid for a
type.
The aim of type checking is to ensure that operations are
used on the variable/expressions of the correct types.
CS F363 Compiler Construction 2
BITS Pilani, Pilani Campus
Categories of Type System
Untyped • No type checking needs to be done.
Statically • All type checking is done at compile time.
Typed • Also, called strongly typed.
Dynamically • Type checking is done at run time
Typed
CS F363 Compiler Construction 3
BITS Pilani, Pilani Campus
Static Typing
Static Typing
• Catches most common programming errors at compile time
• Avoids runtime overhead
Most code is written using static types languages
• In fact, developers for large/critical system insist that code
should be strongly type checked at compile time even if
language is not strongly typed
CS F363 Compiler Construction 4
BITS Pilani, Pilani Campus
Type Expression
The idea is to associate each language construct with an
expression describing its type and so it is known as Type
Expression.
Type expressions are defined inductively from basic types and
constants using type constructors.
• They are close to type constructors of languages like C.
CS F363 Compiler Construction 5
BITS Pilani, Pilani Campus
Type System
A type system is a collection of rules for assigning
type expressions to various parts of a program
Different type systems may be used by different
compilers for the same language
CS F363 Compiler Construction 6
BITS Pilani, Pilani Campus
Type Expression…..
Used to represent types of language constructs. A type
expression can be
• Basic Type: integer, real, char, Boolean, etc.
• A special type: type-error is used to indicate type violations.
Type Name
• Type Constructor applied to a list of type expressions.
7
CS F363 Compiler Construction BITS Pilani, Pilani Campus
Type Constructors
Array: if T is a type expression then array(I, T) is a type expression
denoting the type of an array with elements of type T and index set I
• int A[10];
• A can have type expression array(0 .. 9, integer)
Product: if T1 and T2 are type expressions then their Cartesian
product T1 × T2 is a type expression.
For e.g. An argument list passed to a function with first argument integer and
second real, has type integer × real.
CS F363 Compiler Construction 8
BITS Pilani, Pilani Campus
Type Constructors…..
Pointer: if T is a type expression, then pointer (T) is a type expression
denoting type pointer to an object of type T
Function: function maps domain set to range set. It is denoted by type
expression D → R (D is the Domain and R is the Range of function)
• int × int → int
• The type of function int* f(char a, char b)is denoted by
• char × char → pointer(int)
CS F363 Compiler Construction 9
BITS Pilani, Pilani Campus
Type Constructors…..
What does Type Expression “(integer → (real → character)”
represents?
It represents a function that takes an integer as an argument and
returns another function which maps a real number to a
character.
CS F363 Compiler Construction 10
BITS Pilani, Pilani Campus
Type Constructors……
Named Records: These are products with named elements.
• For a record structure with two named fields
• length (an integer) and word (of type array(10, char)), the record is of
type
record((length × integer) × (word × array (10, character)))
CS F363 Compiler Construction 11
BITS Pilani, Pilani Campus
Type Systems
Type System of a language is a collection of rules depicting
the type expression assignment to program objects.
• Usually done with Syntax Directed Definition
Type Checker is an implementation of a Type System.
CS F363 Compiler Construction 12
BITS Pilani, Pilani Campus
Specifications of a Type Checker
The type checker is a translation scheme that synthesizes the
type of each expression from the types of its sub-expressions.
The type checker can handle arrays, pointers, statements and
functions.
CS F363 Compiler Construction 13
BITS Pilani, Pilani Campus
Specifications of a Type Checker
Consider a language which consists of a sequence of declarations
followed by a single expression
P → D ; E
D → D ; D | id : T
T → char | integer | T[num] | T*
E → literal | num | E%E | E [E] | *E
CS F363 Compiler Construction 14
BITS Pilani, Pilani Campus
Specifications of a Type Checker
A program generated by this grammar is
• key : integer;
• key % 1999
Assume following
• basic types are char, int, type-error
• all arrays start at 0
• char[256] has type expression array(0 .. 255, char)
CS F363 Compiler Construction 15
BITS Pilani, Pilani Campus
Rules for Symbol Table entry
D → id : T
T → char
T → integer
T → T1*
T → T1 [num]
16
CS F363 Compiler Construction BITS Pilani, Pilani Campus
Rules for Symbol Table entry
D → id : T addtype([Link], [Link])
T → char [Link] = char
T → integer [Link] = int
T → T1* [Link] = pointer(T1 .type)
T → T1 [num] [Link] = array(0….num-1, T1 .type)
17
CS F363 Compiler Construction BITS Pilani, Pilani Campus
Type Checking of Functions
E → E1 (E2) E. type = ([Link] == s and [Link] == s → t) ? t : type-error
CS F363 Compiler Construction 18
BITS Pilani, Pilani Campus
Compiler Construction CS F363
Type Checking
Shashank Gupta
BITS Pilani Assistant Professor
Pilani Campus
Department of Computer Science and Information Systems
BITS Pilani, Pilani Campus
Type Systems
Type System of a language is a collection of rules depicting
the type expression assignment to program objects.
• Usually done with Syntax Directed Definition
Type Checker is an implementation of a Type System.
CS F363 Compiler Construction 2
BITS Pilani, Pilani Campus
Specifications of a Type Checker
The type checker is a translation scheme that synthesizes the
type of each expression from the types of its sub-expressions.
The type checker can handle arrays, pointers, statements and
functions.
CS F363 Compiler Construction 3
BITS Pilani, Pilani Campus
Specifications of a Type Checker
Consider a language which consists of a sequence of declarations
followed by a single expression
P → D ; E
D → D ; D | id : T
T → char | integer | T[num] | T*
E → literal | num | E%E | E [E] | *E
CS F363 Compiler Construction 4
BITS Pilani, Pilani Campus
Specifications of a Type Checker
A program generated by this grammar is
• key : integer;
• key % 1999
Assume following
• basic types are char, int, type-error
• all arrays start at 0
• char[256] has type expression array(0 .. 255, char)
CS F363 Compiler Construction 5
BITS Pilani, Pilani Campus
Rules for Symbol Table entry
D → id : T addtype([Link], [Link])
T → char [Link] = char
T → integer [Link] = int
T → T1* [Link] = pointer(T1 .type)
T → T1 [num] [Link] = array(0….num-1, T1 .type)
6
CS F363 Compiler Construction BITS Pilani, Pilani Campus
Type Checking of Functions
E → E1 (E2) E. type = ([Link] == s and [Link] == s → t) ? t : type-error
CS F363 Compiler Construction 7
BITS Pilani, Pilani Campus
Type Checking for Expressions
E → literal [Link] = char
E → num [Link] = integer
E → id [Link] = lookup([Link])
The synthesized attribute type for E gives the type expression assigned by
the type system to the expression generated by E
CS F363 Compiler Construction 8
BITS Pilani, Pilani Campus
Type Checking for Expressions…..
E → E1 % E2 [Link] = if E1 .type == integer and E2 .type == integer
then integer else type_error
CS F363 Compiler Construction 9
BITS Pilani, Pilani Campus
Type Checking for Expressions…..
E → E1 [ E2 ] [Link] = if [Link]==integer and E1 .type==array(s, t)
then t else type_error
In an array reference E1 [ E2 ], the index expression E2 must have type
integer, in which case the result is the element type t obtained from the type
array(s, t) of E1
10
CS F363 Compiler Construction BITS Pilani, Pilani Campus
Type Checking for Expressions…..
E → *E1 [Link] = if [Link] == pointer(t) then t else
type_error
Within expressions, the prefix operator * yields the object pointed to by its
operand. The type of *E is the type t of the object pointed to by the pointer E
11
CS F363 Compiler Construction BITS Pilani, Pilani Campus
Type checking for Statements
A program now consists of declarations followed by
statements
• Statements normally do not have any type value, hence of type
void.
The statements we consider are assignment,
conditional, and while statements.
• Sequence of statements are separated by semicolons
CS F363 Compiler Construction 12
BITS Pilani, Pilani Campus
Revised Specifications of a Type Checker
P → D ; S
D → D ; D | id : T
T → char | integer | T[num] | T*
E → literal | num | E%E | E [E] | *E
S → id := E
S → if E then S1
S → while E do S1
S → S1 ; S2
CS F363 Compiler Construction 13
BITS Pilani, Pilani Campus
Type checking for Statements
The previous rules for checking expressions are still
needed since statements can have expressions within them.
For propagating type error occurring in some statement
nested deep inside a block, a set of rules are required.
CS F363 Compiler Construction 14
BITS Pilani, Pilani Campus
Type checking for Statements
S → id := E [Link] = if [Link] == [Link] then
void else type_error
15
CS F363 Compiler Construction BITS Pilani, Pilani Campus
Type checking for Statements….
S → if E then S1 [Link] = if [Link] == boolean then [Link] else
type_error
CS F363 Compiler Construction 16
BITS Pilani, Pilani Campus
Type checking for Statements
S → while E do S1 [Link] = if [Link] == boolean then [Link] else type_error
CS F363 Compiler Construction 17
BITS Pilani, Pilani Campus
Type checking for Statements
S → S1 ; S2 [Link] = if [Link] == void and [Link] == void then void else
type_error
In this rule, a sequence of statements has type void only if each sub-
statement has type void.
18
CS F363 Compiler Construction BITS Pilani, Pilani Campus
Type Conversion
Consider expression like x + i where x is of type real and
i is of type integer
• Internal representations of integers and reals are different in a computer
• Different machine instructions are used for operations on integers and reals
The compiler has to convert both the operands to the same
type
• Language definition specifies what conversions are necessary.
19
CS F363 Compiler Construction BITS Pilani, Pilani Campus
Type Conversion…..
Usually, conversion is to the type of the left hand side
• Type checker is used to insert conversion operations:
•x + i is converted into x real+ inttoreal(i)
Type conversion is called implicit/coercion if done by
compiler.
• It is limited to the situations where no information is lost.
• Conversions are explicit if programmer has to write something to cause
conversion.
20
CS F363 Compiler Construction BITS Pilani, Pilani Campus
Type Checking for Expressions
E → num
E → [Link]
E → id
E → E1 op E2
CS F363 Compiler Construction 21
BITS Pilani, Pilani Campus
Type Checking for Expressions
E → num [Link] = int
E → [Link] [Link] = real
E → id [Link] = lookup( [Link] )
CS F363 Compiler Construction 22
BITS Pilani, Pilani Campus
Type Checking for Expressions……
E → E1 op E2 [Link] = if E1 .type == int && E2 .type == int
then int
elseif E1 .type == int && E2 .type == real
then real
elseif E1 .type == real && E2 .type == int
then real
elseif E1 .type == real && E2 .type==real then
real
23
CS F363 Compiler Construction BITS Pilani, Pilani Campus
Compiler Construction CS F363
Type Checking
Shashank Gupta
BITS Pilani Assistant Professor
Pilani Campus
Department of Computer Science and Information Systems
BITS Pilani, Pilani Campus
Outline of Lecture
• Overloading of Functions
• Overloaded Function Resolution
• Narrowing the Set of Types
• Implementation of Symbol Table for Local Scope
CS F363 Compiler Construction 2
BITS Pilani, Pilani Campus
Overloaded Functions and Operators
Overloaded symbol has different meaning depending
upon the context
• In math, + is overloaded; used for integer, real, complex, etc.
In Ada, () is overloaded; used for array, function call,
etc.
• Overloading is resolved when a unique meaning for an
occurrence of a symbol is determined
CS F363 Compiler Construction 3
BITS Pilani, Pilani Campus
Overloaded Functions and Operators……
In Ada standard interpretation of * is multiplication of integers.
However, it may be overloaded by saying
• function “*” (i, j: integer) return complex;
• function “*” (i, j: complex) return complex;
• Possible type expression for “ * ” include:
• integer x integer → integer
• integer x integer → complex
• complex x complex → complex
4
CS F363 Compiler Construction BITS Pilani, Pilani Campus
Overloaded Functions Resolution
Suppose only possible type for 2, 3 and 5 is integer.
• Z is a complex variable
3*5 is either integer or complex depending upon the
context
• in 2*(3*5): 3*5 is an integer because 2 is an integer
• in Z*(3*5): 3*5 is complex because Z is complex
CS F363 Compiler Construction 5
BITS Pilani, Pilani Campus
Finding Set of Possible Types…..
E`→ E E`.types = [Link]
E → id [Link] = lookup(id)
E → E1 (E2 ) [Link] = {t |∃s in E2 .types && s→t is in E1 .types}
6
CS F363 Compiler Construction BITS Pilani, Pilani Campus
Set of Possible types for the 3*5……
7
CS F363 Compiler Construction BITS Pilani, Pilani Campus
Finding Set of Possible Types…..
There may be several declarations of an overloaded identifier, so
we assume that a symbol-table entry may contain a set of
possible types
• This set is returned by the lookup function
Similarly, the rules for checking other operators in expressions
can be obtained.
CS F363 Compiler Construction 8
BITS Pilani, Pilani Campus
Type Resolution
Try all possible types of each overloaded function
• Keep track of all possible types.
• Discard invalid possibilities.
• At the end, check if there is a single unique type.
Overloading can be resolved in two passes:
• Bottom up: compute set of all possible types for each expression.
• Top down: Narrow set of possible types based on what could be
used in an expression.
CS F363 Compiler Construction 9
BITS Pilani, Pilani Campus
Narrowing the Set of Possible types
It requires a complete expression to have a unique type.
• Given a unique type from the context we can narrow down the type
choices for each expression.
If this process does not result in a unique type for each sub-
expression then, a type error is declared for the expression.
CS F363 Compiler Construction 10
BITS Pilani, Pilani Campus
Narrowing the set of Possible types….
E` → E E`.types = [Link]
[Link] = if E`.types=={t} then t else type_error
E → id [Link] = lookup(id)
E → E1 (E2 ) [Link] = { t | ∃s in E2 .types && s→t is in E1 .types}
t = [Link]
S = {s | s ∈ [Link] and (s→t) ∈ [Link]}
E2 .unique = if S=={s} then s else type_error
E1 .unique = if S=={s} then s→t else type_error
11
CS F363 Compiler Construction BITS Pilani, Pilani Campus
Symbol Table
Compiler uses symbol table to keep track of scope and
binding information about names.
Changes to table occur
• if a new name is discovered
• if new information about an existing name is discovered
12
CS F363 Compiler Construction BITS Pilani, Pilani Campus
Symbol Table……
Symbol table must have mechanism to
• add new entries
• find existing information efficiently
Two common mechanism
• linear lists: simple to implement, poor performance
• hash tables: greater programming/space overhead, good performance
Compiler should be able to grow symbol table dynamically. If size is fixed, it
must be large enough for the largest program
CS F363 Compiler Construction 13
BITS Pilani, Pilani Campus
Hashed Local Symbol Table
14
CS F363 Compiler Construction BITS Pilani, Pilani Campus
Symbol Table Entries
Each entry corresponds to a declaration of a name
• Format need not be uniform because information depends upon the
usage of the name
Each entry is a record consisting of consecutive words.
• If uniform records are desired, some entries may be kept outside the
symbol table (e.g., variable length strings)
CS F363 Compiler Construction 15
BITS Pilani, Pilani Campus
Symbol Table Entries…….
Information is entered into symbol table at various times
• keywords are entered initially
• Identifier lexemes are entered by lexical analyzer
A name may denote several objects in the same block
int x;
struct x {float y, z; }
• Lexical analyzer returns the name itself and not pointer to symbol table entry.
• Record in the symbol table is created when role of the name becomes clear.
• In this case two symbol table entries will be created.
CS F363 Compiler Construction 16
BITS Pilani, Pilani Campus
Declarations
P→D
D→D;D
D → id : T
T → integer
T → real
CS F363 Compiler Construction 17
BITS Pilani, Pilani Campus
Declarations……
For each name create symbol table entry with information like
type and relative address
P → {offset=0} D
D→D;D
D → id : T enter([Link], [Link], offset); offset = offset + [Link]
T → integer [Link] = integer; [Link] = 4
T → real [Link] = real; [Link] = 8
CS F363 Compiler Construction 18
BITS Pilani, Pilani Campus
Declarations……
T → array [ num ] of T1 [Link] = array([Link], T1 .type)
[Link] = [Link] x T1 .width
T → *T1 [Link] = pointer(T1 .type)
[Link] = 4
CS F363 Compiler Construction 19
BITS Pilani, Pilani Campus
Compiler Construction CS F363
Symbol Table
Shashank Gupta
BITS Pilani Assistant Professor
Pilani Campus
Department of Computer Science and Information Systems
BITS Pilani, Pilani Campus
Declarations
P→D
D→D;D
D → id : T
T → integer
T → real
CS F363 Compiler Construction 2
BITS Pilani, Pilani Campus
Declarations……
For each name create symbol table entry with information like
type and relative address
P → {offset=0} D
D→D;D
D → id : T enter([Link], [Link], offset); offset = offset + [Link]
T → integer [Link] = integer; [Link] = 4
T → real [Link] = real; [Link] = 8
CS F363 Compiler Construction 3
BITS Pilani, Pilani Campus
Declarations……
T → array [ num ] of T1 [Link] = array([Link], T1 .type)
[Link] = [Link] x T1 .width
T → *T1 [Link] = pointer(T1 .type)
[Link] = 4
CS F363 Compiler Construction 4
BITS Pilani, Pilani Campus
Nested Scope of Procedures
program sort;
var a : array[1..n] of integer;
x : integer;
procedure readarray;
var i : integer;
……
procedure exchange(i,j:integers);
……
procedure quicksort(m,n : integer);
var k,v : integer;
function partition(x,y:integer):integer;
var i,j: integer;
……
……
begin{main}
……
end. 5
CS F363 Compiler Construction BITS Pilani, Pilani Campus
Design of Symbol Table for Nested Procedures
6
BITS Pilani, Pilani Campus
Keeping track of local information
When a nested procedure is seen, processing of declaration in enclosing
procedure is temporarily suspended.
• assume following language
•P → D
• D → D ;D | id : T | proc id ;D ; S
A new symbol table is created when procedure declaration
D → proc id ; D1 ; S is seen
• entries for D1 are created in the new symbol table.
• the name represented by id is local to the enclosing procedure.
CS F363 Compiler Construction 7
BITS Pilani, Pilani Campus
Things to be Done…..
1.1 Create Symbol Table
2.1 Compute offset for each scope
2.2 Remember the offset of previous scope
1.2 Remember the pointer to the Symbol Table
1.3 Remember the pointer to the Previous Symbol Table
CS F363 Compiler Construction 8
BITS Pilani, Pilani Campus
Creating Symbol Table: Interface
mktable (previous): create a new symbol
table and return a pointer to the new table.
• The argument previous points to the enclosing procedure.
enter (table, name, type, offset)
• Creates a new entry
CS F363 Compiler Construction 9
BITS Pilani, Pilani Campus
Creating Symbol Table: Interface……..
addwidth (table, width): Records cumulative
width of all the entries in a table
•enterproc (table, name, newtable) creates a new entry for
procedure name.
•newtable points to the symbol table of the new procedure.
Maintain two stacks : (1) Symbol tables and (2) Offsets
• Standard stack operations: Push, Pop, Top
10
CS F363 Compiler Construction BITS Pilani, Pilani Campus
Declarations in Nested Procedures
P → D
D → D ;D
D → id : T
D → proc id ;D ; S
CS F363 Compiler Construction 11
BITS Pilani, Pilani Campus
Creating Symbol Table……..
P → {t = mktable (nil);
push(t,tblptr);
push(0,offset)}
D
{addwidth(top(tblptr),top(offset));
pop(tblptr); pop(offset)}
D → D;D
12
CS F363 Compiler Construction BITS Pilani, Pilani Campus
Creating Symbol Table ..…
D → proc id; {t = mktable(top(tblptr));
push(t, tblptr);
push(0, offset)}
D1 ; S
{t = top(tblptr);
addwidth(t, top(offset));
pop(tblptr); pop(offset);
enterproc(top(tblptr), [Link], t)}
13
CS F363 Compiler Construction BITS Pilani, Pilani Campus
Creating symbol table …
D → id: T {enter(top(tblptr), [Link], [Link], top(offset));
top(offset) = top (offset) + [Link]}
14
CS F363 Compiler Construction BITS Pilani, Pilani Campus
Compiler Construction CS F363
Intermediate Code Generation
Shashank Gupta
BITS Pilani Assistant Professor
Pilani Campus
Department of Computer Science and Information Systems
BITS Pilani, Pilani Campus
Intermediate Code Representation
Front end translates a source program into an intermediate
representation.
• Back end generates target code from intermediate representation
Benefits
• Retargeting is possible.
• Machine independent code optimization is possible is a set of values and
operations on those values.
CS F363 Compiler Construction 2
BITS Pilani, Pilani Campus
Position of Intermediate Code Generator
CS F363 Compiler Construction 3
BITS Pilani, Pilani Campus
Abstract Syntax Tree/DAG
Condensed form of a parse tree.
• Useful for representing language constructs.
Depicts the natural hierarchical structure of the source program
• Each internal node represents an operator.
• Children of the nodes represent operands.
• Leaf nodes represent operands.
• DAG is more compact than abstract syntax tree because common sub
expressions are eliminated.
CS F363 Compiler Construction 4
BITS Pilani, Pilani Campus
a := b * -c + b * -c
Abstract Syntax Tree DAG
5
CS F363 Compiler Construction BITS Pilani, Pilani Campus
Postfix Notation
Linearized representation of a syntax tree.
• List of nodes of the tree.
Nodes appear immediately after its children.
CS F363 Compiler Construction 6
BITS Pilani, Pilani Campus
Postfix Notation……
The postfix notation for an expression E is defined as
follows:
• If E is a variable or a constant, then the postfix notation is E itself.
If E is an expression of the form E1 op E2 where op
is a binary operator then the postfix notation for E is
• E1`E2`op where E1`E2` are the postfix notations for E1 and E2
respectively.
CS F363 Compiler Construction 7
BITS Pilani, Pilani Campus
Postfix Notation…….
If E is an expression of the form (E),
• then the postfix notation for (E) is E`.
• No parenthesis are needed in postfix notation.
Postfix notation for a = b×-c + b×-c is
•abc-×bc-×+=
CS F363 Compiler Construction 8
BITS Pilani, Pilani Campus
Three Address Code
Sequence of statements of the general form
X := Y op Z
• X, Y or Z are names, constants or compiler generated
temporaries
• op stands for any operator such as a fixed - or floating-point
arithmetic operator, or a logical operator
CS F363 Compiler Construction 9
BITS Pilani, Pilani Campus
Three Address Code……
Only one operator on the right hand side is allowed. Source expression
like x + y * z might be translated into
• t1 := y * z
• t2 := x + t1 where t1 and t2 are compiler generated temporary names.
Unraveling of complicated arithmetic expressions and of control flow
makes 3-address code desirable for code generation and optimization
• The use of names for intermediate values allows 3-address code to be easily
rearranged
• A linearized representation of a syntax tree where explicit names correspond to the
interior nodes of the graph.
CS F363 Compiler Construction 10
BITS Pilani, Pilani Campus
Three Address Instructions
Assignment
• x = y op z
• x = op y
• x=y
Jump
• goto L
• if x relop y goto L
CS F363 Compiler Construction 11
BITS Pilani, Pilani Campus
Three Address Instructions…..
Indexed assignment
• x = y[i]
• x[i] = y
Function
• param x
• call p,n
• return y
CS F363 Compiler Construction 12
BITS Pilani, Pilani Campus
Three Address Instructions…….
Pointer
•x = &y
•x = *y
•*x = y
CS F363 Compiler Construction 13
BITS Pilani, Pilani Campus
Syntax directed translation of expression into 3-
address code
Two attributes
• [Link], a name that will hold the value of E, and
• [Link], the sequence of three-address statements evaluating E.
• A function gen(…)to produce sequence of three
address statements
CS F363 Compiler Construction 14
BITS Pilani, Pilani Campus
Syntax directed translation of expression into
3-address code……
E → E1 + E2 [Link]:= newtmp
[Link]:= E1 .code || E2 .code ||
gen([Link] := E1 .place +
E2 .place)
CS F363 Compiler Construction 15
BITS Pilani, Pilani Campus
Syntax directed translation of expression into
3-address code…….
S → id := E [Link] := [Link] || gen([Link]:= [Link])
CS F363 Compiler Construction 16
BITS Pilani, Pilani Campus
Syntax directed translation of expression into
3-address code……
E → E1 * E2 [Link]:= newtmp
[Link]:= E1 .code || E2 .code ||
gen([Link] := E1 .place *
E2 .place)
CS F363 Compiler Construction 17
BITS Pilani, Pilani Campus
Syntax directed translation of expression into
3-address code……
E → (E1) [Link] := [Link]
[Link] := E1 .code
CS F363 Compiler Construction 18
BITS Pilani, Pilani Campus
Syntax directed translation of expression into
3-address code……
E → -E1 [Link] := newtmp
[Link] := E1 .code || gen([Link] :=
- E1 .place)
CS F363 Compiler Construction 19
BITS Pilani, Pilani Campus
Compiler Construction CS F363
Intermediate Code Generation
Shashank Gupta
BITS Pilani Assistant Professor
Pilani Campus
Department of Computer Science and Information Systems
BITS Pilani, Pilani Campus
x= (y+z)*(-w+v)
2
CS F363 Compiler Construction BITS Pilani, Pilani Campus
More Examples……
For a = b * -c + b * -c, following code is generated
t1 = -c
t2 = b * t1
t3 = -c
t4 = b * t3
t5 = t2 + t4
a = t5
CS F363 Compiler Construction 3
BITS Pilani, Pilani Campus
Names in the Symbol Table
S → id := E {p = lookup([Link]);
if p <> nil then emit(p := [Link])
else error}
E → id {p = lookup([Link]);
if p <> nil then [Link] = p
else error}
4
CS F363 Compiler Construction BITS Pilani, Pilani Campus
Flow of Control …
S → if E then S1 else S2 [Link] := newlabel
[Link] [Link] := newlabel
if [Link] = 0 goto [Link]
S1 .code [Link] = [Link] ||
goto [Link] gen(if [Link] = 0 goto [Link]) || S1 .code ||
[Link]: gen(goto [Link]) ||
S2 .code gen([Link] :) ||
[Link]: S2 .code ||
gen([Link] :)
5
CS F363 Compiler Construction BITS Pilani, Pilani Campus
Flow of Control ……
S → while E do S1 [Link] := newlabel
Desired Translation is [Link] := newlabel
[Link]: [Link] := gen([Link]:) ||
[Link] [Link] ||
if [Link] = 0 goto [Link] gen(if [Link] = 0 goto [Link]) ||
S1 .code S1 .code ||
goto [Link] gen(goto [Link]) ||
[Link]: gen([Link]:)
6
CS F363 Compiler Construction BITS Pilani, Pilani Campus
Type Conversion within Assignments
E → E1+ E2 [Link]= newtmp;
if E1 .type = integer and E2 .type = integer then
emit([Link] ':=' E1 .place 'int+' E2 .place);
I/P: x = y+i*j
[Link] = integer;
assuming x and y
have type real and i …
and j have type similar code if both E1 .type and E2 .type are real
integer else if [Link] = int and [Link] = real then
u = newtmp;
t1 := i int* j emit(u ':=' inttoreal [Link]);
t2 := inttoreal t1
emit([Link] ':=' u 'real+' [Link]);
t3 := y real+ t2
x = t3 [Link] = real;
…
similar code if [Link] is real and [Link] is integer
else
[Link] = type_error;
7
CS F363 Compiler Construction BITS Pilani, Pilani Campus
Boolean Expressions
• Compute logical values E → E or E
• Change the flow of control E and E
• Boolean operators are: and or not not E
(E)
id relop id
true
false
8
CS F363 Compiler Construction BITS Pilani, Pilani Campus
Methods of Translation
Evaluate similar to arithmetic expressions –
• Normally use 1 for true and 0 for false
Implement by flow of control
• Given expression E1 or E2 if E1 evaluates to true then
E1 or E2 evaluates to true without evaluating E2
CS F363 Compiler Construction 9
BITS Pilani, Pilani Campus