You are on page 1of 35

INTERMEDIATE CODE GENERATION &

RUNTIME ENVIRNOMENTS
INTRODUCTION
 It is the 4th phase of compiler.
 Last phase of semantic analyzer.

 Advantages:
 Retargeting: Target code can be generated to any machine
just by attaching new machine as the back end.
 It is possible to apply machine independent code
optimization. This helps in faster generation of code.
HOW TO CHOOSE THE INTERMEDIATE REPRESENTATION?

 It should be easy to translate


1.source language  intermediate representation
2.Intermediate representation  machine code .
 The intermediate representation should be suitable for
optimization.
 It should be neither too high level nor too low level.

 One can have more than one intermediate representation


in a single compiler.
INTERMEDIATE CODE
REPRESENTATIONS
 General forms of intermediate representations (IR):
1. Graphical IR
Abstract syntax trees(AST)
Direct Acyclic Graph(DAG)
2. Linear IR (ie., non graphical)
 Postfix
Three address Code(TAC)
ABSTRACT SYNTAX TREE
 Also called Syntax Tree.
 A syntax tree depicts the natural hierarchical structure of
a source program.
 It is similar to expression tree but different from parse
tree.
 In Syntax tree no grammar symbols are present.

 The internal nodes represents operators and terminal


nodes represents operands
AST
 Eg: AST and Parse tree corresponds to 2*7+3
AST

 Draw AST for the below expression:


a : =b * - c + b * - c
DIFFERENT REPRESENTATIONS OF AST
 Pointer Representation
 Tabular Representation

1.Pointer representation :
Here each node is represented as a record with a field for
its operator and additional fields for pointers to its
children
2. Tabular Representation:
Here Nodes are allocated from an array of records and
the index or position of the node serves as the pointer to
the node.
REPRESENTATION: EXAMPLE
1 2
SDD TO PRODUCE SYNTAX TREE
DIRECT ACYCLIC GRAPH
 A DAG is a directed graph without cycle.
 Used to overcome the disadvantage of AST.

 A DAG(Directed Acyclic Graph) gives the same


information but in a more compact way.
 Its main use is to identify the common sub expressions.
DAG

 Draw DAG for the expression:


a:=b*-c+b*-c
DAG
LINEAR REPRESENTATIONS: POSTFIX

 Post fix form for the expression a+b is : a b +


 Also called Reverse Polish Form.

 Postfix notation is a linearized representation of a syntax


tree.
 Eg: 2*7+3 = 2 7 * 3 +
POST FIX
 Write down the postfix form for :
a : =b * - c + b * - c .
THREE ADDRESS CODE

 Three-address code is a sequence of statements of the


general form x := y op z.
 Where x, y and z are names, constants, or compiler-
generated temporaries; op stands for any operator.
 Three-address code is a linear zed representation of a
syntax tree.
 Eg: IC for the source language statement : x+ y*z is:

t1 : = y * z
t2 : = x + t1
where t1 and t2 are compiler-generated temporary names.
THREE ADDRESS CODE- TAC
 Write TAC for the following expression:
1. a= b*-c + b*-c
2. 2*7+6
3.
TYPES OF THREE-ADDRESS STATEMENTS
 1. Assignment statements:
 a. x := y op z, where op is a binary operator
 b. x := op y, where op is a unary operator
 2. Copy statements

 a. x := y
 3. The unconditional jumps:

 a. goto L
 4. Conditional jumps:

 a.if x relop y goto L


 5. Assignments:

 a. x := y[i]
 b. x[i] := y
 7. Address and pointer assignments:

 a. x := &y, x := *y, and *x = y


IC FOR C PROGRAM FRAGMENT : EG:1
 C-Program
Int sum, i;
sum= 0;
for (i=0; i<10; i++)
sum=sum+i
 Intermediate code
sum = 0;
i = 0;
t1=0
L1: if(i >= 10) goto L2
t2=sum+t1
sum=t2
t1=i+1
goto L1
IC FOR C PROGRAM FRAGMENT : EG:2
DATA STRUCTURES FOR TAC
(IMPLEMENTATION)
 How to present these instructions in a data structure?
 – Quadruples

 – Triples

 – Indirect triples
QUADRUPLE
 A quadruple is a record structure with four fields, which
are, op, arg1, arg2 and result.
 The op field contains an internal code for the operator.

 The three-address statement x : =y op z is represented by


placing y in arg1, z in arg2 and x in result.
 The contents of fields arg1, arg2 and result are normally
pointers to the symbol-table entries for the names
represented by these fields. If so, temporary names must
be entered into the symbol table as they are created.
QUADRUPLE EXAMPLE

 Expression a=b*-c+b*-c
TRIPLE
 To avoid entering temporary names into the symbol table, we
might refer to a temporary value by the position of the
statement that computes it.
 If we do so, three-address statements can be represented by
records with only three fields: op, arg1 and arg2.
 The fields arg1 and arg2, for the arguments of op, are either
pointers to the symbol table or pointers into the triple structure
( for temporary values ).
 Since three fields are used, this intermediate code format is
known as triples
TRIPLE EXAMPLE
 Expression : a=b*-c+b*-c
INDIRECT TRIPLE
 Another implementation of three-address code .
 Rather than listing the triples themselves, listing pointers
to triples, . This implementation is called indirect triples.
 For example, let us use an array statement to list
pointers to triples in the desired order.
INDIRECT TRIPLE EXAMPLE

You might also like