UNIT - VII
Intermediate Code
Generation
Santanu Halder Compiler Design
1
Intermediate Code Generation
Santanu Halder Compiler Design
2
Why need an intermediate code?
1. If a compiler translates the source language to its target
machine language without having the option for generating
intermediate code, then for each new machine, a full native
compiler is required.
2. Intermediate code eliminates the need of a new full
compiler for every unique machine by keeping the analysis
portion same for all the compilers.
3. It becomes easier to apply the source code modifications
to improve code performance by applying code
optimization technique on the intermediate code.
Santanu Halder Compiler Design
3
Directed Acyclic Graph
A Directed Acyclic Graph (DAG) for an expression identifies
the common sub-expressions of the expression. DAG can be
constructed by using the same techniques that construct
syntax tree.
Example:
a+a*(b-c)+(b-c)*d
Santanu Halder Compiler Design
4
Directed Acyclic Graph
A Directed Acyclic Graph (DAG) for an expression identifies
the common sub-expressions of the expression. DAG can be
constructed by using the same techniques that construct
syntax tree.
Example:
a+a*(b-c)+(b-c)*d
Santanu Halder Compiler Design
5
Intermediate Representation (IR)
High Level IR
High Level Intermediate code representation is very close to
the source language itself. They can be easily generated from
the source code and we can easily apply code modifications
to enhance performance. But for target machine optimization,
it is less preferred.
Low Level IR
Low Level Intermediate code representation is close to the
target machine which makes it suitable for register and
memory allocation, instruction set selection etc. It is good for
machine-dependent optimizations.
Intermediate code can be either language specific e.g. ByteCode for
Java or language independent three-address code
Santanu Halder Compiler Design
6
Three address code
Intermediate code generator receives input from its
predecessor phase, semantic analyzer, in the form of an
annotated syntax tree. That syntax tree then can be
converted into a linear representation, e.g. postfix notation.
Intermediate code tends to be machine independent code.
Therefore, code generation assumes to have unlimited
number of memory storage register to generate code.
Example:
t1 = c * d;
a=b+c*d t2 = b + t1;
t3 = t2 + t1;
a = t3;
Santanu Halder Compiler Design
7
Three address code
In a three address code there is at most one operator at
the right side of an instruction.
Example:
t1 = b – c
t2 = a * t1
t3 = a + t2
t4 = t1 * d
t5 = t3 + t4
Santanu Halder Compiler Design
8
Three address instruction forms
x = y op z
x = op y
x=y
goto L
if x goto L and if False x goto L
if x relop y goto L
Procedure calls using:
param x
call p,n
y = call p,n
x = y[i] and x[i] = y
x = &y and x = *y and *x =y
Santanu Halder Compiler Design
9
Three address code
do 1. i=i+1;
{ 2. t1=i*2; //Assume each element of a[] takes 2 bytes
i=i+1; 3. t2=a[t1];
}while(a[i]<v) 4. if t2 < v goto 1
Santanu Halder Compiler Design
10
Three address code..
x = (-b + sqrt(b*b - 4*a*c)) / (2*a)
t1 = b * b
t2 = 4 * a
t3 = t2 * c
t4 = t1 - t3
t5 = sqrt(t4)
t6 = 0 - b
t7 = t5 + t6
t8 = 2 * a
t9 = t7 / t8
x = t9
Santanu Halder Compiler Design
11
Three address code..
for (i = 0; i < 10; i++) 1. i=0
{ 2. if i >= 10 goto 9
b[i] = i*i; 3. t1=i*2
} 4. t2= i*i;
5. t3 := b + t1; //address to store i*i
6. *t3 := t2; //store through pointer
7. i=i+1
8. goto 2; //repeat loop
9.
Santanu Halder Compiler Design
12
Three address code..
//10×10 identity matrix
//Assume i,j>0
for (i=0; i<10; i++)
for( j=0; j<10; j++)
a[i][j] = 0.0;
for(i=0;i<10;i++)
a[i][i]=1.0;
Santanu Halder Compiler Design
13
Three address code..
//10×10 identity matrix 1. i=0 11. i=0
//Assume i,j>0 2. j=0 12. t4=i*10
for (i=0; i<10; i++) 3. t1=i*10 13. t5=t4+i
for( j=0; j<10; j++) 4. t2=t1+j 14. t6=t5*2
a[i][j] = 0.0; 5. t3=t2*2 15. a[t6]=1.0
for(i=0;i<10;i++) 6. a[t3]=0.0 16. i=i+1
a[i][i]=1.0; 7. j=j+1 17. if i<10 goto 12
8. if j<10 goto 4
9. i=i+1;
10. if i<10 goto 2
Santanu Halder Compiler Design
14
Three address code..
void quicksort(int m, int n)
{
int i, j, v, x;
if(n<=m) return;
/*fragment begins here*/
i=m-1; j=n; v=a[n];
while(1)
{
do{ i=i+1;} while(a[i]<v);
do{ j=j-1;} while(a[j]>v);
if(i>=j) break;
x=a[i]; a[i]=a[j]; a[j]=x;
}
x=a[i]; a[i]=a[n]; a[n]=x;
/*fragment ends here*/
quiksort(m,j);
quicksort(j+1,n);
}
Santanu Halder Compiler Design
15
Three address code..
void quicksort(int m, int n) 1. i=m-1
{ 2. j=n
int i, j, v, x; 3. t1=n*2
if(n<=m) return; 4. v=a[t1]
/*fragment begins here*/ 5. i=i+1
i=m-1; j=n; v=a[n]; 6. t2=i*2
while(1) 7. t3=a[t2]
{ 8. If t3<v goto 5
do{ i=i+1;} while(a[i]<v); 9. j=j-1
do{ j=j-1;} while(a[j]>v); 10. t4=j*2
if(i>=j) break; 11. t5=a[t4]
x=a[i]; a[i]=a[j]; a[j]=x; 12. If t5>v goto 9
} 13. If i>=j goto 21
x=a[i]; a[i]=a[n]; a[n]=x; 14. t6=i*2
/*fragment ends here*/ 15. x=a[t6]
quiksort(m,j); 16. t7=i*2
quicksort(j+1,n); 17. t8=j*2
} 18. t9=a[t8]
Santanu Halder Compiler Design
16
Three address code..
void quicksort(int m, int n) 1. i=m-1 19. a[t7]=t9
{ 2. j=n 20. t10=j*2
int i, j, v, x; 3. t1=n*2 21. a[t10]=x
if(n<=m) return; 4. v=a[t1] 22. goto 5
/*fragment begins here*/ 5. i=i+1 23. t11=i*2
i=m-1; j=n; v=a[n]; 6. t2=i*2 24. x=a[t11]
while(1) 7. t3=a[t2] 25. t12=i*2
{ 8. If t3<v goto 5 26. t13=n*2
do{ i=i+1;} while(a[i]<v); 9. j=j-1 27. t14=a[t13]
do{ j=j-1;} while(a[j]>v); 10. t4=j*2 28. a[t12]=t14
if(i>=j) break; 11. t5=a[t4] 29. t15=n*2
x=a[i]; a[i]=a[j]; a[j]=x; 12. If t5>v goto 9 30 a[t15]=x
} 13. If i>=j goto 21
x=a[i]; a[i]=a[n]; a[n]=x; 14. t6=i*2
/*fragment ends here*/ 15. x=a[t6]
quiksort(m,j); 16. t7=i*2
quicksort(j+1,n); 17. t8=j*2
} 18. t9=a[t8]
Santanu Halder Compiler Design
17
Three address code data structure
Quadruples
Has four fields: op, arg1, arg2 and result
Triples
Temporaries are not used and instead references to
instructions are made
Indirect triples
In addition to triples we use a list of pointers to triples
Santanu Halder Compiler Design
18
Quadruples
Each instruction is divided into four fields: (1) Operator (2)
Arg1 (3) Arg2 and (4) Result
Example
b * minus c + b * minus c
Three address code
t1 = minus c
t2 = b * t1
t3 = minus c
t4 = b * t3
t5 = t2 + t4
a = t5
Santanu Halder Compiler Design
19
Triples
Each instruction has three fields: (1) Operator, (2) Arg1 and (3)
Arg2. The results of respective sub-expressions are denoted by the
position of expression. Triples face the problem of code
immovability while optimizations, as the results are positional and
changing the order or position of an expression may cause
problems.
Example
b * minus c + b * minus c
Three address code
t1 = minus c
t2 = b * t1
t3 = minus c
t4 = b * t3
t5 = t2 + t4
a = t5
Santanu Halder Compiler Design
20
Indirect Triples
This representation is an enhancement over triples
representation. It uses pointers instead of position to store
results. This enables the optimizers to freely re-position the
sub-expression to produce an optimized code.
Example
b * minus c + b * minus c
Three address code
t1 = minus c
t2 = b * t1
t3 = minus c
t4 = b * t3
t5 = t2 + t4
a = t5
Santanu Halder Compiler Design
21
Declarations
A variable or procedure has to be declared before it can be used.
Declaration involves allocation of space in memory and entry of type and
name in the symbol table. A program may be coded and designed keeping
the target machine structure in mind, but it may not always be possible to
accurately convert a source code to its target language.
Taking the whole program as a collection of procedures and sub-
procedures, it becomes possible to declare all the names local to the
procedure. Memory allocation is done in a consecutive manner and
names are allocated to memory in the sequence they are declared in the
program. We use offset variable and set it to zero {offset = 0} that denote
the base address.
The source programming language and the target machine architecture
may vary in the way names are stored, so relative addressing is used.
While the first name is allocated memory starting from the memory
location 0 {offset=0}, the next name declared later, should be allocated
memory next to the first one.
Santanu Halder Compiler Design
22
Declarations..
Take the example of C programming language where an integer variable
is assigned 2 bytes of memory and a float variable is assigned 4 bytes of
memory.
Santanu Halder Compiler Design
23
End of Unit VII
Santanu Halder Compiler Design
24