
CODE GENERATION

Three-address statements → Code generator → Object program

General Issues in Code generation:

• Deciding what machine instructions to generate.
• Deciding in what order computations should be done.
• Deciding which registers to use.

Forms of Object program

• Absolute machine code
• Relocatable machine code
• Assembly language code

Basic Block

A basic block is a sequence of consecutive statements in
which flow of control enters at the beginning and leaves at
the end, without halting or branching except at the end.
1. a := b+c
2. d := d-b
3. e := a+f
4. if a>b goto 7
5. f := a-d
6. goto 10
7. b := d+f
8. e := a-c
9. if b>c goto 15
10. b := d+c
11. if a>b goto 1
Algorithm: Partition into basic blocks

Input: A sequence of three-address statements.

Output: A list of basic blocks with each three-address statement
in exactly one block.

Method:
1) We first determine the set of leaders, the first statements
of basic blocks. The rules are:
a) The first statement is a leader.
b) Any statement that is the target of a conditional or
unconditional goto is a leader.
c) Any statement that immediately follows a goto
or conditional goto statement is a leader.
2) For each leader, its basic block consists of the leader and all
statements up to but not including the next leader or the end
of the program.
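
The method above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function names `find_leaders` and `partition`, and the representation of the program as a dictionary from statement number to statement text, are assumptions made for the example. Jump targets that lie outside the fragment (such as 15) are simply ignored.

```python
import re

def find_leaders(stmts):
    """stmts: {statement number: text}. Returns the sorted leader numbers."""
    numbers = sorted(stmts)
    leaders = {numbers[0]}                       # rule (a): first statement
    for pos, n in enumerate(numbers):
        m = re.search(r"goto\s+(\d+)", stmts[n])
        if m:
            target = int(m.group(1))
            if target in stmts:                  # rule (b): target of a goto
                leaders.add(target)
            if pos + 1 < len(numbers):           # rule (c): follows a goto
                leaders.add(numbers[pos + 1])
    return sorted(leaders)

def partition(stmts):
    """Step 2: each block runs from a leader up to, not including, the next."""
    numbers = sorted(stmts)
    leaders = find_leaders(stmts)
    bounds = leaders[1:] + [numbers[-1] + 1]
    return [[n for n in numbers if lo <= n < hi]
            for lo, hi in zip(leaders, bounds)]

fragment = {
    1: "a := b+c", 2: "d := d-b", 3: "e := a+f", 4: "if a>b goto 7",
    5: "f := a-d", 6: "goto 10", 7: "b := d+f", 8: "e := a-c",
    9: "if b>c goto 15", 10: "b := d+c", 11: "if a>b goto 1",
}
print(find_leaders(fragment))
print(partition(fragment))
```

Run on the eleven-statement fragment shown earlier, the sketch reports leaders 1, 5, 7 and 10 and four blocks.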
Fragment of code to be partitioned into basic blocks
1. a := b+c
2. d := d-b
3. e := a+f
4. if a>b goto 7
5. f := a-d
6. goto 10
7. b := d+f
8. e := a-c
9. if b>c goto 15
10. b := d+c
11. if a>b goto 1
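Applying the algorithm, we identify statements 1, 5, 7 and 10 as
leaders:
1. a := b+c            (leader: first statement; target of goto 1)
2. d := d-b
3. e := a+f
4. if a>b goto 7
5. f := a-d            (leader: follows a conditional goto)
6. goto 10
7. b := d+f            (leader: target of goto 7; follows a goto)
8. e := a-c
9. if b>c goto 15
10. b := d+c           (leader: target of goto 10; follows a conditional goto)
11. if a>b goto 1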
The resulting basic blocks:

B1:
1. a := b+c
2. d := d-b
3. e := a+f
4. if a>b goto 7

B2:
5. f := a-d
6. goto 10

B3:
7. b := d+f
8. e := a-c
9. if b>c goto 15

B4:
10. b := d+c
11. if a>b goto 1
Computing Next uses of variables in a
basic block

i: x := … + …;
   …  (no intervening assignments to x)
j: y := … + x * …;

Statement j uses the value of x computed at i, so the next use of x at i is j.

To find next uses in a basic block, we perform a backward scan
from the end of the basic block.
Suppose the backward scan reaches the three-address statement i: x := y OP z.
Then do the following:

1. Attach to statement i the information currently found in the
symbol table regarding the next use and liveness of x, y and z.

2. In the symbol table, set x to "not live" and "no next use".

3. In the symbol table, set y and z to "live" and the next uses
of y and z to i.

Steps 2 and 3 must be applied in this order, because x may be the same name as y or z.
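
The three steps can be sketched as a backward scan in Python. This is an illustration under stated assumptions: the function name `next_use`, the tuple encoding `(number, x, y, z)` for `x := y OP z`, and the pair `(live, next_use)` with 0 standing for "no next use" are choices made for the example. The set of names live on exit must be supplied by the caller; here it is taken to be a, c, d, e, f for the first three statements of the fragment used earlier.

```python
def next_use(block, live_on_exit):
    """block: list of (number, x, y, z) for x := y OP z, in program order.
    Returns (info attached to each statement, symbol table at block entry)."""
    names = {v for _, x, y, z in block for v in (x, y, z)}
    # At the end of the block only the live-on-exit names are live,
    # and nothing has a next use inside the block yet.
    table = {v: (v in live_on_exit, 0) for v in names}
    info = {}
    for number, x, y, z in reversed(block):       # backward scan
        info[number] = {v: table[v] for v in (x, y, z)}   # step 1
        table[x] = (False, 0)                             # step 2
        table[y] = (True, number)                         # step 3
        table[z] = (True, number)
    return info, table

b1 = [(1, "a", "b", "c"), (2, "d", "d", "b"), (3, "e", "a", "f")]
info, entry = next_use(b1, live_on_exit={"a", "c", "d", "e", "f"})
print(info[1])   # liveness and next use of a, b, c at statement 1
print(entry)     # symbol table on entry to the block
```

Statement 2 (`d := d-b`) shows why step 2 precedes step 3: d is first marked dead as the target and then marked live again as an operand.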
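The following shows the next-use information for the
basic blocks considered earlier. Each name carries a pair
(liveness, next use): "1,n" means live with next use at statement n,
"1,0" means live with no further use in the block, and "d,0" means
dead with no next use. The indented lines give the symbol table at the
program point before or after the adjacent statement.

B1:
        a d,0; b 1,1; c 1,1; d 1,2; e d,0; f 1,3
1. a 1,3 := b 1,2 + c 1,0
        a 1,3; b 1,2; c 1,0; d 1,2; e d,0; f 1,3
2. d 1,0 := d 1,0 - b d,0
        a 1,3; b d,0; c 1,0; d 1,0; e d,0; f 1,3
3. e 1,0 := a 1,0 + f 1,0
        a 1,0; b d,0; c 1,0; d 1,0; e 1,0; f 1,0

B2:
        a 1,5; b d,0; c 1,0; d 1,5; e 1,0; f d,0
5. f 1,0 := a d,0 - d 1,0
        a d,0; b d,0; c 1,0; d 1,0; e 1,0; f 1,0

B3:
        a 1,8; b d,0; c 1,8; d 1,7; e d,0; f 1,7
7. b 1,0 := d 1,0 + f 1,0
        a 1,8; b 1,0; c 1,8; d 1,0; e d,0; f 1,0
8. e 1,0 := a d,0 - c 1,0
        a d,0; b 1,0; c 1,0; d 1,0; e 1,0; f 1,0

B4:
        a d,0; b d,0; c 1,10; d 1,10; e 1,0; f 1,0
10. b 1,0 := d 1,0 + c 1,0
        a d,0; b 1,0; c 1,0; d 1,0; e 1,0; f 1,0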
MACHINE MODEL
• The machine for which we generate code is a byte-addressable
machine with 2^16 bytes (2^15 16-bit words) of memory.
• There are 8 general-purpose registers numbered 0 to 7,
each capable of holding a 16-bit quantity.
• The instructions are of the form

    4 bits   6 bits   6 bits
    OP       source   destination

• The bit patterns in the source and destination fields specify the
nature of the operands, and the words that follow the instruction
contain any operand values or addresses that do not fit in the
instruction word.
• The op codes we refer to are MOV, ADD, SUB.
• The length of the instruction in words is regarded as the cost
of the instruction for analytical purposes.
The table below lists the addressing modes, their bit patterns, and their extra cost in words:

Addressing mode              Form    Bit pattern  Extra cost  Meaning
1. Register mode             r       001xxx       0           Operand is in register xxx.
2. Indirect register mode    *r      010xxx       0           Address of the operand is in register xxx.
3. Indexed mode              X(r)    011xxx       1           Address of the operand is the contents of register xxx plus the value X found in the word that follows the instruction.
4. Indirect indexed mode     *X(r)   100xxx       1           Address of the operand is in the location obtained as the contents of register xxx plus the value X found in the word that follows the instruction.
5. Immediate                 #X      101$$$       1           The word that follows the instruction contains the immediate operand.
6. Absolute                  X       110$$$       1           The word that follows the instruction contains the address of the operand.
Some example instructions and their costs:

Instruction          Cost
MOV R0,R1            1
MOV R5,M             2
ADD #1,R3            2
SUB 4(R0),*5(R1)     3
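
The costs follow from the addressing-mode table: one word for the instruction itself plus the extra cost of each operand. A small Python sketch of that arithmetic is given below. The function names and the textual operand syntax it parses (`R0`, `*R1`, `4(R0)`, `#1`, `M`) are assumptions for the example, not part of the machine definition.

```python
import re

def operand_extra_cost(operand):
    """Extra words needed by one operand, following the addressing-mode table."""
    operand = operand.strip()
    if re.fullmatch(r"\*?R[0-7]", operand):
        return 0          # register r or indirect register *r
    return 1              # X(r), *X(r), #X or absolute X: one following word

def instruction_cost(instruction):
    """Cost in words of a two-operand instruction such as 'MOV R5,M'."""
    _opcode, operands = instruction.split(None, 1)
    source, destination = operands.split(",")
    return 1 + operand_extra_cost(source) + operand_extra_cost(destination)

for text in ["MOV R0,R1", "MOV R5,M", "ADD #1,R3", "SUB 4(R0),*5(R1)"]:
    print(text, instruction_cost(text))
```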
A Code-Generation Algorithm
For each quadruple A := B op C we perform the following:

1) Invoke a function GETREG() to determine the location L where the
computation B op C should be performed. L will usually be a
register (R), but it could also be a memory location (M).

2) Consult the address descriptor for B to determine B1, (one of) the
current location(s) of B. Prefer the register for B1 if the value of B
is currently both in memory and in a register. If the value of B is not
already in L, generate the instruction MOV B1,L to place a copy of B in L.

3) Generate the instruction OP C1,L where C1 is the current location
of C. Update the address descriptor of A to indicate that A is in
location L. If L is a register, update its descriptor to indicate that it
will contain at run time the value of A.

4) If the current values of B and/or C have no next uses, are not live
on exit from the block, and are in registers, alter the register
descriptor to indicate that, after execution of A := B OP C, those
registers no longer will contain B and/or C, respectively.
GETREG( ):

1) If the name B is in a register that holds the value of no other
names (recall that copy instructions such as X := Y could
cause a register to hold the value of two or more variables
simultaneously), and B is not live and has no next use after
execution of A := B op C, then return the register of B for L.
Update the address descriptor of B to indicate that B is no
longer in L.

2) Failing (1), return an empty register for L if there is one.

3) Failing (2), if A has a next use in the block, or OP is an
operator, such as indexing, that requires a register, find
an occupied register R. Store the value of R into a memory
location (by MOV R,M) if it is not already in the proper memory
location M, update the address descriptor for M, and return R.
A suitable occupied register might be one whose datum is
referenced furthest in the future, or one whose value is also in
memory. The exact choice is open, since there is no one proven
best way to make the selection.

4) If A is not used in the block, or no suitable occupied register
can be found, select the memory location of A as L.
Applying the code-generation algorithm to one of the basic blocks
obtained above (statements 1 to 3 of B1), we obtain the following code.

Statement                     Code generated   Register descriptor    Address descriptor
(initially)                                    registers empty        a:M; b:M; c:M; d:M; e:M; f:M
1. a 1,3 := b 1,2 + c 1,0     MOV b,R0         R0:a                   a:R0; b:M; c:M; d:M; e:M; f:M
                              ADD c,R0
2. d 1,0 := d 1,0 - b d,0     MOV d,R1         R0:a; R1:d             a:R0; b:M; c:M; d:R1; e:M; f:M
                              SUB b,R1
3. e 1,0 := a 1,0 + f 1,0     MOV R0,R2        R0:a; R1:d; R2:e       a:R0; b:M; c:M; d:R1; e:R2; f:M
                              ADD f,R2
(block exit; a, c, d, e, f    MOV R0,a
live)                         MOV R1,d
                              MOV R2,e
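
The walk-through can be reproduced with a deliberately simplified Python sketch of the algorithm. Several things here are assumptions made for illustration: the function name `generate`, the tuple encoding `(a, b, op, c, dead)` in which `dead` lists the operands that are not live and have no next use after the statement, and the choice to track at most one name per register. Only GETREG rules 1 and 2 are covered; spilling (rules 3 and 4) is left out.

```python
OPCODES = {"+": "ADD", "-": "SUB"}

def generate(block, live_on_exit, num_regs=8):
    """block: list of (a, b, op, c, dead) for A := B op C."""
    code = []
    reg = {}     # register descriptor: register -> name it holds
    addr = {}    # address descriptor: name -> register (absent: memory only)

    def location(name):
        return addr.get(name, name)              # prefer the register copy

    for a, b, op, c, dead in block:
        b_loc, c_loc = location(b), location(c)
        if b in addr and b in dead:
            L = addr.pop(b)                      # GETREG rule 1: reuse B's register
        else:
            free = [f"R{i}" for i in range(num_regs) if f"R{i}" not in reg]
            if not free:
                raise NotImplementedError("GETREG rules 3 and 4 are not sketched")
            L = free[0]                          # GETREG rule 2: an empty register
        if b_loc != L:
            code.append(f"MOV {b_loc},{L}")      # step 2: bring B into L
        code.append(f"{OPCODES[op]} {c_loc},{L}")  # step 3: compute in L
        old = addr.pop(a, None)
        if old is not None and old != L:
            reg.pop(old, None)                   # an older register copy of A is stale
        reg[L], addr[a] = a, L
        for name in dead - {a}:                  # step 4: release dead operands
            r = addr.pop(name, None)
            if r is not None and reg.get(r) == name:
                del reg[r]
    for r in sorted(reg):                        # block exit: store live names
        if reg[r] in live_on_exit:
            code.append(f"MOV {r},{reg[r]}")
    return code

b1 = [("a", "b", "+", "c", set()),
      ("d", "d", "-", "b", {"b"}),
      ("e", "a", "+", "f", set())]
listing = generate(b1, live_on_exit={"a", "c", "d", "e", "f"})
print("\n".join(listing))
```

With the three statements of B1 as input, the sketch emits the same nine instructions as the table, including the three stores at block exit.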
• Code optimization:
– A transformation of a program to make it
run faster and/or take up less space.
– Optimization should be safe: it must preserve the
meaning of the program.
– Example: peephole optimization.
• A simple technique to improve target code.
• Peephole: a small moving window on the target
program.
• Technique: examine a short sequence of target
instructions (the peephole) and try to replace it with
a faster or shorter sequence.
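• Peephole optimization:
• Redundant instruction elimination
• Flow of control optimization
• Algebraic simplifications
• Instruction selection

• Examples:
– Redundant loads and stores
MOV R0, a
MOV a, R0
The second instruction can be deleted, provided it carries no label,
because R0 already holds the value of a.
– Unreachable code
if debug = 1 goto L1
goto L2
L1: print debugging info
L2:
If debug is known to be 0, the test never succeeds and the print
statement can be removed as unreachable.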
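
The redundant load/store case lends itself to a very small peephole pass. The Python sketch below is an illustration only: the tuple format `(label, op, src, dst)` and the function name `remove_redundant_moves` are assumptions for the example. A labelled instruction is never removed, since a jump could reach it without executing the preceding MOV.

```python
def remove_redundant_moves(code):
    """code: list of (label or None, op, src, dst). Drops a MOV that merely
    copies back the value moved by the immediately preceding MOV."""
    out = []
    for label, op, src, dst in code:
        if out and label is None and op == "MOV":
            _, prev_op, prev_src, prev_dst = out[-1]
            if prev_op == "MOV" and (prev_src, prev_dst) == (dst, src):
                continue                     # value is already in place
        out.append((label, op, src, dst))
    return out

program = [
    (None, "MOV", "R0", "a"),
    (None, "MOV", "a", "R0"),     # redundant: R0 already holds a
    (None, "ADD", "#1", "R0"),
]
labelled = [
    (None, "MOV", "R0", "a"),
    ("L1", "MOV", "a", "R0"),     # kept: L1 may be a jump target
]
print(remove_redundant_moves(program))
print(remove_redundant_moves(labelled))
```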
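– Examples:
• Flow of control optimization:

Before:
goto L1
…
L1: goto L2
After:
goto L2
…
L1: goto L2

Before:
if a < b goto L1
…
L1: goto L2
After:
if a < b goto L2
…
L1: goto L2

Before (only one jump to L1, and L1 is preceded by an unconditional goto):
goto L1
…
L1: if a < b goto L2
L3:
After:
if a < b goto L2
goto L3
…
L3: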
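
The first two patterns, a jump whose target is itself an unconditional goto, can be handled by retargeting each jump to the end of the chain. The following Python sketch illustrates this; the `(label, text)` instruction format and the name `thread_jumps` are assumptions for the example, and the third pattern (which moves code) is not covered.

```python
def thread_jumps(code):
    """code: list of (label or None, text). Retargets every jump whose
    destination is an unconditional 'goto X' so that it goes straight to X."""
    forward = {}                              # label -> where its goto leads
    for label, text in code:
        parts = text.split()
        if label and len(parts) == 2 and parts[0] == "goto":
            forward[label] = parts[1]

    def final(target):
        seen = set()                          # guards against goto cycles
        while target in forward and target not in seen:
            seen.add(target)
            target = forward[target]
        return target

    out = []
    for label, text in code:
        parts = text.split()
        if len(parts) >= 2 and parts[-2] == "goto":
            parts[-1] = final(parts[-1])
            text = " ".join(parts)
        out.append((label, text))
    return out

before = [
    (None, "goto L1"),
    (None, "if a < b goto L1"),
    ("L1", "goto L2"),
    ("L2", "x := y"),
]
after = thread_jumps(before)
print(after)
```

The instruction at L1 is left in place; if no jump refers to L1 any more, a later unreachable-code pass can delete it.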
• Algebraic simplification:
x := x+0
x := x*1
Both statements are no-ops and can be removed.
• Reduction in strength
x^2 → x * x
x * 4 → x << 2
• Instruction selection
Sometimes a hardware instruction can
implement a certain operation efficiently.
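
The algebraic and strength-reduction rules can be written as a small rewriting function. This Python sketch is illustrative: the name `simplify` and the tuple form `(dst, left, op, right)` are assumptions, and the shift rule assumes integer operands, for which multiplying by a power of two equals a left shift.

```python
def simplify(dst, left, op, right):
    """Returns a cheaper equivalent statement, or None if it is a no-op."""
    # x := x + 0 and x := x * 1 change nothing.
    if dst == left and ((op == "+" and right == 0) or (op == "*" and right == 1)):
        return None
    # x ^ 2 becomes x * x.
    if op == "^" and right == 2:
        return (dst, left, "*", left)
    # Multiplication by a power of two greater than 1 becomes a shift.
    if op == "*" and isinstance(right, int) and right > 1 and right & (right - 1) == 0:
        return (dst, left, "<<", right.bit_length() - 1)
    return (dst, left, op, right)

print(simplify("x", "x", "+", 0))
print(simplify("y", "x", "^", 2))
print(simplify("y", "x", "*", 4))
print(simplify("y", "x", "*", 3))
```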
• Code optimization can either be high
level or low level:
– High level code optimizations:
• Loop unrolling, loop fusion, procedure inlining
– Low level code optimizations:
• Instruction selection, register allocation
– Some optimizations can be done at both
levels:
• Common subexpression elimination, strength
reduction, etc.
– Flow graph is a common intermediate
representation for code optimization.
• Basic block: a sequence of consecutive statements
with exactly 1 entry and 1 exit.
• Flow graph: a directed graph where the nodes are
basic blocks and there is an edge B1 → B2 if and only if B2
can be executed immediately after B1.
• Algorithm to construct flow graph:
– Finding leaders of the basic blocks:
• The first statement is a leader
• Any statement that is the target of a conditional or
unconditional goto is a leader
• Any statement that immediately follows a goto or
conditional goto statement is a leader
– For each leader, its basic block consists of the leader and all
statements up to but not including the next leader.
– B1 → B2 if and only if B2 can be executed
immediately after B1.
• Example:
100: sum = 0
101: j=0
102: goto 107
103: t1 = j << 2
104: t2 = addr(a)
105: t3 = t2[t1]
106: sum = sum + t3
107: if j < n goto 103
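
A Python sketch of the construction, applied to the example above, is given below. It is an illustration under assumptions: the name `flow_graph` and the dictionary representation of the program are invented for the example, and an edge is added for each jump target plus a fall-through edge whenever the last statement of a block is not an unconditional goto.

```python
import re

def flow_graph(stmts):
    """stmts: {statement number: text}. Returns (blocks, edges), where edges
    are pairs of block indices (i, j) meaning block j can follow block i."""
    numbers = sorted(stmts)

    def target(n):
        m = re.search(r"goto\s+(\d+)", stmts[n])
        return int(m.group(1)) if m else None

    leaders = {numbers[0]}
    for pos, n in enumerate(numbers):
        if target(n) is not None:
            leaders.add(target(n))
            if pos + 1 < len(numbers):
                leaders.add(numbers[pos + 1])
    leaders = sorted(l for l in leaders if l in stmts)
    blocks = [[n for n in numbers if lo <= n < hi]
              for lo, hi in zip(leaders, leaders[1:] + [numbers[-1] + 1])]
    block_of = {b[0]: i for i, b in enumerate(blocks)}   # leader -> block index

    edges = set()
    for i, b in enumerate(blocks):
        last = b[-1]
        if target(last) in block_of:
            edges.add((i, block_of[target(last)]))       # jump edge
        if not stmts[last].startswith("goto") and i + 1 < len(blocks):
            edges.add((i, i + 1))                        # fall-through edge
    return blocks, sorted(edges)

example = {
    100: "sum = 0", 101: "j = 0", 102: "goto 107", 103: "t1 = j << 2",
    104: "t2 = addr(a)", 105: "t3 = t2[t1]", 106: "sum = sum + t3",
    107: "if j < n goto 103",
}
blocks, edges = flow_graph(example)
print(blocks)
print(edges)
```

For the example, the leaders are 100, 103 and 107, giving three blocks; the loop shows up as the pair of edges between the second and third blocks.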
• Optimization within a basic block is called local optimization.
• Optimization across basic blocks is called global optimization.
• Some common optimizations:
– Instruction selection
– Register allocation
– Common subexpression elimination
– Code motion
– Strength reduction
– Induction variable elimination
– Dead code elimination
– Branch chaining
– Jump elimination
– Instruction scheduling
– Procedure inlining
– Loop unrolling
– Loop fusing
– Code hoisting
• Instruction selection:
– Using a more efficient instruction to replace a sequence of
instructions (space and speed).
– Example:

Before:
Mov R2, (R3)
Add R2, #1, R2
Mov (R3), R2
After:
Add (R3), #1, (R3)
• Register allocation: allocate variables to registers
(speed)
• Example (sum is kept in R2 and j in R1):

Before:
M[R13+sum] = 0
M[R13+j] = 0
GOTO L18
L19:
R0 = M[R13+j] << 2
M[R13+sum] = M[R13+sum]+M[R0+_a]
M[R13+j] = M[R13+j]+1
L18:
NZ = M[R13+j] - M[_n]
if NZ < 0 goto L19

After:
R2 = 0
R1 = 0
GOTO L18
L19:
R0 = R1 << 2
R2 = R2+M[R0+_a]
R1 = R1+1
L18:
NZ = R1 - M[_n]
if NZ < 0 goto L19
• Code motion: move a loop-invariant computation
before the loop
• Example (the load of M[_n] is moved out of the loop into R4):

Before:
R2 = 0
R1 = 0
GOTO L18
L19:
R0 = R1 << 2
R2 = R2+M[R0+_a]
R1 = R1+1
L18:
NZ = R1 - M[_n]
if NZ < 0 goto L19

After:
R2 = 0
R1 = 0
R4 = M[_n]
GOTO L18
L19:
R0 = R1 << 2
R2 = R2+M[R0+_a]
R1 = R1+1
L18:
NZ = R1 - R4
if NZ < 0 goto L19
• Strength reduction: replace an expensive operation by
equivalent cheaper operations
• Example (the shift and indexed load are replaced by a pointer R3 that advances by 4):

Before:
R2 = 0
R1 = 0
R4 = M[_n]
GOTO L18
L19:
R0 = R1 << 2
R2 = R2+M[R0+_a]
R1 = R1+1
L18:
NZ = R1 - R4
if NZ < 0 goto L19

After:
R2 = 0
R1 = 0
R4 = M[_n]
R3 = _a
GOTO L18
L19:
R2 = R2+M[R3]
R3 = R3 + 4
R1 = R1+1
L18:
NZ = R1 - R4
if NZ < 0 goto L19
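• Induction variable elimination: when one loop variable can be derived from
another, the redundant one is eliminated.
• Example (R1 only counts iterations, so the loop test is rewritten in terms
of R3 and R1 is removed; since R3 starts at _a and advances by 4, the limit
in R4 must be _a plus 4 times n):

Before:
R2 = 0
R1 = 0
R4 = M[_n]
R3 = _a
GOTO L18
L19:
R2 = R2+M[R3]
R3 = R3 + 4
R1 = R1+1
L18:
NZ = R1 - R4
if NZ < 0 goto L19

After:
R2 = 0
R4 = (M[_n] << 2) + _a
R3 = _a
GOTO L18
L19:
R2 = R2+M[R3]
R3 = R3 + 4
L18:
NZ = R3 - R4
if NZ < 0 goto L19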
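
The combined effect of strength reduction and induction variable elimination on this loop can be checked with a small Python model. Everything here is an assumption made for illustration: memory is modelled as a dictionary from byte address to value, `base` plays the role of `_a`, and the loop limit in the optimized version is taken to be `base + (n << 2)`, the address one element past the end of the array.

```python
def sum_indexed(memory, base, n):
    """Original form: index j is shifted and added to the base each time."""
    total, j = 0, 0
    while j < n:
        total += memory[base + (j << 2)]
        j += 1
    return total

def sum_pointer(memory, base, n):
    """Optimized form: a pointer advances by 4 and the counter j is gone."""
    total, p = 0, base
    limit = base + (n << 2)          # assumed limit: base + 4 * n
    while p < limit:
        total += memory[p]
        p += 4
    return total

values = [3, 1, 4, 1, 5]
base = 1000
memory = {base + 4 * k: v for k, v in enumerate(values)}
print(sum_indexed(memory, base, len(values)))
print(sum_pointer(memory, base, len(values)))
```

Both functions visit the same addresses in the same order, which is what makes the transformation safe.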
• Common subexpression elimination: if an expression
was previously calculated and the variables in the
expression have not changed, we can avoid
recomputing the expression.
• Example:

Before:
R1 = M[R13+I] << 2
R1 = M[R1+_b]
R2 = M[R13+I] << 2
R2 = M[R2+_b]

After:
R1 = M[R13+I] << 2
R1 = M[R1+_b]
R2 = R1
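
Within a single basic block, the idea can be sketched in Python as follows. This is an illustration, not the method used to produce the example above: it works on three-address statements with distinct temporaries rather than on registers, the tuple form `(dst, op, x, y)` and the name `local_cse` are assumptions, and it assumes the block contains no stores to memory, so a repeated load returns the same value.

```python
def local_cse(block):
    """block: list of (dst, op, x, y). Replaces a recomputation by a copy."""
    available = {}    # (op, x, y) -> name that already holds the value
    alias = {}        # name -> earlier name known to hold the same value
    out = []
    for dst, op, x, y in block:
        x, y = alias.get(x, x), alias.get(y, y)     # copy propagation
        key = (op, x, y)
        hit = available.get(key)                    # look up before dst changes
        # dst is overwritten: forget every fact that mentions its old value.
        available = {k: v for k, v in available.items()
                     if v != dst and dst not in k[1:]}
        alias = {n: m for n, m in alias.items() if n != dst and m != dst}
        if hit is not None and hit != dst:
            out.append((dst, "copy", hit, None))
            alias[dst] = hit
        else:
            out.append((dst, op, x, y))
            if dst not in (x, y):
                available[key] = dst
    return out

block = [
    ("t1", "<<", "i", 2),
    ("t2", "load", "b", "t1"),
    ("t3", "<<", "i", 2),        # same as t1
    ("t4", "load", "b", "t3"),   # same as t2 once t3 is known to equal t1
]
optimized = local_cse(block)
print(optimized)
```

The second load is recognised only because the copy `t3 = t1` is propagated into its operand, which is why the sketch tracks aliases as well as available expressions.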
The End
