Unit-5-1

18CSC304J- COMPILER DESIGN
UNIT-5
SRMIST, Vadapalani Campus

UNIT-V SYLLABUS
1. Code Optimization: Introduction
2. Principal Sources of Optimization
3. Function-Preserving Transformations
4. Loop Optimization
5. Optimization of Basic Blocks
6. Building Expression of DAG
7. Peephole Optimization
8. Basic Blocks, Flow Graphs
9. Next-Use Information
10. Introduction to Global Data Flow Analysis
11. Computation of gen and kill
12. Computation of in and out
13. Parameter Passing
14. Runtime Environments
15. Source Language Issues
16. Storage Organization
17. Activation Records
18. Storage Allocation Strategies
BASIC BLOCKS AND FLOW GRAPHS
BASIC BLOCKS
● A basic block is a sequence of consecutive statements in which flow of
control enters at the beginning and leaves at the end without any halt or
possibility of branching except at the end.
Example: A sequence of three-address statements forms a basic block.
● A three-address statement x := y + z is said to define x and to use (reference) y and z.
● A name in a basic block is said to be live at a given point
if its value is used after that point in the program, perhaps
in another basic block
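Algorithm used to partition a sequence of three-address statements into basic blocks:
1. Determine the set of leaders, the first statements of basic blocks. The first statement is a leader; any statement that is the target of a conditional or unconditional goto is a leader; any statement that immediately follows a goto or conditional goto is a leader.
2. For each leader, its basic block consists of the leader and all statements up to but not including the next leader or the end of the program.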
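The partitioning step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: it uses the usual leader rules (the first statement, every jump target, and every statement that follows a jump are leaders), and it assumes a hypothetical encoding of each statement as an `(op, target)` pair, where `target` is a 0-based statement index or `None`.

```python
def find_leaders(code):
    """Return the sorted indices of leaders in a list of (op, target) pairs."""
    leaders = {0} if code else set()          # rule 1: the first statement
    for i, (op, target) in enumerate(code):
        if op in ("goto", "if_goto"):
            leaders.add(target)               # rule 2: the target of a jump
            if i + 1 < len(code):
                leaders.add(i + 1)            # rule 3: the statement after a jump
    return sorted(leaders)


def basic_blocks(code):
    """Split the code into blocks, each given as a list of statement indices."""
    leaders = find_leaders(code)
    bounds = leaders + [len(code)]
    return [list(range(bounds[k], bounds[k + 1])) for k in range(len(leaders))]


# Hypothetical five-statement program: two initialisations, then a loop body
# that ends with a conditional jump back to statement 2.
program = [
    ("assign", None),
    ("assign", None),
    ("assign", None),
    ("assign", None),
    ("if_goto", 2),
]
blocks = basic_blocks(program)
```

Here the first block holds statements 0 and 1 and the second holds statements 2 to 4, mirroring the shape of the dot-product example, where the first block consists of statements (1) to (2).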

Example: Consider the fragment of source code shown below; it computes the dot
product of two vectors a and b of length 20. A list of three-address statements
performing this computation on our target machine is below
Basic block 1: Statement (1) to (2)

Transformations on Basic Blocks:
● A basic block computes a set of expressions

● These expressions are the values of the names live on exit from the block.
● Two basic blocks are said to be equivalent if they compute the same set of
expressions
● A number of transformations can be applied to a basic block without changing the
set of expressions computed by the block.
● Many transformations are useful for improving the quality of code generated from
a basic block
● A global code "optimizer" tries to rearrange the computations in a program to
reduce the overall running time or space requirement of the final target program.
● Two important classes of local transformation are :
○ Structure-preserving transformations
○ Algebraic transformations
Structure preserving transformations:
● The primary structure-preserving transformations on basic blocks are:

1. common subexpression elimination
2. dead-code elimination
3. renaming of temporary variables
4. interchange of two independent adjacent statements
Assume basic blocks have no arrays, pointers, or procedure calls
1. Common subexpression elimination:

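Consider the basic block
a := b + c
b := a - d
c := b + c
d := a - d
● Since the second and fourth statements compute the same expression, namely b + c - d,
the basic block can be transformed into
a := b + c
b := a - d
c := b + c
d := b
● The first and third statements do not compute the same value, because the second statement redefines b.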
2. Dead-code elimination:
● Suppose x is dead, that is, never subsequently used, at the point where the
statement x : = y + z appears in a basic block.
● Then this statement may be safely removed without changing the value of the basic
block.
3. Renaming temporary variables:

● A statement t : = b + c ( t is a temporary ) can be changed to u : = b + c (u is a new
temporary)
● i.e., change all instances of t to u without changing the value of the basic block.
● Transform a basic block into an equivalent block in which each statement that
defines a temporary defines a new temporary. Such a block is called a
normal-form block.
4. Interchange of statements:
● Suppose a block has two adjacent statements
t1 := b + c
t2 := x + y
● We can interchange the two statements without affecting the
value of the block if and only if neither x nor y is t1 and neither b nor c is t2.
Algebraic transformations
● Algebraic transformations can be used to change the set of expressions computed by a basic block into an
algebraically equivalent set.
i) Simplify expressions:
Statements such as x := x + 0 or x := x * 1 can be eliminated from a basic block without
changing the set of expressions it computes.
ii) Replace expensive operations by cheaper ones:
The exponentiation operator in the statement x := y ** 2 can be replaced by x := y * y.
FLOW GRAPHS:
● A flow graph is a directed graph representation of three-address statements.
● It is constructed by adding the flow-of-control information to the set of basic blocks making up a program.
● It is useful for understanding code-generation algorithms, even if the graph is not
explicitly constructed by a code-generation algorithm.
● Nodes in the flow graph represent computations, and the edges represent the flow
of control.
● Some register-assignment algorithms use flow graphs to find the inner loops where
a program is expected to spend most of its time
● The nodes of the flow graph are basic blocks
● One node is distinguished as initial; it is the block whose leader is the first
statement.
● There is a directed edge from block B1 to block B2 if B2 can immediately
follow B1 in some execution sequence; i.e., if either
○ there is a conditional or unconditional jump from the last statement of B1 to
the first statement of B2, or
○ B2 immediately follows B1 in the order of the program, and B1 does not end
in an unconditional jump.
● Then B1 is a predecessor of B2, and B2 is a successor of B1.
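The two edge rules can be sketched in Python as follows. This is only an illustration under assumptions: statements are hypothetical `(op, target)` pairs with 0-based jump targets, and the basic blocks are assumed to be already available as lists of statement indices.

```python
def flow_edges(blocks, code):
    """Return the sorted list of flow-graph edges (from_block, to_block)."""
    start_of = {block[0]: n for n, block in enumerate(blocks)}
    edges = set()
    for n, block in enumerate(blocks):
        op, target = code[block[-1]]          # last statement of the block
        if op in ("goto", "if_goto"):
            edges.add((n, start_of[target]))  # rule 1: jump to a block's leader
        if op != "goto" and n + 1 < len(blocks):
            edges.add((n, n + 1))             # rule 2: fall through to the next block
    return sorted(edges)


# Hypothetical program: block 0 initialises, block 1 is a loop ending in a
# conditional jump back to its own first statement.
code = [
    ("assign", None),
    ("assign", None),
    ("assign", None),
    ("assign", None),
    ("if_goto", 2),
]
blocks = [[0, 1], [2, 3, 4]]
edges = flow_edges(blocks, code)
```

For this input the graph has an edge from block 0 to block 1 (fall-through) and an edge from block 1 to itself (the backward jump), which is the shape described for the dot-product flow graph.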
Example: The flow graph of the dot product program is given.

● B1 is the initial node.
● Also in the last statement, the jump to statement (3) has been replaced by
an equivalent jump to the beginning of block B2.
Representation of Basic Blocks:
● After partitioning the three-address statements, each basic block can be represented
by a variety of data structures.
● One choice is a record consisting of
○ a count of the number of quadruples in the block,
○ a pointer to the leader (first quadruple) of the block, and
○ lists of predecessors and successors of the block.
● An alternative is to make a linked list of the quadruples in each block.
Loops
Question: In a flow graph, what is a loop, and how does one find all loops?
● A loop is a collection of nodes in a flow graph such that
1. All nodes in the collection are strongly connected; that is, from any node in the loop to
any other, there is a path of length one or more, wholly within the loop.
2. The collection of nodes has a unique entry; that is, a node in the loop such that the only way to
reach a node of the loop from a node outside the loop is to first go through the entry.
● A loop that contains no other loops is called an inner loop.
NEXT-USE INFORMATION
● Collect next-use information about names in basic blocks.
● If the name in a register is no longer needed, then we remove the name
from the register and the register can be used to store some other names
● Idea: keeping a name in storage only if it will be used subsequently
Computing Next Uses
● Suppose three-address statement i assigns a value to x.
● If statement j has x as an operand, and control can flow from statement i
to j along a path that has no intervening assignments to x, then we say
statement j uses the value of x computed at i.
● To determine, for each three-address statement x := y op z, the next
uses of x, y, and z, we scan the basic block backward from its end.
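The backward scan can be sketched in Python as follows. This is a minimal sketch under assumptions, not a definitive algorithm: each statement x := y op z is encoded as a hypothetical tuple `(x, y, z)` with the operator omitted, the table of next uses is a plain dictionary, and the marker `"exit"` stands for a name that is live on exit from the block.

```python
EXIT = "exit"  # assumed marker: the name is live after the block ends


def next_use(block, live_on_exit):
    """For each statement, record the next use of x, y and z after it."""
    table = {name: EXIT for name in live_on_exit}
    info = [None] * len(block)
    for i in range(len(block) - 1, -1, -1):   # scan from the last statement
        x, y, z = block[i]
        # Attach to statement i what is currently known about x, y and z.
        info[i] = {name: table.get(name) for name in (x, y, z)}
        table[x] = None                       # x is redefined here: no next use above
        table[y] = i                          # y and z are used by statement i
        table[z] = i
    return info


# t1 := a op b ; t2 := t1 op c ; d := t2 op t1   (only d is live on exit)
block = [("t1", "a", "b"), ("t2", "t1", "c"), ("d", "t2", "t1")]
info = next_use(block, live_on_exit={"d"})
```

Marking x as having no next use before recording the uses of y and z matters for statements such as x := x op z, where x is both defined and used.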
Storage for Temporary Names
● Pack two temporaries into the same location if they are not live simultaneously
● Since almost all temporaries are defined and used within basic blocks, next-use
information can be applied to pack temporaries.
● We can allocate storage locations for temporaries by examining each in turn and assigning a
temporary to the first location in the field for temporaries that does not contain a live
temporary.
● If a temporary cannot be assigned to any previously created location, add a new
location to the data area for the current procedure.
● Temporaries can be packed into registers rather than memory locations
● Example: six temporaries can be packed into two locations. These locations
correspond to t1 and t2
Without replacing temporaries With replacing temporaries
INTRODUCTION - CODE OPTIMIZATION
● The code produced by straightforward compiling algorithms can often be
made to run faster or take less space, or both.
● This improvement is achieved by program transformations that are traditionally
called optimizations.
● Compilers that apply code-improving transformations are called optimizing
compilers.
● Optimization increases the execution speed of the program and reduces the storage space it needs.
● Optimizations can be categorized into two types:
1) Machine independent: program transformations that improve the target
code without taking into consideration any properties of the target machine.
2) Machine dependent: transformations based on register allocation and utilization of
special machine-instruction sequences.
The criteria for code improvement transformations:
● A transformation must preserve the meaning of programs.
i.e., the optimization must not change the output produced by a program
for a given input, or cause an error.
● A transformation must, on the average, speed up programs by a
measurable amount
● The transformation must be worth the effort
Getting Better Performance
An Organization for an Optimizing Compiler
● This unit concentrates on the transformation of intermediate code, using an
organization in which the code optimizer sits between the front end and the code generator.
Example:
● Assume that the intermediate code consists of three-address statements.
● A C program for quicksort is used to illustrate the code-improving
transformations.
Assuming that each array element takes four bytes, the intermediate code will recalculate 4*i every time a[i]
appears in the source program.
Flow graph for the program
In the code optimizer, programs are represented by
flow graphs in which edges indicate the flow of
control and nodes represent basic blocks.
t1,t2,…,t3 are temporary variables
B1 is the initial node.

There are 3 loops.
B2 and B3 are loops by themselves.
Blocks B2,B3,B4 and B5 together form a loop
with entry B2
PRINCIPAL SOURCES OF OPTIMIZATION
● Techniques for implementing these transformations are presented here
● A transformation of a program is called local if it can be performed by looking only at
the statements in a basic block; otherwise, it is called global
● Many transformations can be performed at both the local and global levels.
● Local transformations are usually performed first.
Function-Preserving Transformations
● There are a number of ways in which a compiler can improve a program without
changing the function it computes
Examples are
● Common sub expression elimination
● Copy propagation
● Dead-code elimination and
● Constant folding
1. Common Sub expressions elimination
● An occurrence of an expression E is called a common subexpression if E
was previously computed, and the values of the variables in E have not
changed since the previous computation.
● We can avoid recomputing the expression if we can use the previously computed value.
Example: Consider the three-address code of quicksort.
● Block B5 of the flow graph recalculates 4*i and 4*j.
● The assignments to t7 and t10 have the local common subexpressions 4*i and 4*j.
● They are eliminated by using t6 instead of t7 and t8 instead of t10.
Example: Consider the three-address code of quicksort.
● After local common subexpressions are eliminated, B5 still evaluates 4*i and 4*j. Both are common subexpressions.
● For example, the statements t8 = 4*j, t9 = a[t8], a[t8] = x in B5 can be replaced by
t9 = a[t4], a[t4] = x, using t4 computed in block B3.
● As control passes from the evaluation of 4*j in B3 to B5, there is
no change in j, so t4 can be used for 4*j.
● Other statements are replaced similarly.
2. Copy Propagation
● Assignments of the form f := g are called copy statements, or copies for short.
● The idea behind the copy-propagation transformation is to use g for f
wherever possible after the copy statement f := g.
● The assignment x = t3 in block B5 is a copy. The block can be further improved by
eliminating x using this transformation.
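Copy propagation within a single basic block can be sketched in Python as follows. This is a simplified illustration, not the global algorithm: it assumes a hypothetical statement encoding `(dst, a, op, b)` in which a copy dst := a has `op` and `b` set to `None`, and it ignores arrays, pointers, and procedure calls.

```python
def propagate_copies(block):
    """Replace uses of f by g after a copy f := g, within one basic block."""
    copies = {}                               # f -> g for copies still valid
    out = []
    for dst, a, op, b in block:
        a = copies.get(a, a)                  # use g wherever f is read
        b = copies.get(b, b)
        # dst is redefined here, so any copy that mentions it is no longer valid.
        copies = {f: g for f, g in copies.items() if f != dst and g != dst}
        if op is None and a != dst:
            copies[dst] = a                   # remember the copy dst := a
        out.append((dst, a, op, b))
    return out


# x := t3 ; y := x + t5   becomes   x := t3 ; y := t3 + t5
before = [("x", "t3", None, None), ("y", "x", "+", "t5")]
after = propagate_copies(before)
```

After the pass, x is no longer used by the second statement, so the copy x := t3 becomes a candidate for dead-code elimination if x is not live afterwards.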
3. Dead-Code Elimination (Useless Code)
● A variable is live at a point in a program if its value can be used
subsequently; otherwise, it is dead at that point.
● Dead (or useless) code consists of statements that compute values that never get used.
● A programmer is unlikely to introduce dead code intentionally, but it may appear as the result of previous transformations.
● An advantage of copy propagation is that it often turns the copy statement into dead code.
● Dead-code elimination then removes the assignment to x.
4. Constant Folding
● Deducing at compile time that the value of an expression is a constant
and using the constant instead is known as constant folding.
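Constant folding of a single three-address statement can be sketched in Python as follows. This is a minimal sketch under assumptions: statements use a hypothetical encoding `(dst, a, op, b)`, only the three operators listed in `OPS` are handled, and an operand counts as a constant when it is a Python number rather than a name.

```python
import operator

# Assumed operator table for this sketch; a real compiler supports many more.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold(stmt):
    """Evaluate dst := a op b at 'compile time' when both operands are numbers."""
    dst, a, op, b = stmt
    if op in OPS and isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return (dst, OPS[op](a, b), None, None)   # becomes dst := constant
    return stmt                                   # operands unknown: keep as is


folded = fold(("secs", 60, "*", 60))
unchanged = fold(("t1", 4, "*", "i"))
```

The first statement is rewritten to assign the constant 3600, while the second is left alone because `i` is not known at compile time.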
Loop Optimization
● Loops, especially inner loops, are where programs tend to spend the bulk of their time.
● The running time of a program may be improved if we decrease the
number of instructions in an inner loop, even if we increase the amount of
code outside that loop.
Three techniques are important for loop optimization:
● Code motion, which moves code outside a loop
● Induction-variable elimination, which eliminates variables from the inner
loop
● Reduction in strength, which replaces an expensive operation by a cheaper
one, such as a multiplication by an addition
Code Motion
● Code motion is a transformation that decreases the amount of code in a loop.
● It takes an expression that yields the same result independent of
the number of times a loop is executed (a loop-invariant computation) and places
the expression before the loop.
Example:
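Code motion can be illustrated in Python with a loop whose test contains a loop-invariant computation. This is a sketch with hypothetical names (`limit`, `t`); it assumes that `limit` is not changed inside the loop, which is what makes `limit - 2` loop-invariant.

```python
def loop_original(limit):
    i, iterations = 0, 0
    while i <= limit - 2:      # limit - 2 is re-evaluated on every test
        i += 1
        iterations += 1
    return iterations


def loop_after_code_motion(limit):
    i, iterations = 0, 0
    t = limit - 2              # loop-invariant computation placed before the loop
    while i <= t:
        i += 1
        iterations += 1
    return iterations
```

Both functions perform the same number of iterations for any `limit`; the second evaluates the subtraction once instead of once per test.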
Induction Variables and Reduction in Strength
● Induction-variable elimination removes variables from inner loops.
● It applies when variables change on every iteration by being incremented or
decremented by some constant.
● Loops are usually processed inside out.
● The replacement of a multiplication by a subtraction will speed up the
object code if multiplication takes more time than addition or subtraction,
as is the case on many machines.
● The replacement of a more expensive instruction by a cheaper one is
called strength reduction.
Example:
● In block B3 the values of j and t4 remain in lock-step: every time the
value of j decreases by 1, that of t4 decreases by 4, because 4*j is
assigned to t4. Such identifiers are called induction variables.
● In block B3, t4 = 4*j holds after the assignment to t4. Therefore, just after the next
execution of j := j-1, the new value of 4*j equals the old value of t4 minus 4.
● We can therefore replace the assignment t4 := 4*j by t4 := t4-4, provided
t4 is initialized to 4*j before B3 is entered for the first time.
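The effect of this replacement can be checked with a small Python sketch. The function names and the `stop` parameter are hypothetical; the loop only mimics the shape of block B3 (decrement j, then compute t4) and records the successive values of t4.

```python
def scan_down_original(j, stop):
    trace = []
    while j > stop:
        j = j - 1
        t4 = 4 * j             # multiplication on every iteration
        trace.append(t4)
    return trace


def scan_down_reduced(j, stop):
    trace = []
    t4 = 4 * j                 # initialisation placed before the loop
    while j > stop:
        j = j - 1
        t4 = t4 - 4            # subtraction keeps t4 equal to 4*j
        trace.append(t4)
    return trace
```

Both versions produce the same sequence of t4 values; the second pays for one extra statement before the loop, which matches the observation that code outside the loop may grow while the loop itself shrinks.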
● After reduction in strength is applied to the inner loops around B2 and B3,
the values of i and t2 satisfy the relationship t2 = 4*i, and those of j and t4
satisfy t4 = 4*j, so the test i >= j can be replaced by the equivalent test t2 >= t4.
● Once this replacement is made, i in block B2 and j in block B3
become dead variables, and the assignments to
them in these blocks become dead code that can
be eliminated.
● The code-improving transformations have been effective.
The number of instructions in B2 and B3 has
been reduced from 4 to 3, in B5 from 9 to 3,
and in B6 from 8 to 3. B1, however, has grown
from 4 to 6 instructions.
OPTIMIZATION OF BASIC BLOCKS
● Code-improving transformations for basic blocks include
structure-preserving transformations, such as common subexpression
elimination and dead-code elimination, and algebraic transformations, such
as reduction in strength.
● Many of the structure-preserving transformations can be implemented by
constructing a dag for a basic block.
● For common subexpressions: Say a new node m is about to be added;
there is an existing node n with the same children, same order, and with
same operator. Then n computes the same value as m and may be used in
its place.
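Example: Consider a block whose third statement is c := b+c and whose fourth statement is d := a-d.
● When the node for the third statement c := b+c is constructed,
the use of b in b+c refers to the node labeled -,
because that is the most recent definition of b.
● However, the node for the fourth statement d := a-d has the
operator - and the nodes labeled a and d0 as children.
Since the operator and the children are the same as those of the existing node labeled -, we do not
create this node, but add d to the list of definitions for
the node labeled -.
● If b is not live on exit, the block can be regenerated from the dag without the assignment to b,
using d to hold the value of the node labeled -.
● If both b and d are live on exit, then a fourth statement
must be used to copy the value from one to the other.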
● The dag method detects expressions that are guaranteed to
compute the same value, no matter how that value is computed.
● However, the dag method can miss the fact that the expressions
computed by the first and fourth statements of a
sequence are the same, i.e., b+c, when b and c are reassigned in between
in a way that leaves their sum unchanged.
● Algebraic identities applied to the dag may
expose the equivalence.
● Dead-code elimination can also be performed on a dag: we delete
any root (node with no ancestors) that has no live
variables attached.
The Use of Algebraic Identities
● Algebraic identities represent another important class of optimizations on
basic blocks.
Example: x + 0 = 0 + x = x and x * 1 = 1 * x = x
● Another class of algebraic optimizations includes reduction in strength, i.e.,
replacing a more expensive operator by a cheaper one.
Example: x ** 2 = x * x
● In constant folding, we evaluate constant expressions at compile time and
replace the constant expressions by their values.
Example: The expression 2*3.14 would be replaced by 6.28.
● The dag-construction process can help apply more general algebraic
transformations such as commutativity and associativity.
Example: Suppose * is commutative, i.e., x*y = y*x. Before creating a new node for x*y,
we can also check whether a node for y*x already exists.
● The relational operators <=, >=, <, >, = and ≠ sometimes generate
unexpected common subexpressions.
Example: The condition x > y can also be tested by subtracting the
arguments and performing a test on the condition code set by the subtraction,
so a single dag node can serve both x - y and x > y.
● Associative laws may also be applied to expose common subexpressions.
Example: For the source statements
a := b + c
e := c + d + b
● the following intermediate code might be generated:
a := b + c
t := c + d
e := t + b
● If t is not needed outside this block, we can change this sequence to
a := b + c
e := a + d
using both the associativity and commutativity of +.
PEEPHOLE OPTIMIZATION
● A statement-by-statement code-generation strategy often produces target
code that contains redundant instructions and suboptimal constructs.
● The quality of such target code can be improved by applying "optimizing"
transformations to the target program.
● A simple but effective technique for improving the target code is peephole
optimization, a method for trying to improve the performance of the
target program by examining a short sequence of target instructions (called
the peephole) and replacing these instructions by a shorter or faster
sequence, whenever possible.
● Although peephole optimization is usually described as a technique for improving the quality of the target
code, it can also be applied directly after intermediate code generation to
improve the intermediate representation.
● The peephole is a small, moving window on the target program. The code in the
peephole need not be contiguous, although some implementations do require this. Each
improvement may create opportunities for additional improvements.
● The following program transformations are characteristic
of peephole optimizations:
○ Redundant-instruction elimination
○ Flow-of-control optimizations
○ Algebraic simplifications
○ Use of machine idioms
Redundant Loads and Stores:
● If we see the instruction sequence
(1) MOV R0,a
(2) MOV a,R0
we can delete instruction (2), because whenever (2) is executed, (1) will ensure that the value of
a is already in register R0.
● If (2) had a label, we could not be sure that (1) was always executed
immediately before (2), and so we could not remove (2).
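This peephole rule can be sketched in Python as follows. It is a minimal illustration under assumptions: instructions are hypothetical tuples `(label, op, src, dst)`, `MOV src,dst` copies src to dst, and only the single store-then-load pattern described above is recognized.

```python
def remove_redundant_moves(code):
    """Drop an unlabeled MOV a,R0 that directly follows MOV R0,a."""
    out = []
    for label, op, src, dst in code:
        if out and label is None and op == "MOV":
            _, prev_op, prev_src, prev_dst = out[-1]
            # The previous instruction already left the same value in dst.
            if prev_op == "MOV" and (prev_src, prev_dst) == (dst, src):
                continue
        out.append((label, op, src, dst))
    return out


plain = [(None, "MOV", "R0", "a"), (None, "MOV", "a", "R0")]
labeled = [(None, "MOV", "R0", "a"), ("L1", "MOV", "a", "R0")]
```

Applied to `plain`, the second instruction is removed; applied to `labeled`, it is kept, because a jump to L1 could reach it without executing the first instruction.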
Unreachable Code:
● Another opportunity for peephole optimizations is the removal of unreachable
instructions.
● An unlabeled instruction immediately following an unconditional jump may be
removed.
● This operation can be repeated to eliminate a sequence of instructions.
● For example, for debugging purposes, a large program may have within it certain
segments that are executed only if a variable debug is 1.
● In C, the source code might look like:
#define debug 0
….
if ( debug ) {
    print debugging information
}
● In the intermediate representation the if-statement may be translated as:
if debug = 1 goto L1
goto L2
L1: print debugging information
L2: (a)
● One obvious peephole optimization is to eliminate jumps over jumps. Thus, no
matter what the value of debug, (a) can be replaced by:
if debug ≠ 1 goto L2
print debugging information
L2: (b)
● Since debug is set to 0 at the beginning of the program, constant propagation
should replace (b) by
if 0 ≠ 1 goto L2
print debugging information
L2: (c)
● As the argument of the first statement of (c) evaluates to a constant true,
it can be replaced by goto L2.
● Then all the statements that print debugging aids are manifestly
unreachable and can be eliminated one at a time.
Flow-of-Control Optimizations:
● The intermediate code generation algorithms frequently produce jumps
to jumps, jumps to conditional jumps, or conditional jumps to jumps.
● These unnecessary jumps can be eliminated in either the intermediate
code or the target code by the following types of peephole optimizations.
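● We can replace the jump sequence
goto L1
....
L1: goto L2
by the sequence
goto L2
....
L1: goto L2
● If there are now no jumps to L1, then it may be possible to eliminate the statement
L1: goto L2 provided it is preceded by an unconditional jump.
● Similarly, the sequence
if a < b goto L1
....
L1: goto L2
can be replaced by
if a < b goto L2
....
L1: goto L2
● Finally, suppose there is only one jump to L1 and L1 is preceded by an
unconditional goto. Then the sequence
goto L1
....
L1: if a < b goto L2
L3: (1)
may be replaced by
if a < b goto L2
goto L3
....
L3: (2)
● While the number of instructions in (1) and (2) is the same,
we sometimes skip the
unconditional jump in (2), but never in (1).
Thus (2) is superior to (1) in execution time.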
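The simplest of these rewrites, redirecting a jump whose target is itself an unconditional jump, can be sketched in Python as follows. This is an illustration under assumptions: instructions are hypothetical tuples `(label, op, target)` with `op` one of `"goto"`, `"if_goto"`, or `"other"`, and the sketch does not remove the now-unused `L1: goto L2` statement.

```python
def thread_jumps(code):
    """Retarget jumps that land on an unconditional goto to its destination."""
    # label -> target, for every labeled instruction that is an unconditional goto
    goto_at = {label: target for label, op, target in code
               if label is not None and op == "goto"}

    def final(target):
        seen = set()                          # guards against goto cycles
        while target in goto_at and target not in seen:
            seen.add(target)
            target = goto_at[target]
        return target

    return [(label, op, final(target) if op in ("goto", "if_goto") else target)
            for label, op, target in code]


code = [
    (None, "goto", "L1"),
    (None, "if_goto", "L1"),
    ("L1", "goto", "L2"),
    ("L2", "other", None),
]
threaded = thread_jumps(code)
```

Both the unconditional and the conditional jump to L1 now go directly to L2, which corresponds to the first two rewrites described above.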
Algebraic Simplification:
● There is no end to the amount of algebraic simplification that can be attempted
through peephole optimization.
● Only a few algebraic identities occur frequently enough that it is worth considering
implementing them
● For example, statements such as x := x + 0 or x := x * 1
are often produced by straightforward intermediate code-generation algorithms,
and they can be eliminated easily through peephole optimization.
Reduction in Strength:
● Reduction in strength replaces expensive operations by equivalent cheaper ones
on the target machine.
● Certain machine instructions are considerably cheaper than others and can often
be used as special cases of more expensive operators.
● For example, x² is invariably cheaper to implement as x*x than as a call to an
exponentiation routine.
● Fixed-point multiplication or division by a power of two is cheaper to implement as
a shift.
● Floating-point division by a constant can be implemented as multiplication by a
constant, which may be cheaper.
Use of Machine Idioms:
● The target machine may have hardware instructions to implement certain specific
operations efficiently.
● For example, some machines have auto-increment and auto-decrement addressing
modes. These add or subtract one from an operand before or after using its value.
● The use of these modes greatly improves the quality of code when pushing or
popping a stack, as in parameter passing.
● These modes can also be used in code for statements like i := i+1.
BUILDING EXPRESSION OF DAG
● Directed acyclic graphs (dags) are useful data structures for implementing
transformations on basic blocks.
● A dag gives a picture of how the value computed by each statement in a
basic block is used in subsequent statements of the block.
● Constructing a dag from three-address statements is a good way of
determining common sub expressions (expressions computed more than
once) within a block.
Dag Construction
● To construct a dag for a basic block, we process each statement of the block in turn.
● For a statement of the form x := y + z, we look for the nodes that represent the "current" values of y and z.
● These could be leaves, or they could be interior nodes of the dag
if y and/or z had been evaluated by previous statements of the block.
● We then create a node labeled + and give it two children:
○ the left child is the node for y,
○ the right child is the node for z.
● Then we label this node x.
● However, if there is already a node denoting the same value as y + z, we do not add
the new node to the dag, but rather give the existing node the additional label x.
Two details should be mentioned
● First, if x (not x0) had previously labeled some other node, we remove that
label, since the "current" value of x is the node just created.
● Second, for an assignment such as x := y we do not create a new node.
Instead, we append label x to the list of names on the node for the "current" value
of y.
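The construction steps and the two details can be sketched in Python as follows. This is a minimal sketch under assumptions, not a definitive implementation: statements use a hypothetical encoding `(x, y, op, z)` with `op` and `z` set to `None` for a copy x := y, nodes are stored in a list, and the four-statement block is an illustrative example.

```python
def build_dag(block):
    nodes = []     # node id -> ("leaf", name) or (op, left id, right id)
    index = {}     # node description -> node id, used to find an existing node
    current = {}   # name -> id of the node holding its current value
    labels = {}    # node id -> names attached to that node

    def node_for(name):
        # First use of a name inside the block: create a leaf for its initial value.
        if name not in current:
            index[("leaf", name)] = len(nodes)
            nodes.append(("leaf", name))
            current[name] = index[("leaf", name)]
        return current[name]

    for x, y, op, z in block:
        if op is None:                       # copy x := y: reuse the node for y
            n = node_for(y)
        else:
            key = (op, node_for(y), node_for(z))
            if key not in index:             # no node computes this value yet
                index[key] = len(nodes)
                nodes.append(key)
            n = index[key]
        for names in labels.values():        # x no longer labels its old node
            if x in names:
                names.remove(x)
        labels.setdefault(n, []).append(x)
        current[x] = n
    return nodes, labels


# a := b + c ; b := a - d ; c := b + c ; d := a - d
block = [("a", "b", "+", "c"), ("b", "a", "-", "d"),
         ("c", "b", "+", "c"), ("d", "a", "-", "d")]
nodes, labels = build_dag(block)
```

For this block the dag has three leaves and three interior nodes, and the node for a - d carries both labels b and d: the fourth statement finds an existing node with the same operator and children instead of creating a new one.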
Example
Application of DAG
● To automatically detect common sub expressions.

● To determine which identifiers have their values used in the block.
● To determine which statements compute values that could be used
outside the block
● It can be used to reconstruct a simplified list of quadruples
Stages in DAG Construction Process
Generating Code from DAGS
● The advantage of generating code for a basic block from its dag
representation is that from a dag we can more easily see how to rearrange the
order of the final computation sequence than we can starting from a linear
sequence of three-address statements or quadruples.
Example: Consider the following basic block and its dag representation.
Rearranging the order
The order in which computations are done can affect the cost of resulting object code.
Possible questions from code generation
and code optimization
1. Convert the source code (program/expression/statement) to three-address code
2. Find the syntax tree for three-address code
3. Construct the dag for three-address code
4. Find the optimized dag for three-address code
5. Find the number of basic blocks present in the given three address code
6. Draw the flow graph for three-address code
7. Find the optimization of basic block (apply any transformation method and find the
optimized code)
8. Generate the code for three address code
9. Generate the optimized code for three-address code
10. Construct code for the following sentences
