Professional Documents
Culture Documents
In this module, we will understand to construct a directed acyclic graph (DAG) for every basic
block. The construction of DAG for arrays and pointers will be discussed as there is a difference
in the construction. The concept of rearranging of instructions will be dealt in the context of
DAG. The nodes of the DAG will also be labeled to have optimized use of registers.
DAG is a useful data structure for representing the basic blocks. DAG can be very useful in
identifying and implementing structure preserving and algebraic transformation on basic blocks.
DAG gives all the necessary details of how the value is computed for each statement of the basic
block. Applications of DAG include identifying common sub-expressions in a basic block,
variables which are defined inside the block but evaluated outside and which statements of a
block gets their value computed outside the block.
A DAG is typically constructed for each basic block. A DAG is a directed acyclic graph and thus
has nodes and edges. Leaves are labeled by unique identifiers such as variable or constants of the
three-address instruction and thus the corresponding r-values of the expression. Interior nodes
are labeled by an operator. Nodes are optimally given a sequence of identifiers. Algorithm 31.1
elucidates the DAG construction.
1. x := y op z
2. x := op y
3. x := y
4. If node (y) is undefined create a label ‘y’ and let it be node (y) and so for node (z) if ‘z’ is
not there
5. In case (i) check if there is a node labeled ‘op’ whose left and right children are ‘y’ and
‘z’ respectively. If not create such a node and let it be ‘n’. In case (ii) ignore ‘z’ and do
the same thing. In case (iii) let the node(y) be ‘n’
6. Delete x from the list of attached identifiers for node(x). Append ‘x’ to the list of attached
identifiers for the node ‘n’ and set node(x) to n
}
Lines 1-3 of the algorithm, gives the possibilities of the expression for the construction of DAG.
Line 4 checks whether the operands of the RHS expression is available in the DAG. If it is
available, we use the same nodes; else we create new nodes for these operands. Line 5 checks
whether the DAG has a operator node with the one in the RHS of the expression whose left and
right children are ‘y’ and ‘z’ respectively. If it is available we use it, else we create a new node
with ‘op’ and attach edges from this node to the operands ‘y’ and ‘z’ respectively. For the
expressions defined in line 2 we repeat the same process but with one operand. For the case
define in line 3, we name the node(x) as node(y).
1. t1 := 4 * i
2. t2 := a[t1]
3. t3 := 4 * i
4. t4 := b [t3]
5. t5 := t2 *t4
Instruction 1, checks if the constant ‘4’ and ‘i' are available. As it is not available, we create
these two nodes. Since, these are new nodes, a new internal node with operator, ‘*’ is created
with left and right children as ‘4’ and ‘i’ respectively and the interior node is labeled as t1. For
line 2, the nodes are created with ‘a’ as the left child and since t1 is already there it is used and a
new interior node is created and is labeled as t2. For the instruction at line 3, since nodes ‘4’, ‘i’
and ‘*’ whose left and right children are ‘4’ and ‘i' are already there, they are used and thus t1
and t3 corresponds to the same node. This aspect of the DAG helps in identifying the common
sub-expression. Proceeding in a similar fashion, we end up with the DAG as available in figure
31.1
x := a[i]
z := x
a[j] := y
If i = j, then the value chosen is different for ‘z’ and thus we have committed a mistake. So,
while processing array, we kill all nodes labeled [] to prevent confusion in computation. Similar
scenarios are adopted for pointers and procedures.
Thus we propose a modification to the algorithm for DAG construction and the following are
updated:
• Evaluation of assignment to an array must follow the previous assignment if there is one
• Any assignment to an array element follow any previous evaluation of ‘a’
• Any use of an identifier must follow the previous procedure call or indirectly through a
pointer if there is one
• Any procedure call or indirect assignment through a pointer must follow all previous
evaluations of an identifier
In the previous module we understood the simple code generator which generates code from the
basic blocks. In this module we will start understanding the algorithm to generate code from the
DAG. The steps involved in the algorithm to generate code from DAG include
• Rearranging the order – To optimize the code generation, the instructions are rearranged
and this is referred to as heuristic reordering
• Labeling the tree for register information – To know the number of registers required to
generate code, the labels of the nodes are numbered which indicate the number of
registers required to evaluate that node.
• Tree traversal to generate code – This reordered labeled tree is traversed to generate code
based on the target language’s instruction.
In this module we will see how to rearrange the nodes and label them for optimal register usage.
In the next module, we will look at the code generation algorithm from DAG.
Node_listing ( )
Initially all nodes are unlisted. The outside while loop ensures that all nodes are being listed. A
flag is used to keep track whether all the nodes are listed or not. We start from some node whose
parents are all visited. In the first iteration, none will be listed and we list the root. We start by
looking at the leftmost child of an interior node ‘n’ and if it is not a leaf, we list that. We then
iterate by considering the node that is visited as the child and try to list its parent. If there are no
more nodes to list, we then consider an arbitrary node as given in the outside while loop. After
listing the nodes, we will reverse the string which is the order to execute the nodes.
Consider the following DAG given in figure 31.2 where the nodes are numbe red. Let us try
rearranging this DAG for optimal ordering using the node_listing algorithm
We start from the root and list the root as “1”. We then find and list its left child and list node
“2” based on the inside while loop. ‘6’ is the left child of ‘2’, but ‘6’ has an unlisted parent ‘5’
and therefore we exit the inside while loop. The outside while loop is continued by listing node
‘3’. Its left child is ‘4’ and we list it. It’s left child is ‘5’ and we list followed by listing ‘6’ now
as both of ‘6’’s parents are listed. Nodes 7, 9, 10 are leaf nodes and so we don’t list them. We
continue with the outside while loop to list node ‘8’ and nodes 11 and 12 are leaves and hence
not listed.
Figure 31.2 DAG heuristic reordering example
Thus, the listed nodes are “1234568”. This string is reversed to yield, “8654321”. This indicates
we need to evaluate node 8 followed by 6, 5, 4, 3, 2 and finally 1. The following is the sequence
of instruction after rearranging.
• t8 := d +e
• t6 := a +b
• t5 := t6 - c
• t4 := t5 * t8
• t3 := t4 – e
• t2 := t6 + t4
• t1:= t2 + t3
Register allocation is a core work in the context of code generation. Using DAG many
optimizations and evaluation could be performed. Common sub-expressions identification,
heuristic reordering are some of the optimizations that were discussed. In addition, the nodes of
the DAG could be labeled which is used to identify the number of registers necessary to compute
every node. The code generation for DAG uses this label to generate the target language’s
instruction. Here, “left leaf” means a node that is a leaf and the left most descendent of its parent.
All other leaves are referred to as “right leaves”.
Labeling can be done by visiting node in a bottom up order so that a node is not visited until all
its children all labeled. The order in which the first three nodes are created is suitable if the parse
tree is used as intermediate code, so in this case the labels can be computed as syntax directed
translation.
Node_labelling( )
{
if n is a leaf then
if n is leftmost child of its parents then
label (n) = 1
else
label (n) = 0
else begin /* n is an interior node */
let n1, n2 , …. , nk be the children of n ordered by label ,
so label (n1) >= label (n2) >= ….>= label (nk) ;
label (n) = max (label(ni) + i - 1)
end
}
The outside if-else checks if a node is a leaf or interior. The algorithm starts by assigning the leaf
node a label of 0 or 1 depending on whether it is a right leaf or a left leaf. The traversal then
proceeds to the interior nodes. The interior nodes are labeled in the “else” of the outside if- else
block. The interior node is labeled by computing the maximum of the labels of its children if
they have different values and is assigned “1” value more if they have same values.
Consider the following DAG given in figure 31.3. Let us see the computation of the labels using
the node_labeling algorithm
We theoretically use post order traversal for label computation. Node ‘a’ is labeled 1 since it is
the left most leaf node. ‘b’ is labeled 0 as it is the right leaf node. Parent of a, b is assigned max
(1,0) which is 1. We then assign ‘e’ with a value of ‘1’ as it is a left leaf node, ‘c’ and ‘d’ with
the values of ‘1’ and ‘0’ as they are the left and right leaf nodes. Node t2 is labeled ‘1’ which is
the maximum of nodes ‘c’ and‘d’’s label. Node t3 is assigned ‘2’ as its children have a label ‘1’
and this node’s label is computed as ‘1’ + label (e). Root’s label is given as ‘2’ as its right child
has a maximum value of ‘2’
Summary: In this module we looked at DAG construction procedure for simple expressions and
in the context of array and pointers. We also started the code generation procedure for DAG
which involves heuristic reordering and labeling. The algorithm for heuristic reordering and
labeling were also discussed. In the next module, we will discuss the procedure to generate code
from this reordered and labeled DAG.