You are on page 1of 7

MODULE 31 – DAG Construction, reordering and labeling

In this module, we will understand to construct a directed acyclic graph (DAG) for every basic
block. The construction of DAG for arrays and pointers will be discussed as there is a difference
in the construction. The concept of rearranging of instructions will be dealt in the context of
DAG. The nodes of the DAG will also be labeled to have optimized use of registers.

31.1 Directed Acyclic Graph – DAG

DAG is a useful data structure for representing the basic blocks. DAG can be very useful in
identifying and implementing structure preserving and algebraic transformation on basic blocks.
DAG gives all the necessary details of how the value is computed for each statement of the basic
block. Applications of DAG include identifying common sub-expressions in a basic block,
variables which are defined inside the block but evaluated outside and which statements of a
block gets their value computed outside the block.

31.1. DAG construction

A DAG is typically constructed for each basic block. A DAG is a directed acyclic graph and thus
has nodes and edges. Leaves are labeled by unique identifiers such as variable or constants of the
three-address instruction and thus the corresponding r-values of the expression. Interior nodes
are labeled by an operator. Nodes are optimally given a sequence of identifiers. Algorithm 31.1
elucidates the DAG construction.

Algorithm 31.1DAG construction

• Input: Basic block


• Output: DAG having label for each node – operator or variables. For each node a list of
attached identifiers
DAG_Construct( )
{

1. x := y op z
2. x := op y
3. x := y
4. If node (y) is undefined create a label ‘y’ and let it be node (y) and so for node (z) if ‘z’ is
not there
5. In case (i) check if there is a node labeled ‘op’ whose left and right children are ‘y’ and
‘z’ respectively. If not create such a node and let it be ‘n’. In case (ii) ignore ‘z’ and do
the same thing. In case (iii) let the node(y) be ‘n’
6. Delete x from the list of attached identifiers for node(x). Append ‘x’ to the list of attached
identifiers for the node ‘n’ and set node(x) to n
}
Lines 1-3 of the algorithm, gives the possibilities of the expression for the construction of DAG.
Line 4 checks whether the operands of the RHS expression is available in the DAG. If it is
available, we use the same nodes; else we create new nodes for these operands. Line 5 checks
whether the DAG has a operator node with the one in the RHS of the expression whose left and
right children are ‘y’ and ‘z’ respectively. If it is available we use it, else we create a new node
with ‘op’ and attach edges from this node to the operands ‘y’ and ‘z’ respectively. For the
expressions defined in line 2 we repeat the same process but with one operand. For the case
define in line 3, we name the node(x) as node(y).

Consider the following example involving three-address code instructions:

1. t1 := 4 * i
2. t2 := a[t1]
3. t3 := 4 * i
4. t4 := b [t3]
5. t5 := t2 *t4

Instruction 1, checks if the constant ‘4’ and ‘i' are available. As it is not available, we create
these two nodes. Since, these are new nodes, a new internal node with operator, ‘*’ is created
with left and right children as ‘4’ and ‘i’ respectively and the interior node is labeled as t1. For
line 2, the nodes are created with ‘a’ as the left child and since t1 is already there it is used and a
new interior node is created and is labeled as t2. For the instruction at line 3, since nodes ‘4’, ‘i’
and ‘*’ whose left and right children are ‘4’ and ‘i' are already there, they are used and thus t1
and t3 corresponds to the same node. This aspect of the DAG helps in identifying the common
sub-expression. Proceeding in a similar fashion, we end up with the DAG as available in figure
31.1

Figure 31.1 DAG constructed for the example

31.2 DAG construction involving Arrays, Pointers and Procedures

Consider the following sequence of statements involving array operations


x := a[i]
a[ j] := y
z := a[i]

These sequence of statements can be interpreted as same as the following sequences.

x := a[i]
z := x
a[j] := y
If i = j, then the value chosen is different for ‘z’ and thus we have committed a mistake. So,
while processing array, we kill all nodes labeled [] to prevent confusion in computation. Similar
scenarios are adopted for pointers and procedures.

Thus we propose a modification to the algorithm for DAG construction and the following are
updated:

• Evaluation of assignment to an array must follow the previous assignment if there is one
• Any assignment to an array element follow any previous evaluation of ‘a’
• Any use of an identifier must follow the previous procedure call or indirectly through a
pointer if there is one
• Any procedure call or indirect assignment through a pointer must follow all previous
evaluations of an identifier

31.2 Code generation from DAG

In the previous module we understood the simple code generator which generates code from the
basic blocks. In this module we will start understanding the algorithm to generate code from the
DAG. The steps involved in the algorithm to generate code from DAG include

• Rearranging the order – To optimize the code generation, the instructions are rearranged
and this is referred to as heuristic reordering

• Labeling the tree for register information – To know the number of registers required to
generate code, the labels of the nodes are numbered which indicate the number of
registers required to evaluate that node.

• Tree traversal to generate code – This reordered labeled tree is traversed to generate code
based on the target language’s instruction.

In this module we will see how to rearrange the nodes and label them for optimal register usage.
In the next module, we will look at the code generation algorithm from DAG.

31.3. Rearranging the order – Heuristic reordering


Rearranging the nodes involves changing the order of independent statements of the DAG which
will help efficient utilization of the registers. This rearranging of nodes also helps in reducing the
final cost of assembly level code. The algorithm is called Node_listing and is given in Algorithm
31.2

Algorithm 31.2 Node listing algorithm

Node_listing ( )

while unlisted interior nodes remain do begin


1. select an unlisted node n, all of whose parents have been listed ;
list n;
2. while the leftmost child m of n has no unlisted parents and is not a leaf do
/* since n was just listed, m is not yet listed*/
begin
a. list m;
b. n = m
end
end
}

Initially all nodes are unlisted. The outside while loop ensures that all nodes are being listed. A
flag is used to keep track whether all the nodes are listed or not. We start from some node whose
parents are all visited. In the first iteration, none will be listed and we list the root. We start by
looking at the leftmost child of an interior node ‘n’ and if it is not a leaf, we list that. We then
iterate by considering the node that is visited as the child and try to list its parent. If there are no
more nodes to list, we then consider an arbitrary node as given in the outside while loop. After
listing the nodes, we will reverse the string which is the order to execute the nodes.

Consider the following DAG given in figure 31.2 where the nodes are numbe red. Let us try
rearranging this DAG for optimal ordering using the node_listing algorithm

We start from the root and list the root as “1”. We then find and list its left child and list node
“2” based on the inside while loop. ‘6’ is the left child of ‘2’, but ‘6’ has an unlisted parent ‘5’
and therefore we exit the inside while loop. The outside while loop is continued by listing node
‘3’. Its left child is ‘4’ and we list it. It’s left child is ‘5’ and we list followed by listing ‘6’ now
as both of ‘6’’s parents are listed. Nodes 7, 9, 10 are leaf nodes and so we don’t list them. We
continue with the outside while loop to list node ‘8’ and nodes 11 and 12 are leaves and hence
not listed.
Figure 31.2 DAG heuristic reordering example

Thus, the listed nodes are “1234568”. This string is reversed to yield, “8654321”. This indicates
we need to evaluate node 8 followed by 6, 5, 4, 3, 2 and finally 1. The following is the sequence
of instruction after rearranging.

• t8 := d +e
• t6 := a +b
• t5 := t6 - c
• t4 := t5 * t8
• t3 := t4 – e
• t2 := t6 + t4
• t1:= t2 + t3

31.4 Labeling algorithm

Register allocation is a core work in the context of code generation. Using DAG many
optimizations and evaluation could be performed. Common sub-expressions identification,
heuristic reordering are some of the optimizations that were discussed. In addition, the nodes of
the DAG could be labeled which is used to identify the number of registers necessary to compute
every node. The code generation for DAG uses this label to generate the target language’s
instruction. Here, “left leaf” means a node that is a leaf and the left most descendent of its parent.
All other leaves are referred to as “right leaves”.

Labeling can be done by visiting node in a bottom up order so that a node is not visited until all
its children all labeled. The order in which the first three nodes are created is suitable if the parse
tree is used as intermediate code, so in this case the labels can be computed as syntax directed
translation.

A node ‘n’ is labeled using the following equation


The algorithm for labeling is given in Algorithm 31.3.

Algorithm 31.3 Node labelling

Node_labelling( )

{
if n is a leaf then
if n is leftmost child of its parents then
label (n) = 1
else
label (n) = 0
else begin /* n is an interior node */
let n1, n2 , …. , nk be the children of n ordered by label ,
so label (n1) >= label (n2) >= ….>= label (nk) ;
label (n) = max (label(ni) + i - 1)
end
}

The outside if-else checks if a node is a leaf or interior. The algorithm starts by assigning the leaf
node a label of 0 or 1 depending on whether it is a right leaf or a left leaf. The traversal then
proceeds to the interior nodes. The interior nodes are labeled in the “else” of the outside if- else
block. The interior node is labeled by computing the maximum of the labels of its children if
they have different values and is assigned “1” value more if they have same values.

Consider the following DAG given in figure 31.3. Let us see the computation of the labels using
the node_labeling algorithm

Figure 31.3 Example DAG for labeling

We theoretically use post order traversal for label computation. Node ‘a’ is labeled 1 since it is
the left most leaf node. ‘b’ is labeled 0 as it is the right leaf node. Parent of a, b is assigned max
(1,0) which is 1. We then assign ‘e’ with a value of ‘1’ as it is a left leaf node, ‘c’ and ‘d’ with
the values of ‘1’ and ‘0’ as they are the left and right leaf nodes. Node t2 is labeled ‘1’ which is
the maximum of nodes ‘c’ and‘d’’s label. Node t3 is assigned ‘2’ as its children have a label ‘1’
and this node’s label is computed as ‘1’ + label (e). Root’s label is given as ‘2’ as its right child
has a maximum value of ‘2’

Summary: In this module we looked at DAG construction procedure for simple expressions and
in the context of array and pointers. We also started the code generation procedure for DAG
which involves heuristic reordering and labeling. The algorithm for heuristic reordering and
labeling were also discussed. In the next module, we will discuss the procedure to generate code
from this reordered and labeled DAG.

You might also like