
Unit IV: Syntax Directed Translation

The document discusses Syntax Directed Translations (SDTs) and their types, focusing on syntax-directed definitions that associate attributes with grammar symbols. It explains synthesized and inherited attributes, providing examples of their application in a simple desk calculator grammar and a declaration grammar. The document also illustrates the process of annotating parse trees to compute attribute values using semantic rules.

Uploaded by

ergopal2005
Copyright
© All Rights Reserved
Unit – IV

Unit IV: Syntax Directed Translation

Question 1: What are Syntax Directed Translations (SDTs)? What are its types?
Explain with a suitable example.

1. Syntax-Directed Definition
A syntax-directed definition is a generalization of a context-free grammar in which each
grammar symbol has an associated set of attributes, divided into two kinds called the
synthesized and inherited attributes of that grammar symbol.
An attribute can represent a string, a number, a type, a memory location, or anything else. The
value of an attribute at a parse-tree node is defined by a semantic rule associated with the
production used at that node.
• The value of a synthesized attribute at a node is computed from the values of attributes at
the children of that node in the parse tree.
• The value of an inherited attribute is computed from the values of attributes at the
siblings and parent of that node.
Semantic rules set up dependencies between attributes, and these dependencies are represented by a
graph. From this dependency graph we can derive an evaluation order for the semantic rules. Evaluating the
semantic rules defines the values of the attributes at the nodes in the parse tree for the input
string.
A parse tree showing the values of attributes at each node is called an annotated parse
tree. The process of computing the attribute values at the nodes is called annotating or decorating
the parse tree.

a) Form of a Syntax-Directed Definition


In a syntax-directed definition, each grammar production A → α has associated with it a
set of semantic rules of the form b := f (c1, c2, . . . , ck), where f is a function and either
• b is a synthesized attribute of A and c1, c2, . . . , ck are attributes belonging to the grammar
symbols of the production, or
• b is an inherited attribute of one of the grammar symbols on the right side of the production,
and c1, c2, . . . , ck are attributes belonging to the grammar symbols of the production.
In either case, we say that attribute b depends on attributes c1, c2, . . . , ck. An attribute grammar
is a syntax-directed definition in which the functions in semantic rules cannot have side effects.


Question 2: What is the Syntax Directed Definition (SDD) for the simple desk
calculator, write and explain. What would be the parse tree for 1 * 2 + 3n?

b) Synthesized Attributes
The value of a synthesized attribute at a node is computed from the values of attributes at
the children of that node in the parse tree. A syntax-directed definition that uses only synthesized
attributes is said to be an S-attributed definition.
A parse tree for an S-attributed definition can always be annotated by evaluating the
semantic rules for the attributes at each node bottom up, from the leaves to the root.

Example 1: Consider the following grammar for desk calculator


L → E n
E → E1 + T | T
T → T1 * F | F
F → ( E ) | digit
The following table shows the syntax-directed definition for the above desk calculator
grammar. Each of the nonterminals E, T and F has a single attribute, val, and each semantic
rule computes the value of val for the head of its production.
The token digit has a synthesized attribute lexval whose value is supplied by the lexical
analyzer. In the rule L → E n, L is the start symbol, and its rule yields the final output of the statement.
• In a syntax-directed definition, terminals have synthesized attributes only.
• The values of these attributes are supplied by the lexical analyzer, so the definition contains
no semantic rules for terminals. A syntax-directed definition that uses only synthesized attributes is
called an S-attributed definition.
• In a parse tree, the semantic rule at each node is evaluated to annotate (compute) the
attributes of an S-attributed definition. This processing is done in bottom-up fashion.
     Production        Semantic Rules
1.   L → E n           L.val := E.val
2.   E → E1 + T        E.val := E1.val + T.val
3.   E → T             E.val := T.val
4.   T → T1 * F        T.val := T1.val × F.val
5.   T → F             T.val := F.val
6.   F → ( E )         F.val := E.val
7.   F → digit         F.val := digit.lexval
Table: Syntax-directed definition
• The rule for production 1, L → E n, sets L.val to E.val.
• Production 2, E → E1 + T, also has one rule, which computes the val attribute for the
head E as the sum of the values at E1 and T. At any parse tree node N labeled E, the value
of val for E is the sum of the values of val at the children of node N labeled E and T.
• Production 3, E → T, has a single rule that defines the value of val for E to be the same as
the value of val at the child for T. Production 4 is similar to the second production; its
rule multiplies the values at the children instead of adding them.
• The rules for productions 5 and 6 copy values at a child, like that for the third production.
• Production 7 gives F.val the value of a digit, that is, the numerical value of the token
digit that the lexical analyzer returned.
Following steps are used to compute S-attributed definition:
1. Write the syntax directed definition using the appropriate semantic actions for
corresponding production rule of the given grammar.
2. The annotated parse tree is generated and attribute values are computed. The computation
is done in bottom up manner.
3. The value obtained at the root node is the final output.
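The semantic rules of the desk calculator can be tried out with a small program. The following is a minimal sketch in Python, not the bottom-up method described above: it assumes single-digit operands, uses the letter n as the end marker as in the text, and replaces the left-recursive productions by loops so that a simple hand-written parser can be used. The names Calculator and evaluate are hypothetical. Each method returns the synthesized attribute val of its nonterminal.

```python
import re


def tokenize(text):
    # Split the input into digits, operators, parentheses and the end marker 'n'.
    return re.findall(r"\d|[+*()n]", text)


class Calculator:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def eat(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, found {tok!r}")
        self.pos += 1
        return tok

    def line(self):                      # L -> E n
        val = self.expr()
        self.eat("n")
        return val                       # L.val := E.val

    def expr(self):                      # E -> E1 + T | T
        val = self.term()                # E.val := T.val
        while self.peek() == "+":
            self.eat("+")
            val = val + self.term()      # E.val := E1.val + T.val
        return val

    def term(self):                      # T -> T1 * F | F
        val = self.factor()              # T.val := F.val
        while self.peek() == "*":
            self.eat("*")
            val = val * self.factor()    # T.val := T1.val x F.val
        return val

    def factor(self):                    # F -> ( E ) | digit
        if self.peek() == "(":
            self.eat("(")
            val = self.expr()            # F.val := E.val
            self.eat(")")
            return val
        tok = self.eat()
        if not tok.isdigit():
            raise SyntaxError(f"expected a digit, found {tok!r}")
        return int(tok)                  # F.val := digit.lexval


def evaluate(text):
    return Calculator(text).line()


print(evaluate("1*2+3n"))
```

Because every attribute here is synthesized, each method only needs the values returned by the calls it makes for its children.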

Example 2: Construct the parse tree, syntax tree and annotated parse tree for the input string
1*2+3n.
Solution:
Figure: (a) Syntax tree and (b) parse tree for 1*2+3n
The expression 1*2+3 is followed by the end marker n (a newline character), and the program prints
the value 5. The figure shows the annotated parse tree for the input 1*2+3n.
The output printed at the root of the tree is the value of E.val at the first child of the root.
L.val = 5
├─ E.val = 5
│  ├─ E.val = 2
│  │  └─ T.val = 2
│  │     ├─ T.val = 1
│  │     │  └─ F.val = 1
│  │     │     └─ digit.lexval = 1
│  │     ├─ *
│  │     └─ F.val = 2
│  │        └─ digit.lexval = 2
│  ├─ +
│  └─ T.val = 3
│     └─ F.val = 3
│        └─ digit.lexval = 3
└─ n
Figure: Annotated parse tree for 1*2+3n. Every value is obtained from child to parent.
For the computation of attributes, we start from the leftmost, bottommost node. The
rule F → digit is used to reduce digit to F. The semantic action that takes place here is
F.val := digit.lexval, so the value of digit obtained from the lexical analyzer becomes the value of F.
Hence F.val = 1.
Since T is the parent node of F and the semantic action is T.val := F.val, we
get T.val = 1. Thus the computation of S-attributes is done from the children. Then consider the
production T → T1 * F; the corresponding semantic action is
T.val := T1.val × F.val
Hence T.val := T1.val × F.val = 1 × 2 = 2.
Similarly, the sum E1.val + T.val gives the value at the E node:
E.val := E1.val + T.val = 2 + 3
E.val := 5
Here we get E1.val from the left child of E and T.val from the right child of E, so the
value of E is 5. Then the production L → E n is applied, with E.val = 5 and n as the end marker.
The semantic action associated with L → E n tells us to print the result E.val. Hence
the output will be 5.

Question 3: What should be the SDD with inherited attribute for the grammar:
D → T L     T → int | real     L → L , id | id
Show the dependency graph for "real id, id, id".
Question 4: Consider the following grammar:
D → T L     T → int | real     L → L , id | id
Construct a syntax directed scheme with inherited attribute L. Show parse tree for
input string "int x, y, z".
c) Inherited Attribute
The value of an inherited attribute at a node in a parse tree is defined using the attribute
values at the parent or siblings of that node. Inherited attributes are convenient for expressing the dependence
of a programming language construct on the context in which it appears.
For example, we can use an inherited attribute to keep track of whether an identifier
appears on the left or right side of an assignment in order to decide whether the address or the
value of the identifier is needed.
Example 3: For the following grammar, annotate the parse tree for the computation of inherited
attributes for the given string: real id1, id2, id3;
D→TL
T → int | real
L → L1 , id | id
For the string real id1, id2, id3 we have to distribute the data type real to all the identifiers id1, id2
and id3; such that id1 becomes real, id2 becomes real and id3 becomes real. Following steps are to
be followed,
1. Construct the syntax directed definition using semantic action.
2. Annotate the parse tree with inherited attributes by processing in top down fashion.
The syntax directed definition for the above given grammar is
     Production        Semantic Rules
1.   D → T L           L.in := T.type
2.   T → int           T.type := integer
3.   T → real          T.type := real
4.   L → L1 , id       L1.in := L.in
                       addType(id.entry, L.in)
5.   L → id            addType(id.entry, L.in)
Table: Syntax directed definition with inherited attribute L.in.
A declaration generated by the nonterminal D in the syntax-directed definition in the table
consists of the keyword int or real, followed by a list of identifiers. The nonterminal T has a
synthesized attribute type, whose value is determined by the keyword in the declaration.
The semantic rule L.in := T.type, associated with production D → TL, sets inherited
attribute L.in to the type in the declaration. The rules then pass this type down the parse tree using
the inherited attribute L.in.
Rules associated with the productions for L call procedure addType to add the type of
each identifier to its entry in the symbol table (pointed to by attribute entry). Nonterminal D
represents a declaration, which, from production 1, consists of a type T followed by a list L of
identifiers. T has one attribute, T.type, which is the type in the declaration D. Nonterminal L also
has one attribute, which we call in to emphasize that it is an inherited attribute.
The purpose of L.in is to pass the declared type down the list of identifiers, so that it can
be added to the appropriate symbol-table entries.
• Productions 2 and 3 evaluate the synthesized attribute T.type, giving the appropriate
value, integer or real. This type is passed to the attribute L.in in the rule for production 1.
• Production 4 passes L.in down the parse tree. That is, the value L1.in is computed at a
parse-tree node by copying the value of L.in from the parent of that node; the parent
corresponds to the head of the production.
• Productions 4 and 5 also have a rule in which a function addType is called with two
arguments:
   - id.entry, a lexical value that points to a symbol-table object, and
   - L.in, the type being assigned to every identifier on the list.
Figure shows an annotated parse tree for the sentence real id1, id2, id3.
D
├─ T.type = real
│  └─ real
└─ L.in = real
   ├─ L.in = real
   │  ├─ L.in = real
   │  │  └─ id1
   │  ├─ ,
   │  └─ id2
   ├─ ,
   └─ id3
Figure: Annotated parse tree for real id1, id2, id3. T.type is obtained from child to parent; each L.in is obtained from parent to child.
The value of L.in at the three L-nodes gives the type of the identifiers id1, id2, and id3.
These values are determined by computing the value of the attribute T.type at the left child of the
root and then evaluating L.in top-down at the three L-nodes in the right subtree of the root.
At each L-node we also call the procedure addType to record in the symbol table the fact that the
identifier at the right child of this node has type real.
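The way L.in carries the type down the list can be sketched in Python. This is a minimal illustration under stated assumptions: the symbol table is a plain dictionary, identifiers are given as strings, and the helper names add_type, eval_L and declare are hypothetical. The parameter l_in plays the role of the inherited attribute L.in.

```python
def add_type(symbol_table, entry, type_name):
    # Counterpart of addType(id.entry, L.in): record the type of one identifier.
    symbol_table[entry] = type_name


def eval_L(ids, l_in, symbol_table):
    # L -> L1 , id :  addType(id.entry, L.in);  L1.in := L.in
    # L -> id      :  addType(id.entry, L.in)
    add_type(symbol_table, ids[-1], l_in)
    if len(ids) > 1:
        eval_L(ids[:-1], l_in, symbol_table)   # pass L.in down to L1


def declare(text):
    # D -> T L :  L.in := T.type
    keyword, rest = text.split(maxsplit=1)
    t_type = {"int": "integer", "real": "real"}[keyword]   # T.type
    ids = [name.strip() for name in rest.split(",")]
    symbol_table = {}
    eval_L(ids, t_type, symbol_table)
    return symbol_table


print(declare("real id1, id2, id3"))
```

The rightmost identifier is handled first because it hangs from the topmost L-node, which is the first node to receive L.in.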
Example 4:
The SDD in Fig. computes terms like 3*5 and 3*5*7. The top-down parse of input 3*5
begins with the production T → FT '. Here, F generates the digit 3, but the operator * is
generated by T '. Thus, the left operand 3 appears in a different subtree of the parse tree from *.
An inherited attribute will therefore be used to pass the operand to the operator.
     Production        Semantic Rules
1.   T → F T'          T'.inh := F.val
                       T.val := T'.syn
2.   T' → * F T1'      T1'.inh := T'.inh × F.val
                       T'.syn := T1'.syn
3.   T' → ε            T'.syn := T'.inh
4.   F → digit         F.val := digit.lexval
Figure: An SDD based on a grammar suitable for top-down parsing
Each of the nonterminals T and F has a synthesized attribute val; the terminal digit has a
synthesized attribute lexval. The nonterminal T' has two attributes: an inherited attribute inh and a
synthesized attribute syn.
The semantic rules are based on the idea that the left operand of the operator * is
inherited. In short, the head T ' of the production T ' → * F T1' inherits the left operand of * in
the production body. Given a term x * y * z, the root of the subtree for * y * z inherits x. Then, the
root of the subtree for * z inherits the value of x * y, and so on, if there are more factors in the
term. Once all the factors have been accumulated, the result is passed back up the tree using
synthesized attributes.
To see how the semantic rules are used, consider the annotated parse tree for 3 * 5 in
the figure. The leftmost leaf in the parse tree, labeled digit, has attribute value lexval = 3, where the 3
is supplied by the lexical analyzer. Its parent is for production 4, F → digit. The only semantic
rule associated with this production defines F.val = digit.lexval, which equals 3.
At the second child of the root, the inherited attribute T '.inh is defined by the semantic
rule T'.inh = F.val associated with production 1. Thus, the left operand, 3, for the * operator is
passed from left to right across the children of the root.

Figure: Annotated parse tree for 3 * 5


The production at the node for T' is T' → * F T1'. (We retain the subscript 1 in the annotated parse
tree to distinguish between the two nodes for T'.) The inherited attribute T1'.inh is defined by the
semantic rule T1'.inh = T'.inh × F.val associated with production 2.
With T'.inh = 3 and F.val = 5, we get T1'.inh = 15. At the lower node for T1', the
production is T' → ε. The semantic rule T'.syn = T'.inh defines T1'.syn = 15. The syn attributes
at the nodes for T' pass the value 15 up the tree to the node for T, where T.val = 15.
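The flow of inh and syn through the productions for T' can be followed in a short Python sketch. It assumes whitespace-separated tokens, and the function names are hypothetical. The parameter inh is the inherited attribute T'.inh, and the first component of each return value is the synthesized attribute (val or syn).

```python
def parse_F(tokens):
    # F -> digit :  F.val := digit.lexval
    if not tokens or not tokens[0].isdigit():
        raise SyntaxError("expected a digit")
    return int(tokens[0]), tokens[1:]


def parse_T_prime(tokens, inh):
    if tokens and tokens[0] == "*":
        # T' -> * F T1' :  T1'.inh := T'.inh x F.val ;  T'.syn := T1'.syn
        f_val, rest = parse_F(tokens[1:])
        return parse_T_prime(rest, inh * f_val)
    # T' -> epsilon :  T'.syn := T'.inh
    return inh, tokens


def parse_T(tokens):
    # T -> F T' :  T'.inh := F.val ;  T.val := T'.syn
    f_val, rest = parse_F(tokens)
    return parse_T_prime(rest, f_val)


def term_value(text):
    val, rest = parse_T(text.split())
    if rest:
        raise SyntaxError(f"unexpected input: {rest}")
    return val


print(term_value("3 * 5"))
```

For the input 3 * 5 * 7 the inner calls receive inh = 3, then 15, then 105, which is the accumulation of factors described above.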

Question 5: Let the synthesized attribute 'val' give the value of the binary number generated
by 'S' in the following grammar. (For example, on input 100.101, S.val = 4.625.)
S → L . L | L     L → L B | B     B → 0 | 1
Give the synthesized attributes to determine S.val.

Solution:
We will use the attributes val for the synthesized value, frac for the fractional binary value of a string,
int for the integer binary value of a string and len for the current position of a bit in the binary string.
For example, in the string 100.101, in the fractional part, i.e. .101, for the first digit after the
dot, i.e. 1, L.len = 0 + 1 = 1; for the second digit 0, L.len = 1 + 1 = 2; and for the third digit 1,
L.len = 2 + 1 = 3. (100.101 = 1×2^2 + 0×2^1 + 0×2^0 + 1×2^-1 + 0×2^-2 + 1×2^-3 = 4 + 0 + 0 + 0.5 + 0 + 0.125 = 4.625.)
The following table shows the syntax-directed definition:
Production        Semantic Rules
S → L1 . L2       S.val := L1.int + L2.frac
S → L             S.val := L.int
L → L1 B          L.int := L1.int * 2 + B.val
                  L.len := L1.len + 1
                  L.frac := L1.frac + B.val * (1.0/2.0) ^ L.len
L → B             L.int := B.val
                  L.len := 1
                  L.frac := B.val * (1.0/2.0) ^ L.len
B → 0             B.val := 0
B → 1             B.val := 1
Table: Syntax-directed definition (^ represents the exponent operation)
For example, for the input string w = 100.101, the annotated parse tree will be:
S.val = 4.625  (f)
├─ L.int = 4  (b)
│  ├─ L.int = 2  (a)
│  │  ├─ L.int = 1
│  │  │  └─ B.val = 1
│  │  │     └─ 1
│  │  └─ B.val = 0
│  │     └─ 0
│  └─ B.val = 0
│     └─ 0
├─ .
└─ L.frac = 0.625  (e)
   ├─ L.frac = 0.5  (d)
   │  ├─ L.frac = 0.5  (c)
   │  │  └─ B.val = 1
   │  │     └─ 1
   │  └─ B.val = 0
   │     └─ 0
   └─ B.val = 1
      └─ 1
Figure: Annotated Parse Tree (the letters a to f refer to the calculations given below)
The values in the tree are calculated as:
a. L.int := L1.int * 2 + B.val = 1 * 2 + 0 = 2
b. L.int := L1.int * 2 + B.val = 2 * 2 + 0 = 4
c. L.frac := B.val * (1.0/2.0) ^ L.len = 1 * (1.0/2.0) ^ 1 = 0.5
d. L.frac := L1.frac + B.val * (1.0/2.0) ^ L.len = 0.5 + 0 * (1.0/2.0) ^ 2 = 0.5 + 0 = 0.5
e. L.frac := L1.frac + B.val * (1.0/2.0) ^ L.len = 0.5 + 1 * (1.0/2.0) ^ 3 = 0.5 + 0.125 = 0.625
f. S.val := L.int + L.frac = 4 + 0.625 = 4.625 (From (b) and (e))
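The calculations a to f can be reproduced with a few lines of Python. This is a sketch under the assumption that the input is a string of 0s and 1s with at most one dot; the function names eval_B, eval_L and eval_S are hypothetical. Each call to eval_L returns the three synthesized attributes (int, len, frac) of one L-node.

```python
def eval_B(bit):
    # B -> 0 | 1 :  B.val := 0 or 1
    return {"0": 0, "1": 1}[bit]


def eval_L(bits):
    """Return (L.int, L.len, L.frac) for the L-node that derives `bits`."""
    b_val = eval_B(bits[-1])
    if len(bits) == 1:
        # L -> B
        length = 1
        return b_val, length, b_val * (1.0 / 2.0) ** length
    # L -> L1 B
    l1_int, l1_len, l1_frac = eval_L(bits[:-1])
    length = l1_len + 1
    return l1_int * 2 + b_val, length, l1_frac + b_val * (1.0 / 2.0) ** length


def eval_S(text):
    if "." in text:
        # S -> L1 . L2 :  S.val := L1.int + L2.frac
        left, right = text.split(".")
        return eval_L(left)[0] + eval_L(right)[2]
    # S -> L :  S.val := L.int
    return eval_L(text)[0]


print(eval_S("100.101"))
```

All three attributes are synthesized, so one bottom-up pass over each L subtree is enough.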

Question 6: What do you mean by a dependency graph? Explain by giving a suitable
example.

2. Evaluation Orders for SDD's
"Dependency graphs" are a useful tool for determining an evaluation order for the attribute
instances in a given parse tree. While an annotated parse tree shows the values of attributes, a
dependency graph helps us determine how those values can be computed.

a) Dependency Graph
A dependency graph shows the flow of information among the attribute instances in a
particular parse tree; an edge from one attribute instance to another means that the value of the first
is needed to compute the second. Edges express constraints implied by the semantic rules.
If an attribute 'y' can be evaluated only after the evaluation of 'z', then we say that y is dependent on z.
This dependency is shown by adding edges between the attributes on the parse tree; the result is known as a dependency graph.
Dependency graphs exist for both synthesized and inherited attributes. Before constructing a
dependency graph for a parse tree, we put each semantic rule into the form b := f (c1, c2, . . . , ck)
by introducing a dummy synthesized attribute b for each semantic rule that consists of a
procedure call. The graph has a node for each attribute and an edge to the node for b from the
node for c if attribute b depends on attribute c.
Example 5: Design the dependency graph for the following grammar
E → E1 + E2
E → E1 * E2
Solution:
The semantic rules for the above grammar are as given below:
Production          Semantic Rules
E → E1 + E2         E.val := E1.val + E2.val
E → E1 * E2         E.val := E1.val * E2.val
Table: Syntax-directed definition
The dependency graph is shown in the following figure.
Figure: Dependency Graph (a val node beside each E, with edges into the val of a parent E from the val nodes of its operands E1 and E2)

Each synthesized attribute val is drawn as a node next to its grammar symbol. Hence the synthesized attributes
are E.val, E1.val and E2.val. The dependencies among the nodes are shown by solid
arrows. The arrows from E1 and E2 show that the value of E depends upon E1 and E2.
Example 6:
Suppose A.a := f(X.x, Y.y) is a semantic rule for the production A → XY. This rule defines
a synthesized attribute A.a that depends on the attributes X.x and Y.y. If this production is used in
the parse tree, then there will be three nodes A.a, X.x, and Y.y in the dependency graph with an
edge to A.a from X.x since A.a depends on X.x, and an edge to A.a from Y.y since A.a also
depends on Y.y.
If the production A → XY has the semantic rule X.i := g(A.a, Y.y) associated with it, then
there will be an edge to X.i from A.a and also an edge to X.i from Y.y, since X.i depends on both
A.a and Y.y.

Example 7:
An example of a complete dependency graph is shown in the figure. The nodes of the
dependency graph, represented by the numbers 1 through 9, correspond to the attributes in the
annotated parse tree for 3 * 5 given in Example 4.

Figure: Dependency graph for the annotated parse tree for 3 * 5


Nodes 1 and 2 represent the attribute lexval associated with the two leaves labeled digit.
Nodes 3 and 4 represent the attribute val associated with the two nodes labeled F. The edges to
node 3 from 1 and to node 4 from 2 result from the semantic rule that defines F.val in terms of
digit.lexval. In fact, F.val equals digit.lexval, but the edge represents dependence not equality.
Nodes 5 and 6 represent the inherited attribute T'.inh associated with each of the
occurrences of nonterminal T'. The edge to 5 from 3 is due to the rule T'.inh = F.val, which
defines T'.inh at the right child of the root from F.val at the left child. We see edges to 6 from
node 5 for T'.inh and from node 4 for F.val, because these values are multiplied to evaluate the
attribute inh at node 6.
Nodes 7 and 8 represent the synthesized attribute syn associated with the occurrences of
T'. The edge to node 7 from 6 is due to the semantic rule T'.syn = T'.inh associated with
production 3 in the figure. The edge to node 8 from 7 is due to a semantic rule associated with
production 2.
Finally, node 9 represents the attribute T.val. The edge to 9 from 8 is due to the semantic
rule T.val = T'.syn, associated with production 1.
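The edges just described are enough to compute one valid evaluation order mechanically. The sketch below uses the standard-library module graphlib (Python 3.9 or later); the edge list is transcribed from the description of nodes 1 to 9 above, and the name evaluation_order is hypothetical.

```python
from graphlib import TopologicalSorter

# Each pair (source, target) means "target depends on source".
EDGES = [
    (1, 3), (2, 4),      # F.val depends on digit.lexval
    (3, 5),              # T'.inh := F.val
    (5, 6), (4, 6),      # T1'.inh := T'.inh x F.val
    (6, 7),              # T'.syn := T'.inh      (production 3)
    (7, 8),              # T'.syn := T1'.syn     (production 2)
    (8, 9),              # T.val := T'.syn       (production 1)
]


def evaluation_order(edges):
    sorter = TopologicalSorter()
    for source, target in edges:
        sorter.add(target, source)       # add(node, *predecessors)
    # static_order() raises graphlib.CycleError if the graph has a cycle.
    return list(sorter.static_order())


order = evaluation_order(EDGES)
print(order)
```

Any order returned this way lists every attribute after the attributes it depends on; several different orders may be valid for the same graph.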

Question 7: Consider the following grammar. Show the parse tree and dependency
graph for the input string real x, y, z:
D → T L
T → int | real
L → L , id | id

Solution:
A dependency graph for the input string real id1, id2, id3 is shown in the figure.
Figure: Dependency graph for a declaration real id1, id2, id3 (an edge labelled "from sibling" runs from T.type to the topmost L.in, and an edge labelled "parent to child" runs from each L.in to the L.in of the L-node below it)
The dependencies among the nodes are shown by solid arrows. The dependency
graph shows how the values are inherited from the parent or sibling nodes. Hence the
attributes are called inherited attributes.

b) Evaluation Order
The topological sort of the dependency graph decides the evaluation order in a parse tree.
In deciding evaluation order the semantic rules in the syntax directed definitions are used. Thus
the translation is specified by syntax directed definitions.

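Figure: Evaluation Order. The parse tree for real id1, id2, id3 is numbered as follows: (1) T.type at the keyword real, (2) L.in = real at the topmost L-node, (3) id3, (4) L.in = real at the second L-node, (5) id2, (6) L.in = real at the lowest L-node, (7) id1.
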


The evaluation order can be decided as follows:
1. The type real is obtained from lexical analyzer by analyzing the input token.
2. The L.in is assigned the type real from the sibling T.type
3. The entry in the symbol table for identifier id3 gets associated with the type real. Hence
variable id3 becomes of real type.
4. The L.in is assigned the type real from the parent L.in
5. The entry in the symbol table for identifier id2 gets associated with the type real. Hence
variable id2 becomes of real type.
6. The L.in is assigned the type real from the parent L.in
7. The entry in the symbol table for identifier id1 gets associated with the type real. Hence
variable id1 becomes of real type.
Thus evaluating the semantic rules in this order stores the type real in the symbol
table entry for each of the identifiers id1, id2 and id3.

c) S-Attributed Definitions
Given an SDD, it is difficult to tell whether there exist any parse trees whose dependency
graphs have cycles. In practice, translations can be implemented using classes of SDD's that
guarantee an evaluation order, since they do not permit dependency graphs with cycles. Also, the
classes can be implemented efficiently in connection with top-down or bottom-up parsing. The
first class is defined as follows:
• An SDD is S-attributed if every attribute is synthesized.

Example 8: The SDD of table is an example of an S-attributed definition. Each attribute, L.val,
E.val, T.val, and F.val is synthesized.
Production        Semantic Rules
L → E n           L.val := E.val
E → E1 + T        E.val := E1.val + T.val
E → T             E.val := T.val
T → T1 * F        T.val := T1.val × F.val
T → F             T.val := F.val
F → ( E )         F.val := E.val
F → digit         F.val := digit.lexval
Table: Syntax-directed definition
When an SDD is S-attributed, we can evaluate its attributes in any bottom-up order of the
nodes of the parse tree. It is simple to evaluate the attributes by performing a postorder traversal
of the parse tree and evaluating the attributes at a node N when the traversal leaves N for the last
time. That is, we apply the function postorder, defined below, to the root of the parse tree.
postorder(N) {
    for (each child C of N, from the left) postorder(C);
    evaluate the attributes associated with node N;
}
S-attributed definitions can be implemented during bottom-up parsing, since a bottom-up
parse corresponds to a postorder traversal. Specifically, postorder corresponds exactly to the
order in which an LR parser reduces a production body to its head.
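The postorder function can be made concrete for the desk calculator. The following Python sketch assumes the parse tree has already been built; the Node class, the RULES table and the helper digit are hypothetical names introduced for illustration. Each entry of RULES maps a production to the semantic rule that computes val at the head from the children.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    symbol: str                          # grammar symbol labelling the node
    children: list = field(default_factory=list)
    lexval: Optional[int] = None         # set only on digit leaves
    val: Optional[int] = None            # synthesized attribute


# (head, body) -> semantic rule applied to the list of children
RULES = {
    ("L", ("E", "n")): lambda k: k[0].val,
    ("E", ("E", "+", "T")): lambda k: k[0].val + k[2].val,
    ("E", ("T",)): lambda k: k[0].val,
    ("T", ("T", "*", "F")): lambda k: k[0].val * k[2].val,
    ("T", ("F",)): lambda k: k[0].val,
    ("F", ("(", "E", ")")): lambda k: k[1].val,
    ("F", ("digit",)): lambda k: k[0].lexval,
}


def postorder(node):
    for child in node.children:          # visit the children from the left
        postorder(child)
    if node.children:                    # interior node: apply its rule last
        body = tuple(c.symbol for c in node.children)
        node.val = RULES[(node.symbol, body)](node.children)


def digit(value):
    # Subtree for F -> digit with the given lexical value.
    return Node("F", [Node("digit", lexval=value)])


# Parse tree for 1*2+3n
tree = Node("L", [
    Node("E", [
        Node("E", [Node("T", [Node("T", [digit(1)]), Node("*"), digit(2)])]),
        Node("+"),
        Node("T", [digit(3)]),
    ]),
    Node("n"),
])
postorder(tree)
print(tree.val)
```

A node is evaluated only after all of its children, which is exactly the order in which the values are needed in an S-attributed definition.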

d) L-Attributed Definitions
The second type of SDD's is called L-attributed definitions. In this type, between the
attributes associated with a production body, dependency-graph edges can go from left to right,
but not from right to left (hence "L-attributed"). In short, each attribute must be either
1. Synthesized, or
2. Inherited, but with the rules limited as follows. Suppose that there is a production A →
X1 X2 ... Xn, and that there is an inherited attribute Xi.a computed by a rule associated with
this production. Then the rule may use only:
a) Inherited attributes associated with the head A.
b) Either inherited or synthesized attributes associated with the occurrences of symbols
X1, X2, ..., Xi-1 located to the left of Xi.
c) Inherited or synthesized attributes associated with this occurrence of Xi itself, but only
in such a way that there are no cycles in a dependency graph formed by the attributes of
this Xi.
Example 9:
The SDD in table is L-attributed.
Production         Semantic Rules
T → F T'           T'.inh := F.val
                   T.val := T'.syn
T' → * F T1'       T1'.inh := T'.inh × F.val
                   T'.syn := T1'.syn
T' → ε             T'.syn := T'.inh
F → digit          F.val := digit.lexval
Table: Syntax-directed definition
To see why, consider the semantic rules for inherited attributes:
Production         Semantic Rules
T → F T'           T'.inh := F.val
T' → * F T1'       T1'.inh := T'.inh × F.val
The first of these rules defines the inherited attribute T'.inh using only F.val, and F appears to the
left of T' in the production body. The second rule defines T1'.inh using the inherited attribute
T'.inh associated with the head, and F.val, where F appears to the left of T1' in the production
body.
In each of these cases, the rules use information "from above or from the left," as
required by the class. The remaining attributes are synthesized. Hence, the SDD is L-attributed.
Example 10:
Any SDD containing the following production and rules cannot be L-attributed:
Production Semantic Rules
A→BC A.s := B.b
B.i := f (C.c, A.s)
The first rule, A.s := B.b, is a legitimate rule in either an S-attributed or L-attributed SDD. It
defines a synthesized attribute A.s in terms of an attribute at a child (that is, a symbol within the
production body).
The second rule defines an inherited attribute B.i, so the entire SDD cannot be S-attributed. Also,
although the rule is legal, the SDD cannot be L-attributed, because the attribute C.c is used to define B.i,
and C is to the right of B in the production body. While attributes at siblings in a parse tree may
be used in L-attributed SDD's, they must be to the left of the symbol whose attribute is being
defined.
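The restrictions on inherited attributes can be turned into a simple check on a single semantic rule. The Python sketch below is an illustration only: the Attr class and the function rule_is_l_attributed are hypothetical, positions are numbered with 0 for the head A and i for the body symbol Xi, and the cycle condition of case (c) is not checked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attr:
    position: int      # 0 = head A, i >= 1 = body symbol Xi
    name: str
    inherited: bool    # True for inherited, False for synthesized


def rule_is_l_attributed(target, sources):
    """Check one rule `target := f(sources)` against the L-attributed limits."""
    if not target.inherited:
        return True    # rules for synthesized attributes are not restricted here
    for src in sources:
        if src.position == 0:
            if not src.inherited:          # (a) only inherited attributes of A
                return False
        elif src.position > target.position:
            return False                   # uses a symbol to the right of Xi
        # (b) symbols to the left of Xi are allowed;
        # (c) attributes of Xi itself are allowed (cycles are not checked).
    return True


# Example 10:  A -> B C  with  A.s := B.b  and  B.i := f(C.c, A.s)
print(rule_is_l_attributed(Attr(0, "s", False), [Attr(1, "b", False)]))
print(rule_is_l_attributed(Attr(1, "i", True),
                           [Attr(2, "c", False), Attr(0, "s", False)]))
```

An SDD would be accepted by this check only if every one of its rules passes; the second rule of Example 10 fails because C lies to the right of B.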

Question 8: Write and explain the syntax directed definition (SDD) for constructing a
syntax tree for an expression? Show the syntax tree for “a – 4 + c”
Question 9: Consider the following grammar:
E → E1 + T | E1 – T | T     T → ( E ) | id | num
where id and num are terminal symbols
i) Obtain syntax directed definition for constructing syntax-tree for the above
grammar.
ii) Obtain suitable translation scheme for definition in part (i) above.
iii) Show the steps for construction of syntax-tree for expression 'x - 5 + y '.

3. Applications of Syntax-Directed Translation


The main application of Syntax-Directed Translation is the construction of syntax trees.
Since some compilers use syntax trees as an intermediate representation, a common form of SDD
turns its input string into a tree. To complete the translation to intermediate code, the compiler
may then walk the syntax tree, using another set of rules that are in effect an SDD on the syntax
tree rather than the parse tree.
We consider two SDD's for constructing syntax trees for expressions. The first, an S-
attributed definition, is suitable for use during bottom-up parsing. The second, L-attributed, is
suitable for use during top-down parsing.

a) Construction of Syntax Trees


The construction of a syntax tree for an expression is similar to the translation of the
expression into postfix form. We construct subtrees for the subexpressions by creating a node for
each operator and operand. The children of an operator node are the roots of the nodes
representing the subexpressions constituting the operands of that operator. Each node in a syntax
tree can be implemented as a record with several fields.
In the node for an operator, one field identifies the operator and the remaining fields
contain pointers to the nodes for the operands. The operator is called the label of the node. When
used for translation, the nodes in a syntax tree may have additional fields to hold the values (or
pointers to values) of attributes attached to the node.
Following functions are used to create the nodes of syntax trees for expressions with
binary operators. Each function returns a pointer to a newly created node.
1. new Node(op, left, right): This function creates an operator node with label op and two
fields containing pointers to left and right (Figure(a)).
2. new Leaf(id, entry): This function creates an identifier node with label id and a field
containing entry, a pointer to the symbol-table entry for the identifier (Figure(b)).
3. new Leaf(num, val): This function creates a number node with label num and a field
containing val, the value of the number (Figure(c)).
Figure: Functions to create Syntax Tree: (a) operator node labelled op, (b) identifier leaf labelled id, (c) number leaf labelled num

Example 11: Construct the syntax tree for the expression a - 4 + c


Solution:
The following sequence of functions calls creates the syntax tree for the expression a-4+c
as shown in following figure. In this sequence, p1, p2, . . . p5 are pointers to nodes, and entry-a
and entry-c are pointers to the symbol-table entries for identifiers a and c.
1. p1 := new Leaf(id, entry-a);
2. p2 := new Leaf(num, 4);
3. p3 := new Node('-', p1, p2);
4. p4 := new Leaf(id, entry-c);
5. p5 := new Node('+', p3, p4)
+
├─ –
│  ├─ id  (to entry for a)
│  └─ num 4
└─ id  (to entry for c)
Figure: Syntax tree for a-4+c
The tree is constructed bottom up. The function calls new Leaf(id, entry-a) and new
Leaf(num, 4) construct the leaves for a and 4; the pointers to these nodes are saved using p1 and
p2. The call new Node('-', p1, p2) then constructs the interior node with the leaves for a and 4 as
children. After two more steps, p5 is left pointing to the root.
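The five calls can be written out as a short Python sketch. The classes Leaf and Node below are stand-ins for the functions new Leaf and new Node; the strings "a" and "c" are used in place of the symbol-table pointers entry-a and entry-c, and to_infix is a hypothetical helper that prints the tree with parentheses.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: str        # "id" or "num"
    value: object     # symbol-table entry for id, numeric value for num


@dataclass
class Node:
    op: str           # label of the operator node
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Node]


def to_infix(tree):
    # Render the tree with explicit parentheses to show its shape.
    if isinstance(tree, Leaf):
        return str(tree.value)
    return f"({to_infix(tree.left)} {tree.op} {to_infix(tree.right)})"


p1 = Leaf("id", "a")        # stand-in for entry-a
p2 = Leaf("num", 4)
p3 = Node("-", p1, p2)
p4 = Leaf("id", "c")        # stand-in for entry-c
p5 = Node("+", p3, p4)

print(to_infix(p5))
```

The parenthesized output makes it visible that the subtraction node p3 is the left child of the root p5.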

b) Syntax-Directed Definition for Constructing Syntax Trees


The following table shows an S-attributed definition for constructing a syntax tree for an
expression containing the operators + and –. It uses the productions of the grammar to schedule
the calls of the functions new Node and new Leaf to construct the tree. The synthesized attribute node
for E and T keeps track of the pointers returned by the function calls.
Production        Semantic Rules
E → E1 + T        E.node := new Node('+', E1.node, T.node)
E → E1 – T        E.node := new Node('-', E1.node, T.node)
E → T             E.node := T.node
T → ( E )         T.node := E.node
T → id            T.node := new Leaf(id, id.entry)
T → num           T.node := new Leaf(num, num.val)
Table: Syntax-directed definition for constructing a syntax tree for an expression.
Example 12: An annotated parse tree showing the construction of a syntax tree for the expression
a-4+c is shown in following figure.

Figure: Construction of a syntax-tree for a-4+c.


The parse tree is shown dotted. The parse-tree nodes labeled by the nonterminals E and T
use the synthesized attribute node to hold a pointer to the syntax-tree node for the expression
represented by the nonterminal.
The semantic rules associated with the productions T → id and T → num define attribute
T.node to be a pointer to a new leaf for an identifier and a number. Attributes id.entry and
num.val are the lexical values assumed to be returned by the lexical analyzer with the tokens id
and num.
When an expression E is a single term, corresponding to a use of the production E → T,
the attribute E.node gets the value of T.node. When the semantic rule E.node := new Node('-',
E1.node, T.node) associated with the production E → E1 – T is invoked, previous rules have set
E1.node and T.node to be pointers to the leaves for a and 4.
Example 13:
The rules for building syntax trees are similar to the rules for the desk calculator. In the
desk-calculator example, a term x * y was evaluated by passing x as an inherited attribute, since x
and * y appeared in different portions of the parse tree. Here, the idea is to build a syntax tree for
x + y by passing x as an inherited attribute, since x and + y appear in different subtrees.
Production          Semantic Rules
E → T E'            E.node := E'.syn
                    E'.inh := T.node
E' → + T E1'        E1'.inh := new Node('+', E'.inh, T.node)
                    E'.syn := E1'.syn
E' → – T E1'        E1'.inh := new Node('-', E'.inh, T.node)
                    E'.syn := E1'.syn
E' → ε              E'.syn := E'.inh
T → ( E )           T.node := E.node
T → id              T.node := new Leaf(id, id.entry)
T → num             T.node := new Leaf(num, num.val)
Figure: Constructing syntax trees during top-down parsing
Nonterminal E ' has an inherited attribute inh and a synthesized attribute syn. Attribute
E'.inh represents the partial syntax tree constructed so far. Specifically, it represents the root of
the tree for the prefix of the input string that is to the left of the subtree for E '. At node 5 in the
dependency graph, E'.inh denotes the root of the partial syntax tree for the identifier a; that is, the
leaf for a. At node 6, E'.inh denotes the root for the partial syntax tree for the input a - 4. At node
9, E'.inh denotes the syntax tree for a - 4 + c.


Figure: Dependency graph for a - 4 + c, with the SDD of the preceding figure


Since there is no more input, at node 9, E'.inh points to the root of the entire syntax tree.
The syn attributes pass this value back up the parse tree until it becomes the value of E.node. The
attribute value at node 10 is defined by the rule E'.syn = E'.inh associated with the production
E' → ε. The attribute value at node 11 is defined by the rule E'.syn = E1'.syn associated with
production 2 of the SDD above. Similar rules define the attribute values at nodes 12 and 13.
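The flow of E'.inh down the tree and E'.syn back up can be sketched as a small recursive-descent parser in Python. This is a sketch under assumptions rather than a definitive implementation: the tokenizer, the function names, and the use of tuples for tree nodes are all introduced here for illustration. The parameter inh plays the role of E'.inh, and the return value of E_prime plays the role of E'.syn.

```python
import re


def tokenize(text):
    # Tokens: identifiers, numbers, and the symbols + - ( )
    return re.findall(r"[A-Za-z_]\w*|\d+|[-+()]", text)


def parse(text):
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def T():
        tok = advance()
        if tok == "(":                     # T -> ( E ): T.node := E.node
            node = E()
            assert advance() == ")", "expected ')'"
            return node
        if tok.isdigit():                  # T -> num: T.node := new Leaf(num, num.val)
            return ("num", int(tok))
        return ("id", tok)                 # T -> id: T.node := new Leaf(id, id.entry)

    def E_prime(inh):
        if peek() in ("+", "-"):
            op = advance()
            t_node = T()
            # E1'.inh := new Node(op, E'.inh, T.node); E'.syn := E1'.syn
            return E_prime((op, inh, t_node))
        return inh                         # E' -> epsilon: E'.syn := E'.inh

    def E():
        # E'.inh := T.node; E.node := E'.syn
        return E_prime(T())

    tree = E()
    assert pos == len(tokens), "unexpected trailing input"
    return tree


print(parse("a - 4 + c"))
# ('+', ('-', ('id', 'a'), ('num', 4)), ('id', 'c'))
```

Each recursive call to E_prime receives the partial tree built so far, which corresponds to the values of E'.inh at nodes 5, 6 and 9 of the dependency graph.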

Question 10: How can a translator for an S-attributed definition be implemented with
the help of an LR parser generator (synthesized attributes on the parser stack)? Show the
implementation of a desk calculator with an LR parser.
Question 11: Consider the following grammar:
L → E
E → E1 + T | T
T → T1 * F | F
F → (E) | digit
where digit is a terminal symbol.
i) Obtain a syntax-directed definition for the above grammar.
ii) Give the implementation of the above grammar using an LR parser.
iii) Show the moves made by the translator of part (ii) on input 5 + 3*4.

4. Syntax-Directed Translation Schemes


Syntax-directed translation schemes are a complementary notation to syntax-directed
definitions. All of the applications of syntax-directed definitions can be implemented using
syntax-directed translation schemes.
A syntax-directed translation scheme (SDT) is a context-free grammar with program
fragments embedded within production bodies. The program fragments are called semantic
actions and can appear at any position within a production body. By convention, we place curly
braces around actions; if braces are needed as grammar symbols, then we quote them.

Any SDT can be implemented by first building a parse tree and then performing the
actions in a left-to-right depth-first order; that is, during a preorder traversal.
Typically, SDT's are implemented during parsing, without building a parse tree. SDT's can be
used to implement two important classes of SDD's:
1. The underlying grammar is LR-parsable, and the SDD is S-attributed.
2. The underlying grammar is LL-parsable, and the SDD is L-attributed.
In both these cases, the semantic rules in an SDD can be converted into an SDT with
actions that are executed at the right time. During parsing, an action in a production body is
executed as soon as all the grammar symbols to the left of the action have been matched.
SDTs that can be implemented during parsing can be characterized by introducing
distinct marker nonterminals in place of each embedded action; each marker M has only one
production, M → ε. If the grammar with marker nonterminals can be parsed by a given method,
then the SDT can be implemented during parsing.

a) Postfix Translation Schemes


The simplest SDD implementation occurs when we can parse the grammar bottom-up
and the SDD is S-attributed. In that case, we can construct an SDT in which each action is placed
at the end of the production and is executed along with the reduction of the body to the head of
that production. SDTs with all actions at the right ends of the production bodies are called postfix
SDT's.
Example 14:
The following postfix SDT implements the desk calculator SDD, with one change: the action for
the first production prints a value. The remaining actions are exact counterparts of the semantic
rules. Since the underlying grammar is LR, and the SDD is S-attributed, these actions can be
correctly performed along with the reduction steps of the parser.
L → E n             { print(E.val); }
E → E1 + T          { E.val := E1.val + T.val; }
E → T               { E.val := T.val; }
T → T1 * F          { T.val := T1.val × F.val; }
T → F               { T.val := F.val; }
F → ( E )           { F.val := E.val; }
F → digit           { F.val := digit.lexval; }
Table: Postfix SDT implementing the desk calculator

b) Parser-Stack Implementation of Postfix SDT's
Postfix SDT's can be implemented during LR parsing by executing the actions when
reductions occur. The attribute(s) of each grammar symbol can be put on the stack in a place
where they can be found during the reduction. The best plan is to place the attributes along with
the grammar symbols (or the LR states that represent these symbols) in records on the stack itself.
A translator for an S-attributed definition can be implemented using an LR-parser generator:
1. A bottom-up method is used to parse the input string.
2. The parser stack is used to hold the values of the synthesized attributes.
Each stack entry is a pair consisting of a state and a value. Each state entry is a pointer into
the LR(1) parsing table. There is no need to store the grammar symbol explicitly in the parser
stack, because the state entry identifies the grammar symbol that it covers.
The parser stack can therefore be denoted stack[i], where stack[i] is a combination of state[i] and
value[i]. For example, for the production rule X → ABC the stack is as shown in the Figure.
Before the reduction, the states for A, B and C are on the stack along with the values
A.a, B.b and C.c. The value C.c is in value[top], B.b is in
value[top–1] and A.a is in value[top–2].
After the reduction, the left-hand-side symbol of the production, X, is placed on the
stack along with the value X.x. Hence after the reduction value[top] = X.x.
1. After the reduction, top is decremented by 2, the state covering X is placed in
state[top] and the value of the synthesized attribute X.x is put in value[top].
2. If a symbol has no attribute, then the corresponding entry in the value array is left
undefined.
Production: X → ABC

    Before reduction              After reduction
    State    Value                State    Value
    A        A.a                  X        X.x      ← top
    B        B.b
    C        C.c      ← top

Figure: Parser stack

Example 15: Construct the syntax-directed definition and generate the code fragments using an
S-attributed definition for the following grammar.
L → E n
E → E1 + T | T
T → T1 * F | F
F → (E) | digit
Solution: The syntax-directed definition for the given grammar can be written as:
Production          Semantic Rules
L → E n             L.val := E.val
E → E1 + T          E.val := E1.val + T.val
E → T               E.val := T.val
T → T1 * F          T.val := T1.val × F.val
T → F               T.val := F.val
F → ( E )           F.val := E.val
F → digit           F.val := digit.lexval
Table: Syntax-directed definition
The synthesized attributes in the annotated parse tree can be evaluated by an LR parser
during a bottom-up parse of the input line 1 * 2 + 3n. Assume that the lexical analyzer supplies
the value of attribute digit.lexval, which is the numeric value of each token representing a digit.
When the parser shifts a digit onto the stack, the token digit is placed in state[top] and its attribute
value is placed in val[top].
To evaluate attributes, we modify the parser to execute the code fragments shown below
before making the appropriate reduction. We can associate attribute evaluation with reductions,
because each reduction determines the production to be applied. The code fragments have been
obtained from the semantic rules by replacing each attribute by a position in the val array.
Production          Code Fragment
L → E n             { print(stack[top – 1].val); top = top – 1; }
E → E1 + T          { stack[top – 2].val = stack[top – 2].val + stack[top].val; top = top – 2; }
E → T
T → T1 * F          { stack[top – 2].val = stack[top – 2].val × stack[top].val; top = top – 2; }
T → F
F → ( E )           { stack[top – 2].val = stack[top – 1].val; top = top – 2; }
F → digit
Table: Implementation of a desk calculator with an LR parser.
Suppose that the stack is kept in an array of records called stack, with top a cursor to the
top of the stack. Thus, stack[top] refers to the top record on the stack, stack[top – 1] to the record
below that and so on. Also, assume that each record has a field called val, which holds the
attribute of whatever grammar symbol is represented in that record.
Thus, we may refer to the attribute E.val that appears at the third position on the stack as
stack[top – 2].val. The entire syntax-directed translation is shown in the above table.
Input       State       val         Production Used
1*2+3n      _           _
*2+3n       1           1
*2+3n       F           1           F → digit
*2+3n       T           1           T → F
2+3n        T *         1 _
+3n         T * 2       1 _ 2
+3n         T * F       1 _ 2       F → digit
+3n         T           2           T → T * F
+3n         E           2           E → T
3n          E +         2 _
n           E + 3       2 _ 3
n           E + F       2 _ 3       F → digit
n           E + T       2 _ 3       T → F
n           E           5           E → E + T
            E n         5 _
            L           5           L → E n
Table: Moves made by translator on input 1 * 2 + 3n.
For example, in the second production, E → E1 + T, we go two positions below the top to
get the value of E1, and we find the value of T at the top. The resulting sum is placed where the
head E will appear after the reduction, that is, two positions below the current top.
After the reduction, the three topmost stack symbols are replaced by one. Hence, after
computing E.val, we pop two symbols off the top of the stack, so the record where we placed
E.val will now be at the top of the stack.
In the third production, E → T, no action is necessary, because the length of the stack
does not change, and the value of T.val at the stack top will simply become the value of E.val.
The same observation applies to the productions T → F and F → digit.

Production F → (E) is slightly different. Although the value does not change, two positions are
removed from the stack during the reduction, so the value has to move to the position it occupies
after the reduction. The table of moves shows the sequence of moves made by the parser on input
1 * 2 + 3n. The contents of the state and val fields of the parsing stack are shown after each move.
Consider the sequence of events on seeing the input symbol 1.
• In the first move, the parser shifts the state corresponding to the token digit (whose value
  is 1) onto the stack. (The state is represented by 1 and the value 1 is in the val field.)
• In the second move, the parser reduces by the production F → digit and implements the
  semantic rule F.val := digit.lexval.
• In the third move, the parser reduces by T → F. No code fragment is associated with this
  production, so the val array is left unchanged.
• After each reduction, the top of the val stack contains the attribute value associated with
  the left side of the reducing production.
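The effect of the code fragments on the stack can be replayed with a short Python sketch. It is not an LR parser: the sequence of shift and reduce moves is written out by hand from the table of moves, and grammar symbols stand in for LR states. The names PRODUCTIONS, MOVES and run are introduced here purely for illustration.

```python
# head, body length, and the code fragment computing the head's val (None = no action)
PRODUCTIONS = {
    "L -> E n":   ("L", 2, lambda s, top: s[top - 1]["val"]),
    "E -> E + T": ("E", 3, lambda s, top: s[top - 2]["val"] + s[top]["val"]),
    "E -> T":     ("E", 1, None),
    "T -> T * F": ("T", 3, lambda s, top: s[top - 2]["val"] * s[top]["val"]),
    "T -> F":     ("T", 1, None),
    "F -> ( E )": ("F", 3, lambda s, top: s[top - 1]["val"]),
    "F -> digit": ("F", 1, None),
}


def run(moves):
    stack = []                                  # records with fields sym and val
    for move in moves:
        if move[0] == "shift":
            _, sym, val = move
            stack.append({"sym": sym, "val": val})
        else:                                   # ("reduce", production)
            head, body_len, action = PRODUCTIONS[move[1]]
            top = len(stack) - 1
            base = top - body_len + 1           # record where the head will sit
            if action is not None:
                stack[base]["val"] = action(stack, top)
            del stack[base + 1:]                # pop body_len - 1 records
            stack[base]["sym"] = head
    return stack


# Moves for the input 1 * 2 + 3 n, copied from the table of moves.
MOVES = [
    ("shift", "digit", 1), ("reduce", "F -> digit"), ("reduce", "T -> F"),
    ("shift", "*", None), ("shift", "digit", 2), ("reduce", "F -> digit"),
    ("reduce", "T -> T * F"), ("reduce", "E -> T"),
    ("shift", "+", None), ("shift", "digit", 3), ("reduce", "F -> digit"),
    ("reduce", "T -> F"), ("reduce", "E -> E + T"),
    ("shift", "n", None), ("reduce", "L -> E n"),
]

final = run(MOVES)
print(final[0]["sym"], final[0]["val"])  # L 5
```

Productions with no code fragment leave the val field untouched, which matches the observation that E → T, T → F and F → digit need no action.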

Question 12: Eliminate left recursion from the following grammar and develop a syntax-directed
translation scheme for the input 9 – 5 + 2. The grammar is:
E → E + T | E – T | T
T → (E) | id | num
5. Top-down Translation
In top-down translation, L-attributed definitions will be implemented during predictive parsing.

a) Eliminating Left Recursion


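Since most arithmetic operators associate to the left, it is natural to use left-recursive grammars
for expressions. A left-recursive grammar cannot be parsed by a predictive parser, so the left
recursion must first be eliminated. The transformation applies to translation schemes with
synthesized attributes.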
A → Aα | β (with left recursion)
After removing left recursion
A → βA'
A' → αA' | ε
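The transformation above can be sketched as a small Python function. This is a minimal illustration that handles only immediate left recursion. The representation (each production body as a list of symbols, the empty list standing for ε) and the name eliminate_left_recursion are assumptions made here.

```python
def eliminate_left_recursion(head, bodies):
    """Rewrite A -> A alpha | beta as A -> beta A', A' -> alpha A' | epsilon.

    bodies is a list of production bodies, each a list of symbols.
    The empty list [] stands for epsilon.
    """
    # Bodies of the form A alpha: keep only alpha.
    alphas = [body[1:] for body in bodies if body and body[0] == head]
    # All remaining bodies are the betas.
    betas = [body for body in bodies if not body or body[0] != head]
    if not alphas:
        return {head: bodies}          # no immediate left recursion
    new = head + "'"
    return {
        head: [beta + [new] for beta in betas],
        new: [alpha + [new] for alpha in alphas] + [[]],
    }


grammar = eliminate_left_recursion("E", [["E", "+", "T"], ["E", "-", "T"], ["T"]])
for lhs, rhs_list in grammar.items():
    for rhs in rhs_list:
        print(lhs, "->", " ".join(rhs) if rhs else "ε")
```

For E → E + T | E – T | T this yields E → T E' and E' → + T E' | – T E' | ε, which is the shape of grammar used for top-down translation in the examples of this section.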

Example 16: Give the translation scheme and show the annotated parse tree with the value of the
attribute E.val at the root for 9 – 5 + 2 for the following grammar
E → E + T | E – T | T
T → (E) | num
Solution:
The translation scheme for the left recursive grammar is given below.
Production          Semantic Action
E → E1 + T          { E.val := E1.val + T.val }
E → E1 – T          { E.val := E1.val – T.val }
E → T               { E.val := T.val }
T → ( E )           { T.val := E.val }
T → num             { T.val := num.val }
Table: Translation scheme with left-recursive grammar.
Now remove the left recursion and rewrite the translation scheme for right recursive grammar as:
Production          Semantic Action
E → T               { R.i := T.val }
    R               { E.val := R.s }
R → + T             { R1.i := R.i + T.val }
    R1              { R.s := R1.s }
R → – T             { R1.i := R.i – T.val }
    R1              { R.s := R1.s }
R → ε               { R.s := R.i }
T → ( E )           { T.val := E.val }
T → num             { T.val := num.val }
Figure: Transformed translation scheme with right-recursive grammar.
The annotated parse tree is shown in Figure.
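Figure: Evaluation of the expression 9-5+2. The root E has children T (T.val = 9, from
num.val = 9) and R (R.i = 9). This R derives – T R1, with T.val = 5 (from num.val = 5) and
R.i = 4 at the middle node for R. That node derives + T R1, with T.val = 2 (from num.val = 2)
and R.i = 6 at the bottom node for R.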

The new scheme produces the annotated parse tree for the expression 9-5+2. The arrows
in the figure suggest a way of determining the value of the expression. In the figure, the individual
numbers are generated by T, and T.val takes its value from the lexical value of the number, given
by attribute num.val.
The 9 in the subexpression 9–5 is generated by the leftmost T, but the minus operator and
5 are generated by the R at the right child of the root. The inherited attribute R.i obtains the value
9 from T.val.
The subtraction 9–5 and the passing of the result 4 down to the middle node for R are
done by embedding the following action between T and R1 in R → –TR1
{ R1.i := R.i – T.val }
A similar action adds 2 to the value of 9–5, yielding the result R.i = 6 at the bottom node
for R. The result is needed at the root as the value of E.val; the synthesized attribute s for R, not
shown in the figure, is used to copy the result up to the root.
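The right-recursive translation scheme maps directly onto a recursive-descent evaluator, sketched below in Python under a few assumptions: the tokenizer and all function names are introduced here, ASCII - stands for the minus sign, and a trace list is added only to record the successive values of R.i. The argument i of R is the inherited attribute R.i, and the return value of R is the synthesized attribute R.s.

```python
import re


def evaluate(text):
    tokens = re.findall(r"\d+|[-+()]", text)
    pos = 0
    trace = []                       # successive values of R.i, for illustration

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def T():
        tok = advance()
        if tok == "(":               # T -> ( E ): T.val := E.val
            val = E()
            assert advance() == ")", "expected ')'"
            return val
        return int(tok)              # T -> num: T.val := num.val

    def R(i):
        trace.append(i)
        if peek() == "+":
            advance()
            return R(i + T())        # R1.i := R.i + T.val; R.s := R1.s
        if peek() == "-":
            advance()
            return R(i - T())        # R1.i := R.i - T.val; R.s := R1.s
        return i                     # R -> epsilon: R.s := R.i

    def E():
        return R(T())                # R.i := T.val; E.val := R.s

    return E(), trace


value, r_i_values = evaluate("9 - 5 + 2")
print(value, r_i_values)  # 6 [9, 4, 6]
```

The recorded values 9, 4 and 6 are the values of R.i at the top, middle and bottom nodes for R in the annotated parse tree.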

Example 17: Consider the following translation scheme:

A → A1 Y {A.a = g(A1.a, Y.y)}
A → X {A.a = f(X.x)}
What will be the translation scheme if the left recursion is removed? Compute the attribute value
in two different ways for the input "XYY".
Solution:
The given grammar is
A → A1 Y {A.a = g(A1.a, Y.y)}
A → X {A.a = f(X.x)}
Here, A.a is the synthesized attribute of the left-recursive nonterminal A, and X and Y are single
grammar symbols with synthesized attributes X.x and Y.y. These could represent a string of
several grammar symbols, each with its own attributes, since the schema has an arbitrary function g
computing A.a in the recursive production and an arbitrary function f computing A.a in the
second production.
In each case, f and g take as arguments whatever attributes they are allowed to access if
the SDD is S-attributed. We want to turn the underlying grammar into
A → X R
R → Y R | ε
Figure suggests what the SDT on the new grammar must do. In Figure (a) we see the
effect of the postfix SDT on the original grammar.

We apply f once, corresponding to the use of production A → X, and then apply g as
many times as we use the production A → A Y. Since R generates a "remainder" of Y's, its
translation depends on the string to its left, a string of the form XYY...Y.
Each use of the production R → Y R results in an application of g. For R, we use an
inherited attribute R.i to accumulate the result of successively applying g, starting with
f(X.x), the value of A.a for the production A → X.
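Figure: Eliminating left recursion from a postfix SDT (two different ways for the input "XYY").
(a) Left-recursive parse tree: the root has A.a = g(g(f(X.x), Y1.y), Y2.y) with children A and Y2;
that child has A.a = g(f(X.x), Y1.y) with children A and Y1; the innermost A has A.a = f(X.x)
and the single child X.
(b) Right-recursive parse tree: the root A has children X and R with R.i = f(X.x); that R has
children Y1 and R with R.i = g(f(X.x), Y1.y); that R has children Y2 and R with
R.i = g(g(f(X.x), Y1.y), Y2.y).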
In addition, R has a synthesized attribute R.s, not shown in the figure. This attribute is first
computed when R ends its generation of Y symbols, as signaled by the use of production R → ε.
R.s is then copied up the tree, so it can become the value of A.a for the entire expression
XYY...Y. The case where A generates XYY is shown in the figure, and we see that the value of A.a
at the root of Figure (a) has two uses of g.
So does R.i at the bottom of the tree in Figure (b), and it is this value of R.s that gets
copied up that tree. To accomplish this translation, we use the following syntax-directed
translation scheme:
Production          Semantic Action
A → X               {R.i = f(X.x)}
    R               {A.a = R.s}
R → Y               {R1.i = g(R.i, Y.y)}
    R1              {R.s = R1.s}
R → ε               {R.s = R.i}
The inherited attribute R.i is evaluated immediately before a use of R in the body, while the
synthesized attributes A.a and R.s are evaluated at the ends of productions. Thus, whatever values
are needed to compute these attributes will be available from what has been computed to the left.
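That the two schemes compute the same value for the input "XYY" can be checked with a short Python sketch. The functions f and g are arbitrary in the text, so the symbolic versions below, which simply record how they were applied, are hypothetical choices made for illustration, as are the names left_recursive and right_recursive.

```python
from functools import reduce


def f(x):
    # Hypothetical f: records the application symbolically.
    return ("f", x)


def g(a, y):
    # Hypothetical g: records the application symbolically.
    return ("g", a, y)


def left_recursive(x, ys):
    """Postfix SDT on A -> A1 Y | X: apply f once, then g once per Y."""
    return reduce(g, ys, f(x))


def right_recursive(x, ys):
    """SDT on A -> X R, R -> Y R1 | epsilon, with inherited R.i and synthesized R.s."""
    def R(i, rest):
        if not rest:                         # R -> epsilon: R.s = R.i
            return i
        return R(g(i, rest[0]), rest[1:])    # R1.i = g(R.i, Y.y); R.s = R1.s
    return R(f(x), ys)                       # R.i = f(X.x); A.a = R.s


a1 = left_recursive("X.x", ["Y1.y", "Y2.y"])
a2 = right_recursive("X.x", ["Y1.y", "Y2.y"])
print(a1)
print(a1 == a2)
```

Both calls build the nested value corresponding to g(g(f(X.x), Y1.y), Y2.y), which is the value of A.a at the root of tree (a) and of R.i at the bottom of tree (b).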
