
Compilation Techniques (10)

Code generation for expressions


Left values
Postfix form
Special cases
Short-circuit evaluation

© Codruta-Mihaela ISTIN 2015


Code generation for expressions
• Given an expression, how can it be translated to code for a stack-based
virtual machine (SVM) or for a register-based virtual machine (RVM)?
• An expression can contain different operands (ex: ids, numbers,
strings), operators (ex: arithmetic, logic, comparison), function calls,
array indexing, member access, …
• All these constituents must first form an Abstract Syntax Tree (AST)
according to operator precedence and associativity
• The AST can be explicit (built in memory as a data structure) or
implicit (determined by the order in which the predicates of a
recursive descent parser are visited)
• The AST must pass type analysis (TA), so that it represents a valid
expression
• Information from TA (such as the resulting type of a subexpression)
is used later in code generation
Left values
• Some expressions need left values (lvals), ex: assignments “r=…” or function calls
with parameters passed by address/reference
• When a left value is required, the address of the operand must be used, not its
value. During AST traversal, left values can be handled in different ways:
◦ bottom-up: the lval is kept for as long as possible (because an lval is more general
than an rval) and a synthesized attribute records that. If an rval is
needed, the lval is converted to an rval
◦ top-down: an operator node sends its children an attribute which tells what
kind of value it needs (ex: “=” sends “lval required” to its left child and
“rval required” to its right child). When an id is encountered, its
address or its value is used, according to the inherited attribute
◦ custom treatment: when the language uses lvals only in a few well-defined
cases (ex: Pascal), these cases can be handled separately. The lval cases
are separated from the rest of the expressions, which work only with rvals
• With code optimizations, the first two forms can be reduced to the third one
// bottom-up: a=b    // top-down: a=b    // custom: a=b
PUSHADDR &a          PUSHADDR &a         PUSH b
PUSHADDR &b          PUSH b              POP a
LOAD                 STORE
STORE
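The bottom-up and top-down listings above can be reproduced with a small generator. This is only a sketch: the tuple encoding of the AST (`("id", name)` and `("=", left, right)`) and the function names are assumptions made for illustration, not part of the course material.

```python
# Hypothetical AST encoding: ("id", name) or ("=", left, right).

def gen_top_down(node, need_lval, out):
    """Top-down: need_lval is the inherited attribute sent by the parent."""
    kind = node[0]
    if kind == "id":
        # An id yields its address or its value, as the parent requested.
        out.append(f"PUSHADDR &{node[1]}" if need_lval else f"PUSH {node[1]}")
    elif kind == "=":
        if need_lval:
            raise ValueError("the result of '=' is not an lval in this sketch")
        gen_top_down(node[1], True, out)   # left child: lval required
        gen_top_down(node[2], False, out)  # right child: rval required
        out.append("STORE")
    else:
        raise ValueError(f"unknown node kind: {kind}")
    return out


def gen_bottom_up(node, out):
    """Bottom-up: returns the synthesized attribute 'the stack top is an lval'."""
    kind = node[0]
    if kind == "id":
        out.append(f"PUSHADDR &{node[1]}")  # keep the lval as long as possible
        return True
    if kind == "=":
        if not gen_bottom_up(node[1], out):
            raise ValueError("left side of '=' must be an lval")
        if gen_bottom_up(node[2], out):
            out.append("LOAD")              # an rval is needed: convert the lval
        out.append("STORE")
        return False
    raise ValueError(f"unknown node kind: {kind}")


assignment = ("=", ("id", "a"), ("id", "b"))
top_down_code = gen_top_down(assignment, False, [])
bottom_up_code = []
gen_bottom_up(assignment, bottom_up_code)
print(top_down_code)
print(bottom_up_code)
```

The top-down variant never emits `LOAD`, because each id already knows which kind of value is wanted; the bottom-up variant pays for the extra generality with the conversion step.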
Postfix form
• The form obtained when the AST of an expression is traversed
in postfix order
• Postfix order: first the children, then the root
• A simple algorithm to construct the AST of an expression by hand:
◦ the root of the AST is the operation which is executed last
◦ the children of the root are the ASTs built from
the operands of the root operation
◦ the algorithm is repeated recursively until there are no more
operations



Postfix form example
r=2*k*max(sin(a[i]+PI/2),0.5)

=
├─ r
└─ *
   ├─ *
   │  ├─ 2
   │  └─ k
   └─ max
      ├─ sin
      │  └─ +
      │     ├─ []
      │     │  ├─ a
      │     │  └─ i
      │     └─ /
      │        ├─ PI
      │        └─ 2
      └─ 0.5

r 2 k * a i [] PI 2 / + sin 0.5 max * =
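The postfix form of this example can be obtained mechanically from an explicit AST. The sketch below assumes a minimal `Node` class (a label plus a list of children), which is a hypothetical representation chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)


def postfix(node):
    """Return the labels in postfix order: first the children, then the root."""
    out = []
    for child in node.children:
        out.extend(postfix(child))
    out.append(node.label)
    return out


# AST of r=2*k*max(sin(a[i]+PI/2),0.5); the root is the last operation, "=".
tree = Node("=", [
    Node("r"),
    Node("*", [
        Node("*", [Node("2"), Node("k")]),
        Node("max", [
            Node("sin", [
                Node("+", [
                    Node("[]", [Node("a"), Node("i")]),
                    Node("/", [Node("PI"), Node("2")]),
                ]),
            ]),
            Node("0.5"),
        ]),
    ]),
])

postfix_form = " ".join(postfix(tree))
print(postfix_form)  # r 2 k * a i [] PI 2 / + sin 0.5 max * =
```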
Postfix form evaluation
• The most important property of a postfix form is that it can be evaluated
with a stack, in the exact order of its constituents
• Evaluation algorithm:
◦ if an operand (ex: id, constant) is encountered, it is pushed on the stack
◦ if an operator or function is encountered, it pops the necessary number
of arguments from the stack, performs the operation and
pushes the result on the stack
◦ the algorithm is repeated until there are no more constituents
◦ in the end, the result is the top value of the stack

Example: r 2 k * a i [] PI 2 / + sin 0.5 max * =


[] → [r] → [r, 2] → [r, 2, k] → [r, 2*k] → [r, 2*k, a] → [r, 2*k, a, i]
→ [r, 2*k, a[i]] → [r, 2*k, a[i], PI] → [r, 2*k, a[i], PI, 2] → [r, 2*k, a[i], PI/2]
→ [r, 2*k, a[i]+(PI/2)] → [r, 2*k, sin(a[i]+(PI/2))] → [r, 2*k, sin(a[i]+(PI/2)), 0.5]
→ [r, 2*k, max(sin(a[i]+(PI/2)), 0.5)] → [r, (2*k)*max(sin(a[i]+(PI/2)), 0.5)]
→ [r=(2*k)*max(sin(a[i]+(PI/2)), 0.5)]
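The evaluation algorithm can be sketched as a short stack evaluator. Several things here are assumptions for illustration: identifiers stay on the stack as names until a value is needed (which lets “=” find its target), `env` is a plain dictionary of variable values, and “=” pushes the assigned value back so that the result ends up on top of the stack.

```python
import math


def parse_operand(tok):
    """Turn a token into an int, a float, or leave it as an identifier name."""
    for convert in (int, float):
        try:
            return convert(tok)
        except ValueError:
            pass
    return tok


def eval_postfix(tokens, env):
    """Evaluate postfix tokens with a stack; env maps names to values."""
    binary = {
        "*": lambda a, b: a * b,
        "/": lambda a, b: a / b,
        "+": lambda a, b: a + b,
        "[]": lambda array, index: array[index],
        "max": max,
    }
    unary = {"sin": math.sin}

    def rval(x):
        # A name on the stack is an lval; look it up when a value is needed.
        return env[x] if isinstance(x, str) else x

    stack = []
    for tok in tokens:
        if tok in binary:
            right = rval(stack.pop())
            left = rval(stack.pop())
            stack.append(binary[tok](left, right))
        elif tok in unary:
            stack.append(unary[tok](rval(stack.pop())))
        elif tok == "=":
            value = rval(stack.pop())
            target = stack.pop()          # still a name, i.e. an lval
            env[target] = value
            stack.append(value)
        else:
            stack.append(parse_operand(tok))
    return stack[-1]                      # the result is the top of the stack


env = {"k": 3, "a": [0.0, 1.0], "i": 0, "PI": math.pi}
tokens = "r 2 k * a i [] PI 2 / + sin 0.5 max * =".split()
result = eval_postfix(tokens, env)
print(result, env["r"])
```

With these sample values, a[i] is 0, so the expression reduces to 2*3*max(sin(PI/2), 0.5), which is 6 up to floating-point rounding.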
SVM code for a postfix form
// r 2 k * a i [] PI 2 / + sin 0.5 max * =
// code generated for a SVM
PUSHADDR &r
PUSH 2
PUSH k
MULTIPLY
PUSHADDR &a // In C “a” of array type means &a
PUSH i
GETELEMENT
PUSH PI
PUSH 2
DIVIDE
ADD
CALL sin
PUSH 0.5
CALL max
MULTIPLY
STORE
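To check that the listing above computes what the source expression says, it can be run on a toy interpreter. The instruction semantics below are assumptions inferred from the listing (for example, that `STORE` pops a value and an address, and that `CALL` pops a fixed number of arguments per function); the real SVM of the course may differ.

```python
import math


def run_svm(program, memory):
    """Run (opcode, operand) tuples on a stack; memory maps names to values."""
    functions = {"sin": (1, math.sin), "max": (2, max)}  # name -> (arity, impl)
    stack = []
    for op, *args in program:
        if op == "PUSHADDR":
            stack.append(("addr", args[0]))        # an address, not a value
        elif op == "PUSH":
            arg = args[0]
            stack.append(memory[arg] if isinstance(arg, str) else arg)
        elif op in ("MULTIPLY", "DIVIDE", "ADD"):
            right = stack.pop()
            left = stack.pop()
            if op == "MULTIPLY":
                stack.append(left * right)
            elif op == "DIVIDE":
                stack.append(left / right)
            else:
                stack.append(left + right)
        elif op == "GETELEMENT":
            index = stack.pop()
            _, name = stack.pop()
            stack.append(memory[name][index])
        elif op == "CALL":
            arity, impl = functions[args[0]]
            call_args = [stack.pop() for _ in range(arity)][::-1]
            stack.append(impl(*call_args))
        elif op == "STORE":
            value = stack.pop()
            _, name = stack.pop()
            memory[name] = value
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack


program = [
    ("PUSHADDR", "r"), ("PUSH", 2), ("PUSH", "k"), ("MULTIPLY",),
    ("PUSHADDR", "a"), ("PUSH", "i"), ("GETELEMENT",),
    ("PUSH", "PI"), ("PUSH", 2), ("DIVIDE",), ("ADD",),
    ("CALL", "sin"), ("PUSH", 0.5), ("CALL", "max"),
    ("MULTIPLY",), ("STORE",),
]
memory = {"k": 3, "a": [0.0, 1.0], "i": 0, "PI": math.pi}
final_stack = run_svm(program, memory)
print(memory["r"], final_stack)
```

The stack is empty after `STORE`, which is the behavior the later slide on cascaded assignments relies on.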
RVM code for a postfix form
• For each intermediate AST node, a new register (or temporary
memory location) must be allocated
• This register receives the result of the node and is then used by the parent node
• Optimization: because each register is used only once (by its
parent), it can be reused after that use, which decreases the
register/memory usage

// r 2 k * a i [] PI 2 / + sin 0.5 max * =


MULTIPLY Tmp1, 2, k
GETELEMENT Tmp2, &a, i
DIVIDE Tmp3, PI, 2
ADD Tmp4, Tmp2, Tmp3
CALL Tmp2, sin, Tmp4
CALL Tmp3, max, Tmp2, 0.5
MULTIPLY Tmp2, Tmp1, Tmp3
STORE r, Tmp2
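The register reuse in this listing can be obtained with a simple pool of temporaries. The sketch below assumes a tuple AST of the form `(opcode, child, ...)` with strings as leaves, allocates the destination before releasing the operands, and always reuses the lowest-numbered free temporary; these choices are assumptions that happen to reproduce the listing, not necessarily the allocator used in the course.

```python
import heapq


class RvmGen:
    """Generates three-address code and recycles temporaries after one use."""

    def __init__(self):
        self.code = []
        self.free = []   # min-heap of released temporary numbers
        self.count = 0   # highest temporary number handed out so far

    def alloc(self):
        if self.free:
            return heapq.heappop(self.free)
        self.count += 1
        return self.count

    def gen(self, node):
        """Return (operand text, temporary number or None) for node."""
        if isinstance(node, str):
            return node, None             # leaf: used directly as an operand
        opcode, *children = node
        operands = [self.gen(child) for child in children]
        dst = self.alloc()
        args = ", ".join(text for text, _ in operands)
        self.code.append(f"{opcode} Tmp{dst}, {args}")
        for _, tmp in operands:
            if tmp is not None:           # each temporary is used only once
                heapq.heappush(self.free, tmp)
        return f"Tmp{dst}", dst

    def gen_assign(self, target, expr):
        src, _ = self.gen(expr)
        self.code.append(f"STORE {target}, {src}")
        return self.code


expr = ("MULTIPLY",
        ("MULTIPLY", "2", "k"),
        ("CALL", "max",
         ("CALL", "sin",
          ("ADD", ("GETELEMENT", "&a", "i"), ("DIVIDE", "PI", "2"))),
         "0.5"))
rvm_code = RvmGen().gen_assign("r", expr)
print("\n".join(rvm_code))
```

Four temporaries are enough for this expression even though the AST has seven intermediate nodes.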
Last value of an expression
• Many expressions have as their last operation a function call which
returns a value that is not used:
strcpy(name, buffer); //in C strcpy returns the destination
• In the above case, the value returned by strcpy is not used. On an
SVM this means that the result stays on the stack, because nothing
consumes it
• In these cases the last value must be discarded from the stack
• Algorithm: in the statement rule (where the expression begins), the
synthesized attribute containing the type of the result (returned
by TA) is checked; if it is not “void”, a “DROP” instruction is
generated
// strcpy(name,buffer);
// name and buffer are char[]
PUSHADDR &name
PUSHADDR &buffer
CALL strcpy
DROP
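The statement rule can be sketched as a tiny wrapper around the code generated for the expression. The function name, the string type names, and the `reset` function in the second example are hypothetical; the only rule taken from the text is that `DROP` is emitted when the result type is not “void”.

```python
def gen_expression_statement(expr_code, result_type):
    """Statement rule: discard the unused result of an expression statement."""
    code = list(expr_code)
    if result_type != "void":   # the type comes from type analysis (TA)
        code.append("DROP")
    return code


strcpy_call = ["PUSHADDR &name", "PUSHADDR &buffer", "CALL strcpy"]
with_drop = gen_expression_statement(strcpy_call, "char*")

# A call to a hypothetical void function leaves nothing on the stack.
without_drop = gen_expression_statement(["CALL reset"], "void")
print(with_drop)
print(without_drop)
```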
Duplicates when the result is consumed
• Some operations do not leave their result on the stack, for example “=”
pops the value from the stack
• If such operations can be cascaded (ex: a=b=5), the last value on the
stack must be preserved or restored

// a=b=5                 // a=b=5
PUSH 5                   PUSH 5
POP b                    DUP      // preserve by duplication
PUSH b   // restore      POP b
POP a                    POP a
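The duplication variant generalizes to any chain of assignments: every target except the last one to be stored needs a copy of the value. A minimal sketch, assuming the chain is used as a statement (so nothing has to remain on the stack at the end) and using a hypothetical helper name:

```python
def gen_assignment_chain(targets, value):
    """Code for t1 = t2 = ... = value, with targets listed left to right."""
    code = [f"PUSH {value}"]
    # Assignment is right-associative, so the rightmost target is stored first.
    for position, name in enumerate(reversed(targets)):
        if position < len(targets) - 1:
            code.append("DUP")   # keep a copy for the next assignment
        code.append(f"POP {name}")
    return code


chain_code = gen_assignment_chain(["a", "b"], 5)
print(chain_code)  # ['PUSH 5', 'DUP', 'POP b', 'POP a']
```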



Functions with a variable number of arguments
• In languages such as C/C++: int printf(const char *fmt, ...)
◦ the arguments are pushed on the stack from right to left. In this way the known
argument (“fmt”) is closest to FP, in a known position (the first position
near FP)
◦ starting from this known position, a special iterator (of type “va_list”) is used to
iterate over the rest of the arguments
◦ the types of the remaining arguments must be deduced from the required
arguments (in the case of “printf”, by parsing the “fmt” string) or
inferred in some other way; otherwise the iterator cannot know the argument types
◦ the arguments are removed from the stack at the call site (the place where
their number is known), not at function return
• In languages such as Java: double min(double... values)
◦ all the optional arguments are collected in an array
◦ this array is passed as a single argument
◦ the function can access any optional argument, in any order, by
indexing this array
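The C-style convention can be simulated with an explicit stack. Everything in this sketch is an illustrative assumption: the stack is a Python list whose end is the top, `mini_printf` understands only `%d` and `%s`, and a negative index plays the role of the `va_list` iterator.

```python
def mini_printf(stack):
    """Callee: reads fmt from the known top slot, then walks the other arguments."""
    fmt = stack[-1]       # the fixed argument was pushed last, so it is on top
    cursor = -2           # va_list-like iterator over the optional arguments
    out = []
    i = 0
    while i < len(fmt):
        if fmt[i] == "%" and i + 1 < len(fmt) and fmt[i + 1] in "ds":
            value = stack[cursor]    # the type is known only from the format
            cursor -= 1
            out.append(str(int(value)) if fmt[i + 1] == "d" else str(value))
            i += 2
        else:
            out.append(fmt[i])
            i += 1
    return "".join(out)


def call_variadic(stack, func, *args):
    """Caller: pushes right to left, calls, then removes the arguments itself."""
    depth = len(stack)
    for arg in reversed(args):
        stack.append(arg)
    result = func(stack)
    del stack[depth:]     # cleanup at the call site, where the count is known
    return result


stack = []
text = call_variadic(stack, mini_printf, "%s has %d items", "cart", 3)
print(text)  # cart has 3 items
```

The Java-style convention corresponds to what Python does natively with `*args`: the optional arguments arrive as one sequence that can be indexed in any order.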
Short-circuit evaluation (SCE)
• When the value of the first operand of a logic expression (and, or) is
known, the rest of the expression is evaluated only if necessary
• Many programming languages specify SCE as standard for their logic
operators. This means that SCE is a requirement, not an optional
feature
• Example: if(a>0 && f(b,a)!=1){…}
• “a>0” is evaluated first and, if it is false, “f(b,a)!=1” is never evaluated,
because it is already known that the final result of && is false
• SCE advantages:
◦ speed optimization, by not evaluating the second expression (if the
expressions are very simple, SCE can actually be slower, because
computing a logic operator can be cheaper than executing a jump)
◦ the second expression can be guarded with preconditions; if these
are not satisfied, the second expression is not run
• If the second expression has side effects (such as modifying the
value of a global variable), it must be taken into account that it
may not be evaluated, so in some cases the side
effects appear and in others they do not
SCE implementation
// if(a>0 && f(b,a)!=1){…}      // if(a>0 || f(b,a)!=1){…}
PUSH a                           PUSH a
PUSH 0                           PUSH 0
GREATER                          GREATER
JF L1                            JT L1
PUSH b                           PUSH b
PUSH a                           PUSH a
CALL f                           CALL f
PUSH 1                           PUSH 1
NOTEQ                            NOTEQ
JF L1                            JF L2
// begin if code                 L1: // begin if code
...                              ...
L1: // after if code             L2: // after if code
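Both listings follow one scheme: each condition is compiled into code that either jumps or falls through. The sketch below is one possible formulation, under the assumptions that `JF`/`JT` pop the top of the stack and jump when it is false/true, that a simple condition is given as a list of instructions leaving a boolean on the stack, and that `("&&", x, y)` and `("||", x, y)` tuples represent the logic operators.

```python
class SceGen:
    """Emits jumping code for conditions built from && and ||."""

    def __init__(self):
        self.code = []
        self.label_count = 0

    def new_label(self):
        self.label_count += 1
        return f"L{self.label_count}"

    def jump_if_false(self, cond, target):
        """Jump to target when cond is false, otherwise fall through."""
        if isinstance(cond, tuple) and cond[0] == "&&":
            self.jump_if_false(cond[1], target)   # first false decides
            self.jump_if_false(cond[2], target)
        elif isinstance(cond, tuple) and cond[0] == "||":
            is_true = self.new_label()
            self.jump_if_true(cond[1], is_true)   # first true skips the rest
            self.jump_if_false(cond[2], target)
            self.code.append(f"{is_true}:")
        else:
            self.code += cond + [f"JF {target}"]

    def jump_if_true(self, cond, target):
        """Jump to target when cond is true, otherwise fall through."""
        if isinstance(cond, tuple) and cond[0] == "&&":
            is_false = self.new_label()
            self.jump_if_false(cond[1], is_false)
            self.jump_if_true(cond[2], target)
            self.code.append(f"{is_false}:")
        elif isinstance(cond, tuple) and cond[0] == "||":
            self.jump_if_true(cond[1], target)
            self.jump_if_true(cond[2], target)
        else:
            self.code += cond + [f"JT {target}"]

    def gen_if(self, cond, body):
        end = self.new_label()
        self.jump_if_false(cond, end)
        self.code += body
        self.code.append(f"{end}:")
        return self.code


a_gt_0 = ["PUSH a", "PUSH 0", "GREATER"]
f_ne_1 = ["PUSH b", "PUSH a", "CALL f", "PUSH 1", "NOTEQ"]
and_code = SceGen().gen_if(("&&", a_gt_0, f_ne_1), ["..."])
or_code = SceGen().gen_if(("||", a_gt_0, f_ne_1), ["..."])
print("\n".join(and_code))
print("\n".join(or_code))
```

For the && case the output matches the listing instruction for instruction. For the || case the two labels come out numbered in the opposite order, because this sketch allocates the end label first, but the control flow is the same.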



Bibliography reading
• Compilers: Principles, Techniques, and Tools, sections 2.3, 2.5, 6.4

