
Contents

Type Checking
Run-time Environments
Intermediate Code Generation

Prepared By:
Dabbal Singh Mahara
2016

Type Checking
• A type is a set of values together with a set of operations that can be
performed on them.
• Type checking verifies that each operation in a program receives the
appropriate number of arguments, of the appropriate types, in the appropriate order.
• The purpose of type checking is to verify that operations performed on a
value are in fact permissible.
• Certain operations are legal for values of each type
– It doesn’t make sense to add a function pointer and an integer in C.
– It does make sense to add two integers.
• The type of an identifier is typically available from declarations, but we may
have to keep track of the type of intermediate expressions.
• Type errors arise when operations are performed on values that do not
support that operation.

Type Systems
• A language’s type system specifies which operations are valid for which types.
• Type systems provide a concise formalization of the semantic checking rules.
• A type system defines a set of types and rules for assigning types to programming
language constructs. An informal rule, for example: “if both operands of addition
are of type integer, then the result is of type integer”.
• Type checking is the process of checking that the program obeys the type
system.
• A type checker implements a type system.
• A sound type system eliminates the need for run-time checking for type errors.
– Memory errors: Reading from an invalid pointer, etc.
– Violation of abstraction boundaries.

Type Checking Overview

Three kinds of languages:


• Statically typed: All or almost all checking of types is done as part of compilation (C,
ML, Java)
• Dynamically typed: Almost all checking of types is done as part of program execution
(Scheme, Prolog)
• Untyped: No type checking (machine code)

• Static typing proponents say:


– Static checking catches many programming errors at compile time
– Avoids overhead of runtime type checks
• Dynamic typing proponents say:
– Static type systems are restrictive
– Rapid prototyping easier in a dynamic type system

Static Checking
• Refers to the compile-time checking of programs in order to ensure that
the semantic conditions of the language are being followed
• Examples of static checks include:
– Type checks
– Flow-of-control checks
– Uniqueness checks
– Name-related checks
• Flow-of-control checks: statements that cause flow of control to leave a construct
must have some place where control can be transferred; e.g., break statements in
C
• Uniqueness checks: a language may dictate that in some contexts, an entity can
be defined exactly once; e.g., identifier declarations, labels, values in case
expressions
• Name-related checks: Sometimes the same name must appear two or more
times; e.g., in Ada a loop or block can have a name that must then appear both at
the beginning and at the end

Type Expression
• A language usually provides a set of base types that it supports together with ways to construct
other types using type constructors
• Through type expressions we are able to represent types that are defined in a program
• A base type is a type expression:
– a primitive data type such as integer, real, char, boolean, …
– type_error: signals an error during type checking
– void: no type
• A type name (e.g., a record name) is a type expression
• A type constructor applied to type expressions is a type expression. E.g.,
– arrays: If T is a type expression and I is a range of integers, then array(I,T) is a
type expression
– records: If T1, …, Tn are type expressions and f1, …, fn are field names, then
record((f1,T1),…,(fn,Tn)) is a type expression
– pointers: If T is a type expression, then pointer(T) is a type expression. Ex: pointer(int)
– functions: If T1, …, Tn, and T are type expressions, then so is (T1,…,Tn) →T.
Ex: int→int represents the type of a function which takes an int value as parameter,
and return type is also int.
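A minimal Python sketch of how the type expressions above might be represented as data. The class names (Base, Array, Pointer, Record, Func) and the show helper are illustrative assumptions, not part of the slides.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Base:
    name: str                      # e.g. "integer", "char", "void", "type_error"


@dataclass(frozen=True)
class Array:
    index_range: Tuple[int, int]   # I: a range of integers
    elem: object                   # T: element type expression


@dataclass(frozen=True)
class Pointer:
    target: object                 # T in pointer(T)


@dataclass(frozen=True)
class Record:
    fields: Tuple[Tuple[str, object], ...]   # ((f1, T1), ..., (fn, Tn))


@dataclass(frozen=True)
class Func:
    params: Tuple[object, ...]     # (T1, ..., Tn)
    result: object                 # T


def show(t) -> str:
    """Render a type expression in the notation used on the slide."""
    if isinstance(t, Base):
        return t.name
    if isinstance(t, Array):
        lo, hi = t.index_range
        return f"array({lo}..{hi}, {show(t.elem)})"
    if isinstance(t, Pointer):
        return f"pointer({show(t.target)})"
    if isinstance(t, Record):
        inner = ", ".join(f"({f}, {show(ft)})" for f, ft in t.fields)
        return f"record({inner})"
    if isinstance(t, Func):
        return "(" + ", ".join(show(p) for p in t.params) + ") -> " + show(t.result)
    raise TypeError(f"not a type expression: {t!r}")


INT = Base("integer")
print(show(Pointer(INT)))        # pointer(integer)
print(show(Func((INT,), INT)))   # (integer) -> integer
```

Frozen dataclasses compare by value, so two separately built copies of the same type expression are equal.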
A Simple Type Checking System

Specification of Simple Type checker

• A simple type checking translation scheme for declarations is
given in the following figure.
• The basic types are : character and integer.
• The constructed types are : array and pointer.
• The type attribute is added to each symbol.
• The declaration should come before the usage of the variable.

Type checking for expression
• The synthesized attribute type for E gives the type of the expression
assigned by the type system for the expression generated by E.
• The function lookup returns the type of id.
• The following figure shows the type checking for the expressions.
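The figure is not reproduced in this text, so the following Python sketch is only an assumption about what such rules typically look like: each expression node gets a synthesized type, and a failed check yields type_error. The tuple encoding of types and expressions, and the mod, index and deref node kinds, are hypothetical choices for illustration.

```python
# Types are plain tuples: ("integer",), ("char",), ("array", n, T), ("pointer", T)
INT, CHAR, ERROR = ("integer",), ("char",), ("type_error",)


def lookup(symtab, name):
    """Return the declared type of an identifier, or type_error if undeclared."""
    return symtab.get(name, ERROR)


def check(expr, symtab):
    """Compute the synthesized attribute 'type' of an expression tree."""
    kind = expr[0]
    if kind == "num":
        return INT
    if kind == "literal":
        return CHAR
    if kind == "id":
        return lookup(symtab, expr[1])
    if kind == "mod":                       # E -> E1 mod E2
        t1, t2 = check(expr[1], symtab), check(expr[2], symtab)
        return INT if t1 == INT and t2 == INT else ERROR
    if kind == "index":                     # E -> E1 [ E2 ]
        t1, t2 = check(expr[1], symtab), check(expr[2], symtab)
        if t2 == INT and t1[0] == "array":
            return t1[2]                    # element type of the array
        return ERROR
    if kind == "deref":                     # E -> *E1
        t = check(expr[1], symtab)
        return t[1] if t[0] == "pointer" else ERROR
    return ERROR


symtab = {"a": ("array", 10, INT), "p": ("pointer", CHAR), "i": INT}
print(check(("index", ("id", "a"), ("id", "i")), symtab))    # ('integer',)
print(check(("mod", ("id", "i"), ("literal", "x")), symtab)) # ('type_error',)
```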

Type checking for statements
• In some languages statements have a type associated with them, while some
other languages don’t assign types to statements.
• In the latter case, statements are given the type void to distinguish a type-safe
statement from one that has a type error.
• If an error occurs within a statement, then the type assigned to this statement is
type_error.

Type checking for functions

• The application of a function to an argument can be captured by the productions:

T → T '->' T      (function type declaration)
E → E ( E )       (function call)

Type Conversion and Coercion
• Since the representations of integers and reals are different within a computer, different
machine instructions are used for operations on integers and reals. If different
parts of an expression have different types, a type conversion is often required.
• For example, in the expression z = x + y, what is the type of z if x is integer and y is
real?
• The compiler has to convert one of them to ensure that both operands have the same type.
• In many languages type conversion is explicit, for example using type casts; that is, it must
be specified as inttoreal(x).
• Type conversions that happen implicitly are called coercions. An implicit type conversion
is carried out by the compiler, which recognizes a type incompatibility and inserts a call to a type
conversion routine (for example, something like inttoreal(int)) that takes a value of the
original type and returns a value of the required type.
• The coercion of expressions is given in the following figure.
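As a rough illustration of coercion for E → E1 + E2, the Python sketch below widens an integer operand with inttoreal when the other operand is real. The function coerce_add and the string type names are assumptions made for this example; only the name inttoreal comes from the slide.

```python
def inttoreal(x: int) -> float:
    """Conversion routine: integer value in, real value out."""
    return float(x)


def coerce_add(v1, t1, v2, t2):
    """Return (value, type) for v1 + v2, coercing integer to real if needed."""
    if t1 == "integer" and t2 == "integer":
        return v1 + v2, "integer"
    if t1 == "integer" and t2 == "real":
        return inttoreal(v1) + v2, "real"      # implicit conversion of E1
    if t1 == "real" and t2 == "integer":
        return v1 + inttoreal(v2), "real"      # implicit conversion of E2
    if t1 == "real" and t2 == "real":
        return v1 + v2, "real"
    return None, "type_error"


# z = x + y with x integer and y real: the result is real.
print(coerce_add(2, "integer", 1.5, "real"))   # (3.5, 'real')
```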

Type Conversion and Coercion (Contd.)

Structural Equivalence of Type Expressions
• The basic question is "when are two type expressions equivalent?"
• Two type expressions are structurally equivalent if they are the same basic type,
or are formed by applying the same constructor to structurally equivalent types.

Example: int a, b;
Here the types of a and b are structurally equivalent.
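A recursive test for structural equivalence could look like the Python sketch below. The tuple encoding of types and the function name sequiv are assumptions for illustration.

```python
# Types as tuples: ("integer",), ("char",), ("array", n, T),
# ("pointer", T), ("func", S, T)
BASIC = {"integer", "char", "real", "boolean"}


def sequiv(s, t):
    """True if type expressions s and t are structurally equivalent."""
    if s[0] != t[0]:
        return False                      # different basic type or constructor
    if s[0] in BASIC:
        return True                       # same basic type
    if s[0] == "array":
        return s[1] == t[1] and sequiv(s[2], t[2])
    if s[0] == "pointer":
        return sequiv(s[1], t[1])
    if s[0] == "func":
        return sequiv(s[1], t[1]) and sequiv(s[2], t[2])
    return False


INT, CHAR = ("integer",), ("char",)
print(sequiv(("pointer", ("array", 10, INT)), ("pointer", ("array", 10, INT))))  # True
print(sequiv(("pointer", INT), ("pointer", CHAR)))                               # False
```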

Run-time Environments

• A compiler must accurately implement the abstractions embodied in the source
language definition. These abstractions typically include concepts such as
names, scopes, bindings, data types, operators, procedures, parameters and
flow-of-control constructs.
• The compiler must co-operate with the operating system and other system software
to support these abstractions on the target machine.
• To do so, the compiler creates and manages a run-time environment in which
the target program is executed.
• By runtime, we mean a program in execution.
• Runtime environment is a state of the target machine, which may include
software libraries, environment variables, etc., to provide services to the
processes running in the system.

Run-time Environment...
• The runtime support system is a package, mostly generated with the executable
program itself, that facilitates communication between the process
and the runtime environment. It takes care of memory allocation and
de-allocation while the program is being executed.
• This environment deals with a number of issues such as the layout and allocation of
storage locations for the objects named in the source program, the mechanisms
used by the target program to access variables, the linkages between
procedures, the mechanisms for passing parameters, and the interfaces to the
operating system, input/output devices and other programs.
• That is,
‣ Management of run-time resources
‣ Correspondence between static (compile-time) and dynamic (run-time) structures
‣ Storage organization

Run-time Resources
• Execution of a program is initially under the control of the operating
system (OS)
• When a program is invoked:
‣ The OS allocates space for the program
‣ The code is loaded into part of this space
‣ The OS jumps to the entry point of the program
(i.e., to the beginning of the “main” function)

Memory Layout: Storage Organization

[Figure: memory layout of a program, drawn from low address to high address]

Correspondence between Static and Dynamic Structures

• Compiler must do the storage allocation and provide access to variables and data.
• At run time, we need a system to map NAMES (in the source program) to STORAGE
on the machine.
• Allocation and de-allocation of memory is handled by a RUN-TIME SUPPORT
SYSTEM typically linked and loaded along with the compiled target code.
• One of the primary responsibilities of the run-time system is to manage ACTIVATIONS
of procedures.
• Procedure execution begins at the first statement of the procedure body.
• When a procedure returns, execution returns to the instruction immediately following
the procedure call.

Activation and Activation Tree
• Every execution of a procedure is called an ACTIVATION.
• The LIFETIME of an activation of procedure P is the sequence of steps between
the first and last steps of P’s body, including any procedures called while P is
running.
• Normally, when control flows from one activation to another, it must (eventually)
return to the same activation.
• If a procedure is recursive, a new activation can begin before an earlier
activation of the same procedure has ended.
• We can represent the activations of procedures during the running of an entire
program by a tree, called an activation tree.
• Activation tree shows the way control enters and leaves activations. In an
activation tree:
– Each node represents an activation of a procedure.
– The root represents the activation of the main program.
– The node a is a parent of the node b if the control flows from a to b.
– The node a is to the left of the node b if the lifetime of a occurs before the lifetime
of b.

Procedure Activations: Example

Procedure Activation : Example (contd...)
• The example is a sketch of a program that reads nine integers into an array a
and sorts them using the recursive quicksort algorithm.
• The main function has three tasks: it calls readarray, sets the sentinels, and then calls
quicksort on the entire data array.
• The figure on the right shows the sequence of calls that might result from an
execution of the program. In this execution, the call to partition(1,9) returns 4, so
a[1] to a[3] hold elements less than the chosen separator value v, while the larger
elements are in a[5] through a[9].
• In this example, procedure activations are nested in time.

Activation Tree: During an Execution of quicksort

• This activation tree shows one possible sequence of calls and returns
in the above program.
• The functions are represented by the first letters of their names.
• Remember that this tree is only one possibility, since the arguments of
subsequent calls, and also the number of calls along any branch, are
influenced by the values returned by partition.

Control Stack
• Procedure calls and returns are managed by a run-time stack called the control stack.
• Each live activation has a frame, known as an activation record, on the control stack. The
root of the activation tree is at the bottom, and the records on the stack correspond to
the path in the activation tree from the root to the activation where control currently
resides. That latter activation has its record at the top of the stack.
• The stack keeps track of currently active procedure activations.
– An activation record is pushed onto the control stack as the activation starts.
– That activation record is popped when that activation ends.
• At any point in time, the control stack represents a path from the root of the activation
tree to one of the nodes.
• The flow of control in a program corresponds to a depth-first traversal of the
activation tree that:
– starts at the root,
– visits a node before its children, and
– recursively visits the children of each node in left-to-right order.
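The relationship between the activation tree and the control stack can be sketched in Python as follows. The tree is a truncated, hypothetical version of the quicksort example (deeper calls are omitted), and the names tree, run and log are assumptions for illustration.

```python
# Activation tree as nested tuples: (name, [children]).
# m = main, r = readarray, q = quicksort, p = partition.
tree = ("m", [
    ("r", []),
    ("q(1,9)", [("p(1,9)", []), ("q(1,3)", []), ("q(5,9)", [])]),
])


def run(node, stack, log):
    """Depth-first, left-to-right traversal that mimics calls and returns."""
    name, children = node
    stack.append(name)           # push the activation record on entry
    log.append(tuple(stack))     # stack = path from the root to this node
    for child in children:       # children are visited left to right
        run(child, stack, log)
    stack.pop()                  # pop the activation record on return


stack, log = [], []
run(tree, stack, log)
for snapshot in log:
    print(" -> ".join(snapshot))
```

Each printed line is the content of the control stack, bottom first, at the moment an activation begins.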

Activation Records
• Information needed by a single execution of a procedure is managed using a
contiguous block of storage called activation record.
• An activation record is allocated when a procedure is entered, and it is
de-allocated when that procedure is exited.
• The size of each field can usually be determined at compile time (although the actual location of
the activation record is determined at run time).
• The exception is a local variable whose size depends on a
parameter; its size is determined at run time.

A General Activation Record
• Temporary values, such as those arising from the evaluation of
expressions, in cases where those temporaries cannot be held in
registers.
• Local data belonging to the procedure whose activation record this is.
• Saved machine status, with information about the state of the machine
just before the call to the procedure. This information typically includes
the return address (the value of the program counter, to which the called
procedure must return) and the contents of registers that were used by
the calling procedure and that must be restored when the return occurs.
• An access link, which may be added to locate data needed by the called
procedure but found elsewhere, e.g. in another activation record.
• A control link, pointing to the activation record of the caller.
• Space for the return value of the called function, if any.
• The actual parameters used by the calling procedure.

Fig. Fields of a general activation record, in order: actual parameters, returned values,
control link, access link, saved machine status, local data, temporaries.

Creation of An Activation Record

• Who allocates an activation record of a procedure?


• Some part of the activation record of a procedure is created by that procedure
immediately after that procedure is entered.
• Some part is created by the caller of that procedure before that procedure is
entered.
• Calling sequences are code sequences that create activation records on the
stack and enter data into them.
• The CALLING SEQUENCE for a procedure allocates an activation record and
fills its fields in with appropriate values.
• The RETURN SEQUENCE restores the machine state to allow execution of the
calling procedure to continue.

Creation of An Activation Record

[Figure: division of tasks between caller and callee. The caller’s activation record is followed
by the callee’s activation record, with stack_top at the end. Each record holds: parameters and
return value; control link, links and saved status; temporaries and local data. The figure marks
which parts of the callee’s record are the caller’s responsibility and which are the callee’s
responsibility.]

Creation of An Activation Record
Sample calling sequence
• Caller evaluates the actual parameters and places them into the activation record of the
callee.
• Caller stores a return address and old value for stack_top in the callee’s activation record.
• Caller increments stack_top to the beginning of the temporaries and locals for the callee.
• Caller branches to the code for the callee.
• Callee saves all needed register values and status.
• Callee initializes its locals and begins execution.
Sample return sequence
• Callee places the return value at the correct location in the activation record (next to caller’s
activation record)
• Callee uses status information previously saved to restore stack_top and the other registers.
• Callee branches to the return address previously stored by the caller.
• [Optional] Caller copies the return value into its own activation record and uses it to evaluate
an expression.

Who deallocates?
• Callee de‐allocates the part allocated by Callee.
• Caller de‐allocates the part allocated by Caller.

Variable-length data
• In some languages, array size can depend on a value
passed to the procedure as a parameter.
• This and any other variable-sized data can still be
allocated on the stack, but BELOW the callee’s
activation record.
• In the activation record itself, we simply store
POINTERS to the to-be-allocated data.
• All variable-length data is pointed to from the local
data area.

Intermediate Code Generation
• In the analysis-synthesis model of compiler, the front end analyzes a source program and
creates an intermediate representation, from which the back end generates target code.
• The details of source language are confined to front end and details of the target machine to
the back end.
• With a suitably defined intermediate representation, a compiler for language i and machine j
can then be built by combining the front end for language i with the back end for machine j.
• Intermediate code is often the link between the compiler’s front end and back end.
• Intermediate codes are machine independent codes, but they are close to machine
instructions.

Fig. Position of Intermediate Code Generator in Compiler


Why Intermediate Language
Advantages:
• Target code can be generated for any machine just by attaching a back end
for the new machine. This is called retargeting.
• It is possible to apply machine-independent code optimization to the
intermediate code, so that better target code is generated.
• IR can modularize the task: Front end is not bothered about machine details
and Back end is not bothered about source language.
• Have many front-ends into a single back-end
– gcc can handle C, C++, Java, Fortran, Ada, ...
– each front-end translates source to the same generic language (called GENERIC)
• Have many back-ends from a single front-end
– Do most optimization on intermediate representation before emitting code targeted at a
single machine
• It provides an intermediate level of abstraction
– more details than the source
– fewer details than the target

Types of Intermediate Languages
There are three kinds of intermediate representations:
1. High-level intermediate representations:
– closer to the source language; e.g., syntax trees or Directed Acyclic
Graph(DAG)
– easy to generate from the input program
– code optimizations may not be straightforward
2. Low-level intermediate representations:
– closer to target machine; e.g., P-Code, U-Code (used in PA-RISC and MIPS),
GCC’s RTL, 3-address code
– easy to generate code from
– generation from input program may require effort
3. “Mid”-level intermediate representations:
– Java bytecode, Microsoft CIL, LLVM IR, ...

1. Syntax Tree

• Each node in a syntax tree represents a construct; the children of the
node represent the meaningful components of the construct.
• A syntax tree node representing an expression E1+E2 has label + and
two children representing the subexpressions E1 and E2.
• We shall implement the nodes of a syntax tree by objects with a
suitable number of fields. Each object will have an op field that is the
label of the node.
• Additionally, if a node is a leaf, a further field holds the lexical value for
the leaf. A constructor function leaf(op,val) creates a leaf object.
• If the node is an interior node, a constructor function node(op,
c1,c2,..,ck) creates an object with field op and k additional fields for the children c1,..,ck.
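The two constructor functions might be sketched in Python as below, building the tree for 2-3+4. The Node class and the show helper are assumptions added for illustration; only leaf(op, val) and node(op, c1, ..., ck) come from the slide.

```python
class Node:
    """Syntax tree node: op is the label, children the subtrees, val the lexical value."""

    def __init__(self, op, children=(), val=None):
        self.op = op
        self.children = tuple(children)
        self.val = val


def leaf(op, val):
    """Create a leaf object, e.g. leaf('num', 2)."""
    return Node(op, val=val)


def node(op, *children):
    """Create an interior node with one field per child."""
    return Node(op, children=children)


def show(n):
    """Fully parenthesized rendering, to make the tree shape visible."""
    if not n.children:
        return str(n.val)
    return "(" + f" {n.op} ".join(show(c) for c in n.children) + ")"


# 2 - 3 + 4 is parsed left-associatively: (2 - 3) + 4
p1 = leaf("num", 2)
p2 = leaf("num", 3)
p3 = node("-", p1, p2)
p4 = leaf("num", 4)
p5 = node("+", p3, p4)
print(show(p5))   # ((2 - 3) + 4)
```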

SDD for creating syntax tree

Example: Creating syntax tree for expression: 2-3+4

[Figure: parse tree for 2-3+4 using E → E + T, E → E - T, E → T, T → num, and the resulting syntax tree]

Syntax tree:
          +
         / \
        -   num 4
       / \
   num 2  num 3

Example 2: Syntax tree

2. DAG
• Like syntax tree for an expression, DAG has leaves corresponding to atomic operands and
interior nodes corresponding to operators.
• The difference is that a node N in a DAG has more than one parent if N represents a common
subexpression.
• All that is needed is that functions such as Node and Leaf above check whether an identical node already
exists. If such a node exists, a pointer to that node is returned instead of creating a new one.
• More compact representation
• Gives clues regarding generation of efficient code
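A minimal Python sketch of this check: the constructors look up a table before creating a node, so a repeated subexpression is shared. The DagBuilder class and the sample expression a + a * (b - c) + (b - c) * d are assumptions chosen for illustration.

```python
class DagBuilder:
    """Creates DAG nodes, reusing an existing node when an identical one exists."""

    def __init__(self):
        self.nodes = {}   # key -> node; one entry per distinct node

    def _get(self, key, op, children, val):
        if key not in self.nodes:                 # create only if not seen before
            self.nodes[key] = {"op": op, "children": children, "val": val}
        return self.nodes[key]

    def leaf(self, op, val):
        return self._get((op, val), op, (), val)

    def node(self, op, *children):
        # Children are identified by object identity, since equal
        # subexpressions already map to the same node object.
        key = (op,) + tuple(id(c) for c in children)
        return self._get(key, op, children, None)


dag = DagBuilder()
a = dag.leaf("id", "a")
bc1 = dag.node("-", dag.leaf("id", "b"), dag.leaf("id", "c"))
left = dag.node("+", a, dag.node("*", dag.leaf("id", "a"), bc1))
bc2 = dag.node("-", dag.leaf("id", "b"), dag.leaf("id", "c"))
root = dag.node("+", left, dag.node("*", bc2, dag.leaf("id", "d")))

print(bc1 is bc2)        # True: b - c is built once and shared
print(len(dag.nodes))    # 9 nodes, where the syntax tree would have 13
```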

Example: DAG for expression:

3. Three-Address Code
• A three address code is the intermediate representation with at most one operator on the
right side of an instruction.
• That is, no built-up arithmetic expressions are permitted.
• Thus, x+y*z might be translated into the sequence of three address instructions:
t1 = y*z
t2 = x + t1
where t1 and t2 are compiler generated temporary names.
• 3AC is close to assembly language, making machine code generation easier.
• 3AC is easy to generate from syntax trees or DAG. We associate a temporary with each
interior tree node.

Forms of 3AC
• Assignment statements of the form x := y op z, where op is a binary arithmetic or logical operation.
• Assignment statements of the form x := op y, where op is a unary operator, such as unary minus,
logical negation
• Copy statements of the form x := y, which assigns the value of y to x.
• Unconditional jumps of the form goto L, which means the statement with label L is the next to be executed.
• Conditional jumps, such as if x relop y goto L, where relop is a relational operator (<, =, >=, etc) and
L is a label. (If the condition x relop y is true, the statement with label L will be executed next.)
• Statements param x and call p, n for procedure calls, and return y, where y represents the (optional)
returned value. The typical usage: p(x1, …, xn)
param x1
param x2
...
param xn
call p, n
• Index assignments of the form x := y[i] and x[i] := y. The first sets x to the value in the location i memory
units beyond location y. The second sets the content of the location i units beyond x to the value of y.
• Address and pointer assignments:
x := &y
x := *y
*x := y
Representation of 3AC in data structure
• How are these instructions represented in a data structure?
In a compiler, 3AC instructions can be implemented as objects or records with fields for the
operator and the operands. Three such representations are:
– Quadruples
– Triples
– Indirect triples

1. Quadruples
• Has four fields: op, arg1, arg2, result
• Exceptions:
– Unary operators: no arg2
– Operators like param: no arg2, no result
– (Un)conditional jumps: target label is the result

2. Triples

• Only three fields: op, arg1 and arg2; there is no result field.
• A result is referred to by the position of the instruction that computes it.

Fig. Representation of a = b * - c + b * - c (panel (c): three-address code)

3. Indirect Triples
• When instructions are moved around during optimization, quadruples are
better than triples: a triple is referred to by its position, so moving it
requires updating every reference to it.
• Indirect triples solve this problem.
• Indirect triples consist of a list of pointers to triples rather than a list of the
triples themselves. With this, an optimizing compiler can move an instruction
by reordering the instruction list without affecting the triples themselves.
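The three representations can be compared on a = b * -c + b * -c with the Python sketch below. The tuple layouts, the ("ref", n) notation for a position reference, and the helper to_triples are assumptions made for illustration.

```python
# Quadruples: (op, arg1, arg2, result)
quads = [
    ("minus", "c", None, "t1"),
    ("*", "b", "t1", "t2"),
    ("minus", "c", None, "t3"),
    ("*", "b", "t3", "t4"),
    ("+", "t2", "t4", "t5"),
    ("=", "t5", None, "a"),
]


def to_triples(quads):
    """Drop the result field; temporaries become references to positions."""
    pos = {}       # temporary name -> index of the triple that computes it
    triples = []

    def ref(x):
        return ("ref", pos[x]) if x in pos else x

    for op, arg1, arg2, result in quads:
        if op == "=":
            triples.append(("=", result, ref(arg1)))
        else:
            triples.append((op, ref(arg1), ref(arg2)))
            pos[result] = len(triples) - 1
    return triples


triples = to_triples(quads)
for i, t in enumerate(triples):
    print(i, t)

# Indirect triples: a separate list of pointers (here, indices) into triples.
# Reordering this list moves instructions without renumbering any triple.
instruction_list = [0, 1, 2, 3, 4, 5]
instruction_list[1], instruction_list[2] = instruction_list[2], instruction_list[1]
print(instruction_list)   # [0, 2, 1, 3, 4, 5]
```

Swapping entries 1 and 2 is safe here because triple 2 does not use the result of triple 1.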

3AC for program constructs
• Program consists of assignment statements like a=b op c or control statements like if-then-else,
while loop or for statements.
• This section deals with generation of three address code for assignment statement and control
statements.

1. Three-address code for assignment statement

• The three-address code for an assignment statement S is given in the following SDD. The
attributes S.code and E.code denote the three-address code for S and E respectively.
• Attribute E.addr denotes the address that will hold the value of E. This address can be a name,
a constant, or a compiler-generated temporary.
• For the production E -> id, the expression is a single identifier, say x, and x itself
holds the value of the expression. The semantic rules for this production define E.addr to point
to the symbol table entry for this instance of id. Let top denote the current symbol table.
The function top.get( ) retrieves the entry when it is applied to id.lexeme. E.code is set to
the empty string.
• For E -> (E1), the translation of E is the same as that of the subexpression E1. Hence, E.addr
equals E1.addr and E.code = E1.code.
• For E -> E1 + E2, code is generated to compute the value of E from the values of E1 and E2. Values are
computed into newly generated temporary names.
• A sequence of distinct temporary names is created by successive calls to new Temp().
• The gen( ) function builds an instruction and returns it.
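The rules above can be approximated by the Python sketch below, which returns the pair (addr, code) for each expression and appends the final copy for the assignment. The tuple encoding of expressions and the names Temps, translate and assign are assumptions; the symbol table lookup is omitted, so an identifier's name stands in for its entry.

```python
class Temps:
    """Hands out fresh temporary names t1, t2, ... (the role of new Temp())."""

    def __init__(self):
        self.count = 0

    def new_temp(self):
        self.count += 1
        return f"t{self.count}"


def translate(e, temps):
    """Return (E.addr, E.code) for an expression tree."""
    kind = e[0]
    if kind == "id":                       # E -> id: no code, addr is the name
        return e[1], []
    if kind == "neg":                      # E -> - E1
        a1, c1 = translate(e[1], temps)
        t = temps.new_temp()
        return t, c1 + [f"{t} = -{a1}"]
    if kind in ("+", "-", "*"):            # E -> E1 op E2
        a1, c1 = translate(e[1], temps)
        a2, c2 = translate(e[2], temps)
        t = temps.new_temp()
        return t, c1 + c2 + [f"{t} = {a1} {kind} {a2}"]
    raise ValueError(f"unknown expression kind: {kind}")


def assign(name, e, temps):
    """S -> id = E: S.code is E.code followed by the copy instruction."""
    addr, code = translate(e, temps)
    return code + [f"{name} = {addr}"]


# a = -b * c
code = assign("a", ("*", ("neg", ("id", "b")), ("id", "c")), Temps())
print("\n".join(code))
```

For a = -b * c this prints t1 = -b, t2 = t1 * c and a = t2, one instruction per line.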
3AC for Assignment statement with expressions

Example: Generate three address code for the following arithmetic expression: a = -b*c

Fig. Parse tree for the expression a = -b*c, with attributes listed from the leaves up:
E (b):        E.addr = b     E.code = ‘ ’
E (-b):       E.addr = t1    E.code = t1 = -b
E (c):        E.addr = c     E.code = ‘ ’
E (-b*c):     E.addr = t2    E.code = t1 = -b   t2 = t1 * c
S (a = -b*c): S.code = t1 = -b   t2 = t1 * c   a = t2

Three Address code:
t1 = -b
t2 = t1 * c
a = t2

3AC generation for Array references
• Elements of array are stored in consecutive memory location.
• In C and Java, the array elements are numbered 0,1,.........., n-1, for array with n
elements.
• If the width of each element is w, then the element A[i] of the array is located at:
base + i * w ........ (1)
where base is the base address of the array, i.e. the address of its first element A[0].
• Example: Let A[10] be an array of 10 elements. Let the size of each element be 2, i.e.,
w = 2, and let the array be stored from memory location 1000, i.e. base address = 1000.
• The element A[3] is at address 1000 + 3 * 2 = 1000 + 6 = 1006.

A[0] A[1] A[2] A[3] A[4] A[5] A[6] A[7] A[8] A[9]

3AC generation for Array references

• More generally, the array elements need not be numbered starting at 0. In a one-dimensional array,
the array elements are numbered low, low+1, low+2, ..., high and base is the
relative address of A[low].

• The address of A[i] then becomes: base + (i - low) * w ..............(2)

• Formula (2) can be rewritten as: i * w + base – low * w = i * w + c,
where c = base – low * w.

• All the components of c are known at compile time, hence c can be pre-computed
and stored. This reduces the work needed at run time to compute the address of A[i].

• We assume that c is saved in the symbol table entry for A, so the relative address of
A[i] is obtained by simply adding i * w to c.
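A small Python check of formula (2) and its pre-computed form. The base address 1000 is a hypothetical value chosen for illustration; low = 10 and w = 4 match the declaration A: array [10 ... 20] used in the next example.

```python
def precompute_c(base, low, w):
    """Compile-time part of the address: c = base - low * w."""
    return base - low * w


def address(i, w, c):
    """Run-time part: address of A[i] = i * w + c."""
    return i * w + c


base, low, w = 1000, 10, 4        # base is an assumed value
c = precompute_c(base, low, w)    # 1000 - 40 = 960

for i in (10, 12, 20):
    direct = base + (i - low) * w     # formula (2)
    print(i, address(i, w, c), direct)
```

Both columns agree for every index, and A[low] lands exactly on base.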

One Dimensional Array Reference: Example

A: array [10 ... 20] of integers;

[Figure: layout of A in memory, marking base, the width w of an array element, the lower bound low and the index i]

x := A[i]
address of A[i] = base + (i – low) * w = i * w + c
where c = base – low * w, with low = 10 and w = 4

• In a multi-dimensional array such as a matrix, elements are stored in either row-major or column-major
order. C and Pascal use row-major storage, whereas Fortran uses column-major
storage.
• Example: Consider array A[3,3] with elements:
(0,0) (0,1) (0,2)
(1,0) (1,1) (1,2)
(2,0) (2,1) (2,2)

Row major:    (0,0) (0,1) (0,2) (1,0) (1,1) (1,2) (2,0) (2,1) (2,2)
Column major: (0,0) (1,0) (2,0) (0,1) (1,1) (2,1) (0,2) (1,2) (2,2)

The address of element A[i,j] in row-major storage is given by the following expression:
A[i,j] = base + ((i - low1) * n2 + j - low2) * w ................. (3)
where low1 and low2 are the lower bounds of i and j, n2 is the number of columns, and w is the size of
each element.
The expression can be rewritten as:
A[i,j] = ((i * n2) + j) * w + (base – ((low1 * n2) + low2) * w) .......... (4)
The second part of expression (4) can be pre-computed from the values of base, low1, low2, n2 and
w. This speeds up the computation of the address of A[i,j].
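Expressions (3) and (4) can be checked against each other with the Python sketch below. The function names and the base address 1000 are assumptions; the bounds low1 = low2 = 1, n2 = 3 and w = 4 match the array A(1..2, 1..3) of the next example.

```python
def address_direct(i, j, base, low1, low2, n2, w):
    """Expression (3): base + ((i - low1) * n2 + j - low2) * w."""
    return base + ((i - low1) * n2 + (j - low2)) * w


def precompute_c(base, low1, low2, n2, w):
    """Compile-time constant from expression (4)."""
    return base - (low1 * n2 + low2) * w


def address_fast(i, j, n2, w, c):
    """Expression (4): ((i * n2) + j) * w + c."""
    return (i * n2 + j) * w + c


base, low1, low2, n2, w = 1000, 1, 1, 3, 4   # base is an assumed value
c = precompute_c(base, low1, low2, n2, w)    # 1000 - (3 + 1) * 4 = 984

for i in (1, 2):
    for j in (1, 2, 3):
        print((i, j), address_fast(i, j, n2, w, c),
              address_direct(i, j, base, low1, low2, n2, w))
```

Consecutive elements of a row are w = 4 bytes apart, and moving to the next row adds n2 * w = 12 bytes.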
Example: 2D array referencing

A : array (1..2,1..3) of integer;

... = A[i,j]
address of A[i,j] = baseA + ((i - low1) * n2 + j - low2) * w
                  = ((i * n2) + j) * w + c
where c = baseA – ((low1 * n2) + low2) * w
with low1 = 1, low2 = 1, n2 = 3, w = 4

Three-address code

t1 = i * 3
t2 = t1 + j
t3 = t2 * 4
t4 = c
t5 = t4 [t3]
..... = t5

Translation of Array references
• The main problem in generating code for array references is to relate the address-calculation
formulas to a grammar for array references.
• Let nonterminal L generate an array name followed by a sequence of index expressions:
L -> L [ E ] | id [ E ]
• In this translation scheme, the gen( ) function builds an instruction and incrementally emits it into
the stream of generated instructions.
• The nonterminal L has three synthesized attributes:
– L.addr denotes a temporary that is used while computing the offset for the array
reference generated by L.
– L.array is a pointer to the symbol table entry for the array name. The base address of the
array (the address of its 0th element), say L.array.base, is used to determine the actual
l-value of an array reference after all the index expressions are analyzed. The location for
the array reference is therefore L.array.base[L.addr].
– L.type is the type of the subarray generated by L. For any type t, we assume that its
width is given by t.width and that t.elem gives the element type.
• The translation scheme is shown below:

Translation of Array references

Example: Compute 3AC for expression c+a[i][j], where c, i and j are all integers and a is a 2x3 integer
array.

Fig. Annotated Parse Tree for c + a[ i ][ j ], with attributes listed from the leaves up:
a.type = array(2, array(3, integer))
L (a[i]):         L.array = a    L.type = array(3, integer)    L.addr = t1    index E.addr = i
L (a[i][j]):      L.array = a    L.type = integer              L.addr = t3    index E.addr = j
E (a[i][j]):      E.addr = t4
E (c):            E.addr = c
E (c + a[i][j]):  E.addr = t5

Three Address Code
t1 = i * 12
t2 = j * 4
t3 = t1 + t2
t4 = a [ t3 ]
t5 = c + t4

Flow-of- control statements
• Control statements are used to alter the sequential flow of execution.
• Some of the control statements are the if-then, if-then-else, while and do-while statements:
• S -> if ( E ) S1
• S -> if ( E ) S1 else S2
• S -> while ( E ) S1
• S -> do S1 while ( E )
• Three-address code for if-then, if-then-else and while-do statements can be generated using
the translation rules given in the following slides.
• In the translation rules, both S and E have a synthesized attribute code, which gives the
translation into three-address instructions.
• For simplicity, the translations S.code and E.code are built up as strings using an SDD.
• The translation of S -> if (E) S1 consists of E.code followed by S1.code, as shown in the figure.

Three Address Code generation for if then statement
Statement: S -> if E then S1

Translation rules:
E.true = newlabel();
E.false = S.next;
S1.next = S.next;
S.code = E.code || label(E.true, ‘:’) || S1.code

Code layout:
          E.code        (jumps to E.true or to E.false)
E.true:   S1.code
E.false:  ...

Example: Generate 3 address code for the statement: if a>b then x =y +z.
Ans:
3AC for the given statement is:
if a>b then goto L1
goto L2
L1: t1 = y + z
x = t1
L2: ....

Three Address Code generation for if then else statement
Production: S -> if E then S1 else S2

Semantic rules:
E.true = newlabel();
E.false = newlabel();
S1.next = S.next;
S2.next = S.next;
S.code = E.code || label(E.true, ‘:’) ||
S1.code || gen(‘GOTO’, S.next) ||
label(E.false, ‘:’) || S2.code

Example: Generate 3 address code for the statement: if a>b then x =y +z else x = y-z
The three address code is given below:

if a>b then goto L1


goto L2
L1: t1= y+z
x =t1
goto L3
L2: t1 = y-z
x = t1
L3: ............
Three Address Code generation for while do statement

Attributes:
S.code: three-address code for evaluating S
S.begin: label to start of S
S.next: label to end of S

Production: S -> while E do S1

Semantic rules:
S.begin = newlabel();
E.true = newlabel();
E.false = S.next
S1.next = S.begin
S.code = label(S.begin, ‘:’) || E.code || label(E.true, ‘:’) || S1.code ||
gen(‘GOTO’, S.begin)
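The semantic rules for if-then, if-then-else and while can be sketched together in Python as below. This is a simplified illustration under stated assumptions: conditions are plain strings such as "a > b", assignment bodies are already single three-address instructions, jumps use the form if x relop y goto L, and the names Labels, gen_cond and gen_stmt are hypothetical.

```python
class Labels:
    """Hands out fresh labels L1, L2, ... (the role of newlabel())."""

    def __init__(self):
        self.n = 0

    def new(self):
        self.n += 1
        return f"L{self.n}"


def gen_cond(cond, true_label, false_label):
    """E.code: jump to E.true if the condition holds, otherwise to E.false."""
    return [f"if {cond} goto {true_label}", f"goto {false_label}"]


def gen_stmt(s, next_label, labels):
    """Return S.code as a list of instructions; next_label plays the role of S.next."""
    kind = s[0]
    if kind == "assign":                       # ("assign", "x = y + z")
        return [s[1]]
    if kind == "if":                           # E.false = S.next
        t = labels.new()
        return (gen_cond(s[1], t, next_label) + [f"{t}:"]
                + gen_stmt(s[2], next_label, labels))
    if kind == "ifelse":
        t, f = labels.new(), labels.new()
        return (gen_cond(s[1], t, f) + [f"{t}:"]
                + gen_stmt(s[2], next_label, labels)
                + [f"goto {next_label}", f"{f}:"]
                + gen_stmt(s[3], next_label, labels))
    if kind == "while":                        # S1.next = S.begin
        begin, t = labels.new(), labels.new()
        return ([f"{begin}:"] + gen_cond(s[1], t, next_label) + [f"{t}:"]
                + gen_stmt(s[2], begin, labels) + [f"goto {begin}"])
    raise ValueError(f"unknown statement kind: {kind}")


# while a > b do x = y + z
code = gen_stmt(("while", "a > b", ("assign", "x = y + z")), "LNEXT", Labels())
print("\n".join(code))
```

Passing the enclosing S.next (or S.begin inside a loop) down as next_label is what makes nested statements jump to the right place.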

Example 1: Generate 3 address code for the statement: while a>b do x = y +z.
The three address code is given below:

L1: if a> b then goto L2


goto L3
L2: t1 = y+z
x = t1
goto L1
L3: .......

Example 2: Generate 3 address code for the statement:

    i = 2 * n + k
    while i do
        i = i - k

The three address code is given below:

    t1 = 2
    t2 = t1 * n
    t3 = t2 + k
    i = t3
L1: if i != 0 then goto L2
    goto L3
L2: t4 = i - k
    i = t4
    goto L1
L3: .......
Example 3: Generate 3AC for the statement:
    while a < b do
        if c < d then
            x = y + z
        else
            x = y - z
Solution:
The three address code will be as follows
L1:    if a < b then GOTO L2
       GOTO LNEXT
L2:    if c < d then GOTO L3
       GOTO L4
L3:    t1 = y + z
       x = t1
       GOTO L1
L4:    t1 = y - z
       x = t1
       GOTO L1
LNEXT:
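
The way S.next is passed down to nested statements can be sketched with a small recursive translator. This is a minimal sketch under assumptions: statements are encoded as hypothetical tuples, conditions are plain strings, the helper names make_fresh and translate are invented for illustration, and each assignment gets its own temporary (so the else-branch uses t2 where the solution above reuses t1).

```python
def make_fresh():
    """Return a function that hands out t1, t2, ... and L1, L2, ... per prefix."""
    counters = {}
    def fresh(prefix):
        counters[prefix] = counters.get(prefix, 0) + 1
        return f"{prefix}{counters[prefix]}"
    return fresh

def translate(stmt, s_next, fresh):
    """Return S.code as a list of lines; s_next is the inherited S.next."""
    kind = stmt[0]
    if kind == "assign":                    # ("assign", x, y, op, z)
        _, x, y, op, z = stmt
        t = fresh("t")
        return [f"{t} = {y} {op} {z}", f"{x} = {t}"]
    if kind == "ifelse":                    # ("ifelse", cond, S1, S2)
        _, cond, s1, s2 = stmt
        e_true, e_false = fresh("L"), fresh("L")
        return ([f"if {cond} goto {e_true}", f"goto {e_false}", f"{e_true}:"]
                + translate(s1, s_next, fresh)      # S1.next = S.next
                + [f"goto {s_next}", f"{e_false}:"]
                + translate(s2, s_next, fresh))     # S2.next = S.next
    if kind == "while":                     # ("while", cond, S1)
        _, cond, s1 = stmt
        s_begin, e_true = fresh("L"), fresh("L")
        return ([f"{s_begin}:", f"if {cond} goto {e_true}",
                 f"goto {s_next}", f"{e_true}:"]
                + translate(s1, s_begin, fresh)     # S1.next = S.begin
                + [f"goto {s_begin}"])
    raise ValueError(f"unknown statement kind: {kind}")

program = ("while", "a < b",
           ("ifelse", "c < d",
            ("assign", "x", "y", "+", "z"),
            ("assign", "x", "y", "-", "z")))
code = translate(program, "LNEXT", make_fresh())
print("\n".join(code))
```

Because the loop body receives S.begin as its S.next, the jump that ends the then-branch goes straight back to L1, which is why both branches in the solution end with GOTO L1.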
Example 4: Generate 3AC for the statement:
    c = 0
    do
        if (a < b)
            x++
        else
            x--
        c++
    while ( c < 5 )

Solution:
The three address code is given below:
1.  c = 0
2.  if (a < b) GOTO (4)
3.  GOTO (7)
4.  t1 = x + 1
5.  x = t1
6.  GOTO (9)
7.  t2 = x - 1
8.  x = t2
9.  t3 = c + 1
10. c = t3
11. if (c < 5) GOTO (2)
12. ------------------------

Alternate Method:
    c = 0
L1: if (a < b) GOTO L2
    GOTO L3
L2: t1 = x + 1
    x = t1
    GOTO L4
L3: t2 = x - 1
    x = t2
L4: t3 = c + 1
    c = t3
    if (c < 5) GOTO L1
    ----------
Example 5: Generate three address code for the following C program
    int a[10], b[10], dot_product, i;
    dot_product = 0;
    for ( i = 0; i < 10; i++ ) dot_product += a[i] * b[i];

Intermediate Code:
    dot_product = 0
    i = 0
L1: if (i >= 10) GOTO L2
    t1 = addr(a)      // c = base - low * w = base, since low = 0
    t2 = i * 4
    t3 = t1[t2]
    t4 = addr(b)
    t5 = i * 4
    t6 = t4[t5]
    t7 = t3 * t6
    t8 = dot_product + t7
    dot_product = t8
    t9 = i + 1
    i = t9
    GOTO L1
L2: -------------
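
The address computation for a[i] and b[i] follows one pattern, which the sketch below emits from a helper. This is only an illustration: the function name gen_array_load is hypothetical, and it assumes zero-based arrays (so the constant part is just the base address) and 4-byte integers, as in the example.

```python
import itertools

def gen_array_load(name, index, width, temps):
    """Emit code that loads name[index] into a fresh temporary.

    Assumes a zero-based array, so only the base address and the
    scaled index (index * width) are needed.
    Returns (code, temporary holding the element value).
    """
    base, offset, value = (f"t{next(temps)}" for _ in range(3))
    code = [f"{base} = addr({name})",
            f"{offset} = {index} * {width}",
            f"{value} = {base}[{offset}]"]
    return code, value

temps = itertools.count(1)
code_a, val_a = gen_array_load("a", "i", 4, temps)
code_b, val_b = gen_array_load("b", "i", 4, temps)
product = f"t{next(temps)}"
body = code_a + code_b + [f"{product} = {val_a} * {val_b}"]
print("\n".join(body))
```

The seven lines produced correspond to t1 through t7 in the loop body: two element loads followed by their product.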

Example 6: Generate three address code for the following C program
    int a[10], b[10], dot_product, i;
    int *a1, *b1;
    dot_product = 0;
    a1 = a; b1 = b;
    for ( i = 0; i < 10; i++ ) dot_product += *a1++ * *b1++;

Intermediate Code:
    dot_product = 0
    a1 = &a
    b1 = &b
    i = 0
L1: if (i >= 10) GOTO L2
    t3 = *a1
    t4 = a1 + 1
    a1 = t4
    t5 = *b1
    t6 = b1 + 1
    b1 = t6
    t7 = t3 * t5
    t8 = dot_product + t7
    dot_product = t8
    t9 = i + 1
    i = t9
    GOTO L1
L2: -------------
Logical Expression
• Logical operators are mainly used in flow-control statements such as if-then-else, while-do, and repeat-until.
• The not operator has the highest precedence, followed by and; or has the lowest precedence.
• A logical expression always evaluates to either true or false.
• Depending on the encoding convention, true can be represented as 1, as any nonzero value, or as any nonnegative value; false is correspondingly 0 or a negative value.

Production          Translation rules

E → E1 or E2        E1.true  = E.true
                    E1.false = newlabel( )
                    E2.true  = E.true
                    E2.false = E.false
                    E.code   = E1.code || label(E1.false) || E2.code

E → E1 and E2       E1.true  = newlabel( )
                    E1.false = E.false
                    E2.true  = E.true
                    E2.false = E.false
                    E.code   = E1.code || label(E1.true) || E2.code

SDD for translation of Boolean Expression to 3AC
E → not E1          E1.true  = E.false
                    E1.false = E.true
                    E.code   = E1.code

E → E1 rel E2       E.code = E1.code || E2.code ||
                             gen(‘if’ E1.addr rel.op E2.addr ‘goto’ E.true) ||
                             gen(‘goto’ E.false)

E → ( E1 )          E.value = E1.value

E → true            E.code = gen(‘goto’ E.true)

E → false           E.code = gen(‘goto’ E.false)

Example: Generate 3AC for the following expression, computing its value with the logical operators applied in precedence order:

    a or b and not c

The 3AC for the above expression will be as follows:

    t1 = not c
    t2 = b and t1
    t3 = a or t2
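
This value-based translation amounts to a post-order walk of the expression tree. The following sketch assumes the parser has already applied the precedence rules and delivers the expression as nested tuples; the tuple encoding and the function name value_code are hypothetical.

```python
import itertools

def value_code(expr, temps, out):
    """Append code computing expr to out; return the name holding its value."""
    if isinstance(expr, str):              # a plain identifier needs no code
        return expr
    if expr[0] == "not":                   # ("not", E1)
        arg = value_code(expr[1], temps, out)
        t = f"t{next(temps)}"
        out.append(f"{t} = not {arg}")
        return t
    op, lhs, rhs = expr                    # ("and" | "or", E1, E2)
    left = value_code(lhs, temps, out)
    right = value_code(rhs, temps, out)
    t = f"t{next(temps)}"
    out.append(f"{t} = {left} {op} {right}")
    return t

# a or (b and (not c)), already grouped by precedence
tree = ("or", "a", ("and", "b", ("not", "c")))
out = []
result = value_code(tree, itertools.count(1), out)
print("\n".join(out))
```

The innermost operator is emitted first, so the temporaries appear in the same order as in the example.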

Example: Consider the following statement and translate it into three-address code:

    if (x < 100 || x > 200 && x != y ) x = 0;

Three address-code:
    if x < 100 goto L2
    goto L3
L3: if x > 200 goto L4
    goto L1
L4: if x != y goto L2
    goto L1
L2: x = 0
L1: ......
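
The jumping code above can be reproduced by applying the or, and, not and rel rules recursively. The sketch below is a minimal illustration, not the slides' implementation: the tuple encoding of expressions and the function name jump_code are assumptions, and the label counter starts at 3 because L1 and L2 are taken to be the labels already chosen for E.false and E.true by the enclosing if statement.

```python
import itertools

def jump_code(e, true_lbl, false_lbl, new_label):
    """Return short-circuit code for e that jumps to true_lbl or false_lbl."""
    kind = e[0]
    if kind == "rel":                      # ("rel", left, op, right)
        _, left, op, right = e
        return [f"if {left} {op} {right} goto {true_lbl}",
                f"goto {false_lbl}"]
    if kind == "not":                      # swap the two targets
        return jump_code(e[1], false_lbl, true_lbl, new_label)
    if kind == "or":                       # E1.false = newlabel()
        e1_false = new_label()
        return (jump_code(e[1], true_lbl, e1_false, new_label)
                + [f"{e1_false}:"]
                + jump_code(e[2], true_lbl, false_lbl, new_label))
    if kind == "and":                      # E1.true = newlabel()
        e1_true = new_label()
        return (jump_code(e[1], e1_true, false_lbl, new_label)
                + [f"{e1_true}:"]
                + jump_code(e[2], true_lbl, false_lbl, new_label))
    if kind == "true":
        return [f"goto {true_lbl}"]
    if kind == "false":
        return [f"goto {false_lbl}"]
    raise ValueError(f"unknown expression kind: {kind}")

counter = itertools.count(3)               # L1 and L2 belong to the caller

def new_label():
    return f"L{next(counter)}"

cond = ("or", ("rel", "x", "<", "100"),
              ("and", ("rel", "x", ">", "200"), ("rel", "x", "!=", "y")))
code = jump_code(cond, "L2", "L1", new_label) + ["L2:", "x = 0", "L1:"]
print("\n".join(code))
```

Note that no boolean value is ever stored: the result of the condition is represented purely by which label control reaches.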

Three address code for procedure call

S → call id ( Elist )     { for each item p on queue do
                                produce(‘param’ p);
                            produce(‘call’ id.value |queue|) }
Elist → Elist , E         { append E.value to the end of queue }
Elist → E                 { initialize queue to contain only E.value }
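
The queue-based actions can be sketched as follows. This is an illustration under assumptions: the function name gen_call is hypothetical, the argument places are taken as already computed (their E.code would be emitted before the param instructions), and the output format "call name, count" is chosen to resemble the notation used in these slides.

```python
from collections import deque

def gen_call(func_name, arg_places):
    """Emit param/call instructions for a call with the given argument places."""
    queue = deque()
    for place in arg_places:
        # Elist -> E starts the queue; Elist -> Elist , E appends to its end.
        queue.append(place)
    code = [f"param {p}" for p in queue]             # one param per queued item
    code.append(f"call {func_name}, {len(queue)}")   # |queue| = argument count
    return code

code = gen_call("dot_product", ["a", "b"])
print("\n".join(code))
```

Queuing the arguments first keeps all param instructions together immediately before the call, after the code that evaluates every argument.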

Example: 1

Example 2
Consider the statement: n = f(a[i])
where a is an array of integers and f is a function from integers to integers.

Three Address Code:

    t1 = i * 4
    t2 = a [ t1 ]
    param t2
    t3 = call f, 1
    n = t3

Example: 3

int dot_product ( int x[], int y[] )
{
    int d, i;
    d = 0;
    for ( i = 0; i < 10; i++ )
        d += x[i] * y[i];
    return d;
}

int main()
{
    int p; int a[10]; int b[10];
    p = dot_product ( a, b );
}

Intermediate code:

    func begin dot_product
    d = 0
    i = 0
L1: if (i >= 10) goto L2
    t1 = addr(x)
    t2 = i * 4
    t3 = t1[t2]
    t4 = addr(y)
    t5 = i * 4
    t6 = t4[t5]
    t7 = t3 * t6
    t8 = d + t7
    d = t8
    t9 = i + 1
    i = t9
    goto L1
L2: return d
    func end

    func begin main
    param a
    param b
    p = call dot_product, 2
    func end
Example 4: Write 3AC for the following code:
int fact ( int n)
{
if ( n== 0 ) return 1;
else return ( n* fact(n-1));
}
Intermediate Code:
func begin fact
if (n==0) goto L1
t1 = n-1
param t1
t2 = call fact, 1
t3 = n * t2
return t3
L1: return 1
func end

Thank You !
