Professional Documents
Culture Documents
Rupesh Nasre.
Machine-Independent
Machine-Independent
Lexical
LexicalAnalyzer
Analyzer Code
CodeOptimizer
Optimizer
Intermediate representation
Backend
Token stream
Frontend
Syntax
SyntaxAnalyzer
Analyzer Code
CodeGenerator
Generator
Machine-Dependent
Machine-Dependent
Semantic
SemanticAnalyzer
Analyzer Code
CodeOptimizer
Optimizer
Intermediate
Intermediate Symbol
Code
CodeGenerator
Generator Table
2
Intermediate representation
Role of IR Generator
● To act as a glue between front-end and
backend (or source and machine codes).
● To lower abstraction from source level.
– To make life simple.
● To maintain some high-level information.
– To keep life interesting.
● Complete some syntactic checks, perform more
semantic checks.
– e.g. break should be inside loop or switch only.
3
Representations
● Syntax Trees
++
– Maintains structure of the construct
** 44
– Suitable for high-level representations
33 55
● Three-Address Code
– Maximum three addresses
t1
t1==33**55
in an instruction t2
t2==t1
t1++44
3AC
– Suitable for both high and
low-level representations mult
mult3,
3,55 2AC
add
add44
● Two-Address Code
● … push
push33 1AC
push
push55 or
– e.g. C mult
mult stack
add
add44 machine
Syntax Trees and DAGs
a + a * (b – c) + (b – c) * d
++ ++
++ ** ++ ** AAsmall
smallproblem:
problem:
subgraph
subgraphisomorphism
isomorphism
aa ** -- dd ** dd isisNP-complete.
NP-complete.
aa -- bb cc aa --
bb cc bb cc
Assignment statement: a = b * - c + b * - c;
Assignment statement: a = b * - c + b * - c;
op arg1 arg2
(0) (2) 0 minus c
(1) (3) 1 * b (0)
(2) (0) 2 minus c
(3) (1) 3 * b (2)
(4) (4) 4 + (1) (3)
(5) (5) 5 = (4)
ifif(flag)
(flag)
ifif(flag)
(flag) xx1 ==-1;
-1;
1
xx==-1;
-1; else
else flag
else
else xx2 ==1; flag
1;
xx==1;
1; 2
xx3 ==Φ(xΦ(x11,,xx22))
yy==xx**a; a; 3
yy==xx3 **a; a;
3
xx==-1
-1 xx==11
yy==xx**aa
Language Constructs
● Declarations
– Types (int, int [], struct, int *)
– Storage qualifiers (array expressions, const, static)
● Assignments
● Conditionals, switch
● Loops
● Function calls, definitions
SDT Applications
● Finding type expressions
– int a[2][3] is array of 2 arrays of 3 integers.
– in functional style: array(2, array(3, int))
Production Semantic Rules
array
array T → B id C T.t = C.t
C.i = B.t
B → int B.t = int
22 array
array
B → float B.t = float
C → [ num ] C1 C.t = array(num, C1.t)
33 int
int C1.i = C.i
C→ε C.t = C.i
Classwork:
Classwork:Write
Writeproductions
productionsand
andsemantic
semanticrules
rulesfor
forcomputing
computing
types
typesand
andfinding
findingtheir
theirwidths
widthsininbytes.
bytes. 13
SDT Applications
● Finding type expressions
– int a[2][3] is array of 2 arrays of 3 integers.
– in functional style: array(2, array(3, int))
Production Semantic Rules
array
array T → B id C T.t = C.t; C.iw = B.sw;
C.i = B.t; T.sw = C.sw;
B → int B.t = int; B.sw = 4;
22 array
array
B → float B.t = float; B.sw = 8;
C → [ num ] C1 C.t = array(num, C1.t);
33 int
int C1.i = C.i; C.sw = C1.sw * num.value;
C→ε C.t = C.i; C.sw = C.iw;
Classwork:
Classwork:Write
Writeproductions
productionsand
andsemantic
semanticrules
rulesfor
forcomputing
computing
types
typesand
andfinding
findingtheir
theirwidths
widthsininbytes.
bytes. 14
Type Equivalence
● Since type expressions are possible, we need
to talk about their equivalence.
● Let's first structurally represent them.
● Question: Do type expressions form a DAG?
Can they contain cycles?
array
array
x y
22 array
array data
data
next
next
int
int // float
float
33 int
int
union { struct node {
int x; int data;
int a[2][3] float y; struct node *next; 15
}; };
Type Equivalence
● Two types are structurally equivalent
iff one of the following conditions is true.
1.They are the same basic type.
Name
2.They are formed by applying the same equivalence
constructor to structurally equivalent types.
3.One is a type name that denotes the other. typedef
18
Type Conversions
● int a = 10; float b = 2 * a;
● Widening conversion are safe.
– int → long → float → double.
– Automatically done by compiler, called coercion.
● Narrowing conversions may not be safe.
– int → char.
– Usually, enforced by the programmers, called casts.
– Sometimes, deferred until runtime, dyn_cast<...>.
19
Declarations
● When declarations are together, a single offset
on the stack pointer suffices.
– int x, y, z; fun1(); fun2();
● Otherwise, the translator needs to keep track of
the current offset.
– int x; fun1(); int y, z; fun2();
● A similar concept is applicable for fields in
structs.
● Blocks and Nestings
– Need to push the current environment and pop. 20
Expressions
● We have studied expressions at length.
● To generate 3AC, we will use our grammar and
its associated SDT to generate IR.
● For instance, a = b + - c would be converted to
– t1 = minus c
– t2 = b + t1
– a = t2
21
Array Expressions
● For instance, create IR for c + a[i][j].
● This requires us to know the types of a and c.
● Say, c is an integer (4 bytes) and a is int [2][3].
● Then, the IR is
t1
t1 == ii ** 12
12 ;; 33 ** 44 bytes
bytes
t2
t2 == jj ** 44 ;; 11 ** 44 bytes
bytes
t3
t3 == t1
t1 ++ t2t2 ;; offset
offset from
from aa
t4
t4 == a[t3]
a[t3] ;; assuming
assuming base[offset]
base[offset] is
is present
present in
in IR.
IR.
t5
t5 == cc ++ t4t4
22
Array Expressions
● a[5] is a + 5 * sizeof(type)
● a[i][j] for a[3][5] is a + i * 5 * sizeof(type) + j *
sizeof(type)
● This works when arrays are zero-indexed.
● Classwork: Find array expression to be generated
for accessing a[i][j][k] when indices start with low,
and array is declared as type a[10][20][30].
● Classwork: What all computations can be
performed at compile-time?
● Classwork: What happens for malloced arrays?
23
Array Expressions
void
voidfun(int
fun(inta[][])
a[][]){{ We view an array to be a D-
a[0][0]
a[0][0]==20;
20; dimensional matrix. However, for
}} the hardware, it is simply single
void
voidmain()
main(){{ dimensional.
int
inta[5][10];
a[5][10];
fun(a);
fun(a);
printf("%d\n",
printf("%d\n",a[0][0]);
a[0][0]);
}}
0,0 0,2
1,0 3,0
1,2
25
IR for Array Expressions
● L → id [E] | L [E]
L → id [ E ] { L.type = id.type;
L.addr = new Temp();
gen(L.addr '=' E.addr '*' L.type.width); }
L → L1 [ E ] { L.type = L1.type;
T = new Temp();
L.addr = new Temp();
gen(t '=' E.addr '*' L.type.width);
gen(L.addr '=' L1.addr '+' t); }
E → id { E.addr = id.addr; }
E→L { E.addr = new Temp();
gen(E.addr '=' L.base '[' L.addr ']'); }
E → E1 + E2 { E.addr = new Temp();
gen(E.addr '=' E1.addr + E2.addr); }
S → id = E { gen(id.name '=' E.addr); }
S→L=E { gen(L.base '[' L.addr ']' '=' E.addr); }
26
Note
Notethat
that the
thegrammar
grammarisisleft-recursive.
left-recursive.
t1
t1 == ii ** 12
12 ;; 33 ** 44 bytes
bytes
t2
t2 == jj ** 44 ;; 11 ** 44 bytes
bytes
t3
t3 == t1
t1 ++ t2t2 ;; offset
offset from
from aa
t4
t4 == a[t3]
a[t3];; assuming
assuming base[offset]
base[offset] is
is present
present in
in IR.
IR.
t5
t5 == cc ++ t4t4
L → id [ E ] { L.type = id.type;
L.addr = new Temp();
gen(L.addr '=' E.addr '*' L.type.width); }
L → L1 [ E ] { L.type = L1.type;
T = new Temp();
L.addr = new Temp();
gen(t '=' E.addr '*' L.type.width);
gen(L.addr '=' L1.addr '+' t); }
E → id { E.addr = id.addr; }
E→L { E.addr = new Temp();
gen(E.addr '=' L.base '[' L.addr ']'); }
E → E1 + E2 { E.addr = new Temp();
gen(E.addr '=' E1.addr + E2.addr); }
S → id = E { gen(id.name '=' E.addr); } 27
28
Control-Flow – Boolean Expressions
● B → B || B | B && B | !B | (B) | E relop E | true | false
● relop → < | <= | > | >= | == | !=
● What is the associativity of ||?
● What is its precedence over &&?
● How do we optimize evaluation of (B1 || B2) and (B3 &&
B4)?
– Short-circuiting: if (x < 10 && y < 20) ...
– Classwork: Write a C program to find out if C uses
short-circuiting or not.
● if (p && p->next) ... 29
Control-Flow – Boolean Expressions
● Source code:
– if (x < 100 || x > 200 && x != y) x = 0;
● IR: without short-circuit with short-circuit
b1
b1 == xx << 100
100 b1
b1 == xx << 100
100
b2
b2 == xx >> 200
200 iftrue
iftrue b1
b1 goto
goto L2
L2
b3
b3 == xx !=
!= yy b2
b2 == xx >> 200
200
iftrue
iftrue b1
b1 goto
goto L2
L2 iffalse
iffalse b2
b2 goto
goto L3
L3
iffalse
iffalse b2
b2 goto
goto L3
L3 b3
b3 == xx !=
!= yy
iffalse
iffalse b3
b3 goto
goto L3
L3 iffalse
iffalse b3
b3 goto
goto L3
L3
L2:
L2: L2:
L2:
xx == 0;
0; xx == 0;
0;
L3:
L3: L3:
L3: 30
...
... ...
...
3AC for Boolean Expressions
B1.true
B1.true==B1.true;
B1.true;
B → B1 || B2
B1.false
B1.false==newLabel();
newLabel();
B2.true
B2.true==B.true;
B.true;
B2.false
B2.false==B.false;
B.false;
B.code
B.code==B1.code
B1.code++
label(B1.false)
label(B1.false)++
B2.code;
B2.code;
B → B1 && B2 B1.true
B1.true==newLabel();
newLabel();
B1.false
B1.false==B.false;
B.false;
B2.true
B2.true==B.true;
B.true;
B2.false
B2.false==B.false;
B.false;
B.coee
B.coee==B1.code
B1.code++
label(B1.true)
label(B1.true)++
B2.code;
B2.code;
31
3AC for Boolean Expressions
B1.true
B1.true==B.false;
B.false;
B → !B1
B1.false
B1.false==B.true;
B.true;
B.code
B.code==B1.code;
B1.code;
B → E1 relop E2 B.code
B.code==E1.code
E1.code++E2.code
E2.code++
gen('if'
gen('if'E1.addr
E1.addrrelop
relopE2.addr
E2.addr
'goto'
'goto'B.true)
B.true)++
gen('goto'
gen('goto'B.false);
B.false);
B → true B.code
B.code==gen('goto'
gen('goto'B.true);
B.true);
B → false B.code
B.code==gen('goto'
gen('goto'B.false);
B.false);
32
SDD for while
S → while ( C ) S1 L1
L1 ==newLabel();
newLabel();
L2
L2 ==newLabel();
newLabel();
SS1.next
.next ==L1;
L1;
1
C.false
C.false ==S.next;
S.next;
C.true
C.true ==L2;
L2;
S.code
S.code ==“label”
“label”++L1
L1++
C.code
C.code++
”label”
”label”++L2
L2++
SS1.code
.code++
1
gen('goto'
gen('goto'L1);
L1);
33
3AC for if / if-else
B.true
B.true==newLabel();
newLabel();
S → if (B) S1
B.false
B.false==SS11.next
.next==S.next;
S.next;
S.code
S.code==B.code
B.code++
label(B.true)
label(B.true)++
SS1.code;
.code;
1
B.true
B.true==newLabel();
newLabel();
S → if (B) S1 else S2 B.false
B.false==newLabel();
newLabel();
SS1.next
.next==SS22.next
.next==S.next;
S.next;
1
S.code
S.code==B.code
B.code++
label(B.true)
label(B.true)++SS11.code
.code++
Gen('goto'
Gen('goto'S.next)
S.next)++
label(B.false)
label(B.false)++SS22.code;
.code;
34
Control-Flow – Boolean Expressions
● Source code: if (x < 100 || x > 200 && x != y) x = 0;
without optimization with short-circuit
b1
b1 == xx << 100
100
b2
b2 == xx >> 200
200
b3
b3 == xx !=
!= yy b1
b1 == xx << 100
100
iftrue
iftrue b1
b1 goto
goto L2
L2 iftrue
iftrue b1
b1 goto
goto L2
L2
goto
goto L0L0 b2
b2 == xx >> 200
200
L0:
L0: iffalse
iffalse b2
b2 goto
goto L3
L3
iftrue
iftrue b2
b2 goto
goto L1
L1 b3
goto b3 == xx !=
!= yy
goto L3L3 iffalse
L1: iffalse b3
b3 goto
goto L3
L3
L1: L2:
iftrue
iftrue b3
b3 goto
goto L2
L2 L2:
goto xx == 0;
0;
goto L3L3
L2: L3:
L3:
L2:
xx == 0;
0; ...
...
L3:
L3: Avoids redundant gotos.
35
...
...
Homework
● Write SDD to generate 3AC for for.
– for (S1; B; S2) S3
● Write SDD to generate 3AC for repeat-until.
– repeat S until B
36
Backpatching
● if (B) S required us to pass label while
evaluating B.
– This can be done by using inherited attributes.
● Alternatively, we could leave the label
unspecified now...
– … and fill it in later.
● Backpatching is a general concept for one-pass
code generation
– Recall stack offset computation in A3.
B → true B.code
B.code==gen('goto
gen('goto–');
–');
B → B1 || B2 backpatch(B1.false);
backpatch(B1.false);
...
... 37
break and continue
● break and continue are disciplined / special gotos.
● Their IR needs
– currently enclosing loop / switch.
– goto to a label just outside / before the enclosing block.
● How to write the SDD to generate their 3AC?
– either pass on the enclosing block and label as an
inherited attribute, or
– use backpatching to fill-in the label of goto.
– Need additional restriction for continue.
● Classwork: How to support break label? 38
IR for switch
● Using nested if-else
switch(E)
switch(E) {{
● Using a table of pairs case
case VV11:: SS11
case
case VV22:: SS22
– <Vi, Si>
……
● Using a hash-table case
case VVn-1 :: SSn-1
n-1 n-1
default:
default: SSnn
– when i is large (say, > 10) }}
● Special case when Vis are
consecutive integrals.
– Indexed array is sufficient.
39
switch(E)
switch(E) {{
case
case VV11:: SS11
case
case VV22:: SS22
……
case
case VVn-1 :: SSn-1
n-1 n-1
default:
default: SSnn
}}
40
Functions
● Function definitions
– Type checking / symbol table entry
– Return type, argument types, void
– Stack offset for variables
– Stack offset for arguments
● Function calls
– Push parameters
– Switch scope / push environment
– Jump to label for the function
– Switch scope / pop environment 41
– Pop parameters
Language Constructs
● Declarations
– Types (int, int [], struct, int *)
– Storage qualifiers (array expressions, const, static)
● Assignments
● Conditionals, switch
● Loops
● Function calls, definitions