Attribution Non-Commercial (BY-NC)

14 views

Attribution Non-Commercial (BY-NC)

- a3 - interdisciplinary lesson plan
- Compiler Over 2
- IJMC-4-2009-book
- Scheduling in Sport & Entertainments
- Design and Implementation of Path Planning Algorithm for Wheeled Mobile Robot in a Known Dynamic Environment
- Astart Hart
- BPI FIELD REQUIREMENTS
- 1-s2.0-0166218X81900135-main
- Open Research Problems
- PPPJ
- Design &Development of an Interpreted Programming Language
- 41 Undirected Graphs
- A* Algorithm
- Study on Flexibility of Sensor Network for Linear Processes 2002 Computers Chemical Engineering
- HS BasicGraphTerminology
- sol-exc1-17
- nx_tutorial.pdf
- On Terminal Hosoya Polynomial of Some Thorn Graphs
- 06 Chapter 1
- Revista Conection vol 32 issue 2.pdf

You are on page 1of 93

Mooly Sagiv

html://www.cs.tau.ac.il/~msagiv/courses/wcc11-12.html

Chapter 4

Code Generation

Transform the AST into machine code

Several phases Many IRs exist

Machine instructions can be described by tree patterns Replace tree-nodes by machine instruction

Tree rewriting Replace subtrees

3

a := (b[4*c+d]*2)+9

leal

movsbl

Ra + * mem + @b * + Rd 9 2

Rc

Ra + * 9 2

Rt

Load_Byte (b+Rd)[Rc], 4, Rt

Ra

Load_address 9[Rt], 2, Ra

Load_Byte (b+Rd)[Rc], 4, Rt

Overall Structure

Code selection Register allocation Instruction ordering

10

Simplifications

Consider small parts of AST at time Simplify target machine Use simplifying conventions

11

Outline

Simple code generation for expressions (4.2.4, 4.3)

Pure stack machine Pure register machine

Code generation of basic blocks (4.2.5) [Automatic generation of code generators (4.2.6)] Later

Handling control statements Program Analysis Register Allocation Activation frames

12

Fixed translation for each node type Translates one expression at the time Local decisions only Works well for simple machine model

Stack machines (PDP 11, VAX) Register machines (IBM 360/370)

13

SP

Stack

BP

14

15

Example

Push_Local #p

p := p + 5

16

Push_Local #p SP Push_Const 5 Add_Top2 7 BP+5 BP Store_Local #p

17

Push_Local #p Push_Const 5 Add_Top2 7 BP+5 BP Store_Local #p

SP

18

5 7 SP Push_Local #p Push_Const 5 Add_Top2 7 BP+5 BP Store_Local #p

19

Push_Local #p Push_Const 5 Add_Top2 7 BP+5 BP Store_Local #p

12

SP

20

Push_Local #p SP Push_Const 5 Add_Top2 12 BP+5 BP Store_Local #p

21

Register Machine

Fixed set of registers Load and store from/to memory Arithmetic operations on register only

22

23

Example

Load_Mem p, R1

p := p + 5

24

R1 R2 Load_Mem p, R1 Load_Const 5, R2 Add_Reg R2, R1 7

x770

Store_Reg R1, P

memory

25

7 R1 R2 Load_Mem p, R1 Load_Const 5, R2 Add_Reg R2, R1 7

x770

Store_Reg R1, P

memory

26

7 R1 5 R2 Load_Mem p, R1 Load_Const 5, R2

Add_Reg R2, R1

7

x770

Store_Reg R1, P

memory

27

12 R1 5 R2 Load_Mem p, R1 Load_Const 5, R2 Add_Reg R2, R1 7

x770

Store_Reg R1, P

memory

28

12 R1 5 R2 Load_Mem p, R1 Load_Const 5, R2 Add_Reg R2, R1 12

x770

Store_Reg R1, P

memory

29

Tree rewritings Bottom up AST traversal

30

31

Example

Subt_Top2

Mult_Top2

Mult_Top2

b

Push_Local #b

Push_Local #b Push_Constant 4

*

Mult_Top2

a

Push_Local #a

c

Push_Local #c

32

33

Need to allocate register for temporary values

AST nodes

Bottom up code generation Allocate registers for subtrees

34

35

36

Assume enough registers Use DFS to:

Generate code Assign Registers

Target register Auxiliary registers

37

38

39

Example

T=R1 Subt_Reg R1, R2 T=R1 Mult_Reg R2, R1 T=R1

T=R3

*

T=R2 T=R2

*

T=R4

c

40

Load_Mem a, R3 Load_Mem c, R4

Example

41

Runtime Evaluation

42

Optimality

The generated code is suboptimal May consume more registers than necessary

May require storing temporary results

43

Example

44

Observation (Aho&Sethi)

The compiler can reorder the computations of sub-expressions The code of the right-subtree can appear before the code of the left-subtree May lead to faster code

45

Example

T=R1 Subt_Reg R3, R1 T=R1 Mult_Reg R2, R1 T=R1

T=R2

*

T=R2 T=R3

*

T=R3

c

46

Load_Mem a, R2 Load_Mem c, R3

Example

Load_Mem b, R1 Load_Mem b, R2 Mult_Reg R2, R1 Load_Mem a, R2

Load_Mem c, R3

Mult_Reg R3, R2 Load_Constant 4, R3 Mult_Reg R2, R3 Subt_Reg R3, R1

47

Bottom-up (labeling)

Compute for every subtree

The minimal number of registers needed Weight

Top-Down

Generate the code using labeling by preferring heavier subtrees (larger labeling)

48

m>n m registers +

m registers

n registers

49

m<n n registers +

m registers

n registers

50

m=n m+1 registers +

m registers

n registers

51

52

3 *

2 1 b * 1 b

1 4

* 2

1

53

Top-Down

T=R1 Subt_Reg R2, R1 T=R1 Mult_Reg R2, R1

-3

*2

*2

T=R2 T=R2

T=R1

T=R3

b1

b1

41

*2

T=R2

a1

c1

54

Load_Mem a, R3 Load_Mem c, R2

Generalizations

More than two arguments for operators

Function calls

Need more registers than available

55

Add_Mem X, R1 Mult_Mem X, R1 No need for registers to store right operands

56

2 *

1 1 b * 0 b

1 4

* 1

0

57

Top-Down

T=R1 Subt_Reg R2, R1 T=R1 Mult_Mem b, R1

-2

*1

*2

T=R2

T=R1

b1

Load_Mem b, R1

b0

Load_Constant 4, R2

T=R1

41

*1

a1

Load_Mem a, R1

c0

58

Empirical Results

Experience shows that for handwritten programs 5 registers suffice (Yuval 1977) But program generators may produce arbitrary complex expressions

59

Spilling

Even an optimal register allocator can require more registers than available Need to generate code for every correct program The compiler can save temporary results

Spill registers into temporaries Load when needed

60

Heavy tree Needs more registers than available A `heavy tree contains a `heavy subtree whose dependents are light Generate code for the light tree Spill the content into memory and replace subtree by temporary Generate code for the resultant tree

61

62

Top-Down (2 registers)

Load_Mem T1, R2 Subt_Reg R2, R1 T=R1 T=R1 -3 T=R1 Store_Reg R1, T1

Mult_Reg R2, R1

*2

*2

b1

b1

41

*2

T=R1

a1

c1

63

Load_Mem a, R2 Load_Mem c, R1

Top-Down (2 registers)

Load_Mem a, R2 Load_Mem c, R1 Mult_Reg R1, R2 Load_Constant 4, R2 Mult_Reg R2, R1 Store_Reg R1, T1 Load_Mem b, R1 Load_Mem b, R2 Mult_Reg R2, R1 Load_Mem T1, R2 Subtr_Reg R2, R1

64

Summary

Register allocation of expressions is simple Good in practice Optimal under certain conditions

Uniform instruction cost `Symbolic trees

Code-Generator Generators exist (BURS)

Even simpler for 3-address machines Simple ways to determine best orders But misses opportunities to share registers between different expressions

Can employ certain conventions

Graph coloring

65

Chapter 4.2.5

66

Given

AST Machine description

Number of registers Instructions + cost

Generate code for AST with minimum cost NPC [Aho 77]

67

68

Simplifications

Consider small parts of AST at time

One expression at the time

Ignore certain instructions

69

Basic Block

Parts of control graph without split A sequence of assignments and expressions which are always executed together Maximal Basic Block Cannot be extended

Start at label or at routine entry Ends just before jump like node, label, procedure call, routine exit

70

Example

void foo()

{

if (x > 8) { z = 9; t = z + 1; } z = z * z; t=tz; bar(); t = t + 1;

x>8

z=9;

t = z + 1; z=z*z; t = t - z; bar() t=t+1;

71

Running Example

72

73

Optimized code(gcc)

74

Outline

Dependency graphs for basic blocks Transformations on dependency graphs From dependency graphs into code

Instruction selection (linearizations of dependency graphs) Register allocation (the general idea)

75

Dependency graphs

Threaded AST imposes an order of execution The compiler can reorder assignments as long as the program results are not changed Define a partial order on assignments

a < b a must be executed before b

Nodes are assignments Edges represent dependency

76

Running Example

77

Sources of dependency

Data flow inside expressions

Operator depends on operands Assignment depends on assigned expressions

From assignments to their use

78

Sources of dependency

Order of subexpresion evaluation is immaterial

As long as inside dependencies are respected

Come between

Depending assignment Next assignment

79

1. Nodes AST becomes nodes of the graph 2. Replaces arcs of AST by dependency arrows

Operator Operand

3. Create arcs from assignments to uses 4. Create arcs between assignments of the same variable 5. Select output variables (roots) 6. Remove ; nodes and their arrows

80

Running Example

81

Short-circuit assignments

Connect variables to assigned expressions Connect expression to uses

82

Running Example

83

84

Common Subexpressions

Repeated subexpressions Examples x = a * a + 2* a*b + b * b; y=a*a2*a*b+b*b; a[i] + b [i] Can be eliminated by the compiler In the case of basic blocks rewrite the DAG

85

Linearize the dependency graph

Instructions must follow dependency

Many solutions exist Select the one with small runtime cost Assume infinite number of registers

Symbolic registers Assign registers later

May need additional spill

Possible Heuristics

Late evaluation Ladders

86

87

Register Allocation

Maps symbolic registers into physical registers Reuse registers as much as possible Graph coloring

Undirected graph Nodes = Registers (Symbolic and real) Edges = Interference

88

R3 R1 R2

X1

X1 R2

89

Running Example

90

Optimized code(gcc)

91

Summary Heuristics for code generation of basic blocks Works well in practice Fits modern machine architecture Can be extended to perform other tasks

Common subexpression elimination

92

93

- a3 - interdisciplinary lesson planUploaded byapi-302688219
- Compiler Over 2Uploaded bySaikrishna Gv
- IJMC-4-2009-bookUploaded byAnonymous 0U9j6BLllB
- Scheduling in Sport & EntertainmentsUploaded byforjoe45
- Design and Implementation of Path Planning Algorithm for Wheeled Mobile Robot in a Known Dynamic EnvironmentUploaded byInternational Journal of Research in Engineering and Technology
- Astart HartUploaded byUnifiesta Pereira
- BPI FIELD REQUIREMENTSUploaded bytoxicwind
- 1-s2.0-0166218X81900135-mainUploaded byzabroma1717
- Open Research ProblemsUploaded bySrikar Varadaraj
- PPPJUploaded byMakhloufi Hocine
- Design &Development of an Interpreted Programming LanguageUploaded byAJER JOURNAL
- 41 Undirected GraphsUploaded byMarcus Braga
- A* AlgorithmUploaded bySachin Sharma
- Study on Flexibility of Sensor Network for Linear Processes 2002 Computers Chemical EngineeringUploaded byCarlos Alberto Augusto Sánchez Quiñones
- HS BasicGraphTerminologyUploaded byAnuragGupta
- sol-exc1-17Uploaded byttnleeobh7382
- nx_tutorial.pdfUploaded byeanet2013
- On Terminal Hosoya Polynomial of Some Thorn GraphsUploaded byIoan Degau
- 06 Chapter 1Uploaded byatta
- Revista Conection vol 32 issue 2.pdfUploaded byAlejandro Perez
- Randomized AlgorithmsUploaded byAsafAhmad
- Network Analysis of World Trade using the BACI-CEPII dataset.pdfUploaded byKen
- vicon_ct-1Uploaded byhambali ali
- entropy-21-00277-v2Uploaded byPudretexD
- Ariestha W Bustan_Salman AnUploaded byariestha
- Localizing and Excluding Quantum Information; Or, How to Share a Quantum Secret in SpacetimeUploaded bynope

- Python LearnUploaded bySujeet Omar
- Pcs7 Compendium Part b en-US en-USUploaded byHugo Diaz
- EST-EST3-v3.0-Programming-Manual.pdfUploaded byHernan Rea
- Crafting an Interpreter Part 3 - Parse Trees and Syntax Trees - VROSNET - CodePrUploaded byAntony Ingram
- Information Technology CourseUploaded byemission2009
- Compilation vs InterpretationUploaded byRamnish Kumar
- PBP Reference ManualUploaded byJose De Jesus Morales
- Restaurant ReservationUploaded byAdv Sunil Joshi
- C#Uploaded bysangamalla
- BusRulesAdmUploaded byumksbl
- Gabriel Hjort Blindell (Auth.)-Instruction Selection_ Principles, Methods, And Applications-Springer International Publishing (2016)Uploaded byChristian Kjær Larsen
- Correct Higher Order Program Transform At 1409992Uploaded byCholid Fauzi
- DB2 UNIT 2Uploaded byAnonymous 7r2OlOFV
- BCSL -044 Statistics LabUploaded bydfgfdg
- AN4727.pdfUploaded byBOLFRA
- fundamentals of c++Uploaded byRahayu Wilujeng
- Compiler DesignUploaded byEr Chintan Patel
- X11 Basic ManualUploaded bydoctorwhoisthis
- Salida Configure SheldonUploaded byVale Martin
- Python for InformaticsUploaded byPepe
- Sinumerik WS800 - cycle language CL800.pdfUploaded byFlávio Ramos
- Python for InformaticsUploaded byIsra Safril Lamno
- ITM Theory NotesUploaded bypearku
- Concept of Programming LanguageUploaded bynifrasismail
- c-lang-refUploaded byDouglas W. Goodall
- Introduction to Parallel ComputingUploaded byMainak Hore
- Week 1 LabUploaded byWaqas Mehmood
- cmosis50Uploaded byKesani Venkat Narsimha Reddy
- [William a. Wulf] Compilers and Computer ArchitectureUploaded byFausto N. C. Vizoni
- Lecture 1Uploaded bySahir Majeed

## Much more than documents.

Discover everything Scribd has to offer, including books and audiobooks from major publishers.

Cancel anytime.