• The code generator takes the intermediate representation (IR) produced by the front end of the compiler as input, along with relevant symbol table information, and produces as output a semantically equivalent target program.
• The most important criterion for a code generator is that it produces correct code. Correctness takes on special significance because of the number of special cases that a code generator might face.
• The many choices for the input IR include
  - three-address representations such as quadruples, triples, and indirect triples
  - virtual machine representations such as bytecodes and stack-machine code
  - linear representations such as postfix notation
  - graphical representations such as syntax trees and DAGs

Target Program
• The architecture of the target machine can be
  - RISC (reduced instruction set computer): has many registers, three-address instructions, simple addressing modes, and a relatively simple instruction-set architecture.
  - CISC (complex instruction set computer): has few registers, two-address instructions, a variety of addressing modes, several register classes, variable-length instructions, and instructions with side effects.
  - Stack based: operations are done by pushing operands onto a stack and then performing the operations on the operands at the top of the stack. To achieve high performance, the top of the stack is typically kept in registers. Because the stack organization is limiting and requires many swap and copy operations, this architecture almost disappeared until the JVM was introduced.
• A code generator has three major tasks
  - instruction selection
  - register allocation and assignment
  - instruction ordering

Instruction selection
• The code generator maps the IR program into a code sequence that can be executed by the target machine. To do this it considers
  - the level of the IR (for example, whether the IR is high level)
  - the nature of the instruction-set architecture (the uniformity and completeness of the instruction set are important factors)
  - the desired quality of the generated code (speed and size)

Register Allocation
• Register allocation is selecting the set of variables that will reside in registers at each point in the program.
• Register assignment is choosing the specific register that a variable will reside in.
• Registers are the fastest computational unit on the target machine. Instructions involving register operands are invariably shorter and faster than those involving operands in memory.

Evaluation ordering
• The order in which computations are performed can affect the efficiency of the target code. Picking a best order in the general case is a difficult NP-complete problem.
• Some computation orders require fewer registers to hold intermediate results than others.
• Generating code for the three-address statements in the order in which they were produced by the intermediate code generator sidesteps the ordering problem rather than solving it.

A simple Target machine
• The target computer models a three-address machine with load and store operations, computation operations, jump operations, and conditional jumps, with n general-purpose registers R0, R1, ..., Rn-1.
• It can have a variety of addressing modes:
  - Indexed address ( LD R1, a(R2) )
    • R1 = contents(a + contents(R2))
  - An integer indexed by a register ( LD R1, 100(R2) )
    • R1 = contents(100 + contents(R2))
  - Indirect addressing ( LD R1, *100(R2) )
    • R1 = contents(contents(100 + contents(R2)))
  - Immediate constant addressing ( LD R1, #100 )
    • R1 = 100
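The addressing modes above can be checked against a small simulation. The following is only a sketch in Python: the memory contents, the value held in R2, and the address 492 chosen for the variable a are made-up values for illustration, and the helper names (contents, ld_indexed, ld_indirect, ld_immediate) are hypothetical, not part of any real assembler.

```python
# Toy model of the addressing modes listed above.
# The memory contents, register values, and the address chosen for `a`
# are made-up illustrative values, not taken from the notes.

memory = {500: 11, 108: 300, 300: 42}  # address -> stored word
regs = {"R1": 0, "R2": 8}              # register -> current value
A = 492                                # hypothetical address of variable a


def contents(address):
    """Return the word stored at a memory address."""
    return memory[address]


def ld_indexed(dst, base, index_reg):
    """LD dst, base(index_reg): dst = contents(base + contents(index_reg))."""
    regs[dst] = contents(base + regs[index_reg])
    return regs[dst]


def ld_indirect(dst, base, index_reg):
    """LD dst, *base(index_reg): one extra lookup through the loaded address."""
    regs[dst] = contents(contents(base + regs[index_reg]))
    return regs[dst]


def ld_immediate(dst, constant):
    """LD dst, #constant: the constant itself is loaded, memory is not read."""
    regs[dst] = constant
    return regs[dst]


if __name__ == "__main__":
    print(ld_indexed("R1", A, "R2"))     # LD R1, a(R2)
    print(ld_indexed("R1", 100, "R2"))   # LD R1, 100(R2)
    print(ld_indirect("R1", 100, "R2"))  # LD R1, *100(R2)
    print(ld_immediate("R1", 100))       # LD R1, #100
```

With these sample values the four loads leave 11, 300, 42, and 100 in R1, in that order. The only difference between the indexed and indirect forms is the second call to contents.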
Program and Instruction Costs
• A cost is associated with compiling and running a program: the length of compilation time, and the size, running time, and power consumption of the target program.
• The cost of a target-language program on a given input is the sum of the costs of the individual instructions executed when the program is run on that input.
• When calculating cost, addressing modes involving registers have zero additional cost, while those involving a memory location or constant have an additional cost of one, because such operands have to be stored in the words following the instruction.
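The cost rule can be illustrated with a short Python sketch. It assumes a simple model that the notes do not spell out in full: every instruction costs one for the instruction word itself, plus one for each operand that contains a memory address or a constant. The function names, the ADD and ST mnemonics, and the sample instruction sequence are hypothetical and chosen only for illustration.

```python
import re

# A pure register operand looks like R0, R1, ... (assumed naming scheme).
REGISTER = re.compile(r"R\d+")


def operand_cost(operand):
    """Return 0 for a register-only operand, 1 if an address or constant appears."""
    # Assumption: a leading * (indirection) does not add a word by itself.
    bare = operand.lstrip("*")
    return 0 if REGISTER.fullmatch(bare) else 1


def instruction_cost(instruction):
    """Assumed model: 1 for the instruction word plus the operand costs."""
    _opcode, _, operand_text = instruction.partition(" ")
    operands = [op.strip() for op in operand_text.split(",") if op.strip()]
    return 1 + sum(operand_cost(op) for op in operands)


def program_cost(executed_instructions):
    """Sum the costs of the instructions actually executed on one input."""
    return sum(instruction_cost(ins) for ins in executed_instructions)


if __name__ == "__main__":
    trace = ["LD R0, a", "LD R1, b", "ADD R0, R0, R1", "ST x, R0"]
    for ins in trace:
        print(ins, instruction_cost(ins))
    print("total", program_cost(trace))
```

Under this assumed model, LD R0, R1 costs 1, while LD R1, #100 and LD R1, *100(R2) each cost 2, since the constant 100 occupies the word after the instruction. The sample trace costs 2 + 2 + 1 + 2 = 7.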