
Compiler Construction

1
Cousins of Compiler
Preprocessors
Assemblers
Loader and Link-Editor

2
Preprocessors
Preprocessors produce input to the compiler. They may perform
the following functions:
1. Macro processing
2. File inclusion
3. Rational preprocessors (they augment older languages with
   more modern flow-of-control and data-structuring facilities)
4. Language extensions (for example, Equel is a database query
   language embedded in C)
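The first two functions, macro processing and file inclusion, can be illustrated with a minimal sketch. It assumes C-style `#define` and `#include` directives and a hypothetical in-memory table `FILES` standing in for the file system; none of these names come from the slides, and a real preprocessor handles far more cases.

```python
import re

# Hypothetical in-memory "file system" so the sketch is self-contained.
FILES = {
    "defs.h": "#define SIZE 10\n",
    "main.c": '#include "defs.h"\nint a[SIZE];\n',
}

def preprocess(name, macros=None):
    """Return the lines of FILES[name] with includes and macros expanded."""
    macros = {} if macros is None else macros
    out = []
    for line in FILES[name].splitlines():
        inc = re.match(r'\s*#include\s+"([^"]+)"', line)
        if inc:
            # File inclusion: splice in the processed text of the named file.
            out.extend(preprocess(inc.group(1), macros))
            continue
        dfn = re.match(r'\s*#define\s+(\w+)\s+(.*)', line)
        if dfn:
            # Macro definition: remember the replacement text.
            macros[dfn.group(1)] = dfn.group(2).strip()
            continue
        # Macro processing: replace whole-word uses of each macro name.
        for macro, body in macros.items():
            line = re.sub(r"\b%s\b" % re.escape(macro), lambda m: body, line)
        out.append(line)
    return out

print("\n".join(preprocess("main.c")))  # prints: int a[10];
```

The output is ordinary source text, which is what the compiler proper then reads.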

3
Assembler
Some compilers produce assembly code that is passed to an
assembler for processing.

Other compilers produce relocatable machine code that can be
passed directly to the loader/link-editor.

Assembly code is a mnemonic version of machine code, in
which names are used instead of binary codes for operations,
e.g. for the assignment b := a + 2:
MOV a, R1
ADD #2, R1
MOV R1, b

4
Two-Pass Assembly
The assembler makes two passes over the input.

A pass consists of reading the input file once.

In the first pass, all the identifiers that denote storage
locations are found and stored in a symbol table
(separate from that of the compiler).

Identifiers are assigned storage locations as they are
encountered for the first time.

5
For Example
The symbol table might contain the entries shown
below (assuming that a word consists of 4 bytes and that addresses
are assigned starting from byte 0):

IDENTIFIER   ADDRESS
a            0
b            4
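The first pass can be sketched in a few lines. This is only an illustration under stated assumptions: operands are separated by commas, registers are written `R1`, `R2`, ..., constants start with `#`, and the names `first_pass` and `WORD_SIZE` are made up for the sketch.

```python
import re

WORD_SIZE = 4  # bytes per word, as assumed in the example

def first_pass(lines):
    """Return {identifier: address}, assigning addresses on first encounter."""
    symbols = {}
    next_address = 0
    for line in lines:
        _mnemonic, operands = line.split(None, 1)
        for operand in operands.split(","):
            operand = operand.strip()
            # Registers and immediate constants do not denote storage locations.
            if operand.startswith("#") or re.fullmatch(r"R\d+", operand):
                continue
            if operand not in symbols:
                symbols[operand] = next_address
                next_address += WORD_SIZE
    return symbols

program = ["MOV a, R1", "ADD #2, R1", "MOV R1, b"]
print(first_pass(program))  # prints: {'a': 0, 'b': 4}
```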

6
Two-Pass Assembly
In the second pass, the assembler scans the input
again.

This time it translates each operation code into the
sequence of bits representing that operation in
machine language. It also translates each identifier
representing a location into the address given for that
identifier in the symbol table.

7
Two-Pass Assembly

The output of the second pass is usually relocatable
machine code, meaning that it can be loaded starting
at any location L in memory. If L is added to every
address in the code, all references remain correct.

8
Example
The following is hypothetical machine code for the
previous example:

Instruction code   Register   Address mode   Memory address
0001               01         00             00000000 *
0011               01         10             00000010
0010               01         00             00000100 *

The * marks an operand whose address is relocatable.
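The second pass can be sketched as follows. The bit patterns are assumptions chosen to match the hypothetical machine of the example: instruction codes 0001 (load), 0010 (store) and 0011 (add), address modes 00 (direct) and 10 (immediate), and a symbol table with a at address 0 and b at address 4. The function name `encode` is made up for the sketch.

```python
import re

LOAD, STORE, ADD = "0001", "0010", "0011"  # assumed instruction codes
DIRECT, IMMEDIATE = "00", "10"             # assumed address-mode tags

def encode(line, symbols):
    """Translate one assembly line into a relocatable bit string."""
    mnemonic, rest = line.split(None, 1)
    src, dst = [part.strip() for part in rest.split(",")]
    if mnemonic == "MOV" and re.fullmatch(r"R\d+", dst):
        op, reg, operand = LOAD, dst, src      # memory -> register
    elif mnemonic == "MOV":
        op, reg, operand = STORE, src, dst     # register -> memory
    elif mnemonic == "ADD":
        op, reg, operand = ADD, dst, src
    else:
        raise ValueError("unknown mnemonic: " + mnemonic)
    reg_bits = format(int(reg[1:]), "02b")
    if operand.startswith("#"):
        # Immediate constant: stored as-is, not relocatable.
        value_bits = format(int(operand[1:]), "08b")
        return f"{op} {reg_bits} {IMMEDIATE} {value_bits}"
    # Identifier: look up its address and mark it relocatable with '*'.
    address_bits = format(symbols[operand], "08b")
    return f"{op} {reg_bits} {DIRECT} {address_bits} *"

symbols = {"a": 0, "b": 4}
for line in ["MOV a, R1", "ADD #2, R1", "MOV R1, b"]:
    print(encode(line, symbols))
```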

9
Continued...
If L = 00001111, i.e. 15, then a and b would be at
locations 15 and 19, respectively, and the instructions
would now appear as absolute code:

0001 01 00 00001111
0011 01 10 00000010
0010 01 00 00010011
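The relocation step amounts to adding L to every address marked as relocatable. A minimal sketch, assuming each instruction is a space-separated bit string with a trailing `*` on relocatable operands (the helper name `relocate` is hypothetical):

```python
def relocate(instruction, load_address):
    """Turn one relocatable instruction string into absolute code."""
    fields = instruction.split()
    if fields[-1] != "*":
        # Immediate operands carry no relocation mark and are left unchanged.
        return " ".join(fields)
    address = int(fields[3], 2) + load_address
    if address > 0xFF:
        raise OverflowError("address does not fit in 8 bits")
    return " ".join(fields[:3] + [format(address, "08b")])

relocatable = [
    "0001 01 00 00000000 *",
    "0011 01 10 00000010",
    "0010 01 00 00000100 *",
]
L = 0b00001111  # 15
absolute = [relocate(instr, L) for instr in relocatable]
print("\n".join(absolute))
```

With L = 15 the two marked addresses become 15 and 19, while the constant 2 in the ADD instruction is untouched.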

10
Loader and Link-Editor
The loader/link-editor performs two functions: loading and link-editing.
Loading consists of taking relocatable machine
code, altering the relocatable addresses, and placing
the altered instructions and data in memory at the proper
locations.
The link-editor makes a single program from several files
of relocatable machine code.
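One way to picture link-editing is as laying the files out one after another and merging their symbols into a single table. The sketch below assumes a made-up module format (a dict with `size`, `defines` and `uses`) and hypothetical file names; real object-file formats are considerably richer.

```python
def link(modules):
    """Lay modules out consecutively and build one global symbol table."""
    base = 0
    global_symbols = {}
    for module in modules:
        for name, offset in module["defines"].items():
            if name in global_symbols:
                raise ValueError("symbol defined twice: " + name)
            # A symbol's final address is its module's base plus its offset.
            global_symbols[name] = base + offset
        base += module["size"]
    used = {name for module in modules for name in module["uses"]}
    unresolved = sorted(used - set(global_symbols))
    if unresolved:
        raise NameError("unresolved symbols: " + ", ".join(unresolved))
    return base, global_symbols

modules = [
    {"name": "main.o", "size": 12, "defines": {"main": 0}, "uses": ["sqrt"]},
    {"name": "math.o", "size": 8, "defines": {"sqrt": 0}, "uses": []},
]
total_size, table = link(modules)
print(total_size, table)  # prints: 20 {'main': 0, 'sqrt': 12}
```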

11
The Grouping of Phases
Activities from more than one phase can be grouped
together. Two ways of grouping are discussed:

Front and Back Ends
Passes

12
Front and Back Ends
The Front End consists of those phases, or parts of phases, that
depend primarily on the source language and are largely
independent of the target machine. It normally includes:

1. Lexical analysis
2. Syntactic analysis
3. The creation of the symbol table
4. Semantic analysis
5. The generation of intermediate code
6. A certain amount of code optimization
7. Error handling

13
Front and Back Ends
The Back End includes those portions of the
compiler that depend on the target machine and
are generally independent of the source language. It includes:

1. Code optimization
2. Code generation
3. The necessary error handling and symbol-table
   operations

14
Advantages of Front End and Back End
If we want to compile a new language for the same
machine, only the Front End of the compiler
changes while the Back End remains the same.

Similarly, if we want to retarget the compiler to a new
machine, the Front End of the compiler remains
the same and only the Back End changes.

It is also a good idea to compile several different
languages into the same intermediate language and use a
common Back End for the different Front Ends.
15
Example:
A C compiler may have more than one back
end, each for a different machine, e.g.

        C-Language Compiler
             Front End
            /         \
     Back End         Back End
      (IBM)           (Apple)

This means that one back end targets the IBM
machine and a second targets the Apple machine, both sharing a single front end.
16
Example:
Suppose we have a C compiler with a Front End and a
Back End. To build a FORTRAN compiler, we
only have to write a new Front End for FORTRAN; the
Back End remains the same.

C Compiler       FORTRAN Compiler
Front End        Front End
       \           /
         Back End
            |
      Target Language
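The sharing of one back end between two front ends can be sketched with toy functions. Everything here is an assumption made for illustration: the tuple-shaped intermediate form, the function names, and the restriction to statements of the form `x = y + constant`.

```python
def _split_assignment(text):
    """Split 'x = y + c' into (target, left, right) with whitespace removed."""
    target, expr = text.split("=")
    left, right = expr.split("+")
    return target.strip(), left.strip(), right.strip()

def c_front_end(source):
    """Toy C front end: accepts 'b = a + 2;' (note the trailing semicolon)."""
    target, left, right = _split_assignment(source.rstrip(";"))
    return ("add", target, left, right)

def fortran_front_end(source):
    """Toy FORTRAN front end: accepts 'B = A + 2' and folds names to lower case."""
    target, left, right = _split_assignment(source.lower())
    return ("add", target, left, right)

def back_end(ir):
    """Shared back end: turns the intermediate form into assembly code."""
    op, target, left, right = ir
    if op != "add":
        raise ValueError("unsupported operation: " + op)
    return [f"MOV {left}, R1", f"ADD #{right}, R1", f"MOV R1, {target}"]

print(back_end(c_front_end("b = a + 2;")))
print(back_end(fortran_front_end("B = A + 2")))
```

Both calls print the same three instructions, because each front end reduces its source language to the same intermediate form before the back end runs.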

17
Passes
Several phases of compilation are usually
implemented in a single pass consisting of reading an
input file and writing an output file.

The activities of the grouped phases are interleaved during the
pass. For example, lexical analysis, syntax analysis, semantic analysis, and
intermediate code generation might be grouped
together into one pass.
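Interleaving can be sketched with Python generators: each phase pulls items from the previous one on demand, so no intermediate file is written between phases. The tiny grammar (statements of the form `x = y + z ;`) and the three-address output format are assumptions made for the sketch.

```python
import re

def lex(text):
    """Lexical analysis: yield one token at a time."""
    for token in re.findall(r"[A-Za-z_]\w*|\d+|[=+;]", text):
        yield token

def parse(tokens):
    """Syntax analysis: yield (target, left, right) per 'x = y + z ;'."""
    statement = []
    for token in tokens:
        if token != ";":
            statement.append(token)
            continue
        if len(statement) != 5 or (statement[1], statement[3]) != ("=", "+"):
            raise SyntaxError("expected 'x = y + z ;'")
        yield statement[0], statement[2], statement[4]
        statement = []

def gen_intermediate(statements):
    """Intermediate code generation: yield three-address instructions."""
    for n, (target, left, right) in enumerate(statements, 1):
        yield f"t{n} := {left} + {right}"
        yield f"{target} := t{n}"

# The phases are chained; each token flows through all three in turn.
code = list(gen_intermediate(parse(lex("b = a + 2; c = b + 1;"))))
print("\n".join(code))
```

Because the parser requests tokens only as it needs them, the lexer never has to finish the whole input before parsing begins.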
18
Passes
It is desirable to have few passes, since it takes time
to read and write intermediate files.

On the other hand, if we group several phases into
one pass, we may be forced to keep the entire
program in memory, because one phase may need
information in a different order than a previous phase
produces it.
