
Phases of a Compiler
• A compiler operates in phases.
• Each phase transforms the source program from one representation to another.
The Structure of a Modern Compiler

Source Code
  ↓ Lexical Analysis
  ↓ Syntax Analysis
  ↓ Semantic Analysis
  ↓ IR Generation
  ↓ IR Optimization
  ↓ Code Generation
  ↓ Code Optimization
Machine Code
Lexical Analysis

Sample source program:

while (y < z) {
    int x = a + b;
    y += x;
}

• Reads the characters in the source program and groups them into tokens.
• A token represents an identifier, a keyword, a punctuation character, or an operator.
• The character sequence forming a token is called the lexeme for the token.
• The lexical analyzer not only generates tokens but also enters each lexeme into the symbol table.
Outcome of Lexical Analyzer
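As a rough illustration of that outcome, the sketch below tokenizes the sample loop and enters each identifier's lexeme into a symbol table. The token classes, the KEYWORDS set, and the tokenize helper are hypothetical names chosen for this sketch; a real lexer for a full language would need many more rules.

```python
import re

# Hypothetical token classes for a tiny C-like language (an assumption).
KEYWORDS = {"while", "int"}
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("OP", r"\+=|[+\-*/<>=]"),
    ("PUNCT", r"[(){};]"),
    ("SKIP", r"\s+"),
    ("ERROR", r"."),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return (tokens, symbol_table); each token is a (kind, lexeme) pair."""
    tokens, symbol_table = [], {}
    for match in MASTER.finditer(source):
        kind, lexeme = match.lastgroup, match.group()
        if kind == "SKIP":
            continue  # whitespace separates tokens but is not a token itself
        if kind == "ERROR":
            raise SyntaxError(f"unexpected character {lexeme!r}")
        if kind == "ID":
            if lexeme in KEYWORDS:
                kind = "KEYWORD"
            else:
                # Enter the identifier's lexeme into the symbol table once.
                symbol_table.setdefault(lexeme, {"attributes": {}})
        tokens.append((kind, lexeme))
    return tokens, symbol_table


tokens, table = tokenize("while (y < z) { int x = a + b; y += x; }")
print(tokens[:4])   # the first few (kind, lexeme) pairs
print(list(table))  # identifiers in order of first appearance
```

Keywords are recognized by first matching the general identifier pattern and then checking the lexeme against a keyword set, which keeps the regular expressions short.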
Syntax Analysis
• Groups the tokens into syntactic structures.

Semantic Analysis
• Ensures that the components of a program fit together meaningfully.
• Gathers type information and checks for type compatibility.
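The two steps above can be sketched for a single assignment statement. This is a minimal sketch under stated assumptions: the grammar covers only names, integers, + and *; the tree is a nested tuple; and parse_assignment, check, and the "int"/"real" type names are hypothetical helpers, not taken from any particular compiler.

```python
import re


def parse_assignment(text):
    """Syntax analysis: build a nested-tuple tree for `name := expr`."""
    # Characters outside this pattern are silently skipped (a simplification).
    tokens = re.findall(r":=|[A-Za-z_]\w*|\d+|[+*]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        pos += 1
        return tokens[pos - 1]

    def factor():                      # factor -> id | number
        tok = advance()
        if tok in (":=", "+", "*"):
            raise SyntaxError(f"unexpected token {tok!r}")
        return ("num", int(tok)) if tok.isdigit() else ("id", tok)

    def term():                        # term -> factor ('*' factor)*
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def expr():                        # expr -> term ('+' term)*
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    target = factor()
    if target[0] != "id":
        raise SyntaxError("assignment target must be an identifier")
    if advance() != ":=":
        raise SyntaxError("expected ':='")
    tree = (":=", target, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


def check(node, types):
    """Semantic analysis: return (tree, type), inserting inttoreal where needed."""
    kind = node[0]
    if kind == "num":
        return node, "int"
    if kind == "id":
        if node[1] not in types:
            raise NameError(f"undeclared identifier {node[1]!r}")
        return node, types[node[1]]
    left, ltype = check(node[1], types)
    right, rtype = check(node[2], types)
    if kind == ":=" and ltype == "int" and rtype == "real":
        raise TypeError("cannot assign a real value to an int variable")
    if ltype == "int" and rtype == "real":
        left, ltype = ("inttoreal", left), "real"
    elif ltype == "real" and rtype == "int":
        right = ("inttoreal", right)
    return (kind, left, right), ltype


# Assumed declarations: all three variables are reals.
types = {"position": "real", "initial": "real", "rate": "real"}
tree = parse_assignment("position := initial + rate * 60")
typed_tree, _ = check(tree, types)
print(typed_tree)
```

Because `*` is parsed inside `term` and `+` inside `expr`, multiplication binds tighter than addition without any explicit precedence table.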
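IR Generation
• The intermediate representation (IR) consists of simple instructions produced from the output of the syntax analyzer.
• IR has two properties: it is easy to use and easy to translate.
• E.g.: three-address code (TAC)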
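To make three-address code concrete, here is a minimal sketch that flattens an expression tree into TAC. The nested-tuple tree shape, the to_tac helper, and the temp1, temp2, ... naming scheme are assumptions made for this illustration.

```python
from itertools import count


def to_tac(tree):
    """Flatten a nested-tuple assignment tree into three-address instructions."""
    code, temps = [], count(1)

    def emit(node):
        kind = node[0]
        if kind in ("id", "num"):
            return str(node[1])          # leaves need no instruction
        if kind == "inttoreal":
            operand = emit(node[1])
            temp = f"temp{next(temps)}"
            code.append(f"{temp} := inttoreal({operand})")
            return temp
        # Binary operator: evaluate both operands, then combine them.
        left, right = emit(node[1]), emit(node[2])
        temp = f"temp{next(temps)}"
        code.append(f"{temp} := {left} {kind} {right}")
        return temp

    _, target, value = tree
    code.append(f"{target[1]} := {emit(value)}")
    return code


tree = (":=", ("id", "id1"),
        ("+", ("id", "id2"),
         ("*", ("id", "id3"), ("inttoreal", ("num", 60)))))
for line in to_tac(tree):
    print(line)
```

Each instruction has at most one operator on its right-hand side, which is what makes the form easy to translate into machine instructions later.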
IR Optimization
• Improves the intermediate code so that the final object program runs faster and/or takes less space.

It involves:
- Detection and removal of dead code.
- Calculation of constant expressions and terms.
- Moving code outside of loops.
- Removal of unnecessary temporary variables.
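Two of the items above, evaluating constant expressions and removing unnecessary temporaries, can be sketched on a list of three-address instructions. The (dest, op, arg1, arg2) tuple format and the optimize helper are assumptions for this sketch, and the second pass assumes a temporary is not used again after it has been copied.

```python
def optimize(code):
    """Fold inttoreal(constant) and drop temporaries that are only copied."""
    constants, folded = {}, []
    for dest, op, a, b in code:
        # Substitute any operand already known to be a compile-time constant.
        a, b = constants.get(a, a), constants.get(b, b)
        if op == "inttoreal" and isinstance(a, int):
            constants[dest] = float(a)   # evaluated at compile time
            continue
        folded.append((dest, op, a, b))

    result = []
    for dest, op, a, b in folded:
        # `x := temp` right after `temp := ...`: compute into x directly.
        # Assumes the temporary is not referenced later (an assumption).
        if (op == "copy" and result and result[-1][0] == a
                and str(a).startswith("temp")):
            previous = result.pop()
            result.append((dest,) + previous[1:])
        else:
            result.append((dest, op, a, b))
    return result


code = [
    ("temp1", "inttoreal", 60, None),
    ("temp2", "*", "id3", "temp1"),
    ("temp3", "+", "id2", "temp2"),
    ("id1", "copy", "temp3", None),
]
for instruction in optimize(code):
    print(instruction)
```

The sketch does not renumber the surviving temporaries, so the remaining one keeps the name temp2.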
Code Generation
Machine code is generated. This involves:
• Allocation of registers and memory.
• Generation of correct references.
• Generation of correct types.
• Generation of machine code.
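A very small code generator can illustrate register allocation and instruction selection. This is a sketch under stated assumptions: the target is a made-up two-register machine with MOVF, ADDF and MULF instructions, the input is a list of (dest, op, arg1, arg2) tuples with binary operators only, and there is no spilling when registers run out.

```python
OPCODES = {"+": "ADDF", "*": "MULF"}   # hypothetical floating-point opcodes


def operand(value, registers):
    """Render an operand as a register, an immediate constant, or a memory name."""
    if value in registers:
        return registers[value]
    if isinstance(value, (int, float)):
        return f"#{value}"
    return value


def generate(code):
    free = ["R1", "R2"]                # stack of free registers (no spilling)
    registers, asm = {}, []            # registers: temporary -> register
    for dest, op, a, b in code:
        reg = free.pop()
        asm.append(f"MOVF {operand(a, registers)}, {reg}")
        asm.append(f"{OPCODES[op]} {operand(b, registers)}, {reg}")
        # Temporaries consumed by this instruction release their registers.
        for used in (a, b):
            if used in registers:
                free.append(registers.pop(used))
        if dest.startswith("temp"):
            registers[dest] = reg      # keep the temporary in its register
        else:
            asm.append(f"MOVF {reg}, {dest}")   # store result to memory
            free.append(reg)
    return asm


code = [("temp1", "*", "id3", 60.0), ("id1", "+", "id2", "temp1")]
for line in generate(code):
    print(line)
```

Temporaries live only in registers here, while named variables are treated as memory locations and stored back with a final MOVF.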
Symbol Table
• It is a data structure containing a record for each identifier, with fields for the attributes of the identifier.
• The data structure allows us to find the record for each identifier quickly and to store or retrieve data from that record quickly.
• When the lexical analyzer detects an identifier in the source program, the identifier is entered into the symbol table.
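One common way to get the fast lookup described above is a hash table. The sketch below uses a Python dict for that purpose; the SymbolTable class, its method names, and the "type" attribute are hypothetical and chosen only for illustration.

```python
class SymbolTable:
    """One record per identifier, stored in a hash table for fast access."""

    def __init__(self):
        self._records = {}

    def enter(self, name):
        """Return the record for `name`, creating it the first time it is seen."""
        return self._records.setdefault(name, {"name": name, "type": None})

    def lookup(self, name):
        """Return the record for `name`, or None if it was never entered."""
        return self._records.get(name)

    def set_attribute(self, name, key, value):
        """Store an attribute (filled in by later phases) in the record."""
        self.enter(name)[key] = value


table = SymbolTable()
for lexeme in ["position", "initial", "rate", "rate"]:   # as found by the lexer
    table.enter(lexeme)
table.set_attribute("rate", "type", "real")              # e.g. from a declaration
print(table.lookup("rate"))
print(table.lookup("speed"))
```

Entering the same identifier twice returns the existing record, so attributes recorded earlier are not lost.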
Error Detection and Reporting
• Each phase can encounter errors.
• After detecting an error, a phase must deal with that error so that compilation can proceed, allowing further errors in the source program to be detected.
E.g.:
i) Lexical errors (the characters do not form any token):
"in", "floa", "switc", etc. (many lexers accept such misspelled keywords as identifiers, so the error surfaces in a later phase)
ii) Syntax errors (the token stream violates the structure rules of the language):
missing parentheses, braces, etc.
iii) Semantic errors (the operation involved has no meaning):
"a is not used"
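The idea of dealing with an error and then carrying on can be sketched for the lexical phase: the scanner below records each illegal character and keeps scanning, so that several errors are reported in one run. The scan helper, its token pattern, and the message format are assumptions made for this illustration.

```python
import re

# Either a valid token (`tok`) or a single illegal character (`bad`).
TOKEN = re.compile(
    r"\s*(?:(?P<tok>[A-Za-z_]\w*|\d+|[-+*/=<>(){};])|(?P<bad>\S))"
)


def scan(source):
    """Return (tokens, errors); illegal characters are reported, then skipped."""
    tokens, errors = [], []
    for match in TOKEN.finditer(source):
        if match.group("bad"):
            column = match.start("bad") + 1
            errors.append(
                f"col {column}: illegal character {match.group('bad')!r}"
            )
        else:
            tokens.append(match.group("tok"))
    return tokens, errors


tokens, errors = scan("x = 3 @ y $ 2;")
print(tokens)
for message in errors:
    print(message)
```

Skipping the offending character is only one possible recovery strategy; parsers typically use different ones, such as discarding tokens up to the next statement boundary.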
Phases of the Compiler

Source Program
1. Lexical Analyzer
2. Syntax Analyzer
3. Semantic Analyzer
4. Intermediate Code Generator
5. Code Optimizer
6. Code Generator
Target Program

The Symbol-table Manager and the Error Handler interact with all six phases.
Find the Answer
Consider the following statement:
position := initial + rate * 60
Show the output of each phase.
Example
position := initial + rate * 60

Symbol Table
id1  position ....
id2  initial ....
id3  rate ....

lexical analyzer
id1 := id2 + id3 * 60

syntax analyzer
    :=
   /  \
id1    +
      / \
   id2   *
        / \
     id3   60

semantic analyzer
    :=
   /  \
id1    +
      / \
   id2   *
        / \
     id3   inttoreal
               |
               60

intermediate code generator (three-address code)
temp1 := inttoreal(60)
temp2 := id3 * temp1
temp3 := id2 + temp2
id1 := temp3
code optimizer
temp1 := id3 * 60.0
id1 := id2 + temp1
final code generator
MOVF id3, R2
MULF #60.0, R2
MOVF id2, R1
ADDF R2, R1
MOVF R1, id1
