You are on page 1of 28

Compiler Design

Instructor: Mohammed O.
Email: momoumer90@gmail.com
Samara University
Preliminaries Required
oBasic knowledge of programming languages.
oBasic knowledge of FSA and CFG.
Chapter One: Introduction
This Chapter Covers:
Phases of a Compiler
Compiler Construction Tool
Introduction
Programming languages are notations for describing
computations (study) to people and to machines.
Before a program can be run, it first must be translated
into a form that can be executed by a computer.

The software systems that do this translation are called


compilers.
Simply stated, a compiler is a program that can read a
program in one language (source language) and translate it
into an equivalent program in another language (target
language).
Compiler Design

An important role of the compiler is to report any errors in


the source program that it detects during the translation
process.
Cont…
A source program may be divided into modules stored in
separate files.
The task of collecting the source program is sometimes
entrusted to a separate program, called a pre-processor.
Pre-processing is the first pass of any compilation. It
processes such as include-files.
The compiler may produce an assembly-language program
as its output, because assembly language is easier to
produce as output and is easier to debug (the process of
identifying and removing errors).
The assembly language is then processed by a program
called an assembler that produces relocatable machine code
as its output.
Cont…
The linker (which helps to link object modules of program
into a single object file) and assign final addresses and
revises code and data to reflect new addresses (a process
called relocation ).

The loader then puts together all of the executable object


files into memory for execution.
Cont…
Basic for Linker Loader
Comparison
Basic It generates the It loads the
executable module of executable module to
a source program. the main memory.
Input It takes as input, the It takes executable
object code generated module generated by
by an assembler. a linker.
Function It combines all the It allocates the
object modules of a addresses to an
source code to executable module in
generate an main memory for
executable module. execution.
A Language-Processing System
Major Parts of Compilers
There are two major parts of a compiler: Analysis
(Front-End) and Synthesis (Back-End)
In analysis phase, an intermediate representation is created
from the given source program.
 Lexical Analyzer, Syntax Analyzer and Semantic
Analyzer are the parts of this phase.

In synthesis phase, the equivalent target program is


created from this intermediate representation.
 Intermediate Code Generator, Code Generator, and Code
Optimizer are the parts of this phase.
Phases of A Compiler
A compiler takes as input a source programme and
produces as output an equivalent sequence of machine
instructions.
This process is so complex, to consider the compilation
process as occurring in one single step.

It is customary to partition the compilation process into a


series of sub-processes called phases.
A phase is a logically cohesive (working together
effectively) operation that takes as input one representation
of the source programme and produces as output another
representation.
Phases of A Compiler
Phases of A Compiler
Each phase transforms the source program from one
representation into another representation.
They communicate with error handlers.
They communicate with the symbol table.
Lexical Analyzer
Lexical Analyzer reads the source program character by
character and returns the tokens of the source program.
A token describes a pattern of characters having same
meaning in the source program. (such as identifiers,
operators, keywords, numbers, delimeters and so on)
Ex: newval := oldval + 12 => tokens : newval identifier
:= assign op
oldval identifier
+ add operator
12 a number
Lexical Analyzer
Puts information about identifiers into the symbol table.
Regular expressions are used to describe tokens (lexical
constructs).
A (Deterministic) Finite State Automata can be used in the
implementation of a lexical analyzer
Syntax Analyzer
A Syntax Analyzer creates the syntactic structure
(generally a parse tree) of the given program.
A syntax analyzer is also called as a parser.
A parse tree describes a syntactic structure.

 • In a parse tree, all terminals are at


leaves.

• All inner nodes are non-terminals in


a context free grammar.
Syntax Analyzer (Cont.)
The syntax of a language is specified by a context free
grammar (CFG).
A syntax analyzer checks whether a given program satisfies
the rules implied (suggested) by a CFG or not. It takes all
the tokens one by one and uses Context Free Grammar to
construct the parse tree.
Parsing Techniques
Depending on how the parse tree is created, there are
different parsing techniques.
These parsing techniques are categorized into two groups:
 Top-Down Parsing
 Bottom-Up Parsing

Top-Down Parsing:
Construction of the parse tree starts at the root, and
proceeds towards the leaves.
Efficient top-down parsers can be easily constructed by
hand.
Recursive Predictive Parsing, Non-Recursive Predictive
Parsing (LL Parsing).
Parsing Techniques (Cont.)
Bottom-Up Parsing:
Construction of the parse tree starts at the leaves, and
proceeds towards the root.
Normally efficient bottom-up parsers are created with the
help of some software tools.
Bottom-up parsing is also known as shift-reduce parsing.
LR Parsing – much general form of shift-reduce parsing,
LR, SLR, LALR
Semantic Analyzer
A semantic analyzer checks the source program for
semantic errors and collects the type information for the
code generation.
Type-checking is an important part of semantic analyzer.

It verifies the parse tree, whether it is meaningful or not. It


furthermore produces a verified parse tree.
Context-free grammars used in the syntax analysis are
integrated with attributes (semantic rules)
Intermediate Code Generation
A compiler may produce an explicit (clear) intermediate
codes representing the source program.
These intermediate codes are generally machine
(architecture independent). But the level of intermediate
codes is close to the level of machine codes.
Ex: newval := oldval * fact + 1

id1 := id2 * id3 + 1

MULT id2,id3,temp1 Intermediates Codes


ADD temp1,#1,temp2
MOVE temp2,id1
Code Optimizer
The code optimizer optimizes (make the best) the code
produced by the intermediate code generator in the terms
of time and space.

Ex:
MULT id2,id3,temp1
ADD temp1,#1,id1
Code Generator
The main purpose of code generator is to write a code that
the machine can understand (Produces the target
language).
The target program is normally is a relocatable object file
containing the machine codes.

Ex:

MOVE id2,R1
MULTid3,R1
ADD #1,R1
MOVE R1,id1
Symbol Table
This portion of the compiler keeps track of the names used
by the programme and records essential information about
each, such as its type (integer, real, etc.).

The data structure used to record this information is called


a symbol table.

It is a data structure being used and maintained by the


compiler, consists all the identifiers name along with their
types. It helps the compiler to function smoothly by finding
the identifiers quickly.
Error Handler
This is invoked when a flaw (error) in the source
programme is detected.
The tasks of the error handling process are to detect each
error, report it to the user, and then make some recover
strategy and implement them to handle error.
There are two types of errors:
Run-Time Error : is an error which takes place during the
execution of a program, and usually happen because of
invalid input data.
Compile-Time Error : rises at compile time, before
execution of the program.
Both the table management and error handling routines
interact with all phases of the compiler.
Compiler Construction Tool
Some commonly used compiler-construction tools include

1. Scanner generators that produce lexical analyzers from a


regular-expression description of the tokens of a language.
2. Parser generators that automatically produce syntax
analyzers from a grammatical description of a
programming language or Context Free Grammar .
3. Syntax-directed translation engines that produce
collections of routines for walking a parse tree and
generating intermediate code.
Cont…
4. Code-generator generators that produce a code generator
from a collection of rules for translating each operation of
the intermediate language into the machine language for a
target machine.
5. Data-flow analysis engines that facilitate (make easy) the
gathering of information about how values are transmitted
from one part of a program to each other part. Data-flow
analysis is a key part of code optimization.
6. Compiler-construction toolkits that provide an integrated
set of routines for constructing various phases of a compiler.
Quiz one - 5 %
1. The input to code generator is syntax analyzer? (T/F).
2. The registers are selected for each computation (operation)
to be done, in which phase?
3. Which one is optional phase from compiler phases? And
why it is optional?
4. List compiler construction tools?
5. What is the function of error handler?
6. Write two major part of the compiler?
7. List phases of a compiler?
8. Write two types of parsing techniques?
9. What is the main function of semantic analyzer?
10. What is the use of symbol table?

You might also like