You are on page 1of 28

Compiler Design

Instructor: Mohammed O.
Email: momoumer90@gmail.com
Samara University
Preliminaries Required
oBasic knowledge of programming languages.
oBasic knowledge of FSA and CFG.
Chapter One: Introduction
This Chapter Covers:
Phases of a Compiler
Compiler Construction Tool
Introduction
Programming languages are notations for describing
computations (study) to people and to machines.

Before a program can be run, it first must be translated


into a form that can be executed by a computer.

The software systems that do this translation are called


compilers.

Simply stated, a compiler is a program that can read a


program in one language (source language) and translate it
into an equivalent program in another language (target
language).
Compiler Design

Fig. Overall role of compiler

An important role of the compiler is to report any errors in


the source program that it detects during the translation
process.
Cont…
The compiler may produce an assembly-language program
as its output, because assembly language is easier to
produce as output and is easier to debug (the process of
identifying and removing errors).

The assembly language is then processed by a program


called an assembler that produces relocatable machine code
as its output.
Cont…
Basic for Linker Loader
Comparison
Basic It generates the It loads the
executable module of executable module to
a source program. the main memory.
Input It takes as input, the It takes executable
object code generated module generated by
by an assembler. a linker.
Function It combines all the It allocates the
object modules of a addresses to an
source code to executable module in
generate an main memory for
executable module. execution (RAM).
A Language-Processing System

Pre-processing is the first pass


of any compilation. It processes
such as include-files.

Absolute machine code that


holds at a k own, fixed memory
address.
Major Parts of Compilers
There are two major parts of a compiler: Analysis
(Front-End) and Synthesis (Back-End)

In analysis phase, an intermediate representation is created


from the given source program.
 Lexical Analyzer, Syntax Analyzer and Semantic
Analyzer are the parts of this phase.

In synthesis phase, the equivalent target program is


created from this intermediate representation.
 Intermediate Code Generator, Code Generator, and Code
Optimizer are the parts of this phase.
Phases of A Compiler
A compiler takes as input a source programme and produces
as output an equivalent sequence of machine instructions.

This process is so complex, to consider the compilation


process as occurring in one single step.

It is customary to partition the compilation process into a


series of sub-processes called phases.

A phase is a logically cohesive (working together effectively)


operation that takes as input one representation of the source
programme and produces as output another representation.
Phases of A Compiler

Fig. Phases of a compiler


Phases of A Compiler
Each phase transforms the source program from one
representation into another representation.

They communicate with error handlers.

They communicate with the symbol table.


Lexical Analyzer
Lexical Analyzer reads the source program character by
character and returns the tokens of the source program.

A token describes a pattern (regular form) of characters


having same meaning in the source program. (such as
identifiers, operators, keywords, numbers, delimeters and
so on)

Ex: newval = oldval + 12 => tokens : newval identifier


= assign op
oldval identifier
+ add operator
12 a number
Lexical Analyzer
Puts information about identifiers into the symbol table.

Regular expressions are used to describe tokens (lexical


constructs).

A (Deterministic) Finite State Automata can be used in the


implementation of a lexical analyzer
Syntax Analyzer
A Syntax Analyzer creates the syntactic structure
(generally a parse tree) of the given program.

A syntax analyzer is also called as a parser.

A parse tree describes a syntactic structure.


• In a parse tree, all terminals are at leaves.

• All inner nodes are non-terminals in


a context free grammar.
Syntax Analyzer (Cont.)
The syntax of a language is specified by a context free
grammar (CFG).

A syntax analyzer checks whether a given program satisfies


the rules implied (suggested) by a CFG or not.

It takes all the tokens one by one and uses Context Free
Grammar to construct the parse tree.
Parsing Techniques
Depending on how the parse tree is created, there are
different parsing techniques.

These parsing techniques are categorized into two groups:


 Top-Down Parsing
 Bottom-Up Parsing

Top-Down Parsing:
 Construction of the parse tree starts at the root, and
proceeds towards the leaves.
 Efficient top-down parsers can be easily constructed by
hand.
 Recursive Predictive Parsing, Non-Recursive Predictive
Parsing (LL Parsing).
Parsing Techniques (Cont.)
Bottom-Up Parsing:
 Construction of the parse tree starts at the leaves, and
proceeds towards the root.
 Normally efficient bottom-up parsers are created with
the help of some software tools.
 Bottom-up parsing is also known as shift-reduce parsing.
 LR Parsing – much general form of shift-reduce parsing,
LR, SLR, LALR.
Semantic Analyzer
A semantic analyzer checks the source program for
semantic errors and collects the type information for the
code generation.

Type-checking is an important part of semantic analyzer.

It verifies the parse tree, whether it is meaningful or not. It


furthermore produces a verified parse tree.

Context-free grammars used in the syntax analysis are


integrated with attributes (semantic rules)
Intermediate Code Generation
A compiler may produce an explicit (clear) intermediate
codes representing the source program.

These intermediate codes are generally machine


(architecture independent) – those programs run at any
architecture of computer (HW). But the level of
intermediate codes is close to the level of machine codes.
Ex: newval = oldval * fact + 1

id1 = id2 * id3 + 1


MULT id2,id3,temp1 Intermediates Codes
ADD temp1,#1,temp2
MOVE temp2,id1
Code Optimizer
The code optimizer optimizes (make the best) the code
produced by the intermediate code generator in the terms
of time and space.

This is an optional phase designed to improve the


intermediate code quality.

Ex:
MULT id2,id3,temp1
ADD temp1,#1,id1
Code Generator
The main purpose of code generator is to write a code that the
machine can understand (Produces the target language).

The target program is normally a relocatable object file


containing the machine codes.

Ex:
MOVE id2,R1 - Register is a temporary storage area built into
a CPU
MULTid3,R1
ADD #1,R1
MOVE R1,id1
Symbol Table
This portion of the compiler keeps track of the names used
by the programme and records essential information about
each, such as its type (integer, real, etc.).

The data structure used to record this information is called


a symbol table.

It is a data structure being used and maintained by the


compiler, consists all the identifiers name along with their
types. It helps the compiler to function smoothly by finding
the identifiers quickly.
Error Handler
This is invoked when a flaw (error) in the source programme
is detected.

The tasks of the error handling process are to detect each


error, report it to the user, and then make some recover
strategy and implement them to handle error.

There are two types of errors:


Run-Time Error : is an error which takes place during the
execution of a program (the executable code is started
running.), and usually happen because of invalid input data.
Compile-Time Error : rises at compile time (the source code is
converted into an executable code), before execution of the
program.
Error Handler
Both the table management and error handling routines
interact with all phases of the compiler.
Compiler Construction Tool
Some commonly used compiler-construction tools include:-

1. Scanner generators that produce lexical analyzers from a


regular-expression description of the tokens of a language.

2. Parser generators that automatically produce syntax


analyzers from a grammatical description of a
programming language or Context Free Grammar.

3. Syntax-directed translation engines that produce


collections of routines for a parse tree and generating
intermediate code.
Cont…
4. Code-generator generators that produce a code generator
from a collection of rules for translating each operation of
the intermediate language into the machine language for a
target machine.

5. Data-flow analysis engines that facilitate (make easy) the


gathering of information about how values are transmitted
from one part of a program to each other part. Data-flow
analysis is a key part of code optimization.

6. Compiler-construction toolkits that provide an integrated


set of routines for constructing various phases of a compiler.
Quiz one - 10 %
1. The input to code generator is syntax analyzer? (T/F).
2. Registers are selected for each computation (operation) to
be done, in which phase?
3. Which one is optional phase from compiler phases? And
why it is optional?
4. List compiler construction tools?
5. What is the role of error handler?
6. Write two major parts of the compiler?
7. List all phases of a compiler?
8. Write two types of parsing techniques?
9. What is the main function of syntax analyzer?
10. What is symbol table?

You might also like