
01CE0601 – Compiler Design

Unit - 1
Introduction to
Compiler

Prof. Rituraj Jain


Department of Information Technology
Outline

Translator

Language Translator

Cousins of Compiler

Phases of Compiler

Grouping of Phases

Pass and Phase

Types of Compiler

Compiler construction tools


Translator

What is it?

Why is it required?

What do you think about this?

Some initial words…


Translator

The semantic gap is the difference between the ideas a software designer describes for doing something and the manner in which those ideas are implemented in a computer system.

Semantic Gap

Application domain <--- semantic gap ---> Execution domain

Semantic rules are required to fill this gap.
Translator

Semantic rules are required to fill this gap. This issue is tackled by programming languages.

Application domain <--- specification gap ---> Programming language domain <--- execution gap ---> Execution domain
Translator

A translator takes a program written in one language as input and converts it into another language.

The input language is the source language and the output language is the target language.

It also detects and reports errors during translation.

Three types of translators:
1) Compiler 2) Interpreter 3) Assembler

Language Translator

Source: https://www.nesoacademy.org/
Compiler

A compiler is a program that translates a higher-level language into a functionally equivalent lower-level language (assembly code or object code).

It also detects and reports errors.


Interpreter

An interpreter is a translator that processes a program written in a high-level language and carries it out directly, instead of first producing a separate low-level program.

An interpreter works line by line and reports an error as soon as it is encountered.
Assembler

An assembler is a translator used to translate assembly language into machine language.

It translates a low-level language (assembly language) into an even lower-level language (machine code).

The machine code can be directly understood by the CPU.


Compiler v/s Interpreter

Parameter: Program scanning
Compiler: Scans the entire program in one go.
Interpreter: Interprets/translates the program one line at a time.

Parameter: Error detection
Compiler: Errors are collected while the whole program is scanned and are reported together at the end, not line by line.
Interpreter: Each line is scanned in turn, and an error is reported as soon as it is encountered.

Parameter: Object code
Compiler: Converts the source code to object code.
Interpreter: Does not convert the source code into object code.

Parameter: Execution time
Compiler: Compiled code usually executes faster, so a compiler is preferred when execution speed matters.
Interpreter: Interpretation is usually slower, so the code takes more time to execute.

Parameter: Need of source code
Compiler: The source code is not required for later execution.
Interpreter: The source code is required for later execution.

Parameter: Programming languages
Compiler: Languages that typically use compilers include C, C++, C#, etc.
Interpreter: Languages that typically use interpreters include Python, Ruby, Perl, MATLAB, etc.

Parameter: Types of errors detected
Compiler: Can check syntactic and semantic errors in the program simultaneously.
Interpreter: Detects errors only in the line currently being processed.

Parameter: Size
Compiler: Compilers are generally larger in size.
Interpreter: Interpreters are generally smaller in size.

Parameter: Flexibility
Compiler: Compilers are not flexible.
Interpreter: Interpreters are relatively flexible.

Parameter: Efficiency
Compiler: Compilers are more efficient.
Interpreter: Interpreters are less efficient.


Language Processor System... Internal Architecture

Also known as: Translation Processor / Context of Compiler / Cousins of Compiler
Language Processor System... Internal Architecture

Preprocessor

The preprocessor produces the input to the compiler.

Its task is to collect the source program.

Its functions are macro processing, file inclusion, rational preprocessing, and language extension.

Source: https://www.nesoacademy.org/
Language Processor System... Internal Architecture

Preprocessor

Macro Processing: A preprocessor may allow a user to define macros that are shorthands for longer constructs.

File Inclusion: A preprocessor may include header files in the program text.

“Rational” Preprocessor: These preprocessors augment older languages with more modern flow-of-control and data-structuring facilities.

Language Extension: These preprocessors attempt to add capabilities to the language by what amounts to built-in macros.
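
Macro processing can be illustrated with a minimal sketch. The function name expand_macros and the dictionary format for macro definitions are assumptions made for this example; real preprocessors also handle macro arguments and nested expansion, which this sketch does not.

```python
import re

def expand_macros(text, macros):
    """Replace every whole-word macro name in text with its definition (single pass)."""
    if not macros:
        return text
    # Build one pattern that matches any macro name as a whole word.
    names = "|".join(re.escape(name) for name in macros)
    pattern = re.compile(r"\b(" + names + r")\b")
    return pattern.sub(lambda match: macros[match.group(1)], text)

if __name__ == "__main__":
    print(expand_macros("area = PI * r * r", {"PI": "3.14159"}))
```

The word-boundary check keeps a macro named PI from being expanded inside a longer identifier such as PIXEL.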
Language Processor System... Internal Architecture

Compiler

Source: https://www.nesoacademy.org/
Language Processor System... Internal Architecture
Assembler

The input is the assembly code generated by the compiler.

The assembler converts assembly code to machine code.

Processing is done in two passes.

In the first pass, the identifiers are found and stored in the symbol table.

In the second pass, each operation code is translated into a sequence of bits and each identifier into its specific storage location.
Language Processor System... Internal Architecture
Loader and Link Editor
A program called the loader performs the two functions of loading and link editing.

Loading consists of taking relocatable machine code, altering the relocatable addresses, and placing the altered instructions and data in memory at the proper locations.

The linker allows us to make a single program from several files of relocatable machine code.

Types of linking: static and dynamic.

Types of loader: compile-and-go loader, absolute loader, relocating loader.
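
The "altering the relocatable addresses" step can be sketched as follows. The object-code format here is an assumption made for illustration: the code is a list of integer words assembled as if loaded at address 0, and relocation_entries (a hypothetical name) lists the indices of the words that hold addresses.

```python
def load(relocatable_code, relocation_entries, load_address):
    """Return the memory image for code placed at load_address.

    Only the words listed in relocation_entries hold addresses, so only
    those words are adjusted; all other words are copied unchanged.
    """
    image = list(relocatable_code)
    for index in relocation_entries:
        image[index] += load_address
    return image

if __name__ == "__main__":
    # Word 1 holds an address (3) relative to the start of the program.
    print(load([10, 3, 0, 7], [1], 1000))
```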




The Phases of Compilers

ANALYSIS – SYNTHESIS MODEL OF COMPILATION

ANALYSIS PHASE

SYNTHESIS PHASE
The Phases of Compilers… LEXICAL ANALYSIS

It is also called linear analysis or scanning.

In this phase, the characters of the source code are read from left to right and broken into a stream of units.

These units are called tokens.

A token is a sequence of characters having a collective meaning.

Tokens can be categorized into identifiers, constants, literals, keywords, operators, delimiters, etc.

So tokens, the smallest meaningful entities of the program, are produced as the output of this phase.
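
A scanner of this kind can be sketched with regular expressions. This is a minimal illustration, not a full lexical analyzer: the token category names and the tokenize function are assumptions made for the example, and keywords are not distinguished from identifiers.

```python
import re

# Hypothetical token categories; order matters because the first match wins.
TOKEN_SPEC = [
    ("NUMBER",     r"\d+(?:\.\d+)?"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("ASSIGN",     r":="),
    ("OPERATOR",   r"[+\-*/]"),
    ("DELIMITER",  r"[();]"),
    ("SKIP",       r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Scan source from left to right and return (category, lexeme) pairs."""
    tokens = []
    position = 0
    while position < len(source):
        match = MASTER.match(source, position)
        if match is None:
            # The remaining characters do not form any token: a lexical error.
            raise SyntaxError(f"unexpected character {source[position]!r} at offset {position}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        position = match.end()
    return tokens

if __name__ == "__main__":
    for category, lexeme in tokenize("pos := initial + rate * 60"):
        print(category, lexeme)
```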
The Phases of Compilers… SYNTAX ANALYSIS

It is also called hierarchical analysis or parsing.

It determines whether the sentences formed from the words (tokens) are syntactically (grammatically) correct.

It creates a syntax tree from the generated tokens if the code is error free.

A syntax tree has operators as internal nodes and operands as leaf nodes.

This phase checks every line and reports an error wherever the code is not grammatically (syntax-wise) correct.
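
One way to build such a syntax tree is a small recursive-descent parser, sketched below under stated assumptions. It accepts only statements of the form identifier := expression with + and *, gives * higher precedence than +, and represents the tree as nested tuples (operator, left, right); the function name parse_assignment and this tree format are choices made for the example. The inline tokenizer is simplified and silently skips characters it does not recognize.

```python
import re

def parse_assignment(source):
    """Parse 'identifier := expression' into a nested-tuple syntax tree."""
    tokens = re.findall(r":=|[A-Za-z_]\w*|\d+|[+*()]", source)
    position = 0

    def peek():
        return tokens[position] if position < len(tokens) else None

    def advance():
        nonlocal position
        token = peek()
        position += 1
        return token

    def is_identifier(token):
        return token is not None and (token[0].isalpha() or token[0] == "_")

    def factor():
        token = advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token == "(":
            node = expression()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if is_identifier(token) or token.isdigit():
            return token  # operands become leaf nodes
        raise SyntaxError(f"unexpected token {token!r}")

    def term():
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())  # operators become internal nodes
        return node

    def expression():
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    target = advance()
    if not is_identifier(target):
        raise SyntaxError("expected an identifier on the left of ':='")
    if advance() != ":=":
        raise SyntaxError("expected ':='")
    tree = (":=", target, expression())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

if __name__ == "__main__":
    print(parse_assignment("pos := initial + rate * 60"))
```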
The Phases of Compilers… SEMANTIC ANALYSIS

It determines the meaning of the source string.

Semantic analysis performs operations such as:

Checking whether operators have compatible arguments (type checking), matching of parentheses, scope of operations, etc.

Ensuring that the components of a program fit together meaningfully.
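
Type checking can be sketched as a walk over the syntax tree. The tree format (nested tuples of operator, left, right), the two type names, and the rule that a real value may not be assigned to an integer variable are all assumptions made for this example. Where an integer operand meets a real one, the sketch wraps it in an inttoreal node.

```python
def check_types(node, symbol_types):
    """Return (typed_tree, type), inserting inttoreal where an integer meets a real."""
    if isinstance(node, str):
        if node.isdigit():
            return node, "integer"  # integer literal
        if node not in symbol_types:
            raise NameError(f"undeclared identifier {node!r}")
        return node, symbol_types[node]

    operator, left, right = node
    left, left_type = check_types(left, symbol_types)
    right, right_type = check_types(right, symbol_types)

    if operator == ":=":
        # Assumed rule: widening integer -> real is allowed, narrowing is not.
        if left_type == "integer" and right_type == "real":
            raise TypeError("cannot assign a real value to an integer variable")
        if left_type == "real" and right_type == "integer":
            right = ("inttoreal", right)
        return (operator, left, right), left_type

    if left_type != right_type:
        # Mixed arithmetic: convert the integer operand, result is real.
        if left_type == "integer":
            left = ("inttoreal", left)
        else:
            right = ("inttoreal", right)
        return (operator, left, right), "real"
    return (operator, left, right), left_type

if __name__ == "__main__":
    tree = (":=", "pos", ("+", "initial", ("*", "rate", "60")))
    types = {"pos": "real", "initial": "real", "rate": "real"}
    print(check_types(tree, types))
```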


The Phases of Compiler…INTERMEDIATE CODE GENERATOR

Intermediate code should have two properties: it should be easy to produce and easy to translate into the target program.

A common form of intermediate code is “three-address code”.

It is called three-address code because each instruction has at most three operands.
The Phases of Compiler…CODE OPTIMIZATION

This phase improves the intermediate code so that the resulting machine code occupies less memory and takes less execution time, without changing the functionality or correctness of the program.
The Phases of Compiler…CODE GENERATOR

In the code generation phase, the target code is generated.

In this phase, the (optimized) intermediate code is translated into a sequence of machine instructions that perform the same operations.
The Phases of Compiler…

Lexical Analysis

pos := initial + rate * 60

• pos (identifier)
• := (assignment symbol)
• initial (identifier)
• + (plus sign)
• rate (identifier)
• * (multiplication sign)
• 60 (number)
The Phases of Compiler…

Syntax Analysis

Tokens:
• pos (identifier)
• := (assignment symbol)
• initial (identifier)
• + (plus sign)
• rate (identifier)
• * (multiplication sign)
• 60 (number)

Grammar rules:
E → E := E
E → E + E
E → E * E
E → NUM

Syntax tree for pos := initial + rate * 60

          :=
         /  \
      pos    +
            / \
     initial   *
              / \
          rate   60
The Phases of Compiler…
Semantic Analysis

Grammar rules:
E → E := E
E → E + E
E → E * E
E → NUM

Annotated syntax tree for pos := initial + rate * 60 (the integer 60 is converted with inttoreal)

          :=
         /  \
      pos    + (real)
            / \
     initial   * (real)
              / \
          rate   inttoreal (real)
                     |
                 60 (integer)
The Phases of Compiler…

Intermediate Code Generation

Three-Address Code: pos := initial + rate * 60

Each instruction has:
• One assignment
• One other operator (at most)

/* original code, where id1, id2, and id3 are reals */
id1 := id2 + id3 * 60

/* three address code */
temp1 := inttoreal(60)
temp2 := id3 * temp1
temp3 := id2 + temp2
id1 := temp3
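
The translation into three-address code can be sketched as a post-order walk of the syntax tree. The tree format (nested tuples, with inttoreal as a one-operand node) and the function name generate_three_address are assumptions made for this example.

```python
from itertools import count

def generate_three_address(tree):
    """Flatten a nested-tuple syntax tree into three-address instructions."""
    code = []
    temp_numbers = count(1)

    def new_temp():
        return f"temp{next(temp_numbers)}"

    def emit(node):
        """Emit code for node and return the name that holds its value."""
        if isinstance(node, str):
            return node  # identifiers and constants need no code
        if node[0] == "inttoreal":
            operand = emit(node[1])
            result = new_temp()
            code.append(f"{result} := inttoreal({operand})")
            return result
        operator, left, right = node
        left_place = emit(left)
        right_place = emit(right)
        result = new_temp()
        code.append(f"{result} := {left_place} {operator} {right_place}")
        return result

    _, target, expression = tree  # the root is the assignment node
    code.append(f"{target} := {emit(expression)}")
    return code

if __name__ == "__main__":
    tree = (":=", "id1", ("+", "id2", ("*", "id3", ("inttoreal", "60"))))
    for line in generate_three_address(tree):
        print(line)
```

For the tree of id1 := id2 + id3 * 60 this produces the same four instructions as listed above.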
The Phases of Compiler…

Code Optimization

/* original code, where id1, id2, and id3 are reals */
id1 := id2 + id3 * 60

/* natural code */
temp1 := inttoreal(60)
temp2 := id3 * temp1
temp3 := id2 + temp2
id1 := temp3

/* optimized code */
temp1 := id3 * 60.0
id1 := id2 + temp1
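
The two improvements in this example (converting the constant at compile time and removing the final copy) can be sketched as one pass over the instructions. The tuple format (destination, operator, argument 1, argument 2) and the "copy" operator name are assumptions made for this example, and the copy removal assumes each temporary is used exactly once.

```python
def optimize(instructions):
    """Fold inttoreal of a constant and remove a copy of a just-computed temporary."""
    constants = {}  # temporary name -> constant that replaces it
    result = []
    for dest, op, arg1, arg2 in instructions:
        # Substitute constants that were folded earlier.
        arg1 = constants.get(arg1, arg1)
        arg2 = constants.get(arg2, arg2)
        if op == "inttoreal" and arg1.isdigit():
            # Do the conversion at compile time: inttoreal(60) becomes 60.0.
            constants[dest] = str(float(arg1))
            continue
        if op == "copy" and result and result[-1][0] == arg1 and arg1.startswith("temp"):
            # Write the previous result straight into dest instead of copying it.
            _, prev_op, prev_arg1, prev_arg2 = result.pop()
            result.append((dest, prev_op, prev_arg1, prev_arg2))
            continue
        result.append((dest, op, arg1, arg2))
    return result

if __name__ == "__main__":
    natural = [
        ("temp1", "inttoreal", "60", None),
        ("temp2", "*", "id3", "temp1"),
        ("temp3", "+", "id2", "temp2"),
        ("id1", "copy", "temp3", None),
    ]
    for instruction in optimize(natural):
        print(instruction)
```

The sketch does not renumber temporaries, so the surviving temporary keeps the name temp2 rather than temp1.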
The Phases of Compiler…

Code Generation

/* optimized intermediate code */
temp1 := id3 * 60.0
id1 := id2 + temp1

/* assembly code */
MOVF id3, R2
MULF #60.0, R2
MOVF id2, R1
ADDF R2, R1
MOVF R1, id1
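
A very simple code generator for this example can be sketched as follows. The instruction tuple format, the two-register pool, and the "OP source, destination" reading of the assembly are assumptions; registers are handed out from the end of the pool only so that the output lines up with the assembly code above. Real code generators use far more careful register allocation.

```python
def generate_assembly(instructions, registers=("R1", "R2")):
    """Translate (dest, op, arg1, arg2) tuples into two-operand assembly."""
    free = list(registers)
    location = {}  # temporary name -> register currently holding it
    opcodes = {"+": "ADDF", "*": "MULF"}
    assembly = []

    def operand(name):
        if name in location:
            return location[name]  # temporary kept in a register
        if name[0].isdigit():
            return f"#{name}"      # immediate constant
        return name                # variable in memory

    for dest, op, arg1, arg2 in instructions:
        if not free:
            raise RuntimeError("out of registers")
        register = free.pop()
        assembly.append(f"MOVF {operand(arg1)}, {register}")
        assembly.append(f"{opcodes[op]} {operand(arg2)}, {register}")
        # Registers of temporaries consumed by this instruction become free again.
        for name in (arg1, arg2):
            if name in location:
                free.append(location.pop(name))
        if dest.startswith("temp"):
            location[dest] = register  # keep the temporary in its register
        else:
            assembly.append(f"MOVF {register}, {dest}")
            free.append(register)
    return assembly

if __name__ == "__main__":
    optimized = [("temp1", "*", "id3", "60.0"), ("id1", "+", "id2", "temp1")]
    for line in generate_assembly(optimized):
        print(line)
```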


Symbol Table Management

A symbol table is a data structure that contains a record for each identifier, with its list of attributes.

As soon as an identifier is recognized by the scanner (lexical analyzer), it is entered into the symbol table.

The attributes of identifiers are entered by the other phases of the compiler.

An essential function of a compiler is to record the identifiers along with their attributes (type, scope, storage location, etc.).

For a function, the attributes are the return type, the number and types of parameters, and the parameter-passing scheme.
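
A symbol table of this kind can be sketched with a dictionary of records. The class and method names below are assumptions made for this example; scoping (nested tables) is left out.

```python
class SymbolTable:
    """One record per identifier; attributes are filled in by later phases."""

    def __init__(self):
        self._records = {}

    def insert(self, name):
        """Used by the scanner: create a record the first time a name is seen."""
        return self._records.setdefault(name, {"name": name})

    def set_attributes(self, name, **attributes):
        """Used by later phases to record type, scope, storage location, etc."""
        if name not in self._records:
            raise KeyError(f"identifier {name!r} is not in the symbol table")
        self._records[name].update(attributes)

    def lookup(self, name):
        """Return the record for name, or None if it was never entered."""
        return self._records.get(name)

if __name__ == "__main__":
    table = SymbolTable()
    for identifier in ("pos", "initial", "rate"):
        table.insert(identifier)            # entered during lexical analysis
    table.set_attributes("rate", type="real", scope="global")
    print(table.lookup("rate"))
```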
Error Detection and Recovery

Each phase can encounter errors, and it has to deal with them so that the next phase of the compiler can proceed and further errors can be detected.

The lexical analysis phase detects an error when the characters remaining in the input do not form any token.

The syntax analysis phase detects an error when the token stream violates the structure rules of the language.

The semantic analysis phase tries to detect constructs that have the right syntactic structure but no meaning.


The Grouping of Phases

ANALYSIS PHASE (front end)

SYNTHESIS PHASE (back end)
The Grouping of Phases…ANALYSIS PHASE

The front end consists of those phases that depend primarily on the source language and are largely independent of the target machine.

It includes lexical analysis, syntax analysis, semantic analysis, intermediate code generation, and creation of the symbol table.

A certain amount of code optimization can be done by the front end as well.

The front end also includes the error handling that goes along with each of these phases.
The Grouping of Phases…SYNTHESIS PHASE

The back end includes those portions of the compiler that depend on the target machine and do not depend on the source language.

The back end includes code optimization and code generation, with the necessary error handling and symbol table operations.
Advantage of Analysis – Synthesis concept

One can take the front end of a compiler and redo its associated back end to produce a compiler for the same source language on a different machine.

If the back end is designed carefully, it may not even be necessary to redesign too much of the back end.


Pass of Compiler

One complete scan of the source program is called a pass.

A collection of phases may be executed only once (single pass) or multiple times (multi pass).

Single pass: usually requires everything to be defined before it is used in the source program.

Multi pass: the compiler may have to keep the entire program representation in memory.
Pass of Compiler

PASS v/s PHASE

Pass: Various phases are logically grouped together to form a pass.
Phase: Each step in which the process of compilation is carried out is called a phase.

Pass: Requires less memory.
Phase: Requires more memory.

Pass: Execution is faster.
Phase: Execution is slower.

Pass: Fewer memory operations (read/write) to be performed.
Phase: More memory operations to be performed.

Pass: For example, a compiler may have 2 passes.
Phase: For example, a compiler has 6 phases.
Pass v/s Phases

Pass 1: Perform analysis of the source program and note the relevant information.

Pass 2: Generate target code using the information noted in pass 1.

Difficulty in single pass: forward references.

A forward reference means that a variable or label is referenced before it is declared.
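
The way two passes handle a forward reference can be sketched on a toy instruction list. The syntax is an assumption made for this example: a line ending in ":" declares a label, and JMP is the only instruction that refers to one.

```python
def resolve_labels(lines):
    """Replace label operands of JMP with instruction addresses using two passes."""
    # Pass 1: note the address of every label.
    addresses = {}
    instructions = []
    for line in lines:
        line = line.strip()
        if line.endswith(":"):
            addresses[line[:-1]] = len(instructions)
        elif line:
            instructions.append(line)

    # Pass 2: generate output using the information noted in pass 1.
    resolved = []
    for instruction in instructions:
        parts = instruction.split()
        if parts[0] == "JMP":
            label = parts[1]
            if label not in addresses:
                raise NameError(f"undefined label {label!r}")
            resolved.append(f"JMP {addresses[label]}")
        else:
            resolved.append(instruction)
    return resolved

if __name__ == "__main__":
    # "end" is referenced on the first line but declared only later.
    print(resolve_labels(["JMP end", "ADD", "end:", "HALT"]))
```

A single pass would reach JMP end before the address of end is known, which is why the label addresses are collected first.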
Types of Compilers

Native Compiler

Cross Compiler

Source to Source Compiler

Just in Time Compiler

Incremental Compiler

Parallelizing Compiler
Native Compiler

A compiler that compiles source code for the same type of platform it runs on.

The output generated by this type of compiler can only be run on the same type of computer system and OS that the compiler itself runs on.
Cross Compiler

A compiler that compiles source code for a different kind of platform.

A cross compiler is a compiler capable of creating executable code for a platform other than the one on which the compiler is running.
Source to Source Compiler

A compiler that takes high-level language code as input and outputs source code in another high-level language.

Unlike other compilers, which convert a high-level language into low-level machine language, it can, for example, take code written in Pascal and transform it into C: a conversion from one high-level language into another with the same level of abstraction.

Thus, it is also known as a transpiler.


Just in Time Compiler

A compiler of this kind converts bytecode, at run time, into instructions that can be executed by the machine's processor.

Such compilers are used where we need to improve or optimize the run-time performance of an application.
Incremental Compiler

A compiler that recompiles only the changed lines of the source code and updates the object code accordingly.
Parallelizing Compiler

A compiler that is specially designed to generate code for a parallel computer architecture, typically by converting sequential code into code that can run in parallel, is known as a parallelizing compiler.


Compiler Construction Tools

Scanner Generator (Lexical Analyser Generator)

It generates lexical analyzers from an input consisting of regular-expression descriptions of the tokens of a language. It generates a finite automaton to recognize the regular expressions. Example: Lex
Compiler Construction Tools

Parser Generator

It produces syntax analyzers (parsers) from an input that is a grammatical description of a programming language, usually a context-free grammar.
Compiler Construction Tools

Syntax Directed Translation Engines

They generate intermediate code in three-address format from an input consisting of a parse tree. These engines have routines that traverse the parse tree and produce the intermediate code. Each node of the parse tree is associated with one or more translations.
Compiler Construction Tools

Automatic Code Generators

They generate the machine language for a target machine. The code generator takes as input a collection of rules that define how each operation of the intermediate language is translated into machine language.
Compiler Construction Tools

Data-Flow Analysis Engines

They are used in code optimization. Data-flow analysis is a key part of code optimization; it gathers information about how values flow from one part of a program to another.
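
One classic data-flow analysis is live-variable analysis, sketched below for straight-line code only (no branches). The instruction format, a pair of the defined name and the list of used names, is an assumption made for this example.

```python
def live_variables(instructions, live_out):
    """Return, for each instruction, the set of names live just before it.

    instructions: list of (defined_name, [used_names]) in program order.
    live_out: names whose values are still needed after the last instruction.
    """
    live = set(live_out)
    live_before = []
    # Information flows backward: from the uses of a value to its definition.
    for defined, used in reversed(instructions):
        live = (live - {defined}) | set(used)
        live_before.append(set(live))
    live_before.reverse()
    return live_before

if __name__ == "__main__":
    # temp1 := id3 * 60.0 ; id1 := id2 + temp1
    program = [("temp1", ["id3"]), ("id1", ["id2", "temp1"])]
    print(live_variables(program, {"id1"}))
```

An optimizer can use this information to decide, for instance, that a register holding temp1 is free again after the second instruction.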
Qualities of a Compiler

The compiler itself must be bug free.

It must generate correct machine code.

The generated machine code must run fast (execution speed).

Compilation time must be short.

The compiler must be portable.

It must print good diagnostics and error messages.

The generated code must work well with existing debuggers.
