
HAWASSA UNIVERSITY

Institute of Technology (IOT)


Faculty of Informatics

Compiler Design

(CoSc4072)

Compiled by: Areg. T.


CH_01

Introduction



Outline
• Phases of Compiler

• Computer Language Representation

• Compiler Construction Tools



Introduction
What is Compiler?
• A compiler takes source code (a computer program) as input and
produces an executable, which is another program; it might be in
assembly language or bytecode. The executable can then be run
separately on data to produce the output.

• With a compiler, the program is processed first to produce the
executable; that same executable can then be run on many inputs
without having to recompile it.
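
The "translate once, run many times" idea can be sketched with Python's built-in compile() and eval(). This is only an illustration: the formula and the input values are hypothetical, and the "executable" here is a Python bytecode object rather than a native program.

```python
# Hypothetical one-line "program" (not from the slides).
source = "initial + rate * 60"

# Translate once: compile() turns source text into a bytecode object.
program = compile(source, "<example>", "eval")

# Run the same translated program on several inputs without re-translating.
inputs = [
    {"initial": 10, "rate": 2},
    {"initial": 0, "rate": 5},
]
results = [eval(program, {}, data) for data in inputs]
print(results)  # [130, 300]
```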



Introduction
Why Compilers?
• Programming in machine (or assembly) language is
tedious, error prone, and machine dependent.

• Historical note: In 1954, IBM started developing the FORTRAN
language and its compiler.



Introduction
What is Interpreter?
• An interpreter takes the program you wrote, together with the data
you want to run it on, as input and produces the output directly.

• That is, it does not process the program in a separate step before
executing it on the input data.

• So, you just write the program and you invoke the
interpreter on the data and the program immediately begins
running.



Introduction
What is Compiler Design?
• Compiler design is the process of creating or designing a compiler.

• By designing compilers, computer scientists can make programming
languages more efficient and reliable.

• They can also make it easier for programmers to develop complex
software systems.



Phases of Compiler
• The first three phases of a compiler (lexical, syntax, and semantic
analysis) are together known as the analysis phase.

• The last three phases of a compiler (intermediate code generation
(ICG), code optimization, and code generation) are together known as
the synthesis phase.



Phases of Compiler
Lexical Analysis (Scanner)
• Lexical analysis, also known as scanning, is the first phase of the
compiler.
• It converts the high-level input program into a sequence of tokens of
the following form:
<token_name, attribute_value>
• Here token_name is an abstract symbol that is used during syntax
analysis, and attribute_value points to an entry in the symbol table for
this token.
• Example: Given the source statement position = initial + rate * 60,
which tokens will be generated?
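
One possible answer can be worked out with a small scanner. The sketch below is an assumption-laden illustration, not the slide's own solution: the token class names (id, number), the TOKEN_SPEC table, and the tokenize function are hypothetical, and identifiers are numbered by their order of first appearance in the symbol table.

```python
import re

# Token classes for the tiny assignment language in the example (assumed).
TOKEN_SPEC = [
    ("id", r"[A-Za-z_]\w*"),
    ("number", r"\d+"),
    ("op", r"[=+*]"),
    ("skip", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Return (tokens, symbol_table) for one line of source text."""
    symbol_table = []  # identifiers in order of first appearance
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind == "skip":
            continue  # whitespace produces no token
        if kind == "id":
            if text not in symbol_table:
                symbol_table.append(text)
            # The attribute is the 1-based symbol-table entry.
            tokens.append(("id", symbol_table.index(text) + 1))
        elif kind == "number":
            tokens.append(("number", int(text)))
        else:
            tokens.append((text, None))  # operators carry no attribute
    return tokens, symbol_table

tokens, symbols = tokenize("position = initial + rate * 60")
print(tokens)
# [('id', 1), ('=', None), ('id', 2), ('+', None), ('id', 3), ('*', None), ('number', 60)]
print(symbols)
# ['position', 'initial', 'rate']
```

In the <token_name, attribute_value> notation this corresponds to <id, 1> <=> <id, 2> <+> <id, 3> <*> <60>.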



Phases of Compiler
Syntax Analysis (Parser)
• The syntax analyzer (parser) uses the tokens produced by the lexical
analyzer and creates a syntax tree that depicts the grammatical
structure of the token stream.

• Example: Given the token stream id = id + id * 60, what parse tree
will be generated?
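
A minimal recursive-descent sketch shows how such a tree could be built. The grammar is an assumption (assignment → id = expr, expr → term {+ term}, term → factor {* factor}, factor → id | number), the function name parse_assignment is hypothetical, and the tree is represented as nested tuples of the form (operator, left, right).

```python
def parse_assignment(tokens):
    """Parse `id = expr` and return a syntax tree as nested tuples."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance(expected=None):
        nonlocal pos
        token = peek()
        if token is None or (expected is not None and token != expected):
            raise SyntaxError(f"expected {expected!r}, found {token!r}")
        pos += 1
        return token

    def factor():
        token = advance()
        if token == "id" or token.isdigit():
            return token
        raise SyntaxError(f"unexpected token {token!r}")

    def term():
        # '*' binds tighter than '+', so it is handled one level deeper.
        node = factor()
        while peek() == "*":
            advance("*")
            node = ("*", node, factor())
        return node

    def expr():
        node = term()
        while peek() == "+":
            advance("+")
            node = ("+", node, term())
        return node

    target = advance("id")
    advance("=")
    tree = ("=", target, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected trailing token {peek()!r}")
    return tree

tree = parse_assignment(["id", "=", "id", "+", "id", "*", "60"])
print(tree)  # ('=', 'id', ('+', 'id', ('*', 'id', '60')))
```

The multiplication ends up deeper in the tree than the addition, which reflects the usual precedence of * over +.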



Phases of Compiler
Semantic Analysis
• The semantic analyzer uses the syntax tree and the information in the
symbol table to check the source program for semantic consistency
with the language definition.

• It also gathers type information and saves it in either the syntax tree
or the symbol table, for subsequent use during intermediate code
generation (ICG).

• Type checking is an important task of semantic analysis.

• Example:
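
As one possible illustration (a sketch, not necessarily the example on the original slide), the checker below walks the tree for id1 = id2 + id3 * 60 and inserts a conversion where an integer meets a float. The declared types in SYMBOL_TYPES, the inttofloat node name, and the check function are assumptions made for this example.

```python
# Assumed declarations: every identifier is a float.
SYMBOL_TYPES = {"id1": "float", "id2": "float", "id3": "float"}

def check(node):
    """Return (annotated_node, type) for a tree of nested tuples."""
    if isinstance(node, int):
        return node, "int"
    if isinstance(node, str):
        if node not in SYMBOL_TYPES:
            raise TypeError(f"undeclared identifier {node!r}")
        return node, SYMBOL_TYPES[node]
    op, left, right = node
    left, left_type = check(left)
    right, right_type = check(right)
    if op == "=":
        # Assignment: an int value may be widened to a float target.
        if left_type == "float" and right_type == "int":
            right = ("inttofloat", right)
        elif left_type != right_type:
            raise TypeError(f"cannot assign {right_type} to {left_type}")
        return (op, left, right), left_type
    # Arithmetic: widen the int operand when the operand types are mixed.
    if left_type != right_type:
        if left_type == "int":
            left = ("inttofloat", left)
        else:
            right = ("inttofloat", right)
        return (op, left, right), "float"
    return (op, left, right), left_type

tree = ("=", "id1", ("+", "id2", ("*", "id3", 60)))
annotated, _ = check(tree)
print(annotated)
# ('=', 'id1', ('+', 'id2', ('*', 'id3', ('inttofloat', 60))))
```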



Phases of Compiler
Intermediate Code Generation (ICG)
• ICG generates an explicit low-level or machine-like intermediate
representation after receiving the annotated parse tree from the
semantic analysis phase.

• Three-address code is a popular example of an intermediate
representation.

• An intermediate representation should have two important properties:
• It should be easy to produce.
• It should be easy to translate into the target machine.

• Example:
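
As one possible illustration (a sketch under stated assumptions, not necessarily the slide's own example), the function below flattens an annotated tree for id1 = id2 + id3 * 60 into three-address code. The tuple-based tree format, the temporary names t1, t2, ..., and the generate_tac function are hypothetical.

```python
import itertools

def generate_tac(tree):
    """Flatten an annotated tree of nested tuples into three-address code."""
    code = []
    counter = itertools.count(1)

    def new_temp():
        return f"t{next(counter)}"

    def emit(node):
        # Leaves (names and constants) are already valid operands.
        if not isinstance(node, tuple):
            return str(node)
        if node[0] == "inttofloat":
            operand = emit(node[1])
            temp = new_temp()
            code.append(f"{temp} = inttofloat({operand})")
            return temp
        op, left, right = node
        left_operand = emit(left)
        right_operand = emit(right)
        temp = new_temp()
        # Each instruction has at most one operator on its right-hand side.
        code.append(f"{temp} = {left_operand} {op} {right_operand}")
        return temp

    op, target, value = tree
    if op != "=":
        raise ValueError("this sketch handles a single assignment only")
    result = emit(value)
    code.append(f"{target} = {result}")
    return code

tree = ("=", "id1", ("+", "id2", ("*", "id3", ("inttofloat", 60))))
tac = generate_tac(tree)
for line in tac:
    print(line)
# t1 = inttofloat(60)
# t2 = id3 * t1
# t3 = id2 + t2
# id1 = t3
```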



Phases of Compiler
Code Optimization
• Code optimization is designed to improve the intermediate code so
that the object code runs faster or takes less memory space.

• Code optimization reduces the number of steps involved in a program
without affecting its meaning.

• Example:
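
As one possible illustration (a sketch, not necessarily the slide's own example), the function below shortens a four-instruction three-address sequence to two instructions. The tuple format (dest, op, arg1, arg2), the "copy" and "inttofloat" operation names, and the optimize function are assumptions; the sketch also assumes each temporary t1, t2, ... is used exactly once.

```python
def optimize(code):
    """Apply two local optimizations to straight-line three-address code.

    Each instruction is a tuple (dest, op, arg1, arg2); unused fields are None.
    Assumes every temporary is used exactly once.
    """
    constants = {}  # temporaries known to hold a compile-time constant
    folded = []
    for dest, op, arg1, arg2 in code:
        # Replace operands that are known constants.
        arg1 = constants.get(arg1, arg1)
        arg2 = constants.get(arg2, arg2)
        if op == "inttofloat" and isinstance(arg1, int):
            constants[dest] = float(arg1)  # convert at compile time
            continue
        folded.append((dest, op, arg1, arg2))

    # Merge "t = a op b" followed by "x = t" into "x = a op b".
    result = []
    for instr in folded:
        dest, op, arg1, arg2 = instr
        if op == "copy" and result and result[-1][0] == arg1 and arg1.startswith("t"):
            _, prev_op, prev1, prev2 = result.pop()
            result.append((dest, prev_op, prev1, prev2))
        else:
            result.append(instr)
    return result

before = [
    ("t1", "inttofloat", 60, None),  # t1 = inttofloat(60)
    ("t2", "*", "id3", "t1"),        # t2 = id3 * t1
    ("t3", "+", "id2", "t2"),        # t3 = id2 + t2
    ("id1", "copy", "t3", None),     # id1 = t3
]
after = optimize(before)
print(after)
# [('t2', '*', 'id3', 60.0), ('id1', '+', 'id2', 't2')]
```

The result reads as t2 = id3 * 60.0 followed by id1 = id2 + t2: the conversion is done once at compile time and the final copy disappears.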



Phases of Compiler
Code Generation
• Machine code generation takes an intermediate representation of the
source program as input and maps it into the target language.

• If the target language is machine code, registers or memory locations
are selected for each of the variables used by the program.

• Then, the intermediate instructions are translated into sequences of
machine instructions that perform the same task.

• A crucial aspect of code generation is the judicious assignment of
registers to hold variables.

• Example:
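
As one possible illustration (a sketch, not necessarily the slide's own example), the generator below maps two three-address instructions onto a hypothetical register machine. The mnemonics LDF, MULF, ADDF, and STF, the register names R1, R2, ..., and the generate_code function are assumptions; the allocation is deliberately naive (a fresh register per loaded variable, no reuse or spilling) and assumes each temporary is used once and the first operand is always a name.

```python
import itertools

def generate_code(tac):
    """Translate (dest, op, arg1, arg2) tuples into target instructions."""
    opcodes = {"+": "ADDF", "*": "MULF"}
    register_of = {}  # temporary name -> register holding its value
    registers = itertools.count(1)
    output = []

    def is_temp(name):
        return name[:1] == "t" and name[1:].isdigit()

    def load(arg):
        """Return a register holding arg, loading it from memory if needed."""
        if arg in register_of:
            return register_of[arg]
        reg = f"R{next(registers)}"
        output.append(f"LDF {reg}, {arg}")
        return reg

    for dest, op, arg1, arg2 in tac:
        left = load(arg1)
        # Constants become immediate operands; names need a register.
        right = f"#{arg2}" if isinstance(arg2, float) else load(arg2)
        # The result overwrites the register of the first operand.
        output.append(f"{opcodes[op]} {left}, {left}, {right}")
        if is_temp(dest):
            register_of[dest] = left
        else:
            output.append(f"STF {dest}, {left}")
    return output

tac = [
    ("t1", "*", "id3", 60.0),   # t1 = id3 * 60.0
    ("id1", "+", "id2", "t1"),  # id1 = id2 + t1
]
assembly = generate_code(tac)
for line in assembly:
    print(line)
# LDF R1, id3
# MULF R1, R1, #60.0
# LDF R2, id2
# ADDF R2, R2, R1
# STF id1, R2
```

Only two registers are needed here because the temporary t1 stays in R1 until it is consumed, which is the kind of register assignment decision the bullet above refers to.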



Computer Language Representation
• Computer language representation is the way in which a programming
language is represented in a computer system.

• This representation is used by compilers and interpreters to
understand the program and generate machine code or execute the
program directly.

• There are a number of different ways to represent programming
languages. Some common representations are:

• Token streams: A token stream is a sequence of tokens, where each
token represents a single unit of the programming language, such as a
keyword, identifier, or operator.

• Parse trees: A parse tree is a tree-like data structure that represents the
grammatical structure of a program.

• Abstract syntax trees (ASTs): An AST is similar to a parse tree, but it
is more abstract and represents the program in a way that is easier for
the compiler or interpreter to analyze and execute.

• Intermediate representations (IRs): An IR is a low-level representation
of the program that is designed to be efficient for generating machine
code or executing the program directly.

• The representation that is used for a particular programming language
depends on the language itself and the specific compiler or interpreter
that is being used.
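
Several of these representations can be inspected for one concrete language. The sketch below uses Python's own standard-library modules tokenize and ast and the built-in compile() on a single statement; it shows how one particular implementation represents a program, and the input values are hypothetical.

```python
import ast
import io
import tokenize

source = "position = initial + rate * 60\n"

# Token stream: (token type name, text) pairs from the standard tokenizer.
# Tokens with empty or whitespace-only text (NEWLINE, ENDMARKER) are skipped.
token_stream = [
    (tokenize.tok_name[tok.type], tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.string.strip()
]
print(token_stream)

# Abstract syntax tree built by the standard parser.
tree = ast.parse(source)
print(ast.dump(tree.body[0]))

# Lower-level representation: a bytecode object executed by the interpreter.
bytecode = compile(tree, "<example>", "exec")
namespace = {"initial": 10, "rate": 2}
exec(bytecode, namespace)
print(namespace["position"])  # 130
```

The exact text printed by ast.dump varies between Python versions, so it is not reproduced here.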



Compiler Construction Tools
• Some compiler construction tools that help in designing compilers
are:

• Parser generator: generates syntax analyzers from a grammar of a
language.

• Scanner generator: generates lexical analyzers from regular
expressions.

• Syntax-directed translation engine: generates intermediate code
from a parse tree.

• Code generator: translates intermediate language into the machine
language for a target machine.

• Data-flow analysis engine: facilitates the gathering of information
about how values are transmitted from one part of a program to
every other part. Data-flow analysis is a key part of code
optimization.
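
To give a feel for what a data-flow analysis computes, here is a minimal sketch of live-variable analysis on straight-line code. It is a hand-written illustration, not the output of any particular tool; the instruction format (dest, operands) and the live_variables function are assumptions, and branches and loops are not handled.

```python
def live_variables(code, live_out):
    """Backward live-variable analysis over straight-line code.

    code: list of (dest, operands) pairs, one per instruction.
    live_out: names whose values are needed after the last instruction.
    Returns the set of names live immediately before each instruction.
    """
    live = set(live_out)
    live_before = []
    for dest, operands in reversed(code):
        # A definition kills the old value; each use makes its operand live.
        live = (live - {dest}) | set(operands)
        live_before.append(set(live))
    live_before.reverse()
    return live_before

code = [
    ("t1", ["id3"]),         # t1 = id3 * 60.0
    ("id1", ["id2", "t1"]),  # id1 = id2 + t1
]
result = live_variables(code, live_out={"id1"})
print(result == [{"id2", "id3"}, {"id2", "t1"}])  # True
```

A code generator could use this information to see that t1 is needed only between the two instructions, so its register can be reused afterwards.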

