
Compiler Design Overview

By
Dr. Prem Nath
Associate Professor
Computer Science & Engineering Department,
H N B Garhwal University
(A Central University)

Table of Contents
1. Introduction
2. Language Processors
3. Compiler
4. Interpreter
5. Compiler vs. Interpreter
6. Language Processing System
7. Back End and Front End of Compiler
8. Syntax Analysis Overview
9. Synthesis Overview



Introduction

• Programs are written in programming languages.
• Programming languages have a defined syntax, much like English.
• A computer can execute only machine-level language.
• Hence the need for language translators.



Language Processor
A language processor translates a program written in one programming language into an equivalent program written in another programming language.

The two main kinds of language processors are compilers and interpreters.



Compiler
➢ Language Translator
➢ Software System
➢ Can read a Program in one Language (Source
Language) and Translate it into an Equivalent
Program in Another Language (Target Language).
[Diagram] Source program → Compiler → Target program (machine executable); the compiler reports errors. At run time: Input → Target program → Output


Compiler…

❖ Machine architectures shape compiler design.
❖ The study of compilers touches programming languages, machine architecture, software engineering, algorithms, and more.



Interpreter
❖ Another kind of Language Processor
❖ Translates the source code into target code line by line.
❖ Appears to execute the operations directly.
[Diagram] Source program + Input → Interpreter → Output; the interpreter reports errors



Compiler vs. Interpreter
➢ The interpreter translates and executes the source code line by line, whereas the compiler first translates the entire program and then executes it.
➢ The compiled program runs faster, since the interpreter must translate and execute each line in turn, which takes more time.
➢ An interpreter can give better error diagnostics, since it works line by line.



Compiler vs. Interpreter…
➢ Compilers generate machine code, whereas interpreters
interpret intermediate code.
➢ Interpreters are easier to write and can provide better
error messages (symbol table is still available).
➢ Interpreted execution is typically at least five times slower than the machine code generated by compilers.
➢ Interpreters also require much more memory than
machine code generated by compilers.
❖ Examples: Perl, Python, Unix Shell, Java, BASIC, LISP



Hybrid Compiler
❖ Combines Features of Compiler and Interpreter.
[Diagram] Java source code → Compiler → Intermediate code (bytecode, machine independent) → Interpreter (Java Virtual Machine, platform dependent) → Output



Language Processing System

Language Processing System…
❖ Preprocessor: Expands the macros in the source program.
❖ Assembler: Processes the assembly-language program produced by the compiler and generates relocatable machine code.
❖ Large programs are often compiled in pieces (object files).
❖ Linker: Links the relocatable machine code with other relocatable object files and library files; it resolves external memory references.
❖ Loader: Loads all the executable object files into main memory for execution.
Front-end and Back-end Division
[Diagram] Source code → Front end (analysis) → IR → Back end (synthesis) → Machine code; errors are reported by both halves

The front end maps legal code into an IR (intermediate representation)
The back end maps the IR onto target machine code
This division allows multiple front ends to share one back end
Multiple passes may result in better code
Front End

[Diagram] Source code → Scanner → tokens → Parser → IR; errors are reported

The front end must:
Recognize legal code
Report errors
Produce the IR
The scanning stage is also called lexical analysis or scanning.
Front End…
Scanner (lexical analyzer):
Groups characters into meaningful sequences called lexemes
Emits tokens of the form <token-name, attribute-value>
Maps characters into tokens, the basic unit of syntax
Eliminates white space (tabs, blanks, comments)
A toy scanner sketch follows below.
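
To make the scanner's job concrete, here is a minimal sketch in Python, assuming an invented toy language and invented token names (real scanners are usually generated from regular-expression specifications by tools such as LEX or Flex):

import re

# Token specification: (token-name, regular expression), tried in order.
TOKEN_SPEC = [
    ("NUM",      r"\d+"),            # integer literals
    ("ID",       r"[A-Za-z_]\w*"),   # identifiers
    ("ASSIGN",   r"="),
    ("OP",       r"[+\-*/]"),
    ("LPAREN",   r"\("),
    ("RPAREN",   r"\)"),
    ("SKIP",     r"[ \t]+"),         # white space: discarded
    ("MISMATCH", r"."),              # anything else is a lexical error
]
MASTER_RE = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC))

def scan(source):
    """Group characters into lexemes and emit <token-name, attribute-value> pairs."""
    for match in MASTER_RE.finditer(source):
        kind, lexeme = match.lastgroup, match.group()
        if kind == "SKIP":
            continue                                  # eliminate blanks and tabs
        if kind == "MISMATCH":
            raise SyntaxError(f"illegal character {lexeme!r}")
        yield (kind, lexeme)

print(list(scan("position = initial + rate * 60")))
# [('ID', 'position'), ('ASSIGN', '='), ('ID', 'initial'), ('OP', '+'),
#  ('ID', 'rate'), ('OP', '*'), ('NUM', '60')]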
Front End…

Parser:
Recognizes context-free syntax
Guides context-sensitive analysis
Constructs the IR
Produces meaningful error messages
Attempts error correction
Parser generators such as YACC automate much of this work
Front End…
➢ Context-free grammars are used to specify programming-language syntax.
➢ The parser constructs a parse tree (a recursive-descent sketch follows below).
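
To make the parser's job concrete, here is a minimal hand-written recursive-descent sketch in Python, assuming the token format of the scanner sketch above and an invented expression grammar (production parsers are often generated by tools such as YACC, Bison, or ANTLR):

# Grammar (invented for illustration):
#   expr   -> term  (('+' | '-') term)*
#   term   -> factor (('*' | '/') factor)*
#   factor -> NUM | ID | '(' expr ')'
# Trees are nested tuples: leaves are tokens, interior nodes are (op, left, right).

def parse(tokens):
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def eat(kind):
        nonlocal pos
        token = peek()
        if token[0] != kind:
            raise SyntaxError(f"expected {kind}, found {token}")
        pos += 1
        return token

    def factor():
        kind, _ = peek()
        if kind in ("NUM", "ID"):
            return eat(kind)
        eat("LPAREN")
        node = expr()
        eat("RPAREN")
        return node

    def term():
        node = factor()
        while peek() in (("OP", "*"), ("OP", "/")):
            op = eat("OP")
            node = (op[1], node, factor())
        return node

    def expr():
        node = term()
        while peek() in (("OP", "+"), ("OP", "-")):
            op = eat("OP")
            node = (op[1], node, term())
        return node

    tree = expr()
    if peek()[0] != "EOF":
        raise SyntaxError(f"unexpected token {peek()}")
    return tree

tokens = [("ID", "initial"), ("OP", "+"), ("ID", "rate"), ("OP", "*"), ("NUM", "60")]
print(parse(tokens))
# ('+', ('ID', 'initial'), ('*', ('ID', 'rate'), ('NUM', '60')))

Because * and / are handled in term, one level below + and - in expr, operator precedence falls directly out of the grammar.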
Back End…
[Diagram] IR → Instruction selection → Register allocation → Machine code; errors are reported

Translate the IR into target machine code:
Choose instructions for each IR operation
Decide what to keep in registers at each point
Ensure conformance with system interfaces
Back End…

Instruction selection:
Produce compact, fast code
Use the available addressing modes
Back End…


Register allocation:
Have each value in a register when it is used
Registers are a limited resource
Optimal allocation is difficult (a linear-scan sketch follows below)
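
As a rough illustration of the register-allocation problem, here is a hypothetical greedy, linear-scan style sketch in Python; the live intervals, register names, and spill handling are all invented for illustration and are far simpler than what a production allocator does:

# Each value is (name, start, end): the interval over which it is live.
def linear_scan(intervals, registers):
    """Greedy linear-scan allocation: assign a free register if one exists,
    otherwise spill (a real allocator would spill the interval ending last)."""
    free = list(registers)
    active = []                      # (end, name, register) of live values
    assignment = {}
    for name, start, end in sorted(intervals, key=lambda iv: iv[1]):
        # Free the registers of values whose intervals have already ended.
        for iv in [iv for iv in active if iv[0] <= start]:
            active.remove(iv)
            free.append(iv[2])
        if free:
            reg = free.pop()
            assignment[name] = reg
            active.append((end, name, reg))
        else:
            assignment[name] = "SPILL"       # not enough registers
    return assignment

live_ranges = [("a", 0, 4), ("b", 1, 3), ("c", 2, 6), ("d", 5, 7)]
print(linear_scan(live_ranges, ["r1", "r2"]))
# {'a': 'r2', 'b': 'r1', 'c': 'SPILL', 'd': 'r1'}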
Back End…
Traditional Three Pass Compiler
[Diagram] Source code → Front End → IR → Middle End → IR → Back End → Machine code; errors are reported throughout

Code improvement (the middle end) analyzes and transforms the IR
The goal is to reduce run time
Middle End (Optimizer)
Modern optimizers are usually built as a set of passes
Typical passes (a sketch of two of them follows below):
Constant propagation
Common sub-expression elimination
Redundant store elimination
Dead code elimination
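
As an illustration of such passes, here is a hypothetical sketch in Python of constant propagation (with folding) followed by dead-code elimination over a straight-line, three-address-style IR; the instruction format is invented for illustration:

# Straight-line IR: each instruction is (dest, op, arg1, arg2);
# arguments are constants (int) or names (str), and 'copy' ignores arg2.
code = [
    ("t1", "copy", 4, None),
    ("t2", "mul", "t1", 2),
    ("t3", "add", "t2", "x"),
    ("t4", "mul", "t1", 3),        # t4 is never used: dead code
    ("y",  "copy", "t3", None),
]

def constant_propagation(code):
    """Substitute known constant values and fold constant expressions."""
    consts, out = {}, []
    for dest, op, a, b in code:
        a = consts.get(a, a)       # replace names whose values are known constants
        b = consts.get(b, b)
        if op == "copy" and isinstance(a, int):
            consts[dest] = a
        elif op in ("add", "mul") and isinstance(a, int) and isinstance(b, int):
            consts[dest] = a + b if op == "add" else a * b
            op, a, b = "copy", consts[dest], None   # fold to a constant copy
        out.append((dest, op, a, b))
    return out

def dead_code_elimination(code, live_out):
    """Walk backwards, keeping only instructions whose result is needed."""
    live, out = set(live_out), []
    for dest, op, a, b in reversed(code):
        if dest in live:
            out.append((dest, op, a, b))
            live.discard(dest)
            live |= {x for x in (a, b) if isinstance(x, str)}
    return list(reversed(out))

optimized = dead_code_elimination(constant_propagation(code), live_out={"y"})
for instr in optimized:
    print(instr)
# ('t3', 'add', 8, 'x')
# ('y', 'copy', 't3', None)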
Phases of Compiler

Translation Overview - Lexical Analysis
A lexical analyzer (LA) can be generated automatically from regular-expression specifications; LEX and Flex are two such tools
The lexical analyzer is a DFA
Why is LA separate from parsing?
➢ Simplification of design (a software-engineering reason)
➢ I/O issues are limited to the LA alone
➢ An LA based on finite automata is more efficient to implement than the pushdown automata used for parsing (which require a stack)

Translation Overview - Syntax Analysis or Parsing
Syntax analyzers (parsers) can be generated automatically from several variants of context-free grammar specifications
➢ LL(1) and LALR(1) are the most popular ones
➢ ANTLR (for LL(1)), YACC and Bison (for LALR(1)) are such tools
Parsers are deterministic push-down automata
Parsers cannot handle context-sensitive features of programming languages; e.g.,
➢ Variables are declared before use
➢ Types match on both sides of assignments
➢ Parameter types and number match in declaration and use
Translation Overview - Semantic Analysis
Checks semantic consistency that cannot be handled at the parsing stage
Stores type information in the symbol table or the syntax tree
➢ Types of variables, function parameters, array dimensions, etc.
➢ Used not only for semantic validation but also by subsequent phases of compilation
➢ The static semantics of programming languages can be specified using attribute grammars
A toy symbol-table sketch follows below.
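
A minimal, hypothetical sketch of how a symbol table supports these checks (declared-before-use and type matching on assignment); the class and type names are invented for illustration:

class SymbolTable:
    """A toy symbol table mapping names to their declared types."""
    def __init__(self):
        self.symbols = {}

    def declare(self, name, type_):
        if name in self.symbols:
            raise TypeError(f"'{name}' is already declared")
        self.symbols[name] = type_

    def lookup(self, name):
        if name not in self.symbols:
            raise NameError(f"'{name}' used before declaration")
        return self.symbols[name]

def check_assignment(table, target, value_type):
    """Verify that the types on both sides of an assignment match."""
    if table.lookup(target) != value_type:
        raise TypeError(f"cannot assign a {value_type} value to '{target}'")

table = SymbolTable()
table.declare("rate", "float")
table.declare("count", "int")
check_assignment(table, "rate", "float")      # OK
# check_assignment(table, "count", "float")   # would raise TypeError
# table.lookup("position")                    # would raise NameError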

Translation Overview - Intermediate Code Generation
While generating machine code directly from source code is possible, it entails two problems:
With m source languages and n target machines, we would need to write m × n compilers
The code optimizer, one of the largest and most difficult-to-write components of any compiler, could not be reused
By converting source code to an intermediate code, a machine-independent code optimizer may be written
Intermediate code must be easy to produce and easy to translate to machine code (a three-address-code sketch follows below)
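
As an illustration, here is a hypothetical sketch in Python that walks an expression tree (nested tuples, as in the parser sketch above) and emits three-address code, one classical intermediate form:

def gen_three_address(tree):
    """Flatten an expression tree into (dest, op, arg1, arg2) quadruples."""
    code = []
    temp_count = 0

    def new_temp():
        nonlocal temp_count
        temp_count += 1
        return f"t{temp_count}"

    def walk(node):
        kind = node[0]
        if kind in ("ID", "NUM"):
            return node[1]                    # leaves are their own addresses
        op, left, right = node
        a, b = walk(left), walk(right)
        dest = new_temp()
        code.append((dest, op, a, b))         # dest := a op b
        return dest

    walk(tree)
    return code

tree = ("+", ("ID", "initial"), ("*", ("ID", "rate"), ("NUM", "60")))
for quad in gen_three_address(tree):
    print(quad)
# ('t1', '*', 'rate', '60')
# ('t2', '+', 'initial', 't1')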

Different Types of Intermediate Code
The type of intermediate code deployed depends on the application
Quadruples, triples, indirect triples, and abstract syntax trees are the classical forms used for machine-independent optimizations and machine code generation (a small comparison follows below)
Static Single Assignment (SSA) form is a more recent form that enables more effective optimizations; conditional constant propagation and global value numbering are more effective on SSA
The Program Dependence Graph (PDG) is useful in automatic parallelization, instruction scheduling, and software pipelining
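
For example, the same small expression can be written as quadruples or as triples; the tuple encodings below are invented for illustration:

# The expression  a + a * b  in two classical intermediate forms.

# Quadruples: (op, arg1, arg2, result) -- every result has an explicit name.
quadruples = [
    ("*", "a", "b", "t1"),
    ("+", "a", "t1", "t2"),
]

# Triples: (op, arg1, arg2) -- a result is referred to by the position of the
# instruction that computes it, so no temporary names are needed.
triples = [
    ("*", "a", "b"),             # (0)
    ("+", "a", ("ref", 0)),      # (1): second operand is the result of triple 0
]

# Indirect triples keep a separate list of pointers into the triple table,
# which lets the optimizer reorder instructions without rewriting references.
instruction_order = [0, 1]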
Translation Overview - Code Optimization

Translation Overview - Code Generation

Machine-Dependent Optimizations
Peephole optimizations: analyze a sequence of instructions in a small window (peephole) and, using preset patterns, replace them with a more efficient sequence (a sketch follows below)
Redundant instruction elimination: e.g., replace the sequence [LD A,R1][ST R1,A] by [LD A,R1]
Eliminate “jump to jump” instructions
Use machine idioms (use INC instead of LD and ADD)
Instruction scheduling (reordering) to eliminate pipeline interlocks and to increase parallelism
Trace scheduling to increase the size of basic blocks and increase parallelism
Software pipelining to increase parallelism in loops
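
A tiny, hypothetical peephole pass in Python that removes a store immediately following a load of the same location, matching the [LD A,R1][ST R1,A] example above; the instruction encoding is invented for illustration:

def peephole(code):
    """Drop a store [ST Rk,X] that immediately follows a load [LD X,Rk]."""
    out, i = [], 0
    while i < len(code):
        if (i + 1 < len(code)
                and code[i][0] == "LD" and code[i + 1][0] == "ST"
                and code[i][1] == code[i + 1][2]      # same memory location X
                and code[i][2] == code[i + 1][1]):    # same register Rk
            out.append(code[i])      # keep the load, drop the redundant store
            i += 2
        else:
            out.append(code[i])
            i += 1
    return out

code = [("LD", "A", "R1"), ("ST", "R1", "A"), ("ADD", "R1", "R2")]
print(peephole(code))
# [('LD', 'A', 'R1'), ('ADD', 'R1', 'R2')]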

Compilation Process and T-Diagram
➢ S: Source language
➢ T: Target language
➢ I: Implementation language (the language in which the compiler is implemented)
[T-diagram] The compiler is drawn as a “T”: the source language S and the target language T form the top arms, and the implementation language I forms the stem



Compilation Process and T-Diagram…
➢ A: Source language, B: Target language
➢ Py: Python, AL: Assembly language, M: Machine language
[T-diagram chain] An A→B compiler written in Py; a Py→M compiler written in C++; a C++→M compiler written in C; a C→M compiler written in AL; and an AL→M assembler running on machine M
Compilation Process and Bootstrapping
➢ Bootstrapping is the process of building a compiler for a complex language using a compiler implemented in a simpler language.
➢ Each step produces a compiler that is more powerful than the one used to build it.
➢ A complete compiler can thus be designed incrementally using bootstrapping.
➢ A compiler can compile its own source program into a program closer to the machine language.
[T-diagram chain] A C++→M compiler written in C is translated by a C→M compiler written in AL, which in turn is translated by an AL→M assembler running on machine M
Cross Compiler
➢ Cross Compiler is a compiler which runs on one
machine and produces output code for another machine.

[T-diagrams: a Pascal compiler targeting C, implemented in C++, shown on the first machine and on the second machine]


Cross Compiler…
➢ Cross Compiler is a compiler which runs on one
machine and produces output code for another
machine.

[T-diagrams: the Pascal-to-C compiler and the C++ compiler used to port it between machines]



Thanks

