You are on page 1of 22

MU-MIT

Chapter 01
Introduction to Compiler Design (CD)
Course number: CSE 404 Instructor: Desta Gebre
Email: dhesta26@yahoo.com
Lecture 3, Practice 3
Office: 209
Credit 4 Lab: Aynalem
Pre-Requisite CSE 405
Text book:
Alfred V Aho, Ravi Sethi, Jeffrey D. Ullman,
“Compilers – Principles,Techniques and Tools”
Addison Wesley
1
Course Organization
• Lectures
• Organized in to eight chapters
• Labs
• Up to five reports
• Projects and seminars
• In groups of at most 5 students 1. Introduction
2. Lexical Analysis
3. Syntax Analysis
4. Syntax Directed Translation
5. Run Time Environments
6. Intermediate Code Generation
7. Code Generation
8. Code Optimization
2
Evaluation
• Quiz (10%)
• Lab assignments (up to 20%)
• Project (up to 20%)
• Mid exam (up to 30%)
• Final Exam (up to 40%)
• Total out of 100%
Caution
• Direct copying from other student or from the Internet is not allowed. (Results in 0).
• Each group member must contribute in project and all lab works.
• A group member who does not know what is in a lab report or project will be
given zero. Up to 5 marks will also be deducted from each group member.
• Cheating in exams will not be tolerated.
• Will be reported to discipline committee right away. 3
¿Why sudy Compilers?
• Become a better programmer
• Insight into interaction between languages, compilers, and hardware
• Compiler techniques are everywhere
• Parsing (little languages, interpreters, HTML)
• Database engines, query languages
• Text processing
• Fascinating blend of theory and engineering
• Direct applications of theory to practice
• Parsing, scanning, static analysis
• Resource allocation, “optimization”, etc.
• You might even write a compiler some day!
• Etc
4
Overview and History
 Cause
 Software for early computers was written in assembly language
 The benefits of reusing software on different CPUs started to
become significantly greater than the cost of writing a compiler

 The first real compiler


 Writing a compiler has been considered as time taking task..
 FORTRAN compilers of the late 1950s took18 person-years to build
 Nowadays writing some modules of a compiler has
become routine enough that can be automated.
 Compiler generators or compiler-compilers
 E.g. scanner and parser generators (Lex, Yacc)

5
Compilers

 A compiler acts as a translator,


transforming human-oriented programming languages
into computer-oriented machine languages.
 Ignore machine-dependent details for programmer

Programming Machine
Language Compiler Language
(Source) (Target)

6
…Cont’d
 Compilers may generate three types of code:
 Pure Machine Code
 Machine instructions set without assuming the existence of any
operating system or library.
 Mostly being OS or embedded applications.
 Augmented Machine Code
 Code with OS routines and runtime support routines.
 More often.
 Virtual Machine Code
 Virtual instructions, can be run on any architecture with a virtual
machine interpreter or a just-in-time compiler
 Ex. Java

7
…Cont’d
 Another way that compilers differ from one another is
in the format of the target machine code they
generate:
 Assembly or other source format
 Relocatable binary
 Relative address
 A linkage step is required
 Absolute binary
 Absolute address
 Can be executed directly

8
Phases of Compilation

Source
Program Tokens Syntactic Semantic
Scanner Parser
(Character Stream) Structure Routines

Intermediate
Representation

Symbol and Optimizer


Attribute
Tables

(Used by all Phases of The Compiler)

Code
Generator
9
Target machine code
Phases of Compilation: Scanner

Source
Program Tokens Syntactic Semantic
Scanner Parser
(Character Stream) Structure Routines

Intermediate
Scanner Representation

 The scanner begins the analysis of the source program by


reading the input, character by character, and grouping
Symbol and Optimizer
characters into individual words and symbols (tokens)
Attribute
Tables
 RE ( Regular expression )
 NFA ( Non-deterministic Finite Automata )
 (Used by
DFA ( Deterministic Finite Automata ) all
 LEX Phases of
The Compiler) Code
Generator
10
Target machine code
Phases of Compilation: Parser

Source
Program Tokens Syntactic Semantic
Scanner Parser
(Character Stream) Structure Routines

Intermediate
Parser Representation

 Given a formal syntax specification (typically as a context-


free grammar [CFG] ), the parser reads tokens and groups
Symbol and Optimizer
them into units as specified by the productions of the CFG
Attribute
being used.
Tables
 As syntactic structure is recognized, the parser either calls
corresponding semantic routines directly or builds a syntax
(Used by all
tree.
Phases
 CFG ( Context-Free Grammar ) of
The Compiler)
 Top Down, Bottom Up parsers Code
 Operator Precedence parser Generator
 LR, SLR, LALR Parsers
11
 YACC
Target machine code
Phases of Compilation: Semantic ..

Source
Program Tokens Syntactic Semantic
Scanner Parser
(Character Stream) Structure Routines

Intermediate
Semantic Routines Representation

 Perform two functions


 Check the static semantics of each construct
Symbol and
 Do the actual translation Optimizer
Attribute
 The heart of a compiler
Tables
 Syntax Directed Translation
 Semantic Processing Techniques
(Used
by all
 IR (Intermediate Representation)
Phases of
The Compiler) Code
Generator
12
Target machine code
Phases of Compilation: Optimizer

Source
Program Tokens Syntactic Semantic
Scanner Parser
(Character Stream) Structure Routines

Intermediate
Optimizer Representation

 The IR code generated by the semantic routines is


analyzed and transformed into functionally equivalent but
Symbol and Optimizer
improved IR code
Attribute
 This phase can be very complex and slow
Tables
 Peephole optimization
 loop optimization, register allocation
(Used by all
Phases of
 Register and Temporary Management
 Peephole Optimization The Compiler) Code
Generator
13
Target machine code
Phases of Compilation: Code …

Source
Program Tokens Syntactic Semantic
Scanner Parser
(Character Stream) Structure Routines

Intermediate
Code Generator Representation
 Interpretive Code Generation
 Generating Code from Tree/Dag
 Grammar-Based Code Generator
Optimizer

Code
Generator
14 Target machine code
Phases: Summary
Code Generator
[Intermediate Code Generator]

Non-optimized Intermediate Code


Scanner
[Lexical Analyzer]

Tokens

Code Optimizer
Parser
[Syntax Analyzer]
Optimized Intermediate Code
Parse tree

Code Generator
Semantic Process
[Semantic analyzer] Target machine code

Abstract Syntax Tree w/ Attributes

15
Grouping of Phases

 Any compiler must perform two major tasks

Compiler

Analysis Synthesis

 Analysis of the source program


 Synthesis of a machine-language program

16
Grouping of Phases …cont’d
 In analysis phase, an intermediate representation is
created from the given source program.
 Lexical Analyzer, Syntax Analyzer and Semantic Analyzer are the parts of
this phase.

 In synthesis phase, the equivalent target program is


created from this intermediate representation.
 Intermediate Code Generator, Code Generator, and Code Optimizer
are the parts of this phase.

17
Syntax and Semantics of Programming Language

 A programming language must include the specification of


syntax (structure) and semantics (meaning).
 Syntax typically means the context-free syntax because of
the almost universal use of context-free-grammar (CFGs)
 Ex. a = b + c is syntactically legal and b + c = a is illegal
 The semantics of a programming language are commonly
divided into two classes:
 Static semantics
 Semantics rules that can be checked at compiled time.
 Ex. The type and number of a function’s arguments
 Runtime semantics
 Semantics rules that can be checked only at run time
18
Metalanguages
 Metalanguage: a language used to define another language
 We will use different metalangauges to define the various
components of a programming language so that these
components can be generated automatically
 Lexical analysis: regular expressions
 Syntax analysis: context free grammars
 Semantics analysis: attribute grammars
 Intermediate code generation: attribute grammars
 Code generation: tree grammars

19
Generators
 Automatic Compiler Generator
Compiler
Compiler Compiler
Definition in
(Metalanguage)

Error Message
 Automatic tool Generator

Tool
Definition in Compiler Tool
(Metalanguage)

Error Message
20
Computer Architecture and Compiler Design

 Compilers should exploit the hardware-specific feature


and computing capability to optimize code.
 The problems encountered in modern computing
platforms:
 Instruction sets for some popular architectures are highly non-
uniform.
 High-level programming language operations are not always
easy to support.
 Ex. exceptions, threads, dynamic heap access …
 Exploiting architectural features such as cache, distributed
processors and memory
 Effective use of a large number of processors

21
Compiler Design Considerations
 Debugging Compilers
 Designed to aid in the development and debugging of
programs.
 Optimizing Compilers
 Designed to produce efficient target code
 Retargetable Compilers
 A compiler whose target architecture can be changed without
its machine-independent components having to be rewritten.

22

You might also like