Compilers

History of Compilation
Compilers are fundamental to modern computing.
They act as translators, transforming human-oriented programming languages into computer-oriented machine languages.

The First Compiler


The first real compilers were the FORTRAN compilers of the late 1950s.
Machine-independent languages such as Fortran proved the viability of high-level compiled languages.
The first FORTRAN compiler took 18 person-years to build.

Compilers normally translate conventional programming languages like Java, C and C++ into executable machine-language instructions.

Compiler technology is more broadly applicable and has been employed in rather unexpected areas:
  TeX and LaTeX
  PostScript
  Verilog and VHDL (which address the creation of VLSI circuits)

What Compilers Do
Compilers may be distinguished in two ways:
  By the kind of machine code they generate
  By the format of the target code they generate

Compilers may generate any of three types of code, by which they can be differentiated:
  Pure machine code
  Augmented machine code: machine code augmented with operating system routines and runtime language support routines (I/O, storage allocation, mathematical functions, etc.)
  Virtual machine code, such as code for the JVM (Java Virtual Machine)

Target Code Format


Another way that compilers differ from one another is in the format of the target machine code they generate:
  Assembly or other source formats
  Relocatable binary (a linkage step is required)
  Absolute binary

Interpreters
Another kind of language processor, called an interpreter, differs from a compiler in that it executes programs without explicitly performing a translation.

Interpreters provide a number of capabilities not usually found in compilers:
  Programs can be easily modified as execution proceeds.
  Languages in which the type of an object is determined dynamically (e.g., Lisp and Scheme) are easily supported in an interpreter.
  Interpreters provide a significant degree of machine independence, since no machine code is generated.

Organization of a Compiler (1)

Organization of a Compiler (2)


Any compiler must perform two major tasks:
  Analysis of the source program being compiled
  Synthesis of a target program that, when executed, will correctly perform the computations described by the source program

Scanner
The scanner begins the analysis of the source program by reading the input, character by character, and grouping characters into individual words and symbols (tokens) such as identifiers, integers, reserved words, and delimiters.
The tokens are encoded (often as integers) and then fed to the parser for syntactic analysis.
Token structure is usually specified with regular expressions (see Chapter 3).
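The scanning step described above can be sketched with Python's `re` module. This is a minimal illustration, not the scanner of any particular compiler: the token class names, the patterns, and the small set of reserved words are all assumptions made for the example.

```python
import re

# Hypothetical token classes for a small C-like language; the names and
# patterns are illustrative assumptions, not taken from the slides.
TOKEN_SPEC = [
    ("INTEGER",    r"\d+"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("DELIMITER",  r"[();{}=+\-*/]"),
    ("SKIP",       r"\s+"),
]
RESERVED = {"if", "else", "while", "return"}   # assumed reserved words

# One master pattern with a named group per token class.
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def scan(source):
    """Group the characters of `source` into (class, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        kind, text = m.lastgroup, m.group()
        pos = m.end()
        if kind == "SKIP":
            continue                      # white space separates tokens only
        if kind == "IDENTIFIER" and text in RESERVED:
            kind = "RESERVED"             # reserved words look like identifiers
        tokens.append((kind, text))
    return tokens

print(scan("while (x1) y = 42;"))
```

A production scanner would typically encode the token classes as integers, as the slide notes; strings are used here only for readability.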

Parser
The parser is based on a formal syntax specification such as a context-free grammar (CFG).
It reads tokens and groups them into phrases according to the syntax specification.
The parser verifies correct syntax. If a syntax error is found, it issues a suitable error message.
As syntactic structure is recognized, the parser usually builds an abstract syntax tree (AST) as a concise representation of program structure.

The Type Checker, Translator, Optimizer and Code Generator


The type checker checks the static semantics of each AST node.
If an AST node is semantically correct, it can be translated into IR (intermediate representation) code that correctly implements the meaning of the AST node.
A symbol table is a mechanism that allows information to be associated with identifiers and shared among compiler phases.
The IR code generated by the translator is analyzed and transformed into functionally equivalent but improved IR code by the optimizer.
The IR code produced by the translator is mapped into target machine code by the code generator.

A well-known compiler: GCC [GNU] is a heavily optimizing compiler that can target over thirty computer architectures (such as Intel, Sparc and PowerPC) and has at least six front ends (including C, C++, Fortran, Ada and Java).

Supplementary
From Modern Compiler Design


Compiler Compilers
To obtain the compiler, we run another compiler whose input consists of compiler source text and which will produce executable code for it, as it would for any program source text.
When the source language is also the implementation language and the source text to be compiled is actually a new version of the compiler itself, the process is called bootstrapping.

Compiler Compilers

Compiling and running a compiler

Magic Work
The compiler can work its magic because of two factors:
  The input is in a language and consequently has a structure, which is described in the language manual.
  The semantics of the input is described in terms of, and is attached to, that structure.

Conceptual Structure
Conceptual structure of a compiler


Phases of a compiler

From [ASU]

Why study compiler construction


Compiler construction is very successful
Compiler construction has a wide applicability
  It can be applied to rapidly create read routines for HTML, PostScript, etc.
Compilers contain generally useful algorithms

Demo Compiler
Structure of the demo compiler


General Translation (I)

From [ASU]

General Translation (II)

From [ASU]

Notations
Parsing
Parse Tree
Syntax Analysis
Abstract Syntax Tree (AST)
Annotated Abstract Syntax Tree
  The annotations in a node are also called the attributes of a node.
  Determining them is the task of the context handling module.

Parsing
Syntax trees are also called parse trees
Parsing is also called syntax analysis

Grammar
expression -> expression + term | expression - term | term
term -> term * factor | term / factor | factor
factor -> identifier | constant | ( expression )

Parse Tree
Parse tree for the expression b*b - 4*a*c

Abstract Syntax Tree (AST)


Annotated AST
Examples of annotations are type information and optimization information.
The annotations in a node are also called the attributes of that node.
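To make the idea of annotations concrete, here is a small sketch of an AST node that carries an attribute dictionary, together with a context handling pass that fills in a type attribute. The `Node` class, the `annotate_types` function, the type names `"int"` and `"real"`, and the promotion rule are all assumptions chosen for illustration; the slides do not prescribe a representation.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str                                     # "const", "ident", or an operator
    value: object = None                          # constant value or identifier name
    children: list = field(default_factory=list)
    attrs: dict = field(default_factory=dict)     # the annotations (attributes)

def annotate_types(node, symbol_types):
    """Context handling sketch: set attrs['type'] on every node, bottom-up."""
    for child in node.children:
        annotate_types(child, symbol_types)
    if node.kind == "const":
        node.attrs["type"] = "real" if isinstance(node.value, float) else "int"
    elif node.kind == "ident":
        node.attrs["type"] = symbol_types[node.value]
    else:
        # Assumed rule: an operator yields "real" if any operand is "real".
        child_types = {c.attrs["type"] for c in node.children}
        node.attrs["type"] = "real" if "real" in child_types else "int"
    return node

# AST for 4*a*c, with a and c declared as real in a hypothetical symbol table.
tree = Node("*", children=[
    Node("*", children=[Node("const", 4), Node("ident", "a")]),
    Node("ident", "c"),
])
annotate_types(tree, {"a": "real", "c": "real"})
print(tree.attrs["type"])   # real
```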

Annotated AST


Grammar for demo compiler


Fully parenthesized expression


Lexical analysis for the demo compiler


The tokens in our language are (, ), +, *, and digit
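A lexical analyzer for this token set can be very small, because every token is a single character. The sketch below is one possible shape, not the demo compiler's actual code: the function name `tokenize`, the `("DIGIT", ch)` pair representation, the explicit `EOF` token, and the decision to skip white space are assumptions.

```python
def tokenize(text):
    """Return a list of (token_class, representation) pairs."""
    tokens = []
    for ch in text:
        if ch.isspace():
            continue                      # layout is skipped (an assumption)
        if ch in "()+*":
            tokens.append((ch, ch))       # the character is its own class
        elif ch in "0123456789":
            tokens.append(("DIGIT", ch))
        else:
            raise SyntaxError(f"illegal character {ch!r}")
    tokens.append(("EOF", ""))            # marks the end of the input
    return tokens

print(tokenize("(2*3)"))
```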


Lexical analyzer


Syntax analysis for the demo compiler


Recursive descent parsing

Predictive recursive descent parsing


LL(1) Look-ahead sets


A C template for a grammar rule

P -> A1 A2 ... An | B1 B2 ... | ...
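In predictive recursive descent, each nonterminal becomes one routine, and the routine picks an alternative by inspecting a single look-ahead token. The slides name C as the template language; the sketch below uses Python for brevity. It assumes the demo grammar is `expression -> digit | '(' expression operator expression ')'` and `operator -> '+' | '*'`, which matches "fully parenthesized expression" but is not spelled out in the slides. The nested-tuple AST shape is likewise an assumption.

```python
DIGITS = set("0123456789")
OPERATORS = {"+", "*"}

def parse(text):
    """Parse a fully parenthesized expression into a nested-tuple AST."""
    # None marks the end of the input, so it can never clash with a character.
    chars = [c for c in text if not c.isspace()] + [None]
    pos = 0

    def expect(ch):
        nonlocal pos
        if chars[pos] != ch:
            wanted = "end of input" if ch is None else repr(ch)
            raise SyntaxError(f"expected {wanted} but found {chars[pos]!r}")
        pos += 1

    def expression():
        nonlocal pos
        look = chars[pos]
        if look in DIGITS:                # look-ahead set of alternative 1
            pos += 1
            return ("digit", int(look))
        if look == "(":                   # look-ahead set of alternative 2
            pos += 1
            left = expression()
            op = operator()
            right = expression()
            expect(")")
            return (op, left, right)
        raise SyntaxError(f"expected digit or '(' but found {look!r}")

    def operator():
        nonlocal pos
        look = chars[pos]
        if look in OPERATORS:
            pos += 1
            return look
        raise SyntaxError(f"expected '+' or '*' but found {look!r}")

    tree = expression()
    expect(None)                          # nothing may follow the expression
    return tree

print(parse("(2*((3*4)+9))"))
```

Because the look-ahead sets of the two alternatives of `expression` are disjoint (a digit versus an opening parenthesis), the parser never has to backtrack, which is the LL(1) property.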

Context handling for the demo compiler


Code generation for the demo compiler


A simple stack machine
  PUSH n   Pushes the integer n onto the stack
  ADD      Replaces the topmost two elements by their sum
  MULT     Replaces the topmost two elements by their product
  PRINT    Pops the top element and prints its value

Depth-first scan of the AST
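The depth-first scan can be sketched as a post-order traversal: generate code for both operands first, then emit the operator instruction. The AST representation used here (nested tuples such as `("+", left, right)` and `("digit", n)`) and the function names are assumptions for illustration only.

```python
def generate(node, out):
    """Append stack-machine code for `node` to the list `out` (post-order)."""
    if node[0] == "digit":
        out.append(f"PUSH {node[1]}")
    else:
        op, left, right = node
        generate(left, out)               # code for the left operand
        generate(right, out)              # code for the right operand
        out.append("ADD" if op == "+" else "MULT")
    return out

def compile_expression(ast):
    """Generate code for the whole expression, then print the result."""
    return generate(ast, []) + ["PRINT"]

# Hand-built AST for (2*((3*4)+9))
ast = ("*", ("digit", 2), ("+", ("*", ("digit", 3), ("digit", 4)), ("digit", 9)))
print(" ".join(compile_expression(ast)))
# PUSH 2 PUSH 3 PUSH 4 MULT PUSH 9 ADD MULT PRINT
```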


Code generation results


For the expression (2*((3*4)+9)) the code generator outputs:
  PUSH 2
  PUSH 3
  PUSH 4
  MULT
  PUSH 9
  ADD
  MULT
  PRINT
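To check that this instruction sequence computes the value of the expression, it can be run on a simulated stack machine. The simulator below is a hedged sketch: the function name `run` and the choice to return the printed values as a list (so they can be inspected) are assumptions.

```python
def run(program):
    """Execute stack-machine instructions; return the values PRINT produced."""
    stack, printed = [], []
    for instr in program:
        op, *args = instr.split()
        if op == "PUSH":
            stack.append(int(args[0]))
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MULT":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "PRINT":
            printed.append(stack.pop())
        else:
            raise ValueError(f"unknown instruction {instr!r}")
    return printed

code = ["PUSH 2", "PUSH 3", "PUSH 4", "MULT", "PUSH 9", "ADD", "MULT", "PRINT"]
print(run(code))   # [42]
```

The stack evolves as 2, then 2 3, then 2 3 4, then 2 12, then 2 12 9, then 2 21, and finally 42, which is 2*((3*4)+9).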

Interpretation for the demo compiler


The code generator emits code to have the actions performed by a machine at a later time.
The interpreter performs the actions right away.
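The contrast can be seen in a sketch of an interpreter that walks the AST and computes values directly instead of emitting instructions. As before, the nested-tuple AST shape and the function name `interpret` are assumptions for illustration.

```python
def interpret(node):
    """Evaluate the AST node immediately and return its integer value."""
    if node[0] == "digit":
        return node[1]
    op, left, right = node
    left_value = interpret(left)
    right_value = interpret(right)
    return left_value + right_value if op == "+" else left_value * right_value

# Hand-built AST for (2*((3*4)+9))
ast = ("*", ("digit", 2), ("+", ("*", ("digit", 3), ("digit", 4)), ("digit", 9)))
print(interpret(ast))   # 42
```

The traversal order is the same depth-first scan a code generator would use; only the action taken at each node differs.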


The structure of a more realistic compiler
