Types of compilers
Cross-Compiler: a compiler that runs on machine 'A' and produces
code for another machine 'B'.
Source-to-Source Compiler: a compiler that translates
a source program (code) written in one programming
language into source code in another programming
language.
Native Compiler: a compiler that translates a source
program into object code for the same platform it runs on.
Virtual Machine: A virtual machine is a software
implementation of a machine (for example, a computer)
that executes programs like a physical machine.
2. Language Processing System
A preprocessor is a tool that produces input for compilers.
Its purpose is to process directives.
Directives are specific instructions that start with the # symbol
and end with a newline (NOT a semicolon).
A preprocessor may allow a user to define macros that are
shorthands for longer constructs.
A macro is a rule that defines how an input sequence (e.g.
an identifier) is converted into a replacement output
sequence (e.g. some text).
Example: #define DTU "Debre Tabor University"
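The macro rule above can be illustrated with a small simulation. This is only a sketch in Python of how an object-like macro might be expanded, not how a real C preprocessor is implemented; the function name expand_macros and the sample program are assumptions made for illustration. Unlike a real preprocessor, it does not protect string literals or comments from replacement.

```python
import re

def expand_macros(source: str) -> str:
    """Expand object-like #define macros; each directive occupies one line."""
    macros = {}   # macro name -> replacement text
    output = []
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith("#define"):
            # "#define NAME replacement" ends at the newline, not at a semicolon
            parts = stripped.split(None, 2)
            macros[parts[1]] = parts[2] if len(parts) > 2 else ""
            continue  # the directive itself is not passed on to the compiler

        def substitute(match):
            word = match.group(0)
            return macros.get(word, word)

        # Replace whole identifiers only, so DTU does not match inside DTU2
        output.append(re.sub(r"[A-Za-z_]\w*", substitute, line))
    return "\n".join(output)

program = '#define DTU "Debre Tabor University"\nputs(DTU);'
print(expand_macros(program))  # puts("Debre Tabor University");
```

The directive line disappears from the output, and every later occurrence of the identifier DTU is replaced by the macro's text before the compiler sees the program.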
2.2. COMPILER
2.4. Linker
A linker is a computer program that links and merges
various object files together to make an
executable file.
All these files might have been compiled by separate
assemblers.
The major task of a linker is to search for and
locate the modules/routines referenced in a
program and to determine the memory
locations where this code will be loaded,
so that the program's instructions have
absolute references.
Linking is the last step in
compiling a program.
Source code → Compiler → Assembler → Object code → Linker → Executable file → Loader
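As a rough illustration of the linker's task, the sketch below lays out two object modules in memory and resolves a cross-module reference to an absolute address. It is a toy model in Python: the module format (name, size, defines, refs), the function link, and the base address 0x1000 are all assumptions made for this example and do not correspond to any real object-file format.

```python
def link(modules, base_address=0x1000):
    """Assign load addresses to modules and resolve symbol references.

    Each module is a dict with hypothetical fields:
    "name", "size", "defines" {symbol: offset within module}, "refs" [symbols used].
    """
    symbol_addresses = {}
    next_address = base_address

    # Pass 1: place modules one after another and record absolute symbol addresses
    for module in modules:
        for symbol, offset in module["defines"].items():
            if symbol in symbol_addresses:
                raise ValueError(f"duplicate symbol: {symbol}")
            symbol_addresses[symbol] = next_address + offset
        next_address += module["size"]

    # Pass 2: replace every symbolic reference with an absolute address
    resolved = {}
    for module in modules:
        for symbol in module["refs"]:
            if symbol not in symbol_addresses:
                raise ValueError(f"undefined reference to {symbol}")
            resolved[(module["name"], symbol)] = symbol_addresses[symbol]
    return symbol_addresses, resolved

main_o = {"name": "main.o", "size": 0x40, "defines": {"main": 0x0}, "refs": ["square"]}
math_o = {"name": "math.o", "size": 0x20, "defines": {"square": 0x8}, "refs": []}

addresses, resolved = link([main_o, math_o])
print(hex(addresses["square"]))  # 0x1048
```

Here main.o occupies 0x1000 to 0x103F, so math.o starts at 0x1040 and its routine square, at offset 0x8, ends up at 0x1048. The call from main.o is resolved to that address.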
2.5. Loader
The first phase of a compiler, lexical analysis (also called scanning), works as a text scanner.
This phase reads the source code as a stream of characters and groups the characters into
meaningful lexemes, which it passes on as tokens.
The scanner begins the analysis of the source program by reading the input
text character by character and grouping individual characters into tokens
(identifiers, integers, reserved words, delimiters, and so on).
The scanner does the following.
It puts the program into a compact and uniform format (a stream of tokens).
It eliminates unneeded information (such as comments).
It processes compiler control directives (for example, including source text from a
file).
It sometimes enters preliminary information into symbol tables (for example, to
register the presence of a particular label or identifier).
Examples of Tokens:
a. Keywords: while, if, void, int, float, for, …
b. Identifiers: declared by the programmer
c. Operators: +, -, *, /, =, ==, <, >, <=, >=, …
d. Numeric constants: numbers such as 124, 12.35, 0.09E-23, etc.
e. Character constants: single characters or strings of characters enclosed in
quotes.
f. Special characters: characters used as delimiters such as ( ) , ; :
Example: Show the token classes (types) produced by the lexical analysis phase
for the following C++ source input:
a) position = initial + rate * 60 ;
b) sum = sum + unit * /* accumulate sum */ 1.2e-12 ;
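One possible way to work through this example is with a small regular-expression scanner. The sketch below is written in Python and is only an illustration: the token class names (KEYWORD, IDENT, OPERATOR, NUMBER, SPECIAL) and the function tokenize are assumptions chosen to mirror the token list above, and only /* ... */ comments are handled.

```python
import re

KEYWORDS = {"while", "if", "void", "int", "float", "for"}

# Order matters: COMMENT must be tried before OPERATOR so "/*" is not read as "/"
TOKEN_SPEC = [
    ("COMMENT",  r"/\*.*?\*/"),
    ("NUMBER",   r"\d+(?:\.\d+)?(?:[eE][+-]?\d+)?"),
    ("IDENT",    r"[A-Za-z_]\w*"),
    ("OPERATOR", r"==|<=|>=|[+\-*/=<>]"),
    ("SPECIAL",  r"[(),;:]"),
    ("SKIP",     r"\s+"),
    ("MISMATCH", r"."),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
                    re.DOTALL)

def tokenize(source):
    """Return a list of (token class, lexeme) pairs for the given source text."""
    tokens = []
    for match in MASTER.finditer(source):
        kind, lexeme = match.lastgroup, match.group()
        if kind in ("SKIP", "COMMENT"):
            continue  # whitespace and comments are eliminated by the scanner
        if kind == "MISMATCH":
            raise SyntaxError(f"unexpected character {lexeme!r}")
        if kind == "IDENT" and lexeme in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, lexeme))
    return tokens

tokens_a = tokenize("position = initial + rate * 60 ;")
tokens_b = tokenize("sum = sum + unit * /* accumulate sum */ 1.2e-12 ;")
for kind, lexeme in tokens_a:
    print(kind, lexeme)
```

For input a) the scanner yields IDENT position, OPERATOR =, IDENT initial, OPERATOR +, IDENT rate, OPERATOR *, NUMBER 60, SPECIAL ;. For input b) the comment is discarded, so the token stream has the same shape, ending in NUMBER 1.2e-12 and SPECIAL ;.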
3.2. Syntax Analysis (The parser)
Semantic analysis checks whether the parse tree constructed follows the rules of
the language.
The semantic analyzer keeps track of identifiers, their types, and expressions, and checks,
for example, whether identifiers are declared before use.
The semantic analyzer produces an annotated syntax tree as output.
The type checker checks the static semantics of each AST node.
Example: Draw an Attributed AST for position = initial + rate * 60
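A possible answer can be sketched in code. The Python example below builds the AST for the statement and attaches a type attribute to every node. It assumes, since the source does not say, that position, initial and rate are all declared as float. The Node class, the annotate function and the inttofloat conversion node are names invented for this illustration.

```python
from dataclasses import dataclass

@dataclass
class Node:
    kind: str             # "id", "num", "binop", "assign" or "inttofloat"
    value: str = ""
    children: tuple = ()
    dtype: str = ""       # type attribute, filled in by the type checker

def annotate(node, symbols):
    """Attach a type to every node, inserting int-to-float conversions where needed."""
    if node.kind == "id":
        if node.value not in symbols:
            raise NameError(f"undeclared identifier: {node.value}")
        node.dtype = symbols[node.value]
    elif node.kind == "num":
        is_float = "." in node.value or "e" in node.value.lower()
        node.dtype = "float" if is_float else "int"
    else:  # "binop" or "assign": both have exactly two children
        left, right = [annotate(child, symbols) for child in node.children]
        # Only an operand of an arithmetic operator may be widened on the left side
        if node.kind == "binop" and left.dtype == "int" and right.dtype == "float":
            left = Node("inttofloat", children=(left,), dtype="float")
        if left.dtype == "float" and right.dtype == "int":
            right = Node("inttofloat", children=(right,), dtype="float")
        if left.dtype != right.dtype:
            raise TypeError(f"incompatible types for {node.value}")
        node.children = (left, right)
        node.dtype = left.dtype
    return node

# Assumed declarations: float position, initial, rate;
symbols = {"position": "float", "initial": "float", "rate": "float"}

product = Node("binop", "*", (Node("id", "rate"), Node("num", "60")))
total = Node("binop", "+", (Node("id", "initial"), product))
tree = annotate(Node("assign", "=", (Node("id", "position"), total)), symbols)
print(product.children[1].kind)  # inttofloat
```

Under these assumptions every node of the tree is typed float, and the integer constant 60 is wrapped in an inttofloat node before it is multiplied by rate.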
Semantic errors:
Undeclared identifier
Multiply declared identifier
Index out of bounds
Wrong number or types of arguments in a call
Incompatible types for an operation
Break statement outside a switch/loop
Goto with no label
Etc.
3.4. Intermediate Code Generation
The symbol table is a data structure maintained throughout all the phases of a compiler.
All the identifiers' names along with their types are stored here.
The symbol table makes it easier for the compiler to quickly search for an identifier's record and
retrieve it.
The symbol table is also used for scope management.
A symbol table is a mechanism that allows information to be associated with identifiers and
shared among compiler phases.
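The description above can be made concrete with a small sketch. The Python class below is one simple way a scoped symbol table might be organized, as a stack of dictionaries. The class and method names (SymbolTable, declare, lookup, enter_scope, exit_scope) are assumptions made for illustration. Real compilers usually store more than a type per identifier.

```python
class SymbolTable:
    """A stack of scopes; each scope maps an identifier name to its type."""

    def __init__(self):
        self.scopes = [{}]          # scopes[0] is the global scope

    def enter_scope(self):
        self.scopes.append({})      # e.g. on entering a block or function body

    def exit_scope(self):
        self.scopes.pop()           # names declared in the block are forgotten

    def declare(self, name, type_name):
        current = self.scopes[-1]
        if name in current:
            raise NameError(f"multiply declared identifier: {name}")
        current[name] = type_name

    def lookup(self, name):
        # Search from the innermost scope outwards
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        raise NameError(f"undeclared identifier: {name}")

table = SymbolTable()
table.declare("rate", "float")      # global declaration
table.enter_scope()
table.declare("rate", "int")        # inner declaration hides the outer one
inner = table.lookup("rate")
table.exit_scope()
outer = table.lookup("rate")
print(inner, outer)  # int float
```

Dictionary lookup gives the quick search the text mentions, and the scope stack handles scope management. The two NameError cases correspond to the "undeclared identifier" and "multiply declared identifier" semantic errors.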