
NAME- PRATHAM RAI

ROLL NO –18700220055

STREAM- IT

SEMESTER- 5th

INSTITUTION – TECHNO INTERNATIONAL NEWTOWN

COMPILER DESIGN
CONTENT
 Introduction
 Steps to Convert HLL to Machine Code
 Phases of Compiler
 Role of Lexical Analyzer
 Lexemes, Patterns and Tokens
 Types of Tokens
 Regular Expression of Tokens
 Finite Automata
INTRODUCTION
 A compiler is a program that converts source code into
object code. In other words, it translates a high-level
language into machine (binary) language. This step is
necessary to make the program executable, because the
computer understands only binary language.
 Some compilers translate the high-level language into
assembly language as an intermediate step, whereas others
translate it directly to machine code. This process of
converting source code into machine code is called
compilation. Let us learn more about it in detail.
STEPS TO CONVERT HLL TO MACHINE CODE
PHASES OF COMPILERS
PHASES EXPLANATION WITH EXAMPLES
ROLE OF LEXICAL ANALYZER
 The lexical analyzer separates the characters of the source program into
groups that logically belong together, called tokens. A token consists of a
token name, which is an abstract symbol that defines a type of lexical
unit, and an optional attribute value, called the token value. Tokens can
be identifiers, keywords, constants, operators, and punctuation
symbols such as commas and parentheses. A rule that describes the
set of input strings for which the same token is produced as output is
called a pattern.
 The lexical analyzer also handles tasks such as stripping out
comments and whitespace (tab, newline, blank, and other characters
that are used to separate tokens in the input). It also correlates the
error messages generated by the compiler with the source program.
 For example, it can keep track of the newline characters seen so far so
that it can associate a line number with each error message. If the
source program uses a macro preprocessor, the lexical analyzer may
also perform the expansion of macros.
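The tasks described above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular compiler does it: the token names, the patterns in TOKEN_SPEC, the "//" comment syntax and the tokenize function are all assumptions made for a tiny C-like language.

```python
import re

# Hypothetical token patterns for a tiny C-like language (illustrative only).
# Order matters: comments are tried before the "/" operator, keywords before identifiers.
TOKEN_SPEC = [
    ("COMMENT",     r"//[^\n]*"),                          # line comment, discarded
    ("NEWLINE",     r"\n"),                                # counted for line tracking
    ("WHITESPACE",  r"[ \t]+"),                            # blanks and tabs, discarded
    ("NUMBER",      r"[0-9]+"),
    ("KEYWORD",     r"\b(?:int|if|else|while|return)\b"),
    ("IDENTIFIER",  r"[A-Za-z][A-Za-z0-9]*"),
    ("OPERATOR",    r"[+\-*/=<>]"),
    ("PUNCTUATION", r"[(),;{}]"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return (tokens, errors); each token is (token name, lexeme, line number)."""
    tokens, errors = [], []
    line, pos = 1, 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            # Relate the error message to the line where the character occurs.
            errors.append(f"line {line}: unexpected character {source[pos]!r}")
            pos += 1
            continue
        kind = match.lastgroup
        if kind == "NEWLINE":
            line += 1
        elif kind not in ("COMMENT", "WHITESPACE"):
            tokens.append((kind, match.group(), line))
        pos = match.end()
    return tokens, errors


tokens, errors = tokenize("int x = 10; // set x\ny = x @ 2;\n")
```

Here the comment and the whitespace never reach the token list, and the stray "@" is reported against line 2 because the newline on line 1 was counted.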
LEXEMES, PATTERNS AND TOKENS
 Lexemes – A lexeme is a sequence of characters in the
source text that is matched by the pattern for a token. It
is an actual instance of that token, such as a particular
number.
 Patterns – A pattern is a rule that describes the set of
strings associated with a token. It is usually expressed as
a regular expression and describes how a particular token
can be formed. The pattern matches each string in the set.
 Tokens – A token is a group of characters having a
collective meaning, typically a word or punctuation mark,
separated out by the lexical analyser and passed to the
parser.
TYPES OF TOKENS
REGULAR EXPRESSION OF TOKENS
Regular expressions are an important notation for specifying patterns. Each pattern
matches a set of strings, so regular expressions serve as names for a set of strings.
Programming language tokens can be described by regular languages. The specification
of regular expressions is an example of a recursive definition.
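Notations
Union : | Separates alternative possibilities.
Concatenation : . Written between two patterns (or implied by placing them side by side); matches the first pattern followed by the second. In many regular expression tools a dot instead matches any character except a newline, and within square brackets the dot is literal.
Closure : + Matches the preceding pattern element one or more times.
* Matches the preceding pattern element zero or more times.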
Digits – digit(digit)*
Letter – A|B|…|Z|a|b|…|z
Identifiers – (letter).(letter|digit)*
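The three definitions above can be tried out with Python's re module. This is a rough sketch under a few assumptions: the names DIGIT, LETTER, DIGITS, IDENTIFIER and matches are made up for the example, character classes stand in for the long unions, and concatenation is written by placing patterns side by side, because in Python a dot means "any character".

```python
import re

# Character classes are shorthand for the unions 0|1|...|9 and A|...|Z|a|...|z.
DIGIT = r"[0-9]"
LETTER = r"[A-Za-z]"

# Digits      - digit(digit)*
DIGITS = f"{DIGIT}{DIGIT}*"
# Identifiers - letter(letter|digit)*
IDENTIFIER = f"{LETTER}(?:{LETTER}|{DIGIT})*"


def matches(pattern, text):
    """True when the whole string, not just a prefix, is in the pattern's set."""
    return re.fullmatch(pattern, text) is not None


print(matches(DIGITS, "2024"))        # True
print(matches(IDENTIFIER, "count1"))  # True
print(matches(IDENTIFIER, "1count"))  # False: must start with a letter
```

re.fullmatch is used so that a string counts as a match only if all of it fits the pattern, which is what "the pattern matches each string in the set" requires.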
FINITE AUTOMATA
 A finite automaton is a state machine that takes a string of symbols as input and changes
its state accordingly. Finite automata are recognizers for regular expressions. When an
input string is fed into a finite automaton, it changes its state for each
symbol. If the input string is processed successfully and the automaton ends in a final
state, the string is accepted, i.e., it is a valid token of the
language at hand.
 The mathematical model of a finite automaton consists of:
 Finite set of states (Q)
 Finite set of input symbols (Σ)
 One Start state (q0)
 Set of final states (qf)
 Transition function (δ)
 The transition function (δ) maps a state from Q and an input symbol from Σ to the
next state, δ : Q × Σ ➔ Q
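The five components can be written down directly as Python data. The sketch below is one possible deterministic automaton for identifiers of the form letter(letter|digit)*. The state names q0 and q1, the helper symbol_class and the function accepts are assumptions made for illustration. Input characters are grouped into the classes "letter", "digit" and "other" to keep Σ small, and any missing entry in DELTA is treated as a move to a rejecting dead state.

```python
def symbol_class(ch):
    """Group raw characters into the input symbols used by the automaton."""
    if ch.isascii() and ch.isalpha():
        return "letter"
    if ch.isascii() and ch.isdigit():
        return "digit"
    return "other"


STATES = {"q0", "q1"}                    # Q
ALPHABET = {"letter", "digit", "other"}  # Sigma (as symbol classes)
START = "q0"                             # q0
FINAL = {"q1"}                           # set of final states
DELTA = {                                # delta : Q x Sigma -> Q
    ("q0", "letter"): "q1",
    ("q1", "letter"): "q1",
    ("q1", "digit"): "q1",
}


def accepts(text):
    """Run the automaton over text and report whether it ends in a final state."""
    state = START
    for ch in text:
        state = DELTA.get((state, symbol_class(ch)))
        if state is None:  # no transition defined: reject
            return False
    return state in FINAL


print(accepts("x1"))  # True
print(accepts("1x"))  # False
print(accepts(""))    # False: the start state is not final
```

Each loop iteration is one application of δ to the current state and the next input symbol, and acceptance is decided only after the whole string has been consumed.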
THANK YOU
