0% found this document useful (0 votes)

82 views9 pages

Lexical Analysis in Compiler Design

The document discusses the lexical analysis phase of compilers. It describes how the lexical analyzer reads input characters and produces tokens. The lexical analyzer works with the parser, passing tokens to it using the nexttoken() function. Regular expressions are used to specify token patterns. The lexical analyzer implements these patterns to recognize tokens. It is an important early phase that breaks down code into basic elements to simplify later processing.

Uploaded by

shrutika jori

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

82 views9 pages

Lexical Analysis in Compiler Design

Uploaded by

shrutika jori

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 9

Review: Compiler Phases:

Source program

Lexical analyzer Front End

Syntax analyzer
Symbol table
manager Semantic analyzer Error handler

Intermediate code generator

Code optimizer
Backend
Code generator

gsn.comp@coeptech.ac.in
Chapter 3: Lexical Analysis
• Lexical analyzer: reads input characters and produces a
sequence of tokens as output (nexttoken()).
– Trying to understand each element in a program.
– Token: a group of characters having a collective meaning.
const pi = 3.14159;

Token 1: (const, -)
Token 2: (identifier, ‘pi’)
Token 3: (=, -)
Token 4: (realnumber, 3.14159)
Token 5: (;, -)

gsn.comp@coeptech.ac.in
Interaction of Lexical analyzer
with parser

token
Source Lexical parser
program analyzer Nexttoken()

symbol
table

gsn.comp@coeptech.ac.in
• Some terminology:
– Token: a group of characters having a collective
meaning. A lexeme is a particular instant of a token.
• E.g. token: identifier, lexeme: pi, etc.
– pattern: the rule describing how a token can be formed.
• E.g: identifier: ([a-z]|[A-Z]) ([a-z]|[A-Z]|[0-9])*

• Lexical analyzer does not have to be an individual

phase. But having a separate phase simplifies the
design and improves the efficiency and
portability.
gsn.comp@coeptech.ac.in
• Two issues in lexical analysis.
– How to specify tokens (patterns)?
– How to recognize the tokens giving a token specification (how to
implement the nexttoken() routine)?

• How to specify tokens:

– all the basic elements in a language must be
tokens so that they can be recognized.
main() {
int i, j;
for (I=0; I<50; I++) {
printf(“I = %d”, I);
}
}
• Token types: constant, identifier, reserved word, operator and
misc. symbol.
– Tokens are specified by regular expressions.
gsn.comp@coeptech.ac.in
• Some definitions
– alphabet : a finite set of symbols. E.g. {a, b, c}
– A string over an alphabet is a finite sequence of symbols drawn
from that alphabet (sometimes a string is also called a sentence or a
word).
– A language is a set of strings over an alphabet.
– Operation on languages (a set):
• union of L and M, L U M = {s|s is in L or s is in M}
• concatenation of L and M
LM = {st | s is in L and t is in M}
i
L
• Kleene closure of L,

• Positive closure of L, i
– Example: L
• L={aa, bb, cc}, M = {abc}

gsn.comp@coeptech.ac.in
• Formal definition of Regular expression:f
• Given an alphabet ∑ ,
• (1) ε is a regular expression that denote { ε }, the
set that contains the empty string.
• (2) For each a∈ ∑ , a is a regular expression
denote {a}, the set containing the string a.
• (3) r and s are regular expressions denoting the
language (set) L(r ) and L(s ). Then
– ( r ) | ( s ) is a regular expression denoting L( r ) U L( s )
– ( r ) ( s ) is a regular expression denoting L( r ) L ( s )
– ( r )* is a regular expression denoting (L ( r )) *

• Regular expression is defined together with the

language it denotes.
gsn.comp@coeptech.ac.in
• Examples:
– let ∑ ={ a,b }
a|b
(a | b) (a | b)
a*
(a | b)*
a | a*b

– We assume that ‘*’ has the highest precedence and is

left associative. Concatenation has second highest
precedence and is left associative and ‘|’ has the lowest
precedence and is left associative
• (a) | ((b)*(c ) ) = a | b*c
gsn.comp@coeptech.ac.in
• Regular definition.
– gives names to regular expressions to construct more complicate
regular expressions.
d1 -> r1
d2 ->r2
…
dn ->rn
– example:
letter -> A | B | C | … | Z | a | b | …. | z
digit -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
identifier -> letter (letter | digit) *

• more examples: integer constant, string constants, reserved

words, operator, real constant.

gsn.comp@coeptech.ac.in

Lect2 Lexical
No ratings yet
Lect2 Lexical
9 pages
ch3 M.PPTX - 0
No ratings yet
ch3 M.PPTX - 0
46 pages
Lexical Analyzer
No ratings yet
Lexical Analyzer
34 pages
Chapter 2 - Lexical Analysis
100% (1)
Chapter 2 - Lexical Analysis
69 pages
Lexical Analyzer 1
No ratings yet
Lexical Analyzer 1
37 pages
Specification of Tokens
No ratings yet
Specification of Tokens
21 pages
Language About Complier Construction
No ratings yet
Language About Complier Construction
23 pages
Lexical Analysis
No ratings yet
Lexical Analysis
41 pages
Chapter THREE
No ratings yet
Chapter THREE
24 pages
Lec2 LexicalAnalyser
No ratings yet
Lec2 LexicalAnalyser
30 pages
Lexical Analysis
No ratings yet
Lexical Analysis
57 pages
Compiler Lexical Analysis Guide
No ratings yet
Compiler Lexical Analysis Guide
56 pages
Unit22pdf 2021 03 13 13 38 11
No ratings yet
Unit22pdf 2021 03 13 13 38 11
114 pages
Compilers - Week 2
No ratings yet
Compilers - Week 2
14 pages
CD ch2
No ratings yet
CD ch2
104 pages
Compiler Design Chapter-2
60% (5)
Compiler Design Chapter-2
105 pages
2 - Lexical Analysis
No ratings yet
2 - Lexical Analysis
52 pages
FALLSEM2025-26 BCSE307L TH VL2025260101614 2025-07-16 Reference-Material-I
No ratings yet
FALLSEM2025-26 BCSE307L TH VL2025260101614 2025-07-16 Reference-Material-I
20 pages
Module 3
No ratings yet
Module 3
7 pages
2 Lex
No ratings yet
2 Lex
45 pages
Lexical Analyzer 2023
No ratings yet
Lexical Analyzer 2023
38 pages
Chapter2-Lexical Analysis
No ratings yet
Chapter2-Lexical Analysis
64 pages
Compiler Lexical Analysis Guide
No ratings yet
Compiler Lexical Analysis Guide
65 pages
Lexical Analysis Techniques and Implementation
No ratings yet
Lexical Analysis Techniques and Implementation
44 pages
Chapter 3 - Lexical Analysis
No ratings yet
Chapter 3 - Lexical Analysis
62 pages
Chapter 7 Lexical Analysis
No ratings yet
Chapter 7 Lexical Analysis
61 pages
Chapter-2 Compiler Design
No ratings yet
Chapter-2 Compiler Design
98 pages
Compiler Construction: Scanning & Tokens
No ratings yet
Compiler Construction: Scanning & Tokens
72 pages
Chapter 2
No ratings yet
Chapter 2
56 pages
Introduction to Alphabets and Regular Expressions
No ratings yet
Introduction to Alphabets and Regular Expressions
21 pages
rkCD-Chapter 2 - LEXICAL ANALYSIS
No ratings yet
rkCD-Chapter 2 - LEXICAL ANALYSIS
9 pages
Lexical Analysis: Tokens & Patterns Explained
No ratings yet
Lexical Analysis: Tokens & Patterns Explained
77 pages
2 - Compilers (Lexical Analysis)
No ratings yet
2 - Compilers (Lexical Analysis)
60 pages
1st Phase Lexical Analyzer
No ratings yet
1st Phase Lexical Analyzer
33 pages
Scanner (Lexical Analyzer) : The Structure of A Compiler
No ratings yet
Scanner (Lexical Analyzer) : The Structure of A Compiler
109 pages
Ch3 - Lexical Analysis
No ratings yet
Ch3 - Lexical Analysis
52 pages
Regular Expressions
No ratings yet
Regular Expressions
4 pages
CS321 Languages and Compiler Design I: Winter 2012
No ratings yet
CS321 Languages and Compiler Design I: Winter 2012
4 pages
CH 3 Myppt
No ratings yet
CH 3 Myppt
59 pages
Compiler-Lexical Analysis
100% (1)
Compiler-Lexical Analysis
59 pages
Pcdunit2 Continuation
No ratings yet
Pcdunit2 Continuation
26 pages
Specification of Tokens
No ratings yet
Specification of Tokens
20 pages
Lexical Analyzer in Perspective: Parser Source Program Token
No ratings yet
Lexical Analyzer in Perspective: Parser Source Program Token
22 pages
Intro to Regular Expressions
No ratings yet
Intro to Regular Expressions
27 pages
Regular Expression: Anab Batool Kazmi
No ratings yet
Regular Expression: Anab Batool Kazmi
32 pages
Lexical Analysis
No ratings yet
Lexical Analysis
44 pages
Regular Expressions in Compiler Design
No ratings yet
Regular Expressions in Compiler Design
44 pages
Lexical Analysis: CD: Compiler Design
No ratings yet
Lexical Analysis: CD: Compiler Design
122 pages
Compiler
No ratings yet
Compiler
60 pages
Chapter 2
No ratings yet
Chapter 2
99 pages
CC Unit-1
No ratings yet
CC Unit-1
143 pages
Chapter 2
No ratings yet
Chapter 2
91 pages
CD Unit-2
No ratings yet
CD Unit-2
64 pages
CD Unit-2
No ratings yet
CD Unit-2
64 pages
Lexical Analysis
No ratings yet
Lexical Analysis
73 pages
Lexical Scanner Overview
No ratings yet
Lexical Scanner Overview
31 pages
DS Unit 2
No ratings yet
DS Unit 2
32 pages
Introduction to Data Science Lecture Notes
100% (5)
Introduction to Data Science Lecture Notes
133 pages
Computer Problem Solving Course Guide
No ratings yet
Computer Problem Solving Course Guide
2 pages
Operating System Server: Dining Philosophers Problem Using Semaphores
No ratings yet
Operating System Server: Dining Philosophers Problem Using Semaphores
20 pages
Unit II Data Stuctures Arrays Searching Sorting Dipalee D Rane COMP DYPCOE DDR
No ratings yet
Unit II Data Stuctures Arrays Searching Sorting Dipalee D Rane COMP DYPCOE DDR
27 pages
Memory Safety
No ratings yet
Memory Safety
78 pages
Access Exercise
0% (1)
Access Exercise
3 pages
Intro to CS for Educators
100% (1)
Intro to CS for Educators
26 pages
Write A Program To Make A Calculator.: C++ Programs
No ratings yet
Write A Program To Make A Calculator.: C++ Programs
1 page
HQL & HCQL for Java Developers
No ratings yet
HQL & HCQL for Java Developers
20 pages
System 36 RPG II UserGuide Reference
No ratings yet
System 36 RPG II UserGuide Reference
914 pages
JBDL 64
No ratings yet
JBDL 64
95 pages
Euskalhack 2024 Coop
No ratings yet
Euskalhack 2024 Coop
35 pages
Expanded Explanations Request
No ratings yet
Expanded Explanations Request
51 pages
Data Structure Interview Questions PDF
No ratings yet
Data Structure Interview Questions PDF
4 pages
Unit 5 Notes Java-New Syllabus
No ratings yet
Unit 5 Notes Java-New Syllabus
45 pages
Oracle Searching Matching
No ratings yet
Oracle Searching Matching
27 pages
Unit 1 Java by Multiatoms
0% (1)
Unit 1 Java by Multiatoms
62 pages
OS Process Synchronization, Deadlocks
100% (4)
OS Process Synchronization, Deadlocks
30 pages
DataFlowAnalysis v1
No ratings yet
DataFlowAnalysis v1
31 pages
Module 1
No ratings yet
Module 1
23 pages
OpenSplice DDS Finance
No ratings yet
OpenSplice DDS Finance
69 pages
Motivational Letter Samples
No ratings yet
Motivational Letter Samples
6 pages
Introduction to GNUSim8085 Simulator
No ratings yet
Introduction to GNUSim8085 Simulator
13 pages
Operating Systems Exam 2020
No ratings yet
Operating Systems Exam 2020
3 pages
Harshit Vishwakarma's Tech Profile
No ratings yet
Harshit Vishwakarma's Tech Profile
1 page
ICSE Paper 2005: Computer Applications
No ratings yet
ICSE Paper 2005: Computer Applications
10 pages
Nitesh Kumar:: Email: Mobile
No ratings yet
Nitesh Kumar:: Email: Mobile
4 pages
Programming Logic and Design MCQ PDF
No ratings yet
Programming Logic and Design MCQ PDF
17 pages
Java Class Loading Explained
No ratings yet
Java Class Loading Explained
4 pages

Lexical Analysis in Compiler Design

Uploaded by

Lexical Analysis in Compiler Design

Uploaded by

Review: Compiler Phases:

Lexical analyzer Front End

Intermediate code generator

• Lexical analyzer does not have to be an individual

• How to specify tokens:

• Regular expression is defined together with the

– We assume that ‘*’ has the highest precedence and is

• more examples: integer constant, string constants, reserved

You might also like