Lahore Garrison University


CSC373-Compiler Construction
Week-6 Lecture-12
Semester-#5 Fall 2019
Prepared by: Eisha Tir Razia


Overview of previous lecture

• Conversion of a DFA into a minimized DFA

Preamble of each lecture

• Tokens, Patterns
• Lexemes
• Identifiers
• Use of Regular Expressions in Lexical Analysis

Lecture Outcomes

• Understanding of:
  • Tokens, patterns, lexemes, and identifiers
  • Use of regular expressions in lexical analysis


How to Describe Tokens?

• Regular languages are the most popular way of specifying tokens:
  • Simple and useful theory
  • Easy to understand
  • Efficient implementations


Languages

• Let Σ be a set of characters. Σ is called the alphabet.
• A language over the alphabet Σ is a set of strings of characters drawn from Σ.
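
As a small illustration of these two definitions, the sketch below treats an alphabet as a Python set of characters and a language as a collection of strings over it. The names `strings_up_to` and `ends_in_b` are hypothetical, chosen only for this example, and the enumeration is cut off at a fixed length because most interesting languages are infinite.

```python
from itertools import product

def strings_up_to(alphabet, max_len):
    """Return all strings over `alphabet` of length 0..max_len, shortest first."""
    symbols = sorted(alphabet)
    result = []
    for n in range(max_len + 1):
        for combo in product(symbols, repeat=n):
            result.append("".join(combo))
    return result

def ends_in_b(s):
    """Membership test for one example language: strings that end in 'b'."""
    return s.endswith("b")

alphabet = {"a", "b"}
universe = strings_up_to(alphabet, 2)              # every string of length <= 2
language = [s for s in universe if ends_in_b(s)]   # the part of the language in that range

print(universe)
print(language)
```

A language is just a subset of all possible strings over the alphabet; here the subset is picked out by a membership test rather than listed by hand.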


Example of Languages

Alphabet = English characters
Language = English sentences

Alphabet = ASCII
Language = C++, Java, or C# programs


Notation

• Languages are sets of strings (finite sequences of characters).
• We need some notation for specifying which sets we want.


Notation

• For lexical analysis we care about regular languages.
• Regular languages can be described using regular expressions.


Regular Languages

• Each regular expression is a notation for a regular language (a set of words).
• If A is a regular expression, we write L(A) to refer to the language denoted by A.


Regular Expression

• A regular expression (RE) is defined inductively. The base cases are:

a    an ordinary character from the alphabet Σ
ε    the empty string


Regular Expression

R|S = either R or S (union)
RS = R followed by S (concatenation)
R* = concatenation of R zero or more times
     (R* = ε | R | RR | RRR | ...)
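
To make the three operators concrete, the following sketch computes the language of an expression as a Python set of strings. It is only an illustration under stated assumptions: the function names `union`, `concat`, and `star` are hypothetical, and a length bound `max_len` is imposed because R* generally denotes an infinite set.

```python
def union(r, s):
    """L(R|S): strings that are in either language."""
    return r | s

def concat(r, s, max_len):
    """L(RS): a string from R followed by a string from S (length-bounded)."""
    return {x + y for x in r for y in s if len(x + y) <= max_len}

def star(r, max_len):
    """L(R*): zero or more strings from R concatenated (length-bounded)."""
    result = {""}      # zero repetitions gives the empty string
    frontier = {""}    # strings found in the previous round
    while frontier:
        # Append one more R to each new string; keep only unseen results.
        frontier = concat(frontier, r, max_len) - result
        result |= frontier
    return result

a, b = {"a"}, {"b"}    # base case: a single ordinary character
MAX_LEN = 4

print(sorted(union(a, b)))
print(sorted(concat(a, b, MAX_LEN)))
print(sorted(star(concat(a, b, MAX_LEN), MAX_LEN), key=len))
```

Each function mirrors one line of the definition, so building a larger expression is just a matter of nesting the calls, as in the last line for (ab)*.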


RE Extensions

R? = ε | R (zero or one R)
R+ = RR* (one or more R)
(R) = R (grouping)


RE Extensions

[abc] = a|b|c (any of the listed characters)
[a-z] = a|b|...|z (range)
[^ab] = c|d|... (anything but 'a' or 'b')
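
The extensions add convenience, not expressive power: each one can be rewritten with the basic operators. The sketch below checks this experimentally with Python's standard `re` module, comparing the strings accepted by each extension and by its expansion over a small test alphabet. The helper `accepted` and the three-letter alphabet are assumptions made for this example.

```python
import re
from itertools import product

def accepted(pattern, alphabet="abc", max_len=3):
    """Return all strings of length <= max_len that `pattern` matches completely."""
    out = set()
    for n in range(max_len + 1):
        for combo in product(alphabet, repeat=n):
            s = "".join(combo)
            if re.fullmatch(pattern, s):
                out.add(s)
    return out

pairs = [
    ("a?", "(|a)"),       # R? = empty string | R
    ("a+", "aa*"),        # R+ = RR*
    ("[abc]", "a|b|c"),   # any of the listed characters
    ("[a-c]", "a|b|c"),   # range
    ("[^ab]", "c"),       # anything but 'a' or 'b'; only 'c' remains in this alphabet
]

for extension, expansion in pairs:
    print(extension, expansion, accepted(extension) == accepted(expansion))
```

A bounded comparison like this is evidence rather than proof, but it shows how each shorthand behaves on concrete input.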


Regular Expression

RE        Strings in L(R)
a         "a"
ab        "ab"
a|b       "a", "b"
(ab)*     "", "ab", "abab", ...
(a|ε)b    "ab", "b"


Example: integers

• integer: a non-empty string of digits
• digit = '0'|'1'|'2'|'3'|'4'|'5'|'6'|'7'|'8'|'9'
• integer = digit digit*
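
The definition above translates almost directly into a pattern for Python's `re` module. This is a minimal sketch: the constant names and the function `is_integer` are hypothetical, and the character class `[0-9]` is used as shorthand for the ten-way alternation of digits.

```python
import re

DIGIT = "[0-9]"                 # digit = '0'|'1'|...|'9'
INTEGER = DIGIT + DIGIT + "*"   # integer = digit digit*

def is_integer(lexeme):
    """Return True if the whole lexeme is a non-empty string of digits."""
    return re.fullmatch(INTEGER, lexeme) is not None

for lexeme in ["0", "2019", "", "12a", "-5"]:
    print(repr(lexeme), is_integer(lexeme))
```

Note that the empty string is rejected because the pattern requires one digit before the starred part, which is exactly what "non-empty" means here.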


Example: identifiers

• identifier: a string of letters or digits, starting with a letter
• C identifier: [a-zA-Z_][a-zA-Z0-9_]*
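
To connect patterns, lexemes, and tokens, the sketch below classifies a lexeme by testing it against a list of token patterns. It is an illustration only: the token names `IDENTIFIER` and `INTEGER` and the function `classify` are hypothetical, and a real lexical analyzer would also handle keywords, operators, and whitespace.

```python
import re

# Each entry pairs a token name with the pattern that describes its lexemes.
TOKEN_PATTERNS = [
    ("IDENTIFIER", re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")),  # C identifier
    ("INTEGER", re.compile(r"[0-9][0-9]*")),                # digit digit*
]

def classify(lexeme):
    """Return the name of the first token whose pattern matches the whole lexeme."""
    for name, pattern in TOKEN_PATTERNS:
        if pattern.fullmatch(lexeme):
            return name
    return None  # the lexeme matches no known pattern

for lexeme in ["count", "_tmp1", "42", "9lives", ""]:
    print(repr(lexeme), classify(lexeme))
```

The lexeme "9lives" is rejected by both patterns: it cannot be an identifier because it starts with a digit, and it cannot be an integer because it contains letters.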


How to Use REs

• We need a mechanism to determine whether an input string w belongs to L(R), the language denoted by the regular expression R.


Finite Automata (FA)

• Specification: regular expressions
• Implementation: finite automata


Finite Automata

A finite automaton consists of:
• An input alphabet (Σ)
• A set of states
• A start (initial) state
• A set of transitions
• A set of accepting (final) states
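
The five components can be written down directly as a data structure. The sketch below is one possible representation, not a standard API: the class `FiniteAutomaton`, its field names, and the state names `start` and `digits` are all assumptions made for illustration. The example instance describes an automaton for integer = digit digit*.

```python
from dataclasses import dataclass

@dataclass
class FiniteAutomaton:
    alphabet: set       # input alphabet
    states: set         # set of states
    start: str          # start (initial) state
    transitions: dict   # maps (state, character) to the next state
    accepting: set      # accepting (final) states

    def is_well_formed(self):
        """Check that every component refers only to declared states and characters."""
        if self.start not in self.states:
            return False
        if not self.accepting <= self.states:
            return False
        for (state, ch), target in self.transitions.items():
            if state not in self.states or ch not in self.alphabet:
                return False
            if target not in self.states:
                return False
        return True

DIGITS = set("0123456789")

# Hypothetical automaton for integer = digit digit*:
# any digit moves from "start" to "digits", and further digits stay there.
integer_fa = FiniteAutomaton(
    alphabet=DIGITS,
    states={"start", "digits"},
    start="start",
    transitions={(s, d): "digits" for s in ("start", "digits") for d in DIGITS},
    accepting={"digits"},
)

print(integer_fa.is_well_formed())
print(len(integer_fa.transitions))
```

Storing transitions as a dictionary keyed by (state, character) keeps lookup simple; a table indexed by state and character code is a common alternative in generated lexers.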


Finite Automata

• A finite automaton accepts a string if we can follow transitions labelled with the characters of the string, in order, from the start state to some accepting state.
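
The acceptance condition can be sketched as a short simulation. Because the definition only asks that some path exists, the code below tracks the set of all states reachable so far, which also works when a state has several transitions on the same character. The function `accepts`, the state names, and the example automaton for (a|b)*ab are assumptions made for this illustration.

```python
def accepts(transitions, start, accepting, text):
    """Return True if some path labelled by `text` leads from `start` to an accepting state."""
    current = {start}
    for ch in text:
        # Follow every transition labelled `ch` out of every reachable state.
        current = {
            nxt
            for state in current
            for nxt in transitions.get((state, ch), set())
        }
        if not current:
            return False  # no transition applies, so no path exists
    return bool(current & accepting)

# Hypothetical automaton for (a|b)*ab: strings over {a, b} that end in "ab".
transitions = {
    ("q0", "a"): {"q0", "q1"},
    ("q0", "b"): {"q0"},
    ("q1", "b"): {"q2"},
}
start, accepting = "q0", {"q2"}

for text in ["ab", "aab", "ba", "", "abc"]:
    print(repr(text), accepts(transitions, start, accepting, text))
```

When every (state, character) pair has at most one target, the reachable set never holds more than one state and the loop reduces to the usual deterministic simulation.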


Q&A

References

• These lecture notes were taken from the following source:
  • Alfred V. Aho, Monica S. Lam, Ravi Sethi, and Jeffrey D. Ullman. Compilers: Principles, Techniques, and Tools. 2nd edition, Addison-Wesley, 2006.
