
Compiler Construction

CS-4207

Instructor Name: Atif Ishaq


Lecture 6
Today’s Lecture

• Recognition of Tokens
• Regular Expressions and FSMs
• Transition Diagram Construction

Recognition of Tokens : Transition Diagram

• A language defined by a grammar is a (possibly infinite) set of strings.
• An automaton is a device that determines, by reading a string (word) one character at a time, whether the string belongs to a specific language.
• A finite-state automaton (FSA), such as an NFA, is an automaton that recognizes a regular language, that is, a language described by a regular expression.
• It is the simplest kind of automaton: its only memory is the current state, an element of a finite set.

Recognition of Tokens : Transition Diagram

• Graphically, a finite-state automaton is represented by:
  ◦ A set of labeled states, drawn as nodes in a digraph
  ◦ Directed edges between states, each labeled with a character
  ◦ One or more states designated as terminal (accepting)
  ◦ One or more states designated as initial
• On reading a character a ∈ Σ, the automaton may move from state S1 to state S2 if there exists an a-labeled edge from S1 to S2.
• A string belongs to the language if, while reading the whole string, the automaton may move from an initial state to an accepting state.
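The acceptance rule above can be sketched in C by tracking the set of states the automaton may currently be in. This is only an illustration: the `Nfa` struct, the bit-set encoding, the function names, and the example automaton (strings over {a, b} that end in "ab") are assumptions made for this sketch and do not come from the lecture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_STATES 32   /* a set of states is stored as the bits of a uint32_t */
#define ALPHABET   2    /* assumed alphabet: 'a' -> column 0, 'b' -> column 1 */

typedef struct {
    uint32_t initial;                      /* bit set of initial states */
    uint32_t accepting;                    /* bit set of accepting states */
    uint32_t delta[MAX_STATES][ALPHABET];  /* delta[q][c]: states reachable from q on c */
} Nfa;

/* Returns 1 if some path from an initial state to an accepting state spells s. */
int nfa_accepts(const Nfa *m, const char *s)
{
    assert(m != NULL && s != NULL);
    uint32_t current = m->initial;
    for (; *s != '\0'; s++) {
        if (*s != 'a' && *s != 'b')
            return 0;                      /* symbol outside the alphabet */
        int c = *s - 'a';
        uint32_t next = 0;
        for (int q = 0; q < MAX_STATES; q++)
            if (current & ((uint32_t)1 << q))
                next |= m->delta[q][c];    /* follow every c-labeled edge out of q */
        current = next;
    }
    return (current & m->accepting) != 0;
}

/* Hypothetical example NFA: accepts strings over {a, b} that end in "ab". */
Nfa example_ends_in_ab(void)
{
    Nfa m = {0};
    m.initial   = (uint32_t)1 << 0;
    m.accepting = (uint32_t)1 << 2;
    m.delta[0][0] = ((uint32_t)1 << 0) | ((uint32_t)1 << 1); /* 0 --a--> 0 or 1 */
    m.delta[0][1] = (uint32_t)1 << 0;                        /* 0 --b--> 0 */
    m.delta[1][1] = (uint32_t)1 << 2;                        /* 1 --b--> 2 */
    return m;
}
```

Calling `nfa_accepts(&m, "aab")` on the example automaton returns 1, while `"aba"` is rejected because no path ends in the accepting state.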

Recognition of Tokens : Transition Diagram

The following diagram is an NFA that recognizes the language of all strings over Σ = {a, b} that contain an even number of a’s and an even number of b’s.

[Figure: transition diagram for even a’s and b’s]
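Since the figure is not reproduced here, the following is a minimal table-driven sketch of one automaton for this language. It assumes the usual four-state deterministic construction (one state per combination of parities); the state numbering and the name `even_as_and_bs` are choices made for this sketch, not taken from the slide.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed state numbering:
 *   0 = even a's, even b's (initial and accepting)
 *   1 = odd  a's, even b's
 *   2 = even a's, odd  b's
 *   3 = odd  a's, odd  b's
 * Column 0 is the move on 'a', column 1 the move on 'b'. */
static const int delta[4][2] = {
    { 1, 2 },
    { 0, 3 },
    { 3, 0 },
    { 2, 1 },
};

/* Returns 1 if s contains an even number of a's and an even number of b's. */
int even_as_and_bs(const char *s)
{
    assert(s != NULL);
    int state = 0;
    for (; *s != '\0'; s++) {
        if (*s == 'a')      state = delta[state][0];
        else if (*s == 'b') state = delta[state][1];
        else                return 0;   /* symbol outside {a, b} */
    }
    return state == 0;
}
```

For example, `even_as_and_bs("abab")` returns 1 and `even_as_and_bs("aab")` returns 0.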

Recognition of Tokens : Transition Diagram

relop → < | > | <= | >= | <> | =

id → letter (letter | digit)*
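The two definitions above can be turned into small hand-written recognizers. The sketch below is one possible reading, assuming that "letter" and "digit" mean what `isalpha` and `isdigit` accept and that the longest matching prefix wins; the function names are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Length of the longest relop at the start of s, or 0 if there is none.
 * Covers < | > | <= | >= | <> | = */
size_t match_relop(const char *s)
{
    assert(s != NULL);
    switch (s[0]) {
    case '<': return (s[1] == '=' || s[1] == '>') ? 2 : 1;
    case '>': return (s[1] == '=') ? 2 : 1;
    case '=': return 1;
    default:  return 0;
    }
}

/* Length of the identifier at the start of s, or 0 if there is none.
 * Covers letter (letter | digit)* */
size_t match_id(const char *s)
{
    assert(s != NULL);
    if (!isalpha((unsigned char)s[0]))
        return 0;
    size_t n = 1;
    while (isalnum((unsigned char)s[n]))
        n++;
    return n;
}
```

With these, `match_relop("<=x")` returns 2 and `match_id("x1 + y")` returns 2, while `match_id("9lives")` returns 0 because an id must start with a letter.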

Recognition of Tokens : Transition Diagram

A transition diagram for unsigned numbers

A transition diagram for whitespace
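The two diagrams are not reproduced here, so the sketch below rests on assumptions: unsigned numbers are taken to follow the common textbook pattern digit+ (. digit+)? (E (+ | -)? digit+)?, and whitespace is taken to be one or more blanks, tabs, or newlines. Both patterns and both function names are assumptions for illustration only.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Length of the unsigned number at the start of s, or 0 if there is none.
 * Assumed pattern: digit+ (. digit+)? (E (+|-)? digit+)? */
size_t match_unsigned_number(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (isdigit((unsigned char)s[n]))
        n++;
    if (n == 0)
        return 0;
    /* Optional fraction: the '.' counts only if a digit follows it. */
    if (s[n] == '.' && isdigit((unsigned char)s[n + 1])) {
        n += 2;
        while (isdigit((unsigned char)s[n]))
            n++;
    }
    /* Optional exponent: accepted only if at least one digit follows. */
    if (s[n] == 'E') {
        size_t m = n + 1;
        if (s[m] == '+' || s[m] == '-')
            m++;
        if (isdigit((unsigned char)s[m])) {
            while (isdigit((unsigned char)s[m]))
                m++;
            n = m;
        }
    }
    return n;
}

/* Length of the run of blanks, tabs, and newlines at the start of s. */
size_t match_ws(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] == ' ' || s[n] == '\t' || s[n] == '\n')
        n++;
    return n;
}
```

Note the retraction in the number recognizer: for the input `12.` only `12` is matched, because the trailing dot is not followed by a digit.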

What Else Does a Lexical Analyzer Do?

• All keywords (reserved words) are first matched as ids.
• After the match, the symbol table or a special keyword table is consulted.
• The keyword table contains the string form of every keyword along with its associated token value.
• When a match is found, the token is returned along with its symbolic value, e.g. ("then", 16).
• If no match is found, the lexeme is assumed to be an id.

if 15
then 16
begin 17
... ...
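The lookup step can be sketched as a linear search over a small table. Only the three entries shown above are included; the value `TOKEN_ID` used for ordinary identifiers and the name `classify_word` are hypothetical, since the slide does not give them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { TOKEN_ID = 0 };   /* assumed token value for a plain identifier */

typedef struct {
    const char *lexeme;  /* string form of the keyword */
    int token;           /* associated token value */
} Keyword;

/* Entries taken from the table on the slide. */
static const Keyword keywords[] = {
    { "if",    15 },
    { "then",  16 },
    { "begin", 17 },
};

/* Called after a lexeme has matched the id pattern: returns the keyword's
 * token value if the lexeme is in the table, otherwise TOKEN_ID. */
int classify_word(const char *lexeme)
{
    assert(lexeme != NULL);
    for (size_t i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(lexeme, keywords[i].lexeme) == 0)
            return keywords[i].token;
    return TOKEN_ID;
}
```

Here `classify_word("then")` returns 16, while `classify_word("thenx")` falls through to `TOKEN_ID`. The comparison is case-sensitive, which is a design choice of this sketch.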
Transition Diagram : Code
token nexttoken()
{
    while (1) {
        switch (state) {
        case 0:
            c = nextchar();
            if (c == blank || c == tab || c == newline) {
                /* skip whitespace and stay in the start state */
                state = 0;
                lexeme_beginning++;
            }
            else if (c == '<') state = 1;
            else if (c == '=') state = 5;
            else if (c == '>') state = 6;
            else state = fail();    /* not a relop: try the next diagram */
            break;
        case 1:
            …
        case 9:
            c = nextchar();
            if (isletter(c)) state = 10;
            else state = fail();
            break;
        case 10:
            c = nextchar();
            if (isletter(c)) state = 10;
            else if (isdigit(c)) state = 10;
            else state = 11;
            break;
        …
        }
    }
}

/* fail() decides the next start state to check */
int fail()
{
    forward = token_beginning;      /* retract to the start of the lexeme */
    switch (start) {
    case 0:  start = 9;  break;
    case 9:  start = 12; break;
    case 12: start = 20; break;
    case 20: start = 25; break;
    case 25: recover(); break;
    default: /* error */ break;
    }
    return start;
}
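The fragment above elides several cases. As a rough, self-contained sketch of the same state-driven loop, the version below handles only whitespace, relop, and id. It reuses state numbers 0, 1, 5, 6, 9, 10, and 11 from the slide, but the `Scanner` struct, the `TokenKind` values, the behavior of states 1 and 6, the hard-coded fallback from state 0 to state 9 (in place of `fail()`), and the error handling are all assumptions of this sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef enum { T_EOF, T_RELOP, T_ID, T_ERROR } TokenKind;  /* hypothetical kinds */

typedef struct {
    const char *input;        /* text being scanned */
    size_t lexeme_beginning;  /* index where the current lexeme starts */
    size_t forward;           /* index of the next character to read */
} Scanner;

/* Scans one token. On return, the lexeme is input[lexeme_beginning .. forward). */
TokenKind next_token(Scanner *sc)
{
    assert(sc != NULL && sc->input != NULL);
    int state = 0;
    sc->lexeme_beginning = sc->forward;
    for (;;) {
        char c = sc->input[sc->forward];   /* peek; consumed only by forward++ */
        switch (state) {
        case 0:
            if (c == ' ' || c == '\t' || c == '\n') {
                sc->forward++;             /* skip whitespace, stay in state 0 */
                sc->lexeme_beginning++;
            }
            else if (c == '\0') return T_EOF;
            else if (c == '<') { sc->forward++; state = 1; }
            else if (c == '=') { sc->forward++; state = 5; }
            else if (c == '>') { sc->forward++; state = 6; }
            else state = 9;                /* relop diagram failed: try id diagram */
            break;
        case 1:                            /* seen '<': may extend to <= or <> */
            if (c == '=' || c == '>') sc->forward++;
            return T_RELOP;
        case 5:                            /* seen '=' */
            return T_RELOP;
        case 6:                            /* seen '>': may extend to >= */
            if (c == '=') sc->forward++;
            return T_RELOP;
        case 9:
            if (isalpha((unsigned char)c)) { sc->forward++; state = 10; }
            else { sc->forward++; return T_ERROR; }  /* skip one bad character */
            break;
        case 10:
            if (isalnum((unsigned char)c)) sc->forward++;
            else state = 11;               /* c is not consumed (retraction) */
            break;
        case 11:
            return T_ID;
        }
    }
}
```

Starting from `Scanner sc = { "x1 <= y", 0, 0 };`, repeated calls to `next_token(&sc)` yield an id, a relop, an id, and then `T_EOF`.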
Lecture Outcome

• Significance of context-free grammars in compiler construction
• How to resolve associativity and precedence issues in arithmetic expressions
• Focusing on unambiguous grammars for parsing

Lecture Outcome

• Token Recognition
• Transition Diagram Construction
