You are on page 1of 18

IMPLEMENTATION OF FINITE AUTOMAT IN CODE

There are several ways to translate either a


DFA or an NFA into code.
Consider , again the example of a DFA that
accepts identifiers consisting of a letter
followed by a sequence of letters and/ or
digits in its amend form that includes
lookahead and the principal of longest
substring.

1
IMPLEMENTATION OF FINITE
AUTOMAT IN CODE (cont’d)

Letter

Letter/- {Other}
Start In_id Finish
2 Return ID

Digit

2
Simulation of the DFA

{ Starting in state 1}
If the next character is a letter/- then
advance the input:
{ now in state 2}
While the next character is a letter or a digit do
advance the input { stay in state 2}
End while;
{ go to state 3 without consuming input }
Accept
Else
{ Error or other cases }
End if;
3
Constructing Transition Diagrams for
Tokens

 Transition Diagrams (TD) are used to


represent the tokens
 As characters are read, the relevant TDs
are used to attempt to match lexeme to a
pattern

4
 Each TD has:
– States : Represented by Circles
– Actions : Represented by Arrows between states
– Start State : Beginning of a pattern (Arrowhead)
– Final State(s) : End of pattern (Concentric Circles)
 Each TD is Deterministic - No need to choose
between 2 different actions !

5
Example TDs

 Recognition Of Relational Operators

>=: start > = RTN(GE)


0 6 7

other
8 * RTN(G)

We’ve accepted “>” and have read other char that


must be unread (means push back into input stream)
6
Example : All RELOPs

start < =
0 1 2 return(relop, LE)
>
3 return(relop, NE)
other

= 4 * return(relop, LT)

5
> return(relop, EQ)

=
6 7 return(relop, GE)
other
7 8 * return(relop, GT)
Example TDs : id

id : letter or digit

start letter other *


9 10 11

return( get_token(), install_id())

8
9
Example TDs : Unsigned #s

digit digit

start digit * . digit other *


20 21 22 23 24

return(num, install_num())
digit

start digit other *


25 26 27

10
Implementing Transition Diagrams

start < =
0 1 2
 class Scanner {
 >
 char _la; // The lookahead character 3
 Token nextToken() { other
 startLexeme(); // reset window at start *
= 4
 while(true) {
 switch(_state) {
5
 case 0: {
>
 _la = getChar();
 if (_la == ‘<’) _state = 1; =
6 7
 else if (_la == ‘=’) _state = 5;
 else if (_la == ‘>’) _state = 6; other
*
 else failure(state); 8
 }break;
 case 6: {
 _la = getChar(); case 7: {
 if (_la == ‘=’) _state = 7; return new Token(GEQUAL);
 else _state = 8;
}break;
 }break;
 }
 } case 8: {
 } pushBack(_la);
 } return new Token(GREATER);
}

11
Implementing Transition Diagrams
lexeme_beginning = forward; FUNCTIONS USED
state = 0; nextchar(), forward, retract(),
token nexttoken() install_num(), install_id(), gettoken(),
isdigit(), isletter(), recover()
{ while(1) {
switch (state) {
case 0: c = nextchar();
/* c is lookahead character */ start < =
0 1 2
if (c== blank || c==tab || c== newline) {
repeat >
until state = 0;
3
a “return” lexeme_beginning++;
other
occurs /* advance *
beginning of lexeme */ = 4
}
5
else if (c == ‘<‘) state = 1;
>
else if (c == ‘=‘) state = 5;
else if (c == ‘>’) state = 6; =
6 7
else state = fail();
other
break; *
8
… /* cases 1-8 here */
12
Implementing Transition Diagrams, II
25

digit
*
digit other
25 26 27
............. advances
forward
case 25; c = nextchar();
if (isdigit(c)) state = 26;
else state = fail();
Case numbers correspond to transition
break; diagram states !
case 26; c = nextchar();
if (isdigit(c)) state = 26;
else state = 27;
break;
case 27; retract(1); lexical_value = install_num();
return ( NUM );
retracts looks at the region
13 .............
forward lexeme_beginning ... forward
Implementing Transition Diagrams, III
.............
case 9: c = nextchar();
if (isletter(c)) state = 10;
else state = fail();
break;
case 10; c = nextchar();
if (isletter(c)) state = 10;
else if (isdigit(c)) state = 10;
else state = 11;
break;
case 11; retract(1); lexical_value = install_id();
return ( gettoken(lexical_value) );
letter or digit
.............
*
letter other
reads token 9 10 11
14 name from ST
When Failures Occur:

Init fail()
{ start = state;
forward = lexeme beginning;
switch (start) {
case 0: start = 9; break;
case 9: start = 12; break; Switch to
case 12: start = 20; break; next transition
case 20: start = 25; break; diagram
case 25: recover(); break;
default: /* lex error */
}
return start;
}

15
What Else Does Lexical Analyzer Do?

All Keywords / Reserved words are matched as ids


• After the match, the symbol table or a special keyword table is consulted

• Keyword table contains string versions of all keywords and associated token values

if 15
then 16
begin 17
... ...
• When a match is found, the token is returned, along with its symbolic value, i.e.,
“then”, 16
• If a match is not found, then it is assumed that an id has been discovered
16
ASSINGMENT

17
THANKS
18

You might also like