
Lexical Specification

- Before implementing a lexical analyser, we should write its lexical specification


- Major parts of a lexical specification

a- Alphabet of the language, denoted by Sigma

b- Pattern of each lexical unit

c- Regular expression

d- Transition diagram

e- Transition table

Example : Write a lexical specification for the abc language, which has only three-letter
identifiers.

Sigma = {a, b, c, ..., z, A, B, C, ..., Z, 0, 1, ..., 9}
Pattern:
A three-letter identifier starts with a letter, followed by two symbols that are each a letter or a digit

Regular Expression (here + means OR):
(a+b+c+..+z+A+B+..+Z) (a+b+..+z+A+B+..+Z+0+1+..+9) (a+b+..+z+A+B+..+Z+0+1+..+9)
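
As a quick check of the regular expression above, here is a minimal sketch in Python's `re` syntax. The character classes `[a-zA-Z]` and `[a-zA-Z0-9]` stand in for the long alternations; the names `ABC_ID` and `is_abc_identifier` are made up for this illustration and are not part of the notes.

```python
import re

# A letter followed by exactly two letters or digits (three symbols in total).
ABC_ID = re.compile(r"[a-zA-Z][a-zA-Z0-9][a-zA-Z0-9]")

def is_abc_identifier(text):
    """Return True if the whole string is a valid abc identifier."""
    return ABC_ID.fullmatch(text) is not None

print(is_abc_identifier("abc"))  # True
print(is_abc_identifier("x23"))  # True
print(is_abc_identifier("2ab"))  # False: starts with a digit
print(is_abc_identifier("ab"))   # False: only two symbols
```
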
Transition Diagram
(See on board)
Transition Table
Rules:
1- The number of rows equals the number of states in the diagram
2- The number of columns equals the number of symbols defined in Sigma
3- A blank cell indicates no transition (an error)
4- One extra column marks the accepting states
5- Fill the row of an accepting state with zeros

(See on board)
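
The board diagram is not reproduced here, so the following is only a sketch of what a table built with the five rules could look like for the abc language. It assumes four states (0 = start, 3 = accepting) and, to keep the table small, groups the symbols of Sigma into two columns, Letter and Digit, instead of one column per symbol. `None` plays the role of a blank cell. All names are hypothetical.

```python
# Column indices: two symbol classes plus the extra "accepting" column.
LETTER, DIGIT, ACCEPT = 0, 1, 2

TABLE = [
    # Letter  Digit  Accepting
    [1,       None,  False],   # state 0: start, needs a letter
    [2,       2,     False],   # state 1: one symbol read
    [3,       3,     False],   # state 2: two symbols read
    [0,       0,     True],    # state 3: accepting row, all zeros
]

def column(ch):
    """Map a character to its table column, or None if it is not in Sigma."""
    if ch.isascii() and ch.isalpha():
        return LETTER
    if ch.isascii() and ch.isdigit():
        return DIGIT
    return None

def accepts(text):
    """Run the table over text; True only if it stops in the accepting state."""
    state = 0
    for ch in text:
        col = column(ch)
        if col is None or TABLE[state][ACCEPT]:
            return False          # symbol outside Sigma, or input after acceptance
        state = TABLE[state][col]
        if state is None:
            return False          # blank cell: no transition
    return TABLE[state][ACCEPT]
```

With this table, `accepts("abc")` and `accepts("x23")` hold, while `accepts("2ab")` fails on the blank cell in row 0.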

......................................
Example : The following are identifiers in the abc language. Tokenize them using the
specification

abc x23
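
One way to carry out this exercise by machine is sketched below. It assumes the identifiers are separated by whitespace (the space is not part of Sigma as defined above, so this is an assumption), and `tokenize_abc` is a made-up helper name.

```python
import re

# Letter followed by two letters or digits, as in the specification.
ABC_ID = re.compile(r"[a-zA-Z][a-zA-Z0-9]{2}")

def tokenize_abc(source):
    """Split on whitespace and classify each lexeme as 'id' or 'error'."""
    tokens = []
    for lexeme in source.split():
        kind = "id" if ABC_ID.fullmatch(lexeme) else "error"
        tokens.append((kind, lexeme))
    return tokens

print(tokenize_abc("abc x23"))
# [('id', 'abc'), ('id', 'x23')]
```
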

.............................................
Example 1 : Develop a lexical specification for the xyz language
1- Identifiers - one or more letters
2- Keywords - int and print
3- Int-literals
4- Operators - = and +
5- Punctuation - only ;
6- Delimiters (ignored)

Sample Program 1
int a;
a=25
print a
Sample Program 2
int a;
int b;
int c;
a=2;
b=3;
c= a+b;
print c
............................................
Lexical Specification
Sigma = {a, b, c, ..., z, A, B, C, ..., Z, 0, 1, 2, ..., 9, ;, =, +, space, tab, newline}
Letter = {a, b, c, ..., z, A, B, C, ..., Z}
Digit = {0, 1, 2, ..., 9}

1- Identifier
Pattern : One or more letters
RE : Letter Letter* OR Letter+
2- Keywords
Only two {int, print}
Covered by the identifier definition
Use the same RE as for identifiers
3- Int-literals
Pattern : Starts with a digit, followed by zero or more digits
RE : Digit Digit* OR Digit+

4- Operators only + and =


RE +
RE =
5- Punctuations
only ;
RE ;

6- Delimiters
space, tab and newline
All ignored
RE : (space|tab|newline)*
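
The six items above can be tried out with a small regex-driven tokenizer. This is only a sketch, not the analyser the notes go on to build from the transition table: it leans on Python's `re` module, resolves keywords by looking the identifier up in a set (since keywords share the identifier RE), and uses hypothetical names such as `TOKEN_RE`, `tokenize`, `int_lit`, `op` and `punct`.

```python
import re

KEYWORDS = {"int", "print"}

# One named group per lexical unit. Order is not critical here because
# the alternatives start with disjoint characters.
TOKEN_RE = re.compile(r"""
    (?P<id>[a-zA-Z]+)        # Letter+
  | (?P<int_lit>[0-9]+)      # Digit+
  | (?P<op>[=+])             # operators = and +
  | (?P<punct>;)             # punctuation ;
  | (?P<delim>[ \t\n]+)      # delimiters, ignored
""", re.VERBOSE)

def tokenize(source):
    """Return (token, lexeme) pairs; delimiters are skipped."""
    tokens, pos = [], 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise ValueError(f"illegal character {source[pos]!r} at position {pos}")
        kind, lexeme = m.lastgroup, m.group()
        if kind == "id" and lexeme in KEYWORDS:
            kind = "keyword"      # keywords are identifiers found in the keyword set
        if kind != "delim":
            tokens.append((kind, lexeme))
        pos = m.end()
    return tokens

for token in tokenize("int a;\na=25\nprint a"):
    print(token)
```

Running it on Sample Program 1 yields the keyword `int`, the identifier `a`, the punctuation `;`, then `a`, `=`, `25`, and finally the keyword `print` followed by `a`.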

Transition Diagram
- Deterministic state diagram
1- Transition
2- State
3- Accepting state
4- Accepting state with input retraction
(See all symbols on board)

Draw a TD for each lexical unit.

(See on board)

Draw a combined TD for the whole language

Tokenize the given sample program


int a;
a=25
print a

Tokenize the given sample program


int a;
int b;
int c;
a=2;
b=3;
c= a+b;
print c
........................................
All of the above sample programs are syntactically and lexically correct
...........................................
Sample Program 3

A syntactically incorrect program

a int
= 2
price26
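
Sample Program 3 still consists only of legal lexemes under the Example 1 specification. The short sketch below (the name `LEXEME_RE` is made up) lists the lexemes with `re.findall`. Note that `findall` silently skips characters that match nothing, so it is not a substitute for real error reporting.

```python
import re

# Lexemes of the Example 1 language: Letter+, Digit+, or one of = + ;
LEXEME_RE = re.compile(r"[a-zA-Z]+|[0-9]+|[=+;]")

program3 = "a int\n= 2\nprice26"
print(LEXEME_RE.findall(program3))
# ['a', 'int', '=', '2', 'price', '26']
```

Because identifiers are letters only, `price26` splits into the identifier `price` and the int-literal `26`. The order of the tokens is what makes the program wrong, and that is a matter for the syntax phase.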
....................................
Draw Transition Table
         Input
State    Let   Dig   =     +     ;     delim   Ac   Tok
0        1     3     6     5     7     0
1        1     2     2     2     2     2
2        0     0     0     0     0     0       y    id/Key
3        4     3     4     4     4     4
4        0     0     0     0     0     0       y    int-lit
5        0     0     0     0     0     0       n    +
6        0     0     0     0     0     0       n    =
7        0     0     0     0     0     0       n    ;

Assumption
y = accepting state with input retraction
n = accepting state without input retraction
blank = not an accepting state
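
The table and the y/n convention can be turned into a table-driven scanner. The sketch below copies the eight rows as given; everything else is an assumption made for illustration: the helper names, the newline appended as a sentinel so the last lexeme is flushed, and raising `ValueError` for symbols outside Sigma.

```python
# Column order: Let, Dig, =, +, ;, delim
TABLE = [
    [1, 3, 6, 5, 7, 0],   # 0: start (delimiters loop back here)
    [1, 2, 2, 2, 2, 2],   # 1: inside identifier/keyword
    [0, 0, 0, 0, 0, 0],   # 2: accept id/Key
    [4, 3, 4, 4, 4, 4],   # 3: inside int-literal
    [0, 0, 0, 0, 0, 0],   # 4: accept int-lit
    [0, 0, 0, 0, 0, 0],   # 5: accept +
    [0, 0, 0, 0, 0, 0],   # 6: accept =
    [0, 0, 0, 0, 0, 0],   # 7: accept ;
]

# Accepting states: state -> (retract input?, token name). True is "y", False is "n".
ACCEPTING = {2: (True, "id/Key"), 4: (True, "int-lit"),
             5: (False, "+"), 6: (False, "="), 7: (False, ";")}

def column(ch):
    """Map a character to its input column."""
    if ch.isascii() and ch.isalpha():
        return 0
    if ch.isascii() and ch.isdigit():
        return 1
    if ch in "=+;":
        return "=+;".index(ch) + 2
    if ch in " \t\n":
        return 5
    raise ValueError(f"symbol {ch!r} is not in Sigma")

def tokenize(source):
    """Scan source with the transition table and return (token, lexeme) pairs."""
    source += "\n"                # sentinel delimiter so the last lexeme is flushed
    tokens, state, start, pos = [], 0, 0, 0
    while pos < len(source):
        state = TABLE[state][column(source[pos])]
        pos += 1
        if state == 0:
            start = pos           # still skipping delimiters
        elif state in ACCEPTING:
            retract, name = ACCEPTING[state]
            if retract:
                pos -= 1          # give the lookahead symbol back to the input
            tokens.append((name, source[start:pos]))
            state, start = 0, pos
    return tokens

print(tokenize("a=25"))
# [('id/Key', 'a'), ('=', '='), ('int-lit', '25')]
```

Retraction matters for states 2 and 4: the symbol that ends an identifier or an int-literal has already been read, so it is pushed back and scanned again from state 0.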
......................................
Example 2
Develop a lexical specification for the xyz language
1- Identifiers - start with a letter, followed by letters or digits
2- Keywords - int, print, get
3- Int-literals - start with a non-zero digit
07 is not an int-literal
7 is an int-literal
4- Operators - = and +
5- Punctuation - only ; { }
6- Delimiters (ignored)

Tokenize the following programs


Sample Program 1
{int a23;
a=25
print a
}
Sample Program 2
{int a;
int b;
int c;
get a
get b
c= a+b;
print c

}
..........................................

Specification

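CharacterSet = {a, b, c, ..., z, A, B, ..., Z, 0, 1, 2, ..., 9, +, =, ;, {, }, space, tab, newline}
Letter = {a, b, c, ..., z, A, B, ..., Z}
digits = {0, 1, 2, ..., 9}
nonzeroDigit = {1, 2, 3, ..., 9}
1- Identifier

Letter (Letter|digits)*
Language = {a, abc, ab23, int, print, get, ...}
2- Keyword (covered by identifiers)

Use the same RE as for identifiers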


3- Int-literal

(nonzeroDigit digits*) | 0

4- Operators
Re +
Re =

5- Punctuations
Re ;
Re }
Re {

6- Delimiters
(space|tab|newline)*

7- Combined Transition Diagram

8- Transition Table
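
To see how the Example 2 regular expressions behave on single lexemes, including the `07` versus `7` case, here is a small sketch. It reads the int-literal RE as "a non-zero digit followed by digits, or a single 0", which is an interpretation of the notes; `classify` and the category names are hypothetical.

```python
import re

IDENTIFIER = re.compile(r"[a-zA-Z][a-zA-Z0-9]*")   # Letter (Letter|Digit)*
INT_LITERAL = re.compile(r"[1-9][0-9]*|0")         # nonzeroDigit digits*, or 0 alone
KEYWORDS = {"int", "print", "get"}

def classify(lexeme):
    """Name the lexical unit that a complete lexeme belongs to."""
    if IDENTIFIER.fullmatch(lexeme):
        # Keywords use the identifier RE and are separated by a set lookup.
        return "keyword" if lexeme in KEYWORDS else "identifier"
    if INT_LITERAL.fullmatch(lexeme):
        return "int-literal"
    if lexeme in {"=", "+"}:
        return "operator"
    if lexeme in {";", "{", "}"}:
        return "punctuation"
    return "error"

for lexeme in ["a23", "get", "7", "07", "{"]:
    print(lexeme, "->", classify(lexeme))
```

Here `a23` is an identifier, `get` a keyword, `7` an int-literal, and `07` is rejected because no alternative of the int-literal RE matches the whole lexeme.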
...............................................
Language Extension 1

1- New keywords
if and while

2- Operators
*, -, < and <=

Sample Program 1
Calculate factorial
{int fact;
int count;
int number;
get number;
count=1;
fact=1;
while count<=number{
fact=fact*number;
number=number -1;
}
print fact;
}

Lexical Specification again

4- Operators
Re +
Re =
Re *
Re -
Re < (=|^)    (^ denotes the empty string, so this matches < and <=)
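
The last operator RE needs one symbol of lookahead: after reading `<`, the scanner must check whether `=` follows. A minimal sketch of that decision is given below; `scan_operator` and its return convention (lexeme plus next position) are assumptions made for this illustration.

```python
def scan_operator(source, pos):
    """Return (lexeme, next_pos) for the operator starting at source[pos]."""
    ch = source[pos]
    if ch == "<":
        # Look one symbol ahead: '=' extends the lexeme to '<=';
        # anything else (or end of input) leaves a plain '<'.
        if pos + 1 < len(source) and source[pos + 1] == "=":
            return "<=", pos + 2
        return "<", pos + 1
    if ch in "+=*-":
        return ch, pos + 1        # single-symbol operators
    raise ValueError(f"no operator at position {pos}")

print(scan_operator("count<=number", 5))  # ('<=', 7)
print(scan_operator("a<b", 1))            # ('<', 2)
```

In transition-diagram terms, plain `<` is an accepting state with input retraction, because the symbol after `<` has to be read before the decision can be made.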
..........................................

Language Extension II

Sample Program 1
Calculate largest number

{int a=2;
int b=3;
int largest;
if a>b
largest=a;
else
largest=b;
print largest;
}
Sample Program 2

Average of three numbers

{ int a=2,b=3,d=4,avg;
avg=(a+b+d)/3;
print avg;
}
........................................
Rewrite Lexical Specification
.......................................
Develop a lexical specification for the xyz language
1- Identifiers - start with a letter, followed by letters or digits
2- Keywords - int, print, get, if, else, while
3- Int-literals - start with a non-zero digit
07 is not an int-literal
7 is an int-literal
4- Operators - =, +, -, *, /, <, <=, >
5- Punctuation - ; { } ( ) ,
6- Delimiters (ignored)
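
A regex-based sketch of the rewritten specification follows. It assumes that `+`, `>` and `else` belong to the language because the Extension II sample programs use them, and that `0` alone counts as an int-literal. The names `TOKEN_RE`, `tokenize`, `int_lit`, `op` and `punct` are hypothetical. The point to notice is that `<=` is listed before `<` in the operator alternative, so the longer lexeme is tried first.

```python
import re

KEYWORDS = {"int", "print", "get", "if", "else", "while"}

TOKEN_RE = re.compile(r"""
    (?P<id>[a-zA-Z][a-zA-Z0-9]*)     # Letter (Letter|Digit)*
  | (?P<int_lit>[1-9][0-9]*|0)       # nonzeroDigit digits*, or 0 alone
  | (?P<op><=|[=+\-*/<>])            # '<=' comes first so it beats '<'
  | (?P<punct>[;{}(),])              # ; { } ( ) ,
  | (?P<delim>[ \t\n]+)              # ignored
""", re.VERBOSE)

def tokenize(source):
    """Return (token, lexeme) pairs; delimiters are skipped."""
    tokens, pos = [], 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise ValueError(f"illegal character {source[pos]!r} at position {pos}")
        kind, lexeme = m.lastgroup, m.group()
        if kind == "id" and lexeme in KEYWORDS:
            kind = "keyword"
        if kind != "delim":
            tokens.append((kind, lexeme))
        pos = m.end()
    return tokens

print(tokenize("while count<=number{"))
```

For the line `while count<=number{` this gives the keyword `while`, the identifier `count`, the operator `<=`, the identifier `number` and the punctuation `{`.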

Tokenize the following programs


Sample Program 1
{int a23;
a=25
print a
}
Sample Program 2
{int a;
int b;
int c;
get a
get b
c= a+b;
print c
