(CSP358)
Practical No. 1
(LEX)
Scanner Generator: Lex
Lex is a tool that generates a lexical analyser (scanner) as C source
code from a specification of patterns and actions.
To change the executable's name from the default a.out to anything else, use the gcc option "-o"
Example: gcc lex.yy.c -o output
Then run it: ./output (on Linux) or output.exe (on Windows)
LEX: Program Structure
Definitions:
name regex
%%
Rules:
regex action
%%
User subroutines: C code for any user-defined
subroutines that are called in the second (rules) section.
LEX: Small Examples with Observations
Lex code to delete all blanks or tabs:
%%
[ \t]+ ;
To print the words or numbers with a label:
[a-z]+ printf("%s: word", yytext);
[0-9]+ printf("%s: number", yytext);
To print a word exactly as it was matched:
[a-z]+ ECHO;
Lex Regular Expressions
Regular expression    Matches
\x                    x, if x is a lex operator
"xy"                  xy, even if x or y is a lex operator (except \)
[xy]                  x or y
[x-z]                 x, y, or z
[^x]                  Any character but x
.                     Any character but newline
^x                    x at the beginning of a line
<y>x                  x when lex is in start condition y
x$                    x at the end of a line
x?                    Optional x
x*                    0, 1, 2, ... instances of x
x+                    1, 2, 3, ... instances of x
x{m,n}                m through n occurrences of x
xx|yy                 Either xx or yy
x |                   The action on x is the action for the next rule
(x)                   x
x/y                   x but only if followed by y
{xx}                  The translation of xx from the definitions section
Lex Predefined Variables
E.g.
[a-z]+ printf("%s", yytext);
[a-z]+ ECHO;
[a-zA-Z]+ {words++; chars += yyleng;}
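To make the roles of yytext and yyleng concrete, here is a minimal hand-written C sketch of what the last rule does. It is not what Lex generates: the function name count_words is hypothetical, the local variable len stands in for yyleng, and the position input + i stands in for yytext. It also ignores Lex's default action of copying unmatched characters to the output.

```c
#include <ctype.h>
#include <stddef.h>

/* Counters playing the role of the variables used in the Lex action. */
int words = 0;
int chars = 0;

/* Hypothetical stand-in for the rule  [a-zA-Z]+ {words++; chars += yyleng;}
 * Each maximal run of letters is one match: (input + i) corresponds to
 * yytext and len corresponds to yyleng. Other characters are skipped here. */
void count_words(const char *input)
{
    size_t i = 0;
    while (input[i] != '\0') {
        if (isalpha((unsigned char)input[i])) {
            size_t len = 0;
            /* Extend the match as far as the pattern allows. */
            while (isalpha((unsigned char)input[i + len]))
                len++;
            words++;             /* words++          */
            chars += (int)len;   /* chars += yyleng  */
            i += len;            /* continue after the matched text */
        } else {
            i++;                 /* not a letter: no match for this rule */
        }
    }
}
```

Calling count_words("lex is fun 42") from a fresh start leaves words at 3 and chars at 8, since only the three letter runs are counted.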
Lex Library Routines
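yylex()
runs the scanner; the default main() contains a call of yylex()
yymore()
appends the next matched text to the current yytext instead of replacing it
yyless(n)
retains the first n characters in yytext and returns the rest to the input
yywrap()
is called whenever Lex reaches an end-of-file
The default yywrap() always returns 1, which tells Lex that there is no more input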
Ambiguous Source Rules
Lex can handle ambiguous specifications. When
more than one expression can match the current
input, Lex chooses as follows:
1) The longest match is preferred.
2) Among rules which matched the same number of
characters, the rule given first is preferred.
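The two disambiguation rules above can be sketched in plain C. This is only an illustration under stated assumptions, not how the generated scanner is implemented: the two rules (the keyword "if" listed first, then [a-z]+) and the names match_keyword_if, match_identifier and pick_rule are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Rule 1 (listed first): the literal keyword "if".
 * Returns the number of characters matched at s, or 0 for no match. */
size_t match_keyword_if(const char *s)
{
    return strncmp(s, "if", 2) == 0 ? 2 : 0;
}

/* Rule 2 (listed second): [a-z]+ */
size_t match_identifier(const char *s)
{
    size_t n = 0;
    while (s[n] >= 'a' && s[n] <= 'z')
        n++;
    return n;
}

/* Returns the 1-based number of the winning rule (0 if no rule matches)
 * and stores the length of the winning match in *matched_len. */
int pick_rule(const char *s, size_t *matched_len)
{
    size_t (*rules[])(const char *) = { match_keyword_if, match_identifier };
    int best = 0;
    size_t best_len = 0;

    for (int i = 0; i < 2; i++) {
        size_t len = rules[i](s);
        /* Strictly greater: a longer match wins, and on a tie the
         * rule that was listed earlier is kept. */
        if (len > best_len) {
            best_len = len;
            best = i + 1;
        }
    }
    *matched_len = best_len;
    return best;
}
```

With these two rules, the input "if" matches both rules with length 2, so rule 1 wins because it is given first; the input "ifx" is matched for 3 characters by rule 2 only, so the longer match wins.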
Practical-1: Instructor Led
I1: Write a Lex specification to declare whether the
entered word starts with a vowel or not.
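One possible starting point for I1 is a small C helper placed in the user subroutines section and called from a rule's action with yytext. The helper below is a sketch only: the name starts_with_vowel is hypothetical, and it assumes the rule passes it a single word made of letters (for example, one matched by a pattern such as [a-zA-Z]+).

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical user subroutine: returns 1 if the word begins with a
 * vowel (upper or lower case), otherwise 0. An empty or NULL word is
 * treated as not starting with a vowel. */
int starts_with_vowel(const char *word)
{
    if (word == NULL || word[0] == '\0')
        return 0;

    /* Fold the first character to lower case so one lookup covers both cases. */
    char first = (char)tolower((unsigned char)word[0]);
    return strchr("aeiou", first) != NULL;
}
```

In a rule's action, the result of starts_with_vowel(yytext) could then decide which message to print for the entered word.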