
COMPILER DESIGN LAB

(CSP358)

Practical No. 1
(LEX)

Scanner Generator: Lex
 Lex is a tool that generates a lexical analyser (scanner) in the form of a C program.
 The generated program recognises regular expressions.
 These regular expressions are specified by the user in the source specification given to Lex.
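Lex source (pgmname.l) -> LEX -> lex.yy.c
lex.yy.c -> C compiler -> a.out
input -> a.out (scanner) -> tokens

To run a Lex program, execute the following steps on the console:
    flex pgmname.l
    gcc lex.yy.c -ll
    ./a.out < input.txt
(With flex, the support library is usually linked as -lfl instead of -ll.)

On Windows, change ./a.out to a.exe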
To change the output name from a.out to anything else, use the "-o" option.
Example: gcc lex.yy.c -ll -o output
Then run: ./output (on Linux) or output.exe (on Windows)
Lex Source
 Lex source is separated into three sections by %% delimiters
 The general format of Lex source is
    {definitions}
    %%    (required)
    {rules}
    %%
    {user subroutines}
 The absolute minimum Lex program is thus
    %%
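The three-section layout above can be sketched as a small complete specification. This is only an illustrative skeleton: the definition name DIGIT, the messages, and the choice to ignore all other input are assumptions made for the example, not part of the practical.

```lex
%{
/* Definitions section: C code between %{ and %} is copied as is. */
#include <stdio.h>
%}

DIGIT   [0-9]

%%
{DIGIT}+    { printf("number: %s\n", yytext); }
.|\n        { /* ignore every other character */ }
%%

/* User subroutines section: plain C code. */
int yywrap(void)
{
    return 1;   /* no further input files */
}

int main(void)
{
    yylex();    /* run the generated scanner on stdin */
    return 0;
}
```

Because this sketch defines its own main() and yywrap(), it should build with flex and gcc without the Lex library; linking with -ll as in the steps given earlier also works.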
 Definitions section:
 C code to be inserted as is, written between the delimiters %{ and %}
 Names for regular expressions, written as
    name regex
 Rules section: each rule has the form
    regex action
 User subroutines section: C code for any user-defined subroutines that are called from the rules section.
LEX - Small Examples with Observations
 The minimal program copies its input to the output unchanged:
    %%
 Lex code to delete all blanks or tabs:
    %%
    [ \t]+    ;
 To print words and numbers with a label:
    %%
    [a-z]+    printf("%s: word", yytext);
    [0-9]+    printf("%s: number", yytext);
 To print words as they are:
    %%
    [a-z]+    ECHO;
Lex Regular Expressions
Regular expression    Matches
\x                    x, if x is a lex operator
"xy"                  xy, even if x or y is a lex operator (except \)
[xy]                  x or y
[x-z]                 x, y, or z
[^x]                  Any character but x
.                     Any character but newline
^x                    x at the beginning of a line
<y>x                  x when lex is in start condition y
x$                    x at the end of a line
x?                    Optional x
x*                    0, 1, 2, ... instances of x
x+                    1, 2, 3, ... instances of x
x{m,n}                m through n occurrences of x
xx|yy                 Either xx or yy
x |                   The action on x is the action for the next rule
(x)                   x
x/y                   x, but only if followed by y
{xx}                  The translation of xx from the definitions section
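A few of the operators in the table can be seen together in the following sketch. The definition names DIGIT and ID, the token categories, and the output messages are assumptions chosen for illustration.

```lex
%{
#include <stdio.h>
%}

DIGIT   [0-9]
ID      [a-zA-Z_][a-zA-Z0-9_]*

%%
^#.*                   { printf("line starting with #: %s\n", yytext); }
{DIGIT}+"."{DIGIT}+    { printf("decimal: %s\n", yytext); }
{DIGIT}+               { printf("integer: %s\n", yytext); }
{ID}/"("               { printf("name followed by '(': %s\n", yytext); }
{ID}                   { printf("identifier: %s\n", yytext); }
"+"|"-"|"*"|"/"        { printf("operator: %s\n", yytext); }
[ \t\n]+               { /* skip white space */ }
.                      { printf("other: %s\n", yytext); }
%%

int yywrap(void) { return 1; }

int main(void)
{
    yylex();
    return 0;
}
```

The sketch uses ^ for a line anchor, quoted strings to take operator characters literally, | for alternatives, / for trailing context, and {name} to expand a definition. In the trailing-context rule the "(" is checked but not consumed, so it is scanned again as the next token.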
Lex Predefined Variables
 yytext -- a string containing the lexeme
 yyleng -- the length of the lexeme
 yyin -- the input stream pointer
 defaults to stdin, which is what the default main() reads
 yyout -- the output stream pointer
 defaults to stdout, which is what the default main() writes
 E.g.
    [a-z]+       printf("%s", yytext);
    [a-z]+       ECHO;
    [a-zA-Z]+    {words++; chars += yyleng;}
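As a minimal sketch of how these variables work together, the following specification counts words and characters. The counter names words and chars follow the snippet above; reading the file name from the command line is an assumption made for the example.

```lex
%{
#include <stdio.h>
int words = 0;
int chars = 0;
%}

%%
[a-zA-Z]+   { words++; chars += yyleng; }
.|\n        { chars++; }
%%

int yywrap(void) { return 1; }

int main(int argc, char *argv[])
{
    /* If a file name is given, read from it instead of stdin. */
    if (argc > 1) {
        yyin = fopen(argv[1], "r");
        if (yyin == NULL) {
            perror(argv[1]);
            return 1;
        }
    }
    yylex();
    /* yyout is the scanner's output stream, stdout unless reassigned. */
    fprintf(yyout, "%d words, %d characters\n", words, chars);
    return 0;
}
```

Run without arguments, the scanner reads stdin as usual; with a file name it reads that file, for example ./a.out input.txt.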
Lex Library Routines
 yylex()
 the scanner function; the default main() contains a call of yylex()
 yymore()
 append the next matched lexeme to the current contents of yytext instead of replacing them
 yyless(n)
 retain the first n characters in yytext and return the rest to the input
 yywrap()
 is called whenever Lex reaches an end-of-file
 The default yywrap() always returns 1
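The effect of yymore() and yyless(n) is easiest to see on a tiny input. The sketch below adapts the two standard illustrations from the flex manual; the words used in the patterns are arbitrary.

```lex
%{
#include <stdio.h>
%}

%%
mega-     { ECHO; yymore(); }
kludge    { ECHO; }
foobar    { ECHO; yyless(3); }
[a-z]+    { ECHO; }
%%

int yywrap(void)
{
    return 1;   /* stop at end-of-file */
}

int main(void)
{
    yylex();
    return 0;
}
```

With flex, the input mega-kludge should print mega-mega-kludge: the first rule echoes "mega-", and because of yymore() the next lexeme is appended, so the second ECHO prints "mega-kludge". The input foobar should print foobarbar: the lexeme is echoed, yyless(3) returns "bar" to the input, and "bar" is then matched and echoed by the last rule.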
LEX
 Ambiguous Source Rules
Lex can handle ambiguous specifications. When
more than one expression can match the current
input, Lex chooses as follows:
1) The longest match is preferred.
2) Among rules which matched the same number of
characters, the rule given first is preferred.
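Both rules can be observed with a keyword pattern placed before a general identifier pattern. This is a minimal sketch; the patterns and messages are assumptions for illustration.

```lex
%{
#include <stdio.h>
%}

%%
if          { printf("keyword: %s\n", yytext); }
[a-z]+      { printf("identifier: %s\n", yytext); }
[ \t\n]+    { /* skip white space */ }
.           { /* ignore anything else */ }
%%

int yywrap(void) { return 1; }

int main(void)
{
    yylex();
    return 0;
}
```

On the input "if", both of the first two patterns match two characters, so the rule given first reports a keyword. On the input "iffy", the second pattern matches four characters against two, so the longest match reports an identifier.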
Practical-1: Instructor Led
I1: Write a Lex specification to declare whether the entered word starts with a vowel or not.

I2: Write a Lex specification to count the number of words, lines, small letters, capital letters, digits and special characters in a given input file.
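For I1, one possible starting point is sketched below. It assumes that a word is a run of letters and that all other input can be ignored; the messages are placeholders.

```lex
%{
#include <stdio.h>
%}

%%
[aeiouAEIOU][a-zA-Z]*   { printf("%s starts with a vowel\n", yytext); }
[a-zA-Z]+               { printf("%s does not start with a vowel\n", yytext); }
.|\n                    { /* ignore everything that is not a word */ }
%%

int yywrap(void) { return 1; }

int main(void)
{
    yylex();
    return 0;
}
```

A word that begins with a vowel is matched by both of the first two patterns with the same length, so the rule order decides in favour of the vowel rule.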
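LEX - Example
Write a Lex program to count the lines of a text file.
    %{
    #include <stdio.h>
    int lines = 0;
    %}
    %%
    \n      lines++;
    .       ;   /* discard other characters instead of echoing them */
    %%
    int main(void)
    {
        yyin = fopen("myfile.txt", "r");
        if (yyin == NULL) { perror("myfile.txt"); return 1; }
        yylex();
        printf(" This File contains ...");
        printf("\n\t %d lines\n", lines);
        return 0;
    }
    int yywrap(void)
    {
        return 1;
    }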
Practical-1: Semi-Instructor Led
S1: Design a lexical analyser to identify the tokens
such as keywords, identifiers, operators, symbols and
strings for C language using Lex.
Hint
    %{
    #include <stdio.h>
    %}
    DIGIT   [0-9]
    LETTER  [a-zA-Z]
    %%
    if                            { printf("recognized keyword: %s\n", yytext); }
    {LETTER}({LETTER}|{DIGIT})*   { printf("id: %s\n", yytext); }
    \n                            { printf("new line\n"); }
    %%
    int main(void) {
        yylex();
        return 0;
    }
Submission for Practical-1 on Google Classroom
 Part-1 (Instructor Led and Semi-Instructor Led): PDF file (typed) containing Lex theory, practical aim, program and output screenshot
 Part-2 (Evaluation Practical): PDF file (typed) with aim, program and output screenshot

IMPORTANT: Students should create their own documents. If programs or outputs are copied, ZERO marks will be allotted to all such students.
