
LEX

What is LEX?
Lex is a tool that generates lexical analyzers (scanners) from a specification. It is mainly used in
the lexical analysis phase of a compiler, where tokens are identified by matching regular expressions
against the input.
What is the purpose of using LEX?
The main purpose of a lexical analyzer (scanner) is to break up an input stream into tokens, i.e. to
identify each piece of the input and the role it plays.
For example, the input: a = b + c * d;
becomes the token sequence: ID ASSIGN ID PLUS ID MULTI ID SEMI
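To make the example concrete, here is a minimal hand-written sketch in C of the job a generated scanner does for this one statement. It is not Lex output; the function name tokenize and its buffer-based interface are hypothetical and chosen only for illustration, and only the four operators from the example are recognized.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes the token names found in `input` into `out`,
   separated by single spaces. Returns the number of tokens, or -1 if an
   unknown character is seen or `out` is too small. */
int tokenize(const char *input, char *out, size_t out_size)
{
    const char *p = input;
    size_t used = 0;
    int count = 0;

    assert(input != NULL && out != NULL);
    if (out_size == 0) return -1;
    out[0] = '\0';

    while (*p != '\0') {
        const char *name;
        size_t len;

        if (isspace((unsigned char)*p)) { p++; continue; }  /* skip blanks */

        if (isalpha((unsigned char)*p)) {
            /* identifier: a letter followed by letters or digits */
            while (isalnum((unsigned char)*p)) p++;
            name = "ID";
        } else {
            switch (*p) {
            case '=': name = "ASSIGN"; break;
            case '+': name = "PLUS";   break;
            case '*': name = "MULTI";  break;
            case ';': name = "SEMI";   break;
            default:  return -1;       /* character not covered by the sketch */
            }
            p++;
        }

        /* append the token name, preceded by a space if it is not the first */
        len = strlen(name);
        if (used + len + (count > 0 ? 1 : 0) + 1 > out_size) return -1;
        if (count > 0) out[used++] = ' ';
        memcpy(out + used, name, len + 1);
        used += len;
        count++;
    }
    return count;
}
```

Calling tokenize on the statement above with a sufficiently large buffer fills the buffer with the eight token names listed in the example. With Lex, the same behaviour would come from a handful of pattern and action pairs instead of hand-written loops.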
Use of LEX

lex.l -> LEX Compiler -> lex.yy.c
lex.yy.c -> C Compiler -> a.out
Input Stream -> a.out -> Sequence of tokens

lex.l
It is an input file written in the Lex language, which describes the lexical analyzer to be
generated. The lex compiler transforms lex.l into a C program known as lex.yy.c.
lex.yy.c
This file is the input to a C compiler, which compiles it into an executable called a.out.
It is a.out, not the C compiler, that performs the lexical analysis: it takes a stream of
input characters and produces a stream of tokens.

yylval
It is a global variable shared by the lexical analyzer and the parser. The lexical analyzer
returns the token name as the return value of yylex() and places the attribute value of that
token in yylval.
The attribute value can be a numeric code, a pointer to a symbol table entry, or nothing.
Another tool for generating lexical analyzers is Flex.
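The way yylval carries an attribute from the scanner to the parser can be sketched in plain C. This is a hand-written stand-in, not generated code: the token codes, the function names scan_from, next_token and parse_sum, and the use of a plain int for yylval are all assumptions made for the illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token codes; 0 stands for end of input. */
enum { TOK_EOF = 0, TOK_NUM = 256, TOK_PLUS = 257, TOK_ERROR = 258 };

int yylval;                  /* attribute of the most recent token (an int here) */
static const char *cursor;   /* hypothetical input position of the sketch scanner */

void scan_from(const char *text)
{
    assert(text != NULL);
    cursor = text;
}

/* Scanner side: returns the token name and leaves its attribute in yylval. */
int next_token(void)
{
    while (isspace((unsigned char)*cursor)) cursor++;
    if (*cursor == '\0') return TOK_EOF;
    if (isdigit((unsigned char)*cursor)) {
        yylval = 0;
        while (isdigit((unsigned char)*cursor))
            yylval = yylval * 10 + (*cursor++ - '0');
        return TOK_NUM;               /* attribute: the numeric value */
    }
    if (*cursor == '+') { cursor++; return TOK_PLUS; }   /* no attribute */
    cursor++;
    return TOK_ERROR;
}

/* Parser side: adds up NUM tokens separated by PLUS, reading each value
   from yylval. Returns -1 on a syntax error. */
int parse_sum(const char *text)
{
    int total;
    int tok;

    scan_from(text);
    if (next_token() != TOK_NUM) return -1;
    total = yylval;
    while ((tok = next_token()) == TOK_PLUS) {
        if (next_token() != TOK_NUM) return -1;
        total += yylval;
    }
    return tok == TOK_EOF ? total : -1;
}
```

The parser never looks at the input text itself. It sees only the token name returned by the scanner and, for NUM tokens, the value left behind in yylval.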
Structure of a Lex program
A Lex source program is divided into three sections by %% delimiters. The general form of a
Lex program is as follows:
Declarations and Definitions
%%
Translation Rules
%%
Auxiliary Functions or User Code
The first %% delimiter is required. The second %% delimiter and the section that follows it are
optional. Any of the three sections may be empty.

Declarations: C declarations of variables and constants, enclosed in %{ and %}, which are copied
unchanged into lex.yy.c.
Definitions: Regular definitions, which give names to regular expressions so that the translation
rules can refer to them.
Translation Rules: Each rule consists of a regular expression and a code segment.
The syntax is: Pattern {Action}
The pattern is a regular expression or a name from a regular definition. The action is a segment
of C/C++ code that runs when the pattern is matched.
Auxiliary Functions or User Code: These functions are compiled and loaded with the lexical
analyzer. The lexical analyzer produced by Lex reads its input one character at a time until it
finds the longest prefix that matches a pattern. It then executes the associated action, which
typically returns a token to the parser for further processing.

The absolute minimum Lex program is a single %% line. With no rules, the generated scanner simply
copies its input to the output unchanged.


Conflict Resolution in Lex
A conflict arises when more than one prefix of the remaining input matches a pattern, or when the
same prefix matches several patterns. Lex resolves it as follows:
Always prefer a longer prefix to a shorter prefix.
When two or more patterns match the longest prefix, prefer the pattern listed first in the
Lex program.
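The two rules can be sketched in C as a loop over an ordered rule table. This is only an illustration of the selection policy, not how Lex is implemented (Lex compiles the patterns into a single automaton). The rule table, the matcher functions and the name pick_rule are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* A matcher returns how many leading characters of s its pattern matches,
   or 0 if it does not match at all. */
typedef size_t (*matcher_fn)(const char *s);

static size_t match_if(const char *s) { return strncmp(s, "if", 2) == 0 ? 2 : 0; }
static size_t match_lt(const char *s) { return s[0] == '<' ? 1 : 0; }
static size_t match_le(const char *s) { return strncmp(s, "<=", 2) == 0 ? 2 : 0; }
static size_t match_id(const char *s)
{
    size_t n = 0;
    if (!isalpha((unsigned char)s[0])) return 0;
    while (isalnum((unsigned char)s[n])) n++;
    return n;
}

struct rule { const char *token; matcher_fn match; };

/* Order matters: IF is listed before ID, as it would be in the Lex program. */
static const struct rule rules[] = {
    { "IF", match_if },
    { "ID", match_id },
    { "LT", match_lt },
    { "LE", match_le },
};

/* Returns the token name chosen for the start of `input` and stores the
   lexeme length in *length. Returns NULL if no rule matches. */
const char *pick_rule(const char *input, size_t *length)
{
    const char *best = NULL;
    size_t best_len = 0;
    size_t i;

    assert(input != NULL && length != NULL);
    for (i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        size_t n = rules[i].match(input);
        /* Strictly longer wins; on a tie the earlier rule is kept. */
        if (n > best_len) { best_len = n; best = rules[i].token; }
    }
    *length = best_len;
    return best;
}
```

With this table, the input ifx is an ID because the identifier pattern matches three characters while the keyword matches only two. The input if is an IF because both patterns match two characters and IF is listed first. The input <= is an LE because the longer prefix is preferred over the single <.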
Lookahead Operator
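The lookahead operator is an additional operator that Lex reads in order to tell apart patterns
that share a token's leading characters. The lexical analyzer normally reads one character beyond
a valid lexeme and then retracts before producing the token. At times, however, a pattern should
match only when the lexeme is followed by certain other characters. In such cases a slash (/) is
written in the pattern: the part before the slash matches the lexeme, and the part after it must
follow in the input but is not included in the lexeme.
For example, in Fortran:
IF (I, J) = 5
and
IF (condition) THEN
The first statement uses IF as an array name and the second uses it as a keyword, so the two
conflict. To resolve this, the Lex rule for the keyword IF can be written as follows:
IF/\(.*\){letter}
Here letter is assumed to be a regular definition that matches a single letter. The keyword IF is
recognized only when it is followed by a parenthesized expression and then a letter.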
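The effect of the slash can be imitated by hand in C: the lexeme is accepted only if the text after it fits the trailing pattern, and the trailing text is not consumed. The sketch below hard-codes the IF rule from the example. The function name match_if_keyword is hypothetical, and skipping blanks between the parts is an assumption made here to mirror Fortran, where blanks are not significant.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

static const char *skip_blanks(const char *p)
{
    while (*p == ' ' || *p == '\t') p++;
    return p;
}

/* Returns 2 (the length of the lexeme "IF") when the input starts with IF
   followed by "(", any characters on the same line, ")" and a letter.
   Returns 0 otherwise. Only "IF" counts as the lexeme; everything examined
   after it is lookahead and stays in the input. */
size_t match_if_keyword(const char *input)
{
    const char *p;

    assert(input != NULL);
    if (strncmp(input, "IF", 2) != 0) return 0;

    p = skip_blanks(input + 2);
    if (*p != '(') return 0;

    /* ".*" may stop at any ")" on the line, so try each one in turn. */
    for (p++; *p != '\0' && *p != '\n'; p++) {
        if (*p == ')' && isalpha((unsigned char)*skip_blanks(p + 1)))
            return 2;
    }
    return 0;
}
```

For IF (I, J) = 5 the closing parenthesis is followed by =, not a letter, so the function returns 0 and IF can be treated as an ordinary name. For IF (condition) THEN the parenthesis is followed by the T of THEN, so the keyword is recognized.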

Lex predefined variables

yytext -- a string containing the lexeme
yyleng -- the length of the lexeme
yyin -- the input stream pointer; the default input of the default main() is stdin
yyout -- the output stream pointer; the default output of the default main() is stdout

Lex Library Routines


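yylex(): The scanner function itself. It reads the input and runs the actions of the matching
rules. The default main() contains a call to yylex().
yymore(): It tells the scanner to append the next matched text to the current contents of
yytext instead of replacing them.
yyless(n): It retains the first n characters in yytext and returns the remaining characters to
the input to be scanned again.
yywrap(): It is called whenever Lex reaches an end-of-file. The default yywrap() always
returns 1, which tells the scanner that there is no more input.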
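The effect of yyless(n) and yymore() on the matched text can be modelled with a small C structure. In Lex, yyless(n) keeps the first n characters of the lexeme and pushes the rest back onto the input, while yymore() makes the next match extend the current text rather than replace it. The sketch below is a simplified model under those assumptions, not the real library code; the struct and the sk_ function names are hypothetical, and the fields text and leng stand in for yytext and yyleng.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct scanner {
    const char *input;   /* the whole input */
    size_t pos;          /* index of the next unread character */
    char text[64];       /* stand-in for yytext */
    size_t leng;         /* stand-in for yyleng */
    int more;            /* set by sk_more(): append the next match */
};

void sk_init(struct scanner *s, const char *input)
{
    s->input = input;
    s->pos = 0;
    s->text[0] = '\0';
    s->leng = 0;
    s->more = 0;
}

/* Consume the next n input characters as a match, as if a pattern of
   that length had just been recognized. */
void sk_match(struct scanner *s, size_t n)
{
    size_t start = s->more ? s->leng : 0;   /* append or start afresh */

    assert(s->pos + n <= strlen(s->input));
    assert(start + n < sizeof s->text);
    memcpy(s->text + start, s->input + s->pos, n);
    s->leng = start + n;
    s->text[s->leng] = '\0';
    s->pos += n;
    s->more = 0;
}

/* Model of yymore(): the next match is appended to text. */
void sk_more(struct scanner *s) { s->more = 1; }

/* Model of yyless(n): keep n characters, give the rest back to the input. */
void sk_less(struct scanner *s, size_t n)
{
    assert(n <= s->leng);
    s->pos -= s->leng - n;
    s->leng = n;
    s->text[n] = '\0';
}
```

For example, after matching six characters of foobar123 and calling sk_less with 3, the text is foo and scanning resumes at bar. Calling sk_more before the next three-character match then yields foobar again rather than bar alone.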

Installation

Download the Windows version of FLEX.
Download a C/C++ compiler such as DevCPP or Code::Blocks.
Install both. It is recommended to install them in folders WITHOUT spaces in their
names. I use 'C:\GNUWin32' for FLEX and 'C:\Dev-Cpp' for DevCPP.
Now add the BIN folders of both installations to your PATH variable. In case you
don't know how to do this, look up the steps for your version of Windows (XP,
Vista or 7). You will have to add ';C:\GNUWin32\bin;C:\Dev-Cpp\bin'
to the end of your PATH.
Open up a CMD prompt and type in the following:
C:\>flex --version
flex version 2.5.4
C:\>gcc --version
gcc (GCC) 3.4.2 (mingw-special)
Copyright (C) 2004 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO warranty; not
even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
