Lex is a computer program that generates lexical analyzers (scanners or lexers). Lex is commonly used with the yacc parser generator. Lex, originally written by Mike Lesk and Eric Schmidt and described in 1975, is the standard lexical analyzer generator on many Unix systems. Lex reads an input stream specifying the lexical analyzer and outputs source code implementing the lexer in the C programming language.

Though originally distributed as proprietary software, some versions of Lex are now open source, for example those in OpenSolaris and Plan 9 from Bell Labs. One popular open source version of Lex, called flex, or the "fast lexical analyzer", is not derived from proprietary code.

The structure of a Lex file is intentionally similar to that of a yacc file; files are divided into three sections, separated by lines that contain only two percent signs, as follows:

    Definition section
    %%
    Rules section
    %%
    C code section

The definition section defines macros and imports header files written in C. It is also possible to write any C code here, which will be copied verbatim into the generated source file. The rules section associates regular expression patterns with C statements. When the lexer sees text in the input matching a given pattern, it will execute the associated C code.
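As a rough illustration of how the rules section behaves, the sketch below pairs patterns with actions in a C table. At each position it runs the rule that matches the longest prefix of the remaining input, with the earlier rule winning ties, which is the disambiguation Lex applies. It is not what Lex generates: a generated scanner is a compiled automaton, whereas this sketch simply tries each pattern in turn with the POSIX regcomp() and regexec() functions. The rule table, scan(), the on_* actions and the two counters are hypothetical names introduced for the example.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical counters updated by the rule actions below. */
int number_count = 0;
int word_count = 0;

static void on_number(const char *text, int len) { (void)text; (void)len; number_count++; }
static void on_word(const char *text, int len)   { (void)text; (void)len; word_count++; }
static void on_skip(const char *text, int len)   { (void)text; (void)len; }

/* One rule: a pattern paired with the C code to run when it matches. */
struct rule {
    const char *pattern;                        /* anchored with ^ so it matches a prefix */
    void (*action)(const char *text, int len);  /* called with the matched text */
};

static const struct rule rules[] = {
    { "^[0-9]+",                 on_number },
    { "^[A-Za-z_][A-Za-z0-9_]*", on_word   },
    { "^[ \t\n]+",               on_skip   },
};
#define NRULES (sizeof rules / sizeof rules[0])

/* Scan the whole input. Returns 0 on success, or -1 if a pattern fails
   to compile or no rule matches at some position. */
int scan(const char *input)
{
    regex_t compiled[NRULES];
    size_t i, ncompiled = 0;
    int status = 0;

    for (i = 0; i < NRULES; i++) {
        if (regcomp(&compiled[i], rules[i].pattern, REG_EXTENDED) != 0) {
            status = -1;
            goto done;
        }
        ncompiled++;
    }

    while (*input != '\0') {
        int best = -1;
        regoff_t best_len = 0;

        /* Longest match wins; on a tie the earlier rule is kept. */
        for (i = 0; i < NRULES; i++) {
            regmatch_t m;
            if (regexec(&compiled[i], input, 1, &m, 0) == 0 && m.rm_eo > best_len) {
                best = (int)i;
                best_len = m.rm_eo;
            }
        }
        if (best < 0) {          /* no rule matches here */
            status = -1;
            break;
        }
        rules[best].action(input, (int)best_len);
        input += best_len;
    }

done:
    for (i = 0; i < ncompiled; i++)
        regfree(&compiled[i]);
    return status;
}
```

Calling scan("foo 42 bar7 9") runs the word action twice and the number action twice, while scan("a+b") returns -1 because no rule in this table matches the plus sign.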

The C code section contains C statements and functions that are copied verbatim to the generated source file. These statements presumably contain code called by the rules in the rules section. In large programs it is more convenient to place this code in a separate file linked in at compile time.

Lex and parser generators, such as Yacc or Bison, are commonly used together. Parser generators use a formal grammar to parse an input stream, something which Lex cannot do using simple regular expressions (Lex is limited to simple finite state automata). It is typically preferable to have a parser (generated by Yacc, say) be fed a token stream as input, rather than having it consume the input character stream directly. Lex is often used to produce such a token stream.
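The limitation to finite state automata can be made concrete with a small sketch. The first function below is a table-driven automaton with a fixed set of states that recognizes unsigned decimal numbers, the kind of job a scanner does well. The second checks balanced parentheses and needs a counter that can grow without bound, which no fixed set of states can provide; nested structure like this is what a grammar-based parser is for. All names here (the state and class enums, dfa_accepts_number(), balanced_parens()) are hypothetical and chosen for illustration only.

```c
#include <stddef.h>

/* States of a small automaton for: digits, optionally followed by '.' digits. */
enum state { START, INT, DOT, FRAC, DEAD, NSTATES };
enum cls   { C_DIGIT, C_DOT, C_OTHER, NCLASSES };

/* Transition table: next_state[current state][character class]. */
static const enum state next_state[NSTATES][NCLASSES] = {
    /* START */ { INT,  DEAD, DEAD },
    /* INT   */ { INT,  DOT,  DEAD },
    /* DOT   */ { FRAC, DEAD, DEAD },
    /* FRAC  */ { FRAC, DEAD, DEAD },
    /* DEAD  */ { DEAD, DEAD, DEAD },
};

static enum cls classify(char c)
{
    if (c >= '0' && c <= '9') return C_DIGIT;
    if (c == '.')             return C_DOT;
    return C_OTHER;
}

/* Returns 1 if s is entirely a number such as "42" or "3.14", else 0.
   The only memory used is the current state. */
int dfa_accepts_number(const char *s)
{
    enum state st = START;
    for (; *s != '\0'; s++)
        st = next_state[st][classify(*s)];
    return st == INT || st == FRAC;
}

/* Returns 1 if the parentheses in s are balanced, else 0.
   The depth counter has no upper bound, so this check cannot be
   expressed with a fixed, finite set of states. */
int balanced_parens(const char *s)
{
    size_t depth = 0;
    for (; *s != '\0'; s++) {
        if (*s == '(') {
            depth++;
        } else if (*s == ')') {
            if (depth == 0)
                return 0;   /* closing parenthesis with nothing open */
            depth--;
        }
    }
    return depth == 0;
}
```

With these definitions, dfa_accepts_number("3.14") returns 1 and dfa_accepts_number("3.") returns 0, while balanced_parens("(()())") returns 1 and balanced_parens("(()") returns 0.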

Flex is a rewrite of the AT&T Unix lex tool (the two implementations do not share any code, though), with some extensions and incompatibilities, both of which are of concern to those who wish to write scanners acceptable to either implementation. Flex's `-l' option turns on maximum compatibility with the original AT&T lex implementation, at the cost of a major loss in the generated scanner's performance.

Flex is fully compatible with lex with the following exceptions:

- The `input()' and `unput()' routines are not redefinable. This restriction is in accordance with POSIX.
- flex scanners are not as reentrant as lex scanners.
- lex does not support exclusive start conditions (%x), though they are in the POSIX specification.
- Some implementations of lex allow a rule's action to begin on a separate line, if the rule's pattern has trailing whitespace:

      %%
      foobar<space here>
        { foobar_action(); }

  flex does not support this feature.

Bison is a parser generator that is part of the GNU Project. Bison reads a specification of a context-free language, warns about any parsing ambiguities, and generates a parser (in C, C++, or Java) which reads sequences of tokens and decides whether the sequence conforms to the syntax specified by the grammar. Bison by default generates LALR parsers but can also create GLR parsers. In POSIX mode, Bison is compatible with yacc, but also has several improvements over this earlier program. It can be used with both lex and flex.

The actual language-design process using Bison, from grammar specification to a working compiler or interpreter, has these parts:

1. Formally specify the grammar in a form recognized by Bison. For each grammatical rule in the language, describe the action that is to be taken when an instance of that rule is recognized. The action is described by a sequence of C statements.
2. Write a lexical analyzer to process input and pass tokens to the parser. The lexical analyzer may be written by hand in C.
3. Write error-reporting routines.
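The hand-written lexical analyzer and the error-reporting routine might look roughly like the sketch below. It follows the usual yacc and Bison conventions: yylex() returns a token code (0 at end of input, the character itself for single-character tokens) and leaves the token's value in yylval, and yyerror() receives a message from the parser. In a real project the token codes and yylval come from the parser that Bison generates; here NUM and yylval are stand-ins so the sketch compiles on its own, and scan_input and error_count are hypothetical names, with the input read from a string rather than a file to keep the example short.

```c
#include <ctype.h>
#include <stdio.h>

enum { NUM = 258 };           /* stand-in token code; Bison would assign the real one */
int yylval;                   /* stand-in for the semantic value of the last token */
const char *scan_input = "";  /* hypothetical: text still to be scanned */
int error_count = 0;          /* hypothetical: number of errors reported so far */

/* Return the next token code, or 0 at end of input. */
int yylex(void)
{
    /* Skip blanks and tabs between tokens. */
    while (*scan_input == ' ' || *scan_input == '\t')
        scan_input++;

    if (*scan_input == '\0')
        return 0;

    /* A run of digits is a NUM token; its value goes in yylval. */
    if (isdigit((unsigned char)*scan_input)) {
        yylval = 0;
        while (isdigit((unsigned char)*scan_input))
            yylval = yylval * 10 + (*scan_input++ - '0');
        return NUM;
    }

    /* Any other character is returned as its own token code. */
    return (unsigned char)*scan_input++;
}

/* Called by the parser when it detects an error. */
void yyerror(const char *msg)
{
    error_count++;
    fprintf(stderr, "error: %s\n", msg);
}
```

With scan_input pointing at "12 + 7", successive calls to yylex() return NUM (with yylval set to 12), then '+', then NUM (with yylval set to 7), then 0.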

In summary, lex and flex are both lexical analyzer generators, lex being the older of the two. The lexical analyzer they generate creates tokens, and those tokens are then read by a parser produced by a parser generator such as Bison. Bison warns about ambiguities in the grammar, and the generated parser is where syntax checking, abstract-syntax-tree construction and error recovery take place.