Lab - 02 Lexical Analysis White Space
The lexical analyzer is the part of the compiler that reads the source program and also performs
certain secondary tasks at the user interface. One such task is stripping out comments and
white space (blanks, tabs, and newline characters) from the source program.
Another is correlating error messages from the compiler with the source program, i.e., keeping a
correspondence between errors and source line numbers.
Objectives
Tools/Software Requirement
Description
Lexical analysis is the process of converting a sequence of characters into a sequence of tokens.
A program or function that performs lexical analysis is called a lexical analyzer, lexer, or
scanner. A lexer often exists as a single function that is called by a parser or another function.
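As a concrete illustration of the definition above, the following sketch converts a line of TINY-like source into tokens. The token names and patterns here are illustrative assumptions, not TINY's official token set:

```python
import re

# Illustrative token categories; the actual TINY token set may differ.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ASSIGN", r":="),
    ("ID",     r"[A-Za-z]+"),
    ("OP",     r"[+\-*/<=]"),
    ("SEMI",   r";"),
    ("SKIP",   r"[ \t\n]+"),   # whitespace is recognized but discarded
]
MASTER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))

def tokenize(source):
    """Convert a character sequence into a list of (kind, lexeme) tokens."""
    tokens = []
    for m in MASTER.finditer(source):
        if m.lastgroup != "SKIP":          # drop blanks, tabs, newlines
            tokens.append((m.lastgroup, m.group()))
    return tokens

print(tokenize("fact := fact * x;"))
```

Note that the whitespace pattern is matched like any other token but never emitted; this is exactly the "stripping" behavior the lab asks for.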
For the lab experiments we use TINY, a very simple hypothetical high-level programming language.
Although it lacks floating-point values, arrays, functions, and much more, it is still large
enough to illustrate the essential features of compiler construction.
i. Comments are enclosed in curly brackets {. . .}. Comments cannot be nested; however,
multi-line comments are allowed.
ii. White space consists of blanks, tabs and newlines.
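Rules i and ii above can be sketched as a single pass over the source text. This is a hypothetical helper, assuming comment text is simply dropped; since comments cannot be nested, one boolean flag is enough state:

```python
def strip_comments(source):
    """Remove TINY comments: { ... }, not nested, may span multiple lines."""
    out, in_comment = [], False
    for ch in source:
        if in_comment:
            if ch == "}":
                in_comment = False   # comment ends; nesting is not tracked
        elif ch == "{":
            in_comment = True        # comment begins; characters are skipped
        else:
            out.append(ch)
    return "".join(out)

print(strip_comments("x := 1; { a comment\nspanning lines } y := 2;"))
```

Because the flag simply turns off at the first closing bracket, a "nested" comment like { a { b } c } would end at the first }, which matches the no-nesting rule.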
Lab Tasks
1. Implement code for whitespace removal, i.e., blanks, tabs, and newlines
2. Write code for removing comments from the source code
a) Single line comments
b) Multi-line comments
3. Test your implementation using the sample “factorial” program written in TINY below.
Factorial Program:
if 0 < x then
fact := 1;
repeat
fact := fact * x;
x := x - 1;
until x = 0;
write fact;
end
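A combined sketch for Tasks 1 and 2, applied to the factorial program above, might look like the following. The function name and the choice to collapse whitespace runs to a single blank (rather than deleting them outright, which would glue adjacent identifiers together) are assumptions:

```python
def clean(source):
    """Strip { ... } comments, then collapse blanks, tabs, and newlines
    to single spaces (Lab Tasks 1 and 2)."""
    no_comments, in_comment = [], False
    for ch in source:
        if in_comment:
            in_comment = ch != "}"   # comments are not nested
        elif ch == "{":
            in_comment = True
        else:
            no_comments.append(ch)
    # split() with no arguments treats any run of whitespace as one separator
    return " ".join("".join(no_comments).split())

factorial = """if 0 < x then
  fact := 1;
  repeat
    fact := fact * x;
    x := x - 1;
  until x = 0;
  write fact;
end"""
print(clean(factorial))
```

Running this on the sample prints the whole program on one line with single spaces between tokens; adding a { comment } anywhere in the sample lets you check Task 2 as well.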