
1. What is lexical analysis? What is the purpose of lexical analysis? Explain with a diagram.
= 1) Lexical analysis is the first step of compilation. 2) Its purpose is to convert the input from a
simple sequence of characters into a list of tokens of different kinds, such as numerical and
string constants, variable identifiers, and programming language keywords. The lex tool, in
turn, exists to generate such lexical analyzers.

2. What is a token?
= A token is a group of characters having collective meaning: typically a word or
punctuation mark, separated by a lexical analyzer and passed to a parser.
3. Define tokenization.
= Tokenization is the act of breaking up a sequence of characters into pieces such as words,
keywords, phrases, symbols and other elements, called tokens.
4. Define regular expression. What are the applications of regular expressions?
= 1) A regular expression (or regex) is a pattern (or filter) that describes a set of strings.
In other words, it accepts the strings that match the pattern and rejects the rest.
2) Regular expressions are useful for many practical day-to-day tasks, such as those a data
scientist encounters: data pre-processing, natural language processing, pattern matching,
web scraping and data extraction.
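To make the "accepts some strings, rejects the rest" idea concrete, here is a minimal sketch using the POSIX regex.h interface (regcomp, regexec, regfree). The helper name matches and the patterns in the usage note are illustrative assumptions, not part of the notes above.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if s contains a match for the POSIX
   extended regular expression in pattern, 0 otherwise. Anchor the
   pattern with ^ and $ to require that the whole string matches. */
int matches(const char *pattern, const char *s)
{
    regex_t re;
    int ok;

    assert(pattern != NULL && s != NULL);
    /* REG_NOSUB: only a yes/no answer is needed, not match offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return 0;               /* an invalid pattern is treated as "no match" */
    ok = (regexec(&re, s, 0, NULL, 0) == 0);
    regfree(&re);
    return ok;
}
```

For example, matches("^[A-Za-z_][A-Za-z0-9_]*$", "value") accepts the string, while the same pattern rejects "9lives".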
5. What is a finite automaton? What are the applications of finite automata?
= 1) A finite automaton is a state machine that takes a string of symbols as input and changes
its state accordingly. It is a recognizer for the language described by a regular expression.
2) Applications:
*Designing the lexical analyzer of a compiler.
*Recognizing patterns described by regular expressions.
*Designing combinational and sequential circuits using Mealy and Moore machines.
*Text editors.
*Implementing spell checkers.
6. List any two lex library functions.
= *main() : Invokes the lexical analyzer by calling the yylex subroutine.
    #include <stdio.h>
    #include <stdlib.h>
    #include <locale.h>
    int yylex(void);            /* generated by lex */
    int main(void) {
        setlocale(LC_ALL, "");
        yylex();
        exit(0);
    }
*yywrap() : Returns the value 1 when the end of input occurs.
    int yywrap(void) {
        return 1;
    }
*yymore() : Appends the next matched string to the current value of the yytext array rather
than replacing the contents of the yytext array.
*yyless(int n) : Retains n initial characters in the yytext array and returns the remaining
characters to the input stream.
*yyreject() : Allows the lexical analyzer to match multiple rules for the same input string.
(The yyreject subroutine is called when the special action REJECT is used.)
7. Define token recognition. With the help of a diagram, describe the recognition of tokens.
= Token recognition is usually described with a transition diagram: a diagrammatic
representation of the actions that take place when the lexical analyzer is called by the
parser to get the next token. The diagram keeps track of information about the characters
that are seen as the forward pointer scans the input.

In a programming language, keywords, constants, identifiers, strings, numbers, operators
and punctuation symbols can be considered tokens. For example, in C, the variable
declaration int value = 100; contains the tokens int (keyword), value (identifier),
= (operator), 100 (constant) and ; (symbol). The character sequences themselves (int, value,
=, 100, ;) are the lexemes.
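The classification of int value = 100; can be sketched as a small hand-written function. This is only an illustration under stated assumptions: the names TokenKind and classify are hypothetical, the keyword list is a tiny subset of C, and the lexemes are assumed to have been separated already.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical token kinds, mirroring the categories named in the text. */
typedef enum {
    TK_KEYWORD, TK_IDENTIFIER, TK_OPERATOR, TK_CONSTANT, TK_SYMBOL, TK_UNKNOWN
} TokenKind;

/* Classify one already-separated lexeme. Only a small subset of C is covered. */
TokenKind classify(const char *lexeme)
{
    static const char *keywords[] = {
        "int", "char", "float", "if", "else", "while", "return"
    };
    size_t i, n;

    assert(lexeme != NULL);
    n = strlen(lexeme);
    if (n == 0)
        return TK_UNKNOWN;

    /* Keywords are checked first because they also look like identifiers. */
    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(lexeme, keywords[i]) == 0)
            return TK_KEYWORD;

    /* Identifier: a letter or underscore, then letters, digits or underscores. */
    if (isalpha((unsigned char)lexeme[0]) || lexeme[0] == '_') {
        for (i = 1; i < n; i++)
            if (!isalnum((unsigned char)lexeme[i]) && lexeme[i] != '_')
                return TK_UNKNOWN;
        return TK_IDENTIFIER;
    }

    /* Constant: one or more decimal digits. */
    if (isdigit((unsigned char)lexeme[0])) {
        for (i = 1; i < n; i++)
            if (!isdigit((unsigned char)lexeme[i]))
                return TK_UNKNOWN;
        return TK_CONSTANT;
    }

    /* Single-character operators and punctuation symbols. */
    if (n == 1 && strchr("+-*/=<>", lexeme[0]) != NULL)
        return TK_OPERATOR;
    if (n == 1 && strchr(";,(){}", lexeme[0]) != NULL)
        return TK_SYMBOL;
    return TK_UNKNOWN;
}
```

With these assumptions, classify("int") yields TK_KEYWORD, classify("value") yields TK_IDENTIFIER, and classify("100") yields TK_CONSTANT.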
8. Define input buffering. What is its purpose? Explain with an example.
= 1) Without buffering, the lexical analyzer would have to access secondary memory each time
it identifies a token, which is time-consuming and costly. So the input is first read into a
buffer and then scanned by the lexical analyzer. 2) An input buffer is a memory area that
holds incoming data before it is passed on for processing.
It uses two pointers to scan tokens −
Begin Pointer (bptr) − It points to the beginning of the string to be read.
Look Ahead Pointer (lptr) − It moves ahead to search for the end of the token.
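The two-pointer scan can be sketched as follows. This is a simplified illustration, not a full buffering scheme: the function name next_lexeme is hypothetical, the whole input is assumed to sit in one null-terminated buffer, and every character that is not a letter, digit or underscore is treated as a one-character lexeme.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper. Starting at *bptr, skip white space, then move a
   look-ahead pointer to the end of the next lexeme. On return, *bptr
   points at the first character of the lexeme and the result is its
   length (0 when the buffer is exhausted). */
size_t next_lexeme(const char **bptr)
{
    const char *lptr;

    assert(bptr != NULL && *bptr != NULL);
    while (isspace((unsigned char)**bptr))
        (*bptr)++;                  /* begin pointer skips white space */

    lptr = *bptr;
    if (*lptr == '\0')
        return 0;                   /* nothing left in the buffer */

    if (isalnum((unsigned char)*lptr) || *lptr == '_') {
        /* Words and numbers: look ahead while characters still belong. */
        while (isalnum((unsigned char)*lptr) || *lptr == '_')
            lptr++;
    } else {
        lptr++;                     /* any other character stands alone */
    }
    return (size_t)(lptr - *bptr);
}
```

After each call the caller advances the begin pointer by the returned length, so the next call starts where the previous lexeme ended.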
9. "Lex is a compiler". Comment.
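= Lex is a program generator designed for lexical processing of character input streams.
Tools of this kind help with programs that transform structured input: anything from a simple
text search program that looks for patterns in its input file to a C compiler that transforms
a program into optimized code. In programs with structured input, two tasks occur over and
over: dividing the input into meaningful units, and then discovering the relationships among
the units. Lex handles the first task. In that sense Lex itself works like a compiler: it
translates a lex source specification into a C program.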
10. What is the output of a Lex program?
= Lex takes a lex source program as input and produces a lexical analyzer as its output.
11. Lex is a scanner provided by the Linux operating system. State true/false.
= True. The lex utility generates C programs to be used in lexical processing of character
input, and that can be used as an interface to yacc. The C programs are generated from lex
source code and conform to the ISO C standard. Usually, the lex utility writes the program it
generates to the file lex.yy.c.
12. Define pattern.
= Pattern: A set of strings in the input for which the same token is produced as output. This
set of strings is described by a rule called a pattern associated with the token.
13. "Lexical analyzer keeps track of line numbers". State true or false.
= True. The lexical analyzer sees every newline character as the forward pointer scans the
input, so it can count them and associate a line number with each token and error message.
14. What is lexeme?
= Lexeme: A lexeme is a sequence of characters in the source program that is matched by
the pattern for a token.
15. Write the purpose of lex library functions yylex() and yyerror().
= yyerror() : A function that is used by routines that generate diagnostics. A version
of yyerror() is provided in the library; it simply passes its arguments to fprintf with
output to the error stream stderr and writes a newline after the message. yyerror() returns
an integer value, which is the value returned from fprintf. You can provide a replacement,
but its definition must agree with the prototype of yyerror() defined in yylex.c:
extern int yyerror(const char *format, ...);
yylex() : The scanner that lex produces. It returns a token each time it locates one in the
input. A negative or zero value indicates an error or end of input.
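A replacement yyerror() matching the behavior described above might look like the following. This is a sketch written from that description only (forward the arguments to an fprintf-style call on stderr, write a newline, return the count); it is not the library's own source.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Sketch of a user-supplied yyerror(): prints a printf-style message
   to stderr, follows it with a newline, and returns the value reported
   by the formatted-output call (the newline is not counted). */
int yyerror(const char *format, ...)
{
    va_list args;
    int written;

    assert(format != NULL);
    va_start(args, format);
    written = vfprintf(stderr, format, args);   /* varargs form of fprintf */
    va_end(args);
    fputc('\n', stderr);
    return written;
}
```

A scanner action could then report a problem with a call such as yyerror("unexpected character %c", yytext[0]).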
16. What is lexical error?
= A lexical error is any input that can be rejected by the lexer. This generally results from
token recognition falling off the end of the rules you've defined.
17. Define sentinel. Describe with example.
= A sentinel is a special character that cannot be part of the source program; an eof
character is typically used as the sentinel. Without a sentinel, each time the forward
pointer is advanced, a check must be made that it has not moved off one half of the buffer;
if it has, the other half must be reloaded. Placing a sentinel at the end of each buffer half
lets the scanner fold this end-of-buffer check into the test it already makes on the current
character. Examples: 1) The null character that marks the end of a null-terminated string.
2) The null pointer that marks the end of a linked list or a tree.
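A minimal sketch of the sentinel idea, using the null character of a null-terminated string as the sentinel. The names SENTINEL, digit_run and at_end are hypothetical, and the two-half buffer reload is left out to keep the example short.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define SENTINEL '\0'   /* assumption: never appears inside the source text */

/* Length of the run of digits at the start of buf. The buffer must end
   with SENTINEL. Because the sentinel is not a digit, the single
   isdigit() test also stops the scan at the end of the buffer, so no
   separate bounds check is needed on each step. */
size_t digit_run(const char *buf)
{
    const char *forward = buf;

    assert(buf != NULL);
    while (isdigit((unsigned char)*forward))
        forward++;
    return (size_t)(forward - buf);
}

/* Only after the scan stops does the caller need to ask whether it
   stopped at the sentinel or at an ordinary non-digit character. */
int at_end(const char *forward)
{
    return *forward == SENTINEL;
}
```

In a real two-half buffer the at_end check is where the scanner would decide whether to reload the other half or report the end of input.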
18. What is Lex? With the help of a diagram, describe lex. Write the steps of execution of a
lex program.
= Lex is a program designed to generate scanners, also known as tokenizers, which
recognize lexical patterns in text. Such tools help with programs that transform structured
input: anything from a simple text search program that looks for patterns in its input file
to a C compiler that transforms a program into optimized code. In programs with structured
input, two tasks occur over and over: dividing the input into meaningful units, and then
discovering the relationships among the units.

1) Use the lex program to change the specification file into a C language program. The
resulting program is in the lex. yy. ...
2) Use the cc command with the -ll flag to compile and link the program with a library of lex
subroutines. The resulting executable program is in the a.
19. Differentiate between tokens, lexemes and patterns.
=
Criteria | Token | Lexeme | Pattern
Definition | A token is a sequence of characters that is treated as a unit because it cannot be broken down further. | A sequence of characters in the source code that is matched by the predefined language rules so that it can be specified as a valid token. | A set of rules that a scanner follows to create a token.
Keyword | All the reserved keywords of the language. | int, goto | The sequence of characters that make up the keyword.
Identifier | The name of a variable, function, etc. | main, a | Must start with a letter, followed by letters or digits.
Operator | All the operators are considered tokens. | +, = | +, =
Punctuation | Each kind of punctuation is considered a token, e.g. semicolon, bracket, comma, etc. | (, ), {, } | (, ), {, }
Literal | A string or boolean literal. | "Welcome to GeeksforGeeks!" | Any string of characters (other than ") between " and "

20. Write a short note on: Finite Automata (FA) as a lexical analyzer.
= A finite automaton is a state machine that takes a string of symbols as input and changes
its state accordingly. It is a recognizer for the language described by a regular expression.
When an input string is fed into a finite automaton, it changes its state on each symbol. If
the input string is fully processed and the automaton ends in a final state, the string is
accepted, i.e., the string just fed in is a valid token of the language in hand.
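As an illustration, the following sketch hard-codes a small deterministic finite automaton that accepts identifiers and unsigned integers. The state names, the result names and the function run_dfa are all hypothetical; a generated scanner would normally use transition tables built from the regular expressions instead.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical DFA: IN_ID and IN_NUM are the final (accepting) states. */
enum State  { START, IN_ID, IN_NUM, DEAD };
enum Result { NOT_A_TOKEN, ACCEPT_IDENTIFIER, ACCEPT_NUMBER };

/* Transition function: the next state for state s on input symbol c. */
static enum State step(enum State s, unsigned char c)
{
    int letter = isalpha(c) || c == '_';
    int digit  = isdigit(c);

    switch (s) {
    case START:  return letter ? IN_ID : digit ? IN_NUM : DEAD;
    case IN_ID:  return (letter || digit) ? IN_ID : DEAD;
    case IN_NUM: return digit ? IN_NUM : DEAD;
    default:     return DEAD;       /* DEAD has no way out */
    }
}

/* Feed the whole string to the automaton, one state change per symbol,
   and accept only if it ends in a final state. */
enum Result run_dfa(const char *input)
{
    enum State s = START;

    assert(input != NULL);
    for (; *input != '\0'; input++)
        s = step(s, (unsigned char)*input);

    if (s == IN_ID)
        return ACCEPT_IDENTIFIER;
    if (s == IN_NUM)
        return ACCEPT_NUMBER;
    return NOT_A_TOKEN;
}
```

Here run_dfa("value") ends in IN_ID and is accepted as an identifier, while run_dfa("9lives") falls into the DEAD state and is rejected.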

21. Write a lex program which finds the factors of a given number.
=
%{
#include <stdio.h>
#include <stdlib.h>
%}
%%
[0-9]+  { int i, num = atoi(yytext);
          printf("Factors of %d are: ", num);
          for (i = 1; i <= num; ++i)
              if (num % i == 0) printf("%d ", i);
          printf("\n"); }
.|\n    ;  /* ignore any other input */
%%
int yywrap(void) { return 1; }
int main(void) {
    printf("Enter a positive integer: ");
    yylex();
    return 0;
}

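22. Write a lex program to find the area of a circle.
=
%{
#include <stdio.h>
#include <stdlib.h>
%}
%%
[0-9]+  { float pie = 3.14f;
          int radius = atoi(yytext);
          printf("The radius of the circle is %d\n", radius);
          printf("The area of the given circle is %f\n", pie * radius * radius); }
.|\n    ;  /* ignore any other input */
%%
int yywrap(void) { return 1; }
int main(void) {
    printf("Enter the radius of the circle: ");
    yylex();
    return 0;
}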

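23. Write a lex program to find the factorial of a given number.
=
%{
#include <stdio.h>
#include <stdlib.h>
/* function for calculating factorial (overflows a 32-bit int for n > 12) */
int fact(int n)
{
    int i, f = 1;
    for (i = 1; i <= n; i++)
        f = f * i;
    return f;
}
%}
%%
[0-9]+  { printf("Factorial of the number is: %d\n", fact(atoi(yytext))); }
.|\n    ;  /* ignore any other input */
%%
int yywrap(void) { return 1; }
/* Driver function */
int main(void)
{
    printf("Enter a number\n");
    yylex();
    return 0;
}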

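24. Write a lex program to return the tokens identifier and number.
=
%{
#include <stdio.h>
/* Token codes returned by yylex(); the names and values are illustrative. */
#define IDENTIFIER 1
#define NUMBER     2
%}
%%
[a-zA-Z_][a-zA-Z0-9_]*  { return IDENTIFIER; }
[0-9]+                  { return NUMBER; }
.|\n                    ;  /* skip anything else */
%%
int yywrap(void) { return 1; }
int main(void)
{
    int tok;
    while ((tok = yylex()) != 0)
        printf("%s: %s\n", tok == IDENTIFIER ? "identifier" : "number", yytext);
    return 0;
}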

25. Write a lex program to count the total number of vowels and the total number of
consonants in the input string.
=
%{
#include <stdio.h>
int vow_count = 0;
int const_count = 0;
%}

%%
[aeiouAEIOU]    { vow_count++; }
[a-zA-Z]        { const_count++; }
.|\n            ;  /* ignore everything that is not a letter */
%%
int yywrap(void) { return 1; }
int main(void)
{
    printf("Enter the string of vowels and consonants: ");
    yylex();
    printf("Number of vowels are: %d\n", vow_count);
    printf("Number of consonants are: %d\n", const_count);
    return 0;
}
