1. What is lexical analysis? Explain with diagram.
= 1) The first step of compilation, called lexical analysis, is to convert the input from a
simple sequence of characters into a list of tokens of different kinds, such as numerical and
string constants, variable identifiers, and programming language keywords.
2) The purpose of lex is to generate lexical analyzers.
2. What is a token?
= A token is a group of characters having collective meaning: typically a word or
punctuation mark, separated by a lexical analyzer and passed to a parser.
3. Define tokenization.
= Tokenization is the act of breaking up a string into pieces such as words,
keywords, phrases, symbols and other elements, called tokens.
4. Define regular expression. What are the applications of regular expressions?
= 1) A Regular Expression (or Regex) is a pattern (or filter) that describes a set of strings
that match the pattern. In other words, it accepts a certain set of strings and rejects the
rest.
2) Regular expressions are useful for numerous practical, day-to-day tasks that a data
scientist encounters. They are used everywhere from data pre-processing to natural
language processing, pattern matching, web scraping, and data extraction.
5. What is a finite automaton? What are the applications of finite automata?
= 1) A finite automaton is a state machine that takes a string of symbols as input and changes
its state accordingly. A finite automaton serves as a recognizer for regular expressions.
2) *For designing the lexical analyzer of a compiler.
*For recognizing patterns using regular expressions.
*For designing combinational and sequential circuits using Mealy and Moore machines.
*Used in text editors.
*For implementing spell checkers.
6. List any two lex library functions.
= *main() : Invokes the lexical analyzer by calling the yylex subroutine.
#include <stdio.h>
#include <stdlib.h>
#include <locale.h>
extern int yylex(void);
int main() {
    setlocale(LC_ALL, "");
    yylex();
    exit(0);
}
*yywrap() : Returns the value 1 when the end of input occurs.
int yywrap() {
    return 1;
}
*yymore() : Appends the next matched string to the current value of the yytext array rather
than replacing the contents of the yytext array.
*yyless(int n) : Retains n initial characters in the yytext array and returns the remaining
characters to the input stream.
*yyreject() : Allows the lexical analyzer to match multiple rules for the same input string.
(The yyreject subroutine is called when the special action REJECT is used.)
7. Define token recognition. With the help of a diagram, describe the recognition of
tokens.
= A transition diagram is a diagrammatic representation that depicts the action taking place
when a lexical analyzer is called by the parser to get the next token. It is used to keep track
of information about the characters that are seen as the forward pointer scans the input.
1) Use the lex program to change the specification file into a C language program. The
resulting program is in the lex.yy.c file.
2) Use the cc command with the -ll flag to compile and link the program with a library of lex
subroutines. The resulting executable program is in the a.out file.
19. Differentiate between tokens, lexemes and patterns.
=
Definition:
*Token: A token is basically a sequence of characters that is treated as a unit, as it cannot
be further broken down.
*Lexeme: It is a sequence of characters in the source code that is matched by the given
predefined language rules for every lexeme to be specified as a valid token.
*Pattern: It specifies the set of rules that a scanner follows to create a token.
Punctuation:
*Token: Each kind of punctuation is considered a token, e.g. semicolon, bracket, comma, etc.
*Lexeme: (, ), {, }
*Pattern: (, ), {, }
Literal:
*Token: a grammar rule or boolean literal.
*Lexeme: "Welcome to GeeksforGeeks!"
*Pattern: any string of characters (except ' ') between " and "
20. Write a short note on: Finite Automata (FA) as a lexical analyzer.
= A finite automaton is a state machine that takes a string of symbols as input and changes its
state accordingly. It serves as a recognizer for regular expressions. When an input string is
fed into the finite automaton, it changes its state for each literal. If the input string is
successfully processed and the automaton reaches a final state, the string is accepted, i.e.,
the string just fed in is a valid token of the language at hand.
21. Write a lex program which find out factors of a given number.
= #include <stdio.h>
int main() {
    int num, i;
    printf("Enter a positive integer: ");
    scanf("%d", &num);
    printf("Factors of %d are: ", num);
    for (i = 1; i <= num; ++i) {
        if (num % i == 0) {
            printf("%d ", i);
        }
    }
    return 0;
}
24. Write a lex program to return the tokens identifier and number.
= %{
#include <stdio.h>
%}

/* rule section: the first pattern matches valid identifiers,
   the second matches anything starting with an invalid character */
%%
^[a-zA-Z_][a-zA-Z0-9_]*   printf("Valid Identifier");
^[^a-zA-Z_]               printf("Invalid Identifier");
.                         ;
%%
int yywrap() { return 1; }
int main()
{ yylex(); return 0; }
25. Write a lex program to count total number of vowels and total number of
consonants from the input string.
= %{
#include <stdio.h>
int vow_count = 0;
int const_count = 0;
%}
%%
[aeiouAEIOU] {vow_count++;}
[a-zA-Z] {const_count++;}
%%
int yywrap() { return 1; }
int main()
{
    printf("Enter the string of vowels and consonants: ");
    yylex();
    printf("Number of vowels are: %d\n", vow_count);
    printf("Number of consonants are: %d\n", const_count);
    return 0;
}