[Figure: Source Language → Lexical Analyzer → Target Language]
[Figure: compiler phases between Source Language and Target Language: Lexical Analyzer (marked "Today!"), Syntax Analyzer, Semantic Analyzer, Intermediate Code. A bracket labels the Front End.]
The Role of Lexical Analyzer
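[Figure: input → Lexical Analyzer → Parser. Labels: "push back character" between the input and the Lexical Analyzer, "get next token" between the Parser and the Lexical Analyzer. Both connect to the Symbol Table.]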
Issues in Lexical Analysis
What exactly is lexing?
To the lexer, a program is really nothing more than a string of characters:
i f _ ( i = = j ) ;\n\t z = 1 ; \n e l s e ; \n\t z = 0 ; \n e n d i f ;
During our lexical analysis phase we must divide this string into meaningful substrings.
Tokens
Identifying Tokens
Implementation
Example
i f _ ( i = = j ) ;\n\t z = 1 ; \n e l s e ; \n\t z = 0 ; \n e n d i f ;
Line  Token          Lexeme
1     BLOCK_COMMAND  if
1     OPEN_PAREN     (
1     ID             i
1     OP_RELATION    ==
1     ID             j
1     CLOSE_PAREN    )
1     ENDLINE        ;
2     ID             z
2     ASSIGN         =
2     NUMBER         1
2     ENDLINE        ;
3     BLOCK_COMMAND  else
Etc…
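The table above can be reproduced with a small regular-expression lexer. This is only a sketch: the token names come from the slide, but the patterns, the helper names (TOKEN_SPEC, MASTER, tokenize), and the extra NEWLINE and SKIP categories are assumptions made for illustration. Note that "==" is listed before "=" and the keywords before ID, so the longer or more specific pattern wins.

```python
import re

# Token names follow the slide's table; the patterns are assumptions.
TOKEN_SPEC = [
    ("BLOCK_COMMAND", r"\b(?:if|else|endif)\b"),  # keywords before ID
    ("ID",            r"[A-Za-z_]\w*"),
    ("NUMBER",        r"\d+"),
    ("OP_RELATION",   r"=="),                     # "==" before "="
    ("ASSIGN",        r"="),
    ("OPEN_PAREN",    r"\("),
    ("CLOSE_PAREN",   r"\)"),
    ("ENDLINE",       r";"),
    ("NEWLINE",       r"\n"),                     # hypothetical: only counts lines
    ("SKIP",          r"[ \t]+"),                 # hypothetical: discarded whitespace
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (line, token, lexeme) tuples for the source string."""
    line, pos, tokens = 1, 0, []
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} on line {line}")
        kind = m.lastgroup
        if kind == "NEWLINE":
            line += 1
        elif kind != "SKIP":
            tokens.append((line, kind, m.group()))
        pos = m.end()
    return tokens


if __name__ == "__main__":
    program = "if (i==j);\n\tz=1;\nelse;\n\tz=0;\nendif;"
    for line, token, lexeme in tokenize(program):
        print(line, token, lexeme)
```

Whitespace and newlines are consumed but never handed on, which matches the table: only the meaningful substrings become tokens.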
Lookahead
TOKENS
The output of our lexical analysis phase is a stream of tokens. A
token is a syntactic category:
“a name for a set of input strings with related structure”.
Examples: “ID”, “NUM”, “RELATION”, “IF”
a [index] = 4 + 2
Lexeme  Token
a       ID
[       Left bracket
index   ID
]       Right bracket
=       Assign
4       Num
+       plus sign
2       Num
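The same classification can also be done by hand, one character at a time, without regular expressions. The sketch below is illustrative only: the function name scan and the single-character table are hypothetical, and the rules for what counts as an ID or a Num are assumptions. The token names are the ones used in the list above.

```python
def scan(text):
    """Yield (lexeme, token) pairs using a hand-written character loop."""
    # Single-character tokens, named as in the slide.
    single = {"[": "Left bracket", "]": "Right bracket",
              "=": "Assign", "+": "plus sign"}
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1                      # whitespace separates lexemes, no token
        elif ch.isalpha() or ch == "_":
            start = i                   # assumed rule: letter, then letters/digits
            while i < len(text) and (text[i].isalnum() or text[i] == "_"):
                i += 1
            yield text[start:i], "ID"
        elif ch.isdigit():
            start = i                   # assumed rule: one or more digits
            while i < len(text) and text[i].isdigit():
                i += 1
            yield text[start:i], "Num"
        elif ch in single:
            i += 1
            yield ch, single[ch]
        else:
            raise SyntaxError(f"unexpected character {ch!r}")


if __name__ == "__main__":
    for lexeme, token in scan("a [index] = 4 + 2"):
        print(lexeme, token)
```

Each inner loop stops at the first character that cannot extend the current lexeme and leaves it unread for the next token.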
Attributes For Tokens
E = M + C * 2
Lexeme
[Figure: Example. Labels: Classifies, Pattern]
Tokens, Patterns, Lexemes (cont’d)
Example:
Const pi = 3.1416;
In the example above, when the character sequence “pi” appears in the
source program, a token representing an identifier (ID) is returned to
the parser. Returning a token is often implemented by passing an
integer that corresponds to the token.
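One way to picture "passing an integer" is an enumeration of token codes. The sketch below is an assumption-laden illustration, not the deck's implementation: the names Tok, PATTERNS and next_token are hypothetical, the numeric codes are arbitrary, and treating "Const" as a keyword is a guess about the example language.

```python
import re
from enum import IntEnum


class Tok(IntEnum):
    """Hypothetical integer codes, one per token category."""
    CONST = 1
    ID = 2
    ASSIGN = 3
    NUM = 4
    SEMI = 5


# Assumed patterns; the keyword is tried before the general identifier.
PATTERNS = [
    (Tok.CONST,  re.compile(r"[Cc]onst\b")),
    (Tok.ID,     re.compile(r"[A-Za-z_]\w*")),
    (Tok.NUM,    re.compile(r"\d+(?:\.\d+)?")),
    (Tok.ASSIGN, re.compile(r"=")),
    (Tok.SEMI,   re.compile(r";")),
]


def next_token(text, pos):
    """Return (token_code, lexeme, new_pos), or None at end of input."""
    while pos < len(text) and text[pos].isspace():
        pos += 1
    if pos == len(text):
        return None
    for code, pattern in PATTERNS:
        m = pattern.match(text, pos)
        if m:
            return int(code), m.group(), m.end()
    raise SyntaxError(f"unexpected character {text[pos]!r}")


if __name__ == "__main__":
    source, pos = "Const pi = 3.1416;", 0
    while (result := next_token(source, pos)) is not None:
        code, lexeme, pos = result
        print(code, Tok(code).name, repr(lexeme))
```

The parser in this sketch sees only the integer code; the lexeme travels alongside it as an attribute, so "pi" arrives as code 2 (ID) together with its spelling.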
THANKS