Tayaba Anjum
Lexical Analyzer
• Read input characters from the source program
• Group them into lexemes
• Produce as output a sequence of tokens
• Interact with the symbol table
• Correlate error messages generated by the compiler with the source
program
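The role of the lexical analyzer
[Figure: the source program feeds the lexical analyzer; the parser calls getNextToken and the lexical analyzer returns a token; the parser's output goes to semantic analysis; both the lexical analyzer and the parser interact with the symbol table.]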
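The getNextToken interaction can be sketched in Python as follows. This is a minimal illustration, not a full lexer: the class `Lexer`, the method `get_next_token`, the stand-in `parse` function, and the tuple representation of tokens are all assumptions made for this example.

```python
class Lexer:
    """Hands out one token per call, only when the parser asks for it."""

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def get_next_token(self):
        # Skip whitespace between lexemes.
        while self.pos < len(self.text) and self.text[self.pos].isspace():
            self.pos += 1
        if self.pos >= len(self.text):
            return ("eof", None)
        start = self.pos
        ch = self.text[self.pos]
        if ch.isalpha():
            # Identifier: a letter followed by letters or digits.
            while self.pos < len(self.text) and self.text[self.pos].isalnum():
                self.pos += 1
            return ("id", self.text[start:self.pos])
        if ch.isdigit():
            # Number: a run of digits.
            while self.pos < len(self.text) and self.text[self.pos].isdigit():
                self.pos += 1
            return ("number", self.text[start:self.pos])
        # Anything else is treated as a one-character token (assumption).
        self.pos += 1
        return (ch, None)


def parse(lexer):
    """Stand-in for a parser: it only pulls tokens until end of input."""
    tokens = []
    while True:
        token = lexer.get_next_token()
        if token[0] == "eof":
            return tokens
        tokens.append(token)


tokens = parse(Lexer("x = y + 42"))
print(tokens)
```

The point of the design is that the parser drives the process: the lexer does no work until `get_next_token` is called.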
Why separate lexical analysis and parsing
1. Simplicity of design
2. Improving compiler efficiency
3. Enhancing compiler portability
Tokens, Patterns and Lexemes
• A token is a pair consisting of a token name and an optional attribute value
• A pattern is a description of the form that the lexemes of a token may
take
• A lexeme is a sequence of characters in the source program that
matches the pattern for a token
Example
<id, 1> <=> <id,2> <*> <id, 3> <*> <number, 2>
Symbol Table
ID   Name
1    E
2    M
3    C
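A minimal Python sketch of how a token stream and symbol table like the ones in the example could be produced. The input statement `E = M * C * 2` is an assumption chosen to reproduce the token sequence in the example (the slide does not state the source statement); `TOKEN_PATTERN` and `tokenize` are hypothetical names.

```python
import re

# One named group per token class; each group is the pattern for that token.
TOKEN_PATTERN = re.compile(
    r"\s*(?:(?P<id>[A-Za-z_]\w*)|(?P<number>\d+)|(?P<op>[=*+\-/]))"
)


def tokenize(source):
    """Return (tokens, symbol_table) for a simple arithmetic statement."""
    symbol_table = {}  # identifier name -> numeric ID
    tokens = []
    source = source.rstrip()
    pos = 0
    while pos < len(source):
        match = TOKEN_PATTERN.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        pos = match.end()
        if match.lastgroup == "id":
            name = match.group("id")
            # Enter the lexeme into the symbol table on first sight.
            entry = symbol_table.setdefault(name, len(symbol_table) + 1)
            tokens.append(("id", entry))
        elif match.lastgroup == "number":
            tokens.append(("number", int(match.group("number"))))
        else:
            # Operators carry no attribute value.
            tokens.append((match.group("op"), None))
    return tokens, symbol_table


tokens, symbol_table = tokenize("E = M * C * 2")
print(tokens)
print(symbol_table)  # {'E': 1, 'M': 2, 'C': 3}
```

Identifier tokens carry a symbol-table ID rather than the lexeme itself, which is why the same name always maps to the same entry.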
Error recovery
• Panic mode: successive characters are ignored until a well-formed token is reached
• Delete one character from the remaining input
• Insert a missing character into the remaining input
• Replace a character by another character
• Transpose two adjacent characters
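Panic mode is the simplest of these strategies to sketch. The Python example below is an illustration under assumptions: the token pattern `TOKEN`, the function name `scan_with_panic_mode`, and the choice to report each skipped run as a `(position, text)` pair are not from the slides.

```python
import re

# Assumed token set: identifiers, integers, and a few one-character operators.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|[=+\-*/()]")


def scan_with_panic_mode(source):
    """Return (lexemes, errors); errors are (position, skipped_text) pairs."""
    lexemes, errors = [], []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        match = TOKEN.match(source, pos)
        if match:
            lexemes.append(match.group())
            pos = match.end()
            continue
        # Panic mode: ignore characters until a token can start again.
        start = pos
        while (
            pos < len(source)
            and not source[pos].isspace()
            and not TOKEN.match(source, pos)
        ):
            pos += 1
        errors.append((start, source[start:pos]))
    return lexemes, errors


lexemes, errors = scan_with_panic_mode("x = @#y + 2")
print(lexemes)  # ['x', '=', 'y', '+', '2']
print(errors)   # [(4, '@#')]
```

Scanning continues after the error, so one bad character run does not hide later tokens from the parser.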
Buffering Issue
• The lexical analyzer may need to look at least one character ahead to make a token decision.
• Buffering: reduces the overhead required to process a single character
Token Specification
We need a formal way to specify patterns: regular expressions
• Alphabet: any finite set of symbols
• String over an alphabet: a finite sequence of symbols drawn from that alphabet
• Language: any countable set of strings over some fixed alphabet
Examples
• Which language is generated by:
• (a|b)(a|b)
• a*
• (a|b)*
• a|a*b
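One way to explore these expressions is to list the short strings each one matches. The Python sketch below uses the standard `re` module, whose syntax for `|`, `*`, and grouping agrees with the notation used here; the helper name `language_sample` and the length limit of 3 are assumptions for illustration, and the output is only a finite sample of each language.

```python
import re
from itertools import product


def language_sample(pattern, alphabet="ab", max_len=3):
    """List every string over `alphabet` up to `max_len` that the pattern matches."""
    regex = re.compile(pattern)
    matches = []
    for length in range(max_len + 1):
        for letters in product(alphabet, repeat=length):
            candidate = "".join(letters)
            # fullmatch requires the whole string to match, not just a prefix.
            if regex.fullmatch(candidate):
                matches.append(candidate)
    return matches


for pattern in ["(a|b)(a|b)", "a*", "(a|b)*", "a|a*b"]:
    print(pattern, "->", language_sample(pattern))
```

For instance, `(a|b)(a|b)` yields exactly the four strings of length two, while `a|a*b` yields `a` together with strings of zero or more `a`s followed by a single `b`.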
Token Recognition
Implementation: Transition Diagrams
• Intermediate step in constructing a lexical analyzer
• Convert patterns into flowcharts called transition diagrams
– Nodes (circles): called states
– Edges: directed from one state to another, labeled by symbols
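A transition diagram translates fairly directly into a table of states and labeled edges. The Python sketch below encodes an assumed diagram for identifiers (a letter followed by letters or digits); the diagram itself, the state numbers, and the names `TRANSITIONS`, `ACCEPTING`, `char_class`, and `match_identifier` are illustrative choices, not taken from the slides.

```python
# Edges of the diagram: (current state, symbol class) -> next state.
TRANSITIONS = {
    (0, "letter"): 1,  # start state: an identifier must begin with a letter
    (1, "letter"): 1,  # stay in state 1 on further letters
    (1, "digit"): 1,   # ... or digits
}
ACCEPTING = {1}  # states in which a complete identifier has been seen


def char_class(ch):
    """Map a character to the edge label used in the diagram."""
    if ch.isalpha():
        return "letter"
    if ch.isdigit():
        return "digit"
    return "other"


def match_identifier(text, start=0):
    """Follow edges from state 0; return the matched lexeme or None."""
    state = 0
    pos = start
    while pos < len(text):
        next_state = TRANSITIONS.get((state, char_class(text[pos])))
        if next_state is None:
            break  # no edge for this symbol: stop scanning
        state = next_state
        pos += 1
    return text[start:pos] if state in ACCEPTING else None


print(match_identifier("count1 = 0"))  # count1
print(match_identifier("9lives"))      # None
```

Adding a token class then means adding states and edges to the table rather than writing new control flow.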