04 Syntax Analysis

Syntax Analysis
The role of parser
token
Source Lexical Parse tree Rest of Front Intermediate
Parser
program Analyzer End representation
getNext
Token
Symbol
table
Error handling
• Common programming errors
o Lexical errors
o Syntactic errors
o Semantic errors
o Lexical errors
• Error handler goals

o Report the presence of errors clearly and accurately
o Recover from each error quickly enough to detect
subsequent errors
o Add minimal overhead to the processing of correct
programs
Error-recover strategies
• Panic mode recovery
o Discard input symbol one at a time until one of designated set
of synchronization tokens is found
• Phrase level recovery
o Replacing a prefix of remaining input by some string that
allows the parser to continue
• Error productions
o Augment the grammar with productions that generate the
erroneous constructs
• Global correction
o Choosing minimal sequence of changes to obtain a globally
least-cost correction
Benefits of a Grammar
Example of Grammars
EE+T|T E  TE’
TT*F|F E’  +TE’ | Ɛ
F  (E) | id T  FT’
T’  *FT’ | Ɛ
F  (E) | id
E  E + E | E * E | (E) | id
Context-free Grammars
• Terminal
• Nonterminals
• Start Symbol
• Productions
Example (example 4.5)
Notational convention
Notational conventions contd..
Example 4.5 re-written
Derivations
• Productions are treated as rewriting
rules to generate a string
o E  E + E | E * E | -E | (E) | id
• E => -E => -(E) => (id)
o Derivations for –(id+id)

• E => -E => -(E) => -(E+E) => -(id+E)=>-(id+id)
Parse trees
• -(id+id)
• E => -E => -(E) => -(E+E) => -(id+E)=>-(id+id)
End of part 1
Derivation types
Let’s first consider the grammar 4.7 from Aho et al.
The sentence –(id + id) is valid for grammar 4.7

Because
Left most derivation
Right most derivation

Derivation in detail
• Recall the Parse tree
E
E
E
- E
E
 -E
E
- E
E
 -E ( E )
 -(E)
E
- E
E
 -E ( E )
 -(E)
E + E
 -(E+E)
E
- E
E
 -E ( E )
 -(E)
E + E
 -(E+E)
 -(id+E) E
id
Left most Derivation
E
- E
E
 -E ( E )
 -(E)
E + E
 -(E+E)
 -(id+E) id
id
 -(id+id)
Left most Derivation
E
E
E
- E
E
 -E
E
- E
E
 -E ( E )
 -(E)
E
- E
E
 -E ( E )
 -(E)
E + E
 -(E+E)
E
- E
E
 -E ( E )
 -(E)
E + E
 -(E+E)
 -(E+id) id
E
Right most Derivation
E
- E
E
 -E ( E )
 -(E)
E + E
 -(E+E)
 -(E+id) id
id
 -(id+id)
Right most Derivation
Another Example
id * id + id
E E
E + E E + E
Another Example
id * id + id
E E
E + E E + E
* E id
E
Another Example
id * id + id
E E
E + E E + E
* E * E id
E E
id
Left most Derivation Right most Derivation

Another Example
id * id + id
E E
E + E E + E
* E * E id
E E
id id id

Another Example
id * id + id
E E
E + E E + E
* E id * E id
E E
id id id id

Ambiguity
• For some strings there exist more than one parse
tree
• Or more than one leftmost derivation
• Or more than one rightmost derivation
• Ambiguity is Bad – needs to be removed
• Example: id+id*id (example 4.11 in Aho et al)

Example of Ambiguity
How to deal with Ambiguity
• (possibly) Rewrite the Grammar
• Enforce precedence of * over +

Ambiguity again –
Dangling Else
• Consider this Grammar
• Then consider this expression
• Is it ambiguous?!
Ambiguity again –
Dangling Else
How to fix Dangling Else
problem
• Else matches with the closest unmatched then
Parse tree for the
rewritten grammar
Resolving Ambiguity
• No general techniques
• Automatic conversion of an ambiguous grammar

to an unambiguous one is not possible
• Rewriting the grammar – an option
• Precedence and associativity declarations – a

better option than rewriting
• * should precede +
Context free Grammars
VS Regular Expressions
Recall the regular expression
NFA for the RE

Context free Grammar for

(a|b)*abb
NFA for the RE

End of part 2
Error Handling (again!!)
Syntax Error Handling
Panic Mode recovery
• E.G. (1 + + 2)
o Skip ahead to next integer and then continue
Error Production
Local and Global
Corrections
Lexical VS Syntactic Analysis
Top Down Parsing
• Recursive decent parsing – matching the terminals

o Requires backtracking when a mismatch occurs during matching terminals
Top down parsing -
example
• Consider the grammar
ET|T+E
T = int | int * T | (E)
• Token stream is (int)
• Can (int) be parsed?

Top down parsing -
ET|T+E
example
T = int | int * T | (E)
E
(int)
Top down parsing -
ET|T+E
example
T = int | int * T | (E)
E
(int)
Top down parsing -
ET|T+E
example
T = int | int * T | (E)
E
int Mismatch: int ≠ ( !!

Backtrack
(int)
Top down parsing -
ET|T+E
example
T = int | int * T | (E)
E
(int)
Top down parsing -
ET|T+E
example
T = int | int * T | (E)
E
int * T
Mismatch: int ≠ ( !!
Backtrack
(int)
Top down parsing -
ET|T+E
example
T = int | int * T | (E)
E
( E )
Match 
Advance
(int)
Top down parsing -
ET|T+E
example
T = int | int * T | (E)
E
( E )
(int)
Top down parsing -
ET|T+E
example
T = int | int * T | (E)
E
( E )
(int)
Top down parsing -
ET|T+E
example
T = int | int * T | (E)
E
( E )
T Match 
Advance
int
(int)
Top down parsing -
ET|T+E
example
T = int | int * T | (E)
E
( E ) Match 
Advance
T
int
(int)
Top down parsing -
ET|T+E
example
T = int | int * T | (E)
E
( E )
int
End of input..
(int) Accept 
Class Test!!
• Parse the following string for the above grammar
id + id * id

04 Syntax Analysis

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

04 Syntax Analysis

Uploaded by

Copyright:

Available Formats

Syntax Analysis

The role of parser

• Error handler goals

o Derivations for –(id+id)

The sentence –(id + id) is valid for grammar 4.7

Left most derivation

Right most derivation

Left most Derivation Right most Derivation

Left most Derivation Right most Derivation

Left most Derivation Right most Derivation

• Or more than one leftmost derivation

• Or more than one rightmost derivation

• Ambiguity is Bad – needs to be removed

• Example: id+id*id (example 4.11 in Aho et al)

• Enforce precedence of * over +

• Then consider this expression

• Automatic conversion of an ambiguous grammar

• Rewriting the grammar – an option

• Precedence and associativity declarations – a

NFA for the RE

Context free Grammar for

NFA for the RE

• Recursive decent parsing – matching the terminals

• Token stream is (int)

• Can (int) be parsed?

int Mismatch: int ≠ ( !!

• Parse the following string for the above grammar

You might also like