You are on page 1of 66

Syntax Analysis

The role of parser

token
Source Lexical Parse tree Rest of Front Intermediate
Parser
program Analyzer End representation
getNext
Token

Symbol
table
Error handling
• Common programming errors
o Lexical errors
o Syntactic errors
o Semantic errors
o Lexical errors

• Error handler goals


o Report the presence of errors clearly and accurately
o Recover from each error quickly enough to detect
subsequent errors
o Add minimal overhead to the processing of correct
programs
Error-recover strategies
• Panic mode recovery
o Discard input symbol one at a time until one of designated set
of synchronization tokens is found
• Phrase level recovery
o Replacing a prefix of remaining input by some string that
allows the parser to continue
• Error productions
o Augment the grammar with productions that generate the
erroneous constructs
• Global correction
o Choosing minimal sequence of changes to obtain a globally
least-cost correction
Benefits of a Grammar
Example of Grammars
EE+T|T E  TE’
TT*F|F E’  +TE’ | Ɛ
F  (E) | id T  FT’
T’  *FT’ | Ɛ
F  (E) | id

E  E + E | E * E | (E) | id
Context-free Grammars
• Terminal
• Nonterminals
• Start Symbol
• Productions
Example (example 4.5)
Notational convention
Notational conventions contd..
Example 4.5 re-written
Derivations
• Productions are treated as rewriting
rules to generate a string

o E  E + E | E * E | -E | (E) | id
• E => -E => -(E) => (id)

o Derivations for –(id+id)


• E => -E => -(E) => -(E+E) => -(id+E)=>-(id+id)
Parse trees
• -(id+id)
• E => -E => -(E) => -(E+E) => -(id+E)=>-(id+id)
End of part 1
Derivation types
Let’s first consider the grammar 4.7 from Aho et al.

The sentence –(id + id) is valid for grammar 4.7


Because

Left most derivation

Right most derivation


Derivation in detail
• Recall the Parse tree
Derivation in detail
E

E
Derivation in detail
E

- E
E
 -E
Derivation in detail
E

- E
E
 -E ( E )
 -(E)
Derivation in detail
E

- E
E
 -E ( E )
 -(E)
E + E
 -(E+E)
Derivation in detail
E

- E
E
 -E ( E )
 -(E)
E + E
 -(E+E)
 -(id+E) E
id
Left most Derivation
Derivation in detail
E

- E
E
 -E ( E )
 -(E)
E + E
 -(E+E)
 -(id+E) id
id
 -(id+id)
Left most Derivation
Derivation in detail
E

E
Derivation in detail
E

- E
E
 -E
Derivation in detail
E

- E
E
 -E ( E )
 -(E)
Derivation in detail
E

- E
E
 -E ( E )
 -(E)
E + E
 -(E+E)
Derivation in detail
E

- E
E
 -E ( E )
 -(E)
E + E
 -(E+E)
 -(E+id) id
E
Right most Derivation
Derivation in detail
E

- E
E
 -E ( E )
 -(E)
E + E
 -(E+E)
 -(E+id) id
id
 -(id+id)
Right most Derivation
Another Example
id * id + id
E E

E + E E + E
Another Example
id * id + id
E E

E + E E + E

* E id
E
Another Example
id * id + id
E E

E + E E + E

* E * E id
E E

id

Left most Derivation Right most Derivation


Another Example
id * id + id
E E

E + E E + E

* E * E id
E E

id id id

Left most Derivation Right most Derivation


Another Example
id * id + id
E E

E + E E + E

* E id * E id
E E

id id id id

Left most Derivation Right most Derivation


Ambiguity
• For some strings there exist more than one parse
tree

• Or more than one leftmost derivation

• Or more than one rightmost derivation

• Ambiguity is Bad – needs to be removed

• Example: id+id*id (example 4.11 in Aho et al)


Example of Ambiguity
How to deal with Ambiguity
• (possibly) Rewrite the Grammar

• Enforce precedence of * over +


Ambiguity again –
Dangling Else
• Consider this Grammar

• Then consider this expression

• Is it ambiguous?!
Ambiguity again –
Dangling Else
How to fix Dangling Else
problem
• Else matches with the closest unmatched then
Parse tree for the
rewritten grammar
Resolving Ambiguity
• No general techniques

• Automatic conversion of an ambiguous grammar


to an unambiguous one is not possible

• Rewriting the grammar – an option

• Precedence and associativity declarations – a


better option than rewriting
• * should precede +
Context free Grammars
VS Regular Expressions
Recall the regular expression
Context free Grammars
VS Regular Expressions
Recall the regular expression

NFA for the RE


Context free Grammars
VS Regular Expressions
Recall the regular expression

Context free Grammar for


(a|b)*abb

NFA for the RE


Context free Grammars
VS Regular Expressions
End of part 2
Error Handling (again!!)
Syntax Error Handling
Panic Mode recovery

• E.G. (1 + + 2)
o Skip ahead to next integer and then continue
Error Production
Local and Global
Corrections
Lexical VS Syntactic Analysis
Top Down Parsing

• Recursive decent parsing – matching the terminals


o Requires backtracking when a mismatch occurs during matching terminals
Top down parsing -
example
• Consider the grammar
ET|T+E
T = int | int * T | (E)

• Token stream is (int)

• Can (int) be parsed?


Top down parsing -
ET|T+E
example
T = int | int * T | (E)
E

(int)
Top down parsing -
ET|T+E
example
T = int | int * T | (E)
E

(int)
Top down parsing -
ET|T+E
example
T = int | int * T | (E)
E

int Mismatch: int ≠ ( !!


Backtrack

(int)
Top down parsing -
ET|T+E
example
T = int | int * T | (E)
E

(int)
Top down parsing -
ET|T+E
example
T = int | int * T | (E)
E

int * T
Mismatch: int ≠ ( !!
Backtrack

(int)
Top down parsing -
ET|T+E
example
T = int | int * T | (E)
E

( E )
Match 
Advance

(int)
Top down parsing -
ET|T+E
example
T = int | int * T | (E)
E

( E )

(int)
Top down parsing -
ET|T+E
example
T = int | int * T | (E)
E

( E )

(int)
Top down parsing -
ET|T+E
example
T = int | int * T | (E)
E

( E )

T Match 
Advance
int
(int)
Top down parsing -
ET|T+E
example
T = int | int * T | (E)
E

( E ) Match 
Advance
T

int
(int)
Top down parsing -
ET|T+E
example
T = int | int * T | (E)
E

( E )

int
End of input..
(int) Accept 
Class Test!!

• Parse the following string for the above grammar

id + id * id

You might also like