Syntax Analysis
Acknowledgement
• Alfred V. Aho, Monica S. Lam, Ravi Sethi, Jeffrey D. Ullman,
“Compilers: Principles, Techniques, and Tools”
[Figure: point in the input where an error is detected, following a valid prefix. Examples: DO 10 I = 1;0 and for (;)]
Dr. Girish Kumar Patnaik 12
Error Recovery Strategies
• Panic mode
– Discard input until a token in a set of designated
synchronizing tokens is found
• Phrase-level recovery
– Perform local correction on the input to repair the error
• Error productions
– Augment grammar with productions for erroneous
constructs
• Global correction
– Choose a minimal sequence of changes to obtain a
global least-cost correction
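1. Panic Mode
In case of an error like:
a = b + c // missing semicolon
d = e + f ;
the parser discards input tokens one at a time until it finds a synchronizing token, here the semicolon.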
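The skipping step of panic mode can be sketched on the example above. This is only an illustration under stated assumptions: tokens are plain strings, the semicolon is the only synchronizing token, the error is assumed to be reported at the unexpected "d", and skip_to_sync is a hypothetical helper name.

```python
def skip_to_sync(tokens, pos, sync_tokens):
    """Discard tokens from pos until one in sync_tokens is found.

    Returns the index just past the synchronizing token, or len(tokens)
    if the input ends before one is found.
    """
    while pos < len(tokens) and tokens[pos] not in sync_tokens:
        pos += 1
    return min(pos + 1, len(tokens))


# Token stream for:  a = b + c   d = e + f ;   (first semicolon missing)
tokens = ["a", "=", "b", "+", "c", "d", "=", "e", "+", "f", ";"]

# Assumption: the parser notices the error at index 5, the unexpected "d".
resume = skip_to_sync(tokens, 5, {";"})
```

In this sketch the whole second statement is discarded along with the error, which shows the usual cost of panic mode: it is simple, but it can skip a considerable amount of input.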
proc A {
- match the current token with a, and move to the next token;
- call ‘B’;
- match the current token with b, and move to the next token;
}
proc A {
case of the current token {
‘a’: - match the current token with a, and move to the next token;
- call ‘B’;
- match the current token with b, and move to the next token;
‘b’: - match the current token with b, and move to the next token;
- call ‘A’;
- call ‘B’;
}
}
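The procedures above can be turned into a small runnable recursive-descent sketch. The alternatives A → a B b | b A B are read off the pseudocode; the production for B is not given in the slides, so B → c below is purely a hypothetical placeholder, as are the class and function names.

```python
class ParseError(Exception):
    """Raised when the current token does not fit the grammar."""


class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]  # "$" marks the end of input
        self.pos = 0

    def current(self):
        return self.tokens[self.pos]

    def match(self, expected):
        # Match the current token with `expected` and move to the next token.
        if self.current() != expected:
            raise ParseError(
                f"expected {expected!r}, got {self.current()!r} at position {self.pos}"
            )
        self.pos += 1

    def A(self):
        # A -> a B b | b A B, selected by the current token (no backtracking).
        if self.current() == "a":
            self.match("a")
            self.B()
            self.match("b")
        elif self.current() == "b":
            self.match("b")
            self.A()
            self.B()
        else:
            raise ParseError(f"unexpected {self.current()!r} at position {self.pos}")

    def B(self):
        # Hypothetical production B -> c (not taken from the slides).
        self.match("c")


def accepts(text):
    """Return True if `text` (one character per token) derives from A."""
    parser = Parser(text)
    try:
        parser.A()
        parser.match("$")  # all input must be consumed
        return True
    except ParseError:
        return False
```

For example, accepts("acb") uses the first alternative and accepts("bacbc") uses the second, while accepts("ab") fails because the assumed B cannot match "b".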
FOLLOW(E) = { $, ) }
FOLLOW(E’) = { $, ) }
FOLLOW(T) = { +, ), $ }
FOLLOW(T’) = { +, ), $ }
FOLLOW(F) = {+, *, ), $ }
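The FOLLOW sets above can be reproduced with a short fixed-point computation. The grammar itself does not appear in this part of the slides; the sketch assumes the usual non-left-recursive expression grammar (E → T E', E' → + T E' | ε, T → F T', T' → * F T' | ε, F → ( E ) | id), which is consistent with the sets listed. All function names are illustrative.

```python
EPS = "ε"

# Assumed grammar (not shown in the slides): nonterminal -> list of productions.
GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F": [["(", "E", ")"], ["id"]],
}
START = "E"


def first_of_string(symbols, first):
    """FIRST of a sequence of grammar symbols, given FIRST of the nonterminals."""
    result = set()
    for sym in symbols:
        if sym == EPS:
            continue
        sym_first = first[sym] if sym in GRAMMAR else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)  # every symbol in the sequence can derive ε
    return result


def compute_first():
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:  # iterate until no set grows
        changed = False
        for nt, prods in GRAMMAR.items():
            for prod in prods:
                new = first_of_string(prod, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def compute_follow(first):
    follow = {nt: set() for nt in GRAMMAR}
    follow[START].add("$")  # end marker follows the start symbol
    changed = True
    while changed:
        changed = False
        for nt, prods in GRAMMAR.items():
            for prod in prods:
                for i, sym in enumerate(prod):
                    if sym not in GRAMMAR:
                        continue  # FOLLOW is defined for nonterminals only
                    rest = first_of_string(prod[i + 1:], first)
                    new = rest - {EPS}
                    if EPS in rest:  # what follows sym can vanish
                        new |= follow[nt]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


FIRST = compute_first()
FOLLOW = compute_follow(FIRST)
```

Under the assumed grammar, FOLLOW["F"] comes out as the set containing +, *, ) and $, matching the last line of the list above.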
LL(1) Grammars
• Predictive parsers, that is, recursive-descent parsers
needing no backtracking, can be constructed for a class of
grammars called LL(1)
• The first "L" in LL(1) stands for scanning the input from
left to right, the second "L" for producing a leftmost
derivation, and the "1" for using one input symbol of
lookahead at each step to make parsing action decisions
• No left-recursive or ambiguous grammar can be LL(1)
Problem ambiguity
A Grammar which is not LL(1)
• What do we have to do if the resulting parsing table
contains multiply defined entries?
• If left recursion has not been eliminated yet, eliminate the left recursion in the grammar.
• If the grammar is not left factored, left factor the grammar.
• If the parsing table of the new grammar still contains multiply defined entries, that
grammar is ambiguous or it is inherently not an LL(1) grammar.
• A left-recursive grammar cannot be an LL(1) grammar.
• A → Aα | β
any terminal that appears in FIRST(β) also appears in FIRST(Aα) because
Aα ⇒ βα.
If β is ε, any terminal that appears in FIRST(α) also appears in FIRST(Aα)
and FOLLOW(A).
• A grammar that is not left factored cannot be an LL(1) grammar.
• A → αβ1 | αβ2
any terminal that appears in FIRST(α) appears in both FIRST(αβ1) and FIRST(αβ2).
• An ambiguous grammar cannot be an LL(1) grammar.
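The first repair step, eliminating immediate left recursion, can be sketched as follows. It applies the standard rewrite of A → Aα | β into A → β A' and A' → α A' | ε. The function name and the list-of-symbols representation are assumptions made for illustration, and indirect left recursion is not handled.

```python
EPS = "ε"


def eliminate_immediate_left_recursion(nt, productions):
    """Rewrite A -> A α | β as A -> β A', A' -> α A' | ε.

    `productions` is a list of right-hand sides, each a list of symbols.
    Returns a dict mapping each resulting nonterminal to its productions.
    """
    # The α parts: bodies that start with the nonterminal itself.
    recursive = [p[1:] for p in productions if p and p[0] == nt]
    # The β parts: every other body.
    others = [p for p in productions if not p or p[0] != nt]
    if not recursive:
        return {nt: productions}  # nothing to do

    new_nt = nt + "'"
    return {
        # A -> β A'  (an ε body simply becomes A')
        nt: [[s for s in beta if s != EPS] + [new_nt] for beta in others],
        # A' -> α A' | ε
        new_nt: [alpha + [new_nt] for alpha in recursive] + [[EPS]],
    }


# E -> E + T | T   becomes   E -> T E',  E' -> + T E' | ε
result = eliminate_immediate_left_recursion("E", [["E", "+", "T"], ["T"]])
```

A grammar without left recursion is left unchanged, so the helper can be applied to every nonterminal in turn.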
• Error-Productions
– If we have a good idea of the common errors that might be encountered, we can augment the
grammar with productions that generate erroneous constructs.
– When an error production is used by the parser, we can generate appropriate error diagnostics.
– Since it is almost impossible to know all the errors that can be made by the programmers, this
method is not practical.
• Global-Correction
– Ideally, we would like the compiler to make as few changes as possible in processing incorrect
inputs.
– We have to globally analyze the input to find the error.
– This is an expensive method, and it is not used in practice.
Panic-Mode Error Recovery in LL(1) Parsing
FOLLOW(E) = { $, ) }
FOLLOW(E’) = { $, ) }
FOLLOW(T) = { +, ), $ }
FOLLOW(T’) = { +, ), $ }
FOLLOW(F) = {+, *, ), $ }
• "synch" entries mark the synchronizing tokens, obtained from the FOLLOW set of each
nonterminal.
• If the parser looks up entry M[A, a] and finds that it is blank, then the input symbol a is
skipped.
• If the entry is "synch", then the nonterminal on top of the stack is popped in an attempt to
resume parsing.
• If a token on top of the stack does not match the input symbol, then we pop the token
from the stack.
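The recovery rules above can be sketched as a table-driven parser. The sketch assumes the usual expression grammar (E → T E', E' → + T E' | ε, T → F T', T' → * F T' | ε, F → ( E ) | id), which is consistent with the FOLLOW sets listed but is not shown in this part of the slides. The handling of leftover input once only $ remains on the stack is an added assumption, not one of the stated rules.

```python
NONTERMINALS = {"E", "E'", "T", "T'", "F"}
SYNCH = "synch"

# Predictive parsing table M[A, a] for the assumed grammar; [] is an ε-production.
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}

FOLLOW = {
    "E": {"$", ")"}, "E'": {"$", ")"},
    "T": {"+", ")", "$"}, "T'": {"+", ")", "$"},
    "F": {"+", "*", ")", "$"},
}

# Fill the remaining FOLLOW positions with "synch" entries.
for nt, follow in FOLLOW.items():
    for tok in follow:
        TABLE.setdefault((nt, tok), SYNCH)


def parse(tokens):
    """Parse a token list and return the list of error messages (empty if none)."""
    tokens = list(tokens) + ["$"]
    stack = ["$", "E"]  # top of the stack is the end of the list
    pos, errors = 0, []
    while stack:
        top, a = stack[-1], tokens[pos]
        if top == "$":
            if a == "$":
                break  # stack and input are both exhausted
            # Assumption beyond the stated rules: skip leftover input.
            errors.append(f"skipped extra input {a!r}")
            pos += 1
        elif top not in NONTERMINALS:
            if top == a:
                stack.pop()
                pos += 1
            else:  # terminal mismatch: pop the token from the stack
                errors.append(f"missing {top!r} before {a!r}")
                stack.pop()
        else:
            entry = TABLE.get((top, a))
            if entry is None:  # blank entry: skip the input symbol
                errors.append(f"skipped {a!r}")
                pos += 1
            elif entry == SYNCH:  # synch entry: pop the nonterminal
                errors.append(f"popped {top} at {a!r}")
                stack.pop()
            else:  # normal expansion: push the body in reverse
                stack.pop()
                stack.extend(reversed(entry))
    return errors
```

With this sketch, parse(["id", "*", "+", "id"]) reports a single error: at "+" the entry M[F, +] is "synch", so F is popped and parsing resumes with the rest of the input.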