Top-Down Parsing
Begin with the start symbol at the root of the parse tree
Build the parse tree from the top down
S → aSbS | bSaS | ε
[Figure: partial parse tree for this grammar, grown top-down from the root S.]
Parsing Decisions
Nondeterministic Parser
Backtracking Parser
Backtracking Parser
S → aSa | bSb | a | b
[Figure sequence: parse trees built step by step by a backtracking parser for this grammar. Each step expands S with one alternative and tries another alternative when the leaves do not match the input.]
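The backtracking idea can be sketched in Python. This is a minimal illustration, not the slides' own algorithm: the function names `parse_S` and `accepts` are hypothetical, and the input is assumed to be a plain string over the terminals a and b. Each alternative of S is tried in turn, and a generator yields every input position where S could end, so a failed choice simply falls through to the next alternative.

```python
def parse_S(s, i):
    """Yield every position j such that S derives s[i:j].

    Grammar: S -> aSa | bSb | a | b
    """
    # Alternatives S -> aSa and S -> bSb
    for c in "ab":
        if s.startswith(c, i):
            # Try every way the inner S can match, then require the closing c.
            for j in parse_S(s, i + 1):
                if s.startswith(c, j):
                    yield j + 1
    # Alternatives S -> a and S -> b
    if i < len(s) and s[i] in "ab":
        yield i + 1


def accepts(s):
    """True if the whole string can be derived from S."""
    return any(j == len(s) for j in parse_S(s, 0))
```

For example, `accepts("bbb")` succeeds only after the parser gives up on expanding the middle b with S → bSb and falls back to S → b.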
proc A {
- match the current token with a, and move to the next token;
- call B;
- match the current token with b, and move to the next token;
}
A → aA | bB | ε
If all other productions fail, we should apply an ε-production. For
example, if the current token is not a or b, we may apply the
ε-production.
proc A {
case of the current token {
a: - match the current token with a, and move to the next token;
- call B;
- match the current token with b, and move to the next token;
b: - match the current token with b, and move to the next token;
- call A;
- call B;
}
}
proc type()
{
case of the current token {
int: match the current token with int, and move to the next token;
float: match the current token with float, and move to the next token;
}
}
parse() {
    token = get_next_token();
    if (E() and token == '$')
        then return true
        else return false
}
E() {
    if (T())
        then return Eprime()
        else return false
}
Eprime() {
    if (token == '+')
        then token = get_next_token()
             if (T())
                 then return Eprime()
                 else return false
    else if (token == ')' or token == '$')
        then return true
        else return false
}
The remaining procedures are similar.
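A runnable version of these procedures might look like the following Python sketch. It assumes the expression grammar E → TE', E' → +TE' | ε, T → FT', T' → *FT' | ε, F → (E) | id, and it assumes the input is already a list of token strings. The class name `Parser` and its helper methods are hypothetical, chosen only for illustration.

```python
class Parser:
    """Recursive-descent recognizer for the assumed grammar
    E  -> T E'
    E' -> + T E' | epsilon
    T  -> F T'
    T' -> * F T' | epsilon
    F  -> ( E ) | id
    """

    def __init__(self, tokens):
        # Append the end marker '$' to the token list.
        self.tokens = list(tokens) + ["$"]
        self.pos = 0

    @property
    def token(self):
        return self.tokens[self.pos]

    def advance(self):
        self.pos += 1

    def parse(self):
        self.pos = 0
        return self.E() and self.token == "$"

    def E(self):
        return self.T() and self.Eprime()

    def Eprime(self):
        if self.token == "+":
            self.advance()
            return self.T() and self.Eprime()
        # epsilon-production: only valid if the lookahead may follow E'
        return self.token in (")", "$")

    def T(self):
        return self.F() and self.Tprime()

    def Tprime(self):
        if self.token == "*":
            self.advance()
            return self.F() and self.Tprime()
        # epsilon-production: only valid if the lookahead may follow T'
        return self.token in ("+", ")", "$")

    def F(self):
        if self.token == "id":
            self.advance()
            return True
        if self.token == "(":
            self.advance()
            if self.E() and self.token == ")":
                self.advance()
                return True
        return False
```

With this sketch, `Parser(["id", "+", "id", "*", "id"]).parse()` returns `True`, while `Parser(["id", "+"]).parse()` returns `False`.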
Left Recursion
E → E + T | T
T → T * F | F
F → n | ( E )
Predictive Parsing
Wouldn't it be nice if
the recursive-descent (r.d.) parser just knew which production to expand next?
Idea:
switch ( something ) {
case L1: return E1();
case L2: return E2();
otherwise: print syntax error;
}
What are something, L1, and L2?
The parser will do lookahead (look at the next token).
LL(1) Languages
Left-Factoring Example
Starting with the grammar
E → T + E | T
T → int | int * T | ( E )
Left-Factoring (cont.)
In general,
A → αβ1 | αβ2
where α is non-empty and the first
symbols of β1 and β2 (if they have one) are different.
When processing α we cannot know whether to expand
A to αβ1 or
A to αβ2.
But if we rewrite the grammar as follows
A → αA'
A' → β1 | β2
we can immediately expand A to
αA'.
Left-Factoring -- Algorithm
For each non-terminal A with two or more alternatives
(production rules) with a common non-empty prefix, say
A → αβ1 | ... | αβn | γ1 | ... | γm
convert it into
A → αA' | γ1 | ... | γm
A' → β1 | ... | βn
Left-Factoring Example 1
A → abB | aB | cdg | cdeB | cdfB
becomes
A → aA' | cdA''
A' → bB | B
A'' → g | eB | fB
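One pass of the left-factoring algorithm can be sketched in Python as follows. This is a minimal illustration under stated assumptions: each alternative is a tuple of symbols, the empty tuple stands for ε, and the helper names (`common_prefix`, `left_factor`) and the scheme of naming new non-terminals by appending primes are hypothetical. The sketch factors a single non-terminal once. If the new alternatives still share a prefix, the same function would have to be applied to them again.

```python
def common_prefix(a, b):
    """Longest common prefix of two symbol tuples."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return a[:n]


def left_factor(nonterminal, alternatives):
    """One left-factoring pass for a single non-terminal.

    alternatives: list of tuples of symbols; () stands for epsilon.
    Returns a dict mapping the original non-terminal and each new
    helper non-terminal to its list of alternatives.
    """
    # Group the alternatives by their first symbol.
    groups = {}
    for alt in alternatives:
        key = alt[0] if alt else None
        groups.setdefault(key, []).append(alt)

    result = {nonterminal: []}
    primes = 0
    for key, group in groups.items():
        if key is None or len(group) == 1:
            # Nothing to factor: keep the alternative unchanged.
            result[nonterminal].extend(group)
            continue
        # Longest prefix shared by every alternative in the group.
        prefix = group[0]
        for alt in group[1:]:
            prefix = common_prefix(prefix, alt)
        primes += 1
        helper = nonterminal + "'" * primes  # hypothetical naming scheme
        result[nonterminal].append(prefix + (helper,))
        result[helper] = [alt[len(prefix):] for alt in group]
    return result


example = left_factor("A", [
    ("a", "b", "B"), ("a", "B"),
    ("c", "d", "g"), ("c", "d", "e", "B"), ("c", "d", "f", "B"),
])
```

Applied to the grammar of Example 1, `example` maps A to aA' | cdA'', A' to bB | B, and A'' to g | eB | fB.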
When we are trying to expand the non-terminal stmt, if the current token
is if, we have to choose the first production rule.
When we are trying to expand the non-terminal stmt, we can uniquely
choose the production rule by looking only at the current token.
Even after we eliminate the left recursion in a grammar and left-factor it, the
result may still not be suitable for predictive parsing (it may not be an LL(1) grammar).
[Figure: block diagram of a non-recursive predictive parser. It reads the input buffer, uses a stack and a parsing table, and produces the output.]
LL(1) Parser
input buffer
our string to be parsed. We will assume that its end is
marked with a special symbol $.
output
a production rule representing a step of the derivation
sequence (left-most derivation) of the string in the input
buffer.
stack
contains the grammar symbols
at the bottom of the stack, there is a special end marker
symbol $.
initially the stack contains only the symbol $ and the
starting symbol S.
$S initial stack
when the stack is emptied (i.e., only $ is left on the stack), the
parsing is completed.
parsing table
a two-dimensional array M[A,a]
each row is a non-terminal symbol
each column is a terminal symbol or the special symbol $
each entry holds a production rule.
INITIAL CONFIGURATION
Stack: $S
Input Buffer: input string$
FINAL CONFIGURATION
Stack: $
Input Buffer: $
3. If X, the symbol on top of the stack, is a non-terminal:
the parser looks at the parsing table entry M[X,a], where a is the current input symbol. If
M[X,a] holds a production rule X → Y1Y2...Yk, it pops X
from the stack and pushes Yk, Yk-1, ..., Y1 onto the stack, so that Y1 ends up on top.
The parser also outputs the production rule X → Y1Y2...Yk
to represent a step of the derivation.
4. None of the above → error:
all empty entries in the parsing table are errors.
If X is a terminal symbol different from a, this is also
an error case.
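LL(1) Parsing
Grammar:
S → aBc
B → bB | ε
Parsing table:
      a            b           c          $
S     S → aBc
B                  B → bB      B → ε
Trace for the input abbc:
stack     input     output
$S        abbc$     S → aBc
$cBa      abbc$
$cB       bbc$      B → bB
$cBb      bbc$
$cB       bc$       B → bB
$cBb      bc$
$cB       c$        B → ε
$c        c$
$         $         accept, successful completion
Derivation (left-most): S ⇒ aBc ⇒ abBc ⇒ abbBc ⇒ abbc
[Figure: parse tree for abbc with root S and children a, B, c, where B expands to bB twice and the last B derives ε.]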
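The table-driven loop can be sketched in Python as below. This is a minimal sketch, not a definitive implementation: the function name `ll1_parse` and the encoding of the table as a dictionary keyed by (non-terminal, lookahead) pairs are assumptions made for illustration. The table is the one for S → aBc, B → bB | ε, with the empty tuple standing for ε.

```python
def ll1_parse(table, start, tokens):
    """Table-driven LL(1) parse.

    table maps (non-terminal, lookahead) to a right-hand side, given as
    a tuple of symbols (the empty tuple stands for epsilon).
    Returns the list of productions applied, or raises SyntaxError.
    """
    nonterminals = {nt for nt, _ in table}
    tokens = list(tokens) + ["$"]
    stack = ["$", start]  # the top of the stack is the end of the list
    pos = 0
    output = []
    while True:
        top, a = stack[-1], tokens[pos]
        if top == "$" and a == "$":
            return output  # accept
        if top in nonterminals:
            rhs = table.get((top, a))
            if rhs is None:
                raise SyntaxError(f"no entry M[{top},{a}]")
            stack.pop()
            stack.extend(reversed(rhs))  # push Yk ... Y1, so Y1 is on top
            output.append((top, rhs))
        elif top == a:
            stack.pop()  # terminal on the stack matches the input
            pos += 1
        else:
            raise SyntaxError(f"expected {top}, found {a}")


TABLE = {
    ("S", "a"): ("a", "B", "c"),
    ("B", "b"): ("b", "B"),
    ("B", "c"): (),
}

steps = ll1_parse(TABLE, "S", "abbc")
```

Here `steps` lists the productions in the order of the left-most derivation: S → aBc, B → bB, B → bB, B → ε.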
Grammar:
1. E → TE'
2. E' → +TE'
3. E' → ε
4. T → FT'
5. T' → *FT'
6. T' → ε
7. F → (E)
8. F → id
Parsing table:
      id          +            *            (           )          $
E     E → TE'                               E → TE'
E'                E' → +TE'                             E' → ε     E' → ε
T     T → FT'                               T → FT'
T'                T' → ε       T' → *FT'                T' → ε     T' → ε
F     F → id                                F → (E)
Trace for the input id+id:
stack       input      output
$E          id+id$     E → TE'
$E'T        id+id$     T → FT'
$E'T'F      id+id$     F → id
$E'T'id     id+id$
$E'T'       +id$       T' → ε
$E'         +id$       E' → +TE'
$E'T+       +id$
$E'T        id$        T → FT'
$E'T'F      id$        F → id
$E'T'id     id$
$E'T'       $          T' → ε
$E'         $          E' → ε
$           $          accept
FOLLOW
Example
1. X → a
FIRST(X) = {a}
2. X → ε
FIRST(X) = {ε}
3. X → a | ε
FIRST(X) = {a, ε}
4. X → AbB, A → aB, B → ε
FIRST(X) = {a}
FIRST(A) = {a}
FIRST(B) = {ε}
5. X → ABC, A → ε, B → ε, C → c
FIRST(X) = {c}
FIRST(A) = {ε}
FIRST(B) = {ε}
FIRST Example
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id
FIRST(F) = {(, id}
FIRST(T') = {*, ε}
FIRST(T) = {(, id}
FIRST(E') = {+, ε}
FIRST(E) = {(, id}
E → TE'
FIRST(E) = ?
E is a non-terminal and has a production E → TE'. From rule 3,
add all the non-ε symbols of FIRST(T), and also collect the FIRST set of E' if the
preceding non-terminal T can derive ε.
FIRST(T) = ?
T is a non-terminal and has a production rule T → FT'. From rule 3,
add all the non-ε symbols of FIRST(F), and also collect the FIRST set of T' if
the preceding non-terminal F can derive ε.
FIRST(F) = ?
F is a non-terminal and has the productions F → id | (E).
FIRST(F) = { id, ( }
Neither F nor T can derive ε, so nothing from FIRST(T') or FIRST(E') is added.
Hence
FIRST(E) = FIRST(T) = FIRST(F) = { (, id }
E' → +TE' | ε
T' → *FT' | ε
FIRST SETS
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id
FIRST(E) = {(, id}
FIRST(T) = {(, id}
FIRST(E') = {+, ε}
FIRST(T') = {*, ε}
FIRST(F) = {(, id}
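The FIRST sets above can be reproduced with a small fixed-point computation. The Python sketch below is one common way to do it, offered as an illustration rather than as the slides' own procedure. The grammar encoding (a dictionary from non-terminal to a list of tuples, with the empty tuple for ε) and the names `first_of_sequence` and `compute_first` are assumptions.

```python
EPS = "ε"

GRAMMAR = {
    "E":  [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T":  [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F":  [("(", "E", ")"), ("id",)],
}


def first_of_sequence(symbols, first, grammar):
    """FIRST of a string of symbols, using the FIRST sets known so far."""
    result = set()
    for sym in symbols:
        if sym not in grammar:
            # A terminal starts the string; nothing after it matters.
            result.add(sym)
            return result
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return result
    # Every symbol can derive epsilon (or the string is empty).
    result.add(EPS)
    return result


def compute_first(grammar):
    """Repeat until no FIRST set changes."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for alt in alternatives:
                new = first_of_sequence(alt, first, grammar)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


FIRST = compute_first(GRAMMAR)
```

For this grammar, `FIRST["E"]`, `FIRST["T"]` and `FIRST["F"]` all come out as {(, id}, while `FIRST["E'"]` is {+, ε} and `FIRST["T'"]` is {*, ε}.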
We apply these rules until nothing more can be added to any follow set.
FOLLOW Example
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id
FOLLOW(E)
Since E is the start symbol,
add $ to its FOLLOW set.
From rule 2, E is followed by the terminal )
in F → (E), so add )
to the FOLLOW set of E as well.
Hence
FOLLOW(E) = { $, ) }
FOLLOW(E'):
[E → TE', E' → +TE']
From rule (3), everything in FOLLOW(E) is added to FOLLOW(E').
Hence
FOLLOW(E') = { $, ) }
FOLLOW(T):
[E → TE', E' → +TE']
From rule (2), FIRST(E') except ε is added to FOLLOW(T).
From rule (3), since FIRST(E') contains ε, add FOLLOW(E) to FOLLOW(T).
Hence
FOLLOW(T) = { +, $, ) }
FOLLOW(T'):
[T → FT', T' → *FT']
From rule (3), everything in FOLLOW(T) is added to FOLLOW(T').
Hence
FOLLOW(T') = { +, $, ) }
FOLLOW(F):
[T → FT', T' → *FT']
From rule (2), FIRST(T') except ε is added to FOLLOW(F).
From rule (3), since FIRST(T') contains ε, add FOLLOW(T) to FOLLOW(F).
Hence
FOLLOW(F) = { *, +, $, ) }
FOLLOW SETS
FOLLOW(E) = { $, ) }
FOLLOW(E') = { $, ) }
FOLLOW(T) = { +, ), $ }
FOLLOW(T') = { +, ), $ }
FOLLOW(F) = { +, *, ), $ }
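The FOLLOW sets above can be checked with a fixed-point computation as well. The Python sketch below takes the FIRST sets listed earlier as given input. The rule numbering in the comments (1: the start symbol gets $, 2: FIRST of what follows minus ε, 3: FOLLOW of the left-hand side when what follows can derive ε) is an assumption chosen to match how the worked example cites the rules, and the function names are hypothetical.

```python
EPS = "ε"

GRAMMAR = {
    "E":  [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T":  [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F":  [("(", "E", ")"), ("id",)],
}

# FIRST sets as listed in the slides, taken here as input.
FIRST = {
    "E": {"(", "id"}, "E'": {"+", EPS},
    "T": {"(", "id"}, "T'": {"*", EPS},
    "F": {"(", "id"},
}


def first_of_sequence(symbols, first):
    """FIRST of a string of symbols; a terminal's FIRST is itself."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def compute_follow(grammar, first, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")                      # rule 1 (assumed numbering)
    changed = True
    while changed:                              # repeat until nothing is added
        changed = False
        for lhs, alternatives in grammar.items():
            for alt in alternatives:
                for i, sym in enumerate(alt):
                    if sym not in grammar:
                        continue                # only non-terminals have FOLLOW
                    rest = first_of_sequence(alt[i + 1:], first)
                    new = rest - {EPS}          # rule 2
                    if EPS in rest:             # rule 3
                        new |= follow[lhs]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


FOLLOW = compute_follow(GRAMMAR, FIRST, "E")
```

The resulting `FOLLOW` dictionary agrees with the sets listed above, for example `FOLLOW["F"]` is {+, *, ), $}.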
EXERCISES
1. Compute the FIRST and FOLLOW
sets for the following grammar:
S → aBc
B → bB | ε
SOLUTION
FIRST(S)={a}
FIRST(B)={b, ε}
FOLLOW(S)={$}
FOLLOW(B)={c}
2.
statement → if-statement | other
if-statement → if ( exp ) statement else-part
else-part → else statement | ε
exp → 0 | 1
3. A → ( A ) A | ε
4.
lexp → atom | list
atom → number | identifier
list → ( lexp-seq )
lexp-seq → lexp , lexp-seq | lexp
Left-factor the grammar.
Compute FIRST and FOLLOW for the resulting grammar.
S → aBc
B → bB | ε
S → aBc:
FIRST(S) = FIRST(aBc) = {a}
Hence M[S,a] = S → aBc
B → bB | ε:
FIRST(B) = {b, ε}
Hence M[B,b] = B → bB
FOLLOW(B) = {c}
Hence M[B,c] = B → ε
Parsing table:
      a            b           c          $
S     S → aBc
B                  B → bB      B → ε
E → TE':
Since FIRST(TE') = FIRST(T) = { (, id }, we add E → TE' to M[E,(] and M[E,id].
E' → +TE':
Since FIRST(+TE') = {+}, we add E' → +TE' to M[E',+].
E' → ε:
We must examine FOLLOW(E') = { $, ) }. We add E' → ε to M[E',)] and M[E',$].
T → FT':
Since FIRST(FT') = FIRST(F) = { (, id }, we add T → FT' to M[T,(] and M[T,id].
T' → *FT':
Since FIRST(*FT') = {*}, we add T' → *FT' to M[T',*].
T' → ε:
We examine FOLLOW(T') = { +, $, ) }. We add T' → ε to M[T',+], M[T',)], and M[T',$].
F → ( E ):
We add F → ( E ) to M[F,(].
F → id:
We add F → id to M[F,id].
Resulting parsing table:
      id          +            *            (           )          $
E     E → TE'                               E → TE'
E'                E' → +TE'                             E' → ε     E' → ε
T     T → FT'                               T → FT'
T'                T' → ε       T' → *FT'                T' → ε     T' → ε
F     F → id                                F → (E)
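The construction steps above can be sketched in Python. The sketch takes the FIRST and FOLLOW sets from the earlier slides as given input and fills M[A,a] for each production A → α: once for every terminal in FIRST(α), and once for every symbol in FOLLOW(A) when FIRST(α) contains ε. The function names and the dictionary encoding of the table are assumptions made for illustration. Each cell holds a list, so a cell with more than one production would reveal a multiply-defined entry.

```python
EPS = "ε"

GRAMMAR = {
    "E":  [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T":  [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F":  [("(", "E", ")"), ("id",)],
}
FIRST = {
    "E": {"(", "id"}, "E'": {"+", EPS},
    "T": {"(", "id"}, "T'": {"*", EPS},
    "F": {"(", "id"},
}
FOLLOW = {
    "E": {"$", ")"}, "E'": {"$", ")"},
    "T": {"+", "$", ")"}, "T'": {"+", "$", ")"},
    "F": {"+", "*", "$", ")"},
}


def first_of_sequence(symbols, first):
    """FIRST of a string of symbols; a terminal's FIRST is itself."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def build_table(grammar, first, follow):
    """Return {(A, a): [right-hand sides]} for the LL(1) parsing table."""
    table = {}
    for lhs, alternatives in grammar.items():
        for alt in alternatives:
            alt_first = first_of_sequence(alt, first)
            lookaheads = alt_first - {EPS}
            if EPS in alt_first:
                # The alternative can vanish, so FOLLOW(lhs) selects it too.
                lookaheads |= follow[lhs]
            for a in lookaheads:
                table.setdefault((lhs, a), []).append(alt)
    return table


TABLE = build_table(GRAMMAR, FIRST, FOLLOW)
```

For this grammar every cell of `TABLE` holds exactly one production, for example `TABLE[("T'", "+")]` is the ε-production and `TABLE[("F", "(")]` is F → (E).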
LL(1) Grammars
A grammar whose parsing table has no multiply-defined entries is said
to be an LL(1) grammar.
LL(1): one input symbol is used as a lookahead symbol to determine the parser action.
An entry in the parsing table of a grammar may contain more than one production
rule. In this case, we say that the grammar is not LL(1).
S → iCtSE | a
E → eS | ε
C → b
FOLLOW(S) = { $, e }
FOLLOW(E) = { $, e }
FOLLOW(C) = { t }
Parsing table:
      a         b         e                  i              t      $
S     S → a                                  S → iCtSE
E                         E → eS, E → ε                            E → ε
C               C → b
Two production rules for M[E,e].
Problem: ambiguity
Grammar                      Why it is not LL(1)
S → Sa | a                   Left recursive
S → aS | a                   FIRST(aS) ∩ FIRST(a) ≠ ∅
S → aR | ε, R → S | ε        For R: S ⇒* ε and ε ⇒* ε
S → aRa, R → S | ε           For R: FIRST(S) ∩ FOLLOW(R) ≠ ∅
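The conditions in these examples can be checked pairwise on the alternatives of each non-terminal. The Python sketch below is an illustration only: the function name `ll1_violations` and the labels it reports are hypothetical, and the FIRST and FOLLOW sets for the sample grammars were worked out by hand and are passed in as input rather than computed.

```python
EPS = "ε"


def first_of_sequence(symbols, first):
    """FIRST of a string of symbols; a terminal's FIRST is itself."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def ll1_violations(grammar, first, follow):
    """List (non-terminal, reason) pairs that break the LL(1) conditions."""
    problems = []
    for lhs, alts in grammar.items():
        for i in range(len(alts)):
            for j in range(i + 1, len(alts)):
                fa = first_of_sequence(alts[i], first)
                fb = first_of_sequence(alts[j], first)
                # Two alternatives must not start with the same terminal.
                if (fa - {EPS}) & (fb - {EPS}):
                    problems.append((lhs, "FIRST/FIRST"))
                # At most one alternative may derive epsilon.
                if EPS in fa and EPS in fb:
                    problems.append((lhs, "both derive ε"))
                # If one derives epsilon, the other must avoid FOLLOW(lhs).
                elif EPS in fb and fa & follow[lhs]:
                    problems.append((lhs, "FIRST/FOLLOW"))
                elif EPS in fa and fb & follow[lhs]:
                    problems.append((lhs, "FIRST/FOLLOW"))
    return problems


# S -> aS | a
g2 = ll1_violations({"S": [("a", "S"), ("a",)]},
                    {"S": {"a"}}, {"S": {"$"}})

# S -> aR | ε, R -> S | ε
g3 = ll1_violations({"S": [("a", "R"), ()], "R": [("S",), ()]},
                    {"S": {"a", EPS}, "R": {"a", EPS}},
                    {"S": {"$"}, "R": {"$"}})

# S -> aRa, R -> S | ε
g4 = ll1_violations({"S": [("a", "R", "a")], "R": [("S",), ()]},
                    {"S": {"a"}, "R": {"a", EPS}},
                    {"S": {"$", "a"}, "R": {"a"}})
```

The three calls report a FIRST/FIRST conflict for S, two ε-deriving alternatives for R, and a FIRST/FOLLOW conflict for R, respectively. The sketch does not test for left recursion as such; a left-recursive grammar like S → Sa | a shows up as a FIRST/FIRST conflict.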
Example
Grammar:
exp → term exp'
exp' → addop term exp' | ε
addop → + | -
term → factor term'
term' → mulop factor term' | ε
mulop → *
factor → ( exp ) | number
Parsing table M[N][T]:
M[exp, (] = M[exp, number] = exp → term exp'
M[exp', +] = M[exp', -] = exp' → addop term exp'
M[exp', )] = M[exp', $] = exp' → ε
M[addop, +] = addop → +
M[addop, -] = addop → -
M[term, (] = M[term, number] = term → factor term'
M[term', *] = term' → mulop factor term'
M[term', +] = M[term', -] = M[term', )] = M[term', $] = term' → ε
M[mulop, *] = mulop → *
M[factor, (] = factor → ( exp )
M[factor, number] = factor → number
Example: Parse 1 + 2 * 3
[Figure sequence: the parse tree is grown step by step from exp using M[N][T], while the remaining input shrinks from num + num * num $ through + num * num $, num * num $, * num $ and num $ to $.]
Successful Parse!
Self Study
Error Recovery Techniques
Panic-Mode Error Recovery
Phrase-Level Error Recovery
Error-Productions
Global-Correction
Reference :
Aho, Sethi and Ullman