
Unit 7

Predictive Parsing

Nguyen Thi Thu Huong


Hanoi University of Technology

1
Predictive Parsers
 Parser can “predict” which production to use
 By looking at the next few tokens
 No backtracking
 Predictive parsers accept LL(k) grammars
    • The first L means “left-to-right” scan of input
    • The second L means “leftmost derivation”
    • k means “predict based on k tokens of lookahead”
 In practice, LL(1) is used

2
A stringent condition
The grammar must not be left recursive, and no two right-hand sides
for the same non-terminal may share a common prefix.

3
Left Recursion

A grammar is left recursive if it has a non-terminal A such that
there is a derivation
A ⇒+ Aα for some string α

Top-down parsing techniques cannot handle left-recursive grammars.
So, we have to convert our left-recursive grammar into an equivalent
grammar which is not left-recursive.
The left recursion may appear in a single step of the derivation
(immediate left recursion), or may appear in more than one step of
the derivation.

4
Immediate Left-Recursion
A → Aα | β    where β does not start with A

⇓ eliminate immediate left recursion

A → β A'
A' → α A' | ε    an equivalent grammar

In general,

A → Aα1 | ... | Aαm | β1 | ... | βn    where β1 ... βn do not start with A

⇓ eliminate immediate left recursion

A → β1 A' | ... | βn A'
A' → α1 A' | ... | αm A' | ε    an equivalent grammar

5
Immediate Left-Recursion -- Example
E → E+T | T
T → T*F | F
F → id | (E)

⇓ eliminate immediate left recursion

E → T E'
E' → +T E' | ε
T → F T'
T' → *F T' | ε
F → id | (E)
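
The transformation above can be sketched in a few lines of Python. This is only an illustrative sketch: the grammar representation (a dict from non-terminal to a list of alternatives, each a list of symbols), the `EPS` marker, and the function name `eliminate_immediate` are assumptions made for this example, not part of the slides.

```python
EPS = "ε"  # marker for the empty string (an assumption of this sketch)

def eliminate_immediate(nt, alternatives):
    """Rewrite A -> A a | b as A -> b A', A' -> a A' | ε.

    `alternatives` is a list of right-hand sides, each a non-empty list of symbols.
    Returns a dict mapping non-terminal names to their new alternatives.
    """
    # Alternatives of the form A -> A alpha (keep only alpha).
    recursive = [alt[1:] for alt in alternatives if alt[0] == nt]
    # Alternatives of the form A -> beta, where beta does not start with A.
    others = [alt for alt in alternatives if alt[0] != nt]
    if not recursive:
        return {nt: alternatives}
    new_nt = nt + "'"
    return {
        nt: [[s for s in beta if s != EPS] + [new_nt] for beta in others],
        new_nt: [alpha + [new_nt] for alpha in recursive] + [[EPS]],
    }

grammar = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["id"], ["(", "E", ")"]],
}
result = {}
for nt, alts in grammar.items():
    result.update(eliminate_immediate(nt, alts))
for nt, alts in result.items():
    print(nt, "->", " | ".join(" ".join(a) for a in alts))
```

Applied to the expression grammar, the sketch should reproduce the E, E', T, T', F productions shown above.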

6
Left-Recursion -- Problem
• A grammar may not be immediately left-recursive and yet still be
left-recursive.
• By just eliminating the immediate left recursion, we may not get
a grammar which is not left-recursive.

S → Aa | b
A → Sc | d    This grammar is not immediately left-recursive,
              but it is still left-recursive.

S ⇒ Aa ⇒ Sca    or
A ⇒ Sc ⇒ Aac    gives rise to a left recursion

• So, we have to eliminate all left recursions from our grammar

7
Eliminate Left-Recursion -- Algorithm
- Arrange non-terminals in some order: A1 ... An
- for i from 1 to n do {
    - for j from 1 to i-1 do {
        replace each production
            Ai → Aj γ
        by
            Ai → α1 γ | ... | αk γ
        where Aj → α1 | ... | αk
      }
    - eliminate immediate left recursions among the Ai productions
  }
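
A minimal Python sketch of this algorithm follows. It assumes the input grammar has no ε-productions and no cycles (the usual precondition for this procedure); the dict-of-lists grammar representation and the names `eliminate_immediate` and `eliminate_left_recursion` are hypothetical, chosen for illustration.

```python
EPS = "ε"  # marker for the empty string (an assumption of this sketch)

def eliminate_immediate(nt, alts):
    """Rewrite A -> A a | b as A -> b A', A' -> a A' | ε."""
    recursive = [alt[1:] for alt in alts if alt[0] == nt]
    others = [alt for alt in alts if alt[0] != nt]
    if not recursive:
        return {nt: alts}
    new_nt = nt + "'"
    return {
        nt: [beta + [new_nt] for beta in others],
        new_nt: [alpha + [new_nt] for alpha in recursive] + [[EPS]],
    }

def eliminate_left_recursion(grammar, order):
    """Assumes no ε-productions and no cycles in the input grammar."""
    g = {nt: [list(alt) for alt in alts] for nt, alts in grammar.items()}
    for i, ai in enumerate(order):
        for aj in order[:i]:
            expanded = []
            for alt in g[ai]:
                if alt[0] == aj:
                    # Ai -> Aj γ becomes Ai -> δ γ for every current Aj -> δ
                    expanded += [delta + alt[1:] for delta in g[aj]]
                else:
                    expanded.append(alt)
            g[ai] = expanded
        g.update(eliminate_immediate(ai, g[ai]))
    return g

grammar = {
    "S": [["A", "a"], ["b"]],
    "A": [["S", "c"], ["d"]],
}
fixed = eliminate_left_recursion(grammar, ["S", "A"])
for nt, alts in fixed.items():
    print(nt, "->", " | ".join(" ".join(a) for a in alts))
```

With the order S, A, the production A → Sc is first expanded to A → Aac | bc, and only then is the immediate left recursion on A removed. A different ordering of the non-terminals can give a different, but still equivalent, grammar.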

8
Predictive Parsing and Left Factoring
• In the grammar
    E → T + E | T
    T → int | int * T | ( E )
• Hard to predict because
    • For T, two productions start with int
    • For E, it is not clear how to predict (both productions start with T)
• A grammar must be left-factored before use for predictive parsing

9
Left Factoring
• A predictive parser (a top-down parser without backtracking) requires
that the grammar be left-factored.

grammar → a new equivalent grammar suitable for predictive parsing

If_stmt → if expr then stmt else stmt |
          if expr then stmt

• When we see if, we cannot know which production rule to choose to
rewrite If_stmt in the derivation.

10
Left Factoring (cont'd)
In general,

A → αβ1 | αβ2    where α is non-empty and the first symbols
                 of β1 and β2 (if they have one) are different.

When processing α we cannot know whether to expand
    A to αβ1 or
    A to αβ2

But if we rewrite the grammar as follows

A → αA'
A' → β1 | β2    we can immediately expand A to αA'

11
Left factoring algorithm
• For each non-terminal A with two or more alternatives (production
rules) with a common non-empty prefix, say

A → αβ1 | ... | αβn | γ1 | ... | γm

convert it into

A → αA' | γ1 | ... | γm
A' → β1 | ... | βn
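
The factoring step can be sketched in Python as below. This is one possible reading of the rule, not a definitive implementation: it groups alternatives by their first symbol and factors out the longest common prefix of each group. The grammar representation, the `EPS` marker, and the primed names for new non-terminals are assumptions of this sketch.

```python
EPS = "ε"  # marker for the empty string (an assumption of this sketch)

def common_prefix(alts):
    """Longest sequence of symbols shared by the start of every alternative."""
    prefix = []
    for symbols in zip(*alts):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return prefix

def left_factor(grammar):
    g = {nt: [list(a) for a in alts] for nt, alts in grammar.items()}
    work = list(g)
    while work:
        nt = work.pop()
        # Group the alternatives of nt by their first symbol.
        groups = {}
        for alt in g[nt]:
            groups.setdefault(alt[0], []).append(alt)
        new_alts = []
        for first, group in groups.items():
            if len(group) == 1 or first == EPS:
                new_alts += group
                continue
            alpha = common_prefix(group)
            new_nt = nt + "'"
            while new_nt in g:          # pick an unused primed name
                new_nt += "'"
            # A' -> β1 | ... | βn, with ε standing in for an empty β
            g[new_nt] = [alt[len(alpha):] or [EPS] for alt in group]
            new_alts.append(alpha + [new_nt])
            work.append(new_nt)         # the new rules may need factoring too
        g[nt] = new_alts
    return g

grammar = {
    "E": [["T", "+", "E"], ["T"]],
    "T": [["int"], ["int", "*", "T"], ["(", "E", ")"]],
}
factored = left_factor(grammar)
for nt, alts in factored.items():
    print(nt, "->", " | ".join(" ".join(a) for a in alts))
```

The new non-terminals come out as E' and T' here rather than hand-picked names; the resulting grammar is otherwise a left-factored form of the input.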

12
Left-Factoring Example
• Consider the grammar
E → T + E | T
T → int | int * T | ( E )

• Factor out common prefixes of productions

E → T X
X → + E | ε
T → ( E ) | int Y
Y → * T | ε

13
Left Factoring

S --> if E then S | if E then S else S

can be rewritten as

S --> if E then S S'
S' --> else S | ε

• Grammar G:
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | int

It's possible to implement a predictive parser for G

15
Parsing table

      +           *           (           )           int         $
E                             E → TE'                 E → TE'
E'    E' → +TE'                           E' → ε                  E' → ε
T                             T → FT'                 T → FT'
T'    T' → ε      T' → *FT'               T' → ε                  T' → ε
F                             F → (E)                 F → int
+     Push
*                 Push
(                             Push
)                                         Push
int                                                   Push
#                                                                 Accept

16
LL(1) Parsing Tables. Errors
 Blank entries indicate error situations
 Consider the [E,*] entry
 “There is no way to derive a string starting
with * from non-terminal E”

17
Using Parsing Tables
• Method similar to recursive descent, except
    • For each non-terminal S
    • We look at the next token a
    • And choose the production shown at [S,a]
• We use a stack to keep track of pending non-terminals
• We reject when we encounter an error state
• We accept when we encounter end-of-input

18
LL(1) Parsing Algorithm
initialize stack = <S $> and next (pointer to the first input token)
repeat
    case stack of
        <X, rest> : if T[X,*next] = Y1…Yn
                        then stack ← <Y1… Yn rest>;
                        else error ();
        <t, rest> : if t == *next ++
                        then stack ← <rest>;
                        else error ();
until stack == < >
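
The loop above might be written in Python roughly as follows. The table is hard-coded for the left-factored grammar E → T X, X → + E | ε, T → ( E ) | int Y, Y → * T | ε; the names `TABLE`, `NONTERMINALS`, and `parse` are hypothetical, and ε is represented by the marker `EPS`.

```python
EPS = "ε"  # marker for the empty string (an assumption of this sketch)

# T[X, a] for the grammar E -> T X, X -> + E | ε, T -> ( E ) | int Y, Y -> * T | ε
TABLE = {
    ("E", "int"): ["T", "X"], ("E", "("): ["T", "X"],
    ("X", "+"): ["+", "E"], ("X", ")"): [EPS], ("X", "$"): [EPS],
    ("T", "int"): ["int", "Y"], ("T", "("): ["(", "E", ")"],
    ("Y", "*"): ["*", "T"], ("Y", "+"): [EPS], ("Y", ")"): [EPS], ("Y", "$"): [EPS],
}
NONTERMINALS = {"E", "X", "T", "Y"}

def parse(tokens, start="E"):
    """Return True if the token list is accepted, False on any error entry."""
    tokens = list(tokens) + ["$"]      # append the end-of-input marker
    stack = ["$", start]               # top of the stack is the end of the list
    pos = 0
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            rhs = TABLE.get((top, look))
            if rhs is None:            # blank table entry: error
                return False
            # Push Y1 ... Yn so that Y1 ends up on top; ε pushes nothing.
            stack.extend(s for s in reversed(rhs) if s != EPS)
        elif top == look:              # terminal on top matches the input
            pos += 1
        else:
            return False
    return pos == len(tokens)

print(parse(["int", "*", "int"]))
print(parse(["int", "*"]))
```

Keeping the table as a dict keyed by (non-terminal, token) makes a blank entry simply a missing key, which matches the "blank entries indicate errors" convention.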

19
LL(1) Parsing Example
Stack         Input           Action
E #           int * int $     T X
T X #         int * int $     int Y
int Y X #     int * int $     push
Y X #         * int $         * T
* T X #       * int $         push
T X #         int $           int Y
int Y X #     int $           push
Y X #         $               ε
X #           $               ε
#             $               ACCEPT

20
Constructing Parsing Tables
 LL(1) languages are those defined by a
parsing table for the LL(1) algorithm
 No table entry can be multiply defined
 We want to generate parsing tables from
CFG

21
Constructing Parsing Tables (Cont.)
• If A → α, where in the line of A do we place α?
• In the column of t where t can start a string derived from α
    • α ⇒* t β
    • We say that t ∈ First(α)
• In the column of t if α is ε (or derives ε) and t can follow an A
    • S ⇒* β A t δ
    • We say t ∈ Follow(A)

22
Computing First Sets
Definition: First(X) = { t | X ⇒* tα } ∪ { ε | X ⇒* ε }

Algorithm sketch:
1. for all terminals t do First(t) ← { t }
2. for each production X → ε do First(X) ← { ε }
3. if X → A1 … An α and ε ∈ First(Ai), 1 ≤ i ≤ n do
    • add First(α) to First(X)
4. for each X → A1 … An s.t. ε ∈ First(Ai), 1 ≤ i ≤ n do
    • add ε to First(X)
5. repeat steps 3 & 4 until no First set can be grown
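
A fixed-point computation along these lines could look like the Python sketch below. The grammar encoding and the function name `first_sets` are assumptions made for illustration; ε is represented by the marker `EPS`, and a production X → ε is written as the single-symbol alternative `[EPS]`.

```python
EPS = "ε"  # marker for the empty string (an assumption of this sketch)

def first_sets(grammar, terminals):
    """Iterate until no First set grows."""
    first = {t: {t} for t in terminals}          # First(t) = { t }
    first.update({nt: set() for nt in grammar})
    changed = True
    while changed:
        changed = False
        for nt, alts in grammar.items():
            for alt in alts:
                before = len(first[nt])
                all_nullable = True
                for sym in alt:
                    if sym == EPS:               # production X -> ε
                        break
                    first[nt] |= first[sym] - {EPS}
                    if EPS not in first[sym]:
                        all_nullable = False     # later symbols cannot start X
                        break
                if all_nullable:
                    first[nt].add(EPS)
                if len(first[nt]) != before:
                    changed = True
    return first

grammar = {
    "E": [["T", "X"]],
    "X": [["+", "E"], [EPS]],
    "T": [["(", "E", ")"], ["int", "Y"]],
    "Y": [["*", "T"], [EPS]],
}
first = first_sets(grammar, ["+", "*", "(", ")", "int"])
for sym in grammar:
    print(f"First({sym}) =", sorted(first[sym]))
```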

23
First Sets. Example
• Recall the grammar
E → T X              X → + E | ε
T → ( E ) | int Y    Y → * T | ε
• First sets
First( ( ) = { ( }        First( T ) = { int, ( }
First( ) ) = { ) }        First( E ) = { int, ( }
First( int ) = { int }    First( X ) = { +, ε }
First( + ) = { + }        First( Y ) = { *, ε }
First( * ) = { * }

24
Computing Follow Sets
• Definition:
Follow(X) = { t | S ⇒* β X t δ }

• Intuition
    • If S is the start symbol then $ ∈ Follow(S)
    • If X → A B then First(B) ⊆ Follow(A) and Follow(X) ⊆ Follow(B)
    • Also if B ⇒* ε then Follow(X) ⊆ Follow(A)

25
Computing Follow Sets (Cont.)
Algorithm sketch:

1. Follow(S) ← { $ }
2. For each production A → α X β
    • add First(β) − {ε} to Follow(X)
3. For each A → α X β where ε ∈ First(β)
    • add Follow(A) to Follow(X)
• repeat step(s) ___ until no Follow set grows
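
As a rough Python sketch of this sketch: the code below computes Follow sets for the non-terminals only, taking the First sets of the example grammar E → T X, X → + E | ε, T → ( E ) | int Y, Y → * T | ε as given input. The names `GRAMMAR`, `FIRST`, `first_of`, and `follow_sets` are hypothetical.

```python
EPS = "ε"  # marker for the empty string (an assumption of this sketch)

GRAMMAR = {
    "E": [["T", "X"]],
    "X": [["+", "E"], [EPS]],
    "T": [["(", "E", ")"], ["int", "Y"]],
    "Y": [["*", "T"], [EPS]],
}
# First sets of the non-terminals, supplied as input to keep the sketch short.
FIRST = {"E": {"int", "("}, "T": {"int", "("}, "X": {"+", EPS}, "Y": {"*", EPS}}

def first_of(symbols):
    """First set of a symbol string; a terminal is its own First set."""
    out = set()
    for sym in symbols:
        f = FIRST.get(sym, {sym})
        out |= f - {EPS}
        if EPS not in f:
            return out
    out.add(EPS)          # every symbol (or the empty string) can vanish
    return out

def follow_sets(grammar, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for a, alts in grammar.items():
            for alt in alts:
                for i, x in enumerate(alt):
                    if x not in grammar:          # only non-terminals here
                        continue
                    beta = first_of(alt[i + 1:])
                    new = beta - {EPS}
                    if EPS in beta:               # β can vanish
                        new |= follow[a]
                    if not new <= follow[x]:
                        follow[x] |= new
                        changed = True
    return follow

follow = follow_sets(GRAMMAR, "E")
for nt in GRAMMAR:
    print(f"Follow({nt}) =", sorted(follow[nt]))
```

Follow sets of terminals are omitted here because the table construction only consults Follow(A) for non-terminals A.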

26
Follow Sets. Example
• Recall the grammar
E → T X              X → + E | ε
T → ( E ) | int Y    Y → * T | ε
• Follow sets
Follow( + ) = { int, ( }        Follow( * ) = { int, ( }
Follow( ( ) = { int, ( }        Follow( E ) = { ), $ }
Follow( X ) = { $, ) }          Follow( T ) = { +, ), $ }
Follow( ) ) = { +, ), $ }       Follow( Y ) = { +, ), $ }
Follow( int ) = { *, +, ), $ }

27
Constructing LL(1) Parsing Tables
• Construct a parsing table T for CFG G
• For each production A → α in G do:
    • For each terminal t ∈ First(α) do
        • T[A, t] = α
    • If ε ∈ First(α), for each t ∈ Follow(A) do
        • T[A, t] = α
    • If ε ∈ First(α) and $ ∈ Follow(A) do
        • T[A, $] = α
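
The construction can be sketched in Python as follows, using the example grammar E → T X, X → + E | ε, T → ( E ) | int Y, Y → * T | ε with its First and Follow sets supplied as input. The names `GRAMMAR`, `FIRST`, `FOLLOW`, `first_of`, and `build_table` are hypothetical; because $ is stored inside Follow(A) here, the second and third rules collapse into one step.

```python
EPS = "ε"  # marker for the empty string (an assumption of this sketch)

GRAMMAR = {
    "E": [["T", "X"]],
    "X": [["+", "E"], [EPS]],
    "T": [["(", "E", ")"], ["int", "Y"]],
    "Y": [["*", "T"], [EPS]],
}
# First and Follow sets of the non-terminals, supplied as input.
FIRST = {"E": {"int", "("}, "T": {"int", "("}, "X": {"+", EPS}, "Y": {"*", EPS}}
FOLLOW = {"E": {")", "$"}, "X": {")", "$"},
          "T": {"+", ")", "$"}, "Y": {"+", ")", "$"}}

def first_of(symbols):
    """First set of a symbol string; a terminal is its own First set."""
    out = set()
    for sym in symbols:
        f = FIRST.get(sym, {sym})
        out |= f - {EPS}
        if EPS not in f:
            return out
    out.add(EPS)
    return out

def build_table(grammar):
    table = {}
    for a, alts in grammar.items():
        for alpha in alts:
            f = first_of(alpha)
            columns = f - {EPS}
            if EPS in f:
                columns |= FOLLOW[a]      # includes $ when $ ∈ Follow(A)
            for t in columns:
                if (a, t) in table:       # multiply defined entry
                    raise ValueError(f"conflict at [{a}, {t}]: not LL(1)")
                table[(a, t)] = alpha
    return table

table = build_table(GRAMMAR)
for (a, t), alpha in sorted(table.items()):
    print(f"T[{a}, {t}] = {' '.join(alpha)}")
```

Raising an error on a multiply defined entry is a design choice of this sketch; a tool could instead collect all conflicts and report them together.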

28
Notes on LL(1) Parsing Tables
• If any entry is multiply defined then G is not LL(1). This happens,
for example:
    • If G is ambiguous
    • If G is left recursive
    • If G is not left-factored
• Most programming language grammars are not LL(1)
 There are tools that build LL(1) tables

29
