
Top Down Parsing

Shashank Gupta
Assistant Professor
Department of Computer Science and Information Systems
BITS Pilani, Pilani Campus
Parsing

• Parsing is the process of determining whether a string can be generated by a grammar.
• Parsing falls into two categories:

- Top-down parsing: construction of the parse tree starts at the root (the start symbol) and proceeds towards the leaves (tokens or terminals).

- Bottom-up parsing: construction of the parse tree starts from the leaf nodes (tokens or terminals of the grammar) and proceeds towards the root (the start symbol).

CS F363 Compiler Construction 2


BITS Pilani, Pilani Campus
Example: Top-Down Parsing

This grammar generates the type information of Pascal:

type → simple | ↑ id | array [ simple ] of type
simple → int | char | num dotdot num

Example

Construction of a parse tree starts at the root, which is labelled with the start symbol.

Repeat the following two steps:

• At a node labelled with non-terminal A, select one of the productions of A and construct the children nodes for the symbols on its right-hand side.
• Find the next node at which a subtree is to be constructed.

Example
Parse: array [ num dotdot num ] of int

type → simple | ↑ id | array [ simple ] of type
simple → int | char | num dotdot num

Recursive Descent Parser

It is a top-down method of syntax analysis in which a set of recursive procedures is executed to process the input.

• A procedure is associated with each non-terminal of the grammar.

It is usually built from a set of mutually recursive procedures (or a non-recursive equivalent), where each procedure implements the production rules of one non-terminal of the grammar.

Recursive Descent Parsing

type → simple | ↑ id | array [ simple ] of type
simple → int | char | num dotdot num

First set:
Let there be a production
A → α
then First(α) is the set of tokens that appear as the first token in the strings generated from α.
For example:
First(simple) = {int, char, num}
First(num dotdot num) = {num}

Computation of First Sets
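
The rules for this slide did not survive in the text, so the following is only a minimal sketch of the usual fixed-point computation of First sets. The grammar encoding (a dict from non-terminal to a list of right-hand sides, with an empty list standing for ε) and the names `first_sets` and `first_of_string` are assumptions made for illustration.

```python
EPS = "ε"  # marker for the empty string


def first_of_string(symbols, first):
    """First set of a sequence of grammar symbols, given current First sets.

    Any symbol that is not a key of `first` is treated as a terminal.
    """
    result = set()
    for sym in symbols:
        if sym not in first:          # terminal: it starts the string
            result.add(sym)
            return result
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:     # sym cannot vanish, so stop here
            return result
    result.add(EPS)                   # every symbol can derive ε
    return result


def first_sets(grammar):
    """Iterate until no First set grows any more."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of_string(rhs, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


# S → ABCDE, A → a | ε, B → b | ε, C → c, D → d | ε, E → e | ε
grammar = {
    "S": [["A", "B", "C", "D", "E"]],
    "A": [["a"], []],
    "B": [["b"], []],
    "C": [["c"]],
    "D": [["d"], []],
    "E": [["e"], []],
}
print(first_sets(grammar)["S"])
```

For this grammar the sketch gives First(S) = {a, b, c}: A and B can vanish, while C cannot, so nothing after C contributes.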

Example

Calculate the First of all non-terminals in the following grammar.

S → ABCDE
A → a | ε
B → b | ε
C → c
D → d | ε
E → e | ε

First(S) = First(ABCDE) = (First(A) − {ε}) ∪ (First(B) − {ε}) ∪ First(C) = {a, b, c}

Example

Calculate the First of all non-terminals in the following grammar.

S → Bb | Cd
B → aB | ε
C → cC | ε

Example

Calculate the First of all non-terminals in the following grammar.

S → ACB | CbB | Ba
A → da | BC
B → g | ε
C → h | ε

Example

Write procedures for each of the non-terminals of this grammar using recursive descent parsing and parse the following input: i + i

E → i E'
E' → + i E' | ε

Recursive Descent Parser

E → i E'
E' → + i E' | ε
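
The procedures for this grammar might look like the following sketch, with one procedure per non-terminal. It assumes the operator in E' is `+` (as in the input i + i), that the input arrives as a list of token strings, and that `$` marks the end of input; the names `parse_sum`, `match` and `ParseError` are illustrative.

```python
class ParseError(Exception):
    """Raised when the input does not match the grammar."""


def parse_sum(tokens):
    """Recursive descent parser for E → i E',  E' → + i E' | ε."""
    tokens = list(tokens) + ["$"]   # '$' terminates the input
    pos = 0

    def match(expected):
        nonlocal pos
        if tokens[pos] != expected:
            raise ParseError(
                f"expected {expected!r}, found {tokens[pos]!r} at position {pos}"
            )
        pos += 1

    def E():
        match("i")
        E_prime()

    def E_prime():
        if tokens[pos] == "+":      # E' → + i E'
            match("+")
            match("i")
            E_prime()
        # otherwise E' → ε: consume nothing

    E()
    match("$")                      # the whole input must be consumed
    return True


print(parse_sum(["i", "+", "i"]))
```

The choice between the two alternatives of E' is made by looking only at the current token, so no backtracking is needed.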

Example

Write procedures for each of the non-terminals of this grammar using recursive descent parsing and parse the following input:
array [ num dotdot num ] of int

type → simple | ↑ id | array [ simple ] of type
simple → int | char | num dotdot num

Recursive Descent Parser

type → simple | ↑ id | array [ simple ] of type
simple → int | char | num dotdot num
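
A possible set of procedures for this grammar is sketched below. It assumes the second alternative of type is the pointer form ↑ id, that tokens are plain strings, and that `$` ends the input; `parse_type`, `type_` and `simple` are illustrative names. Each procedure picks an alternative by checking the lookahead against the First set of that alternative.

```python
class ParseError(Exception):
    """Raised when the input does not match the grammar."""


def parse_type(tokens):
    tokens = list(tokens) + ["$"]
    pos = 0

    def lookahead():
        return tokens[pos]

    def match(expected):
        nonlocal pos
        if lookahead() != expected:
            raise ParseError(f"expected {expected!r}, found {lookahead()!r}")
        pos += 1

    def type_():
        if lookahead() in ("int", "char", "num"):   # First(simple)
            simple()
        elif lookahead() == "↑":                    # type → ↑ id
            match("↑")
            match("id")
        elif lookahead() == "array":                # type → array [ simple ] of type
            match("array")
            match("[")
            simple()
            match("]")
            match("of")
            type_()
        else:
            raise ParseError(f"unexpected {lookahead()!r} at start of type")

    def simple():
        if lookahead() == "int":
            match("int")
        elif lookahead() == "char":
            match("char")
        elif lookahead() == "num":                  # simple → num dotdot num
            match("num")
            match("dotdot")
            match("num")
        else:
            raise ParseError(f"unexpected {lookahead()!r} at start of simple")

    type_()
    match("$")
    return True


print(parse_type(["array", "[", "num", "dotdot", "num", "]", "of", "int"]))
```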
Left Recursion

• A recursive descent parser may loop forever on a left-recursive production of the form:
A → Aα
• Consider the grammar A → Aα | β
• Left recursion can be removed by rewriting the grammar as
A → βA'
A' → αA' | ε

Example

E → E + T | T
T → T * F | F
F → ( E ) | id

Removal of Left Recursion

A → Aα1 | Aα2 | Aα3 | ... | Aαn | β1 | β2 | β3 | ... | βm

is rewritten as

A → β1A' | β2A' | ... | βmA'
A' → α1A' | α2A' | ... | αnA' | ε
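
The rewriting rule above can be sketched as a small function. The list-of-symbols encoding of right-hand sides (with `[]` for ε) and the name `remove_immediate_left_recursion` are assumptions for illustration; the sketch handles immediate left recursion only, not the indirect case.

```python
def remove_immediate_left_recursion(nt, alternatives):
    """Rewrite A → Aα1 | ... | Aαn | β1 | ... | βm without left recursion.

    Returns a dict with the new productions for A and, if needed, for A'.
    """
    new_nt = nt + "'"
    # α parts: alternatives that start with A, with the leading A removed
    alphas = [rhs[1:] for rhs in alternatives if rhs and rhs[0] == nt]
    # β parts: alternatives that do not start with A
    betas = [rhs for rhs in alternatives if not rhs or rhs[0] != nt]
    if not alphas:
        return {nt: alternatives}            # nothing to do
    return {
        nt: [beta + [new_nt] for beta in betas],                # A  → βi A'
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[]],  # A' → αi A' | ε
    }


# E → E + T | T
print(remove_immediate_left_recursion("E", [["E", "+", "T"], ["T"]]))
```

Applied to E → E + T | T, this yields E → T E' and E' → + T E' | ε.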

Example

A → AC | Aad | bd | c

Indirect Left Recursion

S → Aa | b
A → Ac | Sd | ε

Example

S → cAd
A → ab | a

Left Factoring

It is the process of factoring out the common prefix (left factor) that appears in two or more productions of the same non-terminal.
A → αβ1 | αβ2
After left factoring:

A → αA'
A' → β1 | β2
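
One step of left factoring can be sketched as follows. The encoding of right-hand sides as lists of symbols (with `[]` for ε) and the names `left_factor` and `common_prefix` are assumptions for illustration; the sketch groups alternatives by their first symbol and factors the longest prefix shared by each group, and it may need to be applied repeatedly to the new non-terminals.

```python
def common_prefix(sequences):
    """Longest prefix shared by all the given symbol lists."""
    prefix = []
    for symbols in zip(*sequences):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return prefix


def left_factor(nt, alternatives):
    """Factor A → αβ1 | αβ2 into A → αA', A' → β1 | β2 (one pass)."""
    groups = {}
    for rhs in alternatives:
        groups.setdefault(rhs[0] if rhs else None, []).append(rhs)

    result = {nt: []}
    for head, group in groups.items():
        if head is None or len(group) == 1:
            result[nt].extend(group)          # no common left factor here
            continue
        alpha = common_prefix(group)
        new_nt = nt + "'" * len(result)       # A', A'', ... for successive groups
        result[nt].append(alpha + [new_nt])
        result[new_nt] = [rhs[len(alpha):] for rhs in group]
    return result


# stmt → if exp then stmt | if exp then stmt else stmt
print(left_factor("stmt", [
    ["if", "exp", "then", "stmt"],
    ["if", "exp", "then", "stmt", "else", "stmt"],
]))
```

For the conditional statement this produces stmt → if exp then stmt stmt' and stmt' → ε | else stmt.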

Example of Left Factoring

stmt → if exp then stmt
     | if exp then stmt else stmt

Predictive Parsing

It is a non-recursive, table-driven top-down parsing method.

It handles LL(k) grammars.

• The first 'L' means the input stream is scanned from left to right.
• The second 'L' stands for leftmost derivation.
• 'k' is the number of lookahead tokens.

Follow Sets

Follow(X) is the set of terminals that can appear immediately after X in some sentential form, i.e. the symbols that might follow a string derived from X in the input stream.

Example

Calculate the Follow of all non-terminals in the following grammar.

S → ABCDE
A → a | ε
B → b | ε
C → c
D → d | ε
E → e | ε

Example

Calculate the Follow of all non-terminals in the following grammar.

S → Bb | Cd
B → aB | ε
C → cC | ε

Algorithm for Computation of Follow Sets

• Always include $ in Follow(S), where S is the start symbol.
• If there is a production A → αBβ, then everything in First(β) except ε is in Follow(B).
• If there is a production A → αBβ and First(β) contains ε, then everything in Follow(A) is in Follow(B).
• If there is a production A → αB, then everything in Follow(A) is in Follow(B).
• Repeat these rules until no Follow set changes.
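
The rules above can be sketched as a fixed-point loop. The grammar encoding (dict from non-terminal to a list of right-hand sides, `[]` for ε) and the names `compute_first`, `first_of_string` and `follow_sets` are assumptions for illustration. The case A → αB is covered by treating β as the empty string, whose First set is {ε}.

```python
EPS, END = "ε", "$"


def first_of_string(symbols, first):
    """First set of a symbol sequence; non-keys of `first` are terminals."""
    result = set()
    for sym in symbols:
        if sym not in first:
            result.add(sym)
            return result
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return result
    result.add(EPS)
    return result


def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of_string(rhs, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def follow_sets(grammar, start):
    first = compute_first(grammar)
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)                      # rule 1: $ is in Follow(S)
    changed = True
    while changed:
        changed = False
        for a, alternatives in grammar.items():
            for rhs in alternatives:
                for i, b in enumerate(rhs):
                    if b not in grammar:        # only non-terminals have Follow sets
                        continue
                    beta_first = first_of_string(rhs[i + 1:], first)
                    new = beta_first - {EPS}    # rule 2: First(β) without ε
                    if EPS in beta_first:       # rules 3 and 4: β can vanish
                        new |= follow[a]
                    if not new <= follow[b]:
                        follow[b] |= new
                        changed = True
    return follow


grammar = {
    "S": [["A", "B", "C", "D", "E"]],
    "A": [["a"], []],
    "B": [["b"], []],
    "C": [["c"]],
    "D": [["d"], []],
    "E": [["e"], []],
}
print(follow_sets(grammar, "S"))
```

For S → ABCDE the sketch gives Follow(A) = {b, c}, Follow(B) = {c}, Follow(C) = {d, e, $}, Follow(D) = {e, $} and Follow(E) = Follow(S) = {$}.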

Example

Calculate the Follow of all non-terminals in the following two grammars.

Grammar 1:
S → ACB | CbB | Ba
A → da | BC
B → g | ε
C → h | ε

Grammar 2:
S → i E t S S' | a
S' → e S | ε
E → b

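Predictive Parser

[Figure: a grammar with start symbol S and its predictive parsing table; the entries are not recoverable from the extracted text.]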

Algorithm for Constructing the Parse Table

• for each production A → α do
  – for each terminal a in First(α)
        M[A, a] = A → α
  – if ε is in First(α)
        for each terminal b in Follow(A)
            M[A, b] = A → α
  – if ε is in First(α) and $ is in Follow(A)
        M[A, $] = A → α
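
A minimal sketch of this table construction is shown below. It takes the First and Follow sets as given (here those of the expression grammar E → T E', E' → + T E' | ε, T → F T', T' → * F T' | ε, F → id | ( E )) and also records cells that receive two different productions, which indicates the grammar is not LL(1). The names `build_table` and `first_of_string`, and the encoding of productions as (non-terminal, list of symbols) pairs, are assumptions for illustration.

```python
EPS, END = "ε", "$"


def first_of_string(symbols, first):
    """First set of a symbol sequence; non-keys of `first` are terminals."""
    result = set()
    for sym in symbols:
        if sym not in first:
            result.add(sym)
            return result
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return result
    result.add(EPS)
    return result


def build_table(productions, first, follow):
    """Return (table, conflicts); table maps (A, terminal) to a right-hand side."""
    table, conflicts = {}, []
    for a, rhs in productions:
        rhs_first = first_of_string(rhs, first)
        lookaheads = rhs_first - {EPS}
        if EPS in rhs_first:
            lookaheads |= follow[a]      # Follow(A) holds '$' when applicable
        for terminal in lookaheads:
            if (a, terminal) in table and table[(a, terminal)] != rhs:
                conflicts.append((a, terminal))   # two productions in one cell
            else:
                table[(a, terminal)] = rhs
    return table, conflicts


PRODUCTIONS = [
    ("E", ["T", "E'"]),
    ("E'", ["+", "T", "E'"]), ("E'", []),
    ("T", ["F", "T'"]),
    ("T'", ["*", "F", "T'"]), ("T'", []),
    ("F", ["id"]), ("F", ["(", "E", ")"]),
]
FIRST = {"E": {"id", "("}, "E'": {"+", EPS}, "T": {"id", "("},
         "T'": {"*", EPS}, "F": {"id", "("}}
FOLLOW = {"E": {END, ")"}, "E'": {END, ")"}, "T": {"+", ")", END},
          "T'": {"+", ")", END}, "F": {"*", "+", ")", END}}

table, conflicts = build_table(PRODUCTIONS, FIRST, FOLLOW)
for key in sorted(table):
    print(key, "->", table[key] or [EPS])
```

Because Follow(A) already contains $ whenever $ can follow A, the third step of the algorithm is folded into the second in this sketch.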

Algorithm of Predictive Parsing

• Let X be the symbol on top of the stack and a the current input symbol.
• '$' is a special token that marks the bottom of the stack and also terminates the input string.
• if X = a = $ then stop (the input is accepted)
• if X = a ≠ $ then pop(X) and advance the lookahead pointer
• if X is a terminal and X ≠ a then error
• if X is a non-terminal
    then if M[X, a] = {X → PQR}
         then begin pop(X); push(R, Q, P) end    (P ends up on top of the stack)
         else error
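
The driver loop above can be sketched as follows, using a hand-written table for the expression grammar E → T E', E' → + T E' | ε, T → F T', T' → * F T' | ε, F → id | ( E ). The dictionary encoding of M and the name `predictive_parse` are assumptions for illustration; the function returns the productions applied, which together form a leftmost derivation.

```python
END = "$"

# M[X, a]: right-hand side to push; [] stands for an ε-production.
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", END): [],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "*"): ["*", "F", "T'"],
    ("T'", "+"): [], ("T'", ")"): [], ("T'", END): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def predictive_parse(tokens, start="E"):
    tokens = list(tokens) + [END]      # '$' terminates the input
    stack = [END, start]               # '$' marks the bottom of the stack
    pos = 0
    applied = []
    while True:
        x, a = stack[-1], tokens[pos]
        if x == END and a == END:      # X = a = $: accept
            return applied
        if x in NONTERMINALS:
            rhs = TABLE.get((x, a))
            if rhs is None:            # empty table entry
                raise SyntaxError(f"no entry M[{x}, {a}] at position {pos}")
            stack.pop()
            stack.extend(reversed(rhs))  # push R, Q, P so that P is on top
            applied.append((x, rhs))
        elif x == a:                   # X = a ≠ $: match the terminal
            stack.pop()
            pos += 1
        else:
            raise SyntaxError(f"expected {x!r}, found {a!r} at position {pos}")


for lhs, rhs in predictive_parse(["id", "+", "id", "*", "id"]):
    print(lhs, "→", " ".join(rhs) or "ε")
```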

Predictive Parser

E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → id | ( E )

Non-Terminal   First      Follow
E              {id, (}    {$, )}
E'             {+, ε}     {$, )}
T              {id, (}    {+, ), $}
T'             {*, ε}     {+, ), $}
F              {id, (}    {*, +, ), $}

Parsing table:

       id         +            *            (          )         $
E      E → TE'                              E → TE'
E'                E' → +TE'                            E' → ε    E' → ε
T      T → FT'                              T → FT'
T'                T' → ε       T' → *FT'               T' → ε    T' → ε
F      F → id                               F → (E)

Predictive Parser

Construct the parse table for the following grammar and find out whether it is LL(1) or not.

S → aSbS
  | bSaS
  | ε

Conditions for a Grammar to be LL(1)

For every non-terminal A with productions A → α1 | α2 | α3 | ... | αn:

First(αi) ∩ First(αj) = ∅ for every pair i ≠ j

For every non-terminal A with productions A → α | ε:

First(α) ∩ Follow(A) = ∅

Examples

Consider the following three grammars and find out whether each one is LL(1) or not.

Grammar 1:
S → aABb
A → c | ε
B → d | ε

Grammar 2:
S → i E t S S' | a
S' → e S | ε
E → b

Grammar 3:
S → aSA | ε
A → c | ε

Error Recovery

• The simplest strategy is to stop immediately at the first error.
• A practical compiler must instead recover from errors and identify as many errors as possible.
• Error recovery methods
– Panic mode
– Phrase level recovery
– Error productions
– Global correction
Panic Mode

When an error is encountered anywhere in a statement, the rest of the statement is ignored: the parser skips the input from the erroneous token up to the next delimiter.

This mode prevents the parser from entering infinite loops and is considered the easiest method of error recovery.

Panic Mode

An error is detected when the parse table entry for the current non-terminal and input symbol is marked as an error, i.e. M[A, a] = error.

Skip tokens in the input until a token in the synchronizing (syn) set appears.

Place the symbols of Follow(A) in the syn set. Skip tokens until an element of Follow(A) is seen, then pop A and continue parsing.

Error Recovery through Panic Mode

As soon as you hit an error state, apply one of the following:

• Keep discarding tokens until you reach a token in the Follow set of the non-terminal on top of the stack. Pop that non-terminal from the stack and continue parsing from that token.

• Keep discarding tokens until you reach a token in the First set of the non-terminal on top of the stack. Resume parsing from this token.
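
A sketch of these two rules on top of a table-driven parser is given below, using the expression grammar E → T E', E' → + T E' | ε, T → F T', T' → * F T' | ε, F → id | ( E ). The names `parse_with_recovery`, `TABLE` and `FOLLOW` are illustrative. Two details are assumptions of this sketch rather than part of the rules above: a mismatched terminal on top of the stack is popped and reported as missing, and a non-terminal is also popped when the input is exhausted, so that the loop always terminates. Since every token in First(A) has a table entry for A, skipping until an entry exists covers the First-set rule.

```python
END = "$"

TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", END): [],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "*"): ["*", "F", "T'"],
    ("T'", "+"): [], ("T'", ")"): [], ("T'", END): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
FOLLOW = {"E": {END, ")"}, "E'": {END, ")"}, "T": {"+", ")", END},
          "T'": {"+", ")", END}, "F": {"*", "+", ")", END}}


def parse_with_recovery(tokens, start="E"):
    """Parse and return a list of error messages (empty if the input is valid)."""
    tokens = list(tokens) + [END]
    stack, pos, errors = [END, start], 0, []
    while True:
        x, a = stack[-1], tokens[pos]
        if x == END and a == END:
            return errors
        if x in FOLLOW:                               # x is a non-terminal
            rhs = TABLE.get((x, a))
            if rhs is not None:                       # normal expansion
                stack.pop()
                stack.extend(reversed(rhs))
            elif a in FOLLOW[x] or a == END:          # syn entry: pop the non-terminal
                errors.append(f"position {pos}: unexpected {a!r}, popped {x}")
                stack.pop()
            else:                                     # discard the offending token
                errors.append(f"position {pos}: skipped {a!r}")
                pos += 1
        elif x == a:                                  # terminal matches
            stack.pop()
            pos += 1
        elif x == END:                                # input left over after the parse
            errors.append(f"position {pos}: skipped extra {a!r}")
            pos += 1
        else:                                         # terminal mismatch
            errors.append(f"position {pos}: missing {x!r}")
            stack.pop()


print(parse_with_recovery(["id", "+", "*", "id"]))
```

On the input id + * id the parser reports a single error: with T on top of the stack and * as lookahead, * is neither in First(T) nor in Follow(T), so it is discarded and parsing resumes at id.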
Example

[Figure: predictive parsing table for this example; its entries are not recoverable from the extracted text.]

Consider the following predictive parsing table and the following subset of the input string:
 * ) id
The current status of the stack is $ E ' T " ) E ' F T G. Parse the given input and, if an error state occurs, show error recovery through panic mode.

Example

Consider the following expression grammar and parse the string "not (true and or false)" using non-recursive predictive parsing. If an error occurs, show the error recovery.

bexpr → bexpr or bterm | bterm
bterm → bterm and bfactor | bfactor
bfactor → not bfactor | ( bexpr ) | true | false


Example

1. bexpr → bterm bexpr'
2. bexpr' → or bterm bexpr'
3. bexpr' → ε
4. bterm → bfactor bterm'
5. bterm' → and bfactor bterm'
6. bterm' → ε
7. bfactor → not bfactor
8. bfactor → ( bexpr )
9. bfactor → true
10. bfactor → false

NT         First Set                  Follow Set
bexpr      { not, (, true, false }    { $, ) }
bexpr'     { or, ε }                  { $, ) }
bterm      { not, (, true, false }    { or, $, ) }
bterm'     { and, ε }                 { or, $, ) }
bfactor    { not, (, true, false }    { and, or, $, ) }

Input: "not (true and or false)"

           or     and    not    (      )      true   false  $
bexpr                    1      1      syn    1      1      syn
bexpr'     2                           3                    3
bterm      syn           4      4      syn    4      4      syn
bterm'     6      5                    6                    6
bfactor    syn    syn    7      8      syn    9      10     syn
Example

Construct an LL(1) grammar for deriving matrix-like constructs such as
[23, 43, 34, 56; 1, 32, 76; 10, 18, 87]
Here, rows are separated by semicolons and columns are separated by commas. A matrix must have at least one row, and a row must have at least one column entry.

Example

matrix → rows
rows → row semicolon rows | row
row → num remainingelements
remainingelements → comma num remainingelements | ε

The modified LL(1) grammar is

matrix → rows
rows → row remainingrows
remainingrows → semicolon rows | ε
row → num remainingelements
remainingelements → comma num remainingelements | ε
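
Because the modified grammar is LL(1), it can be parsed with one procedure per non-terminal and a single token of lookahead, as in the sketch below. The tokenizer and the names `tokenize` and `parse_matrix` are assumptions for illustration. The grammar itself has no bracket symbols, so this sketch assumes the enclosing [ ] are stripped before parsing.

```python
import re


def tokenize(text):
    """Turn '23, 43; 1' into (kind, value) pairs ending with ('$', None)."""
    body = text.strip()
    if body.startswith("[") and body.endswith("]"):
        body = body[1:-1]            # assumption: brackets are outside the grammar
    tokens = []
    for lexeme in re.findall(r"\d+|\S", body):
        if lexeme.isdigit():
            tokens.append(("num", int(lexeme)))
        elif lexeme == ",":
            tokens.append(("comma", lexeme))
        elif lexeme == ";":
            tokens.append(("semicolon", lexeme))
        else:
            raise ValueError(f"unexpected character {lexeme!r}")
    tokens.append(("$", None))
    return tokens


def parse_matrix(text):
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos][0]

    def match(kind):
        nonlocal pos
        if peek() != kind:
            raise ValueError(f"expected {kind}, found {peek()} at token {pos}")
        value = tokens[pos][1]
        pos += 1
        return value

    def rows():                      # rows → row remainingrows
        first_row = row()
        return [first_row] + remaining_rows()

    def remaining_rows():            # remainingrows → semicolon rows | ε
        if peek() == "semicolon":
            match("semicolon")
            return rows()
        return []

    def row():                       # row → num remainingelements
        first_entry = match("num")
        return [first_entry] + remaining_elements()

    def remaining_elements():        # remainingelements → comma num remainingelements | ε
        if peek() == "comma":
            match("comma")
            entry = match("num")
            return [entry] + remaining_elements()
        return []

    result = rows()                  # matrix → rows
    match("$")
    return result


print(parse_matrix("[23, 43, 34, 56; 1, 32, 76; 10, 18, 87]"))
```

The sketch returns the rows as nested lists and rejects an empty matrix or an empty row, matching the requirement of at least one row and at least one entry per row.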

Thank You
