Top-Down Parsing

Presented By

Subhadra Mishra
Asst. Professor,
Dept. of CSA, OUAT

Top-Down Parsing
• The parse tree is created top to bottom.
• Top-down parser
– Recursive-Descent Parsing
• Backtracking is needed (if a choice of a production rule does not work, we backtrack to try other alternatives).
• It is a general parsing technique, but not widely used.
• Not efficient.

– Predictive Parsing
• No backtracking.
• Efficient.
• Needs a special form of grammars (LL(1) grammars).
• Recursive Predictive Parsing is a special form of Recursive-Descent Parsing without backtracking.
• Non-Recursive (Table-Driven) Predictive Parser is also known as LL(1) parser.


Recursive-Descent Parsing (uses Backtracking)
• Backtracking is needed.
• It tries to find the left-most derivation.

S → aBc
B → bc | b

input: abc

[Figure: two parse trees for S → aBc on the input abc. The first tree expands B → bc and fails, so the parser backtracks; the second tree expands B → b.]
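The backtracking on the input abc can be sketched in Python as follows. This is only an illustrative sketch for the grammar S → aBc, B → bc | b; the function names `parse_S` and `parse_B` are made up for this example.

```python
def parse_B(s, pos):
    """Yield each position where B can end when it starts at pos (B -> bc | b)."""
    for alternative in ("bc", "b"):      # alternatives are tried in grammar order
        if s.startswith(alternative, pos):
            yield pos + len(alternative)


def parse_S(s):
    """Return True if the whole string s can be derived from S -> aBc."""
    if not s.startswith("a"):
        return False
    for end_of_B in parse_B(s, 1):
        # After B we need exactly one 'c' and then the end of the input.
        if s.startswith("c", end_of_B) and end_of_B + 1 == len(s):
            return True
        # Otherwise this choice for B failed: backtrack and try the next one.
    return False


print(parse_S("abc"))   # B -> bc fails first, then B -> b succeeds
```

For abc the first alternative B → bc consumes the c that S itself needs, so the parser has to go back and try B → b.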

Predictive Parser

a grammar ➔ eliminate left recursion ➔ left factor ➔ a grammar suitable for predictive parsing (an LL(1) grammar)

• When re-writing a non-terminal in a derivation step, a predictive parser can uniquely choose a production rule by just looking at the current symbol in the input string.

A → α1 | ... | αn          input: ... a ...
                                      (a is the current token)

Recursive Predictive Parsing
• Each non-terminal corresponds to a procedure.

Ex: A → aBb (This is the only production rule for A)

proc A {
  - match the current token with a, and move to the next token;
  - call ‘B’;
  - match the current token with b, and move to the next token;
}

Recursive Predictive Parsing (cont.)

A → aBb | bAB

proc A {
  case of the current token {
    ‘a’: - match the current token with a, and move to the next token;
         - call ‘B’;
         - match the current token with b, and move to the next token;
    ‘b’: - match the current token with b, and move to the next token;
         - call ‘A’;
         - call ‘B’;
  }
}
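The procedure for A can be written as a small Python sketch. The slides do not give the rules for B, so the rule B → c below is purely an assumption made to obtain something runnable; the class and helper names are also hypothetical.

```python
class ParseError(Exception):
    pass


class PredictiveParser:
    def __init__(self, text):
        self.tokens = list(text) + ["$"]   # '$' marks the end of the input
        self.pos = 0

    @property
    def current(self):
        return self.tokens[self.pos]

    def match(self, expected):
        """Match the current token with `expected` and move to the next token."""
        if self.current != expected:
            raise ParseError(f"expected {expected!r}, found {self.current!r}")
        self.pos += 1

    def A(self):
        # A -> aBb | bAB: the current token alone selects the production.
        if self.current == "a":
            self.match("a")
            self.B()
            self.match("b")
        elif self.current == "b":
            self.match("b")
            self.A()
            self.B()
        else:
            raise ParseError(f"unexpected token {self.current!r}")

    def B(self):
        # ASSUMPTION: B -> c (not in the slides, chosen only for illustration).
        self.match("c")


def accepts(text):
    parser = PredictiveParser(text)
    try:
        parser.A()
        parser.match("$")     # the whole input must be consumed
        return True
    except ParseError:
        return False


print(accepts("acb"), accepts("bacbc"), accepts("ab"))
```

No choice is ever undone here: each `case` branch is picked from the current token, which is what makes the parser predictive.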

Non-Recursive Predictive Parsing -- LL(1) Parser
• Non-recursive predictive parsing is a table-driven parser.
• It is a top-down parser.
• It is also known as LL(1) Parser.

[Figure: the non-recursive predictive parser reads the input buffer, uses a stack and the parsing table, and produces the output.]

LL(1) Parser

input buffer
– our string to be parsed. We will assume that its end is marked with a special symbol $.

output
– a production rule representing a step of the derivation sequence (left-most derivation) of the string in the input buffer.

stack
– contains the grammar symbols
– at the bottom of the stack, there is a special end marker symbol $.
– initially the stack contains only the symbol $ and the starting symbol S.     $S ← initial stack
– when the stack is emptied (i.e. only $ left in the stack), the parsing is completed.

parsing table
– a two-dimensional array M[A,a]
– each row is a non-terminal symbol
– each column is a terminal symbol or the special symbol $
– each entry holds a production rule.

LL(1) Parser – Parser Actions
• The symbol at the top of the stack (say X) and the current symbol in the input string (say a) determine the parser action.
• There are four possible parser actions.
1. If X and a are $ ➔ parser halts (successful completion)
2. If X and a are the same terminal symbol (different from $) ➔ parser pops X from the stack, and moves to the next symbol in the input buffer.
3. If X is a non-terminal ➔ parser looks at the parsing table entry M[X,a]. If M[X,a] holds a production rule X→Y1Y2...Yk, it pops X from the stack and pushes Yk,Yk-1,...,Y1 into the stack. The parser also outputs the production rule X→Y1Y2...Yk to represent a step of the derivation.
4. none of the above ➔ error
– all empty entries in the parsing table are errors.
– If X is a terminal symbol different from a, this is also an error case.
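The four parser actions can be sketched as a short Python loop. This is a minimal sketch, not a definitive implementation: the function name `ll1_parse`, the dictionary layout of the table, and the sample table for the grammar S → aBa, B → bB | ε are choices made for this illustration.

```python
def ll1_parse(table, start, tokens):
    """Return the productions output by the parser, or raise ValueError."""
    # Assumption: every non-terminal has at least one entry in the table.
    nonterminals = {A for (A, _) in table}
    stack = ["$", start]                 # initial stack: $S
    buffer = list(tokens) + ["$"]        # input buffer ends with $
    i = 0
    output = []
    while True:
        X, a = stack[-1], buffer[i]
        if X == "$" and a == "$":        # action 1: successful completion
            return output
        if X not in nonterminals:
            if X != a:                   # action 4: terminal mismatch
                raise ValueError(f"expected {X!r}, found {a!r}")
            stack.pop()                  # action 2: pop X, advance the input
            i += 1
        elif (X, a) in table:            # action 3: expand the non-terminal
            rhs = table[(X, a)]
            stack.pop()
            stack.extend(reversed(rhs))  # push Yk ... Y1, so Y1 ends up on top
            output.append((X, rhs))
        else:                            # action 4: empty table entry
            raise ValueError(f"no entry for M[{X},{a}]")


# Sample table for S -> aBa, B -> bB | ε (the empty tuple stands for ε).
TABLE = {
    ("S", "a"): ("a", "B", "a"),
    ("B", "b"): ("b", "B"),
    ("B", "a"): (),
}

for lhs, rhs in ll1_parse(TABLE, "S", "abba"):
    print(lhs, "->", "".join(rhs) or "ε")
```

The right-hand side is pushed in reverse so that its first symbol is the next one to be matched or expanded, which yields a left-most derivation.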

LL(1) Parser – Example1

S → aBa
B → bB | ε

LL(1) Parsing Table

      a           b           $
S     S → aBa
B     B → ε       B → bB

stack     input     output
$S        abba$     S → aBa
$aBa      abba$
$aB       bba$      B → bB
$aBb      bba$
$aB       ba$       B → bB
$aBb      ba$
$aB       a$        B → ε
$a        a$
$         $         accept, successful completion

LL(1) Parser – Example1 (cont.)

Outputs: S → aBa   B → bB   B → bB   B → ε

Derivation (left-most): S⇒aBa⇒abBa⇒abbBa⇒abba

parse tree

        S
      / | \
     a  B  a
       / \
      b   B
         / \
        b   B
            |
            ε

Problem

E → E+T | T
T → T*F | F
F → (E) | num

1. Compute FIRST and FOLLOW sets for this G
2. Compute parse table entries

Constructing LL(1) Parsing Tables
• Two functions are used in the construction of LL(1) parsing tables:
– FIRST
– FOLLOW
• FIRST(α) is a set of the terminal symbols which occur as first symbols in strings derived from α where α is any string of grammar symbols.
• if α derives to ε, then ε is also in FIRST(α).
• FOLLOW(A) is the set of the terminals which occur immediately after (follow) the non-terminal A in the strings derived from the starting symbol.
– a terminal a is in FOLLOW(A) if S ⇒* αAaβ
– $ is in FOLLOW(A) if S ⇒* αA

Compute FIRST for Any String X
• If X is a terminal symbol ➔ FIRST(X)={X}
• If X is a non-terminal symbol and X → ε is a production rule ➔ ε is in FIRST(X).
• If X is a non-terminal symbol and X → Y1Y2...Yn is a production rule
  ➔ if a terminal a in FIRST(Yi) and ε is in all FIRST(Yj) for j=1,...,i-1 then a is in FIRST(X).
  ➔ if ε is in all FIRST(Yj) for j=1,...,n then ε is in FIRST(X).
• If X is ε ➔ FIRST(X)={ε}
• If X is Y1Y2...Yn
  ➔ if a terminal a in FIRST(Yi) and ε is in all FIRST(Yj) for j=1,...,i-1 then a is in FIRST(X).
  ➔ if ε is in all FIRST(Yj) for j=1,...,n then ε is in FIRST(X).

FIRST Example

E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → (E) | id

FIRST(F) = {(,id}
FIRST(T’) = {*, ε}
FIRST(T) = {(,id}
FIRST(E’) = {+, ε}
FIRST(E) = {(,id}

FIRST(TE’) = {(,id}
FIRST(+TE’) = {+}
FIRST(ε) = {ε}
FIRST(FT’) = {(,id}
FIRST(*FT’) = {*}
FIRST(ε) = {ε}
FIRST((E)) = {(}
FIRST(id) = {id}
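The FIRST sets of the non-terminals can be reproduced with a small fixed-point computation in Python. This is a sketch under stated assumptions: the grammar is encoded as a dictionary, E' and T' are written with an ASCII apostrophe, an empty list stands for ε, and the function names are hypothetical.

```python
EPS = "ε"

GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}


def first_of_string(symbols, first):
    """FIRST of a string Y1 Y2 ... Yn, given the current FIRST sets."""
    result = set()
    for Y in symbols:
        fy = first[Y] if Y in first else {Y}   # terminal: FIRST(Y) = {Y}
        result |= fy - {EPS}
        if EPS not in fy:
            return result                      # Y cannot vanish, stop here
    result.add(EPS)                            # every Yi can derive ε
    return result


def compute_first(grammar):
    first = {A: set() for A in grammar}
    changed = True
    while changed:                             # repeat until nothing is added
        changed = False
        for A, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of_string(rhs, first)
                if not new <= first[A]:
                    first[A] |= new
                    changed = True
    return first


for nonterminal, symbols in compute_first(GRAMMAR).items():
    print(f"FIRST({nonterminal}) = {sorted(symbols)}")
```

The same `first_of_string` helper also gives the FIRST set of any right-hand side, for example FIRST(TE’).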

Compute FOLLOW (for non-terminals)
• If S is the start symbol ➔ $ is in FOLLOW(S)
• If A → αBβ is a production rule ➔ everything in FIRST(β) is in FOLLOW(B) except ε
• If ( A → αB is a production rule ) or ( A → αBβ is a production rule and ε is in FIRST(β) ) ➔ everything in FOLLOW(A) is in FOLLOW(B).

We apply these rules until nothing more can be added to any follow set.

FOLLOW Example

E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → (E) | id

FOLLOW(E) = { $, ) }
FOLLOW(E’) = { $, ) }
FOLLOW(T) = { +, ), $ }
FOLLOW(T’) = { +, ), $ }
FOLLOW(F) = { +, *, ), $ }
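The FOLLOW rules can be applied mechanically until nothing changes. The Python sketch below takes the FIRST sets of the non-terminals as given data rather than recomputing them; the dictionary encoding, the ASCII apostrophe in E' and T', and the function names are assumptions of this illustration.

```python
EPS, END = "ε", "$"

GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}

# FIRST sets of the non-terminals, taken as input.
FIRST = {
    "E": {"(", "id"}, "E'": {"+", EPS},
    "T": {"(", "id"}, "T'": {"*", EPS},
    "F": {"(", "id"},
}


def first_of_string(symbols):
    result = set()
    for Y in symbols:
        fy = FIRST.get(Y, {Y})             # terminal: FIRST(Y) = {Y}
        result |= fy - {EPS}
        if EPS not in fy:
            return result
    result.add(EPS)                        # also covers the empty string β
    return result


def compute_follow(grammar, start):
    follow = {A: set() for A in grammar}
    follow[start].add(END)                 # rule 1: $ is in FOLLOW(S)
    changed = True
    while changed:
        changed = False
        for A, alternatives in grammar.items():
            for rhs in alternatives:
                for i, B in enumerate(rhs):
                    if B not in grammar:   # only non-terminals have FOLLOW
                        continue
                    first_beta = first_of_string(rhs[i + 1:])
                    new = first_beta - {EPS}       # rule 2
                    if EPS in first_beta:          # rule 3 (β empty or β ⇒* ε)
                        new |= follow[A]
                    if not new <= follow[B]:
                        follow[B] |= new
                        changed = True
    return follow


for nonterminal, symbols in compute_follow(GRAMMAR, "E").items():
    print(f"FOLLOW({nonterminal}) = {sorted(symbols)}")
```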

Constructing LL(1) Parsing Table -- Algorithm
• for each production rule A → α of a grammar G
– for each terminal a in FIRST(α) ➔ add A → α to M[A,a]
– If ε in FIRST(α) ➔ for each terminal a in FOLLOW(A) add A → α to M[A,a]
– If ε in FIRST(α) and $ in FOLLOW(A) ➔ add A → α to M[A,$]
• All other undefined entries of the parsing table are error entries.

Constructing LL(1) Parsing Table -- Example

E → TE’       FIRST(TE’)={(,id}    ➔ E → TE’ into M[E,(] and M[E,id]
E’ → +TE’     FIRST(+TE’)={+}      ➔ E’ → +TE’ into M[E’,+]
E’ → ε        FIRST(ε)={ε}         ➔ none
              but since ε in FIRST(ε) and FOLLOW(E’)={$,)} ➔ E’ → ε into M[E’,$] and M[E’,)]
T → FT’       FIRST(FT’)={(,id}    ➔ T → FT’ into M[T,(] and M[T,id]
T’ → *FT’     FIRST(*FT’)={*}      ➔ T’ → *FT’ into M[T’,*]
T’ → ε        FIRST(ε)={ε}         ➔ none
              but since ε in FIRST(ε) and FOLLOW(T’)={$,),+} ➔ T’ → ε into M[T’,$], M[T’,)] and M[T’,+]
F → (E)       FIRST((E))={(}       ➔ F → (E) into M[F,(]
F → id        FIRST(id)={id}       ➔ F → id into M[F,id]

LL(1) Parser – Example2

E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → (E) | id

      id          +             *             (           )          $
E     E → TE’                                 E → TE’
E’                E’ → +TE’                               E’ → ε     E’ → ε
T     T → FT’                                 T → FT’
T’                T’ → ε        T’ → *FT’                 T’ → ε     T’ → ε
F     F → id                                  F → (E)
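The table above can be rebuilt from the FIRST and FOLLOW sets with a few lines of Python. This is a sketch, not the presenter's code: the sets are entered as data, each table cell is kept as a list so that multiply-defined entries would be visible, and all names are hypothetical.

```python
EPS = "ε"

PRODUCTIONS = [
    ("E",  ("T", "E'")),
    ("E'", ("+", "T", "E'")),
    ("E'", ()),                      # the empty tuple stands for ε
    ("T",  ("F", "T'")),
    ("T'", ("*", "F", "T'")),
    ("T'", ()),
    ("F",  ("(", "E", ")")),
    ("F",  ("id",)),
]

FIRST = {
    "E": {"(", "id"}, "E'": {"+", EPS},
    "T": {"(", "id"}, "T'": {"*", EPS},
    "F": {"(", "id"},
}
FOLLOW = {
    "E": {"$", ")"}, "E'": {"$", ")"},
    "T": {"+", ")", "$"}, "T'": {"+", ")", "$"},
    "F": {"+", "*", ")", "$"},
}


def first_of_string(symbols):
    result = set()
    for Y in symbols:
        fy = FIRST.get(Y, {Y})       # terminal: FIRST(Y) = {Y}
        result |= fy - {EPS}
        if EPS not in fy:
            return result
    result.add(EPS)
    return result


def build_table(productions):
    table = {}
    for A, rhs in productions:
        first_alpha = first_of_string(rhs)
        lookaheads = first_alpha - {EPS}
        if EPS in first_alpha:
            lookaheads |= FOLLOW[A]  # FOLLOW(A) already contains $ if needed
        for a in lookaheads:
            table.setdefault((A, a), []).append(rhs)
    return table


TABLE = build_table(PRODUCTIONS)
for (A, a), rules in sorted(TABLE.items()):
    print(f"M[{A},{a}] =", [" ".join(r) or EPS for r in rules])
```

Every cell produced for this grammar holds exactly one production, so the table has no multiply-defined entries.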

LL(1) Parser – Example2

stack       input      output
$E          id+id$     E → TE’
$E’T        id+id$     T → FT’
$E’T’F      id+id$     F → id
$E’T’id     id+id$
$E’T’       +id$       T’ → ε
$E’         +id$       E’ → +TE’
$E’T+       +id$
$E’T        id$        T → FT’
$E’T’F      id$        F → id
$E’T’id     id$
$E’T’       $          T’ → ε
$E’         $          E’ → ε
$           $          accept

LL(1) Grammars
• A grammar whose parsing table has no multiply-defined entries is said to be an LL(1) grammar.

LL(1)
– first L: input scanned from left to right
– second L: left-most derivation
– 1: one input symbol used as a look-ahead symbol to determine the parser action

• An entry in the parsing table of a grammar may contain more than one production rule. In this case, we say that it is not an LL(1) grammar.

e } FOLLOW(C) = { t } a S S→a E C C→b two production rules for M[E.A Grammar which is not LL(1) S→iCtSE | E→eS | ε C→b a FOLLOW(S) = { $.e } FOLLOW(E) = { $.e] Problem è ambiguity Subhadra Mishra 23 FIRST(iCtSE) = {i} FIRST(a) = {a} FIRST(eS) = {e} FIRST(ε) = {ε} FIRST(b) = {b} b e E→eS E→ε i S → iCtSE t $ E→ε .

A Grammar which is not LL(1) (cont.)
• What do we have to do if the resulting parsing table contains multiply defined entries?
– If we didn’t eliminate left recursion, eliminate the left recursion in the grammar.
– If the grammar is not left factored, we have to left factor the grammar.
– If its (new grammar’s) parsing table still contains multiply defined entries, that grammar is ambiguous or it is inherently not an LL(1) grammar.
• A left recursive grammar cannot be an LL(1) grammar.
– A → Aα | β
  ➔ any terminal that appears in FIRST(β) also appears in FIRST(Aα) because Aα ⇒ βα.
  ➔ If β is ε, any terminal that appears in FIRST(α) also appears in FIRST(Aα) and FOLLOW(A).
• If a grammar is not left factored, it cannot be an LL(1) grammar.
– A → αβ1 | αβ2
  ➔ any terminal that appears in FIRST(αβ1) also appears in FIRST(αβ2).
• An ambiguous grammar cannot be an LL(1) grammar.

Properties of LL(1) Grammars
• A grammar G is LL(1) if and only if the following conditions hold for any two distinct production rules A → α and A → β
1. Both α and β cannot derive strings starting with the same terminals.
2. At most one of α and β can derive to ε.
3. If β can derive to ε, then α cannot derive to any string starting with a terminal in FOLLOW(A).
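The three conditions can be checked pairwise once FIRST of each alternative and FOLLOW(A) are known. The Python sketch below is one possible way to do this; the function name `ll1_violations` and the input format (a dictionary from each right-hand side of A to its FIRST set) are assumptions of this illustration.

```python
from itertools import combinations

EPS = "ε"


def ll1_violations(alternatives, follow_a):
    """Check all pairs of alternatives of one non-terminal A.

    Returns a list of (condition number, alpha, beta) tuples; an empty
    list means the three conditions hold for A.
    """
    problems = []
    for (alpha, fa), (beta, fb) in combinations(alternatives.items(), 2):
        if (fa - {EPS}) & (fb - {EPS}):            # condition 1
            problems.append((1, alpha, beta))
        if EPS in fa and EPS in fb:                # condition 2
            problems.append((2, alpha, beta))
        # Condition 3 is checked in both directions, since either
        # alternative may be the one that derives ε.
        if EPS in fb and (fa - {EPS}) & follow_a:
            problems.append((3, alpha, beta))
        if EPS in fa and (fb - {EPS}) & follow_a:
            problems.append((3, beta, alpha))
    return problems


# E -> eS | ε with FOLLOW(E) = {$, e}: condition 3 fails.
print(ll1_violations({"eS": {"e"}, EPS: {EPS}}, {"$", "e"}))

# E' -> +TE' | ε with FOLLOW(E') = {$, )}: no violations.
print(ll1_violations({"+TE'": {"+"}, EPS: {EPS}}, {"$", ")"}))
```

A grammar passes the test when this check returns an empty list for every non-terminal.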
