Professional Documents
Culture Documents
412
FALL 2017
Parsing II
Top-down parsing
Comp 412
source IR … IR target
code code
Front End OpMmizer Back End
Copyright 2017, Keith D. Cooper & Linda Torczon, all rights reserved.
Students enrolled in Comp 412 at Rice University have explicit permission to make copies of these
materials for their personal use.
Faculty from other educaMonal insMtuMons may use these materials for nonprofit educaMonal purposes,
provided this copyright noMce is preserved.
Chapter 3 in EaC2e
Ambiguity Review
Defini9ons
• A context-free grammar G is ambiguous if there exists has more than
one leTmost derivaMon for some sentence in L(G)
• A context-free grammar G is ambiguous if there exists has more than
one rightmost derivaMon for some sentence in L(G)
• The leTmost and rightmost derivaMons for a sentenMal form may differ,
even in an unambiguous grammar
– However, they must have the same parse tree
LeHmost deriva9on
<id,x>
Evaluates (x - 2) * y
Elimina9ng ambiguity does not necessarily produce the desired meaning. It
produces a consistent meaning, but that meaning can be consistently wrong.
COMP 412, Fall 2017 6
Order of OperaMons
How do you add precedence to a grammar?
To add precedence
• Decide how many levels of precedence the grammar needs
• Create a nonterminal for each level of precedence
• Isolate the corresponding part of the grammar
• Force the parser to recognize high precedence subexpressions first
6 Factor — Term
T T * F
9 <id,x> — Term
4 <id,x> — Term * Factor F F <id,y>
6 <id,x> — Factor * Factor
<id,x> — <num,2> * Factor <id,x> <num,2>
8
9 <id,x> — <num,2> * <id,y>
The le9most deriva*on Parse tree for x – 2 * y
The classic expression grammar derives x — ( 2 * y ) with the parse tree shown..
Both the leTmost and rightmost derivaMons give the same parse tree and value,
because the grammar directly and explicitly encodes the desired precedence.
COMP 412, Fall 2017 9
DerivaMons and Precedence
Rule Senten*al Form
Goal G
—
0 Expr
E
2 Expr — Term
4 Expr — Term * Factor E — T
Borom-up parsers can recognize a larger class of grammars than can top-down parsers. 11
COMP 412, Fall 2017
Top-Down Parsing
We will examine two ways of implemen9ng top-down parsers
Recursive-Descent Parser
• Highly efficient, highly flexible form of parser
• Typically implemented as a hand-coded parser
– Set of mutually-recursive rouMnes
– Works well for any “backtrack free” or “predicMvely parsable” grammar
• Easy to understand, easy to implement
Knowledge encoded in
tables to drive skeleton
0 Goal ➝ Expr
1 Expr ➝ Expr + Term
2 | Expr - Term
3 | Term
4 Term ➝ Term * Factor And the input x – 2 * y
5 | Term / Factor
6 | Factor
7 Factor ➝ ( Expr )
8 | number
9 | id
1 Expr + Term ↑x - 2 * y
Term
3 Term + Term ↑x - 2 * y
6 Factor + Term ↑x - 2 * y Fact.
9 <id,x> + Term ↑x - 2 * y
<id,x>
→ <id,x> + Term x ↑- 2 * y
2 Expr -Term ↑x - 2 * y
Term
3 Term -Term ↑x - 2 * y
6 Factor -Term ↑x - 2 * y Fact.
9 <id,x> - Term ↑x - 2 * y
<id,x>
→ <id,x> -Term x ↑- 2 * y
→ <id,x> -Term x - ↑2 * y
Now, “-” and “-” match Now we can expand Term to match “2”
8 <id,x> - <num,2> x - ↑2 * y
Term Fact.
→ <id,x> - <num,2> x - 2 ↑* y
Fact. <num,2>
Where are we?
• “2” matches “2” <id,x>
• We have more input, but no NTs leT to expand
• The expansion terminated too soon
⇒ Need to backtrack
The Point:
This Mme, we matched & consumed all the input
⇒ For efficiency, the parser must make the correct choice when it expands a
Success! NT.
Wrong choices lead to wasted effort.
Formally,
A grammar is leH recursive if ∃ A ∈ NT such that a derivaMon
A ⇒+ Aα exists, for some string α ∈ (NT ∪ T ) +
Added a reference to
the empty string
New Idea: the ε producMon
COMP 412, Fall 2017 25
EliminaMng LeT Recursion
The expression grammar contains two cases of leH recursion
Expr → Expr + Term Term → Term * Factor
| Expr - Term | Term * Factor
| Term | Factor
0 Goal → Expr
Right-recursive expression grammar
1 Expr → Term Expr’
• This grammar is correct, if
2 Expr’ → + Term Expr’ somewhat counter-intuiMve.
3 | - Term Expr’ • A top-down parser will terminate
4 | ε using it.
5 Term → Factor Term’ • A top-down parser may need to
6 Term’ → * Factor Term’ backtrack with it.
7 | / Factor Term’ • It is leT associaMve, as was the
8 | ε original
9 Factor → ( Expr ) ⇒ Why didn’t we just rewrite it so Expr
was at the right end of the RHS,
10 | number
rather than the leT end?
11 | id
0 Goal → Expr
Right-recursive expression grammar
1 Expr → Term Expr’
• This grammar is correct, if
2 Expr’ → + Term Expr’ somewhat counter-intuiMve.
NOTE: This technique eliminates direct
3 | - Term Expr’ • leT recursion — when a producMon’s
A top-down parser will terminate
4 | ε using it.
RHS begins with its own LHS.
5 Term → Factor Term’ • It does not address indirect leT
A top-down parser may need to
6 Term’ → * Factor Term’ recursion. We will get there, in a
backtrack with it.
couple of slides …
7 | / Factor Term’ • It is leT associaMve, as was the
8 | ε original
9 Factor → ( Expr ) ⇒ Why didn’t we just rewrite it so Expr
was at the right end of the RHS,
10 | number
rather than the leT end?
11 | id
+ +
+ z w +
+ y x +
w x y z
ASTs for w + x + y + z
COMP 412, Fall 2017 30
Parsing with the RR CEG
Text of slide
Rule Senten*al Form Expr
— Goal
Term Expr’
0 Expr
1 Term Expr’ minus Term Expr’
5 Factor Term’ Expr’
11 <id,x> Term’ Expr’
Factor Term’ ε
8 <id,x> Expr’
3 <id,x> - Term Expr’ <num,2>
5 <id,x> - Factor Term’ Expr’
10 <id,x> - <num,2> Term’ Expr’ Mmes Factor Term’
6 <id,x> - <num,2> * Factor Term’ Expr’
11 <id,x> - <num,2> * <id,y> Term’ Expr’ <id,y> ε
8 <id,x> - <num,2> * <id,y> Expr’
Factor Term’
4 <id,x> - <num,2> * <id,y>
<id,x> ε
COMP 412, Fall 2017 32