This action might not be possible to undo. Are you sure you want to continue?
https://www.scribd.com/doc/44520706/lec8
01/12/2015
text
original
Topdown parsers start at the root of derivation tree and ll in choosing production for nonterminal is the key problem LL (predictive) parsers use lookahead to choose production Bottomup parsers start at leaves and ll in choosing righthand side (rhs) of production is the key problem LR parsers use current stack and lookahead to choose production LR(k) parsers are more powerful than LL(k) parsers because they can see the entire rhs before choosing a production.
CPSC 434
Lecture 8, Page 1
Some de nitions
For a grammar G, with start symbol S , any string such that S is called a sentential form.
)
If contains only terminal symbols, is a sentence in L(G). If contains one or more nonterminals, it is just a sentential form (not a sentence in L(G)). A leftsentential form is a sentential form that occurs in the leftmost derivation of some sentence. A rightsentential form is a sentential form that occurs in the rightmost derivation of some sentence.
CPSC 434
Lecture 8, Page 2
Bottomup parsing
Goal:
Given an input string w and a grammar G, construct a parse tree by starting at the leaves and working to the root. The parser repeatedly matches the righthand side rhs of a production against a substring in the current rightsentential form. At each match, it applies a reduction to build the parse tree. each reduction replaces the matched substring with the nonterminal on the lefthand side lhs of the production each reduction adds an internal node to the current parse tree the result is another rightsentential form The nal result is a rightmost derivation, in reverse.
CPSC 434
Lecture 8, Page 3
Example
Consider the grammar 1 goal ::= a A B 2 A ::= A b c b 3 4 B ::= d
h h i h i h i h i j h i i
e
and the input string abbcde. Prod'n. Sentential Form  abbcde 3 a A bcde 2 a A de 4 aA Be goal 1
h h i i h ih i h i
Handle 3,2 2,4 4,3 1,4 
The trick appears to be scanning the input and nding valid sentential forms.
CPSC 434
Lecture 8, Page 4
Handles
We trying to nd a substring of the current rightsentential form where: matches some production A ::= reducing to A is one step in the reverse of a rightmost derivation. We will call such a string a handle.
Formally,
a handle of a rightsentential form is a production A ::= and a position in where may be found. If (A ::= ;k) is a handle, then replacing the in at position k with A produces the previous rightsentential form in a rightmost derivation of . Because is a rightsentential form, the substring to the right of a handle contains only terminal symbols.
CPSC 434
Lecture 8, Page 5
Handles
Provable fact:
If G is unambiguous, then every rightsentential form has a unique handle.
Proof: (by de nition)
1. G is unambiguous rightmost derivation is unique. 2. a unique production A ::= applied to take ?1 to 3. a unique position k at which A ::= is applied 4. a handle (A ::= ;k)
) )
i i
)
)
CPSC 434
Lecture 8, Page 6
Example
The leftrecursive expression grammar (original form, before left factoring) 1 2 3 4 5 6 7 8 9 Prod'n.  1 3 5 9 7 8 4 7 9
CPSC 434
h h
goal expr
h
h
::= expr ::= expr + term expr  term term term ::= term * factor term / factor factor factor ::= num
i h i i h i h i j j h h i h i i i h i h j j h h i h i i j
i i
id
Sentential Form goal expr expr  term expr  term * factor expr  term * id expr  factor * id expr  num * id term  num * id factor  num * id
h h h h h h h h h i i i i i i i h h h h i i i h i i i
i
id  num * id
Handle  1,1 3,3 5,5 9,5 7,3 8,3 4,1 7,1 9,1
Lecture 8, Page 7
Handlepruning
The process we use to construct a bottomup parse is called handlepruning.
To construct a rightmost derivation S= 0 1 2 ?1
) ) ) )
n
)
n
= w;
we set i to n and apply the following simple algorithm
do i = to 1 by 1 find the handle replace i with
(1) (2)
n
A
(A ::= ; k )
i i i i
in to generate
i i
?1
This takes 2n steps, where n is the length of the derivation
CPSC 434
Lecture 8, Page 8
Shiftreduce parsing
One scheme to implement a handlepruning, bottomup parser is called a shiftreduce parser. Shiftreduce parsers use a stack and an input bu er 1. initialize stack with $ 2. Repeat until the top of the stack is the goal symbol and the input token is \end of le" a) nd the handle if we don't have a handle on top of the stack, shift an input symbol onto the stack b) prune the handle if we have a handle (A ::= ; k) on the stack, reduce i) pop symbols o the stack ii) push A onto the stack
j j
CPSC 434
Lecture 8, Page 9
Back to \x
$ $id $ factor $ term $ expr $ expr $ expr $ expr $ expr $ expr $ expr $ expr $ expr $ expr $ goal
h h h i i i h h i i h h i i h h i i h h i i h i h i
 2 * y
"
Input
id num num num num num num * * * * * * * * * id id id id id id id id id id
Stack
Handle Action
num
h h h h
h h
factor term term * term * id term * factor term
i i i i i i h
i
none 9,1 7,1 4,1 none none 8,3 7,3 none none 9,5 5,5 3,3 1,1 none
shift reduce 9 reduce 7 reduce 4 shift shift reduce 8 reduce 7 shift shift reduce 9 reduce 5 reduce 3 reduce 1 accept
1. Shift until top of stack is the right end of a handle 2. Find the left end of the handle and reduce 5 shifts + 9 reduces + 1 accept
CPSC 434
Lecture 8, Page 10
Shiftreduce parsing
Shiftreduce parsers are simple to understand have a simple, tabledriven, shiftreduce skeleton encode grammatical knowledge in tables A shiftreduce parser has just four canonical actions: 1. shift  next input symbol is shifted onto the top of the stack 2. reduce  right end of handle is on top of stack; locate left end of handle within the stack; pop handle o stack and push appropriate nonterminal lhs 3. accept  terminate parsing and signal success 4. error  call an error recovery routine
CPSC 434
Lecture 8, Page 11
LR(1) grammars
Informally, we say that a grammar G is LR(1) if,
given a rightmost derivation S = 0 1 2
) )
)
)
n
= w;
we can, for each rightsentential form in the derivation, 1. isolate the handle of each rightsentential form, and 2. determine the production by which to reduce by scanning from left to right, going at most 1 symbol beyond the right end of the handle of .
i i
Formality will come later.
CPSC 434
Lecture 8, Page 12
Why study LR(1) grammars?
LR(1) grammars are often used to construct shiftreduce parsers. We call these parsers LR(1) parsers. everyone's favorite parser (EFP) virtually all contextfree programming language constructs can be expressed in an LR(1) form LR grammars are the most general grammars that can be parsed by a nonbacktracking, shiftreduce parser e cient shiftreduce parsers can be implemented for LR(1) grammars LR parsers detect an error as soon as possible in a lefttoright scan of the input LR grammars describe a proper superset of the languages recognized by LL (predictive) parsers
CPSC 434
Lecture 8, Page 13
Tabledriven LR(1) parsing
A tabledriven LR(1) parser looks like stack
source code
scanner

tabledriven parser
6
6 ?
 il
action & goto tables Stack two items per state: state and symbol Table building tools are readily available (yacc) We'll learn how to build these tables by hand!
CPSC 434
Lecture 8, Page 14
?
LR(1)
parsing
The skeleton parser:
token = next token() repeat forever s = top of stack if action s,token] = "shift push token push i token = next token()
s "'
i
then
s
else if action s,token] = "reduce " then pop j j symbols s = top of stack push push goto s, ]
2
A ::= A
A
else if action s, token] = "accept " then return else error()
This takes k shifts, l reduces, and 1 accept, where k is the length of the input string and l is the length of the reverse rightmost derivation.
Note: Equivalent to Figure 4.30, Aho, Sethi, and Ullman
CPSC 434
Lecture 8, Page 15
Example tables
id s4     s4 s4   +   s5 r5 r6    r4
ACTION
S0 S1 S2 S3 S4 S5 S6 S7 S8
*    s6 r6    
$ <expr> <term> <factor>  1 2 3 acc    r3    r5    r6     7 2 3   8 3 r2    r4   
The Grammar
j
GOTO
1 2 3 4 5 6
CPSC 434
<goal> ::= <expr> <expr> ::= <term> + <expr> <term> <term> ::= <factor> * <term> <factor> <factor> ::= id
j
Lecture 8, Page 16
This action might not be possible to undo. Are you sure you want to continue?