H190638E
1. E → E + T
E → T
T → T * F
T → F
F → ( E )
F → id
Canonical LR(0) Collection

Augmented grammar:
E' → E
E → E + T
E → T
T → T * F
T → F
F → ( E )
F → id

I0:
E' → .E
E → .E + T
E → .T
T → .T * F
T → .F
F → .( E )
F → .id

Goto(I0, E) = I1:
E' → E.
E → E. + T
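The closure and goto operations that produce I0 and I1 can be sketched in Python as follows. This is a minimal illustration, not the method used in the original working: the item encoding (production index, dot position) and the function names closure, goto and canonical_collection are assumptions made for the example.

```python
# Augmented grammar; production 0 is E' -> E.
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}


def closure(items):
    """LR(0) closure of a set of (production index, dot position) items."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot = work.pop()
        body = GRAMMAR[prod][1]
        # A dot in front of a nonterminal pulls in that nonterminal's productions.
        if dot < len(body) and body[dot] in NONTERMINALS:
            for i, (head, _) in enumerate(GRAMMAR):
                if head == body[dot] and (i, 0) not in result:
                    result.add((i, 0))
                    work.append((i, 0))
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(p, d + 1) for p, d in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)


def canonical_collection():
    """All LR(0) item sets reachable from the start item, in discovery order."""
    states = [closure({(0, 0)})]
    symbols = sorted({s for _, body in GRAMMAR for s in body})
    for state in states:  # the list grows while it is being scanned
        for sym in symbols:
            nxt = goto(state, sym)
            if nxt and nxt not in states:
                states.append(nxt)
    return states


I0 = closure({(0, 0)})
I1 = goto(I0, "E")
```

Here I0 contains the seven items listed for I0 and I1 contains E' → E. and E → E. + T. The sets returned by canonical_collection() are numbered in discovery order, which need not match the state numbers used in the parsing table.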
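SLR parsing table (Sn = shift and go to state n, Rn = reduce by production n, Acc = accept).
Productions are numbered in the order listed in the grammar: 1. E → E + T, 2. E → T,
3. T → T * F, 4. T → F, 5. F → ( E ), 6. F → id.

       | ACTION                        | GO TO
STATE  | +    *    (    )    id   $    | E   T   F
0      |           S4        S5        | 1   2   3
1      | S6                       Acc  |
2      | R2   S7        R2        R2   |
3      | R4   R4        R4        R4   |
4      |           S4        S5        | 8   2   3
5      | R6   R6        R6        R6   |
6      |           S4        S5        |     9   3
7      |           S4        S5        |         10
8      | S6             S11            |
9      | R1   S7        R1        R1   |
10     | R3   R3        R3        R3   |
11     | R5   R5        R5        R5   |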
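A table-driven shift/reduce loop that consumes this kind of table might look like the sketch below. The dictionaries ACTION and GOTO and the function parse are names invented for the example. State 5 is entered as a reduction by production 6 (F → id) on all four lookaheads, since the only item in that state is F → id. and production 5 is F → ( E ).

```python
# Productions 1..6 as (head, length of body); only these two facts are
# needed to perform a reduction.
PRODUCTIONS = {
    1: ("E", 3),  # E -> E + T
    2: ("E", 1),  # E -> T
    3: ("T", 3),  # T -> T * F
    4: ("T", 1),  # T -> F
    5: ("F", 3),  # F -> ( E )
    6: ("F", 1),  # F -> id
}

ACTION = {
    0: {"(": "s4", "id": "s5"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"(": "s4", "id": "s5"},
    5: {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    6: {"(": "s4", "id": "s5"},
    7: {"(": "s4", "id": "s5"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    11: {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
}

GOTO = {
    0: {"E": 1, "T": 2, "F": 3},
    4: {"E": 8, "T": 2, "F": 3},
    6: {"T": 9, "F": 3},
    7: {"F": 10},
}


def parse(tokens):
    """Return the production numbers in reduction order, or None on error."""
    stack = [0]                      # stack of states
    tokens = list(tokens) + ["$"]    # append the end marker
    pos = 0
    reductions = []
    while True:
        action = ACTION[stack[-1]].get(tokens[pos])
        if action is None:           # empty table cell: syntax error
            return None
        if action == "acc":
            return reductions
        number = int(action[1:])
        if action[0] == "s":         # shift: push the state, consume the token
            stack.append(number)
            pos += 1
        else:                        # reduce: pop the body, push GOTO on the head
            head, length = PRODUCTIONS[number]
            del stack[len(stack) - length:]
            stack.append(GOTO[stack[-1]][head])
            reductions.append(number)
```

For the input id * id + id, parse returns the rightmost derivation in reverse: 6, 4, 6, 3, 2, 6, 4, 1. An input such as id + runs into an empty cell and returns None.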
FOLLOW (E) = { $, +, ) }
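The FOLLOW set can be checked mechanically with a fixed-point computation. The sketch below assumes a grammar without ε-productions, which holds for this grammar, so FIRST of a string is simply FIRST of its first symbol. The function names first_sets and follow_sets are chosen for the example.

```python
GRAMMAR = [
    ("E", ["E", "+", "T"]),
    ("E", ["T"]),
    ("T", ["T", "*", "F"]),
    ("T", ["F"]),
    ("F", ["(", "E", ")"]),
    ("F", ["id"]),
]
NONTERMINALS = {"E", "T", "F"}


def first_sets():
    """FIRST for every nonterminal (assumes no epsilon-productions)."""
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for head, body in GRAMMAR:
            x = body[0]
            new = first[x] if x in NONTERMINALS else {x}
            if not new <= first[head]:
                first[head] |= new
                changed = True
    return first


def follow_sets(start="E"):
    """FOLLOW for every nonterminal, with $ added to the start symbol."""
    first = first_sets()
    follow = {n: set() for n in NONTERMINALS}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for head, body in GRAMMAR:
            for i, sym in enumerate(body):
                if sym not in NONTERMINALS:
                    continue
                if i + 1 < len(body):
                    # A -> ... B x ...: FIRST(x) goes into FOLLOW(B).
                    nxt = body[i + 1]
                    new = first[nxt] if nxt in NONTERMINALS else {nxt}
                else:
                    # A -> ... B: FOLLOW(A) goes into FOLLOW(B).
                    new = follow[head]
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow


FOLLOW = follow_sets()
```

FOLLOW["E"] comes out as { $, +, ) }, and FOLLOW["T"] and FOLLOW["F"] additionally contain *, which is why states 3, 5, 10 and 11 reduce on four lookaheads while states 2 and 9 reduce on three.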
2. LALR parsing
LALR stands for look-ahead LR. It is a technique for deciding when reductions have to be made
in shift/reduce parsing. Often it can make the decision without consulting the lookahead symbol;
at times a lookahead of one token is required. Many parser generators, such as Yacc and Bison,
construct LALR parsers.
The canonical LR (CLR) parser is the most powerful of the LR parsers and handles a large class
of grammars, but its parsing table is quite large compared to other parsing tables. LALR reduces
the size of this table. LALR works similarly to CLR; the only difference is that it merges CLR
states that contain the same LR(0) items and differ only in their lookaheads into a single state.
An LR(1) item has the general form [A → α.β, a],
where A → αβ is a production and a is a terminal or the right end marker $.
LR(1) item = LR(0) item + lookahead
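The merging step can be illustrated on the expression grammar E → E + T | T, T → T * F | F, F → ( E ) | id. The sketch below builds the LR(1) item sets and then merges those that share a core, meaning the same LR(0) items once lookaheads are dropped. It assumes a grammar without ε-productions, and the item encoding (production index, dot position, lookahead) and all function names are illustrative choices. A production LALR generator would normally avoid building the full CLR collection first.

```python
from collections import defaultdict

GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}


def compute_first():
    """FIRST for every nonterminal (assumes no epsilon-productions)."""
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for head, body in GRAMMAR:
            new = first.get(body[0], {body[0]})
            if not new <= first[head]:
                first[head] |= new
                changed = True
    return first


FIRST = compute_first()


def closure(items):
    """LR(1) closure of a set of (production, dot, lookahead) items."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot, look = work.pop()
        body = GRAMMAR[prod][1]
        if dot == len(body) or body[dot] not in NONTERMINALS:
            continue
        # For [A -> alpha . B beta, a] the new lookaheads are FIRST(beta a).
        rest = body[dot + 1:]
        looks = FIRST.get(rest[0], {rest[0]}) if rest else {look}
        for i, (head, _) in enumerate(GRAMMAR):
            if head == body[dot]:
                for la in looks:
                    if (i, 0, la) not in result:
                        result.add((i, 0, la))
                        work.append((i, 0, la))
    return frozenset(result)


def goto(items, symbol):
    return closure({(p, d + 1, a) for p, d, a in items
                    if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol})


def lr1_collection():
    states = [closure({(0, 0, "$")})]
    symbols = sorted({s for _, body in GRAMMAR for s in body})
    for state in states:  # the list grows while it is being scanned
        for sym in symbols:
            nxt = goto(state, sym)
            if nxt and nxt not in states:
                states.append(nxt)
    return states


def core(state):
    """The LR(0) part of a state: its items with lookaheads removed."""
    return frozenset((p, d) for p, d, _ in state)


def merge_by_core(states):
    """LALR merge: union all LR(1) states that share a core."""
    merged = defaultdict(set)
    for state in states:
        merged[core(state)] |= state
    return [frozenset(items) for items in merged.values()]


clr_states = lr1_collection()
lalr_states = merge_by_core(clr_states)
```

For this grammar the CLR collection has 22 states and the merged collection has 12, the same number as the LR(0) collection, which shows where the table size reduction comes from.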
ε-NFA transition table:
δE      a       b       c       ε
→q0     {q0}    Ø       Ø       {q1}
q1      Ø       {q1}    Ø       {q2}
q2      Ø       Ø       {q2}    Ø
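The ε-closure of each state, which is the first step in simulating this automaton or converting it to a DFA, can be computed as in the sketch below. The transition table is transcribed into a dictionary. The helper names eps_closure, move and reachable are assumptions for the example, and no accepting state is encoded because the table does not mark one.

```python
EPS = "ε"

# Transition function; missing entries stand for the empty set Ø.
DELTA = {
    "q0": {"a": {"q0"}, EPS: {"q1"}},
    "q1": {"b": {"q1"}, EPS: {"q2"}},
    "q2": {"c": {"q2"}},
}


def eps_closure(states):
    """All states reachable from `states` using only ε-moves."""
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for r in DELTA[q].get(EPS, set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


def move(states, symbol):
    """States reachable from `states` by one transition on `symbol`."""
    out = set()
    for q in states:
        out |= DELTA[q].get(symbol, set())
    return out


def reachable(word, start="q0"):
    """Set of states the ε-NFA can be in after reading `word`."""
    current = eps_closure({start})
    for ch in word:
        current = eps_closure(move(current, ch))
    return current
```

With this table, eps_closure({"q0"}) is {q0, q1, q2} and eps_closure({"q1"}) is {q1, q2}. Reading ab leaves the automaton in {q1, q2}, while reading ba leaves it with no live state.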
The lexical analysis phase recognizes three types of tokens: Terminal Symbols (TRM), that is,
keywords and operators; Literals (LIT); and Identifiers (IDN).
Lexeme: a sequence of characters in the source code that matches the predefined language rule
(pattern) for a token. Every lexeme must match such a rule to be accepted as a valid token.
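A small scanner that splits source text into lexemes and labels each one TRM, LIT or IDN could be sketched as follows. The keyword set, the operator list and the restriction of literals to numbers are hypothetical choices for the example and are not taken from any particular language.

```python
import re

KEYWORDS = {"if", "else", "while", "return"}  # hypothetical keyword set

TOKEN_RE = re.compile(r"""
    (?P<LIT>\d+(?:\.\d+)?)                 # numeric literal
  | (?P<WORD>[A-Za-z_]\w*)                 # keyword or identifier
  | (?P<OP>==|!=|<=|>=|[-+*/=<>(){};])     # operators and punctuation
  | (?P<SKIP>\s+)                          # whitespace, discarded
""", re.VERBOSE)


def tokenize(source):
    """Return a list of (token type, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise ValueError(f"unexpected character {source[pos]!r} at {pos}")
        pos = m.end()
        kind, lexeme = m.lastgroup, m.group()
        if kind == "SKIP":
            continue
        if kind == "WORD":
            # Keywords are terminal symbols; any other word is an identifier.
            kind = "TRM" if lexeme in KEYWORDS else "IDN"
        elif kind == "OP":
            kind = "TRM"
        tokens.append((kind, lexeme))
    return tokens
```

For example, tokenize("if x1 >= 42") yields the lexemes if, x1, >= and 42, labelled TRM, IDN, TRM and LIT respectively. A character that matches no rule raises ValueError.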