
CHIRADZA LAWINE

H190638E

SYSTEMS PROGRAMMING ASSIGNMENT 2

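1. Grammar (productions numbered as referenced by the reduce entries R1 to R6 in the parsing table):
(1) E → E + T
(2) E → T
(3) T → T * F
(4) T → F
(5) F → (E)
(6) F → id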
Canonical LR(0) Collection

Augmented grammar:
E’ → E
E → E + T
E → T
T → T * F
T → F
F → (E)
F → id

I0:
E’ → .E
E → .E + T
E → .T
T → .T * F
T → .F
F → .(E)
F → .id

Goto(I0, E) = I1:
E’ → E.
E → E. + T

Goto(I0, T) = I2:
E → T.
T → T. * F

Goto(I0, F) = I3:
T → F.

Goto(I0, ( ) = I4:
F → (.E)
E → .E + T
E → .T
T → .T * F
T → .F
F → .(E)
F → .id

Goto(I0, id) = I5:
F → id.

Goto(I1, +) = I6:
E → E + .T
T → .T * F
T → .F
F → .(E)
F → .id

Goto(I2, *) = I7:
T → T * .F
F → .(E)
F → .id

Goto(I4, E) = I8:
F → (E.)
E → E. + T

Goto(I6, T) = I9:
E → E + T.
T → T. * F

Goto(I7, F) = I10:
T → T * F.

Goto(I8, ) ) = I11:
F → (E).
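
The closure and goto steps used to build the item sets above can be sketched in code. This is a minimal illustration, not part of the original answer: the names PRODUCTIONS, closure, goto and canonical_collection are assumptions, and an item is represented as a (production index, dot position) pair. States are numbered in discovery order, which may differ from the I0 to I11 labels used here.

```python
# Productions of the augmented grammar; index 0 is E' -> E.
PRODUCTIONS = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]
NONTERMINALS = {head for head, _ in PRODUCTIONS}


def closure(items):
    """Add X -> .gamma for every nonterminal X that appears right after a dot."""
    result = set(items)
    pending = list(items)
    while pending:
        prod, dot = pending.pop()
        body = PRODUCTIONS[prod][1]
        if dot < len(body) and body[dot] in NONTERMINALS:
            for index, (head, _) in enumerate(PRODUCTIONS):
                if head == body[dot] and (index, 0) not in result:
                    result.add((index, 0))
                    pending.append((index, 0))
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {
        (prod, dot + 1)
        for prod, dot in items
        if dot < len(PRODUCTIONS[prod][1]) and PRODUCTIONS[prod][1][dot] == symbol
    }
    return closure(moved) if moved else frozenset()


def canonical_collection():
    """Return the LR(0) item sets and the goto transitions between them."""
    states = [closure({(0, 0)})]
    transitions = {}
    for state in states:  # the list grows while we iterate over it
        symbols = []
        for prod, dot in sorted(state):
            body = PRODUCTIONS[prod][1]
            if dot < len(body) and body[dot] not in symbols:
                symbols.append(body[dot])
        for symbol in symbols:
            target = goto(state, symbol)
            if target not in states:
                states.append(target)
            transitions[(states.index(state), symbol)] = states.index(target)
    return states, transitions


states, transitions = canonical_collection()
print(len(states), "item sets,", len(transitions), "transitions")
```

For this grammar the sketch should find twelve item sets, matching the collection I0 to I11.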

SLR PARSING TABLE

Sn = shift and go to state n; Rn = reduce by production n; Acc = accept.

STATE   ACTION                        GO TO
        +    *    (    )    id   $    E    T    F
0                 S4        S5        1    2    3
1       S6                       Acc
2       R2   S7        R2        R2
3       R4   R4        R4        R4
4                 S4        S5        8    2    3
5       R6   R6        R6        R6
6                 S4        S5             9    3
7                 S4        S5                  10
8       S6             S11
9       R1   S7        R1        R1
10      R3   R3        R3        R3
11      R5   R5        R5        R5
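
To show how the table drives a parse, here is a minimal sketch of a table-driven shift-reduce loop. The dictionary encoding and the names ACTION, GOTO, reduce_on and parse are assumptions made for illustration. Productions are numbered 1 to 6 in the order the grammar lists them, so state 5 reduces by production 6 (F → id) and state 11 by production 5 (F → (E)).

```python
# Production number -> (head, length of the right-hand side).
PRODUCTIONS = {
    1: ("E", 3),  # E -> E + T
    2: ("E", 1),  # E -> T
    3: ("T", 3),  # T -> T * F
    4: ("T", 1),  # T -> F
    5: ("F", 3),  # F -> ( E )
    6: ("F", 1),  # F -> id
}


def reduce_on(number, symbols="+*)$"):
    """Build the reduce entries of one table row for the given lookaheads."""
    return {symbol: ("r", number) for symbol in symbols}


SHIFTS = {"(": ("s", 4), "id": ("s", 5)}
ACTION = {
    0: SHIFTS,
    1: {"+": ("s", 6), "$": ("acc", 0)},
    2: {**reduce_on(2, "+)$"), "*": ("s", 7)},
    3: reduce_on(4),
    4: SHIFTS,
    5: reduce_on(6),
    6: SHIFTS,
    7: SHIFTS,
    8: {"+": ("s", 6), ")": ("s", 11)},
    9: {**reduce_on(1, "+)$"), "*": ("s", 7)},
    10: reduce_on(3),
    11: reduce_on(5),
}
GOTO = {
    (0, "E"): 1, (0, "T"): 2, (0, "F"): 3,
    (4, "E"): 8, (4, "T"): 2, (4, "F"): 3,
    (6, "T"): 9, (6, "F"): 3,
    (7, "F"): 10,
}


def parse(tokens):
    """Return the production numbers used in reductions, or raise SyntaxError."""
    stack = [0]
    stream = list(tokens) + ["$"]
    position = 0
    reductions = []
    while True:
        kind, value = ACTION[stack[-1]].get(stream[position], ("err", 0))
        if kind == "s":
            stack.append(value)
            position += 1
        elif kind == "r":
            head, length = PRODUCTIONS[value]
            del stack[-length:]  # pop one state per right-hand-side symbol
            stack.append(GOTO[(stack[-1], head)])
            reductions.append(value)
        elif kind == "acc":
            return reductions
        else:
            raise SyntaxError(f"unexpected {stream[position]!r} at token {position}")


print(parse(["id", "+", "id", "*", "id"]))  # [6, 4, 2, 6, 4, 6, 3, 1]
```

The printed list is the sequence of reductions, which is a rightmost derivation of id + id * id in reverse.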

FOLLOW (E) = { $, +, ) }

FOLLOW (T) = {$, +, ), *}


FOLLOW (F) = {$, +, ), *}
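
The FOLLOW sets above can be checked with a short fixed-point computation. This is a sketch under the assumption that the grammar has no ε-productions, which holds for this grammar; the names GRAMMAR, first_sets and follow_sets are illustrative.

```python
GRAMMAR = [
    ("E", ["E", "+", "T"]),
    ("E", ["T"]),
    ("T", ["T", "*", "F"]),
    ("T", ["F"]),
    ("F", ["(", "E", ")"]),
    ("F", ["id"]),
]
NONTERMINALS = {"E", "T", "F"}


def first_sets():
    """FIRST sets by fixed-point iteration (valid because no body is empty)."""
    first = {name: set() for name in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for head, body in GRAMMAR:
            lead = body[0]
            new = first[lead] if lead in NONTERMINALS else {lead}
            if not new <= first[head]:
                first[head] |= new
                changed = True
    return first


def follow_sets(start="E"):
    """FOLLOW sets: $ follows the start symbol, then iterate until stable."""
    first = first_sets()
    follow = {name: set() for name in NONTERMINALS}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for head, body in GRAMMAR:
            for i, symbol in enumerate(body):
                if symbol not in NONTERMINALS:
                    continue
                if i + 1 < len(body):
                    # Whatever can start the next symbol can follow this one.
                    nxt = body[i + 1]
                    new = first[nxt] if nxt in NONTERMINALS else {nxt}
                else:
                    # At the end of the body, FOLLOW(head) carries over.
                    new = follow[head]
                if not new <= follow[symbol]:
                    follow[symbol] |= new
                    changed = True
    return follow


for name, symbols in sorted(follow_sets().items()):
    print(f"FOLLOW({name}) = {sorted(symbols)}")
```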

2. LALR parsing
LALR stands for look-ahead LR. It is a technique for deciding when reductions have to be
made in shift/reduce parsing. Often it can make decisions without using a lookahead; at
times a lookahead of 1 is required. Many widely used parser generators, such as Yacc and
Bison, construct LALR parsers.

The canonical LR (CLR) parser is the most powerful of the LR parsers and handles a large
class of grammars, but its parsing table is quite large compared to other parsing tables.
LALR reduces the size of this table. LALR works similarly to CLR. The only difference is
that it merges the CLR states that have the same core (the same LR(0) items, differing only
in their lookaheads) into one single state.
An LR(1) item has the general form [A → α.β, a],
where A → αβ is a production and a is a terminal or the right end marker $.
LR(1) item = LR(0) item + lookahead
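
The merging step can be sketched as follows. This is only an illustration: the item sets come from the small textbook grammar S → CC, C → cC | d, which is not part of this assignment, and the tuple representation (production, dot position, lookahead) and the names core and merge_by_core are assumptions.

```python
from collections import defaultdict


def core(state):
    """The LR(0) core: the items with their lookaheads dropped."""
    return frozenset((production, dot) for production, dot, _ in state)


def merge_by_core(states):
    """Union the LR(1) item sets that share a core, as LALR construction does."""
    merged = defaultdict(set)
    for state in states:
        merged[core(state)] |= state
    return [frozenset(items) for items in merged.values()]


# Four LR(1) item sets that differ pairwise only in their lookaheads.
lr1_states = [
    frozenset({("C -> d", 1, "c"), ("C -> d", 1, "d")}),
    frozenset({("C -> d", 1, "$")}),
    frozenset({("C -> c C", 2, "c"), ("C -> c C", 2, "d")}),
    frozenset({("C -> c C", 2, "$")}),
]
lalr_states = merge_by_core(lr1_states)
print(len(lr1_states), "LR(1) states ->", len(lalr_states), "LALR states")
```

Each merged state keeps the union of the lookaheads, which is why the LALR table has as many states as the LR(0) collection while still using lookahead to guide reductions.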

LALR Parsers                                  SLR Parsers

Built from the LR(1) collection; item sets    Built from the canonical collection of LR(0)
with the same core are merged into one.       items.

Each item carries one lookahead symbol.       LR(0) items carry no lookahead; the next
                                              input symbol is used only via FOLLOW sets.

Use the item lookaheads to guide reductions.  Use FOLLOW information to guide reductions.

Work on a very large class of grammars.       May fail to produce a table for certain
                                              grammars on which the other methods succeed.

Not every LALR(1) grammar is SLR(1), but      Every SLR(1) grammar is also LALR(1) and
every LALR(1) grammar is LR(1).               LR(1).

Merging LR(1) states cannot introduce a       Shift-reduce or reduce-reduce conflicts may
shift-reduce conflict, but it may introduce   arise in the SLR parsing table.
a reduce-reduce conflict.

Intermediate in power between SLR and         Least powerful of the three.
canonical LR parsers.

3. Regular expression : a*b*c*


4.

ε-NFA δE     a        b        c        ε
→q0          {q0}     Ø        Ø        {q1}
q1           Ø        {q1}     Ø        {q2}
q2           Ø        Ø        {q2}     Ø

ε-closure(q0) = {q0, q1, q2}
ε-closure(q1) = {q1, q2}
ε-closure(q2) = {q2}

δD({q0, q1, q2}, a) = ε-closure(δE({q0, q1, q2}, a))
                    = ε-closure(δE(q0, a) ∪ δE(q1, a) ∪ δE(q2, a))
                    = ε-closure({q0} ∪ Ø ∪ Ø)
                    = ε-closure(q0)
                    = {q0, q1, q2}
DFA δD            a                b           c
→{q0, q1, q2}     {q0, q1, q2}     {q1, q2}    {q2}
{q1, q2}          Ø                {q1, q2}    {q2}
{q2}              Ø                Ø           {q2}
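
The conversion above follows the subset construction, which can be sketched in code. The dictionary encoding of δE and the names NFA, epsilon_closure, move and subset_construction are assumptions for illustration; the key "eps" stands for ε, and an empty frozenset plays the role of Ø.

```python
EPS = "eps"  # stands for the ε column of the transition table
NFA = {
    "q0": {"a": {"q0"}, EPS: {"q1"}},
    "q1": {"b": {"q1"}, EPS: {"q2"}},
    "q2": {"c": {"q2"}},
}
ALPHABET = "abc"


def epsilon_closure(states):
    """All states reachable from `states` using only ε-moves."""
    closure = set(states)
    stack = list(states)
    while stack:
        state = stack.pop()
        for target in NFA[state].get(EPS, set()):
            if target not in closure:
                closure.add(target)
                stack.append(target)
    return frozenset(closure)


def move(states, symbol):
    """Union of δE(q, symbol) over every q in `states`."""
    targets = set()
    for state in states:
        targets |= NFA[state].get(symbol, set())
    return targets


def subset_construction(start="q0"):
    """Build δD, starting from the ε-closure of the NFA start state."""
    start_state = epsilon_closure({start})
    table = {}
    worklist = [start_state]
    while worklist:
        current = worklist.pop()
        if current in table:
            continue
        table[current] = {}
        for symbol in ALPHABET:
            target = epsilon_closure(move(current, symbol))
            table[current][symbol] = target
            if target and target not in table:  # Ø is not added as a state
                worklist.append(target)
    return start_state, table


start_state, dfa = subset_construction()
for state, row in dfa.items():
    print(sorted(state), {symbol: sorted(target) for symbol, target in row.items()})
```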

• A token is a sequence of characters treated as a single unit because it cannot be
broken down further. In a programming language such as C, keywords (int, char, float,
const, goto, continue, etc.), identifiers (user-defined names), operators (+, -, *, /),
delimiters/punctuators such as the comma (,), semicolon (;), and braces ({ }), and
strings can all be considered tokens.

The lexical analysis phase recognizes three types of tokens: Terminal Symbols (TRM),
which are keywords and operators; Literals (LIT); and Identifiers (IDN).

• A lexeme is a sequence of characters in the source code that matches the pattern of
a token, so the scanner accepts it as a valid instance of that token. For example:

main is a lexeme of token type identifier

(, ), {, } are lexemes of token type punctuation


• A pattern is the rule a scanner follows to recognize a token. For a keyword, the
pattern is the exact sequence of characters that makes up the keyword. For an
identifier, the pattern is the predefined rule that it must start with a letter,
followed by letters or digits.
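
The relationship between patterns, lexemes and tokens can be seen in a small scanner sketch. It is an illustration only: the token names (KEYWORD, IDENTIFIER, LITERAL, OPERATOR, PUNCTUATOR), the keyword list and the function tokenize are assumptions, and the patterns cover just the examples mentioned above, not the full C language.

```python
import re

KEYWORDS = {"int", "char", "float", "const", "goto", "continue"}

# Each pattern describes the set of lexemes that belong to one token type.
TOKEN_PATTERNS = [
    ("LITERAL", r"\d+"),
    ("IDENTIFIER", r"[A-Za-z][A-Za-z0-9]*"),  # a letter, then letters or digits
    ("OPERATOR", r"[+\-*/]"),
    ("PUNCTUATOR", r"[,;(){}]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_PATTERNS)
)


def tokenize(source):
    """Return (token type, lexeme) pairs; raise ValueError if nothing matches."""
    tokens = []
    position = 0
    while position < len(source):
        match = MASTER.match(source, position)
        if match is None:
            raise ValueError(f"no pattern matches at position {position}")
        kind, lexeme = match.lastgroup, match.group()
        position = match.end()
        if kind == "SKIP":
            continue  # whitespace separates lexemes but is not a token
        if kind == "IDENTIFIER" and lexeme in KEYWORDS:
            kind = "KEYWORD"  # keywords match the identifier pattern first
        tokens.append((kind, lexeme))
    return tokens


print(tokenize("int main() { }"))
```

Here main is reported as an IDENTIFIER lexeme and the parentheses and braces as PUNCTUATOR lexemes, in line with the examples given above.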
