Chiradza Lawine H190638e Assignment 2

CHIRADZA LAWINE
H190638E
SYSTEMS PROGRAMMING ASSIGNMENT 2
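1. Grammar (productions numbered in the order used by the reduce entries R1 to R6 of the SLR table):
(1) E → E + T
(2) E → T
(3) T → T * F
(4) T → F
(5) F → (E)
(6) F → id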
Canonical LR(0) Collection

Augmented Grammar
E’ → E
E → E + T
E → T
T → T * F
T → F
F → (E)
F → id

I0
E’ → .E
E → .E + T
E → .T
T → .T * F
T → .F
F → .(E)
F → .id

Go to (I0, E) = I1
E’ → E.
E → E. + T

Go to (I0, T) = I2
E → T.
T → T. * F

Go to (I0, F) = I3
T → F.

Go to (I0, ( ) = I4
F → (.E)
E → .E + T
E → .T
T → .T * F
T → .F
F → .(E)
F → .id

Go to (I0, id) = I5
F → id.

Go to (I1, +) = I6
E → E + .T
T → .T * F
T → .F
F → .(E)
F → .id

Go to (I2, *) = I7
T → T * .F
F → .(E)
F → .id

Go to (I4, E) = I8
F → (E.)
E → E. + T

Go to (I6, T) = I9
E → E + T.
T → T. * F

Go to (I7, F) = I10
T → T * F.

Go to (I8, ) ) = I11
F → (E).
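
The closure and goto steps used to build the collection above can be sketched in Python. This is a minimal illustration, not part of the original answer: the item encoding (production index, dot position) and the function names are assumptions made for the example, and the order in which states are discovered may differ from the I0 to I11 numbering used here.

```python
# Augmented grammar; production 0 is the added start rule E' -> E.
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}


def closure(items):
    """Add A -> .gamma for every non-terminal A found right after a dot."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot = work.pop()
        body = GRAMMAR[prod][1]
        if dot < len(body) and body[dot] in NONTERMINALS:
            for i, (head, _) in enumerate(GRAMMAR):
                if head == body[dot] and (i, 0) not in result:
                    result.add((i, 0))
                    work.append((i, 0))
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over `symbol` wherever possible, then take the closure."""
    moved = {(p, d + 1) for p, d in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)


def canonical_collection():
    """Collect every item set reachable from closure({E' -> .E})."""
    symbols = sorted({s for _, body in GRAMMAR for s in body})
    states = [closure({(0, 0)})]
    transitions = {}
    for state in states:  # the list grows while it is being iterated
        for sym in symbols:
            target = goto(state, sym)
            if not target:
                continue
            if target not in states:
                states.append(target)
            transitions[(states.index(state), sym)] = states.index(target)
    return states, transitions


states, transitions = canonical_collection()
print(len(states))  # prints 12, matching I0 to I11
```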
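SLR PARSING TABLE
(Sn = shift and go to state n; Rn = reduce by production n, numbered in the order the productions are listed in question 1; Acc = accept)

STATE | ACTION                         | GO TO
      | +    *    (    )    id   $     | E    T    F
0     |           S4        S5         | 1    2    3
1     | S6                       Acc   |
2     | R2   S7        R2        R2    |
3     | R4   R4        R4        R4    |
4     |           S4        S5         | 8    2    3
5     | R6   R6        R6        R6    |
6     |           S4        S5         |      9    3
7     |           S4        S5         |           10
8     | S6             S11             |
9     | R1   S7        R1        R1    |
10    | R3   R3        R3        R3    |
11    | R5   R5        R5        R5    |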
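
To show how such a table drives a parse, here is a minimal shift-reduce driver in Python. It is a sketch under stated assumptions: productions are numbered 1 to 6 in the order they are listed in question 1, state 5 reduces by production 6 (F → id) on every symbol in FOLLOW(F), and the names `ACTION`, `GOTO` and `parse` are chosen for the example only.

```python
# (head, body length) for each production, numbered as in question 1.
PRODUCTIONS = {
    1: ("E", 3),  # E -> E + T
    2: ("E", 1),  # E -> T
    3: ("T", 3),  # T -> T * F
    4: ("T", 1),  # T -> F
    5: ("F", 3),  # F -> ( E )
    6: ("F", 1),  # F -> id
}


def reduce_row(n):
    """Row that reduces by production n on +, *, ) and $."""
    return {t: ("r", n) for t in ("+", "*", ")", "$")}


SHIFT_OPERAND = {"(": ("s", 4), "id": ("s", 5)}

ACTION = {
    0: SHIFT_OPERAND,
    1: {"+": ("s", 6), "$": ("acc", 0)},
    2: {"+": ("r", 2), "*": ("s", 7), ")": ("r", 2), "$": ("r", 2)},
    3: reduce_row(4),
    4: SHIFT_OPERAND,
    5: reduce_row(6),
    6: SHIFT_OPERAND,
    7: SHIFT_OPERAND,
    8: {"+": ("s", 6), ")": ("s", 11)},
    9: {"+": ("r", 1), "*": ("s", 7), ")": ("r", 1), "$": ("r", 1)},
    10: reduce_row(3),
    11: reduce_row(5),
}

GOTO = {
    (0, "E"): 1, (0, "T"): 2, (0, "F"): 3,
    (4, "E"): 8, (4, "T"): 2, (4, "F"): 3,
    (6, "T"): 9, (6, "F"): 3,
    (7, "F"): 10,
}


def parse(tokens):
    """Return the list of reductions applied, or None on a syntax error."""
    stack = [0]                      # stack of state numbers
    tokens = list(tokens) + ["$"]    # append the end marker
    pos = 0
    reductions = []
    while True:
        entry = ACTION[stack[-1]].get(tokens[pos])
        if entry is None:            # blank table cell
            return None
        kind, n = entry
        if kind == "s":              # shift: push state n, consume the token
            stack.append(n)
            pos += 1
        elif kind == "r":            # reduce: pop |body| states, then goto
            head, length = PRODUCTIONS[n]
            del stack[-length:]
            stack.append(GOTO[(stack[-1], head)])
            reductions.append(n)
        else:                        # accept
            return reductions


print(parse(["id", "+", "id", "*", "id"]))  # [6, 4, 2, 6, 4, 6, 3, 1]
```

The printed list is the rightmost derivation in reverse: the multiplication is reduced (production 3) before the addition (production 1), which reflects the higher precedence of * in the grammar.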
The reduce entries for a production A → α are placed in the columns of FOLLOW(A):
FOLLOW(E) = {$, +, )}
FOLLOW(T) = {$, +, ), *}
FOLLOW(F) = {$, +, ), *}
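
These FOLLOW sets can be checked with a short fixed-point computation. The sketch below assumes a grammar without ε-productions (true for this grammar), which keeps the FIRST and FOLLOW rules simple; the function names are illustrative only.

```python
GRAMMAR = [
    ("E", ["E", "+", "T"]),
    ("E", ["T"]),
    ("T", ["T", "*", "F"]),
    ("T", ["F"]),
    ("F", ["(", "E", ")"]),
    ("F", ["id"]),
]
START = "E"
NONTERMINALS = {head for head, _ in GRAMMAR}


def first_sets():
    """FIRST for each non-terminal (no epsilon productions assumed)."""
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for head, body in GRAMMAR:
            x = body[0]
            new = first[x] if x in NONTERMINALS else {x}
            if not new <= first[head]:
                first[head] |= new
                changed = True
    return first


def follow_sets():
    """FOLLOW for each non-terminal, iterated until nothing changes."""
    first = first_sets()
    follow = {n: set() for n in NONTERMINALS}
    follow[START].add("$")
    changed = True
    while changed:
        changed = False
        for head, body in GRAMMAR:
            for i, sym in enumerate(body):
                if sym not in NONTERMINALS:
                    continue
                if i + 1 < len(body):
                    # Something follows sym: add FIRST of the next symbol.
                    nxt = body[i + 1]
                    new = first[nxt] if nxt in NONTERMINALS else {nxt}
                else:
                    # sym ends the body: it inherits FOLLOW(head).
                    new = follow[head]
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow


follow = follow_sets()
for name in ("E", "T", "F"):
    print(name, sorted(follow[name]))
```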
2. LALR parsing
LALR stands for Look-Ahead LR. It is a technique for deciding when reductions have to be made in shift/reduce parsing. In many states the decision can be made without consulting the lookahead; in others a lookahead of one token is required. Many parser generators (for example Yacc, and Bison by default) construct LALR parsers.
The canonical LR (CLR) parser is the most powerful of the LR family and can handle a large class of grammars, but its parsing table is quite large compared to the other parsing tables. LALR reduces the size of this table. LALR works similarly to CLR. The only difference is that it combines the CLR states that have the same core (the same LR(0) items, differing only in their lookaheads) into one single state.
An LR(1) item has the general form [A → α.β, a]
where A → αβ is a production and a is a terminal or the right end marker $.
LR(1) items = LR(0) items + lookahead
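
The merging step can be illustrated with a small Python sketch. Everything in it is an assumption made for the example: an LR(1) state is modelled as a set of (core item, lookahead) pairs, and the three sample states are hand-written illustrations rather than states derived from the grammar in question 1.

```python
from collections import defaultdict


def merge_by_core(lr1_states):
    """Merge LR(1) states whose LR(0) cores are identical.

    Each merged state maps a core item to the union of its lookaheads.
    """
    merged = defaultdict(lambda: defaultdict(set))
    for state in lr1_states:
        core = frozenset(item for item, _ in state)
        for item, lookahead in state:
            merged[core][item].add(lookahead)
    return [{item: frozenset(las) for item, las in items.items()}
            for items in merged.values()]


# Illustrative states only: the first two share the core {C -> d.}.
sample_states = [
    {("C -> d.", "c"), ("C -> d.", "d")},
    {("C -> d.", "$")},
    {("S -> C.C", "$")},
]

lalr_states = merge_by_core(sample_states)
print(len(lalr_states))  # 2: the two states with core {C -> d.} were merged
```

Because only the lookahead sets grow during merging, the shift actions of the merged states are unchanged; any new conflict must therefore be between two reductions.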
LALR Parsers | SLR Parsers
LALR parsers use the LR(1) collection, with item sets having the same core merged into a single item set. | SLR parsers use the canonical collection of LR(0) items to construct the parsing table.
LALR items carry one lookahead symbol. | SLR items carry no lookahead, i.e. they are LR(0) items.
LALR parsers use the lookahead symbols of the items to guide reductions. | SLR parsers use FOLLOW information to guide reductions.
LALR parsers work on a very large class of grammars. | SLR parsers may fail to produce a table for certain grammars on which the other LR methods succeed.
Not every LALR(1) grammar is SLR(1), but every LALR(1) grammar is an LR(1) grammar. | Every SLR(1) grammar is an LALR(1) grammar and an LR(1) grammar.
Merging LR(1) states cannot introduce a new shift-reduce conflict, but it may introduce a reduce-reduce conflict. | A shift-reduce or reduce-reduce conflict may arise in the SLR parsing table.
An LALR parser is intermediate in power between the SLR and canonical LR parsers. | The SLR parser is the least powerful of the three.
3. Regular expression : a*b*c*
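
As a quick check of which strings this expression describes, the sketch below tests a few inputs with Python's `re.fullmatch`. The sample strings are chosen for illustration only.

```python
import re

PATTERN = re.compile(r"a*b*c*")


def accepts(text):
    """True if the whole string is zero or more a's, then b's, then c's."""
    return PATTERN.fullmatch(text) is not None


for sample in ["", "aabcc", "ac", "ba", "abca"]:
    print(repr(sample), accepts(sample))
```

The empty string is accepted because each of the three starred parts may repeat zero times; "ba" and "abca" are rejected because a letter appears after a later one in the order a, b, c.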

4.
ε-NFA transition table (δE)
State | a    | b    | c    | ε
→q0   | {q0} | Ø    | Ø    | {q1}
q1    | Ø    | {q1} | Ø    | {q2}
q2    | Ø    | Ø    | {q2} | Ø

ε-closure(q0) = {q0, q1, q2}
ε-closure(q1) = {q1, q2}
ε-closure(q2) = {q2}

δD({q0, q1, q2}, a) = ε-closure(δE({q0, q1, q2}, a))
= ε-closure(δE(q0, a) ∪ δE(q1, a) ∪ δE(q2, a))
= ε-closure({q0} ∪ Ø ∪ Ø)
= ε-closure(q0)
= {q0, q1, q2}

DFA transition table (δD)
State         | a            | b        | c
→{q0, q1, q2} | {q0, q1, q2} | {q1, q2} | {q2}
{q1, q2}      | Ø            | {q1, q2} | {q2}
{q2}          | Ø            | Ø        | {q2}
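
The same conversion can be reproduced with a short subset-construction sketch in Python. The dictionary encoding of the ε-NFA and the function names are assumptions made for this example; the empty frozenset plays the role of Ø.

```python
EPS = "eps"  # label used here for epsilon moves

# The epsilon-NFA from the transition table above.
NFA = {
    "q0": {"a": {"q0"}, EPS: {"q1"}},
    "q1": {"b": {"q1"}, EPS: {"q2"}},
    "q2": {"c": {"q2"}},
}
ALPHABET = ("a", "b", "c")


def eps_closure(states):
    """All states reachable from `states` using only epsilon moves."""
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for r in NFA[q].get(EPS, ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)


def move(states, symbol):
    """States reachable from `states` on one `symbol` transition."""
    return {r for q in states for r in NFA[q].get(symbol, ())}


def subset_construction(start):
    """Build the DFA whose states are epsilon-closed sets of NFA states."""
    start_set = eps_closure({start})
    dfa = {}
    todo = [start_set]
    while todo:
        current = todo.pop()
        if current in dfa:
            continue
        dfa[current] = {}
        for sym in ALPHABET:
            target = eps_closure(move(current, sym))
            dfa[current][sym] = target
            if target and target not in dfa:
                todo.append(target)
    return start_set, dfa


start, dfa = subset_construction("q0")
for state, row in dfa.items():
    print(sorted(state), {sym: sorted(t) for sym, t in row.items()})
```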
• Token: a sequence of characters that is treated as a single unit because it cannot be broken down further. In a programming language such as C, keywords (int, char, float, const, goto, continue, etc.), identifiers (user-defined names), operators (+, -, *, /), delimiters/punctuators such as the comma (,), semicolon (;) and braces ({ }), and strings can all be considered tokens.
The lexical analysis phase recognizes three types of tokens: terminal symbols (TRM), i.e. keywords and operators; literals (LIT); and identifiers (IDN).

• Lexeme: a sequence of characters in the source code that matches the pattern of a token and is therefore recognized as an instance of that token. For example:
main is a lexeme of type identifier (token)
(, ), {, } are lexemes of type punctuation (token)

• Pattern: a rule that the scanner follows to create a token. For a keyword to be identified as a valid token, the pattern is the exact sequence of characters that makes up the keyword. For an identifier to be identified as a valid token, the pattern is the predefined rule that it must start with a letter, followed by letters or digits.
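
The relationship between patterns, lexemes and tokens can be sketched with a tiny scanner in Python. This is an illustration under assumptions: the keyword list is the one given above, the identifier pattern is "a letter followed by letters or digits", the literal pattern (a run of digits) is added for the example, and the token names are chosen here.

```python
import re

KEYWORDS = {"int", "char", "float", "const", "goto", "continue"}

# One pattern per token type; each matched substring is a lexeme.
TOKEN_SPEC = [
    ("identifier", r"[A-Za-z][A-Za-z0-9]*"),  # letter, then letters/digits
    ("literal", r"\d+"),                      # assumed: integer literals only
    ("operator", r"[+\-*/]"),
    ("punctuation", r"[(){},;]"),
    ("skip", r"\s+"),                         # whitespace produces no token
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (token type, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r}")
        kind, lexeme = match.lastgroup, match.group()
        pos = match.end()
        if kind == "skip":
            continue
        # A keyword fits the identifier pattern but is a reserved word.
        if kind == "identifier" and lexeme in KEYWORDS:
            kind = "keyword"
        tokens.append((kind, lexeme))
    return tokens


print(tokenize("int main() { }"))
```

Here `main` is reported as an identifier and `(`, `)`, `{`, `}` as punctuation, while `int` matches the identifier pattern first and is then reclassified as a keyword.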
