
TOPIC 4: SYNTAX & SEMANTIC ANALYSIS – PART B

Slides compiled by: Fadzlin Ahmadon
Faculty of Computer & Mathematical Sciences
UiTM (Melaka) Jasin Campus

Textbook: Compiler Design: Theory, Tools & Examples (Java Edition) by Bergmann, Seth D. (Chapter 4)

Fadzlin Ahmadon - UiTM Jasin


INTRODUCTION

• Parsing problem: given a grammar and an input string, use a parsing algorithm to:
1. Determine if the string is in the language of the grammar
2. Determine its structure
• Parsing algorithms are classified as:
• Top Down Parsing
• Bottom Up Parsing


INTRODUCTION – TOP DOWN PARSING

• Top Down Parsing algorithm: grammar rules are applied in a top-to-bottom direction in the derivation tree.
• Input string: abbbaccb

G8:
1. S → a S b
2. S → b A c
3. A → b S
4. A → a

Derivation Tree (children are indented under their parent):
S
  a
  S
    b
    A
      b
      S
        b
        A
          a
        c
    c
  b


INTRODUCTION – TOP DOWN PARSING

• Start with the starting nonterminal and decide which rule of the grammar should be applied:
• Examine a single input symbol and compare it with the first symbol on the right side of the rules
• Input string: abbbaccb

G8:
1. S → a S b
2. S → b A c
3. A → b S
4. A → a

S ⇒ aSb ⇒ abAcb ⇒ abbScb ⇒ abbbAccb ⇒ abbbaccb
Rules applied, in order: 1, 2, 3, 2, 4

RELATIONS AND CLOSURE

• From a grammar, we can automate the process of producing a parser
• This requires the use of mathematical theories involving sets and relations.
• A relation is a set of ordered pairs
• Each pair may be listed in parentheses and separated by commas


RELATIONS AND CLOSURE

• If R is a relation, then the reflexive transitive closure of R is designated as R*
• R* is a relation made up of the same elements as R with the following properties:
1) All pairs of R are also in R*
2) If (a,b) and (b,c) are in R*, then (a,c) is in R* [TRANSITIVE]
3) If a is in one of the pairs in R, then (a,a) is in R* [REFLEXIVE]

R1 = { (a,b), (c,d), (b,a), (b,c), (c,c) }

RELATIONS AND CLOSURE

Example: Show R1*, the reflexive transitive closure of R1.

Solution:
• All pairs of R1: (a,b), (c,d), (b,a), (b,c), (c,c)
• Transitive: (a,c), (b,d), (a,d)
• Reflexive: (a,a), (b,b), (d,d)
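
The closure can also be computed mechanically. The following is a minimal Java sketch, not taken from the slides or the textbook: the class name Closure and the encoding of each pair as a string "x,y" are assumptions made for illustration. It adds the reflexive pairs first and then repeats the transitive rule until nothing new appears.

```java
import java.util.*;

class Closure {
    // Returns the reflexive transitive closure of a relation.
    // Each ordered pair (x,y) is encoded as the string "x,y" (an assumption of this sketch).
    static Set<String> closure(Set<String> r) {
        Set<String> result = new TreeSet<>(r);
        // Reflexive: add (x,x) for every symbol that occurs in some pair of R.
        for (String p : r) {
            String[] xy = p.split(",");
            result.add(xy[0] + "," + xy[0]);
            result.add(xy[1] + "," + xy[1]);
        }
        // Transitive: whenever (a,b) and (b,c) are present, add (a,c); repeat until stable.
        boolean changed = true;
        while (changed) {
            changed = false;
            for (String p : new ArrayList<>(result)) {
                for (String q : new ArrayList<>(result)) {
                    String[] ab = p.split(",");
                    String[] bc = q.split(",");
                    if (ab[1].equals(bc[0]) && result.add(ab[0] + "," + bc[1])) {
                        changed = true;
                    }
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> r1 = new HashSet<>(Arrays.asList("a,b", "c,d", "b,a", "b,c", "c,c"));
        System.out.println(closure(r1)); // the 11 pairs of R1*
    }
}
```

For R1 the result has 11 pairs: the five original pairs plus the six listed in the solution.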



SIMPLE GRAMMARS

• A grammar is a simple grammar IF
1. Every rule is of the form A → aα, where
   • A is any nonterminal
   • a is any terminal
   • α is any string of terminals and nonterminals
2. Every pair of rules defining the same nonterminal begins with different terminals on the right side of the arrow


SIMPLE GRAMMARS

• Example: consider the following grammars:

G9:
1. S → aSb
2. S → b

G10:
1. S → aSb
2. S → ε

G11:
1. S → aSb
2. S → a

• Which one is a simple grammar?
• Answer: G9. G10 has an epsilon rule, and in G11 both rules defining S begin with the same terminal, a.
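
The two conditions for a simple grammar can be checked mechanically. The sketch below is illustrative only: the class name SimpleGrammarCheck and the rule encoding (a pair of strings, upper-case letters for nonterminals, lower-case letters for terminals, the empty string for ε) are assumptions, not something defined in the slides.

```java
import java.util.*;

class SimpleGrammarCheck {
    // Each rule is {lhs, rhs}. Upper-case letters are nonterminals,
    // lower-case letters are terminals, and "" stands for an epsilon rule.
    static boolean isSimple(String[][] rules) {
        Set<String> seen = new HashSet<>();
        for (String[] rule : rules) {
            String lhs = rule[0];
            String rhs = rule[1];
            // Condition 1: the right side must begin with a terminal.
            if (rhs.isEmpty() || !Character.isLowerCase(rhs.charAt(0))) {
                return false;
            }
            // Condition 2: no two rules for the same nonterminal begin with the same terminal.
            if (!seen.add(lhs + rhs.charAt(0))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isSimple(new String[][] {{"S", "aSb"}, {"S", "b"}})); // G9
        System.out.println(isSimple(new String[][] {{"S", "aSb"}, {"S", ""}}));  // G10
        System.out.println(isSimple(new String[][] {{"S", "aSb"}, {"S", "a"}})); // G11
    }
}
```

Under this encoding only G9 passes both checks.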



SELECTION SET

• A parsing algorithm must decide which rule in the grammar has to be applied when parsing a string.
• HOW? By using selection sets

SELECTION SET: the set of input symbols which imply the application of a grammar rule

• In a simple grammar, the selection set of each rule is exactly one terminal symbol: the first one on the right-hand side


SELECTION SET

• What is the selection set for each of the rules in G9?

G9:
1. S → aSb
2. S → b

• Rule 1: { a }
• Rule 2: { b }

G11:
1. S → aSb
2. S → a

• In top down parsing, rules defining the same nonterminal must have disjoint (non-intersecting) selection sets. Both rules of G11 have the selection set { a }, so G11 can't be used.


SELECTION SET
• Example: abbaddd

G12:                Selection Sets:
1. S → a b S d      1. { a }
2. S → b a S d      2. { b }
3. S → d            3. { d }

The last input symbol read selects the rule for the leftmost S:

Input read   Rule applied   Sentential form
a            rule 1         a b S d
abb          rule 2         a b b a S d d
abbad        rule 3         a b b a d d d


EXTENDED PUSHDOWN MACHINE
CONSTRUCTION FOR SIMPLE GRAMMARS

1. Build a table with:
   a. Each column labeled by a terminal symbol (and the endmarker ↵)
   b. Each row labeled by a nonterminal or terminal (and the bottom marker ▽)
2. For each grammar rule of the form A → aα, fill in the cell in row A and column a with: Rep(αʳa), Retain, where αʳ = α reversed
3. Fill in the cell in row a and column a with Pop, Advance for each terminal symbol a
4. Fill in the cell in row ▽ and column ↵ with Accept
5. Fill in all other cells with Reject
6. Initialize the stack with ▽ and the starting nonterminal


EXTENDED PUSHDOWN MACHINE
CONSTRUCTION FOR SIMPLE GRAMMARS

• Build an extended pushdown machine for G13:

G13:
1. S → aSB
2. S → b
3. B → a
4. B → bBa

       a                  b                  ↵
S      Rep(BSa), Retain   Rep(b), Retain     Reject
B      Rep(a), Retain     Rep(aBb), Retain   Reject
a      Pop, Advance       Reject             Reject
b      Reject             Pop, Advance       Reject
▽      Reject             Reject             Accept


EXTENDED PUSHDOWN MACHINE
CONSTRUCTION FOR SIMPLE GRAMMARS

• Once every cell of the pushdown machine table is filled in, the machine is run as follows. When the top stack symbol is a nonterminal, replace it with the symbols on the right side of the selected rule in reverse order and retain the input pointer
• When the top stack symbol is a terminal, check that the current input symbol matches that stack symbol. If yes, pop the stack and advance the input pointer; if no, reject the input string
• When the end of the input string is encountered, the stack should be empty (except for ▽) in order for the string to be accepted.
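
A table-driven parser of this kind can be sketched in a few lines of Java. This is a minimal illustration for G13 only, not the textbook's implementation: the class name PushdownG13 is hypothetical, and the characters '$' and '#' are stand-ins chosen here for the bottom marker ▽ and the endmarker ↵.

```java
import java.util.*;

class PushdownG13 {
    // Table-driven parse for G13: S -> aSB | b, B -> a | bBa.
    // '$' stands in for the bottom marker and '#' for the endmarker (assumptions of this sketch).
    static boolean parse(String input) {
        // Key = nonterminal on top of the stack + current input symbol;
        // value = right side of the rule selected by that input symbol.
        Map<String, String> rep = new HashMap<>();
        rep.put("Sa", "aSB");
        rep.put("Sb", "b");
        rep.put("Ba", "a");
        rep.put("Bb", "bBa");

        String in = input + "#";
        Deque<Character> stack = new ArrayDeque<>();
        stack.push('$');
        stack.push('S');
        int pos = 0;
        while (true) {
            char top = stack.peek();
            char cur = in.charAt(pos);
            if (top == '$') {
                return cur == '#';              // Accept only if all input has been read
            }
            if (Character.isUpperCase(top)) {
                String rhs = rep.get("" + top + cur);
                if (rhs == null) {
                    return false;               // no rule selected: Reject
                }
                stack.pop();
                // Rep: push the right side in reverse so its first symbol ends up on top.
                // Retain: the input position is unchanged.
                for (int i = rhs.length() - 1; i >= 0; i--) {
                    stack.push(rhs.charAt(i));
                }
            } else if (top == cur) {
                stack.pop();                    // Pop
                pos++;                          // Advance
            } else {
                return false;                   // terminal mismatch: Reject
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("aba"));
        System.out.println(parse("ab"));
    }
}
```

The map holds only the Rep cells; the Pop, Advance cells and the Reject cells are handled by the two terminal branches, which keeps the sketch short at the cost of not showing the full table explicitly.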



EXTENDED PUSHDOWN MACHINE
CONSTRUCTION FOR SIMPLE GRAMMARS

• Show the sequence of stacks for input string: aba

Stack (top first)   Input symbol   Action
S ▽                 a              Rep(BSa), Retain
a S B ▽             a              Pop, Advance
S B ▽               b              Rep(b), Retain
b B ▽               b              Pop, Advance
B ▽                 a              Rep(a), Retain
a ▽                 a              Pop, Advance
▽                   ↵              Accept


PARSER IMPLEMENTATION

• We now know that from a grammar we can write a program that accepts any string in the language of that grammar and rejects any string not in the language of that grammar
• There is software that can automatically generate a parser from a grammar. We call this kind of software a compiler-compiler
• An example of a compiler-compiler is SableCC


RECURSIVE DESCENT PARSER FOR
SIMPLE GRAMMARS
• A second way of implementing a parser for simple grammars is to use a methodology known as recursive descent
• Recursive Descent Parser: the parser is written using a traditional programming language such as Java or C++
• A method is written for each nonterminal in the grammar:
• it handles nonterminals by calling the corresponding method
• it handles terminals by reading another input symbol


RECURSIVE DESCENT PARSER FOR
SIMPLE GRAMMARS
G13:
1. S → aSB
2. S → b
3. B → a
4. B → bBa

void S ()
{
    if (inp=='a')            // apply rule 1
    {
        inp = getInp();
        S ();
        B ();
    }                        // end rule 1
    else if (inp=='b')
        inp = getInp();      // apply rule 2
    else
        reject();
}

void B ()
{
    if (inp=='a')
        inp = getInp();      // apply rule 3
    else if (inp=='b')       // apply rule 4
    {
        inp = getInp();
        B ();
        if (inp=='a')
            inp = getInp();
        else
            reject();
    }                        // end rule 4
    else
        reject();
}
QUASI-SIMPLE GRAMMARS

• A quasi-simple grammar is a grammar which obeys the restrictions of simple grammars, but may also contain rules of the form:

N → ε

where N is any nonterminal and ε is the empty string

• As long as all rules defining the same nonterminal have disjoint selection sets


QUASI-SIMPLE GRAMMARS

• For example, G14 is a quasi-simple grammar

G14:
1. S → a A S
2. S → b
3. A → c A S
4. A → ε

• In order to do a top-down parse for this grammar, we again have to find the selection set for each rule
• To find the selection set for ε rules, we need to find the follow set


FOLLOW SETS

1. For a nonterminal A, Fol(A) = the set of all terminals that can appear immediately to the right of A in some partial derivation
2. The derivation starts from S↵, where S is the starting nonterminal
3. If A is the rightmost symbol in a derivation, then the endmarker (↵) is in Fol(A)
4. ε is never in a follow set


FOLLOW SETS

• For grammar G14, the follow sets for S and A are:

G14:
1. S → a A S
2. S → b
3. A → c A S
4. A → ε

Fol(S) = {a,b,↵}
S↵ ⇒ aAS↵ ⇒ acASS↵ ⇒ acASaAS↵
                   ⇒ acASb↵

Fol(A) = {a,b}
S↵ ⇒ aAS↵ ⇒ aAaAS↵
          ⇒ aAb↵


QUASI-SIMPLE GRAMMARS: SELECTION
SET
• The selection set for an ε rule is simply the follow set of the nonterminal on the left side of the arrow

G14:
1. S → a A S
2. S → b
3. A → c A S
4. A → ε

Fol(S) = {a,b,↵}
Fol(A) = {a,b}

• In G14, the selection set for rule 4 is:
• Sel(4) = Fol(A) = {a,b}


QUASI-SIMPLE GRAMMAR

• Example: acbb

G14:                Selection Sets:
1. S → a A S        1. { a }
2. S → b            2. { b }
3. A → c A S        3. { c }
4. A → ε            4. { a,b }


QUASI-SIMPLE GRAMMAR
• Example: acbb

The last input symbol read selects the rule for the leftmost nonterminal:

Input read   Rule applied      Sentential form
a            rule 1            a A S
ac           rule 3            a c A S S
acb          rule 4 (A → ε)    a c S S
acb          rule 2            a c b S
acbb         rule 2            a c b b
EXTENDED PUSHDOWN MACHINES
CONSTRUCTION FOR QUASI-SIMPLE
GRAMMARS
1. Build a table with:
   a. Each column labeled by a terminal symbol (and the endmarker ↵)
   b. Each row labeled by a nonterminal or terminal (and the bottom marker ▽)
2. For each grammar rule of the form A → aα, fill in the cell in row A and column a with: Rep(αʳa), Retain, where αʳ = α reversed
3. Fill in the cell in row a and column a with Pop, Advance for each terminal symbol a
4. Fill in the cell in row ▽ and column ↵ with Accept
5. For each ε rule in the grammar, fill in cells of the row corresponding to the nonterminal on the left side of the arrow, but only in those columns corresponding to elements of the follow set of the nonterminal. Fill in these cells with Pop, Retain
6. Fill in all other cells with Reject
7. Initialize the stack with ▽ and the starting nonterminal
EXTENDED PUSHDOWN MACHINES
CONSTRUCTION FOR QUASI-SIMPLE
GRAMMARS
G14:                Selection Sets:
1. S → a A S        Sel(1) = {a}
2. S → b            Sel(2) = {b}
3. A → c A S        Sel(3) = {c}
4. A → ε            Sel(4) = {a,b}

       a                  b                c                  ↵
S      Rep(SAa), Retain   Rep(b), Retain   Reject             Reject
A      Pop, Retain        Pop, Retain      Rep(SAc), Retain   Reject
a      Pop, Advance       Reject           Reject             Reject
b      Reject             Pop, Advance     Reject             Reject
c      Reject             Reject           Pop, Advance       Reject
▽      Reject             Reject           Reject             Accept

Initial stack: S on top of ▽
RECURSIVE DESCENT PARSER FOR
QUASI-SIMPLE GRAMMARS
• Recursive descent parsers for quasi-simple grammars are similar to those for simple grammars
• The only difference is that if the current input symbol is in the selection set of an ε rule, we simply return to the calling method without reading any input


RECURSIVE DESCENT PARSER FOR
QUASI-SIMPLE GRAMMARS

void S ()
{
    if (inp=='a')            // apply rule 1
    {
        inp = getInp();
        A ();
        S ();
    }                        // end rule 1
    else if (inp=='b')
        inp = getInp();      // apply rule 2
    else
        reject();
}

void A ()
{
    if (inp=='c')            // apply rule 3
    {
        inp = getInp();
        A ();
        S ();
    }                        // end rule 3
    else if (inp=='a' || inp=='b')
        ;                    // apply rule 4
    else
        reject();
}


LL(1) GRAMMARS

• Grammars that can be parsed top down:
• Simple Grammar (A → aα)
• Quasi-simple Grammar (A → aα) with (N → ε)
• LL(1) Grammar (A → α)
• As with simple and quasi-simple grammars, we can construct a one-state pushdown machine parser or a recursive descent parser for an LL(1) grammar
• As long as any two rules defining the same nonterminal have disjoint selection sets


LL(1) GRAMMARS

• Called LL(1) Grammar because:
• The parser finds a left-most derivation when scanning the input from left to right if it can look ahead no more than one input symbol.
• We have to find the selection sets for the LL(1) grammar before we can construct a pushdown machine or recursive descent parser
• There are 12 steps that we have to do to find the selection sets


LL(1) GRAMMARS

G15:
1. S → ABc
2. A → bA
3. A → ε
4. B → c
Step 1.
Find all nullable rules and nullable nonterminals:

a. Remove, temporarily, all rules containing a terminal.
b. All ε rules are nullable rules.
c. The nonterminal defined in a nullable rule is a nullable nonterminal.
LL(1) GRAMMARS

Step 1.
Find all nullable rules and nullable nonterminals:

d. Every rule of the form A → B C D ... where B, C, D, ... are all nullable nonterminals is a nullable rule, and the nonterminal A it defines is a nullable nonterminal.
e. A nonterminal is nullable if ε can be derived from it, and a rule is nullable if ε can be derived from its right side.

G15: Nullable rules: rule 3; Nullable nonterminals: A
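
Step 1 amounts to a fixed-point computation: keep marking nonterminals as nullable until no rule adds a new one. The Java sketch below is one possible way to write it, under assumptions made here for illustration (class name Nullable; rules encoded as pairs of strings with single-character symbols, upper-case for nonterminals and the empty string for ε).

```java
import java.util.*;

class Nullable {
    // Each rule is {lhs, rhs}; upper-case letters are nonterminals,
    // lower-case letters are terminals, and "" is an epsilon right side.
    static Set<Character> nullableNonterminals(String[][] rules) {
        Set<Character> nullable = new TreeSet<>();
        boolean changed = true;
        while (changed) {
            changed = false;
            for (String[] rule : rules) {
                // A rule is nullable if every symbol on its right side is a
                // nullable nonterminal (trivially true for an epsilon rule).
                boolean allNullable = true;
                for (char c : rule[1].toCharArray()) {
                    if (!nullable.contains(c)) {
                        allNullable = false;
                        break;
                    }
                }
                if (allNullable && nullable.add(rule[0].charAt(0))) {
                    changed = true;
                }
            }
        }
        return nullable;
    }

    public static void main(String[] args) {
        String[][] g15 = {{"S", "ABc"}, {"A", "bA"}, {"A", ""}, {"B", "c"}};
        System.out.println(nullableNonterminals(g15));
    }
}
```

A terminal is never in the nullable set, so any rule containing a terminal fails the test automatically; this plays the role of sub-step a without removing rules.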


LL(1) GRAMMARS

Step 2.
Compute the relation “Begins Directly With” for each nonterminal:

A BDW X if there is a rule A → α X β such that
• α is a nullable string (a string of nullable nonterminals).
• A represents a nonterminal.
• X represents a terminal or nonterminal.
• β represents any string of terminals and nonterminals.
LL(1) GRAMMARS

Step 2.
Compute the relation “Begins Directly With” for each
nonterminal:

G15 :
S BDW A (from rule 1)
S BDW B (also from rule 1, because A is nullable)
A BDW b (from rule 2)
B BDW c (from rule 4)
LL(1) GRAMMARS

Step 3.
Compute the relation “Begins With”:

a. X BW Y if there is a string beginning with Y that can be derived from X.
b. BW is the reflexive transitive closure of BDW. In addition, BW should contain pairs of the form a BW a for each terminal a in the grammar.
Step 3.
Compute the relation “Begins With”:
For G15:

From BDW:
S BW A
S BW B
A BW b
B BW c

Transitive:
S BW b
S BW c

Reflexive:
S BW S
A BW A
B BW B
b BW b
c BW c
LL(1) GRAMMARS

Step 4.
Compute the set of terminals "First(x)" for each symbol x in the
grammar.

a. At this point, we can find the set of all terminals which can
begin a sentential form when starting with a given symbol of the
grammar.
b. First(A) = set of all terminals b, such that A BW b for each
nonterminal A.
c. First(t) = {t} for each terminal.
LL(1) GRAMMARS

Step 4.
Compute the set of terminals "First(x)" for each symbol x in the
grammar.

For G15 –

First(S) = {b,c}
First(A) = {b}
First(B) = {c}
First(b) = {b}
First(c) = {c}
LL(1) GRAMMARS

Step 5.
Compute "First" of right side of each rule:
a. Compute the set of terminals which can begin a sentential form
derivable from the right side of each rule.
First(XYZ...) = First(X)
U First(Y) if X is nullable
U First(Z) if Y is also nullable . . .
b. Find the union of the First(x) sets for each symbol on the right
side of a rule, but stop when reaching a non-nullable symbol.
LL(1) GRAMMARS

Step 5.
Compute "First" of right side of each rule:

For G15 –
1. First(ABc)=First(A) U First(B)={b,c}
(because A is nullable)
2. First(bA) = {b}
3. First(ε) = {}
4. First(c) = {c}

If the grammar contains no nullable rules, skip to step 12 at this point.
LL(1) GRAMMARS

Step 6. Compute the relation “Is Followed Directly By”:

• B FDB X if there is a rule of the form A → α B β X γ where β is a string of nullable nonterminals, α, γ are strings of symbols, X is any symbol, and A and B are nonterminals.
LL(1) GRAMMARS

Step 6. Compute the relation “Is Followed Directly By”:

For G15 –
A FDB B (from rule 1)
B FDB c (from rule 1)

• If B were a nullable nonterminal we would also have A FDB c.


LL(1) GRAMMARS

Step 7.
Compute the relation “Is Direct End Of”:

X DEO A if there is a rule of the form A → α X β where β is a string of nullable nonterminals, α is a string of symbols, and X is a single grammar symbol.
LL(1) GRAMMARS

Step 7.
Compute the relation “Is Direct End Of”:

For G15 –
c DEO S (from rule 1)
A DEO A (from rule 2)
b DEO A (from rule 2, since A is nullable)
c DEO B (from rule 4)
LL(1) GRAMMARS

Step 8.
Compute the relation “Is End Of”:

a. X EO Y if there is a string derived from Y that ends with X.
b. EO is the reflexive transitive closure of DEO.
c. EO should contain pairs of the form N EO N for each nullable nonterminal, N, in the grammar.
LL(1) GRAMMARS

Step 8.
Compute the relation “Is End Of”:

For G15 –
From DEO:
c EO S
A EO A
b EO A
c EO B

(no transitive entries)

Reflexive:
c EO c
S EO S
b EO b
B EO B
LL(1) GRAMMARS

Step 9.
Compute the relation “Is Followed By”:

• W FB Z if there is a string derived from S↵ in which W is immediately followed by Z.
If there are symbols X and Y such that
W EO X (Step 8)
X FDB Y (Step 6)
Y BW Z (Step 3)
then
W FB Z
LL(1) GRAMMARS

Step 9.
Compute the relation “Is Followed By”:

For G15 –
W EO X     X FDB Y     Y BW Z     W FB Z
A EO A     A FDB B     B BW B     A FB B
                       B BW c     A FB c
b EO A     A FDB B     B BW B     b FB B
                       B BW c     b FB c
B EO B     B FDB c     c BW c     B FB c
c EO B     B FDB c     c BW c     c FB c
LL(1) GRAMMARS

Step 10.
Extend the FB relation to include endmarker:

• A FB ↵ if A EO S where A represents any nonterminal and S represents the starting nonterminal.

For G15:
S FB ↵ because S EO S

• There are now seven pairs in the FB relation for grammar G15.
LL(1) GRAMMARS

Step 11.
Compute the Follow Set for each nullable nonterminal:

• The follow set of any nonterminal A is the set of all terminals, t, for which A FB t.
Fol(A) = {t | A FB t}

• To find selection sets, we need to find follow sets for nullable nonterminals only.

For G15 –
Fol(A) = {c} since A is the only nullable
nonterminal and A FB c.
LL(1) GRAMMARS

Step 12.
Compute the selection set for each rule:

i. A → α
if rule i is not a nullable rule, then
Sel(i) = First(α) (from Step 5)
if rule i is a nullable rule, then
Sel(i) = First(α) U Fol(A)
LL(1) GRAMMARS

Step 12.
Compute the selection set for each rule:

For G15 –
Sel(1) = First(ABc) = {b,c}
Sel(2) = First(bA) = {b}
Sel(3) = First(ε) U Fol(A) = {} U {c} = {c}
Sel(4) = First(c) = {c}

Pushdown machines for LL(1) grammars are constructed exactly as pushdown machines for quasi-simple grammars.

G15:
1. S → ABc
2. A → bA
3. A → ε
4. B → c

       b                  c                  ↵
S      Rep(cBA), Retain   Rep(cBA), Retain   Reject
A      Rep(Ab), Retain    Pop, Retain        Reject
B      Reject             Rep(c), Retain     Reject
b      Pop, Advance       Reject             Reject
c      Reject             Pop, Advance       Reject
▽      Reject             Reject             Accept

Initial stack: S on top of ▽

Pushdown Machine for Grammar G15

Stack (top first)   Input symbol   Action
S ▽                 b              Rep(cBA), Retain
A B c ▽             b              Rep(Ab), Retain
b A B c ▽           b              Pop, Advance
A B c ▽             c              Pop, Retain
B c ▽               c              Rep(c), Retain
c c ▽               c              Pop, Advance
c ▽                 c              Pop, Advance
▽                   ↵              Accept

Sequence of Stacks for Machine for Grammar G15. Input bcc↵


Recursive Descent for LL(1) Grammars
void S ()
{
    if (inp=='b' || inp=='c')    // apply rule 1
    {
        A ();
        B ();
        if (inp=='c')
            inp=getInp();
        else
            reject();
    }                            // end rule 1
    else
        reject();
}

void A ()
{
    if (inp=='b')                // apply rule 2
    {
        inp=getInp();
        A ();
    }                            // end rule 2
    else if (inp=='c')
        ;                        // apply rule 3
    else
        reject();
}

void B ()
{
if (inp=='c')
inp=getInp(); // apply rule 4
else
reject();
}
Dependency Graph for the Steps in the Algorithm for Finding Selection Sets

1. Find nullable rules and nullable nonterminals.
2. Find “Begins Directly With” relation (BDW).
3. Find “Begins With” relation (BW).
4. Find “First(x)” for each symbol, x.
5. Find “First(n)” for the right side of each rule, n.
6. Find “Followed Directly By” relation (FDB).
7. Find “Is Direct End Of” relation (DEO).
8. Find “Is End Of” relation (EO).
9. Find “Is Followed By” relation (FB).
10. Extend FB to include endmarker.
11. Find Follow Set, Fol(A), for each nullable nonterminal, A.
12. Find Selection Set, Sel(n), for each rule, n.

[Figure: dependency graph whose nodes are steps 1 to 12]
PARSING ARITHMETIC EXPRESSIONS TOP
DOWN
• We now understand how to determine if a grammar can be parsed top down and how to construct a top down parser
• Now we can begin to study how to parse arithmetic expressions, which are used widely in programming languages


PARSING ARITHMETIC EXPRESSIONS TOP
DOWN
• Check if this grammar is LL(1):
G5:
1. Expr → Expr + Term
2. Expr → Term
3. Term → Term ∗ Factor
4. Term → Factor
5. Factor → ( Expr )
6. Factor → var
• The twelve-step algorithm:
1. Nullable rules & nonterminals
• Nullable rules: none
• Nullable nonterminals: none


PARSING ARITHMETIC EXPRESSIONS TOP
DOWN

2. Begins Directly With Relation
• Expr BDW Expr
• Expr BDW Term
• Term BDW Term
• Term BDW Factor
• Factor BDW (
• Factor BDW var


PARSING ARITHMETIC EXPRESSIONS TOP
DOWN
3. Begins With Relation
• Expr BW Expr
• Expr BW Term
• Term BW Term
• Term BW Factor
• Factor BW (
• Factor BW var
• Expr BW Factor
• Expr BW (
• Expr BW var
• Term BW (
• Term BW var
• Factor BW Factor
• ( BW (
• var BW var
• ∗ BW ∗
• + BW +
• ) BW )
PARSING ARITHMETIC EXPRESSIONS TOP
DOWN
4. First(x)
• First(Expr) = {(,var}
• First(Term) = {(,var}
• First(Factor) = {(,var}

5. First of the right side of each rule
1) First(Expr + Term) = {(,var}
2) First(Term) = {(,var}
3) First(Term ∗ Factor) = {(,var}
4) First(Factor) = {(,var}
5) First( ( Expr ) ) = {(}
6) First(var) = {var}


PARSING ARITHMETIC EXPRESSIONS TOP
DOWN
G5:
1. Expr → Expr + Term       Sel(1) = {(,var}
2. Expr → Term              Sel(2) = {(,var}
3. Term → Term ∗ Factor     Sel(3) = {(,var}
4. Term → Factor            Sel(4) = {(,var}
5. Factor → ( Expr )        Sel(5) = {(}
6. Factor → var             Sel(6) = {var}

• This grammar is not LL(1) because rules 1 and 2 define the same nonterminal Expr and their selection sets intersect
• This is also true for rules 3 and 4


PARSING ARITHMETIC EXPRESSIONS TOP
DOWN
• Rules 1 and 3 of G5 both have a property known as left recursion:
1. Expr → Expr + Term
3. Term → Term ∗ Factor

• They are in the form: A → Aα


PARSING ARITHMETIC EXPRESSIONS TOP
DOWN

• The left recursion can be eliminated by rewriting the grammar as an equivalent grammar that does not have left recursion

• The offending rules might be in the form:
A → Aα
A → β
in which we assume that β is a string of terminals and nonterminals that does not begin with an A.
PARSING ARITHMETIC EXPRESSIONS TOP
DOWN

• The left recursion can be eliminated by introducing a new nonterminal, R, and rewriting the rules as:

A → βR
R → αR
R → ε
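
The rewriting can be applied mechanically to one nonterminal at a time. The following Java sketch is an illustration only: the class name LeftRecursion, the representation of a right side as a list of symbol names, and the ASCII spellings "->" and "epsilon" in the output are all assumptions of this sketch. The caller supplies the name of the new nonterminal R and is expected to call the method only for a nonterminal that actually has a left-recursive rule.

```java
import java.util.*;

class LeftRecursion {
    // Rewrites A -> A alpha | beta as A -> beta R, R -> alpha R, R -> epsilon.
    // Each right side is a list of symbol names; r is the caller-chosen new nonterminal.
    static List<String> eliminate(String a, List<List<String>> rightSides, String r) {
        List<String> forA = new ArrayList<>();
        List<String> forR = new ArrayList<>();
        for (List<String> rhs : rightSides) {
            if (!rhs.isEmpty() && rhs.get(0).equals(a)) {
                // Left-recursive rule A -> A alpha contributes R -> alpha R.
                String alpha = String.join(" ", rhs.subList(1, rhs.size()));
                forR.add(r + " -> " + alpha + " " + r);
            } else {
                // Rule A -> beta becomes A -> beta R.
                forA.add(a + " -> " + String.join(" ", rhs) + " " + r);
            }
        }
        forR.add(r + " -> epsilon");
        forA.addAll(forR);
        return forA;
    }

    public static void main(String[] args) {
        List<List<String>> expr = Arrays.asList(
            Arrays.asList("Expr", "+", "Term"),
            Arrays.asList("Term"));
        for (String rule : eliminate("Expr", expr, "Elist")) {
            System.out.println(rule);
        }
    }
}
```

Applied to the two Expr rules with Elist as the new nonterminal, this yields Expr -> Term Elist, Elist -> + Term Elist and Elist -> epsilon.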
Parsing Arithmetic Expressions Top Down
G16:
1. Expr → Term Elist
2. Elist → + Term Elist
3. Elist → ε
4. Term → Factor Tlist
5. Tlist → ∗ Factor Tlist
6. Tlist → ε
7. Factor → ( Expr )
8. Factor → var

Step 1.
Nullable rules: 3, 6
Nullable nonterminals: Elist, Tlist

Step 2.
Expr BDW Term
Elist BDW +
Term BDW Factor
Tlist BDW ∗
Factor BDW (
Factor BDW var
Parsing Arithmetic Expressions Top Down
Step 3.

From BDW:
Expr BW Term
Elist BW +
Term BW Factor
Tlist BW ∗
Factor BW (
Factor BW var

Transitive:
Expr BW Factor
Term BW (
Term BW var
Expr BW (
Expr BW var

Reflexive:
Expr BW Expr
Term BW Term
Factor BW Factor
Elist BW Elist
Tlist BW Tlist
+ BW +
∗ BW ∗
( BW (
var BW var
) BW )
Parsing Arithmetic Expressions Top Down
Step 4.
First (Expr) = {(,var}
First (Elist) = {+}
First (Term) = {(,var}
First (Tlist) = {∗}
First (Factor) = {(,var}

Step 5.
1. First(Term Elist) = {(,var}
2. First(+ Term Elist) = {+}
3. First(ε) = {}
4. First(Factor Tlist) = {(,var}
5. First(∗ Factor Tlist) = {∗}
6. First(ε) = {}
7. First(( Expr )) = {(}
8. First(var) = {var}
Parsing Arithmetic Expressions Top Down

Step 6.
Term FDB Elist
Factor FDB Tlist
Expr FDB )

Step 7.
Elist DEO Expr
Term DEO Expr
Elist DEO Elist
Term DEO Elist
Tlist DEO Term
Factor DEO Term
Tlist DEO Tlist
Factor DEO Tlist
) DEO Factor
var DEO Factor
Parsing Arithmetic Expressions Top Down

Step 8.

From DEO:
Elist EO Expr
Term EO Expr
Elist EO Elist
Term EO Elist
Tlist EO Term
Factor EO Term
Tlist EO Tlist
Factor EO Tlist
) EO Factor
var EO Factor

Transitive:
Tlist EO Expr
Tlist EO Elist
Factor EO Expr
Factor EO Elist
) EO Term
) EO Tlist
) EO Expr
) EO Elist
var EO Term
var EO Tlist
var EO Expr
var EO Elist

Reflexive:
Expr EO Expr
Term EO Term
Factor EO Factor
) EO )
var EO var
+ EO +
∗ EO ∗
( EO (
Elist EO Elist
Tlist EO Tlist
Parsing Arithmetic Expressions Top Down
Step 9.
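W EO X            X FDB Y            Y BW Z            W FB Z
Tlist EO Term     Term FDB Elist     Elist BW +        Tlist FB +
                                     Elist BW Elist
Factor EO Term    Term FDB Elist     Elist BW +
                                     Elist BW Elist
var EO Term       Term FDB Elist     Elist BW +
                                     Elist BW Elist
Term EO Term      Term FDB Elist     Elist BW +
                                     Elist BW Elist
) EO Term         Term FDB Elist     Elist BW +
                                     Elist BW Elist
) EO Factor       Factor FDB Tlist   Tlist BW ∗
                                     Tlist BW Tlist
var EO Factor     Factor FDB Tlist   Tlist BW ∗
                                     Tlist BW Tlist
Factor EO Factor  Factor FDB Tlist   Tlist BW ∗
                                     Tlist BW Tlist
Elist EO Expr     Expr FDB )         ) BW )            Elist FB )
Tlist EO Expr     Expr FDB )         ) BW )            Tlist FB )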
Parsing Arithmetic Expressions Top Down
Step 10.
Elist FB ↵
Term FB ↵
Expr FB ↵
Tlist FB ↵
Factor FB ↵

Step 11.
Fol (Elist) = {), ↵}
Fol (Tlist) = {+, ), ↵}

Step 12.
Sel(1) = First(Term Elist) = {(,var}
Sel(2) = First(+ Term Elist) = {+}
Sel(3) = Fol(Elist) = {), ↵}
Sel(4) = First(Factor Tlist) = {(,var}
Sel(5) = First(∗ Factor Tlist) = {∗}
Sel(6) = Fol(Tlist) = {+,), ↵}
Sel(7) = First( ( Expr ) ) = {(}
Sel(8) = First(var) = {var}
PUSHDOWN MACHINES FOR LL(1)
GRAMMARS
G16:
1. Expr → Term Elist Sel(1) = {(,var}
2. Elist → + Term Elist Sel(2) = {+}
3. Elist → ε Sel(3) = {), ↵}
4. Term → Factor Tlist Sel(4) = {(,var}
5. Tlist → ∗ Factor Tlist Sel(5) = {∗}
6. Tlist → ε Sel(6) = {+,), ↵}
7. Factor → ( Expr ) Sel(7) = {(}
8. Factor → var Sel(8) = {var}

Since all rules defining the same nonterminal (rules 2 and 3, rules 5 and 6, and rules 7 and 8) have disjoint selection sets, the grammar G16 is an LL(1) grammar.

Pushdown machine table for G16. Columns: + ∗ ( ) var ↵. Every cell not listed below is Reject.

Row      Column(s)     Entry
Expr     ( , var       Rep(Elist, Term), Retain
Elist    +             Rep(Elist, Term, +), Retain
Elist    ) , ↵         Pop, Retain
Term     ( , var       Rep(Tlist, Factor), Retain
Tlist    ∗             Rep(Tlist, Factor, ∗), Retain
Tlist    + , ) , ↵     Pop, Retain
Factor   (             Rep( ")", Expr, "(" ), Retain
Factor   var           Rep(var), Retain
+        +             Pop, Advance
∗        ∗             Pop, Advance
(        (             Pop, Advance
)        )             Pop, Advance
var      var           Pop, Advance
▽        ↵             Accept

Initial stack: the starting nonterminal (Expr) on top of ▽

Recursive Descent for LL(1) Grammars
G16:
1. Expr → Term Elist
2. Elist → + Term Elist
3. Elist → ε
4. Term → Factor Tlist
5. Tlist → ∗ Factor Tlist
6. Tlist → ε
7. Factor → ( Expr )
8. Factor → var

void parse ()
{
    inp = getInp();
    Expr ();                     // Call start nonterminal
    if (inp=='\r')               // end of string marker
        accept();
    else
        reject();
}

void Expr ()
{
    if (inp=='(' || inp=='v')    // apply rule 1
    {
        Term ();
        Elist ();
    }                            // end rule 1
    else
        reject();
}

void Elist ()
{
    if (inp=='+')                // apply rule 2
    {
        inp=getInp();
        Term ();
        Elist ();
    }                            // end rule 2
    else if (inp==')' || inp=='\r')
        ;                        // apply rule 3
    else
        reject();
}

void Term ()
{
    if (inp=='(' || inp=='v')    // apply rule 4
    {
        Factor ();
        Tlist ();
    }                            // end rule 4
    else
        reject();
}

void Tlist ()
{
    if (inp=='*')                // apply rule 5
    {
        inp=getInp();
        Factor ();
        Tlist ();
    }                            // end rule 5
    else if (inp=='+' || inp==')' || inp=='\r')
        ;                        // apply rule 6
    else
        reject();
}

void Factor ()
{
    if (inp=='(')                // apply rule 7
    {
        inp=getInp();
        Expr ();
        if (inp==')')
            inp=getInp();
        else
            reject();
    }                            // end rule 7
    else if (inp=='v')
        inp=getInp();            // apply rule 8
    else
        reject();
}
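
The methods above rely on inp, getInp(), accept() and reject(), which the slides do not define. One way to make them runnable is sketched below as a self-contained Java class. The class name ExprParser, the use of an exception to signal reject, and reading the input from a string with '\r' appended as the end marker are assumptions of this sketch; as in the slide code, the single character 'v' stands for var.

```java
class ExprParser {
    private final String input;
    private int pos = 0;
    private char inp;

    private ExprParser(String s) {
        input = s + "\r";            // '\r' serves as the end-of-string marker
    }

    // Returns true if s is in the language of G16, false otherwise.
    static boolean accepts(String s) {
        ExprParser p = new ExprParser(s);
        try {
            p.inp = p.getInp();
            p.expr();
            return p.inp == '\r';    // accept only if all input was consumed
        } catch (IllegalStateException e) {
            return false;
        }
    }

    private char getInp() { return input.charAt(pos++); }

    private void reject() { throw new IllegalStateException("reject"); }

    private void expr() {            // rule 1
        if (inp == '(' || inp == 'v') { term(); elist(); }
        else reject();
    }

    private void elist() {
        if (inp == '+') { inp = getInp(); term(); elist(); }    // rule 2
        else if (inp == ')' || inp == '\r') { /* rule 3: read nothing */ }
        else reject();
    }

    private void term() {            // rule 4
        if (inp == '(' || inp == 'v') { factor(); tlist(); }
        else reject();
    }

    private void tlist() {
        if (inp == '*') { inp = getInp(); factor(); tlist(); }  // rule 5
        else if (inp == '+' || inp == ')' || inp == '\r') { /* rule 6 */ }
        else reject();
    }

    private void factor() {
        if (inp == '(') {                                        // rule 7
            inp = getInp();
            expr();
            if (inp == ')') inp = getInp(); else reject();
        } else if (inp == 'v') {                                 // rule 8
            inp = getInp();
        } else {
            reject();
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts("(v+v)*v"));
        System.out.println(accepts("v+"));
    }
}
```

Throwing an exception from reject() unwinds all pending method calls at once, which avoids threading an error flag through every method.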
