Topic 4-Bv2

TOPIC 4: SYNTAX & SEMANTIC ANALYSIS – PART B
SLIDES COMPILED BY: FADZLIN AHMADON
FACULTY OF COMPUTER & MATHEMATICAL SCIENCES
UITM (MELAKA) JASIN CAMPUS
TEXTBOOK: COMPILER DESIGN: THEORY, TOOLS & EXAMPLES (JAVA EDITION) BY BERGMANN, SETH D. (CHAPTER 4)
Fadzlin Ahmadon - UiTM Jasin

INTRODUCTION
• Parsing problem: given a grammar and an input string, use a parsing algorithm to:
1. Determine if the string is in the language of the grammar
2. Determine its structure
• Parsing algorithms are classified as:
• Top Down Parsing
• Bottom Up Parsing

INTRODUCTION – TOP DOWN PARSING
• Top-down parsing algorithm: grammar rules are applied in a top-to-bottom direction in the derivation tree.
• Input string: abbbaccb
G8:
1. S → a S b
2. S → b A c
3. A → b S
4. A → a
Derivation Tree:
S
├─ a
├─ S
│  ├─ b
│  ├─ A
│  │  ├─ b
│  │  └─ S
│  │     ├─ b
│  │     ├─ A
│  │     │  └─ a
│  │     └─ c
│  └─ c
└─ b

INTRODUCTION – TOP DOWN PARSING
• Start with the starting nonterminal and decide which rule of the grammar should be applied:
• Examine a single input symbol and compare it with the first symbol on
the right side of the rules
• abbbaccb
G8:
1. S → a S b
2. S → b A c
3. A → b S
4. A → a
S ⇒ aSb ⇒ abAcb ⇒ abbScb ⇒ abbbAccb ⇒ abbbaccb
Rules applied, in order: 1, 2, 3, 2, 4

RELATIONS AND CLOSURE
• From a grammar, we can automate the process of producing a parser
• This requires the use of mathematical theories involving sets and relations.
• A relation is a set of ordered pairs. Each pair may be listed in parentheses and separated by commas.

• If R is a relation, then the reflexive transitive closure of R is designated as R*
• R* is a relation made up of the same elements as R with the following properties:
1) All pairs of R are also in R*
2) If (a,b) and (b,c) are in R*, then (a,c) is in R* [TRANSITIVE]
3) If a is in one of the pairs in R, then (a,a) is in R* [REFLEXIVE]
Example: Show R1*, the reflexive transitive closure of R1.
R1 = { (a,b), (c,d), (b,a), (b,c), (c,c) }
Solution:
• All pairs of R1: (a,b), (c,d), (b,a), (b,c), (c,c)
• Transitive: (a,c), (b,d), (a,d)
• Reflexive: (a,a), (b,b), (d,d)
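
The closure rules above can be sketched in code. This is a minimal illustration, not taken from the textbook: the class name `Closure` and the choice to store a pair (x,y) as the string "x,y" are assumptions made to keep the example short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

final class Closure {
    // Each pair (x,y) is stored as the string "x,y" (an assumption of this sketch).
    static Set<String> reflexiveTransitiveClosure(Set<String> r) {
        Set<String> result = new TreeSet<>(r);          // property 1: all pairs of R
        for (String p : r) {                            // property 3: reflexive pairs
            String[] xy = p.split(",");
            result.add(xy[0] + "," + xy[0]);
            result.add(xy[1] + "," + xy[1]);
        }
        boolean changed = true;
        while (changed) {                               // property 2: repeat until stable
            changed = false;
            for (String p : new ArrayList<>(result)) {
                for (String q : new ArrayList<>(result)) {
                    String[] ab = p.split(",");
                    String[] bc = q.split(",");
                    if (ab[1].equals(bc[0]) && result.add(ab[0] + "," + bc[1])) {
                        changed = true;
                    }
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> r1 = new HashSet<>(Arrays.asList("a,b", "c,d", "b,a", "b,c", "c,c"));
        System.out.println(reflexiveTransitiveClosure(r1));
    }
}
```

For R1 this yields the same eleven pairs as the solution above.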

SIMPLE GRAMMARS
• A grammar is a simple grammar IF
1. Every rule is of the form A → aα, where A is any nonterminal, a is any terminal, and α is any string of terminals and nonterminals
2. Every pair of rules defining the same nonterminal begins with different terminals on the right side of the arrow

SIMPLE GRAMMARS
• Example: consider the following grammars:
G9:          G10:         G11:
1. S → aSb   1. S → aSb   1. S → aSb
2. S → b     2. S → ε     2. S → a
• Which one is a simple grammar?
• Answer: G9. G10 has an epsilon rule, and in G11 both rules defining S begin with the same terminal, a.

SELECTION SET
• A parsing algorithm must decide which rule in the grammar has to be applied when parsing a string.
• HOW? By using selection sets
SELECTION SET: the set of input symbols which imply the application of a grammar rule
• In a simple grammar, the selection set of each rule is exactly one terminal symbol: the first one on the right-hand side

SELECTION SET
• What is the selection set for each of the rules in G9?
G9:          G11:
1. S → aSb   1. S → aSb
2. S → b     2. S → a
• Rule 1: { a }
• Rule 2: { b }
• In top-down parsing, rules defining the same nonterminal must have disjoint (non-intersecting) selection sets. G11 can't be used, because both of its rules have the selection set { a }.

SELECTION SET
• Example: abbaddd
G12:              Selection Sets
1. S → a b S d    1. { a }
2. S → b a S d    2. { b }
3. S → d          3. { d }
Derivation (each rule is selected by the next unread input symbol):
S ⇒ abSd      (rule 1, input a)
  ⇒ abbaSdd   (rule 2, input b)
  ⇒ abbaddd   (rule 3, input d)

EXTENDED PUSHDOWN MACHINE
CONSTRUCTION FOR SIMPLE GRAMMARS
1. Build a table with:
   a. Each column labeled by a terminal symbol (and endmarker ↵)
   b. Each row labeled by a nonterminal or terminal (and bottom marker ▽)
2. For each grammar rule of the form A → aα, fill in the cell in row A and column a with: Rep(α^r a), Retain, where α^r = α reversed
3. Fill in the cell in row a and column a with Pop, Advance for each terminal symbol a
4. Fill in the cell in row ▽ and column ↵ with Accept
5. Fill in all other cells with Reject
6. Initialize the stack with ▽ and the starting nonterminal

• Build an extended pushdown machine for G13:
G13:
1. S → aSB
2. S → b
3. B → a
4. B → bBa

      a                  b                  ↵
S     Rep(BSa), Retain   Rep(b), Retain     Reject
B     Rep(a), Retain     Rep(aBb), Retain   Reject
a     Pop, Advance       Reject             Reject
b     Reject             Pop, Advance       Reject
▽     Reject             Reject             Accept

• Once each cell of the pushdown machine table is filled in, the machine is driven by the top stack symbol and the current input symbol
• When the top stack symbol is a nonterminal, replace it with the symbols on the right side of the selected rule in reverse order, and retain the input pointer
• When the top stack symbol is a terminal, check that the current input symbol matches that stack symbol. If yes, pop the stack and advance the input pointer; if no, reject the input string
• When the end of the input string is encountered, the stack should be empty (except for ▽) for the string to be accepted.

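• Show the sequence of stacks for input string: aba
Each stack is written top first; the label on each arrow is the current input symbol.
S ▽ →(a) a S B ▽ →(a) S B ▽ →(b) b B ▽ →(b) B ▽ →(a) a ▽ →(a) ▽ →(↵) Accept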
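
The table for G13 can be turned into a small stack-driven loop. The following is a sketch under stated assumptions, not the textbook's code: the class name `G13Machine` is hypothetical, and the characters `$` and `#` stand in for the endmarker ↵ and the bottom marker ▽.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class G13Machine {
    static final char END = '$';     // assumed stand-in for the endmarker
    static final char BOTTOM = '#';  // assumed stand-in for the bottom marker

    static boolean accepts(String input) {
        String in = input + END;
        int pos = 0;
        Deque<Character> stack = new ArrayDeque<>();
        stack.push(BOTTOM);
        stack.push('S');                              // initial stack: S on top of the bottom marker
        while (true) {
            char top = stack.peek();
            char c = in.charAt(pos);
            if (top == BOTTOM) {                      // Accept only at the real end of input
                return c == END && pos == in.length() - 1;
            }
            if (top == 'a' || top == 'b') {           // terminal rows: Pop, Advance or Reject
                if (top != c) return false;
                stack.pop();
                pos++;
                continue;
            }
            String rep;                               // nonterminal rows: Rep(...), Retain
            if (top == 'S' && c == 'a') rep = "BSa";
            else if (top == 'S' && c == 'b') rep = "b";
            else if (top == 'B' && c == 'a') rep = "a";
            else if (top == 'B' && c == 'b') rep = "aBb";
            else return false;                        // every other cell is Reject
            stack.pop();
            for (char x : rep.toCharArray()) stack.push(x);  // last symbol ends up on top
        }
    }
}
```

Calling `G13Machine.accepts("aba")` walks through the same stacks as the sequence shown above.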

PARSER IMPLEMENTATION
• We now know that, from a grammar, we can write a program that accepts any string in the language of that grammar and rejects any string not in the language of that grammar
• There is software that can automatically generate a parser from a grammar. We call this kind of software a compiler-compiler
• An example of a compiler-compiler is SableCC

RECURSIVE DESCENT PARSER FOR SIMPLE GRAMMARS
• A second way of implementing a parser for simple grammars is to use a methodology known as recursive descent
Recursive Descent Parser: the parser is written using a traditional programming language such as Java or C++
• A method is written for each nonterminal in the grammar:
• it handles nonterminals by calling the corresponding method
• it handles terminals by reading another input symbol

RECURSIVE DESCENT PARSER FOR SIMPLE GRAMMARS
G13:
1. S → aSB
2. S → b
3. B → a
4. B → bBa

void S ()
{
    if (inp=='a')
    // apply rule 1
    {
        inp = getInp();
        S ();
        B ();
    } // end rule 1
    else
    if (inp=='b')
        inp = getInp();
        // apply rule 2
    else
        reject();
}

void B ()
{
    if (inp=='a')
        inp = getInp();
        // rule 3
    else
    if (inp=='b')
    // apply rule 4
    {
        inp = getInp();
        B();
        if (inp=='a')
            inp = getInp();
        else
            reject();
    } // end rule 4
    else
        reject();
}
QUASI-SIMPLE GRAMMARS
• A quasi-simple grammar is a grammar which obeys the restrictions of simple grammars, but may also contain rules of the form:
N → ε, where N is any nonterminal and ε is the empty string
• As long as all rules defining the same nonterminal have disjoint selection sets

• For example, G14 is a quasi-simple grammar

G14:
1. S → a A S
2. S → b
3. A → c A S
4. A → ε
• In order to do a top-down parse for this grammar, we again have to find the selection set for each rule
• To find the selection set for ε rules, we need to find the follow set

FOLLOW SETS
1. For a nonterminal A, Fol(A) = the set of all terminals that can appear immediately to the right of A in some partial derivation
2. The derivation starts from S↵, where S is the starting nonterminal
3. If A is the rightmost symbol in a derivation, then the endmarker (↵) is in Fol(A)
4. ε is never in a follow set

FOLLOW SETS
• For grammar G14, the follow sets for S and A are:
G14:
1. S → a A S
2. S → b
3. A → c A S
4. A → ε
S↵ ⇒ aAS↵ ⇒ acASS↵ ⇒ acASaAS↵
                   ⇒ acASb↵        Fol(S) = {a,b,↵}
S↵ ⇒ aAS↵ ⇒ aAaAS↵
          ⇒ aAb↵                   Fol(A) = {a,b}

QUASI-SIMPLE GRAMMARS: SELECTION SET
• The selection set for an ε rule is simply the follow set of the nonterminal on the left side of the arrow
G14:
1. S → a A S
2. S → b
3. A → c A S
4. A → ε
Fol(S) = {a,b,↵}
Fol(A) = {a,b}
• In G14, the selection set for rule 4 is:
• Sel(4) = Fol(A) = {a,b}

QUASI-SIMPLE GRAMMAR
• Example: acbb
G14: Selection Sets:

1. S → a A S 1. { a }
2. S → b 2. { b }
3. A → c A S 3. { c }
4. A → ε 4. { a,b }

QUASI-SIMPLE GRAMMAR
• Example: acbb
Derivation (each rule is selected by the next unread input symbol):
S ⇒ aAS     (rule 1, input a)
  ⇒ acASS   (rule 3, input c)
  ⇒ acSS    (rule 4, input b: A ⇒ ε)
  ⇒ acbS    (rule 2, input b)
  ⇒ acbb    (rule 2, input b)
EXTENDED PUSHDOWN MACHINES CONSTRUCTION FOR QUASI-SIMPLE GRAMMARS
1. Build a table with:
   a. Each column labeled by a terminal symbol (and endmarker ↵)
   b. Each row labeled by a nonterminal or terminal (and bottom marker ▽)
2. For each grammar rule of the form A → aα, fill in the cell in row A and column a with: Rep(α^r a), Retain, where α^r = α reversed
3. Fill in the cell in row a and column a with Pop, Advance for each terminal symbol a
4. Fill in the cell in row ▽ and column ↵ with Accept
5. For each ε rule in the grammar, fill in cells of the row corresponding to the nonterminal on the left side of the arrow, but only in those columns corresponding to elements of the follow set of the nonterminal. Fill in these cells with Pop, Retain
6. Fill in all other cells with Reject
7. Initialize the stack with ▽ and the starting nonterminal
EXTENDED PUSHDOWN MACHINES CONSTRUCTION FOR QUASI-SIMPLE GRAMMARS
G14:
1. S → a A S    Sel(1) = {a}
2. S → b        Sel(2) = {b}
3. A → c A S    Sel(3) = {c}
4. A → ε        Sel(4) = {a,b}

      a                  b                c                  ↵
S     Rep(SAa), Retain   Rep(b), Retain   Reject             Reject
A     Pop, Retain        Pop, Retain      Rep(SAc), Retain   Reject
a     Pop, Advance       Reject           Reject             Reject
b     Reject             Pop, Advance     Reject             Reject
c     Reject             Reject           Pop, Advance       Reject
▽     Reject             Reject           Reject             Accept
Initial stack: S ▽ (S on top)
• Recursive descent parsers for quasi-simple grammars are similar to those for simple grammars
• The only difference is that if the current input symbol is in the selection set of an ε rule, we simply return to the calling method without reading any input

void S ()
{
    if (inp=='a')
    // apply rule 1
    {
        inp = getInp();
        A ();
        S ();
    } // end rule 1
    else
    if (inp=='b')
        inp = getInp();
        // apply rule 2
    else
        reject();
}

void A ()
{
    if (inp=='c')
    {
        inp = getInp();
        // apply rule 3
        A();
        S();
    } // end rule 3
    else
    if (inp=='a'||inp=='b')
        ; // apply rule 4
    else
        reject();
}
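
To try the two methods above on real input, they can be wrapped in a small class. This is a hedged sketch rather than the textbook's program: the class name `G14Parser`, the `accepts` helper, the use of an exception for `reject()`, and the `$` character standing in for the endmarker ↵ are all assumptions.

```java
final class G14Parser {
    private static final char END = '$';   // assumed stand-in for the endmarker
    private final String in;
    private int pos = 0;
    private char inp;

    private G14Parser(String s) {
        in = s + END;
        inp = in.charAt(0);
    }

    // Read the next input symbol; only called while inp is a, b, or c.
    private void getInp() { inp = in.charAt(++pos); }

    private static void reject() { throw new IllegalStateException("reject"); }

    private void S() {
        if (inp == 'a') { getInp(); A(); S(); }      // rule 1: S -> a A S
        else if (inp == 'b') getInp();               // rule 2: S -> b
        else reject();
    }

    private void A() {
        if (inp == 'c') { getInp(); A(); S(); }      // rule 3: A -> c A S
        else if (inp == 'a' || inp == 'b') { }       // rule 4: A -> epsilon, read nothing
        else reject();
    }

    static boolean accepts(String s) {
        G14Parser p = new G14Parser(s);
        try {
            p.S();
            return p.pos == p.in.length() - 1;       // all input consumed up to the endmarker
        } catch (IllegalStateException e) {
            return false;
        }
    }
}
```

`G14Parser.accepts("acbb")` follows the same rule sequence (1, 3, 4, 2, 2) as the worked example for acbb.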

LL(1) GRAMMARS
• Grammars that can be parsed top down:
• Simple Grammar (A → aα)
• Quasi-simple Grammar (A → aα) with (N → ε)
• LL(1) Grammar (A → α)
• As with simple and quasi-simple grammars, we can construct a one-state pushdown machine parser or a recursive descent parser for an LL(1) grammar
• As long as any two rules defining the same nonterminal have disjoint selection sets

LL(1) GRAMMARS
• Called LL(1) Grammar because:
• The parser finds a left-most derivation when scanning the input from left to right if it can look ahead no more than one input symbol.
• We have to find the selection sets for the LL(1) Grammar
before we can construct pushdown machine / recursive descent
parser
• There are 12 steps that we have to do to find the selection set

LL(1) GRAMMARS
G15:
1. S → ABc
2. A → bA
3. A → ε
4. B → c
Step 1.
Find all nullable rules and nullable nonterminals:
a. Remove, temporarily, all rules containing a terminal.
b. All ε rules are nullable rules.
c. The nonterminal defined in a nullable rule is a nullable nonterminal.
d. All rules of the form A → B C D ..., where B, C, D, ... are all nullable nonterminals, are nullable rules, and A is then a nullable nonterminal.
e. A nonterminal is nullable if ε can be derived from it, and a rule is nullable if ε can be derived from its right side.
G15: Nullable rules: rule 3; Nullable nonterminals: A
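
Step 1 can be sketched as a fixed-point loop. This is an illustration under assumptions, not the textbook's code: the class name `Nullable` is hypothetical, each rule is encoded as a {left side, right side} pair of strings, single upper-case letters are nonterminals, and an empty right side denotes an ε rule.

```java
import java.util.Set;
import java.util.TreeSet;

final class Nullable {
    // rules[i] = {left side, right side}; "" as right side means an epsilon rule.
    static Set<Character> nullableNonterminals(String[][] rules) {
        Set<Character> nullable = new TreeSet<>();
        boolean changed = true;
        while (changed) {                       // repeat until no new nullable nonterminal is found
            changed = false;
            for (String[] rule : rules) {
                boolean allNullable = true;
                for (char x : rule[1].toCharArray()) {
                    // A terminal is never in the set, so any rule containing one is skipped.
                    if (!nullable.contains(x)) { allNullable = false; break; }
                }
                if (allNullable && nullable.add(rule[0].charAt(0))) changed = true;
            }
        }
        return nullable;
    }

    public static void main(String[] args) {
        String[][] g15 = { {"S", "ABc"}, {"A", "bA"}, {"A", ""}, {"B", "c"} };
        System.out.println(nullableNonterminals(g15));
    }
}
```

For G15 only A ends up in the set, matching the result above.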

LL(1) GRAMMARS
Step 2.
Compute the relation “Begins Directly With” for each nonterminal:
A BDW X if there is a rule A → α X β such that
• α is a nullable string (a string of nullable nonterminals).
• A represents a nonterminal.
• X represents a terminal or nonterminal.
• β represents any string of terminals and nonterminals.
G15:
S BDW A (from rule 1)
S BDW B (also from rule 1, because A is nullable)
A BDW b (from rule 2)
B BDW c (from rule 4)
LL(1) GRAMMARS
Step 3.
Compute the relation “Begins With”:
a. X BW Y if there is a string beginning with Y that can be derived from X.
b. BW is the reflexive transitive closure of BDW. In addition, BW should contain pairs of the form a BW a for each terminal a in the grammar.
For G15:
From BDW: S BW A, S BW B, A BW b, B BW c
Transitive: S BW b, S BW c
Reflexive: S BW S, A BW A, B BW B, b BW b, c BW c
LL(1) GRAMMARS
Step 4.
Compute the set of terminals "First(x)" for each symbol x in the
grammar.
a. At this point, we can find the set of all terminals which can
begin a sentential form when starting with a given symbol of the
grammar.
b. First(A) = set of all terminals b, such that A BW b for each
nonterminal A.
c. First(t) = {t} for each terminal.
LL(1) GRAMMARS
Step 4.
Compute the set of terminals "First(x)" for each symbol x in the
grammar.
For G15 –
First(S) = {b,c}
First(A) = {b}
First(B) = {c}
First(b) = {b}
First(c) = {c}
LL(1) GRAMMARS
Step 5.
Compute "First" of right side of each rule:
a. Compute the set of terminals which can begin a sentential form derivable from the right side of each rule.
First(XYZ...) = First(X)
                ∪ First(Y) if X is nullable
                ∪ First(Z) if Y is also nullable ...
b. Find the union of the First(x) sets for each symbol on the right side of a rule, but stop when reaching a non-nullable symbol.
For G15:
1. First(ABc) = First(A) ∪ First(B) = {b,c} (because A is nullable)
2. First(bA) = {b}
3. First(ε) = {}
4. First(c) = {c}
If the grammar contains no nullable rules, skip to step 12 at this point.
LL(1) GRAMMARS
Step 6. Compute the relation “Is Followed Directly By”:
• B FDB X if there is a rule of the form A → α B β X γ, where β is a string of nullable nonterminals, α and γ are strings of symbols, X is any symbol, and A and B are nonterminals.
For G15:
A FDB B (from rule 1)
B FDB c (from rule 1)
• If B were a nullable nonterminal we would also have A FDB c.

LL(1) GRAMMARS
Step 7.
Compute the relation “Is Direct End Of”:
X DEO A if there is a rule of the form A → α X β, where β is a string of nullable nonterminals, α is a string of symbols, and X is a single grammar symbol.
For G15:
c DEO S (from rule 1)
A DEO A (from rule 2)
b DEO A (from rule 2, since A is nullable)
c DEO B (from rule 4)
LL(1) GRAMMARS
Step 8.
Compute the relation “Is End Of”:
a. X EO Y if there is a string derived from Y that ends with X.
b. EO is the reflexive transitive closure of DEO.
c. EO should contain pairs of the form N EO N for each nullable nonterminal, N, in the grammar.
For G15:
From DEO: c EO S, A EO A, b EO A, c EO B
Transitive: (no transitive entries)
Reflexive: c EO c, S EO S, b EO b, B EO B
LL(1) GRAMMARS
Step 9.
Compute the relation “Is Followed By”:
• W FB Z if there is a string derived from S↵ in which W is immediately followed by Z.
If there are symbols X and Y such that
W EO X (Step 8)
X FDB Y (Step 6)
Y BW Z (Step 3)
then
W FB Z
For G15:
W EO X     X FDB Y    Y BW Z     W FB Z
A EO A     A FDB B    B BW B     A FB B
                      B BW c     A FB c
b EO A     A FDB B    B BW B     b FB B
                      B BW c     b FB c
B EO B     B FDB c    c BW c     B FB c
c EO B     B FDB c    c BW c     c FB c
LL(1) GRAMMARS
Step 10.
Extend the FB relation to include endmarker:
• A FB ↵ if A EO S, where A represents any nonterminal and S represents the starting nonterminal.
For G15:
S FB ↵ because S EO S
• There are now seven pairs in the FB relation for grammar G15.
LL(1) GRAMMARS
Step 11.
Compute the Follow Set for each nullable nonterminal:
• The follow set of any nonterminal A is the set of all terminals, t, for which A FB t.
Fol(A) = {t | A FB t}
• To find selection sets, we need to find follow sets for nullable nonterminals only.
For G15:
Fol(A) = {c}, since A is the only nullable nonterminal and A FB c.
LL(1) GRAMMARS
Step 12.
Compute the selection set for each rule:
i. A → α
if rule i is not a nullable rule, then
Sel(i) = First(α) (from Step 5)
if rule i is a nullable rule, then
Sel(i) = First(α) U Fol(A)
LL(1) GRAMMARS
Step 12.
Compute the selection set for each rule:
For G15 –
Sel(1) = First(ABc) = {b,c}
Sel(2) = First(bA) = {b}
Sel(3) = First(ε) U Fol(A) = {} U {c} = {c}
Sel(4) = First(c) = {c}
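
Once the selection sets are known, the LL(1) condition is a pairwise disjointness check. The sketch below is illustrative only: the class name `SelectionSets` and the encoding (one character per nonterminal, each selection set written as a string of one-character terminals) are assumptions, not part of the slides.

```java
final class SelectionSets {
    // lhs[i] is the nonterminal defined by rule i+1; sel[i] is that rule's
    // selection set written as a string of one-character terminals.
    static boolean isLL1(char[] lhs, String[] sel) {
        for (int i = 0; i < lhs.length; i++) {
            for (int j = i + 1; j < lhs.length; j++) {
                if (lhs[i] != lhs[j]) continue;          // only rules defining the same nonterminal
                for (char t : sel[i].toCharArray()) {
                    if (sel[j].indexOf(t) >= 0) return false;   // shared terminal: not disjoint
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // G15: Sel(1) = {b,c}, Sel(2) = {b}, Sel(3) = {c}, Sel(4) = {c}
        char[] lhs = { 'S', 'A', 'A', 'B' };
        String[] sel = { "bc", "b", "c", "c" };
        System.out.println(isLL1(lhs, sel));
    }
}
```

For G15 the only pair that needs checking is rules 2 and 3, whose selection sets {b} and {c} are disjoint.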
Pushdown machines for LL(1) grammars are constructed exactly as pushdown machines for quasi-simple grammars.
G15:
1. S → ABc
2. A → bA
3. A → ε
4. B → c

      b                  c                  ↵
S     Rep(cBA), Retain   Rep(cBA), Retain   Reject
A     Rep(Ab), Retain    Pop, Retain        Reject
B     Reject             Rep(c), Retain     Reject
b     Pop, Advance       Reject             Reject
c     Reject             Pop, Advance       Reject
▽     Reject             Reject             Accept
Initial stack: S ▽ (S on top)
Pushdown Machine for Grammar G15

Sequence of Stacks for Machine for Grammar G15. Input bcc↵
Each stack is written top first; the label on each arrow is the current input symbol.
S ▽ →(b) A B c ▽ →(b) b A B c ▽ →(b) A B c ▽ →(c) B c ▽ →(c) c c ▽ →(c) c ▽ →(c) ▽ →(↵) Accept

Recursive Descent for LL(1) Grammars
void S ()
{
    if (inp=='b' || inp=='c')
    // apply rule 1
    {
        A ();
        B ();
        if (inp=='c')
            inp=getInp();
        else
            reject();
    } // end rule 1
    else
        reject();
}
void A ()
{
    if (inp=='b')
    // apply rule 2
    {
        inp=getInp();
        A ();
    } // end rule 2
    else
    if (inp=='c')
        ; // apply rule 3
    else
        reject();
}
void B ()
{
if (inp=='c')
inp=getInp(); // apply rule 4
else
reject();
}
Dependency Graph for the Steps in the Algorithm for Finding Selection Sets
1. Find nullable rules and nullable nonterminals
2. Find “Begins Directly With” relation (BDW).
3. Find “Begins With” relation (BW).
4. Find “First(x)” for each symbol, x.
5. Find “First(n)” for the right side of each rule, n.
6. Find “Followed Directly By” relation (FDB).
7. Find “Is Direct End Of” relation (DEO).
8. Find “Is End Of” relation (EO).
9. Find “Is Followed By” relation (FB).
10. Extend FB to include endmarker.
11. Find Follow Set, Fol(A), for each nullable nonterminal, A.
12. Find Selection Set, Sel(n), for each rule, n.
[Figure: dependency graph with one node per step, numbered 1 to 12]
PARSING ARITHMETIC EXPRESSIONS TOP DOWN
• We now understand how to determine whether a grammar can be parsed top down and how to construct a top-down parser
• Now we can begin to study how to parse arithmetic expressions, which are used widely in programming languages

• Check if this grammar is LL(1):
G5:
1. Expr → Expr + Term
2. Expr → Term
3. Term → Term ∗ Factor
4. Term → Factor
5. Factor → ( Expr )
6. Factor → var
• The twelve steps algorithm
1. Nullable rule & nonterminal
• Nullable rules: none
• Nullable nonterminals: none

2. Begins Directly With Relation
• Expr BDW Expr
• Expr BDW Term
• Term BDW Term
• Term BDW Factor
• Factor BDW (
• Factor BDW var

3. Begins With Relation
• Expr BW Expr
• Expr BW Term
• Term BW Term
• Term BW Factor
• Factor BW (
• Factor BW var
• Expr BW Factor
• Expr BW (
• Expr BW var
• Term BW (
• Term BW var
• Factor BW Factor
• ( BW (
• var BW var
• ∗ BW ∗
• + BW +
• ) BW )
4. First(x)
• First(Expr) = {(,var}
• First(Term) = {(,var}
• First(Factor) = {(,var}
5. First( ) right side of each rule

1) First(Expr + Term) = {(,var}
2) First(Term) = {(,var}
3) First(Term ∗ Factor) = {(,var}
4) First(Factor) = {(,var}
5) First( ( Expr ) ) = {(}
6) First (var) = {var}

G5:
1. Expr → Expr + Term Sel(1) = {(,var}
2. Expr → Term Sel(2) = {(,var}
3. Term → Term ∗ Factor Sel(3) = {(,var}
4. Term → Factor Sel(4) = {(,var}
5. Factor → ( Expr ) Sel(5) = {(}
6. Factor → var Sel(6) = {var}
• This grammar is not LL(1) because rules 1 and 2 define the same nonterminal, Expr, and their selection sets intersect
• This is also true for rules 3 and 4


G5:
1. Expr → Expr + Term Sel(1) = {(,var}
2. Expr → Term Sel(2) = {(,var}
3. Term → Term ∗ Factor Sel(3) = {(,var}
4. Term → Factor Sel(4) = {(,var}
5. Factor → ( Expr ) Sel(5) = {(}
6. Factor → var Sel(6) = {var}
• Rules 1 and 3 both have a property known as left recursion:
• They are of the form: A → Aα

• The left recursion can be eliminated by rewriting the grammar as an equivalent grammar that does not have left recursion
• The offending rules might be of the form:
A → Aα
A → β
in which we assume that β is a string of terminals and nonterminals that does not begin with an A.
• The left recursion can be eliminated by introducing a new nonterminal, R, and rewriting the rules as:
A → βR
R → αR
R → ε
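
The rewriting above is mechanical enough to express as a small function. This is only a sketch: the class name `LeftRecursion`, the method `eliminate`, and the representation of rules as plain strings are assumptions made for illustration.

```java
final class LeftRecursion {
    // Given A -> A alpha and A -> beta, returns the three replacement rules
    // A -> beta R, R -> alpha R, R -> epsilon, using r as the new nonterminal.
    static String[] eliminate(String a, String alpha, String beta, String r) {
        return new String[] {
            a + " -> " + beta + " " + r,
            r + " -> " + alpha + " " + r,
            r + " -> \u03b5"                 // \u03b5 is the epsilon character
        };
    }

    public static void main(String[] args) {
        // Rules 1 and 2 of G5: Expr -> Expr + Term, Expr -> Term
        for (String rule : eliminate("Expr", "+ Term", "Term", "Elist")) {
            System.out.println(rule);
        }
    }
}
```

Applied to rules 1 and 2 of G5 with the new nonterminal Elist, this gives Expr → Term Elist, Elist → + Term Elist, and Elist → ε.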
Parsing Arithmetic Expressions Top Down
G16:
1. Expr → Term Elist
2. Elist → + Term Elist
3. Elist → ε
4. Term → Factor Tlist
5. Tlist → ∗ Factor Tlist
6. Tlist → ε
7. Factor → ( Expr )
8. Factor → var
Step 1.
Nullable rules: 3,6
Nullable nonterminals: Elist, Tlist
Step 2.
Expr BDW Term
Elist BDW +
Term BDW Factor
Tlist BDW ∗
Factor BDW (
Factor BDW var
Step 3.
From BDW:
Expr BW Term
Elist BW +
Term BW Factor
Tlist BW ∗
Factor BW (
Factor BW var
Transitive:
Expr BW Factor
Term BW (
Term BW var
Expr BW (
Expr BW var
Reflexive:
Expr BW Expr
Term BW Term
Factor BW Factor
Elist BW Elist
Tlist BW Tlist
+ BW +
∗ BW ∗
( BW (
var BW var
) BW )
Step 4.
First (Expr) = {(,var}
First (Elist) = {+}
First (Term) = {(,var}
First (Tlist) = {∗}
First (Factor) = {(,var}
Step 5.
1. First(Term Elist) = {(,var}
2. First(+ Term Elist) = {+}
3. First(ε) = {}
4. First(Factor Tlist) = {(,var}
5. First(∗ Factor Tlist) = {∗}
6. First(ε) = {}
7. First(( Expr )) = {(}
8. First(var) = {var}
Step 6.
Term FDB Elist
Factor FDB Tlist
Expr FDB )
Step 7.
Elist DEO Expr
Term DEO Expr
Elist DEO Elist
Term DEO Elist
Tlist DEO Term
Factor DEO Term
Tlist DEO Tlist
Factor DEO Tlist
) DEO Factor
var DEO Factor
Step 8.
From DEO:
Elist EO Expr
Term EO Expr
Elist EO Elist
Term EO Elist
Tlist EO Term
Factor EO Term
Tlist EO Tlist
Factor EO Tlist
) EO Factor
var EO Factor
Transitive:
Tlist EO Expr
Tlist EO Elist
Factor EO Expr
Factor EO Elist
) EO Term
) EO Tlist
) EO Expr
) EO Elist
var EO Term
var EO Tlist
var EO Expr
var EO Elist
Reflexive:
Expr EO Expr
Term EO Term
Factor EO Factor
) EO )
var EO var
+ EO +
∗ EO ∗
( EO (
Elist EO Elist
Tlist EO Tlist
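Step 9.
Each group combines W EO X (Step 8), X FDB Y (Step 6) and Y BW Z (Step 3) to give W FB Z.
• X = Term, Y = Elist (Term FDB Elist), with Elist BW + and Elist BW Elist:
  W EO Term for W = Tlist, Factor, var, Term, )
  so each such W FB + and W FB Elist; in particular Tlist FB +
• X = Factor, Y = Tlist (Factor FDB Tlist), with Tlist BW ∗ and Tlist BW Tlist:
  W EO Factor for W = ), var, Factor
  so each such W FB ∗ and W FB Tlist
• X = Expr, Y = ) (Expr FDB )), with ) BW ):
  Elist EO Expr gives Elist FB )
  Tlist EO Expr gives Tlist FB )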
Step 10.
Elist FB ↵
Term FB ↵
Expr FB ↵
Tlist FB ↵
Factor FB ↵
Step 11.
Fol (Elist) = {), ↵}
Fol (Tlist) = {+, ), ↵}
Step 12.
Sel(1) = First(Term Elist) = {(,var}
Sel(2) = First(+ Term Elist) = {+}
Sel(3) = Fol(Elist) = {), ↵}
Sel(4) = First(Factor Tlist) = {(,var}
Sel(5) = First(∗ Factor Tlist) = {∗}
Sel(6) = Fol(Tlist) = {+,), ↵}
Sel(7) = First( ( Expr ) ) = {(}
Sel(8) = First(var) = {var}
PUSHDOWN MACHINES FOR LL(1) GRAMMARS
G16:
1. Expr → Term Elist Sel(1) = {(,var}
2. Elist → + Term Elist Sel(2) = {+}
3. Elist → ε Sel(3) = {), ↵}
4. Term → Factor Tlist Sel(4) = {(,var}
5. Tlist → ∗ Factor Tlist Sel(5) = {∗}
6. Tlist → ε Sel(6) = {+,), ↵}
7. Factor → ( Expr ) Sel(7) = {(}
8. Factor → var Sel(8) = {var}
Since all rules defining the same nonterminal (rules 2 and 3, rules 5 and 6, and rules 7 and 8) have disjoint selection sets, the grammar G16 is an LL(1) grammar.
Pushdown machine table for G16. Columns: +, ∗, (, ), var, ↵. Any cell not listed is Reject.
Expr:    ( and var: Rep(Elist, Term), Retain
Elist:   +: Rep(Elist, Term, +), Retain;  ) and ↵: Pop, Retain
Term:    ( and var: Rep(Tlist, Factor), Retain
Tlist:   ∗: Rep(Tlist, Factor, ∗), Retain;  +, ) and ↵: Pop, Retain
Factor:  (: Rep( ), Expr, ( ), Retain;  var: Rep(var), Retain
+:       +: Pop, Advance
∗:       ∗: Pop, Advance
(:       (: Pop, Advance
):       ): Pop, Advance
var:     var: Pop, Advance
▽:       ↵: Accept
Initial stack: Expr ▽ (the starting nonterminal on top)
Recursive Descent for LL(1) Grammars
G16:
1. Expr → Term Elist
2. Elist → + Term Elist
3. Elist → ε
4. Term → Factor Tlist
5. Tlist → ∗ Factor Tlist
6. Tlist → ε
7. Factor → ( Expr )
8. Factor → var

void parse ()
{
    inp = getInp();
    Expr ();
    // Call start nonterminal
    if (inp=='\r')
        accept();
        // end of string marker
    else
        reject();
}

void Expr ()
{
    if (inp=='(' || inp=='v')
    // apply rule 1
    {
        Term ();
        Elist ();
    } // end rule 1
    else
        reject();
}

void Elist ()
{
    if (inp=='+')
    // apply rule 2
    {
        inp=getInp();
        Term ();
        Elist ();
    } // end rule 2
    else
    if (inp==')' || inp=='\r')
        ; // apply rule 3
    else
        reject ();
}

void Term ()
{
    if (inp=='(' || inp=='v')
    // apply rule 4
    {
        Factor ();
        Tlist ();
    } // end rule 4
    else
        reject();
}

void Tlist ()
{
    if (inp=='*')
    // apply rule 5
    {
        inp=getInp();
        Factor ();
        Tlist ();
    } // end rule 5
    else
    if (inp=='+' || inp==')' || inp=='\r')
        ; // apply rule 6
    else
        reject();
}

void Factor ()
{
    if (inp=='(')
    // apply rule 7
    {
        inp=getInp();
        Expr ();
        if (inp==')')
            inp=getInp();
        else
            reject();
    } // end rule 7
    else
    if (inp=='v')
        inp=getInp();
        // apply rule 8
    else
        reject();
}
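
The methods above rely on `inp`, `getInp()`, `accept()` and `reject()` being defined elsewhere. A self-contained variant might look like the sketch below. It is not the textbook's program: the class name `ExprParser`, the `accepts` helper, the exception used for rejection, and the `$` character in place of the `'\r'` end marker are assumptions. As in the slides, the single character `v` stands for the token var.

```java
final class ExprParser {
    private static final char END = '$';   // assumed stand-in for the end-of-string marker
    private final String in;
    private int pos = 0;
    private char inp;

    private ExprParser(String s) {
        in = s + END;
        inp = in.charAt(0);
    }

    // Only called while inp is one of + * ( ) v, so pos never passes the end marker.
    private void getInp() { inp = in.charAt(++pos); }

    private static void reject() { throw new IllegalStateException("reject"); }

    private void expr() {                                        // rule 1
        if (inp == '(' || inp == 'v') { term(); elist(); } else reject();
    }

    private void elist() {
        if (inp == '+') { getInp(); term(); elist(); }           // rule 2
        else if (inp == ')' || inp == END) { }                   // rule 3 (epsilon)
        else reject();
    }

    private void term() {                                        // rule 4
        if (inp == '(' || inp == 'v') { factor(); tlist(); } else reject();
    }

    private void tlist() {
        if (inp == '*') { getInp(); factor(); tlist(); }         // rule 5
        else if (inp == '+' || inp == ')' || inp == END) { }     // rule 6 (epsilon)
        else reject();
    }

    private void factor() {
        if (inp == '(') {                                        // rule 7
            getInp();
            expr();
            if (inp == ')') getInp(); else reject();
        } else if (inp == 'v') getInp();                         // rule 8
        else reject();
    }

    static boolean accepts(String s) {
        ExprParser p = new ExprParser(s);
        try {
            p.expr();
            return p.pos == p.in.length() - 1;   // stopped exactly on the end marker
        } catch (IllegalStateException e) {
            return false;
        }
    }
}
```

For example, `ExprParser.accepts("(v+v)*v")` exercises every rule of G16, while an input such as `"v+"` is rejected inside `term()` because the end marker is not in Sel(4).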
