
CS327 - Compilers

Parsing

Abhishek Bichhawat 07/02/2024


Question
Which of the following grammars are ambiguous?

1. S → SS | a | b
2. E → E + E | a
3. S → ε | Sa | Sb
4. E’ → -E’ | n | (E)
E → E’ | E’ + E
Question
Which of the following grammars are ambiguous?

1. S → SS | a | b
2. E → E + E | a
3. S → Sa | Sb
4. E’ → -E’ | n | (E)
E → E’ | E’ + E
Removing Ambiguity
S → if E then S
| if E then S’ else S

S’ → if E then S’ else S’

Str: if E1 then if E2 then S1 else S2


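Parse tree 1 (else attached to the outer if):
if
├─ E1
├─ if
│  ├─ E2
│  └─ S1
└─ S2

Parse tree 2 (else attached to the inner if):
if
├─ E1
└─ if
   ├─ E2
   ├─ S1
   └─ S2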
Removing Ambiguity
CFG:
E → id | E + E | E * E

Str: id * id + id

CFG’:
E’ → E’ + E | E
E → id | E * (E’) | (E’) * E
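One way to see that the first CFG is ambiguous is to count the parse trees of the example string. The sketch below is only an illustration, not part of the original slides: the function name `count_parse_trees` is hypothetical, and the counting works only for this particular grammar (E → id | E + E | E * E), where every tree splits the string at one top-level operator.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees of E → id | E + E | E * E for a token list."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways E can derive tokens[i:j].
        if j - i == 1:
            return 1 if tokens[i] == "id" else 0
        total = 0
        # Pick the operator used at the root; both sides must be non-empty.
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees(["id", "*", "id", "+", "id"]))
```

A count greater than 1 for some string means the grammar is ambiguous for that string.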
Removing Ambiguity
● No standard approach to removing ambiguity
● Cannot convert an ambiguous grammar to an unambiguous one automatically
● Ambiguity can sometimes be useful
● Can use precedence and associativity to remove ambiguity
precedence left ADD, SUBTRACT; // less precedence
precedence left TIMES, DIVIDE; // more precedence
Multiple Parse Trees

Associativity: Assume + is left associative


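Parse tree 1, (id + id) + id, the tree chosen when + is left-associative:
E
├─ E
│  ├─ E ── id
│  ├─ +
│  └─ E ── id
├─ +
└─ E ── id

Parse tree 2, id + (id + id):
E
├─ E ── id
├─ +
└─ E
   ├─ E ── id
   ├─ +
   └─ E ── id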
Multiple Parse Trees

Precedence: If * has more precedence than +


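Parse tree 1, id * (id + id):
E
├─ E ── id
├─ *
└─ E
   ├─ E ── id
   ├─ +
   └─ E ── id

Parse tree 2, (id * id) + id, the tree chosen when * has higher precedence than +:
E
├─ E
│  ├─ E ── id
│  ├─ *
│  └─ E ── id
├─ +
└─ E ── id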
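The effect of precedence and associativity declarations can be sketched with a small precedence-climbing parser. This is a swapped-in technique for illustration, not the mechanism a parser generator uses internally; the `PRECEDENCE` table and the function `parse` are hypothetical names, and trees are returned as nested tuples `(op, left, right)`.

```python
# Higher number = binds tighter; all operators are treated as left-associative.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def parse(tokens):
    """Parse a flat token list such as ["id", "*", "id", "+", "id"]."""
    pos = 0

    def parse_expr(min_prec):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] in PRECEDENCE:
            raise SyntaxError("operand expected")
        left = tokens[pos]
        pos += 1
        while (pos < len(tokens) and tokens[pos] in PRECEDENCE
               and PRECEDENCE[tokens[pos]] >= min_prec):
            op = tokens[pos]
            pos += 1
            # Requiring strictly higher precedence on the right-hand side
            # makes operators of equal precedence group to the left.
            right = parse_expr(PRECEDENCE[op] + 1)
            left = (op, left, right)
        return left

    tree = parse_expr(1)
    if pos != len(tokens):
        raise SyntaxError("unexpected token")
    return tree


print(parse(["id", "*", "id", "+", "id"]))
print(parse(["id", "+", "id", "+", "id"]))
```

With these settings only one tree is produced per string: the multiplication is grouped first, and repeated additions nest to the left.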
Abstract Syntax Trees
● Data structure that represents the parse tree for the compiler
○ Ignores some detail in the parse tree
Abstract Syntax Trees
CFG:
E → E * E | E + E | (E) | int

Str:
5 * (2 + 3)

Tokens:
<NUMBER, 5>;<MUL, *>;<OPAREN, (>;
<NUMBER, 2>;<PLUS, +>;<NUMBER, 3>;<CPAREN, )>;
Abstract Syntax Trees

Parse tree:
E
├─ E ── int, 5
├─ *
└─ E
   ├─ (
   ├─ E
   │  ├─ E ── int, 2
   │  ├─ +
   │  └─ E ── int, 3
   └─ )
Abstract Syntax Trees
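Abstract syntax tree:
Times
├─ int, 5
└─ Plus
   ├─ int, 2
   └─ int, 3

Parse tree (for comparison):
E
├─ E ── int, 5
├─ *
└─ E
   ├─ (
   ├─ E
   │  ├─ E ── int, 2
   │  ├─ +
   │  └─ E ── int, 3
   └─ )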
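The AST for 5 * (2 + 3) could be represented with a few node classes. The following is a minimal sketch, assuming node names `Num`, `Plus` and `Times` (the slides only show the labels Times, Plus and int) and a hypothetical `evaluate` helper to show how a later compiler phase might walk the tree.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Num:
    value: int


@dataclass
class Plus:
    left: "Expr"
    right: "Expr"


@dataclass
class Times:
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Plus, Times]


def evaluate(node):
    """Walk the AST bottom-up and compute its integer value."""
    if isinstance(node, Num):
        return node.value
    if isinstance(node, Plus):
        return evaluate(node.left) + evaluate(node.right)
    if isinstance(node, Times):
        return evaluate(node.left) * evaluate(node.right)
    raise TypeError(f"unknown node: {node!r}")


# AST for 5 * (2 + 3): the parentheses leave no node of their own.
ast = Times(Num(5), Plus(Num(2), Num(3)))
print(evaluate(ast))
```

The parentheses and the intermediate E nodes of the parse tree do not appear; the nesting of the nodes already records the grouping.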
Recursive Descent Parsing
● Top-down parsing approach
○ Start at the top when constructing parse tree
○ Grows from left to right
○ Start with the start-symbol and try the rules in order
● Consider the CFG:
E → T | E + T
T → int | (E)
● String: (5)
Recursive Descent Parsing
CFG:
E → T | E + T
T → int | (E)

String: (5)

Parse tree (first attempt: E → T, then T → int):
E
└─ T
   └─ int
Recursive Descent Parsing
CFG:
E → T | E + T
T → int | (E)

String: (5)

Parse tree (next attempt: T → (E), then E → T):
E
└─ T
   ├─ (
   ├─ E
   │  └─ T
   └─ )
Recursive Descent Parsing
CFG:
E → T | E + T
T → int | (E)

String: (5)

Parse tree (inner T → int matches 5):
E
└─ T
   ├─ (
   ├─ E
   │  └─ T
   │     └─ int
   └─ )
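The walk-through above can be sketched in code. This is a hedged illustration rather than the course's implementation: it uses Python generators to get a fully backtracking variant of recursive descent, the function names `parse_E`, `parse_T` and `accepts` are hypothetical, and the input is assumed to be already tokenized (so the string (5) becomes `["(", "int", ")"]`). Each function yields every position at which its non-terminal could end, trying the productions in the order they are written.

```python
def parse_E(tokens, pos):
    # Production 1: E → T
    yield from parse_T(tokens, pos)
    # Production 2: E → E + T (left-recursive: calls parse_E at the same pos)
    for mid in parse_E(tokens, pos):
        if mid < len(tokens) and tokens[mid] == "+":
            yield from parse_T(tokens, mid + 1)


def parse_T(tokens, pos):
    # Production 1: T → int
    if pos < len(tokens) and tokens[pos] == "int":
        yield pos + 1
    # Production 2: T → ( E )
    if pos < len(tokens) and tokens[pos] == "(":
        for mid in parse_E(tokens, pos + 1):
            if mid < len(tokens) and tokens[mid] == ")":
                yield mid + 1


def accepts(tokens):
    # Accept as soon as some derivation of E consumes the whole input.
    return any(end == len(tokens) for end in parse_E(tokens, 0))


print(accepts(["(", "int", ")"]))
```

Because the generators are lazy, the left-recursive production is only explored when the earlier alternatives do not consume the whole input. For an input that is not in the language, the search would keep re-entering `parse_E` at the same position without making progress, which is the left-recursion problem in practice.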
Recursive Descent Parsing
● Easy to implement
● Cannot backtrack once a production is successful
○ Works when only one production succeeds for a non-terminal
● Does not work for left-recursive grammars
S → Sa | b
● Fix: eliminate left recursion
Eliminate Left Recursion
Suppose we have a left-recursive grammar:
S → Sa | b

Every derived string starts with the terminal b, followed by zero or more a's.

Equivalent grammar with right-recursion:


S → bS’
S’ → aS’ | ε
Eliminate Left Recursion
● In general, if
S → Sa1 | … | San | b1 | … | bm
● All strings derived from S start with one of b1, … , bm and
continue with zero or more instances of a1, … , an
● Rewrite as
S → b1S’| … | bmS’
S’→ a1S’| … | anS’ | ε
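The rewrite rule above can be expressed as a short function. This is a minimal sketch under stated assumptions: it handles only immediate left recursion for a single non-terminal, a production is represented as a list of symbols with the empty list standing for ε, and the name `eliminate_left_recursion` and the `'` suffix for the new non-terminal are choices made for this example.

```python
def eliminate_left_recursion(nonterminal, productions):
    """Rewrite S → S a1 | … | S an | b1 | … | bm without left recursion."""
    new_nt = nonterminal + "'"
    # The a_i: what follows S in each left-recursive production.
    alphas = [p[1:] for p in productions if p and p[0] == nonterminal]
    # The b_j: productions that do not start with S.
    betas = [p for p in productions if not p or p[0] != nonterminal]
    if not alphas:
        return {nonterminal: productions}  # nothing to do
    return {
        nonterminal: [beta + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[]],  # [] is ε
    }


# S → Sa | b   becomes   S → bS',  S' → aS' | ε
print(eliminate_left_recursion("S", [["S", "a"], ["b"]]))
```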
Predictive Parser
● Similar to recursive descent but predicts which production to use
○ Looks at the next few tokens (lookahead)
○ No backtracking
● Accepts LL(k) grammars
○ L: left-to-right scanning of input
○ L: leftmost derivation
○ k: lookahead of k tokens
LL(1) Parser
● Looks ahead 1 token
● Only one choice of production at every step
○ Unique production or no production given the next token
● No backtracking

● Suppose we have the grammar:

E → T + E | T
T → int | int * T | (E)
LL(1) Parser - Left factoring
E → T + E | T
T → int | int * T | (E)

Eliminate the common prefixes of multiple productions for one non-terminal.

E → TA
A → + E | ε

T → int B | (E)
B → * T | ε
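Once the grammar is left-factored, one token of lookahead is enough to pick a production. The sketch below is one possible predictive recognizer for the factored grammar, assuming pre-tokenized input and a `"$"` end marker; the class `Parser` and the function `accepts` are hypothetical names for this example.

```python
class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]  # "$" marks end of input
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def eat(self, expected):
        if self.peek() != expected:
            raise SyntaxError(f"expected {expected!r}, got {self.peek()!r}")
        self.pos += 1

    def E(self):  # E → T A
        self.T()
        self.A()

    def A(self):  # A → + E | ε
        if self.peek() == "+":
            self.eat("+")
            self.E()
        # otherwise choose ε and consume nothing

    def T(self):  # T → int B | ( E )
        if self.peek() == "int":
            self.eat("int")
            self.B()
        elif self.peek() == "(":
            self.eat("(")
            self.E()
            self.eat(")")
        else:
            raise SyntaxError(f"unexpected token {self.peek()!r}")

    def B(self):  # B → * T | ε
        if self.peek() == "*":
            self.eat("*")
            self.T()


def accepts(tokens):
    parser = Parser(tokens)
    try:
        parser.E()
        parser.eat("$")
        return True
    except SyntaxError:
        return False


print(accepts(["int", "*", "int", "+", "(", "int", ")"]))
```

Each method inspects only the next token and never backtracks, which is what the factoring into A and B makes possible.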
LL(1) Parser - Example
S → if E then S
| if E then S else S

S → if E then S S’
S’ → ε | else S
