You are on page 1of 67

Prof.

Radu Prodan

SYNTACTIC ANALYSIS
TOP-DOWN PARSING
17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 1
Overview
▪ Top-down parsing
– Starts with start symbol and follows leftmost derivation steps
– Traverses parse tree in pre-order from root to leaves

▪ Predictive parsers
– Choose next grammar rule using one or more lookahead tokens

▪ Backtracking parsers
– Try different grammar rule possibilities
– Back up in input if one possibility fails
– Slow and unsuitable for practical compilers

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 2


Predictive Top-Down Parsing
▪ Recursive-descendant parsing
– Ad-hoc, handwritten for each input grammar

▪ LL(1) parsing
– Automatically-generated
– Process input from Left to right, builds a Leftmost derivation
and uses 1 lookahead symbol

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 3


Agenda
▪ Recursive-descendant parsing

▪ LL(1) parsing

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 4


Recursive Descendent Parsing
▪ Nonterminals are parsed by a separate procedure
– Calls other parsing procedures in correct sequence given by
body of its BNF definition

▪ Terminals are parsed by a match procedure


– Receives expected token parameter as input
– Checks if next input token is identical with expected token
parameter and consumes it if it succeeds
– Gives an error if not

▪ One global lookahead variable keeps next input token


17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 5
Arithmetic Expression Grammar
TokenType token ; exp → exp addop term | term
addop → + | –
procedure factor () ; term → term mulop factor | factor
begin mulop → *
case token of factor → ( exp ) | number
( : match ( ( ) ;
exp () ;
match ( ) ) ; procedure match ( expectedToken ) ;
number : match (number) ; begin
else error () ; if token = expectedToken then
end case ; getToken () ;
end factor ; else
error () ;
end if ;
end match ;

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 6


Arithmetic Expression Grammar (2)
▪ exp → exp addop term | term procedure exp ;
– Calling first exp leads to immediate recursive begin
loop term () ;
– exp and term can begin with same tokens: while token = + or
number or ( token = – do
match (token) ;
term () ;
end while ;
▪ Translate grammar into EBNF end exp ;

procedure term ;
exp → term { addop term }
begin
term → factor { mulop factor } factor () ;
while token = * do
▪ Eliminate addop and mulop nonterminals match (token) ;
that only match tokens (operators) factor ();
end while ;
end term ;

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 7


Arithmetic Expression Calculation
function exp: integer ; exp → term { addop term }
var temp: integer ; term → factor { mulop factor }
begin
temp := term () ;
while token = + or token = – do
match (token) ;
case token of
+ : temp := temp + term () ;
– : temp := temp – term () ;
end case ;
end while ;
return temp ;
end exp ;
▪ Left associativity implied in
EBNF definition
17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 8
Syntax Tree for Arithmetic
Expressions
function exp : syntaxTree ;
var temp, newtemp : syntaxTree ;
begin
temp := term () ;
+
while token = + or token = – do
match (token) ; + 5
newtemp := makeOpNode(token) ;
leftChild(newtemp) := temp ; 3 4
rightChild(newtemp) := term () ;
temp := newtemp ;
end while ;
return temp ;
end exp ; exp → term { addop term }
term → factor { mulop factor }
17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 9
If Statement Grammar
if-stmt → if ( exp ) statement procedure ifStmt ;
| if ( exp ) statement else statement begin
match (if) ;
match ( ( ) ;
▪ EBNF grammar exp () ;
match ( ) ) ;
statement () ;
if-stmt → if ( exp ) statement
if token = else then
[ else statement ]
match (else) ;
statement () ;
end if ;
▪ Parser uses most closely nested end ifStmt ;
disambiguating rule

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 10


Syntax Tree for If Statement
function ifStatement : syntaxTree ;
var temp : syntaxTree ;
begin
match (if) ;
if
match ( ( ) ;
temp := makeStmtNode(if) ; exp statement statement
testChild(temp) := exp () ;
match ( ) ) ;
thenChild(temp) := statement () ;
if token = else then
match (else) ;
elseChild(temp) := statement () ;
else elseChild(temp) := nil ;
end if ;
end ifStatement ;

if-stmt → if ( exp ) statement [ else statement ]

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 11


Recursive Descendent Parsing
Problems
▪ It may be difficult to convert a BNF grammar into EBNF
– Solution: left recursion removal

▪ Predictive parser that needs only one lookahead


character
– Solution: left factoring

▪ Recursive-descendent parsers are powerful but ad-hoc


and handwritten
– Solution: automatic LL parser generator

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 12


Agenda
▪ Recursive-descendant parsing
– Left recursion removal
– Left factoring

▪ LL(1) parsing

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 13


Left Recursion Removal
▪ Immediate left recursion

▪ A → A  | 
–  and  are strings of terminals and nonterminals
–  does not begin with A
– L(G) = {  n | n  0 }

▪ Equivalent grammar that uses right recursion


A →  A
A →  A | 

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 14


Immediate Left Recursion Removal
▪ Left recursive grammar
A → A1 | A2 | ... | An | 1 | 2 | ... | m

– 1, 2, ..., m do not begin with an A

▪ Removed left recursion


A → 1 A | 2 A | . . . | m A
A → 1 A | 2 A | . . . | n A | 

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 15


Indirect Left Recursion Removal
A → B a | c
B → A b | d

▪ Transform all indirect left recursions into immediate left recursions

▪ Choose an arbitrary order of nonterminals A1, …, Am

▪ Eliminate all rules of form Ai → Aj , with j  i


– Replace Aj by its definition

Indirect Left Direct Left Recursive Right Recursive


Recursive
A1 → A2 a | c A1 → A2 a | c A1 → A2 a | c
A2 → A1 b | d A2 → A2 a b | c b | d A2 → c b A2 | d A2
A2 → a b A2 | 
17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 16
Indirect Left Recursion Removal
Algorithm
(* for all nonterminals in a well defined ranking *)
for i:= 1 to m do
(* for all nonterminal with a smaller rank *)
for j:= 1 to i–1 do
Replace each grammar rule Ai → Aj  by rule
Ai → 1  | 2  | . . . | k ,
where Aj → 1 | 2 | . . . | k
Eliminate direct left recursions of Ai

▪ No cycles and -productions


17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 17
Indirect Left Recursion Removal
Example
A1 → A2 a | A1 a | c
A2 → A2 b | A1 b | d

▪ Left recursion does not change language, but changes grammar


▪ Changes parse trees and complicates parser
Outer loop Inner loop Action Grammar
i=1 Inner loop Remove A1 → A2 a A1 | c A1
does not immediate left A1 → a A1 | 
execute recursion on A1 A2 → A2 b | A1 b | d
i=2 j=1 Eliminate rule A1 → A2 a A1 | c A1
A2 → A1 b A1 → a A1 | 
A2 → A2 b | A2 a A1 b | c A1 b | d
i=2 Inner loop Remove left A1 → A2 a A1 | c A1
done recursion on A2 A1 → a A1 | 
A2 → c A1 b A2 | d A2
A2 → b A2 | a A1 b A2 | 
17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 18
Arithmetic Expression Grammar
▪ Left recursive grammar
exp → exp + term | exp – term | term

▪ Right recursive grammar


exp → term exp
exp → + term exp | – term exp | 

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 19


Right Recursive Expression Parser
Left Recursive Grammar Equivalent Right Recursive Grammar
exp → exp addop term | term exp → term exp
addop → + | – exp → addop term exp | 
term → term multop factor | factor addop → + | –
mulop → * term → factor term
factor → ( exp ) | number term → mulop factor term | 
mulop → *
factor → ( exp ) | number
procedure exp ; procedure exp ;
begin begin
term () ; case token of
exp () ; + : match (+) ;
end exp ; term () ;
exp () ;
– : match (–) ;
term () ;
exp () ;
end case ;
end exp ;

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 20


Loss of Left Associativity
▪ Parse tree for 3 – 4 – 5
exp
term exp

factor term addop term exp

number  – factor term addop term exp


(3)
number  – factor term 
Expression Grammar (4)
exp → term exp number 
exp → addop term exp |  (5)
addop → + | –
term → factor term
term → mulop factor term | 
mulop → *
factor → ( exp ) | number
17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 21
Left Recursive Parser
Expression Grammar
function exp : integer ; exp → term exp
var temp : integer ; exp → addop term exp | 
begin addop → + | –
temp := term () ; term → factor term
return exp (temp) ; term → mulop factor term | 
end exp ; mulop → *
factor → ( exp ) | number
function exp (valsofar : integer) : integer ;
begin
if token = + or token = – then
match (token) ; ▪ 3–4–5
case token of
+ : valsofar := valsofar + term () ;
– : valsofar := valsofar – term () ;
end case ; –
return exp (valsofar) ;
else return valsofar ; – 5
end exp ; 3 4

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 22


Agenda
▪ Recursive-descendant parsing
– Left recursion removal
– Left factoring

▪ LL(1) parsing

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 23


Left Factoring
▪ Two or more grammar rule choices share a common prefix
string
– A →   |  

▪ More than one lookahead character necessary

▪ Rewrite rule as two rules with  as common factor


– A →  A
– A →  | 

▪ Longest common substring  in different non-terminal


definitions
17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 24
Arithmetic Expression Grammar
exp → term + exp | term

▪ After left factoring


exp → term exp
exp → + exp | 

▪ Replacing exp with term exp in second rule gives


identical results as after left recursion removal
exp → term exp
exp → + term exp | 

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 25


Grammar of If Statements
if-stmt → if ( exp ) statement
| if ( exp ) statement else statement

▪ After left factoring


if-stmt → if ( exp ) statement else-part
else-part → else statement | 

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 26


Left Factoring Algorithm
while there are changes to grammar do
for  A  N  A → 1 | 2 | … | n  P do
let  be a prefix of maximum length that is
shared by two or more production
choices for A
if    then
suppose that 1, …, k share , so that
A →  1 | … |  k | k+1 | … | n,
j’s share no common prefix (j[1..k])
and k+1, …, n do not share 
replace rule A → 1 | 2 | … | n
by rules:
A →  A | k+1 | … | n
A → 1 | … | k

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 27


Grammar of Statement Sequences
▪ Right recursive form
stmt-sequence → stmt ; stmt-sequence | stmt
stmt → s

▪ After left factoring


stmt-sequence → stmt stmt-seq
stmt-seq → ; stmt-sequence | 

▪ Left recursive form


stmt-sequence → stmt-sequence ; stmt | stmt
stmt → s

▪ Left recursion removal


stmt-sequence → stmt stmt-seq
stmt-seq → ; stmt stmt-seq | 
17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 28
Agenda
▪ Recursive-descendant parsing
– Left recursion removal
– Left factoring

▪ LL(1) parsing

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 29


LL(1) Parsing Overview
▪ Requires a right recursive and left factored grammar

▪ Uses an explicit stack instead of recursive calls

▪ Mark bottom of stack with dollar ($) character

▪ Match a token on top of the stack with next input token

▪ Generate replaces a nonterminal A at top of stack by string  using grammar


rule A → 
–  pushed onto stack in reversed order of symbols
Parsing stack Input Action
1 $ Start symbol Input string
. . . . . .
. . . . . .
$ $ accept
17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 30
Balanced Parentheses Grammar
S → ( S ) S Parsing Input Action
|  Stack
1 $ S ( ) $ S → ( S ) S
2 $ S ) S ( ( ) $ match
3 $ S ) S ) $ S → 
4 $ S ) ) $ match
5 $ S $ S → 
6 $ $ accept

M[N, T] ( ) $
S S → ( S ) S S →  S → 

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 31


If-Statement Grammar
statement → if-stmt | other
if-stmt → if ( exp ) statement else-part
else-part → else statement | 
exp → 0 | 1

M[N, T] if other else 0 1 $


statement → statement
statement
if-stmt → other
if-stmt →
if ( exp )
if-stmt
statement
else-part
else-part →
else statement else-part
else-part
→ 
else-part → 
exp exp → 0 exp → 1
17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 32
LL(1) Parsing Actions for
Grammar of if-Statements
Parsing Stack Input Action
$ statement if(0) if(1) other else other $ statement → if-stmt
if-stmt → if ( exp )
$ if-stmt if(0) if(1) other else other $
statement else-part
$ else-part statement ) exp ( if if(0) if(1) other else other $ match
$ else-part statement ) exp ( (0) if(1) other else other $ match
$ else-part statement ) exp 0) if(1) other else other $ exp → 0
$ else-part statement ) 0 0) if(1) other else other $ match
$ else-part statement ) ) if(1) other else other $ match
$ else-part statement if(1) other else other $ statement → if-stmt
if-stmt → if ( exp )
$ else-part if-stmt if(1) other else other $
statement else-part
$ else-part else-part statement ) exp ( if if(1) other else other $ match
$ else-part else-part statement ) exp ( (1) other else other $ match
$ else-part else-part statement ) exp 1) other else other $ exp → 1
$ else-part else-part statement ) 1 1) other else other $ match
$ else-part else-part statement ) ) other else other $ match
$ else-part else-part statement other else other $ statement → other
$ else-part else-part other other else other $ match
$ else-part else-part else other $ else-part → else statement
$ else-part statement else else other $ match
$ else-part statement other $ statement → other
$ else-part other other $ match
$ else-part $ else-part → 
$ $ accept

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 33


LL(1) Parsing Algorithm
(* assumes $ marks bottom of stack and end of input *)
while top(parsing stack)  $  token  $ do
if top(parsing stack) = a  T  token = a
then (* match *)
pop(a, parsing stack) ;
token = getToken() ;
else if top(parsing stack) = A  N  token = a  T 
 A → X1X2 … Xn  M[A, a]
then (* generate *)
pop(A, parsing stack) ;
???
for i := n downto 1 do
push(Xi, parsing stack) ;
else error ;
if top(parsing stack) = $  token = $
then accept
else error ;
17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 34
LL(1) Parsing Table
▪ Context-free grammar: G = (T, N, P, S)

▪ Parsing table indexed by nonterminals and terminals which


contains production rules to use when
– Nonterminal is on top of stack
– Terminal is next in input

▪ A production (A → )  M[A, a] in two cases:


– (  * a)  a  T
•  starts with terminal a: a  First()
– (  * )  (S$ * Aa)  a  T  $
• A is followed by terminal a if it can disappear: a  Follow(A)

M[N, T] ( ) $
S → ( S ) S | 
S S → ( S ) S S →  S → 
17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 35
LL(1) Parsing Table Construction Algorithm
for  A  N   A →   P do
for  a  First() do
M[A, a] = M[A, a]  { A →  }
if   First() then
for  a  Follow(A) do
M[A, a] = M[A, a]  { A →  }

▪ If (A →   P)  (  * a)  (a  T)  M[A, a] = M[A, a]  { A →  }


– a  First()

▪ If (A →   P)  (  * )  (S $ *  A a )  (a  T  $) 
 M[A, a] = M[A, a]  { A →  }
– a  Follow(A)
17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 36
Agenda
▪ Recursive-descendant parsing
– Left recursion removal
– Left factoring

▪ LL(1) parsing
– FIRST sets
– FOLLOW sets
– Parsing table
– LL(1) grammars

▪ Error recovery

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 37


First Sets
▪ G = (T, N, P, S)

▪ XTN

▪ Set First(X)  T   is defined as follows:


– If X  T    First(X) = { X }
– If X  N, then  X → X1 X2 … Xn  P 
• First(X1) –   First(X)
• If   First(X1)  …    First(Xi)  i < n  First(Xi+1) – {  }  First(X)
• If   First(X1)  …    First(Xn)    First(X)

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 38


Integer Expression Grammar:
First Sets Computation
exp → exp addop term | term addop → + | –
term → term mulop factor | factor mulop → *
factor → ( exp ) | number

Grammar Rule Iteration 1 Iteration 2 Iteration 3


exp → exp
addop term
exp → term First(exp) = { (, number }
addop → + First(addop) = { + }
addop → – First(addop) = { +, – }
term → term
mulop factor
term → factor First(term) = { (, number }
mulop → * First(mulop) = { * }
factor → ( exp ) First(factor) = { ( }
factor → number First(factor) = { (, number }
17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 39
Statement Sequence Grammar:
First Sets Computation
▪ Left recursive
stmt-sequence → stmt-sequence ; stmt | stmt
stmt → s

▪ Left factored right recursive


stmt-sequence → stmt stmt-seq
stmt-seq → ; stmt-sequence | 
stmt → s

Grammar Productions Iteration 1 Iteration 2


stmt-sequence → First(stmt-sequence) =
stmt stmt-seq ={s}
stmt-seq → ; stmt-sequence First(stmt-seq) = { ; }
stmt-seq →  First(stmt-seq) = { ;, }
stmt → s First(stmt) = { s }
17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 40
If-Statement Grammar:
First Sets Computation
statement → if-stmt | other
if-stmt → if ( exp ) statement else-part
else-part → else statement | 
exp → 0 | 1

Grammar Rule Iteration 1 Iteration 2


First(statement) =
statement → if-stmt
= { if, other }
statement → other First(statement) = { other }
if-stmt → if ( exp ) statement
First(if-stmt) = { if }
else-part
else-part → else statement First(else-part) = { else }
else-part →  First(else-part) = { else,  }
exp → 0 First(exp) = { 0 }
exp → 1 First(exp) = { 0, 1 }

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 41


First Set Computation Algorithm
for  A  N do
First(A) :=  ;

while there are changes to any First(A) do


for  A → X1 X2 … Xn do
k := 1 ;
continue := true ;
while continue = true  k  n do
First(A) := First(A)  First(Xk) – {  } ;
if   First(Xk) then
continue := false ;
k := k + 1 ;
if continue = true then
First(A) := First(A)  {  } ;

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 42


Agenda
▪ Recursive-descendant parsing
– Left recursion removal
– Left factoring

▪ LL(1) parsing
– FIRST sets
– FOLLOW sets
– Parsing table
– LL(1) grammars

▪ Error recovery

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 43


Follow Sets
▪ G = (T, N, P, S) and A  N

▪ Set Follow(A)  T  $ is defined as follows:


– If A = S  $  Follow(A)
– If ( B →  A   P)  First() –   Follow(A)
– If ( B →  A   P)    First()
 Follow(B)  Follow(A)
• B →  A is a common special case

▪  is never an element of Follow set

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 44


Simple Expression Grammar:
Follow Sets Computation

Grammar Rule Iteration 1 Iteration 2


Follow(exp) = $  First(addop) = { $, +, – }
exp → exp Follow(term) = Follow(exp) =
Follow(addop) = First(term) = { (, number }
addop term = { $, +, –, ) }
Follow(term) = Follow(exp) = { $, +, – }
Follow(term) = Follow(exp) =
exp → term Follow(term) = Follow(exp) = { $, +, – }
= { $, +, –, ) }
Follow(term) = First(mulop) = { $, +, –, * }
term → term Follow(factor) = Follow(term) =
Follow(mulop) = First(factor) = { (, number }
mulop factor = { $, +, –, *, ) }
Follow(factor) = Follow(term) = { $, +, –, * }
Follow(factor) = Follow(term) =
term → factor Follow(factor) = Follow(term) = { $, +, –, * }
= { $, +, –, *, ) }
factor → ( exp ) Follow(exp) = First( ) ) = { $, +, –, ) }

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 45


Statement Sequence Grammar:
Follow Sets Computation

Grammar Rule Iteration 1


Follow(stmt-sequence) = { $ }
stmt-sequence → Follow(stmt) = First(stmt-seq) – {  } = { ; }
stmt stmt-seq Follow(stmt) = Follow(stmt-sequence) = { ;, $ }
Follow(stmt-seq) = Follow(stmt-sequence) = { $ }
stmt-seq → ; stmt-sequence Follow(stmt-sequence) = Follow(stmt-seq) = { $ }

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 46


If-Statement Grammar:
Follow Sets Computation
Grammar Rule Iteration 1 Iteration 2
statement → Follow(statement) = { $ } Follow(if-stmt) =
if-stmt Follow(if-stmt) = Follow(statement) = { $ } Follow(statement) = { $, else }
if-stmt → Follow(exp) = First( ) ) = { ) } Follow(statement) =
if ( exp ) Follow(statement) = First(else-part) – {  } = Follow(if-stmt) = { $, else }
statement = { $, else }
else-part Follow(statement) = Follow(if-stmt) = Follow(else-part) =
= { $, else } Follow(if-stmt) = { $, else }
Follow(else-part) = Follow(if-stmt) = { $ }
else-part → Follow(statement) = Follow(else-part) = Follow(statement) =
else statement = { $, else } Follow(else-part) = { $, else }
exp → 0 | 1

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 47


Follow Set Computation Algorithm
Follow(S) := $ ;
for  A  N – { S } do
Follow(A) :=  ;

while there are changes to any Follow sets do


for  A → X1 X2 … Xn  P do
for  i[1..n] do
Follow(Xi) := Follow(Xi) 
First(Xi+1 Xi+2 … Xn) – {  } ;
(* Note: if i=n, then Xi+1 Xi+2 … Xn =  *)
if   First(Xi+1 Xi+2 … Xn) then
Follow(Xi) := Follow(Xi)  Follow(A) ;

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 48


Agenda
▪ Recursive-descendant parsing
– Left recursion removal
– Left factoring

▪ LL(1) parsing
– FIRST sets
– FOLLOW sets
– Parsing table
– LL(1) grammars

▪ Error recovery

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 49


Simple Expression Grammar:
Parsing Table
Grammar Rule First Set Follow Set
exp → exp addop term | term First(exp) = { (, number } Follow(exp) = { $, +, –, ) }
addop → + | – First(addop) = { +, – } Follow(addop) = { (, number }
term → term mulop factor | factor First(term) = { (, number } Follow(term) = { $, +, –, ) }
mulop → * First(mulop) = { * } Follow(mulop) = { (, number }
factor → ( exp ) | number First(factor) = { (, number } Follow(factor) = { $, +, –, *, ) }

M[N,T] ( number ) + – * $
exp → exp addop exp → exp addop
exp
term | term term | term
addop addop → + addop → –
term → term mulop term → term mulop
term
factor | factor factor | factor
mulop mulop → *
factor factor → ( exp ) factor → number

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 50


Statement Sequence Grammar:
Parsing Table

Grammar Rule First Set Follow Set


stmt-sequence → stmt stmt-seq First(stmt-sequence) = { s } Follow(stmt-sequence) = { $ }
stmt-seq → ; stmt-sequence |  First(stmt-seq) = { ;,  } Follow(stmt-seq) = { $ }
stmt → s First(stmt) = { s } Follow(stmt) = { ;, $ }

M[N, T] s ; $
stmt-sequence →
stmt-sequence
stmt stmt-seq
stmt-seq →
stmt-seq stmt-seq → 
; stmt-sequence
stmt stmt → s

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 51


If-Statement Grammar:
Parsing Table
Grammar Rule First Set Follow Sets
statement→ if-stmt | other First(statement) = { if, other } Follow(statement) = { $, else }
if-stmt → if ( exp )
First(if-stmt) = { if } Follow(if-stmt) = { $, else }
statement else-part
else-part →
First(else-part) = { else,  } Follow(else-part) = { $, else }
else statement | 
exp → 0 | 1 First(exp) = { 0, 1 } Follow(exp) = { ) }

M[N, T] if other else 0 1 $


statement
statement statement → if-stmt
→ other
if-stmt → if ( exp )
if-stmt
statement else-part
else-part →
else statement else-part
else-part
→ 
else-part → 
exp exp → 0 exp → 1

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 52


Expression Grammar:
First Sets Computation
Grammar Rule Iteration 1 Iteration 2 Iteration 3
First(exp) = First(term)
exp → term exp
= { (, number }
exp → First(exp) =
addop term exp First(addop) = { +, –,  }
exp →  First(exp) = {  }
addop → + First(addop) = { + }
addop → – First(addop) = { +, – }
term → First(term) = First(factor) =
factor term = { (, number }
term → multop First(term) =
factor term First(mulop) = { *,  }
term →  First(term) = {  }
mulop → * First(mulop) = { * }
factor → ( exp ) First(factor) = { ( }
factor → number First(factor) = { (, number }

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 53


Expression Grammar:
Follow Sets Computation
Grammar Rule Iteration 1 Iteration 2
Follow(exp) = { $ }
Follow(exp) =
exp → term exp Follow(term) = First(exp) = { +, – }
Follow(exp) = { $, ) }
Follow(exp) = Follow(exp) = { $ }
Follow(addop) = First(term) = { (, number }
exp → Follow(term) =
Follow(term) = First(exp) = { +, – }
addop term exp Follow(exp) = { $, ), +, – }
Follow(term) = Follow(exp) = { $, +, – }
Follow(factor) = First(term) = { * }
term → Follow(term) =
Follow(factor) = Follow(term) = { $, +, –, * }
factor term Follow(term) = { $, ), +, – }
Follow(term) = Follow(term) = { $, +, – }
Follow(mulop) = First(factor) = { (, number }
term → multop Follow(factor) =
Follow(factor) = First(term) = {$, +, –, * }
factor term Follow(term) = { $, ), +, –, * }
Follow(factor) = Follow(term) = { $, +, –, * }
factor → ( exp ) Follow(exp) = First( ) ) = { $, ) }

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 54


Expression Grammar:
LL(1) Parsing Table Grammar Productions First Sets Follow Sets
exp → term exp First(exp) = { (, number } Follow(exp) = { $, ) }
exp → addop term exp |  First(exp) = { +, –,  } Follow(exp) = { $, ) }
addop → + | – First(addop) = { +, – } Follow(addop) = { (, number }
term → factor term First(term) = { (, number } Follow(term) = { $, ), +, – }
term → multop factor term |  First(term) = { *,  } Follow(term) = { $, ), +, – }
mulop → * First(mulop) = { * } Follow(mulop) = { (, number }
factor → ( exp ) | number First(factor) = { (, number } Follow(factor) = { $, ), +, –, * }

M[N,T] ( number ) + – * $
exp → exp →
exp
term exp term exp
exp exp → addopexp → addop exp
exp
→  term exp term exp → 
addop addop → + addop → –
term → term →
term
factor term factor term
term term → mulop term
term term →  term → 
→  factor term → 
mulop mulop → *
factor → factor →
factor
( exp ) number
17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 55
Agenda
▪ Recursive-descendant parsing
– Left recursion removal
– Left factoring

▪ LL(1) parsing
– FIRST sets
– FOLLOW sets
– Parsing table
– LL(1) grammars

▪ Error recovery

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 56


LL(1) Grammar
▪ Grammar is LL(1) if associated LL(1) parsing table has at
most one production in each table entry
– An LL(1) grammar cannot be ambiguous

▪ G = (T, N, P, S) is LL(1) if following conditions are satisfied

– First(i)  First(j) = ,  A → 1 | 2 | … | n  i, j  [1..n]  i  j

– First(A)  Follow(A) = ,  A  N    First(A)

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 57


Non-LL(1) Programming Language
statement → assign-stmt | call-stmt | other
assign-stmt → identifier := exp
call-stmt → identifier ( exp-list )

▪ Replace assign-stmt and call-stmt by right-hand sides of


their defining productions
statement → identifier := exp
| identifier ( exp-list )
| other

▪ Left factoring
statement → identifier statement | other
statement → := exp | ( exp-list )

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 58


Expression Evaluation
in LL(1) Parsing
▪ Expression grammar Parsing
Input Action
Value
E → E + n | n Stack Stack
$ E 3 + 4 + 5 $ E → n E $
▪ Left recursion removal $ E n 3 + 4 + 5 $ match/push $
E → n E
$ E + 4 + 5 $ E → + n # E 3 $
E → + n E | 
$ E # n + + 4 + 5 $ match 3 $
▪ Value stack $ E # n 4 + 5 $ match/push 3 $
– Push number after each match $ E # + 5 $ add stack 4 3 $
– Add operation indicated on
parsing stack by a special $ E + 5 $ E → + n # E 7 $
pound symbol (#) $ E # n + + 5 $ match 7 $
– Left associativity
$ E # n 5 $ match/push 7 $

E → n E $ E # $ add stack 5 7 $
E → + n # E |  $ E $ E →  12 $
$ $ accept 12 $

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 59


Agenda
▪ Recursive-descendant parsing
– Left recursion removal
– Left factoring

▪ LL(1) parsing
– FIRST sets
– FOLLOW sets
– Algorithm

▪ Error recovery

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 60


Error Recovery
▪ Recogniser
– Determines if a program is syntactically correct
– Displays a helpful error message

▪ Goals
– Find error as soon as possible
– Find a good place to resume parsing
– Find as many real errors as possible
– Avoid error cascade
– Avoid infinite loops on errors

▪ Error correction or error repair


– Find a correct program closest to wrong one
17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 61
Panic Mode Error Recovery
▪ Recursive descendant parsers

▪ Synchronising tokens for each recursive procedure


– Scan ahead (ignore input tokens) on errors until reaching one
synchronising token
– First sets and follow sets as synchronising tokens
– First sets allow parser detect errors early in parse

▪ In a recursive procedure parsing nonterminal N


– Check if next token is in First(N)
– If an error happens, ignore tokens until First(N)  Follow(N)
17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 62
Recursive Descendant Error
Recovery in Simple Expression Grammar
procedure checkinput ( firstset, followset ) ; procedure scanto ( syncset ) ;
begin begin
if ( token  firstset ) then while token  syncset  { $ } do
error ; token = getToken () ;
scanto (firstset  followset ) ; end while ;
end if ; end scanto ;
end checkinput ;

procedure factor ( syncset ) ;


procedure exp ( syncset ) ; begin
begin checkinput ( { (, number }, syncset ) ;
checkinput ( { (, number }, syncset ) ; if ( token  { (, number } ) then
if ( token  { (, number } ) then case token of
term ( syncset  { +, – } ) ; (: match ( ( ) ;
while token = + or token = – do exp ( { ) } ) ;
match ( token ) ; match ( ) ) ;
term ( syncset  { +, – } ) ; number : match ( number ) ;
end while ; else error ;
checkinput ( syncset, { (, number } ) ; end case ;
end if ; checkinput ( syncset, { (, number } ) ;
end exp ; end if ;
end factor ;

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 63


Error Recovery in LL(1) Parsers
▪ Nonterminal A on top of stack
▪ Input token T

▪ If M[A, T] =   Error
– T  First(A)
– T  Follow(A), if   First(A)

▪ T = $  T  Follow(A)
– Pop A from stack

▪ T  $  T  First(A)  Follow(A)
– Scan input until T  First(A)  Follow(A)

▪ Push a new nonterminal onto stack

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 64


Parsing Table with
Error Recovery in Expression Grammar
Grammar Productions First Sets Follow Sets
exp → term exp First(exp) = { (, number } Follow(exp) = { $, ) }
exp → addop term exp |  First(exp) = { +, –,  } Follow(exp) = { $, ) }
addop → + | – First(addop) = { +, – } Follow(addop) = { (, number }
term → factor term First(term) = { (, number } Follow(term) = { $, ), +, – }
term → multop factor term |  First(term) = { *,  } Follow(term) = { $, ), +, – }
mulop → * First(mulop) = { * } Follow(mulop) = { (, number }
factor → ( exp ) | number First(factor) = { (, number } Follow(factor) = { $, ), +, –, * }

M[N,T] ( number ) + – * $
exp → exp →
exp pop scan scan scan pop
term exp term exp
exp → addop exp → addop
exp scan scan exp →  scan exp → 
term exp term exp
addop pop pop scan addop → + addop → – scan pop
term → term →
term pop pop pop scan pop
factor term factor term
term → mulop
term scan scan term →  term →  term →  term → 
factor term
mulop pop pop scan scan scan mulop → * pop
factor → factor →
factor pop pop pop pop pop
( exp ) number
17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 65
Error Recovery in
LL(1) Expression Grammar
Parsing Stack Input Action
$ exp ( 2 + * ) $ exp→ term exp
$ exp term ( 2 + * ) $ term → factor term
$ exp term factor ( 2 + * ) $ factor → ( exp )
$ exp term ) exp ( ( 2 + * ) $ match
$ exp term ) exp 2 + * ) $ exp → term exp
$ exp term ) exp term 2 + * ) $ term → factor term
$ exp term ) exp term factor 2 + * ) $ factor → 2
$ exp term ) exp term 2 2 + * ) $ match
$ exp term ) exp term + * ) $ term → 
$ exp term ) exp + * ) $ exp → addop term exp
$ exp term ) exp term addop + * ) $ addop → +
$ exp term ) exp term + + * ) $ match
$ exp term ) exp term * ) $ scan
$ exp term ) exp term ) $ pop
$ exp term ) exp ) $ exp → 
$ exp term ) ) $ match
$ exp term $ term → 
$ exp $ exp → 
$ $ accept

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 66


Conclusions
▪ Top-down parsing

▪ Recursive descendant parsing

▪ LL(1) parser generation algorithm

▪ Right recursive and left-factored grammars

▪ Panic mode error recovery

17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 67

You might also like