
# Course Overview

## PART I: overview material

1. Introduction
2. Language processors (tombstone diagrams, bootstrapping)
3. Architecture of a compiler

## PART II: inside a compiler

4. Syntax analysis
5. Contextual analysis
6. Runtime organization
7. Code generation

8. Interpretation
9. Review

## Nullable, First sets (starter sets), and Follow sets

A non-terminal is nullable if it derives the empty string.

First(N), or starters(N), is the set of all terminals that can begin a sentence derived from N.

Follow(N) is the set of terminals that can follow N in some sentential form.

Next we will see algorithms to compute each of these.

## Define Nullable(x1 x2 x3 ... xn) as:

    if n == 0 then
        true
    else if !Nullable(x1) then
        false
    else
        Nullable(x2 x3 ... xn)

    Nullable(t) = false    (for a terminal t)

    For each non-terminal N
        Nullable(N) = is there a production N ::= ε ?

    Repeat
        For each production N ::= x1 x2 x3 ... xn
            If Nullable(xi) for all of xi then set Nullable(N) to true
    Until no Nullable value changes
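The Nullable iteration can be sketched in Java as follows. This is a minimal sketch under stated assumptions, not the course's reference implementation: the class name `NullableSketch`, the single-character encoding of symbols (upper-case letters are non-terminals, everything else is a terminal) and the use of an empty string for an ε-production are all choices made for this illustration. The grammar is the example grammar used later in these notes.

```java
import java.util.*;

class NullableSketch {
    // Grammar as: non-terminal -> list of right-hand sides.
    // Assumption: "" stands for an ε-production, upper-case letters are non-terminals.
    static Map<Character, List<String>> exampleGrammar() {
        Map<Character, List<String>> g = new LinkedHashMap<>();
        g.put('S', List.of("TUVW", "WVUT"));
        g.put('T', List.of("aT", "e"));
        g.put('U', List.of("Ub", "f"));
        g.put('V', List.of("cV", ""));
        g.put('W', List.of("Wd", ""));
        return g;
    }

    static Set<Character> computeNullable(Map<Character, List<String>> grammar) {
        Set<Character> nullable = new TreeSet<>();
        boolean changed = true;
        while (changed) {                       // Repeat ... until nothing changes
            changed = false;
            for (Map.Entry<Character, List<String>> entry : grammar.entrySet()) {
                for (String rhs : entry.getValue()) {
                    // An empty right-hand side is vacuously "all nullable",
                    // which covers the initial N ::= ε check.
                    boolean allNullable = true;
                    for (char x : rhs.toCharArray()) {
                        // Terminals are never in the set, so they fail this test.
                        if (!nullable.contains(x)) { allNullable = false; break; }
                    }
                    if (allNullable && nullable.add(entry.getKey())) changed = true;
                }
            }
        }
        return nullable;
    }

    public static void main(String[] args) {
        System.out.println(computeNullable(exampleGrammar())); // prints [V, W]
    }
}
```

The initialization step and the repeat loop are folded together here: an ε-production is simply a production whose (empty) right-hand side is entirely nullable.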

## Generalizing the definition of First sets

Define First(x1 x2 x3 ... xn) as:

    if !Nullable(x1) then
        First(x1)
    else
        First(x1) ∪ First(x2 x3 ... xn)

    First(t) = { t }

    For each non-terminal N
        First(N) = { }

    Repeat
        For each production N ::= x1 x2 x3 ... xn
            First(N) = First(N) ∪ First(x1)
            For each i from 2 through n
                If Nullable(x1 ... xi-1), then
                    First(N) = First(N) ∪ First(xi)
    Until no First set changes

Note: some textbooks add ε (empty string) to First(N) whenever N is nullable, so that First(N) is never { } (empty set).
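The First-set iteration can be sketched the same way. Again this is only an illustrative sketch: `FirstSketch`, the character-based grammar encoding (upper-case letters are non-terminals, `""` is an ε-production) and the helper names are assumptions made for this example. It follows the convention of these notes and does not add ε to First(N).

```java
import java.util.*;

class FirstSketch {
    static boolean isNonTerminal(char c) { return Character.isUpperCase(c); }

    static Map<Character, List<String>> exampleGrammar() {
        Map<Character, List<String>> g = new LinkedHashMap<>();
        g.put('S', List.of("TUVW", "WVUT"));
        g.put('T', List.of("aT", "e"));
        g.put('U', List.of("Ub", "f"));
        g.put('V', List.of("cV", ""));
        g.put('W', List.of("Wd", ""));
        return g;
    }

    static Set<Character> computeNullable(Map<Character, List<String>> g) {
        Set<Character> nullable = new TreeSet<>();
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Map.Entry<Character, List<String>> e : g.entrySet())
                for (String rhs : e.getValue())
                    if (rhs.chars().allMatch(x -> nullable.contains((char) x)))
                        changed |= nullable.add(e.getKey());
        }
        return nullable;
    }

    static Map<Character, Set<Character>> computeFirst(Map<Character, List<String>> g) {
        Set<Character> nullable = computeNullable(g);
        Map<Character, Set<Character>> first = new TreeMap<>();
        for (Character n : g.keySet()) first.put(n, new TreeSet<>());   // First(N) = { }
        boolean changed = true;
        while (changed) {                                               // until no First set changes
            changed = false;
            for (Map.Entry<Character, List<String>> e : g.entrySet()) {
                Set<Character> firstN = first.get(e.getKey());
                for (String rhs : e.getValue()) {
                    for (char x : rhs.toCharArray()) {
                        // x is only reached when every symbol before it is nullable.
                        changed |= isNonTerminal(x)
                                ? firstN.addAll(new ArrayList<>(first.get(x)))  // copy: x may be N itself
                                : firstN.add(x);                                // First(t) = { t }
                        if (!nullable.contains(x)) break;
                    }
                }
            }
        }
        return first;
    }

    public static void main(String[] args) {
        // prints {S=[a, c, d, e, f], T=[a, e], U=[f], V=[c], W=[d]}
        System.out.println(computeFirst(exampleGrammar()));
    }
}
```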

Syntax Analysis (Chapter 4)

    Follow(S) = {$}    // the end-of-file symbol
    For each non-terminal N other than S
        Follow(N) = { }

    Repeat
        For each production N ::= x1 x2 x3 ... xn
            For each i from 1 through n-1
                if xi is a non-terminal then
                    Follow(xi) = Follow(xi) ∪ First(xi+1 ... xn)
            For each i from n downto 1
                if xi is a non-terminal and Nullable(xi+1 ... xn) then
                    Follow(xi) = Follow(xi) ∪ Follow(N)
    Until no Follow set changes

## Example of computing Nullable, First, Follow

    S ::= TUVW | WVUT
    T ::= aT | e
    U ::= Ub | f
    V ::= cV | ε
    W ::= Wd | ε

| Non-terminal | Nullable? | First | Follow |
|---|---|---|---|
| S | false | {a, e, d, c, f} | {\$} |
| T | false | {a, e} | {f, \$} |
| U | false | {f} | {c, d, \$, a, e, b} |
| V | true | {c} or {c, ε} | {d, \$, f} |
| W | true | {d} or {d, ε} | {\$, c, f, d} |
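The Follow-set iteration can be checked against the example grammar S, T, U, V, W with a short Java sketch. As before, this is an illustration under assumptions, not a definitive implementation: the class `FollowSketch`, the helper names, and the encoding (upper-case letters are non-terminals, `""` is an ε-production, `'$'` is the end-of-file symbol) are all invented for this example.

```java
import java.util.*;

class FollowSketch {
    static final char EOF = '$';   // end-of-file symbol

    static boolean isNonTerminal(char c) { return Character.isUpperCase(c); }

    static Map<Character, List<String>> exampleGrammar() {
        Map<Character, List<String>> g = new LinkedHashMap<>();
        g.put('S', List.of("TUVW", "WVUT"));
        g.put('T', List.of("aT", "e"));
        g.put('U', List.of("Ub", "f"));
        g.put('V', List.of("cV", ""));
        g.put('W', List.of("Wd", ""));
        return g;
    }

    // Nullable(x1 ... xn): every symbol is a nullable non-terminal (true for n == 0).
    static boolean allNullable(String seq, Set<Character> nullable) {
        return seq.chars().allMatch(x -> nullable.contains((char) x));
    }

    static Set<Character> computeNullable(Map<Character, List<String>> g) {
        Set<Character> nullable = new TreeSet<>();
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Map.Entry<Character, List<String>> e : g.entrySet())
                for (String rhs : e.getValue())
                    if (allNullable(rhs, nullable)) changed |= nullable.add(e.getKey());
        }
        return nullable;
    }

    // First(x1 ... xn), given the First sets computed so far.
    static Set<Character> firstOfSequence(String seq, Set<Character> nullable,
                                          Map<Character, Set<Character>> first) {
        Set<Character> out = new TreeSet<>();
        for (char x : seq.toCharArray()) {
            if (isNonTerminal(x)) out.addAll(first.get(x)); else out.add(x);
            if (!nullable.contains(x)) break;
        }
        return out;
    }

    static Map<Character, Set<Character>> computeFirst(Map<Character, List<String>> g,
                                                       Set<Character> nullable) {
        Map<Character, Set<Character>> first = new TreeMap<>();
        for (Character n : g.keySet()) first.put(n, new TreeSet<>());
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Map.Entry<Character, List<String>> e : g.entrySet())
                for (String rhs : e.getValue())
                    changed |= first.get(e.getKey()).addAll(firstOfSequence(rhs, nullable, first));
        }
        return first;
    }

    static Map<Character, Set<Character>> computeFollow(Map<Character, List<String>> g, char start) {
        Set<Character> nullable = computeNullable(g);
        Map<Character, Set<Character>> first = computeFirst(g, nullable);
        Map<Character, Set<Character>> follow = new TreeMap<>();
        for (Character n : g.keySet()) follow.put(n, new TreeSet<>());
        follow.get(start).add(EOF);                      // Follow(S) = {$}
        boolean changed = true;
        while (changed) {                                // until no Follow set changes
            changed = false;
            for (Map.Entry<Character, List<String>> e : g.entrySet())
                for (String rhs : e.getValue())
                    for (int i = 0; i < rhs.length(); i++) {
                        char x = rhs.charAt(i);
                        if (!isNonTerminal(x)) continue;
                        String rest = rhs.substring(i + 1);
                        changed |= follow.get(x).addAll(firstOfSequence(rest, nullable, first));
                        if (allNullable(rest, nullable))  // x can end the production
                            changed |= follow.get(x).addAll(new ArrayList<>(follow.get(e.getKey())));
                    }
        }
        return follow;
    }

    public static void main(String[] args) {
        System.out.println(computeFollow(exampleGrammar(), 'S'));
    }
}
```

The two inner loops of the pseudocode are merged into one pass over each right-hand side; for the last symbol the remaining sequence is empty, so only the Follow(N) step contributes.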

## Parsing

We will now look at parsing.

Topics:

- Some terminology
- Different types of parsing strategies
  - bottom up
  - top down
- Recursive descent parsing
  - What is it
  - How to implement a parser given an EBNF specification

Some terminology:

- Recognition: to answer the question "does the input conform to the syntax of the language?"
- Parsing: recognition, plus determining the structure of the program (for example by creating an AST data structure)
- Unambiguous grammar: a grammar is unambiguous if there is at most one way to parse any input (i.e. for a syntactically correct program there is precisely one parse tree)

## Different kinds of Parsing Algorithms

Two big groups of algorithms can be distinguished:

- bottom up strategies
- top down strategies

Example grammar:

    Sentence ::= Subject Verb Object .
    Subject  ::= I | A Noun | The Noun
    Object   ::= me | a Noun | the Noun
    Noun     ::= cat | bat | rat
    Verb     ::= like | is | see | sees

Example sentences:

    The cat sees the rat.
    The rat sees me.
    I like a cat.
    I see the rat.
    I sees a rat.

## Bottom up parsing

The parse tree grows from the bottom (leaves) up to the top (root).

[Figure: parse tree with root Sentence, inner nodes Subject, Verb, Object and Noun, and leaves "The", "cat", "sees", "rat", ".", built from the leaves upward]

## Top-down parsing

The parse tree is constructed starting at the top (root).

[Figure: parse tree with root Sentence, inner nodes Subject, Verb, Object and Noun, and leaves "The", "cat", "sees", "rat", built from the root downward]

## Quick review

Syntactic analysis:

- Lexical analysis
  - Group letters into words (or group characters into tokens)
  - Use regular expressions and deterministic FSMs
- Grammar transformations
  - Left-factoring
  - Left-recursion removal
  - Substitution
- Parsing = structural analysis of program
  - Group words into sentences, paragraphs, and documents (or tokens into expressions, commands, and programs)
  - Top-Down and Bottom-Up

## Recursive Descent Parsing

Recursive descent parsing is a straightforward top-down parsing algorithm. We will now look at how to develop a recursive descent parser from an EBNF specification.

Idea: the parse tree structure corresponds to the recursive calling structure of parsing functions that call each other.

    Sentence ::= Subject Verb Object .
    Subject  ::= I | A Noun | The Noun
    Object   ::= me | a Noun | the Noun
    Noun     ::= cat | bat | rat
    Verb     ::= like | is | see | sees

Define a procedure parseN for each non-terminal N:

    private void parseSentence();
    private void parseSubject();
    private void parseObject();
    private void parseNoun();
    private void parseVerb();

## Recursive Descent Parsing: Auxiliary Methods

    public class MicroEnglishParser {
        private TerminalSymbol currentTerminal;

        // Auxiliary methods will go here
        ...

        // Parsing methods will go here
        ...
    }

The accept method is one of the auxiliary methods:

    private void accept(TerminalSymbol expected) {
        if (currentTerminal matches expected)
            currentTerminal = next input terminal;
        else
            report a syntax error
    }
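The accept pseudocode can be made runnable with a few assumptions. In this minimal sketch the terminals are plain strings taken from a pre-split word list, the marker `"<end>"` stands for the end of the input, and a syntax error is reported by throwing `IllegalStateException`. The class name `AcceptSketch` and all of these choices are hypothetical and only serve the illustration.

```java
import java.util.List;

class AcceptSketch {
    private final List<String> input;   // terminals, already split into words
    private int position = 0;
    private String currentTerminal;

    AcceptSketch(List<String> input) {
        this.input = input;
        this.currentTerminal = input.isEmpty() ? "<end>" : input.get(0);
    }

    String currentTerminal() { return currentTerminal; }

    // Consume the current terminal if it is the expected one, else report a syntax error.
    void accept(String expected) {
        if (currentTerminal.equals(expected)) {
            position++;
            currentTerminal = position < input.size() ? input.get(position) : "<end>";
        } else {
            throw new IllegalStateException(
                "syntax error: expected '" + expected + "' but found '" + currentTerminal + "'");
        }
    }

    public static void main(String[] args) {
        AcceptSketch p = new AcceptSketch(List.of("the", "cat"));
        p.accept("the");
        System.out.println(p.currentTerminal()); // prints cat
    }
}
```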

## Recursive Descent Parsing: Parsing Methods

    Sentence ::= Subject Verb Object .

    private void parseSentence() {
        parseSubject();
        parseVerb();
        parseObject();
        accept(.);
    }

    Subject ::= I | A Noun | The Noun

    private void parseSubject() {
        if (currentTerminal matches I)
            accept(I);
        else if (currentTerminal matches A) {
            accept(A);
            parseNoun();
        }
        else if (currentTerminal matches The) {
            accept(The);
            parseNoun();
        }
        else
            report a syntax error
    }

## Recursive Descent Parsing: Parsing Methods

    Noun ::= cat | bat | rat

    private void parseNoun() {
        if (currentTerminal matches cat)
            accept(cat);
        else if (currentTerminal matches bat)
            accept(bat);
        else if (currentTerminal matches rat)
            accept(rat);
        else
            report a syntax error
    }

    Object ::= me | a Noun | the Noun
    Verb   ::= like | is | see | sees

    private void parseObject() {
        ?
    }

    private void parseVerb() {
        ?
    }

## Test yourself: Can you complete parseObject( ) and parseVerb( ) ?
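One possible completion, written as a runnable recognizer for the whole grammar, is sketched below. It is not the official solution. It assumes that terminals are whitespace-separated words (so the final period must be written as a separate word), that `"<end>"` marks the end of the input, and that a syntax error is signalled with `IllegalStateException`. The class name `MicroEnglishSketch` and the helper `recognizes` are hypothetical.

```java
import java.util.List;

class MicroEnglishSketch {
    private final List<String> words;
    private int pos = 0;

    MicroEnglishSketch(String sentence) {
        this.words = List.of(sentence.trim().split("\\s+"));
    }

    private String current() { return pos < words.size() ? words.get(pos) : "<end>"; }

    private void accept(String expected) {
        if (!current().equals(expected))
            throw new IllegalStateException("expected " + expected + ", found " + current());
        pos++;
    }

    // Sentence ::= Subject Verb Object .
    private void parseSentence() { parseSubject(); parseVerb(); parseObject(); accept("."); }

    // Subject ::= I | A Noun | The Noun
    private void parseSubject() {
        switch (current()) {
            case "I":   accept("I"); break;
            case "A":   accept("A"); parseNoun(); break;
            case "The": accept("The"); parseNoun(); break;
            default: throw new IllegalStateException("bad subject: " + current());
        }
    }

    // Object ::= me | a Noun | the Noun
    private void parseObject() {
        switch (current()) {
            case "me":  accept("me"); break;
            case "a":   accept("a"); parseNoun(); break;
            case "the": accept("the"); parseNoun(); break;
            default: throw new IllegalStateException("bad object: " + current());
        }
    }

    // Noun ::= cat | bat | rat
    private void parseNoun() {
        switch (current()) {
            case "cat": case "bat": case "rat": accept(current()); break;
            default: throw new IllegalStateException("bad noun: " + current());
        }
    }

    // Verb ::= like | is | see | sees
    private void parseVerb() {
        switch (current()) {
            case "like": case "is": case "see": case "sees": accept(current()); break;
            default: throw new IllegalStateException("bad verb: " + current());
        }
    }

    // True if the whole input is a Sentence according to the grammar.
    static boolean recognizes(String sentence) {
        MicroEnglishSketch parser = new MicroEnglishSketch(sentence);
        try {
            parser.parseSentence();
            return parser.pos == parser.words.size();
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(recognizes("The cat sees the rat ."));
        System.out.println(recognizes("I sees a rat ."));
    }
}
```

Both calls in `main` succeed: the grammar accepts "I sees a rat ." because it only describes syntax and does not check agreement between subject and verb.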


## Systematic Development of Rec. Descent Parser

(1) Express grammar in EBNF

(2) Grammar transformations: left factorization and left recursion elimination

(3) Create a parser class with
- private variable currentToken
- methods to call the scanner: accept and acceptIt

(4) Implement a public method for the main function to call: a public parse method that
- fetches the first token from the scanner
- calls parseS (where S is the start symbol of the grammar)
- verifies that the scanner next produces the end-of-file token

(5) Implement private parsing methods:
- add a private parseN method for each non-terminal N
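Steps (3) to (5) can be illustrated with a small, self-contained Java sketch. Everything specific in it is an assumption made for the example: the toy grammar `S ::= ( S ) | x`, the one-character-per-token `CharScanner`, the use of `'\0'` as the end-of-text token, and the class and method names other than `accept`, `acceptIt`, `parse` and `parseS`.

```java
class ParenParser {
    static final char EOT = '\0';   // end-of-text token (assumed encoding)

    // Hypothetical scanner: every character is one token.
    static class CharScanner {
        private final String text;
        private int pos = 0;
        CharScanner(String text) { this.text = text; }
        char scan() { return pos < text.length() ? text.charAt(pos++) : EOT; }
    }

    // Step (3): parser state and the methods that call the scanner.
    private final CharScanner scanner;
    private char currentToken;

    ParenParser(String text) { this.scanner = new CharScanner(text); }

    // Check that the current token is the expected one, then fetch the next token.
    private void accept(char expected) {
        if (currentToken != expected)
            throw new IllegalStateException("syntax error: expected '" + expected + "'");
        currentToken = scanner.scan();
    }

    // Fetch the next token without checking (the caller already inspected it).
    private void acceptIt() { currentToken = scanner.scan(); }

    // Step (4): public entry point.
    public void parse() {
        currentToken = scanner.scan();          // fetch the first token
        parseS();                               // parse the start symbol
        if (currentToken != EOT)                // nothing may follow the sentence
            throw new IllegalStateException("syntax error: trailing input");
    }

    // Step (5): one private parseN method per non-terminal.
    // S ::= ( S ) | x
    private void parseS() {
        if (currentToken == '(') {
            acceptIt();
            parseS();
            accept(')');
        } else if (currentToken == 'x') {
            acceptIt();
        } else {
            throw new IllegalStateException("syntax error: unexpected token");
        }
    }

    // Convenience wrapper for experiments: true if the text is a sentence of S.
    static boolean accepts(String text) {
        try {
            new ParenParser(text).parse();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts("((x))"));
        System.out.println(accepts("(x"));
    }
}
```

The split between `accept` and `acceptIt` mirrors the list above: `acceptIt` is used where the parsing method has already tested the current token, `accept` where the token still has to be checked.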