
www.th-deg.de

Lexical analysis II
1. From NFA to DFA
2. How lex works
3. From RE to DFA

TECHNISCHE HOCHSCHULE DEGGENDORF



Efficient implementation of DFAs


• Construct a DFA directly from an RE
• First step: create syntax tree from RE
• An NFA state is important if it has an outgoing non-ε-transition
(final states have an outgoing transition w/ EOF)
• Algorithm NFA2DFA only uses important states (move(s,a) is empty if s is not
important)
• Two sets of NFA states are treated as one DFA state if
– They have the same important states
– Either both or neither contain an accepting state
• Define for a node n of the syntax tree of (r)EOF:
– nullable(n) = true iff ε ∈ L(n)
– firstpos(n) = set of positions in the subtree of n that correspond to the
first symbol of some string of L(n)
– lastpos(n) = set of positions in the subtree of n that correspond to the
last symbol of some string of L(n)
– followpos(p) = for position p: {p' | there is a string ...cd... in L((r)EOF)
with pos(c)=p, pos(d)=p'}
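
The first step above (building the syntax tree of (r)EOF and numbering its leaves) could be sketched as a small recursive-descent parser. This is a minimal sketch, not the method lex uses: the tuple node layout, the function name `parse`, and the restriction to the operators `|`, `*`, concatenation, and parentheses are assumptions made for illustration.

```python
def parse(regex):
    """Build a syntax tree for (regex)EOF; leaves are numbered left to right.

    Assumed node layout: ("leaf", position, symbol), ("or", c, d),
    ("cat", c, d), ("star", c).
    """
    pos = 0      # index of the next unread character in regex
    count = 0    # last position number handed out to a leaf

    def leaf(sym):
        nonlocal count
        count += 1
        return ("leaf", count, sym)

    def expr():                      # expr := term ('|' term)*
        nonlocal pos
        node = term()
        while pos < len(regex) and regex[pos] == "|":
            pos += 1
            node = ("or", node, term())
        return node

    def term():                      # term := factor factor*
        node = factor()
        while pos < len(regex) and regex[pos] not in "|)":
            node = ("cat", node, factor())
        return node

    def factor():                    # factor := ('(' expr ')' | char) '*'*
        nonlocal pos
        if pos >= len(regex) or regex[pos] in "|)*":
            raise ValueError(f"operand expected at index {pos}")
        if regex[pos] == "(":
            pos += 1
            node = expr()
            if pos >= len(regex) or regex[pos] != ")":
                raise ValueError("missing ')'")
            pos += 1
        else:
            node = leaf(regex[pos])
            pos += 1
        while pos < len(regex) and regex[pos] == "*":
            pos += 1
            node = ("star", node)
        return node

    tree = expr()
    if pos != len(regex):
        raise ValueError(f"unexpected character at index {pos}")
    # Augment with the end marker, which receives the last position.
    return ("cat", tree, leaf("EOF"))


tree = parse("(a|b)*abb")
print(tree[2])  # ('leaf', 6, 'EOF')
```

Numbering the leaves during parsing keeps positions in left-to-right order, which is what firstpos, lastpos, and followpos refer to.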

Compute help functions


Node n                  nullable(n)                   firstpos(n)                       lastpos(n)

ε                       true                          Ø                                 Ø

leaf a≠ε at position i  false                         {i}                               {i}

c|d                     nullable(c) OR nullable(d)    firstpos(c) U firstpos(d)         lastpos(c) U lastpos(d)

cd                      nullable(c) AND nullable(d)   IF nullable(c)                    IF nullable(d)
                                                      THEN firstpos(c) U firstpos(d)    THEN lastpos(c) U lastpos(d)
                                                      ELSE firstpos(c)                  ELSE lastpos(d)

c*                      true                          firstpos(c)                       lastpos(c)
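
The rules in the table translate almost line by line into recursive functions. The following is a minimal sketch; the tuple-based node layout and the hand-built tree for (a|b)*abb EOF are assumptions made for illustration.

```python
# Assumed node layout for this sketch:
#   ("eps",)  ("leaf", position, symbol)  ("or", c, d)  ("cat", c, d)  ("star", c)

def nullable(n):
    kind = n[0]
    if kind in ("eps", "star"):
        return True
    if kind == "leaf":
        return False
    if kind == "or":
        return nullable(n[1]) or nullable(n[2])
    if kind == "cat":
        return nullable(n[1]) and nullable(n[2])
    raise ValueError(f"unknown node kind: {kind!r}")

def firstpos(n):
    kind = n[0]
    if kind == "eps":
        return set()
    if kind == "leaf":
        return {n[1]}
    if kind == "star":
        return firstpos(n[1])
    if kind == "or":
        return firstpos(n[1]) | firstpos(n[2])
    if kind == "cat":
        # The right operand contributes only if the left one can be empty.
        if nullable(n[1]):
            return firstpos(n[1]) | firstpos(n[2])
        return firstpos(n[1])
    raise ValueError(f"unknown node kind: {kind!r}")

def lastpos(n):
    kind = n[0]
    if kind == "eps":
        return set()
    if kind == "leaf":
        return {n[1]}
    if kind == "star":
        return lastpos(n[1])
    if kind == "or":
        return lastpos(n[1]) | lastpos(n[2])
    if kind == "cat":
        # The left operand contributes only if the right one can be empty.
        if nullable(n[2]):
            return lastpos(n[1]) | lastpos(n[2])
        return lastpos(n[2])
    raise ValueError(f"unknown node kind: {kind!r}")

# Hand-built syntax tree of (a|b)*abb EOF with positions 1..6.
star = ("star", ("or", ("leaf", 1, "a"), ("leaf", 2, "b")))
root = star
for position, symbol in [(3, "a"), (4, "b"), (5, "b"), (6, "EOF")]:
    root = ("cat", root, ("leaf", position, symbol))

print(nullable(star), nullable(root))  # True False
print(sorted(firstpos(root)))          # [1, 2, 3]
print(sorted(lastpos(root)))           # [6]
```

The sketch recomputes values on every call for readability; a real implementation would more likely store the three attributes on each node during one bottom-up pass.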


Compute help functions

Node n    followpos(i) includes    condition for i

cd        firstpos(d)              i is a position in lastpos(c)

c*        firstpos(c)              i is a position in lastpos(c)
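
Both followpos rules can be applied during a single bottom-up traversal that also returns nullable, firstpos, and lastpos for each node. This is a sketch under assumptions: the tuple node layout and the name `analyze` are illustrative and do not come from the slides.

```python
from collections import defaultdict

# Assumed node layout for this sketch:
#   ("eps",)  ("leaf", position, symbol)  ("or", c, d)  ("cat", c, d)  ("star", c)

def analyze(n, follow):
    """Return (nullable, firstpos, lastpos) of n and add to follow in place."""
    kind = n[0]
    if kind == "eps":
        return True, frozenset(), frozenset()
    if kind == "leaf":
        here = frozenset([n[1]])
        return False, here, here
    if kind == "star":
        _, first, last = analyze(n[1], follow)
        # Rule for c*: every position in lastpos(c) is followed by firstpos(c).
        for i in last:
            follow[i] |= first
        return True, first, last
    null_c, first_c, last_c = analyze(n[1], follow)
    null_d, first_d, last_d = analyze(n[2], follow)
    if kind == "or":
        return null_c or null_d, first_c | first_d, last_c | last_d
    if kind == "cat":
        # Rule for cd: every position in lastpos(c) is followed by firstpos(d).
        for i in last_c:
            follow[i] |= first_d
        first = first_c | first_d if null_c else first_c
        last = last_c | last_d if null_d else last_d
        return null_c and null_d, first, last
    raise ValueError(f"unknown node kind: {kind!r}")

# Hand-built syntax tree of (a|b)*abb EOF with positions 1..6.
root = ("star", ("or", ("leaf", 1, "a"), ("leaf", 2, "b")))
for position, symbol in [(3, "a"), (4, "b"), (5, "b"), (6, "EOF")]:
    root = ("cat", root, ("leaf", position, symbol))

follow = defaultdict(set)
analyze(root, follow)
for p in range(1, 7):
    print(p, sorted(follow.get(p, set())))
```

Alternation nodes add nothing to followpos, and the EOF position never gets an entry, so its followpos stays empty.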


Example
(a|b)*abb
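[Figure: syntax tree of (a|b)*abb EOF, each node annotated with firstpos, lastpos, and nullable; each position annotated with followpos]

Node (bottom-up)        firstpos    lastpos
a|b                     {1,2}       {1,2}
(a|b)*                  {1,2}       {1,2}
(a|b)*a                 {1,2,3}     {3}
(a|b)*ab                {1,2,3}     {4}
(a|b)*abb               {1,2,3}     {5}
(a|b)*abb EOF (root)    {1,2,3}     {6}

Only the * node is marked nullable.

Position    Symbol    followpos
1           a         {1,2,3}
2           b         {1,2,3}
3           a         {4}
4           b         {5}
5           b         {6}
6           EOF       Ø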


Construct DFA from RE


• Algorithm 7.4 RE2DFA (=3.36)
• Input: Regular expression r
• Output: DFA D w/ L(D)=L(r)
1. Construct a syntax tree T from the augmented RE: (r)EOF
2. Compute nullable, firstpos, lastpos, followpos for T
3. Ds={firstpos(n)}, unmarked (Ds is the set of DFA states, n is the root of T)
4. while there is an unmarked state S in Ds:
   1. mark S
   2. for each input symbol a:
      1. Let U be the union of followpos(p) for all p in S
         that correspond to a [U accepting iff pos(EOF) ∈ U]
      2. if U is not in Ds:
         1. add U as unmarked state to Ds
      3. Dt[S,a] = U (Dt is the transition table of D)
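
Steps 3 and 4 can be sketched as follows, taking the results of steps 1 and 2 as input. The function names `re2dfa` and `accepts`, the dictionary-based inputs, and the hard-coded followpos table for (a|b)*abb EOF are assumptions made for illustration.

```python
def re2dfa(followpos, symbol, start, eof_pos):
    """Build DFA states and transitions from position sets.

    followpos: position -> set of positions
    symbol:    position -> input symbol at that position
    start:     firstpos of the root of the syntax tree
    eof_pos:   position of the EOF end marker
    """
    start = frozenset(start)
    states = [start]          # Ds, in order of discovery
    unmarked = [start]
    trans = {}                # Dt: (state, symbol) -> state
    alphabet = sorted({s for p, s in symbol.items() if p != eof_pos})
    while unmarked:
        S = unmarked.pop()    # removing S from the worklist marks it
        for a in alphabet:
            # Union of followpos(p) over all p in S that correspond to a.
            # An empty U is kept, as in the steps above; it acts as a dead state.
            U = frozenset().union(*(followpos[p] for p in S if symbol[p] == a))
            if U not in states:
                states.append(U)
                unmarked.append(U)
            trans[(S, a)] = U
    accepting = {S for S in states if eof_pos in S}
    return states, trans, accepting

def accepts(text, start, trans, accepting):
    """Run the DFA on text; a symbol outside the alphabet rejects."""
    state = frozenset(start)
    for ch in text:
        state = trans.get((state, ch))
        if state is None:
            return False
    return state in accepting

# followpos table and positions of (a|b)*abb EOF
followpos = {1: {1, 2, 3}, 2: {1, 2, 3}, 3: {4}, 4: {5}, 5: {6}, 6: set()}
symbol = {1: "a", 2: "b", 3: "a", 4: "b", 5: "b", 6: "EOF"}
start = {1, 2, 3}

states, trans, accepting = re2dfa(followpos, symbol, start, eof_pos=6)
print(len(states))                                # 4
print(accepts("aabb", start, trans, accepting))   # True
print(accepts("ab", start, trans, accepting))     # False
```

Using frozenset for states makes each position set hashable, so it can serve directly as a key of the transition table.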