You are on page 1of 31

Step-by-step Conversion of

Regular Expressions to DFA


On the regular expression:
((a⋅ b)|c)*
THOMPSON’S CONSTRUCTION

Convert the regular expression to an NFA.


Thompson’s Construction Rules
RE: s | t
RE: ε

RE: a

RE: s t

RE: s *
Step 1: construct NFA for r1.

a
( (a ⋅ b) | c )* r1: 1 2

r1
Step 2: construct NFA for r2.

a
( (a ⋅ b) | c )* r1: 1 2

r1 r2 b
r2: 3 4
Step 3: construct NFA for r3.

( (a ⋅ b) | c )*
a b
r3 r3: 1 2 4
Step 4: construct NFA for r4.

( (a ⋅ b) | c )*
a b
r3 r4 r3: 1 2 4

c
r4: 5 6
Step 5: construct NFA for r5.

( (a ⋅ b) | c )*
a b
r5 1 2 4
r5: 𝜀 𝜀
7 8
𝜀 𝜀
5 c 6
Step 6: construct NFA for r5*.
𝜀

a b
1 2 4
𝜀 𝜀 𝜀 𝜀
9 7 8 10
𝜀 𝜀
5 c 6

𝜀
SUBSET CONSTRUCTION

Convert the NFA to a DFA.


Draw transition table for DFA
Dstates
𝜀
DFA Next State
NFA States
State a b c
a
1 2 b4
𝜀 𝜀
9 𝜀 7 8 𝜀 10
𝜀
5 6 𝜀
c

𝜀
Subset construction: algorithm

Initially, 𝜀-closure( S0) is the only state in


Dstates and it is unmarked;
while (there is an unmarked state T in Dstates) {
mark T;
for (each input symbol a) {
U = 𝜀-closure(move(T, a));
Dtran[T, a] = U
if (U is not in Dstates)
} add U as unmarked state to Dstates;
}
Add 𝜀-closure(9) as DFA start state
Dstates
𝜀
DFA Next State
NFA States
State a b c
a
1 2 b4 {9,7,1,5,10} A

𝜀 𝜀
9 𝜀 7 8 𝜀 10
𝜀
5 6 𝜀
c

𝜀
Mark state A
Dstates
𝜀
DFA Next State
NFA States
State a b c
a
1 2 b4 {9,7,1,5,10} A✓

𝜀 𝜀
9 𝜀 7 8 𝜀 10
𝜀
5 6 𝜀
c

𝜀
Compute 𝜀-closure(move(A, a))
Dstates
𝜀
DFA Next State
NFA States
State a b c
a
1 2 b4 {9,7,1,5,10} A✓ B
{2} B
𝜀 𝜀
9 𝜀 7 8 𝜀 10
𝜀
5 6 𝜀
c

𝜀
Compute 𝜀-closure(move(A, b))
Dstates
𝜀
DFA Next State
NFA States
State a b c
a
1 2 b4 {9,7,1,5,10} A✓ B -
{2} B
𝜀 𝜀
9 𝜀 7 8 𝜀 10
𝜀
5 6 𝜀
c

𝜀
Compute 𝜀-closure(move(A, c))
Dstates
𝜀
DFA Next State
NFA States
State a b c
a
1 2 b4 {9,7,1,5,10} A✓ B - C
{2} B
𝜀 𝜀
9 𝜀 7 8 𝜀 10 {6,8,10,7,1,5} C
𝜀
5 6 𝜀
c

𝜀
Mark B
Dstates
𝜀
DFA Next State
NFA States
State a b c
a
1 2 b4 {9,7,1,5,10} A✓ B - C
{2} B✓
𝜀 𝜀
9 𝜀 7 8 𝜀 10 {6,8,10,7,1,5} C
𝜀
5 6 𝜀
c

𝜀
Compute 𝜀-closure(move(B, a))
Dstates
𝜀
DFA Next State
NFA States
State a b c
a
1 2 b4 {9,7,1,5,10} A✓ B - C
{2} B✓ -
𝜀 𝜀
9 𝜀 7 8 𝜀 10 {6,8,10,7,1,5} C
𝜀
5 6 𝜀
c

𝜀
Compute 𝜀-closure(move(B, b))
Dstates
𝜀
DFA Next State
NFA States
State a b c
a
1 2 b4 {9,7,1,5,10} A✓ B - C
{2} B✓ - D
𝜀 𝜀
9 𝜀 7 8 𝜀 10 {6,8,10,7,1,5} C
𝜀 {4,8,7,1,5,10} D
5 6 𝜀
c

𝜀
Compute 𝜀-closure(move(B, c))
Dstates
𝜀
DFA Next State
NFA States
State a b c
a
1 2 b4 {9,7,1,5,10} A✓ B - C
{2} B✓ - D -
𝜀 𝜀
9 𝜀 7 8 𝜀 10 {6,8,10,7,1,5} C
𝜀 {4,8,7,1,5,10} D
5 6 𝜀
c

𝜀
Mark C
Dstates
𝜀
DFA Next State
NFA States
State a b c
a
1 2 b4 {9,7,1,5,10} A✓ B - C
{2} B✓ - D -
𝜀 𝜀
9 𝜀 7 8 𝜀 10 {6,8,10,7,1,5} C✓
𝜀 {4,8,7,1,5,10} D
5 6 𝜀
c

𝜀
Compute 𝜀-closure(move(C, a))
Dstates
𝜀
DFA Next State
NFA States
State a b c
a
1 2 b4 {9,7,1,5,10} A✓ B - C
{2} B✓ - D -
𝜀 𝜀
9 𝜀 7 8 𝜀 10 {6,8,10,7,1,5} C✓ B
𝜀 {4,8,7,1,5,10} D
5 6 𝜀
c

𝜀
Compute 𝜀-closure(move(C, b))
Dstates
𝜀
DFA Next State
NFA States
State a b c
a
1 2 b4 {9,7,1,5,10} A✓ B - C
{2} B✓ - D -
𝜀 𝜀
9 𝜀 7 8 𝜀 10 {6,8,10,7,1,5} C✓ B -
𝜀 {4,8,7,1,5,10} D
5 6 𝜀
c

𝜀
Compute 𝜀-closure(move(C, c))
Dstates
𝜀
DFA Next State
NFA States
State a b c
a
1 2 b4 {9,7,1,5,10} A✓ B - C
{2} B✓ - D -
𝜀 𝜀
9 𝜀 7 8 𝜀 10 {6,8,10,7,1,5} C✓ B - C
𝜀 {4,8,7,1,5,10} D
5 6 𝜀
c

𝜀
Mark D
Dstates
𝜀
DFA Next State
NFA States
State a b c
a
1 2 b4 {9,7,1,5,10} A✓ B - C
{2} B✓ - D -
𝜀 𝜀
9 𝜀 7 8 𝜀 10 {6,8,10,7,1,5} C✓ B - C
𝜀 {4,8,7,1,5,10} D✓
5 6 𝜀
c

𝜀
Compute 𝜀-closure(move(D, a))
Dstates
𝜀
DFA Next State
NFA States
State a b c
a
1 2 b4 {9,7,1,5,10} A✓ B - C
{2} B✓ - D -
𝜀 𝜀
9 𝜀 7 8 𝜀 10 {6,8,10,7,1,5} C✓ B - C
𝜀 {4,8,7,1,5,10} D✓ B
5 6 𝜀
c

𝜀
Compute 𝜀-closure(move(D, b))
Dstates
𝜀
DFA Next State
NFA States
State a b c
a
1 2 b4 {9,7,1,5,10} A✓ B - C
{2} B✓ - D -
𝜀 𝜀
9 𝜀 7 8 𝜀 10 {6,8,10,7,1,5} C✓ B - C
𝜀 {4,8,7,1,5,10} D✓ B - C
5 6 𝜀
c

𝜀
Draw DFA

DFA Next State


NFA States
State a b c a
{9,7,1,5,10} A✓ B - C B
a b
{2} B✓ - D -
{6,8,10,7,1,5} C✓ B - C A a D
{4,8,7,1,5,10} D✓ B - C c c
C

c
TRANSLATION TO C

Convert the DFA into C code.


int match(char* next) {
goto A;

A: if (*next == '\0') return 1;


if (*next == 'a') { next++; goto B; }
if (*next == 'c') { next++; goto C; }
return 0;

B: if (*next == '\0') return 0;


if (*next == 'b') { next++; goto D; }
return 0;

C: if (*next == '\0') return 1;


if (*next == 'a') { next++; goto B; }
if (*next == 'c') { next++; goto C; }
return 0;

D: if (*next == '\0') return 1;


if (*next == 'a') { next++; goto B; }
if (*next == 'c') { next++; goto C; }
return 0;
}

You might also like