
15-453

FORMAL LANGUAGES,
AUTOMATA AND
COMPUTABILITY

(For next time: Read Chapter 1.3 of the book)


[Figure: example NFA state diagram with states q1, q2, q3, q4 and arrows labeled 0, 1 and ε]

A non-deterministic finite automaton (NFA)
is a 5-tuple N = (Q, Σ, δ, Q0, F)

Q is the set of states
Σ is the alphabet
δ : Q × Σε → 2^Q is the transition function
Q0 ⊆ Q is the set of start states
F ⊆ Q is the set of accept states

2^Q is the set of subsets of Q and Σε = Σ ∪ {ε}

N = (Q, Σ, δ, Q0, F)
Q = {q1, q2, q3, q4}
Σ = {0,1}
Q0 = {q1, q2}
F = {q4} ⊆ Q

[Figure: state diagram of N with arrows labeled 0, 1 and ε]

δ     0       1       ε
q1    {q3}    ∅       ∅
q2    ∅       {q4}    ∅
q3    {q4}    ∅       {q2}
q4    ∅       ∅       ∅

N = (Q, Σ, δ, Q0, F)
Q = {q1, q2, q3, q4}
Σ = {0,1}
Q0 = {q1, q2}
F = {q4} ⊆ Q

[Figure: state diagram of N with arrows labeled 0, 1 and ε; the arrow from q3 to q2 is now labeled ε, 0]

δ     0          1       ε
q1    {q3}       ∅       ∅
q2    ∅          {q4}    ∅
q3    {q2,q4}    ∅       {q2}
q4    ∅          ∅       ∅

00 ∈ L(N)?
01 ∈ L(N)?

Let w ∈ Σ* and suppose w can be written as
w1 ... wn where wi ∈ Σε (ε is viewed as representing
the empty string)

Then N accepts w if there are r0, r1, ..., rn ∈ Q
such that

1. r0 ∈ Q0
2. ri+1 ∈ δ(ri, wi+1) for i = 0, ..., n-1, and
3. rn ∈ F

L(N) = the language of machine N
= set of all strings machine N accepts

A language L is recognized by an NFA N
if L = L(N).

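The acceptance condition can be checked mechanically by tracking every state that some run r0, ..., rn could be in after each input symbol. The sketch below is one way to do that in Python, not part of the lecture: the names EPS, DELTA, START, ACCEPT, eps_closure and accepts are illustrative, ε is encoded as the empty string, and the table is the version of the example in which q3 moves to {q2, q4} on 0.

```python
EPS = ""  # assumption: the empty string stands in for ε

# Transition table of the example NFA; missing keys mean the empty set.
DELTA = {
    ("q1", "0"): {"q3"},
    ("q2", "1"): {"q4"},
    ("q3", "0"): {"q2", "q4"},
    ("q3", EPS): {"q2"},
}
START = {"q1", "q2"}
ACCEPT = {"q4"}


def eps_closure(states, delta):
    """All states reachable from `states` along zero or more ε arrows."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, EPS), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


def accepts(w, delta, start, accept):
    """True if some run of the NFA on w ends in an accept state."""
    current = eps_closure(start, delta)
    for symbol in w:
        moved = set()
        for q in current:
            moved |= delta.get((q, symbol), set())
        current = eps_closure(moved, delta)
    return bool(current & accept)
```

For example, accepts("00", DELTA, START, ACCEPT) and accepts("01", DELTA, START, ACCEPT) both evaluate to True for this table, while the empty string is rejected.
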
FROM NFA TO DFA
Input: N = (Q, Σ, δ, Q0, F)
Output: M = (Q′, Σ, δ′, q0′, F′)

Q′ = 2^Q
δ′ : Q′ × Σ → Q′
δ′(R, σ) = ⋃_{r ∈ R} ε(δ(r, σ))
q0′ = ε(Q0)
F′ = { R ∈ Q′ | f ∈ R for some f ∈ F }

For R ⊆ Q, the ε-closure of R is ε(R) = {q that can be reached
from some r ∈ R by traveling along zero or more ε arrows}

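The construction translates almost line by line into code. The following Python sketch is an illustration under stated assumptions rather than the lecture's own implementation: ε is encoded as the empty string, DFA states are frozensets, and the names nfa_to_dfa, eps_closure and N_DELTA are made up here. The sample transitions are an assumed reading of the three-state NFA N = ({1,2,3}, {a,b}, δ, {1}, {1}) used as the worked example in these slides.

```python
from itertools import combinations

EPS = ""  # assumption: the empty string stands in for ε


def eps_closure(states, delta):
    """ε(R): states reachable from R along zero or more ε arrows."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, EPS), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


def nfa_to_dfa(states, alphabet, delta, start, accept):
    """Subset construction: returns (Q', δ', q0', F')."""
    subsets = [frozenset(c)
               for k in range(len(states) + 1)
               for c in combinations(sorted(states), k)]       # Q' = 2^Q
    dfa_delta = {}
    for R in subsets:
        for a in alphabet:
            target = set()
            for r in R:                                        # union over r in R
                target |= eps_closure(delta.get((r, a), set()), delta)
            dfa_delta[(R, a)] = frozenset(target)
    q0 = frozenset(eps_closure(start, delta))                  # q0' = ε(Q0)
    dfa_accept = {R for R in subsets if R & set(accept)}       # F'
    return subsets, dfa_delta, q0, dfa_accept


# Assumed transitions of the three-state example NFA.
N_DELTA = {(1, "b"): {2}, (1, EPS): {3}, (2, "a"): {2, 3},
           (2, "b"): {3}, (3, "a"): {1}}
SUBSETS, DFA_DELTA, Q0, DFA_ACCEPT = nfa_to_dfa({1, 2, 3}, "ab", N_DELTA, {1}, {1})
```

The sketch builds all 2^|Q| subsets, exactly as the definition does; a practical implementation would usually generate only the subsets reachable from q0′.
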
Given: NFA N = ( {1,2,3}, {a,b}, δ, {1}, {1} )
Construct: equivalent DFA M

[Figure: state diagram of N with states 1, 2, 3 and arrows labeled a, b and ε]

ε({1}) = {1,3}

N = ( Q, Σ, δ, Q0, F )
Given: NFA N = ( {1,2,3}, {a,b}, δ, {1}, {1} )
Construct: equivalent DFA M = (Q′, Σ, δ′, q0′, F′)

[Figure: state diagram of N with states 1, 2, 3 and arrows labeled a, b and ε]

δ′         a          b
∅          ∅          ∅
{1}        ∅          {2}
{2}        {2,3}      {3}
{3}        {1,3}      ∅
{1,2}      {2,3}      {2,3}
{1,3}      {1,3}      {2}
{2,3}      {1,2,3}    {3}
{1,2,3}    {1,2,3}    {2,3}

q0′ = ε({1}) = {1,3}

REGULAR LANGUAGES CLOSED
UNDER STAR
Let L be a regular language and M be a
DFA for L
We construct an NFA N that recognizes L*
[Figure: NFA N built from the DFA M by adding a new start state and ε arrows; the arrows are labeled 0, 1 and ε]

Formally:
Input: M = (Q, Σ, δ, q1, F) DFA
Output: N = (Q′, Σ, δ′, {q0}, F′) NFA
Q′ = Q ∪ {q0}
F′ = F ∪ {q0}

δ′(q,a) =
  {δ(q,a)}  if q ∈ Q and a ≠ ε
  {q1}      if q ∈ F and a = ε
  {q1}      if q = q0 and a = ε
  ∅         if q = q0 and a ≠ ε
  ∅         else

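The case analysis for δ′ can be written out directly. The Python sketch below is illustrative only: the names star_nfa, eps_closure and accepts are assumptions, ε is encoded as the empty string, and the sample DFA (which recognizes just the string 01) is invented for the test and does not come from the slides.

```python
EPS = ""  # assumption: the empty string stands in for ε


def star_nfa(states, alphabet, delta, q1, accept, q0="q0"):
    """From a DFA (states, alphabet, delta, q1, accept) for L, build an NFA for L*."""
    assert q0 not in states, "the new start state must be fresh"
    n_delta = {}
    for q in states:
        for a in alphabet:
            n_delta[(q, a)] = {delta[(q, a)]}   # {δ(q,a)} if q ∈ Q and a ≠ ε
    for q in accept:
        n_delta[(q, EPS)] = {q1}                # {q1} if q ∈ F and a = ε
    n_delta[(q0, EPS)] = {q1}                   # {q1} if q = q0 and a = ε
    # every other (state, symbol) pair maps to ∅, encoded as a missing key
    return set(states) | {q0}, n_delta, {q0}, set(accept) | {q0}


def eps_closure(states, delta):
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, EPS), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


def accepts(w, delta, start, accept):
    current = eps_closure(start, delta)
    for symbol in w:
        moved = set()
        for q in current:
            moved |= delta.get((q, symbol), set())
        current = eps_closure(moved, delta)
    return bool(current & accept)


# Hypothetical DFA for L = {01}: s0 -0-> s1 -1-> s2, everything else goes to "dead".
M_STATES = {"s0", "s1", "s2", "dead"}
M_DELTA = {(q, a): "dead" for q in M_STATES for a in "01"}
M_DELTA[("s0", "0")] = "s1"
M_DELTA[("s1", "1")] = "s2"
_, N_DELTA, N_START, N_ACCEPT = star_nfa(M_STATES, "01", M_DELTA, "s0", {"s2"})
```

With this sample DFA the resulting NFA should accept exactly the strings of the form (01)^k, including the empty string, which is the reason q0 is made an accept state.
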
Show: L(N) = L*

1. L(N) ⊇ L*

2. L(N) ⊆ L*

1. L(N) ⊇ L*
Assume w = w1…wk is in L*, where w1,…,wk ∈ L
We show N accepts w by induction on k
Base Cases:
k=0 (w = ε)
k=1 (w ∈ L)
Inductive Step:
Assume N accepts all strings v = v1…vk ∈ L*, vi ∈ L
and let u = u1…uk uk+1 ∈ L*, uj ∈ L
Since N accepts u1…uk (by induction) and M
accepts uk+1, N must accept u: after reading u1…uk
it takes the ε arrow from its accept state to q1
and then follows M's run on uk+1

2. L(N) ⊆ L*
Assume w is accepted by N, we show w ∈ L*
If w = ε, then w ∈ L*
If w ≠ ε

[Figure: accepting computation of N on w, cut at its ε arrows; the marked pieces are in L* by induction]

REGULAR LANGUAGES ARE CLOSED
UNDER REGULAR OPERATIONS
Union: A ∪ B = { w | w ∈ A or w ∈ B }

Intersection: A ∩ B = { w | w ∈ A and w ∈ B }

Negation: ¬A = { w ∈ Σ* | w ∉ A }

Reverse: A^R = { w1 …wk | wk …w1 ∈ A }

Concatenation: A · B = { vw | v ∈ A and w ∈ B }

Star: A* = { w1 …wk | k ≥ 0 and each wi ∈ A }

The PUMPING LEMMA and
REGULAR EXPRESSIONS

SOME LANGUAGES ARE
NOT REGULAR
B = {0^n 1^n | n ≥ 0} is NOT regular!

WHICH OF THESE ARE REGULAR?

C = { w | w has equal number of 1s and 0s}
NOT REGULAR

D = { w | w has equal number of occurrences of 01 and 10}
REGULAR!!!

THE PUMPING LEMMA
Let L be a regular language with |L| = ∞

Then there exists a positive integer P
such that
if w ∈ L and |w| ≥ P
then w = xyz, where:
1. |y| > 0
2. |xy| ≤ P
3. xy^i z ∈ L for any i ≥ 0

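Let M be a DFA that recognizes L
Let P be the number of states in M
Assume w ∈ L is such that |w| ≥ P
We show w = xyz with
1. |y| > 0
2. |xy| ≤ P
3. xy^i z ∈ L for any i ≥ 0

[Figure: the states q0, ..., qi, ..., qj, ..., q|w| that M visits while reading w; x labels the part read from q0 to qi]

There must be j > i such that qi = qj: the P+1 states q0, ..., qP
cannot all be distinct, so we can take j ≤ P
Let x be the part of w read from q0 to qi, y the part read
from qi to qj, and z the rest
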
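The proof is constructive: running the DFA and stopping at the first repeated state yields the split w = xyz. A minimal Python sketch of that idea follows. The function names and the sample DFA (strings over {0,1} whose number of 1s is divisible by 3, so P = 3) are assumptions made for illustration and are not taken from the lecture.

```python
def run(delta, q0, w):
    """The state sequence q0, q1, ..., q|w| that the DFA visits on w."""
    states = [q0]
    for symbol in w:
        states.append(delta[(states[-1], symbol)])
    return states


def pump_split(delta, q0, w):
    """Split w = xyz at the first state that repeats along the run."""
    seen = {}
    for pos, q in enumerate(run(delta, q0, w)):
        if q in seen:
            i, j = seen[q], pos          # q_i = q_j with i < j
            return w[:i], w[i:j], w[j:]
        seen[q] = pos
    raise ValueError("no state repeats: w is shorter than the number of states")


# Hypothetical DFA: state = (number of 1s read) mod 3, accept state 0.
DELTA = {(q, "0"): q for q in range(3)}
DELTA.update({(q, "1"): (q + 1) % 3 for q in range(3)})
ACCEPT = {0}


def accepts(w):
    return run(DELTA, 0, w)[-1] in ACCEPT


X, Y, Z = pump_split(DELTA, 0, "10101")
```

Because the split is taken at the first repetition, |xy| never exceeds the number of states, and every pumped string X + Y * i + Z ends in the same state as the original string.
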
USING THE PUMPING LEMMA
Use the pumping lemma to prove that
B = {0^n 1^n | n ≥ 0} is not regular

Hint: Assume B is regular

Let B = L(M), for DFA M,
and let P be larger than the
number of states in M

Try pumping s = 0^P 1^P

Use the pumping lemma to prove that
C = { w | w has an equal number of 0s and 1s}
is not regular

Hint: Try pumping s = 0^P 1^P

If C is regular, s can be split into s = xyz,
where for any i ≥ 0, xy^i z is also in C
and |xy| ≤ P

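For a fixed P the hints can be checked exhaustively: enumerate every split s = xyz allowed by the lemma and test whether pumping keeps the string in the language. The Python sketch below does this for B and C; the helper names and the bound max_i are assumptions. A failed check is conclusive for that P, since one bad i is enough, whereas a successful check only covers the values of i that were tried.

```python
def in_B(w):
    """B = {0^n 1^n | n >= 0}."""
    n = len(w) // 2
    return w == "0" * n + "1" * n


def in_C(w):
    """C = strings with an equal number of 0s and 1s."""
    return w.count("0") == w.count("1")


def splits(s, p):
    """All ways to write s = xyz with |y| > 0 and |xy| <= p."""
    limit = min(p, len(s))
    for i in range(limit):                   # i = |x|
        for j in range(i + 1, limit + 1):    # j = |xy|
            yield s[:i], s[i:j], s[j:]


def survives_pumping(s, p, member, max_i=3):
    """True if some split keeps x y^i z in the language for i = 0..max_i."""
    return any(
        all(member(x + y * i + z) for i in range(max_i + 1))
        for x, y, z in splits(s, p)
    )
```

With s = "0" * p + "1" * p no split survives for either language, because y consists only of 0s. By contrast, the string "01" * p, which is in C, does survive with x empty and y = "01", which shows why the choice of s matters.
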
WHAT DOES D LOOK LIKE?

D = { w | w has equal number of occurrences of 01 and 10}
= { w | w = 1, w = 0, w = ε or
w starts with a 0 and ends with a 0 or
w starts with a 1 and ends with a 1 }

(0(0∪1)*0) ∪ (1(0∪1)*1) ∪ 1 ∪ 0 ∪ ε

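The claimed description of D can be sanity-checked by brute force on short strings. The Python sketch below translates the expression into the syntax of the standard re module (an assumption of this example: ∪ becomes | and 0∪1 becomes [01]) and compares it with a direct count of the substrings 01 and 10; the helper names are illustrative.

```python
import re
from itertools import product

# (0(0∪1)*0) ∪ (1(0∪1)*1) ∪ 1 ∪ 0 ∪ ε, written for Python's re module
D_REGEX = re.compile(r"0[01]*0|1[01]*1|1|0|")


def occurrences(w, pattern):
    """Number of positions at which the two-symbol pattern occurs in w."""
    return sum(1 for k in range(len(w) - 1) if w[k:k + 2] == pattern)


def in_D(w):
    return occurrences(w, "01") == occurrences(w, "10")


def mismatches(max_len):
    """Strings up to max_len on which the regex and the definition disagree."""
    return [
        w
        for n in range(max_len + 1)
        for w in map("".join, product("01", repeat=n))
        if in_D(w) != bool(D_REGEX.fullmatch(w))
    ]
```

An empty result from mismatches(10) supports the equality for all strings of length at most 10; it is a check, not a proof.
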
REGULAR EXPRESSIONS
σ is a regular expression representing {σ}
ε is a regular expression representing {ε}
∅ is a regular expression representing ∅

If R1 and R2 are regular expressions
representing L1 and L2 then:
(R1R2) represents L1 · L2
(R1 ∪ R2) represents L1 ∪ L2
(R1)* represents L1*

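The inductive definition can be mirrored by a small evaluator that computes the represented language, cut off at a maximum string length so that the result stays finite. This Python sketch is illustrative only: the tuple encoding of expressions ("sym", "eps", "empty", "cat", "union", "star") and the function name lang are assumptions, not notation from the lecture.

```python
def lang(r, n):
    """Strings of length <= n in the language represented by expression r."""
    kind = r[0]
    if kind == "sym":                        # σ represents {σ}
        return {r[1]} if n >= 1 else set()
    if kind == "eps":                        # ε represents {ε}
        return {""}
    if kind == "empty":                      # ∅ represents ∅
        return set()
    if kind == "union":                      # (R1 ∪ R2) represents L1 ∪ L2
        return lang(r[1], n) | lang(r[2], n)
    if kind == "cat":                        # (R1R2) represents L1 · L2
        return {u + v for u in lang(r[1], n) for v in lang(r[2], n)
                if len(u + v) <= n}
    if kind == "star":                       # (R1)* represents L1*
        base = lang(r[1], n)
        result, frontier = {""}, {""}
        while frontier:                      # keep appending until nothing new appears
            frontier = {u + v for u in frontier for v in base
                        if len(u + v) <= n} - result
            result |= frontier
        return result
    raise ValueError(f"unknown expression kind: {kind!r}")


ZERO_STAR = ("star", ("sym", "0"))
SINGLE_ONE = ("cat", ZERO_STAR, ("cat", ("sym", "1"), ZERO_STAR))   # 0*10*
```

For instance, lang(("star", ("empty",)), 3) evaluates to {""}, matching the fact that ∅* represents {ε}.
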
PRECEDENCE

Tightest: Star (“*”)
Concatenation (“.”, “·”)
Loosest: Union (“∪”, “+”, “|”)

EXAMPLE

R1*R2 ∪ R3 = ( ( R1* ) R2 ) ∪ R3

{ w | w has exactly a single 1 }

0*10*

What language does ∅* represent?
{ε}

{ w | w has length ≥ 3 and its 3rd symbol is 0 }

000(0∪1)* ∪ 010(0∪1)* ∪
100(0∪1)* ∪ 110(0∪1)*
= (0∪1)(0∪1)0(0∪1)*

{ w | every odd position of w is a 1 }

1((0∪1)1)*(0∪1∪ε) ∪ ε

Also

(1(0∪1))*(1∪ε)

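The example expressions can be compared against their set-builder descriptions on all short strings. The Python sketch below rewrites each expression for the standard re module (an assumption of this example: ∪ becomes |, 0∪1 becomes [01], and ε becomes an empty alternative); the table EXAMPLES and the function disagreements are illustrative names.

```python
import re
from itertools import product


def odd_positions_are_one(w):
    """Positions are counted from 1, so odd positions are w[0], w[2], ..."""
    return all(c == "1" for c in w[0::2])


# (description, expression in Python syntax, membership test)
EXAMPLES = [
    ("exactly a single 1", r"0*10*", lambda w: w.count("1") == 1),
    ("3rd symbol is 0, four-way union", r"000[01]*|010[01]*|100[01]*|110[01]*",
     lambda w: len(w) >= 3 and w[2] == "0"),
    ("3rd symbol is 0, compact form", r"[01][01]0[01]*",
     lambda w: len(w) >= 3 and w[2] == "0"),
    ("odd positions are 1, first form", r"1([01]1)*([01]|)|", odd_positions_are_one),
    ("odd positions are 1, second form", r"(1[01])*(1|)", odd_positions_are_one),
]


def disagreements(max_len):
    """(description, string) pairs where an expression and its description differ."""
    bad = []
    for name, pattern, member in EXAMPLES:
        compiled = re.compile(pattern)
        for n in range(max_len + 1):
            for w in map("".join, product("01", repeat=n)):
                if member(w) != bool(compiled.fullmatch(w)):
                    bad.append((name, w))
    return bad
```

If disagreements(8) comes back empty, each expression agrees with its description on every string of length at most 8, and the paired forms agree with each other on those strings.
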
EQUIVALENCE
L can be represented by a regexp
⇔
L is a regular language

L can be represented by a regexp
⇒
L is a regular language

Given regular expression R, we show there
exists NFA N such that R represents L(N)
Induction on the length of R:
Base Cases (R has length 1):

R = σ
(matches a single symbol)

R = ε
(matches the empty string)

R = ∅
(matches nothing)

Inductive Step:
Assume R has length k > 1 and that any regular
expression of length < k represents a language
that can be recognized by an NFA

Three possibilities for R:
R = R1 ∪ R2 (Union Theorem!)
R = R1 R2
R = (R1)*

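The induction is effectively a recursive procedure for building the NFA. The sketch below follows that recursion in Python, but it glues the pieces together with ε arrows in the style of Thompson's construction, which is swapped in here for the closure theorems the slide appeals to. The tuple encoding of expressions and all function names are assumptions made for this example.

```python
from itertools import count

EPS = ""  # assumption: the empty string stands in for ε


def arrow(delta, src, label, dst):
    delta.setdefault((src, label), set()).add(dst)


def build(r, delta, fresh):
    """Add an NFA fragment for r to delta; return its (start, accept) states."""
    s, t = next(fresh), next(fresh)
    kind = r[0]
    if kind == "sym":                       # base case: single symbol
        arrow(delta, s, r[1], t)
    elif kind == "eps":                     # base case: empty string
        arrow(delta, s, EPS, t)
    elif kind == "empty":                   # base case: no arrow, t is unreachable
        pass
    elif kind == "union":
        for sub in (r[1], r[2]):
            a, b = build(sub, delta, fresh)
            arrow(delta, s, EPS, a)
            arrow(delta, b, EPS, t)
    elif kind == "cat":
        a1, b1 = build(r[1], delta, fresh)
        a2, b2 = build(r[2], delta, fresh)
        arrow(delta, s, EPS, a1)
        arrow(delta, b1, EPS, a2)
        arrow(delta, b2, EPS, t)
    elif kind == "star":
        a, b = build(r[1], delta, fresh)
        for src in (s, b):                  # skip the body, or (re-)enter it
            arrow(delta, src, EPS, a)
            arrow(delta, src, EPS, t)
    else:
        raise ValueError(f"unknown expression kind: {kind!r}")
    return s, t


def eps_closure(states, delta):
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, EPS), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


def matches(r, w):
    """Build the NFA for r and simulate it on w."""
    delta = {}
    start, accept = build(r, delta, count())
    current = eps_closure({start}, delta)
    for symbol in w:
        moved = set()
        for q in current:
            moved |= delta.get((q, symbol), set())
        current = eps_closure(moved, delta)
    return accept in current


# (1(0 ∪ 1))* in the tuple encoding
EXAMPLE = ("star", ("cat", ("sym", "1"), ("union", ("sym", "0"), ("sym", "1"))))
```

Each case allocates two fresh states, so the resulting NFA has a number of states linear in the length of the expression.
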
Have Shown

L can be represented by a regexp
⇒
L is a regular language

Transform (1(0 ∪ 1))* to an NFA

[Figure: resulting NFA, with arrows labeled ε, 1 and 1,0]

L can be represented by a regexp
⇐
L is a regular language

Proof idea: Transform an NFA for L into a
regular expression by removing states and
relabeling the arrows with regular expressions

[Figure: the NFA wrapped with a new start state and a new accept state, attached by ε arrows]

Add unique and distinct start and accept states
While machine has more than 2 states:
Pick an internal state, rip it out and
relabel the arrows with regexps,
to account for the missing state

[Figure: a state entered on 0, looping on 1 and left on 0 is ripped out and replaced by a single arrow labeled 01*0]

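[Figure: q0 --ε--> q1 --b--> q2 --ε--> q3, with a self-loop labeled a on q1 and a self-loop labeled a,b on q2]

[Figure: after ripping out q1: q0 --a*b--> q2 --ε--> q3, with the self-loop labeled a,b on q2]

[Figure: after ripping out q2: q0 --(a*b)(a∪b)*--> q3]

R(q0,q3) = (a*b)(a∪b)*
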
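[Figure: NFA with states q1, q2, q3 and arrows labeled a, b, bb and ε]

[Figure: after ripping out q3, the arrow from q1 to q2 is relabeled a ∪ ba]

[Figure: after ripping out q2]

R(q1,q1) = bb ∪ (a ∪ ba)b*a
Arrow from q1 to the accept state: b ∪ (a ∪ ba)b*ε

(bb ∪ (a ∪ ba)b*a)* (b ∪ (a ∪ ba)b*)
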
Convert the NFA to a regular expression

[Figure: NFA with states q1, q2, q3 and arrows labeled a, b and ε; as states are ripped out the arrows are relabeled with (a ∪ b)b*b, bb*b, (a ∪ b)b*b(bb*b)* and (a ∪ b)b*b(bb*b)*a]

((a ∪ b)b*b(bb*b)*a)* ∪
((a ∪ b)b*b(bb*b)*a)*(a ∪ b)b*b(bb*b)*

Formally: Add qstart and qaccept to create G
Run CONVERT(G): (return regexp)
If #states = 2
return the expression on the arrow
going from qstart to qaccept
If #states > 2
select qrip ∈ Q different from qstart and qaccept
define Q′ = Q – {qrip}
define R′ as:
R′(qi,qj) = R(qi,qrip)R(qrip,qrip)*R(qrip,qj) ∪ R(qi,qj)
(Q′ and R′ define G′, a GNFA)
return CONVERT(G′)

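CONVERT can be sketched as a short recursive function over a table of arrow labels. In the Python sketch below, the labels are strings in the syntax of the standard re module so that the result can be tested, a missing entry stands for ∅, and the empty string stands for ε. Those encodings, the helper names and the sample GNFA (an assumed reading of the four-state example with R(q0,q3) = (a*b)(a∪b)*) are all assumptions of this example.

```python
import re
from itertools import product


def union(r, s):
    if r is None:
        return s
    if s is None:
        return r
    return r if r == s else f"(?:{r}|{s})"


def concat(r, s):
    # ∅ concatenated with anything is ∅; labels never contain a bare top-level "|"
    return None if r is None or s is None else r + s


def star(r):
    # ∅* = ε and ε* = ε
    return "" if r is None or r == "" else f"(?:{r})*"


def convert(states, R, q_start="qstart", q_accept="qaccept"):
    """Rip out one state at a time; return the label from q_start to q_accept."""
    if len(states) == 2:
        return R.get((q_start, q_accept))
    q_rip = next(q for q in states if q not in (q_start, q_accept))
    rest = [q for q in states if q != q_rip]                  # Q' = Q - {qrip}
    loop = star(R.get((q_rip, q_rip)))
    new_R = {}
    for qi in rest:
        for qj in rest:
            if qi == q_accept or qj == q_start:
                continue
            via = concat(concat(R.get((qi, q_rip)), loop), R.get((q_rip, qj)))
            label = union(via, R.get((qi, qj)))               # R'(qi, qj)
            if label is not None:
                new_R[(qi, qj)] = label
    return convert(rest, new_R, q_start, q_accept)            # CONVERT(G')


# Assumed GNFA for the example, with qstart and qaccept already added.
STATES = ["qstart", "q0", "q1", "q2", "q3", "qaccept"]
G = {("qstart", "q0"): "", ("q0", "q1"): "", ("q1", "q1"): "a",
     ("q1", "q2"): "b", ("q2", "q2"): "(?:a|b)", ("q2", "q3"): "",
     ("q3", "qaccept"): ""}
REGEX = convert(STATES, G)


def agrees(max_len):
    """Compare REGEX with 'w contains a b' on all strings up to max_len."""
    pattern = re.compile(REGEX)
    return all(bool(pattern.fullmatch(w)) == ("b" in w)
               for n in range(max_len + 1)
               for w in map("".join, product("ab", repeat=n)))
```

The sketch always rips the first internal state in the list; any other order gives an equivalent expression, although its written form may differ.
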
CONVERT(G) is “equivalent” to G
Proof by induction on k (number of states in G)
Base Case:
k = 2
Inductive Step:
Assume claim is true for k-1 states
We first note that G and G′ are “equivalent”
But, by the induction hypothesis, G′ is
“equivalent” to CONVERT(G′)

And CONVERT(G′) is equivalent to CONVERT(G)

QED

[Figure: summary diagram relating DFA, NFA, Regular Language (by definition, DEF) and Regular Expression]

Read Chapter 1.3 of the book for next time
