You are on page 1of 35

Regular Expressions

Fall 2006 Costas Busch - RPI 1


Regular Expressions
Regular expressions
describe regular languages

Example: (a  b  c) *

describes the language


 a, bc *    , a, bc, aa, abc, bca,...

Fall 2006 Costas Busch - RPI 2


Recursive Definition
Primitive regular expressions: ,  , 

Given regular expressions r1 and r2

r1  r2
r1  r2
Are regular expressions
r1 *
 r1 
Fall 2006 Costas Busch - RPI 3
Examples

A regular expression:  a  b  c  * (c  )

Not a regular expression:  a  b 

Fall 2006 Costas Busch - RPI 4


Languages of Regular Expressions

L r  : language of regular expression r

Example
L (a  b  c) *    , a, bc, aa, abc, bca,...

Fall 2006 Costas Busch - RPI 5


Definition

For primitive regular expressions:

L    

L      

L a    a
Fall 2006 Costas Busch - RPI 6
Definition (continued)

For regular expressions r1 and r2

L r1  r2   L r1   L r2 

L r1  r2   L r1  L r2 

L r1 *   L r1   *

L  r1    L r1 
Fall 2006 Costas Busch - RPI 7
Example
Regular expression:  a  b  a *
L  a  b   a *  L  a  b   L a *
 L a  b  L a *
  L a   L b    L a   *
   a   b    a  *
  a, b   , a, aa, aaa,...
  a, aa, aaa,..., b, ba, baa,...
Fall 2006 Costas Busch - RPI 8
Example

Regular expression r   a  b  *  a  bb 

L r    a, bb, aa, abb, ba, bbb,...

Fall 2006 Costas Busch - RPI 9


Example

Regular expression r   aa  *  bb  * b

L r   {a b
2n 2m
b : n, m  0}

Fall 2006 Costas Busch - RPI 10


Example

Regular expression r  (0  1) * 00 (0  1) *

L(r ) = { all strings containing substring 00 }

Fall 2006 Costas Busch - RPI 11


Example

Regular expression r  (1  01) * (0   )

L(r ) = { all strings without substring 00 }

Fall 2006 Costas Busch - RPI 12


Equivalent Regular Expressions

Definition:

Regular expressions r1 and r2

are equivalent if L(r1 )  L(r2 )

Fall 2006 Costas Busch - RPI 13


Example
L = { all strings without substring 00 }

r1  (1  01) * (0   )
r2  (1* 011*) * (0   )  1* (0   )

r1 and r2
L(r1 )  L(r2 )  L
are equivalent
regular expressions
Fall 2006 Costas Busch - RPI 14
Regular Expressions
and
Regular Languages

Fall 2006 Costas Busch - RPI 15


Theorem

Languages
Generated by
Regular Expressions
 Regular
Languages

Fall 2006 Costas Busch - RPI 16


Proof:

Languages
Generated by  Regular
Languages
Regular Expressions

Languages
Generated by  Regular
Languages
Regular Expressions

Fall 2006 Costas Busch - RPI 17


Proof - Part 1

Languages
Generated by  Regular
Languages
Regular Expressions

For any regular expression r


the language L(r ) is regular

Proof by induction on the size of r


Fall 2006 Costas Busch - RPI 18
Induction Basis
Primitive Regular Expressions: ,  , 
Corresponding
NFAs
L( M1 )    L()

regular
L( M 2 )  {}  L( )
languages
a
L( M 3 )  {a}  L(a )

Fall 2006 Costas Busch - RPI 19


Inductive Hypothesis

Suppose
that for regular expressions r1 and r2 ,
L(r1 ) and L(r2 ) are regular languages

Fall 2006 Costas Busch - RPI 20


Inductive Step
We will prove:
L r1  r2 

L r1  r2 
Are regular
Languages
L r1 *

L  r1  
Fall 2006 Costas Busch - RPI 21
By definition of regular expressions:

L r1  r2   L r1   L r2 

L r1  r2   L r1  L r2 

L r1 *   L r1   *

L  r1    L r1 
Fall 2006 Costas Busch - RPI 22
By inductive hypothesis we know:
L(r1 ) and L(r2 ) are regular languages

We also know:
Regular languages are closed under:
Union L r1   L r2 
Concatenation L r1  L r2 
Star  L r1   *
Fall 2006 Costas Busch - RPI 23
Therefore:

L r1  r2   L r1   L r2 

Are regular
L r1  r2   L r1  L r2 
languages

L r1 *   L r1   *

L((r1 ))  L(r1 ) is trivially a regular language


(by induction hypothesis)
Fall 2006 Costas Busch - RPI End of Proof-Part 1 24
Using the regular closure of these operations,
we can construct recursively the NFA M
that accepts L(M )  L(r )

Example: r  r1  r2
L(M1 )  L(r1 )
L(M )  L(r )


L(M2 )  L(r2 ) 

Fall 2006 Costas Busch - RPI 25


Proof - Part 2

Languages
Generated by  Regular
Languages
Regular Expressions

For any regular language L there is


a regular expression r with L( r )  L

We will convert an NFA that accepts L


to
Fall 2006
a regular expression
Costas Busch - RPI 26
Since L is regular, there is a
NFA M that accepts it

L( M )  L

Take it with a single final state


Fall 2006 Costas Busch - RPI 27
From M construct the equivalent
Generalized Transition Graph
in which transition labels are regular expressions

Example: Corresponding
M Generalized transition graph

a c a c
a, b ab
Fall 2006 Costas Busch - RPI 28
b b
Another Example:
a
q0 q1 a, b q2
b

Transition labels b b
are regular a
expressions q0 q1 a  b q2
b
Fall 2006 Costas Busch - RPI 29
b b
Reducing the states:
a
q0 q1 a  b q2
b

Transition labels
are regular bb * a b
expressions
q0 bb * (a  b) q2
Fall 2006 Costas Busch - RPI 30
Resulting Regular Expression:

bb * a b

q0 bb * (a  b) q2

r  (bb * a ) * bb * (a  b)b *

L(r )  L( M )  L
Fall 2006 Costas Busch - RPI 31
In General
Removing a state: e
d c
qi q qj
a b

ae * d ce * b
ce * d
qi qj
ae * b
Fall 2006 Costas Busch - RPI 32
By repeating the process until
two states are left, the resulting graph is
Initial graph Resulting graph
r1 r4
r3
q0 qf
r2
The resulting regular expression:
r  r1 * r2 (r4  r3r1 * r2 ) *
L(r )  L( M )  L
Fall 2006 Costas Busch - RPI
End of Proof-Part 2 33
Standard Representations
of Regular Languages

Regular Languages

DFAs

Regular
NFAs
Expressions

Fall 2006 Costas Busch - RPI 34


When we say: We are given
a Regular Language L

We mean: Language L is in a standard


representation

(DFA, NFA, or Regular Expression)

Fall 2006 Costas Busch - RPI 35

You might also like