You are on page 1of 19

Regular Expressions

Meenakshi D’Souza

IIIT-Bangalore

Term II 2017-18
Examples of Regular Expressions

Expressions built from a, b, , using operators +, ·, and ∗.


(a∗ + b ∗ ) · c
“Strings of only a’s or only b’s, followed by a c.”
(a + b)∗ abb(a + b)∗
“contains abb as a subword.”
(a + b)∗ b(a + b)(a + b)

(b ∗ ab ∗ a)∗ b ∗
Examples of Regular Expressions

Expressions built from a, b, , using operators +, ·, and ∗.


(a∗ + b ∗ ) · c
“Strings of only a’s or only b’s, followed by a c.”
(a + b)∗ abb(a + b)∗
“contains abb as a subword.”
(a + b)∗ b(a + b)(a + b)
“3rd last letter is a b.”
(b ∗ ab ∗ a)∗ b ∗
Examples of Regular Expressions

Expressions built from a, b, , using operators +, ·, and ∗.


(a∗ + b ∗ ) · c
“Strings of only a’s or only b’s, followed by a c.”
(a + b)∗ abb(a + b)∗
“contains abb as a subword.”
(a + b)∗ b(a + b)(a + b)
“3rd last letter is a b.”
(b ∗ ab ∗ a)∗ b ∗
“Even number of a’s.”
Examples of Regular Expressions

Expressions built from a, b, , using operators +, ·, and ∗.


(a∗ + b ∗ ) · c
“Strings of only a’s or only b’s, followed by a c.”
(a + b)∗ abb(a + b)∗
“contains abb as a subword.”
(a + b)∗ b(a + b)(a + b)
“3rd last letter is a b.”
(b ∗ ab ∗ a)∗ b ∗
“Even number of a’s.”
“Every 4-bit block of the form w [4i, 4i + 1, 4i + 2, 4i + 3] has
even parity.”
Examples of Regular Expressions

Expressions built from a, b, , using operators +, ·, and ∗.


(a∗ + b ∗ ) · c
“Strings of only a’s or only b’s, followed by a c.”
(a + b)∗ abb(a + b)∗
“contains abb as a subword.”
(a + b)∗ b(a + b)(a + b)
“3rd last letter is a b.”
(b ∗ ab ∗ a)∗ b ∗
“Even number of a’s.”
“Every 4-bit block of the form w [4i, 4i + 1, 4i + 2, 4i + 3] has
even parity.”
(0000 + 0011 + · · · + 1111)∗ ( + 0 + 1 + · · · + 111)
Formal definitions

Syntax of regular expresions over an alphabet A:

r ::= ∅ | a | r + r | r · r | r ∗

where a ∈ A.
Semantics: associate a language L(r ) ⊆ A∗ with regexp r .

L(∅) = {}
L(a) = {a}
L(r + r 0 ) = L(r ) ∪ L(r 0 )
L(r · r 0 ) = L(r ) · L(r 0 )
L(r ∗ ) = L(r )∗ .
Formal definitions

Syntax of regular expresions over an alphabet A:

r ::= ∅ | a | r + r | r · r | r ∗

where a ∈ A.
Semantics: associate a language L(r ) ⊆ A∗ with regexp r .

L(∅) = {}
L(a) = {a}
L(r + r 0 ) = L(r ) ∪ L(r 0 )
L(r · r 0 ) = L(r ) · L(r 0 )
L(r ∗ ) = L(r )∗ .

Question: Do we need  in syntax?


Formal definitions

Syntax of regular expresions over an alphabet A:

r ::= ∅ | a | r + r | r · r | r ∗

where a ∈ A.
Semantics: associate a language L(r ) ⊆ A∗ with regexp r .

L(∅) = {}
L(a) = {a}
L(r + r 0 ) = L(r ) ∪ L(r 0 )
L(r · r 0 ) = L(r ) · L(r 0 )
L(r ∗ ) = L(r )∗ .

Question: Do we need  in syntax?


No.  ≡ ∅∗ .
Example: Semantics of regexp

(a∗ + b ∗ ) · c
·

∗ ∗

a b c
Example: Semantics of regexp

(a∗ + b ∗ ) · c
·

∗ ∗

{a} a b {b} c {c}


Example: Semantics of regexp

(a∗ + b ∗ ) · c
·

{, a, aa, . . .} ∗ ∗ {, b, bb, . . .}

{a} a b {b} c {c}


Example: Semantics of regexp

(a∗ + b ∗ ) · c
·

{, a, b, aa, bb, . . .} +

{, a, aa, . . .} ∗ ∗ {, b, bb, . . .}

{a} a b {b} c {c}


Example: Semantics of regexp

(a∗ + b ∗ ) · c
· {c, ac, bc, aac, bbc, . . .}

{, a, b, aa, bb, . . .} +

{, a, aa, . . .} ∗ ∗ {, b, bb, . . .}

{a} a b {b} c {c}


Kleene’s Theorem: RE = DFA

Class of languages defined by regular expressions coincides with


regular languages.
Proof
RE → DFA: Use closure properties of regular languages.
DFA → RE:
DFA → RE: Kleene’s construction

Let A = (Q, s, δ, F ) be given DFA.


Define Lpq = {w ∈ A∗ | δ(p,
b w ) = q}.
S
Then L(A) = f ∈F Lsf .
For X ⊆ Q, define LX = {w ∈ A∗ | δ(p,
pq
b w) =
q via a path that stays in X except for first and last states}
X

p q

LQ
S
Then L(A) = f ∈F sf .
DFA → RE: Kleene’s construction

p q

Advantage:
X ∪{r } X ∗
Lpq = LX X X
pq + Lpr · (Lrr ) · Lrq .
DFA → RE: Kleene’s construction (2)
Method:
Begin with LQsf for each f ∈ F .
Simplify by using terms with strictly smaller X ’s:
X ∪{r } X ∗
Lpq = LX X X
pq + Lpr · (Lrr ) · Lrq .

For base terms, observe that



{} {a | δ(p, a) = q} 6 q
if p =
Lpq =
{a | δ(p, a) = q} ∪ {} if p = q.

Exercise: convert NFA/DFA’s below to RE’s:


a, b b b
a
b

s f
a
DFA → RE: Kleene’s construction (2)
Method:
Begin with LQsf for each f ∈ F .
Simplify by using terms with strictly smaller X ’s:
X ∪{r } X ∗
Lpq = LX X X
pq + Lpr · (Lrr ) · Lrq .

For base terms, observe that



{} {a | δ(p, a) = q} 6 q
if p =
Lpq =
{a | δ(p, a) = q} ∪ {} if p = q.

Exercise: convert NFA/DFA’s below to RE’s:


a, b b b
a
b

s f e o
a

You might also like