You are on page 1of 34

Chapter 2

Regular Expressions, Regular Languages


and
Regular Grammars

1
Regular Expression
• We have known that a regular language can be
described by some dfa.
• Now we look at other ways of representing regular
languages.
Definition
Let Σ be a given alphabet. Then
1.  ,  and a  Σ are all regular expressions. These are called primitive
regular expressons.
2. If r1 and r2 are regular expressions, so are r1  r2 , r1  r2 , r1* and (r1 ).
3. A string is a regular expression if and only if it can derived from
the primitive regular expressions by a finite number of applications
of the rules in (2).
2
Regular Expressions…
Regular Expression (RE):
• E is a regular expression over  if E is one of:
•e
• a, where a  
• If r and s are regular expressions (REs), then the following
expressions also regular:
• r | s  (r or s)
• rs  (r followed by s)
• r*  (r repeated zero or more times)

• Each RE has an equivalent regular language (RL).

3
Regular Expressions…
• Language Associated with Regular Expressions
Definition
The language L(r) denoted by any regular expression r is defined
by the following rules.
1.  is a reguar expression denoting the empty set,
2. λ is a regular expression denoting {λ},
Termination
3. for every a  Σ , a is a regular expression denoting {a}.condition
If r1 and r2 are regular expressions, then
4. L(r1  r2 )  L(r1 )  L(r2 ),
5. L(r1  r2 )  L(r1 )L(r2 ), Reduce L(r ) to simpler components.
6. L((r1 ))  L(r1 ),
7. L(r1* )  ( L(r1 ))* .
4
Regular Expressions…
Regular Language (RL):
• L is a regular language over  if L is one of:
• empty set
• {e} a set that contains empty string
• {a} where a  

• If R and S are regular languages (RL), then the following


languages also regular:
• R  S = {w | w  R or w  S}
• RS = {rs | r  R and s  S}
• R* = R0  R1  R2  R3  …

5
Regular Expressions...
Rules for Specifying Regular Expressions:
1. e is a regular expression  L = {e}
2. If a is in , a is a regular expression  L = {a}, the set
containing the string a.
3. Let r and s be regular expressions with languages L(r) and
L(s). Then
p a. r | s is a RE  L(r)  L(s)
r
e
c b. rs is a RE  L(r) L(s)
e
d
e
c. r* is a RE  (L(r))*
n
c d. (r) is a RE L(r), extra parenthesis
e

6
Regular Expressions…
abc
• concatenation (“followed by”)

a|b|c
• alternation (“or”)

*
• zero or more occurrences

+
• one or more occurrences
7
Regular Expressions…
• Examples:
• {0, 1}{00, 11} = {000, 011, 100, 111}
• {0}* = {e, 0, 00, 000, 0000, …}
• {e}* = {e}
• {10, 01}* = {e, 10, 1010, 101010, …,
01, 0101, 010101, …,
1001, 100101, 10010101, …,
0110, 011010, 01101010, …}
• Notational shorthand:
• L0 = {e}
• Li = LLi-1
• L+ = LL*
8
Regular Expressions…
• Let L be a language over {a, b}, each string in L contains
the substring bb
• L = {a, b}*{bb}{a, b}*

• L is regular language (RL). Why?


• {a} and {b} are RLs
• {a, b} is RL
• {a, b}* is RL
• {b}{b} = {bb} is also RL
• Then L = {a, b}*{bb}{a, b}* is RL

9
Regular Expressions…
• Let L be a language over {a, b}, each string in L
• begins and ends with an a
• contains at least one b
• L = {a}{a, b}*{b}{a, b}*{a}

• L is regular language (RL). Why?


• {a} and {b} are RLs
• {a, b} is RL
• {a, b}* is RL
• Then L = {a}{a, b}*{b}{a, b}*{a} is RL

10
Regular Expressions…
• L = {a, b}*{bb}{a, b}*
• RE = (a|b)*bb(a|b)*

• L = {a}{a, b}*{b}{a, b}*{a}


• RE = a(a|b)*b(a|b)*a

• This RE = (a)|((b)*(c)) is equivalent to a|b*c

• We say REs r and s are equivalent (r=s), iff r and s represent the
same language.
• Example: r = a|b, s = b|a  r = s Why?
• Since L(r) = L(s) = {a, b}
11
Regular Expressions…
• Let = {a, b}

• RE a|b  L = {a, b}
• RE (a|b)(a|b)  L = {aa, ab, ba, bb}
• RE aa|ab|ba|bb same as above
• RE a*  L = {, a , aa, aaa, …}
• RE (a|b)*  L = set of all strings of a’s and b’s
including 
• RE (a*b*)* same as above
• RE a|a*b  L = {a,b,ab,aab,aaab, …}

12
Regular Expressions…

Algebraic Properties of regular Expressions


r|s=s|r
r | (s | t) = (r | s) | t
(r s) t = r (s t)
r(s|t)=rs|rt
(s|t)r=sr|tr
r=r=r

r* = (r*)* = ( r |  )+ = r+ | 
r** = r*
r+ = r r*

13
Regular Expression Examples
• All strings of 1s and 0s.
(0 | 1)*

• All strings of 1s and 0s beginning with a 1.


1 (0 | 1)*

14
Regular Expression Examples…
• All strings containing two or more 0s.
(1|0)*0(1|0)*0(1|0)*

• All strings containing an even number of 0s.


(1*01*01*)* | 1*

• All strings of alternating 0s and 1s.


(  | 1 ) ( 0 1 )* (  | 0 )

• Strings over the alphabet {0, 1} with no consecutive 0's


• (1 | 01 )* (0 | )
• 1*(01+)* (0 | )
• 1*(011*)* (0 | )
15
Regular Expression Examples…
• Describe the following in English:
• (0|1)*
• all strings over {0, 1}

• b*ab*ab*ab* - describe the language?

16
Regular Expression Examples…
• More Examples of RE:

• 01*
• {0, 01, 011, 0111, …..}

• (01*)(01)
• {001, 0101, 01101, 011101, …..}

• (0 | 1)* = {0, 1, 00, 01, 10, 11, …..}


• i.e., all strings of 0 and 1

• (0 | 1)* 00 (0 | 1)* = {00, 1001, …..}


• i.e., all 0 and 1 strings containing a “00”

17
Regular Expression Examples…
• More Examples of RE:

• (1 | 10)*
• all strings starting with “1” and containing no “00”

• (0 | 1)*011
• all strings ending with “011”

• 0*1*
• all strings with no “0” after “1”

• 00*11*
• all strings with at least one “0” and one “1”, and no “0” after “1”
18
Connection Between RE & RL
• A language L is called regular if and only if there exists
some DFA M such that L = L(M).

• Since a DFA has an equivalent NFA, then


• A language L is called regular if and only if there exists some
NFA N such that L = L(N).

• If we have a RE r, we can construct an NFA that accept


L(r).

19
Connection Between RE & RL…
• For  in the regular expression, construct NFA

q0 q1
L={}=
(1) nfa accepts  .

• For  in the regular expression, construct NFA

q0 λ q1 L = {e}
(2) nfa accepts {}.
• For a   in the regular expression, construct NFA

q0 a q1 L = {a}
(3) nfa accepts a.
20
Connection Between RE & RL: Example
• Build an NFA-e that accepts (a|b)*ba

a start a

start b
b
start b e a
q1
ba
a
e
e
start

a|b e b
e

21
Example…
• Build an NFA-e that accepts (a|b)*ba

a
e
(a|b) *
e
e e

e b e

22
Example…
• Build an NFA-e that accepts (a|b)*ba

a
e
e e
e

e b e

e
b e a

23
Example…
• Let’s try a ( b | c )*
a b c
1. a, b, & c S0 S1 S0 S1 S0 S1

 b 
S1 S2
S0 S5
2. b | c  c 
S3 S4


b
 S2 S3 
S0  S1 S6  S7
 c 
3. ( b | c ) * S4 S5

24
Example…

4. a ( b | c )*
b
 S4 S5 
S0
a
S1   
S2 S3 S8 S9
 c 
S6 S7

b|c

a
S0 S1

25
Example…
Let : a
abb 3 patterns
a*b+

NFA’s :
start a
1 2

start a b b
3 4 5 6

a b
start
7 8
b

26
Example…
NFA for : a | abb | a*b+

a
1 2

start  a b b
0 3 4 5 6

 a b
7 b 8

27
NFA to RE
• If L is accepted by some NFA-e, then L is represented by
some regular expression.
• An expression graph is like a state diagram but it can
have regular expressions as labels on arcs.
• An NFA-e is an expression graph.
• An expression graph can be reduced to one with just
two states.
• If we reduce an NFA-e in this way, the arc label then
corresponds to the regular expression representing it.

28
NFA to RE...
• Merge Edges : Replace state by Edges
a c
a b
b q0
p q1
c

q0
ac*b q1

a|b|c

29
NFA to RE…
• Let G be the state diagram of a finite automata.
• Let m be the number of final states of G.
• Make m copies of G, each of which has one final state.
• Call these graphs G1, G2, …, Gm
• For each Gt
• Repeat
• Do the steps in the previous slide
• Until the only states in Gt are the start state and the single
final state.
• Determine the RE of Gt
• The RE of G is obtained by joining RE’s of each Gt by 
or |.
30
NFA to RE…
b b
G: start c c
1 2 3

b b
G1: start c c
1 2 3

b b
G2: start c c
1 2 3

31
NFA to RE…
b b
G1: start c c
1 2 3

b b
cc
1 3

b
1  b*

32
NFA to RE…
b b
G2: start c c
1 2 3

b b
cc
1 3

 b*ccb*

RE for G  b* | b*ccb*

33
NFA to RE: Exercise
• Given the following NFA, find the regular expression it
accepts.
b b a,b

q0 a q2
q1 b
a

34

You might also like