
Hawassa University Daye Campus
Department of Computer Science
Automata and Complexity Theory

Course code: CoSc3071


By: Mekonen M.
Chapter Two
Unit 3
Regular Expression, Languages and Grammars
Regular expression
• One way of describing regular languages is via the notation of regular
expressions. Regular expressions describe exactly the regular languages.
• Regular expressions are an algebraic way to describe languages. OR
• A regular expression is a sequence of characters that defines a pattern.
Notational shorthands
1. One or more instances: +
2. Zero or more instances: *
3. Zero or one instances: ?
4. Alphabets: Σ

Mekonen M. # CoSc3071  Unit 3 – Regular Expression 3


Rules to define regular expression
∅, λ and a ∈ Σ are all regular expressions. These are called primitive regular expressions. All other
regular expressions are derived from these primitives.
1. ∅ is a regular expression denoting the empty language { }.
2. λ is a regular expression for the null string, denoting {λ}.
3. If a is a symbol in Σ, then a is a regular expression denoting {a}.
4. Suppose r1 and r2 are regular expressions denoting the languages L1 and L2. Then,
a. (r1) is a regular expression denoting L1
b. r1 r2 is a regular expression denoting L1 L2 (concatenation)
c. r1* is a regular expression denoting L1* (closure)
d. r1 + r2 is a regular expression denoting L1 ∪ L2 (union)
5. Only those "formulas" that can be produced by the application of rules 1-4 are
regular expressions over Σ.
The language denoted by regular expression is said to be a regular set.
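The rules above can be read as a recursive definition of the language of an expression. The following is a minimal Python sketch of that reading, not part of the original slides: the tuple encoding of expressions and the function name `language` are assumptions made for illustration, and the language is cut off at a maximum string length because r* usually denotes infinitely many strings.

```python
# Regular expressions as nested tuples (an encoding assumed for this sketch):
#   ("empty",)          the expression ∅
#   ("lambda",)         the expression λ
#   ("sym", "a")        a single symbol a in Σ
#   ("union", r1, r2)   r1 + r2
#   ("concat", r1, r2)  r1 r2
#   ("star", r1)        r1*

def language(r, max_len):
    """Return the strings of length <= max_len in the language denoted by r."""
    kind = r[0]
    if kind == "empty":
        return set()
    if kind == "lambda":
        return {""}
    if kind == "sym":
        return {r[1]} if max_len >= 1 else set()
    if kind == "union":
        return language(r[1], max_len) | language(r[2], max_len)
    if kind == "concat":
        left = language(r[1], max_len)
        right = language(r[2], max_len)
        return {u + v for u in left for v in right if len(u + v) <= max_len}
    if kind == "star":
        base = language(r[1], max_len)
        result = {""}          # zero repetitions
        frontier = {""}
        while frontier:        # append one more factor until nothing new fits
            frontier = {u + v for u in frontier for v in base
                        if len(u + v) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown node {kind!r}")
```

For example, `language(("star", ("sym", "a")), 3)` yields the strings λ, a, aa and aaa, which is the part of L(a*) up to length 3.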
Regular expression

• L = zero or more occurrences of a = a*
L = {ε, a, aa, aaa, aaaa, aaaaa, …} (infinitely many strings)



Regular expression

• L = one or more occurrences of a = a+
L = {a, aa, aaa, aaaa, aaaaa, …} (infinitely many strings)



Regular expression examples
1. 0 or 1
Strings: 0, 1    R.E. = 0 | 1
2. 0 or 11 or 111
Strings: 0, 11, 111    R.E. = 0 | 11 | 111
3. String having zero or more a.
Strings: ε, a, aa, aaa, aaaa, …    R.E. = a*
4. String having one or more a.
Strings: a, aa, aaa, aaaa, …    R.E. = a+
5. Regular expression over Σ = {a, b, c} that represents all strings of length 3.
Strings: abc, bca, bbb, cab, aba, …    R.E. = (a|b|c)(a|b|c)(a|b|c)
6. All binary strings
Strings: 0, 11, 101, 10101, 1111, …    R.E. = (0|1)+



Regular expression examples
7. 0 or more occurrences of either a or b or both
Strings: ε, a, aa, abab, bab, …    R.E. = (a|b)*
8. 1 or more occurrences of either a or b or both
Strings: a, aa, abab, bab, bbbaaa, …    R.E. = (a|b)+
9. Binary no. ends with 0
Strings: 0, 10, 100, 1010, 11110, …    R.E. = (0|1)*0
10. Binary no. ends with 1
Strings: 1, 101, 1001, 10101, …    R.E. = (0|1)*1
11. Binary no. starts and ends with 1
Strings: 11, 101, 1001, 10101, …    R.E. = 1(0|1)*1
12. String starts and ends with same character
Strings: 00, 101, aba, baab, …
Regular expression examples
13. All strings of a and b starting with a
R.E. = a(a|b)*
14. String of 0 and 1 ends with 00
R.E. = (0|1)*00
15. String ends with abb
R.E. = (a|b)*abb
16. String starts with 1 and ends with 0
R.E. = 1(0|1)*0
17. All binary strings with at least 3 characters whose 3rd character is zero
R.E. = (0|1)(0|1)0(0|1)*
18. Language which consists of exactly two b's over the set Σ = {a, b}
R.E. = a*ba*ba*


Regular expression examples
19. The language over Σ = {a, b} such that the 3rd character from the right end of the string is always a
R.E. = (a|b)*a(a|b)(a|b)
20. Any no. of a followed by any no. of b followed by any no. of c
R.E. = a*b*c*
21. String should contain at least three 1s
R.E. = (0|1)*1(0|1)*1(0|1)*1(0|1)*
22. String should contain exactly two 1s
R.E. = 0*10*10*
23. Length of string should be at least 1 and at most 3
R.E. = (0|1) | (0|1)(0|1) | (0|1)(0|1)(0|1)
24. No. of zeros should be a multiple of 3
R.E. = (1*01*01*0)*1*
Regular expression examples
25. The language over Σ = {a, b, c} where the no. of a's should be a multiple of 3
Strings: aaa, baaa, bacaba, aaaaaa, …
R.E. = ((b|c)*a(b|c)*a(b|c)*a)*(b|c)*
26. Even no. of 0s
R.E. = (1*01*0)*1*
27. String should have odd length
R.E. = (0|1)((0|1)(0|1))*
28. String should have even length
R.E. = ((0|1)(0|1))*
29. String starts with 0 and has odd length
R.E. = 0((0|1)(0|1))*
30. String starts with 1 and has even length
R.E. = 1(0|1)((0|1)(0|1))*
31. All strings that begin or end with 00 or 11
Strings: 00101, 10100, 110, 01011, …
R.E. = (00|11)(0|1)* | (0|1)*(00|11)
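A handy way to check an answer to exercises like these is to compare the expression against a direct restatement of the English description on all short strings. The sketch below does this for a few of the binary examples using Python's `re` module. The helper names `binary_strings` and `mismatches` and the `CASES` table are made up for this illustration, and agreement up to a bounded length is only evidence, not a proof.

```python
import re
from itertools import product

def binary_strings(max_len):
    """Yield every string over {0, 1} of length 0..max_len."""
    for n in range(max_len + 1):
        for chars in product("01", repeat=n):
            yield "".join(chars)

# description -> (pattern in Python syntax, predicate restating the description)
CASES = {
    "ends with 00": (r"(0|1)*00", lambda s: s.endswith("00")),
    "3rd character is 0": (r"(0|1)(0|1)0(0|1)*",
                           lambda s: len(s) >= 3 and s[2] == "0"),
    "at least three 1s": (r"(0|1)*1(0|1)*1(0|1)*1(0|1)*",
                          lambda s: s.count("1") >= 3),
    "exactly two 1s": (r"0*10*10*", lambda s: s.count("1") == 2),
    "length 1 to 3": (r"(0|1)|(0|1)(0|1)|(0|1)(0|1)(0|1)",
                      lambda s: 1 <= len(s) <= 3),
    "odd length": (r"(0|1)((0|1)(0|1))*", lambda s: len(s) % 2 == 1),
    "even length": (r"((0|1)(0|1))*", lambda s: len(s) % 2 == 0),
}

def mismatches(pattern, predicate, max_len=8):
    """Strings on which the pattern and the predicate disagree."""
    return [s for s in binary_strings(max_len)
            if (re.fullmatch(pattern, s) is not None) != predicate(s)]
```

Calling `mismatches(*CASES["odd length"])` returns an empty list when the expression and the description agree on every binary string up to length 8; a non-empty list gives concrete counterexamples to study.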



Regular Language
A language is regular if it is accepted by some DFA. Because of the
equivalence of NFA’s and DFA’s, a language is also regular if it is
accepted by some NFA.
For every regular language there is a regular expression, and for
every regular expression there is a regular language.
If we have any regular expression r, we can construct an NFA that
accepts L(r).



Regular Expression to FA
• Regular expressions and finite-state automata have the same
expressive power.
• Given a regular expression E, it is possible to create an ε-NFA M
such that L(M) = L(E).
• To convert the RE to an FA, we are going to use a method called the
subset method.
• This method is used to obtain an FA from the given regular expression.





Regular Expression to FA
Steps to convert RE to FA using subset method
• Step 1: Design a transition diagram for given regular expression, using
NFA with ε moves.
• Step 2: Convert this NFA with ε to NFA without ε.
• Step 3: Convert the obtained NFA to equivalent DFA.
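Steps 2 and 3 can be sketched in a few lines of Python. This is an illustrative sketch rather than the slides' own procedure: it merges the two steps by taking ε-closures while building the DFA, and the dictionary encoding of the transition function, the `EPS` label and the small ε-NFA for (a|b)*ba are all assumptions made here.

```python
from collections import deque

EPS = ""  # label used for ε-moves in this sketch (an assumption)

def eps_closure(states, delta):
    """All states reachable from `states` using only ε-moves."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, EPS), ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)

def subset_construction(start, finals, delta, alphabet):
    """Convert an ε-NFA into a DFA whose states are sets of NFA states."""
    d_start = eps_closure({start}, delta)
    d_delta, d_finals = {}, set()
    queue, seen = deque([d_start]), {d_start}
    while queue:
        S = queue.popleft()
        if S & finals:                      # contains an NFA final state
            d_finals.add(S)
        for a in alphabet:
            moved = {r for q in S for r in delta.get((q, a), ())}
            T = eps_closure(moved, delta)
            d_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                queue.append(T)
    return d_start, d_finals, d_delta

def dfa_accepts(dfa, word):
    """Run the DFA produced by subset_construction on `word`."""
    state, finals, delta = dfa
    for a in word:
        state = delta[(state, a)]
    return state in finals

# Hypothetical hand-written ε-NFA for (a|b)*ba: start state 0, final state 3.
NFA_DELTA = {(0, EPS): {1}, (1, "a"): {1}, (1, "b"): {1, 2}, (2, "a"): {3}}
DFA = subset_construction(0, {3}, NFA_DELTA, "ab")
```

With these definitions, `dfa_accepts(DFA, "abba")` is true and `dfa_accepts(DFA, "ab")` is false. The resulting DFA is not necessarily minimal; minimization is a separate step.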



Regular Expression to FA - basics
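For ∅ in the regular expression, construct NFA

[Figure: a start state with no transitions and no final state.]    L = { } = ∅

For ε in the regular expression, construct NFA

[Figure: a start state that is also the final state.]    L = {ε}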



Regular Expression to FA
For a ∈ Σ in the regular expression, construct NFA

start → (i) --a--> ((f))    L = {a}

If s and t are regular expressions and Ms and Mt are their NFAs, then s|t
has NFA:

            --ε--> [Ms] --ε-->
start → (i)                    ((f))    L = L(Ms) ∪ L(Mt)
            --ε--> [Mt] --ε-->



Regular Expression and NFA
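If s and t are regular expressions and Ms, Mt are their NFAs,
st (concatenation) has NFA:

start → (i) --ε--> [Ms] --ε--> [Mt] --ε--> ((f))    L = L(Ms)L(Mt)

If s is a regular expression and Ms its NFA, s* (Kleene star) has
NFA:

start → (i) --ε--> [Ms] --ε--> ((f))    L = L(Ms)*
with an additional ε-move from i directly to f (zero repetitions) and an ε-move
from the end of Ms back to the start of Ms (repetition).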
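The constructions for a single symbol, s|t, st and s* compose mechanically. The following Python sketch shows one way to code them; it is an illustration under assumptions of its own (fragments are `(start, final, delta)` triples, ε is the empty string, and all function names are invented here), not the notation of the slides.

```python
from itertools import count

EPS = ""            # label for ε-moves (an assumption of this sketch)
_ids = count()      # fresh state numbers

def _new():
    return next(_ids)

def _merge(*deltas):
    """Copy several transition tables into one new table."""
    out = {}
    for d in deltas:
        for key, targets in d.items():
            out.setdefault(key, set()).update(targets)
    return out

def _eps(delta, src, dst):
    delta.setdefault((src, EPS), set()).add(dst)

def symbol(a):
    i, f = _new(), _new()
    return i, f, {(i, a): {f}}

def union(s, t):                      # s|t
    i, f = _new(), _new()
    delta = _merge(s[2], t[2])
    for start, final, _ in (s, t):
        _eps(delta, i, start)
        _eps(delta, final, f)
    return i, f, delta

def concat(s, t):                     # st
    i, f = _new(), _new()
    delta = _merge(s[2], t[2])
    _eps(delta, i, s[0])
    _eps(delta, s[1], t[0])
    _eps(delta, t[1], f)
    return i, f, delta

def star(s):                          # s*
    i, f = _new(), _new()
    delta = _merge(s[2])
    _eps(delta, i, s[0])              # enter Ms
    _eps(delta, s[1], f)              # leave Ms
    _eps(delta, i, f)                 # zero repetitions
    _eps(delta, s[1], s[0])           # repeat Ms
    return i, f, delta

def accepts(nfa, word):
    """Simulate the ε-NFA by tracking the set of reachable states."""
    start, final, delta = nfa
    def closure(states):
        seen, stack = set(states), list(states)
        while stack:
            q = stack.pop()
            for r in delta.get((q, EPS), ()):
                if r not in seen:
                    seen.add(r)
                    stack.append(r)
        return seen
    current = closure({start})
    for a in word:
        current = closure({r for q in current for r in delta.get((q, a), ())})
    return final in current

# r = (a|b)*ba, built bottom-up from fresh fragments
R = concat(star(union(symbol("a"), symbol("b"))),
           concat(symbol("b"), symbol("a")))
```

Here `accepts(R, "aabba")` is true and `accepts(R, "ab")` is false. Each fragment should be built fresh and used once, since reusing the same fragment in two places would share its states.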



Regular Expression to FA
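Example: Find an NFA which accepts r = (a|b)*ba

First build NFAs for the pieces of r:
[Figure: NFA for a (one a-transition), NFA for b (one b-transition), NFA for ba (a b-transition followed by an a-transition), and NFA for (a|b) with ε-moves into and out of an a-branch and a b-branch.]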



Regular Expression and FA
Example: Find an NFA which accepts r = (a|b)*ba

[Figure: NFA for (a|b)*, obtained by adding the Kleene-star ε-moves around the NFA for (a|b).]



Regular Expression to FA
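Example: Find an NFA which accepts r = (a|b)*ba

step 1:
[Figure: states q0, q1 and q3, with one edge labeled (a|b)* and one edge labeled ba.]

step 2:
[Figure: ε-NFA with states q0 to q9. The (a|b)* part has an a-branch (q2 --a--> q4) and a b-branch (q3 --b--> q5) joined by ε-moves; it is followed by q7 --b--> q8 --a--> q9.]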



Try those two steps
• Convert the ε-NFA to a DFA using ε-closures of sets of states
• Lastly, minimize it. You will get

[Figure: minimized DFA with states A, C and D and transitions labeled a and b.]



Regular Expression to FA

Exercise 1:
1. Give an NFA that accepts the
language L((a+b)*b(a+bb)*)



FA to Regular Expression – Arden’s Theorem
To convert an FA to an RE, use Arden's theorem.
Theorem:
If P and Q are two regular expressions over Σ, and P does not contain
ε, then the equation R = Q + RP has a unique
solution, namely R = QP*. That means, whenever we get an equation of the
form R = Q + RP, we can directly replace it with R = QP*. Proving the theorem
takes two parts: first show that R = QP* is a solution of this equation, and then
show that it is the unique solution of this equation.
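Both parts of the argument are short. The derivation below is a sketch in the theorem's own symbols; it relies only on the identity P* = ε + P*P, and is offered as one standard way to see the result rather than the course's official proof.

```latex
% Part 1: R = QP^{*} satisfies R = Q + RP
\begin{aligned}
Q + RP &= Q + (QP^{*})P \\
       &= Q(\varepsilon + P^{*}P) \\
       &= QP^{*} = R .
\end{aligned}

% Part 2: substituting R = Q + RP into itself n times
\begin{aligned}
R &= Q + RP = Q + (Q + RP)P = Q + QP + RP^{2} = \dots \\
  &= Q(\varepsilon + P + P^{2} + \dots + P^{n}) + RP^{\,n+1}.
\end{aligned}
```

Since P does not contain ε, every string in RP^{n+1} has length at least n + 1. A string of length n in R therefore has to come from Q(ε + P + ... + P^n), which is contained in QP*. Conversely, the same identity shows every string of QP* is in R, so R = QP* is the only solution.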



FA to Regular Expression
To apply the theorem, write one equation per state:
1. Let q1 be the initial state.
2. There are q2, q3, q4, ..., qn further states. The final state may be some qj
where j <= n.
3. Let αji represent the label of the transition from qj to qi.
4. For each state qi, write the equation
qi = Σj qj αji (summing over all states qj that have a transition into qi)
If qi is the start state, then add ε:
qi = Σj qj αji + ε
5. Solve the equations by substitution and Arden's theorem. The expression obtained
for the final state is the regular expression 'r'.
FA to Regular Expression - Example
Example 1: construct RE from the given DFA

[Figure: DFA with start state q1 looping on a, q1 --b--> q2, q2 looping on b, and q2 --a--> q3.]

Let's see the equations
q1 = q1a + ε
q2 = q1b + q2b
q3 = q2a

Let's simplify q1 first
q1 = q1a + ε
q1 = ε + q1a
R = Q + RP gives R = QP*
q1 = εa* = a*

Substitute the value of q1 in q2, we get
q2 = q1b + q2b
q2 = a*b + q2b
R = Q + RP gives R = QP*
q2 = a*bb* = a*b+

To find the RE of a given DFA, we normally calculate the equation for the final
state. Since q2 is a final state, the RE is
q2 = a*b+
FA to Regular Expression – Example2
Example 2: Construct the RE for the given DFA

[Figure: DFA with q1 --0--> q2, q2 --1--> q1, q1 --1--> q3, q3 --0--> q1, q2 --0--> q4, q3 --1--> q4, and q4 looping on 0, 1.]

Let's see the equations
q1 = q2 1 + q3 0 + ε
q2 = q1 0
q3 = q1 1
q4 = q2 0 + q3 1 + q4(0+1)

Substitute q2 and q3 into q1
q1 = q2 1 + q3 0 + ε
q1 = ε + q1 01 + q1 10
q1 = ε + q1(01+10)
R = Q + RP gives R = QP*
q1 = ε(01+10)* = (01+10)*

Note
If there exist multiple final states, then
• Write a regular expression for each final state separately.
• Add all the regular expressions to get the final regular expression.
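The result of Example 2 can be sanity-checked by running the DFA and the expression side by side on all short strings. In the sketch below the transition table is read off the four equations, q1 is assumed to be both the initial and the final state (the expression was computed for q1), and the union written as + on the slide becomes | in Python's `re` syntax. The names `DELTA`, `dfa_accepts` and `disagreements` are invented for this illustration.

```python
import re
from itertools import product

# Transitions read off the equations of Example 2; q4 acts as a dead state.
DELTA = {
    ("q1", "0"): "q2", ("q1", "1"): "q3",
    ("q2", "1"): "q1", ("q2", "0"): "q4",
    ("q3", "0"): "q1", ("q3", "1"): "q4",
    ("q4", "0"): "q4", ("q4", "1"): "q4",
}
START, FINALS = "q1", {"q1"}   # assumption: q1 is initial and final

def dfa_accepts(word):
    state = START
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state in FINALS

def disagreements(pattern, max_len=10):
    """Binary strings up to max_len on which the DFA and the pattern differ."""
    bad = []
    for n in range(max_len + 1):
        for chars in product("01", repeat=n):
            word = "".join(chars)
            if dfa_accepts(word) != (re.fullmatch(pattern, word) is not None):
                bad.append(word)
    return bad
```

An empty result from `disagreements("(01|10)*")` means the expression and the DFA agree on every binary string up to length 10, which supports the derivation without replacing it.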



FA to Regular Expression – State Elimination
• This method involves the following steps in finding the regular expression for any
given DFA
Step 1: Thumb Rule: The initial state of the DFA must not have any incoming
edge.
If there exists any incoming edge to the initial state, then create a new initial state having no
incoming edge to it.

[Figure: a new initial state qi is added, with an ε-edge from qi to the old initial state q1.]

Step 2: Thumb Rule: There must exist only one final state in the DFA.
If there exist multiple final states in the DFA, then convert all the final states into non-final
states and create a new single final state.

[Figure: the final states q2 and q3 become non-final, each with an ε-edge to the new final state qf.]
FA to Regular Expression – State Elimination
Step 3: Thumb Rule: The final state of the DFA must not have any outgoing
edge.
If there exists any outgoing edge from the final state, then create a new final state having no
outgoing edge from it.

[Figure: automaton q1, q2, q3 whose final state q3 has an outgoing edge; a new final state is added, reached from q3 by an ε-edge.]
Step 4.
Eliminate all the intermediate states one by one.
These states may be eliminated in any order.
In the end,
only a single transition from the initial state to the final state will be left.
The label of this transition is the required regular expression.



FA to Regular Expression – State Elimination
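Example: Convert the following FA to RE using state elimination

[Figure: FA with q1 --0--> q2 and q2 --1--> q1.]

Step 1. Create a new initial state qi, because the initial state q1 has an incoming edge.
qi --ε--> q1,  q1 --0--> q2,  q2 --1--> q1

Step 2. Create a new final state qf, because the final state q2 has an outgoing edge.
qi --ε--> q1,  q1 --0--> q2,  q2 --1--> q1,  q2 --ε--> qf

Finally, eliminate the intermediate states.
Eliminating q1:  qi --0--> q2,  q2 --10--> q2 (loop),  q2 --ε--> qf
Eliminating q2:  qi --0(10)*--> qf

The required regular expression is 0(10)*.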
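The elimination steps can be automated. The sketch below is a minimal Python version under several assumptions of its own: edges are a dictionary from state pairs to regular expressions in Python `re` syntax, ε-edges are written as empty strings, and the function names are invented here. The expression it returns is heavily parenthesized but should describe the same language as the hand-derived answer.

```python
import re

def eliminate_states(states, start, finals, edges):
    """Return a Python-syntax regex for the automaton, or None if it accepts nothing.

    edges maps (p, q) to a regex labeling the edge p -> q.
    """
    R = dict(edges)
    QI, QF = "qi", "qf"            # fresh initial and final states
    R[(QI, start)] = ""            # ε-edge into the old initial state
    for f in finals:
        R[(f, QF)] = ""            # ε-edges into the new final state
    for k in states:               # eliminate the old states in the given order
        loop = R.pop((k, k), None)
        star = f"(?:{loop})*" if loop is not None else ""
        ins = {p: r for (p, q), r in R.items() if q == k}
        outs = {q: r for (p, q), r in R.items() if p == k}
        for key in [key for key in R if k in key]:
            del R[key]
        for p, r_in in ins.items():
            for q, r_out in outs.items():
                path = f"(?:{r_in}){star}(?:{r_out})"
                R[(p, q)] = f"{R[(p, q)]}|{path}" if (p, q) in R else path
    return R.get((QI, QF))

def matches(regex, word):
    return regex is not None and re.fullmatch(regex, word) is not None

# The two-state FA of the example: q1 --0--> q2, q2 --1--> q1, q2 final.
EXAMPLE_RE = eliminate_states(["q1", "q2"], "q1", {"q2"},
                              {("q1", "q2"): "0", ("q2", "q1"): "1"})
```

For the example, `matches(EXAMPLE_RE, "010")` is true and `matches(EXAMPLE_RE, "01")` is false, in line with 0(10)*. Parallel edges between the same pair of states would need to be combined into one label (for instance "0|1") before calling the function.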



Try yourself
• Convert FA to RE



Regular Grammars
A third way of describing regular languages is by means of certain simple grammars.
Linear grammar: a grammar in which every production has a single non-terminal on the
left-hand side and at most one non-terminal on the right-hand side. If every right-hand
side is a string of terminals optionally followed by a single non-terminal, the grammar is
right-linear; if the single non-terminal comes first instead, it is left-linear.
Two kinds of linear grammar are regular:
right-linear grammar
left-linear grammar

Right and Left Linear Grammar

A grammar G = (V, T, S, P) is said to be right-linear if all productions are of the
form
A → xB
A → x
where A, B ∈ V, and x ∈ T*.
Regular Grammars
A grammar is said to be left-linear if all the productions are of the
form
A → Bx
A → x
A regular grammar is one that is either right-linear or left-linear.
Note that in a regular grammar, at most one variable appears on the
right side of any production. Furthermore, that variable must
consistently be either the rightmost or leftmost symbol of the right
side of any production.
A regular grammar is always linear, but not all linear grammars are
regular.



Regular Grammars

Example: The grammar G1 = ({S}, {a, b}, S, P1) with P1 given as

S → abS | a

is right-linear.

The grammar G2 = ({S, S1, S2}, {a, b}, S, P2) with P2 given as

S → S1ab
S1 → S1ab | S2
S2 → a

is left-linear.



Regular Grammars

Exercise:
The grammar G = ({S, A, B}, {a, b}, S, P) with productions
S → A
A → aB | λ
B → bA
Is it left-linear or right-linear?



Convert FA to Right linear Grammars
Converting a FA, defined by M=(Q, Σ, δ, q0, F), to a regular grammar,
defined by G = (V, T, S, P) is straightforward. The rules are summarized
below:
1. The start symbol of the grammar is q0, the non-terminal
corresponding to the start state of the FA.
2. For each transition from state qi to state qj on some symbol
‘a’, create a production rule of the form: qi → aqj.
3. For each state qi of the FA which is a final state, create a
production rule of the form: qi → λ.
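The three rules translate almost line by line into code. The following Python sketch is illustrative only: the dictionary encoding of the transition function and the function name `fa_to_right_linear` are assumptions, and state names double as non-terminals.

```python
def fa_to_right_linear(delta, start, finals):
    """Build a right-linear grammar from an FA.

    delta maps (state, symbol) to a set of next states.
    Returns (start_symbol, productions), where productions maps each
    non-terminal to a list of right-hand sides.
    """
    productions = {}
    for (qi, a), targets in sorted(delta.items()):
        for qj in sorted(targets):
            productions.setdefault(qi, []).append(a + qj)   # rule 2: qi -> a qj
    for qf in sorted(finals):
        productions.setdefault(qf, []).append("λ")          # rule 3: qf -> λ
    return start, productions                               # rule 1: start symbol

# Hypothetical FA for "strings starting with a": S --a--> A, A loops on a and b.
START, PRODUCTIONS = fa_to_right_linear(
    {("S", "a"): {"A"}, ("A", "a"): {"A"}, ("A", "b"): {"A"}},
    start="S",
    finals={"A"},
)
```

For this FA the sketch produces S → aA and A → aA | bA | λ, one production per transition plus a λ-production for the final state.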



Convert FA to left linear Grammar cont…
Example: Convert the following FA to a left-linear grammar, for the
language which accepts any string starting with a

[Figure: FA with q1 --a--> q2 and q2 looping on a, b.]

Label q1 as S and q2 as A.
The productions of the above FA are:
S → aA
A → aA | bA | λ
We can eliminate λ, then we get
S → aA | a
A → aA | bA | a | b

Reversing the right-hand sides of the above grammar gives a left-linear grammar, but it
produces the reverse language, i.e. the language which accepts any string
ending with a:
S → Aa | a
A → Aa | Ab | a | b



Convert FA to left linear Grammar cont…
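To convert the given FA to an LLG, we first reverse the FA and then follow the steps
given for the RLG, writing each production in reversed (left-linear) form.

[Figure: the original FA (q1 --a--> q2, q2 looping on a, b) and the reversed FA (q2 --a--> q1, q2 looping on a, b).]

Label q1 as S and q2 as A, where A is the start symbol.

The productions of the reversed FA are:
• A → Aa | Ab | Sa
• S → λ
We can eliminate λ. S then has no productions left, so Sa is replaced by a and we get
• A → Aa | Ab | a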



Regular Grammars to FA
Example: Construct a finite automaton that accepts the language
generated by the grammar
S → aV1
V1 → abS | b
• We start the transition graph with vertices S, V1, and Vf and treat
these variables as states. All terminals (a, b) are treated as inputs of the
machine.
• The first production rule creates an edge labeled a between S and V1.
• For the second rule, we need to introduce an additional vertex so that
there is a path labeled ab between V1 and S.
• Finally, we need to add an edge labeled b between V1 and Vf.
Regular Grammars and FA

The language generated by the grammar and accepted by the
automaton is the regular language L((aab)*ab).
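One way to convince yourself of this claim is to enumerate the short strings the grammar derives. The sketch below does so for right-linear grammars whose variables are single characters; that restriction, the renaming of V1 to V, and the function name `generate` are assumptions made only for this illustration.

```python
# The grammar S -> aV1, V1 -> abS | b, with V1 written as V so that
# every variable is a single character (an assumption of this sketch).
GRAMMAR = {"S": ["aV"], "V": ["abS", "b"]}

def generate(grammar, start, max_len):
    """All terminal strings with at most max_len symbols derivable from `start`.

    Assumes a right-linear grammar: a sentential form contains at most one
    variable, and it is the last character. A λ-production is written as "".
    """
    results, seen, stack = set(), {start}, [start]
    while stack:
        form = stack.pop()
        if not form or form[-1] not in grammar:
            results.add(form)               # no variable left: a terminal string
            continue
        for rhs in grammar[form[-1]]:
            new = form[:-1] + rhs           # rewrite the trailing variable
            terminals = sum(1 for c in new if c not in grammar)
            if terminals <= max_len and new not in seen:
                seen.add(new)
                stack.append(new)
    return results
```

With these definitions, `generate(GRAMMAR, "S", 8)` gives the three strings ab, aabab and aabaabab, each of the form (aab)^n ab.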



Regular Grammars and FA

Exercise 1: Construct a DFA that accepts the language generated by
the grammar.
S → abA
A → baB
B → aA | bb
Exercise 2: Find a regular grammar that generates the language L(aa*(ab + a)*).
Answer: S → aA
A → aA | aB | ε
B → bA | aA
Regular Grammars and FA

Exercise 3: Find a regular grammar that generates the language on
Σ = {a, b} consisting of all strings with no more than three a's.



Regular Grammars

[Figure: regular expressions, DFAs or NFAs, and regular grammars are equivalent ways of describing regular languages.]



Pumping Lemma for Regular Language
The pumping lemma is applied to show that certain languages are not regular.
It states that if a language L is regular, there exists an integer pumping length
p ≥ 1 such that every string s ∈ L with |s| ≥ p
can be written as s = xyz, where x, y and z are substrings of s such that
1) |y| > 0
2) |xy| ≤ p
3) xy^i z ∈ L for every i ≥ 0
It should never be used to show that a language is regular.
If L is regular, it satisfies the pumping lemma.
If L does not satisfy the pumping lemma, it is non-regular.
The right way to show that a certain language L is not regular is to suppose L is
regular and try to reach a contradiction.



Pumping Lemma for Regular Language

Example: Using the pumping lemma, prove that the language L = {a^n b^n : n ≥ 0}
is not regular
Proof 1: Assume L is regular with pumping length p, and take s = a^p b^p = xyz.
Since |xy| ≤ p, x and y are made of a's only.
Since |y| > 0, |x| < p.
Since xz (the case i = 0) must be in L, we have a contradiction, because xz has fewer a's than b's.
Proof 2: Assume that L is regular
Pumping length = p
s = a^p b^p; let p = 5, so s = aaaaabbbbb
s = xyz



Pumping Lemma for Regular Language

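Check 1: y is a block of 3 a's: x = aa, y = aaa, z = bbbbb
xy^i z → xy^2 z → aa(aaaaaa)bbbbb. This string has 8 a's but only 5 b's, so it is not in L. This case is a contradiction.
L is not regular.
Check 2: y is a block of 2 a's: x = aaa, y = aa, z = bbbbb
xy^i z → xy^2 z → aaa(aaaa)bbbbb. This string has 7 a's but only 5 b's, so it is not in L either.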
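The two checks cover only two ways of splitting s. A short program can try every split allowed by the lemma for a fixed p. The Python sketch below does this; the function names `in_L` and `surviving_splits` are invented here, and the search illustrates the argument for one value of p rather than replacing the proof, which has to hold for every p.

```python
def in_L(s):
    """Membership test for L = { a^n b^n : n >= 0 }."""
    n = len(s) // 2
    return s == "a" * n + "b" * n

def surviving_splits(s, p):
    """Splits s = xyz with |y| > 0 and |xy| <= p such that
    x y^k z stays in L for k = 0, 1 and 2."""
    good = []
    for i in range(p + 1):               # i = |x|
        for j in range(i + 1, p + 1):    # j = |xy|, so |y| = j - i > 0
            x, y, z = s[:i], s[i:j], s[j:]
            if all(in_L(x + y * k + z) for k in (0, 1, 2)):
                good.append((x, y, z))
    return good
```

For s = aaaaabbbbb and p = 5, `surviving_splits(s, 5)` is empty: every permitted split puts y inside the a's, so pumping changes the number of a's without changing the number of b's.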
