
Theory of Computation

Introduction

Dr. Krishnendu Rarhi


E: Krishnendu.e9621@cumail.in
Terminologies
• Symbol: A symbol (often also called a character) is the smallest
building block; it can be any letter, digit, or picture.
• Example: a, b, c, 0, 1, ….
• Alphabet (Σ): An alphabet is a set of symbols, which is
always finite.
• Examples: Σ = {0, 1}; {0, 1, 2, …, 9}; {a, b, c}; {A, B, C, …, Z}.
• String: A string is a finite sequence of symbols from some
alphabet. A string is generally denoted as w and the length of a
string is denoted as |w|.
Σ* is the set of all possible finite strings over Σ, including the empty string. A language is
therefore a subset of Σ*.
The empty string is the string with zero occurrences of symbols, represented as ε.
Dr. Krishnendu Rarhi©
Formulation
Number of strings (of length 2) that can be generated over the
alphabet {a, b}:
-   -
a   a
a   b
b   a
b   b
Length of string |w| = 2
Number of strings = 4
Conclusion:
For the alphabet {a, b}, the number of strings of length n that can be generated is 2^n.
If the number of symbols in the alphabet Σ is represented by |Σ|, then the number of strings of length n possible over
Σ is |Σ|^n.
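The count can be checked by enumeration. The following is a minimal Python sketch, not part of the original slides; the helper name strings_of_length is an illustrative assumption.

```python
from itertools import product

def strings_of_length(alphabet, n):
    """Return every string of length n over the alphabet (hypothetical helper)."""
    return ["".join(chars) for chars in product(sorted(alphabet), repeat=n)]

words = strings_of_length({"a", "b"}, 2)
print(words)       # ['aa', 'ab', 'ba', 'bb']
print(len(words))  # 4, i.e. |Σ|^n = 2^2
```

For an alphabet with three symbols and n = 3 the same helper returns 3^3 = 27 strings.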
Terminologies
• Language: A language is a set of strings chosen from some Σ*;
in other words, a language is a subset of Σ*. A language
formed over Σ can be finite or infinite.

Example of a finite language:

L1 = { set of all strings of length 2 over {x, y} }
L1 = { xy, yx, xx, yy }

Example of an infinite language:

L2 = { set of all strings over {a, b} that start with 'b' }
L2 = { babb, baa, ba, bbb, baab, ....... }


Regular expression
• Regular expressions are used to denote regular languages.
• Regular languages are the most restricted type of languages and are accepted
by finite automata.
An expression is regular if:
• ɸ is a regular expression for the regular language ɸ.
• ɛ is a regular expression for the regular language {ɛ}.
• If a ∈ Σ (Σ represents the input alphabet), a is a regular expression with language {a}.
• If a and b are regular expressions, a + b is also a regular expression, with language L(a) ∪ L(b).
• If a and b are regular expressions, ab (concatenation of a and b) is also a regular expression.
• If a is a regular expression, a* (0 or more repetitions of a) is also a regular expression.


Regular expression

• Set of vowels
  Regular expression: (a∪e∪i∪o∪u)
  Regular language: {a, e, i, o, u}
• a followed by 0 or more b
  Regular expression: (a.b*)
  Regular language: {a, ab, abb, abbb, abbbb, ….}
• Any no. of vowels followed by any no. of consonants
  Regular expression: v*.c* (where v – vowels and c – consonants)
  Regular language: {ε, a, aou, aiou, b, abcd, …..}, where ε represents the
  empty string (in case of 0 vowels and 0 consonants)


Regular Expressions vs. Finite Automata

• Regular expressions offer a declarative way to express the pattern of any string we want to accept
  • E.g., 01* + 10*
• Automata => more machine-like
  < input: string , output: [accept/reject] >
• Regular expressions => more program syntax-like
• Unix environments heavily use regular expressions
  • E.g., bash shell, grep, vi & other editors, sed
  • Perl scripting – good for string processing
  • Lexical analyzers such as Lex or Flex
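As an illustration of the "program syntax-like" use, the expression 01* + 10* can be tried in Python's re module. This is a minimal sketch under the assumption that the union operator + of the slides is written as | in Python; the function name accepts is hypothetical.

```python
import re

# 01* + 10* in Python syntax: the union '+' becomes '|'.
PATTERN = re.compile(r"01*|10*")

def accepts(w):
    """Return True when the whole string w matches 01* + 10* (hypothetical helper)."""
    return PATTERN.fullmatch(w) is not None

for w in ["0", "0111", "1000", "0110", ""]:
    print(repr(w), "accept" if accepts(w) else "reject")
```

Note that fullmatch is used so that the entire input must match, mirroring the accept/reject behaviour of an automaton.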


Regular Expressions

[Figure: Regular expressions (syntactical expressions) = Finite automata (DFA, NFA, ε-NFA; automata/machines). Both describe the regular languages, one of the formal language classes.]


Language Operators
• Union of two languages:
  • L ∪ M = all strings that are either in L or M
  • Note: A union of two languages produces a third language
• Concatenation of two languages:
  • L . M = all strings that are of the form xy, s.t. x ∈ L and y ∈ M
  • The dot operator is usually omitted
  • i.e., LM is the same as L.M
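For finite languages both operators can be written directly with Python sets. This is a minimal sketch; the languages L and M below are illustrative assumptions, not taken from the slides.

```python
def union(L, M):
    """All strings that are in L or in M."""
    return L | M

def concat(L, M):
    """All strings xy with x in L and y in M."""
    return {x + y for x in L for y in M}

L = {"0", "01"}   # example languages chosen for illustration
M = {"1", "11"}
print(sorted(union(L, M)))   # ['0', '01', '1', '11']
print(sorted(concat(L, M)))  # ['01', '011', '0111']
```

The concatenation has only three strings because "0" + "11" and "01" + "1" both give "011"; a language is a set, so the duplicate collapses.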


Kleene Closure (the * operator)

• Kleene Closure of a given language L:
  • L^0 = {ε}
  • L^1 = {w | w ∈ L}
  • L^2 = {w1w2 | w1 ∈ L, w2 ∈ L (duplicates allowed)}
  • L^i = {w1w2…wi | all w's chosen are ∈ L (duplicates allowed)}
  • (Note: the choice of each wi is independent)
  • "i" here refers to how many strings to concatenate from the parent
    language L to produce strings in the language L^i
  • L* = ∪ (i ≥ 0) L^i (arbitrary number of concatenations)
Example:
• Let L = {1, 00}
  • L^0 = {ε}
  • L^1 = {1, 00}
  • L^2 = {11, 100, 001, 0000}
  • L^3 = {111, 1100, 1001, 10000, 000000, 00001, 00100, 0011}
  • L* = L^0 ∪ L^1 ∪ L^2 ∪ …
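The powers in the example can be reproduced mechanically. The following is a minimal Python sketch; since L* is infinite here, the helper kleene_up_to (a hypothetical name) only collects the powers up to a chosen bound.

```python
def power(L, i):
    """Return L^i: all concatenations of i strings chosen independently from L."""
    result = {""}                      # L^0 = {ε}
    for _ in range(i):
        result = {x + w for x in result for w in L}
    return result

def kleene_up_to(L, max_i):
    """Finite approximation of L*: the union of L^0 .. L^max_i."""
    out = set()
    for i in range(max_i + 1):
        out |= power(L, i)
    return out

L = {"1", "00"}
print(sorted(power(L, 2)))  # ['0000', '001', '100', '11']
print(len(power(L, 3)))     # 8
```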


Kleene Closure (special notes)

• L* is an infinite set iff |L| ≥ 1 and L ≠ {ε}. Why?
• If L = {ε}, then L* = {ε}. Why?
• If L = Φ, then L* = {ε}. Why?
• Σ* denotes the set of all words over an alphabet Σ
  • Therefore, an abbreviated way of saying there is
    an arbitrary language L over an alphabet Σ is:
    • L ⊆ Σ*


Building Regular Expressions
• Let E be a regular expression and let the language represented by E be
L(E)
• Then:
  • L((E)) = L(E)
  • L(E + F) = L(E) ∪ L(F)
  • L(E F) = L(E) L(F)
  • L(E*) = (L(E))*


Example: how to use these regular expression properties and
language operators?
• L = { w | w is a binary string which does not contain two consecutive 0s or two consecutive 1s anywhere }
• E.g., w = 01010101 is in L, while w = 10010 is not in L
• Goal: Build a regular expression for L
• Four cases for w:
  • Case A: w starts with 0 and |w| is even
  • Case B: w starts with 1 and |w| is even
  • Case C: w starts with 0 and |w| is odd
  • Case D: w starts with 1 and |w| is odd
• Regular expression for the four cases:
  • Case A: (01)*
  • Case B: (10)*
  • Case C: 0(10)*
  • Case D: 1(01)*
• Since L is the union of all 4 cases:
  • Reg Exp for L = (01)* + (10)* + 0(10)* + 1(01)*
• If we introduce ε then the regular expression can be simplified to:
  • Reg Exp for L = (ε + 1)(01)*(ε + 0)
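One way to gain confidence in the simplification is to compare both expressions against the definition of L on all short binary strings. This is a minimal brute-force sketch in Python, not a proof; the translation of ε + 1 to the optional pattern 1? and the helper names are assumptions of the sketch.

```python
import re
from itertools import product

FOUR_CASES = re.compile(r"(01)*|(10)*|0(10)*|1(01)*")
SIMPLIFIED = re.compile(r"1?(01)*0?")      # (ε + 1)(01)*(ε + 0)

def in_L(w):
    """Definition of L: no two consecutive 0s and no two consecutive 1s."""
    return "00" not in w and "11" not in w

def first_mismatch(max_len):
    """Return the first string on which an expression disagrees with L, or None."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            w = "".join(bits)
            for pattern in (FOUR_CASES, SIMPLIFIED):
                if (pattern.fullmatch(w) is not None) != in_L(w):
                    return w
    return None

print(first_mismatch(10))  # None
```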


Finite Automata (FA) & Regular Expressions (Reg Ex)
■ To show that they are interchangeable, consider
the following theorems:
■ Theorem 1: For every DFA A there exists a regular
expression R such that L(R)=L(A)
■ Theorem 2: For every regular expression R there exists
an ε-NFA E such that L(E)=L(R)

[Figure: Kleene Theorem. Reg Ex → ε-NFA (Theorem 2); ε-NFA → NFA → DFA; DFA → Reg Ex (Theorem 1).]
DFA → Reg Ex (Theorem 1)
DFA to RE construction
Informally, trace all distinct paths (traversing cycles only once)
from the start state to each of the final states
and enumerate all the expressions along the way

Example:
[Figure: DFA with states q0, q1, q2. q0 loops on 1 and moves to q1 on 0; q1 loops on 0 and moves to q2 on 1; q2 loops on 0 and 1.]

Path expression: (1*) 0 (0*) 1 (0 + 1)*
Q) What is the language?

1*00*1(0+1)*
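The path expression can be checked against a simulation of the DFA. The sketch below is in Python and rests on assumptions: the transitions are reconstructed from the figure labels, and q0 as start state and q2 as the only final state are assumed. It compares the two on all binary strings up to a chosen length.

```python
import re
from itertools import product

# Transitions reconstructed from the figure labels (an assumption of this sketch).
DELTA = {
    ("q0", "1"): "q0", ("q0", "0"): "q1",
    ("q1", "0"): "q1", ("q1", "1"): "q2",
    ("q2", "0"): "q2", ("q2", "1"): "q2",
}
START, FINALS = "q0", {"q2"}    # assumed start and final states

def dfa_accepts(w):
    """Run the DFA on w and report whether it ends in a final state."""
    state = START
    for symbol in w:
        state = DELTA[(state, symbol)]
    return state in FINALS

PATH_EXPRESSION = re.compile(r"1*00*1(0|1)*")

def agree_up_to(max_len):
    """True when the DFA and the expression agree on every string up to max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            w = "".join(bits)
            if dfa_accepts(w) != (PATH_EXPRESSION.fullmatch(w) is not None):
                return False
    return True

print(agree_up_to(8))  # True
```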
Reg Ex → ε-NFA (Theorem 2)
RE to ε-NFA construction

Example: (0+1)*01(0+1)*

[Figure: ε-NFA for (0+1)*01(0+1)*, built from sub-automata for (0+1)*, 01, and (0+1)* connected by ε-transitions.]


Regular Grammar & Regular Language
Regular Grammar: A grammar is regular if it has rules of the form A
-> a or A -> aB or A -> ɛ, where ɛ is a special symbol called NULL.

Regular Languages: A language is regular if it can be
expressed in terms of a regular expression.


Closure Property of Regular Language
• Union: If L1 and L2 are two regular languages, their union L1
∪ L2 will also be regular. For example, L1 = {a^n | n ≥ 0} and L2
= {b^n | n ≥ 0}
L3 = L1 ∪ L2 = {a^n | n ≥ 0} ∪ {b^n | n ≥ 0} is also regular.
• Intersection: If L1 and L2 are two regular languages, their
intersection L1 ∩ L2 will also be regular. For example,
L1 = {a^m | m ≥ 0} and L2 = {b^n | n ≥ 0}
L3 = L1 ∩ L2 = {ε} (the only common string, obtained with m = n = 0) is also regular.
• Concatenation: If L1 and L2 are two regular languages, their
concatenation L1.L2 will also be regular. For example,
L1 = {a^m | m ≥ 0} and L2 = {b^n | n ≥ 0}
L3 = L1.L2 = {a^m b^n | m ≥ 0 and n ≥ 0} is also regular.
Closure Property of Regular Language
• Kleene Closure: If L1 is a regular language, its Kleene closure
L1* will also be regular. For example,
L1 = (a ∪ b)
L1* = (a ∪ b)*
• Complement: If L(G) is a regular language, its complement L’(G)
will also be regular. The complement of a language can be found by
subtracting the strings which are in L(G) from all possible strings.
For example, over the alphabet {a},
L(G) = {a^n | n > 3}
L’(G) = {a^n | n <= 3}
Two regular expressions are equivalent if the languages generated by them are the same. For example, (a+b*)*
and (a+b)* generate the same language. Every string which is generated by (a+b*)* is also generated by
(a+b)* and vice versa.
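For the complement example, one standard argument is that a DFA for L(G) becomes a DFA for L’(G) when final and non-final states are swapped. The following Python sketch illustrates this for {a^n | n > 3} over the alphabet {a}; the state numbering and helper names are assumptions made for illustration.

```python
def make_counter_dfa(threshold):
    """DFA over {a}: state i means i a's read, saturating at threshold + 1."""
    states = list(range(threshold + 2))
    delta = {s: min(s + 1, threshold + 1) for s in states}
    finals = {threshold + 1}              # more than `threshold` a's were read
    return states, delta, finals

def complement(dfa):
    """Swap final and non-final states; states and transitions stay the same."""
    states, delta, finals = dfa
    return states, delta, set(states) - finals

def accepts(dfa, n):
    """Run the DFA on the string a^n."""
    states, delta, finals = dfa
    state = 0
    for _ in range(n):
        state = delta[state]
    return state in finals

dfa = make_counter_dfa(3)                 # accepts a^n with n > 3
print([n for n in range(7) if accepts(dfa, n)])              # [4, 5, 6]
print([n for n in range(7) if accepts(complement(dfa), n)])  # [0, 1, 2, 3]
```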
Examples
Which one of the following languages over the alphabet {0,1} is
described by the regular expression?
(0+1)*0(0+1)*0(0+1)*
(A) The set of all strings containing the substring 00.
(B) The set of all strings containing at most two 0’s.
(C) The set of all strings containing at least two 0’s.
(D) The set of all strings that begin and end with either 0 or 1.
Option A says that it must have the substring 00. But 10101 is also part of the language and it does not
contain 00 as a substring. So it is not the correct option.
Option B says that it can have at most two 0’s, but 00000 is also part of the language. So it is not the
correct option.
Option C says that it must contain at least two 0’s. In the regular expression, two 0’s are present, each surrounded by (0+1)*. So this is the
correct option.
Option D describes every non-empty string, but the expression cannot generate
strings with fewer than two 0’s, such as 1 or 01. So it is not correct.
Examples
Which of the following languages is generated by given
grammar?
S -> aS | bS | ∊
(A) {a^n b^m | n, m ≥ 0}
(B) {w ∈ {a,b}* | w has equal number of a’s and b’s}
(C) {a^n | n ≥ 0} ∪ {b^n | n ≥ 0} ∪ {a^n b^n | n ≥ 0}
(D) {a,b}*
Option (A) says that it will have 0 or more a’s followed by 0 or more b’s. But S -> bS => baS => ba is also
part of the language. So (A) is not correct.
Option (B) says that it will have an equal number of a’s and b’s. But S -> bS => b is also part of the
language. So (B) is not correct.
Option (C) says it will have either 0 or more a’s, or 0 or more b’s, or a’s followed by an equal number of b’s. But as shown
in option (A), ba is also part of the language. So (C) is not correct.
Option (D) says it can have any number of a’s and any number of b’s in any order. So (D) is correct.
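The claim that the grammar generates {a,b}* can be checked for short strings by expanding S exhaustively. This is a minimal Python sketch; the rule table and the helper name generate are illustrative assumptions, and the length bound only exists to keep the search finite.

```python
from itertools import product

RULES = {"S": ["aS", "bS", ""]}   # S -> aS | bS | ε

def generate(max_len):
    """Return all terminal strings of length <= max_len derivable from S."""
    results = set()
    frontier = ["S"]
    while frontier:
        form = frontier.pop()
        if "S" not in form:
            results.add(form)          # no variable left: a derived string
            continue
        for rhs in RULES["S"]:
            new_form = form.replace("S", rhs, 1)
            if len(new_form.replace("S", "")) <= max_len:
                frontier.append(new_form)
    return results

all_strings = {"".join(p) for n in range(4) for p in product("ab", repeat=n)}
print(generate(3) == all_strings)  # True
print("ba" in generate(3))         # True
```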
Examples
The regular expression 0*(10*)* denotes the same set as
(A) (1*0)*1*
(B) 0 + (0 + 10)*
(C) (0 + 1)* 10(0 + 1)*
(D) none of these
Two regular expressions are equivalent if the languages generated by them are the same.
Option (A) generates exactly the strings generated by 0*(10*)* (both denote all strings over {0, 1}). So they are equivalent.
Option (B) cannot generate the string 1 (in (0 + 10)* every 1 is followed by a 0), but 0*(10*)* can. So
they are not equivalent.
Option (C) always has 10 as a substring, but strings of 0*(10*)* may or may not. So they are not
equivalent.
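A shortest string that separates two expressions can be searched for by brute force. The following Python sketch does this for the three options; the helper name first_difference and the length bound are assumptions, and a None result only means no difference was found up to that bound.

```python
import re
from itertools import product

def first_difference(p, q, alphabet="01", max_len=6):
    """Return the first string (by length) matched by exactly one pattern, else None."""
    rp, rq = re.compile(p), re.compile(q)
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            w = "".join(chars)
            if (rp.fullmatch(w) is None) != (rq.fullmatch(w) is None):
                return w
    return None

TARGET = r"0*(10*)*"
print(repr(first_difference(TARGET, r"(1*0)*1*")))        # None
print(repr(first_difference(TARGET, r"0|(0|10)*")))       # '1'
print(repr(first_difference(TARGET, r"(0|1)*10(0|1)*")))  # ''
```

The witness for option (C) is the empty string, which 0*(10*)* generates and (0 + 1)*10(0 + 1)* does not.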


Examples
The regular expression for the language having input alphabets a and
b, in which two a’s do not come together:
(A) (b + ab)* + (b +ab)*a
(B) a(b + ba)* + (b + ba)*
(C) both options (A) and (B)
(D) none of the above
Option (C), stating that both options (A) and (B) are correct, is the answer to the stated
question.
The language in the question can be expressed as L = {ε, a, b, bb, ab, aba, ba, bab, baba, abab, …}.
In option (A), ‘ab’ is considered the building block for the required regular expression. (b +
ab)* covers ε and all strings ending with ‘b’. (b + ab)*a covers all strings
ending with ‘a’.
Applying similar logic to option (B), we can see that the regular expression is derived considering ‘ba’
as the building block: a(b + ba)* covers all strings starting with ‘a’, and (b + ba)* covers ε and all strings starting with ‘b’.


Chomsky Hierarchy



Chomsky Hierarchy (Type 0: Unrestricted
Grammar)
• Type-0 grammars include all formal grammars. Type 0 grammar
languages are recognized by Turing machines. These languages are also
known as the recursively enumerable languages.
Grammar productions are in the form of
α -> β
where α is ( V + T)* V ( V + T)*
V : Variables
T : Terminals.
β is ( V + T)*
In Type 0 there must be at least one variable on the left side of a production.
For example,
Sab –> ba
A –> S.
Here, the variables are S, A and the terminals are a, b.


Chomsky Hierarchy (Type 1: Context
Sensitive Grammar)
• Type-1 grammars generate the context-sensitive languages. The
languages generated by these grammars are recognized by
Linear Bounded Automata.
In Type 1
I. First of all, a Type 1 grammar should be Type 0.
II. Grammar productions are in the form of
α -> β; |α| <= |β| (the count of symbols in α is less than or equal to that in β)
For example,
S –> AB
AB –> abc
B –> b


Chomsky Hierarchy (Type 2: Context Free
Grammar)
• Type-2 grammars generate the context-free languages. The
languages generated by these grammars are recognized by a
Pushdown automaton.
In Type 2,
1. First of all, it should be Type 1.
2. The left-hand side of a production can have only one variable.
For example,
S –> AB
A –> a
B –> b


Chomsky Hierarchy (Type 3: Regular
Grammar)
• Type-3 grammars generate regular languages. These
languages are exactly all languages that can be accepted by a
finite state automaton. Type 3 is the most restricted form of
grammar.
Type 3 should be in the given form only:
V –> VT / T (left-regular grammar)
(or)
V –> TV / T (right-regular grammar)
for example:
S –> a
The above form is called strictly regular grammar.
There is another form of regular grammar called extended regular grammar. In this form:
V –> VT* / T* (extended left-regular grammar)
(or)
V –> T*V / T* (extended right-regular grammar)
for example:
S –> ab
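The four type definitions can be turned into a small classifier. The sketch below is in Python and makes several simplifying assumptions that are not in the slides: variables are single uppercase letters, terminals are single lowercase letters, productions are given as (left, right) string pairs, ε-productions are not treated specially, and Type 3 is checked in the strictly regular form only. The function names are hypothetical.

```python
def is_var(ch):
    """Assumption of this sketch: uppercase letters are variables."""
    return ch.isupper()

def strictly_regular(prods):
    """All rules have the form V -> T or V -> TV (right) or V -> VT (left)."""
    if not all(len(l) == 1 and is_var(l) for l, _ in prods):
        return False
    def terminal(r):
        return len(r) == 1 and not is_var(r)
    right = all(terminal(r) or (len(r) == 2 and not is_var(r[0]) and is_var(r[1]))
                for _, r in prods)
    left = all(terminal(r) or (len(r) == 2 and is_var(r[0]) and not is_var(r[1]))
               for _, r in prods)
    return right or left

def chomsky_type(prods):
    """Return the most restricted type (3, 2, 1 or 0) that the productions satisfy."""
    if not all(any(is_var(c) for c in l) for l, _ in prods):
        raise ValueError("every left side needs at least one variable")
    if strictly_regular(prods):
        return 3
    non_contracting = all(len(l) <= len(r) for l, r in prods)   # |α| <= |β|
    if non_contracting and all(len(l) == 1 and is_var(l) for l, _ in prods):
        return 2
    if non_contracting:
        return 1
    return 0

print(chomsky_type([("Sab", "ba"), ("A", "S")]))               # 0
print(chomsky_type([("S", "AB"), ("AB", "abc"), ("B", "b")]))  # 1
print(chomsky_type([("S", "AB"), ("A", "a"), ("B", "b")]))     # 2
print(chomsky_type([("S", "a")]))                              # 3
```

The four calls use the example grammars from the Type 0 to Type 3 slides. An extended regular rule such as S –> ab is reported as Type 2 here, because only the strict form is checked.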
Designing FA from RE
• Even number of a’s: The regular expression for an even number
of a’s is (b|ab*ab*)*. We can construct a finite automaton as
shown in Figure

The above automaton will accept all strings which have an even number of
a’s. For zero a’s, it will be in q0, which is the final state. For one ‘a’, it will
go from q0 to q1 and the string will not be accepted. For two a’s at any
positions, it will go from q0 to q1 for the 1st ‘a’ and from q1 to q0 for the second ‘a’.
So, it will accept all strings with an even number of a’s.
Designing FA from RE
• String with ‘ab’ as substring: The regular expression for
strings with ‘ab’ as a substring is (a|b)*ab(a|b)*. We can construct a
finite automaton as shown in Figure

The above automaton will accept all strings which have ‘ab’ as a substring. The
automaton will remain in the initial state q0 for b’s. It will move to q1 after
reading ‘a’ and remain in the same state for all a’s afterwards. Then it will move
to q2 if ‘b’ is read. That means the automaton has read ‘ab’ as a substring if it
reaches q2.
Designing FA from RE
• String with count of ‘a’ divisible by 3: The language of
strings with a count of a’s divisible by 3 is {a^(3n) | n >= 0}, with regular expression (aaa)*. We can construct an
automaton as shown in Figure

The above automaton will accept all strings of the form a^(3n). The automaton will
remain in the initial state q0 for ɛ and it will be accepted. For the string ‘aaa’, it will
move from q0 to q1, then q1 to q2, and then q2 to q0. For every set of three
a’s, it will come to q0, hence accepted. Otherwise, it will be in q1 or q2,
hence rejected.
If we want to design a finite automaton with the number of a’s as 3n+1, the
same automaton can be used with the final state as q1 instead of q0.
If we want to design a finite automaton with language {a^(kn) | n >= 0}, k
states are required. We have used k = 3 in our example.
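The k-state construction can be sketched as a counter that cycles through the states 0 to k-1. The Python sketch below is an illustration under assumptions: state i stands for qi, and the parameter accepting_remainder (a hypothetical name) selects the final state, so 0 gives a^(kn) and 1 gives the 3n+1 variant mentioned above.

```python
def make_mod_counter(k, accepting_remainder=0):
    """Build an acceptor for strings of a's whose length mod k equals the remainder."""
    delta = {q: (q + 1) % k for q in range(k)}   # qi --a--> q(i+1 mod k)

    def accepts(w):
        state = 0
        for ch in w:
            if ch != "a":          # this sketch only handles the alphabet {a}
                return False
            state = delta[state]
        return state == accepting_remainder

    return accepts

div_by_3 = make_mod_counter(3)
three_n_plus_1 = make_mod_counter(3, accepting_remainder=1)
print(div_by_3("aaa"), div_by_3("aaaa"))              # True False
print(three_n_plus_1("a"), three_n_plus_1("aaaa"))    # True True
```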
Designing FA from RE
• Binary numbers divisible by 3: The regular expression for binary numbers
which are divisible by three is (0|1(01*0)*1)*. Examples of binary
numbers divisible by 3 are 0, 011, 110, 1001, 1100, 1111, 10010 etc. The
DFA corresponding to binary numbers divisible by 3 is shown in Figure

The above automaton will accept all binary numbers divisible by 3. For 1001,
the automaton will go from q0 to q1, then q1 to q2, then q2 to q1 and finally
q1 to q0, hence accepted. For 0111, the automaton will go from q0 to q0, then
q0 to q1, then q1 to q0 and finally q0 to q1, hence rejected.
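The traces above are consistent with reading state qi as "the number read so far leaves remainder i when divided by 3". Under that assumption, reading a bit b moves from remainder r to (2r + b) mod 3, which the following Python sketch simulates; the function name is hypothetical.

```python
def divisible_by_3(bits):
    """Simulate the remainder DFA: state r is the value read so far modulo 3."""
    state = 0                                  # q0: remainder 0 (also for ε)
    for b in bits:
        state = (2 * state + int(b)) % 3       # appending a bit doubles the value and adds b
    return state == 0

print(divisible_by_3("1001"))  # True  (9)
print(divisible_by_3("0111"))  # False (7)
```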
Designing FA from RE
• String with regular expression (111 + 11111)*: The strings
accepted by this regular expression will have 0 (the empty string), 3, 5, 6 (111 twice), 8
(11111 once and 111 once), 9 (111 thrice), 10 (11111 twice) and all
larger counts of 1. The DFA corresponding to the given regular
expression is given in Figure
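The accepted counts of 1 can be listed directly with Python's re module. This is a minimal sketch in which the union + is written as |; the helper name accepted_counts is an assumption.

```python
import re

PATTERN = re.compile(r"(111|11111)*")

def accepted_counts(limit):
    """Return every n <= limit such that the string of n ones matches (111 + 11111)*."""
    return [n for n in range(limit + 1) if PATTERN.fullmatch("1" * n)]

print(accepted_counts(12))  # [0, 3, 5, 6, 8, 9, 10, 11, 12]
```

The counts 1, 2, 4 and 7 are the only ones that cannot be written as a sum of 3's and 5's.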

Designing FA from RE (Example)
• What will be the minimum number of states for strings with an odd
number of a’s?
The regular expression for an odd number of a’s is b*ab*(ab*ab*)*; the
corresponding automaton is given in Figure, and the minimum number of
states is 2.


Designing Deterministic Finite Automata
• Problem-1: Construction of a DFA for the set of strings over {a, b} such that the length of the
string |w| = 2, i.e., the length of the string is exactly 2.
Explanation – The desired language will be like:
L = {aa, ab, ba, bb}

Here, state A represents the set of all strings of length zero (0), state B represents the set of all strings of
length one (1), and state C represents the set of all strings of length two (2). State C is the final state
and D is the dead state: once the automaton reaches D, no further input will ever take it to the
final state.
Number of states: n + 2, where n is the required length (|w| = n)

The above automaton will accept all strings of length exactly 2. When
the length of the string is 1, it will go from state A to B. When the length of the string is 2,
it will go from state B to C, and when the length of the string is greater than 2, it will
go from state C to D (dead state) and after that from state D to D itself.


Designing Deterministic Finite Automata
• Problem-2: Construction of a DFA for the set of strings over {a, b} such that the
length of the string |w| >= 2, i.e., the length of the string should be at least 2.
Explanation – The desired language will be like:
L = {aa, ab, ba, bb, aaa, aab, aba, abb........}

Here, state A represents the set of all strings of length zero (0), state B represents the
set of all strings of length one (1), and state C represents the set of all strings of
length two (2) or more.
Number of states: n + 1, where n is the minimum length (|w| >= n)

The above automaton will accept all strings of length
at least 2. When the length of the string is 1, it will go from state A to B.
When the length of the string is 2, it will go from state B to C, and lastly,
when the length of the string is greater than 2, it will go from state C to
C itself.
Designing Deterministic Finite Automata
• Problem-3: Construction of a DFA for the set of strings over {a, b} such that the length
of the string |w| <= 2, i.e., the length of the string is at most 2.
Explanation – The desired language will be like:
L = {ε, a, b, aa, ab, ba, bb}

Here, state A represents the set of all strings of length zero (0), state B represents the set of all
strings of length one (1), and state C represents the set of all strings of length two (2). States A, B, and
C are the final states and D is the dead state: once the automaton reaches D, no further
input will ever take it to a final state.
Number of states: n + 2, where n is the maximum length (|w| <= n)

The above automaton will accept all strings of length at most
2. When the length of the string is 1, it will go from state A to B. When the
length of the string is 2, it will go from state B to C, and lastly, when the length of
the string is greater than 2, it will go from state C to D (dead state).
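The three length problems differ only in the set of final states and in whether a dead state is needed. The following Python sketch builds all three variants under assumptions of its own: states are numbered 0, 1, 2, … instead of A, B, C, …, and the mode names are hypothetical.

```python
def make_length_dfa(n, mode):
    """Return (number of states, acceptor) for |w| = n, |w| >= n or |w| <= n."""
    if mode == "at_least":
        num_states = n + 1                  # states 0..n, state n loops on itself
        finals = {n}
    elif mode == "exactly":
        num_states = n + 2                  # states 0..n plus a dead state n + 1
        finals = {n}
    elif mode == "at_most":
        num_states = n + 2                  # states 0..n plus a dead state n + 1
        finals = set(range(n + 1))
    else:
        raise ValueError("unknown mode")
    last = num_states - 1

    def accepts(w):
        state = 0
        for _ in w:                         # every symbol of {a, b} advances the count
            state = min(state + 1, last)    # the last state absorbs all further input
        return state in finals

    return num_states, accepts

for mode in ("exactly", "at_least", "at_most"):
    count, accepts = make_length_dfa(2, mode)
    print(mode, count, [w for w in ["", "a", "ab", "aba"] if accepts(w)])
```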


Conversion of NFA to DFA
• An NFA can have zero, one or more than one move from a given
state on a given input symbol. An NFA can also have NULL moves
(moves without an input symbol). On the other hand, a DFA has one and
only one move from a given state on a given input symbol.
• Conversion from NFA to DFA
Suppose there is an NFA N = (Q, ∑, q0, δ, F) which recognizes a
language L. Then the DFA D = (Q’, ∑, q0, δ’, F’) can be constructed for
language L as:
Step 1: Initially Q’ = ɸ.
Step 2: Add q0 to Q’.
Step 3: For each state in Q’, find the possible set of states for each
input symbol using the transition function of the NFA. If this set of states is
not in Q’, add it to Q’.
Step 4: The final states of the DFA will be all states which contain a state of F (the final states
of the NFA).
Conversion of NFA to DFA
Consider the following NFA shown in Figure

Following are the various parameters for the NFA.

Q = { q0, q1, q2 }
∑ = { a, b }
F = { q2 }
δ (Transition Function of NFA)


Conversion of NFA to DFA
• Step 1: Q’ = ɸ
Step 2: Q’ = { q0 }
Step 3: For each state in Q’, find the states for each input
symbol.
Currently, the only state in Q’ is q0. Find the moves from q0 on input symbols
a and b using the transition function of the NFA and update the
transition table of the DFA.
δ’ (Transition Function of DFA)
• Now { q0, q1 } will be considered as a single state. As its entry
is not in Q’, add it to Q’.
So Q’ = { q0, { q0, q1 } }
Conversion of NFA to DFA
• Now, the moves from state { q0, q1 } on the different input symbols are
not present in the transition table of the DFA, so we calculate them as:
δ’ ( { q0, q1 }, a ) = δ ( q0, a ) ∪ δ ( q1, a ) = { q0, q1 }
δ’ ( { q0, q1 }, b ) = δ ( q0, b ) ∪ δ ( q1, b ) = { q0, q2 }
Now we will update the transition table of the DFA.
δ’ (Transition Function of DFA)
• Now { q0, q2 } will be considered as a single state. As its entry
is not in Q’, add it to Q’.
So Q’ = { q0, { q0, q1 }, { q0, q2 } }


Conversion of NFA to DFA
• Now, the moves from state { q0, q2 } on the different input symbols are
not present in the transition table of the DFA, so we calculate them as:
δ’ ( { q0, q2 }, a ) = δ ( q0, a ) ∪ δ ( q2, a ) = { q0, q1 }
δ’ ( { q0, q2 }, b ) = δ ( q0, b ) ∪ δ ( q2, b ) = { q0 }
Now we will update the transition table of the DFA.
δ’ (Transition Function of DFA)
• As there is no new state generated, we are done with the
conversion. The final state of the DFA will be the state which has q2 as its
component, i.e., { q0, q2 }


Conversion of NFA to DFA
• Following are the various parameters for the DFA.
Q’ = { q0, { q0, q1 }, { q0, q2 } }
∑ = { a, b }
F’ = { { q0, q2 } } and transition function δ’ as shown above. The
final DFA for the above NFA is shown in Figure

Sometimes, it is not easy to convert a regular expression to a DFA. First you can convert the regular
expression to an NFA and then the NFA to a DFA.
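The conversion steps can be sketched as a worklist algorithm (the subset construction, without NULL moves). In the Python sketch below the individual NFA moves are an assumption: the NFA transition table is a figure that is not reproduced in the text, so the moves were chosen to be consistent with the unions computed above. All names are illustrative.

```python
from collections import deque

# Assumed NFA moves, consistent with the unions computed in the worked example.
NFA_DELTA = {
    ("q0", "a"): {"q0", "q1"},
    ("q0", "b"): {"q0"},
    ("q1", "b"): {"q2"},
}

def nfa_to_dfa(delta, start, finals, alphabet):
    """Subset construction for an NFA without NULL moves."""
    start_set = frozenset([start])
    dfa_states = {start_set}                       # Step 2: Q' starts with {q0}
    dfa_delta = {}
    queue = deque([start_set])
    while queue:                                   # Step 3: process each state in Q'
        current = queue.popleft()
        for symbol in alphabet:
            target = frozenset(s for q in current
                               for s in delta.get((q, symbol), set()))
            dfa_delta[(current, symbol)] = target
            if target not in dfa_states:           # new set of states: add it to Q'
                dfa_states.add(target)
                queue.append(target)
    dfa_finals = {S for S in dfa_states if S & finals}   # Step 4
    return dfa_states, dfa_delta, dfa_finals

states, table, final_states = nfa_to_dfa(NFA_DELTA, "q0", {"q2"}, "ab")
print(sorted(sorted(S) for S in states))        # [['q0'], ['q0', 'q1'], ['q0', 'q2']]
print(sorted(sorted(S) for S in final_states))  # [['q0', 'q2']]
```

With these assumed moves the sketch reproduces Q’ and F’ of the worked example.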
Conversion of NFA to DFA (Example)
• The number of states in the minimal deterministic finite
automaton corresponding to the regular expression (0 + 1)* (10)
is ___________.
First, we will make an NFA for the above expression. To make an NFA for (0
+ 1)*, NFA will be in same state q0 on input symbol 0 or 1. Then for
concatenation, we will add two moves (q0 to q1 for 1 and q1 to q2 for 0) as
shown in Figure
