
Theory of Computation

(Formal Languages and Automata)

What is automata theory?
• Automata theory is the study of abstract computational devices
• Abstract devices are (simplified) models of real computations
• Computations happen everywhere: On your laptop, on your cell
phone, in nature, …
• Why do we need abstract models?

A simple computer

[Figure: battery, switch, and light bulb circuit]

input: switch
output: light bulb
actions: flip switch
states: on, off
A simple “computer”

[Figure: battery, switch, and light bulb circuit; state diagram with start state off and a second state on]

input: switch
output: light bulb
actions: f for “flip switch”
states: on, off

bulb is on if and only if there was an odd number of flips
Another “computer”

[Figure: battery circuit with switches 1 and 2; state diagram with start state off, three states labeled off and one labeled on, transitions labeled 1 and 2]

inputs: switches 1 and 2
actions: 1 for “flip switch 1”
actions: 2 for “flip switch 2”
states: on, off

bulb is on if and only if both switches were flipped an odd number of times
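
The two-switch device can be simulated in a few lines. The following is a minimal Python sketch, not part of the original slides: the function name bulb_is_on and the encoding of actions as the characters "1" and "2" are assumptions made for illustration. The state is the pair of flip parities, so four states are enough no matter how long the input is.

```python
def bulb_is_on(actions: str) -> bool:
    """Return True if the bulb is on after the given sequence of flips.

    Hypothetical encoding: "1" flips switch 1, "2" flips switch 2.
    """
    switch1_odd = False  # parity of flips of switch 1
    switch2_odd = False  # parity of flips of switch 2
    for action in actions:
        if action == "1":
            switch1_odd = not switch1_odd
        elif action == "2":
            switch2_odd = not switch2_odd
        else:
            raise ValueError(f"unknown action: {action!r}")
    # The bulb is on only when both switches were flipped an odd number of times.
    return switch1_odd and switch2_odd


if __name__ == "__main__":
    for sequence in ["", "1", "12", "121", "1212"]:
        print(repr(sequence), bulb_is_on(sequence))
```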

A design problem
[Figure: battery circuit with numbered switches (labels 1, 4 and 5 survive) and an unknown wiring marked “?”]

Can you design a circuit where the light is on if and only if all the switches were flipped exactly the same number of times?

A design problem
• Such devices are difficult to reason about, because they can be
designed in an infinite number of ways
• By representing them as abstract computational devices, or
automata, we will learn how to answer such questions

These devices can model many things
• They can describe the operation of any “small computer”, like
the control component of an alarm clock or a microwave
• They are also used in lexical analyzers to recognize well-formed
expressions in programming languages:

ab1 is a legal name of a variable in C


5u= is not
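
A recognizer for variable names can be written as a small finite automaton. The sketch below is an illustration only: the state names "start", "name" and "reject" and the function is_identifier are assumptions, and it checks just the shape of a C identifier (a letter or underscore followed by letters, digits or underscores). It ignores reserved keywords such as int, which a real lexical analyzer would also have to rule out.

```python
import string

# Symbols allowed at the first position and at later positions (assumed ASCII only).
FIRST = set(string.ascii_letters + "_")
REST = FIRST | set(string.digits)


def is_identifier(text: str) -> bool:
    """Run a three-state automaton over text and report whether it accepts."""
    state = "start"
    for ch in text:
        if state == "start":
            state = "name" if ch in FIRST else "reject"
        elif state == "name":
            state = "name" if ch in REST else "reject"
        else:
            break  # "reject" is a dead state: no symbol leads out of it
    return state == "name"


if __name__ == "__main__":
    for candidate in ["ab1", "5u=", "_tmp", ""]:
        print(repr(candidate), is_identifier(candidate))
```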

Applications of Finite Automata
• Software for designing and checking the behavior of digital
circuits
• Lexical analyzer of a typical compiler
• Software for scanning large bodies of text (e.g., web pages)
for pattern finding
• Software for verifying systems of all types that have a finite
number of states (e.g., stock market transactions,
communication/network protocols)

Application in Compiler Design

[Figure: compiler design diagram with the labels “Regular Expression” and “Grammar”]
Different kinds of automata
• This was only one example of a computational device, and there are
others
• We will look at different devices, and look at the following questions:
• What can a given type of device compute, and what are its limitations?
• Is one type of device more powerful than another?

Some devices …
finite automata: Devices with a finite amount of memory. Used to model “small” computers.

push-down automata: Devices with infinite memory that can be accessed in a restricted way. Used to model parsers, etc.

Turing Machines: Devices with infinite memory. Used to model any computer.

time-bounded Turing Machines: Infinite memory, but bounded running time. Used to model any computer program that runs in a “reasonable” amount of time.

Some highlights of the course
• Finite automata
• We will understand what kinds of things a device with finite memory can do,
and what it cannot do
• Introduce simulation: the ability of one device to “imitate” another device
• Introduce nondeterminism: the ability of a device to make arbitrary choices
• Push-down automata
• These devices are related to grammars, which describe the structure of
programming (and natural) languages

Some highlights of the course
• Turing Machines
• This is a general model of a computer, capturing anything we could ever hope
to compute
• Surprisingly, there are many things that we cannot compute, for example:

Write a program that, given the code of another program in C, tells if this program ever outputs the word “hello”

• It seems that you should be able to tell just by looking at the program, but it is
impossible to do!

Some highlights of the course
• Time-bounded Turing Machines
• Many problems are possible to solve on a computer in principle, but take too
much time in practice
• Traveling salesman: Given a list of cities, find the shortest way to visit them
and come back home

• Easy in principle: Try the cities in every possible order


• Hard in practice: For 100 cities, this would take 100+ years even on the fastest
computer!
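
The “try every possible order” idea can be sketched directly. This is a minimal illustration, not an efficient solver: it assumes cities are points in the plane with straight-line distances, and the name shortest_tour is made up for this example. With one city fixed as home, n cities leave (n-1)! orders to try, so 100 cities already give 99! candidate tours.

```python
from itertools import permutations
from math import dist


def shortest_tour(cities):
    """Brute force: try every order of the remaining cities, starting and ending at home."""
    home, *rest = cities
    best_length, best_tour = float("inf"), None
    for order in permutations(rest):
        tour = (home, *order, home)
        # Total length is the sum of distances between consecutive stops.
        length = sum(dist(a, b) for a, b in zip(tour, tour[1:]))
        if length < best_length:
            best_length, best_tour = length, tour
    return best_length, best_tour


if __name__ == "__main__":
    length, tour = shortest_tour([(0, 0), (0, 1), (1, 1), (1, 0)])
    print(length, tour)
```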

Preliminaries of automata theory
• How do we formalize the question

Can device A solve problem B?


• First, we need a formal way of describing the problems that we are
interested in solving

Problems
• Examples of problems we will consider
• Given a word s, does it contain the subword “fool”?
• Given a number n, is it divisible by 7?
• Given a pair of words s and t, are they the same?
• Given an expression with brackets, e.g. (()()), does every left bracket
match with a subsequent right bracket?
• All of these have “yes/no” answers. (Automata as an Acceptor)

• There are other types of problems, that ask “Find this” or “How many
of that” but we won’t look at those. (Automata as a Transducer)
Finite Automata as an Acceptor

[Figure: finite automaton to recognize a C identifier]

Finite Automata as a Transducer

[Figure: finite automaton for a binary half adder]


Finite Automata: Examples
• On/Off switch

[Figure: state diagram with the labels “action” and “state”]

• Modeling recognition of the word “then”

[Figure: state diagram with the labels “Start state”, “Transition”, “Intermediate state”, “Final state”]

Alphabets and strings
• A common way to talk about words, numbers, pairs of
words, etc. is by representing them as strings
• To define strings, we start with an alphabet
An alphabet is a finite set of symbols.
• Examples
S1 = {a, b, c, d, …, z}: the set of letters in English
S2 = {0, 1, …, 9}: the set of (base 10) digits
S3 = {a, b, …, z, #}: the set of letters plus the
special symbol #
S4 = {(, )}: the set of open and closed brackets
Strings
A string over alphabet S is a finite sequence
of symbols in S.
• The empty string will be denoted by e
• Examples

abfbz is a string over S1 = {a, b, c, d, …, z}


9021 is a string over S2 = {0, 1, …, 9}
ab#bc is a string over S3 = {a, b, …, z, #}
))()(() is a string over S4 = {(, )}

Languages
A language is a set of strings over an alphabet.

• Languages can be used to describe problems with “yes/no” answers, for example:
L1 = The set of all strings over S1 that contain
the substring “fool”
L2 = The set of all strings over S2 that are divisible by 7
= {7, 14, 21, …}
L3 = The set of all strings of the form s#s where s is any
string over {a, b, …, z}
L4 = The set of all strings over S4 where every ( can be
matched with a subsequent )
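
Each of these languages corresponds to a yes/no test on strings. The Python sketch below writes one membership predicate per language. The function names are made up for illustration, and in_L2 reads the string as a base-10 number, so it also accepts "0" and strings with leading zeros, which the list {7, 14, 21, …} does not address.

```python
LOWER = set("abcdefghijklmnopqrstuvwxyz")
DIGITS = set("0123456789")


def in_L1(s: str) -> bool:
    """Strings over the letters a..z that contain the substring "fool"."""
    return all(c in LOWER for c in s) and "fool" in s


def in_L2(s: str) -> bool:
    """Non-empty digit strings whose base-10 value is divisible by 7 (assumed reading)."""
    return s != "" and all(c in DIGITS for c in s) and int(s) % 7 == 0


def in_L3(s: str) -> bool:
    """Strings of the form s#s where s is a string over a..z."""
    left, sep, right = s.partition("#")
    return sep == "#" and left == right and all(c in LOWER for c in left)


def in_L4(s: str) -> bool:
    """Bracket strings where every ( is matched with a subsequent )."""
    depth = 0
    for c in s:
        if c == "(":
            depth += 1
        elif c == ")":
            depth -= 1
            if depth < 0:  # a ) appeared with no earlier unmatched (
                return False
        else:
            return False
    return depth == 0


if __name__ == "__main__":
    print(in_L1("afoolb"), in_L2("21"), in_L3("ab#ab"), in_L4("(()())"))
```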
Languages & Grammars
• Languages: “A language is a collection of
sentences or strings of finite length all
constructed from a finite alphabet of symbols”
• Grammars: “A grammar can be regarded as a
device that enumerates the sentences of a
language” - nothing more, nothing less

• N. Chomsky, Information and Control, Vol 2, 1959

The Chomsky Hierarchy
• A containment hierarchy of classes of formal languages

Regular (DFA) ⊂ Context-free (PDA) ⊂ Context-sensitive (LBA) ⊂ Recursively-enumerable (TM)

Sets and Set Operations
Introduction
• A set is a collection of objects.
• The objects in a set are called elements of the set.
• A well-defined set is a set for which we can tell for sure whether any given element belongs to it.
• Examples:
• The set of all Prime Ministers of India
• The set of all Hindi movies released in 2020
• The set of the best TV shows of all time is not well-defined (it is a matter
of opinion)

Notation
• When talking about a set we usually denote the set with a capital
letter.
• Roster notation is the method of describing a set by listing each
element of the set.
• Example:
• Let C = The set of all days in a week. The Roster notation would be
C={Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday}.

• Example:
• Let A = The set of odd numbers greater than zero and less than 10. The
roster notation is A={1, 3, 5, 7, 9}.

More on Notation
• Sometimes we can’t list all the elements of a set.
• For instance, Z = The set of integers. We can’t write out all
the integers, because there are infinitely many. So we adopt a
convention using dots …
• The dots mean continue on in this pattern forever.
• Z = {…, -3, -2, -1, 0, 1, 2, 3, …}
• W = {0, 1, 2, 3, …} = This is the set of whole numbers.

Set-Builder Notation
• When it is not convenient to list all the elements of a set, we use a
notation that states the rule an element must satisfy to be a member of
the set. This is called set-builder notation.
• V = {x | x is a citizen registered to vote in India}
• A = {x | x > 5} = This is the set A that has all real numbers greater
than 5.
• The symbol “|” is read as “such that”.

Special Sets of Numbers
• N = The set of natural numbers.
= {1, 2, 3, …}.
• W = The set of whole numbers.
={0, 1, 2, 3, …}
• Z = The set of integers.
= { …, -3, -2, -1, 0, 1, 2, 3, …}
• Q = The set of rational numbers.
={x| x=p/q, where p and q are elements of Z and q ≠ 0 }
• H = The set of irrational numbers.
• R = The set of real numbers.
• C = The set of complex numbers.

Universal Set, Subsets, Powerset
• The Universal Set denoted by U is the set of all possible elements
used in a problem.
• When every element of one set is also an element of another set,
we say the first set is a subset of the second.
• Example A={1, 2, 3, 4, 5} and B={2, 3}
We say that B is a subset of A. The notation we use is B ⊆ A.
• Let S={1,2,3}, list all the subsets of S.
• The subsets of S are ∅, {1}, {2}, {3}, {1,2}, {1,3}, {2,3}, {1,2,3}.
• The powerset of S is
{∅, {1}, {2}, {3}, {1,2}, {1,3}, {2,3}, {1,2,3}}
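
The list of subsets can be generated mechanically. The following is a small Python sketch under the assumption that the set is finite and its elements can be sorted; the function name powerset is chosen for this example. A set with n elements has 2^n subsets, so S = {1, 2, 3} has 8.

```python
from itertools import combinations


def powerset(s):
    """Return all subsets of s, smallest first, as a list of sets."""
    items = sorted(s)
    return [set(combo)
            for size in range(len(items) + 1)
            for combo in combinations(items, size)]


if __name__ == "__main__":
    subsets = powerset({1, 2, 3})
    print(len(subsets))
    for subset in subsets:
        print(subset if subset else "{} (the empty set)")
```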
The Empty Set
• The empty set is a special set. It contains no
elements. It is usually denoted as { } or ∅.
• The empty set is always considered a subset of any
set.
• Do not be confused by this question:
• Is this set {0} empty?
• It is not empty! It contains the element zero.

Intersection of sets
• When two or more sets have an element in common, we say the
sets intersect.
• The intersection of a set A and a set B is denoted by A ∩ B.
• A ∩ B = {x| x is in A and x is in B}
• Note the usage of and. This is similar to conjunction, A ∧ B.
• Example A={1, 3, 5, 7, 9} and B={1, 2, 3, 4, 5}
• Then A ∩ B = {1, 3, 5}. Note that 1, 3, 5 are in both A and B.
• Mutually Exclusive Sets
• We say two sets A and B are mutually exclusive if A ∩ B = ∅.
• Think of this as two events that cannot happen at the same time.
Union of sets
• The union of two sets A, B is denoted by A ∪ B.
• A ∪ B = {x| x is in A or x is in B}
• Note the usage of or. This is similar to disjunction, A ∨ B.
• Using the set A and the set B from the previous slide,
Example A={1, 3, 5, 7, 9} and B={1, 2, 3, 4, 5}
then the union of A, B is A ∪ B = {1, 2, 3, 4, 5, 7, 9}.
• The elements of the union are in A or in B or in both. If elements are
in both sets, we do not repeat them.

Complement of a Set
• The complement of set A is denoted by A’ or by A^C.
• A’ = {x| x is not in set A}.
• The complement set operation is analogous to the negation operation
in logic.
• Example Say U={1,2,3,4,5}, A={1,2}, then A’ = {3,4,5}.

Cardinal Number
• The Cardinal Number of a set is the number of elements in the set and
is denoted by n(A).
• Let A={2,4,6,8,10}, then n(A)=5.
• The Cardinal Number formula for the union of two sets is
n(A ∪ B) = n(A) + n(B) - n(A ∩ B).
• The Cardinal Number formula for the complement of a set is
n(A) + n(A’) = n(U).
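
The set operations and both cardinal number formulas can be checked with Python's built-in set type. This is a sketch using the sets A and B from the intersection and union examples; the universal set U = {1, …, 10} is an assumption made here so that a complement can be computed.

```python
A = {1, 3, 5, 7, 9}
B = {1, 2, 3, 4, 5}
U = set(range(1, 11))  # assumed universal set for this example

intersection = A & B      # elements in A and in B
union = A | B             # elements in A or in B (no repeats)
complement_of_A = U - A   # elements of U that are not in A

# n(A ∪ B) = n(A) + n(B) - n(A ∩ B)
union_formula_holds = len(union) == len(A) + len(B) - len(intersection)
# n(A) + n(A') = n(U)
complement_formula_holds = len(A) + len(complement_of_A) == len(U)

if __name__ == "__main__":
    print(sorted(intersection), sorted(union), sorted(complement_of_A))
    print(union_formula_holds, complement_formula_holds)
```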

Strings and Languages

Languages
• A language is a set of strings

• String: A sequence of letters

• Examples: “cat”, “dog”, “house”, …

• Defined over an alphabet:


$S = \{a, b, c, \ldots, z\}$
Alphabets and Strings
• We will use small alphabets: $S = \{a, b\}$
• Strings: a, ab, abba, baba, aaabbbaabab
• Named strings: u = ab, v = bbbaaa, w = abba
String Operations
$w = a_1 a_2 \cdots a_n$, for example abba
$v = b_1 b_2 \cdots b_m$, for example bbbaaa

Concatenation
$wv = a_1 a_2 \cdots a_n b_1 b_2 \cdots b_m$, for example abbabbbaaa

Reverse

$w = a_1 a_2 \cdots a_n$, for example ababaaabbb
$w^R = a_n \cdots a_2 a_1$, for example bbbaaababa

String Length
• Length: if $w = a_1 a_2 \cdots a_n$, then $|w| = n$
• Examples:
$|abba| = 4$
$|aa| = 2$
$|a| = 1$
Length of Concatenation
$|uv| = |u| + |v|$
• Example:
$u = aab$, $|u| = 3$
$v = abaab$, $|v| = 5$
$|uv| = |aababaab| = 8$
$|uv| = |u| + |v| = 3 + 5 = 8$
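
Concatenation, reverse and length map directly onto Python string operations. The sketch below replays the slide examples; the variable names are chosen here for illustration.

```python
w = "abba"
v = "bbbaaa"

concatenation = w + v              # wv: the symbols of w followed by those of v
reverse = "ababaaabbb"[::-1]       # w^R: the symbols in reverse order
lengths = [len("abba"), len("aa"), len("a")]

u = "aab"
v2 = "abaab"
# Length of a concatenation is the sum of the lengths.
length_rule_holds = len(u + v2) == len(u) + len(v2)

if __name__ == "__main__":
    print(concatenation, reverse, lengths, len(u + v2), length_rule_holds)
```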
Empty String $\lambda$
• A string with no letters: $|\lambda| = 0$
• Observations:
$\lambda w = w \lambda = w$
$\lambda abba = abba \lambda = abba$

Substring
• Substring of a string: a subsequence of consecutive characters

String    Substring
abbab     ab
abbab     abba
abbab     b
abbab     bbab
abbab     λ
Proper Substring
• Proper substring of a string: a subsequence of consecutive characters, excluding the
empty string and the given string

String    Proper substring
abbab     ab
abbab     abba
abbab     b
abbab     bbab
Prefix and Suffix
• If $w = uv$, then u is a prefix of w and v is a suffix of w
• Example: w = abbab

Prefix    Suffix
λ         abbab
a         bbab
ab        bab
abb       ab
abba      b
abbab     λ
Proper Prefix and Proper Suffix
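• If $w = uv$, then u is a proper prefix and v is a proper suffix of w when neither u nor v is the empty string λ (so neither is w itself)
• Example: w = abbab

Proper prefix    Proper suffix
a                bbab
ab               bab
abb              ab
abba             b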
Examples
• Substrings of “computer”
'λ', 'c', 'co', 'com', 'comp', 'compu', 'comput', 'compute', 'computer', 'o',
'om', 'omp', 'ompu', 'omput', 'ompute', 'omputer', 'm', 'mp', 'mpu',
'mput', 'mpute', 'mputer', 'p', 'pu', 'put', 'pute', 'puter', 'u', 'ut', 'ute',
'uter', 't', 'te', 'ter', 'e', 'er', 'r'
• Proper Substrings of “computer” (the empty string λ and “computer” itself are excluded)
'c', 'co', 'com', 'comp', 'compu', 'comput', 'compute', 'o',
'om', 'omp', 'ompu', 'omput', 'ompute', 'omputer', 'm', 'mp', 'mpu',
'mput', 'mpute', 'mputer', 'p', 'pu', 'put', 'pute', 'puter', 'u', 'ut', 'ute',
'uter', 't', 'te', 'ter', 'e', 'er', 'r'
Examples
• Prefixes of the string “computer”
'λ', 'c', 'co', 'com', 'comp', 'compu', 'comput', 'compute', 'computer'
• Proper Prefixes of the string “computer”
'c', 'co', 'com', 'comp', 'compu', 'comput', 'compute'
• Suffixes of the string “computer”
'λ', 'r', 'er', 'ter', 'uter', 'puter', 'mputer', 'omputer', 'computer'
• Proper Suffixes of the string “computer”
'r', 'er', 'ter', 'uter', 'puter', 'mputer', 'omputer'
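
These lists can be produced by slicing. The following Python sketch uses made-up helper names and represents the empty string λ as "". It treats a proper prefix, suffix or substring as one that is neither empty nor the whole string, matching the definition on the Proper Substring slide.

```python
def prefixes(w: str) -> list:
    """All prefixes of w, from the empty string up to w itself."""
    return [w[:i] for i in range(len(w) + 1)]


def suffixes(w: str) -> list:
    """All suffixes of w, from w itself down to the empty string."""
    return [w[i:] for i in range(len(w) + 1)]


def substrings(w: str) -> set:
    """All runs of consecutive characters of w, including the empty string."""
    return {w[i:j] for i in range(len(w) + 1) for j in range(i, len(w) + 1)}


def proper(parts, w: str) -> list:
    """Drop the empty string and w itself (assumed meaning of 'proper')."""
    return [p for p in parts if p not in ("", w)]


if __name__ == "__main__":
    word = "computer"
    print(prefixes(word))
    print(proper(suffixes(word), word))
    print(len(substrings(word)), len(proper(substrings(word), word)))
```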

Another Operation
$w^n = \underbrace{ww \cdots w}_{n}$
• Example: $(abba)^2 = abbaabba$
• Definition: $w^0 = \lambda$
$(abba)^0 = \lambda$
The * Operation (Kleene or Star Closure)
$S^*$: the set of all possible strings from alphabet S
$S = \{a, b\}$
$S^* = \{\lambda, a, b, aa, ab, ba, bb, aaa, aab, \ldots\}$
The + Operation (Positive Closure)
$S^+$: the set of all possible strings from alphabet S except $\lambda$
$S = \{a, b\}$
$S^* = \{\lambda, a, b, aa, ab, ba, bb, aaa, aab, \ldots\}$
$S^+ = S^* - \{\lambda\}$
$S^+ = \{a, b, aa, ab, ba, bb, aaa, aab, \ldots\}$
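
Both closures are infinite, but their strings can be listed in order of length up to any bound. The sketch below does this in Python; the function name strings_up_to and the length bound are assumptions for illustration, and the empty string λ is written as "".

```python
from itertools import product


def strings_up_to(alphabet, max_len: int) -> list:
    """List every string over alphabet of length 0..max_len, shortest first."""
    symbols = sorted(alphabet)
    return ["".join(combo)
            for length in range(max_len + 1)
            for combo in product(symbols, repeat=length)]


star_part = strings_up_to("ab", 3)              # finite part of the star closure
plus_part = [s for s in star_part if s != ""]   # same strings without the empty string

if __name__ == "__main__":
    print(star_part)
    print(plus_part)
```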
Language
• A language is any subset of $S^*$
• Example:
$S = \{a, b\}$
$S^* = \{\lambda, a, b, aa, ab, ba, bb, aaa, \ldots\}$
• Languages:
$\{\lambda\}$
$\{a, aa, aab\}$
$\{\lambda, abba, baba, aa, ab, aaaaaa\}$
Another Example
$L = \{a^n b^n : n \ge 0\}$
• An infinite language
$\lambda, ab, aabb, aaaaabbbbb \in L$
$abb \notin L$
Operations on Languages
• The usual set operations
$\{a, ab, aaaa\} \cup \{bb, ab\} = \{a, ab, bb, aaaa\}$
$\{a, ab, aaaa\} \cap \{bb, ab\} = \{ab\}$
$\{a, ab, aaaa\} - \{bb, ab\} = \{a, aaaa\}$
• Complement: $\overline{L} = S^* - L$
$\overline{\{a, ba\}} = \{\lambda, b, aa, ab, bb, aaa, \ldots\}$
Reverse
• Definition:
$L^R = \{w^R : w \in L\}$
• Examples:
$\{ab, aab, baba\}^R = \{ba, baa, abab\}$
$L = \{a^n b^n : n \ge 0\}$
$L^R = \{b^n a^n : n \ge 0\}$
Concatenation
• Definition:
L1L2 = xy : x  L1, y  L2 

• Example:
a, ab, bab, aa

= ab, aaa, abb, abaa, bab, baaa
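
For finite languages, concatenation is a double loop over the two sets. The following Python sketch reproduces the example; the function name concat is chosen for illustration.

```python
def concat(L1: set, L2: set) -> set:
    """Every string xy with x taken from L1 and y taken from L2."""
    return {x + y for x in L1 for y in L2}


example = concat({"a", "ab", "ba"}, {"b", "aa"})

if __name__ == "__main__":
    print(sorted(example, key=lambda s: (len(s), s)))
```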


Another Operation
• Definition: $L^n = \underbrace{LL \cdots L}_{n}$
$\{a, b\}^3 = \{a, b\}\{a, b\}\{a, b\} = \{aaa, aab, aba, abb, baa, bab, bba, bbb\}$
• Special case: $L^0 = \{\lambda\}$
$\{a, bba, aaa\}^0 = \{\lambda\}$
More Examples

$L = \{a^n b^n : n \ge 0\}$
$L^2 = \{a^n b^n a^m b^m : n, m \ge 0\}$
$aabbaaabbb \in L^2$

Star-Closure (Kleene *)
• Definition:
$L^* = L^0 \cup L^1 \cup L^2 \cup \cdots$
• Example:
$\{a, bb\}^* = \{\lambda, a, bb, aa, abb, bba, bbbb, aaa, aabb, abba, abbbb, \ldots\}$
Positive Closure
• Definition:
$L^+ = L^1 \cup L^2 \cup \cdots$
$L^+ = L^* - \{\lambda\}$ when $\lambda \notin L$ (if $\lambda \in L$, then $L^+ = L^*$)
$\{a, bb\}^+ = \{a, bb, aa, abb, bba, bbbb, aaa, aabb, abba, abbbb, \ldots\}$
Examples
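• If L = {1, 2, 3}, find the number of strings in $L^2$ and in $L^0 \cup L^1 \cup L^2$

$L^2 = LL$ = {11, 12, 13, 21, 22, 23, 31, 32, 33}, so $|L^2| = 9$
$L^0 \cup L^1 \cup L^2$ = {λ, 1, 2, 3, 11, 12, 13, 21, 22, 23, 31, 32, 33}, so $|L^0 \cup L^1 \cup L^2| = 13$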
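
The counts can be checked by computing language powers directly. This Python sketch follows the definition of the n-th power as L concatenated with itself n times, with the 0-th power equal to {λ}; the function names are made up for this example. Under that definition the second power of {1, 2, 3} alone has 9 strings, and the union of the 0-th, 1st and 2nd powers has 13.

```python
def concat(L1: set, L2: set) -> set:
    """Every string xy with x taken from L1 and y taken from L2."""
    return {x + y for x in L1 for y in L2}


def power(L: set, n: int) -> set:
    """L concatenated with itself n times; the 0-th power is {""}."""
    result = {""}
    for _ in range(n):
        result = concat(result, L)
    return result


L = {"1", "2", "3"}
second_power = power(L, 2)
up_to_two = power(L, 0) | power(L, 1) | power(L, 2)

if __name__ == "__main__":
    print(len(second_power), len(up_to_two))
```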

Examples
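• If $L_1$ = {a, b, c} and $L_2$ = {1, 2}, find $L_1 \cup L_2$, $L_1 \cap L_2$, $L_1 - L_2$, $L_1^*$, $L_2'$, $L_1^1$.

$L_1 \cup L_2$ = {a, b, c, 1, 2}, $|L_1 \cup L_2| = 5$
$L_1 \cap L_2 = \emptyset$, $|L_1 \cap L_2| = 0$
$L_1 - L_2$ = {a, b, c}
$L_1^*$ = {λ, a, b, c, aa, ab, ac, ba, bb, bc, ca, cb, cc, aaa, …}
$L_2'$ = {λ, 11, 12, 21, 22, 111, 112, 121, 122, 211, 212, 221, 222, …} (complement taken over the alphabet {1, 2})
$L_1^1 = L_1$ = {a, b, c}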

Examples
• Let L={ab, baa, aa}. Which of the following are in L*?

a. abaabaaabaa
b. aaaabaaaa
c. baaaaabaaaab
d. baaaaabaa
e. aaabbaaaaaa
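
One way to settle such questions is to check whether the string can be cut into consecutive pieces that all belong to L. The Python sketch below does this with a table of reachable cut positions. The function name in_star is an assumption, and this is just one possible method, not necessarily the one intended in the course.

```python
def in_star(w: str, L: set) -> bool:
    """Decide whether w can be split into pieces that all belong to L."""
    # reachable[i] is True when the first i symbols of w split into pieces of L.
    reachable = [False] * (len(w) + 1)
    reachable[0] = True  # the empty string is always in the star closure
    for end in range(1, len(w) + 1):
        for piece in L:
            start = end - len(piece)
            if piece and start >= 0 and reachable[start] and w[start:end] == piece:
                reachable[end] = True
                break
    return reachable[len(w)]


L = {"ab", "baa", "aa"}
candidates = ["abaabaaabaa", "aaaabaaaa", "baaaaabaaaab", "baaaaabaa", "aaabbaaaaaa"]

if __name__ == "__main__":
    for label, word in zip("abcde", candidates):
        print(label, word, in_star(word, L))
```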

Examples
• If $L_1 = \{a^n b^n : n \ge 0\}$ and $L_2 = \{c^n : n \ge 0\}$, find $L_1 L_2$, $(L_1 L_2)^R$, $L_1^R L_2$

$L_1 L_2 = \{a^n b^n c^m : n, m \ge 0\}$

$(L_1 L_2)^R = \{c^m b^n a^n : n, m \ge 0\}$

$L_1^R L_2 = \{b^n a^n c^m : n, m \ge 0\}$
