You are on page 1of 24

FORMAL LANGUAGES AND

AUTOMATA THEORY-
19CS604

1
Introduction to Automata
Theory
Chapter 1

2
What is Automata Theory?
 Study of abstract computing devices, or
“machines”
 Automaton = an abstract computing device
 Note: A “device” need not even be a physical
hardware!
 A fundamental question in computer science:
 Find out what different models of machines can do
and cannot do
 The theory of computation
 Computability vs. Complexity

3
(A pioneer of automata theory)

Alan Turing (1912-1954)


 Father of Modern Computer
Science
 English mathematician
 Studied abstract machines called
Turing machines even before
computers existed

4
Theory of Computation: A
Historical Perspective
1930s • Alan Turing studies Turing machines
• Decidability
• Halting problem
1940-1950s • “Finite automata” machines studied
• Noam Chomsky proposes the
“Chomsky Hierarchy” for formal
languages
1969 Cook introduces “intractable” problems
or “NP-Hard” problems
1970- Modern computer science: compilers,
computational & complexity theory evolve
5
The Chomsky Hierachy
• A containment hierarchy of classes of formal languages

Regular Context-
(DFA) Context- Recursively-
free
sensitive enumerable
(PDA)
(LBA) (TM)

6
Finite Automata
 Some Applications
 Software for designing and checking the behavior
of digital circuits
 Lexical analyzer of a typical compiler
 Software for scanning large bodies of text (e.g.,
web pages) for pattern finding
 Software for verifying systems of all types that
have a finite number of states (e.g., stock market
transaction, communication/network protocol)

7
Finite Automata : Examples
action
 On/Off switch state

Grammars:
 Modeling recognition of the word “then”

Start state Transition Intermediate Final state


state
8
Structural expressions
 Grammars:
 are useful models when designing software that processes
data with a recursive structure.
 The best known example is a parser, the component of a
compiler that deals with the recursively nested features of
the typical programming language such as expressions
arithmetic conditional and so on.
 For instance, a grammatical rule like E→E+E states that an
expression can be formed by taking any two expressions
and connecting them by a plus sign this rule is typical of how
expressions of real programming languages are formed.

9
Structural expressions
 Regular Expressions also denote the structure of
data especially text Strings.

 represents capitalized words followed by a space and


two capital letters.
 This expression represents patterns in text that could
be a city and state. Eg: Ithaca NY

10
 It misses multiword city names such as
Palo Alto CA which could be captured
by the more complex expression

11
The Central Concepts of
Automata Theory

12
Alphabet
An alphabet is a finite, non-empty set of
symbols
 We use the symbol ∑ (sigma) to denote an

alphabet
 Examples:
 Binary: ∑ = {0,1}
 All lower case letters: ∑ = {a,b,c,..z}
 Alphanumeric: ∑ = {a-z, A-Z, 0-9}
 DNA molecule letters: ∑ = {a,c,g,t}
 …

13
Strings
 A string or word is a finite sequence of
symbols chosen from ∑.
Eg: 01101 is a string form the alphabet
∑ = {0,1}

14
 Empty String
 Is the string with zero occurrences of
symbols.
 Denoted by  (or “epsilon”)
  is a string that may be chosen from any
alphabet whatsoever.

15
 Length of a string
 is the number of positions for symbols in the
string.(i.e number of symbols in the string)
 01101 has length 5
 |  |=0
 | 011|=3
 E.g., x = 010100 |x| = 6
 x = 01  0  1  00  |x| = ?

16
Powers of an alphabet
 If ∑ is an alphabet we can express the set of all
strings of a certain length from that alphabet by
using an exponential notation.
 We define ∑k to be the set of strings of length k,
each of whose symbols is in ∑.
 ∑k = the set of all strings of length k
 ∑* = ∑0 U ∑1 U ∑2 U … = ∑+ U {}
 ∑+ = ∑1 U ∑2 U ∑3 U …
 ∑0 =  (i.e,  is the only string whose length is 0)
17
 Eg:

18
 Concatenation of strings
 Let x and y be strings
 xy denotes the concatenation of x and y,
that is, the string formed by making a copy
of x and following it by a copy of y.

19
 Eg:
 Let x= 1101 and y=0011
xy= 11010011
yx= 00111101
 w=w=w
 That is,  is the identity for concatenation, since when
concatenated with any string it yields the other string
as a result (analogously to the way 0, the identity for
addition, can be added to any number x and yields x
as a result).

20
Languages
 A set of strings all of which are chosen from
some ∑* , where ∑ is a particular alphabet is
called a language.
 If ∑ is an alphabet, and L  ∑*, then L is a
language over ∑.

21
Example

22
Languages & Grammars
 Languages: “A language is a
Or “words” collection of sentences of
finite length all constructed
from a finite alphabet of
symbols”
 Grammars: “A grammar can
be regarded as a device that
enumerates the sentences
of a language” - nothing
more, nothing less

 N. Chomsky, Information and


Control, Vol 2, 1959

Image source: Nowak et al. Nature, vol 417, 2002


23
Languages
L is a said to be a language over alphabet ∑, only if L  ∑*
 this is because ∑* is the set of all strings (of all possible
length including 0) over the given alphabet ∑
Examples:
1. Let L be the language of all strings consisting of n 0’s
followed by n 1’s:
L = {, 01, 0011, 000111,…}
2. Let L be the language of all strings of with equal number of
0’s and 1’s:
L = {, 01, 10, 0011, 1100, 0101, 1010, 1001,…}
Canonical ordering of strings in the language

Definition: Ø denotes the Empty language


 Let L = {}; Is L=Ø?
NO
24

You might also like