
KOTEBE METROPOLITAN UNIVERSITY

Automata and Complexity Theory

Course Material

January, 2022
TABLE OF CONTENTS
1. INTRODUCTION TO THEORY OF COMPUTATION --------------- 5
1.1 MATHEMATICAL PRELIMINARIES ---------------------------------------------------------------------6
1.1.1 SETS AND SUBSETS ---------------------------------------------------------------------------------------------------6

1.1.2 SET OPERATIONS ------------------------------------------------------------------------------------------------------6

1.2 LANGUAGES ----------------------------------------------------------------------------------------------8


1.3 GRAMMARS--------------------------------------------------------------------------------------------- 11
1.3.1 TYPES OF GRAMMARS---------------------------------------------------------------------------------------------- 14

1.3.2 REGULAR GRAMMARS --------------------------------------------------------------------------------------------- 14

1.3.3 CONTEXT-FREE GRAMMARS------------------------------------------------------------------------------------- 15

1.4 AUTOMATA --------------------------------------------------------------------------------------------- 15


1.4.1 TYPES OF AUTOMATA --------------------------------------------------------------------------------------------- 16

2 FINITE AUTOMATA AND REGULAR LANGUAGES ------------- 17


2.1 FINITE AUTOMATA ------------------------------------------------------------------------------- 17
2.1.1 DETERMINISTIC FINITE AUTOMATA (DFA) ---------------------------------------------------------------- 19

2.1.2 DESIGNING FINITE AUTOMATA -------------------------------------------------------------------------------- 24

2.1.3 NONDETERMINISTIC FINITE AUTOMATA (NFA) ---------------------------------------------------------- 28

2.2 EQUIVALENCE OF DFA AND NFA ------------------------------------------------------------------- 32


2.3 REDUCTION OF NUMBER OF STATES IN FINITE AUTOMATA ------------------------------------- 36

3 . REGULAR EXPRESSION AND REGULAR LANGUAGES ------ 40


3.1 REGULAR EXPRESSIONS ------------------------------------------------------------------------------ 40
3.1.1 EQUIVALENCE WITH FINITE AUTOMATA ------------------------------------------------------------------- 43

3.2 REGULAR GRAMMARS -------------------------------------------------------------------------------- 51


3.2.1 RIGHT- LINEAR AND LEFT-LINEAR GRAMMARS ---------------------------------------------------------- 52

3.2.2 EQUIVALENCE OF FINITE AUTOMATA AND REGULAR GRAMMARS ------------------------------- 53

3.3 PROPERTIES OF REGULAR LANGUAGES ------------------------------------------------------------ 56


3.3.1 CLOSURE PROPERTIES -------------------------------------------------------------------------------------------- 57

Compiled by: Destalem H. 2


3.3.2 THE PUMPING LEMMA FOR REGULAR LANGUAGES ---------------------------------------------------- 58

4. CONTEXT-FREE LANGUAGES AND PUSHDOWN AUTOMATA ------- 59
4.1 CONTEXT-FREE GRAMMARS ----------------------------------------------------------------- 60
4.1.1 TYPES OF DERIVATIONS ------------------------------------------------------------------------------------------ 62

4.1.2 DESIGNING CONTEXT-FREE GRAMMARS ------------------------------------------------------------------- 65

4.1.3 AMBIGUITY ------------------------------------------------------------------------------------------------------------ 66

4.2 SIMPLIFICATION OF CONTEXT-FREE GRAMMARS ----------------------------------- 68


4.2.1 CONSTRUCTION OF REDUCED GRAMMARS ---------------------------------------------------------------- 69

4.2.2 REMOVING USELESS PRODUCTIONS ------------------------------------------------------------------------- 70

4.2.3 REMOVING Λ-PRODUCTIONS ------------------------------------------------------------------------------------ 73

4.2.4 REMOVING UNIT-PRODUCTIONS ------------------------------------------------------------------------------- 74

4.3 IMPORTANT NORMAL FORMS -----------------------------------------------------------------------------


4.3.1 CHOMSKY NORMAL FORM ---------------------------------------------------------------------------------------- 76

4.3.2 GREIBACH NORMAL FORM --------------------------------------------------------------------------------------- 78

4.4 PUSHDOWN AUTOMATA ------------------------------------------------------------------------ 80


4.4.1 FORMAL DEFINITION OF A PUSHDOWN AUTOMATON ------------------------------------------------- 82

5 TURING MACHINE(TM) --------------------------------------------------- 86


5.1 INTRODUCTION TO COMPUTATIONAL COMPLEXITY --------------------------------------------- 86
5.2 STANDARD TURING MACHINE (TM) ---------------------------------------------------------------- 86
FORMAL DEFINITION OF A STANDARD TURING MACHINE ------------------------------------------------------- 88

REPRESENTATION OF TURING MACHINES ----------------------------------------------------------------------------- 91

REPRESENTATION OF TM BY INSTANTANEOUS DESCRIPTIONS: ----------------------------------------------- 91

MOVES IN TURING MACHINE------------------------------------------------------------------------------------------------ 92

REPRESENTATION OF TM BY TRANSITION TABLE: ------------------------------------------------------------------ 93

REPRESENTATION OF TM BY TRANSITION DIAGRAM: ------------------------------------------------------------- 94

CONSTRUCTION OF TURING MACHINE(TM) -------------------------------------------------------------- 96


VARIANTS OF TURING MACHINES ----------------------------------------------------------------------------------------- 96

6 COMPUTABILITY ----------------------------------------------------------- 99
INTRODUCTION ------------------------------------------------------------------------------------------------ 99
PRIMITIVE RECURSIVE FUNCTIONS ------------------------------------------------------------------------ 99
INITIAL FUNCTIONS ------------------------------------------------------------------------------------------------------------ 99

PRIMITIVE RECURSIVE FUNCTIONS OVER N ------------------------------------------------------------------------- 100

PRIMITIVE RECURSIVE FUNCTIONS OVER Σ = {A, B} --------------------------------------------------------------- 101

RECURSIVE FUNCTIONS ------------------------------------------------------------------------------------- 103

7 COMPUTATIONAL COMPLEXITY ----------------------------------- 104


INTRODUCTION ----------------------------------------------------------------------------------------------- 104
BIG-O NOTATION -------------------------------------------------------------------------------------------- 104
GROWTH RATE OF FUNCTIONS -------------------------------------------------------------------------------------------- 104

CLASS P VERSUS CLASS NP ------------------------------------------------------------------------- 105


POLYNOMIAL TIME REDUCTION AND NP-COMPLETE PROBLEMS------------------------------------ 106



1. INTRODUCTION TO THEORY OF
COMPUTATION
After completing this chapter students will be able to:
▪ Understand why we study theory in computer science.
▪ Grasp basic knowledge of set theory: sets, subsets, proper subsets, and set operations.
▪ Acquire basic knowledge of functions and relations.
▪ Have a good understanding of graphs and trees.
▪ Identify the different proof techniques.
▪ Define terms such as automaton, grammar, language, and string concatenation.

In this chapter, topics such as set theory, set operations, functions, relations, graphs, proof techniques, languages, automata, and grammars are discussed briefly. The concepts discussed in this chapter will serve as a base for the chapters that follow.

A question that will occur to every reader of this module is: since computer science is a practical discipline, why do we need to study its theoretical aspects? There are several reasons why we study formal languages and automata, which form the core of the theoretical side of the field. This chapter serves to introduce the reader to the principal motivation and also outlines the major topics covered in this module.

Activity 1.1
✓ Mention and explain briefly why we need to learn the theoretical aspect of
computer science?
Why study theory?
The first answer is that theory provides concepts and principles that help us understand the general
nature of the discipline. The field of computer science includes a wide range of special topics, from
machine design to programming.

The second answer is that the ideas we will discuss have some immediate and important applications.
The fields of digital design, programming languages, and compilers are the most obvious examples, but
there are many others. The concepts we study here run like a thread through much of computer science,
from operating systems to pattern recognition.



The third answer is that the subject matter itself is intellectually stimulating and fun. It provides many
challenging, puzzle-like problems that can lead to some sleepless nights. This is problem solving in its
pure essence.

1.1 MATHEMATICAL PRELIMINARIES

Activity 1.2
✓ Define the terms set, subset, and proper subset.
✓ Discuss the different ways of representing a set.

1.1.1 SETS AND SUBSETS


A set is a well-defined collection of objects; for example, the set of all students in a university is a set. A collection of all books in a library is also a set. The individual objects are called members or elements of the set.

We use capital letters A, B, C, . . . to denote sets, and small letters a, b, c, . . . to denote the elements of a set. To indicate that x is an element of the set A, we write x ∈ A. The statement that x is not in A is written x ∉ A. A set can be specified in three ways:
1. By listing its elements. We write all the elements of the set (without repetition) and enclose them within braces. We can write the elements in any order. For example, the set of all positive integers divisible by 15 and less than 100 can be written as {15, 30, 45, 60, 75, 90}.
2. By describing the properties of the elements of the set. For example, the set {15, 30, 45, 60, 75, 90} can be described as {n : n is a positive integer divisible by 15 and less than 100}. (The description of the property is called a predicate. In this case the set is said to be implicitly specified.)
3. By recursion. We define the elements of the set by a computational rule for calculating the elements. For example, the set of all natural numbers leaving a remainder of 1 when divided by 3 can be described as
{an : a0 = 1, an+1 = an + 3}

1.1.2 SET OPERATIONS


The usual set operations are union (∪), intersection (∩), and difference (−), defined as follows:
• If we have two sets S1 and S2, then the union operation is defined by:
S1 ∪ S2 = {x : x ∈ S1 or x ∈ S2}
• If we have two sets S1 and S2, then the intersection operation is defined by:
S1 ∩ S2 = {x : x ∈ S1 and x ∈ S2}
• If we have two sets S1 and S2, then the difference operation is defined by:
S1 − S2 = {x : x ∈ S1 and x ∉ S2}

Another basic operation is complementation. The complement of a set S, denoted by S̄, consists of all elements not in S. To make this meaningful, we need to know what the universal set U of all possible elements is. If U is specified, then S̄ = {x : x ∈ U, x ∉ S}.
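These operations map directly onto Python's built-in set type. The sketch below is illustrative only; the universal set U (the first ten natural numbers) and the two sample sets are assumptions chosen for the example:

```python
# Illustrative sets; U is an assumed universal set for complementation.
U = set(range(10))
S1 = {1, 2, 3, 4}
S2 = {3, 4, 5, 6}

union = S1 | S2          # S1 ∪ S2
intersection = S1 & S2   # S1 ∩ S2
difference = S1 - S2     # S1 − S2
complement = U - S1      # complement of S1 relative to U

print(union)         # {1, 2, 3, 4, 5, 6}
print(intersection)  # {3, 4}
print(difference)    # {1, 2}
print(complement)    # {0, 5, 6, 7, 8, 9}
```

Note that `-` on Python sets already implements the set difference x ∈ S1 and x ∉ S2.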

The set with no elements, called the empty set or the null set, is denoted by ∅. From the definition of a
set, it is obvious that
S ∪ ∅ = S − ∅ = S,  S ∩ ∅ = ∅,  the complement of ∅ is U, and the complement of S̄ is S itself.
The following useful identities, known as DeMorgan's laws, are needed on several occasions: the complement of S1 ∪ S2 equals S̄1 ∩ S̄2, and the complement of S1 ∩ S2 equals S̄1 ∪ S̄2.

A set S1 is said to be a subset of S if every element of S1 is also an element of S. We write this as


S1 ⊆ S.
We say that S1 is a proper subset of S, if S contains at least one element which is not present in S1; we
write this as S1 ⊂ S.
If S1 and S2 are two sets with no common element, that is, S1 ∩ S2 = ∅, then S1 and S2 are said to be disjoint sets.

A set is said to be finite if it contains a finite number of elements; otherwise it is infinite. The size of a
finite set is the number of elements in it; this is denoted by |S|. A given set normally has many subsets.
The set of all subsets of a set S is called the powerset of S and is denoted by 2S. Observe that 2S is a set
of sets.
Example 1.1
If S is the set {a, b, c}, then its powerset is
2S ={ ∅ , {a} , {b} , {c} , {a,b} , {a,c} , {b,c} , {a,b,c} }.

Here |S| = 3 and |2S| = 2³ = 8. This is an instance of a general result; if S is finite, then |2S| = 2|S|.
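This result is easy to check computationally. In the sketch below, `powerset` is a helper name of our own, built on `itertools.combinations`:

```python
from itertools import combinations

def powerset(s):
    """Return the set of all subsets of s, each as a frozenset."""
    elems = list(s)
    return {frozenset(c)
            for r in range(len(elems) + 1)
            for c in combinations(elems, r)}

P = powerset({'a', 'b', 'c'})
print(len(P))                       # 8, i.e. 2**3 = 2**|S|
print(frozenset() in P)             # True: ∅ is a subset of every set
print(frozenset({'a', 'b'}) in P)   # True
```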

If the elements of a set are ordered sequences of elements from other sets, then such sets are said to be
the Cartesian product of other sets. The Cartesian product of two sets, which itself is a set of ordered
pairs, we write as
S = S1 × S2 = { (x,y) : x ∈ S1 , y ∈ S2 }
Example 1.2
Let S1 = {2, 4} and S2 = {2, 3, 5, 6}. Then
S1 × S2 = {(2, 2), (2, 3), (2, 5), (2, 6), (4, 2), (4, 3), (4, 5), (4, 6)}.



Note that the order in which the elements of a pair are written matters. The pair (4, 2) is in S1 × S2, but
(2, 4) is not. The notation is extended in an obvious fashion to the Cartesian product of more than two
sets; generally
S1 x S2 x . . . x Sn = { (x1 , x2 , . . . , xn ) : xi ∈ Si }
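Example 1.2 can be reproduced with `itertools.product`, which generates exactly the ordered pairs (and, with more arguments, the n-fold products) described above; this is only an illustrative sketch:

```python
from itertools import product

S1 = [2, 4]
S2 = [2, 3, 5, 6]

pairs = list(product(S1, S2))   # S1 × S2 as ordered pairs
print(pairs)
# [(2, 2), (2, 3), (2, 5), (2, 6), (4, 2), (4, 3), (4, 5), (4, 6)]

# Order matters: (4, 2) is in S1 × S2, but (2, 4) is not.
print((4, 2) in pairs)   # True
print((2, 4) in pairs)   # False
```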
A set can be divided by separating it into a number of subsets. Suppose that S1, S2, …, Sn are subsets of a given set S and that the following holds:
1. the subsets S1, S2, …, Sn are mutually disjoint;
2. S1 ∪ S2 ∪ … ∪ Sn = S;
3. none of the Si is empty.
Then S1, S2, …, Sn is called a partition of S.
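The three conditions can be checked mechanically. The function name `is_partition` below is our own, introduced only for this sketch:

```python
def is_partition(subsets, S):
    """Check the three partition conditions for `subsets` of the set S."""
    subsets = [set(x) for x in subsets]
    disjoint = all(a.isdisjoint(b)
                   for i, a in enumerate(subsets)
                   for b in subsets[i + 1:])       # condition 1
    covers = set().union(*subsets) == set(S)       # condition 2
    nonempty = all(subsets)                        # condition 3
    return disjoint and covers and nonempty

print(is_partition([{1, 2}, {3}, {4, 5}], {1, 2, 3, 4, 5}))  # True
print(is_partition([{1, 2}, {2, 3}], {1, 2, 3}))             # False: not disjoint
```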

1.2 LANGUAGES

Activity 1.3
✓ Explain the similarities and differences between natural and formal languages.
✓ Define a string.
✓ What is concatenation?

We are all familiar with the notion of natural languages, such as English. Dictionaries define the term
informally as a system suitable for the expression of certain ideas, facts, or concepts, including a set of
symbols and rules for their manipulation. But, this is not sufficient as a definition for the study of formal
languages. We need a precise definition for the term.

To define language formally, we start with a finite, nonempty set Σ of symbols, called the alphabet.
From the individual symbols we construct strings, which are finite sequences of symbols from the alphabet. A set of strings is called a language. For example, if the alphabet Σ = {a, b}, then abab and aaabbba are strings on Σ. In this module we will use lowercase letters a, b, c, … for elements of Σ and the letters u, v, w, … for string names. For example, to assign a specific string to a name we write:
w = abaaa,
to indicate that the string named w has the specific value abaaa.
Concatenation: the concatenation of two strings w and v is the string obtained by appending the symbols of v to the right end of w. That is, if we have strings
w = a1a2a3…an and
v = b1b2…bm,
then the concatenation of w and v creates a new string, denoted by wv:
wv = a1a2…anb1b2…bm



Reverse: the reverse of a string is obtained by writing the symbols in reverse order; if w is a string as
shown above, then its reverse is: wR = an…a2a1
Length: the length of a string w, denoted by |w|, is the number of symbols in the string.
Empty string: is a string with no symbols at all. It will be denoted by symbol λ (pronounced as lambda).
Note the following simple relations about empty string.
|λ|=0,
λw=wλ=w hold for all w.
Substring: any string of consecutive symbols in some string w is said to be a substring of w. If w = vu, then the substrings v and u are said to be a prefix and a suffix of w, respectively. For example, if w = abbab, then {λ, a, ab, abb, abba, abbab} is the set of all prefixes of w, while bab, ab, b are some of its suffixes. Lengths add under concatenation: if u and v are strings, then the length of their concatenation is the sum of the individual lengths, that is, |uv| = |u| + |v|.
If w is a string, then wn stands for the string obtained by repeating w n times. As a special case, we define
w0 = λ for all w.
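Python strings make these definitions concrete; in the sketch below the empty string '' plays the role of λ:

```python
w = 'abaaa'
v = 'bb'

print(w + v)        # concatenation wv: 'abaaabb'
print(w[::-1])      # reverse wR: 'aaaba'
print(len(w + v) == len(w) + len(v))   # |uv| = |u| + |v|: True

print('ab' * 3)     # w^n as n-fold repetition: 'ababab'
print('ab' * 0 == '')                  # w^0 = λ: True

# All prefixes of abbab, matching the example in the text:
prefixes = ['abbab'[:i] for i in range(len('abbab') + 1)]
print(prefixes)     # ['', 'a', 'ab', 'abb', 'abba', 'abbab']
```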
If Σ is an alphabet, then we use Σ* to denote the set of all possible strings obtained by concatenating zero
or more symbols from Σ. The set Σ* always contains λ. To exclude the empty string, we define
Σ+ = Σ*- {λ}
While Σ is finite by assumption, Σ* and Σ+ are always infinite since there is no limit on the length of the
strings in these sets.
Any subset L of Σ* is a language over Σ. If you want to emphasize that you are dealing with the abstract
theory of such languages, say "formal language" instead of just "language".

Since, by definition, the alphabet Σ is finite, we can enumerate all words (strings) over Σ. That is, we can order all words by sorting them into a right-infinite sequence. This ordering can be done in many ways, but the most standard ordering is the alphabetical (shortlex) enumeration: shorter words come first, and words of equal length are listed in dictionary order. For instance, for the binary alphabet Σ = {0, 1}, the set Σ* of all words over Σ is sorted by alphabetical enumeration like this:

λ, 0, 1, 00, 01, 10, 11, 000, 001, ...
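This enumeration is easy to generate: list the strings length by length, and within each length in dictionary order. The generator name `shortlex` below is our own:

```python
from itertools import count, product

def shortlex(alphabet):
    """Yield every string over `alphabet`: shorter strings first,
    strings of equal length in dictionary order."""
    for n in count(0):                      # n = 0, 1, 2, ...
        for tup in product(alphabet, repeat=n):
            yield ''.join(tup)

gen = shortlex('01')
first = [next(gen) for _ in range(9)]
print(first)   # ['', '0', '1', '00', '01', '10', '11', '000', '001']
```

The first element '' is the empty string λ; the sequence never ends, which is one way to see that Σ* is infinite.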

A set is called countable if it has at most as many elements as there are natural numbers. More rigorously, a set S is defined to be countable if there exists an injective map i: S → N (recall: a map f: A → B from a set A to a set B is called injective if for every b ∈ B there exists at most one a ∈ A such that f(a) = b). The set {0, 1}* can be seen to be countable by viewing the alphabetical enumeration as an injective map i: {0, 1}* → N.



Finite sets are always countable. But there are also many sets that are infinite and countable; such sets are called countably infinite. {0, 1}* is an example. Another example of a countably infinite set is N itself, or the set of all even integers.

Definition 1.1: Let A and B be languages. We define the operations union, concatenation, and star-
closure as follows:

• Union: A ∪ B = {x| x ∈ A or x ∈ B}.


• Concatenation: A.B = {xy | x ∈ A and y ∈ B}.
• Star-closure: A∗ = {x1x2 . . . xk | k ≥ 0 and each xi ∈ A}.

You are already familiar with the union operation. It simply takes all the strings in both A and B and
lumps them together into one language.

The concatenation operation is a little trickier. It attaches a string from A in front of a string from B in
all possible ways to get the strings in the new language.

The star operation is a bit different from the other two because it applies to a single language rather than
to two different languages. That is, the star operation is a unary operation instead of a binary operation.
It works by attaching any number of strings in A together to get a string in the new language. Because
“any number” includes 0 as a possibility, the empty string 𝜆 is always a member of A∗, no matter what
A is.

Example 1.3
Let the alphabet Σ be the standard 26 letters {a, b, . . . , z}. If A = {good, bad} and
B = {boy,girl}, then
• A ∪ B = {good, bad, boy, girl},
• A . B = {goodboy, goodgirl, badboy, badgirl}, and
• A∗ = {λ, good, bad, goodgood, goodbad, badgood, badbad, goodgoodgood,
goodgoodbad, goodbadgood, goodbadbad, . . . }.
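Definition 1.1 and Example 1.3 can be sketched directly in Python. Since A∗ is infinite, the helper `star` (a name of our own) computes only a finite approximation: concatenations of at most `max_k` strings from A:

```python
A = {'good', 'bad'}
B = {'boy', 'girl'}

union = A | B                             # A ∪ B
concat = {x + y for x in A for y in B}    # A.B
print(sorted(concat))
# ['badboy', 'badgirl', 'goodboy', 'goodgirl']

def star(lang, max_k):
    """Finite approximation of lang*: all concatenations of at most
    max_k strings from lang (k = 0 contributes the empty string)."""
    result = {''}
    for _ in range(max_k):
        result |= {w + x for w in result for x in lang}
    return result

print('' in star(A, 3))             # True: λ is always in A*
print('goodbadgood' in star(A, 3))  # True
```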
Example 1.4
Let Σ = {a, b}. Then
Σ* = {λ, a, b, aa, ab, ba, bb, aaa, aab, …}
The set {a, aa, aab} is a language on Σ. Because it has a finite number of sentences, we
call it a finite language. The set L = {anbn : n ≥ 0} is also a language on Σ. The strings
aabb and aaaabbbb are in the language L, but the string abb is not in L. This language is
infinite. Most interesting languages are infinite.

Since languages are sets, the set operations union, intersection, and difference of two languages are immediately defined. The complement of a language is defined with respect to Σ*; that is, the complement of L is L̄ = Σ* − L. The reverse of a language is the set of all string reversals, that is, LR = {wR : w ∈ L}. The concatenation of two languages L1 and L2 is the set of all strings obtained by concatenating any element of L1 with any element of L2; specifically,
L1L2 = {xy : x ∈ L1,y ∈ L2}
We define Ln as L concatenated with itself n times, with the special cases
L0 = {λ}
L1 = L for every language L.
Finally, we define the star-closure of a language as: L* = L0 ∪ L1∪ L2∪. . .
and the positive closure as: L+ = L1 ∪ L2 ∪. . .
Example 1.5
If L= { anbn : n≥0} then
The reverse of L is LR = {bnan : n ≥ 0}.

1.3 GRAMMARS
Activity 1.4
✓ Define grammar formally.
✓ What is a linear grammar?

To express languages mathematically, we need a mechanism to describe them. Everyday language is


imprecise and ambiguous, so informal descriptions in English are often inadequate. The set notation
used in Examples 1.3 and 1.4 is more suitable, but limited. As we proceed we will learn about several
language-definition mechanisms that are useful in different circumstances. Here we introduce a common
and powerful one, the notion of a grammar.

A grammar for the English language tells us whether a particular sentence is well-formed or not. A
typical rule of English grammar is “a sentence can consist of a noun phrase followed by a predicate.”
Example 1.6: The English Language Grammar
<Sentence> →<noun phrase> <Predicate>
<Noun Phrase> →<article> <noun>
<Predicate> →<verb>
<article> →a
<article> →the
<noun> →cat
<noun> →dog
<verb> →runs
<verb> →sleeps
Derivation of string “the dog sleeps”
<Sentence> ⇒ <noun phrase> <Predicate>
⇒< noun phrase > <verb>
⇒<article> <noun> <verb>
⇒ the <noun> <verb>
⇒ the dog <verb>
⇒ the dog sleeps
Derivation of string “a cat runs”
<Sentence> ⇒ <noun phrase> <Predicate>
⇒< noun phrase > <verb>
⇒<article> <noun> <verb>
⇒ a <noun> <verb>
⇒ a cat <verb>
⇒ a cat runs
We get the following set of strings (the language) of the grammar:
L = {“a cat runs”, “a cat sleeps”, “the cat runs”, “the cat sleeps”, “a dog runs”, “a dog sleeps”, “the dog runs”, “the dog sleeps”}

If we associate the actual words “a” and “the” with articles, “cat” and “dog” with nouns, and “runs” and “sleeps” with verbs, then the grammar tells us that the sentences “a cat runs” and “the dog runs” are properly formed. If we were to give a complete grammar, then in theory every proper sentence could be explained this way. This example illustrates the definition of a general concept in terms of simple
ones. We start with the top-level concept, here Sentences, and successively reduce it to the irreducible
building blocks of the language. The generalization of these ideas leads us to formal grammars.
Definition 1.2: A grammar G is defined as a quadruple G = (V, T, S, P),
Where
V is a finite set of objects called variables,
T is a finite set of objects called terminal symbols,
S ∈ V is a special symbol called the start variable,
P is a finite set of productions.
It will be assumed that the sets V and T are nonempty and disjoint. The production rules are the heart of
a grammar; they specify how the grammar transforms one string into another, and through this they
define a language associated with the grammar. In our discussion we will assume that all production
rules are of the form
x→y
where x is an element of (V ∪ T)+ and y is in (V ∪ T)*. The productions are applied in the following
manner: Given a string w of the form
w=uxv
we say the production x → y is applicable to this string, and we may use it to replace x with y, thereby
obtaining a new string
z=uyv
This is written as
w⟹z
We say that w derives z or that z is derived from w. Successive strings are derived by applying the
productions of the grammar in arbitrary order. A production can be used whenever it is applicable, and
it can be applied as often as desired. If
w1 ⟹ w2 ⟹. . . ⟹ wn,
we say that w1 derives wn and write

w1 ⇒* wn
The * indicates that an unspecified number of steps (including zero) can be taken to derive wn from w1.
By applying the production rules in a different order, a given grammar can normally generate many
strings. The set of all such terminal strings is the language defined or generated by the grammar.
Definition 1.3: Let G = (V, T, S, P) be a grammar. Then the set

L(G) = {w ∈ T* : S ⇒* w} is the language generated by G.

If w ∈ L(G), then the sequence S ⟹ w1 ⟹ w2 ⟹ . . . ⟹ wn ⟹ w is a derivation of the sentence w. The
strings S, w1, w2,…, wn, which contain variables as well as terminals, are called sentential forms of the
derivation.

Example 1.7
Consider the grammar G = ({S}, {a, b}, S, P), with P given by
S→aSb
S→λ then
S ⟹ aSb ⟹ aaSbb ⟹ aabb, so we can write

S ⇒* aabb
The string aabb is a sentence in the language generated by G, while aaSbb is a sentential form. A
grammar G completely defines L(G), but it may not be easy to get a very explicit description of the
language from the grammar. Here, however, the answer is fairly clear. It is not hard to conjecture that
L(G)={anbn : n≥0}
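This conjecture can be supported experimentally with a small breadth-first derivation procedure; the helper name `derive` is ours, and the sketch is specific to the grammar S → aSb | λ:

```python
def derive(max_steps):
    """Collect all terminal strings derivable from S in at most
    max_steps production applications, for S -> aSb | λ."""
    sentential = {'S'}
    terminal = set()
    for _ in range(max_steps):
        nxt = set()
        for w in sentential:
            if 'S' not in w:          # no variables left: a sentence
                terminal.add(w)
                continue
            nxt.add(w.replace('S', 'aSb', 1))  # apply S -> aSb
            nxt.add(w.replace('S', '', 1))     # apply S -> λ
        sentential = nxt
    return terminal

L = derive(6)
print(sorted(L, key=len))   # ['', 'ab', 'aabb', 'aaabbb', 'aaaabbbb']
# Every string produced has the form a^n b^n:
print(all(w == 'a' * (len(w) // 2) + 'b' * (len(w) // 2) for w in L))  # True
```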

Example 1.8
Find a grammar that generates the language L={anbn+1 : n≥0}
Solution:
The idea behind the previous example can be extended to this case. All we need to do is generate an
extra b. This can be done with a production S → Ab, with other productions chosen so that A can derive
the language in the previous example. Reasoning in this fashion, we get the grammar G =({S, A}, {a,
b}, S, P), with productions
S→Ab
A→aAb
A→λ

1.3.1 TYPES OF GRAMMARS


There are different types of grammars. Here we will briefly look at the following two:
• Linear grammars, which, depending on the position of the variable, include 1. right-linear and 2. left-linear grammars
• Nonlinear grammars
Linear grammars are grammars that have at most one variable on the right side of each production; they are classified further by the position of that variable within the right side. Nonlinear grammars are grammars that may have more than one variable on the right side of a production.
Example of a linear grammar: G = ({S}, {a, b}, S, P), where P is given by
S→aSb
S→λ

Example of a nonlinear grammar: G = ({S}, {a, b}, S, P), where P is given by


S→SS
S→λ
S→aSb
S→bSa

1.3.2 REGULAR GRAMMARS


Grammars are often an alternative way of specifying languages. Whenever we define a language family
through an automaton or in some other way, we are interested in knowing what kind of grammar we can
associate with the family.

Right-Linear and Left-Linear Grammars


Definition 1.8: A grammar G =(V, T, S, P) is said to be right-linear if all productions are of the form
A → xB,
A → x,
where A, B ∈ V, and x ∈ T*.

Definition 1.9: A grammar is said to be left-linear if all productions are of the form
A → Bx,
or
A → x.

A regular grammar is one that is either right-linear or left-linear. Note that in a regular grammar, at
most one variable appears on the right side of any production. Furthermore, that variable must
consistently be either the rightmost or leftmost symbol of the right side of any production.

1.3.3 CONTEXT-FREE GRAMMARS


The productions in a regular grammar are restricted in two ways: The left side must be a single variable, while
the right side has a special form. To create grammars that are more powerful, we must relax some of these
restrictions. By retaining the restriction on the left side, but permitting anything on the right, we get context-free
grammars.
Definition 1.4: A grammar G = (V, T, S, P) is said to be context-free if all productions in P have the form
A → x,
where A ∈ V and x ∈ (V ∪ T)*. A language L is said to be context-free if and only if there is a context free
grammar G such that L= L (G).

1.4 AUTOMATA

Activity 1.5
✓ What is automaton?
✓ Mention and explain the three different types of automata?

A formal language is an abstraction of a programming language, and an automaton is an abstract model of a digital computer. As such, every automaton includes some essential features of a digital computer. It has a
mechanism for reading input. It will be assumed that the input is a string over a given alphabet, written
on an input file, which the automaton can read but not change. The input file is divided into cells, each
of which can hold one symbol. The input mechanism can read the input file from left to right, one symbol
at a time. The input mechanism can also detect the end of the input string (by sensing an end-of-file
condition).



The automaton can produce output of some form. It may have a temporary storage device, consisting
of an unlimited number of cells, each capable of holding a single symbol from an alphabet (not
necessarily the same one as the input alphabet). The automaton can read and change the contents of the
storage cells. Finally, the automaton has a control unit, which can be in any one of a finite number of
internal states, and which can change state in some defined manner. An automaton is assumed to operate
in a discrete timeframe. At any given time, the control unit is in some internal state, and the input
mechanism is scanning a particular symbol on the input file. The internal state of the control unit at the
next time step is determined by the next-state or transition function. This transition function gives the
next state in terms of the current state, the current input symbol, and the information currently in the
temporary storage.

During the transition from one time interval to the next, output may be produced or the information in
the temporary storage changed. The term configuration will be used to refer to a particular state of the
control unit, input file, and temporary storage. The transition of the automaton from one configuration
to the next will be called a move. In the coming chapter we will discuss in detail about automata, but
here we are going to introduce to the different types of automata.

1.4.1 TYPES OF AUTOMATA


Automata are distinguished by their temporary memory:
▪ Finite automata: no temporary memory
▪ Pushdown automata: a stack
▪ Turing machines: random-access memory

Power of automata: the following diagram shows the power of the different types of automata in
solving different types of problems.



2 FINITE AUTOMATA AND REGULAR LANGUAGES

After completing this chapter students will be able to:

▪ Develop a design mentality.
▪ Have a good understanding of regular languages.
▪ Understand the relation between regular languages and finite automata.
▪ Design transition graphs and transition tables for different regular languages.
▪ Easily differentiate between deterministic and nondeterministic automata.
▪ Perform the conversion between an NFA and a DFA.
▪ Minimize the states of an automaton to obtain a reduced automaton.

2.1 FINITE AUTOMATA

Activity 2.1
1. Describe finite automata briefly.
2. List and explain the types of finite automata.
3. Define DFA and NFA formally.
4. What are the DFA representations?

Finite automata are good models for computers with an extremely limited amount of memory. We
interact with such computers all the time, as they lie at the heart of various electromechanical devices.
The controller for an automatic door is one example of such a device. Often found at hotel and
supermarket entrances and exits, automatic doors swing open when the controller senses that a person
is approaching. An automatic door has a pad in front to detect the presence of a person about to walk
through the doorway. Another pad is located to the rear of the doorway so that the controller can hold
the door open long enough for the person to pass all the way through and also so that the door does not
strike someone standing behind it as it opens. This configuration is shown in figure 2.1.

FIGURE 2.1: Top view of an automatic door



The controller is in either of two states: “OPEN” or “CLOSED,” representing the corresponding
condition of the door. As shown in figures 2.2 and 2.3, there are four possible input conditions:
“FRONT” (meaning that a person is standing on the pad in front of the doorway), “REAR” (meaning
that a person is standing on the pad to the rear of the doorway), “BOTH” (meaning that people are
standing on both pads), and “NEITHER” (meaning that no one is standing on either pad).

FIGURE 2.2: State diagram for an automatic door controller

FIGURE 2.3: State transition table for an automatic door controller

The controller moves from state to state, depending on the input it receives. When in the CLOSED state
and receiving input NEITHER or REAR, it remains in the CLOSED state. In addition, if the input
BOTH is received, it stays CLOSED because opening the door risks knocking someone over on the
rear pad. But if the input FRONT arrives, it moves to the OPEN state. In the OPEN state if input,
FRONT, REAR, or BOTH is received, it remains in OPEN. If input NEITHER arrives, it returns to
CLOSED.

For example, a controller might start in state CLOSED and receive the series of input signals FRONT,
REAR, NEITHER, FRONT, BOTH, NEITHER, REAR, and NEITHER. It then would go through
the series of states CLOSED (starting), OPEN, OPEN, CLOSED, OPEN, OPEN, CLOSED,
CLOSED, and CLOSED.

Thinking of an automatic door controller as a finite automaton is useful because that suggests standard
ways of representation as in Figures 2.2 and 2.3. This controller is a computer that has just a single bit
of memory, capable of recording which of the two states the controller is in. Other common devices
have controllers with somewhat larger memories. In an elevator controller, a state may represent the
floor the elevator is on and the inputs might be the signals received from the buttons. This computer
might need several bits to keep track of this information. Controllers for various household appliances
such as dishwashers and electronic thermostats, as well as parts of digital watches and calculators are



additional examples of computers with limited memories. The design of such devices requires keeping
the methodology and terminology of finite automata in mind.

We will now take a closer look at finite automata from a mathematical perspective. We will develop a
precise definition of a finite automaton, terminology for describing and manipulating finite automata,
and theoretical results that describe their power and limitations. Besides giving you a clearer
understanding of what finite automata are and what they can and cannot do, this theoretical development
will allow you to practice and become more comfortable with mathematical definitions, theorems, and
proofs in a relatively simple setting.

Strings, by definition, are finite (they have only a finite number of symbols). Most languages of interest
are infinite (they contain an infinite number of strings). However, in order to work with these languages
we must be able to specify or describe them in ways that are finite. Finite automata are used to describe
these infinite languages. Finite automata are finite collections of states with transition rules that take
you from one state to another.

There are two types of finite automata:

▪ Deterministic Finite Automata (DFA).


▪ Nondeterministic Finite Automata (NFA).

Figure 2.4: Pictorial representation of string processing using automata

2.1.1 DETERMINISTIC FINITE AUTOMATA (DFA)

Definition 2.1: A Deterministic Finite Automaton (DFA) is defined by the quintuple


M = (Q,Σ,δ,q0, F), Where
Q is a finite set of internal states,
Σ is a finite set of symbols called the input alphabet,
δ : Q × Σ → Q is a total function called the transition function,
q0 ∈ Q is the initial state,
F ⊆Q is a set of final states.
A deterministic finite automaton operates in the following manner.
1. At the initial time, it is assumed to be in the initial state q0, with its input mechanism on the
leftmost symbol of the input string. During each move of the automaton, the input mechanism
advances one position to the right, so each move consumes one input symbol.
2. When the end of the string is reached, the string is accepted if the automaton is in one of its final
states. Otherwise, the string is rejected.
3. The input mechanism can move only from left to right and reads exactly one symbol on each
step.
4. The transitions from one internal state to another are governed by the transition function δ. For
example, if we have the transition rule δ(q0, a) = q1, this means that if the DFA is in state q0 and
the current input symbol is a, the DFA will go into state q1 after reading the symbol a.
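The four-step operation above can be sketched as a tiny simulator (a minimal sketch in Python; the dictionary encoding of δ and the example machine are our own illustration, not notation from the text):

```python
def dfa_accepts(delta, start, finals, string):
    """Run a DFA on a string, one symbol per move, left to right."""
    state = start                       # step 1: begin in the initial state
    for symbol in string:               # step 3: read exactly one symbol per move
        state = delta[(state, symbol)]  # step 4: delta governs each transition
    return state in finals              # step 2: accept iff we end in a final state

# A hypothetical two-state DFA over {0,1} that accepts strings ending in 1.
delta = {("s0", "0"): "s0", ("s0", "1"): "s1",
         ("s1", "0"): "s0", ("s1", "1"): "s1"}
print(dfa_accepts(delta, "s0", {"s1"}, "1101"))  # True
print(dfa_accepts(delta, "s0", {"s1"}, "110"))   # False
```

The requirement that δ be a total function corresponds to the lookup delta[(state, symbol)] never failing: every (state, symbol) pair must appear in the table.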
In discussing automata, it is essential to have a clear and intuitive picture to work with. To visualize and
represent finite automata, we use transition graphs (State diagram), in which the vertices represent
states and the edges represent transitions. The labels on the vertices are the names of the states, while
the labels on the edges are the current values of the input symbol.

For example, if q0 and q1 are internal states of some machine M, then the graph associated with M will
have one vertex labeled q0 and another labeled q1. An edge (q0,q1) labeled a represents the transition
δ(q0,a) = q1. The initial state will be identified by an incoming unlabeled arrow not originating at any
vertex. Final states are drawn with a double circle.

More formally, if M = (Q, Σ,δ,q0,F) is a deterministic finite automaton, then its associated transition
graph (state diagram) has exactly |Q| vertices, each one labeled with a different qi ∈ Q. For every
transition rule δ(qi,a) = qj, the graph has an edge (qi,qj) labeled a. The vertex associated with q0 is called
the initial vertex, while those labeled with qf ∈ F are the final vertices. It is a trivial matter to convert
from the (Q, Σ,δ,q0,F) formal definition of a DFA to its transition graph representation and vice versa.
A deterministic finite automaton can be represented in three ways:
1. Instantaneous description
2. Transition graph (state diagram)
3. Transition table

Example 2.1:
Consider a machine given by M1 = ({q0, q1, q2}, {0, 1}, 𝛿, q0, {q1}), where the transition
function (δ) (movement) is given by the following instantaneous description.



δ(q0,0) = q0, δ(q0,1) = q1,
δ(q1,0) = q0, δ(q1,1) = q2,
δ(q2,0) = q2, δ(q2,1) = q1

Based on the above given movement of the machine we can draw the transition graph and table as
follows.
δ     0     1
q0    q0    q1
q1    q0    q2
q2    q2    q1

Figure 2.5: Transition graph of the three-state finite automaton M1 (the graph is not reproduced here; the transition table above carries the same information)

Using the machine in figure 2.5 above, we can process strings created from {0,1}* in the following
manner.
Let us take the string 1001 from {0,1}*. First the DFA reads the leftmost symbol of the string and
computes δ(q0,1) = q1; then for the next symbol it computes δ(q1,0) = q0.
The computation continues until the last symbol. If the machine then halts in a final state, the string is
accepted; otherwise it is rejected. Here is the whole process for the string 1001:
for the first symbol which is 1-------------------δ(q0,1)=q1,
For the second symbol which is 0--------------δ(q1,0)=q0
For the third symbol which is 0 ---------------δ(q0,0)=q0
For the fourth symbol which is 1------------ δ(q0,1)=q1,
When the string ends the machine is in state q1, so we can say the string 1001 is accepted by the
machine M1 in figure 2.5.
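The same trace can be reproduced mechanically by encoding δ of M1 as a lookup table (the dictionary encoding is our own convention, not notation from the text):

```python
# Transition function of M1 (Example 2.1) as a table.
delta = {("q0", "0"): "q0", ("q0", "1"): "q1",
         ("q1", "0"): "q0", ("q1", "1"): "q2",
         ("q2", "0"): "q2", ("q2", "1"): "q1"}

state = "q0"
for symbol in "1001":
    state = delta[(state, symbol)]
    print(f"after reading {symbol}: in state {state}")

# The machine halts in q1, the final state, so 1001 is accepted.
print("accepted" if state == "q1" else "rejected")
```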

Example 2.2:
Consider the state diagram (transition graph) for finite automaton M2.



Figure 2.6: Transition graph of the two-state finite automaton M2

The formal description of M2 is ({q1, q2}, {0,1}, δ, q1, {q2}). The transition function δ is given by the
following transition table:
𝛿 0 1
q1 q1 q2
q2 q1 q2

Remember that the state diagram of M2 and the formal description of M2 or the transition table contain
the same information, only in different forms. You can always go from one form to the other form if
necessary.

A good way to begin understanding any machine is to try it on some sample input strings, as we did in
Example 2.1 above. When you do these “experiments” to see how the machine is
working, its method of functioning often becomes apparent. For instance, we can process the sample
string 1101 with machine M2. The machine M2 starts in its start state q1 and proceeds first to state q2
after reading the first 1, and then to states q2, q1, and q2 after reading 1, 0, and 1. The string is accepted
because q2 is an accept state. But string 110 leaves M2 in state q1, so it is rejected. After trying a few
more examples, you would see that M2 accepts all strings that end in a 1. Thus the language accepted by
M2 is L(M2) = {w| w ends in a 1}.

In order to discuss the relationship between languages and machines, it is convenient to introduce
the extended transition function δ*: Q × Σ* → Q. The second argument of δ* is a string, rather than a
single symbol, and its value gives the state the automaton will be in after reading that string. For
example, if
δ(q0,a) = q1
and
δ(q1,b) = q2,
then
δ* (q0,ab) = q2.
Formally, we can define δ* recursively by
δ*(q, 𝜆) = q,…………………………….…….(2.1)
δ*(q, wa) = δ(δ*(q, w), a),……………………(2.2)

for all q ∈ Q, w ∈ Σ*, a ∈ Σ. To see why this is appropriate, let us apply these definitions to the simple
case above. First, we use (2.2) to get

δ*(q, ab) = δ(δ*(q, a), b)……………………………………..(2.3)

But
δ*(q0, a) = δ(δ*(q0, 𝜆 ), a)
=δ(q0, a)
=q1
Substituting this into (2.3), we get
δ*(q0, ab) = δ(q1, b)=q2
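The recursive definition translates directly into code; this sketch mirrors (2.1) and (2.2), again with a dictionary standing in for δ:

```python
def delta_star(delta, q, w):
    """Extended transition function, defined recursively as in (2.1)/(2.2)."""
    if w == "":                       # (2.1): delta*(q, lambda) = q
        return q
    # (2.2): delta*(q, wa) = delta(delta*(q, w), a)
    return delta[(delta_star(delta, q, w[:-1]), w[-1])]

# The two transitions used in the example above.
delta = {("q0", "a"): "q1", ("q1", "b"): "q2"}
print(delta_star(delta, "q0", "ab"))  # q2
```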

Definition 2.2: The language accepted by a machine M = (Q, Σ,δ, q0,F) is the set of all strings on Σ
accepted by M. In formal notation,
L(M)={w ∈ Σ* : δ* (q0,w) ∈ F}
Note that we require that δ, and consequently δ*, be total functions. At each step, a unique move is
defined, so that we are justified in calling such an automaton deterministic. A DFA will process every
string in Σ* and either accept it or not accept it. Non-acceptance means that the DFA stops in a nonfinal
state, so that the complement of the accepted language is
L̄(M) = {w ∈ Σ* : δ* (q0,w) ∉ F}
Every finite automaton accepts some language. If we consider all possible finite automata, we get a set
of languages associated with them. We will call such a set of languages a family. The family of languages
that is accepted by deterministic finite accepters is called the family of regular languages.
Definition 2.3: A language L is called regular if and only if there exists some deterministic finite
accepter M such that L= L(M).

Example 2.3:
Consider the DFA in Figure 2.7.



Figure 2.7: Transition graph for a language L = {anb:n≥0}.

In drawing Figure 2.7 we use two labels on a single edge (the transitions δ(q1,a) and δ(q1,b) share one
edge). Such multiply labeled edges are shorthand for two or more distinct transitions: the transition is
taken whenever the input symbol matches any of the edge labels.

The automaton in Figure 2.7 remains in its initial state q0 until the first b is encountered. If this is also
the last symbol of the input, then the string is accepted, since q1 is a final state. If not, the DFA goes into
state q2, from which it can never escape (such a state is called a trap state). Here the state q2 is a trap state.
We see clearly from the transitional graph that the automaton accepts all strings consisting of an arbitrary
number of a's, followed by a single b. All other input strings are rejected. In set notation, the language
accepted by the automaton is
L = {anb:n≥0}.

2.1.2 DESIGNING FINITE AUTOMATA


Whether it be of automata or artwork, design is a creative process. As such, it cannot be reduced
to a simple recipe or formula. However, you might find a particular approach helpful when designing
various types of automata. That is, put yourself in the place of the machine you are trying to design and
then see how you would go about performing the machine’s task. Pretending that you are the machine
is a psychological trick that helps engage your whole mind in the design process.

Let’s design a finite automaton using the “reader as automaton” method just described. Suppose that
you are given some language and want to design a finite automaton that recognizes it. Pretending to be
the automaton, you receive an input string and must determine whether it is a member of the language
the automaton is supposed to recognize. You get to see the symbols in the string one by one. After each
symbol, you must decide whether the string seen so far is in the language. The reason is that you, like
the machine, don’t know when the end of the string is coming, so you must always be ready with the
answer.

First, in order to make these decisions, you have to figure out what you need to remember about the
string as you are reading it. Why not simply remember all you have seen? Bear in mind that you are
pretending to be a finite automaton and that this type of machine has only a finite number of states,
which means a finite memory. Imagine that the input is extremely long—say, from here to the moon—
so that you could not possibly remember the entire thing. You have a finite memory—say, a single sheet
of paper—which has a limited storage capacity. Fortunately, for many languages you don’t need to
remember the entire input. You need to remember only certain crucial information. Exactly which
information is crucial depends on the particular language considered.

For example, suppose that the alphabet is {0,1} and that the language consists of all strings with an odd
number of 1s. You want to construct a finite automaton M1 to recognize this language. Pretending to be
the automaton, you start getting an input string of 0s and 1s symbol by symbol. Do you need to remember
the entire string seen so far in order to determine whether the number of 1s is odd? Of course not, simply
remember whether the number of 1s seen so far is even or odd and keep track of this information as you
read new symbols. If you read a 1, flip the answer; but if you read a 0, leave the answer as is.

But how does this help you design M1? Once you have determined the necessary information to
remember about the string as it is being read, you represent this information as a finite list of possibilities.
In this instance, the possibilities would be
1. even so far, and
2. odd so far.
Then you assign a state to each of the possibilities. These are the states of M1, as shown here.


Figure 2.8: The two states qeven and qodd

Next, you assign the transitions by seeing how to go from one possibility to another upon reading a
symbol. So, if state qeven represents the even possibility and state qodd represents the odd possibility, you
would set the transitions to flip state on a 1 and stay put on a 0, as shown here.

Figure 2.9: Transitions telling how the possibilities rearrange

Next, you set the start state to be the state corresponding to the possibility associated with having seen
0 symbols so far (the empty string 𝜆). In this case, the start state corresponds to state qeven because 0 is
an even number. Last, set the accept states to be those corresponding to possibilities where you want to
accept the input string. Set qodd to be an accept state because you want to accept when you have seen an
odd number of 1s. These additions are shown in the following figure.

Figure 2.10: Adding the start and accept states
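The finished parity machine can be transcribed directly (a minimal sketch; the names q_even and q_odd spell out the subscripted state names used above):

```python
def odd_ones(string):
    state = "q_even"                 # start state: zero 1s seen (even)
    for symbol in string:
        if symbol == "1":            # reading a 1 flips the parity
            state = "q_odd" if state == "q_even" else "q_even"
        # reading a 0 leaves the state unchanged
    return state == "q_odd"          # accept state is q_odd

print(odd_ones("10110"))  # True  (three 1s)
print(odd_ones("0110"))   # False (two 1s)
```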


Example 2.4:
This example shows how to design a finite automaton M2 to recognize the regular language of all strings
that contain the string 001 as a substring. For example, 0010, 1001, 001, and 11111110011111 are all in
the language, but 11 and 0000 are not. How would you recognize this language if you were pretending
to be M2? As symbols come in, you would initially skip over all 1s. If you come to a 0, then you note
that you may have just seen the first of the three symbols in the pattern 001 you are seeking. If at this
point you see a 1, there were too few 0s, so you go back to skipping over 1s. But if you see a 0 at that
point, you should remember that you have just seen two symbols of the pattern. Now you simply need
to continue scanning until you see a 1. If you find it, remember that you succeeded in finding the pattern
and continue reading the input string until you get to the end. So there are four possibilities: You
1. haven’t just seen any symbols of the pattern,
2. have just seen a 0,
3. have just seen 00, or
4. have seen the entire pattern 001.

Assign the states q, q0, q00, and q001 to these possibilities. You can assign the transitions by observing
that from q reading a 1 you stay in q, but reading a 0 you move to q0. In q0 reading a 1 you return to q,
but reading a 0 you move to q00. In q00 reading a 1 you move to q001, but reading a 0 leaves you in q00.
Finally, in q001 reading a 0 or a 1 leaves you in q001. The start state is q, and the only accept state is q001,
as shown in Figure 2.11.

Figure 2.11: Accepts strings containing 001
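The machine of Example 2.4 can be transcribed and checked against a direct substring test (a sketch; the dictionary encoding is our own):

```python
# States named for the part of the pattern 001 seen so far (Figure 2.11).
delta = {("q", "0"): "q0",      ("q", "1"): "q",
         ("q0", "0"): "q00",    ("q0", "1"): "q",
         ("q00", "0"): "q00",   ("q00", "1"): "q001",
         ("q001", "0"): "q001", ("q001", "1"): "q001"}

def contains_001(s):
    state = "q"                  # start state: nothing of the pattern seen
    for symbol in s:
        state = delta[(state, symbol)]
    return state == "q001"       # accept state: the whole pattern was seen

# The machine agrees with a direct substring check on the text's examples.
for s in ["0010", "1001", "001", "11111110011111", "11", "0000"]:
    assert contains_001(s) == ("001" in s)
```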

Example 2.5
Design a deterministic finite accepter (DFA) that recognizes the set of all strings on Σ= {a,b} starting
with the prefix ab?
Solution
Here the task is DFA design. The DFA accepts the set of strings that start with the symbol a followed
by b. The only issue is the first two symbols in the string; after they have been read, no further decisions
are needed. Still, the automaton has to process the whole string before its decision is made. We can
therefore solve the problem with an automaton that has four states: an initial state q0 (no symbol seen
yet), a state q1 reached after reading a leading a, a final trap state q3 reached once the prefix ab has been
seen, and a nonfinal trap state q2 for everything else. If the first symbol is an a and the second is a b, the
automaton goes to the final trap state, where it will stay since the rest of the input does not matter. On
the other hand, if the first symbol is not an a or the second one is not a b, the automaton enters the
nonfinal trap state. The simple solution



is shown below in figure 2.12. Here the states q2 and q3 are called trap states (states that, once entered,
cannot be left).

Figure 2.12: Machine (DFA) for L={abw: w ∈ {a,b}*}
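The machine of Figure 2.12 can be transcribed and checked against the defining property of L = {abw : w ∈ {a,b}*} (a sketch that follows the description in the solution above):

```python
# q3 is the final trap state (prefix ab seen); q2 is the nonfinal trap state.
delta = {("q0", "a"): "q1", ("q0", "b"): "q2",
         ("q1", "b"): "q3", ("q1", "a"): "q2",
         ("q2", "a"): "q2", ("q2", "b"): "q2",
         ("q3", "a"): "q3", ("q3", "b"): "q3"}

def starts_with_ab(s):
    state = "q0"
    for symbol in s:
        state = delta[(state, symbol)]
    return state == "q3"

# Agreement with the defining property of the language.
for s in ["ab", "abba", "a", "ba", "aab", "b"]:
    assert starts_with_ab(s) == s.startswith("ab")
```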


Example 2.6
Show that the language L = {awa: w ∈ {a,b}*} is regular.

Solution
According to Definition 2.3, a language is regular if we can design a DFA for it. The construction of
a DFA for this language is similar to Example 2.5; the only difference here is that the string must start
with a single symbol a and there is also one additional constraint: it should end with another symbol a,
while in between these two symbols there can be any combination. What this DFA must do is check
whether a string begins and ends with an a; what is in between is immaterial. The solution is complicated
by the fact that there is no explicit way of testing for the end of the string. This difficulty is overcome
by simply putting the DFA into a final state whenever the second a is encountered. If this is not the end
of the string, and another b is found, it will take the DFA out of the final state and back to a nonfinal
state. Scanning continues in this way, each a taking the automaton back to its final state. The complete
solution is shown in Figure 2.13. To check, take some strings from {a,b}* that start and end with a and
process them with the machine in Figure 2.13; it will be apparent that the machine accepts a string if
and only if it begins and ends with an a. Since we have constructed a DFA for the language, we can
claim that, by definition, the language is regular.

Figure 2.13: DFA for L= {awa: w ∈ {a,b}* }
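One way to transcribe the DFA of Figure 2.13, following the description in the solution (the concrete transition table below is our reconstruction from that description):

```python
# q2 is the nonfinal trap state for strings not starting with a;
# q3 is the final state, re-entered on every a after the first.
delta = {("q0", "a"): "q1", ("q0", "b"): "q2",
         ("q1", "a"): "q3", ("q1", "b"): "q1",
         ("q3", "a"): "q3", ("q3", "b"): "q1",
         ("q2", "a"): "q2", ("q2", "b"): "q2"}

def in_awa(s):
    state = "q0"
    for symbol in s:
        state = delta[(state, symbol)]
    return state == "q3"

# Agreement with the defining property: starts with a, ends with a, |s| >= 2.
for s in ["aa", "aba", "abba", "a", "b", "ab", "baa"]:
    assert in_awa(s) == (len(s) >= 2 and s[0] == "a" and s[-1] == "a")
```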



2.1.3 NONDETERMINISTIC FINITE AUTOMATA (NFA)
In contrast to a DFA, where there is a unique next state for a transition from a state on an input symbol,
we now consider a finite automaton with nondeterministic transitions. A transition is nondeterministic
if there are several (possibly zero) next states from a state on an input symbol or without any input. A
transition without input is called a 𝜆-transition. A nondeterministic finite automaton is defined along the
same lines as a DFA, except that its transitions may be nondeterministic.
Definition 2.4: Formally, a nondeterministic finite automaton (NFA) is a quintuple
M = (Q, Σ, δ, q0, F), where Q, Σ, q0 and F are as in a DFA, whereas the transition function is
δ : Q × (Σ ∪ {𝜆}) → 2Q, where 2Q (also written Pot(Q)) is the power set of Q; that is, for a given state
and an input symbol (possibly 𝜆), δ assigns a set of next states, possibly the empty set.
Note that there are three major differences between this definition and the definition of a DFA.
1. In a nondeterministic accepter, the range of δ is in the power set of Q (pot(Q)), 2Q, so that its
value is not a single element of Q but a subset of it. This subset defines the set of all possible
states that can be reached by the transition. If, for instance, the current state is q1, the symbol a
is read, and δ(q1,a) = {q0,q2} : then either q0 or q2 could be the next state of the NFA.
2. We allow λ as the second argument of δ. This means that the NFA can make a transition without
consuming an input symbol. Although we still assume that the input mechanism can only travel
to the right, it is possible that it is stationary on some moves.
3. Finally, in an NFA, the set δ(qi,a) may be empty, meaning that there is no transition defined for
this specific situation.
Like DFAs, nondeterministic accepters can be represented by transition graphs. The vertices are
determined by Q, while an edge (qi,qj) with label 𝜶 is in the graph if and only if δ(qi, 𝜶) contains qj.
Note that since 𝜶 may be the empty string, there can be some edges labeled λ.
A string is accepted by an NFA if there is some sequence of possible moves that will put the machine in
a final state at the end of the string. A string is rejected (that is, not accepted) only if there is no possible
sequence of moves by which a final state can be reached. Nondeterminism can therefore be viewed as
involving “intuitive” insight by which the best move can be chosen at every state (assuming that the
NFA wants to accept every string).
Note: If S is a set, then Pot (S) denotes the power set of S (2S), the set of all subsets of S. For a finite
set S of size n, Pot (S) has size 2n.
Example 2.7:
Consider the NFA in Figure 2.14. The NFA accepts all the words (strings) from {0,1}* that end with 01.
Note that there are two "0" arrows leaving q0, one back to q0 itself and the second to q1, which would be
forbidden in a DFA.
Figure 2.14: NFA for set of strings that ends with 01.
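Nondeterminism can be emulated by tracking the set of states the machine could currently be in; this sketch does so for the NFA of Figure 2.14 (the set-tracking technique is standard, but the encoding is our own):

```python
# NFA of Figure 2.14: two "0" arrows leave q0, and q2 has no outgoing arrows.
delta = {("q0", "0"): {"q0", "q1"}, ("q0", "1"): {"q0"},
         ("q1", "1"): {"q2"}}

def nfa_accepts(s):
    current = {"q0"}                          # all states we could be in
    for symbol in s:
        # union of the moves of every current state; missing entries are empty
        current = set().union(*(delta.get((q, symbol), set()) for q in current))
    return "q2" in current                    # some choice reaches the final state

for s in ["01", "1101", "0", "010"]:
    assert nfa_accepts(s) == s.endswith("01")
```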
Example 2.8:
Let M be a machine with Q = {q0, q1, q2, q3, q4}, Σ = {a, b}, F = {q1, q3} and δ given by the
following transition table.

𝛿 a b 𝜆
q0 {q1} ∅ {q4}
q1 ∅ {q1} {q2}
q2 {q2,q3} {q3} ∅
q3 ∅ ∅ ∅
q4 {q4} {q3} ∅

As with a DFA, an NFA can be represented by a state transition diagram. For instance, the present NFA
can be represented as in Figure 2.15:

Figure: 2.15: Diagram for M

Note the following few nondeterministic transitions in this NFA.


1. There is no transition from q0 on input symbol b.
2. There are multiple (two) transitions from q2 on input symbol a.
3. There is a transition from q0 to q4 without any input, i.e. 𝜆-transition.
Consider the traces for the string ab from the state q0 . Clearly, the following four figures are the possible
traces.

a q b
1. q 0
1 q1

q  q a
q
b
q
2. 0
4
4 3

3. q 0
a q 1
 q 2
b
q 3

4. q 0
a q 1
b q 1
 q 2

Figure: 2.16: Diagram that shows traces for ab


Note that three distinct states q1, q2 and q3 are reachable from q0 via the string ab. That means that while
tracing a path from q0 for ab we consider possible insertions of 𝜆 in ab, wherever 𝜆-transitions are
defined. For example, in trace (2) we have included a 𝜆-transition from q0 to q4, considering ab as 𝜆ab,


as it is defined. Whereas, in trace (3) we consider ab as a𝜆b. It is clear that, if we process the input string
ab at the state q0, then the set of next states is {q1, q2, q3}.
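The λ-insertions considered above can be automated with a λ-closure computation; the following sketch encodes the NFA of Example 2.8 and recovers the set {q1, q2, q3} for the input ab (the closure-based emulation and the encoding are our own, with "" standing for λ):

```python
delta = {("q0", "a"): {"q1"}, ("q0", ""): {"q4"},
         ("q1", "b"): {"q1"}, ("q1", ""): {"q2"},
         ("q2", "a"): {"q2", "q3"}, ("q2", "b"): {"q3"},
         ("q4", "a"): {"q4"}, ("q4", "b"): {"q3"}}

def closure(states):
    """All states reachable via zero or more lambda-transitions."""
    stack, seen = list(states), set(states)
    while stack:
        for p in delta.get((stack.pop(), ""), set()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def delta_star(start, w):
    current = closure({start})
    for symbol in w:
        step = set().union(*(delta.get((q, symbol), set()) for q in current))
        current = closure(step)
    return current

print(sorted(delta_star("q0", "ab")))  # ['q1', 'q2', 'q3']
```

Since q3 is in F = {q1, q3}, the string ab is accepted by this NFA.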
Definition 2.5: The extended transition function for an NFA, δ*: Q × Σ* → Pot(Q), is defined inductively
through

∀q ∈ Q: δ*(q, 𝜆) = {q}

and

∀q ∈ Q ∀w ∈ Σ* ∀a ∈ Σ: δ*(q, wa) = ⋃p∈ 𝛿*(q,𝑤) δ(p, 𝑎)

For an NFA, the extended transition function is defined so that δ * (qi,w) contains qj if and only if there
is a walk in the transition graph from qi to qj labeled w. This holds for all qi, qj ∈ Q, and w ∈ Σ*. The
language L accepted by an NFA, M = (Q,Σ,δ, q0,F) is defined as the set of all strings accepted by the
machine. Formally,
L(M )= {w ∈ Σ* : δ* (q0,w) ∩ F ≠ ∅}
In words, the language consists of all strings w for which there is a walk labeled w from the initial
vertex of the transition graph to some final vertex.
Definition 2.6: An NFA M accepts a word (string) w if δ*(q0, w) ∩ F ≠ ∅. The language L(M) of an NFA
is the set of all words (strings) accepted by M.

Example 2.9: Consider the NFA in Figure 2.17.

Figure 2.17: An NFA (the graph is not reproduced here; it contains a λ-transition, as discussed below)
The automaton shown in Figure 2.17 is nondeterministic not only because several edges with the same
label originate from one vertex, but also because it has a λ-transition. Some transitions, such as δ (q2,0),
are unspecified in the graph. This is to be interpreted as a transition to the empty set, that is, δ (q2,0) =
Ø. The automaton accepts strings λ, 1010, and 101010, but not 110 and 10100. Note that for 10 there
are two alternative walks, one leading to q0, the other to q2. Even though q2 is not a final state, the string
is accepted because one walk leads to a final state.
Example 2.10: Consider the machine M1 in Figure 2.18. What is the language accepted by the NFA M1?
Figure 2.18: diagram for M1


1. From the initial state q0 one can reach back to q0 via strings from a∗ (any number of a's), or via
strings from a𝜆b∗b (a, followed by 𝜆, followed by any number of b's, followed by a final b), i.e.,
ab+ (a followed by at least one b), or via a string which is a mixture of strings from the above
two sets. That is, the strings of (a + abb*)∗ (this kind of representation of strings will be
discussed in the next chapter under the topic of regular expressions) will lead us from q0 to q0.
2. Also, note that the strings of ab∗ will lead us from the initial state q0 to the final state q2.
3. Thus, any string accepted by the NFA can be of the form a string from the set (a + abb*)∗
followed by a string from the set ab∗.
Hence, the language accepted by the NFA can be represented by { {a} ∪ {a}{b}+ }*{a}{b}* or with
the help of regular expression we can represent (a + abb*)∗ab∗

Example 2.11
Figure 2.19 represents an NFA. It has several λ-transitions and some undefined transitions such
as δ(q2,a). Suppose we want to find δ* (q1,a) and δ* (q2,λ). There is a walk labeled a involving two λ-
transitions from q1 to itself. By using some of the λ-edges twice, we see that there are also walks
involving λ-transitions to q0 and q2. Thus,
δ*(q1,a) = {q0,q1,q2}.

Figure 2.19: Diagram that represents NFA

Since there is a λ-edge between q2 and q0, we have immediately that δ*(q2,λ) contains q0. Also, since any
state can be reached from itself by making no move, and consequently using no input
symbol, δ*(q2,λ) also contains q2. Therefore,
δ*(q2, λ) = {q0,q2}.
Using as many λ-transitions as needed, you can also check that
δ *(q2, aa) = {q0,q1,q2},

Example 2.12
What is the language accepted by the automaton in example 2.9, Figure 2.17?
Solution
It is easy to see from the graph that the only way the machine can stop in a final state is if the input is
either a repetition of the string 10 or the empty string. Therefore, the automaton accepts the language
L = {(10)n : n ≥ 0}.



2.2 EQUIVALENCE OF DFA AND NFA

 Activity 2.2
1. Explain the need for nondeterministic finite automata.
2. Show that DFA and NFA are equivalent.

Note that NFAs are (i) good for human design: it is typically much easier to write down an NFA for a
given language than a DFA; but (ii) NFAs are bad for computer implementation because of the non-
determinism: if a computer program had to emulate an NFA, the program wouldn't know which
transition option it should take when a nondeterministic situation is encountered. Fortunately, it is
possible to automatically transform any (maybe human-specified) NFA into an equivalent (computer-
runnable) DFA. This construction is the underlying idea used in the proof of the following proposition.

Definition 2.7: Two finite accepters, M1 and M2, are said to be equivalent, if they both accept the same
language

L(M1) = L(M2),

There are generally many accepters for a given language, so any DFA or NFA has many equivalent
accepters.
Proposition 2.1: The languages accepted by DFAs are exactly the languages accepted by NFAs.

Proof, general comment: This proposition is of the form "set A equals set B". Almost invariably, when
proving statements of this form, one shows two things: (i) A ⊆ B, and (ii) B ⊆ A.

Proof main idea: (i) Showing that the set of languages accepted by DFAs is a subset of the set of
languages accepted by NFAs is trivial, because DFAs are NFAs (the definition of NFAs is a
generalization of the definition of DFAs). The difficult part is to show (ii) that the set of languages
accepted by NFAs is a subset of the set of languages accepted by DFAs. For this one has to start with
some NFA M = (QN, Σ, δN, q0N, FN) and construct from it a DFA M' = (QD, Σ, δD, q0D, FD) that accepts
the same language. The crucial idea is to take as the states of M' all subsets of states of M, that is, put
QD = Pot(QN). Further details: put q0D = { q0N }, FD = {S ⊆ QN | S ∩ FN ≠ ∅}, and for all S ∈ QD =
Pot(QN), a ∈ Σ,

δD(S,a) = ⋃qN∈S 𝛿N (qN, 𝑎)……………………….(2.1)

Proof, example: The subset construction will become immediately plausible through an example. We
take the NFA of Figure 2.20.



Figure 2.20: NFA Diagram

And here is the corresponding DFA obtained from the subset construction:

Figure 2.21: Partial representation for the DFA

Notes:

1. The states qi that occur inside the state circles of the subset DFA are from the nondeterministic
NFA, therefore they should more correctly be written qiN. The N subscript is omitted for
readability.
2. According to q0D = { q0N }, the start state is the one that contains (as a set) only q0N.
3. The accepting states of the subset DFA are all subsets of NFA states that contain some accepting
state of the NFA (here this is only q2N), according to FD = {S ⊆ QN | S ∩ FN ≠ ∅}.
4. To illustrate how it is determined which arrows leave a given state of the DFA, let us consider
the state S = {q0N, q2N }, and apply the rule δD(S, a) = ⋃qN∈S 𝛿N(qN, a). For a = 0, we have
to check to which NFA states one can go from either q0N or q2N. We find that we can reach q0N
and q1N from q0N and no other state from q2N. Therefore, what we can reach altogether is {q0N,
q1N}. This yields the "0" arrow from {q0N, q2N } to {q0N, q1N} in the diagram. Similarly, the "1"
arrow is determined by considering what states we can reach in the NFA from either q0N or q2N.
From q0N a 1 leads to q0N and from q2N a 1 leads nowhere, so we determine in the DFA a "1"
arrow from {q0N, q2N } to {q0N}.
5. Strictly speaking, in a DFA, for every state q and every symbol a there must be an arrow labeled with
a leaving q. For instance, from {q1N, q2N} we get the dotted arrows marked in the diagram.
However, the state S = {q1N, q2N} cannot be reached from the start state in the subset DFA.
Generally, the diagram omits all such irrelevant transitions out of states that cannot be reached
from the start state, which is what leads to the numerous seemingly "orphan" states in the diagram.



Proof, formal: The formal proof has to show that the language accepted by our subset DFA is the
same language as the one accepted by the original NFA. We proceed by induction on the length of
words. This is a very common type of proof in the theory of formal languages, and I suggest that
you digest this proof thoroughly. We show for every word w that

δN(q0N, w) = δD(q0D, w)…………………....…..(2.2)

Be careful that you understand this formula. δN(q0N, w) is the set of NFA states that can be reached from
the NFA starting state via w. δD(q0D, w) is the single state of the DFA that is deterministically reached from the
DFA starting state via w. However, due to the subset construction, this single DFA state corresponds to
a set of NFA states, namely to δN(q0N, w). Furthermore, make sure that you understand that if we have
shown (2.2), then we are done: w is accepted by the NFA ⇔ δN(q0N, w) contains some accepting
state ⇔ δD(q0D, w) is an accepting state ⇔ w is accepted by the DFA.

Induction basis: |w| = 0, i.e., w = 𝜆. Then δN(q0N, 𝜆) = {q0N} = δD(q0D, 𝜆).

Induction step: assume that (2.2) has been shown for all |w| ≤ n. Let wa be a word of length n + 1.
Then δN(q0N, wa) = ⋃pN∈δN(q0N,w) δN(pN, a), which by induction is equal to
⋃pN∈δD(q0D,w) δN(pN, a), which by (2.1) is equal to δD(δD(q0D, w), a) = δD(q0D, wa).

One important conclusion we can draw from Proposition 2.1 is that every language accepted by an NFA
is regular.
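As a concrete illustration, the subset construction can be sketched in Python. The dict-based NFA encoding and the sample machine below are illustrative choices, not from the text; as in note 5 above, only subsets reachable from the start state are generated.

```python
# Subset construction (Proposition 2.1) on a dict-encoded NFA.
# nfa_delta maps (state, symbol) -> set of successor states; missing
# entries mean "no transition".
def nfa_to_dfa(nfa_delta, start, accepting):
    symbols = {a for (_, a) in nfa_delta}
    start_set = frozenset([start])
    dfa_delta, seen, worklist = {}, {start_set}, [start_set]
    while worklist:
        S = worklist.pop()
        for a in symbols:
            # equation (2.1): delta_D(S, a) = union of delta_N(q, a) for q in S
            T = frozenset(p for q in S for p in nfa_delta.get((q, a), ()))
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                worklist.append(T)
    dfa_accepting = {S for S in seen if S & accepting}
    return dfa_delta, start_set, dfa_accepting

# Illustrative NFA over {0,1} accepting strings that end in "01"
# (a made-up machine, not the one from the figures):
delta_n = {("q0", "0"): {"q0", "q1"}, ("q0", "1"): {"q0"}, ("q1", "1"): {"q2"}}
dfa, q0d, finals = nfa_to_dfa(delta_n, "q0", {"q2"})
state = q0d
for ch in "1001":
    state = dfa[(state, ch)]
print(state in finals)  # True: "1001" ends in 01
```

Running the loop above walks the constructed DFA deterministically, one subset-state per input symbol, exactly as in the formal proof.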

Example 2.13
Convert the NFA in Figure 2.22 to an equivalent DFA.
Figure 2.22: Nondeterministic finite automaton
Solution
The NFA starts in state q0, so the initial state of the DFA will be labeled {q0}. After reading an a, the
NFA can be in state q1 or, by making a λ-transition, in state q2. Therefore, the corresponding DFA must
have a state labeled {q1,q2} and a transition
δ({q0},a) = {q1,q2}.
In state q0, the NFA has no specified transition when the input is b; therefore,
δ ({q0},b) = Ø.
A state labeled Ø represents an impossible move for the NFA and, therefore, means nonacceptance of
the string. Consequently, this state in the DFA must be a nonfinal trap state.



We have now introduced into the DFA the state { q1,q2}, so we need to find the transitions out of this
state. Remember that this state of the DFA corresponds to two possible states of the NFA, so we must
refer back to the NFA. If the NFA is in state q1 and reads an a, it can go to q1. Furthermore, from q1 the
NFA can make a λ-transition to q2. If, for the same input, the NFA is in state q2, then there is no specified
transition. Therefore,
δ({q1,q2},a) = {q1,q2}.
Similarly,
δ({q1,q2},b) = {q0}
At this point, every state has all transitions defined. The result, shown in Figure 2.23, is a DFA
equivalent to the NFA with which we started. The NFA in Figure 2.22 accepts any string for which δ*(q0,
w) contains q1. For the corresponding DFA to accept every such w, any state whose label includes q1
must be made a final state.
Figure 2.23
Example 2.14
Convert the NFA in Figure 2.24 into an equivalent deterministic machine (DFA).

Figure 2.24: Machine for NFA


Solution
Since δN(q0, 0) = {q0, q1}, we introduce {q0, q1} as a single state of the DFA and add an edge labeled 0
between {q0} and {q0, q1}. In the same way, considering δN(q0, 1) = {q1} gives us the new state {q1} and
an edge labeled 1 between it and {q0}. There are now a number of missing edges, so we continue, using
the construction of Proposition 2.1. Looking at the state {q0, q1}, we see that there is no outgoing edge
labeled 0, so we compute
δN*(q0, 0) ∪ δN*(q1, 0) = {q0, q1, q2}. This gives us the new state {q0, q1, q2} and the transition
δD({q0, q1}, 0) = {q0, q1, q2}.
Then, using symbol 1,
δN(q0, 1) ∪ δN(q1, 1) ∪ δN(q2, 1) = {q1, q2}.



This makes it necessary to introduce yet another state, {q1, q2}. At this point, we have the partially constructed
automaton shown in Figure 2.25. Since there are still some missing edges, we continue until we obtain
the complete solution in Figure 2.26.

Figure 2.25: Partially constructed automaton; Figure 2.26: Complete deterministic finite automaton

2.3 REDUCTION OF NUMBER OF STATES IN FINITE AUTOMATA

Activity 2.3
1. Show the relation between a language and a finite automaton.
2. Show the mechanism to reduce the number of states in a given DFA.

Any DFA defines a unique language, but the converse is not true. For a given language, there are many
DFAs that accept it. There may be a considerable difference in the number of states of such equivalent
automata. In terms of the questions we have considered so far, all solutions are equally satisfactory, but
if the results are to be applied in a practical setting, there may be reasons for preferring one over another.
Example 2.15
The two DFAs depicted in Figure 2.29 (a) and (b) are equivalent, as a few test strings will quickly
reveal. We notice some obviously unnecessary features of Figure 2.29 (a). The state q5 plays absolutely
no role in the automaton since it can never be reached from the initial state q0. Such a state is inaccessible,
and it can be removed (along with all transitions relating to it) without affecting the language accepted
by the automaton. But even after the removal of q5, the first automaton has some redundant parts.



The states reachable subsequent to the first move δ (q0,0) mirror those reachable from a first move
δ(q0,1). The second automaton combines these two options. From a strictly theoretical point of view,
there is little reason for preferring the automaton in Figure 2.29 (b) over that in Figure 2.29 (a). However,
in terms of simplicity, the second alternative is clearly preferable. Representation of an automaton for
the purpose of computation requires space proportional to the number of states. For storage efficiency,
it is desirable to reduce the number of states as far as possible.

Figure 2.29: Two equivalent DFAs, (a) and (b)

Definition 2.8: Two states q0 and q1 of a DFA are called indistinguishable if


𝛿 *(q0,w) ∈ F implies 𝛿 *(q1,w) ∈ F,
and
𝛿 *(q0,w) ∉ F implies 𝛿 *(q1,w) ∉ F, for all w ∈ Σ*.
If, on the other hand, there exists some string w ∈ Σ* such that 𝛿 *(q0,w) ∈ F and 𝛿 *(q1,w) ∉ F, or
vice versa, then the states q0 and q1 are said to be distinguishable by the string w.

Clearly, two states are either indistinguishable or distinguishable. Indistinguishability has the
properties of an equivalence relation: if q0 and q1 are indistinguishable and if q1 and q2 are also
indistinguishable, then so are q0 and q2, and all three states are indistinguishable.
Procedure: Mark
1. Remove all inaccessible states. This can be done by enumerating all simple paths of the graph
of the DFA starting at the initial state. Any state not part of some path is inaccessible.
2. Consider all pairs of states (p, q). If p ∈ F and q ∉ F or vice versa, mark the pair (p, q) as
distinguishable.
3. Repeat the following step until no previously unmarked pairs are marked. For all pairs (p, q)
and all a ∈ Σ, compute δ(p, a)= pa and δ (q, a) = qa. If the pair (pa,qa) is marked as distinguishable,
mark (p, q) as distinguishable.
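The three steps of procedure Mark can be sketched in Python as follows. The dict-based DFA encoding and the sample four-state machine are illustrative assumptions, not from the text, and step 1 (removing inaccessible states) is assumed to have been done already.

```python
from itertools import combinations

# Procedure Mark on a dict-encoded DFA: delta maps (state, symbol) -> state.
def mark(states, symbols, delta, finals):
    # step 2: a final/nonfinal pair is distinguishable (by the empty string)
    marked = {frozenset((p, q)) for p, q in combinations(states, 2)
              if (p in finals) != (q in finals)}
    # step 3: propagate until a pass marks no new pair
    changed = True
    while changed:
        changed = False
        for p, q in combinations(states, 2):
            pair = frozenset((p, q))
            if pair in marked:
                continue
            for a in symbols:
                pa, qa = delta[(p, a)], delta[(q, a)]
                if pa != qa and frozenset((pa, qa)) in marked:
                    marked.add(pair)
                    changed = True
                    break
    return marked

# Illustrative DFA (not from the figures) in which b and d behave identically:
states = ["a", "b", "c", "d"]
delta = {("a", "0"): "b", ("a", "1"): "a", ("b", "0"): "c", ("b", "1"): "a",
         ("c", "0"): "c", ("c", "1"): "c", ("d", "0"): "c", ("d", "1"): "a"}
m = mark(states, ["0", "1"], delta, {"c"})
print(frozenset(("b", "d")) not in m)  # True: b and d are indistinguishable
```

Unordered pairs are stored as frozensets so that (p, q) and (q, p) are the same pair.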
We claim that this procedure constitutes an algorithm for marking all distinguishable pairs.

Proposition 2.2: The procedure Mark, applied to any DFA M = (Q, Σ, δ, q0, F), terminates and determines
all pairs of distinguishable states.

Proof: Obviously, the procedure terminates, since there are only a finite number of pairs that can be
marked. It is also easy to see that the states of any pair so marked are distinguishable. The only claim
that requires elaboration is that the procedure finds all distinguishable pairs.

Note first that states qi and qj are distinguishable by a string of length n if and only if there are
transitions of the form (2.3) and (2.4) below, for some a ∈ Σ, with qk and ql distinguishable by a string of
length n – 1. We use this first to show that at the completion of the nth pass through the loop in step 3,
all states distinguishable by strings of length n or less have been marked. In step 2, we mark all pairs
distinguishable by λ, so we have a basis with n = 0 for an induction. We now assume that the claim is
true for all i = 0, 1, …, n–1. By this inductive assumption, at the beginning of the nth pass through the
loop, all states distinguishable by strings of length up to n–1 have been marked. Because of (2.3) and
(2.4) below, at the end of this pass, all states distinguishable by strings of length up to n will be marked.
By induction then, we can claim that, for any n, at the completion of the nth pass, all pairs
distinguishable by strings of length n or less have been marked.
𝛿(qi, a) = qk……………………………………….(2.3)
and
𝛿(qj, a) = ql………………………………………..(2.4)
To show that this procedure marks all distinguishable states, assume that the loop terminates after n
passes. This means that during the nth pass no new states were marked. From (2. 3) and (2.4), it then
follows that there cannot be any states distinguishable by a string of length n, but not distinguishable by
any shorter string. But if there are no states distinguishable only by strings of length n, there cannot be
any states distinguishable only by strings of length n+1, and so on. As a consequence, when the loop
terminates, all distinguishable pairs have been marked.

The procedure Mark can be implemented by partitioning the states into equivalence classes. Whenever
two states are found to be distinguishable, they are immediately put into separate equivalence classes.
Procedure: Reduce
Given a DFA M = (Q, Σ, δ, q0, F), we construct a reduced DFA M̂ = (Q̂, Σ, δ̂, q̂0, F̂) as follows.
1. Use procedure Mark to generate the equivalence classes, say {qi, qj, …, qk}, as described.
2. For each set {qi, qj, …, qk} of such indistinguishable states, create a state labeled ij…k for M̂.
3. For each transition rule of M of the form δ(qr, a) = qp, find the sets to which qr and qp belong.
If qr ∈ {qi, qj, …, qk} and qp ∈ {ql, qm, …, qn}, add to δ̂ the rule
δ̂(ij…k, a) = lm…n.
4. The initial state q̂0 is that state of M̂ whose label includes the 0.
5. F̂ is the set of all the states whose label contains some i such that qi ∈ F.
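The steps of procedure Reduce can be sketched in Python, assuming a `marked` set of distinguishable pairs as produced by procedure Mark. The dict-based encoding is an illustrative choice, and equivalence classes are kept as frozensets of states rather than the concatenated labels ij…k used in the text.

```python
# Procedure Reduce on a dict-encoded DFA (delta maps (state, symbol) -> state).
def reduce_dfa(states, symbols, delta, start, finals, marked):
    # step 2: each state joins the class of states indistinguishable from it
    cls = {p: frozenset(q for q in states
                        if p == q or frozenset((p, q)) not in marked)
           for p in states}
    new_states = set(cls.values())
    # step 3: transitions act on whole classes
    new_delta = {(cls[p], a): cls[delta[(p, a)]]
                 for p in states for a in symbols}
    # steps 4 and 5: start class and accepting classes
    new_finals = {c for c in new_states if c & finals}
    return new_states, new_delta, cls[start], new_finals

# Illustrative DFA in which b and d are indistinguishable; `marked` is the
# distinguishable-pair set procedure Mark would return for it (hard-coded).
states = ["a", "b", "c", "d"]
delta = {("a", "0"): "b", ("a", "1"): "a", ("b", "0"): "c", ("b", "1"): "a",
         ("c", "0"): "c", ("c", "1"): "c", ("d", "0"): "c", ("d", "1"): "a"}
marked = {frozenset(p) for p in
          [("a", "c"), ("b", "c"), ("c", "d"), ("a", "b"), ("a", "d")]}
Q_hat, d_hat, q0_hat, F_hat = reduce_dfa(states, ["0", "1"], delta,
                                         "a", {"c"}, marked)
print(len(Q_hat))  # 3 states remain: {a}, {b,d}, {c}
```

Note that the construction is well defined precisely because indistinguishable states agree, class by class, on where each input symbol sends them.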
Example 2.18
Consider the automaton in Figure 2.30; reduce the number of states to get the minimized DFA.
Figure 2.30: Deterministic finite automaton

Solution
In the second step of procedure Mark, we partition the state set into final and nonfinal states to get two
equivalence classes S0 = {q0, q1, q3} and Sf = {q2, q4}. In the next step, take any two states from one of
the classes; in this case let us take q0 and q1 from the nonfinal class. Computing
δ(q0, 0) = q1 and δ(q1, 0) = q2,
we see that the results q1 and q2 lie in different classes, so q0 and q1 are distinguishable and must be put
into different sets: S0 = {q0, q1, q3} is split into {q0} and {q1, q3}. Also, since δ(q2, 0) = q3 and δ(q4, 0) = q4,
the class {q2, q4} is split into {q2} and {q4}. Now we can check {q1, q3}: δ(q1, 0) = q2 and δ(q3, 0) = q2, and
for the second input, δ(q1, 1) = q4 and δ(q3, 1) = q4. For both inputs the two states go to the same state
(q2 and q4, respectively), which indicates that they are indistinguishable, so we can combine them. We
now have the following classes of states:
{q0}, {q1, q3}, {q2}, {q4}; the new DFA will look like Figure 2.31. Once the indistinguishability classes are
found, the construction of the minimal DFA is straightforward.
Figure 2.31: Minimized DFA



3. REGULAR EXPRESSIONS AND REGULAR LANGUAGES

After completing this chapter, students will be able to:


▪ Understand the primitive regular expressions.
▪ Express sets of strings with the help of primitive regular expressions and
mathematical operators.
▪ Have a good understanding of how regular expressions represent languages.
▪ Understand the relation between regular languages and regular expressions.
▪ Understand the relation between regular languages and regular grammars.
▪ Easily convert a regular expression to an NFA and then to a DFA.
▪ Easily convert a regular grammar to an NFA and then to a DFA.
▪ Easily express strings with the help of regular grammars and regular expressions.
▪ Use the pumping lemma to show the non-regularity of some languages.

In this chapter, we first define regular expressions as a means of representing certain subsets of strings
over Σ and prove that regular sets are precisely those accepted by finite automata or transition systems.
We use the pumping lemma for regular sets to prove that certain sets are not regular. We then discuss closure
properties of regular sets. Finally, we give the relation between regular sets and regular grammars.

3.1 REGULAR EXPRESSIONS

Activity 3.1
1. What are the primitive regular expressions?
2. What are the valid mathematical operators used in regular expressions?
3. What is a regular expression?

We now consider the class of languages obtained by applying union, concatenation, and Kleene star
finitely many times to the basis elements. These languages are known as regular languages, and the
corresponding finite representations are known as regular expressions.

Definition 3.1: Let Σ be an alphabet. A regular expression r over Σ denotes a language L(r) over Σ.
Say that r is a regular expression if r is one of the following:

1. a for some a in the alphabet Σ,


2. λ,
3. ∅,
4. If r1 and r2 are regular expressions, then (r1 + r2) is a regular expression, with
L((r1 + r2)) = L(r1) ∪ L(r2).
5. If r1 and r2 are regular expressions, then (r1r2) is a regular expression, with
L((r1r2)) = L(r1)L(r2).
6. If r is a regular expression, then (r*) is a regular expression, with L((r*)) = (L(r))*.

In items 1 and 2, the regular expressions a and λ represent the languages {a} and {λ}, respectively.
In item 3, the regular expression ∅ represents the empty language. In items 4, 5, and 6, the
expressions represent the languages obtained by taking the union or concatenation of the languages
L(r1) and L(r2), or the star of the language L(r), respectively. The symbols a, λ, and ∅ are called
primitive regular expressions.

Definition 3.2: If r is a regular expression, then the language represented by r is denoted by L(r).
Further, a language L is said to be regular if there is a regular expression r such that L = L(r).

Remark
1. A regular language over an alphabet Σ is the one that can be obtained from the empty set (∅),
{λ}, and {a}, for a ∈ Σ, by finitely many applications of union, concatenation and Kleene star.
2. The smallest class of languages over an alphabet Σ which contains Ø, {λ}, and {a} and is
closed with respect to union, concatenation, and Kleene star is the class of all regular
languages over Σ.

Examples 3.1:
1. As we observed earlier, the languages Ø, {λ}, {a}, and all finite sets are regular.
2. {a^n : n ≥ 0} is regular, as it can be represented by the expression a*.
3. Σ*, the set of all strings over an alphabet Σ, is regular. For instance, if Σ = {a1, a2, ..., an}, then
Σ* can be represented as (a1 + a2 + ……. + an)*.
4. The set of all strings over {a,b} which contain ab as a substring is regular.
Solution
For instance, the set can be written as
{x ∈ {a,b}* | ab is substring of x }
= { yabz | y,z ∈ {a,b}* }
= {a,b}*{ab}{a,b}*

Hence, the corresponding regular expression is (a + b)*ab(a + b)*.

5. Express with a regular expression the language L over {0,1} of strings that contain 01 or 10 as a substring.
Solution
L = { x | 01 is substring of x} ∪ { x | 10 is substring of x}
= {y01z | y,z ∈ Σ*} ∪ {u10v | u,v ∈ Σ*}
= Σ*{01} Σ* ∪ Σ*{10} Σ*
= {0,1}*{01}{0,1}* ∪ {0,1}*{10}{0,1}*
Since Σ*, {01}, and {10} are regular, and union, concatenation, and Kleene star applied to
regular languages yield regular languages, L is a regular language. The regular expression representing L is
given below.
(0+1)*01(0+1)* + (0+1)*10(0+1)*
6. Express with a regular expression the set of all strings over {a,b} which do not contain ab as a
substring.
Solution
By analyzing the language, one can observe that it is precisely
{b^n a^m : n, m ≥ 0}
Thus, the regular expression of the language is b*a*.
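The expressions above can be checked against Python's re module, whose syntax writes union as | instead of +. This is an illustrative aside: re supports far more than the regular operations, but only union, concatenation, and star are used here, and re.fullmatch tests whole-string membership.

```python
import re

# The textbook expressions in Python re syntax (union "+" becomes "|"):
contains_ab = re.compile(r"(a|b)*ab(a|b)*")  # (a + b)*ab(a + b)*
no_ab = re.compile(r"b*a*")                  # b*a*: strings with no "ab"

print(bool(contains_ab.fullmatch("bbab")))  # True: "ab" occurs as a substring
print(bool(no_ab.fullmatch("aab")))         # False: "aab" contains "ab"
```

Using fullmatch rather than match matters: match would accept any string with a matching prefix, which is not language membership.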

Definition 3.3: Two regular expressions r1 and r2 are said to be equivalent if they represent the same
language; in which case, we write r1 ≡ r2.

IDENTITIES FOR REGULAR EXPRESSIONS


Two regular expressions P and Q are equivalent (we write P ≡Q) if P and Q represent the same set of
strings. We now give the identities for regular expressions; these are useful for simplifying regular
expressions.

I. ∅+R = R
II. ∅R = R∅ = ∅
III. 𝜆R = R 𝜆 = R
IV. 𝜆* = 𝜆 and ∅* = 𝜆
V. R+R=R
VI. R*R* = R*
VII. RR* = R*R
VIII. (R*)* = R*
IX. 𝜆 + RR* = R* = 𝜆 + R*R
X. (PQ)*P = P(QP)*
XI. (P + Q)* = (P*Q*)* = (P* + Q*)*
XII. (P + Q)R = PR + QR and R(P + Q) = RP + RQ
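Identities like XI can be sanity-checked on finite approximations: compute both sides restricted to strings of length at most N and compare. Below is a minimal sketch with the sample languages P = {a} and Q = {b}; the helper functions and the bound N are illustrative choices, not from the text.

```python
# Bounded-language check of identity XI: (P + Q)* = (P*Q*)*.
def concat(A, B, N):
    # concatenation of languages, truncated to strings of length <= N
    return {x + y for x in A for y in B if len(x + y) <= N}

def star(A, N):
    # Kleene star, truncated: iterate until no new string appears
    S = {""}
    while True:
        T = S | concat(S, A, N)
        if T == S:
            return S
        S = T

N = 6
P, Q = {"a"}, {"b"}
lhs = star(P | Q, N)                              # (P + Q)*
rhs = star(concat(star(P, N), star(Q, N), N), N)  # (P*Q*)*
print(lhs == rhs)  # True: both sides agree on all strings of length <= 6
```

Such a finite check is not a proof, of course, but it catches non-identities quickly; trying it on a false equation like R*R* = RR* with R = {a, b} exposes a counterexample immediately.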

The following theorem is very useful in simplifying regular expressions (i.e., replacing a given
regular expression P by a simpler regular expression equivalent to P).

Theorem 3.1: (Arden's theorem) Let P and Q be two regular expressions over Σ. If P does not contain 𝜆,
then the following equation in R, namely
R = Q + RP ………………………………….….. (3.1)

has a unique solution (i.e., one and only one solution), given by R = QP*.

Proof: Q + (QP*) P = Q (𝜆 + P*P) = QP* by identity IX


Hence (3.1) is satisfied when R =QP*. This means R =QP* is a solution of (3.1). To prove uniqueness,
consider (3.1). Here, replacing R by Q + RP on the right hand side, we get the equation
Q + RP = Q + (Q + RP) P

Compiled by: Destalem H. 42


= Q + QP + RPP
= Q + QP + RP^2
= Q + QP + QP^2 + …. + QP^i + RP^(i+1)
= Q(𝜆 + P + P^2 + …… + P^i) + RP^(i+1)
From (3.1),

R = Q(𝜆 + P + P^2 + ... + P^i) + RP^(i+1) for i ≥ 0 ……………………(3.2)

We now show that any solution of (3.1) is equivalent to QP*. Suppose R satisfies (3.1); then it satisfies
(3.2). Let w be a string of length i in the set R. Then w belongs to the set Q(𝜆 + P + P^2 + ... + P^i) + RP^(i+1).
As P does not contain 𝜆, RP^(i+1) has no string of length less than i+1, and so w is not in the set RP^(i+1). This
means that w belongs to the set Q(𝜆 + P + P^2 + ... + P^i), and hence to QP*.

Conversely, consider a string w in the set QP*. Then w is in the set QP^k for some k ≥ 0, and hence in Q(𝜆 + P + P^2
+ . . . + P^k). So w is on the right-hand side of (3.2), and hence w is in R. Thus R and QP*
represent the same set. This proves the uniqueness of the solution of (3.1).
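A quick finite check of Arden's theorem is possible with the sample choice P = {a}, Q = {b}, so that QP* = {b, ba, baa, …}: truncate both sides of R = Q + RP to strings of length at most N and compare. The encoding below is illustrative, not from the text.

```python
# Finite sanity check of Arden's theorem for P = {"a"}, Q = {"b"}.
# The claimed unique solution R = QP* is {"b", "ba", "baa", ...}.
N = 8
R = {"b" + "a" * k for k in range(N)}    # QP* truncated: lengths 1..N
RP = {w + "a" for w in R if len(w) < N}  # RP truncated to length <= N
print(R == {"b"} | RP)                   # True: R = Q + RP holds up to length N
```

The hypothesis that P does not contain λ is what makes the truncation argument (and the uniqueness proof above) work: each application of P strictly increases string length.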

3.1.1 EQUIVALENCE WITH FINITE AUTOMATA


Regular expressions and finite automata are equivalent in their descriptive power. This fact is surprising
because finite automata and regular expressions superficially appear to be rather different. However,
any regular expression can be converted into a finite automaton that recognizes the language it describes,
and vice versa. Recall that a regular language is one that is recognized by some finite automaton.

Theorem 3.2: A language is regular if and only if some regular expression describes it. This theorem
has two directions. We state and prove each direction as a separate lemma.

Lemma 3.3: If a language is described by a regular expression, then it is regular.

Proof Idea: Say that we have a regular expression r describing some language L. We show how to
convert r into an NFA recognizing L. By definition, if an NFA recognizes L, then L is regular.

Proof: Let’s convert r into an NFA N. We consider the six cases in the formal definition of regular
expressions.

1. r = a, for some a ∈ Σ. Then L(r) = {a}, and the following NFA recognizes L(r).

Figure 3.1: NFA for r = a

Note that this machine fits the definition of an NFA but not that of a DFA because it has some
states with no exiting arrow for each possible input symbol. Of course, we could have presented
an equivalent DFA here; but an NFA is all we need for now, and it is easier to describe.

2. r =𝝀. Then L(r) = { 𝜆 }, and the following NFA recognizes L(r).




Figure 3.2: NFA for r = 𝝀

3. r = ∅. Then L(r) = ∅, and the following NFA recognizes L(r).

Figure 3.3: NFA for r = ∅


4. r = r1 + r2.

Figure 3.4: Automaton for L(r1 + r2)
5. r = r1r2.

Figure 3.5: Automaton for L(r1r2)


6. r = r1*.

Figure 3.6: Automaton for L(r1*)
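The six cases above can be sketched as a Thompson-style construction in Python. The tuple-based NFA representation, with "" standing for λ, is an illustrative assumption, not from the text.

```python
import itertools

# An NFA is a tuple (start, accepts, moves), where moves maps
# (state, symbol) -> set of states and the symbol "" stands for λ.
_fresh = itertools.count()

def symbol(a):                               # case 1: r = a
    s, f = next(_fresh), next(_fresh)
    return s, {f}, {(s, a): {f}}

def union(n1, n2):                           # case 4: r1 + r2
    s = next(_fresh)
    s1, f1, m1 = n1
    s2, f2, m2 = n2
    return s, f1 | f2, {**m1, **m2, (s, ""): {s1, s2}}

def concat(n1, n2):                          # case 5: r1 r2
    s1, f1, m1 = n1
    s2, f2, m2 = n2
    moves = {**m1, **m2}
    for f in f1:                             # λ-arrow from each accept of n1
        moves.setdefault((f, ""), set()).add(s2)
    return s1, f2, moves

def star(n):                                 # case 6: r1*
    s = next(_fresh)
    s1, f1, m1 = n
    moves = {**m1, (s, ""): {s1}}
    for f in f1:                             # loop back so r1 can repeat
        moves.setdefault((f, ""), set()).add(s1)
    return s, f1 | {s}, moves

def accepts(nfa, w):
    start, finals, moves = nfa
    def closure(S):                          # states reachable by λ-arrows
        stack, seen = list(S), set(S)
        while stack:
            for p in moves.get((stack.pop(), ""), ()):
                if p not in seen:
                    seen.add(p)
                    stack.append(p)
        return seen
    cur = closure({start})
    for ch in w:
        cur = closure({p for q in cur for p in moves.get((q, ch), ())})
    return bool(cur & finals)

# (ab + a)* from Example 3.2:
n = star(union(concat(symbol("a"), symbol("b")), symbol("a")))
print(accepts(n, "aab"), accepts(n, "ba"))  # True False
```

The accepts function is the usual λ-closure simulation of an NFA; it is included so the construction can actually be exercised on strings.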

Example 3.2:
We convert the regular expression (ab + a)∗ to an NFA in a sequence of stages. We build up
from the smallest subexpressions to larger subexpressions until we have an NFA for the original
expression, as shown in the following diagram. Note that this procedure generally doesn’t give the NFA
with the fewest states. In this example, the procedure gives an NFA with eight states, but the smallest
equivalent NFA has only two states. Can you find it?
Figure 3.7: Building an NFA from the regular expression (ab + a)*

Example 3.3
In Figure 3.8, we convert the regular expression (a + b)∗aba to an NFA. A few of the minor
steps are not shown.
Figure 3.8: Building an NFA from the regular expression (a + b)*aba

Example 3.4:
Find an NFA that accepts L(r), where r=(a + bb)* (ba* + λ)
Solution
Automata for (a + bb) and (ba* + λ), constructed directly from first principles, are given in Figure 3.9.
(a) M1 accepts L(a + bb).
(b) M2 accepts L (ba* + λ).


Figure 3.9: Automaton that accepts L((a + bb)* (ba* + λ))

Now let’s turn to the other direction of the proof of Theorem 3.2.
Lemma 3.4: If a language is regular, then it is described by a regular expression.
Proof Idea: We need to show that if a language L is regular, a regular expression describes it. Because
L is regular, it is accepted by a DFA. We describe a procedure for converting DFAs into equivalent
regular expressions. We break this procedure into two parts, using a new type of finite automaton called
a generalized nondeterministic finite automaton, GNFA. First, we show how to convert DFAs into
GNFAs, and then GNFAs into regular expressions. Generalized nondeterministic finite automata are
simply nondeterministic finite automata wherein the transition arrows may have any regular expressions
as labels, instead of only members of the alphabet or 𝜆. The GNFA reads blocks of symbols from the
input, not necessarily just one symbol at a time as in an ordinary NFA. The GNFA moves along a
transition arrow connecting two states by reading a block of symbols from the input, which themselves
constitute a string described by the regular expression on that arrow. A GNFA is nondeterministic and
so may have several different ways to process the same input string. It accepts its input if its processing
can cause the GNFA to be in an accept state at the end of the input. The following figure presents an
example of a GNFA.
Figure 3.10: A generalized nondeterministic finite automaton


For convenience, we require that GNFAs always have a special form that meets the following conditions.

• The start state has transition arrows going to every other state but no arrows coming in from any
other state.
• There is only a single accept state, and it has arrows coming in from every other state but no
arrows going to any other state. Furthermore, the accept state is not the same as the start state.
• Except for the start and accept states, one arrow goes from every state to every other state and
also from each state to itself.

We can easily convert a DFA into a GNFA in the special form. We simply add a new start state with an
𝜆 arrow to the old start state and a new accept state with 𝜆 arrows from the old accept states. If any
arrows have multiple labels (or if there are multiple arrows going between the same two states in the
same direction), we replace each with a single arrow whose label is the union of the previous labels.

Finally, we add arrows labeled ∅ between states that had no arrows. This last step won’t change the
language recognized because a transition labeled with ∅ can never be used. From here on we assume
that all GNFAs are in the special form.

Now we show how to convert a GNFA into a regular expression. Say that the GNFA has k states. Then,
because a GNFA must have a start and an accept state and they must be different from each other, we
know that k ≥ 2. If k > 2, we construct an equivalent GNFA with k−1 states. This step can be repeated
on the new GNFA until it is reduced to two states. If k = 2, the GNFA has a single arrow that goes from
the start state to the accept state. The label of this arrow is the equivalent regular expression. For
example, the stages in converting a DFA with three states to an equivalent regular expression are shown
in the following figure.



Figure 3.11: Typical stages in converting a DFA to a regular expression

The crucial step is constructing an equivalent GNFA with one fewer state when k > 2. We do so by
selecting a state, ripping it out of the machine, and repairing the remainder so that the same language is
still recognized. Any state will do, provided that it is not the start or accept state. We are guaranteed that
such a state will exist because k > 2. Let’s call the removed state qrip.

After removing qrip we repair the machine by altering the regular expressions that label each of the
remaining arrows. The new labels compensate for the absence of qrip by adding back the lost
computations. The new label going from a state qi to a state qj is a regular expression that describes all
strings that would take the machine from qi to qj either directly or via qrip. We illustrate this approach in
Figure 3.12.

Figure 3.12: Constructing an equivalent GNFA with one fewer state (before and after removing qrip)

In the old machine, if

1. qi goes to qrip with an arrow labeled r1,


2. qrip goes to itself with an arrow labeled r2,
3. qrip goes to qj with an arrow labeled r3, and
4. qi goes to qj with an arrow labeled r4, then in the new machine, the arrow from qi to qj gets the
label
(r1)(r2)∗(r3) + (r4).
We make this change for each arrow going from any state qi to any state qj, including the case where
qi = qj. The new machine recognizes the original language.
Proof: Let’s now carry out this idea formally. First, to facilitate the proof, we formally define the new
type of automaton introduced. A GNFA is similar to a nondeterministic finite automaton except for the
transition function, which has the form

δ: (Q − {qaccept}) × (Q − {qstart}) → R

The symbol R denotes the collection of all regular expressions over the alphabet Σ, and qstart and qaccept are the
start and accept states. If δ(qi, qj) = r, the arrow from state qi to state qj has the regular expression r as
its label. The domain of the transition function is (Q − {qaccept}) × (Q − {qstart}) because an arrow
connects every state to every other state, except that no arrows are coming from qaccept or going to qstart.

Definition 3.4: A generalized nondeterministic finite automaton is a 5-tuple, (Q, Σ, δ, qstart, qaccept), where

1. Q is the finite set of states,


2. Σ is the input alphabet,
3. δ: (Q − {qaccept}) × (Q − {qstart}) → R is the transition function,
4. qstart is the start state, and
5. qaccept is the accept state.

A GNFA accepts a string w in Σ* if w = w1 w2 …wk, where each wi is in Σ* and a sequence of states q0,
q1, . . . , qk exists such that
1. q0 = qstart is the start state,
2. qk = qaccept is the accept state, and
3. for each i, we have wi ∈ L(ri), where ri = δ(qi−1 , qi); in other words, ri is the expression on the
arrow from qi−1 to qi.

Returning to the proof of Lemma 3.4, we let M be the DFA for language L. Then we convert M to a
GNFA G by adding a new start state and a new accept state and additional transition arrows as necessary.
We use the procedure CONVERT(G), which takes a GNFA and returns an equivalent regular
expression. This procedure uses recursion, which means that it calls itself. An infinite loop is avoided
because the procedure calls itself only to process a GNFA that has one fewer state. The case where the
GNFA has two states is handled without recursion.

CONVERT(G):
1. Let k be the number of states of G.
2. If k = 2, then G must consist of a start state, an accept state, and a single arrow connecting
them and labeled with a regular expression r.
Return the expression r.
3. If k > 2, we select any state qrip ∈ Q different from qstart and qaccept and let
G’ be the GNFA (Q’, Σ, δ’, qstart, qaccept), where Q’ = Q − {qrip},
and for any qi ∈ Q’ − {qaccept} and any qj ∈ Q’ − {qstart}, let
δ’(qi, qj) = (r1)(r2)∗(r3) + (r4),
for r1 = δ(qi, qrip), r2 = δ(qrip, qrip), r3 = δ(qrip, qj), and r4 = δ(qi, qj).
4. Compute CONVERT (G’) and return this value.
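The state-ripping step of CONVERT can be sketched on arrow labels kept as plain strings. This is an illustrative encoding: absent dictionary entries play the role of ∅-arrows, and the resulting expressions are not simplified afterwards.

```python
# One rip step: the new (qi, qj) label is (r1)(r2)*(r3) + (r4), with
# ∅ terms dropped rather than written out.
def rip(labels, states, q_rip, q_start, q_accept):
    remaining = [q for q in states if q != q_rip]
    r2 = labels.get((q_rip, q_rip))
    loop = "({})*".format(r2) if r2 else ""     # (r2)*; empty if no self-loop
    new = {}
    for qi in remaining:
        if qi == q_accept:
            continue                            # no arrows leave the accept state
        for qj in remaining:
            if qj == q_start:
                continue                        # no arrows enter the start state
            r1, r3 = labels.get((qi, q_rip)), labels.get((q_rip, qj))
            r4 = labels.get((qi, qj))
            via = "({}){}({})".format(r1, loop, r3) if r1 and r3 else None
            parts = [x for x in (via, r4) if x]
            if parts:
                new[(qi, qj)] = " + ".join(parts)
    return new, remaining

# A GNFA reconstructed from the description in Example 3.5, with states
# S, 1, 2, F (an assumption about the figure, labeled here as such):
g = {("S", "1"): "λ", ("1", "1"): "a", ("1", "2"): "b",
     ("2", "2"): "a + b", ("2", "F"): "λ"}
g2, states2 = rip(g, ["S", "1", "2", "F"], "2", "S", "F")
print(g2[("1", "F")])  # (b)(a + b)*(λ)
```

Ripping the remaining interior state then leaves a single S-to-F arrow whose label, once simplified, is the regular expression equivalent to the original DFA.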

Next we prove that CONVERT returns a correct value.



Claim 3.5: For any GNFA G, CONVERT (G) is equivalent to G.

We prove this claim by induction on k, the number of states of the GNFA.

Basis: Prove the claim true for k = 2 states. If G has only two states, it can have only a single arrow,
which goes from the start state to the accept state. The regular expression label on this arrow describes
all the strings that allow G to get to the accept state. Hence this expression is equivalent to G.

Induction step: Assume that the claim is true for k−1 states and use this assumption to prove that the
claim is true for k states. First we show that G and G’ recognize the same language. Suppose that G
accepts an input w. Then in an accepting branch of the computation, G enters a sequence of states:
qstart, q1, q2, q3, . . . , qaccept.
If none of them is the removed state qrip, clearly G’ also accepts w. The reason is that each of the new
regular expressions labeling the arrows of G’ contains the old regular expression as part of a union. If
qrip does appear, removing each run of consecutive qrip states forms an accepting computation for G’.
The states qi and qj bracketing a run have a new regular expression on the arrow between them that
describes all strings taking qi to qj via qrip on G. So G’ accepts w.
Conversely, suppose that G’ accepts an input w. As each arrow between any two states qi and qj in G’
describes the collection of strings taking qi to qj in G, either directly or via qrip, G must also accept w.
Thus G and G’ are equivalent.

The induction hypothesis states that when the algorithm calls itself recursively on input G’, the result is
a regular expression that is equivalent to G’ because G’ has k−1 states. Hence this regular expression
also is equivalent to G, and the algorithm is proved correct. This concludes the proof of Claim 3.5,
Lemma 3.4, and Theorem 3.2.

Example 3.5:
In this example, we use the preceding algorithm to convert a DFA into a regular expression. We
begin with the two-state DFA in Figure 3.13(a). In Figure 3.13(b), we make a four-state GNFA by
adding a new start state and a new accept state, called S and F instead of qstart and qaccept so that we can
draw them conveniently. To avoid cluttering up the figure, we do not draw the arrows labeled ∅, even
though they are present. Note that we replace the label a, b on the self-loop at state 2 on the DFA with
the label a + b at the corresponding point on the GNFA. We do so because the DFA’s label represents
two transitions, one for a and the other for b, whereas the GNFA may have only a single transition going
from 2 to itself.

In Figure 3.13(c), we remove state 2 and update the remaining arrow labels. In this case, the only label
that changes is the one from 1 to F. In part (b) it was ∅, but in part (c) it is b(a + b)∗. We obtain this
result by following step 3 of the CONVERT procedure. State qi is state 1, state qj is the accept state F, and qrip is 2, so r1
= b, r2 = a ∪ b, r3 = 𝜆, and r4 = ∅. Therefore, the new label on the arrow from 1 to F is (b)(a ∪ b)∗(𝜆) ∪
Compiled by: Destalem H. 50
∅. We simplify this regular expression to b(a+b)∗. In Figure 3.13(d), we remove state 1 from part (c)
and follow the same procedure. Because only the start and accept states remain, the label on the arrow
joining them is the regular expression that is equivalent to the original DFA.

[Figure 3.13 (a)–(d): (a) the two-state DFA; (b) the four-state GNFA with new states S and F; (c) the
GNFA after removing state 2, where the arrow from 1 to F is labeled b(a + b)*; (d) the final two-state
GNFA.]
Figure 3.13: Converting a two-state DFA to an equivalent regular expression
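The state-elimination step can be sketched in code. The GNFA is encoded here as a dictionary mapping state pairs to regular-expression strings (an assumed representation, not from the text), with None for ∅ and "" for 𝜆; the DFA of panel (a) is assumed to have an a-loop on state 1, a b-arrow from 1 to the accepting state 2, and an (a + b)-loop on 2.

```python
EMPTY = None  # stands for the regular expression ∅

def union(r, s):
    if r is EMPTY: return s
    if s is EMPTY: return r
    return r + "+" + s          # union written as + , as in the text

def concat(r, s):
    if r is EMPTY or s is EMPTY: return EMPTY
    return r + s                # "" (𝜆) is the identity for concatenation

def star(r):
    if r is EMPTY or r == "": return ""   # ∅* = 𝜆* = 𝜆
    return ("(" + r + ")" if len(r) > 1 else r) + "*"

def rip(edges, states, qrip):
    """Remove qrip, relabeling every remaining pair (qi, qj) as in
    step 3 of CONVERT: new label = (r1)(r2)*(r3) ∪ r4."""
    rest = [q for q in states if q != qrip]
    new = {}
    for qi in rest:
        for qj in rest:
            r1 = edges.get((qi, qrip), EMPTY)
            r2 = edges.get((qrip, qrip), EMPTY)
            r3 = edges.get((qrip, qj), EMPTY)
            r4 = edges.get((qi, qj), EMPTY)
            new[(qi, qj)] = union(concat(concat(r1, star(r2)), r3), r4)
    return new, rest

# The GNFA of Figure 3.13(b); omitted arrows are ∅.
edges = {("S", "1"): "", ("1", "1"): "a", ("1", "2"): "b",
         ("2", "2"): "a+b", ("2", "F"): ""}
states = ["S", "1", "2", "F"]
edges, states = rip(edges, states, "2")   # Figure 3.13(c)
edges, states = rip(edges, states, "1")   # Figure 3.13(d)
print(edges[("S", "F")])                  # a*b(a+b)*
```

Under these assumptions the sketch prints a*b(a+b)*, an expression equivalent to the assumed DFA.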

3.2 REGULAR GRAMMARS

 Activity 3.2
1. What is a regular grammar?
2. Define right-linear grammar and left-linear grammar.
3. Show the equivalence of finite automata and regular grammars with an example.

A third way of describing regular languages is by means of certain grammars. Grammars are often an
alternative way of specifying languages. Whenever we define a language family through an automaton
or in some other way, we are interested in knowing what kind of grammar we can associate with the
family. First, we look at grammars that generate regular languages.



3.2.1 RIGHT- LINEAR AND LEFT-LINEAR GRAMMARS
Definition 3.5: A grammar G = (V, T, S, P) is said to be right-linear if all productions are of the form
A → xB,
A → x,
where A, B ∈ V, and x ∈ T*.

Definition 3.6: A grammar is said to be left-linear if all productions are of the form
A → Bx,
or
A → x,
where A, B ∈ V, and x ∈ T*.

A regular grammar is one that is either right-linear or left-linear.


Note that in a regular grammar, at most one variable appears on the right side of any production.
Furthermore, that variable must consistently be either the rightmost or leftmost symbol of the right
side of any production.

Example 3.6
1. The grammar G1 = ({S}, {a,b},S,P1), with P1 given as
S →abS|a
is right-linear.
2. The grammar G2 = ({S, S1, S2}, {a, b}, S, P2), with productions
S →S1ab,
S1→ S1ab|S2,
S2 →a,
is left-linear.

Both G1 and G2 are regular grammars.

The sequence
S ⇒ abS ⇒ ababS ⇒ababa
is a derivation with G1. From this single instance it is easy to conjecture that L(G1) is the language
denoted by the regular expression r = (ab)*a. In a similar way, we can see that L(G2) is the regular
language L(aab(ab)*).

Example 3.7
The grammar G = ({S, A, B}, {a, b}, S, P) with productions
S → A,
A → aB|λ,
B → Ab
is not regular.

Although every production is either in right-linear or left-linear form, the grammar itself is neither right-
linear nor left-linear, and therefore is not regular. The grammar is an example of a linear grammar.



A linear grammar is a grammar in which at most one variable can occur on the right side of any
production, without restriction on the position of this variable. Clearly, a regular grammar is always
linear, but not all linear grammars are regular.

Our next goal will be to show that regular grammars are associated with regular languages and that for
every regular language there is a regular grammar. Thus, regular grammars are another way of talking
about regular languages.

3.2.2 EQUIVALENCE OF FINITE AUTOMATA AND REGULAR


GRAMMARS
A finite automaton M is said to be equivalent to a regular grammar G, if the language accepted by M is
precisely generated by the grammar G, i.e. L(M) = L(G). Now, we prove that finite automata and regular
grammars are equivalent. In order to establish this equivalence, we first prove that given a DFA we can
construct an equivalent regular grammar. Then, for the converse, given a regular grammar, we construct
an equivalent generalized finite automaton (GFA), a notion which we introduce here as equivalent to the
DFA.

Theorem 3.6: If M is a DFA, then L(M) can be generated by a regular grammar G.


Proof: Suppose M = (Q, Σ, δ, q0, F) is a DFA. Construct a grammar G = (V, Σ, S, P) by setting V = Q
and S = q0 and
P = {A → aB : δ(A, a) = B} ∪ {A → a : δ(A, a) ∈ F}.
In addition, if the initial state q0 ∈ F, then we include S → 𝜆 in P. Clearly G is a regular grammar. We
claim that L(G) = L(M).
From the construction of G, it is clear that 𝜆 ∈ L(M) if and only if 𝜆 ∈ L(G). Now, for n ≥ 1, let
x = a1a2 . . . an ∈ L(M) be arbitrary. That is, δ̂(q0, a1a2 . . . an) ∈ F. This implies there exists a sequence
of states
q1, q2, . . . , qn
such that
δ(qi−1, ai) = qi, for 1 ≤ i ≤ n, and qn ∈ F.
As per construction of G, we have
qi−1 → aiqi ∈ P, for 1 ≤ i ≤ n − 1, and qn−1 → an ∈ P.
Using these production rules we can derive x in G as follows:
S = q0 ⇒ a1q1
⇒ a1a2q2
.
.

. ⇒a1a2 · · · an−1qn−1
⇒ a1a2 · · · an = x.
Thus x ∈ L(G).

Conversely, suppose y = b1 · · · bm ∈ L(G), for m ≥ 1, i.e. S ⇒∗ y in G. Since every production rule of G
is of the form A → aB or A → a, the derivation S ⇒∗ y has exactly m steps; the first m − 1 steps use
production rules of the type A → aB and the last (mth) step uses a rule of the form A → a. Thus,
in every step of the derivation one symbol bi of y is produced, in sequence. Precisely, the derivation can
be written as
S ⇒ b1B1
⇒ b1b2B2
⇒ · · ·
⇒ b1b2 · · · bm−1Bm−1
⇒ b1b2 · · · bm = y.
From the construction of G, it can be observed that
δ(Bi−1, bi) = Bi, for 1 ≤ i ≤ m − 1, where B0 = S,
in M. Moreover, δ(Bm−1, bm) ∈ F. Thus,
δ̂(q0, y) = δ̂(S, b1 · · · bm) = δ̂(δ(S, b1), b2 · · · bm)
= δ̂(B1, b2 · · · bm)
= · · · = δ̂(Bm−1, bm)
= δ(Bm−1, bm) ∈ F
so that y ∈ L(M). Hence L(M) = L(G).
Example 3.11:
Consider the DFA given below in Figure 3.18. Set V = {q0, q1}, Σ = {a, b}, S = q0, and P has the
following production rules:
q0 → aq1 | bq0 | a
q1 → aq0 | bq1 | b
Now G = (V, Σ, S, P) is a regular grammar that is equivalent to the given DFA.
Figure 3.18: Deterministic Finite Automata
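The construction of Theorem 3.6 is mechanical enough to sketch in code. Below, δ is encoded as a dictionary from (state, symbol) pairs to states (an assumed representation), using the DFA of Example 3.11:

```python
def dfa_to_grammar(delta, start, finals):
    """Theorem 3.6 sketch: one variable per state; A -> aB for each
    transition delta[(A, a)] = B, plus A -> a whenever that transition
    enters a final state, and S -> λ if the start state is final."""
    P = {}
    for (A, a), B in delta.items():
        P.setdefault(A, []).append(a + B)
        if B in finals:
            P.setdefault(A, []).append(a)
    if start in finals:
        P.setdefault(start, []).append("λ")
    return P

# DFA of Example 3.11: q0 --a--> q1, q0 --b--> q0,
#                      q1 --a--> q0, q1 --b--> q1, with F = {q1}.
delta = {("q0", "a"): "q1", ("q0", "b"): "q0",
         ("q1", "a"): "q0", ("q1", "b"): "q1"}
P = dfa_to_grammar(delta, "q0", {"q1"})
print(P["q0"])   # ['aq1', 'a', 'bq0']
print(P["q1"])   # ['aq0', 'bq1', 'b']
```

The output matches the productions q0 → aq1 | bq0 | a and q1 → aq0 | bq1 | b listed above.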

Example 2.12:
Consider the DFA given in Figure 3.19 below. The regular grammar G = (V, Σ, S, P), where V =
{q1, q2, q3}, Σ = {a, b}, S = q1 and P has the following rules
q1 → aq2 | bq1 | a | b | 𝜆
q2 → aq3 | bq1 | b
q3 → aq3 | bq3



is equivalent to the given DFA. Here, note that q3 is a trap state. So, the production rules in which q3 is
involved can safely be removed to get a simpler but equivalent regular grammar with the following
production rules.

q1 → aq2 | bq1 | a | b | 𝜆
q2 → bq1 | b
Figure 3.19: Deterministic Finite Automata

Construction of a Finite Automata M accepting L(G) for a given regular grammar G


Let G = ({A0, A1, .... , An},Σ , A0, P). We construct a transition system M whose
(i) States correspond to variables.
(ii) Initial state corresponds to A0. And
(iii) Transitions in M correspond to productions in P. As the last production applied in any
derivation is of the form Ai ⟶ a, the corresponding transition terminates at a new state, and
this is the unique final state.

We define M as ({q0, . . . , qn, qf}, Σ, 𝛿, q0, {qf}), where 𝛿 is defined as follows:


i. Each production Ai⟶ aAj induces a transition from qi to qj with label a,
ii. Each production Ak⟶ a induces a transition from qk to qf with label a.

From the construction it is easy to see that A0 ⇒ a1A1 ⇒ a1a2A2 ⇒ · · · ⇒ a1a2 · · · an−1An−1 ⇒
a1a2 · · · an is a derivation of a1a2 · · · an if and only if there is a path in M starting from q0 and
terminating in qf with path value a1a2 · · · an. Therefore, L(G) = L(M).

Example 2.13:
Let G = ({A0, A1}, {a, b}, A0, P), where P consists of A0 ⟶ aA1, A1 ⟶ bA1, A1 ⟶ a, and
A1 ⟶ bA0. Construct a transition system M accepting L(G).
Solution
Let M = ({q0, q1, qf}, {a, b}, 𝛿, q0, {qf}), where q0 and q1 correspond to A0 and A1, respectively, and qf is
the new (final) state introduced. A0 ⟶ aA1 induces a transition from q0 to q1 with label a. Similarly,
A1 ⟶ bA1 and A1 ⟶ bA0 induce transitions from q1 to q1 with label b and from q1 to q0 with label b,
respectively. A1 ⟶ a induces a transition from q1 to qf with label a. M is given in Figure 3.20 below.



Figure 3.20: Transition system
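This grammar-to-automaton construction can be sketched in Python using the productions of Example 2.13, taking the rules to be A0 → aA1, A1 → bA1, A1 → a, A1 → bA0 as in the solution. Each production is encoded as a (variable, (terminal, next-variable-or-None)) pair, with variables renamed to states; this encoding is an assumption for the sketch.

```python
def grammar_to_nfa(P, start, final="qf"):
    """Each Ai -> aAj gives a transition qi --a--> qj; each Ai -> a
    gives qi --a--> qf, the unique final state."""
    delta = {}
    for A, (a, B) in P:
        delta.setdefault((A, a), set()).add(B if B is not None else final)

    def accepts(w):
        cur = {start}
        for c in w:
            cur = set().union(*(delta.get((q, c), set()) for q in cur))
        return final in cur
    return accepts

# Productions of Example 2.13 with A0, A1 renamed to q0, q1;
# None marks a terminal-only right-hand side (Ai -> a).
P = [("q0", ("a", "q1")), ("q1", ("b", "q1")),
     ("q1", ("a", None)), ("q1", ("b", "q0"))]
accepts = grammar_to_nfa(P, "q0")
print([w for w in ["aa", "ab", "aba", "abba", "abaa"] if accepts(w)])
# ['aa', 'aba', 'abba', 'abaa']
```

For instance, "abaa" is accepted via q0 → q1 → q0 → q1 → qf, matching the derivation A0 ⇒ aA1 ⇒ abA0 ⇒ abaA1 ⇒ abaa.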

Example 3.13
Construct a finite automaton that accepts the language generated by the grammar G
V0 →aV1,
V1 →abV0|b,
where V0 is the start variable. We start the transition graph with vertices V0, V1, and Vf. The first
production rule creates an edge labeled a between V0 and V1. For the second rule, we need to introduce
an additional vertex so that there is a path labeled ab between V1 and V0. Finally, we need to add an
edge labeled b between V1 and Vf, giving the automaton shown in Figure 3.17. The language generated
by the grammar and accepted by the automaton is the regular language
L((aab)*ab).
Figure 3.17: NFA for G

3.3 PROPERTIES OF REGULAR LANGUAGES

 Activity 3.3
1. Is the class of regular languages closed under union, intersection, concatenation, and
complement? Prove your answer with examples.
2. What is the purpose of the pumping lemma for regular languages?

In the previous sections we introduced various tools (regular expressions, regular grammars, and finite
automata) for understanding regular languages. We have also noted that the class of regular languages is
closed under certain operations such as union, concatenation, and Kleene closure. Now, with this
information, can we determine whether a given language is regular or not? If a given language is regular,
then to prove this we can exhibit a regular expression, a regular grammar, or a finite automaton for it. Is
there any other way to prove that a language is regular? The answer is "Yes". If a given language can be
obtained from some known regular languages by applying operations which preserve regularity, then one can ascertain



that the given language is regular. If a language is not regular, a practical tool called the pumping lemma
will be introduced to ascertain that the language is not regular. If we somehow know that certain
languages are not regular, then again closure properties may help to establish that further languages are
not regular. Thus, closure properties play an important role not only in proving that certain languages
are regular, but also in establishing the non-regularity of languages. Hence, we now explore further
closure properties of regular languages.

3.3.1 CLOSURE PROPERTIES


Set Theoretic Properties
Theorem 3.7: The class of regular languages is closed with respect to complement.

Proof: Let L be a regular language accepted by a DFA M = (Q, Σ,δ,q0,F). Construct the DFA
M’ = (Q, Σ, δ, q0, Q − F), that is, interchange the roles of final and nonfinal states of M. We claim
that L(M’) = L̅, so that L̅ is regular. For x ∈ Σ∗,
x ∈ L̅ ⇔ x ∉ L
⇔ δ̂(q0, x) ∉ F
⇔ δ̂(q0, x) ∈ Q − F
⇔ x ∈ L(M’).
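The construction in this proof is easy to sketch: keep Q, δ, and q0, and use Q − F as the new final set. The example DFA below (accepting strings over {0, 1} that end in 1) is an assumption for illustration:

```python
def complement_dfa(Q, delta, q0, F):
    """Theorem 3.7 sketch: interchange final and nonfinal states."""
    return Q, delta, q0, Q - F

def accepts(delta, q0, F, w):
    q = q0
    for c in w:
        q = delta[(q, c)]
    return q in F

# Assumed example DFA: state "A" = last symbol was not 1, "B" = last was 1.
Q = {"A", "B"}
delta = {("A", "0"): "A", ("A", "1"): "B",
         ("B", "0"): "A", ("B", "1"): "B"}
Qc, dc, q0c, Fc = complement_dfa(Q, delta, "A", {"B"})
print(accepts(dc, q0c, Fc, "10"))   # True: "10" does not end in 1
print(accepts(dc, q0c, Fc, "01"))   # False: "01" ends in 1
print(accepts(dc, q0c, Fc, ""))     # True: λ is in the complement
```

Note that the complement DFA accepts 𝜆 precisely because the start state is nonfinal in M.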
Corollary 3.8: The class of regular languages is closed with respect to intersection.
Proof: If L1 and L2 are regular, then so are L̅1 and L̅2. Then their union L̅1 ∪ L̅2 is also regular; hence
its complement is regular as well. But, by De Morgan’s law, L1 ∩ L2 is exactly the complement of
L̅1 ∪ L̅2, so that L1 ∩ L2 is regular.
Alternative proof by construction: For i = 1, 2, let Mi = (Qi, Σ, δi, qi, Fi) be a DFA accepting Li. That
is, L(M1) = L1 and L(M2) = L2. Define the DFA
M = (Q1 × Q2, Σ, δ, (q1, q2), F1 × F2), where δ is defined pointwise by
δ((p, q), a) = (δ1(p, a), δ2(q, a)),
for all (p, q) ∈ Q1 × Q2 and a ∈ Σ. We claim that L(M) = L1 ∩ L2. Using induction on |x|, first observe
that δ̂((p, q), x) = (δ̂1(p, x), δ̂2(q, x)), for all x ∈ Σ∗.
Now it clearly follows that
x ∈ L(M) ⇔ δ̂((q1, q2), x) ∈ F1 × F2
⇔ (δ̂1(q1, x), δ̂2(q2, x)) ∈ F1 × F2
⇔ δ̂1(q1, x) ∈ F1 and δ̂2(q2, x) ∈ F2
⇔ x ∈ L1 and x ∈ L2
⇔ x ∈ L1 ∩ L2.
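The pointwise product construction can be sketched by running the two DFAs in parallel. The two example DFAs below (M1 accepting strings with an even number of a's, M2 accepting strings ending in b) are assumptions for illustration:

```python
def product_dfa(d1, s1, F1, d2, s2, F2):
    """Product construction sketch: δ((p,q),a) = (δ1(p,a), δ2(q,a));
    accept when both component states are final."""
    def accepts(w):
        p, q = s1, s2
        for c in w:
            p, q = d1[(p, c)], d2[(q, c)]
        return p in F1 and q in F2
    return accepts

# M1: parity of a's (state 0 = even); M2: "y" iff the last symbol was b.
d1 = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}
d2 = {("n", "a"): "n", ("n", "b"): "y", ("y", "a"): "n", ("y", "b"): "y"}
inter = product_dfa(d1, 0, {0}, d2, "n", {"y"})
print(inter("aab"))  # True: two a's and ends in b
print(inter("ab"))   # False: odd number of a's
print(inter("aa"))   # False: does not end in b
```

The closure is explicit here: no new machinery is needed beyond the two original transition functions.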
Corollary 3.9: The class of regular languages is closed under set difference.



Proof: Let L1 and L2 be regular languages. Since the complement of a regular language is regular,
L̅2 is regular. Since the intersection of two regular languages is regular, L1 ∩ L̅2 is regular. As
L1 − L2 = L1 ∩ L̅2, the class of regular languages is closed under difference.

3.3.2 THE PUMPING LEMMA FOR REGULAR LANGUAGES


The pumping lemma condenses a simple observation about DFAs into an immensely useful statement
about a certain syntactical property that all regular languages have. Its main use is to show that some
given language is not regular, by showing that the language in question does not have this property.

Theorem 3.10: (pumping lemma): Let L be a regular language. Then there exists a constant n
(depending on L), such that ∀w ∈ L, |w| ≥ n, we can find a partition w = xyz, such that (1) y ≠ 𝜆, (2)
|xy| ≤ n, and (3) ∀k ≥ 0, xykz ∈ L. In intuitive terms, every word w from L that exceeds n in length can
be "pumped" by replicating an inner part, such that the "pumped-up" words are also in L.

Proof: Because L is regular, there exists some DFA M = (Q, Σ, δ, q0, F) that accepts L. Let M have n
states. Observe that in an accepting run on any word w = x1x2...xk of length at least n, at least one state
P (maybe the start state) must have been visited twice by the time xn has been read. Let xixi+1...xj be a
subword such that the run is in P just before reading xi and again just after reading xj, that is,
δ̂(P, xixi+1...xj) = P. Put x = x1x2...xi−1, y = xixi+1...xj, and z = xj+1...xk. Then y ≠ 𝜆, |xy| ≤ n, and for
every i ≥ 0 the run on xyiz simply traverses the loop through P i times, so xyiz ∈ L. Hence the statement holds.

Example 3.14:
Show that L = {0i1i : i ≥ 0} is not regular.
Solution
Step 1: Suppose L is regular. Let n be the number of states in the finite automaton accepting L.
Step 2: Let w = 0n1n. Then |w| = 2n > n. By the pumping lemma, we can write w = xyz with |xy| ≤ n and y ≠ 𝜆.
Step 3: We want to find i so that xyiz ∉ L, giving a contradiction. The string y can be in any of the
following forms:
Case 1: y has only 0’s, i.e. y = 0k for some k ≥ 1.
Case 2: y has only 1’s, i.e. y = 1l for some l ≥ 1.
Case 3: y has both 0’s and 1’s, i.e. y = 0k1j for some k, j ≥ 1.

In Case 1, we can take i = 0. As xyz = 0n1n, we get xz = 0n−k1n. As k ≥ 1, n − k ≠ n, so xz ∉ L.
In Case 2, take i = 0. As before, xz = 0n1n−l and n ≠ n − l, so xz ∉ L.
In Case 3, take i = 2. As xyz = 0n−k0k1j1n−j, xy2z = 0n−k0k1j0k1j1n−j. Since xy2z is not of the form 0i1i, xy2z ∉ L.
Thus in all the cases we get a contradiction. Therefore, L is not regular.
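The case analysis above can be checked mechanically: for w = 0^n 1^n, no decomposition w = xyz with |xy| ≤ n and y ≠ 𝜆 survives pumping. A small Python sketch (n = 7 is an arbitrary stand-in for the pumping-lemma constant):

```python
def in_L(w):
    """Membership test for L = {0^i 1^i : i >= 0}."""
    n = len(w) // 2
    return len(w) % 2 == 0 and w == "0" * n + "1" * n

def survives_pumping(w, n):
    """Does SOME split w = xyz with |xy| <= n and y nonempty keep
    x y^k z in L for k = 0, 1, 2? If not, w witnesses non-regularity."""
    for i in range(n + 1):                 # x = w[:i]
        for j in range(i + 1, n + 1):      # y = w[i:j], nonempty, |xy| <= n
            x, y, z = w[:i], w[i:j], w[j:]
            if all(in_L(x + y * k + z) for k in (0, 1, 2)):
                return True
    return False

n = 7   # hypothetical pumping-lemma constant, for illustration only
print(survives_pumping("0" * n + "1" * n, n))   # False: no split works
```

Since |xy| ≤ n forces y to lie inside the leading 0's, pumping down (k = 0) always breaks the count, exactly as in Case 1 above.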



4. CONTEXT-FREE LANGUAGES AND PUSHDOWN
AUTOMATA

After completing this chapter, students will be able to:

▪ Understand context-free languages and grammars.
▪ Define simple context-free grammars.
▪ Design context-free grammars easily and efficiently.
▪ Understand derivation (parsing) of strings.
▪ Derive strings with the help of the different derivation mechanisms: leftmost and rightmost
derivations and derivation trees.
▪ Design pushdown automata.
▪ Differentiate and explain the difference between deterministic pushdown automata and
nondeterministic pushdown automata.

 4.1 Activity
1. Why do we need to learn context-free grammars?
2. What are the applications of context-free grammars?
3. What is a context-free language?

In Chapters 2 and 3 we introduced three different, though equivalent, methods of describing languages:
finite automata, regular expressions and regular grammars. We showed that many languages can be
described in this way but that some simple languages, such as L = {0n1n| n ≥ 0}, cannot.

In this chapter we present context-free grammars, a more powerful method of describing languages.
Such grammars can describe certain features that have a recursive structure, which makes them useful
in a variety of applications. Context-free grammars were first used in the study of human languages.
One way of understanding the relationship of terms such as noun, verb, and preposition and their
respective phrases leads to a natural recursion because noun phrases may appear inside verb phrases and
vice versa. Context-free grammars help us organize and understand these relationships.

An important application of context-free grammars occurs in the specification and compilation of


programming languages. A grammar for a programming language often appears as a reference for people
trying to learn the language syntax. Designers of compilers and interpreters for programming languages
often start by obtaining a grammar for the language. Most compilers and interpreters contain a
component called a parser that extracts the meaning of a program prior to generating the compiled code
or performing the interpreted execution. A number of methodologies facilitate the construction of a



parser once a context-free grammar is available. Some tools even automatically generate the parser from
the grammar.

Collections of languages associated with context-free grammars are called the context-free languages.
They include all the regular languages and many additional languages. In this chapter, we give a formal
definition of context-free grammars and study the properties of context-free languages. We also
introduce pushdown automata, a class of machines recognizing the context-free languages. Pushdown
automata are useful because they allow us to gain additional insight into the power of context-free
grammars.

4.1 CONTEXT-FREE GRAMMARS


 4.2 Activity
1. Define context-free grammar formally?
2. What is derivation? Explain it with connection to languages?
3. When do we say a grammar is ambiguous grammar?

Definition 4.1: A context-free grammar is a 4-tuple (V, T, S, P), where

1. V is a finite set called the variables,


2. T is a finite set, disjoint from V, called the terminals,
3. S ∈ V is the start variable.
4. P is a finite set of rules, with each rule being a variable and a string of variables and terminals,

Alternatively, we can define as

A grammar G = (V, T, S, P) is said to be context-free if all productions in P have the form

A→x
where A ∈ V and x ∈ (V ∪ T)*.
A language L is said to be context-free language if and only if there is a context free grammar G such
that
L= L (G).
Example 4.1: Consider the grammar G1 = ({A, B}, {0, 1}, A, P), where the production set P is given by
A → 0A1
A→B
B→0
We use a grammar to describe a language by generating each string of that language in the following
manner.



1. Write down the start variable. It is the variable on the left-hand side of the top rule, unless
specified otherwise.
2. Find a variable that is written down and a rule that starts with that variable. Replace the written
down variable with the right-hand side of that rule.
3. Repeat step 2 until no variables remain.
For example, grammar G1 generates the string 0000111. The sequence of substitutions to obtain a string
is called a derivation. A derivation of string 0000111 in grammar G1 is
A⇒0A1⇒00A11⇒000A111⇒000B111⇒0000111.
You may also represent the same information pictorially with a parse tree. An example of a parse tree
is shown in Figure 4.1.
Figure 4.1: Parse tree for 0000111 in grammar G1
All strings generated in this way constitute the language of the grammar. We write L(G1) for the
language of grammar G1. Some experimentation with the grammar G1 shows us that L(G1) is {0n+11n | n
≥ 0}. Any language that can be generated by some context-free grammar is called a context-free
language (CFL). For convenience when presenting a context-free grammar, we abbreviate several rules
with the same left-hand variable, such as A → 0A1 and A → B, into a single line A → 0A1 | B, using
the symbol “ | ” as an “or”.
If u, v, and w are strings of variables and terminals, and A → w is a rule of the grammar, we say that
uAv yields uwv, written uAv ⇒ uwv. Say that u derives v, written u ⇒∗ v, if u = v or if a sequence u1, u2,
. . ., uk exists for k ≥ 0 and u ⇒ u1 ⇒ u2 ⇒ . . . ⇒ uk ⇒ v.

The language of the grammar is {w ∈ Σ∗ | S ⇒∗ w}. In grammar G1, V = {A, B}, Σ = {0, 1}, S = A, and
P is the collection of the three rules appearing in grammar G1 above.
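The three-step generation procedure described above can be sketched as a breadth-first expansion of sentential forms. In this Python sketch (an illustration, not part of the course text), variables are assumed to be single uppercase letters:

```python
def generate(rules, start, maxlen):
    """Expand sentential forms breadth-first, always rewriting the
    first variable found, and collect terminal strings up to maxlen."""
    results, frontier = set(), [start]
    while frontier:
        new = []
        for form in frontier:
            i = next((k for k, s in enumerate(form) if s.isupper()), None)
            if i is None:                       # no variables left: a sentence
                if len(form) <= maxlen:
                    results.add(form)
                continue
            for rhs in rules[form[i]]:          # step 2: replace the variable
                nf = form[:i] + rhs + form[i + 1:]
                # prune forms whose terminals already exceed the bound
                if sum(1 for c in nf if not c.isupper()) <= maxlen:
                    new.append(nf)
        frontier = new
    return sorted(results, key=len)

G1 = {"A": ["0A1", "B"], "B": ["0"]}            # grammar G1 from Example 4.1
print(generate(G1, "A", 7))   # ['0', '001', '00011', '0000111']
```

The output agrees with L(G1) = {0^{n+1} 1^n | n ≥ 0}: each string has one more 0 than 1s.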
Example 4.2
Consider the grammar G2 = ({S}, {a, b}, S, P), with productions P
S →aSa,
S→bSb,
S→λ,
is context-free.



A typical derivation in this grammar is

S ⇒ aSa ⇒ aaSaa⇒ aabSbaa ⇒ aabbaa

This, and similar derivations, make it clear that


L(G2) = {wwR : w ∈ {a, b}*} is a context-free language generated by the grammar.

Example 4.3
Consider grammar G3 = ({S}, {a, b}, S, P). The set of rules, P is given by
S → aSb | SS | 𝜆.
This grammar generates strings such as abab, aaabbb, and aababb.
L(G3) = {w ∈ {a, b}* : na(w) = nb(w), and na(v) ≥ nb(v) for every prefix v of w}

4.1.1 TYPES OF DERIVATIONS


Leftmost and Rightmost Derivations
In a grammar that is not linear, a derivation may involve sentential forms with more than one variable. In such
cases, we have a choice in the order in which variables are replaced. Take, for example, the grammar
G = ({A, B, S}, {a, b}, S, P) with productions
1. S → AB,
2. A →aaA,
3. A → 𝜆,
4. B → Bb,
5. B → 𝜆.
This grammar generates the language L(G) = {a2nbm : n ≥ 0, m ≥ 0}. Carry out a few derivations to convince
yourself of this. Consider now the two derivations
S ⇒1 AB ⇒2 aaAB ⇒3 aaB ⇒4 aaBb ⇒5 aab
and
S ⇒1 AB ⇒4 ABb ⇒2 aaABb ⇒5 aaAb ⇒3 aab.

In order to show which production is applied, we have numbered the productions and written the appropriate
number on the ⇒ symbol. From this we see that the two derivations not only yield the same sentence but also use
exactly the same productions. The difference is entirely in the order in which the productions are applied. To
remove such irrelevant factors, we often require that the variables be replaced in a specific order.

A derivation is said to be leftmost if in each step the leftmost variable in the sentential form is replaced. If in each
step the rightmost variable is replaced, we call the derivation rightmost.
Example 4.4:
Consider the grammar with productions G=({A,B,S},{a,b},S,P) where P is given by



S → aAB,
A →bBb,
B → A| 𝜆.
Then, the leftmost and rightmost derivations are:
S ⟹ aAB⟹ abBbB⟹ abAbB ⟹ abbBbbB ⟹ abbbbB ⟹ abbbb, is a leftmost derivation of the string abbbb.

S ⟹ aAB ⟹ aA ⟹ abBb ⟹ abAb ⟹ abbBbb ⟹ abbbb is a rightmost derivation of the same string abbbb.
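A leftmost derivation can also be found mechanically by always expanding the leftmost variable and backtracking when the terminals overshoot the target. In this Python sketch (variables are assumed to be single uppercase letters), the grammar of Example 4.4 recovers the leftmost derivation shown above:

```python
def leftmost_derivation(rules, target):
    """Backtracking search for a leftmost derivation of `target`.
    Sentential forms are strings; uppercase letters are variables."""
    def step(form, trace):
        if len([c for c in form if not c.isupper()]) > len(target):
            return None                       # too many terminals already
        i = next((k for k, c in enumerate(form) if c.isupper()), None)
        if i is None:
            return trace if form == target else None
        for rhs in rules[form[i]]:            # expand the LEFTMOST variable
            r = step(form[:i] + rhs + form[i + 1:], trace + [form])
            if r is not None:
                return r
        return None
    return step("S", [])

G = {"S": ["aAB"], "A": ["bBb"], "B": ["A", ""]}   # "" stands for 𝜆
d = leftmost_derivation(G, "abbbb")
print(" ⇒ ".join(d + ["abbbb"]))
# S ⇒ aAB ⇒ abBbB ⇒ abAbB ⇒ abbBbbB ⇒ abbbbB ⇒ abbbb
```

The pruning step works because every rule of this grammar either adds terminals or leads to one that does, so the search terminates.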


Definition 4.2: A derivation S ⇒∗ w is called a leftmost derivation if we apply a production only to the
leftmost variable at every step.

Definition 4.3: A derivation S ⇒∗ w is a rightmost derivation if we apply a production to the rightmost
variable at every step.

Derivation Trees
A second way of showing derivations, independent of the order in which productions are used, is by a derivation
or parse tree. A derivation tree is an ordered tree in which nodes are labeled with the left sides of productions
and in which the children of a node represent its corresponding right sides. For example, Figure 4.2 shows part
of a derivation tree representing the production
A→abAB
In a derivation tree, a node labeled with a variable occurring on the left side of a production has children consisting
of the symbols on the right side of that production. Beginning with the root, labeled with the start symbol and
ending in leaves that are terminals, a derivation tree shows how each variable is replaced in the derivation. The
following definition makes this notion precise.

Figure 4.2: Partial derivation tree

Definition 4.4: Let a grammar G = (V, T, S, P ) be a context-free grammar. An ordered tree is a derivation tree
for G if and only if it has the following properties.
1. The root is labeled S.
2. Every leaf has a label from T ∪ {λ}.
3. Every interior vertex (a vertex that is not a leaf) has a label from V.
4. If a vertex has label A ∈ V, and its children are labeled (from left to right) a1, a2,…, an, then P must
contain a production of the form
A → a1a2…..an.
5. A leaf labeled λ has no siblings, that is, a vertex with a child labeled λ can have no other children.
A tree that has properties 3, 4, and 5, but in which 1 does not necessarily hold and in which property 2 is replaced
by V ∪ T ∪ {λ}, is said to be a partial derivation tree.

The string of symbols obtained by reading the leaves of the tree from left to right, omitting any λ’s encountered,
is said to be the yield of the tree. The descriptive term left to right can be given a precise meaning. The yield is
the string of terminals in the order they are encountered when the tree is traversed in a depth-first manner, always
taking the leftmost unexplored branch.
Example 4.5
Consider the grammar G, with productions
S → aAB,
A →bBb,
B → A| 𝜆.
The tree in Figure 4.3 is a partial derivation tree for G, while the tree in Figure 4.4 is a derivation tree. The string
abBbB, which is the yield of the first tree, is a sentential form of G. The yield of the second tree, abbbb, is a
sentence of L (G).
Figure 4.3: Partial derivation tree

Figure 4.4: Derivation Tree (parse tree)

Definition 4.6: The yield of a derivation tree is the concatenation of the labels of the leaves without
repetition in the left-to-right ordering. For example, The yield of the derivation tree of Fig. 4.4 is abbbb



4.1.2 DESIGNING CONTEXT-FREE GRAMMARS

As with the design of finite automata, the design of context-free grammars requires creativity. Indeed,
context-free grammars are even trickier to construct than finite automata because we are more
accustomed to programming a machine for specific tasks than we are to describing languages with
grammars. The following techniques are helpful, singly or in combination, when you’re faced with the
problem of constructing a context free grammar (CFG).

First, many context-free languages (CFLs) are the union of simpler CFLs. If you must construct a CFG
for a CFL that you can break into simpler pieces, construct individual grammars for each piece.
These individual grammars can easily be merged into a grammar for the original language by combining
their rules and then adding the new rule S → S1 | S2 | · · · | Sk, where the variables Si are the start variables
for the individual grammars. Solving several simpler problems is often easier than solving one
complicated problem. For example, to get a grammar for the language {0n1n| n ≥ 0}∪{1n0n| n ≥ 0}, first
construct the grammar
S1 → 0S11 | 𝜆
for the language {0n1n| n ≥ 0} and the grammar
S2 → 1S20 | 𝜆
for the language {1n0n| n ≥ 0} and then add the rule S → S1 | S2 to give the grammar
S → S1 | S2
S1 → 0S11 | 𝜆
S2 → 1S20 | 𝜆.
Second, constructing a CFG for a language that happens to be regular is easy if you can first construct
a DFA for that language. You can convert any DFA into an equivalent CFG as follows. Make a variable
Pi for each state qi of the DFA. Add the rule Pi → aPj to the CFG if δ(qi, a) = qj is a transition in the
DFA. Add the rule Pi → 𝜆 if qi is an accept state of the DFA. Make P0 the start variable of the grammar,
where q0 is the start state of the machine. Verify on your own that the resulting CFG generates the same
language that the DFA recognizes.

Third, certain context-free languages contain strings with two substrings that are “linked” in the sense
that a machine for such a language would need to remember an unbounded amount of information about
one of the substrings to verify that it corresponds properly to the other substring. This situation occurs
in the language {0n1n| n ≥ 0} because a machine would need to remember the number of 0s in order to
verify that it equals the number of 1s. You can construct a CFG to handle this situation by using a rule
of the form P → uPv, which generates strings wherein the portion containing the u’s corresponds to the
portion containing the v’s.



Finally, in more complex languages, the strings may contain certain structures that appear recursively
as part of other (or the same) structures.
Example 4.6
Let L be the set of all palindromes over {a, b}. Construct a grammar G generating L.
Solution
To construct a grammar G generating the set of all palindromes, we use recursion and observe the
following:
(i) 𝜆 is a palindrome.
(ii) a and b are palindromes.
(iii) If x is a palindrome, then axa and bxb are palindromes.
So we define P as the set consisting of
i. S → 𝜆
ii. S → a and S → b
iii. S → aSa and S → bSb

Let G = ({S}, {a, b}, S, P). Then S ⇒ 𝜆, S ⇒ a, S ⇒ b, therefore


𝜆,a,b ∈ L(G)
If x is a palindrome of even length, then x = a1a2 · · · amam · · · a1, where each ai is either a or b.

Then S ⇒∗ a1a2 · · · amSam · · · a1 ⇒ a1a2 · · · amam · · · a1 = x by applying S → aSa and S → bSb and,
finally, S → 𝜆. Thus x ∈ L(G).
If x is a palindrome of odd length, then x = a1a2 · · · amcam · · · a1, where the ai’s and c are either a
or b. So S ⇒∗ a1 · · · amSam · · · a1 ⇒ x by applying S → aSa and S → bSb and, finally, S → a or
S → b. Thus x ∈ L(G). This proves L = L(G).
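The recursive construction (i)–(iii) can be exercised by sampling random derivations and checking that every derived word is indeed a palindrome. A Python sketch (the depth bound and seed are arbitrary choices for the illustration):

```python
import random

def derive(depth):
    """One random derivation in G: apply S -> aSa | bSb `depth` times,
    then finish with S -> a | b | λ."""
    if depth == 0:
        return random.choice(["a", "b", ""])    # S -> a | b | λ
    c = random.choice("ab")
    return c + derive(depth - 1) + c            # S -> aSa | bSb

random.seed(0)                                  # reproducible sample
words = [derive(random.randint(0, 4)) for _ in range(6)]
print(words)
print(all(w == w[::-1] for w in words))         # True: all are palindromes
```

Every word produced is a palindrome by construction, mirroring the inductive argument above; the converse inclusion (every palindrome is derivable) is what the even/odd case analysis establishes.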

4.1.3 AMBIGUITY
Sometimes a grammar can generate the same string in several different ways. Such a string will have
several different parse trees and thus several different meanings. This result may be undesirable for
certain applications, such as programming languages, where a program should have a unique
interpretation. If a grammar generates the same string in several different ways, we say that the string is
derived ambiguously in that grammar. If a grammar generates some string ambiguously, we say that the
grammar is ambiguous.
Example 4.7:
If G is the grammar S → SbS|a, show that G is ambiguous?
Solution
To prove that G is ambiguous, we have to find a w ∈ L(G), which is ambiguous. Consider a string w
= abababa ∈ L(G). Then we get two derivation trees for w (see Fig. 4.5). Thus, G is ambiguous.



Figure 4.5: Two derivation trees of string abababa
Example 4.8
Consider a grammar G = ({S}, {a, b, +, *}, S, P), where P consists of S → S+S | S*S | b | a.
Show that the grammar is ambiguous.
Solution
We have two derivation trees for a + a * b given in Fig. 4.6.
Figure 4.6: Two derivation trees for a + a * b.



The leftmost derivations of a + a * b induced by the two derivation trees are
S ⇒ S + S ⇒ a + S ⇒ a + S * S ⇒ a + a * S ⇒ a + a * b
S ⇒ S * S ⇒ S + S * S ⇒ a + S * S ⇒ a + a * S ⇒ a + a * b
Therefore, a + a * b is ambiguous.
Definition 4.7: A terminal string w ∈ L(G) is ambiguous if there exist two or more derivation trees for
w (equivalently, two or more leftmost or rightmost derivations of w).
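For the grammar S → SbS | a of Example 4.7, the number of derivation trees of a string can be counted directly by trying every split at a b. This CYK-style counting sketch is specialized to this particular grammar; the observation that the counts follow the Catalan numbers is ours, not from the text:

```python
from functools import lru_cache

def parse_counts(w):
    """Count the derivation trees for w under S -> SbS | a."""
    @lru_cache(maxsize=None)
    def count(i, j):                  # trees deriving w[i:j] from S
        total = 1 if w[i:j] == "a" else 0
        for k in range(i + 1, j - 1): # split as S b S at position k
            if w[k] == "b":
                total += count(i, k) * count(k + 1, j)
        return total
    return count(0, len(w))

print(parse_counts("a"))        # 1
print(parse_counts("aba"))      # 1
print(parse_counts("ababa"))    # 2
print(parse_counts("abababa"))  # 5
```

The string abababa of Example 4.7 has 5 distinct trees, two of which are drawn in Figure 4.5; one tree suffices to witness unambiguity, but two or more witness ambiguity.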

Note: A language is said to be inherently ambiguous if every grammar that generates it is ambiguous.

4.2 SIMPLIFICATION OF CONTEXT-FREE GRAMMARS

 Activity 4.3
1. What are useless variables and productions?
2. What is a 𝜆-production?
3. Define unit production.

In a CFG G, it may not be necessary to use all the symbols in V ∪ T, or all the productions in P, for
deriving sentences. So when we study a context-free language L(G), we try to eliminate those symbols
and productions in G which are not useful for the derivation of sentences. Consider, for example,
G = ({S, A, B, C, E}, {a, b, c}, S, P), where
P = {S → AB, A → a, B → b, B → C, E → c | 𝜆}.
It is easy to see that L(G) = {ab}. Let 𝐺̂ = ({S, A, B}, {a, b}, S, 𝑃̂), where 𝑃̂ consists of S → AB, A
→ a, B → b. Then L(G) = L(𝐺̂ ). We have eliminated the symbols C, E, and c and the productions B → C and E
→ c | 𝜆. We note the following points regarding the symbols and productions which were eliminated:
(i) C does not derive any terminal string.
(ii) E and c do not appear in any sentential form.
(iii) E → 𝜆 is a null production.
(iv) B → C simply replaces B by C.
In this section, we give the construction to eliminate
(i) Variables not deriving terminal strings,
(ii) Symbols not appearing in any sentential form,
(iii) Null productions, and
(iv) Productions of the form A → B.



4.2.1 CONSTRUCTION OF REDUCED GRAMMARS
Theorem 4.1: Let G = (V, T, S, P) be a context-free grammar. Suppose that P contains a production of
the form
A → x1Bx2.
Assume that A and B are different variables and that
B → y1 |y2 |…|yn

is the set of all productions in P that have B as the left side. Let 𝐺̂ = (V, T, S, 𝑃̂) be the grammar constructed by deleting
A → x1Bx2. …………………………..….………..(4.1)
from P, and adding to it
A → x1y1x2| x1y2x2 | . . . . . | x1ynx2
then
L(𝐺̂ ) = L(G)

Proof: Suppose that w ∈ L(G), so that
S ⇒*G w.
The subscript on the derivation sign ⇒ is used here to distinguish between derivations with different grammars. If this derivation does not involve the production (4.1), then obviously
S ⇒*𝐺̂ w.
If it does, then look at the derivation the first time (4.1) is used. The B introduced eventually has to be replaced; we lose nothing by assuming that this is done immediately. Thus
S ⇒*G u1Au2 ⇒ u1x1Bx2u2 ⇒ u1x1yix2u2.
But with grammar 𝐺̂ we can get
S ⇒*𝐺̂ u1Au2 ⇒𝐺̂ u1x1yix2u2.
Thus we can reach the same sentential form with 𝐺̂ as with G. It follows then, by induction on the number of times the production is applied, that
S ⇒*𝐺̂ w.
Therefore, if w ∈ L(G), then w ∈ L(𝐺̂ ).
By similar reasoning, we can show that if w ∈ L(𝐺̂ ), then w ∈ L(G), completing the proof.
Example 4.9:
Consider G = ({A, B}, {a, b, c}, A, P) with productions
A → a | aaA | abBc
B → abbA | b
Find a new grammar using substitution.
Solution
Using the suggested substitution for the variable B, we get the new grammar 𝐺̂ with productions
A → a|aaA|ababbAc|abbc
B→ abbA|b

The new grammar 𝐺̂ is equivalent to G. The string aaabbc has the derivation
A ⇒ aaA ⇒ aaabBc ⇒ aaabbc in G, and the corresponding derivation
A ⇒ aaA ⇒ aaabbc in 𝐺̂

Notice that, in this case, the variable B and its associated productions are still in the grammar even
though they can no longer play a part in any derivation. We will next show how such unnecessary
productions can be removed from a grammar.
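The substitution of Theorem 4.1 is mechanical enough to sketch in code. In this hypothetical helper (the representation is assumed: each right side is a string of single-character symbols, uppercase letters are variables), every A-rule containing B has its first occurrence of B expanded with each B-alternative:

```python
def substitute(prods, a, b):
    """Apply Theorem 4.1: replace each a-rule containing b by the rules
    obtained from expanding b's first occurrence with every b-alternative.
    (Handling only the first occurrence per rule is a simplification.)"""
    new_rules = []
    for rhs in prods[a]:
        if b in rhs:
            i = rhs.index(b)
            for y in prods[b]:                     # b -> y1 | y2 | ...
                new_rules.append(rhs[:i] + y + rhs[i + 1:])
        else:
            new_rules.append(rhs)
    return {**prods, a: new_rules}

# Example 4.9: A -> a | aaA | abBc with B -> abbA | b
g = {"A": ["a", "aaA", "abBc"], "B": ["abbA", "b"]}
print(substitute(g, "A", "B")["A"])   # ['a', 'aaA', 'ababbAc', 'abbc']
```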

4.2.2 REMOVING USELESS PRODUCTIONS


One invariably wants to remove productions from a grammar that can never take part in any derivation.
For example, in the grammar whose entire production set is
S → aSb| 𝜆 |A
A → aA
the production S → A clearly plays no role, as A cannot be transformed into a terminal string. While A
can occur in a string derived from S, this can never lead to a sentence. Removing this production leaves
the language unaffected and is a simplification by any definition.
Definition 4.8: Let G = (V, T, S, P) be a context-free grammar. A variable B ∈ V is said to be useful if
and only if there is at least one w ∈ L (G) such that

S ⇒* xBy ⇒* w
with x, y in (V ∪ T)*. In words, a variable is useful if and only if it occurs in at least one derivation.
A variable that is not useful is called useless. A production is useless if it involves any useless variable.
Example 4.10:
A variable may be useless because there is no way of getting a terminal string from it. Another reason a
variable may be useless is shown in the next grammar. Let a grammar
G = ({S, A, B}, {a, b}, S, P) where the productions are given by

S→A,
A → aA | 𝜆,
B → bA,
Solution

Although B can derive a terminal string, there is no way we can achieve S ⇒* xBy. So the variable B is useless, and so is the production B → bA. Eliminating the variable B and its production B → bA, we get the new grammar
𝐺̂ = ({S, A}, {a}, S, 𝑃̂) where the new productions are given by



S→A,
A → aA | 𝜆,
Example 4.11
Eliminate useless symbols and productions from G = (V, T, S, P), where
V = {S, A, B, C} and T = {a, b} , with P consisting of

S → aS|A ,
A→a
B → aa
B → aCb
Solution
First, we identify the set of variables that can lead to a terminal string. Because A → a and B → aa, the
variables A and B belong to this set. So does S, because S ⇒ A ⇒ a. However, this argument cannot be
made for C, thus identifying it as useless. Removing C and its corresponding productions, we are led to
the grammar G1 with variables V1 = {S, A, B} , terminals T = {a} , and productions
S → aS|A ,
A→a
B → aa
Next we want to eliminate the variables that cannot be reached from the start variable. For this, we can
draw a dependency graph for the variables. Dependency graphs are a way of visualizing complex
relationships and are found in many applications. For context-free grammars, a dependency graph has
its vertices labeled with variables, with an edge between vertices C and D if and only if there is a
production of the form
C → xDy.
A dependency graph for V1 is shown in Figure 4.7. A variable is useful only if there is a path from the
vertex labeled S to the vertex labeled with that variable. In our case, Figure 4.7 shows that B is useless.
Removing it and the affected productions and terminals, we are led to the final answer

𝐺̂ = (𝑉̂ , 𝑇̂, S, 𝑃̂) with 𝑉̂ = {S, A}, 𝑇̂ = {a}, and productions


S → aS|A ,
A→a

Figure 4.7: Dependency graph for V1 (edge S → A; B is not reachable from S)
Theorem 4.2: Let G = (V, T, S, P) be a context-free grammar. Then there exists an equivalent grammar
𝐺̂ = (𝑉̂ , 𝑇̂, S, 𝑃̂) that does not contain any useless variables or productions.



Proof: The grammar 𝐺̂ can be generated from G by an algorithm consisting of two parts.

In the first part we construct an intermediate grammar G1 = (V1, T1, S, P1) such that V1 contains only variables A for which
A ⇒* w ∈ T*
is possible. The steps in the algorithm are
1. Set V1 to ∅.
2. Repeat the following step until no more variables are added to V1: for every A ∈ V for which P has a production of the form A → x1x2…xn, with all xi in V1 ∪ T, add A to V1.
3. Take P1 as all the productions in P whose symbols are all in V1 ∪ T.


Clearly this procedure terminates. It is equally clear that if A ∈ V1, then A ⇒* w ∈ T* is a possible
derivation with G1. The remaining issue is whether every A for which A ⇒* w ∈ T* is added to V1
before the procedure terminates. To see this, consider any such A and look at the partial derivation tree
corresponding to that derivation (Figure 4.8). At level k, there are only terminals, so every variable Ai
at level k − 1 will be added to V1 on the first pass through Step 2 of the algorithm. Any variable at level
k − 2 will then be added to V1 on the second pass through Step 2. The third time through Step 2, all
variables at level k − 3 will be added, and so on. The algorithm cannot terminate while there are variables
in the tree that are not yet in V1. Hence A will eventually be added to V1.
Figure 4.8: Partial derivation tree for A ⇒* w; level k contains only terminals

In the second part of the construction, we get the final answer from G1. We draw the variable dependency
graph for G1 and from it find all variables that cannot be reached from S. These are removed from the
variable set, as are the productions involving them. We can also eliminate any terminal that does not
occur in some useful production. The result is the grammar 𝐺̂ = (𝑉̂ , 𝑇̂, S, 𝑃̂).
Because of the construction, 𝐺̂ does not contain any useless symbols or productions. Also, for each w ∈ L(G) we have a derivation
S ⇒* xAy ⇒* w.
Since the construction of 𝐺̂ retains A and all associated productions, we have everything needed to make the derivation
S ⇒*𝐺̂ xAy ⇒*𝐺̂ w.
The grammar 𝐺̂ is constructed from G by the removal of productions, so that 𝑃̂ ⊆ P. Consequently L(𝐺̂ ) ⊆ L(G). Putting the two results together, we see that G and 𝐺̂ are equivalent.
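Both parts of this construction are fixed-point computations that are easy to sketch. The helper below (the name and the string-of-symbols representation are our own assumptions) first keeps the variables that derive terminal strings, then keeps only those reachable from the start symbol:

```python
def remove_useless(terminals, start, prods):
    # Part 1: variables deriving a terminal string (fixed point of Step 2).
    generating = set()
    changed = True
    while changed:
        changed = False
        for a, rules in prods.items():
            if a not in generating and any(
                    all(x in generating or x in terminals for x in rhs)
                    for rhs in rules):
                generating.add(a)
                changed = True
    p1 = {a: [r for r in rules
              if all(x in generating or x in terminals for x in r)]
          for a, rules in prods.items() if a in generating}
    # Part 2: variables reachable from the start symbol (dependency graph walk).
    reachable, stack = {start}, [start]
    while stack:
        for rhs in p1.get(stack.pop(), []):
            for x in rhs:
                if x in p1 and x not in reachable:
                    reachable.add(x)
                    stack.append(x)
    return {a: rules for a, rules in p1.items() if a in reachable}

# Example 4.11: C derives no terminal string; B then becomes unreachable.
p = {"S": ["aS", "A"], "A": ["a"], "B": ["aa", "aCb"]}
print(remove_useless({"a", "b"}, "S", p))   # {'S': ['aS', 'A'], 'A': ['a']}
```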

4.2.3 REMOVING λ-PRODUCTIONS


One kind of production that is sometimes undesirable is one in which the right side is the empty string.
Definition 4.9: Any production of a context-free grammar of the form
A→λ
is called a λ-production. Any variable A for which the derivation

A ⇒* λ
is possible is called nullable.
A grammar may generate a language not containing λ, yet have some λ-productions or nullable variables. In such cases, the λ-productions can be removed.
Example 4.12
Consider the grammar
S → aS1b
S1 → aS1b| λ
with start variable S. This grammar generates the λ-free language {a^n b^n : n ≥ 1}. The λ-production
S1 → λ can be removed after adding new productions obtained by substituting λ for S1 where it occurs on
the right. Doing this we get the grammar
S → aS1b|ab
S1 → aS1b| ab
This new grammar generates the same language as the original one.
Theorem 4.3: Let G be any context-free grammar with λ not in L (G). Then there exists an equivalent
grammar having no λ-productions.
Proof: We first find the set VN of all nullable variables of G, using the following steps.
1. For all productions A → λ, put A into VN.
2. Repeat the following step until no further variables are added to VN. For all productions
B → A1 A2…An,
where A1, A2, …, An are in VN, put B into VN.
Once the set VN has been found, we are ready to construct 𝑃̂. To do so, we look at all productions in P of the form
A → x1x2…xm, m ≥ 1,
where each xi ∈ V ∪ T. For each such production of P, we put into 𝑃̂ that production as well as all those
generated by replacing nullable variables with λ in all possible combinations. For example, if xi and xj
are both nullable, there will be one production in 𝑃̂ with xi replaced with λ, one in which xj is replaced



with λ, and one in which both xi and xj are replaced with λ. There is one exception: If all xi are nullable,
the production A → λ is not put into 𝑃̂.
The argument that this 𝐺̂ grammar is equivalent to G is straightforward.
Example 4.13
Find a context-free grammar without λ-productions equivalent to the grammar defined by
S → ABaC
A → BC
B→b|λ
C→ D| λ
D→d
Solution
From the first step of the construction in Theorem 4.3, we find that the nullable variables are A, B, C.
Then, following the second step of the construction, we get
S → ABaC| BaC| AaC|ABa|aC|Aa|Ba|a
A → B|C|BC
B→b
C→ D
D→d
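The two steps of Theorem 4.3 can be sketched as follows; λ is represented by the empty string, and the string-of-symbols representation is an assumption for illustration:

```python
from itertools import combinations

def remove_lambda(prods):
    # Step 1: find the nullable variables (fixed point).
    nullable = set()
    changed = True
    while changed:
        changed = False
        for a, rules in prods.items():
            if a not in nullable and any(
                    rhs == "" or all(x in nullable for x in rhs)
                    for rhs in rules):
                nullable.add(a)
                changed = True
    # Step 2: add every variant obtained by deleting some subset of the
    # nullable occurrences; never add A -> λ itself.
    new = {}
    for a, rules in prods.items():
        out = set()
        for rhs in rules:
            pos = [i for i, x in enumerate(rhs) if x in nullable]
            for r in range(len(pos) + 1):
                for drop in combinations(pos, r):
                    cand = "".join(x for i, x in enumerate(rhs)
                                   if i not in drop)
                    if cand:
                        out.add(cand)
        new[a] = sorted(out)
    return new

# Example 4.13, with λ written as "":
p = {"S": ["ABaC"], "A": ["BC"], "B": ["b", ""], "C": ["D", ""], "D": ["d"]}
print(remove_lambda(p))
```

Running this reproduces the eight S-rules and the rules A → B | C | BC of the solution above.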

4.2.4 REMOVING UNIT-PRODUCTIONS


Definition 4.10: Any production of a context-free grammar of the form
A → B, where
A, B ∈ V, is called a unit-production. To remove unit-productions, we use the substitution rule.
Theorem 4.4: Let G = (V, T, S, P) be any context-free grammar without λ-productions. Then there
exists a context-free grammar 𝐺̂ = (𝑉̂ , 𝑇̂, S, 𝑃̂) that does not have any unit-productions and that is
equivalent to G.
Proof: Any unit-production of the form A → A can be removed from the grammar without effect, and
we need only consider A → B, where A and B are different variables. At first sight it may seem that we
can remove A → B by simply replacing it with
A → y1 | y2 | … | yn,
where B → y1 | y2 | … | yn is the set of all productions with B on the left. But this will not always work;
in the special case
A → B
B → A,
the unit-productions are not removed.
To get around this, we first find, for each A, all variables B such that
A ⇒* B ……………………..(4.2)



We can do this by drawing a dependency graph with an edge (C, D) whenever the grammar has a unit-
production C → D; then (4.2) holds whenever there is a walk from A to B. The new grammar 𝐺̂ is
generated by first putting into 𝑃̂ all non-unit productions of P. Next, for all A and B satisfying (4.2), we
add to 𝑃̂
A → y1 | y2 | … | yn,
where B → y1 | y2 | … | yn is the set of all rules in 𝑃̂ with B on the left. Note that since B → y1 | y2 | … | yn is
taken from 𝑃̂, none of the yi can be a single variable, so that no unit-productions are created by the last
step. To show that the resulting grammar is equivalent to the original one, we can follow the same line
of reasoning as in Theorem 4.1.
Example 4.14
Remove all unit-productions from
S → Aa|B
B → A|bb
A → a | bc | B
Solution
The dependency graph for the unit-productions is given in Figure 4.9; we see from it that S ⇒* A, S ⇒* B, B ⇒* A, and A ⇒* B. The original non-unit productions are
S → Aa,
A → a | bc,
B → bb.

Figure 4.9: Dependency graph for the unit-productions (edges S → B, B → A, A → B)


The new rules
S → a | bc | bb,
A → bb,
B → a | bc
will be added to obtain the equivalent grammar
S → a | bc | bb | Aa,
A → a | bc | bb,
B → a | bb | bc.
Note that the removal of the unit-productions has made B and the associated productions useless
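The whole procedure of Theorem 4.4 can be sketched as a reachability computation on the unit-production dependency graph; the representation below (single-character variables, rules as strings) is assumed for illustration:

```python
def remove_units(prods):
    variables = set(prods)
    # direct unit edges of the dependency graph: A -> B with B a variable
    unit = {a: [r for r in rules if r in variables]
            for a, rules in prods.items()}
    new = {}
    for a in prods:
        # all B with A =>* B by unit steps (walk in the dependency graph)
        seen, stack = set(), list(unit[a])
        while stack:
            b = stack.pop()
            if b not in seen:
                seen.add(b)
                stack.extend(unit[b])
        rules = [r for r in prods[a] if r not in variables]  # non-unit rules
        for b in sorted(seen - {a}):
            rules += [r for r in prods[b] if r not in variables]
        new[a] = rules
    return new

# Example 4.14:
p = {"S": ["Aa", "B"], "B": ["A", "bb"], "A": ["a", "bc", "B"]}
print(remove_units(p))
# S: ['Aa', 'a', 'bc', 'bb'], A: ['a', 'bc', 'bb'], B: ['bb', 'a', 'bc']
```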



4.3 METHODS FOR TRANSFORMING GRAMMARS

 Activity 4.4
1. Define Chomsky Normal form?
2. Define Greibach Normal Form?

4.3.1 CHOMSKY NORMAL FORM

When working with context-free grammars, it is often convenient to have them in simplified form. One
of the simplest and most useful forms is called the Chomsky normal form. Chomsky normal form is
useful in giving algorithms for working with context-free grammars.
Definition 4.11: A context-free grammar is in Chomsky normal form if every rule is of the form
A → BC
A→a
where a is any terminal and A, B, and C are any variables.
Theorem 4.5 : Any context-free language is generated by a context-free grammar in Chomsky normal
form.
Proof idea: We can convert any grammar G into Chomsky normal form. The conversion has several
stages wherein rules that violate the conditions are replaced with equivalent ones that are satisfactory.
First, we add a new start variable. Then, we eliminate all 𝜆-rules of the form A → 𝜆. We also eliminate
all unit rules of the form A → B. In both cases we patch up the grammar to be sure that it still generates
the same language. Finally, we convert the remaining rules into the proper form.
Proof: First, we add a new start variable S0 and the rule S0 → S, where S was the original start variable.
This change guarantees that the start variable doesn’t occur on the right-hand side of a rule.
Second, we take care of all 𝜆 - rules. We remove an 𝜆 - rule A → 𝜆, where A is not the start variable.
Then for each occurrence of an A on the right-hand side of a rule, we add a new rule with that occurrence
deleted. In other words, if R → uAv is a rule in which u and v are strings of variables and terminals, we
add rule R → uv. We do so for each occurrence of an A, so the rule R → uAvAw causes us to add R →
uvAw, R → uAvw, and R → uvw. If we have the rule R → A, we add R → 𝜆 unless we had previously
removed the rule R → 𝜆. We repeat these steps until we eliminate all 𝜆 -rules not involving the start
variable.
Third, we handle all unit rules. We remove a unit rule A → B. Then, whenever a rule B → u appears,
we add the rule A → u unless this was a unit rule previously removed. As before, u is a string of variables
and terminals. We repeat these steps until we eliminate all unit rules.



Finally, we convert all remaining rules into the proper form. We replace each rule A → u1u2 · · · uk,
where k ≥ 3 and each ui is a variable or terminal symbol, with the rules A → u1A1, A1 → u2A2, A2 →
u3A3. . . and Ak−2 → uk−1 uk.
The Ai’s are new variables. We replace any terminal ui in the preceding rule(s) with the new variable Ui
and add the rule Ui → ui.
Example 4.15
Let G be the following CFG; we convert it to Chomsky normal form using the conversion
procedure just given. The series of grammars presented illustrates the steps in the conversion.
1. The original CFG G is
S → ASA | aB
A → B | S
B → b | λ
Applying the first step adds a new start variable S0:
S0 → S
S → ASA | aB
A → B | S
B → b | λ
2. Remove the λ-rule B → λ:
S0 → S
S → ASA | aB | a
A → B | S | λ
B → b
Then remove the λ-rule A → λ:
S0 → S
S → ASA | aB | a | SA | AS | S
A → B | S
B → b
3a. Remove the unit rule S → S:
S0 → S
S → ASA | aB | a | SA | AS
A → B | S
B → b
Then remove the unit rule S0 → S:
S0 → ASA | aB | a | SA | AS
S → ASA | aB | a | SA | AS
A → B | S
B → b
3b. Remove the unit rule A → B:
S0 → ASA | aB | a | SA | AS
S → ASA | aB | a | SA | AS
A → S | b
B → b
Then remove the unit rule A → S:
S0 → ASA | aB | a | SA | AS
S → ASA | aB | a | SA | AS
A → b | ASA | aB | a | SA | AS
B → b

4. Convert the remaining rules into the proper form by adding additional variables and rules. The final
grammar in Chomsky normal form is equivalent to G. (Actually the procedure given in Theorem 4.5
produces several variables Ui and several rules Ui → a. We simplified the resulting grammar by using a
single variable U and rule U → a.)
S0 → AA1 | UB | a | SA | AS
S → AA1 | UB | a | SA | AS
A → b | AA1 | UB | a | SA | AS
A1 → SA
U→a
B→b

4.3.2 GREIBACH NORMAL FORM


Another useful grammatical form is the Greibach normal form. Here we put restrictions not on the length
of the right sides of a production, but on the positions in which terminals and variables can appear.
Arguments justifying Greibach normal form are a little complicated and not very transparent.
Similarly, constructing a grammar in Greibach normal form equivalent to a given context-free grammar
is tedious. We therefore deal with this matter very briefly. Nevertheless, Greibach normal form has many
theoretical and practical consequences.
Definition 4.12: A context-free grammar is said to be in Greibach normal form if all productions have
the form
A → ax,
where a ∈ T and x ∈ V*
Example 4.16
Convert the grammar
S → abSb|aa into Greibach normal form.

Solution
Here we can use a device similar to the one introduced in the construction of Chomsky normal form.
We introduce new variables A and B that are essentially synonyms for a and b, respectively. Substituting
for the terminals with their associated variables leads to the equivalent grammar
S → aBSB | aA,
A → a,
B → b,
which is in Greibach normal form.
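This terminal-to-variable substitution can be sketched directly; the uppercase-synonym naming scheme is an assumption that works only when the uppercase letters are not already used as variables:

```python
def terminals_to_variables(prods, terminals):
    """For rules already starting with a terminal, replace every *later*
    terminal with a synonym variable (A -> a, B -> b, ...), as in Example 4.16."""
    syn = {t: t.upper() for t in sorted(terminals)}   # assumed naming scheme
    new = {}
    for a, rules in prods.items():
        new[a] = [rhs[0] + "".join(syn[x] if x in terminals else x
                                   for x in rhs[1:])
                  for rhs in rules]
    for t, v in syn.items():                          # add A -> a, B -> b, ...
        new.setdefault(v, []).append(t)
    return new

p = {"S": ["abSb", "aa"]}
print(terminals_to_variables(p, {"a", "b"}))
# {'S': ['aBSB', 'aA'], 'A': ['a'], 'B': ['b']}
```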

4.4 CHOMSKY’S HIERARCHY OF GRAMMARS


The Chomsky hierarchy represents the classes of languages that are accepted by different machines. The
categories of language in Chomsky's hierarchy are given below:
Type 0 known as Unrestricted Grammar.
Type 1 known as Context Sensitive Grammar.
Type 2 known as Context Free Grammar.
Type 3 known as Regular Grammar.

Compiled by: Destalem H. 78


Type 0: Unrestricted Grammar:
Type-0 grammars include all formal grammars. Type-0 languages are recognized by the Turing
machine. These languages are also known as the recursively enumerable languages. Grammar
productions are of the form
𝛼 → 𝛽
where 𝛼 ∈ (V ∪ T)* V (V ∪ T)* and 𝛽 ∈ (V ∪ T)*, with V the variables and T the terminals.
In Type 0 there must be at least one variable on the left side of a production.
For example,
Sab → ba
A → S
Here, the variables are S, A and the terminals a, b.
Type 1: Context-Sensitive Grammar
Type-1 grammars generate the context-sensitive languages. The languages generated by these grammars
are recognized by linear bounded automata.
I. First of all, a Type 1 grammar should be Type 0.
II. Grammar productions are of the form
𝛼 → 𝛽
with |𝛼| ≤ |𝛽|, i.e., the number of symbols in 𝛼 is less than or equal to that in 𝛽.
For example,
S → AB
AB → abc
B → b
Type 2: Context-Free Grammar:
Type-2 grammars generate the context-free languages. The language generated by such a grammar is
recognized by a pushdown automaton.
1. First of all, it should be Type 1.
2. The left-hand side of a production can have only one variable, i.e., |𝛼| = 1. There is no restriction on 𝛽.
For example,
S → AB
A → a
B → b
Type 3: Regular Grammar:
Type-3 grammars generate the regular languages. These are exactly the languages that can be
accepted by a finite automaton. Type 3 is the most restricted form of grammar.
Type 3 productions must be of the form
V → VT | T (left-regular grammar)
or
V → TV | T (right-regular grammar),
for example:
S → a
The above form is called a strictly regular grammar. There is another form of regular grammar called
extended regular grammar. In this form:
V → VT* | T* (extended left-regular grammar)
or
V → T*V | T* (extended right-regular grammar),
for example:
S → ab

4.5 PUSHDOWN AUTOMATA

 Activity 4.5
1. Define pushdown automata formally.
2. What is the language recognized by a pushdown automaton?
3. Prove that the language generated by a CFG is accepted by a pushdown automaton.

In this section we introduce a new type of computational model called pushdown automata. These
automata are like nondeterministic finite automata but have an extra component called a stack. The stack
provides additional memory beyond the finite amount available in the control. The stack allows
pushdown automata to recognize some nonregular languages.



Pushdown automata are equivalent in power to context-free grammars. This equivalence is useful
because it gives us two options for proving that a language is context free. We can give either a context-
free grammar generating it or a pushdown automaton recognizing it. Certain languages are more easily
described in terms of generators, whereas others are more easily described by recognizers.
The following figure is a schematic representation of a finite automaton. The control represents the states
and transition function, the tape contains the input string, and the arrow represents the input head,
pointing at the next input symbol to be read.

Figure 4.10: Schematic of a finite automaton

With the addition of a stack component we obtain a schematic representation of a pushdown automaton,
as shown in the following figure.
Figure 4.11: Schematic of a pushdown automaton

A pushdown automaton (PDA) can write symbols on the stack and read them back later. Writing a
symbol “pushes down” all the other symbols on the stack. At any time the symbol on the top of the stack
can be read and removed. The remaining symbols then move back up. Writing a symbol on the stack is
often referred to as pushing the symbol, and removing a symbol is referred to as popping it. Note that
all access to the stack, for both reading and writing, may be done only at the top. In other words a stack
is a “last in, first out” storage device. If certain information is written on the stack and additional
information is written afterward, the earlier information becomes inaccessible until the later
information is removed.

Plates on a cafeteria serving counter illustrate a stack. The stack of plates rests on a spring so that when
a new plate is placed on top of the stack, the plates below it move down. The stack on a pushdown
automaton is like a stack of plates, with each plate having a symbol written on it.

A stack is valuable because it can hold an unlimited amount of information. Recall that a finite
automaton is unable to recognize the language {0^n1^n | n ≥ 0} because it cannot store very large numbers
in its finite memory. A PDA is able to recognize this language because it can use its stack to store the
number of 0s it has seen. Thus the unlimited nature of a stack allows the PDA to store numbers of
unbounded size. The following informal description shows how the automaton for this language works.

Read symbols from the input. As each 0 is read, push it onto the stack. As soon as 1s are seen, pop a 0
off the stack for each 1 read. If reading the input is finished exactly when the stack becomes empty of
0s, accept the input. If the stack becomes empty while 1s remain or if the 1s are finished while the stack
still contains 0s or if any 0s appear in the input following 1s, reject the input.

As mentioned earlier, pushdown automata may be nondeterministic. Deterministic and nondeterministic
pushdown automata are not equivalent in power. Nondeterministic pushdown automata recognize
certain languages that no deterministic pushdown automaton can recognize. Recall that deterministic
and nondeterministic finite automata do recognize the same class of languages, so the pushdown
automata situation is different. We focus on nondeterministic pushdown automata because these
automata are equivalent in power to context-free grammars.

4.5.1 FORMAL DEFINITION OF A PUSHDOWN AUTOMATON

The formal definition of a pushdown automaton is similar to that of a finite automaton, except for the
stack. The stack is a device containing symbols drawn from some alphabet. The machine may use
different alphabets for its input and its stack, so now we specify both an input alphabet Σ and a stack
alphabet Γ.

At the heart of any formal definition of an automaton is the transition function, which describes its
behavior. The domain of the transition function is Q × (Σ ∪ {𝜆}) × (Γ ∪ {𝜆}). Thus the current state, next
input symbol read, and top symbol of the stack determine the next move of a pushdown automaton.
Either symbol may be 𝜆, causing the machine to move without reading a symbol from the input or
without reading a symbol from the stack.

For the range of the transition function we need to consider what to allow the automaton to do when it
is in a particular situation. It may enter some new state and possibly write a symbol on the top of the
stack. The function δ can indicate this action by returning a member of Q together with a member of
Γ ∪ {𝜆}, that is, a member of Q × (Γ ∪ {𝜆}). Because we allow nondeterminism in this model, a situation
may have several legal next moves. The transition function incorporates nondeterminism in the usual
way, by returning a set of members of Q × (Γ ∪ {𝜆}), that is, a member of P(Q × (Γ ∪ {𝜆})). Putting it all
together, our transition function δ takes the form δ: Q × (Σ ∪ {𝜆}) × (Γ ∪ {𝜆}) → P(Q × (Γ ∪ {𝜆})).



Definition 4.13: A pushdown automaton is a 7-tuple (Q, Σ, Γ, δ, z0, q0, F), where Q, Σ, Γ, and F are all
finite sets, and
1. Q is the set of states,
2. Σ is the input alphabet,
3. Γ is the stack alphabet,
4. z0 ∈ Γ is a special pushdown symbol, called the initial symbol on the pushdown store,
5. δ: Q × (Σ ∪ {𝜆}) × (Γ ∪ {𝜆}) → P(Q × (Γ ∪ {𝜆})) is the transition function,
6. q0 ∈ Q is the start state, and
7. F ⊆ Q is the set of accept states.
A pushdown automaton M = (Q, Σ, Γ, δ, z0, q0, F) computes as follows. It accepts input w if w can be
written as w = w1w2 · · · wm, where each wi ∈ Σ ∪ {𝜆}, and sequences of states r0, r1, . . . , rm ∈ Q and
strings s0, s1, . . . , sm ∈ Γ∗ exist that satisfy the following three conditions. The strings si represent the
sequence of stack contents that M has on the accepting branch of the computation.
1. r0 = q0 and s0 = 𝜆. This condition signifies that M starts out properly, in the start state and with
an empty stack.
2. For i = 0, . . . , m − 1, we have (ri+1, b) ∈ δ(ri, wi+1, a), where si = at and si+1 = bt for some a, b
∈ Γ ∪ {𝜆} and t ∈ Γ∗. This condition states that M moves properly according to the state, stack,
and next input symbol.
3. rm ∈ F. This condition states that an accept state occurs at the input end.
Example 4.17
The following is the formal description of the PDA that recognizes the language {0^n1^n | n ≥ 0}.
Let M1 be (Q, Σ, Γ, δ, z0, q1, F), where
Q = {q1, q2, q3, q4},
Σ = {0, 1},
Γ = {0, z0},
F = {q1, q4}, and
δ is given by the following list; all values not shown are ∅.
δ(q1, 𝜆, 𝜆) = {(q2, z0)}
δ(q2, 0, 𝜆) = {(q2, 0)}
δ(q2, 1, 0) = {(q3, 𝜆)}
δ(q3, 1, 0) = {(q3, 𝜆)}
δ(q3, 𝜆, z0) = {(q4, 𝜆)}

We can also use a state diagram to describe a PDA, as in Figure 4.12. Such diagrams are similar to the
state diagrams used to describe finite automata, modified to show how the PDA uses its stack when
going from state to state. We write “a, b → c” to signify that when the machine is reading an a from the
input, it may replace the symbol b on the top of the stack with a c. Any of a, b, and c may be 𝜆. If a is
𝜆, the machine may make this transition without reading any symbol from the input. If b is 𝜆, the machine
may make this transition without reading and popping any symbol from the stack. If c is 𝜆, the machine
does not write any symbol on the stack when going along this transition.
,  → 0,  → 0
q z0
q
1 2

1,0 → 

q q 3
1,0 → 
, z0 → 
4

Figure 2.12: State diagram for the PDA M1 that recognizes {0n1n| n ≥ 0}
The formal definition of a PDA contains no explicit mechanism to allow the PDA to test for an empty
stack. This PDA is able to get the same effect by initially placing a special symbol $ on the stack. Then
if it ever sees the $ again, it knows that the stack effectively is empty. Subsequently, when we refer to
testing for an empty stack in an informal description of a PDA, we implement the procedure in the same
way.
Similarly, PDAs cannot test explicitly for having reached the end of the input string. This PDA is able
to achieve that effect because the accept state takes effect only when the machine is at the end of the
input. Thus from now on, we assume that PDAs can test for the end of the input, and we know that we
can implement it in the same manner.
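The informal description of M1 translates directly into a deterministic simulation; the function below is a sketch of that description, not of the formal 7-tuple:

```python
def accepts_0n1n(w: str) -> bool:
    """Push a 0 for each 0 read, pop a 0 for each 1 read, and accept iff
    the stack is empty of 0s exactly when the input ends."""
    stack = ["z0"]                  # initial stack symbol marks the bottom
    seen_one = False
    for c in w:
        if c == "0":
            if seen_one:            # a 0 appearing after a 1: reject
                return False
            stack.append("0")
        elif c == "1":
            seen_one = True
            if stack[-1] != "0":    # more 1s than 0s: reject
                return False
            stack.pop()
        else:
            return False
    return stack == ["z0"]          # empty of 0s: only z0 remains

assert accepts_0n1n("0011") and accepts_0n1n("")
assert not accepts_0n1n("001") and not accepts_0n1n("0101")
```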
Example 4.18
In this example we give a PDA M2 recognizing the language {ww^R | w ∈ {0,1}*}. Recall that w^R means
w written backwards. The informal description and state diagram of the PDA follow.
Begin by pushing the symbols that are read onto the stack. At each point, nondeterministically guess
that the middle of the string has been reached and then change into popping off the stack for each symbol
read, checking to see that they are the same. If they were always the same symbol and the stack empties
at the same time as the input is finished, accept; otherwise reject.
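Nondeterministic guessing of the middle can be simulated by simply trying every possible middle position; the following sketch accepts exactly the strings of the form ww^R:

```python
def accepts_wwr(s: str) -> bool:
    """The PDA guesses the middle nondeterministically; a simulation can
    try every possible middle position (phase 1 pushes, phase 2 pops)."""
    for mid in range(len(s) + 1):
        stack = list(s[:mid])            # phase 1: push everything read so far
        rest = s[mid:]
        if len(rest) == len(stack) and all(
                c == stack[-(i + 1)] for i, c in enumerate(rest)):
            return True                  # phase 2: every symbol matched a pop
    return False

assert accepts_wwr("0110") and accepts_wwr("")
assert not accepts_wwr("01") and not accepts_wwr("011")
```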
1,  → 1
,  → 0 z 0,  → 0
q 1
q 2

,  → 

0,0 → 
q q
1,1 → 
3
4
 , z0 → 
Figure 2.13: State diagram for the PDA M3 that recognizes {wwR| w ∈ {0, 1}∗}



Example 4.19
Construct an npda for the language L = {w ∈ {a, b}* : na(w) = nb(w)}.
Solution
As in Example 4.17, the solution to this problem involves counting the number of a’s and b’s, which is
easily done with a stack. Here we need not even worry about the order of the a’s and b’s. We can insert
a counter symbol, say 0, into the stack whenever an a is read, then pop one counter symbol from the
stack when a b is found. The only difficulty with this is that if there is a prefix of w with more b’s than
a’s, we will not find a 0 to use. But this is easy to fix; we can use a negative counter symbol, say 1, for
counting the b’s that are to be matched against a’s later. The complete solution is given in the transition
graph in Figure 4.14

Figure 4.14: State diagram for the PDA that recognizes L = {w ∈ {a, b}* : na(w) = nb(w)}. At q1 the transitions are a, z → 0z; a, 0 → 00; a, 1 → 𝜆; b, z → 1z; b, 1 → 11; and b, 0 → 𝜆; the machine moves from q1 to q2 on 𝜆, z → z.



5 TURING MACHINE (TM)
5.1 INTRODUCTION TO COMPUTATIONAL COMPLEXITY
Throughout history people had a notion of a process of producing an output from a set of inputs
in a finite number of steps, and they thought of “computation” as “a person writing numbers on a scratch
pad following certain rules.”
After their success in defining computation, researchers focused on understanding what
problems are computable. They showed that several interesting tasks are inherently uncomputable: No
computer can solve them without going into infinite loops (i.e., never halting) on certain inputs.
Computational complexity theory focuses on issues of computational efficiency — quantifying the
amount of computational resources required to solve a given task. We will quantify the efficiency of an
algorithm by studying how its number of basic operations scales as we increase the size of the input.
Computation is a mathematically precise notion. We will typically measure the computational
efficiency of an algorithm as the number of basic operations it performs as a function of its input length.
That is, the efficiency of an algorithm can be captured by a function T from the set N of natural numbers
to itself, i.e., T: N→N, such that T(n) is equal to the maximum number of basic operations that the
algorithm performs on inputs of length n. For the grade-school algorithm we have at most T(n) = 2n²,
and for repeated addition at least T(n) = n·10^(n−1).
Throughout history people have been solving computational tasks using a wide variety of
methods, ranging from intuition and “eureka” moments to mechanical devices such as abacus or slide
rules to modern computers. How can we find a simple mathematical model that captures all of these
ways to compute?
Surprisingly enough, it turns out that there is a simple mathematical model that suffices for
studying many questions about computation and its efficiency—the Turing machine. It suffices to
restrict attention to this single model since it seems able to simulate all physically realizable
computational methods with little loss of efficiency. Thus the set of “efficiently solvable” computational
tasks is at least as large for the Turing machine as for any other method of computation. One possible
exception is the quantum computer model, but we do not currently know if it is physically realizable.
In the following sections we discuss about the theoretical computational model of computing – the
Turing Machine (TM).

2 STANDARD TURING MACHINE (TM)


For thousands of years, the term “computation” was understood to mean application of
mechanical rules to manipulate numbers, where the person/machine doing the manipulation is allowed
a scratch pad on which to write the intermediate results. The Turing machine is a concrete embodiment
of this intuitive notion. It can be also viewed as the equivalent of any modern programming language
— albeit one with no built-in prohibition on its memory size.
Here we describe this model informally as follows. Let f be a function that takes a string of bits
(i.e., a member of the set {0,1}*) and outputs either 0 or 1. An algorithm for computing f is a set of
mechanical rules, such that by following them we can compute f(x) given any input x ∈ {0,1}*. The set
of rules being followed is fixed (i.e., the same rules must work for all possible inputs) though each rule
in this set may be applied arbitrarily many times. Each rule involves one or more of the following
“elementary” operations:
1. Read a bit of the input.
2. Read a bit (or possibly a symbol from a slightly larger alphabet, say a digit in the set {0, . . . , 9})
from the scratch pad or working space we allow the algorithm to use.
Based on the values read,
1. Write a bit/symbol to the scratch pad.
2. Either stop and output 0 or 1, or choose a new rule from the set that will be applied next.
Finally, the running time is the number of these basic operations performed. We measure it in asymptotic terms, so we say a machine runs in time T(n) if it performs at most T(n) basic operations on inputs of length n.
In the automata theory part of this course we have seen that neither finite automata nor pushdown automata can be regarded as truly general models for computers, since they are not capable of recognizing even such simple languages as {aⁿbⁿcⁿ : n ≥ 0}. We now take up the study of devices that can recognize this and many more complicated languages. Although these devices, called Turing machines after their inventor Alan Turing (1912–1954), are more general than the automata previously studied, their basic appearance is similar to those automata.
A Turing machine consists of a finite control, a tape, and a head that can be used for reading or
writing on that tape. The formal definitions of Turing machines and their operation are in the same
mathematical style as those used for finite and pushdown automata. So in order to gain the additional
computational power and generality of function that Turing machines possess, we shall not move to an
entirely new sort of model for a computer.
The important points to remember by way of introduction are that Turing machines are designed
to satisfy simultaneously these three criteria:
(a) They should be automata; that is, their construction and function should be in the same general
spirit as the devices/machines previously studied.
(b) They should be as simple as possible to describe, define formally, and reason about.
(c) They should be as general as possible in terms of the computations they can carry out.
Now let us look more closely at these machines. In essence, a Turing machine consists of a finite-
state control unit and a tape (see Figure 1). Communication between the two is provided by a single
Read/Write head, which reads symbols from the tape and is also used to change the symbols on the tape.
The control unit operates in discrete steps; at each step it performs two functions in a way that depends
on its current state and the tape symbol currently scanned by the read/write head:
1. Put the control unit in a new state.
2. Either:
(a) Write a symbol in the tape square currently scanned, replacing the one already there; or
(b) Move the read/write head one tape square to the left or right.

.... □ □ a b a a □ □ ....     (tape)
            ↑
    Read/Write head (moves in both directions)

    Finite control: states q0, q1, q2, q3, h
Figure 1: A Turing Machine consisting of a Tape, R/W head and a finite control.

FORMAL DEFINITION OF A STANDARD TURING MACHINE


Definition: A Turing machine M is a 7-tuple, namely (Q, Σ, Γ, δ, □, s, H), where
Q is a finite nonempty set of states;
Γ is a finite nonempty set of tape symbols;
□ ∈ Γ is the blank symbol;
Σ is a nonempty set of input symbols, with Σ ⊆ Γ and □ ∉ Σ;
s ∈ Q is the initial state;
H ⊆ Q is the set of halting/final states;
δ is the transition function, a function from Q × Γ to Q × Γ × {L, R}, that is,
a mapping of (q, x) onto (q', y, D), where D denotes the direction of movement
of the R/W head: D = L or R according as the movement is to the left or to the right.
Note: (1) The acceptability of a string is decided by the reachability from the initial state to some final state. So the final states are also called the accepting states.
(2) δ may not be defined for some elements of Q × Γ.
Example 1: Consider the following Turing machine
M = (Q, Σ, Γ, δ, □, s, H), where
Q = {q0, q1, h},
Γ = {a, □},
Σ = {a},
s = q0,
H = {h},
And δ (the transition function) is given by the following table.

q, σ        δ(q, σ)
q0, a       (q1, □, R)
q0, □       (h, □)
q1, a       (q0, □, R)
q1, □       (q0, □, L)
Table 1: The transition table of M
When M is started in its initial state q0, it scans its head to the right, changing all a's to □'s as it goes,
until it finds a tape square already containing □; then it halts. (Changing a nonblank symbol to the blank
symbol will be called erasing the nonblank symbol.) To be specific, suppose that M is started with its
head scanning the first of four a's, the last of which is followed by a □. Then M will go back and forth
between states q0 and q1 four times, alternately changing an a to a □ and moving the head right; the first and third lines of the table for δ are the relevant ones during this sequence of moves. At this point, M
will find itself in state q0 scanning □ and, according to the second line of the table, will halt.
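The behaviour of M can be checked mechanically. The following simulator is an illustrative sketch, not part of the course text; since the table gives no direction for the halting entry (h, □), the simulator treats it as writing the blank and staying put — an assumption of ours.

```python
# A minimal simulator for the TM of Example 1 (illustrative sketch).
# The tape is a dict position -> symbol, so it is unbounded in both
# directions; missing cells read as the blank symbol.
BLANK = "_"   # stands for the blank (drawn as a small box in the text)

# delta[(state, symbol)] = (new_state, written_symbol, direction).
# "S" (stay put) is our assumption for the halting entry (h, blank),
# since the table gives no direction for it.
delta = {
    ("q0", "a"):   ("q1", BLANK, "R"),
    ("q0", BLANK): ("h",  BLANK, "S"),
    ("q1", "a"):   ("q0", BLANK, "R"),
    ("q1", BLANK): ("q0", BLANK, "L"),
}

def run(delta, w, start, halting):
    tape = {i: s for i, s in enumerate(w)}
    head, state = 0, start
    while state not in halting and (state, tape.get(head, BLANK)) in delta:
        state, written, move = delta[(state, tape.get(head, BLANK))]
        tape[head] = written
        head += {"L": -1, "R": 1, "S": 0}[move]
    return state, "".join(tape[i] for i in sorted(tape)).strip(BLANK)

print(run(delta, "aaaa", "q0", {"h"}))   # ('h', '') – all four a's erased
```

Running it on aaaa reproduces the behaviour described above: the machine alternates between q0 and q1, erasing an a at each step, and halts in h on the first blank.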
Example 2: Consider the Turing machine defined by
Q = {q0, q1},  Σ = {a, b},  Γ = {a, b, □},  F = {q1},
and
δ(q0, a) = (q0, b, R),  δ(q0, b) = (q0, b, R),  δ(q0, □) = (q1, □, L).
If this Turing machine is started in state q0 with the symbol a under the read-write head, the applicable
transition rule is δ (q0,a)= (q0,b,R). Therefore, the read-write head will replace a with b, then move right
on the tape. The machine will remain in state q0. Any subsequent a will also be replaced with a b, but
b's will not be modified. When the machine encounters the first blank, it will move left one cell, and
then halt in final state q1.
The figure 2 below shows several stages of the process for a simple initial configuration.

Figure 2: A sequence of moves.


As before, we can use transition graphs to represent Turing machines. Now we label the edges of the
graph with three items: the current tape symbol, the symbol that replaces it, and the direction in which
the read-write head is to move. The Turing machine in Example 2 is represented by the transition graph
in Figure 3 below.

Figure 3: Transition graph


We now formalize the operation of a Turing machine. As we have the “productions” as the
control theme of a grammar, the “transitions” or “moves” are the central theme of a Turing machine.
These transitions are given as a table or list of 5-tuples, where each tuple has the form:
(current-state, symbol-read, symbol-written, direction, next-state)
Creating such a list is called “implementation description” or “programming” of a Turing machine. A
Turing machine is often defined to start with the read/write head positioned over the first (leftmost) input
symbol.
To specify the status of a Turing machine computation, we need to specify the state, the contents
of the tape, and the position of the head. Since all but a finite initial portion of the tape will be blank, the
contents of the tape can be specified by a finite string. These considerations lead us to the following
ways of describing Turing Machines.
Since one can make several different definitions of a Turing machine, it is worthwhile to summarize the



main features of our model, which we will call a standard Turing machine:
1. The Turing machine has a tape that is unbounded in both directions, allowing any number of left
and right moves.
2. The Turing machine is deterministic in the sense that δ defines at most one move for each
configuration.
3. There is no special input file. We assume that at the initial time the tape has some specified
content. Some of this may be considered input. Similarly, there is no special output device.
Whenever the machine halts, some or all of the contents of the tape may be viewed as output.

REPRESENTATION OF TURING MACHINES


We can describe a Turing machine by employing (i) instantaneous descriptions using move relations (├), (ii) transition tables, and (iii) transition diagrams (transition graphs).

REPRESENTATION OF TM BY INSTANTANEOUS DESCRIPTIONS:


'Snapshots' of a Turing machine in action can be used to describe a Turing machine. These give
'instantaneous descriptions' of a Turing machine. So an ID of a Turing machine is defined in terms of
the entire input string and the current state.
Definition: An ID of a Turing machine M is a string x1qx2, where q is the present state of M and the entire input string is split as x1x2: the first symbol of x2 is the current symbol a under the R/W head, x2 contains a and all the subsequent symbols of the input string, and x1 is the substring of the input string formed by all the symbols to the left of a.
Example: A snapshot of Turing machine is shown in figure below. Obtain the instantaneous
description.
x1 x2

..... □ □ a4 a1 a2 a1 a2 a2 a a4 a2 □ □ .......
Tape

R/W head
State

q
Figure 4: A Snapshot of Turing Machine

Solution:

The present symbol under the R/W head is a. The present state is q. So a is written to the right of q. The
nonblank symbols to the left of a form the string a4a1a2a1a2a2, which is written to the left of q. The sequence of
nonblank symbols to the right of a is a4a2. Thus the ID is as given in the figure below.



a4a1a2a1a2a2 q a a4a2

left sequence right sequence

Present state Symbol under R/W head

Figure 5: Representation of ID

Note: (1) For constructing the ID, we simply insert the current state in the input string to the left of
the symbol under the R/W head.

(2) We observe that the blank symbol may occur as part of the left or right substring.

The instantaneous description gives only a finite amount of information to the right and left of the read-write head. The unspecified part of the tape is assumed to contain all blanks; normally such blanks are irrelevant and are not shown explicitly in the instantaneous description. If the position of blanks is relevant to the discussion, however, the blank symbol may appear in the instantaneous description. For example, the instantaneous description q□w indicates that the read-write head is on the cell to the immediate left of the first symbol of w and that this cell contains a blank.
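Constructing an ID from a snapshot is purely mechanical. The sketch below (not part of the course text) rebuilds the ID of the example above by splitting the tape at the head position.

```python
# Building the ID x1 q x2 from a snapshot (illustrative sketch).
def make_id(tape, head, state):
    """tape: list of cell symbols; head: index of the scanned cell."""
    x1 = "".join(tape[:head])          # symbols to the left of the head
    x2 = "".join(tape[head:])          # begins with the scanned symbol
    return x1 + state + x2

tape = ["a4", "a1", "a2", "a1", "a2", "a2", "a", "a4", "a2"]
print(make_id(tape, 6, "q"))           # a4a1a2a1a2a2qaa4a2
```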

MOVES IN TURING MACHINE


As in the case of pushdown automata, δ(q, x) induces a change in the ID of the Turing machine. We call this change in ID a move.
Suppose δ(q, xi) = (p, y, L). The input string to be processed is x1x2 . . . xn, and the present symbol under the R/W head is xi. So the ID before processing xi is
x1 x2 . . . xi-1 q xi . . . xn
After processing xi, the resulting ID is
x1 x2 . . . xi-2 p xi-1 y xi+1 . . . xn
This change of ID is represented by
x1 x2 . . . xi-1 q xi . . . xn ├ x1 x2 . . . xi-2 p xi-1 y xi+1 . . . xn
If i = 1, the resulting ID is p y x2 x3 . . . xn.
Suppose instead δ(q, xi) = (p, y, R); then the change of ID is represented by
x1 x2 . . . xi-1 q xi . . . xn ├ x1 x2 . . . xi-1 y p xi+1 . . . xn
If i = n, the resulting ID is x1 x2 . . . xn-1 y p □.



We can denote an ID by Ij for some j. Ij ├ Ik defines a relation among IDs. The symbol ├* denotes the reflexive–transitive closure of the relation ├. In particular, Ij ├* Ij. Also, if I1 ├* In, then we can split this as I1 ├ I2 ├ I3 ├ . . . ├ In for some IDs I2, . . ., In-1.

Note: The description of moves by IDs is very useful for representing the processing of input strings.
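A single move can be carried out on the ID representation itself. The sketch below (not from the text) applies δ(q, xi) = (p, y, D) to an ID held as the pair (x1, x2); for simplicity it assumes i > 1 for left moves.

```python
# One move on an instantaneous description (illustrative sketch).
# An ID x1 q x2 is held as (x1, state, x2), where x2 begins with the
# scanned symbol.
def move(id_, delta):
    """Apply one transition delta[(q, scanned)] = (p, y, d) to an ID."""
    x1, q, x2 = id_
    p, y, d = delta[(q, x2[0])]
    if d == "R":                       # write y, head moves right
        return (x1 + y, p, x2[1:])
    # d == "L": head moves onto the last symbol of x1 (assumes i > 1)
    return (x1[:-1], p, x1[-1] + y + x2[1:])

# With delta(q, 0) = (p, y, L):  ab q 0cd  |-  a p bycd
print(move(("ab", "q", "0cd"), {("q", "0"): ("p", "y", "L")}))
```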

REPRESENTATION OF TM BY TRANSITION TABLE:


We give the definition of the transition function δ in the form of a table called the transition table. If δ(q, a) = (γ, α, β), we write αβγ under the a-column and in the q-row. So if we get αβγ in the table, it means that α is written in the current cell, β gives the movement of the head (L or R), and γ denotes the new state into which the Turing machine enters.
Consider, for example, a Turing machine with five states q1, ..., q5, where q1 is the initial state and q5 is the (only) final state. The tape symbols (Γ) are 0, 1 and □. The transition table given below describes δ.
Present State      □         0         1
→q1               1Lq2      0Rq1      –
q2                □Rq3      0Lq2      1Lq2
q3                –         □Rq4      □Rq5
q4                0Rq5      0Rq4      1Rq4
⊙q5               0Lq2      –         –

Table 1: Transition table of a Turing machine


Note: The initial state is marked with → and the final state is marked with a circle (⊙).

Example 3: Consider the TM description given in Table 1. Draw the computation sequence of the
input string 00.
Solution:
We describe the computation sequence in terms of the contents of the tape and the current state. If the string on the tape is a1a2 . . . aj aj+1 . . . am and the TM in state q is about to read aj+1, then we write
a1a2 . . . aj q aj+1 . . . am
For the input string 00□, we get the following sequence:
q100□├ 0q10□├ 00q1□├ 0q201├ q2001
├ q2□001├□q3001 ├□□q401├ □□0q41├ □□01q4□
├ □□010q5├□□01q200├□□0q2100├ □□q20100
├ □q2□0100 ├ □□q30100 ├ □□□q4100 ├ □□□1q400
├ □□□10q40 ├ □□□100q4□ ├ □□□1000q5□
├ □□□100q200 ├ □□□10q2000 ├ □□□1q20000
├ □□□q210000 ├ □□q2□10000 ├ □□□q310000 ├ □□□□q50000
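The ID sequence above can be reproduced mechanically. The sketch below (not part of the course text) encodes the entries of Table 1, as read off the computation sequence, and runs the machine on 00 until no move applies; the final configuration agrees with the last ID above.

```python
# Simulator for the TM of Table 1 (entries read off the ID sequence
# above; illustrative sketch).
BLANK = "_"
delta = {
    ("q1", "0"): ("q1", "0", "R"),   ("q1", BLANK): ("q2", "1", "L"),
    ("q2", "0"): ("q2", "0", "L"),   ("q2", "1"):   ("q2", "1", "L"),
    ("q2", BLANK): ("q3", BLANK, "R"),
    ("q3", "0"): ("q4", BLANK, "R"), ("q3", "1"):   ("q5", BLANK, "R"),
    ("q4", "0"): ("q4", "0", "R"),   ("q4", "1"):   ("q4", "1", "R"),
    ("q4", BLANK): ("q5", "0", "R"),
    ("q5", BLANK): ("q2", "0", "L"),
}

def run(delta, w, state="q1"):
    tape, head = {i: s for i, s in enumerate(w)}, 0
    while (state, tape.get(head, BLANK)) in delta:   # halt when no move applies
        state, written, move = delta[(state, tape.get(head, BLANK))]
        tape[head] = written
        head += 1 if move == "R" else -1
    return state, "".join(tape[i] for i in sorted(tape)).strip(BLANK)

print(run(delta, "00"))   # ('q5', '0000') – the machine doubles the 0's
```

On input 00 the machine halts in q5 with 0000 on the tape, matching the final ID □□□□q50000.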

REPRESENTATION OF TM BY TRANSITION DIAGRAM:


We can use transition systems (diagrams) to represent Turing machines. The states are represented by vertices, and directed edges are used to represent transitions between states. The labels are triples of the form (α, β, γ), where α, β ∈ Γ (the set of tape symbols) and γ ∈ {L, R}. When there is a directed edge from qi to qj with label (α, β, γ), it means that

δ(qi, α) = (qj, β, γ)

During the processing of an input string, suppose the Turing machine enters qi and the R/W head scans the (present) symbol α. As a result the symbol β is written in the cell under the R/W head. The R/W head moves to the left or to the right depending on γ, and the new state is qj.
Every edge in the transition system can be represented by a 5-tuple (qi, α, β, γ, qj). So each Turing machine can be described by a sequence of 5-tuples representing all the directed edges. The initial state is indicated by → and any final state is marked with a circle (⊙).
Example: M is a Turing machine represented by the transition system in figure below. Obtain
the computation sequence of M for processing the input string 0011.

The diagram has states q1 (initial), q2, q3, q4, q5 and q6 (final), and its directed edges, written as 5-tuples (qi, α, β, γ, qj), are
(q1, 0, x, R, q2), (q2, 0, 0, R, q2), (q2, y, y, R, q2), (q2, 1, y, L, q3),
(q3, y, y, L, q3), (q3, 0, 0, L, q4), (q3, x, x, R, q5),
(q4, 0, 0, L, q4), (q4, x, x, R, q1),
(q5, y, y, R, q5), (q5, □, □, R, q6).

Figure 4: Transition Diagram for M.



Solution: The initial tape input is □0011□. Let us assume that M is in state q1 and the R/W head scans 0 (the first 0). We can represent this as in Figure 5.

□ 0 0 1 1 □
Tape

R/W head
State

q1
Figure 5: TM processing 0011
The figure can be represented by

□0011□
q1
From Figure 4 we see that there is a directed edge from q1 to q2 with the label (0, x, R). So the current symbol 0 is replaced by x and the head moves right. The new state is q2. Thus, we get

□x 011□
q2
The change brought about by processing the symbol 0 can be represented as

(0, x, R)
□0011□ □x011□
q1 q2
The entire computation sequence reads as follows:

(0, x, R) (0, 0, R)
□0011□ □x 011□ □x 011□
q1 q2 q2

(1, y, L) (0, 0, L) (x, x, R)


□x0y1□ □x0y1□ □x0y1□
q3 q4 q1

(0, x, R) (y, y, R) (1, y, L)


□xxy1□ □xxy1□ □xxyy□
q2 q2 q3

(y, y, L) (x, x, R) (y, y, R)


□xxyy□ □xxyy□ □xxyy□
q3 q5 q5

(y, y, R) (□, □, R)
□xxyy□ □xxyy□□
q5 q6
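The same computation can be replayed programmatically. The sketch below (not part of the course text) lists the edges of Figure 4 as 5-tuples, as read off the diagram and the computation sequence above, and simulates M on 0011; the machine marks 0's with x and 1's with y, so it accepts exactly the strings 0ⁿ1ⁿ.

```python
# Simulator driven by the edge list of Figure 4 (illustrative sketch).
BLANK = "_"
edges = [
    ("q1", "0", "x", "R", "q2"),
    ("q2", "0", "0", "R", "q2"), ("q2", "y", "y", "R", "q2"),
    ("q2", "1", "y", "L", "q3"),
    ("q3", "y", "y", "L", "q3"), ("q3", "0", "0", "L", "q4"),
    ("q3", "x", "x", "R", "q5"),
    ("q4", "0", "0", "L", "q4"), ("q4", "x", "x", "R", "q1"),
    ("q5", "y", "y", "R", "q5"), ("q5", BLANK, BLANK, "R", "q6"),
]
# Each edge (qi, alpha, beta, gamma, qj) becomes delta(qi, alpha) = (qj, beta, gamma).
delta = {(q, a): (p, b, d) for q, a, b, d, p in edges}

def run(delta, w, state="q1"):
    tape, head = {i: s for i, s in enumerate(w)}, 0
    while (state, tape.get(head, BLANK)) in delta:   # halt when no move applies
        state, written, move = delta[(state, tape.get(head, BLANK))]
        tape[head] = written
        head += 1 if move == "R" else -1
    return state, "".join(tape[i] for i in sorted(tape)).strip(BLANK)

print(run(delta, "0011"))   # ('q6', 'xxyy') – 0011 reaches the final state q6
```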
CONSTRUCTION OF TURING MACHINE(TM)
Designing a Turing machine to solve a problem is an interesting task. It is somewhat similar to
programming. Given a problem, different Turing machines can be constructed to solve it. But we would
like to have a Turing machine which does it in a simple and efficient manner. Just as we learn programming techniques to deal with alternatives, loops, etc., it is helpful to understand some techniques of Turing machine construction, which will help in designing simple and efficient Turing machines. It should be noted that we are using the word 'efficient' in an intuitive manner here.

VARIANTS OF TURING MACHINES


N-Track Turing Machine
An N-track Turing Machine is one in which each square of the tape holds an ordered n-tuple of
symbols from the tape alphabet. This can be thought of as a Turing machine with multiple tape heads,
all of which move in lock-step mode.
“N-Track Turing machines are equivalent to standard Turing machines”.
Semi-Infinite Tape Turing Machine
A Turing machine may have a “semi-infinite tape”, the nonblank input is at the extreme left end
of the tape and it is infinite in length to the right.
“Turing machines with semi-infinite tape are equivalent to Standard Turing machines”.
Offline Turing Machine
An “Offline Turing Machine” has two tapes. One tape is read-only and contains the input, the
other is read-write and is initially blank.
“Offline Turing machines are equivalent to Standard Turing machines”.
Multi-tape Turing Machine
A “Multi-tape Turing Machine” has a finite number of tapes, each with its own independently
controlled tape head. A multitape TM has a finite set Q of states, an initial state q0, a subset F of Q called the set of final states, a set Γ of tape symbols, and a new symbol □, not in Γ, called the blank symbol. (We assume that Σ ⊆ Γ and □ ∉ Σ.)
There are k tapes, each divided into cells. The first tape holds the input string w. Initially, all the
other tapes hold the blank symbol.
Initially the head of the first tape (the input tape) is at the left end of the input w. All the other heads can be placed at any cell initially.
δ is a partial function from Q × Γ^k into Q × Γ^k × {L, R, S}^k. We use an implementation description to define δ. Figure 6 represents a multi-tape TM. A move depends on the current state and the k tape symbols under the k tape heads.



(A finite control connected to k tapes, each scanned by its own head.)
Figure 6: Multitape Turing machine
In a typical move:
(i) M enters a new state.
(ii) On each tape, a new symbol is written in the cell under the head.
(iii) Each tape head moves to the left or right or remains stationary. The heads move
independently: some move to the left, some to the right and the remaining heads
do not move.
The initial ID has the initial state q0, the input string w on the first tape (the input tape), and strings of blanks on the remaining k – 1 tapes. An accepting ID has a final state and some string on each of the k tapes.
“Multi-tape Turing Machines are equivalent to Standard Turing Machines”.
Exercise: Prove that every language accepted by a multitape TM is accepted by some single-tape TM (that is, by the standard TM).
Exercise: Prove that if M1 is the single-tape TM simulating a multitape TM M, then the time taken by M1 to simulate n moves of M is O(n²).

Nondeterministic Turing Machines


In the case of standard Turing machines (hereafter referred to as deterministic TMs), δ(q, a) was defined (for some elements of Q × Γ) as an element of Q × Γ × {L, R}. Now we extend the definition of δ. In a nondeterministic TM, δ(q, a) is defined as a subset of Q × Γ × {L, R}.
Definition: A nondeterministic Turing machine is a 7-tuple (Q, Σ, Γ, δ, q0, □, F) where
1. Q is a finite nonempty set of states
2. Γ is a finite nonempty set of tape symbols
3. □ ∈ Γ is called the blank symbol
4. Σ is a nonempty subset of Γ called the set of input symbols. We assume that □ ∉ Σ.
5. q0 is the initial state
6. F ⊆ Q is the set of final states
7. δ is a partial function from Q × Γ into the power set of Q × Γ × {L, R}.
Note: If q ∈ Q and x ∈ Γ and δ(q, x) = {(q1, y1, D1), (q2, y2, D2), . . ., (qn, yn, Dn)}, then the NDTM can choose any one of the actions defined by (qi, yi, Di) for i = 1, 2, . . ., n.
We can also express this in terms of the ├ (move) relation. If δ(q, x) = {(qi, yi, Di) | i = 1, 2, . . ., n}, then the ID zqxw can change to any one of the n IDs specified by the n-element set δ(q, x).
Suppose δ(q, x) = {(q1, y1, L), (q2, y2, R), (q3, y3, L)}. Then
z1z2 . . . zkqxzk+1 . . . zn├z1z2 . . . zk-1q1zky1zk+1. . . zn
or z1z2 . . . zkqxzk+1 . . . zn├ z1z2 . . . zky2q2zk+1. . . zn
or z1z2 . . . zkqxzk+1 . . . zn├ z1z2 . . . zk-1q3zky3zk+1. . . zn
So on reading the input symbol, the NDTM M whose current ID is z1z2 . . . zkqxzk+1 . . . zn can
change to any one of the three IDs given earlier.
Remark: When δ(q, x) = {(qi, yi, Di) | i = 1, 2, . . ., n}, the NDTM chooses one of the n triples as a whole (that is, it cannot take a state from one triple, a tape symbol from a second triple and a direction D (L or R) from a third triple, etc.).
Definition: w ∈ Σ* is accepted by a nondeterministic TM M if q0w ├* xqfy for some final state qf.
Note: As in the case of an NDFA, an ID of the form xqy (for some q ∉ F) may be reached as the result of applying the input string w. But w is accepted by M as long as there is some sequence of moves leading to an ID with an accepting state. It does not matter that there are other sequences of moves leading to an ID with a nonfinal state, or that the TM halts without processing the entire input string.
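The "some sequence of moves" criterion can be phrased as a search over IDs. The sketch below is not from the text: the machine shown is a hypothetical NDTM accepting strings over {a, b} that end in b, and the step bound is our stand-in for the fact that some branches may never halt. The search explores all branches breadth-first and accepts if any branch reaches a final state.

```python
from collections import deque

BLANK = "_"
# Hypothetical NDTM: on reading b it may either keep scanning, or guess
# "this was the last symbol" and check that a blank follows.
delta = {
    ("q0", "a"):   [("q0", "a", "R")],
    ("q0", "b"):   [("q0", "b", "R"), ("q1", "b", "R")],
    ("q1", BLANK): [("qf", BLANK, "R")],
}
FINAL = {"qf"}

def accepts(delta, final, w, max_steps=100):
    """Breadth-first search over IDs; True iff some branch reaches a final state."""
    start = (tuple(w) or (BLANK,), 0, "q0")      # ID = (tape, head, state)
    frontier, seen = deque([(start, 0)]), {start}
    while frontier:
        (tape, head, state), n = frontier.popleft()
        if state in final:
            return True
        if n == max_steps:
            continue
        sym = tape[head] if head < len(tape) else BLANK
        for new_state, written, move in delta.get((state, sym), []):
            t = list(tape)
            if head == len(t):                   # stepped past the right end
                t.append(BLANK)
            t[head] = written
            h = head + (1 if move == "R" else -1)
            if h < 0:                            # fell off the left edge
                t.insert(0, BLANK); h = 0
            nid = (tuple(t), h, new_state)
            if nid not in seen:
                seen.add(nid)
                frontier.append((nid, n + 1))
    return False

print(accepts(delta, FINAL, "ab"), accepts(delta, FINAL, "ba"))  # True False
```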
Definition: Let M be a TM and w an input string. The running time of M on input w is the number of steps that M takes before halting. If M does not halt on an input string w, then the running time of M on w is infinite.
Note: Some TMs may not halt on all inputs of length n. But we are interested in computing the running
time, only when the TM halts.
Definition: The time complexity of TM M is the function T(n), n being the input size, where T(n) is
defined as the maximum of the running time of M over all inputs w of size n.



6 COMPUTABILITY
INTRODUCTION
In this chapter we shall discuss the class of primitive recursive functions – a subclass of the partial recursive functions. The Turing machine is viewed as a mathematical model of a partial recursive function. We will study automata as computing machines. The problem of finding out whether a given problem is 'solvable' by automata reduces to the evaluation of functions on the set of natural numbers, or on a given alphabet, by mechanical means.
We start with the definition of partial and total functions.
A partial function f from X to Y is a rule which assigns to every element of X at most one element of Y.
A total function from X to Y is a rule which assigns to every element of X a unique element of Y. For
example, if R denotes the set of all real numbers, the rule f from R to itself given by f (r) =
+√r is a partial function since f(r) is not defined as a real number when r is negative. But g(r) = 2r is a
total function from R to itself.
In this chapter we consider total functions from Xk to X, where X = {0, 1, 2, 3, 4, 5, . . .} or X = {a, b}*. Throughout this chapter we denote {0, 1, 2, . . .} by N and {a, b} by Σ. (Recall that Xk is the set of all k-tuples of elements of X.)
A partial or total function f from Xk to X is also called a function of k variables and denoted by f(x1, x2, . . ., xk). For example, f(x1, x2) = 2x1 + x2 is a function of two variables: f(1, 2) = 4, where 1 and 2 are called arguments and 4 is called a value. g(w1, w2) = w1w2 is a function of two variables (w1, w2 ∈ Σ*); g(ab, aa) = abaa, where ab and aa are the arguments and abaa is the value.

PRIMITIVE RECURSIVE FUNCTIONS


In this section we construct primitive recursive functions over N and Σ. We define some initial functions
and declare them as primitive recursive functions. By applying certain operations on the primitive
recursive functions obtained so far, we get the class of primitive recursive functions.

INITIAL FUNCTIONS
The initial functions over N are:
(a) Zero Function Z defined by Z(x) = 0.
(b) Successor Function S defined by S(x) = x + 1
(c) Projection function Uin defined by Uin(x1, x2, . . ., xn) = xi
For example: S(4) = 5, Z(7) = 0, U23(2, 4, 7) = 4, U13(2, 4, 7) = 2, U33(2, 4, 7) = 7.
Note: As U11(x) = x for every x in N, U11 is simply the identity function. So Uin is also termed as a
generalized identity function.
The initial functions over Σ = {a, b} are:
(a) nil(x) = λ
(b) cons a(x) = ax
(c) cons b(x) = bx
For example: nil(abab) = λ, cons a(abab) = aabab, cons b(abab) = babab.
Note: We note that cons a(x) and cons b(x) simply denote the concatenation of the 'constant' string a
and x and the concatenation of the constant string b and x.
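The initial functions can be written down directly. In the sketch below (illustrative, not part of the text) λ is represented by the empty Python string, and Uin is built as a function of its arguments.

```python
# The initial functions over N and over Sigma = {a, b} (illustrative sketch).
def Z(x):            # zero function
    return 0

def S(x):            # successor function
    return x + 1

def U(i, n):         # projection U_i^n; n is kept only for the notation
    return lambda *xs: xs[i - 1]

def nil(x):          # over Sigma*: constant empty string (lambda in the text)
    return ""

def cons_a(x):       # prefix the constant symbol a
    return "a" + x

def cons_b(x):       # prefix the constant symbol b
    return "b" + x

print(S(4), Z(7), U(2, 3)(2, 4, 7))                  # 5 0 4
print(nil("abab"), cons_a("abab"), cons_b("abab"))   # '' aabab babab
```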
Definition: If fl, f2, ... , fk are partial functions of n variables and g is a partial function of k variables,
then the composition of g with fl, f2, ... , fk is a partial function of n variables defined by
g(fl(x1, x2, .. . ., xn), f2(x1, x2, .. . ., xn), ... , fk(x1, x2, .. . ., xn))
Example: Let f1(x, y) = x + y, f2(x, y) = 2x, f3(x, y) = xy and g(x, y, z) = x + y + z be functions over N.
Then
g( fl(x, y), f2(x, y), f3(x, y)) = g(x + y, 2x, xy)
= x + y + 2x + xy
Thus the composition of g with fl , f2, f3 is given by a function h:
h(x, y) = x + y + 2x + xy
Note: The definition above generalizes the composition of two functions. The concept is useful where a number of outputs become the inputs for a subsequent step of a program.
The composition of g with fl , f2, . . ., fn is total when g, fl , f2, . . ., fn are total.
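The composition operation can be sketched operationally; the example below (an illustrative sketch, not part of the text) reproduces h(x, y) = x + y + 2x + xy from the worked example above.

```python
# Composition of g with f1, ..., fk (illustrative sketch).
def compose(g, *fs):
    """Return the function (x1..xn) -> g(f1(x1..xn), ..., fk(x1..xn))."""
    return lambda *xs: g(*(f(*xs) for f in fs))

f1 = lambda x, y: x + y
f2 = lambda x, y: 2 * x
f3 = lambda x, y: x * y
g  = lambda x, y, z: x + y + z

h = compose(g, f1, f2, f3)       # h(x, y) = x + y + 2x + xy
print(h(1, 2))                   # 1 + 2 + 2 + 2 = 7
```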
The next definition gives a mechanical process of computing a function.
Definition: A function f(x) over N is defined by recursion if there exists a constant k (a natural number) and a function h(x, y) such that
f(0) = k, f(n + 1) = h(n, f(n)) (3.1)
By induction on n, we can define f(n) for all n. As f(0) = k, there is a basis for the induction. Once f(n) is known, f(n + 1) can be evaluated by using (3.1). More generally, a function f(x, y) of two variables is defined by recursion from a function g of one variable and a function h of three variables if
f(x, 0) = g(x) (3.2)
f(x, y + 1) = h(x, y, f(x, y)) (3.3)

PRIMITIVE RECURSIVE FUNCTIONS OVER N


Definition: A total function f over N is called primitive recursive (i) if it is anyone of the three initial
functions, or (ii) if it can be obtained by applying composition and recursion a finite number of times to
the set of initial functions.
Example: Show that the function f1 (x, y) = x + y is primitive recursive.
Solution: f1 is a function of two variables. If we want f1 to be defined by recursion, we need a function g of a single variable and a function h of three variables.
f1(x, 0) = x + 0 = x
By comparing f1(x, 0) with the L.H.S. of (3.2), we see that g can be defined by
g(x) = x = U11(x)
Also, comparing f1(x, y + 1) with the L.H.S. of (3.3), we have
h(x, y, f1(x, y)) = f1(x, y) + 1 = S(f1(x, y)) = S(U33(x, y, f1(x, y)))



Define h(x, y, z) = S(U33(x, y, z)). As g = U11, g is an initial function, and h is obtained from the initial functions U33 and S by composition. The function f1 is then obtained by recursion using g and h. Thus f1 is obtained by applying composition and recursion a finite number of times to the initial functions U11, U33 and S. So f1 is primitive recursive.
Note: A total function is primitive recursive if it can be obtained by applying composition and recursion
a finite number of times to primitive recursive functions fl , f2, . . ., fm. This is clear as each fi is obtained
by applying composition and recursion a finite number of times to initial functions.
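The recursion scheme f1(x, 0) = g(x), f1(x, y + 1) = h(x, y, f1(x, y)) can be evaluated by the inductive procedure the definition describes. The sketch below (helper names are ours, not from the text) rebuilds addition from U11, U33 and S exactly as in the example above.

```python
# Evaluating a function defined by recursion (illustrative sketch).
def recurse(g, h):
    """f defined by recursion from g and h:
       f(x, 0) = g(x),  f(x, y + 1) = h(x, y, f(x, y))."""
    def f(x, y):
        acc = g(x)                  # basis: f(x, 0) = g(x)
        for n in range(y):          # inductive step, applied y times
            acc = h(x, n, acc)
        return acc
    return f

S = lambda x: x + 1
U11 = lambda x: x                   # g = U_1^1
h = lambda x, y, z: S(z)            # h(x, y, z) = S(U_3^3(x, y, z))
add = recurse(U11, h)               # f1(x, y) = x + y
print(add(3, 4))                    # 7
```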

PRIMITIVE RECURSIVE FUNCTIONS OVER Σ = {a, b}


For constructing primitive recursive functions over {a, b}, the process is similar to that for functions over N, except for some minor modifications. It should be noted that λ plays the role of 0 in (3.2) and ax or bx plays the role of y + 1 in (3.3). Recall that Σ denotes {a, b}.

Definition: A function f(x) over Σ is defined by recursion if there exists a 'constant' string w ∈ Σ* and functions h1(x, y) and h2(x, y) such that
f(λ) = w (3.4)
f(ax) = h1(x, f(x)) (3.5)
f(bx) = h2(x, f(x))
(h1 and h2 may be functions in one variable.)
Definition: A function f(x1, x2, . . ., xn) over Σ is defined by recursion if there exist functions g(x1, x2, . . ., xn-1), h1(x1, x2, . . ., xn+1) and h2(x1, x2, . . ., xn+1) such that
f(λ, x2, . . ., xn) = g(x2, . . ., xn) (3.6)
f(ax1, x2, . . ., xn) = h1(x1, x2, . . ., xn, f(x1, x2, . . ., xn)) (3.7)
f(bx1, x2, . . ., xn) = h2(x1, x2, . . ., xn, f(x1, x2, . . ., xn))
(h1 and h2 may be functions of m variables, where m < n + 1.)
Now we can define the class of primitive recursive functions over .

Definition: A total function f over  is primitive recursive (i) if it is anyone of the three initial functions,
or (ii) if it can be obtained by applying composition and recursion a finite number of times to the initial
functions.
Note: As in the case of functions over N, a total function over  is primitive recursive if it is obtained
by applying composition and recursion a finite number of times to primitive recursive function
f1, f2, . . ., fm.
Example: Show that the following functions are primitive recursive:
(a) Constant functions a and b (i.e. a(x) = a, b(x) = b)
(b) Identity function
(c) Concatenation
(d) Transpose
(e) Head function (i.e. head (a1a2 ... an) = a1)
(f) Tail function (i.e. tail (a1a2 ... an) = a2 ... an)
(g) The conditional function "if x1 ≠ λ then x2 else x3"
Solution :
(a) As a(x) = cons a(nil (x)), the function a(x) is the composition of the initial function cons a with
the initial function nil and is hence primitive recursive.
(b) Let us denote the identity function by id. Then,
id(λ) = λ

id(ax) = cons a(x)

id(bx) = cons b(x)

So id is defined by recursion using cons a and cons b. Therefore, the identity function is primitive
recursive.
(c) The concatenation function can be defined by
concat(x1, x2) = x1x2

concat(λ, x2) = id(x2)

concat(ax1, x2) = cons a (concat(x1, x2))

concat(bx1, x2) = cons b (concat(x1, x2))

(d) The transpose function can be defined by trans(x) = xT. Then


trans(λ) = λ

trans(ax) = concat(trans(x), a(x) )

trans(bx) = concat(trans(x), b(x) )

(e) The head function head(x) satisfies


head(λ) = λ

head(ax) = a(x)

head(bx) = b(x)

(f) The tail function tail(x) satisfies


tail(λ) = λ

tail(ax) = id(x)

tail(bx) = id(x)

Therefore, tail(x) is primitive recursive.


(g) The conditional function can be defined by

cond(x1, x2, x3) = "if x1 ≠ λ then x2 else x3"



Then,
cond(λ, x2, x3) = id(x3)
cond(ax1, x2, x3) = id(x2)
cond(bx1, x2, x3) = id(x2)
Therefore, cond(x1, x2, x3) is primitive recursive.
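The recursive definitions in the solution translate directly into code. In this sketch (function names are ours; λ is the empty string) each equation of the solution appears as one branch.

```python
# Operational sketches of the string functions in the example above.
def ident(x):                             # the identity function id
    return x

def concat(x1, x2):                       # concat(ax1, x2) = cons_a(concat(x1, x2))
    if x1 == "":
        return ident(x2)
    return x1[0] + concat(x1[1:], x2)

def trans(x):                             # trans(ax) = concat(trans(x), a)
    if x == "":
        return ""
    return concat(trans(x[1:]), x[0])

def head(x):                              # head(ax) = a
    return "" if x == "" else x[0]

def tail(x):                              # tail(ax) = id(x)
    return "" if x == "" else x[1:]

def cond(x1, x2, x3):                     # "if x1 != lambda then x2 else x3"
    return x3 if x1 == "" else x2

print(trans("abab"), head("abab"), tail("abab"), cond("a", "yes", "no"))
# baba a bab yes
```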

RECURSIVE FUNCTIONS
By introducing one more operation on functions, we define the class of recursive functions,
which includes the class of primitive recursive functions.
Definition: Let g(x1, x2, . . ., xn, y) be a total function over N. g is a regular function if there exists some
natural number y0 such that g(x1, x2, . . ., xn, y0) = 0 for all values x1, x2, . . ., xn in N.
For instance, g(x, y) = min(x, y) is a regular function since g(x, 0) = 0 for all x in N. But f(x, y) = |x – y| is not regular since f(x, y) = 0 only when x = y, and so we cannot find a fixed y0 such that f(x, y0) = 0 for all x in N.
Definition: A function f (x1, x2, . . ., xn) over N is defined from a total function g(x1, x2, . . .,
xn, y) by minimization if
(a) f (x1, x2, . . ., xn) is the least value of all y's such that g(x1, x2, . . ., xn, y) = 0, if it exists. The least
value is denoted by μy(g(x1, x2, . . ., xn, y) = 0).
(b) f (x1, x2, . . ., xn) is undefined if there is no y such that g(x1, x2, . . ., xn, y) = 0.
Note: In general, f is partial. But, if g is regular then f is total.
Definition: A function is recursive if it can be obtained from the initial functions by a finite number of
applications of composition, recursion and minimization over regular functions.
Definition: A function is partial recursive if it can be obtained from the initial functions by a finite
number of applications of composition, recursion and minimization.
Example: Show that f(x) = x/2 is a partial recursive function over N.
Solution: Let g(x, y) = |2y – x|; here 2y – x = 0 for some y only when x is even. Let
f1(x) = μy(|2y – x| = 0). Then f1(x) is defined only for even values of x and is equal
to x/2. When x is odd, f1(x) is not defined. So f1 is partial recursive. As f(x) = x/2 = f1(x), f is a
partial recursive function.
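The minimization operator itself can be sketched in code. Since a true μ searches forever when no suitable y exists, the demo below caps the search with an explicit bound; the cap and the use of None for "undefined" are our additions, not part of the definition:

```python
# A sketch of the minimization operator μy(g(x, y) = 0): search for the
# least y making g zero.  The bound keeps this demo terminating; None
# plays the role of "undefined".

def mu(g, x, bound=10**6):
    for y in range(bound + 1):
        if g(x, y) == 0:
            return y          # the least y with g(x, y) = 0
    return None               # no such y found: undefined for this input

def g(x, y):
    return abs(2 * y - x)     # g(x, y) = |2y - x|

print(mu(g, 6))  # f1(6) = 6/2 = 3
print(mu(g, 7))  # 7 is odd, so f1(7) is undefined -> None
```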
So far we have dealt with recursive and partial recursive functions over N. We can define
partial recursive functions over Σ using the primitive recursive predicates and the minimization process.
As the process is similar, we will not discuss it here.
The concept of recursion occurs in some programming languages when a procedure has a call to
the same procedure for a different parameter. Such a procedure is called a recursive procedure. Programming languages such as C and C++ allow recursive procedures.
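A familiar illustration of a recursive procedure in this sense, shown here in Python rather than C or C++, is the factorial function, which calls itself with a smaller parameter until a base case is reached:

```python
# A recursive procedure: factorial calls itself with a smaller argument.

def factorial(n):
    if n == 0:                    # base case
        return 1
    return n * factorial(n - 1)   # recursive call with a different parameter

print(factorial(5))  # -> 120
```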



7 COMPUTATIONAL COMPLEXITY
INTRODUCTION
When a problem/language is decidable, it simply means that the problem is computationally
solvable in principle. It may not be solvable in practice, in the sense that it may require an enormous amount
of computation time and memory. In this chapter we discuss the computational complexity of a problem.
The proofs of decidability/undecidability are quite rigorous, since they depend solely on the definition
of a Turing machine and rigorous mathematical techniques. But the proof and the discussion in
complexity theory rest on the assumption that P ≠ NP. Computer scientists and mathematicians
strongly believe that P ≠ NP, but this is still open.

This problem is one of the challenging problems of the 21st century and carries a prize
of $1M. P stands for the class of problems that can be solved by a deterministic algorithm (i.e.
by a Turing machine that halts) in polynomial time; NP stands for the class of problems that can be
solved by a nondeterministic algorithm (that is, by a nondeterministic TM) in polynomial time; P stands
for polynomial and NP for nondeterministic polynomial. Another important class is the class of NP-complete problems, which is a subclass of NP.
In this chapter these concepts are formalized and Cook's theorem on the NP-completeness of
the SAT problem is proved.

BIG-O NOTATION
GROWTH RATE OF FUNCTIONS
When we have two algorithms for the same problem, we may require a comparison between the
running times of these two algorithms. With this in mind, we study the growth rate of functions defined
on the set of natural numbers N.
Definition: Let f, g : N → R+ (R+ being the set of all positive real numbers). We say that f(n) =
O(g(n)) if there exist positive integers C and N0 such that
f(n) ≤ C·g(n) for all n ≥ N0.
In this case we say f is of the order of g (or f is 'big oh' of g).
Note: f (n) = O(g(n)) is not an equation. It expresses a relation between two functions f and g.
Note: If p(n) = a_k n^k + a_(k-1) n^(k-1) + . . . + a_1 n + a_0 is a polynomial of degree k over Z and a_k > 0,
then p(n) = O(n^k).
Example: Let f(n) = 4n^3 + 5n^2 + 7n + 3. Prove that f(n) = O(n^3).
Solution: In order to prove that f(n) = O(n^3), take C = 5 and N0 = 10. Then
f(n) = 4n^3 + 5n^2 + 7n + 3 ≤ 5n^3 for all n ≥ 10.
Indeed, when n = 10, 5n^2 + 7n + 3 = 573 < 10^3, and for n > 10, 5n^2 + 7n + 3 < n^3. Hence f(n) = O(n^3).
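The choice of witnesses C = 5 and N0 = 10 can also be checked numerically over a range of n; this is a quick sanity check of ours, not a substitute for the proof:

```python
# Checking f(n) <= 5 n^3 for n >= 10, where f(n) = 4n^3 + 5n^2 + 7n + 3.

def f(n):
    return 4 * n**3 + 5 * n**2 + 7 * n + 3

assert all(f(n) <= 5 * n**3 for n in range(10, 2001))
print(f(10), 5 * 10**3)  # -> 4573 5000
```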
Note: The order of a polynomial is determined by its degree.
Definition: An exponential function is a function q: N → N defined by
q(n) = a^n for some fixed a > 1.
When n increases, each of n, n^2 and 2^n increases. But a comparison of these functions for specific
values of n will indicate the vast difference between their growth rates.
TABLE 1: Growth Rate of Polynomial and Exponential Functions

 n       f(n) = n^2    g(n) = n^2 + 3n + 9    q(n) = 2^n
 1       1             13                     2
 5       25            49                     32
 10      100           139                    1024
 50      2500          2659                   1.13 × 10^15
 100     10000         10309                  1.27 × 10^30
 1000    1000000       1003009                1.07 × 10^301
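The entries of Table 1 can be regenerated with a short loop, which makes the take-off of 2^n vivid:

```python
# Reproducing Table 1: n^2, n^2 + 3n + 9 and 2^n for selected n.
# Python ints are arbitrary precision, so 2**1000 is computed exactly;
# float() renders it in scientific notation for easy comparison.

for n in [1, 5, 10, 50, 100, 1000]:
    print(f"{n:>5} {n**2:>9} {n**2 + 3*n + 9:>9} {float(2**n):.2e}")
```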

From Table 1, it is easy to see that the function q(n) grows at a very fast rate when compared to
f(n) or g(n). In particular, the exponential function grows much faster than any polynomial, however
large its degree. We now make this comparison precise.

Definition: We say g ∉ O(f) if for any constants C and N0 there exists n ≥ N0 such that g(n) > C·f(n).

Definition: If f and g are two functions and f = O(g) but g ∉ O(f), we say that the growth rate of g
is greater than that of f. (In this case g(n)/f(n) becomes unbounded as n → ∞.)
Note: The growth rate of any exponential function is greater than that of any polynomial.
Note: The function n^(log n) lies between any polynomial function and a^n for any constant a > 1. As log n ≥ k
for a given constant k and large values of n, n^(log n) ≥ n^k for large values of n. Hence n^(log n) dominates any
polynomial. But n^(log n) = (e^(log n))^(log n) = e^((log n)^2). Let us calculate lim(x→∞) (log x)^2/(cx). By L'Hospital's rule,

lim(x→∞) (log x)^2/(cx) = lim(x→∞) (2 log x)(1/x)/c = lim(x→∞) 2 log x/(cx) = lim(x→∞) 2/(cx) = 0

So (log n)^2 grows more slowly than cn. Hence n^(log n) = e^((log n)^2) grows more slowly than e^(cn), and in
particular more slowly than 2^n. The same holds good when the logarithm is taken to base 2, since log_e n and
log_2 n differ by a constant factor. Hence there exist functions lying between polynomials and exponential functions.

CLASS P VERSUS CLASS NP


In this section we introduce the classes P and NP of languages.
Definition: A Turing machine M is said to be of time complexity T(n) if the following holds: Given
an input w of length n, M halts after making at most T(n) moves.



Note: In this case, M eventually halts. Recall that the standard TM is called a deterministic TM.

Definition: A language L is in class P if there exists some polynomial T(n) such that
L = T(M) for some deterministic TM M of time complexity T(n).
Example: Construct the time complexity T(n) for the Turing machine M which accepts the language
{0^n 1^n | n ≥ 1}.
Solution: We require the following moves:
(a) If the leftmost symbol in the given input string w is 0, replace it by x and move right till we
encounter the leftmost 1 in w. Change it to y and move back.
(b) Repeat (a) with the leftmost remaining 0. If, after moving back and forth, no 0 or 1 remains, move to a final
state.
(c) For strings not of the form 0^n 1^n, the resulting state has to be nonfinal.
Step (a) consists of going through the input string (0^n 1^n) forward and backward and replacing the
leftmost 0 by x and the leftmost 1 by y. So we require at most 2n moves to match a 0 with a 1. Step (b)
is a repetition of step (a) n times. Hence the number of moves for accepting 0^n 1^n is at most (2n)(n). For
strings not of the form 0^n 1^n, the TM halts within 2n^2 steps. Hence T(n) = O(n^2).
We can also define the complexity of algorithms. In the case of algorithms, T(n) denotes the
running time for solving a problem with an input of size n, using this algorithm.
Note: The Euclidean algorithm for computing the gcd of two numbers is a polynomial time algorithm.
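Written recursively, the Euclidean algorithm makes O(log min(a, b)) calls, so it runs in time polynomial in the length of the input:

```python
# Euclidean algorithm for gcd; the second argument at least halves
# every two recursive calls, so the recursion depth is O(log min(a, b)).

def gcd(a, b):
    if b == 0:
        return a
    return gcd(b, a % b)

print(gcd(252, 198))  # -> 18
```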

Definition: A language L is in class NP if there is a nondeterministic TM M and a polynomial time
complexity T(n) such that L = T(M) and M executes at most T(n) moves for every input w of length n.
It has been proved that there exists a deterministic TM M1 simulating a nondeterministic TM M.
If T(n) is the complexity of M, then the complexity of the equivalent deterministic TM M1 is 2^O(T(n)).
This can be justified as follows. The processing of an input string w of length n by M is equivalent to a
'tree' of computations by M1. Let k be the maximum of the number of choices offered by the
nondeterministic transition function (it is max |δ(q, x)|, the maximum taken over all states q and all tape
symbols x). Every branch of the computation tree has a length of T(n) or less. Hence the total number of
leaves is at most k^T(n). Hence the complexity of M1 is at most 2^O(T(n)).
It is not known whether the complexity of M1 can be brought below 2^O(T(n)). Once again, an answer to this
question would prove or disprove P = NP. But there do exist algorithms where T(n) lies between a
polynomial and an exponential function.

POLYNOMIAL TIME REDUCTION AND NP-COMPLETE PROBLEMS


If P1 and P2 are two problems and P2 ∈ P, then we can decide whether P1 ∈ P by relating the
two problems P1 and P2. If there is an algorithm for obtaining an instance of P2 given any instance of
P1, then we can decide about the problem P1. Intuitively, if this algorithm is a polynomial one, then the



problem P1 can be decided in polynomial time.
Definition: Let P1 and P2 be two problems. A reduction from P1 to P2 is an algorithm which converts
an instance of P1 to an instance of P2. If the time taken by the algorithm is a polynomial p(n), n being
the length of the input of P1, then the reduction is called a polynomial reduction from P1 to P2.

Theorem 1: If there is a polynomial time reduction from P1 to P2 and if P2 is in P then P1 is in P.


Proof: Let m denote the size of the input of P1. As there is a polynomial-time reduction from P1
to P2, the corresponding instance of P2 can be obtained in polynomial time, say O(m^j). So the size of the
resulting input of P2 is at most cm^j for some constant c. As P2 is in P, the time taken for deciding
membership in P2 is O(n^k), n being the size of the input of P2. So the total time taken for deciding the
membership of an m-size input of P1 is the sum of the time taken for conversion into an instance of P2 and
the time for deciding the corresponding input of P2. This is O(m^j + (cm^j)^k), which is the same as O(m^(jk)).
So P1 is in P.
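The arithmetic at the end of the proof can be checked numerically for sample exponents. The values of j, k and c below are illustrative choices of ours, not from the text:

```python
# With a reduction running in about m^j steps (output size <= c*m^j)
# and a decider for P2 running in n^k steps, the total work
# m^j + (c*m^j)^k stays within a constant multiple of m^(j*k).

j, k, c = 2, 3, 4
for m in [10, 100, 1000]:
    total = m**j + (c * m**j) ** k
    print(m, total / m ** (j * k))  # ratio stays bounded (approaches c**k = 64)
```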
