Course Material
January, 2022
TABLE OF CONTENTS
1. INTRODUCTION TO THEORY OF COMPUTATION .............. 5
1.1 MATHEMATICAL PRELIMINARIES .............. 6
1.1.1 SETS AND SUBSETS .............. 6
6 COMPUTABILITY .............. 99
Compiled by: Destalem H. 3
INTRODUCTION .............. 99
PRIMITIVE RECURSIVE FUNCTIONS .............. 99
INITIAL FUNCTIONS .............. 99
In this chapter, topics such as set theory, set operations, functions, relations, graphs, proof techniques, languages, automata, and grammars are discussed briefly. The concepts covered in this chapter serve as a foundation for the chapters that follow.
A question that will occur to every reader of this module is: since computer science is a practical discipline, why do we need to study its theoretical aspect? There are several reasons for studying formal languages and automata, which form the core of the theoretical part of the field. This chapter introduces the reader to the principal motivation and outlines the major topics covered in this module.
Activity 1.1
✓ Mention and explain briefly why we need to study the theoretical aspect of
computer science.
Why study theory?
The first answer is that theory provides concepts and principles that help us understand the general
nature of the discipline. The field of computer science includes a wide range of special topics, from
machine design to programming.
The second answer is that the ideas we will discuss have some immediate and important applications.
The fields of digital design, programming languages, and compilers are the most obvious examples, but
there are many others. The concepts we study here run like a thread through much of computer science,
from operating systems to pattern recognition.
Activity 1.2
✓ Define the terms set, subset, and proper subset.
✓ Discuss the different ways of representing a set.
We use capital letters A, B, C, . . . to denote sets; the small letters a, b, c, . . . are used to denote the elements of a set. To indicate that x is an element of the set A, we write x ∈ A. The statement that x is not in A is written x ∉ A. A set can be specified in the following ways:
1. By listing its elements. We write all the elements of the set (without repetition) and enclose them within braces. The elements may be written in any order. For example, the set of all positive integers divisible by 15 and less than 100 can be written as {15, 30, 45, 60, 75, 90}.
2. By describing the properties of the elements of the set. For example, the set {15, 30, 45, 60, 75, 90} can be described as {n : n is a positive integer divisible by 15 and less than 100}. (The description of the property is called a predicate. In this case the set is said to be implicitly specified.)
3. By recursion. We define the elements of the set by a computational rule for calculating them. For example, the set of all natural numbers leaving a remainder 1 when divided by 3 can be described as
{an : a0 = 1, an+1 = an + 3}
Another basic operation is complementation. The complement of a set S, denoted by S̅, consists of all elements not in S. To make this meaningful, we need to know the universal set U of all possible elements. If U is specified, then S̅ = {x : x ∈ U, x ∉ S}.
The set with no elements, called the empty set or the null set, is denoted by ∅. From the definition of a
set, it is obvious that
S ∪ ∅ = S − ∅ = S,  S ∩ ∅ = ∅,  ∅̅ = U,  S̿ = S.
The following useful identities, known as DeMorgan's laws,
(S1 ∪ S2)̅ = S̅1 ∩ S̅2,
(S1 ∩ S2)̅ = S̅1 ∪ S̅2,
are needed on several occasions.
A set is said to be finite if it contains a finite number of elements; otherwise it is infinite. The size of a
finite set is the number of elements in it; this is denoted by |S|. A given set normally has many subsets.
The set of all subsets of a set S is called the powerset of S and is denoted by 2^S. Observe that 2^S is a set of sets.
Example 1.1
If S is the set {a, b, c}, then its powerset is
2^S = { ∅, {a}, {b}, {c}, {a,b}, {a,c}, {b,c}, {a,b,c} }.
Here |S| = 3 and |2^S| = 2^3 = 8. This is an instance of a general result; if S is finite, then |2^S| = 2^|S|.
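This result is easy to cross-check mechanically. The sketch below (the helper name `powerset` is our own) enumerates all subsets of {a, b, c} and confirms that there are 2^3 = 8 of them:

```python
from itertools import combinations

def powerset(s):
    """Return the set of all subsets of s, each subset as a frozenset."""
    elems = list(s)
    return {frozenset(c) for r in range(len(elems) + 1)
            for c in combinations(elems, r)}

S = {"a", "b", "c"}
P = powerset(S)
print(len(P))                      # 8, matching |2^S| = 2^|S| = 2^3
print(frozenset() in P)            # the empty set is always a subset
print(frozenset({"a", "b"}) in P)  # so is {a, b}
```

Subsets are stored as `frozenset` values because ordinary sets are not hashable and so cannot themselves be members of a set.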
If the elements of a set are ordered sequences of elements from other sets, then such a set is said to be a Cartesian product of those sets. The Cartesian product of two sets, which is itself a set of ordered pairs, is written as
S = S1 × S2 = { (x,y) : x ∈ S1 , y ∈ S2 }
Example 1.2
Let S1 = {2, 4} and S2 = {2, 3, 5, 6}. Then
S1 × S2 = {(2, 2), (2, 3), (2, 5), (2, 6), (4, 2), (4, 3), (4, 5), (4, 6)}.
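Example 1.2 can be reproduced directly (a small sketch; the variable names are ours):

```python
from itertools import product

S1 = {2, 4}
S2 = {2, 3, 5, 6}
# The Cartesian product S1 x S2 is the set of all ordered pairs (x, y)
# with x drawn from S1 and y drawn from S2.
S = {(x, y) for x, y in product(S1, S2)}
print(sorted(S))
# The size of the product is the product of the sizes:
print(len(S) == len(S1) * len(S2))
```

Note that the pairs are ordered: (2, 3) belongs to S1 × S2, but (3, 2) does not.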
1.2 LANGUAGES
Activity 1.3
✓ Explain the similarities and differences between natural and formal languages.
✓ Define a string.
✓ What is concatenation?
We are all familiar with the notion of natural languages, such as English. Dictionaries define the term
informally as a system suitable for the expression of certain ideas, facts, or concepts, including a set of
symbols and rules for their manipulation. But, this is not sufficient as a definition for the study of formal
languages. We need a precise definition for the term.
To define a language formally, we start with a finite, nonempty set Σ of symbols, called the alphabet.
From the individual symbols we construct strings, which are finite sequences of symbols from the
alphabet. A set of strings is called a language. For example, if the alphabet Σ = {a, b}, then abab and
aaabbba are strings on Σ. In this module we will use lowercase letters a, b, c, . . . for elements of Σ and
the letters u, υ, w, . . . for string names. For example, to assign a string to a letter we write
w = abaaa,
to indicate that the string named w has the specific value abaaa.
Concatenation: the concatenation of two strings w and υ is the string obtained by appending the symbols
of one string (say υ) to the right end of the other string w. That is, if we have strings
w = a1a2a3 . . . an and
v = b1b2 . . . bm, then the concatenation of w and υ creates a new string, denoted by wv:
wv = a1a2 . . . anb1b2 . . . bm
Since by definition the alphabet Σ is finite, we can enumerate all words (strings) over Σ. That is, we can
order all words by sorting them into a right-infinite sequence. This ordering can be done in many
ways, but the most standard ordering is the alphabetical enumeration: shorter words come first, and words
of equal length are sorted in dictionary order. For instance, for the binary
alphabet Σ = {0, 1}, the set Σ* of all words over Σ is sorted by alphabetical enumeration like this:
λ, 0, 1, 00, 01, 10, 11, 000, 001, . . .
A set is called countable if it has at most as many elements as there are natural numbers. In
mathematical rigor, a set S is defined to be countable if there exists an injective map i: S → N (remember:
a map f: A → B from a set A to a set B is called injective if for every b ∈ B there exists at most one a ∈
A such that f(a) = b). The set {0, 1}* can be seen to be countable by viewing the alphabetical
enumeration as a map i: {0, 1}* → N.
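The alphabetical (shortlex) enumeration and the injective map it induces can be sketched as follows (the helper names are ours, and only a finite prefix of the infinite sequence is generated):

```python
from itertools import product

def shortlex_words(alphabet, max_len):
    """Enumerate all words over the alphabet in alphabetical (shortlex)
    order: shorter words first, same-length words in dictionary order."""
    for n in range(max_len + 1):
        for tup in product(alphabet, repeat=n):
            yield "".join(tup)

words = list(shortlex_words("01", 3))
print(words)   # ['', '0', '1', '00', '01', '10', '11', '000', ...]

# The position of each word in the sequence defines an injective map
# i: {0,1}* -> N, which witnesses that {0,1}* is countable.
index = {w: k for k, w in enumerate(words)}
print(index["01"])  # 4
```

Distinct words always receive distinct positions, which is exactly the injectivity required by the definition above.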
Definition 1.1: Let A and B be languages. We define the operations union, concatenation, and star-closure as follows:
A ∪ B = {x : x ∈ A or x ∈ B},
A . B = {xy : x ∈ A and y ∈ B},
A* = {x1x2 . . . xk : k ≥ 0 and each xi ∈ A}.
You are already familiar with the union operation. It simply takes all the strings in both A and B and
lumps them together into one language.
The concatenation operation is a little trickier. It attaches a string from A in front of a string from B in
all possible ways to get the strings in the new language.
The star operation is a bit different from the other two because it applies to a single language rather than
to two different languages. That is, the star operation is a unary operation instead of a binary operation.
It works by attaching any number of strings in A together to get a string in the new language. Because
“any number” includes 0 as a possibility, the empty string 𝜆 is always a member of A∗, no matter what
A is.
Example 1.3
Let the alphabet Σ be the standard 26 letters {a, b, . . . , z}. If A = {good, bad} and
B = {boy,girl}, then
• A ∪ B = {good, bad, boy, girl},
• A . B = {goodboy, goodgirl, badboy, badgirl}, and
• A∗ = {λ, good, bad, goodgood, goodbad, badgood, badbad, goodgoodgood,
goodgoodbad, goodbadgood, goodbadbad, . . . }.
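The three operations of Definition 1.1 can be sketched for finite languages. Star-closure is infinite, so the code below enumerates only a bounded slice of it (the helper names `concat` and `star` are ours):

```python
def concat(A, B):
    """A . B: every string of A followed by every string of B."""
    return {x + y for x in A for y in B}

def star(A, max_reps):
    """A finite slice of A*: all concatenations of at most max_reps
    strings of A.  Zero repetitions contributes the empty string."""
    result = {""}
    layer = {""}
    for _ in range(max_reps):
        layer = concat(layer, A)
        result |= layer
    return result

A = {"good", "bad"}
B = {"boy", "girl"}
print(A | B)                        # union
print(concat(A, B))                 # {'goodboy', 'goodgirl', 'badboy', 'badgirl'}
print(sorted(star(A, 2), key=len))  # shortest first: '', 'bad', 'good', ...
```

The empty string is in `star(A, 0)` already, matching the remark that λ is always a member of A∗.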
Example 1.4
Let Σ = {a, b}. Then
Σ* = { λ, a, b, aa, ab, ba, bb, aaa, aab, . . . }
The set {a, aa, aab} is a language on Σ. Because it has a finite number of sentences, we
call it a finite language. The set L = {anbn : n≥0} is also a language on Σ. The strings
aabb and aaaabbbb are in the language L, but the string abb is not in L. This language is
infinite. Most interesting languages are infinite.
Since languages are sets, set operations such as the union, intersection, and difference of two languages
are immediately defined. The complement of a language is defined with respect to Σ*; that is, the
complement of L is L̅ = Σ* − L. The reverse of a language is the set of all string reversals, that is, LR
= {wR : w ∈ L}. The concatenation of two languages L1 and L2 is the set of all strings obtained by
concatenating any element of L1 with any element of L2; specifically,
L1L2 = {xy : x ∈ L1, y ∈ L2}
We define Ln as L concatenated with itself n times, with the special cases
L0 = {λ}
L1 = L for every language L.
Finally, we define the star-closure of a language as: L* = L0 ∪ L1∪ L2∪. . .
and the positive closure as: L+ = L1 ∪ L2 ∪. . .
Example 1.5
If L = {anbn : n≥0}, then
the reverse of L is LR = {bnan : n≥0}
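The reverse and power operations can be sketched on a finite slice of L (the helper names are ours; only strings up to a fixed length are enumerated, since L itself is infinite):

```python
def reverse_lang(L):
    """L^R: the set of reversals of all strings in L."""
    return {w[::-1] for w in L}

def lang_power(L, n):
    """L^n: L concatenated with itself n times; L^0 = {lambda}."""
    result = {""}
    for _ in range(n):
        result = {x + y for x in result for y in L}
    return result

# A finite slice of L = {a^n b^n : n >= 0}:
L = {"a" * n + "b" * n for n in range(4)}  # {'', 'ab', 'aabb', 'aaabbb'}
print(reverse_lang(L))        # {'', 'ba', 'bbaa', 'bbbaaa'} = slice of {b^n a^n}
print(lang_power({"ab"}, 3))  # {'ababab'}
print(lang_power({"ab"}, 0))  # {''}, i.e. L^0 = {lambda}
```

Reversing each string of the slice of {anbn} indeed yields the corresponding slice of {bnan}, in agreement with Example 1.5.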
1.3 GRAMMARS
Activity 1.4
✓ Define Grammar Formally?
✓ What is linear grammar?
A grammar for the English language tells us whether a particular sentence is well-formed or not. A
typical rule of English grammar is “a sentence can consist of a noun phrase followed by a predicate.”
Example 1.6: The English Language Grammar
<sentence> → <noun phrase> <predicate>
<noun phrase> → <article> <noun>
<predicate> → <verb>
<article> → a
<article> → the
<noun> → cat
<noun> → dog
<verb> → runs
<verb> → sleeps
Derivation of the string “the dog sleeps”:
<sentence> ⇒ <noun phrase> <predicate>
⇒ <noun phrase> <verb>
⇒ <article> <noun> <verb>
⇒ the <noun> <verb>
⇒ the dog <verb>
⇒ the dog sleeps
Derivation of the string “a cat runs”:
<sentence> ⇒ <noun phrase> <predicate>
⇒ <noun phrase> <verb>
⇒ <article> <noun> <verb>
⇒ a <noun> <verb>
⇒ a cat <verb>
⇒ a cat runs
We get the following set of strings as the language of the grammar:
L = {“a cat runs”, “a cat sleeps”, “the cat runs”, “the cat sleeps”, “a dog runs”, “a dog sleeps”, “the dog runs”, “the dog sleeps”}
If we associate the actual words “a” and “the” with articles, “cat” and “dog” with nouns, and “runs”
and “sleeps” with verbs, then the grammar tells us that the sentences “a cat runs” and “the dog runs”
are properly formed. If we were to give a complete grammar, then in theory every proper sentence could
be explained this way. This example illustrates the definition of a general concept in terms of simple
ones. We start with the top-level concept, here the sentence, and successively reduce it to the irreducible
building blocks of the language. The generalization of these ideas leads us to formal grammars.
Definition 1.2: A grammar G is defined as a quadruple G = (V, T, S, P),
where
V is a finite set of objects called variables,
T is a finite set of objects called terminal symbols,
S ∈ V is a special symbol called the start variable,
P is a finite set of productions.
It will be assumed that the sets V and T are nonempty and disjoint. The production rules are the heart of
a grammar; they specify how the grammar transforms one string into another, and through this they
define a language associated with the grammar. In our discussion we will assume that all production
rules are of the form
x→y
where x is an element of (V ∪ T)+ and y is in (V ∪ T)*. The productions are applied in the following
manner: Given a string w of the form
w=uxv
we say the production x → y is applicable to this string, and we may use it to replace x with y, thereby
obtaining a new string
z=uyv
This is written as
w⟹z
We say that w derives z or that z is derived from w. Successive strings are derived by applying the
productions of the grammar in arbitrary order. A production can be used whenever it is applicable, and
it can be applied as often as desired. If
w1 ⟹ w2 ⟹. . . ⟹ wn,
we say that w1 derives wn and write
w1 ⇒* wn
The * indicates that an unspecified number of steps (including zero) can be taken to derive wn from w1.
By applying the production rules in a different order, a given grammar can normally generate many
strings. The set of all such terminal strings is the language defined or generated by the grammar.
Definition 1.3: Let G = (V, T, S, P) be a grammar. Then the set
L(G) = {w ∈ T* : S ⇒* w} is the language generated by G.
If w ∈ L (G), then the sequence S⟹w1⟹w2⟹ . . . ⟹ wn⟹ 𝑤, is a derivation of the sentence w. The
strings S, w1, w2,…, wn, which contain variables as well as terminals, are called sentential forms of the
derivation.
Example 1.7
Consider the grammar G = ({S}, {a,b}, S, P), with P given by
S→aSb
S→λ then
S⟹aSb⟹aaSbb⟹aabb, so we can write
S ⇒* aabb
The string aabb is a sentence in the language generated by G, while aaSbb is a sentential form. A
grammar G completely defines L(G), but it may not be easy to get a very explicit description of the
language from the grammar. Here, however, the answer is fairly clear. It is not hard to conjecture that
L(G)={anbn : n≥0}
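The conjecture can be checked mechanically for short strings by generating sentences of the grammar breadth-first (a sketch; the generation strategy and helper names are ours, and variables are assumed to be single uppercase characters):

```python
from collections import deque

def generate(productions, start, max_len):
    """Breadth-first derivation: repeatedly replace the leftmost variable
    with each production body; collect all-terminal strings."""
    language = set()
    queue = deque([start])
    seen = {start}
    while queue:
        form = queue.popleft()
        if len(form) > max_len + 1:          # prune long sentential forms
            continue
        var_positions = [i for i, c in enumerate(form) if c in productions]
        if not var_positions:
            language.add(form)               # no variables left: a sentence
            continue
        i = var_positions[0]                 # leftmost derivation
        for body in productions[form[i]]:
            new = form[:i] + body + form[i + 1:]
            if new not in seen:
                seen.add(new)
                queue.append(new)
    return language

# G = ({S}, {a, b}, S, P) with P:  S -> aSb | lambda
P = {"S": ["aSb", ""]}
print(sorted(generate(P, "S", 6), key=len))
# ['', 'ab', 'aabb', 'aaabbb'] -- i.e. {a^n b^n : n >= 0} up to length 6
```

Every generated string has the form anbn, supporting the conjecture L(G) = {anbn : n ≥ 0}.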
Example 1.8
Find a grammar that generates the language L={anbn+1 : n≥0}
Solution:
The idea behind the previous example can be extended to this case. All we need to do is generate an
extra b. This can be done with a production S → Ab, with other productions chosen so that A can derive
the language in the previous example. Reasoning in this fashion, we get the grammar G =({S, A}, {a,
b}, S, P), with productions
S→Ab
A→aAb
A→λ
Definition 1.9: A grammar is said to be right-linear if all productions are of the form
A → xB,
or
A → x,
and left-linear if all productions are of the form
A → Bx,
or
A → x,
where A and B are variables and x is a string of terminals.
A regular grammar is one that is either right-linear or left-linear. Note that in a regular grammar, at
most one variable appears on the right side of any production. Furthermore, that variable must
consistently be either the rightmost or leftmost symbol of the right side of any production.
1.4 AUTOMATA
Activity 1.5
✓ What is automaton?
✓ Mention and explain the three different types of automata?
A formal language is an abstraction of a programming language, and an automaton is an abstract model of a digital
computer. As such, every automaton includes some essential features of a digital computer. It has a
mechanism for reading input. It will be assumed that the input is a string over a given alphabet, written
on an input file, which the automaton can read but not change. The input file is divided into cells, each
of which can hold one symbol. The input mechanism can read the input file from left to right, one symbol
at a time. The input mechanism can also detect the end of the input string (by sensing an end-of-file
condition).
During the transition from one time interval to the next, output may be produced or the information in
the temporary storage changed. The term configuration will be used to refer to a particular state of the
control unit, input file, and temporary storage. The transition of the automaton from one configuration
to the next will be called a move. In the coming chapter we will discuss in detail about automata, but
here we are going to introduce to the different types of automata.
Power of automata: the following diagram shows the power of the different types of automata in
solving different classes of problems.
Activity 2.1
1. Describe finite automata briefly.
2. List and explain the types of finite automata.
3. Define DFA and NFA formally.
4. What are the DFA representations?
Finite automata are good models for computers with an extremely limited amount of memory. We
interact with such computers all the time, as they lie at the heart of various electromechanical devices.
The controller for an automatic door is one example of such a device. Often found at hotel and
supermarket entrances and exits, automatic doors swing open when the controller senses that a person
is approaching. An automatic door has a pad in front to detect the presence of a person about to walk
through the doorway. Another pad is located to the rear of the doorway so that the controller can hold
the door open long enough for the person to pass all the way through and also so that the door does not
strike someone standing behind it as it opens. This configuration is shown in figure 2.1.
The controller moves from state to state, depending on the input it receives. When in the CLOSED state
and receiving input NEITHER or REAR, it remains in the CLOSED state. In addition, if the input
BOTH is received, it stays CLOSED because opening the door risks knocking someone over on the
rear pad. But if the input FRONT arrives, it moves to the OPEN state. In the OPEN state if input,
FRONT, REAR, or BOTH is received, it remains in OPEN. If input NEITHER arrives, it returns to
CLOSED.
For example, a controller might start in state CLOSED and receive the series of input signals FRONT,
REAR, NEITHER, FRONT, BOTH, NEITHER, REAR, and NEITHER. It then would go through
the series of states CLOSED (starting), OPEN, OPEN, CLOSED, OPEN, OPEN, CLOSED,
CLOSED, and CLOSED.
Thinking of an automatic door controller as a finite automaton is useful because that suggests standard
ways of representation as in Figures 2.2 and 2.3. This controller is a computer that has just a single bit
of memory, capable of recording which of the two states the controller is in. Other common devices
have controllers with somewhat larger memories. In an elevator controller, a state may represent the
floor the elevator is on and the inputs might be the signals received from the buttons. This computer
might need several bits to keep track of this information. Controllers for various household appliances
such as dishwashers and electronic thermostats, as well as parts of digital watches and calculators, are
additional examples of computers with limited memories.
We will now take a closer look at finite automata from a mathematical perspective. We will develop a
precise definition of a finite automaton, terminology for describing and manipulating finite automata,
and theoretical results that describe their power and limitations. Besides giving you a clearer
understanding of what finite automata are and what they can and cannot do, this theoretical development
will allow you to practice and become more comfortable with mathematical definitions, theorems, and
proofs in a relatively simple setting.
Strings, by definition, are finite (have only a finite number of symbols). Most languages of interest are
infinite (contain an infinite number of strings). However, in order to work with these languages we must
be able to specify or describe them in ways that are finite. Finite automata are used to describe such infinite
languages. Finite automata are finite collections of states with transition rules that take you from one
state to another.
For example, if q0 and q1 are internal states of some machine M, then the graph associated with M will
have one vertex labeled q0 and another labeled q1. An edge (q0,q1) labeled a represents the transition
δ(q0,a) = q1. The initial state will be identified by an incoming unlabeled arrow not originating at any
vertex. Final states are drawn with a double circle.
More formally, if M = (Q, Σ,δ,q0,F) is a deterministic finite automaton, then its associated transition
graph (state diagram) has exactly |Q| vertices, each one labeled with a different qi ∈ Q. For every
transition rule δ(qi,a) = qj, the graph has an edge (qi,qj) labeled a. The vertex associated with q0 is called
the initial vertex, while those labeled with qf ∈ F are the final vertices. It is a trivial matter to convert
from the (Q, Σ,δ,q0,F) formal definition of a DFA to its transition graph representation and vice versa.
A deterministic finite automaton can be represented in three ways; these representations are:
1. Instantaneous description
2. Transition graph (state diagram)
3. Transition table
Example 2.1:
Consider a machine given by M1 = ({q0, q1, q2}, {0, 1}, δ, q0, {q1}), where the transition
function δ (movement) is given by the following instantaneous description.
Based on the above given movement of the machine we can draw the transition graph and table as
follows.
δ    0    1
q0   q0   q1
q1   q0   q2
q2   q2   q1
[Figure 2.5: transition graph of the three-state finite automaton M1]
Using the machine in figure 2.5 above, we can process strings created from {0,1}* in the following
manner.
Let us take the string 1001 from {0,1}*. First the DFA reads the leftmost symbol of the
string and computes δ(q0,1) = q1; then for the next symbol it computes δ(q1,0) = q0.
The computation continues until the last symbol. If the machine then terminates at a final state, as in this
case, the string is accepted; otherwise it is rejected. Here is the whole process for the string
1001:
for the first symbol which is 1-------------------δ(q0,1)=q1,
For the second symbol which is 0--------------δ(q1,0)=q0
For the third symbol which is 0 ---------------δ(q0,0)=q0
For the fourth symbol which is 1------------ δ(q0,1)=q1,
When the string ends, the machine terminates at q1, so we can say the string 1001 is accepted by the
machine M1 of Figure 2.5.
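The processing just described can be written as a short table-driven simulation (a sketch; the helper name `run_dfa` is our own):

```python
def run_dfa(delta, start, finals, w):
    """Follow the transition function one symbol at a time; accept if
    the machine halts in a final state."""
    state = start
    for symbol in w:
        state = delta[(state, symbol)]
    return state in finals

# Transition table of M1 (Figure 2.5)
delta = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q0", ("q1", "1"): "q2",
    ("q2", "0"): "q2", ("q2", "1"): "q1",
}
print(run_dfa(delta, "q0", {"q1"}, "1001"))  # True: 1001 is accepted
```

Tracing "1001" reproduces exactly the four steps listed above: q0 → q1 → q0 → q0 → q1.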
Example 2.2:
Consider the state diagram (transition graph) for finite automaton M2.
The formal description of M2 is ({q1, q2}, {0,1}, δ, q1, {q2}). The transition function δ is given by the
following transition table.
𝛿 0 1
q1 q1 q2
q2 q1 q2
Remember that the state diagram of M2 and the formal description of M2 or the transition table contain
the same information, only in different forms. You can always go from one form to the other form if
necessary.
A good way to begin understanding any machine is to try it on some sample input strings, as we
did with the machine of Figure 2.5 above. When you do these “experiments” to see how the machine is
working, its method of functioning often becomes apparent. For instance, we can process the sample
string 1101 with machine M2. The machine M2 starts in its start state q1 and proceeds first to state q2
after reading the first 1, and then to states q2, q1, and q2 after reading 1, 0, and 1. The string is accepted
because q2 is an accept state. But string 110 leaves M2 in state q1, so it is rejected. After trying a few
more examples, you would see that M2 accepts all strings that end in a 1. Thus the language accepted by
M2 is L(M2) = {w| w ends in a 1}.
To discuss the relationship between languages and machines, it is convenient to introduce
the extended transition function δ*: Q × Σ* → Q. The second argument of δ* is a string, rather than a
single symbol, and its value gives the state the automaton will be in after reading that string. For
example, if
δ(q0,a) = q1
and
δ(q1,b) = q2,
then
δ* (q0,ab) = q2.
Formally, we can define δ* recursively by
δ*(q, 𝜆) = q,…………………………….…….(2.1)
δ*(q, wa) = δ(δ*(q, w), a),……………………(2.2)
for all q ∈ Q, w ∈ Σ*, a ∈ Σ. To see why this is appropriate, let us apply these definitions to the simple
case above. First, we use (2.2) to get
δ*(q0, ab) = δ(δ*(q0, a), b)……………………(2.3)
But
δ*(q0, a) = δ(δ*(q0, λ), a)
= δ(q0, a)
= q1
Substituting this into (2.3), we get
δ*(q0, ab) = δ(q1, b) = q2
Definition 2.2: The language accepted by a machine M = (Q, Σ,δ, q0,F) is the set of all strings on Σ
accepted by M. In formal notation,
L(M)={w ∈ Σ* : δ* (q0,w) ∈ F}
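Equations (2.1) and (2.2), together with Definition 2.2, translate directly into code (a sketch; the helper names are ours):

```python
def delta_star(delta, q, w):
    """Extended transition function, following (2.1) and (2.2):
    delta*(q, lambda) = q;  delta*(q, wa) = delta(delta*(q, w), a)."""
    if w == "":
        return q
    return delta[(delta_star(delta, q, w[:-1]), w[-1])]

def accepts(delta, q0, F, w):
    """Definition 2.2: w is in L(M) iff delta*(q0, w) is in F."""
    return delta_star(delta, q0, w) in F

# Using delta(q0, a) = q1 and delta(q1, b) = q2 from the text:
delta = {("q0", "a"): "q1", ("q1", "b"): "q2"}
print(delta_star(delta, "q0", "ab"))  # q2, matching the worked example
```

The recursion peels off the last symbol of w, exactly as (2.2) does, and bottoms out at the empty string, as (2.1) requires.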
Note that we require that δ, and consequently δ*, be total functions. At each step, a unique move is
defined, so that we are justified in calling such an automaton deterministic. A DFA will process every
string in Σ* and either accept it or not accept it. Nonacceptance means that the DFA stops in a nonfinal
state, so that
L(M)̅ = {w ∈ Σ* : δ*(q0, w) ∉ F}
Every finite automaton accepts some language. If we consider all possible finite automata, we get a set
of languages associated with them. We will call such a set of languages a family. The family of languages
that is accepted by deterministic finite accepters is called the family of regular languages.
Definition 2.3: A language L is called regular if and only if there exists some deterministic finite
accepter M such that L= L(M).
Example 2.3:
Consider the DFA in Figure 2.7.
[Figure 2.7: transition graph of a DFA with states q0, q1, q2]
In drawing Figure 2.7 we use two labels on a single edge (the transitions δ(q1,a) and δ(q1,b) share one edge). Such
multiply labeled edges are shorthand for two or more distinct transitions: the transition is taken
whenever the input symbol matches any of the edge labels.
The automaton in Figure 2.7 remains in its initial state q0 until the first b is encountered. If this is also
the last symbol of the input, then the string is accepted, since q1 is a final state. If not, the DFA goes into
state q2, from which it can never escape (such a state is called a trap state). Here the state q2 is a trap state.
We see clearly from the transitional graph that the automaton accepts all strings consisting of an arbitrary
number of a's, followed by a single b. All other input strings are rejected. In set notation, the language
accepted by the automaton is
L = {anb:n≥0}.
Let’s design a finite automaton using the “reader as automaton” method just described. Suppose that
you are given some language and want to design a finite automaton that recognizes it. Pretending to be
the automaton, you receive an input string and must determine whether it is a member of the language
the automaton is supposed to recognize. You get to see the symbols in the string one by one. After each
symbol, you must decide whether the string seen so far is in the language. The reason is that you, like
the machine, don’t know when the end of the string is coming, so you must always be ready with the
answer.
First, in order to make these decisions, you have to figure out what you need to remember about the
string as you are reading it. Why not simply remember all you have seen? Bear in mind that you are
pretending to be a finite automaton and that this type of machine has only a finite number of states,
which means a finite memory. Imagine that the input is extremely long—say, from here to the moon—
so that you could not possibly remember the entire thing. You have a finite memory—say, a single sheet
of paper—which has a limited storage capacity. Fortunately, for many languages you don’t need to
remember the entire input. You need to remember only certain crucial information. Exactly which
information is crucial depends on the particular language considered.
For example, suppose that the alphabet is {0,1} and that the language consists of all strings with an odd
number of 1s. You want to construct a finite automaton M1 to recognize this language. Pretending to be
the automaton, you start getting an input string of 0s and 1s symbol by symbol. Do you need to remember
the entire string seen so far in order to determine whether the number of 1s is odd? Of course not, simply
remember whether the number of 1s seen so far is even or odd and keep track of this information as you
read new symbols. If you read a 1, flip the answer; but if you read a 0, leave the answer as is.
But how does this help you design M1? Once you have determined the necessary information to
remember about the string as it is being read, you represent this information as a finite list of possibilities.
In this instance, the possibilities would be
1. even so far, and
2. odd so far.
Then you assign a state to each of the possibilities. These are the states of M1, as shown here.
[Figure: the two states qeven and qodd]
Next, you assign the transitions by seeing how to go from one possibility to another upon reading a
symbol. So, if state qeven represents the even possibility and state qodd represents the odd possibility, you
would set the transitions to flip state on a 1 and stay put on a 0, as shown here.
[Figure 2.9: Transitions telling how the possibilities rearrange; on a 1 the machine flips between qeven and qodd, on a 0 it stays put]
Next, you set the start state to be the state corresponding to the possibility associated with having seen
0 symbols so far (the empty string 𝜆). In this case, the start state corresponds to state qeven because 0 is
an even number. Last, set the accept states to be those corresponding to possibilities where you want to
accept the input string. Set qodd to be an accept state because you want to accept when you have seen an
odd number of 1s. These additions are shown in the following figure.
[Figure: the completed machine, with start state qeven and accept state qodd]
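The finished two-state machine can be checked against a direct count of 1s (a sketch; the state names follow the design above, and the helper name `run_dfa` is ours):

```python
def run_dfa(delta, start, finals, w):
    """Table-driven DFA simulation: follow delta symbol by symbol."""
    state = start
    for symbol in w:
        state = delta[(state, symbol)]
    return state in finals

# The two-state design: flip on a 1, stay put on a 0.
delta = {
    ("q_even", "0"): "q_even", ("q_even", "1"): "q_odd",
    ("q_odd", "0"): "q_odd",   ("q_odd", "1"): "q_even",
}
for w in ["", "1", "0101", "1101"]:
    accepted = run_dfa(delta, "q_even", {"q_odd"}, w)
    print(w, accepted, w.count("1") % 2 == 1)  # the two answers agree
```

The empty string is rejected because the start state qeven (zero is even) is not an accept state, exactly as argued above.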
As another example, consider the language of strings over {0,1} that contain 001 as a substring. The possibilities to remember are: having seen no part of the pattern, having just seen a 0, having just seen 00, and having seen the whole pattern 001. Assign the states q, q0, q00, and q001 to these possibilities. You can assign the transitions by observing
that from q reading a 1 you stay in q, but reading a 0 you move to q0. In q0 reading a 1 you return to q,
but reading a 0 you move to q00. In q00 reading a 1 you move to q001, but reading a 0 leaves you in q00.
Finally, in q001 reading a 0 or a 1 leaves you in q001. The start state is q, and the only accept state is q001,
as shown in Figure 2.11.
[Figure 2.11: DFA with states q, q0, q00, q001; start state q, accept state q001]
Example 2.5
Design a deterministic finite accepter (DFA) that recognizes the set of all strings on Σ = {a,b} starting
with the prefix ab.
Solution
Here the task is to design a DFA. The DFA accepts the set of strings that start with the symbol a
followed by b. The only issue is the first two symbols in the string; after they have been read, no
further decisions are needed. Still, the automaton has to process the whole string before its decision is
made. We can therefore solve the problem with an automaton that has four states: an initial state q0
(no symbol seen yet), a state q1 reached after reading an a, a final state q3 reached after reading ab, and
a nonfinal trap state q2. If the first symbol is an a and the second is a b, the automaton goes to the final trap
state q3, where it will stay since the rest of the input does not matter. On the other hand, if the first symbol
is not an a or the second one is not a b, the automaton enters the nonfinal trap state q2. The simple solution
is shown below.
[Figure: DFA accepting strings with the prefix ab; q0 goes to q1 on a, q1 goes to the final trap state q3 on b, and all other transitions lead to the nonfinal trap state q2]
Example 2.6
Show that the language of all strings on Σ = {a,b} that begin and end with an a is regular.
Solution
According to Definition 2.3, a language is regular if we can design a DFA for it. The construction of
a DFA for this language is similar to Example 2.5; the only difference is that the string starts with the single
symbol a, and there is one additional constraint: it should end with another a, while in between
these two symbols there can be any combination. What this DFA must do is check whether a string begins
and ends with an a; what is between is immaterial. The solution is complicated by the fact that there is
no explicit way of testing for the end of the string. This difficulty is overcome by simply putting the DFA
into a final state whenever the second a is encountered. If this is not the end of the string, and another b
is found, it will take the DFA out of the final state and back to some nonfinal state in between. Scanning
continues in this way, each a taking the automaton back to its final state. The complete solution is shown
in Figure 2.13. To check, take some strings from {a,b}* that start and end with a and process them with the
machine in Figure 2.13; it will be obvious that the machine accepts a string if and only if it begins and
ends with an a. Since we have constructed a DFA for the language, we can claim that, by definition, the
language is regular.
[Figure 2.13: DFA accepting all strings on {a,b} that begin and end with an a]
[NFA with start state q0 looping on 0 and 1, a transition from q0 to q1 on 0, and a transition from q1 to q2 on 1]
Figure 2.14: NFA for set of strings that ends with 01.
Example 2.8:
Let machine M with Q = {q0, q1, q2, q3, q4}, Σ = {a, b}, F = {q1, q3} and δ be given by the
following transition table.
𝛿 a b 𝜆
q0 {q1} ∅ {q4}
q1 ∅ {q1} {q2}
q2 {q2,q3} {q3} ∅
q3 ∅ ∅ ∅
q4 {q4} {q3} ∅
Along the same lines as a DFA, an NFA can be represented by a state transition diagram. For instance, the present NFA can be represented as in Figure 2.15:
[Figure 2.15: transition diagram of the NFA, with states q0, q1, q2, q3, q4 and edges labeled a and b]
The extended transition function δ* of an NFA is defined by
∀q ∈ Q: δ*(q, λ) = {q}
and
∀q ∈ Q, ∀w ∈ Σ*, ∀a ∈ Σ: δ*(q, wa) = ⋃p∈δ*(q,w) δ(p, a)
For an NFA, the extended transition function is defined so that δ*(qi,w) contains qj if and only if there is a walk in the transition graph from qi to qj labeled w. This holds for all qi, qj ∈ Q, and w ∈ Σ*. The
language L accepted by an NFA, M = (Q,Σ,δ, q0,F) is defined as the set of all strings accepted by the
machine. Formally,
L(M )= {w ∈ Σ* : δ* (q0,w) ∩ F ≠ ∅}
In words, the language consists of all strings w for which there is a walk labeled w from the initial
vertex of the transition graph to some final vertex.
Definition 2.6: An NFA M accepts a word (string) w if δ*(q0, w) ∩ F ≠ ∅. The language L(M) of an NFA is the set of all words (strings) accepted by M.
[diagram: an NFA over {0,1} with states q0, q1, q2 and a λ-transition]
Figure 2.17: An NFA with a λ-transition
The automaton shown in Figure 2.17 is nondeterministic not only because several edges with the same
label originate from one vertex, but also because it has a λ-transition. Some transitions, such as δ (q2,0),
are unspecified in the graph. This is to be interpreted as a transition to the empty set, that is, δ (q2,0) =
Ø. The automaton accepts strings λ, 1010, and 101010, but not 110 and 10100. Note that for 10 there
are two alternative walks, one leading to q0, the other to q2. Even though q2 is not a final state, the string
is accepted because one walk leads to a final state.
Example 2.10: Consider the machine M1 in Figure 2.18. What is the language accepted by the NFA M1?
[Figure 2.18: the NFA M1]
Example 2.11
Figure 2.19 represents an NFA. It has several λ-transitions and some undefined transitions such
as δ(q2,a). Suppose we want to find δ* (q1,a) and δ* (q2,λ). There is a walk labeled a involving two λ-
transitions from q1 to itself. By using some of the λ-edges twice, we see that there are also walks
involving λ-transitions to q0 and q2. Thus,
δ*(q1,a) = {q0,q1,q2}.
[Figure 2.19: an NFA with states q0, q1, q2, several λ-transitions, and some undefined transitions such as δ(q2,a)]
Since there is a λ-edge between q2 and q0, we have immediately that δ*(q2,λ) contains q0. Also, since any state can be reached from itself by making no move, and consequently using no input symbol, δ*(q2,λ) also contains q2. Therefore,
δ*(q2, λ) = {q0,q2}.
Using as many λ-transitions as needed, you can also check that
δ*(q2, aa) = {q0,q1,q2}.
Example 2.12
What is the language accepted by the automaton in example 2.9, Figure 2.17?
Solution
It is easy to see from the graph that the only way the machine can stop in a final state is if the input is
either a repetition of the string 10 or the empty string. Therefore, the automaton accepts the language
L = {(10)^n : n ≥ 0}.
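This acceptance test can also be checked mechanically. Below is a minimal Python sketch of an NFA simulator with λ-closure; the transition table is a plausible reconstruction of the machine in Figure 2.17 (one of several NFAs accepting L = {(10)^n : n ≥ 0}), not necessarily the exact machine drawn there.

```python
def eclose(delta, states):
    """λ-closure: every state reachable from `states` by λ-moves alone."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for p in delta.get((q, ""), ()):   # "" stands for λ
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def nfa_accepts(delta, q0, final, w):
    """True iff some walk labeled w leads from q0 to a final state."""
    current = eclose(delta, {q0})
    for a in w:
        step = set()
        for q in current:
            step |= set(delta.get((q, a), ()))
        current = eclose(delta, step)
    return bool(current & final)

# Plausible reconstruction of Figure 2.17 (an assumption, not the exact figure):
# q0 --1--> q1 --0--> q2 --λ--> q0, with q0 the only final state.
delta = {("q0", "1"): {"q1"}, ("q1", "0"): {"q2"}, ("q2", ""): {"q0"}}
print(nfa_accepts(delta, "q0", {"q0"}, "1010"))   # True: 1010 = (10)^2
```

Running it on the strings mentioned above reproduces the claims: λ, 1010, and 101010 are accepted, while 110 and 10100 are rejected.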
Activity 2.2
1. Explain the need for nondeterministic finite automata.
2. Show that DFAs and NFAs are equivalent.
Note that NFAs are (i) good for human design: it is typically much easier to write down an NFA for a
given language than a DFA, but (ii) NFAs are bad for computer implementations because of the non-
determinism: if a computer program would have to emulate an NFA, the program wouldn't know which
transition options it should take when a non-deterministic situation is encountered. Fortunately, it is
possible to automatically transform any (maybe human-specified) NFA into an equivalent (computer-
runnable) DFA. This construction is the underlying idea used in the proof of the following proposition.
Definition 2.7: Two finite accepters, M1 and M2, are said to be equivalent, if they both accept the same
language
L(M1) = L(M2),
There are generally many accepters for a given language, so any DFA or NFA has many equivalent
accepters.
Proposition 2.1: The languages accepted by DFAs are exactly the languages accepted by NFAs.
Proof, general comment: This proposition is of the form "set A equals set B". Almost invariably, when
proving statements of this form, one shows two things: (i) A ⊆ B, and (ii) B ⊆ A.
Proof main idea: (i) Showing that the set of languages accepted by DFAs is a subset of the set of
languages accepted by NFAs is trivial, because DFAs are NFAs (the definition of NFAs is a
generalization of the definition of DFAs). The difficult part is to show (ii) that the set of languages
accepted by NFAs is a subset of the set of languages accepted by DFAs. For this one has to start with
some NFA M = (QN, Σ, δN, q0N, FN) and construct from it a DFA M' = (QD, Σ, δD, q0D, FD) that accepts
the same language. The crucial idea is to take as the states of M' all subsets of states of M, that is, put
QD = Pot(QN), the power set of QN. Further details: put q0D = {q0N}, FD = {S ⊆ QN | S ∩ FN ≠ ∅}, and for all S ∈ QD = Pot(QN) and a ∈ Σ,
δD(S, a) = ⋃qN∈S δN(qN, a) …………………………. (2.1)
Proof, example: The subset construction will become immediately plausible through an example. We take the NFA in Figure 2.20.
And here is the corresponding DFA obtained from the subset construction:
Notes:
1. The states qi that occur inside the state circles of the subset DFA are from the nondeterministic
NFA, therefore they should more correctly be written qiN. The N subscript is omitted for
readability.
2. According to q0D = { q0N }, the start state is the one that contains (as a set) only q0N.
3. The accepting states of the subset DFA are all subsets of NFA states that contain some accepting
state of the NFA (here this is only q2N), according to FD = {S ⊆ QN | S ∩ FN ≠ ∅}.
4. To illustrate how it is determined which arrows leave a given state of the DFA, let us consider
the state S = {q0N, q2N}, and apply the rule δD(S, a) = ⋃qN∈S δN(qN, a). For a = 0, we have
to check to which NFA states one can go from either q0N or q2N. We find that we can reach q0N
and q1N from q0N and no other state from q2N. Therefore, what we can reach altogether is {q0N,
q1N}. This yields the "0" arrow from {q0N, q2N } to {q0N, q1N} in the diagram. Similarly, the "1"
arrow is determined by considering what states we can reach in the NFA from either q0N or q2N.
From q0N a 1 leads to q0N and from q2N a 1 leads nowhere, so we determine in the DFA a "1"
arrow from {q0N, q2N } to {q0N}.
5. Strictly speaking, in a DFA, for every state q and every symbol a there must be an arrow labeled with a leaving q. For instance, from {q1N, q2N} we get the dotted arrows marked in the diagram.
However, the state S = {q1N, q2N} cannot be reached from the starting state in the subset DFA.
Generally, the diagram omits all such irrelevant transitions out of states that cannot be reached from the start state, which leads to the numerous seemingly "orphan" states in the diagram.
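The subset construction itself is short enough to state as code. The sketch below handles NFAs without λ-transitions (adding λ-closure is a routine extension) and, as in note 5, generates only the subsets reachable from the start state. The sample machine is a plausible reconstruction of the NFA of Figure 2.14 (strings ending in 01), used here purely as an illustration.

```python
from itertools import chain

def nfa_to_dfa(nfa_delta, q0, nfa_final, alphabet):
    """Subset construction; DFA states are frozensets of NFA states."""
    start = frozenset([q0])
    dfa_delta, todo, seen = {}, [start], {start}
    while todo:
        S = todo.pop()
        for a in alphabet:
            # Union of the NFA moves of every state in S: the rule delta_D(S, a)
            T = frozenset(chain.from_iterable(
                nfa_delta.get((q, a), ()) for q in S))
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    final = {S for S in seen if S & nfa_final}
    return start, dfa_delta, final

# Reconstruction of the NFA of Figure 2.14 (an assumption): q0 loops on 0 and 1,
# nondeterministically moves to q1 on 0, and q1 moves to the final state q2 on 1.
nfa_delta = {("q0", "0"): {"q0", "q1"}, ("q0", "1"): {"q0"}, ("q1", "1"): {"q2"}}
start, delta, final = nfa_to_dfa(nfa_delta, "q0", {"q2"}, "01")

def dfa_accepts(w, S=start):
    for a in w:
        S = delta[(S, a)]
    return S in final
```

Only the subsets actually reachable from {q0} are ever created, which is exactly why the "orphan" states of the full power-set diagram can be ignored in practice.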
δN(q0N,w) = δD(q0D,w) ………………………….. (2.2)
Be careful that you understand this formula. δN(q0N,w) is the set of NFA states that can be reached from the starting state via w. δD(q0D,w) is the single state of the DFA that is deterministically reached from the DFA starting state via w. However, due to the subset construction, this single DFA state corresponds to a set of NFA states, namely to δN(q0N,w). Furthermore, make sure that you understand that if we have shown (2.2), then we are done, because: w is accepted by the NFA ⇔ δN(q0N,w) contains some accepting state ⇔ δD(q0D,w) is an accepting state ⇔ w is accepted by the DFA.
Induction basis: for w = λ we have δN(q0N, λ) = {q0N} = q0D = δD(q0D, λ), so (2.2) holds for |w| = 0.
Induction step: assume that (2.2) has been shown for all |w| ≤ n. Let v = wa be a word of length n +1.
Then δN(q0N, wa) = ⋃pN∈δN(q0N,w) δN(pN, a), which by induction is equal to ⋃pN∈δD(q0D,w) δN(pN, a), which by (2.1) is equal to δD(δD(q0D,w), a) = δD(q0D, wa).
One important conclusion we can draw from Proposition 2.1 is that every language accepted by an NFA is regular.
Example 2.13
Convert the NFA in Figure 2.22 to an equivalent DFA.
[diagram: an NFA over {a,b} with states q0, q1, q2, q3]
Figure 2.22: Nondeterministic finite automaton
Solution
The NFA starts in state q0, so the initial state of the DFA will be labeled {q0}. After reading an a, the
NFA can be in state q1 or, by making a λ-transition, in state q2. Therefore, the corresponding DFA must
have a state labeled {q1,q2} and a transition
δ({q0},a) = {q1,q2}.
In state q0, the NFA has no specified transition when the input is b; therefore,
δ ({q0},b) = Ø.
A state labeled Ø represents an impossible move for the NFA and, therefore, means nonacceptance of
the string. Consequently, this state in the DFA must be a nonfinal trap state.
[diagram: the resulting DFA, with states {q0}, {q1,q2}, and a nonfinal trap state Ø; from {q0}, a leads to {q1,q2} and b leads to Ø]
Figure 2.23: The DFA obtained from the NFA in Figure 2.22
Example 2.14
Convert the NFA in Figure 2.24 into an equivalent deterministic machine (DFA).
[diagram: an NFA over {0,1} with states q0, q1, q2]
Figure 2.24: Nondeterministic finite automaton
[diagrams: DFAs with states labeled by the subsets {q0}, {q2}, {q0,q1}, {q1,q2}, {q0,q1,q2}]
Figure 2.25: Partially constructed automaton. Figure 2.26: Complete deterministic finite automaton
Activity 2.3
1. Show the relation between a language and a finite automaton.
2. Show the mechanism to reduce the number of states in a given DFA.
Any DFA defines a unique language, but the converse is not true. For a given language, there are many DFAs that accept it. There may be a considerable difference in the number of states of such equivalent automata. In terms of the questions we have considered so far, all solutions are equally satisfactory, but
if the results are to be applied in a practical setting, there may be reasons for preferring one over another.
Example 2.15
The two DFAs depicted in Figure 2.29 (a) and (b) are equivalent, as a few test strings will quickly
reveal. We notice some obviously unnecessary features of Figure 2.29 (a). The state q5 plays absolutely
no role in the automaton since it can never be reached from the initial state q0. Such a state is inaccessible,
and it can be removed (along with all transitions relating to it) without affecting the language accepted
by the automaton. But even after the removal of q5, the first automaton has some redundant parts.
[diagram (a): a DFA over {0,1} with states q0 through q5; q5 cannot be reached from q0]
[diagram (b): an equivalent DFA with states q0, q1, q2]
Figure 2.29: Two equivalent DFAs
Clearly, two states are either indistinguishable or distinguishable. Indistinguishability has the properties of an equivalence relation: if q0 and q1 are indistinguishable and if q1 and q2 are also indistinguishable, then so are q0 and q2, and all three states are indistinguishable.
Procedure: Mark
1. Remove all inaccessible states. This can be done by enumerating all simple paths of the graph of the DFA starting at the initial state. Any state not part of some path is inaccessible.
2. Consider all pairs of states (p, q). If p ∈ F and q ∉ F or vice versa, mark the pair (p, q) as
distinguishable.
3. Repeat the following step until no previously unmarked pairs are marked. For all pairs (p, q)
and all a ∈ Σ, compute δ(p, a)= pa and δ (q, a) = qa. If the pair (pa,qa) is marked as distinguishable,
mark (p, q) as distinguishable.
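Steps 2 and 3 of the procedure translate directly into code. The sketch below assumes that inaccessible states have already been removed (step 1) and that δ is total; the four-state DFA at the bottom is hypothetical, chosen so that exactly one pair of states (1 and 3) turns out to be indistinguishable.

```python
from itertools import combinations

def mark_distinguishable(states, alphabet, delta, final):
    """Procedure Mark: return the set of distinguishable pairs (as frozensets)."""
    # Step 2: a final and a nonfinal state are distinguishable (by λ).
    marked = {frozenset((p, q)) for p, q in combinations(states, 2)
              if (p in final) != (q in final)}
    changed = True
    while changed:                    # Step 3: repeat until nothing new is marked
        changed = False
        for p, q in combinations(states, 2):
            pair = frozenset((p, q))
            if pair in marked:
                continue
            for a in alphabet:
                pa, qa = delta[(p, a)], delta[(q, a)]
                if pa != qa and frozenset((pa, qa)) in marked:
                    marked.add(pair)
                    changed = True
                    break
    return marked

# Hypothetical DFA: state 0 branches to 1 or 3, which behave identically;
# state 2 is the only final state.
delta = {(0, "a"): 1, (0, "b"): 3, (1, "a"): 2, (1, "b"): 2,
         (3, "a"): 2, (3, "b"): 2, (2, "a"): 2, (2, "b"): 2}
marked = mark_distinguishable([0, 1, 2, 3], "ab", delta, {2})
```

The unmarked pairs at the end are exactly the indistinguishable ones; here only the pair {1, 3} remains unmarked, so those two states can be merged.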
We claim that this procedure constitutes an algorithm for marking all distinguishable pairs.
Proposition 2.2: The procedure Mark, applied to any DFA M = (Q, Σ, δ, q0, F), terminates and determines all pairs of distinguishable states.
Proof: Obviously, the procedure terminates, since there are only a finite number of pairs that can be
marked. It is also easy to see that the states of any pair so marked are distinguishable. The only claim
that requires elaboration is that the procedure finds all distinguishable pairs.
Note first that states qi and qj are distinguishable by a string of length n if and only if there are transitions (2.3) and (2.4) below, for some a ∈ Σ, with qk and ql distinguishable by a string of length n − 1. We use this first to show that at the completion of the nth pass through the loop in step 3, all states distinguishable by strings of length n or less have been marked. In step 2, we mark all pairs distinguishable by λ, so we have a basis with n = 0 for an induction. We now assume that the claim is true for all i = 0, 1, …, n−1. By this inductive assumption, at the beginning of the nth pass through the loop, all states distinguishable by strings of length up to n−1 have been marked. Because of (2.3) and (2.4) below, at the end of this pass, all states distinguishable by strings of length up to n will be marked. By induction then, we can claim that, for any n, at the completion of the nth pass, all pairs distinguishable by strings of length n or less have been marked.
δ(qi, a) = qk ………………………………………. (2.3)
and
δ(qj, a) = ql ……………………………………….. (2.4)
To show that this procedure marks all distinguishable pairs, assume that the loop terminates after n passes. This means that during the nth pass no new pairs were marked. From (2.3) and (2.4), it then follows that there cannot be any states distinguishable by a string of length n but not distinguishable by any shorter string. But if there are no states distinguishable only by strings of length n, there cannot be any states distinguishable only by strings of length n+1, and so on. As a consequence, when the loop terminates, all distinguishable pairs have been marked.
The procedure mark can be implemented by partitioning the states into equivalence classes. Whenever
two states are found to be distinguishable, they are immediately put into separate equivalence classes.
Procedure: Reduce
Given a DFA M = (Q, Σ, δ, q0, F), we construct a reduced DFA M̂ = (Q̂, Σ, δ̂, q̂0, F̂) as follows.
1. Use procedure Mark to generate the equivalence classes, say {qi, qj, …, qk}, as described.
2. For each set {qi, qj, …, qk} of such indistinguishable states, create a state labeled ij…k for M̂.
3. For each transition rule of M of the form δ(qr, a) = qp, find the sets to which qr and qp belong. If qr ∈ {qi, qj, …, qk} and qp ∈ {ql, qm, …, qn}, add to δ̂ a rule
δ̂(ij…k, a) = lm…n.
4. The initial state q̂0 is that state of M̂ whose label includes the 0.
5. F̂ is the set of all the states whose label contains i such that qi ∈ F.
Example 2.18
Consider the automaton in Figure 2.30; reduce the number of states to get the minimized DFA.
[Figure 2.30: a DFA over {0,1} with states q0, q1, q2, q3, q4; the final states are q2 and q4]
Solution
In the second step of procedure Mark we partition the state set into final and nonfinal states to get two equivalence classes S0 = {q0, q1, q3} and Sf = {q2, q4}. In the next step, take any two states from one of the classes; in this case let us take q0 and q1 from the nonfinal class. After computing
δ(q0, 0) = q1 and δ(q1, 0) = q2,
since the outcomes of the computation, q1 for the first transition and q2 for the second, are in different classes, we recognize that q0 and q1 are distinguishable, so we put them into different sets: S0 = {q0, q1, q3} is split into {q0} and {q1, q3}. Also, since δ(q2, 0) = q3 and δ(q4, 0) = q4, the class {q2, q4} is split into {q2} and {q4}. Now we can check {q1, q3}: δ(q1, 0) = q2 and δ(q3, 0) = q2, and computing for the second input gives δ(q1, 1) = q4 and δ(q3, 1) = q4. For the class {q1, q3}, both inputs lead to the same states, q2 and q4, which indicates that q1 and q3 are indistinguishable, so we can combine them. We now have the following classes of states:
{q0}, {q1, q3}, {q2}, {q4}; the new DFA will look like Figure 2.31. Once the indistinguishability classes are found, the construction of the minimal DFA is straightforward.
[diagram: the minimized DFA over {0,1} with states {q0}, {q1,q3}, {q2}, {q4}]
Figure 2.31: Minimized DFA
In this chapter, we first define regular expressions as a means of representing certain subsets of strings
over Σ and prove that regular sets are precisely those accepted by finite automata or transition systems.
We use pumping lemma for regular sets to prove that certain sets are not regular. We then discuss closure
properties of regular sets. Finally, we give the relation between regular sets and regular grammars.
Activity 3.1
1. What are the primitive regular expressions?
2. What are the valid mathematical operators used in regular expressions?
3. What is a regular expression?
We now consider the class of languages obtained by applying union, concatenation, and Kleene star for
finitely many times on the basis elements. These languages are known as regular languages and the
corresponding finite representations are known as regular expressions.
Definition 3.1: Let Σ be an alphabet. A regular expression r over Σ denotes a language L(r) over Σ. Say that r is a regular expression if r is
1. a, for some a in the alphabet Σ,
2. λ,
3. ∅,
4. (r1 + r2), where r1 and r2 are regular expressions,
5. (r1r2), where r1 and r2 are regular expressions, or
6. (r1*), where r1 is a regular expression.
In items 1 and 2, the regular expressions a and λ represent the languages {a} and {λ}, respectively. In item 3, the regular expression ∅ represents the empty language. In items 4, 5, and 6, the expressions represent the languages obtained by taking the union or concatenation of the languages L(r1) and L(r2), or the star of the language L(r1), respectively. The symbols a, λ, and ∅ are called primitive regular expressions.
Definition 3.2: If r is a regular expression, then the language represented by r is denoted by L(r).
Further, a language L is said to be regular if there is a regular expression r such that L = L(r).
Remark
1. A regular language over an alphabet Σ is the one that can be obtained from the empty set (∅),
{λ}, and {a}, for a ∈ Σ, by finitely many applications of union, concatenation and Kleene star.
2. The smallest class of languages over an alphabet Σ which contains Ø, {λ}, and {a} and is
closed with respect to union, concatenation, and Kleene star is the class of all regular
languages over Σ.
Examples 3.1:
1. As we observed earlier, the languages Ø, {λ}, {a}, and all finite sets are regular.
2. {a^n : n ≥ 0} is regular as it can be represented by the expression a*.
3. Σ *, the set of all strings over an alphabet Σ, is regular. For instance, if Σ = {a1, a2, ..., an}, then
Σ* can be represented as (a1 + a2 + ……. + an)*.
4. The set of all strings over {a,b} which contain ab as a substring is regular.
Solution
For instance, the set can be written as
{x ∈ {a,b}* | ab is substring of x }
= { yabz | y,z ∈ {a,b}* }
= {a,b}*{ab}{a,b}*
5. Express the language L over {0,1} that contains 01 or 10 as sub-string with regular expression?
Solution
L = { x | 01 is substring of x} ∪ { x | 10 is substring of x}
= {y01z | y,z ∈ Σ*} ∪ {u10v | u,v ∈ Σ*}
= Σ*{01} Σ* ∪ Σ*{10} Σ*
= {0,1}*{01}{0,1}* ∪ {0,1}*{10}{0,1}*
Since Σ*, {01}, and {10} are regular, and regular languages are closed under concatenation and union, L is a regular language. The regular expression representing L is given below.
(0+1)*01(0+1)* + (0+1)*10(0+1)*
6. Express with regular expression the set of all strings over {a,b} which do not contain ab as a
substring.
Solution
By analyzing the language, one can observe that it is precisely the following set.
{b^n a^m : n, m ≥ 0}
Thus, the regular expression of the language is b*a*.
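Claims like this are easy to sanity-check with Python's re module, by comparing the regular expression against the defining property on all short strings (a finite check over bounded lengths, not a proof):

```python
import re
from itertools import product

# b*a* should match exactly the strings over {a,b} with no "ab" substring.
pattern = re.compile(r"b*a*")
for n in range(8):
    for t in product("ab", repeat=n):
        w = "".join(t)
        # fullmatch anchors the pattern to the whole string
        assert (pattern.fullmatch(w) is not None) == ("ab" not in w)
print("b*a* agrees with the no-ab property on all strings up to length 7")
```

The same pattern of exhaustive short-string comparison works for any of the finite-language claims in this section.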
Definition 3.3: Two regular expressions r1 and r2 are said to be equivalent if they represent the same
language; in which case, we write r1 ≡ r2.
The following identities hold for any regular expressions P, Q, and R.
I. ∅ + R = R
II. ∅R = R∅ = ∅
III. 𝜆R = R 𝜆 = R
IV. 𝜆* = 𝜆 and ∅* = 𝜆
V. R+R=R
VI. R*R* = R*
VII. RR* = R*R
VIII. (R*)* = R*
IX. 𝜆 + RR* = R* = 𝜆 + R*R
X. (PQ)*P = P(QP)*
XI. (P + Q)* = (P*Q*)* = (P* + Q*)*
XII. (P + Q)R = PR + QR and R(P + Q) = RP + RQ
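Individual identities can likewise be spot-checked with Python's re module by substituting sample expressions for P and Q and comparing matches on all short strings; this is a finite sanity check under those sample choices, not a proof.

```python
import re
from itertools import product

def regexes_agree(r1, r2, maxlen, alphabet="ab"):
    """True iff r1 and r2 match exactly the same strings up to length maxlen."""
    c1, c2 = re.compile(r1), re.compile(r2)
    for n in range(maxlen + 1):
        for t in product(alphabet, repeat=n):
            w = "".join(t)
            if (c1.fullmatch(w) is None) != (c2.fullmatch(w) is None):
                return False
    return True

# Identity X, (PQ)*P = P(QP)*, with the sample choice P = ab, Q = a:
assert regexes_agree("(?:(?:ab)(?:a))*(?:ab)", "(?:ab)(?:(?:a)(?:ab))*", 8)
# Identity XI, (P + Q)* = (P*Q*)*, with the same sample P and Q:
assert regexes_agree("(?:ab|a)*", "(?:(?:ab)*(?:a)*)*", 8)
```

Note that Python's regex syntax writes union as | rather than +, and (?:...) is a plain grouping construct.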
The following theorem is very much useful in simplifying regular expressions (i.e. replacing a given
regular expression P by a simpler regular expression equivalent to P).
Theorem 3.1 (Arden's theorem): Let P and Q be two regular expressions over Σ. If P does not contain λ, then the following equation in R, namely
R = Q + RP ………………………………….….. (3.1)
has a unique solution (i.e. one and only one solution) given by R = QP*.
Proof: By repeatedly substituting Q + RP for R on the right-hand side of (3.1), we obtain, for any i ≥ 0,
R = Q(λ + P + P^2 + … + P^i) + RP^(i+1) ……………….. (3.2)
We now show that any solution of (3.1) is equivalent to QP*. Suppose R satisfies (3.1); then it satisfies (3.2). Let w be a string of length i in the set R. Then w belongs to the set Q(λ + P + P^2 + … + P^i) + RP^(i+1). As P does not contain λ, RP^(i+1) has no string of length less than i+1, and so w is not in the set RP^(i+1). This means that w belongs to the set Q(λ + P + P^2 + … + P^i), and hence to QP*.
Conversely, consider a string w in the set QP*. Then w is in the set QP^k for some k ≥ 0, and hence in Q(λ + P + P^2 + … + P^k). So w is on the right-hand side of (3.2), and therefore in R. Thus R and QP* represent the same set. This proves the uniqueness of the solution of (3.1).
Theorem 3.2: A language is regular if and only if some regular expression describes it. This theorem
has two directions. We state and prove each direction as a separate lemma.
Lemma 3.3: If a language is described by a regular expression, then it is regular.
Proof Idea: Say that we have a regular expression r describing some language L. We show how to convert r into an NFA recognizing L. By definition, if an NFA recognizes L then L is regular.
Proof: Let’s convert r into an NFA N. We consider the six cases in the formal definition of regular
expressions.
1. r = a, for some a ∈ Σ. Then L(r) = {a}, and the following NFA recognizes L(r).
[diagram: a two-state NFA with a single arrow labeled a from the start state to the accept state]
Note that this machine fits the definition of an NFA but not that of a DFA because it has some
states with no exiting arrow for each possible input symbol. Of course, we could have presented
an equivalent DFA here; but an NFA is all we need for now, and it is easier to describe.
2. r = λ. Then L(r) = {λ}, and an NFA whose start state is also its accept state, with no transitions, recognizes L(r).
3. r = ∅. Then L(r) = ∅, and an NFA with no accept states recognizes L(r).
4. r = r1 + r2.
[diagram: M(r1) and M(r2) combined in parallel, with a new start state and λ-arrows into each machine]
Figure 3.4: Automaton for L(r1 + r2).
5. r = r1r2.
[diagram: M(r1) and M(r2) joined in series by a λ-arrow]
6. r = r1*.
[diagram: M(r1) with λ-arrows that allow skipping or repeating it]
Figure 3.6: Automaton for L(r1*).
Example 3.2:
We convert the regular expression (ab + a)∗ to an NFA in a sequence of stages. We build up
from the smallest subexpressions to larger subexpressions until we have an NFA for the original
expression, as shown in the following diagram. Note that this procedure generally doesn’t give the NFA
with the fewest states. In this example, the procedure gives an NFA with eight states, but the smallest
equivalent NFA has only two states. Can you find it?
[diagram: stages of the construction, bottom-up: an NFA for a, an NFA for b, an NFA for ab, an NFA for ab + a, and finally an NFA for (ab + a)*]
Figure 3.7: Building an NFA from the regular expression (ab + a)*
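The construction of Example 3.2 can be mirrored in code: each case of the proof becomes a small combinator that glues NFA fragments together with λ-moves (written "" below). This is a sketch of the idea with machine-generated state names, not the textbook's exact state numbering.

```python
import itertools

_fresh = itertools.count()          # generator of fresh state names

def symbol(a):
    """Case r = a: two states joined by an arrow labeled a."""
    s, f = next(_fresh), next(_fresh)
    return {"start": s, "accept": f, "delta": {(s, a): {f}}}

def _merged(d1, d2):
    d = {k: set(v) for k, v in d1.items()}
    for k, v in d2.items():
        d.setdefault(k, set()).update(v)
    return d

def concat(m1, m2):
    """Case r = r1r2: λ-arrow from M(r1)'s accept state to M(r2)'s start."""
    d = _merged(m1["delta"], m2["delta"])
    d.setdefault((m1["accept"], ""), set()).add(m2["start"])
    return {"start": m1["start"], "accept": m2["accept"], "delta": d}

def union(m1, m2):
    """Case r = r1 + r2: new start/accept states with λ-arrows to both machines."""
    s, f = next(_fresh), next(_fresh)
    d = _merged(m1["delta"], m2["delta"])
    d[(s, "")] = {m1["start"], m2["start"]}
    d.setdefault((m1["accept"], ""), set()).add(f)
    d.setdefault((m2["accept"], ""), set()).add(f)
    return {"start": s, "accept": f, "delta": d}

def star(m):
    """Case r = r1*: λ-arrows allow skipping or repeating M(r1)."""
    s, f = next(_fresh), next(_fresh)
    d = {k: set(v) for k, v in m["delta"].items()}
    d[(s, "")] = {m["start"], f}
    d.setdefault((m["accept"], ""), set()).update({m["start"], f})
    return {"start": s, "accept": f, "delta": d}

def accepts(m, w):
    """Simulate the NFA, taking λ-closures between input symbols."""
    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for p in m["delta"].get((q, ""), ()):
                if p not in seen:
                    seen.add(p)
                    stack.append(p)
        return seen
    current = closure({m["start"]})
    for a in w:
        current = closure({p for q in current
                           for p in m["delta"].get((q, a), ())})
    return m["accept"] in current

# The expression of Example 3.2, built bottom-up: (ab + a)*
m = star(union(concat(symbol("a"), symbol("b")), symbol("a")))
```

As in the figure, the machine produced this way is far from minimal; it simply tracks the structure of the expression.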
Example 3.3
In Figure 3.8, we convert the regular expression (a + b)∗aba to an NFA. A few of the minor
steps are not shown.
[diagram: stages of the construction: NFAs for a and b, an NFA for a + b, an NFA for (a + b)*, an NFA for aba, and finally an NFA for (a + b)*aba]
Figure 3.8: Building an NFA from the regular expression (a + b)*aba
Example 3.4:
Find an NFA that accepts L(r), where r=(a + bb)* (ba* + λ)
Solution
Automata for (a + bb) and (ba* + λ), constructed directly from first principles, are given in Figure 3.9.
(a) M1 accepts L(a + bb).
(b) M2 accepts L (ba* + λ).
[Figure 3.9: the automata M1 and M2]
Now let’s turn to the other direction of the proof of Theorem 3.2.
Lemma 3.4: If a language is regular, then it is described by a regular expression.
Proof Idea: We need to show that if a language L is regular, a regular expression describes it. Because
L is regular, it is accepted by a DFA. We describe a procedure for converting DFAs into equivalent
regular expressions. We break this procedure into two parts, using a new type of finite automaton called
a generalized nondeterministic finite automaton, GNFA. First, we show how to convert DFAs into
GNFAs, and then GNFAs into regular expressions. Generalized nondeterministic finite automata are
simply nondeterministic finite automata wherein the transition arrows may have any regular expressions
as labels, instead of only members of the alphabet or λ. The GNFA reads blocks of symbols from the
input, not necessarily just one symbol at a time as in an ordinary NFA. The GNFA moves along a
transition arrow connecting two states by reading a block of symbols from the input, which themselves
constitute a string described by the regular expression on that arrow. A GNFA is nondeterministic and
so may have several different ways to process the same input string. It accepts its input if its processing
can cause the GNFA to be in an accept state at the end of the input. The following figure presents an
example of a GNFA.
[diagram: a GNFA with start state qstart and accept state qaccept whose arrows carry regular-expression labels such as ab*, a*, aa, ab + ba, (aa)*, b, and b*]
For convenience, we require that GNFAs always have a special form that meets the following conditions.
• The start state has transition arrows going to every other state but no arrows coming in from any other state.
• There is only a single accept state, and it has arrows coming in from every other state but no
arrows going to any other state. Furthermore, the accept state is not the same as the start state.
• Except for the start and accept states, one arrow goes from every state to every other state and
also from each state to itself.
We can easily convert a DFA into a GNFA in the special form. We simply add a new start state with an
𝜆 arrow to the old start state and a new accept state with 𝜆 arrows from the old accept states. If any
arrows have multiple labels (or if there are multiple arrows going between the same two states in the
same direction), we replace each with a single arrow whose label is the union of the previous labels.
Finally, we add arrows labeled ∅ between states that had no arrows. This last step won’t change the
language recognized because a transition labeled with ∅ can never be used. From here on we assume
that all GNFAs are in the special form.
Now we show how to convert a GNFA into a regular expression. Say that the GNFA has k states. Then,
because a GNFA must have a start and an accept state and they must be different from each other, we
know that k ≥ 2. If k > 2, we construct an equivalent GNFA with k−1 states. This step can be repeated
on the new GNFA until it is reduced to two states. If k = 2, the GNFA has a single arrow that goes from
the start state to the accept state. The label of this arrow is the equivalent regular expression. For
example, the stages in converting a DFA with three states to an equivalent regular expression are shown
in the following figure.
[diagram: … → 3-state GNFA → 2-state GNFA → regular expression]
The crucial step is constructing an equivalent GNFA with one fewer state when k > 2. We do so by
selecting a state, ripping it out of the machine, and repairing the remainder so that the same language is
still recognized. Any state will do, provided that it is not the start or accept state. We are guaranteed that
such a state will exist because k > 2. Let’s call the removed state qrip.
After removing qrip we repair the machine by altering the regular expressions that label each of the
remaining arrows. The new labels compensate for the absence of qrip by adding back the lost
computations. The new label going from a state qi to a state qj is a regular expression that describes all
strings that would take the machine from qi to qj either directly or via qrip. We illustrate this approach in
Figure 3.12.
[Figure 3.12: before: qi reaches qj either directly via an arrow labeled r4 or via qrip, with arrows r1 (qi to qrip), r2 (self-loop on qrip), and r3 (qrip to qj); after: a single arrow from qi to qj labeled (r1)(r2)*(r3) + (r4)]
δ: (Q − {qaccept}) × (Q − {qstart}) → R
The symbol R denotes the collection of all regular expressions over the alphabet Σ, and qstart and qaccept are the
start and accept states. If δ(qi, qj) = r, the arrow from state qi to state qj has the regular expression r as
its label. The domain of the transition function is (Q − {qaccept}) × (Q − {qstart}) because an arrow
connects every state to every other state, except that no arrows are coming from qaccept or going to qstart.
Definition 3.4: A generalized nondeterministic finite automaton is a 5-tuple (Q, Σ, δ, qstart, qaccept), where Q is the finite set of states, Σ is the input alphabet, δ: (Q − {qaccept}) × (Q − {qstart}) → R is the transition function, qstart is the start state, and qaccept is the accept state.
A GNFA accepts a string w in Σ* if w = w1w2…wk, where each wi is in Σ* and a sequence of states q0,
q1, . . . , qk exists such that
1. q0 = qstart is the start state,
2. qk = qaccept is the accept state, and
3. for each i, we have wi ∈ L(ri), where ri = δ(qi−1 , qi); in other words, ri is the expression on the
arrow from qi−1 to qi.
Returning to the proof of Lemma 3.4, we let M be the DFA for language L. Then we convert M to a
GNFA G by adding a new start state and a new accept state and additional transition arrows as necessary.
We use the procedure CONVERT(G), which takes a GNFA and returns an equivalent regular
expression. This procedure uses recursion, which means that it calls itself. An infinite loop is avoided
because the procedure calls itself only to process a GNFA that has one fewer state. The case where the
GNFA has two states is handled without recursion.
CONVERT(G):
1. Let k be the number of states of G.
2. If k = 2, then G must consist of a start state, an accept state, and a single arrow connecting
them and labeled with a regular expression r.
Return the expression r.
3. If k > 2, we select any state qrip ∈ Q different from qstart and qaccept and let
G’ be the GNFA (Q’, Σ, δ’, qstart, qaccept), where Q’ = Q − {qrip},
and for any qi ∈ Q’ − {qaccept} and any qj ∈ Q’ − {qstart}, let
δ’(qi, qj) = (r1)(r2)∗(r3) + (r4),
for r1 = δ(qi, qrip), r2 = δ(qrip, qrip), r3 = δ(qrip, qj), and r4 = δ(qi, qj).
4. Compute CONVERT (G’) and return this value.
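The procedure can be implemented directly, representing ∅ as None, λ as the empty string, and the arrow labels in Python's own re syntax so the result can be tested. This is a sketch (single-character input labels assumed, and no simplification beyond the ∅/λ rules), written iteratively rather than recursively:

```python
import re
from itertools import product

EMPTY, EPS = None, ""        # the labels ∅ and λ

def _union(r, s):
    if r is EMPTY: return s
    if s is EMPTY: return r
    return r if r == s else f"(?:{r}|{s})"

def _concat(r, s):
    return EMPTY if r is EMPTY or s is EMPTY else r + s

def _star(r):
    return EPS if r is EMPTY or r == EPS else f"(?:{r})*"

def convert(states, delta, start, accept):
    """CONVERT by state elimination, emitting Python-re syntax."""
    S, F = object(), object()                 # fresh qstart and qaccept
    R = {}
    def get(p, q):
        return R.get((p, q), EMPTY)
    for (p, a), q in delta.items():           # DFA arrows become labels
        R[(p, q)] = _union(get(p, q), a)
    R[(S, start)] = EPS                       # λ-arrow into the old start state
    for f in accept:
        R[(f, F)] = EPS                       # λ-arrows from old accept states
    nodes = set(states) | {S, F}
    for rip in list(states):                  # step 3, repeated until k = 2
        nodes.discard(rip)
        loop = _star(get(rip, rip))
        for p, q in product(nodes - {F}, nodes - {S}):
            detour = _concat(_concat(get(p, rip), loop), get(rip, q))
            R[(p, q)] = _union(get(p, q), detour)   # (r1)(r2)*(r3) + (r4)
        R = {k: v for k, v in R.items() if rip not in k}
    return get(S, F)

# The two-state DFA of Example 3.5 below: state 1 start, state 2 accepting.
rx = convert([1, 2], {(1, "a"): 1, (1, "b"): 2, (2, "a"): 2, (2, "b"): 2},
             1, {2})
print(rx)   # an expression equivalent to a*b(a+b)*
```

Emitting re syntax means the resulting expression can be checked against the original DFA with re.fullmatch, which is a convenient way to convince yourself the elimination step preserves the language.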
Claim 3.5: For any GNFA G, CONVERT(G) is equivalent to G. We prove this claim by induction on k, the number of states of G.
Basis: Prove the claim true for k = 2 states. If G has only two states, it can have only a single arrow,
which goes from the start state to the accept state. The regular expression label on this arrow describes
all the strings that allow G to get to the accept state. Hence this expression is equivalent to G.
Induction step: Assume that the claim is true for k−1 states and use this assumption to prove that the
claim is true for k states. First we show that G and G’ recognize the same language. Suppose that G
accepts an input w. Then in an accepting branch of the computation, G enters a sequence of states:
qstart, q1, q2, q3, . . . , qaccept.
If none of them is the removed state qrip, clearly G’ also accepts w. The reason is that each of the new
regular expressions labeling the arrows of G’ contains the old regular expression as part of a union. If
qrip does appear, removing each run of consecutive qrip states forms an accepting computation for G’.
The states qi and qj bracketing a run have a new regular expression on the arrow between them that
describes all strings taking qi to qj via qrip on G. So G’ accepts w.
Conversely, suppose that G’ accepts an input w. As each arrow between any two states qi and qj in G’
describes the collection of strings taking qi to qj in G, either directly or via qrip, G must also accept w.
Thus G and G’ are equivalent.
The induction hypothesis states that when the algorithm calls itself recursively on input G’, the result is
a regular expression that is equivalent to G’ because G’ has k−1 states. Hence this regular expression
also is equivalent to G, and the algorithm is proved correct. This concludes the proof of Claim 3.5,
Lemma 3.3, and Theorem 3.2.
Example 3.5:
In this example, we use the preceding algorithm to convert a DFA into a regular expression. We
begin with the two-state DFA in Figure 3.13(a). In Figure 3.13(b), we make a four-state GNFA by
adding a new start state and a new accept state, called S and F instead of qstart and qaccept so that we can
draw them conveniently. To avoid cluttering up the figure, we do not draw the arrows labeled ∅, even
though they are present. Note that we replace the label a, b on the self-loop at state 2 on the DFA with
the label a + b at the corresponding point on the GNFA. We do so because the DFA’s label represents
two transitions, one for a and the other for b, whereas the GNFA may have only a single transition going
from 2 to itself.
In Figure 3.13(c), we remove state 2 and update the remaining arrow labels. In this case, the only label
that changes is the one from 1 to F. In part (b) it was ∅, but in part (c) it is b(a + b)∗. We obtain this
result by following step 3 of the CONVERT procedure. State qi is state 1, state qj is F, and qrip is 2, so r1 = b, r2 = a + b, r3 = λ, and r4 = ∅. Therefore, the new label on the arrow from 1 to F is (b)(a + b)*(λ) + ∅. We simplify this regular expression to b(a + b)*. In Figure 3.13(d), we remove state 1 from part (c)
and follow the same procedure. Because only the start and accept states remain, the label on the arrow
joining them is the regular expression that is equivalent to the original DFA.
[Figure 3.13, parts (a)–(d): (a) the two-state DFA; (b) the four-state GNFA with new start state S and accept state F; (c) the GNFA after removing state 2, with the arrow from 1 to F labeled b(a + b)*; (d) the final two-state GNFA whose single arrow carries the equivalent regular expression]
Figure 3.13: Converting a two-state DFA to an equivalent regular expression
Activity 3.2
1. What is a regular grammar?
2. Define right-linear and left-linear grammars.
3. Show the equivalence of finite automata and regular grammars with an example.
A third way of describing regular languages is by means of certain grammars. Grammars are often an
alternative way of specifying languages. Whenever we define a language family through an automaton
or in some other way, we are interested in knowing what kind of grammar we can associate with the
family. First, we look at grammars that generate regular languages.
Definition 3.6: A grammar G = (V, T, S, P) is said to be right-linear if all productions are of the form
A → xB,
or
A → x,
where A, B ∈ V and x ∈ T*. It is said to be left-linear if all productions are of the form
A → Bx,
or
A → x.
A regular grammar is one that is either right-linear or left-linear.
Example 3.6
1. The grammar G1 = ({S}, {a,b},S,P1), with P1 given as
S →abS|a
is right-linear.
2. The grammar G2 = ({S, S1, S2}, {a, b}, S, P2), with productions
S →S1ab,
S1→ S1ab|S2,
S2 →a,
is left-linear.
The sequence
S ⇒ abS ⇒ ababS ⇒ ababa
is a derivation with G1. From this single instance it is easy to conjecture that L (G1) is the language
denoted by the regular expression r = (ab)*a. In a similar way, we can see that L(G2) is the regular
language L(aab(ab)*).
Example 3.7
The grammar G =({S, A, B}, {a, b}, S, P) with productions
S →A
A→ aB|λ,
B →Ab, is not regular.
Although every production is either in right-linear or left-linear form, the grammar itself is neither right-
linear nor left-linear, and therefore is not regular. The grammar is an example of a linear grammar.
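The right-linear/left-linear test in Definition 3.6 is mechanical enough to code directly. The sketch below checks a list of productions against both forms; the function name `linear_type` and the encoding of productions as (head, body) pairs are illustrative assumptions, not anything from the text:

```python
def linear_type(productions, variables):
    """Classify a grammar per Definition 3.6.

    productions: list of (head, body) pairs, body a list of symbols.
    variables: the set of variable (non-terminal) symbols.
    """
    def positions(body):
        return [i for i, s in enumerate(body) if s in variables]

    def right_form(body):   # A -> xB or A -> x: at most one variable, last
        p = positions(body)
        return not p or (len(p) == 1 and p[0] == len(body) - 1)

    def left_form(body):    # A -> Bx or A -> x: at most one variable, first
        p = positions(body)
        return not p or (len(p) == 1 and p[0] == 0)

    if all(right_form(b) for _, b in productions):
        return "right-linear"
    if all(left_form(b) for _, b in productions):
        return "left-linear"
    return "neither"

# G1 from Example 3.6: S -> abS | a
g1 = [("S", ["a", "b", "S"]), ("S", ["a"])]
# Grammar of Example 3.7: S -> A, A -> aB | lambda, B -> Ab
g = [("S", ["A"]), ("A", ["a", "B"]), ("A", []), ("B", ["A", "b"])]
print(linear_type(g1, {"S"}))            # right-linear
print(linear_type(g, {"S", "A", "B"}))   # neither
```

The grammar of Example 3.7 comes out as "neither": each production individually has a linear form, but the grammar as a whole mixes the two sides, which is exactly why it is not regular.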
Our next goal will be to show that regular grammars are associated with regular languages and that for
every regular language there is a regular grammar. Thus, regular grammars are another way of talking
about regular languages.
· · · ⇒ a1a2 · · · an−1qn−1
⇒ a1a2 · · · an = x.
Thus x ∈ L(G).
Conversely, suppose y = b1 · · · bm ∈ L(G), for m ≥ 1, i.e., S ⇒∗ y in G. Since every production rule of G
is of the form A → aB or A → a, the derivation S ⇒∗ y has exactly m steps: the first m − 1 steps use
production rules of the type A → aB and the last (mth) step uses a rule of the form A → a. Thus, in
every step of the derivation one symbol bi of y is produced, in sequence. Precisely, the derivation can
be written as
S ⇒ b1B1
⇒ b1b2B2
⇒ · · ·
⇒ b1b2 · · · bm−1Bm−1
⇒ b1b2 · · · bm = y.
From the construction of G, it can be observed that
δ(Bi−1, bi) = Bi, for 1 ≤ i ≤ m − 1, and B0 = S
in M. Moreover, δ(Bm−1, bm) ∈ F. Thus,
δ̂(q0, y) = δ̂(S, b1 · · · bm) = δ̂(δ(S, b1), b2 · · · bm)
= δ̂(B1, b2 · · · bm)
= · · ·
= δ̂(Bm−1, bm)
= δ(Bm−1, bm) ∈ F
so that y ∈ L(M). Hence L(M) = L(G).
Example 3.11:
Consider the DFA given below in Figure 3.18. Set V = {q0, q1}, Σ = {a, b}, S = q0, and let P have the
following production rules:
q0 → aq1 | bq0 | a
q1 → aq0 | bq1 | b
Now G = (V, Σ, S, P) is a regular grammar that is equivalent to the given DFA.
Figure 3.18: Deterministic Finite Automata
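The construction of Example 3.11 (a production q → aq′ for each transition δ(q, a) = q′, plus q → a whenever q′ is final, and a λ-production when the start state itself is final) can be sketched as follows. The function name and the (head, body) tuple encoding are my own illustrative choices:

```python
def dfa_to_regular_grammar(delta, start, finals):
    """Right-linear grammar equivalent to a DFA (construction of Example 3.11).

    delta: dict (state, symbol) -> state.
    Returns productions as (head, body) pairs, body a tuple of symbols.
    """
    P = set()
    for (q, a), q2 in delta.items():
        P.add((q, (a, q2)))          # q -> a q'  for every transition
        if q2 in finals:
            P.add((q, (a,)))         # q -> a     when q' is final
    if start in finals:
        P.add((start, ()))           # lambda-production when the DFA accepts the empty string
    return P

# The DFA of Example 3.11 / Figure 3.18
delta = {("q0", "a"): "q1", ("q0", "b"): "q0",
         ("q1", "a"): "q0", ("q1", "b"): "q1"}
P = dfa_to_regular_grammar(delta, "q0", {"q1"})
# Reproduces the rules of Example 3.11: q0 -> aq1 | bq0 | a,  q1 -> aq0 | bq1 | b
```

Since q0 is not a final state, no λ-production is generated, matching the example.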
Example 2.12:
Consider the DFA given in Figure 3.19 below. The regular grammar G = (V, Σ, S, P), where V =
{q1, q2, q3}, Σ = {a, b}, S = q1 and P has the following rules
q1 → aq2 | bq1 | a | b | 𝜆
q2 → aq3 | bq1 | b
q3 → aq3 | bq3
Since q3 is a trap state from which no final state can be reached, the productions involving q3 may be
dropped, giving the equivalent grammar
q1 → aq2 | bq1 | a | b | 𝜆
q2 → bq1 | b
Figure 3.19: Deterministic Finite Automata
From the construction it is easy to see that
A0 ⇒ a1A1 ⇒ a1a2A2 ⇒ · · · ⇒ a1a2 · · · an−1An−1 ⇒ a1a2 · · · an
is a derivation of a1a2 · · · an if and only if there is a path in M starting from q0 and terminating in qf
with path value a1a2 · · · an. Therefore, L(G) = L(M).
Example 2.13:
Let G = ({A0, A1}, {a, b}, A0, P), where P consists of A0 → aA1, A1 → bA1, A1 → a, and
A1 → bA0. Construct a transition system M accepting L(G).
Solution
Let M = ({q0, q1, qf}, {a, b}, q0, {qf}), where q0 and q1 correspond to A0 and A1, respectively, and qf is
the new (final) state introduced. A0 → aA1 induces a transition from q0 to q1 with label a. Similarly,
A1 → bA1 and A1 → bA0 induce transitions from q1 to q1 with label b and from q1 to q0 with label b,
respectively. A1 → a induces a transition from q1 to qf with label a. M is given in Figure 3.20 below.
Figure 3.20: Transition system M accepting L(G)
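The converse construction of Example 2.13 (grammar to transition system) can be simulated directly: each variable becomes a state, A → aB becomes an edge labeled a from A to B, and A → a becomes an edge into a fresh final state qf. A minimal sketch, with `grammar_to_nfa` and the tuple encoding as my own assumptions:

```python
def grammar_to_nfa(productions, start):
    """NFA for a right-linear grammar with rules A -> aB or A -> a
    (the construction of Example 2.13).  Returns an acceptance test."""
    edges = {}
    for head, body in productions:
        if len(body) == 2:                       # A -> aB : edge A -a-> B
            edges.setdefault((head, body[0]), set()).add(body[1])
        elif len(body) == 1:                     # A -> a  : edge A -a-> qf
            edges.setdefault((head, body[0]), set()).add("qf")

    def accepts(w):
        current = {start}                        # subset simulation of the NFA
        for a in w:
            current = set().union(*(edges.get((q, a), set()) for q in current))
            if not current:
                return False
        return "qf" in current
    return accepts

# P of Example 2.13: A0 -> aA1, A1 -> bA1, A1 -> a, A1 -> bA0
accepts = grammar_to_nfa([("A0", ("a", "A1")), ("A1", ("b", "A1")),
                          ("A1", ("a",)), ("A1", ("b", "A0"))], "A0")
print(accepts("aba"), accepts("ab"))   # True False
```

The string aba corresponds to the derivation A0 ⇒ aA1 ⇒ abA1 ⇒ aba, i.e., the path q0 → q1 → q1 → qf in M.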
Example 3.13
Construct a finite automaton that accepts the language generated by the grammar G
V0 →aV1,
V1 →abV0|b,
where V0 is the start variable. We start the transition graph with vertices V0, V1, and Vf. The first
production rule creates an edge labeled a between V0 and V1. For the second rule, we need to introduce
an additional vertex so that there is a path labeled ab between V1 and V0. Finally, we need to add an
edge labeled b between V1 and Vf, giving the automaton shown in Figure 3.17. The language generated
by the grammar and accepted by the automaton is the regular language
L((aab)*ab).
Figure 3.17: Finite automaton accepting L((aab)*ab)
Activity 3.3
1. Are regular languages closed under union, intersection, concatenation and
complement? Prove your answer with examples.
2. What is the purpose of the pumping lemma for regular languages?
In the previous sections we have introduced various tools, regular expression, grammars, and automata,
to understand regular languages. Also, we have noted that the class of regular languages is closed with
respect to certain operations like union, concatenation, Kleene closure. Now, with this information, can
we determine whether a given language is regular or not? If a given language is regular, then to prove
the same we need to use regular expression, regular grammar, finite automata. Is there any other way to
prove that a language is regular? The answer is “Yes”. If a given language can be obtained from some
known regular languages by applying those operations which preserve regularity, then one can ascertain
that it is regular.
Theorem 3.7: The class of regular languages is closed under complementation; that is, if L is regular,
then so is L̅.
Proof: Let L be a regular language accepted by a DFA M = (Q, Σ, δ, q0, F). Construct the DFA
M’ = (Q, Σ, δ, q0,Q − F), that is, by interchanging the roles of final and nonfinal states of M. We claim
that L(M’) = L̅ so that L̅ is regular. For x ∈ Σ∗,
x ∈ L̅ ⇔ x ∉ L
⇔ δ̂(q0, x) ∉ F
⇔ δ̂(q0, x) ∈ Q − F
⇔ x ∈ L(M’).
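The complementation construction amounts to one line of code: keep Q, Σ, δ, and q0, and replace F by Q − F. A hedged sketch, using an illustrative DFA that accepts strings over {a, b} ending in a (the helper names are my own):

```python
def run_dfa(dfa, w):
    """Run a DFA given as (states, alphabet, delta, start, finals)."""
    _, _, delta, q, finals = dfa
    for a in w:
        q = delta[(q, a)]
    return q in finals

def complement_dfa(dfa):
    """DFA for the complement: same transitions, finals swapped to Q - F."""
    states, alphabet, delta, q0, finals = dfa
    return (states, alphabet, delta, q0, states - finals)

# M accepts strings over {a, b} that end in a (illustrative choice)
M = ({"p", "q"}, {"a", "b"},
     {("p", "a"): "q", ("p", "b"): "p", ("q", "a"): "q", ("q", "b"): "p"},
     "p", {"q"})
M_bar = complement_dfa(M)
print(run_dfa(M, "ba"), run_dfa(M_bar, "ba"))   # True False
```

Note that the construction relies on M being deterministic and complete: every (state, symbol) pair must have a transition, otherwise swapping final states does not yield the complement.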
Corollary 3.8: The class of regular languages is closed with respect to intersection.
Proof: If L1 and L2 are regular, then so are L̅1 and L̅2, and hence so is their union L̅1 ∪ L̅2. By closure
under complementation, the complement of L̅1 ∪ L̅2 is also regular. But, by De Morgan's law, L1 ∩ L2 is
exactly the complement of L̅1 ∪ L̅2, so L1 ∩ L2 is regular.
Alternative proof by construction: For i = 1, 2, let Mi = (Qi, Σ, δi, qi, Fi) be two DFAs accepting Li; that
is, L(M1) = L1 and L(M2) = L2. Define the DFA
M = (Q1 × Q2, Σ, δ, (q1, q2), F1 × F2), where δ is defined point-wise by
δ((p, q), a) = (δ1(p, a), δ2(q, a)),
for all (p, q) ∈ Q1 × Q2 and a ∈ Σ. We claim that L(M) = L1 ∩ L2. Using induction on |x|, first observe
that δ̂((p, q), x) = (δ̂1(p, x), δ̂2(q, x)), for all x ∈ Σ∗.
Now it clearly follows that
x ∈ L(M) ⇔ δ̂((q1, q2), x) ∈ F1 × F2
⇔ (δ̂1(q1, x), δ̂2(q2, x)) ∈ F1 × F2
⇔ δ̂1(q1, x) ∈ F1 and δ̂2(q2, x) ∈ F2
⇔ x ∈ L1 and x ∈ L2
⇔ x ∈ L1 ∩ L2.
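The point-wise product construction above can be sketched in a few lines. The two machines below are illustrative choices of mine: M1 accepts even-length strings and M2 strings ending in a, so their product accepts even-length strings ending in a:

```python
def run_dfa(dfa, w):
    """Run a DFA given as (states, alphabet, delta, start, finals)."""
    _, _, delta, q, finals = dfa
    for a in w:
        q = delta[(q, a)]
    return q in finals

def product_dfa(M1, M2):
    """Intersection DFA from the proof: pair states, point-wise delta, F1 x F2."""
    S1, A, d1, s1, F1 = M1
    S2, _, d2, s2, F2 = M2
    states = {(p, q) for p in S1 for q in S2}
    delta = {((p, q), a): (d1[(p, a)], d2[(q, a)])
             for p in S1 for q in S2 for a in A}
    finals = {(p, q) for p in F1 for q in F2}
    return (states, A, delta, (s1, s2), finals)

# M1: even-length strings;  M2: strings ending in a
M1 = ({"e", "o"}, {"a", "b"},
      {(s, c): ("o" if s == "e" else "e") for s in "eo" for c in "ab"},
      "e", {"e"})
M2 = ({"n", "y"}, {"a", "b"},
      {("n", "a"): "y", ("n", "b"): "n", ("y", "a"): "y", ("y", "b"): "n"},
      "n", {"y"})
M = product_dfa(M1, M2)
print(run_dfa(M, "ba"), run_dfa(M, "ab"))   # True False
```

Replacing F1 × F2 by (F1 × Q2) ∪ (Q1 × F2) turns the same construction into a proof of closure under union.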
Corollary 3.9: The class of regular languages is closed under set difference.
Theorem 3.10: (pumping lemma): Let L be a regular language. Then there exists a constant n
(depending on L), such that ∀w ∈ L, |w| ≥ n, we can find a partition w = xyz, such that (1) y ≠ 𝜆, (2)
|xy| ≤ n, and (3) ∀k ≥ 0, xykz ∈ L. In intuitive terms, every word w from L that exceeds n in length can
be "pumped" by replicating an inner part, such that the "pumped-up" words are also in L.
Proof: Because L is regular, there exists some DFA M = (Q, Σ, δ, q0, F) that accepts L. Let M have n
states. Observe that in an accepting run on any word w = x1x2...xm of length at least n, at least one state
P (possibly the start state) must be visited twice by the time xn has been read. Let xixi+1...xj be the
subword read between the first and second visits to P, that is, δ̂(P, xixi+1...xj) = P. Put
x = x1x2...xi−1, y = xixi+1...xj, and z = xj+1...xm. Then y ≠ 𝜆, |xy| ≤ n, and for every k ≥ 0 the run on xykz
returns to P after each copy of y and ends in an accepting state, so xykz ∈ L and the statement holds.
Example 3.14:
Show that L = {0i1i} is not regular.
Solution
Step 1 Suppose L is regular. Let n be the number of states in the finite automaton accepting L.
Step 2 Let w = 0n1n. Then |w| = 2n > n. By the pumping lemma, we can write w = xyz with |xy| ≤ n and y ≠ 𝜆.
Step 3 We want to find i so that xyiz ∉ L, giving a contradiction. The string y can take any of the
following forms:
Case 1: y has only 0's, i.e., y = 0k for some k ≥ 1.
Case 2: y has only 1's, i.e., y = 1j for some j ≥ 1.
Case 3: y has both 0's and 1's, i.e., y = 0k1j for some k, j ≥ 1.
In each case, taking i = 2 gives a string xy2z with unequal numbers of 0's and 1's, or with a 1 before a 0;
such a string is not in L, which contradicts the pumping lemma. Hence L is not regular.
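The argument of Example 3.14 can be checked by brute force for a fixed n: for w = 0n1n, no split w = xyz with |xy| ≤ n and y ≠ λ survives pumping. This is an illustration for one particular n (not a proof for all n); the function names are my own:

```python
def in_L(w):
    """Membership in L = { 0^i 1^i : i >= 0 }."""
    i = w.find("1")
    i = len(w) if i == -1 else i
    zeros, ones = w[:i], w[i:]
    return set(zeros) <= {"0"} and set(ones) <= {"1"} and len(zeros) == len(ones)

def some_split_pumps(w, n):
    """Is there a split w = xyz with |xy| <= n and y != '' such that
    x y^k z stays in L for every k = 0..3?"""
    for i in range(n + 1):                  # x = w[:i]
        for j in range(i + 1, n + 1):       # y = w[i:j], non-empty
            x, y, z = w[:i], w[i:j], w[j:]
            if all(in_L(x + y * k + z) for k in range(4)):
                return True
    return False

n = 7                                       # a candidate pumping-lemma constant
w = "0" * n + "1" * n
print(in_L(w), some_split_pumps(w, n))      # True False: no split pumps
```

Since |xy| ≤ n forces y to consist of 0's only, already xy2z has more 0's than 1's, which is what the exhaustive check confirms.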
Activity 4.1
1. Why do we need to learn context-free grammars?
2. What are the applications of context-free grammars?
3. What is a context-free language?
In Chapters 2 and 3 we introduced three different, though equivalent, methods of describing languages:
finite automata, regular expressions and regular grammars. We showed that many languages can be
described in this way but that some simple languages, such as L = {0n1n| n ≥ 0}, cannot.
In this chapter we present context-free grammars, a more powerful method of describing languages.
Such grammars can describe certain features that have a recursive structure, which makes them useful
in a variety of applications. Context-free grammars were first used in the study of human languages.
One way of understanding the relationship of terms such as noun, verb, and preposition and their
respective phrases leads to a natural recursion because noun phrases may appear inside verb phrases and
vice versa. Context-free grammars help us organize and understand these relationships.
Collections of languages associated with context-free grammars are called the context-free languages.
They include all the regular languages and many additional languages. In this chapter, we give a formal
definition of context-free grammars and study the properties of context-free languages. We also
introduce pushdown automata, a class of machines recognizing the context-free languages. Pushdown
automata are useful because they allow us to gain additional insight into the power of context-free
grammars.
Definition 4.1: A grammar G = (V, T, S, P) is said to be context-free if all productions in P have the form
A → x,
where A ∈ V and x ∈ (V ∪ T)*.
A language L is said to be context-free language if and only if there is a context free grammar G such
that
L= L (G).
Example 4.1: Consider the grammar G1 = ({A, B}, {0, 1}, A, P), where the productions P are given by
A → 0A1
A→B
B→0
We use a grammar to describe a language by generating each string of that language: beginning with the
start variable, we repeatedly replace a variable by the right-hand side of one of its rules until no variables
remain. For example, A ⇒ 0A1 ⇒ 00A11 ⇒ 000A111 ⇒ 000B111 ⇒ 0000111.
Figure 4.1: Parse tree for 0000111 in grammar G1
All strings generated in this way constitute the language of the grammar. We write L(G1) for the
language of grammar G1. Some experimentation with the grammar G1 shows us that L(G1) is {0n+11n| n
≥ 0}. Any language that can be generated by some context-free grammar is called a context-free
language (CFL). For convenience when presenting a context-free grammar, we abbreviate several rules
with the same left-hand variable, such as A → 0A1 and A → B, into a single line A → 0A1 | B, using
the symbol “ | ” as an “or”.
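Derivations like the one generating 0000111 can be enumerated mechanically by breadth-first expansion of sentential forms. The sketch below (`language_upto` is an illustrative name) relies on the fact that in G1 every variable eventually derives at least one symbol, so any sentential form longer than the bound can be pruned:

```python
from collections import deque

def language_upto(rules, start, max_len):
    """All terminal strings of length <= max_len derivable from `start`.

    rules: dict variable -> list of right-hand sides (strings over V and T).
    Pruning assumes every variable derives at least one symbol, as in G1."""
    seen, out, queue = {start}, set(), deque([start])
    while queue:
        form = queue.popleft()
        i = next((k for k, s in enumerate(form) if s in rules), None)
        if i is None:                        # no variables left: a sentence
            out.add(form)
            continue
        for rhs in rules[form[i]]:           # expand the leftmost variable
            new = form[:i] + rhs + form[i + 1:]
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return out

rules = {"A": ["0A1", "B"], "B": ["0"]}      # grammar G1 of Example 4.1
print(language_upto(rules, "A", 7) == {"0", "001", "00011", "0000111"})  # True
```

The enumerated strings are exactly 0n+11n for n = 0, . . ., 3, agreeing with the description of L(G1) above.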
If u, v, and w are strings of variables and terminals, and A → w is a rule of the grammar, we say that
uAv yields uwv, written uAv ⇒ uwv. Say that u derives v, written u ⇒∗ v, if u = v or if a sequence u1, u2,
. . ., uk exists for k ≥ 0 and u ⇒ u1 ⇒ u2 ⇒ . . . ⇒ uk ⇒ v.
The language of the grammar is {w ∈ Σ∗ | S ⇒∗ w}. In grammar G1, V = {A, B}, Σ = {0, 1}, S = A, and
P is the collection of the three rules given in Example 4.1.
Example 4.2
The grammar G2 = ({S}, {a, b}, S, P), with productions
S → aSa,
S → bSb,
S → λ,
is context-free.
Example 4.3
Consider grammar G3 = ({S}, {a, b}, S, P). The set of rules, P is given by
S → aSb | SS | 𝜆.
This grammar generates strings such as abab, aaabbb, and aababb.
L(G3) = { w ∈ {a,b}* : na(w) = nb(w), and na(v) ≥ nb(v) for every prefix v of w }
In order to show which production is applied, we have numbered the productions and written the appropriate
number on the ⇒ symbol. From this we see that the two derivations not only yield the same sentence but also use
exactly the same productions. The difference is entirely in the order in which the productions are applied. To
remove such irrelevant factors, we often require that the variables be replaced in a specific order.
A derivation is said to be leftmost if in each step the leftmost variable in the sentential form is replaced. If in each
step the rightmost variable is replaced, we call the derivation rightmost.
Example 4.4:
Consider the grammar G = ({A, B, S}, {a, b}, S, P), where P is given by
S → aAB,
A → bBb,
B → A | λ.
Then
S ⇒ aAB ⇒ aA ⇒ abBb ⇒ abAb ⇒ abbBbb ⇒ abbbb
is a rightmost derivation of the string abbbb.
Definition 4.2: A derivation S ⇒∗ w is called a leftmost derivation if we apply a production only to the
leftmost variable at every step.
Definition 4.3: A derivation S ⇒∗ w is a rightmost derivation if we apply a production only to the
rightmost variable at every step.
Derivation Trees
A second way of showing derivations, independent of the order in which productions are used, is by a derivation
or parse tree. A derivation tree is an ordered tree in which nodes are labeled with the left sides of productions
and in which the children of a node represent its corresponding right sides. For example, Figure 4.2 shows part
of a derivation tree representing the production
A→abAB
In a derivation tree, a node labeled with a variable occurring on the left side of a production has children consisting
of the symbols on the right side of that production. Beginning with the root, labeled with the start symbol and
ending in leaves that are terminals, a derivation tree shows how each variable is replaced in the derivation. The
following definition makes this notion precise.
Figure 4.2: Partial derivation tree
Definition 4.4: Let a grammar G = (V, T, S, P ) be a context-free grammar. An ordered tree is a derivation tree
for G if and only if it has the following properties.
1. The root is labeled S.
2. Every leaf has a label from T ∪ {λ}.
3. Every interior vertex (a vertex that is not a leaf) has a label from V.
4. If a vertex has label A ∈ V, and its children are labeled (from left to right) a1, a2,…, an, then P must
contain a production of the form
A → a1a2…..an.
5. A leaf labeled λ has no siblings, that is, a vertex with a child labeled λ can have no other children.
A tree that has properties 3, 4, and 5, but in which 1 does not necessarily hold and in which property 2 is replaced
by V ∪ T ∪ {λ}, is said to be a partial derivation tree.
The string of symbols obtained by reading the leaves of the tree from left to right, omitting any λ’s encountered,
is said to be the yield of the tree. The descriptive term left to right can be given a precise meaning. The yield is
the string of terminals in the order they are encountered when the tree is traversed in a depth-first manner, always
taking the leftmost unexplored branch.
Example 4.5
Consider the grammar G, with productions
S → aAB,
A →bBb,
B → A| 𝜆.
The tree in Figure 4.3 is a partial derivation tree for G, while the tree in Figure 4.4 is a derivation tree. The string
abBbB, which is the yield of the first tree, is a sentential form of G. The yield of the second tree, abbbb, is a
sentence of L (G).
Figure 4.3: Partial derivation tree with yield abBbB
Figure 4.4: Derivation tree with yield abbbb
Definition 4.6: The yield of a derivation tree is the concatenation of the labels of the leaves, omitting
any λ's, in the left-to-right ordering. For example, the yield of the derivation tree of Figure 4.4 is abbbb.
As with the design of finite automata, the design of context-free grammars requires creativity. Indeed,
context-free grammars are even trickier to construct than finite automata because we are more
accustomed to programming a machine for specific tasks than we are to describing languages with
grammars. The following techniques are helpful, singly or in combination, when you’re faced with the
problem of constructing a context free grammar (CFG).
First, many context-free languages (CFLs) are the union of simpler CFLs. If you must construct a CFG
for a CFL that you can break into simpler pieces, construct individual grammars for each piece.
These individual grammars can be easily merged into a grammar for the original language by combining
their rules and then adding the new rule S → S1|S2| · · · |Sk, where the variables Si are the start variables
for the individual grammars. Solving several simpler problems is often easier than solving one
complicated problem. For example, to get a grammar for the language {0n1n| n ≥ 0}∪{1n0n| n ≥ 0}, first
construct the grammar
S1 → 0S11 | 𝜆
for the language {0n1n| n ≥ 0} and the grammar
S2 → 1S20 | 𝜆
for the language {1n0n| n ≥ 0} and then add the rule S → S1 | S2 to give the grammar
S → S1 | S2
S1 → 0S11 | 𝜆
S2 → 1S20 | 𝜆.
Second, constructing a CFG for a language that happens to be regular is easy if you can first construct
a DFA for that language. You can convert any DFA into an equivalent CFG as follows. Make a variable
Pi for each state qi of the DFA. Add the rule Pi → aPj to the CFG if δ(qi, a) = qj is a transition in the
DFA. Add the rule Pi → 𝜆 if qi is an accept state of the DFA. Make P0 the start variable of the grammar,
where q0 is the start state of the machine. Verify on your own that the resulting CFG generates the same
language that the DFA recognizes.
Third, certain context-free languages contain strings with two substrings that are “linked” in the sense
that a machine for such a language would need to remember an unbounded amount of information about
one of the substrings to verify that it corresponds properly to the other substring. This situation occurs
in the language {0n1n| n ≥ 0} because a machine would need to remember the number of 0s in order to
verify that it equals the number of 1s. You can construct a CFG to handle this situation by using a rule
of the form P → uPv, which generates strings wherein the portion containing the u’s corresponds to the
portion containing the v’s.
4.1.3 AMBIGUITY
Sometimes a grammar can generate the same string in several different ways. Such a string will have
several different parse trees and thus several different meanings. This result may be undesirable for
certain applications, such as programming languages, where a program should have a unique
interpretation. If a grammar generates the same string in several different ways, we say that the string is
derived ambiguously in that grammar. If a grammar generates some string ambiguously, we say that the
grammar is ambiguous.
Example 4.7:
If G is the grammar S → SbS | a, show that G is ambiguous.
Solution
To prove that G is ambiguous, we have to find a w ∈ L(G), which is ambiguous. Consider a string w
= abababa ∈ L(G). Then we get two derivation trees for w (see Fig. 4.5). Thus, G is ambiguous.
Figure 4.5: Two derivation trees of the string abababa
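Ambiguity in the grammar S → SbS | a can be quantified by counting the parse trees of a string, in the style of a CYK-like dynamic program. A sketch (`parse_counts` is my own name, not from the text):

```python
from functools import lru_cache

def parse_counts(w):
    """Number of distinct parse trees of w in the grammar S -> SbS | a."""
    @lru_cache(maxsize=None)
    def count(i, j):                    # ways to derive w[i:j] from S
        n = 1 if w[i:j] == "a" else 0   # rule S -> a
        for k in range(i + 1, j - 1):   # rule S -> SbS, with the b at position k
            if w[k] == "b":
                n += count(i, k) * count(k + 1, j)
        return n
    return count(0, len(w))

print(parse_counts("ababa"), parse_counts("abababa"))   # 2 5
```

The string abababa of Example 4.7 has five parse trees (two of which are drawn in Figure 4.5), so a single witness string with count greater than one already proves the grammar ambiguous.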
Example 4.8
Consider the grammar G = ({S}, {a, b, +, *}, S, P), where P consists of S → S+S | S*S | b | a.
Show that the grammar is ambiguous.
Solution
We have two derivation trees for a + a * b given in Fig. 4.6.
Figure 4.6: Two derivation trees of a + a * b
Activity 4.3
1. What are useless variables and useless productions?
2. What is a 𝜆-production?
3. Define a unit production.
In a CFG G, it may not be necessary to use all the symbols in V∪ Σ, or all the productions in P for
deriving sentences. So when we study a context free language L(G), we try to eliminate those symbols
and productions in G which are not useful for the derivation of sentences. Consider, for example,
G = ({S, A, B, C, E}, {a, b, c}, S, P), where
P = {S → AB, A → a, B → b, B → C, E → c | 𝜆}.
It is easy to see that L(G) = {ab}. Let 𝐺̂ = ({S, A,B}, {a, b}, S, 𝑃̂), where 𝑃̂ consists of S →AB, A
→ a, B → b. L(G) = L(𝐺̂ ). We have eliminated the symbols C, E and c and the productions B → C, E
→ c | 𝜆, We note the following points regarding the symbols and productions which are eliminated:
(i) C does not derive any terminal string.
(ii) E and c do not appear in any sentential form.
(iii) E → 𝜆 is a null production.
(iv) B → C simply replaces B by C.
In this section, we give the construction to eliminate
(i) Variables not deriving terminal strings,
(ii) Symbols not appearing in any sentential form,
(iii) Null productions, and
(iv) Productions of the form A → B.
Theorem 4.1: Let G = (V, T, S, P) be a context-free grammar. Suppose that P contains a production of
the form A → x1Bx2, with A ≠ B, and that
B → y1 | y2 | · · · | yn
is the set of all productions in P that have B as the left side. Let 𝐺̂ = (V, T, S, 𝑃̂) be the grammar
constructed by deleting
A → x1Bx2 …………………………..….……….. (4.1)
from P, and adding to it
A → x1y1x2 | x1y2x2 | · · · | x1ynx2,
then
L(𝐺̂) = L(G).
Proof: Suppose w ∈ L(G), and that its derivation uses the production A → x1Bx2, with the B later
replaced using B → yi. Then part of the derivation in G has the form
S ⇒∗ u1Au2 ⇒ u1x1Bx2u2 ⇒∗ u1x1yix2u2.
But with grammar 𝐺̂ we can get
S ⇒∗ u1Au2 ⇒ u1x1yix2u2.
Thus we can reach the same sentential form with 𝐺̂ as with G. It follows, by induction on the number of
times the production is applied, that
S ⇒∗ w in 𝐺̂.
Therefore, if w ∈ L(G), then w ∈ L(𝐺̂).
By similar reasoning, we can show that if w ∈ L ( 𝐺̂ ) then w ∈ L (G), completing the proof.
Example 4.9:
Consider G = ({A, B} , {a,b,c} , A, P) with productions
A → a|aaA|abBc
B → abbA|b
get new grammar using substitution.
Solution
Using the suggested substitution for the variable B, we get the new grammar 𝐺̂ with productions
A → a | aaA | ababbAc | abbc
B → abbA | b
The new grammar 𝐺̂ is equivalent to G. The string aaabbc has the derivation
A ⇒ aaA ⇒ aaabBc ⇒ aaabbc in G, and the corresponding derivation
A ⇒ aaA ⇒ aaabbc in 𝐺̂.
Notice that, in this case, the variable B and its associated productions are still in the grammar even
though they can no longer play a part in any derivation. We will next show how such unnecessary
productions can be removed from a grammar.
Definition: Let G = (V, T, S, P) be a context-free grammar. A variable B ∈ V is said to be useful if and
only if there is at least one w ∈ L(G) such that
S ⇒∗ xBy ⇒∗ w,
with x, y in (V ∪ T)*. In words, a variable is useful if and only if it occurs in at least one derivation.
A variable that is not useful is called useless. A production is useless if it involves any useless variable.
Example 4.10:
A variable may be useless because there is no way of getting a terminal string from it. Another reason a
variable may be useless is shown in the next grammar. Let a grammar
G=({S,A},{a,b},S,P) where productions are given by
S→A,
A → aA | 𝜆,
B → bA,
Solution
Although B can derive a terminal string, there is no way we can achieve S ⇒∗ xBy. So the variable B is
useless, and so is the production B → bA. Now we can eliminate the variable B and its production
B → bA, and get the new grammar
𝐺̂ = ({S, A}, {a}, S, 𝑃̂), where 𝑃̂ is given by
S → A,
A → aA | 𝜆.
Example 4.11:
Eliminate all useless symbols and productions from the grammar G = ({S, A, B, C}, {a}, S, P), where P
consists of
S → aS | A,
A → a,
B → aa,
B → aCb.
Solution
First, we identify the set of variables that can lead to a terminal string. Because A → a and B → aa, the
variables A and B belong to this set. So does S, because S ⇒ A ⇒ a. However, this argument cannot be
made for C, thus identifying it as useless. Removing C and its corresponding productions, we are led to
the grammar G1 with variables V1 = {S, A, B} , terminals T = {a} , and productions
S → aS|A ,
A→a
B → aa
Next we want to eliminate the variables that cannot be reached from the start variable. For this, we can
draw a dependency graph for the variables. Dependency graphs are a way of visualizing complex
relationships and are found in many applications. For context-free grammars, a dependency graph has
its vertices labeled with variables, with an edge between vertices C and D if and only if there is a
production of the form
C → xDy.
A dependency graph for V1 is shown in Figure 4.7. A variable is useful only if there is a path from the
vertex labeled S to the vertex labeled with that variable. In our case, Figure 4.7 shows that B is useless.
Removing it and the affected productions and terminals, we are led to the final answer
𝐺̂ = ({S, A}, {a}, S, 𝑃̂), with productions S → aS | A, A → a.
Figure 4.7: Dependency graph for V1
Theorem 4.2: Let G = (V, T, S, P) be a context-free grammar. Then there exists an equivalent grammar
𝐺̂ = (𝑉̂, 𝑇̂, S, 𝑃̂) that does not contain any useless variables or productions.
Proof: In the first part we construct an intermediate grammar G1 = (V1, T1, S, P1) such that V1 contains
only variables A for which
A ⇒∗ w ∈ T*
is possible. The steps in the algorithm are
1. Set V1 to ∅.
2. Repeat the following step until no more variables are added to V1: for every A ∈ V for which P
has a production of the form
A → x1x2 · · · xn, with all xi in V1 ∪ T, add A to V1.
3. Take P1 as all the productions in P whose symbols are all in (V1 ∪ T).
Clearly this procedure terminates. It is equally clear that if A ∈ V1, then A ⇒∗ w ∈ T* is a possible
derivation with G1. The remaining issue is whether every A for which A ⇒∗ w ∈ T* is added to V1
before the procedure terminates. To see this, consider any such A and look at the partial derivation tree
corresponding to that derivation (Figure 4.8). At level k, there are only terminals, so every variable Ai
at level k – 1 will be added to V1 on the first pass through Step 2 of the algorithm. Any variable at level
k–2 will then be added to V1 on the second pass through Step 2. The third time through Step 2, all
variables at level k – 3 will be added, and so on. The algorithm cannot terminate while there are variables
in the tree that are not yet in V1. Hence A will eventually be added to V1.
Figure 4.8: Partial derivation tree for A ⇒∗ w, with only terminals at level k
In the second part of the construction, we get the final answer from G1. We draw the variable dependency
graph for G1 and from it find all variables that cannot be reached from S. These are removed from the
variable set, as are the productions involving them. We can also eliminate any terminal that does not
occur in some useful production. The result is the grammar 𝐺̂ = (𝑉̂, 𝑇̂, S, 𝑃̂).
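The first part of this construction, the fixpoint computing the variables that derive terminal strings, can be sketched as follows, using the grammar of Example 4.11; the function name and the (head, body) encoding are illustrative assumptions:

```python
def generating_variables(productions, variables):
    """Part 1 of Theorem 4.2: the fixpoint V1 of variables that derive
    some terminal string.

    productions: list of (head, body), body a string over variables + terminals.
    """
    V1, changed = set(), True
    while changed:                       # one pass per tree level, as in the proof
        changed = False
        for head, body in productions:
            # head joins V1 once every symbol of some body is a terminal
            # or an already-known generating variable
            if head not in V1 and all(s in V1 or s not in variables for s in body):
                V1.add(head)
                changed = True
    return V1

# Grammar of Example 4.11: C never derives a terminal string
prods = [("S", "aS"), ("S", "A"), ("A", "a"), ("B", "aa"), ("B", "aCb")]
print(generating_variables(prods, {"S", "A", "B", "C"}) == {"S", "A", "B"})  # True
```

The second part, removing variables unreachable from S, is an ordinary graph reachability over the dependency graph and can be coded the same way.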
Because of the construction, 𝐺̂ does not contain any useless symbols or productions. Also, for each w
∈ L(G) we have a derivation
S ⇒∗ xAy ⇒∗ w.
Since the construction of 𝐺̂ retains A and all associated productions, we have everything needed to make
the derivation
S ⇒∗ xAy ⇒∗ w in 𝐺̂.
The grammar 𝐺̂ is constructed from G by the removal of productions, so that 𝑃̂ ⊆ P. Consequently,
L(𝐺̂) ⊆ L(G). Putting the two results together, we see that G and 𝐺̂ are equivalent.
Activity 4.4
1. Define Chomsky normal form.
2. Define Greibach normal form.
When working with context-free grammars, it is often convenient to have them in simplified form. One
of the simplest and most useful forms is called the Chomsky normal form. Chomsky normal form is
useful in giving algorithms for working with context-free grammars.
Definition 4.11: A context-free grammar is in Chomsky normal form if every rule is of the form
A → BC
A→a
where a is any terminal and A, B, and C are any variables.
Theorem 4.5 : Any context-free language is generated by a context-free grammar in Chomsky normal
form.
Proof idea: We can convert any grammar G into Chomsky normal form. The conversion has several
stages wherein rules that violate the conditions are replaced with equivalent ones that are satisfactory.
First, we add a new start variable. Then, we eliminate all 𝜆-rules of the form A → 𝜆. We also eliminate
all unit rules of the form A → B. In both cases we patch up the grammar to be sure that it still generates
the same language. Finally, we convert the remaining rules into the proper form.
Proof: First, we add a new start variable S0 and the rule S0 → S, where S was the original start variable.
This change guarantees that the start variable doesn’t occur on the right-hand side of a rule.
Second, we take care of all 𝜆 - rules. We remove an 𝜆 - rule A → 𝜆, where A is not the start variable.
Then for each occurrence of an A on the right-hand side of a rule, we add a new rule with that occurrence
deleted. In other words, if R → uAv is a rule in which u and v are strings of variables and terminals, we
add rule R → uv. We do so for each occurrence of an A, so the rule R → uAvAw causes us to add R →
uvAw, R → uAvw, and R → uvw. If we have the rule R → A, we add R → 𝜆 unless we had previously
removed the rule R → 𝜆. We repeat these steps until we eliminate all 𝜆 -rules not involving the start
variable.
Third, we handle all unit rules. We remove a unit rule A → B. Then, whenever a rule B → u appears,
we add the rule A → u unless this was a unit rule previously removed. As before, u is a string of variables
and terminals. We repeat these steps until we eliminate all unit rules.
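The unit-rule removal just described can be sketched as a unit-closure computation: for each variable A, collect every variable B reachable from A through unit rules, then copy B's non-unit bodies up to A. The sketch assumes single-character variable names, so a body equals a variable name exactly when it is a unit rule; the function name is my own:

```python
def remove_unit_rules(rules, variables):
    """Eliminate unit rules A -> B by copying non-unit bodies along unit chains.

    rules: dict variable -> set of bodies (strings over variables + terminals).
    Assumes variables are single characters, so `body in variables` detects
    a unit rule."""
    def unit_closure(A):
        reach, stack = {A}, [A]
        while stack:
            X = stack.pop()
            for body in rules.get(X, ()):
                if body in variables and body not in reach:
                    reach.add(body)
                    stack.append(body)
        return reach

    return {A: {body for B in unit_closure(A) for body in rules.get(B, ())
                if body not in variables}
            for A in rules}

# Illustrative grammar with unit rules S -> A and B -> A
rules = {"S": {"A"}, "A": {"aB", "a"}, "B": {"Ab", "A"}}
print(remove_unit_rules(rules, {"S", "A", "B"}) ==
      {"S": {"aB", "a"}, "A": {"aB", "a"}, "B": {"Ab", "aB", "a"}})   # True
```

Taking the closure first, rather than copying one rule at a time, avoids reintroducing a unit rule that was previously removed, which is the pitfall the text warns about.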
Finally, we convert the remaining rules into the proper form by adding additional variables and rules. The
final grammar in Chomsky normal form is equivalent to G. (Actually, this procedure produces several
variables Ui and several rules Ui → a. We simplify the resulting grammar by using a single variable U
and rule U → a.)
S0 → AA1 | UB | a | SA | AS
S → AA1 | UB | a | SA | AS
A → b | AA1 | UB | a | SA | AS
A1 → SA
U→a
B→b
Example: Convert the grammar with productions S → abSb | aa into Greibach normal form.
Solution
Here we can use a device similar to the one introduced in the construction of Chomsky normal form.
We introduce new variables A and B that are essentially synonyms for a and b, respectively. Substituting
for the terminals with their associated variables leads to the equivalent grammar
S → aBSB | aA,
A → a,
B → b,
which is in Greibach normal form.
Activity 4.5
1. Define pushdown automata formally.
2. What class of languages is recognized by pushdown automata?
3. Show that every language generated by a context-free grammar is accepted by some pushdown
automaton.
In this section we introduce a new type of computational model called pushdown automata. These
automata are like nondeterministic finite automata but have an extra component called a stack. The stack
provides additional memory beyond the finite amount available in the control. The stack allows
pushdown automata to recognize some nonregular languages.
Figure 4.10: Schematic of a finite automaton
With the addition of a stack component we obtain a schematic representation of a pushdown automaton,
as shown in the following figure.
Figure 4.11: Schematic of a pushdown automaton
A pushdown automaton (PDA) can write symbols on the stack and read them back later. Writing a
symbol “pushes down” all the other symbols on the stack. At any time the symbol on the top of the stack
can be read and removed. The remaining symbols then move back up. Writing a symbol on the stack is
often referred to as pushing the symbol, and removing a symbol is referred to as popping it. Note that
all access to the stack, for both reading and writing, may be done only at the top. In other words a stack
is a “last in, first out” storage device. If certain information is written on the stack and additional
information is written afterward, the earlier information becomes inaccessible until the later
information is removed.
Plates on a cafeteria serving counter illustrate a stack. The stack of plates rests on a spring so that when
a new plate is placed on top of the stack, the plates below it move down. The stack on a pushdown
automaton is like a stack of plates, with each plate having a symbol written on it.
A stack is valuable because it can hold an unlimited amount of information. Recall that a finite
automaton is unable to recognize the language {0n1n| n ≥ 0} because it cannot store very large numbers
in its finite memory. A PDA is able to recognize this language because it can use its stack to store the
number of 0s it has seen. Thus the unlimited nature of a stack allows the PDA to store numbers of
unbounded size. The following informal description shows how the automaton for this language works.
Read symbols from the input. As each 0 is read, push it onto the stack. As soon as 1s are seen, pop a 0
off the stack for each 1 read. If reading the input is finished exactly when the stack becomes empty of
0s, accept the input. If the stack becomes empty while 1s remain or if the 1s are finished while the stack
still contains 0s or if any 0s appear in the input following 1s, reject the input.
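Because the informal description above never has to guess, it can be simulated for this particular language without the full nondeterministic machinery. A minimal sketch using a Python list as the stack, with a "$" bottom-of-stack marker in the spirit described later in this section (`pda_accepts` is an illustrative name):

```python
def pda_accepts(w):
    """Deterministic simulation of the informal PDA for { 0^n 1^n : n >= 0 }:
    push a 0 for each 0 read, pop one per 1, accept iff only the bottom
    marker remains when the input ends."""
    stack = ["$"]                  # bottom-of-stack marker
    seen_one = False
    for c in w:
        if c == "0":
            if seen_one:
                return False       # a 0 appears after the 1s began
            stack.append("0")
        elif c == "1":
            seen_one = True
            if stack[-1] != "0":
                return False       # more 1s than 0s
            stack.pop()
        else:
            return False           # symbol outside the input alphabet
    return stack == ["$"]          # equal counts: the stack is empty of 0s

print(pda_accepts("0011"), pda_accepts("0101"))   # True False
```

The stack here is what a finite automaton lacks: the number of pushed 0s is unbounded, which is exactly why the PDA can recognize this nonregular language.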
The formal definition of a pushdown automaton is similar to that of a finite automaton, except for the
stack. The stack is a device containing symbols drawn from some alphabet. The machine may use
different alphabets for its input and its stack, so now we specify both an input alphabet Σ and a stack
alphabet Γ.
At the heart of any formal definition of an automaton is the transition function, which describes its
behavior. The domain of the transition function is Q × {Σ∪𝜆} × {Γ∪𝜆}. Thus the current state, next
input symbol read, and top symbol of the stack determine the next move of a pushdown automaton.
Either symbol may be 𝜆, causing the machine to move without reading a symbol from the input or
without reading a symbol from the stack.
For the range of the transition function we need to consider what to allow the automaton to do when it
is in a particular situation. It may enter some new state and possibly write a symbol on the top of the
stack. The function δ can indicate this action by returning a member of Q together with a member of
Γ ∪ {𝜆}, that is, a member of Q × (Γ ∪ {𝜆}). Because we allow nondeterminism in this model, a situation
may have several legal next moves. The transition function incorporates nondeterminism in the usual
way, by returning a set of members of Q × (Γ ∪ {𝜆}), that is, a member of P(Q × (Γ ∪ {𝜆})). Putting it all
together, our transition function δ takes the form δ: Q × (Σ ∪ {𝜆}) × (Γ ∪ {𝜆}) → P(Q × (Γ ∪ {𝜆})).
The transition function δ of M1 can be given by listing its nonempty entries (every entry not listed is the empty set):
δ(q1, 𝜆, 𝜆) = {(q2, z0)}
δ(q2, 0, 𝜆) = {(q2, 0)}
δ(q2, 1, 0) = {(q3, 𝜆)}
δ(q3, 1, 0) = {(q3, 𝜆)}
δ(q3, 𝜆, z0) = {(q4, 𝜆)}
The state q4 has no outgoing moves.
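In code, such a nondeterministic transition function is naturally a dictionary from triples to sets of pairs; the names below (`delta`, `''` for 𝜆, `'z'` for z0) are conventions chosen here for illustration.

```python
# Nondeterministic transition function of M1 as a dictionary.
# A key is (state, input symbol, stack top); '' plays the role of 𝜆
# (read nothing / pop nothing), and 'z' stands for the bottom marker z0.
# Every key not listed maps to the empty set.
delta = {
    ('q1', '', ''):   {('q2', 'z')},  # push the bottom marker
    ('q2', '0', ''):  {('q2', '0')},  # push a 0 for each 0 read
    ('q2', '1', '0'): {('q3', '')},   # pop a 0 for the first 1
    ('q3', '1', '0'): {('q3', '')},   # pop a 0 for each further 1
    ('q3', '', 'z'):  {('q4', '')},   # bottom marker reappears: go to q4
}
```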
We can also use a state diagram to describe a PDA, as in Figure 2.12. Such diagrams are similar to the
state diagrams used to describe finite automata, modified to show how the PDA uses its stack when
going from state to state. We write “a,b → c” to signify that when the machine is reading an a from the
input, it may replace the symbol b on the top of the stack with a c. Any of a, b, and c may be 𝜆. If a is
𝜆, the machine may make this transition without reading any symbol from the input. If b is 𝜆, the machine
may make this transition without reading and popping any symbol from the stack. If c is 𝜆, the machine
does not write any symbol on the stack when going along this transition.
[State diagram: q1 → q2 on '𝜆, 𝜆 → z0'; q2 → q2 on '0, 𝜆 → 0'; q2 → q3 on '1, 0 → 𝜆'; q3 → q3 on '1, 0 → 𝜆'; q3 → q4 on '𝜆, z0 → 𝜆'.]
Figure 2.12: State diagram for the PDA M1 that recognizes {0ⁿ1ⁿ | n ≥ 0}
The formal definition of a PDA contains no explicit mechanism to allow the PDA to test for an empty
stack. A PDA is able to get the same effect by initially placing a special symbol z0 on the stack. Then
if it ever sees z0 again, it knows that the stack effectively is empty. Subsequently, when we refer to
testing for an empty stack in an informal description of a PDA, we implement the procedure in the same
way.
Similarly, PDAs cannot test explicitly for having reached the end of the input string. This PDA is able
to achieve that effect because the accept state takes effect only when the machine is at the end of the
input. Thus from now on, we assume that PDAs can test for the end of the input, and we know that we
can implement it in the same manner.
Example 4.18
In this example we give a PDA M2 recognizing the language {wwᴿ | w ∈ {0, 1}*}. Recall that wᴿ means
w written backwards. The informal description and state diagram of the PDA follow.
Begin by pushing the symbols that are read onto the stack. At each point, nondeterministically guess
that the middle of the string has been reached and then change into popping off the stack for each symbol
read, checking to see that they are the same. If they were always the same symbol and the stack empties
at the same time as the input is finished, accept; otherwise reject.
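Since ordinary code has no built-in nondeterminism, a sketch can try every possible "middle" position and accept if any guess succeeds; the function name `accepts_wwr` is chosen here for illustration.

```python
def accepts_wwr(s: str) -> bool:
    """Membership test for {w w^R | w in {0,1}*}: try every possible
    middle position, mimicking the PDA's nondeterministic guess."""
    for mid in range(len(s) + 1):
        stack = list(s[:mid])       # phase 1: push everything before the guess
        ok = True
        for ch in s[mid:]:          # phase 2: pop and compare
            if not stack or stack.pop() != ch:
                ok = False
                break
        if ok and not stack:        # input and stack exhausted together
            return True
    return False
```

Accepting when *some* guess succeeds is exactly the acceptance rule of a nondeterministic machine.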
[State diagram: q1 → q2 on '𝜆, 𝜆 → z0'; q2 → q2 on '0, 𝜆 → 0' and '1, 𝜆 → 1'; q2 → q3 on '𝜆, 𝜆 → 𝜆'; q3 → q3 on '0, 0 → 𝜆' and '1, 1 → 𝜆'; q3 → q4 on '𝜆, z0 → 𝜆'.]
Figure 2.13: State diagram for the PDA M2 that recognizes {wwᴿ | w ∈ {0, 1}*}
[State diagram: q1 → q1 on 'a, z → 0z', 'b, z → 1z', 'a, 0 → 00', 'b, 1 → 11', 'a, 1 → 𝜆' and 'b, 0 → 𝜆'; q1 → q2 on '𝜆, z → z'.]
Figure 4.14: State diagram for the PDA M3 that recognizes L = {w | w ∈ {a, b}* : na(w) = nb(w)}
[Figure: a tape .... □ □ a b a a □ □ .... scanned by a read/write head that is attached to a finite control containing the states q0, q1, q2, q3 and h.]
Figure 1: A Turing Machine consisting of a Tape, R/W head and a finite control.
q, σ        δ(q, σ)
q0, a       (q1, □, R)
q0, □       (h, □)
q1, a       (q0, □, R)
q1, □       (q0, □, L)
Table 1: The transition table
When M is started in its initial state q0, it moves its head to the right, changing all a's to □'s as it goes,
until it finds a tape square already containing □; then it halts. (Changing a nonblank symbol to the blank
symbol will be called erasing the nonblank symbol.) To be specific, suppose that M is started with its
head scanning the first of four a's, the last of which is followed by a □. Then M will go back and forth
between states q0 and q1 four times, alternately changing an a to a □ and moving the head right; the first
and third lines of the table for δ are the relevant ones during this sequence of moves. At this point, M
will find itself in state q0 scanning □ and, according to the second line of the table, will halt.
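The behavior of M can be checked with a small simulator; `run_tm` and the dictionary encoding of Table 1 are conventions chosen here, not part of the text.

```python
def run_tm(delta, tape, state='q0', halt='h', blank='□'):
    """Run a deterministic TM until it reaches the halting state.
    delta maps (state, scanned symbol) to (new state, written symbol)
    or (new state, written symbol, move), with move being 'R' or 'L'."""
    tape, pos = list(tape), 0
    while state != halt:
        state, write, *move = delta[(state, tape[pos])]
        tape[pos] = write
        if move and move[0] == 'R':
            pos += 1
            if pos == len(tape):
                tape.append(blank)  # the tape is blank beyond the input
        elif move and move[0] == 'L':
            pos -= 1
    return ''.join(tape)

# Table 1, encoded as a dictionary:
table1 = {
    ('q0', 'a'): ('q1', '□', 'R'),
    ('q0', '□'): ('h', '□'),        # halt without moving
    ('q1', 'a'): ('q0', '□', 'R'),
    ('q1', '□'): ('q0', '□', 'L'),
}
```

Running it on four a's followed by a blank erases all four a's, as the text describes.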
Example 2: Consider the Turing machine with states q0 (initial) and q1 (final), input alphabet {a, b}, and
the transition function δ(q0, a) = (q0, b, R), δ(q0, b) = (q0, b, R), δ(q0, □) = (q1, □, L).
If this Turing machine is started in state q0 with the symbol a under the read-write head, the applicable
transition rule is δ (q0,a)= (q0,b,R). Therefore, the read-write head will replace a with b, then move right
on the tape. The machine will remain in state q0. Any subsequent a will also be replaced with a b, but
b's will not be modified. When the machine encounters the first blank, it will move left one cell, and
then halt in final state q1.
The figure 2 below shows several stages of the process for a simple initial configuration.
Example: Consider the snapshot of a Turing machine shown below. Represent it by an instantaneous description (ID).
[Figure: a tape ..... □ □ a4 a1 a2 a1 a2 a2 a a4 a2 □ □ ....., with the R/W head on the symbol a and the machine in state q.]
Figure 4: A Snapshot of Turing Machine
Solution:
The present symbol under the R/W head is a. The present state is q. So a is written to the right of q. The
nonblank symbols to the left of a form the string a4a1a2a1a2a2, which is written to the left of q. The sequence of
nonblank symbols to the right of a is a4a2. Thus the ID is as given below.
a4a1a2a1a2a2 q a a4a2
Figure 5: Representation of ID
Note: (1) For constructing the ID, we simply insert the current state in the input string to the left of
the symbol under the R/W head.
(2) We observe that the blank symbol may occur as part of the left or right substring.
The instantaneous description gives only a finite amount of information to the right and left of the read-write
head. The unspecified part of the tape is assumed to contain all blanks; normally such blanks are irrelevant and
are not shown explicitly in the instantaneous description. If the position of blanks is relevant to the discussion,
however, the blank symbol may appear in the instantaneous description. For example, the instantaneous
description q ω indicates that the read-write head is on the cell to the immediate left of the first symbol of w and
that this cell contains a blank.
Note: The description of moves by IDs is very much useful to represent the processing of input
strings.
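The rule in the Note (insert the current state to the left of the scanned symbol and suppress the unspecified blanks) can be sketched as a small helper; `make_id` is a name chosen here for illustration.

```python
def make_id(tape, state, head, blank='□'):
    """Instantaneous description: the relevant portion of the tape with the
    state inserted to the left of the scanned symbol.
    tape is a list of tape symbols; head is the index of the scanned cell."""
    left, right = tape[:head], tape[head:]
    while left and left[0] == blank:      # unspecified leading blanks not shown
        left = left[1:]
    while right and right[-1] == blank:   # unspecified trailing blanks not shown
        right = right[:-1]
    return ''.join(left) + state + ''.join(right)
```

On the snapshot of Figure 4 this reproduces the ID constructed above.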
Example 3: Consider the TM description given in Table 1. Draw the computation sequence of the
input string 00.
Solution:
We describe the computation sequence in terms of the contents of the tape and the current state.
If the string on the tape is a1a2 . . . aj aj+1 . . . am and the TM in state q is about to read aj+1, then we write
a1a2 . . . aj q aj+1 . . . am
For the input string 00□, we get the following sequence:
q100□├ 0q10□├ 00q1□├ 0q201├ q2001
├ q2□001├□q3001 ├□□q401├ □□0q41├ □□01q4□
├ □□010q5├□□01q200├□□0q2100├ □□q20100
├ □q2□0100 ├ □□q30100 ├ □□□q4100 ├ □□□1q400
├ □□□10q40 ├ □□□100q4□ ├ □□□1000q5□
├ □□□100q200 ├ □□□10q2000 ├ □□□1q20000
├ □□□q210000 ├ □□q2□10000 ├ □□□q310000 ├ □□□□q50000
[Figure 4: State diagram of a TM accepting {0ⁿ1ⁿ | n ≥ 1}, with states q1 to q6. The visible transitions include (0, x, R) from q1 to q2, (1, y, L) from q2 to q3, the loops (0, 0, R), (y, y, R), (y, y, L) and (x, x, R), and (y, y, R) followed by (□, □, R) leading through q5 into the final state q6.]
[Figure: a tape □ 0 0 1 1 □ with the R/W head on the leftmost 0 and the machine in state q1.]
Figure 5: TM processing 0011
The figure can be represented by the ID
□q10011□
in which the current state q1 is written to the left of the scanned symbol.
From Figure 4 we see that there is a directed edge from q1 to q2 with the label (0, x, R). So the current
symbol 0 is replaced by x and the head moves right. The new state is q2. Thus we get
□xq2011□
The change brought about by processing the symbol 0 can be represented as
□q10011□ ├ □xq2011□      (by the move (0, x, R))
The entire computation sequence reads as follows:
□q10011□ ├ □xq2011□ ├ □x0q211□ ├ . . . ├ □xxyyq5□ ├ □xxyy□q6□
CONSTRUCTION OF TURING MACHINE(TM)
Designing a Turing machine to solve a problem is an interesting task. It is somewhat similar to
programming. Given a problem, different Turing machines can be constructed to solve it. But we would
like to have a Turing machine which does it in a simple and efficient manner. Just as we learn some
techniques of programming to deal with alternatives, loops, etc., it is helpful to understand some
techniques of Turing machine construction, which will help in designing simple and efficient Turing
machines. It should be noted that we are using the word 'efficient' in an intuitive manner here.
[Figure: a single finite control attached to k tapes, each scanned by its own read/write head.]
Figure 6: Multitape Turing machine
In a typical move:
(i) M enters a new state.
(ii) On each tape, a new symbol is written in the cell under the head.
(iii) Each tape head moves to the left or right or remains stationary. The heads move
independently: some move to the left, some to the right and the remaining heads
do not move.
The initial ID has the initial state q0, the input string w on the first tape (the input tape), and blanks
on the remaining k − 1 tapes. An accepting ID has a final state and some string on each of the k tapes.
“Multi-tape Turing Machines are equivalent to Standard Turing Machines”.
Exercise: Prove that every language accepted by a multitape TM is accepted by some
single-tape TM (that is, the standard TM).
Exercise: Prove that if M1 is the single-tape TM simulating a multitape TM M, then the time
taken by M1 to simulate n moves of M is O(n²).
INITIAL FUNCTIONS
The initial functions over N are:
(a) Zero function Z defined by Z(x) = 0.
(b) Successor function S defined by S(x) = x + 1.
(c) Projection function Uᵢⁿ defined by Uᵢⁿ(x1, x2, . . ., xn) = xi.
For example: S(4) = 5, Z(7) = 0, U₂³(2, 4, 7) = 4, U₁³(2, 4, 7) = 2, U₃³(2, 4, 7) = 7.
Note: As U₁¹(x) = x for every x in N, U₁¹ is simply the identity function. So Uᵢⁿ is also termed a
generalized identity function.
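These initial functions are easy to mirror in code; representing the projection U_i^n as a function factory is a choice made here, not notation from the text.

```python
def Z(x):
    """Zero function: Z(x) = 0."""
    return 0

def S(x):
    """Successor function: S(x) = x + 1."""
    return x + 1

def U(i, n):
    """Projection U_i^n: returns the i-th of its n arguments (1-indexed)."""
    def projection(*args):
        assert len(args) == n
        return args[i - 1]
    return projection
```

For instance, `U(2, 3)(2, 4, 7)` reproduces U₂³(2, 4, 7) from the example.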
The initial functions over Σ = {a, b} are:
(a) nil(x) = 𝜆 (the empty string)
(b) cons a(x) = ax
(c) cons b(x) = bx
For example: nil(abab) = 𝜆, cons a(abab) = aabab, cons b(abab) = babab.
Note: We note that cons a(x) and cons b(x) simply denote the concatenation of the constant string a
with x and the concatenation of the constant string b with x.
Definition: If f1, f2, . . ., fk are partial functions of n variables and g is a partial function of k variables,
then the composition of g with f1, f2, . . ., fk is a partial function of n variables defined by
g(f1(x1, x2, . . ., xn), f2(x1, x2, . . ., xn), . . ., fk(x1, x2, . . ., xn))
Example: Let f1(x, y) = x + y, f2(x, y) = 2x, f3(x, y) = xy and g(x, y, z) = x + y + z be functions over N.
Then
g(f1(x, y), f2(x, y), f3(x, y)) = g(x + y, 2x, xy)
= x + y + 2x + xy
Thus the composition of g with f1, f2, f3 is given by a function h:
h(x, y) = x + y + 2x + xy
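The composition scheme, and the worked example above, can be sketched as follows; `compose` is a name chosen here.

```python
def compose(g, *fs):
    """Composition of g (k variables) with f1, ..., fk (n variables each):
    returns the function (x1,...,xn) -> g(f1(x1,...,xn), ..., fk(x1,...,xn))."""
    return lambda *xs: g(*(f(*xs) for f in fs))

# The worked example: g(x, y, z) = x + y + z composed with
# f1(x, y) = x + y, f2(x, y) = 2x, f3(x, y) = x*y.
h = compose(lambda x, y, z: x + y + z,
            lambda x, y: x + y,
            lambda x, y: 2 * x,
            lambda x, y: x * y)
```

So `h(x, y)` computes x + y + 2x + xy, exactly the function derived above.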
Note: This definition generalizes the composition of two functions. The concept is useful where a
number of outputs become the inputs for a subsequent step of a program.
The composition of g with f1, f2, . . ., fk is total when g, f1, f2, . . ., fk are total.
The next definition gives a mechanical process of computing a function.
Definition: A function f (x) over N is defined by recursion if there exists a constant k (a natural number)
and a function h(x, y) such that
f (0) = k, f (n + 1) = h(n, f (n)) (3.1)
By induction on n, we can define f (n) for all n. As f (0) = k, there is basis for induction. Once f (n) is
known, f(n + 1) can be evaluated by using (3.1).
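This recursion scheme can be sketched as a function factory; the factorial instance below is an illustrative assumption, not an example from the text.

```python
def rec(k, h):
    """The function f defined by recursion: f(0) = k, f(n + 1) = h(n, f(n))."""
    def f(n):
        value = k
        for i in range(n):          # builds f(0), f(1), ..., f(n) in turn
            value = h(i, value)
        return value
    return f

# Illustrative instance (an assumption, not from the text): factorial,
# via f(0) = 1 and f(n + 1) = (n + 1) * f(n).
factorial = rec(1, lambda n, fn: (n + 1) * fn)
```

The loop mirrors the inductive evaluation described above: once f(n) is known, f(n + 1) is computed from it.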
Definition: A function f(x) over Σ* is defined by recursion if there exists a 'constant' string w ∈ Σ*
and functions h1(x, y) and h2(x, y) such that
f(𝜆) = w (3.4)
f(ax) = h1(x, f(x)) (3.5)
f(bx) = h2(x, f(x))
(h1 and h2 may be functions in one variable.)
Definition: A function f(x1, x2, . . ., xn) over Σ* is defined by recursion if there exist functions g(x2,
. . ., xn), h1(x1, x2, . . ., xn+1) and h2(x1, x2, . . ., xn+1) such that
f(𝜆, x2, . . ., xn) = g(x2, . . ., xn) (3.6)
f(ax1, x2, . . ., xn) = h1(x1, x2, . . ., xn, f(x1, x2, . . ., xn)) (3.7)
f(bx1, x2, . . ., xn) = h2(x1, x2, . . ., xn, f(x1, x2, . . ., xn))
(h1 and h2 may be functions of m variables, where m < n + 1.)
Now we can define the class of primitive recursive functions over Σ*.
Definition: A total function f over Σ* is primitive recursive (i) if it is any one of the three initial functions,
or (ii) if it can be obtained by applying composition and recursion a finite number of times to the initial
functions.
Note: As in the case of functions over N, a total function over Σ* is primitive recursive if it is obtained
by applying composition and recursion a finite number of times to primitive recursive functions
f1, f2, . . ., fm.
Example: Show that the following functions are primitive recursive:
(a) Constant functions a and b (i.e. a(x) = a, b(x) = b)
(b) Identity function
(c) Concatenation
(d) Transpose
(e) Head function (i.e. head (a1a2 ... an) = a1)
(f) Tail function (i.e. tail(a1a2 . . . an) = a2 . . . an)
(g) The conditional function “ if x1 then x2 else x3”
Solution :
(a) As a(x) = cons a(nil (x)), the function a(x) is the composition of the initial function cons a with
the initial function nil and is hence primitive recursive.
(b) Let us denote the identity function by id. Then
id(𝜆) = 𝜆, id(ax) = cons a(id(x)), id(bx) = cons b(id(x))
So id is defined by recursion using cons a and cons b. Therefore, the identity function is primitive
recursive.
(c) The concatenation function can be defined by
concat(x1, x2) = x1x2
(e) The head function is given by
head(ax) = a(x)
head(bx) = b(x)
(f) The tail function is given by
tail(ax) = id(x)
tail(bx) = id(x)
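A minimal sketch of these string functions, assuming Σ = {a, b} with Python strings standing for Σ* (the empty string playing the role of 𝜆):

```python
# Initial functions over Σ = {a, b}.
def nil(x):    return ''
def cons_a(x): return 'a' + x
def cons_b(x): return 'b' + x

def head(x):
    # head(ax) = a(x), head(bx) = b(x): the constant functions a and b.
    if x == '':
        return ''
    return cons_a(nil(x)) if x[0] == 'a' else cons_b(nil(x))

def tail(x):
    # tail(ax) = id(x), tail(bx) = id(x)
    return x[1:]

def concat(x1, x2):
    # concat(x1, x2) = x1x2
    return x1 + x2
```

Note how `head` is built purely from the initial functions, as in the solution above.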
RECURSIVE FUNCTIONS
By introducing one more operation on functions, we define the class of recursive functions,
which includes the class of primitive recursive functions.
Definition: Let g(x1, x2, . . ., xn, y) be a total function over N. g is a regular function if there exists some
natural number y0 such that g(x1, x2, . . ., xn, y0) = 0 for all values x1, x2, . . ., xn in N.
For instance, g(x, y) = min(x, y) is a regular function since g(x, 0) = 0 for all x in N. But f
(x, y) = |x – y| is not regular since f (x, y) = 0 only when x = y, and so we cannot find a fixed y such that
f (x, y) = 0 for all x in N.
Definition: A function f (x1, x2, . . ., xn) over N is defined from a total function g(x1, x2, . . .,
xn, y) by minimization if
(a) f (x1, x2, . . ., xn) is the least value of all y's such that g(x1, x2, . . ., xn, y) = 0, if it exists. The least
value is denoted by μy(g(x1, x2, . . ., xn, y) = 0).
(b) f (x1, x2, . . ., xn) is undefined if there is no y such that g(x1, x2, . . ., xn, y) = 0.
Note: In general, f is partial. But, if g is regular then f is total.
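Minimization can be sketched as an unbounded search. A true μ-operator loops forever when f is undefined; the optional `bound` parameter below is a safety device added here so a demonstration can report "undefined" instead of looping.

```python
def mu(g, *xs, bound=None):
    """Least y with g(x1, ..., xn, y) = 0, mirroring μy(g(x1,...,xn,y) = 0).
    Searches upward from y = 0; with bound=None it may loop forever,
    exactly as the partial function may be undefined."""
    y = 0
    while g(*xs, y) != 0:
        y += 1
        if bound is not None and y > bound:
            return None             # undefined (f is partial)
    return y
```

Since g(x, y) = min(x, y) is regular (g(x, 0) = 0 for all x), `mu` always terminates on it, illustrating the Note: if g is regular then f is total.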
Definition: A function is recursive if it can be obtained from the initial functions by a finite number of
applications of composition, recursion and minimization over regular functions.
Definition: A function is partial recursive if it can be obtained from the initial functions by a finite
number of applications of composition, recursion and minimization.
Example: Show that f (x) = x/2 is a partial recursive function over N.
Solution: Let g(x, y) = |2y − x|; then 2y − x = 0 for some y only when x is even. Let
f1(x) = μy(|2y − x| = 0). Then f1(x) is defined only for even values of x and is equal
to x/2. When x is odd, f1(x) is not defined. f1 is partial recursive. As f(x) = x/2 = f1(x), f is a
partial recursive function.
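A direct sketch of this construction follows; the cut-off y > x is a safety device added here (mathematically, the search simply never terminates when x is odd, which is what "undefined" means).

```python
def half(x):
    """f(x) = x/2 via minimization of g(x, y) = |2y - x|.
    Defined only for even x; returns None where f1 is undefined."""
    y = 0
    while abs(2 * y - x) != 0:
        y += 1
        if y > x:                   # no y can work: x is odd
            return None             # undefined
    return y
```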
So far we have dealt with recursive and partial recursive functions over N. We can define
partial recursive functions over Σ* using the primitive recursive predicates and the minimization process.
As the process is similar, we will not discuss it here.
The concept of recursion occurs in some programming languages when a procedure has a call to
the same procedure for a different parameter. Such a procedure is called a recursive procedure. Certain
programming languages like C, C++ allow recursive procedures.
This problem is one of the challenging problems of the 21st century. This problem carries a prize
money of $1M. P stands for the class of problems that can be solved by a deterministic algorithm (i.e.
by a Turing machine that halts) in polynomial time; NP stands for the class of problems that can be
solved by a nondeterministic algorithm (that is, by a nondeterministic TM) in polynomial time; P stands
for polynomial and NP for nondeterministic polynomial. Another important class is the class of NP-
complete problems, which is a subclass of NP.
In this chapter these concepts are formalized and Cook's theorem on the NP – completeness of
SAT problem is proved.
BIG-O NOTATION
GROWTH RATE OF FUNCTIONS
When we have two algorithms for the same problem, we may require a comparison between the
running times of these two algorithms. With this in mind, we study the growth rate of functions defined
on the set of natural numbers N.
Definition: Let f, g : N → R+ (R+ being the set of all positive real numbers). We say that f(n) =
O(g(n)) if there exist positive integers C and N0 such that
f(n) ≤ C·g(n) for all n ≥ N0.
In this case we say f is of the order of g (or f is 'big oh' of g).
Note: f (n) = O(g(n)) is not an equation. It expresses a relation between two functions f and g.
Note: If p(n) = aknᵏ + ak−1nᵏ⁻¹ + . . . + a1n + a0 is a polynomial of degree k over Z and ak > 0,
then p(n) = O(nᵏ).
Example: Let f(n) = 4n³ + 5n² + 7n + 3. Prove that f(n) = O(n³).
Solution: In order to prove that f(n) = O(n³), take C = 5 and N0 = 10. Then
f(n) = 4n³ + 5n² + 7n + 3 ≤ 5n³ for all n ≥ 10
since when n = 10, 5n² + 7n + 3 = 573 < 10³, and for n > 10, 5n² + 7n + 3 < n³. Hence f(n) = O(n³).
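The chosen witnesses C = 5 and N0 = 10 can also be checked numerically over a range of n; `big_o_holds` is a helper name chosen here.

```python
def f(n):
    """f(n) = 4n^3 + 5n^2 + 7n + 3 from the example."""
    return 4 * n**3 + 5 * n**2 + 7 * n + 3

def big_o_holds(C, N0, upto):
    """Check f(n) <= C * n^3 for every n from N0 up to `upto`."""
    return all(f(n) <= C * n**3 for n in range(N0, upto + 1))
```

Such a finite check is only evidence, of course; the inequality for all n ≥ N0 is what the algebraic argument above establishes.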
Note: The order of a polynomial is determined by its degree.
Definition: An exponential function is a function q: N → N defined by
q(n) = aⁿ for some fixed a > 1.
When n increases, each of n, n², 2ⁿ increases. But a comparison of these functions for specific
values of n will indicate the vast difference between their growth rates.
TABLE 1: Growth Rate of Polynomial and Exponential Functions
n        f(n) = n²    g(n) = n² + 3n + 9    q(n) = 2ⁿ
1        1            13                    2
5        25           49                    32
10       100          139                   1024
50       2500         2659                  1.13 × 10¹⁵
100      10000        10309                 1.27 × 10³⁰
1000     1000000      1003009               1.07 × 10³⁰¹
From Table 1, it is easy to see that the function q(n) grows at a very fast rate when compared to
f (n) or g(n). In particular the exponential function grows at a very fast rate when compared to any
polynomial of large degree. We prove a precise statement comparing the growth rate of polynomials
and exponential function.
Definition: We say g ∉ O(f) if for any constants C and N0, there exists n ≥ N0 such that g(n) > C·f(n).
Definition: If f and g are two functions and f = O(g), but g ∉ O(f), we say that the growth rate of g
is greater than that of f. (In this case g(n)/f(n) becomes unbounded as n increases to ∞.)
Theorem: The growth rate of any exponential function is greater than that of any polynomial.
Note: The function n^(log n) lies between the polynomial functions and aⁿ for any constant a > 1. As log n ≥ k
for any given constant k and large values of n, we have n^(log n) ≥ nᵏ for large values of n. Hence n^(log n)
dominates any polynomial. But n^(log n) = (e^(log n))^(log n) = e^((log n)²). Let us calculate
lim(x→∞) (log x)²/(cx). By L'Hôpital's rule,
lim(x→∞) (log x)²/(cx) = lim(x→∞) 2 log x/(cx) = lim(x→∞) 2/(cx) = 0
So (log n)² grows more slowly than cn. Hence n^(log n) = e^((log n)²) grows more slowly than 2ⁿ. The
same holds good when the logarithm is taken over base 2, since loge n and log2 n differ by a constant factor.
Hence there exist functions lying between polynomials and exponential functions.
Definition: A language L is in class P if there exists some polynomial T(n) such that L =
L(M) for some deterministic TM M of time complexity T(n).
Example: Construct the time complexity T(n) for the Turing Machine M which accepts the language
{0ⁿ1ⁿ | n ≥ 1}.
Solution: We require the following moves:
(a) If the leftmost symbol in the given input string w is 0, replace it by x and move right till we
encounter a leftmost 1 in w. Change it to y and move backwards.
(b) Repeat (a) with the leftmost 0. If we move back and forth and no 0 or 1 remains, move to a final
state.
(c) For strings not in the form 0n1n, the resulting state has to be nonfinal.
Step (a) consists of going through the input string (0ⁿ1ⁿ) forward and backward and replacing the
leftmost 0 by x and the leftmost 1 by y. So we require at most 2n moves to match a 0 with a 1. Step (b)
is a repetition of step (a) n times. Hence the number of moves for accepting 0ⁿ1ⁿ is at most (2n)(n). For
strings not of the form 0ⁿ1ⁿ, the TM halts in fewer than 2n² steps. Hence T(M) = O(n²).
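The quadratic bound can be illustrated by counting the sweep lengths of the marking procedure. This is a direct count in Python, not a cell-by-cell TM simulation, and `steps_to_accept` is a name chosen here; under this simplified counting the total comes out as exactly 3n² − n, which is O(n²) just as the argument above predicts.

```python
def steps_to_accept(n):
    """Count head travel of the marking procedure on 0^n 1^n: in each of
    the n rounds the head runs out to the leftmost unmatched 1 and back,
    so the total grows quadratically in n."""
    tape = list('0' * n + '1' * n)
    moves = 0
    while '0' in tape:
        i = tape.index('0')     # leftmost unmarked 0 ...
        tape[i] = 'x'
        j = tape.index('1')     # ... is matched with the leftmost unmarked 1
        tape[j] = 'y'
        moves += 2 * j          # out to position j and back again
    return moves
```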
We can also define the complexity of algorithms. In the case of algorithms, T(n) denotes the
running time for solving a problem with an input of size n, using this algorithm.
Note: The Euclidean algorithm for computing gcd of two numbers is a polynomial time algorithm.