
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

UNIT - 5

Subject Code: CS8501

Subject Name: Theory of Computation


(R 2017)



UNIT V

UNDECIDABILITY

Non-Recursive Enumerable (RE) Language — Undecidable Problem with RE — Undecidable Problems about TM — Post Correspondence Problem, The Class P and NP.

UNDECIDABILITY

Undecidability is defined as follows:

 Σ is an alphabet and A is a language such that A is a subset of Σ*, where Σ* is the set of all possible strings over Σ.

 A is an undecidable language if there exists NO computational model (such as a Turing Machine) M such that for every string w that belongs to Σ*, the following
two conditions hold:
1. If w belongs to A, then the computation of the Turing Machine M on w as input ends in the accept state.

2. If w does not belong to A, then the computation of the Turing Machine M on w ends in the reject state.
Note that there may be a computational machine M for which one of the conditions holds but not both; such a machine does not decide A, so A remains an undecidable language.

Equivalently, A is an undecidable language if there is NO Turing Machine or algorithm that correctly tells, in finite time, whether a given string w is part of the language or not.

The language ATM is undecidable

Language ATM is defined as follows:

{(M, w): M is a Turing Machine (TM) that accepts the string w}

(By contrast, for the analogous language ADFA = {(D, w): D is a DFA that accepts w}, a DFA is enough and always halts, so ADFA is decidable; we do not need stronger models there. The language ACFG = {(G, w): G is a CFG that generates w} is likewise a decidable language.)

The Halting Problem Halt is defined as:

Halt = {(P, w): P is a program that terminates execution with w as input }.

The Language Halt is undecidable.


Table 1.1 Decidability and Undecidability of each language

Decidability vs. Undecidability

1. What are recursive languages? (2M, Apr/May 2021)


2. When is the language L recursively enumerable? (2M, Nov/Dec 2019)
3. List the properties of recursive and non-recursively enumerable languages
(2M, Nov/Dec 2017)
4. Explain in detail the various properties of recursive languages

There are two types of TMs (based on halting):

 Recursive
TMs that always halt, no matter accepting or non-accepting = DECIDABLE PROBLEMS
 Recursively enumerable
TMs that are guaranteed to halt only on acceptance. If non-accepting, the TM may or may not halt (i.e., it could loop forever).
Fig.1.1 Recursive, RE, Undecidable languages


UNDECIDABILITY

1. How does a primitive recursive function help to identify the computable function
(5M, Apr/May 2019)

Decision Problems and Languages

 A decision problem requires checking if an input (string) has some property.


Thus, a decision problem is a function from strings to boolean.
 A decision problem is represented as a formal language consisting of those
strings (inputs) on which the answer is “yes”.

Recursive Enumerability

 A Turing Machine on an input w either (halts and) accepts, or (halts and) rejects,
or never halts.
 The language of a Turing Machine M, denoted as L(M), is the set of all strings w
on which M accepts.
 A language L is recursively enumerable/Turing recognizable if there is a Turing
Machine M such that L(M) = L

Decidability

 A language L is decidable if there is a Turing machine M such that L(M) = L and M


halts on every input.
 Thus, if L is decidable then L is recursively enumerable.

Undecidability

Definition 1. A language L is undecidable if L is not decidable. Thus, there is no Turing machine M that halts on every input and satisfies L(M) = L.

 This means that either L is not recursively enumerable (that is, there is no Turing machine M such that L(M) = L), or
 L is recursively enumerable but not decidable. That is, for any Turing machine M such that L(M) = L, M does not halt on some inputs.
Figure 1: Relationship between classes of Languages

A NON-RECURSIVELY ENUMERABLE LANGUAGE

DIAGONALIZATION

1. State and prove that “Diagonalization language is not recursively enumerable”


(13M, Nov/Dec 2021)

The Diagonal Language

Definition 2. Define Ld = {M | M ∉ L(M)}. Thus, Ld is the collection of Turing machines (programs) M such that M does not halt and accept when given itself as input.

A NON-RECURSIVELY ENUMERABLE LANGUAGE

Proposition 3. Ld is not recursively enumerable.

Proof. Recall that,

 Inputs are strings over {0, 1}


 Every Turing Machine can be described by a binary string, and every binary string can be viewed as a Turing Machine.
 We will denote the ith binary string (in lexicographic order) as the number i. Thus, we can say j ∈ L(i), meaning that the Turing machine corresponding to the ith binary string accepts the jth binary string.

Completing the proof

Diagonalization: Cantor

Proof (contd). We can organize all programs and inputs as an (infinite) matrix, where the (i, j)th entry is Y if and only if j ∈ L(i).
Suppose Ld is recognized by a Turing machine whose description is the jth binary string, i.e., Ld = L(j). But j ∈ Ld iff j ∉ L(j), a contradiction!

Recursively Enumerable Language

A language is said to be recursively enumerable if there exists a Turing Machine that accepts every string of the language; on strings that are not in the language, the TM may reject or may enter an infinite loop.

Fig.1.3. Recursively Enumerable Language

THE UNIVERSAL LANGUAGE

1. Prove that Universal language is recursively enumerable but not recursive (13M,
Apr/May 2021)

Recursively Enumerable but not Decidable

 Ld is not recursively enumerable, and therefore not decidable. Are there languages that are recursively enumerable but not decidable?
 Yes, ATM = {<M, w> | M is a TM and M accepts w}

Proposition 4. ATM is r.e. but not decidable.

Proof. We have already seen that ATM is r.e. Suppose (for contradiction) that ATM is decidable. Then there is a TM M that always halts and L(M) = ATM. Consider a TM D as follows:

On input i
Run M on input <i, i>
Output ''yes'' if M rejects <i, i> (i.e., machine i does not accept i)
Output ''no'' if M accepts <i, i>
Observe that L(D) = Ld! But Ld is not r.e., which gives us the contradiction.
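The machine D used in this proof can be pictured as a short program. The sketch below (Python) assumes a hypothetical total decider M_decides_ATM for ATM, which, by this very argument, cannot exist; it is included only to make the construction concrete.

def make_D(M_decides_ATM):
    # M_decides_ATM(i, j) is the assumed always-halting decider for ATM:
    # it returns True exactly when machine i accepts string j.
    def D(i):
        # Run the assumed decider on the pair <i, i> and flip its answer:
        # D accepts the ith string exactly when machine i does NOT accept it.
        return not M_decides_ATM(i, i)
    return D

If D itself were the machine with index j, then D would accept j exactly when machine j does not accept j, which is the contradiction used above.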
Undecidable Problems about RE
UNIVERSAL TURING MACHINE

1. Highlight the features of Universal Turing Machine (5M, Nov/Dec 2019)

(13M ,Nov/Dec 2017)

 Turing was inspired by the idea of connecting multiple Turing machines. He asked himself whether a universal machine could be constructed that could simulate other machines. He named this machine the Universal Turing Machine.
 A Universal Turing Machine is a machine that can simulate an arbitrary Turing
machine over any collection of input symbols. It takes two inputs. The first is the
description of the machine, and the other is the input data.
 A Universal Turing Machine, in more specific terms, can imitate the behavior of
an arbitrary Turing machine over any collection of input symbols. Therefore, it is
possible to create a single machine to calculate any computable sequence.

The input of a UTM includes:

 The description of a machine M on the tape.


 The input data.
The UTM can then simulate M on the rest of the input tape's content. As a result, a
Universal Turing Machine can simulate any other machine.
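To make the idea concrete, here is a minimal simulator sketch in Python. The dictionary-based encoding of the machine description and the example machine are illustrative assumptions, not part of the original text; a real UTM would read an encoded description from its own tape.

def simulate_tm(delta, start, accept, reject, w, blank="_", max_steps=10_000):
    # delta maps (state, symbol) -> (new_state, written_symbol, move), move in {"L", "R"}.
    tape = dict(enumerate(w))          # sparse tape: position -> symbol
    state, head = start, 0
    for _ in range(max_steps):
        if state == accept:
            return True
        if state == reject:
            return False
        symbol = tape.get(head, blank)
        state, write, move = delta[(state, symbol)]
        tape[head] = write
        head += 1 if move == "R" else -1
    return None                        # did not halt within the step bound

# Example machine (an assumption for illustration): accepts strings containing a '1'.
delta = {
    ("q0", "0"): ("q0", "0", "R"),
    ("q0", "1"): ("qa", "1", "R"),
    ("q0", "_"): ("qr", "_", "R"),
}
print(simulate_tm(delta, "q0", "qa", "qr", "011"))   # True
print(simulate_tm(delta, "q0", "qa", "qr", "000"))   # False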

Creating a general-purpose Turing Machine (UTM) is a more difficult task. Once a Turing machine's transition function is defined, the machine is restricted to performing a specific type of computation.

We create a universal Turing machine by modifying our fundamental Turing machine model. For even simple behaviour to be simulated, the modified Turing machine must have a huge number of states. We modify the basic model by doing the following:

 Increase the number of read/write heads.


 Increase the number of input tape dimensions.
 Increase the memory space.

The UTM would include three pieces of data for the machine it is simulating:
 A basic description of the machine.
 The contents of the machine tape.
 The internal state of the machine.

The Universal machine would simulate the machine by checking the tape input and the
machine's state. It would command the machine by modifying its state in response to
the input. This will be like a computer running another computer. The schematic
diagram of a Universal Turing Machine is as follows:
Fig. Universal Turing Machine

RICE THEOREM
Rice's theorem states that any non-trivial semantic property of the languages recognized by Turing machines is undecidable. A property P corresponds to the language of all (descriptions of) Turing machines whose language satisfies that property.

Formal Definition

If P is a non-trivial property, then the language Lp = {<M> | L(M) ∈ P}, consisting of the Turing machines M whose language has the property, is undecidable.

Description and Properties

 A property of languages, P, is simply a set of languages. If a language belongs to P (L ∈ P), we say that L satisfies the property P.
 A property is called trivial if either it is not satisfied by any recursively enumerable language, or it is satisfied by all recursively enumerable languages.
 A non-trivial property is satisfied by some recursively enumerable languages and not satisfied by others. Formally speaking, for a non-trivial property Lp, both of the following hold:

 Property 1 − For any two Turing Machines M1 and M2 that recognize the same language, either both ( <M1>, <M2> ∈ Lp ) or both ( <M1>, <M2> ∉ Lp )
 Property 2 − There exist Turing Machines M1 and M2, where M1's language satisfies the property while M2's does not, i.e. <M1> ∈ Lp and <M2> ∉ Lp
Proof

Suppose a property P is non-trivial and φ ∉ P (if φ ∈ P, we argue with the complement of P instead).
Since P is non-trivial, at least one language satisfies P, i.e., there exists a Turing Machine M0 with L(M0) ∈ P.
Let <M, w> be a particular instance of ATM, and let N be a Turing Machine that works as follows −
On input x
 Run M on w
 If M does not accept w (or doesn't halt), then do not accept x (or do not halt)
 If M accepts w, then run M0 on x. If M0 accepts x, then accept x.
This gives a function that maps an instance of ATM = {<M, w> | M accepts input w} to a machine N such that
 If M accepts w, then N accepts the same language as M0, so L(N) = L(M0) ∈ P
 If M does not accept w, then N accepts φ, so L(N) = φ ∉ P
Since ATM is undecidable and it can be reduced to Lp, Lp is also undecidable.
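The machine N built in the proof can be sketched as a program. This is only an illustrative sketch (Python); accepts is a hypothetical simulator that returns True only if the given machine accepts the given input and, like a real TM run, may never return.

def build_N(M, w, M0, accepts):
    # Build the machine N from the reduction: <M, w> and M0 are fixed into N's code.
    def N(x):
        # Stage 1: simulate M on w. If M never accepts w, this call never returns,
        # so N accepts no string and L(N) = empty set, which is not in P.
        if accepts(M, w):
            # Stage 2: M accepted w, so N behaves exactly like M0 on x,
            # giving L(N) = L(M0), which is in P.
            return accepts(M0, x)
        return False   # M halted without accepting w: do not accept x
    return N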

TURING MACHINE HALTING PROBLEM

1. Outline the halting problem for Turing Machine (5M, Nov/Dec 2019)

2. Prove that the halting problem is undecidable (15M, Nov/Dec 2017)

Input − A Turing machine and an input string w.

Problem − Does the Turing machine complete its computation on the string w in a finite number of steps? The answer must be 'yes' or 'no'.
Proof − Initially, assume that a Turing machine exists to solve this problem; we will then show that this assumption contradicts itself. Such a Turing machine is called a Halting Machine, and it produces a 'yes' or 'no' in a finite amount of time. If the halting machine finds that the computation completes in a finite amount of time, the output is 'yes'; otherwise it is 'no'.

Fig.1. Block Diagram of Halting Machine


Now we design an inverted halting machine (HM)' as −
 If H returns YES, then loop forever.
 If H returns NO, then halt.

The block diagram of the 'Inverted Halting Machine' is shown below.

After that, a machine (HM)2 which takes itself as input is constructed as follows −
 If (HM)2 halts on its input, loop forever.
 Else, halt.
This yields a contradiction; therefore, the halting problem is undecidable.
Proposition 6. The language HALT = {<M, w> | M halts on input w} is undecidable.

Proof. We will reduce ATM to HALT. Based on a machine M, let us consider a new
machine f(M) as follows:

On input x

Run M on x

If M accepts then halt and accept

If M rejects then go into an infinite loop

Observe that f(M) halts on input w if and only if M accepts w

The Halting Problem

Completing the proof

Proof Suppose HALT is decidable. Then there is a Turing machine H that always halts
and L(H) = HALT. Consider the following program T

On input <M, w>

Construct program f(M)


Run H on <f(M), w>

Accept if H accepts and reject if H rejects

T decides ATM. But ATM is undecidable, which gives us the contradiction.
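The transformation f used in this reduction can be sketched as a program. This is only an illustrative sketch (Python); run is a hypothetical simulator that returns "accept" or "reject" when M halts on x and never returns when M loops.

def f(M, run):
    # Build the machine f(M) from the proof: it halts on w exactly when M accepts w.
    def new_machine(x):
        if run(M, x) == "accept":
            return "accept"            # halt (and accept)
        while True:                    # M rejected x: loop forever, so f(M) does not halt
            pass
    return new_machine

Thus f(M) halts on input w if and only if M accepts w, which is exactly the property the proof relies on.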

Undecidable Problems about TM


1. Outline tractable and intractable problems with example (8M, Nov/Dec 2019)

2. Write short notes on tractable form (2M, Nov/Dec 2017)

Reduction is employed to demonstrate whether or not a given language is decidable. A problem is undecidable if there is no Turing machine which will always halt in a finite amount of time to give the answer 'yes' or 'no'. An undecidable problem has no algorithm to determine the answer for a given input.

Examples

 Ambiguity of context-free languages: Given a context-free language, there is no


Turing machine which will always halt in finite amount of time and give answer
whether language is ambiguous or not.
 Equivalence of two context-free languages: Given two context-free languages,
there is no Turing machine which will always halt in finite amount of time and give
answer whether two context free languages are equal or not.
 Everything or completeness of a CFG: Given a CFG and an input alphabet, whether the CFG will generate all possible strings of the input alphabet (∑*) is undecidable.
 Regularity of CFL, CSL, REC and RE languages: Given a CFL, CSL, recursive or recursively enumerable language, determining whether this language is regular is undecidable.
A semi-decidable problem is one for which a Turing machine will always halt in a finite amount of time and answer 'yes' when the answer is 'yes', but may or may not halt when the answer is 'no'.

Reduction
When a problem P1 is reduced to a problem P2, a solution to P2 solves P1; this is the reduction approach. 'P1 reduces to P2' is the general term for an algorithm that transforms an instance of a problem P1 into an instance of a problem P2 with the same answer. As a result, if P1 is not recursive, then neither is P2. In a similar vein, if P1 is not recursively enumerable, then neither is P2.
Theorem

If P1 is reduced to P2, then:
1. If P1 is undecidable, then P2 is undecidable as well.
2. If P1 is non-RE, then P2 is also non-RE.

Proof

1. Consider a P1 instance w. First, create an algorithm that takes instance w as input and transforms it into an instance x of P2. Then, use the decision procedure for P2 to determine whether x is contained in P2. If the algorithm returns "yes," then x is in P2, and, since x was obtained by reducing P1 to P2, we can claim that w is in P1 (similarly, if the algorithm returns "no", then x is not in P2 and w is not in P1 either). This demonstrates that if P1 is undecidable, then P2 is undecidable as well.
2. Here, suppose P1 is non-RE but P2 is RE. Use the algorithm that reduces P1 to P2, and note that P2 is recognized by some Turing machine: it answers "yes" if the input is in P2, but it may or may not halt if the input is not in P2. Since we can transform an instance w of P1 into an instance x of P2, we can then apply that TM to check whether x is in P2. If x is accepted, then w is likewise accepted. This describes a TM that recognizes P1: if w is in P1, then x is in P2, and if w is not in P1, then x is not in P2. But then P1 would be RE, contradicting the assumption. This demonstrates that P2 must likewise be non-RE if P1 is non-RE.
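The first part of the proof can be summarised as a small sketch (Python). The names transform and decide_P2 are placeholders for the assumed polynomial-step reduction and the assumed decision procedure for P2; they are assumptions for illustration, not defined in the text.

def decide_P1(w, transform, decide_P2):
    # Convert the instance w of P1 into an instance x of P2 with the same answer,
    # then use the assumed decision procedure for P2.
    x = transform(w)
    return decide_P2(x)

If decide_P2 always halts with the correct answer, then so does decide_P1; contrapositively, if P1 is undecidable, no such decide_P2 can exist.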

Empty and Non-Empty Languages


Languages can be divided into two categories: empty and non-empty languages. Let Le stand for the set of machines with an empty language and Lne for those with a non-empty language. Let Mi be a TM and wi be its binary code. If L(Mi) = Ф, then Mi accepts no input, and wi will be in Le. Similarly, wi is in Lne if L(Mi) is not the empty language. As a result, we may state:
Lne = {M | L(M) ≠ Ф}
Le = {M | L(M) = Ф}
Le and Lne are complements of each other.

The Post Correspondence Problem

1. State Post’s correspondence problem.(2M, Nov/Dec 2021)


2. Define PCP and prove that PCP is undecidable.(2M, Apr/May 2021)

The Post correspondence problem is another undecidable problem that turns out to be
a very helpful tool for proving problems in logic or in formal language theory to be
undecidable.

Let Σ be an alphabet with at least two letters. An instance of the Post Correspondence problem (for short, PCP) is given by two sequences U = (u1, . . . , um) and V = (v1, . . . , vm) of strings ui, vi ∈ Σ*. The problem is to find whether there is a (finite) sequence (i1, . . . , ip), with ij ∈ {1, . . . , m} for j = 1, . . . , p, so that

ui1 ui2 · · · uip = vi1 vi2 · · · vip.

Equivalently, an instance of the PCP is a sequence of pairs (u1, v1), . . . , (um, vm).

For example, consider the following problem, with U = (abab, aaabbb, aab, ba, ab, aa) and V = (ababaaa, bb, baab, baa, ba, a).

There is a solution given by the index sequence 1, 2, 3, 4, 5, 5, 6:

abab aaabbb aab ba ab ab aa = ababaaa bb baab baa ba ba a.

 Post Correspondence Problem is a popular undecidable problem that was


introduced by Emil Leon Post in 1946.
 It is simpler than the Halting Problem. In this problem we have N dominos (tiles).
 The aim is to arrange the tiles in such an order that the string made by the numerators is the same as the string made by the denominators.
 In simple words, let us assume we have two lists, both containing N words; the aim is to find a concatenation of these words in some sequence such that both lists yield the same result. Consider two lists A and B.
A = [aa, bb, abb] and B = [aab, ba, b]
Now for the sequence 1, 2, 1, 3 the first list yields aabbaaabb and the second list yields the same string aabbaaabb. So the solution to this PCP is 1, 2, 1, 3. Post Correspondence Problems can be represented in two ways:

Fig. Domino Form


Fig. Table Form

Step-1: We start with a tile in which the numerator and the denominator begin with the same symbol, so we can start with either tile 1 or tile 2. Starting with the second tile, the string made by the numerators is 10111 and the string made by the denominators is 10.

Step-2: We need 1s in the denominator to match the 1s in the numerator, so we go with the first tile; the string made by the numerators is 10111 1 and the string made by the denominators is 10 111.

Step-3: There is an extra 1 in the numerator; to match this 1 we add the first tile to the sequence. The string made by the numerators is now 10111 1 1 and the string made by the denominators is 10 111 111.

Step-4: Now there is an extra 1 in the denominator; to match it we add the third tile. The string made by the numerators is 10111 1 1 10 and the string made by the denominators is 10 111 111 0.

Final Solution - 2 1 1 3

String made by numerators: 101111110

String made by denominators: 101111110

The above strings are the same.
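The search for a match can also be sketched as a bounded brute-force program (Python). The tile faces below are read off from the steps above (tile 1 = 1/111, tile 2 = 10111/10, tile 3 = 10/0); the length bound is an assumption for illustration, since PCP itself is undecidable and no fixed bound works in general.

from itertools import product

def concat(tiles, seq):
    # Concatenate the chosen tile faces in the order given by seq (1-based indices).
    return "".join(tiles[i - 1] for i in seq)

def search_pcp(numerators, denominators, max_len=6):
    # Try every index sequence up to max_len and return the first matching one, if any.
    for length in range(1, max_len + 1):
        for seq in product(range(1, len(numerators) + 1), repeat=length):
            if concat(numerators, seq) == concat(denominators, seq):
                return seq
    return None

numerators   = ["1", "10111", "10"]   # tiles 1, 2, 3 as read from the steps above
denominators = ["111", "10", "0"]
print(search_pcp(numerators, denominators))   # (2, 1, 1, 3) -> 101111110 on both faces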

Fig. Table Form


Example 2

Step-1: We start from tile 1 as it is our only option; the string made by the numerators is 100 and the string made by the denominators is 1.

Step-2: We have an extra 00 in the numerator; the only way to balance this is to add tile 3 to the sequence. The string made by the numerators is 100 1 and the string made by the denominators is 1 00.

Step-3: There is an extra 1 in the numerator; to balance it we can add either tile 1 or tile 2. Let us try adding tile 1 first: the string made by the numerators is 100 1 100 and the string made by the denominators is 1 00 1.

Step-4: There is an extra 100 in the numerator; to balance this we can add the 1st tile again. The string made by the numerators is 100 1 100 100 and the string made by the denominators is 1 00 1 1 1. The 6th digit of the numerator string is 0, which differs from the 6th digit of the string made by the denominators, which is 1, so this attempt does not lead to a solution.

Undecidability of the Post Correspondence Problem: The theorem states that PCP is undecidable. That is, there is no algorithm that determines whether an arbitrary Post Correspondence System has a solution or not.

Proof

We reduce the acceptance problem for Turing Machines to PCP; this proves that PCP is undecidable as well. Consider a Turing machine M whose computation on an input string w the PCP instance will simulate; M can be represented as

M = (Q, ∑, Γ, δ, q0, qaccept, qreject)

If there is a match in the constructed PCP instance, then the Turing machine M halts in the accepting state on w. This acceptance behaviour of the Turing machine is exactly the acceptance problem ATM. We know that the acceptance problem ATM is undecidable. Therefore the PCP problem is also undecidable. To force the simulation of M, we make 2 modifications to the Turing Machine M and one change to our PCP problem.

1. M on input w never attempts to move its tape head beyond the left end of the input tape.
2. If the input is the empty string ε, we use the blank symbol in its place.
3. The PCP problem starts the match with the first domino [u1/v1]. This is called the Modified PCP problem.
MPCP = {[D] | D is an instance of PCP whose match starts with the first domino}
Construction Steps

1. Put [# / #q0w1w2..wn#] into D as the first domino, where D is the MPCP instance being constructed. A partial match is obtained with this first domino, since the # on one face matches the # symbol on the other face.
2. The transition function of the Turing Machine M can have moves Left (L) and Right (R). For every x, y, z in the tape alphabet and q, r in Q where q is not equal to qreject: if δ(q, x) = (r, y, R), put the domino [qx / yr] into D; if δ(q, x) = (r, y, L), put the domino [zqx / rzy] into D.
3. For every tape symbol x, put [x / x] into D. To mark the separation of consecutive configurations, put [# / #] and [# / _#] into D.
4. To consume the tape symbols x even after the Turing Machine has entered the accepting state, put [xqa / qa], [qax / qa] and [qa# / #] into D. These steps conclude the construction of D.
Since this is an instance of MPCP, we need to convert it to PCP. To convert D, we consider the dominos and string matchings below.

Converting MPCP To PCP: Let u = u1u2…un be any string of length n and define the following decorated versions of u:

$u = *u1*u2*u3* …*un
u$ = u1* u2* u3* … un*
$u$ = *u1* u2* u3* ... un*
Let D = {[u1 / v1], [u2 / v2], [u3 / v3], ..., [un / vn]} be the set of two-faced dominos of the MPCP instance; the corresponding PCP instance is {[$u1 / $v1$], [$u2 / v2$], ..., [$un / vn$], [*_ / _]}

From the above collection of dominos, we can see that the only domino that gives a partial match at the start is [$u1 / $v1$], and the marker domino [*_ / _] is placed at the end of the match. Thereby we can avoid stating the explicit requirement that the match should start with the first domino. If the number of configurations of the Turing machine does not stay within a finite bound, then the Turing machine is in a looping state; it does not halt.
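The decoration with the separator '*' used in this conversion can be written out concretely. The helper names below are illustrative; they implement the three decorated forms $u, u$ and $u$ defined above (Python).

def star_before(u):   # $u : a '*' before every symbol
    return "".join("*" + ch for ch in u)

def star_after(u):    # u$ : a '*' after every symbol
    return "".join(ch + "*" for ch in u)

def star_both(u):     # $u$ : a '*' before every symbol plus one extra '*' at the end
    return star_before(u) + "*"

print(star_before("abc"))   # *a*b*c
print(star_after("abc"))    # a*b*c*
print(star_both("abc"))     # *a*b*c*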

Theorem 6.8.1

The Post correspondence problem is undecidable, provided that the alphabet Σ has at
least two symbols.

Reduce the halting problem to the PCP, by encoding sequences of ID’s as partial
solutions of the PCP.

For instance, this can be done for RAM programs. The first step is to show that every
RAM program can be simulated by a single register RAM program. Then, the halting
problem for RAM programs with one register is reduced to the PCP.

Theorem 6.8.2

It is undecidable whether a context-free grammar is ambiguous

Proof

We reduce the PCP to the ambiguity problem for CFG’s.


Given any instance U = (u1, . . ., um) and V = (v1, . . ., vm) of the PCP, let c1, . . ., cm be m
new symbols, and consider the following languages:

LU = {ui1 · · · uip cip · · · ci1 | 1 ≤ ij ≤ m, 1 ≤ j ≤ p, p ≥ 1},

LV = {vi1 · · · vip cip · · · ci1 | 1 ≤ ij ≤ m, 1 ≤ j ≤ p, p ≥ 1}, and

LU,V = LU ∪ LV .

We can easily construct a CFG, GU, V, generating LU, V. The

productions are:

S −→ SU
S −→ SV
SU −→ ui SU ci
SU −→ ui ci
SV −→ vi SV ci
SV −→ vi ci.

It is easily seen that the PCP for (U, V) has a solution iff LU ∩ LV ≠ ∅ iff GU,V is ambiguous.
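The construction of GU,V can be sketched directly from the production schema above. The helper build_grammar below (Python) is an illustrative assumption that merely lists the productions for a given PCP instance; the two-tile instance passed to it reuses the first two pairs of the earlier example.

def build_grammar(U, V):
    # Productions of G_{U,V}: S -> SU | SV, SU -> u_i SU c_i | u_i c_i,
    # SV -> v_i SV c_i | v_i c_i, with one fresh terminal c_i per index i.
    prods = [("S", "SU"), ("S", "SV")]
    for i in range(1, len(U) + 1):
        c = f"c{i}"
        prods.append(("SU", f"{U[i - 1]} SU {c}"))
        prods.append(("SU", f"{U[i - 1]} {c}"))
        prods.append(("SV", f"{V[i - 1]} SV {c}"))
        prods.append(("SV", f"{V[i - 1]} {c}"))
    return prods

for lhs, rhs in build_grammar(["abab", "aaabbb"], ["ababaaa", "bb"]):
    print(lhs, "->", rhs)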

Example 1:

Consider the correspondence system as given below

A = (b, bab3, ba) and B = (b3, ba, a). The input alphabet is ∑ = {a, b}. Find the solution.

Solution:

A solution is 2, 1, 1, 3. That means w2w1w1w3 = x2x1x1x3

The constructed string from both lists is bab3b3a.


Example 2:

Does PCP with two lists x = (b, a, aba, bb) and y = (ba, ba, ab, b) have a solution?

Solution: Now we have to find a sequence such that the strings formed from x and y are identical. Such a sequence is 1, 2, 1, 3, 3, 4. Hence, from the x and y lists, x1x2x1x3x3x4 = y1y2y1y3y3y4 = bababaababb.

THE CLASSES P AND NP


1. With proper examples, explain P and NP complete problems. (13M, Nov/Dec
2021)

2. Define the classes P and NP problem. Give example problems for both. (2M,
Apr/May 2021)

3. What are polynomial- time algorithms? (2M, Nov/Dec 2019)

4. Show that any problem in P is also in NP but not the other way around (5M,
Nov/Dec 2019)

5. Describe in detail about NP – Hard and NP – Complete problems with example

(13M, Apr/May 2019)

 An algorithm is said to be polynomially bounded if its worst-case complexity is


bounded by a polynomial function of the input size. A problem is said to be
polynomially bounded if there is a polynomially bounded algorithm for it.
 P is the class of all decision problems that are polynomially bounded. The
implication is that a decision problem X ∊ P can be solved in polynomial time on
a deterministic computation model (such as a deterministic Turing machine).
 NP represents the class of decision problems which can be solved in polynomial time by a non-deterministic model of computation. That is, a decision problem X ∊ NP can be solved in polynomial time on a non-deterministic computation model (such as a non-deterministic Turing machine). A non-deterministic model can make the right guesses on every move and race towards the solution much faster than a deterministic model.
 A deterministic machine, at each point in time, executes an instruction.
Depending on the outcome of executing the instruction, it then executes some
next instruction, which is unique. A non-deterministic machine on the other hand
has a choice of next steps. It is free to choose any that it wishes. For example, it
can always choose a next step that leads to the best solution for the problem.
 E.g.: the Travelling Salesperson problem
A smart non-deterministic algorithm for the above problem starts with a vertex,
guesses the correct edge to choose, proceeds to the next vertex, guesses the
correct edge to choose there, etc. and in polynomial time discovers a Hamiltonian
cycle of least cost and provides an answer to the above problem. This is the
power of non-determinism. A deterministic algorithm here will have no choice
but take super-polynomial time to answer the above question.
 Another way of viewing the above is that given a candidate Hamiltonian cycle (call it a certificate), one can verify in polynomial time whether the answer to the above question is YES or NO (see the verification sketch after this list). Thus, to check whether a problem is in NP, it is enough to show that any YES instance can be verified in polynomial time. We do not have to worry about NO instances, since the non-deterministic program always makes the right choice.
 It is unknown whether P = NP; however, P ⊆ NP.
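As noted in the certificate bullet above, here is a minimal sketch (Python) of polynomial-time verification for the Hamiltonian cycle question. The toy graph and the helper name is_hamiltonian_cycle are assumptions for illustration, not part of the original text.

def is_hamiltonian_cycle(n, edges, cycle):
    # cycle is a claimed certificate: a list of vertices starting and ending at the same vertex.
    edge_set = {frozenset(e) for e in edges}
    if cycle[0] != cycle[-1]:
        return False
    if sorted(cycle[:-1]) != list(range(n)):        # every vertex appears exactly once
        return False
    return all(frozenset((a, b)) in edge_set        # consecutive vertices must be adjacent
               for a, b in zip(cycle, cycle[1:]))

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]    # toy graph (an invented example)
print(is_hamiltonian_cycle(4, edges, [0, 1, 2, 3, 0]))   # True: a valid certificate
print(is_hamiltonian_cycle(4, edges, [0, 2, 1, 3, 0]))   # False: (1, 3) is not an edge

The check runs in time polynomial in the size of the graph, which is exactly what membership in NP requires.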

NP-Complete Problems

The definition of NP-completeness is based on reducibility of problems. Suppose we


wish to solve a problem X and we already have an algorithm for solving another
problem Y. Suppose we have a function T that takes an input x for X and produces T(x),
an input for Y such that the correct answer for X on x is yes if and only if the correct
answer for Y on T(x) is yes. Then, by composing T and the algorithm for Y, we have an algorithm for X.

 If the function T itself can be computed in polynomially bounded time, we say X


is polynomially reducible to Y and we write X≤Y
 If X is polynomially reducible to Y, then the implication is that Y is at least as hard
to solve as X. i.e. X is no harder to solve than Y.
 It is easy to see that X ≤ Y and Y ∊ P implies X∊ P.

NP-Hardness and NP-Completeness

A decision problem Y is said to be NP-hard if X≤ Y for all X ∊ NP. An NP-hard problem Y


is said to be NP-complete if Y ∊ NP. NPC is the standard notation for the class of all NP-
complete problems.

 Informally, an NP-hard problem is a problem that is at least as hard as any


problem in NP. If, further, the problem also belongs to NP, it would become NP-
complete.

 It can be easily proved that if any NP-complete problem is in P, then NP = P.


Similarly, if any problem in NP is not polynomial-time solvable, then no NP-
complete problem will be polynomial-time solvable. Thus NP-completeness is at
the crux of deciding whether or not NP = P.

 Using the above definition of NP-completeness to show that a given decision


problem, say Y, is NP-complete will call for proving polynomial reducibility of
each problem in NP to the problem Y. This is impractical since the
class NP already has a large number of member problems and will continuously
grow as researchers discover new members of NP.

 A more practical way of proving NP-completeness of a decision problem Y is to


discover a problem X ∊ NPC such that X ≤p Y. Since X is NP-complete and ≤p is a transitive relation, the above would mean that Z ≤p Y for every Z ∊ NP. Furthermore, if Y ∊ NP, then Y is NP-complete.

 The above is the standard technique used for showing the NP-hardness or NP-
completeness of a given decision problem.

 If the actual input sizes are small, an algorithm with, say, exponential running
time may be acceptable. On the other hand, it may still be possible to obtain near-
optimal solutions in polynomial-time. Such an algorithm that returns near-
optimal solutions (in polynomial time) is called an approximation algorithm.

NP-Hard Problems
A problem is said to be NP-Hard when an algorithm for solving it can be translated into one for solving any NP problem. We can then say that this problem is at least as hard as any NP problem, but it could be much harder or more complex.

NP-Complete Problems
NP-Complete (NPC) problems are problems that are present in both the NP and NP-
Hard classes. That is NP-Complete problems can be verified in polynomial time and any
NP problem can be reduced to this problem in polynomial time.

A problem is in class NPC if it is in NP and is as hard as any problem in NP. A problem is


said to be NP-hard if all problems in NP are polynomial time reducible to it, even though
it may not be in NP itself.

If a polynomial-time algorithm exists for any of these problems, then all problems in NP would be polynomial-time solvable. These problems are called NP-complete. NP-completeness is important for both theoretical and practical reasons.

Definition of NP-Completeness
A language M is NP-complete, if it satisfies the two conditions which are given below −
 M is in NP.
 Every A in NP is polynomial time reducible to M.
If a language satisfies the second property, but not necessarily the first one, the language M is known as NP-Hard.
Informally, a search problem M is NP-Hard if there exists some NP-Complete problem A
that Turing reduces to M.

NP-Complete Problems
Examples of NP-Complete problems where no polynomial time algorithm is known are
as follows
 Determining whether a graph has a Hamiltonian cycle
 Determining whether a Boolean formula is satisfiable, etc.
NP-Hard Problems

The following problems are NP-Hard


 The circuit-satisfiability problem
 Set Cover
 Vertex Cover
 Travelling Salesman Problem

Example

Consider an example to check if a problem is in P class or NP class

Step 1 − If a problem is in class P, it means we can find a solution to that type of problem in polynomial time.

Step 2 − If a problem is in class NP, it means we can verify a possible solution in polynomial time.

Step 3 − Put another way, NP means that a problem is Nondeterministically Polynomial. Specifically, that means that if you could build a machine that had the ability to try all the possible solutions of your problem at once, it could finish in polynomial time.

Step 4 − So, if you can solve a problem in polynomial time, you can certainly verify that an answer is correct in polynomial time as well, since the polynomial-time algorithm can simply recompute the answer; this is why every problem in P is also in NP.

Prove that the Hamiltonian Path is NP-Complete


A Hamilton cycle is a round trip path along n edges of graph G which visits every vertex
once and returns to its starting vertex.

Example

Given below is an example of the Hamilton cycle path

Hamilton cycle path: 1,2,8,7,6,5,4,3,1

TSP is NP-Complete
In the travelling salesman problem (TSP), we have a salesman and a set of cities. The salesman needs to visit each one of the cities, starting from a certain one and returning to the same city, i.e. back to the starting position. The challenge of this problem is that the travelling salesman wants to minimise the total length of the trip.

Proof

 To prove TSP is NP-Complete, first try to prove TSP belongs to Non-deterministic


Polynomial (NP).

 In TSP, we have to find a tour and check that the tour contains each vertex once.
Then, we calculate the total cost of the edges of the tour. Finally, we check if the
cost is minimum or not. This can be done in polynomial time. Therefore, TSP
belongs to NP.

 Next, we have to prove that TSP is NP-hard.

 To prove this, one way is to show that the Hamiltonian cycle ≤p TSP (as we know
that the Hamiltonian cycle problem is NP Complete)

 Assume G = (V, E) to be an instance of the Hamiltonian cycle.

 Hence, an instance of TSP is constructed. We can create the complete graph G' = (V, E'), where each edge in E' is assigned cost 0 if it belongs to E and cost 1 otherwise.

 Now, assume that a Hamiltonian cycle H exists in G. The cost of each edge of H is 0 in G', as each such edge belongs to E. Therefore, H has a cost of 0 in G'. Thus, if graph G has a Hamiltonian cycle, then graph G' has a tour of cost 0.

 Now let us assume that G' has a tour H' of cost at most 0. The cost of each edge in E' is 0 or 1 by definition. Hence, each edge of H' must have a cost of 0, as the cost of H' is 0. We finally conclude that H' contains only edges in E.

 Finally, we have proved that G has a Hamiltonian cycle if and only if G' has a tour of cost at most 0. Therefore, TSP is NP-complete.
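The reduction described in this proof can be sketched as code (Python). The function hc_to_tsp and the example graph are illustrative assumptions; the sketch only builds the 0/1 cost matrix of G', it does not solve TSP.

def hc_to_tsp(n, edges):
    # Build the cost matrix of the complete graph G': cost 0 for edges of G, cost 1 otherwise.
    edge_set = {frozenset(e) for e in edges}
    cost = [[0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            if u != v and frozenset((u, v)) not in edge_set:
                cost[u][v] = 1
    return cost

# Any tour of total cost 0 in this matrix uses only original edges of G,
# i.e. it is a Hamiltonian cycle of G.
print(hc_to_tsp(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))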

A Harder Example:

To show that not all languages are (obviously) in P, consider the following:

HC = {serialize(G) | G has a simple cycle that visits every vertex of G}.

Such a cycle is called a Hamiltonian cycle and the decision problem is the Hamiltonian
Cycle Problem.
Fig. 1.4: The Hamiltonian cycle (HC) problem

In Fig. 1.4 (a) we show an example of a Hamiltonian cycle in a graph. If you think that
the problem is easy to solve, try to solve the problem on the graph shown in Fig. 1.4 (b),
which has one additional vertex and one additional edge. Either find a Hamiltonian cycle
in this graph or show than none exists. To make this even harder, imagine a million-
vertex graph with many slight variations of this pattern. Is HC ∈ P? No one knows the
answer for sure, but it is conjectured that it is not.

Polynomial-Time Verification and Certificates / Polynomial time Reduction

The principal methodology for proving that a problem P2 cannot be solved in polynomial time (i.e. P2 is not in P) is the reduction of a problem P1, which is known not to be in P, to P2. A reduction from P1 to P2 is polynomial-time if it takes time that is polynomial in the length of the P1 instance.

In order to define NP-completeness, we need to first define NP. Unfortunately, providing


a rigorous definition of NP will involve a presentation of the notion of nondeterministic
models of computation.

Many language recognition problems may be hard to solve, but they have the property that it is easy to verify that a string is in the language. There is no obviously efficient way to find a Hamiltonian cycle in a graph. However, suppose that a graph did have a Hamiltonian cycle and someone wanted to convince us of its existence. This person would simply tell us the vertices in the order that they appear along the cycle. It would be a very easy matter for us to inspect the graph and check that this is indeed a legal cycle and that it visits all the vertices exactly once. Thus, even though we know of no efficient way to solve the Hamiltonian cycle problem, there is a very efficient way to verify that a given graph has one.
The given cycle in the above example is called a certificate. A certificate is a piece of
information which allows us to verify that a given string is in a language in polynomial
time.

More formally, given a language L, and given x ∈ L, a verification algorithm is an


algorithm which, given x and a string y called the certificate, can verify that x is in the
language L using this certificate as help. If x is not in L then there is nothing to verify. If
there exists a verification algorithm that runs in polynomial time, we say that L can be
verified in polynomial time.

Note that not all languages have the property that they are easy to verify. For example,
consider the following languages:

UHC = {G | G has a unique Hamiltonian cycle}

co-HC = {G | G has no Hamiltonian cycle} (the complement of HC).

There is no known polynomial time verification algorithm for either of these. For
example, suppose that a graph G is in the language UHC.

The class NP: We can now define the complexity class NP

Definition: NP is the set of all languages that can be verified in polynomial time.

Observe that if we can solve a problem efficiently without a certificate, we can certainly solve it given the additional help of a certificate. Therefore, P ⊆ NP. However, it is not known whether P = NP. Most experts believe that P ≠ NP, but no one has a proof of this. One last ingredient is needed before defining the notions of NP-hard and NP-complete, namely the notion of a polynomial-time reduction, which was discussed above.

Table 1. Computationally hard problems and their (easy) counterparts.
