
CST 306 AAD MOD V

MODULE-5
INTRODUCTION TO COMPLEXITY THEORY

Tractable and Intractable Problems, Complexity Classes – P, NP, NP-Hard and
NP-Complete Classes- NP Completeness proof of Clique Problem and Vertex
Cover Problem- Approximation algorithms- Bin Packing, Graph Coloring.
Randomized Algorithms (Definitions of Monte Carlo and Las Vegas algorithms),
Randomized version of Quick Sort algorithm with analysis.

TRACTABLE AND INTRACTABLE PROBLEMS


Computer scientists divide common complexity functions into two classes:
• Polynomial functions: Any function that is O(n^k), i.e. bounded from above
by n^k for some constant k. E.g. O(1), O(log n), O(n), O(n log n), O(n^2),
O(n^3).
• Exponential functions: The remaining functions. E.g. O(2^n), O(n!), O(n^n).
On the basis of this classification of functions into polynomial and exponential,
we can classify algorithms:
• Polynomial-Time Algorithm: an algorithm whose order-of-magnitude time
performance is bounded from above by a polynomial function of n, where n
is the size of its inputs.
• Exponential Algorithm: an algorithm whose order-of-magnitude time
performance is not bounded from above by a polynomial function of n.
Tractable Problem: A problem that is solvable by a polynomial-time algorithm.
The upper bound is polynomial.
Here are examples of tractable problems (ones with known polynomial-time
algorithms):
• Searching an unordered list
• Searching an ordered list
• Sorting a list
• Multiplication of integers (even though there is a gap between the best
known upper and lower bounds)
• Finding a minimum spanning tree in a graph (even though there is a gap
between the best known upper and lower bounds)

Intractable Problem: a problem that cannot be solved by a polynomial-time
algorithm. The lower bound is exponential.
From a computational complexity stance, intractable problems are problems for
which there exist no efficient algorithms to solve them.


Most intractable problems have an algorithm that provides a solution, and that
algorithm is the brute-force search.
This algorithm, however, does not provide an efficient solution and is, therefore,
not feasible for computation with anything more than the smallest input.
Examples
• Towers of Hanoi: we can prove that any algorithm that solves this problem must
have a worst-case running time that is at least 2^n − 1.
• List all permutations (all possible orderings) of n numbers.
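The growth of both examples can be observed directly. The following is a minimal Python sketch, not part of the original notes; the function names `hanoi_moves` and `count_permutations` and the peg labels are illustrative assumptions. It generates the Towers of Hanoi moves recursively and counts permutations by listing them.

```python
from itertools import permutations


def hanoi_moves(n, source="A", spare="B", target="C"):
    """Return the list of moves that transfers n disks from source to target."""
    if n == 0:
        return []
    # Move n-1 disks out of the way, move the largest disk, move the n-1 disks back on top.
    return (hanoi_moves(n - 1, source, target, spare)
            + [(source, target)]
            + hanoi_moves(n - 1, spare, source, target))


def count_permutations(items):
    """Count the orderings of items by listing every one of them."""
    return sum(1 for _ in permutations(items))


print(len(hanoi_moves(3)))            # 7, i.e. 2^3 - 1
print(count_permutations([1, 2, 3]))  # 6, i.e. 3!
```

Each extra disk doubles the number of moves (plus one), and each extra number multiplies the count of orderings by n, which is why neither task can be completed in polynomial time.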

COMPLEXITY CLASSES – P, NP, NP-HARD AND NP-COMPLETE CLASSES
In computer science, problems are divided into classes known as Complexity
Classes; for some of these problems no efficient solution has been found yet. In
complexity theory, a Complexity Class is a set of problems with related
complexity. These classes help scientists to group problems based on how much
time and space is required to solve them and to verify their solutions.
Complexity theory is the branch of the theory of computation that deals with the
resources required to solve a problem.
The common resources are time and space, meaning how much time the
algorithm takes to solve a problem and the corresponding memory usage.
The time complexity of an algorithm is used to describe the number of steps
required to solve a problem, but it can also be used to describe how long it takes
to verify the answer.
The space complexity of an algorithm describes how much memory is required
for the algorithm to operate.
Complexity classes are useful in organizing similar types of problems.
Types of Complexity Classes
This article discusses the following complexity classes:
1. P Class
2. NP Class
3. CoNP Class
4. NP hard
5. NP complete

In theoretical computer science, the classification of common problems by
complexity has two major sets: P, which is “Polynomial” time, and NP, which is
“Non-deterministic Polynomial” time. There are also the NP-HARD and
NP-COMPLETE sets, which we use to express more sophisticated problems. In the
case of rating from easy to hard, we might label these as “easy”, “medium”,
“hard”, and finally “hardest”.


P CLASS
The P in the P class stands for Polynomial Time. It is the collection of decision
problems (problems with a “yes” or “no” answer) that can be solved by a
deterministic machine in polynomial time.
The class P consists of those problems that are solvable in polynomial time. More
specifically, they are problems that can be solved in time O(n^k) for some constant
k, where n is the size of the input to the problem.

Features:
1. The solution to P problems is easy to find.
2. P is often a class of computational problems that are solvable and tractable.
Tractable means that the problems can be solved in theory as well as in
practice. But the problems that can be solved in theory but not in practice are
known as intractable.
This class contains many natural problems like:
1. Calculating the greatest common divisor.
2. Finding a maximum matching.
3. Decision versions of linear programming.

NP CLASS
The NP in NP class stands for Non-deterministic Polynomial Time. It is the
collection of decision problems that can be solved by a non-deterministic machine
in polynomial time.
NP includes problems for which we have yet to find efficient polynomial-time
algorithms but for which, given a solution, we can verify that solution in
polynomial time. It has not been proved whether these problems can be solved in
polynomial time or whether they require super-polynomial time.
The class NP consists of those problems that are “verifiable” in polynomial time.
If we were somehow given a “certificate” of a solution, then we could verify that
the certificate is correct in time polynomial in the size of the input to the
problem. For example, in the Hamiltonian cycle problem, given a directed graph
G = (V,E), a certificate would be a sequence (v_1, v_2, ..., v_n) of n vertices.
We could easily check in polynomial time that (v_i, v_(i+1)) ∈ E for
i = 1, 2, ..., n-1 and that (v_n, v_1) ∈ E.
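The certificate check described above can be sketched in Python. This is an illustrative sketch only: the function name `is_hamiltonian_cycle` and the edge-list representation are assumptions, and the sketch additionally checks that every vertex appears exactly once in the sequence.

```python
def is_hamiltonian_cycle(vertices, edges, certificate):
    """Verify a proposed Hamiltonian cycle of a directed graph in polynomial time.

    vertices: iterable of vertex labels
    edges: iterable of directed edges (u, v)
    certificate: proposed sequence (v_1, ..., v_n)
    """
    vertex_set = set(vertices)
    n = len(vertex_set)
    # The sequence must list every vertex exactly once.
    if len(certificate) != n or set(certificate) != vertex_set:
        return False
    edge_set = set(edges)
    # Check (v_i, v_(i+1)) for consecutive vertices and the closing edge (v_n, v_1).
    return all((certificate[i], certificate[(i + 1) % n]) in edge_set
               for i in range(n))
```

The check touches each vertex and each consecutive pair once, so it runs in time polynomial in the size of the graph, even though finding such a sequence may be hard.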
The complexity class NP is the class of languages that can be verified by a
polynomial-time algorithm. A language L belongs to NP if and only if there
exist a two-input polynomial-time algorithm A and a constant c such that
L = { x ∈ {0,1}* : there exists a certificate y with |y| = O(|x|^c) such that
A(x, y) = 1 }.

Features:
1. The solutions of the NP class are hard to find since they are being solved by a
non-deterministic machine but the solutions are easy to verify.
2. Problems of NP can be verified by a Turing machine in polynomial time.


Example:
Let us consider an example to better understand the NP class. Suppose there is a
company with a total of 1000 employees, each having a unique employee ID. Assume
that there are 200 rooms available for them. A selection of 200 employees must
be made, but the CEO of the company has a list of pairs of employees who can’t
work together for personal reasons.
This is an example of an NP problem, since it is easy to check whether a given
choice of 200 employees proposed by a coworker is satisfactory, i.e. no pair taken
from the coworker’s list appears on the list given by the CEO. But generating such
a list from scratch seems to be so hard as to be completely impractical.
It indicates that if someone provides us with a candidate solution to the problem,
we can decide whether it is correct or incorrect in polynomial time. Thus, for a
problem in the NP class, a proposed answer can be verified in polynomial time.
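The verification step in this example can be sketched as follows. This is a hedged illustration: the function name `selection_is_valid` and the way the CEO's list is represented (a list of ID pairs) are assumptions made for the sketch, not part of the original example.

```python
from itertools import combinations


def selection_is_valid(selection, forbidden_pairs, required_size):
    """Return True if the proposed selection has the required size
    and contains no pair from the forbidden list."""
    chosen = set(selection)
    if len(chosen) != required_size:
        return False
    forbidden = {frozenset(pair) for pair in forbidden_pairs}
    # Look at every pair of chosen employees: O(k^2) set lookups for k employees.
    return all(frozenset(pair) not in forbidden
               for pair in combinations(chosen, 2))
```

For 200 chosen employees this inspects 200*199/2 = 19900 pairs, which is cheap, whereas trying every possible selection of 200 out of 1000 employees is not feasible.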
This class contains many problems that one would like to be able to solve
effectively:
1. Boolean Satisfiability Problem (SAT).
2. Hamiltonian Path Problem.
3. Graph coloring.

Co-NP CLASS
Co-NP stands for the complement of the NP class. It means that if the answer to a
problem in Co-NP is No, then there is a proof that can be checked in polynomial
time.
Features:
1. If a problem X is in NP, then its complement X’ is in Co-NP.
2. For an NP or Co-NP problem, there is no need to verify all the answers at
once in polynomial time; only one particular answer, “yes” or “no”, needs to be
verifiable in polynomial time for a problem to be in NP or Co-NP.
Some example problems for Co-NP are:
1. Checking whether a number is prime.
2. Integer Factorization.

NP-HARD CLASS
An NP-hard problem is at least as hard as the hardest problem in NP and it is
the class of the problems such that every problem in NP reduces to NP-hard.
We say that a decision problem Pi is NP-hard if every problem in NP is
polynomial time reducible to Pi. It means that Pi is ‘as hard as’ all the problems
in NP. If Pi can be solved in polynomial-time, then so can all problems in NP.

Features:
1. Not all NP-hard problems are in NP.
2. Their solutions may take a long time to check. This means that if a solution
for an NP-hard problem is given, it may take a long time to check whether it is
right or not.


3. A problem A is NP-hard if, for every problem L in NP, there exists a
polynomial-time reduction from L to A.
Some of the examples of problems in NP-hard are:
1. Halting problem.
2. Quantified Boolean formulas.
3. No Hamiltonian cycle.

NP-COMPLETE CLASS
A problem is NP-complete if it is both in NP and NP-hard. NP-complete problems
are the hardest problems in NP.
We say that a decision problem Pi is NP-complete if it is NP-hard and it is also
in the class NP itself. NP-complete problems are problems that have been proved
to be in NP, that is, a nondeterministic solution is quite trivial, and yet
no polynomial-time algorithm has been developed for them. If any one of these
problems can be solved in polynomial time on a deterministic machine, then all
the problems in NP can be solved in polynomial time (Cook's Theorem).
NP-complete problems form a set of problems that may or may not be intractable.

Features:
1. NP-complete problems are special as any problem in NP class can be
transformed or reduced into NP-complete problems in polynomial time.
2. If one could solve an NP-complete problem in polynomial time, then one could
also solve any NP problem in polynomial time.
Some example problems include:
1. 0/1 Knapsack.
2. Hamiltonian Cycle.
3. Satisfiability.
4. Vertex cover.


Relation between Complexity Classes


Difference between NP hard and NP complete problem


NP problems are the set of problems whose solutions may be hard to find but are
easy to verify, and which are solved by a Non-Deterministic Machine in polynomial
time.
A Problem X is NP-Hard if there is an NP-Complete problem Y, such that Y is
reducible to X in polynomial time. NP-Hard problems are at least as hard as
NP-Complete problems. An NP-Hard Problem need not be in the NP class.
A problem X is NP-Complete if X is in NP and every NP problem Y is
reducible to X in polynomial time. NP-Complete problems are at least as hard as
any other NP problem. In other words, a problem is NP-Complete if it is a part of
both NP and NP-Hard. A non-deterministic Turing machine can solve an NP-Complete
problem in polynomial time.

NP COMPLETENESS PROOF OF CLIQUE PROBLEM


A clique in an undirected graph G = (V,E) is a subset V’ ⊆ V of vertices, each pair
of which is connected by an edge in E. In other words, a clique is a complete
subgraph of G. The size of a clique is the number of vertices it contains. The
clique problem is the optimization problem of finding a clique of maximum size in
a graph. As a decision problem, we ask simply whether a clique of a given size k
exists in the graph. The formal definition is:
CLIQUE = {<G,k> : G is a graph containing a clique of size k}

Theorem: The clique problem is NP-complete. (As per textbook)


(Proof from other references)


A clique is a subgraph of a graph in which all the vertices are connected with
each other, that is, the subgraph is a complete graph. The Maximum Clique
Problem is to find the maximum-sized clique of a given graph G, that is, a
complete graph which is a subgraph of G and contains the maximum number of
vertices. This is an optimization problem. Correspondingly, the Clique Decision
Problem is to determine whether a clique of size k exists in the given graph or
not.


To prove that a problem is NP-Complete, we have to show that it belongs to both
NP and NP-Hard Classes. (Since NP-Complete problems are NP-Hard problems
which also belong to NP)

The Clique Decision Problem belongs to NP – If a problem belongs to the NP
class, then it should have polynomial-time verifiability, that is, given a
certificate, we should be able to verify in polynomial time if it is a solution to
the problem.
Proof:
Certificate – Let the certificate be a set S consisting of the nodes in the
claimed clique, where S is a subset of the vertices of G.
Verification – We have to check whether S forms a clique of size k in the graph.
Verifying that the number of nodes in S equals k takes O(1) time. Verifying
that each vertex of S is adjacent to the other (k-1) vertices of S takes O(k^2)
time. (In a complete graph, each vertex is connected to every other vertex
through an edge. Hence the total number of edges in a complete graph on k
vertices = kC2 = k*(k-1)/2.)
Therefore, checking whether the graph formed by the k nodes in S is complete
takes O(k^2) = O(n^2) time (since k <= n, where n is the number of vertices in G).
Therefore, the Clique Decision Problem has polynomial time verifiability and
hence belongs to the NP Class.
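The verification step can be sketched in Python as follows. The function name `is_clique_of_size` and the undirected edge-list input are illustrative assumptions; the sketch simply checks the size of S and then all k*(k-1)/2 pairs.

```python
from itertools import combinations


def is_clique_of_size(edges, candidate, k):
    """Verify that candidate is a set of k vertices that are pairwise adjacent.

    edges: iterable of undirected edges (u, v)
    candidate: the certificate S
    """
    edge_set = {frozenset(e) for e in edges}
    nodes = set(candidate)
    if len(nodes) != k:
        return False
    # k*(k-1)/2 pair checks, i.e. O(k^2) set lookups.
    return all(frozenset(pair) in edge_set for pair in combinations(nodes, 2))
```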
The Clique Decision Problem belongs to NP-Hard – A problem L belongs to NP-Hard
if every NP problem is reducible to L in polynomial time. Now, let the
Clique Decision Problem be C. To prove that C is NP-Hard, we take an already
known NP-Hard problem, say S, and reduce it to C. If this reduction can be done
in polynomial time, then C is also an NP-Hard problem. The Boolean
Satisfiability Problem (S) is an NP-Complete problem, as proved by Cook’s
theorem. Therefore, every problem in NP can be reduced to S in polynomial time.
Thus, if S is reducible to C in polynomial time, every NP problem can be reduced
to C in polynomial time, thereby proving C to be NP-Hard.
Proof that the Boolean Satisfiability problem reduces to the Clique Decision
Problem
Let the boolean expression be F = (x1 v x2) ^ (x1’ v x2’) ^ (x1 v x3), where x1,
x2, x3 are the variables, ‘^’ denotes logical ‘and’, ‘v’ denotes logical ‘or’ and x’
denotes the complement of x. Let the expression within each pair of parentheses
be a clause. Hence we have three clauses – C1, C2 and C3. Consider the vertices
<x1, 1>; <x2, 1>; <x1’, 2>; <x2’, 2>; <x1, 3>; <x3, 3>, where the second term in
each vertex denotes the number of the clause it belongs to. We connect these
vertices such that –
1. No two vertices belonging to the same clause are connected.
2. No variable is connected to its complement.

Thus, the graph G (V, E) is constructed such that V = { <a, i> | a belongs to
Ci } and E = { ( <a, i>, <b, j> ) | i is not equal to j ; b is not equal to a’ }
Consider the subgraph of G with the vertices <x2, 1>; <x1’, 2>; <x3, 3>. It
forms a clique of size 3 (depicted by the dotted line in the figure).
Corresponding to this, for the assignment <x1, x2, x3> = <0, 1, 1>, F
evaluates to true. In general, if we have k clauses in our satisfiability
expression, the expression is satisfiable exactly when G contains a clique of
size k: a satisfying assignment selects one true literal from each clause, and
these k vertices are pairwise connected; conversely, a clique of size k contains
one literal from each clause and never a variable together with its complement,
so setting those literals to true makes the expression evaluate to true. Since G
can be constructed in polynomial time, the satisfiability problem is reduced to
the clique decision problem. Therefore, the Clique Decision Problem is NP-Hard.
The Clique Decision Problem is in NP and is NP-Hard. Therefore, the Clique
decision problem is NP-Complete.
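The construction of G from F can be sketched in Python. This is an illustrative sketch under stated assumptions: literals are written as strings such as "x1" and "x1'", the helper names `negate`, `sat_to_clique_graph` and `find_clique` are hypothetical, and `find_clique` is a brute-force search included only to check small examples (it is not a polynomial-time algorithm).

```python
from itertools import combinations


def negate(literal):
    """Return the complement of a literal written as "x1" or "x1'"."""
    return literal[:-1] if literal.endswith("'") else literal + "'"


def sat_to_clique_graph(clauses):
    """Build the graph of the reduction: one vertex <literal, clause number> per
    literal occurrence, and an edge between two vertices when they come from
    different clauses and are not complements of each other."""
    vertices = [(lit, i) for i, clause in enumerate(clauses, start=1)
                for lit in clause]
    edges = {frozenset((u, v)) for u, v in combinations(vertices, 2)
             if u[1] != v[1] and u[0] != negate(v[0])}
    return vertices, edges


def find_clique(vertices, edges, k):
    """Brute-force search for a clique of size k (exponential time)."""
    for subset in combinations(vertices, k):
        if all(frozenset(pair) in edges for pair in combinations(subset, 2)):
            return subset
    return None


# The expression F from the text: (x1 v x2) ^ (x1' v x2') ^ (x1 v x3)
F = [["x1", "x2"], ["x1'", "x2'"], ["x1", "x3"]]
V, E = sat_to_clique_graph(F)
clique = find_clique(V, E, len(F))
```

Building the graph takes time quadratic in the number of literal occurrences, which is the polynomial-time part of the reduction. Any clique of size 3 found for F gives a satisfying assignment by setting its literals to true; it need not be the same clique as the one highlighted in the text.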


(Proof from another reference)


CLIQUE COVER: - Given a graph G and an integer k, can we find k
subsets of vertices V1, V2, ..., Vk such that the union of all Vi equals V,
and each Vi is a clique of G?

• 3CNF ≤p Clique

Proof: - For a function of k clauses, the graph built from it has a clique
of size k exactly when the function is satisfiable. The clique picks k
literals, one from each clause, that can all be assigned the value 1 at the
same time. By using these values for the variables of the clique, you can
make the value of each clause in the function equal to 1.

Example: - You have a Boolean function in 3CNF:-

(X+Y+Z) (X+Y+Z') (X+Y'+Z)     ... (i)

After Reduction/Conversion from 3CNF to CLIQUE, a clique of size 3 gives
values for the variables, for example X = 1, Y = 1, Z = 0.

Put these values in equation (i)

(1+1+0)(1+1+1)(1+0+0)

(1)(1)(1)=1 output verified

• Clique ϵ NP:-

Proof: - Given a graph G, an integer k and a proposed set of k vertices, you
can check within polynomial time that every pair of vertices in the set is
joined by an edge. So, it is concluded that CLIQUE belongs to NP.

• Proof of NPC:-

1. The reduction from 3CNF to Clique is achieved within polynomial time.
2. A proposed clique of size k can be verified within polynomial time.
So, it is concluded that, since both reduction and verification can be done
within polynomial time, Clique is also in NPC.

NP COMPLETENESS PROOF OF VERTEX COVER PROBLEM


(Proof from another reference)


Vertex Cover:
Definition: - A vertex cover of a graph G (V, E) is a set of vertices such that
every edge of G has at least one endpoint in the set. The vertex cover problem
asks whether G has a vertex cover of a given size K.
According to the graph G of vertex cover which you have created, the size of
Vertex Cover =2

• Vertex Cover ≤p Clique

In a graph G with N vertices that contains a Vertex Cover of size K, there
must exist a Clique of size N-K in its complement.
According to the graph G, you have
Number of vertices=6
Size of Vertex Cover=K=2
Size of Clique=N-K=4
You can create the Clique by complementing the graph G of Vertex Cover,
which in simpler form means: connect the vertices of G through edges where
edges don’t exist and remove all the existing edges.
You will get the complement graph with Clique Size=4
• Clique ≤p Vertex Cover
Here, through the Reduction process, you can get the Vertex Cover from
Clique by just complementing the Clique graph G within polynomial time:
G has a Clique of size K exactly when its complement has a Vertex Cover of
size N-K.
• Vertex Cover ϵ NP
Given a graph G, an integer K and a proposed set of K vertices, you can
check within polynomial time that every edge of G has at least one endpoint
in the set. So, Vertex Cover belongs to NP.
• Proof of NPC:-
The reduction from Clique to Vertex Cover can be made within polynomial
time. In simpler form, you can convert a Clique instance into a Vertex
Cover instance within polynomial time.
Verification of a proposed Vertex Cover can also be done within polynomial
time. Since both Reduction and Verification can be done in polynomial time,
Vertex Cover is also in NPC.
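The relation between a vertex cover of G and a clique in the complement of G can be checked on a small example. The sketch below is illustrative: the helper names and the 6-vertex example graph are assumptions chosen to match the numbers N = 6, K = 2 used above, not the graph from the original figure.

```python
from itertools import combinations


def complement_edges(vertices, edges):
    """Edges of the complement graph: all vertex pairs that are not edges of G."""
    present = {frozenset(e) for e in edges}
    return {frozenset(p) for p in combinations(vertices, 2)} - present


def is_vertex_cover(edges, cover):
    """True if every edge has at least one endpoint in cover."""
    cover = set(cover)
    return all(u in cover or v in cover for u, v in edges)


def is_clique(edge_set, nodes):
    """True if every pair of nodes is joined by an edge in edge_set."""
    return all(frozenset(p) in edge_set for p in combinations(nodes, 2))


# Hypothetical example graph with N = 6 vertices and a vertex cover of size K = 2.
vertices = [0, 1, 2, 3, 4, 5]
edges = [(0, 1), (0, 2), (0, 3), (1, 4), (1, 5)]
cover = {0, 1}
rest = set(vertices) - cover                 # the N - K = 4 remaining vertices
co_edges = complement_edges(vertices, edges)
print(is_vertex_cover(edges, cover), is_clique(co_edges, rest))  # True True
```

Computing the complement looks at each of the N*(N-1)/2 vertex pairs once, so the transformation runs in polynomial time.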


(Proof from another reference)


APPROXIMATION ALGORITHMS
An approximation algorithm for a problem is a polynomial-time algorithm that,
when given input I, outputs an element of FS(I). A feasible solution(FS) is an
object of the right type but not necessarily an optimal one. FS(I) is the set of
feasible solutions for I.
Approximation algorithms are a method for tackling NP-complete optimization
problems: they are fast (polynomially bounded) algorithms that are not
guaranteed to give the best solution but aim to give one that is close to the
optimal.

BIN PACKING
• How to pack or store objects of various sizes and shapes with a minimum of
wasted space.
• Let S = (s1, …, sn), where 0 < si <= 1 for 1 <= i <= n. Pack s1, …, sn into as
few bins as possible, where each bin has capacity one.
• An optimal solution for Bin Packing can be found by considering all ways to
partition S into n or fewer subsets. There are more than (n/2)^(n/2) possible
partitions.
• Find a packing in unit-sized bins that minimizes the number of bins used.


THE FIRST-FIT ALGORITHM:


This algorithm puts each item into the first partially packed bin in which it
fits. If the item does not fit into any of these bins, it opens a new bin and puts
the item into it.

THE FIRST-FIT DECREASING ALGORITHM:


The approximation algorithm we present here uses a very simple greedy heuristic
strategy called first fit decreasing (FFD). Its worst case complexity is O(n^2).
The FFD strategy first sorts the objects in non-increasing order of size. Then it
puts each item into the first partially packed bin in which it fits. If the item
does not fit into any of these bins, it opens a new bin and puts the item into it.
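The first-fit and first-fit-decreasing strategies can be sketched in Python. This is a minimal sketch, not the algorithm listing of the original notes; the function names, the `capacity` parameter and the small tolerance used for floating-point sizes are assumptions.

```python
def first_fit(sizes, capacity=1.0):
    """Pack items in the given order; return a list of bins (lists of sizes)."""
    bins = []   # contents of each open bin
    used = []   # space already used in each bin
    for s in sizes:
        for j in range(len(bins)):
            # Small tolerance so that float sizes summing to the capacity still fit.
            if used[j] + s <= capacity + 1e-9:
                bins[j].append(s)
                used[j] += s
                break
        else:
            # The item fits in no open bin: open a new one.
            bins.append([s])
            used.append(s)
    return bins


def first_fit_decreasing(sizes, capacity=1.0):
    """Sort the items in non-increasing order, then apply first fit."""
    return first_fit(sorted(sizes, reverse=True), capacity)


items = [5, 6, 4, 5]
print(len(first_fit(items, capacity=10)))             # 3 bins
print(len(first_fit_decreasing(items, capacity=10)))  # 2 bins
```

Each item scans the open bins once, which gives the O(n^2) worst case mentioned above; the example also shows how sorting first can save a bin.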


Algorithm:

BEST FIT STRATEGY


An object of size s is placed in the bin Bj which is the fullest among those bins
in which the object fits, i.e. used[j] is maximum subject to the requirement that
the object still fits. If the si are sorted in decreasing order, BF works as well
as FFD.

NEXT FIT STRATEGY


Here the objects are not sorted. One bin is filled at a time. Objects are put in the
current bin until the next one does not fit, then a new bin is started and no more
objects are packed in bins considered earlier.
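The best-fit and next-fit strategies can be sketched side by side in Python. The function names, the `capacity` parameter and the floating-point tolerance are assumptions made for this illustration; `used[j]` follows the notation of the text.

```python
def best_fit(sizes, capacity=1.0):
    """Place each item in the fullest bin in which it still fits."""
    bins, used = [], []
    for s in sizes:
        best = None
        for j in range(len(bins)):
            fits = used[j] + s <= capacity + 1e-9
            if fits and (best is None or used[j] > used[best]):
                best = j                      # fullest feasible bin so far
        if best is None:
            bins.append([s])
            used.append(s)
        else:
            bins[best].append(s)
            used[best] += s
    return bins


def next_fit(sizes, capacity=1.0):
    """Fill one bin at a time; earlier bins are never revisited."""
    bins = []
    room = 0                                  # free space in the current bin
    for s in sizes:
        if not bins or s > room + 1e-9:
            bins.append([s])                  # start a new current bin
            room = capacity - s
        else:
            bins[-1].append(s)
            room -= s
    return bins


items = [5, 6, 4, 4]
print(best_fit(items, capacity=10))   # [[5, 4], [6, 4]]
print(next_fit(items, capacity=10))   # [[5], [6, 4], [4]]
```

Next fit only remembers the current bin, so it runs in linear time but can waste space that best fit or first fit would still use.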

GRAPH COLORING
Graph coloring is the procedure of assigning colors to each vertex of a graph
G such that no two adjacent vertices get the same color. The objective is to
minimize the number of colors while coloring a graph. The smallest number of
colors required to color a graph G is called the chromatic number of that graph.
The graph coloring problem is an NP-Complete problem.
Vertex coloring is the most common graph coloring problem. The problem is,
given m colors, find a way of coloring the vertices of a graph such that no two
adjacent vertices are colored using same color. The other graph coloring
problems like Edge Coloring (No vertex is incident to two edges of same color)
and Face Coloring (Geographical Map Coloring) can be transformed into vertex
coloring.

Chromatic Number: The smallest number of colors needed to color a graph G is
called its chromatic number. For example, the following graph can be colored
with a minimum of 2 colors.

Sequential coloring Algorithm


It always colors the next vertex, say vi, with the minimum acceptable color.


Algorithm:
Input : G(V,E), an undirected graph where V = {v1, v2, v3, ..., vn}
Output: A coloring of G
SeqColor(V,E)
    int c, i;
    for (i = 1; i <= n; i++)
        for (c = 1; c <= n; c++)
            if no vertex adjacent to vi has color c
                color vi with c
                break;   // exit the loop over c
        // continue for c
    // continue for i
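The sequential coloring pseudocode can be written as a short runnable Python sketch. The adjacency-dictionary input and the function name `sequential_coloring` are assumptions made for this illustration.

```python
def sequential_coloring(adjacency, order=None):
    """Color vertices one by one with the smallest color not used by a neighbor.

    adjacency: dict mapping each vertex to an iterable of its neighbors
    order: optional vertex order; defaults to the dict's own order
    Returns a dict vertex -> color (colors are 1, 2, 3, ...).
    """
    colors = {}
    for v in (order if order is not None else adjacency):
        taken = {colors[u] for u in adjacency[v] if u in colors}
        c = 1
        while c in taken:      # smallest acceptable color
            c += 1
        colors[v] = c
    return colors


# A cycle on four vertices needs only two colors.
cycle = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [1, 3]}
print(sequential_coloring(cycle))   # {1: 1, 2: 2, 3: 1, 4: 2}
```

The result depends on the vertex order: the coloring is always valid, but it is not guaranteed to use the minimum number of colors.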

Wigderson’s Graph Coloring Algorithm


• Let G = (V,E), where n is the number of vertices.
• Let v ∈ V. The neighborhood of v, denoted N(v), is the set of vertices adjacent
to v.
• The subgraph induced by N(v) is denoted as H(v).
• The key idea in the algorithm is that neighbors of vertices with high
degree are colored first.
Algorithm:
Color3(G)
{
    int c;
    c = 1;
    while (∆(G) ≥ √n)
    {
        Let v be a vertex in G of maximum degree
        // H(v) is 2-colorable when G is 3-colorable
        Color H(v) with colors c and c+1
        Color v with the color c+2
        Delete v and H(v) from G, and delete all edges incident upon the
        deleted vertices
        c = c+2;
    }
    // now ∆(G) < √n
    Use sequential coloring (SC) to color G, beginning with color c.
}
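The Color3 procedure can be sketched in Python as follows. This is a sketch under stated assumptions: the input is an adjacency dictionary of a 3-colorable graph (so every neighborhood is 2-colorable), the helper `two_color` uses breadth-first search for the 2-coloring step, and the names `two_color` and `wigderson_coloring` are illustrative.

```python
import math
from collections import deque


def two_color(nodes, adjacency, c):
    """2-color the subgraph induced by nodes with colors c and c+1 using BFS.
    Raises ValueError if that subgraph is not 2-colorable."""
    colors = {}
    for start in nodes:
        if start in colors:
            continue
        colors[start] = c
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adjacency[u]:
                if w not in nodes:
                    continue
                if w not in colors:
                    colors[w] = c + 1 if colors[u] == c else c
                    queue.append(w)
                elif colors[w] == colors[u]:
                    raise ValueError("neighborhood is not 2-colorable")
    return colors


def wigderson_coloring(adjacency):
    """Color a 3-colorable graph following the Color3 outline."""
    adj = {v: set(ns) for v, ns in adjacency.items()}   # working copy of G
    n = len(adj)
    colors = {}
    c = 1
    while adj:
        v = max(adj, key=lambda x: len(adj[x]))         # vertex of maximum degree
        if len(adj[v]) < math.sqrt(n):
            break                                       # now max degree < sqrt(n)
        neighbours = set(adj[v])
        colors.update(two_color(neighbours, adj, c))    # H(v) gets colors c, c+1
        colors[v] = c + 2
        removed = neighbours | {v}
        for u in removed:                               # delete v and H(v) from G
            del adj[u]
        for u in adj:
            adj[u] -= removed
        c += 2
    # Sequential coloring of what is left, beginning with color c.
    for v in adj:
        taken = {colors[u] for u in adj[v] if u in colors}
        k = c
        while k in taken:
            k += 1
        colors[v] = k
    return colors
```

Reusing color c+2 as the first color of the next round is safe because v is deleted together with all of its neighbors, so no remaining vertex is adjacent to v.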

Example: Graph G with n = 13 vertices; the degree of v is 6 ≥ √13


Color H(v) with c, c+1

V is assigned color 3.
c) G with H(v) and V removed. ∆(G)=2<√13

Sequential coloring


RANDOMIZED ALGORITHMS (DEFINITIONS OF MONTE CARLO AND LAS VEGAS ALGORITHMS)
• An algorithm that uses random numbers to decide what to do next
anywhere in its logic is called a Randomized Algorithm.
• For example, in Randomized Quick Sort, we use a random number to pick
the next pivot (or we randomly shuffle the array).
• Randomized algorithms are classified into two categories:
   • Monte Carlo Algorithms
   • Las Vegas Algorithms
Monte Carlo Algorithms
• In decision problems, the answer is either “yes” or “no” and an
approximate answer is meaningless for these problems. The probabilistic
algorithms used to solve these problems are called Monte Carlo
Algorithms.
• A Monte Carlo algorithm always finds an answer, but the answer may not
be correct.
• It produces a correct or optimum result with some probability.
• Some randomized algorithms have deterministic time complexity. Such
algorithms are called Monte Carlo Algorithms, and it is generally easier to
find out their worst case time complexity.
Las Vegas Algorithms
• On the other hand, the time complexity of other randomized algorithms (other
than Monte Carlo) is dependent on the value of a random variable. Such
randomized algorithms are called Las Vegas Algorithms.
• These algorithms always produce a correct or optimum result: they may or
may not find an answer, but whenever they find an answer, the answer is
correct.
• The time complexity of these algorithms is based on a random value and
is evaluated as an expected value. These algorithms are typically analysed
for the expected worst case.
• To compute the expected time taken in the worst case, all possible values of
the random variable used need to be considered, and the time taken for every
possible value needs to be evaluated.
• The average of all evaluated times is the expected worst case time complexity.
• For example, Randomized QuickSort always sorts an input array,
and the expected worst case time complexity of QuickSort is O(n log n).
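The difference between the two categories can be illustrated with a small search task. The sketch below is illustrative only: the task (looking for a 1 in a list of bits), the function names and the `max_probes` parameter are assumptions chosen for the example.

```python
import random


def las_vegas_find_one(bits, rng):
    """Las Vegas style: probe random positions until a 1 is found.
    The returned index is always correct; only the running time is random.
    Assumes bits contains at least one 1, otherwise the loop never ends."""
    while True:
        i = rng.randrange(len(bits))
        if bits[i] == 1:
            return i


def monte_carlo_contains_one(bits, rng, max_probes):
    """Monte Carlo style: answer the decision problem "does bits contain a 1?"
    after at most max_probes random probes. The running time is bounded,
    but a "no" answer (False) may be wrong."""
    for _ in range(max_probes):
        if bits[rng.randrange(len(bits))] == 1:
            return True
    return False


rng = random.Random(42)          # seeded generator, so runs are repeatable
bits = [0, 1] * 8
index = las_vegas_find_one(bits, rng)
answer = monte_carlo_contains_one(bits, rng, max_probes=5)
```

With half of the entries equal to 1, the Las Vegas version needs two probes on average, and the Monte Carlo version answers wrongly with probability (1/2)^5 when five probes are allowed.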

RANDOMIZED VERSION OF QUICK SORT ALGORITHM WITH ANALYSIS
In the randomised quick sort algorithm, the pivot element is picked at random, so
that each element in the array has an equal probability of being picked as the
pivot element. If a uniform random number is used for selecting the pivot
element, then it is expected that the partitioning of the input array will be well
balanced on average.
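A minimal Python sketch of randomized quick sort is given below. It assumes an in-place sort with a Lomuto-style partition; the function name `randomized_quicksort` and the optional `rng` parameter are illustrative choices, not taken from the original notes.

```python
import random


def randomized_quicksort(a, lo=0, hi=None, rng=random):
    """Sort the list a in place, choosing each pivot uniformly at random."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    p = rng.randint(lo, hi)            # every index in [lo, hi] is equally likely
    a[p], a[hi] = a[hi], a[p]          # move the pivot to the end
    pivot = a[hi]
    i = lo
    for j in range(lo, hi):            # partition around the pivot
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]          # pivot lands at its final position i
    randomized_quicksort(a, lo, i - 1, rng)
    randomized_quicksort(a, i + 1, hi, rng)


data = [5, 3, 8, 1, 9, 2, 3]
randomized_quicksort(data)
print(data)   # [1, 2, 3, 3, 5, 8, 9]
```

Because the pivot is chosen at random, no fixed input forces the unbalanced O(n^2) behaviour every time; the expected number of comparisons is about 2n ln n, which is O(n log n).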
