13.1 Introduction
So far in the previous units we have studied many algorithms and learnt
how they play a significant role in solving a range of problems. However, it is
not possible to solve all problems using algorithms: the power of algorithms
is limited.
The reasons for these limitations are:
Some problems cannot be solved by any algorithm within polynomial time.
Even when a problem can be solved within polynomial time, the efficiency
of the algorithm may be no better than its lower bound.
This unit covers the limitations of algorithm power with respect to lower–
bound arguments of algorithms. It explains decision trees with examples. It
also analyzes P, NP and NP–complete problems.
Objectives:
After studying this unit you should be able to:
explain the lower–bound arguments of algorithms
describe and implement decision trees
define P, NP and NP–complete problems
This problem requires 2n − 2 bits of information to solve: all keys except
one must be shown to be false, and all keys except one must be shown to be
true. Comparing the elements in pairs gives n/2 comparisons and n value
bits, and we need n − 2 additional bits, since each element must be
compared at least once. For the lower bound we therefore require a total of
3n/2 − 2 comparisons. To find the upper bound we group the true condition
and the false condition separately and find their maximum; this also works
out to a total of 3n/2 − 2 comparisons.
The above example demonstrates the adversary method of obtaining a
lower bound: we count the comparisons that an adversary can force any
algorithm to make, and this count is the lower bound.
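The pairing idea behind the ⌈3n/2⌉ − 2 count also appears in the classic problem of finding both the minimum and maximum of n elements. The sketch below is our own illustration of that counting argument, not an algorithm taken from this unit:

```python
def min_max(a):
    """Find min and max of a non-empty list with ceil(3n/2) - 2 comparisons,
    using the pairing trick: compare elements in pairs, then compare the
    smaller against the running min and the larger against the running max."""
    n = len(a)
    comparisons = 0
    if n % 2:                       # odd length: seed with the first element
        lo = hi = a[0]
        start = 1
    else:                           # even length: seed with the first pair
        comparisons += 1
        lo, hi = (a[0], a[1]) if a[0] < a[1] else (a[1], a[0])
        start = 2
    for i in range(start, n, 2):    # 3 comparisons per remaining pair
        comparisons += 1
        small, big = (a[i], a[i + 1]) if a[i] < a[i + 1] else (a[i + 1], a[i])
        comparisons += 1
        if small < lo:
            lo = small
        comparisons += 1
        if big > hi:
            hi = big
    return lo, hi, comparisons

print(min_max([4, 1, 7, 3, 9, 2, 8, 5, 6, 0]))  # (0, 9, 13), i.e. 3*10/2 - 2
```

For even n this gives exactly 3n/2 − 2 comparisons, matching the bound discussed above.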
Let us next analyze the method of problem reduction which is used to find
the lower-bound of an algorithm.
13.2.4 Problem reduction
In this method a given problem A is reduced to another problem B that can
be solved with a known algorithm. We can use the same reduction idea to
find the lower bound of an algorithm. If we have a problem A that is at least
as hard as a problem B whose lower bound is known, we reduce B to A so
that any algorithm solving A would also solve B. The lower bound for B will
then also be a lower bound for A.
Equation 13.1 gives the number of comparisons needed by any
comparison-based sorting algorithm on an input list of n elements. Merge
sort makes about this many comparisons in its worst case. This means that
the lower bound of about n log2 n is tight and cannot be improved
substantially. It can be shown, however, that the lower bound log2 n! can be
improved slightly for some values of n.
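This tightness can be checked numerically. The sketch below is an illustration we add here; it uses the standard worst-case comparison recurrence for merge sort, C(n) = C(⌊n/2⌋) + C(⌈n/2⌉) + n − 1, and compares it with ⌈log2 n!⌉:

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def merge_sort_worst(n):
    """Worst-case comparisons of merge sort:
    C(n) = C(floor(n/2)) + C(ceil(n/2)) + n - 1, with C(0) = C(1) = 0."""
    if n <= 1:
        return 0
    return merge_sort_worst(n // 2) + merge_sort_worst(n - n // 2) + n - 1

def sorting_lower_bound(n):
    """Information-theoretic lower bound ceil(log2 n!) for comparison sorting."""
    return math.ceil(math.log2(math.factorial(n)))

for n in (4, 5, 10, 20):
    print(n, sorting_lower_bound(n), merge_sort_worst(n))
```

For n = 4 the two values coincide (5 comparisons), and for larger n merge sort stays within a small margin of the bound, which is what "tight" means here.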
We can use the concept of decision trees for analyzing the average case of
comparison based sorting algorithms. We calculate the average number of
comparisons for an algorithm based on the average depth of its decision
tree. For example let us consider the insertion sort for three elements.
Figure 13.2 depicts the decision tree for a three element insertion sort.
The lower bound of the average number of comparisons Cavg for any
comparison-based sorting algorithm is given as
Cavg (n) ≥ log2 n!
According to equation 13.1 the lower bound for the worst case is about
n log2 n, so the lower bounds for the average and worst cases are almost
the same. These lower bounds are obtained by maximizing the number of
comparisons made in the average and worst cases. Nevertheless, for
sorting algorithms the average-case efficiencies are better than their
worst-case efficiencies.
Sikkim Manipal University B1480 Page No. 283
Analysis and Design of Algorithms Unit 13
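The three-element insertion sort just mentioned can be checked directly by counting comparisons over all 3! input orderings. This sketch is our own addition and assumes the usual early-exit insertion sort:

```python
import itertools
import math

def insertion_sort_comparisons(a):
    """Sort a copy of a with insertion sort, counting key comparisons
    (the algorithm stops scanning as soon as the key's place is found)."""
    a = list(a)
    count = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            count += 1              # one comparison of key with a[j]
            if a[j] <= key:
                break
            a[j + 1] = a[j]         # shift the larger element right
            j -= 1
        a[j + 1] = key
    return count

perms = list(itertools.permutations([1, 2, 3]))
avg = sum(insertion_sort_comparisons(p) for p in perms) / len(perms)
print(avg, math.log2(math.factorial(3)))  # 2.666..., 2.585...
```

The average depth of the decision tree, about 2.67 comparisons, indeed lies just above the lower bound log2 3! ≈ 2.585.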
13.3.2 Decision trees for searching a sorted array
Let us now see how decision trees can be used for obtaining the lower
bound in searching a sorted array of n elements A [0] < A [1] < …< A [n-1].
The basic algorithm for this problem is binary search. Cworst gives the
number of comparisons made in the worst case; equation 13.2 gives it for
the problem of searching a sorted array.
Cworst (n) = ⌊log2 n⌋ + 1
= ⌈log2 (n + 1)⌉ Eq: 13.2
Now let us use a decision tree to establish whether this is the least possible
number of comparisons.
Here we consider the three-way comparison in which the search element
‘key’ is compared with some element x to check whether key < x, key = x or
key > x. Figure 13.3 shows the decision tree for the case n = 5. Suppose
the elements of the array are 1, 2, 3, 6 and 9. We start the comparison with
the middle element, 3. The internal nodes of the tree signify the elements of
the array that are compared with the search element ‘key’; the leaves signify
whether the search is successful or unsuccessful.
For an array of n elements, such decision trees have 2n + 1 leaves: n for
successful searches and n + 1 for unsuccessful searches. Since the least
height h of a ternary decision tree with l leaves is ⌈log3 l⌉, equation 13.3
gives the lower bound for this problem in terms of the number of worst-case
comparisons.
Cworst (n) ≥ ⌈log3 (2n + 1)⌉ Eq: 13.3
This lower bound is smaller than the number of worst-case comparisons
made by binary search, at least for larger values of n.
Figure 13.3: Decision Tree for Binary Search in a Five Element Array
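The gap between the two bounds in equations 13.2 and 13.3 can be tabulated numerically. A small sketch, added here for illustration:

```python
import math

def binary_search_worst(n):
    """Worst-case three-way comparisons of binary search: floor(log2 n) + 1."""
    return math.floor(math.log2(n)) + 1

def ternary_lower_bound(n):
    """Decision-tree lower bound ceil(log3(2n + 1)) from equation 13.3."""
    return math.ceil(math.log(2 * n + 1, 3))

for n in (5, 100, 1000):
    print(n, ternary_lower_bound(n), binary_search_worst(n))
```

For n = 5 the two bounds coincide at 3 comparisons, but for n = 1000 the decision-tree bound gives only 7 while binary search needs 10, illustrating the remark that the ternary bound is smaller for larger n.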
Activity 1
Draw a decision tree, to sort three numbers using selection sort
algorithm.
2) The second group consists of those problems that can be solved in
non-deterministic polynomial time. For example, the knapsack problem
and the traveling salesperson problem can be solved in non-deterministic
polynomial time.
Definition of P – Stands for polynomial time.
Definition of NP – Stands for non-deterministic polynomial time.
The NP class problems are further divided into NP–complete and NP–hard
problems.
The following reasons justify the restriction of P to decision problems:
1) First, it is reasonable to eliminate problems that cannot be solved in
polynomial time simply because their output is exponentially large. For
example, consider generating all subsets of a given set, or all
permutations of n distinct items: the size of the output alone shows that
this cannot be done in polynomial time.
2) Secondly, we can reduce some problems that are not decision problems
to a sequence of decision problems that are easier to study. For
example, consider graph colouring. Instead of asking for the minimum
number of colours required to colour the vertices of a graph so that no
two adjacent vertices get the same colour, we can ask whether there is a
colouring of the graph’s vertices with no more than m colours.
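Although finding the chromatic number is hard, checking a proposed colouring against the decision version above takes only polynomial time. A sketch we add for illustration (the function names are our own):

```python
def is_valid_coloring(edges, coloring, m):
    """Verify in O(V + E) time that coloring uses at most m colors and
    that no edge joins two vertices of the same color."""
    if len(set(coloring.values())) > m:
        return False
    return all(coloring[u] != coloring[v] for u, v in edges)

# A 4-cycle is 2-colourable; a triangle is not.
square = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(is_valid_coloring(square, {0: "r", 1: "b", 2: "r", 3: "b"}, 2))  # True
```

The verification touches each edge once, which is what makes the decision version convenient to study.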
Not all decision problems can be solved in polynomial time. Some decision
problems cannot be solved by any algorithm at all; such problems are called
undecidable.
There are many problems for which no polynomial-time algorithm is known.
Let us see some examples of such problems:
Hamiltonian circuit – A Hamiltonian circuit (or Hamiltonian cycle) is
defined as a circuit in a graph G that starts and ends at the same vertex and
includes every vertex of G exactly once. A graph G that contains a
Hamiltonian cycle is called a Hamiltonian graph.
Traveling salesman – Given a set of cities and the distances between
them, this problem asks for the shortest route starting from a given city,
passing through all the other cities and returning to the first city.
Sikkim Manipal University B1480 Page No. 286
Analysis and Design of Algorithms Unit 13
Knapsack problem – Given a set of items, each with a weight and a
value, this problem determines the subset of items of maximum total value
whose total weight does not exceed a given capacity.
Partition problem – This determines whether it is possible to partition the
given set of integers into two subsets that have the same sum.
Bin packing – Bin packing is a hard problem whose goal is to pack a
given number of objects into the minimum number of fixed-size bins.
Graph coloring – This finds the chromatic number of the given graph.
Integer linear programming – This finds the maximum or minimum value
of a linear function of several integer-valued variables subject to a finite
set of constraints.
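To make the difficulty concrete, the partition problem above can be decided by brute force over all subsets, which takes time exponential in the number of integers. The sketch is our own illustration, not an algorithm from this unit:

```python
from itertools import combinations

def can_partition(nums):
    """Decide the partition problem by examining every subset: O(2^n) time.
    Returns True iff some subset sums to exactly half of the total."""
    total = sum(nums)
    if total % 2:                       # an odd total can never split evenly
        return False
    target = total // 2
    for r in range(len(nums) + 1):
        for subset in combinations(nums, r):
            if sum(subset) == target:
                return True
    return False

print(can_partition([1, 5, 11, 5]))   # True: {1, 5, 5} and {11}
print(can_partition([1, 2, 5]))       # False
```

The 2^n subsets examined in the worst case are exactly the exponential blow-up that rules this approach out for large inputs.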
Another common characteristic of these decision problems is that solving
them can be computationally difficult, whereas checking whether a
proposed solution actually solves the problem is easy. For example,
consider the Hamiltonian circuit: it is easy to check whether a proposed list
of vertices for a graph with n vertices is a Hamiltonian circuit. We just have
to check that the list contains n + 1 vertices of the graph, all distinct except
that the first and last coincide, and that every pair of consecutive vertices is
joined by an edge.
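The verification step just described can be sketched as follows (the function and parameter names are our own):

```python
def is_hamiltonian_circuit(n, edges, path):
    """Verify in polynomial time that path is a Hamiltonian circuit of the
    graph with vertices 0..n-1 and the given undirected edges."""
    adjacent = {frozenset(e) for e in edges}
    if len(path) != n + 1 or path[0] != path[-1]:
        return False                      # must list n + 1 vertices, closed
    if set(path) != set(range(n)) or len(set(path[:-1])) != n:
        return False                      # every vertex exactly once
    return all(frozenset((u, v)) in adjacent
               for u, v in zip(path, path[1:]))

# Cycle graph 0-1-2-3-0
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(is_hamiltonian_circuit(4, edges, [0, 1, 2, 3, 0]))  # True
print(is_hamiltonian_circuit(4, edges, [0, 2, 1, 3, 0]))  # False: 0-2 missing
```

Each check runs in time polynomial in the size of the graph, in contrast with the difficulty of finding such a circuit.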
Let us first discuss Non-deterministic algorithms.
13.4.1 Non–deterministic algorithms
An algorithm in which every operation has a uniquely defined outcome is
called a deterministic algorithm.
An algorithm in which an operation may not have a unique outcome, but
instead a specified set of possible outcomes, is called a non-deterministic
algorithm.
A non-deterministic algorithm is a two-stage algorithm. The two stages are
as follows:
Non-deterministic stage – This is the guessing stage: a random string is
generated, which can be thought of as a candidate solution to the given
instance.
Deterministic stage – This is the verification stage. It takes both the
candidate solution and the instance as input and returns ‘yes’ if the
candidate solution represents an actual solution for the instance.
We see that the above algorithm nondetermine has the following three
functions:
1) choose – randomly chooses one of the elements from the given input
2) success – signifies successful completion
3) fail – signifies unsuccessful completion
The algorithm has non-deterministic complexity O(1). For an unordered
array A, a deterministic search algorithm has complexity O(n).
We say that a non-deterministic algorithm solves a decision problem if, for
every ‘yes’ instance of the problem, it returns ‘yes’ on some execution. If the
efficiency of the non-deterministic algorithm’s verification stage is
polynomial, it is said to be a non-deterministic polynomial algorithm.
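The algorithm nondetermine referred to above is not reproduced in this excerpt. A common way to write non-deterministic search, with choose simulated deterministically by examining every possible choice, might look like this (a sketch under that assumption):

```python
def nondeterministic_search(A, key):
    """Simulate nondetermine: choose an index j, succeed if A[j] == key.
    The non-deterministic algorithm answers 'success' iff SOME choice
    succeeds, so the simulation simply tries every choice."""
    for j in range(len(A)):          # simulates choose(0 .. n-1)
        if A[j] == key:
            return "success"         # some execution answers 'yes'
    return "fail"                    # every execution fails

print(nondeterministic_search([4, 9, 2, 7], 2))  # success
print(nondeterministic_search([4, 9, 2, 7], 5))  # fail
```

The non-deterministic machine does all choices "at once" in O(1); the deterministic simulation pays for them one by one, which is where the O(n) cost comes from.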
Let us next discuss NP-hard and NP-complete classes.
13.4.2 NP–hard and NP–complete classes
We know that NP stands for non-deterministic polynomial. These are the
problems that can be solved using non-deterministic polynomial-time
algorithms. The NP class problems are further classified into two groups:
1) NP–complete
2) NP–hard
NP–complete problems
NP–complete problems are problems that belong to class NP, i.e. they are
a subset of class NP. A problem Q is said to be NP–complete if,
1) Q belongs to the class NP.
2) All other problems in class NP can be reduced to Q in polynomial time.
This implies that NP–complete problems are tough to solve within
polynomial time. If we are able to solve NP–complete problems in
polynomial time, then we can solve all other problems in class NP within
polynomial time.
NP–hard problems
NP-hard problems are similar to, but at least as difficult as, NP–complete
problems. All problems in class NP can be reduced to an NP–hard problem
in polynomial time, but an NP–hard problem need not itself belong to NP. If
an NP–hard problem can be solved in polynomial time, then all
NP–complete problems can also be solved in polynomial time.
Note that we must have m ≤ P(n), where m is the length of the certificate.
This is because with a problem instance of length n the computation is
completed in at most P(n) steps. During this process, the Turing machine
head cannot move more than P(n) squares to the left of its starting point.
We define some propositions with their intended interpretations as follows:
1) For i = 0, 1, …, P(n) and j = 0, 1, …, q − 1, the proposition Qij
indicates that after i computation steps, M is in state j.
2) For i = 0, 1, …, P(n), j = −P(n), …, P(n), and k = 1, 2, …, s, the
proposition Sijk indicates that after i computation steps, square j of the
tape contains the symbol ak.
3) For i = 0, 1, …, P(n) and j = −P(n), …, P(n), the proposition Tij
indicates that after i computation steps, the machine M is scanning
square j of the tape.
Now, we define some clauses to describe the computation executed by M:
1) In each computation step, M is in at least one state. For each
i = 0, …, P(n) we have the clause
Qi0 ∨ Qi1 ∨ … ∨ Qi(q−1);
which gives (P(n) + 1)q = O(P(n)) literals altogether.
2) In each computation step, M is in at most one state. For each
i = 0, …, P(n) and for each pair j, k of different states, we have the
clause
¬(Qij ∧ Qik);
which gives a total of q(q − 1)(P(n) + 1) = O(P(n)) literals.
3) In each step, each tape square contains at least one alphabet symbol.
For each i = 0, …, P(n) and −P(n) ≤ j ≤ P(n) we have the clause
Sij1 ∨ Sij2 ∨ … ∨ Sijs;
which gives (P(n) + 1)(2P(n) + 1)s = O(P(n)²) literals.
4) In each step, each tape square contains at most one alphabet symbol.
For each i = 0, …, P(n) and −P(n) ≤ j ≤ P(n), and each distinct pair
ak, al of symbols, we have the clause
¬(Sijk ∧ Sijl);
which gives a total of (P(n) + 1)(2P(n) + 1)s(s − 1) = O(P(n)²) literals
altogether.
5) In each step, the machine scans at least one square. For each i = 0,
…, P(n), we have the clause
Ti(−P(n)) ∨ Ti(1−P(n)) ∨ … ∨ Ti(P(n)−1) ∨ TiP(n);
which gives (P(n) + 1)(2P(n) + 1) = O(P(n)²) literals.
6) In each step, the machine scans at most one square. For each i = 0,
…, P(n), and each distinct pair j, k of tape squares from −P(n)
to P(n), we have the clause
¬(Tij ∧ Tik);
which gives a total of 2P(n)(2P(n) + 1)(P(n) + 1) = O(P(n)³) literals.
7) Initially, the machine is in state 1 scanning square 1. This is
expressed by the two clauses
Q01; T01;
giving just two literals.
8) The configuration at each step after the first is determined from the
configuration of the previous step by the functions T, U and D
defining the machine M. For each i = 0, …, P(n), −P(n) ≤ j ≤ P(n),
k = 0, …, q − 1, and l = 1, …, s, we have the clauses
Tij ∧ Qik ∧ Sijl → Q(i+1)T(k,l)
Tij ∧ Qik ∧ Sijl → S(i+1)jU(k,l)
Tij ∧ Qik ∧ Sijl → T(i+1)(j+D(k,l))
Sijk → Tij ∨ S(i+1)jk
The fourth of these clauses ensures that the contents of any tape
square other than the currently scanned square remain the same (to
see this, note that the given clause is equivalent to the formula
Sijk ∧ ¬Tij → S(i+1)jk). These clauses contribute a total of
(12s + 3)(P(n) + 1)(2P(n) + 1)q = O(P(n)²) literals.
9) Initially, the string ai1, ai2, …, ain defining the problem instance I
is inscribed on squares 1, 2, …, n of the tape. This is expressed by
the n clauses
S01i1; S02i2; …; S0nin;
a total of n literals.
10) By the P(n)th step, the machine has arrived at the stop state and is
scanning square 0, which contains the symbol a1. This is expressed
by the three clauses
QP(n)0; SP(n)01; TP(n)0;
giving another three literals.
On the whole, the number of literals involved in these clauses is O(P(n)³).
Note that q and s are constants that depend only on the machine and not
on the problem instance, so they do not contribute to the growth of the
number of literals with increasing problem size, which is what the O
notation captures. The procedure for setting up these clauses, given the
original machine M and the instance I of problem D, can be carried out in
polynomial time.
We now show that D has indeed been converted into SAT. Suppose I is a
positive instance of problem D. This means there is a certificate c such that
M, run with inputs c and I, halts scanning symbol a1 on square 0. This
implies there is some sequence of symbols that can be placed initially on
squares −P(n), …, −1 of the tape so that all the above clauses are
satisfied.
Conversely, if I is a negative instance of problem D then there is no
certificate for I, which means that whatever symbols are placed on squares
−P(n), …, −1 of the tape, the machine will not be scanning a1 on square 0
when the computation halts. The set of clauses above therefore cannot all
be satisfied, and thus forms a negative instance of SAT.
We can therefore conclude: from an instance I of problem D we can
construct, in polynomial time, a set of clauses that forms a positive instance
of SAT if and only if I is a positive instance of D. In other words, problem D
is converted into SAT in polynomial time. Since D was an arbitrary NP
problem, it follows that any NP problem can be converted to SAT in
polynomial time.
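While Cook's construction shows every NP problem converts to SAT, checking a proposed truth assignment against a CNF formula is itself straightforwardly polynomial. A small sketch we add for illustration (the clause encoding is our own):

```python
def satisfies(clauses, assignment):
    """Check a truth assignment against a CNF formula in time linear in the
    total number of literals. Each clause is a list of non-zero ints:
    +v stands for variable v, -v for its negation."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

# (x1 or not x2) and (x2 or x3)
cnf = [[1, -2], [2, 3]]
print(satisfies(cnf, {1: False, 2: False, 3: True}))   # True
print(satisfies(cnf, {1: False, 2: True, 3: False}))   # False
```

This easy verification, combined with the polynomial-time conversion above, is exactly what makes SAT the canonical NP-complete problem.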
Activity 2
Find some examples of NP–hard and NP–complete problems from the
Internet and analyze how they are solved.
13.5 Summary
Let us summarize what we have discussed in this unit.
The limitations of algorithm power include lower-bound arguments, decision
trees and P, NP and NP-complete problems.
Lower-bound arguments include different ways of obtaining lower bounds,
such as trivial lower bounds, information-theoretic arguments, adversary
arguments and problem reduction.
Decision trees are used for sorting and searching algorithms which have to
compare their input elements.
We have also analyzed P, NP and NP–complete problems. We have also
discussed the proof for Cook’s theorem.
13.6 Glossary
Node – A basic unit used to build linked data structures such as trees,
linked lists, and computer-based representations of graphs.
Polynomial time – An algorithm runs in polynomial time if its running time
is bounded above by a polynomial in the size of its input.
Turing machine – A theoretical machine that manipulates symbols
contained on a strip of tape.
Chromatic number – The minimum number of colors needed for the
vertices of a given graph such that no two adjacent vertices have the same
color.
13.8 Answers
Self Assessment Questions
1. Lower – bound
2. True
3. Information – theoretic
4. Sorting
5. True
6. Ternary
7. Tractable
8. Graph coloring
9. Deterministic
Terminal Questions
1. Refer section 13.2 – Lower bound arguments
2. Refer section 13.2.1 – Trivial lower bound arguments
3. Refer section 13.3.1 – Decision tree for sorting algorithm
4. Refer section 13.4.1 – Non-deterministic algorithms
5. Refer section 13.4.4 – Cook’s theorem
References
Puntambekar, A. A. (2008). Design and Analysis of Algorithms. Technical
Publications, Pune.
Levitin, Anany (2009). Introduction to the Design and Analysis of
Algorithms. Dorling Kindersley, India.
E-References
http://benchoi.info/Bens/Teaching/Development/Algorithm/PowerPoint/CH05.ppt
http://cs.baylor.edu/~maurer/aida/courses/adversar.ppt
www.inf.ed.ac.uk/teaching/courses/propm/papers/Cook.pdf
http://www.vidyasagar.ac.in/journal/maths/Vol12/JPS12-23.pdf