
Algorithm Design and Complexity

Course 6
Overview
 Classes of Problems
 P vs NP
 Polynomial Reduction
 NP-hard and NP-complete
 Backtracking
 n-Queens Problem
 Graph Coloring Problem
Classes of Problems
 Complexity of an algorithm
  Θ notation
 Used to compare algorithms in order to determine which one
is better

 Complexity of a problem?
 We would like to know how difficult a problem is in order to
know what solution to look for
 Used to compare problems in order to determine which one
is more difficult, regardless of the algorithm used to solve it!
 In other words, by taking into consideration the best
algorithms we could devise to solve the problems
Classes of Problems (2)
  We are able to define classes of problems, but not as many as there are classes of algorithms

  P = any problem for which we can find a correct solution in polynomial time
  We can solve the problem in polynomial time
  There exists at least one such solution/algorithm
  The problem is called tractable

  NP = any problem for which we can verify that a solution is correct in polynomial time
  We may not be able to solve the problem in polynomial time

  Polynomial time = O(n^k), where k is a constant
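To make "verify that a solution is correct in polynomial time" concrete, here is a minimal sketch of a verifier for the clique problem (is there a set of k vertices that are all pairwise connected?). The function name `verify_clique` and the edge-list representation are assumptions made for this illustration. The verifier only checks a proposed solution; it does not search for one.

```python
from itertools import combinations

def verify_clique(edges, candidate, k):
    """Return True if `candidate` is a set of k vertices that are pairwise adjacent."""
    edge_set = {frozenset(e) for e in edges}
    vertices = set(candidate)
    if len(vertices) != k:
        return False
    # At most k*(k-1)/2 pairs are checked, so the test runs in polynomial time.
    return all(frozenset(pair) in edge_set
               for pair in combinations(vertices, 2))

edges = [(1, 2), (1, 3), (2, 3), (3, 4)]
print(verify_clique(edges, [1, 2, 3], 3))  # True
print(verify_clique(edges, [2, 3, 4], 3))  # False: (2, 4) is not an edge
```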


P vs NP
 Therefore, P and NP are classes of problems

  Any problem in P is also in NP
  If we can find a solution in polynomial time, we can surely verify that a solution is correct in polynomial time
  P ⊆ NP

  Not every problem in NP is in P?
  It seems easier to verify whether a “guessed” solution is correct than to compute this solution
  This has not been proved so far!
  NP ⊆ P ???

 It is unknown if P = NP, but most researchers believe that P and NP are not
the same class
 1 million dollar prize for proving that P = NP or P != NP
 http://en.wikipedia.org/wiki/P_versus_NP_problem
P vs NP (2)
 Conclusion: the problems in NP \ P should be more
difficult than the problems in P
  For these problems, we cannot find a solution in polynomial time at this moment in time
  Maybe we shall find one in the future, especially if someone manages to prove that P = NP
 Therefore, at this moment in time we could separate
between problems in:
 P – less difficult
 NP \ P – more difficult
Polynomial Reduction
 Given 2 decision problems, A1 and A2
 Remember: A decision problem is a problem that has only two
possible outputs: yes/no
  We say that problem A1 can be reduced in polynomial time to problem A2 (A1 ≤P A2) if:
  There exists a polynomial time algorithm F that transforms any input data x for A1 into input data F(x) for A2
  A1(x) == yes ⇔ A2(F(x)) == yes
  A1(x) == yes => A2(F(x)) == yes
  A1(x) == no => A2(F(x)) == no

 Image source: http://homepages.ius.edu/rwisman/C455/html/notes/Chapter34/NP-Completeness.htm


Polynomial Reduction (2)
 Thus, we can use any solution to A2 to solve A1
 If we have a solution for A2, we also have a solution for
A1

 If A1 P A2, then we say that problem A1 is easier or at


most as difficult as A2

 This looks a bit strange, doesn’t it ?


 From the algorithm complexity point of view
 See next slide why
Polynomial Reduction (3)
SolveA1(x)
    x2 = F(x)
    RETURN SolveA2(x2)

 From the algorithm point of view, it seems that the algorithm for
solving A1 has a greater complexity than the one for solving A2
 Complexity(SolveA1) = Complexity(F) + Complexity(SolveA2)
  However, we are interested in the classes of problems point of view:
  We can use the best algorithm for solving A2 and then use F to solve A1
  If A2 ∈ P => A1 ∈ P
  If A2 ∈ NP => A1 ∈ P or NP (If A2 ∈ NP \ P => A1 ∈ P or NP \ P)
  Therefore, A1 is always easier than, or at most as difficult as, A2
Polynomial Reduction (4)
  Another property of polynomial reduction:
  If A1 ∉ NP => A2 ∉ NP (If A1 ∈ NP \ P => A2 ∉ P)
  Because A2 cannot be easier than A1
  Therefore, we shall use polynomial reduction to show that a problem is at least as difficult as another one w.r.t. classes of problems
NP-hard and NP-complete
  A problem Q is called NP-hard if:
  ∀ Q1 ∈ NP: Q1 ≤P Q
  It is at least as difficult as any problem in NP
  However, Q may not even be in NP! There are problems even more difficult than those in NP! They are NP-hard
  A problem Q is called NP-complete if:
  It is NP-hard and it is in NP
  These are the most difficult problems in NP!
NP-hard and NP-complete (2)
 NP-hard and NP-complete are also classes of problems
 A possible graphical representation is:

 Image source: http://en.wikipedia.org/wiki/NP-complete


NP-complete Problems
  Graph Clique
  Given an undirected graph G(V, E), is there a clique of size k in G?
  Graph Vertex Cover
  Given an undirected graph G(V, E), is there a vertex cover of size k?
  Quick Info:
  A clique is a subset of vertices V’ ⊆ V such that for any two distinct vertices v1, v2 ∈ V’ there is an edge (v1, v2) ∈ E[G]
  A vertex cover is a subset of vertices V’ ⊆ V such that for any edge (v1, v2) ∈ E[G], v1 ∈ V’ and/or v2 ∈ V’
  At least one endpoint of every edge in the graph is covered by V’
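The two problems above give a concrete instance of the SolveA1(x) scheme from the Polynomial Reduction slides. A well-known fact is that G has a clique of size k exactly when the complement graph of G has a vertex cover of size |V| - k. The sketch below uses that fact as the transformation F. All function names are assumptions made for this illustration, and the vertex cover solver is deliberately brute force (exponential); only F itself runs in polynomial time.

```python
from itertools import combinations

def has_vertex_cover(vertices, edges, size):
    """SolveA2 (brute force): is there a vertex cover with `size` vertices?"""
    if size < 0 or size > len(vertices):
        return False
    for subset in combinations(vertices, size):
        chosen = set(subset)
        if all(u in chosen or v in chosen for u, v in edges):
            return True
    return False

def reduce_clique_to_vertex_cover(vertices, edges, k):
    """F: map a clique instance (G, k) to a vertex cover instance
    (complement of G, |V| - k). Runs in O(|V|^2) time."""
    present = {frozenset(e) for e in edges}
    complement = [(u, v) for u, v in combinations(vertices, 2)
                  if frozenset((u, v)) not in present]
    return vertices, complement, len(vertices) - k

def has_clique(vertices, edges, k):
    """SolveA1: transform the input with F, then call the solver for A2."""
    return has_vertex_cover(*reduce_clique_to_vertex_cover(vertices, edges, k))

vertices = [1, 2, 3, 4]
edges = [(1, 2), (1, 3), (2, 3), (3, 4)]
print(has_clique(vertices, edges, 3))  # True: {1, 2, 3}
print(has_clique(vertices, edges, 4))  # False
```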
NP-complete Problems (2)
 Graph Coloring
  N-Queens Completion Problem (extending a partial placement of queens)
 Hamiltonian Cycle
 Travelling Salesman Problem
 Minesweeper
 Task scheduling
 Etc.

  A lot of interesting problems are NP-hard (some of them are NP-complete)
Conclusions
  There are a lot of problems that are very difficult to solve (NP-hard)
  No polynomial time solution is known for them, at least at this moment in time
  We need a method for solving them
  Simple solution: backtracking (with heuristics)


Backtracking
 Useful to solve difficult problems:
 Many optimization problems
 Combinatorial problems
 Problems for which you want to know all the solutions
 NP-complete problems

  Backtracking improves the brute-force “generate and test” solution for a problem
  Generate and test
  Generate all possible solutions
  After a solution is final, test if it is correct
Generate and Test
 Example: k-clique for a graph G(V, E)

 1. Generate all combinations of k vertices


 1.a. Choose a value from V for the 1st vertex in the clique
 1.b. Choose a value from V for the 2nd vertex in the clique
 …
 1.?. Choose a value from V for the kth vertex in the clique

 2. Test if the generated solution is correct


 2.a. If it is correct and you are looking for only 1 solution, then
stop
 2.b. Else continue generating the next solution
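The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and the edge-list representation are hypothetical, and `itertools.combinations` is used for step 1, so each set of k vertices is generated once regardless of order.

```python
from itertools import combinations

def find_clique_generate_and_test(vertices, edges, k):
    """Return the first k-clique found, or None if the graph has none."""
    edge_set = {frozenset(e) for e in edges}
    # Step 1: generate every combination of k vertices.
    for candidate in combinations(vertices, k):
        # Step 2: test the complete candidate only after it is fully built.
        if all(frozenset(pair) in edge_set
               for pair in combinations(candidate, 2)):
            return candidate  # 2.a: one solution is enough, so stop
        # 2.b: otherwise continue with the next candidate
    return None

edges = [(1, 2), (1, 3), (2, 3), (3, 4)]
print(find_clique_generate_and_test([1, 2, 3, 4], edges, 3))  # (1, 2, 3)
print(find_clique_generate_and_test([1, 2, 3, 4], edges, 4))  # None
```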
Alternative View of a Problem
 Most of these problems can be transformed into the following problem:

  We have a set of n variables: V1, …, Vn
  Each of these variables has a specified domain:
  One value from its domain should be assigned to each variable in the final solution
  Vi ∈ Dom(Vi) = Domi[1..ki]; each domain has ki possible values to choose from
  There is a set of constraints that should be respected by the final solution:
  Constraints for a single variable (E.g.: V2 != 3)
  Constraints between two variables (E.g.: V2 != V3 or V1+V3 = 2)
  Other kinds of constraints
  We need to determine, for each variable, a value from its domain such that this instantiation respects all the defined constraints
  Constraint Satisfaction Problem (CSP)
Example – k-clique
  We have k variables: V1, …, Vk – the vertices of the k-clique
 Each variable can take values from all the vertices of the
graph G(Vertices[n], Edges[m])
 Dom(V1) = … = Dom(Vk) = {1, …, n}
 Considering the vertices of the graph are labeled from 1..n
 Constraints:
 Vi != Vj for all 1 <= i < j <= k
  (Vi, Vj) ∈ Edges for all 1 <= i < j <= k
Generate and Test – Revisited
 For the new formulation of the problems

GenerateAndTest(Vars, Domains, Constraints)
    FOR (Vars[1] in Domains[1])
        FOR (Vars[2] in Domains[2])
            …
                FOR (Vars[n] in Domains[n])
                    CheckConstraints(Vars, Constraints)

  Complexity: Θ(k1 * k2 * … * kn) where ki = size(Domains[i])
  If k1 = k2 = … = kn = k => Θ(k^n)
  Exponential complexity
Generate and Test – Recursive
  We can easily write a recursive solution
  Same complexity as the previous algorithm

GenerateAndTestRecursive(Vars[1..n], Domains, Constraints, k)
    IF (k == n + 1)
        CheckConstraints(Vars, Constraints)
    ELSE
        FOR (i = 1; i <= size(Domains[k]); i++)
            Vars[k] = Domains[k][i]
            GenerateAndTestRecursive(Vars[1..n], Domains, Constraints, k+1)

Initial call: GenerateAndTestRecursive(Vars, Domains, Constraints, 1)
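A possible Python rendering of the recursive scheme, offered as a sketch rather than a reference implementation. The names `generate_and_test` and `check_constraints` are assumptions, indices are 0-based instead of 1-based, and the example constraints reuse the ones mentioned earlier (V2 != 3, V2 != V3, V1 + V3 = 2), all required at the same time. The counter shows that every complete assignment is tested.

```python
def generate_and_test(domains, check_constraints):
    """Enumerate every complete assignment; keep those that pass the test."""
    n = len(domains)
    values = [None] * n
    solutions = []
    tested = 0

    def recurse(k):
        nonlocal tested
        if k == n:
            # The assignment is complete: only now are the constraints checked.
            tested += 1
            if check_constraints(values):
                solutions.append(tuple(values))
            return
        for value in domains[k]:
            values[k] = value
            recurse(k + 1)

    recurse(0)
    return solutions, tested

def example_constraints(v):
    # v[0], v[1], v[2] stand for V1, V2, V3
    return v[1] != 3 and v[1] != v[2] and v[0] + v[2] == 2

solutions, tested = generate_and_test([[1, 2, 3]] * 3, example_constraints)
print(solutions)  # [(1, 2, 1)]
print(tested)     # 27 = 3 * 3 * 3 complete assignments
```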


Solution Tree
  Root level: No variable is assigned
  First level: First variable is assigned with all possible values from the domain
  …
  Last level: The last variable is assigned
  Complexity: generated by all the levels in the tree!
  Depends on the height – d
  Depends on the (average) branching factor – b
  O(b^d)
  More details on whiteboard
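One way to justify the bound is to count the nodes level by level. The derivation below assumes, as a simplification, that every node has exactly b children with b >= 2; N(b, d) is a symbol introduced here for the total number of nodes in a tree of height d.

```latex
\[
N(b, d) \;=\; \sum_{i=0}^{d} b^{i} \;=\; \frac{b^{d+1} - 1}{b - 1} \;<\; 2\,b^{d}
\quad\Longrightarrow\quad
N(b, d) = O(b^{d}) \qquad (b \ge 2)
\]
```

The last level alone has b^d nodes, so the leaves dominate the total.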


Problems with G&T
  A correct solution = a solution that is consistent w.r.t. all the constraints that are checked
  Some inconsistencies appear while building the solution
  Why only check for consistency when the solution is final?
  Also check the consistency of partial solutions
  If a partial solution is not consistent, abandon it
  And assign the next value in the domain to the current variable, if any are left
  If no values are left in the domain of the current variable, go back to the previous one and continue
  This is called backtracking
Backtracking
  Improvement of G&T that checks the consistency of the partial solutions
  Thus, the search in the solution tree is pruned
  => the complexity for finding the correct solution is reduced
  We can further improve the search by using heuristics
  We cannot reduce the height of the tree
  But we can reduce the average branching factor of the pruned solution tree
  However, backtracking is usually still O(b^d) in the worst case
  We cannot guarantee that it has a lower complexity
Backtracking – recursive scheme
  We can devise a recursive scheme for most problems solvable using backtracking

BKTRecursive(Vars[1..n], Domains, Constraints, k)
    IF (k == n + 1)
        PrintSolution(Vars)
    ELSE
        // when no next value exists, the index is reset to the first value
        WHILE (ExistsNextValue(Vars[k], Domains[k]))
            Vars[k] = NextValue(Vars[k], Domains[k], Constraints)
            IF (CheckConstraints(Vars, Constraints, k))
                BKTRecursive(Vars[1..n], Domains, Constraints, k+1)

  PrintSolution, ExistsNextValue, NextValue, CheckConstraints are problem and method dependent
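The recursive scheme could look as follows in Python. This is a sketch under assumptions: the names `backtrack` and `is_consistent` are hypothetical, indices are 0-based, ExistsNextValue/NextValue are replaced by a plain loop over the domain, and solutions are collected in a list instead of printed. The example uses three variables with domain {1, 2, 3} and the constraints V2 != 3, V2 != V3 and V1 + V3 = 2.

```python
def backtrack(domains, is_consistent):
    """Return all solutions. is_consistent(values, k) must check variable k
    against the already assigned variables 0..k-1 only."""
    n = len(domains)
    values = [None] * n
    solutions = []

    def recurse(k):
        if k == n:
            # Every prefix was checked, so the complete assignment is consistent.
            solutions.append(tuple(values))
            return
        for value in domains[k]:
            values[k] = value
            if is_consistent(values, k):  # prune inconsistent partial solutions
                recurse(k + 1)

    recurse(0)
    return solutions

def example_is_consistent(v, k):
    # v[0], v[1], v[2] stand for V1, V2, V3
    if k == 1:
        return v[1] != 3
    if k == 2:
        return v[1] != v[2] and v[0] + v[2] == 2
    return True

print(backtrack([[1, 2, 3]] * 3, example_is_consistent))  # [(1, 2, 1)]
```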
Remarks
 Initial call:
 BKTRecursive(Vars, Domains, Constraints, 1)

  Because consistency is verified after choosing each value for a variable in the partial solution, the final solution is also consistent
  Therefore, just print it
  ExistsNextValue, NextValue – method dependent
  Simple to implement for usual backtracking: just iterate through the domain array for each variable until reaching the end
  More complex when using BKT with heuristics
  CheckConstraints, PrintSolution – problem dependent
  Among the few things that change from problem to problem
Backtracking – iterative scheme
  We can also devise an iterative scheme for most problems solvable using backtracking

BKTIterative(Vars[1..n], Domains, Constraints)
    k = 1
    WHILE (k >= 1)    // k == 0 means the first variable has no values left: stop
        IF (k == n + 1)
            PrintSolution(Vars)
            k--
            CONTINUE

        found = false
        // when no next value exists, the index is reset to the first value
        WHILE (ExistsNextValue(Vars[k], Domains[k]))
            Vars[k] = NextValue(Vars[k], Domains[k], Constraints)
            IF (CheckConstraints(Vars, Constraints, k))
                found = true
                BREAK
        IF (found)
            k++    // consistent value found: move on to the next variable
        ELSE
            k--    // domain exhausted: go back to the previous variable
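A Python sketch of the iterative scheme, under the following assumptions: indices are 0-based, an explicit `next_index` array (a name introduced here) plays the role of ExistsNextValue/NextValue, the loop stops once the search backtracks past the first variable, and solutions are collected rather than printed. The example reuses three variables with domain {1, 2, 3} and the constraints V2 != 3, V2 != V3, V1 + V3 = 2.

```python
def backtrack_iterative(domains, is_consistent):
    """Iterative backtracking; returns all solutions as tuples."""
    n = len(domains)
    values = [None] * n
    next_index = [0] * n  # position of the next untried value per variable
    solutions = []
    k = 0
    while k >= 0:  # k == -1: the first variable has no values left
        if k == n:
            solutions.append(tuple(values))
            k -= 1
            continue
        advanced = False
        while next_index[k] < len(domains[k]):
            values[k] = domains[k][next_index[k]]
            next_index[k] += 1
            if is_consistent(values, k):
                advanced = True
                break
        if advanced:
            k += 1
        else:
            next_index[k] = 0  # reset so the domain is rescanned next time
            k -= 1
    return solutions

def example_is_consistent(v, k):
    # v[0], v[1], v[2] stand for V1, V2, V3
    if k == 1:
        return v[1] != 3
    if k == 2:
        return v[1] != v[2] and v[0] + v[2] == 2
    return True

print(backtrack_iterative([[1, 2, 3]] * 3, example_is_consistent))  # [(1, 2, 1)]
```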
Example: n-Queens Problem
  Given a chessboard of size n x n, find a possible positioning of n queens such that no two queens attack each other
  Or find all possible positionings
  n = 8 for the usual chessboard

 n = 1 => 1 solution
 n = 2, 3 => 0 solutions
 n = 4 => 2 solutions
 …
 n = 8 => 92 solutions
 …
 n = 25 => 2,207,893,435,808,352 solutions
n-Queens Problem
 A possible solution for n = 8 (from Wikipedia)
n-Queens Problem
  Three possible approaches
  First approach
  n^2 variables – one for each position on the table
  Each variable has the domain {0, 1}: 1 if a queen is placed on that particular position, 0 otherwise
  Complexity: O(2^(n*n))
  Branching factor for each node: 2
  Height of tree: n*n

  Second approach
  n variables – one for each queen
  Each variable has the domain {1, …, n^2} – the position on the table for each queen
  Complexity: O((n^2)^n) = O(n^(2n))
  Branching factor: n^2
  Height of tree: n
n-Queens Problem
CheckConsistency1(Vars[1..n*n], Domains, k)
    FOR (i = 1..k-1)
        rowi = (i - 1) / n          // integer division
        columni = (i - 1) % n
        rowk = (k - 1) / n
        columnk = (k - 1) % n
        IF (rowi == rowk || columni == columnk || abs(rowi - rowk) == abs(columni - columnk))
            IF (Vars[i] == 1 AND Vars[k] == 1)
                // we already have a queen on the same row
                // or the same column or the same diagonal
                RETURN false
    RETURN true
n-Queens Problem
CheckConsistency2(Vars[1..n], Domains, k)
    FOR (i = 1..k-1)
        rowi = (Vars[i] - 1) / n    // integer division
        columni = (Vars[i] - 1) % n
        rowk = (Vars[k] - 1) / n
        columnk = (Vars[k] - 1) % n
        IF (rowi == rowk || columni == columnk || abs(rowi - rowk) == abs(columni - columnk))
            // we already have a queen on the same row
            // or the same column or the same diagonal
            RETURN false
    RETURN true
n-Queens Problem
  Third approach
  Idea: the queens cannot be placed on the same row!
  n variables – one for the column of queen i, which is placed on row i (i = 1..n)
  Each variable has the domain {1, …, n} – the column where the queen is placed on each row
  Complexity: O(n^n)
  Branching factor for each node: n
  Height of tree: n

  The position of each queen would be (i, Vars[i])
  i = 1..n
n-Queens Problem
CheckConsistency3(Vars[1..n], Domains, k)
    FOR (i = 1..k-1)
        rowi = i
        columni = Vars[i]
        rowk = k    // always rowk != rowi
        columnk = Vars[k]
        IF (rowi == rowk || columni == columnk || abs(rowi - rowk) == abs(columni - columnk))
            // we already have a queen on the same row
            // or the same column or the same diagonal
            RETURN false
    RETURN true
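The third approach combined with the recursive backtracking scheme might be written as below. This is a sketch: the function names are assumptions, rows and columns are 0-based, and the row test is dropped because each variable already lives on its own row. The printed counts can be compared with the solution counts listed earlier for n = 1, 2, 3, 4 and 8.

```python
def solve_n_queens(n):
    """Return all solutions; solution[i] is the column of the queen on row i."""
    columns = [0] * n
    solutions = []

    def is_consistent(k):
        # Compare the queen on row k with the queens on rows 0..k-1.
        for i in range(k):
            same_column = columns[i] == columns[k]
            same_diagonal = abs(i - k) == abs(columns[i] - columns[k])
            if same_column or same_diagonal:
                return False
        return True

    def place(k):
        if k == n:
            solutions.append(tuple(columns))
            return
        for col in range(n):
            columns[k] = col
            if is_consistent(k):
                place(k + 1)

    place(0)
    return solutions

for n in (1, 2, 3, 4, 8):
    print(n, len(solve_n_queens(n)))  # 1 1, 2 0, 3 0, 4 2, 8 92
```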
Example: Graph Coloring Problem
  Given an undirected graph G(V, E), can we color each vertex of the graph using k colors such that any two vertices joined by an edge have different colors?
  ∀ (u, v) ∈ E : color[u] != color[v]
  Modeling the problem:
  n = |V| variables – one for each vertex
  Each domain has k values: {1, …, k} – the color of each vertex
  Complexity: O(k^n)
  Height: n
  Branching factor: k
Graph Coloring Problem
CheckConsistency(Vars[1..n], Domains, k)
    // Vars[v] is the color assigned to vertex v
    FOREACH (v IN Adj[k])
        // each vertex in the adjacency list of vertex k
        IF (v < k AND Vars[k] == Vars[v])
            RETURN false
    RETURN true
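A small Python sketch of backtracking for graph coloring, using the same consistency check as above. The function name `color_graph`, the adjacency-list input and the 0-based vertex labels are assumptions made for this example; the search stops at the first valid coloring.

```python
def color_graph(adjacency, k):
    """Return one valid coloring (colors 1..k) as a list, or None."""
    n = len(adjacency)
    color = [0] * n

    def is_consistent(v):
        # Only neighbors with a smaller index have been colored so far.
        return all(color[u] != color[v] for u in adjacency[v] if u < v)

    def assign(v):
        if v == n:
            return True
        for c in range(1, k + 1):
            color[v] = c
            if is_consistent(v) and assign(v + 1):
                return True
        return False

    return list(color) if assign(0) else None

# Triangle 0-1-2 plus vertex 3 attached to vertex 2.
adjacency = [[1, 2], [0, 2], [0, 1, 3], [2]]
print(color_graph(adjacency, 2))  # None: a triangle needs 3 colors
print(color_graph(adjacency, 3))  # [1, 2, 3, 1]
```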
Improvements for Backtracking
  How can we improve backtracking?
  How we model the problem
  Look at n-queens
  Use of heuristics in order to reduce the average branching factor
  Variables that have a smaller domain should be instantiated first
  Variables that have the most constraints should be instantiated first
  Etc.
  Forward Checking: more on whiteboard
  Advanced: Arc-Consistency (AC) algorithms
Heuristics for Backtracking
  Minimum Remaining Values (MRV) – Choose the variable that has the smallest number of valid values in its domain
  Combined with Forward-Checking
  Most Constraining Variable (MCV) – Choose the variable that has the most constraints with previous variables
  Least Constraining Value (LCV) – Choose the value that imposes the smallest number of constraints on the remaining variables
  Mainly useful if we want to find a solution, not all of them
  More info here: http://www.ai.kun.nl/aicourses/bki212a/slides/AISP-CSP-ch5.pdf
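As an illustration of MRV combined with forward checking, here is a sketch for graph coloring. It is one possible way to wire the two ideas together, not a definitive implementation: the function name `color_with_mrv` and the adjacency-list input are assumptions, forward checking here simply removes the chosen color from the domains of uncolored neighbors, and MCV and LCV are not implemented.

```python
def color_with_mrv(adjacency, k):
    """Find one k-coloring using MRV and forward checking, or return None."""
    n = len(adjacency)
    domains = [set(range(1, k + 1)) for _ in range(n)]
    color = [None] * n

    def solve(assigned):
        if assigned == n:
            return True
        # MRV: pick the uncolored vertex with the fewest remaining colors.
        v = min((u for u in range(n) if color[u] is None),
                key=lambda u: len(domains[u]))
        for c in sorted(domains[v]):
            color[v] = c
            # Forward checking: c is no longer available to uncolored neighbors.
            pruned = [u for u in adjacency[v]
                      if color[u] is None and c in domains[u]]
            for u in pruned:
                domains[u].discard(c)
            # Give up on c early if some neighbor has no color left.
            if all(domains[u] for u in pruned) and solve(assigned + 1):
                return True
            for u in pruned:  # undo the pruning before trying the next color
                domains[u].add(c)
            color[v] = None
        return False

    return color if solve(0) else None

# Triangle 0-1-2 plus vertex 3 attached to vertex 2.
adjacency = [[1, 2], [0, 2], [0, 1, 3], [2]]
print(color_with_mrv(adjacency, 3))
print(color_with_mrv(adjacency, 2))  # None
```

Because forward checking keeps every domain free of colors already used by neighbors, no separate consistency check is needed when a value is assigned.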
Conclusions
  There are some problems that are very difficult
  But solvable using exponential algorithms
  NP-complete
  For these problems it’s ok to use backtracking
  Maybe with heuristics
  However, for optimization problems it may sometimes be more useful to find an approximate solution than to find the optimum one with backtracking
  In polynomial time
References
  CLRS – Chapter 36
  http://ww3.algorithmdesign.net/sample/ch13-np.pdf
  http://www.cs.cmu.edu/~avrim/451/lectures/lect1030.pdf
  http://faculty.ksu.edu.sa/YAlohali/Documents/ch3-Constraint%20Satisfaction%20Problems1.pptx
  Advanced: Backtracking and Constraint Satisfaction Problems: http://kti.mff.cuni.cz/~bartak/downloads/CPschool05notes.pdf
