Contents

1 Sep 9th, 2008
  1.1 Welcome to CS 341: Algorithms, Fall 2008
  1.2 Marking Scheme
  1.3 Course Outline
  1.4 A Case Study (Convex Hull)
    1.4.1 Algorithm
1 Sep 9th, 2008
5. Lower Bounds
This is not a course on complexity theory, which is where people really get excited about lower bounds, but
you need to know something about this.
1.4 A Case Study (Convex Hull)

1.4.1 Algorithm
Definition (better from an algorithmic point of view): the convex hull is a polygon whose sides lie along lines ℓ that pass through at least two of the points and have all the remaining points on one side.
A straightforward algorithm (sometimes called a brute-force algorithm, but that gives them a bad name, because oftentimes the straightforward algorithm is the way to go): for all pairs of points r, s, find the line through r and s, and if all other points lie on one side only, then that segment is part of the convex hull.
Time for n points: O(n³).
Aside: even here there are good and bad ways to "see which side points are on." Computing the slope of the line through r and s is actually a bad way to do this. Exercise: for r, s, and p, how do we decide in the fewest steps, avoiding underflow/overflow/division?
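One answer (a sketch; the lecture leaves this as an exercise) is the sign of a cross product, which needs only subtractions and multiplications – with integer coordinates there is no division, no rounding, and no vertical-line special case:

def side(r, s, p):
    # Sign of the cross product (s - r) x (p - r):
    # +1 if p is left of the directed line through r and s,
    # -1 if right, 0 if collinear. No slopes, so no division.
    cross = (s[0] - r[0]) * (p[1] - r[1]) - (s[1] - r[1]) * (p[0] - r[0])
    return (cross > 0) - (cross < 0)

# (r, s) supports a hull edge iff side(r, s, p) never takes both values +1 and -1
print(side((0, 0), (2, 0), (1, 1)))   # +1: (1, 1) lies above the x-axis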
Improvement Given one hull line ℓ, there is a natural "next" line. Rotate ℓ about s until it hits the next point.
[Figure: the line ℓ through r and s rotates to ℓ′, hitting the next point t.]
t is an "extreme point" (minimum angle α). Finding it is like finding a max (or min) – O(n). Time for n points: O(n²).
Actually, if h = the number of points on the convex hull, the algorithm takes O(n × h).
Can we do even better? (you bet!)
Repeatedly finding a min/max (which should remind you of sorting.)
Example Sort the points by x coordinate, and then find the ”upper convex hull” and ”lower convex hull” (each of
which comes in sorted order.)
The sorting will cost O(n log n) but the second step is just linear. We don't quite have a linear algorithm here, but this will be much better. Process from left to right, adding points and each time figuring out whether previously added points must be discarded to keep the hull convex (see the sketch below).
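A minimal sketch of that left-to-right scan for the upper hull (assuming the points are already sorted by x; the pop test is the same cross-product sign as above):

def upper_hull(points):
    # points sorted by x; pop while the last two hull points and p
    # do not make a strict right turn
    hull = []
    for p in points:
        while len(hull) >= 2:
            r, s = hull[-2], hull[-1]
            cross = (s[0]-r[0])*(p[1]-r[1]) - (s[1]-r[1])*(p[0]-r[0])
            if cross >= 0:        # s is not a strict right turn: discard it
                hull.pop()
            else:
                break
        hull.append(p)
    return hull

print(upper_hull([(0, 0), (1, -1), (2, 0)]))   # [(0, 0), (2, 0)]: the dip is discarded

Each point is pushed once and popped at most once, so the scan is O(n) after sorting.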
[Figure: left and right half hulls joined by an upper bridge and a lower bridge.]
Divide-and-conquer alternative: split the points into a left half and a right half and recursively compute each hull. Starting from e, the edge from the maximum-x point on the left to the minimum-x point on the right, "walk up" to get the upper bridge and "walk down" to get the lower bridge.
This will be O(n) to divide and O(n) to find the upper/lower bridges. We get the recurrence

T(n) = 2T(n/2) + O(n)

This is the same as e.g. merge-sort. It comes out to O(n log n).
Never Any Better? Finally, let's talk ever-so-slightly about whether we can beat O(n log n). In some sense, no: if we could find a convex hull faster, we could sort faster.
Technique: put the points on a parabola (or another convex curve) with the map x → (x, x²) and compute the convex hull of these points. From the hull we recover the sorted order. This is an intuitive argument; to be rigorous, we need to specify the model of computation. We need a restricted model to say that sorting is Ω(n log n) – but convex hull algorithms need the power of indirect addressing. (Don't worry if that seems fuzzy. The take-home message is that to be precise we need to spend more time on models of computation.)
Measuring in terms of n, the input size, and h, the output size: we saw an O(n log n) algorithm and an O(n × h) algorithm. Which is better? It depends on whether h > log n or not.
One paper, titled "The ultimate convex hull algorithm?" (a question mark in a title is very unusual), gave an algorithm that is O(n log h).
Challenge Look up the O(n log h) algorithm by Timothy Chan (here in SCS) and try to understand it.
3 Sep 16th, 2008
This looks like an O(n log n) algorithm (as it takes that long to sort, and then O(n) after that)
Correctness Proof
There are three approaches to proving correctness of greedy algorithms.
• Suppose there is an optimal solution. Show the greedy solution can be transformed into it (an exchange argument) without losing value.
3.3 Example: Knapsack problem
We’ll look at 0-1 Knapsack later (since it’s harder) (and when we study dynamic programming)
So imagine we have a table of items:

Item   Weight w_i   Value v_i
1      6            12
2      4            7
3      4            6

W = 8. Greedy by v_i/w_i. For the 0-1 knapsack, greedy takes item 1 (value 12) and then nothing else fits, but the optimal solution takes items 2 and 3 (weight 8, value 13) – so greedy fails here.
Greedy Algorithm

Order the items 1, ..., n by decreasing v_i/w_i. Let x_i be the weight of item i that we choose.

free-W <- W
for i = 1..n
    x_i <- min{ w_i, free-W }
    free-W <- free-W - x_i
end
∑ x_i = W (assuming W ≤ ∑ w_i). The value we get is

∑_{i=1}^{n} x_i · (v_i / w_i)
Note: the solution looks almost like a 0-1 solution – the only item we may take fractionally is the last one chosen.
Claim Greedy algorithm gives the optimal solution to fractional knapsack problem.
Proof The greedy solution uses x_1, ..., x_n and the optimal uses y_1, ..., y_n. Let k be the minimum index with x_k ≠ y_k. Then y_k < x_k (because greedy took the maximum possible x_k). Since ∑ x_i = ∑ y_i = W, there exists an index l > k such that y_l > x_l. Idea: swap excess weight from item l to item k.
Set y′_k ← y_k + Δ and y′_l ← y_l − Δ, where Δ ← min{y_l, w_k − y_k}; both terms are greater than zero. The sum of the weights ∑ y′_i is still W, and since item k has the higher value/weight ratio, the total value does not decrease. Repeating this transforms the optimal solution into the greedy one, so greedy is optimal.
5 Sep 23, 2008: Divide and Conquer

Examples:
• Binary search
• Merge sort
5.1 Solving Recurrence Relations
– Recurse: two subproblems of size n/2
– Conquer: n − 1 comparisons
– Recurrence: T(n) = T(⌊n/2⌋) + T(⌈n/2⌉) + (n − 1), with T(1) = 0 comparisons.

For n even:
T(n) = 2T(n/2) + n − 1
T(1) = 0
So for n a power of 2,

T(n) = 2T(n/2) + n − 1
     = 2[2T(n/4) + n/2 − 1] + n − 1
     = 4T(n/4) + 2n − 3
     ...
     = 2^i T(n/2^i) + i·n − (2^i − 1)        (the last term is ∑_{j=0}^{i−1} 2^j)

We want n/2^k = 1, i.e. 2^k = n, k = log n. Then

T(n) = 2^k T(n/2^k) + k·n − (2^k − 1)
     = n·T(1) + n log n − n + 1
     = n log n − n + 1 ∈ O(n log n)
If our goal is to say that mergesort takes O(n log n) for all n (as opposed to exactly computing T(n)), then we can just use that T(n) ≤ T(n′), where n′ = the smallest power of 2 bigger than n.
In general,

T(n) = T(⌊n/2⌋) + T(⌈n/2⌉) + n − 1
T(1) = 0

and the exact solution is T(n) = n⌈log n⌉ − 2^⌈log n⌉ + 1. Alternatively, guess T(n) ≤ cn log n and prove it by induction:
T(n) = 2T(n/2) + n − 1
     ≤ 2[c(n/2) log(n/2)] + n − 1        (by induction)
     = cn(log n − log 2) + n − 1
     = cn log n − cn + n − 1
     ≤ cn log n    if c ≥ 1
I’ll leave the details as an exercise (we need a base case, and need to do the case of n odd) for those of you for
whom this is not entirely intuitive.
Another example:

T(n) = 2T(n/2) + n

Claim T(n) ∈ O(n). "Proof": show T(n) ≤ cn for some constant c:

T(n) = 2T(n/2) + n ≤ 2c(n/2) + n = (c + 1)n
Wait, constants aren’t supposed to grow like c + 1 above. This proof is fallacious. Please do not make this kind
of mistake on your assignments.
Example 2

T(n) = T(⌊n/2⌋) + T(⌈n/2⌉) + 1
T(1) = 1

Let's guess T(n) ∈ O(n). Prove by induction that T(n) ≤ cn for some c.
Induction step:

T(n) = c⌊n/2⌋ + c⌈n/2⌉ + 1 = cn + 1 – we've got trouble from that +1.
Find the exact solution by unrolling (for n = 2^k):

T(n) = 2T(n/2) + 1
     = 4T(n/4) + 2 + 1
     ...
     = 2^k T(n/2^k) + ∑_{i=0}^{k−1} 2^i
     = n·T(1) + 2^k − 1
     = 2n − 1
So strengthen the hypothesis: prove T(n) ≤ cn − 1. In that case we have

T(n) ≤ (c⌊n/2⌋ − 1) + (c⌈n/2⌉ − 1) + 1 = cn − 1  ✓
Message: Sometimes we need to strengthen the inductive hypothesis and lower the bound.
One more trick: substitution. For example, with T(n) = 2T(√n) + log n, write n = 2^m, so T(2^m) = 2T(2^{m/2}) + m. Let S(m) = T(2^m); then S(m) = 2S(m/2) + m – the merge-sort recurrence – so S(m) ∈ O(m log m) and T(n) ∈ O(log n · log log n).
We need a general tool ("Master Method") for recurrences of the form

T(n) = aT(n/b) + c·n^k

The more general case, where the extra work is some f(n) rather than c·n^k, is handled in the textbook. We'll first look at k = 1:

T(n) = aT(n/b) + cn

Results (exact) are:

Theorem If T(n) = aT(n/b) + c·n^k with a ≥ 1, b > 1, c > 0, k ≥ 1, then
T(n) ∈ Θ(n^k)          if a < b^k
T(n) ∈ Θ(n^k log n)    if a = b^k
T(n) ∈ Θ(n^{log_b a})  if a > b^k
We’re not going to do a rigorous proof but we’ll do enough to give you some intuition. We’ll use unrolling. The
rigorous way is through induction.
T(n) = aT(n/b) + cn^k
     = a[aT(n/b²) + c(n/b)^k] + cn^k
     = a²T(n/b²) + ac(n/b)^k + cn^k
     = a³T(n/b³) + a²c(n/b²)^k + ac(n/b)^k + cn^k
     ...
     = a^{log_b n} T(1) + ∑_{i=0}^{log_b n − 1} a^i c (n/b^i)^k
     = n^{log_b a} T(1) + cn^k ∑_{i=0}^{log_b n − 1} (a/b^k)^i

using n = b^t, t = log_b n, and a^{log_b n} = n^{log_b a}. It comes out exactly like that sum in your assignment.
Just to wrap up: if a < b^k, i.e. log_b a < k, the sum is a constant and n^k dominates. If a = b^k, the sum is log_b n and we get Θ(n^k log n). The third case is a > b^k, where n^{log_b a} dominates.
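As a sanity check (not from the lecture), the three cases are easy to mechanize:

import math

def master_theorem(a, b, k):
    # Classify T(n) = a*T(n/b) + c*n^k, with a >= 1, b > 1, k >= 0
    if a < b**k:
        return f"Theta(n^{k})"
    if a == b**k:
        return f"Theta(n^{k} log n)"
    return f"Theta(n^{math.log(a, b):.3g})"

print(master_theorem(2, 2, 1))   # merge sort:            Theta(n^1 log n)
print(master_theorem(4, 2, 1))   # 4-subproduct multiply: Theta(n^2)
print(master_theorem(7, 2, 2))   # Strassen:              Theta(n^2.81)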
6 Sep 25, 2008
Equivalently, we can say: given a_1, a_2, ..., a_n, a permutation of 1...n, count the number of inversions, i.e. the number of pairs a_i, a_j with i < j but a_i > a_j.

Brute Force: check all (n choose 2) pairs, taking O(n²).
Divide & Conquer: divide the list in half, with m = ⌊n/2⌋:
A = a_1 ... a_m
B = a_{m+1} ... a_n
Recursively count
rA = # inversions in A
rB = # inversions in B
Then count the inversions between A and B while merging the two sorted halves: whenever an element of B is output ahead of the elements remaining in A, it is inverted with all of them.
Runtime:

T(n) = 2T(n/2) + O(n)
Since it’s the same as mergesort, we get O(n log n). Can we do better?
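A sketch of the merge-and-count method just described (counting happens in the else branch of the merge):

def count_inversions(a):
    # Returns (sorted copy of a, number of inversions), merge-sort style
    if len(a) <= 1:
        return a, 0
    mid = len(a) // 2
    left, r_a = count_inversions(a[:mid])
    right, r_b = count_inversions(a[mid:])
    merged, cross = [], 0
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            cross += len(left) - i      # right[j] is inverted with all remaining left elements
            merged.append(right[j]); j += 1
    merged += left[i:] + right[j:]
    return merged, r_a + r_b + cross

print(count_inversions([3, 1, 2])[1])   # 2 inversions: (3,1) and (3,2)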
Next problem: multiplying two large numbers, e.g. 981 × 1234 by the school method:

      981
   × 1234
   ------
     3924
    2943
   1962
    981
  -------
  1210554
This is O(n²) for two n-digit numbers (one step is a × or + of two digits).
There is a faster way using divide-and-conquer. First pad 981 to 0981 and split each number in half:

09 81 × 12 34

Then calculate four sub-products, each shifted by the appropriate number of digits:

09 × 12, shift 4 → 108
09 × 34, shift 2 → 306
81 × 12, shift 2 → 972
81 × 34, shift 0 → 2754

Summing: 108·10⁴ + 306·10² + 972·10² + 2754 = 1210554.
The runtime here is

T(n) = 4T(n/2) + O(n)
Apply the Master Method.
T(n) = aT(n/b) + c·n^k

Here, a = 4, b = 2, k = 1. Compare a with b^k: we see a = 4 > b^k = 2, so the runtime is Θ(n^{log_b a}) = Θ(n²).
So far we have not made progress!
(w + x)(y + z) = wy + wz + xy + xz

We know wy and xz but we want wz + xy. This leads to:

p = wy = 09 × 12 = 108
q = xz = 81 × 34 = 2754
r = (w + x)(y + z) = 90 × 46 = 4140    [90 = 09 + 81, 46 = 12 + 34]

so wz + xy = r − p − q = 4140 − 108 − 2754 = 1278, and the answer is

108____
 1278__
   2754
-------
1210554
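This three-products trick is Karatsuba's algorithm; a runnable sketch (base case and splitting details are one reasonable choice among several):

def karatsuba(a, b):
    # Multiply non-negative integers with 3 recursive products instead of 4
    if a < 10 or b < 10:                  # base case (a hardware word, in practice)
        return a * b
    half = max(len(str(a)), len(str(b))) // 2
    p10 = 10 ** half
    w, x = divmod(a, p10)                 # a = w*10^half + x
    y, z = divmod(b, p10)                 # b = y*10^half + z
    p = karatsuba(w, y)                   # wy
    q = karatsuba(x, z)                   # xz
    r = karatsuba(w + x, y + z)           # (w+x)(y+z) = wy + (wz+xy) + xz
    return p * p10**2 + (r - p - q) * p10 + q

assert karatsuba(981, 1234) == 1210554    # the worked example above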
• What if n is odd?
• How small do you let the recursion get? (Answer: hardware word)
• When is this algorithm useful? (For about 1,000 digits or fewer, don’t use it [BB])
– Schönhage–Strassen is better for very large numbers; it runs in O(n log n log log n).
7 Sep 30, 2008
Basic D&C for matrix multiplication: divide each matrix into n/2 × n/2 blocks:

[A B] [E F]   [I J]
[C D] [G H] = [K L]

I = AE + BG, etc. Each of the four output blocks needs 2 block products, so 8 subproblems in all, plus O(n²) additions:

T(n) = 8T(n/2) + O(n²)

By the master theorem, a = 8, b = 2, k = 2; a = 8 > b^k = 4 (the case where the recursive work dominates), so T(n) ∈ Θ(n^{log_b a}) = Θ(n³).
Strassen’s Algorithm shows how to get by with just seven (a = 7) subproblems. Not discussing here, but if you’re
curious it’s in the textbook. This gives
T(n) = 7T(n/2) + O(n²)

This is Θ(n^{log₂ 7}) ≈ O(n^{2.81}). There are more complicated algorithms that get even better exponents (only for very large n, however).
Closest pair: in one dimension, consider {10, 5, 17, 100}. How would we find the closest pair? Sort and compare adjacent numbers.
In the plane, we can use brute force, and that's O(n²). What about divide and conquer?
(1) Divide points into left/right at the median x coordinate. Most efficient to sort once by x coordinate. Then
we can find a line L in O(1) time.
– Sort by x
– Sort by y
– T(n) = 2T(n/2) + O(n) ∈ O(n log n)
More general problems – given n points, find closest neighbour of each one. This can be done in O(n log n) (not
obvious)
• Voronoi diagrams
• Delaunay triangulations
Generalization – each interval i has a weight w(i). Pick disjoint intervals to maximize the sum of the weights.
What if we try to use Greedy?
8 Oct 2nd, 2008

8.1 Dynamic Programming
A general idea: for interval (or vertex) i, either we use it or we don't. Let OPT(I) = a maximum-weight non-overlapping subset of I, and W-OPT(I) = the sum of the weights of the intervals in OPT(I).
If we don't use i: W-OPT(I) = W-OPT(I \ {i}).
If we use i: W-OPT(I) = w(i) + W-OPT(I′), where I′ = the set of intervals that don't overlap i.
This leads to a recursive algorithm:
W-OPT(I) = max{ W-OPT(I \ {i}), w(i) + W-OPT(I′) }
T (n) = 2T (n − 1) + O(1)
But this is exponential time.
Essentially we are trying all possible subsets of n items – all 2^n of them.
For intervals (but not for the general graph problem) we can do better. Order intervals 1, . . . , n by their right
endpoint.
If we choose interval n, then I′ = the set of intervals disjoint from n – it has the form 1, 2, ..., j for some j.
W-OPT(1 ... n) = max ( W-OPT(1 ... n-1 ), w(n) + W-OPT(1..p(n)) ).
p(n) = max index j such that interval j doesn’t overlap n.
More generally:
p(i) = the max index j < i such that interval j doesn't overlap i.
W-OPT(1..i) = max( W-OPT(1..i−1), w(i) + W-OPT(1..p(i)) )
This leads to an O(n) time algorithm. Note: don’t use recursion blindly. The same subproblem may be solved
many times in your program.
Solution Use memoized recursion (see text.) OR, use an iterative approach.
Let’s look at an algorithm using the second approach.
notation M[i] = W-OPT(1 .. i)
M[0] = 0
for i = 1..n
M[i] = max{ M[i-1], w(i) + M[p(i)] }
end
Runtime is O(n). What about computing p(i) with i = 1..n?
Sorting by right endpoint is O(n log n). To find p(i), sort by the left endpoint as well. Exercise: then find all p(i), i = 1..n, in O(n) time.
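A runnable sketch of the whole loop (here p(i) is found by binary search rather than the O(n) two-pointer exercise; touching endpoints count as disjoint):

import bisect

def max_weight_intervals(intervals):
    # intervals: list of (start, finish, weight); M[i] = W-OPT(1..i)
    intervals = sorted(intervals, key=lambda t: t[1])   # sort by right endpoint
    finishes = [f for _, f, _ in intervals]
    n = len(intervals)
    M = [0] * (n + 1)
    for i in range(1, n + 1):
        s, f, w = intervals[i - 1]
        p = bisect.bisect_right(finishes, s)   # p(i): # of intervals finishing <= start of i
        M[i] = max(M[i - 1], w + M[p])
    return M[n]

print(max_weight_intervals([(0, 3, 4), (2, 5, 2), (4, 7, 3)]))   # 7: take (0,3) and (4,7)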
So far this algorithm finds W-OPT but not OPT. (i.e. the weight, not the actual set of items.)
One possibility: enhance the above loop to keep the set OPT(1..i). The danger is that storing n sets of size n needs n² space. One solution: first compute M as above, then call OPT(n):
recursive fun OPT(i)
    if i = 0 then return {}
    if M[i-1] >= w(i) + M[p(i)]
        then return OPT(i-1)
        else return { i } union OPT(p(i))
8.2 Second example: optimum binary search trees
Note: in CS 240 you did dynamic binary search trees – insert, delete, and rebalancing to control depth.
This is different in that we have items and probabilities ahead of time.
The difference from Huffman coding (a similar problem) is that for Huffman codes, left-to-right order of leaves is
free.
The heart of dynamic programming for optimum binary search trees: try all possible roots k, splitting into subtrees for i..k−1 and k+1..j. Subproblem: ∀i, j find the optimum tree for keys i, i+1, ..., j:

M[i, j] = min_{k=i..j} ( M[i, k−1] + M[k+1, j] ) + ∑_{t=i}^{j} p_t

(every node is one level deeper under the new root, hence the ∑ p_t term).
Exercise: work this out.
for i=1..n
    M[i,i] = p_i
(convention: M[i,i-1] = 0, the empty subtree)
for r=1..n-1
    for i = 1..n-r
        -- solve for M[i, i+r]: try each root k
        best <- infinity
        for k = i..i+r
            temp <- M[i,k-1] + M[k+1, i+r]    -- empty ranges count as 0
            if temp < best, best <- temp
        end
        M[i,i+r] <- best + sum_{t=i}^{i+r} p_t
(better: precompute prefix sums P[j] = p_1 + ... + p_j, then use P[i+r] - P[i-1])
Runtime? O(n³).
9 Oct 7th, 2008
Same pattern for matrix chain multiplication: let m_{ii} = 0 and m_{ij} = the minimum over k = i ... j − 1. The idea is we break the product into subproblems: (M_i ⋯ M_k) times (M_{k+1} ⋯ M_j).
Algorithm pseudocode:
for i=1..n
m(i,i) = 0
end
for diff=1 .. n-1
for i = 1..n-diff
j <- i + diff
m(i,j) <- infinity
for k = i .. j-1
temp <- m(i,k) + m(k+1,j) + d_{i-1} d_j d_k
if temp < m (i,j)
m(i,j) <- temp
end
end
end
end
The runtime is O(n³): O(n²) subproblems of O(n) each. The final answer is m(1, n); exercise: also record the minimizing k for each (i, j) to recover the actual parenthesization.
9.1 Example 2: Minimum Weight Triangulation

Base cases:

m(i, i + 2) = ℓ(i, i + 1) + ℓ(i + 1, i + 2) + ℓ(i, i + 2)

Note: we'd better add m(i, i + 1) = ℓ(i, i + 1). And we don't actually need the case m(i, i + 2) – it falls out of the general formula.
Algorithm:
initialize m(i,i+1)
for diff = 2, ..., n-1
for i = 1 .. n-diff
j<-i + diff
10 Oct 9th, 2008
Runtime O(n³): an n × n table with O(n²) subproblems, O(n) to solve each one.
10.2 Certain types of subproblems
• Input is rooted tree (not necessarily binary) and subproblems are rooted subtrees.
10.3 Memoization
Use recursion (rather than explicitly solving subproblems bottom-up as we have been doing) – the danger is solving the same subproblem over and over, e.g. T(n) = 2T(n − 1) + O(1) – exponential! Memoization stores each solved subproblem so it is computed only once.
Advantage over bottom-up: storing solved subproblems on demand saves time if we don't need solutions to all subproblems.
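A tiny Python illustration (not from the lecture):

from functools import lru_cache

@lru_cache(maxsize=None)        # store every solved subproblem
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

# Without the cache this is T(n) = T(n-1) + T(n-2) + O(1): exponential.
# With it, each subproblem is solved once: O(n) calls.
print(fib(90))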
11 Oct 14th, 2008
• No multiple edges.
We will use n or |V | for the number of vertices, and m or |E| for the number of edges.
• 0 ≤ m ≤ (n choose 2) = n(n−1)/2 for undirected graphs.
What is a path? A sequence of vertices where every consecutive pair is joined by an edge, e.g. 3, 5, 4. A walk allows repetition of vertices and edges; a simple path does not.
We say that an undirected graph G is connected if for every pair of vertices, there is a path joining them. For
testing if a graph is connected, we can use DFS or BFS.
For directed graphs there are different notions of connectivity. A graph can be strongly connected: ∀u, v ∈ V there is a directed path from u to v.
Tree: A graph that is connected but has no cycles. Note: a tree on n vertices has n − 1 edges.
Storing a graph:
• Adjacency list: Vertices down the left, edge destinations in a list on the right.
We usually use adjacency lists – then we can (sometimes) get algorithms with runtime better than O(n2 ).
11.2 Minimum Spanning Trees

Claim The chosen edge set E′ will be a tree. Otherwise E′ has a cycle: throw away an edge of the cycle, which leaves a connected graph – if a path from a to b used the edge (u, v), replace (u, v) with the rest of the cycle.
• Grow one connected component, always using the minimum-weight edge leaving it.
Lemma Let V1 , V2 be a partition of V (into two disjoint non-empty sets with union V .) Let e be a minimum-weight
edge from V1 to V2 . Then there is a minimum spanning tree that includes e.
Stronger version Let X be a set of edges contained in some minimum spanning tree, with no edge of X going from V1 to V2. Then there is a minimum spanning tree that includes X ∪ {e}.
Proof Let T be a minimum spanning tree (for the stronger version: one containing X). Say e = (u, v); T has a path P connecting u and v, and P must use some edge from V1 to V2 – say f.
Let T′ = T ∪ {e} \ {f}: exchange e for f. Claim: T′ is the desired tree.
w(e) ≤ w(f), so w(T′) ≤ w(T). And T′ is a spanning tree: P ∪ {(u, v)} makes a cycle, so we can remove f and stay connected.
Note that T′ contains e and X (because f ∉ X).
Following Kruskal's Algorithm, we need to test whether an edge joins two different components.

A simple Union-Find structure: store an array C(1...n) where C(i) is the name of the connected component containing vertex i. Union: we must rename one of the two sets – rename the smaller one. Then n Unions take O(n log n) in total. (CS 466: how to reduce this.)

Kruskal's Algorithm takes O(m log m) to sort plus O(n log n) for the Union-Find operations. And O(m log m) = O(m log n), since log m ≤ log n² = 2 log n.
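A sketch putting the pieces together (names and the edge representation are mine):

def kruskal(n, edges):
    # edges: list of (w, u, v), vertices 0..n-1
    # comp[i] = name of i's component; union renames the smaller side
    comp = list(range(n))
    members = [[i] for i in range(n)]
    mst = []
    for w, u, v in sorted(edges):           # min to max weight
        cu, cv = comp[u], comp[v]
        if cu == cv:
            continue                        # would form a cycle: skip
        if len(members[cu]) < len(members[cv]):
            cu, cv = cv, cu                 # rename the smaller component
        for x in members[cv]:
            comp[x] = cu
        members[cu] += members[cv]
        members[cv] = []
        mst.append((u, v, w))
    return mst

print(kruskal(4, [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]))

Each vertex is renamed at most O(log n) times (its component at least doubles), which is where the O(n log n) bound comes from.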
12 Oct 16th, 2008
• You are allowed one 8.5 × 11 sheet brought to the midterm. Doesn’t have to be hand-written either.
Recall:
• Kruskal’s algorithm orders edges from minimum-maximum weight. Take each edge unless it forms a cycle
with previously chosen edges.
• Lemma: the cheapest edge connecting two groups of vertices can always be used in a minimum spanning tree.
Implementation (Prim): we need to (repeatedly) find a minimum-weight edge leaving U (as U changes). Let δ(U) be the set of edges from U to V − U. We want to find the minimum, insert, and delete – we need a priority queue; use a heap.
When we do U ← U ∪ {v}, any edge from U to v leaves δ(U), and any other edge incident with v enters δ(U).
Recall that a heap provides O(log n) for insert and delete, and O(1) for finding a minimum.
Total number of PQ insert/delete operations over all vertices v (we hope for better than n × n): every edge enters δ(U) once and leaves once, so 2m.
Alternatively: ∑_{v∈V} deg(v) = 2m.
Total time for the algorithm is O(n + m log m) = O(m log n), because m ≤ n² gives log m ≤ 2 log n. (Check first whether m < n − 1; if so, the graph is disconnected – bail out.)
Improvements
• Store vertices in the PQ instead of edges. Define w(v) = minimum weight of an edge from U to v. When we do U ← U ∪ {v}, we must adjust the weights of some vertices. Gives O(m log n).
• Tweak the PQ to be a "Fibonacci heap," which gives O(1) (amortized) per weight decrease and O(log n) to extract the minimum. Gives O(n log n + m).
12.2 Shortest Paths

General input: directed graph G = (V, E) with weights w : E → R. Allow negative-weight edges, but disallow negative-weight cycles. (With a negative-weight cycle, repeating it gives paths of weight approaching −∞.) We might ask for the shortest simple path instead, but that is actually hard (NP-complete).
2. Given u ∈ V, find shortest paths to all other vertices: the "single source shortest path problem."
3. Find a shortest u, v path ∀u, v: the "all pairs shortest path problem."
13 Oct 21, 2008
[Figure: a small weighted digraph on vertices A, B, C, D with edge weights −1, 5, 6, 2, 11; e.g. the path ACD has weight w(ACD) = 8.]
[Figure: a u → v path through an intermediate vertex x.]

Main idea: try all intermediate vertices x. If we use x, we need a shortest u → x path and a shortest x → v path. How are these subproblems simpler?
1. Fewer edges – an efficient dynamic program with M[u, v, ℓ] = length of a shortest u, v path using ≤ ℓ edges. However, we're not using this; it gives the same runtime but uses more space.
2. Fewer allowed intermediate vertices: let V = {1, 2, ..., n} and let D_i[u, v] = the minimum length of a u → v path using intermediate vertices only from the set {1, ..., i}. Solve the subproblems D_i[u, v] for i = 0, 1, ..., n.
Main formula:

D_i[u, v] = min{ D_{i−1}[u, v], D_{i−1}[u, i] + D_{i−1}[i, v] }

with D_0[u, v] = w(u, v) if (u, v) ∈ E, 0 if u = v, and ∞ otherwise.
for i = 1..n
for u = 1..n
for v = 1..n
D_i[u,v] = as above in main formula
end
return D_n
Time is O(n³). The space, however, is also O(n³), which is extremely undesirable. Notice that to compute D_i we only use D_{i−1}, so we can throw away all earlier matrices, bringing space to O(n²).
In fact, even better (though not in the degree of n), we can drop the subscripts and update a single matrix D in place. Note: in the inner loop, D will be a mixture of D_i and D_{i−1}, but this is still correct – we never go below the true minimum, and we still consider everything the main equation requires.
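A sketch of that single-matrix, in-place version (the standard Floyd–Warshall formulation):

INF = float("inf")

def floyd_warshall(n, edges):
    # edges: dict {(u, v): w}, vertices 0..n-1; one n x n matrix, updated in place
    D = [[0 if u == v else edges.get((u, v), INF) for v in range(n)]
         for u in range(n)]
    for i in range(n):                       # allow i as an intermediate vertex
        for u in range(n):
            for v in range(n):
                if D[u][i] + D[i][v] < D[u][v]:
                    D[u][v] = D[u][i] + D[i][v]
    return D        # a negative D[v][v] would signal a negative-weight cycle

D = floyd_warshall(3, {(0, 1): 4, (1, 2): -2, (0, 2): 5})
print(D[0][2])      # 2, going through vertex 1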
To recover paths, store S[x, v] = the next vertex after x on a shortest x → v path. Then:

Path(u, v):
    x <- u
    while x != v
        output x
        x <- S[x,v]
    end
    output v
Exercise: Use this algorithm to test if a graph has a negative weight cycle.
14 Oct 23, 2008
• In the case with no negative weight edges, we can use Dijkstra’s Algorithm, which is O(m log n).
• With no negative weight cycles, O(n × m). (This is the most general – still faster than all pairs.)
[Figure: a frontier edge (x, y) with x ∈ B and y ∉ B.]

General step: we have shortest paths to all vertices in B. Initially, B = {s}. Choose the edge (x, y) with x ∈ B and y ∈ V \ B that minimizes the following:

d(s, x) + w(x, y)
Call this minimum d:
• d(s, y) ← d
• B ← B ∪ {y}
This is greedy in the sense that y has the next minimum distance from s.
Claim: d is the true distance d(s, y). Consider any other s → y path π: it must leave B along some edge (u, v), so write π as π1 (from s to u, inside B), then the edge (u, v), then π2.
So w(π) = w(π1 )+w(u, v)+w(π2 ). Note that w(π1 )+w(u, v) ≥ d and w(π2 ) ≥ 0 as edge-weights are non-negative.
From the Claim, by induction on |B|, this algorithm finds shortest paths to all vertices.
Implementation: Make a priority queue (heap) on vertices V \B using value D(v) for v ∈ V such that the minimum
value of D gives the wanted vertex.
D(v) = minimum weight path from s → v using a path in B plus one edge.
• Initialize:
– D(v) ← ∞, ∀v
– D(s) ← 0
– B ← ∅
Store the D values in a heap. How many times do we extract the minimum? n times, at O(log n) each. The "decrease D value" operation is done ≤ m times (same argument as for Prim), and each one is O(log n) (done as a delete plus insert). Total time is O(n log n + m log n), which is O(m log n) if m ≥ n − 1. Using a Fibonacci heap, we can decrease this to O(n log n + m).
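A compact sketch using Python's heapq. Since heapq has no decrease-key, this uses the standard lazy-deletion trick (push duplicates, skip stale entries) instead of the delete-plus-insert described above:

import heapq

def dijkstra(adj, s):
    # adj: {u: [(v, w), ...]} with all w >= 0
    dist = {s: 0}
    done = set()
    heap = [(0, s)]
    while heap:
        d, x = heapq.heappop(heap)
        if x in done:
            continue                  # stale entry
        done.add(x)                   # x joins B; d = d(s, x) is now final
        for y, w in adj.get(x, []):
            if y not in done and d + w < dist.get(y, float("inf")):
                dist[y] = d + w
                heapq.heappush(heap, (d + w, y))
    return dist

adj = {"A": [("B", 5), ("C", 6)], "B": [], "C": [("D", 2)], "D": []}
print(dijkstra(adj, "A"))             # {'A': 0, 'B': 5, 'C': 6, 'D': 8}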
[Figure: an example graph on vertices 1–8.]

• DFS visits: 1, 2, 4, 6, 3, 5, 8, 7
14.2 Connectivity in Graphs
We call a graph 2-connected if there are no cut vertices; cut vertices split a graph into its 2-connected components. A figure-eight graph made of two triangles (or squares) sharing a vertex has two 2-connected components: the triangles/squares. Similarly, 3-connected means we can remove any two vertices without breaking the graph into components.
By the way, Paul Seymour, a famous name in graph theory, is visiting UW this weekend, and he's speaking tomorrow at 3:30. He's also getting an honorary degree on Saturday at convocation.
[Figure: a DFS tree on vertices 2–7; solid edges are DFS tree edges, dotted edges are "back edges."]
Claim: Every non-tree DFS edge goes from some u to an ancestor. e.g. we can’t have edge (5,7). This justifies
the term ”back edge.”
DFS Algorithm:
• Initialize: mark all vertices not visited; num ← 1.
• DFS(v), recursive:
– mark(v) ← visited
– DFSnum(v) ← num; num ← num + 1
– for each edge (v, w):
∗ if mark(w) = not visited then
· (v, w) is a tree edge
· parent(w) ← v
· DFS(w)
∗ else if parent(v) ≠ w then (v, w) is a back edge
15 Oct 28th, 2008
Remove an arbitrary (non-root, non-leaf) vertex v of the DFS tree: its children's subtrees are T1, ..., Ti, and T0 is the rest of the tree, connected from above. Are these still connected in G \ v? It depends on back edges: if Tj has a back edge to T0, then Tj stays connected to T0; otherwise it falls away (and is disconnected).
We need one more thing: high(v) = highest (i.e. lowest DFS number) vertex reachable from v by going down tree
edges and then along one back edge.
Claim: v is a cut vertex iff it has a DFS child x such that high(x) ≥ DFSnum(v).
Modifying the DFS code: set high(v) ← DFSnum(v) at the start of DFS(v); when (v, w) is a back edge, set high(v) ← min{ high(v), DFSnum(w) }; and when returning from a child w, set high(v) ← min{ high(v), high(w) }.
This is still O(n + m).
Backtracking: a systematic way to try all possibilities. In the workplace, when you need to find an algorithm, if you're extremely lucky it'll be one of the ones we encountered. More likely, it'll be similar to one we've seen. But most likely, it'll be one nobody knows how to solve efficiently – it's NP-complete. Backtracking gives exact (if slow) algorithms for such problems.
Options:
• Heuristic approach – run quickly, with no guarantee on the quality of the solution.
• Exact algorithm – and bear with the fact that it may take a long time.
Backtracking Algorithm: F = set of active configurations. Initially F holds one configuration: the whole problem. While F ≠ ∅: remove a configuration C from F and expand it into C1, ..., Ct. For each Ci, test for success (it solves the whole problem) and failure (a dead end); otherwise, add Ci to F.
Storing F :
[Diagram: the configuration tree for subset enumeration. Root: S = ∅, R = {1 ... n}. Branch "1 in": S = {1}, R = {2 ... n}; branch "1 out": S = ∅, R = {2 ... n}. Each of those then branches on "2 in"/"2 out", e.g. S = {1, 2}, R = {3 ... n}, and so on.]
Example: Subset Sum – Knapsack where weight is the value of each item.
Given items 1...n, weight w_i for item i, and W: find a subset S ⊆ {1, ..., n} with ∑_{i∈S} w_i ≤ W that maximizes ∑_{i∈S} w_i.

Decision version: can we find S with ∑_{i∈S} w_i = W?

A polynomial-time algorithm for this decision version gives poly time for the optimization version.
For the backtracking, let w = the weight chosen so far in a configuration and r = the total weight still undecided. Success: w = W. Failure (of the configuration): w > W or w + r < W.
This is O(2^n). Earlier, we built a dynamic programming algorithm for Knapsack with O(n × W) subproblems. Which is better? It depends on W: e.g. if W has n bits, then W ~ 2^n and backtracking is better.
15.2 Branch-and-Bound
• for optimization problems
• ”bound” – for each configuration compute a lower bound on the objective function and prune if ≥ minimum
so far.
General paradigm:
• F = active configurations
• While F ≠ ∅: expand configurations as in backtracking, pruning with the bound.
Given a graph G = (V, E) and edge weights w : E → R≥0 find a cycle C that goes through every vertex once and
has minimum weight.
Algorithm: based on enumerating subsets of edges. Configuration: Ic ⊆ E (included edges) and Xc ⊆ E (excluded edges), with Ic ∩ Xc = ∅. Undecided edges: E \ (Ic ∪ Xc).
Necessary conditions: E \ Xc must be connected – in fact, 2-connected. Ic must have ≤ 2 edges at each vertex and must not contain a cycle (other than a full tour).
How to branch? Take the next edge not yet decided: from configuration (Ic, Xc), choose e ∈ E \ (Ic ∪ Xc) and branch on e ∈ Ic versus e ∈ Xc. But how to bound? Given Ic, Xc, we want a lower bound on the minimum TSP tour respecting Ic, Xc – an efficiently computable one (so it's sort of like a heuristic, but we don't have issues of correctness).
16 Oct 30th, 2008
Instead of finding a tour, we find a 1-tree: a spanning tree on nodes 2, ..., n (not necessarily an MST of the whole graph) plus two edges from vertex 1 to the tree.
Claim Any TSP tour is a 1-tree, so w(min TSP tour) ≥ w(min 1-tree). Use this for the lower bound.
Claim We can efficiently find a minimum-weight 1-tree respecting Ic, Xc. (Not proven.)
Final Enhancements:
• When we choose the ”best” configuration C from F , as our measure of best, use the one with the minimum
1-tree.
• Designing algorithms
• Analyzing algorithms
Note the distinction between a lower bound for an algorithm and a lower bound for a problem. For an example, look at multiplying large integers. The school method is O(n²); in fact, the school method is Ω(n²) in the worst case, because there are example inputs that take ≥ c·n² steps. But there is an algorithm (divide and conquer) with a better worst-case runtime – O(n^k) with k < 2. A lower bound for the problem, by contrast, says that all algorithms have to take ≥ some amount of time.
In a comparison-based model, each comparison gives one bit of information, and since we need log n bits we
need log n comparisons. Often this argument is presented as a tree.
• (Lower end) some problems have Ω(n log n) lower bounds on special models.
16.3 Polynomial Time

Things we actually care about, like "is there a TSP algorithm in O(n⁶)?" – nobody knows. "Can the O(n³) dynamic programming algorithms be improved?" – nobody knows.
Major open question: Many practical problems have no polynomial time algorithm and no proved lower bound.
The best that’s known is proving that a large set of problems are all equivalent, and we know that solving one in
polynomial time solves all the others.
What is polynomial?

Θ(n)         YES
Θ(n²)        YES
Θ(n log n)   YES (it lies between O(n) and O(n²))
Θ(n¹⁰⁰)      YES
Θ(2ⁿ)        NO
Θ(n!)        NO
The algorithms in this course were (mostly) all poly-time, except backtracking and certain dynamic programming algorithms (specifically 0-1 Knapsack). Low-degree polynomials are efficient; high-degree polynomials don't seem to come up in practice.
Jack Edmonds is a retired C&O prof. The "matching" problem: given a graph, pair up vertices using disjoint edges. He first formulated the idea of polynomial time.
In any other algorithms class, you would cover linear programming. We have a C&O department that covers that, but if you're serious about algorithms, you should be taking courses over there.
Other history:
• In the 50's and 60's, there was a success story: linear programming and the simplex method – practical (though not polynomial).
• The next step was integer linear programming. It seemed promising at the time, and people reduced other problems to it; but in the 70's, with the theory of NP-completeness, we found it is actually a hard problem, and people instead did reductions from integer programming to prove hardness.
Our goal: to attempt to distinguish problems with poly-time algorithms from those that don’t have any. This is
the theory of NP-completeness. (NP = Non-deterministic Polynomial)
16.4 Reductions
Problem A reduces (in polytime) to a problem B – written A ≤ B or A ≤_P B, and we can say "A is easier than B" – if a (polytime) algorithm for B can be used to create a (polytime) algorithm for A. More precisely, there is a polytime algorithm for A that makes subroutine calls to an algorithm for B.
Note: we can have a reduction without having an algorithm for B.
Consequences of A ≤ B: an algorithm for B yields an algorithm for A. Conversely, a lower bound showing A has no polytime algorithm implies that B has no polytime algorithm.
Even without an algorithm for B or a lower bound for A, if we prove reductions A ≤P B and B ≤P A then A and
B are equivalent with respect to polytime (either both have them, or both don’t.)
Example: the longest increasing subsequence problem. We will reduce it – not to shortest path, but to longest path in a (directed acyclic) graph.
This is a reduction – it reduces the longest increasing subsequence problem to the longest path problem. Is it a
polynomial-time reduction?
How can we solve the longest path problem? Reduction to shortest path problem. Negate the edge weights.
17 Nov 4th, 2008

Today's topics: reductions (from last class), P and NP, and decision problems.
Examples
• TSP decision version: given a graph G = (V, E) with w : E → R⁺, and given some bound k ∈ R, is there a TSP tour of length at most k?
• Independent Set: given a graph G = (V, E) and k ∈ N, is there an independent set of size ≥ k? Optimization version: given G, find a maximum independent set.
Usually, decision and optimization are equivalent with respect to polynomial time – e.g. independent set. Typically, decision ≤_P opt is easy: on input G, k, find the maximum independent set and compare its size to k.
Showing opt ≤_P decision: suppose we have a poly-time algorithm for the decision version of independent set. For k = n, n−1, ..., 1, give (G, k) to the decision algorithm and stop at the first YES – that k is the maximum. Runtime: if decision takes O(n^t), this loop takes O(n^{t+1}).
We can find the actual independent set in polytime too. Idea: try vertex 1 in/out of independent set. Exercise:
fill this in and check poly-time.
Examples: in some sense, primality is the "decision" version of factoring. But although we can test primality in polynomial time, we do not know how to factor in polynomial time (and finding out how would be bad news for cryptography!)
Notes:
• Must be careful about model of computing and input size – count bits.
17.2 P or NP?
Which problems are in P? Which are not in P? We will study a class of "NP-complete" problems that are equally hard (wrt polytime), i.e. A ≤_P B ∀A, B in the class, and none of which seems to be in P.
Definition of NP ("nondeterministic polynomial time"): a set of problems containing the P problems and the NP-complete problems (which are all equivalent). Informally, NP problems can be solved in polytime if we get some lucky extra information.
For independent set, it’s easy to verify a graph has an independent set of size ≥ k if you’re given the set. Contrast
with verifying that G has no independent set of size ≥ k, what lucky info would help?
e.g. primes: given n, is it prime? Not clear what info to give (there is some) but for composite numbers (given n,
is it composite (= not prime?)) we could give factors.
A certifier algorithm takes an input plus a certificate (our extra info). An algorithm B is a certifier for problem X if:
• ∀s: s is a YES input for X iff ∃t (a "certificate") such that B(s, t) outputs YES.
B is a polytime certifier if B(s, t) runs in time polynomial in |s|, for certificates t of size polynomial in |s|.
Examples
• Independent Set
Input is a graph G and k ∈ N. Question: does G have an independent set of size ≥ k?
Claim: Independent Set ∈ NP.
Proof Certificate: U ⊆ V (a set of vertices). Certifier: check that U is an independent set and that |U| ≥ k.
• Non-TSP
Does G have no TSP tour of length ≤ k?
Is Non-TSP in N P ? Nobody knows.
• Subset-Sum:
Input: w_1, ..., w_n, W ∈ R⁺. Is there a subset S ⊆ {1 ... n} whose sum is exactly W?
Claim: Subset Sum ∈ N P . Certificate: S. Certifier: add the weights in S.
17.3 Properties
Claim P ⊆ N P .
Let X be a decision problem in P, so X has a polytime algorithm. To show X ∈ NP:
• Certificate: nothing.
• Certifier: run X's own polytime algorithm on s, ignoring the certificate.
Claim: any problem in NP has an exponential algorithm; in particular, one with running time 2^{poly(n)}.
Proof idea: try all possible certificates using the certifier. The number of certificates is 2^{poly(n)}.
Open Questions
Is P = NP? co-NP is the class of "no versions" of NP problems; non-TSP is in co-NP. Is co-NP = NP? Is P = NP ∩ co-NP?
18 Nov 6th, 2008
18.2 NP-Complete

These are the hardest problems in NP. Definition: a decision problem X is NP-complete if:
1. X ∈ NP
2. For every Y ∈ NP, Y ≤_P X.
1. If X is NP-complete and X has a polytime algorithm, then P = NP, i.e. every Y ∈ NP has a polytime algorithm.
2. If X is NP-complete and X has no polytime algorithm (i.e. a lower bound), then no NP-complete problem has a polytime algorithm.
The first NP-completeness proof is hard: to show X NP-complete, we must show Y ≤_P X for all Y ∈ NP.
Subsequent NP-completeness proofs are easier. If we know X is NP-complete, then to prove Z is NP-complete:
1. Prove Z ∈ NP
2. Prove X ≤_P Z
Note that X is a known NP-complete problem and Z is the new problem. Please don't get this backwards.
18.2.1 Circuit-SAT

[Figure: a Boolean circuit with ∧, ∨, ¬ gates over inputs x1, x2.]
This is a dag with OR, AND, and NOT operations. 0-1 values for variables determine output value. e.g. if x1 = 0
and x2 = 1 then output = 0.
Question: Are there 0-1 values for variables that give 1 as output?
Proof Sketch: Circuit-SAT ∈ NP as above. We must show Y ≤_P Circuit-SAT for all Y ∈ NP. The idea is that an algorithm becomes a circuit computation: a certifier algorithm with an unknown certificate becomes a circuit with variables as some of its inputs. The question "is there a certificate such that the certifier says YES" becomes circuit satisfiability.
Essentially, if we had a polynomial time way to test circuit satisfiability, we would have a general way to solve any
problem in N P by turning it into a Circuit-SAT problem.
18.2.2 3-SAT
Satisfiability: (of Boolean formulas).
• Input: a boolean formula.
e.g. (x1 ∧ x2 ) ∨ (¬x1 ∧ ¬x2 )
• Question: is there an assignment of 0, 1 to variables to make the formula TRUE (i.e. 1?)
Well, circuits ≈ formulas, so these satisfiability problems should be equivalent; we will be rigorous about it. Even this special form of Satisfiability (SAT) is NP-complete:
3-SAT: e.g. (x1 ∨ ¬x1 ∨ x2) ∧ (x2 ∨ x3 ∨ x4) ∧ .... The "formula" is the ∧ of "clauses," each clause the ∨ of three literals. A literal is a variable or the negation of a variable.
Proof
• 3-SAT ∈ N P :
Certificate: values for variables.
Certifier algorithm: check that each clause has ≥ 1 true literal.
• 3-SAT is at least as hard as a known NP-complete problem:
i.e. prove Circuit-SAT ≤_P 3-SAT.
Assume we have a polytime algorithm for 3-SAT, so use it to create a polytime algorithm for Circuit-SAT.
Input to algorithm is a circuit C and we want to construct in polytime a 3-SAT formula F to send to the
3-SAT algorithm s.t. C is satisfiable iff F is satisfiable.
We could derive a formula by carrying the inputs up through the tree (i.e. for subformulas f1 and f2 feeding an ∨ gate, just pull the inputs up and write f1 ∨ f2). Caution: the size of the formula doubles at every level (thus this is not a polynomial-time, or polynomial-size, reduction).
Idea: make a variable for every node in the circuit. Rewrite a ≡ b as (a ⇒ b) ∧ (b ⇒ a), and a ⇒ b as (b ∨ ¬a). Then a ≡ (b ∨ c) becomes (a ⇒ (b ∨ c)) ∧ ((b ∨ c) ⇒ a), i.e. (b ∨ c ∨ ¬a) ∧ (a ∨ ¬(b ∨ c)) = (b ∨ c ∨ ¬a) ∧ (a ∨ (¬b ∧ ¬c)). Distributing, we get (b ∨ c ∨ ¬a) ∧ (a ∨ ¬b) ∧ (a ∨ ¬c).
Note: we can pad these size two clauses by adding new dummy variable t and (a ∨ b ∨ t) ∧ (a ∨ b ∨ ¬t) etc.
There’s a similar padding for size 1.
The final formula F: the ∧ of the clauses for every gate, together with the output variable itself (asserting the output is 1).
Question: are there T/F values for the variables that make F true?
Similarly, general SAT is NP-complete. Proof:
• SAT ∈ NP
• 3-SAT ≤_P SAT (a 3-SAT formula is already a formula)
19 Nov 11th, 2008

19.2 Independent Set
For each clause in F (the 3-SAT formula), we make a triangle in the graph. For example, (x1 ∨ x2 ∨ ¬x3) becomes a triangle on three vertices labelled x1, x2, ¬x3, with edges (x1, x2), (x2, ¬x3), (¬x3, x1). With m clauses we get 3m vertices. We also join every pair of complementary labels, i.e. each vertex labelled x_i to each vertex labelled ¬x_i.

For example: (x1 ∨ x2 ∨ ¬x3) ∧ (x1 ∨ ¬x2 ∨ x3) becomes two triangles, plus conflict edges between x2 and ¬x2 and between ¬x3 and x3.
Details of Algorithm:
• Input: a 3-SAT formula F
– Construct G
– Call Independent-Set algorithm on G, m
– Return answer
• Runtime: Constructing G takes poly time. Independent set runs in poly time by assumption.
• Proof: (⇒) Suppose we can assign T/F to variables to satisfy every clause, so each clause has ≥ 1 true literal. From each triangle, pick the vertex of one true literal. This gives an independent set of size m (no triangle edge is used, and no conflict edge, since we never pick both x_i and ¬x_i).
(⇐) An independent set of size m in G must use exactly one vertex from each triangle. Set the corresponding literals to be true (consistent, thanks to the conflict edges) and set any remaining variables arbitrarily. This satisfies all clauses.
19.3 Vertex Cover (VC)
• VC ∈ NP. Certificate: a set U ⊆ V. Certifier: verify that U is a vertex cover and |U| ≤ k.
• Ind-Set ≤_P VC. Ind-Set and VC are closely related:

Claim U ⊆ V is an independent set iff V − U is a vertex cover.

Suppose that we have a polynomial-time algorithm for VC. Here's an algorithm for independent set: on input G, k, call the VC algorithm on G, n − k.
Correctness: by the Claim, G has an independent set of size ≥ k iff G has a vertex cover of size ≤ n − k.
19.4 Set-Cover Problem

Given a set E of elements and subsets S_1, ..., S_n ⊆ E whose union is E, and k ∈ N.

Question: can we choose k of the S_i's that still cover all the elements? I.e., are there indices i_1, ..., i_k such that

⋃_{j=1..k} S_{i_j} = E
Example: given a set of intersecting rectangles, can we throw some away and still cover the same area?
Theorem Set-Cover is NP-complete.
Please find reduction proof on the Internet.
[Diagram: the tree of reductions used so far:
3-SAT ≤_P Subset-Sum
3-SAT ≤_P Independent Set ≤_P VC ≤_P Set-Cover
3-SAT ≤_P Hamiltonian Cycle ≤_P TSP]
Hamiltonian Cycle is NP-complete. Proof: (1) Ham. Cycle ∈ NP, and (2) 3-SAT ≤_P Ham. Cycle: give a polytime algorithm for 3-SAT assuming we have one for Ham. Cycle.
20 Nov 13th, 2008
Directed Ham. Cycle ≤_P undirected Ham. Cycle: for each vertex v of the directed graph G, create v_in, v_mid, v_out joined in a path; each directed edge (u, v) becomes the undirected edge (u_out, v_in). Call the result G′.

Claim G′ has polynomial size: if G has n vertices and m edges, then G′ has 3n vertices and m + 2n edges.
Claim (correctness) G has a directed H.C. iff G′ has an undirected H.C.
(⇒) easy.
(⇐) Each v_mid has degree two, so the Hamiltonian cycle must use both of its incident edges. Then, at each v, it must use one incoming edge (at v_in) and one outgoing edge (at v_out), which orients the cycle in G.
This is the level of N P -completeness proof you’ll be expected to do on your assignment.
Hamiltonian Path is also NP-complete.

Proof
• Ham. Path ∈ NP (as before)
• Ham Cycle ≤_P Ham Path
Want algorithm for Ham. Cycle using algorithm for Ham Path. Given G, input for Ham. cycle,
construct G0 such that G has H.C. iff G0 has Ham path.
First idea: G′ ← G. Well, ⇒ is OK, but we can find a counterexample for ⇐. Exercise: find a counterexample.
Second idea: create three new vertices a, b, c in G′ and connect a and c to all vertices of G.
Third idea: add a single new vertex and connect it to everything in G′.
Fourth idea: erase each vertex from G one at a time and ask for a Hamiltonian path each time.
Final idea: take one vertex v and split it into two identical copies; then add new vertices s and t attached to the two copies.
Claim poly-size.
20.3 Subset-Sum is NP-complete

Proof
1. Subset-Sum ∈ NP.
2. 3-SAT ≤_P Subset-Sum: give a polynomial-time algorithm for 3-SAT using a polytime algorithm for Subset-Sum. Input: a 3-SAT formula F with variables x_1, x_2, ..., x_n and clauses c_1, ..., c_m. Construct a Subset-Sum input a_1, ..., a_t, W such that F is satisfiable iff some subset of the a_i's sums to exactly W.
Example: F = (x1 ∨ ¬x2 ∨ x3) ∧ (¬x1 ∨ ¬x2 ∨ x3).
Each row is a number, written as digits in columns c_1 ... c_m, then x_1 ... x_n. For the example:

            c1  c2    x1  x2  x3
x1           1   0     1   0   0
¬x1          0   1     1   0   0
x2           0   0     0   1   0
¬x2          1   1     0   1   0
x3           1   1     0   0   1
¬x3          0   0     0   0   1
slack 1,1    1   0     0   0   0
slack 1,2    2   0     0   0   0
slack 2,1    0   1     0   0   0
slack 2,2    0   2     0   0   0
W            4   4     1   1   1

(In general there are rows x_i and ¬x_i for every variable; each clause column of W is 4, encoding "≥ 1 true literal", and each variable column is 1.)
Interpret the rows as numbers – in base 10 rather than binary, so that no carries can occur between columns. Column x_i has 1's in rows x_i and ¬x_i, and zeros elsewhere.
• We want to choose the x_i row or the ¬x_i row, but not both: the digit 1 in column x_i of W forces exactly one.
• We want to express "clause column total ≥ 1." Solution: add two slack rows per clause column c_i – slack i,1 with a 1 in c_i, and slack i,2 with a 2 in c_i – and 0 everywhere else.
(⇒) If F is satisfiable, pick the rows of the true literals; each clause c_i then contributes 1, 2, or 3 to column c_i. Top it up to exactly 4 with slack rows: with 3 true literals use slack i,1 (3 + 1 = 4); with 2 use slack i,2 (2 + 2 = 4); with a single true literal use both slack i,1 and slack i,2 (1 + 1 + 2 = 4). This row set gives sum W.
(⇐) Suppose some subset of rows adds to W. Column x_i forces us to use exactly one of rows x_i, ¬x_i: set x_i = T or F accordingly. This satisfies all clauses: consider c_j and sum down column c_j to get 4; the slack rows give ≤ 3, so some literal row in c_j must be chosen, i.e. some literal in c_j is true.
21 Nov 18th, 2008
• ∈ NP:
– Input s
– Convert the certifier B into a circuit C_n
– Hand C_n to the Circuit-SAT subroutine
• Minimum Weight Triangulation for a point set: shown NP-complete in 2006 (not a famous problem).
Graph Isomorphism: given two graphs, each on n vertices, are they the same after relabeling vertices?
21.2 Undecidability
So far we’ve been talking about efficiency of algorithms. Now, we’ll look at problems with no algorithm whatsoever.
This is also a topic not conventionally covered in an algorithms course. So you won’t find it in textbooks. But
everyone in the School of Computer Science thinks it’s ”absolutely crucial” that everyone graduating with a
Waterloo degree knows this stuff.
21.2.1 Examples
Tiling: given square tiles with colours on their sides, can I tile the whole plane with copies of these tiles? Colours must match, and no rotations or flips are allowed.
The answer: there is no algorithm. For a finite k × k piece of the plane it is decidable – with t tile types, I could just try all t choices in each of the k² places, so that problem is O(t^{k²}).
Program Verification: given a specification of inputs and corresponding outputs (the specification is finite, the potential number of inputs infinite), and given a program, does this program give the correct corresponding output on every input?
Answer: no algorithm. On one hand, this is sad for software engineers, because their processes attempt to check exactly this. On the plus side, your skills and ingenuity will always be needed...
Halting Problem: Given a program, does it halt (or go into an infinite loop?)
Sample-Program
while x ≠ 1 do
x←x−2
end
Sample-Program-2
while x ≠ 1 do
    if x is even then x ← x/2
    else x ← 3x + 1
end
Assume x > 0. Sample runs: x = 5, 16, 8, 4, 2, 1. x = 9, 28, 14, 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1.
Does this program halt for all x? That’s open.
22 Nov 20th, 2008
Also, any math question about the existence of a number can be turned into a halting question. Idea: to decide "is there an x such that Foo(x)?", run: x ← 1; while not Foo(x), x ← x + 1. This halts iff such an x exists.
• Turing machines
• Java programs
• RAM
• Circuit families
• Bertrand Russell (1872-1970) Russell’s paradox (recommend his biography, and some philosophy books)
Let S = the set of sets that do not contain themselves. Is S a member of itself?
Halting Problem

23 Nov 25th, 2008

Suppose, for contradiction, that H(P, x) is an algorithm deciding whether program P halts on input x. Build a new program H′ that takes a program B as its input:
begin
call H(B, B)
if no, halt.
else, loop forever.
end
So H 0 is like Russell’s set S. His question, ”does S contain S?” is like asking, ”does H 0 halt on its own input?”
Suppose yes, then this is a yes case of the halting problem. So H(H 0 , H 0 ) outputs yes. Look at code for H 0 on
input H 0 . It loops forever. Contradiction.
Suppose no. Then this is the no case of the halting problem. So H(H 0 , H 0 ) outputs no. But then (looking at
code of H 0 ) H 0 halts on input H 0 . Contradiction either way. Therefore, our assumption that H exists is wrong.
Therefore, there is no algorithm to decide the halting problem.
23.1 Undecidability
Recall: a decision problem is undecidable if there is no algorithm for it.
Theorem: If P and Q are decision problems and P is undecidable and P ≤ Q then Q is undecidable.
Proof By contradiction. Suppose Q is decidable. Then it has an algorithm. By the definition of ≤, we get an
algorithm for P . This is contrary to P undecidable.
23.2 Other Undecidable Problems
Halt-No-Input ≤ Program Verification: suppose we have an algorithm V to decide Program Verification. Make an algorithm to solve Halt-No-Input.
Input: program A.
Output: does A halt?
Idea: modify the code of A to get a program A′ with input and output:

A′: read input, discard it
    run A
    output 1

Then ask V whether A′ meets the spec "output 1 on every input." This will work, but we need more formality about input/output specs. Let's try another approach.
Halt-No-Input ≤ Program-Equiv.
Suppose we have an algorithm for Program Equivalence. Make an algorithm for Halt-No-Input. Input: program A. Algorithm: make A′ as above. Make program B: read input, just output 1. Call the Program-Equiv algorithm on (A′, B).
24 Nov 27th, 2008
Correctness
A0 is equivalent to B iff A halts.
Possible approach (for existence-of-an-integer questions): try all integers. This will correctly answer "yes" if the answer is "yes" – but it may run forever otherwise, and even "yes" can take a while: e.g. the least integer solution to x² = 991y² + 1 has a 30-digit x and a 29-digit y.
• Parameterized tractability: exponential algorithms that run in polynomial time for special inputs. For example, parameterize by the maximum degree of the graph; there may be algorithms that run in polytime when the maximum degree is bounded.
• Exact exponential time algorithm: use heuristics to make branch-and-bound explore the most promising
choice first (and run fast sometimes.)
– Vertex Cover: Greedy algorithm that finds a good (not necessarily min) vertex cover.
24.1 What to do with NP-complete problems
Some NP-complete problems have approximation factors as close to 1 as we like – at the cost of
increasing running time. Limit is approximation factor = 1 (an exact algorithm) with an exponential-
time algorithm.
– Example: Subset-Sum

Given w_1, ..., w_n and W: is there S ⊆ {1 ... n} such that ∑_{i∈S} w_i = W?

As optimization: maximize ∑_{i∈S} w_i subject to ∑_{i∈S} w_i ≤ W.

Recall: dynamic programming gives O(n × W).
Note: ∑_{i∈S} w_i ≥ (1/2)·(true max) would be a 2-approximation; ∑_{i∈S} w_i ≥ (1/(1+ε))·(true max) is a "(1+ε)-approximation."
Claim: there is a (1+ε)-approximation algorithm for Subset-Sum with runtime O((1/ε)·n³). As ε → 0 we get a better approximation but a worse runtime.
Idea: apply dynamic programming to rounded input.
Rough rounding – few bits – rough approximation.
Refined rounding – many bits – good approximation.
Rounding parameter b (later we set b = (ε/n)·max{w_i : i = 1...n}).

Round the weights: w̃_i ← ⌊w_i/b⌋, and W̃ ← ⌊W/b⌋. Then

W̃ = O(W/b) = O( W·n / (ε·max w_i) ) ≤ O( (1/ε)·n² )

since W ≤ n·(max w_i). Therefore, the DP on the rounded input runs in O(n × W̃) = O((1/ε)·n³).
How good is our approximation? Each b·w̃_i is below w_i by at most b, so for the set S we find,

true max ≤ ∑_{i∈S} w_i + n·b ≤ ∑_{i∈S} w_i + ε·(max w_i) ≤ ∑_{i∈S} w_i + ε·∑_{i∈S} w_i = (1 + ε)·∑_{i∈S} w_i

The second-to-last step assumes max w_i ≤ ∑_{i∈S} w_i; if not, use the single item of weight max w_i as the solution instead. Therefore, we have a (1+ε)-approximation algorithm. (And assume w_i ≤ W ∀i; else throw item i out.)
Idea: the dynamic programming algorithm is very good – it just can't handle numbers with lots of bits. So throw away half the bits and get an approximate answer.
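A sketch of that rounding idea. To keep the chosen set feasible for the original instance, this version rounds weights up rather than down (the lecture's down-rounding analysis absorbs the error differently; the full (1+ε) guarantee needs the more careful argument above):

import math

def best_subset(ws, W):
    # Exact DP, O(n*W) for integer weights: returns (best sum <= W, subset)
    reach = {0: ()}                          # achievable sum -> items used
    for i, w in enumerate(ws):
        for s, items in list(reach.items()):
            t = s + w
            if t <= W and t not in reach:
                reach[t] = items + (i,)
    best = max(reach)
    return best, reach[best]

def approx_subset_sum(ws, W, eps):
    # b = (eps/n) * max(ws); ceil-rounding means any set feasible for the
    # scaled instance (sum of ceils <= W//b) is feasible for the original
    b = max(1, math.floor(eps * max(ws) / len(ws)))
    scaled = [math.ceil(w / b) for w in ws]
    _, items = best_subset(scaled, W // b)   # table is only O(n^2 / eps) wide
    return sum(ws[i] for i in items)         # report the set's true weight

ws, W = [104, 102, 201, 101], 300
print(best_subset(ws, W)[0], approx_subset_sum(ws, W, 0.1))   # 206 206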
• Quantum Computing
The hope is that it offers massive parallelism for free. Huge result (Shor, 1994) – efficient factoring on a
quantum computer.
Waterloo is, by the way, the place to be for quantum computing. In Physics, CS, and C&O we have experts
on the subject.
To read a tiny bit more on quantum computing, see [DPV].
24.2 P vs. NP