
Jack Kwan - 3597451

Vlad Voytenko
COMP 372
Assignment 4

Q. 34.1-1
Define the optimization problem LONGEST-PATH-LENGTH as the relation that
associates each instance of an undirected graph and two vertices with the number of edges in a
longest simple path between the two vertices. Define the decision problem LONGEST-PATH =
{⟨G, u, v, k⟩ : G = (V, E) is an undirected graph, u, v ∈ V, k ≥ 0 is an integer, and there exists a
simple path from u to v in G consisting of at least k edges}. Show that the optimization problem
LONGEST-PATH-LENGTH can be solved in polynomial time if and only if
LONGEST-PATH ∈ P.

A.

The LONGEST-PATH-LENGTH optimization problem and the LONGEST-PATH decision
problem are closely related. The optimization problem asks for the length of the longest simple
path between two vertices in a graph, while the decision problem asks whether there exists a
simple path of at least a certain length between two vertices.

If the optimization problem LONGEST-PATH-LENGTH can be solved in polynomial time, then
the decision problem LONGEST-PATH can also be solved in polynomial time. Given an instance
⟨G, u, v, k⟩ of the decision problem, we can solve the optimization problem to find the length of
the longest simple path between u and v. If this length is at least k, then we return "yes";
otherwise, we return "no". This procedure clearly runs in polynomial time if the optimization
problem can be solved in polynomial time.

Conversely, if the decision problem LONGEST-PATH can be solved in polynomial time, then the
optimization problem LONGEST-PATH-LENGTH can also be solved in polynomial time. Given
an instance ⟨G, u, v⟩ of the optimization problem, first solve the decision problem with k = 0; if
the answer is "no", then no simple path from u to v exists at all. Otherwise, binary-search for the
largest k that yields a "yes" answer. Maintain bounds lo = 0 and hi = |V| - 1 (a simple path has at
most |V| - 1 edges). While lo < hi, solve the decision problem with k = ⌈(lo + hi) / 2⌉; if the
answer is "yes", set lo = k, and otherwise set hi = k - 1. When lo = hi, that common value is the
length of the longest simple path. This makes O(log |V|) calls to the decision procedure and
therefore runs in polynomial time if the decision problem can be solved in polynomial time.
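This binary search can be sketched in Python. The decision oracle below is only a brute-force
stand-in (exponential time, for illustration) for the assumed polynomial-time LONGEST-PATH
procedure; the function names are illustrative, not from the text.

```python
def longest_path_decision(adj, u, v, k):
    # Brute-force stand-in for the LONGEST-PATH decision oracle (exponential
    # time, illustration only): is there a simple u-v path with >= k edges?
    def dfs(node, visited, length):
        if node == v:  # a simple u-v path must end the first time it reaches v
            return length >= k
        return any(dfs(w, visited | {w}, length + 1)
                   for w in adj[node] if w not in visited)
    return dfs(u, {u}, 0)

def longest_path_length(adj, u, v):
    # Binary search for the largest k with a "yes" answer; O(log |V|) calls.
    n = len(adj)
    if not longest_path_decision(adj, u, v, 0):
        return None  # no simple u-v path exists at all
    lo, hi = 0, n - 1  # a simple path has at most |V| - 1 edges
    while lo < hi:
        k = (lo + hi + 1) // 2  # round up so lo always makes progress
        if longest_path_decision(adj, u, v, k):
            lo = k
        else:
            hi = k - 1
    return lo
```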

Therefore, the optimization problem LONGEST-PATH-LENGTH can be solved in polynomial
time if and only if LONGEST-PATH ∈ P.

Q. 34.1-4
Is the dynamic-programming algorithm for the 0-1 knapsack problem that is asked
for in Exercise 16.2-2 a polynomial-time algorithm? Explain your answer.
(Exercise 16.2-2:
Give a dynamic-programming solution to the 0-1 knapsack problem that runs in
O(n W) time, where n is the number of items and W is the maximum weight of
items that the thief can put in his knapsack.
)

A.

The dynamic programming solution for the 0-1 knapsack problem runs in O(nW) time, where n
is the number of items and W is the maximum weight of items that the thief can put in his
knapsack.

However, this is not a polynomial-time algorithm in the usual sense, because running time must
be measured against the size of the input in bits.

The size of the input is proportional to n log W, since each of the n weights is at most W and
takes O(log W) bits to represent. The running time O(nW) is not polynomial in n log W: writing
W = 2^(log W) shows that the factor W is exponential in the number of bits needed to represent
it. Only if W were bounded by a polynomial in n would O(nW) be polynomial in the input size.

So the dynamic-programming solution for the 0-1 knapsack problem is a pseudo-polynomial
time algorithm. It is polynomial in the numeric value of the input, but not polynomial in the size of
the input (the number of bits needed to represent the input).
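A minimal Python sketch of the O(nW) dynamic program (the one-dimensional-table variant):

```python
def knapsack_01(weights, values, W):
    # dp[w] = best total value achievable with knapsack capacity w.
    # Time O(n * W): polynomial in the numeric value W, but exponential in
    # the number of bits needed to write W down.
    dp = [0] * (W + 1)
    for wt, val in zip(weights, values):
        for w in range(W, wt - 1, -1):  # descending: each item used at most once
            dp[w] = max(dp[w], dp[w - wt] + val)
    return dp[W]
```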

Q. 34.2-1
Consider the language GRAPH-ISOMORPHISM = {⟨G1, G2⟩ : G1 and G2 are
isomorphic graphs}. Prove that GRAPH-ISOMORPHISM ∈ NP by describing a
polynomial-time algorithm to verify the language.

A.

To prove that GRAPH-ISOMORPHISM is in NP, we need to provide a certificate and a
polynomial-time verification algorithm that can check, given a certificate, whether a pair of
graphs G1 and G2 are isomorphic.

A certificate for GRAPH-ISOMORPHISM can be a one-to-one mapping (a bijection) f from the
vertices of G1 to the vertices of G2. This mapping represents a proposed isomorphism from G1
to G2.
The verification algorithm works as follows:

Check that f is a bijection from the vertices of G1 to the vertices of G2. This can be done in
polynomial time by checking that each vertex in G1 maps to a unique vertex in G2 and vice
versa.

Check that f preserves adjacency, i.e., for every pair of vertices u and v in G1, u and v are
adjacent in G1 if and only if f(u) and f(v) are adjacent in G2. This can be done in polynomial time
by iterating over each pair of vertices in G1 and checking the corresponding pair of vertices in
G2.

If both checks pass, then the algorithm accepts the certificate; otherwise, it rejects.

The certificate has polynomial size (one vertex pair per vertex of G1), and both steps of the
verification algorithm can be performed in polynomial time, so GRAPH-ISOMORPHISM is in NP.
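The two verification steps can be sketched in Python (the adjacency-dict representation and
function names are my own, not from the text):

```python
def verify_isomorphism(G1, G2, f):
    # G1, G2: adjacency dicts mapping vertex -> set of neighbours.
    # f: the certificate, a proposed bijection from V(G1) to V(G2).
    V1, V2 = set(G1), set(G2)
    # Step 1: check that f is a bijection from V1 onto V2.
    if set(f) != V1 or set(f.values()) != V2 or len(set(f.values())) != len(f):
        return False
    # Step 2: check that f preserves adjacency in both directions.
    return all((v in G1[u]) == (f[v] in G2[f[u]])
               for u in V1 for v in V1 if u != v)
```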

Q. 34.2-8
Let ϕ be a boolean formula constructed from the boolean input variables x_1, x_2, …, x_k,
negations (¬), ANDs (∧), ORs (∨), and parentheses. The formula ϕ is a tautology if it
evaluates to 1 for every assignment of 1 and 0 to the input variables. Define TAUTOLOGY as
the language of boolean formulas that are tautologies. Show that TAUTOLOGY ∈ co-NP.

A.

To show that TAUTOLOGY is in co-NP, we need to show that its complement,
NOT-TAUTOLOGY, is in NP.

NOT-TAUTOLOGY is the set of boolean formulas that are not tautologies. In other words, a
boolean formula is in NOT-TAUTOLOGY if and only if there exists some assignment of the
variables that makes the formula evaluate to false.

A certificate for NOT-TAUTOLOGY can be an assignment of the variables that makes the
formula evaluate to false. Given such a certificate, we can verify in polynomial time whether the
formula evaluates to false under this assignment by simply substituting the values of the
variables into the formula and evaluating it.

Since we can verify a certificate for NOT-TAUTOLOGY in polynomial time, NOT-TAUTOLOGY is
in NP. Therefore, TAUTOLOGY is in co-NP.
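The certificate check can be sketched in Python, with Python boolean expressions standing in
for parsed formulas; the brute-force is_tautology below is only a sanity check for small
formulas, not the NP verifier.

```python
from itertools import product

def verify_not_tautology(formula, assignment):
    # Certificate check for NOT-TAUTOLOGY: substitute the assignment and
    # evaluate; accept iff the formula comes out false.
    return not eval(formula, {}, dict(assignment))

def is_tautology(formula, variables):
    # Exponential brute force over all 2^k assignments (sanity check only).
    return all(eval(formula, {}, dict(zip(variables, bits)))
               for bits in product((False, True), repeat=len(variables)))
```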

Q. 34.3-1
Verify that the circuit in Figure 34.8(b) is unsatisfiable.
(Figure 34.8(b): circuit diagram not reproduced here.)

A.
Claim: no assignment of values to x1, x2, and x3 causes the circuit in Figure 34.8(b)
to produce a 1 output; it always produces 0, and so it is unsatisfiable.

Start by trying the same assignment as for 34.8(a), ⟨x1 = 1, x2 = 1, x3 = 0⟩; tracing the wire
values through the gates shows the output is 0.

Next, observe that setting x1 or x2 to 0 makes the bottom AND gate output 0, which makes the
final AND gate output 0 regardless of x3. This rules out every assignment with x1 = 0 or x2 = 0.

The last remaining assignment is ⟨x1 = 1, x2 = 1, x3 = 1⟩, and tracing the gates again gives
output 0.

Since every assignment to the inputs produces output 0, no assignment can cause the output of
the circuit to be 1. The circuit is unsatisfiable.
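Since the gate-level structure lives in the textbook figure, the check can only be sketched
generically: given any circuit as a function of its input bits, unsatisfiability is a brute-force
check of all 2^n assignments. The contradictory circuit below is a hypothetical stand-in, not
the gates of Figure 34.8(b).

```python
from itertools import product

def is_satisfiable(circuit, n_inputs):
    # circuit: a callable taking a tuple of bits and returning the output bit.
    # Try all 2^n assignments (fine for a 3-input circuit like the figure's).
    return any(circuit(bits) for bits in product((0, 1), repeat=n_inputs))

def contradictory(bits):
    # Hypothetical stand-in: x1 AND (NOT x1) outputs 0 on every assignment.
    return bits[0] and not bits[0]
```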

Q. 34.4-1
Consider the straightforward (nonpolynomial-time) reduction in the proof of Theorem 34.9.
Describe a circuit of size n that, when converted to a formula by this
method, yields a formula whose size is exponential in n.

A.

A circuit that would yield a formula of exponential size when converted using this method is a
circuit composed of a sequence of XOR gates. Consider a circuit with n inputs, where each
input is connected to an XOR gate with the output of the previous gate. The circuit would look
something like input1 - XOR - input2 - XOR - input3 - XOR - ... - XOR - inputn

When you convert this circuit into a Boolean formula, each XOR gate, which is a binary
operation, gets replaced with an equivalent expression using AND, OR, and NOT operations:
XOR(a, b) = (a AND NOT(b)) OR (NOT(a) AND b)

Note that the first input a appears twice in the replacement. In the chain, a is the entire
subformula for the previous XOR gate, so at each stage the previous formula gets copied twice.
If L(i) denotes the size of the formula after i gates, then L(i) ≥ 2 · L(i - 1), so the formula for
the whole chain has size at least 2^(n-1), which is exponential in the number of inputs n. This
is a worst case for this type of conversion.
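The doubling can be checked directly by building the formula that the naive substitution
produces (Python string construction; the function name is illustrative):

```python
def xor_chain_formula(n):
    # Build the AND/OR/NOT formula the naive conversion produces for the
    # n-input XOR chain. Each substitution copies the previous formula twice
    # (once plain, once negated), so the length at least doubles per gate.
    f = "x1"
    for i in range(2, n + 1):
        b = f"x{i}"
        f = f"(({f}) and not ({b})) or (not ({f}) and ({b}))"
    return f
```

len(xor_chain_formula(n)) grows like 2^n, while the circuit itself has only n - 1 gates.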

Q. 34.5-1
The subgraph-isomorphism problem takes two undirected graphs G1 and G2, and
it asks whether G1 is isomorphic to a subgraph of G2. Show that the subgraph-isomorphism
problem is NP-complete

A.

First, the problem is in NP. Given graphs G1 and G2, a certificate is an injective mapping f from
the vertices of G1 to the vertices of G2. We can verify in polynomial time that f is injective and
that every edge (u, v) of G1 maps to an edge (f(u), f(v)) of G2, which shows that G1 is
isomorphic to a subgraph of G2.

Next, the problem is NP-hard, by reduction from the well-known NP-complete Clique problem.
Given an instance of the Clique problem, a graph G and a number k, construct G1 as the
complete graph K_k on k vertices and let G2 = G; this takes polynomial time, since K_k has
O(k^2) edges. If there is a clique of size k in G, then G1 (the complete graph on k vertices) is
isomorphic to a subgraph of G. Conversely, if G1 is isomorphic to a subgraph of G, then G has a
clique of size k. Therefore, if we could solve the Subgraph Isomorphism problem in polynomial
time, we could solve the Clique problem in polynomial time, proving that Subgraph Isomorphism
is NP-hard.

Therefore, the Subgraph Isomorphism problem is NP-complete.
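A Python sketch of the reduction (has_clique is an exponential brute force used only to
sanity-check the construction on a tiny instance):

```python
from itertools import combinations

def clique_to_subgraph_iso(edges, k):
    # Reduction from CLIQUE to SUBGRAPH-ISOMORPHISM:
    # G1 = complete graph K_k, G2 = the original graph G.
    G1 = list(combinations(range(k), 2))  # all pairs: edges of K_k
    G2 = list(edges)
    return G1, G2

def has_clique(edges, n, k):
    # Exponential brute-force check, for sanity-testing only.
    es = {frozenset(e) for e in edges}
    return any(all(frozenset(p) in es for p in combinations(c, 2))
               for c in combinations(range(n), k))
```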

Q. 35.1-1
Give an example of a graph for which APPROX-VERTEX-COVER always yields a
suboptimal solution.

A.

The APPROX-VERTEX-COVER algorithm operates by repeatedly selecting an arbitrary edge
(u, v), adding u and v to the vertex cover, and then removing all edges incident on u or v from
the graph. This method can produce a suboptimal solution for specific graphs.

Example graph: take vertices {1, 2, 3, 4, 5, 6} with edges (1, 2), (2, 3), (2, 4), (4, 5), (5, 6)
(the original figure is not reproduced here; this graph is consistent with the run described
below).

The best vertex cover for this graph is {2, 5}, which has size 2: every edge touches 2 or 5, and
since (1, 2) and (5, 6) share no vertex, any cover needs at least 2 vertices. However, if the
APPROX-VERTEX-COVER algorithm begins by selecting the edge (1, 2), it will include vertices
1 and 2 in the vertex cover and then eliminate the edges (1, 2), (2, 3), and (2, 4). The graph that
remains has only the edges (4, 5) and (5, 6).
Next, the algorithm will select the edge (5, 6), include vertices 5 and 6 in the vertex cover, and
eliminate both remaining edges. The vertex cover that results will be {1, 2, 5, 6}, which has
a size of 4 and is suboptimal.
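The run can be reproduced with a short Python sketch; the edge list is ordered so that the
"arbitrary" choices come out as (1, 2) and then (5, 6):

```python
def approx_vertex_cover(edges):
    # APPROX-VERTEX-COVER: take the first remaining edge as the "arbitrary"
    # choice, add both endpoints, delete every edge touching either endpoint.
    cover = set()
    remaining = list(edges)
    while remaining:
        u, v = remaining[0]
        cover |= {u, v}
        remaining = [e for e in remaining if u not in e and v not in e]
    return cover
```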

Q. 35.1-2
Prove that the set of edges picked in line 4 of APPROX-VERTEX-COVER forms a
maximal matching in the graph G.

A.

The APPROX-VERTEX-COVER algorithm works as follows:

1  C = ∅
2  E' = E[G]
3  while E' ≠ ∅
4      let (u, v) be an arbitrary edge of E'
5      C = C ∪ {u, v}
6      remove from E' every edge incident on either u or v
7  return C

The set of edges picked in line 4 forms a matching because each time we pick an edge (u, v),
we remove all edges incident on u and v from the graph. This means that no two edges in the
matching share a vertex, which is the definition of a matching.

To prove that this matching M is maximal (i.e., no edge of G can be added to it without breaking
the property of being a matching), suppose for contradiction that there is an edge e = (x, y) in E
such that neither x nor y is an endpoint of any edge in M. Then e is never removed from E',
because every edge removed from E' is incident on an endpoint of some chosen edge. But then
the while loop could not have terminated with E' = ∅, a contradiction. Hence every edge of G
shares a vertex with some edge of M, so M is maximal.

Therefore, the set of edges picked in line 4 of APPROX-VERTEX-COVER forms a maximal
matching in the graph G.
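
Both properties can be sanity-checked in Python by collecting the edges picked in line 4:

```python
def picked_edges(edges):
    # The edges chosen in line 4 of APPROX-VERTEX-COVER.
    matching, remaining = [], list(edges)
    while remaining:
        u, v = remaining[0]
        matching.append((u, v))
        remaining = [e for e in remaining if u not in e and v not in e]
    return matching

def is_maximal_matching(edges, matching):
    matched = {x for e in matching for x in e}
    disjoint = len(matched) == 2 * len(matching)  # no two picked edges share a vertex
    maximal = all(u in matched or v in matched for u, v in edges)  # nothing addable
    return disjoint and maximal
```
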
Q. 35.3-1
Consider each of the following words as a set of letters: {arid, dash, drain,
heard, lost, nose, shun, slate, snare, thread}. Show which set cover
GREEDY-SET-COVER produces when we break ties in favor of the word that appears first in
the dictionary

A.

The GREEDY-SET-COVER algorithm works by repeatedly selecting the subset that contains the
most elements that haven't been covered yet. If there's a tie, it selects the subset that appears
first in the dictionary.

As sets of letters: arid = {a, r, i, d}, dash = {d, a, s, h}, drain = {d, r, a, i, n},
heard = {h, e, a, r, d}, lost = {l, o, s, t}, nose = {n, o, s, e}, shun = {s, h, u, n},
slate = {s, l, a, t, e}, snare = {s, n, a, r, e}, thread = {t, h, r, e, a, d}. The universal set is
their union, {a, d, e, h, i, l, n, o, r, s, t, u}; note that 'u' appears only in "shun". At the
beginning, none of these elements are covered.

The subset "thread" covers the most elements (6: t, h, r, e, a, d). So, we add "thread" to the set
cover and remove those letters from the universal set.

The uncovered elements are now {i, l, n, o, s, u}. The subsets "lost", "nose", and "shun" each
cover 3 new elements. We break the tie by selecting the word that appears first in the
dictionary, which is "lost" (covering l, o, s).

The uncovered elements are now {i, n, u}. The subsets "drain" (i, n) and "shun" (u, n) each
cover 2 new elements; "drain" appears first in the dictionary, so we add "drain" to the set cover.

The only uncovered element is now 'u', which is covered by "shun", so we add "shun" to the set
cover.

The final set cover is {"thread", "lost", "drain", "shun"}.
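The walkthrough can be reproduced in Python; sorted() plus max() implements the
dictionary-order tie-break, since max() keeps the first item among equal keys:

```python
def greedy_set_cover(words):
    # GREEDY-SET-COVER over words-as-letter-sets, ties broken by dictionary
    # order: sorted() puts the alphabetically first word in front, and max()
    # returns the first item attaining the maximum key.
    uncovered = set().union(*(set(w) for w in words))
    cover = []
    while uncovered:
        best = max(sorted(words), key=lambda w: len(set(w) & uncovered))
        cover.append(best)
        uncovered -= set(best)
    return cover
```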

Q. 35.3-3
Show how to implement GREEDY-SET-COVER in such a way that it runs in time

O(∑_{S ∈ \mathcal{F}} |S|).

A.
To achieve this bound, use a bucket structure keyed by the number of uncovered elements in
each set.

For each element x of the universe, precompute the list of sets in \mathcal{F} that contain x,
and for each set S keep a count c[S] of its currently uncovered elements (initially |S|). Building
both structures takes time proportional to the total size of the sets, O(∑_{S ∈ \mathcal{F}} |S|).

Keep the sets in an array of buckets indexed by count: bucket[c] holds the sets S with c[S] = c,
stored as doubly linked lists so that insertion and deletion take O(1) time. Maintain a pointer at
the largest nonempty bucket index; since counts only ever decrease, this pointer moves only
downward over the whole run.

Repeatedly remove a set S from the highest nonempty bucket and add it to the cover. For each
element x of S that is still uncovered, mark x covered, and for every set T containing x,
decrement c[T] and move T down one bucket.

Each pair (x, T) with x ∈ T is processed at most once over the entire algorithm, at the moment
x becomes covered, so all the decrements and bucket moves together cost
O(∑_{S ∈ \mathcal{F}} |S|). The downward scan of the bucket pointer costs O(max_S |S|) in
total, which is also O(∑|S|). Hence GREEDY-SET-COVER runs in time
O(∑_{S ∈ \mathcal{F}} |S|), as required.
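A Python sketch of the bucket structure. Python sets stand in for the O(1) linked-list buckets,
and the min() call only makes the sketch deterministic; both are simplifications of the strict
O(∑|S|) bookkeeping described above.

```python
def greedy_set_cover_buckets(universe, family):
    # family: list of sets. Each pair (x, S) with x in S is touched O(1)
    # times over the whole run, giving O(sum |S|) total bookkeeping work.
    counts = [len(s) for s in family]
    top = max(counts, default=0)
    buckets = [set() for _ in range(top + 1)]  # buckets[c] = sets with c uncovered
    for i, c in enumerate(counts):
        buckets[c].add(i)
    containing = {x: [] for x in universe}     # element -> sets containing it
    for i, s in enumerate(family):
        for x in s:
            containing[x].append(i)
    uncovered, cover, picked = set(universe), [], set()
    cur = top
    while uncovered and cur > 0:
        if not buckets[cur]:
            cur -= 1  # pointer only moves down: counts never increase
            continue
        i = min(buckets[cur])  # deterministic pick; a linked-list pop keeps O(1)
        buckets[cur].remove(i)
        picked.add(i)
        cover.append(i)
        for x in family[i] & uncovered:
            for j in containing[x]:
                if j not in picked:
                    buckets[counts[j]].discard(j)
                    counts[j] -= 1
                    buckets[counts[j]].add(j)
        uncovered -= family[i]
    return cover
```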

Q. 35.5-2
Using induction on i, prove inequality (35.26)
Inequality (35.26) must hold for y^* ∈ P_n, and therefore there exists an element z ∈ L_n such
that
\frac{y^*}{(1+\epsilon / 2 n)^n} \leq z \leq y^*
and thus
\frac{y^*}{z} \leq\left(1+\frac{\epsilon}{2 n}\right)^n

A.

We prove by induction on i the invariant behind inequality (35.26): for every element y ∈ P_i
(the set of subset sums of x_1, ..., x_i that do not exceed t), there exists an element z ∈ L_i
such that

y / (1 + ε/2n)^i ≤ z ≤ y.

For the base case (i = 0), we have P_0 = L_0 = ⟨0⟩, so we can take z = y, and
y / (1 + ε/2n)^0 = y.

For the inductive step, assume the invariant holds for i - 1, and let y ∈ P_i. Either
y ∈ P_{i-1}, or y = y' + x_i for some y' ∈ P_{i-1}.

Case 1: y ∈ P_{i-1}. By the inductive hypothesis there is a z' ∈ L_{i-1} with
y / (1 + ε/2n)^{i-1} ≤ z' ≤ y. The element z' appears in the merged list from which L_i is
produced, and trimming with parameter δ = ε/2n guarantees that some z ∈ L_i survives with
z' / (1 + ε/2n) ≤ z ≤ z'. Chaining the two inequalities gives y / (1 + ε/2n)^i ≤ z ≤ y.

Case 2: y = y' + x_i. By the inductive hypothesis there is a z' ∈ L_{i-1} with
y' / (1 + ε/2n)^{i-1} ≤ z' ≤ y'. Then z' + x_i appears in the merged list, and

y / (1 + ε/2n)^{i-1} = (y' + x_i) / (1 + ε/2n)^{i-1} ≤ z' + x_i ≤ y' + x_i = y,

using x_i / (1 + ε/2n)^{i-1} ≤ x_i. Trimming again leaves some z ∈ L_i with
(z' + x_i) / (1 + ε/2n) ≤ z ≤ z' + x_i, and therefore y / (1 + ε/2n)^i ≤ z ≤ y.

This completes the induction. Taking i = n and y = y*, the optimal subset sum, gives exactly
(35.26): there exists z ∈ L_n with y* / (1 + ε/2n)^n ≤ z ≤ y*, and dividing through by z yields
y*/z ≤ (1 + ε/2n)^n.
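Both cases rely on the guarantee provided by the TRIM step. A minimal Python sketch of TRIM
(δ = ε/2n is the trimming parameter; the example list in the test is the one used in the
textbook's APPROX-SUBSET-SUM discussion):

```python
def trim(L, delta):
    # TRIM(L, delta): L is sorted ascending. Keep an element only if it is
    # more than a factor (1 + delta) above the last kept element; every
    # removed y is then "represented" by a kept z with y/(1+delta) <= z <= y.
    last = L[0]
    out = [last]
    for y in L[1:]:
        if y > last * (1 + delta):
            out.append(y)
            last = y
    return out
```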

Q. 35-3 Weighted set-covering problem


Suppose that we generalize the set-covering problem so that each set S_i in the family
\mathcal{F} has an associated weight w_i and the weight of a cover \mathcal{C} is
\sum_{S_i ∈ \mathcal{C}} w_i. We wish to determine a minimum-weight cover. (Section 35.3
handles the case in which w_i = 1 for all i.)

Show how to generalize the greedy set-covering heuristic in a natural manner to provide an
approximate solution for any instance of the weighted set-covering problem. Show that your
heuristic has an approximation ratio of H(d), where d is the maximum size of any set S_i.

A.

The greedy set-covering heuristic can be generalized to handle the weighted set-covering
problem as follows:

Create a list of all the sets in \mathcal{F} along with their weights, where each set is represented
as a pair (S_i, w_i).

While there are elements in the universe that are not yet covered:

Select the set S_i with the highest ratio of uncovered elements to weight (i.e., the set that
covers the most new elements per unit of weight).
Add S_i to the cover \mathcal{C} and remove the elements of S_i from the universe.
Here's the detailed algorithm:

Let U denote the universe of all elements that need to be covered.
Let \mathcal{C} be an empty set that will represent the cover.
While U is not empty, do the following:
For each set S_i in the family \mathcal{F} that has not been chosen yet, calculate the ratio r_i =
|S_i ∩ U| / w_i, where |S_i ∩ U| is the number of elements in U that are also in S_i (i.e., the
number of new elements that S_i can cover), and w_i is the weight of S_i.
Select the set S_j with the highest ratio r_j.
Add S_j to the cover \mathcal{C}.
Remove the elements of S_j from U.
This algorithm gives a solution that is approximately optimal for the weighted set-covering
problem.

Let's denote by d the maximum size of any set S_i, and by H(d) the d-th harmonic number:
H(d) = 1 + 1/2 + 1/3 + ... + 1/d, which satisfies H(d) ≤ ln d + 1.

To see that the approximation ratio is H(d), use a charging argument. When the greedy heuristic
picks a set S that covers k new elements, spread its weight over those elements, assigning
each newly covered element x the price c_x = w(S) / k; the total weight of the greedy cover is
then the sum of all the prices.

Now consider any set T in an optimal cover \mathcal{C}^*, and order the elements of T by the
time the greedy algorithm covers them. When the j-th-from-last element of T is covered, T itself
still had at least j uncovered elements, so T offered a ratio of at most w(T) / j per new element;
since the greedy choice was at least as good, the price charged to that element is at most
w(T) / j. Summing over the elements of T, the total price charged to T's elements is at most
w(T)(1 + 1/2 + ... + 1/|T|) = w(T) · H(|T|) ≤ w(T) · H(d).

Every element belongs to some set of the optimal cover, so the weight of the greedy cover is at
most \sum_{T ∈ \mathcal{C}^*} w(T) · H(d) = H(d) · OPT. Hence the heuristic has
approximation ratio H(d).
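A Python sketch of the weighted greedy heuristic; picking the minimum weight per newly
covered element is the same rule as maximizing new elements per unit of weight, and the
instance in the test is made up for illustration:

```python
def weighted_greedy_set_cover(universe, family):
    # family: list of (set, weight) pairs. Repeatedly pick the set with the
    # smallest weight per newly covered element (ties go to the earliest
    # pair in the list). Assumes the family can cover the whole universe.
    uncovered = set(universe)
    cover = []
    while uncovered:
        best = min((p for p in family if p[0] & uncovered),
                   key=lambda p: p[1] / len(p[0] & uncovered))
        cover.append(best)
        uncovered -= best[0]
    return cover
```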
