You are on page 1of 10

COMP4500 2015 Exam

Question 1 [12 marks]

Assuming that T (n)∈ Θ(1) for all n ≤ n0 for a suitable constant n 0, solve each of the following
recurrences to obtain an asymptotic bound on their complexity as a closed form. Make your
bounds as tight as possible.

a. [4 marks] T (n)=6 T (n/ 4)+ n2

nlo g 6is polynomially smaller than n2 (since lo g 4 16=2).


4

Check regularity: a f (n /b)=6∗(n2 /16)=(3/8)n2 ≤ c f (n) for 0< c<1 if c=3 /8, therefore T (n)
is regular.

Therefore, case 3 of the Master Method applies, and T (n)∈ Θ(n2 ).

b. [4 marks] T (n)=10 T ( n/3)+ n2

nlo g 10 is polynomially greater than n2 (since lo g 3 9=2).


3

Therefore, case 1 of the Master Method applies, and T (n)∈ Θ(nlo g 10 ) .


3

c. [4 marks] T (n)=T (n/2)+ lo g2 n

nlo g 1=n0=1 is polynomially similar to lo g 2 n.


2

Therefore, case 2 of the Master Method applies, and T (n)∈ Θ(f ( n)lg(n))=Θ(l g 2 n).
Question 2 [25 marks]

A rainwater drainage system is described by a weighted, directed acyclic graph


G=(V , E ,W ), where:
● the directed edges, G . E , represent one-directional pipes in the drainage system
● the vertices, G .V , are junctions in the system where pipes connect, and
● for each directed edge in the graph,(u , v )∈ G. E , the weight function G . w(u , v )
describes the fraction of water flowing into junction u that then flows down pipe (u , v )
into v .

Vertices with no incoming edges are referred to as inlets, and vertices with no outgoing
edges are referred to as outlets. The rainwater drainage system can have multiple inlets and
outlets.

Since weights in the graph represent fractions, they must be non-negative. For each vertex u
that is not an outlet we also require the fractions on its outgoing edges to sum to one.

If the rainwater drainage system is initially empty and there is a deluge of rain that causes,
for each inlet vi , x (v i )liters of water to flow into vi , we can use the drainage system graph to
calculate, for any outlet v o, the total amount of deluge water that will flow into v o.

For example, for graph G with vertices G .V ={v 1 , v 2 , v 3 , v 4 }and edges


G . E={(v 1 , v 2),( v 1 , v 3),(v 4 , v 2)}and weight function
G . w={( v 1 , v 2)→ 0.4 ,( v 1 , v 3)→ 0.6 ,( v 4 , v 2)→1 }, if x (v 1)=10liters of water flows into
inlet v 1, and x (v 4 )=5liters of water flows into inlet v 4, then in total 9 liters of deluge water
will flow into the outlet v 2, and 6 liters of deluge water will flow into the outlet v 3.

Let G be a rainwater drainage system graph, and x be a mapping from the inlets of G to the
number of liters of water that flows into each of those inlets from the deluge.

a. [5 marks] What is a topological sort of a directed acyclic graph, and how can you
efficiently compute one?
A topological sort of a dag sorts the vertices based on the edges between them such that all
vertices with directed edges pointing to a given vertex v appear before v in the sorted list.

A topological sort can be computed by performing a Depth First Search on the dag, keeping
timestamps of when each vertex is first visited and completed, and then add the vertices to
the front of a linked list as they're finished.

b. [5 marks] Define a recurrence M (G , x , v )that describes, for any arbitrary vertex vofG , the
total number of liters of deluge water that will flow into vertex v. Be sure to define the base
cases of the recurrence as well as the more general cases.

M (G , x , v )=x (v ) if v is an inlet (i.e. if ∄(u , w)∈ G. E∨w=v )



M (G , x , v )= ∑ G . w(u , v )× M (G , x ,u) otherwise
(u , v)∈incoming

c. [10 marks] Given any outlet v o ∈ G, write an efficient algorithm MAXIMUM-FLOW(G, x, vo)
in pseudocode, that calculates M (G , x , v o ). You may assume the existence of an efficient
algorithm that can perform a topological sort of your graph.

MAXIMUM-FLOW(G, x, v0):
L = TOPOLOGICAL-SORT(G)
M = empty map
for i = 0 to |G.V| - 1 do
v = L[i]
if v.numIncoming() == 0 then
M.put(v, x(v))
else
sum = 0
for u in’ince top. sorted, all u will have already been put in the map
sum += G.w(u,v) * M.get(u)
M.put(v, sum)
return M.get(v0)

d. [5 marks] What is the time complexity of your algorithm? State any assumptions that you
make and briefly justify your answer.

Assumptions:
● the incoming edges are stored with each vertex - i.e. v.numIncoming() is constant
time and iterating over v.incoming() is O(deg (v , incoming))
● The map used is implemented such that M.put() and M.get() are constant time (e.g.
HashMap)

Performing TOPOLOGICAL-SORT, if implemented using DFS is O ¿.

MAXIMUM-FLOW then iterates over all vertices, and for each it iterates over every incoming

edge. Since ∑ deg (v , incoming)=¿ E∨¿ ¿ (because every edge points to exactly one
v ∈G . V
vertex), this leads to this portion of the algorithm being O ¿.

Therefore, the overall complexity of MAXIMUM-FLOW is O ¿.


Question 3 [20 marks]

This question involves performing an amortised analysis of a data structure.

Consider a data structure that keeps track of the inventory in a warehouse that has two
operations, BUY(s) that adds one item of stock swith price s . key to the inventory, and
SELL(m) that removes the cheapest m items of stock in the inventory and returns the cost of
purchasing those m cheapest items. The concrete implementation below uses a min-priority
queue, called inventory, that stores one element for each item of stock in the inventory. The
key associated with each element of stock, s, in the inventory is its price s . key . The min-
priority queue is implemented using a binary heap.

It is not straightforward using aggregate analysis to show that any sequence of m


operations, starting from an empty inventory, has amortised cost O(m log m). Instead show
how one could apply the accounting method and potential method, by answering the
following questions:

a. [4 marks] What is the actual cost of the BUY and SELL operations? Answer in terms of
the size, N , of the inventory, letting the actual cost of the priority-queue INSERT operation
be ⌊ lo g2 ( N +1) ⌋ +1, EXTRACT-MIN be ⌊ lo g2 N ⌋ +1, and IS-EMPTY be 1.

BUY(s) costs c BUY =⌊ lo g2 (N +1)⌋ +1 , since it is just the priority queue INSERT operation.

SELL(m) performs the following:


● Initialise the cost and return it (each constant)
● For each iteration of the while loop:
○ Evaluate m > 0 (constant)
○ Evaluate inventory.IS-EMPTY() (constant)
○ Perform inventory.EXTRACT-MIN() ( ⌊ lo g2 N ⌋ +1)
○ Update the cost (constant)
○ Decrement m (constant)

The number of iterations is m' , where m'=min(m , N ).Therefore, the overall cost is
c SELL =2+m ' ⋅( 4+ ⌊ lo g2 N ⌋ +1)=2+ m' ⋅( ⌊ lo g 2 N ⌋ +5)
b. [8 marks] Give the amortised cost for BUY and SELL for use in the accounting method.
Argue that the total amortised costs minus the total actual costs is never negative for any
sequence of m operations.

Each element on the priority queue is inserted once and removed once. For each element
inserted on the priority queue, at most one iteration of the SELL while loop will be performed.
Therefore, let the amortised costs be as follows:

ĉ BUY =⌊ lo g 2 ( N +1) ⌋ +1+(⌊ lo g2 N ⌋ +5)=⌊ lo g 2 (N +1) ⌋ + ⌊lo g2 N ⌋ +6

ĉ SELL=2 (just the fixed costs)

Since the extraction of the stock item on sale is 'paid for' upfront when it is bought, the sum
of amortised costs will never be lower after any sequence of operations than the actual
costs, and therefore the amortised costs may be used.

In any sequence of operations starting from an empty priority queue, at most half of the
operations may be SELL operations (since you can't sell more than you have bought) -
proportionally half or more operations are BUY operations.

Since a single BUY operation on the structure is ⌊ lo g2 ( N +1)⌋ + ⌊ lo g2 N ⌋ + 6∈ O(lo g2 N ), if


m operations are conducted in sequence on the structure starting from an empty priority
queue, the worst case time taken will be O(m lo g2 m).

c. [8 marks] Give a potential function Φ for the warehouse data structure implementation, as
would be used in analysing complexity with the potential method, and show that the value of
the potential function after any sequence of m operations, starting from an empty inventory, is
bound below by the initial value of the potential function. Calculate the amortised cost of
operations BUY and SELL from Φ (show your working).

Since each item requires ( ⌊ lo g2 N ⌋ +5)cost to eventually be decreased let the potential
function for the structure be Φ=N ×(⌊ lo g 2 N ⌋ +5).

When the structure is empty, we have Φ 0=0 (while lo g 2 0 is undefined, it is irrelevant since
you cannot take an item off of an empty priority queue). Then, trivially, Φ i ≥ Φ0 for i>0 since
N ×( ⌊ lo g 2 N ⌋ +5)will never be non-negative for a positive value of N . The change in
potential for a BUY operation is ( ⌊ lo g2 N ⌋ +5) since it adds one item, and the change in
potential for a SELL(m) operation is −m ' ×( ⌊ lo g 2 N ⌋ +5) using the same definition of m'
from part a.

The amortised costs are then:


ĉ BUY =⌊ lo g 2 ( N +1) ⌋ +1+(⌊ lo g2 N ⌋ +5)=⌊ lo g 2 (N +1) ⌋ + ⌊lo g2 N ⌋ +6
ĉ SELL=2+ m' ×( ⌊lo g2 N ⌋ +5)+(−m' ×(⌊lo g2 N ⌋ +5))=2
Question 4 [25 marks total]

A sequence Z of k characters ⟨ z 1 , z 2 , ... , z k ⟩ is a palindrome if z i=z k−i+1for all i in


1,2 ,... , ⌊ k /2 ⌋ .

Given a sequence X =⟨ x1 , x2 , ..., x m ⟩ , we have that Z=⟨ z1 , z2 , ..., z k ⟩ is a subsequence of


X if there exists a strictly increasing sequence ⟨ i1 , i 2 , ... ,i k ⟩ of indices of X such that for all
j=1,2 ,... , k we have that x i =z j.
j

Given a sequence X =⟨ x1 , x2 , ..., x m ⟩ , the problem is to design an efficient algorithm for


finding a longest subsequence of X that is also a palindrome. For example, the longest
subsequence of the sequence ⟨ p , e , n , e , l, o , p , e ⟩ that is also a palindrome is ⟨ p , e , e , p ⟩ .

a. [15 marks] This problem can be solved by dynamic programming. Let A(i , j)be the length
of the longest subsequence of⟨ xi , x i+1 ,... , x j ⟩ that is also a palindrome. The solution we
seek is A(1 , m) . Give a recurrence defining A(i , j) for1 ≤i ≤m and 0 ≤ j ≤m .

You do NOT have to give a dynamic programming solution, just the recurrence.
Be sure to define the base cases of the recurrence as well as the more general cases.

(note: LPS = longest palindromic subsequence)

There are four cases for a given A(i , j):

If i > j then the sequence ⟨ xi , x i+1 ,... , x j ⟩is empty, so the length of the LPS is 0.

If i= j then there is only one character to consider which is by definition a palindrome, so the
length of the LPS is 1.

If the characters at i and j are not the same, then there can't be any longer palindromic
subsequences than existed in the two subproblems it is built from: i.e. where the sequence
was one shorter because i was greater, and where the sequence was shorter because j was
greater. Therefore in this case the length of the LPS is the maximum of the length of the LPS
for the subproblems (i+1 , j) and (i , j−1).

However, if the characters at i and j are the same, then, any palindrome that existed in the
middle of them (i.e. in the subproblem (i+1 , j−1)) is increased in length by two (both ends).

This leads to the following recurrence:


● A(i , j)= 0 if i> j
● A(i , j)=1 if i= j
● A(i , j)=max (A (i +1 , j) , A (i , j−1)) if i≠ jand x i ≠ x j
● A(i , j)= A(i+1 , j−1)+2if i≠ jand x i=x j
Or, in T(n) terms: 0 if n = 0
1 if n = 1
2T(n-1) for n > 1.

b. [10 marks] For the dynamic programming solution indicate in what order the elements of
the matrix A corresponding to the recurrence should be calculated. As part of answering this
question you could either give pseudocode indicating the evaluation order or draw a table
and indicate the dependencies of typical elements and which elements have no
dependencies.

If we consider the table A as a matrix of size m× m, where the columns represent i values
and the rows represent j values, then the diagonal of the matrix (i.e. i= j) will contain 1s,
and everything above and to the right (i.e. the top right half of the matrix) is filled with 0s (i> j
), as shown below:

1 0 0 0 0

1 0 0 0

1 0 0

1 0

For the cells in the lower-right half of the matrix, they may depend on the element above
them and the element to the right (the third recurrence case), or the element diagonally
above-right (the fourth recurrence case).

Therefore, the table should be filled in top-to-bottom, right-to-left, so that when a subproblem
is calculated, all subproblems it depends on are already calculated.

This is equivalent to for loops of the form:

for (j = 1; j <= m; ++j)


for (i = m; i >= 1; --i)
[fill in table]
Question 5 [18 marks total]

Below we describe two computer science problems.

The subset-sum problem: Given a set A of n distinct positive integers and an integer C , the

subset-sum decision problem is to decide if there is a subset Bof A such that ∑ b=C. This
b ∈B
problem is known to be NP-complete (assuming a standard encoding of inputs).

The multiple-partition problem: Given a set A of n distinct positive integers and m integers
C 1 , C2 , ... ,C m, the multiple-partition decision problem is to decide if A can be partitioned into

m disjoint subsets B1 , B2 ,... , B mof A such that for each i in1,2 ,... , m, ∑ b=C i .
b ∈B i

a. [8 marks] Prove that the multiple-partition problem is NP-hard.

Multiple-partition is NP-Hard if a solution to it implies a polynomial time solution to all


problems in NP, or equivalently, if a known NP-Complete problem can be reduced to it in
polynomial time.

The subset-sum problem is known NP-Complete, and can be reduced to the multiple-
partition problem by converting its input as follows:
● The parameter A to multiple-partition is the same as is passed to subset-sum
● The sequence C input to multiple-partition is set such that m=1, with C 1=C from
subset-sum

Then, the multiple-partition problem will determine if there is m=1disjoint subsets B1such

that ∑ b=C 1, where C 1 is C from sub-set sum, which is equivalent to determining whether
b ∈B 1

there is 1 disjoint subset B such that ∑ b=C - i.e. with this input, multiple-partition is
b ∈B
equivalent to subset-sum.

Therefore, since the inputs ( A , C) to subset-sum can be converted in polynomial time


(actually constant time) to inputs ( A ' , C ' ) for multiple-partition, if multiple-partition could be
solved in polynomial time, subset-sum could be solved in polynomial time.

Since subset-sum is NP-complete, a polynomial time solution to subset-sum would imply a


polynomial time solution to all problems in NP. Therefore, a polynomial time solution to
multiple-partition would imply a polynomial time solution to all problems in NP. Therefore, by
definition, multiple partition is NP-Hard.
b. [6 marks] Show that the multiple-partition problem is NP-complete and clearly state any
assumptions that you make.

Multiple-partition is NP-Complete if it is NP-Hard and is in NP. It is known to be NP-Hard,


and it is in NP if a polynomial time algorithm to verify it exists. Such an algorithm could be
implemented as follows:

VERIFY-MULTIPLE-PARTITION(A, C, B)
// Check the subsets B are disjoint
for i = 1 to m do
for j = i to m do
if Bi ∩ B j ≠ ∅then
return false
// Check the subsets B partition A
D=∅
for i = 1 to m do
D = D∪ Bi
if D ≠ A then
return false
// Check that each subset B sums to its equivalent sum value in C
for i = 1 to m do
sum = 0
for b ∈ Bi do
sum += b
if sum ≠ Ci then
return false
return true

Assuming that checking for set intersection, union and equality can be performed in
polynomial time (as in all practical implementations), this algorithm verifies a solution B to
the multiple-partition problem in polynomial time, and therefore multiple-partition is in NP.

Therefore, since multiple-partition is in NP and is NP-Hard, multiple-partition is NP-


Complete.

c. [4 marks] Explain the significance of the existence of NP-complete problems.

NP-Complete problems are problems such that if one NP-Complete problem could be solved
in polynomial time, all problems in NP could be solved in polynomial time and hence P=NP .

In practice, since no mathematician has thus far been able to find a polynomial time solution
to any NP-Complete problem, it is reasonable to assume that if a problem you are
attempting to solve can be shown to be NP-Complete, you are unlikely to be able to find
such a solution. Therefore, if your problem is NP-Complete it is reasonable to instead either
give up, or attempt to approximate a solution or solve a special case instead.

You might also like