
CSC373 Algorithm Design, Analysis, and Complexity Spring 2012

Solutions for Assignment 2: Dynamic Programming

1. Road Trip.

1a) Suppose there are $n = 3$ hotels. Let $Q = 0.005$, $p_k = 100$ for each $k = 1, 2, 3$, and $d_1 = 400$, $d_2 = 600$, $d_3 = 700$. Then the greedy algorithm chooses $m(1) = 2$, and is then forced to choose $m(2) = 3$. The total cost is $0 + 100 + (100 - 600)^2 Q + 100$, which is \$1450. Doing the trip in one day costs $(700 - 600)^2 Q + 100$, which is \$150. Therefore the greedy algorithm is not optimal.

1b) Define $P(k, j)$ to be the optimal cost for travel from the beginning of the trip through the $j$th day, given that you stay at the $k$th hotel on the $j$th night. Here $P(k, j)$ includes the price of the $k$th hotel. For simplicity we include the beginning of the trip as $k = 0$, $j = 0$, and $P(0, 0) = 0$. Since there are only $n$ hotels and you must move each day, we can assume $j \le k \le n$. Therefore we consider an $(n + 1) \times (n + 1)$ table of costs $P(k, j)$, with

$$P(k, j) = \infty \quad \text{for } 0 \le k < j. \tag{1}$$

Alternatively, we can avoid computing values $P(k, j)$ for $k < j$. The condition that we start at distance 0 on the first day is represented by

$$P(0, 0) = 0, \quad \text{and} \quad P(k, 0) = \infty \quad \text{for } 1 \le k \le n. \tag{2}$$

Consider day $j > 0$ with $j \le n$, and suppose we stay at hotel $k$ that night. The minimal cost to arrive at this state is defined to be $P(k, j)$. On the previous night, the $(j - 1)$st, we must have stayed at some hotel $i$ with $j - 1 \le i < k$. The optimal cost of the first $j - 1$ days' travel is then $P(i, j - 1)$, and the cost of the $j$th day's travel and accommodation is $C(d_k - d_i) + p_k$. Therefore $P(k, j)$ is the minimum over $i$ of the sum of these two costs, namely

$$P(k, j) = \min_{j - 1 \le i < k} \left[ P(i, j - 1) + C(d_k - d_i) + p_k \right]. \tag{3}$$

We can solve for $P(k, j)$ by using the conditions (1) and (2), and iterating through $j = 1, 2, \ldots, n$. It follows that the optimum number of days is

$$J = \arg\min_{1 \le J \le n} P(n, J), \tag{4}$$

and the minimum cost of the trip is $O(J) = P(n, J)$ for this $J$.

In order to recover the sequence of hotels, we begin by setting $m(J) = n$. For $j = J, J - 1, \ldots, 1$ (in decreasing order), we set $m(j - 1)$ to be the smallest $i$ such that the minimum is achieved in equation (3) with $k = m(j)$. That is, $m(j - 1)$ equals the smallest index $i$ such that

$$P(m(j), j) = P(i, j - 1) + C(d_{m(j)} - d_i) + p_{m(j)}. \tag{5}$$

Such an $i$ must exist by construction.

1c) We skip the implementation. Increasing $Q$ pushes the optimum solution toward a mean daily driving distance closer to 600 km, to the extent that the hotel locations allow it. The variation in the cost of the hotels becomes increasingly irrelevant. When $Q$ is decreased, the optimum solution has more flexibility to choose the cheapest hotels and/or to reduce the number of days of travel. Eventually, the optimum solution consists of driving the whole way to the $n$th hotel in one day (i.e., without stopping at any other hotel, so $J = 1$ and $m(1) = n$).
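The recurrence (3), the choice of $J$ in (4), and the back-tracking rule (5) can be sketched in Python as follows. This is a minimal sketch, not the course's reference implementation. The daily driving cost $C(x) = Q (x - 600)^2$ is an assumption inferred from the arithmetic in 1a, and the names `road_trip`, `itinerary_cost`, and `target` are hypothetical. Index 0 of `d` and `p` stands for the start of the trip ($d_0 = 0$, $p_0 = 0$).

```python
import math


def road_trip(d, p, Q, target=600):
    """Return (minimum cost, list of hotels m(1..J)) for hotels 1..n.

    d[k], p[k] are the distance and price of hotel k; index 0 is the start.
    Assumes the daily driving cost is C(x) = Q * (x - target)**2.
    """
    n = len(d) - 1

    def cost(x):
        return Q * (x - target) ** 2

    # P[k][j]: optimal cost of staying at hotel k on night j; conditions (1), (2).
    P = [[math.inf] * (n + 1) for _ in range(n + 1)]
    P[0][0] = 0.0
    for j in range(1, n + 1):
        for k in range(j, n + 1):
            # Equation (3): previous stop i satisfies j - 1 <= i < k.
            P[k][j] = min(P[i][j - 1] + cost(d[k] - d[i]) + p[k]
                          for i in range(j - 1, k))

    # Equation (4): best number of days.
    J = min(range(1, n + 1), key=lambda j: P[n][j])

    # Equation (5): walk backwards, taking the smallest i that attains the min.
    m = [0] * (J + 1)
    m[J] = n
    for j in range(J, 0, -1):
        k = m[j]
        m[j - 1] = next(i for i in range(j - 1, k)
                        if P[k][j] == P[i][j - 1] + cost(d[k] - d[i]) + p[k])
    return P[n][J], m[1:]


def itinerary_cost(d, p, Q, stops, target=600):
    """Cost of a fixed itinerary (list of hotel indices), for checking by hand."""
    total, prev = 0.0, 0
    for k in stops:
        total += Q * (d[k] - d[prev] - target) ** 2 + p[k]
        prev = k
    return total
```

On the instance from 1a (`d = [0, 400, 600, 700]`, `p = [0, 100, 100, 100]`, `Q = 0.005`), `road_trip` should report the one-day trip with cost 150, while `itinerary_cost` on the greedy stops `[2, 3]` reproduces the 1450 figure.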

2. Scheduling with Deadlines.

2a) Let $R = \{1, 2, \ldots, n\}$ be the set of jobs. Without loss of generality, we assume the initial time $s$ for scheduling any job is $s = 0$. Moreover, we require that the durations of all jobs satisfy $t_i \ge 0$. For any jobs having zero duration, i.e., $t_j = 0$, we can schedule all of them at the start time $s = 0$. This reduces the problem to the case $t_i > 0$ for all $i$. We consider this case below.

For any subset $A \subseteq R$, define $T(A)$ to be the sum of durations of all the jobs in $A$, that is, $T(A) = \sum_{j \in A} t_j$. In addition, we define $A(d_i) = \{ j \in A \mid d_j \le d_i \}$ to be the subset of all jobs in $A$ with deadlines no later than $d_i$. We therefore have

$$T(A) = \sum_{j \in A} t_j, \quad \text{and} \quad T(A(d_i)) = \sum_{j \in A,\; d_j \le d_i} t_j. \tag{6}$$

Moreover, since we can begin scheduling jobs at $s = 0$ and can leave no gaps between jobs, the earliest possible finishing time for all the jobs in $A$ is given by $T(A)$. That is, for any schedule of all the jobs in $A$ (ignoring deadlines) we have

$$T(A) \le \max\{ f_j \mid j \in A \text{ and } f_j \text{ is the finishing time of job } j \text{ in the schedule for } A \}. \tag{7}$$

Claim: $S \subseteq R$ is schedulable iff $T(S(d_i)) \le d_i$ for every $i \in S$.

Proof: Suppose $S$ is schedulable, and consider a feasible schedule for these jobs (i.e., it satisfies the constraints specified in the problem). Such a schedule includes an ordering of the jobs, along with start and finish times, $s_i$ and $f_i = s_i + t_i$, respectively, for each job. Let $i$ be an arbitrary element of $S$, and consider the deadline $d_i$. Define $B_i$ to be the set of jobs that finish on or before the deadline $d_i$ in the above feasible schedule for $S$. So $B_i = \{ j \in S \mid f_j \le d_i \}$. Then, by the feasibility of the schedule, any job in $S$ with a deadline on or before $d_i$ must be in $B_i$, so $S(d_i) \subseteq B_i$. From the assumption that $t_k \ge 0$ for all $k$ it follows that $T(S(d_i)) \le T(B_i)$. In addition, from (7), we have

$$T(B_i) \le \max\{ f_j \mid j \in B_i \} \le d_i. \tag{8}$$

The last inequality above follows from the definition of $B_i$. Therefore $T(S(d_i)) \le T(B_i) \le d_i$ for an arbitrary element $i \in S$, as we were required to show.

Conversely, suppose $S$ is a subset of $R$ for which

$$T(S(d_i)) \le d_i \quad \text{for each } i \in S. \tag{9}$$

Sort the jobs in $S$ in increasing order of their deadlines, breaking ties arbitrarily, and schedule them in that sorted order without gaps between jobs. For any $j \in S$, let $f_j$ be the finish time for job $j$ in this sorted schedule. Then it follows that $f_j \le T(S(d_j))$. (The reason for the inequality here is that all jobs $k$ with a deadline $d_k = d_j$ are included in the set $S(d_j)$, but if such a job $k$ is scheduled after job $j$ then it does not contribute to the finish time $f_j$.) Therefore, from (9) we have $f_j \le T(S(d_j)) \le d_j$. Since $j$ was an arbitrary element of $S$, it follows that this schedule (sorted by deadlines) is a feasible schedule for $S$.

Theorem: Let $S \subseteq R$. Then $S$ is schedulable iff the jobs in $S$ can be scheduled in non-decreasing order of their deadlines.

Proof: The result follows from the previous claim and its proof.

2b) Here we assume all deadlines $d_i \ge 0$ and durations $t_i \ge 0$ are integer valued. Moreover, we assume these jobs have been sorted so that the deadlines are in non-decreasing order, $0 \le d_1 \le d_2 \le \ldots \le d_n = D$, where $D$ is the maximum deadline. A similar dynamic programming approach to the one used for Subset Sum applies here (see p.269 of the text, or, similarly, consider the knapsack problem with constant values).

Define $M(i, f)$ to be an $(n + 1) \times (D + 1)$ array, where $i \in \{0, 1, \ldots, n\}$ is the maximum index of jobs to be considered and $f \in \{0, 1, \ldots, D\}$ denotes the finish time for all the jobs. Here $M(i, f)$ is defined to be the maximum number of jobs from the subset $\{1, 2, \ldots, i\}$ that can be scheduled (each within their individual deadlines) such that the last job finishes at or before time $f$. We define $M(0, f) = 0$ for all $f$. The following recurrence relation applies for $0 < i \le n$ and $0 \le f \le D$:

$$M(i, f) = \max \begin{cases} M(i - 1, f), & \text{if job } i \text{ is not scheduled}, \\ 1 + M(i - 1, f - t_i), & \text{if } t_i \le f \le d_i \text{ and job } i \text{ is scheduled}. \end{cases} \tag{10}$$
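As an illustration of recurrence (10), here is a minimal Python sketch that fills the table $M$ row by row. The function names `max_jobs`, `is_schedulable`, and `brute_force` are hypothetical. The sketch reports the largest entry in row $n$ rather than a single fixed entry; that choice is an assumption, since the text does not say which entry to read off. The helper `is_schedulable` applies the theorem from 2a (schedule in deadline order and check every finish time), and `brute_force` uses it only to cross-check small instances.

```python
from itertools import combinations


def max_jobs(t, d):
    """Maximum number of jobs that meet their deadlines.

    t[i], d[i] are the integer duration and deadline of job i (0-based input).
    Jobs are first sorted by deadline, as assumed in 2b.
    """
    jobs = sorted(zip(d, t))                 # (deadline, duration) pairs
    n = len(jobs)
    D = jobs[-1][0] if jobs else 0           # maximum deadline
    M = [[0] * (D + 1) for _ in range(n + 1)]  # M[0][f] = 0 for all f
    for i in range(1, n + 1):
        di, ti = jobs[i - 1]
        for f in range(D + 1):
            best = M[i - 1][f]               # job i is not scheduled
            if ti <= f <= di:                # job i is scheduled, finishing at f
                best = max(best, 1 + M[i - 1][f - ti])
            M[i][f] = best
    return max(M[n])                         # assumption: best over all finish times


def is_schedulable(subset):
    """Theorem from 2a: run the jobs in deadline order without gaps."""
    finish = 0
    for deadline, duration in sorted(subset):
        finish += duration
        if finish > deadline:
            return False
    return True


def brute_force(t, d):
    """Exponential-time check, only meant for tiny instances."""
    jobs = list(zip(d, t))
    for size in range(len(jobs), 0, -1):
        if any(is_schedulable(c) for c in combinations(jobs, size)):
            return size
    return 0
```

For example, with durations `[2, 2, 2]` and deadlines `[2, 3, 4]`, at most two jobs fit (the first and the third), and both `max_jobs` and `brute_force` should agree on that.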

The iterative algorithm for computing all entries of $M$ is similar to the subset sum algorithm on p.269. Each entry takes $O(1)$ time, so the runtime is proportional to the size of the table $M$, namely $O(nD)$. An optimal solution can be recovered from this table using an algorithm similar to the one described in the lecture notes for recovering a solution of the knapsack problem. The runtime for this latter stage is $O(n)$. We omit the details.

3. Optimal Parse Trees.

3a) Consider a subtree with the root $(i, j)$ corresponding to the substring $y(d_i \ldots (d_j - 1))$. Here $0 \le i < j \le K$. The optimal cost of any subtree rooted at $(i, j)$ is defined to be $C(i, j)$. We are only interested in the upper triangular part of this cost matrix, that is, $C(i, j)$ for $j > i$. The figure below shows the cost table $C$ for the example given in the assignment handout with breakpoints $d = [d_0, d_1, d_2, d_3] = [1, 10, 30, 51]$. We describe the algorithm for computing such a cost table next.

There are two cases for $i$ and $j$. In the first case we have $j = i + 1$ and this root node $(i, j)$ is a leaf. The optimum value of a leaf is defined to be $C(i, i + 1) = d_{i+1} - d_i$.

The other case is $j > i + 1$. By the definition of the form of the parse tree, this root node $(i, j)$ has two children, $(i, k)$ and $(k, j)$, for some $k$ with $i < k < j$. In this second case, the cost for the parse tree is the cost of the root node $(i, j)$, which is $d_j - d_i$, plus the costs for the left and right subtrees. In order for our parse tree to be optimal, we should choose the optimal values for the left and right subtrees. Thus, by the definition of $C$, the costs of the left and right subtrees are $C(i, k)$ and $C(k, j)$. Given these facts, the optimum cost of a subtree rooted at $(i, j)$ with $j > i + 1$ is $C(i, j) = (d_j - d_i) + \min_{i < k < j} (C(i, k) + C(k, j))$. Therefore we can compute $C(i, j)$ with $0 \le i < j \le K$ as

$$C(i, j) = \begin{cases} d_j - d_i, & \text{for } j = i + 1, \\ d_j - d_i + \min_{i < k < j} \left( C(i, k) + C(k, j) \right), & \text{for } j > i + 1. \end{cases} \tag{11}$$

In terms of the cost matrix $C(i, j)$, this computation can be done by first computing all the elements on the second diagonal of this matrix, i.e., with $j = i + 1$, then computing all the elements on the third diagonal, $j = i + 2$, and so on.

3b) To compute one entry in the table, say $C(i, j)$, we may need to check $O(n)$ possible costs (which depend on the choice of $k$). Each of these cost computations is $O(1)$, so each table entry $C(i, j)$ can be computed in $O(n)$ time. Since there are $O(n^2)$ table entries, the overall runtime is $O(n^3)$.

3c) The root of the tree corresponds to $(0, K)$ with a cost of $C(0, K)$. We recursively compute the left and right subtrees of any node $(i, j)$ as follows. If $j = i + 1$, then the node is a leaf. Otherwise, $j > i + 1$ and, following equation (11), we choose any $k$ with $i < k < j$ that attains $\min_{i < k < j} (C(i, k) + C(k, j))$. Then the left and right subtrees of this node are taken to be $(i, k)$ and $(k, j)$.
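The diagonal-by-diagonal fill of equation (11) and the recursive recovery in 3c can be sketched in Python as below. This is an illustrative sketch under stated assumptions: the names `parse_tree`, `split`, and `build` are hypothetical, the input is the list of breakpoints $[d_0, \ldots, d_K]$ with $K \ge 1$, and each node is represented as a tuple `(i, j, left, right)` with `None` children for leaves. Instead of re-deriving the minimizing $k$ from the cost table during recovery, the sketch records it in `split` while filling the table, which gives the same tree.

```python
def parse_tree(d):
    """Return (C(0, K), tree) for breakpoints d = [d_0, ..., d_K], K >= 1."""
    K = len(d) - 1
    C = [[0] * (K + 1) for _ in range(K + 1)]          # only j > i is used
    split = [[None] * (K + 1) for _ in range(K + 1)]   # minimizing k per (i, j)

    # span = j - i; span 1 is the second diagonal, span 2 the third, and so on.
    for span in range(1, K + 1):
        for i in range(0, K - span + 1):
            j = i + span
            if span == 1:
                C[i][j] = d[j] - d[i]                   # leaf case of (11)
            else:
                k = min(range(i + 1, j), key=lambda k: C[i][k] + C[k][j])
                split[i][j] = k
                C[i][j] = d[j] - d[i] + C[i][k] + C[k][j]

    def build(i, j):
        # 3c: a node with j = i + 1 is a leaf, otherwise split at the stored k.
        if j == i + 1:
            return (i, j, None, None)
        k = split[i][j]
        return (i, j, build(i, k), build(k, j))

    return C[0][K], build(0, K)
```

For the breakpoints $d = [1, 10, 30, 51]$ quoted in 3a, working through (11) by hand gives $C(0,1) = 9$, $C(1,2) = 20$, $C(2,3) = 21$, $C(0,2) = 58$, $C(1,3) = 82$, and $C(0,3) = 50 + \min(9 + 82,\; 58 + 21) = 129$, so the root splits at $k = 2$. The sketch should return the same cost and split.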