
Homework 1 Solutions

MA 522 Fall 2011

1. Consider the searching problem:

Input: A sequence of $n$ numbers $A = [a_1, \ldots, a_n]$ and a value $v$.
Output: An index $i$ such that $v = A[i]$, or the special value NIL if $v$ does not appear in $A$.

(a) (20 points) Write pseudocode for LINEAR-SEARCH, which scans through the sequence, looking for $v$. Using a loop invariant, prove that your algorithm is correct. (Make sure that your loop invariant fulfills the three necessary properties: initialization, maintenance, termination.)

Solution:

LINEAR-SEARCH(A, v)
    for i ← 1 to length[A]
        do if A[i] = v
            then return i
    return NIL

Correctness:

Loop invariant: At the start of each iteration of the for loop we have $A[j] \ne v$ for all $j < i$.

Initialization: Before the first loop iteration the invariant holds since the statement is empty (there is no index $j < 1$).

Maintenance: The loop invariant is maintained at each iteration, since otherwise at the $i$-th iteration there is some $j < i$ such that $A[j] = v$. However, in that case the value $j$ is returned in the $j$-th iteration of the loop, and there is no $i$-th iteration of the loop, a contradiction.

Termination: When the loop terminates, there may be two cases. One is that it terminates after $i \le length[A]$ iterations and returns $i$, in which case the if conditional ensures that $A[i] = v$. The other case is that $i$ exceeds $length[A]$; in this case the loop invariant gives $A[j] \ne v$ for all $j \le length[A]$, thus returning NIL is correct.

(b) (10 points) How many elements of the input sequence need to be checked on average, assuming that the element being searched for is equally likely to be any element in the array? How about the worst case? What are the average-case and worst-case running times of linear search in $\Theta$-notation? Justify your answer.
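As an illustration of parts (a) and (b), LINEAR-SEARCH could be sketched in Python roughly as follows. This is only a sketch under stated assumptions: the names `linear_search` and `average_checks`, the returned check counter, and the use of `None` for NIL are not part of the assignment. The loop invariant from the correctness proof is checked with an assertion, and indices are 1-based to match the pseudocode.

```python
from typing import Optional, Sequence, Tuple


def linear_search(a: Sequence[int], v: int) -> Tuple[Optional[int], int]:
    """Return (index, checks): the 1-based index of v in a (None stands in
    for NIL) and the number of elements that were compared with v."""
    checks = 0
    for i, x in enumerate(a, start=1):
        # Loop invariant from the proof: A[j] != v for all j < i.
        assert all(y != v for y in a[: i - 1])
        checks += 1
        if x == v:
            return i, checks
    return None, checks


def average_checks(n: int) -> float:
    """Average number of checks when v is equally likely to be any element.

    The input 1, 2, ..., n is a hypothetical test array with distinct values.
    """
    a = list(range(1, n + 1))
    return sum(linear_search(a, v)[1] for v in a) / n
```

Averaging the counter over all $n$ possible targets gives a quick empirical check of the expected number of checks asked for in part (b).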

Solution: Since the probability of $v = A[i]$ is $1/n$ for all $i = 1, \ldots, n$, and we need to check exactly $i$ elements when $v = A[i]$, the expected value of the number of checks is

$$\frac{1}{n}(1 + 2 + \cdots + n) = \frac{1}{n} \cdot \frac{n(n+1)}{2} = \frac{n+1}{2}.$$

In the worst case it is $n$. The average-case running time is $c\,\frac{n+1}{2} = \Theta(n)$, since for all $n \ge 1$ we have

$$\frac{1}{2}\,n \le \frac{n+1}{2} \le n.$$

The worst-case running time is also $\Theta(n)$.

(c) (10 points) If we assume that $A$ is sorted, then we can check the midpoint of the sequence against $v$ and eliminate half of the sequence from further consideration. BINARY-SEARCH is an algorithm that repeats this procedure, halving the size of the remaining portion of the sequence each time. Write pseudocode, either iterative or recursive, for binary search.

Solution:

BINARY-SEARCH(A, p, q, v)
    if q < p
        then return NIL
    m ← p + ⌊(q − p)/2⌋
    if A[m] = v
        then return m
    if A[m] > v
        then return BINARY-SEARCH(A, p, m − 1, v)
        else return BINARY-SEARCH(A, m + 1, q, v)

The initial call is BINARY-SEARCH(A, 1, length(A), v).

(d) (10 points) Use the master method to show that the solution to the binary-search recurrence $T(n) = T(n/2) + \Theta(1)$ is $T(n) = \Theta(\lg n)$, and argue that the worst-case running time for binary search is $\Theta(\lg n)$.

Solution: We can use Case 2 of the Master Theorem, because from $a = 1$ and $b = 2$ we have $n^{\log_b a} = n^0 = 1$, so for $k = 0$

$$f(n) = \Theta(1) = \Theta(n^{\log_b a} \lg^k n).$$

This gives that $T(n) = \Theta(n^{\log_b a} \lg^{k+1} n) = \Theta(\lg n)$.

The running time of BINARY-SEARCH can be computed from the following recurrence equation:

$$T(n) = c + T\left(\frac{n}{2}\right).$$

Thus the above argument shows that the running time is $T(n) = \Theta(\lg n)$.

(e) (10 points) Observe that the while loop of the INSERTION-SORT procedure (in class) uses a linear search to scan (backward) through the sorted subarray $A[1 \ldots j-1]$. Can we

use binary search instead to improve the overall worst-case running time of insertion sort to $\Theta(n \lg n)$?

Solution: No. Binary search finds the insertion position in $A[1 \ldots j-1]$ with $\Theta(\lg j)$ comparisons, but the elements larger than the key still have to be shifted one position to the right, which takes $\Theta(j)$ time in the worst case. Thus the overall worst-case running time remains $\Theta(n^2)$.

2. (20 points) Let $f, g : \mathbb{N} \to \mathbb{R}$ be functions which are positive for all $n \ge n_0$, for some $n_0$. Prove the equality $O((f+g)^2) = O(f^2 + g^2)$. (Hint: you need to use the inequality between geometric and arithmetic means.)

Solution: First assume that $h \in O((f+g)^2)$, i.e. there exist $c > 0$ and $n_1$ such that $0 \le h(n) \le c\,(f(n) + g(n))^2$ for all $n \ge n_1$. Using the inequality between geometric and arithmetic means we get that if $n \ge n_0$ (so $f$ and $g$ are positive) then

$$f(n)g(n) \le \frac{(f(n) + g(n))^2}{4},$$

thus

$$(f(n) + g(n))^2 = f(n)^2 + g(n)^2 + 2f(n)g(n) \le f(n)^2 + g(n)^2 + \frac{(f(n) + g(n))^2}{2},$$

which implies that

$$\frac{(f(n) + g(n))^2}{2} \le f(n)^2 + g(n)^2.$$

Thus for $n \ge \max(n_0, n_1)$ we have that $0 \le h(n) \le 2c\,(f(n)^2 + g(n)^2)$, i.e. $h \in O(f^2 + g^2)$.

For the other direction, if $h \in O(f^2 + g^2)$, i.e. there exist $d > 0$ and $n_2$ such that $0 \le h(n) \le d\,(f(n)^2 + g(n)^2)$ for all $n \ge n_2$, then using that for $n \ge n_0$

$$f(n)^2 + g(n)^2 \le f(n)^2 + g(n)^2 + 2f(n)g(n) = (f(n) + g(n))^2,$$

we have that for $n \ge \max(n_0, n_2)$, $0 \le h(n) \le d\,(f(n) + g(n))^2$, i.e. $h \in O((f+g)^2)$.

3. (20 points) Give asymptotic upper and lower bounds for the recurrence

$$T(n) = T(n/2) + T(n/4) + T(n/8) + n.$$

Solution: The solution of the recurrence can be computed by adding up the nodes in the following tree:

n 2

n 4

n 8

n 4

n 8

n 16

n 8

n 16

n 32

n 16

n 32

n 64

Note that if a node would have a number $< 1$ then we write 0 in that node. Let $S(i)$ denote the sum of the nodes in row $i$. Then $S(1) = n$ and

$$S(2) \le \frac{n}{2} + \frac{n}{4} + \frac{n}{8} = n\left(1 - \frac{1}{8}\right) = \frac{7}{8}\,n.$$

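Note that we may have strict inequality above, in case $n < 8$. Similarly, for the sum of the $(i+1)$-th row,

$$S(i+1) \le \frac{S(i)}{2} + \frac{S(i)}{4} + \frac{S(i)}{8} = S(i)\left(1 - \frac{1}{8}\right) = \frac{7}{8}\,S(i).$$

Thus we get that

$$S(i) \le \left(\frac{7}{8}\right)^{i-1} n.$$

Summing up the rows we get the following upper bound:

$$T(n) \le \sum_{i=1}^{\lg n} \left(\frac{7}{8}\right)^{i-1} n \le \sum_{i=1}^{\infty} \left(\frac{7}{8}\right)^{i-1} n = n \cdot \frac{1}{1 - \frac{7}{8}} = 8n.$$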

On the other hand, $n$ is a lower bound for the running time. Thus $T(n) = \Theta(n)$.

We can also verify that $T(n) = \Theta(n)$ by the substitution method. We show that $T(n) = cn$ for some suitable constant $c > 0$. We have

$$\begin{aligned}
T(n) &= T\left(\frac{n}{2}\right) + T\left(\frac{n}{4}\right) + T\left(\frac{n}{8}\right) + n \\
&= c\,\frac{n}{2} + c\,\frac{n}{4} + c\,\frac{n}{8} + n \\
&= \frac{7}{8}\,cn + n = n\left(\frac{7}{8}\,c + 1\right).
\end{aligned}$$

We need to find $c$ such that

$$\frac{7}{8}\,c + 1 = c.$$

This equation has the solution $c = 8$, so we have that $T(n) = 8n$.
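The bounds $n \le T(n) \le 8n$ can also be checked numerically. The following Python sketch is only an illustration under stated assumptions: the recurrence is evaluated with integer (floor) division and with the base case $T(n) = 0$ for $n < 1$, in the spirit of writing 0 in nodes below 1. The name `within_bounds` is a hypothetical helper.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def T(n: int) -> int:
    """T(n) = T(n // 2) + T(n // 4) + T(n // 8) + n.

    Assumed base case (not fixed by the problem statement): T(n) = 0 for n < 1.
    """
    if n < 1:
        return 0
    return T(n // 2) + T(n // 4) + T(n // 8) + n


def within_bounds(limit: int) -> bool:
    """Check that n <= T(n) <= 8n holds for every 1 <= n <= limit."""
    return all(n <= T(n) <= 8 * n for n in range(1, limit + 1))
```

For example, `within_bounds(10_000)` evaluates the recurrence for every $n$ up to 10000 and compares each value against the lower bound $n$ and the upper bound $8n$.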
