# Stanford University — CS161: Algorithms
Luca Trevisan
Handout 1, April 1, 2013

## Lecture 1

*In which we review prerequisites and start talking about divide-and-conquer algorithms.*

## 1 Prerequisites

This class assumes CS103, CS106, and CS109 as prerequisites.

### Proofs and Big-Oh notation
From CS103 we assume "mathematical maturity," that is, the ability to follow nontrivial mathematical proofs, the ability to easily absorb new definitions and concepts, and the ability to develop and to clearly, concisely, and rigorously present your own proofs. For example, you should be able to prove by induction that a tree with n vertices has n − 1 edges.

We also assume familiarity with the notation O(·), Ω(·), o(·), ω(·), whose definitions we now recall. If we write f(n) = O(g(n)), where n is a positive integer that typically denotes the size of the input to an algorithm, then we mean that f(n) is at most a fixed constant times g(n) for every sufficiently large n, that is,

f(n) = O(g(n)) ⇔ ∃c, n₀ . ∀n ≥ n₀ . f(n) ≤ c · g(n)

The following is a helpful fact: if lim_{n→∞} f(n)/g(n) exists and is finite, then f(n) = O(g(n)). The converse, unfortunately, is not necessarily true. For example, suppose that f(n) = 1 if n is odd, and f(n) = 35n² if n is even. (Maybe f(n) is the running time of an algorithm that pairs the n input items in some way, and if the number of inputs is odd then the algorithm immediately quits with an error message.) Then f(n) = O(n²), but lim_{n→∞} f(n)/n² does not exist. It is also true, although maybe less helpful, that f(n) = O(g(n)) if and only if lim sup_{n→∞} f(n)/g(n) is finite.

If we write f(n) = Ω(g(n)) then we mean that, for sufficiently large n, f(n) is at least a fixed constant times g(n), that is,

f(n) = Ω(g(n)) ⇔ ∃c > 0, n₀ . ∀n ≥ n₀ . f(n) ≥ c · g(n)
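To see the odd/even example above concretely, here is a small Python check (the name `f` is ours): the big-Oh bound holds with c = 35, yet the ratio f(n)/n² keeps jumping between 35 and values near 0, so it has no limit.

```python
# f(n) = 1 for odd n, 35n^2 for even n, as in the example above.
def f(n):
    return 1 if n % 2 == 1 else 35 * n * n

# The big-Oh bound f(n) <= 35 * n^2 holds for every n >= 1 ...
assert all(f(n) <= 35 * n * n for n in range(1, 1000))

# ... yet f(n)/n^2 oscillates: even n give 35, odd n give 1/n^2.
ratios = [f(n) / n**2 for n in range(10, 16)]
```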

Note that f(n) = O(g(n)) if and only if g(n) = Ω(f(n)). If both f(n) = O(g(n)) and f(n) = Ω(g(n)), then we write f(n) = Θ(g(n)). For example,

35n² − √n = Θ(12n² + n)

The meaning of f(n) = o(g(n)) is that, for sufficiently large n, f(n) is smaller than g(n) times an arbitrarily small constant, that is,

f(n) = o(g(n)) ⇔ ∀ε > 0 . ∃n₀ . ∀n ≥ n₀ . f(n) ≤ ε · g(n)

Note that f(n) = o(g(n)) is exactly the same as lim_{n→∞} f(n)/g(n) = 0. Finally, f(n) = ω(g(n)) means that g(n) = o(f(n)).

### Counting and probability

From CS109 we assume the basics of counting and of discrete probability. For example, how many functions of the type f : A → B are there, if A and B are finite sets? How many functions of type f : {0, 1}ⁿ → {0, 1}? How many functions of type f : {1, …, n} → {1, …, n}? Let us say that an input x for a function f(·) is a *fixed point* if f(x) = x. If we sample uniformly a random function f : {1, …, n} → {1, …, n}, what is, exactly, the probability that f(·) has no fixed point? What does this probability tend to as n goes to infinity? Answers: |B|^|A|, 2^(2ⁿ), nⁿ, (1 − 1/n)ⁿ, 1/e.

### Coding

From CS106 we assume basic coding skills. It will be very useful to you (but not required) to code and to experiment with the algorithms and the data structures that we will describe in the course. The midterm will be a programming assignment.
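For a bit of coding practice, the no-fixed-point probability from the counting questions above is easy to verify by brute force for small n. A minimal sketch (the function name is ours):

```python
from itertools import product

# Enumerate all n^n functions f: {1,...,n} -> {1,...,n}, represented as
# tuples where f[x-1] is the value of f at x, and count those with no
# fixed point, i.e. f(x) != x for every x.
def no_fixed_point_probability(n):
    count = sum(
        1
        for f in product(range(1, n + 1), repeat=n)
        if all(f[x - 1] != x for x in range(1, n + 1))
    )
    return count / n**n

# Matches the closed form (1 - 1/n)^n, which tends to 1/e as n grows.
assert abs(no_fixed_point_probability(4) - (1 - 1/4)**4) < 1e-12
```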

## 2 Divide-and-conquer

Divide-and-conquer is a general technique to design algorithms: the idea (which works for many, but definitely not all, problems) is to divide the problem into "pieces," to "solve" each piece recursively, and then to "combine" the solutions to the pieces. Deciding what a "piece" will be, what problem we want to "solve" on each piece, and how the solutions will be "combined" will depend on the problem. Sometimes, considerable ingenuity goes into coming up with the right way to instantiate this method.

### 2.1 Binary search

Given a sorted vector A = A[0], A[1], …, A[n − 1] and a target value x, we want to find an index i such that A[i] = x, if such an index exists. We will call the following procedure with arguments (A, x, 0, n − 1).

```
function binary-search(A, x, L, R)
    if L > R then
        return Fail
    else
        i ← ⌊(L + R)/2⌋
        if A[i] == x then
            return i
        else if A[i] > x then
            return binary-search(A, x, L, i − 1)
        else
            return binary-search(A, x, i + 1, R)
        end if
    end if
end function
```

The procedure looks for an index i such that A[i] = x and such that L ≤ i ≤ R. If L > R then the interval L ≤ i ≤ R is empty and the search fails. Otherwise we try i = ⌊(L + R)/2⌋. If we find x we are done; otherwise either A[i] > x, in which case we do not need to look at the sub-vector A[i], …, A[R] and we can recurse on the sub-vector A[L], …, A[i − 1], or A[i] < x, in which case we recurse on A[i + 1], …, A[R].

If we denote by T(n) the worst-case running time of binary search on inputs of length n, then we have

T(1) = O(1)
T(n) = O(1) + T(n/2)

That is, there is a constant c such that

T(1) ≤ c
T(n) ≤ c + T(n/2)

Now define a function T′ as

T′(1) = 1
T′(n) = 1 + T′(n/2)

Then it is easy to prove by induction that T(n) ≤ c · T′(n), and so if we prove a big-Oh bound on T′, then the same bound will apply to T. Unfolding the recursive definition of T′, we see that

T′(n) = 1 + T′(n/2) = 2 + T′(n/4) = ⋯ = k + T′(n/2ᵏ)

so if we pick k = log₂ n, we have T′(n) = log₂ n + 1, and so T(n) = O(log n). (To be precise, this analysis only covers the case in which n is a power of two; for general n, the n/2 in the definition of T(n) should be rounded. We will see how to deal with these kinds of issues later.)
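The pseudocode translates almost line by line into Python. The following is a sketch (the names `binary_search` and `T_prime` are ours), returning `None` in place of Fail, together with a quick check that the unfolded recurrence gives log₂ n + 1:

```python
# Binary search on a sorted list, following the pseudocode above.
def binary_search(A, x, L, R):
    if L > R:
        return None                            # interval L..R is empty: fail
    i = (L + R) // 2
    if A[i] == x:
        return i
    elif A[i] > x:
        return binary_search(A, x, L, i - 1)   # recurse on A[L..i-1]
    else:
        return binary_search(A, x, i + 1, R)   # recurse on A[i+1..R]

A = [2, 3, 5, 7, 11, 13, 17, 19]
assert binary_search(A, 7, 0, len(A) - 1) == 3
assert binary_search(A, 4, 0, len(A) - 1) is None

# The recurrence T'(1) = 1, T'(n) = 1 + T'(n/2) unfolds to log2(n) + 1
# when n is a power of two.
def T_prime(n):
    return 1 if n == 1 else 1 + T_prime(n // 2)

assert T_prime(1024) == 11                     # log2(1024) + 1
```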

## 2.2 Mergesort

Given a vector A = A[0], …, A[n − 1], we want to sort it. We will proceed in the following way: split A into two pieces A[0], …, A[⌊n/2⌋ − 1] and A[⌊n/2⌋], …, A[n − 1], each containing about half of the elements of A; sort each piece recursively; and then combine the two sorted pieces.

The procedure mergesort takes in input a vector A and its length n, and it returns a vector that is the sorted version of A. If A has length 1 or is empty, then there is nothing to do. Otherwise, we split A into two vectors B and C, each of length about n/2; we sort each of them recursively; and then we invoke the procedure merge to combine them together. The procedure that does all the work is merge: it takes in input two sorted vectors B and C, of length b and c respectively, and it merges them into a sorted vector M.

```
function mergesort(A, n)
    if n == 1 or n == 0 then
        return A
    else
        b ← ⌊n/2⌋
        c ← n − b
        B ← A[0 : b − 1]
        C ← A[b : n − 1]
        return merge(mergesort(B, b), mergesort(C, c), b, c)
    end if
end function

function merge(B, C, b, c)
    M ← empty vector of length b + c
    i ← 0
    j ← 0
    p ← 0
    B[b] ← ∞
    C[c] ← ∞
    while i < b or j < c do
        if B[i] < C[j] then
            M[p] ← B[i]
            p ← p + 1
            i ← i + 1
        else
            M[p] ← C[j]
            p ← p + 1
            j ← j + 1
        end if
    end while
    return M
end function
```
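A direct Python transcription of the two procedures might look as follows (a sketch; `math.inf` plays the role of the ∞ sentinels, and list slicing replaces the explicit index ranges):

```python
import math

def merge(B, C, b, c):
    # Sentinels: appending infinity means the loop never reads past the
    # real data, exactly as in the pseudocode.
    B = B + [math.inf]
    C = C + [math.inf]
    M = [None] * (b + c)
    i = j = p = 0
    while i < b or j < c:
        if B[i] < C[j]:
            M[p] = B[i]
            i += 1
        else:
            M[p] = C[j]
            j += 1
        p += 1
    return M

def mergesort(A, n):
    if n <= 1:                 # length 0 or 1: nothing to do
        return A
    b = n // 2
    c = n - b
    return merge(mergesort(A[:b], b), mergesort(A[b:], c), b, c)

assert mergesort([5, 2, 4, 6, 1, 3], 6) == [1, 2, 3, 4, 5, 6]
```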

In general, we will not give fully rigorous proofs of correctness of algorithms (the sample proof given in HW1 is representative of the proofs we will do in class and of the proofs that we will require in homeworks), but here we will illustrate what such a proof looks like.

The algorithm merge maintains three pointers: p in M, i in B, and j in C. We will prove by induction that the following properties are true at every execution of the while loop:

1. M[0 : p − 1] is sorted, and it contains precisely the elements of B[0 : i − 1] and C[0 : j − 1].
2. All the elements of M[0 : p − 1] are smaller than or equal to all the elements of B[i : b − 1] and C[j : c − 1].

To prove that the above properties are always satisfied, we prove that they are true at the beginning, and then we prove that if they are true before an execution of the while loop, then they remain true after another execution of the loop. At the beginning, M[0 : p − 1], B[0 : i − 1], and C[0 : j − 1] are all empty, and so the conditions are trivially true. If the conditions are true and we run one step of the while loop, then we write on M[p] the smallest of B[i] and C[j], that is, the smallest element of the union of B[i : b − 1] and C[j : c − 1], which means that the second property remains satisfied; it is also clear that the second part of the first condition is maintained. Because of the second property, we also maintain the condition that M is sorted, because the element that we insert in position p will be greater than or equal to all the elements in M[0 : p − 1]. Then it follows by induction on the number of times the while loop is executed that the properties are true at all steps and, in particular, at the end. At the end of the algorithm, we have i = b and j = c, so the first property tells us that M contains all the elements of B and C, and that M is sorted, as desired.

It should also be clear that the running time of merge is O(b + c), because every invocation of the while loop takes O(1) time and it increases either i or j, which means that, at every step, the time elapsed so far is O(i + j) and, at the end, it is O(b + c).
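The two invariants can also be checked mechanically. The following instrumented version of merge (our own sketch, using the same sentinel trick as the pseudocode) asserts both properties at the top of every loop iteration:

```python
import math

def merge_checked(B, C):
    b, c = len(B), len(C)
    Bs, Cs = B + [math.inf], C + [math.inf]   # sentinel values
    M = []
    i = j = 0
    while i < b or j < c:
        # Property 1: M is sorted and contains exactly B[0:i-1], C[0:j-1].
        assert M == sorted(M)
        assert sorted(M) == sorted(B[:i] + C[:j])
        # Property 2: everything in M is <= everything left in B and C.
        assert all(m <= r for m in M for r in B[i:] + C[j:])
        if Bs[i] < Cs[j]:
            M.append(Bs[i])
            i += 1
        else:
            M.append(Cs[j])
            j += 1
    return M

assert merge_checked([1, 3, 5], [2, 2, 6]) == [1, 2, 2, 3, 5, 6]
```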