
Merge Sort

Input: unsorted array

Output: sorted array (increasing values)

Recursively sort the two halves of the array, then merge the two sorted halves into a
sorted full-length array.

How to merge: both halves are sorted, so repeatedly take the smallest remaining element
in either half (the global minimum) and append it to the final array. Continue until both
halves are empty (once one half runs out, copy over the rest of the other).

The Merge Subroutine

How many operations??

i, j = counters into A and B (2 ops, one-time initialization)

for k = 1 to m (1 op per iteration)

    if A(i) < B(j) (1 op)
        C(k) = A(i); i++ (2 ops)
    else
        C(k) = B(j); j++ (also 2 ops)

where m = # of numbers in the final array

Total: <= 4m + 2 ops
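The merge subroutine above can be sketched in Python (a minimal sketch; the function name and list-based style are my additions):

```python
def merge(a, b):
    """Merge two already-sorted lists into one sorted list."""
    c = []
    i = j = 0  # counters into a and b
    # Repeatedly take the smaller front element (the global minimum
    # of everything not yet copied).
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            c.append(a[i])
            i += 1
        else:
            c.append(b[j])
            j += 1
    # Once one half is exhausted, copy over the rest of the other.
    c.extend(a[i:])
    c.extend(b[j:])
    return c
```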

The Whole MergeSort Algorithm

Claim: merge sort requires <= 6nlog2(n) + 6n ops to sort n>=1 numbers

Recall: log2(n) = # of times you divide n by 2 until you get down to 1 (grows
much less quickly than n)
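A quick sanity check of that definition of log2 (function name is my own):

```python
import math

def halvings(n):
    """Count how many times n can be halved before reaching 1."""
    count = 0
    while n > 1:
        n //= 2
        count += 1
    return count

# For powers of 2 this matches log2 exactly.
for n in [1, 2, 8, 1024]:
    assert halvings(n) == int(math.log2(n))
```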

Proof of claim:
Assume n=power of 2 for simplicity

Use recursive tree:

Tree level 0: the full array (size n)

Level 1: 2 subarrays of size n/2
Level 2: 4 subarrays of size n/4
...
Level log2(n): n subarrays of 1 element each

Think of a "horizontal slice" of the tree and all the work done at that level

Pattern: at each level j=0 -> log2(n)

2^j subproblems each with size n/(2^j)

Total # ops at level j (not counting the work done inside the recursive calls): <=
2^j * 6(n/(2^j)) = 6n per level (using 4m + 2 <= 6m for m >= 1)

Total for MergeSort: 6n per level * (log2(n)+1) levels

= 6nlog2(n) + 6n
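Putting the pieces together, a minimal recursive MergeSort sketch (function name is my own):

```python
def merge_sort(arr):
    """Sort arr in increasing order by recursive halving and merging."""
    if len(arr) <= 1:  # base case: 0 or 1 elements is already sorted
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])    # recursively sort the first half
    right = merge_sort(arr[mid:])   # recursively sort the second half
    # Merge the two sorted halves (the merge subroutine).
    result, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            result.append(left[i]); i += 1
        else:
            result.append(right[j]); j += 1
    result.extend(left[i:])
    result.extend(right[j:])
    return result
```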

1) Used worst-case analysis so that the runtime bound holds for every array you
might encounter. (Alternatively, could test on benchmarks or "practical" inputs.)
a. Why worst-case? Difficult to define what "real" data looks like
b. Easier to analyze
c. Focus is to design algorithms that always do well,
regardless of scenario
2) Won’t pay much attention to constants or lower order terms
a. Reasons: way easier to ignore.
b. Lose very little predictive power by ignoring constants
c. Really only interested in big problems (large n)
3) Fast algorithm: one whose worst-case runtime grows slowly with
input size n
4) Usually want as close to linear (O(n)) time as possible (binary
search is actually log2(n) time, faster than O(n))
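Binary search, mentioned in (4), as a sketch (function name is my own); each iteration halves the search range, hence ~log2(n) steps:

```python
def binary_search(sorted_arr, target):
    """Return an index of target in sorted_arr, or -1 if absent."""
    lo, hi = 0, len(sorted_arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2       # probe the middle element
        if sorted_arr[mid] == target:
            return mid
        elif sorted_arr[mid] < target:
            lo = mid + 1           # discard the lower half
        else:
            hi = mid - 1           # discard the upper half
    return -1
```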

Asymptotic Notation

Let T(n) = function on n=1,2,3,…

Usually, T(n) is the worst-case RT on inputs of size n

Q: when is T(n) = O(f(n))

A: Eventually (i.e., for sufficiently large inputs), T(n) is bounded above by some
constant multiple of f(n)

Formal Definition (use for proofs):

T(n) = O(f(n)) if and only if there exist constants C, n0 > 0 such that
T(n) <= C*f(n) for every n > n0

Ex #1: if
T(n) = a_k*n^k + … + a_1*n + a_0
then T(n) = O(n^k)

Proof: choose n0 = 1 and
C = abs(a_k) + abs(a_(k-1)) + … + abs(a_1) + abs(a_0)

Need to show: T(n) <= C*n^k for all n >= 1

For n >= 1:
T(n) <= abs(a_k)*n^k + … + abs(a_1)*n + abs(a_0)
     <= abs(a_k)*n^k + … + abs(a_1)*n^k + abs(a_0)*n^k (since n^j <= n^k when n >= 1)
     = C*n^k QED
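A quick numeric check of Ex #1 for one concrete polynomial (the coefficients are my own choice):

```python
# T(n) = 3n^2 + 5n + 7, so k = 2, n0 = 1, and C = |3| + |5| + |7| = 15.
def T(n):
    return 3 * n**2 + 5 * n + 7

C = 15
# The bound T(n) <= C * n^k should hold for every n >= n0 = 1.
assert all(T(n) <= C * n**2 for n in range(1, 1000))
```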

Ex #2: non-example

For every k>=1

n^k is not O(n^(k-1))

Proof by contradiction:
Suppose n^k = O(n^(k-1))
Then there exist constants C and n0 such that:
n^k <= C*n^(k-1) for all n > n0
Cancel off a factor of n^(k-1) from both sides: n <= C for all n > n0
But we can choose n arbitrarily large, so this is false; contradiction. QED

Warning: C and n0 cannot depend on n

Definition: T(n) = Omega(f(n)) if and only if there exist constants C, n0 > 0

such that T(n) >= C*f(n) for all n >= n0

Intuitively: find where the two functions cross (n0) such that T(n)
stays at or above C*f(n) from that point on

T(n) is Theta(f(n)) iff

1) T(n) = O(f(n))
2) T(n) = Omega(f(n))
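A numeric illustration of both bounds for one function (the constants and the example T are my own choices):

```python
# T(n) = n^2/2 + 3n. Claim: T(n) = Theta(n^2).
def T(n):
    return n * n / 2 + 3 * n

# Upper bound: T(n) <= 1 * n^2 for all n >= 6, so T(n) = O(n^2).
assert all(T(n) <= n * n for n in range(6, 1000))

# Lower bound: T(n) >= 0.5 * n^2 for all n >= 1, so T(n) = Omega(n^2).
assert all(T(n) >= 0.5 * n * n for n in range(1, 1000))
# Both hold, so T(n) = Theta(n^2).
```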