
Merge Sort

7 29 4 → 2 4 7 9

72 → 2 7 94 → 4 9

7→7 2→2 9→9 4→4

© 2004 Goodrich, Tamassia Merge Sort 1


Divide-and-Conquer
Divide-and-conquer is a general algorithm design paradigm:
- Divide: divide the input data S in two disjoint subsets S1 and S2
- Recur: solve the subproblems associated with S1 and S2
- Conquer: combine the solutions for S1 and S2 into a solution for S
The base cases for the recursion are subproblems of size 0 or 1.

Merge-sort is a sorting algorithm based on the divide-and-conquer paradigm. It accesses data in a sequential manner (suitable for sorting data on a disk).



Merge-Sort
Merge-sort on an input sequence S with n elements consists of three steps:
- Divide: partition S into two sequences S1 and S2 of about n/2 elements each
- Recur: recursively sort S1 and S2
- Conquer: merge S1 and S2 into a unique sorted sequence



Merging sorted arrays

array A: 17 23 56 58      array B: 28 60 70

Two pointers, lowPtr into A and highPtr into B, start at the fronts of the arrays. At each step the smaller of the two pointed-to elements is copied into the workspace and its pointer advances; once one array is exhausted, the remainder of the other is copied over:

workspace: 17
workspace: 17 23
workspace: 17 23 28
workspace: 17 23 28 56
workspace: 17 23 28 56 58
workspace: 17 23 28 56 58 60
workspace: 17 23 28 56 58 60 70
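The pointer walk above can be sketched in Java; `mergeArrays` and the class name are illustrative, not code from the slides.

```java
// Merge two sorted arrays into one sorted workspace array.
// lowPtr walks A, highPtr walks B; the smaller current element is
// copied to the workspace, exactly as in the step-by-step trace.
public class MergeDemo {
    public static int[] mergeArrays(int[] a, int[] b) {
        int[] workspace = new int[a.length + b.length];
        int lowPtr = 0, highPtr = 0, w = 0;
        while (lowPtr < a.length && highPtr < b.length) {
            if (a[lowPtr] <= b[highPtr]) workspace[w++] = a[lowPtr++];
            else                         workspace[w++] = b[highPtr++];
        }
        while (lowPtr < a.length)  workspace[w++] = a[lowPtr++];   // drain A
        while (highPtr < b.length) workspace[w++] = b[highPtr++];  // drain B
        return workspace;
    }

    public static void main(String[] args) {
        int[] merged = mergeArrays(new int[]{17, 23, 56, 58}, new int[]{28, 60, 70});
        System.out.println(java.util.Arrays.toString(merged));
    }
}
```

Each element is copied exactly once, so merging n elements takes O(n) time.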


Merging halves of an array
Pass the boundaries of the sub-arrays to the algorithm instead of new arrays:

  l         m   m+1      r
 17 23 56 58 | 28 60 70
 lowPtr        highPtr



Implementation
public static void recMergeSort(int[] arr, int[] workSpace, int l, int r) {
    if (l == r) {
        return;
    } else {
        int m = (l + r) / 2;
        recMergeSort(arr, workSpace, l, m);
        recMergeSort(arr, workSpace, m + 1, r);
        merge(arr, workSpace, l, m + 1, r);
    }
}
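The merge() helper that recMergeSort calls is not shown on the slides; the following is a sketch of one compatible implementation (the midpoint is computed as l + (r − l) / 2 here only to avoid integer overflow; all naming is illustrative).

```java
// recMergeSort plus one possible merge() matching the call
// merge(arr, workSpace, l, m+1, r): it merges the sorted ranges
// arr[l..m] and arr[m+1..r] via the shared workspace.
public class MergeSort {
    public static void recMergeSort(int[] arr, int[] workSpace, int l, int r) {
        if (l == r) return;                 // base case: one element
        int m = l + (r - l) / 2;            // midpoint, overflow-safe
        recMergeSort(arr, workSpace, l, m);
        recMergeSort(arr, workSpace, m + 1, r);
        merge(arr, workSpace, l, m + 1, r);
    }

    static void merge(int[] arr, int[] workSpace, int low, int high, int upper) {
        int lowPtr = low, highPtr = high, w = low;
        int mid = high - 1;
        // Copy the smaller front element of either half into the workspace.
        while (lowPtr <= mid && highPtr <= upper) {
            if (arr[lowPtr] <= arr[highPtr]) workSpace[w++] = arr[lowPtr++];
            else                             workSpace[w++] = arr[highPtr++];
        }
        while (lowPtr <= mid)    workSpace[w++] = arr[lowPtr++];
        while (highPtr <= upper) workSpace[w++] = arr[highPtr++];
        // Copy the merged run back into arr.
        for (int i = low; i <= upper; i++) arr[i] = workSpace[i];
    }

    public static void main(String[] args) {
        int[] a = {7, 2, 9, 4};
        recMergeSort(a, new int[a.length], 0, a.length - 1);
        System.out.println(java.util.Arrays.toString(a)); // [2, 4, 7, 9]
    }
}
```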



Merge-Sort Tree
An execution of merge-sort is depicted by a binary tree:
- each node represents a recursive call of merge-sort and stores
  - the unsorted sequence before the execution and its partition
  - the sorted sequence at the end of the execution
- the root is the initial call
- the leaves are calls on subsequences of size 0 or 1

7 2 9 4 → 2 4 7 9

7 2 → 2 7        9 4 → 4 9

7→7   2→2   9→9   4→4



Execution Example
The animation proceeds through these stages: partition; recursive call, partition; recursive call, partition; recursive call, base case; recursive call, base case; merge; recursive call, …, base case, merge; merge; recursive call, …, merge, merge; merge. The completed merge-sort tree for the input 7 2 9 4 3 8 6 1:

7 2 9 4 3 8 6 1 → 1 2 3 4 6 7 8 9

7 2 9 4 → 2 4 7 9          3 8 6 1 → 1 3 6 8

7 2 → 2 7   9 4 → 4 9   3 8 → 3 8   6 1 → 1 6

7→7  2→2  9→9  4→4  3→3  8→8  6→6  1→1


Analysis of Merge-Sort
The height h of the merge-sort tree is O(log n):
- at each recursive call we divide the sequence in half.
The overall amount of work done at the nodes of depth i is O(n):
- we partition and merge 2^i sequences of size n/2^i
- we make 2^(i+1) recursive calls
Thus, the total running time of merge-sort is O(n log n).

depth   #seqs   size
0       1       n
1       2       n/2
i       2^i     n/2^i
…       …       …


Using merge sort
A fast sorting method for arrays: Java's Arrays.sort for object arrays, for example, uses a merge-sort variant (TimSort).
Good for sorting data in external memory, because it accesses data at adjacent indices in the array.
Not so good with (linked) lists: the array version relies on constant-time access to the middle of the sequence.



Quick-Sort
7 4 9 6 2 → 2 4 6 7 9

4 2 → 2 4        7 9 → 7 9

2→2              9→9



Quick-Sort
Quick-sort is a randomized sorting algorithm based on the divide-and-conquer paradigm:
- Divide: pick a random element x (called the pivot) and partition S into
  - L: elements less than x
  - E: elements equal to x
  - G: elements greater than x
- Recur: sort L and G
- Conquer: join L, E and G



Partition of lists (or using extra workspace)
We partition an input sequence as follows:
- we remove, in turn, each element y from S, and
- we insert y into L, E or G, depending on the result of its comparison with the pivot x.
Each insertion and removal is at the beginning or at the end of a sequence, and hence takes O(1) time.
Thus, the partition step of quick-sort takes O(n) time.
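The remove-and-append loop described above can be sketched with deques, whose end operations are O(1); the class and method names are illustrative, not from the slides.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of the list-based partition step: remove each element y from S
// and append it to L, E or G after comparing it with the pivot x.
// Every removal/insertion is at an end of a deque, hence O(1),
// so the whole partition takes O(n) time.
public class LEGPartition {
    public static Deque<Integer>[] partition(Deque<Integer> s, int x) {
        Deque<Integer> l = new ArrayDeque<>(), e = new ArrayDeque<>(), g = new ArrayDeque<>();
        while (!s.isEmpty()) {
            int y = s.removeFirst();
            if (y < x)       l.addLast(y);
            else if (y == x) e.addLast(y);
            else             g.addLast(y);
        }
        @SuppressWarnings("unchecked")
        Deque<Integer>[] result = new Deque[] { l, e, g };
        return result;
    }
}
```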



Partitioning arrays
Perform the partition using two indices j and k to split S into L and E+G:

 j→                               ←k
 3 2 5 1 0 7 3 5 9 2 7 9 8 9 7 6 9     (pivot x = 6)

Repeat until j and k cross:
- Scan j to the right until finding an element > x.
- Scan k to the left until finding an element < x.
- Swap the elements at indices j and k.



Implementation
public static void recQuickSort(int[] arr, int left, int right) {
    if (right - left <= 0) return;
    else {
        int border = partition(arr, left, right);
        recQuickSort(arr, left, border - 1);
        recQuickSort(arr, border + 1, right);
    }
}
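The partition() helper is not shown on the slides; the following sketch uses one common scheme (Lomuto-style, with the rightmost element as pivot) that is compatible with recQuickSort above. The slides' two-index j/k scan is an alternative; naming here is illustrative.

```java
// recQuickSort plus one possible partition(): Lomuto-style, with the
// rightmost element as pivot. It returns the pivot's final index, so
// the two recursive calls can skip it.
public class QuickSort {
    public static void recQuickSort(int[] arr, int left, int right) {
        if (right - left <= 0) return;
        int border = partition(arr, left, right);
        recQuickSort(arr, left, border - 1);
        recQuickSort(arr, border + 1, right);
    }

    static int partition(int[] arr, int left, int right) {
        int pivot = arr[right];
        int border = left;                  // next slot for an element <= pivot
        for (int i = left; i < right; i++) {
            if (arr[i] <= pivot) { swap(arr, i, border); border++; }
        }
        swap(arr, border, right);           // put the pivot into its final place
        return border;
    }

    static void swap(int[] arr, int i, int j) {
        int t = arr[i]; arr[i] = arr[j]; arr[j] = t;
    }

    public static void main(String[] args) {
        int[] a = {7, 4, 9, 6, 2};
        recQuickSort(a, 0, a.length - 1);
        System.out.println(java.util.Arrays.toString(a)); // [2, 4, 6, 7, 9]
    }
}
```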



Quick-Sort Tree
An execution of quick-sort is depicted by a binary tree:
- each node represents a recursive call of quick-sort and stores
  - the unsorted sequence before the execution and its pivot
  - the sorted sequence at the end of the execution
- the root is the initial call
- the leaves are calls on subsequences of size 0 or 1

7 4 9 6 2 → 2 4 6 7 9

4 2 → 2 4        7 9 → 7 9

2→2              9→9
Execution Example
The animation proceeds through these stages: pivot selection; partition, recursive call, pivot selection; partition, recursive call, base case; recursive call, …, base case, join; recursive call, pivot selection; partition, …, recursive call, base case; join, join. The completed quick-sort tree for the input 7 2 9 4 3 7 6 1 (pivot 6 at the root):

7 2 9 4 3 7 6 1 → 1 2 3 4 6 7 7 9

2 4 3 1 → 1 2 3 4          7 9 7 → 7 7 9

1→1   4 3 → 3 4            9→9

      4→4


Worst-case Running Time
The worst case for quick-sort occurs when the pivot is the unique minimum or maximum element.
One of L and G has size n − 1 and the other has size 0.
The running time is proportional to the sum n + (n − 1) + … + 2 + 1.
Thus, the worst-case running time of quick-sort is O(n^2).

depth   time
0       n
1       n − 1
…       …
n − 1   1
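The arithmetic behind the O(n^2) bound, written out:

```latex
n + (n-1) + \cdots + 2 + 1 \;=\; \sum_{i=1}^{n} i \;=\; \frac{n(n+1)}{2} \;=\; O(n^2)
```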
Best-case Running Time
The best case for quick-sort occurs when the pivot is the median element.
The L and G parts then have equal size: the sequence is split into halves, as in merge sort.
Thus, the best-case running time of quick-sort is O(n log n).



Average-case Running Time
In the average case for quick-sort, roughly half of the time the pivot lands near the middle of the sequence.
Thus, the average-case running time of quick-sort is O(n log n) again.
A detailed proof is in the textbook.



Summary of Sorting Algorithms
Algorithm                time                 notes
selection-sort           O(n^2)               in-place (no extra memory); slow (good for small inputs)
insertion-sort           O(n^2)               in-place (no extra memory); slow (good for small inputs)
quick-sort               O(n log n) expected  in-place (+ stack); randomized; fast (good for large inputs)
merge-sort               O(n log n)           sequential data access; fast (good for huge inputs)
heap-sort                O(n log n)           in-place (see G & T 8.3.5 for the in-place version); fast (good for large inputs)
(later in the module)



Comparison sorting
A sorting algorithm is a comparison sorting algorithm if it uses comparisons between elements in the sequence to determine in which order to place them.
Examples of comparison sorts: bubble sort, selection sort, insertion sort, heap sort, merge sort, quicksort.
An example of a non-comparison sort: bucket sort (Ch. 11.4). It runs in O(n) time, but relies on knowing the range of values in the sequence (e.g. integers between 1 and 1000).
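A sketch of the idea for integers in a known range [0, maxValue]; for simplicity this is the counting-sort special case of bucket sort, with one bucket per value, and the naming is illustrative.

```java
// Bucket-style sort for integers in a known range [0, maxValue]:
// O(n + maxValue) time, and no element-to-element comparisons —
// elements are placed by value, not by comparing them to each other.
public class BucketSortDemo {
    public static int[] bucketSort(int[] a, int maxValue) {
        int[] count = new int[maxValue + 1];   // bucket v counts occurrences of v
        for (int x : a) count[x]++;
        int[] out = new int[a.length];
        int w = 0;
        for (int v = 0; v <= maxValue; v++)    // emit values in increasing order
            for (int c = 0; c < count[v]; c++) out[w++] = v;
        return out;
    }
}
```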



Lower bound for comparison
sort
We can model any sorting algorithm that depends on comparisons between elements as a binary decision tree.
At each node, a comparison between two elements is made; there are two possible outcomes, and we find out a bit more about the correct order of the items in the array.
Finally we arrive at full information about the correct order of the items in the array.



Comparison sorting
a1,…,an : we don't know what the correct order is.

              a1 < a2?
            yes /   \ no
        a2 < a3?     …
        yes /
          ...

Each leaf states "the correct order is …": (a1,…,an), (a2, a1,…,an), … — n! possibilities in total.



How many comparisons?
If a binary tree has n! leaves, then the minimal number of levels (assuming the tree is perfect) is ⌈log2(n!)⌉ + 1.
This shows that O(n log n) comparison sorting algorithms are essentially optimal: log2(n!) is not equal to n log2 n, but it has the same growth rate up to constant factors.
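The growth-rate claim can be made precise without Stirling's formula:

```latex
\log_2(n!) \;=\; \sum_{i=1}^{n}\log_2 i \;\le\; n\log_2 n,
\qquad
\log_2(n!) \;\ge\; \frac{n}{2}\log_2\frac{n}{2} \;=\; \Omega(n\log n)
```

(The lower bound keeps only the largest n/2 factors of n!, each of which is at least n/2.) Hence every comparison sort needs Ω(n log n) comparisons in the worst case.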
