You are on page 1of 13

Overview

Lecture T5: Analysis of Algorithm Lecture T4:


■ What is an algorithm?
– Turing machine.
■ Is it possible, in principle, to write a program to solve any
problem?
– No. Halting problem and others are unsolvable.

This Lecture:
■ For many problems, there may be several competing algorithms.
– Which one should I use?
■ Computational complexity:
– Rigorous and useful framework for comparing algorithms and
predicting performance.
■ Use sorting as a case study.

Linear Growth Quadratic Growth


Grade school addition. Grade school multiplication.
■ Work is proportional to number of digits N. ■ Work is proportional to square of number of digits N.
■ Linear growth: k N for some constant k. ■ Quadratic growth: k N2 for some constant k.

1 0 1 1 1 1 0 1 0 1 0 1
1 1 1 0 1 1 1 1 1 1 0 1 N=4 N=8
* 1 1 0 1 * 0 1 1 1 1 1 0 1
1 0 1 1 1 1 0 1 0 1 0 1
1 0 1 1 1 1 0 1 0 1 0 1 0
+ 1 1 1 0 + 0 1 1 1 1 1 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 0 0 1 1 0 1 0 1 0 0 1 0
1 0 1 1 1 1 0 1 0 1 0 1 0
N=4 N=8 1 0 1 1 1 1 0 1 0 1 0 1 0
1 0 0 0 1 1 1 1 1 1 0 1 0 1 0 1 0

2N read operations 1 1 0 1 0 1 0 1 0
2N + 1 write operations 2N reads 1 1 0 1 0 1 0 1 0
N odd parity operations N2 + 2N + 1 writes 0 0 0 0 0 0 0 0 0
N majority operations N-1 adds on N-bit integers
0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 1 0
3 4
Why Does It Matter? Orders of Magnitude
Seconds Equivalent Meters Per Imperial
Example
Run time Second Units
1.3 N3 10 N2 47 N log2N 48 N 1 1 second
(nanoseconds) 10-10 1.2 in / decade Continental drift
1000 1.3 seconds 10 msec 0.4 msec 0.048 msec 10 10 seconds
10-8 1 ft / year Hair growing
Time to 10,000 22 minutes 1 second 6 msec 0.48 msec 102 1.7 minutes
10-6 3.4 in / day Glacier
solve a 103 17 minutes
100,000 15 days 1.7 minutes 78 msec 4.8 msec 10-4 1.2 ft / hour Gastro-intestinal tract
problem
of size million 41 years 2.8 hours 0.94 seconds 48 msec 104 2.8 hours
10-2 2 ft / minute Ant
10 million 41 millennia 1.7 weeks 11 seconds 0.48 seconds 105 1.1 days
1 2.2 mi / hour Human walk
106 1.6 weeks
second 920 10,000 1 million 21 million 102 220 mi / hour Propeller airplane
Max size 107 3.8 months
problem minute 3,600 77,000 49 million 1.3 billion 104 370 mi / min Space shuttle
solved 108 3.1 years
hour 14,000 600,000 2.4 trillion 76 trillion 106 620 mi / sec Earth in galactic orbit
in one 109 3.1 decades
day 41,000 2.9 million 50 trillion 1,800 trillion 108 62,000 mi / sec 1/3 speed of light
1010 3.1 centuries
N multiplied by 10, ... forever 210 thousand
1,000 100 10+ 10
time multiplied by Powers
age of 220 million
1021 of 2
universe 230 billion
5 6

Historical Quest for Speed Better Machines vs. Better Algorithms


Multiplication: a × b. New machine.
■ Naïve: add a to itself b times. N 2N steps N = # bits in binary ■ Costs $$$ or more.
representation of a, b
■ Grade school. N2 steps ■ Makes "everything" finish sooner.
■ Divide-and-conquer (Karatsuba, 1962). N1.58 steps ■ Incremental quantitative improvements (Moore’s Law).
■ Ingenuity (Schönhage and Strassen, 1971). ■ May not help much with some problems.
N log N log log N steps
New algorithm.
step = integer division Costs $ or less.

Greatest common divisor: gcd(a, b). ■ Dramatic qualitative improvements possible! (million times faster)
■ Naïve: factor a and b, then find gcd(a, b). 2N steps ■ May make the difference, allowing specific problem to be solved.
■ Euclid (20 BCE): gcd(a, b) = gcd(b, a mod b). N steps ■ May not help much with some problems.

7 8
Impact of Better Algorithms Case Study: Sorting
Example 1: N-body-simulation. Sorting problem:
■ Simulate gravitational interactions among N bodies. ■ Given N items, rearrange them so that they are in increasing order.
– physicists want N = # atoms in universe ■ Among most fundamental problems.
■ Brute force method: N2 steps.
■ Appel (1981). N log N steps, enables new research.
name name
Example 2: Discrete Fourier Transform (DFT). Hauser Hanley
■ Breaks down waveforms (sound) into periodic components.
Hong Haskell
– foundation of signal processing
Hsu Hauser
– CD players, JPEG, analyzing astronomical data, etc.
Hayes Hayes
■ Grade school method: N2 steps.
Haskell Hill
■ Runge-König (1924), Cooley-Tukey (1965).
FFT algorithm: N log N steps, enables new technology. Hanley Hong
Hornet Hornet
Hill Hsu

9 10

Case Study: Sorting Generic Item to Be Sorted


Sorting problem: Define generic Item type to be sorted.
■ Given N items, rearrange them so that they are in increasing order. ■ Associated operations:
■ Among most fundamental problems. – less, show, swap, rand
■ Example: integers.
Insertion sort
■ Brute-force sorting solution. ITEM.h
■ Move left-to-right through array. typedef int Item;
■ Exchange next element with larger elements to its left, one-by-one.
return 1 if a < b int ITEMless(Item a, Item b);
void ITEMshow(Item a);
swap 2 Items void ITEMswap(Item *pa, Item *pb);
int ITEMscan(Item *pa);

11 12
Item Implementation Generic Sorting Program
item.c
#include <stdio.h> sort.c (see Sedgewick 6.1)
#include "ITEM.h" #include <stdio.h>
#include <stdlib.h>
int ITEMless(Item a, Item b) { #include "Item.h"
return (a < b); Max number of
#define N 2000000
} items to sort.
int main(void) {
void ITEMswap(Item *pa, Item *pb) { int i, n = 0;
swap integers – need Item t; Item a[N];
to use pointers t = *pa; *pa = *pb; *pb = t;
} Read input. while(ITEMscan(&a[n]) != EOF)
n++;
void ITEMshow(Item a) {
printf("%d\n", a); Call generic sort sort(a, 0, n-1);
} function.
for (i = 0; i < n; i++)
void ITEMscan(Item *pa) { Print results. ITEMshow(a[i]);
return scanf("%d", pa); return 0;
} }

13 14

Insertion Sort Function Profiling Insertion Sort Empirically


Use lcc "profiling" capability.
■Automatically generates a file prof.out that has frequency counts
insertionsort.c (see Sedgewick Program 6.1) for each instruction.
Unix
■Striking feature:
void insertionsort(Item a[], int left, int right) { % lcc -b insertion.c item.c
– HUGE numbers!
int i, j; % a.out < sort1000.txt
% bprint
for (i = left + 1; i <= right; i++)
for (j = i; j > left; j--)
Insertion Sort prof.out
if (ITEMless(a[j], a[j-1]))
ITEMswap(&a[j], &a[j-1]);
else void insertionsort(Item a[], int left, int right) <1>{
break; int i, j;
} for (<1>i = left + 1; <1000>i <= right; <999>i++)
for (<999>j = i; <256320>j > left; <255321>j--)
if (<256313>ITEMless(a[j], a[j-1]))
<255321>ITEMswap(&a[j], &a[j-1]);
else
<992>break;
<1>}
15 16
Profiling Insertion Sort Analytically Profiling Insertion Sort Analytically
How long does insertion sort take? How long does insertion sort take?
■ Depends on number of elements N to sort. ■ Depends on number of elements N to sort.
■ Depends on specific input. ■ Depends on specific input.
■ Depends on how long compare and exchange operation takes. ■ Depends on how long compare and exchange operation takes.

Worst case. Best case.


■ Elements in reverse sorted order. ■ Elements in sorted order already.
– ith iteration requires i - 1 compare and exchange operations – ith iteration requires only 1 compare operation
– total = 0 + 1 + 2 + . . . + N-1 = N (N-1) / 2 – total = 0 + 1 + 1 + . . . + 1 = N -1

E F G H I J D C B A A B C D E F G H I J

unsorted active sorted unsorted active sorted

17 18

Profiling Insertion Sort Analytically Profiling Insertion Sort Analytically


How long does insertion sort take? How long does insertion sort take?
■ Depends on number of elements N to sort. ■ Depends on number of elements N to sort.
■ Depends on specific input. ■ Depends on specific input.
■ Depends on how long compare and exchange operation takes. ■ Depends on how long compare and exchange operation takes.

Average case. Worst case: N (N - 1) / 2.


■ Elements are randomly ordered.
– ith iteration requires i / 2 comparison on average Best case: N - 1.
– total = 0 + 1/2 + 2/2 + . . . + (N-1)/2 = N (N-1) / 4
– check with profile: 249,750 vs. 256,313 Average case: N (N – 1) / 4.

B E F R T U O R C E

unsorted active sorted

19 20
Estimating the Running Time Estimating the Running Time
Total run time: Easier alternative.
■ Sum over all instructions: frequency * cost. (i) Analyze asymptotic growth.
(ii) For medium N, run and measure time.
Frequency: For large N, use (i) and (ii) to predict time.
■ Determined by algorithm and input.
■ Can use lcc -b (or analysis) to help estimate. Asymptotic growth rates.
■ Estimate time as a function of input size. Donald Knuth
Cost: – N, N log N, N2, N3, 2N, N!
■ Determined by compiler and machine. ■ Ignore lower order terms and leading coefficients.
■ Could use lcc –S (plus manuals). – Ex. 6N3 + 17N2 + 56 is proportional to N3

Insertion sort is quadratic. On arizona: 1 second for N = 10,000.


■ How long for N = 100,000? 100 seconds (100 times as long).
■ N = 1 million? 2.78 hours (another factor of 100).
■ N = 1 billion? 317 years (another factor of 106).

21 22

Sorting Case Study: mergesort Sorting Case Study: mergesort


Insertion sort (brute-force) Insertion sort (brute-force)
Mergesort (divide-and-conquer) Mergesort (divide-and-conquer)
■ Divide array into two halves. ■ Divide array into two halves.
■ Sort each half separately. How do we sort half size files? ■ Sort each half separately.
! Any sorting algorithm will do. ■ Merge two halves to make sorted whole.
! Use mergesort recursively! ! How do we merge efficiently?

M E R G E S O R T M E M E R G E S O R T M E
M E R G E S O R T M E divide M E R G E S O R T M E divide

E E G M R S E M O R T sort E E G M R S E M O R T sort

E E E G M M O R R S T merge

26 27
Profiling Mergesort Analytically Profiling Mergesort Analytically
How long does mergesort take? How long does mergesort take?
■ Bottleneck = merging (and copying). ■ Bottleneck = merging (and copying).
– merging two files of size N/2 requires N comparisons – merging two files of size N/2 requires N comparisons
■ T(N) = comparisons to mergesort ■ N log2 N comparisons to sort ANY array of N elements.
array of N elements. – even already sorted array!
 0 if N = 1

T(N ) = 
14243
2T (N / 2) + N{
sorting both halves merging
otherwise
How much space?

! Can’t do "in-place" like insertion sort.
Unwind recurrence: (assume N = 2k ). ! Need auxiliary array of size N.

T(N) = 2 T(N/2) + N = 2 (2 T(N/4) + N/2) + N


= 4 T(N/4) + 2N = 4 (2 T(N/8) + N/4) + 2N
= 8 T(N/8) + 3N
= 16 T(N/16) + 4N
...

= N T(1) + k N
= 0 + N log2 N
28 29

Implementing Mergesort Implementing Mergesort


merge (see Sedgewick Program 8.2)

mergesort (see Sedgewick Program 8.3) void merge(Item a[], int left, int mid, int right) {
int i, j, k;
Item aux[MAXN]; uses scratch array
for (i = mid+1; i > left; i--)
void mergesort(Item a[], int left, int right) { aux[i-1] = a[i-1]; copy to
int mid = (right + left) / 2; for (j = mid; j < right; j++) temporary array
if (right <= left) aux[right+mid-j] = a[j+1];
return;
mergesort(a, left, mid); for (k = left; k <= right; k++)
mergesort(a, mid + 1, right); if (ITEMless(aux[i], aux[j]))
merge two sorted
merge(a, left, mid, right); a[k] = aux[i++];
sequences
} else
a[k] = aux[j--];
}

30 31
Profiling Mergesort Empirically Quicksort
Mergesort prof.out
Quicksort.
void merge(Item a[], int left, int mid, int right) <999>{ ■ Partition array so that:
int i, j, k; – some partitioning element a[m] is in its final position
for (<999>i = mid+1; <6043>i > left; <5044>i--) – no larger element to the left of m
<5044>aux[i-1] = a[i-1];
for (<999>j = mid; <5931>j < right; <4932>j++) – no smaller element to the right of m
<4932>aux[right+mid-j] = a[j+1];
for (<999>k = left; <10975>k <= right; <9976>k++) partitioning
if (<9976>ITEMless(aux[i], aux[j])) element
<4543>a[k] = aux[i++]; # comparisons
else
Theory ~ N log2 N = 9,966
<5433>a[k] = aux[j--];
<999>}
Actual = 9,976
Q U I C K S O R T I S C O O L
void mergesort(Item a[], int left, int right) <1999>{
int mid = <1999>(right + left) / 2; I C K I C L Q U S O R T S O O
if (<1999>right <= left)
return<1000>;
<999>mergesort(a, aux, left, mid); Striking feature: ≤L ≥L
<999>mergesort(a, aux, mid+1, right);
no HUGE numbers!
<999>merge(a, aux, left, mid, right);
<1999>} partitioned array
32 34

Quicksort Sorting Case Study: quicksort


Quicksort. Insertion sort (brute-force)
■ Partition array so that: Mergesort (divide-and-conquer)
– some partitioning element a[m] is in its final position Quicksort (conquer-and-divide)
– no larger element to the left of m ■ Partition array so that:
– no smaller element to the right of m – some partitioning element a[m] is in its final position

■ Sort each "half" recursively. – no larger element to the left of m


partitioning
– no smaller element to the right of m
element
■ Sort each "half" recursively.

quicksort.c (see Sedgewick Program 7.1)


Q U I C K S O R T I S C O O L
void quicksort(Item a[], int left, int right) {
int m;
I
C C K
I I C
K L Q
O U
O S
O O
Q R T
S S O
T O
U if (right > left) {
m = partition(a, left, right);
quicksort(a, left, m - 1);
quicksort(a, m + 1, right);
Sort each "half." }
}
35 36
Sorting Case Study: quicksort Implementing Partition
partition (see Sedgewick Program 7.2)
Insertion sort (brute-force)
int partition(Item a[], int left, int right) {
Mergesort (divide-and-conquer)
int i = left-1; /* left to right pointer */
Quicksort (conquer-and-divide) int j = right; /* right to left pointer */
■ Partition array so that: Item p = a[right]; /* partition element */
– some partitioning element a[m] is in its final position
– no larger element to the left of m while(1) {
while (ITEMless(a[++i], p)) find element on left to swap
– no smaller element to the right of m
;
■ Sort each "half" recursively. while (ITEMless(p, a[--j])) look for element on right to
■ How do we partition efficiently? if (j == left) swap, but don’t run off end
– N - 1 comparisons
break;
– easy with auxiliary array
if (i >= j) pointers cross
– better solution: use no extra space! break;
ITEMswap(&a[i], &a[j]);
}
swap partition
ITEMswap(&a[i], &a[right]);
element
return i;
}
37 38

Profiling Quicksort Empirically Profiling Quicksort Empirically


Quicksort prof.out (cont)

int partition(Item a[], int left, int right) <668>{


int i = <668>left-1, j = <668>right;
Quicksort prof.out Item swap, p = <668>a[right];
void quicksort(Item a[], int left, int right) <1337>{
<668>while<1678>(<1678>1) {
int p;
while (<5708>ITEMless(a[++i], p))
if (<1337>right <= left)
<3362>;
return<669>;
while (<6664>ITEMless(p, a[--j]))
<668>p = partition(a, left, right);
if (<4495>j == left)
<668>quicksort(a, left, p-1); Striking feature: no <177>break; Striking feature: no
<668>quicksort(a, p+1, right); HUGE numbers! if (<2346>i >= j) HUGE numbers!
<1337>}
<668>break;
<1678>ITEMswap(&a[i], &a[j]);
}
<668>ITEMswap(&a[i], &a[right]);
return <668>i;
<668>}

39 40
Profiling Quicksort Analytically Profiling Quicksort Analytically
Intuition. Partition on median element.
■ Assume all elements unique. ! Proportional to N log2 N in best and worst case.
■ Assume we always select median as partition element.
■ T(N) = # comparisons. Partition on rightmost element.
! Proportional to N2 in worst case.
 0 if N = 1 ! Already sorted file: takes N2/2 + N/2 comparisons.

T( N ) =  2T ( N / 2) + {N otherwise
1424 3
sorting both halves partitioning Partition on random element.
! Roughly 2 N log e N steps.
If N is a power of 2. ! Choose random partition element.
⇒ T(N) = N log2 N
Check profile.
! Analysis "almost true" if you partition
■ 2 N log e N: 13815 vs. 12372 (5708 + 6664).
on random element.
■ Running time for N = 100,000 about 1.2 seconds.

Can you find median in O(N) time? ■ How long for N = 1 million ?
– slightly more than 10 times (about 12 seconds)
! Yes, see COS 226/423. Bob Tarjan, et al (1973)
41 42

Sorting Analysis Summary Design, Analysis, and Implementation of Algorithms


Running time estimates: Algorithm.
■ Home pc executes 108 comparisons/second. ■ "Step-by-step recipe" used to solve a problem.
■ Supercomputer executes 1012 comparisons/second. ■ Generally independent of programming language or machine on
which it is to be executed.
Insertion Sort (N2) Quicksort (N lg N)
computer thousand million billion thousand million billion Design.
home instant 2.8 hours 317 years instant 0.3 sec 6 min ■ Find a method to solve the problem.

is
super instant 1 second 1.6 weeks instant instant instant

lys

De
s
a
Analysis.

ign
An
■ Evaluate its effectiveness
Lesson: good algorithms are more powerful than supercomputers. and predict theoretical performance.

Implementation. Implement
■ Write actual code and test your theory.

43 44
Sorting Analysis Summary Computational Complexity
Comparison of Different Sorting Algorithms Framework to study efficiency of algorithms.
Attribute insertion quicksort mergesort ■ Depends on machine model, average case, worst case.
Worst case complexity N2 N2 N log2 N ■ UPPER BOUND = algorithm to solve the problem.
Best case complexity N N log2 N N log2 N ■ LOWER BOUND = proof that no algorithm can do better.
Average case complexity N2 N log2 N N log2 N
■ OPTIMAL ALGORITHM: lower bound = upper bound.
Already sorted N N2 N log2 N
Reverse sorted N2 N2 N log2 N Example: sorting.
Space N N 2N ■ Measure costs in terms of comparisons.
Stable yes no yes ■ Upper bound = N log2 N (mergesort).
– quicksort usually faster, but mergesort never slow
Sorting algorithms have different performance characteristics. ■ Lower bound = N log2 N - N log2 e
■ Other choices: BST sort, bubblesort, heapsort, shellsort, selection (applies to any comparison-based algorithm).
sort, shaker sort, radix sort, distribution sort, solitaire sort, hybrid – Why?
methods.
■ Which one should I use?
! Depends on application.

45 46

Computational Complexity Summary


Caveats. How can I evaluate the performance of a proposed algorithm?
■ Worst or average case may be unrealistic. ■ Computational experiments.
■ Costs ignored in analysis may dominate. ■ Complexity theory.
■ Machine model may be restrictive.
What if it’s not fast enough?
Complexity studies provide: ■ Use a faster computer.
■ Starting point for practical implementations. – performance improves incrementally

■ Indication of approaches to be avoided. ■ Understand why.


■ Discover a better algorithm.
– performance can improve dramatically
– not always easy / possible to develop better algorithm

47 48
Average Case vs. Worst Case

Lecture T5: Extra Slides Worst-case analysis.


■ Take running time of worst input of size N.
■ Advantages:
– performance guarantee
■ Disadvantage:
– pathological inputs can determine run time

Average case analysis.


■ Take average run time over all inputs of some class.
■ Advantage:
– can be more accurate measure of performance
■ Disadvantage:
– hard to quantify what input distributions will look like in practice
– difficult to analyze for complicated algorithms, distributions
– no performance guarantee

53

Profiling Quicksort Analytically Comparison Based Sorting Lower Bound


Average case.
a1 < a2
■ Assume partition element chosen at random and all elements are
unique. YES NO
■ Denote ith
largest element by i.
2
■ Probability that i and j (where j > i) are compared = j − i + 1 a2 < a3 a1 < a3
YES NO YES NO
2 N i 1
Expected # of comparisons = ∑ = 2∑ ∑
i< j j − i + 1 i =1 j = 2 j
N 1 1, 2, 3 a1 < a3 2, 1, 3 a2 < a3
≤ 2N ∑
j
j =1 YES NO YES NO
1
≈ 2 N ∫1N
j 1, 3, 2 3, 1, 2 2, 3, 1 3, 2, 1
= 2 N ln N

Decision Tree of Program

55 56
Comparison Based Sorting Lower Bound
Lower bound = N log2N (applies to any comparison-based algorithm).
■ Worst case dictated by tree height h.
■ N! different orderings.
■ One (or more) leaves corresponding to each ordering.
■ Binary tree with N! leaves must have

h ≥ log 2 ( N ! )
≥ log 2 ( N / e ) N Stirling’s formula
= N log 2 N − N log 2 e
= Θ ( N log 2 N)

57