Professional Documents
Culture Documents
Algorithms in Java, 4th Edition · Robert Sedgewick and Kevin Wayne · Copyright © 2008 · May 2, 2008 10:35:54 AM
Running time
how many
times do you
have to turn
the crank?
2
Reasons to analyze algorithms
Predict performance.
Provide guarantees.
time
quadratic
64T
32T
16T
linearithmic
8T
linear
size 1K 2K 4K 8K
N-body Simulation.
• Simulate gravitational interactions among N bodies.
• Brute force: N2 steps.
• Barnes-Hut: N log N steps, enables new research.
Andrew Appel
PU '81
time
quadratic
64T
32T
16T
linearithmic
8T
linear
size 1K 2K 4K 8K
6
Scientific analysis of algorithms
Scientific method.
• Observe some feature of the universe.
• Hypothesize a model that is consistent with observation.
• Predict events using the hypothesis.
• Verify the predictions by making further observations.
• Validate by repeating until the hypothesis and observations agree.
Principles.
• Experiments must be reproducible.
• Hypotheses must be falsifiable.
Why is
my program so slow ?
8
Example: 3-sum
3-sum. Given N integers, find all triples that sum to exactly zero.
Application. Deeply related to problems in computational geometry.
% more 8ints.txt
30 -30 -20 -10 40 0 10 5
9
3-sum: brute-force algorithm
10
Measuring the running time
11
Measuring the running time
ThreeSum.count(a);
client code
implementation
12
3-sum: initial observations
Data analysis. Observe and plot running time as a function of input size N.
N time (seconds) †
1024 0.26
2048 2.16
4096 17.18
8192 137.76
13
Empirical analysis
Log-log plot. Plot running time vs. input size N on log-log scale.
slope = 3
power law
4096 17.18
15
Doubling hypothesis
Q. What is effect on the running time of doubling the size of the input?
512 0.03 -
17
‣ estimating running time
‣ mathematical analysis
‣ order-of-growth hypotheses
‣ input models
‣ measuring space
18
Mathematical models for running time
Donald Knuth
1974 Turing Award
19
Cost of basic operations
20
Cost of basic operations
assignment statement a = b c2
int count = 0;
for (int i = 0; i < N; i++)
if (a[i] == 0) count++;
operation frequency
variable declaration 2
assignment statement 2
equal to comparison N
between N (no zeros)
and 2N (all zeros)
array access N
increment ≤ 2N
22
Example: 2-sum
int count = 0;
for (int i = 0; i < N; i++)
for (int j = i+1; j < N; j++)
if (a[i] + a[j] == 0) count++;
1
0 + 1 + 2 + . . . + (N − 1) = N (N − 1)
operation frequency 2
! "
N
=
2
variable declaration N+2
increment ≤ N2
23
Tilde notation
Ex 1. 6 N 3 + 20 N + 16
~ 6N3
Ex 2. 6 N 3 + 100 N 4/3 + 56
~ 6N3
Ex 3. 6 N 3 + 17 N 2 lg N + 7 N
~ 6N3
f (N)
Technical definition. f(N) ~ g(N) means lim = 1
N→ ∞ g(N)
€
24
Example: 2-sum
int count = 0;
for (int i = 0; i < N; i++)
for (int j = i+1; j < N; j++)
if (a[i] + a[j] == 0) count++; "inner loop"
variable declaration ~ N c1 ~ c1 N
assignment statement ~ N c2 ~ c2 N
array access ~ N2 c4 ~ c4 N 2
increment ≤ N2 c5 ≤ c5 N 2
total ~ cN2
25
Example: 3-sum
478 Algorithms and Data Stru
Q. How many instructions as a function of N?
In practice,
• Formulas can be complicated.
• Advanced mathematics might be required.
• Exact models best left for experts.
TN = c1 A + c2 B + c3 C + c4 D + c5 E
A = variable declarations
B = assignment statements frequencies
C = compare (depend on algorithm, input)
D = array access
E = increment
28
Common order-of-growth hypotheses
To determine order-of-growth:
• Assume a power law TN ~ c N a.
• Estimate exponent a with doubling hypothesis.
• Validate with mathematical analysis.
N time (seconds) †
Ex. ThreeSumDeluxe.java
1,000 0.43
Food for thought. How is it implemented?
2,000 0.53
4,000 1.01
8,000 2.87
16,000 11.00
32,000 44.64
64,000 177.48
observations
29
482 Algorithms and Data Structures
1024T
Quadratic. A 1typical program
constant 1
exponential
whose
c
tic
cubi
lin mic
dra
512T
h
running time has order of growth N 2
r
rit
ea
log N logarithmic ~1
qua
ea
lin has two nested for loops, used for some
calculation involving
N all pairs linear
of N ele- 2
64T ments. The force update double loop in
N log3.4.2)
NBody (PROGRAM N is a linearithmic
prototype of ~2
the programs in this classification, as is
2
the elementaryNsorting quadratic
algorithm Inser- 4
8T tion (PROGRAM 4.2.4).
4T
N3
cubic 8
growth
name typical code framework description example
rate
while (N > 1)
log N logarithmic { N = N / 2; ... } divide in half binary search
divide
N log N linearithmic [see lecture 5] mergesort
and conquer
31
Practical implications of order-of-growth
1 1 second
Q. How many inputs can be processed in minutes?
10 10 seconds
Ex. Customers lost patience waiting "minutes" in 1970s;
102 1.7 minutes
they still do.
103 17 minutes
32
Practical implications of order-of-growth
tens of hundreds of
N millions billions minutes seconds second instant
millions millions
tens of
N2 hundreds thousand thousands decades years months weeks
thousands
33
Practical implications of order-of-growth
34
‣ estimating running time
‣ mathematical analysis
‣ order-of-growth hypotheses
‣ input models
‣ measuring space
35
Types of analyses
36
Commonly-used notations
10 N 2
provide
Tilde leading term ~ 10 N 2 10 N 2 + 22 N log N
approximate model
10 N 2 + 2 N +37
N2
asymptotic classify
Big Theta Θ(N 2) 9000 N 2
growth rate algorithms
5 N 2 + 22 N log N + 3N
N2
develop
Big Oh Θ(N 2) and smaller O(N 2) 100 N
upper bounds
22 N log N + 3 N
9000 N 2
develop
Big Omega Θ(N 2) and larger Ω(N 2) N5
lower bounds
N 3 + 22 N log N + 3 N
37
Commonly-used notations
Ex 2. Conjecture: worst-case running time for any 3-sum algorithm is Ω(N 2).
Ex 4. The worst-case height of a tree created with union find with path
compression is Θ(N).
Ex 5. The height of a tree created with weighted quick union is O(log N).
1
loga N = logb N
logb a
38
Predictions and guarantees
Advantages
• describes guaranteed performance. time/memory
f(N)
Challenges values represented
•
by O(f(N))
Cannot use to predict performance.
• Cannot use to compare algorithms.
input size
39
Predictions and guarantees
Advantages. time/memory
values represented
Challenges. by ~ c f(N)
input size
40
‣ estimating running time
‣ mathematical analysis
‣ order-of-growth hypotheses
‣ input models
‣ measuring space
41
Typical memory requirements for primitive types in Java
Bit. 0 or 1.
Byte. 8 bits.
Megabyte (MB). 210 bytes ~ 1 million bytes.
Gigabyte (GB). 220 bytes ~ 1 billion bytes.
type bytes
boolean 1
byte 1
char 2
int 4
float 4
long 8
double 8
42
Typical memory requirements for arrays in Java
Q. What's the biggest double[] array you can store on your computer?
43
M 3.2.1) object uses 32
ee double instance vari-
Typical memory requirements for objects in Java
ny programs create mil-
the information needed
Object overhead. 8 bytes on a typical machine.
or transparency). A refer-
Reference. 4 bytes on a typical machine.
n a data type contains a
4 bytes forExthe reference
1. Each Complex object consumes 24 bytes of memory.
y needed for the object’s
(Program 3.2.1) public32 class
bytes Complex 8 bytes overhead for object
class Charge {
privateobjectdouble re; 8 bytes
overhead
ate double rx; private double im; 8 bytes
ate double ry; rx
... double
ate double q; ry
} values 24 bytes
q
46
Example 2
...
int N = Integer.parseInt(args[0]);
for (int i = 0; i < N; i++) {
int[] a = new int[N];
...
}
47
Out of memory
48
Turning the crank: summary
49