Professional Documents
Culture Documents
Robert Sedgewick
Princeton University
PDP-9
PDP-9
18-bit words
16K memory
switches+lights
paper tape reader/punch
display w/ lightpen
scientists
engineers
FFT
mathematicians
Quicksort
software/hardware designers
cryptananalysts
Simplex
Huffman
Dijkstra
COBOL programmers
Knuth-Morris-Pratt
...
Ford-Fulkerson
. . .
Context
TOP 10 Algorithms
IBM 360/50
Algol W
one run per day
code validation
1982
1st
compiles
1986
2nd
runs
3rd
performance
comparisons
within
reasonable
practical
model
1994
2011
[this talk]
4th
Four challenges
I.
II.
I. Scientific method
Scientific method
model
experiment
edges
max capacity
upper bound
shortest
VE/2
VC
max capacity
2E lg C
E = 2000
edges
C = 100
max capacity
upper bound
for example
shortest
VE/2
VC
177,000
17,700
max capacity
2E lg C
26,575
E = 2000
edges
C = 100
max capacity
upper bound
for example
actual
shortest
VE/2
VC
177,000
17,700
37
max capacity
2E lg C
26,575
edges
max capacity
upper bound
shortest
VE/2
VC
max capacity
2E lg C
E (upper bound)
Warning:
Such analyses are
useless
for predicting
performance
or comparing
algorithms
worst-case bounds
Common wisdom
RS (in a talk):
Q:
RS:
Q:
RS:
Q:
Galactic algorithms
R.J. Lipton: A galactic algorithm is one that will never by used in practice
Why? Any effect would never be noticed in this galaxy
theoretical tour-de-force
too complicated to implement
cost of implementing would exceed savings in this galaxy, anyway
One bloggers conservative estimate: 75% SODA, 95% STOC/FOCS are galactic
Algorithm A is bad.
Google should be interested in my new Algorithm B.
RS:
TCS:
RS:
TCS:
probability
combinatorics
analysis
information theory
Analytic Combinatorics
is a modern basis for studying discrete structures
Developed by
Philippe Flajolet and many coauthors (including RS)
based on
classical combinatorics and analysis
Cambridge 2009
also available
on the web
( )=
( )=
and expand to
get coecients
that we can
approximate.
<
Then introduce a
generating function.
( )=
( )
"
Quadratic equation
Binomial theorem
Stirlings approximation
Develop a
combinatorial
construction,
which directly maps to
a GF equation
( )=
that we can
manipulate algebraically
( )=
( )
( / )
Complexification
Assigning complex values to the variable z in a GF gives a
method of analysis to estimate the coefficients.
The singularities of the function determine the method.
singularity type
method of analysis
meromorphic
(just poles)
Cauchy
(elementary)
fractional powers
logarithmic
Cauchy
(Flajolet-Odlyzko)
none
(entire function)
saddle point
Analytic combinatorics
Q. Wait, didnt you say that the masses dont need to know all that math?
RS. Well, there is one thing...
Why?
Hypothesis: T(N ) ~ a N c
...
cycle time
instruction
set
cache
structure
code
...
...
GFs
model
asymptotics
analysis
...
a ( 2 N 0) c
aN 0 c
= 2c
lg(T(2N 0 )/T(N 0 ) c
as N 0 grows
4. Calculate a as follows:
T(N 0 )/N 0
as N 0 grows
2. Run it for N 0 , 2N 0 , 4N 0 , 8N 0,
. . .
T(2N 0 ) a ( 2N 0 ) c
~
T(N 0 )
aN 0 c
= 2c
4. Multiply by 2c to predict next value
1000
1.1 sec
2000
4.5 sec
4000
18 sec
8000
73 sec
16000
295 sec
III. Introduction to CS
The masses
Scientists, engineers and modern programmers need
write programs
design and analyze algorithms
Detailed analysis
Galactic algorithms
Overly simple input models
Classic algorithms
Realistic input models and randomization
How to predict performance and compare algorithms
Unfortunate facts
Many scientists/engineers lack basic knowledge of computer science
Many computer scientists lack back knowledge of science/engineering
identify fundamentals
teach them to all students
who need to know them
as early as possible
CS
for
CS majors
CS
for
EE
CS
for
math majors
CS
for
physicists
CS
for
poets
CS
for
biochemists
CS
for
economists
CS
for
civil engineers
CS
for
rocket
scientists
CS
for
idiots
CS
for
everyone
Performance matters
in support of
encapsulation
data abstraction
functions and modules
StdDraw
StdAudio
Picture
text I/O
assignment statements
primitive types
Basic requirements
StdIn
StdOut
CS in scientific context
functions
sqrt(), log()
libraries
1D arrays
sound
2D arrays
images
recursion
fractal models
is open-ended
strings
I/O streams
OOP
data structures
genomes
web resources
Brownian motion
small-world
Bouncing ball
Simulation is easy
Bouncing balls
OOP is helpful
N-body
CS is for cubicle-dwellers?
Data abstraction
relevant CS concepts
Computer architecture
Goals
scientific content
Scientific method
Data analysis
Simulation
Applications
Progress report
2008: Enrollments are up. Is this another bubble?
525
350
175
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
Progress report
2009: Maybe.
525
350
175
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
Progress report
2012: Enrollments are skyrocketing.
525
350
175
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
3%
35%
CLASS
62%
8%
none
some
lots
4%
18%
INTENDED MAJOR
70%
11%
24%
18%
First-year
Sophomore
Junior
Senior
10%
37%
other Science/Math
other Engineering
Humanities
Social sciences
CS
Books?
Libraries?
Textbooks?
Robert Sedgewick
December 10, 2007
Future of libraries?
1990 Every student spent significant time in the library
2010 Every student spends significant time online
Few faculty members in the sciences use the library at all for research
YET, the librarys budget continues to grow!
2020?
Scientific papers?
Alan Kay: The best way to predict the future is to invent it.
Scientific papers
When is the last time you visited a library to find a paper?
Did you print the papers to read the last time you refereed a conference?
Color?
Links to references?
Links to detailed proofs?
Simulations?
why not?
Textbooks
A road to ruin
Textbook
traditional look-and-feel
builds on 500 years of experience
for use while learning
Booksite
supports search
has code, test data, animations
links to references
a living document
for use while programming, exploring
Textbook
Part I: Programming (2009)
Prolog
1 Elements of Programming
Your First Program
Built-in types of Data
Conditionals and Loops
Arrays
Input and Output
Case Study: Random Surfer
3 Data Abstraction
Data Types
Creating DataTypes
Designing Data Types
Case Study: N-body
7 Theory of Computation
Formal Languages
Turing Machines
Universality
Computability
Intractability
8 Systems
Library Programming
Compilers and Interpreters
Operating Systems
Networks
Applications Systems
4 Algorithms/Data Structures
9 Scientific Computation
Performance
Sorting and Searching
Stacks and Queues
Symbol Tables
Case Study: Small World
Booksite
introcs.cs.princeton.edu
Text digests
Ready-to-use code
Supplementary exercises/answers
Links to references and sources
Modularized lecture slides
Programming assignments
Demos for lecture and precept
Simulators for self-study
Scientific applications
10000+ files
2000+ Java programs
50+ animated demos
1.2 million unique visitors in 2011
intellectually challenging
pervasive in modern life
critical to modern science and engineering
Barriers
no room in curriculum
need to implement all the algorithms (!)
need to analyze all the algorithms (!)
need to pick the most important ones
back to basics
(one book)
Booksite
algs4.cs.princeton.edu
Text digests
Ready-to-use code
Supplementary exercises/answers
Links to references and sources
Modularized lecture slides
Programming assignments
Demos for lecture and precept
Simulators for self-study
Scientific applications
TOP 10 Algorithms
FFT
Quicksort
Simplex
Huffman
Dijkstra
Knuth-Morris-Pratt
Ford-Fulkerson
. . .
one-click download
test data
variants
robust library versions
typical clients
Performance matters
union-find
graph search
priority queue
Algorithms enrollments
300
225
150
75
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
Summary
The scientific method is an essential ingredient in programming.
Embracing, supporting, and leveraging science in
intro CS and algorithms courses can serve large numbers of students.
Next goals:
Robert Sedgewick
Princeton University