Lecture 17: Big-Oh Notation and Algorithms

Exact analysis of resource usage (e.g., time or memory) of programs is very hard.

CPU speeds/loads vary widely.

Compiler optimizations.

Implementation details.

Cache behavior.

Details of input.

Breakthrough solution: don’t be so exact!

Benefits of appropriate vagueness:

Results are robust, even with variations in all of the above. We’d like results that last beyond the current generation of computers.

Analysis is easier.


“Big Oh” Notation

Major ideas:

Focus on worst case performance.

– If worst-case running time is $O(g(n))$, it means the worst possible input of size $n$ runs in that time.

– Easier to analyze than average case.

– No assumptions about probability distributions required.

Disregard constant factors.

– Analysis is easier.

– Focuses on rate of growth.

This concept is one of the most important in both theoretical and practical computer science.

It is critical to the study of algorithms.

Notation focuses on growth as a function of problem size, ignoring constant factors.

Definition: $f(n)$ is $O(g(n))$ (“order of $g(n)$”) if there exist positive constants $c$ and $n_0$ such that, for every $n$ greater than $n_0$, $f(n) \le c \cdot g(n)$.

Example: […] is […] because […] for […] when […].

Example: […] is […] because […] for […] (for all […]).

Example: […] is […] because […] for […] and […], or for […] and […], or …

Example: […] is […] for […] and […].
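To make the definition concrete, the sketch below tests a proposed pair of constants $c$ and $n_0$ numerically. The helper `witnesses_hold` and the sample functions `f` and `g` are hypothetical names chosen for illustration, not taken from the lecture. A finite check like this can rule out a bad choice of constants, but it cannot replace the algebraic argument.

```python
def witnesses_hold(f, g, c, n0, n_max=10_000):
    """Check f(n) <= c * g(n) for every n with n0 < n <= n_max.

    A finite check can refute a proposed (c, n0) pair but cannot prove
    the bound for all n; that still takes an algebraic argument.
    """
    return all(f(n) <= c * g(n) for n in range(n0 + 1, n_max + 1))


# Hypothetical example functions, chosen only for illustration.
def f(n):
    return 3 * n * n + 10 * n


def g(n):
    return n * n


print(witnesses_hold(f, g, c=4, n0=10))   # True: 10n <= n^2 once n >= 10
print(witnesses_hold(f, g, c=13, n0=0))   # True: 10n <= 10n^2 for n >= 1
print(witnesses_hold(f, g, c=3, n0=0))    # False: 3n^2 + 10n > 3n^2 for n >= 1
```

More than one pair of constants works for the same bound, which is why the definition only asks that some pair exists.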


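Laws of Big-Oh

Suppose $f_1(n)$ is $O(g_1(n))$ and $f_2(n)$ is $O(g_2(n))$.

Addition: $f_1(n) + f_2(n)$ is $O(\max(g_1(n), g_2(n)))$.

In a nutshell: the bigger wins.

Something with running time $O(g_1(n))$;

Something with running time $O(g_2(n))$;

has running time $O(\max(g_1(n), g_2(n)))$.

More laws of Big-Oh

Suppose $f_1(n)$ is $O(g_1(n))$ and $f_2(n)$ is $O(g_2(n))$.

Multiplication: $f_1(n) \cdot f_2(n)$ is $O(g_1(n) \cdot g_2(n))$.

Application: Nested loops.

for $i$ = 1 to $f_1(n)$ do

    something that takes time $f_2(n)$ in the worst case.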
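The two laws can be observed by counting steps. This is a minimal sketch with hypothetical function names and arbitrarily chosen loop bounds ($n$ and $n^2$): code run in sequence adds its costs, and a loop multiplies the cost of its body by the number of iterations.

```python
def sequential_steps(n):
    """Two code fragments run one after the other: costs add."""
    steps = 0
    for _ in range(n):          # fragment 1: n steps, O(n)
        steps += 1
    for _ in range(n * n):      # fragment 2: n^2 steps, O(n^2)
        steps += 1
    return steps                # n + n^2, which is O(n^2): the bigger wins


def nested_steps(n):
    """A loop of n iterations around a body costing n steps: costs multiply."""
    steps = 0
    for _ in range(n):          # outer loop: n iterations
        for _ in range(n):      # body: n steps per iteration
            steps += 1
    return steps                # n * n, which is O(n^2)


for n in (10, 100):
    print(n, sequential_steps(n), nested_steps(n))
```

For `n = 100` the sequential count is 10100 against 10000 for the nested one, so the extra linear term changes the total by only one percent.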


Simple bounds

Since the point of “Big-Oh” notation is to simplify reasoning, it is silly to have run times like: […]. This is just […].

[…] is […] (summation rule).

Rules of Thumb

Constant factors don’t matter

[…] is […] because […] for […] (for all […]).

[…] is […].

[…] is […].

Exponentials grow faster than polynomials

[…] is […] (the magic of compound interest).


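More Rules of Thumb

Logarithms grow slower than $n^k$ for $k > 0$.

$\log n$ is $O(n^k)$.

$(\log n)^j$ is $O(n^k)$. ($(\log n)^j$ is sometimes written $\log^j n$.)

The base of logarithms doesn’t matter.

Changing the base is just a constant factor!

$\log_a n$ is $O(\log_b n)$ for any bases $a$ and $b$ because $\log_a n = \log_b n / \log_b a$.

Factorials

$n!$ is a very fast-growing function. It is not $O(c^n)$ for any constant $c$.

However $n! \le n^n$.

This simple bound is often easy to work with.

Example: $\log(n!)$ is $O(n \log n)$. This is the running time of many sorting algorithms.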
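Two of these rules are easy to check numerically. The sketch below uses only the standard `math` module. The bases 2 and 10, the helper name `factorial_bounds_hold`, and the tested range are assumptions made for illustration.

```python
import math

# Base change: log_2(n) / log_10(n) is the same constant for every n > 1.
for n in (10, 1_000, 1_000_000):
    ratio = math.log2(n) / math.log10(n)
    print(n, round(ratio, 6))   # always log_2(10), about 3.321928


def factorial_bounds_hold(n):
    """Check n! <= n^n and log2(n!) <= n * log2(n) for one n >= 1."""
    fact = math.factorial(n)
    return fact <= n ** n and math.log2(fact) <= n * math.log2(n)


print(all(factorial_bounds_hold(n) for n in range(1, 200)))
```

The second bound follows from the first by taking logarithms of both sides, since $\log(n^n) = n \log n$.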


Shorthand

Constant – $O(1)$.

Log – $O(\log n)$.

Polylog – $O(\log^k n)$ for some $k$.

Linear – $O(n)$.

Quadratic – $O(n^2)$.

Exponential – […] for some […] and […] that is at least […].

Factorial – […] for some […].

Double exponential – […] for some […] and […] that is at least […].

etc.

Directed Graphs (digraphs)

A directed graph (digraph) is a relation: $E \subseteq V \times V$. $V$ is called the set of vertices (also called nodes). $E$ is called the set of edges (also called arcs).

The arrows are edges, the circles are vertices.


Running Times

Finding running times can be hard. We’ll do some easy cases and leave the rest for another course.

First question: What’s $n$ (the input size)?

Example: A directed graph on $n$ vertices may have up to $n^2$ edges. An algorithm that processes each edge in constant time takes $O(n^2)$ running time.

The size of the graph is more accurately $n$ vertices and $m$ edges. The same algorithm takes $O(m)$ time.

This is important for sparse graphs, where $m$ may be much less than $n^2$.

Estimating from program structure

The running time of many algorithms can be estimated from the recursive structure of programs.

Base case: In a language like C, almost all simple statements (statements […]) take constant time:

Assignments (assume no function call in right-hand-side).

Simple I/O functions

Break, continue, return

The recursive cases are the statements that combine simpler structures: sequencing, conditionals, and loops.


Compound statements

Sequencing: worst case running time of $S_1; S_2$ is the sum of the running times of $S_1$ and $S_2$.

If-then-else: running time of if $B$ then $S_1$ else $S_2$ is the time to compute $B$ plus the max of the times of $S_1$ and $S_2$.

Loops: run time is number of iterations times the running time of the loop body.

Finding a bound on the number of loop iterations may be impossible, but often it’s easy (e.g., for loops).

This analysis is conservative: knowing more about how the program works may yield tighter bounds.

Linear Search

Abstractly, linear search checks for membership of a value in a set represented as an array or list.

Suppose we’re searching an array of length $n$.

    for […] from […] to […] do
        if […] then
            return true; – found it
    return false;

Suppose every statement inside the loop takes constant time.

What’s the worst case?
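A minimal Python sketch of linear search, instrumented to count comparisons, shows where the worst case comes from. The function name, the comparison counter, and the sample data are assumptions added for illustration.

```python
def linear_search(items, target):
    """Return (found, comparisons) for a scan of `items` from left to right."""
    comparisons = 0
    for value in items:
        comparisons += 1                # one constant-time comparison per element
        if value == target:
            return True, comparisons    # found it
    return False, comparisons


data = [7, 3, 9, 4]
print(linear_search(data, 7))   # (True, 1): best case, first element
print(linear_search(data, 5))   # (False, 4): worst case, every element checked
```

By the loop rule, at most $n$ iterations of a constant-time body give $O(n)$. The worst case is a value that is absent (or sits in the last position), which forces all $n$ comparisons.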


Matrix Multiplication

Multiplying two […] matrices (pseudo-code).

    MatMult[…]
    for […] from […] to […] do
        for […] from […] to […] do
            […]; – […]
    – multiply matrices
    for […] from […] to […] do
        for […] from […] to […] do

A More Difficult Example

    – Compute largest power of two less than […]
    – and its log
    […];
    […];
    while […] do
        […];
        […];
    return […];
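Both examples can be sketched in Python with step counters. These are not transcriptions of the slide pseudocode: `mat_mult` is the standard triple-loop algorithm for square matrices, and `largest_power_of_two_below` is one plausible reading of the comment "largest power of two less than" a given number. All names and counters are assumptions.

```python
def mat_mult(a, b):
    """Multiply two n x n matrices given as lists of lists.

    Returns the product and the number of scalar multiplications.
    """
    n = len(a)
    c = [[0] * n for _ in range(n)]
    mults = 0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
                mults += 1
    return c, mults                 # mults == n**3


def largest_power_of_two_below(n):
    """Return (p, log2(p), iterations) for the largest power of two p < n, n >= 2."""
    p, log_p, iterations = 1, 0, 0
    while 2 * p < n:                # p doubles each time, so about log2(n) rounds
        p *= 2
        log_p += 1
        iterations += 1
    return p, log_p, iterations


print(mat_mult([[1, 2], [3, 4]], [[5, 6], [7, 8]]))   # ([[19, 22], [43, 50]], 8)
print(largest_power_of_two_below(100))                # (64, 6, 6)
```

The first function is three nested loops of $n$ iterations each, so the multiplication law gives $O(n^3)$. The second is harder because the iteration count is not written in the loop header: the variable doubles on every pass, so the loop runs about $\log_2 n$ times, giving $O(\log n)$.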


An Insertion Sort

    isort(array […])
        for […] from […] to […] do
            for […] from […] to […] do
                if […] then
                    insert([…], […], […]);
                    break;

    insert(array […], […], […])
        […];
        […];
        […];

What is the running time?
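One possible Python rendering follows the shape of the pseudocode: an outer loop over elements, an inner scan for the insertion point, then `insert` and `break`. The loop bounds, the comparison `a[i] < a[j]`, and the body of `insert` are assumptions, since the source does not show them. The comparison counter is added for the analysis.

```python
def isort(a):
    """Sort list `a` in place; return the number of comparisons made."""
    comparisons = 0
    for i in range(1, len(a)):
        for j in range(i):              # scan the sorted prefix a[0..i-1]
            comparisons += 1
            if a[i] < a[j]:
                insert(a, i, j)         # move a[i] to position j
                break
    return comparisons


def insert(a, i, j):
    """Move a[i] to index j (j <= i), shifting a[j..i-1] one place right."""
    value = a[i]
    a[j + 1:i + 1] = a[j:i]             # slice shift: O(i - j) element moves
    a[j] = value


data = [5, 2, 9, 1]
print(isort(data), data)                # 4 [1, 2, 5, 9]
print(isort(list(range(10))))           # 45 comparisons on sorted input
```

In this version each outer iteration costs $O(n)$, either in comparisons (when the element is already in place) or in shifting (when it moves to the front), so $n$ iterations give $O(n^2)$ overall. An already sorted input of length $n$ takes $n(n-1)/2$ comparisons.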


