
Big-O analysis

CS 420, Fall 14, McConnell

A numerical tug of war

The worst-case number of elementary steps of an algorithm is a precise function of its parameters.
For instance, sorting n numbers with a particular sorting program has a worst-case number of
elementary steps for each value of n, so it is a legitimate function.
It would be hard to figure out what this exact function is, as that would depend on the compiler
and the computer you are running it on. The problem is even more difficult when we are talking
about an algorithm in abstract terms without yet giving a program that implements it or specifying
the computer.
Big-O analysis, also known as asymptotic analysis, allows us to get into the ballpark. It
is an analysis that groups large sets of functions into ballparks, and when two functions aren't
in the same ballpark, it tells which one is decisively faster growing. The ballparks are known as
equivalence classes.
It's hard to figure out the exact number of steps that an algorithm takes in the worst case.
If we try, we get an ugly polynomial, such as the one on page 27, and even then, the constants
in the expression reflect engineering considerations that haven't been settled until we write out
a full program and specify the compiler and the machine. It's often easy to figure out what
equivalence class it falls into, however. When the running times of two algorithms for a problem
are in different equivalence classes, it gives us a decisive reason for claiming that one algorithm is
superior, independently of implementation details, at least for large instances of the problem.
To compare two functions f (n) and g(n), we will look at what happens to f (n)/g(n) as n goes
to infinity. We can think of f (n) as trying to pull the ratio to infinity and g(n) as trying to pull it
to zero.
We'll say that functions f(n) and g(n) are in the same equivalence class if, as n gets large,
the ratio f(n)/g(n) settles down to a region of the y axis that lies between two positive constants,
c1 ≤ c2. In class, I used the notation f(n) = g(n) to express this.
Example: f(n) = n^2, g(n) = 3n^2 - 7n + 5. The ratio f(n)/g(n) gets closer and closer to
1/3 as n gets large. For n = 100, f(n) = 10,000 and g(n) = 30,000 - 700 + 5 = 29,305. The ratio is
about 0.34124. From then on out, it never gets any farther from 1/3. So we can pick, say, c1 = 0.1
and c2 = 5, and claim that for n > 100, f(n)/g(n) is always between c1 and c2. Therefore,
f(n) = g(n). They are in the same equivalence class.
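This tug of war is easy to watch numerically. Here is a small sketch (the helper names are mine, not from the notes) that computes the ratio for growing n:

```python
# Watch f(n)/g(n) settle between two positive constants c1 and c2.
def f(n):
    return n * n

def g(n):
    return 3 * n * n - 7 * n + 5

ratios = [f(n) / g(n) for n in (100, 1000, 10000, 100000)]

# Every ratio lies between c1 = 0.1 and c2 = 5 and creeps toward 1/3.
in_ballpark = all(0.1 < r < 5 for r in ratios)
```

Printing `ratios` shows the values settling toward 1/3, which is exactly the "region between two positive constants" in the definition.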
We could say that f (n) < g(n) if the limit of f (n)/g(n) as n goes to infinity is 0. That means
that no matter how small a positive constant we choose for c1 , the ratio eventually drops under c1
and stays under it, defeating any attempt to show that they are in the same equivalence class.

Example: f(n) = 100n, g(n) = n^3. Then the limit of f(n)/g(n) = 100/n^2 is 0. If we pick c1
to be 1/10,000, then for n > 1000, the ratio will lie below c1. No matter how small a positive
constant we pick for c1, the ratio will eventually go below it and stay below it.
We will say that f(n) > g(n) if the limit of f(n)/g(n) as n goes to infinity is infinity.
That means that no matter how large a c2 we pick, the ratio will eventually go above c2 and stay
there, defeating any effort to show they are in the same equivalence class.
Example: f(n) = n^3, g(n) = 100n. Notice that we have swapped the roles of f(n) and g(n)
from the previous example, so, since f(n) was less than g(n) in the previous example, it
should now be greater than g(n). The ratio is n^2/100. If we pick c2 to be something large,
such as 10,000, then for n > 1000, the ratio is greater than 10,000. No matter how large a
constant we pick for c2, it will suffer the same fate, defeating any attempt to show that they
are in the same equivalence class.
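The two one-sided examples can be checked the same way. This sketch (the cutoffs 1/10,000 and 10,000 are the illustrative choices from the text) confirms that the ratio escapes any fixed c1 or c2:

```python
# f(n) = 100n vs. g(n) = n^3: the ratio 100/n^2 sinks below any c1,
# and the swapped ratio n^2/100 climbs above any c2.
def ratio_small(n):
    return 100 * n / n**3      # = 100 / n^2

def ratio_big(n):
    return n**3 / (100 * n)    # = n^2 / 100

stays_below = all(ratio_small(n) < 1 / 10000 for n in (2000, 10**4, 10**6))
stays_above = all(ratio_big(n) > 10000 for n in (2000, 10**4, 10**6))
```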
There is a fourth case, which is none of the above three, illustrated by the exotic example of
f(n) = n and g(n) = n^(1+sin n). We could call this a forfeit. It's caused by one of the players in
the contest behaving in an erratic way that we won't ever see in the worst-case performance of an
algorithm. That is the last time in the course that we'll see the fourth case.

Notation

For historical reasons, we will use the following notation, which we have no choice but to
memorize.
f(n) < g(n) is denoted f(n) = o(g(n)).
f(n) = g(n) is denoted f(n) = Θ(g(n)).
f(n) > g(n) is denoted f(n) = ω(g(n)).
f(n) ≤ g(n) is denoted f(n) = O(g(n)). This says, f(n) is not faster growing than g(n).
It might be in the same equivalence class or it might be slower growing; we aren't taking a
stand.
f(n) ≥ g(n) is denoted f(n) = Ω(g(n)). It might be in the same equivalence class or it
might be faster growing.
Notice that f(n) = o(g(n)) if and only if g(n) = ω(f(n)), f(n) = Θ(g(n)) if and only if
g(n) = Θ(f(n)), and f(n) = O(g(n)) if and only if g(n) = Ω(f(n)).
Examples:
3n^2 - 5n + 24 = O(n^3);
3n^2 - 5n + 24 = Θ(n^2);
3n^2 - 5n + 24 = Ω(n^2);
3n^2 - 5n + 24 = o(n^3);
3n^2 - 5n + 24 ≠ ω(n^2);
3n^2 - 5n + 24 ≠ o(n^2).
The usual convention is to put a complicated function, which is often the worst-case running
time of an algorithm, on the left, and a simple expression inside the O(), Θ(), Ω(), o(), or ω()
expression.
Example: The worst-case running time of Mergesort, when run on a list of size n, is Θ(n log_2 n).
The expression f(n) = Θ(g(n)) is the most specific of these relationships. Why not always use it?
Sometimes, we can prove that f(n) = O(g(n)), but we don't know how to prove f(n) = Ω(g(n)),
hence we can't claim f(n) = Θ(g(n)). Sometimes, we are only interested in showing upper (big-O)
bounds on the running time of an algorithm, and we don't want to bother with showing lower (Ω)
bounds.

Stunts we can get away with if we are doing asymptotic analysis

When comparing f(n) and g(n) in a class in algebra, you can change the functions in any way that
doesn't change their value anywhere. That's because standard algebra is concerned with the exact
values of things. There is no notion of ballparks. The algebraic identities tell you what you can
get away with.
When comparing f(n) and g(n) in our big-Θ analysis, it is legitimate to transform them in
any way that can't affect what equivalence class they fall into. This allows us a lot of tricks for
simplifying them that wouldn't be allowed if we were doing an exact analysis. They would get you
in trouble with your teacher in middle school, who would think you are lazy and sloppy. They
make life really easy, yet they don't allow people to cheat when deriving a time bound.
Here are some examples. Convince yourself that they can't affect the outcome of the tug of war
if we only apply them a constant number of times.
Increase f(n) by an additive constant. Same with g(n).
Multiply f(n) or g(n) by a constant that does not depend on n.
When f(n) is the sum of two terms f′(n) and f′′(n), and f′′(n) = O(f′(n)), we can throw out the f′′(n)
term. Same with g(n).
Example: f(n) = 2n^3 + 3n^3 + 4n^2 + n + 5; we can throw out all but the first term.
Other key facts:
log_b n = Θ(log_c n) for all b, c > 1. This follows from the fourth identity of (3.15) on page 56:
log_b n = log_c n / log_c b, and log_c b is a constant factor. This means that we don't need to
specify the base of a log when we put it inside a big-O expression, big-Θ expression, little-o
expression, etc.
n^k = o(c^n) for all k ≥ 0, c > 1. I will omit the proof.

log^k n = o(n^ε) for all k, ε > 0. It doesn't matter how large k is or how small ε is; the latter
will eventually overtake the former and leave it in the dust.
Here is a proof: let x = log_2 n. Then log^k n = x^k, and n^ε = (2^(log_2 n))^ε = 2^(ε log_2 n) = 2^(εx). (See the
identities on pages 55-56.) Since x goes to infinity as n does, the outcome of the comparison
between log^k n and n^ε will be the same as the outcome of the comparison between x^k and
2^(εx), and x^k = o(2^(εx)) by the previous fact.
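To get a feel for how slowly the log loses, here is a quick numeric sketch with the illustrative choices k = 3 and ε = 1/2 (these particular values are mine, not from the notes):

```python
import math

def log_power(n, k=3):
    return math.log2(n) ** k   # log^k n

def poly(n, eps=0.5):
    return n ** eps            # n^eps

# The log term is still ahead at n = 2^10, but has lost by n = 2^40.
log_ahead_early = log_power(2**10) > poly(2**10)
poly_ahead_late = log_power(2**40) < poly(2**40)
```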

Summations

A function is expressed in closed form if it is not expressed recursively or as a summation. If we
want to compare one running time to others, it is convenient to get it into closed form.

4.1 Geometric series

When f(n) = ∑_{i=0}^{n} r^i, where 0 < r and r ≠ 1, the summation is called a geometric series. The
ratio of each pair of consecutive terms is a constant, r.
Here's a convenient trick for bounding a geometric series: we can throw out all but the largest
term of the summation and put a big-Θ around it.
Example: 1 + 2 + 4 + 8 + 16 + 32 + 64 = 2·64 - 1. 1 + 2 + 4 + 8 + 16 + 32 + 64 + 128 = 2·128 - 1.
The summation is always just shy of two times the largest term. Throwing out all but the largest
term, 2^n, of the summation ∑_{i=1}^{n} 2^i diminishes it by less than a factor of two, so it doesn't change
its equivalence class. The summation is Θ(2^n).
Example: 1 + 1/2 + 1/4 + 1/8 + ... + 1/2^n < 2, so throwing out all but the largest term, 1,
diminishes the summation by less than a factor of two. The summation is Θ(1).
Proof: We will first show using mathematical induction that f(n) = ∑_{i=0}^{n} r^i = (r^(n+1) - 1)/(r - 1).
If n = 0, the sum is 1 and (r^(0+1) - 1)/(r - 1) = 1. Therefore, if the equality fails for some
n, it must fail at a smallest n. Because n ≠ 0, n > 0. Now ∑_{i=0}^{n} r^i = ∑_{i=0}^{n-1} r^i + r^n. Since n is the
smallest value where the identity fails, ∑_{i=0}^{n-1} r^i = (r^(n-1+1) - 1)/(r - 1) = (r^n - 1)/(r - 1). Therefore,
f(n) = (r^n - 1)/(r - 1) + r^n = (r^n - 1 + r^(n+1) - r^n)/(r - 1) = (r^(n+1) - 1)/(r - 1), contradicting our
definition of n as a place where the identity fails. It must be that there is no lowest value of n ≥ 0
where it fails, so it must be true for all n ≥ 0.
Now that we have our summation in closed form, let's simplify. Suppose r > 1. Let's mess
with (r^(n+1) - 1)/(r - 1) in ways that are allowed in asymptotic analysis. We can multiply it by
the constant (r - 1) to change it to r^(n+1) - 1. We can add the constant 1 to change it to r^(n+1).
We can divide it by the constant r to change it to r^n, which is equal to the biggest term. If it
ties with the biggest term after these changes, it must be that it tied initially with it. Therefore,
∑_{i=0}^{n} r^i = (r^(n+1) - 1)/(r - 1) = Θ(r^n), and the claim holds.
Suppose 0 < r < 1. Let's get both the numerator and denominator positive, and rewrite it as
(1 - r^(n+1))/(1 - r). This gets nearer and nearer to 1/(1 - r) as n gets large, since r^(n+1) gets smaller
and smaller. We can now multiply this by the constant (1 - r), and it is Θ(1). Since it is Θ(1) after
these steps, it must be that ∑_{i=0}^{n} r^i = (r^(n+1) - 1)/(r - 1) = Θ(1), and 1 is the largest term of the
summation.
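Both claims, the closed form and the throw-out-all-but-the-biggest-term trick, are easy to sanity-check in Python. This is a sketch of my own, not part of the notes:

```python
# Verify sum_{i=0}^{n} r^i = (r^(n+1) - 1)/(r - 1), and the rule of
# thumb that for r = 2 the sum is within a factor of two of 2^n.
def geometric_sum(r, n):
    return sum(r**i for i in range(n + 1))

closed_form_ok = all(
    abs(geometric_sum(r, n) - (r**(n + 1) - 1) / (r - 1)) < 1e-9
    for r in (0.5, 2.0, 3.0)
    for n in range(1, 15)
)

# The largest term 2^n accounts for more than half of the whole sum.
largest_term_ok = all(
    2**n <= geometric_sum(2, n) < 2 * 2**n for n in range(1, 15)
)
```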

4.2 Hurting your case

The summation f(n) = ∑_{i=1}^{n} i is not a geometric series, since the ratio of consecutive terms is not
a constant r. The difference between consecutive terms is a constant. This is called an arithmetic
series. We can't throw away all but the biggest term without changing its equivalence class.
Here's a proof that it's Θ(n^2). Our strategy is to show first that it's O(n^2) and then show that
it's Ω(n^2). These two facts together show that it's Θ(n^2), just as i ≤ 5 and i ≥ 5 imply that i = 5.
To show it's O(n^2), we can transform it into a larger function, since this can only hurt our case.
A skeptic won't object. If we can show that it is still O(n^2) after the transformation, then we've
proved that the original is O(n^2). The trick is not to hurt our case so much that we change its
equivalence class. Let's boost up all the terms of ∑_{i=1}^{n} i to n. We now have n terms, each of which
is n, so our new sum is O(n^2). We showed the bound in spite of hurting our case. It must be that
the original is O(n^2).
To show it's Ω(n^2), we can transform it into a smaller function, since this can only hurt our
case. Let's throw away the first n/2 terms. There are at least n/2 remaining terms, each of which
is at least n/2. Then let's cut each of the remaining terms down to n/2. No term has been increased,
so this can only hurt our case. What is left is at least (n/2)(n/2) = n^2/4 = Ω(n^2).
We have shown that ∑_{i=1}^{n} i = Θ(n^2).
Another example: f(n) = ∑_{i=1}^{n} log_2 i. The largest term is log_2 n. For an upper bound,
nobody will object if we change all the terms to log_2 n. That could only hurt our case when trying
to show an upper bound. It becomes ∑_{i=1}^{n} log_2 n = n log_2 n. Therefore, f(n) = O(n log n).
For a lower bound, suppose n is even; if it isn't, nobody would object if we throw out the last
term, since n - 1 = Θ(n). (We can get away with a lot of simplifying steps when we're
working toward a big-Θ bound.) Nobody will object if we throw out the terms that are smaller
than log_2(n/2). That could only hurt our efforts to still get our lower bound.
Our summation is now ∑_{i=n/2}^{n} log_2 i. The smallest of the terms in this expression is log_2(n/2) =
(log_2 n) - 1. Nobody will object if we change all of the terms to be equal to the smallest. This
gives ∑_{i=n/2}^{n} ((log_2 n) - 1) = (n/2 + 1)((log_2 n) - 1) = Ω(n log n).
We have shown that f(n) = O(n log n) and f(n) = Ω(n log n), so f(n) = Θ(n log n).
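Here is the same kind of sandwich for the log summation, checked numerically (a sketch; the value of n is arbitrary):

```python
import math

# n*log2(n) is an upper bound on sum_{i=1}^{n} log2(i), and
# (n/2)*((log2 n) - 1) is a lower bound, as in the argument above.
def log_sum(n):
    return sum(math.log2(i) for i in range(1, n + 1))

n = 4096
s = log_sum(n)
upper_ok = s <= n * math.log2(n)
lower_ok = s >= (n / 2) * (math.log2(n) - 1)
```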

4.3 Grouping terms into buckets

The summation ∑_{i=1}^{n} 1/i is the harmonic series. It turns out that this one is Θ(log n).
Here's a proof. Let's divide the terms into buckets, where the first term goes into the first
bucket, the next two terms go into the second bucket, the next four terms go into the third, the
next eight go into the fourth, and so on. That is, the first bucket is {1}, the second is {1/2, 1/3}, the third is
{1/4, 1/5, 1/6, 1/7}, etc. We'll end up with O(log n) buckets, since we can double at most log_2 n
times before reaching n.
To show that the sum is O(log n), let's observe that the largest element of the first bucket is
at most 1, the largest element of the next bucket is 1/2, the largest of the next is 1/4, etc. Let's
replace all the elements in each bucket with the biggest element in the bucket. This can only hurt
our case. Now the elements in the first bucket sum to 1·1 = 1, the elements in the second bucket
sum to 2·(1/2) = 1, the elements in the third sum to 4·(1/4) = 1, etc. There are O(log n) buckets,
each of which sums to at most 1, so our terms all sum to O(log n).
Let's now show that it is Ω(log n). If there is an incomplete bucket at the end, let's throw its
terms out, since this can only hurt our case. It is easy to see that the number of complete buckets
that remain is Ω(log n). If the last
element in a bucket is 1/i, let's replace all elements in the bucket with 1/(i + 1). This reduces the
size of all elements in the bucket, so it can only hurt our case.
The first bucket now has one term equal to 1/2, so it sums to 1/2. The second bucket has two
terms equal to 1/4, so it sums to 1/2. The third bucket has four terms equal to 1/8, so it sums to
1/2. The sum is (1/2)·Ω(log n) = Ω(log n).
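A numeric check of the bucket argument's conclusion. The specific constants below are consequences of the proof (each complete bucket contributes between 1/2 and 1), chosen by me:

```python
import math

# The harmonic sum H_n is sandwiched between (1/2) log2 n and
# log2 n + 1, so it is Theta(log n).
def harmonic(n):
    return sum(1 / i for i in range(1, n + 1))

n = 2**16
h = harmonic(n)
theta_log_ok = 0.5 * math.log2(n) <= h <= math.log2(n) + 1
```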

A peek at Python

Type the following into a file called mySort.py:


print("\n\nI'm starting.")

def InsertionSort(l):
    if len(l) == 1:
        return l
    else:
        l2 = InsertionSort(l[0:len(l)-1])
        t = l[len(l)-1]
        return [i for i in l2 if i <= t] + [t] + [i for i in l2 if i > t]

print("\n\nI'm exiting.")
Then, from the Linux command line, type ipython -i mySort.py. This will run the program, and the -i
option tells Python to give you the command line after it runs. The only thing
the program does is to print two output statements and define a method. It doesn't actually run
the method; methods are defined with a command, def.
Type this at the Python
command line: InsertionSort([5,3,6,4,2]). You will see that it returns a list that has the same
elements, in sorted order.
Study the Python tutorial to find out about the non-Java-like statements you see. Find out
how lists can be represented with the square-bracket syntax. Find out about the len operator.
Find out what it means when l is a list and you write l[3:6]. Experiment by getting the Python
command line and doing the following:
In [2]: l = [6,9,2,1,3,4,7,8]
In [3]: l[3:6]
Out[3]: [1, 3, 4]
In [4]: l[0:2]
Out[4]: [6, 9]
In [5]: l2 = [100,150]
In [6]: l + l2
Out[6]: [6, 9, 2, 1, 3, 4, 7, 8, 100, 150]
In [7]: l
Out[7]: [6, 9, 2, 1, 3, 4, 7, 8]
In [8]: l2
Out[8]: [100, 150]
In [9]: [i for i in l if i >= 4]
Out[9]: [6, 9, 4, 7, 8]
In [10]: l
Out[10]: [6, 9, 2, 1, 3, 4, 7, 8]
Before long, you should study the entire tutorial, trying out all of the examples at the Python
prompt as you go through it.

Getting a recurrence for the running time of a recursive algorithm

Here are some facts about Python operations. Producing the concatenation of two lists takes time
proportional to the sum of the sizes of the lists. Creating a sublist of the elements of a list that meet
a condition takes time proportional to the length of the list, provided the condition takes O(1)
time to evaluate. Creating a list of the elements between two indices takes time proportional to the
number of elements in this sublist.
Let's think of a name for the worst-case running time of our Python method on a list of size n:
we'll call it T(n). We would like to get a big-Θ bound for T(n).
If n > 1, it makes a recursive call on n - 1 elements. We don't know how long this takes, but
we have a name for it: T(n - 1). It also does a fixed number k of operations that each take Θ(n)
time. The total for these is kΘ(n) = Θ(n) since k is fixed. In the friendly world of big-O, we didn't
even need to bother counting what k is!
If n == 1, it takes Θ(1) time.
We get the following recursive expression for the running time:
T(1) = Θ(1); T(n) = T(n - 1) + Θ(n).
Since we're doing big-Θ analysis, there is no harm in fudging this and just writing the following,
since it can only affect T() by a constant factor:
T(1) = 1; T(n) = T(n - 1) + n.
If n - 1 > 1, we can substitute T(n - 1) = T(n - 2) + n - 1 into this expression, and get
T(n) = T(n - 2) + (n - 1) + n.
If n - 2 > 1, we can substitute T(n - 2) = T(n - 3) + n - 2 into this expression, and get
T(n) = T(n - 3) + (n - 2) + (n - 1) + n.
Eventually, we will get down to T(1), and we can substitute in T(1) = 1. We get T(n) =
1 + 2 + 3 + ... + n. We've turned our recursive expression for the running time into a summation.
It's still not in closed form, but it's progress.
We analyzed this summation above, and found out it was Θ(n^2). The running time of our
Python program must be Θ(n^2).
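The iterative substitution can be mimicked in code. This sketch unrolls T(1) = 1, T(n) = T(n - 1) + n and confirms the Θ(n^2) ballpark:

```python
# Unroll the recurrence by summation instead of deep recursion.
def T(n):
    total = 1                    # T(1) = 1
    for m in range(2, n + 1):    # each substitution adds its own m
        total += m
    return total

n = 500
# T(n) = 1 + 2 + ... + n = n(n+1)/2, squarely in the Theta(n^2) class.
quadratic_ok = n * n / 4 <= T(n) <= n * n
```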

Using the master theorem

Study how the book derives a recurrence for mergesort. It's similar to what we have just done
above.
The master theorem is a recipe that often works for getting a big-Θ bound on a recurrence of
the form T(n) = aT(n/b) + f(n). The recurrence for mergesort is of this form.
The master theorem says that if the ratio n^(log_b a)/f(n) is Ω(n^ε) for some ε > 0, then T(n) =
Θ(n^(log_b a)). If the inverse of this ratio is Ω(n^ε), then T(n) = Θ(f(n)). If f(n) = Θ(n^(log_b a)), then
T(n) = Θ(f(n) log n) = Θ(n^(log_b a) log n).
The second condition on case 3 of the master theorem applies to all recurrences we will see
this semester. It is there as a disclaimer against skeptics who could otherwise show it is false by
tripping it up with an exotic f(n). I will explain what this is about in class.
Examples:
T(n) = 2T(n/2) + n. We're comparing n^(log_2 2) = n^1 vs. f(n) = n. They're big-Θ of each
other. T(n) = Θ(n log n).
T(n) = 4T(n/2) + n. We're comparing n^(log_2 4) = n^2 vs. f(n) = n. The ratio is n^ε for ε = 1.
T(n) = Θ(n^2).
T(n) = 2T(n/2) + n^(1/2). We're comparing n^1 vs. n^(1/2). The ratio is n^ε for ε = 1/2. T(n) = Θ(n).
T(n) = 2T(n/2) + n^2. We're comparing n^1 vs. n^2. The ratio is 1/n^1, so its inverse is n^ε for ε = 1. T(n) = Θ(n^2).
T(n) = 2T(n/2) + n log n. We're comparing n^1 vs. n log n. The ratio is 1/log n. However, since log n = o(n^ε) for
all ε > 0, neither the ratio nor its inverse is Ω(n^ε), so the master theorem says you have to find another method. It punts. In this case,
we have to expand the recurrence by iterative substitution, the way we did for the recurrence
for insertion sort above.
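The recipe can be wrapped in a small helper. This is a sketch of my own that only handles f(n) of the form n^k, or n^k log n, which covers the examples above; the function name and interface are illustrative, not standard:

```python
import math

def master(a, b, k, log_factor=False):
    """Classify T(n) = a*T(n/b) + f(n), where f(n) = n^k, or
    n^k * log n when log_factor is True.  Returns a string describing
    the Theta class, or None when the master theorem punts."""
    crit = math.log(a, b)                  # the critical exponent log_b a
    if log_factor:
        if abs(k - crit) < 1e-9:
            return None                    # ratio is 1/log n: punt
        if k < crit:
            return "Theta(n^%g)" % crit
        return "Theta(n^%g log n)" % k
    if abs(k - crit) < 1e-9:
        return "Theta(n^%g log n)" % crit
    return "Theta(n^%g)" % max(k, crit)
```

For instance, master(2, 2, 1) reproduces the mergesort bound, and master(2, 2, 1, log_factor=True) returns None, matching the last example above.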