
Module 1 - 1

Complexity and Order Notations and Binary Search
Instructor – Gajendra Shrimal

1
Algorithm
• An algorithm is a set of instructions to be followed to
solve a problem.
– There can be more than one solution (more than one
algorithm) to solve a given problem.
– An algorithm can be implemented using different
programming languages on different platforms.
• An algorithm must be correct. It should correctly solve
the problem.
– e.g., a sorting algorithm must work correctly even if (1) the input is
already sorted, or (2) it contains repeated elements.
• Once we have a correct algorithm for a problem, we
have to determine the efficiency of that algorithm.
2
Algorithmic Performance
There are two aspects of algorithmic performance:
• Time
• Instructions take time.
• How fast does the algorithm perform?
• What affects its runtime?
• Space
• Data structures take space.
• What kind of data structures can be used?
• How does choice of data structure affect the runtime?
 We will focus on time:
– How to estimate the time required for an algorithm
– How to reduce the time required

3
Analysis of Algorithms
• Analysis of Algorithms is the area of computer science that
provides tools to analyze the efficiency of different methods
of solutions.
• How do we compare the time efficiency of two algorithms that
solve the same problem?
Naïve Approach: implement these algorithms in a programming
language (C++), and run them to compare their time
requirements. Comparing the programs (instead of algorithms)
has difficulties.
– How are the algorithms coded?
• Comparing running times means comparing the implementations.
• We should not compare implementations, because they are sensitive to programming
style that may cloud the issue of which algorithm is inherently more efficient.
– What computer should we use?
• We should compare the efficiency of the algorithms independently of a particular
computer.
– What data should the program use?
• Any analysis must be independent of specific data.
4
Analysis of Algorithms
• When we analyze algorithms, we should employ
mathematical techniques that analyze algorithms
independently of specific
implementations, computers, or data.

• To analyze algorithms:
– First, we count the number of significant
operations in a particular solution to assess its
efficiency.
– Then, we will express the efficiency of algorithms
using growth functions.

5
The Execution Time of Algorithms
• Each operation in an algorithm (or a program) has a
cost.
 Each operation takes a certain amount of time.

count = count + 1;  takes a certain amount of time, but that time is constant.

A sequence of operations:

count = count + 1; Cost: c1


sum = sum + count; Cost: c2

 Total Cost = c1 + c2

6
The Execution Time of Algorithms (cont.)
Example: Simple If-Statement
                        Cost   Times
if (n < 0)              c1     1
    absval = -n;        c2     1
else
    absval = n;         c3     1

Total Cost <= c1 + max(c2, c3)

7
The Execution Time of Algorithms (cont.)
Example: Simple Loop
Cost Times
i = 1; c1 1
sum = 0; c2 1
while (i <= n) { c3 n+1
i = i + 1; c4 n
sum = sum + i; c5 n
}

Total Cost = c1 + c2 + (n+1)*c3 + n*c4 + n*c5


 The time required for this algorithm is proportional to n
8
The Execution Time of Algorithms (cont.)
Example: Nested Loop
                            Cost   Times
i = 1;                      c1     1
sum = 0;                    c2     1
while (i <= n) {            c3     n+1
    j = 1;                  c4     n
    while (j <= n) {        c5     n*(n+1)
        sum = sum + i;      c6     n*n
        j = j + 1;          c7     n*n
    }
    i = i + 1;              c8     n
}

Total Cost = c1 + c2 + (n+1)*c3 + n*c4 + n*(n+1)*c5 + n*n*c6 + n*n*c7 + n*c8
 The time required for this algorithm is proportional to n^2

9
General Rules for Estimation
• Loops: The running time of a loop is at most the running
time of the statements inside that loop times the number of
iterations.
• Nested Loops: The running time of a statement in the
innermost loop is its running time multiplied by the product
of the sizes of all the loops.
• Consecutive Statements: Just add the running times of those
consecutive statements.
• If/Else: Never more than the running time of the test plus
the larger of the running times of S1 and S2.
(These rules are applied in the sketch below.)
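
A minimal C++ sketch (added for illustration; not part of the original
slides) that applies the four rules above, using a hand-rolled operation
counter; all names here are hypothetical:

#include <iostream>

int main() {
    int n = 100;
    long long ops = 0;                 // hand-rolled operation counter

    // Consecutive statements: just add their costs.
    int sum = 0;      ops++;
    int product = 1;  ops++;

    // Loop: body cost times the number of iterations -> proportional to n.
    for (int i = 1; i <= n; i++) { sum += i; ops++; }

    // Nested loops: body cost times the product of the loop sizes -> n*n.
    for (int i = 1; i <= n; i++)
        for (int j = 1; j <= n; j++) { product += j; ops++; }

    // If/else: at most the test plus the larger branch.
    if (sum > 0) ops++; else ops += 2;

    std::cout << "counted operations: " << ops << "\n";  // dominated by the n*n term
    return 0;
}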

10
Algorithm Growth Rates
• We measure an algorithm’s time requirement as a function of the
problem size.
– Problem size depends on the application: e.g., the number of elements in a list for a
sorting algorithm, or the number of disks for the Towers of Hanoi.
• So, for instance, we say that (if the problem size is n)
– Algorithm A requires 5*n^2 time units to solve a problem of size n.
– Algorithm B requires 7*n time units to solve a problem of size n.
• The most important thing to learn is how quickly the algorithm’s
time requirement grows as a function of the problem size.
– Algorithm A requires time proportional to n^2.
– Algorithm B requires time proportional to n.
• An algorithm’s proportional time requirement is known as
growth rate.
• We can compare the efficiency of two algorithms by comparing
their growth rates.
11
Algorithm Growth Rates (cont.)

[Figure: time requirements as a function of the problem size n]
12
Common Growth Rates
Function    Growth Rate Name
c           Constant
log N       Logarithmic
log^2 N     Log-squared
N           Linear
N log N     N log N
N^2         Quadratic
N^3         Cubic
2^N         Exponential
13
[Figure 6.1: Running times for small inputs]

14
[Figure 6.2: Running times for moderate inputs]

15
Input Size

● Input size (number of elements in the input)

● size of an array
● # of elements in a matrix
● # of bits in the binary representation of the input
● vertices and edges in a graph

16
Types of Analysis
● Worst case
● Provides an upper bound on running time
● An absolute guarantee that the algorithm would not run longer, no
matter what the inputs are
● Best case
● Provides a lower bound on running time
● Input is the one for which the algorithm runs the fastest

Lower Bound ≤ Running Time ≤ Upper Bound


● Average case
● Provides a prediction about the running time
● Assumes that the input is random

17
Types of Analysis: Example
● Example: Linear Search Complexity

● Best Case : Item found at the beginning: One comparison


● Worst Case : Item found at the end: n comparisons
● Average Case :Item may be found at index 0, or 1, or 2, . . . or n - 1
● Average number of comparisons is: (1 + 2 + . . . + n) / n =
(n+1) / 2

● Worst and Average complexities of common sorting algorithms


Method           Worst Case   Average Case   Best Case
Selection Sort   n^2          n^2            n^2
Insertion Sort   n^2          n^2            n
Merge Sort       n log n      n log n        n log n
Quick Sort       n^2          n log n        n log n

18
How do we compare algorithms?
● We need to define a number of
objective measures.

(1) Compare execution times?


Not good: times are specific to a particular
computer!

(2) Count the number of statements executed?


Not good: the number of statements varies with the
programming language as well as the style of the
individual programmer.

19
Ideal Solution

● Express running time as a function of
the input size n (i.e., f(n)).
● Compare different functions corresponding
to running times.
● Such an analysis is independent
of machine time, programming style, etc.

20
Example
● Associate a "cost" with each statement.
● Find the "total cost” by finding the total number of
times each statement is executed.
Algorithm 1                      Cost
arr[0] = 0;                      c1
arr[1] = 0;                      c1
arr[2] = 0;                      c1
...
arr[N-1] = 0;                    c1
Total: c1 + c1 + ... + c1 = c1 x N

Algorithm 2                      Cost
for (i = 0; i < N; i++)          c2
    arr[i] = 0;                  c1
Total: (N+1) x c2 + N x c1 = (c2 + c1) x N + c2

21
Another Example
● Algorithm 3                    Cost
sum = 0;                         c1
for (i = 0; i < N; i++)          c2
    for (j = 0; j < N; j++)      c2
        sum += arr[i][j];        c3

Total: c1 + c2 x (N+1) + c2 x N x (N+1) + c3 x N^2

22
Asymptotic Analysis
● To compare two algorithms with running
times f(n) and g(n), we need a rough
measure that characterizes how fast each
function grows.
● Hint: use rate of growth
● Compare functions in the limit, that is,
asymptotically!
(i.e., for large values of n)

23
Rate of Growth
● Consider the example of buying elephants and goldfish:
Cost: cost_of_elephants + cost_of_goldfish
Cost ~ cost_of_elephants (approximation)
● The low-order terms in a function are relatively insignificant for large n:
n^4 + 100n^2 + 10n + 50 ~ n^4
i.e., we say that n^4 + 100n^2 + 10n + 50 and n^4 have the
same rate of growth
24
Rate of Growth (cont.)

[Figures: growth-rate comparison plots omitted]

25
26
27
Common orders of magnitude

[Figure: table of common orders of magnitude omitted]
28
Asymptotic Notation

● O notation: asymptotic “less than”:
  ● f(n) = O(g(n)) implies: f(n) “≤” g(n)
● Ω notation: asymptotic “greater than”:
  ● f(n) = Ω(g(n)) implies: f(n) “≥” g(n)
● Θ notation: asymptotic “equality”:
  ● f(n) = Θ(g(n)) implies: f(n) “=” g(n)

29
Big-O Notation

● We say fA(n) = 30n + 8 is order n, or O(n). It is,
at most, roughly proportional to n.
● fB(n) = n^2 + 1 is order n^2, or O(n^2). It is,
at most, roughly proportional to n^2.
● In general, any O(n^2) function is faster-
growing than any O(n) function.

30
Visualizing Orders of Growth

● On a graph, as you go to the right, a faster-growing
function eventually becomes larger...

[Figure: fA(n) = 30n+8 and fB(n) = n^2+1; y-axis: value of function,
x-axis: increasing n]

31
More Examples …

● n^4 + 100n^2 + 10n + 50 is O(n^4)
● 10n^3 + 2n^2 is O(n^3)
● n^3 - n^2 is O(n^3)
● constants
  ● 10 is O(1)
  ● 1273 is O(1)

32
Back to Our Example
Algorithm 1                      Cost
arr[0] = 0;                      c1
arr[1] = 0;                      c1
arr[2] = 0;                      c1
...
arr[N-1] = 0;                    c1
Total: c1 + c1 + ... + c1 = c1 x N

Algorithm 2                      Cost
for (i = 0; i < N; i++)          c2
    arr[i] = 0;                  c1
Total: (N+1) x c2 + N x c1 = (c2 + c1) x N + c2

● Both algorithms are of the same order: O(N)
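
As a quick empirical check (an addition, not from the slides), one can
count the assignments each version performs; both counts grow linearly in
N, differing only in constant-factor overhead:

#include <iostream>
#include <vector>

int main() {
    for (int N : {10, 100, 1000}) {
        std::vector<int> arr(N);
        long long assigns = 0;

        // Algorithm 2 (the loop version); Algorithm 1 performs the same
        // N assignments, just written out explicitly.
        for (int i = 0; i < N; i++) { arr[i] = 0; assigns++; }

        std::cout << "N=" << N << "  assignments=" << assigns << "\n"; // = N
    }
    return 0;
}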

33
Example (cont’d)

Algorithm 3                      Cost
sum = 0;                         c1
for (i = 0; i < N; i++)          c2
    for (j = 0; j < N; j++)      c2
        sum += arr[i][j];        c3

c1 + c2 x (N+1) + c2 x N x (N+1) + c3 x N^2 = O(N^2)

34
Asymptotic notations
● O-notation

O(g(n)) = { f(n) : there exist positive constants c and n0
            such that 0 ≤ f(n) ≤ c*g(n) for all n ≥ n0 }
35
Big-O Visualization
O(g(n)) is the set of
functions with smaller
or same order of
growth as g(n)

36
Examples
● 2n^2 = O(n^3): 2n^2 ≤ cn^3 ⇒ 2 ≤ cn ⇒ c = 1 and n0 = 2

● n^2 = O(n^2): n^2 ≤ cn^2 ⇒ c ≥ 1 ⇒ c = 1 and n0 = 1

● 1000n^2 + 1000n = O(n^2):
  1000n^2 + 1000n ≤ 1000n^2 + n^2 = 1001n^2 (for n ≥ 1000) ⇒ c = 1001 and n0 = 1000

● n = O(n^2): n ≤ cn^2 ⇒ cn ≥ 1 ⇒ c = 1 and n0 = 1

37
More Examples

● Show that 30n + 8 is O(n).
  ● Show ∃ c, n0: 30n + 8 ≤ cn, ∀ n > n0.
  ● Let c = 31, n0 = 8. Assume n > n0 = 8. Then
    cn = 31n = 30n + n > 30n + 8, so 30n + 8 < cn.

38
Big-O example, graphically

● Note 30n+8 isn’t less than n anywhere (n > 0).
● It isn’t even less than 31n everywhere.
● But it is less than 31n everywhere to the right of n = 8.

[Figure: cn = 31n, 30n+8, and n plotted against increasing n;
30n+8 ∈ O(n) for n > n0 = 8]

39
No Uniqueness
● There is no unique set of values for n0 and c in proving the
asymptotic bounds
● Prove that 100n + 5 = O(n^2)
  ● 100n + 5 ≤ 100n + n = 101n ≤ 101n^2 for all n ≥ 5
    n0 = 5 and c = 101 is a solution
  ● 100n + 5 ≤ 100n + 5n = 105n ≤ 105n^2 for all n ≥ 1
    n0 = 1 and c = 105 is also a solution
● Must find SOME constants c and n0 that satisfy the asymptotic notation relation
40
Asymptotic notations (cont.)
● -
notation

(g(n)) is the set of functions


with larger or same order of
growth as g(n)

41
Examples
● 5n2 = (n)

 c, n0 such that: 0  cn  5n2  cn  5n2  c = 1 and n0 = 1

● 100n + 5 ≠ (n2)
 c, n0 such that: 0  cn2  100n + 5
100n + 5  100n + 5n ( n  1) = 105n
cn2  105n  n(cn – 105)  0

Since n is positive  cn – 105  0 


n  105/c
 contradiction: n cannot be smaller
than a constant 42
Asymptotic notations (cont.)

● -
notation (g(n)) is the set of functions
with the same order of growth
as g(n)

43
Examples
● n2/2 –n/2 = (n2)

● ½ n2 - ½ n ≤ ½ n2 n ≥ 0  c2= ½

● ½ n2 - ½ n ≥ ½ n2 - ½ n * ½ n ( n ≥ 2 ) = ¼ 
n2 c1=
¼

● n ≠ (n2): c1 n2 ≤ n ≤ c2
n2

 only holds for: n ≤ 1/c1


44
Examples

● 6n^3 ≠ Θ(n^2): c1 n^2 ≤ 6n^3 ≤ c2 n^2

  ⇒ only holds for: n ≤ c2/6

● n ≠ Θ(log n): c1 log n ≤ n ≤ c2 log n

  ⇒ c2 ≥ n/log n, ∀ n ≥ n0 – impossible

45
Relations Between Different Sets
● Subset relations between order-of-growth sets.

[Figure: in R→R, Θ(f) = O(f) ∩ Ω(f), with f itself inside Θ(f)]

46
Logarithms and properties
● In algorithm analysis we often use the notation “log n” without
specifying the base

Binary logarithm:  lg n = log2 n          log x^y = y log x
Natural logarithm: ln n = loge n          log xy = log x + log y
lg^k n = (lg n)^k                         log x/y = log x - log y
lg lg n = lg(lg n)                        a^(logb x) = x^(logb a)
                                          logb x = (loga x)/(loga b)
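
A quick numeric check of the change-of-base rule (an illustrative
addition, not part of the slides):

#include <cmath>
#include <iostream>

int main() {
    double x = 1024.0;
    // logb x = (loga x) / (loga b), here with natural logs and b = 2
    std::cout << std::log(x) / std::log(2.0) << "\n";  // prints 10
    std::cout << std::log2(x) << "\n";                 // same value
    return 0;
}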

47
More Examples
● For each of the following pairs of functions, either f(n) is O(g(n)),
  f(n) is Ω(g(n)), or f(n) is Θ(g(n)). Determine which relationship is correct.

● f(n) = log n^2;      g(n) = log n + 5      f(n) = Θ(g(n))
● f(n) = n;            g(n) = log n^2        f(n) = Ω(g(n))
● f(n) = log log n;    g(n) = log n          f(n) = O(g(n))
● f(n) = n;            g(n) = log^2 n        f(n) = Ω(g(n))
● f(n) = n log n + n;  g(n) = log n          f(n) = Ω(g(n))
● f(n) = 10;           g(n) = log 10         f(n) = Θ(g(n))
● f(n) = 2^n;          g(n) = 10n^2          f(n) = Ω(g(n))
● f(n) = 2^n;          g(n) = 3^n            f(n) = O(g(n))

48
Properties
● Theorem:
  f(n) = Θ(g(n)) ⇔ f(n) = O(g(n)) and f(n) = Ω(g(n))
● Transitivity:
  ● f(n) = Θ(g(n)) and g(n) = Θ(h(n)) ⇒ f(n) = Θ(h(n))
  ● Same for O and Ω
● Reflexivity:
  ● f(n) = Θ(f(n))
  ● Same for O and Ω
● Symmetry:
  ● f(n) = Θ(g(n)) if and only if g(n) = Θ(f(n))
● Transpose symmetry:
  ● f(n) = O(g(n)) if and only if g(n) = Ω(f(n))

49
Order-of-Magnitude Analysis and Big O Notation
• If Algorithm A requires time proportional to f(n), Algorithm A
is said to be order f(n), and it is denoted as O(f(n)).
• The function f(n) is called the algorithm’s growth-rate
function.
• Since the capital O is used in the notation, this notation is
called the Big O notation.
• If Algorithm A requires time proportional to n^2, it is O(n^2).
• If Algorithm A requires time proportional to n, it is O(n).

50
Definition of the Order of an Algorithm
Definition:
Algorithm A is order f(n) – denoted as O(f(n)) – if
constants k and n0 exist such that A requires no more
than k*f(n) time units to solve a problem of size n ≥ n0.

● The requirement of n ≥ n0 in the definition of O(f(n))
formalizes the notion of sufficiently large problems.
– In general, many values of k and n0 can satisfy this definition.
51
Order of an Algorithm

Suppose an algorithm requires n^2 - 3*n + 10 seconds to solve a problem of
size n. If constants k and n0 exist such that

    k*n^2 > n^2 - 3*n + 10 for all n ≥ n0,

then the algorithm is order n^2. In fact, k = 3 and n0 = 2 work:

    3*n^2 > n^2 - 3*n + 10 for all n ≥ 2.

Thus, the algorithm requires no more than k*n^2 time units for n ≥ n0,
so it is O(n^2).

52
Order of an Algorithm (cont.)

[Figure: illustration omitted]
53
A Comparison of Growth-Rate Functions

[Figure: comparison of growth-rate function values omitted]

54
A Comparison of Growth-Rate Functions (cont.)

[Figure: graphs of common growth-rate functions omitted]

55
Growth-Rate Functions
O(1)         Time requirement is constant, and it is independent of the problem’s size.
O(log2n)     Time requirement for a logarithmic algorithm increases slowly as the
             problem size increases.
O(n)         Time requirement for a linear algorithm increases directly with the size
             of the problem.
O(n*log2n)   Time requirement for an n*log2n algorithm increases more rapidly than
             that of a linear algorithm.
O(n^2)       Time requirement for a quadratic algorithm increases rapidly with the
             size of the problem.
O(n^3)       Time requirement for a cubic algorithm increases more rapidly with the
             size of the problem than the time requirement for a quadratic algorithm.
O(2^n)       As the size of the problem increases, the time requirement for an
             exponential algorithm increases too rapidly to be practical.

56
Growth-Rate Functions
• If an algorithm takes 1 second to run with the problem size
8, what is the time requirement (approximately) for that
algorithm with the problem size 16?
• If its order is:
O(1)        → T(n) = 1 second
O(log2n)    → T(n) = (1*log2 16) / log2 8 = 4/3 seconds
O(n)        → T(n) = (1*16) / 8 = 2 seconds
O(n*log2n)  → T(n) = (1*16*log2 16) / (8*log2 8) = 8/3 seconds
O(n^2)      → T(n) = (1*16^2) / 8^2 = 4 seconds
O(n^3)      → T(n) = (1*16^3) / 8^3 = 8 seconds
O(2^n)      → T(n) = (1*2^16) / 2^8 = 2^8 seconds = 256 seconds
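
The same arithmetic can be reproduced mechanically; a hedged C++ sketch
(the sizes 8 and 16 and the 1-second baseline come from the slide above):

#include <cmath>
#include <iostream>

int main() {
    double n0 = 8, n1 = 16, t0 = 1.0;  // baseline: 1 second at size 8
    std::cout << "O(1):       " << t0 << " s\n";                                 // 1
    std::cout << "O(log n):   " << t0 * std::log2(n1) / std::log2(n0) << " s\n"; // 4/3
    std::cout << "O(n):       " << t0 * n1 / n0 << " s\n";                       // 2
    std::cout << "O(n log n): " << t0 * (n1 * std::log2(n1))
                                      / (n0 * std::log2(n0)) << " s\n";          // 8/3
    std::cout << "O(n^2):     " << t0 * std::pow(n1 / n0, 2) << " s\n";          // 4
    std::cout << "O(n^3):     " << t0 * std::pow(n1 / n0, 3) << " s\n";          // 8
    std::cout << "O(2^n):     " << t0 * std::pow(2.0, n1 - n0) << " s\n";        // 256
    return 0;
}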

57
Properties of Growth-Rate Functions
1. We can ignore low-order terms in an algorithm’s growth-rate
function.
– If an algorithm is O(n^3 + 4n^2 + 3n), it is also O(n^3).
– We only use the higher-order term as the algorithm’s growth-rate function.

2. We can ignore a multiplicative constant in the higher-order term
of an algorithm’s growth-rate function.
– If an algorithm is O(5n^3), it is also O(n^3).

3. O(f(n)) + O(g(n)) = O(f(n) + g(n))
– We can combine growth-rate functions.
– If an algorithm is O(n^3) + O(4n^2), it is also O(n^3 + 4n^2) → So, it is O(n^3).
– Similar rules hold for multiplication.

58
Some Mathematical Facts
• Some mathematical equalities are:

  Sum(i for i=1..n)     = 1 + 2 + ... + n        = n*(n+1)/2 ≈ n^2/2

  Sum(i^2 for i=1..n)   = 1 + 4 + ... + n^2      = n*(n+1)*(2n+1)/6 ≈ n^3/3

  Sum(2^i for i=0..n-1) = 1 + 2 + ... + 2^(n-1)  = 2^n - 1

59
Growth-Rate Functions – Example1
Cost Times
i = 1; c1 1
sum = 0; c2 1
while (i <= n) { c3 n+1
i = i + 1; c4 n
sum = sum + i; c5 n
}

T(n) = c1 + c2 + (n+1)*c3 + n*c4 + n*c5


= (c3+c4+c5)*n + (c1+c2+c3)
= a*n + b
 So, the growth-rate function for this algorithm is O(n)

60
Growth-Rate Functions – Example2
Cost Times
i=1; c1 1
sum = 0; c2 1
while (i <= n) { c3 n+1
j=1; c4 n
while (j <= n) { c5 n*(n+1)
sum = sum + i; c6 n*n
j = j + 1; c7 n*n
}
i = i +1; c8 n
}
T(n) = c1 + c2 + (n+1)*c3 + n*c4 + n*(n+1)*c5 + n*n*c6 + n*n*c7 + n*c8
     = (c5+c6+c7)*n^2 + (c3+c4+c5+c8)*n + (c1+c2+c3)
     = a*n^2 + b*n + c
 So, the growth-rate function for this algorithm is O(n^2)
61
Growth-Rate Functions – Example3
                                    Cost   Times
for (i=1; i<=n; i++)                c1     n+1
    for (j=1; j<=i; j++)            c2     Sum(i+1 for i=1..n)
        for (k=1; k<=j; k++)        c3     Sum(Sum(j+1 for j=1..i) for i=1..n)
            x = x + 1;              c4     Sum(Sum(j for j=1..i) for i=1..n)

T(n) = c1*(n+1) + c2*Sum(i+1 for i=1..n) + c3*Sum(Sum(j+1 for j=1..i) for i=1..n)
       + c4*Sum(Sum(j for j=1..i) for i=1..n)
     = a*n^3 + b*n^2 + c*n + d

 So, the growth-rate function for this algorithm is O(n^3)
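
Counting how many times the innermost statement actually executes confirms
the cubic growth; an illustrative C++ sketch (not from the slides):

#include <iostream>

int main() {
    for (int n : {10, 20, 40}) {
        long long count = 0;
        for (int i = 1; i <= n; i++)
            for (int j = 1; j <= i; j++)
                for (int k = 1; k <= j; k++)
                    count++;                     // stands in for x = x + 1
        // count equals n(n+1)(n+2)/6, i.e., roughly n^3/6
        std::cout << "n=" << n << "  count=" << count << "\n";
    }
    return 0;
}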

62
Growth-Rate Functions – Recursive
Algorithms
void hanoi(int n, char source, char dest, char spare) {     // Cost
    if (n > 0) {                                            // c1
        hanoi(n-1, source, spare, dest);                    // c2
        cout << "Move top disk from pole " << source        // c3
             << " to pole " << dest << endl;
        hanoi(n-1, spare, dest, source);                    // c4
    }
}

• The time-complexity function T(n) of a recursive algorithm is
defined in terms of itself; this is known as the recurrence equation
for T(n).
• To find the growth-rate function for a recursive algorithm, we have
to solve its recurrence relation.

63
Growth-Rate Functions –
Hanoi Towers
• What is the cost of hanoi(n,’A’,’B’,’C’)?

when n=0
T(0) = c1

when n>0
T(n) = c1 + c2 + T(n-1) + c3 + c4 + T(n-1)
     = 2*T(n-1) + (c1+c2+c3+c4)
     = 2*T(n-1) + c   → the recurrence equation for the growth-rate
                        function of the Hanoi-towers algorithm

• Now, we have to solve this recurrence equation to find the growth-rate
function of the Hanoi-towers algorithm.

64
Growth-Rate Functions – Hanoi
Towers (cont.)
• There are many methods to solve recurrence equations, but we will use a
simple method known as repeated substitutions.

T(n) = 2*T(n-1) + c
     = 2 * (2*T(n-2) + c) + c
     = 2 * (2 * (2*T(n-3) + c) + c) + c
     = 2^3 * T(n-3) + (2^2 + 2^1 + 2^0)*c        (assuming n > 2)
when the substitution is repeated i-1 times:
     = 2^i * T(n-i) + (2^(i-1) + ... + 2^1 + 2^0)*c
when i = n:
     = 2^n * T(0) + (2^(n-1) + ... + 2^1 + 2^0)*c
     = 2^n * c1 + (2^n - 1)*c
     = 2^n * (c1 + c) - c
 So, the growth rate function is O(2^n)
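
The closed form can be checked empirically by counting the disk moves; a
sketch (an addition, not from the slides) where a counter stands in for
the cout statement:

#include <iostream>

long long moves = 0;

void hanoi(int n, char source, char dest, char spare) {
    if (n > 0) {
        hanoi(n - 1, source, spare, dest);
        moves++;                                  // one disk move
        hanoi(n - 1, spare, dest, source);
    }
}

int main() {
    for (int n = 1; n <= 10; n++) {
        moves = 0;
        hanoi(n, 'A', 'B', 'C');
        std::cout << "n=" << n << "  moves=" << moves << "\n";  // 2^n - 1
    }
    return 0;
}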


65
What to Analyze
• An algorithm can require different times to solve different
problems of the same size.
– E.g., searching for an item in a list of n elements using sequential search.
→ Cost: 1, 2, ..., n
• Worst-Case Analysis – The maximum amount of time that an
algorithm requires to solve a problem of size n.
– This gives an upper bound for the time complexity of an algorithm.
– Normally, we try to find the worst-case behavior of an algorithm.
• Best-Case Analysis – The minimum amount of time that an
algorithm requires to solve a problem of size n.
– The best-case behavior of an algorithm is NOT so useful.
• Average-Case Analysis – The average amount of time that an
algorithm requires to solve a problem of size n.
– Sometimes, it is difficult to find the average-case behavior of an algorithm.
– We have to look at all possible data organizations of a given size n, and the
distribution probabilities of these organizations.
– Worst-case analysis is more common than average-case analysis.
66
What is Important?
• An array-based list retrieve operation is O(1); a linked-list-based
list retrieve operation is O(n).
• But insert and delete operations are much easier on a
linked-list-based list implementation.
 When selecting the implementation of an Abstract Data
Type (ADT), we have to consider how frequently particular ADT
operations occur in a given application.

• If the problem size is always small, we can probably ignore
the algorithm’s efficiency.
– In this case, we should choose the simplest algorithm.

67
What is Important?
(cont.)
• We have to weigh the trade-offs between an algorithm’s time
requirement and its memory requirements.
• We have to compare algorithms for both style and efficiency.
– The analysis should focus on gross differences in efficiency and not reward coding
tricks that save a small amount of time.
– That is, there is no need for coding tricks if the gain is small.
– An easily understandable program is also important.
• Order-of-magnitude analysis focuses on large problems.

68
Sequential Search
int sequentialSearch(const int a[], int item, int n) {
    int i;
    for (i = 0; i < n && a[i] != item; i++)
        ;                       // empty body: the loop does the searching
    if (i == n)
        return -1;
    return i;
}

Unsuccessful Search: → O(n)

Successful Search:
Best-Case: item is in the first location of the array → O(1)
Worst-Case: item is in the last location of the array → O(n)
Average-Case: The number of key comparisons is 1, 2, ..., n

    (Sum(i for i=1..n)) / n = ((n^2 + n)/2) / n = (n+1)/2 → O(n)
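
A short driver (an illustrative addition) exercising the three cases
analyzed above, assuming the sequentialSearch definition from this slide
is compiled in the same file:

#include <iostream>

int sequentialSearch(const int a[], int item, int n);  // as defined above

int main() {
    int a[] = {3, 7, 11, 15, 19};
    std::cout << sequentialSearch(a, 3, 5)  << "\n";  // 0  -> best case, 1 comparison
    std::cout << sequentialSearch(a, 19, 5) << "\n";  // 4  -> worst case, n comparisons
    std::cout << sequentialSearch(a, 8, 5)  << "\n";  // -1 -> unsuccessful search
    return 0;
}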

69
Binary Search
int binarySearch(int a[], int size, int x) {
    int low = 0;
    int high = size - 1;
    int mid;                    // mid will be the index of
                                // the target when it's found
    while (low <= high) {
        mid = (low + high) / 2;
        if (a[mid] < x)
            low = mid + 1;
        else if (a[mid] > x)
            high = mid - 1;
        else
            return mid;
    }
    return -1;
}
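
A short driver (an illustrative addition) for the function above, assuming
its definition is compiled in the same file; the input must be sorted. One
common hardening, noted as an aside: mid = low + (high - low) / 2 avoids
potential int overflow of low + high on very large arrays.

#include <iostream>

int binarySearch(int a[], int size, int x);    // as defined above

int main() {
    int a[] = {2, 4, 8, 16, 32, 64, 128, 256};     // sorted input
    std::cout << binarySearch(a, 8, 16) << "\n";   // 3
    std::cout << binarySearch(a, 8, 5)  << "\n";   // -1 (unsuccessful)
    return 0;
}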

70
Binary Search – Analysis
• For an unsuccessful search:
– The number of iterations in the loop is ⌊log2n⌋ + 1 → O(log2n)
• For a successful search:
– Best-Case: The number of iterations is 1 → O(1)
– Worst-Case: The number of iterations is ⌊log2n⌋ + 1 → O(log2n)
– Average-Case: The avg. # of iterations < log2n → O(log2n)

    0  1  2  3  4  5  6  7   ← an array with size 8
    3  2  3  1  3  2  3  4   ← # of iterations
    The average # of iterations = 21/8 < log2 8

71
How much better is O(log2n)?

n O(log2n)
16 4
64 6
256 8
1024 (1KB) 10
16,384 14
131,072 17
262,144 18
524,288 19
1,048,576 (1MB) 20
1,073,741,824 (1GB) 30

72
