
Getting Started

Chapter 2

Mr. Shahid Waseem


PhD-Scholar-CS UET
MS-CS GCUL
Topics
• Insertion Sort
• Analyzing Insertion Sort
• Recurrence
• Divide-and-Conquer Approach
Insertion Sort
Insertion Sort (Pseudocode)
Insertion Sort
(a)-(e) In each iteration, the black rectangle holds the key taken from
A[j], which is compared with the values in the shaded
rectangles to its left in the test of line 5.
Shaded arrows show array values moved one position to
the right in line 6, and black arrows indicate where the key
is moved to in line 8.
(f) The final sorted array.
Loop invariants (Insertion Sort)
Initialization:
Before the first iteration, j = 2. The subarray
A[1 ‥ j - 1] therefore consists of just the single
element A[1], which is in fact the original
element in A[1]. Moreover, this subarray is
sorted, which shows that the loop invariant
holds prior to the first iteration of the loop.
Loop invariants (Insertion Sort)
Maintenance:
The body of the outer for loop works by moving
A[j - 1], A[j - 2], A[j - 3], and so on by one
position to the right until the proper position for
A[j] is found, at which point the value of A[j] is
inserted.
Loop invariants (Insertion Sort)
Termination:
The outer for loop ends when j exceeds n, i.e.,
when j = n + 1. Substituting n + 1 for j in the
wording of the loop invariant, we have that the
subarray A[1 ‥ n] consists of the elements
originally in A[1 ‥ n], but in sorted order. But the
subarray A[1 ‥ n] is the entire array! Hence, the
entire array is sorted, which means that the
algorithm is correct.
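The three loop-invariant steps above can be checked mechanically. The following is a minimal Python sketch, not the slides' own pseudocode: it uses 0-based indices (the slides use A[1 ‥ n]), and the `assert` inside the loop is an illustrative addition that tests the invariant at the start of every outer iteration.

```python
def insertion_sort(a):
    """Sort the list a in place and return it (0-based indices)."""
    original = list(a)  # kept only so the invariant can be checked
    for j in range(1, len(a)):
        # Loop invariant: a[0..j-1] holds the elements originally in
        # a[0..j-1], but in sorted order.
        assert a[:j] == sorted(original[:j])
        key = a[j]
        i = j - 1
        # Shift elements greater than key one position to the right.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key  # insert the key at its proper position
    return a


print(insertion_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]
```

At termination the invariant covers the whole list, which is the correctness argument given in the Termination step.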
Running Time (INSERTION-SORT)
Best Case (INSERTION-SORT)
• Even for inputs of a given size, an algorithm's
running time may depend on which input of
that size is given.
• The best case occurs if the array is already
sorted. For each j = 2, 3, …, n, we then find
that A[i] ≤ key in line 5 when i has its initial
value of j - 1. Thus t_j = 1 for j = 2, 3, …, n,
which determines the best-case running time.
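The slide's formula did not survive extraction. As a hedged reconstruction, assume the usual textbook cost model: line k of the pseudocode costs c_k per execution, line 3 is a comment with zero cost, and lines 6 and 7 (the body of the while loop) never run in the best case. Under those assumptions the best-case sum would be:

```latex
\begin{aligned}
T(n) &= c_1 n + c_2 (n-1) + c_4 (n-1) + c_5 (n-1) + c_8 (n-1) \\
     &= (c_1 + c_2 + c_4 + c_5 + c_8)\, n - (c_2 + c_4 + c_5 + c_8)
\end{aligned}
```

The line numbering here is an assumption taken from the references to lines 5, 6, and 8 elsewhere in these slides.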
Best Case (INSERTION-SORT)
We can express this running time as an + b for
constants a and b that depend on the
statement costs c_i; it is thus a linear function of
n.
Worst Case (INSERTION-SORT)
If the array is in reverse sorted order (that is, in
decreasing order), the worst case results. We
must compare each element A[j] with each
element in the entire sorted subarray
A[1 ‥ j - 1], so t_j = j for j = 2, 3, …, n.
Worst Case (INSERTION-SORT)
We can express this worst-case running time as
an^2 + bn + c for constants a, b, and c that again
depend on the statement costs c_i; it is thus a
quadratic function of n.
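To make the best and worst cases concrete, the sketch below counts how many times the while-loop test runs in total (the sum of the t_j values). The helper name `count_while_tests` is hypothetical and the code is 0-based, so it only illustrates the counts and is not the slides' pseudocode.

```python
def count_while_tests(values):
    """Return the total number of while-loop tests insertion sort performs."""
    a = list(values)  # work on a copy
    tests = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while True:
            tests += 1  # one execution of the loop test
            if i >= 0 and a[i] > key:
                a[i + 1] = a[i]
                i -= 1
            else:
                break
        a[i + 1] = key
    return tests


n = 8
best = count_while_tests(range(1, n + 1))   # already sorted input
worst = count_while_tests(range(n, 0, -1))  # reverse-sorted input
print(best, worst)  # 7 35
```

For n = 8 the sorted input gives n - 1 = 7 tests (linear), while the reversed input gives 2 + 3 + … + n = n(n + 1)/2 - 1 = 35 tests (quadratic).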
Recurrences
Example
Factorial(n)
    if (n = 1)
        return 1
    else
        return (n * Factorial(n - 1))

Recurrence Equation

$$T(n) = \begin{cases} 1 & \text{if } n = 1 \\ T(n-1) + 1 & \text{if } n > 1 \end{cases}$$
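As a small illustration of how the recurrence mirrors the code, the sketch below returns both n! and the number of calls made. Counting one unit of work per call is an assumption chosen to match T(1) = 1 and T(n) = T(n - 1) + 1.

```python
def factorial(n):
    """Return (n!, number of calls made) for an integer n >= 1."""
    if n == 1:
        return 1, 1  # base case: one call, T(1) = 1
    value, calls = factorial(n - 1)
    return n * value, calls + 1  # T(n) = T(n - 1) + 1


print(factorial(5))  # (120, 5)
```

The call count equals n, so under this cost model the recurrence solves to T(n) = n.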
Recurrences
• The expression:

$$T(n) = \begin{cases} c & \text{if } n = 1 \\ 2T(n/2) + cn & \text{if } n > 1 \end{cases}$$

is a recurrence.

• Recurrence: an equation that describes a function
in terms of its value on smaller inputs
Divide and Conquer
Divide: Break the problem into smaller sub-problems

Conquer: Solve the sub-problems. Generally, this involves
recursing until the problem is small enough that it is trivial to
solve (e.g. 1 or 2 items)

Combine: Merge the solutions to the sub-problems into a
solution to the original problem
Divide and Conquer: Sorting
How should we split the data?

What are the sub-problems we need to solve?

How do we combine the results from these sub-problems?
MergeSort
MergeSort: Merge

Assuming L and R are sorted already, merge
the two to create a single sorted array
Merge
L: 1 3 5 8 R: 2 4 6 7

Indices i and j scan L and R from left to right. At each step the
smaller of L[i] and R[j] is appended to B:

B:
B: 1
B: 1 2
B: 1 2 3
B: 1 2 3 4
B: 1 2 3 4 5
B: 1 2 3 4 5 6
B: 1 2 3 4 5 6 7
B: 1 2 3 4 5 6 7 8
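The merge walk-through can be sketched in Python as follows. This is an illustrative version that copies leftovers at the end, and it may differ from the slides' pseudocode (whose line numbers 4-10 are not reproduced here). The names `left`, `right`, and `merged` correspond to L, R, and B.

```python
def merge(left, right):
    """Merge two already-sorted lists into a new sorted list."""
    merged = []
    i = j = 0
    # Repeatedly take the smaller front element of the two lists.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One list is exhausted; the rest of the other is already sorted.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


print(merge([1, 3, 5, 8], [2, 4, 6, 7]))  # [1, 2, 3, 4, 5, 6, 7, 8]
```

Each element is appended exactly once, which is why the running time is linear in the total number of elements.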
Merge
Does the algorithm terminate?
Merge

Is it correct?
Loop invariant (end-of-iteration form): At the end of each iteration
of the for loop of lines 4-10, B[1..k] contains the smallest k
elements from L and R in sorted order.
Loop invariant (start-of-iteration form): At the beginning of each
iteration of the for loop of lines 4-10, the first k-1 elements of B
are the smallest k-1 elements from L and R in sorted order.
Merge
Running time?

Cost of each line of the Merge pseudocode, in order:
1, 1, 1, n+1, n, n, n, n, n, 1

T(n) = 1 + 1 + 1 + (n+1) + n + n + n + n + n + 1 = 6n + 5

T(n) = O(n)
Merge

Running time Θ(n) - linear


Merge-Sort
Running time?

$$T(n) = \begin{cases} c & \text{if } n \text{ is small} \\ 2T(n/2) + D(n) + C(n) & \text{otherwise} \end{cases}$$

D(n): cost of splitting (dividing) the data - linear Θ(n)

C(n): cost of merging/combining the data - linear Θ(n)

Since D(n) + C(n) = Θ(n), the recurrence becomes

$$T(n) = \begin{cases} c & \text{if } n \text{ is small} \\ 2T(n/2) + cn & \text{otherwise} \end{cases}$$

Which is?
Merge-Sort

$$T(n) = \begin{cases} c & \text{if } n \text{ is small} \\ 2T(n/2) + cn & \text{otherwise} \end{cases}$$

Expanding the recurrence level by level gives the recursion tree
(total cost of each level on the right):

cn                                               | cn
cn/2  cn/2                                       | cn
cn/4  cn/4  cn/4  cn/4                           | cn
cn/8  cn/8  cn/8  cn/8  cn/8  cn/8  cn/8  cn/8   | cn
...
c  c  c  c  c  …  c  c  c  c  c  c               | cn

Depth?
Merge-Sort
We can calculate the depth by determining when the
recursion gets down to a small problem size, e.g. 1

At each level, we divide by 2

$$\frac{n}{2^d} = 1 \quad\Rightarrow\quad 2^d = n \quad\Rightarrow\quad d = \log_2 n$$

Merge-Sort

$$T(n) = \begin{cases} c & \text{if } n \text{ is small} \\ 2T(n/2) + cn & \text{otherwise} \end{cases}$$

Running time?
• Each level costs cn
• log n levels

cn log n = Θ(n log n)
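The depth argument can be observed directly. The sketch below is an illustrative merge sort that also reports the deepest recursion level reached. The `depth` parameter and the helper `merge` are additions for this example, not part of the slides' pseudocode.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]


def merge_sort(items, depth=0):
    """Return (sorted copy of items, maximum recursion depth reached)."""
    if len(items) <= 1:  # small problem: trivially sorted
        return list(items), depth
    mid = len(items) // 2  # divide
    left, left_depth = merge_sort(items[:mid], depth + 1)   # conquer
    right, right_depth = merge_sort(items[mid:], depth + 1)
    return merge(left, right), max(left_depth, right_depth)  # combine


result, depth = merge_sort([8, 3, 5, 1, 7, 2, 6, 4])
print(result, depth)  # [1, 2, 3, 4, 5, 6, 7, 8] 3
```

For n = 8 the reported depth is 3 = log2 8, matching the derivation of d above.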


Three Approaches
1. Substitution method: when we have a good
guess of the solution, prove that it's correct

2. Recursion-tree method: if we don't have a
good guess, the recursion tree can help. Then
solve with the substitution method.

3. Master method: provides solutions for
recurrences of the form:

$$T(n) = aT(n/b) + f(n)$$
Substitution method
Guess the form of the solution
Then prove it's correct by induction

$$T(n) = T(n/2) + d$$

Guesses?
Halves the input, then does a constant amount of work
Similar to binary search:
Guess: O(log2 n)

Proof?

$$T(n) = T(n/2) + d = O(\log_2 n)\ ?$$

Proof by induction!
Assume it's true for smaller T(k), i.e. k < n;
prove that it's then true for the current T(n)
$$T(n) = T(n/2) + d$$
Assume T(k) = O(log2 k) for all k < n
Show that T(n) = O(log2 n)

From our assumption, T(n/2) = O(log2 (n/2)).

Definition of big-O:

$$O(g(n)) = \{\, f(n) : \text{there exist positive constants } c \text{ and } n_0 \text{ such that } 0 \le f(n) \le c\,g(n) \text{ for all } n \ge n_0 \,\}$$

From the definition of big-O: T(n/2) ≤ c log2 (n/2)

How do we now prove T(n) = O(log n)?

To prove that T(n) = O(log2 n), identify the appropriate
constants, i.e. some constant c such that T(n) ≤ c log2 n:

$$\begin{aligned}
T(n) &= T(n/2) + d \\
     &\le c \log_2 (n/2) + d && \text{from our inductive hypothesis} \\
     &\le c \log_2 n - c \log_2 2 + d \\
     &\le c \log_2 n - c + d && \text{(residual: } -c + d\text{)} \\
     &\le c \log_2 n && \text{if } c \ge d
\end{aligned}$$
Base case?
For an inductive proof we need to show two things:
• Assuming it's true for k < n, show it's true for n
• Show that it holds for some base case

$$T(n) = \begin{cases} \text{(a constant)} & \text{if } n \text{ is small} \\ T(n/2) + d & \text{otherwise} \end{cases}$$
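A quick numeric sanity check of the guess, as a sketch under stated assumptions: the base case is taken to be T(1) = d and n/2 is rounded down, neither of which is specified on the slide. With c = 2d and n0 = 2 the bound T(n) ≤ c log2 n holds on the whole tested range. The inductive step only needs c ≥ d, but this base case forces the larger constant.

```python
import math


def T(n, d=1):
    """T(n) = T(n // 2) + d, with the assumed base case T(1) = d."""
    if n <= 1:
        return d
    return T(n // 2, d) + d


d = 3
c = 2 * d  # constant chosen so that the bound also covers n = 2
holds = all(T(n, d) <= c * math.log2(n) for n in range(2, 10_000))
print(holds)  # True
```

A numeric check like this does not replace the induction proof; it only helps confirm that the guessed form and constants are plausible.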
$$T(n) = T(n-1) + n$$
Guess the solution?
• At each iteration, does a linear amount of work
(i.e. iterates over the data) and reduces the size by
one at each step
• O(n^2)

Assume T(k) = O(k^2) for all k < n
• again, this implies that T(n-1) ≤ c(n-1)^2
Show that T(n) = O(n^2), i.e. T(n) ≤ cn^2

$$\begin{aligned}
T(n) &= T(n-1) + n \\
     &\le c(n-1)^2 + n && \text{from our inductive hypothesis} \\
     &= c(n^2 - 2n + 1) + n \\
     &= cn^2 - 2cn + c + n && \text{(residual: } -2cn + c + n\text{)} \\
     &\le cn^2 && \text{if } -2cn + c + n \le 0
\end{aligned}$$

The condition on the residual can be rewritten as

$$-2cn + c \le -n \iff c(-2n + 1) \le -n \iff c \ge \frac{n}{2n - 1} = \frac{1}{2 - 1/n}$$

which holds for any c ≥ 1 for n ≥ 1.
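The bound can be checked numerically. The sketch below assumes the base case T(1) = 1 (not given on the slide), unrolls the recurrence iteratively, and compares it with the closed form n(n + 1)/2 and with the bound cn^2 for c = 1.

```python
def T(n):
    """T(n) = T(n - 1) + n, unrolled iteratively; assumed base case T(1) = 1."""
    total = 1
    for k in range(2, n + 1):
        total += k  # one step of the recurrence
    return total


c = 1
for n in (1, 10, 100, 1000):
    assert T(n) == n * (n + 1) // 2  # closed form of the sum 1 + 2 + ... + n
    assert T(n) <= c * n * n         # the O(n^2) bound with c = 1
print(T(10), T(100))  # 55 5050
```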
$$T(n) = 2T(n/2) + n$$
Guess the solution?
Recurses into 2 sub-problems that are half the
size and performs some operation on all the
elements
• O(n log n)

What if we guess wrong, e.g. O(n^2)?

Assume T(k) = O(k^2) for all k < n
• again, this implies that T(n/2) ≤ c(n/2)^2
Show that T(n) = O(n^2)

$$\begin{aligned}
T(n) &= 2T(n/2) + n \\
     &\le 2c(n/2)^2 + n && \text{from our inductive hypothesis} \\
     &= 2cn^2/4 + n \\
     &= \tfrac{1}{2}cn^2 + n \\
     &= cn^2 - \left(\tfrac{1}{2}cn^2 - n\right) && \text{(residual)} \\
     &\le cn^2 && \text{if } -\left(\tfrac{1}{2}cn^2 - n\right) \le 0
\end{aligned}$$

$$-\tfrac{1}{2}cn^2 + n \le 0 \iff cn \ge 2$$

Overkill? The bound holds, but it is not tight.
$$T(n) = 2T(n/2) + n$$
What if we guess wrong, e.g. O(n)?

Assume T(k) = O(k) for all k < n
• again, this implies that T(n/2) ≤ c(n/2)
Show that T(n) = O(n)

$$\begin{aligned}
T(n) &= 2T(n/2) + n \\
     &\le 2c(n/2) + n \\
     &= cn + n \\
     &\le cn \ ??
\end{aligned}$$

cn + n = (c + 1)n is still a constant times n, so can we
just roll it in? No: we must prove the exact form T(n) ≤ cn,
and cn + n ≤ cn fails for every n > 0.
$$T(n) = 2T(n/2) + n$$
Prove T(n) = O(n log2 n)
Assume T(k) = O(k log2 k) for all k < n
• again, this implies that T(k) ≤ ck log2 k
Show that T(n) = O(n log2 n)

$$\begin{aligned}
T(n) &= 2T(n/2) + n \\
     &\le 2c(n/2)\log_2(n/2) + n \\
     &= cn(\log_2 n - \log_2 2) + n \\
     &= cn\log_2 n - cn + n && \text{(residual: } -cn + n\text{)} \\
     &\le cn\log_2 n && \text{if } cn \ge n, \text{ i.e. } c \ge 1
\end{aligned}$$
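The three guesses for this recurrence can be compared numerically. The sketch assumes the base case T(1) = 1 and restricts n to powers of two so that n/2 is exact. Both are assumptions made for the illustration.

```python
def T(n):
    """T(n) = 2T(n/2) + n for n a power of two; assumed base case T(1) = 1."""
    if n == 1:
        return 1
    return 2 * T(n // 2) + n


for k in range(1, 11):
    n = 2 ** k
    # n log2 n = n * k for powers of two; with T(1) = 1, T(n) = n*k + n.
    assert T(n) == n * k + n
    assert T(n) <= 2 * n * k  # O(n log n) holds with c = 2 for n >= 2
    assert T(n) / n == k + 1  # T(n)/n keeps growing, so T(n) is not O(n)
print(T(1024))  # 11264
```

With this base case the inductive step's condition c ≥ 1 is not enough at small n, which is why the check uses c = 2.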
Changing variables
$$T(n) = 2T(\sqrt{n}) + \log n$$
Guesses?
We can do a variable change: let m = log2 n
(or n = 2^m)

$$T(2^m) = 2T(2^{m/2}) + m$$
Now, let S(m) = T(2^m)

$$S(m) = 2S(m/2) + m$$

Guess? S(m) = O(m log m)

$$T(n) = T(2^m) = S(m) = O(m \log m)$$

substituting m = log n

$$T(n) = O(\log n \log\log n)$$
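The change of variables can be checked on inputs of the form n = 2^m with m a power of two, where both the square root and log2 n are exact integers. The base cases S(1) = 1 and T(2) = 1 are assumptions for this sketch.

```python
import math


def S(m):
    """S(m) = 2S(m/2) + m for m a power of two; assumed base case S(1) = 1."""
    if m == 1:
        return 1
    return 2 * S(m // 2) + m


def T(n):
    """T(n) = 2T(sqrt(n)) + log2(n) for n = 2^(2^k); base case T(2) = S(1)."""
    if n == 2:
        return 1
    log2_n = n.bit_length() - 1  # exact log2 for powers of two
    return 2 * T(math.isqrt(n)) + log2_n


for k in range(5):
    m = 2 ** k
    n = 2 ** m
    assert T(n) == S(m)       # the substitution S(m) = T(2^m)
    assert S(m) == m * k + m  # m log2 m + m, i.e. O(m log m)
print(T(65536), S(16))  # 80 80
```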


Recursion Tree
Guessing the answer can be difficult
$$T(n) = 3T(n/4) + n^2$$
$$T(n) = T(n/3) + 2T(2n/3) + cn$$

The recursion tree approach
• Draw out the cost of the tree at each level of recursion
• Sum up the cost of the levels of the tree
   - Find the cost of each level with respect to the depth
   - Figure out the depth of the tree
   - Figure out (or bound) the number of leaves


$$T(n) = 3T(n/4) + n^2$$

Recursion tree (nodes at each depth, with the total cost of the level):

depth 0: 1 node of cost cn^2        | level cost cn^2
depth 1: 3 nodes of cost c(n/4)^2   | level cost (3/16) cn^2
depth 2: 9 nodes of cost c(n/16)^2  | level cost (3/16)^2 cn^2
...

What is the cost at each level?

$$\left(\frac{3}{16}\right)^d cn^2$$
What is the depth of the tree?
At each level, the size of the data is divided by 4

$$\frac{n}{4^d} = 1 \;\Rightarrow\; \log\frac{n}{4^d} = 0 \;\Rightarrow\; \log n - \log 4^d = 0 \;\Rightarrow\; d \log 4 = \log n \;\Rightarrow\; d = \log_4 n$$
How many leaves are there?
The tree for $T(n) = 3T(n/4) + n^2$ bottoms out at T(1) after d = log_4 n levels.

How many leaves are there in a complete
ternary (three-way) tree of depth d?

$$3^d = 3^{\log_4 n}$$
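Total cost

$$\begin{aligned}
T(n) &= cn^2 + \frac{3}{16}cn^2 + \left(\frac{3}{16}\right)^2 cn^2 + \dots + \left(\frac{3}{16}\right)^{d-1} cn^2 + \Theta(3^{\log_4 n}) \\
     &= \sum_{i=0}^{\log_4 n - 1} \left(\frac{3}{16}\right)^i cn^2 + \Theta(3^{\log_4 n}) \\
     &< \sum_{i=0}^{\infty} \left(\frac{3}{16}\right)^i cn^2 + \Theta(3^{\log_4 n}) \\
     &= \frac{1}{1 - (3/16)}\, cn^2 + \Theta(3^{\log_4 n}) && \text{using } \sum_{k=0}^{\infty} x^k = \frac{1}{1-x} \text{ with } x = 3/16 \\
     &= \frac{16}{13}\, cn^2 + \Theta(3^{\log_4 n})
\end{aligned}$$

What is Θ(3^{log_4 n})?

$$3^{\log_4 n} = 4^{\log_4 3^{\log_4 n}} = 4^{\log_4 n \,\cdot\, \log_4 3} = \left(4^{\log_4 n}\right)^{\log_4 3} = n^{\log_4 3}$$

$$T(n) \le \frac{16}{13}\, cn^2 + \Theta(n^{\log_4 3})$$

$$T(n) = O(n^2)$$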
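The level-by-level sum can be reproduced numerically. The sketch below assumes c = 1, the base case T(1) = 1, and n a power of four so that every division is exact. It compares the recurrence with the tree total (internal levels plus leaves) and with the 16/13 bound.

```python
def T(n):
    """T(n) = 3T(n/4) + n^2 for n a power of four; assumed T(1) = 1."""
    if n == 1:
        return 1
    return 3 * T(n // 4) + n * n


def tree_total(n, depth):
    """Sum the recursion tree: 3^i nodes of cost (n/4^i)^2 at level i, plus leaves."""
    internal = sum(3 ** i * (n // 4 ** i) ** 2 for i in range(depth))
    leaves = 3 ** depth  # 3^(log_4 n) leaves, each costing T(1) = 1
    return internal + leaves


for depth in range(1, 9):
    n = 4 ** depth
    assert T(n) == tree_total(n, depth)
    assert T(n) <= (16 / 13) * n * n + 3 ** depth  # geometric-series bound
print(T(4), T(16))  # 19 313
```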
Verify solution using substitution

$$T(n) = 3T(n/4) + n^2$$

Assume T(k) = O(k^2) for all k < n

Show that T(n) = O(n^2)

Definition of big-O:

$$O(g(n)) = \{\, f(n) : \text{there exist positive constants } c \text{ and } n_0 \text{ such that } 0 \le f(n) \le c\,g(n) \text{ for all } n \ge n_0 \,\}$$

Given that T(n/4) = O((n/4)^2), we have T(n/4) ≤ c(n/4)^2

To show that T(n) = O(n^2) we need to identify
the appropriate constants, i.e. some constant c such that T(n) ≤ cn^2:

$$\begin{aligned}
T(n) &= 3T(n/4) + n^2 \\
     &\le 3c(n/4)^2 + n^2 \\
     &= \frac{3}{16}\, cn^2 + n^2 \\
     &\le cn^2 && \text{if } c \ge \frac{16}{13}
\end{aligned}$$
Master Method
Provides solutions to the recurrences of the form:
$$T(n) = aT(n/b) + f(n)$$

Case 1: if $f(n) = O(n^{\log_b a - \varepsilon})$ for $\varepsilon > 0$, then $T(n) = \Theta(n^{\log_b a})$

Case 2: if $f(n) = \Theta(n^{\log_b a})$, then $T(n) = \Theta(n^{\log_b a} \log n)$

Case 3: if $f(n) = \Omega(n^{\log_b a + \varepsilon})$ for $\varepsilon > 0$ and $a f(n/b) \le c f(n)$ for $c < 1$,
then $T(n) = \Theta(f(n))$
$$T(n) = 16T(n/4) + n$$

a = 16, b = 4, f(n) = n
$n^{\log_b a} = n^{\log_4 16} = n^2$

is $n = O(n^{2-\varepsilon})$? Yes
is $n = \Theta(n^2)$? No
is $n = \Omega(n^{2+\varepsilon})$? No

Case 1: Θ(n^2)
$$T(n) = T(n/2) + 2^n$$

a = 1, b = 2, f(n) = 2^n
$n^{\log_b a} = n^{\log_2 1} = n^0$

is $2^n = O(n^{0-\varepsilon})$? No
is $2^n = \Theta(n^0)$? No
is $2^n = \Omega(n^{0+\varepsilon})$? Yes

Case 3? We still need the regularity condition:
is $2^{n/2} \le c\,2^n$ for $c < 1$?

Let c = 1/2

$$2^{n/2} \le \tfrac{1}{2}\, 2^n = 2^{-1}\, 2^n = 2^{n-1}$$

which holds for n ≥ 2, so T(n) = Θ(2^n)
$$T(n) = 2T(n/2) + n$$

a = 2, b = 2, f(n) = n
$n^{\log_b a} = n^{\log_2 2} = n^1$

is $n = O(n^{1-\varepsilon})$? No
is $n = \Theta(n^1)$? Yes
is $n = \Omega(n^{1+\varepsilon})$? No

Case 2: Θ(n log n)
$$T(n) = 16T(n/4) + n!$$

a = 16, b = 4, f(n) = n!
$n^{\log_b a} = n^{\log_4 16} = n^2$

is $n! = O(n^{2-\varepsilon})$? No
is $n! = \Theta(n^2)$? No
is $n! = \Omega(n^{2+\varepsilon})$? Yes

Case 3, provided the regularity condition holds:
is $16\,(n/4)! \le c\,n!$ for $c < 1$?

Let c = 1/2

$$c\,n! = \tfrac{1}{2}\, n! \ge (n/2)!$$

therefore, for sufficiently large n,

$$16\,(n/4)! \le (n/2)! \le \tfrac{1}{2}\, n!$$

T(n) = Θ(n!)

$$T(n) = \sqrt{2}\, T(n/2) + \log n$$

a = √2, b = 2, f(n) = log n
$n^{\log_b a} = n^{\log_2 \sqrt{2}} = n^{\log_2 2^{1/2}} = n^{1/2} = \sqrt{n}$

is $\log n = O(n^{1/2-\varepsilon})$? Yes
is $\log n = \Theta(n^{1/2})$? No
is $\log n = \Omega(n^{1/2+\varepsilon})$? No

Case 1: Θ(√n)
$$T(n) = 4T(n/2) + n$$

a = 4, b = 2, f(n) = n
$n^{\log_b a} = n^{\log_2 4} = n^2$

is $n = O(n^{2-\varepsilon})$? Yes
is $n = \Theta(n^2)$? No
is $n = \Omega(n^{2+\varepsilon})$? No

Case 1: Θ(n^2)
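The case analysis in these examples can be automated when the driving function is a plain power, f(n) = n^k. The helper below, `master_polynomial`, is a hypothetical name for this sketch and deliberately does not handle other forms such as 2^n, n!, or log n. For polynomial f the regularity condition of Case 3 holds automatically, because a f(n/b) = (a / b^k) n^k and a / b^k < 1 whenever k > log_b a.

```python
import math


def master_polynomial(a, b, k):
    """Classify T(n) = a*T(n/b) + n^k; return (case number, bound as text)."""
    crit = math.log(a, b)  # the critical exponent log_b a
    if math.isclose(k, crit):
        return 2, f"Theta(n^{k:g} log n)"
    if k < crit:
        return 1, f"Theta(n^{crit:g})"
    return 3, f"Theta(n^{k:g})"  # regularity holds for polynomial f


print(master_polynomial(16, 4, 1))  # (1, 'Theta(n^2)')
print(master_polynomial(2, 2, 1))   # (2, 'Theta(n^1 log n)')
print(master_polynomial(4, 2, 1))   # (1, 'Theta(n^2)')
```

The three calls reproduce the polynomial examples worked above: 16T(n/4) + n, 2T(n/2) + n, and 4T(n/2) + n.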
Recurrences
Solve each of the following:

$$T(n) = 2T(n/3) + d$$
$$T(n) = 7T(n/7) + n$$
$$T(n) = T(n-1) + \log n$$
$$T(n) = 8T(n/2) + n^3$$
