CSE408, Lecture #17
Divide-and-Conquer: Counting, Radix & Bucket Sort
21-Feb-20

How Fast Can We Sort?

• Selection Sort, Bubble Sort, Insertion Sort: O(n²)
• Heap Sort, Merge Sort: O(n lg n)
• Quicksort: O(n lg n) average
• What is common to all these algorithms?
  – They make comparisons between input elements:
    ai < aj, ai ≤ aj, ai = aj, ai ≥ aj, or ai > aj

Can We Do Better?

• Linear sorting algorithms:
  – Counting Sort
  – Radix Sort
  – Bucket Sort
• They make certain assumptions about the data
• Linear sorts are NOT "comparison sorts"

Counting Sort

• Assumptions:
  – n integers in the range [0 .. r]
  – r is on the order of n, that is, r = O(n)
• Idea:
  – For each element x, find the number of elements ≤ x
  – Place x into its correct position in the output array

Step 1: Count how many times each value occurs (i.e., frequencies), storing
the counts in an array C (example with r = 6).

Step 2: Compute cumulative sums

     1 2 3 4 5 6 7 8          0 1 2 3 4 5
A    2 5 3 0 2 3 0 3     C    2 0 2 3 0 1   (frequencies)
                         Cnew 2 2 4 7 7 8   (cumulative sums)

Algorithm

• Start from the last element of A
• Place A[i] at its correct place in the output array B: B[Cnew[A[i]]] ← A[i]
• Decrease Cnew[A[i]] by one

Example

A = [2, 5, 3, 0, 2, 3, 0, 3], Cnew = [2, 2, 4, 7, 7, 8]
• A[8] = 3: place 3 at B[7]; Cnew becomes [2, 2, 4, 6, 7, 8]
• A[7] = 0: place 0 at B[2]; Cnew becomes [1, 2, 4, 6, 7, 8]
• A[6] = 3: place 3 at B[6]; Cnew becomes [1, 2, 4, 5, 7, 8]
• A[5] = 2: place 2 at B[4]; Cnew becomes [1, 2, 3, 5, 7, 8]


Example (cont.)

• Continuing: B = [0, 0, _, 2, _, 3, 3, _], then 3 is placed at B[5] and 5 at B[8],
  giving B = [0, 0, _, 2, 3, 3, 3, 5], and finally 2 at B[3]:
  B = [0, 0, 2, 2, 3, 3, 3, 5]

COUNTING-SORT

Alg.: COUNTING-SORT(A, B, n, r)
1.  for i ← 0 to r
2.     do C[i] ← 0
3.  for j ← 1 to n
4.     do C[A[j]] ← C[A[j]] + 1
5.  ► C[i] contains the number of elements equal to i
6.  for i ← 1 to r
7.     do C[i] ← C[i] + C[i - 1]
8.  ► C[i] contains the number of elements ≤ i
9.  for j ← n downto 1
10.    do B[C[A[j]]] ← A[j]
11.       C[A[j]] ← C[A[j]] - 1

Analysis of Counting Sort

• Lines 1-2 (initialize C): O(r)
• Lines 3-4 (count frequencies): O(n)
• Lines 6-7 (cumulative sums): O(r)
• Lines 9-11 (place elements): O(n)
• Overall time: O(n + r)
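A runnable Python sketch of the pseudocode above (0-indexed; function and
variable names are ours, not from the slides):

def counting_sort(A, r):
    """Sort a list A of integers in the range [0, r]; returns a new list."""
    n = len(A)
    C = [0] * (r + 1)
    for x in A:                      # count frequencies
        C[x] += 1
    for i in range(1, r + 1):        # cumulative sums: C[i] = #elements <= i
        C[i] += C[i - 1]
    B = [0] * n
    for j in range(n - 1, -1, -1):   # walk A from the end (keeps it stable)
        C[A[j]] -= 1                 # convert to a 0-indexed output slot
        B[C[A[j]]] = A[j]
    return B

print(counting_sort([2, 5, 3, 0, 2, 3, 0, 3], r=5))   # [0, 0, 2, 2, 3, 3, 3, 5]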

Analysis of Counting Sort (cont.)

• Overall time: O(n + r)
• In practice we use counting sort when r = O(n)
  ⇒ running time is O(n)

Radix Sort

• Represents keys as d-digit numbers in some base k:
  key = x1 x2 ... xd, where 0 ≤ xi ≤ k - 1
• Example: key = 15
  – key₁₀ = 15: d = 2, k = 10, 0 ≤ xi ≤ 9
  – key₂ = 1111: d = 4, k = 2, 0 ≤ xi ≤ 1
• Assumptions: d = O(1) and k = O(n)
• Sorting looks at one column at a time
  – For a d-digit number, sort the least significant digit first
  – Continue sorting on the next least significant digit, until all digits have
    been sorted
  – Requires only d passes through the list

RADIX-SORT

Alg.: RADIX-SORT(A, d)
1.  for i ← 1 to d
2.     do use a stable sort to sort array A on digit i
(stable sort: preserves the order of identical elements)

Analysis of Radix Sort

• Given n numbers of d digits each, where each digit may take up to k possible
  values, RADIX-SORT correctly sorts the numbers in O(d(n + k))
• Assuming d = O(1) and k = O(n), the running time is O(n)
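A minimal Python sketch of LSD radix sort, using a stable per-digit
distribution as the inner sort (base k = 10 here; names are ours):

def radix_sort(A, d, k=10):
    """Sort non-negative integers of at most d digits in base k (LSD first)."""
    for i in range(d):                      # digit i = 0 is least significant
        buckets = [[] for _ in range(k)]
        for x in A:                         # stable: appends keep input order
            buckets[(x // k**i) % k].append(x)
        A = [x for b in buckets for x in b]
    return A

print(radix_sort([329, 457, 657, 839, 436, 720, 355], d=3))
# [329, 355, 436, 457, 657, 720, 839]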

Bucket Sort

• Assumption:
  – the input is generated by a random process that distributes elements
    uniformly over [0, 1)
• Idea:
  – Divide [0, 1) into k equal-sized buckets (k = Θ(n))
  – Distribute the n input values into the buckets
  – Sort each bucket (e.g., using quicksort)
  – Go through the buckets in order, listing the elements in each one
• Input: A[1 .. n], where 0 ≤ A[i] < 1 for all i
• Output: elements A[i] sorted


Example: Bucket Sort

A = [.78, .17, .39, .26, .72, .94, .21, .12, .23, .68]

• Distribute into buckets B[0..9] by ⌊10·A[i]⌋:
  B[1] = .17 .12    B[2] = .26 .21 .23    B[3] = .39
  B[6] = .68        B[7] = .78 .72        B[9] = .94
• Sort within each bucket:
  B[1] = .12 .17    B[2] = .21 .23 .26    B[3] = .39
  B[6] = .68        B[7] = .72 .78        B[9] = .94
• Concatenate the lists from 0 to n - 1 together, in order:
  .12 .17 .21 .23 .26 .39 .68 .72 .78 .94

Analysis of Bucket Sort

Alg.: BUCKET-SORT(A, n)
1.  for i ← 1 to n                                  O(n)
2.     do insert A[i] into list B[⌊n·A[i]⌋]
3.  for i ← 0 to k - 1                              k · O((n/k) lg(n/k)) = O(n lg(n/k))
4.     do sort list B[i] with quicksort
5.  concatenate lists B[0], B[1], ..., B[k - 1]     O(k)
    together in order
6.  return the concatenated lists

• Overall: O(n + n lg(n/k) + k); with k = Θ(n) this is O(n)

Radix Sort as a Bucket Sort

• Radix sort can be viewed as repeated bucketing: each pass distributes the
  keys into k buckets by one digit, stably, exactly as in bucket sort.
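A short Python sketch of BUCKET-SORT under the stated uniform-[0, 1)
assumption (we use Python's built-in sort per bucket rather than quicksort;
names are ours):

def bucket_sort(A):
    n = len(A)
    B = [[] for _ in range(n)]       # k = n buckets over [0, 1)
    for x in A:
        B[int(n * x)].append(x)      # bucket index = floor(n * A[i])
    for bucket in B:
        bucket.sort()                # buckets have expected O(1) size
    return [x for bucket in B for x in bucket]   # concatenate in order

print(bucket_sort([.78, .17, .39, .26, .72, .94, .21, .12, .23, .68]))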

Special Types of Trees

• Def: Full binary tree = a binary tree in which each node is either a leaf or
  has degree exactly 2.
• Def: Complete binary tree = a binary tree in which all leaves are on the same
  level and all internal nodes have degree 2.
(Figures: a full binary tree and a complete binary tree on the keys
4, 1, 3, 2, 16, 9, 10, 14, 8, 7.)

Definitions

• Height of a node = the number of edges on the longest simple path from the
  node down to a leaf
• Level of a node = the length of the path from the root to the node
• Height of tree = height of root node
  (e.g., in a tree of height 3, the node holding 2 has height 1 and the node
  holding 10 has level 2; see Ex 6.1-2, page 129)

Useful Properties

• A complete binary tree of height d has
  n = Σ_{l=0}^{d} 2^l = 2^(d+1) - 1 nodes

The Heap Data Structure

• Def: A heap is a nearly complete binary tree with the following two
  properties:
  – Structural property: all levels are full, except possibly the last one, which
    is filled from left to right
  – Order (heap) property: for any node x, Parent(x) ≥ x
• From the heap property, it follows that:
  "The root is the maximum element of the heap!"
• A heap is a binary tree that is filled in order


Array Representation of Heaps

• A heap can be stored as an array A:
  – Root of tree is A[1]
  – Left child of A[i] = A[2i]
  – Right child of A[i] = A[2i + 1]
  – Parent of A[i] = A[⌊i/2⌋]
  – heap-size[A] ≤ length[A]
• The elements in the subarray A[(⌊n/2⌋+1) .. n] are leaves

Heap Types

• Max-heaps (largest element at root) have the max-heap property:
  – for all nodes i, excluding the root: A[PARENT(i)] ≥ A[i]
• Min-heaps (smallest element at root) have the min-heap property:
  – for all nodes i, excluding the root: A[PARENT(i)] ≤ A[i]

Adding/Deleting Nodes

• New nodes are always inserted at the bottom level (left to right)
• Nodes are removed from the bottom level (right to left)

Operations on Heaps

• Maintain/restore the max-heap property
  – MAX-HEAPIFY
• Create a max-heap from an unordered array
  – BUILD-MAX-HEAP
• Sort an array in place
  – HEAPSORT
• Priority queues

Maintaining the Heap Property

• Suppose a node is smaller than a child, while the left and right subtrees of i
  are max-heaps
• To eliminate the violation:
  – Exchange with the larger child
  – Move down the tree
  – Continue until the node is not smaller than its children

Example: MAX-HEAPIFY(A, 2, 10)

• A[2] violates the heap property ⇒ exchange A[2] ↔ A[4]
• Now A[4] violates the heap property ⇒ exchange A[4] ↔ A[9]
• Heap property restored

Maintaining the Heap Property

• Assumptions:
  – Left and Right subtrees of i are max-heaps
  – A[i] may be smaller than its children

Alg.: MAX-HEAPIFY(A, i, n)
1.  l ← LEFT(i)
2.  r ← RIGHT(i)
3.  if l ≤ n and A[l] > A[i]
4.     then largest ← l
5.     else largest ← i
6.  if r ≤ n and A[r] > A[largest]
7.     then largest ← r
8.  if largest ≠ i
9.     then exchange A[i] ↔ A[largest]
10.         MAX-HEAPIFY(A, largest, n)

MAX-HEAPIFY Running Time

• Intuitively, the violation moves down one level per exchange, so the work is
  proportional to the height h of node i: O(h)
• Since the height of the heap is ⌊lg n⌋, the running time of MAX-HEAPIFY is
  O(lg n)

Building a Heap

• Convert an array A[1 .. n] into a max-heap (n = length[A])
• The elements in the subarray A[(⌊n/2⌋+1) .. n] are leaves
• Apply MAX-HEAPIFY on elements between 1 and ⌊n/2⌋

Alg.: BUILD-MAX-HEAP(A)
1.  n = length[A]
2.  for i ← ⌊n/2⌋ downto 1
3.     do MAX-HEAPIFY(A, i, n)

A: 4 1 3 2 16 9 10 14 8 7
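The same procedures as a runnable Python sketch (0-indexed, so
LEFT(i) = 2i + 1 and RIGHT(i) = 2i + 2; names are ours):

def max_heapify(A, i, n):
    """Sift A[i] down within A[0..n-1], assuming both subtrees are max-heaps."""
    l, r = 2 * i + 1, 2 * i + 2
    largest = l if l < n and A[l] > A[i] else i
    if r < n and A[r] > A[largest]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        max_heapify(A, largest, n)

def build_max_heap(A):
    n = len(A)
    for i in range(n // 2 - 1, -1, -1):   # internal nodes, bottom-up
        max_heapify(A, i, n)

A = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
build_max_heap(A)
print(A)   # [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]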


Example: A = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]

• i = 5: node 5 (value 16) already dominates its child, no change
• i = 4: node 4 (value 2) is exchanged with 14
• i = 3: node 3 (value 3) is exchanged with 10
• i = 2: node 2 (value 1) sifts down past 16, then 7
• i = 1: node 1 (value 4) sifts down past 16, 14, and 8
• Result: A = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]

Running Time of BUILD-MAX-HEAP

Alg.: BUILD-MAX-HEAP(A)
1.  n = length[A]
2.  for i ← ⌊n/2⌋ downto 1        O(n) iterations
3.     do MAX-HEAPIFY(A, i, n)    O(lg n) each

⇒ Running time: O(n lg n)
• This is not an asymptotically tight upper bound.

• HEAPIFY takes O(h) ⇒ the cost of HEAPIFY on a node is proportional to that
  node's height in the tree. With hᵢ = h - i (the height of a node at level i) and
  nᵢ = 2^i (the number of nodes at level i):

  T(n) = Σ_{i=0}^{h} nᵢ hᵢ = Σ_{i=0}^{h} 2^i (h - i) = O(n)

  (e.g., for a heap of height h = 3: level 0 has 2⁰ nodes of height 3, level 1
  has 2¹ nodes of height 2, ..., level 3 has 2³ nodes of height 0)

Running Time of BUILD-MAX-HEAP (cont.)

T(n) = Σ_{i=0}^{h} nᵢ hᵢ              cost of HEAPIFY at level i × number of nodes at that level
     = Σ_{i=0}^{h} 2^i (h - i)         replace the values of nᵢ and hᵢ computed before
     = Σ_{i=0}^{h} 2^h (h - i)/2^(h-i) multiply and divide by 2^h, writing 2^i as 2^h / 2^(h-i)
     = 2^h Σ_{k=0}^{h} k/2^k           change variables: k = h - i
     ≤ n Σ_{k=0}^{∞} k/2^k             the finite sum is smaller than the infinite one, and 2^h ≤ n
     = O(n)                            the infinite sum equals 2

Running time of BUILD-MAX-HEAP: T(n) = O(n)

Heapsort

• Goal:
  – Sort an array using heap representations
• Idea:
  – Build a max-heap from the array
  – Swap the root (the maximum element) with the last element in the array
  – "Discard" this last node by decreasing the heap size
  – Call MAX-HEAPIFY on the new root
  – Repeat this process until only one node remains

Example: A = [7, 4, 3, 1, 2]

• Swap A[1] = 7 with A[5], then MAX-HEAPIFY(A, 1, 4)
• Swap the new root with A[4], then MAX-HEAPIFY(A, 1, 3)
• Swap with A[3], then MAX-HEAPIFY(A, 1, 2)
• Swap with A[2], then MAX-HEAPIFY(A, 1, 1)

Alg.: HEAPSORT(A)
1.  BUILD-MAX-HEAP(A)                  O(n)
2.  for i ← length[A] downto 2         n - 1 times
3.     do exchange A[1] ↔ A[i]
4.        MAX-HEAPIFY(A, 1, i - 1)     O(lg n)

• Running time: O(n lg n); it can be shown to be Θ(n lg n)

Operations on Priority Queues

• Max-priority queues support the following operations:
  – INSERT(S, x): inserts element x into set S
  – EXTRACT-MAX(S): removes and returns the element of S with the largest key
  – MAXIMUM(S): returns the element of S with the largest key
  – INCREASE-KEY(S, x, k): increases the value of element x's key to k
    (assume k ≥ x's current key value)
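A Python sketch of HEAPSORT built on the max_heapify/build_max_heap
sketch above:

def heapsort(A):
    build_max_heap(A)
    for end in range(len(A) - 1, 0, -1):
        A[0], A[end] = A[end], A[0]   # move the max to its final place
        max_heapify(A, 0, end)        # restore the heap on A[0..end-1]

A = [7, 4, 3, 1, 2]
heapsort(A)
print(A)   # [1, 2, 3, 4, 7]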


HEAP-MAXIMUM

Goal:
  – Return the largest element of the heap

Alg.: HEAP-MAXIMUM(A)
1.  return A[1]

Running time: O(1) — the root is the largest element.

HEAP-EXTRACT-MAX

Goal:
  – Extract the largest element of the heap (i.e., return the max value and
    also remove that element from the heap)
Idea:
  – Exchange the root element with the last
  – Decrease the size of the heap by 1 element
  – Call MAX-HEAPIFY on the new root, on a heap of size n - 1

Example: HEAP-EXTRACT-MAX

• max = 16 is returned; the last element (1) moves to the root; the heap size
  is decreased by 1; MAX-HEAPIFY(A, 1, n - 1) then restores the heap, with 14
  as the new root

Alg.: HEAP-EXTRACT-MAX(A, n)
1.  if n < 1
2.     then error "heap underflow"
3.  max ← A[1]
4.  A[1] ← A[n]
5.  MAX-HEAPIFY(A, 1, n - 1)     ► remakes heap
6.  return max

Running time: O(lg n)

HEAP-INCREASE-KEY

• Goal:
  – Increase the key of an element i in the heap
• Idea:
  – Increment the key of A[i] to its new value
  – If the max-heap property does not hold anymore: traverse a path toward
    the root to find the proper place for the newly increased key

Example: HEAP-INCREASE-KEY

• Key[i] ← 15: the key of node i (value 4) becomes 15; 15 > its parent 8, so
  they are exchanged; then 15 > its new parent 14, so they are exchanged
  again; the max-heap property is restored

Alg.: HEAP-INCREASE-KEY(A, i, key)
1.  if key < A[i]
2.     then error "new key is smaller than current key"
3.  A[i] ← key
4.  while i > 1 and A[PARENT(i)] < A[i]
5.     do exchange A[i] ↔ A[PARENT(i)]
6.        i ← PARENT(i)

• Running time: O(lg n)

MAX-HEAP-INSERT

• Goal:
  – Insert a new element into a max-heap
• Idea:
  – Expand the max-heap with a new element whose key is -∞
  – Call HEAP-INCREASE-KEY to set the key of the new node to its correct
    value and maintain the max-heap property

Example: MAX-HEAP-INSERT — insert value 15

• Start by inserting a node with key -∞ at the bottom level
• Call HEAP-INCREASE-KEY on A[11] = 15
• The key bubbles up past 7 and 14, producing the restored heap containing
  the newly added element
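The max-priority-queue operations in the same 0-indexed Python style (a
sketch; heap_increase_key mirrors the pseudocode above):

def heap_extract_max(A):
    if len(A) < 1:
        raise IndexError("heap underflow")
    A[0], A[-1] = A[-1], A[0]
    maximum = A.pop()                 # remove the old root
    max_heapify(A, 0, len(A))
    return maximum

def heap_increase_key(A, i, key):
    if key < A[i]:
        raise ValueError("new key is smaller than current key")
    A[i] = key
    while i > 0 and A[(i - 1) // 2] < A[i]:   # bubble up toward the root
        A[i], A[(i - 1) // 2] = A[(i - 1) // 2], A[i]
        i = (i - 1) // 2

def max_heap_insert(A, key):
    A.append(float("-inf"))           # expand the heap with key -infinity
    heap_increase_key(A, len(A) - 1, key)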


Alg.: MAX-HEAP-INSERT(A, key, n)
1.  heap-size[A] ← n + 1
2.  A[n + 1] ← -∞
3.  HEAP-INCREASE-KEY(A, n + 1, key)

Running time: O(lg n)

Summary

• We can perform the following operations on heaps:
  – MAX-HEAPIFY           O(lg n)
  – BUILD-MAX-HEAP        O(n)
  – HEAP-SORT             O(n lg n)
  – MAX-HEAP-INSERT       O(lg n)
  – HEAP-EXTRACT-MAX      O(lg n)
  – HEAP-INCREASE-KEY     O(lg n)
  – HEAP-MAXIMUM          O(1)

Why Study Sorting Algorithms?

• There are a variety of situations that we can encounter
  – Do we have randomly ordered keys?
  – Are all keys distinct?
  – How large is the set of keys to be ordered?
  – Need guaranteed performance?
• Various algorithms are better suited to some of these situations

Some Definitions

• Internal Sort
  – The data to be sorted is all stored in the computer's main memory.
• External Sort
  – Some of the data to be sorted might be stored in some external, slower
    device.
• In-Place Sort
  – The amount of extra space required to sort the data is constant in the
    input size.

Bubble Sort

• Idea:
  – Repeatedly pass through the array
  – Swap adjacent elements that are out of order
• Easier to implement, but slower than insertion sort

Example (first pass, i = 1, on 8 4 6 9 2 3 1; j runs from the right):

8 4 6 9 2 3 1  →  8 4 6 9 2 1 3  →  8 4 6 9 1 2 3  →  8 4 6 1 9 2 3
→  8 4 1 6 9 2 3  →  8 1 4 6 9 2 3  →  1 8 4 6 9 2 3

Subsequent passes (i = 2, 3, ...) bubble the next-smallest elements to the
front: 1 2 8 4 6 9 3, then 1 2 3 8 4 6 9, ..., until the array is 1 2 3 4 6 8 9.

Bubble Sort

Alg.: BUBBLESORT(A)                                       cost
1.  for i ← 1 to length[A]                                c1
2.     do for j ← length[A] downto i + 1                  c2
3.           do if A[j] < A[j - 1]                        c3
4.                 then exchange A[j] ↔ A[j - 1]          c4

• Comparisons: ≈ n²/2      • Exchanges: ≈ n²/2

Bubble-Sort Running Time

T(n) = c1(n+1) + c2 Σ_{i=1}^{n} (n - i + 1) + c3 Σ_{i=1}^{n} (n - i)
       + c4 Σ_{i=1}^{n} (n - i)
     = Θ(n) + (c2 + c3 + c4) Σ_{i=1}^{n} (n - i)

where Σ_{i=1}^{n} (n - i) = n² - n(n+1)/2 = n²/2 - n/2

Thus, T(n) = Θ(n²)
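The same algorithm as a direct Python transcription (0-indexed; names are
ours):

def bubble_sort(A):
    n = len(A)
    for i in range(n):
        for j in range(n - 1, i, -1):     # bubble the smallest of A[i..] to A[i]
            if A[j] < A[j - 1]:
                A[j], A[j - 1] = A[j - 1], A[j]

A = [8, 4, 6, 9, 2, 3, 1]
bubble_sort(A)
print(A)   # [1, 2, 3, 4, 6, 8, 9]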


Selection Sort

• Idea:
  – Find the smallest element in the array
  – Exchange it with the element in the first position
  – Find the second smallest element and exchange it with the element in the
    second position
  – Continue until the array is sorted
• Disadvantage:
  – Running time depends only slightly on the amount of order in the file

Example

8 4 6 9 2 3 1  →  1 4 6 9 2 3 8  →  1 2 6 9 4 3 8  →  1 2 3 9 4 6 8
→  1 2 3 4 9 6 8  →  1 2 3 4 6 9 8  →  1 2 3 4 6 8 9

Alg.: SELECTION-SORT(A)                          cost   times
1.  n ← length[A]                                c1     1
2.  for j ← 1 to n - 1                           c2     n
3.     do smallest ← j                           c3     n - 1
4.        for i ← j + 1 to n                     c4     Σ_{j=1}^{n-1} (n - j + 1)
5.           do if A[i] < A[smallest]            c5     Σ_{j=1}^{n-1} (n - j)
6.                 then smallest ← i             c6     Σ_{j=1}^{n-1} (n - j)
7.        exchange A[j] ↔ A[smallest]            c7     n - 1

• ≈ n²/2 comparisons, n - 1 exchanges

Analysis of Selection Sort

T(n) = c1 + c2·n + c3(n - 1) + c4 Σ_{j=1}^{n-1} (n - j + 1)
       + c5 Σ_{j=1}^{n-1} (n - j) + c6 Σ_{j=1}^{n-1} (n - j) + c7(n - 1)
     = Θ(n²)

Insertion Sort

• Iteration i. Repeatedly swap element i with the one to its left if smaller.
• Property. After the ith iteration, a[0] through a[i] contain the first i + 1
  elements in ascending order.

Example (array indices 0-9); one line per completed iteration:

Start:        2.78 7.42 0.56 1.12 1.17 0.32 6.21 4.42 3.14 7.71
Iteration 2:  0.56 2.78 7.42 1.12 1.17 0.32 6.21 4.42 3.14 7.71
Iteration 3:  0.56 1.12 2.78 7.42 1.17 0.32 6.21 4.42 3.14 7.71
Iteration 4:  0.56 1.12 1.17 2.78 7.42 0.32 6.21 4.42 3.14 7.71
Iteration 5:  0.32 0.56 1.12 1.17 2.78 7.42 6.21 4.42 3.14 7.71
Iteration 6:  0.32 0.56 1.12 1.17 2.78 6.21 7.42 4.42 3.14 7.71
Iteration 7:  0.32 0.56 1.12 1.17 2.78 4.42 6.21 7.42 3.14 7.71
Iteration 8:  0.32 0.56 1.12 1.17 2.78 3.14 4.42 6.21 7.42 7.71
Iterations 9-10: no swaps needed. DONE.

Each iteration i takes one step per larger element to the left of a[i]; e.g.,
iteration 5 moves 0.32 to the front in five swaps, while iterations 0, 1, and 9
do nothing.
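The same swap-based variant in Python (a sketch; names are ours):

def insertion_sort(a):
    for i in range(1, len(a)):
        j = i
        while j > 0 and a[j] < a[j - 1]:   # swap element i leftward while smaller
            a[j], a[j - 1] = a[j - 1], a[j]
            j -= 1

a = [2.78, 7.42, 0.56, 1.12, 1.17, 0.32, 6.21, 4.42, 3.14, 7.71]
insertion_sort(a)
print(a)   # [0.32, 0.56, 1.12, 1.17, 2.78, 3.14, 4.42, 6.21, 7.42, 7.71]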

Other Methods (solving recurrences)

• When analyzing algorithms, linear homogeneous recurrences of order
  greater than 2 hardly ever arise in practice
• We briefly describe two unfolding methods that work for a lot of cases
  – Backward substitution: this works exactly as its name suggests. Starting
    from the equation itself, work backwards, substituting values of the
    function for previous ones
  – Recurrence trees: just as powerful, but perhaps more intuitive, this
    method involves mapping out the recurrence tree for an equation.
    Starting from the equation, you unfold each recursive call to the function
    and calculate the non-recursive cost at each level of the tree. Then, you
    find a general formula for each level and take a summation over all such
    levels

Backward Substitution: Example

• Give a solution to T(n) = T(n-1) + 2n, where T(1) = 5
• We begin by unfolding the recursion by a simple substitution of the function
  values. We observe that
  T(n-1) = T((n-1) - 1) + 2(n-1) = T(n-2) + 2(n-1)
• Substituting into the original equation:
  T(n) = T(n-2) + 2(n-1) + 2n
• If we continue to do that, we get
  T(n) = T(n-3) + 2(n-2) + 2(n-1) + 2n
  T(n) = T(n-4) + 2(n-3) + 2(n-2) + 2(n-1) + 2n
  ...
  T(n) = T(n-i) + Σ_{j=0}^{i-1} 2(n - j)     ← the function's value at the ith iteration
• Solving the sum, we get
  T(n) = T(n-i) + 2ni - i(i-1) = T(n-i) + 2ni - i² + i
• We want to get rid of the recursive term. To do that, we need to know at
  what iteration we reach the base case, i.e., for what value of i can we use
  the initial condition T(1) = 5? We get the base case when n - i = 1, or
  i = n - 1
• Substituting i = n - 1 into the equation above:
  T(n) = 5 + 2n(n-1) - (n-1)² + (n-1)
       = 5 + 2n² - 2n - (n² - 2n + 1) + n - 1
       = n² + n + 3
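A quick Python check (ours, not from the slides) that the closed form matches
the recurrence:

def T(n):                      # the recurrence itself
    return 5 if n == 1 else T(n - 1) + 2 * n

assert all(T(n) == n * n + n + 3 for n in range(1, 50))
print("closed form n^2 + n + 3 verified")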


Recurrence Trees

• When using recurrence trees, we graphically represent the recursion
• Each node in the tree is an instance of the function. As we progress
  downward, the size of the input decreases
• The contribution of each level to the function is equivalent to the number of
  nodes at that level times the non-recursive cost on the size of the input at
  that level
• The tree ends at the depth at which we reach the base case
• As an example, we consider a recursive function of the form
  T(n) = αT(n/β) + f(n),  T(δ) = c

Iteration and cost per level:
  level 0:  one node T(n), cost f(n)
  level 1:  α nodes T(n/β), cost α·f(n/β)
  level 2:  α² nodes T(n/β²), cost α²·f(n/β²)
  ...
  level i:  αⁱ nodes, cost αⁱ·f(n/βⁱ)
  ... down to level log_β n

• The total value of the function is the summation over all levels of the tree
• Consider the following concrete example: T(n) = 2T(n/2) + n, T(1) = 4

Recurrence Tree: Example (cont.)

For T(n) = 2T(n/2) + n, the non-recursive cost is n at level 0, 2·(n/2) = n at
level 1, 4·(n/4) = n at level 2, and in general 2ⁱ·(n/2ⁱ) = n at level i, down to
level log₂ n.

• The value of the function is the summation of the values of all levels
• We treat the last level as a special case since its non-recursive cost is
  different

Master Theorem

Outline

• Motivation
• The Master Theorem
  – Pitfalls
  – 3 examples
• 4th Condition
  – 1 example

Motivation: Asymptotic Behavior of Recursive Algorithms

• When analyzing algorithms, recall that we only care about the asymptotic
  behavior
• Recursive algorithms are no different
• Rather than solving exactly the recurrence relation associated with the cost
  of an algorithm, it is sufficient to give an asymptotic characterization
• The main tool for doing this is the master theorem


Master Theorem

• Let T(n) be a monotonically increasing function that satisfies
  T(n) = a T(n/b) + f(n),  T(1) = c
  where a ≥ 1, b ≥ 2, c > 0. If f(n) ∈ Θ(n^d) where d ≥ 0, then

  T(n) = Θ(n^d)            if a < b^d
  T(n) = Θ(n^d log n)      if a = b^d
  T(n) = Θ(n^(log_b a))    if a > b^d

Master Theorem: Pitfalls

• You cannot use the Master Theorem if
  – T(n) is not monotone, e.g., T(n) = sin(n)
  – f(n) is not a polynomial, e.g., T(n) = 2T(n/2) + 2ⁿ
  – b cannot be expressed as a constant, e.g., T(n) = T(√n) + n
• Note that the Master Theorem does not solve the recurrence equation

Master Theorem: Example 1

• Let T(n) = T(n/2) + ½n² + n. What are the parameters?
  a = 1, b = 2, d = 2
• Therefore, which condition applies?
  1 < 2², so case 1 applies
• We conclude that T(n) ∈ Θ(n^d) = Θ(n²)
• Does the base case remain a concern?

Master Theorem: Example 2

• Let T(n) = 2T(n/4) + √n + 42. What are the parameters?
  a = 2, b = 4, d = 1/2
• Therefore, which condition applies?
  2 = 4^(1/2), so case 2 applies
• We conclude that T(n) ∈ Θ(n^(1/2) log n) = Θ(√n log n)

Master Theorem: Example 3

• Let T(n) = 3T(n/2) + (3/4)n + 1. What are the parameters?
  a = 3, b = 2, d = 1
• Therefore, which condition applies?
  3 > 2¹, so case 3 applies
• We conclude that T(n) ∈ Θ(n^(log₂ 3))
• Note that log₂ 3 ≈ 1.584...; can we say that T(n) ∈ Θ(n^1.584)?
  No, because log₂ 3 ≈ 1.5849... and n^1.584 ∉ Θ(n^1.5849)

Outline

• Motivation
• The Master Theorem
  – Pitfalls
  – 3 examples
• 4th Condition
  – 1 example

'Fourth' Condition

• Recall that we cannot use the Master Theorem if f(n), the non-recursive
  cost, is not a polynomial
• There is a limited 4th condition of the Master Theorem that allows us to
  consider polylogarithmic functions
• Corollary: If f(n) ∈ Θ(n^(log_b a) log^k n) for some k ≥ 0, then
  T(n) ∈ Θ(n^(log_b a) log^(k+1) n)
• This final condition is fairly limited, and we present it merely for the sake
  of completeness. Relax!

'Fourth' Condition: Example

• Say we have the following recurrence relation:
  T(n) = 2T(n/2) + n log n
• Clearly, a = 2, b = 2, but f(n) is not a polynomial. However, we have
  f(n) ∈ Θ(n log n), i.e., k = 1
• Therefore, by the 4th condition of the Master Theorem, we can say that
  T(n) ∈ Θ(n log² n)

Summary

• Motivation
• The Master Theorem
  – Pitfalls
  – 3 examples
• 4th Condition
  – 1 example


Divide-and-Conquer

The most well-known algorithm design technique:
1. Divide an instance of the problem into two or more smaller instances
   (the recursive case)
2. Solve the smaller instances independently and recursively
   (when to stop? the base case)
3. Obtain a solution to the original (larger) instance by combining these
   solutions

Divide-and-Conquer Examples

• Sorting: mergesort and quicksort
• Maximum subarray problem
• Multiplication of large integers
• Closest-pair problem
• Matrix multiplication: Strassen's algorithm (reading)

Mergesort

• Question #1: what makes mergesort distinct from many other sorting
  algorithms? (internal and external algorithm)
• Question #2: how to design mergesort using the divide-and-conquer
  technique?

• Split array A[0..n-1] into two about-equal halves and make copies of each
  half in arrays B and C
• Sort arrays B and C recursively (Q: when to stop?)
• Merge sorted arrays B and C into array A as follows:
  – Repeat the following until no elements remain in one of the arrays:
    • compare the first elements in the remaining unprocessed portions of
      the arrays
    • copy the smaller of the two into A, while incrementing the index
      indicating the unprocessed portion of that array
  – Once all elements in one of the arrays are processed, copy the remaining
    unprocessed elements from the other array into A.
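A runnable Python sketch of this scheme (returns a new list rather than
merging in place; names are ours):

def mergesort(A):
    if len(A) <= 1:              # base case: when to stop
        return A
    mid = len(A) // 2
    B = mergesort(A[:mid])       # copy and sort each half
    C = mergesort(A[mid:])
    merged, i, j = [], 0, 0
    while i < len(B) and j < len(C):    # compare front elements
        if B[i] <= C[j]:
            merged.append(B[i]); i += 1
        else:
            merged.append(C[j]); j += 1
    merged.extend(B[i:])                # copy the leftovers
    merged.extend(C[j:])
    return merged

print(mergesort([8, 3, 2, 9, 7, 1, 5, 4]))   # [1, 2, 3, 4, 5, 7, 8, 9]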

Mergesort Example

8 3 2 9 7 1 5 4
→ split:  8 3 2 9 | 7 1 5 4
→ split:  8 3 | 2 9 | 7 1 | 5 4
→ merge:  3 8 | 2 9 | 1 7 | 4 5
→ merge:  2 3 8 9 | 1 4 5 7
→ merge:  1 2 3 4 5 7 8 9

Analysis of Mergesort

• Time efficiency by recurrence relation:
  T(n) = 2T(n/2) + f(n), with n - 1 comparisons in the merge operation in the
  worst case ⇒ T(n) = Θ(n log n)
• The number of comparisons in the worst case is close to the theoretical
  minimum for comparison-based sorting:
  ⌈log₂ n!⌉ ≈ n log₂ n - 1.44n  (Section 11.2)
• Space requirement: Θ(n) extra space (not in-place)
• Can be implemented without recursion (bottom-up)

Mergesort: A Big Picture

• Problem: Assume you want to sort X terabytes (even petabytes) of data
  using a cluster of M (even thousands of) computers.
• Solution? Similar to search engines in some aspects: data partitioning!

Quicksort

• Select a pivot (partitioning element): here, the first element, for simplicity!
• Rearrange the list so that all the elements in the first s positions are
  smaller than or equal to the pivot and all the elements in the remaining
  n - s positions are larger than the pivot (see the partitioning algorithm
  below):

  p | A[i] ≤ p | A[i] > p

• Exchange the pivot with the last element in the first (i.e., ≤) subarray; the
  pivot is now in its final position
• Sort the two subarrays recursively

• Basic operation: split/divide
  – Differs from the divide operation in mergesort
  – What is the major difference?

Quicksort Partitioning Algorithm

• Each split places the pivot in its right position, and every element of the
  left sublist is ≤ every element of the right sublist
• No explicit merge is needed (the major difference from mergesort)
• Two scans: from the left, stop at A[i] > p; from the right, stop at A[j] ≤ p;
  swap the out-of-place pair and continue until the scans cross

Quicksort Example

8, 2, 13, 5, 14, 3, 7
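A Python sketch consistent with this scheme: first element as pivot, two
opposite scans, pivot swapped into its final position (names are ours):

def quicksort(A, lo=0, hi=None):
    if hi is None:
        hi = len(A) - 1
    if lo >= hi:
        return
    p = A[lo]                         # pivot: the first element, for simplicity
    i, j = lo, hi + 1
    while True:
        i += 1
        while i <= hi and A[i] < p:   # scan right for A[i] >= p
            i += 1
        j -= 1
        while A[j] > p:               # scan left for A[j] <= p
            j -= 1
        if i >= j:
            break
        A[i], A[j] = A[j], A[i]
    A[lo], A[j] = A[j], A[lo]         # the pivot lands in its final position
    quicksort(A, lo, j - 1)
    quicksort(A, j + 1, hi)

A = [8, 2, 13, 5, 14, 3, 7]
quicksort(A)
print(A)   # [2, 3, 5, 7, 8, 13, 14]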

Analysis of Quicksort

• Best case: split in the middle: Θ(n log n)
• Worst case: sorted array! Θ(n²)
• Average case: random arrays: Θ(n log n)
  – Assume the split can happen in each position with equal probability!
    (See the textbook for details.)
• Improvements:
  – better pivot selection: median-of-three partitioning
  – switch to insertion sort on small sublists
  – elimination of recursion
  In combination, these give a 20-25% improvement
• Considered the method of choice for internal sorting of large files
  (n ≥ 10000)

Multiplication of Large Integers

Consider the problem of multiplying two (large) n-digit integers represented
by arrays of their digits, such as:

A = 12345678901357986429
B = 87654321284820912836


Multiplication of Large Integers

The grade-school algorithm:

        a1 a2 ... an
   ×    b1 b2 ... bn
  (d10) d11 d12 ... d1n
  (d20) d21 d22 ... d2n
  .................
  (dn0) dn1 dn2 ... dnn

Efficiency: n² one-digit multiplications.
Discussion: how to apply divide-and-conquer to this problem?

First Cut

A small example: A × B where A = 2135 and B = 4014
A = (21·10² + 35), B = (40·10² + 14)
So, A × B = (21·10² + 35) × (40·10² + 14)
          = 21×40·10⁴ + (21×14 + 35×40)·10² + 35×14

In general, if A = A1A2 and B = B1B2 (where A and B are n-digit numbers and
A1, A2, B1, B2 are n/2-digit numbers),

A × B = A1×B1·10ⁿ + (A1×B2 + A2×B1)·10^(n/2) + A2×B2

Recurrence for the number of one-digit multiplications T(n):
T(n) = 4T(n/2), T(1) = 1
Solution: T(n) = n²

Second Cut

A × B = A1×B1·10ⁿ + (A1×B2 + A2×B1)·10^(n/2) + A2×B2

The idea is to decrease the number of multiplications from 4 to 3:

(A1 + A2) × (B1 + B2) = A1×B1 + (A1×B2 + A2×B1) + A2×B2,

i.e., (A1×B2 + A2×B1) = (A1 + A2) × (B1 + B2) - A1×B1 - A2×B2,

which requires only 3 multiplications at the expense of (4 - 1) extra add/sub.

Recurrence for the number of multiplications T(n):
T(n) = 3T(n/2), T(1) = 1
Solution: T(n) = 3^(log₂ n) = n^(log₂ 3) ≈ n^1.585

Large-Integer Multiplication (binary version)

To multiply two n-bit integers x = 2^(n/2)·x1 + x0 and y = 2^(n/2)·y1 + y0:
  – Add two ½n-digit integers
  – Multiply three ½n-digit integers
  – Add, subtract, and shift ½n-digit integers to obtain the result

xy = 2ⁿ·x1y1 + 2^(n/2)·(x1y0 + x0y1) + x0y0
   = 2ⁿ·x1y1 + 2^(n/2)·((x1 + x0)(y1 + y0) - x1y1 - x0y0) + x0y0

T(n) ≤ T(⌊n/2⌋) + T(⌈n/2⌉) + T(1 + ⌈n/2⌉) + Θ(n)
        (recursive calls)                  (add, subtract, shift)
⇒ T(n) = O(n^(log₂ 3)) = O(n^1.585)

• Questions:
  – What if the two large numbers have different numbers of digits?
  – What if n is an odd number?
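A compact Python sketch of this idea (Karatsuba's algorithm with decimal
splitting; the helper is ours):

def karatsuba(x, y):
    """Multiply non-negative integers with 3 recursive multiplications."""
    if x < 10 or y < 10:                  # one-digit base case
        return x * y
    m = max(len(str(x)), len(str(y))) // 2
    x1, x0 = divmod(x, 10 ** m)           # split around 10^m
    y1, y0 = divmod(y, 10 ** m)
    high = karatsuba(x1, y1)
    low = karatsuba(x0, y0)
    mid = karatsuba(x1 + x0, y1 + y0) - high - low   # = A1*B2 + A2*B1
    return high * 10 ** (2 * m) + mid * 10 ** m + low

print(karatsuba(2135, 4014))   # 8569890
print(2135 * 4014)             # same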

Closest-Pair Problem

• S is a set of n points Pi = (xi, yi) in the plane
• For simplicity, n is a power of two
• Without loss of generality, we assume the points are ordered by their
  x-coordinates
• Discussion: how to apply divide-and-conquer?

Example: Closest Pair Problem

• Find the closest pair of points in a set of points; the set consists of points
  in a two-dimensional plane.
• Given a set P of N points, find p, q ∈ P such that the distance d(p, q) is
  minimum.
• Applications:
  – traffic control systems: a system for controlling air or sea traffic might
    need to know which two vehicles are too close, in order to detect
    potential collisions
  – computational geometry

Efficiency of the Closest-Pair Algorithm

• The running time of the algorithm is described by
  T(n) = 2T(n/2) + f(n), where f(n) ∈ O(n)
• By the Master Theorem (with a = 2, b = 2, d = 1): T(n) ∈ O(n log n)


Brute-Force Algorithm

• Find all the distances D(p, q) and take the minimum. Time complexity: O(n²)

Input: set S of points
Output: closest pair of points
min_distance = infinity
for each point x in S
    for each point y in S
        if x ≠ y and distance(x, y) < min_distance
            min_distance = distance(x, y)
            closest_pair = (x, y)

1-Dimensional Closest Pair Problem

• The "closest pair" is the pair of points in the set with the smallest
  Euclidean distance; between p1 = (x1, y1) and p2 = (x2, y2),
  D(p1, p2) = √((x1 - x2)² + (y1 - y2)²)
• If there are two identical points in the set, then the closest-pair distance
  is obviously zero.
• The 1D problem can be solved in O(n lg n) via sorting, but sorting does not
  generalize to higher dimensions. Let's develop a divide-and-conquer
  algorithm for the 2D problem.

2D Closest Pair Problem

• Divide: sort the points by x-coordinate; draw a vertical line to have roughly
  n/2 points on each side
• Conquer: find the closest pair on each side recursively (giving dLmin and
  dRmin)
• Combine: find the closest pair with one point on each side, within a strip of
  width 2d around the dividing line, where d = minimum(dLmin, dRmin)
  (example: d = min(12, 21))
• Return: the best of the three solutions

For each point p on the left of the dividing line, we have to compare the
distances to the points in a d × 2d rectangle on the right. No point inside
that rectangle can be closer to another on the same side than d, because
d = minimum(dLmin, dRmin). Thus, for each point p we only have to consider
6 points, so we need at most 6·n/2 distance comparisons.
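A divide-and-conquer sketch in Python. For brevity this is the simpler
O(n log² n) variant that re-sorts the strip by y at every level, not the slides'
fully optimized O(n log n) version; names are ours:

from math import dist, inf

def closest_pair(points):
    """Return the smallest pairwise distance; points is a list of (x, y)."""
    pts = sorted(points)                      # presort by x-coordinate

    def solve(P):
        n = len(P)
        if n <= 3:                            # small case: brute force
            return min((dist(p, q) for i, p in enumerate(P) for q in P[i+1:]),
                       default=inf)
        mid = n // 2
        x_mid = P[mid][0]
        d = min(solve(P[:mid]), solve(P[mid:]))
        strip = sorted((p for p in P if abs(p[0] - x_mid) < d),
                       key=lambda p: p[1])    # strip of width 2d, by y
        for i, p in enumerate(strip):         # only O(1) neighbors matter
            for q in strip[i+1:i+8]:
                if q[1] - p[1] >= d:
                    break
                d = min(d, dist(p, q))
        return d

    return solve(pts)

print(closest_pair([(0, 0), (3, 4), (1, 1), (5, 2), (2, 3)]))  # 1.414...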

2D Closest Pair (divide-and-conquer recursive algorithm)

• Time complexity: T(n) = 2T(n/2) + O(n) = O(n lg n)
  – Solve this recurrence equation yourself by applying the iterative method

Matrix Multiplication (divide-and-conquer)

[C00 C01]   [A00 A01]   [B00 B01]
[C10 C11] = [A10 A11] × [B10 B11]

          = [A00×B00 + A01×B10    A00×B01 + A01×B11]
            [A10×B00 + A11×B10    A10×B01 + A11×B11]

A, B: n-by-n matrices; Aij, Bij: n/2-by-n/2 matrices, where i, j ∈ {0, 1}

Recurrence relations:
  multiplications: M(n) = 8M(n/2), so M(n) ∈ Θ(n³)
  additions: A(n) ∈ Θ(n³)


Strassen's Matrix Multiplication

[C00 C01]   [A00 A01]   [B00 B01]   [M1 + M4 - M5 + M7    M3 + M5          ]
[C10 C11] = [A10 A11] × [B10 B11] = [M2 + M4              M1 + M3 - M2 + M6]

M1 = (A00 + A11) × (B00 + B11)
M2 = (A10 + A11) × B00
M3 = A00 × (B01 - B11)
M4 = A11 × (B10 - B00)
M5 = (A00 + A01) × B11
M6 = (A10 - A00) × (B00 + B01)
M7 = (A01 - A11) × (B10 + B11)

Recurrence relations: 7 multiplications of n/2-by-n/2 matrices
  M(n) = 7M(n/2), so M(n) ∈ Θ(n^2.807)
  A(n) ∈ Θ(n^2.807)

The Convex Hull Problem

• concave polygon vs. convex polygon
• The convex hull of a set of planar points is the smallest convex polygon
  containing all of the points.
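A Python sketch of Strassen's recursion for n-by-n matrices with n a power
of 2 (lists of lists, 1-by-1 base case; purely illustrative, helper names ours):

def madd(X, Y):  return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]
def msub(X, Y):  return [[a - b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def strassen(A, B):
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    h = n // 2
    q = lambda M, i, j: [row[j*h:(j+1)*h] for row in M[i*h:(i+1)*h]]
    A00, A01, A10, A11 = q(A,0,0), q(A,0,1), q(A,1,0), q(A,1,1)
    B00, B01, B10, B11 = q(B,0,0), q(B,0,1), q(B,1,0), q(B,1,1)
    M1 = strassen(madd(A00, A11), madd(B00, B11))
    M2 = strassen(madd(A10, A11), B00)
    M3 = strassen(A00, msub(B01, B11))
    M4 = strassen(A11, msub(B10, B00))
    M5 = strassen(madd(A00, A01), B11)
    M6 = strassen(msub(A10, A00), madd(B00, B01))
    M7 = strassen(msub(A01, A11), madd(B10, B11))
    C00 = madd(msub(madd(M1, M4), M5), M7)
    C01 = madd(M3, M5)
    C10 = madd(M2, M4)
    C11 = madd(msub(madd(M1, M3), M2), M6)
    return ([r0 + r1 for r0, r1 in zip(C00, C01)] +
            [r0 + r1 for r0, r1 in zip(C10, C11)])

print(strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]]))   # [[19, 22], [43, 50]]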

The divide-and-conquer strategy to solve the problem uses the following
merging procedure:
1. Select an interior point p.
2. There are 3 sequences of points which have increasing polar angles with
   respect to p:
   (1) g, h, i, j, k
   (2) a, b, c, d
   (3) f, e
3. Merge these 3 sequences into 1 sequence: g, h, a, b, f, c, e, d, i, j, k.
4. Apply the Graham scan to examine the points one by one and eliminate the
   points which cause reflexive angles (e.g., points b and f need to be
   deleted), yielding the final result.

Divide-and-Conquer for Convex Hull

Input: a set S of planar points
Output: a convex hull for S

Step 1: If S contains no more than five points, use exhaustive search to find
        the convex hull and return.
Step 2: Find a median line perpendicular to the X-axis which divides S into
        SL and SR; SL lies to the left of SR.
Step 3: Recursively construct convex hulls for SL and SR. Denote these
        convex hulls by Hull(SL) and Hull(SR) respectively.
Step 4: Apply the merging procedure to merge Hull(SL) and Hull(SR) together
        to form a convex hull.

Time complexity: T(n) = 2T(n/2) + O(n) = O(n log n)

Topological Sort

• Directed graph G.
• Rule: if there is an edge u → v, then u must come before v.
• Ex: a directed graph on vertices A through I (see figure).


Intuition

• Cycles make topological sort impossible.
• Select any node with no in-edges
  – print it
  – and delete all the edges leaving it
• Repeat
• What if there are some nodes left over?
• Implementation? Efficiency?

Implementation

• Start with a list of nodes with in-degree = 0
• Select any node from the list
  – mark it as deleted
  – mark all its outgoing edges as deleted
  – update the in-degree of the destinations of those edges
    • if any drops to zero, add it to the list
• Repeat
• Running time?

Topological Sorting

Figure 13.14: A directed graph without cycles
Figure 13.15: The graph in Figure 13.14 arranged according to the
topological orders (a) a, g, d, b, e, c, f and (b) a, b, g, d, e, f, c

Topological Sorting: Simple Algorithms

• topSort1
  – Find a vertex that has no successor
  – Remove from the graph that vertex and all edges that lead to it, and add
    the vertex to the beginning of a list of vertices
  – Add each subsequent vertex that has no successor to the beginning of
    the list
  – When the graph is empty, the list of vertices will be in topological order
• topSort2
  – A modification of the iterative DFS algorithm
  – Strategy:
    • Push all vertices that have no predecessor onto a stack
    • Each time you pop a vertex from the stack, add it to the beginning of a
      list of vertices
    • When the traversal ends, the list of vertices will be in topological order

Implementation

• The in-degree method above is sketched in code below
• Running time? In all, the algorithm does work:
  – O(|V|) to construct the initial list
  – Every edge marked "deleted" at most once: O(|E|) total
  – Every node marked "deleted" at most once: O(|V|) total
  – So linear time overall (in |E| and |V|)
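A Python sketch of this in-degree approach, commonly known as Kahn's
algorithm. The example graph is our guess at the edges of Figure 13.14,
chosen so that it admits the two orders listed above:

from collections import deque

def topological_sort(graph):
    """graph: dict mapping each vertex to a list of its successors."""
    indegree = {u: 0 for u in graph}
    for u in graph:
        for v in graph[u]:
            indegree[v] += 1
    queue = deque(u for u in graph if indegree[u] == 0)   # no in-edges
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in graph[u]:               # "delete" u's outgoing edges
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if len(order) < len(graph):          # leftover nodes => a cycle
        raise ValueError("graph has a cycle")
    return order

g = {'a': ['b', 'g'], 'b': ['e'], 'g': ['d'], 'd': ['e'],
     'e': ['f', 'c'], 'f': [], 'c': []}
print(topological_sort(g))   # ['a', 'b', 'g', 'd', 'e', 'f', 'c']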

Why Should We Care?

• Shortest-path problem in a directed, acyclic graph (called a DAG for short)
• General problem:
  – Given a DAG input G with weights on edges
  – Find shortest paths from source A to every other vertex

Topological Sort Analysis

• Initialize the in-degree map: O(|V| + |E|)
• Initialize the queue with in-degree-0 vertices: O(|V|)
• Dequeue and output vertex:
  – |V| vertices, each takes only O(1) to dequeue and output: O(|V|)
• Reduce the in-degree of all vertices adjacent to a vertex and enqueue any
  in-degree-0 vertices: O(|E|)
• Runtime = O(|V| + |E|). Linear!

Transform and Conquer

This group of techniques solves a problem by a transformation
• to a simpler/more convenient instance of the same problem. We call it
  instance simplification.
• to a different representation of the same instance. We call it
  representation change.
• to a different problem for which an algorithm is already available. We call
  it problem reduction.

Figure: Transform-and-conquer strategy.


Instance Simplification: Presorting

Solve a problem's instance by transforming it into another simpler/easier
instance of the same problem.

Presorting: many problems involving lists are easier when the list is sorted,
e.g.,
• searching
• checking if all elements are distinct (element uniqueness)

Also:
• Topological sorting helps solve some problems for DAGs.
• Presorting is used in many geometric algorithms.

Example 1: Checking Element Uniqueness in an Array

• Brute-force algorithm: compare all pairs of elements. Efficiency: Θ(n²)
• Presorting-based algorithm:
  Stage 1: sort by an efficient sorting algorithm (e.g., mergesort).
  Stage 2: scan the array to check pairs of adjacent elements.
  If the array has equal elements, a pair of them must be next to each other
  after sorting, and vice versa.

ALGORITHM PresortElementUniqueness(A[0..n - 1])
//Solves the element uniqueness problem by sorting the array first
//Input: An array A[0..n - 1] of orderable elements
//Output: Returns "true" if A has no equal elements, "false" otherwise
Sort the array A
for i ← 0 to n - 2 do
    if A[i] = A[i + 1] return false
return true

Efficiency: T(n) = Tsort(n) + Tscan(n) ∈ Θ(n log n) + Θ(n) = Θ(n log n)
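A direct Python transcription (names are ours):

def presort_element_uniqueness(A):
    A = sorted(A)                     # Theta(n log n)
    for i in range(len(A) - 1):       # Theta(n) scan of adjacent pairs
        if A[i] == A[i + 1]:
            return False
    return True

print(presort_element_uniqueness([4, 1, 7, 9]))   # True
print(presort_element_uniqueness([4, 1, 7, 4]))   # False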

Example 2: Computing a Mode

• A mode is a value that occurs most often in a given list of numbers. For
  example, for 5, 1, 5, 7, 6, 5, 7, the mode is 5.
• The brute-force approach for computing a mode would scan the list and
  compute the frequencies of all its distinct values, then find the value with
  the largest frequency.
• To implement this, we can store the values already encountered, along
  with their frequencies, in a separate (auxiliary) list. On each iteration, the
  ith element of the original list is compared with the values already
  encountered by traversing this auxiliary list. If a matching value is found,
  its frequency is incremented; otherwise, the current element is added to
  the list of distinct values with a frequency of 1.
• Since the list's ith element is compared with i - 1 elements of the auxiliary
  list before being added with a frequency of 1, the worst-case number of
  comparisons made in creating the frequency list is
  C(n) = Σ_{i=1}^{n} (i - 1) = 0 + 1 + ... + (n - 1) = (n - 1)n/2 ∈ Θ(n²)
  An additional n - 1 comparisons are needed to find the largest frequency
  in the auxiliary list.
• In the presort mode computation algorithm, the list is sorted first. Then all
  equal values are adjacent to each other, and to compute the mode all we
  need to do is find the longest run of adjacent equal values in the sorted
  array.
• The efficiency of this presort algorithm is n log n: the time spent on sorting.

ALGORITHM PresortMode(A[0..n - 1])
//Computes the mode of an array by sorting it first
//Input: An array A[0..n - 1] of orderable elements
//Output: The array's mode
Sort the array A
i ← 0                      //current run begins at position i
modefrequency ← 0          //highest frequency seen so far
while i ≤ n - 1 do
    runlength ← 1; runvalue ← A[i]
    while i + runlength ≤ n - 1 and A[i + runlength] = runvalue
        runlength ← runlength + 1
    if runlength > modefrequency
        modefrequency ← runlength; modevalue ← runvalue
    i ← i + runlength
return modevalue
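The same algorithm in Python (a sketch; names mirror the pseudocode):

def presort_mode(A):
    A = sorted(A)
    i, modefrequency, modevalue = 0, 0, None
    while i <= len(A) - 1:
        runlength, runvalue = 1, A[i]       # measure the run starting at i
        while i + runlength <= len(A) - 1 and A[i + runlength] == runvalue:
            runlength += 1
        if runlength > modefrequency:
            modefrequency, modevalue = runlength, runvalue
        i += runlength
    return modevalue

print(presort_mode([5, 1, 5, 7, 6, 5, 7]))   # 5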

Example 3: Searching Problem

Problem: search for a given K in A[0..n-1]

Presorting-based algorithm:
  Stage 1: Sort the array by an efficient sorting algorithm.
  Stage 2: Apply binary search.

Efficiency: T(n) = Tsort(n) + Tsearch(n) = Θ(n log n) + Θ(log n) = Θ(n log n)

Binary Trees

• Binary Tree: a binary tree is a finite set of elements that is either empty
  or is partitioned into three disjoint subsets:
  – The first subset contains a single element called the ROOT of the tree.
  – The other two subsets are themselves binary trees, called the LEFT and
    RIGHT subtrees of the original tree.
• NODE: each element of a binary tree is called a node of the tree.
• Note: a node in a binary tree can have no more than two subtrees.
(Figure: a binary tree with root A, left subtree containing B, C, D and right
subtree containing E, F.)


BINARY SEARCH TREE

Binary Search Tree (Binary Sorted Tree):
• Suppose T is a binary tree. Then T is called a binary search tree if each
  node of the tree has the following property: the value of each node is
  greater than every value in the left subtree of that node and is less than
  every value in the right subtree of that node.
• Let x be a node in a binary search tree.
  – If y is a node in the left subtree of x, then info(y) < info(x).
  – If y is a node in the right subtree of x, then info(x) ≤ info(y).

Exercise: draw BSTs of heights 2, 3, 4, 5, and 6 on the following set of keys:
{ 1, 4, 5, 10, 16, 17, 21 }
(e.g., height 2: root 10 with children 4 and 17 and leaves 1, 5, 16, 21; the
taller trees are progressively more chain-like.)

Operations on BSTs: Insert

• Adds an element x to the tree so that the binary search tree property
  continues to hold
• The basic algorithm:
  – Set the index to the root
  – Check if the data where the index is pointing is NULL; if so, insert x in
    place of NULL and finish the process
  – If the data is not NULL, compare it with the inserting item
  – If the inserting item is equal or bigger, traverse to the right, else
    traverse to the left
  – Continue with the second step until done

Insert Operation: Steps 1 and 2 of 3

1) The function begins at the root node and compares item 32 with the root
   value 25. Since 32 > 25, we traverse the right subtree and look at node 35.
2) Considering 35 to be the root of its own subtree, we compare item 32 with
   35 and traverse the left subtree of 35.

Insert Operation: Step 3 of 3

3) Create a leaf node with data value 32. Insert the new node as the left
   child of node 35.

Insert in BST: Example (insert key 52)

The tree has root 51, children 14 and 72, grandchildren 6, 33, 53, 97, and
leaves 13, 25, 43, 64, 84, 99.

• Start at the root: 52 > 51, go right.


• 52 < 72, go left.
• 52 < 53, go left.
• No more tree here: INSERT 52 as the left child of 53.

Operations on BSTs: Search

• Looks for an element x within the tree
• The basic algorithm:
  – Set the index to the root
  – Compare the searching item with the data at the index
  – If the searching item is smaller, traverse to the left, else traverse to the
    right
  – Repeat the process until the index points to NULL or the item is found

Search in BST: Example (search for key 43)

• Start at the root: 43 < 51, go left.
• 43 > 14, go right.
• 43 > 33, go right.

• 43 = 43: FOUND.

Search in BST: Example (search for key 52, before it was inserted)

• Start at the root: 52 > 51, go right.
• 52 < 72, go left.
• 52 < 53, go left.


• No more tree here: NOT FOUND.

DELETION

(Figure: deleting keys from a BST rooted at 70; the removed node is replaced
so that the BST ordering is preserved.)

Binary Search Trees: Efficiency

• The transformation from a set to a binary search tree is an example of the
  representation-change technique.
• By doing this transformation, we gain in the time efficiency of searching,
  insertion, and deletion, which are all in Θ(log n), but only in the average
  case. In the worst case, these operations are in Θ(n).
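A Python sketch of BST insert and search (the keys come from the example
tree on the slides; class and function names are ours):

class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def bst_insert(root, key):
    if root is None:
        return Node(key)                 # found the NULL spot: insert here
    if key < root.key:
        root.left = bst_insert(root.left, key)
    else:                                # equal or bigger goes right
        root.right = bst_insert(root.right, key)
    return root

def bst_search(root, key):
    while root is not None and root.key != key:
        root = root.left if key < root.key else root.right
    return root                          # the node, or None (NOT FOUND)

root = None
for k in [51, 14, 72, 6, 33, 53, 97, 13, 25, 43, 64, 84, 99]:
    root = bst_insert(root, k)
root = bst_insert(root, 52)              # 51 -> 72 -> 53 -> left child
print(bst_search(root, 43) is not None)  # True
print(bst_search(root, 52) is not None)  # True (after the insert)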

Balanced Search Trees

• There are two approaches to balancing a search tree:
  – The first approach is of the instance-simplification variety: an
    unbalanced binary search tree is transformed into a balanced one. An
    AVL tree requires that the difference between the heights of the left and
    right subtrees of every node never exceed 1. Other types of trees are
    red-black trees and splay trees.
  – The second approach is of the representation-change variety: allow
    more than one element in a node of a search tree. Specific cases of such
    trees are 2-3 trees, 2-3-4 trees, and B-trees.

AVL TREES

• AVL trees were invented in 1962 by two Russian scientists, G. M.
  Adelson-Velsky and E. M. Landis, after whom this data structure is named.
• Definition: An AVL tree is a binary search tree in which the balance factor
  of every node, defined as the difference between the heights of the node's
  left and right subtrees, is either 0, +1 or -1. (The height of the empty tree
  is defined as -1.)

Figure: (a) an AVL tree; (b) a binary search tree that is not an AVL tree. The
number above each node indicates that node's balance factor.

• If an insertion of a new node makes an AVL tree unbalanced, we transform
  the tree by a rotation.
• A rotation in an AVL tree is a local transformation of its subtree rooted at
  a node whose balance has become either +2 or -2; if there are several such
  nodes, we rotate the tree rooted at the unbalanced node that is the closest
  to the newly inserted leaf.
• There are only four types of rotations:
  – Single right rotation (R-rotation)
  – Single left rotation (L-rotation)
  – Double left-right rotation (LR-rotation)
  – Double right-left rotation (RL-rotation)

AVL TREES

 Single right rotation (R-rotation):
This rotation is performed after a new key is inserted into the left subtree of the left child of a tree whose root had the balance of +1 before the insertion. (Imagine rotating the edge connecting the root and its left child to the right.)

Figure: General form of the R-rotation in the AVL tree: the root r with left child c and subtrees T1, T2, T3 becomes a tree rooted at c, with T1 and r as its children and T2, T3 as r's subtrees. A shaded node is the last one inserted. Example: (a) the balanced tree 3 with left child 2; (b) the unbalanced tree after inserting 1; (c) after the R-rotation, the balanced tree rooted at 2 with children 1 and 3.

Note:
 These transformations should not only guarantee that a resulting tree is balanced, but they should also preserve the basic requirements of a binary search tree.
 For example, in the initial tree of the figure above, all the keys of subtree T1 are smaller than c, which is smaller than all the keys of subtree T2, which are smaller than r, which is smaller than all the keys of subtree T3.
 The same relationships among the key values should hold for the balanced tree after rotation.

 Single left rotation (L-rotation):
 This is the mirror image of the single R-rotation.
 It is performed after a new key is inserted into the right subtree of the right child of a tree whose root had the balance of -1 before the insertion. (Imagine rotating the edge connecting the root and its right child to the left. Example: (a) the balanced tree 1 with right child 2; (b) the unbalanced tree after inserting 3; (c) after the L-rotation, the balanced tree rooted at 2 with children 1 and 3.)
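A minimal sketch of the two single rotations in C, assuming each node stores its height so balance factors can be recomputed (the structure and helper names below are mine, not the slides'):

typedef struct avl {
    int key, height;
    struct avl *left, *right;
} avl;

static int height(avl *t) { return t ? t->height : -1; }   /* height of empty tree is -1 */
static int max2(int a, int b) { return a > b ? a : b; }
static void fix_height(avl *t) { t->height = 1 + max2(height(t->left), height(t->right)); }

/* Single right rotation: the left child c becomes the new root of the subtree. */
avl *rotate_right(avl *r)
{
    avl *c = r->left;
    r->left = c->right;      /* subtree T2 moves from c to r, preserving BST order */
    c->right = r;
    fix_height(r);
    fix_height(c);
    return c;
}

/* Single left rotation: the mirror image of rotate_right. */
avl *rotate_left(avl *r)
{
    avl *c = r->right;
    r->right = c->left;
    c->left = r;
    fix_height(r);
    fix_height(c);
    return c;
}

Under this sketch, the double LR-rotation at a node r is rotate_left applied to r->left followed by rotate_right applied to r, and the RL-rotation is the mirror combination.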


AVL TREES

 Double left-right rotation (LR-rotation):
 It is a combination of two rotations: we perform the L-rotation of the left subtree of root r followed by the R-rotation of the new tree rooted at r.
 It is performed after a new key is inserted into the right subtree of the left child of a tree whose root had the balance of +1 before the insertion.

Figure: General form of the double LR-rotation in the AVL tree: root r, its left child c, and c's right child g become a tree rooted at g with children c and r; g's subtrees T2 and T3 are attached to c and r, while T1 and T4 stay with c and r respectively. A shaded node is the last one inserted; it can be either in the left subtree or in the right subtree of the root's grandchild. Example: (a) the balanced tree 3 with left child 1; (b) the unbalanced tree after inserting 2; (c) after the LR-rotation, the balanced tree rooted at 2 with children 1 and 3.

 Double right-left rotation (RL-rotation):
 It is the mirror image of the double LR-rotation.
 It is performed after a new key is inserted into the left subtree of the right child of a tree whose root had the balance of -1 before the insertion. Example: (a) the balanced tree 1 with right child 3; (b) the unbalanced tree after inserting 2; (c) after the RL-rotation, the balanced tree rooted at 2 with children 1 and 3.

AVL TREES

 The balance factor of a node is 0 if and only if the heights of its left and right subtrees are the same.
 The balance factor of a node is +1 if the height of the left subtree is one more than the height of the right subtree (left heavy).
 The balance factor of a node is -1 if the height of the right subtree is one more than the height of the left subtree (right heavy).
 The height of the empty tree is -1.
 If there are several nodes with the ±2 balance, the rotation is done for the tree rooted at the unbalanced node that is the closest to the newly inserted leaf.

Construction of an AVL tree

EXAMPLE 1: 100, 200, 300, 250, 270, 70, 40

Step 1: Element 100 is inserted into the empty tree (balance factor 0).

Step 2: Element 200 is inserted to the right of 100; the balance factor of 100 becomes -1.

Step 3: Element 300 is inserted to the right of 200 (100 → 200 → 300). The tree is now unbalanced, since the balance factor at node 100 is -2. All three nodes 100, 200, and 300 lie on a straight line, hence a single rotation is sufficient to balance the tree. Since the tree is heavy towards the right, an L-rotation is made: the edge connecting 100 to 200 is rotated left, making 200 the root.

Construction of an AVL tree

EXAMPLE 1: 100, 200, 300, 250, 270, 70, 40 (continued)

After the L-rotation, 200 is the root with children 100 and 300, and all balance factors are 0.

Step 4: Element 250 is inserted as the left child of 300; the tree remains balanced.

Step 5: Element 270 is inserted as the right child of 250. The youngest ancestor which is out of balance is 300 (balance factor +2), so we consider node 300 with the two nodes below it, 250 and 270. Since these three nodes do not lie on a straight line, the tree requires a double rotation (LR-rotation):
• The first rotation occurs at node 250: a left rotation brings 270 up, and 250 becomes the left child of 270.
• Now a right rotation is made at node 300; this makes node 300 the right child of 270.


Construction of an AVL tree

EXAMPLE 1: 100, 200, 300, 250, 270, 70, 40 (continued)

After the LR-rotation the tree is 200 with children 100 and 270, where 270 has children 250 and 300.

Step 6: Element 70 is inserted as the left child of 100; the tree remains balanced.

Step 7: Element 40 is inserted as the left child of 70. The tree is now unbalanced, since node 100 has a balance factor of +2. Hence we consider node 100 and the nodes below it, 70 and 40. Since nodes 100, 70, and 40 lie on a straight line, a single rotation is made: a right rotation (R-rotation) is performed, since the tree is left heavy. This makes 70 the root of its subtree, with 40 and 100 as its children; the final tree is 200 with children 70 (children 40 and 100) and 270 (children 250 and 300).

Construction of an AVL tree

EXAMPLE 2: 5, 6, 8, 3, 2, 4, 7

Insert 5 (balance factor 0). Insert 6 to the right of 5; the balance factor of 5 becomes -1. Insert 8 to the right of 6; the balance factor of 5 becomes -2.

Figure: Construction of an AVL tree for the list 5, 6, 8, 3, 2, 4, 7 by successive insertions.

Construction of an AVL tree

Since node 5 is unbalanced (balance factor -2) and nodes 5, 6, 8 lie on a straight line, an L-rotation at 5, written L(5), makes 6 the root with children 5 and 8. Inserting 3 as the left child of 5 keeps the tree balanced; inserting 2 below 3 then makes node 5 unbalanced with a balance factor of +2.

Figure: Construction of an AVL tree for the list 5, 6, 8, 3, 2, 4, 7 by successive insertions. The parenthesized number of a rotation's abbreviation indicates the root of the tree being reorganized.


Construction of an AVL tree

Since nodes 5, 3, 2 lie on a straight line, a single right rotation R(5) is performed: 3 moves up with children 2 and 5, giving the tree 6 with children 3 (children 2 and 5) and 8. Inserting 4 as the left child of 5 makes the root 6 unbalanced (balance factor +2); since the new key is in the right subtree of 6's left child, a double rotation LR(6) is performed, producing the tree rooted at 5 with children 3 (children 2 and 4) and 6 (right child 8).

Figure: Construction of an AVL tree for the list 5, 6, 8, 3, 2, 4, 7 by successive insertions. The parenthesized number of a rotation's abbreviation indicates the root of the tree being reorganized.

Construction of an AVL tree

Finally, inserting 7 as the left child of 8 makes node 6 unbalanced (balance factor -2); since the new key is in the left subtree of 6's right child, a double rotation RL(6) is performed, producing the final tree rooted at 5 with children 3 (children 2 and 4) and 7 (children 6 and 8).

Figure: Construction of an AVL tree for the list 5, 6, 8, 3, 2, 4, 7 by successive insertions. The parenthesized number of a rotation's abbreviation indicates the root of the tree being reorganized.

Efficiency of AVL Trees

 The height h of any AVL tree with n nodes satisfies the inequalities
log2 n ≤ h < 1.4405 log2 (n + 2) - 1.3277.
 These inequalities imply that the operations of searching and insertion are Θ(log n) in the worst case.
 The operation of key deletion in an AVL tree is considerably more difficult than insertion, but fortunately it turns out to be in the same efficiency class as insertion, i.e., logarithmic.
 The drawbacks of AVL trees are frequent rotations, the need to maintain balances for the tree's nodes, and overall complexity, especially of the deletion operation.
 These drawbacks have prevented AVL trees from becoming the standard structure for implementing dictionaries.

2-3 TREES

 Another idea for balancing a search tree is to allow more than one key in the same node (e.g., 2-3 trees).

 2-3 trees were introduced by the U.S. computer scientist John Hopcroft in 1970.

 A 2-3 tree is a tree that can have nodes of two kinds: 2-nodes and 3-nodes.

 A 2-node contains a single key K and has two children: the left child serves as the root of a subtree whose keys are less than K, and the right child serves as the root of a subtree whose keys are greater than K.

 A 3-node contains two ordered keys K1 and K2 (K1 < K2) and has three children:
• The leftmost child serves as the root of a subtree with keys less than K1.
• The middle child serves as the root of a subtree with keys between K1 and K2.
• The rightmost child serves as the root of a subtree with keys greater than K2.

 In 2-3 trees, all the leaves must be on the same level; i.e., a 2-3 tree is always perfectly height-balanced: the length of a path from the root of the tree to a leaf must be the same for every leaf.

 Searching in a 2-3 Tree:
 Searching for an element always starts from the root.
 If the root is a 2-node, then the search is similar to the way we search in a binary search tree: we either stop if the search key K is equal to the root's key, or continue the search in the left subtree if K is less than the root's key, or continue the search in the right subtree if K is larger than the root's key.
 If the root is a 3-node, then K is compared with the two elements in the root. If it doesn't match either of them, we continue in one of the three subtrees: if K is less than the first element in the root, continue searching in the left subtree; if K is between the two elements of the root, continue searching in the middle subtree; if K is larger than the second element of the root, continue searching in the right subtree. A sketch of this logic follows.
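A sketch of the 2-3 search logic in C; the node layout below (a key count plus fixed-size arrays) is an assumed representation for illustration, not the slides' own:

typedef struct node23 {
    int nkeys;                 /* 1 for a 2-node, 2 for a 3-node */
    int key[2];                /* key[0] < key[1] when nkeys == 2 */
    struct node23 *child[3];   /* a 2-node uses child[0] and child[1] only */
} node23;

/* Returns 1 if k is present in the 2-3 tree rooted at t, 0 otherwise. */
int search23(node23 *t, int k)
{
    while (t != NULL) {
        if (k == t->key[0] || (t->nkeys == 2 && k == t->key[1]))
            return 1;                /* k matches a key in this node */
        if (k < t->key[0])
            t = t->child[0];         /* left subtree */
        else if (t->nkeys == 2 && k > t->key[1])
            t = t->child[2];         /* right subtree of a 3-node */
        else
            t = t->child[1];         /* right subtree of a 2-node, middle subtree of a 3-node */
    }
    return 0;                        /* reached an empty subtree: not found */
}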


Creating a 2-3 Tree

 If the 2-3 tree is empty, insert a new key at the root level. Except for the first node, new keys are always inserted in a leaf.
 By performing a search for the key to be inserted, the appropriate leaf node is found.
 If the leaf is a 2-node, we insert K there as either the first or the second key, depending on whether K is smaller or larger than the node's old key.
 If the leaf is a 3-node, we split the leaf into two parts: the smallest of the three keys (two old ones and the new key) is put in the first leaf, the largest key is put in the second leaf, while the middle key is promoted to the old leaf's parent.
 If the tree's root itself is the leaf node, a new root is created to accept the middle key.
 If the promotion of a middle element overfills the parent in turn, it may be required to split along the chain of the leaf's ancestors.
 By repeating this procedure, a 2-3 tree can be constructed.

An example of a 2-3 tree construction is given below.
Construct a 2-3 tree for the list 9, 5, 8, 3, 2, 4, and 7.

Step 1: Insert the element 9 into the empty tree: [9]

Step 2: Element to be inserted is 5. Since there is only one element in the root node, 5 is inserted alongside 9: [5, 9]

Step 3: Element to be inserted is 8. Since 8 is greater than 5 and less than 9, it is inserted between 5 and 9: [5, 8, 9]. Since a node in a 2-3 tree cannot contain more than 2 elements, we move the middle element 8 to the parent position, the left element 5 to the left of the parent, and the right element 9 to the right of the parent: root [8] with children [5] and [9].

Step 4: Element to be inserted is 3. Since 3 is less than 8, it is inserted in the left leaf, alongside 5: root [8] with children [3, 5] and [9].

An example of a 2-3 tree construction (continued)

Step 5: Element to be inserted is 2. Since 2 is less than 8, it is inserted in the left leaf, alongside 3 and 5: root [8] with children [2, 3, 5] and [9]. But a node cannot have more than 2 elements, so the middle element 3 is moved to its parent position: root [3, 8] with children [2], [5], and [9]. Since the node [3, 8] now has two elements, it has 3 children: a left child containing the elements less than 3, a middle child containing the elements between 3 and 8, and a right child containing the elements larger than 8.

Step 6: Element to be inserted is 4. Since 4 is between 3 and 8, it is inserted in the middle leaf, alongside 5: root [3, 8] with children [2], [4, 5], and [9].

An example of a 2-3 tree construction (continued)

Step 7: Element to be inserted is 7. Since 7 is between 3 and 8, it is inserted in the middle leaf, alongside 4 and 5: root [3, 8] with children [2], [4, 5, 7], and [9]. Since a node cannot have more than 2 elements, the middle element 5 is moved to the parent position: root [3, 5, 8] with children [2], [4], [7], and [9]. Since there are now 3 elements (3, 5, 8) in that node, its middle element 5 must be promoted to a new root: elements 2 and 4 become the children of node 3, and elements 7 and 9 become the children of node 8. The final tree is root [5] with children [3] (children [2] and [4]) and [8] (children [7] and [9]).

Efficiency of 2-3 Trees

 Since a 2-3 tree is always height-balanced, the efficiency of any operation depends only on the height of the tree.
 To find the time complexity, it is required to find a lower bound as well as an upper bound on n in terms of the height h.
 To find the lower bound on n: consider the 2-3 tree in which every node is a 2-node. Then the number of nodes at level 0 is 1 = 2^0, at level 1 it is 2 = 2^1, at level 2 it is 4 = 2^2, ..., at level i it is 2^i.
 Therefore, for any 2-3 tree of height h with n nodes, we get the inequality
n ≥ 2^0 + 2^1 + 2^2 + . . . + 2^h = 2^(h+1) - 1, i.e., n + 1 ≥ 2^(h+1).
Taking log2 on both sides: log2 (n + 1) ≥ h + 1, so h ≤ log2 (n + 1) - 1.
 To find the upper bound on n: consider the 2-3 tree in which every node is a 3-node holding the maximum of two elements. Then the number of keys at level 0 is 2 · 3^0, at level 1 it is 2 · 3^1, at level 2 it is 2 · 3^2, ..., at level i it is 2 · 3^i.


Efficiency of 2-3 Trees (continued)

 Therefore, for any 2-3 tree of height h with n nodes, we also get the inequality
n ≤ 2 · 3^0 + 2 · 3^1 + 2 · 3^2 + . . . + 2 · 3^h = 2(3^0 + 3^1 + . . . + 3^h) = 3^(h+1) - 1,
i.e., n + 1 ≤ 3^(h+1). Taking log3 on both sides: log3 (n + 1) ≤ h + 1, so h ≥ log3 (n + 1) - 1.
 These lower and upper bounds on the height h,
log3 (n + 1) - 1 ≤ h ≤ log2 (n + 1) - 1,
imply that the time efficiencies of searching, insertion, and deletion are all in Θ(log n) in both the worst and average case.

Heaps

 A heap is a special type of data structure which is suitable for implementing a priority queue (using a priority queue, elements can be inserted into and deleted from the queue based on their priority) and also for an important sorting technique called heapsort.

DEFINITION: A heap can be defined as a binary tree with keys assigned to its nodes (one key per node) provided the following two conditions are met:
1. The tree's shape requirement: the binary tree is almost complete or complete, i.e., all its levels are full except possibly the last level, where only some rightmost leaves may be missing.
2. The parental dominance requirement: the key at each node is greater than or equal to the keys at its children.

Figure: Illustration of the definition of "heap": only the leftmost tree is a heap. (a) A heap. (b) Not a heap, since the tree's shape requirement is violated. (c) Not a heap, since the parental dominance requirement fails for the node with key 5.

Note: Key values in a heap are ordered top down. There is no left-to-right order in key values; i.e., there is no relationship among key values for nodes either on the same level of the tree or in the left and right subtrees of the same node.

Important properties of Heaps

 There exists exactly one almost complete binary tree with n nodes. Its height is equal to ⌊log2 n⌋.
 The root of a heap always contains its largest element.
 A node of a heap considered with all its descendants is also a heap.
 A heap can be implemented as an array by recording its elements in the top-down, left-to-right fashion. It is convenient to store the heap's elements in positions 1 through n of such an array, leaving H[0] unused. In such a representation:
a) the parental node keys will be in the first ⌊n/2⌋ positions of the array, while the leaf keys will occupy the last ⌈n/2⌉ positions;
b) the children of a key in the array's parental position i (1 ≤ i ≤ ⌊n/2⌋) will be in positions 2i and 2i + 1, and correspondingly, the parent of a key in position i (2 ≤ i ≤ n) will be in position ⌊i/2⌋.
 Thus, we could also define a heap as an array H[1 . . . n] in which every element in position i in the first half of the array is greater than or equal to the elements in positions 2i and 2i + 1, i.e.,
H[i] ≥ max{H[2i], H[2i + 1]} for i = 1, . . . , ⌊n/2⌋.

Figure: the heap with root 10, children 5 and 7, and leaves 4, 2, 1 is represented by the array whose positions 1-6 hold 10, 5, 7, 4, 2, 1 (parents in positions 1-3, leaves in positions 4-6; position 0 is unused).

Construction of a Heap

Bottom-up heap construction algorithm:
 It initializes the almost complete binary tree with n nodes by placing keys in the order given and then "heapifies" the tree as follows:
 Starting with the last parental node, the algorithm checks whether the parental dominance holds for the key at this node.
 If it doesn't, the algorithm exchanges the node's key K with the larger key of its children and checks whether the parental dominance holds for K in its new position.
 This process continues until the parental dominance requirement for K is satisfied.
 After completing the "heapification" of the subtree rooted at the current parental node, the algorithm proceeds to do the same for the node's immediate predecessor.
 The algorithm stops after this is done for the tree's root.

ALGORITHM HeapBottomUp(H[1 . . . n])
//Constructs a heap from the elements of a given array
//by the bottom-up algorithm
//Input: An array H[1 . . . n] of orderable items
//Output: A heap H[1 . . . n]
for i ← ⌊n/2⌋ downto 1 do
    k ← i; v ← H[k]
    heap ← false
    while not heap and 2 * k ≤ n do
        j ← 2 * k
        if j < n //there are two children
            if H[j] < H[j + 1] then j ← j + 1
        if v ≥ H[j]
            heap ← true
        else H[k] ← H[j]; k ← j
    H[k] ← v

Figure: Bottom-up construction of a heap for the list 2, 9, 7, 6, 5, 8.

Efficiency of the bottom-up algorithm in the worst case:
 Assume, for simplicity, that n = 2^k - 1, so that the heap's tree is full, i.e., the maximum number of nodes occurs on each level.
 Let h be the height of the tree, where h = ⌊log2 n⌋.
 Each key on level i of the tree will traverse to the leaf level h in the worst case of the heap construction algorithm.
 Since moving to the next level down requires two comparisons (one to find the larger child and the other to determine whether the exchange is required), the total number of key comparisons involving a key on level i will be 2(h - i).
 Therefore, the total number of key comparisons in the worst case will be
Cworst(n) = Σ(i=0 to h-1) Σ(keys on level i) 2(h - i) = Σ(i=0 to h-1) 2(h - i) 2^i = 2(n - log2 (n + 1)).
 Thus, with this bottom-up algorithm, a heap of size n can be constructed with fewer than 2n comparisons.
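HeapBottomUp carries over almost line for line into C. A sketch, using a 1-based array H[1..n] with H[0] unused, as in the slides:

/* Bottom-up heap construction for H[1..n]; H[0] is unused. */
void heap_bottom_up(int H[], int n)
{
    int i, j, k, v, heap;
    for (i = n / 2; i >= 1; i--) {
        k = i; v = H[k];
        heap = 0;                            /* false */
        while (!heap && 2 * k <= n) {
            j = 2 * k;
            if (j < n && H[j] < H[j + 1])    /* pick the larger of the two children */
                j = j + 1;
            if (v >= H[j])
                heap = 1;                    /* parental dominance holds for v */
            else {
                H[k] = H[j];                 /* pull the larger child up */
                k = j;
            }
        }
        H[k] = v;                            /* place v in its final position */
    }
}

For the list 2, 9, 7, 6, 5, 8 (stored in H[1..6]) this produces 9, 6, 8, 2, 5, 7, matching the figure above.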


Construction of a Heap

 The alternative (and less efficient) algorithm constructs a heap by successive insertions of a new key into a previously constructed heap; it is called the top-down heap construction algorithm.
 To insert a new key K into a heap, first attach a new node with key K after the last leaf of the existing heap.
 Then shift K up to its appropriate place in the new heap as follows:
• Compare K with its parent's key: if the latter is greater than or equal to K, stop; otherwise, swap these two keys and compare K with its new parent.
• This swapping continues until K is not greater than its last parent or it reaches the root.
 Since the height of a heap with n nodes is about log2 n, the time efficiency of insertion is in O(log n).

Figure: Inserting a key (10) into the heap 9, 6, 8, 2, 5, 7. The new key is shifted up via swaps with its parents until it is not larger than its parent (or is in the root).

Deletion from a Heap

 Consider the deletion of the root's key (maximum key deletion):
Step 1: Exchange the root's key with the last key K of the heap.
Step 2: Decrease the heap's size by 1.
Step 3: "Heapify" the smaller tree by shifting K down the tree exactly in the same way we did it in the bottom-up heap construction algorithm, i.e., verify the parental dominance for K: if it holds, we are done; if not, swap K with the larger of its children and repeat this operation until the parental dominance condition holds for K in its new position.

 The efficiency of deletion is determined by the number of key comparisons needed to "heapify" the tree after the swap has been made and the size of the tree has been decreased by 1. The time efficiency of deletion is in O(log n) as well.
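The three deletion steps can be sketched in C by reusing the sift-down loop from heap_bottom_up above (the function name is my own):

/* Deletes and returns the maximum key of the heap H[1..n];
   on return the remaining heap occupies H[1..n-1]. */
int heap_delete_max(int H[], int n)
{
    int max = H[1];
    int k, v, j, heap;
    H[1] = H[n];               /* Step 1: the last key replaces the root */
    n = n - 1;                 /* Step 2: decrease the heap's size by 1 */
    k = 1; v = H[k];           /* Step 3: "heapify" by shifting v down */
    heap = 0;
    while (!heap && 2 * k <= n) {
        j = 2 * k;
        if (j < n && H[j] < H[j + 1])
            j = j + 1;         /* larger of the two children */
        if (v >= H[j])
            heap = 1;
        else {
            H[k] = H[j];
            k = j;
        }
    }
    H[k] = v;
    return max;
}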

Figure: Deleting the root's key from a heap. The key to be deleted is swapped with the last key, after which the smaller tree is "heapified" by exchanging the new key at its root with the larger key of its children until the parental dominance requirement is satisfied.

HEAPSORT

 This sorting algorithm was discovered by J. W. J. Williams.
 It is a two-stage algorithm that works as follows:
Stage 1 (Heap construction): Construct a heap for a given array.
Stage 2 (Maximum deletions): Apply the root-deletion operation n - 1 times to the remaining heap.
 As a result, the array elements are eliminated in decreasing order.
 But since under the array implementation of heaps an element being deleted is placed last, the resulting array will be exactly the original array sorted in ascending order.

Figure: Sorting the array 2, 9, 7, 6, 5, 8 by heapsort.

Stage 1 (Heap construction):   Stage 2 (Maximum deletions):
2 9 7 6 5 8                    9 6 8 2 5 7
2 9 8 6 5 7                    7 6 8 2 5 | 9
2 9 8 6 5 7                    8 6 7 2 5
9 2 8 6 5 7                    5 6 7 2 | 8
9 6 8 2 5 7                    7 6 5 2
                               2 6 5 | 7
                               6 2 5
                               5 2 | 6
                               5 2
                               2 | 5
                               2
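Combining the two sketches above gives the whole two-stage algorithm in a few lines of C (heap_bottom_up and heap_delete_max are the assumed helpers defined earlier):

/* Heapsort for H[1..n]: build the heap, then delete the maximum n-1 times.
   Each deleted maximum is written into the slot vacated by the shrinking
   heap, so H[1..n] ends up sorted in ascending order. */
void heapsort(int H[], int n)
{
    int size;
    heap_bottom_up(H, n);
    for (size = n; size > 1; size--)
        H[size] = heap_delete_max(H, size);
}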

HEAPSORT

Figure: Bottom-up construction of a heap for the list 2, 9, 7, 6, 5, 8; after "heapification" the list becomes 9, 6, 8, 2, 5, 7. The maximum-deletion stage then alternates exchanges and heapifications: exchange 9 and 7, heapify the list 7, 6, 8, 2, 5; exchange 8 and 5, heapify the list 5, 6, 7, 2; exchange 7 and 2, heapify the list 2, 6, 5; exchange 6 and 5, heapify the list 5, 2; exchange 5 and 2. Sorted list: 2, 5, 6, 7, 8, 9.


Analysis of Heapsort using the bottom-up approach

 The heap construction stage of the algorithm is in O(n).
 Now it remains to analyze only the second stage:
• Observe that after exchanging the root with the (n-1)th item, we have to reconstruct the heap for (n-1) elements, which depends only on the height of the tree.
• So, the time to reconstruct the heap for (n-1) elements is 2 log2 (n-1).
• After exchanging the 0th item with the (n-2)th item, the time to reconstruct the heap is 2 log2 (n-2).
• After exchanging the 0th item with the (n-3)th item, the time to reconstruct the heap is 2 log2 (n-3).
• . . .
• After exchanging the 0th item with the 1st item, the time to reconstruct the heap is 2 log2 1.
 So, the total time T(n) of the second stage satisfies the inequality:
T(n) ≤ 2 log2 (n-1) + 2 log2 (n-2) + 2 log2 (n-3) + . . . + 2 log2 1
     = 2 Σ(i=1 to n-1) log2 i ≤ 2 Σ(i=1 to n-1) log2 (n-1), in which log2 (n-1) appears (n-1) times.
Therefore, the time complexity of the second stage is about 2n log2 n, i.e., O(n log2 n).
 So, the time complexity of heapsort = time complexity of stage 1 + time complexity of stage 2
= O(n) + O(n log2 n) = O(max{n, n log2 n}) = O(n log2 n).

Heapsort

 Goal: sort an array using heap representations.
 Idea:
• Build a max-heap from the array.
• Swap the root (the maximum element) with the last element in the array.
• "Discard" this last node by decreasing the heap size.
• Call MAX-HEAPIFY on the new root.
• Repeat this process until only one node remains.

Example: A = [7, 4, 3, 1, 2]

Alg.: HEAPSORT(A)
1. BUILD-MAX-HEAP(A)                  O(n)
2. for i ← length[A] downto 2         n-1 times
3.     do exchange A[1] ↔ A[i]
4.        MAX-HEAPIFY(A, 1, i - 1)    O(lgn)
 Running time: O(nlgn)

(For this example, the loop calls MAX-HEAPIFY(A, 1, 4), MAX-HEAPIFY(A, 1, 3), MAX-HEAPIFY(A, 1, 2), and MAX-HEAPIFY(A, 1, 1) in turn.)

Summary

 We can perform the following operations on heaps:
• MAX-HEAPIFY        O(lgn)
• BUILD-MAX-HEAP     O(n)
• HEAP-SORT          O(nlgn)
• MAX-HEAP-INSERT    O(lgn)
• HEAP-EXTRACT-MAX   O(lgn)
• HEAP-INCREASE-KEY  O(lgn)
• HEAP-MAXIMUM       O(1)

HASHING

 Using balanced trees (2-3, 2-3-4, red-black, and AVL trees) we can implement the table operations (retrieval, insertion, and deletion) efficiently: O(logN).
 Can we find a data structure so that we can perform these table operations better than balanced search trees, i.e., in O(1)? YES: HASH TABLES.
 In hash tables, we have an array (index: 0..n-1) and an address calculator (hash function) which maps a search key into an array index between 0 and n-1.

Hash Function – Address Calculator

 A hash function tells us where to place an item in an array called a hash table. This method is known as hashing.
 A hash function maps a search key into an integer between 0 and n-1.
• We can have different hash functions.
• Ex. h(x) = x mod n if x is an integer.
• The hash function is designed for the search keys depending on the data types of these search keys (int, string, ...).
 Collisions occur when the hash function maps more than one item into the same array location.
• We have to resolve these collisions using a certain mechanism.
 A perfect hash function maps each search key into a unique location of the hash table.
• A perfect hash function is possible if we know all the search keys in advance.
• In practice (we do not know all the search keys), a hash function can map more than one key into the same location (collision).


Hash Function

 We can design different hash functions, but a good hash function should:
• be easy and fast to compute,
• place items evenly throughout the hash table.
 We will consider only hash functions that operate on integers.
• If the key is not an integer, we map it into an integer first, and then apply the hash function.
 The hash table size should be prime.
• By selecting the table size as a prime number, we may place items evenly throughout the hash table, and we may reduce the number of collisions.

Hash Functions – Selecting Digits

 If the search keys are big integers (e.g., nine-digit numbers), we can select certain digits and combine them to create the address:
h(033475678) = 37, selecting the 2nd and 5th digits (table size is 100)
h(023455678) = 25
 Digit selection is not a good hash function because it does not place items evenly throughout the hash table.

Hash Functions – Folding

 Folding: select all digits and add them:
h(033475678) = 0 + 3 + 3 + 4 + 7 + 5 + 6 + 7 + 8 = 43
0 ≤ h(nine-digit search key) ≤ 81
 We can also select groups of digits and add these groups.

Hash Functions – Modular Arithmetic

 Modular arithmetic provides a simple and effective hash function; we will use it as our hash function in the rest of our discussions:
h(x) = x mod tableSize
 The table size should be prime.
• Some prime numbers: 7, 11, 13, ..., 101, ...

Hash Functions – Converting a Character String into an Integer

 If our search keys are strings, we first have to convert the string into an integer, and then apply a hash function designed to operate on integers to this integer value to compute the address.
 We can use the binary codes of characters in the conversion.
• Consider the string "NOTE"; assign 1 (00001) to 'A', ....
• N is 14 (01110), O is 15 (01111), T is 20 (10100), E is 5 (00101).
• Concatenate the four binary numbers to get a new binary number:
01110 01111 10100 00101 = 474,757
• Then apply x mod tableSize.

Collision Resolution

 There are two general approaches to collision resolution in hash tables:
1. Open Addressing: each entry holds one item.
2. Chaining: each entry can hold more than one item.
• Buckets: each entry holds a certain number of items.
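The conversion-plus-mod step can be sketched in C. Instead of literally concatenating bits, the sketch below multiplies by 32 = 2^5 at each character, which reproduces the 5-bit concatenation for the codes A=1, ..., Z=26 (the function name and the use of unsigned long are my choices):

/* Maps an uppercase string to a table index: accumulate the letters'
   5-bit codes (A=1, ..., Z=26), then reduce modulo the table size.
   Very long strings would overflow; a real version must guard for that. */
unsigned long hash_string(const char *s, unsigned long tableSize)
{
    unsigned long x = 0;
    for (; *s != '\0'; s++)
        x = x * 32 + (unsigned long)(*s - 'A' + 1);  /* shift left 5 bits, append code */
    return x % tableSize;
}

For "NOTE" this accumulates x = 474,757, as in the example above, before taking x mod tableSize.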

A Collision

Figure: a collision: two different search keys hash to the same location of the hash table (table size is 101).

Open Addressing

 During an attempt to insert a new item into the table, if the hash function indicates a location in the hash table that is already occupied, we probe for some other empty (or open) location in which to place the item. The sequence of locations that we examine is called the probe sequence.
 A scheme which uses this approach is said to use open addressing.
 There are different open-addressing schemes:
• Linear Probing
• Quadratic Probing
• Double Hashing

Open Addressing – Linear Probing

 In linear probing, we search the hash table sequentially, starting from the original hash location:
• If a location is occupied, we check the next location.
• We wrap around from the last table location to the first table location if necessary.
A sketch of this scheme follows.
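A sketch of linear-probing insertion in C (the EMPTY sentinel, the global table, and the function name are assumptions for illustration; the table is assumed to be initialized to EMPTY before use):

#define TABLE_SIZE 11
#define EMPTY      -1                      /* sentinel: no key stored here */

int table[TABLE_SIZE];                     /* fill with EMPTY before use */

/* Inserts key by linear probing; returns the index used, or -1 if the table is full. */
int insert_linear(int key)
{
    int k, i;
    int home = key % TABLE_SIZE;           /* h(x) = x mod tableSize */
    for (k = 0; k < TABLE_SIZE; k++) {
        i = (home + k) % TABLE_SIZE;       /* next location, wrapping around */
        if (table[i] == EMPTY) {
            table[i] = key;
            return i;
        }
    }
    return -1;                             /* table is full */
}

Inserting the keys 20, 30, 2, 13, 25, 24, 10, 9 with this function reproduces the example table shown next.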


Linear Probing - Example

 Example:
• Table size is 11 (0..10)
• Hash function: h(x) = x mod 11
• Insert keys: 20, 30, 2, 13, 25, 24, 10, 9
- 20 mod 11 = 9
- 30 mod 11 = 8
- 2 mod 11 = 2
- 13 mod 11 = 2 → 2+1 = 3
- 25 mod 11 = 3 → 3+1 = 4
- 24 mod 11 = 2 → 2+1, 2+2, 2+3 = 5
- 10 mod 11 = 10
- 9 mod 11 = 9 → 9+1, (9+2) mod 11 = 0
Resulting table (index: key): 0: 9, 2: 2, 3: 13, 4: 25, 5: 24, 8: 30, 9: 20, 10: 10.

Linear Probing – Clustering Problem

 One of the problems with linear probing is that table items tend to cluster together in the hash table, i.e., the table contains groups of consecutively occupied locations.
 This phenomenon is called primary clustering.
 Clusters can get close to one another and merge into a larger cluster; thus, one part of the table might be quite dense, even though another part has relatively few items.
 Primary clustering causes long probe searches and therefore decreases the overall efficiency.

Open Addressing – Quadratic Probing

 The primary clustering problem can be almost eliminated if we use the quadratic probing scheme.
 In quadratic probing:
• We start from the original hash location i.
• If a location is occupied, we check the locations i+1^2, i+2^2, i+3^2, i+4^2, ...
• We wrap around from the last table location to the first table location if necessary.

Quadratic Probing - Example

 Example:
• Table size is 11 (0..10)
• Hash function: h(x) = x mod 11
• Insert keys: 20, 30, 2, 13, 25, 24, 10, 9
- 20 mod 11 = 9
- 30 mod 11 = 8
- 2 mod 11 = 2
- 13 mod 11 = 2 → 2+1^2 = 3
- 25 mod 11 = 3 → 3+1^2 = 4
- 24 mod 11 = 2 → 2+1^2, 2+2^2 = 6
- 10 mod 11 = 10
- 9 mod 11 = 9 → 9+1^2, (9+2^2) mod 11, (9+3^2) mod 11 = 7
Resulting table (index: key): 2: 2, 3: 13, 4: 25, 6: 24, 7: 9, 8: 30, 9: 20, 10: 10.

Open Addressing – Double Hashing

 Double hashing also reduces clustering.
 In linear probing and quadratic probing, the probe sequences are independent of the key.
 We can select the increments used during probing using a second hash function h2, which should satisfy:
h2(key) ≠ 0 and h2 ≠ h1
 We first probe the location h1(key); if the location is occupied, we probe the locations h1(key) + h2(key), h1(key) + 2*h2(key), ...

Double Hashing - Example

 Example:
• Table size is 11 (0..10)
• Hash functions: h1(x) = x mod 11, h2(x) = 7 - (x mod 7)
• Insert keys: 58, 14, 91
- 58 mod 11 = 3
- 14 mod 11 = 3 → h2(14) = 7, so probe 3+7 = 10
- 91 mod 11 = 3 → probe 3+7, then (3+2*7) mod 11 = 6
Resulting table (index: key): 3: 58, 6: 91, 10: 14.
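The double-hashing probe sequence is easy to express in C (h1 and h2 as in the example; probe_double is my name for the helper):

#define TABLE_SIZE 11

int h1(int x) { return x % TABLE_SIZE; }
int h2(int x) { return 7 - (x % 7); }      /* never 0, and different from h1 */

/* k-th location probed for a key: h1(key) + k*h2(key), wrapping around. */
int probe_double(int key, int k)
{
    return (h1(key) + k * h2(key)) % TABLE_SIZE;
}

For key 91, probe_double(91, 0), probe_double(91, 1), probe_double(91, 2) yields 3, 10, 6, matching the example above.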

Open Addressing – Retrieval & Deletion

 In open addressing, to find an item with a given key:
• We probe the locations (in the same sequence as for insertion) until we find the desired item or we reach an empty location.
 Deletions in open addressing cause complications:
• We CANNOT simply delete an item from the hash table, because the new empty (deleted) location would cause a later retrieval to stop prematurely, incorrectly indicating a failure.
• Solution: we have to distinguish three kinds of locations in a hash table: Occupied, Empty, and Deleted.
• A deleted location will be treated as an occupied location during retrieval and insertion.

Separate Chaining

 Another way to resolve collisions is to change the structure of the hash table.
 In open addressing, each location of the hash table holds only one item.
 We can define a hash table so that each location is itself an array, called a bucket; we store the items which hash into a location in that array.
• Problem: what should the size of the bucket be?
 A better approach is to design the hash table as an array of linked lists; this collision-resolution method is known as separate chaining.
 In separate chaining, each entry (of the hash table) is a pointer to a linked list (the chain) of the items that the hash function has mapped into that location.


Hashing - Analysis

 An analysis of the average-case efficiency of hashing involves the load factor α, which is the ratio of the current number of items in the table to the table size:
α = (current number of items) / tableSize
 The load factor measures how full a hash table is.
• The hash table should not be filled too much, in order to get better performance from the hashing.
 Unsuccessful searches generally require more time than successful searches.
 In average-case analyses, we assume that the hash function uniformly distributes the keys in the hash table.

Linear Probing – Analysis

 For linear probing, the approximate average number of comparisons (probes) that a search requires is:
(1/2) [1 + 1/(1-α)] for a successful search
(1/2) [1 + 1/(1-α)^2] for an unsuccessful search
 As the load factor increases, the number of collisions increases, causing increased search times.
 To maintain efficiency, it is important to prevent the hash table from filling up.

Linear Probing – Analysis – Example

 What is the average number of probes for a successful search and an unsuccessful search for the hash table built earlier (h(x) = x mod 11; occupied locations 0: 9, 2: 2, 3: 13, 4: 25, 5: 24, 8: 30, 9: 20, 10: 10)?
 Successful search (locations probed per key):
• 20: 9 -- 30: 8 -- 2: 2 -- 13: 2, 3 -- 25: 3, 4
• 24: 2, 3, 4, 5 -- 10: 10 -- 9: 9, 10, 0
Average number of probes for a successful search = (1+1+1+2+2+4+1+3)/8 = 15/8
 Unsuccessful search (assuming the hash function uniformly distributes the keys; locations probed per home location):
• 0: 0, 1 -- 1: 1 -- 2: 2, 3, 4, 5, 6 -- 3: 3, 4, 5, 6
• 4: 4, 5, 6 -- 5: 5, 6 -- 6: 6 -- 7: 7 -- 8: 8, 9, 10, 0, 1
• 9: 9, 10, 0, 1 -- 10: 10, 0, 1
Average number of probes for an unsuccessful search = (2+1+5+4+3+2+1+1+5+4+3)/11 = 31/11

Quadratic Probing & Double Hashing – Analysis

 For quadratic probing and double hashing, the approximate average number of comparisons (probes) that a search requires is:
(1/α) log_e (1/(1-α)) for a successful search
1/(1-α) for an unsuccessful search
 On average, both methods require fewer comparisons than linear probing.

Separate Chaining – Analysis

 For separate chaining, the approximate average number of comparisons (probes) that a search requires is:
1 + α/2 for a successful search
α for an unsuccessful search
 Separate chaining is the most efficient collision-resolution scheme, but it requires more storage (we need storage for the pointer fields).
 We can easily perform the deletion operation using the separate-chaining scheme; deletion is very difficult in open addressing.

Figure: the relative efficiency of the four collision-resolution methods (average number of probes as a function of the load factor).

What Constitutes a Good Hash Function

 A hash function should be easy and fast to compute.
 A hash function should scatter the data evenly throughout the hash table.
• How well does the hash function scatter random data?
• How well does the hash function scatter non-random data?
 Two general principles:
1. The hash function should use the entire key in the calculation.
2. If a hash function uses modulo arithmetic, the table size should be prime.

Hash Table versus Search Trees

 In most operations, the hash table performs better than search trees.
 But traversing the data in the hash table in sorted order is very difficult; for such operations the hash table will not be a good choice (e.g., finding all the items in a certain range).

Data with Multiple Organizations

 Several independent data structures do not support all operations efficiently.
 We may need multiple organizations for the data to get efficient implementations of all operations: one organization will be used for certain operations, and the other organizations will be used for the other operations.


Data with Multiple Organizations (cont.)

[Figures illustrating the same data maintained under multiple organizations.]

ANALYSIS AND DESIGN OF ALGORITHMS
UNIT-II
CHAPTER 4: DIVIDE-AND-CONQUER

OUTLINE

 Divide-and-Conquer Technique
• Merge sort
• Quick sort
• Binary Search
• Multiplication of Large Integers and Strassen's Matrix Multiplication

Divide-and-Conquer:

Divide-and-conquer algorithms work according to the following general plan:
1. A problem's instance is divided into several smaller instances of the same problem, ideally of about the same size.
2. The smaller instances are solved (typically recursively).
3. The solutions obtained for the smaller instances are combined to get a solution to the original problem.

Figure: Divide-and-Conquer Technique.

Divide-and-Conquer:

 Not every divide-and-conquer algorithm is necessarily more efficient than even a brute-force solution.
 The divide-and-conquer approach yields some of the most important and efficient algorithms in computer science.
 The divide-and-conquer technique is ideally suited for parallel computations, in which each subproblem can be solved simultaneously by its own processor; later, all solutions can be merged to get the solution to the original problem. Thus, the execution speed of a program based on this technique can be improved significantly.

 In the most typical case of divide-and-conquer, a problem's instance of size n is divided into two instances of size n/2.
 Generally, an instance of size n can be divided into b instances of size n/b, with a of them needing to be solved. (Here, a and b are constants; a ≥ 1 and b > 1.)
 Assuming that size n is a power of b, we get the following recurrence for the running time T(n):
T(n) = aT(n/b) + f(n) ………….. (1)
(General Divide-and-Conquer Recurrence)
where f(n) is a function that accounts for the time spent on dividing the problem into smaller ones and on combining their solutions.

MASTER THEOREM:
If f(n) Є Θ(n^d) with d ≥ 0 in recurrence equation (1), then
T(n) Є Θ(n^d) if a < b^d
T(n) Є Θ(n^d log n) if a = b^d
T(n) Є Θ(n^(log_b a)) if a > b^d


Divide-and-Conquer:

Example 1: Consider the problem of computing the sum of n numbers stored in an array.

ALGORITHM Sum(A[0… n-1], low, high)
//Determines the sum of n numbers stored in an array A
//using the divide-and-conquer technique recursively.
//Input: An array A[0… n-1]; low and high initialized to 0 and n-1, respectively.
//Output: Sum of all the elements in the array.
if low = high return A[low]
mid ← (low + high)/2
return (Sum(A, low, mid) + Sum(A, mid+1, high))

Recursion tree for the example 4, 3, 2, 1: the call (A, 0, 3) splits into (A, 0, 1) and (A, 2, 3); these split into the single-element calls (A, 0, 0), (A, 1, 1), (A, 2, 2), (A, 3, 3), which return 4, 3, 2, 1; the partial sums 7 and 3 are then combined into 10.

Analysis of the algorithm to find the sum of all the elements of the array using the divide-and-conquer technique:

Since the problem instance is divided into two parts, the recurrence relation for the number of additions A(n) of this algorithm is:
A(n) = 0 if n = 1
A(n) = A(n/2) + A(n/2) + 1 otherwise
(the additions on the left part of the array, plus the additions on the right part, plus one addition to add the sums of the left and right parts).
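The pseudocode translates directly into C; a minimal sketch (returning long for the sum is my choice):

/* Divide-and-conquer sum of A[low..high]. */
long sum_dc(const int A[], int low, int high)
{
    int mid;
    if (low == high)
        return A[low];                 /* single element: no addition needed */
    mid = (low + high) / 2;
    return sum_dc(A, low, mid)
         + sum_dc(A, mid + 1, high);   /* one addition per combining step */
}

For the array {4, 3, 2, 1}, sum_dc(A, 0, 3) makes 3 additions and returns 10, exactly as in the recursion tree above.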

Analysis of the algorithm (continued)

The recurrence equation for the number of additions A(n) made by the divide-and-conquer summation algorithm on inputs of size n = 2^k is
A(n) = 2A(n/2) + 1, with the initial condition A(1) = 0.

Solving by the backward substitution method:
A(n) = 2A(n/2) + 1 … (1)
     = 2[2A(n/4) + 1] + 1 = 2^2 A(n/2^2) + 2 + 1    (replacing n by n/2 in (1): A(n/2) = 2A(n/4) + 1)
     = 2^2 [2A(n/2^3) + 1] + 2 + 1 = 2^3 A(n/2^3) + 2^2 + 2 + 1    (replacing n by n/2^2 in (1): A(n/2^2) = 2A(n/2^3) + 1)
     = . . .
     = 2^i A(n/2^i) + 2^(i-1) + 2^(i-2) + . . . + 2 + 1
     = 2^i A(n/2^i) + 2^i - 1    (geometric series: a(r^i - 1)/(r - 1) = 1 · (2^i - 1)/(2 - 1) = 2^i - 1)
Substituting 2^i = n in the last step yields
A(n) = n A(1) + n - 1 = n · 0 + n - 1, so A(n) Є Θ(n).

Applying the Master Theorem to the same equation: here a = 2, b = 2, and d = 0; hence, since a > b^d,
A(n) Є Θ(n^(log_b a)) = Θ(n^(log_2 2)) = Θ(n).

Divide-and-Conquer:

Example 2: Consider the problem of finding the largest element in an array.

ALGORITHM Large(A[0… n-1], low, high)
//Determines the largest element in an array A using the
//divide-and-conquer technique recursively.
//Input: An array A[0… n-1]; low and high initialized to 0 and n-1, respectively.
//Output: Largest element in the array.
if low = high return A[low]
mid ← (low + high)/2
n1 ← Large(A, low, mid)
n2 ← Large(A, mid+1, high)
if n1 > n2 return n1
else return n2

Mergesort

 The strategy behind merge sort is to change the problem of sorting into the problem of merging two sorted sub-lists into one.
 Merge sort is a "recursive" algorithm because it accomplishes its task by calling itself on a smaller version of the problem (only half of the list).
 For example, if the array had 2 entries, merge sort would begin by calling itself for item 1. Since there is only one element, that sub-list is sorted, and it can go on to call itself on item 2. Since that also has only one item, it is sorted, and now merge sort can merge those two sub-lists into one sorted list of size two.
 If the two halves of the array are sorted, then merging them carefully completes the sort of the entire list.
 The real problem is how to merge the two sub-lists. While it can be done in the original array, the algorithm is much simpler if it uses a separate array to hold the portion that has been merged and then copies the merged data back into the original array.
 The basic philosophy of the merge is to determine which sub-list starts with the smallest data, copy that item into the merged list, and move on to the next item in that sub-list.

Example (4, 3, 2, 1): split into 4, 3 and 2, 1; split again into 4 | 3 | 2 | 1; merge pairs into 3, 4 and 1, 2; merge these into 1, 2, 3, 4.


Mergesort

ALGORITHM Mergesort(A[0…n-1])
//Sorts array A[0…n-1] by recursive mergesort
//Input: An array A[0…n-1] of orderable elements
//Output: Array A[0…n-1] sorted in nondecreasing order
if n > 1
    copy A[0 . . . ⌊n/2⌋ - 1] to B[0 . . . ⌊n/2⌋ - 1]
    copy A[⌊n/2⌋ . . . n - 1] to C[0 . . . ⌈n/2⌉ - 1]
    Mergesort(B[0 . . . ⌊n/2⌋ - 1])
    Mergesort(C[0 . . . ⌈n/2⌉ - 1])
    Merge(B, C, A)

ALGORITHM Merge(B[0 . . . p-1], C[0 . . . q-1], A[0 . . . p + q - 1])
//Merges two sorted arrays into one sorted array
//Input: Arrays B[0 . . . p-1] and C[0 . . . q-1], both sorted
//Output: Sorted array A[0 . . . p + q - 1] of the elements of B and C
i ← 0; j ← 0; k ← 0
while i < p and j < q do
    if B[i] ≤ C[j]
        A[k] ← B[i]; i ← i + 1
    else
        A[k] ← C[j]; j ← j + 1
    k ← k + 1
if i = p
    copy C[j . . . q - 1] to A[k . . . p + q - 1]
else
    copy B[i . . . p - 1] to A[k . . . p + q - 1]

A C program for mergesort:

#include<stdio.h>
#include<conio.h>
void mergesort(int *,int,int);
void merge(int *,int,int,int);
void main()
{
    int i,a[100],n;   /* array enlarged from a[10] so it can hold the inputs read below */
    clrscr();
    printf("enter n value\n");
    scanf("%d",&n);
    printf("enter values\n");
    for(i=0;i<n;i++)
        scanf("%d",&a[i]);
    mergesort(a,0,n-1);
    printf("\nSorted array\n");
    for(i=0;i<n;i++)
        printf("%d\t",a[i]);
    getch();
}

Mergesort

void mergesort(int *a,int low,int high)
{
    int mid;
    if(low<high)
    {
        mid=(low+high)/2;
        mergesort(a,low,mid);
        mergesort(a,mid+1,high);
        merge(a,low,mid+1,high);
    }
}

void merge(int *a,int low,int mid,int high)
{
    int i,j,k,m,temp[100];   /* sized to match a[100] in main */
    i=low,j=mid,k=-1;
    while(i<=mid-1 && j<=high)
    {
        if(a[i]<a[j])
            temp[++k]=a[i++];
        else
            temp[++k]=a[j++];
    }
    /* copy remaining elements into temp array */
    for(m=i;m<=mid-1;m++)
        temp[++k]=a[m];
    for(m=j;m<=high;m++)
        temp[++k]=a[m];
    /* copy elements of temp array back into array a */
    for(m=0;m<=k;m++)
        a[low+m]=temp[m];
}

Execution Example

Trace of mergesort on the list 7 2 9 4 3 8 6 1:
• Partition: 7 2 9 4 3 8 6 1 → 7 2 9 4 and 3 8 6 1
• Recursive calls partition further: 7 2 9 4 → 7 2 and 9 4; 3 8 6 1 → 3 8 and 6 1; then 7 2 → 7 | 2, 9 4 → 9 | 4, 3 8 → 3 | 8, 6 1 → 6 | 1 (base cases)
• Merges: 7 and 2 → 2 7; 9 and 4 → 4 9; then 2 7 and 4 9 → 2 4 7 9; similarly 3 and 8 → 3 8, 6 and 1 → 1 6, then 3 8 and 1 6 → 1 3 6 8
• Final merge: 2 4 7 9 and 1 3 6 8 → 1 2 3 4 6 7 8 9

Mergesort Example

For the list 4, 3, 2, 1, the recursion tree of calls is: (A, 0, 3) → (A, 0, 1) and (A, 2, 3) → (A, 0, 0), (A, 1, 1), (A, 2, 2), (A, 3, 3); the merges (A, 0, 1, 1) and (A, 2, 3, 3) produce 3, 4 and 1, 2, and the final merge (A, 0, 2, 3) produces 1, 2, 3, 4.

Analysis of Mergesort

 Assuming that n is a power of 2, the recurrence relation for the number of key comparisons C(n) is
C(n) = 2C(n/2) + Cmerge(n) for n > 1, C(1) = 0,
where Cmerge(n) is the number of key comparisons performed during the merging stage.
 For the worst case, Cmerge(n) = n - 1, and we have the recurrence
Cworst(n) = 2Cworst(n/2) + n - 1 for n > 1, Cworst(1) = 0.
 Hence, according to the Master Theorem,
Cworst(n) Є Θ(n log n). (Here a = 2, b = 2, d = 1; since a = b^d, Cworst(n) Є Θ(n^d log n).)


Quick Sort

 Quick sort divides its inputs according to their value to achieve its partition: a situation where all the elements before some position s are smaller than or equal to A[s] and all the elements after position s are greater than or equal to A[s]:
A[0] . . . A[s - 1]   A[s]   A[s + 1] . . . A[n - 1]
(all are ≤ A[s])              (all are ≥ A[s])
 After a partition has been achieved, A[s] will be in its final position in the sorted array, and we can continue sorting the two subarrays of the elements preceding and following A[s] independently, using the same method.

ALGORITHM Quicksort(A[l..r])
//Sorts a subarray by quicksort
//Input: A subarray A[l..r] of A[0..n - 1], defined by its left
//       and right indices l and r
//Output: The subarray A[l..r] sorted in nondecreasing order
if l < r
    s ← Partition(A[l..r]) //s is a split position
    Quicksort(A[l..s - 1])
    Quicksort(A[s + 1..r])

 The hard part of quick sort is the partitioning. The algorithm looks at the first element of the array (called the "pivot"). It puts all of the elements which are less than the pivot in the lower portion of the array and the elements higher than the pivot in the upper portion of the array. When that is complete, it can put the pivot between those sections, and quick sort will be able to sort the two sections separately.

Procedure to achieve the partition:

 First, we select a pivot element p with respect to whose value we are going to divide the subarray (usually, the first element in the subarray is taken as the pivot).
 The elements in the subarray are rearranged to achieve the partition by an efficient method based on a left-to-right scan and a right-to-left scan, each comparing the subarray's elements with the pivot.
 The left-to-right scan, denoted by index i, starts with the second element. This scan skips over elements that are smaller than the pivot and stops on encountering the first element greater than or equal to the pivot.
 The right-to-left scan, denoted by index j, starts with the last element. This scan skips over elements that are larger than the pivot and stops on encountering the first element smaller than or equal to the pivot.
 After both scans stop, three situations may arise, depending on whether or not the scanning indices have crossed:
• If the scanning indices i and j have not crossed, i.e., i < j, we simply exchange A[i] and A[j] and resume the scans by incrementing i and decrementing j.
• If the scanning indices have crossed over, i.e., i > j, we will have partitioned the array after exchanging the pivot with A[j].
• Finally, if the scanning indices stop while pointing to the same element, i.e., i = j, the value they are pointing to must be equal to p; thus we have the array partitioned, with the split position s = i = j. We can combine this case with the case of crossed-over indices (i > j) by exchanging the pivot with A[j] whenever i ≥ j.

Partition Procedure

ALGORITHM Partition(A[l..r])
//Partitions a subarray by using its first element as a pivot
//Input: Subarray A[l..r] of A[0..n - 1], defined by its left and right
//       indices l and r (l < r)
//Output: A partition of A[l..r], with the split position returned as
//        this function's value
p ← A[l]
i ← l; j ← r + 1
repeat
    repeat i ← i + 1 until A[i] ≥ p
    repeat j ← j - 1 until A[j] ≤ p
    swap(A[i], A[j])
until i ≥ j
swap(A[i], A[j]) //undo last swap when i ≥ j
swap(A[l], A[j])
return j


Trace of the partition steps on the array 40 20 10 80 60 50 7 30 100 (pivot_index = 0, pivot = 40):

1. While A[i] <= A[pivot]: ++i
2. While A[j] > A[pivot]: --j
3. If i < j: swap A[i] and A[j]
4. While j > i: go to step 1
5. Swap A[j] and A[pivot]

• i stops at 80 (index 3), j stops at 30 (index 7); i < j, so A[i] and A[j] are swapped:
40 20 10 30 60 50 7 80 100
• i stops at 60 (index 4), j stops at 7 (index 6); i < j, so A[i] and A[j] are swapped:
40 20 10 30 7 50 60 80 100
• i stops at 50 (index 5), j stops at 7 (index 4); the scanning indices have crossed (j < i), so the scans end.

41
21-Feb-20

1. While A[i] <= A[pivot]


1. While A[i] <= A[pivot] 1. While A[i] <= A[pivot] ++i
++i ++i 2. While A[j] > A[pivot]
2. While A[j] > A[pivot] 2. While A[j] > A[pivot] --j
--j --j 3. If i < j
3. If i < j 3. If i < j Swap A[i] and A[j]
Swap A[i] and A[j] Swap A[i] and A[j] 4. While j > i goto step 1
4. While j > i goto step 1 4. While j > i goto step 1 5. Swap A[j] and A[pivot]

pivot_index = 0 40 20 10 30 7 50 60 80 100 pivot_index = 0 40 20 10 30 7 50 60 80 100 pivot_index = 0 40 20 10 30 7 50 60 80 100


[0] [1] [2] [3] [4] [5] [6] [7] [8] [0] [1] [2] [3] [4] [5] [6] [7] [8] [0] [1] [2] [3] [4] [5] [6] [7] [8]

j i j i j i
370 371 372

1. While A[i] <= A[pivot]


++i
2. While A[j] > A[pivot]
Partition Result Recursion: Quicksort Sub-arrays
--j
3. If i < j
Swap A[i] and A[j]
4. While j > i goto step 1
5. Swap A[j] and A[pivot] 7 20 10 30 40 50 60 80 100 7 20 10 30 40 50 60 80 100
[0] [1] [2] [3] [4] [5] [6] [7] [8] [0] [1] [2] [3] [4] [5] [6] [7] [8]

pivot_index = 4 7 20 10 30 40 50 60 80 100 < = A[pivot] > A[pivot] < = A[pivot] > A[pivot]

[0] [1] [2] [3] [4] [5] [6] [7] [8]

j i
373 374 375
Quick sort program

#include<stdio.h>
#include<conio.h>

void Quicksort(int*,int,int);
int partition(int*,int,int);

void main()
{
    int a[100],n,i;
    clrscr();
    printf("\nEnter the size of the array\n");
    scanf("%d",&n);
    printf("\nEnter %d elements\n",n);
    for(i=0;i<n;i++)
        scanf("%d",&a[i]);
    Quicksort(a,0,n-1);
    printf("\nSorted array:\n");
    for(i=0;i<n;i++)
        printf("%d\t",a[i]);
    getch();
}

void Quicksort(int *a,int low,int high)
{
    int mid;
    if(low < high)
    {
        mid=partition(a,low,high);
        Quicksort(a,low,mid-1);
        Quicksort(a,mid+1,high);
    }
}

int partition(int *a,int low,int high)
{
    int i,j,temp,pivot;
    pivot=a[low];
    i=low;
    j=high+1;
    while(i<j)
    {
        do i++; while(i<=high && pivot>=a[i]);  /* bound check keeps i inside the array */
        do j--; while(pivot<a[j]);
        if(i<j) temp=a[i],a[i]=a[j],a[j]=temp;
    }
    temp=a[low];
    a[low]=a[j];
    a[j]=temp;
    return j;
}
Quicksort Example: Recursion Tree

    a[0] a[1] a[2] a[3] a[4] a[5] a[6] a[7]
     5    3    1    9    8    2    4    7

    low = 0, high = 7, mid = 4
        low = 0, high = 3, mid = 1
            low = 0, high = 0
            low = 2, high = 3, mid = 2
                low = 2, high = 1
                low = 3, high = 3
        low = 5, high = 7, mid = 6
            low = 5, high = 5
            low = 7, high = 7

Quick Sort Best case Efficiency
 The number of key comparisons made before a partition is achieved is n + 1 if the scanning indices cross over, and n if the scanning indices coincide.
 Therefore, the number of key comparisons in the best case will satisfy the recurrence:
    Cbest(n) = 2 Cbest(n/2) + n for n > 1, Cbest(1) = 0.
 According to the Master Theorem (a = 2, b = 2, d = 1, so a = b^d), Cbest(n) Є Θ(n log2 n).

Quick sort: Worst Case
 Assume the first element is chosen as pivot.
 Assume we get an array that is already in order:

    pivot_index = 0        2   4  10  12  13  50  57  63 100
                          [0] [1] [2] [3] [4] [5] [6] [7] [8]
 With pivot A[0] = 2, the left-to-right scan (while A[i] <= A[pivot], ++i) stops immediately at A[1] = 4, while the right-to-left scan (while A[j] > A[pivot], --j) skips over every element larger than the pivot and goes all the way down to A[0], so the indices cross with j = 0. The final step swaps A[j] with A[pivot], i.e., exchanges the pivot with itself, and the split is at position 0: one sub-array is empty and the other still contains all n - 1 remaining elements.
Quick Sort Worst case analysis
 In the worst case, all the splits will be skewed to the extreme: one of the two subarrays will be empty, while the size of the other will be just one less than the size of the subarray being partitioned. This situation arises, in particular, for increasing arrays (already sorted).
 If A[0 . . n-1] is a strictly increasing array and we use A[0] as the pivot, the left-to-right scan will stop on A[1] while the right-to-left scan will go all the way to reach A[0], indicating the split at position 0.
 So, after making n + 1 comparisons to get to this partition and exchanging the pivot A[0] with itself, the algorithm will find itself with the strictly increasing array A[1 . . n-1]. This sorting continues until the last subarray A[n-2 . . n-1] has been processed. The total number of key comparisons made will be equal to
    Cworst(n) = (n + 1) + n + . . . + 3 = (n + 1)(n + 2)/2 - 3 Є Θ(n²).

Quick Sort Average case analysis
 Let Cavg(n) be the average number of key comparisons made by quicksort on a randomly ordered array of size n; Cavg(0) = 0, Cavg(1) = 0.
 After the split, QUICKSORT calls itself to sort two subarrays: the average number of comparisons to sort an array A[0 . . s-1] is Cavg(s) and the average number to sort an array A[s+1 . . n-1] is Cavg(n-1-s).
 Assuming that the partition split can happen in each position s (0 ≤ s ≤ n-1) with the same probability 1/n, we get the following recurrence relation:
    Cavg(n) = (1/n) ∑ s=0..n-1 [(n + 1) + Cavg(s) + Cavg(n-1-s)] for n > 1.
 After solving the above recurrence, we get the solution
    Cavg(n) ≈ 2n ln n ≈ 1.38 n log2 n.
Quick Sort
 Thus, on the average, quicksort makes only 38% more comparisons than in the best case.
 Quicksort's innermost loop is so efficient that it runs faster than mergesort on randomly ordered arrays, justifying the name given to the algorithm by its inventor, the prominent British computer scientist C.A.R. Hoare.

Summary of Sorting Algorithms

Algorithm        Time                  Notes
Bubble-sort      O(n²)                 slow (good for small inputs)
Selection-sort   O(n²)                 slow (good for small inputs)
Merge-sort       O(n log n)            fast (good for huge inputs)
Quick-sort       O(n log n) expected   fastest (good for large inputs)

Binary search
 Binary Search is an incredibly powerful technique for searching an ordered list.
 The basic algorithm is to find the middle element of the list, compare it against the key, decide which half of the list must contain the key, and repeat with that half.
Binary search
 It works by comparing a search key K with the array's middle element A[m]. If they match, the algorithm stops; otherwise, the same operation is repeated recursively for the first half of the array if K < A[m] and for the second half if K > A[m].

BINARY SEARCH Example: maintain an array of Items, stored in sorted order, and use binary search to FIND the Item with Key K = 33.

Index  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14
Value  6  13  14  25  33  43  51  53  64  72  84  93  95  96  97

If Key K is in the array, then it has an index between low and high:
    low = 0, high = n - 1 = 15 - 1 = 14.
Each step computes the mid position and checks whether the matching Key is in that position; if not, the search interval is reduced to one half:
    if (Key < a[mid]) high = mid - 1;
    if (Key > a[mid]) low = mid + 1;
    if (Key == a[mid]) return mid;

low = 0, high = 14:  mid = (0 + 14)/2 = 7;  a[7] = 53. Since 33 < 53, we can reduce the search interval: high = mid - 1 = 6.
low = 0, high = 6:   mid = (0 + 6)/2 = 3;   a[3] = 25. Since 33 > 25, we can reduce the search interval: low = mid + 1 = 4.
low = 4, high = 6:   mid = (4 + 6)/2 = 5;   a[5] = 43. Since 33 < 43, we can reduce the search interval: high = mid - 1 = 4.
low = 4, high = 4:   mid = (4 + 4)/2 = 4;   a[4] = 33. Matching Key found. Return index 4.
Binary Search Algorithm

ALGORITHM BinarySearch(A[0 . . n-1], K)
//Implements nonrecursive binary search
//Input: An array A[0 . . n-1] sorted in ascending order
//       and a search key K
//Output: An index of the array's element that is equal
//        to K or -1 if there is no such element
l ← 0; r ← n - 1
while l ≤ r do
    m ← ⌊(l + r)/2⌋
    if K = A[m] return m
    else if K < A[m] r ← m - 1
    else l ← m + 1
return -1

Binary Search Algorithm Analysis:

Worst-Case Analysis
 The worst-case inputs include all arrays that do not contain a given search key.
 Since after one comparison the algorithm faces the same situation but for an array half the size, we get the following recurrence relation for Cworst(n):
    Cworst(n) = Cworst(⌊n/2⌋) + 1 for n > 1, Cworst(1) = 1.
 Assuming that n = 2^k, the solution to the above recurrence is
    Cworst(2^k) = k + 1 = log2 n + 1.

Average-Case Analysis
 The average number of key comparisons made by binary search is only slightly smaller than that in the worst case:
    Cavg(n) ≈ log2 n.
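The nonrecursive pseudocode translates directly into C; here is a minimal sketch in the style of the other programs in these notes (the function name binsearch and the demo array are ours):

#include<stdio.h>

/* Nonrecursive binary search: returns an index of an element equal
   to key in the sorted array a[0 . . n-1], or -1 if there is none. */
int binsearch(int a[], int n, int key)
{
    int l = 0, r = n - 1, m;
    while (l <= r)
    {
        m = (l + r) / 2;               /* middle position */
        if (key == a[m]) return m;
        else if (key < a[m]) r = m - 1;
        else l = m + 1;
    }
    return -1;
}

int main(void)
{
    int a[] = {6, 13, 14, 25, 33, 43, 51, 53, 64, 72, 84, 93, 95, 96, 97};
    printf("%d\n", binsearch(a, 15, 33));   /* prints 4, as in the trace above */
    return 0;
}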
Multiplication of large integers
 Some applications, notably modern cryptology, require manipulation of long integers.
 By applying the divide-and-conquer technique to multiply two long numbers, the total number of multiplications performed can be reduced at the expense of a slight increase in the number of additions.
 If we use the classic pen-and-pencil algorithm for multiplying two n-digit integers, each of the n digits of the first number is multiplied by each of the n digits of the second number, for a total of n² digit multiplications.

 Consider a case of two-digit integers, say 23 and 14. These numbers can be represented as follows:
    23 = 2·10^1 + 3·10^0 and 14 = 1·10^1 + 4·10^0.
 Now let us multiply them:
    23 * 14 = (2·10^1 + 3·10^0) * (1·10^1 + 4·10^0)
            = (2 * 1)·10^2 + (2 * 4 + 3 * 1)·10^1 + (3 * 4)·10^0.
 The last formula yields the correct answer of 322. But it uses the same four digit multiplications as the pen-and-pencil algorithm. However, we can compute the middle term with just one digit multiplication by taking advantage of the products 2 * 1 and 3 * 4 that need to be computed anyway:
    2 * 4 + 3 * 1 = (2 + 3) * (1 + 4) - 2 * 1 - 3 * 4.

 For any pair of two-digit integers a = a1 a0 and b = b1 b0, their product c can be computed by the formula
    c = a * b = c2·10^2 + c1·10^1 + c0,
 where
    c2 = a1 * b1 is the product of their first digits,
    c0 = a0 * b0 is the product of their second digits,
    c1 = (a1 + a0) * (b1 + b0) - (c2 + c0) is the product of the sum of the a's digits and the sum of the b's digits minus the sum of c2 and c0.

 Now we apply this trick to multiply two n-digit integers a and b where n is a positive even number. Let us divide both numbers in the middle. We denote the first half of a's digits by a1 and the second half by a0; for b, the notations are b1 and b0, respectively. In these notations, a = a1 a0 implies that a = a1·10^(n/2) + a0, and b = b1 b0 implies that b = b1·10^(n/2) + b0.
 Therefore, taking advantage of the same trick we used for two-digit numbers, we get
    c = a * b = (a1·10^(n/2) + a0) * (b1·10^(n/2) + b0)
              = (a1 * b1)·10^n + (a1 * b0 + a0 * b1)·10^(n/2) + (a0 * b0)
              = c2·10^n + c1·10^(n/2) + c0,
 where
    c2 = a1 * b1 is the product of their first halves,
    c0 = a0 * b0 is the product of their second halves,
    c1 = (a1 + a0) * (b1 + b0) - (c2 + c0) is the product of the sum of the a's halves and the sum of the b's halves minus the sum of c2 and c0.
 If n/2 is even, we can apply the same method for computing the products c2, c0 and c1.
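Before analyzing the recursion, here is a small C sketch of the three-multiplication scheme (the names karatsuba, digits, and ipow10 are ours; machine integers are used only to keep the illustration short, whereas a true large-integer implementation would operate on arrays of digits):

#include<stdio.h>

/* number of decimal digits in x (x >= 0) */
int digits(long long x)
{
    int d = 1;
    while (x >= 10) { x /= 10; d++; }
    return d;
}

/* 10^k as a long long */
long long ipow10(int k)
{
    long long p = 1;
    while (k-- > 0) p *= 10;
    return p;
}

/* Divide-and-conquer multiplication with three recursive multiplications:
   c = c2*10^n + c1*10^(n/2) + c0, where c2 = a1*b1, c0 = a0*b0 and
   c1 = (a1 + a0)*(b1 + b0) - (c2 + c0). */
long long karatsuba(long long a, long long b)
{
    int n, half;
    long long p, a1, a0, b1, b0, c2, c1, c0;
    if (a < 10 || b < 10)               /* recursion stops on one-digit numbers */
        return a * b;
    n = digits(a) > digits(b) ? digits(a) : digits(b);
    half = n / 2;
    p = ipow10(half);
    a1 = a / p; a0 = a % p;             /* a = a1*10^half + a0 */
    b1 = b / p; b0 = b % p;
    c2 = karatsuba(a1, b1);
    c0 = karatsuba(a0, b0);
    c1 = karatsuba(a1 + a0, b1 + b0) - (c2 + c0);
    return c2 * p * p + c1 * p + c0;
}

int main(void)
{
    printf("%lld\n", karatsuba(23, 14));     /* 322, as computed above */
    printf("%lld\n", karatsuba(1234, 5678)); /* 7006652 */
    return 0;
}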
Multiplication of large integers
 Thus, if n is a power of 2, we have a recursive algorithm for computing the product of two n-digit integers. Recursion is stopped when n becomes one.

Analysis:
 Since multiplication of n-digit numbers requires three multiplications of n/2-digit numbers, the recurrence for the number of multiplications M(n) will be
    M(n) = 3M(n/2) for n > 1, M(1) = 1.
 Solving it by backward substitutions for n = 2^k yields
    M(2^k) = 3M(2^(k-1)) = 3[3M(2^(k-2))] = 3² M(2^(k-2)) = . . . = 3^i M(2^(k-i)) = . . . = 3^k M(2^(k-k)) = 3^k M(1) = 3^k.
 Since 2^k = n, taking log on both sides gives k = log2 n, and therefore
    M(n) = 3^(log2 n) = n^(log2 3) ≈ n^1.585.
 (Note: a^(log_b c) = c^(log_b a).)

Strassen's Matrix Multiplication
 V. Strassen published an algorithm in 1969 which suggests that we can find the product C of two 2-by-2 matrices A and B with just seven multiplications, as opposed to the eight required by the brute-force algorithm.
 This is accomplished by using the following formulas:

    c00 c01     a00 a01     b00 b01
    c10 c11  =  a10 a11  *  b10 b11

             =  m1 + m4 - m5 + m7    m3 + m5
                m2 + m4              m1 + m3 - m2 + m6

 where
    m1 = (a00 + a11) * (b00 + b11)
    m2 = (a10 + a11) * b00
    m3 = a00 * (b01 - b11)
    m4 = a11 * (b10 - b00)
    m5 = (a00 + a01) * b11
    m6 = (a10 - a00) * (b00 + b01)
    m7 = (a01 - a11) * (b10 + b11)

 Thus, to multiply two 2-by-2 matrices, Strassen's algorithm makes 7 multiplications and 18 additions/subtractions, whereas the brute-force algorithm requires 8 multiplications and 4 additions. Its importance can be seen as the matrix order n goes to infinity.

Analysis:
 Let A and B be two n-by-n matrices where n is a power of two. We can divide A, B, and their product C into four n/2-by-n/2 submatrices each as follows:

    C00 C01     A00 A01     B00 B01
    C10 C11  =  A10 A11  *  B10 B11

 Example: C00 can be computed either as A00 * B00 + A01 * B10 or as M1 + M4 - M5 + M7, where M1, M4, M5, and M7 are found by Strassen's formulas.
 If the seven products of n/2-by-n/2 matrices are computed recursively by the same method, we have Strassen's algorithm for matrix multiplication.
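As a quick numeric check of the formulas (our own example, not from the original notes), take A = [1 2; 3 4] and B = [5 6; 7 8]:
    m1 = (1 + 4)(5 + 8) = 65, m2 = (3 + 4)·5 = 35, m3 = 1·(6 - 8) = -2, m4 = 4·(7 - 5) = 8,
    m5 = (1 + 2)·8 = 24, m6 = (3 - 1)(5 + 6) = 22, m7 = (2 - 4)(7 + 8) = -30.
Then c00 = m1 + m4 - m5 + m7 = 19, c01 = m3 + m5 = 22, c10 = m2 + m4 = 43, c11 = m1 + m3 - m2 + m6 = 50, which agrees with the brute-force product [19 22; 43 50].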
Strassen's Matrix Multiplication

Analysis:
 If M(n) is the number of multiplications made by Strassen's algorithm in multiplying two n-by-n matrices (where n is a power of 2), we get the following recurrence relation for it:
    M(n) = 7M(n/2) for n > 1, M(1) = 1.
 Solving it by backward substitutions for n = 2^k yields
    M(2^k) = 7M(2^(k-1)) = 7[7M(2^(k-2))] = 7² M(2^(k-2)) = . . . = 7^i M(2^(k-i)) = . . . = 7^k M(2^(k-k)) = 7^k M(1) = 7^k.
 Since 2^k = n, taking log on both sides gives k = log2 n, and therefore
    M(n) = 7^(log2 n) = n^(log2 7) ≈ n^2.807,
 which is smaller than the n³ required by the brute-force algorithm. (Note: a^(log_b c) = c^(log_b a).)

ANALYSIS AND DESIGN OF ALGORITHMS
UNIT-II
CHAPTER 5: DECREASE-AND-CONQUER

OUTLINE:
 Decrease-and-Conquer
   • Insertion Sort
   • Depth-First Search
   • Breadth-First Search
   • Topological Sorting
   • Algorithms for Generating Combinatorial Objects
      Generating Permutations
      Generating Subsets
Decrease-and-Conquer
 The decrease-and-conquer technique is based on exploiting the relationship between a solution to a given instance of a problem and a solution to a smaller instance of the same problem.
 Once such a relationship is established, it can be exploited either top down (recursively) or bottom up (without a recursion).
 Decrease-and-conquer algorithms work according to the following general plan:
  1. A problem instance is reduced to a smaller instance of the same problem.
  2. The smaller instance is solved.
  3. The solution of the smaller instance is extended to obtain the solution to the original problem.

Examples of Decrease and Conquer
There are three major variations of decrease-and-conquer:
 Decrease by one:
   • Insertion sort
   • Graph algorithms: DFS, BFS, topological sorting
   • Algorithms for generating permutations, subsets
 Decrease by a constant factor:
   • Binary search
   • Josephus problem
 Variable-size decrease:
   • Euclid's algorithm

Decrease – by – a – constant:
 In this variation, the size of an instance is reduced by the same constant (typically, this constant is equal to one) on each iteration of the algorithm.
 Reduction-by-two cases do happen occasionally, for example, in algorithms that have to act differently for instances of odd and even sizes.
 For example, consider the exponentiation problem of computing a^n for positive integer exponents:
   • The relationship between a solution to an instance of size n and an instance of size n-1 is obtained by the formula a^n = a^(n-1) · a.
   • So, the function f(n) = a^n can be computed "top down" by using its recursive definition:
        f(n) = f(n-1) · a   if n > 1
               a            if n = 1
   • Alternatively, f(n) = a^n can be computed "bottom up" by multiplying a by itself n-1 times.
Figure : Decrease (by one) – and – conquer technique: a problem of size n is reduced to a subproblem of size n-1, and the solution to the subproblem is extended to a solution to the original problem.

Decrease – by – a – constant – factor:
 In this variation, the size of a problem instance is reduced by the same constant factor on each iteration of the algorithm.
 For example, consider the exponentiation problem of computing a^n for positive integer exponents:
   • If the instance of size n is to compute a^n, the instance of half its size will be to compute a^(n/2), with the obvious relationship between the two solutions: a^n = (a^(n/2))².
   • If n is odd, we have to compute a^(n-1) by using the rule for even-valued exponents and then multiply the result by a.
   • To summarize, we have the following formula:
        a^n = (a^(n/2))²           if n is even and positive
              (a^((n-1)/2))² · a   if n is odd and greater than 1
              a                    if n = 1
   • If we compute a^n recursively according to the above formula and measure the algorithm's efficiency by the number of multiplications, then the algorithm is expected to be in O(log n); a small C sketch follows the figure caption below.

Figure : Decrease (by half) – and – conquer technique: a problem of size n is reduced to a subproblem of size n/2, and the solution to the subproblem is extended to a solution to the original problem.
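A minimal C sketch of the decrease-by-half rule (the function name power is ours; overflow is ignored):

#include<stdio.h>

/* Computes a^n (n >= 1) by the rule a^n = (a^(n/2))^2 for even n and
   a^n = (a^((n-1)/2))^2 * a for odd n > 1; makes O(log n) multiplications. */
long long power(long long a, int n)
{
    long long half;
    if (n == 1) return a;
    half = power(a, n / 2);       /* note: n/2 equals (n-1)/2 for odd n */
    return (n % 2 == 0) ? half * half : half * half * a;
}

int main(void)
{
    printf("%lld\n", power(2, 10));   /* 1024 */
    return 0;
}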
Variable – Size – Decrease:
 In this variation, the size reduction pattern varies from one iteration of an algorithm to another.
 Euclid's algorithm for computing the GCD provides a good example of such a situation:
    GCD(m, n) = m                 if n = 0
                GCD(n, m mod n)   otherwise

Summary: consider the problem of exponentiation, compute a^n
 Brute Force:                  a^n = a * a * a * . . . * a
 Divide and conquer:           a^n = a^(n/2) * a^(n/2)
 Decrease by one:              a^n = a^(n-1) * a
 Decrease by constant factor:  a^n = (a^(n/2))² if n is even
                               a^n = (a^((n-1)/2))² · a if n is odd

Insertion Sort
 The decrease-by-one technique is applied to sort an array A[0 . . n-1].
 We assume that the smaller problem of sorting the array A[0 . . n-2] has already been solved to give us a sorted array of size n - 1: A[0] ≤ . . . ≤ A[n-2].
 All we need is to find an appropriate position for A[n-1] among the sorted elements and insert it there.
 There are three reasonable alternatives for doing this:
   • First, we can scan the sorted subarray from left to right until the first element greater than or equal to A[n-1] is encountered and then insert A[n-1] right before that element.
   • Second, we can scan the sorted subarray from right to left until the first element smaller than or equal to A[n-1] is encountered and then insert A[n-1] right after that element.
     o This is the one implemented in practice because it is better for sorted or almost-sorted arrays. The resulting algorithm is called straight insertion sort or simply insertion sort.
Insertion Sort
 • The third alternative is to use binary search to find an appropriate position for A[n-1] in the sorted portion of the array. The resulting algorithm is called binary insertion sort.
   o It improves the number of comparisons (worst and average case),
   o but it still requires the same number of moves/swaps.
 As shown in the figure below, starting with A[1] and ending with A[n-1], A[i] is inserted in its appropriate place among the first i elements of the array that have already been sorted:

    A[0] ≤ . . . ≤ A[j] < A[j + 1] ≤ . . . ≤ A[i - 1] | A[i] . . . A[n - 1]
    smaller than or equal to A[i]       greater than A[i]

Pseudocode of Insertion Sort
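In the style of the other ALGORITHM blocks in these notes (and matching the C implementation that follows), the algorithm can be written as:

ALGORITHM InsertionSort(A[0 . . n-1])
//Sorts a given array by insertion sort
//Input: An array A[0 . . n-1] of n orderable elements
//Output: Array A[0 . . n-1] sorted in nondecreasing order
for i ← 1 to n - 1 do
    v ← A[i]
    j ← i - 1
    while j ≥ 0 and A[j] > v do
        A[j + 1] ← A[j]
        j ← j - 1
    A[j + 1] ← v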
Insertion Sort program:

#include<stdio.h>
#include<conio.h>

void insertion(int*,int);

void main()
{
    int a[10],n,i;
    printf("\nEnter the size of the array\n");
    scanf("%d",&n);
    printf("\nEnter the array elements\n");
    for(i=0;i<n;i++)
        scanf("%d",&a[i]);
    printf("\nArray before sorting\n");
    for(i=0;i<n;i++)
        printf("%d\n",a[i]);
    insertion(a,n);
    printf("\nArray after sorting\n");
    for(i=0;i<n;i++)
        printf("%d\n",a[i]);
}

void insertion(int *a,int n)
{
    int i,j,temp;
    for(i=1;i<n;i++)
    {
        temp=a[i];
        j=i-1;
        while((j>=0) && (a[j] > temp))
        {
            a[j+1]=a[j];
            j=j-1;
        }
        a[j+1]=temp;
    }
}
Insertion Sort Example: sorting the array 20 8 5 10 7

              a[0] a[1] a[2] a[3] a[4]
Insert Action: i=1, first iteration (temp = a[1] = 8)
               20   8    5   10    7
               20  20    5   10    7
                8  20    5   10    7

Insert Action: i=2, second iteration (temp = a[2] = 5)
                8  20    5   10    7
                8  20   20   10    7
                8   8   20   10    7
                5   8   20   10    7

Insert Action: i=3, third iteration (temp = a[3] = 10)
                5   8   20   10    7
                5   8   20   20    7
                5   8   10   20    7

Insert Action: i=4, fourth iteration (temp = a[4] = 7)
                5   8   10   20    7
                5   8   10   20   20
                5   8   10   10   20
                5   8    8   10   20
                5   7    8   10   20    Sorted ARRAY

Insertion Sort Analysis:
 The number of key comparisons in this algorithm obviously depends on the nature of the input.
 In the worst case, A[j] > v is executed the largest number of times, i.e., for every j = i-1, . . . , 0.
   • Thus, for the worst-case input, we get A[0] > A[1] (for i = 1), A[1] > A[2] (for i = 2), . . . , A[n-2] > A[n-1] (for i = n-1):
        Cworst(n) = ∑ i=1..n-1 ∑ j=0..i-1 1 = ∑ i=1..n-1 i = (n-1)n/2 Є Θ(n²).
   • Thus, in the worst case, insertion sort makes exactly the same number of comparisons as selection sort.
Insertion Sort Analysis:
 In the best case, the comparison A[j] > v is executed only once on every iteration of the outer loop.
   • It happens if and only if A[i-1] ≤ A[i] for every i = 1, . . . , n-1, i.e., if the input array is already sorted in ascending order:
        Cbest(n) = ∑ i=1..n-1 1 = n - 1 Є Θ(n).
 In the average case, Cavg(n) ≈ n²/4 Є Θ(n²).

Graph Traversal
 Many graph algorithms require processing vertices or edges of a graph in a systematic fashion.
 Two principal graph traversal algorithms:
    Depth-first search (DFS)
    Breadth-first search (BFS)
 Depth-First Search (DFS) and Breadth-First Search (BFS):
   • Two elementary traversal algorithms that provide an efficient way to "visit" each vertex and edge exactly once.
   • Both work on directed or undirected graphs.
   • Many advanced graph algorithms are based on the concepts of DFS or BFS.
   • The difference between the two algorithms is in the order in which each "visits" vertices.
Depth-First Search
 DFS starts visiting vertices of a graph at an arbitrary vertex by marking it as having been visited.
 On each iteration, the algorithm proceeds to an unvisited vertex that is adjacent to the last visited vertex.
 This process continues until a dead end – a vertex with no adjacent unvisited vertices – is encountered.
 At a dead end, the algorithm backs up one edge to the vertex it came from and tries to continue visiting unvisited vertices from there.
 The algorithm eventually halts after backing up to the starting vertex, with the latter being a dead end.
 DFS uses a stack to hold vertices that may still have unvisited neighbors. We push a vertex onto the stack when the vertex is reached for the first time, and we pop a vertex off the stack when it becomes a dead end.
   • The stack may simply be the call stack (recursion).
 Whenever a new unvisited vertex is reached for the first time, it is attached as a child to the vertex from which it is being reached. Such an edge is called a tree edge because the set of all such edges forms a forest.
 The algorithm may also encounter an edge leading to a previously visited vertex other than its immediate predecessor (i.e., its parent in the tree). Such an edge is called a back edge because it connects a vertex to its ancestor, other than the parent, in the depth-first search forest.

Pseudo code of DFS
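A standard recursive formulation, written as a sketch in the style of the other ALGORITHM blocks here (the counter count records the order in which vertices are first reached):

ALGORITHM DFS(G)
//Implements a depth-first search traversal of a given graph
//Input: Graph G = (V, E)
//Output: Graph G with its vertices marked with consecutive integers
//        in the order they are first encountered by the DFS traversal
mark each vertex in V with 0 as a mark of being "unvisited"
count ← 0
for each vertex v in V do
    if v is marked with 0
        dfs(v)

dfs(v)
//visits recursively all the unvisited vertices connected to vertex v
//and numbers them in the order they are encountered
count ← count + 1; mark v with count
for each vertex w in V adjacent to v do
    if w is marked with 0
        dfs(w)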
Example1: DFS turns a graph into a tree
 [Figure: an undirected graph with vertices a, b, c, d, e, f, g, h; the DFS traversal produces a DFS tree whose edges are classified into tree edges and back edges. Legend: unexplored vertex, visited vertex, unexplored edge, tree edge, back edge.]

Example2
 [Figure: step-by-step DFS on a graph with vertices A, B, C, D, E, showing how each edge becomes either a tree edge or a back edge as the traversal proceeds.]
How efficient is Depth-First Search?
 The DFS algorithm is quite efficient, since it takes just the time proportional to the size of the data structure used for representing the graph in question.
 Thus, for the adjacency matrix representation, the traversal's time is in Θ(|V|²), and for the adjacency list representation, it is in Θ(|V| + |E|), where |V| and |E| are the number of the graph's vertices and edges, respectively.

Depth-First Search
 We can look at the DFS forest as the given graph with its edges classified by the DFS traversal into two disjoint classes: tree edges and back edges.
   – Tree edges are edges used by the DFS traversal to reach previously unvisited vertices.
   – Back edges connect vertices to previously visited vertices other than their immediate predecessors in the traversal.
 DFS yields two distinct orderings of vertices:
   – the order in which the vertices are reached for the first time (pushed onto the stack);
   – the order in which the vertices become dead ends (popped off the stack).
 These orders are qualitatively different, and various applications can take advantage of either of them.

Applications of DFS
 Checking connectivity: since DFS halts after visiting all the vertices connected by a path to the starting vertex, checking a graph's connectivity can be done as follows. Start a DFS traversal at an arbitrary vertex and check, after the algorithm halts, whether all the graph's vertices will have been visited. If they have, the graph is connected; otherwise, it is not connected.
 Identifying the connected components of a graph.
 Checking acyclicity: if the graph does not have back edges, then it is clearly acyclic.
 Finding the articulation points of a graph: a vertex of a connected graph is said to be its articulation point if its removal with all edges incident to it breaks the graph into disjoint pieces.
Breadth-first search (BFS)
 It proceeds in a concentric manner by visiting first all the vertices that are adjacent to a starting vertex, then all unvisited vertices two edges apart from it, and so on, until all the vertices in the same connected component as the starting vertex are visited.
 If unvisited vertices still remain, the algorithm has to be restarted at an arbitrary vertex of another connected component of the graph.
 Instead of a stack, BFS uses a queue. The queue is initialized with the traversal's starting vertex, which is marked as visited. On each iteration, the algorithm identifies all unvisited vertices that are adjacent to the front vertex, marks them as visited, and adds them to the queue; after that, the front vertex is removed from the queue.
 It is similar to a level-by-level tree traversal.
 Whenever a new unvisited vertex is reached for the first time, it is attached as a child to the vertex from which it is being reached. Such an edge is called a tree edge because the set of all such edges forms a forest.
 The algorithm may also encounter an edge leading to a previously visited vertex other than its immediate predecessor (i.e., its parent in the tree). Such an edge is called a cross edge.

Pseudocode of BFS
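A standard queue-based formulation, again a sketch in the style of the other ALGORITHM blocks here:

ALGORITHM BFS(G)
//Implements a breadth-first search traversal of a given graph
//Input: Graph G = (V, E)
//Output: Graph G with its vertices marked with consecutive integers
//        in the order they have been visited by the BFS traversal
mark each vertex in V with 0 as a mark of being "unvisited"
count ← 0
for each vertex v in V do
    if v is marked with 0
        bfs(v)

bfs(v)
//visits all the unvisited vertices connected to vertex v
//and numbers them in the order they are visited
count ← count + 1; mark v with count
initialize a queue with v
while the queue is not empty do
    for each vertex w in V adjacent to the front vertex do
        if w is marked with 0
            count ← count + 1; mark w with count
            add w to the queue
    remove the front vertex from the queue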
Example1 of BFS traversal of undirected graph
 [Figure: an undirected graph with vertices a, b, c, d, e, f, g, h; the BFS traversal started at a produces a BFS tree with tree edges and cross edges. Legend: unexplored vertex, visited vertex, unexplored edge, tree edge, cross edge.]
 BFS traversal queue (vertices shown with their visit numbers; the front vertex is removed after its neighbors are added):
    a(1) b(2) e(3) f(4)
         b(2) e(3) f(4) g(5)
                   g(5) c(6) h(7)
                        c(6) h(7) d(8)

Example2 of BFS Traversal
 [Figure: step-by-step BFS on a graph with vertices A, B, C, D, E, F, visited level by level: L0 = {A}, L1 = {B, C, D}, L2 = {E, F}; each edge becomes either a tree edge or a cross edge as the traversal proceeds.]
Applications of BFS
 Checking connectivity of a graph.
 Checking acyclicity of a graph.
 BFS can be helpful in some situations where DFS cannot. For example, BFS can be used for finding a path with the fewest number of edges between two given vertices: we start a BFS traversal at one of the two vertices given and stop it as soon as the other vertex is reached.
 Note: BFS is not applicable for finding the articulation points of a graph.

Main facts about DFS and BFS

                                   DFS                    BFS
Data structure                     stack                  queue
No. of vertex orderings            2 orderings            1 ordering
Edge types (undirected graphs)     tree and back edges    tree and cross edges
Applications                       connectivity,          connectivity,
                                   acyclicity,            acyclicity,
                                   articulation points    minimum-edge paths
Efficiency for adjacency matrix    Θ(|V|²)                Θ(|V|²)
Efficiency for adjacency lists     Θ(|V| + |E|)           Θ(|V| + |E|)

 BFS yields a single ordering of vertices because the queue is a FIFO structure, and hence the order in which vertices are added to the queue is the same order in which they are removed from it.
 Tree edges are the ones used to reach previously unvisited vertices. Cross edges connect vertices to those visited before, but, unlike back edges in a DFS tree, they connect vertices either on the same or on adjacent levels of a BFS tree.

Few Basic Facts about Directed graphs
 A directed graph, or digraph, is a graph with directions specified for all its edges.
 There are only two notable differences between undirected and directed graphs in representing them by the adjacency matrix and adjacency lists:
    The adjacency matrix of a directed graph does not have to be symmetric.
    An edge in a directed graph has just one (not two) corresponding node in the digraph's adjacency lists.
Directed graphs
 [Figure: (a) A digraph with vertices a, b, c, d, e. (b) The DFS forest of the digraph for the DFS traversal started at a.]
 DFS and BFS are principal traversal algorithms for traversing digraphs, but the structure of the corresponding forests can be more complex.
 The DFS forest in the figure exhibits all four types of edges possible in a DFS forest of a directed graph: tree edges (ab, bc, de), back edges (ba) from vertices to their ancestors, forward edges (ac) from vertices to their descendants in the tree other than their children, and cross edges (dc), which are none of the above mentioned types.
 Note: a back edge in a DFS forest of a directed graph can connect a vertex to its parent.
 The presence of a back edge indicates that the digraph has a directed cycle.
 If a DFS forest of a digraph has no back edges, the digraph is a dag, an acronym for directed acyclic graph.
 Note: a directed cycle in a digraph is a sequence of three or more of its vertices that starts and ends with the same vertex and in which every vertex is connected to its immediate predecessor by an edge directed from the predecessor to the successor.
Topological Sorting
EXAMPLE:
 Consider a set of five required courses {C1, C2, C3, C4, C5} a part-time student has to take in some degree program.
 The courses can be taken in any order as long as the following course prerequisites are met: C1 and C2 have no prerequisites, C3 requires C1 and C2, C4 requires C3, and C5 requires C3 and C4. The student can take only one course per term.
 In which order should the student take the courses?
 The situation can be modeled by a digraph in which vertices represent courses and directed edges indicate prerequisite requirements:
    C1 → C3, C2 → C3, C3 → C4, C3 → C5, C4 → C5.

 The question is whether we can list the vertices of a digraph in such an order that, for every edge in the graph, the vertex where the edge starts is listed before the vertex where the edge ends. This problem is called topological sorting.
 For topological sorting to be possible, a digraph must be a dag: if a digraph has no cycles, the topological sorting problem for it has a solution.
 There are two efficient algorithms that both verify whether a digraph is a dag and, if it is, produce an ordering of vertices that solves the topological sorting problem:
    the DFS-based algorithm
    the source-removal algorithm

Topological sorting Algorithms
1. DFS-based algorithm:
    Perform a DFS traversal, noting the order in which vertices are popped off the stack (i.e., the order in which vertices become dead ends).
    Reversing this order solves the topological sorting problem.
    If a back edge is encountered, the digraph is NOT a dag!
2. Source removal algorithm:
    This algorithm is based on a direct implementation of the decrease-(by one)-and-conquer technique: repeatedly identify and remove a source vertex, i.e., a vertex that has no incoming edges.
    The order in which the vertices are deleted yields a solution to the topological sorting problem.

Topological Ordering: Source Removal Algorithm, Example
 [Figure: a digraph with vertices v1, . . . , v7. At each step the algorithm removes a remaining source vertex:]
    remove v1: topological order so far: v1
    remove v2: topological order so far: v1, v2
    remove v3: topological order so far: v1, v2, v3
    remove v4: topological order so far: v1, v2, v3, v4
    remove v5: topological order so far: v1, v2, v3, v4, v5
    remove v6: topological order so far: v1, v2, v3, v4, v5, v6
    remove v7: topological order: v1, v2, v3, v4, v5, v6, v7.
 A C sketch of this procedure follows.
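A small C sketch of the source-removal idea on an adjacency matrix (function and variable names are ours); it is demonstrated on the course digraph from the example above:

#include<stdio.h>

#define N 5   /* vertices C1 . . C5 of the course example above */

/* Source-removal topological sorting: repeatedly find a vertex with no
   incoming edges, output it, and delete it together with all edges going
   out of it. Returns 1 on success and 0 if the digraph is not a dag. */
int topo_sort(int adj[N][N], int order[N])
{
    int indeg[N] = {0}, removed[N] = {0};
    int u, v, k, src;
    for (u = 0; u < N; u++)
        for (v = 0; v < N; v++)
            if (adj[u][v]) indeg[v]++;
    for (k = 0; k < N; k++)
    {
        src = -1;
        for (v = 0; v < N; v++)            /* find a remaining source */
            if (!removed[v] && indeg[v] == 0) { src = v; break; }
        if (src == -1) return 0;           /* no source left: there is a cycle */
        order[k] = src;
        removed[src] = 1;
        for (v = 0; v < N; v++)            /* delete src's outgoing edges */
            if (adj[src][v]) indeg[v]--;
    }
    return 1;
}

int main(void)
{
    /* C1->C3, C2->C3, C3->C4, C3->C5, C4->C5 */
    int adj[N][N] = {0};
    int order[N], k;
    adj[0][2] = adj[1][2] = adj[2][3] = adj[2][4] = adj[3][4] = 1;
    if (topo_sort(adj, order))
        for (k = 0; k < N; k++)
            printf("C%d ", order[k] + 1);  /* prints C1 C2 C3 C4 C5 */
    printf("\n");
    return 0;
}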
Algorithms for Generating Combinatorial Objects
 The most important types of combinatorial objects are:
    permutations
    combinations
    subsets of a given set
 They typically arise in problems that require a consideration of different choices.
 To solve these problems we need to generate combinatorial objects.

Generating Permutations
 Assume that the set whose elements need to be permuted is the set of integers from 1 to n; they can be interpreted as indices of elements in an n-element set {a1, …, an}.
 What would the decrease-by-one technique suggest for the problem of generating all n! permutations?
 Approach:
    The smaller-by-one problem is to generate all (n-1)! permutations.
    Assuming that the smaller problem is solved, we can get a solution to the larger one by inserting n in each of the n possible positions among the elements of every permutation of n-1 elements.
    There are two possible orders of insertion: either left to right or right to left.
    The total number of all permutations will be n·(n-1)! = n!.
Generating Permutations
 The minimal-change requirement is satisfied: each permutation is obtained from the previous one by exchanging only two elements. This is beneficial for the algorithm's speed.
 [Figure: Generating permutations bottom up.]

 Another way to get the same ordering:
    Associate a direction with each component k in a permutation, indicated by a small arrow.
    The component k is said to be mobile if its arrow points to a smaller number adjacent to it.
    For example, in the permutation 3 2 4 1 (with its arrows), 3 and 4 are mobile elements, while 2 and 1 are not.
 The following algorithm uses this notion.

ALGORITHM JohnsonTrotter(n)
// Implements the Johnson-Trotter algorithm for generating permutations
// Input: A positive integer n
// Output: A list of permutations of {1, … , n}
initialize the first permutation with 1 2 … n (all arrows pointing left)
while there exists a mobile integer k do
    find the largest mobile integer k
    swap k and the adjacent integer k's arrow points to
    reverse the direction of all the integers that are larger than k
    add the new permutation to the list
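A compact C sketch of the algorithm (the array bound 16 and the function name are ours; each value carries its direction in a parallel array):

#include<stdio.h>

#define LEFT  -1
#define RIGHT  1

/* Johnson-Trotter: an element is mobile if its arrow points to a
   smaller adjacent element; repeatedly move the largest mobile one. */
void johnson_trotter(int n)
{
    int p[16], d[16];                        /* assumes n <= 16 */
    int i, j, m, t;
    for (i = 0; i < n; i++) { p[i] = i + 1; d[i] = LEFT; }
    for (;;)
    {
        for (i = 0; i < n; i++) printf("%d", p[i]);
        printf("\n");
        m = -1;                              /* find the largest mobile element */
        for (i = 0; i < n; i++)
        {
            j = i + d[i];
            if (j >= 0 && j < n && p[j] < p[i] && (m == -1 || p[i] > p[m]))
                m = i;
        }
        if (m == -1) break;                  /* no mobile element: all n! listed */
        j = m + d[m];
        t = p[m]; p[m] = p[j]; p[j] = t;     /* swap the value together with */
        t = d[m]; d[m] = d[j]; d[j] = t;     /* its direction */
        for (i = 0; i < n; i++)              /* reverse arrows of larger elements */
            if (p[i] > p[j]) d[i] = -d[i];   /* p[j] is the element just moved */
    }
}

int main(void)
{
    johnson_trotter(3);   /* 123 132 312 321 231 213 */
    return 0;
}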
Generating Permutations
 An application of the Johnson-Trotter algorithm for n = 3:
    123 132 312 321 231 213
 This algorithm is one of the most efficient for generating permutations. It can be implemented to run in time proportional to the number of permutations, i.e., in Θ(n!).
 The Johnson-Trotter algorithm does not produce permutations in lexicographic order. Example:
    Johnson-Trotter algorithm: 123, 132, 312, 321, 231, 213
    Lexicographic order:       123, 132, 213, 231, 312, 321

Lexicographic order algorithm
1. Initial permutation: a1, a2, … an in increasing order.
2. Scan a current permutation from right to left looking for the first pair of consecutive elements ai and ai+1 such that ai < ai+1.
3. Find the smallest element in the tail that is larger than ai and put it in position i.
4. Put the rest of the elements in increasing order in positions i+1 to n.
Generating subsets
Problem: Generate all subsets of a given set A = {a1, … , an} (i.e., its power set).
 Decrease-by-one approach:
   • All subsets of A = {a1, . . . , an} can be divided into two groups: those that do not contain an and those that do. The former group is nothing but all the subsets of {a1, . . . , an-1}, while each and every element of the latter can be obtained by adding an to a subset of {a1, . . . , an-1}.
   • Thus, once we have a list of all subsets of {a1, . . . , an-1}, we can get all the subsets of {a1, . . . , an} by adding to the list all its elements with an put into each of them.
 An application of this algorithm generates all subsets of {a1, a2, a3} in this way.

 A convenient way of solving the problem of generating the power set is based on a one-to-one correspondence between all 2^n subsets of an n-element set A = {a1, . . . , an} and all 2^n bit strings b1, . . . , bn of length n.
 The easiest way to establish such a correspondence is to assign to a subset the bit string in which bi = 1 if ai belongs to the subset and bi = 0 if ai does not belong to it.
 Example (n = 3):
    the bit string 000 corresponds to the empty subset of a three-element set;
    111 corresponds to the set itself, i.e., {a1, a2, a3};
    110 represents {a1, a2}.

 Generation of the power set in squashed order: the order in which any subset involving aj can be listed only after all the subsets involving a1, . . . , aj-1.
 There exists a minimal-change algorithm for generating bit strings so that every one of them differs from its immediate predecessor by only a single bit.
 Example: for n = 3, we can get
    000 001 011 010 110 111 101 100.
 Such a sequence of bit strings is called the binary reflected Gray code.
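A one-loop C sketch that prints the binary reflected Gray code; it uses the standard construction g = k ^ (k >> 1), which is not stated in the notes above but produces exactly the sequence listed there (names are ours):

#include<stdio.h>

/* Prints all 2^n bit strings of length n in binary reflected Gray code
   order; the k-th code word is k ^ (k >> 1), so consecutive strings
   differ in a single bit. */
void gray_codes(int n)
{
    unsigned k, g;
    int b;
    for (k = 0; k < (1u << n); k++)
    {
        g = k ^ (k >> 1);
        for (b = n - 1; b >= 0; b--)
            printf("%u", (g >> b) & 1u);
        printf("\n");
    }
}

int main(void)
{
    gray_codes(3);   /* 000 001 011 010 110 111 101 100 */
    return 0;
}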