8.1 Introduction
By now you should be familiar with the decrease-and-conquer technique. In this
unit we deal with a group of design methods that are based on the idea of
transformation. We call this general technique transform-and-conquer, because
these methods involve a two-stage procedure.
A. Transform – Here the problem’s instance is changed into a form from which
the solution can be obtained more easily.
B. Conquer – In this stage the transformed instance is solved.
The problem’s instance is thus transformed into a simpler instance of the same
problem, another representation of the same instance, or an instance of a
different problem, and the solution is then obtained from that transformed
instance.
Objectives:
After studying this unit you should be able to:
define the technique of transform and conquer
describe the method of presorting
explain Gaussian elimination technique
explain the approaches of AVL and 2-3 trees in balanced search trees
define heap and apply it to heapsort
apply the problem reduction strategy
8.2 Presorting
Presorting means sorting the input before performing the main operation. The
time efficiency of algorithms that involve sorting depends on the efficiency
of the sorting algorithm being used, so the time spent in presorting should be
offset by the advantages presorting brings.
Let us first define presorting.
Presorting is the preprocessing step of sorting the data so that a subsequent
operation on the data can be carried out more efficiently.
Let us now consider the application of presorting to compute the mode.
Example: Computing a mode
A mode is a value that has the most number of occurrences in a given list of
numbers. For example, for the following list of numbers - 9, 3, 5, 9, 4, 9, 7,
8, 9, the mode is 9. (If you find that several different values occur most
often, then you can consider any of them as the mode.)
In the brute force approach to computing a mode, you would scan the list,
compute the frequencies of all its distinct values, and then find the value
with the largest frequency. To implement this idea, you have to store the
values already encountered, along with their frequencies, in a separate list.
On each iteration, the ith element of the original list is compared with the
values already encountered by traversing this separate list. If a matching
value is found, its frequency is incremented; otherwise, the current element
is added to the list of distinct values seen so far with a frequency of 1.
In contrast, consider the previous list of numbers – 9, 3, 5, 9, 4, 9, 7, 8, 9 –
and sort it first. We get the sorted list 3, 4, 5, 7, 8, 9, 9, 9, 9, in which
we can clearly see that 9 occurs most often (the longest run of equal adjacent
values), and therefore it is the mode.
The worst case for the brute force algorithm is a list with no equal elements.
For such a list, its ith element is compared with the i − 1 elements of the
auxiliary list of distinct values seen so far before being added to that list
with a frequency of 1. The worst-case number of comparisons made by this
algorithm in creating the frequency list is given in equation Eq: 8.1.
C(n) = Σ_{i=1}^{n} (i − 1) = 0 + 1 + … + (n − 1) = (n − 1)n/2 ∈ Θ(n²)      Eq: 8.1
The additional n−1 comparisons needed to find the largest frequency in the
auxiliary list do not change the quadratic worst-case efficiency class of the
algorithm. Hence we first sort the input. Then all equal values will be
adjacent to each other. To compute the mode, all we need to do is to find
the longest run of adjacent equal values in the sorted array.
Let us now discuss the algorithm Presort Computing Mode which presorts
an array of numbers and then computes the mode.
Algorithm: Presort Computing Mode (A [0...n − 1])
//Computes the mode of an array by sorting it first
//Input: An array A [0...n − 1] of orderable elements
//Output: The array’s mode
Sort the array A
i←0 //current run begins at position i
modefrequency ← 0 //highest frequency seen so far
while i ≤ n − 1 do
runlength ← 1; runvalue←A[i]
while i + runlength ≤ n − 1 and A [i + runlength] = runvalue
runlength←runlength+1
if runlength> modefrequency
modefrequency ← runlength; modevalue ← runvalue
i ← i + runlength
return modevalue
Let us now trace the algorithm to compute the mode using presort.
Algorithm Tracing for Computing Mode Using Presort
A[ ] = [1, 3, 3]; n = 3
i = 0; modefrequency = 0
First iteration: while 0 ≤ n − 1 do
runlength = 1; runvalue = A[0] = 1
while 0 + 1 ≤ n − 1 and A[1] = runvalue is false, so the inner loop does not run
if 1 > 0: modefrequency = 1; modevalue = 1 //modevalue is assigned the value 1
i = 0 + 1 = 1
Second iteration: while 1 ≤ n − 1 do
runlength = 1; runvalue = A[1] = 3
while 1 + 1 ≤ n − 1 and A[2] = runvalue: runlength = 1 + 1 = 2
if 2 > 1: modefrequency = 2; modevalue = 3 //modevalue is assigned the value 3
i = 1 + 2 = 3
return 3 //the outer loop ends since 3 > n − 1, and the mode 3 is returned
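The pseudocode above can be turned into a short runnable sketch. The following
Python version (the function name presort_mode is illustrative, not from the
text) sorts the list and then scans it for the longest run of equal adjacent
values.

def presort_mode(a):
    """Return the mode of list a by sorting it and finding the longest run."""
    a = sorted(a)                     # presorting step
    i = 0
    modefrequency = 0
    modevalue = None
    while i <= len(a) - 1:
        runlength, runvalue = 1, a[i]
        while i + runlength <= len(a) - 1 and a[i + runlength] == runvalue:
            runlength += 1
        if runlength > modefrequency:
            modefrequency, modevalue = runlength, runvalue
        i += runlength
    return modevalue

# Example: presort_mode([9, 3, 5, 9, 4, 9, 7, 8, 9]) returns 9.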
Activity 1
      | 2  4  2 |
A =   | 4  9  3 |
      | 2  3  7 |
Use Gaussian elimination to compute the inverse of the above matrix.
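If you want to check your hand computation for this activity, the following is
a minimal Python sketch of Gauss–Jordan elimination with partial pivoting; the
function name invert is illustrative, and the matrix is assumed to be
invertible.

def invert(a):
    """Return the inverse of square matrix a (list of lists) by Gauss-Jordan elimination."""
    n = len(a)
    # Augment a with the identity matrix.
    m = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the row with the largest pivot to the top.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [x / p for x in m[col]]          # scale the pivot row
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [row[n:] for row in m]                 # right half is the inverse

# Example: invert([[2, 4, 2], [4, 9, 3], [2, 3, 7]])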
Figure 8.2: (a) AVL Tree. (b) Binary Search Tree that is not an AVL Tree
Figure 8.4: L-rotation
Double rotations
These rotations consist of two single rotations and can be classified as the
LR-rotation and the RL-rotation.
LR-rotation – The double left-right rotation (LR-rotation) is a combination of
two rotations: we first perform the L-rotation of the left sub-tree of the
root and then the R-rotation of the new tree rooted at that node (refer to
figure 8.5). It is performed after a new key is inserted into the right
sub-tree of the left child of a tree whose root had a balance of +1 before the
insertion.
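As a concrete illustration, here is a minimal Python sketch of the single
rotations and of the LR-rotation built from them; the Node class and the
function names are illustrative and not part of the text.

class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def rotate_right(root):
    # R-rotation: promote the left child when the left sub-tree is too tall.
    pivot = root.left
    root.left = pivot.right    # pivot's right sub-tree becomes root's left sub-tree
    pivot.right = root         # the old root becomes the right child of the pivot
    return pivot               # the pivot is the new sub-tree root

def rotate_left(root):
    # L-rotation: the mirror image of the R-rotation.
    pivot = root.right
    root.right = pivot.left
    pivot.left = root
    return pivot

def rotate_left_right(root):
    # LR-rotation: L-rotation of the left sub-tree, then R-rotation at the root.
    root.left = rotate_left(root.left)
    return rotate_right(root)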
The drawbacks of AVL trees are the need to maintain balances for the tree’s
nodes, frequent rotations, and overall complexity, especially of the deletion
operation.
2-node
Figure 8.7: A 2-node, with data element X and children P and Q
Figure 8.7 shows a 2-node structure. It has one data element and two children.
Every 2-node must have the following properties:
1. Every value appearing in the child P must be ≤ X.
2. Every value appearing in the child Q must be ≥ X.
3. The length of the path from the root of a 2-node to every leaf below it
must be the same.
3-node
Figure 8.8: A 3-node, with data elements X and Y and children P, Q and R
Figure 8.8 shows a 3-node structure. It has two data elements and three
children. Every 3-node must have the following properties:
1. Every value appearing in child P must be ≤ X.
2. Every value appearing in child Q must be between X and Y.
3. Every value appearing in child R must be ≥ Y.
4. The length of the path from the root of a 3-node to every leaf below it
must be the same.
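As a small illustration of these node structures, here is a minimal Python
sketch of a 2-3 node and of searching in a 2-3 tree; the class name
TwoThreeNode and its fields keys and children are illustrative, not part of
the text.

class TwoThreeNode:
    def __init__(self, keys, children=None):
        self.keys = sorted(keys)          # one key -> 2-node, two keys -> 3-node
        self.children = children or []    # [] for a leaf, else len(keys) + 1 children

    def is_leaf(self):
        return not self.children

def search(node, key):
    """Return True if key occurs in the 2-3 tree rooted at node."""
    if node is None:
        return False
    if key in node.keys:
        return True
    if node.is_leaf():
        return False
    # Choose the branch: left, middle (3-node only), or right.
    if key < node.keys[0]:
        return search(node.children[0], key)
    if len(node.keys) == 1 or key < node.keys[1]:
        return search(node.children[1], key)
    return search(node.children[2], key)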
Properties of 2-3 trees
We will discuss the operations done on a 2-3 tree using figure 8.9, which is a
2-3 tree with numerical keys. As explained earlier, if the keys of a child are
smaller than the smallest key of its parent, then the child is a left child,
i.e. it is placed in the left branch of the sub-tree.
Similarly, if a child's keys are larger than the largest key of its parent
then it is the right child, and if they lie in between, the child occupies the
middle sub-tree. Let us now try to insert the key 28 into the tree in figure 8.9.
In figure 8.10 we see that the node containing 27 and 29 has been split open
to accommodate 28, which is now in a temporary node.
It should be remembered that it is always the middle value of the overfull
node that is pushed upwards. The splitting stops as soon as a node with only
one key is reached.
Proceeding with the insertion, in figure 8.11 we can see that the node
containing 25 and 31 has been split to accommodate 28. On doing this, the node
containing 21 and 23 becomes the left child of 25 and the node with 27 becomes
the right child. Similarly, 29 becomes the left child of 31 and the node with
33 and 35 becomes the right child.
At this point (refer to figure 8.12) we see that the node with 9 and 19 has
been split. As 28 is greater than 19, it becomes the right child, and 9, being
smaller, becomes the left child. Here we can see that the tree has four levels
but is balanced after the insertion.
Self Assessment Questions
7. An AVL tree is a _________ tree.
8. The _________________ is the mirror image of the RL-rotation.
9. The two nodes of 2-3 tree are ___________ and ____________.
Figure 8.15: Illustration of Heap
In figure 8.15, the first tree i.e. figure 8.15(a) is a heap, but the second tree
i.e. figure 8.15(b) is not, as the tree’s shape requirement is violated.
8.5.2 Architectural approach of heaps and algorithms
The two principal ways to construct a heap are:
1. Bottom-up heap construction algorithm
2. Top-down heap construction algorithm
Let us now discuss the bottom-up heap construction.
Bottom-up heap construction
It initializes the essentially complete binary tree with n nodes by placing
keys in the order given and then “heapifies” the tree as follows. Starting with
the last parental node and ending with the root, the algorithm checks
whether the parental dominance holds for the key at this node.
Figure 8.16: Heapifying the tree for the list 2, 9, 7, 6, 5, 8 – the key 7 is
exchanged with its larger child 8
If it does not, the algorithm exchanges the node’s key K with the larger key
of its children and checks whether the parental dominance holds for K in its
new position (Refer to figures 8.16 and 8.17).
Figure 8.17: The root key 2 is exchanged with its larger child 9 and then
sifted further down
The final heap, in array representation, is H[0..5] = [9, 6, 8, 2, 5, 7].
Figure 8.18: Final Heap and Array Representation
Since the value of a node’s key does not change during the process of shifting
it down the tree, it need not be involved in the intermediate swaps. Instead,
an empty node can be shifted down, with the larger child key moved up at each
step, until it reaches the final position where it accepts the “erased” value
again.
Let us now study the algorithm for bottom-up heap construction.
Algorithm: Heap Bottom-up (H [1...n])
//Constructs a heap from the elements of a given array
// by the bottom-up algorithm
//Input: An array H[1..n] of orderable items
//Output: A heap H[1..n]
for i ← ⌊n/2⌋ downto 1 do
k←i; v←H[k]
heap←false
while not heap and 2 * k ≤ n do
j ←2 * k
if j <n //there are two children
if H[ j ]<H[ j + 1] j ←j + 1
if v ≥ H[j ]
heap←true
else H[k]←H[j ]; k←j
H[k]←v
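The pseudocode above uses 1-based indexing. The following is a minimal
runnable Python sketch of the same algorithm for a heap stored in a 0-indexed
list; the function name heap_bottom_up is illustrative.

def heap_bottom_up(h):
    """Turn list h into a max-heap in place and return it."""
    n = len(h)
    for i in range(n // 2 - 1, -1, -1):   # last parental node down to the root
        k, v = i, h[i]
        heap = False
        while not heap and 2 * k + 1 <= n - 1:
            j = 2 * k + 1                 # left child of position k
            if j < n - 1 and h[j] < h[j + 1]:
                j += 1                    # pick the larger of the two children
            if v >= h[j]:
                heap = True               # parental dominance holds
            else:
                h[k] = h[j]               # move the larger child key up
                k = j
        h[k] = v                          # place the sifted-down key
    return h

# Example: heap_bottom_up([2, 9, 7, 6, 5, 8]) returns [9, 6, 8, 2, 5, 7].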
The exact worst-case number of key comparisons of this algorithm can be
established either by using the closed-form formula for the sum
∑_{i=1}^{h} i·2^i or by mathematical induction on h.
To insert a new key into a heap, first attach a new node with key K in it after
the last leaf of the existing heap. Then shift K up to its appropriate place in
the new heap as in figure 8.20.
Figure 8.20: The key 9 is attached after the last leaf of the heap 8, 5, 7, 2, 4, 6
Compare K with its parent’s key. Stop if the parent key is greater than or
equal to K (the structure is a heap); otherwise, swap these two keys and
compare K with its new parent (refer to figure 8.21). This swapping continues
until K is not greater than its current parent or it reaches the root. In this
algorithm, too, we can shift up an empty node until it reaches its proper
position, where it will get K’s value.
Figure 8.21: The new key 9 is swapped first with its parent 7 and then with 8,
ending up at the root; the resulting heap is 9, 5, 8, 2, 4, 6, 7
This insertion operation doesn’t require more key comparisons than the
heap’s height. Since the height of a heap with n nodes is about log2 n, the
time efficiency of insertion is in O(log n).
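A minimal Python sketch of this sift-up insertion, assuming the max-heap is
stored in a 0-indexed list (the function name heap_insert is illustrative):

def heap_insert(h, key):
    """Append key to max-heap h and sift it up to restore parental dominance."""
    h.append(key)                  # attach the new key after the last leaf
    k = len(h) - 1
    while k > 0:
        parent = (k - 1) // 2
        if h[parent] >= h[k]:
            break                  # parental dominance holds; stop
        h[parent], h[k] = h[k], h[parent]
        k = parent
    return h

# Example: heap_insert([8, 5, 7, 2, 4, 6], 9) returns [9, 5, 8, 2, 4, 6, 7].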
Deleting the root key from a heap
The following steps indicate the procedure to delete the root key from a
heap in the figure 8.22.
Figure 8.22: A heap with the keys 9, 8, 6, 2, 5, 1
Step 1: Exchange the root’s key with the last key K of the heap, as shown in
figure 8.23.
Figure 8.23: Exchanging the Root Key with the Smallest Key
Figure 8.24: Delete the Key Having the Original Root Key
Step 3: “Heapify” the smaller tree by shifting K down the tree exactly in the
same way we did it in the bottom-up heap construction algorithm. That is,
verify the parental dominance for K: if it holds, we are done (Refer figure
8.25); if not, swap K with the larger of its children and repeat this operation
until the parental dominance condition holds for K in its new position.
Figure 8.25: After sifting the key 1 down, the resulting heap is 8, 5, 6, 2, 1
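A minimal Python sketch of root deletion following these three steps, again
for a max-heap stored in a 0-indexed list (the function name heap_delete_root
is illustrative):

def heap_delete_root(h):
    """Remove and return the largest key of max-heap h."""
    root = h[0]
    h[0] = h[-1]            # Step 1: move the last key to the root
    h.pop()                 # Step 2: shrink the heap by one node
    n, k = len(h), 0
    while 2 * k + 1 < n:    # Step 3: sift the new root key down
        j = 2 * k + 1
        if j + 1 < n and h[j] < h[j + 1]:
            j += 1          # larger of the two children
        if h[k] >= h[j]:
            break           # parental dominance restored
        h[k], h[j] = h[j], h[k]
        k = j
    return root

# Example: for h = [9, 8, 6, 2, 5, 1], heap_delete_root(h) returns 9
# and leaves h = [8, 5, 6, 2, 1].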
Let us now perform the first stage, heapification of the tree, so that
parental dominance holds at every node, as shown in figure 8.27.
Figure 8.27: Heapifying the tree for the list 5, 4, 8, 2, 3, 7, 9; the
resulting heap is 9, 4, 8, 2, 3, 7, 5
Now that we have the heapified tree (refer to figure 8.28), we will perform
stage two, which consists of repeatedly deleting the root node.
Figure 8.28: The heapified tree 9, 4, 8, 2, 3, 7, 5
To perform the node deletion, now that we have the largest value at the top of
the heap, we push it into an array and replace it with the bottom leftmost
element of the tree, which is then deleted (refer to figure 8.29).
Figure 8.29: The largest key 9 has been pushed into the output array; the
bottom leftmost key now occupies the root
We can now see in figure 8.29 that the topmost element, which is also the
largest, has been pushed into the array. We are left with a tree that is no
longer a heap, so we repeat the process of heapification, at the end of which
the largest remaining element is again at the top of the tree. This element is
again pushed into the array and replaced by the bottom leftmost element.
Repeating this process again and again finally gives us the sorted array shown
in figure 8.30.
Figure 8.30: The sorted array 2, 3, 4, 5, 7, 8, 9
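Putting the two stages together, here is a minimal heapsort sketch that reuses
the heap_bottom_up and heap_delete_root sketches given earlier; note that it
follows the root-deletion procedure described above (last key moved to the
root), which differs slightly from the leftmost-element description in the
walkthrough, and that it collects the removed maximums in a separate output
list rather than sorting in place.

def heapsort(a):
    """Return a new list with the elements of a in ascending order."""
    heap = heap_bottom_up(list(a))           # stage 1: construct the heap
    out = []
    while heap:
        out.append(heap_delete_root(heap))   # stage 2: repeatedly remove the maximum
    out.reverse()                            # maximums were produced largest first
    return out

# Example: heapsort([5, 4, 8, 2, 3, 7, 9]) returns [2, 3, 4, 5, 7, 8, 9].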
Since we already know that the heap construction stage of the algorithm is
in O(n), we need to investigate just the time efficiency of the second stage.
For the number of key comparisons, C(n), needed for eliminating the root
keys from the heaps of diminishing sizes from n to 2, we get the inequality
shown in equation Eq: 8.4.
C(n) ≤ 2⌊log₂(n − 1)⌋ + 2⌊log₂(n − 2)⌋ + … + 2⌊log₂ 1⌋ = 2 ∑_{i=1}^{n−1} ⌊log₂ i⌋
     ≤ 2 ∑_{i=1}^{n−1} log₂(n − 1) = 2(n − 1) log₂(n − 1) ≤ 2n log₂ n      Eq: 8.4
This means that C(n) є O(n log n) for the second stage of heapsort. For both
stages, we get O(n) + O(n log n) = O(n log n). A more detailed analysis
shows that, in fact, the time efficiency of heapsort is in O(n log n) in both the
worst and average cases.
Activity 2
Construct a heap for the list 1, 8, 6, 5, 3, 7, 4 by the bottom-up algorithm.
lcm(m, n) = (m · n) / gcd(m, n)      Eq: 8.6
Considering the equation Eq: 8.6, let us now solve the previous example in
Eq: 8.5 to find lcm(30, 60). Here we can see that we get the same result
(refer to Eq: 8.7) as in Eq: 8.5.
lcm(30, 60) = (30 × 60) / (2 × 3 × 5) = 60      Eq: 8.7
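A minimal Python sketch of this reduction, pairing Euclid’s algorithm for the
gcd with the identity in Eq: 8.6 (the function names gcd and lcm are
illustrative):

def gcd(m, n):
    """Euclid's algorithm for the greatest common divisor."""
    while n:
        m, n = n, m % n
    return m

def lcm(m, n):
    """Least common multiple via the identity lcm(m, n) = m * n / gcd(m, n)."""
    return m * n // gcd(m, n)

# Example: lcm(30, 60) returns 60.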
8.6.2 Counting paths in graphs
Let us now consider the problem of counting paths between two vertices in a
graph. It is not difficult to prove by mathematical induction that the number
of different paths of length k > 0 from the ith vertex to the jth vertex of a
graph (undirected or directed) equals the (i, j)th element of A^k, where A is
the adjacency matrix of the graph. Thus, we can solve the problem of counting
a graph’s paths by computing an appropriate power of its adjacency matrix.
          a  b  c  d
      a [ 8  0  0  8 ]
A^4 = b [ 0  8  8  0 ]
      c [ 0  8  8  0 ]
      d [ 8  0  0  8 ]
Matrix: 8.4
Hence we find that in matrix 8.4 the value in the first row and fourth column,
i.e. position (1, 4), is 8. Therefore the number of paths of length 4 from a
to d is 8.
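A minimal Python sketch of this reduction is given below: raise the adjacency
matrix to the k-th power and read off the (i, j) entry. The function names
mat_mult and mat_power are illustrative, and since the adjacency matrix of the
graph in the text is not shown, no concrete matrix is filled in.

def mat_mult(x, y):
    """Multiply two square matrices represented as lists of lists."""
    n = len(x)
    return [[sum(x[i][t] * y[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]

def mat_power(a, k):
    """Return the k-th power (k >= 1) of square matrix a by repeated multiplication."""
    result = a
    for _ in range(k - 1):
        result = mat_mult(result, a)
    return result

# If A is the graph's adjacency matrix, then mat_power(A, 4)[0][3]
# gives the number of paths of length 4 from vertex a to vertex d.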
Activity 3
Prove the equality lcm(m, n) = (m · n) / gcd(m, n).
8.7 Summary
In this unit we have learnt that the transform-and-conquer technique works as
a two-stage method: “Transform”, where the problem is modified, and “Conquer”,
where the problem is solved. We now know that there are three principal
varieties of the transform-and-conquer strategy: instance simplification,
representation change, and problem reduction.
We have learnt that presorting, balanced search trees, Gaussian elimination,
heaps, counting paths in a graph, etc. are all strategies illustrating the
three different varieties involved in the implementation of this technique.
These strategies help us solve complex problems such as sorting elements and
counting paths in graphs.
8.8 Glossary
8.10 Answers
Self Assessment Questions
1. Instance simplification, problem reduction, representation change
2. Θ(n²)
3. ⌊log₂ n⌋ + 1
4. Linear
5. Lower triangular
6. Rank
7. Binary search
8. LR-rotation
9. 2-node, 3-node
10. Top-down
11. Binary trees
12. O(n log n)
13. Euclid’s algorithm
14. Adjacency matrix
15. lcm(m, n) = (m · n) / gcd(m, n)
Terminal Questions
1. Refer to 8.2 – Presorting
2. Refer to 8.4.2 – AVL trees
3. Refer to 8.5.2 – Architectural approach of heaps and algorithms
4. Refer to 8.5.3 – Heapsort
5. Refer to 8.4.3 – 2-3 trees