Overview
Binary Trees
Traversal of Binary Trees
Binary Tree Representations
Threaded Binary Trees
Binary Search Tree
AVL tree (Balanced Binary Tree)
Run Time Storage Management
Garbage Collection
Compaction
Unit 1
Trees
Learning Objectives
Overview
Binary Trees
Traversal of Binary Trees
Binary Tree Representation
Threaded Binary Trees
Binary Search Tree
AVL Tree
Run Time Storage Management
Garbage collection
Compaction
Overview
There are several data structures that you have studied like Arrays, Lists, Stacks, Queues and Graphs. All
the data structures except graphs are linear data structures. Graphs are classified in the nonlinear category
of data structures. A tree is another data structure that is an important class of graphs. A tree is an acyclic
connected graph. A tree contains no loops or cycles. The concept of trees is one of the most fundamental
and useful concepts in computer science.
Trees have many variations, implementations, and applications. Trees find their use in applications such as
compiler construction, database design, and Windows operating system programs.
ALGORITHMS AND ADVANCED DATA STRUCTURES 2
Binary Trees
A binary tree is a finite set of elements that is either empty or is partitioned into three disjoint subsets. The
first subset contains a single element called the root of the tree. The other two subsets are themselves binary
trees, called the left and right subtrees of the original tree. A left or right subtree can be empty. Each
element of a binary tree is called a node of the tree.
A conventional method of picturing a binary tree is shown in figure 1.1. This tree consists of nine nodes
with A as its root. Its left subtree is rooted at B and its right subtree is rooted at C. This is indicated by the
two branches emanating from A to B on the left and to C on the right. The absence of a branch indicates an
empty subtree. For example, the left subtree of the binary tree rooted at C and the right subtree of the binary
tree rooted at E are both empty. The binary trees rooted at D, G, H and I have empty right and left subtrees.
Figure 1.1. A Binary Tree
If A is the root of a binary tree and B is the root of its left or right subtree, then A is said to be the father of
B and B is said to be the left or right son of A. A node that has no sons (such as D, G, H, and I of figure
1.1) is called a leaf. Node n1 is an ancestor of node n2 (and n2 is a descendent of n1) if n1 is either the father
of n2 or the father of some ancestor of n2. For example, in the tree of figure 1.1, A is an ancestor of G and H is
a descendent of C, but E is neither an ancestor nor a descendent of C. A node n2 is a left descendent of node
n1 if n2 is either the left son of n1 or a descendent of the left son of n1. A right descendent may be similarly
defined. Two nodes are brothers if they are left and right sons of the same father.
Figure 1.2 illustrates some structures that are not binary trees.
Figure 1.2. Structures that are not binary trees: (a), (b), (c)
If every nonleaf node in a binary tree has nonempty left and right subtrees, the tree is called a strictly
binary tree. Thus the tree of figure 1.3 is a strictly binary tree.
Figure 1.3. Strictly Binary Tree
A strictly binary tree with n leaves always contains 2n - 1 nodes.
The level of a node in a binary tree is defined as follows: The root of the tree has level 0, and the level of
any other node in the tree is one more than the level of its father. For example in the binary tree of figure
1.1 node E is at level 2 and node H is at level 3. The depth of a binary tree is the maximum level of any leaf
in the tree. Thus the depth of the tree of figure 1.1 is 3. A complete binary tree of depth d is the strictly
binary tree all of whose leaves are at level d.
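If a binary tree contains m nodes at level l, it contains at most 2m nodes at level l+1. Since a binary tree can
contain at most one node at level 0 (the root), it can contain at most 2^l nodes at level l. A complete binary tree
of depth d is the binary tree of depth d that contains exactly 2^l nodes at each level l between 0 and d.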
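As a quick check of these counts, the following sketch sums the 2^l nodes on each level of a complete binary tree of depth d; the total comes to 2^(d+1) - 1. The function names nodes_at_level and total_nodes are illustrative only and do not appear in the text.

```c
#include <assert.h>

/* Number of nodes at level l of a complete binary tree: 2^l.
   The bound keeps the shift within a 32-bit unsigned long. */
unsigned long nodes_at_level(unsigned l)
{
    assert(l < 31);
    return 1UL << l;
}

/* Total number of nodes in a complete binary tree of depth d,
   obtained by summing the node counts of levels 0 through d. */
unsigned long total_nodes(unsigned d)
{
    unsigned long total = 0;
    unsigned l;

    for (l = 0; l <= d; l++)
        total += nodes_at_level(l);
    return total;
}
```

Because all leaves of a complete binary tree lie at level d, the tree has n = 2^d leaves, and the total agrees with the 2n - 1 rule for strictly binary trees.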
A binary tree of depth d is an almost complete binary tree if:
1. Each leaf in the tree is either at level d or at level d - 1.
2. For any node nd in the tree with a right descendent at level d, all the left descendents of nd that are
leaves are also at level d.
The strictly binary tree of figure 1.4(a) is not almost complete, since it violates the conditions above. The
binary tree of figure 1.4(b) is an almost complete binary tree.
Figure 1.4(a) and 1.4(b). Almost Complete Binary Tree
Student Activity 1.1
Before going to next section, answer the following questions:
1. What is a Binary Tree?
2. Define Strictly Binary Tree?
3. Find the total number of nodes in a complete binary tree of depth d.
If your answers are correct, then proceed to next section.
Traversal of Binary Trees
In many applications it is necessary not only to find a node within a binary tree, but also to move
through all the nodes of the binary tree, visiting each one in turn. If there are n nodes in the binary tree, then
there are n! different orders in which they could be visited, but most of these have little regularity of pattern.
This operation is called tree traversal. We will define three traversal methods. In each of these
methods, nothing needs to be done to traverse an empty binary tree. The methods are all defined
recursively, so that traversing a binary tree involves visiting the root and traversing its left and right
subtrees. The only difference among the methods is the order in which these three operations are performed.
To traverse a nonempty binary tree in Preorder (also known as depth-first order), we perform the
following three operations:
(1) Visit the root.
(2) Traverse the left sub tree in preorder.
(3) Traverse the right sub tree in preorder.
To traverse a nonempty binary tree in Inorder (or symmetric order):
(1) Traverse the left subtree in inorder.
(2) Visit the root.
(3) Traverse the right subtree in inorder.
To traverse a nonempty binary tree in Postorder:
(1) Traverse the left subtree in postorder.
(2) Traverse the right subtree in postorder.
(3) Visit the root.
Figure 1.5(a) and 1.5(b) illustrate two binary trees and their traversals in preorder, inorder and postorder.
PREORDER : ABDGCEHIF
INORDER : DGBAHEICF
POSTORDER : GDBHIEFCA
Figure 1.5(a)
PREORDER : ABCEIFJDGHKL
INORDER : EICFJBGDKHLA
POSTORDER : IEJFCGKLHDBA
Figure 1.5(b)
A Binary Tree and its Traversals
Student Activity 1.2
Before going to next section, answer the following questions:
1. Find the Inorder, Preorder and Postorder traversals of the following binary tree.
If your answers are correct, then proceed to next section.
Binary Tree Representations
Implicit Array Representation of Binary Trees
Recall that the n nodes of an almost complete binary tree can be numbered from 1 to n, so that the number
assigned to a left son is twice the number assigned to its father and the number assigned to a right son is 1
more than twice the number assigned to its father. We can represent an almost complete binary tree without
father, left or right links. Instead, the nodes can be kept in an array info of size n. We refer to the node at
position p simply as "node p"; info[p] holds the contents of node p.
In C, arrays start at position 0; therefore instead of numbering the tree nodes from 1 to n, we number them
from 0 to n - 1. Because of the one-position shift, the two sons of a node numbered p are in positions 2p + 1
and 2p + 2, instead of 2p and 2p + 1.
The root of the tree is at position 0, so that tree, the external pointer to the tree root, always equals 0. The
node in position p (that is, node p) is the implicit father of nodes 2p + 1 and 2p + 2. The left son of node p is
node 2p + 1 and the right son of node p is node 2p + 2. Given a left son at position p, its right brother is at
p + 1 and, given a right son at position p, its left brother is at p - 1. The father of p is implemented by
(p - 1)/2, using integer division. p points to a left son if and only if p is odd. Thus the test for whether node p
is a left son (the isleft operation) is to check whether p % 2 is not equal to 0. Figure 1.6 illustrates arrays
that represent almost complete binary trees.
Figure 1.6
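The index rules above can be collected into a few one-line C functions. This is only a sketch: the names left_son, right_son, father, is_left, is_right and brother are chosen here for illustration, and the root is assumed to sit at position 0 as described.

```c
#include <assert.h>

/* Index arithmetic for the implicit (sequential) array representation,
   with the root stored at position 0. */
int left_son(int p)  { return 2 * p + 1; }
int right_son(int p) { return 2 * p + 2; }

/* The root (position 0) has no father, so p must be positive. */
int father(int p)
{
    assert(p > 0);
    return (p - 1) / 2;              /* integer division */
}

/* Odd positions hold left sons; even positions other than 0 hold right sons. */
int is_left(int p)  { return p % 2 != 0; }
int is_right(int p) { return p > 0 && p % 2 == 0; }

/* The brother of a left son is at p + 1; the brother of a right son is at p - 1. */
int brother(int p)
{
    assert(p > 0);
    return is_left(p) ? p + 1 : p - 1;
}
```

For example, node 5 is a left son whose father is node 2 and whose right brother is node 6.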
We can extend this implicit array representation of almost complete binary trees to an implicit array
representation of binary trees generally. We do this by identifying an almost complete binary tree that
contains the binary tree being represented. Figure 1.7(a) illustrates two binary trees that are not almost
complete, and Figure 1.7(b) illustrates the smallest almost complete binary trees that contain them. Finally,
the figure illustrates the implicit array representation of these almost complete binary trees and, by
extension, of the original binary trees. The implicit array representation is also called the sequential
representation, as contrasted with the linked representation, because it allows a tree to be implemented in a
contiguous block of memory (an array) rather than via pointers connecting widely separated nodes. Under the
sequential representation an array element is allocated whether or not it serves to contain a node of a tree.
Unused array elements must therefore be flagged as nonexistent (NULL) tree nodes. This may be
accomplished by one of two methods. One method is to set info[p] to a special value if node p is NULL.
This special value should be invalid as the information content of a legitimate tree node. For example, in a
tree containing positive numbers, a NULL node may be indicated by a negative info value. Alternatively,
we may add a logical flag field, used, to each node. Each node then contains two fields: info and used. The
entire structure is contained in an array node, and info(p) is implemented by node[p].info. We use
this method later in implementing the sequential representation.
Figure 1.7
(a) Two binary trees
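The second method, a used flag in every array element, might be sketched as follows. The capacity NUMNODES, the type name seqnode, and the helpers init_tree, set_node and is_null are assumptions made for this illustration; only the fields info and used and the array name node come from the text.

```c
#include <assert.h>

#define NUMNODES 15   /* assumed capacity: room for an almost complete tree of depth 3 */
#define TRUE  1
#define FALSE 0

struct seqnode {
    int info;
    int used;         /* TRUE if this array element holds a real tree node */
};

struct seqnode node[NUMNODES];

/* Flag every array element as a null node. */
void init_tree(void)
{
    int p;

    for (p = 0; p < NUMNODES; p++)
        node[p].used = FALSE;
}

/* Store x at position p; p must lie inside the array. */
void set_node(int p, int x)
{
    assert(p >= 0 && p < NUMNODES);
    node[p].info = x;
    node[p].used = TRUE;
}

/* A position outside the array, or an unused element, is a null node. */
int is_null(int p)
{
    return p < 0 || p >= NUMNODES || !node[p].used;
}
```

With this layout a tree that has a root and only a right son occupies positions 0 and 2, while position 1 stays allocated but unused.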
Dynamic Node Representation
A node can be defined in C as follows:
struct node
{
int info;
struct node *left;
struct node *right;
struct node *father;
};
typedef struct node *nodeptr;
The operations info(p), left(p), right(p) and father(p) can be implemented by references to p->info, p->left,
p->right, and p->father, respectively. These operations retrieve the value of node p, the left child of node
p, the right child of node p, and the father of node p, respectively.
Binary Tree Traversals in C
We may implement the traversal of binary trees in C by recursive routines that mirror the traversal
definitions. The three C routines preorder, inorder and postorder visit the contents of a binary tree in
preorder, inorder, and postorder respectively.
The parameter to each routine is a pointer to the root node of a binary tree. We use the dynamic node
representation of a binary tree.
/* preorder: visit each node of the tree in preorder.
   visit() is assumed to be supplied by the application. */
void preorder(nodeptr root)
{
    if (root) {
        visit(root);
        preorder(root->left);
        preorder(root->right);
    }
}
/* inorder: visit each node of the tree in inorder */
void inorder(nodeptr root)
{
    if (root) {
        inorder(root->left);
        visit(root);
        inorder(root->right);
    }
}
/* postorder: visit each node of the tree in postorder */
void postorder(nodeptr root)
{
    if (root) {
        postorder(root->left);
        postorder(root->right);
        visit(root);
    }
}
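To see the three routines at work, the sketch below rebuilds the tree of Figure 1.5(a) and records the order in which nodes are visited. Several things here are assumptions made for the illustration: the info field holds a one-letter label instead of an int, visiting a node means appending its label to a buffer, the helpers maketree, build_figure_1_5a and traversal_is are hypothetical, and the tree shape is inferred from the traversal strings printed with the figure.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct node {
    char info;                 /* one-letter label (the text's node uses int) */
    struct node *left;
    struct node *right;
};
typedef struct node *nodeptr;

/* Allocate a single-node tree holding x. */
nodeptr maketree(char x)
{
    nodeptr p = malloc(sizeof *p);

    assert(p != NULL);
    p->info = x;
    p->left = p->right = NULL;
    return p;
}

/* Each routine appends the labels it visits to out and returns the
   position just past the last label written. */
char *preorder(nodeptr root, char *out)
{
    if (root) {
        *out++ = root->info;
        out = preorder(root->left, out);
        out = preorder(root->right, out);
    }
    return out;
}

char *inorder(nodeptr root, char *out)
{
    if (root) {
        out = inorder(root->left, out);
        *out++ = root->info;
        out = inorder(root->right, out);
    }
    return out;
}

char *postorder(nodeptr root, char *out)
{
    if (root) {
        out = postorder(root->left, out);
        out = postorder(root->right, out);
        *out++ = root->info;
    }
    return out;
}

/* Build the tree of Figure 1.5(a); its shape is inferred from the
   three traversal strings given with the figure. */
nodeptr build_figure_1_5a(void)
{
    nodeptr a = maketree('A'), b = maketree('B'), c = maketree('C');
    nodeptr d = maketree('D'), e = maketree('E'), f = maketree('F');
    nodeptr g = maketree('G'), h = maketree('H'), i = maketree('I');

    a->left = b;  a->right = c;
    b->left = d;  d->right = g;
    c->left = e;  c->right = f;
    e->left = h;  e->right = i;
    return a;
}

/* Run one traversal and report whether it visits the nodes in the expected order. */
int traversal_is(nodeptr root, char *(*walk)(nodeptr, char *), const char *expected)
{
    char buf[64];              /* assumes fewer than 64 nodes */

    *walk(root, buf) = '\0';
    return strcmp(buf, expected) == 0;
}
```

Calling traversal_is(build_figure_1_5a(), inorder, "DGBAHEICF") compares the computed visiting order against the inorder string listed with the figure.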
Student Activity 1.3
Before going to next section, answer the following questions:
1. Write algorithms for inorder, preorder and postorder traversals.
2. Construct the binary tree whose inorder and preorder traversals are given as:
Inorder: E1CFJB9DKHLA
Preorder: ABCEIFJDGHKL
If your answers are correct, then proceed to next section.
Threaded Binary Trees
Traversing a binary tree is a common operation, and it would be helpful to find a more efficient method of
implementing the traversal. As we have seen, many nodes in a binary tree have an empty (NULL) left or
right child. We can change these NULL links into special links called threads, so that it is possible
to perform traversal, insertion, and deletion operations without using either a stack or recursion.
In a right-threaded binary tree each NULL right link is replaced by a special link to the successor of that
node under inorder traversal, called a right thread. Using right threads we can easily do an inorder traversal
of a tree, since we need only follow either an ordinary link or a thread to find the next node to visit.
A left-threaded binary tree may be defined similarly as one in which each NULL left pointer is altered to
contain a thread to that node's inorder predecessor. A binary tree which has both left and right threads is
called a fully threaded binary tree. The word fully is omitted if there is no danger of confusion. The following
figure shows a fully threaded binary tree where the threads are shown as dotted lines.
Figure 1.8. Fully threaded binary tree
The following figure shows a right-threaded binary tree.
Figure 1.9. Right-threaded binary tree
To implement a right-threaded binary tree under the dynamic node implementation of a binary tree, an
extra logical field, rthread, is included within each node to indicate whether or not its right pointer is a
thread. For consistency, the rthread field of the rightmost node of a tree (that is, the last node in the tree's
inorder traversal) is also set to TRUE, although its right field remains NULL.
Thus a node is defined as follows (we are assuming that no father field exists):
struct node {
    int info;
    struct node *left;
    struct node *right;
    int rthread;    /* TRUE if right is NULL or a non-NULL thread */
};
typedef struct node *nodeptr;
Now we present a routine to implement inorder traversal of a right-threaded binary tree.
void inorder2(nodeptr root)
{
    nodeptr p, q;
    p = root;
    do {
        q = NULL;
        while (p != NULL) {    /* traverse left branch */
            q = p;
            p = p->left;
        }
        if (q != NULL) {
            visit(q);
            p = q->right;
            while (q->rthread && p != NULL) {    /* follow right threads */
                visit(p);
                q = p;
                p = p->right;
            }
        }
    } while (q != NULL);
}
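The traversal routine needs a tree whose threads are already in place. One way to maintain them, sketched here under assumptions, is to set the thread whenever a son is attached. The helpers maketree, setleft and setright are hypothetical, info holds a one-letter label, and visiting a node appends its label to a buffer; the loop itself follows the routine above.

```c
#include <assert.h>
#include <stdlib.h>

#define TRUE  1
#define FALSE 0

struct node {
    char info;                 /* one-letter label (the text's node uses int) */
    struct node *left;
    struct node *right;
    int rthread;               /* TRUE if right is NULL or a thread */
};
typedef struct node *nodeptr;

/* A new single-node tree: its right field is NULL, so rthread is TRUE. */
nodeptr maketree(char x)
{
    nodeptr p = malloc(sizeof *p);

    assert(p != NULL);
    p->info = x;
    p->left = p->right = NULL;
    p->rthread = TRUE;
    return p;
}

/* Attach x as the left son of p; p must not already have a left son. */
nodeptr setleft(nodeptr p, char x)
{
    nodeptr q;

    assert(p != NULL && p->left == NULL);
    q = maketree(x);
    p->left = q;
    q->right = p;              /* thread: the inorder successor of q is p */
    return q;
}

/* Attach x as the right son of p; p must not already have a right son. */
nodeptr setright(nodeptr p, char x)
{
    nodeptr q;

    assert(p != NULL && p->rthread);
    q = maketree(x);
    q->right = p->right;       /* q inherits p's inorder successor */
    p->right = q;
    p->rthread = FALSE;        /* p's right field is now an ordinary link */
    return q;
}

/* Inorder traversal without a stack or recursion. Labels are appended
   to out; the return value points just past the last label written. */
char *inorder_threaded(nodeptr root, char *out)
{
    nodeptr p = root, q;

    do {
        q = NULL;
        while (p != NULL) {    /* traverse left branch */
            q = p;
            p = p->left;
        }
        if (q != NULL) {
            *out++ = q->info;
            p = q->right;
            while (q->rthread && p != NULL) {   /* follow right threads */
                *out++ = p->info;
                q = p;
                p = p->right;
            }
        }
    } while (q != NULL);
    return out;
}
```

For a root A with left son B, right son C, and D attached as the right son of B, the routine produces the labels in the order B, D, A, C.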
Student Activity 1.4
Before going to next section, answer the following questions:
1. What are right threaded and left threaded binary trees?
2. Discuss the advantages and disadvantages of threaded binary trees.
If your answers are correct, then proceed to next section.
Binary Search Tree
A Binary Search Tree (BST) is an ordered binary tree such that either it is an empty tree or
1. each data value in its left subtree is less than the root value,
2. each data value in its right subtree is greater than the root value, and
3. its left and right subtrees are again binary search trees.
The following figure shows a binary search tree.
Figure 1.10. Binary Search Tree
Operations on a Binary Search Tree
The following operations can be performed on a binary search tree:
1. Initialization of a Binary Search Tree; this operation makes an empty tree.
2. Check whether BST is empty or not.
3. Create a node for the Binary search tree; this operation allocates memory space for the new node;
returns with error if no space is available.
4. Retrieve a node's data.
5. Update a node’s data.
6. Insert a node.
7. Search for a node.
8. Traverse a Binary Search Tree.
The advantage of using a BST over an array is that a tree enables search, insertion, and deletion operations
to be performed efficiently. If an array is used, an insertion or deletion requires that approximately half of
the elements of the array be moved. Insertion or deletion in a BST, on the other hand, requires that only a few
pointers be adjusted.
The following algorithm searches a binary search tree and inserts a new record into the tree if the search is
unsuccessful. (We assume the existence of a function maketree that constructs a binary tree consisting of a
single node whose information field is passed as an argument and returns a pointer to the tree.)
q = NULL;
p = root;
while (p != NULL) {
    if (key == k(p))
        return (p);
    q = p;
    if (key < k(p))
        p = left(p);
    else
        p = right(p);
}
v = maketree(rec, key);
if (q == NULL)
    root = v;
else if (key < k(q))
    left(q) = v;
else
    right(q) = v;
return (v);
Here key is the key being searched for, rec is the record to insert, and k(p) denotes the key stored in node p.
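The algorithm above translates almost line for line into C. The version below is a sketch under simplifying assumptions: each node stores only an int key (no separate record), maketree takes just the key, and the root is passed by address so that the first insertion can set it. The name search_insert is chosen here for illustration.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int key;
    struct node *left;
    struct node *right;
};
typedef struct node *nodeptr;

/* Allocate a single-node tree holding key. */
nodeptr maketree(int key)
{
    nodeptr p = malloc(sizeof *p);

    assert(p != NULL);
    p->key = key;
    p->left = p->right = NULL;
    return p;
}

/* Search the tree at *root for key. If it is found, return its node;
   otherwise insert a new node where the search fell off the tree and
   return that node. */
nodeptr search_insert(nodeptr *root, int key)
{
    nodeptr p = *root;   /* node being examined */
    nodeptr q = NULL;    /* father of p */
    nodeptr v;

    while (p != NULL) {
        if (key == p->key)
            return p;
        q = p;
        p = (key < p->key) ? p->left : p->right;
    }
    v = maketree(key);
    if (q == NULL)
        *root = v;               /* the tree was empty */
    else if (key < q->key)
        q->left = v;
    else
        q->right = v;
    return v;
}
```

Inserting 5, 3, 6, 9, 2, 4 in that order into an empty tree places 5 at the root, 3 and 6 as its sons, 2 and 4 under 3, and 9 as the right son of 6.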
Deleting from a Binary Search Tree
We now present an algorithm to delete a record (node) with key "key" from a binary search tree. There
are three cases to consider. If the node to be deleted has no sons, it may be deleted without further
adjustment to the tree. This is illustrated in the following figure.
Figure 1.11. Deleting node with key 10
If the node to be deleted has only one subtree, its only son can be moved up to take its place. This is
illustrated in the following figure:
Figure 1.12(a) and 1.12(b). Deleting node with key 5
If, however, the node p to be deleted has two subtrees, its inorder successor s (or predecessor) must take its
place. The inorder successor cannot have a left subtree (since a left descendent would be the inorder
successor of p). Thus the right son of s can be moved up to take the place of s. This is illustrated in the
following figure, where the node with key 12 replaces the node with key 11 and is replaced in turn by the
node with key 13.
Figure 1.13(a) and 1.13(b). Deleting node with key 11
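The three deletion cases can be sketched in C as follows. This is one possible arrangement and not necessarily the routine the authors had in mind: it is written recursively, returns the new root of the subtree it was given, and uses a hypothetical insert helper only to build trees. Nodes hold an int key and no father field.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int key;
    struct node *left;
    struct node *right;
};
typedef struct node *nodeptr;

/* Insert key (duplicates are ignored) and return the subtree root. */
nodeptr insert(nodeptr root, int key)
{
    if (root == NULL) {
        root = malloc(sizeof *root);
        assert(root != NULL);
        root->key = key;
        root->left = root->right = NULL;
    } else if (key < root->key)
        root->left = insert(root->left, key);
    else if (key > root->key)
        root->right = insert(root->right, key);
    return root;
}

/* Delete key from the tree rooted at root and return the new root. */
nodeptr delete_key(nodeptr root, int key)
{
    nodeptr s, parent;

    if (root == NULL)
        return NULL;                          /* key is not in the tree */
    if (key < root->key) {
        root->left = delete_key(root->left, key);
        return root;
    }
    if (key > root->key) {
        root->right = delete_key(root->right, key);
        return root;
    }
    /* root is the node p to delete */
    if (root->left == NULL || root->right == NULL) {
        /* no sons or one son: the only son (or NULL) moves up */
        s = root->left ? root->left : root->right;
        free(root);
        return s;
    }
    /* two sons: s is the inorder successor, the leftmost node
       of the right subtree, so it has no left son */
    parent = root;
    s = root->right;
    while (s->left != NULL) {
        parent = s;
        s = s->left;
    }
    /* the right son of s moves up to take the place of s */
    if (parent == root)
        parent->right = s->right;
    else
        parent->left = s->right;
    /* s takes the place of the deleted node */
    s->left = root->left;
    s->right = root->right;
    free(root);
    return s;
}
```

Applied to the tree of Figure 1.13(a), delete_key(root, 11) puts the node with key 12 where 11 was and moves 13 up to become the left son of 14.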
Searching a Binary Search Tree
Since the definition of a binary search tree is recursive, it is easiest to describe a recursive search method.
Suppose we wish to search for an element with key x. An element could in general be an arbitrary structure
that has as one of its fields a key. We assume for simplicity that the element just consists of a key and use
the terms element and key interchangeably. We begin at the root. If the root is NULL (0), then the search tree
contains no elements and the search is unsuccessful. Otherwise we compare x with the key in the root. If x
equals this key, the search terminates successfully. If x is less than the key in the root, then no element in the
right subtree can have key value x and only the left subtree is to be searched. If x is larger than the key in the
root, only the right subtree needs to be searched. The subtree can be searched recursively as in the
following algorithm.
nodeptr search(nodeptr root, int x)
{
    if (root == NULL)
        return NULL;
    else if (x == root->info)
        return root;
    else if (x < root->info)
        return search(root->left, x);
    else
        return search(root->right, x);
}
The following table gives the efficiencies of the search, insert and delete operations in a binary search tree.
Operation    Worst case    Average case
Search       O(n)          O(log n)
Insert       O(n)          O(log n)
Delete       O(n)          O(log n)
Student Activity 1.5
Before going to next section, answer the following questions:
1. How many binary search trees are possible with key values 1,2,3,4,5?
2. Delete node 10 from the following binary search tree.
If your answers are correct, then proceed to next section.
AVL Tree (Balanced Binary Tree)
The effectiveness of the searching process in a binary search tree depends on how the data are organised to
make up a specific tree. For example, consider the two shapes given in the following figures.
Figure 1.14. A nearly full binary tree
Figure 1.15. A degenerate binary tree
The efficiency of search will be rather different in these two cases, although the same elements are
organised in the two structures. The tree in the first figure above is rather short and compact while the tree
in second figure is a long and thin tree. We may say that the tree in the first figure is somewhat more
balanced than that in the second figure.
Let us define more precisely the notion of a "balanced" tree. The height of a binary tree is the maximum
level of its leaves (this is also sometimes known as the depth of the tree). For convenience, the height of a
NULL tree is defined as -1. A balanced binary tree (AVL tree) is a binary tree in which the heights of the two
subtrees of every node never differ by more than 1. The balance of a node in a binary tree is defined as the
height of its left subtree minus the height of its right subtree.
The following figure illustrates a balanced binary tree. Each node in a balanced binary tree has a balance of
+1, -1, or 0, depending on whether the height of its left subtree is greater than, less than, or equal to the height
of its right subtree. The balance of each node is also indicated in the following figure.
Figure 1.16(a)
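The definitions of height and balance can be written down directly in C. The sketch below recomputes heights on every call, which is simple but slow for large trees; it is meant only to make the definitions concrete. The function names height, balance and is_balanced are chosen for this illustration.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int info;
    struct node *left;
    struct node *right;
};
typedef struct node *nodeptr;

/* Height: the maximum level of a leaf. The NULL tree has height -1,
   so a single node has height 0. */
int height(nodeptr t)
{
    int hl, hr;

    if (t == NULL)
        return -1;
    hl = height(t->left);
    hr = height(t->right);
    return 1 + (hl > hr ? hl : hr);
}

/* Balance: height of the left subtree minus height of the right subtree. */
int balance(nodeptr t)
{
    assert(t != NULL);
    return height(t->left) - height(t->right);
}

/* A tree is balanced (AVL) if every node has a balance of -1, 0, or +1. */
int is_balanced(nodeptr t)
{
    int b;

    if (t == NULL)
        return 1;
    b = balance(t);
    return b >= -1 && b <= 1 && is_balanced(t->left) && is_balanced(t->right);
}
```

A chain of three nodes linked only through right pointers has a root balance of -2 and is therefore not an AVL tree; giving the root a left son brings its balance to -1.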
Suppose that we are given a balanced binary tree and use the preceding search and insertion algorithm to
insert a new node p into the tree. The resulting tree may or may not remain balanced. Figure 1.16(b)
illustrates all possible insertions that may be made to the tree of figure 1.16(a).
Figure 1.16(b)
Each insertion that yields a balanced tree is indicated by a B. The unbalanced insertions are indicated by a
U and are numbered from 1 to 12. It is easy to see that the tree becomes unbalanced if and only if the
newly inserted node is a left descendent of a node that previously had a balance of 1 (this occurs in cases U1
through U8 in figure 1.16(b)) or it is a right descendent of a node that previously had a balance of -1 (cases
U9 through U12).
Run Time Storage Management
As discussed earlier, allocation of storage and its release is done for one node at a time. This method is
convenient because of two properties of nodes: (i) the size of a node of a particular type is fixed, and (ii) a
node is fairly small. But these two characteristics do not help in programs where a large amount of contiguous
storage is required. At times a program may require storage blocks of varied sizes, and thus arises the need
for a memory management system. Run time storage management is such a system and is a convenient
tool for processing requests for variable-length blocks.
Such a system must keep track of available space, allocate space when it is requested, and combine
contiguous free spaces when a block is freed.
As an example of this situation, consider a small memory of 1024 words. Suppose a request is made for
three blocks of storage of 348, 110 and 212 words, respectively. Let us further suppose that these blocks are
allocated sequentially, as shown in Figure 1.17(a). Now suppose that the second block of size 110 is freed,
resulting in the situation depicted in Figure 1.17(b). There are now 464 words of free space; yet, because
the free space is divided into noncontiguous blocks, a request for a block of 400 words could not be
satisfied.
Figure 1.17
Suppose that block 3 were now freed. Clearly, it is not desirable to retain three free blocks of 110, 212, and
354 words. Rather the blocks should be combined into a single large block of 676 words so that further
large requests can be satisfied. After combination, memory will appear as in Figure 1.17(c).
This example illustrates the necessity to keep track of available space, to allocate portions of that space
when allocation requests are presented, and to combine contiguous free spaces when a block is freed.
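The 1024-word example can be replayed with a small simulation. The sketch below is an assumption-laden illustration and not a real allocator: it keeps a table of blocks in address order, allocates by first fit, and combines a freed block with free neighbours. The names mem_init, mem_alloc, mem_free, total_free and largest_free, and the limit MAXBLOCKS, are invented for the example.

```c
#include <assert.h>

#define MEMSIZE   1024   /* words of memory, as in the example */
#define MAXBLOCKS 32     /* assumed limit on the number of blocks */

struct block { int start, size, is_free; };

struct block blocks[MAXBLOCKS];   /* kept in address order */
int nblocks;

/* Initially all of memory is one free block. */
void mem_init(void)
{
    blocks[0].start = 0;
    blocks[0].size = MEMSIZE;
    blocks[0].is_free = 1;
    nblocks = 1;
}

/* First fit: return the start address of the new block, or -1. */
int mem_alloc(int size)
{
    int i, j;

    assert(size > 0);
    for (i = 0; i < nblocks; i++) {
        if (!blocks[i].is_free || blocks[i].size < size)
            continue;
        if (blocks[i].size > size) {          /* split: the remainder stays free */
            assert(nblocks < MAXBLOCKS);
            for (j = nblocks; j > i + 1; j--)
                blocks[j] = blocks[j - 1];
            blocks[i + 1].start = blocks[i].start + size;
            blocks[i + 1].size = blocks[i].size - size;
            blocks[i + 1].is_free = 1;
            nblocks++;
            blocks[i].size = size;
        }
        blocks[i].is_free = 0;
        return blocks[i].start;
    }
    return -1;
}

/* Absorb block i + 1 into block i and close the gap in the table. */
static void merge_with_next(int i)
{
    int j;

    blocks[i].size += blocks[i + 1].size;
    for (j = i + 1; j < nblocks - 1; j++)
        blocks[j] = blocks[j + 1];
    nblocks--;
}

/* Free the allocated block at start and combine it with free neighbours. */
void mem_free(int start)
{
    int i;

    for (i = 0; i < nblocks; i++)
        if (blocks[i].start == start && !blocks[i].is_free)
            break;
    assert(i < nblocks);                      /* start must name an allocated block */
    blocks[i].is_free = 1;
    if (i + 1 < nblocks && blocks[i + 1].is_free)
        merge_with_next(i);
    if (i > 0 && blocks[i - 1].is_free)
        merge_with_next(i - 1);
}

/* Total number of free words. */
int total_free(void)
{
    int i, total = 0;

    for (i = 0; i < nblocks; i++)
        if (blocks[i].is_free)
            total += blocks[i].size;
    return total;
}

/* Size of the largest single free block. */
int largest_free(void)
{
    int i, best = 0;

    for (i = 0; i < nblocks; i++)
        if (blocks[i].is_free && blocks[i].size > best)
            best = blocks[i].size;
    return best;
}
```

After allocating 348, 110 and 212 words and freeing the 110-word block, total_free() reports 464 but largest_free() only 354, so a request for 400 words fails. Freeing the 212-word block merges everything into one free block of 676 words.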
Garbage Collection
Deallocation of nodes can take place in two levels:
1. The application which claimed the node, releases it back to the operating system.
2. The operating system calls the storage management routines to return free nodes to the free space.
The two levels can be described as follows:
1. The first occurs in a C program with the statement free(x), where x points to space allocated earlier by a
malloc call.
2. The second is usually implemented by the method of garbage collection. This requires the presence of a
marking bit on each node. It runs in two phases. In the first phase, all non-garbage nodes are marked. In
the second phase, all non-marked nodes are collected and returned to the free space. Where variable-size
nodes are used, it is desirable to keep the free space as one contiguous block; in this case, the second
phase is called memory compaction.
Garbage collection is usually called when some program runs out of space. It is a slow process and its use
should be avoided through efficient programming methods.
One field must be set aside in each node to indicate whether the node has or has not been marked. The
marking phase sets the mark field to true in each accessible node. As the collection phase proceeds, the mark
field in each accessible node is reset to false. Thus, at the start and end of garbage collection, all mark fields
are false. User programs do not affect the mark fields.
It is sometimes inconvenient to reserve one field in each node solely for the purpose of marking. In that
case a separate area in memory can be reserved to hold a long array of mark bits, one for each node that
may be allocated.
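The two phases can be sketched over a small fixed pool of nodes. Everything named here is an assumption for illustration: the pool size NUMNODES, the allocated flag, and the functions pool_init, getnode, mark, sweep and collect. The marking phase is written recursively for brevity, although a real collector would try to avoid the stack space that recursion consumes.

```c
#include <assert.h>
#include <stddef.h>

#define NUMNODES 8             /* assumed size of the node pool */

struct node {
    int info;
    int mark;                  /* 1 only during collection, for accessible nodes */
    int allocated;             /* 1 while the node is handed out to the user */
    struct node *left;
    struct node *right;
};

struct node pool[NUMNODES];
struct node *avail;            /* free list, linked through the right field */

/* Put every node of the pool on the free list. */
void pool_init(void)
{
    int i;

    for (i = 0; i < NUMNODES; i++) {
        pool[i].mark = 0;
        pool[i].allocated = 0;
        pool[i].left = NULL;
        pool[i].right = (i + 1 < NUMNODES) ? &pool[i + 1] : NULL;
    }
    avail = &pool[0];
}

/* Take a node from the free list; NULL means space is exhausted. */
struct node *getnode(void)
{
    struct node *p = avail;

    if (p == NULL)
        return NULL;
    assert(!p->allocated);     /* nodes on the free list are never in use */
    avail = p->right;
    p->left = p->right = NULL;
    p->allocated = 1;
    return p;
}

/* Phase 1: mark every node accessible from p. */
void mark(struct node *p)
{
    if (p == NULL || p->mark)
        return;
    p->mark = 1;
    mark(p->left);
    mark(p->right);
}

/* Phase 2: return unmarked allocated nodes to the free list and reset
   the mark field of accessible nodes. Returns the number recovered. */
int sweep(void)
{
    int i, recovered = 0;

    for (i = 0; i < NUMNODES; i++) {
        if (pool[i].mark) {
            pool[i].mark = 0;
        } else if (pool[i].allocated) {
            pool[i].allocated = 0;
            pool[i].right = avail;
            avail = &pool[i];
            recovered++;
        }
    }
    return recovered;
}

/* Run both phases, treating root as the only external pointer. */
int collect(struct node *root)
{
    mark(root);
    return sweep();
}
```

Since sweep resets each mark it finds, all mark fields are false again when collect returns, matching the description above.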
One aspect of garbage collection is that it must run when there is very little space available. This means that
auxiliary tables and stacks needed by the garbage collector must be kept to a minimum, since there is little
space available for them. An alternative is to reserve a specific percentage of memory for the exclusive use
of the garbage collector. However, this effectively reduces the amount of memory available to the user and
means that the garbage collector will be called more frequently.
Whenever the garbage collector is called, all user processing comes to a halt while the algorithm examines
all allocated nodes in memory. For this reason it is desirable that the garbage collector be called as
infrequently as possible. For real time applications, in which a computer must respond to a user request
within a specific short time span, garbage collection has generally been considered an unsatisfactory
method of storage management. We can picture a space ship drifting off into the infinite as it waits for
directions from a computer occupied with garbage collection. However, methods have recently been
developed whereby garbage collection can be performed simultaneously with user processing. This means
that the garbage collector must be called before all space has been exhausted so that user processing can
continue in whatever space is left, while the garbage collector recovers additional space.
Another important consideration is that users must be careful to ensure that all lists are well formed and that
all pointers are correct. Usually the operations of a list processing system are carefully implemented so that
if garbage collection does occur in the middle of one of them, the entire system still works correctly.
However, some users try to outsmart the system and implement their own pointer manipulations. This
requires great care so that garbage collection will work properly. In a real-time garbage collection system,
we must ensure not only that user operations do not upset list structures that the garbage collector must have but also
that the garbage collection algorithm itself does not unduly disturb the list structures that are being used
concurrently by the user.
It is possible that, at the time the garbage collection program is called, users are actually using almost all
the nodes that are allocated. Thus almost all nodes are accessible and the garbage collector recovers very
little additional space. After the system runs for a short time, it will again be out of space; the garbage
collector will again be called only to recover very few additional nodes, and the vicious cycle starts again.
This phenomenon, in which system storage management routines such as garbage collection are executing
almost all the time, is called thrashing.
Clearly thrashing is a situation to be avoided. One drastic solution is to impose the following condition. If
the garbage collector is run and does not recover a specific percentage of the total space, the user who
requested the extra space is terminated and removed from the system. All of that user's space is then
recovered and made available to other users.
Compaction
As a final topic, we shall briefly discuss compaction as a technique for reclaiming storage and introduce an
algorithm for this task.
Compaction works by actually moving blocks of data from one location in memory to another so as to
collect all the free blocks into one large block. The allocation problem then becomes completely simplified.
Allocation now consists of merely moving a pointer which points to the top of this successively shortening
block of storage. Once this single block gets too small again, the compaction mechanism is again invoked
to reclaim what unused storage may now exist among allocated blocks. There is generally no storage
release mechanism. Instead, a marking algorithm is used to mark blocks that are still in use. Then, instead
of freeing each unmarked block by calling a release mechanism to put it on the free list, the compactor
simply collects all unmarked blocks into one large block at one end of the memory segment. The only real
problem in this method is the redefining of pointers. This is solved by making extra passes through
memory. After blocks are marked, the entire memory is stepped through and the new address for each
marked block is determined. This new address is stored in the block itself. Then another pass over memory
is made. On this pass, pointers that point to marked blocks are reset to point to where the marked blocks will
be after compaction. This is why the new address is stored right in the block: it is easily obtainable. After all
pointers have been reset, the marked blocks are moved to their new locations. A general algorithm for the
compaction routine is as follows.
1. Invoke the garbage collection marking routine.
2. Repeat step 3 until the end of memory is reached.
3. If the current block of storage being examined has been marked, then set the new address of the block to
the starting address of unused memory and update the starting address of unused memory.
4. Redefine variable references so that they point to the new addresses.
5. Define new values for pointers in marked blocks.
6. Repeat step 7 until the end of memory is reached.
7. Move marked blocks into their new locations and reset the markers.
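The steps above can be sketched in C on a toy memory. The model is an assumption made for illustration: memory is an array of words, each block is described by a header record (address, size, mark, new address) kept in address order, each block holds at most one pointer (link) to another block, and there is a single external variable reference (root). Step 1, the marking routine, is assumed to have run already. The names add_block, forward and compact are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define MEMSIZE   64     /* assumed memory size in words */
#define MAXBLOCKS 8      /* assumed limit on the number of blocks */

struct block {
    int addr;            /* current starting address */
    int size;            /* length in words */
    int marked;          /* set by the marking routine if the block is in use */
    int newaddr;         /* address the block will have after compaction */
    int link;            /* address of a block this one points to, or -1 */
};

int mem[MEMSIZE];
struct block blocks[MAXBLOCKS];   /* block headers, in address order */
int nblocks;

/* Register a block; blocks must be added in increasing address order. */
void add_block(int addr, int size, int marked, int link)
{
    assert(nblocks < MAXBLOCKS);
    blocks[nblocks].addr = addr;
    blocks[nblocks].size = size;
    blocks[nblocks].marked = marked;
    blocks[nblocks].newaddr = addr;
    blocks[nblocks].link = link;
    nblocks++;
}

/* Translate an old block address into its new address. */
static int forward(int oldaddr)
{
    int i;

    if (oldaddr < 0)
        return -1;
    for (i = 0; i < nblocks; i++)
        if (blocks[i].addr == oldaddr) {
            assert(blocks[i].marked);   /* a live pointer targets a live block */
            return blocks[i].newaddr;
        }
    assert(0);                          /* pointer to no known block */
    return -1;
}

/* Compact marked blocks toward address 0 and return the starting
   address of unused memory. */
int compact(int *root)
{
    int i, n = 0, next = 0;

    /* steps 2 and 3: assign each marked block its new address */
    for (i = 0; i < nblocks; i++)
        if (blocks[i].marked) {
            blocks[i].newaddr = next;
            next += blocks[i].size;
        }
    /* step 4: redefine the variable reference */
    *root = forward(*root);
    /* step 5: define new values for pointers in marked blocks */
    for (i = 0; i < nblocks; i++)
        if (blocks[i].marked)
            blocks[i].link = forward(blocks[i].link);
    /* steps 6 and 7: move marked blocks and reset the markers */
    for (i = 0; i < nblocks; i++)
        if (blocks[i].marked) {
            memmove(&mem[blocks[i].newaddr], &mem[blocks[i].addr],
                    blocks[i].size * sizeof mem[0]);
            blocks[i].addr = blocks[i].newaddr;
            blocks[i].marked = 0;
            blocks[n++] = blocks[i];    /* drop headers of unmarked blocks */
        }
    nblocks = n;
    return next;
}
```

With a live block at address 0 (10 words, pointing to address 30), a garbage block at 10 (20 words) and a live block at 30 (10 words), compaction moves the third block down to address 10, rewrites the pointer in the first block to 10, and leaves unused memory starting at address 20.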
Student Activity 1.6
Answer the following questions:
1. What is Garbage collection? What is the disadvantage of garbage collection?
2. Discuss the advantages of AVL Trees.
3. What are the conditions for a tree to be an AVL tree?
Summary
Having gone through this chapter, we can summarize its concepts:
A binary tree is a finite set of elements that is either empty or is partitioned into three disjoint
subsets.
We can traverse a binary tree in three ways: preorder, inorder, and postorder.
In a right-threaded binary tree each NULL right link is replaced by a special link to the successor of that
node under inorder traversal, called a right thread.
A Binary Search Tree (BST) is an ordered binary tree such that either it is an empty tree or each data
value in its left subtree is less than the root value, each data value in its right subtree is greater than
the root value, and its left and right subtrees are again binary search trees.
A balanced binary tree (AVL tree) is a binary tree in which the heights of the two subtrees of every node
never differ by more than 1.
Self-assessment Exercises
Solved Exercise
I. True and False
1. A binary tree can have more than 2 children of a node.
2. Array representation of binary tree is more efficient than dynamic representation
II. Fill in the blanks
1. Garbage collection is a method of ____________.
2. The advantage of using a Binary search tree over an array is that a tree enables ____________
and ___________to be performed more efficiently.
3. The average time complexity of binary search is __________.
Answers
I. True and False
1. False
2. False
II. Fill in the blanks
1. run time storage management
2. insertion and deletion
3. O(log_2 n)
Unsolved Exercise
I. True and False
1. Threaded binary trees are useful in tree traversals.
2. An AVL tree is a more efficient binary search tree.
II. Fill in the blanks
1. A ____________ is a finite set of elements that is either empty or is partitioned into three
disjoint subsets.
2. In a ____________ binary tree each right link is replaced by a special link to the successor of that
node under inorder traversal.
3. The advantage of using a __________ over an array is that a tree enables search, insertion and
deletion operations to be performed efficiently.
4. Since the definition of a binary search tree is ________, it is easiest to describe a recursive
search method.
5. The effectiveness of the ______ process in a binary search tree depends on how the data are organized
to make up a specific tree.
Detailed Questions
1. Prove that the root of a binary tree is an ancestor of every node in the tree except itself.
2. Prove that a node of a binary tree has at most one father.
3. Prove that a strictly binary tree with n leaves contains 2n-1 nodes.
4. Two binary trees are similar if they are both empty, or if they are both nonempty, their left subtrees are
similar and their right subtrees are similar. Write an algorithm to determine whether two binary trees are
similar.
5. Write C routines to traverse a binary tree in preorder and postorder.
Overview
Bubble Sort
Insertion Sort
Selection Sort
Quick Sort
Merge Sort
Radix Sort
Heap Sort
External Sorting
Lower Bound Theory
Adversary Arguments
Minimum Spanning Tree
Shortest Paths
Graph Component Algorithm
String Matching
The Boyer-Moore Algorithm
Unit 2
Sorting Techniques
Learning Objectives
• Overview
• Bubble Sort
• Insertion Sort
• Selection Sort
• Quick Sort
• Merge Sort
• Radix Sort
• Heap Sort
• External Sort
• Lower Bound theory for sorting
• Selection and Adversary Argument
• Minimum Spanning Tree
• Prim’s Algorithm
• Kruskal’s Algorithm
• Shortest Path
• Graph Component Algorithm
• String Matching
• KMP Algorithm
• Boyer Moore Algorithm
Overview
The concept of an ordered set of elements is one that has considerable impact on our daily lives. Consider,
for example, the process of finding a telephone number in a telephone directory. This process, called a
search, is simplified considerably by the fact that the names in the directory are listed in alphabetical order.
Consider the trouble you might have in locating a number if the names were listed in the order in which the
customers placed their phone orders with the telephone company. In such a case, the names might as well have
been entered in random order. Since the entries are sorted in alphabetical rather than in chronological order,
the search is simplified.
A few years ago, it was estimated that more than half the time on many commercial computers was spent in
sorting. This is perhaps no longer true, since sophisticated methods have been devised for organizing data,
methods that do not require that it be kept in any special order. Eventually, nonetheless, the information
does go out to people, and then it must be sorted in some way. Because sorting is so important, a great many
algorithms have been devised for doing it. In fact, so many ideas appear in sorting methods that an entire
course could easily be built around this one theme. Among the differences between sorting methods, the most
important is the distinction between internal and external sorting, that is, whether there are so many
structures to be sorted that they must be kept in external files on disks, tapes, or the like, or whether they
can all be kept internally in high-speed memory.
We now present some basic terminology. A file of size n is a sequence of n items r(0), r(1), ..., r(n-1).
Each item in the file is called a record. A key, k(i), is associated with each record r(i); the key is usually
(but not always) a subfield of the entire record. The file is said to be sorted on the key if i < j implies
that k(i) precedes k(j) in some ordering on the keys. In the example of the telephone directory, the file
consists of all the entries in the book. Each entry is a record; the key upon which the file is sorted is the
name field of the record. Each record also contains fields for an address and a telephone number.
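The definition of a file sorted on a key can be checked mechanically. The following sketch assumes a hypothetical record layout for a directory entry (the struct and the function name are illustrative, not part of the text) and treats equal names as being in order. It is enough to compare each key with its immediate successor, since the ordering on the keys is transitive.

```c
#include <string.h>

/* Hypothetical record layout for one telephone directory entry. */
struct record {
    const char *name;     /* the key on which the file is sorted */
    const char *address;
    const char *phone;
};

/* Return 1 if r[0..n-1] is sorted on the name key, 0 otherwise. */
int is_sorted_on_name(const struct record r[], int n)
{
    for (int i = 0; i + 1 < n; i++) {
        if (strcmp(r[i].name, r[i + 1].name) > 0)
            return 0;     /* k(i) comes after k(i+1): not sorted */
    }
    return 1;
}
```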
Bubble Sort
The first sort presented is probably the most widely known among beginners and students of programming:
the bubble sort. One of the characteristics of this sort is that it is easy to understand and program. Yet, of
all the sorts we shall consider, it is probably the least efficient.
In each of the subsequent examples, x is an array of integers of which the first n are to be sorted so that
x[i] ≤ x[j] for 0 ≤ i < j < n. It is straightforward to extend this simple format to one that is used in
sorting n records, each with a subfield key k.
The basic idea underlying the bubble sort is to pass through the file sequentially several times. Each pass
consists of comparing each element in the file with its successor (x[i] with x[i+1]) and interchanging the
two elements if they are not in proper order. Consider the following file:
25 57 48 37 12 92 86 33
The following comparisons are made on the first pass:
x[0] with x[1] (25 with 57) no interchange
x[1] with x[2] (57 with 48) interchange
x[2] with x[3] (57 with 37) interchange
x[3] with x[4] (57 with 12) interchange
x[4] with x[5] (57 with 92) no interchange
x[5] with x[6] (92 with 86) interchange
x[6] with x[7] (92 with 33) interchange
Thus, after the first pass, the file is in the following order:
25 48 37 12 57 86 33 92
Notice that after this pass, the largest element (92) is in its proper position within the array. In general,
x[n-i] will be in its proper position after iteration i. This method is called the bubble sort because each
number slowly “bubbles” up to its proper position. After the second pass the file is
25 37 12 48 57 33 86 92
Notice that 86 has now found its way to the second highest position. Since each iteration places a new
element into its proper position, a file of n elements requires no more than n-1 iterations.
The complete set of iterations is the following
iteration 0 (initial file) 25 57 48 37 12 92 86 33
iteration 1 25 48 37 12 57 86 33 92
iteration 2 25 37 12 48 57 33 86 92
iteration 3 25 12 37 48 33 57 86 92
iteration 4 12 25 37 33 48 57 86 92
iteration 5 12 25 33 37 48 57 86 92
iteration 6 12 25 33 37 48 57 86 92
iteration 7 12 25 33 37 48 57 86 92
On the basis of foregoing discussion we could proceed to code the bubble sort. We present a routine bubble
that accepts two variables x and n. x is an array of numbers, and n is an integer representing the number of
elements to be sorted (n may be less than number of elements in x).
void bubble(int x[], int n)
{
    int i, j, temp;
    for (i = 0; i < n - 1; i++)          /* outer loop controls the number of passes */
        for (j = 0; j < n - 1; j++)      /* inner loop governs each individual pass */
            if (x[j] > x[j + 1])
            {                            /* interchange elements */
                temp = x[j];
                x[j] = x[j + 1];
                x[j + 1] = temp;
            }
}
What can be said about the efficiency of bubble sort? The total number of comparisons is
(n-1)(n-1) = n^2 - 2n + 1, which is O(n^2). Of course, the number of interchanges cannot be greater than the
number of comparisons. It is likely that it is the number of interchanges rather than the number of
comparisons that takes up the most time in the program's execution.
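Two observations made earlier suggest a cheaper variant: after iteration i the last i elements are already in place, and a pass that makes no interchange means the file is sorted. The sketch below applies both ideas. The function name and the returned pass count are illustrative additions, not part of the routine given in the text.

```c
/* Bubble sort with a shrinking inner loop and an early exit.
   Returns the number of passes actually made. */
int bubble_early_exit(int x[], int n)
{
    int passes = 0;
    int switched = 1;   /* 1 while the previous pass interchanged something */

    for (int pass = 0; pass < n - 1 && switched; pass++) {
        switched = 0;
        /* the last `pass` elements are already in their final positions */
        for (int j = 0; j < n - pass - 1; j++) {
            if (x[j] > x[j + 1]) {
                int temp = x[j];
                x[j] = x[j + 1];
                x[j + 1] = temp;
                switched = 1;
            }
        }
        passes++;
    }
    return passes;
}
```

On the sample file 25 57 48 37 12 92 86 33 this variant stops after six passes, because the sixth pass makes no interchange; on a file that is already sorted it stops after one pass. The worst case remains O(n^2).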
Student Activity 2.1
Before going to next section, answer the following questions:
1. Discuss the advantages of sorting.
2. Sort the following file using bubble sort
15, 4, 18, 19, 10, 21
If your answers are correct, then proceed to next section.
Insertion Sort
An insertion sort is one that sorts a set of records by inserting records into an existing sorted file.
Example
Initial order: SQ SA C7 H8 DK
Step 1: SQ SA C7 H8 DK
Step 2: C7 SQ SA H8 DK
Step 3: C7 H8 SQ SA DK
Step 4: C7 DK H8 SQ SA
Example of insertion sort
The insertion sort algorithm thus proceeds on the idea of keeping the first part of the list, once
examined, in the correct order. An initial list with only one item is automatically in order. If we suppose
that we have already sorted the first i-1 items, then we take item i and search through this sorted list of
length i-1 to see where to insert item i.
The algorithm for insertion sort is as follows:
void insertion(int x[], int n)
{
    int i, k, y;
    /* Initially x[0] may be thought of as a sorted file of one element. After each
       repetition of the following loop, the elements x[0] through x[k] are in order. */
    for (k = 1; k < n; k++)
    {
        y = x[k];
        /* move down by one position all elements greater than y */
        for (i = k - 1; i >= 0 && y < x[i]; i--)
            x[i + 1] = x[i];
        /* insert y at its proper position */
        x[i + 1] = y;
    }
}
If the initial file is sorted, only one comparison is made on each pass, so that the sort is O(n). If the file
is initially sorted in the reverse order, the total number of comparisons is
(n-1) + (n-2) + ... + 3 + 2 + 1 = (n-1) * n / 2
which is O(n^2). However, the insertion sort is still better than the bubble sort. The closer the file is to
sorted order, the more efficient the insertion sort becomes. The average number of comparisons in the
insertion sort is also O(n^2). The space requirement for the sort consists of only one temporary variable, y.
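The best-case and worst-case counts quoted above can be observed directly. The following sketch is an instrumented insertion sort; the function name and the counter are illustrative additions. It counts each comparison of y with an element x[i].

```c
/* Insertion sort that returns the number of key comparisons made. */
long insertion_count(int x[], int n)
{
    long comparisons = 0;

    for (int k = 1; k < n; k++) {
        int y = x[k];
        int i = k - 1;
        while (i >= 0) {
            comparisons++;          /* compare y with x[i] */
            if (y >= x[i])
                break;              /* y belongs just after x[i] */
            x[i + 1] = x[i];        /* move x[i] down one position */
            i--;
        }
        x[i + 1] = y;
    }
    return comparisons;
}
```

For n = 8 this gives 7 comparisons on a file that is already sorted (n-1) and 28 on a file in reverse order ((n-1) * n / 2).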
Student Activity 2.2
Before going to next section, answer the following questions:
1. Compare search efficiencies of Insertion sort and Bubble sort.
2. Sort the following key values using insertion sort
20, 17, 25, 13, 54
If your answers are correct, then proceed to next section.
Selection Sort
A selection sort is one in which successive elements are selected in order and placed into their proper sorted
positions. The elements of the input may have to be preprocessed to make the ordered selection possible.
The selection sort consists entirely of a selection phase in which the largest of the remaining elements,
large, is repeatedly placed in its proper position i at the end of the array. To do so, large is interchanged
with the element x[i]. After n-1 selections the entire array is sorted. Thus the selection process need be
done only from n-1 down to 1 rather than down to 0. The following algorithm implements the selection sort.
void selection(int x[], int n)
{
    int i, j, large, k;
    for (i = n - 1; i > 0; i--)
    {
        /* place the largest number of x[0] through x[i] into large and its index into k */
        large = x[0];
        k = 0;
        for (j = 1; j <= i; j++)
            if (x[j] > large)
            {
                large = x[j];
                k = j;
            }
        /* interchange the largest element with x[i] */
        x[k] = x[i];
        x[i] = large;
    }
}
Analysis of the selection sort is straightforward. The first pass makes n-1 comparisons, the second pass
makes n-2, and so on. Therefore, there is a total of
(n-1) + (n-2) + ... + 3 + 2 + 1 = n(n-1)/2
comparisons, which is O(n^2). The number of interchanges is always n-1. There is little additional storage
required (except to hold a few temporary variables). The sort may therefore be categorized as O(n^2),
although it is faster than bubble sort. An example of selection sort is given below:
Initial file: SQ SA C7 H8 DK
Pass 1: SQ DK C7 H8 SA
Pass 2: H8 DK C7 SQ SA
Pass 3: C7 DK H8 SQ SA
Pass 4: C7 DK H8 SQ SA
Student Activity 2.3
Before going to next section, answer the following questions:
1. Find the worst case efficiency of selection sort.
2. Compare storage efficiencies of bubble sort and selection sort.
If your answers are correct, then proceed to next section.
Quick Sort
The next sort we consider is the quick sort (or partition exchange sort). Let x be an array, and n the number
of elements in the array to be sorted. Choose an element a from a specific position within the array (for
example, a can be chosen as the first element so that a = x[0]). Suppose that the elements of x are
partitioned so that a is placed into position j and the following conditions hold:
1. Each of the elements in positions 0 through j-1 is less than or equal to a.
2. Each of the elements in positions j+1 through n-1 is greater than or equal to a.
Notice that if these two conditions hold for a particular a and j, a is the jth smallest element of x, so that a
remains in position j when the array is completely sorted. If the foregoing process is repeated with the
subarrays x[0] through x[j-1] and x[j+1] through x[n-1] and any subarrays created by the process in
successive iterations, the final result is a sorted file. Hence it is a divide-and-conquer technique.
Let us illustrate quicksort with an example. If an initial array is given as
25 57 48 37 12 92 86 33
and the first element (25) is placed in its proper position, the resulting array is
12 25 57 48 37 92 86 33
At this point, 25 is in its proper position in the array (x[1]), each element below that position (12) is less
than or equal to 25, and each element above that position (57, 48, 37, 92, 86 and 33) is greater than or equal
to 25. Since 25 is in its final position, the original problem has been decomposed into the problem of sorting
the two subarrays
(12) and (57 48 37 92 86 33)
Nothing need be done to sort the first of these subarrays; a file of one element is already sorted. To sort the
second subarray the process is repeated and the subarray is further subdivided. The entire array may now be
viewed as
12 25 (57 48 37 92 86 33)
where parentheses enclose the subarrays that are yet to be sorted. Repeating the process on the subarray
x[2] through x[7] yields
12 25 (48 37 33) 57 (92 86)
and further repetitions yield.
12 25 (37 33) 48 57 (92 86)
12 25 (33) 37 48 57 (92 86)
12 25 33 37 48 57 (92 86)
12 25 33 37 48 57 86 92
12 25 33 37 48 57 86 92
Note that the final array is sorted.
By this time you have noticed that the quicksort may be defined most conveniently as a recursive procedure.
We now present a mechanism to partition the given file, and then present an algorithm, partition, to
implement it.
The object of partition is to allow a specific element to find its proper position with respect to the others in
the subarray. Note that the manner in which this partition is performed is irrelevant to the sorting method.
All that is required by the sort is that the elements be partitioned properly. In the preceding example, the
elements in each of the two subfiles remain in the same relative order as they appear in the original file.
However, such a partition method is relatively inefficient to implement.
One way to effect a partition efficiently is the following. Let a = x[lb] be the element whose final position
is sought. Two pointers, up and down, are initialized to the upper and lower bounds of the subarray,
respectively. At any point during execution, each element in a position above up is greater than or equal to
a, and each element in a position below down is less than or equal to a. The two pointers up and down are
moved towards each other in the following way.
Step 1: Repeatedly increase the pointer down by one position until x[down] > a.
Step 2: Repeatedly decrease the pointer up by one position until x[up] <= a.
Step 3: If up > down, interchange x[down] with x[up].
The process is repeated until the condition in step 3 fails (up <= down), at which point x[up] is
interchanged with x[lb] (which equals a), whose final position was sought, and j is set to up.
We illustrate this process on the sample file, showing the positions of up and down as they are adjusted.
Each line gives the current indices of down and up; an arrow shows the pointer that is about to move and the
direction of the scan. Three asterisks on a line indicate that an interchange has just been made.
a = x[lb] = 25
down  up   scan     file
  0    7   down→    25 57 48 37 12 92 86 33
  1    7   ←up      25 57 48 37 12 92 86 33
  1    6   ←up      25 57 48 37 12 92 86 33
  1    5   ←up      25 57 48 37 12 92 86 33
  1    4            25 57 48 37 12 92 86 33
  1    4   ***      25 12 48 37 57 92 86 33
  1    4   down→    25 12 48 37 57 92 86 33
  2    4   ←up      25 12 48 37 57 92 86 33
  2    3   ←up      25 12 48 37 57 92 86 33
  2    2   ←up      25 12 48 37 57 92 86 33
  2    1            25 12 48 37 57 92 86 33
  2    1   ***      12 25 48 37 57 92 86 33
The first interchange exchanges x[down] with x[up] (57 with 12). The second, made once up has passed below
down, exchanges x[lb] with x[up] (25 with 12).
At this point 25 is in its proper position (position 1), and every element to its left is less than or equal to 25,
and every element to its right is greater than or equal to 25. We could now proceed to sort the two subarrays
(12) and (48 37 57 92 86 33) by applying the same method.
The algorithm for partition is as follows :
int partition(int x[], int lb, int ub)
{
    int a, down, temp, up;
    a = x[lb];          /* a is the element whose final position is sought */
    up = ub;
    down = lb;
    while (down < up) {
        while (x[down] <= a && down < ub)
            down++;     /* move up the array */
        while (x[up] > a)
            up--;       /* move down the array */
        if (down < up) {
            /* interchange x[down] and x[up] */
            temp = x[down];
            x[down] = x[up];
            x[up] = temp;
        } /* end if */
    } /* end while */
    x[lb] = x[up];
    x[up] = a;
    return up;
} /* end partition */
We may now write the code to implement the quicksort. The call quicksort(a, p, q) sorts the elements a[p]
through a[q].
void quicksort(int a[], int p, int q)
{
    int j;
    if (p < q)
    {
        /* divide into two subarrays; partition takes inclusive bounds */
        j = partition(a, p, q);
        /* solve the subproblems */
        quicksort(a, p, j - 1);
        quicksort(a, j + 1, q);
        /* there is no need to combine the solutions */
    }
}
Efficiency of Quick Sort
How efficient is the quicksort? Assume that the file size n is a power of 2, say n = 2^m, so that m = log_2 n.
Assume also that the proper position for the pivot always turns out to be the exact middle of the subarray. In
that case there will be approximately n comparisons (actually n-1) on the first pass, after which the file is
split into two subarrays (subfiles) of size n/2, approximately. For each of these two files there are
approximately n/2 comparisons, and a total of four files, each of size n/4, are formed. Each of these files
requires n/4 comparisons, yielding a total of eight subfiles of size n/8. After halving the subfiles m times,
there are n files of size 1. Thus the total number of comparisons for the entire sort is approximately
n + 2*(n/2) + 4*(n/4) + ... + (n/2)*2
or
n + n + n + ... + n (m terms)
= n * m = n log_2 n
Thus the total number of comparisons is O(n log n). If the foregoing properties describe the file, the
quicksort is O(n log n), which is relatively efficient.
The analysis for the case in which the file size is not an integral power of 2 is similar but slightly more
complex; the results, however, remain the same. It can be shown that on the average (over all files of size
n), the quicksort makes approximately 1.386 n log_2 n comparisons.
For the algorithm quicksort in which x[lb] is used as the pivot value, this analysis assumes that the original
array and all the resulting subarrays are unsorted, so that the pivot value x[lb] always finds its proper
position at the middle of the subarray. Suppose that the preceding conditions do not hold and the original
array is sorted (or almost sorted). If, for example, x[lb] is in its correct position, the original file is
split into subfiles of sizes 0 and n-1. If this process continues, a total of n-1 subfiles are sorted, the
first of size n, the second of size n-1, the third of size n-2, and so on. Assuming k comparisons to rearrange
a file of size k, the total number of comparisons to sort the entire file is
n + (n-1) + (n-2) + ... + 2
which is O(n^2). Similarly, if the original file is sorted in descending order, the final position of x[lb] is
ub and the file is again split into two subfiles that are heavily unbalanced (sizes n-1 and 0). Thus the
unmodified quicksort has the seemingly absurd property that it works best for files that are completely
unsorted and worst for files that are completely sorted. This property is precisely the opposite of that of the
bubble sort, which works best for sorted files and worst for unsorted files.
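The contrast between unsorted and sorted input can be seen with a small experiment. The sketch below is a self-contained version of the partition and quicksort routines with inclusive bounds. It adds a counter that charges k units for rearranging a subfile of size k, which is the same cost model used in the analysis above. The counter and the function names are illustrative additions.

```c
static long work;   /* sum of the sizes of all subfiles that were partitioned */

/* Partition x[lb..ub] around a = x[lb]; return the final position of a. */
static int partition_count(int x[], int lb, int ub)
{
    int a = x[lb];
    int down = lb, up = ub, temp;

    while (down < up) {
        while (down < ub && x[down] <= a)
            down++;                     /* move up the array */
        while (x[up] > a)
            up--;                       /* move down the array */
        if (down < up) {
            temp = x[down];
            x[down] = x[up];
            x[up] = temp;
        }
    }
    x[lb] = x[up];
    x[up] = a;
    return up;
}

static void quicksort_count(int x[], int p, int q)
{
    if (p < q) {
        work += q - p + 1;              /* charge k units for a subfile of size k */
        int j = partition_count(x, p, q);
        quicksort_count(x, p, j - 1);
        quicksort_count(x, j + 1, q);
    }
}

/* Sort x[0..n-1] and return the total work under the cost model above. */
long quicksort_work(int x[], int n)
{
    work = 0;
    quicksort_count(x, 0, n - 1);
    return work;
}
```

For the sample file 25 57 48 37 12 92 86 33 the count is 21, while for the same eight keys already in ascending order it is 8 + 7 + ... + 2 = 35.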
Student Activity 2.4
Before going to next section, answer the following questions :
1. When would quick sort be worse than simple selection sort?
2. Sort the following file using bubble sort
27, 45, 15, 50, 23
If your answers are correct, then proceed to next section.
Merge Sort
This sort is an example of the divide-and-conquer technique. It has the nice property that in the worst case
its complexity is O(n log n). This algorithm is called merge sort. We assume throughout that the elements are
to be sorted in nondecreasing order. Given a sequence of n elements (also called keys) a[1], ..., a[n], the
general idea is to imagine them split into two sets a[1], ..., a[n/2] and a[n/2+1], ..., a[n]. Each set is
individually sorted, and the resulting sorted sequences are merged to produce a single sorted sequence of n
elements. Thus we have an ideal example of the divide-and-conquer strategy in which the splitting is into two
equal-sized sets and the combining operation is the merging of two sorted sets into one (in quick sort, by
contrast, no combining step is needed).
The algorithm mergesort describes this process very succinctly using recursion and a function merge, which
merges two sorted sets. Before executing mergesort, the n elements should be placed in a[1] through a[n] of a
global array a. Then mergesort(1, n) causes the keys to be rearranged into nondecreasing order in a.
#define MAX 20              /* assumed largest number of keys that can be sorted */
int a[MAX + 1];             /* global array; the keys occupy a[1] through a[n] */

void merge(int low, int mid, int high);

void mergesort(int low, int high)
{
    int mid;
    if (low < high)                 /* if there is more than one element */
    {
        /* divide the problem into subproblems: find where to split the array */
        mid = (low + high) / 2;
        /* solve the subproblems */
        mergesort(low, mid);
        mergesort(mid + 1, high);
        /* combine the solutions */
        merge(low, mid, high);
    }
}

/* merge the sorted runs a[low..mid] and a[mid+1..high] */
void merge(int low, int mid, int high)
{
    int h, i, j, k, b[MAX + 1];
    h = low; i = low; j = mid + 1;
    while ((h <= mid) && (j <= high))
    {
        if (a[h] <= a[j]) {
            b[i] = a[h];
            h = h + 1;
        }
        else
        {
            b[i] = a[j];
            j = j + 1;
        }
        i = i + 1;
    }
    if (h > mid)
        for (k = j; k <= high; k++)     /* copy the rest of the second run */
        {
            b[i] = a[k];
            i = i + 1;
        }
    else
        for (k = h; k <= mid; k++)      /* copy the rest of the first run */
        {
            b[i] = a[k];
            i = i + 1;
        }
    for (k = low; k <= high; k++)       /* copy the merged result back into a */
        a[k] = b[k];
}
Example
Consider the array of ten elements a[ ] = {310, 285, 179, 652, 351, 423, 861, 254, 450, 520}. Algorithm
mergesort begins by splitting a[ ] into two subarrays, each of size five. The elements in a[1 to 5] are then
split into two subarrays of size three (a[1 to 3]) and two (a[4 to 5]). Then the items in a[1 to 3] are split
into subarrays of size two (a[1 to 2]) and one (a[3 to 3]). The values in a[1 to 2] are split a final time into
one-element subarrays, and now the merging begins. Note that no movement of data has yet taken place. A
record of the subarrays is implicitly maintained by the recursive mechanism. Pictorially the file can now be
viewed as
{310 | 285 | 179 | 652, 351 | 423, 861, 254, 450, 520}
where vertical bars indicate the boundaries of subarrays. Elements a[1] and a[2] are merged to get
{285, 310 | 179 | 652, 351 | 423, 861, 254, 450, 520}
Then a[3] is merged with a[1 to 2] to yield
{179, 285, 310 | 652, 351 | 423, 861, 254, 450, 520}
Next, elements a[4] and a[5] are merged:
{179, 285, 310 | 351, 652 | 423, 861, 254, 450, 520}
and then a[1 to 3] and a[4 to 5]:
{179, 285, 310, 351, 652 | 423, 861, 254, 450, 520}
At this point the algorithm has returned to the first invocation of mergesort and is about to process the
second recursive call. Repeated recursive calls produce the following subarrays:
{179, 285, 310, 351, 652 | 423 | 861 | 254 | 450, 520}
Elements a[6] and a[7] are merged. Then a[8] is merged with a[6 to 7]:
{179, 285, 310, 351, 652 | 254, 423, 861 | 450, 520}
Next a[9] and a[10] are merged, and then a[6 to 8] and a[9 to 10]:
{179, 285, 310, 351, 652 | 254, 423, 450, 520, 861}
At this point there are two sorted subarrays and the final merge produces the fully sorted result:
{179, 254, 285, 310, 351, 423, 450, 520, 652, 861}
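The example can be reproduced with a self-contained variant of the algorithm. The sketch below departs from the routines in the text in three labeled ways: the array and a scratch buffer are passed as parameters instead of being global, indices start at 0 rather than 1, and the function names merge_runs and merge_sort are illustrative.

```c
/* Merge the sorted runs a[low..mid] and a[mid+1..high], using tmp as scratch. */
static void merge_runs(int a[], int tmp[], int low, int mid, int high)
{
    int h = low, j = mid + 1, i = low;

    while (h <= mid && j <= high)
        tmp[i++] = (a[h] <= a[j]) ? a[h++] : a[j++];
    while (h <= mid)                /* rest of the first run */
        tmp[i++] = a[h++];
    while (j <= high)               /* rest of the second run */
        tmp[i++] = a[j++];
    for (int k = low; k <= high; k++)
        a[k] = tmp[k];
}

/* Sort a[low..high]; tmp must have room for at least high + 1 elements. */
void merge_sort(int a[], int tmp[], int low, int high)
{
    if (low < high) {
        int mid = low + (high - low) / 2;
        merge_sort(a, tmp, low, mid);
        merge_sort(a, tmp, mid + 1, high);
        merge_runs(a, tmp, low, mid, high);
    }
}
```

Calling merge_sort(a, tmp, 0, 9) on the ten keys of the example, with tmp a second array of ten integers, leaves a holding 179, 254, 285, 310, 351, 423, 450, 520, 652, 861.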
Efficiency of Merge Sort
There are obviously no more than log_2 n passes in merge sort, each involving n or fewer comparisons. Thus
mergesort requires no more than n log_2 n comparisons. In fact, it can be shown that the mergesort requires
fewer than n log_2 n - n + 1 comparisons on the average, compared with 1.386 n log_2 n average comparisons
for quick sort. In addition, quick sort can require O(n^2) comparisons in the worst case, whereas mergesort
never requires more than n log_2 n. However, merge sort does require approximately twice as many assignments
as quick sort on the average.
Merge sort also requires O(n) additional space for the auxiliary array, whereas quicksort requires only
O(log n) additional space for the stack (if implemented using a stack). An algorithm has been developed for
an in-place merge of two sorted subarrays in O(n) time. This algorithm would allow mergesort to become an
in-place O(n log n) sort. However, that technique requires a great many more assignments and would thus not
be as practical as finding the O(n) extra space.
Student Activity 2.5
Before going to next section, answer the following questions:
1. Compare the space requirements of quick sort and merge sort.
2. Sort the following file using merge sort
16, 17, 10, 9, 4, 18
If your answers are correct, then proceed to next section.
Radix Sort
The next sorting method that we consider is called the radix sort. This sort is based on the values of the
actual digits in the positional representations of the numbers being sorted. For example, the number 235 in
decimal notation is written with a 2 in the hundreds position, a 3 in the tens position, and a 5 in the units
position. The larger of two such integers of equal length can be determined as follows: start at the most
significant digit and advance through the least significant digit as long as the corresponding digits in the
two numbers match. The number with the larger digit in the first position in which the digits of the two
numbers do not match is the larger of the two numbers. Of course, if all the digits of both numbers match,
the numbers are equal.
We can write a sorting routine based on the foregoing method. Using the decimal base, for example, the numbers
can be partitioned into ten groups based on their most significant digit. Thus every element in the “0” group
is less than every element in the “1” group, all of whose elements are less than every element in the “2”
group, and so on. We can then sort within the individual groups based on the next significant digit. We repeat
this process until each subgroup has been subdivided so that the least significant digits are sorted. At this
point the original file has been sorted. This method is sometimes called radix exchange sort.
Let us now consider an alternative to the foregoing method. It is apparent from the foregoing discussion that
considerable bookkeeping is involved in constantly subdividing files and distributing their contents into
subfiles based on particular digits. It would certainly be easier if we could process the entire file as a whole
rather than deal with many individual files.
Suppose that we perform the following actions on the file for each digit, beginning with the least significant
digit and ending with the most significant digit. Take each number in the order in which it appears in the
file and place it into one of ten queues, depending on the value of the digit currently being processed. Then
restore each queue to the original file, starting with the queue of numbers with a 0 digit and ending with the
queue of numbers with a 9 digit. When these actions have been performed for each digit, starting with the
least significant and ending with the most significant, the file is sorted. This sorting method is called the
radix sort.
Notice that this scheme sorts on the less significant digits first. Thus when all the numbers are sorted on a
more significant digit, numbers that have the same digit in that position but different digits in a
less significant position are already sorted on the less significant position. This allows processing of the
entire file without subdividing the files and keeping track of where each subfile begins and ends.
Example:
Now we illustrate this sort on the following file
25 57 48 37 12 92 86 33
Queues based on the least significant digit:
            Front   Rear
Queue [0]
Queue [1]
Queue [2]   12      92
Queue [3]   33
Queue [4]
Queue [5]   25
Queue [6]   86
Queue [7]   57      37
Queue [8]   48
Queue [9]
After the first pass:
12 92 33 25 86 57 37 48
Queues based on the most significant digit:
            Front   Rear
Queue [0]
Queue [1]   12
Queue [2]   25
Queue [3]   33      37
Queue [4]   48
Queue [5]   57
Queue [6]
Queue [7]
Queue [8]   86
Queue [9]   92
Therefore the sorted file is: 12 25 33 37 48 57 86 92
#define NUM 10          /* number of queues, one per decimal digit */
#define MAXELTS 100     /* assumed maximum number of elements in the file */

void radixsort(int x[], int n)
{
    int front[NUM], rear[NUM];
    struct {
        int info;
        int next;
    } node[MAXELTS];
    int exp, first, i, j, k, p, q, y;

    if (n <= 0 || n > MAXELTS)
        return;
    /* initialize linked list */
    for (i = 0; i < n - 1; i++) {
        node[i].info = x[i];
        node[i].next = i + 1;
    }
    node[n - 1].info = x[n - 1];
    node[n - 1].next = -1;
    first = 0;          /* first is the head of the list */
    exp = 1;            /* exp is 10 raised to the (k-1)th power */
    for (k = 1; k < 5; k++) {
        /* assume we have four-digit numbers */
        for (i = 0; i < NUM; i++) {
            /* initialize queues */
            rear[i] = -1;
            front[i] = -1;
        }
        /* process each element on the list */
        while (first != -1) {
            p = first;
            first = node[first].next;
            y = node[p].info;
            /* extract the kth digit */
            j = (y / exp) % 10;
            /* insert y into queue[j] */
            q = rear[j];
            if (q == -1)
                front[j] = p;
            else
                node[q].next = p;
            rear[j] = p;
        }
        /* At this point each record is in its proper queue based on digit k.
           We now form a single list from all the queue elements. */
        /* find the first element */
        for (j = 0; j < NUM && front[j] == -1; j++)
            ;
        first = front[j];
        p = j;          /* p is the last nonempty queue linked so far */
        /* link up remaining queues */
        while (j <= 9) {    /* check if finished */
            /* find the next element */
            for (i = j + 1; i < NUM && front[i] == -1; i++)
                ;
            if (i <= 9) {
                p = i;
                node[rear[j]].next = front[i];
            }
            j = i;
        }
        node[rear[p]].next = -1;
        exp = exp * 10;
    }
    /* copy back to original array */
    for (i = 0; i < n; i++) {
        x[i] = node[first].info;
        first = node[first].next;
    }
}
The time requirements for the radix sort clearly depend on the number of digits (m) and the number of
elements in the file (n). Since the outer loop is traversed m times (once for each digit) and the inner loop
n times (once for each element), the sort is approximately O(m*n). Thus the sort is reasonably efficient if
the number of digits in the keys is not too large.
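The same least-significant-digit-first idea can also be written without linked queues. The sketch below swaps in a different technique, named plainly here: a stable counting distribution into an output array takes the place of the ten queues. It assumes nonnegative integers with a known number of decimal digits; the function name and the digits parameter are illustrative and not from the text.

```c
#include <stdlib.h>
#include <string.h>

/* Sort nonnegative integers of at most `digits` decimal digits,
   processing the least significant digit first. */
void radix_sort_lsd(int x[], int n, int digits)
{
    if (n <= 0)
        return;
    int *out = malloc((size_t)n * sizeof *out);
    if (out == NULL)
        return;                     /* no memory: leave the file unchanged */

    int exp = 1;                    /* 10 raised to the current digit position */
    for (int k = 0; k < digits; k++) {
        int start[10] = {0};

        /* count how many keys fall into each digit group */
        for (int i = 0; i < n; i++)
            start[(x[i] / exp) % 10]++;

        /* turn the counts into the starting position of each group */
        int pos = 0;
        for (int d = 0; d < 10; d++) {
            int count = start[d];
            start[d] = pos;
            pos += count;
        }

        /* distribute in file order, which keeps the pass stable */
        for (int i = 0; i < n; i++)
            out[start[(x[i] / exp) % 10]++] = x[i];

        memcpy(x, out, (size_t)n * sizeof *x);
        exp *= 10;
    }
    free(out);
}
```

Each of the m passes touches every one of the n elements a constant number of times, so this variant also runs in time proportional to m*n.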
Student Activity 2.6
Before going to next section, answer the following questions:
1. Explain Radix Sort method.
2. Sort the following file using radix sort
637, 455, 987, 462, 982
If your answers are correct, then proceed to next section.
Heap Sort
We begin by defining a new structure, the heap. We have studied binary trees earlier. A binary tree is
illustrated below; each node is shown with its number followed by its key.
                        1 Z
            2 R                     3 P
      4 G         5 M         6 J         7 A
    8 C  9 D   10 E  11 I   12 C
Figure 2.1. A Binary Tree
A complete binary tree is said to satisfy the ‘heap condition’ if the key of each node is greater than or equal
to the key in its children. Thus the root node will have the largest key value.
Trees can be represented as arrays, by first numbering the nodes (starting from the root) from left to right.
The key values of the nodes are then assigned to array positions whose index is given by the number of the
node. For the example tree above, the corresponding array would be
Index 1 2 3 4 5 6 7 8 9 10 11 12
Array : Z R P G M J A C D E I C
The relationships of a node can be determined from this array representation. If a node is at position j, its
children will be at positions 2j and 2j+1. Its parent will be at position ⌊j/2⌋.
Consider the node M. It is at position 5. Its parent node is therefore at position ⌊5/2⌋ = 2, i.e. the parent
is R. Its children are at positions 2*5 and (2*5)+1, i.e. 10 and 11 respectively, so E and I are its children.
We see from the pictorial representation that these relationships are correct.
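These index relationships translate directly into code. The sketch below assumes the keys are stored in key[1] through key[n], with key[0] unused so that the numbering matches the text. The function names are illustrative.

```c
/* Index arithmetic for a tree stored in key[1..n]. */
int heap_parent(int j) { return j / 2; }       /* integer division gives the floor */
int heap_left(int j)   { return 2 * j; }
int heap_right(int j)  { return 2 * j + 1; }

/* Return 1 if every node's key is greater than or equal to the keys
   of its children, i.e. if key[1..n] satisfies the heap condition. */
int satisfies_heap_condition(const char key[], int n)
{
    for (int j = 2; j <= n; j++) {
        if (key[heap_parent(j)] < key[j])
            return 0;
    }
    return 1;
}
```

For the array Z R P G M J A C D E I C, heap_parent(5) is 2 (the node R), and heap_left(5) and heap_right(5) are 10 and 11 (the nodes E and I).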
A Heap is a complete binary tree, in which each node satisfies the heap condition, represented as an array.
We will now study the operations possible on a heap and see how these can be combined to generate a
sorting algorithm.
The operations on a heap work in 2 steps.
1. The required node is inserted/deleted/or replaced.
2. It may cause violation of the heap condition so the heap is traversed and modified to rectify any such
violations.
Examples
Insertion
Consider the insertion of node R in the heap of figure 2.1.
(i) Initially R is added as the right child of J and given the number 13.
(ii) But then the heap condition is violated.
(iii) Move R up to position 6 and move J down to position 13.
(iv) But the heap condition is still violated.
(v) Swap R and P.
(vi) The heap condition is now satisfied by all the nodes and we get the following heap.
[Figure 2.2. A Heap. Nodes numbered level by level from the root: 1 Z; 2 R, 3 R; 4 G, 5 M, 6 P, 7 A; 8 C, 9 D, 10 E, 11 I, 12 C, 13 J]
Deletion
Consider the deletion of M from the heap of figure 2.2. The larger of M’s children is promoted to position 5, to get:
[Figure 2.3. Deletion from the above heap: I, the larger child of M, now occupies position 5]
An efficient sorting method is based on heap construction followed by removal of the nodes from the heap in order. This algorithm is guaranteed to sort n elements in O(n log n) steps.
We will first see 2 methods of heap construction and then removal in order from the heap to sort the list.
Top down heap construction
Insert items into an initially empty heap, keeping the heap condition inviolate at all steps.
Example
Now we build a heap for the following array of characters:
P R O F E S S I O N A L
[Figures 2.4(a) to 2.4(k): the heap after each insertion, from R through L. The final heap in Figure 2.4(k), read level by level, is S; P S; O N O R; F I E A L]
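Top down construction can be sketched in C as repeated insertion, each new key moving up past smaller parents. This is a minimal sketch, not the text's own code: the names heap_insert and build_top_down are hypothetical, and the heap is kept in positions 1..n of a char array so that the 2j and 2j + 1 arithmetic applies directly.

```c
#include <string.h>

/* Insert key into a max-heap stored in heap[1..*n] (heap[0] is unused). */
void heap_insert(char *heap, int *n, char key)
{
    int j = ++(*n);                     /* new node takes the next free position */
    while (j > 1 && heap[j / 2] < key) {
        heap[j] = heap[j / 2];          /* move the smaller parent down */
        j /= 2;
    }
    heap[j] = key;
}

/* Build a heap top down from the characters of items.
   heap must have room for strlen(items) + 2 chars. */
void build_top_down(const char *items, char *heap)
{
    int n = 0;
    size_t i, len = strlen(items);
    heap[0] = ' ';                      /* filler so that heap + 1 is an ordinary string */
    for (i = 0; i < len; i++)
        heap_insert(heap, &n, items[i]);
    heap[n + 1] = '\0';
}
```

Applied to PROFESSIONAL, this yields the array S P S O N O R F I E A L, which is the heap the sorting walkthrough later starts from.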
Bottom up heap construction
Build a complete binary tree with the items in the order presented. Then, starting from the rightmost node and working back towards the root, modify the tree to satisfy the heap condition.
Example: Now we see the above method on the same array
P R O F E S S I O N A L
[Figures 2.5(a) to 2.5(f): the tree after each modification step. Figure 2.5(a) holds the items in the order presented; the final heap in Figure 2.5(f), read level by level, is S; R S; O N P O; I F E A L]
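A possible C sketch of bottom up construction is given below. It assumes the items are already stored in positions 1..n, and the names sift_down, build_bottom_up and is_heap are hypothetical. Where two children hold equal keys the sketch prefers the left child, so the arrangement of equal keys may differ from the figures even though the result is a valid heap.

```c
/* Restore the heap condition for the subtree rooted at position j of heap[1..n]. */
void sift_down(char *heap, int n, int j)
{
    char key = heap[j];
    while (2 * j <= n) {
        int child = 2 * j;
        if (child < n && heap[child + 1] > heap[child])
            child++;                    /* pick the larger child */
        if (key >= heap[child])
            break;
        heap[j] = heap[child];          /* promote the larger child */
        j = child;
    }
    heap[j] = key;
}

/* Bottom up construction: fix every node that has children, from the last one back to the root. */
void build_bottom_up(char *heap, int n)
{
    int j;
    for (j = n / 2; j >= 1; j--)
        sift_down(heap, n, j);
}

/* Returns 1 if heap[1..n] satisfies the heap condition, 0 otherwise. */
int is_heap(const char *heap, int n)
{
    int j;
    for (j = 2; j <= n; j++)
        if (heap[j / 2] < heap[j])
            return 0;
    return 1;
}
```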
We will now see how the sorting takes place using the heap built by the top down approach. The sorted elements will be placed in A[ ], an array of size 12.
[Figure 2.6(a): the heap built top down, read level by level: S; P S; O N O R; F I E A L]
1. Remove S and store it in A[12]
[Figure 2.6(b): the heap and the array A[1..12] after step 1]
2. Remove S and store it in A[11]
[Figure 2.6(c): the heap and the array A[1..12] after step 2]
3. Remove R and store it in A[10]
[Figure 2.6(d): the heap and the array A[1..12] after step 3]
4. Remove P and store it in A[9]
[Figure 2.6(e): the heap and the array A[1..12] after step 4]
5. Remove O and store it in A[8]
[Figure 2.6(f): the heap and the array A[1..12] after step 5]
6. Remove O and store it in A[7]
[Figure 2.6(g): the heap and the array A[1..12] after step 6]
7. Remove N and store it in A[6]
[Figure 2.6(h): the heap and the array A[1..12] after step 7]
8. Similarly, the remaining nodes are removed and the heap modified to get the sorted list
AEFILNOOPRSS.
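The removal phase can be sketched in C as below. One difference from the walkthrough is an assumption of this sketch: instead of a separate array A, each removed key is stored in the slot that the shrinking heap has just vacated, which gives the same sorted order in place. The names remove_all and is_sorted are hypothetical, and the heap again occupies positions 1..n.

```c
/* Restore the heap condition for the subtree rooted at position j of heap[1..n]. */
void sift_down(char *heap, int n, int j)
{
    char key = heap[j];
    while (2 * j <= n) {
        int child = 2 * j;
        if (child < n && heap[child + 1] > heap[child])
            child++;
        if (key >= heap[child])
            break;
        heap[j] = heap[child];
        j = child;
    }
    heap[j] = key;
}

/* Sort heap[1..n] in place, given that it already satisfies the heap condition. */
void remove_all(char *heap, int n)
{
    while (n > 1) {
        char largest = heap[1];
        heap[1] = heap[n];      /* the last node replaces the root */
        heap[n] = largest;      /* the removed key goes into the freed slot */
        n--;
        sift_down(heap, n, 1);  /* repair the heap condition from the root */
    }
}

/* Returns 1 if a[1..n] is in nondecreasing order. */
int is_sorted(const char *a, int n)
{
    int j;
    for (j = 2; j <= n; j++)
        if (a[j - 1] > a[j])
            return 0;
    return 1;
}
```

Each removal costs at most about log n key moves, which is where the O(n log n) bound for the whole sort comes from.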
External Sorting
So far, all the algorithms we have examined require that the input fit into main memory. There are, however, applications where the input is much too large to fit into memory. This section will discuss external sorting algorithms, which are designed to handle very large inputs.
Why We Need New Algorithms
Most of the internal sorting algorithms take advantage of the fact that memory is directly addressable. Shell sort compares elements A[i] and A[i - h_k] in one time unit. Heapsort compares elements A[i] and A[2i + 1] in one time unit. Quick sort, with median-of-three partitioning, requires comparing A[Left], A[Center], and A[Right] in a constant number of time units. If the input is on a tape, all these operations lose their efficiency, since elements on a tape can only be accessed sequentially. Even if the data is on a disk, there is still a practical loss of efficiency, because of the delay required to spin the disk and move the disk head.
To see how slow external accesses really are, create a file that is large, but not too big to fit in main memory. Read the file in and sort it using an efficient algorithm. The time it takes to sort the input is certain to be insignificant compared to the time to read the input, even though sorting is an O(n log n) operation and reading the input is only O(n).
Model for External Sorting
The wide variety of mass storage devices makes external sorting much more device-dependent than internal sorting. The algorithms that we will consider work on tapes, which are probably the most restrictive storage medium. Since access to an element on tape is done by winding the tape to the correct location, tapes can be efficiently accessed only in sequential order (in either direction).
We will assume that we have at least three tape drives to perform the sorting. We need two drives to do an efficient sort; the third drive simplifies matters. If only one tape drive is present, then we are in trouble: any algorithm will require $\Omega(N^2)$ tape accesses.
The Simple Algorithm
The basic external sorting algorithm uses the Merge routine from mergesort. Suppose we have four tapes, Ta1, Ta2, Tb1, Tb2, which are two input and two output tapes. Depending on the point in the algorithm, the a and b tapes are either input tapes or output tapes. Suppose the data is initially on Ta1. Suppose further that the internal memory can hold (and sort) M records at a time. A natural first step is to read M records at a time from the input tape, sort the records internally, and then write the sorted records alternately to Tb1 and Tb2. We will call each set of sorted records a run. When this is done, we rewind all the tapes. Suppose we have the same input as our example for Shellsort.
Ta1   81 94 11 96 12 35 17 99 28 58 41 75 15
Ta2
Tb1
Tb2
If M = 3, then after the runs are constructed, the tapes will contain the data indicated in the following
figure.
Ta1
Ta2
Tb1   11 81 94 17 28 99 15
Tb2   12 35 96 41 58 75
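The run construction step can be sketched in C by letting plain arrays stand in for the tapes. This is an illustrative sketch under that assumption; construct_runs and its parameters are hypothetical names, and the standard library qsort plays the role of the internal sort.

```c
#include <stdlib.h>

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Split input[0..n-1] into sorted runs of at most m records and write the
   runs alternately to tape1 and tape2 (arrays standing in for Tb1 and Tb2).
   Returns the number of runs, or -1 if memory for the buffer is unavailable. */
int construct_runs(const int *input, int n, int m,
                   int *tape1, int *len1, int *tape2, int *len2)
{
    int runs = 0, start, i;
    int *buffer = malloc((size_t)m * sizeof *buffer);
    if (buffer == NULL)
        return -1;
    *len1 = *len2 = 0;
    for (start = 0; start < n; start += m) {
        int count = (n - start < m) ? n - start : m;
        for (i = 0; i < count; i++)          /* read up to m records */
            buffer[i] = input[start + i];
        qsort(buffer, (size_t)count, sizeof *buffer, cmp_int);
        for (i = 0; i < count; i++) {        /* write the run to alternating tapes */
            if (runs % 2 == 0)
                tape1[(*len1)++] = buffer[i];
            else
                tape2[(*len2)++] = buffer[i];
        }
        runs++;
    }
    free(buffer);
    return runs;
}
```

With the thirteen-record input and m = 3 this produces five runs, distributed over the two output arrays as in the figure.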
Now, Tb1 and Tb2 contain a group of runs. We take the first run from each tape and merge them, writing the result, which is a run twice as long, onto Ta1. Then we take the next run from each tape, merge these, and write the result to Ta2. We continue this process, alternating between Ta1 and Ta2, until either Tb1 or Tb2 is empty. At this point either both are empty or there is one run left. In the latter case, we copy this run to the appropriate tape. We rewind all four tapes and repeat the same steps, this time using the a tapes as input and the b tapes as output. This will give runs of length 4M. We continue the process until we get one run of length n.
This algorithm will require $\lceil \log_2 (n/M) \rceil$ passes, plus the initial run-constructing pass. For instance, if we have 10 million records of 128 bytes each, and four megabytes of internal memory, then the first pass will create 320 runs. We would then need nine more passes to complete the sort. Our example requires $\lceil \log_2 (13/3) \rceil = 3$ more passes, which are shown in the following figure.
Ta1   11 12 35 81 94 96 15
Ta2   17 28 41 58 75 99
Tb1
Tb2

Ta1
Ta2
Tb1   11 12 17 28 35 41 58 75 81 94 96 99
Tb2   15

Ta1   11 12 15 17 28 35 41 58 75 81 94 96 99
Ta2
Tb1
Tb2
Multiway Merge
If we have extra tapes, then we can expect to reduce the number of passes required to sort our input. We do this by extending the basic (two-way) merge to a k-way merge.
Merging two runs is done by winding each input tape to the beginning of each run. Then the smaller element is found, placed on an output tape, and the appropriate input tape is advanced. If there are k input tapes, this strategy works the same way, the only difference being that it is slightly more complicated to find the smallest of the k elements. We can find the smallest of these elements by using a priority queue. To obtain the next element to write on the output tape, we perform a DeleteMin operation. The appropriate input tape is advanced, and if the run on the input tape is not yet completed, we insert the new element into the priority queue. Using the same example as before, we distribute the input onto the three tapes.
Ta1
Ta2
Ta3
Tb1   11 81 94 41 58 75
Tb2   12 35 96 15
Tb3   17 28 99
We then need two more passes of three-way merging to complete the sort.
Ta1   11 12 17 28 35 81 94 96 99
Ta2   15 41 58 75
Ta3
Tb1
Tb2
Tb3

Ta1
Ta2
Ta3
Tb1   11 12 15 17 28 35 41 58 75 81 94 96 99
Tb2
Tb3
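One way to sketch the merging of k runs with a priority queue in C is shown below. The runs are held in arrays rather than on tapes, and the priority queue is a small binary min-heap of (value, source run) pairs; the names kway_merge, struct entry and MAX_K are assumptions made for this illustration.

```c
#define MAX_K 16                 /* illustrative limit on the number of input runs */

struct entry { int value; int source; };   /* source = index of the run the value came from */

/* Restore min-heap order below index j of pq[0..size-1]. */
static void pq_sift_down(struct entry *pq, int size, int j)
{
    struct entry e = pq[j];
    while (2 * j + 1 < size) {
        int child = 2 * j + 1;
        if (child + 1 < size && pq[child + 1].value < pq[child].value)
            child++;
        if (e.value <= pq[child].value)
            break;
        pq[j] = pq[child];
        j = child;
    }
    pq[j] = e;
}

/* Merge k sorted runs (k <= MAX_K) into out. Returns the number of elements written. */
int kway_merge(const int *runs[], const int lens[], int k, int *out)
{
    struct entry pq[MAX_K];
    int pos[MAX_K];              /* next unread position in each run */
    int size = 0, written = 0, i;

    for (i = 0; i < k; i++) {    /* load the first element of every non-empty run */
        pos[i] = 0;
        if (lens[i] > 0) {
            pq[size].value = runs[i][0];
            pq[size].source = i;
            size++;
            pos[i] = 1;
        }
    }
    for (i = size / 2 - 1; i >= 0; i--)
        pq_sift_down(pq, size, i);

    while (size > 0) {
        int src = pq[0].source;
        out[written++] = pq[0].value;             /* DeleteMin: smallest goes to the output */
        if (pos[src] < lens[src])
            pq[0].value = runs[src][pos[src]++];  /* advance the same input run */
        else
            pq[0] = pq[--size];                   /* that run is completed */
        if (size > 0)
            pq_sift_down(pq, size, 0);
    }
    return written;
}
```

Merging the first run of each of the three tapes, {11, 81, 94}, {12, 35, 96} and {17, 28, 99}, gives the nine-element run shown on Ta1.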
After the initial run construction phase, the number of passes required using k-way merging is $\lceil \log_k (n/M) \rceil$, because the runs get k times as large in each pass. For the example above, the formula is verified, since $\lceil \log_3 (13/3) \rceil = 2$. If we have 10 tapes then k = 5, and our large example from the previous section would require $\lceil \log_5 320 \rceil = 4$ passes.
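The pass counts quoted above can be checked with a short C sketch. The function names are hypothetical; the count is obtained by repeatedly dividing the number of runs by k and rounding up, which gives the same value as the ceiling of the logarithm.

```c
/* Number of runs produced when n records are sorted m at a time: ceil(n / m). */
long initial_runs(long n, long m)
{
    return (n + m - 1) / m;
}

/* Number of k-way merge passes needed to reduce `runs` sorted runs to one.
   Each pass merges groups of k runs, so this equals ceil(log_k(runs)). */
int merge_passes(long runs, int k)
{
    int passes = 0;
    while (runs > 1) {
        runs = (runs + k - 1) / k;   /* ceil(runs / k) runs remain after a pass */
        passes++;
    }
    return passes;
}
```

For the small example, merge_passes(initial_runs(13, 3), 2) and merge_passes(initial_runs(13, 3), 3) correspond to the two-way and three-way cases, and merge_passes(320, 2) and merge_passes(320, 5) to the large example.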
Polyphase Merge
The k-way merging strategy developed in the last section requires the use of 2k tapes. This could be prohibitive for some applications. It is possible to get by with only k + 1 tapes. As an example, we will show how to perform two-way merging using only three tapes.
Suppose we have three tapes, T1, T2, and T3, and an input file on T1 that will produce 34 runs. One option is to put 17 runs on each of T2 and T3. We could then merge this result onto T1, obtaining one tape with 17 runs. The problem is that since all the runs are on one tape, we must now put some of these runs on T2 to perform another merge. The logical way to do this is to copy the first eight runs from T1 onto T2 and then perform the merge. This has the effect of adding an extra half pass for every pass we do.
An alternative method is to split the original 34 runs unevenly. Suppose we put 21 runs on T2 and 13 runs on T3. We would then merge 13 runs onto T1 before T3 was empty. At this point, we could rewind T1 and T3 and merge T1, with 13 runs, and T2, which has 8 runs, onto T3. We could then merge T1 and T3, and so on. The following table shows the number of runs on each tape after each pass.
      Run      After    After    After    After    After    After    After
      Const.   T3+T2    T1+T2    T1+T3    T2+T3    T1+T2    T1+T3    T2+T3
T1    0        13       5        0        3        1        0        1
T2    21       8        0        5        2        0        1        0
T3    13       0        8        3        0        2        1        0
The original distribution of runs makes a great deal of difference. For instance, if 22 runs are placed on T2, with 12 on T3, then after the first merge we obtain 12 runs on T1 and 10 runs on T2. After another merge, there are 10 runs on T1 and 2 runs on T3. At this point the going gets slow, because we can only merge two sets of runs before T3 is exhausted. Then T1 has 8 runs and T2 has 2 runs. Again, we can only merge two sets of runs, obtaining T1 with 6 runs and T3 with 2 runs. After three more passes, T2 has two runs and the other tapes are empty. We must copy one run to another tape, and then we can finish the merge.
It turns out that the first distribution we gave is optimal. If the number of runs is a Fibonacci number $F_N$, then the best way to distribute them is to split them into the two Fibonacci numbers $F_{N-1}$ and $F_{N-2}$. Otherwise, it is necessary to pad the tape with dummy runs in order to get the number of runs up to a Fibonacci number. We leave the details of how to place the initial set of runs on the tapes as an exercise.
We can extend this to a k-way merge, in which case we need kth order Fibonacci numbers for the distribution, where the kth order Fibonacci number is defined as $F^{(k)}(N) = F^{(k)}(N-1) + F^{(k)}(N-2) + \cdots + F^{(k)}(N-k)$, with the appropriate initial conditions $F^{(k)}(N) = 0$ for $0 \le N \le k-2$, and $F^{(k)}(k-1) = 1$.
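For the three-tape case, the Fibonacci distribution and the resulting sequence of merges can be sketched in C. This is a rough model that only counts runs: fibonacci_split and polyphase_phases are hypothetical names, and the exact placement of dummy runs, which the text leaves as an exercise, is not addressed.

```c
/* Split `runs` initial runs over two tapes for a three-tape polyphase merge.
   The total is padded up to the next Fibonacci number F(N) and divided into
   F(N-1) and F(N-2). Returns the number of dummy runs needed for the padding. */
int fibonacci_split(int runs, int *larger, int *smaller)
{
    int prev = 1, cur = 1;          /* two consecutive Fibonacci numbers */
    while (cur < runs) {
        int next = prev + cur;
        prev = cur;
        cur = next;
    }
    *larger = prev;                 /* F(N-1) */
    *smaller = cur - prev;          /* F(N-2) */
    return cur - runs;
}

/* Count merge phases when two tapes start with a and b runs. Each phase
   merges runs pairwise until the shorter tape is empty; the merged runs
   land on the third tape. Stops when only one tape still holds runs. */
int polyphase_phases(int a, int b)
{
    int phases = 0;
    while (a > 0 && b > 0) {
        int merged = (a < b) ? a : b;
        int left = (a > b) ? a - b : b - a;
        a = (merged > left) ? merged : left;
        b = (merged > left) ? left : merged;
        phases++;
    }
    return phases;
}
```

Starting from 21 and 13 runs, polyphase_phases goes through the same run counts as the columns of the table above and stops with a single run.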
Replacement Selection
The last item we will consider is construction of the runs. The strategy we have used so far is the simplest possible. We read as many records as possible and sort them, writing the result to a tape. This seems like the best approach possible, until one realizes that as soon as the first record is written to an output tape, the memory it used becomes available for another record. If the next record on the input tape is larger than the record we have just output, then it can be included in the run.
Using this observation, we can give an algorithm for producing runs. This technique is commonly referred to as replacement selection. Initially, M records are read into memory and placed in a priority queue. We perform a DeleteMin, writing the smallest record to the output tape. We read the next record from the input tape. If it is larger than the record we have just written, we can add it to the priority queue. Otherwise, it cannot go into the current run. Since the priority queue is smaller by one element, we can store this new element in the dead space of the priority queue until the run is completed, and use the element for the next run. When the priority queue becomes empty, the run is over, and a new run is started by rebuilding the priority queue from the elements in the dead space. The table below shows the run construction for the small example we have been using, with M = 3. Dead elements are indicated by an asterisk.
In this example, replacement selection produces only three runs, compared with the five runs obtained by sorting. Because of this, a three-way merge finishes in one pass instead of two. If the input is randomly distributed, replacement selection can be shown to produce runs of average length 2M. For our large example, we would expect 160 runs instead of 320 runs, so a five-way merge would require four passes. In this case, we have not saved a pass, although we might if we get lucky and have 125 runs or fewer. Since external sorts take so long, every pass saved can make a significant difference in the running time.
As we have seen, it is possible for replacement selection to do no better than the standard algorithm. However, the input is frequently sorted or nearly sorted to start with, in which case replacement selection produces only a few very long runs. This kind of input is common for external sorts and makes replacement selection extremely valuable.
         3 Elements in Heap Array    Output        Next Element Read
         H[0]    H[1]    H[2]
Run 1    11      94      81          11            96
         81      94      96          81            12*
         94      96      12*         94            35*
         96      35*     12*         96            17*
         17*     35*     12*         End of Run.   Rebuild Heap
Run 2    12      35      17          12            99
         17      35      99          17            28
         28      99      35          28            58
         35      99      58          35            41
         41      99      58          41            75
         58      99      75          58            15*
         75      99      15*         75            End of Tape
         99              15*         99
                         15*         End of Run.   Rebuild Heap
Run 3    15                          15
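Replacement selection can be sketched in C as follows. To keep the example short, the priority queue is replaced by a linear scan for the smallest live record, which is adequate for a small M but is not how a real implementation would do it; the input tape is an array, and replacement_selection, MAX_M and run_len are names assumed for the illustration.

```c
#define MAX_M 64                  /* illustrative limit on the memory size m */

/* Replacement selection with room for m records (1 <= m <= MAX_M).
   Records are written to out in output order, the length of each run is
   stored in run_len, and the number of runs is returned. */
int replacement_selection(const int *input, int n, int m, int *out, int *run_len)
{
    int mem[MAX_M];
    int dead[MAX_M];              /* dead[i] != 0: mem[i] is held back for the next run */
    int used = 0, next = 0, written = 0, runs = 0, i;

    while (used < m && next < n) {            /* initial fill of memory */
        mem[used] = input[next++];
        dead[used] = 0;
        used++;
    }
    while (used > 0) {
        run_len[runs] = 0;
        for (;;) {
            int min = -1;
            for (i = 0; i < used; i++)        /* linear scan stands in for DeleteMin */
                if (!dead[i] && (min < 0 || mem[i] < mem[min]))
                    min = i;
            if (min < 0)
                break;                        /* only dead records remain: end of run */
            out[written++] = mem[min];
            run_len[runs]++;
            if (next < n) {
                /* a record smaller than the one just written must wait for the next run */
                dead[min] = input[next] < mem[min];
                mem[min] = input[next++];
            } else {                          /* end of tape: memory shrinks */
                mem[min] = mem[used - 1];
                dead[min] = dead[used - 1];
                used--;
            }
        }
        runs++;
        for (i = 0; i < used; i++)            /* dead records start the next run */
            dead[i] = 0;
    }
    return runs;
}
```

For the example input with m = 3 the sketch produces three runs, the first being 11, 81, 94, 96.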
Student Activity 2.7
1. Construct a heap from the following key values:
1, 2, 4, 5, 7, 8.
Describe the sort with the help of an example.
2. What is an internal sort?
If your answers are correct, then proceed to the next section.
Lower Bound Theory
Recall that there is a mathematical notation for expressing lower bounds. If f(n) is the time for some algorithm, then we write f(n) = Ω(g(n)) to mean that g(n) is a lower bound for f(n). Formally, this equation can be written if there exist positive constants c and n0 such that f(n) ≥ c g(n) for all n > n0. In addition to developing lower bounds to within a constant factor, we are also concerned with determining more exact bounds whenever this is possible.
Deriving good lower bounds is often more difficult than devising efficient algorithms. Perhaps this is because a lower bound states a fact about all possible algorithms for solving a problem. Usually we cannot enumerate and analyse all these algorithms, so lower bound proofs are often hard to obtain. However, for many problems it is easy to observe that a lower bound identical to n exists, where n is the number of inputs to the problem.
Sorting
Now let us consider the sorting problem. We can describe any sorting algorithm that satisfies the restrictions of the comparison tree model by a comparison tree. Consider the case in which n numbers A[1 : n] are to be sorted and these numbers are distinct. Now any comparison between A[i] and A[j] must result in one of two possibilities: either A[i] < A[j] or A[i] > A[j]. So if we form a comparison tree then it will be a binary tree in which each internal node is labeled by the pair i : j, which represents the comparison of A[i] with A[j]. If A[i] is less than A[j], then the algorithm proceeds down the left branch of the tree; otherwise it proceeds down the right branch.
The following figure shows a comparison tree for sorting three items.
                       1:2
                  <  /     \  >
                 2:3         2:3
              < /   \ >   < /   \ >
           1,2,3    1:3   1:3    3,2,1
                 < /  \ > < /  \ >
             1,3,2  3,1,2 2,1,3  2,3,1
Figure 2.7
We consider the worst case for all comparison-based sorting algorithms. Let T(n) be the minimum number of comparisons that are sufficient to sort n items in the worst case. A comparison tree that sorts n distinct items must have at least n! external nodes, one for each possible ordering of the input. We know that, if all internal nodes in a binary tree are at levels less than k, then there are at most $2^k$ external nodes.
Therefore, if we let k = T(n),
$n! \le 2^{T(n)}$
Since T(n) is an integer, we get the lower bound
$T(n) \ge \lceil \log n! \rceil$
By Stirling’s approximation, it follows that
$\log n! = n \log n - n/\ln 2 + (1/2) \log n + O(1)$
where ln 2 refers to the natural logarithm of 2. This formula shows that T(n) is of the order n log n. Hence we say that any comparison-based sorting algorithm needs Ω(n log n) time.
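The bound ⌈log n!⌉ can be evaluated exactly for small n with a few lines of C. This sketch uses only integer arithmetic and assumes n is at most 20, so that n! fits in 64 bits; the function name is hypothetical.

```c
/* Smallest T with 2^T >= n!, that is, ceil(log2(n!)): the comparison tree
   lower bound on the worst-case number of comparisons needed to sort n
   distinct keys. Valid for 1 <= n <= 20. */
int sort_lower_bound(int n)
{
    unsigned long long fact = 1, power = 1;
    int i, t = 0;
    for (i = 2; i <= n; i++)
        fact *= (unsigned long long)i;
    while (power < fact) {       /* double until the tree can hold n! external nodes */
        power *= 2;
        t++;
    }
    return t;
}
```

For three items the bound is 3 comparisons, which matches the depth of the comparison tree in Figure 2.7.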
Student Activity 2.8
Before going to the next section, answer the following questions:
1. Describe lower bound theory.
2. Make a comparison tree for sorting the key values
a, b, c, d.
If your answers are correct, then proceed to the next section.
Adversary Arguments
One of the proof techniques that is useful for obtaining lower bounds consists of making use of an oracle. The most famous oracle in history was called the Delphic oracle, located in Delphi, Greece. This oracle can still be found, situated in the side of a hill embedded in some rocks. In olden times people would approach the oracle and ask it a question. After some period of time elapsed, the oracle would reply and a caretaker would interpret the oracle’s answer.
A similar phenomenon takes place when we use an oracle to establish a lower bound. Given some model of computation such as comparison trees, the oracle tells us the outcome of each comparison. To derive a good lower bound, the oracle tries its best to cause the algorithm to work as hard as it can. It does this by choosing as the outcome of the next test the result that causes the most work to be required to determine the final answer. By keeping track of the work that is done, a worst-case lower bound for the problem can be derived.
Merging
Now we consider the merging problem. Given the sets A[1 : m] and B[1 : n], where the items in A and the items in B are sorted, we investigate lower bounds for algorithms that merge these two sets to give a single sorted set. As was the case for sorting, we assume that all the m + n elements are distinct and that A[1] < A[2] < … < A[m] and B[1] < B[2] < … < B[n]. It is possible that after these two sets are merged, the n elements of B can be interleaved within A in every possible way. Elementary combinatorics tells us that there are $\binom{m+n}{m}$ ways that the A’s and B’s can merge together while preserving the ordering within A and B. For example, if m = 3, n = 2, A[1] = x, A[2] = y, A[3] = z, B[1] = u, and B[2] = v, there are $\binom{3+2}{3} = 10$ ways in which A and B can merge: u, v, x, y, z; u, x, v, y, z; u, x, y, v, z; u, x, y, z, v; x, u, v, y, z; x, u, y, v, z; x, u, y, z, v; x, y, u, v, z; x, y, u, z, v; and x, y, z, u, v.
Thus if we use comparison trees as our model for merging algorithms, then there will be $\binom{m+n}{m}$ external nodes, and therefore at least $\left\lceil \log \binom{m+n}{n} \right\rceil$ comparisons are required by any comparison-based merging algorithm. The conventional merging algorithm that was given earlier takes m + n - 1 comparisons. If we let MERGE(m, n) be the minimum number of comparisons needed to merge m items with n items, then we have the inequality
$\left\lceil \log \binom{m+n}{n} \right\rceil \le \mathrm{MERGE}(m, n) \le m + n - 1$
The exercises show that these upper and lower bounds can get arbitrarily far apart as m gets much smaller
than n. This should not be a surprise because the conventional algorithm is designed to work best when m
and n are approximately equal. In the extreme case when m=1, we observe that binary insertion would
require the fewest number of comparisons needed to merge A[1] into B[1], …..,B[n].
When m and n are equal, the lower bound given by the comparison tree model is too low and the number of
comparisons for the conventional merging algorithm can be shown to be optimal.
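The two bounds on MERGE(m, n) can be compared numerically with a small C sketch. The function names are hypothetical, and the arguments are assumed small enough for the binomial coefficient to fit in 64 bits.

```c
/* Binomial coefficient C(n, r) for small arguments. */
unsigned long long binomial(int n, int r)
{
    unsigned long long c = 1;
    int i;
    if (r > n - r)
        r = n - r;
    for (i = 1; i <= r; i++)     /* c is C(n - r + i, i) after each step, always an integer */
        c = c * (unsigned long long)(n - r + i) / (unsigned long long)i;
    return c;
}

/* Comparison tree lower bound: ceil(log2(C(m + n, n))). */
int merge_lower_bound(int m, int n)
{
    unsigned long long leaves = binomial(m + n, n), power = 1;
    int t = 0;
    while (power < leaves) {
        power *= 2;
        t++;
    }
    return t;
}

/* Comparisons used by the conventional merging algorithm. */
int merge_upper_bound(int m, int n)
{
    return m + n - 1;
}
```

With m = 3 and n = 2 both bounds equal 4, while for m = 1 and n = 15 the lower bound is 4 and the conventional algorithm may need 15 comparisons, which illustrates how the gap widens as m becomes much smaller than n.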
Theorem
MERGE(m, m) = 2m - 1, for m ≥ 1.
Proof
Consider any algorithm that merges the two sets A[1] < … < A[m] and B[1] < … < B[m]. We already have an algorithm that requires 2m - 1 comparisons. If we can show that MERGE(m, m) ≥ 2m - 1, then the theorem follows. Consider any comparison-based algorithm for solving the merging problem and an instance for which the final result is B[1] < A[1] < B[2] < A[2] < … < B[m] < A[m], that is, for which the B’s and A’s alternate. Any merging algorithm must make each of the 2m - 1 comparisons B[1] : A[1], A[1] : B[2], B[2] : A[2], …, B[m] : A[m] while merging the given inputs. To see this, suppose that a comparison of type B[i] : A[i] is not made for some i. Then the algorithm cannot distinguish between the previous ordering and the one in which
B[1] < A[1] < … < A[i - 1] < A[i] < B[i] < B[i + 1] < … < B[m] < A[m]
So the algorithm will not necessarily merge the A’s and B’s properly. If a comparison of type A[i] : B[i + 1] is not made, then the algorithm will not be able to distinguish between the case in which B[1] < A[1] < B[2] < … < B[m] < A[m] and the one in which B[1] < A[1] < B[2] < A[2] < … < A[i - 1] < B[i] < B[i + 1] < A[i] < A[i + 1] < … < B[m] < A[m]. So any algorithm must make all 2m - 1 comparisons to produce this final result. The theorem follows.
Largest and Second Largest
For another example that we can solve using oracles, consider the problem of finding the largest and the second largest elements out of a set of n. What is a lower bound on the number of comparisons required by any algorithm that finds these two quantities? Comparison trees have already provided us with one answer. An algorithm that makes n - 1 comparisons to find the largest and then n - 2 to find the second largest gives an immediate upper bound of 2n - 3. So a large gap still remains.
This problem was originally stated in terms of a tennis tournament in which the values are called players, the largest value is interpreted as the winner, and the second largest as the runner-up. Figure 2.8 shows a sample tournament among eight players. The winner of each match (which is the larger of the two values being compared) is promoted up the tree until the final round, which in this case determines McMahon as the winner. Now, who are the candidates for second place? The runner-up must be someone who lost to McMahon but who did not lose to anyone else. In Figure 2.8 that means that Guttag, Rosen, and Francez are the possible candidates for second place.
Figure 2.8 leads us to another algorithm for determining the runner-up once the winner of a tournament has been found. The players who have lost to the winner play a second tournament to determine the runner-up. This second tournament need only be replayed along the path that the winner, in this case McMahon, followed as he rose through the tree. For a tournament with n players, there are ⌈log n⌉ levels, and hence only ⌈log n⌉ - 1 comparisons are required for this second tournament. This new algorithm, which was first suggested by J. Schreier in 1932, requires a total of n - 2 + ⌈log n⌉ comparisons. Therefore we have an identical agreement between the known upper and lower bounds for this problem.
[Figure 2.8. A tennis tournament]
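The tournament method can be sketched in C as below. The sketch plays a knockout tournament, records for each player the opponents it defeated directly, and then finds the runner-up among the opponents defeated by the winner. The names and the fixed limits MAX_N and MAX_ROUNDS are assumptions of this illustration, and the values are assumed distinct with 2 <= n <= MAX_N.

```c
#define MAX_N 64
#define MAX_ROUNDS 7             /* enough rounds for MAX_N players */

static int comparisons;          /* counts every comparison of two values */

static int greater(int x, int y)
{
    comparisons++;
    return x > y;
}

/* Finds the largest and second largest of v[0..n-1] by a knockout tournament.
   Returns the number of comparisons used. */
int largest_two(const int *v, int n, int *largest, int *second)
{
    int alive[MAX_N];                    /* indices of players still undefeated */
    int beaten[MAX_N][MAX_ROUNDS];       /* beaten[p]: players that lost directly to p */
    int nbeaten[MAX_N] = {0};
    int count = n, i, winner, best;

    comparisons = 0;
    for (i = 0; i < n; i++)
        alive[i] = i;
    while (count > 1) {                  /* play one round */
        int next = 0;
        for (i = 0; i + 1 < count; i += 2) {
            int a = alive[i], b = alive[i + 1];
            int w = greater(v[a], v[b]) ? a : b;
            int l = (w == a) ? b : a;
            beaten[w][nbeaten[w]++] = l;
            alive[next++] = w;
        }
        if (count % 2 == 1)
            alive[next++] = alive[count - 1];   /* odd player out gets a bye */
        count = next;
    }
    winner = alive[0];
    best = beaten[winner][0];            /* second tournament among the winner's victims */
    for (i = 1; i < nbeaten[winner]; i++)
        if (greater(v[beaten[winner][i]], v[best]))
            best = beaten[winner][i];
    *largest = v[winner];
    *second = v[best];
    return comparisons;
}
```

For eight players the first tournament uses 7 comparisons and the second uses 2, giving the total n - 2 + ⌈log n⌉ = 9.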
Now we show how the same lower bound can be derived using an oracle.
Theorem
Any comparison-based algorithm that computes the largest and second largest of a set of n unordered elements requires n - 2 + ⌈log n⌉ comparisons.
Proof
Assume that a tournament has been played and the largest element and the second-largest element obtained by some method. Since we cannot determine the second-largest element without having determined the largest element, we see that at least n - 1 comparisons are necessary. Therefore all we need to show is that there is always some sequence of comparisons that forces the second largest to be found in ⌈log n⌉ - 1 additional comparisons.
Suppose that the winner of the tournament has played x matches. Then there are x people who are candidates for the runner-up position. The runner-up has lost only once, to the winner, and the other x - 1 candidates must have lost to one other person. Therefore we produce an oracle that decides the results of matches in such a way that the winner plays ⌈log n⌉ other people.
In a match between a and b the oracle declares a the winner if a is previously undefeated and b has lost at least once, or if both a and b are undefeated but a has won more matches than b. In any other case the oracle can decide arbitrarily as long as it remains consistent.
Now, consider a tournament in which the outcome of each match is determined by the above oracle. Imagine drawing a directed graph with n vertices corresponding to this tournament. Each vertex corresponds to one of the n players. Draw a directed edge from vertex b to a, b ≠ a, if and only if either player a has defeated b or a has defeated another player who has defeated b. It is easy to see by induction that any player who has played and won only x matches can have at most $2^x - 1$ edges pointing into her or his corresponding node. Since for the overall winner there must be an edge from each of the remaining n - 1 vertices, it follows that the winner must have played at least ⌈log n⌉ matches.
State Space Method
Another technique for establishing lower bounds that is related to oracles is the state space description method. Often it is possible to describe any algorithm for solving a given problem by a set of n-tuples. A state space description is a set of rules that show the possible states (n-tuples) that an algorithm can assume from a given state and a single comparison. Once the state transitions are given, it is possible to derive lower bounds by arguing that the finish state cannot be reached using any fewer transitions. As an example of the state space description method, we consider a problem originally defined and solved earlier: given n distinct items, find the maximum and the minimum. Recall that the divide-and-conquer-based solution required ⌈3n/2⌉ - 2 comparisons. We would like to show that this algorithm is indeed optimal.
Theorem
Any algorithm that computes the largest and smallest elements of a set of n unordered elements requires ⌈3n/2⌉ - 2 comparisons.
Proof
The technique we use to establish a lower bound is to define an oracle by a state table. We consider the state of a comparison-based algorithm as being described by a 4-tuple (a, b, c, d), where a is the number of items that have never been compared, b is the number of items that have won but never lost, c is the number of items that have lost but never won, and d is the number of items that have both won and lost. Originally the algorithm is in state (n, 0, 0, 0) and concludes with (0, 1, 1, n - 2). Then, after each comparison the tuple (a, b, c, d) can make progress only if it assumes one of the five possible states shown in Figure 2.9.
To get the state (0, 1, 1, n - 2) from the state (n, 0, 0, 0), ⌈3n/2⌉ - 2 comparisons are needed. To see this, observe that the quickest way to get the a component to zero requires n/2 state changes, yielding the tuple (0, n/2, n/2, 0). Next the b and c components are reduced; this requires an additional n - 2 state changes.
(a - 2, b + 1, c + 1, d) if a ≥ 2 // Two items from a are compared.
(a - 1, b + 1, c, d) or (a - 1, b, c + 1, d) or (a - 1, b, c, d + 1) if a ≥ 1 // An item from a is compared with one from b or c.
(a, b - 1, c, d + 1) if b ≥ 2 // Two items from b are compared.
(a, b, c - 1, d + 1) if c ≥ 2 // Two items from c are compared.
Figure 2.9. States for max-min problem
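The count ⌈3n/2⌉ - 2 can be observed with a C sketch that follows the state changes described in the proof: items are first compared in pairs, then each pair's winner is compared with the current maximum and its loser with the current minimum. This pairwise formulation is a stand-in chosen for brevity, not the divide-and-conquer algorithm the text refers to, and the function name is hypothetical.

```c
/* Finds the maximum and minimum of v[0..n-1] (n >= 1), handling the items
   in pairs. Returns the number of comparisons between items. */
int max_min(const int *v, int n, int *max, int *min)
{
    int comparisons = 0, i;

    if (n % 2 == 1) {                    /* odd n: the first item starts as both */
        *max = *min = v[0];
        i = 1;
    } else {                             /* even n: one comparison settles the first pair */
        comparisons++;
        if (v[0] > v[1]) { *max = v[0]; *min = v[1]; }
        else             { *max = v[1]; *min = v[0]; }
        i = 2;
    }
    for (; i + 1 < n; i += 2) {          /* three comparisons per remaining pair */
        int big = v[i], small = v[i + 1];
        comparisons++;
        if (small > big) { int t = big; big = small; small = t; }
        comparisons++;
        if (big > *max) *max = big;
        comparisons++;
        if (small < *min) *min = small;
    }
    return comparisons;
}
```

For n = 8 the sketch uses 10 comparisons and for n = 7 it uses 9, in both cases ⌈3n/2⌉ - 2.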
Selection
We end this section by deriving another lower bound on the selection problem. One of the algorithms presented earlier has a worst-case complexity of O(n) no matter what value is being selected. Therefore we know that asymptotically any selection algorithm requires Θ(n) time. Let $\mathrm{SEL}_k(n)$ be the minimum number of comparisons needed for finding the kth element of an unordered set of size n. We have already seen that for k = 1, $\mathrm{SEL}_1(n) = n - 1$ and, for k = 2, $\mathrm{SEL}_2(n) = n - 2 + \lceil \log n \rceil$. In the following paragraphs we present a state table that shows that
$n - k + (k - 1)\left\lceil \log \frac{n}{2(k-1)} \right\rceil \le \mathrm{SEL}_k(n)$
We continue to use the terminology that refers to an element of the set as a player and to a comparison between two players as a match that must be won by one of the players. A procedure for selecting the kth-largest element is referred to as a tournament that finds the kth-best player.
To derive this lower bound on the selection problem, an oracle is constructed in the form of a state transition table that will cause any comparison-based algorithm to make at least $n - k + (k - 1)\left\lceil \log \frac{n}{2(k-1)} \right\rceil$ comparisons. The tuple size for states in this case is two (it was four for the max-min problem), and the components of a tuple, say (Map, Set), are Map, a mapping from the integers 1, 2, …, n onto itself, and Set, an ordered subset of the input. The initial state is the identity mapping (that is, Map(i) = 1, 1 ≤ i ≤ n) and the empty set. Intuitively, at any given time, the players in Set are the top players (from among all). In particular, the ith player that enters Set is the ith-best player. Candidates for entering Set are chosen according to their Map values. At any time period t the oracle is assumed to be given two unordered elements from the input, say a and b, and the oracle acts as follows:
1. If a and b are both in Set at time t, then a wins if a > b. The tuple (Map, Set) remains unchanged.
2. If a is in Set and b is not in Set, then a wins and the tuple (Map, Set) remains unchanged.
3. If a and b are both not in Set and if Map(a) > Map(b) at time t, then a wins. If Map(a) = Map(b), then it doesn’t matter who wins as long as no inconsistency with any previous decision is made. In either case, if Map(a) + Map(b) ≥ n/(k - 1) at time t, then Map is unchanged and the winner is inserted into Set as a new member. If Map(a) + Map(b) < n/(k - 1), Set stays the same and we set Map(the loser) := 0 at time t + 1 and Map(the winner) := Map(a) + Map(b) at time t + 1 and, for all items w, w ≠ a, w ≠ b, Map(w) stays the same.
Lemma
Using the oracle just defined, the k - 1 best players will have played at least $(k - 1)\left\lceil \log \frac{n}{2(k-1)} \right\rceil$ matches when the tournament is completed.
Proof
At time t the number of matches won by any player x is greater than or equal to $\lceil \log \mathrm{Map}(x) \rceil$. The elements in Set are ordered so that $x_1 < \cdots < x_j$. Now for all w in the input, $\sum_w \mathrm{Map}(w) = n$. Let W = {y : y is not in Set but Map(y) > 0}. Since for all w in the input Map(w) < n/(k - 1), it follows that the size of Set plus the size of W is greater than k - 1. However, since the elements y in W can only be less than some $x_i$ in Set, if the size of Set is less than k - 1 at the end of the tournament, then any player in Set or W is a candidate to be one of the k - 1 best players. This is a contradiction, so it follows that at the end of the tournament |Set| ≥ k - 1.
We are now in a position to establish the main theorem.
Theorem
[Hyafil] The function $\mathrm{SEL}_k(n)$ satisfies
$\mathrm{SEL}_k(n) \ge n - k + (k - 1)\left\lceil \log \frac{n}{2(k-1)} \right\rceil$
Proof
According to the lemma, the k - 1 best players have played at least $(k - 1)\left\lceil \log \frac{n}{2(k-1)} \right\rceil$ matches. Any player who is not among the k best players has lost at least one match against a player who is not among the k - 1 best. Thus there are n - k additional matches that were not included in the count of the matches played by the k - 1 top players.
Student Activity 2.9
Before going to next section, answer the following questions:
1. Let m = αn. Then by Stirling's approximation
$\log\binom{n+\alpha n}{\alpha n} = n[(1+\alpha)\log(1+\alpha) - \alpha\log\alpha] - \frac{1}{2}\log n + O(1).$
Show that as α → 0, the difference between this formula and m + n − 1 gets arbitrarily large.
2. Let F(n) be the minimum number of comparisons, in the worst case, needed to insert B[1] into the
ordered set A[1] < A[2] < … < A[n]. Prove by induction that F(n) ≥ ⌈log(n + 1)⌉.
3. A search program is a finite sequence of instructions of three types: (1) if (f(x) r 0) goto L1; else goto
L2; where r is either <, >, or = and x is a vector; (2) accept; and (3) reject. The sum of subsets
problem asks for a subset I of the integers 1, 2, …, n for the inputs $w_1, \dots, w_n$ such that
$\sum_{i \in I} w_i = b$, where b is a given number. Consider search programs for which the function f is
restricted so that it can only make comparisons of the form
$\sum_{i \in I} w_i = b$
Using the adversary technique, D. Dobkin and R. Lipton have shown that $\Omega(2^n)$ such operations are
required to solve the sum of subsets problem $(w_1, \dots, w_n, b)$. See if you can derive their proof.
If your answers are correct, then proceed to next section.
Minimum Spanning Tree
Let G = (V, E) be an undirected connected graph. A subgraph t = (V, E′) of G is a spanning tree of G if t is a
tree.
Example: Figure 2.10 shows the complete graph on four nodes together with three of its spanning trees.
Figure 2.10
Spanning trees have many applications. For example, they can be used to obtain an independent set of
circuit equations for an electric network. In practical situations, the edges have weights assigned to them.
These weights may represent the cost of construction, the length of the link, and so on. Given such a
weighted graph, one would then wish to select a set of links that connects all the vertices with minimum
total cost or minimum total length. In either case the links selected have to form a tree. We are therefore
interested in finding a spanning tree of G with minimum cost. Fig. 2.11 shows a graph and one of its
minimum spanning trees. Since the identification of a minimum spanning tree involves the selection of
a subset of the edges, this problem fits the subset paradigm.
Figure 2.11(a). Graph. Figure 2.11(b). Minimum spanning tree
We present two algorithms for finding a minimum spanning tree of a weighted graph: Prim’s algorithm and
Kruskal’s algorithm.
Prim's Algorithm
A greedy method to obtain a minimum spanning tree is to build the tree edge by edge. The next edge to
include is chosen according to some optimization criterion. The simplest such criterion is to choose an edge
that results in a minimum increase in the sum of the costs of the edges so far included. There are two
possible ways to interpret this criterion. In the first, the set of edges so far selected forms a tree. Thus if A is
the set of edges selected so far, then A forms a tree. The next edge (u,v) to be included in A is a minimum
cost edge not in A with the property that A ∪ {(u,v)} is also a tree. The following example shows that this
selection criterion results in a minimum spanning tree. The corresponding algorithm is known as Prim's
algorithm.
Example: Figure 2.12 shows the working of Prim's method on the graph of Figure 2.11(a). The spanning
tree obtained is shown in Figure 2.11(b) and has a cost of 99.
Figure 2.12(a)-(b) and Figure 2.13(c)-(f): successive stages of Prim's method on the graph of Figure 2.11(a)
Having seen how Prim's method works, let us obtain an algorithm to find a minimum cost spanning tree
using this method. The algorithm will start with a tree that includes only a minimum cost edge of G. Then,
edges are added to this tree one by one. The next edge (i, j) to be added is such that i is a vertex already
included in the tree, j is a vertex not yet included, and the cost of (i, j), cost[i][j], is minimum among all
edges (k, l) such that vertex k is in the tree and vertex l is not in the tree. To determine this edge (i, j)
efficiently, we associate with each vertex j not yet included in the tree a value near[j]. The value near[j]
is a vertex in the tree such that cost[j][near[j]] is minimum among all choices for near[j]. We define
near[j] = 0 for all vertices j that are already in the tree. The next edge to include is defined by the vertex j
such that near[j] ≠ 0 (j not already in the tree) and cost[j][near[j]] is minimum.
Prim (E, cost, n, t)
// E is the set of edges in G. cost[n][n] is
// the cost adjacency matrix of an n-vertex
// graph such that cost[i][j] is
// either a positive real number or ∞ if
// no edge (i, j) exists.
// A minimum spanning tree is computed
// and stored as a set of edges in
// the array t[n–1][2]. The final cost is
// returned.
{
    Let (k, l) be an edge of minimum cost in E;
    mincost = cost[k][l];
    t[1][1] = k; t[1][2] = l;
    for (i = 1; i <= n; i++) // initialize near
        if (cost[i][l] < cost[i][k])
            near[i] = l;
        else
            near[i] = k;
    near[k] = near[l] = 0;
    for (i = 2; i <= n–1; i++)
    { // Find n–2 additional edges for t.
        Let j be an index such that near[j] != 0 and
        cost[j][near[j]] is minimum;
        t[i][1] = j; t[i][2] = near[j];
        mincost = mincost + cost[j][near[j]];
        near[j] = 0;
        for (k = 1; k <= n; k++) // update near
            if (near[k] != 0 && cost[k][near[k]] > cost[k][j])
                near[k] = j;
    }
    return (mincost);
}
The time required by algorithm Prim is O(n²), where n is the number of vertices in the
graph G.
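As a rough illustration, the near[]-based procedure above can be sketched in Python. The names prim and build_cost are made up for this sketch, and the edge list is an assumption: the weights are the ones that can still be read from Figure 2.11(a), attached to edges in the way that reproduces the cost of 99 quoted in the example. Vertices are numbered 1..n as in the pseudocode, and the graph is assumed to be connected.

```python
import math

def build_cost(n, edges):
    """Build a 1-indexed cost adjacency matrix; math.inf marks a missing edge."""
    cost = [[math.inf] * (n + 1) for _ in range(n + 1)]
    for u, v, c in edges:
        cost[u][v] = cost[v][u] = c
    return cost

def prim(cost):
    """Return (mincost, tree_edges) for a connected graph given as a cost matrix."""
    n = len(cost) - 1
    # Start from a minimum-cost edge (k, l).
    k, l = min(((i, j) for i in range(1, n + 1) for j in range(i + 1, n + 1)),
               key=lambda e: cost[e[0]][e[1]])
    mincost = cost[k][l]
    t = [(k, l)]
    # near[i] is the tree vertex closest to i, or 0 once i is in the tree.
    near = [0] * (n + 1)
    for i in range(1, n + 1):
        near[i] = l if cost[i][l] < cost[i][k] else k
    near[k] = near[l] = 0
    for _ in range(n - 2):  # n-2 further edges complete the tree
        j = min((v for v in range(1, n + 1) if near[v] != 0),
                key=lambda v: cost[v][near[v]])
        t.append((j, near[j]))
        mincost += cost[j][near[j]]
        near[j] = 0
        for v in range(1, n + 1):  # update near for vertices still outside
            if near[v] != 0 and cost[v][near[v]] > cost[v][j]:
                near[v] = j
    return mincost, t

# Assumed edge list (u, v, cost) for the seven-vertex example graph.
EDGES = [(1, 2, 28), (1, 6, 10), (2, 3, 16), (2, 7, 14), (3, 4, 12),
         (4, 5, 22), (4, 7, 18), (5, 6, 25), (5, 7, 24)]
mincost, tree = prim(build_cost(7, EDGES))
print(mincost, tree)
```

Each of the n − 2 passes scans all vertices twice (once to pick j, once to update near), which is where the O(n²) bound comes from.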
Kruskal's Algorithm
There is a second possible interpretation of the optimization criterion mentioned earlier, in which the edges
of the graph are considered in nondecreasing order of cost. This interpretation is that the set t of edges so
far selected for the spanning tree be such that it is possible to complete t into a tree. Thus t may not be a
tree at all stages in the algorithm. In fact, it will generally only be a forest, since the set of edges t can be
completed into a tree if and only if there are no cycles in t. This method is due to Kruskal.
Example: Consider the graph of Figure 2.11(a). We begin with no edges selected. Figure 2.14(a) shows the
current graph with no edges selected. Edge (1,6) is the first edge considered. It is included in the spanning
tree being built. This yields the graph of Figure 2.14(b). Next, the edge (3,4) is selected and included in the
tree (Figure 2.14(c)). The next edge to be considered is (2,7). Its inclusion in the tree being built does not
create a cycle, so we get the graph of Figure 2.14(d). Edge (2,3) is considered next and included in the tree
(Figure 2.14(e)). Of the edges not yet considered, (7,4) has the least cost. It is considered next. Its inclusion
in the tree results in a cycle, so this edge is discarded. Edge (5,4) is the next edge to be added to the tree
being built. This results in the configuration of Figure 2.14(f). The next edge to be considered is the edge
(7,5). It is discarded, as its inclusion creates a cycle. Finally, edge (6,5) is considered and included in the
tree being built. This completes the spanning tree. The resulting tree (Figure 2.11(b)) has cost 99.
Figure 2.14(a)-(f): stages in Kruskal's method
For clarity, Kruskal's method is written out more formally in the following algorithm.
1. t = ∅;
2. while ((t has fewer than n–1 edges) && (E != ∅))
3. {
4.     Choose an edge (v, w) from E of lowest cost;
5.     Delete (v, w) from E;
6.     if ((v, w) does not create a cycle in t)
           add (v, w) to t;
7.     else
           discard (v, w);
8. }
Initially E is the set of all edges in G. The only functions we wish to perform on this set are
(1) determine an edge with minimum cost (line 4) and (2) delete this edge (line 5). Both these functions can
be performed efficiently if the edges in E are maintained as a sorted sequential list. It is not essential to sort
all the edges so long as the next edge for line 4 can be determined easily. If the edges are maintained as a
min heap, then the next edge to consider can be obtained in O(log |E|) time. The construction of the heap
itself takes O(|E|) time. To be able to perform step 6 efficiently, the vertices in G should be grouped
together in such a way that one can easily determine whether the vertices v and w are already connected by
the earlier selection of edges. If they are, then the edge (v,w) is to be discarded. If they are not, then (v,w) is
to be added to t. One possible grouping is to place all vertices in the same connected component of t into a
set. For example, when the edge (2,6) is to be considered, the sets are {1,2}, {3,4,6}, and {5}. Vertices 2
and 6 are in different sets, so these sets are combined to give {1,2,3,4,6} and {5}. The next edge to be
considered is (1,4). Since vertices 1 and 4 are in the same set, the edge is rejected. The edge (3,5) connects
vertices in different sets and results in the final spanning tree.
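The grouping of vertices into sets described above is commonly implemented with a union-find (disjoint-set) structure. The following Python sketch is one possible way to do it, not the text's own code: the names kruskal and find are invented here, a sorted list stands in for the min heap, and the edge list is assumed from Figure 2.11(a) and the order of edges in the worked example.

```python
def kruskal(n, edges):
    """Return (total_cost, tree_edges); edges are (cost, v, w) on vertices 1..n."""
    parent = list(range(n + 1))  # parent[x] == x means x is the root of its set

    def find(x):
        # Follow parent links to the set representative, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total, t = 0, []
    for c, v, w in sorted(edges):      # nondecreasing order of cost
        if len(t) == n - 1:            # the tree is already complete
            break
        rv, rw = find(v), find(w)
        if rv != rw:                   # different sets: no cycle is created
            parent[rv] = rw            # merge the two sets
            t.append((v, w))
            total += c
        # otherwise v and w are already connected and the edge is discarded
    return total, t

# Assumed edge list (cost, v, w) for the seven-vertex example graph.
EDGES = [(28, 1, 2), (10, 1, 6), (16, 2, 3), (14, 2, 7), (12, 3, 4),
         (22, 4, 5), (18, 4, 7), (25, 5, 6), (24, 5, 7)]
total, tree = kruskal(7, EDGES)
print(total, tree)
```

With this input the edges are accepted in the same order as in the worked example, and the edges of cost 18 and 24 are the two that get discarded.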
Student Activity 2.10
Before going to next section, answer the following questions.
1. Why is Prim's algorithm called a greedy method?
2. Compare and contrast Prim’s method with Kruskal’s method.
3. Draw a spanning tree of edges {2, 6, 8, 18, 35} using Kruskal’s method.
If your answers are correct, then proceed to next section.
Shortest Paths
Graphs can be used to represent the highway structure of a state or country, with vertices representing cities
and edges representing sections of highway. The edges can then be assigned weights, which may be, for
example, the distance along that section of highway. A motorist wishing to drive from city A to city B
would be interested in answers to the following questions:
• Is there a path from A to B?
• If there is more than one path from A to B, which is the shortest path?
The problems defined by these questions are special cases of the path problems we study in this section.
The length of a path is now defined to be the sum of the weights of the edges on that path. The starting
vertex of the path is referred to as the source, and the last vertex the destination. The graphs are digraphs to
allow for one-way streets. In the problem we consider, we are given a directed graph G = (V,E), a weighting
function cost for the edges of G, and a source vertex v0. The problem is to determine the shortest paths from
v0 to all the remaining vertices of G. It is assumed that all the weights are positive.
Dijkstra's Algorithm
This algorithm determines the lengths of the shortest paths from v0 to all other vertices in g.
Dijkstra (v, cost, dist, n)
// dist[j], 1 <= j <= n, is set to the length
// of the shortest path from vertex v to
// vertex j in a digraph G with n
// vertices. dist[v] is set to zero. G is
// represented by its cost adjacency matrix
// cost[n][n].
{
    for (i = 1; i <= n; i++)
    { // initialize
        S[i] = false; dist[i] = cost[v][i];
    }
    S[v] = true; dist[v] = 0.0; // put v in S
    for (num = 2; num < n; num++)
    {
        // Determine n–1 paths from v.
        Choose u from among those vertices not in S such that dist[u] is minimum;
        S[u] = true; // put u in S
        for (each w adjacent to u with S[w] = false)
            // update distances
            if (dist[w] > dist[u] + cost[u][w])
                dist[w] = dist[u] + cost[u][w];
    }
}
Example: Consider the eight-vertex digraph of Figure 2.15(a) with cost adjacency matrix as in Figure
2.15(b). The values of dist and the vertices selected at each iteration of the main for loop of the previous
algorithm, for finding all the shortest paths from Boston, are shown in Figure 2.16. To begin with, S
contains only Boston. In the first iteration of the for loop (that is, num = 2), the city u that is not in S and
whose dist[u] is minimum is identified to be New York. In the next iteration of the for loop, the city that
enters S is Miami, since it has the smallest dist[] value from among all the nodes not in S. None of the
dist[] values are altered. The algorithm terminates when only seven of the eight vertices are in S. By the
definition of dist, the distance of the last vertex, in this case Los Angeles, is correct, as the shortest path
from Boston to Los Angeles can go through only the remaining six vertices.
Figure 2.15(a). Digraph on eight cities: Los Angeles (1), San Francisco (2), Denver (3), Chicago (4),
Boston (5), New York (6), Miami (7), New Orleans (8)
       1     2     3     4     5     6     7     8
1      0
2    300     0
3   1000   800     0
4               1200     0
5                     1500     0   250
6                     1000           0   900  1400
7                                          0  1000
8   1700                                         0
Figure 2.15(b)
Iteration  S                  Vertex    Distance
                              selected      LA    SF   DEN   CHI  BOST    NY   MIA    NO
                                           [1]   [2]   [3]   [4]   [5]   [6]   [7]   [8]
Initial    --                 --            +∞    +∞    +∞  1500     0   250    +∞    +∞
1          {5}                6             +∞    +∞    +∞  1250     0   250  1150  1650
2          {5,6}              7             +∞    +∞    +∞  1250     0   250  1150  1650
3          {5,6,7}            4             +∞    +∞  2450  1250     0   250  1150  1650
4          {5,6,7,4}          8           3350    +∞  2450  1250     0   250  1150  1650
5          {5,6,7,4,8}        3           3350  3250  2450  1250     0   250  1150  1650
6          {5,6,7,4,8,3}      2           3350  3250  2450  1250     0   250  1150  1650
           {5,6,7,4,8,3,2}
Figure 2.16
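The trace above can be reproduced with a short Python sketch of the same procedure. The names shortest_paths, build_cost and CITY_EDGES are assumptions made for this illustration. The edge list is read from the cost adjacency matrix of Figure 2.15(b), with the Denver to Los Angeles entry taken as 1000, the value that is consistent with the figure and with the trace table.

```python
import math

def build_cost(n, edges):
    """1-indexed cost adjacency matrix of a digraph; math.inf marks a missing edge."""
    cost = [[math.inf] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][i] = 0
    for u, w, c in edges:
        cost[u][w] = c
    return cost

def shortest_paths(v, cost):
    """Return [dist[1], ..., dist[n]], the shortest path lengths from vertex v."""
    n = len(cost) - 1
    S = [False] * (n + 1)
    dist = [math.inf] + [cost[v][i] for i in range(1, n + 1)]
    S[v] = True
    dist[v] = 0
    for _ in range(2, n):  # num = 2 .. n-1, as in the pseudocode
        # Choose the closest vertex that is not yet in S.
        u = min((w for w in range(1, n + 1) if not S[w]), key=lambda w: dist[w])
        S[u] = True
        for w in range(1, n + 1):
            # Relax edge (u, w): keep the shorter of the old and the new path.
            if not S[w] and dist[u] + cost[u][w] < dist[w]:
                dist[w] = dist[u] + cost[u][w]
    return dist[1:]

# Assumed directed edges (from, to, cost): 1=LA 2=SF 3=DEN 4=CHI 5=BOST 6=NY 7=MIA 8=NO.
CITY_EDGES = [(2, 1, 300), (3, 1, 1000), (3, 2, 800), (4, 3, 1200),
              (5, 4, 1500), (5, 6, 250), (6, 4, 1000), (6, 7, 900),
              (6, 8, 1400), (7, 8, 1000), (8, 1, 1700)]
from_boston = shortest_paths(5, build_cost(8, CITY_EDGES))
print(from_boston)
```

The comparison inside the inner loop matters: a distance is overwritten only when the path through u is strictly shorter, which is why the Miami step leaves every dist value unchanged.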
Graph Component Algorithm
The first question one is most likely to ask when encountering a new graph G is: Is G connected? If G is
not connected, what are the components of G? Therefore, our first algorithm will be one that determines the
connectedness and components of a given graph.
In addition to being an important question in its own right, the question of connectedness and components
arises in many other algorithms. For example, before testing a graph G for separability, planarity, or
isomorphism to another graph, it may be better for the sake of efficiency to determine the components of G
and then subject each component to the desired scrutiny. The connectedness algorithm is very basic and may
serve as a subroutine in more involved graph-theoretic algorithms. (The reader may be reminded here that
although in drawing a graph one might see whether a graph is connected or not, the connectedness is by no
means obvious to a computer, or to a human, if the graph is presented in other forms.)
Given the adjacency matrix X of a graph, it is possible to determine whether or not the graph is connected
by trying various permutations of rows with the corresponding columns of X, and then checking if it is in a
block-diagonal form. This, however, is an inefficient method, because it may involve n! permutations. A
more efficient method could be to check for zeros in the matrix
$Y = X + X^2 + \dots + X^{n-1}$.
This too is not very efficient, as it involves a large number of matrix multiplications. The following is an
efficient algorithm:
Description of the Algorithm: The basic step in the algorithm is the fusion of adjacent vertices. We start
with some vertex in the graph and fuse all vertices that are adjacent to it. Then we take the fused vertex and
again fuse with it all those vertices that are adjacent to it now. This process of fusion is repeated until no
more vertices can be fused. This indicates that a connected component has been "fused" to a single vertex.
If this exhausts every vertex in the graph, the graph is connected. Otherwise, we start with a new vertex (in a
different component) and continue the fusing operation.
In the adjacency matrix the fusion of the jth vertex to the ith vertex is accomplished by ORing, that is,
logically adding the jth row to the ith row as well as the jth column to the ith column. (Remember that in
logical addition 1 + 0 = 0 + 1 = 1 + 1 = 1 and 0 + 0 = 0.) Then the jth row and the jth column are discarded
from the matrix. (If it is difficult or time consuming to discard the specified rows and columns, one may
leave these rows and columns in the matrix, taking care that they are not considered again in any fusion.)
Note that a self-loop resulting from a fusion appears as a 1 in the main diagonal, but parallel edges are
automatically replaced by a single edge because of the logical addition (or ORing) operation. These, of
course, have no effect on the connectedness of a graph.
The maximum number of fusions that may have to be performed in this algorithm is n – 1, n being the
number of vertices. And since in each fusion one performs at most n logical additions, the upper bound on
the execution time is proportional to n(n – 1).
Figure 2.17. Algorithm 1: Components of G
A proper choice of the initial vertex (to which adjacent vertices are fused) in each component would
improve the efficiency, provided one did not pay too much of a price for selecting the vertex itself.
A flow chart of the “Connectedness and Components Algorithm” is shown in Fig. 2.17.
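Since the flow chart itself is a figure, a small Python sketch of the fusion procedure may help. It is only one possible reading of the description above: the function name components and the alive flags are assumptions of this sketch, vertices are numbered from 0, the adjacency matrix is assumed to be symmetric, and discarded rows and columns are left in the matrix and merely marked, as the text allows.

```python
def components(adj):
    """Return the connected components of a graph given as a 0/1 adjacency matrix."""
    n = len(adj)
    x = [row[:] for row in adj]  # work on a copy of the matrix
    alive = [True] * n           # False once a row/column has been fused away
    result = []
    for i in range(n):
        if not alive[i]:
            continue             # vertex i already belongs to an earlier component
        alive[i] = False
        comp = [i]
        fused = True
        while fused:             # repeat until no more vertices can be fused
            fused = False
            for j in range(n):
                if alive[j] and x[i][j]:
                    # Fuse j into i: OR row j into row i and column j into column i.
                    for c in range(n):
                        x[i][c] |= x[j][c]
                        x[c][i] |= x[c][j]
                    alive[j] = False  # row and column j are never used again
                    comp.append(j)
                    fused = True
        result.append(sorted(comp))
    return result

# Hypothetical six-vertex graph with edges 0-1, 1-2 and 3-4; vertex 5 is isolated.
ADJ = [[0, 1, 0, 0, 0, 0],
       [1, 0, 1, 0, 0, 0],
       [0, 1, 0, 0, 0, 0],
       [0, 0, 0, 0, 1, 0],
       [0, 0, 0, 1, 0, 0],
       [0, 0, 0, 0, 0, 0]]
print(components(ADJ))
```

The graph is connected exactly when the returned list contains a single component.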
String Matching
Finding all occurrences of a pattern in a text is a problem that arises frequently in text-editing
programs. Typically, the pattern searched for is a particular word supplied by the user. String-matching
algorithms are also used to search for particular patterns in DNA sequences.
The string-matching problem is defined as follows. We assume that the text is an array T[1..n]
of length n and that the pattern is an array P[1..m] of length m. We further assume that the elements of
P and T are characters drawn from a finite alphabet Σ. For example, we may have Σ = {0,1} or
Σ = {a,b,…,z}. The character arrays P and T are called strings.
We say that pattern P occurs with shift s in text T (or, equivalently, that pattern P occurs beginning at
position s + 1 in text T) if 0 ≤ s ≤ n − m and T[s + 1..s + m] = P[1..m] (that is, if T[s + j] = P[j] for
1 ≤ j ≤ m). If P occurs with shift s in T, then s is called a valid shift; otherwise, s is an invalid shift. The
string-matching problem is the problem of finding all valid shifts with which a given pattern P occurs in a
given text T. Figure 2.18 illustrates these definitions.
Figure 2.18. The string-matching problem
Our goal is to find all occurrences of the pattern P = abaa in the text T = abcabaabcabac. The pattern occurs
only once in the text, at shift s = 3. The shift s = 3 is said to be a valid shift. Each character of the
pattern is connected by a vertical line to the matching character in the text, and all matched characters are
shown shaded.
The naive brute-force algorithm for the string-matching problem has worst-case running time
O((n – m + 1)m). An interesting string-matching algorithm, due to Rabin and Karp, also has worst-case
running time O((n – m + 1)m), but it works much better on average and in practice. It also generalizes
nicely to other pattern-matching problems. The study then describes a string-matching algorithm that begins
by constructing a finite automaton specifically designed to search for occurrences of the given pattern P in a
text. This algorithm runs in time O(n + m|Σ|). The similar but much cleverer Knuth-Morris-Pratt (or KMP)
algorithm is presented further on. The KMP algorithm runs in time O(n + m). An algorithm due to Boyer
and Moore is often the best practical choice, although its worst-case running time (like that of the
Rabin-Karp algorithm) is no better than that of the naive string-matching algorithm.
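To make the definition of a valid shift concrete, here is a minimal Python sketch of the naive brute-force matcher. The function name naive_string_matcher is chosen for this illustration. Python strings are indexed from 0, so the text's T[s + 1..s + m] corresponds to the slice T[s:s + m], while the shift s itself has the same meaning as in the text.

```python
def naive_string_matcher(T, P):
    """Return all valid shifts s, 0 <= s <= n - m, at which P occurs in T."""
    n, m = len(T), len(P)
    shifts = []
    for s in range(n - m + 1):   # try each of the n - m + 1 possible shifts
        if T[s:s + m] == P:      # up to m character comparisons per shift
            shifts.append(s)
    return shifts

valid = naive_string_matcher("abcabaabcabac", "abaa")
print(valid)
```

For the text and pattern of Figure 2.18 the only valid shift found is s = 3. The two nested levels of work (shifts, then characters) give the O((n – m + 1)m) worst case.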
Notation and terminology
Σ* denotes the set of all finite-length strings formed using characters from the alphabet Σ. In this chapter,
we consider only strings of finite length. The zero-length empty string, denoted ε, also belongs to Σ*. The
length of a string x is denoted |x|. The concatenation of two strings x and y, denoted xy, has length |x| + |y|
and consists of the characters from x followed by the characters from y.
A string w is a prefix of a string x, denoted w ⊂ x, if x = wy for some string y ∈ Σ*. Note that if w ⊂ x, then
|w| ≤ |x|. Similarly, a string w is a suffix of a string x, denoted w ⊃ x, if x = yw for some y ∈ Σ*.
It follows from w ⊃ x that |w| ≤ |x|. The empty string ε is both a suffix and a prefix of every string. For
example, we have ab ⊂ abcca and cca ⊃ abcca. It is useful to note that for any strings x and y and any
character a, we have x ⊃ y if and only if xa ⊃ ya. Also note that ⊂ and ⊃ are transitive relations. The
following lemma will be useful later.
Student Activity 2.11
Answer the following questions:
1. How is connectedness determined from an adjacency matrix?
2. State and explain the string-matching problem.
3. What is the worst-case order of time complexity of the Rabin-Karp algorithm?
The Boyer-Moore Algorithm
This algorithm is most efficient if the pattern P is relatively long and the alphabet Σ is reasonably large.
It is due to Robert S. Boyer and J. Strother Moore.
Boyer-Moore Matcher (T, P, Σ)
1 n ← length[T]
2 m ← length[P]
3 λ ← Compute-Last-Occurrence-Function (P, m, Σ)
4 γ ← Compute-Good-Suffix-Function (P, m)
5 s ← 0
6 while s ≤ n – m
7     do j ← m
8        while j > 0 and P[j] = T[s + j]
9            do j ← j – 1
10       if j = 0
11          then print "Pattern occurs at shift" s
12               s ← s + γ[0]
13          else s ← s + max(γ[j], j – λ[T[s + j]])
Aside from the mysterious-looking λ's and γ's, this program looks remarkably like the naive
string-matching algorithm. Indeed, suppose we comment out lines 3–4 and replace the updating of s on
lines 12–13 with simple incrementations as follows:
12 s ← s + 1
13 else s ← s + 1
In the modified program, the while loop beginning on line 6 considers each of the n – m + 1 possible shifts s
in turn, and the while loop beginning on line 8 tests the condition P[1..m] = T[s + 1..s + m] by comparing
P[j] with T[s + j] for j = m, m – 1, …, 1. If the loop terminates with j = 0, a valid shift s has been found, and
line 11 prints out the value of s. At this level, the only remarkable features of the Boyer-Moore algorithm
are that it compares the pattern against the text from right to left and that it increases the shift s on lines
12–13 by a value that is not necessarily 1.
The Boyer-Moore algorithm uses two heuristics that allow it to avoid much of the work that our previous
string-matching algorithms performed. These heuristics are very effective in that they often allow the
algorithm to skip altogether the examination of many text characters. These heuristics are known as the
"bad-character heuristic" and the "good-suffix heuristic". They are illustrated in Figure 2.19. They can be
viewed as operating independently in parallel. When a mismatch occurs, each heuristic proposes an amount
by which s can safely be increased without missing a valid shift. The Boyer-Moore algorithm chooses the
larger amount and increases s by that amount: when line 13 is reached after a mismatch, the bad-character
heuristic proposes increasing s by j – λ[T[s + j]], and the good-suffix heuristic proposes increasing s by γ[j].
Figure 2.19
An illustration of the Boyer-Moore heuristics, using the text "written notice that" and the pattern
reminiscence. (a) Matching the pattern reminiscence against a text by comparing characters in a
right-to-left manner. The shift s is invalid; although a "good suffix" ce of the pattern matched correctly
against the corresponding characters in the text (matching characters are shown shaded), the "bad
character" i, which didn't match the corresponding character n in the pattern, was discovered in the text.
(b) The bad-character heuristic proposes moving the pattern to the right, if possible, by the amount that
guarantees that the bad text character will match the rightmost occurrence of the bad character in the
pattern. In this example, moving the pattern 4 positions to the right causes the bad text character i in the
text to match the rightmost i in the pattern, at position 6. If the bad character doesn't occur in the pattern,
then the pattern may be moved completely past the bad character in the text. If the rightmost occurrence of
the bad character in the pattern is to the right of the current bad character position, then this heuristic makes
no proposal. (c) With the good-suffix heuristic, the pattern is moved to the right by the least amount that
guarantees that any pattern characters that align with the good suffix ce previously found in the text will
match those suffix characters. In this example, moving the pattern 3 positions to the right satisfies this
condition. Since the good-suffix heuristic proposes a movement of 3 positions, which is smaller than the
4-position proposal of the bad-character heuristic, the Boyer-Moore algorithm increases the shift by 4.
The bad-character heuristic
When a mismatch occurs, this heuristic uses information about where the bad text character T[s + j] occurs
in the pattern (if it occurs at all) to propose a new shift. In the best case, the mismatch occurs on the first
comparison (P[m] ≠ T[s + m]) and the bad character T[s + m] does not occur in the pattern at all. (Imagine
searching for $a^m$ in the text string $b^n$.) In this case, we can increase the shift s by m, since any shift
smaller than s + m will align some pattern character against the bad character, causing a mismatch. If the
best case occurs repeatedly, the Boyer-Moore algorithm examines only a fraction 1/m of the text characters,
since each text character examined yields a mismatch, thus causing s to increase by m. This best-case
behaviour illustrates the power of matching right-to-left instead of left-to-right.
In general, the heuristic works as follows. Assume we have just found a mismatch: P[j] ≠ T[s + j], where
1 ≤ j ≤ m. Now let k be the largest index in the range 1 ≤ k ≤ m such that T[s + j] = P[k], if any such k
exists. Otherwise, let k = 0. We claim that we may safely increase s by j – k. We must consider three cases
to prove this claim, as illustrated by Figure 2.20.
k = 0: from Figure 2.20(a), the bad character T[s + j] didn't occur in the pattern at all, and so we can
safely increase s by j without missing any valid shifts.
k < j: from Figure 2.20(b), the rightmost occurrence of the bad character in the pattern is to the left of
position j, so that j – k > 0 and the pattern must be moved j – k characters to the right before the bad text
character matches any pattern character. Hence, we can increase s by j – k without missing any valid
shifts.
k > j: from Figure 2.20(c), j – k < 0, and so the bad-character heuristic is essentially proposing to decrease
s. This recommendation will be ignored by the algorithm, because the good-suffix heuristic will
propose a shift to the right in all cases.
Now we give a simple program that defines λ[a] to be the index of the rightmost position in the pattern at
which character a occurs, for each a ∈ Σ. If a is not found in the pattern, then λ[a] is set to 0. We
call λ the last-occurrence function for the pattern. With this definition, the expression j – λ[T[s + j]] on
line 13 of Boyer-Moore Matcher implements the bad-character heuristic. (Since j – λ[T[s + j]] is negative if
the rightmost occurrence of the bad character T[s + j] in the pattern is to the right of position j, we rely on
the positivity of γ[j], proposed by the good-suffix heuristic, to ensure that the algorithm makes progress at
each step.)
Figure 2.20. The cases of the bad-character heuristic. (a) The bad character h occurs nowhere in the pattern,
and so the pattern can be advanced j = 11 characters until it has passed over the bad character. (b) The
rightmost occurrence of the bad character in the pattern is at position k < j, and so the pattern can be
advanced j – k characters. Since j = 10 and k = 6 for the bad character i, the pattern can be advanced 4
positions until the i's line up. (c) The rightmost occurrence of the bad character in the pattern is at position
k > j. In this example, j = 10 and k = 12 for the bad character e. The bad-character heuristic proposes a
negative shift, which is ignored.
Compute-Last-Occurrence-Function (P, m, Σ)
1. for each character a ∈ Σ
2.     do λ[a] ← 0
3. for j ← 1 to m
4.     do λ[P[j]] ← j
5. return λ
The running time of procedure Compute-Last-Occurrence-Function is O(|Σ| + m).
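A brief Python sketch of the last-occurrence function may make the bad-character proposal easier to check by hand. The names compute_last_occurrence and bad_character_shift are invented for this illustration, positions are counted from 1 as in the text, and the lowercase alphabet is an assumption.

```python
import string

def compute_last_occurrence(P, alphabet):
    """lam[a] = rightmost 1-based position of character a in P, or 0 if absent."""
    lam = {a: 0 for a in alphabet}
    for j, ch in enumerate(P, start=1):
        lam[ch] = j  # later (more rightward) occurrences overwrite earlier ones
    return lam

def bad_character_shift(lam, j, bad_char):
    """Shift proposed after a mismatch at pattern position j: j - lam[bad_char]."""
    return j - lam.get(bad_char, 0)

lam = compute_last_occurrence("reminiscence", string.ascii_lowercase)
print(lam["i"], lam["e"], lam["h"])
print(bad_character_shift(lam, 10, "i"))  # mismatch at j = 10 against text character i
```

For the pattern reminiscence, a mismatch at j = 10 against the text character i gives a proposal of 10 – 6 = 4, as in Figure 2.20(b), while the character e gives the negative proposal 10 – 12 that the matcher ignores.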
The good-suffix heuristic
Here we need to define the relation Q ~ R (read "Q is similar to R") for strings Q and R to mean that
Q ⊃ R or R ⊃ Q. If two strings are similar, we can align them with their rightmost characters matched, and
no pair of aligned characters will disagree. The relation "~" is symmetric: Q ~ R if and only if R ~ Q. We
also have, as a consequence of the Lemma discussed earlier, that
Q ⊃ R and S ⊃ R imply Q ~ S. (1)
If P[j] ≠ T[s + j], where j < m, then the good-suffix heuristic says that we can safely advance s by
$\gamma[j] = m - \max\{k : 0 \le k < m \text{ and } P[j+1..m] \sim P_k\}$,
i.e., γ[j] is the least amount we can advance s and not cause any characters in the "good suffix"
T[s + j + 1..s + m] to be mismatched against the new alignment of the pattern. We call γ the good-suffix
function for the pattern P.
We now show how to compute the good-suffix function γ. We first observe that
γ[j] ≤ m – π[m] for all j, as follows. If w = π[m], then $P_w$ ⊃ P by the definition of π.
Furthermore, since P[j + 1..m] ⊃ P for any j, we have $P_w$ ~ P[j + 1..m], by equation (1). Therefore,
γ[j] ≤ m – π[m] for all j.
Now we rewrite our definition for γ as
$\gamma[j] = m - \max\{k : \pi[m] \le k < m \text{ and } P[j+1..m] \sim P_k\}$.
The condition that P[j + 1..m] ~ $P_k$ holds if either P[j + 1..m] ⊃ $P_k$ or $P_k$ ⊃ P[j + 1..m]. But the latter
possibility implies that $P_k$ ⊃ P and thus that k ≤ π[m], by the definition of π. This latter possibility
cannot reduce the value of γ[j] below m – π[m]. We can therefore rewrite our definition of γ still further as
follows:
$\gamma[j] = m - \max(\{\pi[m]\} \cup \{k : \pi[m] < k < m \text{ and } P[j+1..m] \supset P_k\})$.
(The second set may be empty.) It is worth observing that the definition implies that γ[j] > 0 for all
j = 1, 2, …, m, which ensures that the Boyer-Moore algorithm makes progress.
To simplify the expression for γ further, we define P' as the reverse of the pattern P and π' as the
corresponding prefix function. That is, P'[i] = P[m – i + 1] for i = 1, 2, …, m, and π'[t] is the largest u such
that u < t and $P'_u$ ⊃ $P'_t$.
If k is the largest possible value such that P[j + 1..m] ⊃ $P_k$, then we claim that
π'[l] = m – j, (2)
where l = (m – k) + (m – j). To see that this claim is well defined, note that P[j + 1..m] ⊃ $P_k$ implies that
m – j ≤ k, and thus l ≤ m. Also, j < m and k ≤ m, so that l ≥ 1. We prove this claim as follows. Since
P[j + 1..m] ⊃ $P_k$, we have $P'_{m-j}$ ⊃ $P'_l$. Therefore, π'[l] ≥ m – j. Suppose now that p > m – j, where
p = π'[l]. Then, by the definition of π', we have $P'_p$ ⊃ $P'_l$ or, equivalently, P'[1..p] = P'[l – p + 1..l].
Rewriting this equation in terms of P rather than P', we have P[m – p + 1..m] = P[m – l + 1..m – l + p].
Substituting for l = 2m – k – j, we obtain P[m – p + 1..m] = P[k – m + j + 1..k – m + j + p], which implies
P[m – p + 1..m] ⊃ $P_{k-m+j+p}$. Since p > m – j, we have j + 1 > m – p + 1, and so
P[j + 1..m] ⊃ P[m – p + 1..m], implying that P[j + 1..m] ⊃ $P_{k-m+j+p}$ by the transitivity of ⊃. Finally,
since p > m – j, we have k' > k, where k' = k – m + j + p, contradicting our choice of k as the largest possible
value such that P[j + 1..m] ⊃ $P_k$. This contradiction means that we can't have p > m – j, and thus
p = m – j, which proves the claim (2).
Using equation (2), and noting that π'[l] = m – j implies that j = m – π'[l] and k = m – l + π'[l], we can
rewrite our definition of γ still further:
$\gamma[j] = m - \max(\{\pi[m]\} \cup \{m - l + \pi'[l] : 1 \le l \le m \text{ and } j = m - \pi'[l]\})$
$\phantom{\gamma[j]} = \min(\{m - \pi[m]\} \cup \{l - \pi'[l] : 1 \le l \le m \text{ and } j = m - \pi'[l]\})$ (3)
Again, the second set may be empty.
Now we see the procedure for computing γ:
Compute-Good-Suffix-Function (P, m)
1. π ← Compute-Prefix-Function (P)
2. P' ← reverse (P)
3. π' ← Compute-Prefix-Function (P')
4. for j ← 0 to m
5.     do γ[j] ← m – π[m]
6. for l ← 1 to m
7.     do j ← m – π'[l]
8.        if γ[j] > l – π'[l]
9.           then γ[j] ← l – π'[l]
10. return γ
The procedure Compute-Good-Suffix-Function is a direct implementation of equation (3). Its
running time is O(m).
The worst-case running time of the Boyer-Moore algorithm is clearly O((n – m + 1)m + |Σ|), since
Compute-Last-Occurrence-Function takes time O(m + |Σ|), Compute-Good-Suffix-Function takes time
O(m), and the Boyer-Moore algorithm (like the Rabin-Karp algorithm) spends O(m) time validating each
valid shift s.
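Putting the pieces together, the matcher and its two preprocessing functions can be sketched in Python roughly as follows. This is an illustrative transcription of the pseudocode under a few assumptions: the function names are made up, π is computed with the usual prefix-function routine (the largest u < q such that the u-character prefix is a suffix of the q-character prefix), a dictionary with default 0 stands in for λ over the whole alphabet, and an empty pattern is handled separately because γ[0] would be 0.

```python
def compute_prefix_function(P):
    """pi[q] for q = 1..m (pi[0] unused): longest proper prefix of P[1..q] that is also its suffix."""
    m = len(P)
    pi = [0] * (m + 1)
    k = 0
    for q in range(2, m + 1):
        while k > 0 and P[k] != P[q - 1]:  # P[k] is the text's P[k+1] (0-based string)
            k = pi[k]
        if P[k] == P[q - 1]:
            k += 1
        pi[q] = k
    return pi

def compute_good_suffix(P):
    """gamma[0..m] following equation (3)."""
    m = len(P)
    pi = compute_prefix_function(P)
    pi_rev = compute_prefix_function(P[::-1])  # prefix function of the reversed pattern
    gamma = [m - pi[m]] * (m + 1)
    for l in range(1, m + 1):
        j = m - pi_rev[l]
        gamma[j] = min(gamma[j], l - pi_rev[l])
    return gamma

def boyer_moore(T, P):
    """Return all valid shifts of P in T."""
    n, m = len(T), len(P)
    if m == 0:
        return list(range(n + 1))  # assumption: an empty pattern matches at every shift
    lam = {ch: j for j, ch in enumerate(P, start=1)}  # last-occurrence function
    gamma = compute_good_suffix(P)
    shifts = []
    s = 0
    while s <= n - m:
        j = m
        while j > 0 and P[j - 1] == T[s + j - 1]:  # compare right to left
            j -= 1
        if j == 0:
            shifts.append(s)
            s += gamma[0]
        else:
            # Take the larger of the good-suffix and bad-character proposals.
            s += max(gamma[j], j - lam.get(T[s + j - 1], 0))
    return shifts

print(boyer_moore("abcabaabcabac", "abaa"))
```

On the text and pattern of Figure 2.18 this visits only the shifts 0, 3, 6 and 9 before stopping, whereas the naive matcher would test all ten.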
The Knuth-Morris-Pratt algorithm
Knuth, Morris and Pratt gave a linear-time string-matching algorithm. Their algorithm achieves a Θ(n + m)
running time by avoiding the computation of the transition function δ altogether. It does the pattern
matching using just an auxiliary function π[1..m] precomputed from the pattern in time O(m). The array π
allows the transition function δ to be computed efficiently "on the fly" as needed. Roughly speaking, for
any state q = 0, 1, …, m and any character a ∈ Σ, the value π[q] contains the information that is
independent of a and is needed to compute δ(q, a). (This remark will be clarified shortly.) Since the array π
has only m entries, whereas δ has O(m|Σ|) entries, we save a factor of |Σ| in preprocessing by computing π
rather than δ.
The prefix function
The prefix function for a pattern encapsulates knowledge about how the pattern matches against shifts of itself. This information can be used to avoid testing useless shifts in the naive pattern-matching algorithm.
Consider the operation of the naive string matcher. Figure 2.21(a) shows a particular shift s of a template containing the pattern P = ababaca against a text T. For this example, q = 5 of the characters have matched successfully, but the 6th pattern character fails to match the corresponding text character. The information that q characters have matched successfully determines the corresponding text characters. Knowing these q text characters allows us to determine immediately that certain shifts are invalid. In the example of the figure, the shift s + 1 is necessarily invalid, since the first pattern character, an a, would be aligned with a text character that is known to match the second pattern character, a b. The shift s + 2 shown in part (b) of the figure, however, aligns the first three pattern characters with three text characters that must necessarily match. In general, it is useful to know the answer to the following question:
Figure 2.21. The prefix function π. (a) The pattern P = ababaca is aligned with a text T so that the first q = 5 characters match. Matching characters, shown shaded, are connected by vertical lines. (b) Using only our knowledge of the 5 matched characters, we can deduce that a shift of s + 1 is invalid, but that a shift of s' = s + 2 is consistent with everything we know about the text and therefore is potentially valid. (c) The useful information for such deductions can be precomputed by comparing the pattern with itself. Here, we see that the longest prefix of P that is also a suffix of P_5 is P_3. This information is precomputed and represented in the array π, so that π[5] = 3. Given that q characters have matched successfully at shift s, the next potentially valid shift is at s' = s + (q – π[q]).
Given that pattern characters P[1..q] match text characters T[s + 1..s + q], what is the least shift
s’ > s such that
P[1..k] = T[s’ + 1..s’ + k] ... (α)
where s’ + k = s + q?
Such a shift s' is the first shift greater than s that is not necessarily invalid due to our knowledge of T[s + 1..s + q]. In the best case, we have that s' = s + q, and shifts s + 1, s + 2, ..., s + q – 1 are all immediately ruled out. In any case, at the new shift s' we don't need to compare the first k characters of P with the corresponding characters of T, since we are guaranteed that they match by equation (α).
The necessary information can be precomputed by comparing the pattern against itself, as illustrated in Figure 2.21(c). Since T[s' + 1..s' + k] is part of the known portion of the text, it is a suffix of the string P_q. Equation (α) can therefore be interpreted as asking for the largest k < q such that P_k ⊐ P_q, that is, such that P_k is a suffix of P_q. Then s' = s + (q – k) is the next potentially valid shift. It turns out to be convenient to store the number k of matching characters at the new shift s', rather than storing, say, s' – s. This information can be used to speed up both the naive string-matching algorithm and the finite-automaton matcher.
We formalize the precomputation required as follows. Given a pattern P[1..m], the prefix function for the pattern P is the function π : {1, 2, ..., m} → {0, 1, ..., m – 1} such that
π[q] = max {k : k < q and P_k ⊐ P_q}
That is, π[q] is the length of the longest prefix of P that is a proper suffix of P_q.
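The definition can be checked directly, without any cleverness, by trying every k < q. The sketch below does exactly that for the pattern ababaca of Figure 2.21; the function name is an assumption chosen for illustration, and this brute-force version is deliberately slow compared with the linear-time procedure.

```python
def prefix_function_by_definition(pattern):
    """Return {q: pi[q]} for q = 1..m, straight from the definition of pi."""
    m = len(pattern)
    pi = {}
    for q in range(1, m + 1):
        p_q = pattern[:q]  # the prefix P_q
        # largest k < q such that P_k is a suffix of P_q (k = 0 always qualifies)
        pi[q] = max(k for k in range(q) if p_q.endswith(pattern[:k]))
    return pi


print(prefix_function_by_definition("ababaca"))
# {1: 0, 2: 0, 3: 1, 4: 2, 5: 3, 6: 0, 7: 1}
```

The entry for q = 5 is 3, in agreement with π[5] = 3 in the figure.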
The Knuth-Morris-Pratt matching algorithm is given in pseudocode below as the procedure KMPMatcher. It is mostly modeled after FiniteAutomatonMatcher, as we shall see. KMPMatcher calls the auxiliary procedure ComputePrefixFunction to compute π.
KMPMatcher (T, P)
n ← length [T]
m ← length [P]
π ← ComputePrefixFunction (P)
q ← 0
for i ← 1 to n
    do while q > 0 and P[q + 1] ≠ T[i]
           do q ← π[q]
       if P[q + 1] = T[i]
           then q ← q + 1
       if q = m
           then print “Pattern occurs with shift” i – m
                q ← π[q]
ComputePrefixFunction (P)
1 m←length [P]
2 π[1]←0
3 k←0
4 for q←2 to m
5 do while k> 0 and P[k + 1] ≠ P[q]
6 do k←π[k]
7 if P[k + 1] = P[q]
8 then k←k +1
9 π[q] ←k
10 return π
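The two procedures translate almost line for line into Python. The sketch below assumes 0-based Python strings, keeps index 0 of `pi` unused so that `pi[q]` means the same thing as π[q] in the pseudocode, and returns the list of valid shifts instead of printing them; these choices and the function names are assumptions made for illustration.

```python
def compute_prefix_function(pattern):
    """Return pi as a list where pi[q] is defined for q = 1..m (pi[0] is unused)."""
    m = len(pattern)
    pi = [0] * (m + 1)
    k = 0
    for q in range(2, m + 1):
        while k > 0 and pattern[k] != pattern[q - 1]:
            k = pi[k]
        if pattern[k] == pattern[q - 1]:
            k += 1
        pi[q] = k
    return pi


def kmp_matcher(text, pattern):
    """Return every shift s such that text[s:s + m] == pattern."""
    n, m = len(text), len(pattern)
    if m == 0:
        return []
    pi = compute_prefix_function(pattern)
    shifts = []
    q = 0  # number of pattern characters matched so far
    for i in range(n):
        while q > 0 and pattern[q] != text[i]:
            q = pi[q]  # next character does not match: fall back
        if pattern[q] == text[i]:
            q += 1     # next character matches
        if q == m:     # all of the pattern matched
            shifts.append(i + 1 - m)
            q = pi[q]  # look for the next match
    return shifts


print(kmp_matcher("abababacaba", "ababaca"))  # [2]
```

Overlapping occurrences are reported too, because q falls back to π[m] rather than to 0 after a match.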
Time complexity
The time complexity of ComputePrefixFunction is O(m), as an amortized analysis shows. We associate a potential of k with the current state k of the algorithm. This potential has an initial value of 0 (from line 3). Line 6 decreases k, since π[k] < k. Since π[k] ≥ 0 for all k, however, k can never become negative. The only other line that affects k is line 8, which increases k by at most one during each execution of the for loop body. Since k < q upon entering the for loop and since q is incremented in each iteration of the for loop body, k < q always holds. (This justifies the claim that π[q] < q as well, by line 9.) We can pay for each execution of the while loop body on line 6 with the corresponding decrease in the potential function, since π[k] < k. Line 8 increases the potential function by at most one, so that the amortized cost of the loop body on lines 5–9 is O(1). Since the number of outer-loop iterations is O(m), and since the final potential function is at least as great as the initial potential function, the total actual worst-case running time of ComputePrefixFunction is O(m).
The Knuth-Morris-Pratt algorithm has time complexity O(m + n). The call of ComputePrefixFunction takes O(m) time, as we have just seen, and a similar analysis, using the value of q as the potential function, shows that the remainder of KMPMatcher takes O(n) time.
Summary
Average-case time complexities of bubble sort, insertion sort and selection sort are O(n²).
Merge sort has space complexity O(n).
Prim's and Kruskal's algorithms are used to find a minimum spanning tree.
Finding all occurrences of a pattern in a text is a problem known as string matching.
Self-assessment Exercises
Solved Exercise
I. True and False
1. A selection sort is one in which successive elements are selected in order and placed into their
proper sorted positions.
2. Selection sort is more efficient than quick sort.
II. Fill in the blanks
1. Kruskal’s algorithm is used to find______________.
2. Average case time complexity of quick sort is______________.
3. Two string matching algorithms are _________ and ________ algorithms.
Answers
I. True and False
1. True
2. False
II. Fill in the blanks
1. minimum spanning tree
2. O(n log n)
3. Boyer-Moore, Knuth-Morris-Pratt
Unsolved Exercise
I. True and False
1. Prim's algorithm is used to find a minimum spanning tree.
2. If the pattern is relatively long and the alphabet is reasonably large, then the Boyer-Moore algorithm is likely to be the most efficient string-matching algorithm.
II. Fill in the blanks
1. Time complexity of Bubble Sort is _______________.
2. Space complexity of Merge Sort is _______________.
3. Two string matching algorithms are _______________ and _______________.
4. Kruskal's algorithm is used to create_______________ tree_______________.
5. Prim's algorithm is a _______________ method for creation of a minimum spanning tree.
Unit 3
Dynamic Programming
Learning Objectives
• Overview
• Principle of Optimality
• Matrix Multiplication
• Optimal Binary Search Trees
Overview
Dynamic programming is an algorithm design method that can be used when the solution to a problem can
be viewed as the result of a sequence of decisions.
One way to solve problems for which it is not possible to make a sequence of stepwise decisions leading to
an optimal sequence is to try all possible decision sequences. We could enumerate all decision sequences
and then pick out the best. But the time and space requirements may be prohibitive. Dynamic Programming
often drastically reduces the amount of enumeration by avoiding the enumeration of some decision
sequences that cannot possibly be optimal. In dynamic programming an optimal sequence of decisions is
obtained by making implicit appeal to the principle of optimality.
Principle of Optimality
The principle of optimality states that an optimal sequence of decisions has the property that whatever the
initial state and decision are, the remaining decisions must constitute an optimal decision sequence with
regard to the state resulting from the first decision.
Thus, the essential difference between the greedy method and dynamic programming is that in the greedy method only one decision sequence is ever generated. In dynamic programming many decision sequences may be generated. However, sequences containing suboptimal subsequences cannot be optimal (if the principle of optimality holds) and so will not (as far as possible) be generated.
Another important feature of the dynamic programming approach is that optimal solutions to subproblems are retained so as to avoid recomputing their values. The use of these tabulated values makes it natural to recast the recursive equations into an iterative algorithm.
Student Activity 3.1
Before going to next section, answer the following questions:
1. What is Dynamic programming?
2. State principle of optimality.
If your answers are correct, then proceed to next section.
Matrix Multiplication
Our first example of dynamic programming is an algorithm that solves the problem of matrix-chain multiplication. We have a sequence (chain) (A_1, A_2, ..., A_n) of n matrices to be multiplied, and our goal is to compute the product
A_1 A_2 ... A_n          (1)
We can evaluate the expression (1) using the standard algorithm for multiplying pairs of matrices as a subroutine once we have parenthesized it to resolve all ambiguities in how the matrices are multiplied together. A product of matrices is fully parenthesized if it is either a single matrix or the product of two fully parenthesized matrix products, surrounded by parentheses. Matrix multiplication is associative, so all parenthesizations yield the same product. For example, the product A_1 A_2 A_3 A_4 can be fully parenthesized in five distinct ways:
(A_1 (A_2 (A_3 A_4))),
(A_1 ((A_2 A_3) A_4)),
((A_1 A_2) (A_3 A_4)),
((A_1 (A_2 A_3)) A_4),
(((A_1 A_2) A_3) A_4).
The way we parenthesize a chain of matrices can have a dramatic impact on the cost of evaluating the product. Consider first the cost of multiplying two matrices. The standard algorithm is given by the following pseudocode.
The attributes rows and columns are the number of rows and columns in a matrix.
Matrix Multiply (A, B)
{
    if (columns [A] != rows [B])
        print_error (“incompatible dimensions”)
    else
    {
        for i = 1 to rows [A]
            for j = 1 to columns [B]
            {
                C [i, j] = 0
                for k = 1 to columns [A]
                    C [i, j] = C [i, j] + A [i, k] * B [k, j]
            }
        return C
    }
}
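To examine the different costs incurred by different parenthesizations of a matrix product, note that multiplying a p × q matrix by a q × r matrix with the procedure above takes pqr scalar multiplications.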
Counting the Number of Parenthesizations
Before solving the matrix-chain multiplication problem by dynamic programming, we should convince ourselves that exhaustively checking all possible parenthesizations does not yield an efficient algorithm. Denote the number of alternative parenthesizations of a sequence of n matrices by P(n). Since we can split a sequence of n matrices between the kth and (k + 1)st matrices for any k = 1, 2, ..., n – 1 and then parenthesize the two resulting subsequences independently, we obtain the recurrence
\[
P(n) =
\begin{cases}
1 & \text{if } n = 1, \\
\sum_{k=1}^{n-1} P(k)\,P(n-k) & \text{if } n \ge 2.
\end{cases}
\]
The solution to this recurrence is the sequence of Catalan numbers:
\[
P(n) = C(n-1), \quad \text{where} \quad C(n) = \frac{1}{n+1}\binom{2n}{n} = \Omega\!\left(\frac{4^{n}}{n^{3/2}}\right).
\]
The number of solutions is thus exponential in n, and the brute-force method of exhaustive search is therefore a poor strategy for determining the optimal parenthesization of a matrix chain.
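The recurrence for P(n) and its Catalan-number solution can be compared numerically. The sketch below is only an illustration; the function names are assumptions, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def count_parenthesizations(n):
    """P(n) computed bottom-up from the recurrence, for n >= 1."""
    p = [0] * (n + 1)
    p[1] = 1
    for length in range(2, n + 1):
        # split between the k-th and (k + 1)-st matrices
        p[length] = sum(p[k] * p[length - k] for k in range(1, length))
    return p[n]


def catalan(n):
    """C(n) = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


for n in range(1, 9):
    print(n, count_parenthesizations(n), catalan(n - 1))
```

For n = 4 both columns give 5, matching the five parenthesizations of A_1 A_2 A_3 A_4 listed earlier.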
An Optimal Parenthesization
In dynamic programming the first step is to characterize the structure of an optimal solution. For the matrix-chain multiplication problem we can perform this step as follows. For convenience, let us adopt the notation A_{i..j} for the matrix that results from evaluating the product A_i A_{i+1} ... A_j. An optimal parenthesization of the product A_1 A_2 ... A_n splits the product between A_k and A_{k+1} for some integer k in the range 1 ≤ k < n. That is, for some value of k, we first compute the matrices A_{1..k} and A_{k+1..n} and then multiply them together to produce the final product A_{1..n}. The cost of this optimal parenthesization is thus the cost of computing the matrix A_{1..k}, plus the cost of computing A_{k+1..n}, plus the cost of multiplying them together.
The key observation is that the parenthesization of the ‘prefix’ subchain A_1 A_2 ... A_k within this optimal parenthesization of A_1 A_2 ... A_n must be an optimal parenthesization of A_1 A_2 ... A_k. Why? If there were a less costly way to parenthesize A_1 A_2 ... A_k, substituting that parenthesization in the optimal parenthesization of A_1 A_2 ... A_n would produce another parenthesization of A_1 A_2 ... A_n whose cost was lower than the optimum: a contradiction. A similar observation holds for the parenthesization of the subchain A_{k+1} A_{k+2} ... A_n in the optimal parenthesization of A_1 A_2 ... A_n: it must be an optimal parenthesization of A_{k+1} A_{k+2} ... A_n.
Thus, an optimal solution to an instance of the matrix-chain multiplication problem contains within it optimal solutions to subproblem instances.
A Recursive Solution
Next we define the value of an optimal solution recursively in terms of the optimal solutions to subproblems. For this problem, we pick as our subproblems the problems of determining the minimum cost of a parenthesization of A_i A_{i+1} ... A_j for 1 ≤ i ≤ j ≤ n. Let m[i, j] be the minimum number of scalar multiplications needed to compute the matrix A_{i..j}; the cost of a cheapest way to compute A_{1..n} would thus be m[1, n].
Now we can define m[i, j] as follows. If i = j, the chain consists of just one matrix A_{i..i} = A_i, so no scalar multiplications are necessary to compute the product. Thus m[i, i] = 0 for i = 1, 2, ..., n. To compute m[i, j] when i < j, we take advantage of the structure of an optimal solution from step 1. Let us suppose that the optimal parenthesization splits the product A_i A_{i+1} ... A_j between A_k and A_{k+1}, where i ≤ k < j. Then m[i, j] is equal to the minimum cost for computing the subproducts A_{i..k} and A_{k+1..j}, plus the cost of multiplying these two matrices together. Since matrix A_i has dimensions p_{i–1} × p_i, computing the matrix product A_{i..k} A_{k+1..j} takes p_{i–1} p_k p_j scalar multiplications, and we obtain
m[i, j] = m[i, k] + m[k+1, j] + p_{i–1} p_k p_j.
This recurrence relation assumes that we know the value of k, which we don't. There are only j – i possible values for k, however, namely k = i, i+1, ..., j–1. Since the optimal parenthesization must use one of these values for k, we need only check them all to find the best. Thus, our recursive definition for the minimum cost of parenthesizing the product A_i A_{i+1} ... A_j becomes
\[
m[i,j] =
\begin{cases}
0 & \text{if } i = j, \\
\min\limits_{i \le k < j} \{\, m[i,k] + m[k+1,j] + p_{i-1}\,p_k\,p_j \,\} & \text{if } i < j.
\end{cases}
\tag{2}
\]
The m[i, j] values give the costs of optimal solutions to subproblems. To help us keep track of how to construct an optimal solution, let us define s[i, j] to be a value of k at which we can split the product A_i A_{i+1} ... A_j to obtain an optimal parenthesization. That is, s[i, j] equals a value k such that m[i, j] = m[i, k] + m[k+1, j] + p_{i–1} p_k p_j.
Calculating optimal cost
Now it is a simple matter to write a recursive algorithm based on recurrence (2) to compute the minimum cost m[1, n] for multiplying A_1 A_2 ... A_n. However, this algorithm takes exponential time, no better than the brute-force method of checking each way of parenthesizing the product.
The important observation is that we have relatively few subproblems: one problem for each choice of i and j satisfying 1 ≤ i ≤ j ≤ n, or n(n – 1)/2 + n = Θ(n²) in total.
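Because there are only Θ(n²) distinct subproblems, even the plain recursive formulation becomes polynomial once each m[i, j] is cached after its first evaluation. The sketch below shows this top-down variant of recurrence (2); it is an illustration only, not the bottom-up method this section develops, and the function name and the use of `functools.lru_cache` are assumptions. The dimension sequence p = (p_0, ..., p_n) is passed as a tuple or list.

```python
from functools import lru_cache


def matrix_chain_cost_memoized(p):
    """Return m[1, n] for the chain whose matrix A_i has dimensions p[i-1] x p[i]."""
    n = len(p) - 1

    @lru_cache(maxsize=None)
    def m(i, j):
        if i == j:
            return 0  # a single matrix needs no multiplications
        # recurrence (2): try every split point k with i <= k < j
        return min(m(i, k) + m(k + 1, j) + p[i - 1] * p[k] * p[j]
                   for k in range(i, j))

    return m(1, n)


print(matrix_chain_cost_memoized((30, 35, 15, 5, 10, 20, 25)))  # 15125
```

Without the cache the same function recomputes each subchain many times over, which is the source of the exponential running time mentioned above.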
Instead of computing the solution to recurrence (2) recursively, we perform the third step of the dynamic programming paradigm and compute the optimal cost by using a bottom-up approach. The following pseudocode assumes that matrix A_i has dimensions p_{i–1} × p_i for i = 1, 2, ..., n. The input is a sequence (p_0, p_1, ..., p_n), where length [p] = n + 1. The procedure uses an auxiliary table m[1..n, 1..n] for storing the m[i, j] costs and an auxiliary table s[1..n, 1..n] that records which index k achieved the optimal cost in computing m[i, j].
Matrix chain order (p)
{
    n = length [p] – 1
    for (i = 1; i <= n; i++)
        m [i, i] = 0
    for (l = 2; l <= n; l++)              // l is the chain length
        for (i = 1; i <= n – l + 1; i++)
        {
            j = i + l – 1
            m [i, j] = ∞
            for (k = i; k <= j – 1; k++)
            {
                q = m [i, k] + m [k+1, j] + p[i–1] * p[k] * p[j]
                if (q < m [i, j])
                {
                    m [i, j] = q
                    s [i, j] = k
                }
            }
        }
    return m and s
}
This algorithm fills in the table m in a manner that corresponds to solving the parenthesization problem on matrix chains of increasing length.
Example
Now we see the operation of the above algorithm on a chain of n = 6 matrices:
Matrix    Dimension
A_1       30 × 35
A_2       35 × 15
A_3       15 × 5
A_4       5 × 10
A_5       10 × 20
A_6       20 × 25
The tables m and s for this problem, as calculated by the above algorithm, are shown below.
m[i, j]   j=1      j=2      j=3      j=4      j=5      j=6
i=1       0        15,750   7,875    9,375    11,875   15,125
i=2                0        2,625    4,375    7,125    10,500
i=3                         0        750      2,500    5,375
i=4                                  0        1,000    3,500
i=5                                           0        5,000
i=6                                                    0
s[i, j]   j=2   j=3   j=4   j=5   j=6
i=1       1     1     3     3     3
i=2             2     3     3     3
i=3                   3     3     3
i=4                         4     5
i=5                               5
Figure 3.1
The minimum number of scalar multiplications needed to multiply the 6 matrices is m[1, 6] = 15,125.
A simple inspection of the nested loop structure of Matrix chain order yields a running time of O(n³) for the algorithm.
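The bottom-up procedure can be run on the six-matrix example as follows. This Python sketch keeps the 1-based indexing of the pseudocode by leaving row and column 0 of both tables unused; that convention and the function name are assumptions made for illustration.

```python
import math


def matrix_chain_order(p):
    """Return tables (m, s) for the chain whose matrix A_i is p[i-1] x p[i]."""
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]  # m[i][i] = 0 already
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):             # chain length l
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = math.inf
            for k in range(i, j):              # split between A_k and A_(k+1)
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j] = q
                    s[i][j] = k
    return m, s


m, s = matrix_chain_order([30, 35, 15, 5, 10, 20, 25])
print(m[1][6], s[1][6])  # 15125 3
```

The three nested loops each run at most n times, which is where the O(n³) bound comes from.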
Constructing an optimal solution
The procedure Matrix chain order does not show how to multiply the matrices; it only determines the optimal number of scalar multiplications needed to compute a matrix-chain product.
We use the table s[1..n, 1..n] to determine the best way to multiply the matrices. Each entry s[i, j] records the value of k such that the optimal parenthesization of A_i A_{i+1} ... A_j splits the product between A_k and A_{k+1}. Thus, we know that the final matrix multiplication in computing A_{1..n} optimally is A_{1..s[1,n]} A_{s[1,n]+1..n}. The earlier matrix multiplications can be computed recursively, since s[1, s[1, n]] determines the last matrix multiplication in computing A_{1..s[1,n]}, and s[s[1, n] + 1, n] determines the last matrix multiplication in computing A_{s[1,n]+1..n}. The following recursive procedure computes the matrix-chain product A_{i..j} given the matrices A = (A_1, A_2, ..., A_n), the table s computed by Matrix chain order, and the indices i and j. The initial call is Matrix Chain Multiply (A, s, 1, n).
Matrix Chain Multiply (A, s, i, j)
{
    if (j > i)
    {
        X = Matrix Chain Multiply (A, s, i, s [i, j]);
        Y = Matrix Chain Multiply (A, s, s [i, j] + 1, j);
        return Matrix Multiply (X, Y)
    }
    else return A_i
}
In the above example the call Matrix Chain Multiply (A, s, 1, 6) computes the matrix-chain product according to the parenthesization ((A_1 (A_2 A_3)) ((A_4 A_5) A_6)).
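The same recursion over the s table can produce the parenthesization as text instead of performing the multiplications. The sketch below recomputes s for the example chain so that it runs on its own; the helper names `split_table` and `optimal_parens` are assumptions made for illustration.

```python
import math


def split_table(p):
    """Return the table s of optimal split points (indices start at 1)."""
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = math.inf
            for k in range(i, j):
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j], s[i][j] = q, k
    return s


def optimal_parens(s, i, j):
    """Return the optimal parenthesization of A_i ... A_j as a string."""
    if i == j:
        return f"A{i}"
    k = s[i][j]  # last multiplication splits between A_k and A_(k+1)
    return "(" + optimal_parens(s, i, k) + optimal_parens(s, k + 1, j) + ")"


s = split_table([30, 35, 15, 5, 10, 20, 25])
print(optimal_parens(s, 1, 6))  # ((A1(A2A3))((A4A5)A6))
```

Each call emits one matrix name or one pair of parentheses, so the string is built with a number of calls linear in the chain length.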
Student Activity 3.2
Before going to next section, answer the following questions:
1. What is matrix chain multiplication problem?
2. Describe matrix chain multiplication with an example.
If your answers are correct, then proceed to next section.
Optimal Binary Search Trees
Given a fixed set of identifiers, we wish to create a binary search tree organization. We may expect different binary search trees for the same identifier set to have different performance characteristics. The tree of figure 3.2(a), in the worst case, requires four comparisons to find an identifier, whereas the tree of figure 3.2(b) requires only three. On the average the two trees need 12/5 and 11/5 comparisons, respectively.
Figure 3.2. Two binary search trees for the identifiers do, for, if, int, while. (a) Root for, with left child do and right child while; int is the left child of while, and if is the left child of int. (b) Root for, with left child do and right child int; if and while are the left and right children of int.
For example, in the case of the tree of figure 3.2(a), it takes 1, 2, 2, 3, and 4 comparisons, respectively, to find the identifiers for, do, while, int and if. Thus the average number of comparisons is (1+2+2+3+4)/5 = 12/5. This calculation assumes that each identifier is searched for with equal probability and that no unsuccessful searches (i.e., searches for identifiers not in the tree) are made.
In a general situation we can expect different identifiers to be searched for with different frequencies (or probabilities). In addition, we can expect unsuccessful searches also to be made. Let us assume that the given set of identifiers is {a_1, a_2, ..., a_n} with a_1 < a_2 < ... < a_n. Let P(i) be the probability with which we search for a_i. Let q(i) be the probability that the identifier x being searched for is such that a_i < x < a_{i+1}, 0 ≤ i ≤ n (assume a_0 = –∞ and a_{n+1} = +∞). Then Σ_{0≤i≤n} q(i) is the probability of an unsuccessful search. Clearly, Σ_{1≤i≤n} P(i) + Σ_{0≤i≤n} q(i) = 1. Given this data, we wish to construct an optimal binary search tree for {a_1, a_2, ..., a_n}. First, of course, we must be precise about what we mean by an optimal binary search tree.
In obtaining a cost function for binary search trees, it is useful to add a fictitious node in place of every empty subtree in the search tree. Such nodes, called external nodes, are drawn square in figure 3.3. All other nodes are internal nodes. If a binary search tree represents n identifiers, then there will be exactly n internal nodes and n+1 (fictitious) external nodes. Every internal node represents a point where a successful search may terminate. Every external node represents a point where an unsuccessful search may terminate.
Figure 3.3. The binary search trees of figure 3.2(a) and (b) with external nodes added.
If a successful search terminates at an internal node at level l, then l iterations of the while loop of the binary search algorithm are needed. Hence the expected cost contribution from the internal node for a_i is P(i) * level(a_i).
Unsuccessful searches terminate with t = 0 (i.e., at an external node). The identifiers not in the binary search tree can be partitioned into n+1 equivalence classes E_i, 0 ≤ i ≤ n. The class E_0 contains all identifiers x such that x < a_1. The class E_i contains all identifiers x such that a_i < x < a_{i+1}, 1 ≤ i < n. The class E_n contains all identifiers x such that x > a_n. It is easy to see that for all identifiers in the same class E_i, the search terminates at the same external node. For identifiers in different E_i the search terminates at different external nodes. If the failure node for E_i is at level l, then only l–1 iterations of the while loop are made. Hence the cost contribution of this node is q(i) * (level(E_i) – 1).
The preceding discussion leads to the following formula for the expected cost of a binary search tree:
Σ_{1≤i≤n} P(i) * level(a_i) + Σ_{0≤i≤n} q(i) * (level(E_i) – 1)
We define an optimal binary search tree for the identifier set {a_1, a_2, ..., a_n} to be a binary search tree for which the above expression is minimum.
Example
The possible binary search trees for the identifier set (a_1, a_2, a_3) = (do, if, while) are given in figure 3.4. With equal probabilities P(i) = q(i) = 1/7 for all i, we have
cost (tree a) = 15/7    cost (tree b) = 13/7
cost (tree c) = 15/7    cost (tree d) = 15/7
cost (tree e) = 15/7
As expected, tree b is optimal. With P(1) = .5, P(2) = .1, P(3) = .05, q(0) = .15, q(1) = .1, q(2) = .05 and q(3) = .05 we have
cost (tree a) = 2.65    cost (tree b) = 1.9
cost (tree c) = 1.5     cost (tree d) = 2.05
cost (tree e) = 1.6
For instance, cost (tree a) can be computed as follows. The contribution from successful searches is 3*0.5 + 2*0.1 + 0.05 = 1.75 and the contribution from unsuccessful searches is 3*0.15 + 3*0.1 + 2*0.05 + 0.05 = 0.90. All the other costs can be calculated in a similar manner. Tree c is optimal with this assignment of P's and q's.
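Figure 3.4. Binary search trees for (do, if, while). (a) Root while, with left child if, whose left child is do. (b) Root if, with left child do and right child while. (c) Root do, with right child if, whose right child is while. (d) Root while, with left child do, whose right child is if. (e) Root do, with right child while, whose left child is if.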
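The expected-cost formula can be evaluated mechanically for trees a, b and c of figure 3.4. In the sketch below a tree is written as a nested tuple (left, i, right), where i is the index of identifier a_i and None stands for an external node; this encoding, the function name, and the use of exact fractions are assumptions made for illustration.

```python
from fractions import Fraction


def expected_cost(tree, p, q):
    """Sum of p[i]*level(a_i) over internal nodes plus q[i]*(level(E_i)-1) over external nodes."""
    next_external = 0  # an inorder walk meets E_0, E_1, ..., E_n in this order

    def visit(node, level):
        nonlocal next_external
        if node is None:
            i = next_external
            next_external += 1
            return q[i] * (level - 1)
        left, i, right = node
        cost_left = visit(left, level + 1)
        cost_here = p[i] * level
        cost_right = visit(right, level + 1)
        return cost_left + cost_here + cost_right

    return visit(tree, 1)


tree_a = (((None, 1, None), 2, None), 3, None)    # while, if, do down the left
tree_b = ((None, 1, None), 2, (None, 3, None))    # if at the root
tree_c = (None, 1, (None, 2, (None, 3, None)))    # do, if, while down the right

# p[0] is unused so that p[i] corresponds to P(i)
p = [None, Fraction(1, 2), Fraction(1, 10), Fraction(1, 20)]
q = [Fraction(3, 20), Fraction(1, 10), Fraction(1, 20), Fraction(1, 20)]
for name, tree in [("a", tree_a), ("b", tree_b), ("c", tree_c)]:
    print(name, float(expected_cost(tree, p, q)))
```

With these probabilities the three costs come out as 2.65, 1.9 and 1.5, the values quoted in the example.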
To apply dynamic programming to the problem of obtaining an optimal binary search tree, we need to view the construction of such a tree as the result of a sequence of decisions and then observe that the principle of optimality holds when applied to the problem state resulting from a decision. A possible approach to this would be to make a decision as to which of the a_i's should be assigned to the root node of the tree. If we choose a_k, then it is clear that the internal nodes for a_1, a_2, ..., a_{k–1} as well as the external nodes for the classes E_0, E_1, ..., E_{k–1} will be in the left subtree l of the root. The remaining nodes will be in the right subtree r. Define
cost (l) = Σ_{1≤i<k} P(i) * level(a_i) + Σ_{0≤i<k} q(i) * (level(E_i) – 1)
and
cost (r) = Σ_{k<i≤n} P(i) * level(a_i) + Σ_{k≤i≤n} q(i) * (level(E_i) – 1)
In both cases the level is measured by regarding the root of the respective subtree to be at level 1.
Using w(i, j) to represent the sum q(i) + Σ_{l=i+1}^{j} (q(l) + P(l)), we obtain the following as the expected cost of the search tree (figure 3.5):
P(k) + cost (l) + cost (r) + w(0, k–1) + w(k, n)          (1)
Figure 3.5. An OBST with root a_k, left subtree l and right subtree r.
If the tree is optimal, then (1) must be minimum. Hence cost (l) must be minimum over all binary search trees containing a_1, a_2, ..., a_{k–1} and E_0, E_1, ..., E_{k–1}. Similarly, cost (r) must be minimum. If we use C(i, j) to represent the cost of an optimal binary search tree t_{ij} containing a_{i+1}, ..., a_j and E_i, ..., E_j, then for the tree to be optimal we must have cost (l) = C(0, k–1) and cost (r) = C(k, n). In addition, k must be chosen such that
P(k) + C(0, k–1) + C(k, n) + W(0, k–1) + W(k, n)
is minimum. Hence for C(0, n) we obtain
C(0, n) = min_{1≤k≤n} {C(0, k–1) + C(k, n) + P(k) + W(0, k–1) + W(k, n)}          (2)
We can generalize equation (2) to obtain for any C(i, j)
C(i, j) = min_{i<k≤j} {C(i, k–1) + C(k, j) + P(k) + W(i, k–1) + W(k, j)}
C(i, j) = min_{i<k≤j} {C(i, k–1) + C(k, j)} + W(i, j)          (3)
The second form follows because P(k) + W(i, k–1) + W(k, j) = W(i, j) for every k. Equation (3) can be solved for C(0, n) by first computing all C(i, j) such that j–i = 1 (note that C(i, i) = 0 and W(i, i) = q(i), 0 ≤ i ≤ n). Next we can compute all C(i, j) such that j–i = 2, then all C(i, j) with j–i = 3, and so on. If during this computation we record the root r(i, j) of each tree t_{ij}, then an optimal binary search tree can be constructed from these r(i, j). Note that r(i, j) is the value of k that minimizes equation (3).
Example
Let n = 4 and (a_1, a_2, a_3, a_4) = (do, if, int, while). Let P(1:4) = (3, 3, 1, 1) and q(0:4) = (2, 3, 1, 1, 1). The p's and q's have been multiplied by 16 for convenience. Initially, we have w(i, i) = q(i), c(i, i) = 0 and r(i, i) = 0, 0 ≤ i ≤ 4. Using equation (3) and the observation w(i, j) = p(j) + q(j) + w(i, j–1), we get
w(0, 1) = p(1) + q(1) + w(0, 0) = 8
c(0, 1) = w(0, 1) + min {c(0, 0) + c(1, 1)} = 8
r(0, 1) = 1
w(1, 2) = p(2) + q(2) + w(1, 1) = 7
c(1, 2) = w(1, 2) + min {c(1, 1) + c(2, 2)} = 7
r(1, 2) = 2
w(2, 3) = p(3) + q(3) + w(2, 2) = 3
c(2, 3) = w(2, 3) + min {c(2, 2) + c(3, 3)} = 3
r(2, 3) = 3
w(3, 4) = p(4) + q(4) + w(3, 3) = 3
c(3, 4) = w(3, 4) + min {c(3, 3) + c(4, 4)} = 3
r(3, 4) = 4
Knowing w(i, i+1) and c(i, i+1), 0 ≤ i < 4, we can again use equation (3) to compute w(i, i+2), c(i, i+2) and r(i, i+2), 0 ≤ i < 3. This process can be repeated until w(0, 4), c(0, 4), and r(0, 4) are obtained. The table of figure 3.6 shows the results of this computation. The box in row i and column j shows the values of w(j, j+i), c(j, j+i) and r(j, j+i) respectively. The computation is carried out by row from row 0 to row 4. From the table we see that c(0, 4) = 32 is the minimum cost of a binary search tree for (a_1, a_2, a_3, a_4). The root of tree t_04 is a_2. Hence, the left subtree is t_01 and the right subtree is t_24. Tree t_01 has root a_1 and subtrees t_00 and t_11. Tree t_24 has root a_3; its left subtree is t_22 and its right subtree is t_34. Thus, with the data in the table it is possible to reconstruct t_04. Figure 3.7 shows t_04.
Row   Column 0     Column 1     Column 2     Column 3     Column 4
0     w_00 = 2     w_11 = 3     w_22 = 1     w_33 = 1     w_44 = 1
      c_00 = 0     c_11 = 0     c_22 = 0     c_33 = 0     c_44 = 0
      r_00 = 0     r_11 = 0     r_22 = 0     r_33 = 0     r_44 = 0
1     w_01 = 8     w_12 = 7     w_23 = 3     w_34 = 3
      c_01 = 8     c_12 = 7     c_23 = 3     c_34 = 3
      r_01 = 1     r_12 = 2     r_23 = 3     r_34 = 4
2     w_02 = 12    w_13 = 9     w_24 = 5
      c_02 = 19    c_13 = 12    c_24 = 8
      r_02 = 1     r_13 = 2     r_24 = 3
3     w_03 = 14    w_14 = 11
      c_03 = 25    c_14 = 19
      r_03 = 2     r_14 = 2
4     w_04 = 16
      c_04 = 32
      r_04 = 2
Figure 3.6. Computation of c(0, 4), w(0, 4) and r(0, 4)
Figure 3.7. Optimal binary search tree: root if, with left child do and right child int; while is the right child of int.
The above example shows how equation (3) can be used to determine the c's and r's and also how to reconstruct t_0n knowing the r's. Let us examine the complexity of this procedure to evaluate the c's and r's. The evaluation procedure described in the above example requires us to compute c(i, j) for (j–i) = 1, 2, ..., n in that order. When j–i = m, there are n–m+1 c(i, j)'s to compute. The computation of each of these c(i, j)'s requires us to find the minimum of m quantities (see equation (3)). Hence, each such c(i, j) can be computed in time O(m). The total time to evaluate all the c(i, j)'s and r(i, j)'s is therefore
Σ_{1≤m≤n} (nm – m²) = O(n³)
We can do better than this using a result due to D.E. Knuth which shows that the optimal k in equation (3) can be found by limiting the search to the range r(i, j–1) ≤ k ≤ r(i+1, j). In this case the computing time becomes O(n²). The function OBST uses this result to obtain the values of w(i, j), r(i, j) and c(i, j), 0 ≤ i ≤ j ≤ n, in O(n²) time. The tree t_0n can be constructed from the values of r(i, j) in O(n) time.
OBST (p, q, n)
//given n distinct identifiers a_1 < a_2 < ... < a_n
//and probabilities p[i], 1 ≤ i ≤ n, and q[i],
//0 ≤ i ≤ n, this algorithm computes the
//cost c[i, j] of the optimal binary search
//tree t_ij for identifiers a_{i+1}, ..., a_j. It also
//computes r[i, j], the root of t_ij. w[i, j]
//is the weight of t_ij.
{
    for (i = 0; i <= n–1; i++)
    {
        //initialize
        w[i, i] = q[i];
        r[i, i] = 0;
        c[i, i] = 0.0;
        //optimal trees with one node
        w[i, i+1] = q[i] + q[i+1] + p[i+1];
        r[i, i+1] = i+1;
        c[i, i+1] = q[i] + q[i+1] + p[i+1];
    }
    w[n, n] = q[n];
    r[n, n] = 0;
    c[n, n] = 0.0;
    for (m = 2; m <= n; m++) //find optimal trees with m nodes
        for (i = 0; i <= n–m; i++)
        {
            j = i+m;
            w[i, j] = w[i, j–1] + p[j] + q[j];
            //solve equation (3) using Knuth's result:
            //k is a value of l in the range r[i, j–1] ≤ l
            //≤ r[i+1, j] that minimizes
            //c[i, l–1] + c[l, j]
            k = Find (c, r, i, j);
            c[i, j] = w[i, j] + c[i, k–1] + c[k, j];
            r[i, j] = k;
        }
    write (c[0, n], w[0, n], r[0, n]);
} //end OBST
Find (c, r, i, j)
{
    min = ∞;
    for (m = r[i, j–1]; m <= r[i+1, j]; m++)
    {
        if (c[i, m–1] + c[m, j] < min)
        {
            min = c[i, m–1] + c[m, j];
            l = m;
        }
    }
    return (l);
}
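The function OBST can be sketched in Python and run on the earlier example with P(1:4) = (3, 3, 1, 1) and q(0:4) = (2, 3, 1, 1, 1). The sketch folds Find into the main loop and pads p with an unused entry at index 0 so that p[i] corresponds to P(i); both choices, and the function name, are assumptions made for illustration.

```python
def obst(p, q):
    """Return tables (w, c, r) for weights p[1..n] (p[0] unused) and q[0..n]."""
    n = len(q) - 1
    w = [[0] * (n + 1) for _ in range(n + 1)]
    c = [[0] * (n + 1) for _ in range(n + 1)]
    r = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        w[i][i] = q[i]                     # empty trees: c = 0, r = 0
    for i in range(n):
        # optimal trees with one node
        w[i][i + 1] = q[i] + q[i + 1] + p[i + 1]
        c[i][i + 1] = w[i][i + 1]
        r[i][i + 1] = i + 1
    for m in range(2, n + 1):              # optimal trees with m nodes
        for i in range(n - m + 1):
            j = i + m
            w[i][j] = w[i][j - 1] + p[j] + q[j]
            # Knuth's result: search only r[i][j-1] <= k <= r[i+1][j]
            k = min(range(r[i][j - 1], r[i + 1][j] + 1),
                    key=lambda k: c[i][k - 1] + c[k][j])
            c[i][j] = w[i][j] + c[i][k - 1] + c[k][j]
            r[i][j] = k
    return w, c, r


w, c, r = obst([0, 3, 3, 1, 1], [2, 3, 1, 1, 1])
print(w[0][4], c[0][4], r[0][4])  # 16 32 2
```

The printed values agree with the last row of figure 3.6: w(0, 4) = 16, c(0, 4) = 32 and r(0, 4) = 2.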
Student Activity 3.3
Answer the following questions:
1. Explain the need for optimal binary search tree.
2. Give the formula to find expected cost of a binary search tree.
Summary
• The principle of optimality states that an optimal sequence of decisions has the property that whatever the initial state and decision are, the remaining decisions must constitute an optimal decision sequence with regard to the state resulting from the first decision.
• Rather than computing the optimal costs with a recursive solution, we compute them using a bottom-up approach.
Self-assessment Questions
Solved Exercise
I. True and False
1. In greedy method only one decision sequence is ever generated
2. Principle of optimality does not hold for dynamic programming
II. Fill in the blanks
1. Matrix chain multiplication problem can be solved by _______________.
2. To make binary search tree more efficient we need to find ___________________.
Answers
I. True and False
1. True
2. False
II. Fill in the blanks
1. Dynamic programming
2. Optimal binary search tree
Unsolved Exercise
1. Fill in the blanks:
(a) The essential difference between the greedy method and ______ is that in the _______ method
only one decision sequence is ever generated.
(b) An __________ to an instance of the matrix-chain multiplication problem contains within it optimal
solutions to subproblem instances.
(c) ____________ is an algorithm design method that can be used when the solution to a problem
can be viewed as the result of a sequence of decisions.
(d) In dynamic programming an optimal sequence of decisions is obtained by making implicit appeal
to the _________.
Detailed Questions
1. Find an optimal parenthesization of a matrix-chain product whose sequence of dimensions is (5, 10,
3, 12, 5, 50, 6).
2. Give an efficient algorithm Print-Optimal-Parens to print the optimal parenthesization of a matrix
chain given the table s computed by Matrix-Chain-Order. Analyze your algorithm.
3. Show that a full parenthesization of an n-element expression has exactly n-1 pairs of parentheses.
4. Let R(i, j) be the number of times that table entry m[i, j] is referenced by Matrix-Chain-Order in
computing other table entries. Show that the total number of references for the entire table is
\sum_{i=1}^{n} \sum_{j=i}^{n} R(i, j) = \frac{n^3 - n}{3}
(Hint: You may find the identity \sum_{i=1}^{n} i^2 = n(n+1)(2n+1)/6 useful.)
5. Use the function OBST to compute w(i, j), r(i, j), and c(i, j), 0 ≤ i ≤ j ≤ 4, for the identifier set
(a_1, a_2, a_3, a_4) = (cout, float, if, while) with
p(1)=1/20, p(2)=1/5, p(3)=1/10, p(4)=1/20,
q(0)=1/5, q(1)=1/10, q(2)=1/5, q(3)=1/20 and q(4)=1/20.
Using the r(i, j)’s, construct the optimal binary search tree.
6. (a) Show that the computing time of function OBST is O(n^2).
(b) Write an algorithm to construct the optimal binary search tree given the roots
r(i, j), 0 ≤ i ≤ j ≤ n. Show that this can be done in time O(n).
Overview
Polynomial Time
NP-completeness and Reducibility
NP-completeness
NP-completeness Proofs
NP-complete Problems
Unit 4
NP Complete Problem
Learning Objectives
• Overview
• Polynomial-time
• NP-Completeness and Reducibility
• NP-Completeness Proofs
• NP-Complete Problems
Overview
All of the algorithms we have studied thus far have been polynomial-time algorithms: on inputs of size n,
their worst-case running time is O(n^k), where k is a constant. Can all problems be solved in polynomial
time? The answer is no. Some problems, such as Turing's famous "Halting Problem," cannot be solved by any
computer, no matter how much time is provided. Other problems can be solved, but not in time O(n^k) for
any constant k. Problems that are solvable by polynomial-time algorithms are called tractable, and problems
that require superpolynomial time are called intractable.
In this chapter we discuss an interesting class of problems, called the "NP-complete" problems, whose
status is unknown. No polynomial-time algorithm has yet been discovered for an NP-complete problem, nor
has anyone been able to prove a superpolynomial-time lower bound for any of them. This so-called P ≠ NP
question has been one of the deepest, most perplexing open research problems in theoretical computer
science since it was posed in 1971.
Many scientists believe that the NP-complete problems cannot be solved in polynomial time, i.e. that they
are intractable. The reason is that if any single NP-complete problem can be solved in polynomial time, then
every NP-complete problem can be solved in polynomial time, yet no such algorithm has been found for any
of them.
If you want to design good algorithms you should understand the rudiments of the theory of
NP-completeness. If a problem can be shown to be NP-complete, that is good evidence of its intractability,
and as an engineer you would then do better spending your time developing an approximation algorithm
rather than searching for a fast algorithm that solves the problem exactly. Moreover, some problems that
seem no harder than sorting, graph searching or network flow are in fact NP-complete. Hence we should
become familiar with this important class of problems.
Student Activity 4.1
Before going to the next section, answer the following questions:
1. Give the formal definition for the problem of finding the longest simple cycle in an undirected graph.
Give a related decision problem. Give the language corresponding to the decision problem.
2. What are abstract problems?
If your answers are correct, then proceed to the next section.
Polynomial Time
We begin our study of NP-completeness by defining polynomial-time solvable problems. Generally these
problems are regarded as tractable. The reason is a philosophical rather than a mathematical issue. We give
three supporting arguments here.
First, although it is reasonable to regard a problem that requires time Θ(n^100) as intractable, there are
very few practical problems that require time on the order of such a high-degree polynomial. The practical
polynomial-time problems require much less time.
Second, for many reasonable models of computation, a problem that can be solved in polynomial time in one
model can also be solved in polynomial time in another.
Third, the class of polynomial-time solvable problems has closure properties. For example, if we feed the
output of one polynomial-time algorithm into the input of another, the resulting algorithm is polynomial. If
a polynomial-time algorithm makes a constant number of calls to polynomial-time subroutines, the running
time of the composite algorithm is polynomial.
Abstract Problems
To make clear the class of polynomial-time solvable problems, we first define what a problem is. We define
an abstract problem Q to be a binary relation on a set I of problem instances and a set S of problem
solutions. For example, consider the problem SHORTEST-PATH of finding a shortest path between two
given vertices in an unweighted undirected graph G = (V,E). An instance of SHORTEST-PATH is a triple
consisting of a graph and two vertices. A solution is a sequence of vertices in the graph (with the empty
sequence denoting that no path exists). The problem SHORTEST-PATH itself is the relation that associates
each instance of a graph and two vertices with a shortest path in the graph that joins the two vertices. An
instance can have more than one solution because shortest paths are not necessarily unique.
This formulation of an abstract problem is sufficient for our purposes. For simplicity, the theory of
NP-completeness restricts attention to decision problems: those having a yes/no solution. In this case we can
view an abstract decision problem as a function that maps the instance set I to the solution set {0,1}. For
example, a decision problem PATH related to the shortest-path problem is: "Given a graph G = (V,E), two
vertices p, q ∈ V, and a positive integer k, does a path exist in G between p and q whose length is at most
k?" If i = (G,p,q,k) is an instance of this problem, then PATH(i) = 1 (yes) if a shortest path from p to q
has length at most k, and PATH(i) = 0 (no) otherwise.
Certain other abstract problems, called optimization problems, require some value to be minimized or
maximized; these are not decision problems. If we want to apply the theory of NP-completeness to
optimization problems, we must recast them as decision problems. Typically, an optimization problem can be
recast by imposing a bound on the value to be optimized. For example, in recasting the shortest-path
problem as the decision problem PATH, we added a bound k to the problem instance.
The requirement to recast optimization problems as decision problems does not diminish the impact of the
theory. Generally, if we are able to solve an optimization problem quickly, we will also be able to solve its
related decision problem quickly: we simply compare the value obtained from the solution of the
optimization problem with the bound provided as input to the decision problem. If an optimization problem
is easy, therefore, its related decision problem is easy as well. Stated in a way that has more relevance to
NP-completeness, if we can provide evidence that a decision problem is hard, we also provide evidence that
its related optimization problem is hard. Thus, even though it restricts attention to decision problems, the
theory of NP-completeness applies much more widely.
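The relationship between the optimization problem and its decision version can be sketched in Python. This is only an illustration under assumptions made here: the graph is given as an adjacency dictionary, and the function names shortest_path_length and path are hypothetical.

```python
from collections import deque


def shortest_path_length(adj, p, q):
    """Optimization version: BFS distance from p to q, or None if unreachable."""
    dist = {p: 0}
    queue = deque([p])
    while queue:
        u = queue.popleft()
        if u == q:
            return dist[u]
        for v in adj.get(u, ()):
            if v not in dist:          # first visit gives the shortest distance
                dist[v] = dist[u] + 1
                queue.append(v)
    return None


def path(adj, p, q, k):
    """Decision version PATH: 1 if some p-q path has length at most k, else 0."""
    d = shortest_path_length(adj, p, q)
    return 1 if d is not None and d <= k else 0


# Hypothetical undirected graph: a - b - c - d, with e isolated.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"], "e": []}
print(path(adj, "a", "d", 3), path(adj, "a", "d", 2), path(adj, "a", "e", 10))  # prints: 1 0 0
```

The decision procedure does nothing more than compare the optimal value with the bound k, which is why an easy optimization problem yields an easy decision problem.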
Encodings
If we want to make a computer program that can solve an abstract problem, we have to represent instances
in a way that the program understands. An encoding of a set S of abstract objects is a mapping e from S to
the set of binary strings. For example, the natural numbers N = {0,1,2,3,4,...} can be encoded as the
strings {0,1,10,11,100,...}. By this encoding, e(17) = 10001. Anyone who has looked at computer
representations of keyboard characters is familiar with either the ASCII or EBCDIC codes. In the ASCII
code, e(A) = 1000001. Even a compound object can be encoded as a binary string by combining the
representations of its constituent parts. Polygons, graphs, functions, ordered pairs, and programs can all
be encoded as binary strings.
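A few lines of Python reproduce the two encodings mentioned above. The pair encoding is one possible scheme chosen here for illustration (each bit is doubled and the two parts are separated by 01); it is an assumption, not a scheme prescribed by the text.

```python
def encode_natural(n):
    """Binary encoding of a natural number, e.g. 17 -> '10001'."""
    return format(n, "b")


def encode_char(ch):
    """7-bit ASCII encoding of a character, e.g. 'A' -> '1000001'."""
    return format(ord(ch), "07b")


def encode_pair(a, b):
    """Hypothetical encoding of an ordered pair of naturals.

    Every bit is doubled (0 -> 00, 1 -> 11) so that the separator 01
    cannot be confused with part of a number.
    """
    def double(bits):
        return "".join(bit * 2 for bit in bits)
    return double(encode_natural(a)) + "01" + double(encode_natural(b))


print(encode_natural(17))   # prints: 10001
print(encode_char("A"))     # prints: 1000001
print(encode_pair(2, 3))    # prints: 1100011111
```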
Hence a computer algorithm that solves some abstract decision problem takes an encoding of a problem
instance as input. A concrete problem is a problem whose instance set is the set of binary strings. An
algorithm solves a concrete problem in time O(T(n)) if, when it is provided a problem instance i of length
n = |i|, the algorithm can produce the solution in at most O(T(n)) time. A concrete problem is therefore
polynomial-time solvable if there exists an algorithm to solve it in time O(n^k) for some constant k.
We now define the complexity class P as the set of concrete decision problems that can be solved in
polynomial time.
Encodings can be used to map abstract problems to concrete problems. Given an abstract decision problem Q
mapping an instance set I to {0,1}, an encoding e : I → {0,1}* can be used to induce a related concrete
decision problem, which we denote by e(Q). If the solution to an abstract-problem instance i ∈ I is
Q(i) ∈ {0,1}, then the solution to the concrete-problem instance e(i) ∈ {0,1}* is also Q(i). There may be
some binary strings that represent no meaningful abstract-problem instance. For convenience, we shall
assume that any such string is mapped arbitrarily to 0. Thus the concrete problem produces the same
solutions as the abstract problem on binary-string instances that represent the encodings of
abstract-problem instances.
We would like to generalize the definition of polynomial-time solvability from concrete problems to abstract
problems using encodings as the bridge, while keeping the definition independent of any particular encoding.
That is, the efficiency of solving a problem should not depend on how the problem is encoded.
Unfortunately, it depends quite heavily. For example, suppose that an integer k is to be provided as the sole
input to an algorithm, and suppose that the running time of the algorithm is Θ(k). If the integer k is
provided in unary (a string of k 1's), then the running time of the algorithm is O(n) on length-n inputs,
which is polynomial time. If we use the more natural binary representation of the integer k, then the
input length is n = ⌊lg k⌋ + 1. In this case, the running time of the algorithm is Θ(k) = Θ(2^n), which is
exponential in the size of the input. Thus, depending on the encoding, the algorithm runs in either
polynomial or superpolynomial time.
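The gap between the two input lengths is easy to see numerically. The following Python lines are a small illustration; the helper names are hypothetical, and the built-in int.bit_length() is used for ⌊lg k⌋ + 1.

```python
def unary_length(k):
    """Length of k written in unary: a string of k 1's."""
    return k


def binary_length(k):
    """Length of k written in binary, i.e. floor(lg k) + 1 for k >= 1."""
    return k.bit_length()


for k in (8, 1024, 10**6):
    n = binary_length(k)
    # An algorithm taking about k steps is linear in the unary length,
    # but needs at least 2^(n-1) steps in terms of the binary length n.
    print(f"k={k}: unary length {unary_length(k)}, binary length {n}, 2^(n-1) = {2 ** (n - 1)}")
```

For k = 1024 the unary input has 1024 symbols while the binary input has only 11, so the same k steps count as linear in one case and exponential in the other.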
The encoding of an abstract problem is therefore important to our understanding of polynomial time. We
cannot really talk about solving an abstract problem without first specifying an encoding. In practice,
however, if we rule out expensive encodings such as unary ones, the actual encoding of a problem makes little
difference to whether the problem can be solved in polynomial time. For example, representing integers in
base 6 instead of binary has no effect on whether a problem is solvable in polynomial time, since an integer
represented in base 6 can be converted to an integer represented in base 2 in polynomial time.
A function f : {0,1}* → {0,1}* is polynomial-time computable if there exists a polynomial-time algorithm A
that, for any input x ∈ {0,1}*, produces as output f(x). For some set I of problem instances, two encodings
e_1 and e_2 are polynomially related if there exist two polynomial-time computable functions f_12 and f_21
such that for any i ∈ I, we have f_12(e_1(i)) = e_2(i) and f_21(e_2(i)) = e_1(i). That is, the encoding
e_2(i) can be computed from the encoding e_1(i) by a polynomial-time algorithm, and vice versa.
Lemma 1
Let Q be an abstract decision problem on an instance set I, and let e_1 and e_2 be polynomially related
encodings on I. Then e_1(Q) is in P if and only if e_2(Q) is in P.
A formal language framework
We focus on decision problems because they make it easy to use the machinery of formal-language theory.
We now define some terminology. An alphabet Σ is a finite set of symbols. A language L over Σ is any set of
strings made up of symbols from Σ. For example, if Σ = {a,b}, the set L = {aa, bb, ab, ...} is a language.
We denote the empty string by ε and the empty language by ∅. The language of all strings over Σ is denoted
by Σ*. For example, if Σ = {a,b}, then Σ* = {ε, a, b, ab, abb, baba, ...} is the set of all strings of a’s
and b’s. Hence every language L over Σ is a subset of Σ*.
We can perform a variety of operations on languages. Set-theoretic operations, such as union and
intersection, follow directly from the set definitions. The complement of L is defined as Σ* - L. The
concatenation of two languages L_1 and L_2 is the language
L = {xy : x ∈ L_1 and y ∈ L_2}.
The closure or Kleene star of a language L is
L* = {ε} ∪ L ∪ L^2 ∪ L^3 ∪ ...,
where L^k is the language obtained by concatenating L to itself k times.
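These operations can be tried out on small finite languages with Python sets. Because Σ* and L* are infinite, the sketch below truncates them at a maximum string length; that bound and the function names are assumptions made for illustration.

```python
from itertools import product


def all_strings(alphabet, max_len):
    """All strings over the alphabet of length at most max_len (a finite part of Sigma*)."""
    return {"".join(t) for n in range(max_len + 1)
            for t in product(sorted(alphabet), repeat=n)}


def complement(lang, alphabet, max_len):
    """Sigma* - L, restricted to strings of length at most max_len."""
    return all_strings(alphabet, max_len) - lang


def concat(l1, l2):
    """{xy : x in L1 and y in L2}."""
    return {x + y for x in l1 for y in l2}


def kleene_star(lang, max_len):
    """{eps} U L U L^2 U ..., restricted to strings of length at most max_len."""
    result = {""}
    frontier = {""}
    while frontier:
        frontier = {s for s in concat(frontier, lang) if len(s) <= max_len} - result
        result |= frontier
    return result


print(sorted(concat({"a", "ab"}, {"b"})))     # prints: ['ab', 'abb']
print(sorted(kleene_star({"ab"}, 4)))         # prints: ['', 'ab', 'abab']
print(sorted(complement({"a"}, {"a", "b"}, 1)))  # prints: ['', 'b']
```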
The set of instances for any decision problem Q is simply the set Σ*, where Σ = {0,1}. Since Q is completely
characterized by those problem instances that produce a 1 (yes) answer, we can think of Q as a language
L over Σ = {0,1}, where
L = {x ∈ Σ* : Q(x) = 1}.
For example, the decision problem PATH has the corresponding language
PATH = {⟨G, u, v, k⟩ : G = (V, E) is an undirected graph, u, v ∈ V, k ≥ 0 is an integer, and
there exists a path from u to v in G whose length is at most k}.
By using the formal-language framework we are able to express concisely the relation between decision
problems and the algorithms that solve them. An algorithm A is said to accept a string x ∈ {0,1}* if, given
input x, the output of the algorithm is A(x) = 1. The language accepted by an algorithm A is the set
L = {x ∈ {0,1}* : A(x) = 1}, that is, the set of strings that the algorithm accepts. An algorithm A rejects a
string x if A(x) = 0.
Even if a language L is accepted by an algorithm A, the algorithm will not necessarily reject a string x ∉ L
provided as input to it; for example, the algorithm may loop forever. A language L is decided by an
algorithm A if every binary string in L is accepted by A and every binary string not in L is rejected by A.
A language L is accepted in polynomial time by an algorithm A if, for any length-n string x ∈ L, the
algorithm accepts x in time O(n^k) for some constant k. A language L is decided in polynomial time by an
algorithm A if, for any length-n string x ∈ {0,1}*, the algorithm decides x in time O(n^k) for some constant
k. Thus, to accept a language, an algorithm need only worry about strings in L, but to decide a language, it
must accept or reject every string in {0,1}*.
As an example, the language PATH can be accepted in polynomial time. One polynomial-time accepting
algorithm uses breadth-first search to compute the shortest path from u to v in G, and then compares the
distance obtained with k. If the distance is at most k, the algorithm outputs 1 and halts. Otherwise, the
algorithm runs forever. This algorithm does not decide PATH, however, since it does not explicitly output 0
for instances in which the shortest path has length greater than k. A decision algorithm for PATH must
explicitly reject binary strings that do not belong to PATH. For a decision problem such as PATH, such an
algorithm is not difficult to design.
A complexity class is defined as a set of languages, membership in which is determined by a complexity
measure, such as running time, of an algorithm that determines whether a given string x belongs to
language L.
With the help of this language-theoretic framework, we can provide an alternative definition of the
complexity class P:
P = {L ⊆ {0,1}* : there exists an algorithm A that decides L in polynomial time}.
P is also the class of languages that can be accepted in polynomial time.
Theorem
P = {L : L is accepted by a polynomial-time algorithm}.
Proof
Since the class of languages decided by polynomial-time algorithms is a subset of the class of languages
accepted by polynomial-time algorithms, we need only show that if L is accepted by a polynomial-time
algorithm, it is decided by a polynomial-time algorithm. Let L be the language accepted by some
polynomial-time algorithm B. Because B accepts L in time O(n^k) for some constant k, there also exists a
constant c such that B accepts L in at most T = cn^k steps. For any input string x, the algorithm B’
simulates the action of B for time T. At the end of time T, algorithm B’ inspects the behavior of B. If B has
accepted x, then B’ accepts x by outputting a 1. If B has not accepted x, then B’ rejects x by outputting a 0.
The overhead of B’ simulating B does not increase the running time by more than a polynomial factor, and
thus B’ is a polynomial-time algorithm that decides L.
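The construction of B’ in the proof can be mimicked in Python. This is a toy model under stated assumptions: an accepting algorithm is represented as a generator that yields once per step and returns True when it accepts, and the names decide_within_budget and accept_even_ones are hypothetical.

```python
def accept_even_ones(x):
    """Toy acceptor B: accepts strings with an even number of 1s.

    It yields once per simulated step. On strings it does not accept,
    it never halts, as an accepting algorithm is allowed to do.
    """
    ones = 0
    for ch in x:
        if ch == "1":
            ones += 1
        yield
    if ones % 2 == 0:
        return True
    while True:
        yield


def decide_within_budget(acceptor, x, c, k):
    """B': simulate the acceptor for at most T = c * n^k steps, then decide."""
    budget = c * max(len(x), 1) ** k
    run = acceptor(x)
    for _ in range(budget):
        try:
            next(run)
        except StopIteration as stop:
            return 1 if stop.value else 0   # B halted within the budget
    return 0                                 # B did not accept within T steps


print(decide_within_budget(accept_even_ones, "1010", c=2, k=1))  # prints: 1
print(decide_within_budget(accept_even_ones, "111", c=2, k=1))   # prints: 0
```

The constants c = 2 and k = 1 are chosen so that the budget covers every accepting run of this particular acceptor, which needs n + 1 steps on an input of length n.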
Student Activity 4.2
Before going to the next section, answer the following questions:
1. What are abstract problems?
2. Describe a formal language.
If your answers are correct, then proceed to the next section.
NP-completeness and Reducibility
The reason that theoretical computer scientists believe that P ≠ NP is the existence of the class of
"NP-complete" problems. This class has the interesting property that if any one NP-complete problem can be
solved in polynomial time, then every problem in NP has a polynomial-time solution, that is, P = NP. Despite
decades of study, no polynomial-time algorithm has ever been discovered for any NP-complete problem.
The language HAM-CYCLE is one NP-complete problem. If we could decide HAM-CYCLE in polynomial
time, then we could solve every problem in NP in polynomial time. In fact, if NP - P should turn out to be
nonempty, we could say with certainty that HAM-CYCLE ∈ NP - P.
Figure 4.1. An illustration of a polynomial-time reduction from a language L_1 to a language L_2 via a
reduction function f. For any input x ∈ {0,1}*, the question of whether x ∈ L_1 has the same answer as the
question of whether f(x) ∈ L_2.
The NP-complete languages are the "hardest" languages in NP. We shall show how to compare the relative
"hardness" of languages using "polynomial-time reducibility." First, we formally define the NP-complete
languages, and then we sketch a proof that one such language, called CIRCUIT-SAT, is NP-complete. We
use the notion of reducibility to show that many other problems are NP-complete.
Reducibility
A problem P can be reduced to another problem P’ if any instance of P can be "easily rephrased" as an
instance of P’, the solution to which provides a solution to the instance of P. For example, the problem of
solving linear equations in an indeterminate x reduces to the problem of solving quadratic equations. Given
an instance ax + b = 0, we can transform it to 0x^2 + ax + b = 0, whose solution provides a solution to
ax + b = 0. Thus, if a problem P reduces to another problem P’, then we say that P is "no harder to solve"
than P’.
Returning to our formal-language framework for decision problems, a language L_1 is said to be
polynomial-time reducible to a language L_2, written L_1 ≤_P L_2, if there exists a polynomial-time
computable function f : {0,1}* → {0,1}* such that for all x ∈ {0,1}*,
x ∈ L_1 if and only if f(x) ∈ L_2.    (1)
The function f is called the reduction function, and a polynomial-time algorithm F that computes f is known
as a reduction algorithm.
The idea of a polynomial-time reduction from a language L_1 to another language L_2 is illustrated in
Figure 4.1. Each language is a subset of {0,1}*. The reduction function f gives a polynomial-time mapping
such that if x ∈ L_1, then f(x) ∈ L_2. Also, if x ∉ L_1, then f(x) ∉ L_2. Thus, the reduction function maps
any instance x of the decision problem represented by the language L_1 to an instance f(x) of the problem
represented by L_2. Providing an answer to whether f(x) ∈ L_2 directly provides the answer to whether
x ∈ L_1.
Figure 4.2. The proof of Lemma 2. The algorithm F is a reduction algorithm that computes the reduction
function f from L_1 to L_2 in polynomial time, and A_2 is a polynomial-time algorithm that decides L_2.
Illustrated is an algorithm A_1 that decides whether x ∈ L_1 by using F to transform any input x into f(x)
and then using A_2 to decide whether f(x) ∈ L_2.
Polynomial-time reductions give us a powerful tool for proving that various languages belong to P.
Lemma 2
If L_1, L_2 ⊆ {0,1}* are languages such that L_1 ≤_P L_2, then L_2 ∈ P implies L_1 ∈ P.
Proof Let B_2 be a polynomial-time algorithm that decides L_2, and let F be a polynomial-time reduction
algorithm that computes the reduction function f. We shall construct a polynomial-time algorithm B_1 that
decides L_1.
The construction of B_1 is given in Figure 4.2, where the two deciding algorithms are labelled A_1 and A_2.
For a given input x ∈ {0,1}*, the algorithm B_1 uses F to transform x into f(x), and then it uses B_2 to test
whether f(x) ∈ L_2. The output of B_2 is the value provided as the output from B_1.
The algorithm B_1 runs in polynomial time, since both F and B_2 run in polynomial time.
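The construction in this proof is simply function composition, which a toy Python example makes concrete. The two languages and the reduction below are hypothetical choices made for illustration: L_1 is the set of binary strings with an even number of 1s, L_2 is the set of binary strings of even length, and f deletes every 0.

```python
def reduce_f(x):
    """Reduction function f: delete every 0, keeping only the 1s (linear time)."""
    return x.replace("0", "")


def decide_l2(y):
    """B_2: decides L_2 = {strings of even length}."""
    return 1 if len(y) % 2 == 0 else 0


def decide_l1(x):
    """B_1: decides L_1 = {strings with an even number of 1s} as B_2(f(x))."""
    return decide_l2(reduce_f(x))


print(decide_l1("1001"), decide_l1("1011"))   # prints: 1 0
```

Here x has an even number of 1s exactly when f(x) has even length, so condition (1) holds, and B_1 inherits its polynomial running time from F and B_2.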
Student Activity 4.3
Before going to the next section, answer the following questions:
1. What do you mean by NP-completeness?
2. What is reducibility?
If your answers are correct, then proceed to the next section.
NP-completeness
Polynomial-time reductions give a formal means for showing that one problem is at least as hard as
another, to within a polynomial-time factor. That is, if L_1 ≤_P L_2, then L_1 is not more than a
polynomial factor harder than L_2, which is why the "less than or equal to" notation for reduction is
mnemonic. We now define the set of NP-complete languages, which are the hardest problems in NP.
A language L ⊆ {0,1}* is NP-complete if
1. L ∈ NP, and
2. L_1 ≤_P L for every L_1 ∈ NP.
Figure 4.3. How most theoretical computer scientists view the relationship among P, NP, and NPC. Both
P and NPC are wholly contained within NP, and P ∩ NPC = ∅.
If a language L satisfies property 2, but not necessarily property 1, we say that L is NP-hard. We also
define NPC to be the class of NP-complete languages.
As the following theorem shows, NP-completeness is at the crux of deciding whether P is in fact equal to
NP.
Theorem
If any NP-complete problem is polynomial-time solvable, then P = NP. If any problem in NP is not
polynomial-time solvable, then no NP-complete problem is polynomial-time solvable.
Proof Suppose that L ∈ P and also that L ∈ NPC. For any L_1 ∈ NP, we have L_1 ≤_P L by property 2 of
the definition of NP-completeness. Thus, by Lemma 2, we also have L_1 ∈ P, which proves the first
statement of the theorem.
To prove the second statement, suppose that there exists an L ∈ NP such that L ∉ P. Let L_1 ∈ NPC be any
NP-complete language and, for the purpose of contradiction, assume that L_1 ∈ P. Then, since L ≤_P L_1 by
property 2, Lemma 2 gives L ∈ P, which contradicts L ∉ P.
It is for this reason that research into the P ≠ NP question centers around the NP-complete problems. Most
theoretical computer scientists believe that P ≠ NP, which leads to the relationships among P, NP, and NPC
shown in Figure 4.3. But for all we know, someone may come up with a polynomial-time algorithm for an
NP-complete problem, thus proving that P = NP. Nevertheless, since no polynomial-time algorithm for any
NP-complete problem has yet been discovered, a proof that a problem is NP-complete provides excellent
evidence for its intractability.
Circuit Satisfiability
Up to this point we have defined the notion of an NP-complete problem, but we have not actually proved that
any problem is NP-complete. Once we prove that at least one problem is NP-complete, polynomial-time
reducibility can be used as a tool to prove the NP-completeness of other problems. So we now focus on
showing the existence of an NP-complete problem: the circuit-satisfiability problem.
Figure 4.4. Two instances of the circuit-satisfiability problem. (a) The assignment x_1 = 1, x_2 = 1, x_3 = 0
to the inputs of this circuit causes the output of the circuit to be 1. The circuit is therefore satisfiable.
(b) No assignment to the inputs of this circuit can cause the output of the circuit to be 1. The circuit is
therefore unsatisfiable.
We shall informally describe a proof that relies on a basic understanding of boolean combinational circuits.
Two boolean combinational circuits are shown in Figure 4.4. Each circuit has three inputs and a single
output. A truth assignment for a circuit is a set of boolean input values for that circuit. We say that a
one-output boolean combinational circuit is satisfiable if it has a satisfying assignment: a truth assignment
that causes the output of the circuit to be 1. For example, the circuit in Figure 4.4(a) has the satisfying
assignment x_1 = 1, x_2 = 1, x_3 = 0, and so it is satisfiable. No assignment of values to x_1, x_2, and x_3
causes the circuit in Figure 4.4(b) to produce a 1 output; it always produces 0, and so it is unsatisfiable.
We now state the circuit-satisfiability problem as: "Given a boolean combinational circuit composed of
AND, OR, and NOT gates, is it satisfiable?" In order to pose this question formally, however, we must
agree on a standard encoding for circuits. We can devise a graph-like encoding that maps any given circuit C
into a binary string ⟨C⟩ whose length is not much larger than the size of the circuit itself. As a formal
language, we can therefore define
CIRCUIT-SAT = {⟨C⟩ : C is a satisfiable boolean combinational circuit}.
The circuit-satisfiability problem is of great importance in the area of computer-aided hardware
optimization. If a circuit always produces 0, it can be replaced by a simpler circuit that omits all logic gates
and provides the constant 0 value as its output. A polynomial-time algorithm for the problem would
therefore have considerable practical application.
Given a circuit C, we might attempt to determine whether it is satisfiable by simply checking all possible
assignments to the inputs. But if there are k inputs, there are 2^k possible assignments. When the size of C
is polynomial in k, checking each one leads to a superpolynomial-time algorithm. In fact, as has been
claimed, there is strong evidence that no polynomial-time algorithm exists that solves the
circuit-satisfiability problem, because circuit satisfiability is NP-complete. We break the proof of this fact
into two parts, based on the two parts of the definition of NP-completeness.
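The exhaustive check described above is short to write down, which makes its 2^k cost easy to see. In this Python sketch a circuit is modelled as an ordinary function of k bits; the two example circuits are hypothetical stand-ins and are not the circuits of Figure 4.4.

```python
from itertools import product


def brute_force_satisfy(circuit, k):
    """Try all 2^k input assignments; return a satisfying one, or None."""
    for bits in product((0, 1), repeat=k):
        if circuit(*bits) == 1:
            return bits
    return None


def circuit_a(x1, x2, x3):
    """Hypothetical satisfiable circuit: x1 AND x2 AND (NOT x3)."""
    return (x1 & x2) & (1 - x3)


def circuit_b(x1, x2, x3):
    """Hypothetical unsatisfiable circuit: x1 AND x2 AND (NOT x1)."""
    return (x1 & x2) & (1 - x1)


print(brute_force_satisfy(circuit_a, 3))   # prints: (1, 1, 0)
print(brute_force_satisfy(circuit_b, 3))   # prints: None
```

For an unsatisfiable circuit the loop runs through all 2^k assignments before giving up, so the running time doubles with every additional input.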
Lemma 3
The circuit-satisfiability problem is in the class NP.
Proof: We give a two-input, polynomial-time algorithm B that can verify CIRCUIT-SAT. One of the inputs
to B is a boolean combinational circuit C. The other input is a certificate corresponding to an
assignment of boolean values to the wires in C.
The algorithm B is constructed as follows. For each logic gate in the circuit, it checks that the value
provided by the certificate on the output wire is correctly computed as a function of the values on the input
wires. Then, if the output of the entire circuit is 1, the algorithm outputs 1, since the values assigned to the
inputs of C provide a satisfying assignment. Otherwise, B outputs 0.
Whenever a satisfiable circuit C is input to algorithm B, there is a certificate whose length is polynomial in
the size of C and that causes B to output a 1. Whenever an unsatisfiable circuit is input, no certificate can
fool B into believing that the circuit is satisfiable. Algorithm B runs in polynomial time: with a good
implementation, linear time suffices. Thus CIRCUIT-SAT can be verified in polynomial time, and
CIRCUIT-SAT ∈ NP.
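The verification algorithm B can be sketched in Python. The circuit representation used here (a list of gates, each given as an operation, its input wires and its output wire) and the wire names are assumptions made for illustration; the certificate is a dictionary assigning 0 or 1 to every wire.

```python
GATES = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "NOT": lambda a: 1 - a,
}


def verify_circuit(gates, output_wire, certificate):
    """Return 1 if the certificate is a consistent wire assignment with output 1."""
    for op, inputs, out in gates:
        try:
            values = [certificate[wire] for wire in inputs]
            claimed = certificate[out]
        except KeyError:
            return 0                       # certificate misses a wire
        if GATES[op](*values) != claimed:
            return 0                       # gate output not correctly computed
    return 1 if certificate.get(output_wire) == 1 else 0


# Hypothetical circuit: out = (x1 AND x2) AND (NOT x3).
gates = [
    ("AND", ("x1", "x2"), "w1"),
    ("NOT", ("x3",), "w2"),
    ("AND", ("w1", "w2"), "out"),
]
good = {"x1": 1, "x2": 1, "x3": 0, "w1": 1, "w2": 1, "out": 1}
lying = {"x1": 0, "x2": 1, "x3": 0, "w1": 1, "w2": 1, "out": 1}
print(verify_circuit(gates, "out", good), verify_circuit(gates, "out", lying))  # prints: 1 0
```

Each gate is inspected once, so the check takes time linear in the size of the circuit, in line with the remark in the proof.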
To prove that CIRCUIT-SAT is NP-complete, it remains to show that the language is NP-hard. That is, we
have to show that every language in NP is polynomial-time reducible to CIRCUIT-SAT. The actual proof of
this fact is full of technical intricacies, and so we shall settle for a sketch of the proof based on some
understanding of the workings of computer hardware.
A computer program is stored in the computer memory as a sequence of instructions. A typical instruction
encodes an operation to be performed, addresses of operands in memory, and an address where the result is to
be stored. A special memory location, called the program counter, keeps track of which instruction is to be
executed next. The program counter is automatically incremented whenever an instruction is fetched,
thereby causing the computer to execute instructions sequentially. The execution of an instruction can
cause a value to be written to the program counter, however, and then the normal sequential execution can
be altered, allowing the computer to loop and perform conditional branches.
At any time in the execution of a program, the entire state of the computation is represented in the
computer's memory. A configuration is any particular state of computer memory. The execution of an
instruction can be viewed as mapping one configuration to another. Importantly, the computer hardware
that accomplishes this mapping can be implemented as a boolean combinational circuit, which we denote by
M in the proof of the following lemma.
Lemma 4
The circuit-satisfiability problem is NP-hard.
Proof Let L be any language in NP. We shall describe a polynomial-time algorithm F that computes a
reduction function f that maps every binary string x to a circuit C = f(x) such that
x ∈ L if and only if C ∈ CIRCUIT-SAT.
Since L ∈ NP, there must exist an algorithm A that verifies L in polynomial time. The algorithm F that we
shall construct will use the two-input algorithm A to compute the reduction function f.
Let T(n) denote the worst-case running time of algorithm A on length-n input strings, and let
k ≥ 1 be a constant such that T(n) = O(n^k) and the length of the certificate is O(n^k). (The running time
of A is actually a polynomial in the total input size, which includes both an input string and a certificate,
but since the length of the certificate is polynomial in the length n of the input string, the running time
is polynomial in n.)
The basic idea of the proof is to represent the computation of A as a sequence of configurations. As shown
in Figure 4.5, each configuration can be broken into parts consisting of the program for A, the program
counter and auxiliary machine state, the input x, the certificate y, and working storage. Starting with an
initial configuration c_0, each configuration c_i is mapped to a subsequent configuration c_(i+1) by the
combinational circuit M implementing the computer hardware.
Figure 4.5. The sequence of configurations c_0, c_1, c_2, ..., c_T(n) produced by an algorithm A running on
an input x and certificate y. Each configuration represents the state of the computer for one step of the
computation and, besides A, x, and y, includes the program counter (PC), auxiliary machine state, and
working storage. Each configuration is mapped to the next configuration by a boolean combinational circuit
M. The output is a distinguished bit in the working storage.
The output of the algorithm A (0 or 1) is written to some designated location in the working storage when
A finishes executing, and if we assume that A halts thereafter, the value never changes. Thus, if the
algorithm runs for at most T(n) steps, the output appears as one of the bits in c_T(n).
The reduction algorithm F constructs a single combinational circuit that computes all configurations
produced by a given initial configuration. The idea is to paste together T(n) copies of the circuit M. The
output of the ith circuit, which produces configuration c_i, is fed directly into the input of the (i+1)st
circuit. Thus the configurations, rather than ending up in a state register, simply reside as values on the
wires connecting copies of M.
Recall what the polynomial-time reduction algorithm F must do. Given an input x, it must
compute a circuit C = f(x) that is satisfiable if and only if there exists a certificate y such that A(x,y) = 1. When F
obtains an input x, it first computes n = |x| and constructs a combinational circuit C' consisting of T(n)
copies of M. The input to C' is an initial configuration corresponding to a computation on A(x,y), and the
output is the configuration c_{T(n)}.
The circuit C = f(x) that F computes is obtained by making a few changes to C'. First, the inputs to C'
corresponding to the program for A, the initial program counter, the input x, and the initial state of memory
are wired directly to these known values. Thus the only remaining inputs to the circuit correspond to the
certificate y. Second, all outputs of the circuit are ignored, except the one bit of c_{T(n)} corresponding to
the output of A. This circuit C, so constructed, computes C(y) = A(x,y) for any input y of length O(n^k). The
reduction algorithm F, when provided an input string x, computes such a circuit C and outputs it.
Two properties remain to be proved. First, we must show that F correctly computes a reduction
function f. That is, we have to show that C is satisfiable if and only if there exists a certificate y such that
A(x,y) = 1. Second, we must show that F runs in polynomial time.
To prove that F correctly computes a reduction function, suppose that there exists a certificate
y of length O(n^k) such that A(x,y) = 1. Then, if we apply the bits of y to the inputs of C, the output of C is
C(y) = A(x,y) = 1. Thus if a certificate exists, then C is satisfiable. Conversely, suppose that C is
satisfiable. Then there exists an input y to C such
that C(y) = 1, from which we conclude that A(x,y) = 1. Thus, F correctly computes a reduction function.
To complete the proof, we need only show that F runs in time polynomial in n =
|x|. The first observation is that the number of bits needed to represent a configuration is
polynomial in n. The program for A itself has constant size, independent of the length of its input x. The
length of the input x is n, and the length of the certificate y is O(n^k). Since the algorithm runs for at most
O(n^k) steps, the amount of working storage required by A is polynomial in n as well. (We assume that this
memory is contiguous.)
The combinational circuit M implementing the computer hardware has size polynomial in the length
of a configuration, which is polynomial in O(n^k) and hence polynomial in n. (Most of this circuitry
implements the logic of the memory system.) The circuit C consists of at most
t = O(n^k) copies of M, and hence it has size polynomial in n. The construction of C from x can be
accomplished in polynomial time by the reduction algorithm F, since each step of the construction takes
polynomial time.
The language CIRCUIT-SAT is therefore at least as hard as any language in NP, and since it belongs to NP, it is
NP-complete.
Theorem
The circuit-satisfiability problem is NP-complete.
Proof Immediate from the lemmas given before and the definition of NP-completeness.
Student Activity 4.4
Before going to the next section, answer the following questions:
1. Prove that the circuit-satisfiability problem is NP-hard.
2. Prove that the circuit-satisfiability problem belongs to the class NP.
If your answers are correct, then proceed to the next section.
Top
NP-completeness Proofs
The NP-completeness of the circuit-satisfiability problem depends on a direct proof that L ≤_P CIRCUIT-SAT
for every language L ∈ NP. Here, we shall show how to prove that languages are NP-complete without
directly reducing every language in NP to the given language.
The following lemma provides a basis for showing that a language is NP-complete.
Lemma 4
If L is a language such that L' ≤_P L for some L' ∈ NPC, then L is NP-hard. Moreover, if L ∈ NP, then L ∈ NPC.
Proof: Since L' is NP-complete, for all L'' ∈ NP we have L'' ≤_P L'. By supposition, L' ≤_P L, and thus
by transitivity we have L'' ≤_P L, which shows that L is NP-hard. If L ∈ NP, we also have L ∈ NPC.
In other words, by reducing a known NP-complete language L' to L, we implicitly reduce every language
in NP to L. Thus the lemma gives us a method for proving that a language L is NP-complete:
1. Prove L ∈ NP.
2. Select a known NP-complete language L'.
3. Describe an algorithm that computes a function f mapping every instance of L' to an instance of L.
4. Prove that the function f satisfies x ∈ L' if and only if f(x) ∈ L for all x ∈ {0,1}*.
5. Prove that the algorithm computing f runs in polynomial time.
Figure 4.6. The truth table for the clause (y_1 ↔ (y_2 ∧ ¬x_2)).
resulting expression is
φ' = y_1 ∧ (y_1 ↔ (y_2 ∧ ¬x_2))
∧ (y_2 ↔ (y_3 ∨ y_4))
∧ (y_3 ↔ (x_1 → x_2))
∧ (y_4 ↔ ¬y_5)
∧ (y_5 ↔ (y_6 ∨ x_4))
∧ (y_6 ↔ (¬x_1 ↔ x_3)).
Note that the formula φ' thus obtained is a conjunction of clauses φ'_i, each of which has at
most 3 literals. The only additional requirement is that each clause be an OR of literals.
We now convert each clause φ'_i into conjunctive normal form (CNF). We construct a truth table for φ'_i by
evaluating all possible assignments to its variables. Each row of the truth table consists of a possible
assignment of the variables of the clause, together with the value of the clause under that assignment. Using
the truth-table entries that evaluate to 0, we build a formula in disjunctive normal form (DNF), an OR
of ANDs, that is equivalent to ¬φ'_i.
We then convert this formula into a CNF formula φ''_i by using
DeMorgan's laws to complement all literals and change ORs into ANDs and ANDs into ORs.
As an example, we convert the clause φ'_1 = (y_1 ↔ (y_2 ∧ ¬x_2)) into CNF as follows. The truth table for φ'_1 is
given in Figure 4.6.
The DNF formula equivalent to ¬φ'_1 is
(y_1 ∧ y_2 ∧ x_2) ∨ (y_1 ∧ ¬y_2 ∧ x_2) ∨ (y_1 ∧ ¬y_2 ∧ ¬x_2) ∨ (¬y_1 ∧ y_2 ∧ ¬x_2).
Applying DeMorgan's laws, we get the CNF formula
φ''_1 = (¬y_1 ∨ ¬y_2 ∨ ¬x_2) ∧ (¬y_1 ∨ y_2 ∨ ¬x_2)
∧ (¬y_1 ∨ y_2 ∨ x_2) ∧ (y_1 ∨ ¬y_2 ∨ x_2),
which is equivalent to the original clause φ'_1.
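The truth-table conversion just described can be checked mechanically. The following Python sketch is an illustration only: the function names clause_to_cnf and evaluate, and the (name, is_positive) encoding of literals, are assumptions made for this example and do not come from the text. For every row on which the clause evaluates to 0, it emits the OR-clause obtained by complementing that row, which is the DeMorgan step.

```python
from itertools import product

def clause_to_cnf(variables, clause):
    """Build a CNF for `clause` from its truth table.

    `clause` maps a dict {name: 0 or 1} to a truth value. The result is a
    list of OR-clauses; each literal is a pair (name, is_positive).
    """
    cnf = []
    for values in product([0, 1], repeat=len(variables)):
        row = dict(zip(variables, values))
        if not clause(row):
            # This row is one AND-term of the DNF for the negated clause.
            # Complementing every literal (DeMorgan) gives an OR-clause
            # that is false on exactly this row.
            cnf.append([(name, row[name] == 0) for name in variables])
    return cnf

def evaluate(cnf, row):
    """Evaluate a CNF (as produced above) under the assignment `row`."""
    return all(any(row[name] == int(positive) for name, positive in c)
               for c in cnf)

# Hypothetical encoding of the clause y1 <-> (y2 AND NOT x2).
def phi1(row):
    return row["y1"] == int(row["y2"] == 1 and row["x2"] == 0)

cnf = clause_to_cnf(["y1", "y2", "x2"], phi1)
for c in cnf:
    print(" OR ".join(name if positive else "NOT " + name
                      for name, positive in c))
```

Since a clause of φ' has at most 3 variables, the loop visits at most 8 rows, so this step adds only a constant amount of work per clause.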
Each clause φ'_i of the formula φ' has now been converted into a CNF formula φ''_i, and thus φ' is
equivalent to the CNF formula φ'' consisting of the conjunction of the φ''_i. Moreover, each clause of φ''
has at most 3 literals.
The third and final step of the reduction further transforms the formula so that each clause has exactly 3 distinct literals.
The final 3-CNF formula φ''' is constructed from the clauses of the CNF formula φ''. It also uses two
auxiliary variables, which we call p and q. For each clause C_i of φ'' we include the following clauses in φ''':
• If C_i has 3 distinct literals, then simply include C_i as a clause of φ'''.
• If C_i has 2 distinct literals, that is, if C_i = (l_1 ∨ l_2), where l_1 and l_2 are literals, then include
(l_1 ∨ l_2 ∨ p) ∧ (l_1 ∨ l_2 ∨ ¬p) as clauses of φ'''. The literals p and ¬p merely fulfill the syntactic
requirement that there be exactly 3 distinct literals per clause: (l_1 ∨ l_2 ∨ p) ∧ (l_1 ∨ l_2 ∨ ¬p) is
equivalent to (l_1 ∨ l_2) whether p = 0 or p = 1.
• If C_i has only 1 distinct literal l, then include (l ∨ p ∨ q) ∧ (l ∨ p ∨ ¬q) ∧ (l ∨ ¬p ∨ q) ∧ (l ∨ ¬p ∨ ¬q)
as clauses of φ'''. Note that every setting of p and q causes the conjunction of these four clauses to evaluate
to l.
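The three padding rules can be sketched in Python as follows. This is a minimal illustration, not the text's own procedure: the names pad_clause and satisfied are hypothetical, and literals are assumed to be encoded as nonzero integers (a positive integer for a variable, its negative for the negated variable), with p and q being the integer names of the two auxiliary variables.

```python
def pad_clause(clause, p, q):
    """Return clauses with exactly 3 distinct literals equivalent to `clause`.

    `clause` is a sequence of 1 to 3 distinct integer literals; `p` and `q`
    are the (positive) integer names of the two auxiliary variables.
    """
    lits = sorted(set(clause), key=abs)
    if len(lits) == 3:
        return [tuple(lits)]
    if len(lits) == 2:
        a, b = lits
        return [(a, b, p), (a, b, -p)]
    if len(lits) == 1:
        (a,) = lits
        return [(a, p, q), (a, p, -q), (a, -p, q), (a, -p, -q)]
    raise ValueError("expected 1 to 3 distinct literals")

def satisfied(clauses, assignment):
    """Check a list of OR-clauses under `assignment` (variable -> bool)."""
    return all(any(assignment[abs(l)] == (l > 0) for l in c) for c in clauses)

# A one-literal clause (x1) padded with auxiliary variables 8 and 9.
print(pad_clause([1], 8, 9))
```

Whatever values the auxiliary variables take, the padded clauses hold exactly when the original clause holds, which is why the padding does not change satisfiability.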
We can see that the 3-CNF formula φ''' is satisfiable if and only if φ is satisfiable by inspecting each of the
three steps. Like the reduction from CIRCUIT-SAT to SAT, the construction of φ' from φ in the first step
preserves satisfiability. The second step produces a CNF formula φ'' that is equivalent to φ'. The third step
produces a 3-CNF formula φ''' that is effectively equivalent to φ'', since any assignment to the variables p
and q produces a formula that is algebraically equivalent to φ''.
We must also show that the reduction can be computed in polynomial time. Constructing φ' from φ
introduces at most 1 variable and 1 clause per connective in φ. Constructing φ'' from φ' can introduce at
most 8 clauses into φ'' for each clause from φ', since each clause of φ' has at most 3 variables, and the truth
table for each clause has at most 2^3 = 8 rows. Similarly, the construction of φ''' from φ'' introduces at most
4 clauses into φ''' for each clause of φ''. Hence the size of φ''' is polynomial in the length of the original
formula, and each of the constructions can easily be accomplished in polynomial time.
Top
NP-complete Problems
NP-complete problems arise in many domains: boolean logic, arithmetic, automata and language theory,
network design, sets and partitions, storage and retrieval, sequencing and scheduling, graphs, mathematical
programming, algebra and number theory, games and puzzles, program optimization, etc. Here we use the
reduction methodology to provide NP-completeness proofs for problems related to graph theory and set
partitioning.
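Figure 4.7. The structure of NP-completeness proofs. The problems shown are CIRCUIT-SAT, SAT,
3-CNF-SAT, CLIQUE, HAM-CYCLE, VERTEX-COVER, TSP, and SUBSET-SUM. The reductions run from
CIRCUIT-SAT to SAT to 3-CNF-SAT, from 3-CNF-SAT to CLIQUE and to HAM-CYCLE, from CLIQUE to
VERTEX-COVER, from VERTEX-COVER to SUBSET-SUM, and from HAM-CYCLE to TSP.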
The Clique Problem
A clique in an undirected graph G = (V,E) is a subset V' ⊆ V of vertices, each pair of which is connected
by an edge in E. In other words, a clique is a complete subgraph of G. The size of a clique is
the number of vertices it contains. As a formal language, the clique problem is:
CLIQUE = {⟨G,k⟩ : G is a graph with a clique of size k}.
A naive algorithm for determining whether a graph G = (V,E) with |V| vertices has a clique of size k is to
list all k-subsets of V and check each one to see whether it forms a clique. The running time of this
algorithm is Ω(k^2 · C(|V|, k)), where C(|V|, k) is the binomial coefficient "|V| choose k"; this is
polynomial if k is a constant. In general, however, k could be proportional to |V|, in
which case the algorithm runs in superpolynomial time. As one might suspect, an efficient algorithm for the
clique problem is unlikely to exist.
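The naive algorithm can be written in a few lines of Python. This is only a sketch of the brute-force idea described above; the function name has_clique and the representation of the graph as a vertex list plus an edge list are assumptions made for the example.

```python
from itertools import combinations

def has_clique(vertices, edges, k):
    """Brute force: try every k-subset of the vertices."""
    adjacent = {frozenset(e) for e in edges}
    for subset in combinations(vertices, k):
        # A subset is a clique when all of its pairs are edges.
        if all(frozenset(pair) in adjacent
               for pair in combinations(subset, 2)):
            return True
    return False

# Hypothetical example: a triangle a-b-c with an extra vertex d hanging off c.
V = ["a", "b", "c", "d"]
E = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
print(has_clique(V, E, 3), has_clique(V, E, 4))
```

The outer loop runs over all k-subsets and the inner check examines on the order of k^2 pairs, which is where the bound quoted above comes from.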
Theorem
The clique problem is NP-complete.
Proof: We first show that CLIQUE ∈ NP. For a given graph G = (V,E), we use the set V' ⊆ V of vertices
in the clique as a certificate for G. Checking whether V' is a clique can be accomplished in polynomial time
by checking, for each pair u, v ∈ V', whether the edge (u,v) belongs to E.
We next show that the clique problem is NP-hard by proving that 3-CNF-SAT ≤_P CLIQUE. That we
should be able to prove this result is somewhat surprising, since on the surface logical formulas seem to
have little to do with graphs.
Figure 4.8. The graph G derived from the 3-CNF formula φ = C_1 ∧ C_2 ∧ C_3, where C_1 = (x_1 ∨ ¬x_2 ∨ ¬x_3),
C_2 = (¬x_1 ∨ x_2 ∨ x_3), and C_3 = (x_1 ∨ x_2 ∨ x_3), in reducing 3-CNF-SAT to CLIQUE. A satisfying assignment
of the formula is (x_1 = 0, x_2 = 0, x_3 = 1). This satisfying assignment satisfies C_1 with ¬x_2, and it
satisfies C_2 and C_3 with x_3, corresponding to the clique with lightly shaded vertices.
The reduction algorithm begins with an instance of 3-CNF-SAT. Let φ = C_1 ∧ C_2 ∧ ... ∧ C_k be a boolean
formula in 3-CNF with k clauses. For r = 1, 2, ..., k, each clause C_r has exactly three distinct literals
l_1^r, l_2^r, and l_3^r. We shall construct a graph G such that φ is satisfiable if and only if G has a clique of size k.
The graph G = (V,E) is constructed as follows. For each clause C_r = (l_1^r ∨ l_2^r ∨ l_3^r) in φ, we place a triple of
vertices v_1^r, v_2^r, and v_3^r in V. We put an edge between two vertices v_i^r and v_j^s if both of the following hold:
• v_i^r and v_j^s are in different triples, that is, r ≠ s, and
• their corresponding literals are consistent, that is, l_i^r is not the negation of l_j^s.
The graph can easily be computed from φ in polynomial time. As an example of this construction, if we
have
φ = (x_1 ∨ ¬x_2 ∨ ¬x_3) ∧ (¬x_1 ∨ x_2 ∨ x_3) ∧ (x_1 ∨ x_2 ∨ x_3),
then G is the graph shown in Figure 4.8.
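The construction of G from φ is short enough to sketch in Python. The sketch below is illustrative and makes its own assumptions: literals are encoded as nonzero integers (k for x_k, -k for ¬x_k), a vertex is named by the pair (clause index, position within the clause), and the function name cnf_to_clique_graph is hypothetical.

```python
from itertools import combinations

def cnf_to_clique_graph(clauses):
    """Build the graph of the 3-CNF-SAT to CLIQUE reduction.

    One vertex per literal occurrence; an edge joins two vertices when they
    lie in different clauses and their literals are not complementary.
    """
    vertices = [(r, i) for r, clause in enumerate(clauses)
                for i in range(len(clause))]
    edges = set()
    for (r, i), (s, j) in combinations(vertices, 2):
        if r != s and clauses[r][i] != -clauses[s][j]:
            edges.add(((r, i), (s, j)))
    return vertices, edges

# The example formula from the text, with clauses indexed from 0.
phi = [(1, -2, -3), (-1, 2, 3), (1, 2, 3)]
vertices, edges = cnf_to_clique_graph(phi)
print(len(vertices), "vertices,", len(edges), "edges")
```

The double loop examines every pair of literal occurrences once, so the graph is built in time quadratic in the number of clauses.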
We must show that this transformation of φ into G is a reduction. First, suppose that φ has a satisfying assignment.
Then each clause C_r contains at least one literal l_i^r that is assigned 1, and each such literal corresponds to a
vertex v_i^r. Picking one such "true" literal from each clause yields a set V' of k vertices. We claim that V' is a
clique. For any two vertices v_i^r, v_j^s ∈ V', where r ≠ s, both corresponding literals l_i^r and l_j^s are mapped to 1 by
the given satisfying assignment, and thus the literals cannot be complements. Thus, by the construction of
G, the edge (v_i^r, v_j^s) belongs to E.
Figure 4.9. Reducing CLIQUE to VERTEX-COVER. (a) An undirected graph G = (V,E) with clique V' = {u, v, x, y}.
(b) The graph Ḡ produced by the reduction algorithm that has vertex cover V − V' = {w, z}.
Conversely, suppose that G has a clique V' of size k. No edges in G connect vertices in the same triple, and
so V' contains exactly one vertex per triple. We can assign 1 to each literal l_i^r such that v_i^r ∈ V' without
fear of assigning 1 to both a literal and its complement, since G
contains no edges between inconsistent literals. Each clause is satisfied, and so φ is satisfied. (Any variables
that correspond to no vertex in the clique may be set arbitrarily.)
In the example of Figure 4.8, a satisfying assignment of φ is x_1 = 0, x_2 = 0, x_3 = 1. A corresponding clique
of size k = 3 consists of the vertices corresponding to ¬x_2 from the first clause, x_3 from the second clause,
and x_3 from the third clause.
The Vertex-cover Problem
We define a vertex cover of an undirected graph G = (V,E) as a subset V' ⊆ V such that if (u,v) ∈ E, then u
∈ V' or v ∈ V' (or both). That is, each vertex "covers" its incident edges, and a vertex cover for G is a set
of vertices that covers all the edges in E. The size of a vertex cover is the number of vertices in it. For
example, the graph in Figure 4.9(b) has a vertex cover {w, z} of size 2.
The vertex-cover problem is to find a vertex cover of minimum size in a given graph. Restating it as a decision
problem, we wish to determine whether a graph has a vertex cover of a given size k. As a language, we define
VERTEX-COVER = {⟨G,k⟩ : graph G has a vertex cover of size k}.
Theorem
The vertex-cover problem is NP-complete.
Proof We first show that VERTEX-COVER ∈ NP. Suppose we are given a graph G = (V,E) and an integer
k. The certificate we choose is the vertex cover V' ⊆ V itself. The verification algorithm affirms that |V'| = k,
and then it checks, for each edge (u,v) ∈ E, whether u ∈ V' or v ∈ V'. This verification can easily be done in
polynomial time.
We prove that the vertex-cover problem is NP-hard by showing that CLIQUE ≤_P VERTEX-COVER. This
reduction is based on the notion of the "complement" of a graph. Given an undirected graph G = (V,E), we
define the complement of G as Ḡ = (V,Ē), where Ē = {(u,v) : u, v ∈ V, u ≠ v, and (u,v) ∉ E}. In other words, Ḡ is
the graph containing exactly those edges that are not in G. Figure 4.9 shows a graph and its complement and
illustrates the reduction from CLIQUE to VERTEX-COVER.
The reduction algorithm takes as input an instance ⟨G,k⟩ of the clique problem. It computes the
complement Ḡ, which is easily done in polynomial time. The output of the reduction algorithm is the
instance ⟨Ḡ, |V| − k⟩ of the vertex-cover problem. To complete the proof, we show that this transformation is
indeed a reduction: the graph G has a clique of size k if and only if the graph Ḡ has a vertex cover of size
|V| − k.
Suppose that G has a clique V' ⊆ V with |V'| = k. We claim that V − V' is a vertex cover in Ḡ. Let (u,v) be
any edge in Ē. Then (u,v) ∉ E, which implies that at least one of u or v does not belong to V', since every pair
of vertices in V' is connected by an edge of E. Equivalently, at least one of u or v is in V − V', which means
that edge (u,v) is covered by V − V'. Since (u,v) was chosen arbitrarily from Ē, every edge of Ē is covered by a
vertex in V − V'. Hence the set V − V', which has size |V| − k, forms a vertex cover for Ḡ.
Conversely, suppose that Ḡ has a vertex cover V' ⊆ V, where |V'| = |V| − k. Then, for all u, v ∈ V, if (u,v)
∈ Ē, then u ∈ V' or v ∈ V' or both. The contrapositive of this implication is that for all u, v ∈ V, if u ∉ V' and
v ∉ V', then (u,v) ∈ E. In other words, V − V' is a clique, and it has size |V| − |V'| = k.
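The reduction from CLIQUE to VERTEX-COVER amounts to complementing the edge set and replacing k by |V| − k. The Python sketch below illustrates this under stated assumptions: the function names are hypothetical, and the six-vertex graph is an invented example (a clique on u, v, x, y plus two extra edges), not a transcription of the figure.

```python
from itertools import combinations

def complement_edges(vertices, edges):
    """Edges of the complement graph: all vertex pairs that are not in E."""
    present = {frozenset(e) for e in edges}
    return {frozenset(p) for p in combinations(vertices, 2)} - present

def is_vertex_cover(edges, cover):
    """True when every edge has at least one endpoint in `cover`."""
    return all(set(e) & set(cover) for e in edges)

def clique_to_vertex_cover(vertices, edges, k):
    """Map a CLIQUE instance (G, k) to a VERTEX-COVER instance."""
    return vertices, complement_edges(vertices, edges), len(vertices) - k

# Hypothetical graph: {u, v, x, y} is a clique of size 4.
V = ["u", "v", "w", "x", "y", "z"]
E = [("u", "v"), ("u", "x"), ("u", "y"), ("v", "x"), ("v", "y"),
     ("x", "y"), ("u", "w"), ("y", "z")]
_, E_bar, cover_size = clique_to_vertex_cover(V, E, 4)
print(cover_size, is_vertex_cover(E_bar, {"w", "z"}))
```

Here the complement of the clique, {w, z}, covers every edge of the complement graph, as the proof predicts.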
The Subset-sum Problem
In the subset-sum problem, we are given a finite set S ⊂ N and a target t ∈ N. We ask whether there is a
subset S' ⊆ S whose elements sum to t. For example, if S = {1, 4, 16, 64, 256, 1040, 1041, 1093, 1285, 1344}
and t = 3755, then the subset S' = {1, 16, 64, 256, 1040, 1093, 1285} is a solution.
As a language, we define
SUBSET-SUM = {⟨S,t⟩ : there exists a subset S' ⊆ S such that t = Σ_{s ∈ S'} s}.
As usual, our standard encoding assumes that the input integers are coded in binary.
With this assumption in mind, we can show that the subset-sum problem is unlikely to have a fast algorithm.
Theorem
The subset-sum problem is NP-complete.
Proof To show that SUBSET-SUM is in NP, for an instance ⟨S,t⟩ of the problem, we let the
subset S' be the certificate. Checking whether t = Σ_{s ∈ S'} s can be done by a verification algorithm in
polynomial time.
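A verification algorithm of this kind is easy to sketch. In the Python fragment below, the function name verify_subset_sum is an assumption made for illustration; the numbers are those of the example given at the start of this section.

```python
def verify_subset_sum(S, t, certificate):
    """Accept when the certificate is a subset of S whose elements sum to t."""
    chosen = set(certificate)
    return chosen <= set(S) and sum(chosen) == t

S = {1, 4, 16, 64, 256, 1040, 1041, 1093, 1285, 1344}
t = 3755
S_prime = {1, 16, 64, 256, 1040, 1093, 1285}
print(verify_subset_sum(S, t, S_prime))
```

The check needs one membership test and one addition per element of the certificate, so it runs in time polynomial in the size of the binary encoding of the instance.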
We now show that VERTEX-COVER ≤_P SUBSET-SUM. Given an instance ⟨G,k⟩ of the vertex-cover problem,
the reduction algorithm constructs an instance ⟨S,t⟩ of the subset-sum problem such that G has a vertex cover
of size k if and only if there is a subset of S whose sum is exactly t.
At the heart of the reduction is an incidence-matrix representation of G. Let G = (V,E) be an undirected
graph and let V = {v_0, v_1, ..., v_{|V|−1}} and E = {e_0, e_1, ..., e_{|E|−1}}. The incidence matrix of G is a
|V| × |E| matrix B = (b_ij) such that
b_ij = 1 if edge e_j is incident on vertex v_i, and b_ij = 0 otherwise.
The incidence matrix for the undirected graph of Figure 4.10(a) is shown in Figure 4.10(b). The incidence matrix
is shown with lower-indexed edges on the right, rather than on the left as is conventional, in order to simplify
the formulas for the numbers in S.
Given a graph G and an integer k, the reduction algorithm computes a set S of numbers and an integer t. To
understand how the reduction algorithm works, let us represent numbers in a "modified base-4" fashion. The
|E| low-order digits of a number will be in base 4, but the high-order digit can be as large as k. The set of
numbers is constructed in such a way that no carries can be propagated from lower digits to higher digits.
Figure 4.10. The reduction of the vertex-cover problem to the subset-sum problem. (a) An undirected graph
G. A vertex cover {v_1, v_3, v_4} of size 3 is lightly shaded. (b) The corresponding incidence matrix. Shading of the
rows corresponds to the vertex cover of part (a). Each edge e_j has a 1 in at least one lightly shaded row. (c)
The corresponding subset-sum instance. The portion within the box is the incidence matrix. Here the vertex
cover {v_1, v_3, v_4} of size k = 3 corresponds to the lightly shaded subset {1, 16, 64, 256, 1040, 1093, 1284},
which adds up to 3754.
The set S consists of two types of numbers, corresponding to vertices and edges respectively. For each
vertex v_i ∈ V, we create a positive integer x_i whose modified base-4 representation consists of a leading 1
followed by |E| digits. The digits correspond to v_i's row of the incidence matrix B = (b_ij) for G, as
illustrated in Figure 4.10(c). Formally, for i = 0, 1, ..., |V| − 1,
x_i = 4^{|E|} + Σ_{j=0}^{|E|−1} b_ij 4^j.
For each edge e_j ∈ E, we create a positive integer y_j that is just a row of the "identity" incidence matrix. (The
identity incidence matrix is the |E| × |E| matrix with 1's only in the diagonal positions.) Formally, for j
= 0, 1, ..., |E| − 1,
y_j = 4^j.
The first digit of the target sum t is k, and all |E| lower-order digits are 2's. Formally,
t = k · 4^{|E|} + Σ_{j=0}^{|E|−1} 2 · 4^j.
All of these numbers have polynomial size when we represent them in binary. The reduction can be
performed in polynomial time by manipulating the bits of the incidence matrix.
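The numbers x_i, y_j and t can be computed directly from an edge list. The Python sketch below is an illustration under explicit assumptions: the function names are hypothetical, and the five-edge graph is an edge list chosen for this example so that the resulting numbers agree with the values 1040, 1041, 1093, 1284, 1344 and 3754 quoted for Figure 4.10; it is not read off the figure itself.

```python
def vertex_cover_to_subset_sum(num_vertices, edges, k):
    """Return (xs, ys, t) for the VERTEX-COVER to SUBSET-SUM reduction.

    Vertices are 0 .. num_vertices-1; edges[j] is the pair of endpoints of e_j.
    """
    m = len(edges)
    # x_i: a leading 1 followed by row i of the incidence matrix, base 4.
    xs = [4**m + sum(4**j for j, e in enumerate(edges) if i in e)
          for i in range(num_vertices)]
    # y_j: a single 1 in digit position j.
    ys = [4**j for j in range(m)]
    # t: leading digit k followed by m digits equal to 2.
    t = k * 4**m + sum(2 * 4**j for j in range(m))
    return xs, ys, t

def cover_to_subset(edges, cover, xs, ys):
    """Subset S' built from a vertex cover, as in the proof."""
    chosen = [xs[i] for i in cover]
    chosen += [ys[j] for j, e in enumerate(edges)
               if len(set(e) & set(cover)) == 1]
    return chosen

# Assumed edge list e_0 .. e_4 on vertices v_0 .. v_4 (hypothetical).
edges = [(0, 1), (1, 4), (0, 3), (1, 2), (2, 4)]
xs, ys, t = vertex_cover_to_subset_sum(5, edges, 3)
subset = cover_to_subset(edges, [1, 3, 4], xs, ys)
print(xs, ys, t, sum(subset))
```

Because each digit position receives at most three 1's (two from vertices and one from y_j), working in base 4 guarantees that no digit overflows into the next.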
Excerpt from Figure 4.10(b) and (c), with digits in modified base 4:
v_0: 0 0 1 0 1        x_0 = 1 0 0 1 0 1 = 1041
v_2: 1 1 0 0 0        x_2 = 1 1 1 0 0 0 = 1344
                      y_1 = 0 0 0 0 1 0 = 4
We must now show that graph G has a vertex cover of size k if and only if there is a subset S' ⊆ S whose
sum is t. First, suppose that G has a vertex cover V' ⊆ V of size k. Let V' = {v_{i_1}, v_{i_2}, ..., v_{i_k}}, and define S'
by
S' = {x_{i_1}, x_{i_2}, ..., x_{i_k}} ∪ {y_j : e_j is incident on precisely one vertex in V'}.
To see that Σ_{s ∈ S'} s = t, observe that summing the k leading 1's of the x_{i_m} ∈ S' gives the leading digit k of
the modified base-4 representation of t. To get the low-order digits of t, each of which is a 2, consider the digit
positions in turn, each of which corresponds to an edge e_j. Because V' is a vertex cover, e_j is incident on at
least one vertex in V'.
Thus, for each edge e_j, there is at least one x_{i_m} ∈ S' with a 1 in the jth position. If e_j is
incident on two vertices in V', then both contribute a 1 to the sum in the jth position. The jth digit of y_j
contributes nothing, since e_j is incident on two vertices, which implies that y_j ∉ S'. Thus, in this case, the
sum of S' produces a 2 in the jth position of t. In the other case, when e_j is incident on exactly one vertex
in V', we have y_j ∈ S', and the incident vertex and y_j each contribute 1 to the sum of the jth digit of t,
thereby also producing a 2. Thus, S' is a solution to the subset-sum instance S.
Now, suppose that there is a subset S' ⊆ S that sums to t. Let S' = {x_{i_1}, x_{i_2}, ..., x_{i_m}} ∪ {y_{j_1}, y_{j_2}, ..., y_{j_p}}. We
claim that m = k and that V' = {v_{i_1}, v_{i_2}, ..., v_{i_m}} is a vertex cover for G. To prove this claim, we start by
observing that for each edge e_j ∈ E, there are three 1's in set S in the e_j position: one from each of the two
vertices incident on e_j, and one from y_j. Because we are working with a modified base-4 representation, there
are no carries from position e_j to position e_{j+1}. Thus, for each of the |E| low-order positions of t, at least
one and at most two x_i must contribute to the sum. Since at least one x_i contributes to the sum for each edge,
we see that V' is a vertex cover. To see that m = k, and thus that V' is a vertex cover of size k, observe that
the only way the leading k in target t can be achieved is by including exactly k of the x_i in the sum.
In Figure 4.10, the vertex cover V' = {v_1, v_3, v_4} corresponds to the subset S' = {x_1, x_3, x_4, y_0, y_2, y_3, y_4}. All of
the y_j are included in S' with the exception of y_1, since e_1 is incident on two vertices in V'.
The Hamiltonian-cycle Problem
Figure 4.11. (a) Widget A, used in the reduction from 3-CNF-SAT to HAM-CYCLE. (b)-(c) If A is a subgraph of
some graph G that contains a hamiltonian cycle and the only connections from A to the rest of G are
through the vertices a, a', b, and b', then the shaded edges represent the only two possible
ways in which the hamiltonian cycle may traverse the edges of subgraph A.
(d) A compact representation of the A widget.
Theorem
The hamiltonian-cycle problem is NP-complete.
Proof We first show that HAM-CYCLE belongs to NP. Given a graph G = (V,E), our certificate is the
sequence of |V| vertices that make up the hamiltonian cycle. The verification algorithm checks that this
sequence contains each vertex in V exactly once and that, with the first vertex repeated at the end, it forms a
cycle in G. This verification can be performed in polynomial time.
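The verification step can be sketched in Python as follows. The function name verify_ham_cycle and the representation of the graph as a vertex list and an edge list are assumptions made for this illustration; the four-vertex square is a hypothetical example.

```python
def verify_ham_cycle(vertices, edges, order):
    """Check that `order` lists every vertex once and closes into a cycle."""
    adjacent = {frozenset(e) for e in edges}
    if len(order) != len(vertices) or set(order) != set(vertices):
        return False
    n = len(order)
    # Consecutive vertices, including last back to first, must be adjacent.
    return all(frozenset((order[i], order[(i + 1) % n])) in adjacent
               for i in range(n))

# Hypothetical example: a square a-b-c-d-a.
V = ["a", "b", "c", "d"]
E = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
print(verify_ham_cycle(V, E, ["a", "b", "c", "d"]))
print(verify_ham_cycle(V, E, ["a", "c", "b", "d"]))
```

The check performs one set lookup per vertex of the certificate, which is clearly polynomial in the size of the graph.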
We now prove that HAM-CYCLE is NP-complete by showing that 3-CNF-SAT ≤_P HAM-CYCLE. Given a
3-CNF boolean formula φ over variables x_1, x_2, ..., x_n with clauses C_1, C_2, ..., C_k, each containing
exactly 3 distinct literals, we construct a graph G = (V,E) in polynomial time such that G has a hamiltonian
cycle if and only if φ is satisfiable. Our construction is based on widgets, which are pieces of graphs that
enforce certain properties.
Our first widget is the subgraph A shown in Figure 4.11(a). Suppose that A is a subgraph of some graph G and
that the only connections between A and the remainder of G are through the vertices a, a', b, and b'. Suppose
further that G has a hamiltonian cycle. Since any hamiltonian cycle of G must pass through the vertices z_1,
z_2, z_3, and z_4 in one of the ways shown in Figures 4.11(b) and (c), we may treat subgraph A as if it were
simply a pair of edges (a,a') and (b,b') with the restriction that any hamiltonian cycle of G must include
exactly one of these edges. We shall represent widget A as shown in Figure 4.11(d).
The subgraph B in Figure 4.12 is our second widget. Suppose that B is a subgraph of some graph G and that
the only connections from B to the remainder of G are through vertices b_1, b_2, b_3, and b_4. A hamiltonian cycle
of graph G cannot traverse all of the edges (b_1,b_2), (b_2,b_3), and (b_3,b_4), since then all vertices in the widget
other than b_1, b_2, b_3, and b_4 would be missed. A hamiltonian cycle of G may, however, traverse any proper
subset of these edges. Figures 4.12(a)-(e) show five such subsets; the remaining two subsets can be
obtained by performing a top-to-bottom flip of parts
(b) and (e). We represent this widget as in Figure 4.12(f), the idea being that at least one of the paths
pointed to by the arrows must be taken by a hamiltonian cycle of G.
The graph G that we shall construct consists mostly of copies of these two widgets. The construction is
illustrated in Figure 4.13. For each of the k clauses in φ, we include a copy of widget B, and we join these widgets
together in series as follows. Letting b_{i,j} be the copy of vertex b_j in the ith copy of widget B, we connect b_{i,4}
to b_{i+1,1} for i = 1, 2, ..., k − 1.
Then, for each variable x_m in φ, we include two vertices x'_m and x''_m. We connect these two vertices by means
of two copies of the edge (x'_m, x''_m), which we denote by e_m and ē_m to distinguish them. The idea is that if the
hamiltonian cycle takes edge e_m, it corresponds to assigning variable x_m the value 1. If the hamiltonian cycle
takes edge ē_m, the variable is assigned the value 0. Each pair of these edges forms a two-edge loop; we
connect these small loops in series by adding edges (x'_m, x''_{m+1}) for m = 1, 2, ..., n − 1. We connect the left
(clause) side of the graph to the right (variable) side by means of two edges (b_{1,1}, x'_1) and (b_{k,4}, x''_n), which
are the topmost and bottommost edges in Figure 4.13.
We are not yet finished with the construction of graph G, since we have yet to relate the variables to the
clauses. If the jth literal of clause C_i is x_m, then we use an A widget to connect edge (b_{i,j}, b_{i,j+1}) with edge e_m.
If the jth literal of clause C_i is ¬x_m, then we instead put an A widget between edge (b_{i,j}, b_{i,j+1}) and edge ē_m. In
Figure 4.13, for example, because clause C_2 is (x_1 ∨ ¬x_2 ∨ x_3), we place three A widgets as follows:
between (b_{2,1}, b_{2,2}) and e_1,
between (b_{2,2}, b_{2,3}) and ē_2, and
between (b_{2,3}, b_{2,4}) and e_3.
Note that connecting two edges by means of A widgets actually entails replacing each edge by the five
edges in the top or bottom of Figure 4.11(a) and, of course, adding the connections that pass through the z
vertices as well. A given literal l_m may appear in several clauses (¬x_3 in Figure 4.13, for example), and thus
an edge e_m or ē_m may be influenced by several A widgets (edge ē_3, for example). In this case, we connect
the A widgets in series, as shown in Figure 4.14, effectively replacing edge e_m or ē_m by a series of edges.
Figure 4.12. Widget B, used in the reduction from 3-CNF-SAT to HAM-CYCLE. No path from vertex b_1 to
vertex b_4 containing all the vertices in the widget may use all three edges (b_1,b_2), (b_2,b_3), and (b_3,b_4). Any
proper subset of these edges may be used, however. (a)-(e) Five such subsets.
(f) A representation of this widget in which at least one of the paths pointed to
by the arrows must be taken by a hamiltonian cycle.
Figure 4.13. The graph constructed from the formula φ = (¬x_1 ∨ x_2 ∨ ¬x_3) ∧ (x_1 ∨ ¬x_2 ∨ x_3) ∧ (x_1 ∨ x_2 ∨ ¬x_3).
A satisfying assignment s to the variables of φ is s(x_1) = 0, s(x_2) = 1, and s(x_3) = 1, which corresponds to
the hamiltonian cycle shown. Note that if s(x_m) = 1, then edge e_m is in
the hamiltonian cycle, and if s(x_m) = 0, then edge ē_m is in the hamiltonian cycle.
We claim that formula φ is satisfiable if and only if graph G contains a hamiltonian cycle. We first
suppose that G has a hamiltonian cycle h and show that φ is satisfiable. Cycle h must take a particular form:
First, it traverses edge (b_{1,1}, x'_1) to go from the top left to the top right.
It then follows all of the x'_m and x''_m vertices from top to bottom, choosing either edge e_m or edge ē_m,
but not both.
It next traverses edge (b_{k,4}, x''_n) to get back to the left side.
Finally, it traverses the B widgets from bottom to top on the left.
(It actually traverses edges within the A widgets as well, but we use these subgraphs to enforce the either/or
nature of the edges they connect.)
Given the hamiltonian cycle h, we define a truth assignment for φ as follows. If edge e_m belongs to h, then we
set x_m = 1. Otherwise, edge ē_m belongs to h, and we set x_m = 0.
We claim that this assignment satisfies φ. Consider a clause C_i and the corresponding B widget in G. Each
edge (b_{i,j}, b_{i,j+1}) is connected by an A widget to either edge e_m or edge ē_m, depending on whether x_m or ¬x_m
is the jth literal in the clause. The edge (b_{i,j}, b_{i,j+1}) is traversed by h if and only if the corresponding literal is
0. Since each of the three edges (b_{i,1}, b_{i,2}), (b_{i,2}, b_{i,3}), (b_{i,3}, b_{i,4}) in clause C_i is also in a B widget, all three
cannot be traversed by the hamiltonian cycle h. One of the three edges, therefore, must have a
corresponding literal whose assigned value is 1, and clause C_i is satisfied. This property holds for each
clause C_i, i = 1, 2, ..., k, and thus formula φ is satisfied.
Figure 4.15. An instance of the traveling-salesman problem. Shaded edges
represent a minimum-cost tour, with cost 7.
Conversely, let us suppose that formula φ is satisfied by some truth assignment. By following the rules from
above, we can construct a hamiltonian cycle for graph G: traverse edge e_m if x_m = 1, traverse edge ē_m if x_m = 0,
and traverse edge (b_{i,j}, b_{i,j+1}) if and only if the jth literal of clause C_i is 0 under the assignment. These rules
can indeed be followed, since we assume that the assignment is a satisfying assignment for formula φ.
Figure 4.14. The actual construction used when an edge e_m or ē_m is influenced by multiple A widgets. (a)
A portion of Figure 4.13. (b) The actual subgraph constructed.
Finally, we note that graph G can be constructed in polynomial time. It contains one B widget for each of
the k clauses in φ, and so there are 3k A widgets. Since the A and B widgets are of fixed size, the graph G
has O(k) vertices and is easily constructed in polynomial time. Thus we have provided a polynomial-time
reduction from 3-CNF-SAT to HAM-CYCLE.
The Traveling-salesman Problem
In the traveling-salesman problem, which is closely related to the hamiltonian-cycle problem, a salesman
must visit n cities. Modeling the problem as a complete graph with n vertices, we can say that the salesman
wishes to make a tour, or hamiltonian cycle, visiting each city exactly once and finishing at the city he
starts from. There is an integer cost c(i,j) to travel from city i to city j, and the salesman wishes to make
the tour whose total cost is minimum, where the total cost is the sum of the individual costs along the edges
of the tour. For example, in Figure 4.15 a minimum-cost tour is u, w, v, x, u, with cost 7. The formal
language for the traveling-salesman problem is:
TSP = {⟨G, c, k⟩ : G = (V,E) is a complete graph,
c is a function from V × V → Z,
k ∈ Z, and
G has a traveling-salesman tour with cost at most k}.
The following theorem shows that a fast algorithm for the traveling-salesman problem is unlikely to exist.
Theorem
The traveling-salesman problem is NP-complete.
Proof: We first show that TSP belongs to NP. Given an instance of the problem, we use as a certificate the
sequence of n vertices in the tour. The verification algorithm checks that this sequence contains each vertex
exactly once, sums up the edge costs and checks whether the sum is at most k. This process can certainly be
done in polynomial time.
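The verification step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name verify_tsp_certificate, the 0-based vertex numbering, and the representation of c as a nested list are choices made here, not part of the text.

```python
def verify_tsp_certificate(cost, tour, k):
    """Check a TSP certificate in polynomial time.

    cost -- n x n nested list, cost[i][j] is the integer cost c(i, j)
            (assumed representation; vertices are numbered 0 .. n-1)
    tour -- proposed sequence of n vertices
    k    -- cost bound
    """
    n = len(cost)
    # The sequence must contain each vertex exactly once.
    if sorted(tour) != list(range(n)):
        return False
    # Sum the edge costs along the tour, including the edge that
    # returns to the starting city.
    total = sum(cost[tour[i]][tour[(i + 1) % n]] for i in range(n))
    return total <= k
```

Both checks take time polynomial in n: the sort costs O(n log n) comparisons and the sum visits n edges.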
To prove that TSP is NP-hard, we show that HAM-CYCLE ≤_P TSP. Let G = (V, E) be an instance of
HAM-CYCLE. We form the complete graph G' = (V, E') where E' = {(i, j) : i, j ∈ V}, and we define the cost
function c by
c(i, j) = 0 if (i, j) ∈ E,
c(i, j) = 1 if (i, j) ∉ E.
The instance of TSP is then <G', c, 0>, which is easily formed in polynomial time.
We now show that graph G has a hamiltonian cycle if and only if graph G' has a tour of cost at most 0.
Suppose that graph G has a hamiltonian cycle h. Each edge in h belongs to E and thus has cost 0 in G'.
Thus, h is a tour in G' with cost 0. Conversely, suppose that graph G' has a tour h' of cost at most 0. Since
the costs of the edges in E' are 0 and 1, the cost of tour h' is exactly 0. Therefore, h' contains only edges in
E. We conclude that h' is a hamiltonian cycle in graph G.
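The reduction in the proof can be illustrated with a small Python sketch. The function names ham_cycle_to_tsp and has_tour_within are hypothetical, the graph is assumed to be undirected with at least three vertices, and pairs with i = j are left out of the cost table (an assumption made here for simplicity). The brute-force tour search takes exponential time and is included only to check the equivalence on tiny graphs; the reduction itself is the polynomial-time part.

```python
from itertools import permutations

def ham_cycle_to_tsp(vertices, edges):
    """Build the TSP instance (cost function, bound 0) from G = (V, E)."""
    edge_set = {frozenset(e) for e in edges}
    cost = {}
    for i in vertices:
        for j in vertices:
            if i != j:
                # Cost 0 for edges of G, cost 1 for the edges added to complete it.
                cost[(i, j)] = 0 if frozenset((i, j)) in edge_set else 1
    return cost, 0

def has_tour_within(vertices, cost, k):
    """Brute-force test for a tour of cost at most k (tiny inputs only)."""
    vertices = list(vertices)
    first, rest = vertices[0], vertices[1:]
    n = len(vertices)
    for perm in permutations(rest):
        tour = [first, *perm]
        total = sum(cost[(tour[i], tour[(i + 1) % n])] for i in range(n))
        if total <= k:
            return True
    return False
```

For a 4-cycle the constructed instance has a tour of cost 0, while for a star with three leaves every tour must use at least one added edge of cost 1.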
Student Activity 4.5
Answer the following questions:
1. What is the clique problem? Show that the clique problem is NP-complete.
2. What is the vertex-cover problem? Show that this problem is NP-complete.
3. What is the traveling-salesperson problem?
Summary
If any single NP-complete problem can be solved in polynomial time, then every NP-complete
problem has a polynomial-time algorithm.
The formal-language framework allows us to express the relation between decision problems and
the algorithms that solve them concisely.
The circuit-satisfiability problem is NP-hard.
The vertex-cover problem is NP-complete.
Self-assessment Questions
Solved Exercise
I. True and False
1. The class of polynomial-time solvable problems has closure properties.
2. The class of languages decided by polynomial-time algorithms is not a subset of the class of
languages accepted by polynomial-time algorithms.
II. Fill in the blanks
1. The Hamiltonian cycle problem is ______________.
2. The traveling salesman problem is closely related to the _____________________ problem.
3. The NP-___________ languages are, in a sense, the "hardest" languages in NP.
4. If any NP-complete problem is polynomial-time solvable then ________________.
Answers
I. True and False
1. True
2. False
II. Fill in the blanks
1. NP-complete
2. Hamiltonian-cycle
3. complete
4. P=NP
Unsolved Exercise
I. True and False
1. The circuit-satisfiability problem is NP-complete.
2. The subset-sum problem is NP-complete.
II. Fill in the blanks:
1. One of the convenient aspects of focusing on decision problems is that they make it easy to use the
machinery of _________ theory.
2. ___________ reductions provide a formal means for showing that one problem is at least as hard
as another, to within a polynomial-time factor.
3. If any NP-complete problem is polynomial-time solvable then ________.
4. The circuit-satisfiability problem belongs to the class N ________.
5. The clique problem is __________.
Detailed Questions
1. Show that the hamiltonian-path problem is NP-complete.
2. The longest-simple-cycle problem is the problem of determining a simple cycle (no repeated vertex) of
maximum length in a graph. Show that this problem is NP-complete.
3. A hamiltonian path in a graph is a simple path that visits every vertex exactly once. Show that the
language HAM-PATH = {<G, u, v> : there is a hamiltonian path from u to v in graph G} belongs to NP.
4. Show that L is complete for NP if and only if its complement is complete for co-NP.
5. Show that the subset-sum problem is solvable in polynomial time if the target value t is expressed in
unary.
Unit 5
Parallel Algorithms
Learning Objectives
• Overview
• Parallelism
• Computational Model: PRAM and Other Models
• Finding Maximum Element
• Merging
• Sorting
Overview
So far our discussion of algorithms has been confined to single-processor computers. In this block we study
algorithms for parallel machines (i.e. computers with more than one processor). There are many
applications in day-to-day life that demand real-time solutions to problems. For example, weather
forecasting has to be done in a timely fashion. In case of severe hurricanes or snowstorms, evacuation has
to be done in a short period of time. If an expert system is used to aid a physician in surgical procedures,
decisions have to be made within seconds. And so on. Programs written for such applications have to
perform an enormous amount of computation. In the forecasting example, large matrices have to be
operated on. In the medical example, thousands of rules have to be tried. Even the fastest single-processor
machines may not be able to come up with solutions within tolerable time limits. Parallel machines offer
the potential of decreasing the solution time enormously.
Example 1
Assume that you have 5 loads of clothes to wash. Also assume that it takes 25 minutes to wash one load in
a washing machine. Then it will take 125 minutes to wash all the clothes using a single machine. On the
other hand, if you had 5 machines, the washing could be completed in just 25 minutes. In this example, if there
are p washing machines and p loads of clothes, then the washing time can be cut down by a factor of p
compared to having a single machine. Here we have assumed that every machine takes exactly the same
time to wash. If this assumption is invalid, then the washing time will be dictated by the slowest machine.
Example 2
As another example, say there are 100 numbers to be added and there are two persons A and B. Person A
can add the first 50 numbers. At the same time B can add the next 50 numbers. When they are done, one of
them can add the two individual sums to get the final answer. So two people can add the 100 numbers in
almost half the time required by one.
Parallelism
The idea of parallel computing is very similar. Given a problem to solve, we partition the problem into many
subproblems and let each processor work on one of them; when all the processors are done, the partial
solutions are combined to arrive at the final answer. If there are p processors, then potentially we can cut
down the solution time by a factor of p. We refer to any algorithm designed for a single-processor machine
as a sequential algorithm and any designed for a multiprocessor machine as a parallel algorithm.
Definition 1
Let π be a given problem for which the best known sequential algorithm has a run time of S'(n), where n is
the problem size. If a parallel algorithm on a p-processor machine runs in time T'(n, p), then the speedup of
the parallel algorithm is defined to be S'(n)/T'(n, p). If the best known sequential algorithm for π has an
asymptotic run time of S(n) and if T(n, p) is the asymptotic run time of a parallel algorithm, then the
asymptotic speedup of the parallel algorithm is defined to be S(n)/T(n, p). If S(n)/T(n, p) = θ(p), then the
algorithm is said to have linear speedup.
Note: In this block we use the terms speedup and asymptotic speedup interchangeably; which one is meant
is clear from the context.
Example 3
For the problem of Example 2, the 100 numbers can be added sequentially in 99 units of time. Person A
can add 50 numbers in 49 units of time. At the same time, B can add the other 50 numbers. In another unit of
time, the two partial sums can be added; this means that the parallel run time is 50. So the speedup of this
parallel algorithm is 99/50 = 1.98, which is very nearly equal to 2!
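The arithmetic of this example can be checked with a short Python sketch. The generalization to p persons is an assumption made here for illustration: each person adds a share of ⌈n/p⌉ numbers, and then one person adds the p partial sums one after another. The function names are hypothetical.

```python
def parallel_add_time(n, p):
    """Time units for p persons to add n numbers (one addition per unit).

    Assumed scheme: every person adds a share of ceil(n / p) numbers in
    parallel, then a single person combines the p partial sums.
    """
    share = -(-n // p)              # ceil(n / p) using integer arithmetic
    return (share - 1) + (p - 1)    # local additions, then combining

def speedup(sequential_time, parallel_time):
    """Speedup as defined in Definition 1: S'(n) / T'(n, p)."""
    return sequential_time / parallel_time
```

With n = 100 and p = 2 the parallel time is 49 + 1 = 50 units, and the speedup against the 99 sequential additions is 99/50.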
Example 4
There are many sequential algorithms for sorting, such as heap sort, that are optimal and run in time
θ(n log n), n being the number of keys to be sorted. Let A be an n-processor parallel algorithm that sorts n
keys in θ(log n) time and let B be an n^2-processor algorithm that also sorts n keys in θ(log n) time.
Then the speedup of A is θ(n log n)/θ(log n) = θ(n). On the other hand, the speedup of B is also
θ(n log n)/θ(log n) = θ(n).
Algorithm A has linear speedup, since it uses p = n processors, whereas B does not have a linear speedup,
since its speedup θ(n) is not θ(p) for p = n^2.
Definition 2
If a p-processor parallel algorithm for a given problem runs in time T(n, p), the total work done by this
algorithm is defined to be pT(n, p). The efficiency of the algorithm is defined to be S(n)/pT(n, p), where
S(n) is the asymptotic run time of the best known sequential algorithm for solving the same problem. Also,
the parallel algorithm is said to be work-optimal if pT(n, p) = O(S(n)).
Note: A parallel algorithm is work-optimal if and only if it has linear speedup. Also, the efficiency of a
work-optimal parallel algorithm is θ(1).
Example 5
Let w be the time to wash one load of clothes on a single machine in Example 1. Also let n be the total
number of loads to wash. A single machine will take time nw. If there are p machines, the washing time is
⌈n/p⌉w. Thus the speedup is n/⌈n/p⌉. This speedup is > p/2 if n ≥ p. So the asymptotic speedup is Ω(p) and
hence the parallel algorithm has linear speedup and is work-optimal. Also, the efficiency is
nw/(p⌈n/p⌉w). This is θ(1) if n ≥ p.
Example 6
For the algorithm A of Example 4, the total work done is nθ(log n) = θ(n log n). Its efficiency is
θ(n log n)/θ(n log n) = θ(1). Thus A is work-optimal and has a linear speedup. The total work done by the
algorithm B is n^2 θ(log n) = θ(n^2 log n) and its efficiency is θ(n log n)/θ(n^2 log n) = θ(1/n). As a result,
B is not work-optimal.
Is it possible to get a speedup of more than p for any problem on a p-processor machine? Assume that it is
possible (such a speedup is called superlinear speedup). In particular, let π be the problem under
consideration and s be the best known sequential run time. If there is a parallel algorithm on a p-processor
machine whose speedup is better than p, it means that the parallel run time T satisfies T < s/p, that is, pT < s.
But a single processor can simulate the p processors step by step in time at most pT, which would give a
sequential algorithm with run time less than s. This is a contradiction, since by assumption s is the run time
of the best known sequential algorithm for solving π!
The preceding discussion is valid only when we consider asymptotic speedups. When the speedup is
defined with respect to the actual run times on the sequential and parallel machines, it is possible to obtain
superlinear speedup. Two of the possible reasons for such an anomaly are (1) p processors have more
aggregate memory than one and (2) the cache-hit frequency may be better for the parallel machine, as the
p processors may have more aggregate cache than does one processor.
One way of solving a given problem in parallel is to explore many techniques (i.e. algorithms) and identify
the one that is the most parallelizable. To achieve a good speedup, it is necessary to parallelize every
component of the underlying technique. If a fraction f of the technique cannot be parallelized (i.e. has to be
run sequentially), then the maximum speedup that can be obtained is limited by f. Amdahl's law relates the
maximum speedup achievable with f and p as follows.
Lemma 1
Maximum speedup = 1/(f + (1 − f)/p)
Example 7
Consider some technique for solving a problem π. Assume that p = 10. If f = 0.5 for this technique, then
the maximum speedup that can be obtained is 1/(0.5 + (1 − 0.5)/10) = 20/11, which is less than 2! If f = 0.1,
then the maximum speedup is 10/1.9, which is slightly more than 5! Finally, if f = 0.01, then the maximum
speedup is 10/1.09, which is slightly more than 9!
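Amdahl's law is easy to evaluate directly. The following Python sketch (the function name amdahl_max_speedup is chosen here for illustration) computes the bound 1/(f + (1 − f)/p) for a sequential fraction f and p processors.

```python
def amdahl_max_speedup(f, p):
    """Maximum speedup when a fraction f of the work is inherently
    sequential and the remaining 1 - f is spread over p processors."""
    if not 0.0 <= f <= 1.0 or p < 1:
        raise ValueError("need 0 <= f <= 1 and p >= 1")
    return 1.0 / (f + (1.0 - f) / p)

if __name__ == "__main__":
    # The three cases of Example 7, all with p = 10.
    for f in (0.5, 0.1, 0.01):
        print(f, amdahl_max_speedup(f, 10))
```

As p grows, the term (1 − f)/p vanishes and the bound approaches 1/f, so even a small sequential fraction caps the achievable speedup.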
Student Activity 5.1
Before going to next section, answer the following questions:
1. Explain the importance of parallel processing with an example.
2. What do you mean by a work-optimal parallel algorithm?
If your answers are correct, then proceed to next section.
Computational Model: PRAM and other Models
The sequential computational model we have employed so far is the RAM (random access machine). In the
RAM model we assume that any of the following operations can be performed in one unit of time :
addition, subtraction, multiplication, division, comparison, memory access, assignment and so on. This
model has been widely accepted as a valid sequential model. On the other hand, when it comes to parallel
computing, numerous models have been proposed and algorithms have been designed for each such model.
An important feature of parallel computing that is absent in sequential computing is the need for
interprocessor communication. For example, given any problem, the processors have to communicate among
themselves and agree on the subproblems each will work on. Also, they need to communicate to see whether
everyone has finished its task, and so on. Each machine or processor in a parallel computer can be assumed
to be a RAM. Various parallel models differ in the way they support interprocessor communication.
Parallel models can be broadly categorized into two: fixed connection machines and shared memory
machines.
Figure 5.1. (a) Mesh. (b) Hypercube. (c) Butterfly.
A fixed connection network is a graph G(V, E) whose nodes represent processors and whose edges represent
communication links between processors. Usually we assume that the degree of each node is either a
constant or a slowly increasing function of the number of nodes in the graph. Examples include the mesh,
hypercube, butterfly, and so on (see Figure 5.1).
Interprocessor communication is done through the communication links. Any two processors connected by
an edge in G can communicate in one step. In general, two processors can communicate through any of the
paths connecting them. The communication time depends on the lengths of these paths (at least for small
packets).
PRAM
In shared memory models [also called PRAMs (Parallel Random Access Machines)], a number (say p) of
processors work synchronously. They communicate with each other using a common block of global
memory that is accessible by all. This global memory is also called common or shared memory (see Figure
5.2). Communication is performed by writing to and/or reading from the common memory. Any two
processors i and j can communicate in two steps. In the first step, processor i writes its message into
memory cell j, and in the second step processor j reads from this cell. In contrast, in a fixed connection
machine, the communication time depends on the lengths of the paths connecting the communicating
processors.
Each processor in a PRAM is a RAM with some local memory. A single step of a PRAM algorithm can be
one of the following: arithmetic operation (such as addition, division and so on), comparison, memory
access (local or global), assignment, etc. The number (m) of cells in the global memory is typically assumed
to be the same as p. But this need not always be the case.
Figure 5.2. A Parallel Random Access Machine (processors 1, 2, 3, ..., p connected to global memory cells 1, 2, 3, 4, ..., m)
In fact we present algorithms for which m is much larger or smaller than p. We also assume that the input is
given in the global memory and there is space for the output and for storing intermediate results. Since the
global memory is accessible by all processors, access conflicts may arise. What happens if more than one
processor tries to access the same global memory cell (for the purpose of reading from or writing into)? There
are several ways of resolving read and write conflicts. Accordingly, several variants of the PRAM arise.
The EREW (Exclusive Read and Exclusive Write) PRAM is the shared memory model in which no
concurrent read or write is allowed on any cell of the global memory. Note that ER or EW does not
preclude different processors simultaneously accessing different memory cells. For example, at a given
time step, processor one might access cell five and at the same time processor two might access cell 12 and
so on. But processors one and two cannot access memory cell ten, for example, at the same time. The CREW
(Concurrent Read Exclusive Write) PRAM is a variation that permits concurrent reads but not concurrent
writes. Similarly one could also define the ERCW model. Finally, the CRCW PRAM model allows both
concurrent reads and concurrent writes.
In a CREW or CRCW PRAM, if more than one processor tries to read from the same cell, clearly, they will
read the same information. But in a CRCW PRAM, if more than one processor tries to write in the same
cell, then possibly they may have different messages to write. Thus there has to be an additional mechanism
to determine which message gets to be written. Accordingly, several variations of the CRCW PRAM can be
derived. In a common CRCW PRAM, concurrent writes are permitted in any cell only if all the processors
conflicting for this cell have the same message to write. In an arbitrary CRCW PRAM, if there is a
conflict for writing, one of the processors will succeed in writing and we don't know which one. Any
algorithm designed for this model should work no matter which processor succeeds in the event of
conflicts. The priority CRCW PRAM lets the processor with the highest priority succeed in the case of conflicts.
Typically each processor is assigned a (static) priority to begin with.
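The three write-conflict rules can be mimicked by a small sequential Python sketch. Everything here is illustrative: the function name resolve_concurrent_write is hypothetical, requests are modeled as (processor id, value) pairs aimed at one cell, and the convention that a smaller processor id means higher priority is an assumption, since the text only says that priorities are static.

```python
import random

def resolve_concurrent_write(requests, policy, rng=random):
    """Return the value left in a cell after one concurrent write step.

    requests -- list of (processor_id, value) pairs for the same cell
    policy   -- "common", "arbitrary" or "priority"
    """
    if not requests:
        raise ValueError("no write requests")
    if policy == "common":
        # Allowed only if every processor writes the same message.
        values = {value for _, value in requests}
        if len(values) != 1:
            raise ValueError("common CRCW: processors disagree")
        return values.pop()
    if policy == "arbitrary":
        # Any one writer may succeed; algorithms must not rely on which.
        return rng.choice(requests)[1]
    if policy == "priority":
        # Assumption: the smallest processor id has the highest priority.
        return min(requests, key=lambda r: r[0])[1]
    raise ValueError("unknown policy: " + policy)
```

An algorithm that is correct under the common rule stays correct under the other two, because whenever all writers agree, every rule leaves the same value in the cell.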
Example 8
Consider a 4-processor machine and also consider an operation in which each processor has to read from
the global cell M[1]. This operation can be denoted as
Processor i (in parallel for 1 ≤ i ≤ 4) does:
Read M[1];
This concurrent read operation can be performed in one unit of time on the CRCW as well as on the CREW
PRAMs. But on the EREW PRAM, concurrent reads are prohibited. Still, we can perform this operation on
the EREW PRAM by making sure that at any given time no two processors attempt to read from the same
memory cell. One way of performing this is as follows: processor 1 reads M[1] at the first time unit;
processor 2 reads M[1] at the second time unit; and processors 3 and 4 read M[1] at the third and fourth time
units, respectively. The total run time is four.
Now consider the operation in which each processor has to access M[1] for writing at the same time. Since
only one message can be written to M[1], one has to assume some scheme for resolving contentions. This
operation can be denoted as
Processor i (in parallel for 1 ≤ i ≤ 4) does:
Write M[1];
Again, on the CRCW PRAM, this operation can be completed in one unit of time. On the CREW and
EREW PRAMs, concurrent writes are prohibited. However, these models can simulate the effect of a
concurrent write. Consider our simple example of four processors trying to write in M[1]. Simulating a
common CRCW PRAM requires the four processors to verify that all wish to write the same value.
Following this, processor 1 can do the writing. Simulating a priority CRCW PRAM requires the four
processors to first determine which has the highest priority, and then the one with this priority does the
write. Other models may be similarly simulated.
Note that any algorithm that runs on a p-processor EREW PRAM in time T(n, p), where n is the problem
size, can also run on a p-processor CREW PRAM or a CRCW PRAM within the same time. But a CRCW
PRAM algorithm or a CREW PRAM algorithm may not be implementable on an EREW PRAM preserving
the asymptotic run time. In Example 8, we saw that the implementation of a single concurrent write or
concurrent read step takes much more time on the EREW PRAM. Likewise, a p-processor CRCW PRAM
algorithm may not be implementable on a p-processor CREW PRAM preserving the asymptotic run time.
It turns out that there is a strict hierarchy among the variants of the PRAM in terms of their computational
power. For example, a CREW PRAM is strictly more powerful than an EREW PRAM. This means that
there is at least one problem that can be solved in asymptotically less time on a CREW PRAM than on an
EREW PRAM, given the same number of processors. Also, any version of the CRCW PRAM is more
powerful than a CREW PRAM, as is demonstrated by Example 9.
Example 9
A[0] = A[1] ∨ A[2] ∨ ... ∨ A[n] is the Boolean (or logical) OR of the n bits A[1 : n]. A[0] is easily computed in
O(n) time on a RAM. The following algorithm shows how A[0] can be computed in θ(1) time using an
n-processor CRCW PRAM.
Assume that A[0] is zero to begin with. In the first time step, processor i, for 1 ≤ i ≤ n, reads memory
location A[i] and proceeds to write a 1 in memory location A[0] if A[i] is a 1. Since several of the A[i] may
be 1, several processors may write to A[0] concurrently. Hence the algorithm cannot be run (as such) on an
EREW or CREW PRAM. In fact, for these two models, it is known that the parallel complexity of the
Boolean OR problem is Ω(log n), no matter how many processors are used. Note that this algorithm works
on all three varieties of the CRCW PRAM.
Processor i (in parallel for 1 ≤ i ≤ n) does:
if (A[i] == 1) A[0] = A[i];
Theorem
The Boolean OR of n bits can be computed in O(1) time on an n-processor common CRCW PRAM.
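The single parallel step of the Boolean OR algorithm can be imitated sequentially in Python. This sketch only models the effect of the concurrent write; it does not run in constant time itself, and the function name crcw_boolean_or is chosen here for illustration.

```python
def crcw_boolean_or(bits):
    """Model the one-step CRCW PRAM computation of A[0] = OR of bits.

    bits plays the role of A[1 : n]; the return value is A[0].
    """
    a0 = 0  # A[0] is zero to begin with
    # Collect the writes that the processors would issue in the one
    # parallel step: processor i writes only if A[i] == 1.
    writes = [bit for bit in bits if bit == 1]
    # Every writer writes the same value 1, so the common, arbitrary
    # and priority rules all leave 1 in A[0].
    if writes:
        a0 = writes[0]
    return a0
```

Because all conflicting processors write the identical value, the common CRCW model already suffices, which is what the theorem states.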
There exists a hierarchy among the different versions of the CRCW PRAM also. Common, arbitrary, and
priority form an increasing hierarchy of computing power. Let EREW(p, T(n, p)) denote the set of all
problems that can be solved using a p-processor EREW PRAM in time T(n, p) (n being the problem size).
Similarly define CREW(p, T(n, p)) and CRCW(p, T(n, p)). Then
EREW(p, T(n, p)) ⊂ CREW(p, T(n, p)) ⊂ Common CRCW(p, T(n, p))
⊂ Arbitrary CRCW(p, T(n, p)) ⊂ Priority CRCW(p, T(n, p))
In practice a problem of size n is solved on a computer with a constant number p of processors. All the
algorithms designed under some assumptions about the relationship between n and p can also be used when
fewer processors are available, as there is a general slowdown lemma for the PRAM model.
Let A be a parallel algorithm for solving problem π that runs in time T using p processors. The slowdown
lemma concerns the simulation of the same algorithm on a p'-processor machine (for p' < p).
Each step of algorithm A can be simulated on the p'-processor machine (call it M) in time
≤ ⌈p/p'⌉, since a processor of M can be in charge of simulating ⌈p/p'⌉ processors of the original machine.
Thus, the simulation time on M is ≤ T⌈p/p'⌉. Therefore the total work done on M is ≤ p'T⌈p/p'⌉
≤ pT + p'T = O(pT). This results in the following lemma.
Lemma 2
[Slowdown lemma] Any parallel algorithm that runs on a p-processor machine in time T can be run on a
p'-processor machine in time O(pT/p'), for any p' < p.
Example 10
The algorithm of Example 9 runs in θ(1) time using n processors. Using the slowdown lemma, the same
algorithm also runs in θ(log n) time using n/log n processors; it also runs in θ(√n) time using √n processors;
and so on. When p = 1, the algorithm runs in time θ(n), which is the same as the run time of the best
sequential algorithm!
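The counting argument behind the slowdown lemma can be made concrete with a short Python sketch. The names schedule and simulated_time are hypothetical, and processors are numbered from 0 here. Each parallel step of the p-processor algorithm is replayed in rounds, with at most p' of the original processors handled per round.

```python
def schedule(p, p_prime):
    """Split the p original processors into rounds of at most p_prime.

    Returns a list of rounds; each round lists the original processors
    whose work is simulated in that time unit.
    """
    return [list(range(start, min(start + p_prime, p)))
            for start in range(0, p, p_prime)]

def simulated_time(T, p, p_prime):
    """Upper bound T * ceil(p / p_prime) on the simulation time."""
    return T * -(-p // p_prime)   # integer ceiling of p / p_prime
```

For instance, a one-step algorithm on 16 processors needs 4 rounds on 4 processors and 16 rounds on a single processor, matching the pattern of Example 10.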
Student Activity 5.2
Before going to next section, answer the following questions:
1. Define EREW PRAM model.
2. Differentiate between EREW PRAM and CRCW PRAM.
If your answers are correct, then proceed to next section.
Finding Maximum Element
Algorithms to find the maximum element in a list using more than one processor are given below.
Maximal Selection with n^2 Processors
Finding the maximum of n given numbers can be done in O(1) time using an n^2-processor CRCW PRAM.
Let k_1, k_2, ..., k_n be the input. The idea is to perform all pairs of comparisons in one step using n^2
processors. If we name the processors p_ij (for 1 ≤ i ≤ n, 1 ≤ j ≤ n), processor p_ij computes x_ij = (k_i < k_j).
Without loss of generality assume that all the keys are distinct. Even if they are not, they can be made
distinct by replacing key k_i with the tuple (k_i, i) (for 1 ≤ i ≤ n); this amounts to appending each key with
only a (log n)-bit number. Of all the input keys, there is only one key which, when compared with every
other key, would have yielded the same bit zero. This key can be identified using the Boolean OR algorithm
and is the maximum of all. The resultant algorithm appears as follows:
Step 0. If n = 1, output the key.
Step 1. Processor p_ij (for each 1 ≤ i, j ≤ n in parallel) computes x_ij = (k_i < k_j).
Step 2. The n^2 processors are grouped into n groups G_1, G_2, ..., G_n, where G_i (1 ≤ i ≤ n) consists
of the processors p_i1, p_i2, ..., p_in. Each group G_i computes the Boolean OR of x_i1, x_i2, ..., x_in.
Step 3. If G_i computes a zero in step 2, then processor p_i1 outputs k_i as the answer.
Steps 1 and 3 of this algorithm take unit time each. Step 2 takes O(1) time. Thus the whole algorithm runs in
O(1) time; this implies the following theorem:
Theorem
The maximum of n keys can be computed in O(1) time using n^2 common CRCW PRAM processors.
Note that the speedup of the previous algorithm is θ(n)/1 = θ(n). The total work done by this algorithm is θ(n^2).
Hence its efficiency is θ(n)/θ(n^2) = θ(1/n). Clearly this algorithm is not work-optimal.
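The all-pairs algorithm can be traced with a sequential Python sketch. It performs the same n^2 comparisons one after another, so it shows the logic of steps 0 to 3 rather than the O(1) parallel running time. The function name crcw_max_n_squared and the 0-based indexing are choices made here.

```python
def crcw_max_n_squared(keys):
    """Sequential trace of maximal selection with n^2 processors."""
    n = len(keys)
    if n == 0:
        raise ValueError("need at least one key")
    if n == 1:                       # Step 0
        return keys[0]
    # Make the keys distinct by pairing each key with its index.
    tagged = [(k, i) for i, k in enumerate(keys)]
    # Step 1: processor p_ij computes x_ij = (k_i < k_j).
    x = [[tagged[i] < tagged[j] for j in range(n)] for i in range(n)]
    # Step 2: group G_i computes the Boolean OR of x_i1, ..., x_in.
    group_or = [any(row) for row in x]
    # Step 3: the single group that computed zero holds the maximum.
    for i in range(n):
        if not group_or[i]:
            return keys[i]
```

Only the largest tagged key is never smaller than another, so exactly one group reports zero.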
Finding the maximum using n processors
Now we show that maximal selection can be done in O(log log n) time using n common CRCW PRAM
processors. The technique to be employed is divide and conquer. To simplify the discussion, we assume n
is a perfect square (when n is not a perfect square, replace √n by ⌈√n⌉ in the following discussion).
Let the input sequence be k_1, k_2, ..., k_n. We are interested in developing an algorithm that can find the
maximum of n keys using n processors. Let T(n) be the run time of this algorithm. We partition the input
into √n parts so that the maximum of each part can be computed in parallel. Since the recursive maximal
selection of each part involves √n keys and an equal number of processors, this can be done in T(√n) time.
Let M_1, M_2, ..., M_√n be the group maxima. The answer we are supposed to output is the maximum of these
maxima. Since now we only have √n keys, we can find the maximum of these employing all the n
processors (see the following algorithm).
Step 0. If n = 1, return k_1.
Step 1. Partition the input keys into √n parts K_1, K_2, ..., K_√n, where K_i consists of k_{(i−1)√n+1},
k_{(i−1)√n+2}, ..., k_{i√n}. Similarly partition the processors so that P_i (1 ≤ i ≤ √n) consists of the
processors p_{(i−1)√n+1}, p_{(i−1)√n+2}, ..., p_{i√n}. Let P_i find the maximum of K_i recursively (for 1
≤ i ≤ √n).
Step 2. If M_1, M_2, ..., M_√n are the group maxima, find and output the maximum of these
maxima employing the theorem of the previous section.
Step 1 of this algorithm takes T(√n) time and step 2 takes O(1) time. Thus T(n) satisfies the recurrence
T(n) = T(√n) + O(1)
which solves to T(n) = O(log log n). Therefore, the following theorem arises.
Theorem
The maximum of n keys can be found in O(log log n) time using n common CRCW PRAM processors.
The total work done by the above algorithm is θ(n log log n) and its efficiency is θ(n)/θ(n log log n) =
θ(1/log log n).
Thus this algorithm is not work-optimal.
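The recursion T(n) = T(√n) + O(1) can be observed with a sequential Python sketch that also reports the depth of the recursion. The names doubly_log_max and max_by_all_pairs are hypothetical; parts of size ⌈√n⌉ are used so that n need not be a perfect square, and inputs of at most two keys are treated as the base case (an assumption made here so that the partition always shrinks).

```python
import math

def max_by_all_pairs(keys):
    """Stand-in for the O(1)-time all-pairs selection of the previous
    section; run sequentially here."""
    n = len(keys)
    for i in range(n):
        if not any((keys[i], i) < (keys[j], j) for j in range(n)):
            return keys[i]

def doubly_log_max(keys):
    """Return (maximum, recursion depth) for a non-empty list of keys."""
    n = len(keys)
    if n == 0:
        raise ValueError("need at least one key")
    if n <= 2:                                   # base case
        return max_by_all_pairs(keys), 0
    size = math.isqrt(n - 1) + 1                 # ceil(sqrt(n))
    # Step 1: split into about sqrt(n) parts and recurse on each part
    # (these recursive calls run in parallel on a PRAM).
    results = [doubly_log_max(keys[s:s + size]) for s in range(0, n, size)]
    maxima = [m for m, _ in results]
    depth = 1 + max(d for _, d in results)
    # Step 2: about sqrt(n) maxima, so n processors can afford all pairs.
    return max_by_all_pairs(maxima), depth
```

With 256 keys the parts have sizes 16, 4 and 2 along the recursion, so the depth is 3, which illustrates the log log n growth.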
Maximal Selection Among Integers
Consider again the problem of finding the maximum of n given keys. If each one of these keys is a bit, the
problem of finding the maximum reduces to computing the Boolean OR of n bits and hence can be done in
O(1) time using n common CRCW PRAM processors. This raises the following question: What can be the
maximum magnitude of each key if we desire a constant time algorithm for maximal selection using n
processors? Answering this question in its full generality is beyond the scope of this syllabus. Instead we
show that if each key is an integer in the range [0, n^c], where c is a constant, maximal selection can be done
work-optimally in O(1) time. The speedup of this algorithm is θ(n) and its efficiency is θ(1).
Since each key is of magnitude at most n^c, it follows that each key is a binary number with ≤ c log n bits.
Without loss of generality assume that every key is of length exactly equal to c log n. Suppose we find the
maximum of the n keys only with respect to their (log n)/2 most significant bits (MSBs) (see Figure 5.3). Let
M be this maximum value. Then any key whose (log n)/2 MSBs are less than M can be
dropped from future consideration, since it cannot possibly be the maximum. After this, many keys can
potentially survive. Next we compute the maximum of the remaining keys with respect to their next (log n)/2
MSBs and drop keys that cannot possibly be the maximum.
Figure 5.3. Finding the Integer Maximum (each key k_1, k_2, ..., k_n is divided into parts of (log n)/2 bits)
We repeat this basic step 2c times (once for every (log n)/2 bits in the input keys). One of the keys that survive
the very last step can be output as the maximum. Refer to the (log n)/2 MSBs of any key as its first part, the
next most significant (log n)/2 bits as its second part, and so on. There are 2c parts for each key. The 2c-th part
may have fewer than (log n)/2 bits. The algorithm is summarized below. To begin with, all the keys are alive.
for (i = 1; i <= 2c; i++)
{
Step 1. Find the maximum of all alive keys with respect to their ith parts. Let M be the
maximum.
Step 2. Delete each alive key whose ith part is < M.
}
Output one of the alive keys.
We now show that step 1 of this algorithm can be completed in O(1) time using n common CRCW PRAM
processors. Note that if a key has at most (log n)/2 bits, its maximum magnitude is √n − 1. Thus each step of
this algorithm is nothing but the task of finding the maximum of n keys, where each key is an integer in the
range [0, √n − 1]. Assign one processor to each key. Make use of √n global memory cells (which are
initialized to −∞). Call these cells M_0, M_1, ..., M_{√n−1}. In one parallel write step, if processor i has a key
k_i, then it tries to write k_i in M_{k_i}. For example, if processor i has a key valued 10, it will attempt to write
10 in M_10.
After this write step, the problem of computing the maximum of the n keys reduces to computing the
maximum of the contents of M_0, M_1, ..., M_{√n−1}. Since these are only √n numbers, their maximum can be
found in O(1) time using n processors. As a result we get the following theorem.
Theorem
The maximum of n keys can be found in O(1) time using n CRCW PRAM processors provided the keys are
integers in the range [0, n^c] for any constant c.
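The part-by-part elimination can be followed in a sequential Python sketch. The names and parameters are assumptions made for illustration: part_bits stands for (log n)/2, num_parts stands for 2c, keys are padded so that all parts have equal length, and None plays the role of −∞ in the memory cells.

```python
def integer_max(keys, part_bits, num_parts):
    """Maximum of non-negative integers that fit in part_bits * num_parts
    bits, found by scanning the parts from most to least significant."""
    if not keys:
        raise ValueError("need at least one key")
    alive = list(keys)
    mask = (1 << part_bits) - 1
    for i in range(num_parts):
        shift = part_bits * (num_parts - 1 - i)
        # Step 1: one concurrent write step. A key whose ith part is v
        # writes v into cell M[v]; equal parts write equal values.
        cells = [None] * (1 << part_bits)
        for k in alive:
            part = (k >> shift) & mask
            cells[part] = part
        # Maximum of the few cell contents (O(1) time on the PRAM).
        m = max(c for c in cells if c is not None)
        # Step 2: delete each alive key whose ith part is below m.
        alive = [k for k in alive if (k >> shift) & mask == m]
    return alive[0]
```

For 4-bit keys split into two 2-bit parts, the keys 5, 12, 7, 12, 3 are reduced to the two copies of 12 after the first round, and either survivor is a correct answer.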
Student Activity 5.3
Before going to next section, answer the following questions:
1. Write an algorithm for finding the maximum using n^2 processors.
2. Describe method for finding the maximum using n processors.
If your answers are correct, then proceed to next section.
Merging
The problem of merging is to take two sorted sequences as input and produce a sorted sequence of all the
elements. Merging is an important problem. For example, an efficient merging algorithm can lead to an
efficient sorting algorithm. The same is true in parallel computing also. In this section we study the parallel
complexity of merging.
A Logarithmic Time Algorithm
Let X_1 = k_1, k_2, ..., k_m and X_2 = k_{m+1}, k_{m+2}, ..., k_{2m} be the input sorted sequences to be merged.
Assume without loss of generality that m is an integral power of 2 and that the keys are distinct. Note that
the merging of X_1 and X_2 can be reduced to computing the rank of each key k in X_1 ∪ X_2. If we know the
rank of each key, then the keys can be merged by writing the key whose rank is i into global memory cell i.
This writing will take only one time unit if we have n = 2m processors.
For any key k, let its rank in X_1 (X_2) be denoted as r_1^k (r_2^k). If k = k_j ∈ X_1, then note that r_1^k = j. If we
allocate a single processor π to k, π can perform a binary search on X_2 and figure out the number q of keys
in X_2 that are less than k. Once q is known, π can compute k's rank in X_1 ∪ X_2 as j + q. If k belongs to X_2, a
similar procedure can be used to compute its rank in X_1 ∪ X_2. In summary, if we have 2m processors (one
processor per key), merging can be completed in O(log m) time.
Theorem
Merging of two sorted sequences each of length m can be completed in O(log m) time using m CREW PRAM processors.
Since two sorted sequences of length m each can be sequentially merged in Θ(m) time, the speedup of the above algorithm is Θ(m)/Θ(log m) = Θ(m/log m); its efficiency is Θ(m)/Θ(m log m) = Θ(1/log m). This algorithm is not work-optimal!
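The ranking idea above can be sketched sequentially in C. This is only an illustration under stated assumptions: the function names rank_merge and count_less are hypothetical, arrays are indexed from 0 (so ranks are zero-based), and the loop body stands for the work that one PRAM processor per key would do in parallel.

```c
#include <stddef.h>

/* Number of keys in the sorted array a[0..len-1] that are strictly less than key
   (binary search, O(log len) comparisons). */
static size_t count_less(const int *a, size_t len, int key)
{
    size_t lo = 0, hi = len;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (a[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Merge the sorted sequences x1[0..m-1] and x2[0..m-1] (all keys distinct)
   into out[0..2m-1] by writing each key into the cell given by its rank. */
void rank_merge(const int *x1, const int *x2, size_t m, int *out)
{
    for (size_t j = 0; j < m; j++) {
        /* zero-based rank = position in own sequence + smaller keys in the other */
        out[j + count_less(x2, m, x1[j])] = x1[j];
        out[j + count_less(x1, m, x2[j])] = x2[j];
    }
}
```

Every iteration is independent of the others, which is why the 2m searches can run at the same time on a CREW PRAM.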
Odd-Even Merge
Odd-even merge is a merging algorithm based on divide and conquer that lends itself to efficient parallelization. If $X_1 = k_1, k_2, \ldots, k_m$ and $X_2 = k_{m+1}, \ldots, k_{2m}$ (where m is an integral power of 2) are the two sorted sequences to be merged, then the following algorithm uses 2m processors.
Step 0. If m=1, merge the sequences with one comparison.
Step 1. Partition $X_1$ and $X_2$ into their odd and even parts. That is, partition $X_1$ into $X_1^{odd} = k_1, k_3, \ldots, k_{m-1}$ and $X_1^{even} = k_2, k_4, \ldots, k_m$. Similarly, partition $X_2$ into $X_2^{odd}$ and $X_2^{even}$.
Step 2. Recursively merge $X_1^{odd}$ with $X_2^{odd}$ using m processors. Let $L_1 = l_1, l_2, \ldots, l_m$ be the result. Note that $X_1^{odd}$, $X_1^{even}$, $X_2^{odd}$ and $X_2^{even}$ are in sorted order. At the same time merge $X_1^{even}$ with $X_2^{even}$ using the other m processors to get $L_2 = l_{m+1}, l_{m+2}, \ldots, l_{2m}$.
Step 3. Shuffle $L_1$ and $L_2$; that is, form the sequence $L = l_1, l_{m+1}, l_2, l_{m+2}, \ldots, l_m, l_{2m}$. Compare every pair $(l_{m+i}, l_{i+1})$ and interchange them if they are out of order. That is, compare $l_{m+1}$ with $l_2$ and interchange them if need be, compare $l_{m+2}$ with $l_3$ and interchange them if need be, and so on. Output the resultant sequence.
Example 11
Let $X_1$ = 2, 5, 8, 11, 13, 16, 21, 25 and $X_2$ = 4, 9, 12, 18, 23, 27, 31, 34. Figure 5.4 shows how the odd-even merge algorithm can be used to merge these two sorted sequences.
$X_1$ = 2, 5, 8, 11, 13, 16, 21, 25
$X_2$ = 4, 9, 12, 18, 23, 27, 31, 34
Partition: $X_1^{odd}$ = 2, 8, 13, 21; $X_1^{even}$ = 5, 11, 16, 25; $X_2^{odd}$ = 4, 12, 23, 31; $X_2^{even}$ = 9, 18, 27, 34
Merge: $L_1$ = 2, 4, 8, 12, 13, 21, 23, 31; $L_2$ = 5, 9, 11, 16, 18, 25, 27, 34
Shuffle: L = 2, 5, 4, 9, 8, 11, 12, 16, 13, 18, 21, 25, 23, 27, 31, 34
Compare-exchange: 2, 4, 5, 8, 9, 11, 12, 13, 16, 18, 21, 23, 25, 27, 31, 34
Figure 5.4: Odd-even merge: an example
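Steps 0 to 3 can be traced with a sequential C sketch. The function name odd_even_merge, the scratch buffer and the error return are assumptions made for illustration; the two recursive calls and the compare-exchange loop are the parts that a PRAM would execute in parallel.

```c
#include <stdlib.h>

/* Swap a[i] and a[j] if they are out of order. */
static void compare_exchange(int *a, size_t i, size_t j)
{
    if (a[i] > a[j]) { int t = a[i]; a[i] = a[j]; a[j] = t; }
}

/* Merge sorted x1[0..m-1] and x2[0..m-1] into out[0..2m-1].
   m must be a power of two. Returns 0 on success, -1 if memory runs out. */
int odd_even_merge(const int *x1, const int *x2, size_t m, int *out)
{
    if (m == 1) {                                   /* Step 0 */
        out[0] = x1[0] < x2[0] ? x1[0] : x2[0];
        out[1] = x1[0] < x2[0] ? x2[0] : x1[0];
        return 0;
    }
    size_t h = m / 2;
    int *buf = malloc(4 * m * sizeof *buf);         /* four parts of h keys, then L1 and L2 */
    if (buf == NULL)
        return -1;
    int *o1 = buf, *e1 = buf + h, *o2 = buf + 2 * h, *e2 = buf + 3 * h;
    int *l1 = buf + 2 * m, *l2 = buf + 3 * m;

    for (size_t i = 0; i < h; i++) {                /* Step 1: k1, k3, ... sit at indices 0, 2, ... */
        o1[i] = x1[2 * i];  e1[i] = x1[2 * i + 1];
        o2[i] = x2[2 * i];  e2[i] = x2[2 * i + 1];
    }
    int rc = odd_even_merge(o1, o2, h, l1);         /* Step 2: odd parts */
    if (rc == 0)
        rc = odd_even_merge(e1, e2, h, l2);         /* Step 2: even parts */
    if (rc == 0) {
        for (size_t i = 0; i < m; i++) {            /* Step 3: shuffle L1 and L2 */
            out[2 * i] = l1[i];
            out[2 * i + 1] = l2[i];
        }
        for (size_t i = 0; i + 1 < m; i++)          /* Step 3: pairs (l_{m+i}, l_{i+1}) */
            compare_exchange(out, 2 * i + 1, 2 * i + 2);
    }
    free(buf);
    return rc;
}
```

Called with the two sequences of Example 11 and m = 8, the sketch reproduces the sequence in the last row of Figure 5.4.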
The correctness of the merging algorithm can be established using the zero-one principle. The validity of this principle is not proved here.
Theorem
[Zero-one principle] If any oblivious comparison-based sorting algorithm sorts an arbitrary sequence of n zeros and ones correctly, then it will also sort any sequence of n arbitrary keys.
A comparison-based sorting algorithm is said to be oblivious if the sequence of cells to be compared in the algorithm is prespecified. That is, the next pair of cells to be compared cannot depend on the outcome of comparisons made in the previous steps.
Student Activity 5.4
Before going to the next section, answer the following questions:
1. Explain the odd-even merge algorithm.
2. Merge the following two files by the odd-even merge algorithm:
X1 = 5, 9, 10, 13, 15
X2 = 4, 8, 10, 11, 14
If your answers are correct, then proceed to the next section.
Sorting
Given a sequence of n keys, recall that the problem of sorting is to rearrange this sequence into either ascending or descending order. In this section we study algorithms for parallel sorting. If we have $n^2$ processors, the rank of each key can be computed in O(log n) time by comparing, in parallel, all possible pairs of keys. Once we know the rank of each key, the keys can be written in sorted order in one parallel write step (the key whose rank is i is written into cell i). Thus we have the following theorem.
Theorem
We can sort n keys in O(log n) time using $n^2$ CREW PRAM processors.
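A sequential C sketch of sorting by ranking follows. The name rank_sort is hypothetical and the keys are assumed to be distinct. Each comparison in the inner loop corresponds to the work of one of the $n^2$ processors; on a CREW PRAM the n results for a key would be added up in O(log n) time, whereas here they are simply counted one after another.

```c
#include <stddef.h>

/* Sort n distinct keys by ranking. in and out must not overlap. */
void rank_sort(const int *in, size_t n, int *out)
{
    for (size_t i = 0; i < n; i++) {
        size_t rank = 0;                 /* number of keys smaller than in[i] */
        for (size_t j = 0; j < n; j++)
            if (in[j] < in[i])
                rank++;
        out[rank] = in[i];               /* the key of rank i goes to cell i */
    }
}
```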
Odd-Even Merge Sort
Odd-even merge sort employs the classical divide and conquer strategy. Assume for simplicity that n is an integral power of two and that the keys are distinct. If $X = k_1, k_2, \ldots, k_n$ is the given sequence of n keys, it is partitioned into two subsequences $X'_1 = k_1, k_2, \ldots, k_{n/2}$ and $X'_2 = k_{n/2+1}, \ldots, k_n$ of equal length. $X'_1$ and $X'_2$ are sorted recursively, assigning n/2 processors to each. The two sorted subsequences (call them $X_1$ and $X_2$, respectively) are then finally merged.
The preceding description of the algorithm is exactly the same as that of merge sort; the difference lies in how the two subsequences $X_1$ and $X_2$ are merged. We employ the odd-even merge algorithm of the previous section.
Theorem
We can sort n arbitrary keys in $O(\log^2 n)$ time using n EREW PRAM processors.
Proof: The sorting algorithm is described as follows:
Step 0. If n ≤ 1, return X.
Step 1. Let $X = k_1, k_2, \ldots, k_n$ be the input. Partition the input into two: $X'_1 = k_1, k_2, \ldots, k_{n/2}$ and $X'_2 = k_{n/2+1}, \ldots, k_n$.
Step 2. Allocate n/2 processors to sort $X'_1$ recursively. Let $X_1$ be the result. At the same time employ the other n/2 processors to sort $X'_2$ recursively. Let $X_2$ be the result.
Step 3. Merge $X_1$ and $X_2$ using the odd-even merge algorithm and n = 2m processors.
It uses n processors. Define T(n) to be the time taken by this algorithm to sort n keys using n processors. Step 1 of this algorithm takes O(1) time. Step 2 runs in T(n/2) time. Finally, step 3 takes O(log n) time. Therefore, T(n) satisfies T(n) = O(1) + T(n/2) + O(log n) = T(n/2) + O(log n), which solves to $T(n) = O(\log^2 n)$.
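The solution of the recurrence can be checked by unrolling it. The derivation below assumes that n is a power of two, that T(1) = O(1), and that a constant c (introduced here only for illustration) bounds the O(log n) term.

```latex
\begin{aligned}
T(n) &\le T(n/2) + c\log_2 n \\
     &\le T(n/4) + c\log_2(n/2) + c\log_2 n \\
     &\;\;\vdots \\
     &\le T(1) + c\sum_{i=1}^{\log_2 n} i \\
     &= T(1) + c\,\frac{\log_2 n\,(\log_2 n + 1)}{2} \;=\; O(\log^2 n).
\end{aligned}
```

There are $\log_2 n$ levels of recursion, and the merge at the level that handles $2^i$ keys costs at most $c \cdot i$, which gives the arithmetic sum.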
Example 12
Consider the problem of sorting the 16 numbers 25, 21, 8, 5, 2, 13, 11, 16, 23, 31, 9, 4, 18, 12, 27, 34 using 16 processors. In step 1 of the algorithm, the input is partitioned into two parts: $X'_1$ = 25, 21, 8, 5, 2, 13, 11, 16 and $X'_2$ = 23, 31, 9, 4, 18, 12, 27, 34. In step 2, processors 1 to 8 work on $X'_1$, recursively sort it and obtain $X_1$ = 2, 5, 8, 11, 13, 16, 21, 25. At the same time processors 9 to 16 work on $X'_2$, sort it, and obtain $X_2$ = 4, 9, 12, 18, 23, 27, 31, 34. In step 3, $X_1$ and $X_2$ are merged as shown in Example 11 to get the final result:
2, 4, 5, 8, 9, 11, 12, 13, 16, 18, 21, 23, 25, 27, 31, 34.
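The whole sort can also be sketched in C. The version below is a common in-place formulation of Batcher's odd-even merge sort and is not the listing of this text: instead of copying the odd and even parts into separate sequences, it addresses them through a stride r. The function names are hypothetical, n must be a power of two, and the code runs the comparisons one after another rather than in parallel.

```c
#include <stddef.h>

/* One comparator: swap a[i] and a[j] when they are out of order. */
static void compare_exchange(int *a, size_t i, size_t j)
{
    if (a[i] > a[j]) { int t = a[i]; a[i] = a[j]; a[j] = t; }
}

/* Merge the two sorted halves of a[lo..lo+n-1], looking only at the
   elements that lie r positions apart, starting at lo. */
static void odd_even_merge_inplace(int *a, size_t lo, size_t n, size_t r)
{
    size_t m = 2 * r;
    if (m < n) {
        odd_even_merge_inplace(a, lo, n, m);          /* keys k1, k3, ... of this subsequence */
        odd_even_merge_inplace(a, lo + r, n, m);      /* keys k2, k4, ... of this subsequence */
        for (size_t i = lo + r; i + r < lo + n; i += m)
            compare_exchange(a, i, i + r);            /* final compare-exchange step */
    } else {
        compare_exchange(a, lo, lo + r);              /* two keys left: one comparison */
    }
}

/* Sort a[lo..lo+n-1]; n must be a power of two. */
void odd_even_merge_sort(int *a, size_t lo, size_t n)
{
    if (n > 1) {
        size_t h = n / 2;
        odd_even_merge_sort(a, lo, h);                /* sort the first half */
        odd_even_merge_sort(a, lo + h, h);            /* sort the second half */
        odd_even_merge_inplace(a, lo, n, 1);          /* merge the two sorted halves */
    }
}
```

The sequence of comparisons does not depend on the data, so the algorithm is oblivious in the sense defined earlier, and the comparisons inside each loop touch disjoint cells.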
The work done by this algorithm is $\Theta(n \log^2 n)$. Since n keys can be sorted sequentially in Θ(n log n) time, its efficiency is Θ(1/log n) and it has a speedup of Θ(n/log n).
Student Activity 5.5
Answer the following questions:
1. Describe the odd-even merge sort algorithm.
2. What is the time complexity for sorting n arbitrary keys using n EREW PRAM processors?
Summary
In parallel computing a problem is subdivided into many subproblems and submitted to many processors. The partial solutions are then combined to obtain the final result.
In PRAMs, a number of processors work synchronously and communicate with each other by means of a common global memory.
Maximum selection can be done in O(log log n) time using n common CRCW PRAM processors.
The problem of merging is to take two sorted sequences as input and produce a sorted sequence of all the elements.
Odd-even merge is a merging algorithm based on divide and conquer.
Self-assessment Questions
Solved Exercise
I. True and False
1. Any algorithm designed for multi processor machines is called a parallel algorithm.
2. The maximum speed up = 1/(f+p)
II. Fill in the blanks
1. Odd-even merge is based on the _____________ technique.
2. The Boolean OR of n bits can be computed in _____________ time on an n-processor common CRCW PRAM.
3. Finding the maximum of n given numbers can be done in O(1) time using an __________CRCW
PRAM.
Answers
I. True and False
1. True
2. False
II. Fill in the blanks
1. divide and conquer
2. O(1)
3. $n^2$-processor
Unsolved Exercise
I. True and False
1. Shared memory models are called PRAMs.
2. Maximum selection can be done in O(log n) time using n common CRCW PRAM processors.
II. Fill in the blanks
1. A parallel algorithm is work-optimal if and only if it has _____.
2. In a CRCW PRAM, if more than one processor tries to read from the same cell, they will read the ________ information.
3. The problem of ________ is to take two sorted sequences as input and produce a sequence of all the elements.
4. Given a sequence of n keys, the problem of ________ is to rearrange this sequence into either ascending or descending order.
Detailed Questions
1. Algorithms A and B are parallel algorithms for solving the problem of finding the maximum element in a list. Algorithm A uses $n^{0.5}$ processors and runs in time $\Theta(n^{0.5})$. Algorithm B uses n processors and runs in O(log n) time. Compute the work done, speedups, and efficiencies of these two algorithms. Are these algorithms work-optimal?
2. Mr. Ultrasmart claims to have found an algorithm for the above problem that runs in time Θ(log n) using $n^{3/4}$ processors. Is it possible?
3. Present an O(1) time n-processor common CRCW PRAM algorithm for computing the Boolean AND of n bits.
4. Input is an array of n elements. Give an O(1) time, n-processor common CRCW PRAM algorithm to check whether the array is in sorted order.
5. Solve the Boolean OR and AND problems on the CRCW and EREW PRAMs. What are the time and processor bounds of your algorithms?
6. Can exercise (4) be solved in O(1) time using n processors on any of the PRAMs if the keys are arbitrary? How about if there are $n^2$ processors?
7. Algorithm A is a parallel algorithm that has two components. The first runs in Θ(log log n) time using n/log log n EREW PRAM processors. The second component runs in Θ(log n) time using n/log n CREW PRAM processors. Show that the whole algorithm can be run in Θ(log n) time using n/log n CREW PRAM processors.
8. Present an O(log log n) time algorithm for finding the maximum of n arbitrary numbers using n/log log n common CRCW PRAM processors.
9. Show that minima computation can be performed in O(log log n) time using n/log log n common CRCW PRAM processors.
10. Given an array A of n elements, we would like to find the largest i such that A[i] = 1. Give an O(1) time algorithm for this problem on an n-processor common CRCW PRAM.
11. Given two sorted sequences of length n each, how will you merge them in O(1) time using $n^2$ CRCW PRAM processors?
12. Given two sets A and B of size n each (in the form of arrays), the goal is to check whether the two sets are disjoint or not. Show how to solve this problem:
(a) In O(1) time using $n^2$ CRCW PRAM processors.
(b) In O(log n) time using n CRCW PRAM processors.
Algorithms and
Advanced Data Structures
BCA202
Directorate of Distance Education
Maharshi Dayanand University
ROHTAK – 124 001
Copyright © 2002, Maharshi Dayanand University, ROHTAK
All Rights Reserved. No part of this publication may be reproduced or stored in a retrieval system or
transmitted in any form or by any means; electronic, mechanical, photocopying, recording or otherwise,
without the written permission of the copyright holder.
Maharshi Dayanand University
ROHTAK – 124 001
Developed & Produced by EXCEL BOOKS, A45 Naraina, Phase 1, New Delhi110028
Contents
UNIT 1 TREES
Overview
Binary Trees
Traversal of Binary Trees
Binary Tree Representation
Threaded Binary Trees
Binary Search Tree
AVL Tree
Run Time Storage Management
Garbage collection
Compaction
UNIT 2 SORTING TECHNIQUES
Overview
Bubble Sort
Insertion Sort
Selection Sort
Quick Sort
Merge Sort
Radix Sort
Heap Sort
External Sort
Lower Bound theory for sorting
Selection and Adversary Argument
Minimum Spanning Tree
Prim’s Algorithm
Kruskal’s Algorithm
Shortest Path
Graph Component Algorithm
String Matching
KMP Algorithm
Boyer Moore Algorithm
UNIT 3 DYNAMIC PROGRAMMING
Overview
Principle of Optimality
Matrix Multiplication
Optimal Binary Search Trees
UNIT 4 NP COMPLETE PROBLEM
Overview
Polynomial-time
NP-Completeness and Reducibility
NP-Completeness Proofs
NP-Complete Problems
UNIT 5 PARALLEL ALGORITHMS
Overview
Parallelism
Computational Model: PRAM and Other Models
Finding Maximum Element
Merging
Sorting
Binary Trees
A binary tree is a finite set of elements that is either empty or is partitioned into three disjoint subsets. The first subset contains a single element called the root of the tree. The other two subsets are themselves binary trees, called the left and right subtrees of the original tree. A left or right subtree can be empty. Each element of a binary tree is called a node of the tree.
A conventional method of picturing a binary tree is shown in figure 1.1. This tree consists of nine nodes with A as its root. Its left subtree is rooted at B and its right subtree is rooted at C. This is indicated by the two branches emanating from A: to B on the left and to C on the right. The absence of a branch indicates an empty subtree. For example, the left subtree of the binary tree rooted at C and the right subtree of the binary tree rooted at E are both empty. The binary trees rooted at D, G, H and I have empty right and left subtrees.
If A is the root of a binary tree and B is the root of its left or right subtree, then A is said to be the father of B and B is said to be the left or right son of A. A node that has no sons (such as D, G, H, and I of figure 1.1) is called a leaf. Node n1 is an ancestor of node n2 (and n2 is a descendant of n1) if n1 is either the father of n2 or the father of some ancestor of n2. For example, in the tree of figure 1.1, A is an ancestor of G and H is a descendant of C, but E is neither an ancestor nor a descendant of C. A node n2 is a left descendant of node n1 if n2 is either the left son of n1 or a descendant of the left son of n1. A right descendant may be similarly defined. Two nodes are brothers if they are left and right sons of the same father. Figure 1.2 illustrates some structures that are not binary trees.
If every non-leaf node in a binary tree has nonempty left and right subtrees, the tree is called a strictly binary tree. Thus the tree of figure 1.3 is a strictly binary tree.
A strictly binary tree with n leaves always contains 2n − 1 nodes. The level of a node in a binary tree is defined as follows: the root of the tree has level 0, and the level of any other node in the tree is one more than the level of its father. For example, in the binary tree of figure 1.1, node E is at level 2 and node H is at level 3. The depth of a binary tree is the maximum level of any leaf in the tree. Thus the depth of the tree of figure 1.1 is 3. A complete binary tree of depth d is the strictly binary tree all of whose leaves are at level d. If a binary tree contains m nodes at level l, it contains at most 2m nodes at level l + 1. A complete binary tree of depth d is therefore the binary tree of depth d that contains exactly $2^l$ nodes at each level l between 0 and d.
A binary tree of depth d is an almost complete binary tree if:
1. Each leaf in the tree is either at level d or at level d − 1.
2. For any node nd in the tree with a right descendant at level d, all the left descendants of nd that are leaves are also at level d.
The strictly binary tree of figure 1.4a is not almost complete, since it violates these conditions. The binary tree of figure 1.4b is an almost complete binary tree.
Student Activity 1.1
Before going to the next section, answer the following questions:
1. What is a Binary Tree?
2. Define Strictly Binary Tree.
3. Find the total number of nodes in a complete binary tree of depth d.
If your answers are correct, then proceed to the next section.
Traversal of Binary Trees
In many applications it is necessary not only to find a node within a binary tree, but to be able to move through all the nodes of the binary tree, visiting each one in turn. This operation is called tree traversing. If there are n nodes in the binary tree, then there are n! different orders in which they could be visited, but most of these have no regularity of pattern. We will define three of these traversal methods. In each of these methods, nothing needs to be done to traverse an empty binary tree. The methods are all defined recursively, so that traversing a binary tree involves visiting the root and traversing its left and right subtrees. The only difference among the methods is the order in which these three operations are performed.
To traverse a nonempty binary tree in Preorder (also known as depth-first order), we perform the following three operations:
(1) Visit the root.
(2) Traverse the left subtree in preorder.
(3) Traverse the right subtree in preorder.
To traverse a nonempty binary tree in Inorder (or symmetric order):
(1) Traverse the left subtree in inorder.
(2) Visit the root.
(3) Traverse the right subtree in inorder.
To traverse a nonempty binary tree in Postorder:
(1) Traverse the left subtree in postorder.
(2) Traverse the right subtree in postorder.
(3) Visit the root.
Figure 1.5(a&b) illustrates two binary trees and their traversals in preorder, inorder and postorder.
PREORDER: ABDGCEHIF
INORDER: DGBAHEICF
POSTORDER: GDBHIEFCA
PREORDER: ABCEIFJDGHKL
INORDER: EICFJBGDKHLA
POSTORDER: IEJFCGKLHDBA
Student Activity 1.2
Before going to the next section, answer the following questions:
1. Find the Inorder, Preorder and Postorder traversals of the following binary tree.
If your answers are correct, then proceed to the next section.
Binary Tree Representation
Recall that the n nodes of an almost complete binary tree can be numbered from 1 to n, so that the number assigned to a left son is twice the number assigned to its father, and the number assigned to a right son is 1 more than twice the number assigned to its father. We can represent an almost complete binary tree without father, left or right links. Instead, the nodes can be kept in an array info of size n. We refer to the node at position p simply as "node p"; info[p] holds the contents of node p, info being the array name.
In C, arrays start at position 0; therefore, instead of numbering the tree nodes from 1 to n, we number them from 0 to n − 1. Because of the one-position shift, the two sons of a node numbered p are in positions 2p + 1 and 2p + 2, instead of 2p and 2p + 1. The root of the tree is at position 0, so that tree, the external pointer to the tree root, always equals 0. The node in position p (that is, node p) is the implicit father of nodes 2p + 1 and 2p + 2. The left son of node p is node 2p + 1 and the right son of p is node 2p + 2. Given a left son at position p, its right brother is at p + 1 and, given a right son at position p, its left brother is at p − 1. Father of p is implemented by (p − 1)/2. p points to a left son if and only if p is odd. Thus the test for whether node p is a left son (the isleft operation) is to check whether p % 2 is not equal to 0. Figure 1.6 illustrates arrays that represent the almost complete binary trees.
We can extend this implicit array representation of almost complete binary trees to an implicit array representation of binary trees generally. We do this by identifying an almost complete binary tree that contains the binary tree being represented. Figure 1.7(a) illustrates two (non-almost-complete) binary trees and Figure 1.7(b) illustrates the smallest almost complete binary trees that contain them. Finally, the figure illustrates the implicit array representation of these almost complete binary trees and, by extension, of the original binary trees.
The implicit array representation is also called the sequential representation, as contrasted with the linked representation presented earlier, because it allows a tree to be implemented in a contiguous block of memory (an array) rather than via pointers connecting widely separated nodes. Under the sequential representation an array element is allocated whether or not it serves to contain a node of a tree. Unused array elements must therefore be identified as NULL nodes. This may be accomplished by one of two methods. One method is to set info[p] to a special value if node p is NULL. This special value should be invalid as the information content of a legitimate tree node. For example, in a tree containing positive numbers, a NULL node may be indicated by a negative info value. Alternatively, we may add a logical flag field, used, to each node. Each node then contains two fields, info and used. The entire structure is contained in an array node; used(p) is implemented as node[p].used and info(p) is implemented by node[p].info. We use this method later in implementing the sequential representation.
(a) Two binary trees
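The index arithmetic of the sequential representation can be written down directly in C. The sketch below uses the flag-field method; the type name arraynode and the helper names are hypothetical and chosen only for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* One array element of the sequential representation. */
struct arraynode {
    int  info;
    bool used;      /* false marks a NULL node */
};

static size_t left_son(size_t p)  { return 2 * p + 1; }
static size_t right_son(size_t p) { return 2 * p + 2; }
static size_t father(size_t p)    { return (p - 1) / 2; }   /* p must not be the root (0) */
static bool   is_left(size_t p)   { return p % 2 != 0; }    /* left sons sit at odd positions */

/* Count the non-NULL nodes of the tree rooted at position p
   in the array node[0..n-1]. */
size_t count_nodes(const struct arraynode node[], size_t n, size_t p)
{
    if (p >= n || !node[p].used)
        return 0;
    return 1 + count_nodes(node, n, left_son(p))
             + count_nodes(node, n, right_son(p));
}
```

Because a son is reached by arithmetic on its father's position, no pointer fields are stored, at the price of allocating array elements for NULL nodes.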
A node can be defined in language C as follows:
struct node {
    int info;
    struct node *left;
    struct node *right;
    struct node *father;
};
typedef struct node *nodeptr;
We use the dynamic node representation of a binary tree. The operations info(p), left(p), right(p) and father(p) can be implemented by references to p->info, p->left, p->right and p->father, respectively. These operations are used to retrieve the value of node p, the left child of node p, the right child of node p and the father of node p, respectively.
We may implement the traversal of binary trees in C by recursive routines that mirror the traversal definitions. The three C routines preorder, inorder and postorder visit the contents of a binary tree in preorder, inorder and postorder, respectively. The parameter to each routine is a pointer to the root node of a binary tree.
/* visit(p) is assumed to process node p, for example by printing p->info */
/* Preorder: visit each node of the tree in preorder */
void preorder(nodeptr root)
{
    if (root) {
        visit(root);
        preorder(root->left);
        preorder(root->right);
    }
}
/* Inorder: visit each node in inorder */
void inorder(nodeptr root)
{
    if (root) {
        inorder(root->left);
        visit(root);
        inorder(root->right);
    }
}
/* Postorder: visit each node in postorder */
void postorder(nodeptr root)
{
    if (root) {
        postorder(root->left);
        postorder(root->right);
        visit(root);
    }
}
Student Activity 1.3
Before going to the next section, answer the following questions:
1. Write algorithms for inorder, preorder and postorder traversals.
2. Construct the binary tree whose inorder and preorder traversals are given as:
Inorder: EICFJBGDKHLA
Preorder: ABCEIFJDGHKL
If your answers are correct, then proceed to the next section.
Threaded Binary Trees
Traversing a binary tree is a common operation, and it would be helpful to find a more efficient method of implementing the traversal. As we have seen, generally either the left or the right child of a node is empty, i.e. NULL. We can change these null links in a binary tree to special links called threads, so that it is possible to perform traversals, insertions and deletions without using either a stack or recursion. In a right threaded binary tree each NULL right link is replaced by a special link to the successor of that node under inorder traversal, called a right thread. Using right threads we can easily do an inorder traversal of a tree, since we need only follow either an ordinary link or a thread to find the next node to visit. A left threaded binary tree may be defined similarly as one in which each NULL left pointer is altered to contain a thread to that node's inorder predecessor. A binary tree which has both left and right threads is
called a fully threaded binary tree. The word fully is omitted if there is no danger of confusion. The following figure shows a fully threaded binary tree where the threads are shown as dotted lines.
The following figure shows a right-threaded binary tree.
To implement a right threaded binary tree under the dynamic node implementation of a binary tree, an extra logical field, rthread, is included within each node to indicate whether or not its right pointer is a thread. For consistency, the rthread field of the rightmost node of a tree (that is, the last node in the tree's inorder traversal) is also set to TRUE, although its right field remains NULL. Thus a node is defined as follows (we are assuming that no father field exists):
struct node {
    int info;
    struct node *left;
    struct node *right;
    int rthread;    /* TRUE if right is NULL or a non-NULL thread */
};
typedef struct node *nodeptr;
Now we present a routine to implement inorder traversal of a right threaded binary tree.
void inorder2(nodeptr root)
{
    nodeptr p, q;
    p = root;
    do {
        q = NULL;
        while (p != NULL) {    /* traverse left branch */
            q = p;
            p = p->left;
        }
        if (q != NULL) {
            visit(q);
            p = q->right;
            while (q->rthread && p != NULL) {    /* follow threads */
                visit(p);
                q = p;
                p = p->right;
            }
        }
    } while (q != NULL);
}
Student Activity 1.4
Before going to the next section, answer the following questions:
1. What are right threaded and left threaded binary trees?
2. Discuss the advantages and disadvantages of threaded binary trees.
If your answers are correct, then proceed to the next section.
Binary Search Tree
A Binary Search Tree (BST) is an ordered binary tree such that either it is an empty tree or
1. each data value in its left subtree is less than the root value,
2. each data value in its right subtree is greater than the root value, and
3. the left and right subtrees are again binary search trees.
The following figure shows a binary search tree.
The following operations can be performed on a Binary Search Tree:
1. Initialization of a Binary Search Tree: this operation makes an empty tree.
2. Check whether the BST is empty or not.
3. Create a node for the Binary Search Tree: this operation allocates memory space for the new node and returns with an error if no space is available.
4. Retrieve a node's data.
5. Update a node's data.
6. Insert a node.
7. Search for a node.
8. Traverse a Binary Search Tree.
The advantage of using a BST over an array is that a tree enables search, insertion and deletion operations to be performed efficiently. If an array is used, an insertion or deletion requires that approximately half of the elements of the array be moved. Insertion or deletion in a BST, on the other hand, requires that only a few pointers be adjusted.
The following algorithm searches a binary search tree and inserts a new record into the tree if the search is unsuccessful. (We assume the existence of a function maketree that constructs a binary tree consisting of a single node whose information field is passed as an argument and returns a pointer to the tree.)
q = NULL;
p = root;
while (p != NULL) {
    if (key == k(p))
        return (p);
    q = p;
    if (key < k(p))
        p = left(p);
    else
        p = right(p);
}
v = maketree(rec, key);
if (q == NULL)
    root = v;
else if (key < k(q))
    left(q) = v;
else
    right(q) = v;
return (v);
Here key is the item to be searched for.
We now present an algorithm to delete a record (node) with key "key" from a Binary Search Tree. There are three cases to consider. If the node to be deleted has no sons, it may be deleted without further adjustment to the tree. This is illustrated in the following figure.
If the node to be deleted has only one subtree, its only son can be moved up to take its place. This is illustrated in the following figure.
If, however, the node p to be deleted has two subtrees, its inorder successor s (or predecessor) must take its place. The inorder successor cannot have a left subtree (since a left descendant would be the inorder successor of p). Thus the right son of s can be moved up to take the place of s. This is illustrated in the following figure, where the node with key 12 replaces the node with key 11 and is replaced in turn by the node with key 13.
Since the definition of a binary search tree is recursive, it is easiest to describe a recursive search method. Suppose we wish to search for an element with key x. An element could in general be an arbitrary structure that has as one of its fields a key. We assume for simplicity that the element just consists of a key and use the terms element and key interchangeably. We begin at the root. If the root is 0 (NULL), then the search tree contains no elements and the search is unsuccessful. Otherwise we compare x with the key in the root. If x equals this key, the search terminates successfully. If x is less than the key in the root, then no element in the right subtree can have key value x and only the left subtree is to be searched. If x is larger than the key in the root, only the right subtree needs to be searched. The subtrees can be searched recursively as in the following algorithm.
nodeptr search(nodeptr root, int x)
{
    if (root == NULL)
        return NULL;
    else if (x == root->info)
        return root;
    else if (x < root->info)
        return search(root->left, x);
    else
        return search(root->right, x);
}
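The three deletion cases described above can be sketched in C as follows. The type bstnode and the functions bst_insert and bst_delete are hypothetical names for this illustration. One simplification is assumed: in the two-subtree case the sketch copies the key of the inorder successor into the node and then deletes the successor, instead of relinking the successor node itself.

```c
#include <stdlib.h>

struct bstnode {
    int key;
    struct bstnode *left, *right;
};

/* Insert key and return the root of the resulting tree. Duplicate keys are
   ignored; if no memory is available the tree is left unchanged. */
struct bstnode *bst_insert(struct bstnode *root, int key)
{
    if (root == NULL) {
        struct bstnode *v = malloc(sizeof *v);
        if (v != NULL) {
            v->key = key;
            v->left = v->right = NULL;
        }
        return v;
    }
    if (key < root->key)
        root->left = bst_insert(root->left, key);
    else if (key > root->key)
        root->right = bst_insert(root->right, key);
    return root;
}

/* Delete key (if present) and return the root of the resulting tree. */
struct bstnode *bst_delete(struct bstnode *root, int key)
{
    if (root == NULL)
        return NULL;
    if (key < root->key) {
        root->left = bst_delete(root->left, key);
        return root;
    }
    if (key > root->key) {
        root->right = bst_delete(root->right, key);
        return root;
    }
    if (root->left == NULL || root->right == NULL) {
        /* no sons, or one son that moves up to take the node's place */
        struct bstnode *son = root->left ? root->left : root->right;
        free(root);
        return son;
    }
    /* two subtrees: the inorder successor is the leftmost node on the right */
    struct bstnode *s = root->right;
    while (s->left != NULL)
        s = s->left;
    root->key = s->key;
    root->right = bst_delete(root->right, s->key);   /* s has no left son */
    return root;
}
```

With the keys 8, 3, 11, 1, 5, 9, 14, 12, 15, 13 inserted in that order, deleting 11 puts 12 in its place and moves 13 up to where 12 was, as in the case described in the text.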
The following table gives the efficiencies of search, insert and delete operations in a binary search tree.
Student Activity 1.5
Before going to the next section, answer the following questions:
1. How many binary search trees are possible with key values 1, 2, 3, 4, 5?
2. Delete node 10 from the following binary search tree.
If your answers are correct, then proceed to the next section.
AVL Tree
The effectiveness of the searching process in a binary search tree depends on how the data are organised to make up a specific tree. For example, consider the two shapes given in the following figures.
The efficiency of search will be rather different in these two cases, although the same elements are organised in the two structures. The tree in the first figure above is rather short and compact while the tree in the second figure is a long and thin tree. We may say that the tree in the first figure is somewhat more balanced than that in the second figure.
Let us define more precisely the notion of a "balanced" tree. The height of a binary tree is the maximum level of its leaves (this is also sometimes known as the depth of the tree). For convenience, the height of a NULL tree is defined as −1. A balanced binary tree (AVL tree) is a binary tree in which the heights of the two subtrees of every node never differ by more than 1. The balance of a node in a binary tree is defined as the height of its left subtree minus the height of its right subtree. Each node in a balanced binary tree has a balance of 1, −1, or 0, depending on whether the height of its left subtree is greater than, less than, or equal to the height of its right subtree. The following figure illustrates a balanced binary tree. The balance of each node is also indicated in the figure.
Suppose that we are given a balanced binary tree and use the preceding search and insertion algorithm to insert a new node p into the tree. The resulting tree may or may not remain balanced. Figure 1.16(b) illustrates all possible insertions that may be made to the balanced tree shown above.
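The definitions of height and balance translate directly into C. In the sketch below the type avlnode and the function names are hypothetical; the heights are recomputed on every call for clarity, whereas a practical AVL implementation would normally store a balance or height field in each node.

```c
#include <stdbool.h>
#include <stddef.h>

struct avlnode {
    int key;
    struct avlnode *left, *right;
};

/* Height of a tree: the maximum level of its leaves; -1 for the NULL tree. */
int height(const struct avlnode *t)
{
    if (t == NULL)
        return -1;
    int hl = height(t->left);
    int hr = height(t->right);
    return 1 + (hl > hr ? hl : hr);
}

/* Balance of a node: height of left subtree minus height of right subtree.
   t must not be NULL. */
int balance(const struct avlnode *t)
{
    return height(t->left) - height(t->right);
}

/* A tree is balanced if every node has a balance of 1, -1 or 0. */
bool is_balanced(const struct avlnode *t)
{
    if (t == NULL)
        return true;
    int b = balance(t);
    return b >= -1 && b <= 1 && is_balanced(t->left) && is_balanced(t->right);
}
```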
Each insertion that yields a balanced tree is indicated by a B. The unbalanced insertions are indicated by a U and are numbered from 1 to 12. It is easy to see that the tree becomes unbalanced if and only if the newly inserted node is a left descendant of a node that previously had a balance of 1 (this occurs in cases U1 through U8 in figure 1.16(b)) or it is a right descendant of a node that previously had a balance of −1 (cases U9 through U12).
Run Time Storage Management
As discussed earlier, allocation of storage and its release is done for one node at a time. This method is convenient in regard to two properties of nodes: (i) the size of a node of a particular type is fixed; (ii) a node is reasonably small. But these two characteristics don't help in programs where a large amount of contiguous storage is required. At times a program may require storage blocks of varied sizes, and thus arises the need for a memory management system. Run time storage management is such a system and is a convenient tool for processing requests for variable-length blocks.
As an example of this situation, consider a small memory of 1024 words. Suppose a request is made for three blocks of storage of 348, 110 and 212 words, respectively. Let us further suppose that these blocks are allocated sequentially, as shown in Figure 1.17(a). Now suppose that the second block of size 110 is freed, resulting in the situation depicted in Figure 1.17(b). There are now 464 words of free space; yet, because the free space is divided into noncontiguous blocks, a request for a block of 400 words could not be satisfied. This illustration shows the need to keep track of available space, to allocate it when space is requested and to combine contiguous free spaces when a block is freed.
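The 1024-word example can be traced with a small simulation in C. Everything in the sketch is an assumption made for illustration: the block table, the first-fit policy, and the names mem_init, mem_alloc, mem_free, total_free and largest_free. A block is identified by its start address, and mem_free combines a freed block with any free neighbour.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAXBLOCKS 16                  /* hypothetical limit, enough for the example */
#define NOSPACE   ((size_t)-1)

struct block  { size_t start, size; bool is_free; };
struct memory { struct block blk[MAXBLOCKS]; size_t count; };   /* blocks in address order */

void mem_init(struct memory *m, size_t words)
{
    m->blk[0] = (struct block){ 0, words, true };
    m->count = 1;
}

/* Remove table entry i by shifting the later entries down. */
static void drop_block(struct memory *m, size_t i)
{
    for (size_t j = i; j + 1 < m->count; j++)
        m->blk[j] = m->blk[j + 1];
    m->count--;
}

/* First fit: returns the start address of the allocated block, or NOSPACE. */
size_t mem_alloc(struct memory *m, size_t words)
{
    for (size_t i = 0; i < m->count; i++) {
        struct block *b = &m->blk[i];
        if (!b->is_free || b->size < words)
            continue;
        if (b->size > words) {                       /* split off the unused remainder */
            if (m->count == MAXBLOCKS)
                return NOSPACE;                      /* block table is full */
            for (size_t j = m->count; j > i + 1; j--)
                m->blk[j] = m->blk[j - 1];
            m->blk[i + 1] = (struct block){ b->start + words, b->size - words, true };
            m->count++;
            b->size = words;
        }
        b->is_free = false;
        return b->start;
    }
    return NOSPACE;
}

/* Free the block that starts at start and combine it with free neighbours. */
void mem_free(struct memory *m, size_t start)
{
    for (size_t i = 0; i < m->count; i++) {
        if (m->blk[i].start != start || m->blk[i].is_free)
            continue;
        m->blk[i].is_free = true;
        if (i + 1 < m->count && m->blk[i + 1].is_free) {   /* right neighbour */
            m->blk[i].size += m->blk[i + 1].size;
            drop_block(m, i + 1);
        }
        if (i > 0 && m->blk[i - 1].is_free) {              /* left neighbour */
            m->blk[i - 1].size += m->blk[i].size;
            drop_block(m, i);
        }
        return;
    }
}

size_t total_free(const struct memory *m)
{
    size_t sum = 0;
    for (size_t i = 0; i < m->count; i++)
        if (m->blk[i].is_free)
            sum += m->blk[i].size;
    return sum;
}

size_t largest_free(const struct memory *m)
{
    size_t best = 0;
    for (size_t i = 0; i < m->count; i++)
        if (m->blk[i].is_free && m->blk[i].size > best)
            best = m->blk[i].size;
    return best;
}
```

After allocating 348, 110 and 212 words and freeing the 110-word block, total_free reports 464 words while largest_free reports only 354, so a request for 400 words fails even though enough words are free in total.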
Suppose that block 3 were now freed. Clearly, it is not desirable to retain three free blocks of 110, 212, and 354 words. Rather the blocks should be combined into a single large block of 676 words so that further large requests can be satisfied. After combination, memory will appear as in Figure 1.17(c). This example illustrates the necessity to keep track of available space, to allocate portions of that space when allocation requests are presented, and to combine contiguous free spaces when a block is freed.
Garbage Collection
Deallocation of nodes can take place at two levels:
1. The application which claimed the node releases it back to the operating system. For example, deallocation occurs in a C program with the statement "free (x)", where x is space earlier allocated by a malloc call.
2. The operating system calls the storage management routines to return free nodes to the free space. This is usually implemented by the method of garbage collection.
Garbage collection requires the presence of a 'marking bit' on each node. It runs in two phases. In the first phase, all non-garbage nodes are marked. In the second phase all non-marked nodes are collected and returned to the free space. Where variable size nodes are used, it is desirable to keep the free space as one contiguous block; in this case, the second phase is called memory compaction.
Garbage collection is usually called when some program runs out of space. It is a slow process, and its use should be minimised through efficient programming.
One field must be set aside in each node to indicate whether a node has or has not been marked. The marking phase sets the mark field to true in each accessible node. As the collection phase proceeds, the mark field in each accessible node is reset to false. Thus, at the start and end of garbage collection, all mark fields are false. User programs do not affect the mark fields. It is sometimes inconvenient to reserve one field in each node solely for the purpose of marking. In that case a separate area in memory can be reserved to hold a long array of mark bits, one for each node that may be allocated.
One aspect of garbage collection is that it must run when there is very little space available. This means that auxiliary tables and stacks needed by the garbage collector must be kept to a minimum, since there is little space available for them. An alternative is to reserve a specific percentage of memory for the exclusive use of the garbage collector. However, this effectively reduces the amount of memory available to the user and means that the garbage collector will be called more frequently.
Whenever the garbage collector is called, all user processing comes to a halt while the algorithm examines all allocated nodes in memory. For this reason it is desirable that the garbage collector be called as infrequently as possible. For real time applications, in which a computer must respond to a user request within a specific short time span, garbage collection has generally been considered an unsatisfactory method of storage management. We can picture a space ship drifting off into the infinite as it waits for directions from a computer occupied with garbage collection. However, methods have recently been developed whereby garbage collection can be performed simultaneously with user processing. This means that the garbage collector must be called before all space has been exhausted, so that user processing can continue in whatever space is left while the garbage collector recovers additional space.
Another important consideration is that users must be careful to ensure that all lists are well formed and that all pointers are correct. Usually the operations of a list processing system are carefully implemented so that if garbage collection does occur in the middle of one of them, the entire system still works correctly. However, some users try to outsmart the system and implement their own pointer manipulations. This requires great care so that garbage collection will work properly. In a real-time garbage collection system, we must ensure not only that user operations do not upset list structures that the garbage collector must have, but also that the garbage collection algorithm itself does not unduly disturb the list structures that are being used concurrently by the user.
It is possible that, at the time the garbage collection program is called, users are actually using almost all the nodes that are allocated. Thus almost all nodes are accessible and the garbage collector recovers very little additional space. After the system runs for a short time, it will again be out of space; the garbage collector will again be called only to recover very few additional nodes, and the vicious cycle starts again. This phenomenon, in which system storage management routines such as garbage collection are executing almost all the time, is called thrashing.
Clearly thrashing is a situation to be avoided. One drastic solution is to impose the following condition. If the garbage collector is run and does not recover a specific percentage of the total space, the user who requested the extra space is terminated and removed from the system. All of that user's space is then recovered and made available to other users.
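The two phases described above can be sketched in C. This is only an illustrative sketch under stated assumptions, not the routine of any particular system: the fixed pool `pool`, its size `POOL_SIZE`, the `in_use` flag and the function names are all hypothetical, and the marking phase uses plain recursion for brevity even though, as noted above, a real collector has to keep its auxiliary stack small.

```c
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 8              /* hypothetical pool size for this sketch */

struct node {
    bool mark;                   /* the mark field described in the text */
    bool in_use;                 /* true while the node is allocated */
    struct node *left, *right;   /* pointers to other nodes in the pool */
};

static struct node pool[POOL_SIZE];

/* Marking phase: set mark to true in every node accessible from p. */
static void mark_from(struct node *p)
{
    if (p == NULL || p->mark)
        return;                  /* nothing to do, or already visited */
    p->mark = true;
    mark_from(p->left);
    mark_from(p->right);
}

/* Collection phase: return every allocated but unmarked node to the
   free space and reset all mark fields to false.
   Returns the number of nodes recovered. */
static int sweep(void)
{
    int recovered = 0;
    for (int i = 0; i < POOL_SIZE; i++) {
        if (pool[i].in_use && !pool[i].mark) {
            pool[i].in_use = false;
            pool[i].left = pool[i].right = NULL;
            recovered++;
        }
        pool[i].mark = false;    /* all marks are false again at the end */
    }
    return recovered;
}

/* Run both phases, starting from the pointers the user program holds. */
int collect_garbage(struct node *roots[], int nroots)
{
    for (int i = 0; i < nroots; i++)
        mark_from(roots[i]);
    return sweep();
}
```

The return value of `collect_garbage` is the quantity a system would compare against the "specific percentage of the total space" mentioned above when deciding whether it is thrashing.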
Compaction
As a final topic, we shall briefly discuss compaction as a technique for reclaiming storage and introduce an algorithm for this task.
Compaction works by actually moving blocks of data from one location in memory to another so as to collect all the free blocks into one large block. The allocation problem then becomes completely simplified. Allocation now consists of merely moving a pointer which points to the top of this successively shortening block of storage. Once this single block gets too small again, the compaction mechanism is again invoked to reclaim what unused storage may now exist among allocated blocks.
There is generally no storage release mechanism. Instead, a marking algorithm is used to mark blocks that are still in use. Then, instead of freeing each unmarked block by calling a release mechanism to put it on the free list, the compactor simply collects all unmarked blocks into one large block at one end of the memory segment.
The only real problem in this method is the redefining of pointers. This is solved by making extra passes through memory. After blocks are marked, the entire memory is stepped through and the new address for each marked block is determined. This new address is stored in the block itself. Then another pass over memory is made. On this pass, pointers that point to marked blocks are reset to point to where the marked blocks will be after compaction. This is why the new address is stored right in the block: it is easily obtainable. After all pointers have been reset, the marked blocks are moved to their new locations.
A general algorithm for the compaction routine is as follows.
1. Invoke garbage collection marking routine.
2. Repeat step 3 until the end of memory is reached.
3. If the current block of storage being examined has been marked, then set the address of the block to the starting address of unused memory and update the starting address of unused memory.
4. Redefine variable references.
5. Define new values for pointers in marked blocks.
6. Repeat step 7 until the end of memory is reached.
7. Move marked blocks into new locations and reset marks.
Student Activity 1.6
Answer the following questions:
1. What is garbage collection? What is the disadvantage of garbage collection?
2. Discuss the advantages of AVL trees.
3. What are the conditions for a tree to be an AVL tree?
Summary
After going through this chapter, we can summarize the concepts.
A binary tree is a finite set of elements that is either empty or is partitioned into three disjoint subsets.
We can traverse a binary tree in three ways, i.e. preorder, inorder and postorder.
In a threaded binary tree each NULL right link is replaced by a special link to the successor of that node under inorder traversal, called a right thread.
A binary search tree (BST) is an ordered binary tree: it is either an empty tree, or each data value in its left subtree is less than the root value, each data value in its right subtree is greater than the root value, and the left and right subtrees are again binary search trees. A balanced binary tree (AVL tree) is a binary tree in which the heights of the two subtrees of every node never differ by more than 1.
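The balance condition in this definition can be checked mechanically. The following is a minimal sketch, assuming that the height of an empty tree is taken as -1 and that the balance of a node is its left height minus its right height; the structure `tnode` and the function names are hypothetical, chosen only for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>

struct tnode {
    int info;
    struct tnode *left, *right;
};

/* Height = maximum level of a leaf; an empty tree has height -1. */
int height(const struct tnode *t)
{
    if (t == NULL)
        return -1;
    int hl = height(t->left);
    int hr = height(t->right);
    return 1 + (hl > hr ? hl : hr);
}

/* Balance of a non-empty node: left height minus right height. */
int balance(const struct tnode *t)
{
    return height(t->left) - height(t->right);
}

/* True if every node of the tree has a balance of -1, 0 or +1. */
bool is_balanced(const struct tnode *t)
{
    if (t == NULL)
        return true;
    int b = balance(t);
    return b >= -1 && b <= 1
        && is_balanced(t->left) && is_balanced(t->right);
}
```

This version recomputes heights at every node, which is simple but not efficient; a practical AVL implementation would typically store the balance in each node and update it during insertion.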
I. True and False
1. A binary tree can have more than 2 children of a node.
2. Array representation of a binary tree is more efficient than dynamic representation.
II. Fill in the blanks
1. Garbage collection is a method of ____________.
2. The advantage of using a binary search tree over an array is that a tree enables ____________ and ___________ to be performed more efficiently.
3. The average time complexity of binary search is __________.
Answers
I. True and False
1. False
2. False
II. Fill in the blanks
1. run time storage management
2. insertion and deletion
3. log₂ n
I. True and False
1. Threaded binary trees are useful in tree traversals.
2. An AVL tree is a more efficient binary search tree.
II. Fill in the blanks
1. A ____________ is a finite set of elements that is either empty or is partitioned into three disjoint subsets.
2. In a ____________ binary tree each right link is replaced by a special link to the successor of that node under inorder traversal.
3. The advantage of using a __________ over an array is that a tree enables search, insertion and deletion operations to be performed efficiently.
4. Since the definition of a binary search tree is ________, it is easiest to describe a recursive search method.
5. The effectiveness of the ______ process in a binary search tree depends on how the data are organized to make up a specific tree.
1. Prove that the root of a binary tree is an ancestor of every node in the tree except itself.
2. Prove that a node of a binary tree has at most one father.
3. Prove that a strictly binary tree with n leaves contains 2n–1 nodes.
4. Two binary trees are similar if they are both empty or if their left subtrees are similar and their right subtrees are similar. Write an algorithm to determine if two binary trees are similar.
5. Write C routines to traverse a binary tree in preorder and postorder.
Overview
Bubble Sort
Insertion Sort
Selection Sort
Quick Sort
Merge Sort
Radix Sort
Heap Sort
External Sorting
Lower Bound Theory
Adversary Arguments
Minimum Spanning Tree
Shortest Paths
Graph Component Algorithm
String Matching
The Boyer-Moore Algorithm
Sorting Techniques
Learning Objectives
• Overview
• Bubble Sort
• Insertion Sort
• Selection Sort
• Quick Sort
• Merge Sort
• Radix Sort
• Heap Sort
• External Sort
• Lower Bound theory for sorting
• Selection and Adversary Argument
• Minimum Spanning Tree
• Prim's Algorithm
• Kruskal's Algorithm
• Shortest Path
• Graph Component Algorithm
• String Matching
• KMP Algorithm
• Boyer-Moore Algorithm
Overview
The concept of an ordered set of elements is one that has considerable impact on our daily lives. Consider, for example, the process of finding a telephone number in a telephone directory. This process, called a search, is simplified considerably by the fact that the names in the directory are listed in alphabetical order. Consider the trouble you might have if the names were listed in the order in which the customers placed their phone orders with the telephone company. In such a case, the names might as well have been entered in random order. Since the entries are sorted in alphabetical rather than in chronological order, the processing is simplified.
We now present some basic terminology. A file of size n is a sequence of n items r(0), r(1), …, r(n–1). Each item in the file is called a record. A key, k(i), usually (but not always) a subfield of the entire record, is associated with each record r(i). The file is said to be sorted on the key if i < j implies that k[i] precedes k[j] in some ordering on the keys. In the example of the telephone directory, the file consists of all the entries in the book. Each entry is a record. The key upon which the file is sorted is the name field of the record. Each record also contains fields for an address and a telephone number.
Because sorting is so important, a great many algorithms have been devised for doing it. In fact so many ideas appear in sorting methods that an entire course could easily be built around this one theme. Amongst the different methods, the most important distinction is between internal and external, that is, whether there are so many structures to be sorted that they must be kept in external files on disks, tapes, or the like, or whether they can all be kept internally in high-speed memory.
A few years ago, it was estimated, more than half the time on many commercial computers was spent in sorting. This is perhaps no longer true, since sophisticated methods have been devised for organizing data, methods that do not require that it be kept in any special order. Eventually, nonetheless, the information does go out to people, and then it must be sorted in some way.
Bubble Sort
The first sort presented is probably the most widely known among beginners and students of programming: the bubble sort. One of the characteristics of this sort is that it is easy to understand and program. Yet, of all the sorts we shall consider, it is probably the least efficient.
In each of the subsequent examples, x is an array of integers of which the first n are to be sorted so that x[i] ≤ x[j] for 0 ≤ i < j < n. It is straightforward to extend this simple format to one which is used in sorting n records, each with a subfield key k.
The basic idea underlying the bubble sort is to pass through the file sequentially several times. Each pass consists of comparing each element in the file with its successor (x[i] with x[i+1]) and interchanging the two elements if they are not in proper order. Consider the following file:
25 57 48 37 12 92 86 33
The following comparisons are made on the first pass:
x[0] with x[1] (25 with 57) no interchange
x[1] with x[2] (57 with 48) interchange
x[2] with x[3] (57 with 37) interchange
x[3] with x[4] (57 with 12) interchange
x[4] with x[5] (57 with 92) no interchange
x[5] with x[6] (92 with 86) interchange
x[6] with x[7] (92 with 33) interchange
Thus, after the first pass, the file is in the following order:
25 48 37 12 57 86 33 92
Notice that after this pass, the largest element (92) is in its proper position within the array. In general x[n–i] will be in its proper position after iteration i. This method is called the bubble sort because each number slowly "bubbles" up to its proper position. After the second pass the file is
25 37 12 48 57 33 86 92
Notice that 86 has now found its way to the second highest position. Since each iteration places a new element into its proper position, a file of n elements requires no more than n–1 iterations.
The complete set of iterations is the following:
iteration 0 (initial file)  25 57 48 37 12 92 86 33
iteration 1                 25 48 37 12 57 86 33 92
iteration 2                 25 37 12 48 57 33 86 92
iteration 3                 25 12 37 48 33 57 86 92
iteration 4                 12 25 37 33 48 57 86 92
iteration 5                 12 25 33 37 48 57 86 92
iteration 6                 12 25 33 37 48 57 86 92
iteration 7                 12 25 33 37 48 57 86 92
On the basis of the foregoing discussion we can proceed to code the bubble sort. We present a routine bubble that accepts two variables x and n. x is an array of numbers, and n is an integer representing the number of elements to be sorted (n may be less than the number of elements in x).
void bubble (int x[], int n)
{
    int i, j, temp;
    for (i = 0; i < n-1; i++)          // outer loop controls the no. of passes
        for (j = 0; j < n-1; j++)      // inner loop governs each individual pass
            if (x[j] > x[j+1])
            {
                // interchange elements
                temp = x[j];
                x[j] = x[j+1];
                x[j+1] = temp;
            }
}
What can be said about the efficiency of bubble sort? The total number of comparisons is (n–1)(n–1) = n² – 2n + 1, which is O(n²). Of course the number of interchanges cannot be greater than the number of comparisons. It is likely that it is the number of interchanges rather than the number of comparisons that takes up the most time in the program's execution.
Student Activity 2.1
Before going to next section, answer the following questions:
1. Discuss the advantages of sorting.
2. Sort the following file using bubble sort: 15, 19, 18, 10, 4, 21
If your answers are correct, then proceed to next section.
Insertion Sort
An insertion sort is one that sorts a set of records by inserting records into an existing sorted file. An initial list with only one item is automatically in order. If we suppose that we have already sorted the first i–1 items, then we take item i and search through this sorted list of length i–1 to see where to insert item i.
Example of insertion sort
Initial order:  SQ SA C7 H8 DK
Step 1:         SQ SA C7 H8 DK
Step 2:         C7 SQ SA H8 DK
Step 3:         C7 H8 SQ SA DK
Step 4:         C7 DK H8 SQ SA
The insertion sort algorithm thus proceeds on the idea of keeping the first part of the list, once examined, in the correct order. The algorithm for insertion sort is as follows:
void insertion (int x[], int n)
{
    int i, k, y;
    /* initially x[0] may be thought of as a sorted file of one element.
       After each repetition of the following loop, the elements
       x[0] through x[k] are in order. */
    for (k = 1; k < n; k++)
    {
        y = x[k];
        /* move down all elements greater than y by 1 position */
        for (i = k-1; i >= 0 && y < x[i]; i--)
            x[i+1] = x[i];
        /* insert y at proper position */
        x[i+1] = y;
    }
}
If the initial file is sorted, only one comparison is made on each pass, so that the sort is O(n). If the file is initially sorted in the reverse order, the sort is O(n²), since the total number of comparisons is
(n–1) + (n–2) + … + 3 + 2 + 1 = (n–1) * n / 2
which is O(n²). However, the insertion sort is still better than the bubble sort. The closer the file is to sorted order, the more efficient the insertion sort becomes. The average number of comparisons in the insertion sort is also O(n²). The space requirement for the sort consists of only one temporary variable, y.
Student Activity 2.2
Before going to next section, answer the following questions:
1. Compare search efficiencies of insertion sort and bubble sort.
2. Sort the following key values using insertion sort: 20, 17, 13, 25, 54
If your answers are correct, then proceed to next section.
Selection Sort
A selection sort is one in which successive elements are selected in order and placed into their proper sorted positions. The elements of the input may have to be preprocessed to make the ordered selection possible.
The selection sort consists entirely of a selection phase in which the largest of the remaining elements, large, is repeatedly placed in its proper position, i, at the end of the array; to do so, large is interchanged with the element x[i]. After n–1 selections the entire array is sorted. Thus the selection process need be done only from n–1 down to 1 rather than down to 0. The following algorithm implements the selection sort.
void selection (int x[], int n)
{
    int i, j, k, large;
    for (i = n-1; i > 0; i--)
    {
        /* place the largest number of x[0] through x[i] into large
           and its index into k */
        large = x[0];
        k = 0;
        for (j = 1; j <= i; j++)
            if (x[j] > large)
            {
                large = x[j];
                k = j;
            }
        x[k] = x[i];
        x[i] = large;
    }
}
Analysis of the selection sort is straightforward. The first pass makes n–1 comparisons, the second pass makes n–2, and so on. Therefore, there is a total of
(n–1) + (n–2) + … + 3 + 2 + 1 = n (n–1)/2
comparisons, which is O(n²). The number of interchanges is always n–1. There is little additional storage required (except to hold a few temporary variables). The sort may therefore be categorized as O(n²), although it is faster than bubble sort.
[Figure: example of selection sort]
Student Activity 2.3
Before going to next section, answer the following questions:
1. Compare storage efficiencies of bubble sort and selection sort.
2. Find the worst case efficiency of selection sort.
If your answers are correct, then proceed to next section.
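The three elementary sorts of the preceding sections can be collected into one self-contained C file for experimentation. This is a sketch rather than a reference implementation. The bubble sort here is a variant with two assumptions that go beyond the basic routine: the inner loop stops before the elements already in their final positions, and the sort ends early when a pass makes no interchange. The helper `swap_int` is a hypothetical name introduced for the sketch.

```c
/* Exchange two integers. */
static void swap_int(int *a, int *b)
{
    int t = *a;
    *a = *b;
    *b = t;
}

/* Bubble sort variant: pass i leaves x[n-1-i] in its final position,
   so the inner loop can stop one element earlier each time. */
void bubble(int x[], int n)
{
    for (int i = 0; i < n - 1; i++) {
        int switched = 0;
        for (int j = 0; j < n - 1 - i; j++)
            if (x[j] > x[j + 1]) {
                swap_int(&x[j], &x[j + 1]);
                switched = 1;
            }
        if (!switched)
            break;               /* no interchange: file already sorted */
    }
}

/* Insertion sort: x[0..k-1] is sorted before x[k] is inserted. */
void insertion(int x[], int n)
{
    for (int k = 1; k < n; k++) {
        int y = x[k];
        int i;
        for (i = k - 1; i >= 0 && y < x[i]; i--)
            x[i + 1] = x[i];     /* shift larger elements one place up */
        x[i + 1] = y;
    }
}

/* Selection sort: move the largest of x[0..i] into position i. */
void selection(int x[], int n)
{
    for (int i = n - 1; i > 0; i--) {
        int large = x[0];
        int k = 0;               /* index of the largest element so far */
        for (int j = 1; j <= i; j++)
            if (x[j] > large) {
                large = x[j];
                k = j;
            }
        x[k] = x[i];
        x[i] = large;
    }
}
```

Applied to the sample file 25 57 48 37 12 92 86 33, each of the three routines should leave the array as 12 25 33 37 48 57 86 92.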
Quick Sort
The next sort we consider is the quick sort (or partition exchange sort). Let x be an array, and n the number of elements in the array to be sorted. Choose an element a from a specific position within the array (for example, a can be chosen as the first element so that a = x[0]). Suppose that the elements of x are partitioned so that a is placed into position j and the following conditions hold:
1. Each of the elements in positions 0 through j–1 is less than or equal to a.
2. Each of the elements in positions j+1 through n–1 is greater than or equal to a.
Notice that if these two conditions hold for a particular a and j, a is the jth smallest element of x, so that a remains in position j when the array is completely sorted. If the foregoing process is repeated with the subarrays x[0] through x[j–1] and x[j+1] through x[n–1] and any subarrays created by the process in successive iterations, the final result is a sorted file. Hence it is a divide and conquer technique.
Let us illustrate quicksort with an example. If an initial array is given as
25 57 48 37 12 92 86 33
and the first element (25) is placed in its proper position, the resulting array is
12 25 57 48 37 92 86 33
At this point, 25 is in its proper position in the array (x[1]), each element below that position (12) is less than or equal to 25, and each element above that position (57, 48, 37, 92, 86 and 33) is greater than or equal to 25. Since 25 is in its final position, the original problem has been decomposed into the problem of sorting the two subarrays
(12) and (57 48 37 92 86 33)
Nothing need be done to sort the first of these subarrays; a file of one element is already sorted. To sort the second subarray the process is repeated and the subarray is further subdivided. The entire array may now be viewed as
12 25 (57 48 37 92 86 33)
where parentheses enclose the subarrays that are yet to be sorted. Repeating the process on the subarray x[2] through x[7] yields
12 25 (48 37 33) 57 (92 86)
and further repetitions yield
12 25 (37 33) 48 57 (92 86)
12 25 (33) 37 48 57 (92 86)
12 25 33 37 48 57 (92 86)
12 25 33 37 48 57 (86) 92
12 25 33 37 48 57 86 92
Note that the final array is sorted. By this time you will have noticed that the quicksort may be defined more conveniently as a recursive procedure.
Now we present a mechanism to partition the given file, and then present an algorithm partition to implement it.
The object of partition is to allow a specific element to find its proper position with respect to the others in the subarray. Note that the manner in which this partition is performed is irrelevant to the sorting method. All that is required by the sort is that the elements be partitioned properly. In the preceding example, the elements in each of the two subfiles remain in the same relative order as they appear in the original file. However, such a partition method is relatively inefficient to implement.
One way to effect a partition efficiently is the following: let a = x[lb] be the element whose final position is sought. Two pointers, up and down, are initialized to the upper and lower bounds of the subarray respectively. At any point during execution, each element in a position above up is greater than or equal to a and each element in a position below down is less than or equal to a. The two pointers up and down are moved towards each other in the following way.
Step 1: repeatedly increase the pointer down by one position until x[down] > a.
Step 2: repeatedly decrease the pointer up by one position until x[up] <= a.
Step 3: if up > down, interchange x[down] with x[up].
The process is repeated until the condition in step 3 fails (up <= down), at which point x[up] is interchanged with x[lb] (which equals a), whose final position was sought, and j is set to up.
We illustrate this process on the sample file, showing the positions of up and down as they are adjusted. The direction of the scan is indicated by an arrow at the pointer being moved. Three asterisks on a line indicate that an interchange is being made.
a = x[lb] = 25
25 57 48 37 12 92 86 33    down→ = 0, up = 7
25 57 48 37 12 92 86 33    down = 1, up = 7
25 57 48 37 12 92 86 33    down = 1, ←up = 7
25 57 48 37 12 92 86 33    down = 1, ←up = 6
25 57 48 37 12 92 86 33    down = 1, ←up = 5
25 57 48 37 12 92 86 33    down = 1, up = 4
***
25 12 48 37 57 92 86 33    down→ = 1, up = 4
25 12 48 37 57 92 86 33    down = 2, up = 4
25 12 48 37 57 92 86 33    down = 2, ←up = 4
25 12 48 37 57 92 86 33    down = 2, ←up = 3
25 12 48 37 57 92 86 33    down = 2, ←up = 2
25 12 48 37 57 92 86 33    down = 2, up = 1
***
12 25 48 37 57 92 86 33
At this point 25 is in its proper position (position 1), every element to its left is less than or equal to 25, and every element to its right is greater than or equal to 25. We could now proceed to sort the two subarrays (12) and (48 37 57 92 86 33) by applying the same method.
The algorithm for partition is as follows:
int partition (int x[], int lb, int ub)
{
    int a, down, temp, up;
    a = x[lb];      /* a is the element whose final position is sought */
    up = ub;
    down = lb;
    while (down < up) {
        while (x[down] <= a && down < ub)
            down++;              /* move up the array */
        while (x[up] > a)
            up--;                /* move down the array */
        if (down < up) {
            /* interchange x[down] and x[up] */
            temp = x[down];
            x[down] = x[up];
            x[up] = temp;
        }   /* end if */
    }   /* end while */
    x[lb] = x[up];
    x[up] = a;
    return (up);
}   /* end partition */
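The partition step can be tried on its own before it is used inside a sort. The sketch below restates the two-pointer scheme in a self-contained form and adds a small checker for the two partition conditions (everything left of position j is less than or equal to the pivot, everything right of it is greater than or equal). The checker `is_partitioned` is a hypothetical helper added for illustration only.

```c
#include <stdbool.h>

/* Partition x[lb..ub] around a = x[lb]; returns the final position of a. */
int partition(int x[], int lb, int ub)
{
    int a = x[lb];               /* element whose final position is sought */
    int down = lb;
    int up = ub;

    while (down < up) {
        while (down < ub && x[down] <= a)
            down++;              /* scan right for an element > a */
        while (x[up] > a)
            up--;                /* scan left for an element <= a */
        if (down < up) {         /* the pointers have not crossed yet */
            int temp = x[down];
            x[down] = x[up];
            x[up] = temp;
        }
    }
    x[lb] = x[up];               /* put a into its final position */
    x[up] = a;
    return up;
}

/* Check the two partition conditions for x[lb..ub] and position j. */
bool is_partitioned(const int x[], int lb, int ub, int j)
{
    for (int i = lb; i < j; i++)
        if (x[i] > x[j])
            return false;
    for (int i = j + 1; i <= ub; i++)
        if (x[i] < x[j])
            return false;
    return true;
}
```

For the sample file 25 57 48 37 12 92 86 33 with lb = 0 and ub = 7, the call returns position 1 and leaves the array as 12 25 48 37 57 92 86 33. The second inner loop needs no explicit bound check because x[lb] itself equals a and stops the scan.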
The quicksort routine itself may now be coded as follows:
void quicksort (int a[], int p, int q)
{
    int j;
    if (p < q) {
        // divide into two subarrays
        j = partition (a, p, q);
        // solve the sub problems
        quicksort (a, p, j-1);
        quicksort (a, j+1, q);
        // there is no need for combining solutions
    }
}
How efficient is the quicksort? Assume that the file size n is a power of 2, say n = 2^m, so that m = log₂ n. Assume also that the proper position for the pivot always turns out to be the exact middle of the subarray. In that case there will be approximately n comparisons (actually n–1) on the first pass, after which the file is split into two subarrays (subfiles) of size n/2, approximately. For each of these two files there are approximately n/2 comparisons, and a total of 4 files each of size n/4 are formed. Each of these files requires n/4 comparisons, yielding a total of 8 subfiles of size n/8. After halving the subfiles m times, there are n files of size 1. Thus the total number of comparisons for the entire sort is approximately
n + 2*(n/2) + 4*(n/4) + … + n*(n/n)
or
n + n + … + n (m times) = nm = n log₂ n
Thus the total number of comparisons is O(n log n). Thus if the foregoing properties describe the file, the quicksort is O(n log n), which is relatively efficient.
For the algorithm quicksort in which x[lb] is used as the pivot value, this analysis assumes that the original array and all the resulting subarrays are unsorted, so that the pivot value x[lb] always finds its proper position at the middle of the subarray. Suppose that the preceding conditions do not hold and the original array is sorted (or almost sorted). If, for example, x[lb] is in its correct position, the original file is split into subfiles of size 0 and n–1. If this process continues, a total of n–1 subfiles are sorted, the first of size n, the second of size n–1, the third of size n–2, and so on. Assuming k comparisons to rearrange a file of size k, the total number of comparisons to sort the entire file is
n + (n–1) + (n–2) + … + 2
which is O(n²). Similarly, if the original file is sorted in descending order, the final position of x[lb] is ub and the file is again split into two subfiles that are heavily unbalanced (sizes n–1 and 0). Thus the unmodified quicksort has the seemingly absurd property that it works best for files that are completely unsorted and worst for files that are completely sorted. This property is precisely the opposite for the bubble sort, which works best for sorted files and worst for unsorted files.
The analysis for the case in which the file size is not an integral power of 2 is similar but slightly more complex; the results, however, remain the same. It can be shown, however, that on the average (over all files of size n), the quicksort makes approximately 1.386 n log₂ n comparisons.
Student Activity 2.4
Before going to next section, answer the following questions:
1. When would quick sort be worse than simple selection sort?
2. Sort the following file using bubble sort: 27, 15, 45, 50, 23
If your answers are correct, then proceed to next section.
Merge Sort
This sort is an example of the divide-and-conquer technique. It has the nice property that in the worst case its complexity is O(n log n). This algorithm is called merge sort. We assume throughout that the elements are to be sorted in nondecreasing order. Given a sequence of n elements (also called keys) a[1], …, a[n], the general idea is to imagine them split into two sets a[1], …, a[n/2] and a[n/2+1], …, a[n]. Each set is individually sorted, and the resulting sorted sequences are merged to produce a single sorted sequence of n elements. Thus we have an ideal example of the divide-and-conquer strategy in which, as in quick sort, the problem is split into two sets, here of equal size, and the combining operation is the merging of two sorted sets into one.
The algorithm mergesort describes this process very succinctly using recursion and a function merge which merges two sorted sets. Before executing mergesort, the n elements should be placed in the array a[1], …, a[n]. Then mergesort (1, n) causes the keys to be rearranged into nondecreasing order in a.
void mergesort (int low, int high)
{
    int mid;
    if (low < high)     // if there are more than one element
    {
        // divide problem into sub problems
        // find where to split the array
        mid = (low + high)/2;
        // solve the sub problems
        mergesort (low, mid);
        mergesort (mid + 1, high);
        // combine the solutions
        merge (low, mid, high);
    }
}
void merge (int low, int mid, int high)
{
    int h, i, j, k, b[20];      // b is the auxiliary array
    h = low;
    i = low;
    j = mid + 1;
    while ((h <= mid) && (j <= high))
    {
        if (a[h] <= a[j]) {
            b[i] = a[h];
            h = h + 1;
        }
        else {
            b[i] = a[j];
            j = j + 1;
        }
        i = i + 1;
    }
    if (h > mid)
        for (k = j; k <= high; k++) {
            b[i] = a[k];
            i = i + 1;
        }
    else
        for (k = h; k <= mid; k++) {
            b[i] = a[k];
            i = i + 1;
        }
    for (k = low; k <= high; k++)
        a[k] = b[k];
}
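The listing above relies on a global array a indexed from 1 and a fixed-size auxiliary array. A self-contained variant, offered only as a sketch, passes the array explicitly, uses 0-based indices and allocates the auxiliary array once. The names `mergesort_ints` and `sort_range` are hypothetical (the name `mergesort` is avoided because some C libraries already declare a function of that name in `<stdlib.h>`).

```c
#include <stdlib.h>

/* Merge the sorted runs a[low..mid] and a[mid+1..high] using buffer b. */
static void merge(int a[], int b[], int low, int mid, int high)
{
    int h = low, i = low, j = mid + 1;

    while (h <= mid && j <= high)
        b[i++] = (a[h] <= a[j]) ? a[h++] : a[j++];
    while (h <= mid)             /* copy what is left of the first run */
        b[i++] = a[h++];
    while (j <= high)            /* copy what is left of the second run */
        b[i++] = a[j++];
    for (int k = low; k <= high; k++)
        a[k] = b[k];
}

/* Recursively sort a[low..high]. */
static void sort_range(int a[], int b[], int low, int high)
{
    if (low < high) {            /* more than one element */
        int mid = low + (high - low) / 2;
        sort_range(a, b, low, mid);
        sort_range(a, b, mid + 1, high);
        merge(a, b, low, mid, high);
    }
}

/* Sort a[0..n-1] into nondecreasing order.
   Returns 0 on success, -1 if the auxiliary array cannot be allocated. */
int mergesort_ints(int a[], int n)
{
    if (n < 2)
        return 0;
    int *b = malloc((size_t)n * sizeof *b);
    if (b == NULL)
        return -1;
    sort_range(a, b, 0, n - 1);
    free(b);
    return 0;
}
```

Allocating the buffer once in `mergesort_ints` makes the O(n) extra space of merge sort explicit and removes the fixed limit of 20 elements.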
Consider the array of ten elements
a[ ] = {310, 285, 179, 652, 351, 423, 861, 254, 450, 520}
Algorithm mergesort begins by splitting a[ ] into two subarrays each of size five. The elements in a[1 to 5] are then split into two subarrays of size three (a[1 to 3]) and two (a[4 to 5]). Then the items in a[1 to 3] are split into subarrays of size two (a[1 to 2]) and one (a[3 to 3]). The values in a[1 to 2] are split a final time into one-element subarrays, and now the merging begins. Note that no movement of data has yet taken place. A record of the subarrays is implicitly maintained by the recursive mechanism. Pictorially the file can now be viewed as
{310 | 285 | 179 | 652, 351 | 423, 861, 254, 450, 520}
where vertical bars indicate the boundaries of subarrays. Elements a[1] and a[2] are merged to get
{285, 310 | 179 | 652, 351 | 423, 861, 254, 450, 520}
Then a[3] is merged with a[1 to 2] to yield
{179, 285, 310 | 652, 351 | 423, 861, 254, 450, 520}
Next, elements a[4] and a[5] are merged:
{179, 285, 310 | 351, 652 | 423, 861, 254, 450, 520}
and then a[1 to 3] and a[4 to 5]:
{179, 285, 310, 351, 652 | 423, 861, 254, 450, 520}
At this point the algorithm has returned to the first invocation of mergesort and is about to process the second recursive call. Repeated recursive calls produce the following subarrays:
{179, 285, 310, 351, 652 | 423 | 861 | 254 | 450, 520}
Elements a[6] and a[7] are merged. Then a[8] is merged with a[6 to 7]:
{179, 285, 310, 351, 652 | 254, 423, 861 | 450, 520}
Next a[9] and a[10] are merged, and then a[6 to 8] and a[9 to 10]:
{179, 285, 310, 351, 652 | 254, 423, 450, 520, 861}
At this point there are two sorted subarrays and the final merge produces the fully sorted result:
{179, 254, 285, 310, 351, 423, 450, 520, 652, 861}
There are obviously no more than log₂ n passes in merge sort, each involving n or fewer comparisons. Thus mergesort requires no more than n log₂ n comparisons. In fact, it can be shown that the mergesort requires fewer than n log₂ n – n + 1 comparisons, on the average, compared with 1.386 n log₂ n average comparisons for quick sort. In addition, quick sort can require O(n²) comparisons in the worst case, whereas mergesort never requires more than n log₂ n. However, merge sort does require approximately twice as many assignments as quick sort on the average.
Merge sort also requires O(n) additional space for the auxiliary array, whereas quicksort requires only O(log n) additional space for the stack (if implemented by using a stack). An algorithm has been developed for an in-place merge of two sorted subarrays in O(n) time. This algorithm would allow mergesort to become
an in-place O(n log n) sort. However, that technique requires a great many more assignments and would thus not be as practical as finding the O(n) extra space.

Student Activity 2.5
Before going to next section, answer the following questions:
1. Compare the space requirements of quick sort and merge sort.
2. Sort the following file using merge sort: 16, 17, 4, 10, 9, 18
If your answers are correct, then proceed to next section.

The next sorting method that we consider is called the radix sort. This sort is based on the values of the actual digits in the positional representations of the numbers being sorted. For example, the number 235 in decimal notation is written with a 2 in the hundreds position, a 3 in the tens position, and a 5 in the units position. The larger of two such integers of equal length can be determined as follows: start at the most-significant digit and advance through the less-significant digits as long as the corresponding digits in the two numbers match. The number with the larger digit in the first position in which the digits of the two numbers do not match is the larger of the two numbers. Of course, if all the digits of both numbers match, the numbers are equal.

We can write a sorting routine based on the foregoing method. Using the decimal base, for example, the numbers can be partitioned into ten groups based on their most significant digit. Thus every element in the "0" group is less than every element in the "1" group, all of whose elements are less than every element in the "2" group, and so on. We can then sort within the individual groups based on the next significant digit. We repeat this process until each subgroup has been subdivided so that the least-significant digits are sorted. At this point the original file has been sorted. This method is sometimes called radix exchange sort.

Let us now consider an alternative to the foregoing method. It is apparent from the foregoing discussion that considerable bookkeeping is involved in constantly subdividing files and distributing their contents into subfiles based on particular digits. It would certainly be easier if we could process the entire file as a whole rather than deal with many individual files.

Suppose that we perform the following actions on the file for each digit, beginning with the least-significant digit and ending with the most-significant digit. Take each number in the order in which it appears in the file and place it into one of ten queues, depending on the value of the digit currently being processed. Then restore each queue to the original file, starting with the queue of numbers with a 0 digit and ending with the queue of numbers with a 9 digit. When these actions have been performed for each digit, starting with the least significant and ending with the most significant, the file is sorted. This sorting method is called the radix sort.

Notice that this scheme sorts on the less-significant digits first. Thus when all the numbers are sorted on a more significant digit, numbers that have the same digit in that position but different digits in a less-significant position are already sorted on the less-significant position. This allows processing of the entire file without subdividing the files and keeping track of where each subfile begins and ends. Now we illustrate this sort on the following file:
25 57 48 37 12 92 86 33

Queues based on the least significant digit:

            Front       Rear
Queue [0]
Queue [1]
Queue [2]   12          92
Queue [3]   33
Queue [4]
Queue [5]   25
Queue [6]   86
Queue [7]   57          37
Queue [8]   48
Queue [9]

After first pass: 12 92 33 25 86 57 37 48

Queues based on the most significant digit:

            Front       Rear
Queue [0]
Queue [1]   12
Queue [2]   25
Queue [3]   33          37
Queue [4]   48
Queue [5]   57
Queue [6]
Queue [7]
Queue [8]   86
Queue [9]   92

Therefore sorted file: 12 25 33 37 48 57 86 92
#define NUM 10          /* number of queues, one per decimal digit */
#define MAXELTS 100     /* assumed upper limit on n; the original listing sized the node array with NUM */

void radixsort(int x[], int n)
{
    int front[NUM], rear[NUM];
    struct {
        int info;
        int next;
    } node[MAXELTS];
    int exp, first, i, j, k, p, q, y;

    /* initialize linked list */
    for (i = 0; i < n - 1; i++) {
        node[i].info = x[i];
        node[i].next = i + 1;
    }
    node[n - 1].info = x[n - 1];
    node[n - 1].next = -1;
    first = 0;                          /* first is the head of the list */

    exp = 1;                            /* 10 raised to the (k-1)th power */
    for (k = 1; k < 5; k++) {           /* assume we have four-digit numbers */
        for (i = 0; i < NUM; i++) {     /* initialize queues */
            rear[i] = -1;
            front[i] = -1;
        }
        /* process each element on the list */
        while (first != -1) {
            p = first;
            first = node[first].next;
            y = node[p].info;
            j = (y / exp) % 10;         /* extract the kth digit */
            /* insert y into queue[j] */
            q = rear[j];
            if (q == -1)
                front[j] = p;
            else
                node[q].next = p;
            rear[j] = p;
        }
        /* At this point each record is in its proper queue based on digit k. */
        /* We now form a single list from all the queue elements. */
        /* Find the first nonempty queue. */
        for (j = 0; j < NUM && front[j] == -1; j++)
            ;
        first = front[j];
        p = j;                          /* p is the last nonempty queue found so far */
        /* link up remaining queues */
        while (j <= 9) {                /* check if finished */
            /* find the next nonempty queue */
            for (i = j + 1; i < NUM && front[i] == -1; i++)
                ;
            if (i <= 9) {
                p = i;
                node[rear[j]].next = front[i];
            }
            j = i;
        }
        node[rear[p]].next = -1;        /* terminate the combined list */
        exp = exp * 10;
    }
    /* copy back to original array */
    for (i = 0; i < n; i++) {
        x[i] = node[first].info;
        first = node[first].next;
    }
}
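The distribution passes can also be written without linked queues. The following is a small sketch under stated assumptions, not the listing above: it replaces the ten queues with a stable counting pass that computes where each queue starts in an output array, and it assumes non-negative keys. The names radix_pass, radix_sort and the limit MAXN are illustrative.

```c
#include <assert.h>

#define RADIX 10
#define MAXN  100   /* assumed upper limit on the file size */

/* One pass: stable distribution of x[0..n-1] on the digit selected by exp (1, 10, 100, ...). */
void radix_pass(int x[], int n, int exp)
{
    int count[RADIX] = {0};
    int out[MAXN];
    int i, start = 0;

    assert(n <= MAXN);
    for (i = 0; i < n; i++)
        count[(x[i] / exp) % RADIX]++;          /* size of each queue */
    for (i = 0; i < RADIX; i++) {               /* turn sizes into starting positions */
        int c = count[i];
        count[i] = start;
        start += c;
    }
    for (i = 0; i < n; i++)                     /* file order is kept inside each queue */
        out[count[(x[i] / exp) % RADIX]++] = x[i];
    for (i = 0; i < n; i++)
        x[i] = out[i];
}

/* Sort keys of at most `digits` decimal digits, least significant digit first. */
void radix_sort(int x[], int n, int digits)
{
    int exp = 1, d;
    for (d = 0; d < digits; d++, exp *= 10)
        radix_pass(x, n, exp);
}
```

Each pass touches every element once, so a file of n keys with m digits costs m passes over n elements.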
The time requirements for the radix sort clearly depend on the number of digits (m) and the number of elements in the file (n). Since the outer loop runs once per digit and each pass handles all n elements, the sort is approximately O(m*n). Thus the sort is reasonably efficient if the number of digits in the keys is not too large.

Student Activity 2.6
Before going to next section, answer the following questions:
1. Explain Radix Sort method.
2. Sort the following file using radix sort: 637, 455, 987, 462, 982
If your answers are correct, then proceed to next section.

We begin by defining a new structure, the heap. We have studied binary trees earlier. A binary tree is illustrated below.

[Figure: a complete binary tree]

A complete binary tree is said to satisfy the 'heap condition' if the key of each node is greater than or equal to the keys in its children. Thus the root node will have the largest key value. A Heap is a complete binary tree, in which each node satisfies the heap condition, represented as an array.

Trees can be represented as arrays, by first numbering the nodes (starting from the root) from left to right. The key values of the nodes are then assigned to array positions whose index is given by the number of the node. For the example tree above, the corresponding array would be

Index : 1  2  3  4  5  6  7  8  9  10  11  12
Array : Z  R  P  G  M  J  A  C  D  E   I   C

The relationships of a node can be determined from this array representation. If a node is at position j, its children will be at positions 2j and 2j + 1. Its parent will be at position ⌊j/2⌋. Consider the node M. It is at position 5. Its parent node is therefore at position ⌊5/2⌋ = 2. Its children are at positions 2*5 and (2*5) + 1, i.e. 10 and 11 respectively, i.e. E and I are its children. We see from the pictorial representation that these relationships are correct.
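The index arithmetic above can be sketched in a few lines of C. This is only an illustration: the function names are assumptions, and position 0 of the array is left unused so that positions match the 1-based numbering used in the text.

```c
#include <assert.h>

/* 1-based positions, as in the text: heap[1] is the root, heap[0] is unused. */
int parent(int j)      { return j / 2; }      /* integer division gives the floor */
int left_child(int j)  { return 2 * j; }
int right_child(int j) { return 2 * j + 1; }

/* Return 1 if heap[1..n] satisfies the heap condition, 0 otherwise. */
int is_heap(const char heap[], int n)
{
    int j;
    assert(n >= 0);
    for (j = 2; j <= n; j++)
        if (heap[parent(j)] < heap[j])
            return 0;                          /* a child is larger than its parent */
    return 1;
}
```

Checking every node against its parent is enough, because the heap condition is exactly a statement about each parent and child pair.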
We will now study the operations possible on a heap and see how these can be combined to generate a sorting algorithm. The operations on a heap work in 2 steps.
1. The required node is inserted, deleted, or replaced.
2. This may cause violation of the heap condition, so the heap is traversed and modified to rectify any such violations.

Insertion
Consider the insertion of node R in the heap of figure 2.1.
(i) Initially R is added as the right child of J and given the number 13.
(ii) But then the heap condition is violated.
(iii) Move R up to position 6 and move J down to position 13.
(iv) But the heap condition is still violated.
(v) Swap R and P.
(vi) The heap condition is now satisfied by all the nodes and we get the following heap.

[Figure: the heap after inserting R]

Deletion
Consider the deletion of M from the heap of figure 2.2. The larger of M's children is promoted to position 5, to get:

[Figure: the heap after deleting M]
Heap sort is an efficient sorting method based on heap construction and removal of nodes from the heap in order. This algorithm is guaranteed to sort n elements in n log n steps. We will first see 2 methods of heap construction and then removal in order from the heap to sort the list.

1. Top-down construction: Insert items into an initially empty heap, keeping the heap condition inviolate at all steps. Now we build a heap for the following array of characters:

[Figure: step-by-step top-down heap construction, steps (a) to (k)]

2. Bottom-up construction: Build a heap with the items in the order presented. Then, starting from the rightmost node, modify the tree to satisfy the heap condition.

Example: Now we see the above method on the same array.

[Figure: step-by-step bottom-up heap construction]
We will now see how the sorting takes place using the heap built by the top-down approach. The sorted elements will be placed in A[ ], an array of size 12.

1. Remove S and store it in A[12]
2. Remove S and store it in A[11]
3. Remove R and store it in A[10]
4. Remove P and store it in A[9]
5. Remove O and store it in A[8]
6. Remove O and store it in A[7]
7. Remove N and store it in A[6]
8. Remove L and store it in A[5]

[Figures: the heap after each removal]

Similarly the remaining nodes are removed and the heap modified to get the sorted list: AEEILNOOPRSS.
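The removal procedure above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the book's listing: it builds the heap bottom-up, keeps the 1-based positions used in the text (a[0] is unused), and stores each removed root in the position freed at the end of the array instead of a separate array A[ ]. The names sift_down and heap_sort are illustrative.

```c
#include <assert.h>

/* Restore the heap condition for the subtree rooted at position j of a[1..n]. */
static void sift_down(char a[], int j, int n)
{
    char key = a[j];
    while (2 * j <= n) {
        int c = 2 * j;                          /* left child */
        if (c < n && a[c + 1] > a[c]) c++;      /* choose the larger child */
        if (key >= a[c]) break;                 /* heap condition holds here */
        a[j] = a[c];                            /* promote the larger child */
        j = c;
    }
    a[j] = key;
}

/* Sort a[1..n] into ascending order; a[0] is unused. */
void heap_sort(char a[], int n)
{
    int j;
    assert(n >= 0);
    for (j = n / 2; j >= 1; j--)                /* bottom-up heap construction */
        sift_down(a, j, n);
    for (j = n; j >= 2; j--) {
        char largest = a[1];                    /* remove the root ... */
        a[1] = a[j];                            /* ... move the last node to the root ... */
        a[j] = largest;                         /* ... and store the root in its final place */
        sift_down(a, 1, j - 1);
    }
}
```

Each removal costs at most one walk from the root to a leaf, which is where the n log n bound comes from.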
So far, all the algorithms we have examined require that the input fit into main memory. There are, however, applications where the input is much too large to fit into memory. This section will discuss external sorting algorithms, which are designed to handle very large inputs.

Most of the internal sorting algorithms take advantage of the fact that memory is directly addressable. Shell sort compares elements A[i] and A[i - hk] in one time unit. Heapsort compares elements A[i] and A[2i + 1] in one time unit. Quick sort, with median-of-three partitioning, requires comparing A[Left], A[Center], and A[Right] in a constant number of time units. If the input is on a tape, it can only be accessed sequentially. Even if the data is on a disk, there is still a practical loss of efficiency, because of the delay required to spin the disk and move the disk head.

To see how slow external accesses really are, create a student file that is large, but not too big to fit in main memory. Read the file in and sort it using an efficient algorithm. The time it takes to sort the input is certain to be insignificant compared to the time to read the input, even though sorting is an O(n log n) operation and reading the input is only O(n).

The wide variety of mass storage devices makes external sorting much more device-dependent than internal sorting. The algorithms that we will consider work on tapes, which are probably the most restrictive storage medium. Since access to an element on tape is done by winding the tape to the correct location, tapes can be efficiently accessed only in sequential order (in either direction).

We will assume that we have at least three tape drives to perform the sorting. We need two drives to do an efficient sort; the third drive simplifies matters. If only one tape drive is present, then we are in trouble: any algorithm will require Ω(N^2) tape accesses.

The basic external sorting algorithm uses the Merge routine from mergesort. Suppose we have four tapes Ta1, Ta2, Tb1, Tb2, which are two input and two output tapes. Depending on the point in the algorithm, the a and b tapes are either input tapes or output tapes. Suppose the data is initially on Ta1. Suppose further that the internal memory can hold (and sort) M records at a time. A natural first step is to read M records at a time from the input tape, sort the records internally, and then write the sorted records alternately to Tb1 and Tb2. We will call each set of sorted records a run. When this is done, we rewind all the tapes. Suppose we have the same input as our example for Shellsort.

[Figure: the input on tape Ta1]

If M = 3, then after the runs are constructed, the tapes will contain the data indicated in the following figure.

[Figure: the runs on tapes Tb1 and Tb2]
Now, Tb1 and Tb2 contain a group of runs. We take the first run from each tape and merge them, writing the result, which is a run twice as long, onto Ta1. Then we take the next run from each tape, merge these, and write the result to Ta2. We continue this process, alternating between Ta1 and Ta2, until either Tb1 or Tb2 is empty. At this point either both are empty or there is one run left. In the latter case, we copy this run to the appropriate tape. We rewind all four tapes, and repeat the same steps, this time using the a tapes as input and the b tapes as output. This will give runs of length 4M. We continue the process until we get one run of length N.

This algorithm will require ⌈log (N/M)⌉ passes, plus the initial run-constructing pass. For instance, if we have 10 million records of 128 bytes each, and four megabytes of internal memory, then the first pass will create 320 runs. We would then need nine more passes to complete the sort. Our example requires ⌈log (13/3)⌉ = 3 more passes, which are shown in the following figure.

[Figures: the contents of the four tapes after each of the three merge passes]

If we have extra tapes, then we can expect to reduce the number of passes required to sort our input. We do this by extending the basic (two-way) merge to a k-way merge.
Merging two runs is done by winding each input tape to the beginning of each run. Then the smaller element is found, placed on an output tape, and the appropriate input tape is advanced. If there are k input tapes, this strategy works the same way, the only difference being that it is slightly more complicated to find the smallest of the k elements. We can find the smallest of these elements by using a priority queue. To obtain the next element to write on the output tape, we perform a DeleteMin operation. The appropriate input tape is advanced, and if the run on the input tape is not yet completed, we insert the new element into the priority queue. Using the same example as before, we distribute the input onto the three tapes.

[Figure: the initial runs distributed over three tapes]

We then need two more passes of three-way merging to complete the sort.

[Figures: the tapes after each of the two three-way merge passes]

After the initial run construction phase, the number of passes required using k-way merging is ⌈logk (N/M)⌉, because the runs get k times as large in each pass. For the example above, the formula is verified, since ⌈log3 (13/3)⌉ = 2. If we have 10 tapes then k = 5, and our large example from the previous section would require ⌈log5 320⌉ = 4 passes.
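The pass counts quoted above can be checked with a short calculation. The following sketch uses integer arithmetic only; the function names are assumptions made for illustration, and the ceiling of the logarithm is obtained by repeatedly dividing the number of runs by k and rounding up.

```c
#include <assert.h>

/* Number of runs produced when n records are sorted m at a time: ceil(n / m). */
long initial_runs(long n, long m)
{
    assert(n > 0 && m > 0);
    return (n + m - 1) / m;
}

/* Number of k-way merge passes needed to reduce `runs` runs to a single run. */
int merge_passes(long runs, int k)
{
    int passes = 0;
    assert(runs >= 1 && k >= 2);
    while (runs > 1) {
        runs = (runs + k - 1) / k;   /* each pass merges groups of k runs into one */
        passes++;
    }
    return passes;
}
```

With 13 records and M = 3 there are 5 initial runs, which need 3 two-way passes or 2 three-way passes; 320 runs need 9 two-way passes or 4 five-way passes, matching the figures in the text.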
The k-way merging strategy developed in the last section requires the use of 2k tapes. This could be prohibitive for some applications. It is possible to get by with only k + 1 tapes. As an example, we will show how to perform two-way merging using only three tapes.

Suppose we have three tapes, T1, T2, and T3, and an input file on T1 that will produce 34 runs. One option is to put 17 runs on each of T2 and T3. We could then merge this result onto T1, obtaining one tape with 17 runs. The problem is that since all the runs are on one tape, we must now put some of these runs on T2 to perform another merge. The logical way to do this is to copy the first eight runs from T1 onto T2 and then perform the merge. This has the effect of adding an extra half pass for every pass we do.

An alternative method is to split the original 34 runs unevenly. Suppose we put 21 runs on T2 and 13 runs on T3. We would then merge 13 runs onto T1 before T3 was empty. At this point, we could rewind T1 and T3 and merge T1, with 13 runs, and T2, which has 8 runs, onto T3. We could then merge T1 and T3, and so on. The following table shows the number of runs on each tape after each pass.

      Run     After   After   After   After   After   After   After
      Const.  T2+T3   T1+T2   T1+T3   T2+T3   T1+T2   T1+T3   T2+T3
T1    0       13      5       0       3       1       0       1
T2    21      8       0       5       2       0       1       0
T3    13      0       8       3       0       2       1       0

The original distribution of runs makes a great deal of difference. For instance, if 22 runs are placed on T2 with 12 on T3, then after the first merge, we obtain 12 runs on T1 and 10 runs on T2. After another merge, there are 10 runs on T1 and 2 runs on T3. At this point the going gets slow, because we can only merge two sets of runs before T3 is exhausted. Then T1 has 8 runs and T2 has 2 runs. Again, we can only merge two sets of runs, obtaining T1 with 6 runs and T3 with 2 runs. After three more passes T2 has two runs, and then we can finish the merge.

It turns out that the first distribution we gave is optimal. If the number of runs is a Fibonacci number FN, then the best way to distribute them is to split them into two Fibonacci numbers FN-1 and FN-2. Otherwise, it is necessary to pad the tape with dummy runs in order to get the number of runs up to a Fibonacci number. We leave the details of how to place the initial set of runs on the tapes as an exercise.

We can extend this to a k-way merge, in which case we need kth order Fibonacci numbers for the distribution, where the kth order Fibonacci number is defined as F(k)(N) = F(k)(N - 1) + F(k)(N - 2) + ... + F(k)(N - k), with the appropriate initial conditions F(k)(N) = 0, 0 ≤ N ≤ k - 2, F(k)(k - 1) = 1.

The last item we will consider is construction of the runs. The strategy we have used so far is the simplest possible. We read as many records as possible and sort them, which seems the best that can be done until one realizes that as soon as the first record is written to an output tape, the memory it used becomes available for another record. If the next record on the input tape is larger than the record we have just output, then it can be included in the run.
Using this observation, we can give an algorithm for producing runs. This technique is commonly referred to as replacement selection. Initially M records are read into memory and placed in a priority queue. We perform a DeleteMin, writing the smallest record to the output tape. We read the next record from the input tape. If it is larger than the record we have just written, we can add it to the priority queue. Otherwise it cannot go into the current run. Since the priority queue is smaller by one element, we can store this new element in the dead space of the priority queue until the run is completed and use the element for the next run. The following table shows the run construction for the small example we have been using, with M = 3. Dead elements are indicated by an asterisk.

[Table: 3 Elements in Heap Array | Output | Next Element Read]

In this example, replacement selection produces only three runs, compared with the five runs obtained by sorting. Because of this, a three-way merge finishes in one pass instead of two. If the input is randomly distributed, replacement selection can be shown to produce runs of average length 2M. For our large example, we would expect 160 runs instead of 320 runs, so a five-way merge would require four passes. In this case, we have not saved a pass, although we might if we get lucky and have 125 runs or less. Since external sorts take so long, every pass saved can make a significant difference in the running time.

As we have seen, it is possible for replacement selection to do no better than the standard algorithm. However, the input is frequently sorted or nearly sorted to start with, in which case replacement selection produces only a few very long runs. This kind of input is common for external sorts and makes replacement selection extremely valuable.

Student Activity 2.7
1. Construct a heap from the following key values: 8, 7, 5, 2, 4, 1.
Describe the sort with the help of an example.
2. What is internal sort?
If your answers are correct, then proceed to next section.

Recall that there is a mathematical notation for expressing lower bounds: if f(n) is the time for some algorithm, then we write f(n) = Ω(g(n)) to mean that g(n) is a lower bound for f(n). Formally, this can be written if there exist positive constants c and n0 such that f(n) ≥ c g(n) for all n > n0. In addition to developing lower bounds to within a constant factor, we are also concerned with determining more exact bounds whenever this is possible.

Deriving good lower bounds is often more difficult than devising efficient algorithms. Perhaps this is because a lower bound states a fact about all possible algorithms for solving a problem. Usually we cannot enumerate and analyse all these algorithms, so lower bound proofs are often hard to obtain. However, for many problems it is possible to easily observe that a lower bound identical to n exists, where n is the number of inputs to the problem.

Now let us consider the sorting problem. We can describe any sorting algorithm that satisfies the restrictions of the comparison tree model. Consider the case in which n numbers A[1:n] are to be sorted and these numbers are distinct. Now any comparison between A[i] and A[j] must result in one of two possibilities: either A[i] < A[j] or A[i] > A[j]. So if we form a comparison tree, then it will be a binary tree in which each internal node is labeled by the pair i : j, which represents the comparison of A[i] with A[j]. If A[i] is less than A[j], then the algorithm proceeds down the left branch of the tree; otherwise it proceeds down the right branch. The following figure shows a comparison tree for sorting three items.

[Figure: comparison tree for sorting three items]

We consider the worst case for all comparison-based sorting algorithms. Let T(n) be the minimum number of comparisons that are sufficient to sort n items in the worst case. We know that, if all internal nodes in a binary tree are at level less than k, then there are at most 2^k external nodes. Therefore, if we let k = T(n),
n! ≤ 2^T(n)

Since T(n) is an integer, we get the lower bound

T(n) ≥ ⌈log n!⌉

By Stirling's approximation, it follows that

⌈log n!⌉ = n log n - n/ln 2 + (1/2) log n + O(1)

where ln 2 refers to the natural logarithm of 2. This formula shows that T(n) is of the order n log n. Hence we say that any comparison-based sorting algorithm needs Ω(n log n) time.

Student Activity 2.8
Before going to next section, answer the following questions:
1. Describe lower bound theory.
2. Make a comparison tree for sorting of key values a, b, c, d.
If your answers are correct, then proceed to next section.

One of the proof techniques that is useful for obtaining lower bounds consists of making use of an oracle. The most famous oracle in history was called the Delphic oracle, located in Delphi, Greece. This oracle can still be found, situated in the side of a hill embedded in some rocks. In olden times people would approach the oracle and ask it a question. After some period of time elapsed, the oracle would reply and a caretaker would interpret the oracle's answer.

A similar phenomenon takes place when we use an oracle to establish a lower bound. Given some model of computation such as comparison trees, the oracle tells us the outcome of each comparison. To derive a good lower bound, the oracle tries its best to cause the algorithm to work as hard as it can. It does this by choosing as the outcome of the next test the result that causes the most work to be required to determine the final answer. And by keeping track of the work that is done, a worst-case lower bound for the problem can be derived.

Now we consider the merging problem. Given the sets A[1 : m] and B[1 : n], where the items in A and the items in B are sorted, we investigate lower bounds for algorithms that merge these two sets to give a single sorted set. As was the case for sorting, we assume that all the m + n elements are distinct and that A[1] < A[2] < ... < A[m] and B[1] < B[2] < ... < B[n]. It is possible that after these two sets are merged, the n elements of B can be interleaved within A in every possible way. Elementary combinatorics tells us that there are C(m+n, m) ways that the A's and B's can merge together while preserving the ordering within A and B. For example, if m = 3, n = 2, A[1] = x, A[2] = y, A[3] = z, B[1] = u, and B[2] = v, there are C(3+2, 3) = 10
ways in which A and B can merge: u, v, x, y, z; u, x, v, y, z; u, x, y, v, z; u, x, y, z, v; x, u, v, y, z; x, u, y, v, z; x, u, y, z, v; x, y, u, v, z; x, y, u, z, v; and x, y, z, u, v.

Thus if we use comparison trees as our model for merging algorithms, then there will be C(m+n, n) external nodes, and therefore at least ⌈log C(m+n, n)⌉ comparisons are required by any comparison-based merging algorithm. The conventional merging algorithm that was given earlier takes m + n - 1 comparisons. If we let MERGE(m, n) be the minimum number of comparisons needed to merge m items with n items, then we have the inequality

⌈log C(m+n, n)⌉ ≤ MERGE(m, n) ≤ m + n - 1

The exercises show that these upper and lower bounds can get arbitrarily far apart as m gets much smaller than n. This should not be a surprise because the conventional algorithm is designed to work best when m and n are approximately equal. In the extreme case when m = 1, we observe that binary insertion would require the fewest number of comparisons needed to merge A[1] into B[1], ..., B[n].

When m and n are equal, the lower bound given by the comparison tree model is too low and the number of comparisons for the conventional merging algorithm can be shown to be optimal.

Theorem: MERGE(m, m) = 2m - 1, for m ≥ 1.

Proof: Consider any algorithm that merges the two sets A[1] < ... < A[m] and B[1] < ... < B[m]. We already have an algorithm that requires 2m - 1 comparisons. If we can show that MERGE(m, m) ≥ 2m - 1, then the theorem follows. Consider any comparison-based algorithm for solving the merging problem and an instance for which the final result is B[1] < A[1] < B[2] < A[2] < ... < B[m] < A[m], that is, for which the B's and A's alternate. Any merging algorithm must make each of the 2m - 1 comparisons B[1] : A[1], A[1] : B[2], B[2] : A[2], ..., B[m] : A[m] while merging the given inputs. To see this, suppose that a comparison of type B[i] : A[i] is not made for some i. Then the algorithm cannot distinguish between the previous ordering and the one in which

B[1] < A[1] < ... < A[i - 1] < A[i] < B[i] < B[i + 1] < ... < B[m] < A[m]

So the algorithm will not necessarily merge the A's and B's properly. If a comparison of type A[i] : B[i + 1] is not made, then the algorithm will not be able to distinguish between the case in which B[1] < A[1] < B[2] < ... < B[m] < A[m] and the case in which

B[1] < A[1] < B[2] < A[2] < ... < A[i - 1] < B[i] < B[i + 1] < A[i] < A[i + 1] < ... < B[m] < A[m]

So any algorithm must make all 2m - 1 comparisons to produce this final result. The theorem follows.

For another example that we can solve using oracles, consider the problem of finding the largest and the second largest elements out of a set of n.
What is a lower bound on the number of comparisons required by any algorithm that finds these two quantities? An answer has already been provided using comparison trees. An algorithm that makes n - 1 comparisons to find the largest and then n - 2 to find the second largest gives an immediate upper bound of 2n - 3. So a large gap still remains.

This problem was originally stated in terms of a tennis tournament in which the values are called players, the largest value is interpreted as the winner, and the second largest as the runner-up. Figure 2.8 shows a sample tournament among eight players. The winner of each match (which is the larger of the two values being compared) is promoted up the tree until the final round, which in this case determines McMahon as the winner. Now, who are the candidates for second place? The runner-up must be someone who lost to McMahon but who did not lose to anyone else. In Figure 2.8 that means Guttag, Rosen, or Francez are the possible candidates for second place.

Figure 2.8 leads us to another algorithm for determining the runner-up once the winner of a tournament has been found. The players who have lost to the winner play a second tournament to determine the runner-up. This second tournament need only be replayed along the path that the winner, in this case McMahon, followed as he rose through the tree. For a tournament with n players, there are ⌈log n⌉ levels, and hence only ⌈log n⌉ - 1 comparisons are required for this second tournament. This new algorithm, which was first suggested by J. Schreier in 1932, requires a total of n - 2 + ⌈log n⌉ comparisons. Therefore we have an identical agreement between the known upper and lower bounds for this problem.

Now we show how the same lower bound can be derived using an oracle.

Theorem: Any comparison-based algorithm that computes the largest and second largest of a set of n unordered elements requires n - 2 + ⌈log n⌉ comparisons.

Proof: Assume that a tournament has been played and the largest element and the second-largest element obtained by some method. Since we cannot determine the second-largest element without having determined the largest element, we see that at least n - 1 comparisons are necessary. Therefore all we need to show is that there is always some sequence of comparisons that forces the second largest to be found in ⌈log n⌉ - 1 additional comparisons. Suppose that the winner of the tournament has played x matches. Then there are x people who are candidates for the runner-up position. The runner-up has lost only once, to the winner, and the other x - 1 candidates must have lost to one other person. Therefore we produce an oracle that decides the results of matches in such a way that the winner plays ⌈log n⌉ other people. In a match between a and b the oracle declares a the winner if a is previously undefeated and b has lost at least once, or if both a and b are undefeated but a has won more matches than b. In any other case the oracle can decide arbitrarily as long as it remains consistent.
8 9 7*/ 7:73. 0. we consider a problem originally defined and solved in the Selection given n distinct items. ! # Another technique for establishing lower bounds that is related to oracles is the state space description method.> << . A state space description is a set of rules that show the possible state (ntuples) that an algorithm can assume from a given state and a single comparison. n2) from the state (n. c. Let SELk (n) be the minimum number of comparisons needed for finding the kth element of an unordered set of size n. 8 9 7*/ 7:/ 73.1.> : . Originally the algorithm is in state (n.9. [3n/2]2 comparisons are needed. d) can make progress only if it assumes one of the five possible states shown in Figure 2. observe that the quickest way to get the a component to zero requires n/2 state changes yielding the tuple (0.3 = ? > . n/2. It is easy to see by induction that any player who has played and won only x matches can have at most 2x1 edges pointing into her or his corresponding node. consider a tournament in which the outcome of each match is determined by the above oracle. :>4 . To get the state (0. b. c is the number of items that have lost but never won. :>4 . . d). Since for the overall winner there must be an edge from each of the remaining n1 vertices. Proof The technique we use to establish a lower bound is to define an oracle by a state table. > . We would like to show that this algorithm is indeed optimal. Imagine drawing a directed graph with n vertices corresponding to this tournament. Each vertex corresponds to one of the n players. Then. We consider the state of a comparisonbased algorithm as being described by a 4tuple (a. We have already seen that . b.> . 0. 3. Draw a directed edge from vertex b to a. 0. << )=<< )=> . it is possible to derive lower bounds by arguing that the finish state cannot be reached using any fewer transitions. :>4 . Recall that the divideandconquerbased solution required [3n/2]2 comparisons. 
it follows that the winner must have played at least [log n] matches.:. this requires addition and additional n2 state changes.> * . *≥ :≥ ≥ ≥ << )=. As an example of the state space description method. 3. Theorem Any algorithm that computes the largest and smallest elements of a set of n unordered elements requires [3n/2]2 comparisons. Once the state transitions are given. 8 9 7*7:73/ . 3. Often it is possible to describe any algorithm for solving a given problem by a set of ntuples. To see this.n/2. 1. b is the number of items that have won but never lost. and d is the number of items that have both won and lost. 1.. if and only if either player a has defeated b or a has defeated another player who has defeated b.0). 1. 2 " 1 34 Selection We end this section by deriving another lower bound on the selection problem. :>4 . Next the b and c components are reduced. c. b ≠a. 0. 8 9 7*7:/ 73. 8 7*7:9 73/ . Therefore we know that asymptotically any selection algorithm requires Θ(n) time.56 ALGORITHMS AND ADVANCED DATA STRUCTURES Now. 0) and concludes with (0. 0)..> << . after each comparison the tuple (a. n2). find the maximum and the minimum. where a is the number of items that have never been compared.> * . One of the algorithms presented there has a worstcase complexity of O(n) no matte what values is being selected. 8 7*9 7:73/ .
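As an illustration of the value SEL_2(n) = n − 2 + ⌈log n⌉ quoted above, the following Python sketch plays a knockout tournament and then picks the runner-up from the players beaten directly by the winner. The names largest_two and comparison_bound, the pairing order and the handling of byes are assumptions made for this sketch; the text does not fix them.

```python
import math


def comparison_bound(n):
    """The bound n - 2 + ceil(log2 n) discussed in the text (n >= 2)."""
    return n - 2 + math.ceil(math.log2(n))


def largest_two(items):
    """Return (largest, second_largest, comparisons) for >= 2 distinct items.

    Hypothetical helper: a knockout tournament followed by a second,
    small tournament among the players who lost directly to the winner.
    """
    comparisons = 0
    # Each player carries the list of values it has beaten directly.
    players = [(x, []) for x in items]
    while len(players) > 1:
        next_round = []
        for i in range(0, len(players) - 1, 2):
            (a, beaten_a), (b, beaten_b) = players[i], players[i + 1]
            comparisons += 1
            if a > b:
                beaten_a.append(b)
                next_round.append((a, beaten_a))
            else:
                beaten_b.append(a)
                next_round.append((b, beaten_b))
        if len(players) % 2 == 1:
            next_round.append(players[-1])  # odd player out gets a bye
        players = next_round
    winner, beaten = players[0]
    # The runner-up lost only to the winner, so it is among `beaten`.
    second = beaten[0]
    for x in beaten[1:]:
        comparisons += 1
        if x > second:
            second = x
    return winner, second, comparisons
```

With eight players the sketch uses 7 comparisons for the first tournament and 2 for the second, which equals comparison_bound(8); with byes the count can fall below the bound but never exceeds it.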
In the following paragraphs we present a state table that shows that n − k + (k − 1) log(n / (2(k − 1))) ≤ SEL_k(n). We continue to use the terminology that refers to an element of the set as a player and to a comparison between two players as a match that must be won by one of the players. A procedure for selecting the kth-largest element is referred to as a tournament that finds the kth-best player.

To derive this lower bound on the selection problem, an oracle is constructed in the form of a state transition table that will cause any comparison-based algorithm to make at least n − k + (k − 1) log(n / (2(k − 1))) comparisons. The tuple size for states in this case is two (it was four for the max-min problem), and the components of a tuple, say (Map, Set), are Map, a mapping from the integers 1, 2, …, n onto itself, and Set, an ordered subset of the input. The initial state is the identity mapping (that is, Map(i) = 1, 1 ≤ i ≤ n) and the empty set. Intuitively, at any given time, the players in Set are the top players (from among all). In particular, the ith player that enters Set is the ith-best player. Candidates for entering Set are chosen according to their Map values. At any time period t the oracle is assumed to be given two unordered elements from the input, say a and b, and the oracle acts as follows:

1. If a and b are both in Set at time t, then a wins if a > b. The tuple (Map, Set) remains unchanged.
2. If a is in Set and b is not in Set, then a wins and the tuple (Map, Set) remains unchanged.
3. If a and b are both not in Set and if Map(a) > Map(b) at time t, then a wins. If Map(a) = Map(b), then it doesn't matter who wins as long as no inconsistency with any previous decision is made. In either case, if Map(a) + Map(b) ≥ n/(k − 1) at time t, then Map is unchanged and the winner is inserted into Set as a new member. If Map(a) + Map(b) < n/(k − 1), Set stays the same and we set Map(the loser) := 0 at time t + 1 and Map(the winner) := Map(a) + Map(b) at time t + 1 and, for all items w, w ≠ a, w ≠ b, Map(w) stays the same.

Lemma: Using the oracle just defined, the k − 1 best players will have played at least (k − 1) log(n / (2(k − 1))) matches when the tournament is completed.

Proof: At time t the number of matches won by any player x is greater than or equal to ⌈log Map(x)⌉. The elements in Set are ordered so that x_1 < … < x_j. Now for all w in the input, the sum of Map(w) over all w is n. Let W = {y : y is not in Set but Map(y) > 0}. Since for all w in the input Map(w) < n/(k − 1), it follows that the size of Set plus the size of W is greater than k − 1. However, since the elements y in W can only be less than some x_i in Set, if the size of Set is less than k − 1 at the end of the tournament, then any player in Set or W is a candidate to be one of the k − 1 best players. This is a contradiction, so it follows that at the end of the tournament |Set| ≥ k − 1.

We are now in a position to establish the main theorem.

Theorem [Hyafil]: The function SEL_k(n) satisfies SEL_k(n) ≥ n − k + (k − 1) log(n / (2(k − 1))).

Proof: According to the lemma, the k − 1 best players have played at least (k − 1) log(n / (2(k − 1))) matches. Any player who is not among the k best players has lost at least one match against a player who is not among the k − 1 best. Thus there are n − k additional matches that were not included in the count of the matches played by the k − 1 top players.

Student Activity 2.9
Before going to next section, answer the following questions:
1. Let F(n) be the minimum number of comparisons, in the worst case, needed to insert B[1] into the ordered set A[1] < A[2] < … < A[n]. Prove by induction that F(n) ≥ ⌈log(n + 1)⌉.
2. Let m = αn. Then by Stirling's approximation, the logarithm of the binomial coefficient C(αn + n, αn) is n[(1 + α) log(1 + α) − α log α] − (1/2) log n + O(1). Show that as α → 0, the difference between this formula and m + n − 1 gets arbitrarily large.
3. A search program is a finite sequence of instructions of three types: (1) if (f(x) r 0) goto L1; else goto L2, where r is either <, >, or = and x is a vector; (2) accept; and (3) reject. The sum of subsets problem asks for a subset I of the integers 1, 2, …, n for the inputs w_1, …, w_n such that the sum of w_i over i ∈ I equals b, where b is a given number. Consider search programs for which the function f is restricted so that it can only make comparisons of the form (sum of w_i over i ∈ I) = b. Using the adversary technique, D. Dobkin and R. Lipton have shown that Ω(2^n) such operations are required to solve the sum of subsets problem (w_1, …, w_n, b). See if you can derive their proof.
If your answers are correct, then proceed to next section.

Top

Spanning Trees

Let G = (V, E) be an undirected connected graph. A subgraph t = (V, E1) of G is a spanning tree of G if t is a tree.

Example: Figure 2.10 shows the complete graph on four nodes together with three of its spanning trees.
Spanning trees have many applications. For example, they can be used to obtain an independent set of circuit equations for an electric network. In practical situations, the edges have weights assigned to them. These weights may represent the cost of construction, the length of the link, and so on. Given such a weighted graph, one would then wish to select links connecting the cities so as to have minimum total cost or minimum total length. In either case the links selected have to form a tree. We are therefore interested in finding a spanning tree of G with minimum cost. Figure 2.11 shows a graph and one of its minimum spanning trees. Since the identification of a minimum spanning tree involves the selection of a subset of the edges, this problem fits the subset paradigm.

[Figure 2.11: (a) a weighted graph; (b) one of its minimum spanning trees]

We present two algorithms for finding a minimum spanning tree of a weighted graph: Prim's algorithm and Kruskal's algorithm.

Prim's Algorithm

A greedy method to obtain a minimum spanning tree is to build the tree edge by edge. The next edge to include is chosen according to some optimization criterion. The simplest such criterion is to choose an edge that results in a minimum increase in the sum of the costs of the edges so far included. There are two possible ways to interpret this criterion. In the first, the set of edges so far selected forms a tree. Thus if A is the set of edges selected so far, then A forms a tree. The next edge (u, v) to be included in A is a minimum cost edge not in A with the property that A ∪ {(u, v)} is also a tree. The following example shows that this selection criterion results in a minimum spanning tree. The corresponding algorithm is known as Prim's algorithm.

Example: Figure 2.12 shows the working of Prim's method on the graph of figure 2.11(a). The spanning tree obtained is shown in figure 2.11(b) and has a cost of 99.
[Figure 2.12: stages in Prim's method on the graph of figure 2.11(a)]

Having seen how Prim's method works, let us obtain an algorithm to find a minimum cost spanning tree using this method. The algorithm will start with a tree that includes only a minimum cost edge of g. Then, edges are added to this tree one by one. The next edge (i, j) to be added is such that i is a vertex already included in the tree, j is a vertex not yet included, and the cost of (i, j), cost[i][j], is minimum among all edges (k, l) such that vertex k is in the tree and vertex l is not in the tree. To determine this edge (i, j) efficiently, we associate with each vertex j not yet included in the tree a value near[j]. The value near[j] is a vertex in the tree such that cost[j][near[j]] is minimum among all choices for near[j]. We define near[j] = 0 for all vertices j that are already in the tree. The next edge to include is defined by the vertex j such that near[j] ≠ 0 (j not already in the tree) and cost[j][near[j]] is minimum.

Prim (E, cost, n, t)
// E is the set of edges in g. cost[n][n] is the cost adjacency matrix of an
// n-vertex graph such that cost[i][j] is either a positive real number or
// infinity if no edge (i, j) exists. A minimum spanning tree is computed and
// stored as a set of edges in the array t[n−1][2]. The final cost is returned.
{
    Let (k, l) be an edge of minimum cost in E;
    mincost = cost[k][l];
    t[1][1] = k; t[1][2] = l;
    for (i = 1; i <= n; i++) // initialize near
        if (cost[i][l] < cost[i][k]) near[i] = l;
        else near[i] = k;
    near[k] = near[l] = 0;
    for (i = 2; i <= n − 1; i++)
    {   // Find n − 2 additional edges for t.
        Let j be an index such that near[j] != 0 and cost[j][near[j]] is minimum;
        t[i][1] = j; t[i][2] = near[j];
        mincost = mincost + cost[j][near[j]];
        near[j] = 0;
        for (k = 1; k <= n; k++) // update near
            if (near[k] != 0 && cost[k][near[k]] > cost[k][j]) near[k] = j;
    }
    return (mincost);
}

The time required by algorithm Prim is O(n²), where n is the number of vertices in the graph g.

Kruskal's Algorithm

There is a second possible interpretation of the optimization criteria mentioned earlier in which the edges of the graph are considered in nondecreasing order of cost. This interpretation is that the set t of edges so far selected for the spanning tree be such that it is possible to complete t into a tree. Thus t may not be a tree at all stages in the algorithm. In fact it will generally only be a forest, since the set of edges t can be completed into a tree if and only if there are no cycles in t. This method is due to Kruskal.
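Before turning to an example of Kruskal's method, here is a runnable Python sketch of procedure Prim as described above, using the same near[] bookkeeping. The figure with the edge weights is not reproduced in the text, so the weights in EDGES are an assumption, chosen only so that the spanning tree has the cost of 99 quoted for figure 2.11(b); the helper cost_matrix is likewise hypothetical.

```python
import math

# Assumed edge list (u, v, weight) for a 7-vertex graph; not taken from the text.
EDGES = [(1, 2, 28), (1, 6, 10), (2, 3, 16), (2, 7, 14), (3, 4, 12),
         (4, 5, 22), (4, 7, 18), (5, 6, 25), (5, 7, 24)]


def cost_matrix(n, edges):
    """Build a 1-indexed cost adjacency matrix; math.inf marks 'no edge'."""
    cost = [[math.inf] * (n + 1) for _ in range(n + 1)]
    for u, v, w in edges:
        cost[u][v] = cost[v][u] = w
    return cost


def prim(cost):
    """Return (mincost, tree_edges) for a connected graph with n >= 2 vertices."""
    n = len(cost) - 1
    # Start from a minimum-cost edge (k, l).
    k, l = min(((i, j) for i in range(1, n + 1) for j in range(i + 1, n + 1)),
               key=lambda e: cost[e[0]][e[1]])
    mincost = cost[k][l]
    tree = [(k, l)]
    # near[i] is the tree vertex closest to i, or 0 once i is in the tree.
    near = [0] * (n + 1)
    for i in range(1, n + 1):
        near[i] = l if cost[i][l] < cost[i][k] else k
    near[k] = near[l] = 0
    for _ in range(n - 2):  # find n - 2 additional edges
        j = min((v for v in range(1, n + 1) if near[v] != 0),
                key=lambda v: cost[v][near[v]])
        tree.append((j, near[j]))
        mincost += cost[j][near[j]]
        near[j] = 0
        for v in range(1, n + 1):  # update near
            if near[v] != 0 and cost[v][near[v]] > cost[v][j]:
                near[v] = j
    return mincost, tree
```

Calling prim(cost_matrix(7, EDGES)) on the assumed graph yields a tree of six edges. Each of the n − 2 rounds scans all vertices twice, which matches the O(n²) bound stated above.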
Example: Consider the graph of figure 2.14(a). We begin with no edges selected. Figure 2.14(a) shows the current graph with no edges selected. Edge (1, 6) is the first edge considered. It is included in the spanning tree being built. This yields the graph of figure 2.14(b). Next the edge (3, 4) is selected and included in the tree (fig. 2.14(c)). The next edge to be considered is (2, 7). Its inclusion in the tree being built does not create a cycle, so we get the graph of figure 2.14(d). Edge (2, 3) is considered next and included in the tree (figure 2.14(e)). Of the edges not yet considered, (7, 4) has the least cost. It is considered next. Its inclusion in the tree results in a cycle, so this edge is discarded. Edge (5, 4) is the next edge to be added to the tree being built. This results in the configuration of figure 2.14(f). The next edge to be considered is the edge (7, 5). It is discarded as its inclusion creates a cycle. Finally edge (6, 5) is considered and included in the tree being built. This completes the spanning tree. The resulting tree (figure 2.11(b)) has cost 99.

[Figure 2.14: stages in Kruskal's method, panels (a) to (f)]
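The edge-by-edge selection traced in this example can be sketched in Python as follows. The edge weights are an assumption (the figure is not reproduced in the text); they were chosen so that the edges are considered in the order given in the example and the total comes to 99. Keeping the vertices in disjoint sets to detect cycles is one common choice, and the function names are hypothetical.

```python
# Assumed edge list (u, v, weight); only the order of consideration and the
# total cost of 99 come from the example, the individual weights do not.
EDGES = [(1, 2, 28), (1, 6, 10), (2, 3, 16), (2, 7, 14), (3, 4, 12),
         (4, 5, 22), (4, 7, 18), (5, 6, 25), (5, 7, 24)]


def kruskal(n, edges):
    """Return (total_cost, accepted_edges) for vertices numbered 1..n."""
    parent = list(range(n + 1))  # each vertex starts in its own set

    def find(v):
        # Follow parent links to the set representative, halving the path.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    accepted, total = [], 0
    # Consider the edges in nondecreasing order of cost.
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        if len(accepted) == n - 1:
            break
        ru, rv = find(u), find(v)
        if ru != rv:           # different sets: no cycle is created
            parent[ru] = rv    # merge the two sets
            accepted.append((u, v))
            total += w
        # otherwise the edge would close a cycle and is discarded
    return total, accepted
```

On the assumed graph the edges joining vertices 4 and 7 and vertices 5 and 7 are discarded, as in the example.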
For clarity, Kruskal's method is written out more formally in the following algorithm.

1  t = ∅;
2  while ((t has less than n − 1 edges) && (E != ∅))
3  {
4      Choose an edge (v, w) from E of lowest cost;
5      Delete (v, w) from E;
6      if ((v, w) does not create a cycle in t) add (v, w) to t;
7      else discard (v, w);
8  }

Initially E is the set of all edges in g. The only functions we wish to perform on this set are (1) determine an edge with minimum cost (line 4) and (2) delete this edge (line 5). Both these functions can be performed efficiently if the edges in E are maintained as a sorted sequential list. It is not essential to sort all the edges so long as the next edge for line 4 can be determined easily. If the edges are maintained as a min heap, then the next edge to consider can be obtained in O(log |E|) time. The construction of the heap itself takes O(|E|) time.

To be able to perform step 6 efficiently, the vertices in g should be grouped together in such a way that one can easily determine whether the vertices v and w are already connected by the earlier selection of edges. If they are, then the edge (v, w) is to be discarded. If they are not, then (v, w) is to be added to t. One possible grouping is to place all vertices in the same connected component of t into a set. For example, when the edge (2, 6) is to be considered, the sets are {1, 2}, {3, 4, 6} and {5}. Vertices 2 and 6 are in different sets, so these sets are combined to give {1, 2, 3, 4, 6} and {5}. The next edge to be considered is (1, 4). Since vertices 1 and 4 are in the same set, the edge is rejected. The edge (3, 5) connects vertices in different sets and results in the final spanning tree.

Student Activity 2.10
Before going to next section, answer the following questions.
1. Why is Prim's algorithm called a greedy method?
2. Compare and contrast Prim's method with Kruskal's method.
3. Draw a spanning tree of edges {2, …, 35} using Kruskal's method.
If your answers are correct, then proceed to next section.

Top

Shortest Paths

Graphs can be used to represent the highway structure of a state or country, with vertices representing cities and edges representing sections of highway. The edges can then be assigned weights, such as the distance along that section of highway. A motorist wishing to drive from city A to B would be interested in answers to the following questions:
• Is there a path from A to B?
• If there is more than one path from A to B, which is the shortest path?
The problems defined by these questions are special cases of the path problems we study in this section. The length of a path is now defined to be the sum of the weights of the edges on that path. The starting vertex of the path is referred to as the source, and the last vertex the destination. The graphs are digraphs to allow for one-way streets. In the problem we consider, we are given a directed graph g = (V, E), a weighting function cost for the edges of g, and a source vertex v0. The problem is to determine the shortest paths from v0 to all the remaining vertices of g. It is assumed that all the weights are positive.

Dijkstra's Algorithm

This algorithm determines the lengths of the shortest paths from v0 to all other vertices in g.

Dijkstra (v, cost, dist, n)
// dist[j], 1 <= j <= n, is set to the length of the shortest path from
// vertex v to vertex j in a digraph g with n vertices. dist[v] is set to
// zero. g is represented by its cost adjacency matrix cost[n][n].
{
    for (i = 1; i <= n; i++)
    {   // initialize S
        S[i] = false; dist[i] = cost[v][i];
    }
    S[v] = true; dist[v] = 0.0; // put v in S
    for (num = 2; num <= n − 1; num++)
    {   // Determine n − 1 paths from v.
        Choose u from among those vertices not in S such that dist[u] is minimum;
        S[u] = true; // put u in S
        for (each w adjacent to u with S[w] = false) // update distances
            if (dist[w] > dist[u] + cost[u][w])
                dist[w] = dist[u] + cost[u][w];
    }
}

Example: Consider the eight-vertex digraph of figure 2.15(a) with cost adjacency matrix as in figure 2.15(b). The values of dist and the vertices selected at each iteration of the for loop on num in the previous algorithm, for finding all the shortest paths from Boston, are shown in figure 2.16. To begin with, S contains only Boston. In the first iteration of the for loop (that is, num = 2), the city u that is not in S and whose dist[u] is minimum is identified to be New York. In the next iteration of the for loop, the city that enters S is Miami, since it has the smallest dist[ ] value from among all the nodes not in S. None of the dist[ ] values are altered. The algorithm terminates when only seven of the eight vertices are in S. By the definition of dist, the distance of the last vertex, in this case Los Angeles, is correct, as the shortest path from Boston to Los Angeles can go through only the remaining six vertices.

[Figure 2.15: (a) the eight-vertex digraph; (b) its cost adjacency matrix]

[Figure 2.16: values of dist and the vertex selected at each iteration]
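The procedure above can be sketched in Python as follows. This is a minimal matrix-based version that mirrors the pseudocode, with vertices numbered from 0. The five-vertex digraph in the usage lines is a made-up example, not the city graph of figure 2.15, whose weights are not reproduced in the text.

```python
import math


def dijkstra(cost, v):
    """Return dist, where dist[j] is the shortest path length from v to j.

    cost is an n x n matrix of positive weights; math.inf marks 'no edge'.
    """
    n = len(cost)
    in_s = [False] * n
    dist = [cost[v][i] for i in range(n)]  # initialise from the row of v
    in_s[v] = True
    dist[v] = 0
    for _ in range(n - 2):  # num = 2 .. n-1
        # Choose u not in S with minimum dist[u].
        u = min((i for i in range(n) if not in_s[i]), key=lambda i: dist[i])
        in_s[u] = True
        for w in range(n):  # update distances of vertices still outside S
            if not in_s[w] and dist[u] + cost[u][w] < dist[w]:
                dist[w] = dist[u] + cost[u][w]
    return dist


# Hypothetical example digraph: edges 0->1 (10), 0->2 (3), 2->1 (4),
# 1->3 (2), 2->3 (8), 3->4 (7).
INF = math.inf
EXAMPLE = [
    [INF, 10, 3, INF, INF],
    [INF, INF, INF, 2, INF],
    [INF, 4, INF, 8, INF],
    [INF, INF, INF, INF, 7],
    [INF, INF, INF, INF, INF],
]
print(dijkstra(EXAMPLE, 0))
```

As in the text, the loop stops once all but one vertex are in S; the distance of the last vertex is already final at that point.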
Top

Connectedness and Components Algorithm

The first questions one is most likely to ask when encountering a new graph G will be: Is G connected? If G is not connected, what are the components of G? Therefore, our first algorithm will be one that determines the connectedness and components of a given graph. In addition to being an important question in its own right, the question of connectedness and components arises in many other algorithms. For example, before testing a graph G for separability, planarity, or isomorphism with another graph, it may be better for the sake of efficiency to determine the components of G and then subject each component to the desired scrutiny. The connectedness algorithm is very basic and may serve as a subroutine in more involved graph-theoretic algorithms. (The reader may be reminded here that although in drawing a graph one might see whether a graph is connected or not, the connectedness is by no means obvious to a computer or human if the graph is presented in other forms.)

Given the adjacency matrix X of a graph, it is possible to determine whether or not the graph is connected by trying various permutations of rows with the corresponding columns of X, and then checking if it is in a block-diagonal form. This, however, is an inefficient method, because it may involve n! permutations. A more efficient method could be to check for zeros in the matrix Y = X + X² + … + X^(n−1). This too is not very efficient, as it involves a large number of matrix multiplications. The following is an efficient algorithm.

Description of the Algorithm: The basic step in the algorithm is the fusion of adjacent vertices. We start with some vertex in the graph and fuse all vertices that are adjacent to it. Then we take the fused vertex and again fuse with it all those vertices that are adjacent to it now. This process of fusion is repeated until no more vertices can be fused. This indicates that a connected component has been "fused" to a single vertex. If this exhausts every vertex in the graph, the graph is connected. Otherwise, we start with a new vertex (in a different component) and continue the fusing operation.

In the adjacency matrix the fusion of the jth vertex to the ith vertex is accomplished by ORing, that is, logically adding the jth row to the ith row as well as the jth column to the ith column. (Remember that in logical adding 1 + 0 = 0 + 1 = 1 + 1 = 1 and 0 + 0 = 0.) Then the jth row and the jth column are discarded from the matrix. (If it is difficult or time consuming to discard the specified rows and columns, one may leave these rows and columns in the matrix, taking care that they are not considered again in any fusion.) Note that a self-loop resulting from a fusion appears as a 1 in the main diagonal, but parallel edges are automatically replaced by a single edge because of the logical addition (or ORing) operation. These, of course, have no effect on the connectedness of a graph.

The maximum number of fusions that may have to be performed in this algorithm is n − 1, n being the number of vertices. And since in each fusion one performs at most n logical additions, the upper bound on the execution time is proportional to n(n − 1).
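The fusion step described above can be sketched in Python as follows. The sketch assumes an undirected graph given as a 0/1 adjacency matrix with vertices numbered from 0. Instead of discarding the fused rows and columns, it leaves them in place and marks the vertex as used, which is the alternative the text mentions. The function name and the sample matrix are hypothetical.

```python
def components_by_fusion(adj):
    """Return the connected components of an undirected graph.

    adj is an n x n list of 0/1 rows; the input matrix is not modified.
    """
    n = len(adj)
    x = [list(row) for row in adj]   # working copy of the adjacency matrix
    used = [False] * n               # fused away, or already chosen as a start
    components = []
    for i in range(n):
        if used[i]:
            continue
        used[i] = True
        component = [i]
        while True:
            # Look for an unused vertex adjacent to the fused vertex i.
            j = next((v for v in range(n) if not used[v] and x[i][v]), None)
            if j is None:
                break                # nothing left to fuse into i
            for c in range(n):
                x[i][c] |= x[j][c]   # OR row j into row i
                x[c][i] |= x[c][j]   # OR column j into column i
            used[j] = True           # row and column j are ignored from now on
            component.append(j)
        components.append(sorted(component))
    return components


# Hypothetical example: edges 0-1, 1-2, 3-4; vertex 5 is isolated.
SAMPLE = [
    [0, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
print(components_by_fusion(SAMPLE))
```

The graph is connected exactly when a single component is returned.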
A proper choice of the initial vertex (to which adjacent vertices are fused) in each component would improve the efficiency, provided one did not pay too much of a price for selecting the vertex itself. A flow chart of the "Connectedness and Components Algorithm" is shown in Fig. 2.17.

[Figure 2.17: flow chart of the Connectedness and Components Algorithm]

Top

String Matching

In text-editing programs we frequently meet the problem of finding all occurrences of a pattern in a text. The pattern searched for is a particular word supplied by the user. We can also use string-matching algorithms to search for particular patterns in DNA sequences.

The string-matching problem is defined as follows. We assume that the text is an array T[1..n] of length n and that the pattern is an array P[1..m] of length m. We assume that the elements of P and T belong to a finite alphabet Σ. For example, the alphabet may be Σ = {0, 1} or Σ = {a, b, …, z}. The character arrays P and T are called strings.

We say that pattern P occurs with shift s in text T (or, equivalently, that pattern P occurs beginning at position s + 1 in text T) if 0 ≤ s ≤ n − m and T[s + 1..s + m] = P[1..m] (i.e., if T[s + j] = P[j], for 1 ≤ j ≤ m). If P occurs with shift s in T, s is called a valid shift; otherwise, an invalid shift. The string-matching problem is the problem of finding all valid shifts with which a given pattern P occurs in a given text T. Figure 2.18 illustrates these definitions.

[Figure 2.18: The string-matching problem. Our goal is to find all occurrences of the pattern P = abaa in the text T = abcabaabcabac. The pattern occurs only once in the text, at shift s = 3. The shift s = 3 is said to be a valid shift. Each character of the pattern is connected by a vertical line to the matching character in the text, and all matched characters are shown shaded.]

Now we see the naive brute-force algorithm for the string-matching problem, which has worst-case running time O((n − m + 1)m). We then present an interesting string-matching algorithm due to Rabin and Karp. This algorithm also has worst-case running time O((n − m + 1)m), but it works much better on average and in practice. It also generalizes nicely to other pattern-matching problems. The study then describes a string-matching algorithm that begins by constructing a finite automaton specifically designed to search for occurrences of the given pattern P in a text. This algorithm runs in time O(n + m|Σ|). The similar but much cleverer Knuth-Morris-Pratt (or KMP) algorithm is presented further on. The KMP algorithm runs in time O(n + m). Finally, an algorithm due to Boyer and Moore is often the best practical choice, although its worst-case running time (like that of the Rabin-Karp algorithm) is no better than that of the naive string-matching algorithm.

Notation and terminology

Σ* denotes the set of finite-length strings formed using characters from alphabet Σ. In this chapter, we consider only strings of finite length. The zero-length empty string, denoted ε, also belongs to Σ*. |x| is the length of a string x. The concatenation of two strings x and y is denoted xy. The length of xy is |x| + |y|, and xy consists of the characters from x followed by the characters from y.

A string w is a prefix of a string x, denoted w ⊂ x, if x = wy for some string y ∈ Σ*. Note that if w ⊂ x, then |w| ≤ |x|. Similarly, a string w is a suffix of a string x, denoted w ⊃ x, if x = yw for some y ∈ Σ*. It follows from w ⊃ x that |w| ≤ |x|. The empty string ε is both a suffix and a prefix of every string. For example, we have ab ⊂ abcca and cca ⊃ abcca. It is useful to note that for any strings x and y and any character a, we have x ⊃ y if and only if xa ⊃ ya. Also note that ⊂ and ⊃ are transitive relations. The following lemma will be useful later.

Student Activity 2.11
Answer the following questions:
1. How is connectedness determined from an adjacency matrix?
2. State and explain the string-matching problem.
Boyer and J. They are illustrated in Figure 2. Now we comment out lines3–4 and replace the updating of s on lines 12–13 with simple incrementations as follows : 12 13 s←s+1 else s←s+1 In the modified program. Top What is the worstcase order of time complexity of Rabinkarp algorithm? "# 1 ' #$ This algorithm is most efficient if the pattern P is relatively long and the alphabet Σ is reasonably large. This algorithm due to Robert S. a valid shifts s has been found. this program looks remarkably like the naïve stringmatchiong algorithm.P.. These heuristics are known as the “badcharacter heuristic” and “goodsuffix heuristic”. BoyerMorre Matcher (T.Σ) 1 2 3 4 5 6 7 8 9 10 11 12 13 n←length [L] m←length[P] λ←ComputeLastOccurrenceFunction (P. When a mismatch occurs. each heuristic proposes an amount by which s can safely be increased without missing a valid shift. Strother Moore.m) s←0 while s ≤n–m do j←m while j > 0 and P [j] = T[sj] do j←j–1 if j = 0 then print “Pattern occurs at shift”s s←s + 7 [0] else s←s + max (γ[j]. At this level. The BoyerMoore algorithm chooses the .m] = T[s+ 1.m. If the loop terminates with j = 0.SORTING TECHNIQUES 69 3. m –1……1. The BoyerMoore algorithm uses two heuristics that allow it to avoid much of the work that our previous stringmatching algorithms performed. the while loop beginning on line 6 considers each of the n–m+1 possible shifts s in turn. These heuristics are very effective in that they often allow the algorithm to skip altogether the examination of many text characters.. the only remarkable features of the BoyerMoore algorithm are that it compares the pattern against the text from right to left and that it increases the shifts s on lines 12–13 a value that is not necessarily 1.s +m] by comparing P[j] with T[s + j] for j = m.Σ) y←ComputeGoodSuffixFuinction (P. and the while loop beginning on line 8 tests the condition P[1. j– ≠ λ [T[s+j]]) Aside from the mystriouslooking λ’s and γ‘s. 
They can be viewed as operating independently in parallel. and line 11 prints out the value of s.19.
Since the goodsuffix heuristic proposes a movement of 3 positions. the BoyerMoore algorithm increases the shift by 4. by the amount that guarantee that the bad text character will match the rightmost occurrence of the bad character in the pattern. the badcharacter heuristic proposes increasing s by j –λ[T[s + j]]. In this example. : . uses information about where the bad text character T(S+j) occurs in the pattern (if in occurs at al) to propose a new shift. (a) Matching the pattern reminiscence against a text by comparing characters in a righttoleft manner. If the bad character doesn’t occur in the pattern. moving the pattern 3 positions to the right satisfies this condition. "# 1 # # This heuristic. (c) With the goodsuffix heuristic the pattern is moved to the right by the least amount that guaranttes that any pattern characters that align with the good suffix ce previously found in the text will match those suffix characters. > : '( : …= . at position 6. the mismatch occurs on the first comparison (P[m] ≠ T[s+m]) and the bad character T[s+m] does not occur in the pattern at all. and the goodsuffix heuristic proposes increasing s by γ[j]. (b) The badcharacter heuristics proposes moving the pattern to the right. then this heuristic makes no proposal.70 ALGORITHMS AND ADVANCED DATA STRUCTURES larger amount and increases s by that amount: when line 13 is reached after a mismatch. if possible. > '( : : … = . although a “good suffix” ce of the pattern matched correctly against the corresponding characters in the text (matching character are shown shaded). was discovered in the text. which didn’t match the corresponding character n in the pattern. when a mismatch occurs. (imagine . K >  : : K : ? … (c) 2 An illustration of the BoyerMoore heuristics. then the pattern may be moved completely past the bad character in the pattern is to the right of the current bad character position. the “bad character” I. The shifts s is invalid. In this example. K  : K ? … . 
s+4 K  ) : K ? … . * 3 :? . which is smaller than the 4position proposal of the badcharacter heuristic. A3 + F … = s . In the best case. moving the pattern 4 positions to the right causes the bad text character I in the text to match the rightmost I in the pattern. s+3 .
20 (b). we can increase the shift s by m. (Since j. and so the badcharacter heuristic is essentially proposing to decrease s. for each a ∈ Σ a ∈ Σ . jk<0. since each text character examined yields a mismatch. We must consider three cases to prove this claim. if any such k exists. So.20 (c). We claim that we may safely increase s by jm. the rightmost occurrence of the bad character is in the pattern to the left of position j.λ [T[s+j]] is negative if the rightmost occurrence of the bad character T[s+j] in the pattern is to the right of position j.λ [Ts[s+j]] on line 13 of BoyerMoore Matcher implements the badcharacter heuristic. This algorithm works as follows. Now we give a simple program that defines λ [a] to be the index of the rightmost position in the pattern at which character a occurs. This bestcase behaviour illustrates the power of matching rightofleft instead of lefttoright. K=0: from Figure 2. to ensure that the algorithm makes progress at each step).20. proposed by the goodsuffix heuristic. we increase s by jk without missing any valid shifts. and so we can safely increase s by j without missing any valid shifts. K<j: from Figure 2. since any shift smaller than s+m will align some pattern character against the bad character. as illustrated by Figure 2. the bad character T[s+j] didn’t occur in the pattern at all. we rely on the positively of y[j]. let k=0. Now let k be the largest index in the range I ≤ k ≤ m such that T[s+j]=P[k].20 (a). Then we call λ the lastoccurrence function for the pattern. then λ [a] is set to 0. because the goodsuffix heuristic will propose a shift to the right in all cases. thus causings to increase by m. With this definition. causing a mismatch.SORTING TECHNIQUES 71 searching for am in the text string bn). k>j: from Figure 2. where I ≤ ≤ j ≤ m. This recommendation will be ignored by this algorithm. Otherwise. the BoyerMoore algorithm examines only a fraction I/m of the text characters. If the best case occurs repeatedly. . 
so that jk>0 and the pattern must be moved jk characters to the right before the bad text character matches any pattern characters any pattern character. If a is not found in the pattern. Hence. Assume we have just found a mismatch: P[j] ≠ T[s+j]. the expression j.
[Figure 2.20: The cases of the bad-character heuristic: (a) k = 0, (b) k < j, (c) k > j]

ComputeLastOccurrenceFunction (P, m, Σ)
1   for each character a ∈ Σ
2       do λ[a] ← 0
3   for j ← 1 to m
4       do λ[P[j]] ← j
5   return λ

The running time of procedure ComputeLastOccurrenceFunction is O(|Σ| + m).
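The last-occurrence function can be tried out with the short Python sketch below. The matcher that follows it is a deliberate simplification and not the full Boyer-Moore algorithm: it uses only the bad-character proposal and falls back to a shift of 1 whenever that proposal is not positive, where the full algorithm would rely on γ. The function names are assumptions made for this sketch, and indices are converted from the 1-based convention of the text to Python's 0-based strings.

```python
def compute_last_occurrence(pattern, alphabet):
    """Return lam, where lam[a] is the rightmost 1-based position of a in pattern.

    Characters of the alphabet that do not occur in the pattern map to 0.
    """
    lam = {a: 0 for a in alphabet}
    for j, ch in enumerate(pattern, start=1):
        lam[ch] = j
    return lam


def bad_character_matcher(text, pattern, alphabet):
    """Return all valid shifts, using only the bad-character heuristic.

    Simplified sketch: the shift after a mismatch is max(1, j - lam[...]);
    after a match the pattern is moved by one position.
    """
    n, m = len(text), len(pattern)
    lam = compute_last_occurrence(pattern, alphabet)
    shifts = []
    s = 0
    while s <= n - m:
        j = m
        # Compare right to left; pattern[j - 1] is P[j] in the text's notation.
        while j > 0 and pattern[j - 1] == text[s + j - 1]:
            j -= 1
        if j == 0:
            shifts.append(s)
            s += 1
        else:
            bad = text[s + j - 1]
            s += max(1, j - lam.get(bad, 0))
    return shifts


print(bad_character_matcher("abcabaabcabac", "abaa", "abc"))
```

For the pattern reminiscence the rightmost i is at position 6, which agrees with the caption of Figure 2.19.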
Here we need to define the relation Q ~ R (read "Q is similar to R") for strings Q and R to mean that Q ⊐ R or R ⊐ Q, where X ⊐ Y denotes that X is a suffix of Y and Pk denotes the k-character prefix P[1..k] of the pattern. We can align two similar strings with their rightmost characters matched, and no pair of aligned characters will disagree. The relation "~" is symmetric: Q ~ R if and only if R ~ Q. We also have, as a consequence of the Lemma discussed earlier, that
Q ⊐ R and S ⊐ R imply Q ~ S. (1)
If P[j] ≠ T[s+j], where j < m, then the good-suffix heuristic says that we can advance s by
γ[j] = m – max {k : 0 ≤ k < m and P[j+1..m] ~ Pk}.
That is, γ[j] is the least amount we can advance s and not cause any characters in the "good suffix" T[s+j+1..s+m] to be mismatched against the new alignment of the pattern. We call γ the good-suffix function for the pattern P.
We now show how to compute the good-suffix function γ. We first observe that γ[j] ≤ m – π[m] for all j, as follows. If w = π[m], then Pw ⊐ P by the definition of π. Furthermore, since P[j+1..m] ⊐ P for any j, we have Pw ~ P[j+1..m], by equation (1). Therefore, γ[j] ≤ m – π[m] for all j.
Now we rewrite our definition for γ as
γ[j] = m – max {k : π[m] ≤ k < m and P[j+1..m] ~ Pk}.
The condition that P[j+1..m] ~ Pk holds if either P[j+1..m] ⊐ Pk or Pk ⊐ P[j+1..m]. But the latter possibility implies that Pk ⊐ P and thus that k ≤ π[m], by the definition of π. This latter possibility cannot reduce the value of γ[j] below m – π[m]. We can therefore rewrite our definition of γ still further as follows:
γ[j] = m – max ({π[m]} ∪ {k : π[m] < k < m and P[j+1..m] ⊐ Pk}).
(The second set may be empty.) It is worth observing that the definition implies that γ[j] > 0 for all j = 1, 2, …, m, which ensures that the Boyer-Moore algorithm makes progress.
To simplify the expression for γ further, we define P' as the reverse of the pattern P and π' as the corresponding prefix function. That is, P'[i] = P[m – i + 1] for i = 1, 2, …, m, and π'[t] is the largest u such that u < t and P'u ⊐ P't.
If k is the largest possible value such that P[j+1..m] ⊐ Pk, then we claim that
π'[l] = m – j, (2)
where l = (m – k) + (m – j). To see that this claim is well defined, note that P[j+1..m] ⊐ Pk implies that m – j ≤ k, and thus l ≤ m. Also, j < m and k ≤ m, so that l ≥ 1. We prove this claim as follows. Since P[j+1..m] ⊐ Pk, we have P'm–j ⊐ P'l. Therefore, π'[l] ≥ m – j. Suppose now that p > m – j, where p = π'[l]. Then, by the definition of π', we have P'p ⊐ P'l or, equivalently, P'[1..p] = P'[l–p+1..l]. Rewriting this equation in terms of P rather than P', we have P[m–p+1..m] = P[m–l+1..m–l+p]. Substituting for l = 2m – k – j, we obtain P[m–p+1..m] = P[k–m+j+1..k–m+j+p], which implies P[m–p+1..m] ⊐ Pk–m+j+p. Since p > m – j, we have j + 1 > m – p + 1, and so P[j+1..m] ⊐ P[m–p+1..m], implying that P[j+1..m] ⊐ Pk–m+j+p by the transitivity of ⊐. Finally, since p > m – j, we have k' > k, where k' = k – m + j + p, contradicting our choice of k as the largest possible value such that P[j+1..m] ⊐ Pk. This contradiction means that we can't have p > m – j, and thus p = m – j, which proves the claim (2).
Using equation (2), and noting that π'[l] = m – j implies that j = m – π'[l] and k = m – l + π'[l], we can rewrite our definition of γ still further.
γ[j] = m – max ({π[m]} ∪ {m – l + π'[l] : 1 ≤ l ≤ m and j = m – π'[l]})
     = min ({m – π[m]} ∪ {l – π'[l] : 1 ≤ l ≤ m and j = m – π'[l]}) (3)
Again, the second set may be empty.
Now we see the procedure for computing γ:
COMPUTEGOODSUFFIXFUNCTION (P, m)
1. π ← ComputePrefixFunction (P)
2. P' ← reverse (P)
3. π' ← ComputePrefixFunction (P')
4. for j ← 0 to m
5.     do γ[j] ← m – π[m]
6. for l ← 1 to m
7.     do j ← m – π'[l]
8.         if γ[j] > l – π'[l]
9.             then γ[j] ← l – π'[l]
10. return γ
The procedure COMPUTEGOODSUFFIXFUNCTION is a direct implementation of equation (3). Its running time is O(m).
The worst-case time complexity of the Boyer-Moore algorithm is clearly O((n – m + 1)m + |Σ|), since COMPUTELASTOCCURRENCEFUNCTION takes time O(m + |Σ|), COMPUTEGOODSUFFIXFUNCTION takes time O(m), and the Boyer-Moore algorithm (like the Rabin-Karp algorithm) spends O(m) time validating each valid shift s.
Knuth, Morris and Pratt gave a linear-time string-matching algorithm. Their algorithm achieves a Θ(n + m) running time by avoiding the computation of the transition function δ altogether. It does the pattern matching using just an auxiliary function π[1..m] precomputed from the pattern in time O(m). The array π allows the transition function δ to be computed efficiently "on the fly" as needed. We can say that, for any state q = 0, 1, …, m and any character a ∈ Σ, the value π[q] contains the information that is independent of a and is needed to compute δ(q, a). (This remark will be clarified shortly.) Since the array π has only m entries, whereas δ has O(m|Σ|) entries, we save a factor of |Σ| in preprocessing by computing π rather than δ.
The prefix function for a pattern provides knowledge about how the pattern matches against shifts of itself. This information can be used to avoid testing useless shifts in the naïve pattern-matching algorithm or to avoid the precomputation of δ for a pattern-matching automaton.
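As a rough illustration of how an auxiliary function π can drive the matching, here is a minimal Python sketch of a prefix-function computation and a Knuth-Morris-Pratt style matcher. The names compute_prefix_function and kmp_matcher are illustrative assumptions. Python strings are 0-based, so entry 0 of the list pi is unused padding and pi[1..m] plays the role of π[1..m]; the matcher returns the list of valid shifts instead of printing them.

```python
def compute_prefix_function(pattern: str) -> list[int]:
    """pi[q] = length of the longest prefix of the pattern that is a proper
    suffix of its first q characters (pi[0] is unused padding)."""
    m = len(pattern)
    pi = [0] * (m + 1)
    k = 0
    for q in range(2, m + 1):
        # pattern[k] is P[k+1] and pattern[q-1] is P[q] in 1-based notation.
        while k > 0 and pattern[k] != pattern[q - 1]:
            k = pi[k]
        if pattern[k] == pattern[q - 1]:
            k += 1
        pi[q] = k
    return pi


def kmp_matcher(text: str, pattern: str) -> list[int]:
    """Return every shift s such that text[s:s+m] equals the pattern."""
    n, m = len(text), len(pattern)
    if m == 0:
        return []  # assumption: an empty pattern yields no reported shifts
    pi = compute_prefix_function(pattern)
    shifts = []
    q = 0  # number of pattern characters matched so far
    for i in range(1, n + 1):
        while q > 0 and pattern[q] != text[i - 1]:
            q = pi[q]  # fall back to the next candidate prefix
        if pattern[q] == text[i - 1]:
            q += 1
        if q == m:
            shifts.append(i - m)
            q = pi[q]  # keep looking for overlapping occurrences
    return shifts


if __name__ == "__main__":
    print(compute_prefix_function("ababaca")[1:])
    print(kmp_matcher("abababacaba", "ababaca"))
```

Neither function ever builds a transition table indexed by the alphabet, which is where the saving of a factor of |Σ| in preprocessing comes from.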
Now see the operation of the naïve string matcher. Figure 2.21(a) shows a particular shift s of a template containing the pattern P = ababaca against a text T. For this example, q = 5 of the characters have matched successfully, but the 6th pattern character fails to match the corresponding text character. The information that q characters have matched successfully determines the corresponding text characters. If we know these q text characters then we can determine immediately that certain shifts are invalid. In the example of the figure, the shift s + 1 is necessarily invalid, since the first pattern character, an a, would be aligned with a text character that is known to match with the second pattern character, a b. The shift s + 2 shown in part (b) of the figure, however, aligns the first three pattern characters with three text characters that must necessarily match.
[Figure 2.21: the pattern P = ababaca aligned against a text T, parts (a) to (c)]
In general it is useful to know the answer to the following question:
Given that pattern characters P[1..q] match text characters T[s+1..s+q], what is the least shift s' > s such that
P[1..k] = T[s'+1..s'+k], (α)
where s' + k = s + q?
Such a shift s' is the first shift greater than s that is not necessarily invalid due to our knowledge of T[s+1..s+q]. In the best case, we have that s' = s + q, and shifts s + 1, s + 2, …, s + q – 1 are all immediately ruled out. In any case, at the new shift s' we don't need to compare the first k characters of P with the corresponding characters of T, since we are guaranteed that they match by equation (α).
The necessary information can be precomputed by comparing the pattern against itself; see Figure 2.21(c). Since T[s'+1..s'+k] is part of the known portion of the text, it is a suffix of the string Pq. Equation (α) can therefore be interpreted as asking for the largest k < q such that Pk ⊐ Pq. Then, s' = s + (q – k) is the next potentially valid shift. It turns out to be convenient to store the number k of matching characters at the new shift s', rather than storing, say, s' – s. This information can be used to speed up both the naïve string-matching algorithm and the finite-automaton matcher.
We formalize the precomputation required as follows. Given a pattern P[1..m], the prefix function for the pattern P is the function π : {1, 2, …, m} → {0, 1, …, m–1} such that
π[q] = max {k : k < q and Pk ⊐ Pq}.
That is, π[q] is the length of the longest prefix of P that is a proper suffix of Pq.
The Knuth-Morris-Pratt matching algorithm is given in pseudocode below as the procedure KMPMatcher. It is mostly modeled after FiniteAutomatonMatcher, as we shall see. KMPMatcher calls the auxiliary procedure ComputePrefixFunction to compute π.
KMPMatcher (T, P)
1  n ← length[T]
2  m ← length[P]
3  π ← ComputePrefixFunction (P)
4  q ← 0
5  for i ← 1 to n
6      do while q > 0 and P[q + 1] ≠ T[i]
7             do q ← π[q]
8         if P[q + 1] = T[i]
9             then q ← q + 1
10        if q = m
11            then print "Pattern occurs with shift" i – m
12                 q ← π[q]
ComputePrefixFunction (P)
1  m ← length[P]
2  π[1] ← 0
3  k ← 0
4  for q ← 2 to m
5      do while k > 0 and P[k + 1] ≠ P[q]
6             do k ← π[k]
7         if P[k + 1] = P[q]
8             then k ← k + 1
9         π[q] ← k
10 return π
The time complexity of ComputePrefixFunction is O(m), as an amortized analysis shows. We can attach a potential of k to the current state k of the algorithm. This potential has an initial value of 0 (from line 3). Line 6 decreases k whenever it is executed, since π[k] < k. Since π[k] ≥ 0 for all k, however, k can never become negative. The only other line that affects k is line 8, which increases k by at most one during each execution of the for loop body. Since k < q upon entering the for loop and since q is incremented in each iteration of the for loop body, k < q always holds. (This justifies the claim that π[q] < q as well, by line 9.) We can pay for each execution of the while loop body on line 6 with the corresponding decrease in the potential function, since π[k] < k. Line 8 increases the potential function by at most one, so that the amortized cost of the loop body on lines 5–9 is O(1). Since the number of outer-loop iterations is O(m), and since the final potential function is at least as great as the initial potential function, the total actual worst-case running time of ComputePrefixFunction is O(m).
The Knuth-Morris-Pratt algorithm has time complexity O(m + n). The call of ComputePrefixFunction takes O(m) time as we have just seen, and a similar amortized analysis, using the value of q as the potential function, shows that the remainder of KMPMatcher takes O(n) time.
• Average case time complexities of Bubble sort, insertion sort and selection sort are O(n²).
• Merge Sort has space complexity O(n).
• Prim's and Kruskal's algorithms are used to find a minimum spanning tree.
• Finding all occurrences of a pattern in a text is a problem known as string matching.
I. True and False
1. Prim's algorithm is used to find minimum spanning tree.
2. Selection sort is more efficient than quick sort.
II. Fill in the blanks
1. Kruskal's algorithm is used to find _______________.
2. Average case time complexity of quick sort is _______________.
3. Two string matching algorithms are _________ and ________ algorithms.
Answers
I. True and False
1. True
2. False
II. Fill in the blanks
1. minimum spanning tree
2. n log n
3. Boyer-Moore, Knuth-Morris-Pratt
I. True and False
1. A selection sort is one in which successive elements are selected in order and placed into their proper sorted positions.
2. If a pattern is relatively long and the alphabet is reasonably large then the Boyer-Moore algorithm is the most efficient string matching algorithm.
II. Fill in the blanks
1. Time complexity of Bubble Sort is _______________.
2. Space complexity of Merge sort is _______________.
3. Prim's algorithm is a _______________ method for creation of minimum spanning tree.
4. Kruskal's algorithm is used to create _______________ tree _______________.
5. Two string matching algorithms are _______________ and _______________.
Dynamic Programming
Learning Objectives
• Overview
• Principle of Optimality
• Matrix Multiplication
• Optimal Binary Search Trees
Top
Overview
Dynamic programming is an algorithm design method that can be used when the solution to a problem can be viewed as the result of a sequence of decisions. One way to solve problems for which it is not possible to make a sequence of stepwise decisions leading to an optimal decision sequence is to try all possible decision sequences. We could enumerate all decision sequences and then pick out the best. But the time and space requirements may be prohibitive. Dynamic programming often drastically reduces the amount of enumeration by avoiding the enumeration of some decision sequences that cannot possibly be optimal. In dynamic programming an optimal sequence of decisions is obtained by making implicit appeal to the principle of optimality.
Top
Principle of Optimality
The principle of optimality states that an optimal sequence of decisions has the property that whatever the initial state and decision are, the remaining decisions must constitute an optimal decision sequence with regard to the state resulting from the first decision.
Thus, the essential difference between the greedy method and dynamic programming is that in the greedy method only one decision sequence is ever generated. In dynamic programming many decision sequences may be generated. However, sequences containing suboptimal subsequences cannot be optimal (if the principle of optimality holds) and so will not (as far as possible) be generated.
Another important feature of the dynamic programming approach is that optimal solutions to subproblems are retained so as to avoid recomputing their values. The use of these tabulated values makes it natural to recast the recursive equations into an iterative algorithm.
Student Activity 3.1
Before going to next section, answer the following questions:
1. What is Dynamic programming?
2. State principle of optimality.
If your answers are correct, then proceed to next section.
Top
Matrix Multiplication
Our first example of dynamic programming is an algorithm that solves the problem of matrix-chain multiplication. We have a sequence (chain) (A1, A2, …, An) of n matrices to be multiplied, and our goal is to compute the product
A1 A2 … An (1)
We can evaluate the expression (1) using the standard algorithm for multiplying pairs of matrices as a subroutine once we have parenthesized it to resolve all ambiguities in how the matrices are multiplied together. A product of matrices is fully parenthesized if it is either a single matrix or the product of two fully parenthesized matrix products, surrounded by parentheses. We know that matrix multiplication is associative, therefore all parenthesizations yield the same product. For example, the product of matrices A1, A2, A3, A4 can be fully parenthesized in five distinct ways:
(A1 (A2 (A3 A4))),
(A1 ((A2 A3) A4)),
((A1 A2) (A3 A4)),
((A1 (A2 A3)) A4),
(((A1 A2) A3) A4).
The method of parenthesization of matrices can have a dramatic impact on the cost of evaluating the product. Consider first the cost of multiplying two matrices. The standard algorithm is given by the following pseudocode. The attributes rows and columns are the number of rows and columns in a matrix.
Matrix Multiply (A, B)
{
    if (columns [A] != rows [B])
        print_error ("incompatible dimensions")
    else
        for i = 1 to rows [A]
            for j = 1 to columns [B]
            {
                C [i, j] = 0
                for k = 1 to columns [A]
                    C [i, j] = C [i, j] + A [i, k] * B [k, j]
            }
    return C
}
The number of scalar multiplications performed by this procedure is rows [A] × columns [A] × columns [B], so different parenthesizations of a matrix product can incur very different costs.
Before we solve the matrix-chain multiplication problem by dynamic programming, we should convince ourselves that exhaustively checking all possible parenthesizations does not yield an efficient algorithm. Denote the number of alternative parenthesizations of a sequence of n matrices by P(n). Since we can split a sequence of n matrices between the kth and (k + 1)st matrices for any k = 1, 2, …, n – 1 and then parenthesize the two resulting subsequences independently, we obtain the recurrence
P(n) = 1 if n = 1
P(n) = Σ (k = 1 to n–1) P(k) P(n–k) if n ≥ 2
The solution to this recurrence is the sequence of Catalan numbers: P(n) = C(n – 1), where
C(n) = (1/(n + 1)) · (2n choose n) = Ω(4ⁿ / n^(3/2)).
The number of solutions is thus exponential in n, and the brute force method of exhaustive search is therefore a poor strategy for determining the optimal parenthesization of a matrix chain.
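The following Python sketch illustrates both points made above: how quickly the number of parenthesizations P(n) grows, and how much the cost can differ between two parenthesizations of the same product. The function names are illustrative assumptions, and the three matrix dimensions used in the cost example (10×100, 100×5 and 5×50) are hypothetical values chosen only for illustration.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_parenthesizations(n: int) -> int:
    """P(n) from the recurrence: P(1) = 1, P(n) = sum of P(k) * P(n - k)."""
    if n == 1:
        return 1
    return sum(count_parenthesizations(k) * count_parenthesizations(n - k)
               for k in range(1, n))


def catalan(n: int) -> int:
    """C(n) = (1 / (n + 1)) * binomial(2n, n)."""
    return comb(2 * n, n) // (n + 1)


def three_matrix_costs(p0: int, p1: int, p2: int, p3: int) -> tuple[int, int]:
    """Scalar multiplications for ((A1 A2) A3) and (A1 (A2 A3)),
    where A1 is p0 x p1, A2 is p1 x p2 and A3 is p2 x p3."""
    left_first = p0 * p1 * p2 + p0 * p2 * p3
    right_first = p1 * p2 * p3 + p0 * p1 * p3
    return left_first, right_first


if __name__ == "__main__":
    for n in (4, 10, 20):
        print(n, count_parenthesizations(n), catalan(n - 1))
    print(three_matrix_costs(10, 100, 5, 50))
```

For four matrices the count is 5, matching the five parenthesizations listed earlier, and for the hypothetical dimensions the two orders differ in cost by a factor of ten.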
In dynamic programming the first step is to characterize the structure of an optimal solution. For the matrix-chain multiplication problem we can perform this step as follows. For convenience, let us adopt the notation Ai…j for the matrix that results from evaluating the product Ai Ai+1…Aj. An optimal parenthesization of the product A1 A2…An splits the product between Ak and Ak+1 for some integer k in the range 1 ≤ k < n. That is, for some value of k, we first compute the matrices A1…k and Ak+1…n and then multiply them together to produce the final product A1…n. The cost of this optimal parenthesization is thus the cost of computing the matrix A1…k, plus the cost of computing Ak+1…n, plus the cost of multiplying them together. The key observation is that the parenthesization of the 'prefix' subchain A1 A2…Ak within this optimal parenthesization of A1 A2…An must be an optimal parenthesization of A1 A2…Ak. Why? If there were a less costly way to parenthesize A1 A2…Ak, substituting that parenthesization in the optimal parenthesization of A1 A2…An would produce another parenthesization of A1 A2…An whose cost was lower than the optimum: a contradiction. A similar observation holds for the parenthesization of the subchain Ak+1 Ak+2…An in the optimal parenthesization of A1 A2…An: it must be an optimal parenthesization of Ak+1 Ak+2…An. Thus, an optimal solution to an instance of the matrix-chain multiplication problem contains within it optimal solutions to subproblem instances.
Next we define the value of an optimal solution recursively in terms of the optimal solutions to subproblems. For this problem, we pick as our subproblems the problems of determining the minimum cost of a parenthesization of Ai Ai+1…Aj for 1 ≤ i ≤ j ≤ n. Let m[i, j] be the minimum number of scalar multiplications needed to compute the matrix Ai…j; the cost of a cheapest way to compute A1…n would thus be m[1, n]. Now we can define m[i, j] as follows. If i = j, the chain consists of just one matrix Ai…i = Ai, so no scalar multiplications are necessary to compute the product. Thus m[i, i] = 0 for i = 1, 2, …, n. To compute m[i, j] when i < j, we take advantage of the structure of an optimal solution from step 1. Let us suppose that the optimal parenthesization splits the product Ai Ai+1…Aj between Ak and Ak+1, where i ≤ k < j. Then m[i, j] is equal to the minimum cost for computing the subproducts Ai…k and Ak+1…j, plus the cost of multiplying these two matrices together. Since each matrix Ai has dimensions Pi–1 × Pi, computing the matrix product Ai…k Ak+1…j takes Pi–1 Pk Pj scalar multiplications, and we obtain m[i, j] = m[i, k] + m[k+1, j] + Pi–1 Pk Pj. This recurrence relation assumes that we know the value of k, which we don't. There are only j – i possible values for k, however, namely k = i, i+1, …, j–1. Since the optimal parenthesization must use one of these values for k, we need only check them all to find the best. Thus, our recursive definition for the minimum cost of parenthesizing the product Ai Ai+1…Aj becomes
m[i, j] = 0 if i = j
m[i, j] = min {m[i, k] + m[k+1, j] + Pi–1 Pk Pj : i ≤ k < j} if i < j (2)
The m[i, j] values give the costs of optimal solutions to subproblems. To help us keep track of how to construct an optimal solution, let us define s[i, j] to be a value of k at which we can split the product Ai
Ai+1…Aj to obtain an optimal parenthesization. That is, s[i, j] equals a value k such that m[i, j] = m[i, k] + m[k+1, j] + Pi–1 Pk Pj.
Now it is a simple matter to write a recursive algorithm based on recurrence (2) to compute the minimum cost m[1, n] for multiplying A1 A2…An. However, this algorithm takes exponential time, no better than the brute-force method of checking each way of parenthesizing the product. The important observation is that we have relatively few subproblems: one problem for each choice of i and j satisfying 1 ≤ i ≤ j ≤ n, or (n choose 2) + n = Θ(n²) in total. Instead of computing the solution to recurrence (2) recursively, we perform the third step of the dynamic programming paradigm and compute the optimal cost by using a bottom-up approach. The following pseudocode algorithm assumes that matrix Ai has dimensions Pi–1 × Pi for i = 1, 2, …, n. The input sequence is (P0, P1, …, Pn), where length [P] = n + 1. The procedure uses an auxiliary table m[1…n, 1…n] for storing the m[i, j] costs and an auxiliary table s[1…n, 1…n] that records which index k achieved the optimal cost in computing m[i, j].
Matrix chain order (P)
{
    n = length [P] – 1
    for (i = 1; i <= n; i++)
        m[i, i] = 0
    for (l = 2; l <= n; l++)
        for (i = 1; i <= n – l + 1; i++)
        {
            j = i + l – 1
            m[i, j] = ∞
            for (k = i; k <= j – 1; k++)
            {
                q = m[i, k] + m[k+1, j] + Pi–1 Pk Pj
                if (q < m[i, j])
                {
                    m[i, j] = q
                    s[i, j] = k
                }
            }
        }
    return m and s
}
This algorithm fills the table m in a manner that corresponds to solving the parenthesization problem on matrix chains of increasing length.
Now we see the operation of above algorithm on a chain of n = 6 matrices: Matrix dimension
The table m and s for this problem are shown below which are calculated from the above algorithm.
The minimum number of scalar multiplications needed to multiply the 6 matrices is m[1, 6] = 15,125. A simple inspection of the nested loop structure of Matrix chain order yields a running time of O(n³) for the algorithm.
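The bottom-up computation can be sketched in Python as below. This is a sketch under stated assumptions: the function names are illustrative, and the dimension sequence (30, 35, 15, 5, 10, 20, 25) is an assumed input, since the dimension table of the example is not reproduced in the text; it is used here because it yields the quoted value m[1, 6] = 15,125. Row and column 0 of both tables are unused so that indices match the 1-based pseudocode.

```python
import math


def matrix_chain_order(p: list[int]):
    """Return tables (m, s) for matrices A1..An, where Ai is p[i-1] x p[i]."""
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]  # m[i][i] = 0 already
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):            # chain length l
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = math.inf
            for k in range(i, j):             # try every split point i <= k < j
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j] = q
                    s[i][j] = k
    return m, s


def parenthesization(s, i: int, j: int) -> str:
    """Rebuild the optimal parenthesization of Ai..Aj from the split table s."""
    if i == j:
        return f"A{i}"
    k = s[i][j]
    return "(" + parenthesization(s, i, k) + parenthesization(s, k + 1, j) + ")"


if __name__ == "__main__":
    dims = [30, 35, 15, 5, 10, 20, 25]        # assumed example dimensions
    m, s = matrix_chain_order(dims)
    print(m[1][6])
    print(parenthesization(s, 1, 6))
```

The three nested loops mirror the pseudocode directly, which is where the O(n³) bound comes from; the two tables together use Θ(n²) space.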
Matrix chain order does not show how to multiply the matrices; it only determines the optimal number of scalar multiplications needed to compute a matrix-chain product. We will use the table s[1…n, 1…n] to determine the best way to multiply the matrices. Each entry s[i, j] records the value of k such that the optimal parenthesization of Ai Ai+1…Aj splits the product between Ak and Ak+1. Thus, we know that the final matrix multiplication in computing A1…n optimally is A1…s[1, n] As[1, n]+1…n. The earlier matrix multiplications can be computed recursively, since s[1, s[1, n]] determines the last matrix multiplication in computing A1…s[1, n], and s[s[1, n]+1, n] determines the last matrix multiplication in computing As[1, n]+1…n. The following recursive procedure computes the matrix-chain product Ai…j given the matrices A = (A1, A2, …, An), the table s computed by Matrix chain order, and the indices i and j. The initial call is Matrix chain multiply (A, s, 1, n).
Matrix Chain Multiply (A, s, i, j)
{
    if (j > i)
    {
        X = Matrix Chain Multiply (A, s, i, s[i, j])
        Y = Matrix Chain Multiply (A, s, s[i, j] + 1, j)
        return Matrix Multiply (X, Y)
    }
    else
        return Ai
}
In the above example the call Matrix Chain Multiply (A, s, 1, 6) computes the matrix chain product according to the parenthesization ((A1 (A2 A3)) ((A4 A5) A6)).
Student Activity 3.2
Before going to next section, answer the following questions:
1. What is matrix chain multiplication problem?
2. Describe matrix chain multiplication with an example.
If your answers are correct, then proceed to next section.
Top
Optimal Binary Search Trees
Given a fixed set of identifiers, we wish to create a binary search tree organization. We may expect different binary search trees for the same identifier set to have different performance characteristics. The tree of figure 3.2(a), in the worst case, requires four comparisons to find an identifier, whereas the tree of figure 3.2(b) requires only three. On the average the two trees need 12/5 and 11/5 comparisons, respectively. For example, in the case of the tree of figure 3.2(a), it takes 1, 2, 2, 3, and 4 comparisons, respectively, to find the identifiers for, do, while, int and if. Thus the average number of comparisons is (1+2+2+3+4)/5 = 12/5. This calculation assumes that each identifier is searched for with equal probability and that no unsuccessful searches (i.e. searches for identifiers not in the tree) are made.
In a general situation we can expect different identifiers to be searched for with different frequencies (or probabilities). In addition, we can expect unsuccessful searches also to be made. Let us assume that the given set of identifiers is {a1, a2, …, an} with a1 < a2 < … < an. Let P(i) be the probability with which we search for ai. Let q(i) be the probability that the identifier x being searched for is such that ai < x < ai+1, 0 ≤ i ≤ n (assume a0 = –∞ and an+1 = +∞). Then, Σ0≤i≤n q(i) is the probability of an unsuccessful search. Clearly, Σ1≤i≤n P(i) + Σ0≤i≤n q(i) = 1. Given this data we wish to construct an optimal binary search tree for {a1, a2, …, an}. First, of course, we must be precise about what we mean by an optimal binary search tree.
In obtaining a cost function for binary search trees, it is useful to add a fictitious node in place of every empty subtree in the search tree. Such nodes, called external nodes, are drawn square in figure 3.3. All other nodes are internal nodes. If a binary search tree represents n identifiers, then there will be exactly n internal nodes and n+1 (fictitious) external nodes. Every internal node represents a point where a successful search may terminate. Every external node represents a point where an unsuccessful search may terminate.
If a successful search terminates at an internal node at level l, then l iterations of the while loop of the binary search algorithm are needed. Hence the expected cost contribution from the internal node for ai is P(i) * level (ai).
Unsuccessful searches terminate with t = 0 (i.e. at an external node). The identifiers not in the binary search tree can be partitioned into n+1 equivalence classes Ei, 0 ≤ i ≤ n. The class E0 contains all identifiers x such that x < a1. The class Ei contains all identifiers x such that ai < x < ai+1, 1 ≤ i < n. The class En contains all identifiers x, x > an. It is easy to see that for all identifiers in the same class Ei, the search terminates at the same external node. For identifiers in different Ei the search terminates at different external nodes. If the failure node for Ei is at level l, then only l–1 iterations of the while loop are made. Hence the cost contribution of this node is q(i) * (level (Ei) – 1).
The preceding discussion leads to the following formula for the expected cost of a binary search tree:
Σ1≤i≤n P(i) * level (ai) + Σ0≤i≤n q(i) * (level (Ei) – 1)
We define an optimal binary search tree for the identifier set {a1, a2, …, an} to be a binary search tree for which the above expression is minimum.
The possible binary search trees for the identifier set (a1, a2, a3) = (do, if, while) are given in figure 3.4. With equal probabilities P(i) = q(i) = 1/7 for all i, we have
Cost (tree a) = 15/7
Cost (tree b) = 13/7
Cost (tree c) = 15/7
Cost (tree d) = 15/7
Cost (tree e) = 15/7
As expected, tree b is optimal. With P(1) = .5, P(2) = .1, P(3) = .05, q(0) = .15, q(1) = .1, q(2) = .05 and q(3) = .05 we have
Cost (tree a) = 2.65
Cost (tree b) = 1.9
Cost (tree c) = 1.5
Cost (tree d) = 2.15
Cost (tree e) = 1.6
For instance, cost (tree a) can be computed as follows. The contribution from successful searches is 3*0.5 + 2*0.1 + 0.05 = 1.75 and the contribution from unsuccessful searches is 3*0.15 + 3*0.1 + 2*0.05 + 0.05 = 0.90. All the other costs can also be calculated in a similar manner. Tree c is optimal with this assignment of P's and q's.
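The expected-cost formula can be checked mechanically. The Python sketch below evaluates it for given level assignments; the function name expected_cost and the way a tree is described (two lists of node levels) are assumptions made for illustration. The level lists reflect one reading of the shapes of trees a, b and c for (do, if, while): a left chain rooted at while, a balanced tree rooted at if, and a right chain rooted at do. These shapes are inferred from the quoted costs, not taken from the figure itself.

```python
from fractions import Fraction


def expected_cost(p, q, level_a, level_e):
    """Expected cost of a binary search tree.

    p[i], level_a[i]: probability and level of internal node a_i, i = 1..n
                      (index 0 is unused padding).
    q[i], level_e[i]: probability and level of external node E_i, i = 0..n.
    """
    n = len(q) - 1
    successful = sum(p[i] * level_a[i] for i in range(1, n + 1))
    unsuccessful = sum(q[i] * (level_e[i] - 1) for i in range(n + 1))
    return successful + unsuccessful


# Assumed shapes (levels of do, if, while and of E0..E3), root at level 1.
TREE_A = ([0, 3, 2, 1], [4, 4, 3, 2])   # while at the root, left chain
TREE_B = ([0, 2, 1, 2], [3, 3, 3, 3])   # if at the root, balanced
TREE_C = ([0, 1, 2, 3], [2, 3, 4, 4])   # do at the root, right chain

if __name__ == "__main__":
    p = [0, 0.5, 0.1, 0.05]
    q = [0.15, 0.1, 0.05, 0.05]
    for name, (la, le) in (("a", TREE_A), ("b", TREE_B), ("c", TREE_C)):
        print(name, round(expected_cost(p, q, la, le), 2))
    seventh = Fraction(1, 7)
    print(expected_cost([0] + [seventh] * 3, [seventh] * 4, *TREE_B))
```

Using Fraction for the equal-probability case keeps the result exact, so it can be compared directly with values such as 13/7.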
90 ALGORITHMS AND ADVANCED DATA STRUCTURES !" ! To apply dynamic programming to the problems of obtaining an optimal binary search tree. we need to view the construction of such a tree as the result of a sequence of decisions and then observe that the principle of optimality holds when applied to the problem state resulting from a decision.Ek–1 will be in the left subtree l of the root. If we choose ak. a2. Define Cost (l) = Σ1<I≤n P(i) * level (ai) + Σ0<I≤n q(i) * (level (Ei)–1) and Cost (r) = Σ1<I≤n P(i) * level (ai) + Σk<I≤n q(i) * (level (Ei)–1) In both cases the level is measured by regarding the root of the respective subtree to be at level 1. The remaining nodes will be in the right subtree r. A possible approach to this would be to make a decision as to which of the ai’s should be assigned to the root node of the tree.……ak–1 as well as the external nodes for classes E0 E1. j) to represent the sum a(a) + i i =1+1 [q(l ) + P(l )] . …. we obtain the . then it is clear that the external nodes for a1. Using w(i.
then for the tree to be optimal. 2)=p(2)+q(2)+w(1.. 1)}=8 r(0. It we use C(i. we get w(0.…. j)}……………………. we have w(i. n)= min {C(0. Next we can compute all C(i.Ej. If during this computation we record the root r(i. n) we obtain C(0. and so on. i <k≤j i<k≤j ≤k≤n Let n = 4 and (a1.. then all C(i. j))= min {C(i. we must have cost (l) = C(o.5): # $ " % P(k)+cost (l) + cost (r) + w (o. 1). j–1). k–1)+W(k. then an optimal binary search tree can be constructed from these r(i. 0 ≤ i ≤ 4. 1)+min {c(0. Let P (1:4) = (3. Similarly cost (r) must be minimum. 0)+ c(1. E1. 1)=p(1)+ q(1)+ w(0. k–1)+C(k. 1)=7 c(1. 2)=2 . 1) =1 w(1. n) by first computing all C(i.. j) is the value of k that minimizes equation (3). Initially. j) such that j–i=1 (note C(i. j)+P(k)+W(i. 3. the equation (1) must be minimum over all binary search trees containing a1. 1. j).….(2) We can generalize equation (2) to obtain for any C(i. j)} C(i.. a3. i) = q(i). 2)+ min {c(1. j) with j–i=3. n)+P(k)+W(0. n) is minimum. Note that r (i. k–1)+W(k. n)+W(0. a2…ak–1 and E0. a2. k–1)+W(k. 0)=8 c(0. while). 3. i) = 0 and r(i. j) C(i.. 0 ≤ I ≤ n.aj and Ei. k–1)+C(k. n)}……. 2)}=7 r(0. 1) and q (0:4) = (2. j) of each tree tij. k–1)+C(k. j) = 0 and W(i. j)+W(k. C(i. 1)+ c(2. k must be chosen such that P(k)+C(0. 1. i) = 0. The p’s and q’s have been multiplied by 16 for convenience. 2)=w(1. Ek–1. if. k–1) + w (k. 1)=w(0.DYNAMIC PROGRAMMING 91 Following as the expected cost of the search tree (figure 3. k–1) + C(0. j) = p(i)+q(j)+ w(i. int.n). j) to represent the cost of an optimal binary search tree tij containing ai+1. i)= q(i). j))= min {C(i. k–1)+C(k.…. j) such that j–i = 2. k–1) and cost (r) = C(k. Using equation (3) and the observation w(i. 1. In addition.(3) Equation (3) can be solved for C(0. a4) = (do. Hence for C(0.n)……………(1) If the tree is optimal.
3)=3 c(3. 3)=3 w(3. j+1) respectively. with the data in the table it is possible to reconstruct t04. The root of tree to 4 is a2. a3. 4)}=3 r(3. c(j. 3)}=3 r(2. The box in row i and column k shows the values of w(j. The table of figure 3. j+1). 4) are obtained. From the table we see that c(0. i+1). 3)+c(4. Hence. and r(0. 3)=w(2. i+2).j+1) and r(j.5 shows the result of this computation. 3)+ min {c(2. i+2). a4). and the right subtree t24. 4)=32 is the minimum cost of a binary search tree for (a1. 4)=p(4)+q(4)+w(3. we can again use equation (3) to compute w(i.6 shows t04. Figure 3. 0 w00=2 0 c00=0 r00=0 w01=8 1 c01=8 r01=1 w02=12 2 c02=19 r02=1 w03=14 3 c03=25 1 w11=3 c11=0 r11=0 w12=7 c12=7 r12=2 w13=9 c13=12 r13=2 w14=11 c14=13 2 w22=1 c22=0 r22=0 w23=3 c23=3 r23=3 w24=52 c24=8 r24=3 3 w33=1 c33=0 r33=0 w34=3 c34=3 r34=4 4 w44=1 c44=0 r44=0 . 4). 4). 0 ≤ i<4. 4)=4 Knowing w(i. 2)=3 c(2. The computation is carried out by row from row 0 to 4. The box in row i and column j shows the result of this computation. Tree t24 has root a3. i+1) and c(i.92 ALGORITHMS AND ADVANCED DATA STRUCTURES w(2. 3)=p(3)+q(3)+w(2. 4)=w(3. 0 ≤ i<3. This process can be repeated until w(0. Tree t01 has root aj and subtrees t00 and t11. a2. the left subtree is t01. c(i. c(0. Thus. its left subtree is t22 and its right subtree t34. 2)+c(3. i+2) and r(i. 4)+ min {c(3.
the root of tij. Hence.<an... this algorithm computes the //cost c[i. j). each such c(i. j–1) ≤ k ≤ r(i+1.aj. { for (i=0. The computation of each of these c(i. j)’ s to compute. In this case the computing time become O(n2). j). 0 ≤ i ≤ j ≤ n. r(i. Let us examine the complexity of this procedure to evaluate the c’s and r’s. j)’s requires us find the minimum of m quantities (see equation (3)). and q[i]. j] of optimal binary search //tree tij for identifiers ai+1. l<=n–1. The total time to evaluate all C(i. n in that order.E.…. in O(n2) time. j) can be computed in time O(m). The tree ton can be constructed from the values of r(i. j) p and r(i. j)’ s is therefore. //0 ≤ i ≤ n. w[i .DYNAMIC PROGRAMMING 93 r03= 2 w04=16 4 c04=32 r04=2 r14=2 & ' () *! !" *! *! If Do int while + $) ( The above example shows how equation (3) can be used to determine to c’s and r’s and also how to reconstruct t0n knowing the r’s. n) //given n distinct identifies a. Knuth which shows that the optimal k in equation (3) can be found by limiting the search to the range r(i. When j–i=m. The evaluation procedure described in the above example requires us to compute c(i.j] // is the weight of tij. The function OBST uses this result to obtain the values of w(i. //and probabilities p[i]. i ≤m≤n Σ (nm–m2)=O(n3) We can do better than this using a result due to D. j). j]. <a3<a2<. j) for (j–i) = 1.. 1 ≤ i ≤ n. I++) { . q. OBST (p. j) and c(i. 2. j) in O(n) time. It also //computes r[i. there are n–m+1 c(i.…….
for (m=r[i. i+1]=i+1. r[i. j–1]+P[j]+q[j]. n]=0. j]=k. j–1]. k = find (c. r[0. c[i. //solve equation (3) using Knuth’s result. j]=w[i. m+1) { .0. i+1]=q[i]+q[i+1]+p[i+1]. r[n.j]. j–1] ≤ l //≤ r[i+1. i. } //end OBST Find (c. j] that minimizes //c[i. r[i. i]=0. for (m=2. j]. w(0. w[i. n).0. j].94 ALGORITHMS AND ADVANCED DATA STRUCTURES //initialize w[i. c[i. c[n. } Write (C[0. n]=q[n]. i<=n–m. i]=q[i] r[i. i++) { j= i+m. j). m<=r[i+1. j) { min = 00. m++) //find optimal trees with m nodes for (i = 0. c[i. k–1]+c[k. i+1]=q[i]+q[i+1]+p[i+1]. n]). i. j]+c[i. l–1]+c[i. r. } w[n. r. //optimal trees with one node w[i. //A value of / in the range r[i. i]=0. n]=0. j]=w[i. n]. m<=n.
j]<min) { min = c[i. Matrix chain multiplication problem can be solved by _______________. To make binary search tree more efficient we need to find ___________________. We compute the optimal cost by using a potterup approach. 2. m–1]+c[m. Explain the need for optimal binary search tree. I. True and False 1. } } return (l). 2. ! • The principle of optimality status that an optimal sequence of decision has the property that whatever the initial state and decision are. In greedy method only one decision sequence is ever generated Principle of optimality does not hold for dynamic programming II. } Student Activity 3. True and False .DYNAMIC PROGRAMMING 95 if (c[i. the remainly decision not constitute an optimal decision sequence with regard to the state resulting from the first decision. j]. Fill in the blanks 1. l = m. 2. m–1]+c[m. • I.2 Answer the following questions: 1. Give the formula to find expected cost of a binary search tree. We apply the Algorithm for computing the optimal costs over recessive solution.
Answers
I. True and False
1. True
2. False
II. Fill in the blanks
1. Dynamic programming
2. Optimal binary search tree

1. Fill in the blanks:
(a) The essential difference between the greedy method and ______ is that in the _______ method only one decision sequence is ever generated.
(b) An __________ to an instance of the matrix-chain multiplication problem contains within it optimal solutions to subproblem instances.
(c) ____________ is an algorithm design method that can be used when the solution to a problem can be viewed as the result of a sequence of decisions.
(d) In dynamic programming an optimal sequence of decisions is obtained by making implicit appeal to the _________.

1. Find an optimal parenthesization of a matrix-chain product whose sequence of dimensions is (5, 10, 3, 12, 5, 50, 6).
2. Give an efficient algorithm PrintOptimalParens to print the optimal parenthesization of a matrix chain given the table s computed by Matrix chain order. Analyze your algorithm.
3. Let R(i, j) be the number of times that table entry m[i, j] is referenced by Matrix chain order in computing other table entries. Show that the total number of references for the entire table is
Σ_{i=1}^{n} Σ_{j=i}^{n} R(i, j) = (n^3 - n)/3
(Hint: You may find the identity Σ_{i=1}^{n} i^2 = n(n+1)(2n+1)/6 useful.)
4. Show that a full parenthesization of an n-element expression has exactly n-1 pairs of parentheses.
5. Use the function OBST to compute w(i, j), r(i, j), and c(i, j), 0 ≤ i ≤ j ≤ 4, for the identifier set (a1, a2, a3, a4) = (cout, float, if, while) with p(1)=1/20, p(2)=1/5, p(3)=1/10, p(4)=1/20, q(0)=1/5, q(1)=1/10, q(2)=1/5, q(3)=1/20 and q(4)=1/20. Using the r(i, j)'s, construct the optimal binary search tree.
6. (a) Show that the computing time of function OBST is O(n^2).
(b) Write an algorithm to construct the optimal binary search tree given the roots r(i, j), 0 ≤ i ≤ j ≤ n. Show that this can be done in time O(n).
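The function OBST given earlier in this unit can be sketched in runnable form as follows. This is only an illustrative Python transcription, not the text's own program: the name `obst` and the list-of-lists tables are choices made here, and the weights in the usage lines, p = (3, 3, 1, 1) and q = (2, 3, 1, 1, 1), are an assumption (they are not stated in this part of the text) picked because they reproduce the entries w04 = 16, c04 = 32 and r04 = 2 quoted from the worked example.

```python
def obst(p, q):
    """Optimal binary search tree tables using Knuth's restricted root range.

    p[1..n] are the success weights (p[0] is an unused placeholder) and
    q[0..n] are the failure weights. Returns the tables (w, c, r).
    """
    n = len(q) - 1
    w = [[0] * (n + 1) for _ in range(n + 1)]
    c = [[0] * (n + 1) for _ in range(n + 1)]
    r = [[0] * (n + 1) for _ in range(n + 1)]

    # Empty trees: w[i][i] = q[i], c[i][i] = 0, r[i][i] = 0.
    for i in range(n + 1):
        w[i][i] = q[i]

    # Optimal trees with one node.
    for i in range(n):
        w[i][i + 1] = q[i] + q[i + 1] + p[i + 1]
        c[i][i + 1] = w[i][i + 1]
        r[i][i + 1] = i + 1

    # Optimal trees with m nodes, m = 2..n.
    for m in range(2, n + 1):
        for i in range(n - m + 1):
            j = i + m
            w[i][j] = w[i][j - 1] + p[j] + q[j]
            # Knuth's result: search only r[i][j-1] <= l <= r[i+1][j].
            # min() keeps the first minimizer, like the strict '<' test in Find.
            k = min(range(r[i][j - 1], r[i + 1][j] + 1),
                    key=lambda l: c[i][l - 1] + c[l][j])
            c[i][j] = w[i][j] + c[i][k - 1] + c[k][j]
            r[i][j] = k
    return w, c, r


# Assumed example weights (see the note above).
w, c, r = obst([0, 3, 3, 1, 1], [2, 3, 1, 1, 1])
print(w[0][4], c[0][4], r[0][4])  # 16 32 2
```

Because the inner search is confined to the range given by Knuth's result, the total work over all (i, j) pairs is O(n^2) rather than O(n^3), which is the point of the analysis above.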
Overview
Polynomial Time
NP-completeness and Reducibility
NP-completeness
NP-completeness Proofs
NP-complete Problems

NP Complete Problem
Learning Objectives
• Overview
• Polynomial time
• NP-Completeness and Reducibility
• NP-Completeness Proofs
• NP-Complete Problems

Overview
All of the algorithms we have studied thus far have been polynomial-time algorithms: on inputs of size n, their worst-case running time is O(n^k), where k is a constant. Can all problems be solved in polynomial time? The answer is no. Some problems, such as Turing's famous "Halting Problem," cannot be solved by any computer, no matter how much time is provided. Some problems can be solved, but not in time O(n^k) for any constant k. The problems that are solvable by polynomial-time algorithms are called tractable, and problems that need superpolynomial time are called intractable.

We discuss an interesting class of problems called the "NP-complete" problems in this chapter, whose status is unknown. No polynomial-time algorithm has yet been discovered for an NP-complete problem, nor are we yet able to prove a superpolynomial-time lower bound for any of them. In theoretical computer science the P ≠ NP question has been one of the deepest, most perplexing open research problems. It came into existence in 1971. Many scientists believe that the NP-complete problems cannot be solved in polynomial time, i.e. that they are intractable, because if any single NP-complete problem can be solved in polynomial time, then we can solve every NP-complete problem in polynomial time.

If you want to design good algorithms you should understand the rudiments of the theory of NP-completeness. If the intractability of a problem can be established, as an engineer you would then do better spending your time developing an approximation algorithm rather than searching for a fast algorithm that solves the problem exactly. Some problems that seem no harder than sorting, graph searching or network flow are in fact NP-complete. Hence we should become familiar with this important class of problems.
Student Activity 4.1
Before going to next section, answer the following questions:
1. What are abstract problems?
2. Give the formal definition for the problem of finding the longest simple cycle in an undirected graph. Give a related decision problem. Give the language corresponding to the decision problem.
If your answers are correct, then proceed to next section.

Polynomial Time
We begin our study of NP-completeness by defining polynomial-time solvable problems. Generally these problems are regarded as tractable. Why this is so is a philosophical rather than a mathematical issue. We give three supporting arguments here. The first is that, although it is reasonable to regard a problem that requires time Θ(n^100) as intractable, there are very few practical problems that require time on the order of such a high-degree polynomial. The practical polynomial-time problems require much less time. Second, for many reasonable models of computation, a problem that can be solved in polynomial time in one model can be solved in polynomial time in another. And the third is that the class of polynomial-time solvable problems has closure properties. For example, if we feed the output of one polynomial-time algorithm into the input of another, the resultant algorithm is polynomial. If a polynomial-time algorithm makes a constant number of calls to polynomial-time subroutines, the running time of the composite algorithm is polynomial.

To make clear the class of polynomial-time solvable problems, we first define what a problem is. We define an abstract problem Q to be a binary relation on a set I of problem instances and a set S of problem solutions. For example, remember the problem SHORTEST PATH of finding a shortest path between two given vertices in an unweighted undirected graph G = (V, E). We define an instance for SHORTEST PATH as a triple consisting of a graph and two vertices. A solution is defined as a sequence of vertices in the graph (with the empty sequence denoting that no path exists). The problem SHORTEST PATH itself is the relation that associates each instance of a graph and two vertices with a shortest path in the graph that joins the two vertices. We can have more than one solution because shortest paths are not necessarily unique.

This formulation of an abstract problem is sufficient for our purposes. To keep things simple, the theory of NP-completeness restricts attention to decision problems: those having a yes/no solution. In this case we can view an abstract decision problem as a function that maps the instance set I to the solution set {0, 1}. We can observe it by an example: a decision problem PATH related to the shortest path problem is "Given a graph G = (V, E), two vertices p, q ∈ V and a positive integer k, does a path exist in G between p and q whose length is at most k?" If i = (G, p, q, k) is an instance of this shortest path problem, then PATH(i) = 1 (yes) if a shortest path from p to q has length at most k; otherwise PATH(i) = 0 (no).

Certain other abstract problems, called optimization problems, require some value to be minimized or maximized, and these are not decision problems. But if we want to apply the theory of NP-completeness to optimization problems, we must recast them as decision problems. Typically, an optimization problem can be recast by imposing a bound on the value to be optimized. For example, in
recasting the shortest path problem as the decision problem PATH, we added a bound k to the problem instance.

The requirement to recast optimization problems as decision problems does not diminish the impact of the theory. Generally, if we are able to solve an optimization problem quickly, we will be able to solve its related decision problem quickly as well. We simply compare the value obtained from the solution of the optimization problem with the bound provided as input to the decision problem. If an optimization problem is easy, therefore, its related decision problem is easy as well. Stated in a way that has more relevance to NP-completeness, if we can provide evidence that a decision problem is hard, we also provide evidence that its related optimization problem is hard. Thus, even though it restricts attention to decision problems, the theory of NP-completeness applies much more widely.

If we want to make a computer program that can solve an abstract problem, we have to represent instances in a way that the program understands. An encoding of a set S of abstract objects is a mapping e from S to the set of binary strings. For example, the encoding of the natural numbers N = {0, 1, 2, 3, 4, ...} is as the strings {0, 1, 10, 11, 100, ...}. Hence by this encoding, e(17) = 10001. Anyone who has looked at computer representations of keyboard characters is familiar with either the ASCII or EBCDIC codes. In the ASCII code, e(A) = 1000001. Even a compound object can be encoded as a binary string by combining the representations of its constituent parts. Polygons, graphs, functions, ordered pairs and programs can all be encoded as binary strings.

Hence a computer algorithm that solves some abstract decision problem will take an encoding of a problem instance as input. A concrete problem is a problem whose instance set is the set of binary strings. An algorithm solves a concrete problem in time O(T(n)) if, when it is provided a problem instance i of length n = |i|, the algorithm can produce the solution in at most O(T(n)) time. We can say that a concrete problem is polynomial-time solvable, therefore, if there exists an algorithm to solve it in time O(n^k) for some constant k. We will now define the complexity class P as the set of concrete decision problems that can be solved in polynomial time.

Encodings can be used to map abstract problems to concrete problems. Given an abstract decision problem Q mapping an instance set I to {0, 1}, an encoding e : I → {0, 1}* can be used to induce a related concrete decision problem, which we denote by e(Q). If the solution to an abstract problem instance i ∈ I is Q(i) ∈ {0, 1}, then the solution to the concrete problem instance e(i) ∈ {0, 1}* is also Q(i). There may be some binary strings that represent no meaningful abstract problem instance. For convenience, we shall assume that any such string is mapped arbitrarily to 0. Thus the concrete problem produces the same solution as the abstract problem on binary-string instances that represent the encodings of abstract problem instances.

Now we generalize the definition of polynomial-time solvability from concrete problems to abstract problems using encodings as the bridge, but we keep the definition independent of any particular encoding. We want to say that the efficiency of solving a problem will not depend on how the problem is encoded. Unfortunately, it depends quite heavily. For example, suppose that an integer k is to be provided as the sole input to an algorithm and suppose that the running time of the algorithm is Θ(k). If the integer k is provided in unary, as a string of k 1's, then the running time of the algorithm is O(n) on length-n inputs, which is polynomial time. If we use the more natural binary representation of the integer k, however, then the input length is n = ⌊lg k⌋ + 1. In this case, the running time of the algorithm is Θ(k) = Θ(2^n), which is exponential in the size of the input. Thus, depending on the encoding, the algorithm runs in either polynomial or superpolynomial time.

To understand polynomial time, therefore, the encoding of an abstract problem is important. We cannot really talk about solving an abstract problem without first specifying an encoding. Practically, however, if we rule out
expensive encodings such as unary ones, the actual encoding of a problem makes little difference to whether the problem can be solved in polynomial time. For example, representing integers in base 6 instead of binary has no effect on whether a problem is solvable in polynomial time, since an integer represented in base 6 can be converted to an integer represented in base 2 in polynomial time.

A function f : {0, 1}* → {0, 1}* is polynomial-time computable if we can find a polynomial-time algorithm A that, given any input x ∈ {0, 1}*, produces as output f(x). For some set I of problem instances, two encodings e1 and e2 are polynomially related if there exist two polynomial-time computable functions f12 and f21 such that for any i ∈ I, we have f12(e1(i)) = e2(i) and f21(e2(i)) = e1(i). That is, the encoding e2(i) can be computed from the encoding e1(i) by a polynomial-time algorithm, and vice versa.

Let S be an abstract decision problem on an instance set I, and let e1 and e2 be polynomially related encodings on I. Then e1(S) is in P if and only if e2(S) is in P.

We focus on decision problems because they make it easy to use the machinery of formal-language theory. Now we define some terminology. An alphabet Σ is a finite set of symbols. We define a language L over Σ as any set of strings that can be made from symbols of Σ. For example, if Σ = {a, b}, the set L = {aa, ab, abb, baba, ...} is a language. The empty string is denoted by ε, and the empty language is denoted by ∅. The language of all strings over Σ is denoted by Σ*. For example, if Σ = {a, b}, then Σ* = {ε, a, b, aa, ab, ba, bb, ...} is the set of all strings of a's and b's. Hence it is clear that every language L over Σ is a subset of Σ*.

We can perform a variety of operations on languages. Set-theoretic operations, such as union and intersection, follow directly from the set definitions. The complement of L is defined as Σ* − L. The concatenation of two languages L1 and L2 is defined as
L = {xy : x ∈ L1 and y ∈ L2}
The closure or Kleene star of a language L is
L* = {ε} ∪ L ∪ L^2 ∪ L^3 ∪ ...
where L^k is the language obtained by concatenating L to itself k times.

The set of instances for any decision problem Q is simply the set Σ*, where Σ = {0, 1}. Since Q is completely characterized by those problem instances that produce a 1 (yes) answer, we can think of Q as a language L over Σ = {0, 1}, where
L = {x ∈ {0, 1}* : Q(x) = 1}
For example, the decision problem PATH has the corresponding language
PATH = {⟨G, u, v, k⟩ : G = (V, E) is an undirected graph, u, v ∈ V, k ≥ 0 is an integer, and there exists a path from u to v in G whose length is at most k}.

By using the formal-language framework we are able to express concisely the relation between decision problems and the algorithms that solve them. An algorithm A is said to accept a string x ∈ {0, 1}* if, given input x, the output of the algorithm is A(x) = 1. The language accepted by algorithm A is the set L = {x ∈ {0, 1}* : A(x) = 1}, that is, the set of strings that the algorithm accepts. An algorithm A rejects a string x if A(x) = 0.
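The dependence of running time on the encoding, discussed earlier in this section, can be made concrete with a small sketch. The helper names `encode_unary` and `encode_binary` are hypothetical, introduced only for this illustration; the algorithm whose cost is "k steps" is likewise an assumed stand-in for any Θ(k)-time algorithm.

```python
def encode_unary(k: int) -> str:
    """Unary encoding: a string of k 1's (input length n = k)."""
    return "1" * k


def encode_binary(k: int) -> str:
    """Standard binary encoding (input length n = floor(lg k) + 1 for k >= 1)."""
    return bin(k)[2:]


# For an assumed algorithm that takes k steps on input k, compare the step
# count with the input length n under each encoding.
for k in (17, 1024, 10**6):
    n_unary = len(encode_unary(k))
    n_binary = len(encode_binary(k))
    # Unary: steps == n (linear). Binary: steps >= 2**(n - 1) (exponential in n).
    print(k, n_unary, n_binary, k >= 2 ** (n_binary - 1))
```

With the unary encoding the k steps are linear in the input length, while with the binary encoding the same k steps are at least 2^(n-1), which is why unary is ruled out as an "expensive" encoding.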
Even if language L is accepted by an algorithm A, the algorithm will not necessarily reject a string x ∉ L that is provided as input to it; for example, the algorithm may loop forever. A language L is decided by an algorithm A if every binary string in L is accepted by A and every binary string not in L is rejected by A. A language L is accepted in polynomial time by an algorithm A if for any length-n string x ∈ L the algorithm accepts x in time O(n^k) for some constant k. A language L is decided in polynomial time by an algorithm A if for any length-n string x ∈ {0, 1}*, the algorithm correctly decides whether x ∈ L in time O(n^k) for some constant k. Thus, to accept a language, an algorithm need only worry about strings in L, but to decide a language, it must accept or reject every string in {0, 1}*.

As an example, the language PATH can be accepted in polynomial time. One such polynomial-time algorithm uses breadth-first search to compute the shortest path from u to v in G and then compares the distance obtained with k. If the distance is at most k, the algorithm outputs 1 and halts. Otherwise, the algorithm runs forever. This algorithm does not decide PATH, however, since it does not explicitly output 0 for instances in which the shortest path has length greater than k. A decision algorithm for PATH should explicitly reject binary strings that do not belong to PATH. For a decision problem such as PATH, such an algorithm is not difficult to design.

A complexity class is defined as a set of languages, membership in which is determined by a complexity measure, such as running time, on an algorithm that determines whether a given string x belongs to language L. With the help of this language-theoretic framework, we can provide an alternative definition of the complexity class P:
P = {L ⊆ {0, 1}* : there exists an algorithm A that decides L in polynomial time}.
P is also the class of languages that can be accepted in polynomial time:
P = {L : L is accepted by a polynomial-time algorithm}.
To see this, note that the class of languages decided by polynomial-time algorithms is a subset of the class of languages accepted by polynomial-time algorithms, so we need only show that if L is accepted by a polynomial-time algorithm, it is decided by a polynomial-time algorithm. Let L be the language accepted by some polynomial-time algorithm B. Because B accepts L in time O(n^k) for some constant k, there also exists a constant c such that B accepts L in at most T = cn^k steps. For any input string x, the algorithm B' simulates the action of B for time T. At the end of time T, algorithm B' inspects the behavior of B. If B has accepted x, then B' accepts x by outputting a 1. If B has not accepted x, then B' rejects x by outputting a 0. The overhead of B' simulating B does not increase the running time by more than a polynomial factor, and thus B' is a polynomial-time algorithm that decides L.

Student Activity 4.2
Before going to next section, answer the following questions:
1. What are abstract problems?
2. Describe a formal language.
If your answers are correct, then proceed to next section.
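The breadth-first-search algorithm for PATH described in this section can be sketched as below. This is a minimal illustration under stated assumptions: the graph is given as an adjacency dictionary rather than as an encoded binary string, and the names `shortest_distance` and `decide_path` are hypothetical. Unlike the accepting algorithm in the text, this version explicitly outputs 0, so it decides PATH rather than merely accepting it.

```python
from collections import deque


def shortest_distance(adj, u, v):
    """Breadth-first search distance from u to v; None if v is unreachable."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if x == v:
            return dist[x]
        for y in adj.get(x, ()):
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return None


def decide_path(adj, u, v, k):
    """Return 1 if some path from u to v has length at most k, else 0."""
    d = shortest_distance(adj, u, v)
    return 1 if d is not None and d <= k else 0


# A path graph a - b - c - d plus an isolated vertex e.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"], "e": []}
print(decide_path(graph, "a", "d", 3))  # 1
print(decide_path(graph, "a", "d", 2))  # 0
```

Breadth-first search visits each vertex and edge at most once, so the decider runs in time linear in the size of the graph, which is polynomial in the length of any reasonable encoding of the instance.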
NP-Completeness and Reducibility
The reason that theoretical computer scientists believe that P ≠ NP is the existence of the class of "NP-complete" problems. This class has the interesting property that if any one NP-complete problem can be solved in polynomial time, then every problem in NP has a polynomial-time solution, that is, P = NP. No polynomial-time algorithm has ever been discovered for any NP-complete problem, despite decades of study. The language HAM-CYCLE is one NP-complete problem. If we could decide HAM-CYCLE in polynomial time, then we could solve every problem in NP in polynomial time. In fact, if NP − P should turn out to be nonempty, we could say with certainty that HAM-CYCLE ∈ NP − P.

The NP-complete languages are the "hardest" languages in NP. We shall show how to compare the relative "hardness" of languages using "polynomial-time reducibility." First, we formally define the NP-complete languages, and then we sketch a proof that one such language, called CIRCUIT-SAT, is NP-complete. We use the notion of reducibility to show that many other problems are NP-complete.

A problem P can be reduced to another problem P' if any instance of P can be "easily rephrased" as an instance of P', the solution to which provides a solution to the instance of P. For example, the problem of solving linear equations in an indeterminate x reduces to the problem of solving quadratic equations. Given an instance ax + b = 0, we can transform it to 0x^2 + ax + b = 0. Its solution provides a solution to ax + b = 0. Thus, if a problem P reduces to another problem P', then we say that P is "no harder to solve" than P'.

Now we return to our formal-language framework for decision problems. A language L1 is said to be polynomial-time reducible to a language L2, written L1 ≤P L2, if there exists a polynomial-time computable function f : {0, 1}* → {0, 1}* such that for all x ∈ {0, 1}*,
x ∈ L1 if and only if f(x) ∈ L2. ..........(1)
The function f is called the reduction function, and a polynomial-time algorithm F that computes f is known as a reduction algorithm.

The idea of a polynomial-time reduction from a language L1 to another language L2 is given in Figure 4.2. Each language is a subset of {0, 1}*.

[Figure 4.2: a polynomial-time reduction from a language L1 to a language L2 by a reduction function f.]

The reduction function f gives a polynomial-time mapping such that if
x ∈ L1 then f(x) ∈ L2; also, if x ∉ L1, then f(x) ∉ L2. Thus, the reduction function maps any instance x of the decision problem represented by the language L1 to an instance f(x) of the problem represented by L2. Providing an answer to whether f(x) ∈ L2 directly provides the answer to whether x ∈ L1.

Polynomial-time reductions give us a powerful tool for proving that various languages belong to P.

Lemma: If L1, L2 ⊆ {0, 1}* are languages such that L1 ≤P L2, then L2 ∈ P implies L1 ∈ P.
Proof: Let L2 be decided by a polynomial-time algorithm B2 and let F be a polynomial-time reduction algorithm that computes the reduction function f. We shall construct a polynomial-time algorithm B1 that decides L1. The construction of B1 is given in Figure 4.3. For a given input x ∈ {0, 1}*, the algorithm B1 uses F to transform x into f(x), and then it uses B2 to test whether f(x) ∈ L2. The output of B2 is the value provided as the output from B1. The algorithm runs in polynomial time since both F and B2 run in polynomial time.

[Figure 4.3: the construction of algorithm B1, which tests whether x ∈ L1 by using F to compute f(x) and B2 to test whether f(x) ∈ L2.]

Student Activity 4.3
Before going to next section, answer the following questions:
1. What do you mean by NP-completeness?
2. What is reducibility?
If your answers are correct, then proceed to next section.

NP-Completeness
Polynomial-time reductions give a formal means for showing that one problem is at least as hard as another, to within a polynomial-time factor. Hence if L1 ≤P L2, then L1 is not more than a polynomial-time factor harder than L2, which is why the "less than or equal to" notation for reduction is mnemonic. Now we define the set of NP-complete languages, which are the hardest problems in NP.

A language L ⊆ {0, 1}* is NP-complete if
1. L ∈ NP, and
2. L1 ≤P L for every L1 ∈ NP.
If a language L satisfies property 2, but not necessarily property 1, we say that L is NP-hard. We also define NPC to be the class of NP-complete languages.

As the following theorem shows, NP-completeness is at the crux of deciding whether P is in fact equal to NP.

Theorem: If any NP-complete problem is polynomial-time solvable, then P = NP. If any problem in NP is not polynomial-time solvable, then all NP-complete problems are not polynomial-time solvable.
Proof: Suppose that L ∈ P and also that L ∈ NPC. For any L1 ∈ NP, we have L1 ≤P L by property 2 of the definition of NP-completeness. Thus, by the Lemma, we also have that L1 ∈ P, hence the first statement of the theorem is proved. Now we can prove the second statement. Let there exist an L ∈ NP such that L ∉ P. Let L1 ∈ NPC be any NP-complete language and, for the purpose of contradiction, assume that L1 ∈ P. But then, by property 2, we have L ≤P L1, and thus by the Lemma L ∈ P, which contradicts the choice of L.

This is why research into the P ≠ NP question centers around the NP-complete problems. Most computer scientists think that P ≠ NP, which leads to the following relationship among P, NP, and NPC: both P and NPC are wholly contained within NP, and P ∩ NPC = ∅. But for all we know, someone may come up with a polynomial-time algorithm for an NP-complete problem, thus proving that P = NP. Nevertheless, since no polynomial-time algorithm for any NP-complete problem has yet been discovered, a proof that a problem is NP-complete provides excellent evidence for its intractability.

Up to this point we have not actually proved that any problem is NP-complete, though we have defined NP-complete problems. If we prove that at least one problem is NP-complete, polynomial-time reducibility can be used as a tool to prove the NP-completeness of other problems. So we will focus on showing the existence of an NP-complete problem: the circuit-satisfiability problem.
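The lemma on reductions (L1 ≤P L2 and L2 ∈ P imply L1 ∈ P) builds the decider B1 simply by running F and then B2. A minimal sketch of that composition follows. The two languages are toy assumptions chosen only for illustration and do not come from the text: L2 is the set of binary strings with an even number of 1's, L1 is the set with an odd number of 1's, and the reduction appends a single 1.

```python
def decide_even_ones(s: str) -> int:
    """B2: decides the toy language L2 = strings with an even number of 1's."""
    return 1 if s.count("1") % 2 == 0 else 0


def reduce_odd_to_even(x: str) -> str:
    """F: computes f(x) = x + '1', so x in L1 if and only if f(x) in L2."""
    return x + "1"


def decide_odd_ones(x: str) -> int:
    """B1: decides L1 by transforming x with F and asking B2 about f(x)."""
    return decide_even_ones(reduce_odd_to_even(x))


print(decide_odd_ones("111"))  # 1
print(decide_odd_ones("11"))   # 0
```

Both F and B2 run in time linear in the length of their input, so B1 does too; in general the composition of a polynomial-time reduction with a polynomial-time decider stays polynomial.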
[Figure 4.4: two boolean combinational circuits, (a) and (b), each with three inputs x1, x2, x3 and one output.]

We shall informally describe a proof that relies on a basic understanding of boolean combinational circuits. Two boolean combinational circuits are shown in Figure 4.4. Each circuit has three inputs and one output. A truth assignment means a set of boolean input values for the circuit. We say that a one-output boolean combinational circuit is satisfiable if it has a satisfying assignment: a truth assignment that causes the output of the circuit to be 1. For example, the circuit in Figure 4.4(a) has the satisfying assignment x1 = 1, x2 = 1, x3 = 0, and so it is satisfiable. No assignment of values to x1, x2, and x3 causes the circuit in Figure 4.4(b) to produce a 1 output; it always produces 0, and so it is unsatisfiable.

Now we state the circuit-satisfiability problem as, "Given a boolean combinational circuit composed of AND, OR, and NOT gates, is it satisfiable?" In order to pose this question formally, however, we must agree on a standard encoding for circuits. We can make a graph-like encoding that maps any given circuit C into a binary string ⟨C⟩ whose length is not much larger than the size of the circuit itself. As a formal language, we can therefore define
CIRCUIT-SAT = {⟨C⟩ : C is a satisfiable boolean combinational circuit}

The circuit-satisfiability problem is of great importance in the area of computer-aided hardware optimization. If a circuit always produces 0, it can be replaced by a simpler circuit that omits all logic gates and provides the constant 0 value as its output. If we could design a polynomial-time algorithm for the problem, it would have considerable practical application.

Suppose we are given a circuit C. We might attempt to determine whether it is satisfiable by simply checking all possible assignments to the inputs. But if there are k inputs, there are 2^k possible assignments. When the size of C is polynomial in k, checking each one leads to a superpolynomial-time algorithm. In fact, as has been claimed, there is strong evidence that no polynomial-time algorithm exists that solves the
circuit-satisfiability problem, because circuit satisfiability is NP-complete. We break the proof of this fact into two parts, based on the two parts of the definition of NP-completeness.

Lemma: The circuit-satisfiability problem is in the class NP.
Proof: We can give a two-input, polynomial-time algorithm B that can verify CIRCUIT-SAT. One of the inputs to B is a boolean combinational circuit C. The other input is a certificate corresponding to an assignment of boolean values to the wires in C. The algorithm B can be designed as follows. For each logic gate in the circuit, it checks that the value provided by the certificate on the output wire is correctly computed as a function of the values on the input wires. Now if the output of the entire circuit is 1, the algorithm outputs 1, since the values assigned to the inputs of C provide a satisfying assignment. Otherwise, B outputs 0. Every time a satisfiable circuit C is input to algorithm B, there is a certificate whose length is polynomial in the size of C and that causes B to output a 1. Whenever an unsatisfiable circuit is input, no certificate can fool B into believing that the circuit is satisfiable. Algorithm B runs in polynomial time: with a good implementation, linear time suffices. Thus CIRCUIT-SAT can be verified in polynomial time, and CIRCUIT-SAT ∈ NP.

Now we shall show that the language is NP-hard, to prove that CIRCUIT-SAT is NP-complete. Hence we have to show that every language in NP is polynomial-time reducible to CIRCUIT-SAT. The actual proof of this fact is full of technical intricacies, and so we shall settle for a sketch of the proof based on some understanding of the working of computer hardware.

As we know, the memory stores a computer program as a sequence of instructions. A typical instruction encodes an operation to be performed, addresses of operands in memory, and an address where the result is to be stored. A special memory location, the program counter, keeps track of which instruction is to be executed next. The program counter is automatically incremented whenever an instruction is fetched, thereby causing the computer to execute instructions sequentially. The execution of an instruction can cause a value to be written to the program counter, however, and then the normal sequential execution can be altered, allowing the computer to loop and perform conditional branches.

At any time in the execution of a program, the entire state of the computation is represented in the computer's memory. A configuration is any particular state of computer memory. The execution of an instruction means the mapping of one configuration to another. Importantly, the computer hardware that accomplishes this mapping can be implemented as a boolean combinational circuit, which we denote by M in the proof of the following lemma.

Lemma: The circuit-satisfiability problem is NP-hard.
Proof: Suppose L is any language in NP. Now we give a polynomial-time algorithm F that can compute a reduction function f that maps every binary string x to a circuit C = f(x) such that x ∈ L if and only if C ∈ CIRCUIT-SAT. Since L ∈ NP, we must have an algorithm A that verifies L in polynomial time. The algorithm F that we shall construct will use the two-input algorithm A to compute the reduction function f.

Let T(n) denote the worst-case running time of algorithm A on length-n input strings, and let k ≥ 1 be a constant such that T(n) = O(n^k) and the length of the certificate is O(n^k). (The running time of A is actually a polynomial in the total input size, which includes both an input string and a certificate, but since the length of the certificate is polynomial in the length n of the input string, the running time is polynomial in n.)

The basic idea of the proof is to represent the computation of A as a sequence of configurations. As shown in Figure 4.5, each configuration can be broken into parts consisting of the program for A, the program counter and auxiliary machine state, the input x, the certificate y, and working storage. Starting with an initial configuration c0, each configuration ci is mapped to a subsequent configuration ci+1 by the combinational circuit M implementing the computer hardware.

[Figure 4.5: the sequence of configurations c0, c1, c2, ..., cT(n) produced by algorithm A running on an input x and certificate y.]
The output of the algorithm A, 0 or 1, is written to some designated location in the working storage when A finishes executing, and if we assume that thereafter A halts, the value never changes. Thus if the algorithm runs for at most T(n) steps, the output appears as one of the bits in cT(n).

The reduction algorithm F constructs a single combinational circuit that computes all configurations produced by a given initial configuration. That is, we can paste together T(n) copies of the circuit M. The output of the ith circuit, which produces configuration ci, is fed directly into the input of the (i + 1)st circuit. Thus the configurations, rather than ending up in a state register, simply reside as values on the wires connecting copies of M.

Recall what a polynomial-time reduction algorithm F must do. Given an input x, it must compute a circuit C = f(x) that is satisfiable if and only if there exists a certificate y such that A(x, y) = 1. When F obtains an input x, it first computes n = |x| and constructs a combinational circuit C' consisting of T(n) copies of M. The input to C' is an initial configuration corresponding to a computation on A(x, y), and the output is the configuration cT(n).

The circuit C = f(x) that F computes is obtained by making a few changes in C'. First, the inputs to C' corresponding to the program for A, the initial program counter, the input x, and the initial state of memory are wired directly to these known values. Thus the only remaining inputs to the circuit correspond to the certificate y. Second, all outputs of the circuit should be ignored, except the one bit of cT(n) corresponding to the output of A. This circuit C, so constructed, computes C(y) = A(x, y) for any input y of length O(n^k). The reduction algorithm F, when provided an input string x, computes such a circuit C and outputs it.

Now we have two properties to be proved. First, we must show that F correctly computes a reduction function f. That is, we have to show that C is satisfiable if and only if there exists a certificate y such that A(x, y) = 1. Second, we must show that F runs in polynomial time.

In order to prove that F correctly computes a reduction function, let us assume that there exists a certificate y of length O(n^k) such that A(x, y) = 1. Then, if we apply the bits of y to the inputs of C, the output of C is C(y) = A(x, y) = 1. Thus if a certificate exists, then C is satisfiable. For the other direction, suppose that C is satisfiable. Hence there exists an input y to C such that C(y) = 1, from which we conclude that A(x, y) = 1. Thus, F correctly computes a reduction function.

Now we are going to complete the proof; for this we need only show that F runs in time polynomial in n = |x|. The first observation we make is that the number of bits needed to represent a configuration is polynomial in n. The program for A itself has constant size, independent of the length of its input x. The length of the input x is n, and the length of the certificate y is O(n^k). Since the algorithm runs for at most O(n^k) steps, the amount of working storage required by A is polynomial in n as well. (We assume that this memory is contiguous.) The combinational circuit M which implements the computer hardware has size polynomial in the length of a configuration, which is polynomial in O(n^k) and hence is polynomial in n. (Most of this circuitry implements the logic of the memory system.) The circuit C consists of at most t = O(n^k) copies of M, and hence it has size polynomial in n. The construction of C from x can be accomplished in polynomial time by the reduction algorithm F, since each step of the construction takes polynomial time.
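The two ways of handling a circuit discussed in this unit, verifying a certificate gate by gate (the algorithm B of the lemma showing CIRCUIT-SAT ∈ NP) and trying all 2^k input assignments, can be contrasted in a short sketch. Everything concrete here is an assumption made for illustration: the tuple representation of a circuit, the function names, and the two example circuits, which are not the circuits of Figure 4.4.

```python
from itertools import product

# Gate functions acting on a list of 0/1 input values.
GATE_FUNCS = {
    "AND": lambda vals: int(all(vals)),
    "OR": lambda vals: int(any(vals)),
    "NOT": lambda vals: 1 - vals[0],
}

# A circuit is (input wires, gates, output wire); each gate is
# (output wire, operation, input wires), listed in topological order.
SAT_CIRCUIT = (("x1", "x2", "x3"),
               [("n3", "NOT", ("x3",)),
                ("a", "AND", ("x1", "x2")),
                ("out", "AND", ("a", "n3"))],
               "out")
UNSAT_CIRCUIT = (("x1",),
                 [("n1", "NOT", ("x1",)),
                  ("out", "AND", ("x1", "n1"))],
                 "out")


def verify(circuit, certificate):
    """Verifier: the certificate assigns a value to every wire.

    Checks each gate once, so the time is linear in the circuit size.
    """
    inputs, gates, output = circuit
    wires = list(inputs) + [out for out, _, _ in gates]
    if any(certificate.get(w) not in (0, 1) for w in wires):
        return 0
    for out, op, ins in gates:
        if certificate[out] != GATE_FUNCS[op]([certificate[w] for w in ins]):
            return 0
    return 1 if certificate[output] == 1 else 0


def evaluate(circuit, assignment):
    """Compute the circuit output for a given assignment to the inputs."""
    _, gates, output = circuit
    values = dict(assignment)
    for out, op, ins in gates:
        values[out] = GATE_FUNCS[op]([values[w] for w in ins])
    return values[output]


def brute_force_satisfy(circuit):
    """Try all 2**k input assignments; return a satisfying one or None."""
    inputs = circuit[0]
    for bits in product((0, 1), repeat=len(inputs)):
        assignment = dict(zip(inputs, bits))
        if evaluate(circuit, assignment) == 1:
            return assignment
    return None


print(brute_force_satisfy(SAT_CIRCUIT))    # {'x1': 1, 'x2': 1, 'x3': 0}
print(brute_force_satisfy(UNSAT_CIRCUIT))  # None
```

The verifier touches each gate once, whereas the brute-force search may evaluate the circuit 2^k times; that gap between checking a proposed solution and finding one is what membership in NP captures.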
The language CIRCUIT-SAT is therefore at least as hard as any language in NP, and since it belongs to NP, it is NP-complete.
The circuit-satisfiability problem is NP-complete.
Proof Immediate from the lemmas given before and the definition of NP-completeness.
Student Activity 4.4
Before going to next section, answer the following questions:
1. Prove that circuit satisfiability problem belongs to NP class.
2. Prove that circuit satisfiability problem is NP-complete.
If your answers are correct, then proceed to next section.
Top
The NP-completeness of the circuit-satisfiability problem depends on a direct proof that L ≤P CIRCUIT-SAT for every language L ∈ NP. Here, we shall show how to prove that languages are NP-complete without directly reducing every language in NP to the given language.
The following lemma provides a base for showing that a language is NP-complete.
If L is a language such that L' ≤P L for some L' ∈ NPC, then L is NP-hard. Also, if L ∈ NP then L ∈ NPC.
Proof: Since L' is NP-complete, for all L'' ∈ NP we have L'' ≤P L'. By supposition, L' ≤P L, and thus by transitivity, we have L'' ≤P L, which shows that L is NP-hard. If L ∈ NP, we also have L ∈ NPC.
We can say that, by reducing a known NP-complete language L' to L, we implicitly reduce every language in NP to L. Thus the lemma gives us a method for proving that a language L is NP-complete:
1. Prove L ∈ NP.
2. Select a known NP-complete language L'.
3. Describe an algorithm that computes a function f mapping every instance of L' to an instance of L.
4. Prove that the function f satisfies x ∈ L' if and only if f(x) ∈ L for all x ∈ {0,1}*.
5. Prove that the algorithm computing f runs in polynomial time.
The resulting expression is
φ' = y1 ∧ (y1 ↔ (y2 ∧ ¬x2)) ∧ (y2 ↔ (y3 ∨ y4)) ∧ (y3 ↔ (x1 → x2)) ∧ (y4 ↔ ¬y5) ∧ (y5 ↔ (y6 ∨ x4)) ∧ (y6 ↔ (¬x1 ↔ x3)).
It should be noted that the formula φ' thus obtained is a conjunction of clauses φ'i, each of which has at most 3 literals. The only additional requirement is that each clause be an OR of literals.
Now we convert each clause φ'i into conjunctive normal form. We construct a truth table for φ'i by evaluating all possible assignments to its variables. Each row of the truth table consists of a possible assignment of the variables of the clause, together with the value of the clause under that assignment. Using the truth-table entries that evaluate to 0, we build a formula in disjunctive normal form (or DNF), an OR of ANDs, that is equivalent to ¬φ'i. We then convert this formula into a CNF formula φ''i by using DeMorgan's laws to complement all literals and change ORs into ANDs and ANDs into ORs.
Here we convert the clause φ'1 = (y1 ↔ (y2 ∧ ¬x2)) into CNF as follows. The truth table for φ'1 is given above. The DNF formula equivalent to ¬φ'1 is
(y1 ∧ y2 ∧ x2) ∨ (y1 ∧ ¬y2 ∧ x2) ∨ (y1 ∧ ¬y2 ∧ ¬x2) ∨ (¬y1 ∧ y2 ∧ ¬x2).
By using DeMorgan's laws we get the CNF formula
φ''1 = (¬y1 ∨ ¬y2 ∨ ¬x2) ∧ (¬y1 ∨ y2 ∨ ¬x2) ∧ (¬y1 ∨ y2 ∨ x2) ∧ (y1 ∨ ¬y2 ∨ x2)
which is equivalent to the original clause φ'1.
Each clause φ'i of the formula φ' has now been converted into a CNF formula φ''i, and thus φ' is equivalent to the CNF formula φ'' consisting of the conjunction of the φ''i. Moreover each clause of φ'' has at most 3 literals.
The third step of the reduction further transforms the formula so that each clause has exactly 3 distinct literals. The final 3-CNF formula φ''' is constructed from the clauses of the CNF formula φ''. It also uses two auxiliary variables p and q. For each clause Ci of φ'' we include the following clauses in φ''':
• If Ci has 3 distinct literals then simply include Ci as a clause of φ'''.
• If Ci has 2 distinct literals, that is if Ci = (l1 ∨ l2), where l1 and l2 are literals, then include (l1 ∨ l2 ∨ p) ∧ (l1 ∨ l2 ∨ ¬p) as clauses of φ'''. The literals p and ¬p merely fulfill the syntactic requirement that there be exactly 3 distinct literals per clause: (l1 ∨ l2 ∨ p) ∧ (l1 ∨ l2 ∨ ¬p) is equivalent to (l1 ∨ l2) whether p = 0 or p = 1.
• If Ci has only 1 distinct literal l, then include (l ∨ p ∨ q) ∧ (l ∨ p ∨ ¬q) ∧ (l ∨ ¬p ∨ q) ∧ (l ∨ ¬p ∨ ¬q) as clauses of φ'''. Note that every setting of p and q causes the conjunction of these four clauses to evaluate to l.
We can see that the 3-CNF formula φ''' is satisfiable iff φ is satisfiable by inspecting each of the three steps. Like the reduction from CIRCUIT-SAT to SAT, the construction of φ' from φ in the first step retains satisfiability. The second step produces a CNF formula φ'' which is equivalent to φ'. The third step produces a 3-CNF formula φ''' that is effectively equivalent to φ'', because any assignment to the variables p and q produces a formula that is algebraically equivalent to φ''.
We have to show that the reduction can be computed in polynomial time. In constructing φ' from φ we have to introduce at most 1 variable and 1 clause per connective in φ. Constructing φ'' from φ' can introduce at most 8 clauses in φ'' for each clause from φ', since each clause of φ' has at most 3 variables, and the truth table has at most 2^3 = 8 rows. Similarly the construction of φ''' from φ'' introduces at most 4 clauses into φ''' for each clause of φ''. Hence the size of φ''' is polynomial in the length of the original formula, and each of the constructions can easily be accomplished in polynomial time.
Top
NP-complete problems arise in the domains: boolean logic, graphs, arithmetic, network design, sets and partitions, storage and retrieval, sequencing and scheduling, mathematical programming, algebra and number theory, games and puzzles, automata and language theory, program optimization etc. Here we use the reduction methodology to provide NP-completeness proofs for problems related to graph theory and set partitioning.
A clique in an undirected graph G = (V,E) is a subset V' ⊆ V of vertices, each pair of which is connected by an edge in E. We can say that a clique is a complete subgraph of G. The size of a clique is defined as the number of vertices it contains. As a decision problem we ask whether a clique of a given size k exists in the graph. Hence the formal language is: CLIQUE = {⟨G,k⟩ : G is a graph with a clique of size k}.
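Returning briefly to the third step of the 3-CNF construction described above, the padding rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a literal is modelled as a (variable name, is_positive) pair, and the names pad_clause and evaluate, as well as the auxiliary variable names "p" and "q", are hypothetical choices that must not collide with variables of the formula.

```python
def pad_clause(clause, p="p", q="q"):
    """Pad a clause with 1 to 3 distinct literals to clauses of exactly 3 literals.

    A literal is a (variable_name, is_positive) pair (an assumed encoding).
    """
    lits = list(dict.fromkeys(clause))  # drop repeated literals, keep order
    P, NOT_P = (p, True), (p, False)
    Q, NOT_Q = (q, True), (q, False)
    if len(lits) == 3:
        return [tuple(lits)]
    if len(lits) == 2:
        # (l1 v l2 v p) ^ (l1 v l2 v ~p)
        return [tuple(lits + [P]), tuple(lits + [NOT_P])]
    if len(lits) == 1:
        l = lits[0]
        # the four clauses over p and q whose conjunction evaluates to l
        return [(l, P, Q), (l, P, NOT_Q), (l, NOT_P, Q), (l, NOT_P, NOT_Q)]
    raise ValueError("clause must have 1 to 3 distinct literals")


def evaluate(clauses, assignment):
    """Evaluate a conjunction of OR-clauses under a {variable: bool} assignment."""
    return all(any(assignment[v] == pos for v, pos in c) for c in clauses)
```

For a one-literal clause, evaluating the four padded clauses under every setting of p and q returns the value of the literal itself, which is the equivalence the text relies on.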
A naive algorithm for determining whether a graph G = (V,E) with |V| vertices has a clique of size k is to list all k-subsets of V, and check each one to see whether it forms a clique. The time complexity of this algorithm is Ω(k² · C(|V|, k)), where C(|V|, k) is the number of k-subsets of V, which is polynomial if k is a constant. Generally k could be proportional to |V|, in which case the algorithm runs in superpolynomial time. We can say that an efficient algorithm for the clique problem is unlikely to exist.
The clique problem is NP-complete.
Proof: We have to show that CLIQUE ∈ NP. For a given graph G = (V,E), we use the set V' ⊆ V of vertices in the clique as a certificate for G. Checking whether V' is a clique can be accomplished in polynomial time by checking whether, for every pair u, v ∈ V', the edge (u,v) belongs to E.
We next show that the clique problem is NP-hard by proving that 3-CNF-SAT ≤P CLIQUE. That we should be able to prove this result is somewhat surprising, since on the surface logical formulas seem to have little to do with graphs.
The reduction algorithm begins with an instance of 3-CNF-SAT. Let φ = C1 ∧ C2 ∧ ... ∧ Ck be a boolean formula in 3-CNF with k clauses. For r = 1, 2, ..., k each clause Cr has exactly three distinct literals l_1^r, l_2^r and l_3^r. Now we construct a graph G such that φ is satisfiable iff G has a clique of size k.
The graph G = (V,E) is constructed as follows. For each clause Cr = (l_1^r ∨ l_2^r ∨ l_3^r) in φ, we place a triple of vertices v_1^r, v_2^r, and v_3^r in V. We put an edge between two vertices v_i^r and v_j^s if both of the following hold:
• v_i^r and v_j^s are in different triples, that is, r ≠ s, and
• their corresponding literals are consistent, that is, l_i^r is not the negation of l_j^s.
[Figure 4.9: The graph G derived from a 3-CNF formula φ in the reduction from 3-CNF-SAT to CLIQUE.]
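The construction of G from φ, together with the naive k-subset search mentioned at the start of this discussion, can be sketched as follows. This is a minimal illustration, not the text's own algorithm listing: literals are assumed to be (variable, is_positive) pairs, vertices are (clause index, position) pairs, and formula_to_graph and has_clique are hypothetical helper names.

```python
from itertools import combinations


def formula_to_graph(clauses):
    """Build the clique-reduction graph for a formula given as a list of clauses.

    Each clause is a list of (variable, is_positive) literals (assumed encoding).
    Vertex (r, i) stands for the i-th literal of clause r.
    """
    vertices = [(r, i) for r, c in enumerate(clauses) for i in range(len(c))]
    edges = set()
    for (r, i), (s, j) in combinations(vertices, 2):
        var1, pos1 = clauses[r][i]
        var2, pos2 = clauses[s][j]
        consistent = not (var1 == var2 and pos1 != pos2)
        if r != s and consistent:  # different triples, consistent literals
            edges.add(frozenset([(r, i), (s, j)]))
    return vertices, edges


def has_clique(vertices, edges, k):
    """Naive search over all k-subsets; superpolynomial when k grows with |V|."""
    return any(
        all(frozenset(pair) in edges for pair in combinations(subset, 2))
        for subset in combinations(vertices, k)
    )


phi = [
    [("x1", True), ("x2", False), ("x3", False)],
    [("x1", False), ("x2", True), ("x3", True)],
    [("x1", True), ("x2", True), ("x3", True)],
]
V, E = formula_to_graph(phi)
print(has_clique(V, E, len(phi)))
```

The brute-force search is only there to make the sketch checkable on small inputs; the reduction itself is just formula_to_graph, which runs in time polynomial in the size of φ.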
The graph can easily be computed from φ in polynomial time. As an example of this construction, if we have φ = (x1 ∨ ¬x2 ∨ ¬x3) ∧ (¬x1 ∨ x2 ∨ x3) ∧ (x1 ∨ x2 ∨ x3) then G is the graph shown in Figure 4.9.
Now we show that this transformation of φ into G is a reduction. Assume that φ has a satisfying assignment. Then each clause Cr will have at least one literal l_i^r that is assigned 1, and each such literal corresponds to a vertex v_i^r. Thus, taking one such "true" literal from each clause yields a set V' of k vertices. We claim that V' is a clique. For any two vertices v_i^r, v_j^s ∈ V', where r ≠ s, both corresponding literals l_i^r and l_j^s are mapped to 1 by the given satisfying assignment, and thus the literals cannot be complements. So, by the construction of G, the edge (v_i^r, v_j^s) belongs to E.
Conversely, assume that G has a clique V' of size k. No edges in G connect vertices in the same triple, and so V' contains exactly one vertex per triple. We can assign 1 to each literal l_i^r such that v_i^r ∈ V' without assigning 1 to both a literal and its complement, since G contains no edges between inconsistent literals. Each clause is satisfied and so φ is satisfied. (Any variables that correspond to no vertex in the clique may be set arbitrarily.)
In the example of Figure 4.9 a satisfying assignment of φ is x1 = 0, x2 = 0, x3 = 1. A corresponding clique of size k = 3 consists of the vertices corresponding to ¬x2 from the first clause, x3 from the second clause, and x3 from the third clause.
We define a vertex cover of an undirected graph G = (V,E) as a subset V' ⊆ V such that if (u,v) ∈ E, then u ∈ V' or v ∈ V' (or both). That is, each vertex "covers" its incident edges, and a vertex cover for G is a set of vertices that covers all the edges in E. The size of a vertex cover is the number of vertices in it. For example the graph in Figure 4.9 has a vertex cover {w, z} of size 2.
The vertex-cover problem is to find a vertex cover of minimum size in a given graph. Restating it as a decision problem, we wish to determine whether a graph has a vertex cover of a given size k. In language we define
VERTEX-COVER = {⟨G,k⟩ : graph G has a vertex cover of size k}.
The vertex-cover problem is NP-complete.
Proof Initially we shall show that VERTEX-COVER ∈ NP. Assume there is a graph G = (V,E) and an integer k. The certificate we choose is the vertex cover V' ⊆ V itself. The verification algorithm checks that |V'| = k, and then it checks for each edge (u,v) ∈ E whether u ∈ V' or v ∈ V'. This verification can be done in polynomial time easily.
We prove that the vertex cover problem is NP-hard by showing that CLIQUE ≤P VERTEX-COVER. This reduction is based on the notion of the "complement" of a graph. Given an undirected graph G = (V,E), we define the complement of G as Ḡ = (V,Ē) where Ē = {(u,v) : u, v ∈ V, u ≠ v, and (u,v) ∉ E}. In other words Ḡ is the graph containing exactly those edges that are not in G. Figure 4.9 shows a graph and its complement and illustrates the reduction from CLIQUE to VERTEX-COVER.
The reduction algorithm takes as input an instance ⟨G,k⟩ of the clique problem. It computes the complement Ḡ, which is easily doable in polynomial time. The output of the reduction algorithm is the instance ⟨Ḡ, |V| − k⟩ of the vertex cover problem. To complete the proof, we show that this transformation is indeed a reduction: the graph G has a clique of size k if and only if the graph Ḡ has a vertex cover of size |V| − k.
Assume that G has a clique V' ⊆ V with |V'| = k. We claim that V − V' is a vertex cover in Ḡ. Let (u,v) be any edge in Ē. Then (u,v) ∉ E, which implies that at least one of u or v does not belong to V', since every pair of vertices in V' is connected by an edge of E. Equivalently, at least one of u or v is in V − V', which means that edge (u,v) is covered by V − V'. Since (u,v) was chosen arbitrarily from Ē, every edge of Ē is covered by a vertex in V − V'. Hence the set V − V', which has size |V| − k, forms a vertex cover for Ḡ.
Conversely, suppose that Ḡ has a vertex cover V' ⊆ V where |V'| = |V| − k. Then, for all u, v ∈ V, if (u,v) ∈ Ē, then u ∈ V' or v ∈ V' or both. The contrapositive of this implication is that for all u, v ∈ V, if u ∉ V' and v ∉ V' then (u,v) ∈ E. In other words, V − V' is a clique, and it has size |V| − |V'| = k.
In the subset-sum problem, we are given a finite set S ⊂ N and a target t ∈ N. We ask whether there is a subset S' ⊆ S whose elements sum to t. For example, if S = {1, 4, 16, 64, 256, 1040, 1041, 1093, 1285, 1344} and t = 3755, then the subset S' = {1, 16, 64, 256, 1040, 1093, 1285} is a solution.
As usual, in language we define:
SUBSET-SUM = {⟨S,t⟩ : there exists a subset S' ⊆ S such that t = Σ_{s∈S'} s}.
It is important that our standard encoding assumes that the input integers are coded in binary. With this assumption, we can show that the subset-sum problem is unlikely to have a fast algorithm.
The subset-sum problem is NP-complete.
Proof To show that subset-sum belongs to class NP, for an instance ⟨S,t⟩ of the problem, we assume the subset S' is the certificate. Checking whether t = Σ_{s∈S'} s can be done by a verification algorithm in polynomial time.
We now show that VERTEX-COVER ≤P SUBSET-SUM. For an instance ⟨G,k⟩ of the vertex-cover problem the reduction algorithm constructs an instance ⟨S,t⟩ of the subset-sum problem so that G has a vertex cover of size k if and only if there is a subset of S whose sum is exactly t.
At the heart of the reduction is an incidence-matrix representation of G. Let G = (V,E) be an undirected graph and let V = {v0, v1, ..., v|V|−1} and E = {e0, e1, ..., e|E|−1}. The incidence matrix of G is a |V| × |E| matrix B = (bij) such that
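The complement-graph reduction from CLIQUE to VERTEX-COVER used in the proof above is short enough to sketch directly. The following Python is an illustrative sketch only: graphs are assumed to be a set of vertices plus a set of frozenset edges, and the function names are hypothetical.

```python
from itertools import combinations


def complement(vertices, edges):
    """Edges of the complement graph: all vertex pairs that are not in `edges`."""
    all_pairs = {frozenset(p) for p in combinations(vertices, 2)}
    return all_pairs - edges


def is_clique(subset, edges):
    """True if every pair of vertices in `subset` is joined by an edge."""
    return all(frozenset(p) in edges for p in combinations(subset, 2))


def is_vertex_cover(subset, edges):
    """True if every edge has at least one endpoint in `subset` (a set)."""
    return all(e & subset for e in edges)


def clique_to_vertex_cover(vertices, edges, k):
    """Map a CLIQUE instance (G, k) to the VERTEX-COVER instance (G-bar, |V| - k)."""
    return vertices, complement(vertices, edges), len(vertices) - k
```

For instance, on the four-vertex graph with edges ab, bc, ac and cd, the clique {a, b, c} of size 3 corresponds to the vertex cover {d} of size 1 in the complement, whose edges are ad and bd.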
bij = 1 if edge ej is incident on vertex vi, and bij = 0 otherwise.
The incidence matrix for the undirected graph of Figure 4.10 is shown in the same figure. The incidence matrix is shown with lower index edges on the right, rather than on the left as is conventional, in order to simplify the formulas for the numbers in S.
[Figure 4.10: A graph G, its incidence matrix B, and the corresponding subset-sum instance written in modified base 4.]
Given a graph G and an integer k, the reduction algorithm computes a set S of numbers and an integer t. To understand how the reduction algorithm works, let us represent numbers in a "modified base-4" fashion. The |E| low-order digits of a number will be in base 4, but the high-order digit can be as large as k. The set of numbers is constructed in such a way that no carries can be propagated from lower digits to higher digits.
The set S consists of two types of numbers, corresponding to vertices and edges respectively. For each vertex vi ∈ V, we create a positive integer xi whose modified base-4 representation consists of a leading 1 followed by |E| digits. The digits correspond to vi's row of the incidence matrix B = (bij) for G, as illustrated in Figure 4.10 (c). Formally, for i = 0, 1, ..., |V| − 1,
x_i = 4^{|E|} + \sum_{j=0}^{|E|-1} b_{ij} 4^j
For each edge ej ∈ E we create a positive integer yj that is just a row of the "identity" incidence matrix. (The identity incidence matrix is the |E| × |E| matrix with 1's only in the diagonal positions.) Formally, for j = 0, 1, ..., |E| − 1,
y_j = 4^j
The first digit of the target sum t is k, and all |E| lower order digits are 2's. Formally,
t = k 4^{|E|} + \sum_{j=0}^{|E|-1} 2 \cdot 4^j
All of these numbers have polynomial size when we represent them in binary. The reduction can be performed in polynomial time by manipulating the bits of the incidence matrix.
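As a concrete illustration of the formulas for xi, yj and t, here is a small Python sketch of the construction. It assumes vertices are numbered 0 to |V| − 1 and edges are given as a list of vertex pairs, with edge j being the j-th entry; the function name is hypothetical and the example graph is a simple path, not the graph of Figure 4.10.

```python
def vertex_cover_to_subset_sum(num_vertices, edges, k):
    """Build the subset-sum instance (xs, ys, t) for a VERTEX-COVER instance.

    edges[j] is the pair of endpoints of edge e_j (an assumed encoding).
    """
    m = len(edges)
    # x_i: leading 1 (weight 4**m) plus 4**j for every edge e_j incident on v_i
    xs = [
        4**m + sum(4**j for j, e in enumerate(edges) if i in e)
        for i in range(num_vertices)
    ]
    # y_j: a single 1 in digit position j
    ys = [4**j for j in range(m)]
    # t: leading digit k, then a 2 in each of the m low-order positions
    t = k * 4**m + sum(2 * 4**j for j in range(m))
    return xs, ys, t


# Path v0 - v1 - v2 with cover size k = 1; the cover {v1} touches both edges once.
xs, ys, t = vertex_cover_to_subset_sum(3, [(0, 1), (1, 2)], 1)
print(xs, ys, t)
```

In this small example the cover {v1} is incident on each edge exactly once, so both y0 and y1 are needed alongside x1 to bring every low-order digit up to 2.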
Now we have to show that graph G has a vertex cover of size k if and only if there is a subset S' ⊆ S whose sum is t. First suppose that G has a vertex cover V' ⊆ V of size k. Let V' = {v_i1, v_i2, ..., v_ik}, and define S' by
S' = {x_i1, x_i2, ..., x_ik} ∪ {yj : ej is incident on precisely one vertex in V'}.
To see that Σ_{s∈S'} s = t, observe that summing the k leading 1's of the x_im ∈ S' gives the leading digit k of the modified base-4 representation of t. To get the low-order digits of t, each of which is a 2, consider the digit positions in turn, each of which corresponds to an edge ej. Because V' is a vertex cover, ej is incident on at least one vertex in V'. Thus, for each edge ej there is at least one x_im ∈ S' with 1 in the jth position. If ej is incident on two vertices in V' then both contribute a 1 to the sum in the jth position. The jth digit of yj contributes nothing, since ej is incident on two vertices, which implies that yj ∉ S'. Thus in this case the sum of S' produces a 2 in the jth position of t. For the other case, when ej is incident on exactly one vertex in V', we have yj ∈ S', and the incident vertex and yj each contribute 1 to the sum of the jth digit of t, thereby also producing a 2. Thus, S' is a solution to the subset-sum instance S.
In Figure 4.10 the vertex cover V' = {v1, v3, v4} corresponds to the subset S' = {x1, x3, x4, y0, y2, y3, y4}. All of the yj are included in S' with the exception of y1, which is incident on two vertices in V'.
Now, suppose that there is a subset S' ⊆ S that sums to t. Let S' = {x_i1, x_i2, ..., x_im} ∪ {y_j1, y_j2, ..., y_jp}. We claim that m = k and that V' = {v_i1, v_i2, ..., v_im} is a vertex cover for G. To prove this claim we start by observing that for each edge ej ∈ E there are three 1's in set S in the ej position: one from each of the two vertices incident on ej, and one from yj. Because we are working with a modified base-4 representation, there are no carries from position ej to position ej+1. Thus, for each of the |E| low-order positions of t, at least one and at most two xi must contribute to the sum. Since at least one xi contributes to the sum for each edge, we see that V' is a vertex cover. To see that m = k, and thus that V' is a vertex cover of size k, observe that the only way the leading k in target t can be achieved is by including exactly k of the xi in the sum.
The hamiltonian cycle problem is NP-complete.
Proof Initially we show that HAM-CYCLE belongs to NP. Given a graph G = (V,E), our certificate is the sequence of |V| vertices that make up the hamiltonian cycle. The verification algorithm checks that this sequence contains each vertex in V exactly once and that, with the first vertex repeated at the end, it forms a cycle in G. This verification can be performed in polynomial time.
We now prove that HAM-CYCLE is NP-complete by showing that 3-CNF-SAT ≤P HAM-CYCLE. Given a 3-CNF boolean formula φ over variables x1, x2, ..., xn with clauses c1, c2, ..., ck, each containing exactly 3 distinct literals, we construct a graph G = (V,E) in polynomial time such that G has a hamiltonian cycle if and only if φ is satisfiable. Our construction is based on widgets, which are pieces of graphs that enforce certain properties.
Our first widget is the subgraph A shown in Figure 4.11. Suppose that A is a subgraph of some graph G and that the only connections between A and the remainder of G are through the vertices a, a', b and b'. Since any hamiltonian cycle of G must pass through the vertices z1, z2, z3 and z4 in one of the ways shown in Figures 4.11 (b) and (c), we may treat subgraph A as if it were simply a pair of edges (a,a') and (b,b') with the restriction that any hamiltonian cycle of G must include exactly one of these edges. We shall represent widget A as shown in Figure 4.11.
[Figure 4.11: Widget A used in the reduction from 3-CNF-SAT to HAM-CYCLE.]
The subgraph B in Figure 4.12 is our second widget. Suppose that B is a subgraph of some graph G and that the only connections from B to the remainder of G are through vertices b1, b2, b3, and b4. A hamiltonian cycle of graph G cannot traverse all of the edges (b1,b2), (b2,b3), and (b3,b4), since then all vertices in the widget other than b1, b2, b3 and b4 would be missed. A hamiltonian cycle of G may, however, traverse any proper subset of these edges. Figures 4.12 (a) to (e) show five such subsets; the remaining two subsets can be obtained by performing a top-to-bottom flip of parts (b) and (e). We represent this widget as in Figure 4.12 (f), the idea being that at least one of the paths pointed to by the arrows must be taken by a hamiltonian cycle.
The graph G that we shall construct consists mostly of copies of these two widgets. The construction is illustrated in Figure 4.13. For each of the k clauses in φ, we include a copy of widget B, and we join these widgets together in series as follows. Letting b_i,j be the copy of vertex bj in the ith copy of widget B, we connect b_i,4 to b_i+1,1 for i = 1, 2, ..., k − 1.
Then, for each variable xm in φ we include two vertices x'm and x''m. We connect these two vertices by means of two copies of the edge (x'm, x''m), which we denote by em and ēm to distinguish them. The idea is that if the hamiltonian cycle takes edge em, it corresponds to assigning variable xm the value 1. If the hamiltonian cycle takes edge ēm, the variable is assigned the value 0. Each pair of these edges forms a two-edge loop; we connect these small loops in series by adding edges (x''m, x'm+1) for m = 1, 2, ..., n − 1. We connect the left (clause) side of the graph to the right (variable) side by means of two edges (b_1,1, x'1) and (b_k,4, x''n), which are the topmost and bottommost edges in Figure 4.13.
We are not yet finished with the construction of graph G, since we have yet to relate the variables to the clauses. If the jth literal of clause Ci is xm, then we use an A widget to connect edge (b_i,j, b_i,j+1) with edge em. If the jth literal of clause Ci is ¬xm, then we instead put an A widget between edge (b_i,j, b_i,j+1) and ēm. In Figure 4.13, for example, because clause c2 is (x1 ∨ ¬x2 ∨ x3), we place three A widgets as follows: between (b_2,1, b_2,2) and e1, between (b_2,2, b_2,3) and ē2, and between (b_2,3, b_2,4) and e3. Note that connecting two edges by means of A widgets actually entails replacing each edge by the five edges in the top or bottom of Figure 4.13 (a) and, of course, adding connections that pass through the z vertices as well. A given literal lm may appear in several clauses (¬x3 in Figure 4.13, for example), and thus an edge em or ēm may be influenced by several A widgets (edge ē3, for example). In this case, we connect the A widgets in series, as shown in Figure 4.14, effectively replacing edge em or ēm by a series of edges.
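The certificate check from the NP-membership part of the HAM-CYCLE proof (each vertex exactly once, and consecutive vertices joined by edges, with the first vertex repeated at the end) can be sketched as below. This is an illustrative sketch under assumptions: the graph is given as a set of vertices and a set of frozenset edges, and is_hamiltonian_cycle is a hypothetical name.

```python
def is_hamiltonian_cycle(vertices, edges, sequence):
    """Verify a claimed hamiltonian cycle given as a sequence of vertices.

    `edges` is a set of frozenset({u, v}) pairs (an assumed encoding).
    """
    if not sequence:
        return False
    # every vertex of the graph must appear exactly once
    if len(sequence) != len(vertices) or set(sequence) != set(vertices):
        return False
    # close the walk by repeating the first vertex at the end
    closed = list(sequence) + [sequence[0]]
    return all(frozenset((a, b)) in edges for a, b in zip(closed, closed[1:]))
```

The check makes a single pass over the sequence with constant-time set lookups, which is consistent with the claim that verification takes polynomial time.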
[Figures 4.12 to 4.14: widget B, the graph G constructed from a 3-CNF formula φ, and A widgets connected in series.]
We claim that formula φ is satisfiable if and only if graph G contains a hamiltonian cycle. We first suppose that G has a hamiltonian cycle h and show that φ is satisfiable. Cycle h must take a particular form: First, it traverses edge (b_1,1, x'1) to go from the top left to the top right. It then follows all of the x'm and x''m vertices from top to bottom, choosing either edge em or edge ēm, but not both. It next traverses edge (b_k,4, x''n) to get back to the left side. Finally, it traverses the B widgets from bottom to top on the left. (It actually traverses edges within the A widgets as well, but we use these subgraphs to enforce the either/or nature of the edges they connect.)
Given the hamiltonian cycle h, we define a truth assignment for φ as follows. If edge em belongs to h then we set xm = 1. Otherwise, edge ēm belongs to h, and we set xm = 0.
We claim that this assignment satisfies φ. Consider a clause Ci and the corresponding B widget in G. Each edge (b_i,j, b_i,j+1) is connected by an A widget to either edge em or edge ēm, depending on whether xm or ¬xm is the jth literal in the clause. The edge (b_i,j, b_i,j+1) is traversed by h if and only if the corresponding literal is 0. Since each of the three edges (b_i,1, b_i,2), (b_i,2, b_i,3), (b_i,3, b_i,4) in clause Ci is also in a B widget, all three cannot be traversed by the hamiltonian cycle h. One of the three edges, therefore, must have a corresponding literal whose assigned value is 1, and clause Ci is satisfied. This property holds for each clause Ci, i = 1, 2, ..., k, and thus formula φ is satisfied.
Conversely, let us suppose that formula φ is satisfied by some truth assignment. By following the rules from above, we can construct a hamiltonian cycle for graph G: traverse edge em if xm = 1, traverse edge ēm if xm = 0, and traverse edge (b_i,j, b_i,j+1) if and only if the jth literal of clause Ci is 0 under the assignment. These rules can indeed be followed, since we assume that the assignment is a satisfying assignment for formula φ.
Finally, we note that graph G can be constructed in polynomial time. It contains one B widget for each of the k clauses in φ. There is one A widget for each instance of each literal in φ, and so there are 3k A widgets. Since the A and B widgets are of fixed size, the graph G has O(k) vertices and edges and is easily constructed in polynomial time. Thus we have provided a polynomial-time reduction from 3-CNF-SAT to HAM-CYCLE.
In the travelling-salesman problem, which is closely related to the hamiltonian-cycle problem, a salesman must visit n cities. Modeling the problem as a complete graph with n vertices, we can say that the salesman wishes to make a tour, or hamiltonian cycle, visiting each city exactly once and finishing at the city he starts from. There is an integer cost c(i,j) to travel from city i to city j, and the salesman wishes to make the tour whose total cost is minimum, where the total cost is the sum of the individual costs along the edges of the tour. For example, in Figure 4.15 a minimum cost tour is ⟨u, w, v, x, u⟩, with cost 7. The formal language for the travelling-salesman problem is:
TSP = {⟨G, c, k⟩ : G = (V,E) is a complete graph, c is a function from V × V → Z, k ∈ Z, and G has a travelling-salesman tour with cost at most k}.
The following theorem shows that a fast algorithm for the travelling-salesman problem is unlikely to exist.
The travelling-salesman problem is NP-complete.
Proof: We first show that TSP belongs to NP. Given an instance of the problem, we use as a certificate the sequence of n vertices in the tour. The verification algorithm checks that this sequence contains each vertex exactly once, sums up the edge costs and checks whether the sum is at most k. This process can certainly be done in polynomial time.
To prove that TSP is NP-hard, we show that HAM-CYCLE ≤P TSP. Let G = (V,E) be an instance of HAM-CYCLE. We form the complete graph G' = (V,E') where E' = {(i,j) : i, j ∈ V and i ≠ j}, and we define the cost function c by c(i,j) = 0 if (i,j) ∈ E, and c(i,j) = 1 if (i,j) ∉ E. The instance of TSP is then ⟨G', c, 0⟩, which is easily formed in polynomial time.
We now show that graph G has a hamiltonian cycle if and only if graph G' has a tour of cost at most 0. Suppose that graph G has a hamiltonian cycle h. Each edge in h belongs to E and thus has cost 0 in G'. Thus, h is a tour in G' with cost 0. Conversely, suppose that graph G' has a tour h' of cost at most 0. Since the costs of the edges in E' are 0 and 1, the cost of tour h' is exactly 0. Therefore, h' contains only edges in E. We conclude that h' is a hamiltonian cycle in graph G.
Student Activity 4.5
Answer the following questions:
1. What is a clique problem? Show that the clique problem is NP complete?
2. What is vertex cover problem? Show that this problem is NP complete.
3. What is travelling-salesperson problem?
• If any single NP-complete problem can be solved in polynomial time, then every NP-complete problem has a polynomial time algorithm.
• The formal-language framework allows us to express the relation between decision problems and algorithms that solve them concisely.
I. True and False
1. The class of polynomial-time solvable problems has closure properties.
2. The class of languages decided by polynomial-time algorithms is not a subset of the class of languages accepted by polynomial-time algorithms.
II. Fill in the blanks
1. The NP-___________ languages are, in a sense, the "hardest" languages in NP.
2. The travelling-salesman problem is closely related to the _____________________ problem.
I. 1. True 2. False
II. 1. complete 2. Hamiltonian-cycle
I. True and False
1. The vertex cover problem is NP-complete.
2. The circuit-satisfiability problem is NP-hard.
II. Fill in the blanks
1. The Hamiltonian cycle problem is ______________.
2. If any NP-complete problem is polynomial-time solvable then ________________.
II. 1. NP-complete 2. P = NP
I. True and False
1. The subset-sum problem is NP-complete.
2. The circuit-satisfiability problem is NP-complete.
II. Fill in the blanks:
1. The clique problem is __________.
2. The circuit-satisfiability problem belongs to the class N ________.
3. ___________ reductions provide a formal means for showing that one problem is at least as hard as another, to within a polynomial-time factor.
4. One of the convenient aspects of focusing on decision problems is that they make it easy to use the machinery of _________ theory.
5. If any NP-complete problem is polynomial-time solvable then ________.
1. A hamiltonian path in a graph is a simple path that visits every vertex exactly once. Show that the language HAM-PATH = {⟨G,u,v⟩ : there is a hamiltonian path from u to v in graph G} belongs to NP.
2. Show that the hamiltonian-path problem is NP-complete.
3. Show that L is complete for NP if and only if its complement L̄ is complete for co-NP.
4. The longest-simple-cycle problem is the problem of determining a simple cycle (no repeated vertex) of maximum length in a graph. Show that this problem is NP-complete.
5. Show that the subset-sum problem is solvable in polynomial time if the target value t is expressed in unary.
Overview
Parallelism
Computational Model: PRAM and other Models
Finding Maximum Element
Merging
Sorting
Parallel Algorithms
Learning Objectives
• Overview
• Parallelism
• Computational Model: PRAM and Other Models
• Finding Maximum Element
• Merging
• Sorting
Top
So far our discussion of algorithms has been confined to single-processor computers. In this block we study algorithms for parallel machines (i.e. computers with more than one processor). There are many applications in day-to-day life that demand real-time solutions to problems. For example, weather forecasting has to be done in a timely fashion. In case of severe hurricanes or snowstorms, evacuation has to be done in a short period of time. If an expert system is used to aid a physician in surgical procedures, decisions have to be made within seconds. And so on. Programs written for such applications have to perform an enormous amount of computation. In the forecasting example, large sized matrices have to be operated on. In the medical example, thousands of rules have to be tried. Even the fastest single-processor machines may not be able to come up with solutions within tolerable time limits. Parallel machines offer the potential of decreasing the solution time enormously.
Assume that you have 5 loads of clothes to wash. Also assume that it takes 25 minutes to wash one load in a washing machine. Then it will take 125 minutes to wash all the clothes using a single machine. On the other hand, if you had 5 machines, washing could be completed in just 25 minutes. In this example, if there are p washing machines and p loads of clothes, then the washing time can be cut down by a factor of p
Here we have assumed that every machine takes exactly the same time to wash. If this assumption is invalid, then the washing time will be dictated by the slowest machine.
Parallelism
The idea of parallel computing is very similar. Given a problem to solve, we partition the problem into many subproblems, let each processor work on a subproblem, and when all the processors are done, the partial solutions are combined to arrive at the final answer. If there are p processors, then potentially we can cut down the solution time by a factor of p. We refer to any algorithm designed for a single-processor machine as a sequential algorithm and any algorithm designed for a multiprocessor machine as a parallel algorithm.
Example: As another example, say there are 100 numbers to be added and there are two persons A and B. Person A can add the first 50 numbers. At the same time B can add the next 50 numbers. When they are done, one of them can add the two individual sums to get the final answer. So two people can add the 100 numbers in almost half the time required by one.
Let π be a given problem for which the best known sequential algorithm has a run time of S'(n), where n is the problem size. If a parallel algorithm on a p-processor machine runs in time T'(n, p), then the speedup of the parallel algorithm is defined to be S'(n)/T'(n, p). If the best known sequential algorithm for π has an asymptotic run time of S(n) and if T(n, p) is the asymptotic run time of a parallel algorithm, then the asymptotic speedup of the parallel algorithm is defined to be S(n)/T(n, p). If S(n)/T(n, p) = Θ(p), then the algorithm is said to have linear speedup.
Note: In this block we use the terms speedup and asymptotic speedup interchangeably; which one is meant is clear from the context.
Example: For the problem of adding 100 numbers, the numbers can be added sequentially in 99 units of time. Person A can add 50 numbers in 49 units of time. At the same time, B can add the other 50 numbers. In another unit of time, the two partial sums can be added. This means that the parallel run time is 50. So the speedup of this parallel algorithm is 99/50 = 1.98, which is very nearly equal to 2!
Example: There are many sequential algorithms for sorting, such as heap sort, that are optimal and run in time Θ(n log n), n being the number of keys to be sorted. Let A be an n-processor parallel algorithm that sorts n keys in Θ(log n) time and let B be an n²-processor algorithm that also sorts n keys in Θ(log n) time. Then the speedup of A is Θ(n log n)/Θ(log n) = Θ(n). On the other hand, the speedup of B is also Θ(n log n)/Θ(log n) = Θ(n). Algorithm A has linear speedup, whereas B does not have a linear speedup: B uses n² processors, so a linear speedup for B would have to be Θ(n²).
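The speedup definition can be checked numerically against the number-adding example. The following is only a minimal sketch: the function names are made up for illustration, and the rule that one person combines the partial sums one at a time is an assumption that matches the two-person case in the text but is not stated there for more people.

```python
def sequential_add_time(count: int) -> int:
    """Time units for one person to add `count` numbers (count - 1 additions)."""
    if count < 2:
        raise ValueError("need at least two numbers to add")
    return count - 1


def parallel_add_time(count: int, people: int) -> int:
    """Time units when `people` persons share the additions equally.

    Assumption (not from the text): after every person has added an equal
    share, one person adds the partial sums one after another.
    """
    if count < 2 or people < 1 or count % people != 0:
        raise ValueError("count must be a multiple of people and at least 2")
    share = count // people
    return (share - 1) + (people - 1)


def speedup(count: int, people: int) -> float:
    """Speedup = sequential run time / parallel run time."""
    return sequential_add_time(count) / parallel_add_time(count, people)


if __name__ == "__main__":
    # Two people adding 100 numbers: 99 time units versus 50.
    print(sequential_add_time(100), parallel_add_time(100, 2), speedup(100, 2))
```

With two people the sketch reproduces the figures of the example: 99 units sequentially, 50 units in parallel.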
If a p-processor parallel algorithm for a given problem runs in time T(n, p), the total work done by this algorithm is defined to be pT(n, p). The efficiency of the algorithm is defined to be S(n)/(pT(n, p)), where S(n) is the asymptotic run time of the best known sequential algorithm for solving the same problem. Also, the parallel algorithm is said to be work-optimal if pT(n, p) = O(S(n)).
Note: A parallel algorithm is work-optimal if and only if it has linear speedup. Also, the efficiency of a work-optimal parallel algorithm is Θ(1).
Example: Let w be the time to wash one load of clothes on a single machine in the washing example; also let n be the total number of loads to wash. A single machine will take time nw. If there are p machines, the washing time is ⌈n/p⌉w. Thus the speedup is n/⌈n/p⌉. This speedup is ≥ p/2 if n ≥ p. So the asymptotic speedup is Ω(p), and hence the parallel algorithm has linear speedup and is work-optimal. Also, the efficiency is nw/(p⌈n/p⌉w) = n/(p⌈n/p⌉). This is Θ(1) if n ≥ p.
Example: For the n-processor sorting algorithm A introduced earlier, the total work done is nΘ(log n) = Θ(n log n). Its efficiency is Θ(n log n)/Θ(n log n) = Θ(1). Thus A is work-optimal and has a linear speedup. The total work done by the n²-processor algorithm B is n²Θ(log n) = Θ(n² log n) and its efficiency is Θ(n log n)/Θ(n² log n) = Θ(1/n). As a result, B is not work-optimal.
Is it possible to get a speedup of more than p for any problem on a p-processor machine? Assume that it is possible (such a speedup is called superlinear speedup). In particular, let π be the problem under consideration and S be the best known sequential run time. If there is a parallel algorithm on a p-processor machine whose speedup is better than p, it means that the parallel run time T satisfies T < S/p, that is, pT < S. But a single step of the parallel algorithm can be simulated on one processor in time ≤ p, so the whole parallel algorithm can be run sequentially in time ≤ pT < S. This is a contradiction, since by assumption S is the run time of the best known sequential algorithm for solving π!
The preceding discussion is valid only when we consider asymptotic speedups. When the speedup is defined with respect to the actual run times on the sequential and parallel machines, it is possible to obtain superlinear speedup. Two of the possible reasons for such an anomaly are (1) p processors have more aggregate memory than one, and (2) the cache-hit frequency may be better for the parallel machine, as the p processors may have more aggregate cache than does one processor.
One way of solving a given problem is to explore many techniques (i.e., algorithms) and identify the one that is the most parallelizable. To achieve a good speedup, it is necessary to parallelize every component of the underlying technique. If a fraction f of the technique cannot be parallelized (i.e., has to be run sequentially), then the maximum speedup that can be obtained is limited by f. Amdahl's law relates the maximum speedup achievable with f and p as follows:
Maximum speedup = 1 / (f + (1 − f)/p)
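Amdahl's law as stated above can be evaluated directly. The sketch below is a minimal illustration; the function name and the argument checks are assumptions made here, not part of the text.

```python
def amdahl_max_speedup(f: float, p: int) -> float:
    """Maximum speedup 1 / (f + (1 - f) / p).

    f is the fraction of the technique that must run sequentially,
    p is the number of processors.
    """
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must lie between 0 and 1")
    if p < 1:
        raise ValueError("p must be at least 1")
    return 1.0 / (f + (1.0 - f) / p)


if __name__ == "__main__":
    # Tabulate the bound for ten processors and a few sequential fractions.
    for fraction in (0.5, 0.1, 0.01):
        print(fraction, amdahl_max_speedup(fraction, 10))
```

As p grows, the term (1 − f)/p vanishes and the bound approaches 1/f, so the sequential fraction alone caps the achievable speedup.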
Example: Consider some technique for solving a problem π. Assume that p = 10. If f = 0.5 for this technique, then the maximum speedup that can be obtained is 1/(0.5 + 0.5/10) = 20/11, which is less than 2! If f = 0.1, then the maximum speedup is 1/(0.1 + 0.9/10) = 10/1.9, which is slightly more than 5! Finally, if f = 0.01, then the maximum speedup is 1/(0.01 + 0.99/10) = 10/1.09, which is slightly more than 9!
Student Activity 5.1
Before going to next section, answer the following questions:
1. Explain the importance of parallel processing with an example.
2. What do you mean by a work-optimal parallel algorithm?
If your answers are correct, then proceed to next section.
Computational Model: PRAM and Other Models
The sequential computational model we have employed so far is the RAM (random access machine). In the RAM model we assume that any of the following operations can be performed in one unit of time: addition, subtraction, multiplication, division, comparison, memory access, assignment, and so on. This model has been widely accepted as a valid sequential model. On the other hand, when it comes to parallel computing, numerous models have been proposed and algorithms have been designed for each such model.
An important feature of parallel computing that is absent in sequential computing is the need for interprocessor communication. For example, given any problem, the processors have to communicate among themselves and agree on the subproblems each will work on. Also, they need to communicate to see whether everyone has finished its task, and so on. Each machine or processor in a parallel computer can be assumed to be a RAM. Various parallel models differ in the way they support interprocessor communication. Parallel models can be broadly categorized into two: fixed connection machines and shared memory machines.
A fixed connection network is a graph G(V, E) whose nodes represent processors and whose edges represent communication links between processors. Usually we assume that the degree of each node is either a constant or a slowly increasing function of the number of nodes in the graph. Examples include the mesh, hypercube, butterfly, and so on (see figure 5.1). Interprocessor communication is done through the communication links. Any two processors connected by an edge in G can communicate in one step. In general, two processors can communicate through any of the paths connecting them. The communication time depends on the lengths of these paths (at least for small packets).
In shared memory models [also called PRAMs (Parallel Random Access Machines)], a number (say p) of processors work synchronously. They communicate with each other using a common block of global memory that is accessible by all. This global memory is also called common or shared memory (see figure 5.2). Communication is performed by writing to and/or reading from the common memory. Any two processors i and j can communicate in two steps. In the first step, processor i writes its message into memory cell j, and in the second step processor j reads from this cell. In contrast, in a fixed connection machine, the communication time depends on the lengths of the paths connecting the communicating processors.
Each processor in a PRAM is a RAM with some local memory. A single step of a PRAM algorithm can be one of the following: arithmetic operation (such as addition, division, and so on), comparison, memory access (local or global), assignment, etc. The number (m) of cells in the global memory is typically assumed to be the same as p. But this need not always be the case. In fact, we present algorithms for which m is much larger or smaller than p. We also assume that the input is given in the global memory and there is space for the output and for storing intermediate results. Since the global memory is accessible by all processors, access conflicts may arise.
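The contrast drawn above between the two families of models can be made concrete by counting communication steps. The sketch below rests on assumptions that are not in the text: a square mesh without wraparound whose processors are numbered row by row, and a hypercube whose processors carry binary labels that differ in one bit along every edge. Under those assumptions the shortest path lengths are the Manhattan and Hamming distances, respectively.

```python
PRAM_COMMUNICATION_STEPS = 2  # one write into a shared cell, then one read


def mesh_hops(a: int, b: int, side: int) -> int:
    """Shortest path length between processors a and b on a side x side mesh.

    Assumption: processors are numbered 0 .. side*side - 1 in row-major order.
    """
    if not (0 <= a < side * side and 0 <= b < side * side):
        raise ValueError("processor number out of range")
    row_a, col_a = divmod(a, side)
    row_b, col_b = divmod(b, side)
    return abs(row_a - row_b) + abs(col_a - col_b)


def hypercube_hops(a: int, b: int) -> int:
    """Shortest path length on a hypercube: number of differing label bits."""
    if a < 0 or b < 0:
        raise ValueError("labels must be non-negative")
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    # Opposite corners of a 4 x 4 mesh and of a 4-dimensional hypercube.
    print(mesh_hops(0, 15, 4), hypercube_hops(0b0000, 0b1111),
          PRAM_COMMUNICATION_STEPS)
```

On the PRAM the cost stays at two steps for every pair of processors, whereas on a fixed connection network it grows with the distance between them.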
What happens if more than one processor tries to access the same global memory cell (for the purpose of reading from or writing into) at the same time? There are several ways of resolving read and write conflicts. Accordingly, several variants of the PRAM arise.
EREW (Exclusive Read and Exclusive Write) PRAM is the shared memory model in which no concurrent read or write is allowed on any cell of the global memory. Note that ER or EW does not preclude different processors simultaneously accessing different memory cells. For example, at a given time step, processor one might access cell five and at the same time processor two might access cell 12, and so on. But processors one and two cannot access memory cell ten, for example, at the same time. CREW (Concurrent Read and Exclusive Write) PRAM is a variation that permits concurrent reads but not concurrent writes. Similarly, one could also define the ERCW model. Finally, the CRCW PRAM model allows both concurrent reads and concurrent writes.
In a CREW or CRCW PRAM, if more than one processor tries to read from the same cell, clearly, they will read the same information. But in a CRCW PRAM, if more than one processor tries to write in the same cell, then possibly they may have different messages to write. Thus there has to be an additional mechanism to determine which message gets to be written. Accordingly, several variations of the CRCW PRAM can be derived. In a common CRCW PRAM, concurrent writes are permitted in any cell only if all the processors conflicting for this cell have the same message to write. In an arbitrary CRCW PRAM, if there is a conflict for writing, one of the processors will succeed in writing and we don't know which one. Any algorithm designed for this model should work no matter which processor succeeds in the event of conflicts. The priority CRCW PRAM lets the processor with the highest priority succeed in the case of conflicts. Typically each processor is assigned a (static) priority to begin with.
Example: Consider a 4-processor machine and also consider an operation in which each processor has to read from the global cell M[1]. This operation can be denoted as
Processor i (in parallel for 1 ≤ i ≤ 4) does:
Read M[1];
This concurrent read operation can be performed in one unit of time on the CRCW as well as on the CREW PRAMs. But on the EREW PRAM, concurrent reads are prohibited. Still, we can perform this operation on the EREW PRAM, making sure that at any given time no two processors attempt to read from the same memory cell. One way of performing this is as follows: processor 1 reads M[1] at the first time unit, processor 2 reads M[1] at the second time unit, and processors 3 and 4 read M[1] at the third and fourth time units, respectively. The total run time is four.
Example: Now consider the operation in which each processor has to access M[1] for writing at the same time. Since only one message can be written to M[1], one has to assume some scheme for resolving contentions. This operation can be denoted as
Processor i (in parallel for 1 ≤ i ≤ 4) does:
Write M[1];
Again, on the CRCW PRAM, this operation can be completed in one unit of time. On the CREW and EREW PRAMs, concurrent writes are prohibited. However, these models can simulate the effect of a concurrent write. Consider our simple example of four processors trying to write in M[1]. Simulating a common CRCW PRAM requires the four processors to verify that all wish to write the same value. Following this, processor 1 can do the writing. Simulating a priority CRCW PRAM requires the four processors to first determine which has the highest priority, and then the one with this priority does the write. Other models may be similarly simulated.
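The three write-conflict rules described above can be sketched as a small sequential simulation of one concurrent write to a single cell. Everything below is illustrative: the function name, the string labels for the variants, and the convention that a smaller processor number means a higher priority are assumptions, since the text only says that priorities are static.

```python
import random


def resolve_concurrent_write(requests, variant, rng=None):
    """Return the value stored after several processors write to one cell.

    requests: list of (processor_id, value) pairs aimed at the same cell.
    variant:  "common", "arbitrary" or "priority".
    """
    if not requests:
        raise ValueError("at least one write request is needed")
    values = [value for _, value in requests]
    if variant == "common":
        # Legal only when every conflicting processor writes the same message.
        if any(value != values[0] for value in values):
            raise ValueError("common CRCW: writers disagree on the value")
        return values[0]
    if variant == "arbitrary":
        # Some writer succeeds; an algorithm must not rely on which one.
        return (rng or random).choice(values)
    if variant == "priority":
        # Assumption: the smallest processor id has the highest priority.
        return min(requests, key=lambda request: request[0])[1]
    raise ValueError(f"unknown variant: {variant}")


if __name__ == "__main__":
    writes = [(3, "c"), (1, "a"), (4, "d"), (2, "b")]
    print(resolve_concurrent_write(writes, "priority"))
    print(resolve_concurrent_write(writes, "arbitrary", random.Random(0)))
```

An algorithm written for the arbitrary model has to be correct for every value this function could return, which is why the text calls the three models a hierarchy of increasing power.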
Note that any algorithm that runs on a p-processor EREW PRAM in time T(n, p), where n is the problem size, can also run on a p-processor CREW PRAM or a CRCW PRAM within the same time. But a CRCW PRAM algorithm or a CREW PRAM algorithm may not be implementable on an EREW PRAM preserving the asymptotic run time. In the preceding examples, we saw that the implementation of a single concurrent write or concurrent read step takes much more time on the EREW PRAM. Likewise, a p-processor CRCW PRAM algorithm may not be implementable on a p-processor CREW PRAM preserving the asymptotic run time. It turns out that there is a strict hierarchy among the variants of the PRAM in terms of their computational power. For example, a CREW PRAM is strictly more powerful than an EREW PRAM. This means that there is at least one problem that can be solved in asymptotically less time on a CREW PRAM than on an EREW PRAM, given the same number of processors. Also, any version of the CRCW PRAM is more powerful than a CREW PRAM, as is demonstrated by the Boolean OR example below.
Let EREW(p, T(n, p)) denote the set of all problems that can be solved using a p-processor EREW PRAM in time T(n, p) (n being the problem size). Similarly define CREW(p, T(n, p)) and CRCW(p, T(n, p)). Then
EREW(p, T(n, p)) ⊂ CREW(p, T(n, p)) ⊂ CRCW(p, T(n, p))
There exists a hierarchy among the different versions of the CRCW PRAM also. Common, arbitrary, and priority form an increasing hierarchy of computing power:
Common CRCW(p, T(n, p)) ⊂ Arbitrary CRCW(p, T(n, p)) ⊂ Priority CRCW(p, T(n, p))
Example: A[0] = A[1] ∨ A[2] ∨ … ∨ A[n] is the Boolean (or logical) OR of the n bits A[1 : n]. A[0] is easily computed in O(n) time on a RAM. The following algorithm shows how A[0] can be computed in Θ(1) time using an n-processor CRCW PRAM. Assume that A[0] is zero to begin with.
Processor i (in parallel for 1 ≤ i ≤ n) does:
if (A[i] == 1) A[0] = A[i];
In the first time step, processor i, for 1 ≤ i ≤ n, reads memory location A[i] and proceeds to write a 1 in memory location A[0] if A[i] is a 1. Since several of the A[i] may be 1, several processors may write to A[0] concurrently. Hence the algorithm cannot be run (as such) on an EREW or CREW PRAM. In fact, for these two models, it is known that the parallel complexity of the Boolean OR problem is Ω(log n), no matter how many processors are used. Note that this algorithm works on all three varieties of the CRCW PRAM.
Theorem: The Boolean OR of n bits can be computed in O(1) time on an n-processor common CRCW PRAM.
In practice, a problem of size n is solved on a computer with a constant number p of processors. All the algorithms designed under some assumptions about the relationship between n and p can also be used when fewer processors are available, as there is a general slowdown lemma for the PRAM model. The slowdown lemma concerns the simulation of the same algorithm on a p'-processor machine (for p' < p).
Let A be a parallel algorithm for solving problem π that runs in time T using p processors. Each step of algorithm A can be simulated on the p'-processor machine (call it M) in time ≤ ⌈p/p'⌉, since a processor of M can be in charge of simulating ⌈p/p'⌉ processors of the original machine. Thus, the simulation time on M is ≤ T⌈p/p'⌉. Therefore the total work done on M is ≤ p'T⌈p/p'⌉ ≤ pT + p'T = O(pT). This results in the following lemma.
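The Boolean OR algorithm and the slowdown argument above can be traced together in a small sequential simulation. This is a sketch under assumptions made here: the function name is invented, one "time step" is counted per round in which every physical processor simulates at most one virtual processor, and the physical processor count is a parameter that defaults to n.

```python
import math


def boolean_or_crcw(bits, physical=None):
    """Simulate the CRCW Boolean OR; return (A[0], parallel time steps).

    bits:     the n input bits A[1..n], given as a Python list.
    physical: number p' of physical processors (defaults to n).
    """
    n = len(bits)
    if n == 0:
        return 0, 0
    p_prime = n if physical is None else physical
    if not 1 <= p_prime <= n:
        raise ValueError("physical must lie between 1 and n")
    a0 = 0  # A[0] is zero to begin with
    rounds = math.ceil(n / p_prime)  # each physical processor plays this many roles
    for r in range(rounds):
        # One time step: physical processor j simulates virtual processor r*p' + j.
        for i in range(r * p_prime, min((r + 1) * p_prime, n)):
            if bits[i] == 1:
                a0 = 1  # every writer writes the same value, legal on common CRCW
    return a0, rounds


if __name__ == "__main__":
    data = [0, 1] * 8  # n = 16 bits
    for processors in (16, 4, 1):
        print(processors, boolean_or_crcw(data, processors))
```

With n physical processors the simulation needs a single step; with p' processors it needs ⌈n/p'⌉ steps, which is the ≤ T⌈p/p'⌉ bound of the argument above with T = 1.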
[Slowdown lemma] Any parallel algorithm that runs on a p-processor machine in time T can be run on a p'-processor machine in time O(pT/p'), for any p' < p.
Example: The Boolean OR algorithm of the previous example runs in Θ(1) time using n processors. Using the slowdown lemma, the same algorithm also runs in Θ(log n) time using n/log n processors; it also runs in Θ(√n) time using √n processors; and so on. When p = 1, the algorithm runs in time Θ(n), which is the same as the run time of the best sequential algorithm!
Student Activity 5.2
Before going to next section, answer the following questions:
1. Define the EREW PRAM model.
2. Differentiate between EREW PRAM and CRCW PRAM.
If your answers are correct, then proceed to next section.
Finding Maximum Element
Algorithms to find the maximum element in a list using more than one processor are given below.
Finding the maximum of n given numbers can be done in O(1) time using an n²-processor CRCW PRAM. Let k1, k2, …, kn be the input. The idea is to perform all pairs of comparisons in one step using n² processors. If we name the processors pij (for 1 ≤ i, j ≤ n), processor pij computes xij = (ki < kj). Without loss of generality assume that all the keys are distinct. Even if they are not, they can be made distinct by replacing key ki with the tuple (ki, i) (for 1 ≤ i ≤ n); this amounts to appending each key with only a (log n)-bit number. Of all the input keys, there is only one key k which, when compared with every other key, would have yielded the same bit zero. This key can be identified using the Boolean OR algorithm and is the maximum of all. The resultant algorithm appears as follows:
Step 0. If n = 1, output the key.
Step 1. Processor pij (for each 1 ≤ i, j ≤ n in parallel) computes xij = (ki < kj).
Step 2. The n² processors are grouped into n groups G1, G2, …, Gn, where Gi (1 ≤ i ≤ n) consists of the processors pi1, pi2, …, pin. Each group Gi computes the Boolean OR of xi1, xi2, …, xin.
Step 3. If Gi computes a zero in step 2, then processor pi1 outputs ki as the answer.
Steps 1 and 3 of this algorithm take unit time each. Step 2 takes O(1) time. Thus the whole algorithm runs in O(1) time.
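The four steps above can be traced with a sequential simulation in which every list comprehension stands for one parallel step. This is a minimal sketch, not a parallel implementation: the function name is made up, and ties are broken with the (key, index) tuples suggested in the text.

```python
def crcw_max(keys):
    """Maximum of the keys, following the n^2-processor CRCW scheme."""
    n = len(keys)
    if n == 0:
        raise ValueError("at least one key is needed")
    if n == 1:
        return keys[0]  # Step 0
    # Make the keys distinct by pairing each key with its position.
    tagged = [(key, i) for i, key in enumerate(keys)]
    # Step 1: virtual processor p_ij computes x_ij = (k_i < k_j).
    x = [[tagged[i] < tagged[j] for j in range(n)] for i in range(n)]
    # Step 2: group G_i computes the Boolean OR of row i.
    row_or = [any(row) for row in x]
    # Step 3: the only group whose OR is zero holds the maximum.
    winners = [keys[i] for i in range(n) if not row_or[i]]
    return winners[0]


if __name__ == "__main__":
    print(crcw_max([3, 9, 2, 9, 5]))
```

The simulation performs n² comparisons, which mirrors the Θ(n²) work of the parallel algorithm even though the parallel run time is constant.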
Theorem: The maximum of n keys can be computed in O(1) time using n² common CRCW PRAM processors.
Note that the speedup of the previous algorithm is Θ(n)/1 = Θ(n). Total work done by this algorithm is Θ(n²). Hence its efficiency is Θ(n)/Θ(n²) = Θ(1/n). Clearly this algorithm is not work-optimal.
Now we show that maximal selection can be done in O(log log n) time using n common CRCW PRAM processors. The technique to be employed is divide and conquer. To simplify the discussion, we assume n is a perfect square (when n is not a perfect square, replace √n by ⌈√n⌉ in the following discussion).
Let the input sequence be k1, k2, …, kn. We are interested in developing an algorithm that can find the maximum of n keys using n processors. Let T(n) be the run time of this algorithm. We partition the input into √n parts so that the maximum of each part can be computed in parallel. Since the recursive maximal selection of each part involves √n keys and an equal number of processors, this can be done in T(√n) time. Let M1, M2, …, M√n be the group maxima. The answer we are supposed to output is the maximum of these maxima. Since now we only have √n keys, we can find the maximum of these employing all the n processors (see the following algorithm).
Step 0. If n = 1, return k1.
Step 1. Partition the input keys into √n parts K1, K2, …, K√n, where Ki consists of k_{(i−1)√n+1}, k_{(i−1)√n+2}, …, k_{i√n}. Similarly partition the processors so that Pi (1 ≤ i ≤ √n) consists of the processors p_{(i−1)√n+1}, p_{(i−1)√n+2}, …, p_{i√n}. Let Pi find the maximum of Ki recursively (for 1 ≤ i ≤ √n).
Step 2. If M1, M2, …, M√n are the group maxima, find and output the maximum of these maxima employing the theorem of the previous section.
Step 1 of this algorithm takes T(√n) time and step 2 takes O(1) time. Thus T(n) satisfies the recurrence
T(n) = T(√n) + O(1)
which solves to T(n) = O(log log n): after i levels of recursion the subproblem size is n^(1/2^i), which drops to a constant after O(log log n) levels. Therefore, the following theorem arises.
Theorem: The maximum of n keys can be found in O(log log n) time using n common CRCW PRAM processors.
Total work done by the above algorithm is Θ(n log log n) and its efficiency is Θ(n)/Θ(n log log n) = Θ(1/log log n). Thus this algorithm is not work-optimal.
Consider again the problem of finding the maximum of n given keys. If each one of these keys is a bit, the problem of finding the maximum reduces to computing the Boolean OR of n bits and hence can be done in O(1) time using n common CRCW PRAM processors. This raises the following question: What can be the maximum magnitude of each key if we desire a constant time algorithm for maximal selection using n processors? Answering this question in its full generality is beyond the scope of this syllabus. Instead we show that if each key is an integer in the range [0, n^c], where c is a constant, maximal selection can be done work-optimally in O(1) time. Speedup of this algorithm is Θ(n) and its efficiency is Θ(1).
Since each key is of magnitude at most n^c, it follows that each key is a binary number with ≤ c log n bits. Without loss of generality assume that every key is of length exactly equal to c log n. Suppose we find the maximum of the n keys only with respect to their (log n)/2 most significant bits (MSBs) (see figure 3). Let M be the maximum value. Then any key whose (log n)/2 MSBs are not equal to M can be dropped from future consideration, since it cannot possibly be the maximum. After this, many keys can potentially survive. Next we compute the maximum of the remaining keys with respect to their next (log n)/2 MSBs and drop keys that cannot possibly be the maximum. We repeat this basic step 2c times (once for every (log n)/2 bits in the input keys). One of the keys that survive the very last step can be output as the maximum. Refer to the (log n)/2 MSBs of any key as its first part, the next most significant (log n)/2 bits as its second part, and so on. There are 2c parts for each key. The 2c-th part may have less than (log n)/2 bits. The algorithm is summarized below. To begin with, all the keys are alive.
for (i = 1; i <= 2c; i++)
{
Step 1. Find the maximum of all alive keys with respect to their ith parts. Let M be the maximum.
Step 2. Delete each alive key whose ith part is < M.
}
Output one of the alive keys.
We now show that step 1 of this algorithm can be completed in O(1) time using n common CRCW PRAM processors. Note that if a key has at most (log n)/2 bits, its maximum magnitude is √n − 1. Thus each step of this algorithm is nothing but the task of finding the maximum of n keys, where each key is an integer in the range [0, √n − 1]. Assign one processor to each key. Make use of √n global memory cells (which are initialized to −∞). Call these cells M0, M1, …, M√n−1. In one parallel write step, if processor i has a key ki, then it tries to write ki in cell M_{ki}. For example, if processor i has a key valued 10, it will attempt to write 10 in M10. After this write step, the problem of computing the maximum of the n keys reduces to computing the maximum of the contents of M0, M1, …, M√n−1. Since these are only √n numbers, their maximum can be found in O(1) time using n processors. As a result we get the following theorem.
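The part-by-part elimination for integer keys can be traced with the sequential sketch below. Several details are assumptions made for illustration and are not in the text: the function name, the requirement that n be a power of 4 (so that (log n)/2 is a whole number of bits), and the restriction of keys to 0 … n^c − 1 so that every key fits in exactly c log n bits. The call to `max` over the √n cells stands for the O(1)-time maximum that n processors can compute on √n values.

```python
def integer_max(keys, c):
    """Maximum of n integer keys in [0, n**c - 1], one (log n)/2-bit part at a time."""
    n = len(keys)
    log_n = n.bit_length() - 1
    # Assumption of this sketch: n is a power of 4, so (log n)/2 is an integer.
    if n < 4 or n != 1 << log_n or log_n % 2:
        raise ValueError("n must be a power of 4")
    if c < 1 or any(not 0 <= key < n ** c for key in keys):
        raise ValueError("keys must be integers in [0, n**c - 1]")
    part_bits = log_n // 2
    cell_count = 1 << part_bits              # sqrt(n) global cells M[0..sqrt(n)-1]
    alive = list(range(n))                   # to begin with, all keys are alive
    for i in range(1, 2 * c + 1):
        shift = (2 * c - i) * part_bits      # bit position of the i-th part
        parts = [(keys[j] >> shift) & (cell_count - 1) for j in alive]
        # Step 1: each alive processor writes its part value v into cell M[v].
        cells = [float("-inf")] * cell_count
        for value in parts:
            cells[value] = value
        m = max(cells)                       # maximum of only sqrt(n) numbers
        # Step 2: delete each alive key whose i-th part is smaller than m.
        alive = [j for j, value in zip(alive, parts) if value == m]
    return keys[alive[0]]                    # output one of the alive keys


if __name__ == "__main__":
    sample = [200, 17, 255, 3, 128, 254, 90, 64, 1, 0, 33, 77, 199, 250, 12, 100]
    print(integer_max(sample, 2))
```

Because every concurrent write into a cell M[v] carries the same value v, the write step is legal even on the common CRCW PRAM.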
Theorem: The maximum of n keys can be found in O(1) time using n CRCW PRAM processors provided the keys are integers in the range [0, n^c] for any constant c.
Student Activity 5.3
Before going to next section, answer the following questions:
1. Write an algorithm for finding the maximum using n² processors.
2. Describe a method for finding the maximum using n processors.
If your answers are correct, then proceed to next section.
Merging
The problem of merging is to take two sorted sequences as input and produce a sorted sequence of all the elements. Merging is an important problem. For example, an efficient merging algorithm can lead to an efficient sorting algorithm. The same is true in parallel computing also. In this section we study the parallel complexity of merging.
Let X1 = k1, k2, …, km and X2 = km+1, km+2, …, k2m be the input sorted sequences to be merged. Assume without loss of generality that m is an integral power of 2 and that the keys are distinct. Note that the merging of X1 and X2 can be reduced to computing the rank of each key k in X1 ∪ X2. If we know the rank of each key, then the keys can be merged by writing the key whose rank is i into global memory cell i. This writing will take only one time unit if we have n = 2m processors.
For any key k, let its rank in X1 (X2) be denoted as r1(k) (r2(k)). If k = kj ∈ X1, then note that r1(k) = j. If we allocate a single processor π to k, π can perform a binary search on X2 and figure out the number q of keys in X2 that are less than k. Once q is known, π can compute k's rank in X1 ∪ X2 as j + q. If k belongs to X2, a similar procedure can be used to compute its rank in X1 ∪ X2. In summary, if we have 2m processors (one processor per key), merging can be completed in O(log m) time.
Theorem: Merging of two sorted sequences each of length m can be completed in O(log m) time using m CREW PRAM processors.
Since two sorted sequences of length m each can be sequentially merged in Θ(m) time, the speedup of the above algorithm is Θ(m)/Θ(log m) = Θ(m/log m); its efficiency is Θ(m)/Θ(m log m) = Θ(1/log m). This algorithm is not work-optimal!
Odd-even merge is a merging algorithm based on divide and conquer that lends itself to efficient parallelization. If X1 = k1, k2, …, km and X2 = km+1, km+2, …, k2m (where m is an integral power of 2) are the two sorted sequences to be merged, then the following algorithm uses 2m processors.
Step 0. If m = 1, merge the sequences with one comparison.
Step 1. Partition X1 and X2 into their odd and even parts. That is, partition X1 into X1odd = k1, k3, …, km−1 and X1even = k2, k4, …, km. Similarly, partition X2 into X2odd and X2even.
Step 2. Recursively merge X1odd with X2odd using m processors. Let L1 = l1, l2, …, lm be the result. Note that X1odd, X1even, X2odd and X2even are in sorted order. At the same time merge X1even with X2even using the other m processors to get L2 = lm+1, lm+2, …, l2m.
Step 3. Shuffle L1 and L2; that is, form the sequence L = l1, lm+1, l2, lm+2, …, lm, l2m. Compare every pair (lm+i, li+1) and interchange them if they are out of order. That is, compare lm+1 with l2 and interchange them if need be, compare lm+2 with l3 and interchange them if need be, and so on. Output the resultant sequence.
Example: Let X1 = 2, 5, 8, 11, 13, 16, 21, 25 and X2 = 4, 9, 12, 18, 23, 27, 31, 34. The figure shows how the odd-even merge algorithm can be used to merge these two sorted sequences:
X1odd = 2, 8, 13, 21 and X1even = 5, 11, 16, 25
X2odd = 4, 12, 23, 31 and X2even = 9, 18, 27, 34
Merge: L1 = 2, 4, 8, 12, 13, 21, 23, 31 and L2 = 5, 9, 11, 16, 18, 25, 27, 34
Shuffle: L = 2, 5, 4, 9, 8, 11, 12, 16, 13, 18, 21, 25, 23, 27, 31, 34
Compare-exchange: 2, 4, 5, 8, 9, 11, 12, 13, 16, 18, 21, 23, 25, 27, 31, 34
The correctness of the merging algorithm can be established using the zero-one principle. The validity of this principle is not proved here.
Theorem [Zero-one principle]: If any oblivious comparison-based sorting algorithm sorts an arbitrary sequence of n zeros and ones correctly, then it will also sort any sequence of n arbitrary keys.
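Both merging schemes of this section can be sketched in a few lines and checked against each other. The code below is a sequential illustration only: the function names are invented here, the recursion and loops stand in for work that the PRAM processors would do simultaneously, and the rank-based version relies on the text's assumption that all keys are distinct.

```python
from bisect import bisect_left


def odd_even_merge(x1, x2):
    """Merge two sorted lists of equal length m, where m is a power of two."""
    m = len(x1)
    if m != len(x2) or m < 1 or m & (m - 1):
        raise ValueError("both inputs need the same power-of-two length")
    if m == 1:
        # Step 0: one comparison settles the order.
        return [min(x1[0], x2[0]), max(x1[0], x2[0])]
    # Steps 1 and 2: merge the odd parts (k1, k3, ...) and the even parts
    # (k2, k4, ...); on a PRAM the two recursive merges run at the same time.
    l1 = odd_even_merge(x1[0::2], x2[0::2])
    l2 = odd_even_merge(x1[1::2], x2[1::2])
    # Step 3: shuffle into l1, l(m+1), l2, l(m+2), ..., lm, l(2m).
    shuffled = [None] * (2 * m)
    shuffled[0::2] = l1
    shuffled[1::2] = l2
    # Compare-exchange every pair (l(m+i), l(i+1)); the pairs are disjoint,
    # so all of them could be handled in one parallel step.
    for i in range(1, 2 * m - 1, 2):
        if shuffled[i] > shuffled[i + 1]:
            shuffled[i], shuffled[i + 1] = shuffled[i + 1], shuffled[i]
    return shuffled


def rank_merge(x1, x2):
    """Merge by ranks: one binary search per key (keys assumed distinct)."""
    out = [None] * (len(x1) + len(x2))
    for own, other in ((x1, x2), (x2, x1)):
        for j, key in enumerate(own):        # one virtual processor per key
            q = bisect_left(other, key)      # keys of the other list below key
            out[j + q] = key                 # 0-based cell for rank j + q + 1
    return out


if __name__ == "__main__":
    a = [2, 5, 8, 11, 13, 16, 21, 25]
    b = [4, 9, 12, 18, 23, 27, 31, 34]
    print(odd_even_merge(a, b))
    print(rank_merge(a, b))
```

The compare-exchange pattern in `odd_even_merge` never depends on the key values, which is the obliviousness that the zero-one principle requires.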
A comparison-based sorting algorithm is said to be oblivious if the sequence of cells to be compared in the algorithm is prespecified. For example, the next pair of cells to be compared cannot depend on the outcome of comparisons made in the previous steps.
Student Activity 5.4
Before going to next section, answer the following questions:
1. Explain the odd-even merge algorithm.
2. Merge the following two files by the odd-even merge algorithm:
X1 = 5, 8, 10, 13, 15
X2 = 4, 9, 10, 11, 14
If your answers are correct, then proceed to next section.
Sorting
Given a sequence of n keys, recall that the problem of sorting is to rearrange this sequence into either ascending or descending order. In this section we study algorithms for parallel sorting.
Let X = k1, k2, …, kn be the given sequence of n keys. If we have n² processors, the rank of each key can be computed in O(log n) time by comparing, in parallel, all possible pairs. Once we know the rank of each key, in one parallel write step they can be written in sorted order (the key whose rank is i is written in cell i). Thus we have the following theorem.
Theorem: We can sort n keys in O(log n) time using n² CREW PRAM processors.
Odd-even merge sort employs the classical divide-and-conquer strategy. Assume for simplicity that n is an integral power of two and that the keys are distinct. If X = k1, k2, …, kn is the given sequence of n keys, it is partitioned into two subsequences X'1 = k1, k2, …, kn/2 and X'2 = kn/2+1, kn/2+2, …, kn of equal length. X'1 and X'2 are sorted recursively, assigning n/2 processors to each. The two sorted subsequences (call them X1 and X2, respectively) are then finally merged. The preceding description of the algorithm is exactly the same as that of merge sort except for the way the two subsequences X1 and X2 are merged: we employ the odd-even merge algorithm of the previous section.
Theorem: We can sort n arbitrary keys in O(log² n) time using n EREW PRAM processors.
Proof: The sorting algorithm is described as follows:
Step 0. If n ≤ 1, return X.
Step 1. Let X = k1, k2, …, kn be the input. Partition the input into two: X'1 = k1, k2, …, kn/2 and X'2 = kn/2+1, kn/2+2, …, kn.
Step 2. Allocate n/2 processors to sort X'1 recursively. Let X1 be the result. At the same time employ the other n/2 processors to sort X'2 recursively. Let X2 be the result.
Step 3. Merge X1 and X2 using the odd-even merge algorithm and n = 2m processors.
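The steps above can be sketched as a sequential simulation, together with a count of the parallel comparison rounds that the recursion would need. The function names are illustrative, and the round count rests on an assumption made here: each level of the merge recursion costs one round of comparisons, and the shuffle and the partitioning are not counted.

```python
def odd_even_merge(x1, x2):
    """Odd-even merge of two sorted lists of equal power-of-two length."""
    m = len(x1)
    if m == 1:
        return [min(x1[0], x2[0]), max(x1[0], x2[0])]
    l1 = odd_even_merge(x1[0::2], x2[0::2])   # odd parts
    l2 = odd_even_merge(x1[1::2], x2[1::2])   # even parts
    merged = [None] * (2 * m)
    merged[0::2] = l1                         # shuffle l1 and l2
    merged[1::2] = l2
    for i in range(1, 2 * m - 1, 2):          # compare-exchange neighbours
        if merged[i] > merged[i + 1]:
            merged[i], merged[i + 1] = merged[i + 1], merged[i]
    return merged


def odd_even_merge_sort(keys):
    """Sort n keys, n a power of two, following steps 0 to 3."""
    n = len(keys)
    if n <= 1:
        return list(keys)                     # Step 0
    if n & (n - 1):
        raise ValueError("n must be a power of two")
    half = n // 2                             # Step 1: partition
    x1 = odd_even_merge_sort(keys[:half])     # Step 2: the two halves are
    x2 = odd_even_merge_sort(keys[half:])     # sorted at the same time
    return odd_even_merge(x1, x2)             # Step 3


def merge_rounds(m):
    """Comparison rounds to merge two lists of length m (assumed cost model)."""
    return 1 if m == 1 else merge_rounds(m // 2) + 1


def sort_rounds(n):
    """Comparison rounds of the sort: rounds(n) = rounds(n/2) + merge_rounds(n/2)."""
    return 0 if n <= 1 else sort_rounds(n // 2) + merge_rounds(n // 2)


if __name__ == "__main__":
    data = [25, 21, 8, 5, 2, 13, 11, 16, 23, 31, 9, 4, 18, 12, 27, 34]
    print(odd_even_merge_sort(data))
    print([sort_rounds(2 ** k) for k in range(1, 6)])
```

Under this cost model the round count for n = 2^k keys is k(k + 1)/2, which grows as log² n.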
It uses n processors. Define T(n) to be the time taken by this algorithm to sort n keys using n processors. Step 1 of this algorithm takes O(1) time. Step 2 runs in T(n/2) time. Finally, step 3 takes O(log n) time. Therefore, T(n) satisfies
T(n) = O(1) + T(n/2) + O(log n) = T(n/2) + O(log n)
which solves to T(n) = O(log² n).
Example: Consider the problem of sorting the 16 numbers 25, 21, 8, 5, 2, 13, 11, 16, 23, 31, 9, 4, 18, 12, 27, 34 using 16 processors. In step 1 of the algorithm, the input is partitioned into two parts: X'1 = 25, 21, 8, 5, 2, 13, 11, 16 and X'2 = 23, 31, 9, 4, 18, 12, 27, 34. In step 2, processors 1 to 8 work on X'1, recursively sort it, and obtain X1 = 2, 5, 8, 11, 13, 16, 21, 25. At the same time processors 9 to 16 work on X'2, sort it, and obtain X2 = 4, 9, 12, 18, 23, 27, 31, 34. In step 3, X1 and X2 are merged as shown in the example of the previous section to get the final result: 2, 4, 5, 8, 9, 11, 12, 13, 16, 18, 21, 23, 25, 27, 31, 34.
The work done by this algorithm is Θ(n log² n). Therefore, its efficiency is Θ(n log n)/Θ(n log² n) = Θ(1/log n). It has a speedup of Θ(n log n)/Θ(log² n) = Θ(n/log n).
Student Activity 5.5
Answer the following questions:
1. Describe the odd-even merge sort algorithm.
2. What is the time complexity for sorting n arbitrary keys using n EREW PRAM processors?
Summary
In parallel computing a problem is subdivided into many subproblems and submitted to many processors. The partial solutions are then combined to obtain the final result.
In PRAMs, a number of processors work synchronously and communicate with each other by means of a common global memory.
Maximum selection can be done in O(log log n) time using n common CRCW PRAM processors.
The problem of merging is to take two sorted sequences as input and produce a sorted sequence of all the elements.
Odd-even merge is a merging algorithm based on divide and conquer.
I. True and False
1. Any algorithm designed for multiprocessor machines is called a parallel algorithm.
2. The maximum speedup = 1/(f + p).
II. Fill in the blanks
1. Odd-even merge is based on the ________ technique.
2. The Boolean OR of n bits can be computed in ________ time on an n-processor common CRCW PRAM.
3. Finding the maximum of n given numbers can be done in O(1) time using an ________ CRCW PRAM.
Answers
I. True and False
1. True
2. False
II. Fill in the blanks
1. divide and conquer
2. O(1)
3. n²-processor
I. True and False
1. Shared memory models are called PRAMs.
2. Maximum selection can be done in O(log n) time using n common CRCW PRAM processors.
II. Fill in the blanks
1. A parallel algorithm is work-optimal if and only if it has ________.
2. In a CRCW PRAM, if more than one processor tries to read from the same cell, they will read the ________ information.
3. The problem of ________ is to take two sorted sequences as input and produce a sequence of all the elements.
4. Given a sequence of n keys, the problem of ________ is to rearrange this sequence into either ascending or descending order.
Exercises
1. Algorithms A and B are parallel algorithms for solving the problem of finding the maximum element in a list. Algorithm A uses n^0.5 processors and runs in time Θ(n^0.5). Algorithm B uses n processors and runs in O(log n) time. Compute the work done, speedups, and efficiencies of these two algorithms. Are these algorithms work-optimal?
2. Mr. Ultrasmart claims to have found an algorithm for the above problem that runs in time Θ(log n) using n^(3/4) processors. Is it possible?
3. Present an O(1) time n-processor common CRCW PRAM algorithm for computing the Boolean AND of n bits.
4. Input is an array of n elements. Give an O(1) time, n-processor common CRCW PRAM algorithm to check whether the array is in sorted order.
5. Solve the Boolean OR and AND problems on the CREW and EREW PRAMs. What are the time and processor bounds of your algorithms?
6. Can exercise (4) be solved in O(1) time using n processors on any of the PRAMs if the keys are arbitrary? How about if there are n² processors?
7. The algorithm A is a parallel algorithm that has two components. The first runs in Θ(log log n) time using n/log log n EREW PRAM processors. The second component runs in Θ(log n) time using n/log n CREW PRAM processors. Show that the whole algorithm can be run in Θ(log n) time using n/log n CREW PRAM processors.
8. Given an array A of n elements, we would like to find the largest i such that A[i] = 1. Give an O(1) time algorithm for this problem on an n-processor common CRCW PRAM.
9. Present an O(log log n) time algorithm for finding the maximum of n arbitrary numbers using n/log log n common CRCW PRAM processors.
10. Show that minima computation can be performed in O(log log n) time using n/log log n common CRCW PRAM processors.
11. Given two sorted sequences of length n each, how will you merge them in O(1) time using n² CRCW PRAM processors?
12. Given two sets A and B of size n each (in the form of arrays), the goal is to check whether the two sets are disjoint or not. Show how to solve this problem:
(a) In O(1) time using n² CRCW PRAM processors.
(b) In O(log n) time using n CRCW PRAM processors.
Algorithms and Advanced Data Structures
BCA202
Directorate of Distance Education
Maharshi Dayanand University
ROHTAK – 124 001
Copyright © 2002, Maharshi Dayanand University, ROHTAK
All Rights Reserved. No part of this publication may be reproduced or stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the written permission of the copyright holder.
Maharshi Dayanand University
ROHTAK – 124 001
Developed & Produced by EXCEL BOOKS, A-45 Naraina, Phase 1, New Delhi-110028
Contents
UNIT 1 TREES
Overview
Binary Trees
Traversal of Binary Trees
Binary Tree Representation
Threaded Binary Trees
Binary Search Tree
AVL Tree
Run Time Storage Management
Garbage collection
Compaction
UNIT 2 SORTING TECHNIQUES
Overview
Bubble Sort
Insertion Sort
Selection Sort
Quick Sort
Merge Sort
Radix Sort
Heap Sort
External Sort
Lower Bound theory for sorting
Selection and Adversary Argument
Minimum Spanning Tree
Prim’s Algorithm
Kruskal’s Algorithm
Shortest Path
Graph Component Algorithm
String Matching
KMP Algorithm
Boyer Moore Algorithm
UNIT 3 DYNAMIC PROGRAMMING
Overview
Principle of Optimality
Matrix Multiplication
Optimal Binary Search Trees
UNIT 4 NP COMPLETE PROBLEM
Overview
Polynomial-time
NP-Completeness and Reducibility
NP-Completeness Proofs
NP-Complete Problems
UNIT 5 PARALLEL ALGORITHMS
Overview
Parallelism
Computational Model: PRAM and Other Models
Finding Maximum Element
Merging
Sorting