
Resmi N.G.
References:
Data Structures and Algorithms: Alfred V. Aho, John E. Hopcroft, Jeffrey D. Ullman
A Practical Approach to Data Structures and Algorithms: Sanjay Pahuja

Syllabus
Non-linear structures: graphs and trees; graph and tree implementation using arrays and linked lists; binary trees; binary tree traversals (pre-order, in-order and post-order); threaded binary trees; binary search trees; AVL trees; B-trees and B+ trees; graph traversals (DFS, BFS); shortest path (Dijkstra's algorithm); minimum spanning tree (Kruskal's algorithm, Prim's algorithm).

10/25/2012

CS 09 303 Data Structures - Module 3

Trees

A tree is a collection of nodes, one of which is designated as the root, along with a relation (parenthood) that places a hierarchical structure on the nodes. A node can be of any type.

A tree can be recursively defined as:
(1) A single node by itself is a tree. This node is also the root of the tree.
(2) A tree is a finite set of one or more nodes such that there is a specially designated node called the root, and the remaining nodes are partitioned into n >= 0 disjoint sets T1, T2, ..., Tn, where each of these sets is a subtree.
Every node except the root has a parent node associated with it.

TREE TERMINOLOGIES
Root: The specially designated first node.
Degree of a node: Number of subtrees of the node.
Degree of a tree: Maximum degree of any node in the tree.
Terminal node / Leaf node: A node with degree zero.

Non-terminal node: Any node (except the root) whose degree is non-zero.
Siblings: Children of the same parent node.
Level: If a node is at level n, then its children are at level n+1. The root is at level 0, its children are at level 1, and so on.
Path: If n1, n2, ..., nk is a sequence of nodes in a tree such that ni is the parent of ni+1 for 1 <= i < k, then this sequence is called a path from node n1 to node nk.

Length of a path: One less than the number of nodes in the path.
Ancestor / Descendant of a node: If there is a path from node a to node b, then a is an ancestor of b, and b is a descendant of a.
Proper ancestor / Proper descendant: An ancestor or descendant of a node, other than the node itself, is called a proper ancestor or proper descendant respectively.

Height of a node: Length of the longest path from the node to a leaf.
Height of a tree: Height of the root.
Depth of a node: Length of the unique path from the root to that node.
Depth of a tree: One more than the maximum level of any node in the tree.

Order of nodes
Children of a node are usually ordered from left-to-right.

These are two distinct ordered trees.


If a and b are siblings, and a is to the left of b, then all the descendants of a are to the left of all the descendants of b.
A tree in which the order of the children is ignored is referred to as an unordered tree.

Tree Traversal (Orderings)


The three orderings are preorder, inorder and postorder. They are recursively defined as:
If a tree T is null (has no nodes), then the empty list is the preorder, inorder and postorder listing of T.
If T consists of a single node, then that node by itself is the preorder, inorder and postorder listing of T.

Otherwise, let T be a tree with root n and subtrees T1, T2, ..., Tk.
[Figure: root n with subtrees T1, T2, ..., Tk]
The preorder listing of the nodes of T is the root n of T, followed by the nodes of T1 in preorder, then the nodes of T2 in preorder, and so on, up to the nodes of Tk in preorder.
The inorder listing of the nodes of T is the nodes of T1 in inorder, followed by the root n of T, followed by the nodes of T2, ..., Tk, each group in inorder.
The postorder listing of the nodes of T is the nodes of T1 in postorder, then the nodes of T2 in postorder, and so on, up to Tk, all followed by node n.

Preorder Procedure
procedure PREORDER (n : node);
begin
    (1) list n;
    (2) for each child c of n, if any, in order from the left do
            PREORDER(c)
end;

[Figure: step-by-step preorder traversal of an example tree with root a; the listing grows to a b d e c f g]

[Figure: preorder listing of a larger example tree]
A, B, D, H, I, E, C, F, J, G, K

Postorder Procedure
procedure POSTORDER (n : node);
begin
    (1) for each child c of n, if any, in order from the left do
            POSTORDER(c);
    (2) list n
end;

[Figure: step-by-step postorder traversal of the same example tree; the listing grows to d e b f g c a]

[Figure: postorder listing of the larger example tree]
H, I, D, E, B, J, F, K, G, C, A

Inorder Procedure
procedure INORDER (n : node);
begin
    if n is a leaf then
        list n
    else begin
        INORDER(leftmost child of n);
        list n;
        for each child c of n, except for the leftmost, in order from the left do
            INORDER(c)
    end
end;

[Figure: step-by-step inorder traversal of the same example tree; the listing grows to d b e a f c g]

[Figure: inorder listing of the larger example tree]
H, D, I, B, E, A, J, F, C, K, G
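The three procedures above can be sketched in Python as follows. This is only an illustrative sketch: the `Node` class and the shape of the example tree (a with children b and c, b with children d and e, c with children f and g) are assumptions inferred from the listings on the slides, not something the slides state explicitly.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A node of an ordered tree; children are stored left to right."""
    label: str
    children: List["Node"] = field(default_factory=list)


def preorder(n: Node) -> List[str]:
    """List n first, then each child's subtree from the left."""
    out = [n.label]
    for c in n.children:
        out.extend(preorder(c))
    return out


def postorder(n: Node) -> List[str]:
    """List each child's subtree from the left, then n."""
    out: List[str] = []
    for c in n.children:
        out.extend(postorder(c))
    out.append(n.label)
    return out


def inorder(n: Node) -> List[str]:
    """List the leftmost subtree, then n, then the remaining subtrees."""
    if not n.children:  # n is a leaf
        return [n.label]
    out = inorder(n.children[0])
    out.append(n.label)
    for c in n.children[1:]:
        out.extend(inorder(c))
    return out


# Assumed example tree: a has children b and c; b has d and e; c has f and g.
tree = Node("a", [
    Node("b", [Node("d"), Node("e")]),
    Node("c", [Node("f"), Node("g")]),
])

print(" ".join(preorder(tree)))   # a b d e c f g
print(" ".join(postorder(tree)))  # d e b f g c a
print(" ".join(inorder(tree)))    # d b e a f c g
```

Each function visits every node once, so all three run in time proportional to the number of nodes.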

Labeled Trees and Expression Trees


Every leaf is labeled by an operand and consists of that operand alone.
Every interior node n is labeled by an operator. Suppose n is labeled by a binary operator #, and that its left child represents expression E1 and its right child represents expression E2. Then n represents the expression (E1) # (E2).

[Figure: expression tree for (a + b) * (a + c). The root n1 is labeled * and represents (a + b) * (a + c). Its children n2 and n3 are labeled + and represent (a + b) and (a + c). The leaves n4, n5, n6 and n7 represent a, b, a and c.]

The preorder listing of labels in an expression tree gives the prefix form of the expression, where the operator precedes its left and right operands. The prefix expression for (E1) # (E2), with # a binary operator, is # P1 P2, where P1 and P2 are the prefix expressions for E1 and E2 respectively.
The postorder listing of labels in an expression tree gives the postfix form of the expression, where the left and right operands precede the operator. The postfix expression for (E1) # (E2), with # a binary operator, is P1 P2 #, where P1 and P2 are the postfix expressions for E1 and E2 respectively.
The inorder listing of labels in an expression tree gives the infix expression, but without parentheses.
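As an illustration, the expression tree for (a + b) * (a + c) can be built and listed in Python. This is a minimal sketch that assumes every interior node has exactly two children; the `Expr` class and the function names are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Expr:
    """A node of an expression tree: an operand (leaf) or a binary operator."""
    label: str
    left: Optional["Expr"] = None
    right: Optional["Expr"] = None


def prefix(e: Optional[Expr]) -> List[str]:
    """Preorder listing of labels: operator, then left and right operands."""
    if e is None:
        return []
    return [e.label] + prefix(e.left) + prefix(e.right)


def postfix(e: Optional[Expr]) -> List[str]:
    """Postorder listing of labels: left and right operands, then operator."""
    if e is None:
        return []
    return postfix(e.left) + postfix(e.right) + [e.label]


def infix(e: Optional[Expr]) -> List[str]:
    """Inorder listing of labels; note that parentheses are lost."""
    if e is None:
        return []
    return infix(e.left) + [e.label] + infix(e.right)


def represented(e: Expr) -> str:
    """The expression a node represents, using the rule (E1) # (E2)."""
    if e.left is None or e.right is None:  # leaf: the operand alone
        return e.label
    return f"({represented(e.left)}) {e.label} ({represented(e.right)})"


n1 = Expr("*",
          Expr("+", Expr("a"), Expr("b")),
          Expr("+", Expr("a"), Expr("c")))

print(" ".join(prefix(n1)))   # * + a b + a c
print(" ".join(postfix(n1)))  # a b + a c + *
print(" ".join(infix(n1)))    # a + b * a + c
print(represented(n1))        # ((a) + (b)) * ((a) + (c))
```

The inorder listing shows why infix needs parentheses: `a + b * a + c` no longer says which operator applies first, whereas the prefix and postfix forms stay unambiguous.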

The ADT Tree Operations


PARENT(n, T): Returns the parent of node n in tree T. If n is the root, it returns NULL.
LEFTMOST_CHILD(n, T): Returns the leftmost child of node n in tree T. It returns NULL if n is a leaf.
RIGHT_SIBLING(n, T): Returns the right sibling of node n in tree T, defined to be that node m with the same parent p as n such that m lies immediately to the right of n in the ordering of the children of p.
LABEL(n, T): Returns the label of node n in tree T.
CREATEi(v, T1, T2, ..., Ti): For each value of i = 0, 1, 2, ..., CREATEi makes a new node r with label v and gives it i children, which are the roots of trees T1, T2, ..., Ti, in order from the left. The tree with root r is returned.
ROOT(T): Returns the node that is the root of tree T, or NULL if T is the null tree.
MAKENULL(T): Makes T the null tree.

IMPLEMENTATION OF TREES
Two representations are considered: the array representation and the linked list representation.

Array Representation (Parent Representation)
Uses the property of trees that each node has a unique parent.
Uses a linear array A where A[i] = j if node j is the parent of node i, and A[i] = 0 if node i is the root.
The LABEL operator is supported by a second array L, where L[i] is the label of node i.

With this representation, the parent of a node can be found in constant time.
Limitations:
Lacks child-of information. Given a node n, it is expensive to determine the children of n, or the height of n.
The parent pointer representation does not specify the order of the children of a node.

Definition
type
    node = integer;
    TREE = array[1..MAXNODES] of node;

Right Sibling Operation
function RIGHT_SIBLING (n : node; T : TREE) : node;
{returns the right sibling of node n in tree T}
var
    i, parent : node;
begin
    parent := T[n];
    for i := n + 1 to MAXNODES do
        {search for a node after n with the same parent}
        if T[i] = parent then
            return(i);
    return(0) {the null node is returned if no right sibling is found}
end; {RIGHT_SIBLING}
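A rough Python equivalent of the parent-array representation is sketched below. It assumes, as RIGHT_SIBLING implicitly does, that the children of a node are numbered in increasing order from left to right; the 7-node tree and the helper `children` are hypothetical and only serve the illustration.

```python
from typing import List

NULL = 0  # node 0 plays the role of the null node, as in the pseudocode

# parent[i] is the parent of node i; parent[root] is 0. Index 0 is unused so
# that nodes are numbered 1..7. Hypothetical tree: node 1 is the root with
# children 2 and 3; node 2 has children 4 and 5; node 3 has children 6 and 7.
parent: List[int] = [NULL, 0, 1, 1, 2, 2, 3, 3]


def get_parent(n: int, T: List[int]) -> int:
    """Constant-time parent lookup."""
    return T[n]


def right_sibling(n: int, T: List[int]) -> int:
    """Scan the nodes after n for the next one with the same parent."""
    p = T[n]
    for i in range(n + 1, len(T)):
        if T[i] == p:
            return i
    return NULL


def children(n: int, T: List[int]) -> List[int]:
    """Finding the children requires a scan of the whole array."""
    return [i for i in range(1, len(T)) if T[i] == n]


print(get_parent(6, parent))     # 3
print(right_sibling(4, parent))  # 5
print(right_sibling(5, parent))  # 0 (no right sibling)
print(children(1, parent))       # [2, 3]
```

The contrast between `get_parent` (one array access) and `children` (a full scan) is the limitation noted above.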

Linked List Representation of a Tree
A way of representing trees in which a list of children is formed for each node.
Header: An array of header cells, indexed by nodes. Each header points to a linked list of nodes.
The elements on the list headed by header[i] are the children of node i.

Definition
type
    node = integer;
    LIST = {appropriate definition for list of nodes};
    position = {appropriate definition for positions in lists};
    TREE = record
        header : array[1..maxnodes] of LIST;
        labels : array[1..maxnodes] of labeltype;
        root : node;
    end;

LEFTMOST_CHILD Operation
function LEFTMOST_CHILD (n : node; T : TREE) : node;
{returns the leftmost child of node n of tree T}
var
    L : LIST; {list of n's children}
begin
    L := T.header[n];
    if EMPTY(L) then {n is a leaf}
        return(0)
    else
        return(RETRIEVE(FIRST(L), L))
end; {LEFTMOST_CHILD}

Definition of cellspace for the PARENT Operation
T.header[n] points directly to the first cell of the list.
var
    cellspace : array[1..maxnodes] of record
        node : integer;
        next : integer;
    end;

PARENT Operation
function PARENT (n : node; T : TREE) : node;
{returns the parent of node n of tree T}
var
    p : node; {runs through possible parents of n}
    i : position; {runs down the list of p's children}
begin
    for p := 1 to maxnodes do begin
        i := T.header[p];
        while i <> 0 do {see if n is among the children of p}
            if cellspace[i].node = n then
                return(p)
            else
                i := cellspace[i].next
    end;
    return(0) {the null node is returned if no parent is found}
end; {PARENT}
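The list-of-children idea can be sketched in Python as below. Note the substitution: Python lists stand in for the cursor-based linked lists of the pseudocode, and the dictionary `header` and the 7-node tree are hypothetical, so this illustrates the access pattern rather than the exact storage layout.

```python
from typing import Dict, List

NULL = 0  # stands in for the null node

# header[n] is the list of children of node n, in left-to-right order.
# Hypothetical tree: node 1 is the root with children 2 and 3; node 2 has
# children 4 and 5; node 3 has children 6 and 7.
header: Dict[int, List[int]] = {
    1: [2, 3], 2: [4, 5], 3: [6, 7], 4: [], 5: [], 6: [], 7: [],
}


def leftmost_child(n: int, header: Dict[int, List[int]]) -> int:
    """First element of n's child list, or NULL if n is a leaf."""
    kids = header[n]
    return kids[0] if kids else NULL


def parent(n: int, header: Dict[int, List[int]]) -> int:
    """Run through every possible parent p and search its child list for n."""
    for p, kids in header.items():
        if n in kids:
            return p
    return NULL


print(leftmost_child(3, header))  # 6
print(leftmost_child(6, header))  # 0 (leaf)
print(parent(5, header))          # 2
print(parent(1, header))          # 0 (root has no parent)
```

Here the trade-off is the reverse of the parent array: children are found directly, while finding a parent requires searching the child lists.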

Shortcoming
Large trees cannot be created from smaller trees using the CREATEi operator, because each tree has its own array of headers for its nodes.

Leftmost-Child, Right-Sibling Representation of a Tree


Definition of Node Space
var
    nodespace : array[1..maxnodes] of record
        label : labeltype;
        header : integer; {cursor to cellspace}
    end;

Definition of Cell Space
var
    cellspace : array[1..maxnodes] of record
        label : labeltype;
        leftmost_child : integer;
        right_sibling : integer;
    end;
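A small Python sketch of the leftmost-child, right-sibling idea follows. It uses three parallel lists in place of the array of records, and the 7-node tree (a with children b and c, and so on) is a hypothetical example chosen for the illustration.

```python
from typing import Iterator, List

NULL = 0  # cursor value meaning "no node"

# Parallel arrays indexed by node number (index 0 is unused).
# Hypothetical tree: a(1) has children b(2), c(3); b has d(4), e(5);
# c has f(6), g(7).
label          = [None, "a", "b", "c", "d", "e", "f", "g"]
leftmost_child = [NULL, 2, 4, 6, NULL, NULL, NULL, NULL]
right_sibling  = [NULL, NULL, 3, NULL, 5, NULL, 7, NULL]


def children(n: int) -> Iterator[int]:
    """Walk from the leftmost child along the chain of right siblings."""
    c = leftmost_child[n]
    while c != NULL:
        yield c
        c = right_sibling[c]


def preorder(n: int) -> List[str]:
    """Preorder listing of labels using only the two cursor fields."""
    out = [label[n]]
    for c in children(n):
        out.extend(preorder(c))
    return out


print(list(children(1)))       # [2, 3]
print(" ".join(preorder(1)))   # a b d e c f g
```

Both LEFTMOST_CHILD and RIGHT_SIBLING become single array lookups, and each node needs only two cursors regardless of how many children it has.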

Binary Trees
Definition
A binary tree is a tree data structure in which each node has at most two child nodes, distinguished as the "left child" and the "right child".
A binary tree can be defined as:
(1) either an empty tree, or
(2) a tree in which every node has either no children, a left child, a right child, or both a left and a right child.
[Figure: two distinct binary trees, and an ordinary tree]

Left-skewed: A binary tree in which every node has only a left subtree is called left-skewed.
Right-skewed: A binary tree in which every node has only a right subtree is called right-skewed.

In a binary tree, the degree of every node is at most two.
A binary tree with n nodes (n >= 1) has exactly n - 1 edges.
A full binary tree is a tree in which every node other than the leaves has two children.
A complete binary tree is a binary tree in which every level, except possibly the last, is completely filled, and all nodes are as far left as possible.

[Figure: a complete binary tree]
[Figure: a full binary tree]
A full binary tree of depth k with all its leaves on the same level has 2^k - 1 nodes, k >= 0.

Binary Tree Representations


Array Representation
An array stores the nodes.
Nodes are accessed sequentially.
The root node is stored at index 1.
Size of the array = (2^d) - 1, where d is the depth of the tree.
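The array representation can be illustrated with a short Python sketch. The index arithmetic (left child at 2i, right child at 2i + 1, parent at i div 2) is the usual convention for a root stored at index 1; it is an assumption here, since the slides do not spell it out, and the five-node tree is a hypothetical example.

```python
from typing import List, Optional


def left(i: int) -> int:
    return 2 * i


def right(i: int) -> int:
    return 2 * i + 1


def parent(i: int) -> int:
    return i // 2  # parent(1) == 0, i.e. the root has no parent


def array_size(depth: int) -> int:
    """Slots needed for a binary tree of the given depth: (2^d) - 1."""
    return 2 ** depth - 1


# Hypothetical tree of depth 3 stored in slots 1..7 (slot 0 is unused).
# None marks a missing node: A has children B and C; B has a left child D;
# C has a right child G.
tree: List[Optional[str]] = [None, "A", "B", "C", "D", None, None, "G"]


def inorder(t: List[Optional[str]], i: int = 1) -> List[str]:
    """Inorder listing computed directly on the array."""
    if i >= len(t) or t[i] is None:
        return []
    return inorder(t, left(i)) + [t[i]] + inorder(t, right(i))


print(array_size(3))   # 7
print(inorder(tree))   # ['D', 'B', 'A', 'C', 'G']
```

The empty slots 5 and 6 show the cost of this layout: a sparse or skewed tree still reserves space for a completely filled one.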

Binary Tree Representations


Pointer Based Implementation
Binary tree linked list declaration:
type
    node = record
        leftchild : ^node;
        rightchild : ^node;
        parent : ^node;
    end;

Binary Trees: Pointer Based Implementation


CREATION Algorithm
function create (lefttree, righttree : node) : node;
{assumes lefttree and righttree are non-empty}
var
    root : node;
begin
    new(root);
    root^.leftchild := lefttree;
    root^.rightchild := righttree;
    root^.parent := NULL;
    lefttree^.parent := root;
    righttree^.parent := root;
    return(root)
end; {create}

INSERTION Algorithm
function insert (root : node; digit : number) : node;
{count is a global counter of the nodes inserted so far}
begin
    if root = NULL then begin
        new(root);
        root^.leftchild := NULL;
        root^.rightchild := NULL;
        root^.data := digit;
        count := count + 1
    end {if}
    else if count mod 2 = 0 then
        root^.leftchild := insert(root^.leftchild, digit)
    else
        root^.rightchild := insert(root^.rightchild, digit);
    return(root)
end; {insert}

Binary Trees Operations


TRAVERSAL
Pre-order (Node-Left-Right)
In-order (Left-Node-Right)
Post-order (Left-Right-Node)

Recursive TRAVERSAL: Pre-order Algorithm
Steps:
(1) Visit the root node
(2) Traverse the left subtree in pre-order
(3) Traverse the right subtree in pre-order

Recursive TRAVERSAL: Pre-order Algorithm
procedure preorder (root : node);
begin
    if root <> NULL then begin
        write(root^.data);
        preorder(root^.lchild);
        preorder(root^.rchild)
    end {if}
end; {preorder}

Recursive TRAVERSAL: In-order Algorithm
Steps:
(1) Traverse the left subtree in inorder
(2) Visit the root node
(3) Traverse the right subtree in inorder

procedure inorder (root : node);
begin
    if root <> NULL then begin
        inorder(root^.lchild);
        write(root^.data);
        inorder(root^.rchild)
    end {if}
end; {inorder}

Recursive TRAVERSAL: Postorder Algorithm
Steps:
(1) Traverse the left subtree in postorder
(2) Traverse the right subtree in postorder
(3) Visit the root node

procedure postorder (root : node);
begin
    if root <> NULL then begin
        postorder(root^.lchild);
        postorder(root^.rchild);
        write(root^.data)
    end {if}
end; {postorder}
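The three recursive binary tree traversals can be sketched in Python as follows. The `BNode` class and the five-node example tree are hypothetical; the field names `lchild` and `rchild` simply mirror the pseudocode.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BNode:
    data: str
    lchild: Optional["BNode"] = None
    rchild: Optional["BNode"] = None


def preorder(root: Optional[BNode]) -> List[str]:
    """Node, then left subtree, then right subtree."""
    if root is None:
        return []
    return [root.data] + preorder(root.lchild) + preorder(root.rchild)


def inorder(root: Optional[BNode]) -> List[str]:
    """Left subtree, then node, then right subtree."""
    if root is None:
        return []
    return inorder(root.lchild) + [root.data] + inorder(root.rchild)


def postorder(root: Optional[BNode]) -> List[str]:
    """Left subtree, then right subtree, then node."""
    if root is None:
        return []
    return postorder(root.lchild) + postorder(root.rchild) + [root.data]


# Hypothetical tree: A has children B and C; B has a left child D;
# C has a left child E.
root = BNode("A", BNode("B", BNode("D")), BNode("C", BNode("E")))

print("".join(preorder(root)))   # ABDCE
print("".join(inorder(root)))    # DBAEC
print("".join(postorder(root)))  # DBECA
```

Unlike the general-tree procedures earlier, the empty tree is the base case here, which is why no special test for leaves is needed.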

Binary Tree Vs General Tree


Binary tree: may be empty. Tree: cannot be empty.
Binary tree: each node has exactly two subtrees, either of which may be empty. Tree: each node may have any number of subtrees.
Binary tree: the subtrees are ordered (left and right). Tree: the subtrees are unordered.

Binary Search Tree

A binary search tree (BST) is a binary tree in which, for every node, all elements in the left subtree are smaller than the element at the node, and all elements in the right subtree are larger.

Binary Search Tree Operations


Inserting a node
Searching a node
Deleting a node

BST INSERTION
procedure INSERT (x : elementtype; var A : SET);
{add x to set A}
begin
    if A = NIL then begin
        new(A);
        A^.element := x;
        A^.leftchild := NIL;
        A^.rightchild := NIL
    end
    else if x < A^.element then
        INSERT(x, A^.leftchild)
    else if x > A^.element then
        INSERT(x, A^.rightchild)
    {if x = A^.element, do nothing; x is already in the set}
end; {INSERT}

BST SEARCH
function MEMBER (x : elementtype; A : SET) : boolean;
{returns true if x is in A, false otherwise}
begin
    if A = NIL then
        return(false)
    else if x = A^.element then
        return(true)
    else if x < A^.element then
        return(MEMBER(x, A^.leftchild))
    else {x > A^.element}
        return(MEMBER(x, A^.rightchild))
end; {MEMBER}

BST DELETION
(1) Deleting a leaf node: just delete the leaf node.
(2) Deleting a node with a single child (either a left child or a right child): replace the node with its left (or right) child.
(3) Deleting a node with both a left and a right child: replace the element of the node with that of its inorder successor (the node with the smallest value in its right subtree), and then delete the inorder successor node.

procedure DELETE (x : elementtype; var A : SET);
{remove x from set A}
begin
    if A <> NIL then
        if x < A^.element then
            DELETE(x, A^.leftchild)
        else if x > A^.element then
            DELETE(x, A^.rightchild)
        {if we reach here, x is at the node pointed to by A}
        else if (A^.leftchild = NIL) and (A^.rightchild = NIL) then
            A := NIL {delete the leaf holding x}
        else if A^.leftchild = NIL then {A has only a right child}
            A := A^.rightchild
        else if A^.rightchild = NIL then {A has only a left child}
            A := A^.leftchild
        else {both children are present}
            A^.element := DELETEMIN(A^.rightchild)
end; {DELETE}

function DELETEMIN (var A : SET) : elementtype;
{returns and removes the smallest element from set A}
begin
    if A^.leftchild = NIL then begin
        {A points to the smallest element}
        DELETEMIN := A^.element;
        A := A^.rightchild {replace the node pointed to by A by its right child}
    end
    else {the node pointed to by A has a left child}
        DELETEMIN := DELETEMIN(A^.leftchild)
end; {DELETEMIN}
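A Python sketch of DELETE and DELETEMIN is given below. As an adaptation for a language without `var` parameters, `delete` returns the new subtree root and `delete_min` returns a pair (smallest element, new subtree root). The `Node` class and the helpers `insert` and `inorder` are assumptions added only to make the example self-contained.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    element: int
    leftchild: Optional["Node"] = None
    rightchild: Optional["Node"] = None


def insert(x: int, a: Optional[Node]) -> Node:
    """Helper used only to build the example tree."""
    if a is None:
        return Node(x)
    if x < a.element:
        a.leftchild = insert(x, a.leftchild)
    elif x > a.element:
        a.rightchild = insert(x, a.rightchild)
    return a


def delete_min(a: Node) -> Tuple[int, Optional[Node]]:
    """Remove the smallest element; return it with the new subtree root."""
    if a.leftchild is None:
        # a holds the smallest element; its right child takes its place
        return a.element, a.rightchild
    smallest, a.leftchild = delete_min(a.leftchild)
    return smallest, a


def delete(x: int, a: Optional[Node]) -> Optional[Node]:
    """Remove x from the tree rooted at a; return the new subtree root."""
    if a is None:
        return None
    if x < a.element:
        a.leftchild = delete(x, a.leftchild)
    elif x > a.element:
        a.rightchild = delete(x, a.rightchild)
    elif a.leftchild is None:      # leaf, or only a right child
        return a.rightchild
    elif a.rightchild is None:     # only a left child
        return a.leftchild
    else:                          # both children are present
        a.element, a.rightchild = delete_min(a.rightchild)
    return a


def inorder(a: Optional[Node]) -> List[int]:
    if a is None:
        return []
    return inorder(a.leftchild) + [a.element] + inorder(a.rightchild)


root: Optional[Node] = None
for key in [15, 5, 3, 12, 10, 6, 7, 13, 16, 20, 18, 23]:
    root = insert(key, root)

root = delete(13, root)  # case (1): leaf
root = delete(16, root)  # case (2): single (right) child, replaced by 20
root = delete(5, root)   # case (3): two children, replaced by successor 6
print(inorder(root))     # [3, 6, 7, 10, 12, 15, 18, 20, 23]
```

The leaf case needs no separate branch in this version, because a node with no left child is replaced by its right child, which is `None` for a leaf.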

(1) Deleting the leaf node with value 13.
(2) Deleting a node with a single (right) child (the node with value 16): replace the node with its right child (with value 20).
(3) Deleting a node with both a left and a right child (the node with value 5): replace the node with its inorder successor and delete the successor node.

BST Traversal

Inorder: 3 5 6 7 10 12 13 15 16 18 20 23 (sorted in ascending order)
Preorder: 15 5 3 12 10 6 7 13 16 20 18 23
Postorder: 3 7 6 10 13 12 5 18 23 20 16 15

Deleting a node with two children from a BST


Either of two rules can be used to find the replacement node:
Rule 1: Find the largest node of the left subtree (the inorder predecessor).
Rule 2: Find the smallest node of the right subtree (the inorder successor).
For a node with two children, the inorder successor is the leftmost node of its right subtree, and the inorder predecessor is the rightmost node of its left subtree.
[Figure: deleting the node with value 7 under Rule 1 and under Rule 2. The triangles represent subtrees of arbitrary size.]

Binary Search trees Vs Arrays


Advantages:
Complexity of searching: O(log2 N) when the tree is balanced.
Better insertion time: O(log2 N) vs. O(N).
Better deletion time.
Disadvantage:
A BST requires more memory space, to store the two pointer references to the left and right child for each data element.

Application of BST
Sorting: We can sort the data by reading it, item by item, and constructing a BST as we go. The inorder traversal of a BST gives the elements in ascending order. A sorted array can be produced from a BST by traversing the tree in inorder and inserting each element sequentially into the array as it is visited.
Time complexity: ?

Threaded Trees
Binary trees have a lot of wasted space: each leaf node has 2 null pointers. We can use these pointers to help us in inorder traversals. A pointer that would otherwise be null is made to reference the next node in an inorder traversal; such pointers are called threads. To know whether a pointer is an actual link or a thread, a boolean variable can be maintained for each pointer.

Threaded Tree Example


[Figure: threaded binary search tree containing the keys 1, 3, 5, 6, 7, 8, 9, 11 and 13]

Threaded Binary Trees


A binary tree is threaded according to a particular traversal order. A binary tree is threaded by making all right child pointers that would normally be null point to the inorder successor of the node, and all left child pointers that would normally be null point to the inorder predecessor of the node.

Types:
Single threaded: each node is threaded towards either the inorder predecessor or the inorder successor.
Double threaded: each node is threaded towards both the inorder predecessor and the inorder successor.

Threads are references to the predecessors and successors of the node according to an inorder traversal.
[Figure: threaded tree whose inorder listing is A B C D E F G H I]
[Figure: threaded tree whose inorder listing is D B A E C]

Threaded Tree Traversal


We start at the leftmost node in the tree, print it, and move to its right.
If we followed a thread to the right, we print the node reached and continue to its right.
If we followed a link to the right, we go to the leftmost node of that subtree, print it, and continue.

[Figure: step-by-step inorder traversal of the threaded example tree]
Start at the leftmost node, print it. Output: 1
Follow thread to right, print node. Output: 1 3
Follow link to right, go to leftmost node and print. Output: 1 3 5
Follow thread to right, print node. Output: 1 3 5 6
Follow link to right, go to leftmost node and print. Output: 1 3 5 6 7
Follow thread to right, print node. Output: 1 3 5 6 7 8
Follow link to right, go to leftmost node and print. Output: 1 3 5 6 7 8 9
Follow thread to right, print node. Output: 1 3 5 6 7 8 9 11
Follow link to right, go to leftmost node and print. Output: 1 3 5 6 7 8 9 11 13
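The traversal just traced can be sketched in Python for a single-threaded tree (right threads only). The `TNode` class, the `right_is_thread` flag and the helper `thread` are hypothetical names, and the shape of the example tree (6 at the root, 3 and 8 below it, and so on) is an assumption reconstructed from the trace above.

```python
from typing import List, Optional


class TNode:
    """Binary tree node whose right pointer is a child link or a thread."""

    def __init__(self, key: int) -> None:
        self.key = key
        self.left: Optional["TNode"] = None
        self.right: Optional["TNode"] = None
        self.right_is_thread = False  # True: right points to the successor


def thread(node: TNode, successor: TNode) -> None:
    """Use node's otherwise-null right pointer as a thread."""
    node.right = successor
    node.right_is_thread = True


def leftmost(n: TNode) -> TNode:
    while n.left is not None:
        n = n.left
    return n


def inorder(root: Optional[TNode]) -> List[int]:
    """Inorder traversal without recursion and without a stack."""
    out: List[int] = []
    n = leftmost(root) if root is not None else None
    while n is not None:
        out.append(n.key)
        if n.right_is_thread:
            n = n.right            # thread: jump straight to the successor
        elif n.right is not None:
            n = leftmost(n.right)  # link: successor is leftmost in subtree
        else:
            n = None               # last node in inorder
    return out


# Assumed shape: 6 has children 3 and 8; 3 has children 1 and 5;
# 8 has children 7 and 11; 11 has children 9 and 13.
nodes = {k: TNode(k) for k in [6, 3, 8, 1, 5, 7, 11, 9, 13]}
nodes[6].left, nodes[6].right = nodes[3], nodes[8]
nodes[3].left, nodes[3].right = nodes[1], nodes[5]
nodes[8].left, nodes[8].right = nodes[7], nodes[11]
nodes[11].left, nodes[11].right = nodes[9], nodes[13]
thread(nodes[1], nodes[3])
thread(nodes[5], nodes[6])
thread(nodes[7], nodes[8])
thread(nodes[9], nodes[11])

print(inorder(nodes[6]))  # [1, 3, 5, 6, 7, 8, 9, 11, 13]
```

The loop alternates between the two moves described above: following a thread, or following a link and then descending to the leftmost node.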

Advantages
The traversal operation is faster than that of the unthreaded version, because a non-recursive implementation is possible with a threaded binary tree, and it can run faster.
We can efficiently determine the predecessor and successor nodes starting from any node (no stack is required).
Any node is accessible from any other node. Threads usually point upward whereas links point downward. Thus, in a threaded tree, one can move in either direction, and the nodes are in fact circularly linked.

Limitations of BSTs
BSTs can become highly unbalanced. In the worst case, the time complexity of BST operations becomes O(n).
[Figure: an unbalanced BST with root A and nodes C, F, M, Z]

Height-balanced Binary Search Trees


A self-balancing (or height-balanced) binary search tree is any binary search tree that automatically keeps its height (number of levels below the root) small after arbitrary item insertions and deletions.
A balanced tree is a BST in which every node above the last level has non-empty left and right subtrees.
The number of nodes in a binary tree of height h whose levels are all completely filled is 2^(h+1) - 1.
Hence, a binary tree of n elements is balanced if: 2^h - 1 < n <= 2^(h+1) - 1

AVL Trees
Named after the Russian mathematicians G.M. Adelson-Velskii and E.M. Landis, who discovered them in 1962.
An AVL tree is a binary search tree which has the following properties:
(1) The sub-trees of every node differ in height by at most one.
(2) Every sub-tree is an AVL tree.

Implementations of AVL tree insertion rely on adding an extra attribute, the balance factor, to each node.
Balance factor (bf) = HL - HR, where HL and HR are the heights of the left and right subtrees of the node.
An empty binary tree is an AVL tree.
A non-empty binary tree T is an AVL tree iff its left and right subtrees are AVL trees and |HL - HR| <= 1.
For an AVL tree, the balance factor HL - HR of a node can be either 0, 1 or -1.

The balance factor indicates whether the tree is:
left-heavy (the height of the left sub-tree is 1 greater than that of the right sub-tree),
balanced (both sub-trees are of the same height), or
right-heavy (the height of the right sub-tree is 1 greater than that of the left sub-tree).

[Figure: example trees annotated with balance factors: two AVL trees and one tree that is not an AVL tree]

AVL Tree Insertion


Insertion is similar to that of a BST. After inserting a node, it is necessary to check each of the node's ancestors for consistency with the rules of AVL. If inserting an element destroys the balance of any subtree, a rotation is performed to restore the balance.

For each node checked, if the balance factor remains -1, 0, or +1, then no rotations are necessary. However, if the balance factor becomes less than -1 or greater than +1, the subtree rooted at this node is unbalanced.

Theorem: When an AVL tree becomes unbalanced after an insertion, exactly one single or double rotation is required to balance the tree.
Let A be the root of the unbalanced subtree. There are four cases which need to be considered:
(1) Left-Left (LL) rotation
(2) Right-Right (RR) rotation
(3) Left-Right (LR) rotation
(4) Right-Left (RL) rotation

LL Rotation: The inserted node is in the left subtree of the left subtree of node A.
RR Rotation: The inserted node is in the right subtree of the right subtree of node A.
LR Rotation: The inserted node is in the right subtree of the left subtree of node A.
RL Rotation: The inserted node is in the left subtree of the right subtree of node A.

LL Rotation: New element 2 is inserted in the left subtree of the left subtree of A, whose bf becomes +2 after insertion.
[Figure: before insertion, A (key 6, bf +1) has left child B (key 4, bf 0) with subtrees BL and BR, and right subtree AR. After 2 is inserted into BL, B has bf +1 and A has bf +2.]
To rebalance the tree, it is rotated so as to allow B to be the root, with BL and A as its left subtree and right child respectively, and BR and AR as the left and right subtrees of A.
[Figure: after the rotation, B (key 4, bf 0) is the root, with left subtree BL (containing 2) and right child A (key 6, bf 0), whose subtrees are BR and AR.]
The pointer changes are:
A.leftchild = B.rightchild
B.rightchild = A

RR Rotation: New element 10 is inserted in the right subtree of the right subtree of A, whose bf becomes -2 after insertion.
[Figure: before insertion, A (key 6, bf -1) has left subtree AL and right child B (key 8, bf 0) with subtrees BL and BR. After 10 is inserted into BR, B has bf -1 and A has bf -2.]
To rebalance the tree, it is rotated so as to allow B to be the root, with A as its left child and BR as its right subtree, and AL and BL as the left and right subtrees of A respectively.
[Figure: after the rotation, B (key 8, bf 0) is the root, with left child A (key 6, bf 0), whose subtrees are AL and BL, and right subtree BR (containing 10).]
The pointer changes are:
A.rightchild = B.leftchild
B.leftchild = A

Left-Left case and Left-Right case:
If the balance factor of P is +2, then the left subtree outweighs the right subtree of the given node, and the balance factor of the left child L must be checked. A right rotation with P as the root is necessary.
If the balance factor of L is +1, a single right rotation (with P as the root) is needed (Left-Left case).
If the balance factor of L is -1, two different rotations are needed. The first is a left rotation with L as the root. The second is a right rotation with P as the root (Left-Right case).

Right-Right case and Right-Left case:
If the balance factor of P is -2, then the right subtree outweighs the left subtree of the given node, and the balance factor of the right child R must be checked. A left rotation with P as the root is necessary.
If the balance factor of R is -1, a single left rotation (with P as the root) is needed (Right-Right case).
If the balance factor of R is +1, two different rotations are needed. The first is a right rotation with R as the root. The second is a left rotation with P as the root (Right-Left case).

AVL Tree Deletion


If the node is a leaf or has only one child, remove it. Otherwise, replace it with either the largest node in its left subtree (in-order predecessor) or the smallest node in its right subtree (in-order successor), and remove that node. The node that was found as a replacement has at most one subtree. After deletion, retrace the path from the parent of the removed node back up to the root, adjusting the balance factors as needed.

The retracing can stop if the balance factor becomes -1 or +1, indicating that the height of that subtree has remained unchanged. If the balance factor becomes 0, then the height of the subtree has decreased by one and the retracing needs to continue. If the balance factor becomes -2 or +2, then the subtree is unbalanced and needs to be rotated to fix it.


Reference
Sanjay Pahuja, A Practical Approach to Data Structures and Algorithms, First Ed., 2007. For height balanced trees (AVL, B-trees), refer to pp. 292-296 and 301-315.


B-Trees
A B-tree is a tree data structure that generalizes the binary search tree: a node can have more than two children. It is also called a balanced M-way tree or balanced sort tree. A node of the tree may contain many records or keys, together with pointers to its children. B-trees are used in external sorting. A B-tree is not a binary tree.

Properties of B-tree of order M


(1) Each node (except the root and the leaves) has a maximum of M children and a minimum of ceil(M/2) children. The root, unless it is a leaf, may have any number of children from 2 up to M.
(2) Each node has one fewer key than it has children, with a maximum of M-1 keys.
(3) Keys are arranged in a defined order within the node. All keys in the subtree to the left of a key are predecessors of the key, and those to the right are successors of the key.
(4) All leaves are on the same level, i.e. there is no empty subtree above the level of the leaves.

B-Tree Insertion
(1) Search and find the position for insertion.
(2) Add the key to the node if the node can accommodate it.
(3) If not, i.e. if a new key is to be inserted into a full node, the node is split into two and the key with the median value is inserted in the parent node. Continue splitting upward, if required, until the root is reached. If the root itself has to be split, a new root is created and the tree grows taller by one level.
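Step (3), the split of an overfull node around its median key, might be sketched as follows for a single node. The function name `insert_into_node` and its return convention are assumptions for illustration; child pointers are left out to keep the example short.

```python
import bisect

def insert_into_node(keys, key, order):
    """Insert key into one node's sorted key list.

    A node of a B-tree of the given order holds at most order - 1 keys.
    Returns (keys, None, None) if the key fits, otherwise
    (left_keys, median, right_keys) describing the split.
    """
    keys = sorted(keys)
    bisect.insort(keys, key)             # keep the keys in order
    if len(keys) <= order - 1:
        return keys, None, None          # step (2): the node can accommodate it
    mid = len(keys) // 2
    # step (3): the median moves up to the parent, the halves become two nodes
    return keys[:mid], keys[mid], keys[mid + 1:]

fits = insert_into_node(['A', 'C'], 'G', order=5)
split = insert_into_node(['A', 'C', 'G', 'N'], 'H', order=5)
```

With order 5, adding H to the full node A C G N yields the halves A C and H N with G as the key to push into the parent.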

B-Tree Deletion
(1) Search and find the key to be deleted.
(2) If the key is in a terminal node, the key is deleted along with the appropriate pointer.
(3) If the key is not in a terminal node, it is replaced by a copy of its successor (the key with the next higher value), and the successor is then deleted from its terminal node.
(4) If, on deleting the key, the node holds fewer keys than the minimum, an underflow occurs.
(a) If either adjacent sibling contains more than the minimum number of keys, the keys are redistributed. The central key is chosen from the collection formed by the contents of the node with fewer than the minimum number of keys, the contents of the sibling with more than the minimum number of keys, and the separating key from the parent node. This central key is written back to the parent; the left and right halves are written back to the two siblings.
(b) If neither adjacent sibling contains more than the minimum number of keys, concatenation is used. The node is merged with an adjacent sibling and the separating key from the parent.


B-Tree of Order 5 Example


All internal nodes have at least ceil(5 / 2) = ceil(2.5) = 3 children (and hence at least 2 keys), other than the root node. The maximum number of children that a node can have is 5 (so that 4 is the maximum number of keys). All nodes other than the root must have a minimum of 2 keys.


B-Tree Order 5 Insertion
Insert C N G A H E K Q M F W L T Z D P R X Y S

Originally we have an empty B-tree of order 5. The first 4 letters get inserted into the same node.


When we try to insert the H, we find no room in this node, so we split it into 2 nodes, moving the median item G up into a new root node.



Inserting E, K, and Q proceeds without requiring any splits.



Inserting M requires a split.



The letters F, W, L, and T are then added without any split.



When Z is added, the rightmost leaf must be split. The median item T is moved up into the parent node.



The insertion of D causes the leftmost leaf to be split. D happens to be the median key and so it is moved up into the parent node. The letters P, R, X, and Y are then added without any split.



Finally, when S is added, the node with N, P, Q, and R splits, sending the median Q up to the parent. The parent node is full, so it splits, sending the median M up to form a new root node.
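The whole walkthrough can be replayed with a short Python sketch of B-tree insertion. It follows the split-on-overflow rule described in the slides; the class `BNode` and the function names are hypothetical, and records, deletion and disk access are left out.

```python
import bisect

ORDER = 5  # maximum number of children; a node holds at most ORDER - 1 keys

class BNode:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []   # an empty list marks a leaf

def _insert(node, key):
    """Insert key below node. Return (median, right_node) if node split, else None."""
    if node.children:
        i = bisect.bisect_left(node.keys, key)     # child subtree to descend into
        split = _insert(node.children[i], key)
        if split is None:
            return None
        median, right = split
        node.keys.insert(i, median)                # median moves up into this node
        node.children.insert(i + 1, right)
    else:
        bisect.insort(node.keys, key)
    if len(node.keys) < ORDER:
        return None                                # no overflow
    mid = len(node.keys) // 2                      # overflow: split around the median
    median = node.keys[mid]
    right = BNode(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys = node.keys[:mid]
    node.children = node.children[:mid + 1]
    return median, right

def insert(root, key):
    """Insert key and return the (possibly new) root."""
    split = _insert(root, key)
    if split is None:
        return root
    median, right = split
    return BNode([median], [root, right])          # the tree grows by one level

root = BNode()
for letter in "CNGAHEKQMFWLTZDPRXYS":
    root = insert(root, letter)

level1 = [child.keys for child in root.children]
leaves = [leaf.keys for child in root.children for leaf in child.children]
```

After the last insertion (S) the root holds the single key M, its two children hold D G and Q T, and the leaves read A C, E F, H K L, N P, R S, W X Y Z from left to right.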


B-Tree Order 5 Deletion


Initial B-Tree


Delete H
Since H is in a leaf and the leaf has more than minimum number of keys, we just remove it.


Delete T
Since T is not in a leaf, we find its successor (the next item in ascending order), which happens to be W, and move W up to replace T. What remains is to delete W from the leaf.

B+ Trees
A B+ tree is a variant of the original B-tree in which all records are stored in the leaves and all leaves are linked sequentially. The B+ tree is used as an indexing method in relational database management systems. Every key that appears in an internal node is duplicated in a leaf. Because the leaves are linked together sequentially, the entire tree may be scanned without visiting the higher nodes at all.

The B+ tree consists of two types of nodes: (1) internal nodes and (2) leaf nodes. Internal nodes point to other nodes in the tree. Leaf nodes point to data in the database using data pointers. Leaf nodes also contain an additional pointer, called the sibling pointer, which is used to improve the efficiency of certain types of search.
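One kind of search that benefits from sibling pointers is a range scan over the leaves. The Python sketch below is an illustration under stated assumptions: the `Leaf` class, the helper names and the leaf contents are hypothetical, data pointers are omitted, and the scan starts at the leftmost leaf, whereas a full B+ tree would first descend the index to the leaf containing the lower bound.

```python
class Leaf:
    def __init__(self, keys):
        self.keys = keys      # sorted keys (data pointers omitted)
        self.next = None      # sibling pointer to the leaf on the right

def link(leaves):
    """Chain the leaves left to right and return the leftmost one."""
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves[0]

def range_scan(first_leaf, low, high):
    """Collect all keys in [low, high] by following sibling pointers only."""
    found = []
    leaf = first_leaf
    while leaf is not None:
        for k in leaf.keys:
            if k > high:
                return found          # keys are sorted, nothing further can match
            if k >= low:
                found.append(k)
        leaf = leaf.next
    return found

# Hypothetical leaf level holding the keys of the insert sequence used later.
head = link([Leaf([1, 3]), Leaf([5, 6]), Leaf([7, 8]), Leaf([9, 12])])
result = range_scan(head, 5, 9)
```

No internal node is visited during the scan, which is the advantage of linking the leaves.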


The B+ tree is a balanced tree because every path from the root node to a leaf node has the same length. A balanced tree is one in which all searches for individual values require the same number of nodes to be read from the disk.

Order of a B+ Tree
The order of a B+ tree fixes the number of keys and pointers that an internal node can contain: an order of m means that an internal node can contain m-1 keys and m pointers.

Insertion in B+ Tree
Insert sequence: 5, 8, 1, 7, 3, 12, 9, 6
Order: 3

Empty Tree
The B+ tree starts as a single leaf node. A leaf node consists of one or more data pointers and a pointer to its right sibling. This leaf node is empty.

Inserting Key Value 5
To insert a key, search for the location where the key has to be inserted. Here, the B+ tree consists of a single leaf node, L1, which is empty. Hence, the key value 5 must be placed in leaf node L1.

Inserting Key Value 8
Again, search for the location where key value 8 is to be inserted. This is in leaf node L1. There is room in L1, so insert the new key.

Inserting Key Value 1
Searching for where the key value 1 should appear also results in L1, but L1 is now full as it contains the maximum of two records.

L1 must be split into two nodes. The first node will contain the first half of the keys and the second node will contain the second half of the keys.


We now require a new root node to point to each of these nodes. We create a new root node and promote the rightmost key from node L1.


Insert Key Value 7
Search for the location where key 7 is to be inserted, that is, L2. Insert key 7 into L2.

Insert Key Value 3
Search for the location where key 3 is to be inserted, that is, L1. But L1 is full and must be split.

The rightmost key in L1, i.e. 3, must now be promoted up the tree.

L1 was pointing to key 5 in B1. Therefore, all the key values in B1 to the right of and including key 5 are moved one position to the right.

Insert Key Value 12
Search for the location where key 12 is to be inserted, that is, L2. L2 is full, so it must be split.

As before, we must promote the rightmost value of L2, but B1 is full and so it must also be split.

Now the tree requires a new root node, so we promote the rightmost value of B1 into a new node.


Insert Key Value 9
Search for the location where key value 9 is to be inserted, that is, L4. Insert key 9 into L4.

Insert Key Value 6
Key value 6 should be inserted into L2, but L2 is full. Therefore, split it and promote the appropriate key value.

Deletion in B+ Tree
Deletion sequence: 9, 8, 12.


Delete Key Value 9
First, search for the location of key value 9, that is, L4. Delete 9 from L4. L4 is not less than half full, so the tree is still correct.

Delete Key Value 8
Search for key value 8, that is, L5. Deleting 8 from L5 causes L5 to underflow, that is, it becomes less than half full.

Redistribute some of the values from L2. This is possible because L2 is full and half its contents can be placed in L5.

As some entries have been removed from L2, its parent B2 must be adjusted to reflect the change.

Deleting Key Value 12
Deleting key value 12 from L4 causes L4 to underflow. However, because L5 is only half full, we cannot redistribute keys between the nodes. L4 must be deleted from the index and B2 adjusted to reflect the change.

B+ Trees
Reference: http://www.mec.ac.in/resources/notes/notes/ds/bplus.htm

GRAPH
A graph G = (V, E) is a collection of vertices and edges:
(1) Set of vertices - V
(2) Set of edges - E
where V = {V1, V2, V3, ...} and E = {e1, e2, e3, ...}.

GRAPH TERMINOLOGIES
Directed Graph (Digraph): A graph whose nodes are connected by directed edges. The arrowhead of an edge is at the vertex called the head, and the other end is called the tail.

LOOP: An edge that is incident from and into the same vertex.
ADJACENT VERTICES: Two vertices are adjacent if they are joined by an edge.
ISOLATED VERTEX: A vertex with no edge incident on it.

ISOMORPHIC: Two graphs are said to be isomorphic if they have equal numbers of vertices and edges and there is a one-to-one correspondence between their vertices that preserves adjacency.
SUBGRAPH: Let G = (V, E) be a graph. Then G1 = (V1, E1) is a subgraph of G if V1 is a subset of V and E1 is a subset of E.
SPANNING SUBGRAPH: A subgraph that contains all vertices of G. (An induced subgraph, by contrast, is a subgraph that contains every edge of G joining two of its vertices.)

DEGREE: The number of edges incident on a vertex.
WEIGHTED GRAPH: A graph in which every edge is assigned some weight or value.
PATH: A sequence of vertices in which each vertex is joined to the next by an edge.
SIMPLE PATH: A path in which all vertices, except possibly the first and last, are distinct.
CYCLE: A path of length at least 1 that begins and ends at the same vertex.

LENGTH OF PATH: The number of edges (arcs) on the path.
LABELED DIGRAPH: A digraph in which each arc or vertex has an associated label.
CONNECTED GRAPH: A graph in which there exists a path from any vertex to any other vertex. Otherwise, it is said to be disconnected. (A disconnected graph consists of two or more connected components.)

REPRESENTATION OF GRAPH
1) Sequential representation
   Adjacency matrix representation
   Incidence matrix representation
2) Linked list representation

ADJACENCY MATRIX REPRESENTATION


Order of adjacency matrix = number of vertices * number of vertices.
For a directed or undirected (unweighted) graph, the adjacency matrix conditions are:
Aij = 1 { if there is an edge from Vi to Vj }
    = 0 { if there is no edge from Vi to Vj }
For a weighted graph, the adjacency matrix conditions are:
Aij = Wij { if there is an edge from Vi to Vj, where Wij is the weight }
    = -1 { if there is no edge from Vi to Vj; this sentinel assumes that weights are non-negative }
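The conditions above translate directly into code. This Python sketch is only an illustration: the function names, the 1-based vertex numbering in the edge lists and the sample edges are assumptions.

```python
def adjacency_matrix(n, edges, directed=False):
    """Build an n x n 0/1 matrix from edges given as (i, j) with vertices 1..n."""
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i - 1][j - 1] = 1
        if not directed:
            a[j - 1][i - 1] = 1     # an undirected edge is stored in both directions
    return a

def weighted_matrix(n, edges, no_edge=-1):
    """Build a weighted matrix for a digraph; no_edge marks a missing edge."""
    w = [[no_edge] * n for _ in range(n)]
    for i, j, weight in edges:
        w[i - 1][j - 1] = weight
    return w

und = adjacency_matrix(3, [(1, 2), (2, 3)])
dig = adjacency_matrix(3, [(1, 2), (2, 3)], directed=True)
wtd = weighted_matrix(3, [(1, 2, 7), (2, 3, 4)])
```

For an undirected graph the resulting matrix is symmetric, which is why only the directed version distinguishes A[i][j] from A[j][i].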

INCIDENCE MATRIX REPRESENTATION

Order of incidence matrix = number of vertices * number of edges.
For an undirected graph, the incidence matrix conditions are:
Aij = 1 { if edge ej is incident on vertex Vi }
    = 0 { otherwise }
For a directed graph, the incidence matrix conditions are:
Aij = 1 { if edge ej goes outward from vertex Vi }
    = -1 { if edge ej comes into vertex Vi }
    = 0 { otherwise }

LINKED LIST REPRESENTATION


For a directed or undirected graph, store all the vertices of the graph in a list, and store each vertex's adjacent vertices as nodes of a linked list attached to it. For a weighted graph, each linked list node contains an extra field, the weight.

Example
Suppose the adjacency matrix for a graph is:
     1  2  3  4
1    1  1  1  1
2    1  0  0  0
3    0  1  0  1
4    0  1  1  0

The corresponding adjacency list representation is:
1 -> 1 -> 2 -> 3 -> 4
2 -> 1
3 -> 2 -> 4
4 -> 2 -> 3
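The conversion from matrix to list can be checked mechanically. The Python sketch below assumes the matrix rows read 1 1 1 1, 1 0 0 0, 0 1 0 1 and 0 1 1 0 for vertices 1 to 4, which is the reading consistent with the adjacency list given above; the function name is illustrative.

```python
matrix = [
    [1, 1, 1, 1],   # row for vertex 1
    [1, 0, 0, 0],   # row for vertex 2
    [0, 1, 0, 1],   # row for vertex 3
    [0, 1, 1, 0],   # row for vertex 4
]

def to_adjacency_list(a):
    """Map each vertex (numbered from 1) to the list of vertices in its row."""
    return {
        i + 1: [j + 1 for j, bit in enumerate(row) if bit]
        for i, row in enumerate(a)
    }

adj = to_adjacency_list(matrix)
```

A Python list stands in for the linked list of the slides; each dictionary entry corresponds to one header in the list array.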

Operations on Graphs
FIRST(v): returns the index for the first vertex adjacent to v.
NEXT(v, i): returns the index after index i for the vertices adjacent to v.
VERTEX(v, i): returns the vertex with index i among the vertices adjacent to v.

(1) FIRST(v):
var A : array[1..n, 1..n] of boolean;

function FIRST (v : integer) : integer;
var i : integer;
begin
    for i := 1 to n do
        if A[v, i] then
            return (i);
    return (0) { if we reach here, v has no adjacent vertex }
end; { FIRST }

(2) NEXT(v, i):
function NEXT (v : integer; i : integer) : integer;
var j : integer;
begin
    for j := i+1 to n do
        if A[v, j] then
            return (j);
    return (0)
end; { NEXT }

(3) VERTEX(v, i): returns the vertex with index i among the vertices adjacent to v.

(4) TRAVERSAL OF ADJACENT VERTICES OF v
i := FIRST(v);
while i <> 0 do begin { FIRST and NEXT return 0 when no adjacent vertex remains }
    w := VERTEX(v, i);
    { some action on w }
    i := NEXT(v, i)
end; { while }
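The three operations and the traversal loop might look as follows in Python over a boolean adjacency matrix. The sample graph, the 1-based indexing with an unused row and column 0, and the names `next_index` and `neighbours` are assumptions for illustration; 0 plays the role of the null index, as in the FIRST function above.

```python
N = 4
# 1-based adjacency matrix; row 0 and column 0 are unused padding.
A = [[False] * (N + 1) for _ in range(N + 1)]
for v, w in [(1, 2), (1, 4), (3, 2)]:      # hypothetical arcs
    A[v][w] = True

def first(v):
    """Index of the first vertex adjacent to v, or 0 if there is none."""
    for i in range(1, N + 1):
        if A[v][i]:
            return i
    return 0

def next_index(v, i):
    """Index after i among the vertices adjacent to v, or 0 if none remains."""
    for j in range(i + 1, N + 1):
        if A[v][j]:
            return j
    return 0

def vertex(v, i):
    """With an adjacency matrix, the index of a neighbour is the vertex itself."""
    return i

def neighbours(v):
    """The traversal loop: visit every vertex adjacent to v in index order."""
    out = []
    i = first(v)
    while i != 0:
        out.append(vertex(v, i))           # "some action on w"
        i = next_index(v, i)
    return out
```

With an adjacency list representation the same three operations would walk the linked list of v instead of scanning a matrix row.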


(5) CREATING A GRAPH
Steps
(1) Input the total number of vertices in the graph, say n.
(2) Allocate memory dynamically for the vertices, to be stored in a list array.
(3) Input the first vertex and the vertices to which it has an edge, linking a node for each of them from its entry in the list array.
(4) Repeat the process, advancing through the list array, to add the other vertices and edges.
(5) Exit.

(6) SEARCHING & DELETING FROM A GRAPH
Steps
(1) Input the edge to be searched for.
(2) Search for the initial vertex of the edge in the list array by incrementing the array index.
(3) Once it is found, search through its linked list for the terminal vertex of the edge.
(4) If found, display "The edge is present in the graph".
(5) Then delete the node where the terminal vertex is found and relink the linked list.
(6) Exit.

(7) TRAVERSING A GRAPH
Breadth First Search (BFS)
(1) Input the vertices of the graph and its edges, G = (V, E).
(2) Input the source vertex and mark it as visited.
(3) Add the source vertex to the queue.
(4) Repeat steps 5 and 6 until the queue is empty (i.e. front > rear).
(5) Remove the front element of the queue and display it.
(6) Add to the queue each neighbour of the element just removed that has not yet been visited, and mark it as visited.
(7) Exit.

Breadth First Search (BFS) ALGORITHM
procedure bfs (v : vertex);
{ bfs visits all vertices connected to v using breadth-first search }
var
    Q : QUEUE of vertex;
    x, y : vertex;
begin
    mark[v] := visited;
    ENQUEUE(v, Q);
    while not EMPTY(Q) do begin
        x := FRONT(Q);
        DEQUEUE(Q);
        for each vertex y adjacent to x do
            if mark[y] = unvisited then begin
                mark[y] := visited;
                ENQUEUE(y, Q)
            end { if }
    end { while }
end; { bfs }
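A runnable Python counterpart of the bfs procedure is sketched below. The dictionary-of-lists graph and its vertex names are hypothetical; `collections.deque` stands in for the QUEUE type and a set for the mark array.

```python
from collections import deque

def bfs(adj, start):
    """Return the vertices reachable from start in breadth-first order."""
    visited = {start}              # mark[v] := visited
    order = []
    queue = deque([start])         # ENQUEUE(v, Q)
    while queue:
        x = queue.popleft()        # FRONT + DEQUEUE
        order.append(x)
        for y in adj[x]:
            if y not in visited:   # mark[y] = unvisited
                visited.add(y)
                queue.append(y)
    return order

# Hypothetical undirected graph given as adjacency lists.
graph = {
    'A': ['B', 'C'],
    'B': ['A', 'D'],
    'C': ['A', 'D'],
    'D': ['B', 'C', 'E'],
    'E': ['D'],
}
order = bfs(graph, 'A')
```

Vertices are marked when they are enqueued rather than when they are dequeued, so no vertex can enter the queue twice.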

Example


Depth First Search (DFS) ALGORITHM STEPS
(1) Input the vertices and edges of the graph G = (V, E).
(2) Input the source vertex and assign it to the variable S.
(3) Push the source vertex onto the stack.
(4) Repeat steps 5 and 6 until the stack is empty.
(5) Pop the top element of the stack and display it.
(6) Push each vertex adjacent to the element just popped, provided it is not in the stack and has not been displayed (i.e. not visited).
(7) Exit.

Depth First Search (DFS) ALGORITHM
procedure DFS (v : vertex);
var w : vertex;
begin
    mark[v] := visited;
    for each vertex w on L[v] do
        if mark[w] = unvisited then
            DFS(w)
end; { DFS }
L[v] is the adjacency list of v.
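The recursive procedure carries over to Python almost line by line. In this sketch the graph and its vertex names are hypothetical, and a set replaces the mark array; very deep graphs would need an explicit stack because of Python's recursion limit.

```python
def dfs(adj, start):
    """Return the vertices reachable from start in depth-first order."""
    visited = set()
    order = []

    def visit(v):
        visited.add(v)             # mark[v] := visited
        order.append(v)
        for w in adj[v]:           # for each vertex w on L[v]
            if w not in visited:
                visit(w)

    visit(start)
    return order

# Hypothetical undirected graph given as adjacency lists.
graph = {
    'A': ['B', 'C'],
    'B': ['A', 'D'],
    'C': ['A', 'D'],
    'D': ['B', 'C', 'E'],
    'E': ['D'],
}
order = dfs(graph, 'A')
```

Starting at A, the search goes A, B, D, then reaches C through D before backtracking to D for E.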

Tree Searches
A tree search starts at the root and explores nodes from there, looking for a goal node (a node that satisfies certain conditions, depending on the problem).

Depth First Search


A depth-first search (DFS) explores a path all the way to a leaf before backtracking and exploring another path. For example, after searching A, then B, then D, the search backtracks and tries another path from B.
Nodes are explored in the order A B D E H L M N I O P C F G J K Q.
N will be found before J.

Breadth First Search


A breadth-first search (BFS) explores nodes nearest the root before exploring nodes further away. For example, after searching A, then B, then C, the search proceeds with D, E, F, G.
Nodes are explored in the order A B C D E F G H I J K L M N O P Q.
J will be found before N.

How to do DFS?
Put the root node on a stack;
while (stack is not empty) do begin
    remove a node from the stack;
    if (node is a goal node) return success;
    put all children of node onto the stack;
end
return failure;

At each step, the stack holds the nodes still waiting to be explored along the current path from the root. The stack space required grows with the length of the longest possible path, that is, the maximum depth of search.

How to do BFS?
Put the root node on a queue;
while (queue is not empty) do begin
    remove a node from the queue;
    if (node is a goal node) return success;
    put all children of node onto the queue;
end
return failure;

Just before starting to explore level n, the queue holds all the nodes at level n. In a typical tree, the number of nodes at each level increases exponentially with the depth.
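The stack-based and queue-based loops differ only in which end of the container the next node is taken from. The Python sketch below makes that explicit with one function and a flag; the small tree, the function name `search` and the goal values are hypothetical.

```python
from collections import deque

# Hypothetical tree: each node maps to its children, left to right.
tree = {
    'A': ['B', 'C'],
    'B': ['D', 'E'],
    'C': ['F', 'G'],
    'D': [], 'E': [], 'F': [], 'G': [],
}

def search(root, goal, depth_first):
    """Return the nodes explored up to and including goal, or None on failure."""
    frontier = deque([root])
    explored = []
    while frontier:
        # A stack takes the newest node, a queue takes the oldest.
        node = frontier.pop() if depth_first else frontier.popleft()
        explored.append(node)
        if node == goal:
            return explored                      # success
        children = tree[node]
        # For DFS, push children in reverse so the leftmost is popped first.
        frontier.extend(reversed(children) if depth_first else children)
    return None                                  # failure

dfs_path = search('A', 'F', depth_first=True)
bfs_path = search('A', 'F', depth_first=False)
```

On this tree DFS explores the whole subtree under B before reaching C, while BFS finishes level 1 (B, C) and level 2 up to F in order.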

Greedy Algorithms


Minimum Spanning Tree/Minimum Cost Spanning Tree (MST)


A spanning tree for a connected graph G = (V, E) is a subgraph G1 = (V1, E1) of G that is a tree and contains all the vertices of G.

The vertex set V1 is the same as that of graph G.
The edge set E1 is a subset of E.
There is no cycle.
A graph can have many different spanning trees.

A weighted tree is one in which each edge is assigned a weight. Weight or cost of a spanning tree is the sum of weights of its edges. A Minimum Spanning Tree or Minimum-Weight Spanning Tree is a spanning tree with weight less than or equal to the weight of every other spanning tree.


This figure shows there may be more than one minimum spanning tree in a graph. In the figure, the two trees below the graph are two possibilities of minimum spanning tree of the given graph.

MST Algorithms
Kruskal's Algorithm
Finds an MST for a connected weighted undirected graph. If the graph is not connected, then it finds a minimum spanning forest (a minimum spanning tree for each connected component).

Prim's Algorithm
Finds an MST for a connected weighted undirected graph. Prim's algorithm requires the graph to be connected.

KRUSKAL'S ALGORITHM
Kruskal's algorithm works by growing the minimum spanning tree one edge at a time, adding the lowest cost edge that does not create a cycle.

It starts with each vertex as a separate tree and merges these trees together by repeatedly adding the lowest cost edge that merges two distinct subtrees (i.e. does not create a cycle).

KRUSKAL'S ALGORITHM
The algorithm builds the MST within a forest (a forest is a disjoint union of trees). Initially, each vertex is in its own tree in the forest. Then, the algorithm considers each edge in order of increasing weight. If an edge (u, v) connects two different trees, then (u, v) is added to the set of edges of the MST, and the two trees connected by the edge (u, v) are merged into a single tree. On the other hand, if the edge (u, v) connects two vertices in the same tree, then edge (u, v) is discarded.

Kruskal(G) - Informal Algorithm
Sort the edges in order of increasing weight
count = 0
while (count < n-1) do
    get next edge (u,v)
    if (component(u) <> component(v))
        add edge to T
        merge component(u) and component(v)
        count = count + 1
    end
end

KRUSKAL'S ALGORITHM: OPERATIONS REQUIRED

DELETEMIN: deletes the edge of minimum cost from a PRIORITY QUEUE.
MERGE(A, B, C): merges components A and B in C and calls the result A or B arbitrarily.
FIND(v, C): returns the name of the component of C of which vertex v is a member. This operation is used to determine whether the two vertices of an edge are in the same or in different components.
INITIAL(A, v, C): makes A the name of the component in C containing only one vertex, namely v.

procedure Kruskal (V : SET of vertex; E : SET of edge; var T : SET of edge);
var
    ncomp : integer; { current number of components }
    edges : PRIORITYQUEUE; { the set of edges }
    components : MFSET; { the set V grouped into a MERGE-FIND set of components }
    u, v : vertex;
    e : edge;
    nextcomp : integer; { name for new component }
    ucomp, vcomp : integer; { component names }
begin
    MAKENULL(T);
    MAKENULL(edges);
    nextcomp := 0;
    ncomp := number of members of V;
    for v in V do begin { initialize a component to contain one vertex of V }
        nextcomp := nextcomp + 1;
        INITIAL(nextcomp, v, components)
    end; { for }
    for e in E do { initialize priority queue of edges }
        INSERT(e, edges);
    while ncomp > 1 do begin
        e := DELETEMIN(edges);
        let e = (u, v);
        ucomp := FIND(u, components);
        vcomp := FIND(v, components);
        if ucomp <> vcomp then begin { e connects two different components }
            MERGE(ucomp, vcomp, components);
            ncomp := ncomp - 1;
            INSERT(e, T)
        end { if }
    end { while }
end; { Kruskal }
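A compact Python version of the same idea is sketched here. It departs from the procedure above in two labeled ways: the edges are sorted once instead of being drawn from a priority queue, and the MERGE-FIND set is a plain parent array with path halving. The edge list and the function name are hypothetical.

```python
def kruskal(n, edges):
    """edges: list of (weight, u, v) with vertices numbered 1..n.

    Returns (tree_edges, total_cost), where tree_edges holds (u, v, weight).
    """
    parent = list(range(n + 1))            # INITIAL: every vertex is its own component

    def find(v):                           # FIND with path halving
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree, total = [], 0
    for w, u, v in sorted(edges):          # cheapest edge first (replaces DELETEMIN)
        ru, rv = find(u), find(v)
        if ru != rv:                       # e connects two different components
            parent[ru] = rv                # MERGE
            tree.append((u, v, w))
            total += w
            if len(tree) == n - 1:         # spanning tree is complete
                break
    return tree, total

# Hypothetical graph on 4 vertices.
edges = [(1, 1, 2), (3, 1, 3), (2, 2, 3), (4, 2, 4), (5, 3, 4)]
mst, cost = kruskal(4, edges)
```

The edge (1, 3) of weight 3 is discarded because vertices 1 and 3 are already in the same component when it is considered.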

PRIM'S ALGORITHM for MST

Prim's algorithm starts with an arbitrary vertex v and "grows" a tree from it, repeatedly finding the lowest-cost edge that will link some new vertex into this tree.

Algorithm (Informal)
(1) Start with a tree which contains only one node.
(2) Identify a node (outside the tree) which is closest to the tree, add the minimum weight edge from that node to some node in the tree, and incorporate the additional node as a part of the tree.
(3) If there are fewer than n-1 edges in the tree, go to step 2.

PRIM'S ALGORITHM
procedure Prim (G : graph; var T : set of edges);
{ Prim constructs a minimum-cost spanning tree T for G }
var
    U : set of vertices;
    u, v : vertex;
begin
    T := ∅;
    U := {1};
    while U <> V do begin
        let (u, v) be a lowest cost edge such that u is in U and v is in V-U;
        T := T ∪ {(u, v)};
        U := U ∪ {v}
    end
end; { Prim }


PRIM'S ALGORITHM FOR MST

procedure Prim (C : array[1..n, 1..n] of real);
{ Prim prints the edges of an MST for a graph with vertices {1, 2, ..., n} and cost matrix C on edges }
var
    LOWCOST : array[1..n] of real;
    CLOSEST : array[1..n] of integer;
    i, j, k : integer;
    min : real;
    { i and j are indices. During a scan of the LOWCOST array, k is the index of the closest vertex found so far, and min = LOWCOST[k] }
begin
    for i := 2 to n do begin { initialize with only vertex 1 in the set U }
        LOWCOST[i] := C[1, i];
        CLOSEST[i] := 1
    end; { for }
    for i := 2 to n do begin
        { find the closest vertex k outside of U to some vertex in U }
        min := LOWCOST[2];
        k := 2;
        for j := 3 to n do
            if LOWCOST[j] < min then begin
                min := LOWCOST[j];
                k := j
            end;
        writeln(k, CLOSEST[k]); { print edge }
        LOWCOST[k] := infinity; { k is added to U }
        for j := 2 to n do { adjust costs to U }
            if (C[k, j] < LOWCOST[j]) and (LOWCOST[j] < infinity) then begin
                LOWCOST[j] := C[k, j];
                CLOSEST[j] := k
            end { if }
    end { for }
end; { Prim }

where
CLOSEST[i] gives the vertex in U that is currently closest to vertex i in V-U.
LOWCOST[i] is the cost of the edge (i, CLOSEST[i]).
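The LOWCOST and CLOSEST arrays can be exercised in a short Python sketch. It is a variation rather than a transcription: vertices are numbered from 0, and a separate `in_tree` flag records membership in U instead of overwriting LOWCOST[k] with infinity. The cost matrix is hypothetical and the graph is assumed to be connected.

```python
INF = float('inf')

def prim(c):
    """c: symmetric n x n cost matrix, INF where there is no edge.

    Returns the MST edges as (closest_tree_vertex, new_vertex, cost),
    starting from vertex 0. Assumes the graph is connected.
    """
    n = len(c)
    lowcost = [c[0][i] for i in range(n)]   # cheapest known edge into the tree
    closest = [0] * n                       # tree endpoint of that edge
    in_tree = [False] * n
    in_tree[0] = True                       # U = {0}
    edges = []
    for _ in range(n - 1):
        # find the vertex k outside U that is closest to some vertex in U
        k = min((j for j in range(n) if not in_tree[j]), key=lambda j: lowcost[j])
        edges.append((closest[k], k, lowcost[k]))
        in_tree[k] = True                   # k is added to U
        for j in range(n):                  # adjust costs to U
            if not in_tree[j] and c[k][j] < lowcost[j]:
                lowcost[j] = c[k][j]
                closest[j] = k
    return edges

# Hypothetical graph: edges 0-1 (1), 0-2 (3), 1-2 (2), 1-3 (4), 2-3 (5).
cost = [
    [INF, 1,   3,   INF],
    [1,   INF, 2,   4],
    [3,   2,   INF, 5],
    [INF, 4,   5,   INF],
]
mst = prim(cost)
```

Each round scans all n vertices twice, so the running time is proportional to n squared, matching the array-based procedure.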


Start with the tree < { v1 }, { } >.
Vertex v4 is closest to tree . . .
v3 is closest to tree . . .
v2 and v5 are closest to tree; pick v5, say . . .
v6 is closest to tree . . .
v2 is closest (and only remaining) vertex . . .

SHORTEST PATH Algorithms for Directed Graphs


Single-Source Shortest Paths: algorithm to determine the cost of the shortest path from the source to every other vertex in V.
    Dijkstra's Algorithm

All-Pairs Shortest Paths: algorithm to find, for each ordered pair of vertices (v, w), the shortest path from v to w.
    Floyd's Algorithm

Dijkstra's Algorithm
The algorithm works by maintaining a set S of vertices whose shortest distance from the source is already known. Initially, S contains only the source vertex. At each step, we add to S a remaining vertex v whose distance from the source is as short as possible. Assuming that all arcs have nonnegative costs, we can always find a shortest path from the source to v that passes only through vertices in S.

At each step of the algorithm, we use an array D to record the length of the shortest path to each vertex. Once S includes all vertices, all paths are shortest paths, so D will hold the shortest distance from the source to each vertex.


procedure Dijkstra;
{ Dijkstra computes the cost of the shortest paths from vertex 1 to every vertex of a directed graph }
begin
    S := {1};
    for i := 2 to n do
        D[i] := C[1, i]; { initialize D }
    for i := 1 to n-1 do begin
        choose a vertex w in V-S such that D[w] is a minimum;
        add w to S;
        for each vertex v in V-S do
            D[v] := min(D[v], D[w] + C[w, v])
    end { for }
end; { Dijkstra }
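The procedure can be run as written once the cost matrix is given. The Python sketch below keeps the array-based structure; the 0-based vertex numbering, the function name and the sample digraph are assumptions for illustration.

```python
INF = float('inf')

def dijkstra(c, source=0):
    """c: n x n arc cost matrix with nonnegative costs, INF where there is no arc.

    Returns d, where d[v] is the cost of the shortest path from source to v.
    """
    n = len(c)
    d = [c[source][i] for i in range(n)]    # initialize D from the source's row
    d[source] = 0
    s = {source}                            # S := {source}
    for _ in range(n - 1):
        # choose a vertex w in V-S such that D[w] is a minimum
        w = min((v for v in range(n) if v not in s), key=lambda v: d[v])
        s.add(w)
        for v in range(n):
            if v not in s:
                d[v] = min(d[v], d[w] + c[w][v])
    return d

# Hypothetical digraph on 5 vertices (0..4).
cost = [[INF] * 5 for _ in range(5)]
for u, v, w in [(0, 1, 10), (0, 3, 30), (0, 4, 100),
                (1, 2, 50), (2, 4, 10), (3, 2, 20), (3, 4, 60)]:
    cost[u][v] = w
dist = dijkstra(cost)
```

Vertex 4 is first reachable at cost 100 directly, then at 90 through vertex 3, and finally at 60 through vertices 3 and 2, which shows how D is tightened as S grows.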

FLOYD'S ALGORITHM
procedure Floyd (var A : array[1..n, 1..n] of real; C : array[1..n, 1..n] of real);
{ Floyd computes shortest path matrix A given arc cost matrix C }
var i, j, k : integer;
begin
    for i := 1 to n do
        for j := 1 to n do
            A[i, j] := C[i, j];
    for i := 1 to n do
        A[i, i] := 0;
    for k := 1 to n do
        for i := 1 to n do
            for j := 1 to n do
                if A[i, k] + A[k, j] < A[i, j] then
                    A[i, j] := A[i, k] + A[k, j]
end; { Floyd }
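A direct Python rendering of the triple loop is sketched below. The 0-based indexing, the use of `float('inf')` for a missing arc and the three-vertex digraph are assumptions for illustration.

```python
INF = float('inf')

def floyd(c):
    """Return the all-pairs shortest path cost matrix for arc cost matrix c."""
    n = len(c)
    a = [row[:] for row in c]      # A[i, j] := C[i, j]
    for i in range(n):
        a[i][i] = 0                # A[i, i] := 0
    for k in range(n):             # allow vertex k as an intermediate point
        for i in range(n):
            for j in range(n):
                if a[i][k] + a[k][j] < a[i][j]:
                    a[i][j] = a[i][k] + a[k][j]
    return a

# Hypothetical digraph: 0->1 costs 8, 0->2 costs 5, 1->0 costs 3, 2->1 costs 2.
cost = [
    [INF, 8,   5],
    [3,   INF, INF],
    [INF, 2,   INF],
]
shortest = floyd(cost)
```

The entry for 0 to 1 drops from 8 to 7 once vertex 2 is allowed as an intermediate point, since 0 -> 2 -> 1 costs 5 + 2.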


Thank You
