
B+-Tree

COMP171 Tutorial 9
Deficiency of AVL Tree

 Performs very poorly when the data is too large to fit in main memory
 Too many disk accesses if the data is stored on disk
 Disk access is much slower than main-memory access
 The no. of disk accesses is proportional to the depth of the AVL tree
Alternative Solution — M-ary Tree
 Every node has multiple children (M-ary means M branches)
 Depth decreases as branching increases
 Depth = O(log_M n) instead of O(log_2 n)
 Therefore, the no. of disk accesses also decreases
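A rough feel for the depth reduction, as a minimal sketch. The helper `min_depth` is a hypothetical name, and it assumes an idealized, completely full tree (a real AVL tree can be somewhat deeper than log_2 n).

```python
def min_depth(n: int, branching: int) -> int:
    """Smallest depth d (root at depth 0) such that branching**d >= n."""
    depth, capacity = 0, 1
    while capacity < n:
        capacity *= branching   # one more level multiplies the reachable nodes
        depth += 1
    return depth

n = 1_000_000
print(min_depth(n, 2))     # binary branching: 20 levels
print(min_depth(n, 100))   # 100-way branching: 3 levels
```

If each level costs one disk access, the second tree needs far fewer accesses per search.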
B+-Tree — Basic Information
 An M-ary tree (M>3)
 Leaves contain data items
   all are at the same depth
   each leaf has ⌈L/2⌉ to L data items (usually L << M in practice)
 Internal nodes contain searching keys
   each node has ⌈M/2⌉ to M children
   each node has ⌈M/2⌉-1 to M-1 searching keys
   key i is the smallest key in subtree i+1
 Root
   can be a single leaf, or has 2 to M children
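The rules above can be written down as a small checker. This is only a sketch: the class names `Leaf` and `Internal` and the helpers `smallest`, `leaf_depths` and `is_valid` are hypothetical, and the sample tree is the M=L=4 tree used in the searching examples of this tutorial.

```python
from dataclasses import dataclass

@dataclass
class Leaf:
    keys: list

@dataclass
class Internal:
    keys: list       # searching keys
    children: list   # always len(keys) + 1 subtrees

def smallest(node):
    """Smallest key stored anywhere under node."""
    while isinstance(node, Internal):
        node = node.children[0]
    return node.keys[0]

def leaf_depths(node, depth=0):
    """Set of depths at which leaves occur (a single value if all leaves align)."""
    if isinstance(node, Leaf):
        return {depth}
    return set().union(*(leaf_depths(c, depth + 1) for c in node.children))

def is_valid(node, M, L, is_root=True):
    """Check the occupancy and key rules for the subtree rooted at node."""
    if isinstance(node, Leaf):
        low = 1 if is_root else (L + 1) // 2      # ceil(L/2)
        return low <= len(node.keys) <= L
    low = 2 if is_root else (M + 1) // 2          # ceil(M/2)
    if not low <= len(node.children) <= M:
        return False
    if len(node.keys) != len(node.children) - 1:
        return False
    # key i must be the smallest key in subtree i+1
    if any(k != smallest(c) for k, c in zip(node.keys, node.children[1:])):
        return False
    return all(is_valid(c, M, L, False) for c in node.children)

tree = Internal(["P"], [
    Internal(["F", "J"], [Leaf(list("ABCD")), Leaf(list("FG")), Leaf(list("JKLM"))]),
    Internal(["U"], [Leaf(list("PQ")), Leaf(list("UV"))]),
])
print(is_valid(tree, 4, 4), leaf_depths(tree))   # True {2}
```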
B+-Tree — Example
[Figure: example B+-tree, with labels marking the left child, right child, left subtree and right subtree of key J]
 M=L=4
 Records are at the leaves
 Nodes are at least half-full, so the tree will not degenerate into a simple binary tree or even a linked list
 Each key has a left and a right child pointer, and correspondingly a left and a right subtree
B+-Tree — In Practice
 Each internal node & leaf has the size of one I/O block of data
   minimizes tree depth and hence the no. of disk accesses
 The first one or two levels of the tree are stored in main memory to speed up searching
 Most internal nodes have fewer than (M-1) searching keys most of the time
   huge space wastage for main memory, but not a big problem for disk
B+-Tree — Searching Example 1

 Search G
[P]
  [F J]
    [A B C D] [F G] [J K L M]
  [U]
    [P Q] [U V]
Found!!
M=L=4
B+-Tree — Searching Example 2

 Search H
[P]
  [F J]
    [A B C D] [F G] [J K L M]
  [U]
    [P Q] [U V]
Not Found!!
M=L=4
B+-Tree — Searching Algorithm
 Searching KEY:
   Start from the root
   If an internal node is reached:
     Search KEY among the keys in that node (linear search or binary search)
     If KEY < smallest key, follow the leftmost child pointer down
     If KEY >= largest key, follow the rightmost child pointer down
     If Ki <= KEY < Kj for adjacent keys Ki and Kj, follow the child pointer between Ki and Kj
   If a leaf is reached:
     Search KEY among the keys stored in that leaf (linear search or binary search)
     If found, return the corresponding record; otherwise report not found
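The search steps can be sketched as follows. The class and function names are hypothetical, and the sketch returns True/False where a real tree would return the record. `bisect_right` counts the keys that are <= KEY, which is exactly the index of the child pointer to follow in all three cases above.

```python
from bisect import bisect_right

class Leaf:
    def __init__(self, keys):
        self.keys = list(keys)

class Internal:
    def __init__(self, keys, children):
        self.keys, self.children = list(keys), list(children)

def search(node, key):
    """Walk from the root to a leaf, then look for key in that leaf."""
    while isinstance(node, Internal):
        # binary search among the keys of this internal node
        node = node.children[bisect_right(node.keys, key)]
    return key in node.keys

# The M=L=4 tree from the searching examples.
tree = Internal("P", [
    Internal("FJ", [Leaf("ABCD"), Leaf("FG"), Leaf("JKLM")]),
    Internal("U", [Leaf("PQ"), Leaf("UV")]),
])
print(search(tree, "G"), search(tree, "H"))   # True False
```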
B+-Tree — Insertion Example 1

 Insert H
[P]
  [F J]
    [A B C D] [F G H] [J K L M]
  [U]
    [P Q] [U V]
Done!!
M=L=4
B+-Tree — Insertion Algorithm

 Insert KEY:
   Search for KEY using the search operation
     we will reach a leaf and get “Not Found”
   Insert KEY into that leaf
     If the leaf contains < L keys, just insert KEY into it
     If the leaf contains L keys, splitting is necessary
B+-Tree — Insertion Example 2

 Insert E
[P]
  [F J]
    [A B C D E] [F G H] [J K L M]
  [U]
    [P Q] [U V]
Oops!! The leaf [A B C D E] now holds more than L keys
M=L=4
Splitting is needed
B+-Tree — Insertion Algorithm (Cont'd)

 Splitting leaf:
   Cut the node out, insert KEY into it
   Split it into 2 new leaves Lleft and Lright
     Lleft has the ⌊(L+1)/2⌋ smallest keys
     Lright has the remaining ⌈(L+1)/2⌉ keys
   Make a copy of the smallest key in Lright, say J, to be the parent of Lleft and Lright
   Insert J, together with Lleft and Lright, into the original parent node
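A minimal sketch of the leaf split, with leaves modelled as sorted Python lists. The function name `split_leaf` is hypothetical. The cut point follows the examples in this tutorial, where the left leaf gets the smaller half when L+1 is odd.

```python
from bisect import insort

def split_leaf(keys, new_key, L):
    """keys: the L sorted keys of a full leaf. Returns (Lleft, Lright, separator)."""
    keys = list(keys)
    insort(keys, new_key)            # the cut-out node now holds L+1 keys
    cut = (L + 1) // 2               # floor((L+1)/2) keys stay on the left
    left, right = keys[:cut], keys[cut:]
    return left, right, right[0]     # a copy of right[0] goes up to the parent

print(split_leaf("ABCD", "E", 4))    # (['A', 'B'], ['C', 'D', 'E'], 'C')
print(split_leaf("JKLM", "N", 4))    # (['J', 'K'], ['L', 'M', 'N'], 'L')
```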
B+-Tree — Insertion Example 2 (Cont'd)
 Insert E
[P]
  [C F J]
    [A B] [C D E] [F G H] [J K L M]
  [U]
    [P Q] [U V]
The leaf [A B C D E] is split into [A B] and [C D E]; a copy of C goes up to the parent
Done!!
M=L=4
B+-Tree — Insertion Example 3

 Insert N
[P]
  [C F J]
    [A B] [C D E] [F G H] [J K L M N]
  [U]
    [P Q] [U V]
Leaf Splitting!!
M=L=4
B+-Tree — Insertion Example 3 (Cont'd)
 Insert N
[P]
  [C F J L]
    [A B] [C D E] [F G H] [J K] [L M N]
  [U]
    [P Q] [U V]
No. of keys = 4 > (M-1)!!
Done??
M=L=4
B+-Tree — Insertion Algorithm (Cont'd)
 Splitting internal node:
   Cut the node out (it now holds M keys, one more than allowed)
   Split it into 2 new internal nodes Nleft and Nright
     Nleft has the smallest ⌈M/2⌉-1 keys
     Nright has the largest ⌊M/2⌋ keys
     Note that the ⌈M/2⌉th key is not in either node!
       because (⌈M/2⌉-1) + ⌊M/2⌋ = M - 1
   Make the ⌈M/2⌉th key, say J, to be the parent of Nleft and Nright
   Insert J, together with Nleft and Nright, into the original parent node
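The internal split can be sketched on plain lists. The name `split_internal` is hypothetical, and child subtrees are represented only by labels. Unlike a leaf split, the middle key moves up instead of being copied.

```python
def split_internal(keys, children, M):
    """keys: M sorted keys (one too many); children: the M+1 subtrees.

    Returns (Nleft, key_moved_up, Nright), each node as a (keys, children) pair.
    """
    mid = (M + 1) // 2 - 1                         # 0-based index of the ceil(M/2)-th key
    left = (keys[:mid], children[:mid + 1])        # ceil(M/2)-1 keys
    right = (keys[mid + 1:], children[mid + 1:])   # floor(M/2) keys
    return left, keys[mid], right

# The overfull node [C F J L] from Insertion Example 3 (M=4).
left, up, right = split_internal(list("CFJL"), ["AB", "CDE", "FGH", "JK", "LMN"], 4)
print(left)    # (['C'], ['AB', 'CDE'])
print(up)      # F
print(right)   # (['J', 'L'], ['FGH', 'JK', 'LMN'])
```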
B+-Tree — Insertion Example 3 (Cont'd)
 Insert N
[F P]
  [C]
    [A B] [C D E]
  [J L]
    [F G H] [J K] [L M N]
  [U]
    [P Q] [U V]
The internal node [C F J L] is split into [C] and [J L]; F moves up to the root
Done!!
M=L=4
B+-Tree — Insertion Algorithm (Cont'd)
 Splitting root:
   Follow exactly the same procedure as splitting an internal node
   J, the parent of Nleft and Nright, is now set to be the root of the tree
     because the original root is destroyed by splitting
   After splitting the root, the depth of the tree is increased by 1
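Putting the insertion steps together, here is a compact sketch that replays Insertion Examples 1 to 3. All names are hypothetical, and the recursive formulation (each call reports a split to its caller) is one possible way to organize the steps, not necessarily how the course implements them.

```python
from bisect import bisect_right, insort

M = L = 4

class Leaf:
    def __init__(self, keys):
        self.keys = list(keys)

class Internal:
    def __init__(self, keys, children):
        self.keys, self.children = list(keys), list(children)

def _insert(node, key):
    """Insert key below node. Returns None, or (separator, new_right_node) on a split."""
    if isinstance(node, Leaf):
        insort(node.keys, key)
        if len(node.keys) <= L:
            return None
        cut = (L + 1) // 2
        right = Leaf(node.keys[cut:])
        node.keys = node.keys[:cut]
        return right.keys[0], right             # separator is copied up
    i = bisect_right(node.keys, key)
    split = _insert(node.children[i], key)
    if split is None:
        return None
    sep, new_child = split
    node.keys.insert(i, sep)
    node.children.insert(i + 1, new_child)
    if len(node.keys) <= M - 1:
        return None
    mid = (M + 1) // 2 - 1                      # the ceil(M/2)-th key moves up
    up = node.keys[mid]
    right = Internal(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys, node.children = node.keys[:mid], node.children[:mid + 1]
    return up, right

def insert(root, key):
    """Insert key and return the (possibly new) root."""
    split = _insert(root, key)
    if split is None:
        return root
    sep, right = split
    return Internal([sep], [root, right])       # root was split: depth grows by 1

root = Internal("P", [
    Internal("FJ", [Leaf("ABCD"), Leaf("FG"), Leaf("JKLM")]),
    Internal("U", [Leaf("PQ"), Leaf("UV")]),
])
for k in "HEN":
    root = insert(root, k)
print(root.keys, [c.keys for c in root.children])
# ['F', 'P'] [['C'], ['J', 'L'], ['U']]
```

In this replay the root itself never overflows. Inserting E into a tree that is just the single leaf [A B C D] is the smallest case that triggers the root split.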
B+-Tree — Deletion Example 1

 Delete H
[F P]
  [C]
    [A B] [C D E]
  [J L]
    [F G] [J K] [L M N]
  [U]
    [P Q] [U V]
Done!!
M=L=4
B+-Tree — Deletion Algorithm

 Delete KEY:
   Search for KEY using the search operation
     we will reach a leaf and get “Found”
   Delete KEY from that leaf
     If KEY is included in an ancestor (internal node), replace it by the new smallest key in that leaf
     If that leaf finally contains < ⌈L/2⌉ keys, borrowing of child (a key from a sibling leaf) is necessary
B+-Tree — Deletion Algorithm (Cont'd)
 Lending of child:
   Case 1: if the right sibling of the current node contains >= ⌈L/2⌉+1 keys
     borrow the leftmost child from it (no return!)
   Case 2: if the left sibling of the current node contains >= ⌈L/2⌉+1 keys
     borrow the rightmost child from it (no return!)
   Update the searching key in the parent node that separates the current node and the sibling accordingly
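The two lending cases for leaves can be sketched as below. The function name `borrow_for_leaf` is hypothetical. Leaves are plain lists, `leaves` holds the children of one parent, and `parent_keys[i]` is the key separating `leaves[i]` and `leaves[i + 1]`.

```python
def borrow_for_leaf(parent_keys, leaves, i, L):
    """Try to refill the underfull leaves[i] from an adjacent sibling.

    Mutates the lists in place and returns True if a key was borrowed.
    """
    need = (L + 1) // 2 + 1                     # sibling must hold >= ceil(L/2)+1 keys
    # Case 1: right sibling lends its leftmost key
    if i + 1 < len(leaves) and len(leaves[i + 1]) >= need:
        leaves[i].append(leaves[i + 1].pop(0))
        parent_keys[i] = leaves[i + 1][0]       # new smallest key of the right sibling
        return True
    # Case 2: left sibling lends its rightmost key
    if i > 0 and len(leaves[i - 1]) >= need:
        leaves[i].insert(0, leaves[i - 1].pop())
        parent_keys[i - 1] = leaves[i][0]       # new smallest key of the current leaf
        return True
    return False

# Deletion Example 2: B has just been removed from [A B].
keys, leaves = ["C"], [["A"], ["C", "D", "E"]]
print(borrow_for_leaf(keys, leaves, 0, 4), keys, leaves)
# True ['D'] [['A', 'C'], ['D', 'E']]
```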
B+-Tree — Deletion Example 2

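 Delete B
[F P]
  [D]
    [A C] [D E]
  [J L]
    [F G] [J K] [L M N]
  [U]
    [P Q] [U V]
The leaf [A] borrows C from its right sibling [C D E]; the separating key C in the parent becomes D
Done!!
M=L=4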
B+-Tree — Deletion Algorithm (Cont'd)
 What if both left & right siblings have only ⌈L/2⌉ keys, so that borrowing is not possible?
   (⌈L/2⌉ is the minimum no. of keys of a leaf)
 We need to merge two leaves
B+-Tree — Deletion Example 3

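 Delete G
[F P]
  [D]
    [A C] [D E]
  [J L]
    [F G] [J K] [L M N]
  [U]
    [P Q] [U V]
Oops!! Removing G leaves [F] with fewer than L/2 keys, and its sibling [J K] has only L/2 keys
M=L=4
Can’t Borrow!!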
B+-Tree — Deletion Algorithm (Cont'd)

 Merging two leaves:
   Move all keys in the current leaf to the sibling leaf
   Delete the child pointer in the parent node that points to the current leaf
   Delete the separating key between the two leaves from the parent node
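A minimal sketch of the leaf merge, using the same list-based model (hypothetical name `merge_leaves`; `parent_keys[i]` separates `leaves[i]` and `leaves[i + 1]`). It assumes the underfull leaf still holds at least one key, which is always the case when L >= 3.

```python
def merge_leaves(parent_keys, leaves, i):
    """Merge the underfull leaves[i] into an adjacent sibling, in place."""
    if i + 1 < len(leaves):                  # prefer the right sibling
        leaves[i + 1][:0] = leaves[i]        # current keys are all smaller
        del parent_keys[i]                   # separator between the two leaves
    else:                                    # rightmost leaf: use the left sibling
        leaves[i - 1].extend(leaves[i])
        del parent_keys[i - 1]
    del leaves[i]                            # drop the pointer to the current leaf

# Deletion Example 3: G has just been removed from [F G].
keys, leaves = ["J", "L"], [["F"], ["J", "K"], ["L", "M", "N"]]
merge_leaves(keys, leaves, 0)
print(keys, leaves)   # ['L'] [['F', 'J', 'K'], ['L', 'M', 'N']]
```

If the parent had only one key (as in Deletion Example 4), `parent_keys` ends up empty, which is the signal that the parent itself now needs fixing.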
B+-Tree — Deletion Example 3 (Cont'd)
 Delete G
[F P]
  [D]
    [A C] [D E]
  [L]
    [F J K] [L M N]
  [U]
    [P Q] [U V]
Done!!
M=L=4
B+-Tree — Deletion Algorithm (Cont'd)
 How about the case of an internal node?
   First, try borrowing a child
   If not possible, merge two internal nodes
B+-Tree — Deletion Example 4

 Delete C
[F P]
  [D]
    [A C] [D E]
  [L]
    [F J K] [L M N]
  [U]
    [P Q] [U V]
Oops! We need to merge two leaves!!
M=L=4
B+-Tree — Deletion Example 4 (Cont'd)
 Delete C
[F P]
  [ ]
    [A D E]
  [L]
    [F J K] [L M N]
  [U]
    [P Q] [U V]
Can we do that?? No!!
An internal node will become empty!!
We need to merge two internal nodes first!!
M=L=4
B+-Tree — Deletion Algorithm (Cont'd)
 Merging two internal nodes (i.e. no sibling has an excess child to be borrowed)
   Move the separating key between the current node and the sibling node down from the parent node to the sibling
   Move the keys and child pointers in the current node to the sibling node
   Remove the pointer to the current node in the parent node
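The internal-node merge can be sketched as follows. `Node` and `merge_internal` are hypothetical names, and leaves appear only as labels such as "ADE" since they are not touched by this step.

```python
class Node:
    def __init__(self, keys, children):
        self.keys, self.children = list(keys), list(children)

def merge_internal(parent, i):
    """Merge the underfull parent.children[i] into an adjacent sibling, in place."""
    node = parent.children[i]
    if i + 1 < len(parent.children):            # use the right sibling
        sibling = parent.children[i + 1]
        sep = parent.keys.pop(i)                # separating key moves down
        sibling.keys[:0] = node.keys + [sep]
        sibling.children[:0] = node.children
    else:                                       # rightmost node: use the left sibling
        sibling = parent.children[i - 1]
        sep = parent.keys.pop(i - 1)
        sibling.keys += [sep] + node.keys
        sibling.children += node.children
    del parent.children[i]                      # remove the pointer to the current node

# Deletion Example 4, after the leaves [A] and [D E] were merged.
empty = Node([], ["ADE"])
sib = Node("L", ["FJK", "LMN"])
root = Node("FP", [empty, sib, Node("U", ["PQ", "UV"])])
merge_internal(root, 0)
print(root.keys, sib.keys, sib.children)
# ['P'] ['F', 'L'] ['ADE', 'FJK', 'LMN']
```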
B+-Tree — Deletion Example 4 (Cont'd)
 Delete C
[P]
  [F L]
    [A D E] [F J K] [L M N]
  [U]
    [P Q] [U V]
Done!!
M=L=4
B+-Tree — Deletion Example 5

 Delete P
[P]
  [F L]
    [A D E] [F J K] [L M N]
  [U]
    [P Q] [U V]
Oops! We need to merge two leaves!!
But can we?? No!!
We have to borrow a child first!!
M=L=4
B+-Tree — Deletion Algorithm (Cont'd)
 Borrow a child (i.e. a sibling has an excess child to be borrowed)
   Move the separating key between the current node and the sibling node down from the parent node to the current node
   Make the leftmost (/rightmost) child of the sibling node the rightmost (/leftmost) child of the current node
   Move the leftmost (/rightmost) key of the sibling node up to be the new separating key between the current node and the sibling in the parent node
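Borrowing a child between internal nodes is a rotation through the parent. The sketch below uses hypothetical names (`Node`, `borrow_child`) and leaf labels in place of real leaves. The starting state assumes the root key P of Deletion Example 5 has already been replaced by Q, the new smallest key of its leaf, as the deletion rule requires.

```python
class Node:
    def __init__(self, keys, children):
        self.keys, self.children = list(keys), list(children)

def borrow_child(parent, i, min_children):
    """Refill the underfull parent.children[i] from a sibling with a spare child."""
    kids, node = parent.children, parent.children[i]
    if i + 1 < len(kids) and len(kids[i + 1].children) > min_children:
        sib = kids[i + 1]
        node.keys.append(parent.keys[i])               # separator moves down
        node.children.append(sib.children.pop(0))      # leftmost child moves over
        parent.keys[i] = sib.keys.pop(0)               # leftmost key moves up
        return True
    if i > 0 and len(kids[i - 1].children) > min_children:
        sib = kids[i - 1]
        node.keys.insert(0, parent.keys[i - 1])        # separator moves down
        node.children.insert(0, sib.children.pop())    # rightmost child moves over
        parent.keys[i - 1] = sib.keys.pop()            # rightmost key moves up
        return True
    return False

# Deletion Example 5, after the leaves [Q] and [U V] were merged (M=4, so min 2 children).
left = Node("FL", ["ADE", "FJK", "LMN"])
cur = Node([], ["QUV"])
root = Node("Q", [left, cur])
print(borrow_child(root, 1, 2), root.keys, left.keys, cur.keys, cur.children)
# True ['L'] ['F'] ['Q'] ['LMN', 'QUV']
```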
B+-Tree — Deletion Example 5 (Cont'd)
 Delete P
[L]
  [F]
    [A D E] [F J K]
  [Q]
    [L M N] [Q U V]
Done!!
M=L=4
B+-Tree — Deletion Algorithm (Cont'd)
 What if the root becomes empty during merging of nodes?
   Make the only child of the original root the new root
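The deletion steps, including the empty-root rule, can be combined into one sketch that replays Deletion Examples 1 to 5. All names are hypothetical. Two simplifications are assumed: separators are refreshed by recomputing the smallest key of each subtree (simple rather than efficient), and L >= 3 so that an underfull leaf is never empty.

```python
from bisect import bisect_right

M = L = 4
MIN_KEYS = (L + 1) // 2        # ceil(L/2): fewest keys in a non-root leaf
MIN_CHILDREN = (M + 1) // 2    # ceil(M/2): fewest children in a non-root internal node

class Leaf:
    def __init__(self, keys):
        self.keys = list(keys)

class Internal:
    def __init__(self, keys, children):
        self.keys, self.children = list(keys), list(children)

def is_leaf(n):
    return isinstance(n, Leaf)

def size(n):
    return len(n.keys) if is_leaf(n) else len(n.children)

def minimum(n):
    return MIN_KEYS if is_leaf(n) else MIN_CHILDREN

def smallest(n):
    while not is_leaf(n):
        n = n.children[0]
    return n.keys[0]

def merge(parent, j):
    """Merge parent.children[j + 1] into parent.children[j]."""
    a, b = parent.children[j], parent.children[j + 1]
    sep = parent.keys.pop(j)                   # separating key leaves the parent
    if is_leaf(a):
        a.keys += b.keys
    else:
        a.keys += [sep] + b.keys               # separator moves down
        a.children += b.children
    del parent.children[j + 1]

def rebalance(parent, i):
    """Fix the underfull child i: borrow if a sibling can lend, else merge."""
    kids, node = parent.children, parent.children[i]
    right = kids[i + 1] if i + 1 < len(kids) else None
    left = kids[i - 1] if i > 0 else None
    if right is not None and size(right) > minimum(right):
        if is_leaf(node):
            node.keys.append(right.keys.pop(0))
            parent.keys[i] = right.keys[0]
        else:
            node.keys.append(parent.keys[i])
            node.children.append(right.children.pop(0))
            parent.keys[i] = right.keys.pop(0)
    elif left is not None and size(left) > minimum(left):
        if is_leaf(node):
            node.keys.insert(0, left.keys.pop())
            parent.keys[i - 1] = node.keys[0]
        else:
            node.keys.insert(0, parent.keys[i - 1])
            node.children.insert(0, left.children.pop())
            parent.keys[i - 1] = left.keys.pop()
    else:
        merge(parent, i if right is not None else i - 1)

def _delete(node, key):
    if is_leaf(node):
        node.keys.remove(key)                  # raises ValueError if key is absent
        return
    i = bisect_right(node.keys, key)
    child = node.children[i]
    _delete(child, key)
    # Replace a separator equal to the deleted key by the new smallest key.
    node.keys = [smallest(c) for c in node.children[1:]]
    if size(child) < minimum(child):
        rebalance(node, i)

def delete(root, key):
    """Delete key and return the (possibly new) root."""
    _delete(root, key)
    if not is_leaf(root) and len(root.children) == 1:
        root = root.children[0]                # empty root: its only child takes over
    return root

root = Internal("FP", [
    Internal("C", [Leaf("AB"), Leaf("CDE")]),
    Internal("JL", [Leaf("FGH"), Leaf("JK"), Leaf("LMN")]),
    Internal("U", [Leaf("PQ"), Leaf("UV")]),
])
for k in "HBGCP":
    root = delete(root, k)
print(root.keys, [c.keys for c in root.children])
# ['L'] [['F'], ['Q']]
```

The five deletions never empty the root. A two-leaf tree such as [C] over [A B] and [C D] shows that case: deleting A forces a merge, the root is left with a single child, and that child becomes the new root.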
B+-Tree — Final Output

[L]
  [F]
    [A D E] [F J K]
  [Q]
    [L M N] [Q U V]

M=L=4
