
Jim Anderson Comp 750, Fall 2009 B-Trees - 1

Chapter 18: B-Trees


Example:
  root:           [M]
  internal nodes: [D H]  [Q T X]
  leaves:         [B C] [F G] [J K L]  |  [N P] [R S] [V W] [Y Z]
Node x has the following fields:
  n[x] -- number of keys stored in x.
  key_1[x] ≤ key_2[x] ≤ … ≤ key_{n[x]}[x] -- the n[x] keys.
  leaf[x] -- Boolean.
  if internal: n[x] + 1 pointers to children, c_1[x], c_2[x], …, c_{n[x]+1}[x].
Let k_i = any key in subtree rooted at c_i[x]. Then,
  k_1 ≤ key_1[x] ≤ k_2 ≤ key_2[x] ≤ … ≤ key_{n[x]}[x] ≤ k_{n[x]+1}.
t ≥ 2 is called the minimum degree.
  x ≠ root ⇒ t-1 ≤ n[x] ≤ 2t-1
  x = root ⇒ 1 ≤ n[x] ≤ 2t-1 (for a nonempty tree)
Note: Each leaf has the same depth.
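The node fields and invariants above can be sketched in Python. This is a minimal illustration, not the course's code: the names `BTreeNode`, `make_leaf`, and `check` are hypothetical, nodes live in memory rather than on disk, and `leaf[x]` is derived from whether the child list is empty. The tree built at the end is the example tree from this slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BTreeNode:
    # keys corresponds to key_1[x] .. key_n[x]; children to c_1[x] .. c_{n+1}[x].
    keys: List[str] = field(default_factory=list)
    children: List["BTreeNode"] = field(default_factory=list)

    @property
    def leaf(self) -> bool:
        return not self.children


def check(x, t, is_root=True, lo=None, hi=None):
    """Assert the B-tree invariants below x (nonempty tree); return its height."""
    n = len(x.keys)
    assert n <= 2 * t - 1                      # no node holds more than 2t-1 keys
    assert n >= (1 if is_root else t - 1)      # only the root may hold fewer than t-1
    assert x.keys == sorted(x.keys)
    assert all((lo is None or lo <= k) and (hi is None or k <= hi) for k in x.keys)
    if x.leaf:
        return 0
    assert len(x.children) == n + 1
    bounds = [lo] + x.keys + [hi]              # child i lies between bounds[i] and bounds[i+1]
    heights = {check(c, t, False, bounds[i], bounds[i + 1])
               for i, c in enumerate(x.children)}
    assert len(heights) == 1                   # every leaf has the same depth
    return heights.pop() + 1


def make_leaf(letters):
    return BTreeNode(list(letters))


root = BTreeNode(["M"], [
    BTreeNode(["D", "H"], [make_leaf("BC"), make_leaf("FG"), make_leaf("JKL")]),
    BTreeNode(["Q", "T", "X"],
              [make_leaf("NP"), make_leaf("RS"), make_leaf("VW"), make_leaf("YZ")]),
])
height = check(root, t=2)
```

The example tree satisfies the constraints for t = 2 and t = 3, but not for t = 4, where a non-root node such as [D H] would need at least 3 keys.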
Application: Disk Accesses
Each node is stored as a page.
Page size determines t.
t is usually large
Implies branching factor is large, so height is small.
Example:
  [Figure: a B-tree of height 2. The root holds 1000 keys and has 1001 children; every other node also holds 1000 keys.]
  Holds over one billion keys.
  Height is only 2, so any key can be found with only two disk accesses
  (compare to red-black trees, where the branching factor is 2).
Note: Disk accesses dominate performance in this application.
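The "over one billion keys" figure can be checked with a few lines of arithmetic. The sketch below assumes every node holds exactly 1000 keys and every internal node has 1001 children, as in the example; the function name `total_keys` is hypothetical.

```python
def total_keys(keys_per_node: int, height: int) -> int:
    """Keys in a tree where every node holds keys_per_node keys."""
    fanout = keys_per_node + 1
    # Depth d contains fanout**d nodes: 1 root, 1001 children, 1001**2 leaves, ...
    nodes = sum(fanout ** d for d in range(height + 1))
    return nodes * keys_per_node


print(total_keys(1000, 2))  # 1003003000
```

With the root held in memory, reaching a leaf of this tree costs two disk reads.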
Height of a B-Tree
Theorem 18.1: Let n = the number of keys in T, n ≥ 1, t ≥ 2,
h = height of T. Then,
$$h \le \log_t \frac{n+1}{2}$$
Proof:
Let T be of height h. The number of keys is minimized when the root has 1 key and
all other nodes have t-1 keys.

This gives us $2t^{i-1}$ nodes at depth i, 1 ≤ i ≤ h, and 1 node at depth 0. Hence,
$$n \ge 1 + (t-1)\sum_{i=1}^{h} 2t^{i-1} = 1 + 2(t-1)\left(\frac{t^h - 1}{t-1}\right) = 2t^h - 1.$$
This implies $t^h \le \frac{n+1}{2}$, and hence $h \le \log_t \frac{n+1}{2}$.
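As a sanity check on the bound, the sketch below works with the equivalent integer inequality 2t^h - 1 ≤ n, which avoids floating-point logarithms. The function names `min_keys` and `max_height` are hypothetical.

```python
def min_keys(t: int, h: int) -> int:
    """Fewest keys in a B-tree of minimum degree t and height h: 2t^h - 1."""
    return 2 * t ** h - 1


def max_height(n: int, t: int) -> int:
    """Largest h with 2t^h - 1 <= n, i.e. floor(log_t((n + 1) / 2)), for n >= 1."""
    h = 0
    while min_keys(t, h + 1) <= n:
        h += 1
    return h


for n in (1, 3, 6, 7, 1000):
    print(n, max_height(n, t=2))
```

For t = 2 a tree needs at least 3 keys to reach height 1 and at least 7 keys to reach height 2, so 6 keys can never produce a tree of height 2.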
B-Tree Operations
Search:
  O(log_t n) disk accesses.
  O(t log_t n) CPU time.
Create:
  O(1) disk accesses.
  O(1) CPU time.
Insert and Delete:
  O(log_t n) disk accesses.
  O(t log_t n) CPU time.

In the code that follows, we use:
  Disk-Read: To move a node from disk to memory.
  Disk-Write: To move a node from memory to disk.
We assume the root is in memory.
Search
Search(x, k)
  i := 1;
  while i ≤ n[x] and k > key_i[x] do
    i := i + 1
  od;
  if i ≤ n[x] and k = key_i[x] then
    return (x, i)
  fi;
  if leaf[x] then
    return NIL
  else
    Disk-Read(c_i[x]);
    return Search(c_i[x], k)
  fi

Search(root[T], k) returns (y, i) s.t. key_i[y] = k, or NIL if no such key.

Worst-case:
  O(log_t n) disk reads.
  O(t log_t n) CPU time.
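A minimal Python sketch of the search procedure follows. It is an illustration under assumptions: `Node`, `disk_read`, and the `disk_reads` log are hypothetical stand-ins (nodes are in memory, and `disk_read` only records which child would be fetched), and indices are 0-based where the pseudocode is 1-based. The tree is the example tree from the first slide.

```python
class Node:
    def __init__(self, keys, children=()):
        self.keys = list(keys)
        self.children = list(children)   # empty list means the node is a leaf


disk_reads = []                          # log of simulated Disk-Read calls


def disk_read(node):
    disk_reads.append(node)


def search(x, k):
    """Return (node, index) with node.keys[index] == k, or None if k is absent."""
    i = 0
    while i < len(x.keys) and k > x.keys[i]:
        i += 1
    if i < len(x.keys) and k == x.keys[i]:
        return x, i
    if not x.children:
        return None
    disk_read(x.children[i])
    return search(x.children[i], k)


root = Node("M", [
    Node("DH", [Node("BC"), Node("FG"), Node("JKL")]),
    Node("QTX", [Node("NP"), Node("RS"), Node("VW"), Node("YZ")]),
])
hit = search(root, "K")
```

Finding K touches the root (assumed to be in memory) and then reads two more nodes, one per level below the root.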
Creating an Empty Tree
Create(T)
x := Allocate-Node();
leaf[x] := true;
n[x] := 0;
Disk-Write(x);
root[T] := x
To create a nonempty tree, first create an empty tree, then
insert keys.

Splitting is fundamental to insert.
Splitting
Applied to a full child of a nonfull parent (full ≡ 2t-1 keys).

Example: (t=4)
  Before:
    x:              [… N W …]
    y = c_i[x]:     [P Q R S T U V]   (subtrees T_1 … T_8)
  After Split:
    x:              [… N S W …]
    y = c_i[x]:     [P Q R]           (subtrees T_1 … T_4)
    z = c_{i+1}[x]: [T U V]           (subtrees T_5 … T_8)
Split-Child
Split-Child(x, i, y)
  z := Allocate-Node();
  leaf[z] := leaf[y];
  n[z] := t-1;
  for j := 1 to t-1 do
    key_j[z] := key_{j+t}[y]
  od;
  if not leaf[y] then
    for j := 1 to t do
      c_j[z] := c_{j+t}[y]
    od
  fi;
  n[y] := t-1;
  for j := n[x] + 1 downto i+1 do
    c_{j+1}[x] := c_j[x]
  od;
  c_{i+1}[x] := z;
  for j := n[x] downto i do
    key_{j+1}[x] := key_j[x]
  od;
  key_i[x] := key_t[y];
  n[x] := n[x] + 1;
  Disk-Write(y);
  Disk-Write(z);
  Disk-Write(x)

O(t) CPU time.
O(1) disk writes.
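The split can be sketched in Python with list slicing in place of the explicit shifting loops. This is an in-memory illustration only: `Node` and `split_child` are hypothetical names, the child index is 0-based, and the disk writes are omitted. The usage reproduces the t = 4 splitting example; the two outer children of x are made-up placeholders.

```python
class Node:
    def __init__(self, keys=(), children=()):
        self.keys = list(keys)
        self.children = list(children)   # empty list means the node is a leaf


def split_child(x, i, t):
    """Split the full child x.children[i] around its median key (0-based i)."""
    y = x.children[i]
    assert len(y.keys) == 2 * t - 1, "child must be full"
    assert len(x.keys) < 2 * t - 1, "parent must be nonfull"
    z = Node(y.keys[t:], y.children[t:])   # upper t-1 keys and upper t children
    median = y.keys[t - 1]
    y.keys = y.keys[:t - 1]                # y keeps the lower t-1 keys
    y.children = y.children[:t]            # and the lower t children (none if leaf)
    x.keys.insert(i, median)               # median moves up into the parent
    x.children.insert(i + 1, z)            # z becomes y's right neighbour


# x = [N W] with full middle child [P Q R S T U V]; outer children are placeholders.
x = Node("NW", [Node("ABC"), Node("PQRSTUV"), Node("XYZ")])
split_child(x, 1, t=4)
print(x.keys, [c.keys for c in x.children])
```

The `insert` calls on Python lists do the same shifting that the two `downto` loops perform, so the O(t) CPU cost is unchanged.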
Insert
Insert(T, k)
  r := root[T];
  if n[r] = 2t-1 then
    s := Allocate-Node();
    root[T] := s;
    leaf[s] := false;
    n[s] := 0;
    c_1[s] := r;
    Split-Child(s, 1, r);
    Insert-Nonfull(s, k)
  else
    Insert-Nonfull(r, k)
  fi

First, modify tree (if necessary) to create room for new key.
Then, call Insert-Nonfull().

Example:
  Before:
    root[T] = r: [A D F H L N P]   (subtrees T_1 … T_8)
  After:
    root[T] = s: [H]
    r:           [A D F]           (subtrees T_1 … T_4)
    new node:    [L N P]           (subtrees T_5 … T_8)
Insert-Nonfull
Insert-Nonfull(x, k)
  i := n[x];
  if leaf[x] then
    while i ≥ 1 and k < key_i[x] do
      key_{i+1}[x] := key_i[x];
      i := i-1
    od;
    key_{i+1}[x] := k;
    n[x] := n[x] + 1;
    Disk-Write(x)
  else /* not leaf[x] */
    while i ≥ 1 and k < key_i[x] do
      i := i-1
    od;
    i := i + 1;
    Disk-Read(c_i[x]);
    if n[c_i[x]] = 2t-1 then
      Split-Child(x, i, c_i[x]);
      if k > key_i[x] then
        i := i + 1
      fi
    fi;
    Insert-Nonfull(c_i[x], k)
  fi

Worst Case:
  O(t log_t n) CPU time.
  O(log_t n) disk writes.
Insert Example
t = 3
Before:
  root:   [G M P X]
  leaves: [A C D E] [J K] [N O] [R S T U V] [Y Z]
Insert B
After:
  root:   [G M P X]
  leaves: [A B C D E] [J K] [N O] [R S T U V] [Y Z]
Insert Example (Continued)
Insert Q
Before:
  root:   [G M P X]
  leaves: [A B C D E] [J K] [N O] [R S T U V] [Y Z]
After:
  root:   [G M P T X]
  leaves: [A B C D E] [J K] [N O] [Q R S] [U V] [Y Z]
Insert Example (Continued)
Insert L
Before:
  root:     [G M P T X]
  leaves:   [A B C D E] [J K] [N O] [Q R S] [U V] [Y Z]
After:
  root:     [P]
  internal: [G M]  [T X]
  leaves:   [A B C D E] [J K L] [N O]  |  [Q R S] [U V] [Y Z]
Insert Example (Continued)
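Insert F
Before:
  root:     [P]
  internal: [G M]  [T X]
  leaves:   [A B C D E] [J K L] [N O]  |  [Q R S] [U V] [Y Z]
After:
  root:     [P]
  internal: [C G M]  [T X]
  leaves:   [A B] [D E F] [J K L] [N O]  |  [Q R S] [U V] [Y Z]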
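The insert example can be replayed with a compact Python sketch of Insert, Insert-Nonfull, and Split-Child. This is an illustration under assumptions rather than the course's code: the names `Node`, `BTree`, and `levels` are hypothetical, nodes stay in memory (no Disk-Read or Disk-Write), indices are 0-based, and `bisect` replaces the linear key scans. The starting tree is built by hand to match the first step of the example (t = 3).

```python
import bisect


class Node:
    def __init__(self, keys=(), children=()):
        self.keys = list(keys)
        self.children = list(children)   # empty list means the node is a leaf


class BTree:
    def __init__(self, t, root=None):
        self.t = t
        self.root = root if root is not None else Node()

    def _full(self, x):
        return len(x.keys) == 2 * self.t - 1

    def _split_child(self, x, i):
        t, y = self.t, x.children[i]
        z = Node(y.keys[t:], y.children[t:])
        x.keys.insert(i, y.keys[t - 1])            # median key moves up
        x.children.insert(i + 1, z)
        y.keys, y.children = y.keys[:t - 1], y.children[:t]

    def insert(self, k):
        if self._full(self.root):                  # grow the tree at the root
            self.root = Node(children=[self.root])
            self._split_child(self.root, 0)
        self._insert_nonfull(self.root, k)

    def _insert_nonfull(self, x, k):
        if not x.children:
            bisect.insort(x.keys, k)               # leaf: insert in sorted position
            return
        i = bisect.bisect_right(x.keys, k)         # child whose range contains k
        if self._full(x.children[i]):
            self._split_child(x, i)
            if k > x.keys[i]:
                i += 1
        self._insert_nonfull(x.children[i], k)


def levels(root):
    """Keys of each node, level by level, e.g. [['P'], ['GM', 'TX'], ...]."""
    out, frontier = [], [root]
    while frontier:
        out.append(["".join(n.keys) for n in frontier])
        frontier = [c for n in frontier for c in n.children]
    return out


start = Node("GMPX", [Node("ACDE"), Node("JK"), Node("NO"), Node("RSTUV"), Node("YZ")])
tree = BTree(3, start)
for key in "BQLF":
    tree.insert(key)
print(levels(tree.root))
```

After the four insertions the levels are [P], then [C G M] and [T X], then the seven leaves shown in the final step of the example.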
Deletion
Main Idea: Recursively descend tree.

Ensure any non-root node x that is
considered has at least t keys.

May have to move key down from parent.
Deletion Cases
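Case 0: Empty root -- make the root's only child the new root.
  Before: x is the root, with no keys and a single child c_1[x]
  After:  c_1[x] is the root
Case 1: k in x, x is a leaf -- delete k from x.
  Before: x is a leaf with ≥ t keys, including k
  After:  x is a leaf with ≥ t-1 keys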
Deletion Cases (Continued)
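Case 2: k in x, x internal.
  x is not a leaf and contains k; y is the child preceding k, z is the child following k.
Subcase A: y has at least t keys -- find predecessor k' of k in subtree
rooted at y, recursively delete k', replace k by k' in x.
  Before: x (not a leaf) contains k; y has ≥ t keys, and its subtree contains k' = pred of k
  After:  x contains k' in place of k; k' has been deleted from the subtree rooted at y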
Deletion Cases (Continued)
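Subcase B: z has at least t keys -- find successor k' of k in subtree
rooted at z, recursively delete k', replace k by k' in x.
  Before: x (not a leaf) contains k; z has ≥ t keys, and its subtree contains k' = succ of k
  After:  x contains k' in place of k; k' has been deleted from the subtree rooted at z
Subcase C: y and z both have t-1 keys -- merge k and z into y, free
z, recursively delete k from y.
  Before: x (not a leaf) contains k; y and z each have t-1 keys
  After:  x (not a leaf) no longer contains k; y holds y's keys, k, z's keys (2t-1 keys)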
Deletion Cases (Continued)
Case 3: k not in internal node x. Let c_i[x] be the root of the subtree that
must contain k, if k is in the tree. If c_i[x] has at least t keys,
then recursively descend; otherwise, execute 3.A or 3.B as
necessary.
Subcase A: c_i[x] has t-1 keys, an immediate sibling has at least t keys.
  Before: x (not a leaf) contains k1; c_i[x] has t-1 keys; the sibling contains k2
  After:  x contains k2 in place of k1; c_i[x] has t keys, including k1; recursively descend into c_i[x]
Deletion Cases (Continued)
Subcase B: c_i[x] and its sibling c_{i+1}[x] both have t-1 keys.
  Before: x (not a leaf) contains k1 between c_i[x] and c_{i+1}[x]; each has t-1 keys
  After:  k1 has moved down from x; c_i[x] holds c_i[x]'s keys, k1, c_{i+1}[x]'s keys (2t-1 keys); recursively descend into c_i[x]
Delete Example
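t = 3
Delete F (Case 1)
Before:
  root:     [P]
  internal: [C G M]  [T X]
  leaves:   [A B] [D E F] [J K L] [N O]  |  [Q R S] [U V] [Y Z]
After:
  root:     [P]
  internal: [C G M]  [T X]
  leaves:   [A B] [D E] [J K L] [N O]  |  [Q R S] [U V] [Y Z]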
Delete Example (Continued)
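Delete M (Case 2.A)
Before:
  root:     [P]
  internal: [C G M]  [T X]
  leaves:   [A B] [D E] [J K L] [N O]  |  [Q R S] [U V] [Y Z]
After:
  root:     [P]
  internal: [C G L]  [T X]
  leaves:   [A B] [D E] [J K] [N O]  |  [Q R S] [U V] [Y Z]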
Delete Example (Continued)
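Delete G (Case 2.C)
Before:
  root:     [P]
  internal: [C G L]  [T X]
  leaves:   [A B] [D E] [J K] [N O]  |  [Q R S] [U V] [Y Z]
After:
  root:     [P]
  internal: [C L]  [T X]
  leaves:   [A B] [D E J K] [N O]  |  [Q R S] [U V] [Y Z]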
Delete Example (Continued)
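Delete D (Case 3.B)
Before:
  root:     [P]
  internal: [C L]  [T X]
  leaves:   [A B] [D E J K] [N O]  |  [Q R S] [U V] [Y Z]
After:
  root:     [ ]   (empty)
  internal: [C L P T X]
  leaves:   [A B] [E J K] [N O] [Q R S] [U V] [Y Z]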
Delete Example (Continued)
Case 0
Before:
  root:     [ ]   (empty)
  internal: [C L P T X]
  leaves:   [A B] [E J K] [N O] [Q R S] [U V] [Y Z]
After:
  root:     [C L P T X]
  leaves:   [A B] [E J K] [N O] [Q R S] [U V] [Y Z]
Delete Example (Continued)
Delete B (Case 3.A)
Before:
  root:   [C L P T X]
  leaves: [A B] [E J K] [N O] [Q R S] [U V] [Y Z]
After:
  root:   [E L P T X]
  leaves: [A C] [J K] [N O] [Q R S] [U V] [Y Z]
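The slides give no pseudocode for deletion, so the following Python sketch is one possible reading of Cases 0 through 3, not the course's implementation. All names (`Node`, `merge`, `delete`, `delete_key`, `levels`) are hypothetical, nodes stay in memory, indices are 0-based, and keys are assumed distinct. It replays the delete example above (t = 3, deleting F, M, G, D, B).

```python
import bisect


class Node:
    def __init__(self, keys=(), children=()):
        self.keys = list(keys)
        self.children = list(children)   # empty list means the node is a leaf


def merge(x, i):
    """Merge x.keys[i] and child i+1 into child i; both children have t-1 keys."""
    y, z = x.children[i], x.children.pop(i + 1)
    y.keys += [x.keys.pop(i)] + z.keys
    y.children += z.children
    return y


def delete(x, k, t):
    """Delete k below x, assuming x has at least t keys unless it is the root."""
    i = bisect.bisect_left(x.keys, k)
    found = i < len(x.keys) and x.keys[i] == k
    if not x.children:                             # Case 1 (no-op if k is absent)
        if found:
            x.keys.pop(i)
        return
    if found:                                      # Case 2: k in internal node x
        y, z = x.children[i], x.children[i + 1]
        if len(y.keys) >= t:                       # 2.A: replace k by its predecessor
            node = y
            while node.children:
                node = node.children[-1]
            x.keys[i] = node.keys[-1]
            delete(y, x.keys[i], t)
        elif len(z.keys) >= t:                     # 2.B: replace k by its successor
            node = z
            while node.children:
                node = node.children[0]
            x.keys[i] = node.keys[0]
            delete(z, x.keys[i], t)
        else:                                      # 2.C: merge, then delete from y
            delete(merge(x, i), k, t)
        return
    c = x.children[i]                              # Case 3: k must be below c
    if len(c.keys) < t:
        left = x.children[i - 1] if i > 0 else None
        right = x.children[i + 1] if i < len(x.keys) else None
        if left is not None and len(left.keys) >= t:       # 3.A, borrow via x from left
            c.keys.insert(0, x.keys[i - 1])
            x.keys[i - 1] = left.keys.pop()
            if left.children:
                c.children.insert(0, left.children.pop())
        elif right is not None and len(right.keys) >= t:   # 3.A, borrow via x from right
            c.keys.append(x.keys[i])
            x.keys[i] = right.keys.pop(0)
            if right.children:
                c.children.append(right.children.pop(0))
        elif right is not None:                            # 3.B, merge with right sibling
            c = merge(x, i)
        else:                                              # 3.B, merge with left sibling
            c = merge(x, i - 1)
    delete(c, k, t)


def delete_key(root, k, t):
    """Delete k and return the (possibly new) root."""
    delete(root, k, t)
    if not root.keys and root.children:            # Case 0: empty root
        return root.children[0]
    return root


def levels(root):
    out, frontier = [], [root]
    while frontier:
        out.append(["".join(n.keys) for n in frontier])
        frontier = [c for n in frontier for c in n.children]
    return out


root = Node("P", [
    Node("CGM", [Node("AB"), Node("DEF"), Node("JKL"), Node("NO")]),
    Node("TX", [Node("QRS"), Node("UV"), Node("YZ")]),
])
for key in "FMGDB":
    root = delete_key(root, key, t=3)
print(levels(root))
```

Deleting D merges the root's two children and leaves the root empty, so `delete_key` applies Case 0 and the tree loses one level, matching the example.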
