Algorithms
FOURTH EDITION
B-trees
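implementation                        worst-case cost (after N inserts)    average case (after N random inserts)    ordered       key
                                      search     insert     delete         search hit   insert      delete          iteration?    interface
sequential search (unordered list)    N          N          N              N/2          N           N/2             no            equals()
binary search (ordered array)         lg N       N          N              lg N         N/2         N/2             yes           compareTo()
BST                                   N          N          N              1.39 lg N    1.39 lg N   ?               yes           compareTo()
goal                                  log N      log N      log N          log N        log N       log N           yes           compareTo()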
Challenge. Guarantee performance. This lecture. 2-3 trees, left-leaning red-black BSTs, B-trees.
ROBERT SEDGEWICK | KEVIN WAYNE
http://algs4.cs.princeton.edu
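2-3 tree
Allow 1 or 2 keys per node.
2-node: one key, two children.
3-node: two keys, three children.
Perfect balance. Every path from root to null link has same length.
Symmetric order. Inorder traversal yields keys in ascending order.
[Figure: 2-3 tree with root M, 3-node E J and 2-node R below it, and bottom-level nodes including A C and S X. The three links of E J lead to keys smaller than E, between E and J, and larger than J. Links leaving the bottom level are null links.]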
Search. Compare search key against keys in node. Find interval containing search key. Follow associated link (recursively).
[Figure: search for H in the 2-3 tree]
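The search rule above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class `TwoThreeSearchSketch`, its `Node` layout (an array of keys plus one more link than keys), and the hand-built `example()` tree are hypothetical names chosen here, not code from the lecture.

```java
// Sketch only: a hand-built 2-3 tree and the interval-based search rule.
// All names here are illustrative assumptions.
class TwoThreeSearchSketch {
    static class Node {
        String[] keys;   // 1 key (2-node) or 2 keys (3-node), in ascending order
        Node[] links;    // keys.length + 1 links; all null at the bottom level

        Node(String... keys) {
            this.keys = keys;
            this.links = new Node[keys.length + 1];
        }
    }

    static boolean contains(Node x, String key) {
        if (x == null) return false;             // reached a null link: search miss
        int i = 0;
        while (i < x.keys.length) {
            int cmp = key.compareTo(x.keys[i]);
            if (cmp == 0) return true;           // search hit
            if (cmp < 0) break;                  // key lies in the interval before keys[i]
            i++;
        }
        return contains(x.links[i], key);        // follow the link for that interval
    }

    // Example tree: root M; 3-node E J and 2-node R; bottom level A C, H, L, P, S X.
    static Node example() {
        Node ej = new Node("E", "J");
        ej.links[0] = new Node("A", "C");
        ej.links[1] = new Node("H");
        ej.links[2] = new Node("L");
        Node r = new Node("R");
        r.links[0] = new Node("P");
        r.links[1] = new Node("S", "X");
        Node m = new Node("M");
        m.links[0] = ej;
        m.links[1] = r;
        return m;
    }
}
```

Searching for H goes left at M, takes the middle link of E J, and hits H; searching for B ends at a null link below A C.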
Add new key to 3-node to create temporary 4-node. Move middle key in 4-node into parent. Repeat up the tree, as necessary. If you reach the root and it's a 4-node, split it into three 2-nodes.
[Figure: insert L and insert S into the example 2-3 tree]
[Figure: splitting a temporary 4-node b c d whose parent is the 3-node a e: the middle key c moves up, giving parent a c e with children b and d. The subtrees (less than a, between a and b, between b and c, between c and d, between d and e, greater than e) are unchanged.]
[Figure: the cases for splitting a 4-node: at the root; parent is a 2-node (left, right); parent is a 3-node (left, middle, right)]
Tree height.
Worst case: lg N. [all 2-nodes]
Best case: log_3 N ≈ 0.631 lg N. [all 3-nodes]
Between 12 and 20 for a million nodes.
Between 18 and 30 for a billion nodes.
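As a rough check of those numbers, the two logarithms can be evaluated directly and rounded outward. The class and method names below are hypothetical, and rounding the best case down and the worst case up is an assumption about how the slide's ranges were obtained.

```java
// Sketch only: evaluates log_3 N and lg N for the two sizes quoted above.
class TwoThreeHeightSketch {
    static double log(double n, double base) {
        return Math.log(n) / Math.log(base);
    }

    // Best case (all 3-nodes), rounded down.
    static int bestCase(long n) {
        return (int) Math.floor(log(n, 3));
    }

    // Worst case (all 2-nodes), rounded up.
    static int worstCase(long n) {
        return (int) Math.ceil(log(n, 2));
    }

    public static void main(String[] args) {
        for (long n : new long[] {1_000_000L, 1_000_000_000L})
            System.out.println(n + " nodes: between " + bestCase(n) + " and " + worstCase(n));
    }
}
```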
ST implementations: summary

implementation                        worst-case cost (after N inserts)    average case (after N random inserts)    ordered       key
                                      search     insert     delete         search hit   insert      delete          iteration?    interface
sequential search (unordered list)    N          N          N              N/2          N           N/2             no            equals()
binary search (ordered array)         lg N       N          N              lg N         N/2         N/2             yes           compareTo()
BST                                   N          N          N              1.39 lg N    1.39 lg N   ?               yes           compareTo()
2-3 tree                              c lg N     c lg N     c lg N         c lg N       c lg N      c lg N          yes           compareTo()
Maintaining multiple node types is cumbersome. Need multiple compares to move down tree. Need to move back up the tree to split 4-nodes. Large number of cases for splitting.
Bottom line. Could do it, but there's a better way.
[Figure: a 3-node with keys a and b and subtrees less than a, between a and b, and greater than b, shown as a 2-3 tree node and as a red-black tree]
[Figure: the 2-3 tree with root M, nodes E J and R, and bottom-level nodes A C, H, L, P, S X, next to the corresponding red-black BST, also drawn with horizontal red links]
An equivalent definition
A BST such that:
No node has two red links connected to it.
Every path from root to null link has the same number of black links ("perfect black balance").
Red links lean left.
[Figure: red-black tree on the keys A C E H J L M P R S X, the same tree drawn with horizontal red links, and the corresponding 2-3 tree]
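The three conditions in the definition above can be checked mechanically on any tree whose nodes record the color of the link from their parent. The following is a minimal sketch, assuming a hypothetical `RedBlackCheckSketch` class with hand-built nodes; the method names `is23` and `blackHeight` are illustrative choices, not part of the lecture code.

```java
// Sketch only: checks the three defining conditions of a left-leaning red-black BST.
class RedBlackCheckSketch {
    static final boolean RED = true;
    static final boolean BLACK = false;

    static class Node {
        String key;
        Node left, right;
        boolean color;   // color of the link from the parent to this node

        Node(String key, boolean color, Node left, Node right) {
            this.key = key;
            this.color = color;
            this.left = left;
            this.right = right;
        }
    }

    static boolean isRed(Node x) {
        return x != null && x.color == RED;   // null links are black
    }

    // Red links lean left, and no node has two red links connected to it.
    static boolean is23(Node x) {
        if (x == null) return true;
        if (isRed(x.right)) return false;
        if (isRed(x) && isRed(x.left)) return false;
        return is23(x.left) && is23(x.right);
    }

    // Black links on every path from the link into x down to a null link
    // (counting the link into x), or -1 if two paths disagree.
    static int blackHeight(Node x) {
        if (x == null) return 0;
        int l = blackHeight(x.left);
        int r = blackHeight(x.right);
        if (l < 0 || r < 0 || l != r) return -1;
        return l + (isRed(x) ? 0 : 1);
    }

    static boolean isLeftLeaningRedBlack(Node root) {
        return !isRed(root) && is23(root) && blackHeight(root) >= 0;
    }
}
```

A black node b with a red left child a passes all three checks; moving the red link to the right child, or replacing it with a single black child, makes the check fail.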
public Value get(Key key)
{
   Node x = root;
   while (x != null)
   {
      int cmp = key.compareTo(x.key);
      if      (cmp  < 0) x = x.left;
      else if (cmp  > 0) x = x.right;
      else if (cmp == 0) return x.val;   // search hit (link colors are never examined)
   }
   return null;                           // search miss
}
Remark. Most other ops (e.g., floor, iteration, selection) are also identical.
private static final boolean RED   = true;
private static final boolean BLACK = false;

private class Node
{
   Key key;            // key
   Value val;          // associated data
   Node left, right;   // subtrees
   int N;              // # nodes in this subtree
   boolean color;      // color of link from parent to this node
}

private boolean isRed(Node x)
{
   if (x == null) return false;   // null links are black
   return x.color == RED;
}

[Figure: a node h for which h.left.color is RED and h.right.color is BLACK]
[Figure: rotate left, before and after. Before: h = E with a red right link to S. After: x = S with a red left link to E. The subtrees less than E, between E and S, and greater than S keep their order.]

private Node rotateLeft(Node h)
{
   assert isRed(h.right);
   Node x = h.right;
   h.right = x.left;
   x.left = h;
   x.color = h.color;
   h.color = RED;
   return x;
}
[Figure: rotate right, before and after, with h = S and x = E]

private Node rotateRight(Node h)
{
   assert isRed(h.left);
   Node x = h.left;
   h.left = x.right;
   x.right = h;
   x.color = h.color;
   h.color = RED;
   return x;
}
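A small self-contained check can make the effect of the two rotations concrete. The sketch below copies the rotation bodies into a hypothetical `RotationSketch` class with `String` keys; the example tree and the `inorder` helper are assumptions added for illustration. It shows that a rotation changes which node is on top and which link is red, while the inorder key sequence stays the same.

```java
// Sketch only: rotations preserve symmetric order; names and example tree are illustrative.
class RotationSketch {
    static final boolean RED = true;
    static final boolean BLACK = false;

    static class Node {
        String key;
        Node left, right;
        boolean color;   // color of the link from the parent to this node

        Node(String key, boolean color) {
            this.key = key;
            this.color = color;
        }
    }

    static Node rotateLeft(Node h) {
        Node x = h.right;
        h.right = x.left;
        x.left = h;
        x.color = h.color;
        h.color = RED;
        return x;
    }

    static Node rotateRight(Node h) {
        Node x = h.left;
        h.left = x.right;
        x.right = h;
        x.color = h.color;
        h.color = RED;
        return x;
    }

    // Keys in ascending (inorder) order, concatenated.
    static String inorder(Node x) {
        if (x == null) return "";
        return inorder(x.left) + x.key + inorder(x.right);
    }

    // E with a right-leaning red link to S; A, M, X fill the three subtrees.
    static Node example() {
        Node e = new Node("E", BLACK);
        e.left = new Node("A", BLACK);
        Node s = new Node("S", RED);
        s.left = new Node("M", BLACK);
        s.right = new Node("X", BLACK);
        e.right = s;
        return e;
    }
}
```

Calling `rotateLeft` on the example returns S with a red left link to E, and a following `rotateRight` restores the original shape; `inorder` gives the same string at every step.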
[Figure: flip colors, before and after, at a node h = E whose children A and S are both attached by red links. The subtrees less than A, between A and E, between E and S, and greater than S are unchanged.]

private void flipColors(Node h)
{
   assert !isRed(h);
   assert isRed(h.left);
   assert isRed(h.right);
   h.color = RED;
   h.left.color = BLACK;
   h.right.color = BLACK;
}
[Figure: insert into a tree with a single node. Left: the search ends at the left null link and the new node is attached with a red link. Right: the search ends at the right null link, the new node is attached with a red link, and the tree is rotated left.]
Case 1. Insert into a 2-node at the bottom. Do standard BST insert; color new link red. If new red link is a right link, rotate left.
[Figure: insert C into the tree with keys E A R S]
[Figure: insert into a single 3-node a b, in three cases: new key larger, smaller, or between. The new node is attached with a red link, the tree is rotated as needed, and colors are flipped to black.]
[Figure: insert into a 3-node at the bottom, inserting H. Annotations: add new node here; two lefts in a row so rotate right; both children red so flip colors; right link red so rotate left.]
Case 2. Insert into a 3-node at the bottom. Do standard BST insert; color new link red. Rotate to balance the 4-node (if needed). Flip colors to pass red link up one level. Rotate to make the red link lean left (if needed). Repeat case 1 or case 2 up the tree (if needed).
[Figure: passing a red link up the tree while inserting P. Annotations: add new node here; both children red so flip colors; right link red so rotate left; two lefts in a row so rotate right.]
[Figure: insert S; resulting red-black BST]
Right child red, left child black: rotate left.
Left child, left-left grandchild red: rotate right.
Both children red: flip colors.

private Node put(Node h, Key key, Value val)
{
   if (h == null) return new Node(key, val, RED);   // insert at bottom (and color it red)
   int cmp = key.compareTo(h.key);
   if      (cmp  < 0) h.left  = put(h.left,  key, val);
   else if (cmp  > 0) h.right = put(h.right, key, val);
   else if (cmp == 0) h.val = val;

   if (isRed(h.right) && !isRed(h.left))     h = rotateLeft(h);    // right child red, left child black
   if (isRed(h.left)  && isRed(h.left.left)) h = rotateRight(h);   // left child, left-left grandchild red
   if (isRed(h.left)  && isRed(h.right))     flipColors(h);        // both children red

   return h;
}

Only a few extra lines of code provide near-perfect balance.
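To see the pieces working together, they can be assembled into one small class. This is a sketch under assumptions, not the authors' full implementation: the class name `LeftLeaningRedBlackSketch`, the public `put` wrapper that recolors the root black, the `Node` constructor, and the `height` helper are additions made here so the example runs on its own.

```java
// Sketch only: search and insert for a left-leaning red-black BST, assembled for illustration.
class LeftLeaningRedBlackSketch<Key extends Comparable<Key>, Value> {
    private static final boolean RED = true;
    private static final boolean BLACK = false;

    private Node root;

    private class Node {
        Key key;
        Value val;
        Node left, right;
        boolean color;   // color of the link from the parent to this node

        Node(Key key, Value val, boolean color) {
            this.key = key;
            this.val = val;
            this.color = color;
        }
    }

    private boolean isRed(Node x) {
        return x != null && x.color == RED;   // null links are black
    }

    private Node rotateLeft(Node h) {
        Node x = h.right;
        h.right = x.left;
        x.left = h;
        x.color = h.color;
        h.color = RED;
        return x;
    }

    private Node rotateRight(Node h) {
        Node x = h.left;
        h.left = x.right;
        x.right = h;
        x.color = h.color;
        h.color = RED;
        return x;
    }

    private void flipColors(Node h) {
        h.color = RED;
        h.left.color = BLACK;
        h.right.color = BLACK;
    }

    public void put(Key key, Value val) {
        root = put(root, key, val);
        root.color = BLACK;   // assumption of this sketch: keep the root link black
    }

    private Node put(Node h, Key key, Value val) {
        if (h == null) return new Node(key, val, RED);
        int cmp = key.compareTo(h.key);
        if      (cmp < 0) h.left  = put(h.left,  key, val);
        else if (cmp > 0) h.right = put(h.right, key, val);
        else              h.val   = val;

        if (isRed(h.right) && !isRed(h.left))     h = rotateLeft(h);
        if (isRed(h.left)  && isRed(h.left.left)) h = rotateRight(h);
        if (isRed(h.left)  && isRed(h.right))     flipColors(h);
        return h;
    }

    public Value get(Key key) {
        Node x = root;
        while (x != null) {
            int cmp = key.compareTo(x.key);
            if      (cmp < 0) x = x.left;
            else if (cmp > 0) x = x.right;
            else              return x.val;
        }
        return null;
    }

    // Number of links on the longest path from the root to a node (-1 for an empty tree).
    public int height() {
        return height(root);
    }

    private int height(Node x) {
        return x == null ? -1 : 1 + Math.max(height(x.left), height(x.right));
    }
}
```

Inserting keys in ascending order, the worst input for an elementary BST, is a convenient way to exercise it: the height stays logarithmic in the number of keys instead of growing linearly.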
Balance. Height of tree is at most 2 lg N in the worst case, because every path from root to null link has the same number of black links and there are never two red links in a row.
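ST implementations: summary

implementation                        worst-case cost (after N inserts)    average case (after N random inserts)         ordered       key
                                      search     insert     delete         search hit     insert        delete          iteration?    interface
sequential search (unordered list)    N          N          N              N/2            N             N/2             no            equals()
binary search (ordered array)         lg N       N          N              lg N           N/2           N/2             yes           compareTo()
BST                                   N          N          N              1.39 lg N      1.39 lg N     ?               yes           compareTo()
2-3 tree                              c lg N     c lg N     c lg N         c lg N         c lg N        c lg N          yes           compareTo()
red-black BST                         2 lg N     2 lg N     2 lg N         1.00 lg N *    1.00 lg N *   1.00 lg N *     yes           compareTo()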
Alto. GUI. Ethernet. Smalltalk. InterPress. Laser printing. Bitmapped display. WYSIWYG text editor. ...
[Figure: Xerox Alto]
A DICHROMATIC FRAMEWORK FOR BALANCED TREES
Leo J. Guibas, Xerox Palo Alto Research Center, Palo Alto, California, and Carnegie-Mellon University, and
ABSTRACT
In this paper we present a uniform framework for the implementation and study of balanced tree algorithms. We show how to imbed in this framework the best known balanced tree techniques and then use the framework to develop new algorithms which perform the update and rebalancing in one pass, on the way down towards a leaf. We conclude with a study of performance issues and concurrent updating.
0. Introduction
the way down towards a leaf. As we will see, this has a number of significant advantages over the older methods. We shall examine a number of variations on a common theme and exhibit full implementations which are notable for their brevity. One implementation is examined carefully, and some properties about its behavior are proved. In both sections 1 and 2 particular attention is paid to practical implementation issues, and complete implementations are given for all of the important algorithms. This is significant because one measure under which balanced tree algorithms can differ greatly is
Red-black BST search and insert; Hibbard deletion.
Exceeding height limit of 80 triggered error-recovery process. [allows for up to 2^40 keys]
Main cause = height bound exceeded! [Hibbard deletion was the problem]
Telephone company sues database provider.
Legal testimony: "If implemented properly, the height of a red-black BST with N keys is at most 2 lg N." (expert witness)
[Figure: slow storage and fast storage]
Property. Time required for a probe is much larger than time to access data within a page.
Cost model. Number of probes.
Goal. Access data using minimum number of probes.
B-tree. Generalize 2-3 trees by allowing up to M - 1 key-link pairs per node.
At least 2 key-link pairs at root.
At least M / 2 key-link pairs in other nodes.
External nodes contain client keys.
Internal nodes contain copies of keys to guide search.

[Figure: Anatomy of a B-tree set (M = 6)]
root:            * K                                                   (* is the sentinel key)
internal nodes:  * D H      K Q U                                      (each red key is a copy of min key in subtree)
external nodes:  * B C    D E F    H I J    K M N O P    Q R T    U W X Y   (client keys (black) are in external nodes)
K M N O P is an external 5-node (full); the figure also labels an external 3-node and an external 4-node.
All nodes except the root are 3-, 4- or 5-nodes.
Searching in a B-tree
Start at root. Find interval for search key and take corresponding link. Search terminates in external node.
[Figure: searching for E in a B-tree with root * K and external nodes * B C, D E F, H I J, K M N O P, Q R T, U W X; the search terminates in external node D E F]
Insertion in a B-tree
Search for new key. Insert at bottom. Split nodes with M key-link pairs on the way up the tree.
[Figure: inserting A. The new key goes into external node * B C E F, which becomes full and splits into * A B and C E F. The key C moves up, making the root * C H K Q U, which in turn splits into * C H and K Q U under a new root * K.]
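The search and insertion rules above can be sketched as a small in-memory B-tree set. This is an illustration under several assumptions: the class `BTreeSetSketch` and its fields are hypothetical names, keys are `String`s, M is fixed at 6 and must be even, and no sentinel key is stored (the first key of an internal node is simply never compared, which has the same effect). A real B-tree would keep each node in a page on disk.

```java
// Sketch only: B-tree set with search and insert; nodes split when they reach M key-link pairs.
class BTreeSetSketch {
    static final int M = 6;   // a node splits on reaching M key-link pairs; must be even

    static class Node {
        int m;                           // number of key-link pairs in use
        String[] keys = new String[M];   // client keys (external) or copies of min keys (internal)
        Node[] next = new Node[M];       // child links, used by internal nodes only
    }

    private Node root = new Node();
    private int height;                  // 0 while the root is an external node

    boolean contains(String key) {
        Node x = root;
        for (int ht = height; ht > 0; ht--) {   // internal nodes: find interval, take link
            int j = 0;
            while (j + 1 < x.m && key.compareTo(x.keys[j + 1]) >= 0) j++;
            x = x.next[j];
        }
        for (int j = 0; j < x.m; j++)            // external node: look for the key itself
            if (key.equals(x.keys[j])) return true;
        return false;
    }

    void add(String key) {
        if (contains(key)) return;               // set semantics: ignore duplicates
        Node u = insert(root, key, height);
        if (u == null) return;
        Node t = new Node();                     // the root split: make a new root above both halves
        t.m = 2;
        t.keys[0] = root.keys[0];
        t.next[0] = root;
        t.keys[1] = u.keys[0];
        t.next[1] = u;
        root = t;
        height++;
    }

    // Inserts key below h; returns the new right sibling if h split, else null.
    private Node insert(Node h, String key, int ht) {
        int j = 0;
        String newKey = key;
        Node newLink = null;
        if (ht == 0) {
            while (j < h.m && key.compareTo(h.keys[j]) >= 0) j++;   // position in sorted order
        } else {
            while (j + 1 < h.m && key.compareTo(h.keys[j + 1]) >= 0) j++;
            Node u = insert(h.next[j], key, ht - 1);
            if (u == null) return null;          // child did not split: nothing to add here
            newKey = u.keys[0];                  // copy of the min key in the new sibling
            newLink = u;
            j++;                                 // the sibling goes right after the child that split
        }
        for (int i = h.m; i > j; i--) {
            h.keys[i] = h.keys[i - 1];
            h.next[i] = h.next[i - 1];
        }
        h.keys[j] = newKey;
        h.next[j] = newLink;
        h.m++;
        return h.m < M ? null : split(h);
    }

    // Moves the upper half of a full node into a new node and returns it.
    private Node split(Node h) {
        Node t = new Node();
        t.m = M / 2;
        h.m = M / 2;
        for (int j = 0; j < M / 2; j++) {
            t.keys[j] = h.keys[M / 2 + j];
            t.next[j] = h.next[M / 2 + j];
        }
        return t;
    }

    int height() {
        return height;
    }
}
```

Splitting happens on the way back up the recursion, so a split at the bottom can propagate to the root, which is the only place where the tree grows in height.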
Balance in B-tree
Proposition. A search or an insertion in a B-tree of order M with N keys requires between log_{M-1} N and log_{M/2} N probes.
Pf. All internal nodes (besides root) have between M / 2 and M - 1 links.
In practice. Number of probes is at most 4. [log_{M/2} N ≤ 4]
Optimization. Always keep root page in memory.
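The two bounds in the proposition are easy to evaluate for concrete sizes. In the sketch below, the values M = 1024 and N = 62 billion are illustrative assumptions chosen here to show how a probe count of at most 4 can arise; the class and method names are hypothetical.

```java
// Sketch only: evaluates log_{M-1} N and log_{M/2} N for assumed values of M and N.
class BTreeProbeSketch {
    static double log(double n, double base) {
        return Math.log(n) / Math.log(base);
    }

    // Fewest probes: every node has M - 1 links.
    static double lowerBound(int m, double n) {
        return log(n, m - 1);
    }

    // Most probes: every node (besides the root) has only M / 2 links.
    static double upperBound(int m, double n) {
        return log(n, m / 2.0);
    }

    public static void main(String[] args) {
        int m = 1024;        // assumed order
        double n = 62e9;     // assumed number of keys
        System.out.println("between " + lowerBound(m, n) + " and " + upperBound(m, n) + " probes");
    }
}
```

With these assumed values both bounds fall between 3 and 4, so the probe count rounds up to 4.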
[Figure: each line shows the result of inserting one key in some page; a full page splits into two half-full pages, then a new key is added to one of them]
Red-black trees are widely used as system symbol tables.
Java: java.util.TreeMap, java.util.TreeSet.
C++ STL: map, multimap, multiset.
Linux kernel: completely fair scheduler, linux/rbtree.h.

B-tree variants. B+ tree, B* tree, B# tree, ...
B-trees (and variants) are widely used for file systems and databases.
Windows: HPFS.
Mac: HFS, HFS+.
Linux: ReiserFS, XFS, Ext3FS, JFS.
Databases: ORACLE, DB2, INGRES, SQL, PostgreSQL.