You are on page 1of 57

Algorithms

R OBERT S EDGEWICK | K EVIN W AYNE

3.3 B ALANCED S EARCH T REES


2-3 search trees red-black BSTs

Algorithms
F O U R T H E D I T I O N

B-trees

R OBERT S EDGEWICK | K EVIN W AYNE


http://algs4.cs.princeton.edu

Symbol table review


worst-case cost (after N inserts) implementation search sequential search (unordered list) binary search (ordered array) insert delete average case (after N random inserts) search hit insert delete

ordered iteration?

key interface

N/2

N/2

no

equals()

lg N

lg N

N/2

N/2

yes

compareTo()

BST

1.39 lg N

1.39 lg N

yes

compareTo()

goal

log N

log N

log N

log N

log N

log N

yes

compareTo()

Challenge. Guarantee performance. This lecture. 2-3 trees, left-leaning red-black BSTs, B-trees.

3.3 B ALANCED S EARCH T REES


2-3 search trees red-black BSTs

Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu

B-trees

2-3 tree
Allow 1 or 2 keys per node.

2-node: 3-node:

one key, two children. two keys, three children.

Perfect balance. Every path from root to null link has same length.

M
3-node 2-node

E J

AC

SX

null link
4

2-3 tree
Allow 1 or 2 keys per node.

2-node: 3-node:

one key, two children. two keys, three children.

Perfect balance. Every path from root to null link has same length. Symmetric order. Inorder traversal yields keys in ascending order.
M

smaller than E

E J

R
larger than J

AC

SX

between E and J
5

2-3 tree demo


Search.

Compare search key against keys in node. Find interval containing search key. Follow associated link (recursively).
search for H

E J

AC

SX

2-3 tree demo


Insertion into a 3-node at bottom.

Add new key to 3-node to create temporary 4-node. Move middle key in 4-node into parent. Repeat up the tree, as necessary. If you reach the root and it's a 4-node, split it into three 2-nodes.

insert L

AC

SX

2-3 tree construction demo

insert S

2-3 tree construction demo

2-3 tree

AC

SX

Local transformations in a 2-3 tree


Splitting a 4-node is a local transformation: constant number of operations.

a e b c d

less than a

between a and b

between b and c

between c and d

between d and e

greater than e

a c e b less than a between a and b between b and c d between d and e greater than e

between c and d

Splitting a 4-node is a local transformation that preserves balance

10

Global properties in a 2-3 tree


Invariants. Maintains symmetric order and perfect balance. Pf. Each transformation maintains symmetric order and perfect balance.

root

parent is a 3-node

a b c

left
a b c

d e

b d e

parent is a 2-node

left
a b c

b d

middle

a e b c d

a c e

c
a c

right

a b c d

right
d

a b c d e

a b d

Splitting a temporary 4-node in a 2-3 tree (summary)

11

2-3 tree: performance


Perfect balance. Every path from root to null link has same length.

Typical 2-3 tree built from random keys

Tree height.

Worst case: Best case:

12

2-3 tree: performance


Perfect balance. Every path from root to null link has same length.

Typical 2-3 tree built from random keys

Tree height. [all 2-nodes] Worst case: lg N. Best case: log N .631 lg N. [all 3-nodes] Between 12 and 20 for a million nodes. Between 18 and 30 for a billion nodes.
3

Guaranteed logarithmic performance for search and insert.

13

ST implementations: summary

worst-case cost (after N inserts) implementation search sequential search (unordered list) binary search (ordered array) insert delete

average case (after N random inserts) search hit insert delete

ordered iteration?

key interface

N/2

N/2

no

equals()

lg N

lg N

N/2

N/2

yes

compareTo()

BST

1.39 lg N

1.39 lg N

yes

compareTo()

2-3 tree

c lg N

c lg N

c lg N

c lg N

c lg N

c lg N

yes

compareTo()

constants depend upon implementation

14

2-3 tree: implementation?


Direct implementation is complicated, because:

Maintaining multiple node types is cumbersome. Need multiple compares to move down tree. Need to move back up the tree to split 4-nodes. Large number of cases for splitting.
Bottom line. Could do it, but there's a better way.

15

3.3 B ALANCED S EARCH T REES


2-3 search trees red-black BSTs

Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu

B-trees

3.3 B ALANCED S EARCH T REES


2-3 search trees red-black BSTs

Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu

B-trees

Left-leaning red-black BSTs (Guibas-Sedgewick 1979 and Sedgewick 2007)


3-node

black tree

1. Represent 23 tree as a BST.


J E C H
less than a
3-node

a b

2. Use "internal"Mleft-leaning links as "glue" 3nodes. greater less for between


R L
a b

than a

a and b

than b

P S
greater than b

X
a less than a
greater Rthan b

b greater than b

larger key is root

between a and b b a M

zontal red links

between a and b

E A C H

less than a

between a and b

Encoding a 3-node with two 2-nodes connected by a left-leaning red link

Encoding a 3-node with two 2-nodes connected by a left-leaning red link

red links "glue" nodes within a 3-node

black links connect 2-nodes and 3-nodes

tree

M E J A C H L
2-3 tree

redblack tree

M J E L H
corresponding red-black BST

R P S X
C A
horizontal red links

R P S X

1 correspondence between red-black and 2-3 trees

M E J R

18

An equivalent denition
A BST such that:

No node has two red links connected to it. Every path from root to null link has the same number of black links. Red links lean left.
"perfect black balance"

redblack tree

M J E C H L P S R X

A
horizontal red links

M E J H L P R S X
19

Left-leaning red-black BSTs: 1-1 correspondence with 2-3 trees


Key property. 11 correspondence between 23 and LLRB.
redblack tree

M J E C H L P S R X

A
horizontal red links

M E J H L P R S X

2-3 tree

M E J A C H L P R S X

11 correspondence between red-black and 2-3 trees


20

Search implementation for red-black BSTs


Observation. Search is the same as for elementary BST (ignore color).
but runs faster because of better balance

public Val get(Key key) { Node x = root; while (x != null) { int cmp = key.compareTo(x.key); if (cmp < 0) x = x.left; else if (cmp > 0) x = x.right; else if (cmp == 0) return x.val; } return null; }

redblack tree

M J E C H L P S R X

A
horizontal red links

M E J H L P R S X

2-3 tree

Remark. Most other ops (e.g., floor, iteration, selection) are also identical. R E J
A C H L P S X
21

Red-black BST representation


Each node is pointed to by precisely one link (from its parent) can encode color of links in nodes.

private static final boolean RED = true; private static final boolean BLACK = false; private class Node { Key key; Value val; Node left, right; boolean color; // color of parent link } private boolean isRed(Node x) { if (x == null) return false; return x.color == RED; }
null links are black

h.left.color is RED

E J D G

C A

h.right.color is BLACK

private static final boolean RED = true; private static final boolean BLACK = false; private class Node { Key key; Value val; Node left, right; int N; boolean color;

// // // // // //

key associated data subtrees # nodes in this subtree color of link from parent to this node

Node(Key key, Value val) { this.key = key; this.val = val;

22

Elementary red-black BST operations


Left rotation. Orient a (temporarily) right-leaning red link to lean left.

rotate E left (before) h

E S
x

less than E between E and S greater than S

private Node rotateLeft(Node h) { assert isRed(h.right); Node x = h.right; h.right = x.left; x.left = h; x.color = h.color; h.color = RED; return x; }

Invariants. Maintains symmetric order and perfect black balance.


23

Elementary red-black BST operations


Left rotation. Orient a (temporarily) right-leaning red link to lean left.

rotate E left (after)

S
h

E
greater than S

less than E

between E and S

private Node rotateLeft(Node h) { assert isRed(h.right); Node x = h.right; h.right = x.left; x.left = h; x.color = h.color; h.color = RED; return x; }

Invariants. Maintains symmetric order and perfect black balance.


24

Elementary red-black BST operations


Right rotation. Orient a left-leaning red link to (temporarily) lean right.

rotate S right (before)

S
x

E
greater than S

less than E

between E and S

private Node rotateRight(Node h) { assert isRed(h.left); Node x = h.left; h.left = x.right; x.right = h; x.color = h.color; h.color = RED; return x; }

Invariants. Maintains symmetric order and perfect black balance.


25

Elementary red-black BST operations


Right rotation. Orient a left-leaning red link to (temporarily) lean right.

rotate S right (after) x

E S
h

less than E between E and S greater than S

private Node rotateRight(Node h) { assert isRed(h.left); Node x = h.left; h.left = x.right; x.right = h; x.color = h.color; h.color = RED; return x; }

Invariants. Maintains symmetric order and perfect black balance.


26

Elementary red-black BST operations


Color flip. Recolor to split a (temporary) 4-node.

ip colors (before) h

E S

less than A

between A and E

between E and S

greater than S

private void flipColors(Node h) { assert !isRed(h); assert isRed(h.left); assert isRed(h.right); h.color = RED; h.left.color = BLACK; h.right.color = BLACK; }

Invariants. Maintains symmetric order and perfect black balance.


27

Elementary red-black BST operations


Color flip. Recolor to split a (temporary) 4-node.

ip colors (after) h

E S

less than A

between A and E

between E and S

greater than S

private void flipColors(Node h) { assert !isRed(h); assert isRed(h.left); asset isRed(h.right); h.color = RED; h.left.color = BLACK; h.right.color = BLACK; }

Invariants. Maintains symmetric order and perfect black balance.


28

Insertion in a LLRB tree: overview


Basic strategy. Maintain 1-1 correspondence with 2-3 trees by applying elementary red-black BST operations.
insert C

E A add new node here right link red so rotate left E A C E C A R S R S R S

E A R S

E A C R S

Insert into a 2-node at the bottom

29

Insertion in a LLRB tree

search ends at this null link


root

Warmup 1. Insert into a tree with exactly 1 node. b


a

red link to new node containing a converts 2-node to 3-node


root

left

root

right

b search ends at this null link


root

a a

search ends at this null link attached new node with red link b
root

b a

red link to new node containing a converts 2-node to 3-node


root

b a

rotated left to make a legal 3-node

right

a a

search ends at this null link attached new node with red link b
root

Insert into a single 2-node (two cases)

30

Insertion in a LLRB tree


Case 1. Insert into a 2-node at the bottom.

Do standard BST insert; color new link red. If new red link is a right link, rotate left.
insert C

E A add new node here right link red so rotate left E A C E C A R S R S R S

E A R S

E A C R S

Insert into a 2-node at the bottom


31

Insertion in a LLRB tree


Warmup 2. Insert into a tree with exactly 2 nodes.
larger
larger
between smaller between between searchc ends c c search ends c c b search ends c b at this at this at this a search ends b null link a a search ends b null link a search ends a link b null at at this null link this null link at this null link search ends search endssearch ends at this null link c at this null link larger smaller

smaller

b a

at this null link

b a

b a

attached new b a b attached b new node with c node with red link c ab a b attached new a red link c a c attached new node with a node with red link red link rotated b b colors flippedrighta rotated b a to black colors flipped b c a right b c a colors flipped to black c a a black c to b b c colors flipped b a a to black colors flipped a to blackc b

attached new c node with red link b

ca

a new attached attached new node with b node b withred link attached new

attached new node with red link c

link nodered with red link c c


b rotated

b rotated left b c b rotated right

rotated left
b

a right a rotated left c

cc

rotated right a rotated colors flipped a to right black c b a

Insert into a single 3-node (three cases)

colors flipped c to black Insert into a single 3-node (three cases) c a b

colors flipped to black colors flipped c a to black

32

Insert into a single 3-node (three cases)

so rotate right E

Insertion in a LLRB tree

C
inserting H

S R H

A E C R S E C A

Case 2. Insert into a 3-node at the bottom.


A

color new link red. Do standard BST insert; inserting H add new E needed). Rotate to balance the 4-node (if node here C S two lefts in a row inserting Hto pass red link up one level. Flip colors so rotate right A R E E C lean S left (if needed). Rotate to make C S add new
A R

both children red so flip colors R H right link red so rotate left E C A H R E C A Insert into a 3-node at the bottom H S R S S

node here

inserting H

add new node here E C both children red S two lefts in a rowC S so flip colors so rotate right A R E A R E C R H C S add new S A H node here A R both children red two lefts in a row H so flip colors so rotate right right link red E so rotate left E both children redC R C S so flip colors E S A H A R E C R R H C right link red S A H S so rotate left A H both children red so flip colors right link red R E E so rotate left S E C R C R H SC A H A H E S
E

A two lefts in a row H so rotate right

33

node here C A R H S

Insertion in a LLRB tree: passing red links up the tree C


A

add new node here M both children R red so P H flip colors S E C A R H S M P both children red so flip colors

Case 2. Insert into a 3-node at the bottom.


inserting P

Do standard BST insert; color new link red. Rotate to balance the 4-node (if needed). Flip colors to pass red link up one level. Rotate to make lean left (if needed). Repeat case 1 or case 2 up the tree (if needed).
R E S C M
inserting P

right link red so rotate left E C

inserting P

add new node here R

right link red M so P rotate left R A H S E two lefts in a row M so rotate right C P A R H M E C H M both children red E so flip colors C H MA E C A M E C A E C the tree H Passing a red link up A P H A P C R H P R P S two lefts in a row P so rotate right R S

E
inserting P

A S

C R E S A H

add new S node here E C M right link red C M so rotate left R both children A H R red so P A H S E add new E S flip colors node here C M C M both children right link red P A H R red so P A H so rotate left R flip colors S E two lefts in a row S E C M so rotate right right link red both children C M red soleft R P so rotate A H R flip colors P A H M S S E P E C M right link red two lefts in a row so rotate left so rotate right C R H P A H R A S E both children red two lefts in a row M S C M so flip colors so rotate right P E P A H R C H M M S two lefts in a row A R E

add new C node here A R

E M H

A both children red so P flip colors

both children red so flip colors S M R E H S M R S P S

Passing a red link up the tree


34

Red-black BST construction demo

insert S

35

Red-black BST construction demo

red-black BST

M E C A H L P S R X

36

Insertion in a LLRB tree: Java implementation


Same code for all cases.

Right child red, left child black: rotate left. Left child, left-left grandchild red: rotate right. Both children red: flip colors.

left rotate right rotate

flip colors

private Node put(Node h, Key key, Value val) { if (h == null) return new Node(key, val, RED); int cmp = key.compareTo(h.key); if (cmp < 0) h.left = put(h.left, key, val); else if (cmp > 0) h.right = put(h.right, key, val); else if (cmp == 0) h.val = val;

Passing a red link up a red-black tree

insert at bottom (and color it red)

if (isRed(h.right) && !isRed(h.left)) h = rotateLeft(h); if (isRed(h.left) && isRed(h.left.left)) h = rotateRight(h); if (isRed(h.left) && isRed(h.right)) flipColors(h); return h; }
only a few extra lines of code provides near-perfect balance

lean left balance 4-node split 4-node

37

Insertion in a LLRB tree: visualization

255 insertions in ascending order

38

Insertion in a LLRB tree: visualization

255 insertions in descending order


39

Insertion in a LLRB tree: visualization

255 random insertions

40

Balance in LLRB trees


Proposition. Height of tree is 2 lg N in the worst case. Pf.

Every path from root to null link has same number of black links. Never two red links in-a-row.

Property. Height of tree is ~ 1.00 lg N in typical applications.


41

ST implementations: summary

worst-case cost (after N inserts) implementation search insert delete

average case (after N random inserts) search hit insert delete

ordered iteration?

key interface

sequential search (unordered list) binary search (ordered array)

N/2

N/2

no

equals()

lg N

lg N

N/2

N/2

yes

compareTo()

BST

1.39 lg N

1.39 lg N

yes

compareTo()

2-3 tree

c lg N

c lg N

c lg N

c lg N

c lg N

c lg N

yes

compareTo()

red-black BST

2 lg N

2 lg N

2 lg N

1.00 lg N *

1.00 lg N *

1.00 lg N *

yes

compareTo()

* exact value of coefficient unknown but extremely close to 1


42

War story: why red-black?


Xerox PARC innovations. [1970s]

Alto. GUI. Ethernet. Smalltalk. InterPress. Laser printing. Bitmapped display. WYSIWYG text editor. ...
A DIClIROlV1ATIC FUAl\lE\V()HK Fon BALANCED TREES
Leo J. Guibas .Xerox Palo Alto Research Center, Palo Alto, California, and Carnegie-Afellon University and

Xerox Alto

Robert Sedgewick* Program in Computer Science Brown University Providence, R. I.

ABSTUACT

I() this paper we present a uniform framework for the implementation and study of halanced tree algorithms. \Ve show how to imhcd in this framework the best known halanced tree tecilIliques and thell usc the framework to deVl'lop new which perform the update and rebalancing in one pass, Oil the way down towards a leaf. \Ve conclude with a study of performance issues and concurrent updating.

the way down towards a leaf. As we will see, this has a number of significant advantages ovcr the older methods. We shall cxamine a numhcr of variations on a common theme and exhibit full implementations which are notable for their brcvity. One imp1cn1entation is exatnined carefully, and some properties about its behavior are proved. ]n both sections 1 and 2 particular attention is paid to practical implementation issues, and cOlnplcte impletnentations are given for all of the itnportant algorithms. '1l1is is significant because one measure under which balanced tree algorithtns can differ greatly is

43

O.

Introduction

War story: red-black BSTs


Telephone company contracted with database provider to build real-time database to store customer information. Database implementation.

Red-black BST search and insert; Hibbard deletion. Exceeding height limit of 80 triggered error-recovery process.
allows for up to 240 keys Hibbard deletion was the problem

Extended telephone service outage.

Main cause = height bounded exceeded! Telephone company sues database provider. Legal testimony:
If implemented properly, the height of a red-black BST with N keys is at most 2 lg N. expert witness

44

3.3 B ALANCED S EARCH T REES


2-3 search trees red-black BSTs

Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu

B-trees

3.3 B ALANCED S EARCH T REES


2-3 search trees red-black BSTs

Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu

B-trees

File system model


Page. Contiguous block of data (e.g., a file or 4,096-byte chunk). Probe. First access to a page (e.g., from disk to memory).

slow

fast

Property. Time required for a probe is much larger than time to access data within a page. Cost model. Number of probes. Goal. Access data using minimum number of probes.

47

B-trees (Bayer-McCreight, 1972)


B-tree. Generalize 2-3 trees by allowing up to M - 1 key-link pairs per node.

At least 2 key-link pairs at root. At least M / 2 key-link pairs in other nodes. External nodes contain client keys. Internal nodes contain copies of keys to guide search.

choose M as large as possible so that M links fit in a page, e.g., M = 1024

* K sentinel key * D H external 3-node * B C D E F client keys (black) are in external nodes H I J

2-node internal 3-node

each red key is a copy of min key in subtree external 5-node (full) K M N O P

K Q U external 4-node Q R T U W X Y

all nodes except the root are 3-, 4- or 5-nodes Anatomy of a B-tree set (M = 6)
48

Searching in a B-tree

Start at root. Find interval for search key and take corresponding link. Search terminates in external node.

searching for E

follow this link because E is between * and K * D H

* K

K Q U follow this link because E is between D and H

* B C

D E F

H I J

K M N O P

Q R T

U W X

search for E in this external node Searching in a B-tree set (M = 6)

49

Insertion in a B-tree

Search for new key. Insert at bottom. Split nodes with M key-link pairs on the way up the tree.
inserting A

* H K Q U

* B C E F

H I J

K M N O P * H K Q U

Q R T

U W X

* A B C E F new key (A) causes overflow and split

H I J

K M N O P * C H K Q U

Q R T

U W X new key (C) causes overflow and split

* A B

C E F

H I J * K

K M N O P

Q R T

U W X

* C H

root split causes a new root to be created H I J K M N O P

K Q U

* A B

C E F

Q R T

U W X

Inserting a new key into a B-tree set


50

Balance in B-tree
Proposition. A search or an insertion in a B-tree of order M with N keys requires between log M-1 N and log M/2 N probes. Pf. All internal nodes (besides root) have between M / 2 and M - 1 links.

In practice. Number of probes is at most 4. Optimization. Always keep root page in memory.

M = 1024; N = 62 billion log


M/2

N 4

51

Building a large B tree


white: unoccupied portion of page

each line shows the result of inserting one key in some page

black: occupied portion of page

full page, about to split

full page splits into two half -full pages then a new key is added to one of them

52

Balanced trees in the wild


Red-black trees are widely used as system symbol tables.

Java: java.util.TreeMap, java.util.TreeSet. C++ STL: map, multimap, multiset. Linux kernel: completely fair scheduler, linux/rbtree.h.
B-tree variants. B+ tree, B*tree, B# tree, B-trees (and variants) are widely used for file systems and databases.

Windows: HPFS. Mac: HFS, HFS+. Linux: ReiserFS, XFS, Ext3FS, JFS. Databases: ORACLE, DB2, INGRES, SQL, PostgreSQL.

53

Red-black BSTs in the wild

Common sense. Sixth sense. Together they're the FBI's newest team.

54

Red-black BSTs in the wild

55

3.3 B ALANCED S EARCH T REES


2-3 search trees red-black BSTs

Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu

B-trees

Algorithms

R OBERT S EDGEWICK | K EVIN W AYNE

3.3 B ALANCED S EARCH T REES


2-3 search trees red-black BSTs

Algorithms
F O U R T H E D I T I O N

B-trees

R OBERT S EDGEWICK | K EVIN W AYNE


http://algs4.cs.princeton.edu

You might also like