You are on page 1of 71

Balanced search trees: 2‐3‐4 trees.

2‐3‐4 (or 2‐4) trees improve the efficiency of insertItem and deleteItem methods of 2‐3 trees, because they are performed on the path from the root to the leaf. However, they require more memory for storing 3 data items and 4 pointers in each node. Definition: A 2‐3‐4 tree is a general tree which satisfies the following properties: 1 Each node may store three data items. 2 Each node may have four children. 3 The second and third data items in any node may be empty, in which case sentinel value emptyFlag is stored there (assume emptyFlag := 0). If they are not empty, the first data item precedes the second one according to the specified ordering relationship, the second data item precedes the third data item. 4. For each node, data in the first child precedes the first data item in the node; data in the second child follows the first data item, but precedes the second; data in the third child follows the second data item, but precedes the third; data in the fourth child follows the third data item. 5 All leaf nodes are on the same level.

Example 2‐3‐4 tree
4 2 1
Class Node234tree { Node234tree firstChild; Node234tree secondChild; Node234tree thirdChild; Node234tree fourthChild; p ; Node234tree parent; int firstItem; int secondItem; int thirdItem; .... class methods follow }

6 8 3 5 7 9 10 11

Search in 2‐3‐4 trees
The search algorithm is similar to that in 2‐3 trees and binary search trees. In the example 2‐3‐4 tree, the search for 10 is carried out as follows: 1. 1 Compare 10 to the only item in the root. 10 > 4, continue the search root 4 in the second child. 2. 10 > 6 and 10 > 8, continue the search in the third child. 3. 10 > 9, 10 = 10. Stop. As in 2‐3 trees, the efficiency of the search operation is guaranteed to be O(log n). On average, it will be better that the search efficiency in 2‐3 trees, trees because the height of a 2‐3‐4 tree might be less than the height of the 2‐3 tree with the same data.

Insertion in 2‐3‐4 trees
Step 1 Search for the item to be inserted (same as in 2‐3 trees). Step 2 Insert at the leaf level. The following cases are possible: • The termination node is a 2‐node. Then, make it a 3‐node, and insert the new item appropriately. • The termination node is a 3‐node. Then, make it a 4‐node, and insert the new item appropriately. • The termination node is a 4 node. Split is, pass the middle to the parent, and insert the new item appropriately. General rules for inserting new nodes in 2‐3‐4 trees: Rule 1: During the search step, every time a 2‐node connected to a 4‐node is encountered, transform it into a 3‐node connected to two 2‐nodes. Rule 2: During the search step, every time a 3‐node connected to a 4‐node l h h d d d is encountered, transform it into a 4‐node connected to two 2‐nodes. Note that two 2‐nodes resulting from these transformations have the same number of children as the original 4‐node. This is why the split of a 4‐node does not affect any nodes below the level where the split occurs.

Efficiency of search and insert operations
Result 1: Search in a 2‐3‐4 tree with N nodes takes at most O(log N) time. This is in case if all nodes are 2 nodes. If there are 3‐nodes and 4‐nodes on the tree, the search will take less than (log N) time. Result 2: Insertion into a 2‐3‐4 tree takes less than O(log N) time, and on average requires less than 1 node split.

Deletion in 2‐3‐4 tree
Consider our example tree 4 2 1 3 5 6 8 7 9 10 11

The following special cases (with multiple sub‐cases each) are possible: Case 1 (three sub‐cases): The item is deleted from a leaf node (a node with external children), which currently contains 2 or 3 items. Easy sub‐cases – delete the item transforming a 4‐node into a 3 node, or a 3 node into a 2 node. No other nodes are affected. Example: delete 9 – the existing 4 node, containing 9, 10, and 11 is transformed into a 3 node, containing 10 and 11. Deleting from a 2‐ node (the third sub‐case) requires an item from the parent node to be drawn, which in turn must be replaced by an item from the sibling note (if the sibling node is NOT a 2‐node as well). See case 2.

Deletion in 2‐3‐4 tree (contd.)
Case 2 (with several more sub‐cases) Delete from a node that has non‐external children. For example, delete 8. This case can be reduced to case 1 by finding the item that precedes the one to be deleted in in‐order traversal (7, in our example) and exchanging the two items. If 7 were part of a 3‐ or 4‐ node, 8 would have been deleted easily. However, since 8 is now the only item in the node, we have a case of underflow. This requires that an item from the parent node be transferred to the underflow node, and substituted in the parent node by an item from the sibling node. In our example, 7 will be transferred back to where it was, and 9 will move to the parent node to fill the gap. However, if the sibling node is also a 2‐node, the so‐called fusing takes place. That is, the two 2‐node siblings are “fused” in a single 3‐node, after an item is transferred from the parent node The later suggests that the parent can now handle one less child and it node. child, indeed has one child less after two of its former children are fused. The last sub‐case suggests that a parent node is also a 2‐node. Then, it must in turn borrow from its parent, etc., resulting in the 2‐3‐4 tree becoming one level shorter.

Efficiency of 2-3 Tree
• As for any search tree, the efficiency depends on the tree’s height • A 2-3 tree of height h with the smallest number of keys is a full tree of 2-nodes
level 0 1 h−1 items 1 2 2h−1

2-3-(4) Trees

n ≥ 1 + 2 + ... + 2h −1
• So

h ≤ log 2 ( n + 1)

2-3 Tree Visualization
• A 2-3 tree of height h with the largest number of keys is a full tree of 3-nodes, each two keys and three children • http://slady.net/java/bt/view.php?w=600&h=450

• Text input box and three buttons
– Enter a number in the text box and click ”Insert” Insert to have it entered into the 2-3 tree – Similarly, you can delete from the tree and search the tree – All three buttons work on data entered in the same text box

n ≤ 2 ⋅1 + 2 ⋅ 3 + ... + 2 ⋅ 3 h −1
• So

h ≥ log 3 (n + 1)

Practice
• Construct a 2-3 tree for the list 9, 5, 8, 3, 2, 4, 7 by successive insertions

2-3-4 Trees
• Same as 2-3 tree, except now nodes can have four subtrees (and 3 items) • Traversal and searching operations are simply extensions of the same procedures for Binary Search Trees (BST) and 2-3 Trees • 2-3-4 trees are useful because the steps required to resolve insertion and deletion dilemmas are reduced compared to 2-3 trees

Insertion
• In order to insert an item into a 2-3-(4) tree it is first necessary to locate the leaf that the item will be inserted into • This requires that we start from the root and make comparison decisions until we arrive at a leaf • This effectively traverses a “path” from the root node down to that leaf

The difficulty with insertions
• Recall that in 2-3 trees we blindly followed this path to the leaf. Then we (over)filled nodes to accommodate the new item • Overfilled nodes were resolved by pushing items up to the parent • This push had the possibility of overfilling the parent as well and the problem was propagated up the tree

Taking advantage (of descent)
• 2-3-4 Trees make use of the descent from the root to the leaf • 2-3 trees blindly descend to the leaf and 23 make modifications to the tree only on the way back up as nodes become overfull • 2-3-4 also make modifications to the tree during the descent to find the leaf

The simple modification
• While searching for the leaf on the way down if we encounter a “4” node then we immediately split that node…
50 30 10 15 20 50 40 70 60 80 90 10 15 20 30 40 60 70 80 90

• Wh would you d such craziness? Why ld do h i ?
– To make room for items that might be pushed up! (now 50, 30 and 70 can hold at least one more item that might be pushed up from below)

Reasoning
• 2-nodes and 3-nodes can always hold one more item • The only nodes that can t hold one more item can’t are 4-nodes • Items are only pushed up to ancestors • If we encounter an ancestor on the way down that won’t be able to accept an item we p can transform it into 2-nodes that can before the insertion is actually performed
root

Special case

Other cases
2-node parent

Other cases
3-node parent

Guarantee
• This guarantees that the exact same insertion routine presented for 2-3 trees can p g y be performed without ever having to worry about “resolving” overfull nodes

Example
• Adding 16 to the following tree
– Step 1a: (descent) Is current node a 4-node?
30 10 15 20 50 40 70 60

*
80 90

Yes
50 30 10 15 20 40

*
70 60 80 90

• Step 1b: Compare search key with current node (16 < 50)
50 30 10 15 20

• Step 1a: Is current node a 4-node? Yes
Yes
70 40 60 80 90 10

*
50 70 30 60 80 90

*
40

*
15 30 20

50 70 40 60 80 90

*

*
10 15 20

• Step 1a: Is current node (*) a 4-node? No. • Step 1b: Compare search key with current node (16<30)

• 16 < 30 which is the middle branch
50 15 30 70

• The insertion …
50 15 30 70

10

*

20

40

60

80 90

10

16 20

40

60

80 90

• Th arrived at l f i a 2 d ( maybe a 3 The i d t leaf is 2-node (or b 3node) so you can simply insert the new item and nothing will have to propagate up

• N thi can propagate up b Nothing t because th the necessary space was made during descent to the leaf

2-3-4 Tree Visualization
• http://www.cse.ohio-state.edu/~bondhugu/acads/234-tree/index.shtml

• Insert 60, 30, 10, 20, 50, 40, 70, 80, 15, 90, 100

Inserting 60, 30, 10, 20 ...

... 50, 40 ...

Inserting 50, 40 ...

Inserting 70 ...

... 70, ...

... 80, 15 ...

Inserting 80, 15 ...

Inserting 90 ...

... 90 ...

... 100 ...

2-3-4 Deletions
• Like with the 2-3 tree, to delete we
Inserting 100 ...

– find the node containing the value to be deleted – find the value that comes next in the tree – swap the next value with the value to be deleted – delete the swapped value which is now in a leaf node which is
• either a 3-node or 4-node and we do not have to physically delete a node, or • a 2-node and we have to take care of removing the node from the tree

2-3-4 Deletions
• Bottom-up strategy
– if the 2-node has a sibling that is a 3-node or 4g node, redistribute the values between the sibling, parent and current node – if the 2-node has no nearby 3-node or 4-node siblings, then merge the sibling with the parent and move up to the parent level and perform the deletion recursively

2-3-4 Deletions
• Top-down strategy
– when searching for the node containing the g (collapse) any p ) y value to be deleted we will merge ( 2-node (except root) into a larger node – in this way, we can be assured that any physical removal will take place only from a 3-node or 4node – there are many merge situations like there were y g 6 split situations
• the cases depend on the type of node that is the given 2-node’s parent and a next sibling to the left or right

2-3-4 Deletions
Turning a 2-node into a 3-node ... Case 1: an adjacent sibling has 2 or 3 items
"steal" item from sibling by rotating items and moving subtree steal
(note: parent has at least two items, unless it is the root)

2-3-4 Deletions
Turning a 2-node into a 3-node ... Case 2: each adjacent sibling has only one item
"steal" item from parent and merge node with sibling steal
(note: parent has at least two items, unless it is the root)

30 50

20 50

30 50

50

10 20

40

*
"rotation"

10

30 40

10

40

*
merging

10 30 40

25

25

25

35

25

35

2-3-4 Deletions
Turning a 2-node into a 4-node ... Case 3: parent is root and parent and each adjacent sibling has only one item a o yo
merge node with parent and sibling
20

Example
• Delete 40
40

50

30

10 30 40

14

32

43

62 70 79

10

40

*
merging

25

35 10 18 25 33 42 47 57 60 66 74 81

25

35

Red-Black Trees — Idea
• We can represent a 4-node of the 2-3-4 tree using only 2nodes if we add two new nodes (shown in red)
4 7 12 2 5 9 18 2 7 4 5 9 12 18

Here again is our transformed 4-node
7 4 7 12 2 5 9 18 2 4 5 9 12 18

We can also apply this process to a 3-node (in two different ways)
4 7 2 5 9 2 4 5 7 9 or 4 2 5 7 9

• Note that the ordering property of the 2-3-4 Tree translates into the ordering property of a Binary Search Tree • Th text talks about red and black links but “links” or The t t t lk b t d d bl k li k b t “li k ” children references do not have the ability to contain information about their “color”

2-nodes are essentially left alone
4 2 5 2 4 5

• A Red-Black Tree is a Binary-Tree representation of a 2-3-4 Tree
– Think of black nodes as representing 2-3-4 Tree nodes – Think of red nodes as being the extra ones required to make a Binary Tree out of the 2-3-4 Tree
28 20 28 5 12 15 3 24 35 42 5 3 7 9 12 20 24 35 42 48

• Every red-black tree has an equivalent 2-3-4 representation
I C C A EGH LM NR A SX E G H L M P S I N R X

7 9 14 19 22 25 30 40 45 48 50

15 22 25 30 40 45 50 14 19

P

– It is no longer true that every leaf is at the same level. However, given a node, every path from it down to a leaf goes through the same number of black nodes

Practice
• Thus we have the following restriction in the definition of red-black trees • There must be an equal number of black nodes on every path from the root to a node with fewer than two children • If you take any 2-3-4 tree you can illustrate it as a Red-Black Tree… • Figure 12-20 from the book as a 2-3-4 tree 12 20 234

• Same tree as a Red-Black tree

• Red-black trees are undercover 2-3-4 trees • We can tell if a node is really a 4-node by checking if both its left and right children are colored red • A node is a 3-node if its not a 4-node and has one red child

• There are 15 other valid variations

Red-black trees: benefits
• 2-3 and 2-3-4 trees typically waste a lot of memory on unused items and child references • 2-nodes are really 4-nodes with 2/3 of memory for that node being unused • Red-black trees have the advantages of 2-34 trees without the overhead

Red-black trees: insertion
• Insertion in a red-black tree can be implemented as a direct translation of the top-down 2-3-4 splitting algorithm
– search from root to leaf and then insert the new value – we must maintain height-balancing. how?
• in the 2-3-4 tree, we split any 4-node on the way down the tree using one of 6 cases red black • we do the same for our red-black tree, implementing the 6 cases from the point of view of red-black nodes rather than 4-nodes • most of the splitting is done by re-coloring nodes, but there are some cases that require additional rotations

Red-black trees: insertion
• To split a root 4-node: recolor the two children black
Y X Y Z X Y Z X Z X Z X Y W

Red-black trees: insertion
• To split a non-root 4-node: flip the color of all three nodes
W W Y Y Z X Z X Y Z X W Y Z

Red-black trees: insertion
• One problem is that we may end up with two adjacent red nodes
– If the parent is a 3-node oriented the wrong way round – We fix this with a single rotation and further recoloring
VW XYZ VWY X Z
V W Y X Z X V W Y Z V X W Y Z V X W Y Z

Convention
• Leaf nodes are null links; the rest are internal nodes
30 15 10 20 50 5 40 55 60 65 70 85 80 90

Definition of black-height
• The number of black nodes on any path from a node, x, i a R B tree to a f d in R-B t t descendent leaf, but excluding node x itself, is the node’s black-height, denoted bh(x) RB • If T is an R-B tree, we define bh(T) as the black-height of its root

Black heights
30

3
70

2 1 10 1 5

15 20 50 40 55 60 65

85 80 90

0

Node Height
• The height of a node v is the number of nodes on the longest path from v to a leaf, but excluding node v itself
A height of the tree is 4

Theorem 1 – Any red-black tree with root x, has at least n = 2bh(x) – 1 internal nodes, where bh(x) is the black height of node x
Proof: by induction on height of x y g – Base step: x has height 0 (i.e. null leaf node) • What is bh(x)? – Inductive step: x has positive height and 2 children • Each child has black-height of bh(x) or bh(x) – 1 (Why?) • Since the height of a child of x is less than the height of x itself, we can apply the inductive hypothesis to conclude that each child has ≥ 2bh(x)-1 - 1 internal nodes. So, the tree rooted at x has at least (2bh(x)-1 - 1)+(2bh(x)-1 - 1)+1= 2bh(x) 1 internal nodes

height of node B is 3

B

C

D

E

F

G

H

I

Theorem 2 – In a red-black tree, at least half the nodes on any path from the root to a leaf, not including the root, must be black Proof: If a red node on the path has a child, then the color of the child must be black

Theorem 3 – A red-black tree with n internal nodes has height h <= 2log(n + 1) Proof: Let h be the height of the red-black tree with root x By Theorem 2 x. 2, bh(x) >= h/2 From Theorem 1, n >= 2bh(x) - 1 Therefore n >= 2 h/2 – 1 n + 1 >= 2h/2 log(n + 1) >= h/2 2log(n + 1) >= h

2-3-(4) Trees

Efficiency of 2-3 Tree
• As for any search tree, the efficiency depends on the tree’s height • A 2-3 tree of height h with the smallest number of keys is a full tree of 2-nodes
level items 0 1 h−1 1 2 2h−1

n ≥ 1 + 2 + ... + 2
• So

h −1

h ≤ log 2 (n +1)

• A 2-3 tree of height h with the largest number of keys is a full tree of 3-nodes, each two keys and three children

n ≤ 2 ⋅1 + 2 ⋅ 3 + ... + 2 ⋅ 3
• So

h −1

h ≥ log 3 (n + 1)

2-3 Tree Visualization
• http://slady.net/java/bt/view.php?w=600&h=450

• Text input box and three buttons
– Enter a number in the text box and click ”Insert” to have it entered into the 2-3 tree – Similarly, you can delete from the tree and search the tree – All three buttons work on data entered in the same text box

Practice
• Construct a 2-3 tree for the list 9, 5, 8, 3, 2, 4, 7 by successive insertions

2-3-4 Trees
• Same as 2-3 tree, except now nodes can have four subtrees (and 3 items) • Traversal and searching operations are simply extensions of the same procedures for Binary Search Trees (BST) and 2-3 Trees • 2-3-4 trees are useful because the steps required to resolve insertion and deletion dilemmas are reduced compared to 2-3 trees

Insertion
• In order to insert an item into a 2-3-(4) tree it is first necessary to locate the leaf that the item will be inserted into • This requires that we start from the root and make comparison decisions until we arrive at a leaf • This effectively traverses a “path” from the root node down to that leaf

The difficulty with insertions
• Recall that in 2-3 trees we blindly followed this path to the leaf. Then we (over)filled nodes to accommodate the new item • Overfilled nodes were resolved by pushing items up to the parent • This push had the possibility of overfilling the parent as well and the problem was propagated up the tree

Taking advantage (of descent)
• 2-3-4 Trees make use of the descent from the root to the leaf • 2-3 trees blindly descend to the leaf and make modifications to the tree only on the way back up as nodes become overfull • 2-3-4 also make modifications to the tree during the descent to find the leaf

The simple modification
• While searching for the leaf on the way down if we encounter a “4” node then we immediately split that node…
50 30 10 15 20 50 40 70 60 80 90 10 15 20 30 40 60 70 80 90

• Why would you do such craziness?
– To make room for items that might be pushed up! (now 50, 30 and 70 can hold at least one more item that might be pushed up from below)

Reasoning
• 2-nodes and 3-nodes can always hold one more item • The only nodes that can’t hold one more item are 4-nodes • Items are only pushed up to ancestors • If we encounter an ancestor on the way down that won’t be able to accept an item we can transform it into 2-nodes that can before the insertion is actually performed

Special case
root

Other cases
2-node parent

Other cases
3-node parent

Guarantee
• This guarantees that the exact same insertion routine presented for 2-3 trees can be performed without ever having to worry about “resolving” overfull nodes

Example
• Adding 16 to the following tree
– Step 1a: (descent) Is current node a 4-node?
30 10 15 20 50 40 70 60

*
80 90

Yes
50 30 10 15 20 40

*
70 60 80 90

• Step 1b: Compare search key with current node (16 < 50)
50 30 10 15 20

*
70 60 80 90

*
40

*

• Step 1a: Is current node (*) a 4-node? No. • Step 1b: Compare search key with current node (16<30)

• Step 1a: Is current node a 4-node? Yes
Yes
70 40 60 80 90 10

50 30

*
15 30 20

50 70 40 60 80 90

*
10 15 20

• 16 < 30 which is the middle branch
50 15 30 70

10

*

20

40

60

80 90

• The arrived at leaf is a 2-node (or maybe a 3-node) so you can simply insert the new item and nothing will have to propagate up

• The insertion …
50 15 30 70

10

16 20

40

60

80 90

• Nothing can propagate up because the necessary space was made during descent to the leaf

2-3-4 Tree Visualization
• http://www.cse.ohio-state.edu/~bondhugu/acads/234-tree/index.shtml

• Insert 60, 30, 10, 20, 50, 40, 70, 80, 15, 90, 100

Inserting 60, 30, 10, 20 ...

... 50, 40 ...

Inserting 50, 40 ...

... 70, ...

Inserting 70 ...

... 80, 15 ...

Inserting 80, 15 ...

... 90 ...

Inserting 90 ...

... 100 ...

Inserting 100 ...

2-3-4 Deletions
• Like with the 2-3 tree, to delete we
– find the node containing the value to be deleted – find the value that comes next in the tree – swap the next value with the value to be deleted – delete the swapped value which is now in a leaf node which is
• either a 3-node or 4-node and we do not have to physically delete a node, or • a 2-node and we have to take care of removing the node from the tree

2-3-4 Deletions
• Bottom-up strategy
– if the 2-node has a sibling that is a 3-node or 4node, redistribute the values between the sibling, parent and current node – if the 2-node has no nearby 3-node or 4-node siblings, then merge the sibling with the parent and move up to the parent level and perform the deletion recursively

2-3-4 Deletions
• Top-down strategy
– when searching for the node containing the value to be deleted we will merge (collapse) any 2node (except root) into a larger node – in this way, we can be assured that any physical removal will take place only from a 3-node or 4node – there are many merge situations like there were 6 split situations
• the cases depend on the type of node that is the given 2-node’s parent and a next sibling to the left or right

2-3-4 Deletions
Turning a 2-node into a 3-node ... Case 1: an adjacent sibling has 2 or 3 items
"steal" item from sibling by rotating items and moving subtree
(note: parent has at least two items, unless it is the root)

30 50

20 50

10 20

40

*
"rotation"

10

30 40

25

25

2-3-4 Deletions
Turning a 2-node into a 3-node ... Case 2: each adjacent sibling has only one item
 "steal" item from parent and merge node with sibling
(note: parent has at least two items, unless it is the root)

30 50

50

10

40

*
merging

10 30 40

25

35

25

35

2-3-4 Deletions
Turning a 2-node into a 4-node ... Case 3: parent is root and parent and each adjacent sibling has only one item
 merge node with parent and sibling

30

10 30 40

10

40

*
merging

25

35

25

35

Example
• Delete 40
40

20

50

14

32

43

62 70 79

10

18

25

33

42

47

57 60

66

74

81

Red-Black Trees — Idea
• We can represent a 4-node of the 2-3-4 tree using only 2nodes if we add two new nodes (shown in red)
4 7 12 2 5 9 18 2 7 4 5 9 12 18

• Note that the ordering property of the 2-3-4 Tree translates into the ordering property of a Binary Search Tree • The text talks about red and black links but “links” or children references do not have the ability to contain information about their “color”

Here again is our transformed 4-node
7 4 7 12 2 5 9 18 2 4 5 9 12 18

We can also apply this process to a 3-node (in two different ways)
4 7 2 5 9 2 4 5 7 9 or 4 2 5 7 9

2-nodes are essentially left alone
4 2 5 2 4 5

• A Red-Black Tree is a Binary-Tree representation of a 2-3-4 Tree
– Think of black nodes as representing 2-3-4 Tree nodes – Think of red nodes as being the extra ones required to make a Binary Tree out of the 2-3-4 Tree
28 20 28 5 12 15 3 24 35 42 5 3 7 9 12 20 24 35 42 48

7 9 14 19 22 25 30 40 45 48 50

15 22 25 30 40 45 50 14 19

– It is no longer true that every leaf is at the same level. However, given a node, every path from it down to a leaf goes through the same number of black nodes

• Every red-black tree has an equivalent 2-3-4 representation
I C C A EGH LM NR A SX E G H L M P S I N R X

P

• Thus we have the following restriction in the definition of red-black trees • There must be an equal number of black nodes on every path from the root to a node with fewer than two children

Practice
• If you take any 2-3-4 tree you can illustrate it as a Red-Black Tree… • Figure 12-20 from the book as a 2-3-4 tree

• Same tree as a Red-Black tree

• There are 15 other valid variations

• Red-black trees are undercover 2-3-4 trees • We can tell if a node is really a 4-node by checking if both its left and right children are colored red • A node is a 3-node if its not a 4-node and has one red child

Red-black trees: benefits
• 2-3 and 2-3-4 trees typically waste a lot of memory on unused items and child references • 2-nodes are really 4-nodes with 2/3 of memory for that node being unused • Red-black trees have the advantages of 2-34 trees without the overhead

Red-black trees: insertion
• Insertion in a red-black tree can be implemented as a direct translation of the top-down 2-3-4 splitting algorithm
– search from root to leaf and then insert the new value – we must maintain height-balancing. how?
• in the 2-3-4 tree, we split any 4-node on the way down the tree using one of 6 cases • we do the same for our red-black tree, implementing the 6 cases from the point of view of red-black nodes rather than 4-nodes • most of the splitting is done by re-coloring nodes, but there are some cases that require additional rotations

Red-black trees: insertion
• To split a root 4-node: recolor the two children black
Y X Y Z X Y Z X Z X Z Y

Red-black trees: insertion
• To split a non-root 4-node: flip the color of all three nodes
W W X Y Z W Y X Z X Y Z X W Y Z

Red-black trees: insertion
• One problem is that we may end up with two adjacent red nodes
– If the parent is a 3-node oriented the wrong way round – We fix this with a single rotation and further recoloring
VW XYZ VWY X Z
V W Y X Z X V W Y Z V X W Y Z V X W Y Z

Convention
• Leaf nodes are null links; the rest are internal nodes
30 15 10 20 50 5 40 55 60 65 80 70 85 90

Definition of black-height
• The number of black nodes on any path from a node, x, in a R-B tree to a descendent leaf, but excluding node x itself, is the node’s black-height, denoted bh(x) • If T is an R-B tree, we define bh(T) as the black-height of its root

Black heights
30

3
70

2 1 10 1 5

15 20 50 40 55 60 65

85 80 90

0

Node Height
• The height of a node v is the number of nodes on the longest path from v to a leaf, but excluding node v itself
A height of the tree is 4

height of node B is 3

B

C

D

E

F

G

H

I

Theorem 1 – Any red-black tree with root x, has at least n = 2bh(x) – 1 internal nodes, where bh(x) is the black height of node x
Proof: by induction on height of x – Base step: x has height 0 (i.e. null leaf node) • What is bh(x)? – Inductive step: x has positive height and 2 children • Each child has black-height of bh(x) or bh(x) – 1 (Why?) • Since the height of a child of x is less than the height of x itself, we can apply the inductive hypothesis to conclude that each child has ≥ 2bh(x)-1 - 1 internal nodes. So, the tree rooted at x has at least (2bh(x)-1 - 1)+(2bh(x)-1 - 1)+1= 2bh(x) - 1 internal nodes

Theorem 2 – In a red-black tree, at least half the nodes on any path from the root to a leaf, not including the root, must be black Proof: If a red node on the path has a child, then the color of the child must be black

Theorem 3 – A red-black tree with n internal nodes has height h <= 2log(n + 1) Proof: Let h be the height of the red-black tree with root x. By Theorem 2, bh(x) >= h/2 From Theorem 1, n >= 2bh(x) - 1 Therefore n >= 2 h/2 – 1 n + 1 >= 2h/2 log(n + 1) >= h/2 2log(n + 1) >= h

ÁÒ× ÖØ ÓÒ ÒØÓ ´¾¸ µ ØÖ

¯ ËÔÐ Ø Ñ Ý ¯ ËÓ Ø

Ò

ØØ

Ô Ö ÒØ ÒÓ

¿

Ö ×ÙÐØ× Ò Ò ÓÚ Ö­ÓÛº ÖÓÓØ Ø ×ÔÐ Ø Ò Ø

ÓÚ Ö­ÓÛ Ñ Ý Ø ÔÖÓÔ Ø ÙÒØ Ð Ø Ø Ó Ø ØÖ × Ò Ö × Ý ½º
p2 n n2 n1 p1 n k1 k2 k3 k4 c5 c1 c2 c3 c4 c5 p2 n2

p1 n1

p1 k3 p2 n1 k1 k2 c1 c2 k4 c3 c4 c5 n2

k1 k2 k3 k4 c1 c2 c3 c4

Ô ÖØÑ ÒØ Ó ÓÑÔÙØ Ë Ò ² Ò Ò Ö Ò

ʺ ú

Ó×

Ø ËØÖÙ ØÙÖ ² Ð ÓÖ Ø Ñ×

¾

Ó ¿

ÆÓÚ Ñ Ö¸ ¾¼¼½