
Ensuring tree height h = O(log n)

Red Black Trees


Red Black Properties
• A binary search tree where
– Every node is either “red” or “black”
• For consistency we count “null” pointers as if they were
“black” nodes. This means that an empty tree (one where the
root is null) satisfies the definition (it’s black).
– If a node is red, then both children are black
• This is another reason to treat “null” pointers as black: a
red node with a null child still satisfies this rule.
– For any node x in the tree, all paths from x to a leaf
have exactly the same number of black nodes.
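As a rough illustration, the two coloring rules above can be checked with one recursive pass. This is only a sketch in C++: the names Node, checkSubtree and isRedBlack are invented for this example, null pointers are treated as black, and the BST ordering itself is not checked.

```cpp
enum Color { RED, BLACK };

// Hypothetical node type for this sketch.
struct Node {
    int key;
    Color color;
    Node* left;
    Node* right;
    Node(int k, Color c, Node* l = nullptr, Node* r = nullptr)
        : key(k), color(c), left(l), right(r) {}
};

// A null pointer counts as black, so only real red nodes return true.
bool isRed(const Node* x) { return x != nullptr && x->color == RED; }

// Returns the number of black nodes on every path from x down to a null
// pointer (counting x, not counting the null), or -1 if a rule is broken.
int checkSubtree(const Node* x) {
    if (x == nullptr) return 0;                    // empty subtree is valid
    int l = checkSubtree(x->left);
    int r = checkSubtree(x->right);
    if (l < 0 || r < 0 || l != r) return -1;       // paths disagree on black count
    if (isRed(x) && (isRed(x->left) || isRed(x->right)))
        return -1;                                 // red node with a red child
    return l + (x->color == BLACK ? 1 : 0);
}

bool isRedBlack(const Node* root) { return checkSubtree(root) >= 0; }
```

Counting x itself here is just a convenience for the checker; any consistent counting convention detects the same violations.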
Red/Black Tree has height O(log n)
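• Let the “black-height” of a node x, written bh(x), be the number of black
nodes on any path from x down to a leaf, not counting x itself but counting
the null leaf. (The last red/black property makes this the same for every path.)
• Then, for any node x, the subtree rooted at x has at least $2^{bh(x)} - 1$ nodes.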
– Proof, by induction on the height of node x.
• If the height of x is 0, then the bh is zero, and $2^0 - 1 = 0$. Of course, if the
height of x is 0 then there are no nodes in the tree (x is an empty subtree).
• If the height of x is k, and the bh of x is b (b ≤ k), then consider each of the two
children of x (either of which may be null). The black height of each child must
be at least b - 1 (which happens if the child is black). Each child has height
less than k, so by the inductive hypothesis the subtree rooted at each child has
at least $2^{b-1} - 1$ nodes. If we add up the number of nodes in both children,
we have at least $2(2^{b-1} - 1) = 2^b - 2$. When we add in the node x we get
$2^b - 1$. So the number of nodes in the subtree rooted at x is at least $2^b - 1$.
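• Note that bh ≥ h/2 at the root: a red node never has a red child, so at least
half of the nodes on any root-to-leaf path are black. So a tree with height h
must have at least $2^{h/2} - 1$ nodes in it, i.e., $2^{h/2} - 1 \le n$.
• Therefore (adding 1 and taking the log base 2 of both sides) $h \le 2\log_2(n+1)$.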
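Spelled out step by step, the last two bullets amount to the following chain, where n is the number of nodes, h the height of the tree, and bh(root) ≥ h/2 because no red node has a red child:

```latex
\begin{align*}
n &\ge 2^{bh(\mathrm{root})} - 1 \ge 2^{h/2} - 1 \\
n + 1 &\ge 2^{h/2} \\
\log_2(n+1) &\ge h/2 \\
h &\le 2\log_2(n+1)
\end{align*}
```

For a sense of scale, a red/black tree holding one million values has height at most $2\log_2(1{,}000{,}001) \approx 39.9$, so at most 39.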
Huh? Doesn’t that work for any tree?
• That proof kinda stinks of the “let’s prove zero
equals one” sort of proofs… in particular, it seems
that the technique could be used to prove that all
trees are balanced.
• The inductive proof relies on the black height
being “well defined” for any node.
– The height is defined as the longest path to a leaf
– The black height is the same for all paths to a leaf.
• That’s why you cannot prove that any tree of
height h has at least $\Omega(2^h)$ nodes in it.
Making Red/Black Trees
• The basic idea of a Red/Black tree is to use
the insert and remove operations of an
ordinary binary search tree.
– But this may result in violating the red/black
properties
• So, we’ll “fixup” the tree after each
insert/remove
Rotations
• A “right rotate” will interchange a node with its left child.
The child will become the parent, and the parent will
become a child.
– The parent becomes the right child of the child.
• The child’s old right subtree (the “right grandchild”) becomes the left child of the parent
• A “left rotate” will interchange a node with its right child
– The parent becomes the left child of the child
• The child’s old left subtree (the “left grandchild”) becomes the right child of the parent
• Note that these operations are exact opposites (inverses) of
each other.
• Note also that Rotations do not affect the BST properties
(although they will almost certainly affect the red/black
properties).
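The two rotations can be sketched in a few lines of C++. This is a simplified illustration, not a full implementation: the Node type is hypothetical, there are no colors or parent pointers, and each function returns the new root of the rotated subtree so the caller can re-attach it.

```cpp
// Hypothetical node type for this sketch (no colors, no parent pointers).
struct Node {
    int key;
    Node* left;
    Node* right;
};

// Right rotate: p's left child moves up and p becomes its right child.
// Assumes p->left is not null. Returns the new root of this subtree.
Node* rotateRight(Node* p) {
    Node* c = p->left;
    p->left = c->right;   // the "right grandchild" becomes p's left child
    c->right = p;         // the parent becomes the right child of the child
    return c;
}

// Left rotate: the mirror image. Assumes p->right is not null.
Node* rotateLeft(Node* p) {
    Node* c = p->right;
    p->right = c->left;   // the "left grandchild" becomes p's right child
    c->left = p;          // the parent becomes the left child of the child
    return c;
}
```

Calling rotateLeft on the result of rotateRight restores the original shape, which is the “inverse” relationship noted above.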
Insert
• Insert the value normally, and make the new node
“red”
– We have not changed the black height of any node in
the tree.
– However, we may have created a red node with a red
parent (and this is bad).
• As we “fixup” we’ll always have a pointer to a red
node. We’ll always know that the black height is
OK, and the only problem we need to worry about
is that the parent of this red node is also red.
Fixup
• Let c (child) be the red node we inserted
• Let p (parent) be the parent of c
• Let gp (grandparent) be the parent of p
• Let u (uncle) be the child of gp that is not equal to
p
• If p->color == black, we’re done. So, assume
p->color == red.
– Since the root is black, p is not the root, so gp exists. Since a
red node cannot have a red child, gp->color == black.
• Two interesting cases
– Uncle is red (easy), or uncle is black (harder)
Uncle is red
• If the grandparent is black, the parent is red and
the uncle is red, then
– we would not change the black height by making the
grandparent red and the parent (and uncle) black.
– We may, however, have introduced a new problem
where the grandparent is now red, and its parent is also
red (the great-grandparent).
• So, if the uncle is red, make it and the parent
black. Make the grandparent red, and then repeat
fixup where we treat the grandparent as the next
“child”.
Uncle is Black
• Make the parent black
– But this increases the number of black nodes along the path to the
child.
• Make the grandparent red
– This fixes the problem with the path from the root to the child, but
it decreases (breaks) the number of black nodes on the path from
the root to the uncle
• Rotate around the grandparent and parent
– So that the path from the root to the uncle now passes through
both the parent and the grandparent
– (and the path from the root to the child no longer passes through
the grandparent).
Case Analysis
• Coding this up requires six cases. Three cases are for
when the parent is the left child of the grandparent, and
three (perfectly symmetric) cases for when the parent is
the right child of the grandparent.
• Of the remaining three cases, “uncle is red” is one case.
• Two cases are required for “uncle is black” depending on
whether the path from grandparent to child is “straight” or
“crooked”
– If the path is “crooked” then we’ll need to rotate first around the
parent and child, and then perform the rotation around the parent
and grandparent.
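Putting the cases together, insert plus fixup might look like the C++ sketch below. It is one possible arrangement under stated assumptions, not a definitive implementation: Tree, replaceChild and check are names invented here, nodes carry parent pointers, null pointers count as black, and nodes are never freed. The mirrored cases are folded together with a pIsLeft flag instead of being written out twice.

```cpp
enum Color { RED, BLACK };

struct Node {
    int key;
    Color color = RED;                       // new nodes start out red
    Node *left = nullptr, *right = nullptr, *parent = nullptr;
    explicit Node(int k) : key(k) {}
};

struct Tree {                                // hypothetical container for this sketch
    Node* root = nullptr;

    // Put y where x currently hangs (under x's parent, or at the root).
    void replaceChild(Node* x, Node* y) {
        y->parent = x->parent;
        if (!x->parent) root = y;
        else if (x->parent->left == x) x->parent->left = y;
        else x->parent->right = y;
    }
    void rotateLeft(Node* x) {               // x's right child moves up
        Node* y = x->right;
        x->right = y->left;
        if (y->left) y->left->parent = x;
        replaceChild(x, y);
        y->left = x;
        x->parent = y;
    }
    void rotateRight(Node* x) {              // x's left child moves up
        Node* y = x->left;
        x->left = y->right;
        if (y->right) y->right->parent = x;
        replaceChild(x, y);
        y->right = x;
        x->parent = y;
    }
    void insert(int key) {                   // ordinary BST insert, then fixup
        Node* c = new Node(key);
        Node* p = nullptr;
        for (Node* cur = root; cur; ) {
            p = cur;
            cur = (key < cur->key) ? cur->left : cur->right;
        }
        c->parent = p;
        if (!p) root = c;
        else if (key < p->key) p->left = c;
        else p->right = c;
        fixup(c);
    }
    void fixup(Node* c) {
        while (c->parent && c->parent->color == RED) {
            Node* p = c->parent;
            Node* gp = p->parent;            // exists: p is red, so p is not the root
            bool pIsLeft = (gp->left == p);
            Node* u = pIsLeft ? gp->right : gp->left;
            if (u && u->color == RED) {      // uncle is red: recolor, move up
                p->color = u->color = BLACK;
                gp->color = RED;
                c = gp;
                continue;
            }
            // Uncle is black. If the path gp -> p -> c is crooked, straighten it.
            if (pIsLeft && c == p->right) { rotateLeft(p); p = c; }
            else if (!pIsLeft && c == p->left) { rotateRight(p); p = c; }
            p->color = BLACK;
            gp->color = RED;
            if (pIsLeft) rotateRight(gp); else rotateLeft(gp);
            break;                           // a rotation always ends the fixup
        }
        root->color = BLACK;                 // always safe as the last step
    }
};

// Checking helper: black nodes per path (nulls not counted), or -1 on a violation.
int check(const Node* x) {
    if (!x) return 0;
    int l = check(x->left), r = check(x->right);
    if (l < 0 || l != r) return -1;
    if (x->color == RED && ((x->left && x->left->color == RED) ||
                            (x->right && x->right->color == RED))) return -1;
    return l + (x->color == BLACK);
}
```

Inserting keys in sorted order, which would degenerate an ordinary BST into a list, is a convenient way to exercise this: calling check(t.root) after each insert should never report a violation.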
“Root is Black” Sentinel
• Once we reach the root of the tree, we’re done.
• If we can ensure that the root is always black, then
we don’t need to worry about the special case of
reaching the root in fixup.
– Fixup stops whenever the next “child” is black, or when
the parent is black.
• It’s easy (and always correct) to simply make the
root black as the last step in any insert/remove
operation.
Time Complexity
• Fixup runs in a loop. Each iteration of the loop we do
– O(1) work in case analysis
– O(1) work recoloring nodes (“uncle is red” case)
– O(1) work performing rotations (at worst 2 rotations)
• Each iteration of the loop we either terminate (always the
case after a rotation), or we set “child” equal to
grandparent
– i.e., each iteration of the loop uses a node two levels closer to the root
than the previous iteration.
• Since the depth of “child” must decrease each iteration, we can do at
most h iterations. Since h = O(log n), we do O(log n)
iterations with O(1) work each iteration.
