
CS 3345 AVL Trees

AVL Balanced Binary Search Tree

The first balanced BST that we will study is the AVL Tree, named after its inventors (Adelson-Velsky and Landis).

AVL Tree Property

The nodes of an AVL tree abide by the BST property and the following: the heights of the left and right subtrees of any node differ by no more than 1.

Theorem: The AVL property is sufficient to maintain a worst-case tree height of O(log N).

Proof

To show this, we construct the AVL tree of height h that contains the fewest nodes. From the number of nodes in these skinniest AVL trees we deduce a recurrence relation for the minimum number of nodes in an AVL tree of height h. We solve this recurrence relation and then rearrange the result to get an equation for the maximum height of an AVL tree with N nodes.

Call these skinny AVL trees $T_0, T_1, T_2, \ldots$ for heights 0, 1, 2, etc. Here are the first few:

[Figure 2: the skinny AVL trees T0, T1, T2, T3 and T4]

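Let $n(h)$ be the number of nodes in a skinny AVL tree of height $h$. Then $n(0) = 1$, $n(1) = 2$, $n(2) = 4$, $n(3) = 7$, $n(4) = 12$. In general, the skinniest tree of height $h$ is a root whose subtrees are the skinniest trees of heights $h-1$ and $h-2$, so

\[ n(h) = n(h-1) + n(h-2) + 1, \quad n(0) = 1, \quad n(1) = 2 \]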
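The recurrence n(h) = n(h-1) + n(h-2) + 1 can be checked mechanically. The following is a minimal sketch; the class name `SkinnyAvl` and the method name `minNodes` are illustrative and do not come from these notes.

```java
/** Minimum node count of an AVL tree of height h, from n(h) = n(h-1) + n(h-2) + 1. */
class SkinnyAvl {
    static long minNodes(int h) {
        if (h == 0) return 1;          // n(0) = 1: a single node
        if (h == 1) return 2;          // n(1) = 2: a root and one child
        long prev2 = 1, prev1 = 2;     // n(i-2) and n(i-1)
        for (int i = 2; i <= h; i++) {
            long cur = prev1 + prev2 + 1;  // root plus the two skinniest subtrees
            prev2 = prev1;
            prev1 = cur;
        }
        return prev1;
    }

    public static void main(String[] args) {
        // Prints n(0) .. n(8): 1 2 4 7 12 20 33 54 88
        for (int h = 0; h <= 8; h++) {
            System.out.print(minNodes(h) + (h < 8 ? " " : "\n"));
        }
    }
}
```

The loop keeps only the last two values, which is all the recurrence needs.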

The recurrence is not homogeneous, but we can transform it into a homogeneous linear recurrence with constant coefficients. Write it for two consecutive heights:

\[ n(h+2) - n(h+1) - n(h) - 1 = 0 \]
\[ n(h+3) - n(h+2) - n(h+1) - 1 = 0 \]

Subtracting the first from the second, we get:

\[ n(h+3) - 2n(h+2) + n(h) = 0 \]

for which the characteristic equation is

\[ x^3 - 2x^2 + 1 = 0 \]

Solving,

\[ (x-1)(x^2 - x - 1) = 0 \]
\[ (x-1)(x-\phi)(x+1/\phi) = 0 \]

where $\phi$ is the Golden Ratio:

\[ \phi = \frac{1+\sqrt{5}}{2} \]

Copyrighted. No one may present, print, or copy these notes without the permission of Ivor Page, UT Dallas

The solution is a linear combination of the roots, each raised to the power $h$:

\[ n(h) = A(1)^h + B\phi^h + C(-1/\phi)^h = A + B(1.618)^h + C(-0.618)^h \]

Inserting $n(0) = 1$, $n(1) = 2$, $n(2) = 4$ gives $A = -1$, $B = 1.894$, $C = 0.106$:

\[ n(h) = -1 + 1.894(1.618)^h + 0.106(-0.618)^h \]

For large $h$ we can drop the third term. Then, rearranging, we get:

\[ h = 1.44\log_2(N+1) - 1.328 \]

where $n(h) = N$ is the number of nodes.

We could have got to this result a little faster. Compare values of $n(h)$ with Fibonacci numbers:

h        0  1  2  3  4   5   6   7   8
n(h)     1  2  4  7  12  20  33  54  88
Fib(h)   0  1  1  2  3   5   8   13  21

We see that $n(h) = \mathrm{Fib}(h+3) - 1$. Now, for large $h$, we can approximate $\mathrm{Fib}(h+3)$:

\[ n(h) \approx \mathrm{Fib}(h+3) \approx \frac{1}{\sqrt{5}}\left(\frac{1+\sqrt{5}}{2}\right)^{h+3} \]

Taking logs of both sides and rearranging, we get:

\[ h = 1.44\log_2 N - 1.328 \]

for $N = n(h)$ nodes in the tree. We have shown that the height of an AVL tree is O(log N). If the operations insert, search, and delete take place on a path from the root to a leaf node, then they will each take O(log N) time.

AVL with height difference of two

Consider a different balancing rule where the heights of the left and right subtrees of any node are allowed to differ by up to 2.


Here are the first few skinny trees of height h:

[Figure: the skinny trees of heights h = 0 to h = 5 under this rule]

h      0  1  2  3  4  5   6   7   8   9   10  11   12
n(h)   1  2  3  5  8  12  18  27  40  59  87  128  188
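The table can be reproduced, and compared with the standard AVL rule, by a short program. This is a sketch under one stated assumption: when sibling heights may differ by up to d, the skinniest tree of height h is a root with skinniest subtrees of heights h-1 and h-1-d, and it is a simple chain of h+1 nodes while h <= d. The names `RelaxedAvl`, `minNodes` and `maxHeight` are illustrative only.

```java
/** Fewest nodes, and greatest height, when sibling subtree heights may differ by up to d. */
class RelaxedAvl {
    /** n(h) = n(h-1) + n(h-1-d) + 1, with n(h) = h + 1 while the tree can still be a chain. */
    static long minNodes(int h, int d) {
        long[] n = new long[h + 1];
        for (int i = 0; i <= h; i++) {
            n[i] = (i <= d) ? i + 1 : n[i - 1] + n[i - 1 - d] + 1;
        }
        return n[h];
    }

    /** Largest height whose skinniest tree still fits into maxNodes nodes (maxNodes >= 1). */
    static int maxHeight(long maxNodes, int d) {
        int h = 0;
        while (minNodes(h + 1, d) <= maxNodes) {
            h++;
        }
        return h;
    }

    public static void main(String[] args) {
        // Row n(h) of the table for a height difference of two.
        for (int h = 0; h <= 12; h++) {
            System.out.print(minNodes(h, 2) + (h < 12 ? " " : "\n"));
        }
        // Worst-case height with 188 nodes under each rule.
        System.out.println("d = 1: " + maxHeight(188, 1)); // 9
        System.out.println("d = 2: " + maxHeight(188, 2)); // 12
    }
}
```

With d = 1 the method reduces to the standard AVL recurrence, so the same code gives both rows of a side-by-side comparison.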

We see that

\[ n(h) = n(h-1) + n(h-3) + 1, \quad n(0) = 1, \quad n(1) = 2, \quad n(2) = 3 \]

This is a non-homogeneous recurrence relation. We develop a homogeneous version by subtracting the recurrence for $n(h-1)$ from the recurrence for $n(h)$:

\[ n(h) - n(h-1) = n(h-1) + n(h-3) - n(h-2) - n(h-4) \]
\[ n(h) - 2n(h-1) + n(h-2) - n(h-3) + n(h-4) = 0 \]

From which we get the characteristic polynomial,

\[ x^4 - 2x^3 + x^2 - x + 1 = (x-1)(x^3 - x^2 - 1) = 0 \]

The roots are $x = 1$, $x = 1.466$ and $x = -0.233 \pm 0.793i$. The complex roots are small (magnitude 0.83) and will contribute little in comparison to 1.466 for large $h$, since each is raised to the power $h$. Just using roots $x = 1$ and $x = 1.466$ we get

\[ n(h) \approx a \cdot 1^h + b \cdot 1.466^h \]

and substituting $h = 9$, $h = 10$ we get

\[ n(h) \approx 1.9 \cdot 1.466^h \]

from which we get

\[ h \approx 1.8\log_2 n - 1.667 \]

For very large $n$ we see that this tree is about 1.8/1.44 = 1.25 times taller than the AVL tree with the same large number of nodes.

AVL with height difference of three

If the left and right subtrees were allowed to differ in height by 3, the characteristic equation would be:

\[ x^5 - 2x^4 + x^3 - x + 1 = 0 \]

which has roots $x = 1$, $x = 1.38$, $x = -0.819$, $x = 0.2194 \pm 0.9145i$. Again, using only the roots with magnitude at least 1,

\[ n(h) \approx a \cdot 1^h + b \cdot 1.38^h \]

\[ n(h) \approx -0.64 + 1.83(1.38)^h \]
\[ h \approx 2.16\log_2 n - 1.88 \]

Again we get tree height O(log n) and a worst case 1.5 times that for the standard AVL tree.

AVL Tree Search

Search operates exactly as in a binary search tree, following a direct path from the root to the node that holds the key or, if the key isn't present, stopping at the place where the key would be inserted. Therefore search takes O(log N) time.

AVL Tree Insert

Insert first searches for the key to be inserted. If the search fails, it stops at the place where the new BST node will be inserted. For example, if we insert 1 into the AVL tree of Figure 1, the new node would be the left child of 2. And if we insert 8, the new node would be the right child of 7. An insertion could cause a violation of the AVL tree property. To remove such violations, rotations are used.

AVL Tree Rotations

Following the insertion, the algorithm recurses out from the point of the insertion, checking each ancestor node for balance. The rotation, if necessary, must take place at the first out-of-balance node, x, along the path from the insertion point to the root. There are four cases:

Case  Where the insertion took place                Function to fix the problem
1     In the left subtree of the left child of x     rotateWithLeftChild(x)
2     In the right subtree of the left child of x    doubleWithLeftChild(x)
3     In the left subtree of the right child of x    doubleWithRightChild(x)
4     In the right subtree of the right child of x   rotateWithRightChild(x)

Cases 1 and 4 are mirror images, as are cases 2 and 3. Single rotations fix cases 1 and 4 and double rotations fix cases 2 and 3. As we will see, a double rotation is achieved by two single rotations.

Before we get lost in the details, remember that the rotation is necessary because one of x's subtrees has become taller by two than its other subtree. Each of these rotations is cleverly designed to restore the subtree that had x as its root to the height it had before the insertion took place, ensuring that no further rotations will be necessary along the path to the root.

Case 1: Rotate with left child

Figure 3 shows the rotation before and after:

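[Figure 3: single rotation with left child. Before: x is the root, with left child y and right subtree C (height h); y has left subtree A (height h+1, containing the insertion point *) and right subtree B (height h). After: y is the root, with left subtree A (height h+1) and right child x; x has subtrees B and C (both height h).]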


The asterisk marks the point of insertion and the subtree heights are indicated. The first out-of-balance node is x. Node x's left subtree height is h+2 and its right subtree height is h. The rotation rearranges the three subtrees so that the taller one moves up one level and one of the smaller subtrees moves down. Overall, the entire subtree including node x has height one less than before the rotation. Its height is the same as before the insertion. Because of the BST property, the three subtrees, A, B, C, are arranged left-to-right in the same order before and after the rotation. If the entire AVL tree was balanced before the insertion operation, then its balance is restored by the single rotation and the insert function can exit without considering any ancestors of node x. The code is as follows:
/**
 * Rotate binary tree node with left child.
 * For AVL trees, this is a single rotation for case 1.
 */
AvlNode rotateWithLeftChild( AvlNode x )
{
    AvlNode temp = x.left;      // temp is node y in Figure 3
    x.left = temp.right;        // subtree B moves under x
    temp.right = x;             // x becomes the right child of y
    x.height = max( height( x.left ), height( x.right ) ) + 1;
    temp.height = max( height( temp.left ), x.height ) + 1;
    return temp;                // y is the new root of this subtree
}

The first three lines implement the rotation. Then the height variables in the nodes x and y are adjusted and a reference to the new root is returned. This function is a private member of the AVLTree class. It is called by the insert and delete functions.

Case 4: Rotate with right child

This function is called when the insertion point is in the right subtree of the right child of x. The situation is the mirror image of Case 1 above. Figure 4 below shows the rotation.

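[Figure 4: single rotation with right child. Before: x is the root, with left subtree A (height h) and right child y; y has left subtree B (height h) and right subtree C (height h+1, containing the insertion point *). After: y is the root, with left child x and right subtree C (height h+1); x has subtrees A and B (both height h).]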

Here is the code:

/**
 * Rotate binary tree node with right child.
 * For AVL trees, this is a single rotation for case 4.
 */
private static AvlNode rotateWithRightChild( AvlNode x )
{
    AvlNode temp = x.right;     // temp is node y in Figure 4
    x.right = temp.left;        // subtree B moves under x
    temp.left = x;              // x becomes the left child of y
    x.height = max( height( x.left ), height( x.right ) ) + 1;
    temp.height = max( height( temp.right ), x.height ) + 1;
    return temp;                // y is the new root of this subtree
}

Case 2: Double with left child

Here, the insertion was made in the right subtree of the left child of x. In the figure it is shown in subtree B, but it could equally have been made in subtree C. See Figure 5 below:

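[Figure 5: double rotation with left child. Before: x is the root, with left child y and right subtree D; y has left subtree A and right child z; z has subtrees B (containing the insertion point *) and C. After: z is the root, with left child y (subtrees A and B) and right child x (subtrees C and D).]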

Node z moves up two places and becomes the new root. Node x has rotated to the right. Node y's left subtree and x's right subtree are not changed, but all the other links are changed. The effect is to move subtrees B and C up one level and to move subtree D down. Once again, the BST property has been preserved: the subtrees remain in their left-to-right order. And the overall height of the subtree that began at node x has been restored to its pre-insertion value. The insert function can exit without examining ancestor nodes of this entire subtree. Here is the code:

/**
 * Double rotate binary tree node: first left child
 * with its right child; then node x with new left child.
 * For AVL trees, this is a double rotation for case 2.
 */
AvlNode doubleWithLeftChild( AvlNode x )
{
    x.left = rotateWithRightChild( x.left );
    return rotateWithLeftChild( x );
}
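To see the two single rotations at work, the sketch below applies them one at a time to the smallest Case 2 tree: x = 3 with left child y = 1, whose right child is z = 2. It is a self-contained illustration, not the AVLTree class of these notes: the `Node` class with int keys, the convention that an empty subtree has height -1, and the helper `build` are assumptions made for the example.

```java
/** Double rotation for case 2, shown as two explicit single rotations. */
class DoubleRotationDemo {
    static class Node {
        int key, height;               // a leaf has height 0
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static int height(Node t) { return t == null ? -1 : t.height; }

    static Node rotateWithLeftChild(Node x) {
        Node temp = x.left;
        x.left = temp.right;
        temp.right = x;
        x.height = Math.max(height(x.left), height(x.right)) + 1;
        temp.height = Math.max(height(temp.left), x.height) + 1;
        return temp;
    }

    static Node rotateWithRightChild(Node x) {
        Node temp = x.right;
        x.right = temp.left;
        temp.left = x;
        x.height = Math.max(height(x.left), height(x.right)) + 1;
        temp.height = Math.max(height(temp.right), x.height) + 1;
        return temp;
    }

    static Node doubleWithLeftChild(Node x) {
        x.left = rotateWithRightChild(x.left);  // step 1: turns case 2 into case 1
        return rotateWithLeftChild(x);          // step 2: the case 1 single rotation
    }

    /** Builds x = 3 with left child y = 1, whose right child is z = 2. */
    static Node build() {
        Node x = new Node(3), y = new Node(1), z = new Node(2);
        y.right = z;
        y.height = 1;
        x.left = y;
        x.height = 2;
        return x;
    }

    public static void main(String[] args) {
        Node x = build();
        x.left = rotateWithRightChild(x.left);
        System.out.println("after step 1: " + x.key + " -> " + x.left.key + " -> " + x.left.left.key);
        Node root = rotateWithLeftChild(x);
        System.out.println("after step 2: root " + root.key + ", children "
                + root.left.key + " and " + root.right.key + ", height " + root.height);
    }
}
```

After the first step the tree is the left-left chain 3, 2, 1, which is exactly the situation that rotateWithLeftChild repairs.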


Case 3: Double with right child

And finally, here is Case 3, where the insertion took place in the left subtree of the right child of x. The situation is the mirror image of Case 2.

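[Figure 6: double rotation with right child. Before: x is the root, with left subtree A and right child y; y has left child z and right subtree D; z has subtrees B and C, one of which contains the insertion point *. After: z is the root, with left child x (subtrees A and B) and right child y (subtrees C and D).]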

/**
 * Double rotate binary tree node: first right child
 * with its left child; then node x with new right child.
 * For AVL trees, this is a double rotation for case 3.
 */
AvlNode doubleWithRightChild( AvlNode x )
{
    x.right = rotateWithLeftChild( x.right );
    return rotateWithRightChild( x );
}

Which Rotation to apply?

Figure 7 below illustrates the four cases. x marks the first out-of-balance node along the path from the point of insertion to the root. The key k has been inserted in one of the four areas of the tree delineated by the four inequalities.

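[Figure 7: the four regions below node x, where p and q are the keys in the left and right children of x.]

Case 1: k < p
Case 2: p < k < x
Case 3: x < k < q
Case 4: q < k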

After the insertion takes place, the insert function recurses up the tree, following a direct path to the root, examining every node along that path for an imbalance. If such a node, say x, is found, the function compares the key value that was inserted, say k, with the key value in x and the key value in one of x's children. For k < x, the key in the left child of x is used; otherwise, the key in the right child of x is used. The correct rotation is determined from these two comparisons. The code of the insert function:
/**
 * Internal method to insert into a subtree.
 * @param x the item to insert.
 * @param t the node that roots the tree.
 * @return the new root.
 */
AvlNode insert( Comparable x, AvlNode t )
{
    if( t == null )
        t = new AvlNode( x, null, null );
    else if( x.compareTo( t.element ) < 0 )
    {
        t.left = insert( x, t.left );                       // recurse down left subtree
        if( height( t.left ) - height( t.right ) == 2 )     // check for imbalance on the way up
        {
            if( x.compareTo( t.left.element ) < 0 )         // check which rotation to apply
                t = rotateWithLeftChild( t );
            else
                t = doubleWithLeftChild( t );
        }
    }
    else if( x.compareTo( t.element ) > 0 )
    {
        t.right = insert( x, t.right );                     // recurse down right subtree
        if( height( t.right ) - height( t.left ) == 2 )     // check for imbalance on the way up
        {
            if( x.compareTo( t.right.element ) > 0 )        // check which rotation to apply
                t = rotateWithRightChild( t );
            else
                t = doubleWithRightChild( t );
        }
    }
    else
        ;  // Duplicate key; do nothing
    t.height = max( height( t.left ), height( t.right ) ) + 1;  // adjust height in father of rotated node
    return t;
}

Notice that each recursive call to insert returns a reference that is assigned to the left or right child of the current node. That child will only change where the new node itself is attached, or where the insertion causes an imbalance and a rotation takes place.


Here is a simple example, showing the sequence of calls that took place to insert key 1 into the tree on the left, and the rotation that took place after the insertion to create the balanced tree on the right.

[Figure 8: on the left, the tree with root 10, children 5 and 15, node 3 as the left child of 5, and the new node 1 as the left child of 3; on the right, the balanced tree with root 10, children 3 and 15, and nodes 1 and 5 as the children of 3.]

Here is the sequence of calls:

insert(1,10)                 // the second argument is really a reference to the node containing 10
    insert(1,5)
        insert(1,3)
            insert(1,null)   // the left child of 3 is null
            returns a reference to node1
        node3.leftChild = node1
        returns a reference to node3
    node5.leftChild = node3
    // node5 is out of balance
    rotateWithLeftChild(node5) returns a reference to node3
    returns a reference to node3
node10.leftChild = node3
returns a reference to node10

Here is an example where keys 1,2,3,4,5,6,7,16,15,14,13,12,11 are inserted into an initially empty AVL Tree. The nodes with bold rings are the first out-of-balance nodes on the path from the point of insertion to the root. The operation performed to rebalance the tree is given between the before-and-after trees.


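[Figures: before-and-after trees for each rebalancing step. The trees are summarized below in the notation root(left, right), with - for a missing child. The figures start from the balanced tree 2(1, 3) holding the keys 1, 2, 3.]

After inserting 4, 5: first out-of-balance node 3, rotateWithRightChild(3)
    before: 2(1, 3(-, 4(-, 5)))
    after:  2(1, 4(3, 5))

After inserting 6: first out-of-balance node 2, rotateWithRightChild(2)
    before: 2(1, 4(3, 5(-, 6)))
    after:  4(2(1, 3), 5(-, 6))

After inserting 7: first out-of-balance node 5, rotateWithRightChild(5)
    before: 4(2(1, 3), 5(-, 6(-, 7)))
    after:  4(2(1, 3), 6(5, 7))

After inserting 16, 15: first out-of-balance node 7, doubleWithRightChild(7)
    before: 4(2(1, 3), 6(5, 7(-, 16(15, -))))
    after:  4(2(1, 3), 6(5, 15(7, 16)))

After inserting 14: first out-of-balance node 6, doubleWithRightChild(6)
    before: 4(2(1, 3), 6(5, 15(7(-, 14), 16)))
    after:  4(2(1, 3), 7(6(5, -), 15(14, 16)))

After inserting 13: first out-of-balance node 4, rotateWithRightChild(4)
    before: 4(2(1, 3), 7(6(5, -), 15(14(13, -), 16)))
    after:  7(4(2(1, 3), 6(5, -)), 15(14(13, -), 16))

After inserting 12: first out-of-balance node 14, rotateWithLeftChild(14)
    before: 7(4(2(1, 3), 6(5, -)), 15(14(13(12, -), -), 16))
    after:  7(4(2(1, 3), 6(5, -)), 15(13(12, 14), 16))

After inserting 11: first out-of-balance node 15, rotateWithLeftChild(15)
    before: 7(4(2(1, 3), 6(5, -)), 15(13(12(11, -), 14), 16))
    after:  7(4(2(1, 3), 6(5, -)), 13(12(11, -), 15(14, 16)))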
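The whole insertion sequence can be replayed with a runnable version of the insert function. The sketch below follows the code in these notes but makes a few simplifying assumptions for the sake of a self-contained example: keys are plain ints rather than Comparable objects, an empty subtree has height -1, and the names `AvlInsertDemo`, `Node`, `buildExample` and `preorder` are invented here.

```java
/** Replays the insertion example with int keys and prints the final tree in preorder. */
class AvlInsertDemo {
    static class Node {
        int key, height;               // a new leaf has height 0
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static int height(Node t) { return t == null ? -1 : t.height; }

    static Node rotateWithLeftChild(Node x) {
        Node temp = x.left;
        x.left = temp.right;
        temp.right = x;
        x.height = Math.max(height(x.left), height(x.right)) + 1;
        temp.height = Math.max(height(temp.left), x.height) + 1;
        return temp;
    }

    static Node rotateWithRightChild(Node x) {
        Node temp = x.right;
        x.right = temp.left;
        temp.left = x;
        x.height = Math.max(height(x.left), height(x.right)) + 1;
        temp.height = Math.max(height(temp.right), x.height) + 1;
        return temp;
    }

    static Node doubleWithLeftChild(Node x) {
        x.left = rotateWithRightChild(x.left);
        return rotateWithLeftChild(x);
    }

    static Node doubleWithRightChild(Node x) {
        x.right = rotateWithLeftChild(x.right);
        return rotateWithRightChild(x);
    }

    static Node insert(int k, Node t) {
        if (t == null) return new Node(k);
        if (k < t.key) {
            t.left = insert(k, t.left);
            if (height(t.left) - height(t.right) == 2)      // imbalance found on the way up
                t = (k < t.left.key) ? rotateWithLeftChild(t) : doubleWithLeftChild(t);
        } else if (k > t.key) {
            t.right = insert(k, t.right);
            if (height(t.right) - height(t.left) == 2)
                t = (k > t.right.key) ? rotateWithRightChild(t) : doubleWithRightChild(t);
        }                                                   // duplicate key: do nothing
        t.height = Math.max(height(t.left), height(t.right)) + 1;
        return t;
    }

    /** Inserts the keys of the example, in the order given in the text. */
    static Node buildExample() {
        int[] keys = {1, 2, 3, 4, 5, 6, 7, 16, 15, 14, 13, 12, 11};
        Node root = null;
        for (int k : keys) root = insert(k, root);
        return root;
    }

    /** Keys in preorder (node, left subtree, right subtree), separated by spaces. */
    static String preorder(Node t) {
        return t == null ? "" : t.key + " " + preorder(t.left) + preorder(t.right);
    }

    public static void main(String[] args) {
        // Prints: 7 4 2 1 3 6 5 13 12 11 15 14 16
        System.out.println(preorder(buildExample()).trim());
    }
}
```

The preorder listing puts the root first, so the output shows directly that 7 ends up at the root with 4 and 13 as its children.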


AVL Tree Deletion

Deletion is the most complex operation in almost all data structures. In an AVL tree, the process begins with a simple BST deletion. This may upset the balance in an ancestor node of the point of deletion. As the function recurses out along a direct path to the root, nodes are examined for imbalance. If such a node, say x, is found, a rotation is used to correct the imbalance. Unfortunately, the rotation may not entirely cure the problem. Depending on the situation, more rotations may be necessary along the path to the root. However, there can only be O(log N) such rotations, so the overall time for a deletion is O(log N).

Before continuing, we must establish what is meant by the point of deletion, because the search for imbalanced nodes must begin at that point as the delete() function recurses back up the tree.

If we are deleting node x and x is a leaf node (it has zero children), then x is trivially deleted and the point of deletion is the parent of the deleted node.

If x has one child, then that child (and its subtrees) move up one level to take the place of x. The point of deletion is, again, the parent of the deleted node.

If x has two children, then x is replaced by the smallest key node in the right subtree or the largest key node in the left subtree. The point of deletion is the parent of the node that is moved up to replace x.

As with insert, the rebalance process begins with the first out-of-balance node on the path from the deletion point to the root. There are five cases to consider, and each has a mirror image.

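[Case 1 figure: x has left subtree A, where the deletion (*) took place, and right subtree B of height h. After the deletion A has height h-1. Height unchanged, no rotations.]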

In Case 1, the deletion did not cause an imbalance, nor did it change the height of the overall subtree. Therefore the delete function does not need to examine any ancestor of x for imbalance.

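[Case 2 figure: x has left subtree A, where the deletion (*) took place, and right subtree B of height h. A had height h+1 before the deletion and has height h afterwards. Height reduced, no rotations.]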

In Case 2, the deletion did not cause an imbalance, but it did reduce the height of the overall subtree. The delete function must next check x's parent for an imbalance.


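[Case 3a figure: before, x is the root with left subtree A (height h-1 after the deletion *) and right child y, whose subtrees B and C both have height h. After a single left rotation, y is the root with left child x (subtrees A and B) and right subtree C. Height unchanged.]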

In Case 3a, the deletion did cause an imbalance, which was corrected by a call to rotateWithRightChild(x). The rotation did not change the height of the overall subtree, so the delete function does not need to check any of the subtree's ancestors for imbalance.

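[Case 3b figure: before, x is the root with left subtree A (height h-1 after the deletion *) and right child y, whose subtrees are B (height h-1) and C (height h). After a single left rotation, y is the root with left child x (subtrees A and B) and right subtree C. Height reduced.]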

Case 3b is similar to Case 3a. The deletion did cause an imbalance, which was corrected by a call to rotateWithRightChild(x). The rotation did reduce the height of the overall subtree, so the delete function must next check the parent of the subtree for an imbalance.


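[Figure for the double-rotation case: x is the root, with the deletion in its left subtree and with right child y; y has left child z, whose subtrees B and C have height h-1 or h-2, and right subtree D of height h-1. Height reduced, double rotation.]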

In this case, one of the subtrees B or C could have height h-2 and the other h-1, or they could both have height h-1. The deletion did cause an imbalance, which was corrected by a call to doubleWithRightChild(x). The rotation did reduce the height of the overall subtree, so the delete function must next check the parent of the subtree for an imbalance.

Whenever a rotation during a delete operation reduces the overall subtree height, in the worst case, rotations could be necessary at each of x's ancestors, all the way to the root. See Figure 8 below.

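[Figure 8: a chain of ancestors x1, x2, x3, each with a sibling subtree rooted at y1, y2, y3 respectively; the subtree heights marked in the figure run from h-1 up to h+3.]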

Rotations are required at x1, then x2, then x3. Remember that, for each of these five cases, there is a mirror-image case. After a deletion, if an imbalance is found, deciding which rotation to apply is left as an exercise.
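For readers who want to check their answer to the exercise, here is one possible approach, offered as a sketch and not as the method intended by these notes. It chooses the rotation by comparing the heights of the two subtrees of the taller child: a single rotation when the outer subtree is at least as tall as the inner one (Cases 3a and 3b), and a double rotation otherwise. All names (`AvlDeleteDemo`, `Node`, `rebalance`, `fixHeight`, `isAvl`), the int keys, and the height of -1 for an empty subtree are assumptions of the example.

```java
/** Sketch of AVL deletion that rebalances every node on the way back to the root. */
class AvlDeleteDemo {
    static class Node {
        int key, height;               // a leaf has height 0
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static int height(Node t) { return t == null ? -1 : t.height; }

    static void fixHeight(Node t) { t.height = Math.max(height(t.left), height(t.right)) + 1; }

    static Node rotateWithLeftChild(Node x) {
        Node temp = x.left;
        x.left = temp.right;
        temp.right = x;
        fixHeight(x);
        fixHeight(temp);
        return temp;
    }

    static Node rotateWithRightChild(Node x) {
        Node temp = x.right;
        x.right = temp.left;
        temp.left = x;
        fixHeight(x);
        fixHeight(temp);
        return temp;
    }

    /** Restores the AVL property at t, assuming both subtrees of t are already AVL trees. */
    static Node rebalance(Node t) {
        if (height(t.left) - height(t.right) == 2) {
            if (height(t.left.left) < height(t.left.right))    // inner subtree taller: double rotation
                t.left = rotateWithRightChild(t.left);
            t = rotateWithLeftChild(t);
        } else if (height(t.right) - height(t.left) == 2) {
            if (height(t.right.right) < height(t.right.left))  // mirror image
                t.right = rotateWithLeftChild(t.right);
            t = rotateWithRightChild(t);
        }
        fixHeight(t);
        return t;
    }

    static Node insert(int k, Node t) {
        if (t == null) return new Node(k);
        if (k < t.key) t.left = insert(k, t.left);
        else if (k > t.key) t.right = insert(k, t.right);
        return rebalance(t);
    }

    static Node delete(int k, Node t) {
        if (t == null) return null;                            // key not present
        if (k < t.key) {
            t.left = delete(k, t.left);
        } else if (k > t.key) {
            t.right = delete(k, t.right);
        } else if (t.left != null && t.right != null) {
            Node m = t.right;                                  // smallest key in the right subtree
            while (m.left != null) m = m.left;
            t.key = m.key;
            t.right = delete(m.key, t.right);
        } else {
            return (t.left != null) ? t.left : t.right;        // zero or one child
        }
        return rebalance(t);                                   // runs at every ancestor on the way up
    }

    /** True if every node is balanced and stores its correct height. */
    static boolean isAvl(Node t) {
        if (t == null) return true;
        int hl = height(t.left), hr = height(t.right);
        return Math.abs(hl - hr) <= 1 && t.height == Math.max(hl, hr) + 1
                && isAvl(t.left) && isAvl(t.right);
    }

    static String preorder(Node t) {
        return t == null ? "" : t.key + " " + preorder(t.left) + preorder(t.right);
    }

    public static void main(String[] args) {
        Node t = null;
        for (int k : new int[] {2, 1, 4, 3, 5}) t = insert(k, t);
        t = delete(1, t);                          // Case 3a: single rotation at node 2
        System.out.println(preorder(t).trim());    // prints: 4 2 3 5
    }
}
```

In the usage example, deleting 1 leaves node 2 with an empty left subtree and a right child 4 whose subtrees have equal height, so a single rotateWithRightChild(2) is applied and the height of the tree is unchanged.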
