
Chapter 3 – Search Trees

Contents
• Trees
• Binary trees
• Binary search
• Binary search trees
• Balanced search trees
• AVL trees
• Multiway search trees
• Red-black trees
• Splay trees
• B-trees
• Search trees in Java
1
Trees – definition, terminology
• Graph theory: A tree T is a connected, undirected, acyclic graph
• A tree can be either unrooted or rooted. This chapter covers rooted trees.
• Informally: A tree T is a set of nodes, which have a parent-child relationship.
• Properties:
  ▪ If T is nonempty, it has a single node called the root of T
  ▪ Each node v of T (except the root) has a unique parent
  ▪ Every node v with parent w is a child of w
  ▪ A tree T can be empty
• Root: node without parent (A)
• Internal node: node with at least one child (A, B, C, F)
• External node (or leaf): node without children (E, I, J, K, G, H, D)
• Ancestors of a node: parent, grandparent, great-grandparent, etc.
• Depth of a node: number of ancestors
• Height of a tree (or a subtree): maximum depth of any node
• Descendant of a node: child, grandchild, great-grandchild, etc.
• Subtree: tree consisting of a node of T and its descendants
[Figure: example tree with nodes A (level 0); B, C, D (level 1); E, F, G, H (level 2); I, J, K (level 3), with one subtree highlighted.]
2
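The definitions of depth and height above can be sketched in Java as follows. This is a minimal illustration, not code from the course: the class TreeNodeDemo and its methods are hypothetical names, and the small tree built in main is made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal rooted-tree node; not a class from the slides.
final class TreeNodeDemo {
    final String label;
    TreeNodeDemo parent;                       // null for the root
    final List<TreeNodeDemo> children = new ArrayList<>();

    TreeNodeDemo(String label) { this.label = label; }

    // Creates a child with the given label and links it to this node.
    TreeNodeDemo addChild(String childLabel) {
        TreeNodeDemo child = new TreeNodeDemo(childLabel);
        child.parent = this;
        children.add(child);
        return child;
    }

    // Depth: number of ancestors, so the root has depth 0.
    int depth() {
        int d = 0;
        for (TreeNodeDemo p = parent; p != null; p = p.parent) d++;
        return d;
    }

    // Height of the subtree rooted here: maximum depth of a node, measured from this node.
    int height() {
        int h = 0;
        for (TreeNodeDemo c : children) h = Math.max(h, 1 + c.height());
        return h;
    }

    public static void main(String[] args) {
        TreeNodeDemo a = new TreeNodeDemo("A");
        TreeNodeDemo b = a.addChild("B");
        a.addChild("C");
        TreeNodeDemo f = b.addChild("F");
        TreeNodeDemo k = f.addChild("K");
        System.out.println("depth(K) = " + k.depth() + ", height(A) = " + a.height());
    }
}
```

Storing a parent link makes depth a simple walk towards the root, while height needs a recursive pass over the children.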
Binary trees
• A binary tree is a tree with the following properties:
  ▪ Each internal node has at most two children (exactly two for proper binary trees)
  ▪ The children of a node are in an ordered pair
• We call the children of an internal node left child and right child
• Alternative recursive definition: a binary tree is either
  ▪ an empty tree,
  ▪ a tree consisting of a single node, or
  ▪ a tree whose root has an ordered pair of children, each of which is a binary tree

Arithmetic expressions
• Binary tree associated with an arithmetic expression
  ▪ internal nodes: operators
  ▪ external nodes: operands
• Example: arithmetic expression tree for the expression ((2 × (a - 1)) + (3 × b))
[Figure: expression tree with root +; its children are the two × nodes; the left × has children 2 and -, where - has children a and 1; the right × has children 3 and b.]

Decision trees
• Each internal node is associated with an attribute or variable
• Each external node corresponds to an outcome
• Example: how to decide on a restaurant
[Figure: decision tree. Root "Want a fast meal?". Yes: "How about a burger?" (Yes: Mc’s, No: Tim’s). No: "Have more money?" (Yes: The K, No: Chalet).]
3
Proper binary trees – properties
• Notation
  n  number of nodes
  e  number of external nodes
  i  number of internal nodes
  h  height
• Properties:
  ▪ n = e + i
  ▪ e = i + 1
  ▪ n = 2e - 1
  ▪ h + 1 ≤ e ≤ 2^h
  ▪ h ≤ i ≤ 2^h - 1
  ▪ log2(n + 1) - 1 ≤ h ≤ (n - 1)/2
4
Binary trees – traversal
• Preorder (L): A node is visited before its left and right subtrees
• Inorder (B): A node is visited after its left subtree and before its right subtree
• Postorder (R): A node is visited after its left and right subtrees
• Euler tour traversal:
  ▪ Recursively called on the left and right children
  ▪ Visit every internal node three times: on the left (L), from below (B) and on the right (R)
  ▪ External node visited only once

Algorithm EulerTour(v)
  visit(v)                  {on the left}
  if hasLeft(v)
    EulerTour(left(v))
  visit(v)                  {from below}
  if hasRight(v)
    EulerTour(right(v))
  visit(v)                  {on the right}

• Ex. inorder: draw a binary tree
  ▪ x(v) = inorder rank of v
  ▪ y(v) = depth of v
• Ex. Inorder: Print arithmetic expression
  ▪ It outputs: ((2 × (a - 1)) + (3 × b))
• Ex. Postorder: evaluate arithmetic expression
  ▪ Specialization of postorder traversal
  ▪ recursive method returning the value of a subtree
  ▪ when visiting an internal node, combine the values of the subtrees

Algorithm evalExpr(v)
  if isExternal(v)
    return v.element()
  else
    x ← evalExpr(left(v))
    y ← evalExpr(right(v))
    ◊ ← operator stored at v
    return x ◊ y

[Figure: Euler tour of the expression tree for ((2 × (a - 1)) + (3 × b)), with the L, B and R visits marked at each node.]
5
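The two expression-tree examples (inorder printing and postorder evaluation) can be sketched in Java as below. ExprNode is a hypothetical class, "*" stands in for the multiplication sign, and the values chosen for a and b in main are assumptions made only to have something to evaluate.

```java
import java.util.Map;

// Hypothetical expression-tree node: operators at internal nodes, operands at leaves.
final class ExprNode {
    final String label;          // "+", "-", "*" or an operand such as "2" or "a"
    final ExprNode left, right;  // both null for an external node

    ExprNode(String label) { this(label, null, null); }

    ExprNode(String label, ExprNode left, ExprNode right) {
        this.label = label;
        this.left = left;
        this.right = right;
    }

    boolean isExternal() { return left == null && right == null; }

    // Postorder: evaluate both subtrees, then combine them with the operator at this node.
    double eval(Map<String, Double> vars) {
        if (isExternal()) {
            Double bound = vars.get(label);
            return bound != null ? bound : Double.parseDouble(label);
        }
        double x = left.eval(vars);
        double y = right.eval(vars);
        switch (label) {
            case "+": return x + y;
            case "-": return x - y;
            case "*": return x * y;
            default: throw new IllegalArgumentException("unknown operator " + label);
        }
    }

    // Inorder: "(" before the left subtree, the operator in between, ")" after the right subtree.
    String print() {
        if (isExternal()) return label;
        return "(" + left.print() + " " + label + " " + right.print() + ")";
    }

    // Builds ((2 * (a - 1)) + (3 * b)), the expression used on the slide.
    static ExprNode example() {
        ExprNode left = new ExprNode("*", new ExprNode("2"),
                new ExprNode("-", new ExprNode("a"), new ExprNode("1")));
        ExprNode right = new ExprNode("*", new ExprNode("3"), new ExprNode("b"));
        return new ExprNode("+", left, right);
    }

    public static void main(String[] args) {
        ExprNode root = example();
        System.out.println(root.print());
        System.out.println(root.eval(Map.of("a", 5.0, "b", 2.0)));
    }
}
```

Both methods are specializations of the same recursive walk; they differ only in when the node itself is handled relative to its subtrees.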
Sorted maps – binary search
• Binary search performs find(k) in a sorted map implemented on an array, sorted by key
  ▪ Performs a search in a logarithmic number of steps: O(log n)
• Can be easily adapted to find more than one entry with key k
• Can also locate the keys in a range in O(log n) time (plus time proportional to the number of keys reported)

Algorithm binarySearch(S, k, low, high):
  if low > high then
    return null
  else
    mid ← ⌊(low + high) / 2⌋
    e ← S.get(mid)
    if k = e.getKey() then
      return e
    else if k < e.getKey() then
      return binarySearch(S, k, low, mid-1)
    else
      return binarySearch(S, k, mid+1, high)

• Example: find(7)
  keys:   0  1  3  4  5  7  8  9 11 14 16 18 19
  index:  0  1  2  3  4  5  6  7  8  9 10 11 12
  step 1: l = 0, m = 6, h = 12   (key 8 > 7, continue on the left)
  step 2: l = 0, m = 2, h = 5    (key 3 < 7, continue on the right)
  step 3: l = 3, m = 4, h = 5    (key 5 < 7, continue on the right)
  step 4: l = m = h = 5          (key 7 found)
6
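A possible Java rendering of the binarySearch pseudocode is sketched below. It assumes a plain sorted int array instead of the sorted map S, and it returns an index (or -1) where the pseudocode returns an entry (or null); BinarySearchDemo and find are illustrative names.

```java
// Sketch of the recursive binarySearch pseudocode on a sorted int array.
// Returns the index of k, or -1 where the pseudocode returns null.
final class BinarySearchDemo {
    static int binarySearch(int[] s, int k, int low, int high) {
        if (low > high) return -1;
        int mid = low + (high - low) / 2;   // same value as (low + high) / 2, without int overflow
        if (k == s[mid]) return mid;
        if (k < s[mid]) return binarySearch(s, k, low, mid - 1);
        return binarySearch(s, k, mid + 1, high);
    }

    // Initial call over the whole array.
    static int find(int[] s, int k) { return binarySearch(s, k, 0, s.length - 1); }

    public static void main(String[] args) {
        int[] keys = {0, 1, 3, 4, 5, 7, 8, 9, 11, 14, 16, 18, 19};
        System.out.println(find(keys, 7));
        System.out.println(find(keys, 6));
    }
}
```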
Binary search trees (BST)
• A binary search tree is a binary tree storing keys (or key-value entries) at its internal nodes and satisfying the following property:
  ▪ Let u, v, and w be three nodes such that u is in the left subtree of v and w is in the right subtree of v. We have key(u) ≤ key(v) ≤ key(w)
• External nodes do not store items
• BST are proper binary trees
• Inorder traversal of a binary search tree visits the keys in increasing order
• A binary search tree implements a sorted map ADT

Search
• Can be easily implemented recursively

Algorithm TreeSearch(k, v)
  if T.isExternal(v)                 {base case}
    return v
  if k < key(v)                      {general case}
    return TreeSearch(k, T.left(v))
  else if k > key(v)
    return TreeSearch(k, T.right(v))
  else                               {k = key(v) -- base case}
    return v

• Example: Find 4
[Figure: BST with root 6; children 2 and 9; 2 has children 1 and 4; 9 has left child 8.]
7
Binary search trees – insertion
• Search for key k using TreeSearch
  ▪ If k is not found, it is inserted at w, the leaf reached by the search
  ▪ If k is already in the tree in node w, the same insertion method is used with the left child of w
• Initial call:
  ▪ TreeInsert(k,x,T.root())

Algorithm TreeInsert(k,x,v)
  Input: A key k, a value x, and a node v of T
  Output: A new node w of T
  w ← TreeSearch(k,v)
  if k = key(w) then
    return TreeInsert(k,x,T.left(w))
  T.insertAtExternal(k,x,w)
  return w

• Example: insert 5
[Figure: in the BST with root 6, children 2 and 9, and nodes 1, 4, 8 below, the search for 5 goes left at 6 (<), right at 2 (>) and right at 4 (>), ending at the external node w; afterwards 5 is stored at w, the right child of 4.]
8
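TreeSearch and TreeInsert can be sketched in Java as follows. This is only one way to write them, and it makes a simplifying assumption: null links stand in for the external nodes of the slides, so there is no insertAtExternal operation. BstDemo and its members are illustrative names.

```java
// Minimal BST sketch. Assumption: null links stand in for the slides' external nodes.
final class BstDemo {
    static final class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    Node root;

    // TreeSearch: returns the node holding k, or null if the search falls off the tree.
    Node search(int k) {
        Node v = root;
        while (v != null && k != v.key) v = (k < v.key) ? v.left : v.right;
        return v;
    }

    // TreeInsert: walk down as in search and attach a new node where the search ends.
    // Duplicates go to the left subtree, as on the slide.
    void insert(int k) { root = insert(root, k); }

    private Node insert(Node v, int k) {
        if (v == null) return new Node(k);
        if (k <= v.key) v.left = insert(v.left, k);
        else v.right = insert(v.right, k);
        return v;
    }

    // Inorder traversal; for a BST this lists the keys in nondecreasing order.
    String inorder() {
        StringBuilder sb = new StringBuilder();
        inorder(root, sb);
        return sb.toString().trim();
    }

    private void inorder(Node v, StringBuilder sb) {
        if (v == null) return;
        inorder(v.left, sb);
        sb.append(v.key).append(' ');
        inorder(v.right, sb);
    }

    public static void main(String[] args) {
        BstDemo t = new BstDemo();
        for (int k : new int[] {6, 2, 9, 1, 4, 8}) t.insert(k);
        t.insert(5);
        System.out.println(t.inorder());
    }
}
```

Inserting 6, 2, 9, 1, 4, 8 in that order rebuilds the example tree, and inserting 5 then places it as the right child of 4.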
Binary search trees – deletion
• First, we search for key k
• Let v be the node storing k
• Case 1: Node v has a leaf child w
  ▪ remove v and w from the tree with operation removeExternal(w), which removes w and its parent
  ▪ Example: remove 5
• Case 2: The key k to be removed is stored at a node v whose children are both internal:
  ▪ Find the internal node w that follows v in inorder traversal
  ▪ Copy key(w) into node v
  ▪ Remove node w and its left child z (which must be a leaf) by means of operation removeExternal(z)
  ▪ Example: Remove 3
[Figure: BST with root 1, whose right child 3 has children 2 and 8; 8 has children 6 and 9; 6 has left child 5. After removing 5: the same tree without 5. After removing 3: key 6 is copied into v, giving 1; 6; 2, 8; 9.]
9
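The two deletion cases can be sketched in Java as below, again assuming that null links play the role of external nodes (so removeExternal becomes "return the other child"). BstRemoveDemo is an illustrative name, and the insertion order in example() is chosen so that the resulting tree matches the one in the figure.

```java
// Sketch of BST removal with null links as external nodes (an assumption).
final class BstRemoveDemo {
    static final class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static Node insert(Node v, int k) {
        if (v == null) return new Node(k);
        if (k < v.key) v.left = insert(v.left, k);
        else v.right = insert(v.right, k);
        return v;
    }

    // Returns the root of the subtree after removing k (unchanged if k is absent).
    static Node remove(Node v, int k) {
        if (v == null) return null;
        if (k < v.key) { v.left = remove(v.left, k); return v; }
        if (k > v.key) { v.right = remove(v.right, k); return v; }
        // Case 1: at least one child is external, so the other child replaces v.
        if (v.left == null) return v.right;
        if (v.right == null) return v.left;
        // Case 2: both children are internal. Copy the inorder successor w into v,
        // then remove w from the right subtree (w has no left child).
        Node w = v.right;
        while (w.left != null) w = w.left;
        v.key = w.key;
        v.right = remove(v.right, w.key);
        return v;
    }

    // Builds the tree of the example: 1; 3; 2, 8; 6, 9; 5.
    static Node example() {
        Node root = null;
        for (int k : new int[] {1, 3, 2, 8, 6, 9, 5}) root = insert(root, k);
        return root;
    }

    public static void main(String[] args) {
        Node root = example();
        root = remove(root, 5);   // case 1
        root = remove(root, 3);   // case 2: 6 replaces 3
        System.out.println(root.right.key);
    }
}
```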
Binary search trees – performance
• Consider a sorted map with n entries implemented by means of a binary search tree of height h
  ▪ the space used is O(n)
  ▪ methods find, insert and remove run in O(h) time, which is O(n) in the worst case
• The height h is O(n) in the worst case and O(log n) in the best case
• Average case:
  ▪ Assume a random sequence of keys, and
  ▪ all keys are equally likely
  ▪ Then, the average depth of all nodes in the tree is O(log n)
  ▪ However, when insertions and deletions are interleaved, the average case may not be O(log n). At least no proof exists.
[Figure: Unbalanced: the keys 1 to 5 forming a single path, once in increasing and once in decreasing order. Balanced: root 4 with children 2 and 6 and leaves 1, 3, 5, 7.]
10
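The gap between the worst and best case can be observed with a small experiment, sketched below under the assumption that null links are external nodes (so a tree with a single key has height 1). BstHeightDemo and the choice n = 1023 are illustrative only.

```java
// Compares the height of a BST built from sorted keys with one built in a balanced order.
// Assumption: null links are external nodes, so a tree holding one key has height 1.
final class BstHeightDemo {
    static final class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Iterative insertion, so building a long chain does not recurse.
    static Node insert(Node root, int k) {
        Node fresh = new Node(k);
        if (root == null) return fresh;
        Node v = root;
        while (true) {
            if (k < v.key) {
                if (v.left == null) { v.left = fresh; return root; }
                v = v.left;
            } else {
                if (v.right == null) { v.right = fresh; return root; }
                v = v.right;
            }
        }
    }

    static int height(Node v) {
        return v == null ? 0 : 1 + Math.max(height(v.left), height(v.right));
    }

    // Inserts 1..n in increasing order: every key becomes a right child.
    static Node sortedOrder(int n) {
        Node root = null;
        for (int k = 1; k <= n; k++) root = insert(root, k);
        return root;
    }

    // Inserts the middle key of [lo, hi] first, then recurses on both halves.
    static Node balancedOrder(Node root, int lo, int hi) {
        if (lo > hi) return root;
        int mid = lo + (hi - lo) / 2;
        root = insert(root, mid);
        root = balancedOrder(root, lo, mid - 1);
        return balancedOrder(root, mid + 1, hi);
    }

    public static void main(String[] args) {
        int n = 1023;
        System.out.println("sorted order:   height " + height(sortedOrder(n)));
        System.out.println("balanced order: height " + height(balancedOrder(null, 1, n)));
    }
}
```

With n = 7 the balanced order inserts 4, 2, 1, 3, 6, 5, 7 and reproduces the balanced tree of the figure, while the sorted order yields a path with one key per level.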
AVL trees
• AVL trees are balanced
• Def: An AVL Tree is a binary search tree such that for every internal node v of T, the heights of the subtrees rooted at the children of v can differ by at most 1
• Named after its inventors:
  ▪ Adel’son-Vel’ski and Landis
• Fact: The minimum number of keys of an AVL tree of height h, n(h), satisfies:
  n(h) > 2^(h/2 - 1)
• Theorem: The height of an AVL tree storing n keys is O(log n)
• Search in an AVL tree is the same as in a BST
[Figure: Example AVL tree with root 44 (height 4); children 17 and 78; 17 has right child 32; 78 has children 50 and 88; 50 has children 48 and 62. The height of each node is shown next to it.]
11
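The bound on n(h) can be checked numerically. The sketch below uses the standard recurrence n(1) = 1, n(2) = 2, n(h) = 1 + n(h-1) + n(h-2), which is not stated on the slide and is brought in here as an assumption: a minimal AVL tree of height h has a root, one minimal subtree of height h-1 and one of height h-2. AvlMinKeysDemo is an illustrative name.

```java
// Minimum number of keys n(h) in an AVL tree of height h, using the standard recurrence
// n(1) = 1, n(2) = 2, n(h) = 1 + n(h-1) + n(h-2) (external nodes have height 0).
final class AvlMinKeysDemo {
    static long minKeys(int h) {
        if (h <= 0) return 0;
        long prev2 = 1, prev1 = 2;         // n(1), n(2)
        if (h == 1) return prev2;
        for (int i = 3; i <= h; i++) {
            long next = 1 + prev1 + prev2;
            prev2 = prev1;
            prev1 = next;
        }
        return prev1;
    }

    // Checks the bound n(h) > 2^(h/2 - 1) from the slide for one height.
    static boolean boundHolds(int h) {
        return minKeys(h) > Math.pow(2.0, h / 2.0 - 1.0);
    }

    public static void main(String[] args) {
        for (int h = 1; h <= 7; h++) {
            System.out.println("n(" + h + ") = " + minKeys(h));
        }
    }
}
```

Taking logarithms of n(h) > 2^(h/2 - 1) gives h < 2 log2 n(h) + 2, which is where the O(log n) height in the theorem comes from.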
AVL trees – insertion
• Insertion is the same as in a BST
• But balance has to be restored
• After an insertion:
  ▪ Trace a path from the inserted node w all the way up to the root
  ▪ For every node v, check that the heights of left and right children differ by at most 1
  ▪ If for a node z, they differ by more than 1, then perform restructure(x):
    ▪ Let y and x be the child and grandchild of z in that path
    ▪ Apply the restructuring operation on z
    ▪ Once one restructuring operation is applied locally, the AVL tree becomes balanced, globally
    ▪ The heights of all nodes are stored and must be updated too.
• Case 1a: It’s like a “single rotation” (a left rotation about a)
• Case 2a: It’s like a “double rotation” (a right rotation about c, then a left rotation about a)
• (the other two cases, 1b and 2b, are symmetrical)
[Figure: Cases 1a and 2a before and after restructuring, with nodes labelled a, b, c (matched to x, y, z) and subtrees T0 to T3.]
• Example:
  ▪ After insertion of 54: 44; 17, 78 (c = z); 32, 50 (a = y), 88; 48, 62 (b = x); 54 (w)
  ▪ After restructuring: 44; 17, 62; 32, 50, 78; 48, 54, 88
12
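AVL insertion can be sketched in Java as below. Two assumptions differ from the slides: null links are external nodes (height 0), and the rebalancing step is written with left and right rotations rather than as a single restructure(x) operation. AvlDemo and its method names are illustrative.

```java
// AVL insertion sketch. Assumptions: null links are external nodes (height 0), and the
// rebalancing is written with rotations rather than the slides' restructure(x).
final class AvlDemo {
    static final class Node {
        final int key;
        int height = 1;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static int height(Node v) { return v == null ? 0 : v.height; }

    static void update(Node v) { v.height = 1 + Math.max(height(v.left), height(v.right)); }

    static Node rotateLeft(Node a) {    // a's right child b moves above a
        Node b = a.right;
        a.right = b.left;
        b.left = a;
        update(a);
        update(b);
        return b;
    }

    static Node rotateRight(Node c) {   // c's left child b moves above c
        Node b = c.left;
        c.left = b.right;
        b.right = c;
        update(c);
        update(b);
        return b;
    }

    // Restores balance at z if its children's heights differ by more than 1.
    static Node rebalance(Node z) {
        update(z);
        int balance = height(z.left) - height(z.right);
        if (balance > 1) {
            // Left-heavy; a taller inner grandchild needs a double rotation.
            if (height(z.left.left) < height(z.left.right)) z.left = rotateLeft(z.left);
            return rotateRight(z);
        }
        if (balance < -1) {
            // Right-heavy; symmetric to the case above.
            if (height(z.right.right) < height(z.right.left)) z.right = rotateRight(z.right);
            return rotateLeft(z);
        }
        return z;
    }

    // BST insertion, rebalancing every node on the way back up to the root.
    static Node insert(Node v, int k) {
        if (v == null) return new Node(k);
        if (k < v.key) v.left = insert(v.left, k);
        else v.right = insert(v.right, k);
        return rebalance(v);
    }

    public static void main(String[] args) {
        Node root = null;
        for (int k : new int[] {44, 17, 78, 32, 50, 88, 48, 62, 54}) root = insert(root, k);
        System.out.println(root.key + " " + root.right.key + " " + root.height);
    }
}
```

Inserting 44, 17, 78, 32, 50, 88, 48, 62 and then 54 follows the example: node 78 becomes unbalanced and a double rotation moves 62 above 50 and 78.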
AVL trees – deletion
• Deletion in an AVL tree is the same as in a BST
• But balance has to be restored
• After the deletion:
  ▪ Let w be the parent of the node being removed.
  ▪ Let z be the first unbalanced node encountered while travelling up the tree from w
  ▪ Also, let y be the child of z with the largest height, and let x be the child of y with the largest height
  ▪ We perform restructure(x) to restore balance at z
  ▪ As this restructuring may upset the balance of another node higher in the tree, we must continue checking for balance until the root of T is reached
• Example: Remove 32
  ▪ Before: 44; 17, 62; 32, 50, 78; 48, 54, 88
  ▪ After removing 32: w = 17, a = z = 44, b = y = 62, c = x = 78
  ▪ After restructuring: 62; 44, 78; 17, 50, 88; 48, 54
[Figure: the three trees of the example, with subtrees T0 to T3 marked.]
13
AVL trees – performance
• a single restructure is O(1)
  ▪ using a linked-structure binary tree
• Search or find is O(log n)
  ▪ height of tree is O(log n), no restructuring needed
• insert is O(log n)
  ▪ initial find is O(log n)
  ▪ Restructuring up the tree, maintaining heights is O(log n)
• remove is O(log n)
  ▪ initial find is O(log n)
  ▪ Restructuring up the tree, maintaining heights is O(log n)
[Figure: Worst case on the example tree after inserting 54: a down phase and an up phase, each spending O(1) per level over O(log n) levels. Total: O(log n).]
14
Multi-way search trees
• A multi-way search tree is an ordered tree such that
  ▪ Each internal node has at least two children and stores d - 1 key-element items (k_i, o_i), where d is the number of children
  ▪ For a node with children v_1 v_2 … v_d storing keys k_1 k_2 … k_(d-1)
    ▪ keys in the subtree of v_1 are less than k_1
    ▪ keys in the subtree of v_i are between k_(i-1) and k_i (i = 2, …, d - 1)
    ▪ keys in the subtree of v_d are greater than k_(d-1)
• The height of a multi-way search tree is O(log n) when the tree is kept balanced, as in a (2,4) tree; an unbalanced one can be much taller

Inorder traversal:
• (k_i, o_i) of node v is visited between the recursive traversals of the subtrees of v rooted at children v_i and v_(i+1)
• The keys of the multi-way search tree are visited in increasing order
[Figure: multi-way search tree with root keys 11 24 and descendants storing 2 6 8, 15, 27 32 and 30; the nodes and keys are labelled 1 to 19 in inorder visit order.]
15
(2,4) trees
• A (2,4) tree (also called 2-4 tree or 2-3-4 tree) is a multi-way search tree with the following properties:
  ▪ Node-Size Property: every internal node has at most four children
  ▪ Depth Property: all the external nodes have the same depth
• Depending on the number of children, an internal node of a (2,4) tree is called a 2-node, 3-node or 4-node
• Theorem: A (2,4) tree storing n items has height O(log n)
• Searching in a (2,4) tree with n items takes O(log n) time
[Figure: (2,4) tree with root 10 15 24 and children 2 8, 12, 18 and 27 32.]
16
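Searching in a multi-way search tree, and therefore in a (2,4) tree, can be sketched in Java as below. MultiwayNode is a hypothetical class: a node with d children stores d - 1 sorted keys, and a null child stands for an external node. The example tree is the (2,4) tree of the figure.

```java
// Search in a multi-way search tree. MultiwayNode is a hypothetical class: a node with
// d children stores d - 1 sorted keys, and a null child stands for an external node.
final class MultiwayNode {
    final int[] keys;
    final MultiwayNode[] children;   // length keys.length + 1

    MultiwayNode(int... keys) {
        this.keys = keys;
        this.children = new MultiwayNode[keys.length + 1];
    }

    // Attaches child i (the subtree for keys between keys[i-1] and keys[i]); returns this.
    MultiwayNode child(int i, MultiwayNode c) {
        children[i] = c;
        return this;
    }

    // Returns true if k is stored in the subtree rooted at this node.
    boolean contains(int k) {
        MultiwayNode v = this;
        while (v != null) {
            int i = 0;
            while (i < v.keys.length && k > v.keys[i]) i++;   // index of the first key >= k
            if (i < v.keys.length && v.keys[i] == k) return true;
            v = v.children[i];                                // descend between keys[i-1] and keys[i]
        }
        return false;
    }

    // The (2,4) tree from the slide: root 10 15 24 over 2 8, 12, 18 and 27 32.
    static MultiwayNode example() {
        return new MultiwayNode(10, 15, 24)
                .child(0, new MultiwayNode(2, 8))
                .child(1, new MultiwayNode(12))
                .child(2, new MultiwayNode(18))
                .child(3, new MultiwayNode(27, 32));
    }

    public static void main(String[] args) {
        MultiwayNode root = example();
        System.out.println(root.contains(18) + " " + root.contains(20));
    }
}
```

In a (2,4) tree each node holds at most three keys, so the scan inside a node is O(1) and the search cost is governed by the O(log n) height.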
Red-black trees
• A red-black tree is a BST that satisfies the following properties:
  ▪ Root Property: the root is black
  ▪ External Property: every leaf is black
  ▪ Internal Property: the children of a red node are black
  ▪ Depth Property: all the leaves have the same black depth
• A (2,4) tree can be easily converted into a red-black tree
• The height of a red-black tree is at most twice the height of its associated (2,4) tree, which is O(log n)
• Then, the height of a red-black tree is O(log n)
[Figure: red-black tree with root 9; children 4 and 15; 4 has children 2 and 6; 15 has children 12 and 21; 6 has right child 7.]
[Figure: conversion of (2,4) nodes: the 2-node 4 becomes a single node 4; the 3-node 3 5 becomes 5 with child 3, or 3 with child 5; the 4-node 2 6 7 becomes 6 with children 2 and 7.]
17
Red-black trees – insertion
• Insertion in a red-black tree is like in a BST, where the newly inserted node z is colored red, unless it is the root
  ▪ We preserve the root, external, and depth properties
  ▪ If the parent v of z is black, we also preserve the internal property and we are done
  ▪ Else (v is red) we have a double red (i.e., a violation of the internal property), which requires a reorganization of the tree
• Reorganization requires:
  ▪ Restructuring
  ▪ Recoloring
• Example: insert 4
[Figure: tree with root 6 and children 3 (v) and 8; the new node z = 4 is inserted below v, producing a double red that is fixed by recoloring.]
18
Red-black trees – deletion
• To perform operation remove(k), we first execute the deletion algorithm for BST
• Let v be the internal node removed, w the external node removed, and r the sibling of w
  ▪ If either v or r was red, we color r black and we are done
  ▪ Else (v and r were both black) we color r double black, which is a violation of the depth property requiring restructuring the tree
• Let x be the parent of r, and y be the sibling of r
• The algorithm for remedying a double black node r with sibling y considers three cases:
  Case 1: y is black and has a red child z
  ▪ We perform a restructuring, and we are done
  Case 2: y is black and its children are both black
  ▪ We perform a recoloring, which may propagate up the double black violation
  Case 3: y is red (and hence it has a black child z)
  ▪ We perform an adjustment, after which either Case 1 or Case 2 applies

Worst-case analysis:
• Insertion and deletion in a red-black tree take O(log n) time.
• Search or find also takes O(log n) time.

• Example: remove 8
[Figure: tree with root 6 and children 3 (y) and 8 (v), where 3 has right child 4; removing 8 leaves a double black node r, and the Case 1 restructuring produces root 4 with children 3 and 6.]
19
Weak AVL trees (optional – not in exam)
• A weak AVL tree or wavl tree has features of both AVL and red-black trees
• Insertions and deletions are simpler though
• Definition:
  ▪ Every node v has a rank, r(v)
  ▪ For every node v, r(v) < r(parent(v))
• Rank difference of v: difference between r(v) and r(parent(v))
• Properties of rank:
  ▪ The rank difference of any non-root node is 1 or 2
  ▪ Every external node has rank 0
  ▪ If both children of an internal node are external nodes, the children must not have a rank difference of 2
• Main properties of wavl trees:
  ▪ Every AVL tree is a wavl tree
  ▪ Every wavl tree can be colored as a red-black tree
  ▪ Every red-black tree can be made a wavl tree after rank assignments
• The height of a wavl tree storing n keys is at most 2 log(n + 1)
• Insertion: Similar to AVL trees, but rebalancing using promotion.
• Deletion: Similar to AVL trees, but rebalancing using demotion.
• Search, insertion and deletion in a wavl tree take O(log n)
• More details found in Section 4.4 of [1].
[Figure: wavl tree with keys 9; 4, 15; 2, 6, 12, 21; 7, drawn against a rank axis from 0 to 5.]
20
Splay trees
• Splay trees are BSTs
• Unlike other trees, they don’t store any height, balance, colors, etc.
• Splaying is the main operation:
  ▪ Bottommost position p after search, insertion or deletion is moved up to the root
  ▪ More frequently accessed nodes stay closer to the root
• Most important feature: the amortized running time (the average over a sequence of operations) of search, insertion and deletion is O(log n)
• Worst-case running time of a single search, insertion or deletion is like in BSTs: O(n)
• Three main splaying operations:
  ▪ Zig-zig
  ▪ Zig-zag
  ▪ Zig
• Example: Insert 5
  ▪ After BST insertion: 6; 2, 9; 1, 4, 8; 5 (5 is the right child of 4)
  ▪ Splaying moves 5 to the root of T: 5; 4, 6; 2, 9; 1, 8
21
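Splaying
• zig-zig: x, its parent y and its grandparent z lie on one side (example: z = 2, y = 4, x = 5, each the right child of the previous one). After restructuring, x is on top, y is its child and z is its grandchild.
• zig-zag: x lies between z and y in key order (example: z = 2, y = 5, x = 4). After restructuring, x is on top with z and y as its children.
• zig: x doesn’t have any grandparent (example: y = 4, x = 5). After restructuring, x is on top with y as its child.
[Figure: the three cases before and after restructuring, with subtrees T0 to T3.]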
22
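The three splaying cases can be sketched in Java with a single rotation helper, as below. This is a bottom-up sketch with parent pointers, assuming null links for external nodes; SplayDemo, rotate and insertPlain are illustrative names, and insertPlain exists only to build a starting tree without splaying.

```java
// Bottom-up splaying sketch with parent pointers. Assumption: null links are external nodes.
final class SplayDemo {
    static final class Node {
        final int key;
        Node left, right, parent;
        Node(int key) { this.key = key; }
    }

    Node root;

    // Rotates x above its parent y, keeping the BST order.
    private void rotate(Node x) {
        Node y = x.parent, z = y.parent;
        if (y.left == x) {
            y.left = x.right;
            if (x.right != null) x.right.parent = y;
            x.right = y;
        } else {
            y.right = x.left;
            if (x.left != null) x.left.parent = y;
            x.left = y;
        }
        y.parent = x;
        x.parent = z;
        if (z == null) root = x;
        else if (z.left == y) z.left = x;
        else z.right = x;
    }

    // Moves x to the root by repeated zig-zig, zig-zag and zig steps.
    void splay(Node x) {
        while (x.parent != null) {
            Node y = x.parent, z = y.parent;
            if (z == null) {
                rotate(x);                                  // zig: no grandparent
            } else if ((z.left == y) == (y.left == x)) {
                rotate(y);                                  // zig-zig: rotate the parent first
                rotate(x);
            } else {
                rotate(x);                                  // zig-zag: rotate x twice
                rotate(x);
            }
        }
    }

    // Plain BST insertion without splaying; used only to set up a starting tree.
    Node insertPlain(int k) {
        Node fresh = new Node(k);
        if (root == null) { root = fresh; return fresh; }
        Node v = root;
        while (true) {
            if (k < v.key) {
                if (v.left == null) { v.left = fresh; break; }
                v = v.left;
            } else {
                if (v.right == null) { v.right = fresh; break; }
                v = v.right;
            }
        }
        fresh.parent = v;
        return fresh;
    }

    // Splay-tree insertion: insert as in a BST, then splay the new node to the root.
    void insert(int k) { splay(insertPlain(k)); }

    public static void main(String[] args) {
        SplayDemo t = new SplayDemo();
        for (int k : new int[] {6, 2, 9, 1, 4, 8}) t.insertPlain(k);
        t.insert(5);
        System.out.println(t.root.key + " " + t.root.left.key + " " + t.root.right.key);
    }
}
```

Starting from the tree 6; 2, 9; 1, 4, 8, inserting 5 triggers a zig-zig (over 4 and 2) followed by a zig (over 6), which leaves 5 at the root.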
Splaying – when to splay
• Search:
  ▪ When searching for a key k, which is found at node x
  ▪ x is splayed
• Insertion:
  ▪ When inserting a key k, a new internal node x is created, where x contains k
  ▪ x is splayed
• Deletion:
  ▪ When deleting a key k that is contained in node w, the parent x of w is splayed
• Performance
  ▪ Search, insertion and deletion take O(log n) amortized running time (the average over a sequence of operations)
  ▪ Worst-case for a single operation is still O(n)
  ▪ In practice, splay trees perform exceptionally well
• Example: Insert 5
  ▪ After BST insertion: 6; 2, 9; 1, 4, 8; 5
  ▪ zig-zig followed by zig moves 5 to the root of T: 5; 4, 6; 2, 9; 1, 8
23
Review and comparison
               Average case                          Worst case
Search tree    Search      Insertion   Deletion      Search      Insertion   Deletion
BST            O(log n)*   O(log n)*   O(log n)*     O(n)        O(n)        O(n)
AVL            O(log n)    O(log n)    O(log n)      O(log n)    O(log n)    O(log n)
Red-black      O(log n)    O(log n)    O(log n)      O(log n)    O(log n)    O(log n)
wavl           O(log n)    O(log n)    O(log n)      O(log n)    O(log n)    O(log n)
Splay          O(log n)    O(log n)    O(log n)      O(n)        O(n)        O(n)

* Average case for a BST assumes a series of insertions and deletions of n randomly generated keys.
24
Review and Further Reading
• TreeSet and TreeMap classes in Java 8 implement balanced search trees
• They implement and maintain a red-black tree.
• The TreeSet implementation is based on TreeMap.
• TreeSet
  ▪ http://docs.oracle.com/javase/8/docs/api/java/util/TreeSet.html
• TreeMap
  ▪ http://docs.oracle.com/javase/8/docs/api/java/util/TreeMap.html
• BST, AVL, splay and red-black trees:
  ▪ Ch. 11 of [2], Ch. 4 of [1]
  ▪ Ch. 4 of [3]
• Implementations in the standard library
  ▪ Sec. 4.8 of [3]
25
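A short sketch of the sorted-map operations that java.util.TreeMap offers is given below. The keys are the ones from the red-black tree figure, and the "value" strings are placeholders made up for the example; TreeMapDemo is an illustrative name.

```java
import java.util.TreeMap;

// Sorted-map operations on java.util.TreeMap, using the keys of the red-black tree figure.
final class TreeMapDemo {
    static TreeMap<Integer, String> example() {
        TreeMap<Integer, String> map = new TreeMap<>();
        for (int k : new int[] {9, 4, 15, 2, 6, 12, 21, 7}) map.put(k, "value" + k);
        return map;
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> map = example();
        System.out.println(map.keySet());                             // keys in increasing order
        System.out.println(map.floorKey(10));                         // greatest key <= 10
        System.out.println(map.ceilingKey(10));                       // least key >= 10
        System.out.println(map.subMap(4, true, 12, true).keySet());   // keys in [4, 12]
    }
}
```

The methods floorKey, ceilingKey and subMap come from the NavigableMap interface; they are the library counterparts of the range and neighbor queries a search tree supports.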
References
1. Algorithm Design and Applications by M. Goodrich and R. Tamassia, Wiley,
2015.
2. Data Structures and Algorithms in Java, 6th Edition, by M. Goodrich and R.
Tamassia, Wiley, 2014.
3. Data Structures and Algorithm Analysis in Java, 3rd Edition, by M. Weiss,
Addison-Wesley, 2012.
4. Java documentation:
  ▪ http://docs.oracle.com/javase/8/docs/api/overview-summary.html
  ▪ http://docs.oracle.com/javase/8/docs/api/java/util/TreeSet.html
  ▪ http://docs.oracle.com/javase/8/docs/api/java/util/TreeMap.html

26
Lab – Practice
1. Using classes BinarySearchTree, AVLTree, RedBlackBST, SplayTree, create four
search trees: BST, AVL, red-black, and splay trees.
2. For each tree:
a) Insert 100,000 integer keys, from 1 to 100,000 (in that order). Find the average time of each
insertion
b) Do 100,000 searches of random integer keys between 1 and 100,000. Find the average time
of each search.
c) Delete all the keys in the trees, starting from 100,000 down to 1 (in that order). Find the
average time of each deletion.
3. For each tree:
a) Insert 100,000 random keys between 1 and 100,000. Find the average time of each insertion.
b) Repeat #2.b.
c) Repeat #2.c but with random keys between 1 and 100,000. Note that not all the keys may be
found in the tree.
4. Draw a table that contains all the average times found in #2 and #3
5. Comment on the results obtained and compare them with the worst-case and
average-case running times of each operation for each tree.
6. Which search tree will you use in your application? Why?
27
Exercises
1. Define height, depth, internal/external node, subtree. Give examples.
2. Define a binary tree. Give an example of a decision tree.
3. Give examples for the properties of binary trees. Show the facts.
4. Write recursive algorithms for printing a binary tree and for printing an arithmetic expression.
5. *Write a program that reads all the folders and files and stores them in a tree (not necessarily binary). List all the folders and files along with their size.
6. Write an algorithm that modifies binary search by finding the keys in a certain range. Show that the algorithm runs in O(log n).
7. Give an example of a binary tree that is not proper.
8. Give a sequence of keys that, after being inserted into an empty BST, produces a: (a) balanced tree, (b) completely unbalanced tree.
9. Give examples of AVL trees with the minimum number of keys, and whose heights are 4, 5, 6, 7.
10. Consider the tree of slide 13. Insert 56 and rebalance the tree. Delete 44 and rebalance the tree.
11. Write algorithms for insertion and deletion in an AVL tree.
12. Write an algorithm that converts a (2,4) tree into a red-black tree.
13. Show that the heights of a red-black tree and of the associated (2,4) tree are asymptotically the same: O(log n).
14. Write algorithms for insertion and deletion in a red-black tree.
15. Give an example of a wavl tree with 8 nodes, and show how that tree can be converted into an AVL tree and a red-black tree.
16. Show (no formal proof required) that a wavl tree has height O(log n).
17. Investigate in ref [1] how insertions and deletions are performed in a wavl tree.
18. Write an algorithm for insertion in a splay tree.
19. Consider the red-black tree of slide 17. Insert 5 and delete 8.
20. Give an example of keys that, when inserted into a splay tree, result in a balanced tree (like an AVL tree).
21. Consider the tree of slide 21. Insert 1, and then 1. Apply splaying after each insertion.
22. *Write a program that reads the words of a file and inserts them in a search tree (BST, AVL, etc.). Also use the hash table of the previous chapter. Do 1000 searches and compare the average search times.
23. *Write an algorithm that, given two AVL trees and a key, merges them into a single AVL tree. Do the same for a BST. Can you optimize the algorithm?
24. Write an algorithm that converts an AVL tree into a red-black tree. Can you optimize your algorithm?
28
