You are on page 1of 10

Dictionary ADT methods:

The Dictionary ADT


The dictionary ADT models a searchable collection findElement(k): if the dictionary has an item with key k,
of key-element items
returns its element, else, returns the special element
The main operations of a dictionary are searching, NO_SUCH_KEY
inserting, and deleting items
Multiple items with the same key are allowed insertItem(k, o): inserts item (k, o) into the dictionary
removeElement(k): if the dictionary has an item with key k,
Applications: removes it from the dictionary and returns its element,
„ address book else returns the special element NO_SUCH_KEY
size(), isEmpty()
„ credit card authorization
„mapping host names (e.g., cs16.net) to internet
keys(), Elements()
addresses (e.g., 128.148.34.101)
„ English dictionary
1
findAllElements(k), removeAllElements(k) 2

Implementing a Dictionary with an Implementing a Dictionary with an


Unordered Sequence Ordered Sequence

• searching and removing takes O(n) time


• inserting takes O(1) time • searching takes O(log n) time (binary search)
• applications to log files (frequent insertions, rare searches • inserting and removing takes O(n) time
and removals) • application to look-up tables (frequent searches,
rare insertions and removals)

3 4

1
Binary Search Pseudocode for Binary Search
Algorithm BinarySearch(S, k, low, high)
• narrow down the search range in stages if low > high then
• “high-low” game return NO_SUCH_KEY
else mid ← (low+high) / 2
• Example: findElement(7) if k = key(mid) then
return key(mid)
else if k < key(mid) then
return BinarySearch(S, k, low, mid-1)
0 1 3 4 5 7 8 9 11 14 16 18 19 else return BinarySearch(S, k, mid+1, high)
l m h
2 4 5 7 8 9 12 14 17 19 22 25 27 28 33 37
0 1 3 4 5 7 8 9 11 14 16 18 19
l m h low mid high
0 1 3 4 5 7 8 9 11 14 16 18 19 2 4 5 7 8 9 12 14 17 19 22 27 28 33 37
25
l m h
0 1 3 4 5 7 8 9 11 14 16 18 19 low mid high

l=m =h 5
2 4 5 7 8 9 12 14 17 19 22 25 27 28 33 37
6
low mid

Running Time of Binary Search Running Time of Binary Search

• The range of candidate items to be searched is halved • binary search runs in O(log n) time for findElement(k)
after each comparison • If we need to find all elements with a given key
(findAllElements(k)), it runs in O(log n + s), where s is the
number of element of elements in the iterator returned.
– Simply do a binary search to find an element equal to k. Then step back through
the array until you reach the first element equal to k. Finally, step forward through
the area adding each element to the iterator until you reach the first element that
is not equal to k. This takes O(logn) time for the search and then at most s time to
search back to the beginning of the run of k’s and s time return all of the elements
k. Therefore we have a solution running in at most O(logn+s) time.

In the array-based implementation, access by rank takes


O(1) time, thus binary search runs in O(log n) time 7 8

2
Binary Search Tree
New Insertion Sort

What is the running time of the insertion sort, Searching


using a sequence implemented with an array?
Cost of Searching
Insertion 6
<
If we use a binary search to do the insertions?
Deletion 2
>
9

1 4 = 8

9 10

Binary Search Trees Gregor


• A binary search tree is a binary tree T such that
– each internal node stores an item (k, e) of a dictionary.
– keys stored at nodes in the left subtree of v are less than or Fabio Nicole
equal to k.
– keys stored at nodes in the right subtree of v are greater than
or equal to k.
– external nodes do not hold elements but serve as place holders.

Bob Frank

11 12

3
Operations
10

3 17

Searching: findElement(k):
1 8 15 20

Inserting: insertItem(k, o):


5 9 Removing: removeElement(k):

Question: How can we traverse the tree so that we


visit the elements in increasing key order?

13 14

Search Search Example I


• To search for a key k, Algorithm findElement(k, v)
we trace a downward if T.isExternal (v) Successful findElement(76)
path starting at the return NO_SUCH_KEY 76>44
root if k < key(v) 76<88
• The next node visited return findElement(k, T.leftChild(v))
depends on the else if k = key(v) 76>65
outcome of the return element(v)
comparison of k with else { k > key(v) }
76<82
the key of the current
return findElement(k, T.rightChild(v))
node
• If we reach a leaf, the < 6
key is not found and
we return 2 9
>
NO_SUCH_KEY
1 4 = 8 • A successful search traverses a path starting at the root
• Example:
findElement(4) and ending at an internal node
15 • How about findAllelements(k)? 16

4
Search Example I Search Example II
Algorithm findAllElements(k, v, c):
Input: The search key k, a node of the binary search tree v Unsuccessful findElement(25)
and a container c
Output: An iterator containing the found elements 25<44
if v is an external node then 25>17
return c.elements()
if k = key(v) then 25<32
c.addElement(v)
return findAllElements(k,T.rightChild(v), c) 25<28
else if k < key(v) then
return findAllElements(k,T,leftChild(v))
else leaf
{we know k > key(v)} node
return findAllElements(k,T,rightChild(v))
• An unsuccessful search traverses a path starting at the
Note that after finding k, if it occurs again, it will be in the left most root and ending at an external node
internal node of the right subtree.
17 18

Cost of Search: Worst Case Cost of Search: Worst Case


a a
0 0
account account
1 Africa 1 Africa
Average # of 2 apple Average # of 2 apple
comparisons in the comparisons in the
worst case:
3 arc worst case:
3 arc
4 4

Successful Path to node i has length i, to get Unsuccessful


search there we search
do 2i+1 comparisons An unsuccessful search
takes 2n comparisons for
n internal nodes
Avg cost= (1/n)∑ (2i+1) = n
19 20

5
Cost of Search: Best Case Cost of Search: Best Case
1 1 1 1 1 1
2 3 2 3 2 3 2 3 2 3 2 3
4 5 6 7 4 5 4 5 6 7 4 5 6 7 4 5 4 5 6 7
8 9 8 9

Leaves are on the same level or on an adjacent level.


Leaves are on the same level or on an adjacent level.
Length of path from root to node i = ⎣log i⎦
Length of path from root to node i = ⎣log i⎦
For a successful search, we do 2 comparison at each node
For a failed search, we do 2 comparison at each node along
along the path plus one at the end.
the path plus two at the end.
Comparisons to node i: 2 ⎣log i⎦ +1
Only paths to external nodes count.
Æ Average # of comparisons in the best case

Σ ⎣log i⎦+1 =
1 2E 2(I + 2n)
2 O(log n) = = O(log n) FAILED
n 21 n+1 n+1 22
i=1

Insertion I expandExternal(v):
• To perform insertItem(k, e), let w be the node returned by
TreeSearch(k, T.root()) Transform v from an external node into an internal node
by creating two new children
• If w is external, we know that k is not stored in T. We call
expandExternal(w) on T and store (k, e) in w B B

A D A D

C E C E

new1 new2

expandExternal(v):
new1 and new 2 are the new nodes
if isExternal(v)
v.left ← new1
v.right ← new2
23 size ← size +2 24

6
Insertion II
• If w is internal, we know another item with key k is stored at w. We call
the algorithm recursively starting at T.rightChild(w) or T.leftChild(w) Construct a Tree
• They idea is to store the new item in an external node which either
precedes or follows the items with the same key in an inorder traversal.
What would be the result of constructing a
tree from repeated insertions of the
following sequences?
a. 5,8,3,7,1,9,2,4,6
b. 1,2,3,4,5,6,7,8,9
c. 5,4,6,3,7,2,8,1,9
When do you think trees work best?

25 26

Deletion
Construct a Tree 2 10
No children: simply
remove
5 15

What happens with repeated #’s?


k
3 8 18
With one child:
redirect (as
indicated)
5,8,3,7,1,5,9,5,2,4,5 10
k
With two children:
replace k with the
5
child most to the
right in its left sub-
3 8 18
What do you get from an inorder k
tree (or vice-versa),
that is, the child
search of this tree? that either
precedes or follows
5 18
it in an inorder
3 8 traversal.

27 28

7
Deletion I removeAboveExternal(v):

B
B
6 A D
• To perform operation < A D
removeElement(k), we 2 9
search for key k > C E
C
1 4 v 8
• Assume key k is in the
w G
tree, and let v be the node 5
F G
storing k
• If node v has a leaf child
w, we remove v and w from
the tree with operation 6 B
removeAboveExternal(w)
2 9 A D
• Example: remove 4
1 5 8
C
G

29 30

B B
A D
A D

C E C

F G
G

removeAboveExternal(v):
if isExternal(v)
{ p ← parent(v)
s ← sibling(v)
B if isRoot(v) s.parent ← null and root ← s
else
A G
v { g ← parent(p)
if (p is leftChild(g) g.left ← s
else g.right ← s
s.parent ← g
}
size ← size - 2 }
31 32

8
Deletion II

• We consider the case 1


v
where the key k to be 3
removed is stored at a node 2 8
v whose children are both
internal 6 9
– we find the internal node w w
5
that follows v in an inorder
traversal
z
– we copy key(w) into node v
– we remove node w and its 1
v
left child z (which must be 5
a leaf) by means of 2 8
operation
removeAboveExternal(z) 6 9
• Example: remove 3
33 34

Cost of Inserting and Deleting


Practice, practice, practice… = Cost of Search
Summary:
Consider a dictionary with n
a. Delete the 3 from the tree you got items implemented by means
in the (a). of a binary search tree of
• (a) 5,8,3,7,1,9,2,4,6 height h
„the space used is O(n)
b. Now delete node 5. „methods findElement ,
insertItem and removeElement
take O(h) time

The height h is O(n) in the


worst case and O(log n) in the
best case
35 36

9
Performance of a dictionary
implementation with a binary search Conclusion
tree
Tree T with height h for n key-element items • To achieve good running time,
we need to keep the tree balanced,
• uses O(n) space
i.e., with O(logn) height.
• size, isEmpty : O(1)
• findElement, insertItem,
removeElement : O(h) • Various balancing schemes will be
explored next.
• findAllElements, removeAllElements :
O(h + s)
– s = size of the iterators returned

37 38

10

You might also like