Deck describing differences between BST and heap data structures

Binary Heaps

Given a node i

i.value is the stored object
i.left and i.right point to other nodes

All of i's left descendants (children, grand-children, and so on) are less than i.value
All of i's right descendants (children, grand-children, and so on) are greater than i.value

Binary search trees can be easily implemented using arrays.

Index:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
Value:  - 50 24 78 12 36 64 90  3 15 29 44 58 67 81 93

[Figure: the same values drawn as a binary search tree with root 50.]

Root is at index 1 (i = 1)
Left(i) { return i*2; }
Right(i) { return i*2 + 1; }
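The index arithmetic above can be sketched in C++ as follows. This is only an illustration: the array contents are an assumption reconstructed from the figure (level order, slot 0 unused), and Parent is a hypothetical helper that does not appear on the slides.

```cpp
#include <array>

// Assumed example tree stored in level order; slot 0 is unused (-1)
// so that the root sits at index 1.
constexpr std::array<int, 16> a = {-1, 50, 24, 78, 12, 36, 64, 90,
                                    3, 15, 29, 44, 58, 67, 81, 93};

constexpr int Left(int i)   { return i * 2; }      // index of the left child of node i
constexpr int Right(int i)  { return i * 2 + 1; }  // index of the right child of node i
constexpr int Parent(int i) { return i / 2; }      // hypothetical helper: index of the parent
```

With this layout, a[Left(1)] is 24 and a[Right(1)] is 78, the two children of the root 50.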

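// Assumes a[] is large enough that the children of every leaf are -1 slots.
bool Find_rec(int x, int i) {
    if (a[i] == -1) return false;                   // empty slot: x is not in the tree
    else if (x == a[i]) return true;                // found it
    else if (x < a[i]) return Find_rec(x, i*2);     // search the left subtree
    else return Find_rec(x, i*2+1);                 // search the right subtree
}

bool Find(int x) { return Find_rec(x, 1); }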

[Figure: an array-backed BST in which -1 marks each empty slot, including the unused index 0.]

Find: O(log N) on average
Insert in proper order: O(log N) on average

O(log N) to find the correct location + O(1) to perform insertion
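A minimal sketch of that two-step insertion, assuming non-negative keys so that -1 can serve as the empty-slot marker. ArrayBst and Insert are illustrative names, not taken from the slides.

```cpp
#include <cstddef>
#include <vector>

// Array-backed BST: -1 marks an empty slot, index 0 is unused, root at index 1.
struct ArrayBst {
    std::vector<int> a;
    explicit ArrayBst(std::size_t capacity) : a(capacity, -1) {}

    // Returns false if x is already stored or its slot would fall outside the array.
    bool Insert(int x) {
        std::size_t i = 1;
        while (i < a.size() && a[i] != -1) {        // walk down to the insertion point
            if (x == a[i]) return false;            // duplicate key
            i = (x < a[i]) ? i * 2 : i * 2 + 1;     // go left or right
        }
        if (i >= a.size()) return false;            // tree too deep for this array
        a[i] = x;                                   // O(1) write into the empty slot
        return true;
    }
};
```

The walk costs one step per level, which is O(log N) on average but can reach O(N) when keys arrive in sorted order and the tree degenerates into a chain.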

To find the minimum, keep moving left until you see a -1
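The left-most walk might look like the sketch below. FindMinIndex is a hypothetical name, and the function assumes the same layout as before (root at index 1, -1 for empty slots).

```cpp
#include <cstddef>
#include <vector>

// Returns the index of the smallest key, or 0 if the tree is empty.
std::size_t FindMinIndex(const std::vector<int>& a) {
    if (a.size() < 2 || a[1] == -1) return 0;       // no root stored
    std::size_t i = 1;
    // Follow left children (index i*2) until the next one is missing or empty.
    while (i * 2 < a.size() && a[i * 2] != -1) i *= 2;
    return i;
}
```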

See in-order traversal in book
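For readers without the book at hand, an in-order traversal over the array layout can be sketched as below. This is one common formulation, not necessarily the book's; InOrder and its parameters are illustrative names.

```cpp
#include <cstddef>
#include <vector>

// Appends the keys of the subtree rooted at index i to out in ascending order.
void InOrder(const std::vector<int>& a, std::size_t i, std::vector<int>& out) {
    if (i >= a.size() || a[i] == -1) return;  // past the array or empty slot
    InOrder(a, i * 2, out);                   // everything smaller first
    out.push_back(a[i]);                      // then the node itself
    InOrder(a, i * 2 + 1, out);               // then everything larger
}
```

Calling InOrder(a, 1, out) on a valid BST leaves out sorted.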

A priority queue provides three functions

push(x)    pushes the value x into the queue
min()      returns the minimum value in the queue
delete()   removes the minimum value in the queue
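One way to get these three operations in C++ is to wrap the standard library's std::priority_queue with std::greater so that the smallest value comes out first. The wrapper below is only a sketch; MinQueue and delete_min are illustrative names (delete is a reserved word in C++).

```cpp
#include <functional>
#include <queue>
#include <vector>

// Thin wrapper exposing the three operations from the slide.
class MinQueue {
public:
    void push(int x)   { q_.push(x); }
    int  min() const   { return q_.top(); }  // precondition: queue is not empty
    void delete_min()  { q_.pop(); }         // precondition: queue is not empty
    bool empty() const { return q_.empty(); }
private:
    // std::greater turns the default max-first ordering into min-first.
    std::priority_queue<int, std::vector<int>, std::greater<int>> q_;
};
```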

Tons of applications:

OS process queues, transaction processing, packet routing in advanced networks, and as a building block in various other algorithms

Consider using an unordered array

push(x)    O(1)    just add it to the end of the array
min()      O(N)    sequential search for the min value
delete()   O(N)    might have to shift the entire array
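The unordered-array costs can be seen in a short sketch built on std::vector. UnorderedArrayQueue is a hypothetical name, and min() and delete_min() assume the queue is not empty.

```cpp
#include <algorithm>
#include <vector>

struct UnorderedArrayQueue {
    std::vector<int> a;

    // O(1) amortized: append at the end, no ordering maintained.
    void push(int x) { a.push_back(x); }

    // O(N): every element must be inspected to find the smallest.
    int min() const { return *std::min_element(a.begin(), a.end()); }

    // O(N): scan for the smallest, then shift the tail left to close the gap.
    void delete_min() { a.erase(std::min_element(a.begin(), a.end())); }
};
```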

Consider using an ordered array

push(x)    O(N)    find the correct location and shift the array appropriately
min()      O(1)    return the first value
delete()   O(N)    might have to shift the entire array

Recall using an unordered array

push(x)    O(1)    just add it to the end of the array
min()      O(N)    sequential search for the min value
delete()   O(N)    might have to shift the entire array
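For comparison, here is a sketch of the ordered-array variant, assuming the array is kept in ascending order so the minimum is always first. OrderedArrayQueue is a hypothetical name, and min() and delete_min() assume the queue is not empty.

```cpp
#include <algorithm>
#include <vector>

struct OrderedArrayQueue {
    std::vector<int> a;  // kept in ascending order at all times

    // O(N): binary search finds the position quickly, but the insert
    // still shifts every later element one slot to the right.
    void push(int x) { a.insert(std::upper_bound(a.begin(), a.end(), x), x); }

    // O(1): the smallest value is always the first element.
    int min() const { return a.front(); }

    // O(N): removing the first element shifts all others left.
    void delete_min() { a.erase(a.begin()); }
};
```

The work has moved from min() to push(), but some operation still costs O(N).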

Why are simple array implementations bad? O(N) is not a problem, right? Consider this application:

A private network router has 10 million packets coming in every minute (mostly junk, spam, etc.) and I only want to let through the top 1 million (priority #1 is the top priority, so the most important packet is the min)

N = 10 million incoming packets
M = 1 million outgoing packets (top priority, where #1 is the highest priority)

Consider using an ordered array
push(x)    O(N)    must do this N times    N*N
min()      O(1)
delete()   O(1)    must do this M times    M
(delete() is O(1) only if the minimum can be dropped without shifting, for example by keeping the array in descending order so the minimum is last.)

Recall using an unordered array
push(x)    O(1)    must do this N times (not a problem)
min()      O(N)    must do this M times    N*M
delete()   O(N)    must do this M times    N*M

N*M

N: all the packets, 10 million
M: the top priority packets, 1 million
Everything must be processed in one minute.
Assume your computer can do 10 billion operations per second, or 600 billion operations in one minute.
Unfortunately, N*M is 10 trillion operations.
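The arithmetic can be double-checked with a few constants. The names below are illustrative, and the machine speed is the slide's assumption rather than a measured figure.

```cpp
#include <cstdint>

constexpr std::int64_t kPackets   = 10000000;          // N: incoming packets
constexpr std::int64_t kTop       = 1000000;           // M: packets let through
constexpr std::int64_t kOpsPerSec = 10000000000LL;     // assumed: 10 billion ops/second
constexpr std::int64_t kBudget    = kOpsPerSec * 60;   // operations available in one minute
constexpr std::int64_t kNeeded    = kPackets * kTop;   // N*M operations required

// How many minutes the N*M approach would take on the assumed machine.
constexpr double MinutesNeeded() {
    return static_cast<double>(kNeeded) / static_cast<double>(kBudget);
}
```

Under these assumptions the unordered array needs roughly 17 minutes of work for each minute of traffic.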

Consider using a BST

push(x)    O(log N)    add to the correct position; log(N) * N in total, with N = 10,000,000
min()      O(log N)    return the left-most node
delete()   O(log N)

Recall using an unordered array

push(x)    O(1)    just add it to the end of the array
min()      O(N)    sequential search for the min value
delete()   O(N)    might have to shift the entire array

Is this possible?

push(x)    O(1)
min()      O(1)    return the first value
delete()   O(log N)

Given a node i

i.value is the stored object
i.left and i.right point to other nodes

All of i's left descendants (children, grand-children, and so on) are greater than i.value
All of i's right descendants (children, grand-children, and so on) are greater than i.value

Binary heaps can be easily implemented using arrays.
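In the array layout, the heap rule reduces to a single comparison per slot: no node may be smaller than its parent at index i/2. A minimal checker, with IsMinHeap as a hypothetical name, might look like this.

```cpp
#include <cstddef>
#include <vector>

// True if a[1..n] satisfies the min-heap property (slot 0 is unused).
// Precondition: n < a.size(). Equal keys are tolerated.
bool IsMinHeap(const std::vector<int>& a, std::size_t n) {
    for (std::size_t i = 2; i <= n; ++i) {
        if (a[i / 2] > a[i]) return false;  // parent is larger than its child
    }
    return true;
}
```

Unlike a BST, nothing is required about how the left and right subtrees compare with each other, which is what makes the structure cheap to maintain.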

[Figure: a min-heap with root 10 holding the values 10, 16, 24, 33, 48, 49, 58, 63, 67, 78, 81, drawn as a tree and stored in an array indexed 0 to 15, with -1 in the unused slots.]

Find: O(N), requires sequential search
Insert in proper order: O(1) on average
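Insertion is usually done by appending the new key as the last leaf and swapping it upward while it is smaller than its parent. The sketch below assumes a std::vector with slot 0 unused; HeapPush is an illustrative name.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// h[0] is an unused placeholder; the heap occupies h[1..h.size()-1].
void HeapPush(std::vector<int>& h, int x) {
    if (h.empty()) h.push_back(-1);       // reserve the unused slot 0
    h.push_back(x);                       // append as the last leaf
    std::size_t i = h.size() - 1;
    while (i > 1 && h[i / 2] > h[i]) {    // sift up while smaller than the parent
        std::swap(h[i / 2], h[i]);
        i /= 2;
    }
}
```

Most new keys settle after only a swap or two, which is where the O(1) average comes from; the worst case, a new minimum, climbs all O(log N) levels.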

Amazing Heap Property

O(1) to return min
O(log N) to restore the heap property after the min is removed
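Restoring the heap property after a removal is typically done by moving the last leaf to the root and swapping it downward with its smaller child. A sketch under the same layout assumptions (slot 0 unused), with HeapDeleteMin as an illustrative name:

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Removes the minimum from a 1-indexed min-heap (h[0] unused).
// Precondition: at least one key is stored, i.e. h.size() >= 2.
void HeapDeleteMin(std::vector<int>& h) {
    h[1] = h.back();                       // move the last leaf to the root
    h.pop_back();
    std::size_t n = h.size() - 1;          // number of keys still stored
    std::size_t i = 1;
    while (true) {                         // sift down, at most one step per level
        std::size_t smallest = i, l = 2 * i, r = 2 * i + 1;
        if (l <= n && h[l] < h[smallest]) smallest = l;
        if (r <= n && h[r] < h[smallest]) smallest = r;
        if (smallest == i) break;          // heap property restored
        std::swap(h[i], h[smallest]);
        i = smallest;
    }
}
```

Reading the minimum is just h[1]; only the repair after removal costs O(log N).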

N = 10 million incoming packets
M = 1 million outgoing packets (top priority, where #1 is the highest priority)

Consider using a heap
push(x)    O(1)        must do this N times (not a problem)
min()      O(1)
delete()   O(log N)    must do this M times    log(N)*M

Recall using an unordered array
push(x)    O(1)    must do this N times (not a problem)
min()      O(N)    must do this M times    N*M
delete()   O(N)    must do this M times    N*M

log(N)*M

N: all the packets, 10 million
M: the top priority packets, 1 million
Everything must be processed in one minute.
Assume your computer can do 1 billion operations per second, or 60 billion operations in one minute.
What is log(N)*M?
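One way to answer the question is to compute it directly, taking the logarithm base 2 since each heap level halves the remaining range. The function names are illustrative, and the operations-per-second rate is the slide's assumption.

```cpp
#include <cmath>

// Approximate comparisons needed to remove m minimums from a heap of n keys.
double HeapRemovalOps(double n, double m) { return std::log2(n) * m; }

// Fraction of a one-minute budget consumed at the given operations-per-second rate.
double BudgetFraction(double ops, double ops_per_second) {
    return ops / (ops_per_second * 60.0);
}
```

HeapRemovalOps(1e7, 1e6) comes to a little over 23 million, a tiny fraction of the 60 billion operations available in the minute.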

BST Implementation

Push: O(log N)
Find Min: O(log N)
Remove Min: O(log N)

Pushing N = 10,000,000 packets: about 230 million operations
Removing M = 1,000,000 minimums: about 23 million operations

Heap Implementation

Push: O(1)
Find Min: O(1)
Remove Min: O(log N)

Pushing N = 10,000,000 packets: about 10 million operations
Removing M = 1,000,000 minimums: about 23 million operations

