You are on page 1of 38

Heap Data Structure

The Master Theorem


Given: a divide and conquer algorithm
An algorithm that divides the problem of size n
into a subproblems, each of size n/b
Let the cost of each stage (i.e., the work to divide
the problem + combine solved subproblems) be
described by the function f(n)
Then, the Master Theorem gives us a cookbook for
the algorithms running time:
The Master Theorem
if T(n) = aT(n/b) + f(n) then
)
)
)
)
)
)

,
|

>

O =
O =
=
O
O
O
=
+

1
0
large for ) ( ) / (
AND ) (
) (
) (
) (
log ) (
log
log
log
log
log
c
n n cf b n af
n n f
n n f
n O n f
n f
n n
n
n T
a
a
a
a
a
b
b
b
b
b
I
I
I
Using The Master Method
T(n) = 9T(n/3) + n
a=9, b=3, f(n) = n
n
log
b
a
= n
log
3
9
= O(n
2
)
Since f(n) = O(n
log
3
9 - I
), where I=1, case 1
applies:
Thus the solution is T(n) = O(n
2
)
) )
I
= O =
a a
b b
n O n f n n T
log log
) ( when ) (
Heaps
A (binary) heap data structure is an array of object that can be viewed
as a complete binary tree
A heap is defined only on nearly full or complete binary trees. What is
a nearly full binary tree?
A complete binary tree is one where all the internal nodes have exactly
2 children, and all the leaves are on the same level. An internal node
comprises all the nodes except the leaf nodes.
Leaf nodes are nodes without children.
Each node of the tree corresponds to an element of the array that
stores the value in the node
The tree is completely filled on all levels except possibly the lowest,
which is filled from left to right up to a point
Leaf node
Interior node
Heaps
Height of a node: The number of edges starting at that
node, and going down to the furthest leaf.
Height of the heap: The maximum number of edges from the
root to a leaf.
Height of blue
node = 1
Height of root = 3
Heaps
Complete binary tree if not full, then the only
unfilled level is filled in from left to right.
2
3
4 5
6 7
8 9 10 11 12 13
1
2
3
4 5
6
12 7 8 9 10 11
1
Parent(i)
return i/2
Left(i)
return 2i
Right(i)
return 2i + 1
Heap as an Array
An array A that represents a heap is an array with two attributes
length, the number of elements in the array
heap-size, the number of heap elements stored in the array
Viewed as a binary tree and as an array
length, the number of elements in the array
Heaps
The heap property of a tree is a condition that must be true
for the tree to be considered a heap.
Min-heap property: for min-heaps, requires
A[parent(i)] e A[i]
So, the root of any sub-tree holds the least value in that
sub-tree.
Max-heap property: for max-heaps, requires
A[parent(i)] u A[i]
The root of any sub-tree holds the greatest value in the
sub-tree.
Heaps
Looks like a max heap, but max-heap property is violated
at indices 11 and 12. Why? because those nodes parents
have smaller values than their children.
2
8
3
8
4
8
5
7
6
8
7
1
8
6
9
7
10
5
11
8
12
9
1
9
Maintaining the Heap Property
One of the more basic heap operations is converting a
complete binary tree to a heap.
Such an operation is called Heapify.
Its inputs are an array A and an index i into the array.
When Heapify is called, it is assumed that the binary
trees rooted at i.
LeftChild(i) and RightChild(i) are heaps, but that A[i] may
be smaller than its children, thus violating the 2nd heap
property.
The function of Heapify is to let the value at A[i] float
down in the heap so that the subtree rooted at index i
becomes a heap.
Heaps
Assume we are given a tree represented by a linear array
A, and such that i e length[A], and the trees rooted at
Left(i) and Right(i) are heaps. Then, to create a heap
rooted at A[i], use procedure Max-Heapify(A,i):
Max-Heapify(A,i)
1. l Left(i)
2. r Right(i)
3. if l e heap-size[A] and A[l] > A[i]
4. then largest l
5. else largest i
6. if r e heap-size[A] and A[r] > A[largest]
7. then largest r
8. if largest { i
9. then swap( A[i], A[largest])
Max-Heapify(A, largest)
20
9
19 18
6 7
17 16 17 16 1
10
Heaps
If either child of A[i] is greater than A[i], the greatest child is
exchanged with A[i]. So, A[i] moves down in the heap.
The move of A[i] may have caused a violation of the max-heap
property at its new location.
So, we must recursively call Max-Heapify(A,i) at the location i
where the node lands.
The above is the top-down approach of Max-Heapify(A,i)
Obviously, Max-Heapify(A,i) cant be bottom-up. Why?
20
9
19 18
6 7
17 16 17 16 1
10
Max-Heapify(A,1)
20
9
19 18
6 7
17 16 17 16 1
10
10
9
19 18
6 7
17 16 17 16 1
20
10
9
19 18
6 7
17 16 17 16 1
20
19
9
10 18
6 7
17 16 17 16 1
20
19
9
10 18
6 7
17 16 17 16 1
20
19
9
17 18
6 7
10 16 17 16 1
20
Run Time of Max-Heapify()
Run time of Max-Heapify().
Consider worst case: Last row is half full.
How many nodes, in terms of n, is in the sub-tree with the most nodes?
2 n / 3
So, worst case would follow a path that included the most nodes
T(n) = T(2n/3) + O(1)
Why O(1)? Note that all the work in lines 1-7 is simple assignments or
comparisons, so they are constant time, or O(1).
2
1
2
3
4 5
6 7
8 9 10 11
1
2
3
4 5
1
2(2) / 3 = 1
2(5) / 3 = 3
2(11) / 3 = 7
Heaps
So, what must we do to build a heap?
We call Max-Heapify(A,i) for every i starting at last node
and going to the root.
Why Bottom-up?
Because Max-Heapify() moves the larger node upward
into the root. If we start at the top and go to larger
node indices, the highest a node can move is to the root
of that sub-tree. But what if there is a violation between
the sub-trees root and its parent?
So, we must start from the bottom and go towards the
root to fix the problems caused by moving larger nodes
into the root of sub-trees.
Heaps
Bad-Build-Max-Heap(A)
1. heap-size[A] length[A]
2. for i length[A] downto 1
3. do Max-Heapify(A,i)
Can this implementation be improved? Sure can!
2
3
4 5
6 7
8 9 10 11
1
Heaps
Without going through a formal proof, notice that there is no
need to call Max-Heapify() on leaf nodes. At most, the
internal node with the largest index is at most equal to
length[A]/2.
Since this may be odd, should we use the floor or the ceiling?
In the case below, it is clear that we really need the floor
of length[A]/2 which is 5, and not the ceiling, which is 6.
Build-Max-Heap(A)
1. heap-size[A] length[A]
2. for i length[A]/2 downto 1
3. do Max-Heapify(A,i)
2
3
4 5
6 7
8 9 10 11
1
Example: building a heap (1)
4 1 3 2 16 9 10 14 8 7
1 2 3 4 5 6 7 8 9 10
4
1
2 16
7 8 14
3
9
10
1
2 3
4 5 6 7
8 9
10
Example: building a heap (2)
4
1
2 16
7 8 14
3
9
10
1
2 3
4 5 6 7
8 9
10
Example: building a heap (3)
4
1
2 16
7 8 14
3
9
10
1
2 3
4 5 6 7
8 9
10
Example: building a heap (4)
4
1
14 16
7 8 2
3
9
10
1
2 3
4 5 6 7
8 9
10
Example: building a heap (5)
4
1
14 16
7 8 2
10
9
3
1
2 3
4 5 6 7
8 9
10
Example: building a heap (6)
4
16
14 7
1 8 2
10
9
3
1
2 3
4 5 6 7
8 9
10
Example: building a heap (7)
16
14
8 7
1 4 2
10
9
3
1
2 3
4 5 6 7
8 9
10
Heaps Running Time
By quick examination, Build-Max-Heap() is O(n lgn). But it can be
shown that it really is O(n). Why?
max height h = lg n. (Consider cases when tree is full, and when
it is not. Determine the height as a function of the number of nodes
for both cases. Use fact that its a binary tree.)
In terms of h and n, the maximum number of nodes at any height h
is given by n/2
(h+1)
.
Finally, for a given node at some height h, Max-Heapify() is
O(lg n) = O(h).
By summing the times for every node, based on its height, we get
T(n) =
all possible heights
(maximum number of nodes at height h)
(time for nodes at height h)
Complexity analysis of Build-Heap
For each height 0 < h lg n , the number
of nodes in the tree is at most n/2
h+1
For each node, the amount of work is
proportional to its height h, O(h) n/2
h+1
.O(h)
Summing over all heights, we obtain:

=

=
+
=
+
n
h
h
n
h
h
h
n O h O
n
n T
lg
0
1
lg
0
1
2
) ( .
2
) (
Complexity analysis of Build-Heap (2)
We use the fact that
Therefore:
Building a heap takes only linear time and
space!
2
) 2 / 1 1 (
2 / 1
2
2
0
=

g
= h
h
h

) (
2 2
) (
0
lg
0
1
n O
h
n O
h
n O n T
h
h
n
h
h
=

=

g
= =
+
1 | |
) 1 (
2
0

g
=
x for
x
x
kx
k
k
Priority Queues
A priority queue is a data structure for maintaining a set S
of elements, each with an associated value called a key.
We will only consider a max-priority queue.
If we give the key a meaning, such as priority, so that
elements with the highest priority have the highest value of
key, then we can use the heap structure to extract the
element with the highest priority.
Priority queues
We can implement the priority queue
ADT with a heap. The operations are:
Max(S) returns the maximum element
Extract-Max(S) remove and return the
maximum element
Insert(x,S) insert element x into S
Increase-Key(S,x,k) increase xs value
to k
Priority queues: Extract-Max
Heap-Maximum(A) return A[1]
Heap-Extract-Max(A)
1. if heapsize[A] < 1
2. then error heap underflow
3. max A[1]
4. A[1] A[heapsize[A]]
5. heapsize[A] heapsize[A] 1
6. Max-Heapify(A,1)
7. return max
Make last
element the root
Priority queues: Increase-Key
Heap-Increase-key(A, i, key)
1. if key < A[i]
2. then error new key smaller than existing
one
3. A[i] key
4. while i > 1 and A[parent(i)] < A[i]
5. do Exchange(A[i], parent(A[i]))
6. i parent(i)
Priority queues: Insert-Max
Heap-Insert-Max(A, key)
1. heapsize[A] heapsize[A] + 1
2. A[heapsize[A]]
3. Heap-Increase-Key(A, heapsize[A],
key)
Example: increase key (1)
16
14
8 7
1 4 2
10
9
3
1
2 3
4 5 6 7
8 9
10
increase 4 to 15
Example: increase key (2)
16
14
8 7
1 15 2
10
9
3
1
2 3
4 5 6 7
8 9
10
Example: increase key (3)
16
14
15 7
1 8 2
10
9
3
1
2 3
4 5 6 7
8 9
10
Example: increase key (4)
16
15
14 7
1 8 2
10
9
3
1
2 3
4 5 6 7
8 9
10

You might also like