
Data Structures & Algorithms

SECTION 7:
OBJECTIVES: At the end of the section, the student is expected to be able to
1. assess the suitability of using a binary tree to solve a given problem on a
computer
2. define a heap
3. apply the heapsort algorithm on a given set of keys
DISCUSSION:
In this section, we will examine a problem whose solution uses the binary tree as
the underlying data structure. This problem has to do with the familiar task of sorting
keys on which a total order has been defined. The solution to the problem uses a binary
tree represented using sequential allocation.

Heaps and heapsort algorithm


An elegant sorting algorithm, which uses the binary tree as the underlying data
structure, was put forth in 1964 by R.W. Floyd and J.W.J. Williams. It is called heapsort.

Some definitions
1. The binary tree used as the underlying data structure in heapsort is the
complete binary tree. We defined a complete binary tree as the binary tree which
results when zero or more nodes are deleted from a full binary tree in reverse
level order, i.e., bottom to top, right to left. This means that the leaves in a
complete binary tree lie on at most two adjacent levels, say l and l-1, and that the
leaves at the bottommost level l lie in the leftmost positions of l.

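[Figure: a complete binary tree. The sons of the root are B and C; level l-1 holds
D, E, F and G; the bottommost level l holds H, I, J, K and L in its leftmost positions.]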

2. A total order is a relation between the elements of a set of objects, say S,
satisfying the following properties for any objects x, y and z in S:
Jennifer Laraya-Llovido

a. Transitivity: if x<y and y<z, then x<z.


b. Trichotomy: for any two objects x and y in S, exactly one of the following
relations x>y, x=y or x<y holds.

3. Let K1, K2, …, Kn be keys chosen from a set of keys on which a total order has
been defined. Let the keys be assigned to the nodes of a complete binary tree in
level order, i.e., top to bottom, left to right. For instance,

                   K1

           K2              K3

       K4      K5      K6      K7

     K8  K9

We define such a binary tree to be a heap provided the key in each node is greater
than or equal to the keys in its left and right son nodes. For example, taking
instruction codes for the IBM/370 Assembly language as alphabetic keys, we have:

            AR                            TM

        EX      OI                    MR      ST

      LA  MR  ST  TM                LA  EX  AR  OI

      (a) Not a heap                (b) A heap


Note that in a heap, the key in the root node is the largest key in the set.
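
The heap property can be checked mechanically. The following is a minimal Python
sketch, assuming the keys are held in a list in level order; the name is_heap and the
0-based list positions are illustrative choices, not part of the original notes.

```python
def is_heap(keys):
    """Return True if keys, read in level order, satisfy the heap property."""
    n = len(keys)
    # Python lists are 0-based, so the sons of position i are 2*i + 1 and 2*i + 2.
    for i in range(n):
        for son in (2 * i + 1, 2 * i + 2):
            if son < n and keys[son] > keys[i]:
                return False  # a son holds a larger key than its father
    return True


# The two example trees, written out in level order.
not_a_heap = ["AR", "EX", "OI", "LA", "MR", "ST", "TM"]
a_heap = ["TM", "MR", "ST", "LA", "EX", "AR", "OI"]

print(is_heap(not_a_heap))  # False
print(is_heap(a_heap))      # True
```

In the second list the first element, TM, is the largest key, as the note above states.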

Sift-up: converting a complete binary tree into a heap

To convert a complete binary tree such as the one in Figure (a) into a heap, we
apply a process called sift-up. This is a bottom-up, right-to-left process in which the
smallest subtrees of a complete binary tree are converted into heaps, then the
subtrees which contain them, and so on, until the entire binary tree is converted into a
heap. The figure below depicts the process for the non-heap of Figure (a).


(Start)
            AR
        EX      OI
      LA  MR  ST  TM

(Step 1)  OI ↔ TM
            AR
        EX      TM                Subtree rooted at TM is now a heap.
      LA  MR  ST  OI

(Step 2)  EX ↔ MR
            AR
        MR      TM                Subtree rooted at MR is now a heap.
      LA  EX  ST  OI

(Step 3)  AR ↔ TM
            TM
        MR      AR                Heap property is satisfied at TM but the
      LA  EX  ST  OI              subtree rooted at AR is no longer a heap.

(Step 4)  AR ↔ ST
            TM
        MR      ST                Subtree rooted at ST is again a heap. Entire
      LA  EX  AR  OI              binary tree is now a heap.


Note that when the subtree rooted at any node, say node α, is converted into a heap, its
left and right subtrees are already heaps. We call such a subtree an almost-heap.
When an almost-heap is converted into a heap, one of its subtrees may cease to be a
heap (i.e., it may become an almost-heap). This subtree is once more converted into a
heap, but this may again cause one of its subtrees to lose the heap property, and so on.
The process continues as smaller and yet smaller subtrees lose and regain the heap
property, with larger keys migrating upward, until the smaller key from the starting root
node α finds its final resting place.

In the example shown above, when the key AR is exchanged with the key TM in
Step 3, the resulting right subtree rooted at AR ceases to be a heap, and is converted back
into a heap in Step 4, at which point AR finds its final placement.

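[Figure: Sift-up. Left: an almost-heap with root key K and son keys KL and KR, each
heading a subtree that is a heap, where K < max(KL, KR). Assume KR > KL; then K ↔ KR.
Right: after the exchange, KR is at the root and K heads the right subtree, with son
keys K'L and K'R; this subtree is an almost-heap if K < max(K'L, K'R). Larger keys
migrate upward as the root key migrates downward to its final resting place.]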

Procedure SIFT-UP formalizes the process just described for a binary tree
represented using linked allocation with node structure (LSON, KEY, RSON). Study the
procedure carefully. Take note especially of the way in which the key from the original
root is compared, but not actually exchanged, with upward migrating keys, until a final
home is found for it.


procedure SIFT-UP(T)
   //Converts an almost-heap T into a heap.//
   node(LSON, KEY, RSON)

   α ← T
   k ← KEY(α)                  //keep key at root of almost-heap in k//
   β ← LSON(α)
   while β ≠ Λ do
      σ ← RSON(α)
      if σ ≠ Λ and KEY(σ) > KEY(β) then β ← σ
      if KEY(β) > k then [ KEY(α) ← KEY(β)    //larger key migrates upward//
                           α ← β              //next root//
                           β ← LSON(α) ]
                    else exit
   endwhile
   KEY(α) ← k                  //final placement of root key//
end SIFT-UP
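
For readers who want to run the procedure, here is a minimal Python sketch of the same
logic. The Node class and the names sift_up, lson, rson and key are illustrative
stand-ins for the (LSON, KEY, RSON) node structure, and None plays the role of Λ. The
example tree is the almost-heap that results from Step 3 of the figure above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: str
    lson: Optional["Node"] = None
    rson: Optional["Node"] = None


def sift_up(t: Node) -> None:
    """Convert the almost-heap rooted at t into a heap, in place."""
    alpha = t
    k = alpha.key                    # keep key at root of almost-heap in k
    beta = alpha.lson
    while beta is not None:
        sigma = alpha.rson
        if sigma is not None and sigma.key > beta.key:
            beta = sigma             # beta is now the son with the larger key
        if beta.key > k:
            alpha.key = beta.key     # larger key migrates upward
            alpha = beta             # next root
            beta = alpha.lson
        else:
            break
    alpha.key = k                    # final placement of root key


# Almost-heap after Step 3: AR at the root, subtrees rooted at MR and TM are heaps.
root = Node("AR",
            Node("MR", Node("LA"), Node("EX")),
            Node("TM", Node("ST"), Node("OI")))
sift_up(root)
print(root.key, root.lson.key, root.rson.key)       # TM MR ST
print(root.rson.lson.key, root.rson.rson.key)       # AR OI
```

As in the pseudocode, the root key is written only once, when its final position is found.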

To convert a complete binary tree on n nodes into a heap, SIFT-UP is called
⌊n/2⌋ times, starting with the rightmost subtree and continuing in reverse level order.
With the linked representation of a binary tree, locating the roots of the pertinent
subtrees poses some difficulty. The sequential representation of a complete binary tree
completely eliminates this difficulty.

In the sequential representation of a complete binary tree on n nodes, the nodes
of the binary tree (actually, the contents of the data field of each node) are stored in a
one-dimensional array of size n in level order.

The following formulas allow us to locate, in constant time, the sons (if any) and
the father (if it exists) of any node, say node i, in a sequentially allocated complete
binary tree on n nodes. These formulas are the reason for this particular
implementation of a binary tree.
1. If 2i ≤ n, then the left son of node i is node 2i; else, node i has no left son.
2. If 2i < n, then the right son of node i is node 2i+1; else, node i has no right son.
3. If 1 < i ≤ n, then the father of node i is node ⌊i/2⌋.
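
The three formulas translate directly into code. The sketch below is illustrative only:
the function names are made up for this example, nodes are numbered from 1 as in the
text, and None stands for "no such node". It tests 2i ≤ n for the left son and
2i + 1 ≤ n for the right son, so that the last node n is included.

```python
def left_son(i, n):
    """Index of the left son of node i in a complete binary tree on n nodes."""
    return 2 * i if 2 * i <= n else None


def right_son(i, n):
    """Index of the right son of node i, or None if it has none."""
    return 2 * i + 1 if 2 * i + 1 <= n else None


def father(i, n):
    """Index of the father of node i; the root (node 1) has no father."""
    return i // 2 if 1 < i <= n else None


# The nine-node tree K1..K9 used earlier in this section.
n = 9
print(left_son(4, n), right_son(4, n))   # 8 9
print(left_son(5, n), right_son(5, n))   # None None
print(father(9, n), father(1, n))        # 4 None
```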

Procedure SIFT-UP(i, n) given below implements the sift-up process for a
sequentially represented almost-heap on n nodes with root at node i. It has exactly
the same structure as procedure SIFT-UP(T) given previously for the linked
representation of an almost-heap binary tree rooted at T. The only difference
between the two procedures is in the way sons are located: by computation using the
above formulas in the former and by following the links in the latter.
procedure SIFT-UP(i, n)
   //Converts an almost-heap on n nodes and rooted at node i into a heap.//
   array KEY(1:n)
   k ← KEY(i)                  //keep key at root of almost-heap in k//
   j ← 2*i                     //find left son of root//
   while j ≤ n do
      if j < n and KEY(j+1) > KEY(j) then j ← j + 1
      if KEY(j) > k then [ KEY(i) ← KEY(j)    //larger key migrates upward//
                           i ← j              //root of next subtree//
                           j ← 2*i ]          //left son of new root//
                    else exit
   endwhile
   KEY(i) ← k                  //root key finds final resting place//
end SIFT-UP

The following segment of EASY code converts a complete binary tree on n
nodes, stored sequentially in the vector K(1:n), into a heap. Note how easy it is to locate
the roots of the subtrees in reverse level order, something not as easily accomplished
with the linked representation of a binary tree.

for i ← ⌊n/2⌋ to 1 by -1 do
   call SIFT-UP(i, n)
endfor
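
The sequential sift-up and the heap-building loop can be sketched in Python as follows.
This is an illustration under stated assumptions, not a definitive implementation: the
names sift_up and build_heap are made up here, the array is passed as an explicit
argument, and position 0 of the list is left unused so that the indices match the
1-based formulas above.

```python
def sift_up(key, i, n):
    """Convert the almost-heap on key[1..n] rooted at node i into a heap."""
    k = key[i]                       # keep key at root of almost-heap in k
    j = 2 * i                        # left son of root
    while j <= n:
        if j < n and key[j + 1] > key[j]:
            j += 1                   # right son holds the larger key
        if key[j] > k:
            key[i] = key[j]          # larger key migrates upward
            i = j                    # root of next subtree
            j = 2 * i                # left son of new root
        else:
            break
    key[i] = k                       # root key finds final resting place


def build_heap(key, n):
    """Apply sift-up to nodes n//2, ..., 1, i.e. in reverse level order."""
    for i in range(n // 2, 0, -1):
        sift_up(key, i, n)


# The non-heap of Figure (a); key[0] is unused padding.
key = [None, "AR", "EX", "OI", "LA", "MR", "ST", "TM"]
build_heap(key, 7)
print(key[1:])  # ['TM', 'MR', 'ST', 'LA', 'EX', 'AR', 'OI']
```

The three calls with effect (i = 3, 2, 1) correspond to Steps 1, 2, and 3 to 4 of the
sift-up figure.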

The Heapsort algorithm of Floyd and Williams

The heap is the basis of an elegant sorting algorithm devised by Floyd and
Williams in 1964. The algorithm is given below.

1. Assign the keys to be sorted to the nodes of a complete binary tree.
2. Convert this binary tree into a heap by applying sift-up to its nodes in reverse
level order.
3. Repeatedly do the following until the heap is empty.
   a. Remove the key at the root of the heap (the largest in the heap) and
      place it in an output queue.
   b. Detach from the heap the rightmost leaf node at the bottommost level,
      extract its key, and store this key at the root of the heap.
   c. Apply sift-up to the root to convert the binary tree into a heap once again.

Procedure HEAPSORT implements the heapsort algorithm using a sequentially
allocated complete binary tree in a vector K of size n. HEAPSORT performs an
in-place sort of the keys, with the heap and the output queue coexisting in K.

procedure HEAPSORT(K, n)
   //Given a vector of size n containing keys k1, k2, k3, ..., kn, HEAPSORT
     performs an in-place sort of the keys in O(n log2 n) time. HEAPSORT invokes
     procedure SIFT-UP.//

   array K(1:n)
   //Convert K into an almost-heap.//
   for i ← ⌊n/2⌋ to 2 by -1 do
      call SIFT-UP(i, n)
   endfor

   //Sift-up to root, exchange root and last leaf, and consider last leaf deleted from
     binary tree and entered into output queue.//
   for j ← n to 2 by -1 do
      call SIFT-UP(1, j)       //Sift-up to root//
      K(1) ↔ K(j)              //Exchange root and last leaf//
   endfor
end HEAPSORT
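
A runnable Python sketch of the procedure above follows. Several things in it are
assumptions made for illustration: the function names, the unused position 0 that keeps
the indexing 1-based, the optional trace list, and in particular the input vector, which
is a hypothetical ordering of the six keys chosen for this example.

```python
def sift_up(key, i, n):
    """Convert the almost-heap on key[1..n] rooted at node i into a heap."""
    k = key[i]
    j = 2 * i
    while j <= n:
        if j < n and key[j + 1] > key[j]:
            j += 1
        if key[j] > k:
            key[i] = key[j]
            i, j = j, 2 * j
        else:
            break
    key[i] = k


def heapsort(key, n, trace=None):
    """Sort key[1..n] in place; key[0] is unused padding."""
    # Convert key into an almost-heap: every node except the root is sifted.
    for i in range(n // 2, 1, -1):
        sift_up(key, i, n)
    # key[1..j] is the heap, key[j+1..n] is the output queue.
    for j in range(n, 1, -1):
        sift_up(key, 1, j)                    # sift-up to root
        if trace is not None:
            trace.append((j, key[1:]))        # snapshot once the heap is formed
        key[1], key[j] = key[j], key[1]       # exchange root and last leaf


# Hypothetical input: one possible initial order of the six keys.
k = [None, "MR", "EX", "ST", "AR", "OI", "TM"]
stages = []
heapsort(k, 6, stages)
print(stages[0])  # (6, ['TM', 'OI', 'ST', 'AR', 'EX', 'MR'])
print(k[1:])      # ['AR', 'EX', 'MR', 'OI', 'ST', 'TM']
```

Each entry of stages records the value of j and the contents of the vector at the moment
the heap on positions 1 to j has been formed, before the root and last leaf are exchanged.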

The figure below shows the resulting heap and output queue in the vector K at
successive stages of the algorithm (actually, at the instant the heap is formed
but before the root and the last leaf are exchanged). Note that the boundary
between heap and queue is simply established by the value of the indexing
variable j in the second for-loop in procedure HEAPSORT.

Contents of K(1:6), positions 1 to 6 from left to right:

   heap ←                      → queue
   TM  OI  ST  AR  EX  MR      (empty)
   ST  OI  MR  AR  EX          TM
   OI  EX  MR  AR              ST  TM
   MR  EX  AR                  OI  ST  TM
   EX  AR                      MR  OI  ST  TM
   AR                          EX  MR  OI  ST  TM
   (empty)                     AR  EX  MR  OI  ST  TM

To invoke procedure HEAPSORT, we simply write

call HEAPSORT(K, n)

At entry into the procedure, K contains the keys to be sorted. Upon return to the calling
program, K contains the sorted keys.
