Application of Data Structures

Christopher Moh 2005

Overview 

Priority Queue structures
    Heaps
    Application: Dijkstra's algorithm

Cumulative Sum Data Structures on Intervals

Augmenting data structures with extra info to solve questions

Priority Queue (PQ) Structures 

Stores elements in a list by comparing a key field

Often has other satellite data
    For example, when sorting pixels by their R value, we consider the R as the key field and GB as satellite data

Priority queues allow us to sort elements by their key field.

Common PQ operations 

Create()
    Creates an empty priority queue

Find_Min()
    Returns the smallest element (by key field)

Insert(x)
    Insert element x (with predefined key field)

Delete(x)
    Delete position x from the queue

Change(x, k)
    Change key field of position x to k

Optional PQ operations

Union(a, b)
    Combines two PQs a and b

Search(k)
    Returns the position of the element in the heap with key value k

Considerations when implementing a PQ in competition

How complicated is it?
    Is the code likely to be buggy?

How fast does it need to be?
    Does a constant factor also come into the equation?

Do I need to store extra data to do a Search?
    During the course of this presentation, we shall assume that there exists extra data which allows us to do a search in O(1) time.
    The handling of this data structure will be assumed and not covered.

Linear Array

Unsorted Array
    Create, Insert, Change in O(1) time
    Find_min, Delete in O(n) time

Sorted Array
    Create, Delete, Find_min in O(1) time
    Insert, Change in O(n + log n) = O(n) time

Binary Heaps

Will be the most common structure that will be implemented in competition setting
    Efficient for most applications
    Easy to implement

A heap is a structure where the value of a node is less than the value of all of its children

A binary heap is a heap where the maximum number of children for each node is 2.

Array implementation

Consider a heap of size nheap in an array BHeap[1..nheap] (Define BHeap[nheap+1 .. (nheap*2)+1] to be INFINITY for practical reasons)
    The children of BHeap[x] are BHeap[x*2] and BHeap[x*2+1]
    The parent of BHeap[x] is BHeap[x/2]
    This allows a near uniform Binary Heap where we can ensure that the number of levels in this heap is O(log n)

Some properties wrt Key values:
    BHeap[x] >= BHeap[x/2]
    BHeap[x] <= BHeap[x*2]
    BHeap[x] <= BHeap[x*2+1]
    BHeap[x*2] ?? BHeap[x*2+1] (no order is guaranteed between siblings)

PQ Operations on a BHeap

We define BTree(x) to be the Binary Tree rooted at BHeap[x]

We define Heapify(x) to be an operation that does the following:
    Assume: BTree(x*2) and BTree(x*2+1) are binary heaps but BTree(x) is not necessarily a binary heap
    Produce: BTree(x) binary heap
    Details of Heapify in later slides but for now, we assume Heapify is O(log n)

For the rest of the presentation, we assume the variable n refers to nheap

Operations on a BHeap

Create is trivial
    O(1) time

Find_min: Return BHeap[1]
    O(1) time

Insert (element with key value x)
1. nheap++
2. BHeap[nheap] = x
3. T = nheap
4. While (T != 1 && BHeap[T] < BHeap[T/2])
    1. Swap (BHeap[T], BHeap[T/2])
    2. T = T/2

O(log n) time as the number of levels is O(log n)
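The Insert steps above can be sketched in Python as follows. This is a minimal sketch, assuming the heap lives in a Python list whose slot 0 is unused so that the root sits at index 1 as on the slides; the name heap_insert is illustrative and not from the original.

```python
def heap_insert(bheap, key):
    """Insert key into a 1-indexed min-heap stored in bheap (bheap[0] is unused)."""
    bheap.append(key)                # nheap++ and BHeap[nheap] = x
    t = len(bheap) - 1               # T = nheap
    # Bubble up while the new key is smaller than its parent.
    while t != 1 and bheap[t] < bheap[t // 2]:
        bheap[t], bheap[t // 2] = bheap[t // 2], bheap[t]
        t //= 2


heap = [None]                        # placeholder so that indexing starts at 1
for key in [5, 3, 8, 1]:
    heap_insert(heap, key)
print(heap[1])                       # the minimum always sits at the root
```

The length of the list plays the role of nheap, so no separate counter is needed.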

Operations on a BHeap

ChangeDown (position x, new key value k)
    Assume: k < existing BHeap[x]
1. BHeap[x] = k
2. T = x
3. While (T != 1 && BHeap[T] < BHeap[T/2])
    1. Swap (BHeap[T], BHeap[T/2])
    2. T = T/2

Complexity: O(log n)

This procedure is known as "bubbling up" the heap

Operations on a BHeap

ChangeUp (position x, new key value k)
    Assume: k > existing BHeap[x]
1. BHeap[x] = k
2. Heapify(x)

O(log n) as complexity of Heapify is O(log n)

Operations on a BHeap

Delete (position x on the heap)
1. BHeap[x] = BHeap[nheap]
2. nheap--
3. Heapify(x)
4. T = x
5. While (T != 1 && BHeap[T] < BHeap[T/2])
    1. Swap (BHeap[T], BHeap[T/2])
    2. T = T/2

Complexity is O(log n)

Why must I do both Heapify and "bubble up"?
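A small Python sketch of Delete may help with the closing question. It assumes a 1-indexed list with slot 0 unused, and it uses bounds checks where the slides pad the array with INFINITY; the helper names are illustrative. The example is chosen so that the element moved into the hole is smaller than its new parent, which is the case Heapify alone cannot repair.

```python
def heapify(bheap, x):
    """Sift bheap[x] down until neither child is smaller (iterative form)."""
    n = len(bheap) - 1
    while True:
        k = x
        for child in (2 * x, 2 * x + 1):
            if child <= n and bheap[child] < bheap[k]:
                k = child
        if k == x:
            return
        bheap[x], bheap[k] = bheap[k], bheap[x]
        x = k


def bubble_up(bheap, t):
    """Swap bheap[t] with its parent while it is smaller than the parent."""
    while t != 1 and bheap[t] < bheap[t // 2]:
        bheap[t], bheap[t // 2] = bheap[t // 2], bheap[t]
        t //= 2


def heap_delete(bheap, x):
    """Delete position x: overwrite it with the last element, then repair both ways."""
    bheap[x] = bheap[-1]
    bheap.pop()                      # nheap--
    if x < len(bheap):               # skip the repair if x was the last position
        heapify(bheap, x)
        bubble_up(bheap, x)


heap = [None, 1, 10, 2, 11, 12, 3, 4]
heap_delete(heap, 4)                 # 4 replaces 11, then must rise above 10
print(heap)
```

The replacement comes from a different subtree than the deleted node, so it may be either larger than the children below it or smaller than the parent above it, hence both repairs.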

Operations on a BHeap

Heapify (position x on the heap)
1. T = min(BHeap[x], BHeap[x*2], BHeap[x*2+1])
2. If (T == BHeap[x]) return;
3. K = position where BHeap[K] = T
4. Swap(BHeap[x], BHeap[K])
5. Heapify(K)

O(log n) as the maximum number of levels in the heap is O(log n) and Heapify only goes through each level at most once
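The Heapify steps above translate almost line for line into Python. This is a sketch under the same assumptions as the pseudocode (both subtrees of x are already heaps, the root is at index 1); it checks array bounds instead of relying on INFINITY padding, which is a substitution for convenience.

```python
def heapify(bheap, x):
    """Restore the heap property at x, assuming both subtrees of x are heaps."""
    n = len(bheap) - 1               # bheap[0] is unused, so n is the heap size
    k = x                            # K = position of the smallest of x and its children
    for child in (2 * x, 2 * x + 1):
        if child <= n and bheap[child] < bheap[k]:
            k = child
    if k == x:                       # T == BHeap[x]: nothing to do
        return
    bheap[x], bheap[k] = bheap[k], bheap[x]
    heapify(bheap, k)                # continue one level down


heap = [None, 9, 2, 3, 4, 5]         # only the root violates the heap property
heapify(heap, 1)
print(heap)
```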

BHeap Operations: Summary

Create, Find_min in O(1) time

Change (includes both ChangeUp and ChangeDown), Insert, and Delete are O(log n) time

Union operations are how long?
    Insertion: O(n log n) union
    Heapify: O(n) union

Corollary: Heapsort

We can convert an unsorted array to a heap using Heapify (why does this work?):
1. For (i = n/2; i >= 1; i--)
    1. Heapify(i)

We can then return a sorted list (list initially empty):
1. For (i = 1; i <= n; i++)
    1. Append the value of find_min to the list
    2. Delete(1)

Complexity is O(n log n)
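The two loops above can be sketched in Python as follows. This is a minimal sketch, assuming a 1-indexed working copy of the input and an explicit size argument n in place of the global nheap; Delete(1) is specialised to the root, where no bubbling up is needed.

```python
def heapify(a, x, n):
    """Sift a[x] down within the first n slots of the 1-indexed array a."""
    while True:
        k = x
        for child in (2 * x, 2 * x + 1):
            if child <= n and a[child] < a[k]:
                k = child
        if k == x:
            return
        a[x], a[k] = a[k], a[x]
        x = k


def heapsort(values):
    """Return the values in ascending order using a binary min-heap."""
    a = [None] + list(values)        # slot 0 unused so the root is a[1]
    n = len(values)
    for i in range(n // 2, 0, -1):   # build the heap bottom-up
        heapify(a, i, n)
    out = []
    while n >= 1:
        out.append(a[1])             # find_min
        a[1] = a[n]                  # Delete(1): move the last element to the root
        n -= 1
        heapify(a, 1, n)
    return out


print(heapsort([5, 2, 9, 1, 5, 6]))
```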

Binomial Trees

Define Binomial Tree B(k) as follows:
    B(0) is a single node
    B(n), n != 0, is formed by merging two B(n-1) trees in the following way:
        The root of the B(n) tree is the root of one of the B(n-1) trees, and the (new) leftmost child of this root is the root of the other B(n-1) tree.

Within the tree, the heap property holds, i.e. the key field of any node is less than or equal to the key fields of all its children.

Properties of Binomial Trees

The number of nodes in B(k) is exactly 2^k.

The height of B(k) is exactly (k + 1)

For any tree B(k)
    The root of B(k) has exactly k children
    If we take the children of B(k) from left to right, they form the roots of a B(k-1), B(k-2), ..., B(0) tree in that order

Binomial Heaps

Binomial Heaps are a forest of binomial trees with the following properties:
    All the binomial trees are of different sizes
    The binomial trees are ordered (from left to right) by increasing size

If we consider the fact that the size of B(k) is 2^k, the binomial tree B(k) exists in a binomial heap of n nodes iff the bit representing 2^k is 1 in the binary representation of n
    For example: 13 (decimal) = 1101 (binary), so the binomial heap with 13 nodes consists of the binomial trees B(0), B(2), and B(3).
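The correspondence between the bits of n and the trees in the heap can be checked with a few lines of Python. The function name binomial_tree_orders is illustrative.

```python
def binomial_tree_orders(n):
    """Orders k of the trees B(k) present in a binomial heap holding n nodes."""
    return [k for k in range(n.bit_length()) if (n >> k) & 1]


print(binomial_tree_orders(13))      # 13 = 1101 in binary
```

The sizes 2^k of the listed trees always add up to n, which is why a heap of n nodes holds at most about log n trees.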

Binomial Heap Implementation

Each node will store the following data:
    Key field
    Pointers (if non-existent, points to NIL) to
        Parent
        Next Sibling (ordered left to right; a sibling must have the same parent). For roots of binomial trees, next sibling points to the root of the next binomial tree
        Leftmost child
    Number of children in field degree
    Any other data that might be useful for the program

The binomial heap is represented by a head pointer that points to the root of the smallest binomial tree (which is the leftmost binomial tree)

Operations on Binomial Trees

Link (h1, h2)
    Links two binomial trees with root h1 and h2 of the same order k to form a new binomial tree of order (k+1)
    We assume h1->key < h2->key which implies that h1 is the root of the new tree
1. T = h1->leftchild
2. h1->leftchild = h2
3. h2->parent = h1
4. h2->next_sibling = T
5. h1->degree++ (keeps the degree field consistent with the number of children)

O(1) time
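A Python sketch of Link, using the node fields from the implementation slide. The class name Node and the field spellings are illustrative choices; the degree update is bookkeeping so that the degree field stays equal to the number of children.

```python
class Node:
    """One node of a binomial tree; None plays the role of NIL."""

    def __init__(self, key):
        self.key = key
        self.parent = None
        self.leftchild = None
        self.next_sibling = None
        self.degree = 0


def link(h1, h2):
    """Make h2 the leftmost child of h1; assumes same order and h1.key <= h2.key."""
    h2.next_sibling = h1.leftchild   # the old leftmost child becomes h2's sibling
    h2.parent = h1
    h1.leftchild = h2
    h1.degree += 1
    return h1


a, b = Node(1), Node(4)
c, d = Node(2), Node(3)
link(a, b)                           # B(1) rooted at 1
link(c, d)                           # B(1) rooted at 2
root = link(a, c)                    # B(2) rooted at 1
print(root.degree, root.leftchild.key, root.leftchild.next_sibling.key)
```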

Operations on binomial heaps

Create
    Create a new binomial heap with one node (key field set)
    Set Parent, Leftchild, Next sibling to NIL
    O(1) time

Find_min
1. X = head, min = INFINITY
2. While (X != nil)
    1. If (X->key < min) min = X->key
    2. X = X->next_sibling
3. Return min

O(log n) time as there are at most log n binomial trees (log n bits)

More Operations

Merge (h1, h2, L)
    Given binomial heaps with head pointers h1 and h2, create a list L of all the binomial trees of h1 U h2 arranged in ascending order of size
    For any order k, there may be zero, one, or two binomial trees of order k in this list.

More Operations

Merge (h1, h2, L)
    Assume that NIL is a node of infinitely large order (so that an exhausted list never wins the comparison)
1. L = empty
2. While (h1 != NIL || h2 != NIL)
    1. If (h1->degree < h2->degree)
        1. Append the (binomial) tree with root h1 to L
        2. h1 = h1->next_sibling
    2. Else
        1. Apply above steps to h2 instead

More Operations

Union (h1, h2)
    The fundamental operation involving binomial heaps
    Takes two binomial heaps with head pointers h1 and h2 and creates a new binomial heap of the union of h1 and h2

More Operations

Union (h1, h2)
1. Start with empty binomial heap
2. Merge (h1, h2, L)
3. Go by increasing k in the list L until L is empty
    1. If there is exactly one or exactly three (how can this happen?) binomial trees of order k in L, append one binomial tree of order k to the binomial heap and remove that tree from L
    2. If there are two trees of order k, remove both trees, use Link to form a tree of order (k+1) and pre-pend this tree to L

Union is O(log n)
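The Merge and Union steps above can be sketched in Python. To keep the example short, this sketch makes two simplifying assumptions that differ from the pointer layout on the slides: each heap is a plain Python list of roots in ascending order of degree, and each node keeps its children in a list (leftmost child first) instead of child and sibling pointers. All names are illustrative.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.children = []           # children[0] is the leftmost (largest) subtree

    @property
    def degree(self):
        return len(self.children)


def link(h1, h2):
    """Join two trees of equal order; the smaller key becomes the root."""
    if h2.key < h1.key:
        h1, h2 = h2, h1
    h1.children.insert(0, h2)
    return h1


def merge(roots1, roots2):
    """Combine two root lists into one list L in ascending order of degree."""
    out, i, j = [], 0, 0
    while i < len(roots1) or j < len(roots2):
        take_first = j == len(roots2) or (
            i < len(roots1) and roots1[i].degree < roots2[j].degree)
        if take_first:
            out.append(roots1[i])
            i += 1
        else:
            out.append(roots2[j])
            j += 1
    return out


def union(roots1, roots2):
    pending = merge(roots1, roots2)  # the list L
    result = []
    while pending:
        k = pending[0].degree
        same = sum(1 for r in pending[:3] if r.degree == k)
        if same == 2:
            # Two trees of order k: link them and put the result back in front.
            pending[0:2] = [link(pending[0], pending[1])]
        else:
            # One or three trees of order k: keep one in the result.
            result.append(pending.pop(0))
    return result


roots = []
for key in [7, 3, 9, 1, 5]:
    roots = union(roots, [Node(key)])    # Insert is a union with a one-node heap
print([r.degree for r in roots], min(r.key for r in roots))
```

Five nodes give trees of order 0 and 2, matching 5 = 101 in binary. Three trees of the same order appear when both inputs hold a tree of order k and a link of two smaller trees produces a third.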

More Operations

Inserting a new node with key field set
    Create a new binomial heap with that one node
    Union (existing heap with head h, new heap)
    O(log n) time

ChangeDown (node at position x, new value)
    Decreasing the key value of a node
    Same idea as binary heap: "Bubble up" the binomial tree containing this node (exchange only key fields and satellite data! What's the complexity if you physically change the node?)
    O(log n) time

More Operations

Delete (node at position x)
    Deleting position x from the heap
1. ChangeDown(x, -INFINITY)
2. Now x is at the root of its binomial tree
    Supposing that the binomial tree is of order k
    Recall that the children of the root of the binomial tree, from right to left, are binomial trees of order 0, 1, 2, 3, 4, ..., k-1
3. Form a new binomial heap with the children of the root of this binomial tree as the roots in the new binomial heap
4. Remove the original binomial tree from the original binomial heap
5. Union (original heap, new heap)

O(log n) complexity

More Operations

ChangeUp (node at position X, new value)
1. Delete (X)
2. Insert (new value)

O(log n) time

Summary

Binomial Heaps
    Create in O(1) time
    Union, Find_min, Delete, Insert, and Change operations take O(log n) time

In general, because they are more complicated, in competition it is far more prudent (saves time coding and debugging) to use a binary heap instead
    Unless there are MANY Union operations

Application of heaps: Dijkstra

The following describes how Dijkstra's algorithm can be coded with a binary heap

Initializing phase:
1. Let n be the number of nodes
2. Create a heap of size n, all key fields initialized to INFINITY
3. Change_val (s, 0) where s is the source node

Running of Dijkstra's algorithm

1. While (heap is not empty)
    1. X = node corresponding to find_min value
    2. Delete (position of X in heap = 1)
    3. For all nodes k that are adjacent to X
        1. If (cost[X] + distance[X][k] < cost[k])
            1. ChangeDown (position of k in heap, cost[X] + distance[X][k])
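A short Python sketch of the loop above. Python's standard heapq module has no ChangeDown operation, so this version swaps in a different technique: it pushes a fresh (cost, node) entry on every improvement and skips stale entries when they are popped. The adjacency-list format (a dict mapping each node to a list of (neighbour, distance) pairs) is an assumption made for the example.

```python
import heapq


def dijkstra(adj, s):
    """Shortest path costs from s; adj[x] is a list of (neighbour, distance) pairs."""
    cost = {v: float("inf") for v in adj}
    cost[s] = 0
    heap = [(0, s)]
    done = set()
    while heap:
        c, x = heapq.heappop(heap)           # find_min followed by Delete
        if x in done:
            continue                         # stale entry left by an earlier push
        done.add(x)
        for k, w in adj[x]:
            if c + w < cost[k]:
                cost[k] = c + w
                heapq.heappush(heap, (cost[k], k))   # stands in for ChangeDown
    return cost


graph = {
    "a": [("b", 4), ("c", 1)],
    "b": [("d", 1)],
    "c": [("b", 2), ("d", 5)],
    "d": [],
}
print(dijkstra(graph, "a"))
```

With this substitution the heap may hold up to m entries instead of n, so each heap operation costs O(log m), which is still O(log n) since m is at most n^2.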

Analysis of running time

At most n nodes are deleted
    O(n log n)

Let m be the number of edges. Each edge is relaxed at most once.
    O(m log n)

Total running time O([m+n] log n)

This is faster than using a basic array list unless the graph is very dense, in which case m is about O(n^2) which leads to a running time of O(n^2 log n)

Cumulative Sum on Intervals

Problem: We have a line that runs from x coordinate 1 to x coordinate N. At x coordinate X [X an integer between 0 and N], there is g(X) gold. Given an interval [a,b], how much gold is there between a and b?

How efficiently can this be done if we dynamically change the amount of gold and the interval [a,b] keeps changing?

Cumulative Sum Array

Let us define C(0) = 0, and C(x) = C(x-1) + g(x) where g(x) is the amount of gold at position x

C(x) then defines the total amount of gold from position 1 to position x

The amount of gold in interval [a,b] is simply C(b) - C(a-1)
    For any change in a or b, we can perform the update in O(1) time
    However, if we change g(x), we will have to change C(x), C(x+1), C(x+2), ..., C(N)
    Any change in gold results in an update in O(N) time
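As a small illustration of the definitions above, the following Python sketch builds C from g and answers an interval query. The gold amounts are made-up example data, and g[0] is padding so that positions start at 1.

```python
g = [0, 3, 1, 4, 1, 5]               # g[1..5]; g[0] is unused padding

# C[0] = 0 and C[x] = C[x-1] + g[x]
C = [0] * len(g)
for x in range(1, len(g)):
    C[x] = C[x - 1] + g[x]


def gold(a, b):
    """Total gold in the interval [a, b], with 1 <= a <= b <= N."""
    return C[b] - C[a - 1]


print(gold(2, 4))                    # g(2) + g(3) + g(4)
```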

Cumulative Sum Tree

We can use the binary representation of any number to come up with a cumulative sum tree

For example, let us take 13 (decimal) = 1101 (binary)
    The cumulative sum of g(1) + g(2) + ... + g(13) can be represented as the sum of:
        g(1) + g(2) + ... + g(8) [ 8 elements ]
        g(9) + g(10) + ... + g(12) [ 4 elements ]
        g(13) [ 1 element ]
    Notice that the number of elements in each case represents a bit that is "1" in the binary representation of the number

Cumulative Sum Tree

Another example: C(19)
    19 (decimal) is 10011 (binary)
    C(19) is the sum of the following:
        g(1) + g(2) + ... + g(16) [ 16 elements ]
        g(17) + g(18) [ 2 elements ]
        g(19) [ 1 element ]

Cumulative Sum Tree

Let us define C2(x) to be the sum of g(x) + g(x-1) + ... + g(p + 1) where p is a number with the same binary representation as x except the least significant bit of x (the rightmost bit of x that is "1") is 0

Examples of x and the corresponding p:
    x = 6 [110], p = 4 [100]
    x = 13 [1101], p = 12 [1100]
    x = 16 [10000], p = 0 [00000]
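The value of the least significant bit, and hence p, can be computed with one bitwise operation. This sketch relies on Python integers behaving like two's complement numbers under the & operator; the function names are illustrative.

```python
def lsb(x):
    """Value of the rightmost 1 bit of x (for x > 0)."""
    return x & -x


def p_of(x):
    """x with its least significant 1 bit cleared."""
    return x - lsb(x)


for x in (6, 13, 16):
    print(x, bin(x), p_of(x), bin(p_of(x)))
```

C2(x) therefore covers exactly lsb(x) elements, ending at position x.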

Cumulative Sum Tree

If we want to find the cumulative sum C(x) = g(1) + g(2) + ... + g(x), we can trace through the values of C2 using the binary representation of x

Examples:
    C(13) = C2(8) + C2(8+4) + C2(8+4+1)
    C(16) = C2(16)
    C(21) = C2(16) + C2(16+4) + C2(16+4+1)
    C(99) = C2(64) + C2(64+32) + C2(64+32+2) + C2(64+32+2+1)

This allows us to find C(x) in log x time

Hence the amount of gold in interval [a,b] = C(b) - C(a-1) can be found in log N time, which implies updates of a and b can be done in O(log N)
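The decompositions in the examples above can be generated mechanically by repeatedly clearing the least significant 1 bit of x. A small Python sketch, with an illustrative function name:

```python
def c2_terms(x):
    """Indices y such that C(x) is the sum of C2(y), listed in ascending order."""
    terms = []
    while x > 0:
        terms.append(x)
        x -= x & -x                  # clear the least significant 1 bit
    return terms[::-1]


for x in (13, 16, 21, 99):
    print(x, c2_terms(x))
```

The loop runs once per 1 bit of x, which is where the log x bound comes from.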

Cumulative Sum Tree

What happens when we change g(x)?
    If g(x) is changed, we only need to update C2(y) where C2(y) covers g(x)

We can go through all necessary C2(y) in the following way:
1. While (x <= N)
    1. Update C2(x)
    2. Add the value of the least significant bit of x to x

This runs in O(log N) time

Hence updates to g can also be done in O(log N) time, which is a great improvement over the O(N) needed for an array.
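Putting the update loop together with the C2 sums gives a compact structure. The following Python sketch stores C2 in a plain list; the class and method names are illustrative, and "update" is taken to mean adding a delta to g(x), which is one possible reading of "change".

```python
class CumulativeSumTree:
    """C2 values for positions 1..n, kept in a linear array."""

    def __init__(self, n):
        self.n = n
        self.c2 = [0] * (n + 1)      # c2[0] is unused

    def add(self, x, delta):
        """Add delta to g(x) by updating every C2(y) that covers position x."""
        while x <= self.n:
            self.c2[x] += delta
            x += x & -x              # add the least significant bit of x to x

    def prefix(self, x):
        """C(x) = g(1) + ... + g(x)."""
        total = 0
        while x > 0:
            total += self.c2[x]
            x -= x & -x              # drop the least significant bit of x
        return total

    def interval(self, a, b):
        """Gold in [a, b] = C(b) - C(a-1)."""
        return self.prefix(b) - self.prefix(a - 1)


tree = CumulativeSumTree(8)
for position, amount in enumerate([3, 1, 4, 1, 5, 9, 2, 6], start=1):
    tree.add(position, amount)
print(tree.interval(3, 6))           # 4 + 1 + 5 + 9
tree.add(4, 10)                      # g(4) grows by 10
print(tree.interval(3, 6))
```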

Cumulative Sum Tree

Examples [binary representation in brackets]
    Change to g(5) [ 101 ]: Update C2(5), C2(6), C2(8), C2(16) and all C2(power of 2 > 16)
    Change to g(13) [ 1101 ]: Update C2(13), C2(14), C2(16), and all C2(power of 2 > 16)
    Change to g(35) [ 100011 ]: Update C2(35), C2(36), C2(40), C2(48), C2(64), and all C2(power of 2 > 64)

We can implement a cumulative sum tree very simply: By simply using a linear array to store the values of C2.

Can we extend a cumulative sum tree to 2 or more dimensions?
    See IOI 2001 Day 1 Question 1

Sum of Intervals Tree

Another way to solve the question is to use a "Sum of Intervals" Binary Tree

Each node in the tree is represented by (L, R) and the value of (L, R) is g(L) + g(L+1) + ... + g(R)

The root of the tree has L = 1 and R = N

Every leaf has L = R

Every non-leaf has children (L, [L+R]/2) [left child] and ([L+R]/2+1, R) [right child]

The number of nodes in the tree is O(2*N) [ why? ]

In an implementation, every node should have pointers to its children and its parent

Sum of Intervals Tree

How to find C(x) = g(1) + g(2) + ... + g(x)?
    We trace from the root downwards
1. L = 1, R = N, C = 0
2. While (L != R)
    1. M = (L + R) / 2
    2. If (M < x)
        1. C += value of the left child (L, M)
        2. Set L and R to the right child of the current node
    3. Else
        1. Set L and R to the left child of the current node
3. C += value at (L, R) [ or (L, L) or (R, R) as L = R ]

Time complexity: O(log n)

Sum of Intervals Tree

What happens when g(x) is changed?
    Trace from (x, x) upwards to the root
1. Let L = R = x
2. While (L, R) is not the root
    1. Update the value of (L, R)
    2. Set (L, R) to the parent of (L, R)
3. Update the root

Complexity of O(log N)

Hence all updates of interval [a,b] and g(x) can be done in O(log N) time
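A Python sketch of the Sum of Intervals tree, covering both the query for C(x) and the change to g(x). It is a sketch under stated assumptions: nodes hold child references only, and the update walks down from the root while recording the path, then recomputes sums on the way back up, which stands in for the parent pointers described above. All names and the example data are illustrative.

```python
class IntervalNode:
    """Node (lo, hi) whose value is g(lo) + ... + g(hi)."""

    def __init__(self, lo, hi, g):
        self.lo, self.hi = lo, hi
        self.left = self.right = None
        if lo == hi:
            self.value = g[lo]
        else:
            mid = (lo + hi) // 2
            self.left = IntervalNode(lo, mid, g)
            self.right = IntervalNode(mid + 1, hi, g)
            self.value = self.left.value + self.right.value


def prefix_sum(root, x):
    """C(x) for 1 <= x <= N, tracing from the root down to the leaf (x, x)."""
    node, total = root, 0
    while node.lo != node.hi:
        if node.left.hi < x:         # M < x: the whole left child is included
            total += node.left.value
            node = node.right
        else:
            node = node.left
    return total + node.value


def change(root, x, new_value):
    """Set g(x) = new_value and refresh every interval that contains x."""
    path, node = [], root
    while node.lo != node.hi:
        path.append(node)
        node = node.left if x <= node.left.hi else node.right
    node.value = new_value
    for parent in reversed(path):    # leaf-to-root order
        parent.value = parent.left.value + parent.right.value


g = [0, 3, 1, 4, 1, 5]               # g[1..5]; g[0] is unused padding
root = IntervalNode(1, 5, g)
print(prefix_sum(root, 3))
change(root, 2, 10)
print(prefix_sum(root, 3))
```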

Augmenting Data Structures

It is often useful to change the data structure in some way, by adding additional data in each node or changing what each node represents.

This allows us to use the same data structure to solve problems

For example, we can use so-called "interval trees" to solve not just cumulative sum problems
    We can use properties of elements in the interval (L, R) that are related to L and R.

Other data structures

Balanced (and unbalanced) binary trees
    Red-Black trees
    2-3-4 trees
    Splay trees

Suffix Trees

Fibonacci Heaps
