
Lecture 01

Analysis of Algorithms

[Diagram: Input → Algorithm → Output]

An algorithm is a step-by-step procedure for solving a problem in a finite amount of time.

Running Time (1.1)

[Figure: running time (ms) vs. input size, showing best-case, average-case, and worst-case curves]

Most algorithms transform input objects into output objects.
The running time of an algorithm typically grows with the input size.
Average-case time is often difficult to determine.
We focus on the worst-case running time:
  Easier to analyze
  Crucial to applications such as games, finance and robotics

Experimental Studies (1.6)


Write a program implementing the algorithm.
Run the program with inputs of varying size and composition.
Use a method like System.currentTimeMillis() to get an accurate measure of the actual running time.
Plot the results.

[Figure: measured time (ms) vs. input size, plotted from the experimental runs]
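
A minimal C++ sketch of such an experiment (the slides mention Java's System.currentTimeMillis(); this sketch uses std::chrono instead, and times a placeholder algorithmUnderTest that simply sums the input):

#include <chrono>
#include <cstdlib>
#include <iostream>
#include <vector>
using namespace std;

// Placeholder for the algorithm being measured (hypothetical; sums the input).
long long algorithmUnderTest(const vector<int>& input) {
    long long sum = 0;
    for (size_t i = 0; i < input.size(); i++) sum += input[i];
    return sum;
}

int main() {
    for (int n = 1000; n <= 1000000; n *= 10) {
        vector<int> input(n);
        for (int i = 0; i < n; i++) input[i] = rand();   // input of size n
        chrono::steady_clock::time_point start = chrono::steady_clock::now();
        algorithmUnderTest(input);
        chrono::steady_clock::time_point stop = chrono::steady_clock::now();
        long long us = chrono::duration_cast<chrono::microseconds>(stop - start).count();
        cout << n << " " << us << " us" << endl;          // plot these (n, time) pairs
    }
}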

Limitations of Experiments
It is necessary to implement the
algorithm, which may be difficult
Results may not be indicative of the
running time on other inputs not included
in the experiment.
In order to compare two algorithms, the
same hardware and software
environments must be used

Theoretical Analysis
Uses a high-level description of the algorithm
(pseudocode) instead of an implementation
Characterizes running time as a function of
the input size, n.
Takes into account all possible inputs
Allows us to evaluate the speed of an
algorithm independent of the
hardware/software environment


Pseudocode (1.1)

High-level description of an algorithm
More structured than English prose
Less detailed than a program
Preferred notation for describing algorithms
Hides program design issues

Example: find the max element of an array

  Algorithm arrayMax(A, n)
    Input array A of n integers
    Output maximum element of A
    currentMax ← A[0]
    for i ← 1 to n − 1 do
      if A[i] > currentMax then
        currentMax ← A[i]
    return currentMax
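
As a sketch, the pseudocode above translates directly to C++ (using a plain int array; this is illustrative, not the textbook's code):

#include <iostream>
using namespace std;

// C++ version of arrayMax(A, n): returns the maximum of A[0..n-1] (n >= 1 assumed).
int arrayMax(const int A[], int n) {
    int currentMax = A[0];
    for (int i = 1; i <= n - 1; i++)
        if (A[i] > currentMax)
            currentMax = A[i];
    return currentMax;
}

int main() {
    int A[] = {3, 1, 7, 4};
    cout << arrayMax(A, 4) << endl;   // prints 7
}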

Primitive Operations
Basic computations
performed by an algorithm
Identifiable in pseudocode
Largely independent from the
programming language
Exact definition not important
(we will see why later)
Assumed to take a constant
amount of time


Examples:
  performing an arithmetic operation
  assigning a value to a variable
  indexing into an array
  calling a method
  returning from a method
  comparing two numbers

Counting Primitive Operations (1.1)

By inspecting the pseudocode, we can determine the maximum number of primitive operations executed by an algorithm, as a function of the input size.

  Algorithm arrayMax(A, n)               # operations
    currentMax ← A[0]                     2
    for i ← 1 to n − 1 do                 1 + n
      if A[i] > currentMax then           2(n − 1)
        currentMax ← A[i]                 2(n − 1)
      { increment counter i }             2(n − 1)
    return currentMax                     1
                                Total     7n − 2

Estimating Running Time

Algorithm arrayMax executes 7n − 2 primitive operations in the worst case. Define:
  a = time taken by the fastest primitive operation
  b = time taken by the slowest primitive operation
Let T(n) be the worst-case time of arrayMax. Then
  a(7n − 2) ≤ T(n) ≤ b(7n − 2)
Hence, the running time T(n) is bounded by two linear functions.

Growth Rate of Running Time

Changing the hardware/software environment
  affects T(n) by a constant factor, but
  does not alter the growth rate of T(n).
The linear growth rate of the running time T(n) is an intrinsic property of algorithm arrayMax.

Growth Rates

Growth rates of functions:
  Constant      1
  Logarithmic   log n
  Linear        n
  N-Log-N       n log n
  Quadratic     n^2
  Cubic         n^3
  Exponential   2^n

[Figure: log-log plot of T(n) vs. n for cubic, quadratic, and linear functions]

In a log-log chart, the slope of the line corresponds to the growth rate of the function.

Constant Factors

The growth rate is not affected by
  constant factors or
  lower-order terms
Examples:
  10^2 n + 10^5 is a linear function
  10^5 n^2 + 10^8 n is a quadratic function

[Figure: log-log plot showing that each pair of curves (quadratic, linear) has the same slope despite different constant factors]

Big-Oh Notation (1.2)

Given functions f(n) and g(n), we say that f(n) is O(g(n)) if there are positive constants c and n0 such that f(n) ≤ c·g(n) for n ≥ n0.

Example: 2n + 10 is O(n)
  2n + 10 ≤ cn
  (c − 2)n ≥ 10
  n ≥ 10/(c − 2)
  Pick c = 3 and n0 = 10

[Figure: log-log plot of 3n, 2n + 10, and n]

Big-Oh Example

Example: the function n^2 is not O(n)
  n^2 ≤ cn would require n ≤ c
  The above inequality cannot be satisfied for all n, since c must be a constant.

[Figure: log-log plot of n^2, 100n, 10n, and n]

More Big-Oh Examples

7n − 2 is O(n)
  need c > 0 and n0 ≥ 1 such that 7n − 2 ≤ cn for n ≥ n0
  this is true for c = 7 and n0 = 1

3n^3 + 20n^2 + 5 is O(n^3)
  need c > 0 and n0 ≥ 1 such that 3n^3 + 20n^2 + 5 ≤ cn^3 for n ≥ n0
  this is true for c = 4 and n0 = 21

3 log n + log log n is O(log n)
  need c > 0 and n0 ≥ 1 such that 3 log n + log log n ≤ c·log n for n ≥ n0
  this is true for c = 4 and n0 = 2

Big-Oh and Growth Rate

The big-Oh notation gives an upper bound on the growth rate of a function.
The statement "f(n) is O(g(n))" means that the growth rate of f(n) is no more than the growth rate of g(n).
We can use the big-Oh notation to rank functions according to their growth rate:

                     f(n) is O(g(n))   g(n) is O(f(n))
  g(n) grows more    Yes               No
  f(n) grows more    No                Yes
  Same growth        Yes               Yes

Big-Oh Rules

If f(n) is a polynomial of degree d, then f(n) is O(n^d), i.e.,
  1. drop lower-order terms
  2. drop constant factors
Use the smallest possible class of functions
  Say "2n is O(n)" instead of "2n is O(n^2)"
Use the simplest expression of the class
  Say "3n + 5 is O(n)" instead of "3n + 5 is O(3n)"

Asymptotic Algorithm Analysis

The asymptotic analysis of an algorithm determines the running time in big-Oh notation.
To perform the asymptotic analysis:
  We find the worst-case number of primitive operations executed as a function of the input size.
  We express this function with big-Oh notation.
Example:
  We determine that algorithm arrayMax executes at most 7n − 2 primitive operations.
  We say that algorithm arrayMax runs in O(n) time.
Since constant factors and lower-order terms are eventually dropped anyhow, we can disregard them when counting primitive operations.

The Importance of Asymptotics

An algorithm with an asymptotically slow running time is beaten in the long run by an algorithm with an asymptotically faster running time.

Maximum problem size (n) solvable in a given amount of time:

  Running Time   1 second    1 minute    1 hour
  O(n)           2,500       150,000     9,000,000
  O(n log n)     4,096       166,666     7,826,087
  O(n^2)         707         5,477       42,426
  O(n^4)         31          88          244
  O(2^n)         19          25          31

Computing Prefix Averages

We further illustrate asymptotic analysis with two algorithms for prefix averages.
The i-th prefix average of an array X is the average of the first (i + 1) elements of X:
  A[i] = (X[0] + X[1] + ... + X[i]) / (i + 1)
Computing the array A of prefix averages of another array X has applications to financial analysis.

[Figure: bar chart of an example array X and its prefix averages A for indexes 1-7]

Prefix Averages (Quadratic)

The following algorithm computes prefix averages in quadratic time by applying the definition.

  Algorithm prefixAverages1(X, n)
    Input array X of n integers
    Output array A of prefix averages of X    # operations
    A ← new array of n integers                n
    for i ← 0 to n − 1 do                      n
      s ← X[0]                                 n
      for j ← 1 to i do                        1 + 2 + ... + (n − 1)
        s ← s + X[j]                           1 + 2 + ... + (n − 1)
      A[i] ← s / (i + 1)                       n
    return A                                   1
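
A direct C++ sketch of the quadratic algorithm (names and the use of vector<double> are illustrative, not from the textbook):

#include <vector>
using namespace std;

// Quadratic-time prefix averages: the prefix sum is recomputed from scratch for each i.
vector<double> prefixAverages1(const vector<int>& X) {
    int n = X.size();
    vector<double> A(n);
    for (int i = 0; i < n; i++) {
        double s = X[0];
        for (int j = 1; j <= i; j++)   // inner loop runs 0 + 1 + ... + (n - 1) times in total
            s += X[j];
        A[i] = s / (i + 1);
    }
    return A;
}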

Arithmetic Progression

The running time of prefixAverages1 is O(1 + 2 + ... + n).
The sum of the first n integers is n(n + 1) / 2.
  There is a simple visual proof of this fact.
Thus, algorithm prefixAverages1 runs in O(n^2) time.

[Figure: staircase diagram illustrating that 1 + 2 + ... + n = n(n + 1)/2]

Prefix Averages (Linear)

The following algorithm computes prefix averages in linear time by keeping a running sum.

  Algorithm prefixAverages2(X, n)
    Input array X of n integers
    Output array A of prefix averages of X    # operations
    A ← new array of n integers                n
    s ← 0                                      1
    for i ← 0 to n − 1 do                      n
      s ← s + X[i]                             n
      A[i] ← s / (i + 1)                       n
    return A                                   1

Algorithm prefixAverages2 runs in O(n) time.
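
The linear-time version in C++, as a sketch:

#include <vector>
using namespace std;

// Linear-time prefix averages: keep a running sum s of X[0..i].
vector<double> prefixAverages2(const vector<int>& X) {
    int n = X.size();
    vector<double> A(n);
    double s = 0;
    for (int i = 0; i < n; i++) {
        s += X[i];               // running sum
        A[i] = s / (i + 1);      // i-th prefix average
    }
    return A;
}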

Math you need to Review

Summations (Sec. 1.3.1)
Logarithms and Exponents (Sec. 1.3.2)
Proof techniques (Sec. 1.3.3)
Basic probability (Sec. 1.3.4)

Properties of logarithms:
  log_b(xy) = log_b x + log_b y
  log_b(x/y) = log_b x − log_b y
  log_b(x^a) = a log_b x
  log_b a = log_x a / log_x b
Properties of exponentials:
  a^(b+c) = a^b · a^c
  a^(bc) = (a^b)^c
  a^b / a^c = a^(b−c)
  b = a^(log_a b)
  b^c = a^(c · log_a b)

Intuition for Asymptotic Notation

Big-Oh: f(n) is O(g(n)) if f(n) is asymptotically less than or equal to g(n)
big-Omega: f(n) is Ω(g(n)) if f(n) is asymptotically greater than or equal to g(n)
big-Theta: f(n) is Θ(g(n)) if f(n) is asymptotically equal to g(n)
little-oh: f(n) is o(g(n)) if f(n) is asymptotically strictly less than g(n)
little-omega: f(n) is ω(g(n)) if f(n) is asymptotically strictly greater than g(n)

Example Uses of the Relatives of Big-Oh

5n^2 is Ω(n^2)
  f(n) is Ω(g(n)) if there is a constant c > 0 and an integer constant n0 ≥ 1 such that f(n) ≥ c·g(n) for n ≥ n0
  let c = 5 and n0 = 1
5n^2 is Ω(n)
  f(n) is Ω(g(n)) if there is a constant c > 0 and an integer constant n0 ≥ 1 such that f(n) ≥ c·g(n) for n ≥ n0
  let c = 1 and n0 = 1
5n^2 is ω(n)
  f(n) is ω(g(n)) if, for any constant c > 0, there is an integer constant n0 ≥ 0 such that f(n) ≥ c·g(n) for n ≥ n0
  need 5n0^2 ≥ c·n0; given c, the n0 that satisfies this is n0 ≥ c/5 ≥ 0

Time Complexity

Time complexity refers to the use of asymptotic notation (O, Ω, Θ, o, ω) in denoting running time.
If two algorithms accomplishing the same task belong to two different time complexities:
  one will be faster than the other;
  as n is increased further, more benefit will be gained from the faster algorithm.
The faster algorithm is generally preferred.

Time Complexity Comparison

Speed comparison (fastest to slowest):
  Constant      1        (fastest)
  Logarithmic   log n
  Linear        n
  N-Log-N       n log n
  Quadratic     n^2
  Cubic         n^3
  Exponential   2^n      (slowest)

"Speed" here refers to the speed in solving the problem, not the growth rate of time mentioned earlier: a fast algorithm has a lower growth rate than a slow algorithm.

Lecture 02a

Review of
Basic Data Structures

TCP2101 ADA

Some Useful STL Containers

  Container   Description
  vector      "Array" that grows automatically.
              Best for rapid insertion and deletion at the back.
              Supports direct access to any element via operator [].
  set         No duplicate key/element allowed.
              Keys are automatically sorted.
              Best for rapid lookup (searching) of a key.
  multiset    set that allows duplicate keys.
  map         Collection of (key, value) pairs with non-duplicate keys.
              Pairs are automatically sorted by key.
              Best for rapid lookup of a key.
  multimap    map that allows duplicate keys.
  stack       Last-in, first-out (LIFO) data structure.
  queue       First-in, first-out (FIFO) data structure.

STL vector Class

#include <iostream>
#include <vector>
using namespace std;
int main() {
    vector<int> v;
    v.push_back(4);
    v.push_back(2);
    v.push_back(7);
    v.push_back(6);
    for (int i = 0; i < v.size(); i++)
        cout << v[i] << " ";
    cout << endl;
    // Same result as the 'for' loop above, using an iterator.
    for (vector<int>::iterator it = v.begin();  // iterator type must match the container type;
         it != v.end();                         // it is initialized to the first element of the container
         it++)                                  // and moves to the next element each pass
        cout << *it << " ";                     // use the iterator it like a pointer
    cout << endl;
}

Output:
4 2 7 6
4 2 7 6

STL set Class


A set is a collection of non-duplicate sorted elements called keys.
set <key_type> s;
key_type is the data types of the key/element.
Use set when you want fast searching of a sorted collection and you do not need random access to its elements.
Use insert() method to insert an element into a set:
set <int> s;
s.insert (321);
Duplicates are ignored when inserted.
iterator is required to iterate/visit the elements in set. Operator[] is
not supported.
Use find() method to look up a specified key in a set .

TCP2101 ADA

STL set Class


#include <iostream>
#include <set>
using namespace std;
int main() {
set<int> s;
s.insert (321);
s.insert (-999);
s.insert (18);
s.insert (-999); // duplicate is ignored
set<int>::iterator it = s.begin();
while (it != s.end())
cout << *it++ << endl; // -999 18 321
int target;
cout << "Enter an integer: ";
cin >> target;
it = s.find (target);
if (it == s.end()) // not found
cout << target << " is NOT in set.";
else
cout << target << " is IN set.";
}

TCP2101 ADA

Output1:
-999
18
321
Enter an integer:
5
5 is NOT in set.

Use iterator to
iterate the set.
Output2:
-999
18
321
Enter an integer:
321
321 is IN set.

STL multiset Class


#include <iostream>
#include <set>
using namespace std;
int main() {
multiset<int> s;
s.insert (321);
s.insert (-999);
s.insert (18);
s.insert (-999); // duplicate
multiset<int>::iterator it = s.begin();
while (it != s.end())
cout << *it++ << endl;
}

TCP2101 ADA

Output:
-999
-999
18
321

multiset allows
duplicate keys

STL map Class


A map is a collection of (key,value) pairs sorted by the keys.
map <key_type, value_type> m;
key_type and value_type are the data types of the key and
the value respectively.
In array the index is always int starting from 0, whereas in
map the key can be of other data type.
map cannot contain duplicate key (multimap can).

map <char, string> m;


m['A'] = "Apple";
m['A'] = "Angel"; // key 'A' is already in the map;
                  // operator[] overwrites the value,
                  // so m['A'] is now "Angel".
                  // (insert() would ignore the duplicate key.)

TCP2101 ADA

STL map Class


#include <iostream>
#include <string>
#include <map> // map, multimap
using namespace std;
int main() {
map <char, string> m;
m['C'] = "Cat";
// insert
m['A'] = "Apple";
m['B'] = "Boy";
cout << m['A'] << " " // retrieve
<< m['B'] << " "
<< m['C'] << endl;
map <char, string>::iterator it;
it = m.begin();
while (it != m.end()) {
cout << it->first << " "
<< it->second << endl;
it++;
}

TCP2101 ADA

char key;
cout << "Enter a char: ";
cin >> key;
it = m.find (key);
if (it == m.end())
cout << key
<< " is NOT in map.";
else
cout << key << " is IN map.";
}

first refers to the key of the current element, whereas second refers to the value of the current element.

STL map Class


#include <iostream>
#include <string>
#include <map> // map, multimap
using namespace std;
int main() {
map <char, string> m;
m['C'] = "Cat";
// insert
m['A'] = "Apple";
m['B'] = "Boy";
cout << m['A'] << " " // retrieve
<< m['B'] << " "
<< m['C'] << endl;
map <char, string>::iterator it;
it = m.begin();
while (it != m.end()) {
cout << it->first << " "
<< it->second << endl;
it++;
}

TCP2101 ADA

char key;
cout << "Enter a char: ";
cin >> key;
it = m.find (key);
if (it == m.end())
cout << key
<< " is NOT in map.";
else
cout << key << " is IN map.";
}

Output 1:
Apple Boy Cat
A Apple
B Boy
C Cat
Enter a char: Z
Z is NOT in map

Output 2:
Apple Boy Cat
A Apple
B Boy
C Cat
Enter a char: C
C is IN map

STL map Class


Another way of inserting a new (key,value) pair into a map is to
use insert method and pair class.
The pair object and the map must have the same key type and
value type.

map <char,string> m;
m.insert (pair<char,string>('A',"Apple"));
m.insert (pair<char,string>('A',"Angel"));

TCP2101 ADA

10

STL multimap Class


A multimap is similar to map but it allows duplicate keys.
However, insert method and a pair object must be used
when inserting a (key,value) pair into multimap.
The pair object and the multimap must have the same key
type and value type.

Operator [] is not supported. An iterator must be used to locate an element.
multimap <char,string> mm;
mm.insert (pair<char,string>('A',"Apple"));
mm.insert (pair<char,string>('A',"Angel"));
// mm has 2 elements with 'A' as key.

TCP2101 ADA

11

STL multimap Class


#include <iostream>
#include <string>
#include <map> // map, multimap
using namespace std;
int main() {
multimap <char,string> mm;
mm.insert (
pair<char,string>('C',"Cat"));
mm.insert (
pair<char,string>('A',"Apple"));
mm.insert (
pair<char,string>('B',"Boy"));
mm.insert (
pair<char,string>('A',"Angle"));
multimap <char, string>::iterator it;
it = mm.begin();
while (it != mm.end()) {
cout << it->first << " "
<< it->second << endl;
it++;
}

TCP2101 ADA

char key;
cout << "Enter a char: ";
cin >> key;
it = mm.find (key);
if (it == mm.end())
cout << key
<< " is NOT in map.";
else
cout << key << " is IN map.";
}

Output 1:
A Apple
A Angle
B Boy
C Cat
Enter a char: Z
Z is NOT in map

Output 2:
A Apple
A Angle
B Boy
C Cat
Enter a char: C
C is IN map

12

STL stack Class


A stack is a Last-In-First-Out (LIFO) data structures, meaning
that the last item to push (insert) into the top of the stack will
be the first item to pop out (remove) from the top of the stack.
Sample applications:
1. Page-visited history in a Web browser
2. Undo sequence in a text editor
3. Chain of method calls in the Java Virtual Machine or C++
runtime environment

TCP2101 ADA

13

STL stack Class


#include <iostream>
#include <stack> // STL stack
using namespace std;
int main() {
stack<int> st;
cout << "Push result: ";
for (int i=0; i<5; i++) {
st.push(i); // push into stack
cout << st.top() << " "; // check top item
}
cout << "\nPop result : ";
while (!st.empty()) {
cout << st.top() << " ";
st.pop(); // remove the top item.
}
}

TCP2101 ADA

Output:
Push result: 0 1 2 3 4
Pop result : 4 3 2 1 0

The last item to


insert is the first
item to remove

14

STL queue Class


A queue is a First-In-First-Out (FIFO) data structure, meaning that the first item pushed (inserted/enqueued) into the back of the queue will be the first item popped out (removed/dequeued) from the front.
Sample applications:
1. Waiting lines
2. Access to shared resources (e.g., printer)

TCP2101 ADA

15

STL queue Class


#include <iostream>
#include <queue> // STL queue
using namespace std;
int main() {
queue<int> q;
cout << "Push result:\nfront,back\n";
for (int i=0; i<5; i++) {
q.push(i); // push into queue
// check front and back of the queue
cout << q.front() << "," << q.back() << "\n";
}
cout << "Pop result:\nfront,back\n";
while (!q.empty()) {
cout << q.front() << "," << q.back() << "\n";
q.pop(); // pop from queue
}
}

TCP2101 ADA

Output:
Push result:
front,back
0,0
0,1
0,2
0,3
0,4
Pop result:
front,back
0,4
1,4
2,4
3,4
4,4

16

STL Container Efficiency

  Container      []                        Insert                         Remove                         Find
  vector         Θ(1): can go to any       O(n): insert at the beginning  O(n): remove at the beginning  O(n) if the target
                 valid position directly   requires shifting all          requires shifting all          is the last item
                                           elements right by one          elements left by one
                                           position                       position
  set/multiset   n/a                       O(lg n)                        O(lg n)                        O(lg n)
  map            O(lg n)                   O(lg n)                        O(lg n)                        O(lg n)
  multimap       n/a                       O(lg n)                        O(lg n)                        O(lg n)
  stack          n/a                       Θ(1): happens at the top       Θ(1): happens at the top       n/a
  queue          n/a                       Θ(1): happens at the back      Θ(1): happens at the front     n/a

Lecture 02b
Hash Tables

Review of Linked List

start

Each blue node is divided into two


sections, for the two members of
the Node struct.

Review of Linked List (cont.)

start

The left section is


the info member.

The right section is the


pointer called next.

A Node Struct
Template
template <typename T>
struct Node {
T info;
Node<T> *next;
};
The next pointer stores
the address of a Node
of the same type! This
means that each node
can point to another
node.

The info member is for the data. It can be anything (T), but it is often an object of another struct, used as a record of information.

Review of Linked List (cont.)

start

The start pointer would


be saved in the private
section of a data
structure class.

The last node doesn't point to another node, so its pointer (called next) is set to NULL (indicated by a slash).

Linked List Advantages

Linked lists have two main advantages over arrays.
1. Linked lists waste less memory for a large number of elements.
   In arrays, the wasted memory is the part of the array not being utilized.
   In linked lists, the wasted memory is the pointer in each node.

Linked List Advantages (cont.)

2. Linked lists are faster than arrays for the following two operations:
   inserting a new element at the start or middle of the linked list;
   removing an existing element from the start or middle of the linked list.

Linked List
Advantages (cont.)

Removing an element or inserting an


element at the middle of a linked list
is fast.

Inserting a Node at Front


element
start

All new nodes must be made in the heap, so:

Inserting a Node at Front (cont.)


element
start

newNode

Node<T> *newNode = new Node<T>;

10

Inserting a Node at Front (cont.)


element
start

newNode
Node<T> *newNode = new Node<T>;

newNode->info = element;
Now we have to store element into the node
11

Inserting a Node at Front (cont.)


element
start

newNod
e
Node<T> *newNode = new Node<T>;
newNode->info = element;

newNode->next = start;
12

Inserting a Node at Front (cont.)


element
start

newNod
e
Node<T> *newNode = new Node<T>;
newNode->info = element;
newNode->next = start;

start = newNode;

13
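
Putting the four steps together, an insertFront member function might look like the following sketch (using the Node struct from the earlier slide; the actual LinkedList.cpp used in the course may differ in details):

template <typename T>
class LinkedList {
    Node<T> *start;                        // points to the first node (NULL when empty)
public:
    LinkedList() : start(NULL) {}
    void insertFront(const T &element) {
        Node<T> *newNode = new Node<T>;    // all new nodes are made in the heap
        newNode->info = element;           // store the element in the node
        newNode->next = start;             // new node points to the old first node
        start = newNode;                   // new node becomes the first node -- Theta(1)
    }
};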

Linked List Implementation


Study LinkedList.cpp
The implementation is incomplete but
sufficient to implement a hash table.

14

Time Complexities for Linked List

insertFront — we insert at the head of the linked list: Θ(1)
Find/Delete — in the worst case, all nodes in the linked list are checked, so it is Θ(n) for an unordered list. E.g., finding the max/min must search the whole list.
isEmpty — Θ(1), because we just test whether the linked list is empty
makeEmpty — Θ(n), because we need to delete all nodes

Hash Table ADT


The hash table is a table of elements that
have keys
A hash function is used for locating a
position in the table
The table of elements is the set of data
acted upon by the hash table operations

16

Selected Hash Table ADT


Operations
insert, to insert an element into a table
retrieve, to retrieve an element from the
table
an operation to empty out the hash table

17

Fast Search

A hash table uses a function of the key value of an element to identify its location in an array.
A search for an element can be done in Θ(1) time.
The function of the key value is called a hash function.

Hash Functions
The input into a hash function is a key
value
The output from a hash function is an
index of an array (hash table) where the
object containing the key is located
Example of a hash function:
h( k ) = k % 100

19

Example Using a Hash Function

Suppose our hash function is:
  h(k) = k % 100
We wish to search for the object containing key value 214.
  k is set to 214 in the hash function; the result is 14.
  The object containing key value 214 is stored at index 14 of the array (hash table).
  The search is done in Θ(1) time.

Inserting an Element

An element is inserted into a hash table using the same hash function:
  h(k) = k % 100
To find where an element is to be inserted, use the hash function on its key.
If the key value is 214, the object is to be stored at index 14 of the array.
Insertion is done in Θ(1) time.

Consider the Big Picture

If we have millions of key values, it may take a long time to search a regular array or a linked list for a specific part number (on average, we might compare 500,000 key values).
The best comparison-based search algorithm gives O(lg n); lg 500,000 ≈ 19.
  1 comparison vs. about 19 comparisons.
Using a hash table, we simply have a function which provides us with the index of the array where the object containing the key is located.

Collisions
Consider the hash function
h( k ) = k % 100

A key value of 393 is used for an object, and the


object is stored at index 93
Then a key value of 193 is used for a second
object; the result of the hash function is 93, but
index 93 is already occupied
This is called a collision

23

Birthday paradox

[Figure: probability of two people sharing a birthday vs. group size]

Probability of having 2 people with the same birthday: as you can see, once Prob() > 0.5 it goes up very quickly.

Birthday paradox

  n(p, N) ≈ sqrt(2N · ln(1 / (1 − p)))

p = probability of a collision
N = maximum number of entries (slots) in the hash table
n = minimum number of entries needed to cause a collision with probability p

Let N = 500,000 and p = 0.5:
  n ≈ (2 × 500,000 × 0.693)^0.5 ≈ 833 entries

Thus, you must have a way to handle collisions unless you have (practically) infinite memory.

How are Collisions Resolved?

The most popular ways to resolve collisions are chaining, linear probing, and other open-addressing methods.
With chaining, instead of having an array of objects, we have an array of linked lists, each node of which contains an object.
An element is still inserted by using the hash function: the hash function provides the index of a linked list, and the element is inserted at the front of that (usually short) linked list.
When searching for an element, the hash function is used to get the correct linked list, then the linked list is searched for the key (still much faster than comparing 500,000 keys).
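
A minimal sketch of a chained hash table for integer keys (using std::list for the chains; this is illustrative, not the course's HashTable/LinkedList implementation):

#include <list>
#include <vector>
using namespace std;

// Chained hash table: an array (vector) of linked lists.
class HashTable {
    vector< list<int> > table;       // each slot holds a (usually short) chain
    int tableSize;
    int hash(int k) const { return k % tableSize; }
public:
    HashTable(int size) : table(size), tableSize(size) {}
    void insert(int key) {
        table[hash(key)].push_front(key);      // insert at the front of the chain: Theta(1)
    }
    bool find(int key) const {
        const list<int>& chain = table[hash(key)];
        for (list<int>::const_iterator it = chain.begin(); it != chain.end(); ++it)
            if (*it == key) return true;       // search only the one (short) chain
        return false;
    }
};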

Example Using Chaining


0
1

2
3

A hash table which is initially


empty.
Every element is a LinkedList
object. Only the start pointer
of the LinkedList object is
shown, which is set to NULL.

4
5

The hash function is:


h( k ) = k % 7
27

Example Using Chaining


(cont.)
0
1

2
3

INSERT object
with key 31

31 % 7 is 3

The hash function is:


h( k ) = k % 7
28

Example Using Chaining


(cont.)
0

Note: The whole object is stored


but only the key value is shown

2
3

4
5

31

INSERT object
with key 31

31 % 7 is 3
The hash function is:
h( k ) = k % 7
29

Example Using Chaining


(cont.)
0
1
2

31

INSERT object
with key 9

9 % 7 is 2

The hash function is:

h( k ) = k % 7
30

Example Using Chaining


(cont.)
0
1

36

31

4
5
6

INSERT object
with key 36

36 % 7 is 1
The hash function is:
h( k ) = k % 7
31

Example Using Chaining


(cont.)
0

42

36

31

4
5

INSERT object
with key 42

42 % 7 is 0
The hash function is:
h( k ) = k % 7
32

Example Using Chaining


(cont.)
0

42

36

31

INSERT object
with key 46

46

46 % 7 is 4

5
6

The hash function is:


h( k ) = k % 7
33

Example Using Chaining


(cont.)
0

42

36

31

INSERT object
with key 20

46

20 % 7 is 6

5
6

The hash function is:

20

h( k ) = k % 7
34

Example Using Chaining


(cont.)
0

42

36

31

INSERT object
with key 2

46

2 % 7 is 2

5
6

COLLISION occurs

The hash function is:

20

h( k ) = k % 7
35

Example Using Chaining


(cont.)
0

42

36

31

INSERT object
with key 2

46

2 % 7 is 2

5
6

But key 2 is just inserted in


the linked list

The hash function is:

20

h( k ) = k % 7
36

Example Using Chaining


(cont.)
0

42

36

31

INSERT object
with key 2

46

2 % 7 is 2

5
6

The insert function of LinkedList


inserts a new element at the
BEGINNING of the list

The hash function is:

20

h( k ) = k % 7
37

Example Using Chaining


(cont.)
0

42

36

31

INSERT object
with key 2

46

2 % 7 is 2

The hash function is:

20

h( k ) = k % 7
38

Example Using Chaining


(cont.)
0

42

36

24

31

46

INSERT object
with key 24

24 % 7 is 3
The hash function is:

20

h( k ) = k % 7
39

Example Using Chaining


(cont.)
0

42

36

24

31

46

**FIND** the
object with key 9

9 % 7 is 2
The hash function is:

20

h( k ) = k % 7
40

Example Using Chaining


(cont.)
0

42

36

24

31

46

5
6

We search this linked list for


the object with key 9
**FIND** the
object with key 9

9 % 7 is 2
The hash function is:

20

h( k ) = k % 7
41

Example Using Chaining


(cont.)
0

42

36

24

31

46

5
6

Remember, the whole object is stored; only the key is shown.
**FIND** the
object with key 9

9 % 7 is 2
The hash function is:

20

h( k ) = k % 7
42

Example Using Chaining


(cont.)
0

42

36

24

31

46

5
6

Does this object contain key 9?

**FIND** the
object with key 9

9 % 7 is 2
The hash function is:

20

h( k ) = k % 7
43

Example Using Chaining


(cont.)
0

42

36

24

31

46

5
6

Does this object contain key 9?


No, so go on to the next object.
**FIND** the
object with key 9

9 % 7 is 2
The hash function is:

20

h( k ) = k % 7
44

Example Using Chaining


(cont.)
0

42

36

24

31

46

5
6

Does this object contain key 9?

**FIND** the
object with key 9

9 % 7 is 2
The hash function is:

20

h( k ) = k % 7
45

Example Using Chaining


(cont.)
0

42

36

24

31

46

Does this object contain key 9?


YES, found it! Return the object.
**FIND** the
object with key 9

9 % 7 is 2
The hash function is:

20

h( k ) = k % 7
46

Uniform Hashing

When the elements are spread evenly (or nearly evenly) among the indexes of a hash table, it is called uniform hashing.
If elements are spread evenly, such that the number of elements at an index is less than some small constant, uniform hashing allows a search to be done in Θ(1) time.
The hash function largely determines whether or not we will have uniform hashing.

Ideal Hash Function


for Uniform Hashing
The hash table size should be a prime number
that is not too close to a power of 2
31 is a prime number but is too close to a power
of 2
97 is a prime number not too close to a power of
2
A good hash function might be:
h( k ) = k % 97
48

Chaining Problem

In nature there are clustering phenomena: some places are crowded and most places are empty.
Likewise, in a hash table most slots may hold no entry while some slots have long chains.
Worst case = O(n), where n is the maximum length of a chain.
Additional memory is required at run time for the chain nodes.

Collision Resolution: Linear Probing

If there is a collision, put the entry in the next free slot.
In the example above, there is a collision at index 5; indexes 6 and 7 are filled, so the item is put at index 8.
Knuth's parking problem: assume there are M indexes and the hash table is 50% full.
The average search time for an item that exists in the hash table is 3/2 probes (search hit): either we hash directly to the item, or we find it next to its hash position.
The average search time for an item that does not exist in the hash table is 5/2 probes (search miss): we probe the hash position, the item does not match, we probe the next item, it does not match, and then the next slot is empty.

Hashing Linear Probing Problem

When inserting 56, we hash to index 12, but we need to probe until index 19 (7 probes) before we can insert 56.

Linear Probing Improvement: Quadratic Probing

Another open-addressing strategy is known as quadratic probing.
Rather than always moving one cell ahead (linear probing) when a collision is encountered, this strategy moves j^2 cells from the point of collision, where j is the number of attempts made to resolve the collision.
The limitation of this strategy: it may not find an empty cell if the bucket array is at least half full.
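
A sketch of insertion with open addressing; the probe step marked below is linear probing, and using j*j instead gives quadratic probing. The EMPTY sentinel and the fixed-size vector are assumptions for illustration:

#include <vector>
using namespace std;

const int EMPTY = -1;                          // assumed marker for an unused slot

// Insert key into an open-addressing hash table; returns false if no free slot is found.
bool insertProbing(vector<int>& table, int key) {
    int M = table.size();
    int home = key % M;                        // hash value
    for (int j = 0; j < M; j++) {
        int idx = (home + j) % M;              // linear probing; (home + j * j) % M for quadratic
        if (table[idx] == EMPTY) {
            table[idx] = key;
            return true;
        }
    }
    return false;
}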

Chaining vs Linear Probing

Linear probing uses a fixed amount of memory; chaining requires extra memory for the linked lists.
Linear probing is better if the load factor is lower than about 0.85.
Linear probing is better suited for caching: when you load an item, the next item is usually loaded along with it.
Clustering happens for all data types, so linear probing search time can be very long if there are many collisions.

Speed vs. Memory Conservation

Speed comes from reducing the number of collisions.
In a search, if there are no collisions, the first element in the linked list is the one we want to find (fast).
Therefore, the greatest speed comes about by making a hash table much larger than the number of keys (but there will still be an occasional collision).

Speed vs.
Memory Conservation
(cont.)
Each empty LinkedList object in a hash table
wastes 4 bytes of memory (4 bytes for the start
pointer)
The best memory conservation comes from
trying to reduce the number of empty LinkedList
objects
The hash table size would be made much
smaller than the number of keys (there would
still be an occasional empty linked list)
56

Hash Table Design


Decide whether speed or memory
conservation is more important (and how
much more important) for the application
Come up with a good table size which
Allows for the use of a good hash function
Strikes the appropriate balance between
speed and memory conservation

57

Ideal Hash Tables


Can we have a hash function which guarantees
that there will be no collisions?
Yes:
h( k ) = k

Each key k is unique; therefore, each index


produced from h( k ) is unique
Consider 300 employees that have a 4 digit id
A hash table size of 10000 with the hash
function above guarantees the best possible
speed

58

Ideal Hash Tables


(cont.)
Should we use LinkedList objects if there are no
collisions?
Suppose each Employee object takes up 100 bytes
An array size of 10000 Employee objects with only 300
used indexes will have 9700 unused indexes, each
taking up 100 bytes
Best to use LinkedList objects (in this case) the 9700
unused indexes will only use 4 bytes each

59

Ideal Hash Tables


(cont.)
Can we have a hash table without any collisions
and without any empty linked lists?
Sometimes. Consider 300 employees with ids
from 0 to 299. We can make a hash table size
of 300, and use h( k ) = k
LinkedList objects wouldn't be necessary and, in fact, would waste space.
Array is the best for this ideal case

60

Time Complexities for Hash Table

insert — we insert at the head of the linked list: Θ(1)
retrieve — the element is found by hashing, so it is Θ(1) for uniform hashing (the hash function and hash table are designed so that the length of the collision list is bounded by some small constant)

Hash table issues

Hash tables are not cache friendly and are memory inefficient for very large data sets.
E.g., a router needs to hold approximately 500,000 prefixes; each prefix takes 8 bytes.
As noted, if a hash table is more than 50% full its performance may drop dramatically, so you keep it under 50% full.
You need a memory of 500,000 × 2 (for 50% load) × 8 bytes = 8,000,000 bytes = 8 MB just to store the whole table in the router memory for fast lookup.
This is acceptable for routers.

Hash table issue 2

E.g., say you have 50 billion URLs to store in your Internet cache server using hashing.
Each URL takes about 1 KB.
Assume 50% hash table capacity.
Total storage required is 50 billion × 1 KB × 2 (50%) = 10 billion KB = 10 terabytes (hard-disk territory).
You cannot have 10 TB of RAM but you can have a 10 TB hard disk; thus caching becomes an issue.
Each time you hash, you need to fetch the entry from the disk, and since the access pattern is random, this is slow.

Reference
Childs, J. S. (2008). Methods for Making
Data Structures. C++ Classes and Data
Structures. Prentice Hall.

64

Lec 03a
Binary Search Tree

Definition of Tree
A tree is a set of linked nodes, such that
there is one and only one path from a
unique node (called the root node) to
every other node in the tree.
A path exists from node A to node B if one
can follow a chain of pointers to travel
from node A to node B.

Paths
A set of linked nodes
D

F
A

E
B

There is one path from A to B


There is a path from D to B
There is also a second path
from D to B.

Paths (cont.)
D

F
A

E
B

There is no path from C to any


other node.
4

Cycles
There is no cycle (circle of pointers) in a
tree.
Any linked structure that has a cycle would
have more than one path from the root
node to another node.

Example of a Cycle
D
A

B
C

E
Cycle: C → D → B → E → C
6

Tree Cannot Have a


Cycle
D
A

B
C

E
2 paths exist from A to C:
1. A → C
2. A → C → D → B → E → C

Example of a Tree
root

In a tree, every
pair of linked
nodes have a
parent-child
relationship (the
parent is closer
to the root)

Example of a Tree
(cont.)
root

For example, C is a
parent of G

Example of a Tree
(cont.)
root

E and F are
children of D

10

Example of a Tree
(cont.)
root

The root node is the


only node that has no
parent.

11

Example of a Tree
(cont.)
root

Leaf nodes (or


leaves for short)
have no children.

12

Binary Trees

A binary tree is a tree in which each node


can only have up to two children

13

NOT a Binary Tree


root

C has 3 child nodes.


A

H
14

Example of a Binary Tree


The links in a tree
are often called
edges

root
A

H
15

Levels
root
A

level 0
level 1
level 2

level 3
J

The level of a node is the number of edges in the path


from the root node to this node
16

Full Binary Tree


root

In a full binary tree, each node has two children except for
the nodes on the last level, which are leaf nodes
17

Complete Binary Trees


A complete binary tree is a binary tree
that is either
a full binary tree
OR
a tree that would be a full binary tree but it is
missing the rightmost nodes on the last level

18

NOT a Complete Binary Trees


root

Missing non-rightmost
nodes on the last level
19

Complete Binary Trees


(cont.)
root

Missing rightmost
nodes on the last
level
20

Complete Binary Trees


(cont.)
root

A full binary tree is


also a complete binary
tree.

O
21

Binary Search Trees


A binary search tree is a binary tree that
allows us to search for values that can be
anywhere in the tree.
Usually, we search for a certain key value,
and once we find the node that contains it,
we retrieve the rest of the info at that
node.

22

Properties of
Binary Search Trees
A binary search tree does not have to be a
complete binary tree.
For any particular node,
the key in its left child (if any) is less than its
key.
the key in its right child (if any) is greater than
or equal to its key.

Left < Parent <= Right.


23

Binary Search Tree


Node
template <typename T>
struct BSTNode {
    T info;
    BSTNode<T> *left;
    BSTNode<T> *right;
};

The implementation
of a binary search
tree usually just
maintains a single
pointer in the private
section called root,
to point to the root
node.

24
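
A sketch of BST insertion using the BSTNode struct above (iterative, following the Left < Parent <= Right rule; the course's actual implementation may differ):

#include <cstddef>   // for NULL

// Assumes the BSTNode<T> struct defined above.
template <typename T>
void insert(BSTNode<T> *&root, const T &element) {
    BSTNode<T> *newNode = new BSTNode<T>;
    newNode->info = element;
    newNode->left = newNode->right = NULL;
    if (root == NULL) { root = newNode; return; }
    BSTNode<T> *cur = root;
    while (true) {
        if (element < cur->info) {                                    // go left: Left < Parent
            if (cur->left == NULL) { cur->left = newNode; return; }
            cur = cur->left;
        } else {                                                      // go right: Parent <= Right
            if (cur->right == NULL) { cur->right = newNode; return; }
            cur = cur->right;
        }
    }
}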

Inserting Nodes
Into a BST
root:
NULL
BST starts off empty

Objects that need to be inserted (only key values are


shown):
37, 2, 45, 48, 41, 29, 20, 30, 49, 7
25

Inserting Nodes
Into a BST (cont.)
root

37

37, 2, 45, 48, 41, 29, 20, 30, 49, 7


26

Inserting Nodes
Into a BST (cont.)
root

37

2 < 37, so insert 2 on the


left side of 37

2, 45, 48, 41, 29, 20, 30, 49, 7


27

Inserting Nodes
Into a BST (cont.)
root

37

2, 45, 48, 41, 29, 20, 30, 49, 7


28

Inserting Nodes
Into a BST (cont.)
root

37

45 > 37, so insert it at the right of 37

45, 48, 41, 29, 20, 30, 49, 7


29

Inserting Nodes
Into a BST (cont.)
root

37

45

45, 48, 41, 29, 20, 30, 49, 7


30

Inserting Nodes
Into a BST (cont.)
root

37
45

When comparing, we always


start at the root node
48 > 37, so look to the right
48, 41, 29, 20, 30, 49, 7
31

Inserting Nodes
Into a BST (cont.)
root

37
45

This time, there is a node already


to the right of the root node. We
then compare 48 to this node
48 > 45, and 45 has no right child,
so we insert 48 on the right of 45
48, 41, 29, 20, 30, 49, 7
32

Inserting Nodes
Into a BST (cont.)
root

37
45

48

48, 41, 29, 20, 30, 49, 7


33

Inserting Nodes
Into a BST (cont.)
root

37
45

2
41 > 37, so look to
the right

48

41 < 45, so look to


the left there is no
left child, so insert
41, 29, 20, 30, 49, 7
34

Inserting Nodes
Into a BST (cont.)
root

37
45

41

48

41, 29, 20, 30, 49, 7


35

Inserting Nodes
Into a BST (cont.)
root

37
45

29 < 37, left

41

48

29 > 2, right

29, 20, 30, 49, 7


36

Inserting Nodes
Into a BST (cont.)
root

37
45

29

41

48

29, 20, 30, 49, 7


37

Inserting Nodes
Into a BST (cont.)
root

37
45

20 < 37, left

29

41

48

20 > 2, right
20 < 29, left
20, 30, 49, 7
38

Inserting Nodes
Into a BST (cont.)
root

37
45

29

41

48

20
20, 30, 49, 7
39

Inserting Nodes
Into a BST (cont.)
root

37
45

29

20
30, 49, 7

41

48

30 < 37
30 > 2
30 > 29
40

Inserting Nodes
Into a BST (cont.)
root

37
45

29

20

41

48

30

30, 49, 7
41

Inserting Nodes
Into a BST (cont.)
root

37
45

29

20

48

41

30

49 > 37
49 > 45
49 > 48

49, 7
42

Inserting Nodes
Into a BST (cont.)
root

37
45

29

20

41

30

48

49

49, 7
43

Inserting Nodes
Into a BST (cont.)
root

37
45

7 < 37
7>2
7 < 29
7 < 20

29

20

41

30

48

49

7
44

Inserting Nodes
Into a BST (cont.)
root

37
45

29

20
7

41

30

48

49

7
45

Inserting Nodes
Into a BST (cont.)
root

37
45

29

20

48

41

30

All elements have


been inserted

49

7
46

Searching for a
Key in a BST
root

37
45

29

20
7

41

30

48

Searching for a
key in a BST uses
the same logic

49

Key to search for: 29


47

Searching for a
Key in a BST (cont.)
root

37
45

29 < 37

29

20
7

41

48

30

49
Key to search for: 29
48

Searching for a
Key in a BST (cont.)
root

37
45

29 > 2

29

20
7

41

48

30

49
Key to search for: 29
49

Searching for a
Key in a BST (cont.)
root

37
45

29 == 29

29

41

48

FOUND IT!
20
7

30

49
Key to search for: 29
50

Searching for a
Key in a BST (cont.)
root

37
45

29

20
7

41

48

30

49
Key to search for: 3
51

Searching for a
Key in a BST (cont.)
root

37
45

3 < 37
3>2
3 < 29

29

41

48

3 < 20
3<7
20
7

30

49
Key to search for: 3
52

Searching for a
Key in a BST (cont.)
root

37
45

2
When the child pointer
you want to follow is
set to NULL, the key
29
you are looking for is
not in the BST
20
7

41

48

30

49
Key to search for: 3
53
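
The same search logic as a C++ sketch (using the BSTNode struct from the earlier slide; returns NULL when a NULL child pointer would have to be followed):

#include <cstddef>   // for NULL

// Assumes the BSTNode<T> struct defined earlier.
template <typename T>
BSTNode<T>* find(BSTNode<T> *root, const T &key) {
    BSTNode<T> *cur = root;
    while (cur != NULL) {
        if (key == cur->info) return cur;                    // found it
        cur = (key < cur->info) ? cur->left : cur->right;    // go left or right
    }
    return NULL;                                             // key is not in the BST
}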

Time Complexities

If the binary search tree happens to be a complete binary tree:
  the time for insertion is Θ(lg n)
  the time for the search is O(lg n)
  (search in an unsorted array is O(n))
However, we could run into some bad luck...

Bad Luck
root

2
7
20

29
Exactly the same keys were
inserted into this BST but
they were inserted in a
different order (the order
shown below)

30
37
41
45

48
2, 7, 20, 29, 30, 37, 41, 45, 48, 49

49
55

Bad Luck (cont.)


root

2
7
20

29
This is some bad luck, but a
BST can be formed this way

30
37
41
45

48
2, 7, 20, 29, 30, 37, 41, 45, 48, 49

49
56

Bad Luck (cont.)


root

2
7
20

29
Using the tightest possible
big-oh notation, the insertion
and search time is O( n )

30
37
41
45

48
2, 7, 20, 29, 30, 37, 41, 45, 48, 49

49
57

Balanced vs. Unbalanced

If a BST takes Θ(lg n) time for insertion and O(lg n) time for a search, we say it is a balanced binary search tree.
If a BST takes O(n) time for insertion and searching, we say it is an unbalanced binary search tree.

Deleting a BST Node


Deleting a node in a BST is a little tricky
it has to be deleted so that the resulting
structure is still a BST with each node
greater than its left child and less than its
right child.
Deleting a node is handled differently
depending on whether the node:
has no children
has one child
has two children
59

Deletion Case 1:
No Children
root

37
45

29

20

48

41

30

Node 49 has no
children to
delete it, we just
remove it

49

60

Deletion Case 1:
No Children (cont.)
root

37
45

29

20

41

48

30

61

Deletion Case 2:
One Child
root

37
45

29

20

41

30

Node 48 has one


child to delete
it, we just splice
it out

48

49

62

Deletion Case 2:
One Child (cont.)
root

37
45

29

20

41

30

Node 48 has one


child to delete
it, we just splice
it out

48

49

63

Deletion Case 2:
One Child (cont.)
root

37
45

29

20

41

30

49

64

Deletion Case 2:
One Child (cont.)
root

37
45

29

20

41

30

48

Another example:
node 2 has one child
to delete it we also
splice it out

49

65

Deletion Case 2:
One Child (cont.)
root

37
45

29

20

41

30

48

Another example:
node 2 has one child
to delete it we also
splice it out

49

66

Deletion Case 2:
One Child (cont.)
root

37
45

29

20

41

30

48

49

67

Deletion Case 3:
Two Children
root

37
45

29

41

48

Node 37 has two


children
20

30

49

68

Deletion Case 3:
Two Children (cont.)
root

37
45

29

20

41

30

to delete it, first we


find the greatest
node in its left
subtree

48

49

69

Deletion Case 3:
Two Children (cont.)
root

37
45

2
First, we go
to the left
once, then
follow the
right pointers
as far as we
can

29

20

41

30

48

49

70

Deletion Case 3:
Two Children (cont.)
root

37
45

29

20

41

30

30 is the greatest
node in the left
subtree of node 37

48

49

71

Deletion Case 3:
Two Children (cont.)
root

37
45

29

20

41

30

Next, we copy the


object at node 30
into node 37

48

49

72

Deletion Case 3:
Two Children (cont.)
root

30
45

29

20

41
Finally, we delete
the lower red node
using case 1 or
case 2 deletion

48

49

73

Deletion Case 3:
Two Children (cont.)
root

30
45

29

20

41
Lets delete node 30
now

48

49

74

Deletion Case 3:
Two Children (cont.)
root

30
45

29

20

41
29 is the greatest node
in the left subtree of
node 30

48

49

75

Deletion Case 3:
Two Children (cont.)
root

30
45

29

20

41
Copy the object at node
29 into node 30

48

49

76

Deletion Case 3:
Two Children (cont.)
root

29
45

29

20

41
This time, the lower red
node has a child to delete
it we use case 2 deletion

48

49

77

Deletion Case 3:
Two Children (cont.)
root

29
45

29

20

41
This time, the lower red
node has a child to delete
it we use case 2 deletion

48

49

78

Deletion Case 3:
Two Children (cont.)
root

29
45

41

20

48

49

79

Deletion Time Complexity


In all cases, we must find the node we wish to
delete first, using the standard search method.
Finding the greatest node in the left subtree is
just a continuation of a path down the BST
For balanced BSTs, the time complexity for
deletion is O( lg n ) in all 3 cases
For unbalanced BSTs, the time complexity is
O( n ) in all 3 cases

80

Traversing a BST

There are 3 ways to traverse a BST (visit every node in the BST):
1. Preorder (parent → left → right): the root is output first
2. Inorder (left → parent → right): the output is sorted
3. Postorder (left → right → parent): the root is output last

Traversing a BST (cont.)

Based on the BST built earlier, the results of the traversals are:
1. Preorder (parent → left → right):
   37 2 29 20 7 30 45 41 48 49   (root first)
2. Inorder (left → parent → right):
   2 7 20 29 30 37 41 45 48 49   (sorted)
3. Postorder (left → right → parent):
   7 20 30 29 2 41 49 48 45 37   (root last)
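
A sketch of inorder traversal (left → parent → right); moving the "visit" line to the front or the back gives preorder and postorder respectively:

#include <iostream>
using namespace std;

// Assumes the BSTNode<T> struct defined earlier; prints the keys in sorted order.
template <typename T>
void inorder(BSTNode<T> *node) {
    if (node == NULL) return;
    inorder(node->left);            // left subtree
    cout << node->info << " ";      // visit the parent
    inorder(node->right);           // right subtree
}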

Time Complexities for BST

Balanced BST:
  Insertion is Θ(lg n)
  Search is O(lg n)
  Deletion is O(lg n)
Unbalanced BST:
  Insertion is O(n)
  Search is O(n)
  Deletion is O(n)

References
Childs, J. S. (2008). Trees. C++ Classes
and Data Structures. Prentice Hall.

84

Lecture 03b
Priority Queues, and Heaps

Priority Queue ADT


The data in a priority queue is
(conceptually) a queue of elements
The queue can be thought of as sorted
with the largest in front, and the smallest
at the end
Its physical form, however, may differ from
this conceptual view considerably

Priority Queue ADT


Operations
enqueue, an operation to add an element
to the queue
dequeue, an operation to take the largest
element from the queue
an operation to determine whether or not
the queue is empty
an operation to empty out the queue
3

Another Priority Queue


ADT
A priority queue might also be designed to
dequeue the element with the minimum value
Priority queues are generally not designed to
arbitrarily dequeue both minimum and maximum
values, whichever the client wants at any
particular time
We will only consider priority queues that
dequeue maximum values (can be easily
modified for dequeuing minimum values)
4

Priority Queue Implementation

To implement a priority queue, an array sorted in descending order comes to mind.
Dequeuing from a sorted array is easy: just get the value at the current front and increment a front index — this is Θ(1) time.
However, enqueuing into a sorted array would take some time: the element would have to be inserted into its proper position in the array.

Enqueuing an Element

211 204 201 81

79

70

69

67

49

50

51

52

48

7
98

5
99 100

In this array of elements, each element might be


an object, but only the data member considered
for maximum value is shown.

Enqueuing an Element
(cont.)
item: 71

211 204 201 81

79

70

69

67

49

50

51

52

48

7
98

5
99 100

Suppose that element 71 needs to be enqueued.


We could enqueue 71 by using a loop.

Enqueuing an Element
(cont.)
item: 71

211 204 201 81

79

70

69

67

49

50

51

52

48

7
98

5
99 100

for ( i = 100; i > 0 && arr[ i - 1 ] < item; i-- )


arr[ i ] = arr[ i - 1 ];
8

Enqueuing an Element
(cont.)
item: 71

211 204 201 81

79

70

69

67

49

50

51

52

48

7
98

99 100

for ( i = 100; i > 0 && arr[ i - 1 ] < item; i-- )


arr[ i ] = arr[ i 1 ];
9

Enqueuing an Element
(cont.)
item: 71

211 204 201 81

79

70

69

67

49

50

51

52

48

7
98

99 100

for ( i = 100; i > 0 && arr[ i - 1 ] < item; i-- )


arr[ i ] = arr[ i 1 ];
10

Enqueuing an Element
(cont.)
item: 71

211 204 201 81

79

70

69

67

49

50

51

52

48

7
98

99 100

for ( i = 100; i > 0 && arr[ i - 1 ] < item; i-- )


arr[ i ] = arr[ i 1 ];
11

Enqueuing an Element
(cont.)
item: 71

211 204 201 81

79

70

69

67

49

50

51

52

48

7
98

99 100

for ( i = 100; i > 0 && arr[ i - 1 ] < item; i-- )


arr[ i ] = arr[ i 1 ];

This process
continues and i
eventually
becomes 51
12

Enqueuing an Element
(cont.)
item: 71

211 204 201 81

79

70

69

69

49

50

51
i

52

48

for ( i = 100; i > 0 && arr[ i - 1 ] < item; i-- )


arr[ i ] = arr[ i 1 ];

20
98

99 100

This process
continues and i
eventually
becomes 51
13

Enqueuing an Element
(cont.)
item: 71

211 204 201 81

79

70

70

69

49

50

51
i

52

48

20
98

99 100

for ( i = 100; i > 0 && arr[ i - 1 ] < item; i-- )


arr[ i ] = arr[ i 1 ];
14

Enqueuing an Element
(cont.)
item: 71

211 204 201 81

79

70

70

69

49

50
i

51

52

48

20
98

99 100

for ( i = 100; i > 0 && arr[ i - 1 ] < item; i-- )


arr[ i ] = arr[ i 1 ];
15

Enqueuing an Element
(cont.)
item: 71

211 204 201 81

79

70

70

69

49

50
i

51

52

48

20
98

99 100

for ( i = 100; i > 0 && arr[ i - 1 ] < item; i-- ) FALSE


arr[ i ] = arr[ i 1 ];
16

Enqueuing an Element
(cont.)
item: 71

211 204 201 81

79

71

70

69

49

50
i

51

52

48

20
98

99 100

Now we can use:


arr[ i ] = item;
to enqueue the element
17

Enqueuing an Element
(cont.)
If we assume that, on average, half the elements in an array need to be shifted to insert an element, then the enqueue for an array is a Θ(n) algorithm.
In summary, when using an array for a priority queue:
  dequeue is Θ(1)
  enqueue (on average) is Θ(n)

Using a Heap to Implement a


Priority Queue
An alternative to using a sorted array for a
priority queue is to use a heap
Here, heap does not mean an area of
memory used for dynamic allocation
rather, it is a data structure
Enqueue in a heap is a O( lg n ) operation
Dequeue in a heap is a O( lg n ) operation
19
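
For comparison, the STL already provides a heap-based priority queue; a minimal usage sketch:

#include <iostream>
#include <queue>      // std::priority_queue is a maxheap by default
using namespace std;

int main() {
    priority_queue<int> pq;
    pq.push(29);                    // enqueue: O(lg n)
    pq.push(46);
    pq.push(32);
    while (!pq.empty()) {
        cout << pq.top() << " ";    // the largest element is always on top
        pq.pop();                   // dequeue the maximum: O(lg n)
    }
    cout << endl;                   // prints: 46 32 29
}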

Comparing Operations
So which is better, the heap or the array?
We often eliminate a data structure that has a
high time complexity in a commonly used
operation, even if the other operations have very
low time complexities
In the array, on average, an enqueue-dequeue pair of operations takes Θ(n) + Θ(1) time, but the Θ(1) is absorbed into the Θ(n), leaving us with an overall time complexity of Θ(n) per pair of operations.

Comparing Operations
(cont.)
In the heap, each enqueue-dequeue pair
of operations takes O( lg n ) + O( lg n )
time, giving us an overall time complexity
of O( lg n ) per pair of operations
The heap is usually better, although the
array can be good in situations where a
group of initial elements are supplied and
sorted, then only dequeue operations are
performed (no enqueue operations)
21

Heaps
A heap is a complete binary tree in which the
value of each node is greater than or equal to
the values of its children (if any)
Parent >= Children

Technically, this is called a maxheap


In a minheap, the value of each node is less
than or equal to the values of its children
A maxheap can be easily modified to make a
minheap
In our discussion, the word heap will refer to a
maxheap
22

Example of a Heap
root
46

39

28

16

14

32

29

24

Only the data member is shown


that we want to prioritize
23

Example of a Heap
(cont.)

root

46

39

28

16

14

32

29

24

Where is the greatest value


in a heap always found?
Root
24

Dequeue

Dequeuing the object with the greatest value appears to be a Θ(1) operation.
However, after removing the object, we must turn the resultant structure into a heap again, for the next dequeue.
Fortunately, it only takes O(lg n) time to turn the structure back into a heap again (which is why dequeue in a heap is an O(lg n) operation).

Dequeue (cont.)
root
46

39

28

16

14

32

29

15

24

26

Dequeue (cont.)
root

remElement: 46

46

39

28

16

14

32

29

15

24

Save the root object


in remElement

27

Dequeue (cont.)
root

remElement: 46

39

28

16

14

32

29

15

24

Copy object in last


node into root object

28

Dequeue (cont.)
root

remElement: 46

39

28

16

14

32

29

15

24

2
Remove last node

29

Dequeue (cont.)
root

remElement: 46

5
Greatest Child
39 > 5, so
swap

39

16

14

32

29

15

28

24

If the value of the greatest


child of 5 is greater than 5,
then swap with the greatest
child
30

Dequeue (cont.)
root

remElement: 46

39

28

16

14

32

29

15

24

Notice that 39 is correctly


placed it is guaranteed to
be greater than or equal to
its two children
31

Dequeue (cont.)
root

remElement: 46

39

28

16

14

32

29

15

24

It was the greatest child, so


it is greater than the other
child (28)

32

Dequeue (cont.)
root

remElement: 46

39

28

16

14

32

29

15

24

It is greater than the


element it was swapped
with (5), or else we wouldnt
have swapped.
33

Dequeue (cont.)
root

remElement: 46

39

28

16

14

32

29

15

24

At this point, we repeat the


process, using 5 as the
parent.

34

Dequeue (cont.)
root

remElement: 46

39

28

16

14

32

29

15

24

If the value of the greatest


child of 5 is greater than 5,
then swap with the greatest
child
35

Dequeue (cont.)
root

remElement: 46

39

28
Greatest
Child

16

14

32

29

15

24

If the value of the greatest


child of 5 is greater than 5,
then swap with the greatest
child
36

Dequeue (cont.)
root

remElement: 46

39

28
32 > 5, so
swap

16

14

32

29

15

24

If the value of the greatest


child of 5 is greater than 5,
then swap with the greatest
child
37

Dequeue (cont.)
root

remElement: 46

39

32

28

16

14

29

24

15

If the value of the greatest


child of 5 is greater than 5,
then swap with the greatest
child
38

Dequeue (cont.)
root

remElement: 46

39

32

28

16

14

Greatest Child

29

24

15

If the value of the greatest


child of 5 is greater than 5,
then swap with the greatest
child
39

Dequeue (cont.)
root

remElement: 46

39

32

28
29 > 5, so
swap
5

16

14

Greatest Child

29

24

15

If the value of the greatest


child of 5 is greater than 5,
then swap with the greatest
child
40

Dequeue (cont.)
root

remElement: 46

39

32

28

16

14

29

15

24

If the value of the greatest


child of 5 is greater than 5,
then swap with the greatest
child
41

Dequeue (cont.)
root

remElement: 46

39

32

28

16

14

29

15

24

The final result is a heap!

42

Dequeue (cont.)
root

remElement: 46

39

32

28

16

14

29

15

24

Sometimes, it is not
necessary to swap all the
way down through the
heap
43

Dequeue (cont.)
root

remElement: 46

39

32

28

16

14

29

15

24

If 5 would have been


greater than or equal to
both of its children, we
would stop there
44

Heapify
The process of swapping downwards to form a new heap is called heapifying.
When we heapify, it is important that the rest of the structure is already a heap, except for the root node that we are starting off with; otherwise, a new heap won't be formed.
A loop is used for heapifying; the number of times through the loop is always lg n or less, which gives the O(lg n) complexity.
Each time we swap downwards, the number of nodes we can travel to is reduced by approximately half.
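
A sketch of heapify (sift-down) on an array-based maxheap, using the child-index formulas discussed in the array-implementation slides that follow:

// Restore the heap property at index i, assuming both subtrees below i are already heaps.
void heapify(int heap[], int heapsize, int i) {
    while (true) {
        int left = 2 * i + 1, right = 2 * i + 2, greatest = i;
        if (left < heapsize && heap[left] > heap[greatest]) greatest = left;
        if (right < heapsize && heap[right] > heap[greatest]) greatest = right;
        if (greatest == i) break;      // parent >= both children: heap restored
        int tmp = heap[i]; heap[i] = heap[greatest]; heap[greatest] = tmp;   // swap downwards
        i = greatest;                  // continue from the child we swapped into
    }
}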

Enqueue
root
39

value to enqueue: 37

32

28

16

14

29

24

15

46

Enqueue (cont.)
root
39

value to enqueue: 37

32

28

16

14

29

15

24

2
Create a new node
in the last position

47

Enqueue (cont.)
root
39

value to enqueue: 37

32

28

16

14

29

15

24

37

2
Place the value to
enqueue in the last
node

48

Enqueue (cont.)
root
39

32

28

16

14

29

15

24

37

If 37 is larger than
its parent, swap

49

Enqueue (cont.)
root
39

32

28

37 > 24,
so swap
16

14

29

15

24

37

If 37 is larger than
its parent, swap

50

Enqueue (cont.)
root
39

32

28

16

14

37

29

15

24

If 37 is larger than
its parent, swap

51

Enqueue (cont.)
root
39
37 > 28,
so swap

32

16

14

37

29

15

24

28

If 37 is larger than
its parent, swap

52

Enqueue (cont.)
root
39

32

37

16

14

28

29

15

24

Notice that 37 is
guaranteed to be
greater than or
equal to its children
53

Enqueue (cont.)
root
39

32

37

16

14

28

29

15

24

It was greater than the


value swapped with
(28), or we wouldnt
have swapped
54

Enqueue (cont.)
root
39

32

37

16

14

28

29

15

24

and 28 was greater


than the other node
(2) or it wouldnt have
been a heap
55

Enqueue (cont.)
root
39

32

37

16

14

28

29

15

24

so 37 must be greater
than the other node
(2) as well.

56

Enqueue (cont.)
root
39

32

37

16

14

28

29

15

24

If 37 is larger than
its parent, swap

57

Enqueue (cont.)
root
39

37 < 39, so
dont swap

32

37

16

14

28

29

15

24

If 37 is larger than
its parent, swap

58

Enqueue (cont.)
root
39

32

37

16

14

28

29

15

24

The result is a heap!

59

Enqueue (cont.)
root
39

32

37

16

14

28

29

15

24

Enqueue uses a loop,


and it is a O( lg n )
operation (swapping
in reverse)
60

Implementing a Heap
Although it is helpful to think of a heap as a
linked structure when visualizing the enqueue
and dequeue operations, it is often implemented
with an array
Don't get mixed up between implementing a priority
queue (PQ) and implementing a heap
We have discussed earlier that to implement a
PQ, heap is better than array overall
We are discussing implementing a heap here
Let's number the nodes of a heap, starting with
0, going top to bottom and left to right
61

Implementing a Heap with Array


root

46
0

39

28

16

32

24

25

14

29

15

18

17

10

11

12

13

14

11

15

16
62

Heap Properties
root

46
0

39

28

16

32

24

25

14

29

15

18

17

10

11

12

13

14

11

15

16

(left child #) = 2 * (parent #) + 1

63

Heap Properties
(cont.)
root

46
0

39

28

16

32

24

25

14

29

15

18

17

10

11

12

13

14

11

15

16

(left child #) = 2 * (parent #) + 1

64

Heap Properties
(cont.)
root

46
0

39

28

16

32

24

25

14

29

15

18

17

10

11

12

13

14

11

15

16

(left child #) = 2 * (parent #) + 1

65

Heap Properties
(cont.)
root

46
0

39

28

16

32

24

25

14

29

15

18

17

10

11

12

13

14

11

15

16

(left child #) = 2 * (parent #) + 1

66

Heap Properties
(cont.)
root

46
0

39

28

16

32

24

25

14

29

15

18

17

10

11

12

13

14

11

15

16

(right child #) = 2 * (parent #) + 2

67

Heap Properties
(cont.)
root

46
0

39

28

16

32

24

25

14

29

15

18

17

10

11

12

13

14

11

15

16

(right child #) = 2 * (parent #) + 2

68

Heap Properties
(cont.)
root

46
0

39

28

16

32

24

25

14

29

15

18

17

10

11

12

13

14

11

15

16

(right child #) = 2 * (parent #) + 2

69

Heap Properties
(cont.)
root

46
0

39

28

16

32

24

25

14

29

15

18

17

10

11

12

13

14

11

15

16

(parent #) = (child # - 1) / 2
(using integer division)
70

Heap Properties
(cont.)
root

46
0

39

28

16

32

24

25

14

29

15

18

17

10

11

12

13

14

11

15

16

(parent #) = (child # - 1) / 2
(using integer division)
71

Heap Properties
(cont.)
root

46
0

39

28

16

32

24

25

14

29

15

18

17

10

11

12

13

14

11

15

16

(parent #) = (child # - 1) / 2
(using integer division)
72

Array Implementation of Heap


These remarkable properties of the heap
allow us to place the elements into an
array
The red numbers on the previous slide
correspond to the array indexes

73
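As a small sketch (the function names are made up for illustration), the numbering properties translate directly into index arithmetic in C++:

// Node numbers double as array indexes, so moving around the heap is
// plain integer arithmetic instead of pointer chasing.
inline int leftChild(int parent)  { return 2 * parent + 1; }
inline int rightChild(int parent) { return 2 * parent + 2; }
inline int parentOf(int child)    { return (child - 1) / 2; }  // integer division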

Array Implementation of Heap


(cont.)
46 39 28 16 32 24 25 14 3 29 15 5
0 1

7 18 17 11 9

9 10 11 12 13 14 15 16

So now, this is our heap. It has no linked


nodes, so it is much easier to work with.
Let's dequeue an element.

The highest value is stored in the root node


(index 0).
74

Array Implementation of Heap


(cont.)
46 39 28 16 32 24 25 14 3 29 15 5
0 1

remElement: 46

7 18 17 11 9

9 10 11 12 13 14 15 16

Now we need to move the object at


the last node to the root node
We need to keep track of the object
in the last position of the heap
using a heapsize variable
75

Array Implementation of Heap


(cont.)
46 39 28 16 32 24 25 14 3 29 15 5
0 1

remElement: 46
heapsize: 17

7 18 17 11 9

9 10 11 12 13 14 15 16

Now we need to move the object at


the last node to the root node
We need to keep track of the object
in the last position of the heap
using a heapsize variable
76

Array Implementation of Heap


(cont.)
46 39 28 16 32 24 25 14 3 29 15 5
0 1

remElement: 46
heapsize: 17

7 18 17 11 9

9 10 11 12 13 14 15 16

Now we can access the object


in the last node using
elements[ heapsize - 1 ] and
assign it to elements[ 0 ]

77

Array Implementation of Heap


(cont.)
9 39 28 16 32 24 25 14 3 29 15 5
0 1

remElement: 46
heapsize: 17

7 18 17 11 9

9 10 11 12 13 14 15 16

Now we can access the object


in the last node using
elements[ heapsize - 1 ] and
assign it to elements[ 0 ]

78

Array Implementation of Heap


(cont.)
9 39 28 16 32 24 25 14 3 29 15 5
0 1

remElement: 46

7 18 17 11 9

9 10 11 12 13 14 15 16

Next, decrement the heap size

heapsize: 17

79

Array Implementation of Heap


(cont.)
9 39 28 16 32 24 25 14 3 29 15 5
0 1

remElement: 46

7 18 17 11 9

9 10 11 12 13 14 15 16

Next, decrement the heap size

heapsize: 16

80

Array Implementation of Heap


(cont.)
9 39 28 16 32 24 25 14 3 29 15 5
0 1

remElement: 46
heapsize: 16

7 18 17 11 9

9 10 11 12 13 14 15 16

The value at index 16 can't


be used anymore; it will be
overwritten on the next
enqueue

81

Array Implementation of Heap


(cont.)
9 39 28 16 32 24 25 14 3 29 15 5
0 1

remElement: 46
heapsize: 16

7 18 17 11 9

9 10 11 12 13 14 15 16

Now, we need to find the


greatest child of node 0
and compare it to 9.
But how do we get the
greatest child?

82

Array Implementation of Heap


(cont.)
9 39 28 16 32 24 25 14 3 29 15 5
0 1

7 18 17 11 9

9 10 11 12 13 14 15 16

remElement: 46
heapsize: 16

By using the formulas we noted earlier


(this is why an array can be used)

83

Array Implementation of Heap


(cont.)
Greatest Child 39 > 9, so swap
9 39 28 16 32 24 25 14 3 29 15 5
0 1

remElement: 46
heapsize: 16

7 18 17 11 9

9 10 11 12 13 14 15 16

(left child #) =
2*(parent #) + 1 =
2*0+1=
1

(right child #) =
2*(parent #) + 2 =
2*0+2=
2

84

Array Implementation of Heap


(cont.)
39 9 28 16 32 24 25 14 3 29 15 5
0 1

remElement: 46
heapsize: 16

7 18 17 11 9

9 10 11 12 13 14 15 16

If the greatest child of 9 is


greater than 9, then swap

85

Array Implementation of Heap


(cont.)
Greatest Child 32 > 9, so swap
39 9 28 16 32 24 25 14 3 29 15 5
0 1

remElement: 46
heapsize: 16

7 18 17 11 9

9 10 11 12 13 14 15 16

(left child #) =
2*(parent #) + 1 =
2*1+1=
3

(right child #) =
2*(parent #) + 2 =
2*1+2=
4

86

Array Implementation of Heap


(cont.)
39 32 28 16 9 24 25 14 3 29 15 5
0 1

remElement: 46
heapsize: 16

7 18 17 11 9

9 10 11 12 13 14 15 16

If the greatest child of 9 is


greater than 9, then swap

87

Array Implementation of Heap


(cont.)
Greatest Child 29 > 9, so swap
39 32 28 16 9 24 25 14 3 29 15 5
0 1

remElement: 46
heapsize: 16

7 18 17 11 9

9 10 11 12 13 14 15 16

(left child #) =
2*(parent #) + 1 =
2*4+1=
9

(right child #) =
2*(parent #) + 2 =
2*4+2=
10

88

Array Implementation of Heap


(cont.)
39 32 28 16 29 24 25 14 3

9 15 5

0 1

9 10 11 12 13 14 15 16

remElement: 46
heapsize: 16

7 18 17 11 9

If the greatest child of 9 is


greater than 9, then swap

89

Array Implementation of Heap


(cont.)
39 32 28 16 29 24 25 14 3

9 15 5

0 1

9 10 11 12 13 14 15 16

remElement: 46
heapsize: 16

19 > heapsize

7 18 17 11 9

(left child #) =
2*(parent #) + 1 =
2*9+1=
19

90

Array Implementation of Heap


(cont.)
39 32 28 16 29 24 25 14 3

9 15 5

0 1

9 10 11 12 13 14 15 16
so 9 must be a leaf node

remElement: 46
heapsize: 16

(left child #) =
2*(parent #) + 1 =
2*9+1=
19

7 18 17 11 9

(we can stop)

91

Array Implementation of Heap


(cont.)
39 32 28 16 29 24 25 14 3

9 15 5

0 1

9 10 11 12 13 14 15 16

heapsize: 16

7 18 17 11 9

An enqueue is done by placing the


new element at elements[ heapsize ],
then swapping upwards

92

Array Implementation of Heap


(cont.)
39 32 28 16 29 24 25 14 3

9 15 5

0 1

9 10 11 12 13 14 15 16

heapsize: 16

7 18 17 11 9

When enqueuing, the parent is


always found by using the parent
formula:
(parent #) = (child # - 1 ) / 2
93

Reducing the Work


in a Swap
A swap (using simple assignments) would
normally involve three statements:
temp = elements[ i ];
elements[ i ] = elements[ j ];
elements[ j ] = temp;

94

Reducing the Work


in a Swap (cont.)
In an enqueue or a dequeue for a heap,
we do not really have to use this three-assignment
swap (although it helped to visualize how the
enqueue and dequeue process worked)
We can save the value we are swapping
upwards or downwards
Let's look at an enqueue
95
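A sketch of the enqueue shown on the following slides, assuming an array elements with heapsize used entries and enough spare capacity (the names are illustrative): the value being enqueued is held in a local variable, smaller parents are copied down with single assignments, and the value is written once at its final spot.

// Enqueue with one-assignment "swaps": instead of a three-statement swap
// at every level, copy each smaller parent down and place the new value
// once at the end.  Still O( lg n ), but with fewer assignments.
void enqueue(int elements[], int &heapsize, int value) {
    int i = heapsize++;                       // new node in the last position
    while (i > 0 && elements[(i - 1) / 2] < value) {
        elements[i] = elements[(i - 1) / 2];  // copy the parent down ("pretend" swap)
        i = (i - 1) / 2;                      // move up to the parent's slot
    }
    elements[i] = value;                      // the one final assignment
}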

Enqueue
root
39

value to enqueue: 37

32

28

16

14

29

24

15

96

Enqueue (cont.)
root
39

value to enqueue: 37

32

28

16

14

29

15

24

2
Create a new node
in the last position

97

Enqueue (cont.)
root
39

value to enqueue: 37

32

28

16

14

29

15

24

2
We just pretend
that 37 is here

98

Enqueue (cont.)
root
39

value to enqueue: 37

32

28

16

14

29

15

24

2
37 > 24, so we
pretend to swap
(we just copy 24)
99

Enqueue (cont.)
root

value to enqueue: 37

39

32

28

16

14

24

29

15

24

We now pretend 37
has been placed here
100

Enqueue (cont.)
root

value to enqueue: 37

39

32

28

16

14

24

29

15

24

and we compare 37
to 28 to see if we
should swap again
101

Enqueue (cont.)
root

value to enqueue: 37

39

32

28

16

14

24

29

15

24

37 > 28, so we do a
one-assignment
swap again
102

Enqueue (cont.)
root

value to enqueue: 37

39

32

28

16

14

28

29

15

24

2
We pretend 37
is here

103

Enqueue (cont.)
root

value to enqueue: 37

39

32

28

16

14

28

29

15

24

and compare 37 to
39 to see if we
should swap again
104

Enqueue (cont.)
root

value to enqueue: 37

39

32

28

16

14

28

29

15

24

This time we
shouldn't swap

105

Enqueue (cont.)
root

value to enqueue: 37

39

32

37

16

14

28

29

15

24

but we have one


final assignment

106

Enqueue (cont.)
root

value to enqueue: 37

39

32

37

16

14

28

29

15

24

(we can stop


pretending)

107

Dequeue
root
39

32

37

16

14

28

29

15

24

The same technique


can be used when
swapping downward
108

Parent-Child Formulas
Parent-child formulas can also be sped up:
(left child #) = (parent #) << 1 + 1
(right child #) = (left child #) + 1
when finding the greatest child, the left and right
children are always found together

(parent #) = (child # - 1) >> 1


using the shift operator is the same as integer
division
109
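A hedged note in code form (the helper names are mine): in C and C++ the shift operator binds more loosely than +, so the left-child formula needs parentheses or it computes the wrong value.

// Shift-based versions of the parent/child formulas.
// Note the parentheses: writing  parent << 1 + 1  would be parsed as
// parent << (1 + 1), i.e. parent << 2, which is not the left child.
inline int leftChildShift(int parent) { return (parent << 1) + 1; }
inline int rightChildShift(int left)  { return left + 1; }          // right child = left child + 1
inline int parentShift(int child)     { return (child - 1) >> 1; }  // same as integer division by 2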

Forming an Initial Heap


A heap of size n can be made by enqueuing n
elements one at a time, but it would take
O( n lg n ) time (even though n starts off
small)
A faster method can be used to make a heap out
of an initial array in Θ( n ) time
The heap will be shown as a linked tree,
because it is easier to visualize how the method
works, but in actuality, when we use the
method, we use it on the array

110
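A compact sketch of the Θ( n ) build, reusing the heapify (sift-down) routine sketched earlier in this lecture (array and function names are assumptions):

// Turn an arbitrary array into a max-heap in place.
// The last heapsize/2 nodes are leaves (one-node subheaps already), so we
// start at the parent of the last leaf, ((heapsize - 1) - 1) / 2, and
// heapify each node down to the root.
void buildHeap(int elements[], int heapsize) {
    for (int i = (heapsize - 2) / 2; i >= 0; --i)
        heapify(elements, heapsize, i);   // everything below i is already a heap
}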

Forming an Initial Heap


(cont.)
6 31 5 34 34 11 7

7 12 39 38 5 32 1 34 27 16

9 10 11 12 13 14 15 16

Suppose the initial array looks like this.

It is not initially a heap, but by rearranging the


elements, it will be.
We will look at the tree form of this to see why
the method works.
111

Forming an Initial Heap


(cont.)
root
6
0

31

34

34

11

12

39

38

32

34

10

11

12

13

14

27

16

15

16

Drawing it this way, we can easily see


that the initial array is not a heap
112

Forming an Initial Heap


(cont.)
root
6
0

31

34

34

11

12

39

38

32

34

10

11

12

13

14

27

16

15

16

We realize that nodes 8 through 16 are


subheaps of only one node.
113

Forming an Initial Heap


(cont.)
root
6
0

31

34

34

11

12

39

38

32

34

10

11

12

13

14

27

16

15

16

We want to heapify starting with the


parent of the last leaf node.
114

Forming an Initial Heap


(cont.)
root
6
0

31

34

34

11

12

39

38

32

34

10

11

12

13

14

27

16

15

16

To find the last leaf node, we use


heapsize - 1 (17 - 1 = 16)
115

Forming an Initial Heap


(cont.)
root
6
0

31

34

34

11

12

39

38

32

34

10

11

12

13

14

27

16

15

16

Then the parent of the last node is


(16 - 1) / 2 = 7
116

Forming an Initial Heap


(cont.)
root
6
0

31

34

34

11

12

39

38

32

34

10

11

12

13

14

27

16

15

16

Heapifying works only if the structure is a


heap except for possibly the root.
117

Forming an Initial Heap


(cont.)
root
6
0

31

34

34

11

12

39

38

32

34

10

11

12

13

14

27

16

15

16

This is true for the structure rooted at


index 7, so we can use heapify on it
118

Forming an Initial Heap


(cont.)
root
6
0

31

34

34

11

27

12

39

38

32

34

10

11

12

13

14

16

15

16

This is true for the structure rooted at


index 7, so we can use heapify on it
119

Forming an Initial Heap


(cont.)
root
6
0

31

34

34

11

27

12

39

38

32

34

10

11

12

13

14

16

15

16

Then we decrement index 7 and use


heapify again
120

Forming an Initial Heap


(cont.)
root
6
0

31

34

34

11

34

27

12

39

38

32

10

11

12

13

14

16

15

16

Then we decrement index 7 and use


heapify again
121

Forming an Initial Heap


(cont.)
root
6
0

31

34

34

11

34

27

12

39

38

32

10

11

12

13

14

16

15

16

This process continues until we heapify at


the root
122

Forming an Initial Heap


(cont.)
root
6
0

31

34

34

32

34

27

12

39

38

11

10

11

12

13

14

16

15

16
123

Forming an Initial Heap


(cont.)
root
6
0

31

34

39

32

34

27

12

34

38

11

10

11

12

13

14

16

15

16
124

Forming an Initial Heap


(cont.)
root
6
0

31

34

39

32

34

27

12

34

38

11

10

11

12

13

14

16

15

16

already a heap
125

Forming an Initial Heap


(cont.)
root
6
0

31

temp

34

39

32

34

27

12

34

38

11

10

11

12

13

14

16

15

16

one-assignment swaps
126

Forming an Initial Heap


(cont.)
root
6
0

31

34

34

39

32

27

12

34

38

11

10

11

12

13

14

16

15

16
127

Forming an Initial Heap


(cont.)
root
6
0

31

34

34

39

32

27

12

34

38

11

10

11

12

13

14

16

15

16

temp
128

Forming an Initial Heap


(cont.)
root
6
0

39

34

34

38

32

27

12

34

31

11

10

11

12

13

14

16

15

16
129

Forming an Initial Heap


(cont.)
root
6
0

39

34

34

38

32

27

12

34

31

11

10

11

12

13

14

16

15

16

temp
130

Forming an Initial Heap


(cont.)
root
39
0

38

34

34

34

32

27

12

31

11

10

11

12

13

14

16

15

16

The result is a heap


131

Linked Heap
If we have a large element size and we
don't know the max size, we might want to
consider conserving memory and avoiding
resizing the array by using a linked heap
Should have the same time complexities
as the array-based heap

132

Reference
Childs, J. S. (2008). Methods for Making
Data Structures. C++ Classes and Data
Structures. Prentice Hall.

133

Lecture 04a
Graphs

Graphs
Graphs are a very general data structure
A graph consists of a set of nodes called
vertices
Any vertex can point to any number of other
vertices
It is possible that no vertex in the graph points to
any other vertex
Each vertex may point to every other vertex,
even if there are thousands of vertices
2

A Directed Graph
(Digraph)
A

F
D

Each pointer is called


an edge

E
3

An Undirected Graph
A

Each edge
points in
both
directions
ex: A
points to D
and D
points to A

F
D
E

Another Digraph
A
F
B
Nodes A and F point
to each other; it is
considered
improper to draw a
single undirected
edge between them

C
E

Undirected Weighted Graph


Graphville
3
4

Node Town

8
4

Vertex City

4
Builders
Paradise

10

Pointerburgh

2
Binary
Tree Valley
6

Graph Implementation
A vertex A is said to be adjacent to a
vertex B if there is an edge pointing from A
to B
Graphs can be implemented in a couple of
popular ways:
adjacency matrix
adjacency list

Adjacency Matrix
Nodes are given a key from 0 to n - 1
If weight is not important, the adjacency
matrix is a 2-dimensional array of bool
(True or False) type variables
If weight is important, the adjacency matrix
is a 2-dimensional array of int type
variables

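A minimal way to declare the two variants in C++ (the size n and variable names are illustrative only):

#include <vector>

const int n = 6;   // number of vertices, keyed 0 .. n-1

// Weight not important: true means "there is an edge from row to column".
std::vector<std::vector<bool>> adjBool(n, std::vector<bool>(n, false));

// Weight important: store the weight, with a sentinel (here -1) meaning "no edge".
std::vector<std::vector<int>> adjWeight(n, std::vector<int>(n, -1));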
Undirected Weighted Graph


Node 0
3
4

Node 2

8
4

Node 1

10

Node 4

2
Node 3

Node 5
9

Adjacency Matrix (cont.)


0

Undirected
weighted graph
The row
numbers give
the vertex
number of the
vertex that an
edge is pointing
from

10

Adjacency Matrix (cont.)


0

Example:
Node 1 points
to Node 2 (set
to T)
Node 2 also
points to Node
1 for undirected
graph
11

Adjacency Matrix (cont.)


0

10

10

Weighted graph
- Unconnected

12

Directed Weighted Graph


Node 0
3
4

Node 2

8
4

Node 1

10

Node 4

2
Node 3

Node 5
13

Adjacency Matrix (cont.)


0

F
Directed graph

Example:
Node 0 points
to Node 1 (set
to T)
But Node 1
doesn't point to
Node 0 (set to
F)
14

Adjacency Matrix (cont.)


0

10

2
3

Directed
weighted graph
- Unconnected

15

Adjacency Matrix (cont.)


0

10

Note that we can


construct a graph
from the adjacency
matrix the
vertices may not
be drawn in the
same locations as
the original graph,
but the
connections will
be the same

16

Adjacency List
An adjacency list is an array of linked lists
The vertices are numbered 0 through
n - 1
Each index in the array corresponds to the
number of a vertex
The vertex (with an index number) is
adjacent to every node in the linked list at
that index
17
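One common C++ layout matching this description, shown as a sketch (the (vertex, weight) pair node type and the sample weights are assumptions, not taken from the figures):

#include <utility>
#include <vector>

const int numVertices = 6;

// adjacency list: index = vertex number, inner vector = the vertices it
// points to, stored with an edge weight for the weighted case.
std::vector<std::vector<std::pair<int,int>>> adjList(numVertices);

// Example: give vertex 1 a link to vertex 3 and a link to vertex 5.
void addSampleEdges() {
    adjList[1].push_back({3, 4});    // edge 1 -> 3 with (made-up) weight 4
    adjList[1].push_back({5, 10});   // edge 1 -> 5 with (made-up) weight 10
}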

Adjacency List (cont.)


0

Directed weighted
graph

3
4

5
18

Adjacency List (cont.)


0

3
4

Vertex 1 has a link


to vertex 3, and
vertex 1 also has a
link to vertex 5

5
19

Adjacency List (cont.)


0

3
4

Vertex 3 has an
empty linked list it
has no link to any
other vertices

5
20

Adjacency List for


Directed Weighted Graph
0

1 4

2 3

3 4

5 10

4 4

4 8

3
4

5 2

Red font is for vertex


numbers, black font
is for weights
21

Adjacency Matrix vs.


Adjacency List
The speed of the adjacency matrix or the
adjacency list depends on the algorithm
some algorithms need to know if there is a
direct connection between two specific
vertices the adjacency matrix would be
faster
some algorithms are written to process the
linked list of an adjacency list, node by node
the adjacency list would be faster
22

Adjacency Matrix vs.


Adjacency List (cont.)
When both seem equally fast for a certain
algorithm, then we consider memory
usage
We will consider the space complexity of
each implementation
Space complexities are like time
complexities, but they tell us the effects on
memory space usage as the problem size
is varied
23

Adjacency Matrix vs.


Adjacency List (cont.)
An array of vertex info is used for each
implementation, which is Θ( n )
In the adjacency matrix, each dimension is
n, so the spatial complexity is Θ( n² )

24

Adjacency Matrix vs.


Adjacency List (cont.)
In the adjacency list, there are n elements in the
array, giving a spatial complexity of Θ( n ) for the
array
Note that each edge corresponds to a linked list
node in the adjacency list
There can be no more than n( n - 1 ) edges in a
graph, so the spatial complexity for the linked list
nodes is O( n² )
the total spatial complexity for the adjacency list
is O( n² )
25

Adjacency Matrix vs.


Adjacency List (cont.)
Both of the spatial complexities for the
adjacency matrix and the adjacency list
absorb the Θ( n ) spatial complexity used
in the vertex info array
The spatial complexity of the adjacency list
really depends on whether the graph is
sparse or dense
26

Adjacency Matrix vs.


Adjacency List (cont.)
A sparse graph does not have that many
edges relative to the number of vertices
if the number of edges is less than or
equal to n, then the spatial complexity of
the adjacency list is Θ( n )
In a dense graph, there may be close to n²
edges, and then the spatial complexity of
the adjacency list is Θ( n² ), the same as
that for the adjacency matrix
27

Lecture 04b Graph Algorithms

Depth-first search
Breadth-First Search
Topological Sorting

SFO

LAX

Graphs

ORD

DFW

Graph
A graph is a pair (V, E), where

V is a set of nodes, called vertices


E is a collection of pairs of vertices, called edges
Vertices and edges are positions and store elements

Example:

A vertex represents an airport and stores the three-letter airport code


An edge represents a flight route between two airports and stores the
mileage of the route

PVD

ORD

SFO

LGA
HNL

LAX

DFW
Graphs

MIA
2

Edge Types
Directed edge

ordered pair of vertices (u,v)


first vertex u is the origin
second vertex v is the destination
e.g., a flight

ORD

flight
AA 1206

PVD

ORD

849
miles

PVD

Undirected edge

unordered pair of vertices (u,v)


e.g., a flight route

Directed graph

all the edges are directed


e.g., route network

Undirected graph

all the edges are undirected


e.g., flight network
Graphs

Applications
cslab1a

cslab1b

Electronic circuits

math.brown.edu

Printed circuit board


Integrated circuit

cs.brown.edu

Transportation networks

Highway network
Flight network

brown.edu
qwest.net
att.net

Computer networks

Local area network


Internet
Web

cox.net
John

Databases

Paul

David

Entity-relationship diagram
Graphs

Terminology
End vertices (or endpoints) of
an edge

U and V are the endpoints of a

Edges incident on a vertex

a, d, and b are incident on V

Adjacent vertices

U and V are adjacent

h
X

j
Z

i
g

h and i are parallel edges

Self-loop

X has degree 5

Parallel edges

Degree of a vertex

j is a self-loop

Graphs

Terminology (cont.)
Path

sequence of alternating
vertices and edges
begins with a vertex
ends with a vertex
each edge is preceded and
followed by its endpoints

Simple path

path such that all its vertices


and edges are distinct

Examples

P1=(V,b,X,h,Z) is a simple path


P2=(U,c,W,e,X,g,Y,f,W,d,V) is a
path that is not simple

Graphs

a
U
c

d
P2

P1
X

Terminology (cont.)
Cycle

circular sequence of alternating


vertices and edges
each edge is preceded and
followed by its endpoints

Simple cycle

cycle such that all its vertices


and edges are distinct

Examples

C1=(V,b,X,g,Y,f,W,c,U,a,) is a
simple cycle
C2=(U,c,W,e,X,g,Y,f,W,d,V,a,)
is a cycle that is not simple

Graphs

a
U
c

d
C2

C1
g

W
f

Properties
Property 1

Notation

Σv deg(v) = 2m

n
m
deg(v)

Proof: each edge is


counted twice

Property 2

number of vertices
number of edges
degree of vertex v

Example
n = 4
m = 6
deg(v) = 3

In an undirected graph
with no self-loops and
no multiple edges
m ≤ n (n - 1)/2
Proof: each vertex has
degree at most (n - 1)

What is the bound for a


directed graph?
Graphs

Asymptotic Performance
n vertices, m edges
no parallel edges
no self-loops
Bounds are big-Oh

                       Adjacency List         Adjacency Matrix
Space                  n + m                  n²
incidentEdges(v)       deg(v)                 n
areAdjacent(v, w)      min(deg(v), deg(w))    1
insertVertex(o)        1                      n²
insertEdge(v, w, o)    1                      1
removeVertex(v)        deg(v)                 n²
removeEdge(e)          1                      1

Graphs

Depth-First Search:
Outline and Reading
Depth-first search (6.3.1)

Algorithm
Example
Properties
Analysis

A
B

Graphs

10

Subgraphs
A subgraph S of a graph
G is a graph such that

The vertices of S are a


subset of the vertices of G
The edges of S are a
subset of the edges of G

Subgraph

A spanning subgraph of G
is a subgraph that
contains all the vertices
of G
Spanning subgraph
Graphs

11

Connectivity
A graph is
connected if there is
a path between
every pair of
vertices
A connected
component of a
graph G is a
maximal connected
subgraph of G
Graphs

Connected graph

Non connected graph with two


connected components
12

Trees and Forests


A (free) tree is an
undirected graph T such
that
T is connected
T has no cycles
This definition of tree is
different from the one of
a rooted tree

A forest is an undirected
graph without cycles
The connected
components of a forest
are trees
Graphs

Tree

Forest
13

Spanning Trees and Forests


A spanning tree of a
connected graph is a
spanning subgraph that is
a tree
A spanning tree is not
unique unless the graph is
a tree
Spanning trees have
applications to the design
of communication
networks
A spanning forest of a
graph is a spanning
subgraph that is a forest

Graph

Spanning tree
Graphs

14

Depth-First Search
Depth-first search (DFS)
is a general technique
for traversing a graph
A DFS traversal of a
graph G

Visits all the vertices and


edges of G
Determines whether G is
connected
Computes the connected
components of G
Computes a spanning
forest of G
Graphs

DFS on a graph with n


vertices and m edges
takes O(n + m ) time
DFS can be further
extended to solve other
graph problems

Find and report a path


between two given
vertices
Find a cycle in the graph

Depth-first search is to
graphs what Euler tour
is to binary trees
15

DFS Algorithm
The algorithm uses a mechanism
for setting and getting labels of
vertices and edges
Algorithm DFS(G)
Input graph G
Output labeling of the edges of G
as discovery edges and
back edges
for all u ∈ G.vertices()
setLabel(u, UNEXPLORED)
for all e ∈ G.edges()
setLabel(e, UNEXPLORED)
for all v ∈ G.vertices()
if getLabel(v) = UNEXPLORED
DFS(G, v)

Algorithm DFS(G, v)
Input graph G and a start vertex v of G
Output labeling of the edges of G
in the connected component of v
as discovery edges and back edges
setLabel(v, VISITED)
for all e ∈ G.incidentEdges(v)
if getLabel(e) = UNEXPLORED
w ← opposite(v,e)
if getLabel(w) = UNEXPLORED
setLabel(e, DISCOVERY)
DFS(G, w)
else
setLabel(e, BACK)

Graphs

16
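The pseudocode above maps onto a short recursive C++ sketch over a vertex-indexed adjacency list; edge labels are left out and only vertex visitation is tracked, which is a simplification of the algorithm above:

#include <vector>

// Recursive DFS.  visited plays the role of the UNEXPLORED/VISITED vertex
// labels; an edge leading to an already visited vertex corresponds to a
// BACK edge in the pseudocode.
void dfs(const std::vector<std::vector<int>>& adj,
         std::vector<bool>& visited, int v) {
    visited[v] = true;
    for (int w : adj[v])
        if (!visited[w])              // discovery edge: recurse
            dfs(adj, visited, w);
}

// Outer loop so every connected component is reached: O(n + m) overall.
void dfsAll(const std::vector<std::vector<int>>& adj) {
    std::vector<bool> visited(adj.size(), false);
    for (int v = 0; v < (int)adj.size(); ++v)
        if (!visited[v])
            dfs(adj, visited, v);
}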

Example
unexplored vertex
visited vertex
unexplored edge
discovery edge
back edge

A
B

A
B

A
D

C
Graphs

17

Example (cont.)
A
B

A
D

C
Graphs

18

DFS and Maze Traversal


The DFS algorithm is
similar to a classic
strategy for exploring
a maze

We mark each
intersection, corner
and dead end (vertex)
visited
We mark each corridor
(edge ) traversed
We keep track of the
path back to the
entrance (start vertex)
by means of a rope
(recursion stack)
Graphs

19

Properties of DFS
Property 1
DFS(G, v) visits all the
vertices and edges in
the connected
component of v

Property 2
The discovery edges
labeled by DFS(G, v)
form a spanning tree of
the connected
component of v

Graphs

20

Analysis of DFS
Setting/getting a vertex/edge label takes O(1) time
Each vertex is labeled twice

once as UNEXPLORED
once as VISITED

Each edge is labeled twice

once as UNEXPLORED
once as DISCOVERY or BACK

Method incidentEdges is called once for each vertex


DFS runs in O(n + m) time provided the graph is
represented by the adjacency list structure

Recall that

Σv deg(v) = 2m
Graphs

21

Breadth-First Search
L0
L1

L2

Graphs

C
E

D
F

22

Breadth-First Search
Breadth-first search
(BFS) is a general
technique for traversing
a graph
A BFS traversal of a
graph G

Visits all the vertices and


edges of G
Determines whether G is
connected
Computes the connected
components of G
Computes a spanning
forest of G
Graphs

BFS on a graph with n


vertices and m edges
takes O(n + m ) time
BFS can be further
extended to solve other
graph problems

Find and report a path


with the minimum
number of edges
between two given
vertices
Find a simple cycle, if
there is one
23

BFS Algorithm
The algorithm uses a
mechanism for setting and
getting labels of vertices
and edges
Algorithm BFS(G)
Input graph G
Output labeling of the edges
and partition of the
vertices of G
for all u ∈ G.vertices()
setLabel(u, UNEXPLORED)
for all e ∈ G.edges()
setLabel(e, UNEXPLORED)
for all v ∈ G.vertices()
if getLabel(v) = UNEXPLORED
BFS(G, v)

Algorithm BFS(G, s)
L0 ← new empty sequence
L0.insertLast(s)
setLabel(s, VISITED)
i ← 0
while ¬Li.isEmpty()
Li+1 ← new empty sequence
for all v ∈ Li.elements()
for all e ∈ G.incidentEdges(v)
if getLabel(e) = UNEXPLORED
w ← opposite(v,e)
if getLabel(w) = UNEXPLORED
setLabel(e, DISCOVERY)
setLabel(w, VISITED)
Li+1.insertLast(w)
else
setLabel(e, CROSS)
i ← i + 1
Graphs

24

Example
L0

unexplored vertex
visited vertex
unexplored edge
discovery edge
cross edge

L0
L1

L1

L0
C

L1

D
F

C
E

C
E

Graphs

D
F
25

Example (cont.)
L0
L1

L0

C
E

L0
L1

D
F

L0
C

L2

L2

L1

L1

Graphs

C
E

D
F

L2

C
E

D
F
26

Example (cont.)
L0

L1

L0
L1

L2

C
E

L1

L2

A
C
E

D
F

L2

L0

F
Graphs

27

Properties
Notation

Gs: connected component of s

Property 1

BFS(G, s) visits all the vertices and


edges of Gs

B
E

Property 2

The discovery edges labeled by


BFS(G, s) form a spanning tree Ts
of Gs

Property 3

L0
L1

For each vertex v in Li

The path of Ts from s to v has i


edges
Every path from s to v in Gs has at
least i edges
Graphs

C
F

L2

C
E

D
F
28

Analysis
Setting/getting a vertex/edge label takes O(1) time
Each vertex is labeled twice

once as UNEXPLORED
once as VISITED

Each edge is labeled twice

once as UNEXPLORED
once as DISCOVERY or CROSS

Each vertex is inserted once into a sequence Li


Method incidentEdges is called once for each vertex
BFS runs in O(n + m) time provided the graph is
represented by the adjacency list structure

Recall that

Σv deg(v) = 2m
Graphs

29

Applications
Using the template method pattern, we can
specialize the BFS traversal of a graph G to
solve the following problems in O(n + m) time

Compute the connected components of G


Compute a spanning forest of G
Find a simple cycle in G, or report that G is a
forest
Given two vertices of G, find a path in G between
them with the minimum number of edges, or
report that no such path exists
Graphs

30

DFS vs. BFS


Applications

DFS

BFS

Spanning forest, connected


components, paths, cycles
Shortest paths

Biconnected components

L0

A
B

C
E

L1

L2

DFS

A
C

BFS
Graphs

31

DFS vs. BFS (cont.)


Back edge (v,w)

Cross edge (v,w)

w is an ancestor of v in
the tree of discovery
edges

L0

A
B

w is in the same level as


v or in the next level in
the tree of discovery
edges

C
E

L1

L2

DFS

C
E

D
F

BFS
Graphs

32

DAGs and Topological Sorting


A directed acyclic graph (DAG) is a
digraph that has no directed cycles
A topological sorting/ordering of a
digraph is a numbering
v1, …, vn
of the vertices such that for every
edge (vi , vj), we have i < j
Example: in a task scheduling
digraph, a topological sorting is a
task sequence that satisfies the
v2
precedence constraints
Theorem
A digraph admits a topological
v1
sorting if and only if it is a DAG

B
C
DAG G

A
D
B
C
A

v4

v5

v3
Topological
ordering of G
33

Topological Sorting
Number vertices, so that (u,v) in E implies u < v
wake up

A typical student day


3

2
study computer sci.

eat
4

7
play

nap
8
write c.s. program

9
make cookies
for professors

5
more c.s.

6
work out

10
sleep

11
dream about graphs
34

Algorithm for Topological Sorting


Note: This algorithm is different than the
one in Goodrich-Tamassia (next slide)
Method TopologicalSort(G)
H ← G // Temporary copy of G
n ← G.numVertices()
while H is not empty do
Let v be a vertex with no outgoing edges
Label v with n
n ← n - 1
Remove v from H

Running time: O(n + m).


35
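A C++ sketch of this method; instead of literally deleting vertices from a copy H of the graph, it tracks remaining out-degrees and a list of vertices that currently have none (that bookkeeping choice is mine, not the slide's):

#include <queue>
#include <vector>

// Topological numbering: repeatedly take a vertex with no outgoing edges,
// label it n, n-1, ...  rev[v] lists the vertices with an edge INTO v, so
// removing v can lower their out-degrees.  Returns label[v] in 1..n.
std::vector<int> topologicalSort(const std::vector<std::vector<int>>& adj) {
    int n = adj.size();
    std::vector<std::vector<int>> rev(n);
    std::vector<int> outdeg(n), label(n, 0);
    for (int u = 0; u < n; ++u) {
        outdeg[u] = adj[u].size();
        for (int v : adj[u]) rev[v].push_back(u);
    }
    std::queue<int> ready;                       // vertices with no outgoing edges
    for (int v = 0; v < n; ++v)
        if (outdeg[v] == 0) ready.push(v);
    int next = n;
    while (!ready.empty()) {
        int v = ready.front(); ready.pop();
        label[v] = next--;                       // "Label v with n; n <- n - 1"
        for (int u : rev[v])                     // removing v may free a predecessor
            if (--outdeg[u] == 0) ready.push(u);
    }
    return label;                                // if next != 0, G was not a DAG
}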

Topological Sorting
Algorithm using DFS
Simulate the algorithm by using
depth-first search
Algorithm topologicalDFS(G)
Input dag G
Output topological ordering of G
n ← G.numVertices()
for all u ∈ G.vertices()
setLabel(u, UNEXPLORED)
for all e ∈ G.edges()
setLabel(e, UNEXPLORED)
for all v ∈ G.vertices()
if getLabel(v) = UNEXPLORED
topologicalDFS(G, v)
O(n+m) time.

Algorithm topologicalDFS(G, v)
Input graph G and a start vertex v of G
Output labeling of the vertices of G
in the connected component of v
setLabel(v, VISITED)
for all e ∈ G.incidentEdges(v)
if getLabel(e) = UNEXPLORED
w ← opposite(v,e)
if getLabel(w) = UNEXPLORED
setLabel(e, DISCOVERY)
topologicalDFS(G, w)
else
{e is a forward or cross edge}
Label v with topological number n
n ← n - 1
36

Topological Sorting Example

37

Topological Sorting Example

9
38

Topological Sorting Example

8
9
39

Topological Sorting Example

7
8
9
40

Topological Sorting Example

6
7
8
9
41

Topological Sorting Example

5
7

8
9
42

Topological Sorting Example

4
6

5
7

8
9
43

Topological Sorting Example


3
4
6

5
7

8
9
44

Topological Sorting Example


2
3
4
6

5
7

8
9
45

Topological Sorting Example


2

3
4
6

5
7

8
9
46

Lecture 05a
Selectionsort, Heapsort &
Quicksort

Sorting
Sorting is the process of placing elements in
order
If elements are objects, they are ordered by a
particular data member
There are many algorithms for sorting, each
having its own advantages
No sorting algorithm is better than every other
sorting algorithm all the time
Choosing the right sorting algorithm for a
problem requires careful consideration
2

Selectionsort
Idea: Find the largest number from the unsorted
numbers, place it at the correct location.
Array: 3 6 5 2 4
n=5
i = 4: 3 4 5 2 6
i = 3: 3 4 2 5 6
i = 2: 3 2 4 5 6
i = 1: 2 3 4 5 6 (sorted)
3

Selectionsort (cont.)
void selectionSort (int a[], int n) {
for (int i = n-1; i > 0; i--) {
// find the max element in the unsorted a[i .. n-1]
int maxIndex = i; // assume the max is the last element
// test against elements before i to find the largest
for (int j = 0; j < i; j++) {
// if this element is larger, then it is the new max
if (a[j] > a[maxIndex])
// found new max; remember its index
maxIndex = j;
}
// maxIndex is the index of the max element,
// swap it with the current position
if (maxIndex != i)
swap (a[i], a[maxIndex]);
}
}
4

Selectionsort Time
Complexity
By the same analysis as the
prefixAverages1 algorithm in Lecture 1
(the "for" loops of the 2 algorithms are
similar), Selectionsort runs in Θ( n² ) time

Heapsort
Algorithm Heapsort (S, n)
Input sequence S of n elements
Output sorted sequence S
pq ← priority queue (S)
for i ← n-1 to 0
pq.dequeue (S[i])

Create a heap-based priority


queue from S
6

Heapsort (cont.)
Algorithm Heapsort (S, n)
Input sequence S of n elements
Output sorted sequence S
pq priority queue (S)
for i n-1 to 0
pq.dequeue (S[i])

In the loop, we start with the last sequence


index
7

Heapsort (cont.)
Algorithm Heapsort (S, n)
Input sequence S of n elements
Output sorted sequence S
pq priority queue (S)
for i n-1 to 0
pq.dequeue (S[i])

On the first dequeue, the largest element from


the heap will be placed into the last position
of the sequence
8

Heapsort (cont.)
Algorithm Heapsort (S, n)
Input sequence S of n elements
Output sorted sequence S
pq priority queue (S)
for i n-1 to 0
pq.dequeue (S[i])

On the next dequeue (i is decremented), the


remaining highest value from the heap will be
placed in the next left sequence position
9

Heapsort (cont.)
Algorithm Heapsort (S, n)
Input sequence S of n elements
Output sorted sequence S
pq priority queue (S)
for i n-1 to 0
pq.dequeue (S[i])

The loop stops when the index becomes less


than 0
10

Heapsort (cont.)

S = 3 6 5 2 4: pq = 6 4 5 2 3
i = 4: pq = 5 4 3 2: S = 3 6 5 2 6
i = 3: pq = 4 2 3: S = 3 6 5 5 6
i = 2: pq = 3 2: S = 3 6 4 5 6
i = 1: pq = 2: S = 3 3 4 5 6
i = 0: pq = : S = 2 3 4 5 6 (sorted)

11

Heapsort (cont.)
The previous Heapsort algorithm requires
2 arrays to work. The second array is used
to create a heap (priority queue)
Instead of creating a heap separately, we
can actually create a heap on the original
array
Lec03b, section Forming an Initial Heap

12

Heapsort (cont.)

S=36524
Forming an Initial Heap
S/pq = 6 4 5 2 3 (S and pq are the array)
i = 4: S = 5 4 3 2 6 (pq = 5 4 3 2)
i = 3: S = 4 2 3 5 6 (pq = 4 3 2)
i = 2: S = 3 2 4 5 6 (pq = 3 2)
i = 1: S = 2 3 4 5 6 (pq = 2)
i = 0: S = 2 3 4 5 6 (sorted)
13

Heapsort Time
Complexity
Heapsort runs in Θ( n lg n ) time on
average
It has a best-case time of Θ( n ) if all
elements have the same value
This is an unusual case, but we may want to
consider it if many of the elements have the
same value

14

Quicksort
Time complexities of Quicksort
best: Θ( n lg n )
average: Θ( n lg n )
worst: Θ( n² )

The average case usually dominates over the


worst case in sorting algorithms, and in
Quicksort, the coefficient of the n lg n term is
small
makes Quicksort one of the most popular generalpurpose sorting algorithms
15

Functions of Quicksort
A recursive function, called quicksort
A nonrecursive function, usually called
partition

quicksort calls partition


16

Partition

pivot

10

14

28

10

35

46

47

38

11

11
28

In the partition function, the last element is


chosen as a pivot a special element used
for comparison purposes
Each element is compared to the pivot.
When partition is done, it produces the
result shown next
17

Partition (cont.)

pivot

10

11

14

28

10

35

46

47

38

11

10

11

14

28

10

11

28

38

47

35

46

28

All elements less than or equal to the pivot are on its left
18

Partition (cont.)

pivot

10

11

14

28

10

35

46

47

38

11

10

11

14

28

10

11

28

38

47

35

46

28

All elements greater than the pivot are on its right side
19

Partition (cont.)

pivot

10

11

14

28

10

35

46

47

38

11

10

11

14

28

10

11

28

38

47

35

46

28

The pivot is where it will be in the final sorted array


20

Partition (cont.)

pivot

10

11

14

28

10

35

46

47

38

11

10

11

14

28

10

11

28

38

47

35

46

28

Partition is called from quicksort more than once, but


when called again, it will work with a smaller section of
the array.
21

Partition (cont.)
0

10

11

14

28

10

11

28

38

47

35

46

The next time partition is called, for


example, it will only work with this
section to the left of the previous pivot.

22

Partition (cont.)
0

10

11

14

28

10

11

28

38

47

35

46

pivot

23

Partition (cont.)
0

10

11

10

14

11

28

28

38

47

35

46

pivot

24

Partition (cont.)
0

10

11

10

14

11

28

28

38

47

35

46

Each section of the array separated by a


previous pivot will eventually have partition
called for it

25

Partition (cont.)
0

10

11

10

14

11

28

28

38

47

35

46

Except that partition is not called for a section of


just one element it is already a pivot and it is
where it belongs

26

Partition (cont.)
0

10

11

10

14

11

28

28

38

47

35

46

Partition is called
for this section

27

Partition (cont.)
0

10

11

10

14

11

28

28

38

47

35

46

pivot

28

Partition (cont.)
0

10

11

10

14

11

28

28

38

47

35

46

All elements are smaller


and stay to the left of the
pivot

pivot

29

Partition (cont.)
0

10

11

10

14

11

28

28

38

47

35

46

Partition is called for


this section

30

Partition (cont.)
0

10

11

10

14

11

28

28

38

47

35

46

pivot

31

Partition (cont.)
0

10

11

10

11

14

28

28

38

47

35

46

pivot

32

Partition (cont.)
0

10

11

10

11

14

28

28

38

47

35

46

partition

33

Partition (cont.)
0

10

11

10

11

14

28

28

38

47

35

46

pivot

34

Partition (cont.)
0

10

11

10

11

14

28

28

38

47

35

46

pivot

35

Partition (cont.)
0

10

11

10

11

14

28

28

38

47

35

46

partition not
called for this

36

Partition (cont.)
0

10

11

10

11

14

28

28

38

47

35

46

partition not
called for this

37

Partition (cont.)
0

10

11

10

11

14

28

28

38

47

35

46

partition called
for this

38

Partition (cont.)
0

10

11

10

11

14

28

28

38

47

35

46

pivot

39

Partition (cont.)
0

10

11

10

11

14

28

28

38

35

46

47

pivot

40

Partition (cont.)
0

10

11

10

11

14

28

28

38

35

46

47

partition called
for this

41

Partition (cont.)
0

10

11

10

11

14

28

28

38

35

46

47

pivot

42

Partition (cont.)
0

10

11

10

11

14

28

28

35

38

46

47

pivot

43

Partition (cont.)
0

10

11

10

11

14

28

28

35

38

46

47

Partition not
called for this

44

Partition (cont.)
0

10

11

10

11

14

28

28

35

38

46

47

Partition not
called for this

45

Partition (cont.)
0

10

11

10

11

14

28

28

35

38

46

47

At this point, the


array is sorted

46

Partition (cont.)
0

10

11

10

11

14

28

28

35

38

46

47

But to achieve this, what steps does the


algorithm for partition go through?

47

Partition (cont.)

pivot

10

14

10

35

46

47

38

11

11
28

Partition has a loop which iterates through each


element, comparing it to the pivot

48

Partition (cont.)

pivot

10

14

28

10

35

46

47

38

11

partitioned section

11
28

unpartitioned section

When partition is in progress, there is a partitioned


section on the left, and an unpartitioned section on
the right

49

Partition (cont.)

pivot

10

14

28

10

35

46

47

38

11

partitioned section

11
28

unpartitioned section

The dividing line in the partitioned section means that


all elements to the left are less than or equal to the
pivot; all elements to the right are greater than the
pivot
50

Partition (cont.)

pivot

10

14

28

10

35

46

47

38

11

partitioned section

11
28

unpartitioned section

On each iteration, the partitioned section grows by


one and the unpartitioned section shrinks by one,
until the array is fully partitioned

51

Partition (cont.)

pivot

10

14

28

10

35

46

47

38

11

partitioned section

11
28

unpartitioned section

i is an index used to mark the last value in the small


side of the partition, while j is used to mark the first
value in the unpartitioned section.

52

Partition (cont.)
i

pivot

10

14

28

10

35

46

47

38

11

partitioned section

11
28

unpartitioned section

i is an index used to mark the last value in the small


side of the partition, while j is used to mark the first
value in the unpartitioned section.

53

Partition (cont.)
i

pivot

10

14

28

10

35

46

47

38

11

partitioned section

11
28

unpartitioned section

On each iteration of partition, the value at j is


compared to the pivot.

54

Partition (cont.)
i

pivot

10

14

28

10

35

46

47

38

11

partitioned section

11
28

unpartitioned section

On each iteration of partition, the value at j is


compared to the pivot.

55

Partition (cont.)
i

pivot

10

14

28

10

35

46

47

38

11

partitioned section

11
28

unpartitioned section

If the value at j is greater, j is just incremented (the


partitioned section grows by one)

56

Partition (cont.)
i

pivot

10

14

28

10

35

46

47

38

11

partitioned section

11
28

unpartitioned section

If the value at j is greater, j is just incremented (the


partitioned section grows by one)

57

Partition (cont.)
i

pivot

10

14

28

10

35

46

47

38

11

partitioned section

11
28

unpartitioned section

If, on an iteration, the value at j is less than the


pivot

58

Partition (cont.)
i

pivot

10

14

28

10

35

46

47

38

11

partitioned section

11
28

unpartitioned section

then i is incremented

59

Partition (cont.)
i

pivot

10

14

28

10

35

46

47

38

11

partitioned section

11
28

unpartitioned section

then i is incremented

60

Partition (cont.)
i

pivot

10

14

28

10

35

46

47

38

11

partitioned section

11
28

unpartitioned section

and the value at i is swapped with the value at j

61

Partition (cont.)
i

pivot

10

14

28

10

46

47

35

38

11

partitioned section

11
28

unpartitioned section

and the value at i is swapped with the value at j

62

Partition (cont.)
i

pivot

10

14

28

10

46

47

35

38

11

partitioned section

11
28

unpartitioned section

then j is incremented

63

Partition (cont.)
i

pivot

10

14

28

10

46

47

35

38

11

partitioned section

11
28

unpartitioned section

then j is incremented

64

Partition (cont.)
1
2
3
4
5
6
7
8

partition( arr, p, r )
i = p - 1
for all j from p to r - 1
if arr[ j ] <= arr[ r ]
i++
swap( arr[ i ], arr[ j ] )
swap( arr[ i + 1 ], arr[ r ] )
return i + 1

Partition starts off by passing in the array arr

65

Partition (cont.)
1
2
3
4
5
6
7
8

partition( arr, p, r )
i = p - 1
for all j from p to r - 1
if arr[ j ] <= arr[ r ]
i++
swap( arr[ i ], arr[ j ] )
swap( arr[ i + 1 ], arr[ r ] )
return i + 1

p: the beginning index of the array section it is working with

66

Partition (cont.)
1
2
3
4
5
6
7
8

partition( arr, p, r )
i = p - 1
for all j from p to r - 1
if arr[ j ] <= arr[ r ]
i++
swap( arr[ i ], arr[ j ] )
swap( arr[ i + 1 ], arr[ r ] )
return i + 1

r: the ending index of the array section it is working with

67

Partition (cont.)
1
2
3
4
5
6
7
8

partition( arr, p, r )
i = p - 1
for all j from p to r - 1
if arr[ j ] <= arr[ r ]
i++
swap( arr[ i ], arr[ j ] )
swap( arr[ i + 1 ], arr[ r ] )
return i + 1

i is used to mark the end of the small-side part of the partition

68

Partition (cont.)
1
2
3
4
5
6
7
8

partition( arr, p, r )
i = p - 1
for all j from p to r - 1
if arr[ j ] <= arr[ r ]
i++
swap( arr[ i ], arr[ j ] )
swap( arr[ i + 1 ], arr[ r ] )
return i + 1

initially, there isn't one, so i is set to p - 1 (-1 if p is 0)

69

Partition (cont.)
1
2
3
4
5
6
7
8

partition( arr, p, r )
i = p - 1
for all j from p to r - 1
if arr[ j ] <= arr[ r ]
i++
swap( arr[ i ], arr[ j ] )
swap( arr[ i + 1 ], arr[ r ] )
return i + 1

this loop iterates through the elements

70

Partition (cont.)
1
2
3
4
5
6
7
8

partition( arr, p, r )
i = p - 1
for all j from p to r - 1
if arr[ j ] <= arr[ r ]
i++
swap( arr[ i ], arr[ j ] )
swap( arr[ i + 1 ], arr[ r ] )
return i + 1

comparing them to the pivot arr[ r ]

71

Partition (cont.)
p

10

14

28

10

46

47

35

38

11

1
2

partition( arr, p, r )
i=p1

for all j from p to r 1

4
5
6
7
8

if arr[ j ] <= arr[ r ]


i++
swap( arr[ i ], arr[ j ] )
swap ( arr[ i + 1 ], arr[ r ] )
return i + 1

11
28

j is currently 7,
with partition
having already
iterated through
elements 0
through 6
72

Partition (cont.)
p

10

14

28

10

46

47

35

38

11

1
2
3

4
5
6
7
8

11
28

partition( arr, p, r )
i=p1
for all j from p to r 1

if arr[ j ] <= arr[ r ]


i++
swap( arr[ i ], arr[ j ] )
swap ( arr[ i + 1 ], arr[ r ] )
return i + 1

73

Partition (cont.)
p

10

14

28

10

46

47

35

38

11

1
2
3
4

5
6
7
8

11
28

partition( arr, p, r )
i=p1
for all j from p to r 1
if arr[ j ] <= arr[ r ]

i++
swap( arr[ i ], arr[ j ] )
swap ( arr[ i + 1 ], arr[ r ] )
return i + 1

74

Partition (cont.)
p

10

14

28

10

47

35

46

38

11

1
2
3
4
5

6
7
8

11
28

partition( arr, p, r )
i=p1
for all j from p to r 1
if arr[ j ] <= arr[ r ]
i++

swap( arr[ i ], arr[ j ] )


swap ( arr[ i + 1 ], arr[ r ] )
return i + 1

75

Partition (cont.)
p

10

14

28

10

47

35

46

38

11

1
2

partition( arr, p, r )
i=p1

for all j from p to r 1

4
5
6
7
8

if arr[ j ] <= arr[ r ]


i++
swap( arr[ i ], arr[ j ] )
swap ( arr[ i + 1 ], arr[ r ] )
return i + 1

11
28

76

Partition (cont.)
p

10

14

28

10

47

35

46

38

11

1
2
3

4
5
6
7
8

11
28

partition( arr, p, r )
i=p1
for all j from p to r 1

if arr[ j ] <= arr[ r ]


i++
swap( arr[ i ], arr[ j ] )
swap ( arr[ i + 1 ], arr[ r ] )
return i + 1

77

Partition (cont.)
p

10

14

28

10

47

35

46

38

11

1
2

partition( arr, p, r )
i=p1

for all j from p to r 1

4
5
6
7
8

if arr[ j ] <= arr[ r ]


i++
swap( arr[ i ], arr[ j ] )
swap ( arr[ i + 1 ], arr[ r ] )
return i + 1

11
28

78

Partition (cont.)
p

10

14

28

10

47

35

46

38

11

1
2
3

4
5
6
7
8

11
28

partition( arr, p, r )
i=p1
for all j from p to r 1

if arr[ j ] <= arr[ r ]


i++
swap( arr[ i ], arr[ j ] )
swap ( arr[ i + 1 ], arr[ r ] )
return i + 1

79

Partition (cont.)
p

10

14

28

10

47

35

46

38

11

1
2
3
4

5
6
7
8

11
28

partition( arr, p, r )
i=p1
for all j from p to r 1
if arr[ j ] <= arr[ r ]

i++
swap( arr[ i ], arr[ j ] )
swap ( arr[ i + 1 ], arr[ r ] )
return i + 1

80

Partition (cont.)
p

10

14

28

10

11

35

46

38

47

1
2
3
4
5

6
7
8

11
28

partition( arr, p, r )
i=p1
for all j from p to r 1
if arr[ j ] <= arr[ r ]
i++

swap( arr[ i ], arr[ j ] )


swap ( arr[ i + 1 ], arr[ r ] )
return i + 1

81

Partition (cont.)
p

r
11

10

14

28

10

11

35

46

38

47

1
2

partition( arr, p, r )
i=p1

for all j from p to r 1

4
5
6
7
8

if arr[ j ] <= arr[ r ]


i++
swap( arr[ i ], arr[ j ] )
swap ( arr[ i + 1 ], arr[ r ] )
return i + 1

28

82

Partition (cont.)
p

r
11

10

14

28

10

11

35

46

38

47

1
2
3

4
5
6
7
8

28

partition( arr, p, r )
i=p1
for all j from p to r 1

if arr[ j ] <= arr[ r ]


i++
swap( arr[ i ], arr[ j ] )
swap ( arr[ i + 1 ], arr[ r ] )
return i + 1

83

Partition (cont.)
p

r
11

10

14

28

10

11

35

46

38

47

1
2
3
4

5
6
7
8

28

partition( arr, p, r )
i=p1
for all j from p to r 1
if arr[ j ] <= arr[ r ]

i++
swap( arr[ i ], arr[ j ] )
swap ( arr[ i + 1 ], arr[ r ] )
return i + 1

84

Partition (cont.)
p

10

11

14

28

10

11

46

38

47

35

28

1
2
3
4
5

6
7
8

partition( arr, p, r )
i=p1
for all j from p to r 1
if arr[ j ] <= arr[ r ]
i++

swap( arr[ i ], arr[ j ] )


swap ( arr[ i + 1 ], arr[ r ] )
return i + 1

85

Partition (cont.)
p

10

11

14

28

10

11

46

38

47

35

28

1
2

partition( arr, p, r )
i=p1

for all j from p to r 1

4
5
6
7
8

if arr[ j ] <= arr[ r ]


i++
swap( arr[ i ], arr[ j ] )
swap ( arr[ i + 1 ], arr[ r ] )
return i + 1

j isn't incremented
past r - 1

86

Partition (cont.)
p

10

11

14

28

10

11

28

38

47

35

46

1
2
3
4
5
6

partition( arr, p, r )
i=p1
for all j from p to r 1
if arr[ j ] <= arr[ r ]
i++
swap( arr[ i ], arr[ j ] )

swap ( arr[ i + 1 ], arr[ r ] )

return i + 1

87

Partition (cont.)
p

10

11

14

28

10

11

28

38

47

35

46

1
2
3
4
5
6
7

partition( arr, p, r )
i=p1
for all j from p to r 1
if arr[ j ] <= arr[ r ]
i++
swap( arr[ i ], arr[ j ] )
swap ( arr[ i + 1 ], arr[ r ] )

return i + 1

The final step of


partition returns the
INDEX of the pivot
back to the quicksort
function that called it
88

The Quicksort
Function
1 quicksort( arr, p, r )
2 if p < r
3 pi = partition( arr, p, r )
4 quicksort( arr, p, pi - 1 )
5 quicksort( arr, pi + 1, r )

Since quicksort is
called recursively, it
also passes in the
array arr, the
beginning index p of
the section it is
working with, and the
ending index r of the
section it is working
with

89

The Quicksort
Function (cont.)
1 quicksort( arr, p, r )
2 if p < r
3 pi = partition( arr, p, r )
4 quicksort( arr, p, pi - 1 )
5 quicksort( arr, pi + 1, r )

If p < r, those same


indexes are used in
the call to partition (in
such a case, a call to
quicksort always
produces a matching
call to partition)

90

The Quicksort
Function (cont.)
1 quicksort( arr, p, r )
2 if p < r
3 pi = partition( arr, p, r )
4 quicksort( arr, p, pi - 1 )
5 quicksort( arr, pi + 1, r )

pi

The pivot index pi is


returned from partition

r
91

The Quicksort
Function (cont.)

pi-1

pi

1 quicksort( arr, p, r )
2 if p < r
3 pi = partition( arr, p, r )
4 quicksort( arr, p, pi - 1 )
5 quicksort( arr, pi + 1, r )

quicksort is called
recursively here,
working with the
section on the left of
the pivot
92

The Quicksort
Function (cont.)

pi+1

pi-1

pi

1 quicksort( arr, p, r )
2 if p < r
3 pi = partition( arr, p, r )
4 quicksort( arr, p, pi - 1 )
5 quicksort( arr, pi + 1, r )

and quicksort is called


recursively here,
working with the
section on the right of
the pivot
93

The Quicksort
Function (cont.)
1 quicksort( arr, p, r )
2 if p < r
3 pi = partition( arr, p, r )
4 quicksort( arr, p, pi - 1 )
5 quicksort( arr, pi + 1, r )

if the array sections


are at least 2 elements
in size, these quicksort
functions will call
partition for the same
array section it is
working with

94

The Quicksort
Function (cont.)
1 quicksort( arr, p, r )
2 if p < r
3 pi = partition( arr, p, r )
4 quicksort( arr, p, pi - 1 )
5 quicksort( arr, pi + 1, r )

and partition will break


the sections into even
smaller sections

95

The Quicksort
Function (cont.)
1 quicksort( arr, p, r )
2 if p < r
3 pi = partition( arr, p, r )
4 quicksort( arr, p, pi - 1 )
5 quicksort( arr, pi + 1, r )

The recursive case


occurs if p < r
(this means the array
section has at least
two elements, the one
at p and the one at r)

96

The Quicksort
Function (cont.)
1 quicksort( arr, p, r )
2 if p < r
3 pi = partition( arr, p, r )
4 quicksort( arr, p, pi - 1 )
5 quicksort( arr, pi + 1, r )

If the recursive case


occurs, a call to
quicksort will match a
call to partition

97

The Quicksort
Function (cont.)
1 quicksort( arr, p, r )
2 if p < r
3 pi = partition( arr, p, r )
4 quicksort( arr, p, pi - 1 )
5 quicksort( arr, pi + 1, r )

If p == r, the base case


occurs; there is only
one element in the
array section (recall
that partition is not
called for a section
that just has one
element)

98
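Putting the two functions together as runnable C++, a sketch that mirrors the pseudocode above (last element as the pivot, p and r inclusive indexes):

#include <utility>  // std::swap

// partition: the last element arr[r] is the pivot; returns its final index.
int partition(int arr[], int p, int r) {
    int i = p - 1;                        // end of the "small side" so far
    for (int j = p; j <= r - 1; ++j)      // j scans the unpartitioned section
        if (arr[j] <= arr[r]) {           // compare to the pivot
            ++i;
            std::swap(arr[i], arr[j]);
        }
    std::swap(arr[i + 1], arr[r]);        // drop the pivot between the two sides
    return i + 1;
}

// quicksort: the recursive case runs only when the section has >= 2 elements.
void quicksort(int arr[], int p, int r) {
    if (p < r) {
        int pi = partition(arr, p, r);
        quicksort(arr, p, pi - 1);        // section left of the pivot
        quicksort(arr, pi + 1, r);        // section right of the pivot
    }
}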

Lecture 05b Sorting and


Selection Algorithms
7 29 4 2 4 7 9
72 2 7
77

22

94 4 9
99

44

Learning Outcome
Merge-sort (4.1.1)
Summary of Sorting Algorithms
Radix-Sort (4.5.2)
Quick-Select ( 4.7)

Merge-Sort
Merge-sort is a sorting algorithm based on the
divide-and-conquer paradigm
Like heap-sort

It has O(n log n) running time

Unlike heap-sort

It does not use a priority queue


It accesses data in a sequential manner (suitable to sort
data on a disk)

Merge-Sort (cont.)
Merge-sort on an input sequence S with n elements
consists of three steps:

Divide: partition S into two sequences S1 and S2 of about n/2


elements each
Recur: recursively sort S1 and S2
Conquer: merge S1 and S2 into a unique sorted sequence

Algorithm Merge-Sort
Input Parameters: array a, start index p, end index r.
Output Parameter: array a.
Mergesort (a, p, r) {
// if only one element, just return.
if (p == r)
return
// Divide: divide a into two nearly equal parts.
m = (p + r) / 2
// Recur: sort each half.
Mergesort (a, p, m)
Mergesort (a, m + 1, r)
// Conquer: merge the two sorted halves.
Merge (a, p, m, r)
}
5

Merging Two Sorted Sequences


The conquer step of
merge-sort consists
of merging two
sorted sequences A
and B into a sorted
sequence S
containing the union
of the elements of A
and B
Merging two sorted
sequences, each
with n/2 elements
and implemented by
means of a doubly
linked list, takes
O(n) time

Algorithm Merge (a, p, m, r)


Input sequences A = a[p..m] and
B = a[m+1..r]
Output sorted sequence a[p..r]
S ← empty sequence
while ¬A.isEmpty() ∧ ¬B.isEmpty()
if A.first().element() < B.first().element()
S.insertLast(A.remove(A.first()))
else
S.insertLast(B.remove(B.first()))
while ¬A.isEmpty()
S.insertLast(A.remove(A.first()))
while ¬B.isEmpty()
S.insertLast(B.remove(B.first()))
return S
6
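An array-based C++ sketch of the same merge and the surrounding merge-sort; a temporary buffer replaces the sequence operations above (that substitution is mine, for brevity):

#include <vector>

// Merge the sorted halves a[p..m] and a[m+1..r] back into a[p..r]: O(n) time.
void merge(std::vector<int>& a, int p, int m, int r) {
    std::vector<int> s;                  // the merged output sequence
    s.reserve(r - p + 1);
    int i = p, j = m + 1;
    while (i <= m && j <= r)             // both halves still non-empty
        s.push_back(a[i] < a[j] ? a[i++] : a[j++]);
    while (i <= m) s.push_back(a[i++]);  // drain whichever half remains
    while (j <= r) s.push_back(a[j++]);
    for (int k = 0; k < (int)s.size(); ++k)
        a[p + k] = s[k];                 // copy back into place
}

void mergesort(std::vector<int>& a, int p, int r) {
    if (p >= r) return;                  // zero or one element: already sorted
    int m = (p + r) / 2;                 // Divide
    mergesort(a, p, m);                  // Recur on each half
    mergesort(a, m + 1, r);
    merge(a, p, m, r);                   // Conquer
}

// usage: mergesort(a, 0, (int)a.size() - 1);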

Merge-Sort Tree
An execution of merge-sort is depicted by a binary tree

each node represents a recursive call of merge-sort and stores


unsorted sequence before the execution and its partition
sorted sequence at the end of the execution

the root is the initial call


the leaves are calls on subsequences of size 0 or 1

7 2
7

9 4 2 4 7 9

2 2 7

77

22

4 4 9

99

44
7

Execution Example
Partition
7 2 9 43 8 6 1 1 2 3 4 6 7 8 9

7 2 9 4 2 4 7 9

7 2 2 7

77

22

3 8 6 1 1 3 8 6

9 4 4 9

99

44

3 8 3 8

33

88

6 1 1 6

66

11
8

Execution Example (cont.)


Recursive call, partition
7 2 9 43 8 6 1 1 2 3 4 6 7 8 9

7 29 4 2 4 7 9

7 2 2 7

77

22

3 8 6 1 1 3 8 6

9 4 4 9

99

44

3 8 3 8

33

88

6 1 1 6

66

11
9

Execution Example (cont.)


Recursive call, partition
7 2 9 43 8 6 1 1 2 3 4 6 7 8 9

7 29 4 2 4 7 9

722 7

77

22

3 8 6 1 1 3 8 6

9 4 4 9

99

44

3 8 3 8

33

88

6 1 1 6

66

11
10

Execution Example (cont.)


Recursive call, base case
7 2 9 43 8 6 1 1 2 3 4 6 7 8 9

7 29 4 2 4 7 9

722 7

77

22

3 8 6 1 1 3 8 6

9 4 4 9

99

44

3 8 3 8

33

88

6 1 1 6

66

11
11

Execution Example (cont.)


Recursive call, base case
7 2 9 43 8 6 1 1 2 3 4 6 7 8 9

7 29 4 2 4 7 9

722 7

77

22

3 8 6 1 1 3 8 6

9 4 4 9

99

44

3 8 3 8

33

88

6 1 1 6

66

11
12

Execution Example (cont.)


Merge
7 2 9 43 8 6 1 1 2 3 4 6 7 8 9

7 29 4 2 4 7 9

722 7

77

22

3 8 6 1 1 3 8 6

9 4 4 9

99

44

3 8 3 8

33

88

6 1 1 6

66

11
13

Execution Example (cont.)


Recursive call, , base case, merge
7 2 9 43 8 6 1 1 2 3 4 6 7 8 9

7 29 4 2 4 7 9

722 7

77

22

3 8 6 1 1 3 8 6

9 4 4 9

99

44

3 8 3 8

33

88

6 1 1 6

66

11
14

Execution Example (cont.)


Merge
7 2 9 43 8 6 1 1 2 3 4 6 7 8 9

7 29 4 2 4 7 9

722 7

77

22

3 8 6 1 1 3 8 6

9 4 4 9

99

44

3 8 3 8

33

88

6 1 1 6

66

11
15

Execution Example (cont.)


Recursive call, , merge, merge
7 2 9 43 8 6 1 1 2 3 4 6 7 8 9

7 29 4 2 4 7 9

722 7

77

22

3 8 6 1 1 3 6 8

9 4 4 9

99

44

3 8 3 8

33

88

6 1 1 6

66

11
16

Execution Example (cont.)


Merge
7 2 9 43 8 6 1 1 2 3 4 6 7 8 9

7 29 4 2 4 7 9

722 7

77

22

3 8 6 1 1 3 6 8

9 4 4 9

99

44

3 8 3 8

33

88

6 1 1 6

66

11
17

Analysis of Merge-Sort
The height h of the merge-sort tree is O(log n)
at each recursive call we divide the sequence in half
The overall amount of work done at the nodes of depth i is O(n)
we partition and merge 2^i sequences of size n/2^i
we make 2^(i+1) recursive calls
Thus, the total running time of merge-sort is O(n log n)


depth   #seqs   size
0       1       n
1       2       n/2
i       2^i     n/2^i

18

Summary of Sorting Algorithms


Algorithm        Time                           Notes

selection-sort   O(n²)                          in-place
                                                slow (good for small inputs)

quick-sort       O(n log n);                    in-place, randomized
                 O(n²) for worst case if        fastest (good for large inputs)
                 not randomized

heap-sort        O(n log n)                     in-place
                                                fast (good for large inputs)

merge-sort       O(n log n)                     sequential data access
                                                fast (good for huge inputs)
19

Bucket-Sort and Radix-Sort


1, c
B

3, a

3, b

7, d

7, g

7, e

0 1 2 3 4 5 6 7 8 9

20

Bucket-Sort (4.5.1)
Let S be a sequence of n
(key, element) items with keys
in the range [0, N - 1]
Bucket-sort uses the keys as
indices into an auxiliary array B
of sequences (buckets)
Phase 1: Empty sequence S by
moving each item (k, o) into its
bucket B[k]
Phase 2: For i = 0, …, N - 1, move
the items of bucket B[i] to the
end of sequence S

Analysis:

Phase 1 takes O(n) time


Phase 2 takes O(n + N) time

Bucket-sort takes O(n + N) time

Algorithm bucketSort(S, N)
Input sequence S of (key, element)
items with keys in the range
[0, N - 1]
Output sequence S sorted by
increasing keys
B ← array of N empty sequences
while ¬S.isEmpty()
f ← S.first()
(k, o) ← S.remove(f)
B[k].insertLast((k, o))
for i ← 0 to N - 1
while ¬B[i].isEmpty()
f ← B[i].first()
(k, o) ← B[i].remove(f)
S.insertLast((k, o))
21
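A compact C++ sketch of the two phases for (key, element) items with keys in [0, N - 1] (the item type here is an assumption):

#include <list>
#include <string>
#include <utility>
#include <vector>

using Item = std::pair<int, std::string>;    // (key, element)

// Stable bucket sort by key: Phase 1 distributes items into buckets,
// Phase 2 concatenates the buckets back.  O(n + N) time.
void bucketSort(std::list<Item>& S, int N) {
    std::vector<std::list<Item>> B(N);       // N empty buckets
    while (!S.empty()) {                     // Phase 1
        Item f = S.front(); S.pop_front();
        B[f.first].push_back(f);
    }
    for (int i = 0; i < N; ++i)              // Phase 2
        while (!B[i].empty()) {
            S.push_back(B[i].front());
            B[i].pop_front();
        }
}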

Example
Key range [0, 9]
Input:   (7,d), (1,c), (3,a), (7,g), (3,b), (7,e)
Phase 1: B[1] = (1,c);  B[3] = (3,a), (3,b);  B[7] = (7,d), (7,g), (7,e)
Phase 2: (1,c), (3,a), (3,b), (7,d), (7,g), (7,e)
22

Radix-Sort (4.5.2)
Radix-sort is a
specialization of
lexicographic-sort that
uses bucket-sort as the
stable sorting algorithm
in each dimension
Radix-sort is applicable
to tuples where the
keys in each dimension i
are integers in the
range [0, N - 1]
Radix-sort runs in time
O(d(n + N))

Algorithm radixSort(S, N)
Input sequence S of d-tuples such
that (0, ..., 0) ≤ (x1, ..., xd) and
(x1, ..., xd) ≤ (N - 1, ..., N - 1)
for each tuple (x1, ..., xd) in S
Output sequence S sorted in
lexicographic order
for i ← d downto 1
  bucketSort(S, N)  { stably, keyed on the i-th coordinate }
23

Radix-Sort for
Binary Numbers
Consider a sequence of n
b-bit integers
x = x_(b-1) ... x_1 x_0
We represent each element
as a b-tuple of integers in
the range [0, 1] and apply
radix-sort with N = 2
This application of the
radix-sort algorithm runs in
O(bn) time
For example, we can sort a
sequence of 32-bit integers
in linear time

Algorithm binaryRadixSort(S)
Input sequence S of b-bit
integers
Output sequence S sorted
replace each element x
of S with the item (0, x)
for i ← 0 to b - 1
  replace the key k of
  each item (k, x) of S
  with bit x_i of x
  bucketSort(S, 2)
24
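A self-contained Python sketch of the binary (LSD) radix-sort idea, using one stable two-bucket pass per bit (function name is illustrative; assumes non-negative inputs that fit in b bits):

def binary_radix_sort(nums, b):
    """LSD radix-sort of b-bit non-negative integers."""
    for i in range(b):                                   # least significant bit first
        zeros = [x for x in nums if not (x >> i) & 1]    # bucket for bit value 0
        ones  = [x for x in nums if (x >> i) & 1]        # bucket for bit value 1
        nums = zeros + ones                              # stable: order kept inside buckets
    return nums

print(binary_radix_sort([0b1001, 0b0010, 0b1101, 0b0001, 0b1110], 4))
# [1, 2, 9, 13, 14]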

Example
Sorting a sequence of 4-bit integers
[Figure: five columns trace the sequence 1001, 0010, 1101, 0001, 1110 through the bit-wise passes; the final column is in sorted order 0001, 0010, 1001, 1101, 1110]
25

Selection: The Selection Problem


Given an integer k and n elements x1, x2, , xn,
taken from a total order, find the k-th smallest
element in this set.
Of course, we can sort the set in O(n log n) time
and then index the k-th element.
k = 3

7 4 9 6 2 → 2 4 6 7 9 (the 3rd smallest element is 6)

Can we solve the selection problem faster? Let's
say, in O(n) time.
26

Quick-Select (4.7)
Quick-select is a randomized
selection algorithm based on
the prune-and-search
paradigm:

Prune: pick a random element x
(called the pivot) and partition S into

L  elements less than x
E  elements equal to x
G  elements greater than x

Search: depending on k, either the
answer is in E, or we need to
recurse in either L or G

k ≤ |L|:              recurse on L
|L| < k ≤ |L| + |E|:  the answer is the pivot x (done)
k > |L| + |E|:        recurse on G with k ← k - |L| - |E|
27

Algorithm Quick-Select
Input Parameters: array a, start index p, end index r,
target index k.
Output Parameter: a[k] at the correct position.
QuickSelect (a, p, r, k) {
if (p < r) {
pi = Partition (a, p, r) // pivot index.
if (k == pi)
return
if (k < pi)
QuickSelect (a, p, pi - 1, k)
else
QuickSelect (a, pi + 1, r, k)
}
}
28

Partition
The partition step of Quick-Select is the same
partition in Quick-Sort which takes O(n) time.
Based on a probabilistic analysis of the random pivot choice, quick-select runs in
expected time O(n).

29
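A self-contained Python sketch of randomized quick-select using the L/E/G partition described above (function name is illustrative):

import random

def quick_select(S, k):
    """Return the k-th smallest element of S (k is 1-based)."""
    x = random.choice(S)                      # random pivot
    L = [y for y in S if y < x]               # elements less than x
    E = [y for y in S if y == x]              # elements equal to x
    G = [y for y in S if y > x]               # elements greater than x
    if k <= len(L):
        return quick_select(L, k)             # answer lies in L
    if k <= len(L) + len(E):
        return x                              # answer is the pivot
    return quick_select(G, k - len(L) - len(E))   # answer lies in G

print(quick_select([7, 4, 9, 3, 2, 6, 5, 1, 8], 5))   # 5, as in the visualization slide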

Quick-Select Visualization
An execution of quick-select can be visualized by a
recursion path

Each node represents a recursive call of quick-select, and


stores k and the remaining sequence

k=5, S=(7 4 9 3 2 6 5 1 8)

k=2, S=(7 4 9 6 5 8)
k=2, S=(7 4 6 5)

k=1, S=(7 6 5)
5
30

Lecture 06 Divide & Conquer


Learning Outcome

Divide-and-conquer paradigm (5.2)


Review Merge-sort (4.1.1)
Recurrence Equations (5.2.1)

Iterative substitution
Recursion trees
The master method

Integer Multiplication (5.2.2)


Review Quick-sort (4.3)
1

Divide-and-Conquer
Divide-and conquer is a
general algorithm design
paradigm:

Divide: divide the input data S in


two or more disjoint subsets S1,
S2 ,
Recur: solve the subproblems
recursively
Conquer: combine the solutions
for S1, S2, , into a solution for S

The base case for the


recursion are subproblems of
constant size
Analysis can be done using
recurrence equations

Merge-Sort Review
Merge-sort on an input sequence S with n elements
consists of three steps:

Divide: partition S into two sequences S1 and S2 of about n/2


elements each
Recur: recursively sort S1 and S2
Conquer: merge S1 and S2 into a unique sorted sequence

Merge-Sort Review
Input Parameters: array a, start index p, end index r.
Output Parameter: array a.
Mergesort (a, p, r) {
// if only one element, just return.
if (p == r)
return
// Divide: divide a into two nearly equal parts.
m = (p + r) / 2
// Recur: sort each half.
Mergesort (a, p, m)
Mergesort (a, m + 1, r)
// Conquer: merge the two sorted halves.
Merge (a, p, m, r)
}
4
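The Merge routine is not shown on the slide; here is a minimal runnable Python sketch of the whole algorithm (list-based rather than in-place, for clarity; names are illustrative):

def merge_sort(a):
    """Divide-and-conquer sort; returns a new sorted list."""
    if len(a) <= 1:                       # base case: 0 or 1 element
        return a
    m = len(a) // 2                       # Divide
    left, right = merge_sort(a[:m]), merge_sort(a[m:])   # Recur
    return merge(left, right)             # Conquer

def merge(s1, s2):
    """Merge two sorted lists in O(len(s1) + len(s2)) time."""
    out, i, j = [], 0, 0
    while i < len(s1) and j < len(s2):
        if s1[i] <= s2[j]:
            out.append(s1[i]); i += 1
        else:
            out.append(s2[j]); j += 1
    out.extend(s1[i:]); out.extend(s2[j:])
    return out

print(merge_sort([7, 2, 9, 4, 3, 8, 6, 1]))   # [1, 2, 3, 4, 6, 7, 8, 9]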

Recurrence Equation
Analysis
The conquer step of merge-sort consists of merging two sorted
sequences, each with n/2 elements and implemented by means of
a doubly linked list, and takes at most bn steps, for some constant b.
Likewise, the basis case (n < 2) takes at most b steps.
Therefore, if we let T(n) denote the running time of merge-sort:

T(n) = b                if n < 2
T(n) = 2T(n/2) + bn     if n ≥ 2

We can therefore analyze the running time of merge-sort by


finding a closed form solution to the above equation.

That is, a solution that has T(n) only on the left-hand side.

Iterative Substitution
In the iterative substitution, or plug-and-chug, technique, we
iteratively apply the recurrence equation to itself and see if we can
find a pattern:
T(n) = 2T(n/2) + bn
     = 2(2T(n/2^2) + b(n/2)) + bn
     = 2^2 T(n/2^2) + 2bn
     = 2^3 T(n/2^3) + 3bn
     = 2^4 T(n/2^4) + 4bn
     = ...
     = 2^i T(n/2^i) + ibn
Note that the base case, T(n) = b, occurs when 2^i = n, that is, when i = log n.
So,
T(n) = bn + bn log n
Thus, T(n) is O(n log n).
6

The Recursion Tree

Draw the recursion tree for the recurrence relation and look for a
pattern:

T(n) = b                if n < 2
T(n) = 2T(n/2) + bn     if n ≥ 2

depth   #seqs   size      time
0       1       n         bn
1       2       n/2       bn
i       2^i     n/2^i     bn
...

Total time = bn + bn log n

(last level plus all previous levels)
7

Master Method
Many divide-and-conquer recurrence equations have
the form:

T(n) = c                if n < d
T(n) = aT(n/b) + f(n)   if n ≥ d

The Master Theorem:

1. if f(n) is O(n^(log_b a - ε)) for some ε > 0, then T(n) is Θ(n^(log_b a))
2. if f(n) is Θ(n^(log_b a) log^k n), then T(n) is Θ(n^(log_b a) log^(k+1) n)
3. if f(n) is Ω(n^(log_b a + ε)) for some ε > 0, then T(n) is Θ(f(n)),
   provided a f(n/b) ≤ δ f(n) for some δ < 1.
8

Master Method, Examples 1-7

The form:  T(n) = c if n < d;  T(n) = aT(n/b) + f(n) if n ≥ d
(the Master Theorem cases are as stated on the previous slide)

Example 1: T(n) = 4T(n/2) + n.  log_b a = 2, so case 1 says T(n) is O(n^2).
Example 2: T(n) = 2T(n/2) + n log n.  log_b a = 1, so case 2 (with k = 1) says T(n) is O(n log^2 n).
Example 3: T(n) = T(n/3) + n log n.  log_b a = 0, so case 3 says T(n) is O(n log n).
Example 4: T(n) = 8T(n/2) + n^2.  log_b a = 3, so case 1 says T(n) is O(n^3).
Example 5: T(n) = 9T(n/3) + n^3.  log_b a = 2, so case 3 says T(n) is O(n^3).
Example 6: T(n) = T(n/2) + 1 (binary search).  log_b a = 0, so case 2 says T(n) is O(log n).
Example 7: T(n) = 2T(n/2) + log n (heap construction).  log_b a = 1, so case 1 says T(n) is O(n).
15

Integer Multiplication
Algorithm: Multiply two n-bit integers I and J.

Divide step: Split I and J into high-order and low-order bits

I = I_h * 2^(n/2) + I_l
J = J_h * 2^(n/2) + J_l

We can then define I*J by multiplying the parts and adding:

I * J = (I_h * 2^(n/2) + I_l) * (J_h * 2^(n/2) + J_l)
      = I_h J_h * 2^n + I_h J_l * 2^(n/2) + I_l J_h * 2^(n/2) + I_l J_l

So, T(n) = 4T(n/2) + n, which implies T(n) is O(n^2).

But that is no better than the algorithm we learned in grade
school.
16

An Improved Integer
Multiplication Algorithm
Algorithm: Multiply two n-bit integers I and J.

Divide step: Split I and J into high-order and low-order bits

I = I_h * 2^(n/2) + I_l
J = J_h * 2^(n/2) + J_l

Observe that there is a different way to multiply parts:

I * J = I_h J_h * 2^n + [(I_h - I_l)(J_l - J_h) + I_h J_h + I_l J_l] * 2^(n/2) + I_l J_l
      = I_h J_h * 2^n + [(I_h J_l - I_l J_l - I_h J_h + I_l J_h) + I_h J_h + I_l J_l] * 2^(n/2) + I_l J_l
      = I_h J_h * 2^n + (I_h J_l + I_l J_h) * 2^(n/2) + I_l J_l

So, T(n) = 3T(n/2) + n, which implies T(n) is O(n^(log_2 3)), by
the Master Theorem.
Thus, T(n) is O(n^1.585).
17
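A compact Python sketch of this three-multiplication scheme (Karatsuba), assuming integer inputs and splitting on bit count (function and variable names are illustrative):

def karatsuba(I, J):
    """Multiply integers using three recursive multiplications."""
    if I < 0 or J < 0:                         # reduce to non-negative operands
        sign = -1 if (I < 0) != (J < 0) else 1
        return sign * karatsuba(abs(I), abs(J))
    if I < 16 or J < 16:                       # small base case: multiply directly
        return I * J
    half = max(I.bit_length(), J.bit_length()) // 2
    Ih, Il = I >> half, I & ((1 << half) - 1)  # I = Ih*2^half + Il
    Jh, Jl = J >> half, J & ((1 << half) - 1)  # J = Jh*2^half + Jl
    hh = karatsuba(Ih, Jh)
    ll = karatsuba(Il, Jl)
    mid = karatsuba(Ih - Il, Jl - Jh) + hh + ll    # equals Ih*Jl + Il*Jh
    return (hh << (2 * half)) + (mid << half) + ll

print(karatsuba(1234, 5678), 1234 * 5678)      # 7006652 7006652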

Integer multiplication
(1234 x 5678)
= (12x100 + 34) x (56x100 + 78)
= 10,000x(12x56) + 100x[(12x78) +
(34x56)] + (34x78)
4 multiplications needed:
12x56, 12x78, 34x56, 34x78

18

Improved Integer multiplication

[(12)(78) + (34)(56)]
= [(12-34)(78-56) + 12(56) + 34(78)]
= [(12)(78) + (34)(56) - 34(78) - 12(56)
  + 12(56) + 34(78)]  (the extra terms cancel
each other)
Thus, (1234 x 5678)
= 10,000x12(56) + 100x[(12-34)(78-56) + 12(56) +
34(78)] + 34(78)
3 multiplications needed:
12x56, 34x78, (12-34)x(78-56)

19

Quick-Sort
[Figure: quick-sort tree for the input 7 4 9 6 2, producing the sorted output 2 4 6 7 9]
20

Quick-Sort
Quick-sort is a sorting
algorithm based on the
divide-and-conquer
paradigm:

Divide: pick a random
element x (called the pivot) and
partition S into

L  elements less than x
E  elements equal to x
G  elements greater than x

Recur: sort L and G

Conquer: join L, E and G

Recall that Lec05a picks the
last element as the pivot
21

Algorithm QuickSort
1 quicksort( arr, p, r )
2 if p < r
3 pi = partition( arr, p, r )
4 quicksort( arr, p, pi 1 )
5 quicksort( arr, pi + 1, r )

22

Partition
We partition an input
sequence as follows:

We remove, in turn, each
element y from S and
we insert y into L, E or G,
depending on the result of
the comparison with the
pivot x

Each insertion and removal
is at the beginning or at the
end of a sequence, and
hence takes O(1) time
Thus, the partition step of
quick-sort takes O(n) time

Algorithm partition(S, p)
Input sequence S, position p of pivot
Output subsequences L, E, G of the
elements of S less than, equal to,
or greater than the pivot, resp.
L, E, G ← empty sequences
x ← S.remove(p)
while ¬S.isEmpty()
  y ← S.remove(S.first())
  if y < x
    L.insertLast(y)
  else if y = x
    E.insertLast(y)
  else { y > x }
    G.insertLast(y)
return L, E, G
23
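A short Python sketch of quick-sort with the three-way (L, E, G) partition described above and a random pivot (illustrative and not in-place):

import random

def quick_sort(S):
    """Randomized quick-sort using a three-way partition."""
    if len(S) <= 1:                       # base case
        return S
    x = random.choice(S)                  # random pivot
    L = [y for y in S if y < x]
    E = [y for y in S if y == x]
    G = [y for y in S if y > x]
    return quick_sort(L) + E + quick_sort(G)   # join L, E and G

print(quick_sort([7, 4, 9, 6, 2]))        # [2, 4, 6, 7, 9]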

Quick-Sort Tree
An execution of quick-sort is depicted by a binary tree

Each node represents a recursive call of quick-sort and stores


Unsorted sequence before the execution and its pivot

Sorted sequence at the end of the execution

The root is the initial call


The leaves are calls on subsequences of size 0 or 1

[Figure: quick-sort tree for the input 7 4 9 6 2; the root stores 7 4 9 6 2 → 2 4 6 7 9]
24

Execution Example
[Figures: a sequence of slides traces quick-sort on the input 7 2 9 4 3 7 6 1, showing the quick-sort tree after each step: pivot selection; partition, recursive call, pivot selection; partition, recursive call, base case; recursive call, ..., base case, join; recursive call, pivot selection; partition, ..., recursive call, base case; join, join, ending with the sorted sequence 1 2 3 4 6 7 7 9]
31

Worst-case Running Time

The worst case for quick-sort occurs when the pivot is the unique
minimum or maximum element (e.g., the input is already sorted)
One of L and G has size n - 1 and the other has size 0
The running time is proportional to the sum
n + (n - 1) + ... + 2 + 1
Thus, the worst-case running time of quick-sort is O(n^2)

depth   time
0       n
1       n - 1
...     ...
n - 1   1
32

Expected Running Time

Quick-sort performs the best when L and G are of
equal size. In this case, a single quick-sort call involves
bn work of partition plus two recursive calls on lists of
size n/2, hence the recurrence relation is the same as for
merge-sort:
T(n) = 2T(n/2) + bn
From the Master Theorem, the best-case running time
of quick-sort is O(n log n).
For the average case, if a random pivot is selected, the
probability of the worst case is reduced and the expected
running time is also O(n log n).
33

Lecture 07
Greedy Algorithms

Outline and Reading

The Greedy Method Technique (5.1)


Fractional Knapsack Problem (5.1.1)
Dijkstras algorithm (7.1.1)
The Prim-Jarnik Algorithm (7.3.2)
Kruskal's Algorithm (7.3.1)
2

The Greedy Method


Technique
The greedy method is a general algorithm
design paradigm, built on the following
elements:

configurations: different choices, collections, or


values to find
objective function: a score assigned to
configurations, which we want to either maximize or
minimize

It works best when applied to problems with the


greedy-choice property:

a globally-optimal solution can always be found by a


series of local improvements from a starting
configuration.
3

Making Change
Problem: A dollar amount to reach and a collection of
coin amounts to use to get there.
Configuration: A dollar amount yet to return to a
customer plus the coins already returned
Objective function: Minimize number of coins returned.
Greedy solution: Always return the largest coin you can
Example 1: Coins are valued $.32, $.08, $.01

Has the greedy-choice property, since no amount over $.32 can


be made with a minimum number of coins by omitting a $.32
coin (similarly for amounts over $.08, but under $.32).

Example 2: Coins are valued $.30, $.20, $.05, $.01

Does not have greedy-choice property, since $.40 is best made


with two $.20s, but the greedy solution will pick three coins
(which ones?)
4

The Fractional Knapsack
Problem
Given: A set S of n items, with each item i having

bi - a positive benefit
wi - a positive weight

Goal: Choose items with maximum total benefit but with
weight at most W.
If we are allowed to take fractional amounts, then this is
the fractional knapsack problem.

In this case, we let xi denote the amount we take of item i

Objective: maximize   Σ_{i∈S} b_i (x_i / w_i)

Constraint:           Σ_{i∈S} x_i ≤ W
5

Example
Given: A set S of n items, with each item i having

bi - a positive benefit
wi - a positive weight

Goal: Choose items with maximum total benefit but with
weight at most W.

Items:              1       2       3       4       5
Weight:             4 ml    8 ml    2 ml    6 ml    1 ml
Benefit:            $12     $32     $40     $30     $50
Value ($ per ml):   3       4       20      5       50

Knapsack capacity: 10 ml
Solution: 1 ml of item 5, 2 ml of item 3, 6 ml of item 4, 1 ml of item 2
6

The Fractional Knapsack
Algorithm
Greedy choice: keep taking the
item with the highest value
(benefit to weight ratio)

Since Σ_{i∈S} b_i (x_i / w_i) = Σ_{i∈S} (b_i / w_i) x_i

Run time: O(n log n). Why?

Correctness: Suppose there
is a better solution

then there is an item i with higher
value than a chosen item j,
but xi < wi, xj > 0 and vi > vj
If we substitute some of j with i,
we get a better solution
How much of i: min{wi - xi, xj}
Thus, there is no better
solution than the greedy one

Algorithm fractionalKnapsack(S, W)
Input: set S of items w/ benefit bi
and weight wi; max. weight W
Output: amount xi of each item i
to maximize benefit w/ weight
at most W
for each item i in S
  xi ← 0
  vi ← bi / wi   {value}
w ← 0            {total weight}
while w < W
  remove item i w/ highest vi
  xi ← min{wi , W - w}
  w ← w + min{wi , W - w}
7
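A small Python sketch of the greedy fractional-knapsack algorithm above, sorting by value once instead of using a priority queue (names are illustrative):

def fractional_knapsack(items, W):
    """items: list of (benefit, weight); returns (total benefit, amounts taken)."""
    order = sorted(range(len(items)),
                   key=lambda i: items[i][0] / items[i][1], reverse=True)
    amounts = [0] * len(items)
    total, w = 0.0, 0
    for i in order:                           # greedy: highest benefit/weight first
        if w >= W:
            break
        b, wt = items[i]
        take = min(wt, W - w)                 # take as much of item i as fits
        amounts[i] = take
        total += b * (take / wt)
        w += take
    return total, amounts

# Items from the example slide: (benefit, weight in ml), knapsack of 10 ml
print(fractional_knapsack([(12, 4), (32, 8), (40, 2), (30, 6), (50, 1)], 10))
# (124.0, [0, 1, 2, 6, 1]): benefit 124 = 50 + 40 + 30 + 4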

Shortest Paths
[Figure: weighted graph on vertices A, B, C, D, E with shortest-path distances from A]
Lec09 Shortest Paths

Outline and Reading


Weighted graphs (7.1)

Shortest path problem


Shortest path properties

Dijkstras algorithm (7.1.1)

Algorithm
Edge relaxation

Weighted Graphs
In a weighted graph, each edge has an associated numerical
value, called the weight of the edge
Edge weights may represent distances, costs, etc.
Example:

In a flight route graph, the weight of an edge represents the
distance in miles between the endpoint airports

[Figure: flight route graph on the airports SFO, ORD, LGA, PVD, HNL, LAX, DFW, MIA]
10

Shortest Path Problem


Given a weighted graph and two vertices u and v, we want to
find a path of minimum total weight between u and v.

Length of a path is the sum of the weights of its edges.

Example:

Shortest path between Providence and Honolulu

Applications

Internet packet routing


Flight reservations
Driving directions

[Figure: flight route graph with a highlighted shortest path from Providence (PVD) to Honolulu (HNL)]
11

Shortest Path Properties


Property 1:
A subpath of a shortest path is itself a shortest path

Property 2:
There is a tree of shortest paths from a start vertex to all the other
vertices

Example:
Tree of shortest paths from Providence

[Figure: flight route graph with the tree of shortest paths from Providence (PVD)]
12

Dijkstras Algorithm
The distance of a vertex
v from a vertex s is the
length of a shortest path
between s and v
Dijkstras algorithm
computes the distances
of all the vertices from a
given start vertex s
Assumptions:

the graph is connected


the edges are
undirected
the edge weights are
nonnegative

We grow a cloud of vertices,


beginning with s and eventually
covering all the vertices
We store with each vertex v a
label d(v) representing the
distance of v from s in the
subgraph consisting of the cloud
and its adjacent vertices
At each step

We add to the cloud the vertex


u outside the cloud with the
smallest distance label, d(u)
We update the labels of the
vertices adjacent to u
13

Edge Relaxation
Consider an edge e = (u,z)
such that

u is the vertex most recently
added to the cloud
z is not in the cloud

The relaxation of edge e
updates distance d(z) as
follows:

d(z) ← min{d(z), d(u) + weight(e)}

[Figure: example with d(u) = 50 and d(z) = 75; relaxing edge e lowers d(z) to 60]
14

Example
[Figures: two slides step through Dijkstra's algorithm on a small weighted graph with vertices A, B, C, D, E, F, showing the distance labels as the cloud grows]
16

Dijkstra's Algorithm
A priority queue stores
the vertices outside the
cloud

Key: distance
Element: vertex

Locator-based methods

insert(k,e) returns a
locator
replaceKey(l,k) changes
the key of an item

We store two labels
with each vertex:

Distance (d(v) label)
Locator in priority
queue

Algorithm DijkstraDistances(G, s)
Q ← new heap-based priority queue
for all v ∈ G.vertices()
  if v = s
    setDistance(v, 0)
  else
    setDistance(v, ∞)
  l ← Q.insert(getDistance(v), v)
  setLocator(v, l)
while ¬Q.isEmpty()
  u ← Q.removeMin()
  for all e ∈ G.incidentEdges(u)
    { relax edge e }
    z ← G.opposite(u,e)
    r ← getDistance(u) + weight(e)
    if r < getDistance(z)
      setDistance(z, r)
      Q.replaceKey(getLocator(z), r)
17
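A runnable Python sketch of Dijkstra's algorithm using the standard heapq module as the priority queue; since heapq has no locators or replaceKey, stale entries are skipped instead (the graph and names are illustrative):

import heapq

def dijkstra_distances(graph, s):
    """graph: dict vertex -> list of (neighbor, weight); returns distances from s."""
    dist = {v: float('inf') for v in graph}
    dist[s] = 0
    pq = [(0, s)]                             # (distance label, vertex)
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:                       # stale entry, skip (lazy deletion)
            continue
        for z, w in graph[u]:                 # relax each incident edge (u, z)
            r = dist[u] + w
            if r < dist[z]:
                dist[z] = r
                heapq.heappush(pq, (r, z))
    return dist

g = {'A': [('B', 8), ('C', 2), ('D', 4)],
     'B': [('A', 8), ('C', 7)],
     'C': [('A', 2), ('B', 7), ('D', 3), ('E', 5)],
     'D': [('A', 4), ('C', 3), ('E', 7)],
     'E': [('C', 5), ('D', 7)]}
print(dijkstra_distances(g, 'A'))             # {'A': 0, 'B': 8, 'C': 2, 'D': 4, 'E': 7}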

Analysis
Graph operations

Method incidentEdges is called once for each vertex

Label operations

We set/get the distance and locator labels of vertex z O(deg(z)) times
Setting/getting a label takes O(1) time

Priority queue operations

Each vertex is inserted once into and removed once from the priority
queue, where each insertion or removal takes O(log n) time
The key of a vertex w in the priority queue is modified at most deg(w)
times, where each key change takes O(log n) time

Dijkstra's algorithm runs in O((n + m) log n) time provided the
graph is represented by the adjacency list structure

Recall that Σ_v deg(v) = 2m

The running time can also be expressed as O(m log n) since the
graph is connected
18

Why Dijkstra's Algorithm
Works
Dijkstra's algorithm is based on the greedy
method. It adds vertices by increasing distance.

Suppose it didn't find all shortest
distances. Let F be the first wrong
vertex the algorithm processed.
When the previous node, D, on the
true shortest path was considered,
its distance was correct.
But the edge (D,F) was relaxed at
that time!
Thus, so long as d(F) > d(D), F's
distance cannot be wrong. That is,
there is no wrong vertex.

[Figure: cloud of vertices with distance labels illustrating the argument]
19

Why It Doesn't Work for
Negative-Weight Edges
Dijkstra's algorithm is based on the greedy
method. It adds vertices by increasing distance.

If a node with a negative
incident edge were to be added
late to the cloud, it could mess
up distances for vertices already
in the cloud.

[Figure: graph containing an edge of weight -8]

C's true distance is 1, but
it is already in the cloud
with d(C) = 5!
20

Minimum Spanning Trees (MST)
[Figure: weighted graph of flight distances between BOS, PVD, JFK, BWI, ORD, SFO, LAX, DFW, MIA]
21

Outline and Reading


Minimum Spanning Trees (7.3)

Definitions
A crucial fact

The Prim-Jarnik Algorithm (7.3.2)


Kruskal's Algorithm (7.3.1)

22

Minimum Spanning Tree

Spanning subgraph

Subgraph of a graph G
containing all the vertices of G

Spanning tree

Spanning subgraph that is
itself a (free) tree

Minimum spanning tree (MST)

Spanning tree of a weighted
graph with minimum total
edge weight

Applications

Communications networks
Transportation networks

[Figure: weighted graph on ORD, PIT, DEN, STL, DCA, DFW, ATL with an MST highlighted]
23

Cycle Property
Cycle Property:
Let T be a minimum
spanning tree of a
weighted graph G
Let e be an edge of G
that is not in T and let C
be the cycle formed by e
with T
For every edge f of C,
weight(f) ≤ weight(e)
Proof:
By contradiction
If weight(f) > weight(e) we
can get a spanning tree
of smaller weight by
replacing f with e

[Figure: replacing f with e yields a better spanning tree]
24

Partition Property

Partition Property:
Consider a partition of the vertices of
G into subsets U and V
Let e be an edge of minimum weight
across the partition
There is a minimum spanning tree of
G containing edge e
Proof:
Let T be an MST of G
If T does not contain e, consider the
cycle C formed by e with T and let f
be an edge of C across the partition
By the cycle property,
weight(f) ≤ weight(e)
Thus, weight(f) = weight(e)
We obtain another MST by replacing
f with e

[Figure: replacing f with e yields another MST]
25

Prim Algorithm
Input: A non-empty connected weighted graph with
vertices V and edges E (the weights can be negative).
Initialize: Vnew = {x}, where x is an arbitrary node (starting
point) from V, Enew = {}
Repeat until Vnew = V:
Choose an edge {u, v} with minimal weight such that u is
in Vnew and v is not (if there are multiple edges with the
same weight, any of them may be picked)
Add v to Vnew, and {u, v} to Enew


26

Prim-Jarniks Algorithm
Similar to Dijkstras algorithm (for a connected graph)
We pick an arbitrary vertex s and we grow the MST as a
cloud of vertices, starting from s
We store with each vertex v a label d(v) = the smallest
weight of an edge connecting v to a vertex in the cloud
At each step:
We add to the cloud the
vertex u outside the cloud
with the smallest distance
label
We update the labels of the
vertices adjacent to u

27

Prim-Jarnik's Algorithm (cont.)

A priority queue stores
the vertices outside the
cloud

Key: distance
Element: vertex

Locator-based methods

insert(k,e) returns a
locator
replaceKey(l,k) changes
the key of an item

We store three labels
with each vertex:

Distance
Parent edge in MST
Locator in priority queue

Algorithm PrimJarnikMST(G)
Q ← new heap-based priority queue
s ← a vertex of G
for all v ∈ G.vertices()
  if v = s
    setDistance(v, 0)
  else
    setDistance(v, ∞)
  setParent(v, ∅)
  l ← Q.insert(getDistance(v), v)
  setLocator(v, l)
while ¬Q.isEmpty()
  u ← Q.removeMin()
  for all e ∈ G.incidentEdges(u)
    z ← G.opposite(u,e)
    r ← weight(e)
    if r < getDistance(z)
      setDistance(z, r)
      setParent(z, e)
      Q.replaceKey(getLocator(z), r)
28
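A Python sketch of Prim-Jarnik using heapq with lazy deletion; the only substantive difference from the Dijkstra sketch above is that the key for z is weight(e) rather than d(u) + weight(e) (the graph and names are illustrative):

import heapq

def prim_jarnik_mst(graph):
    """graph: dict vertex -> list of (neighbor, weight); returns list of MST edges."""
    start = next(iter(graph))                 # arbitrary start vertex
    in_cloud = {start}
    mst = []
    pq = [(w, start, z) for z, w in graph[start]]
    heapq.heapify(pq)
    while pq and len(in_cloud) < len(graph):
        w, u, z = heapq.heappop(pq)           # lightest edge leaving the cloud
        if z in in_cloud:
            continue                          # stale entry, skip
        in_cloud.add(z)
        mst.append((u, z, w))
        for y, wy in graph[z]:
            if y not in in_cloud:
                heapq.heappush(pq, (wy, z, y))
    return mst

g = {'A': [('B', 2), ('C', 5)], 'B': [('A', 2), ('C', 7), ('D', 3)],
     'C': [('A', 5), ('B', 7), ('D', 4)], 'D': [('B', 3), ('C', 4)]}
print(prim_jarnik_mst(g))                     # [('A', 'B', 2), ('B', 'D', 3), ('D', 'C', 4)]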

Example
[Figures: two slides step through Prim-Jarnik's algorithm on a small weighted graph, showing the distance labels and the growing MST cloud]
30

Dijkstra vs Prim
The two algorithms have the same structure: a heap-based priority
queue of the vertices outside the cloud, repeated removeMin, and an
update of the incident edges of the removed vertex u. The only
differences are in the update step:

DijkstraDistances:  r ← getDistance(u) + weight(e)
PrimJarnikMST:      r ← weight(e), and setParent(z, e) is also
                    recorded when the key of z improves
31

Dijkstra vs Prim
[Figures: the same small graph shown three times: the shortest-path trees from node A and from node C, and the minimum spanning tree]

Min Spanning Tree: how to connect all nodes with the least total cost?
The graph must be connected.
E.g. minimize the TNB power line transmission cost, railway line cost
Saves material/construction cost
You can start at any node; the Min Spanning Tree is the same.
Dijkstra: shortest path to all nodes from an individual node
Specific to each node; the shortest-path tree is different for node A and node C
The tree differs depending on which node is the root (starting point)
How to reach all nodes from the root with the least cost; saves time and fuel

32

Analysis
Graph operations

Method incidentEdges is called once for each vertex

Label operations

We set/get the distance, parent and locator labels of vertex z O(deg(z))
times
Setting/getting a label takes O(1) time

Priority queue operations

Each vertex is inserted once into and removed once from the priority
queue, where each insertion or removal takes O(log n) time
The key of a vertex w in the priority queue is modified at most deg(w)
times, where each key change takes O(log n) time

Prim-Jarnik's algorithm runs in O((n + m) log n) time provided the
graph is represented by the adjacency list structure

Recall that Σ_v deg(v) = 2m

The running time is O(m log n) since the graph is connected
33

Kruskal's Algorithm
A priority queue stores
the edges outside the
cloud

Key: weight
Element: edge

At the end of the
algorithm

We are left with one
cloud that encompasses
the MST
A tree T which is our
MST

Algorithm KruskalMST(G)
for each vertex v in G do
  define a Cloud(v) of {v}
let Q be a priority queue.
Insert all edges into Q using their
weights as the key
T ← ∅
while T has fewer than n-1 edges do
  edge e ← Q.removeMin()
  let u, v be the endpoints of e
  if Cloud(v) ≠ Cloud(u) then
    add edge e to T
    merge Cloud(v) and Cloud(u)
return T
34

Kruskal vs Prim
Prim-Jarnik (PrimJarnikMST) removes the graph vertex by vertex from
its priority queue and then examines the incident edges of each
removed vertex.
Kruskal (KruskalMST) removes the edges edge by edge, in increasing
weight order, and accepts an edge only if its endpoints lie in
different clouds.

35

Kruskal vs Prim
If algorithm is stopped before completion
Prim :- Always one connected tree
Kruskal : 1 connected tree or a forest with multiple
trees.

36

Data Structure for
Kruskal's Algorithm
The algorithm maintains a forest of trees
An edge is accepted if it connects distinct trees
We need a data structure that maintains a partition,
i.e., a collection of disjoint sets, with the operations:
- find(u): return the set storing u
- union(u,v): replace the sets storing u and v with
  their union
37

Representation of a
Partition
Each set is stored in a sequence
Each element has a reference back to the set

operation find(u) takes O(1) time, and returns the set of


which u is a member.
in operation union(u,v), we move the elements of the
smaller set to the sequence of the larger set and update
their references
the time for operation union(u,v) is O(min(nu, nv)), where nu
and nv are the sizes of the sets storing u and v

Whenever an element is processed, it goes into a


set of size at least double, hence each element is
processed at most log n times
38

Partition-Based
Implementation
A partition-based version of Kruskal's Algorithm
performs cloud merges as unions and tests as finds.
Algorithm Kruskal(G):
Input: A weighted graph G.
Output: An MST T for G.
Let P be a partition of the vertices of G, where each vertex forms a separate set.
Let Q be a priority queue storing the edges of G, sorted by their weights
Let T be an initially-empty tree
while Q is not empty do
  (u,v) ← Q.removeMinElement()
  if P.find(u) != P.find(v) then
    add (u,v) to T
    P.union(u,v)
return T
Running time: O((n + m) log n)
39
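A Python sketch of this partition-based Kruskal, using a simple union-by-size disjoint-set forest in place of the sequence-based partition (names are illustrative):

def kruskal_mst(vertices, edges):
    """edges: list of (weight, u, v); returns list of MST edges (u, v, weight)."""
    parent = {v: v for v in vertices}         # disjoint-set forest
    size = {v: 1 for v in vertices}

    def find(u):                              # find with path compression
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    def union(u, v):                          # union by size
        ru, rv = find(u), find(v)
        if size[ru] < size[rv]:
            ru, rv = rv, ru
        parent[rv] = ru
        size[ru] += size[rv]

    mst = []
    for w, u, v in sorted(edges):             # edges in increasing weight order
        if find(u) != find(v):                # accept only if the endpoints are in different clouds
            mst.append((u, v, w))
            union(u, v)
    return mst

edges = [(2, 'A', 'B'), (5, 'A', 'C'), (7, 'B', 'C'), (3, 'B', 'D'), (4, 'C', 'D')]
print(kruskal_mst(['A', 'B', 'C', 'D'], edges))   # [('A', 'B', 2), ('B', 'D', 3), ('C', 'D', 4)]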

Kruskal
Example
[Figures: a sequence of slides (pages 40-53) runs Kruskal's algorithm step by step on the flight-distance graph over BOS, PVD, JFK, BWI, ORD, SFO, LAX, DFW, MIA, accepting edges in increasing weight order until the MST is complete]
53

Dijkstra, Prim vs Kruskal

Algorithm (min-edge-weight data structure)        Time complexity (total)
Prim (adjacency matrix, searching)                O(|V|^2)
Prim (binary heap and adjacency list)             O((|V| + |E|) log |V|) = O(|E| log |V|)
Prim (Fibonacci heap and adjacency list)          O(|E| + |V| log |V|)
Kruskal                                           O(|E| log |V|)
Dijkstra                                          O((|V| + |E|) log |V|) = O(|E| log |V|)
54

Lecture 08
Dynamic Programming

Learning Outcome
Matrix Chain-Product (5.3.1)
The General Technique (5.3.2)
0-1 Knapsack Problem (5.3.3)
Transitive closure (6.4.2)

The Floyd-Warshall Algorithm

Matrix Chain-Products
Dynamic Programming is a general
algorithm design paradigm.

Rather than give the general structure, let us
first give a motivating example:
Matrix Chain-Products

Review: Matrix Multiplication.

C = A*B
A is d x e and B is e x f

C[i, j] = Σ_{k=0}^{e-1} A[i, k] * B[k, j]

O(def) time

[Figure: a d x e matrix A times an e x f matrix B gives a d x f matrix C; entry C[i,j] combines row i of A with column j of B]

Matrix Chain-Products
Matrix Chain-Product:

Compute A = A0*A1*...*An-1
Ai is di x di+1
Problem: How to parenthesize?

Example

B is 3 x 100
C is 100 x 5
D is 5 x 5
(B*C)*D takes (3 x 100 x 5) + (3 x 5 x 5) =
1500 + 75 = 1575 ops
B*(C*D) takes (3 x 100 x 5) + (100 x 5 x 5) =
1500 + 2500 = 4000 ops
4

An Enumeration Approach
Matrix Chain-Product Alg.:

Try all possible ways to parenthesize


A=A0*A1**An-1
Calculate number of ops for each one
Pick the one that is best

Running time:

The number of parenthesizations is equal
to the number of binary trees with n nodes
This is exponential!
It is called the Catalan number, and it is
almost 4^n.
This is a terrible algorithm!
5

A Greedy Approach
Idea #1: repeatedly select the product that
uses (up) the most operations.
Counter-example:

A is 10 5
B is 5 10
C is 10 5
D is 5 10
Greedy idea #1 gives (A*B)*(C*D), which takes
(10x5x10)+(10x10x10)+(10x5x10) =
500+1000+500 = 2000 ops
A*((B*C)*D) takes (10x5x10)+(5x10x5)+(5x5x10)
= 500+250+250 = 1000 ops
6

Another Greedy Approach


Idea #2: repeatedly select the product that uses
the fewest operations.
Counter-example:

A is 101 11
B is 11 9
C is 9 100
D is 100 99
Greedy idea #2 gives A*((B*C)*D)), which takes
109989+9900+108900=228789 ops
(A*B)*(C*D) takes 9999+89991+89100=189090 ops

The greedy approach is not giving us the


optimal value.

A Recursive Approach
Define subproblems:

Find the best parenthesization of Ai*Ai+1**Aj.


Let Ni,j denote the number of operations done by this
subproblem (i to j).
The optimal solution for the whole problem (0 to n-1) is N0,n-1.

Subproblem optimality: The optimal solution can be


defined in terms of optimal subproblems

There has to be a final multiplication (root of the expression


tree) for the optimal solution.
Say, the final multiply is at index i: (A0**Ai)*(Ai+1**An-1).
Then the optimal solution N0,n-1 is the sum of two optimal
subproblems, N0,i and Ni+1,n-1 plus the time for the last multiply.
If the global optimum did not have these optimal subproblems,
we could define an even better optimal solution.
8

A Dynamic Programming
Algorithm
Since subproblems overlap, we don't use recursion.
Instead, we construct optimal subproblems bottom-up.
The Ni,i's are easy, so start with them.
Then do the length 2, 3, ... subproblems, and so on.
Running time: O(n^3)

Algorithm matrixChain(S):
Input: sequence S of n matrices to be multiplied
Output: number of operations in an optimal
parenthesization of S
for i ← 1 to n-1 do
  Ni,i ← 0
for b ← 1 to n-1 do
  for i ← 0 to n-b-1 do
    j ← i+b
    Ni,j ← +infinity
    for k ← i to j-1 do
      Ni,j ← min{Ni,j , Ni,k + Nk+1,j + di dk+1 dj+1}
9
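A Python sketch of matrixChain; dims[i] is d_i, so matrix A_i has dimensions dims[i] x dims[i+1] (names are illustrative):

def matrix_chain(dims):
    """Minimum number of scalar multiplications to compute A0*A1*...*A(n-1)."""
    n = len(dims) - 1                         # number of matrices
    N = [[0] * n for _ in range(n)]           # N[i][j] = best cost for Ai..Aj
    for b in range(1, n):                     # b = j - i, the subproblem length
        for i in range(n - b):
            j = i + b
            N[i][j] = float('inf')
            for k in range(i, j):             # try every possible final multiplication
                cost = N[i][k] + N[k + 1][j] + dims[i] * dims[k + 1] * dims[j + 1]
                N[i][j] = min(N[i][j], cost)
    return N[0][n - 1]

# Dimensions from the visualization slide: 30x35, 35x15, 15x5, 5x10, 10x20, 20x25
print(matrix_chain([30, 35, 15, 5, 10, 20, 25]))    # 15125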

A Dynamic Programming
Algorithm Visualization

N_{i,j} = min_{i ≤ k < j} { N_{i,k} + N_{k+1,j} + d_i d_{k+1} d_{j+1} }

The bottom-up
construction fills in the
N array by diagonals
Ni,j gets values from
previous entries in the i-th
row and j-th column
Filling in each entry in
the N table takes O(n)
time.
Total run time: O(n^3)
Getting the actual
parenthesization can be
done by remembering
k for each N entry

[Figure: the n x n table N filled in by diagonals; the answer is N_{0,n-1} in the top-right corner]
10

Dynamic Programming Algorithm
Visualization
A0: 30 x 35; A1: 35 x 15; A2: 15 x 5;
A3: 5 x 10; A4: 10 x 20; A5: 20 x 25

N_{i,j} = min_{i ≤ k < j} { N_{i,k} + N_{k+1,j} + d_i d_{k+1} d_{j+1} }

N_{1,4} = min{
  N_{1,1} + N_{2,4} + d_1 d_2 d_5 = 0 + 2500 + 35*15*20 = 13000,
  N_{1,2} + N_{3,4} + d_1 d_3 d_5 = 2625 + 1000 + 35*5*20 = 7125,
  N_{1,3} + N_{4,4} + d_1 d_4 d_5 = 4375 + 0 + 35*10*20 = 11375
} = 7125

The General Dynamic


Programming Technique
Applies to a problem that at first seems to
require a lot of time (possibly exponential),
provided we have:

Simple subproblems: the subproblems can be


defined in terms of a few variables, such as j, k, l,
m, and so on.
Subproblem optimality: the global optimum value
can be defined in terms of optimal subproblems
Subproblem overlap: the subproblems are not
independent, but instead they overlap (hence,
should be constructed bottom-up).

12

The 0/1 Knapsack Problem

Given: A set S of n items, with each item i having

bi - a positive benefit
wi - a positive weight

Goal: Choose items with maximum total benefit but with
weight at most W.
If we are not allowed to take fractional amounts, then
this is the 0/1 knapsack problem.

In this case, we let T denote the set of items we take

Objective: maximize   Σ_{i∈T} b_i

Constraint:           Σ_{i∈T} w_i ≤ W
13

Example
Given: A set S of n items, with each item i having

bi - a positive benefit
wi - a positive weight

Goal: Choose items with maximum total benefit but with
weight at most W.

Items:     1      2      3      4      5
Weight:    4 in   2 in   2 in   6 in   2 in
Benefit:   $20    $3     $6     $25    $80

Knapsack capacity: W = 9 in
Solution: item 5 (2 in), item 3 (2 in), item 1 (4 in)
w = 8 in
B = $106
14

A 0/1 Knapsack Algorithm,
First Attempt
Greedy: repeatedly add the item with maximum ratio
benefit / weight
Problem: the greedy method does not have subproblem
optimality:
{5, 2, 1} has benefit = 35, so greedy is not optimal
{3, 4} has the optimum benefit of 40!

W = 11 kg
Items:        1    2    3    4    5
Weight (kg):  1    2    5    6    7
Benefit:      1    6    18   22   28
15

A 0/1 Knapsack Algorithm,
Second Attempt
Sk: set of items numbered 1 to k.
Define B[k,w] = best selection from Sk with
weight at most w
Good news: this does have subproblem
optimality:

B[k, w] = B[k-1, w]                                  if wk > w
B[k, w] = max{ B[k-1, w], B[k-1, w - wk] + bk }      otherwise

i.e., the best subset of Sk with weight at most w is
either the best subset of Sk-1 with weight w or the
best subset of Sk-1 with weight w - wk plus item k.
16

The 0/1 Knapsack
Algorithm
Recall the definition of B[k,w]:

B[k, w] = B[k-1, w]                                  if wk > w
B[k, w] = max{ B[k-1, w], B[k-1, w - wk] + bk }      otherwise

Since B[k,w] is defined in
terms of B[k-1,*], we can
reuse the same array

Running time: O(nW).
Not a polynomial-time
algorithm if W is large
This is a pseudo-polynomial
time algorithm

Algorithm 01Knapsack(S, W):
Input: set S of items w/ benefit bi
and weight wi; max. weight W
Output: benefit of best subset with
weight at most W
for w ← 0 to W do
  B[w] ← 0
for k ← 1 to n do
  for w ← W downto wk do
    if B[w - wk] + bk > B[w] then
      B[w] ← B[w - wk] + bk
17
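A Python sketch of 01Knapsack with the single reused array B[0..W] (names are illustrative):

def knapsack_01(items, W):
    """items: list of (benefit, weight) with integer weights; returns best total benefit."""
    B = [0] * (W + 1)                         # B[w] = best benefit with weight at most w
    for bk, wk in items:
        for w in range(W, wk - 1, -1):        # w from W downto wk, so each item is used at most once
            B[w] = max(B[w], B[w - wk] + bk)
    return B[W]

# Items from the 0/1 example slide: (benefit, weight in inches), W = 9
print(knapsack_01([(20, 4), (3, 2), (6, 2), (25, 6), (80, 2)], 9))   # 106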

The 0/1 Knapsack Algorithm
[Table: the array B[w] after processing the item sets {1}, {1,2}, {1,2,3}, {1,2,3,4}, {1,2,3,4,5}; the last two rows read
{1,2,3,4}:    0  18  22  24  28  29  29  40
{1,2,3,4,5}:  0  18  22  28  29  34  35  40]
18

Transitive Closure Problem


Given a digraph G, the
transitive closure of G is the
digraph G* such that
G* has the same vertices
as G
if G has a directed path
from u to v (u v), G*
has a directed edge from
u to v
The transitive closure
provides reachability
information about a digraph

[Figure: a digraph G on vertices A, B, C, D and its transitive closure G*]
19

Computing the
Transitive Closure
We can perform
DFS starting at
each vertex

If there's a way to get


from A to B and from
B to C, then there's a
way to get from A to C.

O(n(n+m))

Alternatively ... Use


dynamic programming:
The Floyd-Warshall
Algorithm
20

Floyd-Warshall
Transitive Closure
Idea #1: Number the vertices 1, 2, , n.
Idea #2: Consider paths that use only
vertices numbered 1, 2, , k, as
intermediate vertices:

[Figure: a path from i to k using only vertices numbered 1,...,k-1, followed by a path from k to j using only vertices numbered 1,...,k-1, gives a path from i to j using only vertices numbered 1,...,k (add the edge (i,j) if it is not already in)]
21

Floyd-Warshall's Algorithm
Floyd-Warshall's algorithm
numbers the vertices of G as
v1, ..., vn and computes a
series of digraphs G0, ..., Gn,
where G0 = G and Gk has a
directed edge (vi, vj) if G has a
directed path from vi to vj with
intermediate vertices in the set
{v1, ..., vk}
We have that Gn = G*
In phase k, digraph Gk is
computed from Gk-1
Running time: O(n^3),
assuming areAdjacent is O(1)
(e.g., adjacency matrix)

Algorithm FloydWarshall(G)
Input digraph G
Output transitive closure G* of G
i ← 1
for all v ∈ G.vertices()
  denote v as vi
  i ← i + 1
G0 ← G
for k ← 1 to n do
  Gk ← Gk-1
  for i ← 1 to n (i ≠ k) do
    for j ← 1 to n (j ≠ i, k) do
      if Gk-1.areAdjacent(vi, vk) ∧ Gk-1.areAdjacent(vk, vj)
        if ¬Gk.areAdjacent(vi, vj)
          Gk.insertDirectedEdge(vi, vj , k)
return Gn
22
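A Python sketch of the Floyd-Warshall transitive closure over an adjacency-matrix representation; a single boolean matrix is updated in place instead of keeping G0..Gn (names are illustrative):

def transitive_closure(adj):
    """adj: n x n boolean adjacency matrix; returns the reachability matrix."""
    n = len(adj)
    reach = [row[:] for row in adj]           # start from G0 = G
    for k in range(n):                        # phase k: allow vertex k as an intermediate
        for i in range(n):
            if i == k:
                continue
            for j in range(n):
                if j == i or j == k:
                    continue
                if reach[i][k] and reach[k][j]:
                    reach[i][j] = True        # add edge (vi, vj)
    return reach

# Tiny digraph: 0 -> 1 -> 2 -> 3
adj = [[False, True, False, False],
       [False, False, True, False],
       [False, False, False, True],
       [False, False, False, False]]
print(transitive_closure(adj)[0])             # [False, True, True, True]: 0 reaches 1, 2 and 3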

Floyd-Warshall Example
[Figures: a sequence of slides (Example, Iterations 1-6, Conclusion) runs Floyd-Warshall on a digraph over the airports BOS, ORD, JFK, SFO, DFW, LAX, MIA with vertices numbered v1, ..., v7, adding the transitive edges phase by phase]
30

Lecture 09
Text Processing
[Figure: pattern-matching illustration with the characters a, b, a, a, b, c, a]
Outline and Reading


Pattern matching algorithms

Brute-force algorithm (9.1.2)


Boyer-Moore algorithm (9.1.3)
Knuth-Morris-Pratt algorithm (9.1.4)

Trie

Huffman encoding

Strings
A string is a sequence of
characters
Examples of strings:

Java program
HTML document
DNA sequence
Digitized image

An alphabet S is the set of


possible characters for a
family of strings
Example of alphabets:

ASCII
Unicode
{0, 1}
{A, C, G, T}

Let P be a string of size m

A substring P[i .. j] of P is the


subsequence of P consisting of
the characters with ranks
between i and j
A prefix of P is a substring of
the type P[0 .. i]
A suffix of P is a substring of
the type P[i ..m - 1]

Given strings T (text) and P


(pattern), the pattern matching
problem consists of finding a
substring of T equal to P
Applications:

Text editors
Search engines
Biological research

Brute-Force Algorithm
The brute-force pattern
matching algorithm compares
the pattern P with the text T
for each possible shift of P
relative to T, until either

a match is found, or
all placements of the pattern
have been tried

Brute-force pattern matching
runs in time O(nm)
Example of worst case:
T = aaa ... ah
P = aaah
may occur in images and
DNA sequences
unlikely in English text

Algorithm BruteForceMatch(T, P)
Input text T of size n and pattern
P of size m
Output starting index of a
substring of T equal to P or -1
if no such substring exists
for i ← 0 to n - m
  { test shift i of the pattern }
  j ← 0
  while j < m ∧ T[i + j] = P[j]
    j ← j + 1
  if j = m
    return i {match at i}
  else
    break while loop {mismatch}
return -1 {no match anywhere}

Brute-Force Algorithm
Matching:
Brute-Force Matching:
Comparing from left to right
ABABC
ABABABCCA
Error

Text to find
ABABC

ABABC

ABABABCCA

ABABABCCA
Successful match!

Error
Source Text

Brute Force Algorithm


Brute-Force Matching II
Worst case looks like:
Scan_Text:
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab
Pattern:
aaaab
Best case looks like:
Scan_Text:
aaaabjhgasjhgasjhagsjhagsjhgahgaaaaaaaaab
Pattern:
aaaab
Typical average case:
Scan_Text:
This is an algorithm difficult to beat
Pattern:
algorithm
Analysis:

Worst Case:    O(M * N)
Best Case:     O(M)
Average Case:  O(M + N)
6

Boyer-Moore Heuristics
The Boyer-Moores pattern matching algorithm is based on two
heuristics
Looking-glass heuristic: Compare P with a subsequence of T
moving backwards
Character-jump heuristic: When a mismatch occurs at T[i] = c

If P contains c, shift P to align the last occurrence of c in P with T[i]


Else, shift P to align P[0] with T[i + 1]

Example
[Figure: the pattern rithm is compared right-to-left against the text "a pattern matching algorithm"; character jumps slide it past mismatching text characters until the occurrence of rithm is found]
7

Last-Occurrence Function
Boyer-Moores algorithm preprocesses the pattern P and the
alphabet S to build the last-occurrence function L mapping S to
integers, where L(c) is defined as

the largest index i such that P[i] = c or


-1 if no such index exists

Example:

S = {a, b, c, d}
P = abacab

c       a    b    c    d
L(c)    4    5    3    -1
The last-occurrence function can be represented by an array


indexed by the numeric codes of the characters
The last-occurrence function can be computed in time O(m + s),
where m is the size of P and s is the size of S

The Boyer-Moore Algorithm

Algorithm BoyerMooreMatch(T, P, S)
L ← lastOccurrenceFunction(P, S)
i ← m - 1
j ← m - 1
repeat
  if T[i] = P[j]
    if j = 0
      return i { match at i }
    else
      i ← i - 1
      j ← j - 1
  else
    { character-jump }
    l ← L[T[i]]
    i ← i + m - min(j, 1 + l)
    j ← m - 1
until i > n - 1
return -1 { no match }

[Figure: the two character-jump cases. Case 1 (j ≤ 1 + l): the pattern is shifted by m - j. Case 2 (1 + l ≤ j): the pattern is shifted by m - (1 + l), aligning the last occurrence of T[i] in P with T[i]]
9
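A Python sketch of BoyerMooreMatch with the looking-glass and character-jump heuristics; the last-occurrence map defaults to -1 for characters not in P (names are illustrative):

def last_occurrence(P):
    """Map each character of P to the largest index where it occurs."""
    return {c: i for i, c in enumerate(P)}

def boyer_moore_match(T, P):
    """Return the starting index of an occurrence of P in T, or -1."""
    n, m = len(T), len(P)
    if m == 0 or m > n:
        return -1
    L = last_occurrence(P)
    i = j = m - 1
    while i <= n - 1:
        if T[i] == P[j]:
            if j == 0:
                return i                      # match at i
            i -= 1
            j -= 1
        else:
            l = L.get(T[i], -1)               # last occurrence of T[i] in P (-1 if absent)
            i += m - min(j, 1 + l)            # character-jump
            j = m - 1
    return -1

print(boyer_moore_match("abacaabaccabacabaabb", "abacab"))   # 10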

Example
[Figure: a run of Boyer-Moore matching the pattern abacab against a text, with the comparisons numbered and the character jumps shown]
10

Another Boyer-Moore Example


must
|
If you wish to understand others you must intensify ..
No Match and y is not in must, so move the whole word.
must
|
If you wish to understand others you must intensify ..
No Match and w is not in must, so move the whole word.
must
|
If you wish to understand others you must intensify ..
No Match and _ is not in must, so move the whole word.

11

Another Boyer-Moore Example


must
|
If you wish to understand others you must intensify ..
No Match, but u is in must, so move the whole word only to
right until u fits.
must
|
If you wish to understand others you must intensify ..
Again: No Match and d is not in must, so move the whole word.
must
|||
If you wish to understand others you must intensify ..
From the right: No Match and r is not in must, so move the
whole word.
must
|
If you wish to understand others you must intensify ..
From the right: No Match and n is not in must, so move the
whole word.

12

Another Boyer-Moore Example


The rest goes this way:
must
must
must
must
must
|| |
|
|
If you wish to understand others you must intensify ..
Success!!
Try the Boyer-Moore algorithm with:
scan_Text: THISISASTRINGSEARCHEXAMPLE
pattern: HINGE

13

Analysis
Boyer-Moore's algorithm
runs in time O(nm + s)
Example of worst case:

T = aaa ... a
P = baaa

The worst case may occur in
images and DNA sequences
but is unlikely in English text
Boyer-Moore's algorithm is
significantly faster than the
brute-force algorithm on
English text

[Figure: worst-case run showing the comparison numbers for each shift]
14

The KMP Algorithm - Motivation


Knuth-Morris-Pratts algorithm
compares the pattern to the
text in left-to-right, but shifts
the pattern more intelligently
than the brute-force algorithm.
When a mismatch occurs,
what is the most we can shift
the pattern so as to avoid
redundant comparisons?
Answer: the largest prefix of
P[0..j] that is a suffix of P[1..j]

a b a a b x .

a b a a b a
j
a b a a b a

No need to
repeat these
comparisons

Resume
comparing
here
15

KMP Failure Function

Knuth-Morris-Pratt's
algorithm preprocesses the
pattern to find matches of
prefixes of the pattern with
the pattern itself
The failure function F(j) is
defined as the size of the
largest prefix of P[0..j] that is
also a suffix of P[1..j]
Knuth-Morris-Pratt's
algorithm modifies the brute-force
algorithm so that if a
mismatch occurs at P[j] ≠ T[i]
we set j ← F(j - 1)

[Figure: alignment of the pattern abaaba against the text, showing the shift given by F(j - 1)]
16

The KMP Algorithm

The failure function can be
represented by an array and
can be computed in O(m) time
At each iteration of the while-loop, either

i increases by one, or
the shift amount i - j
increases by at least one
(observe that F(j - 1) < j)

Hence, there are no more
than 2n iterations of the
while-loop
Thus, KMP's algorithm runs in
optimal time O(m + n)

Algorithm KMPMatch(T, P)
F ← failureFunction(P)
i ← 0
j ← 0
while i < n
  if T[i] = P[j]
    if j = m - 1
      return i - j { match }
    else
      i ← i + 1
      j ← j + 1
  else
    if j > 0
      j ← F[j - 1]
    else
      i ← i + 1
return -1 { no match }
17

Computing the Failure
Function
The failure function can be
represented by an array and
can be computed in O(m) time
The construction is similar to
the KMP algorithm itself
At each iteration of the while-loop, either

i increases by one, or
the shift amount i - j
increases by at least one
(observe that F(j - 1) < j)

Hence, there are no more
than 2m iterations of the
while-loop

Algorithm failureFunction(P)
F[0] ← 0
i ← 1
j ← 0
while i < m
  if P[i] = P[j]
    {we have matched j + 1 chars}
    F[i] ← j + 1
    i ← i + 1
    j ← j + 1
  else if j > 0 then
    {use failure function to shift P}
    j ← F[j - 1]
  else
    F[i] ← 0 { no match }
    i ← i + 1
18
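A runnable Python sketch of the failure function and KMPMatch (names are illustrative):

def failure_function(P):
    """F[j] = size of the largest prefix of P[0..j] that is also a suffix of P[1..j]."""
    m = len(P)
    F = [0] * m
    i, j = 1, 0
    while i < m:
        if P[i] == P[j]:                      # we have matched j + 1 chars
            F[i] = j + 1
            i += 1
            j += 1
        elif j > 0:                           # use failure function to shift P
            j = F[j - 1]
        else:
            F[i] = 0                          # no match
            i += 1
    return F

def kmp_match(T, P):
    """Return the starting index of the first occurrence of P in T, or -1."""
    n, m = len(T), len(P)
    F = failure_function(P)
    i = j = 0
    while i < n:
        if T[i] == P[j]:
            if j == m - 1:
                return i - j                  # match
            i += 1
            j += 1
        elif j > 0:
            j = F[j - 1]
        else:
            i += 1
    return -1

print(failure_function("abacab"))                        # [0, 0, 1, 0, 1, 2]
print(kmp_match("abacaabaccabacabaabb", "abacab"))       # 10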

Example
[Figure: KMP run of the pattern abacab against the text abacaabaccabacabaabb, with the comparisons numbered and the failure function F(j) = 0, 0, 1, 0, 1, 2 shown]
19

Trie
Preprocessing the pattern speeds up pattern matching
queries

After preprocessing the pattern, KMPs algorithm performs


pattern matching in time proportional to the text size

If the text is large, immutable and searched for often


(e.g., works by Shakespeare), we may want to
preprocess the text instead of the pattern
A trie is a compact data structure for representing a
set of strings, such as all the words in a text

A trie supports pattern matching queries in time


proportional to the pattern size

20

Trie (2)
Given a string X, efficiently encode X into a smaller
string Y

Each uncompressed character in X is 7-bit for ASCII and 16-bit for UNICODE
A compressed character in Y has fewer bits
Saves memory and/or bandwidth

A good approach: Huffman encoding

Compute frequency f(c) for each character c.


Encode high-frequency characters with short code words
(Greedy method)
No code word is a prefix for another code
Use an optimal encoding tree to determine the code words

21

Why Trie?

N = number of strings
L = length of a string
[Table omitted: search and insert costs for Red-Black BST and Hashing]
Can we do better than the above for strings?
Yes.
22

Trie Implementation

Root = stores nothing

Each node can have R children/keys: 26 for
the alphabet
If a node is the last character of a key, link it to a value
Else link it to null.
23
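A minimal R-way trie sketch in Python, using a dict per node instead of a fixed 26-slot array (names and structure are illustrative of the idea, not the slides' exact implementation):

class TrieNode:
    def __init__(self):
        self.children = {}                    # character -> child node
        self.value = None                     # value stored at the node ending a key

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def put(self, key, value):
        node = self.root
        for c in key:
            node = node.children.setdefault(c, TrieNode())
        node.value = value

    def get(self, key):
        node = self.root
        for c in key:
            node = node.children.get(c)
            if node is None:                  # search miss: stop at the first missing link
                return None
        return node.value

t = Trie()
t.put("shells", 3)
t.put("shore", 7)
print(t.get("shells"), t.get("she"), t.get("shelter"))    # 3 None None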

Trie: Search

[Figure: searching the trie; a null pointer is reached for the key "shelter"]

Get word "shells": 3 is returned.
Get word "she": 0 is returned.
Get word "shelter": there is no match after "shel", so
null is returned.
24

Trie: Insertion

Put(shore, 7)

25

Trie: Deletion

delete(shells, 3)
All the characters l, l, s and their pointers have to be removed.
The removal stops when there is a character with a non-null value or
when there is another sub-tree.
26

Trie: Deletion

delete(shore, 7)
Stop at (h).

27

Trie: Cost

R-way trie
Good when R is small
When R is large, too much memory is required.
E.g. R-Way for english words, R = 26 , N = 10
Space = (26 +1) x 10 = 270 (still OK)
E.g. R-Way for UTF-16, R = 65536-way, N = 10
Space = (65536 + 1) x 10 = 655370 (too big).
28

Trie Conclusion
Trades memory for speed
When you do a string search, you don't
have to compare against every whole word
Quick search hit: the worst case is the
length of the key
Quicker search miss: just check the first
few characters; if they miss, the key is not
there.
Note: just one implementation is given
in this lecture; other implementations
may vary
29

Other implementation: words at leaves
[Figure: a trie storing MACABRE, MACACO, MACADAM, MACADAMIA, MACAQUE, MACARON, MACARONI, MACARONIC, MACAROON, MACAW, MACCBOY, with the full words stored at the leaves]
30

26-character implementation
[Figure: a trie node implemented as an array of 26 child links, one per letter]
31

Encoding Trie (1)

A code is a mapping of each character of an alphabet to a binary
code-word
A prefix code is a binary code such that no code-word is the prefix
of another code-word
An encoding trie represents a prefix code

Each leaf stores a character
The code word of a character is given by the path from the root to
the leaf storing the character (0 for a left child and 1 for a right child)

[Figure: an encoding trie with code-words 00, 010, 011, 10, 11 for five characters]
32

Encoding Trie (2)


Given a text string X, we want to find a prefix code for the characters
of X that yields a small encoding for X
Frequent characters should have short code-words
Rare characters should have long code-words

Example
X = abracadabra
T1 encodes X into 29 bits
T2 encodes X into 24 bits

[Figure: two encoding tries T1 and T2 for the characters a, b, c, d, r of X = abracadabra]
33

Huffman's Algorithm
Given a string X,
Huffman's algorithm
constructs a prefix
code that minimizes
the size of the
encoding of X
It runs in time
O(n + d log d), where
n is the size of X
and d is the number
of distinct characters
of X
A heap-based
priority queue is
used as an auxiliary
structure

Algorithm HuffmanEncoding(X)
Input string X of size n
Output optimal encoding trie for X
C ← distinctCharacters(X)
computeFrequencies(C, X)
Q ← new empty heap
for all c ∈ C
  T ← new single-node tree storing c
  Q.insert(getFrequency(c), T)
while Q.size() > 1
  f1 ← Q.minKey()
  T1 ← Q.removeMin()
  f2 ← Q.minKey()
  T2 ← Q.removeMin()
  T ← join(T1, T2)
  Q.insert(f1 + f2, T)
return Q.removeMin()
34
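A Python sketch of HuffmanEncoding using heapq; it returns the code-word table instead of the trie itself, and a tie-breaking counter is added because heapq cannot compare tree nodes (names are illustrative):

import heapq
from collections import Counter

def huffman_codes(X):
    """Build an optimal prefix code for string X; returns {character: code-word}."""
    freq = Counter(X)                         # compute frequencies
    heap = [(f, i, (c, None, None)) for i, (c, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tick = len(heap)                          # tie-breaker so tuples never compare trees
    while len(heap) > 1:
        f1, _, T1 = heapq.heappop(heap)       # two least-frequent subtrees
        f2, _, T2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, tick, (None, T1, T2)))   # join T1 and T2
        tick += 1
    codes = {}
    def walk(node, word):
        c, left, right = node
        if c is not None:
            codes[c] = word or "0"            # single-character corner case
        else:
            walk(left, word + "0")            # 0 for a left child
            walk(right, word + "1")           # 1 for a right child
    walk(heap[0][2], "")
    return codes

print(huffman_codes("abracadabra"))
# e.g. {'a': '0', 'c': '100', 'd': '101', 'b': '110', 'r': '111'}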

Example
X = abracadabra
Frequencies: a: 5, b: 2, c: 1, d: 1, r: 2
[Figure: the Huffman trie is built by repeatedly removing and joining the two least-frequent subtrees until a single trie of total weight 11 remains]
35

Huffman Coding
String
Frequency
A= 20
B = 18
C = 15
D = 12
E=8
F -= 6
G=4
H=2
I=1
J=1

36

Lecture 10: NP-Completeness


[Figure: reduction gadget graph with literal vertices x1, x2, x3, x4, their complements, and clause vertices 11, 12, 13, 21, 22, 23, 31, 32, 33]

Learning Outcome
P and NP (13.1)

Definition of P
Definition of NP
Alternate definition of NP

NP-completeness (13.2)

Definition of NP-hard and NP-complete


The Cook-Levin Theorem

Running Time Revisited


Input size, n
To be exact, let n denote the number of bits in a nonunary
(binary) encoding of the input
A time bound of the form O(nk), for some fixed k, is called
polynomial time.
All the polynomial-time algorithms studied so far in this course run
in polynomial time using this definition of input size.
Exception: any pseudo-polynomial time algorithm

[Figure: flight route graph on SFO, ORD, LGA, PVD, HNL, LAX, DFW, MIA]
3

Dealing with Hard Problems


What to do when we find a problem
that looks hard

I couldnt find a polynomial-time algorithm;


I guess Im too dumb.
(cartoon inspired by [Garey-Johnson, 79])

Dealing with Hard Problems


Sometimes we can prove a strong lower
bound (but not usually)

I couldnt find a polynomial-time algorithm,


because no such algorithm exists!
(cartoon inspired by [Garey-Johnson, 79])

Dealing with Hard Problems


NP-completeness lets us show
collectively that a problem is hard.

I couldnt find a polynomial-time algorithm,


but neither could all these other smart people.
(cartoon inspired by [Garey-Johnson, 79])

Polynomial-Time
Decision Problems
To simplify the notion of hardness, we will
focus on the following:

Polynomial-time as the cut-off for efficiency


Decision problems: output is 1 or 0 (yes or no)
Examples:
Does a given graph G have an Euler tour?
Does a text T contain a pattern P?
Does an instance of 0/1 Knapsack have a solution with

benefit at least K?
Does a graph G have an MST with weight at most K?

Problems and Languages


A language L is a set of strings defined over some
alphabet
Every decision algorithm A defines a language L

L is the set consisting of every string x such that A outputs


yes on input x.
We say A accepts x in this case
Example:
If A determines whether or not a given graph G has an
Euler tour, then the language L for A is all graphs with
Euler tours.

The Complexity Class P


A complexity class is a collection of languages
P is the complexity class consisting of all languages
that are accepted by polynomial-time algorithms
For each language L in P there is a polynomial-time
decision algorithm A for L.

If n=|x|, for x in L, then A runs in p(n) time on input x.


The function p(n) is some polynomial

The Complexity Class NP


We say that an algorithm is non-deterministic if it
uses the following operation:

Choose(b): chooses a bit b


Can be used to choose an entire string y (with |y| choices)

We say that a non-deterministic algorithm A accepts


a string x if there exists some sequence of choose
operations that causes A to output yes on input x.
NP is the complexity class consisting of all languages
accepted by polynomial-time non-deterministic
algorithms.

10

NP example
Problem: Decide if a graph has an MST of weight K
Algorithm:
1.
2.
3.

Non-deterministically choose a set T of n-1 edges


Test that T forms a spanning tree
Test that T has weight at most K

Analysis: Testing takes O(n+m) time, so this


algorithm runs in polynomial time.

11

The Complexity Class NP


Alternate Definition
We say that an algorithm B verifies the acceptance
of a language L if and only if, for any x in L, there
exists a certificate y such that B outputs yes on
input (x,y).
NP is the complexity class consisting of all languages
verified by polynomial-time algorithms.
We know: P is a subset of NP.
Major open question: P=NP?
Most researchers believe that P and NP are different.
12

NP example (2)
Problem: Decide if a graph has an MST of weight at most K
Verification Algorithm:
1. Use as a certificate, y, a set T of n-1 edges
2. Test that T forms a spanning tree
3. Test that T has weight at most K
Analysis: Verification takes O(n+m) time, so this algorithm runs in polynomial time.

13
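As a concrete (non-slide) sketch of this verification step in Java: the Edge class, the vertex numbering 0..n-1, and the method names below are assumptions made purely for illustration, not part of the original slides.

import java.util.*;

// Sketch: verifying the certificate for "does G have a spanning tree of weight at most K?"
// The certificate is a proposed set T of n-1 edges; a complete verifier would also check
// that every edge of T actually belongs to G.
class SpanningTreeVerifier {
    static class Edge { int u, v; double w; Edge(int u, int v, double w) { this.u = u; this.v = v; this.w = w; } }

    // Union-find "find" with path halving, used to detect cycles among the certificate edges.
    static int find(int[] parent, int x) {
        while (parent[x] != x) { parent[x] = parent[parent[x]]; x = parent[x]; }
        return x;
    }

    // n = number of vertices (numbered 0..n-1), certificate = the claimed tree T, K = weight bound.
    static boolean verify(int n, List<Edge> certificate, double K) {
        if (certificate.size() != n - 1) return false;   // a spanning tree has exactly n-1 edges
        int[] parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;
        double total = 0;
        for (Edge e : certificate) {
            int ru = find(parent, e.u), rv = find(parent, e.v);
            if (ru == rv) return false;                  // adding e would create a cycle
            parent[ru] = rv;
            total += e.w;
        }
        return total <= K;                               // n-1 acyclic edges form a spanning tree; check its weight
    }
}

Each check runs in near-linear time, which is where the polynomial O(n+m) bound in the analysis comes from.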

Equivalence of the
Two Definitions
Suppose A is a non-deterministic algorithm.
Let y be a certificate consisting of all the outcomes of the choose steps that A uses.
We can create a verification algorithm that uses y instead of A's choose steps.
If A accepts on x, then there is a certificate y that allows us to verify this (namely, the outcomes of the choose steps A made).
If A runs in polynomial time, so does this verification algorithm.

Suppose B is a verification algorithm.
Non-deterministically choose a certificate y.
Run B on (x, y).
If B runs in polynomial time, so does this non-deterministic algorithm.

14

An Interesting Problem
A Boolean circuit is a circuit of AND, OR, and NOT gates; the CIRCUIT-SAT problem is to determine if there is an assignment of 0s and 1s to a circuit's inputs so that the circuit outputs 1.
[Figure: an example Boolean circuit built from AND, OR, and NOT gates, with 0/1 inputs and output 1]
15

CIRCUIT-SAT is in NP
Non-deterministically choose a set of inputs and the outcome of every gate, then test each gate's I/O.

16
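A small Java sketch of this check, assuming a simple array-of-gates representation; the Gate class and wire numbering are illustrative, not from the slides.

// Sketch: checking a CIRCUIT-SAT certificate that assigns a bit to every input wire
// and to every gate's output wire; we re-check each gate locally, then the final output.
class CircuitCertificateChecker {
    enum Kind { AND, OR, NOT }
    static class Gate {
        Kind kind; int out, in1, in2;                    // in2 is ignored for NOT gates
        Gate(Kind kind, int out, int in1, int in2) { this.kind = kind; this.out = out; this.in1 = in1; this.in2 = in2; }
    }

    // values[w] is the certificate's bit for wire w; outputWire is the circuit's output wire.
    static boolean verify(Gate[] gates, boolean[] values, int outputWire) {
        for (Gate g : gates) {
            boolean expected;
            switch (g.kind) {
                case AND: expected = values[g.in1] && values[g.in2]; break;
                case OR:  expected = values[g.in1] || values[g.in2]; break;
                default:  expected = !values[g.in1];                 break;  // NOT
            }
            if (values[g.out] != expected) return false; // the claimed gate output is inconsistent
        }
        return values[outputWire];                       // the circuit must output 1
    }
}

The check is one pass over the gates, so it clearly runs in polynomial time.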

NP-Completeness
A problem (language) L is NP-hard if every problem in NP can be reduced to L in polynomial time.
That is, for each language M in NP, we can take an input x for M and transform it in polynomial time to an input x′ for L such that x is in M if and only if x′ is in L.
L is NP-complete if it's in NP and is NP-hard.
[Figure: every language in NP reduces in polynomial time to L]
17

Cook-Levin Theorem
CIRCUIT-SAT is NP-complete.
We already showed it is in NP.
To prove it is NP-hard, we have to show that every language in NP can be reduced to it.
Let M be in NP, and let x be an input for M.
Let y be a certificate that allows us to verify membership in M in polynomial time, p(n), by some algorithm D.
Let S be a circuit of size at most O(p(n)²) that simulates a computer running D on input (x, y) (details omitted).

[Figure: every language in NP reduces in polynomial time to CIRCUIT-SAT]
18

Some Thoughts
about P and NP

[Figure: the class NP, with CIRCUIT-SAT among the NP-complete problems that live inside NP]

Belief: P is a proper subset of NP.


Implication: the NP-complete problems are the hardest in NP.
Why: Because if we could solve an NP-complete problem in
polynomial time, we could solve every problem in NP in polynomial
time.
That is, if an NP-complete problem is solvable in polynomial time,
then P=NP.
Since so many people have attempted without success to find polynomial-time solutions to NP-complete problems, showing your problem is NP-complete is equivalent to showing that a lot of smart people have worked on your problem and found no polynomial-time algorithm.
19

NP-Completeness (2)
[Figure: the reduction graph previewed again, with literal nodes x1, ¬x1, …, x4, ¬x4 and clause-triangle nodes 11–33]
20

Outline and Reading


Definitions (13.1-2)
NP is the set of all problems (languages) that can be
accepted non-deterministically (using choose operations) in polynomial time, or
verified in polynomial time given a certificate y.
Some NP-complete problems (13.3)
Problem reduction
SAT (and CNF-SAT and 3SAT)
Vertex Cover
Hamiltonian Cycle
21

Problem Reduction
A language M is polynomial-time reducible to a language L if an instance x for M can be transformed in polynomial time to an instance x′ for L such that x is in M if and only if x′ is in L.
Denote this by M ≤ L.

A problem (language) L is NP-hard if every problem in NP is polynomial-time reducible to L.
A problem (language) is NP-complete if it is in NP and it is NP-hard.
CIRCUIT-SAT is NP-complete:
CIRCUIT-SAT is in NP
For every M in NP, M ≤ CIRCUIT-SAT.

22

Transitivity of Reducibility
If A ≤ B and B ≤ C, then A ≤ C.
An input x for A can be converted to x′ for B such that x is in A if and only if x′ is in B. Likewise, x′ for B can be converted to x″ for C such that x′ is in B if and only if x″ is in C.
Hence, if x is in A, then x′ is in B and x″ is in C.
Likewise, if x″ is in C, then x′ is in B and x is in A.
Thus, A ≤ C, since polynomials are closed under composition.

Types of reductions:
Local replacement: Show A ≤ B by dividing an input to A into components and showing how each component can be converted to a component for B.
Component design: Show A ≤ B by building special components for an input of B that enforce properties needed for A, such as choice or evaluation.

23

SAT
A Boolean formula is a formula where the
variables and operations are Boolean (0/1):

(a+b+d+e)·(a+c)·(b+c+d+e)·(a+c+e)
OR: +, AND: · (times), NOT: ¬ (or an overbar)

SAT: Given a Boolean formula S, is S satisfiable, that is, can we assign 0s and 1s to the variables so that S is 1 (true)?

Easy to see that CNF-SAT is in NP:
Non-deterministically choose an assignment of 0s and 1s to the variables and then evaluate each clause. If they are all 1 (true), then the formula is satisfiable.

24
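The "evaluate each clause" step is easy to write out. In the Java sketch below, a literal is encoded as +i for variable x_i and -i for its negation; this encoding is an assumption made for illustration only.

// Sketch: checking whether a 0/1 assignment satisfies a CNF formula.
// clauses[c] lists the literals of clause c; assignment[i] is the value chosen for x_i (1-indexed).
class CnfChecker {
    static boolean satisfies(int[][] clauses, boolean[] assignment) {
        for (int[] clause : clauses) {
            boolean clauseTrue = false;
            for (int lit : clause) {
                boolean value = assignment[Math.abs(lit)];
                if (lit < 0) value = !value;             // negated literal
                if (value) { clauseTrue = true; break; } // one true literal satisfies the clause
            }
            if (!clauseTrue) return false;               // a single false clause falsifies the formula
        }
        return true;                                     // every clause is 1 (true)
    }
}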

SAT is NP-complete
Reduce CIRCUIT-SAT to SAT.
Given a Boolean circuit, make a variable for every input and gate.
Create a sub-formula for each gate, characterizing its effect. Form the formula as the output variable AND-ed with all these sub-formulas.
Example: m · ((a+b) ↔ e) · (c ↔ f) · (d ↔ g) · (e ↔ h) · ((e·f) ↔ i)
The formula is satisfiable if and only if the Boolean circuit is satisfiable.
[Figure: the example circuit, with inputs a, b, c, d and output m]
25
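A minimal sketch of this local-replacement step, emitting each gate's sub-formula as a string; the textual encoding of ↔, ·, + and the method names are assumptions made for illustration.

// Sketch: each gate becomes an "if and only if" sub-formula over its wire variables,
// and the full SAT formula is the output variable AND-ed with all of these sub-formulas.
class GateToSubformula {
    static String subformula(String kind, String out, String a, String b) {
        switch (kind) {
            case "AND": return "(" + out + " <-> (" + a + "." + b + "))";
            case "OR":  return "(" + out + " <-> (" + a + "+" + b + "))";
            default:    return "(" + out + " <-> !" + a + ")";            // NOT gate
        }
    }
    // E.g., an OR gate e with inputs a, b yields "(e <-> (a+b))", and the overall formula
    // starts with the output variable: m . (e <-> (a+b)) . ...
}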

3SAT
The SAT problem is still NP-complete even if the formula is a
conjunction of disjuncts, that is, it is in conjunctive normal form
(CNF).
The SAT problem is still NP-complete even if it is in CNF and
every clause has just 3 literals (a variable or its negation):
(a+b+d)·(a+c+e)·(b+d+e)·(a+c+e)
Reduction from SAT.
DNF (disjunctive normal form): an OR of terms, each term an AND of literals, e.g., B = a·¬b·c + ¬a·b·c + a·b·¬c
CNF (conjunctive normal form): an AND of clauses, each clause an OR of literals, e.g., B = (a+¬b+c)·(¬a+b+c)·(a+b+¬c)

26

Illustration of Reductions
Every problem in NP → CIRCUIT-SAT (Cook-Levin Theorem: CIRCUIT-SAT is NP-complete)
CIRCUIT-SAT → CNF-SAT (local replacement)
CNF-SAT → 3SAT (local replacement)
3SAT → VERTEX-COVER (component design)
VERTEX-COVER → CLIQUE (local replacement)
VERTEX-COVER → SET-COVER (local replacement)
VERTEX-COVER → SUBSET-SUM (component design)
SUBSET-SUM → KNAPSACK (restriction)
VERTEX-COVER → HAMILTONIAN-CYCLE (component design)
HAMILTONIAN-CYCLE → TSP (restriction)
27

Vertex Cover
A vertex cover of a graph G=(V,E) is a subset W of V such that, for every edge (a,b) in E, a is in W or b is in W.
VERTEX-COVER: Given a graph G and an integer K, does G have a vertex cover of size at most K?
VERTEX-COVER is in NP: Non-deterministically choose a subset W of size at most K and check that every edge is covered by W.
28
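A Java sketch of this check, under the assumption that edges are given as pairs of vertex numbers and W as a set; both representations are illustrative, not from the slides.

import java.util.*;

// Sketch: verifying a VERTEX-COVER certificate W against the graph's edges and the bound K.
class VertexCoverChecker {
    static boolean verify(int[][] edges, Set<Integer> w, int K) {
        if (w.size() > K) return false;                      // W must have size at most K
        for (int[] e : edges)
            if (!w.contains(e[0]) && !w.contains(e[1]))
                return false;                                // this edge is not covered
        return true;                                         // every edge has an endpoint in W
    }
}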

Vertex Cover
Example
Suppose we are given a graph G representing a computer network, where the vertices represent routers and the edges represent physical connections. Suppose further that we wish to upgrade some of the routers in our network with special new, but expensive, routers that can perform sophisticated monitoring operations for incident connections. If we would like to determine whether k new routers are sufficient to monitor every connection in our network, then we have an instance of the VERTEX-COVER problem.
29

Vertex-Cover is NP-complete
Reduce 3SAT to VERTEX-COVER.
Let S be a Boolean formula in CNF with each clause having 3 literals.
For each variable x, create a node for x and a node for ¬x, and connect these two with an edge.
For each clause (a+b+c), create a triangle of three nodes and connect them to each other.
30

Vertex-Cover is NP-complete
Completing the construction:
Connect each literal in a clause triangle to its copy in a variable pair; e.g., for a clause (x+y+z), connect its three triangle nodes to the nodes x, y, and z.
Let n = # of variables and m = # of clauses; set K = n + 2m.
31

Vertex-Cover is NP-complete
Example: (a+b+c)·(a+b+c)·(b+c+d)
The graph has a vertex cover of size K = 4 + 6 = 10 if and only if the formula is satisfiable.
[Figure: variable gadgets for a, b, c, d and three clause triangles with nodes 11–33]
32
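For concreteness, the whole construction can be sketched in a few lines of Java; the literal encoding (+i / -i) and the node-naming scheme are assumptions made for illustration.

import java.util.*;

// Sketch: building the VERTEX-COVER instance for a 3-CNF formula.
// clauses[c] holds three literals, encoded +i for x_i and -i for its negation.
class ThreeSatToVertexCover {
    static List<String[]> buildEdges(int numVars, int[][] clauses) {
        List<String[]> edges = new ArrayList<>();
        // Variable gadgets: an edge between x_i and !x_i (at least one endpoint must be covered).
        for (int i = 1; i <= numVars; i++)
            edges.add(new String[]{ "x" + i, "!x" + i });
        // Clause gadgets: a triangle per clause (at least two corners must be covered),
        // with each corner also connected to its literal's node in the variable gadget.
        for (int c = 0; c < clauses.length; c++) {
            String[] corner = new String[3];
            for (int j = 0; j < 3; j++) {
                corner[j] = "clause" + c + "_" + j;
                int lit = clauses[c][j];
                edges.add(new String[]{ corner[j], lit > 0 ? "x" + lit : "!x" + (-lit) });
            }
            edges.add(new String[]{ corner[0], corner[1] });
            edges.add(new String[]{ corner[1], corner[2] });
            edges.add(new String[]{ corner[0], corner[2] });
        }
        return edges;   // then ask: is there a vertex cover of size at most K = numVars + 2 * clauses.length?
    }
}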

Some Other
NP-Complete Problems
SUBSET-SUM: Given a set of integers and a
distinguished integer K, is there a subset of
the integers that sums to K?

NP-complete by reduction from VERTEX-COVER

33

SUBSET-SUM
SUBSET-SUM: Given a set S of n integers and a distinguished integer k, is there a subset of the integers in S that sums to k?
Example: Suppose we have an internet web server, and we are presented with a collection of download requests. For each download request we can easily determine the size of the requested file. Thus, we can abstract each web request simply as an integer: the size of the requested file. Given this set of integers, we might be interested in determining a subset of them that exactly sums to the bandwidth our server can accommodate in one minute. Unfortunately, this problem is an instance of SUBSET-SUM. Moreover, because it is NP-complete, this problem will actually get harder to solve as our web server's bandwidth and request-handling ability improve.
34
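Verifying a SUBSET-SUM certificate is trivial, which is exactly why the problem is in NP. A minimal Java sketch follows; a full verifier would also check that each chosen number really comes from S.

// Sketch: checking a SUBSET-SUM certificate, i.e., the chosen subset itself.
class SubsetSumChecker {
    static boolean verify(long[] chosenSubset, long k) {
        long sum = 0;
        for (long x : chosenSubset) sum += x;            // linear in the certificate size
        return sum == k;
    }
}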

Some Other
NP-Complete Problems
0/1 Knapsack: Given a collection of items with weights and benefits, is there a subset of weight at most W and benefit at least K?
NP-complete by reduction from SUBSET-SUM
Hamiltonian-Cycle: Given a graph G, is there a cycle in G that visits each vertex exactly once?
NP-complete by reduction from VERTEX-COVER
Traveling Salesperson Tour (TSP): Given a complete weighted graph G, is there a cycle that visits each vertex and has total cost at most K?
NP-complete by reduction from Hamiltonian-Cycle
35
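As with the other problems on this slide, membership in NP follows from a simple certificate check. The Java sketch below verifies a claimed Hamiltonian cycle given as an ordering of the vertices; the adjacency-set representation is an assumption made for illustration.

import java.util.*;

// Sketch: verifying a HAMILTONIAN-CYCLE certificate, an ordering of all n vertices.
class HamiltonianCycleChecker {
    static boolean verify(List<Set<Integer>> adj, int[] order) {
        int n = adj.size();
        if (order.length != n) return false;             // must list every vertex
        boolean[] seen = new boolean[n];
        for (int i = 0; i < n; i++) {
            int u = order[i], v = order[(i + 1) % n];    // wrap around to close the cycle
            if (seen[u]) return false;                   // each vertex appears exactly once
            seen[u] = true;
            if (!adj.get(u).contains(v)) return false;   // consecutive vertices must be adjacent in G
        }
        return true;
    }
}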
