
Unit I

Asymptotic Notations

Asymptotic Notation: O(), o(), Ω(), ω(), and Θ()

A problem may have numerous algorithmic solutions. In order to choose the best
algorithm for a particular task, you need to be able to judge how long a particular solution
will take to run, or, more accurately, to compare two solutions and choose the better of the
two. You don't need to know how many minutes and seconds they will take, but you do
need some way to compare algorithms against one another.

Asymptotic complexity is a way of expressing the main component of the cost of an
algorithm, using idealized units of computational work. Running time is not the only issue
in algorithms; there are space issues as well. Generally, a trade-off between time and
space is observed in algorithms, and asymptotic notation lets you reason about that trade-off.
If you think of the amount of time and space your algorithm uses as a function of the size of
your data (time and space are usually analyzed separately), you can analyze
how the time and space requirements change when you introduce more data to your program.

This is important in data structures because you want a structure that behaves efficiently
as you increase the amount of data it handles. Keep in mind, though, that algorithms that
are efficient with large amounts of data are not always simple and efficient for small
amounts of data. So if you know you are working with only a small amount of data and
you have concerns about speed and code space, a trade-off can be made for a function that
does not behave well for large amounts of data.

Consider, for example, the algorithm for sorting a deck of cards, which proceeds by
repeatedly searching through the deck for the lowest card. The asymptotic complexity of
this algorithm is the square of the number of cards in the deck. This quadratic behavior is
the main term in the complexity formula; it says, for example, that if you double the size of
the deck, the work is roughly quadrupled.

The Idea
“Big-O” notation was introduced in P. Bachmann’s 1892 book Analytische
Zahlentheorie. He used it to say things like “x is O(n²)” instead of “x ≤ n².” The
notation works well to compare algorithm efficiencies because we want to say that the
growth of effort of a given algorithm approximates the shape of a standard function.

The Definitions

Big-O (O()) is one of five standard asymptotic notations. In practice, Big-O is used as a
tight upper-bound on the growth of an algorithm’s effort (this effort is described by the
function f(n)), even though, as written, it can also be a loose upper-bound. To make its
role as a tight upper-bound more clear, “Little-o” (o()) notation is used to describe an
upper-bound that cannot be tight.
Definition (Big–O, O()): Let f(n) and g(n) be functions that map positive integers to
positive real numbers. We say that f(n) is O(g(n)) (or f(n) ∈ O(g(n))) if there exists a real
constant c > 0 and there exists an integer constant n0 ≥ 1 such that f(n) ≤ c · g(n) for
every integer n ≥ n0.

Definition (Little–o, o()): Let f(n) and g(n) be functions that map positive integers to
positive real numbers. We say that f(n) is o(g(n)) (or f(n) ∈ o(g(n))) if for any real
constant c > 0, there exists an integer constant n0 ≥ 1 such that f(n) < c · g(n) for every
integer n ≥ n0.

On the other side of f(n), it is convenient to define parallels to O() and o() that provide
tight and loose lower bounds on the growth of f(n). “Big-Omega” (Ω()) is the tight lower
bound notation, and “little-omega” (ω()) describes the loose lower bound.

Definition (Big–Omega, Ω()): Let f(n) and g(n) be functions that map positive integers to
positive real numbers. We say that f(n) is Ω(g(n)) (or f(n) ∈ Ω(g(n))) if there exists a real
constant c > 0 and there exists an integer constant n0 ≥ 1 such that f(n) ≥ c · g(n) for
every integer n ≥ n0.

Definition (Little–Omega, ω()): Let f(n) and g(n) be functions that map positive integers
to positive real numbers. We say that f(n) is ω(g(n)) (or f(n) ∈ ω(g(n))) if for any real
constant c > 0, there exists an integer constant n0 ≥ 1 such that f(n) > c · g(n) for every
integer n ≥ n0.

A graph of f(n) plotted against c · g(n) in each case can help you visualize the relationships between these notations.

Definition (Big–Theta, Θ()): Let f(n) and g(n) be functions that map positive integers to
positive real numbers. We say that f(n) is Θ(g(n)) (or f(n) ∈ Θ(g(n))) if and only if f(n) ∈
O(g(n)) and f(n) ∈ Ω(g(n)).

Application Examples
Here are a few examples that show how the definitions should be applied.
1. Let f(n) = 7n + 8 and g(n) = n. Is f(n) ∈ O(g(n))?
For 7n + 8 ∈ O(n), we have to find c and n0 such that 7n + 8 ≤ c · n, ∀ n ≥ n0. By
inspection, it’s clear that c must be larger than 7. Let c = 8.
Now we need a suitable n0. In this case, f(8) = 8 · g(8). Because the definition of O()
requires that f(n) ≤ c · g(n), we can select n0 = 8, or any integer above 8 – they will all
work.

We have identified values for the constants c and n0 such that 7n + 8 ≤ c · n for every
n ≥ n0, so we can say that 7n + 8 is O(n).
(But how do we know that this will work for every n above 7? We can prove by induction
that 7n + 8 ≤ 8n, ∀ n ≥ 8. Be sure that you can write such proofs if asked!)
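Since the notes ask you to be able to write such proofs, here is one way the induction could be laid out (a sketch using the constants chosen above, c = 8 and n0 = 8):

Claim. $7n + 8 \le 8n$ for every integer $n \ge 8$.

Base case ($n = 8$): $7 \cdot 8 + 8 = 64 = 8 \cdot 8$, so the claim holds with equality.

Inductive step: assume $7k + 8 \le 8k$ for some integer $k \ge 8$. Then
\[
7(k+1) + 8 = (7k + 8) + 7 \le 8k + 7 < 8(k+1),
\]
so the claim holds for $k + 1$ as well. By induction, $7n + 8 \le 8n$ for all $n \ge 8$, and therefore $7n + 8 \in O(n)$. $\Box$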
Big-O Notation

Definition

Big-O is the formal method of expressing the upper bound of an algorithm's running
time. It's a measure of the longest amount of time it could possibly take for the algorithm
to complete.

More formally, for non-negative functions, f(n) and g(n), if there exists an integer n0 and
a constant c > 0 such that for all integers n > n0, f(n) ≤ cg(n), then f(n) is Big O of g(n).
This is denoted as "f(n) = O(g(n))". If graphed, g(n) serves as an upper bound to the curve
you are analyzing, f(n).

Big-Omega Notation

For non-negative functions, f(n) and g(n), if there exists an integer n0 and a constant c > 0
such that for all integers n > n0, f(n) ≥ cg(n), then f(n) is omega of g(n). This is denoted as
"f(n) = Ω(g(n))".

This is almost the same definition as Big-O, except that here "f(n) ≥ cg(n)"; this makes g(n) a
lower bound function instead of an upper bound function. It describes the best that can
happen for a given data size.

Best, Worst, and Average-Case Complexity

Using the RAM model of computation, we can count how many steps our algorithm will
take on any given input instance by simply executing it on the given input. However, to
really understand how good or bad an algorithm is, we must know how it works over all
instances.

As a branch of the theory of computation in computer science, computational complexity
theory investigates the problems related to the amounts of resources required for the
execution of algorithms (e.g., execution time), and the inherent difficulty in providing
efficient algorithms for specific computational problems.

As the size of the input to an algorithm increases, how do the running time and memory
requirements of the algorithm change, and what are the implications and ramifications of
that change?

The time complexity of a problem is the number of steps that it takes to solve an instance
of the problem as a function of the size of the input (usually measured in bits), using the
most efficient algorithm. To understand this intuitively, consider the example of an
instance that is n bits long that can be solved in n² steps. In this example we say the
problem has a time complexity of n². Of course, the exact number of steps will depend on
exactly what machine or language is being used. To avoid that problem, the Big O
notation is generally used (sometimes described as the "order" of the calculation, as in
"on the order of"). If a problem has time complexity O(n²) on one typical computer, then
it will also have complexity O(n²) on most other computers, so this notation allows us to
generalize away from the details of a particular computer.
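As a concrete illustration of the difference between linear and quadratic step counts, compare the two C functions below (a sketch; the function names and the pair-counting task are illustrative, not taken from the notes):

#include <stddef.h>

/* O(n): a single pass over the array, so the number of steps grows linearly with n. */
long sum_array(const int a[], size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* O(n^2): the nested loops examine every pair (i, j), so doubling n roughly
   quadruples the number of comparisons performed. */
size_t count_equal_pairs(const int a[], size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (a[i] == a[j])
                count++;
    return count;
}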
Simply put, the space complexity of a problem is a related concept that measures the amount of
space, or memory, required by the algorithm. An informal analogy would be the amount
of scratch paper needed while working out a problem with pen and paper. Space
complexity is also measured with Big O notation.

Time Complexity

Time complexity is the measure of how quickly or slowly a program runs relative to
some measure of the size of the problem n.
 Most primitive operations execute in constant time.
 A loop, or a chain of recursive calls, that runs once per input element takes linear time.

Space Complexity

Space complexity is the measure of how much memory is used by a program relative to
some measure of the size of the problem n.
 Most primitives use constant space.
 Tail-recursive calls also use constant space (when tail-call elimination is performed).
 General recursive calls use linear space (one frame per level of recursion).


To understand the notions of the best, worst, and average-case complexity, one must
think about running an algorithm on all possible instances of data that can be fed to it.

Comparison

 The worst-case complexity of the algorithm is the function defined by the maximum
number of steps taken on any instance of size n. It represents the curve passing through
the highest point of each column.
 The best-case complexity of the algorithm is the function defined by the minimum
number of steps taken on any instance of size n. It represents the curve passing through
the lowest point of each column.

 Finally, the average-case complexity of the algorithm is the function defined by the
average number of steps taken over all instances of size n, as illustrated by the sketch below.
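For instance, the three cases differ sharply for a simple sequential search (a sketch; the function name is illustrative):

#include <stddef.h>

/* Returns the index of 'key' in a[0..n-1], or -1 if it is absent. */
int sequential_search(const int a[], size_t n, int key)
{
    for (size_t i = 0; i < n; i++)
        if (a[i] == key)
            return (int)i;    /* best case: key is a[0], one comparison */
    return -1;                /* worst case: key absent, n comparisons  */
}

/* Average case (key present, all positions equally likely): about (n + 1)/2
   comparisons, so best = O(1) while worst = average = Theta(n). */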

Unit II
Divide-and-Conquer Algorithm

Divide-and-conquer is a top-down technique for designing algorithms that consists of
dividing the problem into smaller subproblems, hoping that the solutions of the
subproblems are easier to find, and then composing the partial solutions into the solution
of the original problem.

A little more formally, the divide-and-conquer paradigm consists of the following major phases:

 Breaking the problem into several sub-problems that are similar to the original
problem but smaller in size,
 Solving the sub-problems recursively (successively and independently), and then
 Combining these solutions to the sub-problems to create a solution to the original
problem.

Binary Search (simplest application of divide-and-conquer)

Binary Search is an extremely well-known instance of the divide-and-conquer paradigm.

Given an ordered array of n elements, the basic idea of binary search is that for a given
element we "probe" the middle element of the array. We continue in either the lower or the
upper segment of the array, depending on the outcome of the probe, until we reach the
required (given) element.

Problem Let A[1 . . . n] be an array sorted in non-decreasing order; that is, A[i] ≤ A[j]
whenever 1 ≤ i ≤ j ≤ n. Let 'q' be the query point. The problem consists of finding 'q' in
the array A. If q is not in A, then find the position where 'q' might be inserted.
Formally, find the index i such that 1 ≤ i ≤ n+1 and A[i-1] < q ≤ A[i].

Sequential Search

Look sequentially at each element of A until either we reach the end of the array A or we
find an item no smaller than 'q'.

Sequential search for 'q' in array A

for i = 1 to n do
if A [i] ≥ q then
return index i
return n + 1

Analysis

This algorithm clearly takes Θ(r) time, where r is the index returned. This is Ω(n) in the worst
case and O(1) in the best case.

If the elements of the array A are distinct and the query point q is indeed in the array, then the
loop executes (n + 1)/2 times on average. On average (as well as in the worst
case), sequential search takes Θ(n) time.
Binary Search

Look for 'q' either in the first half or in the second half of the array A. Compare 'q' to the
element in the middle position, ⌊n/2⌋, of the array. Let k = ⌊n/2⌋. If q ≤ A[k], then search in
A[1 . . k]; otherwise search A[k+1 . . n] for 'q'. Binary search for q in subarray A[i . .
j] with the promise that

A[i-1] < q ≤ A[j]


If i = j then
    return i (index)
k = ⌊(i + j)/2⌋
if q ≤ A[k]
    then return Binary Search (A[i . . k], q)
    else return Binary Search (A[k+1 . . j], q)

Analysis

Binary Search can be accomplished in logarithmic time in the worst case, i.e., T(n) =
Θ(log n). Since this version does not stop early on a match, it takes logarithmic time in the best case as well.

Iterative Version of Binary Search

Iterative binary search for q in array A[1 . . n]

if q > A [n]
then return n + 1
i = 1;
j = n;
while i < j do
k = ⌊(i + j)/2⌋
if q ≤ A [k]
then j = k
else i = k + 1
return i (the index)

Analysis

The analysis of the iterative algorithm is identical to that of its recursive counterpart.
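A C version of the iterative search might look as follows (a sketch, not taken from the notes; it uses 0-based indexing and returns the insertion position n when q is larger than every element, mirroring the pseudocode above):

#include <stddef.h>

/* Returns the smallest index i such that a[i] >= q, or n if q is greater
   than every element of the sorted array a[0..n-1]. */
size_t binary_search(const int a[], size_t n, int q)
{
    if (n == 0 || q > a[n - 1])
        return n;
    size_t i = 0, j = n - 1;
    while (i < j) {
        size_t k = (i + j) / 2;   /* middle of the current range  */
        if (q <= a[k])
            j = k;                /* the answer lies in a[i..k]   */
        else
            i = k + 1;            /* the answer lies in a[k+1..j] */
    }
    return i;
}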

Merge Sort
Merge-sort is based on the divide-and-conquer paradigm. The Merge-sort algorithm can
be described in general terms as consisting of the following three steps:

1. Divide Step
If the given array A has zero or one element, return A; it is already sorted. Otherwise,
divide A into two arrays, A1 and A2, each containing about half of the elements of
A.
2. Recursion Step
Recursively sort array A1 and A2.
3. Conquer Step
Combine the elements back in A by merging the sorted arrays A1 and A2 into a
sorted sequence.

We can visualize Merge-sort by means of a binary tree where each node of the tree
represents a recursive call and each external node represents an individual element of the given
array A. Such a tree is called a Merge-sort tree. The heart of the Merge-sort algorithm is the
conquer step, which merges two sorted sequences into a single sorted sequence.

To begin, suppose that we have two sorted arrays A1[1], A1[2], . . . , A1[m] and A2[1],
A2[2], . . . , A2[n]. The following is a direct algorithm for the obvious strategy of
successively choosing the smallest remaining element from A1 and A2 and putting it in A.
MERGE (A1, A2, A)

i ← 1; j ← 1
A1[m+1] ← INT_MAX; A2[n+1] ← INT_MAX
For k ← 1 to m + n do
    if A1[i] < A2[j]
        then A[k] ← A1[i]
             i ← i + 1
        else A[k] ← A2[j]
             j ← j + 1

Merge Sort Algorithm

MERGE_SORT (A)

if A has zero or one element then return
A1[1 . . ⌈n/2⌉] ← A[1 . . ⌈n/2⌉]
A2[1 . . ⌊n/2⌋] ← A[⌈n/2⌉ + 1 . . n]
Merge Sort (A1)
Merge Sort (A2)
Merge (A1, A2, A)

Analysis

Let T(n) be the time taken by this algorithm to sort an array of n elements. Dividing A into the
subarrays A1 and A2 takes linear time, and it is easy to see that Merge (A1, A2, A) also takes
linear time. Consequently,

T(n) = T(⌈n/2⌉) + T(⌊n/2⌋) + Θ(n)

or, for simplicity,

T(n) = 2T(n/2) + Θ(n)

The total running time of the Merge sort algorithm is O(n lg n), which is asymptotically
optimal; like Heap sort, Merge sort has a guaranteed n lg n running time. Merge sort
requires Θ(n) extra space; Merge is not an in-place algorithm. The only known ways to
merge in place (without any extra space) are too complex to be reduced to a practical
program.
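One way to see why this recurrence gives Θ(n lg n) is to expand it level by level (a sketch, assuming for simplicity that n is a power of 2 and writing the Θ(n) term as cn):

\[
\begin{aligned}
T(n) &= 2T(n/2) + cn \\
     &= 4T(n/4) + 2c(n/2) + cn = 4T(n/4) + 2cn \\
     &= \cdots = 2^{k}\,T(n/2^{k}) + kcn .
\end{aligned}
\]
The recursion bottoms out when $n/2^{k} = 1$, i.e. when $k = \lg n$, giving
\[
T(n) = n\,T(1) + cn\lg n = \Theta(n \lg n).
\]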
Implementation

void m_sort(int numbers[], int temp[], int left, int right);
void merge(int numbers[], int temp[], int left, int mid, int right);

void mergeSort(int numbers[], int temp[], int array_size)
{
    m_sort(numbers, temp, 0, array_size - 1);
}

void m_sort(int numbers[], int temp[], int left, int right)
{
    int mid;

    if (right > left)
    {
        mid = (right + left) / 2;
        m_sort(numbers, temp, left, mid);
        m_sort(numbers, temp, mid + 1, right);
        merge(numbers, temp, left, mid + 1, right);
    }
}

/* Merge the sorted runs numbers[left..mid-1] and numbers[mid..right]. */
void merge(int numbers[], int temp[], int left, int mid, int right)
{
    int i, left_end, num_elements, tmp_pos;

    left_end = mid - 1;
    tmp_pos = left;
    num_elements = right - left + 1;

    while ((left <= left_end) && (mid <= right))
    {
        if (numbers[left] <= numbers[mid])
        {
            temp[tmp_pos] = numbers[left];
            tmp_pos = tmp_pos + 1;
            left = left + 1;
        }
        else
        {
            temp[tmp_pos] = numbers[mid];
            tmp_pos = tmp_pos + 1;
            mid = mid + 1;
        }
    }
    while (left <= left_end)
    {
        temp[tmp_pos] = numbers[left];
        left = left + 1;
        tmp_pos = tmp_pos + 1;
    }
    while (mid <= right)
    {
        temp[tmp_pos] = numbers[mid];
        mid = mid + 1;
        tmp_pos = tmp_pos + 1;
    }
    /* Copy the merged run back; the loop runs exactly num_elements times. */
    for (i = 0; i < num_elements; i++)
    {
        numbers[right] = temp[right];
        right = right - 1;
    }
}

Greedy Introduction

Greedy algorithms are simple and straightforward. They are shortsighted in their
approach in the sense that they make decisions on the basis of information at hand without
worrying about the effect these decisions may have in the future. They are easy to invent,
easy to implement, and most of the time quite efficient. Many problems cannot be solved
correctly by the greedy approach. Greedy algorithms are used to solve optimization problems.

Greedy Approach

A greedy algorithm works by making the decision that seems most promising at any
moment; it never reconsiders this decision, whatever situation may arise later.

As an example consider the problem of "Making Change".

Coins available are:


 dollars (100 cents)
 quarters (25 cents)
 dimes (10 cents)
 nickels (5 cents)
 pennies (1 cent)

Problem Make a change of a given amount using the smallest possible number of
coins.

Informal Algorithm

 Start with nothing.
 At every stage, without passing the given amount,
o add the largest coin possible to the coins already chosen.

Formal Algorithm

Make change for n units using the least possible number of coins.

MAKE-CHANGE (n)
C ← {100, 25, 10, 5, 1}   // constant: the available denominations
S ← {}                    // set that will hold the solution
sum ← 0                   // sum of the items in the solution set
WHILE sum ≠ n
    x ← largest item in set C such that sum + x ≤ n
    IF no such item THEN
        RETURN "No Solution"
    S ← S ∪ {x}
    sum ← sum + x
RETURN S

Example Make change for 2.89 (289 cents); here n = 289, and the solution contains 2
dollars, 3 quarters, 1 dime and 4 pennies. The algorithm is greedy because at every stage
it chooses the largest coin without worrying about the consequences. Moreover, it never
changes its mind in the sense that once a coin has been included in the solution set, it
remains there.
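A direct C translation of MAKE-CHANGE might look like this (a sketch; it counts how many coins of each denomination are used rather than returning a multiset):

#include <stdio.h>

/* Greedy change-making for the denominations above (amounts in cents).
   Returns 0 on success, -1 if no exact change is possible. */
int make_change(int n)
{
    const int coins[] = {100, 25, 10, 5, 1};
    const int k = 5;
    int count[5] = {0};
    int sum = 0;

    while (sum != n) {
        int picked = 0;
        for (int i = 0; i < k; i++) {        /* largest coin that still fits */
            if (sum + coins[i] <= n) {
                count[i]++;
                sum += coins[i];
                picked = 1;
                break;
            }
        }
        if (!picked)
            return -1;                        /* "No Solution" */
    }
    for (int i = 0; i < k; i++)
        printf("%d coin(s) of %d cents\n", count[i], coins[i]);
    return 0;
}

/* make_change(289) reports 2 x 100, 3 x 25, 1 x 10, 0 x 5 and 4 x 1. */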

Characteristics and Features of Problems solved by Greedy Algorithms

To construct the solution in an optimal way, the algorithm maintains two sets: one contains
the chosen items and the other contains the rejected items.

The greedy algorithm consists of four (4) functions:

1. A function that checks whether a chosen set of items provides a solution.
2. A function that checks the feasibility of a set.
3. The selection function, which tells which of the candidates is the most promising.
4. An objective function, which does not appear explicitly, that gives the value of a
solution.

Structure Greedy Algorithm

 Initially the set of chosen items is empty, i.e., the solution set is empty.

 At each step
o an item is added to the solution set by using the selection function.
o IF the set would no longer be feasible
 reject the item under consideration (and never consider it again).
o ELSE IF the set is still feasible THEN
 add the current item.

Definitions of feasibility

A feasible set (of candidates) is promising if it can be extended to produce not merely a
solution, but an optimal solution to the problem. In particular, the empty set is always
promising. Why? Because an optimal solution always exists.

Unlike dynamic programming, which solves the subproblems bottom-up, a greedy
strategy usually progresses in a top-down fashion, making one greedy choice after
another, reducing each problem to a smaller one.

Greedy-Choice Property

The "greedy-choice property" and "optimal substructure" are two ingredients in the
problem that lend to a greedy strategy.

Greedy-Choice Property

It says that a globally optimal solution can be arrived at by making a locally optimal
choice.

Spanning Tree and Minimum Spanning Tree

Spanning Trees

A spanning tree of a graph is any tree that includes every vertex in the graph. A little more
formally, a spanning tree of a graph G is a subgraph of G that is a tree and contains all the
vertices of G. An edge of a spanning tree is called a branch; an edge in the graph that is
not in the spanning tree is called a chord. We construct a spanning tree whenever we want
to find a simple, cheap and yet efficient way to connect a set of terminals (computers,
cities, factories, etc.). Spanning trees are important for the following reasons.

 Spanning trees form a sparse subgraph that tells a lot about the original
graph.
 Spanning trees are very important in designing efficient routing algorithms.
 Some hard problems (e.g., the Steiner tree problem and the traveling salesman problem)
can be solved approximately by using spanning trees.
 Spanning trees have wide applications in many areas, such as network design.

Greedy Spanning Tree Algorithm

One of the most elegant spanning tree algorithms that we know of is as follows:

 Examine the edges of the graph in any arbitrary sequence.

 Decide whether each edge will be included in the spanning tree: an edge is included
precisely when it does not form a cycle with the edges already accepted.

Note that each time a step of the algorithm is performed, one edge is examined. If there is
only a finite number of edges in the graph, the algorithm must halt after a finite number
of steps. Thus, the time complexity of this algorithm is clearly O(n), where n is the
number of edges in the graph.

Some important facts about spanning trees are as follows:

 Any two vertices in a tree are connected by a unique path.


 Let T be a spanning tree of a graph G, and let e be an edge of G not in T. Then T + e
contains a unique cycle.

Lemma The number of spanning trees in the complete graph Kn is n^(n-2).

Greediness It is easy to see that this algorithm has the property that each edge is
examined at most once. Algorithms, like this one, which examine each entity at most
once and decide its fate once and for all during that examination are called greedy
algorithms. The obvious advantage of the greedy approach is that we do not have to spend
time reexamining entities.

Consider the problem of finding a spanning tree with the smallest possible weight or the
largest possible weight, respectively called a minimum spanning tree and a maximum
spanning tree. It is easy to see that if a graph possesses a spanning tree, it must have a
minimum spanning tree and also a maximum spanning tree. These spanning trees can be
constructed by performing the spanning tree algorithm (e.g., above mentioned algorithm)
with an appropriate ordering of the edges.

Minimum Spanning Tree Algorithm


Perform the spanning tree algorithm (above) by examining the edges in
order of nondecreasing weight (smallest first, largest last). If two or more edges have the
same weight, order them arbitrarily.

Maximum Spanning Tree Algorithm


Perform the spanning tree algorithm (above) by examining the edges in order
of nonincreasing weight (largest first, smallest last). If two or more edges have the same
weight, order them arbitrarily.
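Ordering the edges by nondecreasing weight and accepting an edge only when it joins two different components is exactly Kruskal's algorithm. A compact C sketch using a union-find structure is shown below (the type and function names, and the fixed vertex limit, are illustrative assumptions, not from the notes):

#include <stdlib.h>

typedef struct { int u, v, w; } Edge;

static int parent[1000];                  /* union-find forest, one entry per vertex */

static int find_root(int x)               /* representative of x's component */
{
    while (parent[x] != x)
        x = parent[x];
    return x;
}

static int cmp_weight(const void *a, const void *b)
{
    return ((const Edge *)a)->w - ((const Edge *)b)->w;
}

/* Returns the total weight of a minimum spanning tree of a connected graph
   with n vertices (numbered 0..n-1) and m edges; tree[] receives the chosen edges. */
int kruskal(Edge edges[], int m, int n, Edge tree[])
{
    int total = 0, taken = 0;
    for (int i = 0; i < n; i++)
        parent[i] = i;
    qsort(edges, m, sizeof(Edge), cmp_weight);        /* nondecreasing weight */
    for (int i = 0; i < m && taken < n - 1; i++) {
        int ru = find_root(edges[i].u), rv = find_root(edges[i].v);
        if (ru != rv) {                               /* no cycle is created  */
            parent[ru] = rv;
            tree[taken++] = edges[i];
            total += edges[i].w;
        }
    }
    return total;
}

For a maximum spanning tree, the only change is to sort the edges in nonincreasing order of weight.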
Knapsack Problem

Statement A thief robbing a store can carry a maximal weight of W in their
knapsack. There are n items; the ith item weighs wi and is worth vi dollars. What items
should the thief take?

There are two versions of the problem:

I. Fractional knapsack problem


The setup is the same, but the thief can take fractions of items, meaning that the items
can be broken into smaller pieces so that the thief may decide to carry only a fraction
xi of item i, where 0 ≤ xi ≤ 1.

 Exhibits the greedy-choice property.
 A greedy algorithm exists.
 Exhibits the optimal-substructure property.

II. 0-1 knapsack problem


The setup is the same, but the items may not be broken into smaller pieces, so the
thief may decide either to take an item or to leave it (a binary choice), but may not
take a fraction of an item.

 Does not exhibit the greedy-choice property.
 No greedy algorithm is guaranteed to work.
 Exhibits the optimal-substructure property.
 Only a dynamic programming algorithm exists.

Greedy Solution to the Fractional Knapsack Problem

There are n items in a store. For i = 1, 2, . . . , n, item i has weight wi > 0 and is worth vi > 0.
The thief can carry a maximum weight of W pounds in a knapsack. In this version of the
problem the items can be broken into smaller pieces, so the thief may decide to carry only
a fraction xi of object i, where 0 ≤ xi ≤ 1. Item i then contributes xiwi to the total weight in the
knapsack, and xivi to the value of the load.

In symbols, the fractional knapsack problem can be stated as follows:

maximize Σi=1..n xivi subject to the constraint Σi=1..n xiwi ≤ W

It is clear that an optimal solution must fill the knapsack exactly, for otherwise we could
add a fraction of one of the remaining objects and increase the value of the load. Thus in
an optimal solution Σi=1..n xiwi = W.
Greedy-fractional-knapsack (w, v, W)

FOR i = 1 to n
    do x[i] = 0
weight = 0
while weight < W
    do i = best remaining item (largest vi/wi not yet taken)
       IF weight + w[i] ≤ W
           then x[i] = 1
                weight = weight + w[i]
           else x[i] = (W - weight) / w[i]
                weight = W
return x

Analysis

If the items are already sorted into decreasing order of vi/wi, then
the while-loop takes time in O(n);
therefore, the total time including the sort is in O(n log n).

If we keep the items in a heap with the largest vi/wi at the root, then

 creating the heap takes O(n) time,
 each iteration of the while-loop takes O(log n) time (since the heap property must be restored after
the removal of the root).

Although this data structure does not alter the worst case, it may be faster if only a small
number of items are needed to fill the knapsack.
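A C sketch of the sort-then-scan version follows (illustrative names; the items are assumed to be already sorted by decreasing vi/wi):

/* w[] and v[] hold weights and values of items already sorted by decreasing
   value/weight ratio; x[i] receives the fraction of item i taken.
   Returns the total value loaded into a knapsack of capacity W. */
double fractional_knapsack(const double w[], const double v[], double x[],
                           int n, double W)
{
    double weight = 0.0, value = 0.0;
    for (int i = 0; i < n; i++)
        x[i] = 0.0;
    for (int i = 0; i < n && weight < W; i++) {
        if (weight + w[i] <= W) {          /* the whole item fits            */
            x[i] = 1.0;
            weight += w[i];
            value += v[i];
        } else {                           /* take just enough to fill it up */
            x[i] = (W - weight) / w[i];
            value += x[i] * v[i];
            weight = W;
        }
    }
    return value;
}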

One variant of the 0-1 knapsack problem arises when the order of the items sorted by increasing
weight is the same as their order when sorted by decreasing value.

The optimal solution to this variant is to sort the items by value in decreasing
order and then pick the most valuable remaining item, which is also the lightest, provided its
weight does not exceed the remaining capacity. After picking an item, reduce the remaining
capacity by the weight of the item just picked. The next item to pick is the most
valuable item among those remaining. Keep following the same strategy until the thief cannot
carry any more items (due to the weight limit).

Proof

One way to prove the correctness of the above algorithm is to prove the greedy-choice
property and the optimal-substructure property. It consists of two steps. First, prove that there
exists an optimal solution that begins with the greedy choice given above. The second part
proves that if A is an optimal solution to the original problem S, then A - a is also an
optimal solution to the problem S - s, where a is the item the thief picked in the greedy
choice and S - s is the subproblem remaining after the first greedy choice has been made. The
second part is easy to prove since the more valuable items have less weight.
Note that if the greedily chosen item, with value v` and weight w`, were not taken, it could
replace any other item in the solution because w` < w, and doing so would increase the
value because v` > v. □

Theorem The fractional knapsack problem has the greedy-choice property.

Proof Let the ratio v`/w` be maximal. This supposition implies that v`/w` ≥ v/w for any
pair (v, w), so v`·w/w` ≥ v for any (v, w). Now suppose a solution does not contain the full
w` weight of the best-ratio item. Then replacing an amount of any other item w with more of w`
improves the value. □

Binary search

However, if we place our items in an array and sort them in either ascending or
descending order on the key first, then we can obtain much better performance with an
algorithm called binary search.
In binary search, we first compare the key with the item in the middle position of the
array. If there's a match, we can return immediately. If the key is less than the middle
key, then the item sought must lie in the lower half of the array; if it's greater then the
item sought must lie in the upper half of the array. So we repeat the procedure on the
lower (or upper) half of the array. The function can now be implemented:

static void *bin_search( collection c, int low, int high, void *key ) {
    int mid, cmp;
    /* Termination check */
    if (low > high) return NULL;
    mid = (high + low) / 2;
    cmp = memcmp(ItemKey(c->items[mid]), key, c->size);
    if (cmp == 0)
        /* Match, return item found */
        return c->items[mid];
    else if (cmp > 0)
        /* key is less than mid, search lower half */
        return bin_search( c, low, mid - 1, key );
    else
        /* key is greater than mid, search upper half */
        return bin_search( c, mid + 1, high, key );
}

void *FindInCollection( collection c, void *key ) {
    /* Find an item in a collection
       Pre-condition:
         c is a collection created by ConsCollection
         c is sorted in ascending order of the key
         key != NULL
       Post-condition: returns an item identified by key if
         one exists, otherwise returns NULL
    */
    int low, high;
    low = 0; high = c->item_cnt - 1;
    return bin_search( c, low, high, key );
}

Points to note:

a. bin_search is recursive: it determines whether the search key lies in the lower or upper
half of the array, then calls itself on the appropriate half.

b. There is a termination condition (two of them in fact!)


i. If low > high then the partition to be searched has no elements in it and
ii. If there is a match with the element in the middle of the current partition, then we
can return immediately.

c. AddToCollection will need to be modified to ensure that each item added is placed in
its correct place in the array. The procedure is simple:
i. Search the array until the correct spot to insert the new item is found,
ii. Move all the following items up one position and
iii. Insert the new item into the empty position thus created.

d. bin_search is declared static. It is a local function and is not used outside this class: if it
were not declared static, it would be exported and be available to all parts of the program.

The static declaration also allows other classes to use the same name internally.

Unit III

Dynamic Programming

Dynamic programming is a stage-wise search method suitable for optimization problems
whose solutions may be viewed as the result of a sequence of decisions. The most
attractive property of this strategy is that during the search for a solution it avoids full
enumeration by pruning early those partial decision sequences that cannot possibly lead to an
optimal solution. In many practical situations, this strategy hits the optimal solution in a
polynomial number of decision steps. However, in the worst case, such a strategy may
end up performing full enumeration.

Dynamic programming takes advantage of the duplication of subproblems and arranges to solve each
subproblem only once, saving the solution (in a table or similar structure) for later use. The
underlying idea of dynamic programming is: avoid calculating the same thing twice,
usually by keeping a table of known results of subproblems. Unlike divide-and-conquer,
which solves the subproblems top-down, dynamic programming is a bottom-up
technique.

Bottom-up means

i. Start with the smallest subproblems.
ii. By combining their solutions, obtain the solutions to subproblems of increasing size,
iii. until you arrive at the solution of the original problem.

The Principle of Optimality

Dynamic programming relies on a principle of optimality. This principle states that
in an optimal sequence of decisions or choices, each subsequence must also be optimal.
For example, in the matrix chain multiplication problem, not only is the value we are interested
in optimal, but all the other entries in the table also represent optimal values for their subproblems.

The principle can be restated as follows: the optimal solution to a problem is a
combination of optimal solutions to some of its subproblems.

The difficulty in turning the principle of optimality into an algorithm is that it is not
usually obvious which subproblems are relevant to the problem under consideration.

Dynamic-Programming Solution to the 0-1 Knapsack Problem

Problem Statement A thief robbing a store can carry a maximal weight of W in
their knapsack. There are n items; the ith item weighs wi and is worth vi dollars. What items
should the thief take?

There are two versions of the problem:

Fractional knapsack problem The setup is the same, but the thief can take fractions of
items, meaning that the items can be broken into smaller pieces so that the thief may decide
to carry only a fraction xi of item i, where 0 ≤ xi ≤ 1.
0-1 knapsack problem The setup is the same, but the items may not be broken into
smaller pieces, so the thief may decide either to take an item or to leave it (a binary choice),
but may not take a fraction of an item.

Fractional knapsack problem

 Exhibits the greedy-choice property.
 A greedy algorithm exists.
 Exhibits the optimal-substructure property.

0-1 knapsack problem

 Does not exhibit the greedy-choice property.
 No greedy algorithm is guaranteed to work.
 Exhibits the optimal-substructure property.
 Only a dynamic programming algorithm exists.

Dynamic-Programming Solution to the 0-1 Knapsack Problem

Let i be the highest-numbered item in an optimal solution S for W pounds. Then S` = S -
{i} is an optimal solution for W - wi pounds, and the value of the solution S is vi plus the
value of the subproblem.

We can express this fact in the following formula: define c[i, w] to be the value of the solution for
items 1, 2, . . . , i and maximum weight w. Then

c[i, w] = 0                                        if i = 0 or w = 0
c[i, w] = c[i-1, w]                                if i > 0 and wi > w
c[i, w] = max { vi + c[i-1, w-wi], c[i-1, w] }     if i > 0 and w ≥ wi

This says that the value of the solution for i items either includes the ith item, in which case it
is vi plus a subproblem solution for (i - 1) items and the weight excluding wi, or does not
include the ith item, in which case it is a subproblem's solution for (i - 1) items and the same
weight. That is, if the thief picks item i, the thief takes vi value and can then choose from
items 1, 2, . . . , i - 1 up to the weight limit w - wi, getting c[i - 1, w - wi] additional value. On
the other hand, if the thief decides not to take item i, the thief can choose from items 1, 2, . . . ,
i - 1 up to the weight limit w, getting c[i - 1, w] value. The better of these two choices should be made.

As in the LCS problem, the above formula for c has boundary values of 0, and the other
values are computed from the input and "earlier" values of c. So the 0-1 knapsack
algorithm is like the LCS-length algorithm given in CLR for
finding a longest common subsequence of two sequences.

The algorithm takes as input the maximum weight W, the number of items n, and the two
sequences v = <v1, v2, . . . , vn> and w = <w1, w2, . . . , wn>. It stores the c[i, w] values in a
table, that is, a two-dimensional array c[0 . . n, 0 . . W], whose entries are computed in
row-major order. That is, the first row of c is filled in from left to right, then the second
row, and so on. At the end of the computation, c[n, W] contains the maximum value that
can be placed in the knapsack.

Dynamic-0-1-knapsack (v, w, n, W)

FOR w = 0 TO W
    DO c[0, w] = 0
FOR i = 1 TO n
    DO c[i, 0] = 0
       FOR w = 1 TO W
           DO IF wi ≤ w
                  THEN IF vi + c[i-1, w-wi] > c[i-1, w]
                           THEN c[i, w] = vi + c[i-1, w-wi]
                           ELSE c[i, w] = c[i-1, w]
                  ELSE
                       c[i, w] = c[i-1, w]

The set of items to take can be deduced from the table, starting at c[n, W] and tracing
backwards where the optimal values came from. If c[i, w] = c[i-1, w], item i is not part of
the solution, and we continue tracing with c[i-1, w]. Otherwise item i is part of the
solution, and we continue tracing with c[i-1, w-wi].

Analysis

This dynamic-0-1-knapsack algorithm takes Θ(nW) time, broken up as follows: Θ(nW)
time to fill the c-table, which has (n+1)·(W+1) entries, each requiring Θ(1) time to
compute, and O(n) time to trace the solution, because the tracing process starts in row n of the
table and moves up one row at each step.
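A straightforward C realization of this table-filling algorithm is sketched below (the array bounds and names are illustrative assumptions; items are indexed from 1 as in the pseudocode):

#define MAX_N 100
#define MAX_W 1000

/* c[i][w] = best value achievable using items 1..i with capacity w. */
static int c[MAX_N + 1][MAX_W + 1];

/* v[1..n] are the values and wt[1..n] the weights; returns c[n][W]. */
int knapsack01(const int v[], const int wt[], int n, int W)
{
    for (int w = 0; w <= W; w++)
        c[0][w] = 0;
    for (int i = 1; i <= n; i++) {
        c[i][0] = 0;
        for (int w = 1; w <= W; w++) {
            if (wt[i] <= w && v[i] + c[i - 1][w - wt[i]] > c[i - 1][w])
                c[i][w] = v[i] + c[i - 1][w - wt[i]];
            else
                c[i][w] = c[i - 1][w];
        }
    }
    return c[n][W];
}

The chosen items can then be recovered by walking back from c[n][W] exactly as described above.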

Multistage Graphs

A multistage graph is a graph

o G=(V,E) with V partitioned into K >= 2 disjoint subsets such that if (a,b) is in E,
then a is in Vi , and b is in Vi+1 for some subsets in the partition;
o and | V1 | = | VK | = 1.

The vertex s in V1 is called the source; the vertex t in VK is called the sink.

G is usually assumed to be a weighted graph.

The cost of a path from node v to node w is sum of the costs of edges in the path.

The "multistage graph problem" is to find the minimum cost path from s to t.

[Cf. the "network flow problem".]


Each set Vi is called a stage in the graph.

Consider the resource allocation problem:

Given n units of a resource to be allocated to k projects. For 1 <= i <= k and 0 <= j <= n, let

P(i,j) = profit obtained by allocating j units of the resource to project i.

Transform this into an instance of the "multistage graph problem".

Create a multistage graph:

V = {s}, and denote s = V(1,0) -- read: we are at node 1, having allocated 0 units of the resource.

Stages 1 to k are such that stage i consists of a set:

{ V(i+1,j) } j=0 .. n

[We could denote the vertices in this set as v(i+1, j), or could instead call them vj of the set Vi.]

The edges are weighted with C(i,j) = -P(i,j) [the negative of the profit] to make it a
minimization problem.

Dynamic Programming solution:

Let path(i,j) be some specification of the minimal path from vertex j in set i to vertex t;
C(i,j) is the cost of this path; c(j,t) is the weight of the edge from j to t.

C(i,j) = min { c(j,l) + C(i+1,l) : l in Vi+1 and (j,l) in E }

To write a simple algorithm, assign numbers to the vertices so that those in stage Vi have
lower numbers than those in stage Vi+1.

int[] MStageForward(Graph G)
{
// returns vector of vertices to follow through the graph
// let c[i][j] be the cost matrix of G

int n = G.n;   // number of nodes
int k = G.k;   // number of stages
float[] C = new float[n];
int[] D = new int[n];
int[] P = new int[k];
for (i = 1 to n) C[i] = 0.0;
for j = n-1 to 1 by -1 {
r = vertex such that (j,r) in G.E and c(j,r)+C(r) is minimum
C[j] = c(j,r)+C(r);
D[j] = r;
}
P[1] = 1; P[k] = n;
for j = 2 to k-1 {
P[j] = D[P[j-1]];
}
return P;
}

All Pairs Shortest Path (APSP) Problem

Given a directed graph G = (V,E), where each edge (v,w) has a nonnegative cost C[v,w],
for all pairs of vertices (v,w) find the cost of the lowest cost path from v to w.

 A generalization of the single-source-shortest-path problem.


 Use Dijkstra's algorithm, varying the source node among all the nodes in the
graph.

We will consider a slight extension to this problem: find the lowest cost path between
each pair of vertices.

 We must recover the path itself, and not just the cost of the path.

Floyd's Algorithm

Floyd's algorithm takes as input the cost matrix C[v,w]

 C[v,w] = ∞ if (v,w) is not in E

It returns as output

 a distance matrix D[v,w] containing the cost of the lowest cost path from v to w
o initially D[v,w] = C[v,w]
 a path matrix P, where P[v,w] holds the intermediate vertex k on the least cost
path between v and w that led to the cost stored in D[v,w].

We iterate N times over the matrix D, using k as an index. On the kth iteration, the D
matrix contains the solution to the APSP problem, where the paths only use vertices
numbered 1 to k.
On the next iteration, we compare the cost of going from i to j using only vertices
numbered 1..k (stored in D[i,j] on the kth iteration) with the cost of using the k+1th
vertex as an intermediate step, which is D[i,k+1] (to get from i to k+1) plus D[k+1,j] (to
get from k+1 to j).

If this results in a lower cost path, we remember it.

After N iterations, all possible paths have been examined, so D[v,w] contains the cost of
the lowest cost path from v to w using all vertices if necessary.

The Algorithm

FloydAPSP (int N, rmatrix &C, rmatrix &D, imatrix &P)
{
    int i, j, k;

    for (i = 0; i < N; i++) {
        for (j = 0; j < N; j++) {
            D[i][j] = C[i][j];
            P[i][j] = -1;
        }
        D[i][i] = 0.0;
    }
    for (k = 0; k < N; k++) {
        for (i = 0; i < N; i++) {
            for (j = 0; j < N; j++) {
                if (D[i][k] + D[k][j] < D[i][j]) {
                    D[i][j] = D[i][k] + D[k][j];
                    P[i][j] = k;
                }
            }
        }
    }
} /* FloydAPSP */

Clearly the algorithm is O(N^3).

Finding a Least Cost Path

Floyd's algorithm (modified to find the least cost paths, and not just the cost of the paths)
produces a matrix P, which, for each pair of nodes u and v, contains an intermediate node
on the least cost path from u to v

So the least cost path from u to v is the least cost path from u to P[u,v], followed by the
least cost path from P[u,v] to v.

The following procedure uses the P matrix produced earlier to print the intermediate
vertices on the least cost path from node u to node v.
Path (int u, int v, imatrix &P)
{
    int k;

    k = P[u][v];
    if (k == -1) return;
    Path(u, k, P);
    cout << k;
    Path(k, v, P);
} /* Path */

Note that this procedure could loop forever on an arbitrary matrix, but Floyd's algorithm
ensures that we cannot have k on the shortest path from u to v and v on the shortest path
from u to k.

Proof that Floyd's Algorithm Works

We will prove that after k iterations over the matrix D, D[i,j] is the cost of the cheapest
path from i to j that does not include a vertex numbered > k.

Proof by induction on k.

Basis: Let k = 0 (i.e., no iterations performed yet).

Then no intermediate vertices on a path from i to j are allowed, so D[i,j] should be C[i,j]
if (i,j) in E, and infinity otherwise. The initialization step does exactly this.

Induction step: Assume that after k iterations, D[i,j] is the cost of the lowest cost path
from i to j excluding all vertices from k+1 to N.

On the next (k+1) iteration, we are allowed to include vertex k+1 in any path.

For all pairs (i,j), the lowest cost path from i to j excluding vertices k+2 thru N goes thru
k+1 iff there is a low cost path from i to k+1 and from k+1 to j, excluding vertices k+2
thru N.

But the cheapest path from i to k+1 without using nodes k+2 thru N is simply D[i,k+1]
(by the induction hypothesis).

Similarly, the lowest cost path from k+1 to j without using nodes k+2 thru N is D[k+1,j].

Thus, we should use node k+1 to get from i to j iff D[i,k+1] + D[k+1,j] < D[i,j], the
cheapest path excluding k+1. Since this is exactly what is stored on the k+1th iteration,
we have completed the proof.
Comparison with Dijkstra's Algorithm

The all-pairs-shortest-path problem is a generalization of the single-source-shortest-path
problem, so we can use Floyd's algorithm, or Dijkstra's algorithm (varying the source
node over all nodes).

 Floyd's algorithm is O(N^3)


 Dijkstra's algorithm with an adjacency matrix is O(N^2), so varying over N
source nodes is O(N^3)
 Dijkstra's algorithm with adjacency lists is O(E log N), so varying over N source
nodes is O(N E log N)

For large sparse graphs, Dijkstra's algorithm is preferable.

Optimal Binary Search Trees

A binary search tree is a tree where the key values are stored in the internal nodes, the
external nodes (leaves) are null nodes, and the keys are ordered lexicographically; i.e., for
each internal node all the keys in the left subtree are less than the keys in the node, and all
the keys in the right subtree are greater.

When we know the probabilities of searching each one of the keys, it is quite easy to
compute the expected cost of accessing the tree. An OBST is a BST which has minimal
expected cost.

Example:

Key -5 1 8 7 13 21
Probabilities 1/8 1/32 1/16 1/32 1/4 1/2

The expected cost of a search is the sum, over all keys, of the probability of accessing the key
multiplied by the number of comparisons needed to reach it (its depth plus one).

It is clear that a tree built without regard to these probabilities need not be optimal: for example,
if the key 21 is placed closer to the root, given its high probability, the tree will have a lower
expected cost.

Criterion for an optimal tree:

Each optimal binary search tree is composed of a root and (at most) two optimal subtrees,
the left and the right.

Method:
The criterion for optimality gives a dynamic programming algorithm. For the root (and
each node in turn) we select one value to be stored in the node. (We have n possibilities
to do that.)

Once this choice is made, the set of keys which go into the left subtree and the right subtree
is completely defined, because the tree is lexicographically ordered. The left and right
subtrees are now constructed recursively (optimally). This gives the recursive definition
of the optimal cost. Let p(k) denote the probability of accessing key k, and let w(i, j) =
p(i) + p(i+1) + . . . + p(j) denote the sum of the probabilities of the keys i through j. The cost
c(i, j) of an optimal subtree over the keys i, . . . , j is then

c(i, j) = min over i ≤ k ≤ j of { c(i, k-1) + w(i, k-1) + p(k) + c(k+1, j) + w(k+1, j) }

The explanation of the formula is easy once we see that the first two terms correspond to the
left subtree, which is one level lower than the root, the middle term corresponds to the
root, and the last two to the right subtree; every cost is multiplied by its probability. Since
w(i, k-1) + p(k) + w(k+1, j) = w(i, j), the recurrence simplifies to

c(i, j) = w(i, j) + min over i ≤ k ≤ j of { c(i, k-1) + c(k+1, j) }

This procedure is exponential if applied directly. However, the optimal trees are only
constructed over contiguous sets of keys, and there are at most n(n+1)/2 different sets of
contiguous keys.

In this case we store the optimal cost of a subtree in a matrix: the matrix entry c(i, j) will
contain the cost of an optimal subtree constructed with the keys i to j.

We now fill the matrix diagonal by diagonal. An optimal tree with one node is just the node
itself (no other choice), so the diagonal of the matrix is easy to fill: c(i, i) = p(i).

The cost of the optimal tree over all the keys ends up in c(1, n). As you can see, it is
practical not to work with the probabilities but with the frequencies (i.e., the probabilities
times the least common multiple of their denominators), to avoid fractions as matrix
entries.
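A C sketch of this diagonal-by-diagonal fill is given below (illustrative names; freq[1..n] holds the access frequencies of the keys in sorted key order, and the function returns the optimal expected cost measured in frequency units):

#define NKEYS 64

/* cost[i][j] = cost of an optimal subtree over keys i..j (1-based). */
static long cost[NKEYS + 2][NKEYS + 2];

long optimal_bst(const long freq[], int n)
{
    /* Empty ranges cost nothing; a single key costs its own frequency. */
    for (int i = 1; i <= n + 1; i++)
        cost[i][i - 1] = 0;
    for (int i = 1; i <= n; i++)
        cost[i][i] = freq[i];

    for (int len = 2; len <= n; len++) {              /* fill diagonal by diagonal */
        for (int i = 1; i + len - 1 <= n; i++) {
            int j = i + len - 1;
            long w = 0, best = -1;
            for (int k = i; k <= j; k++)
                w += freq[k];                          /* w(i, j)                */
            for (int k = i; k <= j; k++) {             /* try each key k as root */
                long c = cost[i][k - 1] + cost[k + 1][j];
                if (best < 0 || c < best)
                    best = c;
            }
            cost[i][j] = w + best;
        }
    }
    return cost[1][n];
}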

Travelling Salesman Problem


Unit 4
Backtracking
Unit 5
Graph Traversals
Introduction

When dealing with graphs, one fundamental issue is to traverse the graph, to go through
it; in other words, to visit each vertex and each edge. To solve this problem, many
interesting algorithms exist. Two of them will be presented here: Depth-First Search (dfs)
and Breadth-First Search (bfs). These two methods are used, e.g., in connection with the
task of finding the connected components of a graph, which is a nice example of an
application of bfs and dfs.

2 Depth-First Search (dfs)


The dfs algorithm will become clear with this example: imagine a single person
trapped inside a maze. In order to get out, the person has to be sure to visit each
path and each intersection. So he or she uses two colors of paint to mark the intersections
already passed.
When discovering a new intersection, it is marked grey, and the way deeper into the
maze is continued. After reaching a “dead end” at the end of each path from an
intersection, though, the person knows that there is no more unexplored path from the
grey intersection, which now is completed and thus can be marked black. This “dead
end” is either an intersection which has already been marked grey or black, or simply a
path that does not lead to an intersection. The connection to graphs is
obvious: the intersections of the maze are the vertices, while the paths between the
intersections are the edges of the graph. The technical term for the returning of the
person from the “dead end” is backtracking. So what you are doing is to go away from
your starting vertex into the graph as deep as you can, until you have to backtrack to the
preceding grey vertex. To gain more information from the algorithm, not only a color
value (white, grey or black) is assigned to each vertex, but also two numbers: the
dfs_number and the completion_number.

While the first number is assigned when a vertex is discovered (marked grey), the latter
is allocated when it is completed (marked black).
Here is the pseudo-code of the dfs algorithm:

i := 1; j := 1;   // initialisation of iteration variables
PROCEDURE depth_first_search(u: vertex)
    color[u] := grey;
    dfs_number[u] := i++;
    FOR v ∈ Adj(u) DO   // for each vertex adjacent to u
        IF color[v] = white THEN depth_first_search(v)
    color[u] := black;
    completion_number[u] := j++;
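The same algorithm in C over an adjacency-matrix representation might look as follows (a sketch; the graph representation, vertex limit and counters are illustrative assumptions):

#define MAXV 100

enum { WHITE, GREY, BLACK };

static int adj[MAXV][MAXV];         /* adjacency matrix: adj[u][v] = 1 iff (u,v) is an edge */
static int color[MAXV];
static int dfs_number[MAXV], completion_number[MAXV];
static int n_vertices;
static int dfs_counter = 1, completion_counter = 1;

void depth_first_search(int u)
{
    color[u] = GREY;                        /* u has just been discovered       */
    dfs_number[u] = dfs_counter++;
    for (int v = 0; v < n_vertices; v++)    /* visit every white neighbour of u */
        if (adj[u][v] && color[v] == WHITE)
            depth_first_search(v);
    color[u] = BLACK;                       /* every path out of u is explored  */
    completion_number[u] = completion_counter++;
}

/* To traverse a possibly disconnected graph, initialise every vertex to WHITE
   and call depth_first_search(u) for each vertex u that is still WHITE. */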
Minimum Spanning Trees

A minimum spanning tree (MST) of a weighted graph G is a spanning tree of G whose
edges sum to minimum weight. In other words, an MST is a tree formed from a subset of
the edges in a given undirected graph, with two properties:

 it spans the graph, i.e., it includes every vertex of the graph.


 it is a minimum, i.e., the total weight of all the edges is as low as possible.

Let G = (V, E) be a connected, undirected graph where V is a set of vertices (nodes) and E
is the set of edges. Each edge has a given nonnegative length.

Problem Find a subset T of the edges of G such that all the vertices remain connected
when only the edges in T are used, and the sum of the lengths of the edges in T is as small
as possible.

Let G` = (V, T) be the partial graph formed by the vertices of G and the edges in T.
[Note: a connected graph with n vertices must have at least n-1 edges, and more than n-1
edges implies at least one cycle.] So n-1 is the minimum number of edges in T.
Hence if G` is connected and T has more than n-1 edges, then T contains at least one cycle,
and we can remove at least one edge without disconnecting G` (choose an edge that is part
of a cycle). This decreases the total length of the edges in T, so the new solution is
preferable to the old one.

Thus, a T with n vertices and more than n-1 edges cannot be an optimal solution. It follows
that T must have exactly n-1 edges, and since G` is connected it must be a tree. G` is then
called a Minimum Spanning Tree (MST).
Introduction to NP Completeness
NP Complete Problems:
