Asymptotic Notations
A problem may have numerous algorithmic solutions. To choose the best
algorithm for a particular task, you need to be able to judge how long each solution
will take to run, or, more precisely, to choose the better of two candidates. You don't need to
know how many minutes and seconds they will take, but you do need some way to
compare algorithms against one another.
This is important in data structures because you want a structure that behaves efficiently
as you increase the amount of data it handles. Keep in mind, though, that algorithms that
are efficient with large amounts of data are not always simple and efficient for small
amounts of data. So if you know you are working with only a small amount of data and
you have concerns about speed and code space, a trade-off can be made in favor of a function that
does not behave well for large amounts of data.
Consider, for example, the algorithm for sorting a deck of cards, which proceeds by
repeatedly searching through the deck for the lowest card. The asymptotic complexity of
this algorithm is the square of the number of cards in the deck. This quadratic behavior is
the main term in the complexity formula: it says, e.g., that if you double the size of the deck,
then the work is roughly quadrupled.
The Idea
“Big-O” notation was introduced in P. Bachmann’s 1894 book Analytische
Zahlentheorie. He used it to say things like “x is O(n²)” instead of “x ≤ n².” The
notation works well to compare algorithm efficiencies because we want to say that the
growth of effort of a given algorithm approximates the shape of a standard function.
The Definitions
Big-O (O()) is one of five standard asymptotic notations. In practice, Big-O is used as a
tight upper-bound on the growth of an algorithm’s effort (this effort is described by the
function f(n)), even though, as written, it can also be a loose upper-bound. To make its
role as a tight upper-bound more clear, “Little-o” (o()) notation is used to describe an
upper-bound that cannot be tight.
Definition (Big–O, O()): Let f(n) and g(n) be functions that map positive integers to
positive real numbers. We say that f(n) is O(g(n)) (or f(n) ∈ O(g(n))) if there exists a real
constant c > 0 and there exists an integer constant n0 ≥ 1 such that f(n) ≤ c · g(n) for
every integer n ≥ n0.
Definition (Little–o, o()): Let f(n) and g(n) be functions that map positive integers to
positive real numbers. We say that f(n) is o(g(n)) (or f(n) ∈ o(g(n))) if for any real
constant c > 0, there exists an integer constant n0 ≥ 1 such that f(n) < c · g(n) for every
integer n ≥ n0.
On the other side of f(n), it is convenient to define parallels to O() and o() that provide
tight and loose lower bounds on the growth of f(n). “Big-Omega” (Ω()) is the tight lower
bound notation, and “little-omega” (ω()) describes the loose lower bound.
Definition (Big–Omega, Ω()): Let f(n) and g(n) be functions that map positive integers to
positive real numbers. We say that f(n) is Ω(g(n)) (or f(n) ∈ Ω(g(n))) if there exists a real
constant c > 0 and there exists an integer constant n0 ≥ 1 such that f(n) ≥ c · g(n) for
every integer n ≥ n0.
Definition (Little–Omega, ω()): Let f(n) and g(n) be functions that map positive integers
to positive real numbers. We say that f(n) is ω(g(n)) (or f(n) ∈ ω(g(n))) if for any real
constant c > 0, there exists an integer constant n0 ≥ 1 such that f(n) > c · g(n) for every
integer n ≥ n0.
A graph of these functions can help you visualize the relationships between these notations.
Definition (Big–Theta, Θ()): Let f(n) and g(n) be functions that map positive integers to
positive real numbers. We say that f(n) is Θ(g(n)) (or f(n) ∈ Θ(g(n))) if and only if f(n) ∈
O(g(n)) and f(n) ∈ Ω(g(n)).
Application Examples
Here are a few examples that show how the definitions should be applied.
1. Let f(n) = 7n + 8 and g(n) = n. Is f(n) ∈ O(g(n))?
For 7n + 8 ∈ O(n), we have to find c and n0 such that 7n + 8 ≤ c · n, ∀n ≥ n0. By
inspection, it’s clear that c must be larger than 7. Let c = 8.
Now we need a suitable n0. In this case, f(8) = 8 · g(8). Because the definition of O()
requires that f(n) ≤ c · g(n), we can select n0 = 8, or any integer above 8 – they will all
work.
We have identified values for the constants c and n0 such that 7n + 8 is ≤ c · n for every n
≥ n0, so we can say that 7n + 8 is O(n).
(But how do we know that this will work for every n above 7? We can prove by induction
that 7n + 8 ≤ 8n, ∀n ≥ 8. Be sure that you can write such proofs if asked!)
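A finite check is not a proof, but it is a quick way to spot-check the witness pair (c, n0) = (8, 8) from the example (the function name and the tested range are ours, for illustration; only the induction proof covers all n):

```c
/* Spot-check a witness pair (c, n0) for the claim 7n + 8 <= c * n.
   Returns 1 if the bound holds for every n in [n0, upto], else 0. */
int witness_holds(int c, int n0, int upto) {
    for (int n = n0; n <= upto; n++)
        if (7 * n + 8 > c * n)
            return 0;   /* bound violated at this n */
    return 1;
}
```

Note that c = 7 fails for every n, since 7n + 8 > 7n always, which is why the example insists that c must be larger than 7.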
Big-O Notation
Definition
Big-O is the formal method of expressing the upper bound of an algorithm's running
time. It's a measure of the longest amount of time it could possibly take for the algorithm
to complete.
More formally, for non-negative functions, f(n) and g(n), if there exists an integer n0 and
a constant c > 0 such that for all integers n > n0, f(n) ≤ cg(n), then f(n) is Big O of g(n).
This is denoted as "f(n) = O(g(n))". If graphed, g(n) serves as an upper bound to the curve
you are analyzing, f(n).
Big-Omega Notation
For non-negative functions, f(n) and g(n), if there exists an integer n0 and a constant c > 0
such that for all integers n > n0, f(n) ≥ cg(n), then f(n) is omega of g(n). This is denoted as
"f(n) = Ω(g(n))".
This is almost the same definition as Big-O, except that "f(n) ≥ cg(n)"; this makes g(n) a
lower-bound function instead of an upper-bound function. It describes the best that can
happen for a given data size.
Using the RAM model of computation, we can count how many steps our algorithm will
take on any given input instance by simply executing it on the given input. However, to
really understand how good or bad an algorithm is, we must know how it works over all
instances.
As the size of the input to an algorithm increases, how do the running time and memory
requirements of the algorithm change and what are the implications and ramifications of
that change?
The time complexity of a problem is the number of steps that it takes to solve an instance
of the problem as a function of the size of the input (usually measured in bits), using the
most efficient algorithm. To understand this intuitively, consider the example of an
instance that is n bits long that can be solved in n² steps. In this example we say the
problem has a time complexity of n². Of course, the exact number of steps will depend on
exactly what machine or language is being used. To avoid that problem, the Big O
notation is generally used (sometimes described as the "order" of the calculation, as in
"on the order of"). If a problem has time complexity O(n²) on one typical computer, then
it will also have complexity O(n²) on most other computers, so this notation allows us to
generalize away from the details of a particular computer.
Simply put, the space complexity of a problem is a related concept that measures the amount of
space, or memory required by the algorithm. An informal analogy would be the amount
of scratch paper needed while working out a problem with pen and paper. Space
complexity is also measured with Big O notation.
Time Complexity
Time complexity is the measure of how quickly or slowly a program runs relative to
some measure of the size of the problem n.
Most primitives execute in constant time.
Recursive calls (and loops) typically take linear time.
Space Complexity
Space complexity is the measure of how much memory is used by a program relative to
some measure of the size of the problem n.
Most primitives use constant space.
Tail recursive calls also use constant space.
General recursive calls use linear space.
To understand the notions of the best, worst, and average-case complexity, one must
think about running an algorithm on all possible instances of data that can be fed to it.
Comparison
The worst-case complexity of the algorithm is the function defined by the maximum
number of steps taken on any instance of size n. It represents the curve passing through
the highest point of each column.
The best-case complexity of the algorithm is the function defined by the minimum
number of steps taken on any instance of size n. It represents the curve passing through
the lowest point of each column.
Finally, the average-case complexity of the algorithm is the function defined by the
average number of steps taken on any instance of size n.
Unit II
Divide-and-Conquer Algorithm
1. Divide: break the problem into several subproblems that are similar to the original
problem but smaller in size.
2. Conquer: solve the subproblems recursively (successively and independently).
3. Combine: combine the solutions to the subproblems to create a solution to the original
problem.
Problem Let A[1 . . n] be an array sorted in non-decreasing order; that is, A[i] ≤ A[j]
whenever 1 ≤ i ≤ j ≤ n. Let 'q' be the query point. The problem consists of finding 'q' in
the array A. If q is not in A, then find the position where 'q' might be inserted.
Formally, find the index i such that 1 ≤ i ≤ n+1 and A[i-1] < q ≤ A[i].
Sequential Search
Look sequentially at each element of A until either we reach the end of the array A or
find an item no smaller than 'q'.
for i = 1 to n do
if A [i] ≥ q then
return index i
return n + 1
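The loop above can be written directly in C (0-based indexing here, unlike the 1-based pseudocode; the function name is ours):

```c
#include <stddef.h>

/* Sequential search over a sorted array A[0..n-1]: return the first
   index whose element is >= q, or n if every element is smaller
   (i.e., the position where q could be inserted). */
size_t seq_search(const int A[], size_t n, int q) {
    for (size_t i = 0; i < n; i++)
        if (A[i] >= q)
            return i;
    return n;
}
```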
Analysis
This algorithm clearly takes θ(r) time, where r is the index returned. This is Ω(n) in the worst
case and O(1) in the best case.
If the elements of array A are distinct and the query point q is indeed in the array, then the
loop executes (n + 1)/2 times on average. On average (as well as in the worst
case), sequential search takes θ(n) time.
Binary Search
Look for 'q' either in the first half or in the second half of the array A. Compare 'q' to the
element in the middle, A[n/2], of the array. Let k = n/2. If q ≤ A[k], then search in
A[1 . . k]; otherwise search A[k+1 . . n] for 'q'. Binary search for q in the subarray A[i . .
j] with the promise that A[i-1] < q ≤ A[j].
if q > A [n]
then return n + 1
i = 1;
j = n;
while i < j do
k = (i + j)/2
if q ≤ A [k]
then j = k
else i = k + 1
return i (the index)
Analysis
Binary search can be accomplished in logarithmic time in the worst case, i.e., T(n) =
θ(log n). This version of binary search takes logarithmic time in the best case as well.
Merge Sort
Merge-sort is based on the divide-and-conquer paradigm. The Merge-sort algorithm can
be described in general terms as consisting of the following three steps:
1. Divide Step
If given array A has zero or one element, return S; it is already sorted. Otherwise,
divide A into two arrays, A1 and A2, each containing about half of the elements of
A.
2. Recursion Step
Recursively sort array A1 and A2.
3. Conquer Step
Combine the elements back in A by merging the sorted arrays A1 and A2 into a
sorted sequence.
We can visualize Merge-sort by means of a binary tree where each internal node of the tree
represents a recursive call and each external node represents an individual element of the given
array A. Such a tree is called a Merge-sort tree. The heart of the Merge-sort algorithm is the
conquer step, which merges two sorted sequences into a single sorted sequence.
To begin, suppose that we have two sorted arrays A1[1], A1[2], . . . , A1[m] and A2[1],
A2[2], . . . , A2[n]. The following is a direct algorithm for the obvious strategy of
successively choosing the smallest remaining element from A1 and A2 and putting it in A.
MERGE (A1, A2, A)
i ← 1; j ← 1
A1[m+1], A2[n+1] ← INT_MAX
for k ← 1 to m + n do
if A1[i] < A2[j]
then A[k] ← A1[i]
i ← i + 1
else
A[k] ← A2[j]
j ← j + 1
MERGE_SORT (A)
if n > 1 then
copy A[1 . . n/2] into A1
copy A[n/2 + 1 . . n] into A2
MERGE_SORT (A1)
MERGE_SORT (A2)
MERGE (A1, A2, A)
Analysis
Let T(n) be the time taken by this algorithm to sort an array of n elements. Dividing A into
subarrays A1 and A2 takes linear time, and it is easy to see that MERGE (A1, A2, A) also takes
linear time. Consequently,
T(n) = T(⌈n/2⌉) + T(⌊n/2⌋) + θ(n)
or, for simplicity,
T(n) = 2T(n/2) + θ(n), which solves to T(n) = θ(n lg n).
The total running time of the Merge sort algorithm is O(n lg n), which is asymptotically
optimal; like Heap sort, Merge sort has a guaranteed n lg n running time. Merge sort
requires θ(n) extra space; Merge is not an in-place algorithm. The only known ways to
merge in-place (without any extra space) are too complex to be reduced to a practical
program.
Implementation
void mergeSort(int numbers[], int temp[], int array_size)
{
m_sort(numbers, temp, 0, array_size - 1);
}
void m_sort(int numbers[], int temp[], int left, int right)
{
int mid;
if (right > left)
{
mid = (right + left) / 2;
m_sort(numbers, temp, left, mid);
m_sort(numbers, temp, mid+1, right);
merge(numbers, temp, left, mid+1, right);
}
}
void merge(int numbers[], int temp[], int left, int mid, int
right)
{
int i, left_end, num_elements, tmp_pos;
left_end = mid - 1;
tmp_pos = left;
num_elements = right - left + 1;
while ((left <= left_end) && (mid <= right))
{
if (numbers[left] <= numbers[mid])
{
temp[tmp_pos] = numbers[left];
tmp_pos = tmp_pos + 1;
left = left +1;
}
else
{
temp[tmp_pos] = numbers[mid];
tmp_pos = tmp_pos + 1;
mid = mid + 1;
}
}
while (left <= left_end)
{
temp[tmp_pos] = numbers[left];
left = left + 1;
tmp_pos = tmp_pos + 1;
}
while (mid <= right)
{
temp[tmp_pos] = numbers[mid];
mid = mid + 1;
tmp_pos = tmp_pos + 1;
}
for (i = 0; i < num_elements; i++)
{
numbers[right] = temp[right];
right = right - 1;
}
}
Greedy Introduction
Greedy algorithms are simple and straightforward. They are shortsighted in their
approach in the sense that they make decisions on the basis of information at hand, without
worrying about the effect these decisions may have in the future. They are easy to invent,
easy to implement, and most of the time quite efficient. Many problems cannot be solved
correctly by the greedy approach. Greedy algorithms are used to solve optimization problems.
Greedy Approach
Greedy Algorithm works by making the decision that seems most promising at any
moment; it never reconsiders this decision, whatever situation may arise later.
Problem Make change for a given amount using the smallest possible number of
coins.
Informal Algorithm
Start with nothing; at every stage, without exceeding the given amount, add the largest
coin available.
Formal Algorithm
Make change for n units using the least possible number of coins.
MAKE-CHANGE (n)
C ← {100, 25, 10, 5, 1} // constant: the available denominations
S ← {} // set that will hold the solution
sum ← 0 // sum of the items in the solution set
WHILE sum ≠ n
x ← largest item in set C such that sum + x ≤ n
IF no such item THEN
RETURN "No Solution"
S ← S ∪ {one coin of value x}
sum ← sum + x
RETURN S
Example Make change for 2.89 (289 cents); here n = 289 and the solution contains 2
dollars, 3 quarters, 1 dime and 4 pennies. The algorithm is greedy because at every stage
it chooses the largest coin without worrying about the consequences. Moreover, it never
changes its mind in the sense that once a coin has been included in the solution set, it
remains there.
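A minimal C sketch of MAKE-CHANGE for the coin set above (the function name, the counts-per-denomination output, and the requirement that coins come in descending order are choices of this sketch, not part of the pseudocode):

```c
#include <stddef.h>

/* Greedy change-making: coins[] must be in descending order.
   Fills counts[i] with the number of coins of value coins[i] used.
   Returns 0 on success, -1 if the amount cannot be reached. */
int make_change(int n, const int coins[], int counts[], size_t m) {
    int sum = 0;
    for (size_t i = 0; i < m; i++) {
        counts[i] = 0;
        while (sum + coins[i] <= n) {   /* largest coin that still fits */
            sum += coins[i];
            counts[i]++;
        }
    }
    return sum == n ? 0 : -1;
}
```

For n = 289 with coins {100, 25, 10, 5, 1}, this reproduces the 2 dollars, 3 quarters, 1 dime, 4 pennies of the worked example.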
To construct the solution in an optimal way, the algorithm maintains two sets: one contains
the chosen items and the other contains the rejected items.
Definitions of feasibility
A feasible set (of candidates) is promising if it can be extended to produce not merely a
solution, but an optimal solution to the problem. In particular, the empty set is always
promising. Why? Because an optimal solution always exists.
Greedy-Choice Property
The "greedy-choice property" and "optimal substructure" are two ingredients in the
problem that lend to a greedy strategy.
Greedy-Choice Property
It says that a globally optimal solution can be arrived at by making a locally optimal
choice.
Spanning Trees
A spanning tree of a graph is any tree that includes every vertex in the graph. A little more
formally, a spanning tree of a graph G is a subgraph of G that is a tree and contains all the
vertices of G. An edge of a spanning tree is called a branch; an edge in the graph that is
not in the spanning tree is called a chord. We construct spanning trees whenever we want
to find a simple, cheap and yet efficient way to connect a set of terminals (computers,
cities, factories, etc.). Spanning trees are important for the following reasons.
Spanning trees give a sparse subgraph that tells a lot about the original
graph.
Spanning trees are very important in designing efficient routing algorithms.
Some hard problems (e.g., Steiner tree problem and traveling salesman problem)
can be solved approximately by using spanning trees.
Spanning trees have wide applications in many areas, such as network design, etc.
One of the most elegant spanning tree algorithms that we know of is as follows: examine
the edges of the graph one at a time, in any order, and add an edge to the tree if and only
if it does not form a cycle with the edges already chosen.
Note that each time a step of the algorithm is performed, one edge is examined. If there is
only a finite number of edges in the graph, the algorithm must halt after a finite number
of steps. Thus, the time complexity of this algorithm is clearly O(n), where n is the
number of edges in the graph.
Greediness It is easy to see that this algorithm has the property that each edge is
examined at most once. Algorithms, like this one, which examine each entity at most
once and decide its fate once and for all during that examination are called greedy
algorithms. The obvious advantage of greedy approach is that we do not have to spend
time reexamining entities.
Consider the problem of finding a spanning tree with the smallest possible weight or the
largest possible weight, respectively called a minimum spanning tree and a maximum
spanning tree. It is easy to see that if a graph possesses a spanning tree, it must have a
minimum spanning tree and also a maximum spanning tree. These spanning trees can be
constructed by performing the spanning tree algorithm (e.g., above mentioned algorithm)
with an appropriate ordering of the edges.
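The edge-examination algorithm can be sketched in C with a union-find structure for the cycle test (the union-find representation, the MAXV bound, and 0-based vertex numbering are assumptions of this sketch, not part of the text above):

```c
#include <stddef.h>

enum { MAXV = 1000 };          /* assumed vertex bound for the sketch */
static int parent[MAXV];

/* Follow parent links to the representative of a component. */
static int find_root(int x) {
    while (parent[x] != x)
        x = parent[x];
    return x;
}

/* Examine each edge (u[i], v[i]) exactly once; keep[i] is set to 1
   if edge i becomes a branch of the spanning tree, 0 if it is a
   chord (it would close a cycle).  Returns the number of branches
   (n - 1 for a connected graph on n vertices). */
int spanning_tree(int n, size_t m, const int u[], const int v[], int keep[]) {
    int branches = 0;
    for (int i = 0; i < n; i++)
        parent[i] = i;                       /* each vertex is its own tree */
    for (size_t i = 0; i < m; i++) {
        int ru = find_root(u[i]), rv = find_root(v[i]);
        keep[i] = (ru != rv);                /* chord iff same component */
        if (ru != rv) {
            parent[ru] = rv;                 /* merge the two components */
            branches++;
        }
    }
    return branches;
}
```

Sorting the edges by weight before calling this routine yields a minimum (or, reversed, a maximum) spanning tree, which is exactly the "appropriate ordering" mentioned above.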
Statement A thief robbing a store can carry a maximal weight of W in their
knapsack. There are n items; the ith item weighs wi and is worth vi dollars. What items
should the thief take?
There are n items in a store. For i = 1, 2, . . . , n, item i has weight wi > 0 and worth vi > 0.
The thief can carry a maximum weight of W pounds in a knapsack. In this version of the
problem the items can be broken into smaller pieces, so the thief may decide to carry only
a fraction xi of object i, where 0 ≤ xi ≤ 1. Item i contributes xiwi to the total weight in the
knapsack, and xivi to the value of the load.
It is clear that an optimal solution must fill the knapsack exactly, for otherwise we could
add a fraction of one of the remaining objects and increase the value of the load. Thus in
an optimal solution Σi=1..n xiwi = W.
Greedy-fractional-knapsack (w, v, W)
FOR i = 1 to n
do x[i] = 0
weight = 0
while weight < W
do i = best remaining item
IF weight + w[i] ≤ W
then x[i] = 1
weight = weight + w[i]
else
x[i] = (W - weight) / w[i]
weight = W
return x
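Assuming the items are pre-sorted by decreasing vi/wi (so "best remaining item" is simply the next one), the pseudocode can be sketched in C as follows (function name and array layout are ours):

```c
#include <stddef.h>

/* Greedy fractional knapsack: w[i] and v[i] describe item i, and the
   items are assumed pre-sorted by decreasing v[i]/w[i].  x[i]
   receives the fraction taken; returns the total value loaded. */
double fractional_knapsack(size_t n, const double w[], const double v[],
                           double W, double x[]) {
    double weight = 0.0, value = 0.0;
    for (size_t i = 0; i < n; i++)
        x[i] = 0.0;
    for (size_t i = 0; i < n && weight < W; i++) {
        if (weight + w[i] <= W)
            x[i] = 1.0;                     /* take the whole item */
        else
            x[i] = (W - weight) / w[i];     /* take only a fraction */
        weight += x[i] * w[i];
        value  += x[i] * v[i];
    }
    return value;
}
```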
Analysis
If the items are already sorted into decreasing order of vi / wi, then
the while-loop takes time in O(n);
therefore, the total time including the sort is in O(n log n).
If instead we keep the items in a heap with the largest vi/wi at the root, then building the
heap takes O(n) time and each iteration of the while-loop takes O(log n) time.
Although this data structure does not alter the worst case, it may be faster if only a small
number of items are needed to fill the knapsack.
One variant of the 0-1 knapsack problem arises when the order of the items sorted by
increasing weight is the same as their order when sorted by decreasing value.
The optimal solution to this problem is to sort the items by value in decreasing
order, then pick up the most valuable item, which also has the least weight, provided its
weight is less than the total weight that can still be carried. Then deduct the weight of the
item just picked from the total weight that can be carried. The second item to pick is the most
valuable item among those remaining. Keep following the same strategy until the thief cannot
carry any more items (due to weight).
Proof
One way to prove the correctness of the above algorithm is to prove the greedy-choice
property and the optimal-substructure property. It consists of two steps. First, prove that there
exists an optimal solution that begins with the greedy choice given above. Second,
prove that if A is an optimal solution to the original problem S, then A - a is also an
optimal solution to the problem S - s, where a is the item the thief picked as the greedy
choice and S - s is the subproblem after the first greedy choice has been made. The
second part is easy to prove, since the more valuable items have less weight.
Note that a unit of weight of any other item can be replaced by a unit of the chosen item:
this cannot increase the weight, but it increases the value, because the chosen item has the
larger value-to-weight ratio. □
Proof Let the ratio v`/w` be maximal. This supposition implies that v`/w` ≥ v/w for any
pair (v, w), so v`w/w` ≥ v for any (v, w). Now suppose a solution does not contain the full
w` weight of the best-ratio item. Then by replacing an amount of any other item's weight
with the same amount of w`, we will improve the value.
Binary search
However, if we place our items in an array and sort them in either ascending or
descending order on the key first, then we can obtain much better performance with an
algorithm called binary search.
In binary search, we first compare the key with the item in the middle position of the
array. If there's a match, we can return immediately. If the key is less than the middle
key, then the item sought must lie in the lower half of the array; if it's greater then the
item sought must lie in the upper half of the array. So we repeat the procedure on the
lower (or upper) half of the array. The function can now be implemented:
static void *bin_search( collection c, int low, int high, void *key ) {
    int mid, cmp;
    /* Termination check */
    if (low > high) return NULL;
    mid = (high + low) / 2;
    cmp = memcmp( ItemKey( c->items[mid] ), key, c->size );
    if (cmp == 0)
        /* Match, return item found */
        return c->items[mid];
    else if (cmp > 0)
        /* key is less than mid's key, search lower half */
        return bin_search( c, low, mid-1, key );
    else
        /* key is greater than mid's key, search upper half */
        return bin_search( c, mid+1, high, key );
}
void *FindInCollection( collection c, void *key ) {
/* Find an item in a collection
Pre-condition:
c is a collection created by ConsCollection
c is sorted in ascending order of the key
key != NULL
Post-condition: returns an item identified by key if
one exists, otherwise returns NULL
*/
int low, high;
low = 0; high = c->item_cnt-1;
return bin_search( c, low, high, key );
}
Points to note:
a. bin_search is recursive: it determines whether the search key lies in the lower or upper
half of the array, then calls itself on the appropriate half.
b. AddToCollection will need to be modified to ensure that each item added is placed in
its correct place in the array. The procedure is simple:
i. Search the array until the correct spot to insert the new item is found,
ii. Move all the following items up one position and
iii. Insert the new item into the empty position thus created.
c. bin_search is declared static. It is a local function and is not used outside this class: if it
were not declared static, it would be exported and be available to all parts of the program.
The static declaration also allows other classes to use the same name internally.
Let i be the highest-numbered item in an optimal solution S for W pounds. Then S` = S -
{i} is an optimal solution for W - wi pounds, and the value of the solution S is vi plus the
value of the subproblem.
We can express this fact in the following formula: define c[i, w] to be the solution for
items 1,2, . . . , i and maximum weight w. Then
c[i, w] = 0 if i = 0 or w = 0
c[i, w] = c[i-1, w] if i > 0 and wi > w
c[i, w] = max { vi + c[i-1, w-wi], c[i-1, w] } if i > 0 and w ≥ wi
This says that the value of a solution for i items either includes the ith item, in which case it is
vi plus a subproblem solution for (i-1) items and the weight excluding wi, or does not
include the ith item, in which case it is a subproblem's solution for (i-1) items and the same
weight. That is, if the thief picks item i, the thief takes vi value and can choose from
items 1, 2, . . . , i-1 up to the weight limit w - wi, getting c[i-1, w-wi] additional value. On
the other hand, if the thief decides not to take item i, the thief can choose from items
1, 2, . . . , i-1 up to the weight limit w, getting c[i-1, w] value. The better of these two
choices should be made.
Although the 0-1 knapsack problem is a different problem, the above formula for c is
similar to the LCS formula: boundary values are 0, and other values are computed from
the input and "earlier" values of c. So the 0-1 knapsack algorithm is like the LCS-length
algorithm given in the CLR book for finding a longest common subsequence of two sequences.
The algorithm takes as input the maximum weight W, the number of items n, and the two
sequences v = <v1, v2, . . . , vn> and w = <w1, w2, . . . , wn>. It stores the c[i, j] values in a
table, that is, a two-dimensional array c[0 . . n, 0 . . W] whose entries are computed in
row-major order. That is, the first row of c is filled in from left to right, then the second
row, and so on. At the end of the computation, c[n, W] contains the maximum value that
can be packed into the knapsack.
Dynamic-0-1-knapsack (v, w, n, W)
for w = 0 to W
do c[0, w] = 0
for i = 1 to n
do c[i, 0] = 0
for w = 1 to W
do if wi ≤ w
then if vi + c[i-1, w-wi] > c[i-1, w]
then c[i, w] = vi + c[i-1, w-wi]
else c[i, w] = c[i-1, w]
else
c[i, w] = c[i-1, w]
The set of items to take can be deduced from the table, starting at c[n, W] and tracing
backwards where the optimal values came from. If c[i, w] = c[i-1, w], item i is not part of
the solution, and we continue tracing with c[i-1, w]. Otherwise item i is part of the
solution, and we continue tracing with c[i-1, w-wi].
Analysis
It takes θ(nW) time to fill the c-table, which has (n+1) · (W+1) entries, each requiring θ(1)
time to compute, and O(n) time to trace the solution, because the tracing process starts in
row n of the table and moves up one row at each step.
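A C sketch of Dynamic-0-1-knapsack (the fixed table bounds and the 0-based item arrays are assumptions of this example, not part of the pseudocode):

```c
#include <stddef.h>

enum { NMAX = 63, WMAX = 100 };   /* assumed bounds for the sketch */
static int c[NMAX + 1][WMAX + 1]; /* c[i][j]: best value, items 1..i, cap j */

/* Fills c[0..n][0..W] row by row and returns c[n][W], the maximum
   value that fits in capacity W.  Item i (1-based in the text) has
   value v[i-1] and weight w[i-1] here. */
int knapsack(size_t n, const int v[], const int w[], int W) {
    for (int j = 0; j <= W; j++)
        c[0][j] = 0;                       /* no items: value 0 */
    for (size_t i = 1; i <= n; i++) {
        c[i][0] = 0;                       /* no capacity: value 0 */
        for (int j = 1; j <= W; j++) {
            if (w[i-1] <= j && v[i-1] + c[i-1][j - w[i-1]] > c[i-1][j])
                c[i][j] = v[i-1] + c[i-1][j - w[i-1]];  /* take item i */
            else
                c[i][j] = c[i-1][j];                    /* skip item i */
        }
    }
    return c[n][W];
}
```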
Unit III
Dynamic Programming
Dynamic programming takes advantage of the duplication of subproblems and arranges to
solve each subproblem only once, saving the solution (in a table or elsewhere) for later use. The
underlying idea of dynamic programming is: avoid calculating the same thing twice,
usually by keeping a table of known results of subproblems. Unlike divide-and-conquer,
which solves the subproblems top-down, dynamic programming is a bottom-up
technique.
Bottom-up means we start with the smallest subproblems, solve them first, and combine
their solutions to obtain solutions to larger and larger subproblems.
Dynamic programming relies on a principle of optimality. This principle states that
in an optimal sequence of decisions or choices, each subsequence must also be optimal.
For example, in the matrix chain multiplication problem, not only the value we are interested
in is optimal but all the other entries in the table also represent optimal values.
The difficulty in turning the principle of optimality into an algorithm is that it is not
usually obvious which subproblems are relevant to the problem under consideration.
Problem Statement A thief robbing a store can carry a maximal weight of W in
their knapsack. There are n items; the ith item weighs wi and is worth vi dollars. What items
should the thief take?
Fractional knapsack problem The setup is the same, but the thief can take fractions of
items, meaning that the items can be broken into smaller pieces so that the thief may decide
to carry only a fraction xi of item i, where 0 ≤ xi ≤ 1.
0-1 knapsack problem The setup is the same, but the items may not be broken into
smaller pieces, so the thief may decide either to take an item or to leave it (a binary choice),
but may not take a fraction of an item.
Multistage Graphs
o G = (V, E) with V partitioned into K ≥ 2 disjoint subsets such that if (a, b) is in E,
then a is in Vi and b is in Vi+1 for some subsets in the partition;
o and |V1| = |VK| = 1.
The vertex s in V1 is called the source; the vertex t in VK is called the sink.
The cost of a path from node v to node w is the sum of the costs of the edges in the path.
The "multistage graph problem" is to find the minimum-cost path from s to t.
In a resource-allocation formulation, stage i+1 consists of the vertices { V(i+1, j) }, j = 0 . . n.
The edges are weighted with C(i, j) = -P(i, j) [the negative of the profit] to make it a
minimization problem.
Let path(i, j) be some specification of the minimal path from vertex j in set i to vertex t;
C(i, j) is the cost of this path; c(j, t) is the weight of the edge from j to t. The costs satisfy
the recurrence
C(i, j) = min over edges (j, l) in E of [ c(j, l) + C(i+1, l) ].
To write a simple algorithm, assign numbers to the vertices so that those in stage Vi have
lower numbers than those in stage Vi+1.
int[] MStageForward(Graph G)
{
// returns vector of vertices to follow through the graph
// let c[i][j] be the cost matrix of G
// set cost[t] = 0, then, working backwards from the sink t,
// set cost[j] = min over edges (j, l) of c[j][l] + cost[l]
// and record the minimizing l in d[j];
// the answer path is recovered by following d[] from the source.
}
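Under the vertex-numbering convention just described, the forward method can be sketched in C (the NV bound, the INF sentinel, and 0-based numbering with vertex 0 as source and NV-1 as sink are assumptions of this sketch):

```c
enum { NV = 5, INF = 1000000 };   /* assumed graph size and "no edge" marker */

/* Forward method for the multistage graph problem.  Vertices are
   numbered so every edge (j, l) has j < l; c[j][l] = INF when there
   is no edge.  path[] receives the vertex sequence from source to
   sink; the function returns the minimum path cost. */
int mstage_forward(int c[NV][NV], int path[NV]) {
    int cost[NV], d[NV];
    cost[NV - 1] = 0;                          /* cost from sink to itself */
    for (int j = NV - 2; j >= 0; j--) {        /* work back from the sink */
        cost[j] = INF;
        d[j] = -1;
        for (int l = j + 1; l < NV; l++)
            if (c[j][l] < INF && c[j][l] + cost[l] < cost[j]) {
                cost[j] = c[j][l] + cost[l];
                d[j] = l;                      /* best next vertex after j */
            }
    }
    int k = 0, v = 0;
    path[k++] = 0;                             /* start at the source */
    while (v != NV - 1 && d[v] != -1) {        /* follow the decisions */
        v = d[v];
        path[k++] = v;
    }
    return cost[0];
}
```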
Given a directed graph G = (V,E), where each edge (v,w) has a nonnegative cost C[v,w],
for all pairs of vertices (v,w) find the cost of the lowest cost path from v to w.
We will consider a slight extension to this problem: find the lowest cost path between
each pair of vertices.
We must recover the path itself, and not just the cost of the path.
Floyd's Algorithm
It returns as output
a distance matrix D[v,w] containing the cost of the lowest cost path from v to w
o initially D[v,w] = C[v,w]
a path matrix P, where P[v,w] holds the intermediate vertex k on the least cost
path between v and w that led to the cost stored in D[v,w].
We iterate N times over the matrix D, using k as an index. On the kth iteration, the D
matrix contains the solution to the APSP problem, where the paths only use vertices
numbered 1 to k.
On the next iteration, we compare the cost of going from i to j using only vertices
numbered 1..k (stored in D[i,j] on the kth iteration) with the cost of using the k+1th
vertex as an intermediate step, which is D[i,k+1] (to get from i to k+1) plus D[k+1,j] (to
get from k+1 to j).
After N iterations, all possible paths have been examined, so D[v,w] contains the cost of
the lowest cost path from v to w using all vertices if necessary.
The Algorithm
Floyd's algorithm (modified to find the least cost paths, and not just the cost of the paths)
produces a matrix P, which, for each pair of nodes u and v, contains an intermediate node
on the least cost path from u to v
So the least cost path from u to v is the least cost path from u to P[u,v], followed by the
least cost path from P[u,v] to v.
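The iteration over D and P described above can be sketched in C (the graph size N, the INF sentinel, 0-based vertices, and -1 as the "no intermediate vertex" marker are assumptions of this sketch):

```c
enum { N = 4, INF = 1000000 };    /* assumed graph size, "no edge" marker */

/* Floyd's algorithm: D starts as the cost matrix C (INF where there
   is no edge, 0 on the diagonal).  On return, D[i][j] is the cost of
   the cheapest i -> j path, and P[i][j] is an intermediate vertex on
   that path, or -1 if the best path is the single edge (i, j). */
void floyd(int D[N][N], int P[N][N]) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            P[i][j] = -1;                       /* no intermediate yet */
    for (int k = 0; k < N; k++)                 /* allow vertex k as a stop */
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                if (D[i][k] + D[k][j] < D[i][j]) {
                    D[i][j] = D[i][k] + D[k][j];
                    P[i][j] = k;                /* k led to the improvement */
                }
}
```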
The following procedure uses the P matrix produced earlier to print the intermediate
vertices on the least cost path from node u to node v.
void Path(int u, int v, imatrix &P)
{
    int k = P[u][v];
    if (k == -1) return;   // the edge (u,v) itself is the least cost path
    Path(u, k, P);
    cout << k << " ";
    Path(k, v, P);
} /* Path */
Note that this procedure could loop forever on an arbitrary matrix, but Floyd's algorithm
ensures that we cannot have k on the shortest path from u to v and v on the shortest path
from u to k.
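Putting the pieces together, a minimal runnable sketch of Floyd's algorithm producing both D and P as described above might look as follows (0-based indices; INF as a "no edge" sentinel; the names floyd and path are our own):

```cpp
#include <iostream>
#include <vector>

const int INF = 1000000000;  // large "no edge" sentinel, safe against overflow

// Fills D with least path costs and P with intermediate vertices;
// P[u][v] == -1 means the edge (u,v) itself is the least cost path.
void floyd(std::vector<std::vector<int>>& D, std::vector<std::vector<int>>& P) {
    int n = D.size();
    P.assign(n, std::vector<int>(n, -1));
    for (int k = 0; k < n; ++k)            // allow vertex k as an intermediate
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < n; ++j)
                if (D[i][k] + D[k][j] < D[i][j]) {
                    D[i][j] = D[i][k] + D[k][j];
                    P[i][j] = k;           // k led to the improvement
                }
}

// Prints the intermediate vertices on the least cost path from u to v,
// exactly as the recursive Path procedure above does.
void path(int u, int v, const std::vector<std::vector<int>>& P) {
    int k = P[u][v];
    if (k == -1) return;
    path(u, k, P);
    std::cout << k << " ";
    path(k, v, P);
}
```

On the three-vertex example D = {{0,4,11},{6,0,2},{3,INF,0}}, the run improves D[0][2] from 11 to 6 via the intermediate vertex 1, and path(0, 2, P) prints that vertex.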
We will prove that after k iterations over the matrix D, D[i,j] is the cost of the cheapest
path from i to j that does not include a vertex numbered > k.
Proof by induction on k.
Base case (k = 0): no intermediate vertices on a path from i to j are allowed, so D[i,j]
should be C[i,j] if (i,j) in E, and infinity otherwise. The initialization step does exactly
this.
Induction step: Assume that after k iterations, D[i,j] is the cost of the lowest cost path
from i to j excluding all vertices from k+1 to N.
On the next, (k+1)st, iteration, we are allowed to include vertex k+1 in any path.
For all pairs (i,j), the lowest cost path from i to j excluding vertices k+2 thru N goes thru
k+1 iff there is a low cost path from i to k+1 and from k+1 to j, excluding vertices k+2
thru N.
But the cheapest path from i to k+1 without using nodes k+2 thru N is simply D[i,k+1]
(by the induction hypothesis).
Similarly, the lowest cost path from k+1 to j without using nodes k+2 thru N is D[k+1,j].
Thus, we should use node k+1 to get from i to j iff D[i,k+1] + D[k+1,j] < D[i,j], the
cheapest path excluding k+1. Since this is exactly what is stored on the k+1th iteration,
we have completed the proof.
Comparison with Dijkstra's Algorithm
Optimal Binary Search Trees
A binary search tree is a tree where the key values are stored in the internal nodes, the
external nodes (leaves) are null nodes, and the keys are ordered lexicographically, i.e. for
each internal node all the keys in the left subtree are less than the key in the node, and all
the keys in the right subtree are greater.
When we know the probabilities of searching each one of the keys, it is quite easy to
compute the expected cost of accessing the tree. An OBST is a BST which has minimal
expected cost.
Example:
Key -5 1 8 7 13 21
Probabilities 1/8 1/32 1/16 1/32 1/4 1/2
It is clear that this tree is not optimal: if the 21 is closer to the root, given its high
probability, the tree will have a lower expected cost.
Each optimal binary search tree is composed of a root and (at most) two optimal subtrees,
the left and the right.
Method:
The criterion for optimality gives a dynamic programming algorithm. For the root (and
each node in turn) we select one value to be stored in the node. (We have n possibilities
to do that.)
Once this choice is made, the set of keys which go into the left subtree and right subtree
is completely defined, because the tree is lexicographically ordered. The left and right
subtrees are now constructed recursively (optimally). This gives the recursive definition
of the optimal cost: let p(i) denote the probability of accessing key K(i), and let w(i,j)
denote the sum of the probabilities from p(i) to p(j). Then the cost C(i,j) of an optimal
tree over the keys K(i)..K(j) satisfies
C(i,j) = min over i <= k <= j of { [C(i,k-1) + w(i,k-1)] + p(k) + [C(k+1,j) + w(k+1,j)] }
The explanation of the formula is easy once we see that the first term corresponds to the
left subtree, which is one level lower than the root, the second term corresponds to the
root and the third to the right subtree. Every cost is multiplied by its probability. For
simplicity we set C(i,i-1) = 0 (the empty tree), and since w(i,k-1) + p(k) + w(k+1,j) =
w(i,j), the formula simplifies to
C(i,j) = w(i,j) + min over i <= k <= j of { C(i,k-1) + C(k+1,j) }
This procedure is exponential if applied directly. However, the optimal trees are only
constructed over contiguous sets of keys, and there are at most n(n+1)/2 different sets of
contiguous keys.
In this case we store the optimal cost of a subtree in a matrix C. The matrix entry C(i,j)
will contain the cost of an optimal subtree constructed with the keys K(i) to K(j).
We now fill the matrix diagonal by diagonal, so that the smaller costs C(i,k-1) and
C(k+1,j) are already available when C(i,j) is computed.
An optimal tree with one node is just the node itself (no other choice), so the diagonal of
C is easy to fill: C(i,i) = p(i).
The cost of the optimal tree is in C(1,n). And you can see that it is practical not to work
with the probabilities, but with the frequencies (i.e. the probabilities times the least
common multiple of their denominators, 32 in our example) to avoid fractions as matrix
entries.
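As a sketch, the diagonal-by-diagonal fill with integer frequencies (the probabilities above times 32) can be coded as follows; the names obst_cost and freq and the 1-based indexing are our own choices, and the routine returns only the optimal cost, not the tree itself:

```cpp
#include <vector>
#include <algorithm>
#include <climits>

// Returns C(1,n), the cost of an optimal BST over keys 1..n, where
// freq[k-1] is the (integer) access frequency of key k.
int obst_cost(const std::vector<int>& freq) {
    int n = freq.size();
    // c[i][j] = optimal cost over keys i..j; c[i][i-1] = 0 is the empty tree
    std::vector<std::vector<int>> c(n + 2, std::vector<int>(n + 1, 0));
    for (int len = 1; len <= n; ++len)               // diagonal by diagonal
        for (int i = 1; i + len - 1 <= n; ++i) {
            int j = i + len - 1;
            int w = 0;                               // w(i,j): frequency sum
            for (int k = i; k <= j; ++k) w += freq[k - 1];
            int best = INT_MAX;
            for (int k = i; k <= j; ++k)             // try each root k
                best = std::min(best, c[i][k - 1] + c[k + 1][j]);
            c[i][j] = w + best;   // C(i,j) = w(i,j) + min { C(i,k-1)+C(k+1,j) }
        }
    return n == 0 ? 0 : c[1][n];
}
```

For the frequencies 4, 1, 2, 1, 8, 16 of the example this yields 62, i.e. an expected cost of 62/32.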
Graph Traversal
When dealing with graphs, one fundamental issue is to traverse the graph, to go through
it; in other words, to visit each vertex and each edge. To solve this problem, many
interesting algorithms exist. Two of them will be presented here: Depth-First Search (dfs)
and Breadth-First Search (bfs). These two methods are used e.g. in connection with the
task of finding the connected components of a graph, which is a nice example of an
application of bfs and dfs.
Each vertex receives two numbers: the dfs number is assigned when the vertex is
discovered (marked grey), while the completion number is allocated when it is completed
(marked black).
Here is the pseudo-code of the dfs algorithm:
i:=1; j:=1; // initialisation of iteration variables
PROCEDURE depth_first_search(u: vertex)
    color[u]:=grey;
    dfs_number[u]:=i++;
    FOR v in Adj(u) DO // for each vertex adjacent to u
        IF color[v]=white THEN depth_first_search(v)
    color[u]:=black;
    completion_number[u]:=j++;
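The pseudo-code translates to C++ roughly as follows; the colors are encoded as integers, and the adjacency-list representation is our own assumption:

```cpp
#include <vector>

enum Color { WHITE, GREY, BLACK };

// Depth-first search assigning each vertex a dfs number (at discovery)
// and a completion number (when it turns black).
struct DFS {
    const std::vector<std::vector<int>>& adj;
    std::vector<int> color, dfs_number, completion_number;
    int i = 1, j = 1;                    // iteration variables

    explicit DFS(const std::vector<std::vector<int>>& a)
        : adj(a), color(a.size(), WHITE),
          dfs_number(a.size(), 0), completion_number(a.size(), 0) {}

    void visit(int u) {
        color[u] = GREY;                 // discovered
        dfs_number[u] = i++;
        for (int v : adj[u])             // for each vertex adjacent to u
            if (color[v] == WHITE) visit(v);
        color[u] = BLACK;                // completed
        completion_number[u] = j++;
    }
};
```

On the graph with edges 0->1 and 0->2, visiting from 0 assigns dfs numbers 1, 2, 3 and completion numbers 3, 1, 2: a vertex is completed only after all its descendants are.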
Minimum Spanning Trees
Let G=(V, E) be a connected, undirected graph where V is a set of vertices (nodes) and E
is the set of edges. Each edge has a given nonnegative length.
Problem Find a subset T of the edges of G such that all the vertices remain connected
when only the edges T are used, and the sum of the lengths of the edges in T is as small
as possible.
Let G` = (V, T) be the partial graph formed by the vertices of G and the edges in T.
[Note: a connected graph with n vertices must have at least n-1 edges, and more than n-1
edges implies at least one cycle.] So n-1 is the minimum number of edges in T.
Hence if G` is connected and T has more than n-1 edges, we can remove at least one of
these edges without disconnecting G` (choose an edge that is part of a cycle). This will
decrease the total length of the edges in T.
Since this removal argument applies whenever T has more than n-1 edges, and the new
solution is always preferable to the old one, a T with more than n-1 edges cannot be an
optimal solution. It follows that T must have exactly n-1 edges, and since G` is connected
it must be a tree. G` is then called a Minimum Spanning Tree (MST).
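The argument above says what an MST is but not how to find one. One standard method is Kruskal's greedy algorithm; the sketch below uses our own names and a union-find structure to detect cycles:

```cpp
#include <vector>
#include <algorithm>
#include <numeric>

struct Edge { int u, v, w; };

// Union-find with path halving: returns the representative of x's component.
int find(std::vector<int>& parent, int x) {
    while (parent[x] != x) x = parent[x] = parent[parent[x]];
    return x;
}

// Returns the total length of a minimum spanning tree of a connected
// graph with n vertices (Kruskal's algorithm).
int kruskal(int n, std::vector<Edge> edges) {
    std::sort(edges.begin(), edges.end(),
              [](const Edge& a, const Edge& b) { return a.w < b.w; });
    std::vector<int> parent(n);
    std::iota(parent.begin(), parent.end(), 0);  // each vertex is its own set
    int total = 0, used = 0;
    for (const Edge& e : edges) {
        int ru = find(parent, e.u), rv = find(parent, e.v);
        if (ru != rv) {                  // adding e creates no cycle
            parent[ru] = rv;
            total += e.w;
            if (++used == n - 1) break;  // a tree has exactly n-1 edges
        }
    }
    return total;
}
```

Prim's algorithm is the other classical choice; both are greedy methods that build the tree edge by edge.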
Introduction to NP Completeness
NP Complete Problems: