
# UNIT I

INTRODUCTION

1. Algorithm
- An algorithm is a sequence of unambiguous instructions for solving a problem. It is a finite
set of instructions that, if followed, accomplishes a particular task.

2. Algorithm Properties/characteristics
An algorithm must satisfy the following five criteria:
o Input
o Output
o Definiteness
o Finiteness
o Effectiveness

3. Fundamentals of Algorithmic Problem Solving
1. Understanding the problem
2. Ascertaining the capabilities of a computational device
3. Choosing between exact & approximate problem solving
4. Deciding on appropriate data structure
5. Algorithm design techniques
6. Methods of specifying an algorithm
7. Proving an algorithm's correctness
8. Analysing an algorithm
9. Coding an algorithm

4. Important Problem Types
Sorting, Searching, String Processing, Graph problems, Combinatorial problems,
Geometric problems, numerical problems.

5. Fundamentals of the Analysis of Algorithm Efficiency
a. General characteristics
b. Measuring input size
c. Units for measuring running time
d. Orders of growth
e. Worst-case, Best-case, Average-case efficiencies
f. Asymptotic Notations

6. Running time measurement
- Running time cannot be reliably measured in seconds, milliseconds and so on.
- The reasons are:
o dependence on the speed of a particular computer
o dependence on the quality of the program implementing the algorithm
o dependence on the compiler used in generating the machine code
o difficulty of clocking the actual running time.

7. Basic Operation
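- The basic operation is the operation contributing the most to the total running time. Time
efficiency is analysed by counting the number of times the basic operation is executed.
- The basic operation is usually the most time-consuming operation in the algorithm's innermost
loop.
- $T(n) \approx c_{op} \, C(n)$, where $c_{op}$ is the execution time of the basic operation and
$C(n)$ is the number of times it is executed.
- E.g. Sorting: comparison;
Matrix multiplication: multiplication & addition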

8. Worst-Case efficiency of an algorithm
- It is the maximum number of steps that an algorithm can take for any collection of data
values.
- It is the efficiency for the worst-case input of size n, which is an input of size n for which
the algorithm runs the longest among all possible inputs of that size.
- Example: Sequential Search. The searched value may be at the end of the list or may not be present at all.

9. Best-Case efficiency of an algorithm
- It is the minimum number of steps that an algorithm can take for any collection of data values.
- It is the efficiency for the best-case input of size n, which is an input of size n for which
the algorithm runs the fastest among all possible inputs of that size.
- Example: Sequential Search. The searched value may be at the first position.
o The algorithm makes the smallest number of comparisons: $C_{best}(n) = 1 \in O(1)$

10. Average-Case efficiency of an algorithm
- It cannot, in general, be obtained by taking the average of the worst-case and best-case efficiencies.
o Occasionally this average happens to coincide with the correct value.
- It is the efficiency averaged over the possible inputs.
- Some assumptions about the possible inputs of size n must be made.
- Example: Sequential Search. On average, the searched value is found around the middle position.
o The algorithm makes the average number of comparisons.
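
The three cases for sequential search can be illustrated with a small sketch. The function name `sequential_search` and the sample list are illustrative assumptions, not part of the notes; the function counts key comparisons, which is the basic operation here.

```python
def sequential_search(items, key):
    """Return (index, comparisons); index is -1 when the key is absent."""
    comparisons = 0
    for i, item in enumerate(items):
        comparisons += 1          # one key comparison per element examined
        if item == key:
            return i, comparisons
    return -1, comparisons


data = [7, 3, 9, 1, 5]
print(sequential_search(data, 7))   # best case: key at the first position
print(sequential_search(data, 5))   # worst case: key at the last position
print(sequential_search(data, 4))   # worst case: key not present
```

With n = 5 the best case costs one comparison and both worst-case inputs cost n comparisons.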

11. Asymptotic Notations
- The efficiency analysis framework concentrates on the order of growth of an algorithm's
basic operation count as the principal indicator of the algorithm's efficiency.
- To compare and rank such orders of growth, three notations are used:
o $O$ (big oh), $\Omega$ (big omega), $\Theta$ (big theta)

12. O (big oh) notation
- A function $f(n) = O(g(n))$ [i.e. $f(n) \in O(g(n))$] iff there exist positive constants $c$ and $n_0$
such that $f(n) \le c\,g(n)$ for all $n \ge n_0$.

13. $\Omega$ (big omega) notation
- A function $f(n) = \Omega(g(n))$ [i.e. $f(n) \in \Omega(g(n))$] iff there exist positive constants $c$ and $n_0$
such that $f(n) \ge c\,g(n)$ for all $n \ge n_0$.

14. $\Theta$ (big theta) notation
- A function $f(n) = \Theta(g(n))$ [i.e. $f(n) \in \Theta(g(n))$] iff there exist positive constants $c_1$, $c_2$ and $n_0$
such that $c_1\,g(n) \le f(n) \le c_2\,g(n)$ for all $n \ge n_0$.

15. Steps in analyzing efficiency of non-recursive algorithms
1. Decide on a parameter indicating an input's size.
2. Identify the basic operation (located in the innermost loop).
3. Check whether the number of times the basic operation is executed depends only on
the size of an input. If it also depends on some additional property, the worst-case,
average-case and best-case efficiencies are investigated separately.
4. Set up a sum expressing the number of times the algorithm's basic operation is executed.
5. Using standard formulas and rules of sum manipulation, either find a closed-form formula
for the count or, at the very least, establish its order of growth.

16. Steps for analyzing efficiency of recursive algorithms
1. Decide on a parameter indicating an input's size.
2. Identify the algorithm's basic operation.
3. Check whether the number of times the basic operation is executed can vary on different
inputs of the same size. If it can, investigate the worst, best and average cases separately.
4. Set up a recurrence relation, with an appropriate initial condition, for the number of times
the basic operation is executed.
5. Solve the recurrence or at least ascertain the order of growth of its solution.

17. Time Efficiency
Time efficiency is measured by counting the number of times the algorithm's basic operation is
executed.

18. Space Efficiency
Space efficiency is measured by counting the number of extra memory units consumed by the
algorithm.

19. Need for case efficiencies
The efficiencies of some algorithms may differ significantly for inputs of the same size. For such
algorithms, case (best, worst, average) efficiencies are needed.

20. Need for algorithm analysis framework
The framework's primary interest lies in the order of growth of the algorithm's running time as
its input size goes to infinity.

21. Efficiency classes
The efficiencies of a large number of algorithms fall into the following few classes; Constant,
Logarithmic, Linear, n-log-n, Quadratic, Cubic, and Exponential.

22. Main tool for analysing the time efficiency of a non-recursive algorithm
Setting up a sum expressing the number of executions of its basic operation and ascertaining the
sum's order of growth.

23. Main tool for analysing the time efficiency of a recursive algorithm
Setting up a recurrence expressing the number of executions of its basic operation and ascertaining
the solution's order of growth.

24. Brute Force Method
Brute force is a straightforward approach to solving a problem, usually directly based on the
problem's statement and definitions of the concepts involved. Examples: Matrix multiplication,
selection sort, sequential search.

25. Strength & Weakness of Brute Force Method
Strength
Wide applicability and simplicity
Weakness
Efficiency

26. Exhaustive Search
Exhaustive search is a brute force approach to combinatorial problems. It suggests generating
each and every combinatorial object of the problem, selecting those of them that satisfy the problem's
constraints and then finding a desired object. This search is impractical for large problems.

# UNIT II
DIVIDE AND CONQUER METHODOLOGY

27. What is Divide and conquer methodology?
1. A problem instance is divided into several smaller instances of the same problem, ideally
of about the same size.
2. The smaller instances are solved (typically recursively though sometimes a different
algorithm is employed when instances become small enough)
3. If necessary, the solutions obtained for the smaller instances are combined to get a
solution to the original problem.
Algorithm DANDC(P)
{
    if SMALL(P) then return SOLUTION(P)
    else
    {
        divide P into smaller instances P1, P2, ..., Pk, k ≥ 1
        apply DANDC to each of these subproblems
        return COMBINE(DANDC(P1), DANDC(P2), ..., DANDC(Pk))
    }
}
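
As a rough illustration of the DANDC scheme, the sketch below sums n numbers by splitting the list into two halves (k = 2). The function name `dc_sum` and its index parameters are assumptions made for this example; SMALL(P) corresponds to a range of at most one element and COMBINE is ordinary addition.

```python
def dc_sum(values, lo=0, hi=None):
    """Sum values[lo:hi] by divide and conquer."""
    if hi is None:
        hi = len(values)
    if hi - lo == 0:              # SMALL(P): empty range
        return 0
    if hi - lo == 1:              # SMALL(P): a single element
        return values[lo]
    mid = (lo + hi) // 2          # divide into two instances of about the same size
    # COMBINE: add the solutions of the two halves
    return dc_sum(values, lo, mid) + dc_sum(values, mid, hi)


print(dc_sum([1, 2, 3, 4, 5]))
```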

28. Computing time for finding the sum of n numbers

The time efficiency T(n) of many divide and conquer algorithms satisfies the equation
T(n) = aT(n/b) + f(n):

$$T(n) = \begin{cases} T(1) & n = 1 \\ a\,T(n/b) + f(n) & n > 1 \end{cases}$$

Let $a = b = 2$, $T(1) = 2$ and $f(n) = n$. Then

$$\begin{aligned}
T(n) &= 2T(n/2) + n \\
     &= 2(2T(n/4) + n/2) + n = 4T(n/4) + 2n \\
     &= 4(2T(n/8) + n/4) + 2n = 8T(n/8) + 3n \\
     &\;\;\vdots \\
     &= 2^i\,T(n/2^i) + i\,n \quad \text{for } 1 \le i \le \log_2 n
\end{aligned}$$

If $i = \log_2 n$, then $n/2^{\log_2 n} = 1$ and $2^{\log_2 n} = n$, so

$$T(n) = n\,T(1) + n \log_2 n = n \log_2 n + 2n$$

29. Merge Sort
A merge sort works as follows:
1. If the list is of length 0 or 1, then it is already sorted. Otherwise:
2. Divide the unsorted list into two sublists of about half the size.
3. Sort each sublist recursively by re-applying merge sort.
4. Merge the two sublists back into one sorted list.
Data structure used: Array
Worst case performance: $\Theta(n \log n)$
Best case performance: $\Theta(n \log n)$
Average case performance: $\Theta(n \log n)$
Worst case space complexity: $\Theta(n)$

Its principal drawback is the significant extra storage requirement.
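
The four steps above can be sketched as follows. This is a minimal illustration, assuming Python lists and a function name `merge_sort` chosen for the example; it returns a new list, which makes the extra storage requirement visible.

```python
def merge_sort(items):
    """Return a new sorted list using top-down merge sort."""
    if len(items) <= 1:                    # step 1: already sorted
        return list(items)
    mid = len(items) // 2                  # step 2: divide into two halves
    left = merge_sort(items[:mid])         # step 3: sort each half recursively
    right = merge_sort(items[mid:])
    merged = []                            # step 4: merge (auxiliary storage)
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:            # one key comparison per iteration
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])                # at most one of these is non-empty
    merged.extend(right[j:])
    return merged


print(merge_sort([8, 3, 2, 9, 7, 1, 5, 4]))
```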

30. Computing time for Merge sort

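$$T(n) = \begin{cases} a & n = 1 \\ 2T(n/2) + cn & n > 1 \end{cases} \qquad (a \text{ and } c \text{ are constants})$$

When n is a power of 2, $n = 2^k$ [$k = \log_2 n$]:

$$\begin{aligned}
T(n) &= 2(2T(n/4) + c(n/2)) + cn = 4T(n/4) + 2cn \\
     &= 4(2T(n/8) + c(n/4)) + 2cn = 8T(n/8) + 3cn \\
     &\;\;\vdots \\
     &= 2^k\,T(1) + kcn \\
     &= an + cn \log_2 n
\end{aligned}$$

Hence $T(n) = O(n \log n)$ for both the worst and the best case.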

31. Recurrence relation for the number of key comparisons in Merge sort

$$C(n) = \begin{cases} 0 & n = 1 \\ 2C(n/2) + C_{merge}(n) & n > 1 \end{cases}$$

$C_{merge}(n)$ is the number of key comparisons performed during the merging stage.
In the worst case, neither of the two arrays becomes empty before the other one contains just
one element, so

$$C_{merge}(n) = n - 1$$

Worst case

$$C_{worst}(n) = \begin{cases} 0 & n = 1 \\ 2C_{worst}(n/2) + n - 1 & n > 1 \end{cases}$$

If $n = 2^k$ then $C_{worst}(n) = n \log_2 n - n + 1$ and $T(n) = O(n \log n)$.

Best case

$$C_{best}(n) = \begin{cases} 0 & n = 1 \\ 2C_{best}(n/2) + n/2 & n > 1 \end{cases}$$

32. Disadvantages of Merge sort
1. Extra space (an auxiliary array of size n) is needed during the merge process
2. Stack space is increased by the use of recursion. The maximum depth of the stack is proportional to log n
3. Time is spent on recursion overhead in addition to sorting

33. Quick Sort

Quicksort sorts by employing a divide and conquer strategy to divide a list into two sub-lists.
The steps are:
1. Pick an element, called a pivot, from the list.
2. Reorder the list so that all elements which are less than the pivot come before the pivot
and so that all elements greater than the pivot come after it (equal values can go either
way). After this partitioning, the pivot is in its final position. This is called the partition
operation.
3. Recursively sort the sub-list of lesser elements and the sub-list of greater elements.
Quicksort works by partitioning its input's elements according to their value relative to some preselected
element, called the pivot. After partitioning, the elements to the left of the pivot are less than or equal
to the pivot and the elements to the right of the pivot are greater than or equal to it.
Worst case performance: $\Theta(n^2)$
Best case performance: $\Theta(n \log n)$
Average case performance: $\Theta(n \log n)$ comparisons
Worst case space complexity: Varies by implementation
Optimal result: Sometimes
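
The partition step can be sketched as below. Choosing the last element as the pivot and using this particular single-pass partition are assumptions of the example, since the notes do not fix a pivot rule; the name `quicksort` is likewise illustrative.

```python
def quicksort(items, lo=0, hi=None):
    """Sort items[lo..hi] in place; the last element of the range is the pivot."""
    if hi is None:
        hi = len(items) - 1
    if lo >= hi:                       # zero or one element: nothing to do
        return
    pivot = items[hi]
    boundary = lo                      # items[lo:boundary] are less than the pivot
    for i in range(lo, hi):
        if items[i] < pivot:
            items[i], items[boundary] = items[boundary], items[i]
            boundary += 1
    # Put the pivot into its final position.
    items[hi], items[boundary] = items[boundary], items[hi]
    quicksort(items, lo, boundary - 1)     # sub-list of lesser elements
    quicksort(items, boundary + 1, hi)     # sub-list of greater or equal elements


data = [5, 3, 1, 9, 8, 2, 4, 7]
quicksort(data)
print(data)
```

With this pivot rule an already sorted input produces the most unbalanced partitions, which is one way the quadratic worst case arises.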

34. Binary Search
Binary search is an O(log n) algorithm for searching in sorted arrays. It searches a sorted array by
repeatedly dividing the search interval in half. Begin with an interval covering the whole array. If the
value of the search key is less than the item in the middle of the interval, narrow the interval to the lower
half. Otherwise narrow it to the upper half. Repeatedly check until the value is found or the interval is
empty.
Data structure: Array
Worst case performance: O(log n)
Best case performance: O(1)
Average case performance: O(log n)
Worst case space complexity: O(1)
Optimal: Yes
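
A minimal iterative sketch of the procedure follows. The function name `binary_search` and the returned probe count are assumptions added for illustration; each probe is treated as one three-way key comparison.

```python
def binary_search(sorted_items, key):
    """Return (index, probes); index is -1 when the key is absent."""
    lo, hi = 0, len(sorted_items) - 1
    probes = 0
    while lo <= hi:                    # interval is not empty
        mid = (lo + hi) // 2
        probes += 1
        if sorted_items[mid] == key:
            return mid, probes
        if key < sorted_items[mid]:
            hi = mid - 1               # narrow to the lower half
        else:
            lo = mid + 1               # narrow to the upper half
    return -1, probes


data = [1, 3, 5, 7, 9, 11, 13]
print(binary_search(data, 7))    # found on the first probe
print(binary_search(data, 1))    # first element: needs 3 probes for n = 7
print(binary_search(data, 4))    # no match: also 3 probes
```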

35. Worst-case analysis of binary search
- Let $C_w(n)$ be the number of key comparisons in the worst case.
- Worst-case inputs include searches with no match, as well as some successful searches (for example, when the
first or last element matches the key).
- After one comparison, only half of the array remains to be considered, so the recurrence relation for $C_w(n)$ is

$$C_w(n) = \begin{cases} 1 & n = 1 \\ C_w(\lfloor n/2 \rfloor) + 1 & n > 1 \end{cases}$$

- The standard way of solving such recurrences is to assume $n = 2^k$ (n is a power of 2) and solve by
backward substitution:

$$C_w(2^k) = C_w(2^{k-1}) + 1 = C_w(2^{k-2}) + 2 = \dots = C_w(2^0) + k = k + 1 = \log_2 n + 1$$

- For an arbitrary positive integer n the solution is

$$C_w(n) = \lfloor \log_2 n \rfloor + 1 = \lceil \log_2 (n+1) \rceil$$

To verify that $C_w(n) = \lfloor \log_2 n \rfloor + 1$ satisfies the recurrence, consider a positive even n, $n = 2i$, $i > 0$.

LHS

$$C_w(n) = \lfloor \log_2 2i \rfloor + 1 = \lfloor \log_2 2 + \log_2 i \rfloor + 1 = (1 + \lfloor \log_2 i \rfloor) + 1 = \lfloor \log_2 i \rfloor + 2$$

RHS

$$C_w(\lfloor n/2 \rfloor) + 1 = C_w(\lfloor 2i/2 \rfloor) + 1 = C_w(i) + 1 = (\lfloor \log_2 i \rfloor + 1) + 1 = \lfloor \log_2 i \rfloor + 2$$

Both sides are equal, so the formula satisfies the recurrence for even n.

- The worst-case efficiency is O(log n)

36. Multiplication Of Large Integers
- The method decreases the total number of multiplications performed at the expense of a slight increase in
the number of additions.
- Modern cryptology requires manipulation of integers that are over 100 decimal digits long.
Such integers are too long to fit in a single word of a modern computer, so they
require special treatment.
DANDC technique
- to multiply two n-digit numbers
- reduces the number of multiplications
- Representation
Let $C = a \cdot b$, where a is the first number and b is the second number, each with n digits (n even).
Split each number into two halves of n/2 digits:

$$a = a_1 a_0 = a_1 \cdot 10^{n/2} + a_0, \qquad b = b_1 b_0 = b_1 \cdot 10^{n/2} + b_0$$

Then, using $10^{n/2} \cdot 10^{n/2} = 10^{n/2 + n/2} = 10^{2n/2} = 10^n$,

$$C = (a_1 \cdot 10^{n/2} + a_0)(b_1 \cdot 10^{n/2} + b_0) = (a_1 b_1)\,10^n + (a_1 b_0 + a_0 b_1)\,10^{n/2} + (a_0 b_0)$$

- Formula

$$C = C_2\,10^n + C_1\,10^{n/2} + C_0$$

$$C_2 = a_1 b_1, \qquad C_0 = a_0 b_0, \qquad C_1 = (a_1 + a_0)(b_1 + b_0) - (C_2 + C_0)$$
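
The formula can be tried out with a short recursive sketch. It assumes non-negative Python integers, splits on decimal digits, and falls back to built-in multiplication once an operand has a single digit; the name `karatsuba` is the commonly used name for this technique and is not taken from the notes.

```python
def karatsuba(a, b):
    """Multiply non-negative integers with three recursive half-size products."""
    if a < 10 or b < 10:                  # one-digit operand: multiply directly
        return a * b
    n = max(len(str(a)), len(str(b)))
    p = 10 ** (n // 2)                    # plays the role of 10^(n/2)
    a1, a0 = divmod(a, p)                 # a = a1 * p + a0
    b1, b0 = divmod(b, p)                 # b = b1 * p + b0
    c2 = karatsuba(a1, b1)
    c0 = karatsuba(a0, b0)
    c1 = karatsuba(a1 + a0, b1 + b0) - (c2 + c0)
    return c2 * p * p + c1 * p + c0


print(karatsuba(2135, 4014))
print(2135 * 4014)      # built-in product for comparison
```

Because p is always an exact power of ten, the recombination is valid even when n is odd.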

37. How many digit multiplications does the algorithm for multiplying two large integers make?
- Multiplication of n-digit numbers requires 3 multiplications of n/2-digit numbers.
- The recurrence relation for the number of multiplications is

$$M(n) = \begin{cases} 1 & n = 1 \\ 3M(n/2) & n > 1 \end{cases}$$

By backward substitution for $n = 2^k$:

$$\begin{aligned}
M(2^k) &= 3M(2^{k-1}) \\
       &= 3(3M(2^{k-2})) = 3^2 M(2^{k-2}) \\
       &\;\;\vdots \\
       &= 3^i M(2^{k-i}) = \dots = 3^k M(2^{k-k}) = 3^k
\end{aligned}$$

Since $k = \log_2 n$,

$$M(n) = 3^{\log_2 n} = n^{\log_2 3} \approx n^{1.585} \qquad [\text{by the property of logarithms } a^{\log_b c} = c^{\log_b a}]$$

The DANDC algorithm for multiplying two n-digit integers requires about $n^{1.585}$
one-digit multiplications.

38. Strassen's Algorithm
Strassen's algorithm needs only seven multiplications to multiply two 2-by-2 matrices
but requires more additions than the definition-based algorithm. By exploiting the DANDC
technique, this algorithm can multiply two n-by-n matrices with about $n^{2.807}$
multiplications.

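39. Strassen's method

$$\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \cdot \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix} = \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix}$$

$$\begin{aligned}
C_{11} &= m_1 + m_4 - m_5 + m_7 \\
C_{12} &= m_3 + m_5 \\
C_{21} &= m_2 + m_4 \\
C_{22} &= m_1 + m_3 - m_2 + m_6
\end{aligned}$$

Where

$$\begin{aligned}
m_1 &= (A_{11} + A_{22})(B_{11} + B_{22}) \\
m_2 &= (A_{21} + A_{22})\,B_{11} \\
m_3 &= A_{11}\,(B_{12} - B_{22}) \\
m_4 &= A_{22}\,(B_{21} - B_{11}) \\
m_5 &= (A_{11} + A_{12})\,B_{22} \\
m_6 &= (A_{21} - A_{11})(B_{11} + B_{12}) \\
m_7 &= (A_{12} - A_{22})(B_{21} + B_{22})
\end{aligned}$$

- This method needs 7 multiplications and 18 additions or subtractions:
7 multiplications in $m_1, m_2, \dots, m_7$
10 additions or subtractions in $m_1, m_2, \dots, m_7$
8 additions or subtractions for computing the $C_{ij}$'s
- Recurrence relation

$$T(n) = \begin{cases} b & n \le 2 \\ 7T(n/2) + an^2 & n > 2 \end{cases}$$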
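
The seven products can be checked on a small numeric case. The sketch below applies the formulas to 2-by-2 matrices of plain numbers stored as nested lists; the function name `strassen_2x2` and that representation are assumptions for illustration only.

```python
def strassen_2x2(a, b):
    """Multiply two 2x2 matrices using Strassen's seven products."""
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 + m3 - m2 + m6]]


print(strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

For these inputs the definition-based product is [[19, 22], [43, 50]], and the seven-product version gives the same matrix.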

40. How many multiplications M(n) are made by Strassen's method?
Recurrence relation

$$M(n) = \begin{cases} 1 & n = 1 \\ 7M(n/2) & n > 1 \end{cases}$$

Let $n = 2^k$:

$$\begin{aligned}
M(2^k) &= 7M(2^{k-1}) \\
       &= 7(7M(2^{k-2})) = 7^2 M(2^{k-2}) \\
       &\;\;\vdots \\
       &= 7^i M(2^{k-i}) = \dots = 7^k M(2^{k-k}) = 7^k M(1) = 7^k
\end{aligned}$$

Since $k = \log_2 n$,

$$M(n) = 7^{\log_2 n} = n^{\log_2 7} \approx n^{2.807} \qquad (\text{by the logarithm property})$$

which is smaller than $n^3$.

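41. Strassen's Matrix Multiplication for matrices of order n>2

Consider two 4 x 4 matrices X and Y, each partitioned into four 2 x 2 blocks:

$$X = \left(\begin{array}{cc|cc} A_{11} & A_{12} & A_{13} & A_{14} \\ A_{21} & A_{22} & A_{23} & A_{24} \\ \hline A_{31} & A_{32} & A_{33} & A_{34} \\ A_{41} & A_{42} & A_{43} & A_{44} \end{array}\right) = \begin{pmatrix} A & B \\ C & D \end{pmatrix}, \qquad Y = \left(\begin{array}{cc|cc} B_{11} & B_{12} & B_{13} & B_{14} \\ B_{21} & B_{22} & B_{23} & B_{24} \\ \hline B_{31} & B_{32} & B_{33} & B_{34} \\ B_{41} & B_{42} & B_{43} & B_{44} \end{array}\right) = \begin{pmatrix} R & S \\ T & U \end{pmatrix}$$

Find Z = X x Y

I Method

$$Z = \begin{pmatrix} AR + BT & AS + BU \\ CR + DT & CS + DU \end{pmatrix}$$

Apply Strassen's method to each of the eight block products AR, BT, AS, BU, CR, DT, CS, DU.

II Method

Apply Strassen's method to X & Y directly, treating the 2 x 2 blocks as the elements:

M1 = (A+D)*(R+U)        L = M1+M4-M5+M7
M2 = (C+D)*R            M = M3+M5
M3 = A*(S-U)            N = M2+M4
M4 = D*(T-R)            O = M1+M3-M2+M6
M5 = (A+B)*U
M6 = (C-A)*(R+S)
M7 = (B-D)*(T+U)

$$Z = \begin{pmatrix} L & M \\ N & O \end{pmatrix}$$

Adding zeroes to matrices whose order is not a power of two

Z = X x Y, where X is a 3 x 4 matrix, Y is a 4 x 3 matrix and Z is a 3 x 3 matrix.

Let

$$X = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{pmatrix}, \qquad Y = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}$$

Add a row of zeroes as the last row of X and a column of zeroes as the last column of Y (or, alternatively,
as the first row of X and the first column of Y):

$$X = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \qquad Y = \begin{pmatrix} 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 0 \end{pmatrix}$$

Z = X x Y: apply Strassen's method to the padded 4 x 4 matrices.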

GREEDY METHOD

42. What is Greedy method?
- The greedy approach suggests constructing a solution through a sequence of steps, each
expanding a partially constructed solution obtained so far, until a complete solution to the
problem is reached.
- On each step the choice made must be
Feasible, i.e. it has to satisfy the problem's constraints
Locally optimal, i.e. it has to be the best local choice among all feasible choices
available on that step
Irrevocable, i.e. once made, it cannot be changed on subsequent steps of the algorithm
Algorithm GREEDY(a, n)
{
    solution = ∅    // start with an empty solution
    for i = 1 to n
    {
        x = SELECT(a)
        if FEASIBLE(solution, x) then solution = UNION(solution, x)
    }
    return solution
}

43. Spanning Tree
- a spanning tree of a connected graph is its connected acyclic subgraph ((ie) a tree) that
contains all the vertices of a graph.

44. Minimum Spanning tree
- a minimum spanning tree of a weighted connected graph is its spanning tree of the smallest
weight, where the weight of a tree is defined as the sum of the weights on all its edges.

45. Prim's algorithm
Prim's algorithm is a greedy algorithm for constructing a minimum spanning tree of a
weighted connected graph. It works by attaching to a previously constructed subtree a vertex
closest to the vertices already in the tree.

Prim's algorithm is an algorithm that finds a minimum spanning tree for a connected
weighted graph. This means it finds a subset of the edges that forms a tree that includes every
vertex, where the total weight of all the edges in the tree is minimized.

46. Steps of Prim's algorithm
1. Choose any vertex as the initial vertex of the minimum spanning tree.
2. Select the shortest edge from the tree built so far to a vertex outside the tree.
a. To find the shortest edge, information about the nearest vertices is provided in
the following way:
i. Each non-tree vertex is labelled with the name of its nearest tree vertex and the
length (weight) of the corresponding edge. The Fringe contains only the vertices that
are not in the tree but are adjacent to at least one tree vertex. These are the
candidates from which the next tree vertex is selected.
ii. Vertices that are not adjacent to any tree vertex can be given a label of
infinite distance. All these other vertices of the graph are called Unseen,
because they are yet to be affected by the algorithm.
3. After identifying a vertex u* to be added to the tree, perform two operations:
a. Move u* from the set $V - V_T$ to the set of tree vertices $V_T$.
b. For each remaining vertex u in $V - V_T$ that is connected to u* by a shorter edge than
u's current distance label, update its labels to u* and the weight of the edge
between u* & u, respectively.

47. Implementation of Prims algorithm (Data structure used)

- The efficiency of Prim's algorithm depends on the data structures chosen for the graph and for the
priority queue of fringe vertices.
- If the graph is represented by its weight matrix and the priority queue is implemented as an
unordered array, the algorithm's running time will be $\Theta(n^2)$, where n is the number of vertices
in the graph.
- On each of the n-1 iterations, the array implementing the priority queue is traversed to find
and delete the minimum and then to update, if necessary, the priorities of the remaining
vertices.
- The priority queue can also be implemented as a min-heap.
- If the graph is represented by its adjacency linked lists and the priority queue is implemented
as a min-heap, the running time of the algorithm is $O(E \log n)$.
This is because the algorithm performs n-1 deletions of the smallest element and
makes E verifications and, possibly, changes of an element's priority in a min-heap of
size not greater than n.
- Each of these operations is an O(log n) operation.
- The running time of this implementation of Prim's algorithm is
$((n-1) + E)\,O(\log n) = O(E \log n)$, because in a connected graph $n - 1 \le E$.
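
A heap-based version can be sketched with Python's standard `heapq` module. Note the assumptions: the graph is given as a dictionary of adjacency lists of `(weight, neighbour)` pairs, and instead of changing priorities inside the heap this sketch pushes duplicate entries and skips stale ones, a common simplification rather than the exact scheme described above. The name `prim_mst` is illustrative.

```python
import heapq


def prim_mst(graph, start):
    """Return (total_weight, tree_edges) for a connected undirected graph.

    graph maps each vertex to a list of (weight, neighbour) pairs.
    """
    in_tree = {start}
    fringe = [(w, start, v) for w, v in graph[start]]   # candidate edges
    heapq.heapify(fringe)
    total, edges = 0, []
    while fringe and len(in_tree) < len(graph):
        w, u, v = heapq.heappop(fringe)    # lightest edge leaving the tree
        if v in in_tree:
            continue                       # stale entry: v was added earlier
        in_tree.add(v)
        total += w
        edges.append((u, v, w))
        for w2, x in graph[v]:
            if x not in in_tree:
                heapq.heappush(fringe, (w2, v, x))
    return total, edges


graph = {
    "a": [(3, "b"), (1, "c")],
    "b": [(3, "a"), (1, "c"), (4, "d")],
    "c": [(1, "a"), (1, "b"), (2, "d")],
    "d": [(4, "b"), (2, "c")],
}
print(prim_mst(graph, "a"))
```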

48. Kruskal's Algorithm
- This algorithm looks at an MST for a weighted connected graph G=<V,E> as an acyclic
subgraph with |V|-1 edges for which the sum of the edge weights is the smallest.
- The algorithm constructs an MST as an expanding sequence of subgraphs, which are always
acyclic but not necessarily connected at the intermediate stages of the algorithm.
- The algorithm begins by sorting the graph's edges in non-decreasing order of their weights.
- Then, starting with the empty subgraph, it scans this sorted list, adding the next edge on the
list to the current subgraph if such inclusion does not create a cycle and simply skipping the
edge otherwise.
- The intermediate subgraph is generally only a forest; a set of edges t can be completed into a
tree iff there are no cycles in t.
OR
Kruskal's algorithm is a greedy algorithm for the minimum spanning tree problem. It
constructs a minimum spanning tree by selecting edges in non-decreasing order of their weights,
provided that the inclusion does not create a cycle. It requires Union-Find algorithms.
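
The edge scan with a Union-Find cycle test can be sketched as follows. The sketch assumes vertices numbered 0 to n-1 and edges given as `(weight, u, v)` tuples; the function name `kruskal_mst` and the simple path-halving `find` are choices made for this example.

```python
def kruskal_mst(n, edges):
    """Return (total_weight, tree_edges); edges are (weight, u, v) tuples."""
    parent = list(range(n))            # each vertex starts in its own set

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    total, tree = 0, []
    for w, u, v in sorted(edges):      # non-decreasing order of weight
        root_u, root_v = find(u), find(v)
        if root_u != root_v:           # different sets: no cycle is created
            parent[root_u] = root_v    # union the two sets
            tree.append((u, v, w))
            total += w
    return total, tree


edges = [(3, 0, 1), (1, 0, 2), (1, 1, 2), (4, 1, 3), (2, 2, 3)]
print(kruskal_mst(4, edges))
```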

49. Dijkstra's algorithm
Dijkstra's algorithm solves the single-source shortest path problem of finding shortest
paths from a given vertex (the source) to all the other vertices of a weighted graph or digraph. It
works like Prim's algorithm but compares path lengths rather than edge lengths. Dijkstra's
algorithm always yields a correct solution for a graph with non-negative edge lengths.

50. Difference between Dijkstra's and Prim's algorithms
Dijkstra's algorithm
1. Finds single-source shortest paths
2. Compares path lengths and therefore must add edge weights
Prim's algorithm
1. Finds a minimum spanning tree
2. Compares the edge weights as given

51. Difference between Prim's & Kruskal's algorithms
Prim's algorithm
1. Expands the tree by one vertex at a time, by selecting the minimum-weight edge that connects
a tree vertex to a vertex outside the tree at that stage
2. At any point of time the intermediate result is a single tree with no cycles
Kruskal's algorithm
1. Selects the minimum-weight edge among all remaining edges, provided it does not create a cycle
2. At any point of time the intermediate result may be a forest
3. All edges are first sorted in non-decreasing order of weight

52. Difference between Greedy and Dynamic Programming Method
- In the greedy method only one decision sequence is ever generated

- In the dynamic programming, many decision sequences may be generated.
- Dynamic programming algorithms often have a polynomial complexity.

53. Single Source Shortest Path (SSSP) Problem (Dijkstra's algorithm)
- Given a directed graph G = (V,E), with non-negative costs on each edge, and a selected source node
v in V, for all w in V, find the cost of the least cost path from v to w.
- The cost of a path is simply the sum of the costs on the edges traversed by the path.

54. Data structures used by Dijkstra's algorithm include:
- a cost matrix C, where C[i,j] is the weight on the edge connecting node i to node j. If there is no
such edge, C[i,j] = infinity.
- a set of nodes S, containing all the nodes whose shortest path from the source node is known.
Initially, S contains only the source node.
- a distance vector D, where D[i] contains the cost of the shortest path (so far) from the source
node to node i, using only those nodes in S as intermediaries.
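
The three data structures can be seen working together in the sketch below, which keeps the names C, S and D from the description. It is a simple quadratic-time version with no priority queue; the function name `dijkstra` and the use of `math.inf` for missing edges are assumptions for illustration.

```python
import math


def dijkstra(C, source):
    """Return the distance vector D for the cost matrix C and a source node."""
    n = len(C)
    S = {source}                              # nodes with known shortest paths
    D = [C[source][j] for j in range(n)]      # direct-edge costs to begin with
    D[source] = 0
    while len(S) < n:
        # Pick the node outside S that is closest to the source.
        w = min((j for j in range(n) if j not in S), key=lambda j: D[j])
        if D[w] == math.inf:
            break                             # remaining nodes are unreachable
        S.add(w)
        for j in range(n):
            if j not in S and D[w] + C[w][j] < D[j]:
                D[j] = D[w] + C[w][j]         # shorter path through w
    return D


INF = math.inf
C = [
    [0, 10, 3, INF],
    [INF, 0, 1, 2],
    [INF, 4, 0, 8],
    [INF, INF, INF, 0],
]
print(dijkstra(C, 0))
```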

# UNIT III
DYNAMIC PROGRAMMING

55. What is Dynamic programming?
- It is a technique for solving problems with overlapping sub-problems.
o These sub-problems arise from a recurrence relating a solution to a given problem
with solutions to its smaller sub-problems of the same type.
- It is an algorithm design method that can be used when the solution to a problem can be
viewed as the result of a sequence of decisions.
o An optimal sequence of decisions can be found by making the decisions one at a
time and never making an erroneous decision.
- It suggests solving each smaller sub-problem once and recording the results in a table
from which a solution to the original problem can then be obtained.

56. Principle of Optimality
The principle of optimality states that an optimal sequence of decisions has the
property that whatever the initial state and decision are, the remaining decisions
must constitute an optimal decision sequence with regard to the state resulting
from the first decision.
OR
An optimal solution to any instance of the problem must be made up of optimal solutions to
its sub-instances.

57. Computing Binomial Coefficient
- The binomial coefficient, denoted by $C(n,k)$ or $\binom{n}{k}$, is the number of combinations (subsets)
of k elements from an n-element set ($0 \le k \le n$).

- The binomial coefficient is defined by factorials:

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!} \quad (0 \le k \le n), \qquad \binom{n}{k} = 0 \quad (k < 0 \text{ or } k > n)$$

- Of the numerous properties of binomial coefficients, we concentrate on the following two:

C(n,k) = C(n-1,k-1) + C(n-1,k) for n>k>0
C(n,0) = C(n,n) = 1

$$C(n,k) = \begin{cases} 1 & k = 0 \text{ or } k = n \\ C(n-1,k-1) + C(n-1,k) & n > k > 0 \end{cases}$$

- To solve, record the values of the binomial coefficients in a table of n+1 rows and k+1
columns, numbered from 0 to n and from 0 to k, respectively.
- To compute C(n,k), fill the table row by row, starting with row 0 and ending with row n.
- Each row i ($0 \le i \le n$) is filled left to right, starting with 1 because C(i,0) = 1.
- Rows 0 through k also end with 1 on the table's main diagonal: C(i,i) = 1 ($0 \le i \le k$).
- Compute the other entries by adding the contents of the cell in the preceding row and previous
column and the cell in the preceding row and the same column.
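
The table-filling procedure can be sketched as below, assuming $0 \le k \le n$. The function name `binomial` and the list-of-lists table are illustrative; only the first min(i, k) + 1 cells of row i are filled, matching the triangle-plus-rectangle shape of the table.

```python
def binomial(n, k):
    """Compute C(n, k) for 0 <= k <= n by filling the table row by row."""
    table = [[0] * (k + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(min(i, k) + 1):
            if j == 0 or j == i:
                table[i][j] = 1           # C(i, 0) = C(i, i) = 1
            else:
                # preceding row, previous column + preceding row, same column
                table[i][j] = table[i - 1][j - 1] + table[i - 1][j]
    return table[n][k]


print(binomial(5, 2))
print(binomial(6, 3))
```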

58. Time efficiency for computing binomial coefficient
- The basic operation is addition.
- Let A(n,k) be the total number of additions for computing C(n,k).
- Computing each entry in the table requires just one addition.
- The first k+1 rows of the table form a triangle.
- The remaining n-k rows of the table form a rectangle.
o So split the sum expressing A(n,k) as

$$\begin{aligned}
A(n,k) &= \sum_{i=1}^{k} \sum_{j=1}^{i-1} 1 + \sum_{i=k+1}^{n} \sum_{j=1}^{k} 1 \\
       &= \sum_{i=1}^{k} (i-1) + \sum_{i=k+1}^{n} k \\
       &= \sum_{i=1}^{k} i - \sum_{i=1}^{k} 1 + k \sum_{i=k+1}^{n} 1 \\
       &= \frac{(k-1)k}{2} + k(n-k) \\
       &\in O(nk)
\end{aligned}$$

Working to prove $A(n,k) = \frac{(k-1)k}{2} + k(n-k)$:

$$\sum_{i=1}^{n} i = \frac{n(n+1)}{2} \quad \Rightarrow \quad \sum_{i=1}^{k} i = \frac{k(k+1)}{2}$$

$$\sum_{i=1}^{k} 1 = u - l + 1 = k - 1 + 1 = k$$

$$\sum_{i=k+1}^{n} 1 = u - l + 1 = n - (k+1) + 1 = n - k - 1 + 1 = n - k$$

$$\sum_{i=1}^{k} i - \sum_{i=1}^{k} 1 = \frac{k(k+1)}{2} - k = \frac{k(k+1) - 2k}{2} = \frac{k^2 + k - 2k}{2} = \frac{k^2 - k}{2} = \frac{(k-1)k}{2}$$

$$A(n,k) = \frac{(k-1)k}{2} + k(n-k)$$

59. Warshall's algorithm
- It constructs the transitive closure of a given digraph with n vertices through a series of n x
n Boolean matrices

$$R^{(0)}, R^{(1)}, \dots, R^{(k-1)}, R^{(k)}, \dots, R^{(n)}$$

- Each of these matrices provides certain information about directed paths in the digraph.
- The element $r_{ij}^{(k)}$ in the i-th row & j-th column of the matrix $R^{(k)}$ (k = 0, 1, ..., n) is equal to 1 iff
there exists a directed path from the i-th vertex to the j-th vertex with each intermediate
vertex, if any, numbered not higher than k.
- The formula for generating the elements of matrix $R^{(k)}$ from the elements of matrix $R^{(k-1)}$ is

$$r_{ij}^{(k)} = r_{ij}^{(k-1)} \;\text{ or }\; \left( r_{ik}^{(k-1)} \text{ and } r_{kj}^{(k-1)} \right)$$

o If an element $r_{ij}$ is 1 in $R^{(k-1)}$, it remains 1 in $R^{(k)}$.
o If an element $r_{ij}$ is 0 in $R^{(k-1)}$, it is changed to 1 in $R^{(k)}$ iff the element in its row i &
column k and the element in its column j & row k are both 1 in $R^{(k-1)}$.
- Time efficiency is cubic, $O(n^3)$
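
The update formula translates almost directly into three nested loops. The sketch below assumes a 0/1 adjacency matrix stored as nested lists and overwrites a single working copy, which is a standard space-saving variation on keeping every $R^{(k)}$; the function name `warshall` is illustrative.

```python
def warshall(adjacency):
    """Return the transitive closure of a digraph given as a 0/1 matrix."""
    n = len(adjacency)
    r = [row[:] for row in adjacency]      # R(0) is the adjacency matrix
    for k in range(n):                     # allow vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                r[i][j] = r[i][j] or (r[i][k] and r[k][j])
    return r


# Digraph with edges a->b, b->d, d->a, d->c (vertices a, b, c, d = 0, 1, 2, 3).
A = [
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
    [1, 0, 1, 0],
]
for row in warshall(A):
    print(row)
```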

60. Floyd's Algorithm
- It is used to solve the all-pairs shortest-paths problem.
- It uses the idea of Warshall's algorithm.
- Given a weighted connected graph (undirected or directed), the all-pairs shortest-paths
problem asks for the distances (the lengths of the shortest paths) from each vertex to all
other vertices.
- The lengths of the shortest paths are recorded in an n x n matrix D called the Distance matrix.
o The element $d_{ij}$ in the i-th row and the j-th column of this matrix indicates the length
of the shortest path from the i-th vertex to the j-th vertex ($1 \le i, j \le n$).
- It computes the distance matrix of a weighted graph with n vertices through a series of n
x n matrices

$$D^{(0)}, D^{(1)}, \dots, D^{(k-1)}, D^{(k)}, \dots, D^{(n)}$$

- Each of these matrices contains the lengths of the shortest paths with certain constraints
on the paths considered.
- The element $d_{ij}^{(k)}$ in the i-th row & j-th column of the matrix $D^{(k)}$ (k = 0, 1, ..., n) is equal to the
length of the shortest path among all paths from the i-th vertex to the j-th vertex with each
intermediate vertex, if any, numbered not higher than k.
- The shortest path among the paths that use the k-th vertex has length $d_{ik}^{(k-1)} + d_{kj}^{(k-1)}$, so

$$d_{ij}^{(k)} = \begin{cases} \min\left\{ d_{ij}^{(k-1)},\; d_{ik}^{(k-1)} + d_{kj}^{(k-1)} \right\} & k \ge 1 \\ w_{ij} & k = 0 \end{cases}$$

- The element in the i-th row & j-th column of the current distance matrix $D^{(k-1)}$ is replaced by
the sum of the element in the same row i & the k-th column and the element in the same column j &
the k-th row iff the latter sum is smaller than its current value.
- Time efficiency is $O(n^3)$
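
The recurrence can be sketched with the same loop structure as Warshall's algorithm. The sketch assumes a weight matrix with `math.inf` for missing edges and zeroes on the diagonal, and it updates one working copy in place; the function name `floyd` is illustrative.

```python
import math


def floyd(weights):
    """Return the all-pairs distance matrix for a weight matrix."""
    n = len(weights)
    d = [row[:] for row in weights]        # D(0) is the weight matrix
    for k in range(n):                     # allow vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


INF = math.inf
W = [
    [0, INF, 3, INF],
    [2, 0, INF, INF],
    [INF, 7, 0, 1],
    [6, INF, INF, 0],
]
for row in floyd(W):
    print(row)
```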

61. Difference between Warshall's & Floyd's algorithms
Warshall's
- Input is adjacency matrix
- Output is transitive closure
- If there is no direct edge between vertices, the value in adjacency matrix is zero
- Algorithm looks for 1s in the adjacency matrix to find transitive closure matrix
- Transitive closure matrix is a Boolean matrix
Floyd's
- Input is weight matrix
- Output is distance matrix
- If there is no direct edge between vertices, the value in weight matrix is infinity and
diagonal elements are zero
- Algorithm looks for minimum value in the weight matrix to find distance matrix
- Distance matrix is not a Boolean matrix and will not have infinity

62. 0/1 Knapsack Problem
- Given n items of known weights $w_1, \dots, w_n$ and values $v_1, \dots, v_n$ and a knapsack of capacity
W, find the most valuable subset of the items that fit into the knapsack. All weights and the
knapsack capacity are positive integers; the item values can be real numbers.
- The aim is to fill the knapsack in a way that maximizes the value of the included objects,
while respecting the capacity constraint.
- Let $x_i$ be 0 if we do not select object i, or 1 if we include object i.
- The problem may be stated as

$$\text{Maximize } \sum_{i=1}^{n} v_i x_i \quad \text{subject to } \sum_{i=1}^{n} w_i x_i \le W$$

where $v_i > 0$, $w_i > 0$ and $x_i \in \{0, 1\}$.

- To solve the problem by Dynamic Programming, we set up a table V(1:n,0:W), with one
row for each available object and one column for each weight from 0 to W.
- The solution of the instance can be found in V(n,W).
- Fill the table either row by row or column by column.
- The recurrence for the knapsack problem is

$$V(i,j) = \begin{cases} \max\{ V(i-1,j),\; V(i-1,j-w_i) + v_i \} & j - w_i \ge 0 \\ V(i-1,j) & j - w_i < 0 \end{cases}$$

If $V(i-1,j)$ is larger, the object i is discarded.

If $V(i-1,j-w_i) + v_i$ is larger, the object i is included.

- The initial conditions are
$V(0,j) = 0$ for $j \ge 0$
$V(i,j) = -\infty$ for all i when $j < 0$
$V(i,0) = 0$ for $i \ge 0$
- Fill the table V using the above formula.
- The solution is found at V(n,W).
- To find the solution vector, start from V(n,W) and trace back the computations in the table.
- To add or discard an item i, the following criteria are checked for each item:
1. If $V(i,j) = V(i-1,j)$ and $V(i,j) \ne V(i-1,j-w_i) + v_i$ then discard item i.
2. If $V(i,j) \ne V(i-1,j)$ and $V(i,j) = V(i-1,j-w_i) + v_i$ then include item i.
3. If $V(i,j) = V(i-1,j)$ and $V(i,j) = V(i-1,j-w_i) + v_i$ then include item i.
4. If $V(i,j) \ne V(i-1,j)$ and $V(i,j) \ne V(i-1,j-w_i) + v_i$ then discard item i.
- If item i is added, update the total weight of the items included in the sack (reduce the remaining capacity j by $w_i$).
- Time & space efficiency is O(nW), the time necessary to construct the table.
- The composition of the optimal load can be determined in a time of O(n+W).
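
The bottom-up table and the trace-back can be sketched as follows. The sketch assumes integer weights and capacity, adds a row 0 to hold the initial conditions, and on a tie discards the item during trace-back (one of several valid choices when both alternatives give the same value). The function name `knapsack` and the sample instance are illustrative.

```python
def knapsack(weights, values, capacity):
    """Return (best_value, chosen_items) with items numbered from 1."""
    n = len(weights)
    # V[i][j]: best value using the first i items with capacity j.
    V = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        w, v = weights[i - 1], values[i - 1]
        for j in range(capacity + 1):
            V[i][j] = V[i - 1][j]                      # discard item i
            if j >= w and V[i - 1][j - w] + v > V[i][j]:
                V[i][j] = V[i - 1][j - w] + v          # include item i
    # Trace back from V[n][capacity] to recover the chosen items.
    chosen, j = [], capacity
    for i in range(n, 0, -1):
        if V[i][j] != V[i - 1][j]:                     # item i was included
            chosen.append(i)
            j -= weights[i - 1]
    return V[n][capacity], sorted(chosen)


print(knapsack([2, 1, 3, 2], [12, 10, 20, 15], 5))
```

For this instance the best load has value 37 and consists of items 1, 2 and 4, whose weights add up to exactly 5.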

63. Memory Functions
- The direct top-down approach to solving such a recurrence leads to an algorithm that
solves common subproblems more than once and is hence inefficient (typically exponential
or worse).
- The classic dynamic programming approach to the 0/1 knapsack problem works bottom-up: it
fills a table with solutions to all smaller subproblems, each solved only once.
o The solutions of some of these smaller subproblems are often not necessary for
getting a solution to the problem given.
- The goal is to get a method that solves only the subproblems that are necessary and solves
each of them only once. Such a method exists; it is based on using memory functions.
o This method solves a given problem in the top-down manner but, in addition,
maintains a table of the kind that is used in the bottom-up approach.
o Initially, all the table's entries are initialized with a special "null" symbol to
indicate that they have not yet been calculated.
o Whenever a new value needs to be calculated, the method checks the
corresponding entry in the table first; if this entry is not "null", it is simply
retrieved from the table; otherwise, it is computed by the recursive call, whose
result is then recorded in the table.
o After initializing the table, the recursive function is called with i = n (the number
of items) and j = W (the knapsack capacity).
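A memory function for the knapsack problem might look like the sketch below. It assumes Python's None as the "null" symbol and a hypothetical function name; row 0 and column 0 of the table are pre-filled with 0 as the known base cases.

```python
def mf_knapsack(weights, values, capacity):
    """Top-down knapsack with a table; returns (optimal value, table)."""
    n = len(weights)
    # None marks an entry that has not yet been calculated.
    F = [[None] * (capacity + 1) for _ in range(n + 1)]
    for j in range(capacity + 1):
        F[0][j] = 0                      # no items: value 0
    for i in range(n + 1):
        F[i][0] = 0                      # no capacity: value 0

    def solve(i, j):
        if F[i][j] is None:              # compute only on first request
            best = solve(i - 1, j)
            if j >= weights[i - 1]:
                best = max(best, values[i - 1] + solve(i - 1, j - weights[i - 1]))
            F[i][j] = best               # record the result in the table
        return F[i][j]

    return solve(n, capacity), F
```

Calling mf_knapsack([2, 1, 3, 2], [12, 10, 20, 15], 5) returns the value 37, and entries such as F[4][1] are still None afterwards, which shows that unnecessary subproblems were never solved.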

63.a. Optimal Binary Search Tree
An optimal binary search tree is a binary search tree in which the average cost of looking
up an item (the expected search cost, given the access probabilities of the keys) is minimized.

UNIT IV
BACKTRACKING

64. Backtracking
Backtracking constructs its state-space tree in a depth-first search fashion. If the
sequence of choices represented by the current node of the state-space tree can be developed
further without violating the problem's constraints, it is extended by taking the first remaining
legitimate option for the next component. Otherwise, the method backtracks: it undoes the last
component of the partially built solution and replaces it with the next alternative.

64.a. Characteristics of Backtracking
- Backtracking is typically applied to difficult combinatorial problems for which efficient
algorithms for finding exact solutions possibly do not exist.

- Unlike the exhaustive search approach, which is doomed to be extremely slow for all
instances of a problem, backtracking at least holds out hope of solving some instances of
non-trivial size in an acceptable amount of time. This is especially true for optimization
problems, for which the idea of backtracking can be further enhanced by evaluating the
quality of partially constructed solutions.

- Even if backtracking does not eliminate any elements of a problem's state space and ends
up generating all of them, it provides a specific technique for doing so, which can be
of value in its own right.

65. State Space Tree
A state-space tree is a rooted tree whose nodes represent partially constructed solutions to
the problem in question. It is constructed in the manner of depth-first search. Its root
represents an initial state before the search for a solution begins. The nodes of the first level
in the tree represent the choices made for the first component of a solution; the nodes of the
second level represent the choices for the second component, and so on.

66. Promising Node
A node in a state-space tree is said to be promising if it corresponds to a partially
constructed solution that may still lead to a complete solution; otherwise it is non-promising.

67. Explicit Constraints
Explicit constraints are rules that restrict each $x_i$ to take a value only from a given set
$S_i$. Examples:
$x_i \ge 0$, i.e. $S_i$ = {all nonnegative real numbers}
$x_i = 0$ or $1$, i.e. $S_i = \{0, 1\}$
$l_i \le x_i \le u_i$, i.e. $S_i = \{a : l_i \le a \le u_i\}$
The explicit constraints depend on the particular instance I of the problem being solved.
All tuples that satisfy the explicit constraints define a possible solution space for I.

68. Implicit Constraints
Implicit constraints are rules that determine which of the tuples in the solution space of I
satisfy the criterion function. Thus implicit constraints describe the way in which the $x_i$
must relate to each other.

69. N Queens Problem
The n-queens problem is a combinatorial problem: place n queens on an n-by-n chessboard so
that no two queens attack each other by being in the same row, in the same column, or on the
same diagonal. With one queen per row, $x_i$ denotes the column of the queen in row i.
The explicit constraint: each $x_i$ must take a value from S = {1, 2, 3, ..., n}. The solution
space therefore consists of $n^n$ n-tuples.

The implicit constraints: no two $x_i$ can be the same (i.e. all queens must be in different
columns), and no two queens can be on the same diagonal.

Two queens at positions (i, j) and (k, l) lie on the same diagonal iff |j - l| = |i - k|.
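A minimal backtracking sketch for this formulation is given below. The function name and the choice to return only the first solution found are assumptions made for illustration; x[i] holds the column (1..n) of the queen in row i+1, and a column is promising only if it passes both implicit constraints.

```python
def n_queens(n):
    """Return the first solution found as a list of columns, or None if none exists."""
    x = []                                   # x[k] = column of the queen in row k+1

    def promising(col):
        row = len(x)                         # row about to be filled (0-based)
        for k, c in enumerate(x):
            # same column, or same diagonal: |j - l| == |i - k|
            if c == col or abs(c - col) == abs(k - row):
                return False
        return True

    def place():
        if len(x) == n:
            return True                      # all n queens placed
        for col in range(1, n + 1):
            if promising(col):
                x.append(col)
                if place():
                    return True
                x.pop()                      # backtrack: undo the last choice
        return False

    return x if place() else None
```

For n = 4 the search returns [2, 4, 1, 3]; for n = 2 and n = 3 it returns None, since no solution exists.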

70. Hamiltonian Circuit (Cycle)
Let G=<V,E> be a connected graph with n vertices. A Hamiltonian cycle is a round-trip
path along n edges of G that visits every vertex exactly once and returns to its starting
position. If a Hamiltonian cycle begins at some vertex $v_1 \in G$ and the vertices of G are
visited in the order $v_1, v_2, \ldots, v_{n+1}$, then the edges $(v_i, v_{i+1})$ are in E for
$1 \le i \le n$, and the $v_i$ are distinct except for $v_1$ and $v_{n+1}$, which are equal.
Implicit constraints
- all vertices should be included in the cycle
- all vertices should be distinct except $x_1$ and $x_{n+1}$
- only distinct cycles are output.
Explicit constraint
- $x_i$ = the number of the i-th vertex visited
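The constraints above translate into a short backtracking search. The sketch below is illustrative only: the function name and the adjacency-dictionary representation are assumptions, and it stops at the first cycle found rather than listing all distinct cycles.

```python
def hamiltonian_cycle(adj, start=0):
    """adj maps each vertex to the set of its neighbours (undirected graph).
    Returns a list v1, ..., vn, v1 describing a Hamiltonian cycle, or None."""
    n = len(adj)
    path = [start]
    visited = {start}

    def extend():
        if len(path) == n:
            return start in adj[path[-1]]    # closing edge back to the start
        for v in sorted(adj[path[-1]]):      # only edges of G are followed
            if v not in visited:             # vertices must stay distinct
                visited.add(v)
                path.append(v)
                if extend():
                    return True
                visited.remove(v)            # backtrack
                path.pop()
        return False

    return path + [start] if extend() else None
```

On the 4-vertex cycle graph {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}} the sketch returns [0, 1, 2, 3, 0].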

71. Sum of Subsets
Given positive numbers $w_i$, $1 \le i \le n$, and m, the problem is to find all subsets of
the $w_i$ whose sums equal m. The problem can be formulated using either fixed- or
variable-sized tuples.

72. Sum of subsets Formulations
Variable Size
- State space tree is constructed using Breadth First Search (queue method)
- Tree representation is not a binary tree
- The $x_i$ values are weights or indices of weights
- Solution vector is a k-tuple
Fixed Size
- State space tree is constructed using D-Search (depth search, stack method)
- Tree representation is a binary tree
- The $x_i$ values are either 1 or 0
- Solution vector is an n-tuple
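The fixed-size formulation can be sketched as a recursive backtracking search over 0/1 n-tuples. This is an illustrative sketch under stated assumptions: the function name is hypothetical, the weights are assumed to be sorted in nondecreasing order (which the pruning test relies on), and the tree is explored depth-first with the "include" branch first.

```python
def sum_of_subsets(w, m):
    """Return all 0/1 n-tuples x with sum(w[i] * x[i]) == m.
    Assumes positive weights sorted in nondecreasing order."""
    n = len(w)
    solutions = []
    x = [0] * n

    def search(i, s, r):
        # s = sum of the weights chosen so far; r = sum of the undecided weights w[i:]
        if s == m:
            solutions.append(tuple(x))       # remaining components stay 0
            return
        if i == n or s + r < m or s + w[i] > m:
            return                           # non-promising node: prune
        x[i] = 1                             # left branch: include w[i]
        search(i + 1, s + w[i], r - w[i])
        x[i] = 0                             # right branch: exclude w[i]
        search(i + 1, s, r - w[i])

    search(0, 0, sum(w))
    return solutions
```

For w = (5, 10, 12, 13, 15, 18) and m = 30 the sketch finds three solutions: (1,1,0,0,1,0), (1,0,1,1,0,0) and (0,0,1,0,0,1).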

BRANCH AND BOUND

73. Branch and Bound
Branch and bound is an algorithm design technique that enhances the idea of generating a
state-space tree with the idea of estimating the best value obtainable from a current node of the
decision tree: if such an estimate is not superior to the best solution seen up to that point in the
processing, the node is eliminated from further consideration.

74. Principal idea behind Branch and Bound Technique
- the problem is represented as a state-space tree
- a node's bound value is compared with the value of the best solution seen so far:
o if the bound value is not better than the best solution seen so far (i.e. not smaller
for a minimization problem and not larger for a maximization problem), the node
is non-promising and can be terminated.
o No solution obtained from it can yield a better solution than the one already
available.

75. A search path at the current node in a state space tree is terminated for any one of the
following 3 reasons:

1. The value of the node's bound is not better than the value of the best solution seen so far.
2. The node represents no feasible solutions because the constraints of the problem are already
violated.
3. The subset of feasible solutions represented by the node consists of a single point (no further
choices can be made).
o In this case, the value of the objective function for this feasible solution is compared
with that of the best solution seen so far, and the latter is updated with the former if the
new solution is better.

76. Feasible Solution
A feasible solution is a point in the problem's search space that satisfies all the problem's
constraints. Examples: a Hamiltonian circuit in the traveling salesperson problem; a subset of
items whose total weight does not exceed the capacity of the knapsack.

77. Optimal Solution
An optimal solution is a feasible solution with the best value of the objective function.
Examples: the shortest Hamiltonian circuit in the traveling salesperson problem; the most
valuable subset of items that fits the knapsack.

78. Best-First Branch and Bound Strategy
- In best-first branch and bound, instead of generating a single child of the last promising
node, all the children of the most promising among the non-terminated leaves in the current
tree are generated. (Non-terminated, i.e. still promising, leaves are called live.)
- To find which of the live nodes is most promising, compare their bounds (lower bounds for a
minimization problem, upper bounds for a maximization problem) and take the node with the
best bound.
- This strategy is called best-first branch and bound.

79. Knapsack Problem
- Order the items of a given instance in descending order by their value-to-weight ratios
$v_i / w_i$:
$v_1/w_1 \ge v_2/w_2 \ge \ldots \ge v_n/w_n$
- Compute the upper bound $ub = v + (W - w)(v_{i+1}/w_{i+1})$, where W is the knapsack
capacity, w and v are the total weight and total value of the items already selected, and
item i+1 is the next item to be considered.
- At the root of the state-space tree no items have been selected, so the total weight w of
the items already selected and their total value v are both equal to 0.
o The value of ub is computed by the formula.
- For the nodes of the next level, w, v and ub are computed.
o The ub of the level's left node and the ub of its right node are compared.
o In this maximization problem, the node with the higher ub value is selected for branching
next.
o Also, w at each level is checked to find whether it exceeds the knapsack capacity. If a
node's w exceeds the capacity, the path will not give a feasible solution.
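The procedure can be sketched as a best-first search driven by a priority queue of live nodes. This is a minimal sketch, not a definitive implementation: the function name and node layout are assumptions, the items are assumed to be already sorted by value-to-weight ratio (highest first), and the left child is taken to mean "include the next item".

```python
import heapq


def knapsack_branch_and_bound(weights, values, capacity):
    """Returns (best value, tuple of 0-based indices of the chosen items)."""
    n = len(weights)

    def upper_bound(i, w, v):
        # ub = v + (W - w) * (v_{i+1} / w_{i+1}); here index i is the next undecided item
        if i < n:
            return v + (capacity - w) * values[i] / weights[i]
        return v

    best_value, best_items = 0, ()
    # heapq is a min-heap, so bounds are negated to pop the largest ub first
    live = [(-upper_bound(0, 0, 0), 0, 0, 0, ())]
    while live:
        neg_ub, i, w, v, items = heapq.heappop(live)
        if -neg_ub <= best_value or i == n:
            continue                          # bound cannot beat the best solution so far
        if w + weights[i] <= capacity:        # left child: include item i if feasible
            w2, v2 = w + weights[i], v + values[i]
            if v2 > best_value:
                best_value, best_items = v2, items + (i,)
            heapq.heappush(live, (-upper_bound(i + 1, w2, v2), i + 1, w2, v2, items + (i,)))
        # right child: exclude item i
        heapq.heappush(live, (-upper_bound(i + 1, w, v), i + 1, w, v, items))
    return best_value, best_items
```

For weights 4, 7, 5, 3, values 40, 42, 25, 12 and capacity 10, the root has ub = 100 and the search ends with value 65 for the first and third items.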

80. Assignment Problem
- is the problem of assigning n people to n jobs so that the total cost of the assignment is as
small as possible.
- The assignment problem is specified by an n-by-n cost matrix C, where C(i,j) is the cost of
assigning person i to job j.
- The problem is stated as follows:
o Select one element in each row of the matrix so that no two selected elements are in
the same column and their sum is the smallest possible.
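One way to attack this statement with best-first branch and bound is sketched below. The lower bound used here (cost of the rows already assigned plus the cheapest still-free column in each remaining row) is an assumption chosen for illustration, as are the function name and node layout.

```python
import heapq


def assignment_branch_and_bound(cost):
    """Returns (minimum total cost, tuple of columns chosen for rows 0..n-1)."""
    n = len(cost)

    def lower_bound(cols):
        # exact cost of the assigned rows + cheapest free column in every remaining row
        assigned = sum(cost[r][c] for r, c in enumerate(cols))
        free = [c for c in range(n) if c not in cols]
        return assigned + sum(min(cost[r][c] for c in free) for r in range(len(cols), n))

    live = [(lower_bound(()), ())]            # min-heap of (bound, partial assignment)
    while live:
        lb, cols = heapq.heappop(live)
        if len(cols) == n:
            # a complete node with the smallest bound among live nodes is optimal
            return lb, cols
        for c in range(n):
            if c not in cols:                 # no two selected elements share a column
                child = cols + (c,)
                heapq.heappush(live, (lower_bound(child), child))
    return None
```

For the cost matrix [[9, 2, 7, 8], [6, 4, 3, 7], [5, 8, 1, 8], [7, 6, 9, 4]] the sketch returns cost 13 with columns (1, 0, 2, 3), i.e. 2 + 6 + 1 + 4.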

81. Traveling Salesperson Problem
The problem is to find a least-cost tour of the N cities in a sales region. The tour is to visit
each city exactly once and return to the starting city. To help find an optimum tour, the
salesperson has a cost matrix C, where element C(i,j) equals the cost (usually in terms of time,
money or distance) of direct travel between city i and city j.

82. Common issues in Backtracking and Branch and Bound
Backtracking and Branch & Bound
- are used to solve some large instances of difficult combinatorial problems
- can be considered as an improvement over exhaustive search
- are based on the construction of a state-space tree, whose nodes reflect specific choices
made for a solution's components.
- terminate a node as soon as it can be guaranteed that no solution to the problem can be
obtained by considering choices that correspond to the node's descendants.

83. Difference between Backtracking and Branch and Bound
Backtracking
- Not restricted to optimization problems; more often than not it is applied to
non-optimization problems.
- State-space tree is developed depth-first.
Branch & Bound
- Applicable only to optimization problems because it is based on computing a bound on
possible values of the problem's objective function.
- Can generate nodes according to several rules; the most natural rule is Best-first rule.
- Has both the challenge and opportunity of choosing an order of node generation and
finding a good bounding function.
- Best first rule may or may not lead to a solution faster than other strategies.

84. Difference between Backtracking and Backward approach
Backtracking
1. starts from root of the tree
2. When a node is a non-promising node, moves back to the previous node or level
3. From non-promising node, first moves by BFS and then by DFS in backwards

Backward approach
1. starts from leaf and moves towards root
2. moves either by BFS or DFS in backwards, to reach the root

UNIT V
NP-HARD AND NP-COMPLETE PROBLEMS

85. Heuristic approach
A heuristic is a common-sense rule drawn from experience rather than from a
mathematically proven assertion.

86. Approximation algorithms

Approximation algorithms are often used to find approximate solutions to difficult
problems of combinatorial optimization. Approximation algorithms run the gamut in level of
sophistication; many of them use greedy algorithms based on some problem-specific heuristic.

87. Accuracy of approximate solution
The accuracy of an approximate solution $S_a$ to a problem minimizing some function f
can be quantified by the size of the relative error of this approximation

$$re(S_a) = \frac{f(S_a) - f(S^*)}{f(S^*)}$$

where $S^*$ is an exact solution to the problem.

Since $re(S_a) = \frac{f(S_a)}{f(S^*)} - 1$, the accuracy ratio
$r(S_a) = \frac{f(S_a)}{f(S^*)}$ is used as a measure of accuracy of $S_a$. The accuracy ratio
of an approximate solution to a maximization problem is computed as
$r(S_a) = \frac{f(S^*)}{f(S_a)}$. The closer $r(S_a)$ is to 1, the better the approximate
solution is.
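As a small numeric illustration of these definitions, the two measures can be computed as follows. The function names and the minimize flag are hypothetical; f_approx and f_exact stand for f(S_a) and f(S*).

```python
def relative_error(f_approx, f_exact):
    """re(S_a) = (f(S_a) - f(S*)) / f(S*) for a minimization problem."""
    return (f_approx - f_exact) / f_exact


def accuracy_ratio(f_approx, f_exact, minimize=True):
    """r(S_a); the fraction is flipped for maximization so that r(S_a) >= 1 in both cases."""
    return f_approx / f_exact if minimize else f_exact / f_approx
```

For example, a tour of length 10 for an instance whose optimal tour has length 8 has relative error 0.25 and accuracy ratio 1.25.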

88. Performance Ratio
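The performance ratio is the principal metric for measuring the accuracy of approximation
algorithms: it is the least upper bound of the accuracy ratios $r(S_a)$ taken over all instances
of the problem, if such a bound exists.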

89. Nearest Neighbor algorithm
Nearest Neighbor is a simple greedy algorithm for approximating a solution to the
traveling salesperson problem. The performance ratio of this algorithm is unbounded above,
even for the important subset of Euclidean graphs.
Step 1: choose an arbitrary city as the start
Step 2: repeat the following operation until all the cities have been visited: go to the
unvisited city nearest to the one visited last. (Ties can be broken arbitrarily.)
Step 3: return to the starting city.
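The three steps can be sketched as follows. The function name and the distance-matrix input are assumptions made for illustration, and ties are broken in favour of the lowest city index.

```python
def nearest_neighbor_tour(dist, start=0):
    """dist is a symmetric matrix of intercity distances; returns (tour, length)."""
    n = len(dist)
    tour, visited = [start], {start}          # Step 1: start city
    while len(tour) < n:                      # Step 2: greedy extension
        last = tour[-1]
        nxt = min((c for c in range(n) if c not in visited),
                  key=lambda c: dist[last][c])
        tour.append(nxt)
        visited.add(nxt)
    tour.append(start)                        # Step 3: return to the start
    length = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
    return tour, length
```

With the distance matrix [[0, 1, 3, 6], [1, 0, 2, 3], [3, 2, 0, 1], [6, 3, 1, 0]] and start city 0, the sketch returns the tour 0, 1, 2, 3, 0 of length 10, whereas the tour 0, 1, 3, 2, 0 has length 8.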

90. Twice-around-the-tree algorithm
Twice-around-the-tree is an approximation algorithm for the Traveling salesperson
problem with the performance ratio of 2 for Euclidean graphs. The algorithm is based on
modifying a walk around a minimum spanning tree by shortcuts.
Step 1: Construct a Minimum Spanning Tree of the graph corresponding to a given
instance of the Traveling salesperson problem.
Step 2: Starting at an arbitrary vertex, perform a walk around the minimum spanning tree,
recording the vertices passed by.
Step 3: Scan the list of vertices obtained in Step 2 and eliminate from it all repeated
occurrences of the same vertex except the starting one at the end of the list. The vertices
remaining on the list will form a Hamiltonian Circuit, which is the output of the
algorithm.
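A compact sketch of these steps is shown below. It makes two assumptions for brevity: the minimum spanning tree is built with a simple version of Prim's algorithm, and Steps 2 and 3 are merged by taking a preorder traversal of the tree, which lists the vertices in the order the walk first reaches them. The function name is hypothetical.

```python
def twice_around_the_tree(dist, start=0):
    """dist is a symmetric distance matrix; returns (tour, length)."""
    n = len(dist)
    # Step 1: Prim's algorithm, storing the tree as lists of children.
    children = {v: [] for v in range(n)}
    in_tree = {start}
    while len(in_tree) < n:
        u, v = min(((u, v) for u in sorted(in_tree)
                    for v in range(n) if v not in in_tree),
                   key=lambda e: dist[e[0]][e[1]])
        children[u].append(v)
        in_tree.add(v)
    # Steps 2-3: walk around the tree, keeping only the first visit to each vertex.
    tour, stack = [], [start]
    while stack:
        v = stack.pop()
        tour.append(v)
        stack.extend(reversed(children[v]))
    tour.append(start)                        # close the Hamiltonian circuit
    return tour, sum(dist[a][b] for a, b in zip(tour, tour[1:]))
```

For four cities at the corners of a 3-by-4 rectangle, dist = [[0, 3, 5, 4], [3, 0, 4, 5], [5, 4, 0, 3], [4, 5, 3, 0]], the sketch returns the tour 0, 1, 3, 2, 0 of length 16; the optimal tour around the rectangle has length 14, so the result is within the factor of 2.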

91. Approximation Schemes
Polynomial-time approximation schemes exist for the discrete version of the knapsack problem.
They are parametric families of algorithms that make it possible to get approximations
$S_a^{(k)}$ with any predefined accuracy level:

$$\frac{f(S^*)}{f(S_a^{(k)})} \le 1 + \frac{1}{k}$$

for any instance of size n, where k is an integer parameter in the range $0 \le k < n$.

92. Complexity Theory
Complexity theory seeks to classify problems according to their computational
complexity. The principal split is between tractable and intractable problems: problems that
can and cannot be solved in polynomial time, respectively.

Complexity theory concentrates on decision problems, i.e. problems with yes/no answers.

93. Halting Problem
The halting problem is an example of an undecidable decision problem, i.e. one that cannot
be solved by any algorithm. The halting problem is to determine, for an arbitrary deterministic
algorithm A and an input I, whether algorithm A with input I ever terminates (or enters an
infinite loop). Since the problem is undecidable, no algorithm exists that solves it.

94. Polynomial time
An algorithm solves a problem in polynomial time if its worst-case time
efficiency belongs to O(p(n)), where p(n) is a polynomial of the problem's input size n.
Equivalently, an algorithm is said to run in polynomial time if the number of steps required to
complete it for a given input is $O(n^k)$ for some nonnegative integer k, where n is the
complexity of the input. Polynomial-time algorithms are said to be "fast."
Problems that can be solved in polynomial time are called tractable; problems that
cannot be solved in polynomial time are called intractable.

95. Class P
P is the class of all decision problems (problems which have yes/no answers) that can be
solved in polynomial time by deterministic algorithms. This class of problems is called
polynomial.

96. Non deterministic Algorithm
A non-deterministic algorithm is a two-stage procedure that takes as its input an
instance I of a decision problem and does the following:
Non-deterministic stage (guessing): An arbitrary string S is generated that can be thought of
as a candidate solution to the given instance I.
Deterministic stage (verification): A deterministic algorithm takes both I and S as its input
and outputs yes if S represents a solution to instance I.
A non-deterministic algorithm is said to solve a decision problem iff, for every yes
instance of the problem, it returns yes on some execution. A non-deterministic algorithm is
said to be non-deterministic polynomial if the time efficiency of its verification stage is
polynomial.
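As an illustration of the deterministic (verification) stage, the sketch below checks a guessed string S for the Hamiltonian circuit problem. The function name and the adjacency-dictionary representation of the instance are assumptions; the check runs in time polynomial in the size of the graph.

```python
def verify_hamiltonian_circuit(adj, candidate):
    """adj maps each vertex to the set of its neighbours; candidate is the guessed
    vertex sequence S. Returns True iff S describes a Hamiltonian circuit."""
    n = len(adj)
    if len(candidate) != n + 1 or candidate[0] != candidate[-1]:
        return False                          # must be a closed walk of n edges
    if set(candidate[:-1]) != set(adj):
        return False                          # some vertex is missing or repeated
    # every consecutive pair must be joined by an edge of the graph
    return all(b in adj[a] for a, b in zip(candidate, candidate[1:]))
```

For the 4-vertex cycle graph {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}, the guess [0, 1, 2, 3, 0] is accepted, while [0, 2, 1, 3, 0] is rejected because 0 and 2 are not adjacent.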

97. Class NP
NP is the class of all decision problems whose guessed candidate solutions can be
verified in polynomial time. Equivalently, class NP is the class of decision problems that can
be solved by non-deterministic polynomial algorithms. This class of problems is called
non-deterministic polynomial.

98. NP- Complete

A decision problem D is said to be NP-complete if
1. it belongs to class NP
2. every problem in NP is polynomially reducible to D.
The decision versions of many difficult combinatorial problems are NP-complete. Examples:
traveling salesperson problem, knapsack problem.
It is not known whether P = NP or P is just a proper subset of NP. The discovery of a
polynomial-time algorithm for any of the thousands of known NP-complete problems would imply
that P = NP.

99. Polynomially-Reducible
A decision problem D1 is said to be polynomially reducible to a decision problem D2 if
there exists a function t that transforms instances of D1 to instances of D2 such that
1. t maps all yes instances of D1 to yes instances of D2 and all no instances of D1
to no instances of D2.
2. t is computable by a polynomial-time algorithm

This definition implies that if a problem D1 is polynomially reducible to some problem
D2 that can be solved in polynomial time, then problem D1 can also be solved in polynomial
time.
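A well-known example of such a transformation t maps the Hamiltonian circuit problem (D1) to the decision version of the traveling salesperson problem (D2): edges of the graph get distance 1, all other vertex pairs get distance 2, and the question becomes whether a tour of length at most n exists. The sketch below is illustrative only; both function names are hypothetical, and the brute-force tsp_decision is included merely to check the mapping on tiny instances, not as an efficient solver.

```python
from itertools import permutations


def hamiltonian_to_tsp(n, edges):
    """Transformation t: (n vertices, edge list) -> (distance matrix, bound)."""
    edge_set = {frozenset(e) for e in edges}
    dist = [[0 if i == j else (1 if frozenset((i, j)) in edge_set else 2)
             for j in range(n)] for i in range(n)]
    return dist, n                            # computable in O(n^2) time


def tsp_decision(dist, bound):
    """Exhaustive check: is there a tour of total length <= bound?"""
    n = len(dist)
    for perm in permutations(range(1, n)):
        tour = (0,) + perm + (0,)
        if sum(dist[a][b] for a, b in zip(tour, tour[1:])) <= bound:
            return True
    return False
```

A tour of n edges has length n exactly when every edge it uses has distance 1, i.e. belongs to the original graph, so yes instances map to yes instances and no instances to no instances.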

100. Exponential time

In complexity theory, exponential time is the computation time of a problem where the
time to complete the computation, m(n), is bounded by an exponential function of the problem
size, n (i.e. as the size of the problem increases linearly, the time to solve the problem
increases exponentially). Written mathematically, there exists k > 1 such that
$m(n) = \Theta(k^n)$ and there exists c such that $m(n) = O(c^n)$.

101. Reason for intractability

1. Arbitrary instances of intractable problems cannot be solved in a reasonable amount of
time unless such instances are very small.
2. Although there might be a huge difference between the running times in O(p(n)) for
polynomials of drastically different degrees, there are very few useful polynomial-time
algorithms with the degree of a polynomial higher than 3.
3. Polynomial functions possess many convenient properties: both the sum and the composition
of two polynomials are always polynomials too.
4. Polynomial-time algorithms have led to the development of an extensive theory called
computational complexity, which seeks to classify problems according to their inherent
difficulty.

102. Undecidable Problems
Not every decision problem can be solved in polynomial time. Some decision problems
cannot be solved at all by any algorithm. Such problems are called undecidable.

103. Decision Problem

Any problem for which the answer is either zero or one (no or yes) is called a decision problem.

104. Optimization problem

Any problem that involves the identification of an optimal (either minimum or
maximum) value of a given cost function is known as an optimization problem.

105. Cook's Theorem

Cook's theorem states that the Boolean satisfiability problem is NP-complete. That is,
any problem in NP can be reduced in polynomial time by a deterministic Turing machine to the
problem of determining whether a Boolean formula is satisfiable.

106. Satisfiability Problem

The satisfiability problem is to determine whether a formula is true for some assignment
of truth values to the variables. The CNF-satisfiability problem is the satisfiability problem
for CNF (conjunctive normal form) formulas.
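To make the CNF-satisfiability problem concrete, the sketch below tries every assignment of truth values. It is an exhaustive check that takes exponential time in the number of variables, not an efficient algorithm; the function name and the signed-integer encoding of literals (k for variable k, -k for its negation) are assumptions made for illustration.

```python
from itertools import product


def cnf_satisfiable(clauses, num_vars):
    """clauses is a list of clauses, each a list of nonzero integers.
    Returns a satisfying assignment as a tuple of booleans, or None."""
    for bits in product([False, True], repeat=num_vars):
        # a CNF formula is true iff every clause contains at least one true literal
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return bits
    return None
```

For the formula (x1 or not x2) and (not x1 or x2) and (x1 or x2), encoded as [[1, -2], [-1, 2], [1, 2]], the sketch returns (True, True); for [[1], [-1]] it returns None.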

Figure: Relationship between P, NP, NP-Complete and NP-Hard problems (diagram of the
complexity classes, provided that P ≠ NP; if P = NP, then all three classes are equal).