
Unit-II

Analysis of Algorithms
and Greedy Algorithms
UNIT II
Introduction
Analyzing Control Structures
Solving Recurrences
Making Change
Characteristics of Greedy Algorithms
Graphs: Minimum Spanning Trees
Kruskal's Algorithm
Prim's Algorithm
Graphs: Shortest Paths
The Knapsack Problem (1)
Scheduling
Introduction

The analysis of algorithms usually
proceeds from the inside out. First we
determine the time required by
individual instructions, then we combine
these times according to the control
structures that combine the instructions
in the program.
Analyzing Control Structures

sequencing
for loops
recursive calls
while and repeat loops
Sequencing
Let p1 and p2 be two fragments of an
algorithm. They may be single instructions or
complicated sub-algorithms. Let t1 and t2 be the
times taken by p1 and p2 respectively. These
times may depend on various parameters, such
as the instance size.
The sequencing rule says that the time
required to compute p1; p2, that is, first p1 and
then p2, is simply t1 + t2.
By the maximum rule, this time is in the exact order
of max(t1, t2), that is, in Θ(max(t1, t2)).
However, a parameter that controls t2 may
depend on the result computed by p1.
Thus, the analysis of p1; p2 cannot always be
performed by considering p1 and p2
independently.
For loops

for i ← 1 to m do P(i)
Suppose this loop is part of a larger algorithm,
working on an instance of size n.
The easiest case is when the time taken by P(i)
does not actually depend on i, although it
could depend on the instance size or, more
generally, on the instance itself.
Let t denote the time required to compute P(i).
P(i) is performed m times, each time at a cost
of t, and thus the total time required by the loop
is simply l = mt. This estimate ignores the time
needed for loop control: the for loop is shorthand
for something like the following while loop:
i ← 1
while i <= m do
    P(i)
    i ← i + 1
Let c be an upper bound on the time required by each
elementary loop-control operation. The time l taken by
the loop is thus bounded above by
l <= c            for i ← 1
   + (m+1)c       for the tests i <= m
   + mt           for the executions of P(i)
   + mc           for the executions of i ← i + 1
   + mc           for the sequencing operations
  <= (t+3c)m + 2c

This time is clearly bounded below by mt. If c is negligible compared to
t, the estimate that l is roughly equal to mt is justified.
Example

for i ← 5 to m step 3 do P(i)

Here P(i) is executed ⌊(m-5)/3⌋ + 1 times,
where ⌊ ⌋ rounds down, provided m >= 3.
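
As a quick check of such iteration counts, the following sketch (Python; the function names are illustrative and not part of the original notes) counts the passes of a loop from 5 to m in steps of 3 directly and compares the result with the closed form ⌊(m-5)/3⌋ + 1.

```python
def count_iterations(m, start=5, step=3):
    """Count how often the body of 'for i <- start to m step step' runs."""
    count = 0
    i = start
    while i <= m:
        count += 1
        i += step
    return count


def closed_form(m, start=5, step=3):
    """Closed-form iteration count; // is floor division (rounds down)."""
    return (m - start) // step + 1


for m in (5, 7, 8, 20):
    print(m, count_iterations(m), closed_form(m))
```

Both functions agree for every m >= 3 under floor division; for m < 5 both give zero because the loop body never runs.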
Recursive calls
Consider
function Fibrec(n)
    if n < 2 then return n
    else return Fibrec(n-1) + Fibrec(n-2)

Let T(n) be the time taken by a call on Fibrec(n).
If n < 2, the algorithm simply returns n, which
takes some constant time a.
Otherwise, most of the work is spent in the two
recursive calls, which take time T(n-1) and
T(n-2) respectively.
Let h(n) stand for the work involved in the
addition and control, that is, the time required by
a call on Fibrec(n) ignoring the time spent inside
the two recursive calls.
By the definition of T(n) and h(n) we obtain
the following recurrence:

T(n) = a                          if n = 0 or n = 1
T(n) = T(n-1) + T(n-2) + h(n)     otherwise
If we count the additions at unit cost, h(n) is
bounded by a constant and the recurrence
equation for T(n) is very similar to that
already encountered for g(n).
It follows that T(n) ∈ O(f_n) and T(n) ∈ Ω(f_n),
where f_n is the n-th Fibonacci number. Hence
T(n) ∈ Θ(f_n).
If we do not count the additions at unit cost,
h(n) is no longer bounded by a constant.
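
The growth of T(n) can be observed by counting calls. The sketch below is a Python transcription of Fibrec with an added call counter (the counter and the helper `calls_needed` are illustrative assumptions, not part of the original algorithm). It suggests that the number of calls stays within a constant factor of the value returned.

```python
def fibrec(n, counter):
    """Recursive Fibonacci as in Fibrec; counter[0] records the number of calls."""
    counter[0] += 1
    if n < 2:
        return n
    return fibrec(n - 1, counter) + fibrec(n - 2, counter)


def calls_needed(n):
    """Return (Fibonacci value, number of calls made to compute it)."""
    counter = [0]
    value = fibrec(n, counter)
    return value, counter[0]


for n in (5, 10, 15, 20):
    value, calls = calls_needed(n)
    # The ratio calls / value settles near a constant as n grows.
    print(n, value, calls, round(calls / value, 3))
```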
while and repeat loops
While and repeat loops are usually harder to
analyze than for loops because there is no
obvious a priori way to know how many times
we shall have to go round the loop.
The standard technique for analyzing these
loops is to find a function of the variables
involved whose value decreases each time
around.
To conclude that the loop will eventually terminate,
it suffices to show that this value must be a
positive integer.
An alternative approach to the analysis of
while loops consists of treating them like
recursive algorithms.
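
As an illustration of the decreasing-function technique, consider binary search, which is an example chosen here and not taken from these notes. In the Python sketch below, the quantity hi - lo + 1 (the number of elements still under consideration) is a positive integer that strictly decreases on every trip round the loop, so the loop must terminate. The `assert` inside the loop checks this claim at run time.

```python
def binary_search(sorted_items, target):
    """Return (index of the first element >= target, number of loop trips).

    Assumes sorted_items is non-empty and sorted in ascending order.
    """
    lo, hi = 0, len(sorted_items) - 1
    trips = 0
    while lo < hi:
        size_before = hi - lo + 1      # the decreasing function
        mid = (lo + hi) // 2
        if target <= sorted_items[mid]:
            hi = mid
        else:
            lo = mid + 1
        # The range shrinks every time but never becomes empty.
        assert 1 <= hi - lo + 1 < size_before
        trips += 1
    return lo, trips


items = list(range(0, 32, 2))          # 16 sorted values: 0, 2, ..., 30
print(binary_search(items, 10))
```

On 16 elements the range halves on each trip, so the loop is executed 4 times.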
Recurrences

The last step when analyzing an
algorithm is often to solve a
recurrence equation.
Approaches / types of recurrences


Intelligent Guesswork
Homogeneous Recurrences
Inhomogeneous Recurrences
Change of Variables
Range Transformation
Intelligent Guesswork
This approach generally proceeds in four
stages:
1. calculate the first few values of the
recurrence
2. look for regularity
3. guess a suitable general form
4. and finally prove by mathematical
induction that this form is correct
Example
Consider the recurrence

T(n) = 0                if n = 0
T(n) = 3T(n ÷ 2) + n    otherwise

where n ÷ 2 denotes integer division (n/2 rounded down).
The first step is to replace n ÷ 2 by n/2 with a suitable
restriction on the set of values of n that we consider initially.
It is tempting to restrict n to being even since in that
case n ÷ 2 = n/2, but recursively dividing an even number by
2 may produce an odd number larger than 1.
So it is better to restrict n to being an exact power of 2.
Recurrence on the first few powers of 2

n      1    2    4    8    16    32
T(n)   1    5   19   65   211   665

Each term in this table but the first is computed from the
previous term. For instance,
T(16) = 3*T(8) + 16 = 3*65 + 16 = 211
Instead of writing T(2) = 5, it is more useful to write
$T(2) = 3 \cdot 1 + 2$
$T(4) = 3 \cdot T(2) + 4 = 3 \cdot (3 \cdot 1 + 2) + 4 = 3^2 \cdot 1 + 3 \cdot 2 + 4$


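Writing n as an explicit power of 2:

n        T(n)
$1$      $1$
$2$      $3 \cdot 1 + 2$
$2^2$    $3^2 \cdot 1 + 3 \cdot 2 + 2^2$
$2^3$    $3^3 \cdot 1 + 3^2 \cdot 2 + 3 \cdot 2^2 + 2^3$
$2^4$    $3^4 \cdot 1 + 3^3 \cdot 2 + 3^2 \cdot 2^2 + 3 \cdot 2^3 + 2^4$
$2^5$    $3^5 \cdot 1 + 3^4 \cdot 2 + 3^3 \cdot 2^2 + 3^2 \cdot 2^3 + 3 \cdot 2^4 + 2^5$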
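The pattern is now obvious:

$$
\begin{aligned}
T(2^k) &= 3^k 2^0 + 3^{k-1} 2^1 + 3^{k-2} 2^2 + \cdots + 3^1 2^{k-1} + 3^0 2^k \\
       &= \sum_{i=0}^{k} 3^{k-i} 2^i \\
       &= 3^k \sum_{i=0}^{k} (2/3)^i \\
       &= 3^k \cdot \frac{1 - (2/3)^{k+1}}{1 - 2/3} \\
       &= 3^{k+1} - 2^{k+1}
\end{aligned}
$$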
Values of T(n) + i*n for i from -2 to 2
n          1    2    4    8    16    32
T(n)-2n   -1    1   11   49   179   601
T(n)-n     0    3   15   57   195   633
T(n)       1    5   19   65   211   665
T(n)+n     2    7   23   73   227   697
T(n)+2n    3    9   27   81   243   729

From the table it is clear that T(n)+2n is an exact power of 3.


Let $n = 2^k$, so that $T(n) = T(2^k)$ and $k = \lg n$ (the logarithm to base 2).
Therefore $T(n) = T(2^{\lg n}) = 3^{1+\lg n} - 2^{1+\lg n}$.
Using the fact that $3^{\lg n} = n^{\lg 3}$,
it follows that
$T(n) = 3 n^{\lg 3} - 2n$ when n is a power of 2.
Hence T(n) is in the exact order of $n^{\lg 3}$ if n is a power of 2.
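
The guessed form can be checked numerically before attempting the induction proof. The Python sketch below (function names are illustrative) evaluates the recurrence directly and compares it with $3^{k+1} - 2^{k+1}$ for $n = 2^k$. It also prints T(n) + 2n, which should be an exact power of 3.

```python
def T(n):
    """Evaluate T(n) = 3 T(n // 2) + n with T(0) = 0 directly."""
    return 0 if n == 0 else 3 * T(n // 2) + n


def closed_form(k):
    """Conjectured value of T(2**k)."""
    return 3 ** (k + 1) - 2 ** (k + 1)


for k in range(6):
    n = 2 ** k
    print(n, T(n), closed_form(k), T(n) + 2 * n)
```

Agreement on a finite range of k is evidence, not a proof. The induction step is still required.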
Introduction
Greedy algorithms are simple and straightforward.
They are shortsighted in their approach in the
sense that they take decisions on the basis of
information at hand without worrying about the
effect these decisions may have in the future. They
are easy to invent, easy to implement and, most of
the time, quite efficient. However, many problems cannot be
solved correctly by a greedy approach. Greedy
algorithms are typically used to solve optimization
problems.
Characteristics and Features of Problems
solved by Greedy Algorithms
The aim is to construct the solution in an optimal way.
The algorithm maintains two sets: one contains the chosen
items and the other contains the rejected items.
A greedy algorithm consists of four (4) functions:
1. A solution function that checks whether the chosen set of items
provides a solution.
2. A feasibility function that checks whether a set is feasible.
3. A selection function that tells which of the candidates
is the most promising.
4. An objective function, which does not appear
explicitly, that gives the value of a solution.
Structure of a Greedy Algorithm

Initially the set of chosen items (the solution set) is empty.
At each step
    an item is considered for addition to the solution set,
    chosen by the selection function.
    IF the set would no longer be feasible THEN
        reject the item under consideration (it is
        never considered again).
    ELSE (the set is still feasible)
        add the current item.
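
The structure above can be sketched as a generic Python function. The parameter names `select`, `feasible` and `is_solution` are assumptions introduced here to mirror the selection, feasibility and solution functions. The objective function does not appear explicitly, as noted earlier.

```python
def greedy(candidates, select, feasible, is_solution):
    """Generic greedy skeleton: each candidate is examined once, then kept or rejected."""
    remaining = list(candidates)
    chosen = []                       # the solution set starts empty
    while remaining and not is_solution(chosen):
        item = select(remaining)      # most promising candidate
        remaining.remove(item)        # never considered again
        if feasible(chosen + [item]):
            chosen.append(item)       # still feasible: add the item
        # otherwise the item is rejected
    return chosen                     # caller should check is_solution(chosen)


# Hypothetical usage: pick weights, largest first, that fill a capacity of 10.
picked = greedy(
    [6, 5, 4, 3],
    select=max,
    feasible=lambda items: sum(items) <= 10,
    is_solution=lambda items: sum(items) == 10,
)
print(picked)
```

In this usage 6 is chosen, 5 is rejected because 6 + 5 exceeds the capacity, and 4 completes the solution.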
Greedy Approach
A greedy algorithm works by making the decision that
seems most promising at any moment; it never reconsiders
this decision, whatever situation may arise later.
As an example consider the problem of "Making Change".
Coins available are:
dollars (100 cents)
quarters (25 cents)
dimes (10 cents)
nickels (5 cents)
pennies (1 cent)
Problem: Make change for a given amount using the
smallest possible number of coins.

The problem can be solved informally or formally.

Informal Algorithm

Start with nothing.
At every stage, without passing the given amount,
add the largest available coin to the coins already chosen.
Formal Algorithm
Make change for n units using the least possible number of coins

MAKE-CHANGE (n)
    C ← {100, 25, 10, 5, 1}   // constant: the coin values
    S ← {}                    // multiset that will hold the solution
    sum ← 0                   // sum of the items in the solution set
    WHILE sum ≠ n
        x ← largest item in C such that sum + x ≤ n
        IF there is no such item THEN
            RETURN "No Solution"
        S ← S ∪ {a coin of value x}
        sum ← sum + x
    RETURN S
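
A minimal Python sketch of MAKE-CHANGE follows. Returning a list of coin values, and `None` in place of "No Solution", are choices made for this illustration.

```python
def make_change(n, coins=(100, 25, 10, 5, 1)):
    """Greedy change-making: repeatedly add the largest coin that still fits."""
    solution = []
    total = 0
    while total != n:
        fitting = [c for c in coins if total + c <= n]
        if not fitting:
            return None               # corresponds to "No Solution"
        x = max(fitting)
        solution.append(x)
        total += x
    return solution


print(make_change(289))
```

With the coin values listed above, 289 is paid with two dollars, three quarters, one dime and four pennies. With other coin systems the greedy choice may fail or may use more coins than necessary.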
Minimum Spanning Trees

A tree is defined to be an undirected, acyclic and
connected graph (or, more simply, a graph in which
there is exactly one path connecting each pair of vertices).
Assume there is an undirected, connected graph G. A
spanning tree is a subgraph of G that is a tree and
contains all the vertices of G. A minimum spanning
tree is a spanning tree of a graph whose edges have
weights or lengths, chosen so that the total weight of the
tree (the sum of the weights of its edges) is
minimum.
Kruskal's Algorithm
Kruskal's algorithm starts with a forest that
consists of n trees. Each tree consists of a single
node and nothing else. In every step of the algorithm, two
different trees of this forest are connected into a bigger tree.
Therefore we keep having fewer and bigger trees in our forest
until we end up with a single tree, which is the minimum spanning tree
(MST). In every step we choose the edge with the least cost,
which means that we are still following a greedy policy. If the
chosen edge connects nodes that belong to the same tree, the
edge is rejected and not examined again, because it would
produce a cycle, which would destroy our tree. Either this edge or
the next one in order of least cost will connect nodes of
different trees, and this one we insert, connecting two small trees
into a bigger one.
Algorithm
E(1) is the set of edges of the minimum spanning tree.
E(2) is the set of the remaining edges.
STEPS
E(1) = ∅, E(2) = E
While E(1) contains fewer than n-1 edges and E(2) ≠ ∅ do
    From the edges of E(2) choose one with minimum cost --> e(ij)
    E(2) = E(2) - {e(ij)}
    If V(i), V(j) do not belong to the same tree then
        E(1) = E(1) ∪ {e(ij)}
        unite the trees of V(i) and V(j) into one tree.
    end (If)
end (While)
End Of Algorithm.
Example of Kruskal's Algorithm


Stepwise Execution of Example
Using the above graph, here are the steps to the MST, using
Kruskal's Algorithm:
N1 to N2 - cost is 1 - add to tree
N7 to N8 - cost is 1 - add to tree
N2 to N3 - cost is 2 - add to tree
N1 to N6 - cost is 3 - add to tree
N2 to N6 - cost is 4 - reject because it forms a circuit (cycle)
N3 to N4 - cost is 4 - add to tree
N2 to N7 - cost is 5 - add to tree
N3 to N7 - cost is 6 - reject because it forms a circuit (cycle)
N4 to N8 - cost is 6 - reject because it forms a circuit (cycle)
N4 to N7 - cost is 7 - reject because it forms a circuit (cycle)
N4 to N5 - cost is 7 - add to tree
We stop here, because n-1 edges have been added. We are
left with the minimum spanning tree, with a total weight of 23.
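
The trace above can be reproduced with a short Python sketch. The edge list contains only the edges named in the trace, so any edge of the original figure that the trace never examines is missing. The union-find helper used to detect cycles is an implementation choice made here, not something prescribed by the notes.

```python
def kruskal(n, edges):
    """Kruskal's algorithm for nodes 1..n; edges are (cost, i, j) tuples."""
    parent = list(range(n + 1))       # each node starts as its own tree

    def find(x):
        """Return the representative of the tree containing x."""
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    tree = []
    # Stable sort by cost keeps equal-cost edges in their listed order.
    for cost, i, j in sorted(edges, key=lambda e: e[0]):
        if len(tree) == n - 1:
            break
        root_i, root_j = find(i), find(j)
        if root_i != root_j:          # different trees: no cycle is formed
            parent[root_i] = root_j
            tree.append((i, j, cost))
    return tree


# Edges (cost, Ni, Nj) exactly as listed in the stepwise execution.
EDGES = [
    (1, 1, 2), (1, 7, 8), (2, 2, 3), (3, 1, 6), (4, 2, 6), (4, 3, 4),
    (5, 2, 7), (6, 3, 7), (6, 4, 8), (7, 4, 7), (7, 4, 5),
]

mst = kruskal(8, EDGES)
print(mst, sum(cost for _, _, cost in mst))
```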
Prim's Algorithm
In Prim's algorithm, the minimum spanning
tree grows in a natural way, starting from an
arbitrary root. At each stage we add a new
branch to the tree already constructed; the
algorithm stops when all the nodes have been
reached.
Prim's Algorithm
function Prim(L[1..n, 1..n]): set of edges
    {Initialization: only node 1 is in B}
    T ← ∅ {will contain the edges of the minimum spanning tree}
    for i ← 2 to n do
        nearest[i] ← 1
        mindist[i] ← L[i, 1]
    {greedy loop}
    repeat n-1 times
        min ← ∞
        for j ← 2 to n do
            if 0 ≤ mindist[j] < min then
                min ← mindist[j]
                k ← j
        T ← T ∪ {{nearest[k], k}}
        mindist[k] ← -1 {add k to B}
        for j ← 2 to n do
            if L[j, k] < mindist[j] then
                mindist[j] ← L[j, k]
                nearest[j] ← k
    return T
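
The pseudocode translates almost line by line into Python. In the sketch below, nodes are numbered from 0 instead of 1, missing edges are represented by infinity, and the four-node graph is a hypothetical example. None of these choices come from the notes.

```python
import math

INF = math.inf


def prim(L):
    """Prim's algorithm on a symmetric distance matrix L; node 0 is the root."""
    n = len(L)
    nearest = [0] * n                       # nearest[j]: closest node of B to j
    mindist = [L[i][0] for i in range(n)]   # distance from j to that node
    mindist[0] = -1                         # node 0 is already in B
    tree = []
    for _ in range(n - 1):                  # greedy loop
        smallest, k = INF, None
        for j in range(1, n):
            if 0 <= mindist[j] < smallest:
                smallest, k = mindist[j], j
        if k is None:
            raise ValueError("graph is not connected")
        tree.append((nearest[k], k, L[nearest[k]][k]))
        mindist[k] = -1                     # add k to B
        for j in range(1, n):
            if L[j][k] < mindist[j]:
                mindist[j] = L[j][k]
                nearest[j] = k
    return tree


# Hypothetical graph: edges 0-1 (1), 0-2 (4), 1-2 (2), 1-3 (6), 2-3 (3).
L = [
    [INF, 1, 4, INF],
    [1, INF, 2, 6],
    [4, 2, INF, 3],
    [INF, 6, 3, INF],
]
print(prim(L))
```

The two nested loops over j make the running time proportional to n squared, whatever the number of edges.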
Example of Prim's Algorithm
Graphs: Shortest Paths
Dijkstra's algorithm is almost identical to
Prim's. The algorithm begins at a specific vertex
and extends outward within the graph, until all
vertices have been reached. The only distinction is
that Prim's algorithm stores a minimum cost edge
whereas Dijkstra's algorithm stores the total cost
from a source vertex to the current vertex. More
simply, Dijkstra's algorithm stores a summation of
minimum cost edges whereas Prim's algorithm
stores at most one minimum cost edge.
Theorem
Statement: Dijkstra's algorithm finds the shortest
paths from a single source to the other nodes.

Proof: We prove by mathematical induction that
(a) if a node i ≠ 1 is in S, then D[i] gives the
length of the shortest path from the source to i, and
(b) if a node i is not in S, then D[i] gives the
length of the shortest special path from the source
to i (a path is special if all its intermediate
nodes belong to S).
Algorithm
It should be noted that the distance between nodes can also be referred
to as weight.
1. Create a distance list, a previous vertex list, a visited list, and a
current vertex.
2. All the values in the distance list are set to infinity except the
starting vertex, which is set to zero.
3. All values in the visited list are set to false.
4. All values in the previous list are set to a special value signifying
that they are undefined, such as null.
5. The current vertex is set as the starting vertex.
6. Mark the current vertex as visited.
7. Update the distance and previous lists based on those vertices which
can be immediately reached from the current vertex.
8. Update the current vertex to the unvisited vertex that can be
reached by the shortest path from the starting vertex.
9. Repeat (from step 6) until all nodes are visited.
Dijkstra's Algorithm

Dijkstra's algorithm can be expressed formally as
follows:
G - arbitrary connected graph
L - length of each directed edge: L[i,j] >= 0
S - set of selected nodes
C - set of candidate nodes (those not yet selected)
Algorithm
function Dijkstra(L[1..n, 1..n]): array[2..n]
    array D[2..n]
    {Initialization}
    C ← {2, 3, ..., n} {S = N \ C exists only implicitly}
    for i ← 2 to n do D[i] ← L[1, i]
    {greedy loop}
    repeat n-2 times
        v ← some element of C minimizing D[v]
        C ← C \ {v} {and implicitly S ← S ∪ {v}}
        for each w in C do
            D[w] ← min(D[w], D[v] + L[v, w])
    return D
Example

[Figure: A Directed Graph on nodes 1 to 5, with edge lengths 10, 50, 100, 30, 10, 5, 20 and 50]
Example
On the graph in the figure, the algorithm proceeds as follows:

Step             v    C            D
Initialization   -    {2,3,4,5}    [50,30,100,10]
1                5    {2,3,4}      [50,30,20,10]
2                4    {2,3}        [40,30,20,10]
3                3    {2}          [35,30,20,10]
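
The table can be reproduced with the Python sketch below, which follows the formal algorithm with node 1 as the source. The edge directions in `EDGES` are an assumption: the figure itself is not reproduced here, so they were chosen to be consistent with the edge lengths and with the trace above.

```python
import math

INF = math.inf


def build_matrix(n, edges):
    """Build an (n+1) x (n+1) length matrix; row and column 0 are unused."""
    L = [[INF] * (n + 1) for _ in range(n + 1)]
    for (i, j), length in edges.items():
        L[i][j] = length
    return L


def dijkstra(L):
    """Return {node: length of the shortest path from node 1} for nodes 2..n."""
    n = len(L) - 1
    candidates = set(range(2, n + 1))             # the set C
    D = {i: L[1][i] for i in range(2, n + 1)}
    for _ in range(n - 2):                        # greedy loop
        v = min(candidates, key=lambda node: D[node])
        candidates.remove(v)                      # implicitly adds v to S
        for w in candidates:
            D[w] = min(D[w], D[v] + L[v][w])
    return D


# ASSUMED directed edges (from, to): length, chosen to match the trace.
EDGES = {
    (1, 2): 50, (1, 3): 30, (1, 4): 100, (1, 5): 10,
    (3, 2): 5, (4, 2): 20, (4, 3): 50, (5, 4): 10,
}

print(dijkstra(build_matrix(5, EDGES)))
```

The final distances agree with the last row of the table: 35, 30, 20 and 10 for nodes 2 to 5.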
The Knapsack Problem (1)
Definition:
We are given n objects and a knapsack of
capacity W.
The objects have positive weights
and positive values.
Our aim is to fill the knapsack in a way that
maximizes the value of the included objects
while respecting the capacity constraint.
Version (1)
In this version (1) of the problem we
assume that the objects can be broken
into smaller pieces, so we may decide
to carry only a fraction $x_i$ of object i,
where $0 \le x_i \le 1$.
The problem can be stated as follows:

Maximize
$$\sum_{i=1}^{n} x_i v_i$$
where $x_i$ is the fraction of the i-th item that is carried,

subject to
$$\sum_{i=1}^{n} x_i w_i \le W$$
Common to all versions is a set of n items,
with each item having an associated profit $p_j$
and weight $w_j$. The objective is to pick some
of the items, with maximal total profit, while
ensuring that the total weight of the
chosen items does not exceed W. Generally,
these coefficients are scaled to become
integers, and they are almost always
assumed to be positive.
Stepwise Execution
Step 1: The algorithm takes as input the maximum
weight W, the number of items n, and the two sequences v
= <v1, v2, . . . , vn> and w = <w1, w2, . . . , wn>.
Step 2: It stores the c[i, j] values in a table, that is, a
two-dimensional array c[0 . . n, 0 . . W] whose entries are
computed in row-major order. That is, the first row of c
is filled in from left to right, then the second row, and so
on.
Step 3: At the end of the computation, c[n, W]
contains the maximum value that can be packed into the
knapsack.
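
The steps above do not say how an entry c[i, j] is computed. The Python sketch below assumes the usual table-filling rule for the version of the problem in which objects cannot be broken: c[i][j] is the best value obtainable from the first i objects with capacity j. This rule, the integer weights and the small data set are assumptions made for illustration.

```python
def knapsack_table(W, v, w):
    """Fill c[0..n][0..W] row by row and return the whole table.

    Assumes integer weights and objects that are taken whole or not at all.
    """
    n = len(v)
    c = [[0] * (W + 1) for _ in range(n + 1)]     # row 0: no objects, value 0
    for i in range(1, n + 1):                     # rows in order
        for j in range(W + 1):                    # each row left to right
            c[i][j] = c[i - 1][j]                 # leave object i out
            if w[i - 1] <= j:                     # or put it in, if it fits
                with_item = c[i - 1][j - w[i - 1]] + v[i - 1]
                c[i][j] = max(c[i][j], with_item)
    return c


# Hypothetical instance: three objects, capacity 5.
values = [3, 4, 5]
weights = [2, 3, 4]
table = knapsack_table(5, values, weights)
print(table[len(values)][5])
```

Here the first two objects together fill the knapsack exactly and give the best value.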
Algorithm
function knapsack(w[1..n], v[1..n], W): array[1..n]
    {Initialization}
    for i ← 1 to n do x[i] ← 0
    weight ← 0
    {greedy loop}
    while weight < W do
        i ← the best remaining object {see below}
        if weight + w[i] ≤ W then
            x[i] ← 1
            weight ← weight + w[i]
        else
            x[i] ← (W - weight) / w[i]
            weight ← W
    return x
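
A possible Python rendering of the greedy loop is sketched below. The pseudocode leaves "the best remaining object" open; this sketch assumes it means the object with the highest value per unit of weight, and the data set is illustrative. The loop also stops when the objects run out, which the pseudocode does not handle.

```python
def greedy_knapsack(w, v, W):
    """Fractional knapsack: return the fraction x[i] taken of each object."""
    n = len(w)
    x = [0.0] * n
    weight = 0
    # ASSUMPTION: "best" = highest value-to-weight ratio.
    order = sorted(range(n), key=lambda i: v[i] / w[i], reverse=True)
    for i in order:
        if weight >= W:
            break                      # knapsack is full
        if weight + w[i] <= W:
            x[i] = 1.0                 # the whole object fits
            weight += w[i]
        else:
            x[i] = (W - weight) / w[i] # take only the fraction that fits
            weight = W
    return x


# Illustrative instance: five objects, capacity 100.
weights = [10, 20, 30, 40, 50]
values = [20, 30, 66, 40, 60]
x = greedy_knapsack(weights, values, 100)
print(x, sum(xi * vi for xi, vi in zip(x, values)))
```

On this instance the first three objects are taken whole and four fifths of the last one fills the remaining capacity.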
What is Scheduling?
Allocation of resources to activities over time so that
input demands are met in a timely and cost-effective
manner.
Most typically, this involves determining a set of
activity start and end times, together with resource
assignments, which
satisfy all temporal constraints on activity
execution (following from process
considerations)
satisfy resource capacity constraints, and
optimize some set of performance objectives
to the extent possible
A Basic Scheduling Problem

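[Figure: two jobs. Job 1 consists of operations op11, op12, op13 with release time rel1 and due date dd1; job 2 consists of operations op21, op22 with release time rel2 and due date dd2. The operations use resources R1 and R2.]

Constraints shown in the figure:
Precedence (i before j): st(i) + p(i) < st(j), where p(i) is the processing time of op i
Shared resource (i R j): st(i) + p(i) < st(j) or st(j) + p(j) < st(i)
Release time: rel_j < st(i), for each op i of job j
Due date: dd_j > st(i) + p(i), for each op i of job j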
Minimizing time in the system
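The greedy algorithm for scheduling (minimizing time in the system), which
serves the customers in order of increasing service time, is optimal.
Proof:
Let $P = p_1 p_2 \ldots p_n$ be any permutation of the integers from 1 to n and
let $s_i = t_{p_i}$, where $t_j$ is the service time required by customer j.
If customers are served in the order P, then the service time
required by the i-th customer to be served is $s_i$, and the total time passed
in the system by all the customers is

$$
\begin{aligned}
T(P) &= s_1 + (s_1 + s_2) + (s_1 + s_2 + s_3) + \cdots \\
     &= n s_1 + (n-1) s_2 + (n-2) s_3 + \cdots \\
     &= \sum_{k=1}^{n} (n-k+1) s_k .
\end{aligned}
$$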
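[Figure: schedules P and P' over positions 1 ... a-1, a, a+1 ... b-1, b, b+1 ... n, with the customers in positions a and b exchanged in P']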
Suppose now that P does not arrange the customers in
order of increasing service time. Then we can find two integers a and b with a < b
and $s_a > s_b$.
In other words, the a-th customer is served before the b-th customer even
though the former needs more service time than the latter. If we exchange the
positions of these two customers, we obtain a new order of service P', which is
simply P with the integers $p_a$ and $p_b$ interchanged.
The total time passed in the system by all the customers if schedule P' is used is

$$T(P') = (n-a+1) s_b + (n-b+1) s_a + \sum_{\substack{k=1 \\ k \ne a,b}}^{n} (n-k+1) s_k .$$

The new schedule is preferable to the old because

$$T(P) - T(P') = (n-a+1)(s_a - s_b) + (n-b+1)(s_b - s_a) = (b-a)(s_a - s_b) > 0 .$$


The same result can be obtained less formally from the figure.
Comparing schedules P and P', we see that the first a-1
customers leave the system at exactly the same time in
both schedules. The same is true of the last n-b
customers. Customer a now leaves when customer b
used to, while customer b leaves earlier than customer a
used to, because $s_b < s_a$.
Finally, those customers served in positions a+1 to b-1
also leave the system earlier, for the same reason.
Overall, P' is therefore better than P.
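
The exchange argument can be checked by brute force on a small instance. The Python sketch below (the service times 5, 10 and 3 are illustrative) computes T(P) for every order of service and confirms that serving customers in order of increasing service time attains the minimum.

```python
from itertools import permutations


def total_time(service_times):
    """Total time in the system when customers are served in the given order."""
    total = 0
    elapsed = 0
    for s in service_times:
        elapsed += s        # this customer leaves at time 'elapsed'
        total += elapsed
    return total


def greedy_schedule(times):
    """Greedy rule: serve customers in order of increasing service time."""
    return sorted(times)


times = [5, 10, 3]
best = min(total_time(order) for order in permutations(times))
print(total_time(greedy_schedule(times)), best)
```

Trying all n! orders is only feasible for tiny n. The proof above is what guarantees optimality in general.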