
ALGORITHM ANALYSIS AND DESIGN R606 3+1+0


Module 1

Introduction and Complexity

What is an algorithm – Properties of an Algorithm, Difference between Algorithm, Computational Procedure and Program, Study of Algorithms; Pseudo-code Conventions; Recursive Algorithms – Space and Time Complexity – Asymptotic Notations – 'Oh', 'Omega', 'Theta', Common Complexity Functions; Recurrence Relations and Recurrence Trees for Complexity Calculations; Profiling – Deterministic and non-deterministic algorithms.

Module 2

Divide and Conquer

Control Abstraction, Finding Maximum and Minimum, Binary Search, Divide and Conquer Matrix Multiplication, Strassen's Matrix Multiplication, Merge Sort, Quick Sort.

Module 3

Greedy Strategy

Control Abstraction, General Knapsack Problem, Optimal Storage on Tapes, Minimum Cost Spanning Trees – Prim's Algorithm, Kruskal's Algorithm – Job sequencing with deadlines.

Module 4

Dynamic Programming

Principle of Optimality, Multi-stage Graph, All-Pairs Shortest Paths, Travelling Salesman Problem. Lower Bound Theory – Comparison Trees for Searching and Sorting, Oracles and Adversary Arguments – Merging, Insertion & Selection Sort; Selection of kth Smallest Element.

Module 5

Backtracking

Control Abstraction – Bounding Functions, N-Queens Problem, Sum of Subsets, Knapsack Problem. Branch and Bound Techniques – FIFO, LIFO, and LC Control Abstractions, 15-puzzle, Travelling Salesman Problem.

Text Book


1. Fundamentals of Computer Algorithms – Horowitz and Sahni, Galgotia

References

1. Computer Algorithms – Introduction to Design and Analysis – Sara Baase & Allen Van Gelder, Pearson Education
2. Data Structures, Algorithms and Applications – Sahni, Tata McGraw-Hill
3. Foundations of Algorithms – Richard Neapolitan, Kumarss N., D.C. Heath & Company
4. Introduction to Algorithms – Thomas Cormen, Charles Leiserson, Ronald Rivest – PHI


TABLE OF CONTENTS

Module 1
What is an algorithm
Properties of an Algorithm
Difference between Algorithm, Computational Procedure and Program
Study of Algorithms
Pseudocode Conventions
Recursive Algorithms
Space Complexity
Time Complexity
Asymptotic Notations: Oh, Omega, Theta
Common Complexity Functions
Recurrence Relations
Recurrence Trees for Complexity Calculations
Profiling
Deterministic and non-deterministic algorithms

Module 2 – Divide and Conquer
Control Abstraction
Finding Maximum and Minimum
Binary Search
Divide and Conquer Matrix Multiplication
Strassen's Matrix Multiplication
Merge Sort
Quick Sort

Module 3 – Greedy Strategy
Control Abstraction
General Knapsack Problem
Optimal Storage on Tapes
Minimum Cost Spanning Trees: Prim's Algorithm, Kruskal's Algorithm
Job sequencing with deadlines

Module 4 – Dynamic Programming
Principle of Optimality
Multi-stage Graph
All-Pairs Shortest Paths
Travelling Salesman Problem
Lower Bound Theory: Comparison Trees for Searching and Sorting
Oracles and Adversary Arguments: Merging, Insertion and Selection Sort
Selection of kth Smallest Element

Module 5 – Backtracking
Control Abstraction, Bounding Functions
N-Queens Problem
Sum of Subsets
Knapsack Problem
Branch and Bound Techniques: FIFO, LIFO, and LC Control Abstractions
15-puzzle
Travelling Salesman Problem

MODULE 1

WHAT IS AN ALGORITHM

Definition: An algorithm is a finite set of instructions that, if followed, accomplishes a particular task.

PROPERTIES OF AN ALGORITHM

All algorithms must satisfy the following criteria:
1. Input. Zero or more quantities are externally supplied.
2. Output. At least one quantity is produced.
3. Definiteness. Each instruction is clear and unambiguous.
4. Finiteness. If we trace out the instructions of an algorithm, then for all cases, the algorithm terminates after a finite number of steps.
5. Effectiveness. Every instruction must be very basic so that it can be carried out, in principle, by a person using only pencil and paper.

Criteria 1 and 2 require that an algorithm produce one or more outputs and have zero or more inputs that are externally supplied. According to criterion 3, each operation must be definite, meaning that it must be perfectly clear what should be done. An algorithm is composed of a finite set of steps, each of which may require one or more operations. The possibility of a computer carrying out these operations necessitates that certain constraints be placed on the type of operations an algorithm can include.

The fourth criterion is that algorithms terminate after a finite number of operations. A related consideration is that the time for termination should be reasonably short.

Criterion 5 requires that each operation be effective: each step must be such that it can, at least in principle, be done by a person using pencil and paper in a finite amount of time. It is not enough that each operation be definite as in criterion 3; it also must be feasible. Performing arithmetic on integers is an example of an effective operation, but arithmetic with real numbers is not, since some values may be expressible only by infinitely long decimal expansions.

DIFFERENCE BETWEEN ALGORITHM, COMPUTATIONAL PROCEDURE AND PROGRAM

COMPUTATIONAL PROCEDURE
Algorithms that are definite and effective are also called computational procedures. One important example of a computational procedure is the operating system of a digital computer. This procedure is designed to control the execution of jobs in such a way that when no jobs are available, it does not terminate but continues in a waiting state until a new job is entered.

PROGRAM
To help us achieve the criterion of definiteness, algorithms are written in a programming language. Such languages are designed so that each legitimate sentence has a unique meaning. A program is the expression of an algorithm in a programming language. Sometimes words such as procedure, function and subroutine are used synonymously for program.

The study of algorithms includes many important and active areas of research. There are four distinct areas of study:

1. How to devise algorithms: Creating an algorithm is an art which may never be fully automated. There are several techniques with which you can devise new and useful algorithms; dynamic programming is one such technique. Some of the techniques are especially useful in fields other than computer science, such as operations research and electrical engineering.

2. How to validate algorithms: Once an algorithm is devised, it is necessary to show that it computes the correct answer for all possible legal inputs. This process is referred to as algorithm validation. The purpose of validation is to assure us that the algorithm will work correctly, independently of the issues concerning the programming language it will eventually be written in. Once the validity of the method has been shown, a program can be written and a second phase begins. This phase is referred to as program proving or program verification. A proof of correctness requires that the solution be stated in two forms. One form is usually as a program which is annotated by a set of assertions about the input and output variables of the program; these assertions are often expressed in predicate calculus. The second form is called a specification, and this may also be expressed in the predicate calculus. A proof consists of showing that these two forms are equivalent, in that for every given legal input they describe the same output. A complete proof of program correctness requires that each statement of the programming language be precisely defined and all basic operations be proved correct. A proof of correctness is much more valuable than a thousand tests, since it guarantees that the program will work correctly for all possible inputs. In cases in which we cannot verify the correctness of output on sample data, the following strategy can be employed: let more than one programmer develop programs for the same problem, and compare the outputs produced by these programs. If the outputs match, then there is a good chance that they are correct.

3. How to analyze algorithms: This field of study is called analysis of algorithms. As an algorithm is executed, it uses the computer's central processing unit (CPU) to perform operations and its memory (both immediate and auxiliary) to hold the program and data. Analysis of algorithms or performance analysis refers to the task of determining how much computing time and storage an algorithm requires. Questions such as how well an algorithm performs in the best case, in the worst case, or on the average are typical. An important result of this study is that it allows you to make quantitative judgments about the value of one algorithm over another. Another result is that it allows you to predict whether the software will meet any efficiency constraints that exist.

4. How to test a program: Testing a program consists of two phases: debugging and profiling (or performance measurement). Debugging is the process of executing programs on sample data sets to determine whether faulty results occur and, if so, to correct them. Profiling or performance measurement is the process of executing a correct program on data sets and measuring the time and space it takes to compute the results. These timing figures are useful in that they may confirm a previously done analysis and point out logical places to perform useful optimization.

STUDY OF ALGORITHMS

An algorithm is a finite set of instructions that, if followed, accomplishes a particular task. The study of algorithms includes many important and active areas of research. There are four distinct areas of study:
1. How to devise algorithms: Creating an algorithm is an art which may never be fully automated. One goal is to study the various design techniques that have proven useful, in that they have often yielded good algorithms; some important design techniques are linear, nonlinear and integer programming, and dynamic programming is one such technique.
2. How to validate algorithms: Once an algorithm is devised, it is necessary to show that it computes the correct answer for all possible legal inputs. This process is algorithm validation. The purpose of the validation is to assure us that the algorithm will work correctly, independent of the issues concerning the programming language it will eventually be written in.
3. How to analyze algorithms: This field of study is called analysis of algorithms. As an algorithm is executed, it uses the computer's central processing unit to perform operations and its memory to hold the program and data. Analysis of algorithms refers to the task of determining how much computing time and storage an algorithm requires. It allows you to predict whether the software will meet any efficiency constraints that exist.
4. How to test a program: Testing a program consists of two phases: debugging and profiling. Debugging is the process of executing programs on sample data sets to determine whether faulty results occur and, if so, to correct them. Profiling is the process of executing a correct program on data sets and measuring the time and space it takes to compute the results.

PSEUDOCODE CONVENTIONS

We can describe an algorithm in many ways. We can use a natural language like English, although if we select this option, we must make sure that the resulting instructions are definite. We can present most of our algorithms using a pseudocode that resembles C.

1. Comments begin with // and continue until the end of the line.
Eg: count := count + 1; // count is global; it is initially zero.

2. Blocks are indicated with matching braces: { and }. A compound statement can be represented as a block. The body of a procedure also forms a block. Statements are delimited by ;.

3. An identifier begins with a letter. The data types of variables are not explicitly declared; the types will be clear from the context. Whether a variable is global or local to a procedure will also be evident from the context. Compound data types can be formed with records.
Eg: node = record
    {
        datatype_1 data_1;
        :
        datatype_n data_n;
        node *link;
    }

4. Assignment of values to variables is done using the assignment statement
<variable> := <expression>;
Eg: count := count + 1;

5. There are two Boolean values, true and false. In order to produce these values, the logical operators and, or and not and the relational operators <, <=, =, !=, >= and > are provided.

6. Elements of multidimensional arrays are accessed using [ and ]. For example, if A is a two-dimensional array, the (i,j)th element of the array is denoted as A[i,j]. Array indices start at zero.

7. The following looping statements are employed: for, while and repeat-until. The while loop takes the following form:
while (condition) do
{
    <statement 1>
    :
    <statement n>
}
Eg: for j := 1 to n do
{
    count := count + 1;
    C[i,j] := a[i,j] + b[i,j];
}

8. A conditional statement has the following forms:
if <condition> then <statement>
if <condition> then <statement 1> else <statement 2>
Here <condition> is a Boolean expression and <statement>, <statement 1>, and <statement 2> are arbitrary statements.
Eg: if (j > 1) then k := i - 1; else k := n - 1;

9. Input and output are done using the instructions read and write. No format is used to specify the size of input or output quantities.
Eg: write ("n is even");

10. There is only one type of procedure: Algorithm. An algorithm consists of a heading and a body. The heading takes the form
Algorithm Name (<parameter list>)
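To make the conventions concrete, here is a small complete algorithm written in this pseudocode (an illustrative sketch in the style of the text, finding the largest value in a[1:n]):

Algorithm Max(a, n)
// a is an array of size n. Returns the largest value in a[1:n].
{
    result := a[1];
    for i := 2 to n do
        if (a[i] > result) then result := a[i];
    return result;
}

Note how the heading, the comment, the assignments, the for loop and the conditional all follow conventions 1 through 10 above.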

RECURSIVE ALGORITHMS

A function that calls itself repeatedly, satisfying some condition, is called a recursive function. The algorithm that does this is called a recursive algorithm. Using recursion, we split a complex problem into its single simplest case; the recursive function only knows how to solve that simplest case.

TYPES OF RECURSION

Linear recursion
A linear recursive function is a function that only makes a single call to itself each time the function runs (as opposed to one that would call itself multiple times during its execution). The factorial function is a good example of linear recursion. Another example of a linear recursive function would be one to compute the square root of a number using Newton's method (assume EPSILON to be a very small number close to 0):

double my_sqrt(double x, double a)
{
    double difference = a*a - x;
    if (difference < 0.0) difference = -difference;
    if (difference < EPSILON)
        return(a);
    else
        return(my_sqrt(x, (a + x/a)/2.0));
}

Tail recursion
Tail recursion is a form of linear recursion. In tail recursion, the recursive call is the last thing the function does; often, the value of the recursive call is returned. As such, tail recursive functions can often be easily implemented in an iterative manner: by taking out the recursive call and replacing it with a loop, the same effect can generally be achieved. In fact, a good compiler can recognize tail recursion and convert it to iteration in order to optimize the performance of the code. A good example of a tail recursive function is a function to compute the GCD, or greatest common divisor, of two numbers:

int gcd(int m, int n)
{
    int r;
    if (m < n) return gcd(n, m);
    r = m % n;
    if (r == 0) return(n);
    else return(gcd(n, r));
}

Binary recursion
Some recursive functions don't just have one call to themselves; they have two (or more). Functions with two recursive calls are referred to as binary recursive functions. The mathematical combinations operation, often written nCk (choosing k elements out of a set of n), is a good example of a function that can quickly be implemented as a binary recursive function.

The number of combinations can be implemented as follows:

int choose(int n, int k)
{
    if (k == 0 || n == k) return(1);
    else return(choose(n-1, k) + choose(n-1, k-1));
}

Exponential recursion
An exponential recursive function is one that, if you were to draw out a representation of all the function calls, would have an exponential number of calls in relation to the size of the data set (exponential meaning that if there were n elements, there would be O(a^n) function calls, where a is a positive number). A good example of an exponentially recursive function is a function to compute all the permutations of a data set. Let's write a function to take an array of n integers and print out every permutation of it.

void print_array(int arr[], int n)
{
    int i;
    for (i = 0; i < n; i++) printf("%d ", arr[i]);
    printf("\n");
}

void print_permutations(int arr[], int n, int i)
{
    int j, swap;
    if (i == n - 1)
        print_array(arr, n);
    else
    {
        for (j = i; j < n; j++)
        {
            swap = arr[i]; arr[i] = arr[j]; arr[j] = swap;
            print_permutations(arr, n, i + 1);
            swap = arr[i]; arr[i] = arr[j]; arr[j] = swap; // swap back
        }
    }
}

To run this function on an array arr of length n, we'd call print_permutations(arr, n, 0), where the 0 tells it to start at the beginning of the array.

Nested recursion
In nested recursion, one of the arguments to the recursive function is the recursive function itself! These functions tend to grow extremely fast. A good example is the classic mathematical function, Ackerman's function. It grows very quickly (even for small values of x and y, Ackermann(x,y) is extremely large), and it cannot be computed with only definite iteration (a completely defined for() loop, for example); it requires indefinite iteration (recursion, for example).

int ackerman(int m, int n)
{
    if (m == 0) return(n + 1);
    else if (n == 0) return(ackerman(m - 1, 1));
    else return(ackerman(m - 1, ackerman(m, n - 1)));
}

Mutual recursion
A recursive function doesn't necessarily need to call itself. Some recursive functions work in pairs or even larger groups: for example, function A calls function B, which calls function C, which in turn calls function A. A simple example of mutual recursion is a set of functions to determine whether an integer is even or odd:

int is_even(unsigned int n)
{
    if (n == 0) return 1;
    else return(is_odd(n - 1));
}

int is_odd(unsigned int n)
{
    return (!is_even(n));
}
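The remark above, that tail recursion is easily implemented in an iterative manner, can be checked by hand on the gcd example: the tail call gcd(n, r) simply becomes a loop that replaces (m, n) by (n, m % n). A minimal C sketch (not part of the original text):

#include <stdio.h>

/* Iterative form of the tail-recursive gcd: the loop keeps
   replacing (m, n) with (n, m % n) until n becomes 0. */
int gcd_iter(int m, int n)
{
    while (n != 0) {
        int r = m % n;
        m = n;
        n = r;
    }
    return m;
}

int main(void)
{
    printf("%d\n", gcd_iter(48, 18)); /* prints 6 */
    return 0;
}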

EXAMPLES OF RECURSIVE ALGORITHMS: The Towers of Hanoi

The Towers of Hanoi puzzle (TOH) was first posed by a French professor, Edouard Lucas, in 1883. Although commonly sold today as a children's toy, it is often discussed in discrete mathematics or computer science books because it provides a simple example of recursion. In addition, its analysis is straightforward and it has many variations of varying difficulty.

The object of the Towers of Hanoi problem is to specify the steps required to move the disks (or rings, as we will sometimes call them) from pole r (r = 1, 2, or 3) to pole s (s = 1, 2, or 3; s != r), observing the following rules:
i) Only one disk at a time may be moved.
ii) At no time may a larger disk be on top of a smaller one.
The most common form of the problem has r = 1 and s = 3.

FIGURE: The Towers of Hanoi problem

Solution
The algorithm to solve this problem exemplifies the recursive paradigm. We imagine that we know a solution for n - 1 disks ("reduce to a previous case"), and then we use this solution to solve the problem for n disks. Thus, to move n disks from pole 1 to pole 3, we would:
1. Move n - 1 disks (the imagined known solution) from pole 1 to pole 2. However we do this, the nth disk on pole 1 will never be in our way, because any valid sequence of moves with only n - 1 disks will still be valid if there is an nth (larger) disk always sitting at the bottom of pole 1 (why?).
2. Move disk n from pole 1 to pole 3.
3. Use the same method as in Step 1 to move the n - 1 disks now on pole 2 to pole 3.

Algorithm:
Input:  num [Number of disks]
        Pinit [Initial pole, 1 <= Pinit <= 3]
        Pfin [Final pole, 1 <= Pfin <= 3, Pinit != Pfin]
Output: The sequence of commands to the robot to move the disks from pole Pinit to pole Pfin

Algorithm Hanoi
procedure H(in n, r, s)   [Move n disks from pole r to pole s]
    if n = 1 then
        robot(r -> s)
    else
        H(n-1, r, 6-r-s)
        robot(r -> s)
        H(n-1, 6-r-s, s)
    endif
endpro
H(num, Pinit, Pfin)   [Main algorithm]
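For concreteness, the procedure H above can be transcribed into C almost line for line, with the command robot(r -> s) replaced by a printf. This is a sketch, not part of the original notes:

#include <stdio.h>

/* Move n disks from pole r to pole s (poles numbered 1..3).
   The spare pole is 6 - r - s, since the pole numbers sum to 6. */
void hanoi(int n, int r, int s)
{
    if (n == 1) {
        printf("move disk from pole %d to pole %d\n", r, s);
    } else {
        hanoi(n - 1, r, 6 - r - s);  /* step 1: n-1 disks to the spare pole */
        printf("move disk from pole %d to pole %d\n", r, s);  /* step 2 */
        hanoi(n - 1, 6 - r - s, s);  /* step 3: n-1 disks onto the target */
    }
}

int main(void)
{
    hanoi(3, 1, 3); /* prints the 2^3 - 1 = 7 moves for three disks */
    return 0;
}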

ALGORITHM DESIGN TECHNIQUES

For a given problem, there are many ways to solve it. The different methods are listed below:
1. Divide and Conquer
2. Greedy Algorithm
3. Dynamic Programming
4. Backtracking Algorithms
5. Branch and Bound
6. Randomized Algorithm
Now let us discuss each method briefly.

1. Divide and Conquer: The divide and conquer method consists of three steps:
a. Divide the original problem into a set of sub-problems.
b. Solve every sub-problem individually, recursively.
c. Combine the solutions of the sub-problems into a solution of the whole original problem.

2. Greedy Approach: Greedy algorithms seek to optimize a function by making choices which are the best locally, but do not look at the global problem. The result is a good solution but not necessarily the best one. A greedy algorithm does not always guarantee the optimal solution; however, it generally produces solutions that are very close in value to the optimal one.

3. Dynamic Programming: Dynamic programming is a technique for efficient solution. It is a method of solving problems exhibiting the properties of overlapping sub-problems and optimal sub-structure that takes much less time than other methods.

4. Backtracking Algorithm: This is a depth-first search of the set of possible solutions. It tries each possibility until it finds the right one. During the search, if an alternative doesn't work, the search backtracks to the choice point (the place which presented different alternatives) and tries the next alternative. If there are no more choice points, the search fails.

5. Branch and Bound Algorithm: Branch and bound algorithms are methods for global optimization in non-convex problems. In a branch and bound algorithm, a given problem which cannot be bounded has to be divided into at least two new restricted sub-problems. Branch and bound algorithms can be slow; in the worst case they require effort that grows exponentially with problem size, but in some cases the method converges with much less effort.

6. Randomized Algorithm: A randomized algorithm is defined as an algorithm that is allowed to access a source of independent, unbiased random bits, and it is then allowed to use these random bits to influence its computation.

ALGORITHMIC COMPLEXITY

The time complexity of an algorithm is given by the number of steps taken by the algorithm to compute the function it was written for. The number of steps is itself a function of the instance characteristics. Although any specific instance may have several characteristics, the number of steps is computed as a function of some subset of these. Usually we might wish to know how the computing time increases as the number of inputs increases; in this case the number of steps will be computed as a function of the number of inputs alone. For a different algorithm we might be interested in determining how the computing time increases as the magnitude of one of the inputs increases; in this case the number of steps will be computed as a function of the magnitude of this input alone. Thus, before the step count of an algorithm can be determined, we need to know which characteristics of the problem instance are to be used. These define the variables in the expression for the step count. In the case of Sum, we chose to measure the time complexity as a function of the number n of elements being added. For algorithm Add, the choice of characteristics was the number m of rows and the number n of columns in the matrices being added.

SPACE COMPLEXITY

Algorithm abc computes a+b+b*c+(a+b-c)/(a+b)+4.0; algorithm Sum computes the sum a[1] + a[2] + ... + a[n] iteratively, where the a[i]'s are real numbers; and RSum is a recursive algorithm that computes the same sum. The space needed by each of these algorithms is seen to be the sum of the following components:
1. A fixed part that is independent of the characteristics of the inputs and outputs. This part typically includes the instruction space, space for simple variables and fixed-size component variables, space for constants, and so on.
2. A variable part that consists of the space needed by component variables whose size is dependent on the particular problem instance being solved, the space needed by reference variables, and the recursion stack space.
The space requirement S(P) of any algorithm P may therefore be written as S(P) = c + Sp (instance characteristics), where c is a constant.
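As a worked illustration of these two components (following the textbook's treatment, under the usual assumption that each simple variable and each array element occupies one word of memory): for algorithm Sum, the instance needs room for the n array elements plus the variables n, i and s, so S_Sum(n) >= n + 3. For the recursive RSum, each call places the value of n, the reference to a and the return address on the recursion stack; the depth of recursion is n + 1, so the recursion stack space alone is 3(n + 1).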

TIME COMPLEXITY

COMPLEXITY OF SIMPLE ALGORITHMS

The time complexity of an algorithm is given by the number of steps taken by the algorithm to compute the function it was written for. The time T(P) taken by a program P is the sum of the compile time and the run time. The compile time does not depend on the instance characteristics, and a compiled program will run several times without recompilation. So we concern ourselves with the run time of the program, which is denoted by tp (instance characteristics).

If we knew the characteristics of the compiler to be used, we could proceed to determine the number of additions, subtractions, multiplications, divisions, compares, loads, stores, and so on, that would be made by the code for P. We could then obtain an expression for tp(n) of the form
tp(n) = ca ADD(n) + cs SUB(n) + cm MUL(n) + cd DIV(n) + ...
where n denotes the instance characteristics; ca, cs, cm, cd, and so on denote the time needed for an addition, subtraction, multiplication, division, and so on; and ADD, SUB, MUL, DIV, and so on are functions whose values are the numbers of additions, subtractions, multiplications, divisions, and so on, that are performed when the code for P is used on an instance with characteristic n.

The value of tp(n) for any n can be obtained only experimentally: the program is typed, compiled, and run on a particular machine; the execution time is physically clocked, and tp(n) obtained. In a multiuser system, the execution time depends on factors such as system load, the number of other programs running on the computer at the time P is run, the characteristics of these other programs, and so on.

A program step is loosely defined as a syntactically or semantically meaningful segment of a program that has an execution time independent of the instance characteristics. For example, consider the entire statement return a+b+b*c+(a+b-c)/(a+b)+4.0; of the program given below.

Algorithm abc(a, b, c)
{
    return a+b+b*c+(a+b-c)/(a+b)+4.0;
}

The above line could be considered as one step, since its execution time is independent of the instance characteristics. The number of steps any program statement is assigned depends on the kind of statement. For example, comments count as zero steps; an assignment statement which does not involve any calls to other algorithms is counted as one step; and in an iterative statement such as the for, while and repeat-until statements, we consider the step count only for the control part of the statement. The control parts for for and while statements have the following forms:
for i := (expr) to (expr1) do
while (expr) do
Each execution of the control part of a while statement is given a step count equal to the number of step counts assignable to (expr). The step count for each execution of the control part of a for statement is one, unless the counts attributable to (expr) and (expr1) are functions of the instance characteristics. In the latter case, the first execution of the control part of the for has a step count equal to the sum of the counts for (expr) and (expr1), and the remaining executions of the for statement have a step count of one.

We can determine the number of steps needed by a program to solve a particular program instance in one of two ways. In the first method we introduce a new variable, count, into the program. This is a global variable with initial value zero. Statements to increment count by the appropriate amount are introduced into the program; this is done so that each time a statement in the original program is executed, count is incremented by the step count of that statement.

EXAMPLES FOR TIME COMPLEXITY CALCULATION

Example 1: Sum of n numbers

Algorithm with count statements added:

Algorithm Sum(a, n)
{
    s := 0;
    count := count + 1; // count is global; it is initially zero. For the assignment
    for i := 1 to n do
    {
        count := count + 1; // For the for statement
        s := s + a[i]; count := count + 1; // For the assignment
    }
    count := count + 1; // For the last time of the for
    count := count + 1; // For the return
    return s;
}

Simplified version of the algorithm:

Algorithm Sum(a, n)
{
    for i := 1 to n do count := count + 2;
    count := count + 3;
}

So tSum(n) = 2n + 3. For the recursive version, the complexity calculation is
tRSum(n) = 2 + tRSum(n-1)
         = 2(2) + tRSum(n-2)
         :
         = n(2) + tRSum(0)
         = 2n + 2, for n >= 0.
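The recurrence just unfolded counts the steps of the recursive algorithm RSum mentioned earlier in the space complexity discussion. For reference, its standard form (as given in the textbook) is:

Algorithm RSum(a, n)
{
    if (n <= 0) then return 0.0;
    else return RSum(a, n-1) + a[n];
}

Each invocation contributes two steps (the test and the addition/return), which is where the term 2 + tRSum(n-1) comes from.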

Example 2: Complexity of the Fibonacci series

The Fibonacci series of numbers starts as 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, ... Each new term is obtained by taking the sum of the two previous terms. If we call the first term of the sequence f0, then f0 = 0, f1 = 1, and in general fn = fn-1 + fn-2 for n >= 2.

1   Algorithm Fibonacci(n)
2   // Compute the nth Fibonacci number
3   {
4       if (n <= 1) then
5           write(n);
6       else
7       {
8           fnm2 := 0; fnm1 := 1;
9           for i := 2 to n do
10          {
11              fn := fnm1 + fnm2;
12              fnm2 := fnm1; fnm1 := fn;
13          }
14          write(fn);
15      }
16  }

To analyse the time complexity of this algorithm, we need to consider two cases: (1) n = 0 or 1 and (2) n > 1. When n = 0 or 1, lines 4 and 5 get executed once each; since each line has an s/e of 1, the total step count for this case is 2. When n > 1, lines 4, 8 and 14 are each executed once, line 9 gets executed n times, and lines 11 and 12 get executed n-1 times each. Line 8 has an s/e of 2, line 12 has an s/e of 2, and line 13 has an s/e of 0; the remaining lines that get executed have s/e's of 1. The total steps for the case n > 1 is therefore 4n + 1.

ASYMPTOTIC NOTATION

Introduction
A problem may have numerous algorithmic solutions. In order to choose the best algorithm for a particular task, you need to be able to judge how long a particular solution will take to run. Or, more accurately, you need to be able to judge how long two solutions will take to run, and choose the better of the two. You don't need to know how many minutes and seconds they will take, but you do need some way to compare algorithms against one another.

Asymptotic complexity is a way of expressing the main component of the cost of an algorithm, using idealized units of computational work. Consider, for example, the algorithm for sorting a deck of cards which proceeds by repeatedly searching through the deck for the lowest card. The asymptotic complexity of this algorithm is the square of the number of cards in the deck. This quadratic behavior is the main term in the complexity formula; it says, e.g., that if you double the size of the deck, then the work is roughly quadrupled.

The exact formula for the cost is more complex, and contains more details than are needed to understand the essential complexity of the algorithm. With our deck of cards, in the worst case the deck would start out reverse-sorted, so our scans would have to go all the way to the end. The first scan would involve scanning 52 cards, the next would take 51, etc. So the cost formula is 52 + 51 + ... + 1. Generally, letting N be the number of cards, the formula is 1 + 2 + ... + N, which equals (N + 1)(N/2) = (N^2 + N)/2 = (1/2)N^2 + N/2. But the N^2 term dominates the expression, and this is what is key for comparing algorithm costs. Asymptotically speaking, in the limit as N tends towards infinity, 1 + 2 + ... + N gets closer and closer to the pure quadratic function (1/2)N^2. And what difference does the constant factor of 1/2 make, at this level of abstraction? So the behavior is said to be O(n^2). (This is in fact an expensive algorithm; the best sorting algorithms run in sub-quadratic time.)

Now let us consider how we would go about comparing the complexity of two algorithms. Let f(n) be the cost, in the worst case, of one algorithm, expressed as a function of the input size n, and g(n) be the cost function for the other algorithm. E.g., for sorting algorithms, f(10) and g(10) would be the maximum number of steps that the algorithms would take on a list of 10 items. If, for all values of n >= 0, f(n) is less than or equal to g(n), then the algorithm with complexity function f is strictly faster. But, generally speaking, our concern for computational cost is for the cases with large inputs, so the comparison of f(n) and g(n) for small values of n is less significant than the "long term" comparison of f(n) and g(n) for n larger than some threshold.

Note that we have been speaking about bounds on the performance of algorithms, rather than giving exact speeds. The actual number of steps required to sort our deck of cards (with our naive quadratic algorithm) will depend upon the order in which the cards begin. The actual time to perform each of our steps will depend upon our processor speed, the condition of our processor cache, etc. It's all very complicated in the concrete details, and moreover not relevant to the essence of the algorithm.

BIG-O NOTATION

Definition
Big-O is the formal method of expressing the upper bound of an algorithm's running time. It's a measure of the longest amount of time it could possibly take for the algorithm to complete. More formally, for non-negative functions f(n) and g(n), if there exists an integer n0 and a constant c > 0 such that for all integers n > n0, f(n) <= c g(n), then f(n) is Big O of g(n). This is denoted as "f(n) = O(g(n))". If graphed, g(n) serves as an upper bound to the curve you are analyzing, f(n).

O-Notation (Upper Bound)
This notation gives an upper bound for a function to within a constant factor. We write f(n) = O(g(n)) if there are positive constants n0 and c such that to the right of n0, the value of f(n) always lies on or below c g(n).

Theory examples
So, let's take an example of Big-O. Say that f(n) = 2n + 8 and g(n) = n^2. Can we find a constant n0 so that 2n + 8 <= n^2 for all n >= n0? The number 4 works here, giving us 16 <= 16, and for any n greater than 4 the inequality still holds. Since we're trying to generalize this for large values of n, and small values (1, 2, 3) aren't that important, we can say that f(n) is generally faster than g(n); that is, f(n) is bound by g(n) and will always be less than it. It could then be said that f(n) runs in O(n^2) time: "f-of-n runs in Big-O of n-squared time".

To find the upper bound (the Big-O time), assuming we know that f(n) is equal to (exactly) 2n + 8, we can take a few shortcuts. For example, we can remove all constants from the runtime; eventually, at some value of n, they become irrelevant. This makes f(n) = 2n. Also, for convenience of comparison, we remove constant multipliers; in this case, the 2. This makes f(n) = n. It could also be said that f(n) runs in O(n) time; that lets us put a tighter (closer) upper bound onto the estimate.

Practical examples
O(n): printing a list of n items to the screen, looking at each item once.
O(log n) (also written "O(ln n)"): taking a list of items and cutting it in half repeatedly until there's only one item left, as in binary search (see the sketch below).
O(n^2): taking a list of n items and comparing every item to every other item.
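The "cutting in half" behaviour is easy to see in code: each iteration of the loop below discards half of the remaining range, so roughly log2(n) iterations suffice. A small C sketch (not from the original text):

#include <stdio.h>

/* Counts how many times n can be halved before reaching 1.
   The loop body runs about log2(n) times, which is why repeated
   halving (as in binary search) costs O(log n). */
int halvings(int n)
{
    int count = 0;
    while (n > 1) {
        n = n / 2;
        count++;
    }
    return count;
}

int main(void)
{
    printf("%d\n", halvings(1024)); /* prints 10, since 2^10 = 1024 */
    return 0;
}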

BIG-OMEGA NOTATION

For non-negative functions f(n) and g(n), if there exists an integer n0 and a constant c > 0 such that for all integers n > n0, f(n) >= c g(n), then f(n) is omega of g(n). This is denoted as "f(n) = Ω(g(n))".

This is almost the same definition as Big Oh, except that "f(n) >= c g(n)". This makes g(n) a lower bound function, instead of an upper bound function. It describes the best that can happen for a given data size.

Ω-Notation (Lower Bound)
This notation gives a lower bound for a function to within a constant factor. We write f(n) = Ω(g(n)) if there are positive constants n0 and c such that to the right of n0, the value of f(n) always lies on or above c g(n).

How asymptotic notation relates to analyzing complexity
Temporal comparison is not the only issue in algorithms. There are space issues as well. Generally, a trade off between time and space is noticed in algorithms, and asymptotic notation empowers you to make that trade off. If you think of the amount of time and space your algorithm uses as a function of your data over time or space (time and space are usually analyzed separately), you can analyze how the time and space is handled when you introduce more data to your program.

This is important in data structures, because you want a structure that behaves efficiently as you increase the amount of data it handles. Keep in mind, though, that algorithms that are efficient with large amounts of data are not always simple and efficient for small amounts of data. So if you know you are working with only a small amount of data and you have concerns for speed and code space, a trade off can be made for a function that does not behave well for large amounts of data.

A few examples of asymptotic notation
Generally, we use asymptotic notation as a convenient way to examine what can happen in a function in the worst case or in the best case. For example, say you want to write a function that searches through an array of numbers and returns the smallest one:

function find-min(array a[1..n])
    let j := ∞
    for i := 1 to n:
        j := min(j, a[i])
    repeat
    return j
end

Regardless of how big or small the array is, every time we run find-min, we have to initialize the i and j integer variables and return j at the end. Therefore, we can just think of those parts of the function as constant and ignore them. So, how can we use asymptotic notation to discuss the find-min function? If we search through an array with 87 elements, then the for loop iterates 87 times, even if the very first element we hit turns out to be the minimum. Likewise, for n elements, the for loop iterates n times. Therefore we say the function runs in time O(n). What about this function:

function find-min-plus-max(array a[1..n])
    // First, find the smallest element in the array
    let j := ∞
    for i := 1 to n:
        j := min(j, a[i])
    repeat

    let minim := j

    // Now, find the biggest element, add it to the smallest
    let j := -∞
    for i := 1 to n:
        j := max(j, a[i])
    repeat
    let maxim := j

    // return the sum of the two
    return minim + maxim
end

What's the running time for find-min-plus-max? There are two for loops that each iterate n times, so the running time is clearly O(2n). Because 2 is a constant, we throw it away and write the running time as O(n). Why can you do this? If you recall the definition of Big-O notation, the function whose bound you're testing can be multiplied by some constant. If f(x) = 2x, we can see that if g(x) = x, then the Big-O condition holds. Thus O(2n) = O(n). This rule is general for the various asymptotic notations.

THETA

Definition: The function f(n) = Θ(g(n)) (read as "f of n is theta of g of n") iff there exist positive constants c1, c2 and n0 such that c1 g(n) <= f(n) <= c2 g(n) for all n, n >= n0.

Example: The function 3n+2 = Θ(n), as 3n+2 >= 3n for all n >= 2 and 3n+2 <= 4n for all n >= 2; so c1 = 3, c2 = 4 and n0 = 2. The function f(n) = Θ(g(n)) iff g(n) is both a lower and an upper bound of f(n). The theta notation is more precise than both the big oh and big omega notations.

Further examples: 3n+3 = Θ(n); 10n^2+4n+2 = Θ(n^2); 6*2^n+n^2 = Θ(2^n); 10*log n+4 = Θ(log n). On the other hand, 3n+2 != Θ(1); 3n+3 != Θ(n^2); 10n^2+4n+2 != Θ(n); 10n^2+4n+2 != Θ(1); 6*2^n+n^2 != Θ(n^2); 6*2^n+n^2 != Θ(n^100); and 6*2^n+n^2 != Θ(1).

Little oh

Definition: The function f(n) = o(g(n)) (read as "f of n is little oh of g of n") iff
lim (n -> ∞) f(n)/g(n) = 0.

Example: The function 3n+2 = o(n^2), since lim (n -> ∞) (3n+2)/n^2 = 0. Also 3n+2 = o(n log n); 3n+2 = o(n log log n); 6*2^n+n^2 = o(3^n); 6*2^n+n^2 = o(2^n log n); but 3n+2 != o(n) and 6*2^n+n^2 != o(2^n).

Little omega

Definition: The function f(n) = ω(g(n)) (read as "f of n is little omega of g of n") iff
lim (n -> ∞) g(n)/f(n) = 0.

Algorithm Sum(a, n)
{
    s := 0.0;
    for i := 1 to n do
        s := s + a[i];
    return s;
}
Alg 1: Iterative function for sum

For this algorithm Sum we determined that tSum(n) = 2n + 3. So tSum(n) = Θ(n).

Asymptotic Notation Properties
Let f(n) and g(n) be asymptotically positive functions. Prove or disprove each of the following conjectures:

a. f(n) = O(g(n)) implies g(n) = O(f(n)).
b. f(n) + g(n) = Θ(min(f(n), g(n))).
c. f(n) = O(g(n)) implies lg(f(n)) = O(lg(g(n))), where lg(g(n)) >= 1 and f(n) >= 1 for all sufficiently large n.
d. f(n) = O(g(n)) implies 2^f(n) = O(2^g(n)).
e. f(n) = O((f(n))^2).
f. f(n) = O(g(n)) implies g(n) = Ω(f(n)).
g. f(n) = Θ(f(n/2)).
h. f(n) + o(f(n)) = Θ(f(n)).

COMMON COMPLEXITY FUNCTIONS

Comparison of functions
Many of the relational properties of real numbers apply to asymptotic comparisons as well. For the following, assume that f(n), g(n) and h(n) are asymptotically positive.

Transitivity:
f(n) = Θ(g(n)) and g(n) = Θ(h(n)) imply f(n) = Θ(h(n)).
f(n) = O(g(n)) and g(n) = O(h(n)) imply f(n) = O(h(n)).
f(n) = Ω(g(n)) and g(n) = Ω(h(n)) imply f(n) = Ω(h(n)).
f(n) = o(g(n)) and g(n) = o(h(n)) imply f(n) = o(h(n)).
f(n) = ω(g(n)) and g(n) = ω(h(n)) imply f(n) = ω(h(n)).

Reflexivity:
f(n) = Θ(f(n)); f(n) = O(f(n)); f(n) = Ω(f(n)).

Symmetry:
f(n) = Θ(g(n)) if and only if g(n) = Θ(f(n)).

Transpose symmetry:
f(n) = O(g(n)) if and only if g(n) = Ω(f(n)).
f(n) = o(g(n)) if and only if g(n) = ω(f(n)).

Because these properties hold for asymptotic notations, one can draw an analogy between the asymptotic comparison of two functions f and g and the comparison of two real numbers a and b:
f(n) = O(g(n)) is similar to a <= b;
f(n) = Ω(g(n)) is similar to a >= b;
f(n) = Θ(g(n)) is similar to a = b;
f(n) = o(g(n)) is similar to a < b;
f(n) = ω(g(n)) is similar to a > b.
We say that f(n) is asymptotically smaller than g(n) if f(n) = o(g(n)), and f(n) is asymptotically larger than g(n) if f(n) = ω(g(n)).

One property of real numbers, however, does not carry over to asymptotic notations:
Trichotomy: For any two real numbers a and b, exactly one of the following must hold: a < b, a = b, or a > b.
Although any two real numbers can be compared, not all functions are asymptotically comparable. That is, for two functions f(n) and g(n), it may be the case that neither f(n) = O(g(n)) nor f(n) = Ω(g(n)) holds. For example, the functions n and n^(1+sin n) cannot be compared using asymptotic notation, since the value of the exponent in n^(1+sin n) oscillates between 0 and 2, taking on all values in between.

RECURRENCE RELATIONS

A recurrence relation is an equation or inequality that describes a function in terms of its values on smaller inputs, for example
t(n) = 2t(n-1)   or   T(n) = 3T(n/4) + Θ(n^2).
The main tool for analyzing the time efficiency of a recursive algorithm is to set up a sum expressing the number of executions of its basic operation and ascertain the solution's order of growth. To solve a recurrence relation means to obtain a function defined on the natural numbers that satisfies the recurrence.

The different methods for solving recurrence relations are:
Substitution method
Iteration method
Changing variables method
Recurrence tree
Characteristic equation method
n^α method
Master theorem

Solving recurrences by substitution
Idea: make a guess for the form of the solution and prove it by induction. This can be used to prove both upper bounds O() and lower bounds Ω().

Let's solve T(n) = 2T(n/2) + n using substitution. Guess T(n) <= cn log n for some constant c (that is, T(n) = O(n log n)).
Proof:
Base case: we need to show that our guess holds for some base case (not necessarily n = 1; some small n is ok). This is fine, since the function is constant for small constant n.
Assume the bound holds for n/2: T(n/2) <= c(n/2) log(n/2). We prove that it holds for n, that is, T(n) <= cn log n:
T(n) = 2T(n/2) + n

. Example: Solve T(n) = 8T(n/2) + n² (T(1) = 1) T(n) = n² + 8T(n/2) = n² + 8(8T( n/2² ) + (n/2)²) = n² + 8²T( n/2 ²) + 8(n²/4)) = n²+ 2n² + 8²T( n/2² ) = n² + 2n² + 8²(8T( n/2³ ) + ( n/2² )²) = n² + 2n² + 8³T( n/2³ ) + 8²(n²/4² )) = n² + 2n² + 2²n²+ 8³T( n/2³ ) =. The hard part of the substitution method is often to make a good guess. . Masters theorem The master method is used for solving the following type of recurrence T(n)=aT(n/b)+f(n).. 37 Solving Recurrences with the Iteration In the iteration method we iteratively ―unfold‖ the recurrence until we ―see the pattern‖.. Palai . The iteration method does not require making a good guess like the substitution method (but it is often more involved than using induction).Algorithm Analysis and Design (R 606) ≤ 2(c n/2 logn/2) + n = cn log n/2 +n =cn log n-cn log 2+ n =cn log n-cn + n So ok if c ≥ 1 Similarly it can be shown that T(n) =Ω (n log n) Similarly it can be shown that T(n) = T([n/2]) + T([n/2]) + n is θ(n lg n).The subproblems are solved recursively each in T(n/b) Department of Computer Science & Engineering SJCET.a 1 and b>1.In above recurrence the problem is divided in to ‗a‘ subproblems each of size atmost ‗n/b‘. = n² + 2n² + 2²n²+ 2²n³ + .

THEOREM
Let T(n) be defined on the non-negative integers by the recurrence T(n) = aT(n/b) + f(n), where a >= 1 and b > 1 are constants and f(n) is a function. With E = log a / log b, T(n) can be bounded asymptotically as follows:
CASE 1: If f(n) = O(n^(E-e)) for some e > 0, then T(n) = Θ(n^E).
CASE 2: If f(n) = Θ(n^E), then T(n) = Θ(n^E log n).
CASE 3: If f(n) = Ω(n^(E+e)) for some e > 0, and if a f(n/b) <= c f(n) for some constant c < 1 and all sufficiently large n, then T(n) = Θ(f(n)).

RECURRENCE PROBLEMS

Case 1 problem:
1. T(n) = 4T(n/2) + n
   a = 4, b = 2, f(n) = n
   E = log 4 / log 2 = 2
   f(n) = n = O(n^(2-e)) with e = 1, so by Case 1, T(n) = Θ(n^2).

Case 2 problem:
2. T(n) = 4T(n/2) + n^2
   a = 4, b = 2, f(n) = n^2
   E = log 4 / log 2 = 2
   f(n) = Θ(n^2) = Θ(n^E), so by Case 2, T(n) = Θ(n^2 log n).

Case 3 problem:
3. T(n) = 4T(n/2) + n^3
   a = 4, b = 2, f(n) = n^3
   E = log 4 / log 2 = 2
   f(n) = n^3 = Ω(n^(2+e)) with e = 1, and a f(n/b) = 4(n/2)^3 = n^3/2 <= (1/2) f(n), so by Case 3, T(n) = Θ(n^3).

RECURSION TREES FOR COMPLEXITY CALCULATIONS

A different way to look at the iteration method is the recursion tree: we draw out the recursion tree with the cost of a single call in each node; the running time is then the sum of the costs in all the nodes. If you are careful drawing the recursion tree and summing up the costs, the recursion tree is a direct proof for the solution of the recurrence, just like iteration and substitution.

Example: T(n) = 8T(n/2) + n^2 (T(1) = 1)
T(n) = n^2 + 2n^2 + 2^2 n^2 + 2^3 n^2 + ... + 2^(log n - 1) n^2 + 8^(log n) = Θ(n^3)


Changing variables
Sometimes recurrences can be reduced to simpler ones by changing variables.

Example: Solve T(n) = 2T(√n) + log n.
Let m = log n, so that 2^m = n and √n = 2^(m/2). Then
T(2^m) = 2T(2^(m/2)) + m.
Let S(m) = T(2^m). Then S(m) = 2S(m/2) + m, so S(m) = O(m log m), and
T(n) = T(2^m) = S(m) = O(m log m) = O(log n log log n).

Other recurrences
Some important/typical bounds on recurrences not covered by the master method:

Logarithmic: Θ(log n)
  Recurrence: T(n) = 1 + T(n/2)
  Typical example: recurse on half the input (and throw half away)
  Variations: T(n) = 1 + T(99n/100)

Linear: Θ(n)
  Recurrence: T(n) = 1 + T(n-1)
  Typical example: single loop
  Variations: T(n) = 1 + 2T(n/2), T(n) = n + T(n/2), T(n) = T(n/5) + T(7n/10 + 6) + n

Quadratic: Θ(n^2)
  Recurrence: T(n) = n + T(n-1)
  Typical example: nested loops

Exponential: Θ(2^n)
  Recurrence: T(n) = 2T(n-1)


PROFILING
Profiling or performance measurement is the process of executing a correct program on data sets and measuring the time and space it takes to compute the results.
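In C, a crude profile of a routine can be taken with the standard library's clock() function. A minimal sketch (the workload being timed here is just a stand-in):

#include <stdio.h>
#include <time.h>

/* Stand-in workload: sum the first n integers. */
long work(long n)
{
    long s = 0;
    for (long i = 1; i <= n; i++)
        s += i;
    return s;
}

int main(void)
{
    clock_t start = clock();
    long result = work(100000000L);
    clock_t end = clock();
    /* CLOCKS_PER_SEC converts processor ticks to seconds. */
    printf("result = %ld, time = %.3f s\n",
           result, (double)(end - start) / CLOCKS_PER_SEC);
    return 0;
}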

NONDETERMINISTIC ALGORITHMS
The notion of algorithm that we have been using has the property that the result of every operation is uniquely defined. Algorithms with this property are termed deterministic algorithms. Such algorithms agree with the way programs are executed on a computer. In a theoretical framework we can remove this restriction on the outcome of every operation: we can allow algorithms to contain operations whose outcomes are not uniquely defined but are limited to specified sets of possibilities. The machine executing such operations is allowed to choose any one of these outcomes, subject to a termination condition to be defined later. This leads to the concept of nondeterministic algorithms. To specify such algorithms, we introduce three functions.

1. Choice(S) arbitrarily chooses one of the elements of the set S.

2. Failure () signals an unsuccessful completion.

3. Success () signals a successful completion.
The assignment statement x := Choice(1, n) could result in x being assigned any one of the integers in the range [1, n]; there is no rule specifying how this choice is to be made. The Failure() and Success() signals are used to define a computation of the algorithm; these statements cannot be used to effect a return. Whenever there is a set of choices that leads to a successful completion, then one such set of choices is always made and the algorithm terminates successfully. A nondeterministic algorithm terminates unsuccessfully if and only if there exists no set of choices leading to a success signal. The computing times for Choice, Failure, and Success are taken to be O(1). A machine capable of executing a


nondeterministic algorithm in this way is called a nondeterministic machine. Although nondeterministic machines do not exist in practice, they provide strong intuitive reasons to conclude that certain problems cannot be solved by fast deterministic algorithms.

Example: Consider the problem of searching for an element x in a given set of elements A[1:n], n >= 1. We are required to determine an index j such that A[j] = x, or j = 0 if x is not in A.
1. j := Choice(1, n);
2. if A[j] = x then { write(j); Success(); }
3. write(0); Failure();

From the way a nondeterministic computation is defined, it follows that the output is 0 if and only if there is no j such that A[j] = x. The complexity of this nondeterministic search algorithm is O(1). Since A is not ordered, every deterministic search algorithm is of complexity Ω(n).
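For contrast, the deterministic search that the last sentence refers to has to examine elements one at a time, which is why it needs Ω(n) comparisons on an unordered array in the worst case. A C sketch (not part of the original notes):

#include <stdio.h>

/* Deterministic counterpart of the nondeterministic search above:
   returns an index j (1-based, as in the text) with A[j] == x,
   or 0 if x is not present. */
int search(int a[], int n, int x)
{
    for (int j = 1; j <= n; j++)
        if (a[j - 1] == x)
            return j;
    return 0;
}

int main(void)
{
    int a[] = {7, 3, 9, 4};
    printf("%d\n", search(a, 4, 9)); /* prints 3 */
    return 0;
}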

DETERMINISTIC ALGORITHM (for n > p)

The deterministic algorithm for selection has a run time of O((n/p) log log p + log n). The basic idea of this algorithm is the same as that of the sequential algorithm. The sequential algorithm partitions the input into groups (of size, say, 5), finds the median of each group, and recursively finds the median (call it M) of these group medians. Then the rank rM of M in the input is computed and, as a result, all elements from the input that are either <= M or > M are dropped, depending on whether i > rM or i <= rM, respectively. Finally, an appropriate selection is performed from the remaining keys recursively. We showed that the run time of this algorithm was O(n).

MODULE 2

DIVIDE AND CONQUER

Given a function to compute on n inputs, the divide-and-conquer strategy suggests splitting the inputs into k distinct subsets, 1 < k ≤ n, yielding k subproblems. These subproblems must be solved, and then a method must be found to combine the subsolutions into a solution of the whole. If the subproblems are still relatively large, the divide-and-conquer strategy can possibly be reapplied; that is, smaller and smaller subproblems of the same kind are generated until eventually subproblems that are small enough to be solved without splitting are produced. The principle is naturally expressed using a recursive algorithm.

CONTROL ABSTRACTION

A control abstraction is a procedure whose flow of control is clear but whose primary operations are specified by other procedures whose precise meanings are left undefined. The algorithm DAndC below is initially invoked as DAndC(P), where P is the problem to be solved. Small(P) is a Boolean-valued function that determines whether the input size is small enough that the answer can be computed without splitting; if this is so, the function S is invoked. Otherwise the problem P is divided into smaller subproblems P1, P2, ..., Pk, k ≥ 1, which are solved by recursive applications of DAndC. Combine is a function that determines the solution to P using the solutions to the k subproblems.

ALGORITHM

Algorithm DAndC(P)
{
    if Small(P) then return S(P);
    else
    {
        divide P into smaller instances P1, P2, ..., Pk, k ≥ 1;
        apply DAndC to each of these subproblems;
        return Combine(DAndC(P1), DAndC(P2), ..., DAndC(Pk));
    }
}
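As a concrete (if trivial) instance of this control abstraction, the following C sketch sums a[lo..hi] by divide and conquer. Here Small(P) is "one element", S(P) returns that element, and Combine is addition; the example is mine, not from the text:

    int dandc_sum(const int a[], int lo, int hi)
    {
        if (lo == hi)                      /* Small(P): solve directly */
            return a[lo];
        int mid = (lo + hi) / 2;           /* divide P into two subproblems */
        return dandc_sum(a, lo, mid) +     /* solve each subproblem ...    */
               dandc_sum(a, mid + 1, hi);  /* ... and Combine by addition  */
    }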

FINDING THE MAXIMUM AND MINIMUM

This is a simple problem that can be solved by the divide-and-conquer technique. The problem is to find the maximum and minimum items in a set of n elements.

ALGORITHM-1

Algorithm StraightMaxMin(a, n, max, min)
// Set max to the maximum and min to the minimum of a[1:n].
{
    max := min := a[1];
    for i := 2 to n do
    {
        if (a[i] > max) then max := a[i];
        if (a[i] < min) then min := a[i];
    }
}

This is a straightforward algorithm to accomplish the above task. In analyzing its time complexity, we concentrate on the number of element comparisons; the frequency count for the other operations is of the same order. StraightMaxMin requires 2(n - 1) element comparisons in the best, average, and worst cases. An immediate improvement is possible by realizing that the comparison a[i] < min is necessary only when a[i] > max is false. With this change, the best case occurs when the elements are in increasing order and the number of element comparisons is n - 1; the worst case occurs when the elements are in decreasing order, in which case the number of element comparisons is 2(n - 1). On the average, a[i] is greater than max half the time, and so the average number of comparisons is 3n/2 - 1, which is less than 2(n - 1). If n = 1, the maximum and minimum are both a[1]; if n = 2, the problem can be solved by making one comparison.
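The improvement just mentioned (testing a[i] < min only when a[i] > max fails) looks as follows in C. This sketch is mine, using the same 1-based indexing as the pseudocode:

    void straight_max_min(const int a[], int n, int *max, int *min)
    {
        *max = *min = a[1];                /* a[1..n]; position 0 unused */
        for (int i = 2; i <= n; i++) {
            if (a[i] > *max)
                *max = a[i];
            else if (a[i] < *min)          /* reached only when a[i] is not a new max */
                *min = a[i];
        }
    }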

ALGORITHM-2

Algorithm MaxMin(i, j, max, min)
// a[1:n] is a global array. Parameters i and j are integers, 1 <= i <= j <= n.
// The effect is to set max and min to the largest and smallest values in a[i:j], respectively.
{
    if (i = j) then max := min := a[i];      // Small(P)
    else if (i = j - 1) then                 // Another case of Small(P)
    {
        if (a[i] < a[j]) then { max := a[j]; min := a[i]; }
        else { max := a[i]; min := a[j]; }
    }
    else
    {
        // If P is not small, divide P into subproblems. Find where to split the set:
        mid := ⌊(i + j)/2⌋;
        // Solve the subproblems:
        MaxMin(i, mid, max, min);
        MaxMin(mid + 1, j, max1, min1);
        // Combine the solutions:
        if (max < max1) then max := max1;
        if (min > min1) then min := min1;
    }
}

(Note that the second recursive call returns its results in max1 and min1, which are then combined with max and min.) MaxMin is a recursive algorithm that finds the maximum and minimum of the set of elements {a(i), a(i+1), ..., a(j)}. The situations of set sizes one (i = j) and two (i = j - 1) are handled separately. For sets containing more than two elements, the midpoint is determined (just as in binary search) and two new subproblems are generated. When the maxima and minima of these subproblems are determined, the two maxima are compared and the two minima are compared to achieve the solution for the entire set.


Eg: Suppose we simulate MaxMin on the following nine elements:

a:   [1]  [2]  [3]  [4]  [5]  [6]  [7]  [8]  [9]
      22   13   -5   -8   15   60   17   31   47

A good way of keeping track of recursive calls is to build a tree by adding a node each time a new call is made. For this algorithm each node has four items of information: i, j, max, and min. On the array a[] above, the following tree of recursive calls of MaxMin is produced:

(1, 9, 60, -8)
├── (1, 5, 22, -8)
│   ├── (1, 3, 22, -5)
│   │   ├── (1, 2, 22, 13)
│   │   └── (3, 3, -5, -5)
│   └── (4, 5, 15, -8)
└── (6, 9, 60, 17)
    ├── (6, 7, 60, 17)
    └── (8, 9, 47, 31)

We see that the root node contains 1 and 9 as the values of i and j, corresponding to the initial call to MaxMin. This execution produces two new calls to MaxMin, where i and j have the values 1, 5 and 6, 9, respectively, thus splitting the set into two subsets of approximately the same size. From the tree we can immediately see that the maximum depth of recursion is four (including the first call). The order in which max and min are assigned values is as follows: [1,2,22,13], [3,3,-5,-5], [1,3,22,-5], [4,5,15,-8], [1,5,22,-8], [6,7,60,17], [8,9,47,31], [6,9,60,17], [1,9,60,-8].

Number of element comparisons needed for MaxMin: if T(n) represents the number of element comparisons, the resulting recurrence relation is:

T(n) = T(⌈n/2⌉) + T(⌊n/2⌋) + 2    for n > 2
T(n) = 1                          for n = 2
T(n) = 0                          for n = 1


When n is a power of 2, n = 2^k, we can solve this equation by successive substitutions:

T(n) = 2T(n/2) + 2
     = 2(2T(n/4) + 2) + 2
     = 4T(n/4) + 4 + 2
     ...
     = 2^(k-1) T(2) + Σ 2^i   (summing over 1 ≤ i ≤ k-1)
     = 2^(k-1) + 2^k - 2
     = 3n/2 - 2

When n is a power of 2, the number of comparisons in the best, average and worst case is therefore 3n/2 - 2.

COMPLEXITY ANALYSIS OF FINDING MAXIMUM & MINIMUM

Consider again the nine elements used to simulate MaxMin:

22  13  -5  -8  15  60  17  31  47

The tree constructed for this run is the one shown above: each node carries the four items of information i, j, max and min, and the nodes can be numbered 1 through 9 in the order in which their max/min values are computed. Consider the total number of element comparisons needed by MaxMin. If T(n) represents this number, the resulting recurrence relation is

T(n) = T(⌈n/2⌉) + T(⌊n/2⌋) + 2    for n > 2
T(n) = 1                          for n = 2
T(n) = 0                          for n = 1


When n is a power of 2, n = 2^k for some positive integer k, solving this recurrence as before gives T(n) = 3n/2 - 2. Therefore 3n/2 - 2 is the best, average and worst case number of element comparisons when n is a power of two.

Now consider the count when comparisons between the indices i and j have the same cost as element comparisons. Let C(n) be this number. Assuming n = 2^k for some positive integer k, we get

C(n) = 2C(n/2) + 3    for n > 2
C(n) = 2              for n = 2

Solving by successive substitutions:

C(n) = 2C(n/2) + 3
     = 4C(n/4) + 6 + 3
     ...
     = 2^(k-1) C(2) + 3 Σ 2^i   (summing over 0 ≤ i ≤ k-2)
     = 2^k + 3·2^(k-1) - 3
     = 5n/2 - 3

If comparisons between array elements are costlier than comparisons of integer variables, the divide-and-conquer method is therefore more efficient. In both cases mentioned above, the best case, average case and worst case complexity is θ(n).

RECURSION TREE

DEFINITION: A recursion tree is a tree that depicts the entire recursion process. Consider the following recurrence relation:

T(n) = 3T(n/4) + θ(n^2) = 3T(n/4) + cn^2

[Figure: the recursion tree for this recurrence. The root costs cn^2 and has three children, each costing c(n/4)^2; each of those has three children costing c(n/16)^2, and so on down to the T(1) leaves.]

The subproblem size at depth i is n/4^i. The boundary condition (subproblem size 1) gives n/4^i = 1, i.e. i = log4 n, so the depth of the tree is log4 n + 1 (levels 0, 1, 2, ..., log4 n). The cost of each node at level i is c(n/4^i)^2. The number of nodes at the last level, log4 n, is 3^(log4 n), each contributing T(1), for a total last-level cost of θ(3^(log4 n)) = θ(n^(log4 3)).

Total cost = cn^2 + (3/16)cn^2 + (3/16)^2 cn^2 + ... + (3/16)^(log4 n - 1) cn^2 + θ(n^(log4 3))

           = Σ (3/16)^i cn^2   (summing over 0 ≤ i ≤ log4 n - 1) + θ(n^(log4 3))
           < cn^2 · 1/(1 - 3/16) + θ(n^(log4 3))
           = (16/13) cn^2 + θ(n^(log4 3))
           = O(n^2)

BINARY SEARCH ALGORITHMS

Binary search requires an ordered list.

Iterative Algorithm

    int find (const list, int target)
    // pre: list is sorted in ascending order
    // post: ITERATIVE binary search returns the index of the target element, else -1
    {
        int mid;
        int first = 0;
        int last = list.length() - 1;
        while (first <= last)
        {
            mid = (first + last) / 2;
            if (list[mid] == target) return mid;
            if (list[mid] > target) last = mid - 1;
            else first = mid + 1;
        }
        return -1;
    }

Recursive Algorithm

    int find (const list, int target, int first, int last)
    // pre: list is sorted in ascending order
    // post: RECURSIVE binary search returns the index of the target element, else -1
    {
        if (first > last) return -1;
        int mid = (first + last) / 2;
        if (list[mid] == target) return mid;
        if (list[mid] < target) return find(list, target, mid + 1, last);
        return find(list, target, first, mid - 1);
    }

Complexity Analysis of Binary Search

For binary search, storage is required for the n elements of the array plus the variables low, high, mid and x. A binary decision tree can be drawn to explain binary search, in which the value at each node is the index compared at that step. For example, if n = 14 the resulting tree is:

    7
    ├── 3
    │   ├── 1
    │   │   └── 2
    │   └── 5
    │       ├── 4
    │       └── 6
    └── 11
        ├── 9
        │   ├── 8
        │   └── 10
        └── 13
            ├── 12
            └── 14

(The square external nodes for unsuccessful searches, which hang below these internal nodes, are omitted here.)

The first comparison is x with a[7]. If x < a[7], the next comparison is with a[3]; similarly, if x > a[7], the next comparison is with a[11]. If x is present, the algorithm will end at one of the circular (internal) nodes, which gives the index into the array where x was found. If x is not present, the algorithm will terminate at one of the square (external) nodes. The number of comparisons needed to find an element represented by an internal node is one more than the distance of this node from the root.

Theorem: If n is in the range [2^(k-1), 2^k), then binary search makes at most k element comparisons for a successful search and either k - 1 or k comparisons for an unsuccessful search. In other words, the time for a successful search is O(log n) and for an unsuccessful search is θ(log n).

Let As(n) be the average number of comparisons in a successful search and Au(n) the average number of comparisons in an unsuccessful search. The number of comparisons on any path from the root to an external node is equal to the distance between the root and that external node. Let E be the external path length (the sum of the distances of all external nodes from the root) and I the internal path length (the sum of the distances of all internal nodes from the root). Then

E = I + 2n.

Since every binary tree with n internal nodes has n + 1 external nodes,

As(n) = 1 + I/n
Au(n) = E/(n + 1)

and therefore

As(n) = 1 + (E - 2n)/n
      = 1 + (Au(n)(n + 1) - 2n)/n
      = Au(n)(1 + 1/n) - 1.

Successful searches: best θ(1) (only one element is compared), average θ(log n), worst θ(log n).
Unsuccessful searches: best, average and worst are all θ(log n).

DIVIDE-AND-CONQUER MATRIX MULTIPLICATION ALGORITHM

The product of two n x n matrices X and Y is a third n x n matrix Z = XY, with (i, j)th entry

Z_ij = Σ X_ik Y_kj   (summing over k = 1 to n).

In general XY is not the same as YX; matrix multiplication is not commutative. The formula above implies an O(n^3) algorithm for matrix multiplication: there are n^2 entries to be computed, and each takes linear time. For quite a while, this was widely believed to be the best running time possible, and it was even proved that no algorithm which used just additions and multiplications could do better. It was therefore a source of great excitement when, in 1969, Strassen announced a significantly more efficient algorithm, based upon divide-and-conquer.

Matrix multiplication is particularly easy to break into subproblems, because it can be performed blockwise. To see what this means, carve X into four n/2 x n/2 blocks, and also Y:

X = | A  B |        Y = | E  F |
    | C  D |            | G  H |

Then their product can be expressed in terms of these blocks, exactly as if the blocks were single elements:

XY = | A  B | | E  F |  =  | AE + BG   AF + BH |
     | C  D | | G  H |     | CE + DG   CF + DH |

We now have a divide-and-conquer strategy: to compute the size-n product XY, recursively compute eight size-n/2 products AE, BG, AF, BH, CE, DG, CF, DH, and then do a few O(n^2)-time additions. The total running time is described by the recurrence relation T(n) = 8T(n/2) + O(n^2), which comes out to O(n^3), the same as for the default algorithm. However, an improvement in the time bound is possible, and as with integer multiplication, it relies upon algebraic tricks. It turns out that XY can be computed from just seven subproblems:

P1 = A(F - H)
P2 = (A + B)H
P3 = (C + D)E
P4 = D(G - E)
P5 = (A + D)(E + H)
P6 = (B - D)(G + H)
P7 = (A - C)(E + F)

XY = | P5 + P4 - P2 + P6      P1 + P2            |
     | P3 + P4                P1 + P5 - P3 - P7  |

This translates into a running time of T(n) = 7T(n/2) + O(n^2), which comes out to O(n^(log2 7)) ≈ O(n^2.81).

STRASSEN'S MATRIX MULTIPLICATION

Strassen showed that 2x2 matrix multiplication can be accomplished in 7 multiplications and 18 additions or subtractions. The reduction is achieved by the divide-and-conquer approach:

1. Divide the input data S into two or more disjoint subsets S1, S2, ....
2. Solve the subproblems recursively.
3. Combine the solutions for S1, S2, ..., into a solution for S.

The base cases for the recursion are subproblems of constant size, and the analysis can be done using recurrence equations.

The idea is to divide the matrices into sub-matrices and recursively multiply the sub-matrices. Let A, B be two square matrices over a ring R. We want to calculate the matrix product C = AB. If the matrices A, B are not of type 2^n x 2^n, we fill the missing rows and columns with zeros. We partition A, B and C into equally sized block matrices

A = | A1,1  A1,2 |    B = | B1,1  B1,2 |    C = | C1,1  C1,2 |
    | A2,1  A2,2 |        | B2,1  B2,2 |        | C2,1  C2,2 |

with C1,1 = A1,1 B1,1 + A1,2 B2,1, C1,2 = A1,1 B1,2 + A1,2 B2,2, C2,1 = A2,1 B1,1 + A2,2 B2,1, and C2,2 = A2,1 B1,2 + A2,2 B2,2. With this construction we have not reduced the number of multiplications: we still need 8 multiplications to calculate the Ci,j matrices, the same number we need when using standard matrix multiplication. Now comes the important part: we define new matrices M1, ..., M7.
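The definitions of these seven matrices appeared only as figures in the original notes; in the standard formulation of Strassen's algorithm they are:

M1 = (A1,1 + A2,2)(B1,1 + B2,2)
M2 = (A2,1 + A2,2) B1,1
M3 = A1,1 (B1,2 - B2,2)
M4 = A2,2 (B2,1 - B1,1)
M5 = (A1,1 + A1,2) B2,2
M6 = (A2,1 - A1,1)(B1,1 + B1,2)
M7 = (A1,2 - A2,2)(B2,1 + B2,2)

and the result blocks are recovered as

C1,1 = M1 + M4 - M5 + M7
C1,2 = M3 + M5
C2,1 = M2 + M4
C2,2 = M1 - M2 + M3 + M6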

Because of our definition of the Mk we can eliminate one matrix multiplication and reduce the number of multiplications to 7 (one multiplication for each Mk), expressing the Ci,j in terms of the Mk as shown above. We iterate this division process recursively until the submatrices degenerate into numbers (ring elements). Practical implementations of Strassen's algorithm switch to standard methods of matrix multiplication for small enough submatrices, for which the standard methods are more efficient. The particular crossover point for which Strassen's algorithm is more efficient depends on the specific implementation and hardware; it has been estimated that Strassen's algorithm is faster for matrices with widths from 32 to 128 for optimized implementations, and 60,000 or more for basic implementations.

Numerical analysis

The standard matrix multiplication takes approximately 2n^3 arithmetic operations (additions and multiplications); its asymptotic complexity is O(n^3).

For the standard algorithm,

C_ij = Σ a_ik b_kj   (summing over k = 1 to N),

and thus

T(N) = Σ_i Σ_j Σ_k c = cN^3 = O(N^3).

The number of additions and multiplications required in the Strassen algorithm can be calculated as follows: let f(k) be the number of operations for a 2^k x 2^k matrix. Then, by recursive application of the Strassen algorithm, we see that f(k) = 7f(k - 1) + l·4^k, for some constant l that depends on the number of additions performed at each application of the algorithm. Hence f(k) = (7 + o(1))^k, i.e. the asymptotic complexity for multiplying matrices of size n = 2^k using the Strassen algorithm is O(n^(log2 7)) ≈ O(n^2.807). The reduction in the number of arithmetic operations, however, comes at the price of a somewhat reduced numerical stability.

Algorithm (the standard divide-and-conquer multiplication, with the eight recursive block products):

    void matmul(int *A, int *B, int *R, int n)
    {
        if (n == 1) {
            (*R) += (*A) * (*B);
        } else {
            matmul(A,           B,           R,           n/4);
            matmul(A,           B + (n/4),   R + (n/4),   n/4);
            matmul(A + (n/4),   B + 2*(n/4), R,           n/4);
            matmul(A + (n/4),   B + 3*(n/4), R + (n/4),   n/4);
            matmul(A + 2*(n/4), B,           R + 2*(n/4), n/4);
            matmul(A + 2*(n/4), B + (n/4),   R + 3*(n/4), n/4);
            matmul(A + 3*(n/4), B + 2*(n/4), R + 2*(n/4), n/4);
            matmul(A + 3*(n/4), B + 3*(n/4), R + 3*(n/4), n/4);
        }
    }
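A usage note on the code above (this driver is my sketch, not part of the original): n is the number of elements per matrix, and each matrix is assumed to be stored quadrant by quadrant (each quadrant a contiguous block of n/4 elements, itself laid out the same way recursively), so that A, A + n/4, A + 2*(n/4) and A + 3*(n/4) address the four quadrants. Since results are accumulated with +=, R must be zeroed first:

    #include <string.h>

    void matmul(int *A, int *B, int *R, int n);   /* as defined above */

    void multiply(int *A, int *B, int *R, int total_elems)
    {
        memset(R, 0, total_elems * sizeof(int));  /* clear the accumulator */
        matmul(A, B, R, total_elems);
    }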

MERGE SORT

Merge sort is an example of a divide-and-conquer algorithm; its worst case complexity is O(n log n). To understand merge sort, assume a sequence of n elements a[1], ..., a[n] that is to be arranged into nondecreasing order. Split them into two sets a[1], ..., a[⌊n/2⌋] and a[⌊n/2⌋+1], ..., a[n]. Each set is individually sorted, and the resulting sorted sequences are merged to produce a single sorted sequence of n elements. The merge-sort algorithm can be described in general terms as consisting of the following three steps:

1. Divide Step: If the given array A has zero or one element, it is already sorted. Otherwise, divide A into two arrays, each containing about half of the elements of A.
2. Recursion Step: Recursively sort arrays A1 and A2.
3. Conquer Step: Combine the elements back in A by merging the sorted arrays A1 and A2 into a sorted sequence.

Consider the following example. Take the array of ten elements a[1:10] = (310, 285, 179, 652, 351, 423, 861, 254, 450, 520). Merge sort first divides the array into a[1:5] and a[6:10]; then into a[1:3], a[4:5], a[6:8] and a[9:10]; then into a[1:2], a[3:3], a[6:7] and a[8:8]; and finally into the one-element subarrays a[1:1], a[2:2], ..., a[10:10]. Pictorially the file can now be viewed as

(310 | 285 | 179 | 652, 351 | 423, 861, 254, 450, 520)

Elements a[1] and a[2] are merged to give

(285, 310 | 179 | 652, 351 | 423, 861, 254, 450, 520)

Then a[3] is merged with a[1:2]:

(179, 285, 310 | 652, 351 | 423, 861, 254, 450, 520)

Next a[4] and a[5] are merged:

(179, 285, 310 | 351, 652 | 423, 861, 254, 450, 520)

and then a[1:3] and a[4:5]:

(179, 285, 310, 351, 652 | 423, 861, 254, 450, 520)

Repeated recursive calls on the second half produce (179, 285, 310, 351, 652 | 254, 423, 450, 520, 861), and the final merge gives the fully sorted result

(179, 254, 285, 310, 351, 423, 450, 520, 652, 861).

If the time for the merging operation is proportional to n, then the computing time for merge sort is described by the recurrence relation

T(n) = a                 for n = 1, a a constant
T(n) = 2T(n/2) + cn      for n > 1, c a constant

When n is a power of 2, n = 2^k, we can solve this equation by successive substitutions:

T(n) = 2(2T(n/4) + cn/2) + cn
     = 4T(n/4) + 2cn
     = 4(2T(n/8) + cn/4) + 2cn
     ...
     = 2^k T(1) + kcn
     = an + cn log n

It is easy to see that if 2^k < n ≤ 2^(k+1), then T(n) ≤ T(2^(k+1)); therefore T(n) = O(n log n).

The algorithm MergeSort describes the process very succinctly using recursion and a function Merge which merges two sorted sets. Before executing MergeSort, the n elements should be placed in a[1:n]; then MergeSort(1, n) causes the keys to be rearranged into nondecreasing order in a.

Algorithm MergeSort(low, high)
// a[low:high] is a global array to be sorted.
// Small(P) is true if there is only one element to sort; in this case the list is already sorted.
{
    if (low < high) then    // If there are more than one element:
    {
        // Divide P into subproblems. Find where to split the set:
        mid := ⌊(low + high)/2⌋;
        // Solve the subproblems:
        MergeSort(low, mid);
        MergeSort(mid + 1, high);
        // Combine the solutions:
        Merge(low, mid, high);
    }
}

Algorithm Merge(low, mid, high)
// a[low:high] is a global array containing two sorted subsets, in a[low:mid]
// and in a[mid+1:high]. The goal is to merge these two sets into a single
// set residing in a[low:high]. b[] is an auxiliary global array.
{
    h := low; i := low; j := mid + 1;
    while ((h ≤ mid) and (j ≤ high)) do
    {
        if (a[h] ≤ a[j]) then
        {
            b[i] := a[h]; h := h + 1;
        }
        else

        {
            b[i] := a[j]; j := j + 1;
        }
        i := i + 1;
    }
    if (h > mid) then
        for k := j to high do
        {
            b[i] := a[k]; i := i + 1;
        }
    else
        for k := h to mid do
        {
            b[i] := a[k]; i := i + 1;
        }
    for k := low to high do a[k] := b[k];
}

COMPLEXITY ANALYSIS OF MERGE SORT

Merge sort is an ideal example of the divide-and-conquer strategy, in which the given set of elements is split into two equal-sized sets and the combining operation is the merging of two sorted sets into one. Given a sequence of n elements a[1], ..., a[n], the general idea is to imagine them split into two sets a[1], ..., a[⌊n/2⌋] and a[⌊n/2⌋+1], ..., a[n]. Each set is individually sorted and the resulting sorted sequences are merged to produce a single sorted sequence of n elements.

Example: Consider again the array of ten elements a[1:10] = (310, 285, 179, 652, 351, 423, 861, 254, 450, 520). Algorithm MergeSort begins by splitting a[] into two subarrays, each of size five (a[1:5] and a[6:10]). The elements in a[1:5] are then split into two subarrays of size three (a[1:3]) and two (a[4:5]); the items in a[1:3] are split into subarrays of size two (a[1:2]) and one (a[3:3]); and the two values in a[1:2] are split a final time into one-element subarrays. Now the merging begins; a record of the subarrays is implicitly maintained by the recursive mechanism. Pictorially the file can be viewed as

(310 | 285 | 179 | 652, 351 | 423, 861, 254, 450, 520)

where vertical bars indicate the boundaries of subarrays. Elements a[1] and a[2] are merged to yield

(285, 310 | 179 | 652, 351 | 423, 861, 254, 450, 520)

Then a[3] is merged with a[1:2], giving

(179, 285, 310 | 652, 351 | 423, 861, 254, 450, 520)

Next, elements a[4] and a[5] are merged:

(179, 285, 310 | 351, 652 | 423, 861, 254, 450, 520)

and then a[1:3] and a[4:5]:

(179, 285, 310, 351, 652 | 423, 861, 254, 450, 520)

At this point the algorithm has returned to the first invocation of MergeSort and is about to process the second recursive call. Repeated recursive calls are invoked, producing the following subarrays:

(179, 285, 310, 351, 652 | 423 | 861 | 254 | 450, 520)

Elements a[6] and a[7] are merged; then a[8] is merged with a[6:7]:

(179, 285, 310, 351, 652 | 254, 423, 861 | 450, 520)

Next a[9] and a[10] are merged, and then a[6:8] and a[9:10]:

(179, 285, 310, 351, 652 | 254, 423, 450, 520, 861)

At this point there are two sorted subarrays, and the final merge produces the fully sorted result

(179, 254, 285, 310, 351, 423, 450, 520, 652, 861).

Figure (1.1) is a tree that represents the sequence of recursive calls produced by MergeSort when it is applied to ten elements; the pair of values in each node are the values of the parameters low and high. The splitting continues until sets containing a single element are produced. Figure (1.2) is a tree representing the calls to Merge by MergeSort; for example, the node containing 1, 2 and 3 represents the merging of a[1:2] with a[3].

Tree of calls of MergeSort(1, 10)    (Fig 1.1)

(1,10)
├── (1,5)
│   ├── (1,3)
│   │   ├── (1,2)
│   │   │   ├── (1,1)
│   │   │   └── (2,2)
│   │   └── (3,3)
│   └── (4,5)
│       ├── (4,4)
│       └── (5,5)
└── (6,10)
    ├── (6,8)
    │   ├── (6,7)
    │   │   ├── (6,6)
    │   │   └── (7,7)
    │   └── (8,8)
    └── (9,10)
        ├── (9,9)
        └── (10,10)

Tree of calls of Merge    (Fig 1.2)

(1,5,10)
├── (1,3,5)
│   ├── (1,2,3)
│   │   └── (1,1,2)
│   └── (4,4,5)
└── (6,8,10)
    ├── (6,7,8)
    │   └── (6,6,7)
    └── (9,9,10)

If the time for the merging operation is proportional to n, then the computing time for merge sort is described by the recurrence relation

T(n) = a                 for n = 1, a a constant
T(n) = 2T(n/2) + cn      for n > 1, c a constant

When n is a power of 2, n = 2^k, we can solve this equation by successive substitutions:

T(n) = 2(2T(n/4) + cn/2) + cn
     = 4T(n/4) + 2cn
     = 4(2T(n/8) + cn/4) + 2cn
     ...
     = 2^k T(1) + kcn
     = an + cn log n

It is easy to see that if 2^k < n ≤ 2^(k+1), then T(n) ≤ T(2^(k+1)); therefore T(n) = O(n log n).

QUICK SORT

Quicksort is one of the fastest and simplest sorting algorithms. It works recursively by a divide-and-conquer strategy.

Idea

First, the sequence to be sorted, a, is partitioned into two parts, such that all elements of the first part b are less than or equal to all elements of the second part c (divide). Then the two parts are sorted separately by recursive application of the same procedure (conquer). Recombination of the two parts yields the sorted sequence (combine). Figure 1 illustrates this approach.

[Figure 1: Quicksort(n)]

The first step of the partition procedure is choosing a comparison element x. All elements of the sequence that are less than x are placed in the first part, and all elements greater than x are placed in the second part. For elements equal to x it does not matter into which part they come; in the following algorithm it may also happen that an element equal to x remains between the two parts.

Program

The following Java program implements quicksort.

    void quicksort (int[] a, int lo, int hi)
    {
        // lo is the lower index, hi is the upper index
        // of the region of array a that is to be sorted
        int i = lo, j = hi, h;
        int x = a[(lo + hi) / 2];

        // partition
        do
        {
            while (a[i] < x) i++;
            while (a[j] > x) j--;
            if (i <= j)
            {
                h = a[i]; a[i] = a[j]; a[j] = h;
                i++; j--;
            }
        } while (i <= j);

        // recursion
        if (lo < j) quicksort(a, lo, j);
        if (i < hi) quicksort(a, i, hi);
    }

The recursion ends whenever a part consists of one element only.

Analysis

The best-case behavior of the quicksort algorithm occurs when in each recursion step the partitioning produces two parts of equal length.

In this case, in order to sort n elements, the running time is in Θ(n log n). This is because the recursion depth is log n and on each level there are n elements to be treated (Figure 2a). The worst case occurs when in each recursion step an unbalanced partitioning is produced, namely one part consisting of only one element and the other part consisting of the rest of the elements (Figure 2c). Then the recursion depth is n - 1 and quicksort runs in time Θ(n^2).

The choice of the comparison element x determines which partition is achieved. Suppose the first element of the sequence is chosen as comparison element; this would lead to the worst-case behavior of the algorithm when the sequence is initially sorted. Therefore, it is better to choose the element in the middle of the sequence as comparison element. Even better would be to take the n/2-th greatest element of the sequence (the median); then the optimal partition is achieved. Actually, it is possible to compute the median in linear time [AHU 74], and this variant of quicksort would run in time O(n log n) even in the worst case. In the average case, however, a partitioning as shown in Figure 2b is to be expected, and it turns out that even in its simple form quicksort runs in O(n log n) on the average. We trade this for the (rare) worst-case behavior of Θ(n^2).

Proposition: The time complexity of quicksort is Θ(n log n) in the average case and Θ(n^2) in the worst case.

Conclusions

Quicksort turns out to be the fastest sorting algorithm in practice. It has a time complexity of Θ(n log n) on the average, the constant hidden in the O-notation is small, and the beauty of quicksort lies in its simplicity. However, in the (very rare) worst case quicksort is as slow as Bubblesort. There are sorting algorithms, e.g. Heapsort and Mergesort, with a time complexity of O(n log n) even in the worst case, but on the average these algorithms are slower than quicksort by a constant factor. It is possible to obtain a worst-case complexity of O(n log n) with a variant of quicksort (by choosing the median as comparison element), but this algorithm is, on the average and in the worst case, slower than Heapsort or Mergesort by a constant factor, so it is not interesting in practice.
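One common realization of the "choose from the middle" advice is median-of-three pivot selection. The following C sketch (mine, not from the text above) orders three samples in place and returns the index of their median:

    int median_of_three(int a[], int lo, int hi)
    {
        int mid = lo + (hi - lo) / 2, t;
        /* Order the three samples so that a[lo] <= a[mid] <= a[hi]. */
        if (a[lo] > a[mid]) { t = a[lo];  a[lo]  = a[mid]; a[mid] = t; }
        if (a[lo] > a[hi])  { t = a[lo];  a[lo]  = a[hi];  a[hi]  = t; }
        if (a[mid] > a[hi]) { t = a[mid]; a[mid] = a[hi];  a[hi]  = t; }
        return mid;   /* a[mid] now holds the median; use it as the pivot */
    }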

The quicksort algorithm was invented by C. A. R. Hoare in 1962. I learned about it from "The C Programming Language, Second Edition" by Brian W. Kernighan and Dennis M. Ritchie (page 87). I found the K&R code a bit hard to follow, and I believe I've come upon a better way of explaining the algorithm, using a few figures of speech.

The point of any sort routine is to take an array of objects and arrange the objects in some kind of sequence. For example, we might want to take this collection of numbers:

06 34 69 33 75 64 04 74 25 95 15 58 78 36 51 73 13 27

and put them into numerical order, with the smallest number first, like this:

04 06 13 15 25 27 33 34 36 51 58 64 69 73 74 75 78 95

The general approach of the quick sort is to select an element from the middle (or from close to the middle) of the array and then put all other elements which are less than or equal to the selected element to its left and all elements which are greater than the selected element to its right. The selected element is called the "pivot element" because the other elements, figuratively speaking, turn around it. This crude "sorting" around the pivot element yields two sub-arrays: a left one and a right one. If, for example, we start with the original array above and use "25" as the pivot element, our first sorting yields:

[ 04 06 15 13 ] 25 [ 64 34 74 69 95 33 58 78 36 51 73 75 27 ]

The next step is for quicksort to call itself to have the left and right sub-arrays sorted. Of course, a "sub-array" with zero elements or only one element is already in proper order and does not need to be sorted.

The interesting part of the quick sort is how it comes up with the sub-arrays, which I will illustrate. From our original array, we take out the element in (or close to) the middle for a pivot element. This, figuratively speaking, leaves a hole in the array:

06 34 69 33 75 64 04 74 __ 95 15 58 78 36 51 73 13 27    p.e. = 25

Now "move the hole" to the first position in the array by moving the current occupant of that position into the hole's position:

__ 34 69 33 75 64 04 74 06 95 15 58 78 36 51 73 13 27    p.e. = 25

Now we get to work. We start with the first element to the right of the hole. If it is greater than the pivot element, we do nothing. If it is less than or equal to the pivot element, we put the element in the hole and then re-establish the hole immediately to the right of the moved element. The first element of our sample array that needs to be moved is "04":

04 __ 69 33 75 64 34 74 06 95 15 58 78 36 51 73 13 27    p.e. = 25

After we similarly process "06" we will have:

04 06 __ 33 75 64 34 74 69 95 15 58 78 36 51 73 13 27    p.e. = 25

At the end of the process, we have:

04 06 15 13 __ 64 34 74 69 95 33 58 78 36 51 73 75 27    p.e. = 25

As you can see, all elements to the left of the hole are less than the pivot element, and all elements to the right of the hole are greater than the pivot element. To finish up, we stuff the pivot element back into the hole, and we are done: we then sort the left sub-array and the right sub-array independently.

In Horowitz-Sahni pseudocode, the partitioning is written as follows. It is assumed that a[p] ≥ a[m] and that a[m] is the partitioning element.

Algorithm Partition(a, m, p)
// Within a[m], a[m+1], ..., a[p-1] the elements are rearranged in such a
// manner that if initially t = a[m], then after completion a[q] = t for
// some q between m and p-1, a[k] ≤ t for m ≤ k < q, and a[k] ≥ t for
// q < k < p. q is returned. Set a[p] = ∞.
{
    v := a[m]; i := m; j := p;
    repeat
    {
        repeat
            i := i + 1;
        until (a[i] ≥ v);
        repeat
            j := j - 1;
        until (a[j] ≤ v);

        if (i < j) then Interchange(a, i, j);
    } until (i ≥ j);
    a[m] := a[j]; a[j] := v;
    return j;
}

Algorithm Interchange(a, i, j)
// Exchange a[i] with a[j].
{
    p := a[i];
    a[i] := a[j];
    a[j] := p;
}

In quick sort, the division into subarrays is made so that the sorted subarrays do not need to be merged later. This is accomplished by rearranging the elements in a[1:n] such that a[i] ≤ a[j] for all i between 1 and m and all j between m+1 and n, for some m, 1 ≤ m ≤ n. Thus the elements in a[1:m] and a[m+1:n] can be independently sorted, and no merge is needed. The rearrangement is accomplished by picking some element of a[], say t = a[s], and then reordering the other elements so that all elements appearing before t in a[1:n] are less than or equal to t and all elements appearing after t are greater than or equal to t. This rearranging is referred to as partitioning.

The function Partition above accomplishes an in-place partitioning of the elements of a[m:p-1]. If m = 1 and p - 1 = n, then a[n+1] must be defined and must be greater than or equal to all elements in a[1:n]. The assumption that a[m] is the partition element is merely for convenience; other choices for the partitioning element than the first item in the set are better in practice. The function Interchange(a, i, j) exchanges a[i] with a[j].

Using Hoare's clever method of partitioning a set of elements about a chosen element, we can directly devise a divide-and-conquer method for completely sorting n elements. Following a call to the function Partition, two sets S1 and S2 are produced. All elements in S1 are less than or equal to the elements in S2, hence S1 and S2 can be sorted independently. Each set is sorted by reusing the function Partition.

Algorithm QuickSort(p, q)
// Sorts the elements a[p], ..., a[q], which reside in the global array
// a[1:n], into ascending order; a[n+1] is considered to be defined and
// must be ≥ all the elements in a[1:n].
{
    if (p < q) then    // If there are more than one element
    {
        // Divide P into two subproblems:
        j := Partition(a, p, q + 1);
        // j is the position of the partitioning element.
        // Solve the subproblems:
        QuickSort(p, j - 1);
        QuickSort(j + 1, q);
        // There is no need for combining solutions.
    }
}

Quicksort

[Figure: Quicksort in action on a list of numbers; the horizontal lines are pivot values.]

Class:                        Sorting algorithm
Data structure:               Varies
Worst case performance:       Θ(n^2)
Best case performance:        Θ(n log n)
Average case performance:     Θ(n log n) comparisons
Worst case space complexity:  Varies by implementation
Optimal:                      Sometimes

Quicksort is a well-known sorting algorithm developed by C. A. R. Hoare that, on average, makes Θ(n log n) (big O notation) comparisons to sort n items. However, in the worst case, it makes Θ(n^2) comparisons. Typically, quicksort is significantly faster in practice than other Θ(n log n) algorithms, because its inner loop can be efficiently implemented on most architectures, and in most real-world data it is possible to make design choices which minimize the probability of requiring quadratic time.

Quicksort is a comparison sort and, in efficient implementations, is not a stable sort.

Algorithm

Quicksort sorts by employing a divide and conquer strategy to divide a list into two sub-lists. The steps are:

1. Pick an element, called a pivot, from the list.
2. Reorder the list so that all elements which are less than the pivot come before the pivot and all elements greater than the pivot come after it (equal values can go either way). After this partitioning, the pivot is in its final position. This is called the partition operation.
3. Recursively sort the sub-list of lesser elements and the sub-list of greater elements.

The base cases of the recursion are lists of size zero or one, which are always sorted. In simple pseudocode, the algorithm might be expressed as this:

    function quicksort(array)
        var list less, greater
        if length(array) ≤ 1
            return array
        select and remove a pivot value pivot from array
        for each x in array
            if x ≤ pivot then append x to less
            else append x to greater
        return concatenate(quicksort(less), pivot, quicksort(greater))

Notice that we only examine elements by comparing them to other elements; this makes quicksort a comparison sort. This version is also a stable sort (assuming that the "for each" method retrieves elements in original order, and the pivot selected is the last among those of equal value).

The correctness of the partition algorithm is based on the following two arguments:

At each iteration, all the elements processed so far are in the desired position: before the pivot if less than or equal to the pivot's value, after the pivot otherwise (loop invariant). Each iteration leaves one fewer element to be processed (loop variant).

The correctness of the overall algorithm follows from inductive reasoning: for zero or one element, the algorithm leaves the data unchanged; for a larger data set it produces the concatenation of two parts, elements less than or equal to the pivot and elements greater than it, themselves sorted by the recursive hypothesis.

The disadvantage of the simple version above is that it requires Ω(n) extra storage space, which is as bad as merge sort. The additional memory allocations required can also drastically impact speed and cache performance in practical implementations. There is a more complex version which uses an in-place partition algorithm and can achieve the complete sort using O(log n) space on average (for the call stack):

    function partition(array, left, right, pivotIndex)
        pivotValue := array[pivotIndex]
        swap array[pivotIndex] and array[right]   // Move pivot to end
        storeIndex := left
        for i from left to right - 1
            if array[i] ≤ pivotValue
                swap array[i] and array[storeIndex]
                storeIndex := storeIndex + 1
        swap array[storeIndex] and array[right]   // Move pivot to its final place
        return storeIndex

This is the in-place partition algorithm. It partitions the portion of the array between indexes left and right, inclusively, by moving all elements less than or equal to array[pivotIndex] to the beginning of the subarray, leaving all the greater elements following them. In the process it also finds the final position for the pivot element, which it returns. It temporarily moves the pivot element to the end of the subarray, so that it doesn't get in the way. Because it only uses exchanges, the final list has the same elements as the original list. Notice that an element may be exchanged multiple times before reaching its final place.

[Figure: in-place partition in action on a small list. The boxed element is the pivot element, blue elements are less or equal, and red elements are larger.]

Once we have this, writing quicksort itself is easy:

    procedure quicksort(array, left, right)
        if right > left
            select a pivot index (e.g. pivotIndex := left)
            pivotNewIndex := partition(array, left, right, pivotIndex)
            quicksort(array, left, pivotNewIndex - 1)
            quicksort(array, pivotNewIndex + 1, right)

However, since partition reorders elements within a partition, this version of quicksort is not a stable sort. This form of the partition algorithm is not the original form; multiple variations can be found in various textbooks, such as versions not having the storeIndex. However, this form is probably the easiest to understand.

Formal analysis

From the initial description it's not obvious that quicksort takes Θ(n log n) time on average. It's not hard to see that the partition operation, which simply loops over the elements of the array once, uses Θ(n) time. In versions that perform concatenation, this operation is also Θ(n). In the best case, each time we perform a partition we divide the list into two nearly equal pieces. This means each recursive call processes a list of half the size; consequently, we can make only log n nested calls before we reach a list of size 1. This means that the depth of the call tree is Θ(log n). But no two calls at the same level of the call tree process the same part of the original list, and thus each level of calls needs only Θ(n) time all together (each call has some constant overhead, but since there are only Θ(n) calls at each level, this is subsumed in the Θ(n) factor). The result is that the algorithm uses only Θ(n log n) time.

An alternate approach is to set up a recurrence relation for T(n), the time needed to sort a list of size n. Because a single quicksort call involves Θ(n) work plus two recursive calls on lists of size n/2 in the best case, the relation would be:

T(n) = Θ(n) + 2T(n/2)

The master theorem tells us that T(n) = Θ(n log n).


In fact, it's not necessary to divide the list this precisely; even if each pivot splits the elements with 99% on one side and 1% on the other (or any other fixed fraction), the call depth is still limited to 100 log n, so the total running time is still Θ(n log n).

In the worst case, however, the two sublists have size 1 and n - 1 (for example, if the array consists of the same element by value), and the call tree becomes a linear chain of n nested calls. The ith call does Θ(n - i) work, and the sum of n - i over all i is Θ(n^2). The recurrence relation is:

T(n) = Θ(n) + T(0) + T(n - 1) = O(n) + T(n - 1)

This is the same relation as for insertion sort and selection sort, and it solves to T(n) = Θ(n^2). Given knowledge of which comparisons are performed by the sort, there are adaptive algorithms that are effective at generating worst-case input for quicksort on-the-fly, regardless of the pivot selection strategy.

Randomized quicksort expected complexity
Randomized quicksort has the desirable property that it requires only Θ(n log n) expected time, regardless of the input. Suppose we sort the list and then divide it into four parts. The two parts in the middle will contain the best pivots: each of them is larger than at least 25% of the elements and smaller than at least 25% of the elements. If we could consistently choose an element from these two middle parts, we would only have to split the list at most 2 log2 n times before reaching lists of size 1, yielding an Θ(n log n) algorithm. A random choice will only choose from these middle parts half the time. However, this is good enough. Imagine that you are flipping a coin over and over until you get k heads. Although this could take a long time, on average only 2k flips are required, and the chance that you won't get k heads after 100k flips is highly improbable. By the same argument, quicksort's recursion will terminate on average at a call depth of only 2(2 log2 n). But if its average call depth is Θ(log n), and each level of the call tree processes at most n elements, the total amount of work done on average is the product, Θ(n log n). Note that the algorithm does not have to verify that the pivot is in the middle half; if we hit it any constant fraction of the times, that is enough for the desired complexity.
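In code, randomization only affects how the pivot index is picked. A hedged C sketch (assuming the caller has seeded the generator with srand once):

    #include <stdlib.h>

    int random_pivot_index(int lo, int hi)
    {
        /* Uniformly random index in [lo, hi]; rand() is adequate for
           illustration, though its low-order bits are weak on some platforms. */
        return lo + rand() % (hi - lo + 1);
    }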


The outline of a formal proof of the O(n log n) expected time complexity follows. Assume that there are no duplicates, as duplicates could be handled with linear-time pre- and post-processing, or considered as cases easier than the one analyzed. Choosing a pivot uniformly at random from 0 to n - 1 is then equivalent to choosing the size of one particular partition uniformly at random from 0 to n - 1. With this observation, the continuation of the proof is analogous to the one given in the average complexity section.


Average complexity
Even if pivots aren't chosen randomly, quicksort still requires only Θ(n log n) time over all possible permutations of its input. Because this average is simply the sum of the times over all permutations of the input divided by n factorial, it's equivalent to choosing a random permutation of the input. When we do this, the pivot choices are essentially random, leading to an algorithm with the same running time as randomized quicksort. More precisely, the average number of comparisons over all permutations of the input sequence can be estimated accurately by solving the recurrence relation (the standard form, reconstructed here since it appeared only as a figure in the original):

C(n) = (n - 1) + (1/n) Σ (C(i) + C(n - 1 - i))   (summing over i = 0 to n - 1), with C(0) = C(1) = 0.

Here, n - 1 is the number of comparisons the partition uses. Since the pivot is equally likely to fall anywhere in the sorted list order, the sum is averaging over all possible splits. This means that, on average, quicksort performs only about 39% worse than the ideal number of comparisons, which is its best case. In this sense it is closer to the best case than the worst case. This fast average runtime is another reason for quicksort's practical dominance over other sorting algorithms.

Space complexity
The space used by quicksort depends on the version used. Quicksort has a space complexity of Θ(log n), even in the worst case, when it is carefully implemented such that in-place partitioning is used; the partitioning itself requires only Θ(1) space. After partitioning, the partition with the fewest elements is (recursively) sorted first, requiring at most Θ(log n) space; then the other partition is sorted using tail recursion or iteration.

The version of quicksort with in-place partitioning uses only constant additional space before making any recursive call. However, if it has made Θ(log n) nested recursive calls, it needs to store a constant amount of information from each of them. Since the best case makes at most Θ(log n) nested recursive calls, it uses Θ(log n) space. The worst case makes Θ(n) nested recursive calls, and so needs Θ(n) space; Sedgewick's improved version using tail recursion requires Θ(log n) space in the worst case.

We are eliding a small detail here, though. If we consider sorting arbitrarily large lists, we have to keep in mind that our variables like left and right can no longer be considered to occupy constant space; it takes Θ(log n) bits to index into a list of n items. Because we have variables like this in every stack frame, in reality quicksort requires Θ((log n)^2) bits of space in the best and average case and Θ(n log n) space in the worst case. This isn't too terrible, though, since if the list contains mostly distinct elements, the list itself will also occupy Θ(n log n) bits of space.

The not-in-place version of quicksort uses Θ(n) space before it even makes any recursive calls. In the best case its space is still limited to Θ(n), because each level of the recursion uses half as much space as the last, but its worst case is dismal, requiring far more space than the list itself. If the list elements are not themselves constant size, the problem grows even larger; for example, if most of the list elements are distinct, each would require about Θ(log n) bits, leading to a best-case Θ(n log n) and worst-case Θ(n^2 log n) space requirement.

Variants

There are three variants of quicksort that are worth mentioning:

Balanced quicksort: choose a pivot likely to represent the middle of the values to be sorted, and then follow the regular quicksort algorithm.

External quicksort: the same as regular quicksort except the pivot is replaced by a buffer. First, read the M/2 first and last elements into the buffer and sort them. Read the next element from the beginning or end to balance writing. If the next element is less than the least of the buffer, write it to available space at the beginning; if greater than the greatest, write it to the end. Otherwise write the greatest or least of the buffer, and put the next element in the buffer. Keep the maximum lower and minimum upper keys written to avoid re-sorting middle elements that are in order. When done, write the buffer. Recursively sort the smaller partition, and loop to sort the remaining partition.

Three-way radix quicksort (also called multikey quicksort): a combination of radix sort and quicksort. Pick an element from the array (the pivot) and consider the first character (key) of the string (multikey). Partition the remaining elements into three sets: those whose corresponding character is less than, equal to, and greater than the pivot's character. Recursively sort the "less than" and "greater than" partitions on the same character; recursively sort the "equal to" partition by the next character (key).

Comparison with other sorting algorithms

Quicksort is a space-optimized version of the binary tree sort. Instead of inserting items sequentially into an explicit tree, quicksort organizes them concurrently into a tree that is implied by the recursive calls. The algorithms make exactly the same comparisons, but in a different order.

The most direct competitor of quicksort is heapsort. Heapsort is typically somewhat slower than quicksort, but the worst-case running time is always Θ(n log n). Quicksort is usually faster, though there remains the chance of worst-case performance except in the introsort variant. If it's known in advance that heapsort is going to be necessary, using it directly will be faster than waiting for introsort to switch to it.

Quicksort also competes with mergesort, another recursive sort algorithm, but one with the benefit of worst-case Θ(n log n) running time. Mergesort is a stable sort, unlike quicksort and heapsort, and can be easily adapted to operate on linked lists and very large lists stored on slow-to-access media such as disk storage or network attached storage. Although quicksort can be written to operate on linked lists, it will often suffer from poor pivot choices without random access. The main disadvantage of mergesort is that, when operating on arrays, it requires Θ(n) auxiliary space in the best case, whereas the variant of quicksort with in-place partitioning and tail recursion uses only Θ(log n) space. (Note that when operating on linked lists, mergesort only requires a small, constant amount of auxiliary storage.)

MODULE 3

THE GREEDY METHOD – CONTROL ABSTRACTION

The greedy method suggests that one can devise an algorithm that works in stages, considering one input at a time. At each stage, a decision is made regarding whether a particular input is in an optimal solution. This is done by considering the inputs in an order determined by some selection procedure. If the inclusion of the next input into the partially constructed optimal solution will result in an infeasible solution, then this input is not added to the partial solution; otherwise, it is added. The selection procedure itself is based on some optimization measure, which may be the objective function. In fact, several different optimization measures may be plausible for a given problem; most of these, however, will result in algorithms that generate suboptimal solutions. This version of the greedy technique is called the subset paradigm.

The function Greedy describes the essential way that a greedy algorithm will look, once a particular problem is chosen and the functions Select, Feasible, and Union are properly implemented. The function Select selects an input from a[] and removes it; the selected input's value is assigned to x. Feasible is a Boolean-valued function that determines whether x can be included into the solution vector. The function Union combines x with the solution and updates the objective function.

Algorithm Greedy(a, n)
// a[1:n] contains the n inputs.
{
    solution := ∅;    // Initialize the solution.
    for i := 1 to n do
    {
        x := Select(a);
        if Feasible(solution, x) then
            solution := Union(solution, x);
    }
    return solution;
}

GENERAL KNAPSACK PROBLEM

In the knapsack problem, we have n objects and a knapsack or bag. Object i has a weight wi, and the knapsack has a capacity m. If a fraction xi, 0 ≤ xi ≤ 1, of object i is placed into the knapsack, then a profit pi·xi is earned. The objective is to obtain a filling of the knapsack that maximizes the total profit earned. Since the knapsack capacity is m, we require the total weight of all chosen objects to be at most m. Formally, the problem can be stated as:

maximize    Σ pi·xi       (1 ≤ i ≤ n)
subject to  Σ wi·xi ≤ m   (1 ≤ i ≤ n)
and         0 ≤ xi ≤ 1,   1 ≤ i ≤ n.

The profits and weights are positive numbers. The problem is solved by the greedy method:

ALGORITHM

Algorithm GreedyKnapsack(m, n)
// p[1:n] and w[1:n] contain the profits and weights respectively of the
// n objects, ordered such that p[i]/w[i] ≥ p[i+1]/w[i+1].
// m is the knapsack size and x[1:n] is the solution vector.
{
    for i := 1 to n do x[i] := 0.0;    // Initialize x.
    u := m;
    for i := 1 to n do
    {
        if (w[i] > u) then break;
        x[i] := 1.0;
        u := u - w[i];
    }
    if (i ≤ n) then x[i] := u/w[i];
}
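A C rendering of GreedyKnapsack (my sketch; arrays are 1-based as in the pseudocode, and the objects are assumed pre-sorted by profit-to-weight ratio, so the profits p[] influence the result only through that ordering):

    void greedy_knapsack(double m, int n,
                         const double p[], const double w[], double x[])
    {
        (void)p;                       /* profits enter only via the sort order */
        double u = m;                  /* remaining capacity */
        int i;
        for (i = 1; i <= n; i++) x[i] = 0.0;
        for (i = 1; i <= n; i++) {
            if (w[i] > u) break;       /* next object no longer fits whole */
            x[i] = 1.0;                /* take all of object i */
            u -= w[i];
        }
        if (i <= n) x[i] = u / w[i];   /* take a fraction of the first misfit */
    }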

If p1/w1 ≥ p2/w2 ≥ ... ≥ pn/wn, then GreedyKnapsack generates an optimal solution to the given instance of the knapsack problem. All optimal solutions fill the knapsack exactly. The complexity of the knapsack algorithm is O(n).

Greedy algorithms

Similarly to dynamic programming, greedy algorithms are used to solve optimization problems. In contrast to dynamic programming, no global optimal solution is computed; rather, locally optimal decisions are taken. That is why the approach is called greedy.

Example: Knapsack Problem I

There are two possible greedy strategies. First, select the most valuable element in each step that still fits into the knapsack. Knapsack capacity: 15.

Object    Size    Value
g1        3       3
g2        4       5
g3        6       8
g4        7       9

choose: g4, value 9, remaining capacity 8
choose: g4, value 9, remaining capacity 1
→ not optimal

Example: Knapsack Problem II

Second, select the relatively most valuable element, i.e. maximize v(gi)/s(gi). This will lead to the optimal solution in the case described above. However, this need not always be the case! Knapsack capacity: 50.

Object    Size    Value    v/s
g1        30      90       3
g2        40      100      2.5
g3        50      110      2.2

This strategy will choose g1 (value 90); the optimal would be g3 (value 110). Greedy algorithms do not always provide the optimal solution; in exchange, they are far less expensive than dynamic programming. The complexity of the greedy knapsack is O(C).

Optimal greedy algorithms

There are problems where greedy algorithms produce optimal solutions. Example: the resource planning (activity selection) problem. Given a set of activities S = {1, ..., n} that want to use some resource, for example a lecture hall. The resource can only be used by one activity at a time. Each activity i has a starting time si and a completion time fi with si < fi; the activity takes place in the time interval [si, fi), so the next activity can start at time fi. Activities i and j are compatible, compatible(i, j), if their intervals do not overlap: si ≥ fj or sj ≥ fi. Problem: compute the largest possible set of pairwise compatible activities.

OPTIMAL STORAGE ON TAPES

Optimal storage on tapes I

Given: n programs P1, ..., Pn with lengths l1, ..., ln. The programs are stored on a tape in some order Pi1, ..., Pin. The time to access program Pij is T(i1, ..., in, j) = li1 + ... + lij. The average access time is

T(i1, ..., in) = (1/n) Σ_j T(i1, ..., in, j) = (1/n) Σ_j (n - j + 1) lij   (summing over j = 1 to n).

Optimal storage on tapes II

How should the programs be stored on the tape so that the average access time becomes minimal?

Optimal storage on tapes III

Let n be fixed; the factor 1/n is the same for every order, hence it can be ignored.

Example: Let P1, P2, P3 have lengths 17, 5, 10.

Order     Access time (times n)
P1P2P3    17 + (17 + 5) + (17 + 5 + 10) = 71
P1P3P2    17 + (17 + 10) + (17 + 10 + 5) = 76
P2P1P3    59
P3P1P2    69
P2P3P1    52
P3P2P1    57

Fact: If the lengths satisfy l1 ≤ l2 ≤ ... ≤ ln, then the average access time is minimized by the order P1, ..., Pn. Hence, a greedy strategy that always selects the shortest program leads to the minimal average access time.

What have we learnt on greedy algorithms? Greedy algorithms treat optimization problems by taking locally optimal decisions. Hence, they may fail to determine the global optimum. Greedy algorithms need less time than, e.g., dynamic programming. There are problems where a greedy strategy leads to an optimal solution.
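The greedy rule just stated is easy to implement. A minimal C sketch, which sorts the programs by length and returns n times the resulting average access time:

#include <stdio.h>
#include <stdlib.h>

/* Greedy tape storage: store programs in nondecreasing order of length.
   Returns n times the average access time for that order. */
static int cmp_len(const void *a, const void *b) {
    return (*(const int *)a) - (*(const int *)b);
}

int tape_total_access(int l[], int n) {
    qsort(l, n, sizeof(int), cmp_len);   /* shortest program first */
    int total = 0, prefix = 0;
    for (int j = 0; j < n; j++) {        /* access time of P(j) = l[0]+...+l[j] */
        prefix += l[j];
        total += prefix;
    }
    return total;                        /* divide by n for the average */
}

int main(void) {
    int l[] = {17, 5, 10};
    printf("minimum total access time = %d\n", tape_total_access(l, 3)); /* 52 */
    return 0;
}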

SPANNING TREES - MINIMUM COST SPANNING TREES

Definition: Let G = (V, E) be an undirected connected graph. A subgraph T = (V, E') of G is a spanning tree of G if T is a tree. Any connected graph with n vertices must have at least n-1 edges, and all connected graphs with n-1 edges are trees.

Spanning trees have many applications. For example, they can be used to obtain an independent set of circuit equations for an electric network. Another application of spanning trees arises from the property that a spanning tree is a minimal subgraph G' of G such that V(G') = V(G) and G' is connected (a minimal subgraph is one with the fewest number of edges). If the nodes of G represent cities and the edges represent possible communication links connecting two cities, then the minimum number of links needed to connect the n cities is n-1. The spanning trees of G represent all feasible choices. If a selection of links is not a tree, it contains a cycle, and removal of any of the links on this cycle results in a link selection of less cost connecting all the cities.

In practical situations, the edges have weights assigned to them. These weights may represent the cost of construction, the length of the link, and so on. Given such a weighted graph, one would then wish to select links of minimum total cost or minimum total length. In either case the links selected have to form a tree. The cost of a spanning tree is the sum of the costs of all the edges in that tree, and we seek a minimum cost spanning tree of G.

[Figure (a): a weighted graph on 7 vertices with edge weights 28, 16, 12, 22, 25, 10, 14, 24 and 18.]

[Figure (b): the minimum cost spanning tree of the graph in (a), using the edges of weights 10, 25, 22, 12, 16 and 14.]

Figure (a) represents a graph and (b) represents its minimum cost spanning tree.

PRIM'S ALGORITHM

At first a vertex is chosen in random order; for simplicity we take it to be V(1). This way two sets of pointers are initialized: O = {1} and P = {2, ..., n}. The O set (O is taken from the Greek word Oristiko, which means terminal) will always contain the pointers of those vertices which are terminally attached to the tree T. The V(1) vertex has already been attached to the tree T. The P set (P is taken from the Greek word Prosorino, which means temporary) contains the rest of the pointers, P = {1, ..., n} - O, i.e., those vertices that have not yet been terminally connected to a node of T.

In every execution of Prim's algorithm a new vertex is connected to the tree T, not always in numbering order; for example the vertex V(4) can be connected to the tree before the vertex V(2). The pointer of the newly connected vertex is deleted

from the P set and inserted into the O set. Each new vertex is chosen by the greedy method: among all edges of G which connect vertices already inserted in the tree T (pointers in the O set) with the remaining vertices (pointers in the P set), we choose one with minimum cost. If the chosen edge is e(ij), then i belongs to the O set and j belongs to the P set. We put V(j) in the tree T, we change the O set by inserting the pointer j, and we change the P set by removing the pointer j. When all vertices are connected we have O = {1, ..., n} and P = ∅, which of course means the end of the algorithm. This may seem complicated, but it is easily understood through a set of examples.

Pseudocode for Prim's algorithm

INPUT: n; c[e(ij)] for i, j belonging to {1, ..., n}.
OUTPUT: p(j), j = 2, ..., n (the pointer to vertex j's father in the tree T).

(Initializations.) O = {1} (V(1) is the root of the tree T). P = {2, ..., n}.
For every j belonging to P: e(j) := c[e(j1)] and p(j) := 1 (all vertices tentatively connected to the root). By definition of the cost function, e(j) = infinity when V(j) does not connect to V(1).

1. Choose a k for which e(k) <= e(j) for every j belonging to P; in case of a tie choose the smaller one. Replace the O set by O ∪ {k} and the P set by P - {k}. If P = ∅ then stop.
2. For every j belonging to P compare e(j) with c[e(kj)]. If e(j) > c[e(kj)], set e(j) := c[e(kj)] and p(j) := k.
3. Go back to Step 1.
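The steps above translate directly into code. Below is a hedged C sketch on an adjacency matrix; the in_tree array plays the role of the O set, and the vertex count N and the INF sentinel for "no edge" are assumptions of this sketch.

#include <stdio.h>

#define N 4
#define INF 1000000

/* Prim's algorithm: e[j] is the cheapest known cost of attaching vertex j,
   p[j] records its father in the tree, in_tree[] is the O set. */
void prim(int c[N][N], int p[N]) {
    int e[N], in_tree[N];
    for (int j = 0; j < N; j++) {
        e[j] = c[0][j];              /* tentative cost of attaching j to the root */
        p[j] = 0;                    /* tentatively connected to the root */
        in_tree[j] = 0;
    }
    in_tree[0] = 1;                  /* O = {1}: vertex 0 is the root */
    for (int step = 1; step < N; step++) {
        int k = -1;
        for (int j = 0; j < N; j++)  /* choose k with minimal e(k) among P */
            if (!in_tree[j] && (k < 0 || e[j] < e[k])) k = j;
        in_tree[k] = 1;              /* move k from P to O */
        for (int j = 0; j < N; j++)  /* update e(j) via the new vertex k */
            if (!in_tree[j] && c[k][j] < e[j]) { e[j] = c[k][j]; p[j] = k; }
    }
}

int main(void) {
    int c[N][N] = {
        { INF, 1,   4,   INF },
        { 1,   INF, 2,   5   },
        { 4,   2,   INF, 3   },
        { INF, 5,   3,   INF }
    };
    int p[N];
    prim(c, p);
    for (int j = 1; j < N; j++) printf("edge (%d, %d)\n", p[j] + 1, j + 1);
    return 0;
}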

An example for Prim's algorithm:

[Figure: stages (a) through (f) of Prim's algorithm on the example graph, adding one minimum-cost edge at a time until the spanning tree of figure (b) above is obtained.]

Figure (a) shows the current graph with no edges selected. Edge (1, 6) is the first edge considered. It is included in the spanning tree being built. This yields the graph of fig (b). Edge (3, 4) is selected next and included in the tree (fig (c)). The next edge to be considered is (2, 7). Its inclusion in the tree being built does not create a cycle, so we get the graph of fig (d). Edge (2, 3) is considered next and included in the tree (fig (e)). Of the edges not yet considered, (7, 4) has the least cost. It is considered next. Its inclusion in the tree results in a cycle, so this edge is discarded. Edge (5, 4) is the next edge to be added to the tree being built. This results in the configuration of fig (f). The next edge to be considered is (7, 5). It is discarded as it creates a cycle. Finally, edge (6, 5) is considered and included in the tree being built. This completes the spanning tree. The resulting tree has the cost 99.

KRUSKAL'S ALGORITHM

ALGORITHM

Algorithm Kruskal(E, cost, n, t)
// E is the set of edges in G. G has n vertices. cost[u, v] is the cost of
// edge (u, v). t is the set of edges in the minimum-cost spanning tree.
// The final cost is returned.
{
    Construct a heap out of the edge costs using Heapify;
    for i := 1 to n do parent[i] := -1;   // Each vertex is in a different set.
    i := 0; mincost := 0.0;
    while ((i < n - 1) and (heap not empty)) do
    {
        Delete a minimum cost edge (u, v) from the heap and reheapify using Adjust;
        j := Find(u); k := Find(v);
        if (j ≠ k) then
        {
            i := i + 1;
            t[i, 1] := u; t[i, 2] := v;
            mincost := mincost + cost[u, v];
            Union(j, k);
        }
    }
    if (i ≠ n - 1) then write ("no spanning tree");
    else return mincost;
}
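A compact C sketch of the same idea; as an assumption of this sketch, the heap is replaced by an initial sort of the edges (which gives the same O(|E| log |E|) bound), and Find/Union are realized with a simple union-find array.

#include <stdio.h>
#include <stdlib.h>

typedef struct { int u, v, cost; } Edge;

static int parent[100];                    /* union-find forest, roots marked -1 */

int find(int x) { return parent[x] < 0 ? x : (parent[x] = find(parent[x])); }

static int cmp(const void *a, const void *b) {
    return ((const Edge *)a)->cost - ((const Edge *)b)->cost;
}

int kruskal(Edge e[], int m, int n) {      /* m edges, n vertices numbered 1..n */
    int mincost = 0, taken = 0;
    for (int i = 1; i <= n; i++) parent[i] = -1;
    qsort(e, m, sizeof(Edge), cmp);        /* nondecreasing order of cost */
    for (int i = 0; i < m && taken < n - 1; i++) {
        int j = find(e[i].u), k = find(e[i].v);
        if (j != k) {                      /* no cycle: include the edge */
            parent[j] = k;
            mincost += e[i].cost;
            taken++;
        }
    }
    return taken == n - 1 ? mincost : -1;  /* -1: no spanning tree */
}

int main(void) {
    Edge e[] = { {1,2,1}, {2,3,2}, {1,3,4}, {3,4,3}, {2,4,5} };
    printf("mincost = %d\n", kruskal(e, 5, 4));   /* 6 */
    return 0;
}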

COMPLEXITY

The set t is the set of edges to be included in the minimum-cost spanning tree, and i is the number of edges in t. The set t can be represented as a sequential list using a two-dimensional array t[1:n-1, 1:2]; edge (u, v) can be added to t by the assignments t[i, 1] := u and t[i, 2] := v. In the while loop, edges are removed from the heap one by one in nondecreasing order of cost. The Find operations determine the sets containing u and v. If j ≠ k, then vertices u and v are in different sets, edge (u, v) is included into t, and the sets containing u and v are combined by Union. If j = k, the edge (u, v) is discarded, as its inclusion into t would create a cycle. The final test determines whether a spanning tree was found; it follows that i ≠ n - 1 iff the graph G is not connected. The computing time is O(|E| log |E|), where E is the edge set of G.

JOB SEQUENCING WITH DEADLINES

We are given a set of n jobs. Associated with job i is an integer deadline di ≥ 0 and a profit pi > 0. For any job i the profit pi is earned if the job is completed by its deadline. To complete a job one has to process the job on a machine for one unit of time; only one machine is available for processing jobs. A feasible solution for this problem is a subset J of jobs such that each job in the subset can be completed by its deadline. The value of a feasible solution J is the sum of the profits of the jobs in J, that is, ∑i∈J pi. An optimal solution is a feasible solution with maximum value.

Example: Let n = 4, (p1, p2, p3, p4) = (100, 10, 15, 27) and (d1, d2, d3, d4) = (2, 1, 2, 1). The feasible solutions and their values are:

No   Feasible solution   Processing sequence   Value
1    (1, 2)              2, 1                  110
2    (1, 3)              1, 3 or 3, 1          115
3    (1, 4)              4, 1                  127
4    (2, 3)              2, 3                  25
5    (3, 4)              4, 3                  42
6    (1)                 1                     100
7    (2)                 2                     10
8    (3)                 3                     15
9    (4)                 4                     27

Solution 3 is optimal. In this solution only jobs 1 and 4 are processed, and the value is 127. These jobs must be processed in the order job 4 followed by job 1. Thus the processing of job 4 begins at time zero and that of job 1 is completed at time 2.

High level description of job sequencing algorithm

Algorithm GreedyJob(d, J, n)
// J is a set of jobs that can be completed by their deadlines.
{
    J := {1};
    for i := 2 to n do
    {
        if (all jobs in J ∪ {i} can be completed by their deadlines)
            then J := J ∪ {i};
    }
}

Greedy algorithm for sequencing unit time jobs with deadlines and profits

Algorithm JS(d, j, n)
// d[i] ≥ 1, 1 ≤ i ≤ n are the deadlines, n ≥ 1. The jobs are ordered such
// that p[1] ≥ p[2] ≥ ... ≥ p[n]. J[i] is the ith job in the optimal
// solution, 1 ≤ i ≤ k. Also, at termination, d[J[i]] ≤ d[J[i+1]], 1 ≤ i < k.
{
    d[0] := J[0] := 0;   // Initialize.
    J[1] := 1;           // Include job 1.
    k := 1;
    for i := 2 to n do
    {   // Consider jobs in nonincreasing order of p[i].
        // Find position for i and check feasibility of insertion.
        r := k;
        while ((d[J[r]] > d[i]) and (d[J[r]] ≠ r)) do r := r - 1;
        if ((d[J[r]] ≤ d[i]) and (d[i] > r)) then
        {   // Insert i into J[].
            for q := k to (r + 1) step -1 do J[q+1] := J[q];
            J[r+1] := i;
            k := k + 1;
        }
    }
    return k;
}

The function JS is a correct implementation of the greedy-based method. Since d[i] ≥ 1, the job with the largest pi will always be in the greedy solution.

As the jobs are in nonincreasing order of pi, the initialization step includes the job with the largest pi. The for loop considers the remaining jobs in the order required by the greedy method. At all times, the set of jobs already included in the solution is maintained in J. If J[i], 1 ≤ i ≤ k, is the set already included, then J is such that d[J[i]] ≤ d[J[i+1]], 1 ≤ i < k. When job i is being considered, the while loop determines where in J this job has to be inserted. The worst case computing time of Algorithm JS is θ(n²); it can be reduced from θ(n²) to nearly O(n) by using disjoint set union and find algorithms and a different method to determine the feasibility of a partial solution.
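A C rendering of Algorithm JS under the same assumptions (jobs pre-sorted by nonincreasing profit; arrays are 1-indexed, with d[0] = J[0] = 0 as a sentinel):

#include <stdio.h>

/* C sketch of Algorithm JS. d[i] >= 1 is the deadline of the ith most
   profitable job. J[1..k] returns the selected jobs ordered by deadline. */
int js(int d[], int J[], int n) {
    int k = 1;
    d[0] = 0; J[0] = 0;      /* sentinel simplifies the while loop */
    J[1] = 1;                /* include the most profitable job */
    for (int i = 2; i <= n; i++) {
        int r = k;
        while (d[J[r]] > d[i] && d[J[r]] != r) r--;
        if (d[J[r]] <= d[i] && d[i] > r) {     /* feasible: insert job i */
            for (int q = k; q >= r + 1; q--) J[q + 1] = J[q];
            J[r + 1] = i;
            k++;
        }
    }
    return k;
}

int main(void) {
    /* the example above, re-indexed by profit: jobs 1, 4, 3, 2 */
    int d[] = {0, 2, 1, 2, 1};   /* deadlines, 1-indexed */
    int J[6];
    printf("%d jobs selected\n", js(d, J, 4));   /* 2: jobs 4 and 1 */
    return 0;
}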

MODULE 4

DYNAMIC PROGRAMMING

Dynamic programming is an algorithm design method that can be used when the solution to a problem can be viewed as the result of a sequence of decisions. Below are some examples of problems that can be viewed this way.

1) KNAPSACK - The solution to the knapsack problem can be viewed as the result of a sequence of decisions. We have to decide the values of xi, 1 ≤ i ≤ n. An optimal sequence of decisions maximizes the objective function ∑pixi and satisfies the constraints ∑wixi ≤ m and 0 ≤ xi ≤ 1.

2) OPTIMAL MERGE PATTERNS - An optimal merge pattern tells us which pair of files should be merged at each step. An optimal sequence of decisions is a least cost sequence.

3) SHORTEST PATH - One way to find a shortest path from vertex i to vertex j in a directed graph G is to decide which vertex should be the second vertex, which the third, and so on. An optimal sequence of decisions is one that results in a path of least length.

For some of the problems that may be viewed in this way, an optimal sequence of decisions can be found by making the decisions one at a time and never making an erroneous decision. This is true for all problems solvable by the greedy method. For many other problems it is not possible to make stepwise decisions in such a manner that the sequence of decisions made is optimal. One way to solve such problems is to try all possible decision sequences: we could enumerate all decision sequences and then pick out the best. But the time and space requirements may be prohibitive. Dynamic programming often drastically reduces the amount of enumeration by avoiding the enumeration of some decision sequences that cannot possibly be optimal. In dynamic programming an optimal sequence of decisions is obtained by making explicit appeal to the principle of optimality.

PRINCIPLE OF OPTIMALITY

The principle of optimality states that an optimal sequence of decisions has the property that whatever the initial state and decision are, the remaining decisions must constitute an optimal decision sequence with regard to the state resulting from the first decision.

Example: SHORTEST PATH

Consider the shortest path problem. Assume that i, i1, i2, ..., ik, j is a shortest path from i to j. Starting with the initial vertex i, a decision has been made to go to vertex i1. Following this decision the problem state is defined by vertex i1, and we need to find a path from i1 to j. It is clear that the sequence i1, i2, ..., ik, j must constitute a shortest i1 to j path. If not, let i1, r1, r2, ..., rq, j be a shorter i1 to j path. Then i, i1, r1, r2, ..., rq, j is an i to j path that is shorter than the path i, i1, i2, ..., ik, j. Therefore the principle of optimality applies for this problem.

Thus, the essential difference between the greedy method and dynamic programming is that in the greedy method only one decision sequence is ever generated. In dynamic programming, many decision sequences may be generated. However, sequences containing suboptimal subsequences cannot be optimal (if the principle of optimality holds) and so, as far as possible, are not generated. Although the total number of different decision sequences is exponential in the number of decisions (if there are d choices for each of the n decisions to be made, then there are d^n possible sequences), dynamic programming algorithms often have a polynomial complexity. Another important feature of the dynamic programming approach is that optimal solutions to subproblems are retained so as to avoid recomputing their values. The use of these tabulated values makes it natural to recast the recursive equations into an iterative algorithm.

MULTISTAGE GRAPHS

A multistage graph G = (V, E) is a directed graph in which the vertices are partitioned into k ≥ 2 disjoint sets Vi, 1 ≤ i ≤ k. If (u, v) is an edge in E, then u ∈ Vi and v ∈ Vi+1 for some i, 1 ≤ i < k. The sets V1 and Vk are such that |V1| = |Vk| = 1. Let s and t be the vertices in V1 and Vk; s is the source and t the sink. Let c(i, j) be the cost of edge (i, j). The cost of a path from s to t is the sum of the costs of the edges on the path.

The multistage graph problem is to find a minimum-cost path from s to t. Each set Vi defines a stage in the graph. Because of the constraints on E, every path from s to t starts in stage 1, goes to stage 2, then to stage 3, then to stage 4, etc., and eventually terminates in stage k. The multistage graph problem has two approaches, namely forward and backward. In the forward approach the resulting path is obtained by moving in the direction of the destination from the source, even though it appears in reverse; in the backward approach one moves from destination to source, tracing the lowest cost path.

A dynamic programming formulation for a k-stage graph problem is obtained by using the principle of optimality. According to this principle a path (i, k) is optimal if and only if the intermediate paths (i, j) and (j, k) are also optimal.

Equation

The cost of a path from a vertex j in stage i to the sink is given by the equation

cost(i, j) = min { c(j, l) + cost(i+1, l) : (j, l) ∈ E, l ∈ Vi+1 }

Here cost(i, j) means the cost of the best path from vertex j in the ith stage to the last node.

EXAMPLE

Consider the following figure: a 5-stage graph with 12 vertices, with given edge costs. The aim is to find the minimum cost path from the first node to the last node.

[Figure: a 5-stage graph with vertex sets V1 = {1}, V2 = {2, 3, 4, 5}, V3 = {6, 7, 8}, V4 = {9, 10, 11} and V5 = {12}.]

Cost Calculation

cost(4, 9) = 4
cost(4, 10) = 2
cost(4, 11) = 5
cost(3, 6) = min { 6 + cost(4, 9), 5 + cost(4, 10) } = 7
cost(3, 7) = min { 4 + cost(4, 9), 3 + cost(4, 10) } = 5
cost(3, 8) = min { 5 + cost(4, 10), 6 + cost(4, 11) } = 7
cost(2, 2) = min { 4 + cost(3, 6), 2 + cost(3, 7), 1 + cost(3, 8) } = 7
cost(2, 3) = min { 2 + cost(3, 6), 7 + cost(3, 7) } = 9
cost(2, 4) = 11 + cost(3, 8) = 18
cost(2, 5) = min { 11 + cost(3, 7), 8 + cost(3, 8) } = 15
cost(1, 1) = min { 9 + cost(2, 2), 7 + cost(2, 3), 3 + cost(2, 4), 2 + cost(2, 5) } = 16

By selecting the least cost edges from each group we can determine which edges are involved in the minimum cost path. Thus the minimum cost path in the forward approach is (1, 2, 7, 10, 12) and in the backward approach it is (1, 3, 6, 10, 12). Both have a cost of 16 each.

FORWARD APPROACH EXPLANATION:

Recall that a multistage graph G = (V, E) is a directed graph whose vertices are partitioned into k ≥ 2 disjoint sets Vi, 1 ≤ i ≤ k, with |V1| = |Vk| = 1; s ∈ V1 is the source and t ∈ Vk the sink, c(i, j) is the cost of edge (i, j), and the cost of a path from s to t is the sum of the costs of the edges on the path. The multistage graph problem is to find a minimum-cost path from s to t. Because of the constraints on E, every path from s to t starts in stage 1, goes to stage 2, then to stage 3, and so on, eventually terminating in stage k. A dynamic programming formulation is obtained using the principle of optimality: a path (i, k) is optimal if and only if the intermediate paths (i, j) and (j, k) are also optimal.

By using the forward approach, we obtain the cost of a path (i, j), where j is a vertex in Vi, from the equation

cost(i, j) = min { c(j, l) + cost(i+1, l) : (j, l) ∈ E, l ∈ Vi+1 }

The aim is to find the minimum cost path from the first node to the last node. This can be done by using the forward approach as shown below.

EXAMPLE:

[Figure: the same 5-stage graph with 12 vertices, stages V1 through V5.]

CALCULATING THE MINIMUM COST:

cost(4, 9) = cost from the 9th node to the 12th node = 4
cost(4, 10) = 2
cost(4, 11) = 5
cost(3, 6) = min { 6 + cost(4, 9), 5 + cost(4, 10) } = 7
cost(3, 7) = min { 4 + cost(4, 9), 3 + cost(4, 10) } = 5
cost(3, 8) = min { 5 + cost(4, 10), 6 + cost(4, 11) } = 7
cost(2, 2) = min { 4 + cost(3, 6), 2 + cost(3, 7), 1 + cost(3, 8) } = 7
cost(2, 3) = min { 2 + cost(3, 6), 7 + cost(3, 7) } = 9
cost(2, 4) = 11 + cost(3, 8) = 18
cost(2, 5) = min { 11 + cost(3, 7), 8 + cost(3, 8) } = 15
cost(1, 1) = min { 9 + cost(2, 2), 7 + cost(2, 3), 3 + cost(2, 4), 2 + cost(2, 5) } = 16

While calculating the value of cost(2, 2), the values of cost(3, 6), cost(3, 7) and cost(3, 8) have been reused so as to avoid their re-computation. A minimum cost s to t path has a cost of 16. By selecting the least cost edge of each minimum above, we can determine which edges are involved in the minimum cost path: (1, 2, 7, 10, 12). This is the multistage graph forward approach problem.

BACKWARD APPROACH:

The multistage graph problem can also be solved using the backward approach, in the following manner. Let bp(i, j) be a minimum-cost path from vertex s to a vertex j in Vi, and let bcost(i, j) be the cost of bp(i, j). From the backward approach we obtain

bcost(i, j) = min { bcost(i-1, l) + c(l, j) : (l, j) ∈ E, l ∈ Vi-1 }

The pseudocode corresponding to obtaining a minimum-cost s-t path this way is Algorithm BGraph below. The first subscript on bcost, p and d is omitted for the same reasons as before. This algorithm has the same complexity as FGraph, provided G is now represented by its inverse adjacency lists (i.e., for each vertex v we have a list of vertices w such that <w, v> ∈ E).

ALGORITHM

Algorithm BGraph(G, k, n, p)
// Same function as FGraph
{
    bcost[1] := 0.0;
    for j := 2 to n do
    {   // Compute bcost[j].
        Let r be such that <r, j> is an edge of G and
            bcost[r] + c[r, j] is minimum;
        bcost[j] := bcost[r] + c[r, j];
        d[j] := r;
    }
    // Find a minimum-cost path.
    p[1] := 1; p[k] := n;
    for j := k - 1 to 2 do p[j] := d[p[j+1]];
}

In the pseudocodes FGraph and BGraph, bcost(i, j) is set to ∞ for any <i, j> not in E. When programming these pseudocodes, one could use the maximum allowable floating point number for ∞. If the weight of any such edge is added to some other costs, a floating point overflow might occur; care should be taken to avoid such overflows. It should be easy to see that both FGraph and BGraph work correctly even on a more generalized version of multistage graphs. In this generalization, the graph is permitted to have edges <u, v> such that u ∈ Vi, v ∈ Vj and i < j.

Eg:

[Figure: a small weighted graph with source S, intermediate vertices A, B, C, D, E, F and sink T.]

Backward reasoning (starting from T):

d(B, T) = min{ 9 + d(D, T), 5 + d(E, T), 16 + d(F, T) } = min{ 9+18, 5+13, 16+2 } = 18
d(C, T) = min{ 2 + d(F, T) } = 2 + 2 = 4
d(S, T) = min{ 1 + d(A, T), 2 + d(B, T), 5 + d(C, T) } = min{ 1+22, 2+18, 5+4 } = 9

The above way of reasoning is called backward reasoning.

Forward reasoning:

d(S, A) = 1
d(S, B) = 2
d(S, C) = 5
d(S, D) = min{ d(S, A) + d(A, D), d(S, B) + d(B, D) } = min{ 1+4, 2+9 } = 5
d(S, E) = min{ d(S, B) + d(B, E) } = min{ 2+5 } = 7
d(S, F) = min{ d(S, B) + d(B, F), d(S, C) + d(C, F) } = min{ 2+16, 5+2 } = 7
d(S, T) = min{ d(S, D) + d(D, T), d(S, E) + d(E, T), d(S, F) + d(F, T) } = min{ 5+18, 7+13, 7+2 } = 9

Forward approach and backward approach: note that if the recurrence relations are formulated using the forward approach, then the relations are solved backwards, i.e., beginning with the last decision. On the other hand, if the relations are formulated using the backward approach, they are solved forwards.

Sample Graph and Simulation of the Algorithm

[Figure: a 6-stage graph with Stage I = {1}, Stage II = {2, 3, 4}, Stage III = {5, 6, 7}, Stage IV = {8, 9, 10, 11}, Stage V = {12, 13, 14} and Stage VI = {15}.]

SIMULATION OF THE SOLUTION USING BACKWARD COSTS

Format: COST(stage, node) = minimum cost of traveling to the node in that stage from the source (node 1).

STEP I
COST(I, 1) = 0

STEP II
COST(II, 2) = COST(I, 1) + cost(1, 2) = 0 + 10 = 10
COST(II, 3) = COST(I, 1) + cost(1, 3) = 0 + 20 = 20
COST(II, 4) = COST(I, 1) + cost(1, 4) = 0 + 30 = 30

STEP III
COST(III, 5) = min{ COST(II, 2) + cost(2, 5), COST(II, 3) + cost(3, 5), COST(II, 4) + cost(4, 5) }
             = min{ 10+10, 20+∞, 30+∞ } = 20 --- via the path 1-2-5
COST(III, 6) = min{ COST(II, 2) + cost(2, 6), COST(II, 3) + cost(3, 6), COST(II, 4) + cost(4, 6) }
             = 20 --- via the path 1-3-6
COST(III, 7) = min{ COST(II, 2) + cost(2, 7), COST(II, 3) + cost(3, 7), COST(II, 4) + cost(4, 7) }
             = min{ 10+30, 20+∞, 30+30 } = 40 --- via the path 1-2-7

STEP IV
COST(IV, 8) = min{ COST(III, 5) + cost(5, 8), COST(III, 6) + cost(6, 8), COST(III, 7) + cost(7, 8) }
            = min{ 20+10, 20+∞, 40+∞ } = 30 --- via the path 1-2-5-8
COST(IV, 9) = min{ COST(III, 5) + cost(5, 9), COST(III, 6) + cost(6, 9), COST(III, 7) + cost(7, 9) }
            = min{ 20+20, 20+20, 40+∞ } = 40 --- via the path 1-2-5-9 or via the path 1-3-6-9
COST(IV, 10) = min{ COST(III, 5) + cost(5, 10), COST(III, 6) + cost(6, 10), COST(III, 7) + cost(7, 10) }
             = min{ 20+10, 20+∞, 40+∞ } = 30 --- via the path 1-2-5-10
COST(IV, 11) = min{ COST(III, 5) + cost(5, 11), COST(III, 6) + cost(6, 11), COST(III, 7) + cost(7, 11) }
             = min{ 20+30, 20+30, 40+30 } = 50 --- via the path 1-2-5-11 or via the path 1-3-6-11

STEP V
COST(V, 12) = min{ COST(IV, 8) + cost(8, 12), COST(IV, 9) + cost(9, 12), COST(IV, 11) + cost(11, 12) }
            = min{ 30+10, 40+∞, 50+∞ } = 40 --- via the path 1-2-5-8-12
COST(V, 13) = min{ COST(IV, 8) + cost(8, 13), COST(IV, 9) + cost(9, 13), COST(IV, 11) + cost(11, 13) }
            = 40 --- via the path 1-2-5-8-13
COST(V, 14) = min{ COST(IV, 9) + cost(9, 14), COST(IV, 10) + cost(10, 14), COST(IV, 11) + cost(11, 14) }
            = 50 --- via the path 1-2-5-10-14 or via 1-2-5-9-14 or 1-3-6-9-14

STEP VI
COST(VI, 15) = min{ COST(V, 12) + cost(12, 15), COST(V, 13) + cost(13, 15), COST(V, 14) + cost(14, 15) }
             = min{ 40+20, 40+10, 50+30 } = 50 --- via the path 1-2-5-8-13-15
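The multistage recurrence translates into a compact program once the vertices are numbered in topological order, which every multistage graph admits. A hedged C sketch with vertex 0 as source and V-1 as sink; the adjacency-matrix representation, the INF sentinel, and the assumption that a source-to-sink path exists are all assumptions of this sketch.

#include <stdio.h>

#define V 4
#define INF 1000000

/* Forward-approach DP on a multistage graph: cost[j] is cost(i, j) from the
   text, d[j] records the decision, from which the cheapest path is read off. */
int multistage_forward(int c[V][V], int path[V]) {
    int cost[V], d[V];
    cost[V - 1] = 0;
    for (int j = V - 2; j >= 0; j--) {          /* work backwards from the sink */
        cost[j] = INF;
        for (int l = j + 1; l < V; l++)
            if (c[j][l] < INF && c[j][l] + cost[l] < cost[j]) {
                cost[j] = c[j][l] + cost[l];
                d[j] = l;
            }
    }
    int k = 0, u = 0;
    path[k++] = 0;                              /* recover the path via d[] */
    while (u != V - 1) { u = d[u]; path[k++] = u; }
    return cost[0];
}

int main(void) {
    int c[V][V];
    for (int i = 0; i < V; i++) for (int j = 0; j < V; j++) c[i][j] = INF;
    c[0][1] = 1; c[0][2] = 5; c[1][3] = 4; c[2][3] = 2;   /* two routes to the sink */
    int path[V];
    printf("min cost = %d\n", multistage_forward(c, path));   /* 5 via 0-1-3 */
    return 0;
}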

ALL PAIRS SHORTEST PATH

Finding all pairs shortest paths consists of finding the shortest distance between every pair of nodes in a (possibly directed) graph. Algorithms for the single source problem might also be used repeatedly, although they often perform worse or are harder to optimize. The common methods include the following: the Floyd-Warshall algorithm is an elegant, quickly implementable O(n³) algorithm (it assumes the absence of negatively-weighted cycles); Johnson's algorithm is harder to implement, but might perform better for sparse graphs.

FLOYD-WARSHALL ALGORITHM

In computer science, the Floyd-Warshall algorithm (sometimes known as the WFI algorithm or the Roy-Floyd algorithm, since Bernard Roy described this algorithm in 1959) is a graph analysis algorithm for finding shortest paths in a weighted, directed graph. A single execution of the algorithm will find the shortest paths between all pairs of vertices. It does so by incrementally improving an estimate on the shortest path between two vertices, until the estimate is known to be optimal. It is able to do this with only Θ(V³) comparisons. This is remarkable considering that there may be up to V² edges in the graph, and every combination of edges is tested. The Floyd-Warshall algorithm is an example of dynamic programming.

Algorithm

The Floyd-Warshall algorithm compares all possible paths through the graph between each pair of vertices. Consider a graph G with vertices V, each numbered 1 through N. Further consider a function shortestPath(i, j, k) that returns the shortest possible path from i to j using only vertices 1 through k as intermediate points along the way. Now, given this function, our goal is to find the shortest path from each i to each j using only nodes 1 through k + 1.

There are two candidates for this path: either the true shortest path only uses nodes in the set (1...k), or there exists some path that goes from i to k + 1, then from k + 1 to j, that is better. We know that the best path from i to j that only uses nodes 1 through k is defined by shortestPath(i, j, k), and it is clear that if there were a better path from i to k + 1 to j, then the length of this path would be the concatenation of the shortest path from i to k + 1 (using vertices in (1...k)) and the shortest path from k + 1 to j (also using vertices in (1...k)). Therefore, we can define shortestPath(i, j, k) in terms of the following recursive formula, which is the heart of Floyd-Warshall:

shortestPath(i, j, k) = min( shortestPath(i, j, k-1),
                             shortestPath(i, k, k-1) + shortestPath(k, j, k-1) )

The algorithm works by first computing shortestPath(i, j, 1) for all (i, j) pairs, then using that to find shortestPath(i, j, 2) for all (i, j) pairs, etc. This process continues until k = n, and we have found the shortest path for all (i, j) pairs using any intermediate vertices.

Pseudocode

Conveniently, when calculating the kth case, one can overwrite the information saved from the computation of k − 1. This means the algorithm uses quadratic memory. Be careful to note the initialization conditions (a runnable C version appears at the end of this section):

/* Assume a function edgeCost(i,j) which returns the cost of the edge
   from i to j (infinity if there is none).
   Also assume that n is the number of vertices and edgeCost(i,i) = 0. */

int path[][];
/* A 2-dimensional matrix. At each step in the algorithm, path[i][j] is the
   shortest path from i to j using intermediate vertices (1..k-1).
   Each path[i][j] is initialized to edgeCost(i,j), or infinity if there
   is no edge between i and j. */

procedure FloydWarshall()
    for k := 1 to n
        for each (i, j) in {1..n}²
            path[i][j] = min( path[i][j], path[i][k] + path[k][j] );

Analysis

To find all n² entries of the path matrix for a given k from those for k − 1 requires 2n² operations. Since we begin with the initial matrix and compute a sequence of n matrices, the total number of operations used is n · 2n² = 2n³. Therefore, the complexity of the algorithm is Θ(n³), and the problem can be solved by a deterministic machine in polynomial time.

In Warshall's original formulation of the algorithm, the graph is unweighted and represented by a Boolean adjacency matrix. Then the addition operation is replaced by logical conjunction (AND) and the minimum operation by logical disjunction (OR).

Applications and generalizations

The Floyd-Warshall algorithm can be used to solve the following problems, among others:

- Shortest paths in directed graphs (Floyd's algorithm).
- Transitive closure of directed graphs (Warshall's algorithm).
- Finding a regular expression denoting the regular language accepted by a finite automaton (Kleene's algorithm).
- Inversion of real matrices (Gauss-Jordan algorithm).
- Optimal routing: maximum bandwidth paths in flow networks. In this application one is interested in finding the path with the maximum flow between two vertices, so rather than taking minima as in the pseudocode above, one instead takes maxima. The edge weights represent fixed constraints on flow; path weights represent bottlenecks, so the addition operation above is replaced by the minimum operation.
- Testing whether an undirected graph is bipartite.
- Fast computation of Pathfinder networks.
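The promised runnable C version of the pseudocode above; N and the INF sentinel are assumptions of this sketch.

#include <stdio.h>

#define N 4
#define INF 1000000

/* Floyd-Warshall over an adjacency matrix: path[i][j] starts as the edge
   cost (INF if absent, 0 on the diagonal) and is improved in place. */
void floyd_warshall(int path[N][N]) {
    for (int k = 0; k < N; k++)
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                if (path[i][k] + path[k][j] < path[i][j])
                    path[i][j] = path[i][k] + path[k][j];
}

int main(void) {
    int path[N][N] = {
        {0,   5,   INF, 10 },
        {INF, 0,   3,   INF},
        {INF, INF, 0,   1  },
        {INF, INF, INF, 0  }
    };
    floyd_warshall(path);
    printf("shortest 0 -> 3: %d\n", path[0][3]);   /* 9, via 0-1-2-3 */
    return 0;
}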

JOHNSON'S ALGORITHM

Johnson's algorithm is a way to find the shortest paths between all pairs of vertices in a sparse directed graph. It allows some of the edge weights to be negative numbers, but no negative-weight cycles may exist. Johnson's algorithm consists of the following steps:

1. First, a new node q is added to the graph, connected by a zero-weight edge to each other node.
2. Second, the Bellman-Ford algorithm is used, starting from the new vertex q, to find for each vertex v the least weight h(v) of a path from q to v. If this step detects a negative cycle, the algorithm is terminated.
3. Next the edges of the original graph are reweighted using the values computed by the Bellman-Ford algorithm: an edge from u to v, having length w(u, v), is given the new length w(u, v) + h(u) − h(v).
4. Finally, for each node s, Dijkstra's algorithm is used to find the shortest paths from s to each other vertex in the reweighted graph.

In the reweighted graph, all modified edge lengths are non-negative, ensuring the optimality of the paths found by Dijkstra's algorithm. The distances in the original graph may be calculated from the distances calculated by Dijkstra's algorithm in the reweighted graph by reversing the reweighting transformation. Due to the way the values h(v) were computed, all paths between a pair s and t of nodes have the same quantity h(s) − h(t) added to them, so a path that is shortest in the original graph remains shortest in the modified graph and vice versa.

Analysis

The time complexity of this algorithm, using Fibonacci heaps in the implementation of Dijkstra's algorithm, is O(V² log V + VE): the algorithm uses O(VE) time for the Bellman-Ford stage, and O(V log V + E) for each of the V instantiations of Dijkstra's algorithm.

Thus, when the graph is sparse, the total time can be faster than the Floyd-Warshall algorithm, which solves the same problem in time O(V³).

Example

The first three stages of Johnson's algorithm are depicted in the illustration below. The graph on the left of the illustration has two negative edges, but no negative cycles. At the center is shown the new vertex q, a shortest path tree as computed by the Bellman-Ford algorithm with q as starting vertex, and the values h(v) computed at each other node as the length of the shortest path from q to that node. Note that these values are all non-positive, because q has a length-zero edge to each vertex and the shortest path can be no longer than that edge. On the right is shown the reweighted graph, formed by replacing each edge weight w(u, v) by w(u, v) + h(u) − h(v). In this reweighted graph, all edge weights are non-negative, but the shortest path between any two nodes uses the same sequence of edges as the shortest path between the same two nodes in the original graph. The algorithm concludes by applying Dijkstra's algorithm to each of the four starting nodes in the reweighted graph.
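The reweighting step is small enough to sketch directly. The Edge struct and array representation here are assumptions of this sketch, not a fixed API; the h[] values are those a Bellman-Ford pass from q would produce.

#include <stdio.h>

typedef struct { int u, v, w; } Edge;

/* Step 3 of Johnson's algorithm: with h[] from the Bellman-Ford pass,
   every edge length becomes non-negative. */
void reweight(Edge e[], int m, const int h[]) {
    for (int i = 0; i < m; i++)
        e[i].w = e[i].w + h[e[i].u] - h[e[i].v];   /* w'(u,v) >= 0 */
}

int main(void) {
    int h[] = {0, 0, -1};                           /* h(v) for vertices 0..2 */
    Edge e[] = { {0, 1, 3}, {1, 2, -1}, {0, 2, 2} };
    reweight(e, 3, h);
    printf("w'(1,2) = %d\n", e[1].w);               /* 0: the negative edge is gone */
    return 0;
}

A distance d'(s, t) computed by Dijkstra in the reweighted graph is turned back into an original distance by d(s, t) = d'(s, t) − h(s) + h(t), reversing the transformation as described above.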

SHORTEST PATH PROBLEM

[Figure: a graph with 6 vertices and 7 edges.]

In graph theory, the shortest path problem is the problem of finding a path between two vertices (or nodes) such that the sum of the weights of its constituent edges is minimized. An example is finding the quickest way to get from one location to another on a road map; in this case, the vertices represent locations and the edges represent segments of road and are weighted by the time needed to travel that segment.

Formally, given a weighted graph (that is, a set V of vertices, a set E of edges, and a real-valued weight function f : E → R) and one element v of V, find a path P from v to each v' of V so that the sum of f(e) over all edges e of P is minimal among all paths connecting v to v'.

The problem is also sometimes called the single-pair shortest path problem, to distinguish it from the following generalizations:

- The single-source shortest path problem, in which we have to find shortest paths from a source vertex v to all other vertices in the graph.
- The single-destination shortest path problem, in which we have to find shortest paths from all vertices in the graph to a single destination vertex v. This can be reduced to the single-source shortest path problem by reversing the edges in the graph.

- The all-pairs shortest path problem, in which we have to find shortest paths between every pair of vertices v, v' in the graph. These generalizations have significantly more efficient algorithms than the simplistic approach of running a single-pair shortest path algorithm on all relevant pairs of vertices.

Algorithms

The most important algorithms for solving this problem are the following (a minimal sketch of Dijkstra's algorithm appears at the end of this section):

- Dijkstra's algorithm solves the single-pair, single-source, and single-destination shortest path problems.
- Bellman-Ford algorithm solves the single source problem if edge weights may be negative.
- A* search algorithm solves for single pair shortest path using heuristics to try to speed up the search.
- Floyd-Warshall algorithm solves all pairs shortest paths.
- Johnson's algorithm solves all pairs shortest paths, and may be faster than Floyd-Warshall on sparse graphs.
- Perturbation theory finds (at worst) the locally shortest path.

Applications

Shortest path algorithms are applied to automatically find directions between physical locations, such as driving directions on web mapping websites like Mapquest or Google Maps. If one represents a nondeterministic abstract machine as a graph where vertices describe states and edges describe possible transitions, shortest path algorithms can be used to find an optimal sequence of choices to reach a certain goal state, or to establish lower bounds on the time needed to reach a given state. For example, if vertices represent the states of a puzzle like a Rubik's Cube and each directed edge corresponds to a single move or turn, shortest path algorithms can be used to find a solution that uses the minimum possible number of moves.

In a networking or telecommunications mindset, this shortest path problem is sometimes called the min-delay path problem, and is usually tied with a widest path problem. For example, the algorithm may seek the shortest (min-delay) widest path, or the widest shortest (min-delay) path.

A more lighthearted application is the games of "six degrees of separation" that try to find the shortest path in graphs like movie stars appearing in the same film. Other applications include "operations research, plant and facility layout, robotics, transportation, and VLSI design".

Related problems

For shortest path problems in computational geometry, see Euclidean shortest path. The traveling salesman problem is the problem of finding the shortest path that goes through every vertex exactly once and returns to the start. Unlike the shortest path problem, this problem is NP-complete and, as such, is believed not to be efficiently solvable (see P = NP problem). The problem of finding the longest path in a graph is also NP-complete. The Canadian traveller problem and the stochastic shortest path problem are generalizations where either the graph isn't completely known to the mover, changes over time, or where actions (traversals) are probabilistic. The problems of recalculation of shortest paths arise if some graph transformations (e.g., shrinkage of nodes) are made with a graph.
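As promised above, a minimal O(V²) C sketch of Dijkstra's algorithm for the single-source problem; the adjacency matrix, the INF sentinel, and non-negative weights are assumptions of this sketch.

#include <stdio.h>

#define N 5
#define INF 1000000

/* Dijkstra: dist[] receives the shortest distances from src. */
void dijkstra(int g[N][N], int src, int dist[N]) {
    int done[N] = {0};
    for (int i = 0; i < N; i++) dist[i] = INF;
    dist[src] = 0;
    for (int it = 0; it < N; it++) {
        int u = -1;
        for (int i = 0; i < N; i++)          /* pick closest unfinished vertex */
            if (!done[i] && (u < 0 || dist[i] < dist[u])) u = i;
        if (dist[u] == INF) break;           /* remaining vertices unreachable */
        done[u] = 1;
        for (int v = 0; v < N; v++)          /* relax edges out of u */
            if (g[u][v] < INF && dist[u] + g[u][v] < dist[v])
                dist[v] = dist[u] + g[u][v];
    }
}

int main(void) {
    int g[N][N] = {
        { 0,   10,  INF, 30,  100 },
        { INF, 0,   50,  INF, INF },
        { INF, INF, 0,   INF, 10  },
        { INF, INF, 20,  0,   60  },
        { INF, INF, INF, INF, 0   }
    };
    int dist[N];
    dijkstra(g, 0, dist);
    printf("dist to 4 = %d\n", dist[4]);     /* 60, via 0-3-2-4 */
    return 0;
}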

LOWER BOUND THEORY

Our main task for each problem is to obtain a correct and efficient solution. If two algorithms solving the same problem were discovered and their times differed by an order of magnitude, then the one with the smaller order was generally regarded as superior. To establish that a given algorithm is the most efficient possible, we need a lower bound, that is, a lower bound on the time required by any algorithm for the problem. Deriving good lower bounds is often more difficult than devising efficient algorithms. Perhaps this is because a lower bound states a fact about all possible algorithms for solving a problem; usually we cannot enumerate and analyze all these algorithms.

There is a mathematical notation for expressing lower bounds. If f(n) is the time for some algorithm, then we write f(n) = Ω(g(n)) to mean that g(n) is a lower bound for f(n). Formally, this equation can be written if there exist positive constants c and n0 such that |f(n)| ≥ c|g(n)| for all n > n0. In addition to developing lower bounds to within a constant factor, we are also concerned with determining more exact bounds.

For many problems it is possible to easily observe that a lower bound identical to n exists, where n is the number of inputs (or possibly outputs) to the problem. For example, consider all algorithms that find the maximum of an unordered set of n integers. Clearly every integer must be examined at least once, so Ω(n) is a lower bound for any algorithm that solves this problem. Bounds such as these are often referred to as trivial lower bounds because they are so easy to obtain. We know how to find the maximum of n elements by an algorithm that uses only n−1 comparisons, so there is no gap between the upper and lower bounds for this problem.

Suppose we wish to find an algorithm that efficiently multiplies two n×n matrices. Then Ω(n²) is a lower bound on any such algorithm, since there are 2n² inputs that must be examined and n² outputs that must be computed. But the best known algorithm for matrix multiplication requires O(n^(2+ε)) operations (ε > 0), and so there is no reason to believe that a better method cannot be found. If we have an algorithm whose computing time is the same order as g(n), then we know that asymptotically we can do no better.

. Palai . . then we are to determine an I between 1 and n such that A[i] = x. . < A[m] and B[1] < .Algorithm Analysis and Design (R 606) 128 COMPARISON TREES FOR SEARCHING AND SORTING A comparison tree for binary search on an array is a binary tree that depicts all possible search paths. The ordered searching problem asks whether a given element x E S occurs within the elements in A[1:n]. X:A(1) Failure X: A(2) Failure X:A(n) Failure Failure Suppose that we are given a set S of distinct values on which an ordering relation < holds. . For all these problems we restrict the class of algorithms we are considering to those which work solely by making comparisons between elements. B[n]. .Thes algorithms are referred to as Department of Computer Science & Engineering SJCET.e it shows all different sequences of comparisons undertaken by binary search when searching for keys that may or may not be present in the array. though it is possible for the algorithm to move elements around.. .< A[p(n)]. such that the n distinct values from S stored in A[1:n] satisfy A[p(1)] < A[p(2)] <.i. these m+n values are to be rearranged into an array C[1:m+n] so that C[1] < . No arithmetic involving elements is permitted. < C[m+n]. The merging problem assumes that two ordered sets of distinct inputs from S are given in A[1:m] and B[1:n] such that A[1] < . . The sorting problem calls for determining a permutation of the integers 1 to n. say p(1) to p(n). .

We rule out algorithms such as radix sort that decompose the values into subparts.

a) Ordered Searching

In obtaining the lower bound for the ordered searching problem, we consider only those comparison-based algorithms in which every comparison between two elements of S is of the type "compare x and A[i]". Each internal node in the binary tree represents a comparison between x and an A[i]. If x = A[i], the algorithm terminates. The left branch is taken if x < A[i]; otherwise the right branch is taken. The external nodes represent termination of the algorithm. If the algorithm terminates following a left or right branch, then no i has been found such that x = A[i], and the algorithm must declare the search unsuccessful.

b) Sorting

We can describe any sorting algorithm that satisfies the restrictions of the comparison tree model by a binary tree. Consider the case in which the n numbers A[1:n] to be sorted are distinct. Now, any comparison between A[i] and A[j] must result in one of two possibilities: either A[i] < A[j] or A[i] > A[j]. So, the comparison tree is a binary tree in which each internal node is labeled by the pair i : j, which represents the comparison of A[i] with A[j]. If A[i] < A[j], then the algorithm proceeds down the left branch of the tree; otherwise it proceeds down the right branch. Associated with every path from the root to an external node is a unique permutation.

c) Selection

Any comparison tree that models comparison-based algorithms for finding the maximum of n elements has at least 2^(n-1) external nodes, since each path from the root to an external node must contain at least n-1 internal nodes. This implies at least n-1 comparisons, for otherwise at least two of the input items never lose a comparison and the largest is not yet found.

Comparison trees

A comparison tree for sorting has n! leaves: every permutation must appear as a leaf. The worst-case number of comparisons equals the height of the tree.

[Figure: the comparison tree for insertion sort of three items a1, a2, a3, with root "a1 < a2?" and leaves giving the six possible orderings.]

LOWER BOUND FOR COMPARISON-BASED SORTING

It is generally much more difficult to show that an algorithm is "best possible" for a certain task than to come up with a "good" algorithm for that task. In the latter case, one need only argue about one algorithm: the one that is supposedly "good". In the former case, one must argue about all possible algorithms, even those which one knows nothing about! To answer the question "what's the best way to do a job" we must start by fixing the set of tools that may be used to do the job. For the problem at hand, sorting, we will take pairwise comparisons of keys as the available tools. Of course, this leaves Bin sorting (which is not based on key comparisons) out of consideration, but we can justify this on the ground that Bin sorting is not a "general" sorting method: it is applicable only if keys have a particular form. At any rate, it's important to keep in mind that the result we are about to present applies only to sorting algorithms based on comparisons.

The main theoretical device we'll use to analyze our problem is a decision (or comparison) tree. This is a useful way of representing any comparison-based algorithm that sorts n keys (for any given n). Suppose we want to sort n keys K1, K2, ..., Kn. Let's assume that all keys are distinct, so that for any Ki, Kj where i ≠ j, either Ki < Kj or Ki > Kj. Before introducing the formal definition, let's develop some intuition about decision trees by studying an example.

Example 1: Below is a decision tree that corresponds to one possible algorithm for sorting 3 keys.

[Figure: a decision tree with root 1:2. Left subtree: node 2:3 with left leaf 123 and right child 1:3 (left leaf 132, right leaf 312). Right subtree: node 2:3 with left child 1:3 (left leaf 213, right leaf 231) and right leaf 321.]

The internal nodes of this tree correspond to comparisons the algorithm makes. The labels in those nodes specify which keys are to be compared: for example, the label "1:2" of the root indicates that K1 is to be compared with K2. Depending on the outcome of the comparison in a node, the left or the right branch out of that node is taken. The label of each leaf is the permutation specifying how to rearrange the keys to get them sorted. Thus the decision tree specifies the sequence of comparisons that the algorithm will perform to sort 3 keys. An execution of the algorithm for a specific input (of 3 keys) corresponds to a path from root to a leaf.

For example, suppose that the input keys are K1 = c, K2 = a, K3 = b (where a < b < c). The execution starts at the root of the tree. The node there specifies that K1 and K2 are to be compared. Since K1 > K2, we take the right branch from the root and arrive at the "2:3" node. This indicates we must compare K2 to K3, and since K2 < K3, we take the left branch and arrive at the "1:3" node. By comparing K1 and K3 we discover that K1 > K3, and therefore we take the right branch, which leads us to the leaf labeled "2,3,1". This indicates that the keys in sorted order are K2 < K3 < K1, as indeed is the case.

Note that the fact that this is the order of the three keys is implied by the outcomes of the comparisons made in the path from the root to that leaf: to get to that leaf we must go from the root to the right (signifying that K1 > K2), then from node "2:3" to the left (signifying that K2 < K3), and from node "1:3" to the right (signifying that K1 > K3). These three comparisons imply that K2 < K3 < K1. This is the case for the path from the root to any leaf.

A more programming-language-like description of the algorithm corresponding to the above decision tree is:

if K1 < K2 then
    if K2 < K3 then sorted order of keys is K1, K2, K3
    else {K2 > K3}
        if K1 < K3 then sorted order of keys is K1, K3, K2
        else {K1 > K3} sorted order of keys is K3, K1, K2
else {K1 > K2}
    if K2 < K3 then
        if K1 < K3 then sorted order of keys is K2, K1, K3
        else {K1 > K3} sorted order of keys is K2, K3, K1
    else {K2 > K3} sorted order of keys is K3, K2, K1

Using the intuition we have gained by studying this example, let us now give the formal definition of a decision tree.

Definition 1: A decision tree of order n is a binary tree such that:

(1) It has n! leaves, each labeled by a different permutation of 1, 2, ..., n. (The idea is that leaves correspond to the possible outcomes of sorting n distinct keys.)

(2) Internal nodes are labeled by pairs of indices of the form "i : j", where 1 ≤ i ≠ j ≤ n. Internal node "i : j" corresponds to the comparison of Ki and Kj. If the outcome is Ki < Kj, then the left subtree of the node "i : j" contains the subsequent comparisons made until the order of the keys is determined. Symmetrically, if the outcome is Ki > Kj, the right subtree of "i : j" contains the subsequent comparisons.

(3) In the path from the root to a leaf p(1) p(2) ... p(n), where p is a permutation of 1, 2, ..., n, for each i < n there is either a node "p(i) : p(i+1)", in which case the path goes from that node to its left child, or a node "p(i+1) : p(i)", in which case the path goes from that node to its right child. (The path may contain other nodes too, but must contain at least these.) Part (3) of the definition essentially requires that every relationship determined by the algorithm must have been established by actual comparisons: the algorithm cannot "guess".

Any comparison-based algorithm for sorting n keys corresponds to a decision tree of order n. The execution of the algorithm on an input sequence of keys K1, K2, ..., Kn follows the path from the root to the leaf labeled by the permutation p such that Kp(1) < Kp(2) < ... < Kp(n). Note that for a given sorting algorithm we need a different decision tree for each n to represent all possible executions of that algorithm when the input consists of n keys.

Example 2: Consider the algorithm for insertion sort:

INSERTION-SORT(A)
begin
    for i := 2 to n do
        for j := i down to 2 do
            if A[j] < A[j-1] then A[j] ↔ A[j-1]
            else break
end

WORST CASE LOWER BOUND ON THE NUMBER OF COMPARISONS

Let Cn denote the minimum number of comparisons required to sort n keys in the worst case. For any decision tree, the worst-case number of comparisons required by the algorithm represented by that tree is precisely the height of the tree.

Lemma: Any binary tree of height h has at most 2^h leaves.

Proof: Trivial induction on h.

By definition, any decision tree of order n is a binary tree with n! leaves. Thus, by the lemma, it must have height at least ⌈log₂ n!⌉. Stirling's approximation says that n! ≈ √(2πn) (n/e)^n, so

log n! ≥ log (n/e)^n = n log n − n log e = Ω(n log n)

from which it follows that Cn = Ω(n log n).

AVERAGE CASE LOWER BOUND ON THE NUMBER OF COMPARISONS

Consider a decision tree of order n representing some sorting algorithm. If we assume that all initial arrangements of the keys to be sorted are equally likely, the average case number of comparisons performed by that algorithm is equal to the external path length of the decision tree divided by the number of leaves in the tree. (Recall that the external path length of a tree is the sum of the depths of all leaves.) Let C̄n be the minimum average case number of comparisons needed to sort n keys.

The following fact can be easily proved:

Fact: The tree that minimizes the external path length has all leaves in at most two depths, d and d−1.

By the above facts we may assume that in the decision tree that minimizes the average number of comparisons, all leaves have depth d or d−1 for some d. Let Nd and Nd−1 be the number of leaves in the corresponding decision tree at depth d and d−1 respectively. Thus we have

C̄n = [(d−1)·Nd−1 + d·Nd] / n!     (1)

But the total number of leaves gives

Nd−1 + Nd = n!     (2)

and, counting the maximum possible number of nodes at depth d,

2·Nd−1 + Nd = 2^d     (3)

Equation (2) says that the total number of leaves in the tree is n!. For equation (3), note that if we gave 2 children to each leaf of depth d−1, we would get the maximum possible number of nodes at depth d, which is 2^d. Solving (2) and (3) for Nd and Nd−1, we get

Nd−1 = 2^d − n!     (4)
Nd = 2·n! − 2^d     (5)

Substituting (4) and (5) in (1) we obtain

C̄n = [(d−1)(2^d − n!) + d(2·n! − 2^d)] / n! = d + 1 − 2^d/n!

Since n! ≤ 2^d ≤ 2·n!, the last term lies between 1 and 2, so C̄n ≥ d − 1 ≥ log n! − 1. Hence, any comparison-based sorting algorithm requires Ω(n log n) comparisons both in the worst and in the average case.

ORACLES AND ADVERSARY ARGUMENTS

One of the proof techniques that is useful for obtaining lower bounds consists of making use of an oracle. The most famous oracle in history was called the Delphic oracle, located in Delphi in Greece. This oracle can still be found, situated on the side of a hill embedded in some rocks. In olden times people would approach the oracle and ask it a question. After some period of time elapsed, the oracle would reply and a caretaker would interpret the oracle's answer.

A similar phenomenon takes place when we use an oracle to establish a lower bound. Given some model of computation, such as comparison trees, the oracle tells us the outcome of each comparison. To derive a good lower bound, the oracle tries its best to cause the algorithm to work as hard as it can. It does this by choosing, as the outcome of the next test, the result that causes the most work to be required to determine the final answer. By keeping track of the work that is done, a worst-case lower bound for the problem can be derived.

MERGING

Now we consider the merging problem. Given the sets A[1:m] and B[1:n], we investigate lower bounds for algorithms that merge these two sets to give a single sorted set. As was the case for sorting, we assume that all the m+n elements are distinct and that A[1] < A[2] < ... < A[m] and B[1] < B[2] < ... < B[n]. It is possible that after these two sets are merged, the n elements of B can be interleaved within A in every possible way. Elementary combinatorics tells us that there are C(m+n, n) ways that the A's and B's can merge together while preserving the ordering within A and B. For example, if m = 3, n = 2, A[1] = x, A[2] = y, A[3] = z, B[1] = u, B[2] = v, there are C(5, 2) = 10 ways in which A and B can merge: uvxyz, uxvyz, uxyvz, uxyzv, xuvyz, xuyvz, xuyzv, xyuvz, xyuzv, xyzuv.

Thus, if we use comparison trees as our model for merging algorithms, then there will be C(m+n, n) external nodes, and therefore at least ⌈log C(m+n, n)⌉ comparisons are required by any comparison-based merging algorithm. The conventional merging algorithm takes m+n−1 comparisons. If we let MERGE(m, n) be the minimum number of comparisons needed to merge m items with n items, then we have the inequality

⌈log C(m+n, n)⌉ ≤ MERGE(m, n) ≤ m + n − 1

The exercises show that these upper and lower bounds can get arbitrarily far apart as m gets much smaller than n. This should not be a surprise, because the conventional algorithm is designed to work best when m and n are about equal, whereas binary insertion requires the fewest number of comparisons needed to merge A[1] into B[1:n] when m = 1. When m and n are equal, the lower bound given by the comparison tree model is too low, and the number of comparisons for the conventional merging algorithm can be shown to be optimal.
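The two sides of this inequality are easy to tabulate. A small C check, using log2 from math.h, for the m = 3, n = 2 example:

#include <stdio.h>
#include <math.h>

/* log2 of the binomial coefficient C(m+n, n), computed as
   log2 of the product (m+1)/1 * (m+2)/2 * ... * (m+n)/n. */
double log2_binomial(int m, int n) {
    double s = 0.0;
    for (int i = 1; i <= n; i++)
        s += log2((double)(m + i) / i);
    return s;
}

int main(void) {
    int m = 3, n = 2;
    printf("lower bound = %d, upper bound = %d\n",
           (int)ceil(log2_binomial(m, n)), m + n - 1);   /* 4 vs 4 */
    return 0;
}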

Theorem: MERGE(m, m) = 2m-1, for m ≥ 1.

Proof: Consider any algorithm that merges the two sets A[1]<…<A[m] and B[1]<…<B[m]. We already have an algorithm that requires 2m-1 comparisons, so it suffices to show that MERGE(m, m) ≥ 2m-1; then the theorem follows. Consider any comparison-based algorithm working on the input for which the B's and A's alternate:

B[1]<A[1]<B[2]<A[2]<…<B[m]<A[m]

Any merging algorithm must make each of the 2m-1 comparisons B[1]:A[1], A[1]:B[2], B[2]:A[2], …, B[m]:A[m] while merging this input. To see this, suppose that a comparison of type A[i]:B[i+1] is not made. Then the algorithm will not be able to distinguish between the ordering above and the one in which

B[1]<A[1]<B[2]<A[2]<…<A[i-1]<B[i]<B[i+1]<A[i]<A[i+1]<…<B[m]<A[m]

If a comparison of type B[i]:A[i] is not made, then the algorithm cannot distinguish between the previous ordering and the one in which

B[1]<A[1]<…<A[i-1]<A[i]<B[i]<B[i+1]<…<B[m]<A[m]

So the algorithm will not necessarily merge the A's and B's properly. Hence any algorithm must make all 2m-1 comparisons to produce the correct result, and the theorem follows.

INSERTION SORT

If the first few objects are already sorted, an unsorted object can be inserted in the sorted set in its proper place. This is called insertion sort. The algorithm considers the elements one at a time, inserting each in its suitable place among those already considered (keeping them sorted). Insertion sort is an example of an incremental algorithm; it builds the sorted sequence one number at a time.

INSERTION_SORT (A)
1. for j = 2 to length[A] do
2.     key = A[j]
3.     {Put A[j] into the sorted sequence A[1 .. j-1]}
4.     i ← j - 1
5.     while i > 0 and A[i] > key do
6.         A[i+1] = A[i]

7.         i = i - 1
8.     A[i+1] = key

Analysis

Best-Case

The while-loop in line 5 is executed only once for each j. This happens if the given array A is already sorted.

T(n) = an + b = O(n)

It is a linear function of n.

Worst-Case

The worst case occurs when line 5 is executed j times for each j. This happens if array A starts out in reverse order.

T(n) = an^2 + bn + c = O(n^2)

It is a quadratic function of n.

[Graph: the n^2 growth of insertion sort's running time.]

The time taken by insertion sort depends on the original order of the input. It takes a time in Ω(n^2) in the worst case, despite the fact that a time on the order of n is sufficient to solve large instances in which the items are already sorted. For insertion sort we say the worst-case running time is θ(n^2), and the best-case running time is θ(n).

Stability

Since multiple keys with the same value are placed in the sorted array in the same order that they appear in the input array, insertion sort is stable.

Extra Memory

Insertion sort uses no extra memory; it sorts in place.

Implementation

void insertionSort(int numbers[], int array_size)
{
    int i, j, index;

    for (i = 1; i < array_size; i++)
    {
        index = numbers[i];
        j = i;
        while ((j > 0) && (numbers[j-1] > index))
        {
            numbers[j] = numbers[j-1];
            j = j - 1;
        }
        numbers[j] = index;
    }
}

Adversary argument

• Suppose we have an algorithm we think is efficient. Imagine an adversary that tries to prove otherwise.
• At each point in the algorithm, whenever a decision (i.e., a key comparison) is made, the adversary tells us the result of the decision.
• The adversary chooses the answer which tries to force the algorithm to work hard (i.e., the answer releases as little new information as possible).
• The only requirement on the answers is that they must be internally consistent.
• If the adversary can force the algorithm to perform f(n) steps, then f(n) is a lower bound, i.e., at least that many steps are needed in the worst case.
• You can think of the adversary as constructing a "bad" input while it is answering the questions.

Simply put:

• Play a guessing game between you and your friend: you are to pick a date, and the friend will try to guess the date by asking YES/NO questions.
• Your purpose is to force your friend to ask as many questions as possible.
• Idea: you do not pick a date in advance at all, but construct a date according to the questions. The only requirement is that the finally constructed date must be consistent with all your answers to the questions.
• To the question "is it in winter?", your answer should be NO (the answer that keeps more months alive).
• To the question "is the first letter of the month's name in the first half of the alphabet?", your answer should be YES.
• This looks like cheating, but it is a good way to find the lower bound.
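The adversary strategy can be made concrete in a few lines of code. The sketch below (an added illustration, not from the text) plays the adversary for the game "guess a number between 1 and n by asking: is it ≤ x?": it maintains the interval of values consistent with all answers so far and always answers so that the larger half survives, forcing any guesser to ask at least ⌈log2 n⌉ questions.

#include <stdio.h>

/* Adversary for "guess a number in [lo, hi] by asking: is it <= x?"
   It keeps the set of values consistent with every answer and always
   keeps the larger half, so no guesser can finish in fewer than
   ceil(log2(n)) questions. */
static int lo = 1, hi = 1000;

int answer_is_le(int x) {
    if (x < lo) return 0;              /* everything still consistent is > x  */
    if (x >= hi) return 1;             /* everything still consistent is <= x */
    int left  = x - lo + 1;            /* values that would remain on YES */
    int right = hi - x;                /* values that would remain on NO  */
    if (left >= right) { hi = x; return 1; }      /* YES keeps the bigger half */
    else               { lo = x + 1; return 0; }  /* NO keeps the bigger half  */
}

int main(void) {
    int questions = 0;
    while (lo < hi) {                  /* a binary-search guesser */
        int mid = (lo + hi) / 2;
        answer_is_le(mid);
        questions++;
    }
    printf("forced %d questions; the number is fixed only now: %d\n",
           questions, lo);
    return 0;
}

For n = 1000 the adversary forces 10 questions, which is exactly ⌈log2 1000⌉.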

Lower Bound for Algorithms that Remove at most One Inversion per Comparison

We show that all sorting algorithms that sort only by comparisons of keys, and accomplish only a limited amount of rearranging after each comparison, require at least quadratic time. Insertion sort is such an algorithm: after each comparison it either does nothing or moves the key in the jth slot to the (j+1)st slot, and by moving the key in the jth slot up one slot we have remedied only the fact that x (the number being compared with) should come before that key. Clearly, this is all we have accomplished.

We obtain our results under the assumption that the keys to be sorted are distinct. The worst-case bound still holds true with this restriction removed, because a lower bound on the worst-case performance for some subset of inputs is also a lower bound when all inputs are considered.

In general, we are concerned with sorting n distinct keys that come from any ordered set. However, without loss of generality, we can assume that the keys to be sorted are simply the positive integers 1, 2, …, n, because we can substitute 1 for the smallest key, 2 for the second smallest, and so on. For example, suppose we have the input [Ralph, Clyde, Dave]. We can associate 1 with Clyde, 2 with Dave, and 3 with Ralph; to obtain the equivalent input, we get [3, 1, 2]. Any algorithm that sorts these integers only by comparisons of keys would have to do the same number of comparisons to sort the three names.

A permutation of the first n positive integers can be thought of as an ordering of those integers. We denote a permutation by [k1, k2, …, kn], where ki is the integer at the ith position. For the permutation [3, 1, 2], for example, k1=3, k2=1, k3=2. There are n! different orderings of those integers. For example, the following six permutations are all the orderings of the first three positive integers:

[1,2,3] [1,3,2] [2,1,3] [2,3,1] [3,1,2] [3,2,1]

These six permutations are the different inputs of size three. This means that there are n! different inputs (to a sorting algorithm) containing n different keys. Given a permutation [k1, k2, …, kn], a pair (s, r), where r and s are integers between 1 and n with s > r, is called an inversion if s appears before r in the permutation.
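To make the notion of an inversion concrete, here is a small C helper (added for illustration; the function name is ours, not the textbook's) that counts the inversions of a permutation, applied to the six permutations of size three.

#include <stdio.h>

/* Count pairs (i, j) with i < j and k[i] > k[j]; each such pair is one
   inversion of the permutation k[0..n-1]. */
int count_inversions(const int k[], int n) {
    int count = 0;
    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++)
            if (k[i] > k[j])
                count++;
    return count;
}

int main(void) {
    int perms[6][3] = {
        {1,2,3}, {1,3,2}, {2,1,3}, {2,3,1}, {3,1,2}, {3,2,1}
    };
    for (int p = 0; p < 6; p++)
        printf("[%d,%d,%d] has %d inversion(s)\n",
               perms[p][0], perms[p][1], perms[p][2],
               count_inversions(perms[p], 3));
    return 0;   /* prints 0, 1, 1, 2, 2, 3 respectively */
}

Note that the average over all six permutations is 1.5 = n(n-1)/4 for n = 3, matching the claim proved next.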

Given a permutation, its transpose is the permutation with the order of the integers reversed. For any pair (s, r) with s > r, the pair is an inversion in either the permutation or its transpose, but not both. Showing that there are n(n-1)/2 such pairs of integers between 1 and n is left as an exercise. This means that a permutation and its transpose have exactly n(n-1)/2 inversions between them, so the average number of inversions in a permutation and its transpose is

(1/2) × n(n-1)/2 = n(n-1)/4

Therefore, if we consider all permutations equally probable for the input, the average number of inversions in the input is also n(n-1)/4. Insertion sort removes at most one inversion (the one consisting of S[j] and x) after each comparison, and therefore this algorithm is in the class of algorithms addressed by the theorem: on the average it must do at least n(n-1)/4 comparisons to remove all inversions and thereby sort the input. Because insertion sort's worst-case time complexity is n(n-1)/2 and its average-case time complexity is about n^2/4, it is about as good as we can hope to do (as far as comparisons of keys are concerned) with algorithms that sort only by comparisons of keys and remove at most one inversion after each comparison.

SELECTION SORT

This type of sorting is called "Selection Sort" because it works by repeatedly selecting an element. It works as follows: first find the smallest element in the array and exchange it with the element in the first position, then find the second smallest element and exchange it with the element in the second position, and continue in this way until the entire array is sorted.

SELECTION_SORT (A)
for i ← 1 to n-1 do
    min j ← i
    min x ← A[i]
    for j ← i + 1 to n do
        if A[j] < min x then
            min j ← j
            min x ← A[j]
    A[min j] ← A[i]
    A[i] ← min x



Selection sort is among the simplest of sorting techniques, and it works very well for small files. Furthermore, despite its evidently naïve approach, selection sort has a quite important application: because each item is actually moved at most once, selection sort is a method of choice for sorting files with very large objects (records) and small keys.

The worst case occurs if the array is already sorted in descending order. Nonetheless, the time required by the selection sort algorithm is not very sensitive to the original order of the array to be sorted: the test "if A[j] < min x" is executed exactly the same number of times in every case. The variation in time is only due to the number of times the "then" part (i.e., min j ← j; min x ← A[j]) of this test is executed.

The selection sort spends most of its time trying to find the minimum element in the "unsorted" part of the array. This clearly shows the similarity between selection sort and bubble sort: bubble sort "selects" the maximum remaining element at each stage, but wastes some effort imparting some order to the "unsorted" part of the array. Selection sort is quadratic in both the worst and the average case, and requires no extra memory.

For each i from 1 to n-1, there is one exchange and n-i comparisons, so there is a total of n-1 exchanges and (n-1) + (n-2) + … + 2 + 1 = n(n-1)/2 comparisons. These observations hold no matter what the input data is. The number of times the "then" part is executed could, in the worst case, be quadratic, but in the average case this quantity is only O(n log n). It implies that the running time of selection sort is quite insensitive to the input.

Implementation

void selectionSort(int numbers[], int array_size)
{
    int i, j;
    int min, temp;

    for (i = 0; i < array_size-1; i++)
    {
        min = i;
        for (j = i+1; j < array_size; j++)
        {
            if (numbers[j] < numbers[min])
                min = j;
        }
        temp = numbers[i];
        numbers[i] = numbers[min];
        numbers[min] = temp;
    }
}


SELECTION OF KTH SMALLEST ELEMENT
In computer science, a selection algorithm is an algorithm for finding the k-th smallest number in a list (such a number is called the k-th order statistic). This includes the cases of finding the minimum, maximum, and median elements. There are worst-case linear time selection algorithms. Selection is a subproblem of more complex problems like the nearest neighbor problem and shortest path problems.

A worst-case linear algorithm for the general case of selecting the k-th largest element was published by Blum, Floyd, Pratt, Rivest, and Tarjan in their 1973 paper "Time bounds for selection", sometimes called BFPRT after the last names of the authors. The algorithm that it is based on was conceived by the inventor of quicksort, C. A. R. Hoare, and is known as Hoare's selection algorithm or quickselect.

In quicksort, there is a subprocedure called partition that can, in linear time, group a list (ranging from indices left to right) into two parts: those less than a certain element, and those greater than or equal to the element. Here is pseudocode that performs a partition about the element list[pivotIndex]:

Algorithm
function partition(list, left, right, pivotIndex)
    pivotValue := list[pivotIndex]
    swap list[pivotIndex] and list[right]  // Move pivot to end

    storeIndex := left
    for i from left to right-1
        if list[i] < pivotValue
            swap list[storeIndex] and list[i]
            storeIndex := storeIndex + 1
    swap list[right] and list[storeIndex]  // Move pivot to its final place
    return storeIndex


In quicksort, we recursively sort both branches, leading to best-case Ω(n log n) time. However, when doing selection, we already know which partition our desired element lies in, since the pivot is in its final sorted position, with all those preceding it in sorted order preceding it and all those following it in sorted order following it. Thus a single recursive call locates the desired element in the correct partition:

Algorithm
function select(list, left, right, k)
    select pivotIndex between left and right
    pivotNewIndex := partition(list, left, right, pivotIndex)
    if k = pivotNewIndex
        return list[k]
    else if k < pivotNewIndex
        return select(list, left, pivotNewIndex-1, k)
    else
        return select(list, pivotNewIndex+1, right, k)

Note the resemblance to quicksort; indeed, just as the minimum-based selection algorithm is a partial selection sort, this is a partial quicksort, generating and partitioning only O(log n) of its O(n) partitions. This simple procedure has expected linear performance and, like quicksort, has quite good performance in practice. It is also an in-place algorithm, requiring only constant memory overhead, since the tail recursion can be eliminated with a loop like this:

Algorithm
function select(list, left, right, k)
    loop
        select pivotIndex between left and right
        pivotNewIndex := partition(list, left, right, pivotIndex)
        if k = pivotNewIndex
            return list[k]
        else if k < pivotNewIndex
            right := pivotNewIndex-1
        else
            left := pivotNewIndex+1

Like quicksort, the performance of the algorithm is sensitive to the pivot that is chosen. If bad pivots are consistently chosen, this degrades to the minimum-based selection described previously, and so can require as much as O(n^2) time. David Musser describes a "median-of-3 killer" sequence that can force the well-known median-of-three pivot selection algorithm to fail with worst-case behavior.

The minimum-based algorithm (selection sort) works as follows:

1. Find the minimum value in the list
2. Swap it with the value in the first position
3. Repeat the steps above for the remainder of the list (starting at the second position)

Effectively, we divide the list into two parts: the sublist of items already sorted, which we build up from left to right and is found at the beginning, and the sublist of items remaining to be sorted, occupying the remainder of the array. Here is an example of this sort algorithm sorting five elements:

64 25 12 22 11
11 25 12 22 64
11 12 25 22 64
11 12 22 25 64

Selection sort can also be used on list structures that make add and remove efficient, such as a linked list. In this case it is more common to remove the minimum element from the remainder of the list and then insert it at the end of the values sorted so far. For example:

64 25 12 22 11
11 64 25 12 22
11 12 64 25 22
11 12 22 64 25
11 12 22 25 64

. Thus implicit constraints describe the way in which xi must relate to each other. In many applications of the backtrack method. x[ ] and //n are global. the first //k-1 values x[1]..x[k])!=0) then { Department of Computer Science & Engineering SJCET.…. On entering. where the xi are chosen from some from finite set Si. x[2].. The explicit constraints depend on the particular instance I of the problem being solved. The criterion function P is the inequality a[xi]≤a[xi+1] for 1 ≤ i < n. The set Si is finite and includes the integers 1 through n. Sometimes it seeks all vectors that satisfy P.….xn). the desired solution is expressible as an n-tuple(x1.x[k-1]) do { if (Bk(x[1]....Algorithm Analysis and Design (R 606) 150 BACKTRACKING Many problems with searching a set of solutions or which ask for an optimal solution satisfying some constraints can be solved using the backtracking formulation. { for (each x[k] Є T(x[1]..…..….x[k-1] of the solution vector x[1:n] have been assigned. the implicit constraints are the rules that determine which of the tuples in the solution space of I satisfy the criterion function. Often the problem to be solved calls for finding one vector that maximizes (or minimizes or satisfies) a criterion function P(x1. CONTROL ABSTRACTION (i) Recursive backtracking algorithm Algorithm Backtrack(k) //This schema describes the backtracking process using recursion..x[2]. All tuples that satisfy the explicit constraints define a possible solution space for I..…. For any problem these constraints can be classified into two categories: explicit and implicit. Many of the problems we solve using backtracking require that all the solutions satisfy a complex set of constraints.xn). Palai .

x2.….xn) =Ф. The last unresolved call now resumes. Palai . When the ‗for‘ loop is exited. We assume the existence of bounding function Bi+1 (expressed as predicates) such that if Bi+1(x1..…. if (k<n) then backtrack(k+1).x2.xi+1) is also a path to a problem state.xi) be the set of all possible values for xi+1 such that (x1. and adjoined to the current vector (x1.. a check is made to determine whether a solution has been found.. no more values for xk exist and the current copy of Backtrack ends. T(x1..….xi+1) is false for a path (x1.xn) are those values which are generated by T and satisfy Bi+1.x2. { Department of Computer Science & Engineering SJCET.x2..x2...…..xk-1).. (ii)General iterative backtracking method Algorithm IBacktrack(n) //This schema describes the backtracking process.x[2]. Each time xk is attached.….xn). The solution vector (x1. All the possible elements for the kth position of the tuple that satisfy Bk are generated.x2.x2.Algorithm Analysis and Design (R 606) if (x[1]..xi+1) from the root node to a problem state.….…. one by one.…..x2. is treated as a global array x[1:n].….x[k] is a path to an answer node) then write (x[1:k])..x2. The above recursive algorithm is initially invoked by Backtrack(1). Thus the candidates for position i+1 of the solution vector (x1. } } } Explanation 151 Let (x1.xi) be a path from the root to a node in a state space tree. then the path cannot be extended to reach an answer node. //All solutions are generated in x[1:n] and printed as soon as they are determined. Then the algorithm is recursively invoked.…. Let T(x1.

.. The variable k is continually incremented and a solution vector is grown until either a solution in found or no untried value of x k remains.x[2].Algorithm Analysis and Design (R 606) k:= 1. The number of xk satisfying the Bk Department of Computer Science & Engineering SJCET. Palai . } } Explanation T() will yield the set of possible values that can be placed as the first component x[1] of the solution vector. When k is decremented. the algorithm must resume the generation of possible elements for the kth position that have not yet been tried...x[k]) is true) then { if (x[1]. while(k≠0) do { 152 if (there remains an untried x[k] Є T(x[1]. The number of xk satisfying the explicit constraints 3. The time to generate next xk 2.//Consider the next set.x[k-1]) and Bk (x[1]. The component x[1] will take on those for which the bounding function B1x(1) is true. Conclusion The efficiency of both algorithms depends very much on four factors: 1. } else k:=k-1.…. k:=k+1.x[k] is a path to answer node) then write (x[1:k]).….…. The time for the bounding functions Bk 4. //Backtrack to the previous set.

Thus implicit constrains describe the way in which the xi must relate to each other.…. 1 } li xi ui or Si = {a : li a ui } All tuples satisfying the explicit constraints define a possible solution space for I (I=problem instance) Definition 2: The implicit constraints are rules that determine which of the tuples in the solution space of I satisfy the criterion function. Palai .Algorithm Analysis and Design (R 606) Constraints Solutions must satisfy a set of constraints Explicit vs.x2.2. assume that queen i is placed on row i All solutions represented as 8-tuples (x1. solution space reduced from 88 to 8!) – No two queens can be on the same diagonal Department of Computer Science & Engineering SJCET.8} – So. solution space consists of 88 8-tuples Implicit constraints – No two xi‘s can be the same (By this. implicit Definition 1: Explicit constraints are rules that restrict each xi to take on values only from a given set.x8) where xi is the column on which queen i is placed Constraints • • Explicit constraints – Si={1.…. Classic combinatorial problem Place eight queens on an 8*8 chessboard so that no two attack Without loss of generality. eg) xi 0 or Si = { all nonnegative real numbers } 153 xi = 0 or 1 or Si = { 0.

then mi+1……mn possible test vectors can be ignored entirely. The solution space is defined by all the paths from the root node to a leaf node. column.Algorithm Analysis and Design (R 606) 154 BOUNDING FUNCTIONS The backtrack algorithm as its virtue the ability to yield the same answer with far fewer than m trials.……. x2. one competent at a time and to use modified criterion functions Pi(x1. so that no two of them are on the same row. Palai . or diagonal. For eg.. Its basic idea is to build solution vector.2. The solution space consists of all n! permutations of the n-tuple (1. The major advantage of this method is this: if it is realized that the partial vector (x1.xn) called bounding functions. The tree is called permutation tree. or diagonal. to test whether the vector being formed has any chance of success. Department of Computer Science & Engineering SJCET. The edges are labeled by possible values of xi.. xi) can in no way lead to an optimal solution. Edges from level i to i+1 are labeled with values of xi. then there will be 4! =24 leaf nodes in the tree. Edges from level 1 to level 2 nodes specify the values for x1.‖ that is. N-QUEENS PROBLEM The n-queens problem is a generalization of the 8-queens problem.…. If n=4.…. that is no two queens are on the same row. Here n queens are to be placed on an nxn chessboard so that no two attack.n). column. Eg: 8-queens problem A classic problem is to place eight queens on an 8x8 chessboard so that no two ―attack.

e. j-l=i-k) or i+j=k+l (i.. two queens placed on the same diagonal iff |j-l| = |i-k| Department of Computer Science & Engineering SJCET. j-l=k-i) • So..e.j) and (k.l) – i-j=k-l (i.Algorithm Analysis and Design (R 606) 155 • Bounding function – No two queens placed on the same column  xi’s are distinct – No two queens placed on the same diagonal  how to test? • The same value of ―row-column‖ or ―row+column‖ • Supposing two queens on (i. Palai .

Algorithm Analysis and Design (R 606) 156 This example shows how the backtracking works(4-Queens) Department of Computer Science & Engineering SJCET. Palai .

Edges from level to level i+1are labeled with the value of xi. Otherwise it returns false. The edges are labeled by possible values of xi.. no two queens are on the same row.Generalising our discussions.The above figure shows a possible tree organization for the case n=4. Edges from level 1 to level 2 nodes specify the value for x1. There are 4! =24 nodes in the tree.Algorithm Analysis and Design (R 606) 157 TREE STRUCTURE Tree organization of the 4-queens solution space. ALGORITHM Algorithm Place (k.the solution space is defined by all paths from the root node to a leaf node. column or diagonal.Thus the leftmost sub tree contains all solutions with x1=1 and x2=2. A tree such as this is called a permutation tree. n). Department of Computer Science & Engineering SJCET. that is. //Abs (p) returns the absolute value of p. x[] is a //global array whose first (k –1) values have been set. Nodes are numbered as in depth first search. 2. the solution space consists of all n! permutations of n-tuple (1. Palai . Now n queens are to be placed on an n×n chess board so that no two attack. ί) // Returns true if a queen can be placed in the kth row and //ith column.and so on.. The n-queens problem is a generalization of the 8-queens problem.

if (k=n) then write (x[1:n]). return true. { for ί: =1 to n do { if Place (k.Algorithm Analysis and Design (R 606) { for j: =1 to k-1 do if ((x [j] = ί) // Two in the same column or (Abs (x [j] – ί ) = Abs (j-k)) ) // or in the same diagonal then return false. } } } Department of Computer Science & Engineering SJCET. } 158 Algorithm Nqueens (k.n). this procedure prints all // possible placements of n queens on an n x n // chessboard so that they are nonattacking. else Nqueens (k+1. n) // Using backtracking. ί) then { x [k] : = ί. Palai .

2.. We could formulate this problem using either fixed-or variable-sized tuple Given positive numbers wi.0.4) and (3. Palai .xn) where xi∈{0.0.xk). and m.24.x2.….7). (1.13.13.Rather than represent the solution vector by the wi which sum to m we could represent the solution vector by giving the indices of these wi.4) Explicit constraints: xi ∈ { j | j is integer and 1≤j≤n} Implicit constraints: no two be the same.1) and (0.In general all solutions are ktuples (x1. Department of Computer Science & Engineering SJCET. 1≤i≤n In the above example.Now the two solutions are described by the vectors (1.w3.Algorithm Analysis and Design (R 606) 159 SUM OF SUBSETS Given positive numbers wi.24.For example if n=4.7) and (24.1.w2.x2.1<=i<=n.4) and (3.w3..1}.w4) = (11.w4)=(11.13.w2.7).then the desired subsets are (11.7) Representation of solution vector Variable-sized By giving indices In the above example.7) and (24.. the sums are m Fixed-sized n-tuple (x1. (1.This is called sum of subsets problem.( w1. 1≤i≤n. this problem calls for finding all subsets of the wi whose sums are m.1) Variable tuple size Suppose we are given n distinct positive numbers(usually called weights) and we desire to find all combinations of these numbers whose sums are m. 1<=k<=n and different solutions may have different sized tuples.2.1. and m=31.4). find all subsets of the wi whose sum is m eg) (w1..7) and m=31 Solutions are (11.13. and m..

.... Palai .xk) cannot lead to an answer node if Department of Computer Science & Engineering SJCET.. (x1.xk)=true iff – Assuming wi’s in nondecreasing order..Algorithm Analysis and Design (R 606) • Solution space defined by all paths from the root to any node 160 Fixed tuple size: • • Solution space defined by all paths from the root to a leaf node Bounding function • Bk(x1.

∑i=1.Algorithm Analysis and Design (R 606) 161 – So.nwi) Department of Computer Science & Engineering SJCET.1. the bounding function is • Backtracking algorithm for the sum of sunsets – Invoked by SumOfSubsets(0. Palai .

18}. m=30 162 • Terminology – A node which has been generated and all of whose children have not yet been generated is called a live node – The live node whose children are currently being generated is called E-node – A dead node is a generated node which is not to be expanded further or all of whose children have been generated • Two different ways of tree generation • Backtracking • Depth first node generation with bounding functions – Branch-and-bound – E-node remains E-node until it is dead Four factors for efficiency of the backtracking algorithms Time to generate the next xk Number of xk satisfying the explicit constraints Time for the bounding function Bk Number of xk satisfying Bk Department of Computer Science & Engineering SJCET. Palai .15.10.12.Algorithm Analysis and Design (R 606) Example:n=6. w[1:6]={5.13.

2.(1.2.4). the solution space is partitioned into subsolution spaces.(1. and the last dependent X1=1 1 1 X1=4 X1=2 3 3 X2=4 X2=3 9 99 X3=4 15 6 10 11 11 1 4 4 5 5 2 2 X2=2 X 6 6 X3=3 12 13 X3=4 14 6 7 7 X3=4 8 X2=4 X4=4 16 Fig 1 The tree of fig 1 represents corresponds to variable tuple size formulation. Department of Computer Science & Engineering SJCET.3).4).(2).3) and so on.At each node.Algorithm Analysis and Design (R 606) 163 A solution space organization for the sum of subsets problem.(1.3.Thr possible paths are (1).Thus the leftmost subtree defines all subsets containing w1.The edges are labelled such that an edge from a level i node to a level i+1 node represents a value for xi.the next subtree defines all subsets containing w2 but not w1 and so on.Nodes are numbered in BFS * The first three relatively independent of the problem instance.2). Palai .since any such path corresponds to a subset satisfying the explicit constraints.2.The solution space is defined by all paths from the root node to any node in the tree.(2.4).(1.3.(1.

Palai .All paths from root to a leaf node defining the solution space .Nodes are numbered in D-search X1=1 1 1 X1=0 X2=1 2 12 1 3 1 X2=0 X2=1 4 1 X3=0 21 11 X4=1 25 22 X4=0 1 22 0000 1 == X3=0 X3=1 1 12 13 11 X4= 10 15 1 11 1 0 1 X3=1 6 11 X4 11 =1 8 1 11 X2=0 X3=1 26 22 6 1 18 11 81 X3=0 27 12 1 X3=1 20 20 1 19 19 1 5 11 30 11 X4=1 31 28 X4=0 11 22 8 1 X4=1 29 24 X4=0 11 24 1 11 X4=1 2 X4= X4= 23 1 16 1 17 0 14 14 1 61 1 1 1 1 FIG 2 The tree of fig 2 corresponds to the fixed tuple size formation.The left subtree of the root defines all subsets containing w1. and so on.Now there are 24 leaf nodes.Edges from level i nodes level i+l nodes are labelled with the value of xi w hich is either zero or one. Department of Computer Science & Engineering SJCET.Algorithm Analysis and Design (R 606) 164 Another organization for the sum of subsets problem.the right subtree defines all subsets not containing w1.

Regardless of which is used. The resulting algorithm is BKnap. It was obtained form the recursive backtracking schema. The solution space for this problem consists of the 2n distinct ways to assign zero or one values to the x i‘s. From Bound it follows that the bound for a feasible left child of a node Z is the same as that for Z.0). One corresponds to the fixed tuple size formulation and the other to the variable tuple size formulation. Thus the solution space is the same as that for the sum of subsets proble. Now by using the fixed tuple size formulation. Hence. Initially set fp. then an upper bound for Z can be obtained by relaxing the requirement xi=0 or 1 to 0≤xi ≤1 for k+1≤i≤n and using the greedy algorithm to solve the relaxed problem. The xi‘s constitute a zero-one-valued vector. Function Bound(cp.=-1. and a positive number m that is the knapsack capacity. then that live node can be killed.0.Algorithm Analysis and Design (R 606) 165 KNAPSACK PROBLEM Explanation: Given n positive weights wi . Palai . It is assumed that p[i]/w[i]≥p[i+1]/w[i+1]. Two possible tree organizations are possible. x[i]. The object weights and profits are w[i] and p[i]. If this upper bound is not higher than the value of the best solution determined so far. if at node Z the values of xi. n positive profits pi. In lines 8 to 18 children are generated. A bounding function for this problem is obtained by using an upper bound on the value of the best feasible solution obtainable by expanding the given live node and any of its descendants.k) determines an upper bound on the best solution obtainable by expanding any node Z at level k+1 of the state space tree.. 1≤i≤k. bounding functions are needed to help kill some live nodes without expanding them. is such that ∑ni=1 p[i]x[i] = fp. the bounding function need not be used whenever the backtracking algorithm makes a move to the left child of a node.cw. In line 20. Backtracking algorithms for the knapsack problem can arrived by using either of these two state space trees. 1≤i<n. Bound is used to test whether a right child should be generated. this problem calls for choosing a subset of the weights such that ∑ i≤i≤n wixi ≤ m and ∑ i≤i≤n pixi is maximized. have already been determined. This algorithm is invoked as BKnap(1. When fp ≠ -1. The Department of Computer Science & Engineering SJCET. 1≤ i≤ n.

{ b:= cp. c:=cw. In lines 13 to 17 and 23 to 27 the solution vector is updated if need Algorithm: Bounding Function Algorithm Bound(cp. fw is the final weight of //knapsack. Palai . //and profits. } Algorithm: Backtracking solution to the 0/1 knapsack problem. ∑i=1k-1 p[i]y[i]. and m is the knapsack size. k is the index of the last removed //item. fp is the final maximum profit. all the backtracking algorithms have worked on a static state space tree. So far. is the path to the current node. else x[k]=1. w[] and p[] are the weights and profits.Algorithm Analysis and Design (R 606) 166 path y[i]. if(c<m) then b:=b+p[i]. 1≤ i ≤ k.cw. x[k] = 0 if w[k] //is not in the knapsack. //p[i]/w[i] ≥ p[i+1]/w[i+1]. } return b.k) //cp is the current profit total.cp. The current weight cw = ∑k-1i=1 w[i]y[i] and cp= be. n is the number of weights. Algorithm Bknap(k. for i:=k+1 to n do { c:=c+w[i].cw) //m is the size of the knapsack. { // Generate left child. cw is the current //weight total. else return b+(1-(c-m)/w[i])*p[i]. if (cw + w[k] ≤ m) then Department of Computer Science & Engineering SJCET.

cw +w[k]).k)≥fp) then { y[k] :=0.cp. } } } 167 ALGORITHM // CP-Current Profit.c=cw. fw:=cw+w[k]. fw:=cw. if(k<n) then BKnap(k+1. if ((cp+p[k] > fp) and (k=n)) then { fp:=cp +p[k].cp+p[k]. //K-index of last removed access { b=cp. for P=K+1 to n do { c=c+w[i]. if(c<n)the b=b+p[i]. if (k<n) then BKnap(k+1. Palai . for j:=1 to k do x[j]:=y[j]. if (Bound(cp. } } // Generate right child. Department of Computer Science & Engineering SJCET.Algorithm Analysis and Design (R 606) { y[k]:=1.cw).cw. if((cp>fp) and (k=n)) then { fp:=cp. for j:=1 to k do x[j]:=y[j].

for j=1 to K do x[j]=y[j].cw) { If(w+w[K]<=m) then { Y[K]=1.K)>=fp) then { Y[K]=0.cw+w[K]).w. } 168 Algorithm B knap(K.Algorithm Analysis and Design (R 606) else return b+(1-(c-m)/w[i])+p[i].cp+p[K]. Palai . If(K<n)then B knap(K+1.cp. } } If(Bound(cp. If(k<n)then Department of Computer Science & Engineering SJCET. } Return(b). If((cp=p[k])>fp)and(K=n)then { fp=cp+p[K].fw=cw+w[K].

Palai . and even genetic programming. as it is often abbreviated. As a result. the greedy strategy. Branch and bound is an algorithm technique that is often implemented for finding the optimal solutions in case of optimization problems. dynamic programming. is one of the most complex techniques and surely cannot be discussed in its entirety in a single article. divide and conquer.Algorithm Analysis and Design (R 606) B knap(K+1. Thus. in this part we will compare branch and bound with the previously mentioned techniques as well. B&B. fw=cw. } } } 169 BRANCH AND BOUND ALGORITHM TECHNIQUE Introduction Branch and bound is another algorithm technique that we are going to present in our multipart article series covering algorithm design patterns and techniques. we opt for this technique when the domain of Department of Computer Science & Engineering SJCET. It is really useful to understand the differences. we are going to focus on the so-called A* algorithm that is the most distinctive B&B graph search algorithm.cp. it is mainly used for combinatorial and discrete global optimizations of problems. for j=1 to k do x[j]=y[j]. In a nutshell. If((cp>fp)and(k=n)) then { fp=cp. If you have followed this article series then you know that we have already covered the most important techniques such as backtracking.cw).

it often leads to exponential time complexities in the worst case. You should already be familiar with the tree structure of algorithms. then. it would be very hard to implement. both the backtracking and divide and conquer traverse the tree in its depth.e. This technique is based on the en masse elimination of the candidates. On the other hand. it can lead to algorithms that run reasonably fast on average.Algorithm Analysis and Design (R 606) 170 possible candidates is way too large and all of the other algorithms fail. it is much slower. unless proven otherwise mathematically. if applied carefully. DP becomes inefficient. The basic concept underlying the branch-and-bound technique is to divide and conquer. their children generated).. The greedy strategy picks a single route and forgets about the rest. Out of the techniques that we have learned. additionally. but not all nodes get expanded (i. The dividing (branching) is done by partitioning the entire set of feasible solutions into smaller and smaller subsets.The conquering (fathoming) is done partially by (i) giving a bound for the best solution in the subset. The general idea of B&B is a BFS-like search for the optimal solution. You see. We shouldn't rely on greedy because that is problem-dependent and never promises to deliver a global optimum. Indeed. but the implementation wouldn't be an efficient approach. and another criterion tells the algorithm when an optimal solution has been found. Palai . though they take opposite routes. The truth is that maybe the problem can indeed be solved with dynamic programming.(ii) discarding the subset if the bound indicates that it can‘t contain an optimal solution.These Department of Computer Science & Engineering SJCET. if we have a complex problem where we would need lots of parameters to describe the solutions of sub-problems. However. the backtracking and divide and conquer algorithms are out. a carefully selected criterion determines which node to expand and when. Branch and bound is a systematic method for solving optimization problems B&B is a rather general optimization technique that applies where the greedy method and dynamic programming fail. Rather. Dynamic programming approaches this in a sort of breadth-first search variation (BFS). by definition. As our last resort we may even think about dynamic programming. Since the original ―large‖ problem is hard to solve directly.it is divided into smaller and smaller subproblems until these subproblems can be conquered. Now if the decision tree of the problem that we are planning to solve has practically unlimited depth.

X. X->CC := E->CC + AX->I. /* S is a bitmap set initialized to 0*/ /* S will contain all the jobs that have been assigned by the partial path from the root to E */ p := E. endif endfor Department of Computer Science & Engineering SJCET. a queue is used.H). p := p-> Parent.p: nodepointer. */ I := E->I. bounding. Palai . X->Parent := E.Algorithm Analysis and Design (R 606) 171 three basic steps – branching. X->I := I + 1. S[1:n]: Boolean. Branch-and-Bound Algorithms A counter-part of the backtracking search algorithm which. while (p is not the root) do S[p->J] := 1. Insert(X. That is. in the absence of a cost criteria. X->J := job. and the nodes are processed in first-in-first-out order Procedure Expand(E) begin /* Generate all the children of E. the algorithm traverses a spanning tree of the solution space using the breadth-first approach.X->J-mX->I. endwhile for job=1 to n do if S[job] = 0 then X := new(node). and fathoming – are illustrated on the following example.

we also need a cost function to decide in what order to traverse the nodes when searching for a solution. The following function produces a complete binary tree of 11 nodes.e. and its children are inserted into the set. The algorithm proceeds in the following manner. 2. if when given a node X and index i it produces the i‘th child of the node. A priority queue is needed here. Visit: The cost criteria decides which of the live nodes is to process next. 172 If a cost criteria is available.e. Replacement: The chosen node is removed from the set of live nodes. the branch) is the one with the best cost within the queue. 3. Department of Computer Science & Engineering SJCET.. Palai . 4.. Iteration: The visitation and replacement steps are repeated until no alive nodes are left. the bound) from the queue nodes that can be determined to be expensive. 1. the node to be expanded next (i.Algorithm Analysis and Design (R 606) end . the cost function may also be used to discard (i. Cost-Based Tree Traversal of branch and bound A function can be considered to be a tree generator. Besides for a tree generator function. Initialization: The root of the of the tree is declared to be alive. In such a case. The children are determined by the tree generator function. The recursive function provided for deriving permutations is another example of a function that may be used to generate trees.

e.A dead node is a node that has been expanded. While implementing FIFO B&B algorithm it is not economical to kill live nodes with c®> upper each time upper is updated.If the bounds match. A generalization to arbitrary cost criteria is the basis for the priority branch-and-bound algorithm. The search proceeds until all nodes have been solved or pruned. The expanded node (or E-node for short) is the live node with the best CC value. Otherwise. or until some specified threshold is met between the best solution found and the lower bounds on all unsolved subproblems. these subproblems partition the feasible region.. called an approximate cost function CC.. the bound) from the queue nodes that can be determined to be expensive. the branch) is the one with the best cost within the queue. the feasible region is divided into two or more regions. FIFO BRANCH AND BOUND ALGORITHM FIFO branch and bound algorithm for the job sequencing problem can begin with upper= infinity. a queue is used. Starting by considering the root problem (the original problem with the complete feasible region). A first-in-first-out cost criteria implies the FIFO branch-and-bound algorithm. That is. Therefore. If the lower bound for a node exceeds the best known feasible solution. in the absence of a cost criteria. A live node is a node that has not been expanded. A counter-part of the backtracking search algorithm which. it is a feasible solution to the full problem. the node can be removed from consideration. the lower-bounding and upper-bounding procedures are applied to the root problem. Department of Computer Science & Engineering SJCET. the cost function may also be used to discard (i. Each solution is assumed to be expressible as an array X[1:n] (as was seen in Backtracking). If a cost criteria is available. which can be realized with a stack memory. then an optimal solution has been found and the procedure terminates. the node to be expanded next (i. Branch and Bound is a general search method. and a priority queue memory can be employed to realize the function. If an optimal solution is found to a subproblem. is assumed to have been defined. A priority queue is needed here. A predictor. but not necessarily globally optimal.e. no globally optimal solution can exist in the subspace of the feasible region represented by the node. and the nodes are processed in firstin-first-out order. the algorithm traverses a spanning tree of the solution space using the breadth-first approach. In such a case. Palai . and it can be realized with queue memory.Algorithm Analysis and Design (R 606) 173 In the case of backtracking the cost criteria assumes a last-in-first-out (LIFO) function. The algorithm is applied recursively to the subproblems.

and a max-heap for maximization problems. -. end Department of Computer Science & Engineering SJCET.E is an optimal solution print out the path from E to the root. endif E := delete-top(H).A heap for all the live nodes -. endwhile end Procedure Expand(E) begin .Algorithm Analysis and Design (R 606) 174 The general FIFO B&B algorithm follows: Procedure B&B() begin E: nodepointer.Compute the approximate cost value CC of each child.Generate all the children of E. . return. return. while (true) do if (E is a final leaf) then -. if (H is empty) then report that there is no solution.Insert each child into the heap H. E := new(node). .this is the root node which -.H is a min-heap for minimization problems. endif Expand(E). -. Palai .is the dummy start node H: heap. -.

Job J is assigned to person I CC: real.p: nodepointer. a node record structure should look like: Record node Begin Parent: nodepointer.n Code for Expand(E): Procedure Expand(E) begin /* Generate all the children of E. Palai .Therefore. which signifies that the job X[i] assigned to person i is j.mX->I Write a piece of code that computes the mis for i=1. the path from that leaf to the root can be traced and printed out as the optimal solution.. while (p is not the root) do Department of Computer Science & Engineering SJCET. I: integer. -. Every node must store its CC value. -.A->J .2..person I J: integer. S[1:n]: Boolean. then X->CC = X->Parent->CC + AX->I. End Take the 2nd CC formula: CC(X at level k) = cost so far + sumni=k+1mi where mi is the minimum of row i.Algorithm Analysis and Design (R 606) 175 We need to define the full record of a node . observe that if X is a pointer to a node. Each node must point to its parent so that when an optimal leaf is generated. X...We need to fully implement the Expand procedure Every node corresponds to something like X[i]=j. /* S is a bitmap set initialized to 0*/ /* S will contain all the jobs that have been assigned by the partial path from the root to E */ p := E. */ I := E->I.

the FIFO branch-and-bound technique has proven to be reasonably efficient on practical problems.X->J-mX->I.Algorithm Analysis and Design (R 606) S[p->J] := 1.The technique is also used in a lot of software in global optimization. X->J := job. endif endfor end. Fifo B&B Cost-Based Tree Traversal A function can be considered to be a tree generator. X->Parent := E. endwhile for job=1 to n do if S[job] = 0 then X := new(node). if when given a node X and index i it produces the i‘th child of the node.H).The following function produces a complete binary tree of 11 nodes. Department of Computer Science & Engineering SJCET. X->I := I + 1. X->CC := E->CC + AX->I. and it has the added advantage that it solves continuous linear programs as sub problems. Insert(X. p := p-> Parent. Palai . 176 Although a number of algorithms have been proposed for the integer linear programming problem.

and its children are inserted into the set. Department of Computer Science & Engineering SJCET. So it could perform better than backtracking. • Least-cost branch and bound directs the search to parts of the space most likely to contain the answer. Replacement: The chosen node is removed from the set of live nodes. but replace the FIFO queue with a stack (LIFO branch and bound). Palai . In the case of backtracking the cost criteria assumes a last-in-first-out (LIFO) function. 4. Initialization: The root of the of the tree is declared to be alive. The children are determined by the tree generator function. Besides for a tree generator function. A first-in-first-out cost criteria implies the FIFO branch-and-bound algorithm. we also need a cost function to decide in what order to traverse the nodes when searching for a solution. Visit: The cost criteria decides which of the live nodes is to process next. The priority of a node p in the queue is based on an estimate of the likelihood that the answer node is in the subtree whose root is p. Replace the FIFO queue with a priority queue (least-cost (or max priority) branch and bound).Algorithm Analysis and Design (R 606) 177 The recursive function provided for deriving permutations is another example of a function that may be used to generate trees. and a priority queue memory can be employed to realize the function. and it can be realized with queue memory. Iteration: The visitation and replacement steps are repeated until no alive nodes are left. FIFO branch and bound finds solution closest to root. which can be realized with a stack memory. 3. • • • Search the tree using a breadth-first search (FIFO branch and bound). 1.Backtracking may never find a solution because tree depth is infinite (unless repeating configurations are eliminated). 2. A generalization to arbitrary cost criteria is the basis for the priority branch-and-bound algorithm. Search the tree as in a bfs. The algorithm proceeds in the following manner.

e. called an approximate cost function CC.. the bound) from the queue nodes that can be determined to be expensive. A live node is a node that has not been expanded. no globally optimal solution can exist in the subspace of the feasible region represented by the node. Palai . Otherwise.Algorithm Analysis and Design (R 606) 178 LIFO BRANCH AND BOUND ALGORITHM LIFO branch and bound algorithm for the job sequencing problem can begin with upper= infinity. E := new(node). in the absence of a cost criteria. it is a feasible solution to the full problem.is the dummy start node Department of Computer Science & Engineering SJCET. A general LIFO B&B algorithm : Procedure B&B() begin E: nodepointer. A priority queue is needed here. and the nodes are processed in firstin-first-out order. The expanded node (or E-node for short) is the live node with the best CC value. While implementing LIFO B&B algorithm it is not economical to kill live nodes with c®> upper each time upper is updated.A dead node is a node that has been expanded. the branch) is the one with the best cost within the queue. If a cost criteria is available.If the bounds match. The algorithm is applied recursively to the subproblems. In such a case. but not necessarily globally optimal. then an optimal solution has been found and the procedure terminates. a queue is used.. If an optimal solution is found to a subproblem. is assumed to have been defined. the feasible region is divided into two or more regions. or until some specified threshold is met between the best solution found and the lower bounds on all unsolved subproblems. the lower-bounding and upper-bounding procedures are applied to the root problem. the algorithm traverses a spanning tree of the solution space using the breadth-first approach. Therefore.e. Branch and Bound is a general search method. Each solution is assumed to be expressible as an array X[1:n] (as was seen in Backtracking). these subproblems partition the feasible region.this is the root node which -. -. If the lower bound for a node exceeds the best known feasible solution. the cost function may also be used to discard (i. That is. Starting by considering the root problem (the original problem with the complete feasible region). A counter-part of the backtracking search algorithm which. A predictor. the node can be removed from consideration. The search proceeds until all nodes have been solved or pruned. the node to be expanded next (i.

Algorithm Analysis and Design (R 606) H: heap.We need to fully implement the Expand procedure Every node corresponds to something like X[i]=j.Compute the approximate cost value CC of each child. -. Each node must point to its parent so that when an optimal leaf is generated.E is an optimal solution print out the path from E to the root.A heap for all the live nodes 179 -.Insert each child into the heap H. while (true) do if (E is a final leaf) then -. a node record structure should look like: Record node Begin Department of Computer Science & Engineering SJCET. endif E := delete-top(H).Therefore. the path from that leaf to the root can be traced and printed out as the optimal solution. -. Every node must store its CC value. endwhile end Procedure Expand(E) begin . return. endif Expand(E). Palai .Generate all the children of E. end We need to define the full record of a node . .and a max-heap for maximization problems.H is a min-heap for minimization problems. which signifies that the job X[i] assigned to person i is j. . return. if (H is empty) then report that there is no solution.

if when given a node X and index i it produces the i‘th child of the node.Algorithm Analysis and Design (R 606) Parent: nodepointer. -. End 180 Take the 2nd CC formula: CC(X at level k) = cost so far + sumni=k+1mi where mi is the minimum of row i. Department of Computer Science & Engineering SJCET. Palai .mX->I Write a piece of code that computes the mis for i=1.2.Job J is assigned to person I CC: real...The following function produces a complete binary tree of 11 nodes. then X->CC = X->Parent->CC + AX->I.The technique is also used in a lot of software in global optimization.. observe that if X is a pointer to a node. the LIFO branch-and-bound technique has proven to be reasonably efficient on practical problems.person I J: integer. I: integer. -. and it has the added advantage that it solves continuous linear programs as sub problems..n Although a number of algorithms have been proposed for the integer linear programming problem. The recursive function provided for deriving permutations is another example of a function that may be used to generate trees.A->J . Lifo B&B Cost-Based Tree Traversal A function can be considered to be a tree generator.

Besides a tree generator function, we also need a cost function to decide in what order to traverse the nodes when searching for a solution. A last-in-first-out cost criterion implies the LIFO branch-and-bound algorithm, which can be realized with a stack memory; a generalization to arbitrary cost criteria is the basis for the priority branch-and-bound algorithm, and a priority queue memory can be employed to realize the function. The algorithm proceeds in the following manner:

5. Initialization: The root of the tree is declared to be alive.
6. Visit: The cost criteria decides which of the live nodes is to be processed next.
7. Replacement: The chosen node is removed from the set of live nodes, and its children (determined by the tree generator function) are inserted into the set.
8. Iteration: The visitation and replacement steps are repeated until no live nodes are left.

• Search the tree using a breadth-first search (FIFO branch and bound).
• Search the tree as in a BFS, but replace the FIFO queue with a stack (LIFO branch and bound).
• Replace the FIFO queue with a priority queue (least-cost, or max-priority, branch and bound). The priority of a node p in the queue is based on an estimate of the likelihood that the answer node is in the subtree whose root is p.
• Backtracking may never find a solution, because the tree depth is infinite (unless repeating configurations are eliminated). LIFO branch and bound finds the solution closest to the root.
• Least-cost branch and bound directs the search to the parts of the space most likely to contain the answer, so it could perform better than backtracking.

LC CONTROL ABSTRACTION

listnode = record
{
    listnode *next, *parent;   // pointer for path to root
    float cost;
}

Algorithm LCSearch(t)
// Search t for an answer node.
{
    if *t is an answer node then output t and return;
    E := t;   // E-node
    Initialize the list of live nodes to be empty;
    repeat
    {
        for each child x of E do
        {
            if x is an answer node then
                output the path from x to t and return;
            Add(x);              // x is a new live node.
            (x->parent) := E;
        }
        if there are no more live nodes then
        {
            write ("No answer node"); return;
        }
        E := Least();
    } until (false);
}

15-PUZZLE

The 15-puzzle consists of 15 squares numbered from 1 to 15 that are placed in a box, leaving one position out of the 16 empty. The goal is to reposition the squares from a given arbitrary starting arrangement by sliding them one at a time into the configuration shown in the figure. For some initial arrangements this rearrangement is possible, but for others it is not.

The n-puzzle is known in various versions, including the 8-puzzle and the 15-puzzle, and with various names. It is a sliding puzzle that consists of a frame of numbered square tiles in random order with one tile missing. If the size is 3×3, the puzzle is called the 8-puzzle or 9-puzzle; if 4×4, the puzzle is called the 15-puzzle or 16-puzzle. The object of the puzzle is to place the tiles in order by making sliding moves that use the empty space.

Let position(i) be the position number in the initial state of the tile numbered i. For any state, let less(i) be the number of tiles j such that j < i and position(j) > position(i).

Theorem: The goal state is reachable from the initial state iff Σ(i=1..16) less(i) + x is even, where x depends on the position of the empty slot (numbered 16): x = 1 if the empty slot is in one of the shaded positions of the board figure, and x = 0 otherwise.
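A C sketch of the parity test follows (an added illustration; identifying the "shaded" squares as those whose 1-based row+column is odd is our assumption, chosen so that the criterion is consistent with the goal state being reachable from itself).

#include <stdio.h>

/* Solvability test for the 15-puzzle following the theorem above.
   board[p] (p = 0..15, row-major) holds the tile at position p+1,
   with 16 standing for the empty slot. */
int solvable(const int board[16]) {
    int position[17];               /* position[i] = 1..16 */
    for (int p = 0; p < 16; p++)
        position[board[p]] = p + 1;

    int sum = 0;                    /* sum of less(i) over all tiles */
    for (int i = 1; i <= 16; i++)
        for (int j = 1; j < i; j++)
            if (position[j] > position[i])
                sum++;

    int pb = position[16];          /* position of the empty slot */
    int row = (pb - 1) / 4 + 1, col = (pb - 1) % 4 + 1;
    int x = ((row + col) % 2 == 1); /* 1 iff the blank is on a "shaded" square */

    return (sum + x) % 2 == 0;
}

int main(void) {
    int goal[16]    = { 1,2,3,4, 5,6,7,8, 9,10,11,12, 13,14,15,16 };
    int swapped[16] = { 2,1,3,4, 5,6,7,8, 9,10,11,12, 13,14,15,16 };
    printf("goal: %s\n", solvable(goal) ? "reachable" : "not reachable");
    printf("two tiles swapped: %s\n",
           solvable(swapped) ? "reachable" : "not reachable");
    return 0;   /* the classic two-tile swap is correctly reported unreachable */
}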

(Figure: first ten steps in a depth-first search of the 15-puzzle.)

TRAVELLING SALESMAN PROBLEM

The Travelling Salesman Problem (TSP) is a problem in combinatorial optimization studied in operations research and theoretical computer science. Given a list of cities and their pairwise distances, the task is to find a shortest possible tour that visits each city exactly once.

Problem: You are given a list of n cities along with the distances between each pair of cities. The goal is to find a tour which starts at the first city, visits each city exactly once, and returns to the first city, such that the distance traveled is as small as possible.

This problem is known to be NP-complete, and it is widely believed that no polynomial-time algorithm exists; in general, no serial algorithm exists that runs in time polynomial in n, only in time exponential in n. In practice, therefore, we often want to compute an approximate solution, i.e. a single tour whose length is as short as possible, in a given amount of time.

The TSP has several applications even in its purest formulation, such as planning, logistics, and the manufacture of microchips. Slightly modified, it appears as a sub-problem in many areas, such as genome sequencing, and it is used as a benchmark for many optimization methods. In these applications, the concept city represents, for example, customers, soldering points, or DNA fragments, and the concept distance represents travelling times or cost, or a similarity measure between DNA fragments.

More formally, we are given a graph G = (N, V, W) consisting of a set N of n nodes (or cities), a set of edges V = {(i,j)} connecting cities, and a set of nonnegative weights W = {w(i,j)} giving the length of edge (i,j), the distance from city i to city j.

Asymmetric and symmetric

In the symmetric TSP, the distance between two cities is the same in each direction; the underlying structure is an undirected graph, and each tour has the same length in both directions. In the asymmetric TSP, the graph is directed: the distance from one city to the other need not equal the distance in the other direction, and there may not even be a connection in the other direction, so that an edge (i,j) may only be traversed in the direction from i to j, and edge (j,i) may or may not exist. Similarly, if both edges exist, w(i,j) does not necessarily equal w(j,i).

Branch-and-Bound for TSP

A simple improvement on naive exhaustive search prunes the search tree by observing that if a partial tour is already longer than the best solution found so far, there is no reason to continue searching that path. The tests new_w < wB below implement this pruning.

Naive Branch-and-Bound Solution of TSP

w = w(1,2) + w(2,3) + w(3,4) + ... + w(n-1,n) + w(n,1)
Best_S_so_far = ( n, [ 1, 2, ..., n ], w )
S = ( 1, [ 1 ], 0 )
Search( S, Best_S_so_far )
print Best_S_so_far

procedure Search( S, Best_S_so_far )
   let ( k, [ i1, i2, ..., ik ], w ) = S
   let ( n, [ i1B, i2B, ..., inB ], wB ) = Best_S_so_far
   if k = n then
      new_w = w + w(ik, i1)
      if new_w < wB then
         Best_S_so_far = ( k, [ i1, i2, ..., ik ], new_w )
      end if
   else
      for all j not in [ i1, i2, ..., ik ]
         new_w = w + w(ik, j)
         if new_w < wB then
            New_S = ( k+1, [ i1, i2, ..., ik, j ], new_w )
            Search( New_S, Best_S_so_far )
         end if
      end for
   end if
   return
end
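A runnable Python version of this procedure might look like the sketch below (not from the text): city 0 plays the role of city 1, w is a dense weight matrix, and the identity tour 0-1-...-(n-1)-0 seeds Best_S_so_far.

def tsp_branch_and_bound(w):
    # Returns (best_weight, best_tour); prunes any partial tour whose
    # weight already reaches the best complete tour found so far.
    n = len(w)
    best_tour = list(range(n))
    best_w = sum(w[i][(i + 1) % n] for i in range(n))   # identity tour

    def search(tour, weight):
        nonlocal best_tour, best_w
        if len(tour) == n:
            total = weight + w[tour[-1]][tour[0]]       # close the tour
            if total < best_w:
                best_w, best_tour = total, tour[:]
            return
        for j in range(n):
            if j in tour:
                continue
            new_w = weight + w[tour[-1]][j]
            if new_w < best_w:                          # the pruning test
                search(tour + [j], new_w)

    search([0], 0)
    return best_w, best_tour

# A small asymmetric instance (hypothetical data):
w = [[0, 3, 9, 13],
     [4, 0, 7, 4],
     [9, 8, 0, 3],
     [12, 5, 6, 0]]
print(tsp_branch_and_bound(w))    # (21, [0, 2, 3, 1])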

Better Branch and Bound Algorithm for TSP

Another strategy for searching the solution space is to repeatedly divide it into two parts: those with a given edge and those without the edge. The search tree would unfold as follows:

                             -----------
                            | all solns |
                             -----------
                            /           \
            ----------------             -----------------
           | solns with e_i |           | solns w/out e_i |
            ----------------             -----------------
            /             \              /              \
    -----------    ------------    -----------    ------------
   | with e_j  |  | w/out e_j  |  | with e_k  |  | w/out e_k  |
    -----------    ------------    -----------    ------------

Bounding the solutions: Assume the input is given as a dense adjacency matrix. We will put infinities, rather than zeros, on the diagonal to avoid traversing these self-edges.

 i\j |   1    2    3    4    5    6    7
-----+----------------------------------
  1  | Inf    3   93   13   33    9   57
  2  |   4  Inf   77   42   21   16   34
  3  |  45   17  Inf   36   16   28   25
  4  |  39   90   80  Inf   56    7   91
  5  |  28   46   88   33  Inf   25   57
  6  |   3   88   18   46   92  Inf    7
  7  |  44   26   33   27   84   39  Inf

We can subtract a constant from a given row or column, as long as the values remain nonnegative. This changes the weight of each tour, but not the set of legal tours or their relative weights. We therefore normalize the matrix by subtracting the minimum value in each row from the row and the minimum value of each column from the column. This results in a matrix with at least one zero in every row and column.

 i\j |   1    2    3    4    5    6    7
-----+----------------------------------
  1  | Inf    0   83    9   30    6   50
  2  |   0  Inf   66   37   17   12   26
  3  |  29    1  Inf   19    0   12    5
  4  |  32   83   66  Inf   49    0   80
  5  |   3   21   56    7  Inf    0   28
  6  |   0   85    8   42   89  Inf    0
  7  |  18    0    0    0   58   13  Inf

Any solution must use one entry from every row and every column, so the sum of the values we just subtracted is a lower bound on the weight of any solution. In the above example, we subtracted [3 4 16 7 25 3 26] from the rows and then [0 0 7 1 0 0 4] from the columns, so the lower bound on the weight of any solution is 96.

Representing the set of solutions: The adjacency matrix can be used to represent the set of solutions. When an edge is chosen for the solution, the row and column containing that edge are deleted. When an edge is eliminated from the solution, its entry is changed to infinity so that it will never be chosen.

In the above example, assume we choose to split the search space on the edge from 4 to 6. The right subtree, which represents all solutions not containing (4,6), is obtained by replacing the (4,6) entry with infinity. The minimum value in row 4 is now 32, so we can renormalize the matrix and improve the lower bound to 96+32 = 128; column 6 has another 0 entry, so it remains unchanged. The matrix for the right subtree is:

 i\j |   1    2    3    4    5    6    7
-----+----------------------------------
  1  | Inf    0   83    9   30    6   50
  2  |   0  Inf   66   37   17   12   26
  3  |  29    1  Inf   19    0   12    5
  4  |   0   51   34  Inf   17  Inf   48
  5  |   3   21   56    7  Inf    0   28
  6  |   0   85    8   42   89  Inf    0
  7  |  18    0    0    0   58   13  Inf

The left subtree represents all solutions containing (4,6). We therefore delete row 4 and column 6 from the matrix. In addition, edge (6,4) is no longer usable, since we have used edge (4,6), so we replace the (6,4) entry with infinity. The matrix can now be renormalized, in this case by subtracting 3 from the row with i=5 (now the 4th row in the smaller matrix), yielding a lower bound of 96+3 = 99 and the matrix:

 i\j |   1    2    3    4    5    7
-----+-----------------------------
  1  | Inf    0   83    9   30   50
  2  |   0  Inf   66   37   17   26
  3  |  29    1  Inf   19    0    5
  5  |   0   18   53    4  Inf   25
  6  |   0   85    8  Inf   89    0
  7  |  18    0    0    0   58  Inf
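A minimal Python sketch (not from the text) of this normalization step; reduce_matrix subtracts row and column minima in place and returns the total amount subtracted, which is the contribution to the lower bound.

import math

INF = math.inf

def reduce_matrix(m):
    # Subtract the minimum of every row, then of every column, keeping
    # all entries nonnegative; return the sum of the subtracted values.
    n = len(m)
    bound = 0
    for i in range(n):                          # rows first
        low = min(m[i])
        if 0 < low < INF:
            bound += low
            m[i] = [v - low for v in m[i]]      # Inf - c stays Inf
    for j in range(n):                          # then columns
        low = min(m[i][j] for i in range(n))
        if 0 < low < INF:
            bound += low
            for i in range(n):
                m[i][j] -= low
    return bound

Applied to the original 7-city matrix above (with math.inf for Inf), reduce_matrix returns 96: the rows contribute 3+4+16+7+25+3+26 = 84 and the columns 7+1+4 = 12.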

The process outlined so far can be applied by repeatedly following these steps. Following the left branches, we are guaranteed to reach some solution in n levels. Following the right branches, we will eventually have a matrix full of infinities, at which point the lower bound is infinite and the subtree contains no solutions.

Choice of the splitting edge: In general, the right branches represent a larger set of solutions than the left branches. Therefore, in choosing the splitting edge, we look for something that will raise the lower bound of the right-hand subtree as much as possible. The general rule is to search for the zero entry at (i,j) that maximizes the increase in the lower bound (the largest sum of the minimum in row i and the minimum in column j, not counting the zero at (i,j)). In the example, edge (4,6) was chosen because its value was zero and the next larger value in row 4 was 32. There are other zero entries in the matrix, for example (3,5), but the improvement in the lower bound for that case would have been only 1 (for row 3) plus 17 (for column 5). (A short code sketch of this rule appears at the end of the section.)

Insertion of infinities in left branches: When expanding a left branch in the search tree for edge (i,j), we noted that edge (j,i) should be made infinity. More generally, we are trying to avoid the creation of non-covering cycles in the tour, i.e., cycles that traverse only a subset of the nodes. For the example above, assume that the left-most search path unfolds with the following edge splits: (4,6), (3,5), (2,1). At this point the partial solution contains three disconnected paths, and all cycles so far were prevented by marking (6,4), (5,3), and (1,2) as infinity.

The search tree to this point would unfold as follows:

                             -----------
                            | all solns |
                             -----------
                            /
                ------------             -------------
               | with (4,6) |           | w/out (4,6) |
                ------------             -------------
                /
        ------------             -------------
       | with (3,5) |           | w/out (3,5) |
        ------------             -------------
        /
  ------------             -------------
 | with (2,1) |           | w/out (2,1) |
  ------------             -------------

The next choice of splitting edge is (1,4). According to our above rule, (4,1) should be changed to infinity. However, at this point the adjacency matrix does not contain a row for 4, since it was removed with the (4,6) step. Moreover, the partial solution is now 2-1-4-6 & 3-5, so the entry that must be set to infinity to prevent a non-covering cycle is (6,2), the reverse of the combined path from 2 to 6.
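Finally, as promised above, here is a minimal Python sketch (not from the text) of the splitting-edge rule: among the zero entries of the reduced matrix, choose the one that maximizes the sum of the smallest remaining value in its row and in its column.

import math

INF = math.inf

# Normalized cost matrix from the example (0-based indices).
M = [[INF, 0, 83, 9, 30, 6, 50],
     [0, INF, 66, 37, 17, 12, 26],
     [29, 1, INF, 19, 0, 12, 5],
     [32, 83, 66, INF, 49, 0, 80],
     [3, 21, 56, 7, INF, 0, 28],
     [0, 85, 8, 42, 89, INF, 0],
     [18, 0, 0, 0, 58, 13, INF]]

def choose_splitting_edge(m):
    # Return ((i, j), increase): the zero entry whose exclusion raises
    # the right-subtree lower bound the most.
    n = len(m)
    best, best_inc = None, -1
    for i in range(n):
        for j in range(n):
            if m[i][j] != 0:
                continue
            row_min = min(m[i][k] for k in range(n) if k != j)
            col_min = min(m[k][j] for k in range(n) if k != i)
            if row_min + col_min > best_inc:
                best, best_inc = (i, j), row_min + col_min
    return best, best_inc

print(choose_splitting_edge(M))   # ((3, 5), 32): edge (4,6), increase 32

On the example matrix this selects the 0-based entry (3,5), i.e. edge (4,6), with a bound increase of 32+0 = 32, matching the discussion above.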