2 Analyzing Algorithms

To familiarize you with the framework of the design and analysis of algorithms, we start with the algorithm of insertion sort. We describe algorithms as programs written in a pseudocode that is similar in many aspects to C or Java. In pseudocode, we employ whatever expressive method is most clear and concise to specify a given algorithm. Issues of implementation are often ignored in order to convey the essence of the algorithm. Insertion sort is an efficient algorithm for sorting a small number of elements. The input numbers are sorted in place: the numbers are rearranged within the input array, with at most a constant number of them stored outside the array at any time.


Pseudocode for Insertion Sort
INSERTION-SORT(A)
1  FOR j := 2 TO length[A] DO
2      key := A[j];
3      //Insert A[j] into the sorted sequence A[1 .. j-1]
4      i := j - 1;
5      WHILE i > 0 and A[i] > key DO
6          A[i+1] := A[i];
7          i := i - 1
       END
8      A[i+1] := key
   END
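
The pseudocode above can be sketched in Python as follows. This is a minimal translation, not a reference implementation: it assumes a Python list with 0-based indices, so the outer loop starts at index 1 and the guard `i > 0` becomes `i >= 0`. The function name `insertion_sort` is chosen here for illustration.

```python
def insertion_sort(a):
    """Sort the list a in place in ascending order and return it."""
    for j in range(1, len(a)):          # pseudocode: j = 2 .. length[A]
        key = a[j]
        # Insert a[j] into the sorted prefix a[0 .. j-1].
        i = j - 1
        while i >= 0 and a[i] > key:    # pseudocode guard: i > 0 and A[i] > key
            a[i + 1] = a[i]             # shift the larger element one place right
            i -= 1
        a[i + 1] = key
    return a


print(insertion_sort([5, 2, 4, 6, 1, 3]))  # prints [1, 2, 3, 4, 5, 6]
```

Only `key`, `i` and `j` live outside the list, which matches the "sorted in place" description.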


See an example on papantulis.

Loop Invariants and the Correctness of Algorithms
In the insertion sort algorithm, the elements A[1..j-1] are the elements originally in positions 1 through j-1, but now in sorted order. We state these properties of A[1..j-1] formally as a loop invariant: At the start of each iteration of the for loop of lines 1 - 8, the subarray A[1..j-1] consists of the elements originally in A[1..j-1] but in sorted order. We use loop invariants to help us understand why an algorithm is correct.


We must show three things about a loop invariant:
1. Initialization: It is true prior to the first iteration of the loop.
2. Maintenance: If it is true before an iteration of the loop, it remains true before the next iteration.
3. Termination: When the loop terminates, the invariant gives us a useful property that helps show that the algorithm is correct.

Showing the correctness of insertion sort by using a loop invariant
Initialization: We have to show that the loop invariant holds before the first loop iteration, i.e., when j = 2. The subarray A[1..j-1], therefore, consists of just the single element A[1], which is in fact the original element in A[1]. Moreover, this subarray is sorted, trivially.
Maintenance: We have to show that each iteration maintains the loop invariant. The body of the FOR loop works by moving A[j-1], A[j-2], and so on by one position to the right until the proper position for A[j] is found (lines 4-7), at which point the value of A[j] is inserted (line 8). The subarray A[1..j] then consists of the elements originally in A[1..j], in sorted order, so the invariant holds when j is incremented for the next iteration.
Termination: We examine what happens when the loop terminates. The FOR loop ends when j exceeds n (the length of A), i.e., when j = n + 1. Substituting n+1 for j in the statement of the loop invariant, we have that the subarray A[1..n] consists of the elements originally in A[1..n], but in sorted order. But the array A[1..n] is the entire array. Hence, the entire array is sorted, which means that the algorithm is correct.
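
One way to make the three proof obligations concrete is to check the invariant at run time. The sketch below is an illustration only, under two assumptions: it uses 0-based Python lists, and it uses Python's built-in `sorted` as an oracle for "the same elements in sorted order". The name `insertion_sort_checked` is hypothetical. Passing assertions on some inputs is not a proof, but a failing assertion would expose a wrong invariant.

```python
def insertion_sort_checked(a):
    """Insertion sort that asserts the loop invariant at each outer iteration."""
    original = list(a)                  # snapshot of the input for the oracle
    n = len(a)
    for j in range(1, n):
        # Invariant (initialization and maintenance): the prefix a[0 .. j-1]
        # holds the elements originally in that prefix, in sorted order.
        assert a[:j] == sorted(original[:j])
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    # Termination: the invariant with the prefix equal to the whole list.
    assert a == sorted(original)
    return a


print(insertion_sort_checked([31, 41, 59, 26, 41, 58]))
```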

WHILE Loops
Programming languages offer a variety of loop constructs, such as while loops, for loops, and repeat-until loops. We will focus on while loops. The other loops can be written in terms of while loops, so no expressive programming power is lost by restricting attention to while loops alone. The template for a WHILE loop is as follows:

WHILE E DO S END

The predicate E is a condition or boolean expression. It is called the guard of the loop. If the loop is executed in a state in which E is false, then the loop terminates immediately without changing the state. If the loop is executed in a state s in which E is true, then the loop executes the sequence of statements S to reach a state s' from which the loop is again executed. Execution of a loop thus passes through a sequence of intermediate states. In general it is possible that a loop never terminates execution, because the guard is true in every state that is reached. It will therefore be necessary to prove that the loops always terminate when they are required to.
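
The execution rule above can be modelled directly. In this sketch the guard E is a function from a state to a boolean and the body S is a function from a state to the next state. The helper `run_while` and its `max_steps` cut-off are assumptions made for illustration: the cut-off is only a practical safeguard against a loop whose guard never becomes false.

```python
def run_while(guard, body, state, max_steps=1000):
    """Model WHILE E DO S END and return the list of states passed through."""
    states = [state]
    while guard(state):
        if len(states) > max_steps:
            raise RuntimeError("loop did not terminate within max_steps")
        state = body(state)             # S takes state s to the next state s'
        states.append(state)
    return states


# Countdown loop: guard x > 0, body x := x - 1, initial state x = 3.
print(run_while(lambda x: x > 0, lambda x: x - 1, 3))  # prints [3, 2, 1, 0]
```

If the guard is false in the initial state, the returned list contains only that state, which matches "terminates immediately without changing the state".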

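[Figure: flowchart of the WHILE loop. The guard E is tested; if it is true, the body S is executed and control returns to the test; if it is false, the loop exits.]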

Example
The "Russian Multiplication" algorithm multiplies two numbers a and b by producing two columns of numbers, headed by a and b. The first column is produced by repeated halving with rounding down (integer division), and the second column is produced by repeated doubling. All rows in which the first entry is even are removed. The sum of the second column then gives the product of a and b.

Russian multiplication of 57 and 43

   57      43
   28      86   removed
   14     172   removed
    7     344
    3     688
    1    1376
         ----
         2451

The algorithm can be described with a loop, which keeps track of the running total as execution proceeds. When the loop finishes, the running sum total will be the product of the two variables a and b. Here x/2 denotes integer division (halving with rounding down).

x := a;
y := b;
total := 0;
WHILE x > 0 DO
    IF x mod 2 = 1 THEN total := total + y END;
    x := x/2;
    y := y*2
END
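
A direct Python rendering of this loop is sketched below. It assumes that a is a natural number (for a negative a the guard is false at once and the result would be 0), and it uses `//` for the rounding-down division. The function name `russian_multiply` is chosen here for illustration.

```python
def russian_multiply(a, b):
    """Multiply natural number a by integer b using halving and doubling."""
    x, y, total = a, b, 0
    while x > 0:
        if x % 2 == 1:      # rows with an odd first entry contribute y
            total += y
        x //= 2             # halve, rounding down
        y *= 2              # double
    return total


print(russian_multiply(57, 43))  # prints 2451
```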

The loop in the previous slide is supposed to implement the Russian Multiplication algorithm, but no argument has yet been given concerning its correctness. Correctness must be with respect to a requirement, expressed as a postcondition P. For the Russian Multiplication loop, the postcondition is that total = a*b. Since a loop execution passes through many intermediate states, the relationship between the initial state from which it is executed and the final state it reaches must take the intermediate states into account. The relationship between one intermediate state and the next is described by the body S of the loop. The key link between successive states is captured by a loop invariant.

A loop invariant is a condition (predicate) which holds of all of the states that the execution of the loop passes through, before and after each execution of the loop body S. It therefore provides a link between the initial and final states, connected through all of the intermediate states.

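[Figure: Loop control points annotated with predicates. I holds before the guard E is tested, I ∧ E holds before the body S, I holds again after S, and I ∧ ¬E holds on exit, where it must imply P.]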

Example
Consider again the Russian Multiplication. In the initial state: x = a & y = b & total = 0. It requires that total = a*b in the final state. Proving that the algorithm is correct requires the identification of some predicate I on some or all of the variables, which is true for the initial state, is preserved by the body S of the loop, and which implies the postcondition total = a*b when the guard x > 0 is false. One suitable invariant is the following:

total + x * y = a * b

At any particular stage, the invariant describes what has been achieved so far, as encapsulated in the value total.

The proof is as follows:
Initialization: Initially, total = 0 and x*y is indeed a*b. So the invariant is true when the loop begins.
Maintenance: On a single pass through the loop, there are two possibilities:
1. If x is even, then total remains as it was, x is halved, and y is doubled. In this case (x/2)*(y*2) is the same as x*y, so the invariant remains true.
2. If x is odd, then y is added to total, x is then halved with rounding down, and y is doubled. Since x is odd, x/2 = (x-1)/2, so (total + y) + (x/2)*(y*2) = total + y + (x-1)*y = total + x*y, and the invariant again remains true.
Thus the invariant is preserved by every iteration of the loop.
Termination: Finally, on termination total + x*y = a*b and also the negation of the guard holds. Since x is a natural number, the negation of x > 0 means x = 0. It follows that total = a*b, which is the required postcondition.
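
The three steps of this proof can be mirrored by assertions in code. The sketch below is illustrative only: the function name `russian_multiply_trace` is hypothetical, a is assumed to be a natural number, and the assertions check the invariant total + x*y = a*b on one run rather than prove it.

```python
def russian_multiply_trace(a, b):
    """Run the loop, assert the invariant, and return the (x, y, total) states."""
    x, y, total = a, b, 0
    assert total + x * y == a * b           # initialization
    rows = [(x, y, total)]
    while x > 0:
        if x % 2 == 1:
            total += y
        x //= 2
        y *= 2
        assert total + x * y == a * b       # maintenance
        rows.append((x, y, total))
    assert x == 0 and total == a * b        # termination: invariant and not guard
    return rows


for x, y, total in russian_multiply_trace(57, 43):
    print(x, y, total)
```

For 57 and 43 the final row printed is `0 2752 2451`.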

An execution of the Russian Multiplication loop

   x      y   total   INVARIANT
  57     43       0      0 + 57*43   = 57*43
  28     86      43     43 + 28*86   = 57*43
  14    172      43     43 + 14*172  = 57*43
   7    344      43     43 + 7*344   = 57*43
   3    688     387    387 + 3*688   = 57*43
   1   1376    1075   1075 + 1*1376  = 57*43
   0   2752    2451   2451 + 0*2752  = 57*43

Finding an Invariant
Loop correctness requires that I ∧ ¬E ⇒ P. This says that the invariant, together with the negation of the guard, must imply the postcondition. This requirement can be used to guide the development of a loop which is required to establish a particular postcondition P.

One technique is to obtain I by weakening P, so I holds for more states than P does. The loop should then terminate when the particular instance of I corresponds to the situation where P also holds: this will influence the choice of guard.

Constructing I by Weakening P
Replacing a constant with a variable
The postcondition P can be weakened by replacing a constant N by a variable i, so that P = I when i is equal to N. In this case, the guard of the loop should be i ≠ N. When the loop terminates, the guard is false, and we will have I ∧ i=N, which indeed implies P.

Example
We want to develop a loop to sum the elements of an array aa[1..N]. The postcondition P is sum = Σj.(j ∈ 1..N | aa[j]). Replacing the constant N by a variable i results in the invariant I: sum = Σj.(j ∈ 1..i | aa[j]). The type of i is natural number. The guard E of the loop is: i ≠ N. It follows by construction that I ∧ ¬E ⇒ P.

The suitable loop is:

sum := 0;
i := 0;
WHILE i ≠ N DO
    i := i + 1;
    sum := sum + aa[i]
END
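
The same loop can be sketched in Python with the invariant checked on every pass. Two assumptions differ from the pseudocode: Python lists are 0-based, so the invariant reads "total is the sum of the first i elements" and the element is added before i is incremented; and the built-in `sum` is used only as an oracle inside the assertions. The name `array_sum` is chosen for illustration.

```python
def array_sum(aa):
    """Sum the list aa with the loop derived by replacing N with a variable i."""
    n = len(aa)
    total, i = 0, 0
    assert total == sum(aa[:i])         # invariant holds initially (empty sum)
    while i != n:                       # guard: i differs from N
        total += aa[i]                  # 0-based aa[i] is the pseudocode's aa[i+1]
        i += 1
        assert total == sum(aa[:i])     # invariant restored after the body
    return total                        # invariant and i == n give the postcondition


print(array_sum([3, 1, 4, 1, 5]))  # prints 14
```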

Exercise
The postcondition in the previous example can also be weakened to obtain the following invariant I: sum = Σj.(j ∈ i..N | aa[j]). Develop a complete loop with this invariant.

Constructing I by Weakening P (cont.)
Deleting a conjunct
If a postcondition consists of a number of conjuncts, then it can be weakened by deleting one (or several) of its conjuncts. The resulting predicate will be true in more states than the postcondition, and might be suitable as a loop invariant. The loop guard in this case will be the negation of the deleted conjunct. So the negation of the guard and the remaining conjuncts together imply the postcondition.

Example
The integer square root r of a natural number n is the greatest integer whose square is no more than n. So we have P = r² <= n & n < (r + 1)². Deleting the second conjunct leaves r² <= n. This will do as an invariant of a loop to achieve the postcondition P. It is true when r = 0, so an initial state for the loop can easily be established. The loop guard will be the negation of the deleted conjunct: E = (r + 1)² <= n. The loop body simply increments r.

Thus the complete loop to compute integer square root is:

r := 0;
WHILE (r + 1)² <= n DO
    r := r + 1
END
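
A Python sketch of this loop, with the invariant r² <= n and the postcondition written as assertions, might look as follows. It assumes n is a natural number, and the function name `integer_sqrt` is chosen here for illustration.

```python
def integer_sqrt(n):
    """Return the greatest integer r with r*r <= n, for a natural number n."""
    r = 0
    assert r * r <= n                           # invariant holds initially
    while (r + 1) * (r + 1) <= n:               # guard: negation of deleted conjunct
        r += 1
        assert r * r <= n                       # invariant preserved by the body
    assert r * r <= n < (r + 1) * (r + 1)       # postcondition P
    return r


print(integer_sqrt(10))  # prints 3
```

The body runs once for each value of r from 1 up to the result, so this loop is a demonstration of the derivation technique rather than a fast way to compute square roots.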

Random-Access Machine (RAM)
Analyzing an algorithm means predicting the resources (computing time, memory, communication bandwidth, etc.) that the algorithm requires. Most often it is computing time that we want to measure. Before we can analyze an algorithm, we must have a model of the implementation technology. We will use a generic one-processor RAM (random-access machine) model of computation and implement our algorithms as computer programs on that machine.

The RAM model contains instructions commonly found in real computers: arithmetic (add, subtract, multiply, divide, remainder, floor, ceiling), data movement (load, store, copy), and control (conditional and unconditional branch, subroutine call and return). Each such instruction takes a constant amount of time. The model supports the data types integer and floating point.

Running Time
The running time of an algorithm is the number of primitive operations or steps executed. Running time depends on
input size (e.g. 8 elements vs. 8000)
the input itself (e.g. already sorted or not)
The mathematical expression for the running time of INSERTION-SORT can be determined as follows. For each j = 2, 3, ..., n, where n = length[A], we let t_j be the number of times the while loop test in line 5 is executed for that value of j.

INSERTION-SORT(A)                                          cost   times
1  FOR j := 2 TO length[A] DO                              c1     n
2      key := A[j];                                        c2     n-1
3      //Insert A[j] into the sorted sequence A[1 .. j-1]  0      n-1
4      i := j - 1;                                         c4     n-1
5      WHILE i > 0 and A[i] > key DO                       c5     Σ(j=2..n) t_j
6          A[i+1] := A[i];                                 c6     Σ(j=2..n) (t_j - 1)
7          i := i - 1                                      c7     Σ(j=2..n) (t_j - 1)
       END
8      A[i+1] := key                                       c8     n-1
   END

The running time of INSERTION-SORT is:

\[ T(n) = c_1 n + c_2 (n-1) + c_4 (n-1) + c_5 \sum_{j=2}^{n} t_j + c_6 \sum_{j=2}^{n} (t_j - 1) + c_7 \sum_{j=2}^{n} (t_j - 1) + c_8 (n-1). \]
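
To see what the counts t_j look like on concrete inputs, the sort can be instrumented. The sketch below is an illustration under stated assumptions: it works on a copy of a 0-based Python list, reports the counts under the pseudocode's 1-based j, and uses the hypothetical name `while_test_counts`.

```python
def while_test_counts(a):
    """Run insertion sort on a copy of a and return {j: t_j} for j = 2..n."""
    a = list(a)
    t = {}
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        tests = 1                       # the guard is evaluated at least once
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
            tests += 1                  # and once more after every pass of the body
        a[i + 1] = key
        t[j + 1] = tests                # report under the pseudocode's 1-based j
    return t


print(while_test_counts([1, 2, 3, 4]))  # prints {2: 1, 3: 1, 4: 1}
print(while_test_counts([4, 3, 2, 1]))  # prints {2: 2, 3: 3, 4: 4}
```

The body of the while loop runs t_j - 1 times for each j, which is why lines 6 and 7 carry the factor (t_j - 1) in the formula.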

For inputs of a given size, an algorithm's running time may depend on which input of that size is given. For example, in INSERTION-SORT, the best case occurs if the array is already sorted. In this case, t_j = 1 for j = 2, 3, ..., n. The best-case running time is

\[ T(n) = c_1 n + c_2 (n-1) + c_4 (n-1) + c_5 (n-1) + c_8 (n-1) = (c_1 + c_2 + c_4 + c_5 + c_8)\, n - (c_2 + c_4 + c_5 + c_8). \]

This is a linear function of n.

The worst case occurs if the array is in reverse sorted order. In this case, each element A[j] must be compared with each element in the entire sorted subarray A[1 .. j-1], and so t_j = j for j = 2, 3, ..., n. Using

\[ \sum_{j=2}^{n} j = \frac{n(n+1)}{2} - 1, \qquad \sum_{j=2}^{n} (j-1) = \frac{n(n-1)}{2}, \]

the worst-case running time is a quadratic function of n:

\[ T(n) = \left(\frac{c_5}{2} + \frac{c_6}{2} + \frac{c_7}{2}\right) n^2 + \left(c_1 + c_2 + c_4 + \frac{c_5}{2} - \frac{c_6}{2} - \frac{c_7}{2} + c_8\right) n - (c_2 + c_4 + c_5 + c_8). \]
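
The best-case and worst-case closed forms can be checked numerically against the general expression for T(n). In the sketch below the cost values in `c` are arbitrary integers picked for illustration, not measured costs, and the function names are hypothetical. The worst-case form is multiplied through by 2 so that the check stays in exact integer arithmetic.

```python
def running_time(n, t, c):
    """General T(n) from line costs c[1..8] and counts t[j] for j = 2..n."""
    js = range(2, n + 1)
    return (c[1] * n + c[2] * (n - 1) + c[4] * (n - 1)
            + c[5] * sum(t[j] for j in js)
            + c[6] * sum(t[j] - 1 for j in js)
            + c[7] * sum(t[j] - 1 for j in js)
            + c[8] * (n - 1))


def best_case(n, c):
    """Linear closed form, valid when t_j = 1 for all j."""
    return (c[1] + c[2] + c[4] + c[5] + c[8]) * n - (c[2] + c[4] + c[5] + c[8])


def worst_case(n, c):
    """Quadratic closed form, valid when t_j = j for all j."""
    twice = ((c[5] + c[6] + c[7]) * n * n
             + (2 * (c[1] + c[2] + c[4] + c[8]) + c[5] - c[6] - c[7]) * n
             - 2 * (c[2] + c[4] + c[5] + c[8]))
    return twice // 2                   # twice is 2*T(n), hence always even


c = {1: 3, 2: 1, 4: 1, 5: 4, 6: 2, 7: 1, 8: 1}     # arbitrary illustrative costs
for n in range(1, 30):
    sorted_counts = {j: 1 for j in range(2, n + 1)}
    reversed_counts = {j: j for j in range(2, n + 1)}
    assert running_time(n, sorted_counts, c) == best_case(n, c)
    assert running_time(n, reversed_counts, c) == worst_case(n, c)
print(best_case(5, c), worst_case(5, c))  # prints 43 113
```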

Kinds of Algorithm Analysis
Worst case (usually): T(n) = maximum time on any input of size n.
Average case (sometimes): T(n) = average time over all inputs of size n (assumes a statistical distribution of inputs).
Best case (never): Useless, because we can cheat with a slow algorithm that works fast on some input.

We usually concentrate on finding the worst-case running time. Why? Three reasons:
1. The worst-case running time is an upper bound on the running time for any input. Knowing it gives us a guarantee that the algorithm will never take any longer.
2. For some algorithms, the worst case occurs fairly often. In some searching applications, searches for absent info may be frequent.
3. The "average case" is often roughly as bad as the worst case. For example, suppose that we randomly choose n numbers and apply insertion sort. The average-case running time is just like the worst-case running time, i.e., a quadratic function of n.
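
The third reason can be illustrated for a small n by exhaustive enumeration. The sketch below models "randomly choose n numbers" as every ordering of n distinct values being equally likely, which is an assumption made here, and it counts how often the body of the while loop runs. The name `count_shifts` is hypothetical.

```python
from itertools import permutations


def count_shifts(a):
    """Number of times the while-loop body (lines 6-7) runs when sorting a."""
    a = list(a)
    shifts = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
            shifts += 1
        a[i + 1] = key
    return shifts


n = 6
counts = [count_shifts(p) for p in permutations(range(n))]
average = sum(counts) / len(counts)
worst = max(counts)
print(average, worst)  # prints 7.5 15
```

For n = 6 the average is n(n-1)/4 = 7.5 shifts against a worst case of n(n-1)/2 = 15, so the average is half the worst case and still grows quadratically in n.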
