4th semester
NOTES
Unit-1
Introduction to algorithms
Algorithm:
An algorithm is a logical, step-by-step method for solving a problem; in other words, it is a finite sequence of unambiguous instructions for solving a problem. Devising an algorithm is the first step in solving a mathematical or computational problem. An algorithm may involve calculations, reasoning and data processing.
[Figure: a problem is given as input to an algorithm, which produces a solution]
• Analysing an algorithm
Understanding the problem: The given problem should be understood completely. Check whether it is similar to some standard problem for which a known algorithm exists; otherwise a new algorithm has to be devised. Creating an algorithm is an art which may never be fully automated. An important step in the design is to specify an instance of the problem.
Decide on the appropriate data structure: Some algorithms do not demand any ingenuity in representing their inputs; others are in fact predicated on ingenious data structures. A data type is a well-defined collection of data with a well-defined set of operations on it. A data structure is an actual implementation of a particular abstract data type. The elementary data structures include:
i) Arrays: Indexed collections of elements; you can have arrays of any other data type.
ii) Records: These let you organize non-homogeneous data into logical packages to keep everything together. These packages contain only data fields, not operations.
iii) Sets: These let you represent subsets of a set, with operations such as intersection, union, and equivalence.
Algorithm design techniques: Creating an algorithm is an art which may never be fully automated. By mastering standard design strategies, it becomes easier to devise new and useful algorithms. Dynamic programming is one such technique. Some of the techniques are especially useful in fields other than computer science, such as operations research and electrical engineering; important examples are linear, non-linear and integer programming.
Methods of specifying an algorithm: There are mainly two options for specifying an algorithm: natural language or pseudocode, and flowcharts.
Another benefit of analysing an algorithm is that it allows you to predict whether the software will meet any efficiency constraint that exists.
For example, Euclid's algorithm for computing gcd(m, n), the greatest common divisor of two nonnegative integers m and n:
Step 1: If n = 0, return the value of m as the answer and stop; otherwise proceed to Step 2.
Step 2: Divide m by n and assign the value of the remainder to r.
Step 3: Assign the value of n to m and the value of r to n. Go to Step 1.
Example: m = 60 and n = 24
60 % 24 = 12
24 % 12 = 0
After the second division, the remainder r = 0 is assigned to n, so n = 0 and the value of m = 12 is returned: GCD(60, 24) = 12.
A second method is the consecutive integer checking algorithm:
Step 1: Assign the value of min{m, n} to t.
Step 2: Divide m by t. If the remainder of this division is 0, go to Step 3; otherwise, go to Step 4.
Step 3: Divide n by t. If the remainder of this division is 0, return the value of t as the answer and stop; otherwise, proceed to Step 4.
Step 4: Decrease the value of t by 1. Go to Step 2.
A third method is the middle-school procedure:
Step 1: Find the prime factorization of m.
Step 2: Find the prime factorization of n.
Step 3: Identify all the common factors in the two prime expansions found in Steps 1 and 2. (If p is a common factor occurring pm times in m and pn times in n, it should be repeated min{pm, pn} times.)
Step 4: Compute the product of all the common factors and return it as the greatest common divisor of the numbers given.
For m = 60 and n = 24 we get
60 = 2 · 2 · 3 · 5
24 = 2 · 2 · 2 · 3
so the common factors give gcd(60, 24) = 2 · 2 · 3 = 12.
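Euclid's algorithm translates directly into a few lines of Python (a minimal sketch; the function name gcd is my own choice):

```python
def gcd(m, n):
    """Euclid's algorithm for the greatest common divisor."""
    while n != 0:          # Step 1: stop when n == 0
        m, n = n, m % n    # Steps 2-3: r = m % n, then n -> m and r -> n
    return m

print(gcd(60, 24))  # the worked example above: GCD(60, 24) = 12
```

The same loop also handles the base case directly, since gcd(m, 0) = m.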
As an example, consider the application of the algorithm to finding the list of primes not exceeding n = 25:
2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
2 3 5 7 9 11 13 15 17 19 21 23 25
2 3 5 7 11 13 17 19 23 25
2 3 5 7 11 13 17 19 23
For this example, no more passes are needed because they would eliminate numbers already eliminated on previous iterations of the algorithm. The remaining numbers on the list are the consecutive primes less than or equal to 25.
In general, what is the largest number p whose multiples can still remain on the list? Before we answer this question, let us first note that if p is a number whose multiples are being eliminated on the current pass, then the first multiple we should consider is p · p, because all its smaller multiples 2p, ..., (p − 1)p have been eliminated on earlier passes through the list. This observation helps to avoid eliminating the same number more than once. Obviously, p · p should not be greater than n, and therefore p cannot exceed √n rounded down (denoted ⌊√n⌋ using the so-called floor function). We assume in the following pseudocode that there is a function available for computing ⌊√n⌋; alternatively, we could check the inequality p · p ≤ n as the loop continuation condition there.
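A sketch of the sieve in Python, using the p · p ≤ n loop condition mentioned above so no separate ⌊√n⌋ routine is needed:

```python
def sieve(n):
    """Sieve of Eratosthenes: return the list of primes <= n."""
    candidate = [False, False] + [True] * (n - 1)  # indices 0..n; 0 and 1 out
    p = 2
    while p * p <= n:          # p never needs to exceed floor(sqrt(n))
        if candidate[p]:
            # start crossing out at p*p: the smaller multiples
            # 2p, ..., (p-1)p were eliminated on earlier passes
            for m in range(p * p, n + 1, p):
                candidate[m] = False
        p += 1
    return [i for i, alive in enumerate(candidate) if alive]

print(sieve(25))  # [2, 3, 5, 7, 11, 13, 17, 19, 23]
```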
So now we can incorporate the sieve of Eratosthenes into the middle-school procedure to get a legitimate algorithm for computing the greatest common divisor of two positive integers. Note that special care needs to be exercised if one or both input numbers are equal to 1: because mathematicians do not consider 1 to be a prime number, strictly speaking, the method does not work for such inputs.
Before we leave this section, one more comment is in order. The examples considered in this section notwithstanding, the majority of algorithms in use today, even those that are implemented as computer programs, do not deal with mathematical problems. Look around for algorithms helping us through our daily routines, both professional and personal. May this ubiquity of algorithms in today's world strengthen your resolve to learn more about these fascinating engines of the information age.
Analysis Framework:
Asymptotic Notations:
Asymptotic Notations are languages that allow us to analyze an algorithm’s running time by
identifying its behavior as the input size for the algorithm increases. This is also known as an
algorithm’s growth rate. Does the algorithm suddenly become incredibly slow when the input
size grows? Does it mostly maintain its quick run time as the input size increases?
Asymptotic Notation gives us the ability to answer these questions.
Another way is to physically measure the amount of time an algorithm takes to complete given different input sizes. However, the accuracy and comparability of timings obtained this way (they are only relative to the machine they were computed on) are bound to environmental variables such as computer hardware specifications, processing power, etc.
In the first section of this doc we described how an Asymptotic Notation identifies the
behavior of an algorithm as the input size changes. Let us imagine an algorithm as a function
f, n as the input size, and f(n) being the running time. So for a given algorithm f, with input
size n you get some resultant run time f(n). This results in a graph where the Y axis is the
runtime, the X axis is the input size, and the plot points are the run times for each given input size.
You can label a function, or algorithm, with an Asymptotic Notation in many different ways. For example, you can describe an algorithm by its best case, worst case, or average case. The most common is to analyze an algorithm by its worst case. You typically don’t
evaluate by best case because those conditions aren’t what you’re planning for. A very good
example of this is sorting algorithms; specifically, adding elements to a tree structure. Best
case for most algorithms could be as low as a single operation. However, in most cases, the
element you’re adding will need to be sorted appropriately through the tree, which could
mean examining an entire branch. This is the worst case, and this is what we plan for.
[Figure: growth-rate curves, e.g. a linear function an + b compared with a constant]
One extremely important note: for the notations about to be discussed, you should do your best to use the simplest terms. This means disregarding constants and lower-order terms, because as the input size (or n in our f(n) example) increases to infinity (mathematical limits), the lower-order terms and constants are of little to no importance. That being said, if you have constants that are enormous, such as 2^9001 or some other unimaginable amount, realize that simplifying will skew your notation's accuracy.
Common growth-rate functions:
Logarithmic - log n
Linear - n
Quadratic - n^2
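To see numerically why lower-order terms and constants fade, take a made-up polynomial f(n) = 3n^2 + 50n + 1000 (the coefficients are arbitrary, chosen for illustration) and compare it with its leading term:

```python
# f(n) = 3n^2 + 50n + 1000: as n grows, the leading 3n^2 term dominates
for n in [10, 100, 10_000]:
    f = 3 * n**2 + 50 * n + 1000
    leading = 3 * n**2
    print(n, f, round(leading / f, 4))  # ratio approaches 1 as n grows
```

At n = 10 the lower-order terms still matter, but by n = 10,000 the leading term accounts for over 99% of f(n), which is why we write f(n) as simply n^2.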
Big-O:
Big-O, commonly written as O, is an Asymptotic Notation for the worst case, or ceiling of
growth for a given function. It provides us with an asymptotic upper bound for the growth
rate of runtime of an algorithm. Say f(n) is your algorithm runtime, and g(n) is an arbitrary
time complexity you are trying to relate to your algorithm. f(n) is O(g(n)), if for some real
constants c (c > 0) and n0, f(n) <= c g(n) for every input size n (n > n0).
Example 1
f(n) = 3 log n + 100
g(n) = log n
Is f(n) O(g(n))? Is 3 log n + 100 O(log n)? Let’s look to the definition of Big-O.
Is there some pair of constants c, n0 that satisfies 3 log n + 100 <= c log n for all n > n0? Yes: c = 103 and n0 = 2 work, because log n >= 1 for all n > 2; so f(n) IS O(g(n)).
Example 2
f(n) = 3*n^2
g(n) = n
3 * n^2 <= c * n
Is there some pair of constants c, n0 that satisfies this for all n > n0? No, there isn’t: for any fixed c, 3n^2 > c n as soon as n > c/3. f(n) is NOT O(g(n)).
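Both examples can be spot-checked numerically; the constants c = 103, n0 = 2 (Example 1) and c = 1000 (Example 2) are my own choices for this sketch, not the only ones possible:

```python
import math

# Example 1: f(n) = 3 log n + 100 IS O(log n); with c = 103 and n0 = 2
# the bound holds because log2(n) >= 1 for all n >= 2.
for n in range(3, 1000):
    assert 3 * math.log2(n) + 100 <= 103 * math.log2(n)

# Example 2: 3n^2 is NOT O(n): whatever fixed c we pick (1000 here),
# 3n^2 exceeds c*n as soon as n > c/3.
c = 1000
assert any(3 * n**2 > c * n for n in range(1, 10_000))
print("both checks pass")
```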
Big-Omega:
Big-Omega, commonly written as Ω, is an Asymptotic Notation for the best case, or a floor
growth rate for a given function. It provides us with an asymptotic lower bound for the
growth rate of runtime of an algorithm.
f(n) is Ω(g(n)) if, for some real constants c (c > 0) and n0 (n0 > 0), f(n) >= c g(n) for every input size n (n > n0).
Note
The asymptotic growth rates provided by big-O and big-omega notation may or may not be
asymptotically tight. Thus we use small-o and small-omega notation to denote bounds that
are not asymptotically tight.
Small-o:
Small-o, commonly written as o, is an Asymptotic Notation to denote the upper bound (that is
not asymptotically tight) on the growth rate of runtime of an algorithm.
f(n) is o(g(n)) if, for all real constants c (c > 0), there is an n0 (n0 > 0) such that f(n) < c g(n) for every input size n (n > n0).
The definitions of O-notation and o-notation are similar. The main difference is that in f(n) = O(g(n)), the bound f(n) <= c g(n) holds for some constant c > 0, but in f(n) = o(g(n)), the bound f(n) < c g(n) holds for all constants c > 0.
Small-omega:
Small-omega, commonly written as ω, is an Asymptotic Notation to denote the lower bound
(that is not asymptotically tight) on the growth rate of runtime of an algorithm.
f(n) is ω(g(n)) if, for all real constants c (c > 0), there is an n0 (n0 > 0) such that f(n) > c g(n) for every input size n (n > n0).
The definitions of Ω-notation and ω-notation are similar. The main difference is that in f(n) = Ω(g(n)), the bound f(n) >= c g(n) holds for some constant c > 0, but in f(n) = ω(g(n)), the bound f(n) > c g(n) holds for all constants c > 0.
Theta:
Theta, commonly written as Θ, is an Asymptotic Notation to denote the asymptotically tight
bound on the growth rate of runtime of an algorithm.
f(n) is Θ(g(n)) if, for some real constants c1, c2 and n0 (c1 > 0, c2 > 0, n0 > 0), c1 g(n) <= f(n) <= c2 g(n) for every input size n (n > n0).
Feel free to head over to additional resources for examples of this. Big-O is the primary notation used for general algorithm time complexity.
WORST CASE
O(g(n)) = { f(n): there exist positive constants c and
n0 such that 0 <= f(n) <= c*g(n) for
all n >= n0}
BEST CASE
Ω (g(n)) = {f(n): there exist positive constants c and
n0 such that 0 <= c*g(n) <= f(n) for
all n >= n0}.
EXAMPLE 1:
Consider the problem of finding the value of the largest element in a list of n numbers.
For simplicity, we assume that the list is implemented as an array. The following is pseudocode of a standard algorithm for solving the problem:
ALGORITHM MaxElement(A[0..n − 1])
//Determines the value of the largest element in a given array
maxval ← A[0]
for i ← 1 to n − 1 do
    if A[i] > maxval
        maxval ← A[i]
return maxval
The obvious measure of an input's size here is the number of elements in the array, i.e., n. The operations that are going to be executed most often are in the algorithm's for loop. There are two operations in the loop's body: the comparison A[i] > maxval and the assignment maxval ← A[i]. Which of these two operations should we consider basic? Since the comparison is executed on each repetition of the loop and the assignment is not, we should consider the comparison to be the algorithm's basic operation.
Note that the number of comparisons will be the same for all arrays of size n;
therefore, in terms of this metric, there is no need to distinguish among the
worst, average, and best cases here.
Let us denote by C(n) the number of times this comparison is executed and try to find a formula expressing it as a function of size n. The algorithm makes one comparison on each execution of the loop, which is repeated for each value of the loop’s variable i within the bounds 1 and n − 1, inclusive. Therefore, we get the following sum for C(n):
C(n) = Σ_{i=1}^{n−1} 1 = n − 1.
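The count C(n) = n − 1 can be checked empirically with a Python sketch of the scan; the comparison counter is instrumentation added for this illustration, not part of the algorithm:

```python
def max_element(a):
    """Scan once, tracking the largest value seen so far."""
    comparisons = 0
    maxval = a[0]
    for i in range(1, len(a)):
        comparisons += 1          # the basic operation: a[i] > maxval
        if a[i] > maxval:
            maxval = a[i]
    return maxval, comparisons

val, c = max_element([3, 9, 2, 7, 9, 1])
print(val, c)  # 9 and 5 comparisons: C(n) = n - 1 for n = 6
```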
EXAMPLE 2 Consider the element uniqueness problem: check whether all the elements in a given array of n elements are distinct.
The natural measure of the input’s size here is again n, the number of elements in the array. Since the innermost loop contains a single operation (the comparison of two elements), we should consider it as the algorithm’s basic operation. Note, however, that the number of element comparisons depends not only on n but also on whether there are equal elements in the array and, if there are, which array positions they occupy. We will limit our investigation to the worst case only.
By definition, the worst case input is an array for which the number of
element comparisons Cworst (n) is the largest among all arrays of size n. An
inspection of the innermost loop reveals that there are two kinds of worst-case
inputs—inputs for which the algorithm does not exit the loop prematurely:
arrays with no equal elements and arrays in which the last two elements are the
only pair of equal elements. For such inputs, one comparison is made for each repetition of the innermost loop, i.e., for each value of the loop variable j between its limits i + 1 and n − 1; this is repeated for each value of the outer loop, i.e., for each value of the loop variable i between its limits 0 and n − 2. Accordingly, we get
Cworst(n) = Σ_{i=0}^{n−2} Σ_{j=i+1}^{n−1} 1 = Σ_{i=0}^{n−2} (n − 1 − i) = n(n − 1)/2,
where the last equality is obtained by applying summation formula (S2). Note that this result was perfectly predictable: in the worst case, the algorithm needs to compare all n(n − 1)/2 distinct pairs of its n elements.
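A Python sketch of the uniqueness check with a comparison counter (added for illustration) confirms the worst-case count n(n − 1)/2:

```python
def unique_elements(a):
    """Return whether all elements are distinct, plus the comparison count."""
    comparisons = 0
    n = len(a)
    for i in range(n - 1):             # i from 0 to n - 2
        for j in range(i + 1, n):      # j from i + 1 to n - 1
            comparisons += 1           # the basic operation: a[i] == a[j]
            if a[i] == a[j]:
                return False, comparisons  # premature exit on a duplicate
    return True, comparisons

print(unique_elements([4, 1, 7, 9]))  # no equal elements: worst case,
                                      # 4 * 3 / 2 = 6 comparisons
```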
EXAMPLE 3 Given two n-by-n matrices A and B, find the time efficiency of the definition-based algorithm for computing their product C = AB. By definition, C is an n-by-n matrix whose elements are computed as the scalar (dot) products of the rows of matrix A and the columns of matrix B:
C[i, j] = A[i, 0]B[0, j] + ... + A[i, n − 1]B[n − 1, j] for every pair of indices 0 ≤ i, j ≤ n − 1.
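The definition-based product can be sketched in Python; the three nested loops perform n multiplications per entry, n^3 in total:

```python
def matrix_multiply(A, B):
    """Definition-based product: C[i][j] = sum over k of A[i][k] * B[k][j]."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):    # n multiplications per entry -> n^3 total
                C[i][j] += A[i][k] * B[k][j]
    return C

print(matrix_multiply([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
# [[19, 22], [43, 50]]
```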
EXAMPLE 4 The following algorithm finds the number of binary digits in the binary representation of a positive decimal integer.
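A Python sketch of the algorithm: repeated halving, where each iteration strips one binary digit:

```python
def binary_digits(n):
    """Number of digits in the binary representation of a positive integer n."""
    count = 1
    while n > 1:
        count += 1
        n //= 2       # integer halving drops the last binary digit
    return count

print(binary_digits(25))  # 25 = 11001 in binary -> 5 digits
```

The loop runs ⌊log2 n⌋ times, so the algorithm is logarithmic in n.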
Mathematical Analysis of Recursive Algorithms :
In this section, we will see how to apply the general framework for analysis of algorithms to
recursive algorithms. We start with an example often used to introduce novices to the idea of a recursive algorithm.
EXAMPLE 1 Compute the factorial function F(n) = n! for an arbitrary non-negative integer n, using the recurrence F(n) = F(n − 1) · n for n > 0, with F(0) = 1.
Basic operation? The multiplication performed during the recursive call.
The formula for the number of multiplications,
M(n) = M(n − 1) + 1,
is a recursive formula too. This is typical.
We need the initial condition corresponding to the base case:
M(0) = 0
(computing F(0) requires no multiplications).
Solve by the method of backward substitutions:
M(n) = M(n-1) + 1
     = [M(n-2) + 1] + 1 = M(n-2) + 2    substituted M(n-2) + 1 for M(n-1)
     = [M(n-3) + 1] + 2 = M(n-3) + 3    substituted M(n-3) + 1 for M(n-2)
     ...                                a pattern emerges: M(n) = M(n-i) + i
     = M(0) + n
     = n
Not surprising!
Therefore M(n) ∈ Θ(n).
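A Python sketch of the recursive factorial with a multiplication counter confirms M(n) = n; the counter argument is instrumentation added for this illustration:

```python
def factorial(n, counter=None):
    """F(n) = F(n-1) * n with F(0) = 1; counter tallies multiplications."""
    if counter is None:
        counter = [0]
    if n == 0:
        return 1, counter[0]      # base case: no multiplications, M(0) = 0
    sub, _ = factorial(n - 1, counter)
    counter[0] += 1               # the basic operation: one multiplication
    return sub * n, counter[0]

value, mults = factorial(5)
print(value, mults)  # 120 computed with M(5) = 5 multiplications
```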
Example: Tower of Hanoi
Tower of Hanoi is a mathematical puzzle where we have three rods and n disks. The
objective of the puzzle is to move the entire stack to another rod, obeying the
following simple rules:
1) Only one disk can be moved at a time.
2) Each move consists of taking the upper disk from one of the stacks and placing it
on top of another stack i.e. a disk can only be moved if it is the uppermost disk on a
stack.
3) No disk may be placed on top of a smaller disk.
Approach :
Take an example for 2 disks :
Let rod 1 = 'A', rod 2 = 'B', rod 3 = 'C'.
Examples:
Input : 2
Output : Disk 1 moved from A to B
Disk 2 moved from A to C
Disk 1 moved from B to C
Input : 3
Output : Disk 1 moved from A to C
Disk 2 moved from A to B
Disk 1 moved from C to B
Disk 3 moved from A to C
Disk 1 moved from B to A
Disk 2 moved from B to C
Disk 1 moved from A to C
Procedure Hanoi(disk, source, dest, aux)
IF disk == 1, THEN
move disk from source to dest
ELSE
Hanoi(disk - 1, source, aux, dest) // Step 1: move disk-1 disks from source to aux
move disk from source to dest // Step 2: move the largest disk directly
Hanoi(disk - 1, aux, dest, source) // Step 3: move disk-1 disks from aux to dest
END IF
END Procedure
STOP
IC: M(1) = 1
M(n) = 2M(n-1) + 1
     = 2[2M(n-2) + 1] + 1 = 2^2 M(n-2) + 2 + 1
     = 2^3 M(n-3) + 2^2 + 2 + 1
     ...
     = 2^(n-1) M(1) + 2^(n-2) + ... + 2 + 1 = 2^n - 1
Therefore M(n) ∈ Θ(2^n)
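The pseudocode above maps directly to Python; this sketch records the moves so the outputs shown earlier for n = 2 and n = 3 can be reproduced, and the move count matches M(n) = 2^n − 1:

```python
def hanoi(disk, source, dest, aux, moves):
    """Move `disk` disks from source to dest via aux, recording each move."""
    if disk == 1:
        moves.append(f"Disk 1 moved from {source} to {dest}")
    else:
        hanoi(disk - 1, source, aux, dest, moves)               # Step 1
        moves.append(f"Disk {disk} moved from {source} to {dest}")  # Step 2
        hanoi(disk - 1, aux, dest, source, moves)               # Step 3
    return moves

for move in hanoi(3, 'A', 'C', 'B', []):
    print(move)   # the seven moves listed in the n = 3 example above
```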
IMPORTANT THEOREM FOR ALGORITHM DESIGNERS