
1. Introduction to algorithm analysis :

Algorithm analysis is the process of evaluating the performance of an algorithm, usually in terms of its time and space complexity. There are several ways to analyze the performance of an algorithm, including asymptotic analysis, which studies the behavior of an algorithm as the size of the input grows indefinitely.

Introduction to algorithm :

An introduction to algorithms typically involves familiarizing oneself with the fundamental concepts and techniques used in designing and analyzing algorithms, which are step-by-step procedures for solving computational problems. Here's a breakdown of what such an introduction might entail:
1. Definition of Algorithms: An algorithm is a set of well-defined instructions or a step-by-step
procedure for solving a problem or accomplishing a task. It's akin to a recipe or a roadmap that
guides the computer in performing a specific task.
2. Importance of Algorithms: Algorithms are crucial in computer science and programming
because they form the backbone of all software systems. Understanding algorithms is essential
for efficient problem-solving, software development, and optimizing resource usage.
3. Basic Algorithm Analysis: Introductions typically cover the basics of algorithm analysis, including
measuring the efficiency of algorithms in terms of time and space complexity. This involves
understanding concepts like Big O notation, which describes the upper bound or worst-case
scenario of an algorithm's runtime as a function of the input size.
4. Common Algorithmic Paradigms: Students are introduced to various algorithmic paradigms,
such as:
 Divide and Conquer: Breaking down a problem into smaller sub-problems, solving each
independently, and combining the solutions.
 Dynamic Programming: Solving complex problems by breaking them down into simpler
overlapping sub-problems and storing the solutions to avoid redundant computations.
 Greedy Algorithms: Making locally optimal choices at each step with the hope of finding
a global optimum.
 Backtracking: A systematic approach that incrementally builds candidate solutions and abandons (backtracks from) a partial candidate as soon as it cannot lead to a valid solution.
 Graph Algorithms: Solving problems related to graphs, such as finding shortest paths,
minimum spanning trees, and network flows.
5. Data Structures: Understanding basic data structures like arrays, linked lists, stacks, queues,
trees, and hash tables is essential. These structures serve as the building blocks for
implementing algorithms efficiently.
6. Sorting and Searching Algorithms: Students learn about fundamental algorithms for sorting and
searching data, including insertion sort, merge sort, quicksort, binary search, and linear search.
7. Application of Algorithms: Finally, an introduction might include real-world examples and
applications where algorithms play a crucial role, such as in internet search engines, social
networks, route planning, image processing, and cryptography.
Overall, an introduction to algorithms provides the foundation for understanding how computers
solve problems efficiently and equips learners with the essential tools for algorithm design,
analysis, and implementation.
Algorithm Specifications :

An algorithm is defined as a finite set of instructions that, if followed, performs a particular task.
All algorithms must satisfy the following criteria:
Input. An algorithm has zero or more inputs, taken or collected from a specified set of objects.
Output. An algorithm has one or more outputs having a specific relation to the inputs.
Definiteness. Each step must be clearly defined; Each instruction must be clear and
unambiguous.
Finiteness. The algorithm must always finish or terminate after a finite number of steps.
Effectiveness. All operations to be accomplished must be sufficiently basic that they can be done exactly and in a finite amount of time.
We can depict an algorithm in many ways.

Example 1: Algorithm for calculating factorial value of a number


Step 1: Read a number n
Step 2: Initialize the variable final to 1
Step 3: final ← final * n
Step 4: Decrease n by 1
Step 5: Check whether n is equal to 0
Step 6: If n is equal to 0, go to Step 8 (exit the loop)
Step 7: Else go to Step 3
Step 8: Print the result final
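The same steps can be written as runnable code. A minimal sketch in Python (the function name and the print call are illustrative additions, not part of the notes):

def factorial_iterative(n):
    """Compute n! by repeated multiplication, following the steps above."""
    final = 1                 # Step 2
    while n > 0:              # Steps 5-7: repeat until n reaches 0
        final = final * n     # Step 3
        n = n - 1             # Step 4
    return final              # Step 8: the accumulated product is n!

print(factorial_iterative(5))   # 120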
Recursive Algorithms
A recursive algorithm calls itself, generally passing the return value of one call as a parameter to the next call. This parameter is the input, while the return value is the output.
A recursive algorithm is defined as a method of simplification that divides the problem into sub-problems of the same nature. The result of one recursion is treated as the input for the next recursion. The repetition proceeds in a self-similar fashion. The algorithm calls itself with smaller input values and obtains the result by performing simple operations on these smaller values. Computing factorials and generating the Fibonacci number series are typical examples of recursive algorithms.
Example: Writing a factorial function using recursion
int factorialA(int n)
{
    if (n <= 1)
        return 1;                  /* base case: 0! = 1! = 1 */
    return n * factorialA(n - 1);  /* recursive case */
}
Algorithm Specification: A Pseudo Code Approach
For example, assume we have the problem of finding whether a given number is even or odd. For that problem, we have to write an algorithm. So far, we haven't specified how to write an algorithm. How can an algorithm be described?
1 Using Natural Language like English
2 Flow chart
3 Pseudo code Approach
The algorithm to find whether the given number is even or odd using the natural language
approach is shown below.
Step 1: Ask the user to provide a number
Step 2: Do the modulus division by two
Step 3: Compare the result of the modulus division with 0
Step 4: If the result is equal to 0, then it is an even number; else it is an odd number.
We use simple English sentences to write an algorithm in the above approach. Each step of the
algorithm is explained in detail using the English language.
The flow chart approach is a pictorial way of representing an algorithm. This approach is suitable
for small or straightforward algorithms, but it is a complex process for big algorithms.
The third approach is the pseudo code approach. As the name suggests, pseudo code sits part-way between natural language and real code: it uses keywords and syntax resembling a programming language without being tied to any particular one.
The algorithm to find whether the given number is even or odd is shown below.
Algorithm EvenorOdd()
{
    if (number % 2 == 0) then
        print even;
    else
        print odd;
}

Performance analysis :

Performance Analysis of Algorithms in DAA


Performance analysis of algorithms in the context of Design and Analysis of Algorithms (DAA) is
crucial to evaluate their efficiency and make informed design decisions. Here are the key aspects
of performance analysis:
Time Complexity Analysis:
Time complexity is a fundamental metric that quantifies the time an algorithm takes to complete as a function of the input size. Big O notation is commonly used to express the upper bound on the growth rate of an algorithm's running time.
Space Complexity Analysis:
Space complexity evaluates the amount of memory an algorithm requires concerning the input
size. It includes the analysis of additional data structures, variables, and storage needed during
the execution of an algorithm.
Worst, Average, and Best Case Analysis:
Algorithms may behave differently under different scenarios. Performance analysis considers the
worst-case, average-case, and best-case scenarios to provide a comprehensive view of the
algorithm’s behavior under various conditions.
Asymptotic Analysis:
Asymptotic analysis assesses the efficiency of algorithms for large input sizes. It focuses on how
the algorithm behaves as the input approaches infinity. Common notations like Big O, Omega,
and Theta express different asymptotic behavior aspects.
Empirical Performance Evaluation:
Beyond theoretical analysis, empirical evaluations involve practical testing and measurement of
algorithm performance. This includes implementing algorithms, running real-world or
simulated data experiments, and measuring execution times.
Benchmarking:
Benchmarking involves comparing the performance of different algorithms under standardized
conditions. It helps identify the most efficient algorithm for a specific problem by running them
on the same input instances and comparing their execution times.
Scalability Assessment:
Scalability analysis explores how well an algorithm adapts to increasing input sizes. It assesses
whether the algorithm’s performance remains acceptable as the size of the input data grows, a
critical consideration in real-world applications.
Trade-off Analysis:
Performance analysis often involves trade-off considerations, such as time and space complexity.
It helps make informed decisions about the optimal balance between performance metrics.
Optimization Strategies:
Performance analysis provides insights into potential optimization strategies. By understanding
the bottlenecks and inefficiencies in an algorithm, designers can implement targeted
optimizations to enhance its overall performance.
Let’s consider a simple example to illustrate the performance analysis of an algorithm in the
context of Design and Analysis of Algorithms (DAA). We’ll use the problem of finding the
maximum element in an array as our example algorithm.
def find_max_element(arr):
    """
    Algorithm to find the maximum element in an array.
    """
    max_element = arr[0]  # Assume the first element is the maximum
    for element in arr[1:]:
        if element > max_element:
            max_element = element  # Update max_element if a larger element is found
    return max_element
 The time complexity of the find_max_element algorithm is O(n), where n is the size of the input
array. This is because, in the worst case, the algorithm needs to iterate through all array
elements once.
 The space complexity is O(1), constant space, as the algorithm uses a fixed amount of additional
space regardless of the input size. The only variable used is max_element.
 The worst case occurs when the maximum element is at the end of the array, requiring a full
traversal. The worst-case time complexity is O(n).
 The average case is also O(n) as we assume no specific distribution of maximum elements.
 The best case would be when the maximum element is at the beginning, but since the loop never stops early, the algorithm still examines every element; the best-case time complexity is therefore also O(n).
 The asymptotic behavior is expressed as O(n) for large input sizes.

Performance analysis of an algorithm depends upon two factors, i.e. the amount of memory used and the amount of compute time consumed on any CPU. Formally, they are expressed as complexities in terms of:
 Space Complexity.
 Time Complexity.
The space complexity of an algorithm is the amount of memory it needs to run to completion, i.e. from the start of execution to its termination. The space needed by any algorithm is the sum of the following components:
1. Fixed Component: This is independent of the characteristics of the inputs and outputs. This part includes: instruction space, space for simple variables, fixed-size component variables, and constants.
2. Variable Component: This consists of the space needed by component variables whose size depends on the particular problem instances (inputs/outputs) being solved, the space needed by referenced variables, and the recursion stack space, which is one of the most prominent components. This also includes data structure components like linked lists, heaps, trees, graphs, etc.
Therefore, the total space requirement of any algorithm 'A' can be given as
Space(A) = Fixed Components(A) + Variable Components(A)
Of the fixed and variable components, the variable part must be determined accurately, so that the actual space requirement of algorithm 'A' can be identified. To identify the space complexity of any algorithm, the following steps can be followed:
1. Determine the variables which are instantiated with some default values.
2. Determine which instance characteristics should be used to measure the space requirement; this will be problem specific.
3. Generally, the choices are limited to quantities related to the number and magnitudes of the inputs to and outputs from the algorithm.
4. Sometimes more complex measures of the interrelationships among the data items can be used.
Example: Space Complexity
Algorithm Sum(number, size)   // procedure produces the sum of all numbers provided in the 'number' list
{
    result = 0.0;
    for count = 1 to size do   // will repeat 1, 2, 3, ..., size times
        result = result + number[count];
    return result;
}
In the above example, when calculating the space complexity we look for both fixed and variable components. Here we have:
Fixed component: the 'result', 'count' and 'size' variables, so the fixed space required is three (3) words.
Variable component: characterized by the value stored in the 'size' variable (suppose the value stored in 'size' is 'n'), because this decides the size of the 'number' list and also drives the for loop. Therefore, if the space used by 'size' is one word, the total space required by the 'number' variable will be 'n' words (the value stored in 'size').
Therefore, the space complexity can be written as Space(Sum) = 3 + n.
The time complexity of an algorithm (basically, when converted to a program) is the amount of computer time it needs to run to completion. The time taken by a program is the sum of the compile time and the run/execution time. The compile time is independent of the instance (problem-specific) characteristics. The following factors affect the time complexity:
1. Characteristics of compiler used to compile the program.
2. Computer Machine on which the program is executed and physically clocked.
3. Multiuser execution system.
4. Number of program steps.
Therefore, time complexity again consists of two components, fixed (factor 1 only) and variable/instance (factors 2, 3 and 4), so for any algorithm 'A' it is given as:
Time(A) = Fixed Time(A) + Instance Time(A)
Here the number of steps is the most prominent instance characteristic, and the number of steps assigned to a program statement depends on the kind of statement, for example:
 comments count as zero steps,
 an assignment statement which does not involve any calls to other algorithms is counted as one step,
 for iterative statements we count steps only for the control part of the statement, and so on.
Therefore, to calculate the total number of program steps we use the following procedure. For this we build a table in which we list the total number of steps contributed by each statement. This is often arrived at by first determining the number of steps per execution of the statement and the frequency with which each statement is executed. This procedure is explained using an example.
Example: Time Complexity
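A step-count table for the Sum algorithm above might look like this (a reconstruction based on the conventions just described):

Statement | Steps per execution | Frequency | Total steps
Algorithm Sum(number, size) | 0 | - | 0
{ | 0 | - | 0
result = 0.0; | 1 | 1 | 1
for count = 1 to size do | 1 | size + 1 | size + 1
result = result + number[count]; | 1 | size | size
return result; | 1 | 1 | 1
} | 0 | - | 0
Total | | | 2*size + 3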
In the above example, if you analyze carefully, the frequency of "for count = 1 to size do" is 'size + 1'. This is because the statement is executed one extra time for the condition check that finally evaluates to false. Once the total steps are calculated, they represent the instance characteristics in the time complexity of the algorithm. Also, the compile time of an algorithm will be constant every time we compile the same set of instructions, so we can consider this time as a constant 'C'. Therefore, the time complexity can be expressed as: Time(Sum) = C + (2*size + 3)
So in this way, both the space complexity and the time complexity can be calculated. Together they make up the performance analysis of an algorithm; neither is sufficient on its own. Both complexities also help define the parameters on the basis of which we optimize algorithms.
Performance analysis :
Analyzing the performance of an algorithm is an important part of its design. One of the ways to
estimate the performance of an algorithm is to analyze its complexity.
Complexity theory is the study of how complicated algorithms are. To be useful, any algorithm
should have three key features:
 It should be correct. An algorithm won't do you much good if it doesn't give you the right
answers.
 A good algorithm should be understandable. The best algorithm in the world won't do you any
good if it's too complicated for you to implement on a computer.
 A good algorithm should be efficient. Even if an algorithm produces a correct result, it won't help
you much if it takes a thousand years or if it requires 1 billion terabytes of memory.
There are two possible types of analysis to quantify the complexity of an algorithm:
 Space complexity analysis: Estimates the runtime memory requirements needed to execute the
algorithm.
 Time complexity analysis: Estimates the time the algorithm will take to run.
Space complexity analysis
Space complexity analysis estimates the amount of memory required by the algorithm to
process input data. While processing the input data, the algorithm needs to store the transient
temporary data structures in memory. The way the algorithm is designed affects the number,
type, and size of these data structures. In an age of distributed computing and with increasingly
large amounts of data that needs to be processed, space complexity analysis is becoming more
and more important. The size, type, and number of these data structures will dictate the
memory requirements for the underlying hardware. Modern in-memory data structures used in
distributed computing—such as Resilient Distributed Datasets (RDDs)—need to have efficient
resource allocation mechanisms that are aware of the memory requirements at different
execution phases of the algorithm.
Space complexity analysis is a must for the efficient design of algorithms. If proper space
complexity analysis is not conducted while designing a particular algorithm, insufficient memory
availability for the transient temporary data structures may trigger unnecessary disk spillovers,
which could potentially considerably affect the performance and efficiency of the algorithm.
In this chapter, we will look deeper into time complexity. Space complexity will be discussed
in Chapter 13, Large-Scale Algorithms, in more detail, where we will deal with large-scale
distributed algorithms with complex runtime memory requirements.
Time complexity analysis
Time complexity analysis estimates how long it will take for an algorithm to complete its
assigned job based on its structure. In contrast to space complexity, time complexity is not
dependent on any hardware that the algorithm will run on. Time complexity analysis solely
depends on the structure of the algorithm itself. The overall goal of time complexity analysis is to
try to answer these important questions—will this algorithm scale? How well will this algorithm
handle larger datasets?
To answer these questions, we need to determine the effect on the performance of an algorithm
as the size of the data is increased and make sure that the algorithm is designed in a way that
not only makes it accurate but also scales well. The performance of an algorithm is becoming
more and more important for larger datasets in today's world of "big data."
In many cases, we may have more than one approach available to design the algorithm. The goal
of conducting time complexity analysis, in this case, will be as follows:
"Given a certain problem and more than one algorithm, which one is the most efficient to use in
terms of time efficiency?"
There can be two basic approaches to calculating the time complexity of an algorithm:
 A post-implementation profiling approach: In this approach, different candidate algorithms are implemented and their performance is compared.
 A pre-implementation theoretical approach: In this approach, the performance of each algorithm is approximated mathematically before running an algorithm.
The advantage of the theoretical approach is that it depends only on the structure of the algorithm itself. It does not depend on the actual hardware that will be used to run the algorithm, the software stack chosen at runtime, or the programming language used to implement the algorithm.
Case study on analysis of algorithms :
Estimating the performance
The performance of a typical algorithm will depend on the type of the data given to it as an
input. For example, if the data is already sorted according to the context of the problem we are
trying to solve, the algorithm may perform blazingly fast. If the sorted input is used to
benchmark this particular algorithm, then it will give an unrealistically good performance
number, which will not be a true reflection of its real performance in most scenarios. To handle
this dependency of algorithms on the input data, we have different types of cases to consider
when conducting a performance analysis.
The best case
In the best case, the data given as input is organized in a way that the algorithm will give its best
performance. Best-case analysis gives the upper bound of the performance.
The worst case
The second way to estimate the performance of an algorithm is to try to find the maximum
possible time it will take to get the job done under a given set of conditions. This worst-case
analysis of an algorithm is quite useful as we are guaranteeing that regardless of the conditions,
the performance of the algorithm will always be better than the numbers that come out of our
analysis. Worst-case analysis is especially useful for estimating the performance when dealing
with complex problems with larger datasets. Worst-case analysis gives the lower bound of the
performance of the algorithm.
The average case
This starts by dividing the various possible inputs into various groups. Then, it conducts the
performance analysis from one of the representative inputs from each group. Finally, it
calculates the average of the performance of each of the groups.
Average-case analysis is not always accurate as it needs to consider all the different
combinations and possibilities of input to the algorithm, which is not always easy to do.
Selecting an algorithm
How do you know which one is a better solution? How do you know which algorithm runs
faster? Time complexity and Big O notation (discussed later in this chapter) are really good tools
for answering these types of questions.
To see where it can be useful, let's take a simple example where the objective is to sort a list of
numbers. There are a couple of algorithms available that can do the job. The issue is how to
choose the right one.
First, an observation that can be made is that if there are not too many numbers in the list, then it does not matter which algorithm we choose to sort them. So, if there are only 10 numbers in the list (n = 10), then it does not matter which algorithm we choose, as it would probably not take more than a few microseconds, even with a very badly designed algorithm.
But as soon as the size of the list becomes 1 million, now the choice of the right algorithm will
make a difference. A very badly written algorithm might even take a couple of hours to run,
while a well-designed algorithm may finish sorting the list in a couple of seconds. So, for larger
input datasets, it makes a lot of sense to invest time and effort, perform a performance analysis,
and choose the correctly designed algorithm that will do the job required in an efficient manner.
Big O notation
Big O notation is used to quantify the performance of various algorithms as the input size grows.
Big O notation is one of the most popular methodologies used to conduct worst-case analysis.
The different kinds of Big O notation types are discussed in this section.
Constant time (O(1)) complexity
If an algorithm takes the same amount of time to run, independent of the size of the input data,
it is said to run in constant time. It is represented by O(1). Let's take the example of accessing
the nth element of an array. Regardless of the size of the array, it will take constant time to get
the results. For example, the following function will return the first element of the array and has
a complexity of O(1):
def getFirst(myList):
    return myList[0]
Other examples of constant-time (O(1)) operations are as follows:
 Addition of a new element to a stack by using push or removing an element from a stack by
using pop. Regardless of the size of the stack, it will take the same time to add or remove an
element.
 Accessing the element of the hashtable (as discussed in Chapter 2, Data Structures Used in
Algorithms).
 Bucket sort (as discussed in Chapter 2, Data Structures Used in Algorithms).
Linear time (O(n)) complexity
An algorithm is said to have a complexity of linear time, represented by O(n), if the execution
time is directly proportional to the size of the input. A simple example is to add the elements in a
single-dimensional data structure:
def getSum(myList):
    sum = 0
    for item in myList:
        sum = sum + item
    return sum
Note the main loop of the algorithm. The number of iterations in the main loop increases linearly with an increasing value of n, producing O(n) complexity.

Some other examples of array operations are as follows:


 Searching an element
 Finding the minimum value among all the elements of an array
Quadratic time (O(n^2)) complexity
An algorithm is said to run in quadratic time if the execution time of an algorithm is proportional
to the square of the input size; for example, a simple function that sums up a two-dimensional
array, as follows:
def getSum(myList):
    sum = 0
    for row in myList:
        for item in row:
            sum += item
    return sum
Note the nested inner loop within the main loop. This nested loop gives the preceding code a complexity of O(n^2).

Another example is the bubble sort algorithm (as discussed in Chapter 2, Data Structures Used
in Algorithms).
Logarithmic time (O(log n)) complexity
An algorithm is said to run in logarithmic time if the execution time of the algorithm is proportional to the logarithm of the input size. With each iteration, the input size decreases by a constant multiplicative factor. An example of logarithmic time complexity is binary search. The binary search algorithm is used to find a particular element in a one-dimensional data structure, such as a Python list. The elements within the data structure need to be sorted in ascending order. The binary search algorithm is implemented in a function named searchBinary, as follows:
def searchBinary(myList, item):
    first = 0
    last = len(myList) - 1
    foundFlag = False
    while first <= last and not foundFlag:
        mid = (first + last) // 2
        if myList[mid] == item:
            foundFlag = True
        else:
            if item < myList[mid]:
                last = mid - 1
            else:
                first = mid + 1
    return foundFlag
The main loop takes advantage of the fact that the list is ordered. It divides the search range in half with each iteration until it gets to the result. After defining the function, it can be tested by searching for a particular element in a sorted list. The binary search algorithm is further discussed in Chapter 3, Sorting and Searching Algorithms.
Note that among the four types of Big O notation presented, O(n^2) has the worst performance and O(log n) has the best performance. In fact, O(log n) performance can be thought of as the gold standard for the performance of any algorithm (which is not always achieved, though). On the other hand, O(n^2) is not as bad as O(n^3), but still, algorithms that fall in this class cannot be used on big data, as the time complexity puts limitations on how much data they can realistically process.
One way to reduce the complexity of an algorithm is to compromise on its accuracy, producing a type of algorithm called an approximate algorithm.
The whole process of the performance evaluation of algorithms is iterative in nature.
Divide and conquer technique of problem solving:

Divide and Conquer is a problem-solving strategy that involves


breaking down a complex problem into smaller, more manageable
parts, solving each part individually, and then combining the
solutions to solve the original problem. It is a widely used
algorithmic technique in computer science and mathematics.
Example: In the Merge Sort algorithm, the "Divide and Conquer" strategy is used to sort a list of elements: the array is repeatedly divided into halves until single elements remain, and the sorted halves are then merged back together, as in the sketch below.
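A minimal merge sort sketch in Python (an illustration; the notes describe the strategy but give no code):

def merge_sort(arr):
    """Divide: split the list in half; Conquer: sort each half; Combine: merge."""
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    # Combine: merge the two sorted halves into one sorted list.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

print(merge_sort([38, 27, 43, 3, 9, 82, 10]))   # [3, 9, 10, 27, 38, 43, 82]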

Or

Divide and conquer -

This technique is the basis of efficient algorithms for many problems, such as sorting (for example, quick sort and merge sort).
A divide and conquer algorithm recursively breaks down a problem into two or more subproblems of the same or related type, until the subproblems become simple enough to be solved directly. The solutions to the subproblems are then combined to give a solution to the original problem. It is used to find an optimal solution to the problem.
1. Divide : This involves dividing the problem into smaller subproblems.
2. Conquer : Solve the subproblems by calling them recursively until they are solved.
3. Combine : Combine the subproblem solutions to get the final solution of the whole problem.
Performance Analysis of quick sort –
Quick sort is a sorting algorithm that works using the divide and conquer approach. It chooses a pivot and places it in its correct position in the sorted array, partitioning the smaller elements to its left and the greater elements to its right. This process is continued for the left and right parts until the array is sorted.
Time complexity analysis of quick sort :
T(n) = time complexity of quick sort on n elements
P(n) = time complexity of finding the position of the pivot among n elements
Best case = O(n log n) (the pivot splits the array into two nearly equal halves, giving about log2(n) levels of partitioning)
Average case = O(n log n)
Worst case = O(n^2) (the pivot is always the smallest or largest element)
Space Complexity = The required space is O(1), as we are not using extra space in the algorithm if we do not consider the recursion stack space. If we consider the recursion stack space, then in the worst case quick sort can take O(n).
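A minimal quick sort sketch in Python, assuming a Lomuto-style partition around the last element (an illustration; the notes do not fix a particular partition scheme):

def quick_sort(arr, low, high):
    """Sort arr[low..high] in place using divide and conquer."""
    if low < high:
        p = partition(arr, low, high)   # place the pivot in its correct position
        quick_sort(arr, low, p - 1)     # sort elements smaller than the pivot
        quick_sort(arr, p + 1, high)    # sort elements greater than the pivot

def partition(arr, low, high):
    """Partition around arr[high] (the pivot); smaller elements go to its left."""
    pivot = arr[high]
    i = low - 1
    for j in range(low, high):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    return i + 1

data = [7, 2, 9, 4, 1]
quick_sort(data, 0, len(data) - 1)
print(data)   # [1, 2, 4, 7, 9]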
Quick sort is an internal algorithm which is based on divide and conquer strategy. In
this:
 The array of elements is divided into parts repeatedly until it is not possible to
divide it further.
 It is also known as “partition exchange sort”.
 It uses a key element (pivot) for partitioning the elements.
 One left partition contains all those elements that are smaller than the pivot
and one right partition contains all those elements which are greater than the
key element.

Merge sort is an external sorting algorithm based on the divide and conquer strategy. In this:
 The elements are split into two sub-arrays (n/2) again and again until only one element is left.
 Merge sort uses additional storage for sorting: an auxiliary array.
 Merge sort uses three arrays: two for storing the two halves, each of which is sorted recursively, and a third external one used to store the final sorted list produced by merging the other two.
 At last, all the sub-arrays are merged to form the sorted array of 'n' elements.
Quick Sort vs Merge Sort

1. Partition of elements in the array : In merge sort, the array is parted into just 2 halves (i.e. n/2), whereas in quick sort the array can be parted in any ratio. There is no compulsion to divide the array of elements into equal parts in quick sort.
2. Worst case complexity : The worst case complexity of quick sort is O(n^2), as a large number of comparisons are needed in the worst case, whereas in merge sort the worst case and average case have the same complexity, O(n log n).
3. Usage with datasets : Merge sort can work well on any type of data set irrespective of its size (either large or small), whereas quick sort does not work well with large datasets.
4. Additional storage space requirement : Merge sort is not in place because it requires additional memory space to store the auxiliary arrays, whereas quick sort is in place as it doesn't require any additional storage.
5. Efficiency : Merge sort is more efficient and works faster than quick sort in the case of larger array sizes or datasets, whereas quick sort is more efficient and works faster than merge sort in the case of smaller array sizes or datasets.
6. Sorting method : Quick sort is an internal sorting method where the data is sorted in main memory, whereas merge sort is an external sorting method in which the data to be sorted cannot be accommodated in memory and auxiliary memory is needed for sorting.
7. Stability : Merge sort is stable, as two elements with equal value appear in the same order in the sorted output as they were in the unsorted input array, whereas quick sort is unstable in this scenario. It can, however, be made stable with some changes to the code.
8. Preferred for : Quick sort is preferred for arrays, whereas merge sort is preferred for linked lists.
9. Locality of reference : Quick sort exhibits good cache locality, and this makes quick sort faster than merge sort (in many cases, such as in a virtual memory environment).
Basis for comparison | Quick Sort | Merge Sort
Partition of elements in the array | The splitting of the array can be in any ratio; it is not necessarily divided into halves. | The array is parted into just 2 halves (i.e. n/2).
Worst case complexity | O(n^2) | O(n log n)
Works well on | It works well on smaller arrays. | It operates fine on any size of array.
Speed of execution | It works faster than other sorting algorithms for small data sets, like selection sort etc. | It has a consistent speed on any size of data.
Additional storage space requirement | Less (in-place) | More (not in-place)
Efficiency | Inefficient for larger arrays | More efficient
Sorting method | Internal | External
Stability | Not stable | Stable
Preferred for | Arrays | Linked lists
Locality of reference | Good | Poor
Major work | The major work is to partition the array into two sub-arrays before sorting them recursively. | The major work is to combine the two sub-arrays after sorting them recursively.
Division of array | Division of the array into sub-arrays may or may not be balanced, as the array is partitioned around the pivot. | Division of the array into sub-arrays is always balanced, as it divides the array exactly at the middle.
Method | Quick sort is an in-place sorting method. | Merge sort is not an in-place sorting method.
Merging | Quick sort does not need explicit merging of the sorted sub-arrays; rather, the sub-arrays are rearranged properly during partitioning. | Merge sort performs explicit merging of sorted sub-arrays.
Space | Quick sort does not require additional array space. | For merging of sorted sub-arrays, merge sort needs a temporary array with size equal to the number of input elements.
Greedy algorithms :
Greedy algorithms are a class of algorithms that make locally optimal choices at each
step with the hope of finding a global optimum solution. In these algorithms, decisions
are made based on the information available at the current moment without considering
the consequences of these decisions in the future. The key idea is to select the best
possible choice at each step, leading to a solution that may not always be the most
optimal but is often good enough for many problems.
For example, consider the Fractional Knapsack Problem. The locally optimal strategy is to choose the item that has the maximum value-to-weight ratio. This strategy also leads to a globally optimal solution because we are allowed to take fractions of an item.

Fractional Knapsack

General Method

The general structure of a greedy algorithm can be summarized in the following steps:
1. Identify the problem as an optimization problem where we need to find the best solution among a set of possible solutions.
2. Determine the set of feasible solutions for the problem.
3. Identify the optimal substructure of the problem, meaning that the optimal
solution to the problem can be constructed from the optimal solutions of its
subproblems.
4. Develop a greedy strategy to construct a feasible solution step by step, making the locally optimal choice at each step.
5. Prove the correctness of the algorithm by showing that the locally optimal choices at each step lead to a globally optimal solution.

Some common applications of greedy algorithms include:

1. Coin change problem: Given a set of coins with different denominations, find the minimum number of coins required to make a given amount of change.
2. Fractional knapsack problem: Given a set of items with weights and values, fill a knapsack with a maximum weight capacity with the most valuable items, allowing fractional amounts of items to be included.
3. Huffman coding: Given a set of characters and their frequencies in a message, construct a binary code with minimum average length for the characters.
4. Shortest path algorithms: Given a weighted graph, find the shortest path between two nodes.
5. Minimum spanning tree: Given a weighted graph, find a tree that spans all nodes with the minimum total weight.
Greedy algorithms can be very efficient and provide fast solutions for many problems. However, it is important to keep in mind that they may not always provide the optimal solution, and to analyze the problem carefully to ensure the correctness of the algorithm.
Greedy algorithms work step by step, and always choose the step that provides immediate profit/benefit. They choose the "locally optimal solution", without thinking about future consequences. Greedy algorithms may not always lead to the globally optimal solution, because they do not consider the entire data. The choice made by the greedy approach does not consider future data and choices. In some cases, making the decision that looks right at that moment gives the best solution (greedy), but in other cases it doesn't. The greedy technique is used for optimization problems (where we have to find the maximum or minimum of something). The greedy technique is best suited for looking at the immediate situation.
All greedy algorithms follow a basic structure:
1. Declare an empty result (e.g. result = 0).
2. Make a greedy choice to select an item; if the choice is feasible, add it to the result.
3. Return the result.
Why choose Greedy Approach:
The greedy approach has a few tradeoffs, which may make it suitable for optimization.
One prominent reason is to achieve the most feasible solution immediately. In the activity selection problem (see the sketch below), if more activities can be done before finishing the current activity, these activities can be performed within the same time. Another reason is to divide a problem recursively based on a condition, with no need to combine all the solutions. In the activity selection problem, the "recursive division" step is achieved by scanning the list of items only once and considering certain activities.
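A minimal sketch of the greedy activity selection strategy in Python (an illustration, not taken from the notes; activities are given as (start, finish) pairs and we greedily pick the activity that finishes earliest among those compatible with the last one selected):

def select_activities(activities):
    """Greedy activity selection: repeatedly pick the earliest-finishing compatible activity."""
    # Sort by finish time so the locally optimal choice is always the next earliest finisher.
    ordered = sorted(activities, key=lambda a: a[1])
    selected = []
    last_finish = float("-inf")
    for start, finish in ordered:
        if start >= last_finish:      # feasible: does not overlap the previous choice
            selected.append((start, finish))
            last_finish = finish
    return selected

print(select_activities([(1, 4), (3, 5), (0, 6), (5, 7), (8, 9), (5, 9)]))
# [(1, 4), (5, 7), (8, 9)]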
Greedy choice property:
This property says that the globally optimal solution can be obtained by making locally optimal (greedy) choices. The choice made by a greedy algorithm may depend on earlier choices but not on the future. It iteratively makes one greedy choice after another and reduces the given problem to a smaller one.
Optimal substructure:
A problem exhibits optimal substructure if an optimal solution to the problem contains
optimal solutions to the subproblems. That means we can solve subproblems and build
up the solutions to solve larger problems.
Note: Making locally optimal choices does not always work. Hence, Greedy algorithms
will not always give the best solutions.
Characteristics of Greedy approach:
 There is an ordered list of resources (profit, cost, value, etc.).
 The maximum of all the resources (max profit, max value, etc.) is taken.
 For example, in the fractional knapsack problem, the item with the maximum value/weight ratio is taken first, according to the available capacity.
Characteristic components of greedy algorithm:
1. The feasible solution: A subset of given inputs that satisfies all specified
constraints of a problem is known as a “feasible solution”.
2. Optimal solution: The feasible solution that achieves the desired extremum
is called an “optimal solution”. In other words, the feasible solution that
either minimizes or maximizes the objective function specified in a problem
is known as an “optimal solution”.
3. Feasibility check: It investigates whether the selected input fulfils all
constraints mentioned in a problem or not. If it fulfils all the constraints then
it is added to a set of feasible solutions; otherwise, it is rejected.
4. Optimality check: It investigates whether a selected input produces either a minimum or maximum value of the objective function by fulfilling all the specified constraints. If an element in a solution set produces the desired extremum, then it is added to a set of optimal solutions.
5. Optimal substructure property: The globally optimal solution to a problem
includes the optimal sub solutions within it.
6. Greedy choice property: The globally optimal solution is assembled by
selecting locally optimal choices. The greedy approach applies some locally
optimal criteria to obtain a partial solution that seems to be the best at that
moment and then find out the solution for the remaining sub-problem.
The local decisions (or choices) must possess three characteristics as mentioned
below:
1. Feasibility: The selected choice must fulfil local constraints.
2. Optimality: The selected choice must be the best at that stage (locally
optimal choice).
3. Irrevocability: The selected choice cannot be changed once it is made.
Applications of Greedy Algorithms:
 Finding an optimal solution (Activity selection, Fractional Knapsack, Job
Sequencing, Huffman Coding).
 Finding close to the optimal solution for NP-Hard problems like TSP.
 Network design: Greedy algorithms can be used to design efficient networks,
such as minimum spanning trees, shortest paths, and maximum flow
networks. These algorithms can be applied to a wide range of network design
problems, such as routing, resource allocation, and capacity planning.
 Machine learning: Greedy algorithms can be used in machine learning
applications, such as feature selection, clustering, and classification. In
feature selection, greedy algorithms are used to select a subset of features that
are most relevant to a given problem. In clustering and classification, greedy
algorithms can be used to optimize the selection of clusters or classes.
 Image processing: Greedy algorithms can be used to solve a wide range of
image processing problems, such as image compression, denoising, and
segmentation. For example, Huffman coding is a greedy algorithm that can be
used to compress digital images by efficiently encoding the most frequent
pixels.
 Combinatorial optimization: Greedy algorithms can be used to solve
combinatorial optimization problems, such as the traveling salesman problem,
graph coloring, and scheduling. Although these problems are typically NP-
hard, greedy algorithms can often provide close-to-optimal solutions that are
practical and efficient.
 Game theory: Greedy algorithms can be used in game theory applications,
such as finding the optimal strategy for games like chess or poker. In these
applications, greedy algorithms can be used to identify the most promising
moves or actions at each turn, based on the current state of the game.
 Financial optimization: Greedy algorithms can be used in financial
applications, such as portfolio optimization and risk management. In portfolio
optimization, greedy algorithms can be used to select a subset of assets that
are most likely to provide the best return on investment, based on historical
data and current market trends.
Advantages of the Greedy Approach:
 The greedy approach is easy to implement.
 They typically have lower time complexity.
 Greedy algorithms can be used for optimization purposes or finding close to
optimization in case of Hard problems.
 Greedy algorithms can produce efficient solutions in many cases, especially
when the problem has a substructure that exhibits the greedy choice property.
 Greedy algorithms are often faster than other optimization algorithms, such as
dynamic programming or branch and bound, because they require less
computation and memory.
 The greedy approach is often used as a heuristic or approximation algorithm
when an exact solution is not feasible or when finding an exact solution
would be too time-consuming.
 The greedy approach can be applied to a wide range of problems, including
problems in computer science, operations research, economics, and other
fields.
 The greedy approach can be used to solve problems in real-time, such as
scheduling problems or resource allocation problems, because it does not
require the solution to be computed in advance.
 Greedy algorithms are often used as a first step in solving optimization
problems, because they provide a good starting point for more complex
optimization algorithms.
 Greedy algorithms can be used in conjunction with other optimization
algorithms, such as local search or simulated annealing, to improve the
quality of the solution.
Disadvantages of the Greedy Approach:
 The local optimal solution may not always be globally optimal.
 Greedy algorithms do not always guarantee to find the optimal solution, and
may produce suboptimal solutions in some cases.
 The greedy approach relies heavily on the problem structure and the choice of
criteria used to make the local optimal choice. If the criteria are not chosen
carefully, the solution produced may be far from optimal.
 Greedy algorithms may require a lot of preprocessing to transform the
problem into a form that can be solved by the greedy approach.
 Greedy algorithms may not be applicable to problems where the optimal
solution depends on the order in which the inputs are processed.
 Greedy algorithms may not be suitable for problems where the optimal
solution depends on the size or composition of the input, such as the bin
packing problem.
 Greedy algorithms may not be able to handle constraints on the solution
space, such as constraints on the total weight or capacity of the solution.
 Greedy algorithms may be sensitive to small changes in the input, which can
result in large changes in the output. This can make the algorithm unstable
and unpredictable in some cases.

Greedy Algorithms : A greedy algorithm is an approach for solving a problem by selecting the best option available at the moment. It does not worry about whether the current best result will bring the overall optimal result. The algorithm never reverses an earlier decision, even if the choice was wrong. It works in a top-down approach. This algorithm may not produce the best result for all problems, because it always goes for the locally best choice in the hope of producing the globally best result.
We can determine whether a greedy algorithm can be used for a problem if the problem has the following properties:
1. Greedy choice property : If an optimal solution to the problem can be found by choosing the best choice at each step, without reconsidering the previous steps once chosen, then the problem can be solved using a greedy approach. This property is called the greedy choice property.
2. Optimal substructure : If the optimal solution to the problem corresponds to the optimal solutions of its subproblems, then the problem can be solved by the greedy approach. This property is called optimal substructure.
Advantages of greedy algorithm
1. This algorithm is easy to describe.
2. This algorithm can perform better than other algorithms (though not always).
Drawbacks
1. The greedy algorithm does not always produce the optimal solution. This is the major disadvantage of the algorithm.

Knapsack :
Knapsack capacity m = 20

Object | Obj1 | Obj2 | Obj3
Profit | 25   | 24   | 15
Weight | 18   | 15   | 10
P/W    | 1.39 | 1.6  | 1.5

1. Greedy about profit (take Obj1 first, then a fraction of Obj2):
25 + (2/15)*24 = 25 + 48/15 = 28.2
2. Greedy about weight (take Obj3 first, then a fraction of Obj2):
15 + (10/15)*24 = 15 + 240/15 = 31
3. Greedy about P/W ratio (take Obj2 first, then a fraction of Obj3):
24 + (5/10)*15 = 24 + 75/10 = 31.5
Algorithm GreedyKnapsack(m, n)
// m = knapsack capacity, n = number of objects
for i = 1 to n do
    calculate p[i]/w[i]
sort objects in decreasing order of the p/w ratio
for i = 1 to n do
    if (m > 0 and w[i] <= m)
        m = m - w[i];
        P = P + p[i];
    else break;
if (m > 0)
    P = P + p[i] * (m / w[i]);
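The same greedy procedure as runnable Python (a sketch; the function name is illustrative, and items are given as (profit, weight) pairs):

def fractional_knapsack(items, capacity):
    """Greedy fractional knapsack: fill by decreasing profit/weight ratio."""
    # Sort items by profit-to-weight ratio, highest first.
    ordered = sorted(items, key=lambda pw: pw[0] / pw[1], reverse=True)
    total_profit = 0.0
    for profit, weight in ordered:
        if capacity <= 0:
            break
        if weight <= capacity:          # take the whole item
            total_profit += profit
            capacity -= weight
        else:                           # take only the fraction that still fits
            total_profit += profit * (capacity / weight)
            capacity = 0
    return total_profit

# Objects from the example above as (profit, weight), with capacity m = 20.
print(fractional_knapsack([(25, 18), (24, 15), (15, 10)], 20))   # 31.5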

1. The knapsack problem is an optimization problem used to illustrate both a problem and its solution. It derives its name from a scenario where one is constrained by the number of items that can be placed inside a fixed-size knapsack: given a set of items with specific weights and values, the aim is to get as much value into the knapsack as possible given its weight constraint.
2. It has two types:
1. The 0/1 knapsack problem, in which items are indivisible (either take an item or not). It can be solved with dynamic programming; we cannot solve it using the greedy technique.
2. The fractional knapsack problem, described below.

Fractional knapsack problem :

1. In this problem the items are divisible: we can take any fraction of an item. It can be solved using the greedy technique.
2. In this method a knapsack (for example, a bag with limited weight capacity) is given, along with a few items, each having some weight and value.
The problem asks which items should be placed into the knapsack such that:
The value or profit obtained by putting the items into the knapsack is maximum.
The weight limit of the knapsack is not exceeded.
The fractional knapsack problem is solved using the greedy method in the following steps:
For each item, compute its value/weight ratio.
Arrange all the items in decreasing order of their value/weight ratio.
Start putting the items into the knapsack, beginning with the item with the highest ratio.
Put as many items as you can into the knapsack.

Time complexity :
O(n log n) total time, including the sort.
1. The main time-consuming step is sorting all items in decreasing order of their value/weight ratio.
2. If the items are already arranged in the required order, then the loop takes O(n) time.
APSP
All pairs shortest path problem: APSP is the problem of finding the shortest path between every pair of nodes. It can be solved using sequential algorithms, but these algorithms take a long run time. The other approach is parallelization, which is a more beneficial technique for solving these problems.
1. Dijkstra's algorithm can be used to solve APSP by executing the single-source variant with each node in the role of the root node.
2. The Floyd-Warshall algorithm solves the all pairs shortest path problem for directed graphs, with the adjacency matrix of a graph as input. It calculates shortest paths iteratively. After the final iteration, the distance matrix contains all the shortest paths.
Example : APSP (all pairs shortest path) (Floyd-Warshall algorithm)
In the Floyd-Warshall algorithm, initialize the solution matrix to be the same as the input graph matrix as a first step.
1. Then update the solution matrix by considering every vertex as an intermediate vertex.
2. The idea is to pick the vertices one by one and update all shortest paths that include the picked vertex as an intermediate vertex. When we pick vertex number k as an intermediate vertex, we have already considered vertices 0, 1, 2, ..., k-1 as intermediate vertices.
3. For every pair (i, j) of source and destination vertices respectively, there are two possible cases:
1. k is not an intermediate vertex in the shortest path from i to j: we keep the value of dist[i][j] as it is.
2. k is an intermediate vertex in the shortest path from i to j: we update the value of dist[i][j] to dist[i][k] + dist[k][j], if dist[i][j] > dist[i][k] + dist[k][j].
Code of the Floyd-Warshall algorithm:
for k = 0 to n-1
    for i = 0 to n-1
        for j = 0 to n-1
            dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j])
where i = source node, j = destination node, k = intermediate node.
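A runnable Python version of the same triple loop (a sketch; the graph is assumed to be given as an adjacency matrix, with float("inf") marking missing edges):

INF = float("inf")

def floyd_warshall(graph):
    """All pairs shortest paths; graph is an n x n adjacency matrix."""
    n = len(graph)
    dist = [row[:] for row in graph]          # start from the input matrix
    for k in range(n):                        # intermediate vertex
        for i in range(n):                    # source vertex
            for j in range(n):                # destination vertex
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

graph = [[0, 5, INF, 10],
         [INF, 0, 3, INF],
         [INF, INF, 0, 1],
         [INF, INF, INF, 0]]
for row in floyd_warshall(graph):
    print(row)   # dist[0][3] becomes 9 via 0 -> 1 -> 2 -> 3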
Transitive closure
The transitive closure of the adjacency relation of a directed acyclic graph (DAG) is the reachability relation of the DAG and a strict partial order. The transitive closure of an undirected graph produces a cluster graph, a disjoint union of cliques. Constructing the transitive closure is an equivalent formulation of the problem of finding the connected components of the graph. It constructs the output graph from the input graph.
[Figure: an input directed graph and, as output, its transitive closure]

The transitive closure is used to construct a data structure that makes it possible to answer reachability questions, for example: can we reach node D from node A in one or more hops?
Transitive closure of a directed graph:
For a directed graph G = (V, E), the transitive closure of G is the graph G* = (V, E*), in which for every vertex pair (v, w) in V there is an edge (v, w) in E* if and only if there is a valid path from v to w in G.

The corresponding reachability (transitive closure) matrix is:
    A B C D E
A | 1 1 1 1 0
B | 0 1 0 0 0
C | 0 1 1 1 0
D | 0 1 0 1 0
E | 0 1 0 1 1
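A Python sketch of computing the transitive closure with Warshall's algorithm. The edge list below is hypothetical (chosen so that the result reproduces the reachability matrix above); the function itself is generic:

def transitive_closure(vertices, edges):
    """Warshall's algorithm: reach[i][j] = 1 iff vertex j is reachable from vertex i."""
    n = len(vertices)
    index = {v: i for i, v in enumerate(vertices)}
    reach = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # every vertex reaches itself
    for u, v in edges:
        reach[index[u]][index[v]] = 1
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if reach[i][k] and reach[k][j]:
                    reach[i][j] = 1
    return reach

vertices = ["A", "B", "C", "D", "E"]
edges = [("A", "C"), ("C", "D"), ("D", "B"), ("E", "D")]   # hypothetical edges, for illustration
for v, row in zip(vertices, transitive_closure(vertices, edges)):
    print(v, row)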

Dynamic Programming
DP is a general algorithm design technique for solving problems defined by, or formulated as, recurrences with overlapping sub-instances. It was invented by the American mathematician Richard Bellman in the 1950s to solve optimization problems. The main idea for solving a problem is:
1. Set up a recurrence relating a solution to a larger instance to solutions of smaller sub-instances.
2. Solve smaller instances once.
3. Record the solutions in a table.
4. Extract the solution to the initial instance from that table.
Dynamic programming reduces the amount of enumeration by eliminating those sequences which cannot be optimal. In DP, optimal sequences of decisions are found by following the principle of optimality.
It is an algorithm design technique for optimization problems, often minimizing or maximizing. Like divide and conquer, DP solves problems by combining the solutions of subproblems; unlike divide and conquer, the subproblems are not independent. Subproblems may share sub-subproblems. The term "dynamic programming" comes from control theory, not computer science; here "programming" refers to the use of tables to construct a solution. In DP we reduce time by increasing the amount of space. We solve the problem by solving subproblems of increasing size and saving each optimal solution in a table. The table is then used for finding the optimal solution to larger problems. Time is saved since each subproblem is solved only once.
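A small illustration of this tabulation idea in Python (not from the notes): computing Fibonacci numbers, where each subproblem is solved once and its result recorded in a table:

def fibonacci(n):
    """Bottom-up DP: each Fibonacci value is computed once and stored in a table."""
    if n < 2:
        return n
    table = [0] * (n + 1)        # table[i] will hold fib(i)
    table[1] = 1
    for i in range(2, n + 1):
        table[i] = table[i - 1] + table[i - 2]   # reuse recorded subproblem solutions
    return table[n]

print(fibonacci(10))   # 55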
Multistage graph : A multistage graph is a directed weighted graph in which the nodes can be divided into a set of stages, such that all edges go from one stage to the next stage only. In other words, there is no edge between vertices of the same stage or from a vertex to a previous stage. The vertices of a multistage graph are divided into n disjoint subsets S = {s1, s2, s3, ..., sn}, where s1 contains the source and sn contains the destination. The cardinality of s1 and sn is equal to 1.

[Figure: a multistage graph with stages {0}, {1, 2, 3}, {4, 5, 6}, {7}]
Source : 0
Destination : 7
Shortest path : 0-3-6-7

In the above diagram a multistage graph is given with a source and a destination; we need to find a shortest path from the source to the destination. We consider the source as stage one and the destination as the last stage. The graph G is usually assumed to be a weighted graph. In this graph, the cost of an edge (i, j) is represented by c(i, j). The cost of a path from source s to destination d is the sum of the costs of each edge on this path. The multistage graph problem is to find the path with minimum cost from source s to destination d.
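A sketch of the backward dynamic programming approach for a multistage graph in Python. The edge weights below are hypothetical (the figure above gives none); with them, the minimum-cost path comes out as 0 -> 3 -> 6 -> 7, matching the path noted above:

import math

def multistage_shortest_path(n, edges, source, destination):
    """DP on a multistage graph: cost[v] = min over edges (v, u) of c(v, u) + cost[u]."""
    cost = [math.inf] * n
    nxt = [None] * n
    cost[destination] = 0
    # Process vertices in reverse numeric order; vertices are assumed to be numbered
    # stage by stage, so all edges go from a lower-numbered to a higher-numbered vertex.
    for v in range(n - 1, -1, -1):
        for (a, b), w in edges.items():
            if a == v and w + cost[b] < cost[v]:
                cost[v] = w + cost[b]
                nxt[v] = b
    # Recover the minimum-cost path from source to destination.
    path, v = [source], source
    while v != destination:
        v = nxt[v]
        path.append(v)
    return cost[source], path

edges = {(0, 1): 4, (0, 2): 3, (0, 3): 2,     # hypothetical weights
         (1, 4): 5, (2, 5): 4, (3, 6): 1,
         (4, 7): 3, (5, 7): 3, (6, 7): 2}
print(multistage_shortest_path(8, edges, 0, 7))   # (5, [0, 3, 6, 7])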
