
Best, Average and Worst-case Analysis of Algorithms

Introduction
We all know that the running time of an algorithm increases (or remains constant, in the case of
constant running time) as the input size n increases. Sometimes, even if the size of the input is
the same, the running time varies among different instances of the input.
In that case, we perform best, average and worst-case analysis. The best case gives the
minimum time, the worst case running time gives the maximum time and average case
running time gives the time required on average to execute the algorithm. I will explain all
these concepts with the help of two examples - (i) Linear Search and (ii) Insertion sort.
Usually, the time required by an algorithm falls under three types

Best Case − Minimum time required for program execution.

Average Case − Average time required for program execution.

Worst Case − Maximum time required for program execution.

Best Case Analysis


Consider the example of Linear Search where we search for an item in an array. If the item is
in the array, we return the corresponding index, otherwise, we return -1. The code for linear
search is given below.
int search(int a[], int n, int item) {
    int i;
    for (i = 0; i < n; i++) {
        if (a[i] == item) {
            return i;    /* return the index where the item was found */
        }
    }
    return -1;           /* the item is not in the array */
}
Variable a is an array, n is the size of the array and item is the item we are looking for in the
array. When the item we are looking for is in the very first position of the array, the function
returns the index immediately. The for loop runs only once. So the complexity, in this case, will be
Θ(1). This is called the best case.
Consider another example of insertion sort. Insertion sort sorts the items in the input array in
an ascending (or descending) order. It maintains the sorted and un-sorted parts in an array. It
takes the items from the un-sorted part and inserts into the sorted part in its appropriate
position. The figure below shows one snapshot of the insertion operation.
In the figure, the items [1, 4, 7, 11, 53] are already sorted and now we want to place 33 in its
appropriate place. The item to be inserted is compared with the items from right to left, one by
one, until we find an item that is smaller than the item we are trying to insert. We
compare 33 with 53; since 53 is bigger, we move one position to the left and compare 33 with
11. Since 11 is smaller than 33, we place 33 just after 11 and move 53 one step to the right.
Here we did 2 comparisons. If the item were 55 instead of 33, we would have performed only
one comparison. That means, if the array is already sorted, then only one comparison is
necessary to place each item in its appropriate place and one scan of the array would sort it.
The code for insertion operation is given below.
void sort(int a[], int n)
{
    int i, j, key;
    for (i = 1; i < n; i++) {
        key = a[i];          /* the item to be inserted into the sorted part */
        j = i - 1;
        /* shift sorted items larger than key one step to the right */
        while (j >= 0 && a[j] > key)
        {
            a[j+1] = a[j];
            j = j - 1;
        }
        a[j+1] = key;        /* insert key into its correct position */
    }
}
When the items are already sorted, the while loop condition fails immediately for each item, so
each item needs only one comparison. There are n items in total, so the running time is Θ(n). The
best case running time of insertion sort is therefore Θ(n).
The best case gives us a lower bound on the running time for any input. If the best case of the
algorithm is, say, Θ(n), then we know that for any input the program needs at
least time proportional to n to run. In reality, we rarely need the best case for our
algorithm. We never design an algorithm based on the best-case scenario.
Worst Case Analysis
In real life, most of the time we do the worst-case analysis of an algorithm. The worst-case
running time is the longest running time for any input of size n.
In linear search, the worst case happens when the item we are searching for is in the last
position of the array or is not in the array at all. In both cases, we need to go through
all n items in the array. The worst-case runtime is, therefore, Θ(n). Worst-case performance is
more important than best-case performance for linear search because of the following reasons.
The item we are searching for is rarely in the first position. If the array has 1000 items,
numbered 1 to 1000, and we search for a randomly chosen item, there is only a 0.1 percent
(1 in 1000) chance that the item will be in the first position.
Most of the time the item is not in the array (or database in general), in which case the search
takes the worst-case running time.
Similarly, in insertion sort, the worst-case scenario occurs when the items are reverse sorted.
The number of comparisons in the worst case will be on the order of n²,
and hence the running time is Θ(n²).



Knowing the worst-case performance of an algorithm provides a guarantee that the algorithm
will never take longer than that.
Average Case Analysis
Sometimes we do the average case analysis of algorithms. Most of the time the average case
is roughly as bad as the worst case. In the case of insertion sort, when we try to insert a new
item into its appropriate position, we compare the new item with half of the sorted items on
average. The complexity is still on the order of n², which is the worst-case running time.

It is usually harder to analyze the average behavior of an algorithm than to analyze its
behavior in the worst case. This is because it may not be apparent what constitutes an
“average” input for a particular problem. A useful analysis of the average behavior of an
algorithm, therefore, requires prior knowledge of the distribution of the input instances,
which is often an unrealistic requirement. Therefore, we frequently assume that all inputs of a
given size are equally likely and do a probabilistic analysis for the average case.

Asymptotic Analysis: Big-O Notation and More


The efficiency of an algorithm depends on the amount of time, storage and other resources
required to execute the algorithm. The efficiency is measured with the help of asymptotic
notations.
An algorithm may not have the same performance for different types of inputs. With the
increase in the input size, the performance will change.
The study of change in performance of the algorithm with the change in the order of the input
size is defined as asymptotic analysis.
Asymptotic notations are used to represent the complexities of algorithms for asymptotic
analysis. These notations are mathematical tools to represent the complexities. There are
three notations that are commonly used.
Asymptotic Notations
Asymptotic notations are the mathematical notations used to describe the running time of an
algorithm when the input tends towards a particular value or a limiting value.
For example: In bubble sort, when the input array is already sorted, the time taken by the
algorithm is linear i.e. the best case.
But, when the input array is in reverse condition, the algorithm takes the maximum time
(quadratic) to sort the elements i.e. the worst case.
When the input array is neither sorted nor in reverse order, then it takes average time. These
durations are denoted using asymptotic notations.
There are mainly three asymptotic notations:



Big-O notation
Omega notation
Theta notation

Big-O Notation (O-notation)


Big-O notation represents the upper bound of the running time of an algorithm. Thus, it gives
the worst-case complexity of an algorithm.
O(g(n)) = { f(n): there exist positive constants c and n0 such that 0 ≤
f(n) ≤ cg(n) for all n ≥ n0 }
The above expression can be described as a function f(n) belongs to the set O(g(n)) if there
exists a positive constant c such that it lies between 0 and cg(n), for sufficiently large n.
For any value of n, the running time of an algorithm does not cross the time provided by
O(g(n)).
Since it gives the worst-case running time of an algorithm, it is widely used to analyze an
algorithm as we are always interested in the worst-case scenario.
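As a quick check of the definition, consider f(n) = 3n² + 2n + 4. Choosing c = 9 and n0 = 1 gives
0 ≤ 3n² + 2n + 4 ≤ 3n² + 2n² + 4n² = 9n² for all n ≥ 1, so f(n) belongs to O(n²).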

Big-O gives the upper bound of a function

Omega Notation (Ω-notation)


Omega notation represents the lower bound of the running time of an algorithm. Thus, it
provides the best case complexity of an algorithm.
Ω(g(n)) = { f(n): there exist positive constants c and n0 such that 0 ≤ cg(n) ≤ f(n) for all n
≥ n0 }



The above expression can be described as a function f(n) belongs to the set Ω(g(n)) if there
exists a positive constant c such that it lies above cg(n), for sufficiently large n. For any value
of n, the minimum time required by the algorithm is given by Omega Ω(g(n)).

Omega gives the lower bound of a function

Theta Notation (Θ-notation)


Theta notation encloses the function from above and below. Since it represents the upper and
the lower bound of the running time of an algorithm, it is used for analyzing the average-case
complexity of an algorithm.

Theta bounds the function within constants factors


For a function g(n), Θ(g(n)) is given by the relation:



Θ(g(n)) = { f(n): there exist positive constants c1, c2 and n0 such that 0 ≤ c1g(n) ≤ f(n) ≤
c2g(n) for all n ≥ n0 }

The above expression can be described as a function f(n) belongs to the set Θ(g(n)) if there
exist positive constants c1 and c2 such that it can be sandwiched between c1g(n) and c2g(n),
for sufficiently large n.
If a function f(n) lies anywhere in between c1g(n) and c2g(n) for all
n ≥ n0, then g(n) is said to be an asymptotically tight bound for f(n).
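For example, f(n) = n²/2 − 3n is Θ(n²): choosing c1 = 1/14, c2 = 1/2 and n0 = 7 gives
c1·n² ≤ n²/2 − 3n ≤ c2·n² for all n ≥ 7, so n² is an asymptotically tight bound for f(n).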
Calculating the running time of Algorithms
Introduction
Here we learn how to estimate the running time of an algorithm looking at the source code
without running the code on the computer. The estimated running time helps us to find the
efficiency of the algorithm. Knowing the efficiency of the algorithm helps in the decision
making process. Even though there is no magic formula for analyzing the efficiency of an
algorithm as it is largely a matter of judgment, intuition, and experience, there are some
techniques that are often useful which we are going to discuss here.
The approach we follow is also called a theoretical approach. In this approach, we calculate
the cost (running time) of each individual programming construct and we combine all the
costs into a bigger cost to get the overall complexity of the algorithm.
Basic operations
Knowing the cost of basic operations helps to calculate the overall running time of an
algorithm. The table below shows the list of basic operations along with their running time.
The list not by any means provides the comprehensive list of all the operations. But I am
trying to include most of the operations that we come across frequently in programming.

Operation Running Time


Integer add/subtract Θ(1)
Integer multiply/divide Θ(1)
Float add/subtract Θ(1)
Float multiply/divide Θ(1)
Trigonometric Functions (sine, cosine, .. ) Θ(1)
Variable Declaration Θ(1)
Assignment Operation Θ(1)
Logical Operations (<,>,≤,≥, etc) Θ(1)
Array Access Θ(1)
Array Length Θ(1)



1D array allocation Θ(n)
2D array allocation Θ(n²)
Substring extraction Θ(1) or Θ(n)
String concatenation Θ(n)

Consecutive statements
Let P1 and P2 be two independent consecutive statements. Let t1 be the cost of running P1
and t2 be the cost of running P2. The total cost of the program is the sum of the
individual costs, i.e. t1 + t2. In asymptotic notation the total time is Θ(max(t1, t2)) (we
ignore the non-significant term).
Example: Consider the following code.
int main()
{
    // 1. some code with running time n
    // 2. some code with running time n^2
    return 0;
}
Assume that statement 2 is independent of statement 1 and statement 1 executes first
followed by statement 2. The total running time is
Θ(max(n, n²)) = Θ(n²)

for loops
It is relatively easier to compute the running time of a for loop than of other kinds of loops. All we
need to compute the running time is the number of times the statement inside the loop body is
executed. Consider a simple for loop in C.
for (i = 0; i < 10; i++)
{
    // body
}
The loop body is executed 10 times. If it takes m operations to run the body, the total number
of operations is 10 × m = 10m. In general, if the loop iterates n times and the running time of
the loop body is m, the total cost of the program is n × m. Please note that we are ignoring the
time taken by the expression i < 10 and the statement i++. If we include these, the total time becomes
1 + 2 × n + m × n = Θ(mn)

In this analysis, we made one important assumption. We assumed that the body of the loop
doesn’t depend on i. Sometimes the runtime of the body does depend on i. In that case, our
calculation becomes a little bit difficult. Consider an example shown below.

for (i = 0; i < n; i++)
{
    if (i % 2 == 0)
    {
        // some operations of runtime n^2
    }
}
In the for loop above, the control goes inside the if condition only when i is an even number.
That means the body of the if condition gets executed n/2 times. The total cost is therefore
(n/2) × n² = n³/2 = Θ(n³).

Nested for loops


Suppose there are p nested for loops. The p for loops execute n1, n2, …, np times respectively.
The total cost of the entire program is
n1 × n2 × … × np × (cost of the body of the innermost loop)
Consider nested for loops as given in the code below



for (i = 0; i < n; i++) {
for (j = 0; j < n; j++) {
// body that runs in linear time n
}
}
There are two for loops, each of which runs n times, and the loop body itself takes linear time n. So the total cost is
n × n × n = n³ = Θ(n³)

while loops
while loops are usually harder to analyze than for loops because there is no obvious a priori
way to know how many times we shall have to go round the loop. One way of analyzing
while loops is to find a variable that goes increasing or decreasing until the terminating
condition is met. Consider an example given below
while (i > 0) {
    // some computation of cost n
    i = i / 2;
}
How many times does the loop repeat? In every iteration, the value of i gets halved. If the initial
value of i is 16, after 4 iterations it becomes 1, and one iteration later the loop terminates. This
implies that the loop repeats about log2 i times.
In each iteration, it does n work. Therefore, the total cost is Θ(n log2 i).

Recursive calls
To calculate the cost of a recursive call, we first transform the recursive function to a
recurrence relation and then solve the recurrence relation to get the complexity. There are
many techniques to solve the recurrence relation. These techniques will be discussed in
detail in the next article.
int fact(int n)
{
    if (n <= 2)
    {
        return n;
    }
    return n * fact(n - 1);
}
We can transform the code into a recurrence relation as follows.
T(n) = a                 if n ≤ 2
T(n) = b + T(n−1)        otherwise

When n is 1 or 2, the factorial of n is n itself. We return the result in constant time a.
Otherwise, we calculate the factorial of n−1 and multiply the result by n. The multiplication
takes a constant time b. We use one of the techniques called back substitution to find the
complexity.

T(n) = b + T(n−1)
     = b + b + T(n−2)
     = b + b + b + T(n−3)
     = 3b + T(n−3)
     …
     = kb + T(n−k)
Stopping at k = n − 2, so that n − k = 2 (the base case):
     = (n−2)b + T(2)
     = (n−2)b + a
     = Θ(n)
Example
Let us put together all the techniques discussed above and compute the running time of some
example programs.
Example 1
int sum(int a, int b) {
    int c = a + b;
    return c;
}
The sum function has two statements. The first statement (line 2) runs in constant time, i.e.
Θ(1), and the second statement (line 3) also runs in constant time Θ(1). These two statements
are consecutive statements, so the total running time is Θ(1) + Θ(1) = Θ(1).
Example 2
int array_sum(int a[], int n)
{
    int i;
    int sum = 0;
    for (i = 0; i < n; i++)
    {
        sum = sum + a[i];
    }
    return sum;
}
Analysis
The declaration int i; costs Θ(1).
The declaration and assignment int sum = 0; also costs Θ(1).
The for loop repeats n times and its body costs Θ(1) per iteration, so the total cost of the loop is Θ(n).
The return statement costs Θ(1).
These are consecutive statements, so the overall cost is Θ(n).
Example 3
int sum = 0;
for (i = 0; i < n; i++) {
    for (j = 0; j < n; j++) {
        for (k = 0; k < n; k++) {
            if (i == j && j == k) {
                for (l = 0; l < n*n*n; l++) {
                    sum = i + j + k + l;
                }
            }
        }
    }
}
Analysis
Line 1 is a variable declaration and initialization. The cost is Θ(1)



Lines 2 - 11 are nested for loops. The three outer for loops each repeat n times, which costs
Θ(n³) by itself. The condition i == j && j == k is true only n times (when i = j = k), and each
time it holds the innermost loop runs n³ times. So the total cost of these loops is
Θ(n³) + Θ(n · n³) = Θ(n⁴).
The overall cost is Θ(n⁴).

Heap Sort Algorithm


Heap Sort is a popular and efficient sorting algorithm in computer programming. Learning
how to write the heap sort algorithm requires knowledge of two types of data structures -
arrays and trees.
The initial set of numbers that we want to sort is stored in an array
e.g. [10, 3, 76, 34, 23, 32] and after sorting, we get a sorted array [3,10,23,32,34,76].
Heap sort works by visualizing the elements of the array as a special kind of complete binary
tree called a heap.
Relationship between Array Indexes and Tree Elements
A complete binary tree has an interesting property that we can use to find the children and
parents of any node.
If the index of any element in the array is i, the element at index 2i+1 will be the left
child and the element at index 2i+2 will be the right child. Also, the parent of any element
at index i is given by the lower bound (floor) of (i-1)/2.

Relationship between array and heap indices


Left child of 1 (index 0)
= element in (2*0+1) index
= element in 1 index
= 12
Right child of 1



= element in (2*0+2) index
= element in 2 index
=9
Similarly,
Left child of 12 (index 1)
= element in (2*1+1) index
= element in 3 index
=5
Right child of 12
= element in (2*1+2) index
= element in 4 index
=6

Let us also confirm that the rule holds for finding the parent of any node:
Parent of 9 (index 2)
= (2-1)/2
= 0.5
~ index 0
= 1
Parent of 12 (index 1)
= (1-1)/2
= index 0
= 1
Understanding this mapping of array indexes to tree positions is critical to understanding how
the Heap Data Structure works and how it is used to implement Heap Sort.
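The index arithmetic above translates directly into code. The small C sketch below (the helper names left, right and parent are ours, added for illustration) reproduces the calculations for the example tree:

#include <stdio.h>

/* Index of the left child of the node at index i */
int left(int i)   { return 2 * i + 1; }

/* Index of the right child of the node at index i */
int right(int i)  { return 2 * i + 2; }

/* Index of the parent of the node at index i (integer division gives the lower bound) */
int parent(int i) { return (i - 1) / 2; }

int main() {
    int heap[] = {1, 12, 9, 5, 6};
    printf("Left child of index 0: %d\n", heap[left(0)]);   /* 12 */
    printf("Right child of index 0: %d\n", heap[right(0)]); /* 9 */
    printf("Parent of index 2: %d\n", heap[parent(2)]);     /* 1 */
    return 0;
}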

What is Heap Data Structure?


Heap is a special tree-based data structure. A binary tree is said to follow the heap data
structure if it is a complete binary tree and
all nodes in the tree follow the property that they are greater than their children, i.e. the
largest element is at the root, both its children are smaller than the root, and so on. Such a
heap is called a max-heap. If instead all nodes are smaller than their children, it is called a
min-heap.
The following example diagram shows Max-Heap and Min-Heap.

Max Heap and Min Heap


How to "heapify" a tree
Starting from a complete binary tree, we can modify it to become a Max-Heap by running a
function called heapify on all the non-leaf elements of the heap.
Since heapify uses recursion, it can be difficult to grasp. So let's first think about how you
would heapify a tree with just three elements.
heapify(array)
  Root = array[0]
  Largest = largest(array[0], array[2*0 + 1], array[2*0 + 2])
  if (Root != Largest)
    Swap(Root, Largest)



Heapify base cases
The example above shows two scenarios - one in which the root is the largest element and we
don't need to do anything. And another in which the root had a larger element as a child and
we needed to swap to maintain max-heap property.
If you've worked with recursive algorithms before, you've probably identified that this must
be the base case. Now let's think of another scenario in which there is more than one level.

How to heapify root element when its subtrees are already max heaps



The top element isn't a max-heap but all the sub-trees are max-heaps.
To maintain the max-heap property for the entire tree, we will have to keep pushing 2
downwards until it reaches its correct position.

How to heapify root element when its subtrees are max-heaps


Thus, to maintain the max-heap property in a tree where both sub-trees are max-heaps, we
need to run heapify on the root element repeatedly until it is larger than its children or it
becomes a leaf node.
We can combine both these conditions in one heapify function as
void heapify(int arr[], int n, int i)
{
    // Find largest among root, left child and right child
    int largest = i;
    int left = 2 * i + 1;
    int right = 2 * i + 2;

    if (left < n && arr[left] > arr[largest])
        largest = left;

    if (right < n && arr[right] > arr[largest])
        largest = right;

    // Swap and continue heapifying if root is not largest
    if (largest != i)
    {
        swap(&arr[i], &arr[largest]);
        heapify(arr, n, largest);
    }
}

This function works for both the base case and for a tree of any size. We can thus move the
root element to the correct position to maintain the max-heap status for any tree size as long
as the sub-trees are max-heaps.
Build max-heap
To build a max-heap from any tree, we can thus start heapifying each sub-tree from the
bottom up and end up with a max-heap after the function is applied to all the elements
including the root element.
In the case of a complete tree, the index of the last non-leaf node is given by n/2 - 1. All nodes
after that index are leaf nodes and thus don't need to be heapified.
So, we can build a maximum heap as
// Build heap (rearrange array)

for (int i = n / 2 - 1; i >= 0; i--)

heapify(arr, n, i);

Create array and calculate i

Steps to build max heap for heap sort
As shown in the above diagram, we start by heapifying the smallest subtrees at the bottom and
gradually move up until we reach the root element.
If you've understood everything till here, congratulations, you are on your way to mastering
the Heap sort.



Working of Heap Sort
Since the tree satisfies the max-heap property, the largest item is stored at the root node.
Swap: Remove the root element and put it at the end of the array (the nth position). Put the last
item of the tree (heap) in the vacant place.
Remove: Reduce the size of the heap by 1.
Heapify: Heapify the root element again so that we have the highest element at root.
The process is repeated until all the items of the list are sorted.

Swap, Remove, and Heapify: the code below shows the operation.


// Heap sort
for (int i = n - 1; i > 0; i--)
{
    // Move the current root (largest element) to the end
    swap(&arr[0], &arr[i]);

    // Heapify the reduced heap to get the highest element at root again
    heapify(arr, i, 0);
}
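Putting the heapify, build-heap, and sort steps together, a minimal complete C sketch might look like the one below. The swap helper and the demo array in main are ours, added so the example is self-contained:

#include <stdio.h>

void swap(int *a, int *b) {
    int temp = *a;
    *a = *b;
    *b = temp;
}

// Heapify the subtree rooted at index i in arr[0..n-1]
void heapify(int arr[], int n, int i) {
    int largest = i;
    int left = 2 * i + 1;
    int right = 2 * i + 2;

    if (left < n && arr[left] > arr[largest])
        largest = left;
    if (right < n && arr[right] > arr[largest])
        largest = right;

    // Swap and continue heapifying if the root is not the largest
    if (largest != i) {
        swap(&arr[i], &arr[largest]);
        heapify(arr, n, largest);
    }
}

void heapSort(int arr[], int n) {
    // Build max heap
    for (int i = n / 2 - 1; i >= 0; i--)
        heapify(arr, n, i);

    // Repeatedly move the current root (maximum) to the end of the array
    for (int i = n - 1; i > 0; i--) {
        swap(&arr[0], &arr[i]);
        heapify(arr, i, 0);   // heapify the reduced heap
    }
}

int main() {
    int arr[] = {10, 3, 76, 34, 23, 32};
    int n = sizeof(arr) / sizeof(arr[0]);
    heapSort(arr, n);
    for (int i = 0; i < n; i++)
        printf("%d ", arr[i]);   // prints 3 10 23 32 34 76
    return 0;
}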

Bubble Sort
Bubble sort is a sorting algorithm that compares two adjacent elements and swaps them if
they are not in the intended order.
Working of Bubble Sort
Suppose we are trying to sort the elements in ascending order.
1. First Iteration (Compare and Swap)

1. Starting from the first index, compare the first and the second elements.
2. If the first element is greater than the second element, they are swapped.
3. Now, compare the second and the third elements. Swap them if they are not in order.
4. The above process goes on until the last element.

Compare the Adjacent Elements

Compare two adjacent elements and swap them if the first element is greater than the next
element
2. Remaining Iteration



The same process goes on for the remaining iterations.
After each iteration, the largest element among the unsorted elements is placed at the end.

Put the largest element at the end

Continue the swapping and put the largest element among the unsorted list at the end
In each iteration, the comparison takes place up to the last unsorted element.

Compare the adjacent elements

Swapping occurs only if the first element is greater than the next element
The array is sorted when all the unsorted elements are placed at their correct positions.



The array is sorted if all the elements are kept in the right order.

Bubble Sort Algorithm


bubbleSort(array)
  for i <- 1 to indexOfLastUnsortedElement-1
    if leftElement > rightElement
      swap leftElement and rightElement
end bubbleSort
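As a concrete illustration of the pseudocode, here is a small C sketch (the function name bubbleSort and the demo values are ours, not from the original text):

#include <stdio.h>

void bubbleSort(int array[], int size) {
    // In each pass, the largest unsorted element bubbles up to the end
    for (int step = 0; step < size - 1; step++) {
        // Compare adjacent elements up to the last unsorted position
        for (int i = 0; i < size - step - 1; i++) {
            if (array[i] > array[i + 1]) {
                int temp = array[i];
                array[i] = array[i + 1];
                array[i + 1] = temp;
            }
        }
    }
}

int main() {
    int data[] = {-2, 45, 0, 11, -9};
    int n = sizeof(data) / sizeof(data[0]);
    bubbleSort(data, n);
    for (int i = 0; i < n; i++)
        printf("%d ", data[i]);   // prints -9 -2 0 11 45
    return 0;
}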

Bubble Sort Complexity


Time Complexity
Best O(n)
Worst O(n²)
Average O(n²)
Space Complexity O(1)
Stability Yes

Complexity in Detail
Bubble Sort compares the adjacent elements.
Cycle Number of Comparisons
1st (n-1)



2nd (n-2)
3rd (n-3)
....... ......
last 1
Hence, the number of comparisons is
(n-1) + (n-2) + (n-3) + … + 1 = n(n-1)/2, which is nearly n²

Hence, Complexity: O(n²)


Also, if we observe the code, bubble sort requires two loops. Hence, the complexity is n × n =
n²
1. Time Complexities
● Worst Case Complexity: O(n²)
If we want to sort in ascending order and the array is in descending order then the worst case
occurs.
● Best Case Complexity: O(n)
If the array is already sorted, then there is no need for sorting.
● Average Case Complexity: O(n²)
It occurs when the elements of the array are in jumbled order (neither ascending nor
descending).
2. Space Complexity
Space complexity is O(1) because only an extra variable is used for swapping.
In the optimized bubble sort algorithm, two extra variables are used, but the space
complexity is still constant, O(1).
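The optimized variant mentioned above is not shown in this text; a plausible sketch is the following, where an extra swapped flag ends the sort early once a pass makes no swaps (this is what gives the O(n) best case):

void optimizedBubbleSort(int array[], int size) {
    for (int step = 0; step < size - 1; step++) {
        int swapped = 0;   // extra variable that records whether this pass swapped anything
        for (int i = 0; i < size - step - 1; i++) {
            if (array[i] > array[i + 1]) {
                int temp = array[i];
                array[i] = array[i + 1];
                array[i + 1] = temp;
                swapped = 1;
            }
        }
        // If no swaps happened, the array is already sorted; stop early
        if (swapped == 0)
            break;
    }
}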
Bubble Sort Applications
Bubble sort is used if
● complexity does not matter
● short and simple code is preferred



Insertion Sort Algorithm
Insertion sort is a sorting algorithm that places an unsorted element at its suitable place in
each iteration.
Insertion sort works similarly as we sort cards in our hand in a card game.
We assume that the first card is already sorted then, we select an unsorted card. If the
unsorted card is greater than the card in hand, it is placed on the right otherwise, to the left. In
the same way, other unsorted cards are taken and put in their right place.
A similar approach is used by insertion sort.
Working of Insertion Sort
Suppose we need to sort the following array.

1. The first element in the array is assumed to be sorted. Take the second element and
store it separately in key.
Compare the key with the first element. If the first element is greater than key, then
key is placed in front of the first element.



2. Now, the first two elements are sorted.
Take the third element and compare it with the elements on the left of it. Place it just
after the element smaller than it. If there is no element smaller than it, then place it
at the beginning of the array.

3. Similarly, place every unsorted element at its correct position.



Insertion Sort Algorithm
insertionSort(array)
  mark first element as sorted
  for each unsorted element X
    'extract' the element X
    for j <- lastSortedIndex down to 0
      if current element j > X
        move sorted element to the right by 1
      break loop and insert X here
end insertionSort
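In C, the pseudocode above corresponds to the same routine shown earlier in the best/worst-case section; a self-contained sketch with a demo array (the array values and function name are ours) is:

#include <stdio.h>

void insertionSort(int array[], int size) {
    for (int step = 1; step < size; step++) {
        int key = array[step];   // the element being inserted into the sorted part
        int j = step - 1;
        // Shift sorted elements greater than key one position to the right
        while (j >= 0 && array[j] > key) {
            array[j + 1] = array[j];
            j--;
        }
        array[j + 1] = key;      // insert key at its correct position
    }
}

int main() {
    int data[] = {9, 5, 1, 4, 3};
    int n = sizeof(data) / sizeof(data[0]);
    insertionSort(data, n);
    for (int i = 0; i < n; i++)
        printf("%d ", data[i]);   // prints 1 3 4 5 9
    return 0;
}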

Insertion Sort Complexity


Time Complexity
Best O(n)
Worst O(n²)
Average O(n²)
Space Complexity O(1)
Stability Yes



Time Complexities
Worst Case Complexity: O(n²)
Suppose an array is in ascending order, and you want to sort it in descending order. In this
case, worst-case complexity occurs.
Each element has to be compared with each of the other elements, so (n-1) comparisons are
made for every element.
Thus, the total number of comparisons = n(n-1) ~ n²
Best Case Complexity: O(n)
When the array is already sorted, the outer loop runs for n number of times whereas the inner
loop does not run at all. So, there are only n number of comparisons. Thus, complexity is
linear.
Average Case Complexity: O(n²)
It occurs when the elements of an array are in jumbled order (neither ascending nor
descending).
Space Complexity
Space complexity is O(1) because an extra variable key is used.

Insertion Sort Applications


The insertion sort is used when:
● the array has a small number of elements
● there are only a few elements left to be sorted

Selection Sort Algorithm


Selection sort is a sorting algorithm that selects the smallest element from an unsorted list in
each iteration and places that element at the beginning of the unsorted list.
Working of Selection Sort
1. Set the first element as minimum.

2. Compare minimum with the second element. If the second element is smaller than
minimum, assign the second element as minimum.



Compare minimum with the third element. Again, if the third element is smaller, then
assign minimum to the third element otherwise do nothing. The process goes on until
the last element.

3. After each iteration, minimum is placed in the front of the unsorted list.

4. For each iteration, indexing starts from the first unsorted element. Step 1 to 3 are
repeated until all the elements are placed at their correct positions.

Selection Sort Algorithm
selectionSort(array, size)
  repeat (size - 1) times
    set the first unsorted element as the minimum
    for each of the unsorted elements
      if element < currentMinimum
        set element as new minimum
    swap minimum with first unsorted position
end selectionSort
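A C sketch corresponding to the pseudocode above (the function and variable names, and the demo array, are ours):

#include <stdio.h>

void selectionSort(int array[], int size) {
    for (int step = 0; step < size - 1; step++) {
        // Assume the first unsorted element is the minimum
        int minIndex = step;
        // Scan the rest of the unsorted part for a smaller element
        for (int i = step + 1; i < size; i++) {
            if (array[i] < array[minIndex])
                minIndex = i;
        }
        // Swap the minimum with the first unsorted position
        int temp = array[step];
        array[step] = array[minIndex];
        array[minIndex] = temp;
    }
}

int main() {
    int data[] = {20, 12, 10, 15, 2};
    int n = sizeof(data) / sizeof(data[0]);
    selectionSort(data, n);
    for (int i = 0; i < n; i++)
        printf("%d ", data[i]);   // prints 2 10 12 15 20
    return 0;
}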

Selection Sort Complexity


Time Complexity
Best O(n²)
Worst O(n²)
Average O(n²)
Space Complexity O(1)
Stability No

Cycle Number of Comparison


1st (n-1)
2nd (n-2)
3rd (n-3)
... ...
last 1
Number of comparisons: (n - 1) + (n - 2) + (n - 3) + … + 1 = n(n - 1)/2, which is nearly n².
Complexity = O(n²)
Also, we can analyze the complexity by simply observing the number of loops. There are 2
loops, so the complexity is n × n = n².
Time Complexities:
Worst Case Complexity: O(n²)
If we want to sort in ascending order and the array is in descending order then, the worst case
occurs.
Best Case Complexity: O(n²)
It occurs when the array is already sorted.
Average Case Complexity: O(n²)
It occurs when the elements of the array are in jumbled order (neither ascending nor
descending).



The time complexity of the selection sort is the same in all cases. At every step, you have to
find the minimum element and put it in the right place. The minimum element is not known
until the end of the array is reached.
Space Complexity:
Space complexity is O(1) because an extra variable temp is used.

Counting sort
Complexity
Worst case time O(n)
Best case time O(n)
Average case time O(n)
Space O(n)

Strengths:
Linear time. Counting sort runs in O(n) time, making it asymptotically faster than
comparison-based sorting algorithms like quicksort or merge sort.
Weaknesses:
Restricted inputs. Counting sort only works when the range of potential items in the input is
known ahead of time.
Space cost. If the range of potential values is big, then counting sort requires a lot of space
(perhaps more than O(n)).

Counting sort works by iterating through the input, counting the number of times each item
occurs, and using those counts to compute an item's index in the final, sorted array.
Counting How Many Times Each Item Occurs
Say we have this array:
Unsorted input: [4, 8, 4, 2, 9, 9, 6, 2, 9].
And say we know all the numbers in our array will be whole numbers between 0 and 10
(inclusive).
The idea is to count how many 0's we see, how many 1's we see, and so on. Since there are 11
possible values, we'll use an array with 11 counters, all initialized to 0.



We'll iterate through the input once. The first item is a 4, so we'll add one to counts[4]. The
next item is an 8, so we'll add one to counts[8].

The first two elements in the input [4, 8, 4, 2, ...] are 4 and 8. To count them, we increment
the value at indices 4 and 8 in our counts list, which becomes [0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0].
And so on. When we reach the end, we'll have the total counts for each number:

Once we count all the values in [4, 8, 4, 2, 9, 9, 6, 2, 9], the counts list is [0, 0, 2, 0, 2, 0, 1, 0,
1, 3, 0].

Building the Sorted Output


Now that we know how many times each item appears, we can fill in our sorted array.
Looking at counts, we don't have any 0's or 1's, but we've got two 2's. So, those go at the start
of our sorted array.



No 3's, but there are two 4's that come next.

After that, we have one 6,

one 8,

and three 9's

And, with that, we're done!



Complexity
Counting sort takes O(n + k) time and O(n + k) space, where n is the number
of items we're sorting and k is the number of possible values.
We iterate through the input items twice: once to populate counts and once to fill in the
output array. Both iterations are O(n) time. Additionally, we iterate through counts once
to fill in nextIndex, which is O(k) time.
The algorithm allocates three additional arrays: one for counts, one for nextIndex, and one for
the output. The first two are O(k) space and the final one is O(n) space.
You can actually combine counts and nextIndex into one array. No asymptotic changes, but it
does save O(k) space.
In many cases, k is O(n) (i.e. the number of items to be sorted is not
asymptotically different from the number of values those items can take on). Because of this,
counting sort is often said to be O(n) time and space.
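As a concrete illustration, here is a simplified C sketch of counting sort for the example input above. It assumes the values are whole numbers between 0 and 10, and it rebuilds the output directly from the counts rather than using the separate nextIndex array described above:

#include <stdio.h>
#include <string.h>

#define MAX_VALUE 10   /* assumed largest possible value in the input */

void countingSort(int input[], int output[], int n) {
    int counts[MAX_VALUE + 1];
    memset(counts, 0, sizeof(counts));

    // Count how many times each value occurs
    for (int i = 0; i < n; i++)
        counts[input[i]]++;

    // Walk through the counts and fill in the sorted output
    int next = 0;
    for (int value = 0; value <= MAX_VALUE; value++) {
        for (int c = 0; c < counts[value]; c++)
            output[next++] = value;
    }
}

int main() {
    int input[] = {4, 8, 4, 2, 9, 9, 6, 2, 9};
    int n = sizeof(input) / sizeof(input[0]);
    int output[9];
    countingSort(input, output, n);
    for (int i = 0; i < n; i++)
        printf("%d ", output[i]);   // prints 2 2 4 4 6 8 9 9 9
    return 0;
}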

Radix Sort Algorithm


Radix sort is a sorting algorithm that sorts the elements by first grouping the individual digits
of the same place value. Then, sort the elements according to their increasing/decreasing
order.
Suppose we have an array of 7 elements. First, we will sort the elements based on the value of
the unit place. Then, we will sort the elements based on the value of the tens place. This process
goes on until the last significant place.
Let the initial array be [121, 432, 564, 23, 1, 45, 788]. It is sorted according to radix sort as
shown in the figure below.



Working of Radix Sort
1. Find the largest element in the array, i.e. max. Let X be the number of digits in max. X is
calculated because we have to go through all the significant places of all elements.
In this array [121, 432, 564, 23, 1, 45, 788], we have the largest number 788. It has 3 digits.
Therefore, the loop should go up to hundreds place (3 times).
2. Now, go through each significant place one by one.
Use any stable sorting technique to sort the digits at each significant place. We have used
counting sort for this.
Sort the elements based on the unit place digits (X=0).

Using counting sort to sort elements based on unit place

3. Next, sort the elements based on the digits at the tens place.

4. Finally, sort the elements based on the digits at the hundreds place.



Radix Sort Algorithm
radixSort(array)
  d <- maximum number of digits in the largest element
  create d buckets of size 0-9
  for i <- 0 to d
    sort the elements according to ith place digits using countingSort

countingSort(array, d)
  max <- find largest element among dth place elements
  initialize count array with all zeros
  for j <- 0 to size
    find the total count of each unique digit in dth place of elements and
    store the count at jth index in count array
  for i <- 1 to max
    find the cumulative sum and store it in count array itself
  for j <- size down to 1
    restore the elements to array
    decrease count of each element restored by 1
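A C sketch of radix sort that uses a counting sort pass per digit (base 10); the helper names and the demo array in main are ours:

#include <stdio.h>

// Stable counting sort of arr[] by the digit at the given place value (1, 10, 100, ...)
void countingSortByDigit(int arr[], int n, int place) {
    int output[n];
    int count[10] = {0};

    // Count occurrences of each digit at this place
    for (int i = 0; i < n; i++)
        count[(arr[i] / place) % 10]++;

    // Cumulative counts give the final position of each digit
    for (int d = 1; d < 10; d++)
        count[d] += count[d - 1];

    // Build the output from right to left so the sort stays stable
    for (int i = n - 1; i >= 0; i--) {
        int digit = (arr[i] / place) % 10;
        output[count[digit] - 1] = arr[i];
        count[digit]--;
    }

    for (int i = 0; i < n; i++)
        arr[i] = output[i];
}

void radixSort(int arr[], int n) {
    // Find the maximum to know how many digit places to process
    int max = arr[0];
    for (int i = 1; i < n; i++)
        if (arr[i] > max)
            max = arr[i];

    // One counting sort pass per significant place
    for (int place = 1; max / place > 0; place *= 10)
        countingSortByDigit(arr, n, place);
}

int main() {
    int arr[] = {121, 432, 564, 23, 1, 45, 788};
    int n = sizeof(arr) / sizeof(arr[0]);
    radixSort(arr, n);
    for (int i = 0; i < n; i++)
        printf("%d ", arr[i]);   // prints 1 23 45 121 432 564 788
    return 0;
}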

Radix Sort Complexity


Time Complexity
Best O(n+k)
Worst O(n+k)
Average O(n+k)
Space Complexity O(max)
Stability Yes



Since radix sort is a non-comparative algorithm, it has advantages over comparative sorting
algorithms.
For the radix sort that uses counting sort as an intermediate stable sort, the time complexity is
O(d(n+k)).
Here, d is the number of cycles (one per digit place) and O(n+k) is the time complexity of counting sort.
Thus, radix sort has linear time complexity which is better than O(nlog n) of comparative
sorting algorithms.
If we take numbers with very many digits, or numbers in other bases such as 32-bit and 64-bit
numbers, radix sort can still run in linear time, but the intermediate sort takes a large amount
of space. This makes radix sort space inefficient. This is the reason why this sort is not used in
software libraries.
Radix Sort Applications
Radix sort is implemented in
• DC3 algorithm (Kärkkäinen-Sanders-Burkhardt) while making a suffix array.
• places where there are numbers in large ranges.

Bucket Sort Algorithm


Bucket Sort is a sorting algorithm that divides the unsorted array elements into several groups
called buckets. Each bucket is then sorted by using any of the suitable sorting algorithms or
recursively applying the same bucket algorithm.
Finally, the sorted buckets are combined to form a final sorted array.
Scatter Gather Approach
The process of bucket sort can be understood as a scatter-gather approach. Here, elements are
first scattered into buckets then the elements in each bucket are sorted. Finally, the elements
are gathered in order.



Working of Bucket Sort
1.Suppose, the input array is:

Create an array of size 10. Each slot of this array is used as a bucket for storing elements.

2. Insert elements into the buckets from the array. The elements are inserted according to the
range of the bucket.
In our example code, each bucket covers a range of scaled values: 0 to 1, 1 to 2, 2 to 3, …,
(n-1) to n. Suppose an input element .23 is taken. It is multiplied by size = 10 (i.e. .23*10 = 2.3).
Then, it is converted into an integer (i.e. 2.3 ≈ 2). Finally, .23 is inserted into bucket 2.



Similarly, .25 is also inserted into the same bucket. Every time, the floor value of the floating
point number is taken.
If we take integer numbers as input, we have to divide them by the interval (10 here) to get the
floor value.
Similarly, other elements are inserted into their respective buckets.

3. The elements of each bucket are sorted using any suitable sorting algorithm. Here, we
have used quicksort (an inbuilt function).

4.The elements from each bucket are gathered.


It is done by iterating through the bucket and inserting an individual element into the original
array in each cycle. The element from the bucket is erased once it is copied into the original
array.



Bucket Sort Algorithm

bucketSort()
  create N buckets each of which can hold a range of values
  for all the buckets
    initialize each bucket with 0 values
  for all the buckets
    put elements into buckets matching the range
  for all the buckets
    sort elements in each bucket
  gather elements from each bucket
end bucketSort
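A C sketch of bucket sort for floating-point values in the range [0, 1). To keep the sketch self-contained it stores each bucket in a small fixed-size array (an assumed capacity of 10 per bucket) and sorts the buckets with insertion sort instead of the inbuilt quicksort mentioned above:

#include <stdio.h>

#define NBUCKETS 10
#define BUCKET_CAP 10   /* assumed capacity per bucket for this small example */

void bucketSort(float arr[], int n) {
    float buckets[NBUCKETS][BUCKET_CAP];
    int sizes[NBUCKETS] = {0};

    // Scatter: put each element into the bucket matching its range
    for (int i = 0; i < n; i++) {
        int b = (int)(arr[i] * NBUCKETS);   // e.g. 0.23 -> bucket 2
        buckets[b][sizes[b]++] = arr[i];
    }

    // Sort each bucket (insertion sort on the small bucket arrays)
    for (int b = 0; b < NBUCKETS; b++) {
        for (int i = 1; i < sizes[b]; i++) {
            float key = buckets[b][i];
            int j = i - 1;
            while (j >= 0 && buckets[b][j] > key) {
                buckets[b][j + 1] = buckets[b][j];
                j--;
            }
            buckets[b][j + 1] = key;
        }
    }

    // Gather: copy the buckets back into the original array in order
    int k = 0;
    for (int b = 0; b < NBUCKETS; b++)
        for (int i = 0; i < sizes[b]; i++)
            arr[k++] = buckets[b][i];
}

int main() {
    float data[] = {0.42f, 0.32f, 0.23f, 0.52f, 0.25f, 0.47f, 0.51f};
    int n = sizeof(data) / sizeof(data[0]);
    bucketSort(data, n);
    for (int i = 0; i < n; i++)
        printf("%.2f ", data[i]);   // prints 0.23 0.25 0.32 0.42 0.47 0.51 0.52
    return 0;
}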

Bucket Sort Complexity


Time Complexity
Best O(n+k)
Worst O(n²)
Average O(n)
Space Complexity O(n+k)
Stability Yes

Worst Case Complexity: O(n²)


When there are elements of close range in the array, they are likely to be placed in the same
bucket. This may result in some buckets having more elements than others.
In that case, the complexity depends on the sorting algorithm used to sort the elements of the
bucket.



The complexity becomes even worse when the elements are in reverse order. If insertion sort
is used to sort elements of the bucket, then the time complexity becomes O(n²).
Best Case Complexity: O(n+k)
It occurs when the elements are uniformly distributed in the buckets with a nearly equal
number of elements in each bucket.
The complexity becomes even better if the elements inside the buckets are already sorted.
If insertion sort is used to sort elements of a bucket then the overall complexity in the best
case will be linear ie. O(n+k). O(n) is the complexity for making the buckets and O(k) is the
complexity for sorting the elements of the bucket using algorithms having linear time
complexity at the best case.
Average Case Complexity: O(n)
It occurs when the elements are distributed randomly in the array. Even if the elements are
not distributed uniformly, bucket sort runs in linear time. It holds true until the sum of the
squares of the bucket sizes is linear in the total number of elements.
Bucket Sort Applications
Bucket sort is used when:
• input is uniformly distributed over a range.
• there are floating point values
