
Searching & Sorting

SLIDESMANIA.COM

Musarrat Ahmed
Linear Search
● When you’re at a restaurant and deciding what to have for lunch, you may be
looking around the menu chaotically until something catches your eye.
● Alternatively, you can take a more systematic approach by scanning the menu from
top to bottom and scrutinizing every item in a sequence.
● That’s linear/sequential search in a nutshell.
● The approach loops over a collection of elements in a predefined and consistent
order.
● It stops when the element is found, or when there are no more elements to check.
● This strategy guarantees that no element is visited more than once because you’re
traversing them in order by index.


Click here for a visualization of linear search -

Binary and Linear Search Visualization


Pseudocode

Here, L = list to be searched


T = target element to be searched
i = index in the list
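The pseudocode figure for this slide did not survive extraction. As a minimal sketch of the idea described above, reusing the slide's names L, T, and i (the function name linear_search and the -1 "not found" convention are assumptions, not taken from the slides):

```python
def linear_search(L, T):
    """Return the index of the first occurrence of T in L, or -1 if T is absent."""
    # Visit elements in index order, stopping at the first match.
    for i, item in enumerate(L):
        if item == T:
            return i
    # Every element was checked without finding T.
    return -1
```

Returning -1 is only one possible convention; raising an exception, as Python's own list.index() does, is another.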
So what’s the problem?
● The lookup time grows with the increasing index of an element in the collection.
● The further the element is from the beginning of the list, the more comparisons
have to run.
● In the worst case, when an element is missing, the whole collection has to be
checked to give a definite answer.
● On plotting experimental data to see the
relationship between element location and
the time it takes to find it, it is evident
that all samples lie on a straight line and
can be described by a linear function.
● On average, the time required to find
any element using linear search is
proportional to the number of all elements
in the collection.
● Linear search can be a good choice for smaller datasets because it doesn’t
require preprocessing the data, but it scales poorly as the dataset grows.
Case Analysis

● Best Case (rarely considered)


➔ When the element to be searched is the first element of the collection.

● Worst Case
➔ When the element to be searched is either the last element of the
collection or it is not present in it.

● If the collection is already sorted, will it affect the time complexity?


● If the item we are looking for is present in the list, the chance of it being
in any one of the n positions is still the same as before.
● We will still have the same number of comparisons to find the item.
● However, if the item is not present there is a slight advantage.
● Suppose you’re looking for 50 in the given ordered list of integers.
● Items are still compared in sequence until 54 is reached. At that point we
know that the item we’re looking for, i.e., 50, cannot lie beyond 54, so we
can end the search.
● In this case, the algorithm does not have to continue looking through all of
the items to report that the item was not found. It can stop immediately.
● However, the technique is still O(n).
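The early exit on an ordered list can be sketched as follows. The slide's figure is missing, so the sample list below is a made-up ordered list that happens to contain 54, and the function name ordered_linear_search is hypothetical:

```python
def ordered_linear_search(sorted_items, target):
    """Linear search on an ascending list; stops once items exceed the target."""
    for i, item in enumerate(sorted_items):
        if item == target:
            return i
        if item > target:
            # Everything after this point is even larger, so target is absent.
            break
    return -1


# Hypothetical ordered list (the original figure is not available).
numbers = [17, 20, 26, 31, 44, 54, 55, 65, 77, 93]
result = ordered_linear_search(numbers, 50)  # gives up on reaching 54
```

The search for 50 stops after six comparisons instead of ten, but in the worst case (a target larger than every item) all n items are still examined, which is why the bound stays O(n).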


What about Python’s built-in mechanisms?
● The list data structure, for example, exposes an index() method that returns the
index of an element or raises a ValueError if the element is missing.

● The in operator also comes in handy.


● Despite using linear search under the hood, these built-in functions will be
faster than your own implementation of linear search.
● That’s because they were written in pure C, which compiles to native machine
code. The standard Python interpreter is no match for it, no matter how hard you
try.
● However, for sufficiently large datasets, even the native code will hit its
limits.
● Thus, in real-life scenarios, the linear search algorithm should usually be
avoided.
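A short illustration of the two built-in mechanisms mentioned above, list.index() and the in operator. The sample list and variable names are made up for the example:

```python
fruits = ["apple", "banana", "cherry"]

# list.index() returns the position of the first matching element.
position = fruits.index("banana")

# It raises ValueError when the element is not in the list.
try:
    fruits.index("durian")
    missing = False
except ValueError:
    missing = True

# The in operator only answers whether the element is present.
has_cherry = "cherry" in fruits
```

Both still scan the list element by element, so their running time grows linearly with the length of the list.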
Binary Search
● The idea behind binary search resembles the steps for finding a page in a book.
● At first, you typically open the book to a completely random page or at least
one that’s close to where you think your desired page might be.
● Occasionally, you’ll be fortunate enough to find that page on the first try.
● However, if the page number is too low, then you know the page must be to the
right.
● If you overshoot on the next try, and the current page number is higher than the
page you’re looking for, then you know for sure that it must be somewhere in
between.
● You repeat the process, but rather than choosing a page at random, you check the
page located right in the middle of that new range. This minimizes the number of
tries.
● Binary search is based on dividing a collection of elements into two halves and
throwing away one of them at each step of the algorithm.
● This can dramatically reduce the number of comparisons required to find an
element.
● But there’s a catch: elements in the collection must be sorted first.
● In the page finding example, the page numbers that restrict the range of pages to
search through are known as the lower bound and the upper bound.
● You commonly start with the first page as the lower bound and the last page as
the upper bound. You must update both bounds as you go.
● Binary search is a great example of a divide-and-conquer technique, which
partitions one problem into a bunch of smaller problems of the same kind.
● Click here for a visualization of binary search -
Binary and Linear Search Visualization

● If there are duplicates, which element would binary search return?


Pseudocode

Here, A = list to be searched


T = target element to be searched
L = lower bound in the list
R = upper bound in the list
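The pseudocode figure is missing here as well. A minimal iterative sketch using the slide's names A, T, L, and R, with L as the lower bound and R as the upper bound (the function name, the mid variable, and the -1 return value are assumptions):

```python
from bisect import bisect_left


def binary_search(A, T):
    """Return an index of T in the ascending list A, or -1 if T is absent."""
    L, R = 0, len(A) - 1
    while L <= R:
        mid = (L + R) // 2
        if A[mid] == T:
            return mid
        if A[mid] < T:
            L = mid + 1  # T can only be in the right half
        else:
            R = mid - 1  # T can only be in the left half
    return -1


# With duplicates, this version returns whichever match it lands on first.
values = [1, 2, 2, 2, 3]
found_at = binary_search(values, 2)   # 2, not the leftmost match
leftmost = bisect_left(values, 2)     # 1, the first position holding 2
```

This also hints at an answer to the question about duplicates: a plain binary search makes no promise about which of the equal elements it returns, whereas the standard library's bisect_left always points at the leftmost one.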
Case Analysis

● Best Case (rarely considered)


➔ When the element to be searched is the middle element of the collection.

● Worst Case
➔ When the element to be searched is found in the end when the list is
narrowed down to a single item.
● When we split the list enough times, we end up with a list that has just one
item. Either that is the item we are looking for or it is not. Either way, we
are done.
● The number of comparisons necessary to get to this point is i, where n / 2^i = 1.
● Solving for i gives us i = log₂ n.


● The maximum number of comparisons is logarithmic with respect to the number of
items in the list.
● Therefore, the binary search is O(log n).

● Unlike other search algorithms, binary search can be used beyond just
searching.
● For example, it allows for set membership testing, finding the largest or
smallest value, finding the nearest neighbor of the target value, performing
range queries, and more.
● Even though a binary search is generally better than a sequential search, it is
important to note that for small values of n, the additional cost of sorting is
probably not worth it.
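Some of the uses listed above (membership testing, nearest neighbor, range queries) can be sketched with Python's standard bisect module. The helper names contains, count_in_range, and nearest are hypothetical, and each assumes the input list is already sorted in ascending order:

```python
from bisect import bisect_left, bisect_right


def contains(sorted_items, target):
    """Membership test in O(log n) comparisons."""
    i = bisect_left(sorted_items, target)
    return i < len(sorted_items) and sorted_items[i] == target


def count_in_range(sorted_items, low, high):
    """Number of items x with low <= x <= high."""
    return bisect_right(sorted_items, high) - bisect_left(sorted_items, low)


def nearest(sorted_items, target):
    """Closest value to target in a non-empty sorted list of numbers."""
    i = bisect_left(sorted_items, target)
    # Only the neighbors on either side of the insertion point can be closest.
    candidates = sorted_items[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda x: abs(x - target))


squares = [1, 4, 9, 16, 25]
```

For this list, contains(squares, 9) is True, count_in_range(squares, 4, 16) is 3, and nearest(squares, 11) is 9.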
Bubble Sort
● One of the most straightforward sorting algorithms.
● Its name comes from the way the algorithm works: With every new pass, the
largest element in the list “bubbles up” toward its correct position.
● Each iteration takes fewer steps than the previous iteration because a
continuously larger portion of the array is sorted.
● Bubble sort consists of making multiple passes through a list, comparing
elements one by one, and swapping adjacent items that are out of order.
Click here for a visualization of bubble sort -
Bubble Sort visualize | Algorithms

● At the start of the second pass, the largest value is now in place.
● There are n−1 items left to sort, meaning that there will be n−2 pairs.
● Since each pass places the next largest value in place, the total number of
passes necessary will be n−1.
● After completing the n−1 passes, the smallest item must be in the correct
position with no further processing required.
Link - https://www.youtube.com/watch?v=nmhjrI-aW5o
Pseudocode
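The pseudocode image for this slide was lost in extraction. A minimal sketch of the passes described earlier, sorting in place and in ascending order (the names bubble_sort and pass_len are assumptions):

```python
def bubble_sort(items):
    """Sort a list in place in ascending order using bubble sort."""
    n = len(items)
    # After each pass the largest remaining value is in place,
    # so the next pass compares one pair fewer.
    for pass_len in range(n - 1, 0, -1):
        for i in range(pass_len):
            if items[i] > items[i + 1]:
                # Swap adjacent items that are out of order.
                items[i], items[i + 1] = items[i + 1], items[i]
    return items
```

The outer loop runs n−1 times regardless of the input, which matches the pass count given in the slides.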
Analysis
● Regardless of how the items are arranged in the initial list, n−1 passes will be
made to sort a list of size n.

● Total number of comparisons = 1 + 2 + 3 + … + (n-2) + (n-1) = n(n-1)/2 = O(n²)


● In the best case, if the list is already ordered, no exchanges will be made.
However, in the worst case, every comparison will cause an exchange. On average,
we exchange half of the time.


Optimized Bubble Sort
● Because the bubble sort makes passes through the entire unsorted portion of the
list, it has the capability to do something most sorting algorithms cannot.
● If during a pass there are no exchanges, then we know that the list must be
sorted.
● The sort can be modified to stop early if it finds that the list has become
sorted.
● Therefore, in the best case if the list is already sorted, after the first
iteration the algorithm will stop since no swaps would have been made and the
number of comparisons in this case would reduce to O(n).
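The early-stopping variant can be sketched like this. The function name short_bubble_sort and the choice to return the comparison count (purely to make the best case visible) are assumptions made for the example:

```python
def short_bubble_sort(items):
    """Bubble sort in place that stops after a pass with no exchanges.

    Returns the number of comparisons performed.
    """
    comparisons = 0
    pass_len = len(items) - 1
    swapped = True
    while pass_len > 0 and swapped:
        swapped = False
        for i in range(pass_len):
            comparisons += 1
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
        pass_len -= 1
    return comparisons


already_sorted = [1, 2, 3, 4, 5]
best_case = short_bubble_sort(already_sorted)    # 4 comparisons: one pass only

reverse_sorted = [5, 4, 3, 2, 1]
worst_case = short_bubble_sort(reverse_sorted)   # 10 comparisons: 4 + 3 + 2 + 1
```

On the sorted input the first pass makes n−1 comparisons and no swaps, so the loop ends there; on the reversed input every pass swaps and the count is the full n(n-1)/2.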
Selection Sort
● The selection sort improves on the bubble sort by making only one exchange for
every pass through the list.
● In order to do this, a selection sort looks for the smallest value as it makes a
pass and, after completing the pass, places it in the proper location.
● After the first pass, the smallest item is in the correct place.
● After the second pass, the next smallest is in place.
● This process continues and requires n−1 passes to sort n items, since the final
item must be in place after the (n−1)th pass.
Pseudocode
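The pseudocode image is missing for this slide. A minimal sketch of the variant described above, which selects the smallest remaining value on each pass and performs at most one exchange per pass (the names selection_sort, start, and smallest are assumptions):

```python
def selection_sort(items):
    """Sort a list in place in ascending order using selection sort."""
    n = len(items)
    for start in range(n - 1):
        # Find the index of the smallest value in the unsorted portion.
        smallest = start
        for i in range(start + 1, n):
            if items[i] < items[smallest]:
                smallest = i
        # At most one exchange per pass.
        if smallest != start:
            items[start], items[smallest] = items[smallest], items[start]
    return items
```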
Analysis
● None of the loops depends on the data in the list.
● Selecting the minimum requires scanning n elements taking (n-1) comparisons and
then swapping it into the first position.
● Finding the next lowest element requires scanning the remaining (n-1) elements
taking (n-2) comparisons and so on.
● Therefore, total number of comparisons = (n-1) + (n-2) + … + 2 + 1 = n(n-1)/2
= O(n²)
● However, due to the reduction in the number of exchanges, the selection sort
typically executes faster than bubble sort.
Insertion Sort

● It builds the sorted list one element at a time by comparing each item with
the rest of the list and inserting it into its correct position.
● This “insertion” procedure gives the algorithm its name.
● An excellent analogy to explain insertion sort is the way you would sort a
deck of cards.
● Imagine that you’re holding a group of cards in your hands, and you want to
arrange them in order.
● You’d start by comparing a single card step by step with the rest of the cards
until you find its correct position.
● At that point, you’d insert the card in the correct location and start over
with a new card, repeating until all the cards in your hand were sorted.
● Insertion sort always maintains a sorted sublist in the lower positions of the
list.
● Each new item is then “inserted” back into the previous sublist such that the
sorted sublist is one item larger.
Pseudocode
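The pseudocode image is missing for this slide. A minimal sketch of the procedure described above, keeping a sorted sublist in the lower positions and shifting larger items right to make room (the names insertion_sort, key, and pos are assumptions):

```python
def insertion_sort(items):
    """Sort a list in place in ascending order using insertion sort."""
    for k in range(1, len(items)):
        key = items[k]  # next item to insert into the sorted sublist
        pos = k
        # Shift larger items one slot to the right, scanning backwards.
        while pos > 0 and items[pos - 1] > key:
            items[pos] = items[pos - 1]
            pos -= 1
        items[pos] = key
    return items
```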
Analysis

● To sort n items, (n-1) passes are required.


● The best case input is a list that is already sorted.
● In this case insertion sort has a linear running time (i.e., O(n)).
● During each iteration, the first remaining element of the input is only
compared with the right-most element of the sorted subsection of the list.
● The simplest worst case input is a list sorted in reverse order.
● The total comparisons will again be the sum of first (n-1) integers = n(n-1)/2 = O(n²)
Comparison with Selection Sort

● Insertion sort scans backwards from the current key, while selection sort scans forwards.
● An advantage of insertion sort over selection sort is that selection sort must always scan all remaining
elements to find the smallest element in the unsorted portion of the list, while insertion sort
requires only a single comparison when the (k + 1)-st element is greater than the k-th element.
● Thus, if the input array is already sorted or partially sorted, insertion sort is distinctly more
efficient compared to selection sort.
● In the worst case for insertion sort (when the input array is reverse-sorted), it performs just as
many comparisons as selection sort.
● However, a disadvantage of insertion sort compared with selection sort is that it requires more writes.
● On each iteration, inserting the (k + 1)-st element into the sorted portion of the array requires
many element swaps to shift all of the following elements, while only a single swap is required for
each iteration of selection sort.
● Selection sort may be preferable in cases where writing to memory is significantly more expensive
than reading, such as with EEPROM or flash memory.
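To make the comparison-count claims concrete, here is a small counting sketch. The helpers insertion_comparisons and selection_comparisons are hypothetical instrumented versions written for this illustration; each works on a copy so the input is left untouched:

```python
def insertion_comparisons(items):
    """Count key comparisons made by insertion sort on a copy of items."""
    items = list(items)
    count = 0
    for k in range(1, len(items)):
        key = items[k]
        pos = k
        while pos > 0:
            count += 1
            if items[pos - 1] <= key:
                break  # key is already in position
            items[pos] = items[pos - 1]
            pos -= 1
        items[pos] = key
    return count


def selection_comparisons(items):
    """Count comparisons made by selection sort on a copy of items."""
    items = list(items)
    count = 0
    n = len(items)
    for start in range(n - 1):
        smallest = start
        for i in range(start + 1, n):
            count += 1
            if items[i] < items[smallest]:
                smallest = i
        items[start], items[smallest] = items[smallest], items[start]
    return count


ascending = [1, 2, 3, 4, 5, 6]
descending = [6, 5, 4, 3, 2, 1]
```

For the six-item lists above, insertion sort makes 5 comparisons on the ascending input and 15 on the descending one, while selection sort makes 15 in both cases.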
Fin.