
Chapter 12

SORTING AND SEARCHING
Chapter Goals
• To study several sorting and searching algorithms
• To appreciate that algorithms for the same task can differ widely in
performance
• To understand the big-Oh notation
• To estimate and compare the performance of algorithms
• To write code to measure the running time of a program

Contents
• Selection Sort
• Profiling the Selection Sort Algorithm
• Analyzing the Performance of the Selection Sort Algorithm
• Merge Sort
• Analyzing the Merge Sort Algorithm
• Searching
• Problem Solving: Estimating the Running Time of an Algorithm

12.1 Selection Sort
• Sorts a list by repeatedly finding the smallest element of the unsorted
tail region and moving it to the front
• Example: sorting a list of integers

11 9 17 5 12

Sorting a List of Integers
• Find the smallest and swap it with the first element
5 9 17 11 12

• Find the next smallest. It is already in the correct place


5 9 17 11 12

• Find the next smallest and swap it with the first element of the unsorted portion
5 9 11 17 12
• Repeat
5 9 11 12 17

• When the unsorted portion is of length 1, we are done


5 9 11 12 17

Selectionsort.py
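The slide shows this listing as an image, so it is not reproduced here. A minimal sketch of the function described above (the function name and structure are assumptions, not the book's exact listing):

def selectionSort(values) :
    """Sorts a list in place using selection sort."""
    for i in range(len(values) - 1) :
        # Find the position of the smallest element in the unsorted tail.
        minPos = i
        for j in range(i + 1, len(values)) :
            if values[j] < values[minPos] :
                minPos = j
        # Swap it with the first element of the unsorted region.
        values[i], values[minPos] = values[minPos], values[i]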


Selectiondemo.py
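The demo listing is likewise an image; a plausible sketch of a demo program, assuming the selectionSort function sketched above:

from random import randint

# Sort a small list of random integers and show the before/after state.
values = [randint(1, 100) for i in range(20)]
print(values)
selectionSort(values)   # selectionSort as sketched above
print(values)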

Profiling the Selection Sort Algorithm
• Want to measure the time the algorithm takes to execute
• Exclude the time the program takes to load
• Exclude output time
• To measure time elapsed:
• Use the time() library function from the time module.
• It returns the seconds that have elapsed since midnight at the start
of January 1, 1970.
• The difference of two such counts gives us the number of seconds in
a given time interval.

Selectiontimer.py
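The timer listing is not reproduced; a minimal sketch of the measurement technique just described (the list size and names are assumptions):

from random import randint
from time import time

n = 10000   # assumed list size
values = [randint(1, n) for i in range(n)]

startTime = time()      # seconds since January 1, 1970
selectionSort(values)   # selectionSort as sketched above
elapsed = time() - startTime

print("Elapsed time:", elapsed, "seconds")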

Time Taken by Selection Sort
List size (n) Seconds
10,000 9
20,000 38
30,000 85
40,000 147
50,000 228
60,000 332

• These measurements were obtained with an Intel dual-core processor with a clock speed of 3.2 GHz, running Python 3.2 on the Linux operating system.

Analyzing the Performance
• In a list of size n, count how many times a list element is visited
• To find the smallest, visit n elements, plus 2 visits for the swap
• To find the next smallest, visit (n - 1) elements, plus 2 visits for the swap
• In the last round, 2 elements are visited to find the smallest, plus 2 visits for the swap

Analyzing the Performance
• The number of visits:
• n + 2 + (n - 1) + 2 + (n - 2) + 2 + ... + 2 + 2
• This can be simplified to n²/2 + 5n/2 - 3 (see the check below)
• 5n/2 - 3 is small compared to n²/2, so we ignore it
• Also ignore the factor 1/2: it cancels out when comparing ratios
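As a quick check of that simplification (a step not spelled out on the slide): the scans visit n, n - 1, ..., 2 elements over the n - 1 rounds, and each round adds 2 swap visits:

(n + 2) + ((n - 1) + 2) + ... + (2 + 2)
    = (2 + 3 + ... + n) + 2(n - 1)
    = (n(n + 1)/2 - 1) + 2n - 2
    = n²/2 + 5n/2 - 3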

Analyzing the Performance
• The number of visits is of the order n²
• Using big-Oh notation: the number of visits is O(n²)
• Multiplying the number of elements in a list by 2
multiplies the processing time by 4
• Big-Oh notation “f(n) = O(g(n))”
expresses that f grows no faster than g
• To convert to big-Oh notation: Locate fastest-growing
term, and ignore constant coefficient

Merge Sort
• Sorts a list by
• Cutting the list in half
• Recursively sorting each half
• Merging the sorted halves

• Much more efficient than selection sort

Merge Sort Example
• Divide a list in half and sort each half

• Merge the two sorted lists into a single sorted list

The mergeSort() Function
• When the mergeSort() function sorts a list, it makes two lists, each
half the size of the original, and sorts them recursively
• Then it merges the two sorted lists together:

def mergeSort(values) :
    # A list of length 0 or 1 is already sorted.
    if len(values) <= 1 : return
    mid = len(values) // 2
    first = values[ : mid]    # copy of the first half
    second = values[mid : ]   # copy of the second half
    mergeSort(first)
    mergeSort(second)
    mergeLists(first, second, values)

Mergesort.py
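The file's listing is shown as an image in the slides. The mergeLists() helper called by mergeSort() might look like this minimal sketch (the name comes from the slide; the body is an assumption):

def mergeLists(first, second, values) :
    """Merges the sorted lists first and second into values."""
    iFirst = 0    # next element to consider in first
    iSecond = 0   # next element to consider in second
    j = 0         # next open position in values

    # While neither list is exhausted, copy over the smaller element.
    while iFirst < len(first) and iSecond < len(second) :
        if first[iFirst] < second[iSecond] :
            values[j] = first[iFirst]
            iFirst = iFirst + 1
        else :
            values[j] = second[iSecond]
            iSecond = iSecond + 1
        j = j + 1

    # Copy any remaining elements; only one of these loops can run.
    while iFirst < len(first) :
        values[j] = first[iFirst]
        iFirst = iFirst + 1
        j = j + 1
    while iSecond < len(second) :
        values[j] = second[iSecond]
        iSecond = iSecond + 1
        j = j + 1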

Mergedemo.py

Analyzing the Merge Sort Algorithm
List size (n)   Merge Sort (seconds)   Selection Sort (seconds)
10,000          0.105                  9
20,000          0.223                  38
30,000          0.344                  85
40,000          0.470                  147
50,000          0.599                  228
60,000          0.729                  332

Analyzing the Merge Sort Algorithm
• In a list of size n, count how many times a list
element is visited
• Assume n is a power of 2: n = 2ᵐ
• Calculate the number of visits to create the two sub-lists
and then merge the two sorted lists
• 3 visits to merge each element or 3n visits
• 2n visits to create the two sub-lists
• total of 5n visits

Analyzing the Merge Sort Algorithm
• Let T(n) denote the number of visits needed to sort a list of n elements. Then
• T(n) = T(n/2) + T(n/2) + 5n, or
• T(n) = 2T(n/2) + 5n
• The number of visits for a list of size n/2 is:
• T(n/2) = 2T(n/4) + 5n/2
• So T(n) = 2 × 2T(n/4) + 5n + 5n
• The number of visits for a list of size n/4 is:
• T(n/4) = 2T(n/8) + 5n/4
• So T(n) = 2 × 2 × 2T(n/8) + 5n + 5n + 5n

Analyzing the Merge Sort Algorithm
• Repeating the process k times:
• T(n) = 2ᵏT(n/2ᵏ) + 5nk
• Since n = 2ᵐ, setting k = m gives: T(n) = 2ᵐT(n/2ᵐ) + 5nm
• T(n) = nT(1) + 5nm
• T(n) = n + 5n log₂(n), since T(1) = 1 and m = log₂(n)

Analyzing the Merge Sort Algorithm
• To establish the growth order of n + 5n log₂(n):
• Drop the lower-order term n
• Drop the constant factor 5
• Drop the base of the logarithm since all logarithms are related by a constant
factor
• We are left with n log(n)

• Using big-Oh notation: the number of visits is O(n log(n))

Merge Sort vs. Selection Sort
• Selection sort is an O(n²) algorithm
• Merge sort is an O(n log(n)) algorithm
• The n log(n) function grows much more slowly than n²
• (Recall the running times in the table above)

Searching
• Linear search: Examines all values in a list until it finds a match or
reaches the end
• Also called sequential search
• Number of visits for a linear search of a list of n elements:
• The average search visits n/2 elements
• The maximum number of visits is n

• A linear search locates a value in a list in O(n) steps

Linearsearch.py
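The listing is an image in the slides; a minimal sketch of the linear search just described (names are assumptions):

def linearSearch(values, target) :
    """Returns the index of the first match, or -1 if target is not found."""
    for i in range(len(values)) :
        if values[i] == target :
            return i
    return -1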

Lineardemo.py

Linear Search: Results

Binary Search
• Locates a value in a sorted list by
• Determining whether the value occurs in the first or second half
• Then repeating the search in one of the halves

Binary Search
• To search for 15:
• The search can be narrowed by determining whether the target is in the first or second half of the list
• The last value in the first half of the data set, values[4], is 9, which is smaller than the target 15
• Discard the lower half of the list

Binary Search

• The middle element of this sequence is 20; hence, the target must be
located in the sequence:

Binary Search
• The last value of the first half of this very short sequence is 12, which
is smaller than the target, so we must look in the second half:

• We don’t have a match, because 15 ≠ 17

Binarysearch.py
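The listing is an image in the slides; a minimal recursive sketch of binary search on a sorted list (names are assumptions). A caller would start with binarySearch(values, 0, len(values) - 1, target).

def binarySearch(values, low, high, target) :
    """Searches the sorted range values[low .. high] for target,
    returning its index, or -1 if target is not found."""
    if low > high :   # the range is empty
        return -1
    mid = (low + high) // 2   # visit the middle element
    if values[mid] == target :
        return mid
    elif values[mid] < target :
        return binarySearch(values, mid + 1, high, target)   # search right half
    else :
        return binarySearch(values, low, mid - 1, target)    # search left half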

Analyzing Binary Search
• Count the number of visits to search a sorted list of size n
• We visit one element (the middle element) then search either the
left or right sub-list
• Thus: T(n) = T(n/2) + 1
• Using the same equation, T(n/2) = T(n/4) + 1
• Substituting into the original equation: T(n) = T(n/4) + 2
• This generalizes to: T(n) = T(n/2ᵏ) + k

Analyzing Binary Search
• Assume n is a power of 2, n = 2ᵐ, where m = log₂(n)
• Then: T(n) = 1 + log₂(n)
• For example, a sorted list of n = 1,048,576 = 2²⁰ elements needs only 21 visits
• Binary search is an O(log(n)) algorithm

Problem Solving: Estimating the Running Time of an Algorithm
• Differentiating between O(n log(n)) and O(n²) algorithms has great practical implications
• Being able to estimate the running times of other algorithms is an important skill
• We will practice estimating the running times of list algorithms

Linear Time
• An algorithm to count how many elements of a list have a particular
value:
count = 0
for i in range(len(values)) :
    if values[i] == searchedValue :
        count = count + 1

Linear Time
• To visualize the pattern of list element visits:
• Imagine the list as a sequence of light bulbs
• As the ith element gets visited, imagine the ith light bulb lighting up

• When each visit involves a fixed number of actions, the running time is n times a constant, or O(n)

Linear Time
• Sometimes the loop does not run to the end of the list; for example:

found = False
i = 0
while not found and i < len(values) :
    if values[i] == searchedValue :
        found = True
    else :
        i = i + 1

• The loop can stop in the middle
• This is still O(n), because in some cases the match may be at the very end of the list

Quadratic Time
• What if we do a lot of work with each visit?
• Example: find the most frequent element in a list
• Obvious for a small list; for a larger list?
• Put the counts in a second list of the same length
• Take the maximum of the counts (3 in the slide's example), look up where the 3 occurs in the counts, and find the corresponding value (7)

Quadratic Time
• First estimate how long it takes to compute the counts:

counts = [0] * len(values)
for i in range(len(values)) :
    # Count how often values[i] occurs in values.
    counts[i] = values.count(values[i])

• We still visit each element once, but the work per visit is much larger
• Each counting action is O(n)
• When we do O(n) work in each of n steps, the total running time is O(n²)

Quadratic Time
• Algorithm has three phases:
1. Compute all counts
2. Compute the maximum
3. Find the maximum in the counts
• The first phase is O(n²); the second and third are each O(n)
• The big-Oh running time for doing several steps in a row is the largest of the big-Oh times for each step
• Thus the algorithm for finding the most frequent element is O(n²); a sketch of all three phases follows
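A minimal sketch of the three phases (one possible arrangement, assuming a nonempty list; not the book's listing):

def mostFrequent(values) :
    """Returns the most frequent element, in O(n²) time."""
    # Phase 1: compute all counts; O(n) work for each of the n elements.
    counts = [0] * len(values)
    for i in range(len(values)) :
        counts[i] = values.count(values[i])   # each count() call is O(n)

    # Phase 2: compute the maximum of the counts; O(n).
    maxCount = max(counts)

    # Phase 3: find where the maximum occurs and return that value; O(n).
    return values[counts.index(maxCount)]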

Logarithmic Time
• An algorithm that cuts the size of work in half in each step
runs in O(log(n)) time
• E.g., binary search (merge sort also halves the list at each step, but its O(n) merge work per level makes it O(n log(n)) overall)
• To improve our algorithm for finding the most frequent
element, first sort the list:

• Sorting costs O(n log(n)) time
• If we can complete the rest of the algorithm in O(n) time, we have a better algorithm than the previous O(n²) algorithm

Logarithmic Time
• While traversing the sorted list:
• As long as you find a value that is equal to its predecessor,
increment a counter
• When you find a different value, save the counter and start
counting anew

Logarithmic Time
count = 0
for i in range(len(values)) :
    count = count + 1
    # At the end of a run of equal values, record the run length.
    if i == len(values) - 1 or values[i] != values[i + 1] :
        counts[i] = count
        count = 0

• A constant amount of work per iteration, even though each step visits two elements:
• 2n is still O(n)
• The entire algorithm is now O(n log(n)): the sort dominates the linear pass (a complete sketch follows)
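Putting the pieces together, a minimal sketch of the improved algorithm (the function name and the use of sorted() are assumptions; the slides sort first, then make the linear pass shown above):

def mostFrequentFast(values) :
    """Returns the most frequent element, in O(n log(n)) time."""
    values = sorted(values)   # O(n log(n)); leaves the caller's list untouched
    counts = [0] * len(values)

    # One linear pass: the end of each run of equal values records its length.
    count = 0
    for i in range(len(values)) :
        count = count + 1
        if i == len(values) - 1 or values[i] != values[i + 1] :
            counts[i] = count
            count = 0

    # The largest recorded count marks the end of the longest run; O(n).
    return values[counts.index(max(counts))]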

Summary

Selection Sort Algorithm
• The selection sort algorithm sorts a list by repeatedly finding the
smallest element of the unsorted tail region and moving it to the front

Measuring the Running Time
• To measure the running time of a function or method, get the current
time immediately before and after the function/method call

Big-Oh Notation
• Computer scientists use the big-Oh notation to describe
the growth rate of a function
• Selection sort is an O(n²) algorithm. Doubling the data set means a fourfold increase in processing time
• Insertion sort is also an O(n²) algorithm

Merge and Selection Sort
• The merge sort algorithm sorts a list by cutting the list in half,
recursively sorting each half, and then merging the sorted halves
• Merge sort is an O(n log(n)) algorithm. The n log(n) function grows much more slowly than n²

Linear and Binary Search
• A linear search examines all values in a list until it finds a match or
reaches the end
• A linear search locates a value in a list in O(n) steps
• A binary search locates a value in a sorted list by determining whether
the value occurs in the first or second half, then repeating the search
in one of the halves
• A binary search locates a value in a sorted list in O(log(n)) steps

Estimating Big-Oh of Algorithms
• A loop with n iterations has O(n) running time if each step consists of a
fixed number of actions
• A loop with n iterations has O(n²) running time if each step takes O(n) time
• The big-Oh running time for doing several steps in a row is the largest
of the big-Oh times for each step
• A loop with n iterations has O(n²) running time if the ith step takes O(i) time
• An algorithm that cuts the size of work in half in each step runs in
O(log(n)) time

