Quick sort is a very fast sorting algorithm, but because it is recursive, on "small" arrays it takes about as long as insertion sort, which is iterative, works by shifting elements, and performs well on small arrays that are "almost sorted". We therefore design a hybrid algorithm that partitions the array recursively, as quick sort does, until an optimum cut-off size (determined by a simulation study) is reached, and then applies insertion sort.


Hybrid Quick Sort + Insertion Sort: Runtime Comparison

Anirban Ray

23 February 2018

Objective

Quick Sort and Insertion Sort are two of the most common sorting algorithms. Quick sort is popular because, on average, it is among the fastest general-purpose sorting algorithms in practice. Insertion sort, on the other hand, works very well when the array is partially sorted and not too large. In this project, we combine the two algorithms so as to exploit both the speed of quick sort and the effectiveness of insertion sort on small, nearly sorted inputs. We then look for the hybrid algorithm (a combination of insertion sort and quick sort) that is optimum in the sense of minimum average run-time.

Insertion Sort

Insertion sort is an iterative sorting algorithm. At each iteration, it removes an element, finds its position within the already sorted array of the preceding elements, and inserts it there. The algorithm can be written as below:

    INSERTIONSORT(A)
        for j = 2 to A.length
            key = A[j]
            i = j - 1
            while i > 0 and A[i] > key
                A[i + 1] = A[i]
                i = i - 1
            A[i + 1] = key
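The pseudocode above also shows why insertion sort suits nearly sorted input: the inner loop runs once for every pair of elements that is out of order, so the number of shifts equals the number of inversions in the array. The following is a minimal C++ sketch of that observation, using 0-based indexing; the function name `insertion_sort_count_shifts` is chosen for this illustration and is not part of the project code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sorts v in place and returns how many element shifts were needed.
// The count equals the number of inversions in the input.
std::size_t insertion_sort_count_shifts(std::vector<double>& v) {
    std::size_t shifts = 0;
    for (std::size_t j = 1; j < v.size(); ++j) {
        double key = v[j];
        std::size_t i = j;                  // slot where key will be written
        while (i > 0 && v[i - 1] > key) {
            v[i] = v[i - 1];                // move the larger element right
            --i;
            ++shifts;
        }
        v[i] = key;
    }
    assert(std::is_sorted(v.begin(), v.end()));  // sanity check on the result
    return shifts;
}
```

An already sorted array needs no shifts at all, while a reversed array of n elements needs n(n - 1)/2 of them.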

Quick Sort

Quick sort is a divide-and-conquer algorithm. It first divides a large array into two sub-arrays with respect to a pivot element, such that no element of one sub-array is greater than the pivot and no element of the other is less than it. It then does the same for the two sub-arrays, and continues until all sub-arrays are of size 1. Since the partitioning is done in place and every element of a left sub-array is no greater than every element of the corresponding right sub-array, the array is sorted at this stage and no separate merging step is needed. The algorithm to sort the pth to rth elements of the array A is as follows.

    QUICKSORT(A, p, r)
        if p < r
            q = PARTITION(A, p, r)
            QUICKSORT(A, p, q)
            QUICKSORT(A, q + 1, r)

    PARTITION(A, p, r)
        x = A[p]
        i = p - 1
        j = r + 1
        while TRUE
            repeat
                j = j - 1
            until A[j] <= x
            repeat
                i = i + 1
            until A[i] >= x
            if (i < j) exchange A[i] with A[j]
            else return j
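To make the guarantee of PARTITION concrete, here is a small C++ sketch (0-based, inclusive bounds) together with a checker for the property that QUICKSORT relies on. The names `hoare_partition` and `split_is_valid` are assumptions made for this illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hoare partition of v[lo..hi] around the pivot v[lo].
// Returns q with lo <= q < hi such that every element of v[lo..q]
// is <= every element of v[q+1..hi].
int hoare_partition(std::vector<double>& v, int lo, int hi) {
    assert(lo < hi);
    double pivot = v[lo];
    int i = lo - 1;
    int j = hi + 1;
    while (true) {
        do { --j; } while (v[j] > pivot);   // scan from the right
        do { ++i; } while (v[i] < pivot);   // scan from the left
        if (i >= j) return j;               // the scans have met
        std::swap(v[i], v[j]);
    }
}

// Checks that the largest element on the left does not exceed
// the smallest element on the right.
bool split_is_valid(const std::vector<double>& v, int lo, int q, int hi) {
    if (!(lo <= q && q < hi)) return false;
    double left_max = *std::max_element(v.begin() + lo, v.begin() + q + 1);
    double right_min = *std::min_element(v.begin() + q + 1, v.begin() + hi + 1);
    return left_max <= right_min;
}
```

Note that the pivot does not necessarily end up at index q. This is why the left recursive call covers p to q rather than p to q - 1.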

Different choices of the pivot element are available for different types of input array. The above-mentioned algorithm uses the first element of the array. Lomuto's scheme uses the last element. Sometimes a random index is chosen, the element there is swapped with the last element, and the Lomuto partitioning method is then followed. Singleton used the median-of-three method, in which one first sorts the first, last and middle-most elements of the array, then exchanges the middle-most element of the modified array with the first element and proceeds as before. In this project, we always use random inputs, in which case the choice of pivot does not matter much. So we continue to use the first element as pivot, following Hoare, who first proposed the quick sort algorithm.
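The median-of-three step described above can be sketched in C++ as follows. This is only an illustration of the idea under 0-based indexing; the helper name `median_of_three_to_front` is hypothetical, and the project code itself keeps the plain first-element pivot.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Moves the median of v[lo], v[mid], v[hi] to v[lo], so that a partition
// routine which takes the first element as pivot picks it up.
void median_of_three_to_front(std::vector<double>& v, int lo, int hi) {
    assert(0 <= lo && lo <= hi && hi < static_cast<int>(v.size()));
    int mid = lo + (hi - lo) / 2;
    // Sort the three sampled elements in place: v[lo] <= v[mid] <= v[hi].
    if (v[mid] < v[lo]) std::swap(v[mid], v[lo]);
    if (v[hi] < v[lo]) std::swap(v[hi], v[lo]);
    if (v[hi] < v[mid]) std::swap(v[hi], v[mid]);
    // Exchange the median (now in the middle) with the first element.
    std::swap(v[lo], v[mid]);
}
```

Calling this helper just before partitioning a sub-array guards against the quadratic behaviour that a first-element pivot shows on already sorted input, at the cost of a few extra comparisons per partition.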

Hybrid Sort

We now formulate the new hybrid algorithm. Since insertion sort works well on arrays made up of small sub-arrays that are already in the right order relative to each other, we start the sorting procedure with the partition approach of quick sort. But instead of continuing until we reach sub-arrays of one element each, we stop partitioning a sub-array once its size is no more than a given cut-off size, which distinguishes small arrays from large ones. After this step is completed, the array consists of sub-arrays of sizes less than or equal to the cut-off size, which are not sorted themselves but are in sorted order as a whole. Finally, we run insertion sort once over the entire array to get the completely sorted output. The algorithm is the following.

    HYBRIDSORT(A, p, r, k)
        if (p < r)
            if (r - p + 1 > k)
                q = PARTITION(A, p, r)
                HYBRIDSORT(A, p, q, k)
                HYBRIDSORT(A, q + 1, r, k)

    // run once, after the top-level call HYBRIDSORT(A, 1, A.length, k) returns
    INSERTIONSORT(A)
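The claim that the sub-arrays are "sorted as a whole" after the partition phase can be checked directly. In the C++ sketch below (an illustration, not the project code), the input is assumed to be a permutation of 0, ..., n - 1, so the final position of the value x is index x. The names `partition_phase` and `max_displacement` are chosen for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <utility>
#include <vector>

// Hoare partition of v[lo..hi] around v[lo], as in PARTITION above.
int hoare_partition(std::vector<int>& v, int lo, int hi) {
    int pivot = v[lo];
    int i = lo - 1;
    int j = hi + 1;
    while (true) {
        do { --j; } while (v[j] > pivot);
        do { ++i; } while (v[i] < pivot);
        if (i >= j) return j;
        std::swap(v[i], v[j]);
    }
}

// Partition phase only: leave sub-arrays of at most `cutoff` elements unsorted.
void partition_phase(std::vector<int>& v, int lo, int hi, int cutoff) {
    assert(cutoff >= 1);
    if (lo < hi && hi - lo + 1 > cutoff) {
        int q = hoare_partition(v, lo, hi);
        partition_phase(v, lo, q, cutoff);
        partition_phase(v, q + 1, hi, cutoff);
    }
}

// For a permutation of 0..n-1, |v[i] - i| is the distance the element at
// index i still has to travel to reach its sorted position.
int max_displacement(const std::vector<int>& v) {
    int worst = 0;
    for (int i = 0; i < static_cast<int>(v.size()); ++i) {
        worst = std::max(worst, std::abs(v[i] - i));
    }
    return worst;
}
```

After `partition_phase` with cut-off k, no element is more than k - 1 positions away from its final place, so the closing insertion sort performs at most k - 1 shifts per element.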

We first define the sorting algorithms in C++ using the Rcpp package.

    #include <Rcpp.h>

    using namespace Rcpp;

    // NumericVector is a thin handle to R's memory, so passing it by value
    // still modifies the caller's data in place

    // exchange two elements of the array
    void swap(NumericVector array, int first_position, int second_position) {
        double temporary = array[first_position];
        array[first_position] = array[second_position];
        array[second_position] = temporary;
    }

    // Hoare partition of array[start..end] using the first element as pivot;
    // returns key such that no element of array[start..key] exceeds any
    // element of array[(key + 1)..end]
    int partition(NumericVector array, int start, int end) {
        double pivot = array[start];
        int i = (start - 1);
        int j = (end + 1);
        while (true) {
            do {
                i = (i + 1);
            } while (array[i] < pivot);
            do {
                j = (j - 1);
            } while (array[j] > pivot);
            if (i >= j) {
                return j;
            }
            swap(array, i, j);
        }
    }

    // insertion sort of array[start..end]
    void insertion(NumericVector array, int start, int end) {
        if (start < end) {
            for (int i = (start + 1); i <= end; ++i) {
                double temporary = array[i];
                int j = (i - 1);
                // shifting larger elements one position to the right
                while ((j >= start) && (array[j] > temporary)) {
                    array[(j + 1)] = array[j];
                    j = (j - 1);
                }
                array[(j + 1)] = temporary;
            }
        }
    }

    // plain quick sort of array[start..end]
    void quick(NumericVector array, int start, int end) {
        if (start < end) {
            int key = partition(array, start, end);
            quick(array, start, key);
            quick(array, (key + 1), end);
        }
    }

    // partition phase of hybrid sort; sub-arrays of size at most cutoff are
    // left unsorted for the final insertion sort
    void hybrid(NumericVector array, int start, int end, int cutoff) {
        if (start < end) {
            // applying partition algorithm only when array size is more than cutoff
            if ((end - start + 1) > cutoff) {
                int key = partition(array, start, end);
                hybrid(array, start, key, cutoff);
                hybrid(array, (key + 1), end, cutoff);
            }
        }
    }

    // [[Rcpp::export]]
    NumericVector sorting_R(NumericVector array, char method, int cutoff) {
        int n = array.length();
        // making an explicit copy of the input array to keep that unchanged
        NumericVector sorted_array = clone(array);
        // applying different sorting algorithms based on method
        switch (method) {
            case 'h': {
                // partition phase, then one insertion sort over the whole array
                hybrid(sorted_array, 0, (n - 1), cutoff);
                insertion(sorted_array, 0, (n - 1));
                break;
            }
            case 'i': {
                insertion(sorted_array, 0, (n - 1));
                break;
            }
            case 'q': {
                quick(sorted_array, 0, (n - 1));
                break;
            }
            default: {
                Rcpp::stop("Permissible methods are Hybrid(h), Insertion(i) and Quick(q).");
            }
        }
        return sorted_array;
    }

Now that we have defined our sorting algorithms, the next step is to find the optimum choice of cut-off by a simulation study, since it is not known in advance and the concept of “small” is vague. We therefore define functions in R (calling the C++ functions) to compute the average run-time of our hybrid algorithm for a given choice of the cut-off size. We run these functions over different choices of cut-off for different array sizes, and plot the average run-times against the cut-offs for each array size as below.

    # function to calculate required time to sort a particular input array using
    # a user defined cutoff
    single_hybrid_runtime <- function(array_to_be_sorted, cutoff_to_be_used) {
      system.time(sorting_R(array_to_be_sorted, "h", cutoff_to_be_used))["user.self"]
    }

    # function to calculate required times to sort a simulated array of a
    # particular size using different choices of cutoff
    comparative_hybrid_runtime <- function(array_size, cutoff) {
      simulated_array <- rnorm(array_size)
      sapply(cutoff, single_hybrid_runtime, array_to_be_sorted = simulated_array)
    }

    # function to calculate average runtime for user defined array size for
    # different choices of cutoff, average being taken over different
    # replications (optionally user defined)
    average_hybrid_runtime <- function(array_size, cutoff, replication = 25) {
      rowMeans(replicate(replication, comparative_hybrid_runtime(array_size, cutoff)))
    }

    keys <- seq(1, 1000, 1) # choices of cutoff used for simulation study

    times_1_e_5 <- average_hybrid_runtime(array_size = 1e+05, cutoff = keys)
    times_4_e_5 <- average_hybrid_runtime(array_size = 4e+05, cutoff = keys)
    times_7_e_5 <- average_hybrid_runtime(array_size = 7e+05, cutoff = keys)
    times_1_e_6 <- average_hybrid_runtime(array_size = 1e+06, cutoff = keys)

    plot(keys, times_1_e_5, type = "o", main = "For array size 1e+05", xlab = "Cutoff Used",
         ylab = "Time Taken")
    plot(keys, times_4_e_5, type = "o", main = "For array size 4e+05", xlab = "Cutoff Used",
         ylab = "Time Taken")
    plot(keys, times_7_e_5, type = "o", main = "For array size 7e+05", xlab = "Cutoff Used",
         ylab = "Time Taken")
    plot(keys, times_1_e_6, type = "o", main = "For array size 1e+06", xlab = "Cutoff Used",
         ylab = "Time Taken")

Observations from the Graphs

Firstly, all the graphs show a sharp initial fall. This supports the effectiveness of the hybrid algorithm over quick sort: with a cut-off of 1, we are essentially applying quick sort to the entire array, so the steep fall indicates that combining the two algorithms is worthwhile. The reason is that quick sort, being recursive, incurs a large overhead from calling itself repeatedly on small arrays.

Secondly, beyond a certain point the average run-time increases steadily, because insertion sort is effective only for “small” arrays. As the cut-off size increases, insertion sort has to be applied to larger sub-arrays, and hence sorting the entire array becomes slower.

Finally, the trade-off between these two opposing effects is balanced in the lower part of the skewed U-shaped pattern, which appears in all the graphs to a greater or lesser extent.

Therefore, based on the simulation study, we conclude that the optimum choice of cut-off lies in the range from 100 to 200. Based on our reading of the graphs, we subjectively choose 140 as the cut-off in the later sections, without any analytical justification.
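The U-shaped pattern is consistent with a standard textbook cost model for this scheme (it appears as an exercise in Introduction to Algorithms). The expected run-time for array size n and cut-off k is

```latex
T(n, k) = \underbrace{O\!\left(n \log \frac{n}{k}\right)}_{\text{partition phase}}
        + \underbrace{O(nk)}_{\text{final insertion sort}}
```

As a rough sketch only, suppose the two terms are approximated by c_q n log_2(n/k) and c_i n k, where c_q and c_i are machine-dependent constants introduced here for illustration and not estimated in this study. Setting the derivative with respect to k to zero gives

```latex
-\frac{c_q\, n}{k \ln 2} + c_i\, n = 0
\quad\Longrightarrow\quad
k^{*} = \frac{c_q}{c_i \ln 2}
```

Under these assumptions the minimising cut-off does not depend on n: the first term falls as k grows, the second rises, and the balance point is set by the ratio of the two constants.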

A natural question is how much we gain from this algorithm, or whether we gain at all. The previous section showed that the hybrid method clearly improves on the run-time of quick sort. We now wish to see whether this improvement varies with the size of the input array. For that purpose, we define a function to calculate the percentage improvement in run-time of hybrid sort over quick sort and plot the results.

    # function to calculate percentage improvement of hybrid over quick for a
    # user defined input size
    single_improvement <- function(array_size) {
      x <- rnorm(array_size)
      hybrid_time <- system.time(sorting_R(x, "h", 140))["user.self"]
      # the cutoff argument is ignored by the quick sort method
      quick_time <- system.time(sorting_R(x, "q", 140))["user.self"]
      (quick_time - hybrid_time) * 100/quick_time
    }

    # function to calculate average percentage improvement over replications
    average_improvement <- function(length_of_array, replication = 50) {
      mean(replicate(replication, single_improvement(length_of_array)))
    }

    sizes <- seq(1e+05, 1e+07, 1e+05) # simulated sizes used for improvement calculation

    improvement <- sapply(sizes, average_improvement)

    plot(sizes, improvement, type = "o", xlab = "Array Size", ylab = "Percentage Improvement",
         main = "Improvement in Hybrid algorithm over Quick")
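For reference, the quantity returned by single_improvement for an array of size n can be written as follows, where the symbols t_quick and t_hybrid are notation introduced here for the two measured user times:

```latex
\text{improvement}(n) = 100 \times
  \frac{t_{\text{quick}}(n) - t_{\text{hybrid}}(n)}{t_{\text{quick}}(n)}
```

With purely hypothetical timings of 2.0 seconds for quick sort and 1.2 seconds for hybrid sort, this gives 100 × (2.0 - 1.2) / 2.0 = 40, that is, a 40% improvement. A negative value would mean that hybrid sort was slower.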

Explanation of Improvement Pattern

From the graph, hybrid sort comfortably outperforms quick sort for all the array sizes considered. The same graph also shows that the improvement decreases as the array size increases, although the percentage improvement is still around 40%, which is substantial for practical purposes. The unexpected decreasing trend may be explained by the slow nature of the insertion sort algorithm. In hybrid sort, we run insertion sort over the entire array in the last step. Although the array is partially sorted at this step, insertion sort is really effective only for small arrays. We use it to reduce the large overhead of quick sort's recursive calls on small arrays, but this remedy has its own cost: for large arrays insertion sort is intrinsically slow, however partially sorted the array may be. Thus, as the array size increases, the run-time of this step also increases.

Summary

At the end of the project, we see that we have successfully improved quick sort by combining it with insertion sort. We have also provided an interval in which the optimum choice of cut-off size should lie, and verified that hybrid sort consistently outperformed quick sort in our simulations. Thus, this algorithm can be used as an alternative to quick sort.

References

1. Introduction to Algorithms - Third Edition (https://mitpress.mit.edu/books/introduction-algorithms)

2. Wikipedia - Quick Sort (https://en.wikipedia.org/wiki/Quicksort)

3. Wikipedia - Insertion Sort (https://en.wikipedia.org/wiki/Insertion_sort)

4. Techie Delight - Hybrid QuickSort Algorithm (www.techiedelight.com/hybrid-quicksort)
