
Vol 6. No. 5, December 2013
African Journal of Computing & ICT

© 2013 Afr J Comp & ICT – All Rights Reserved - ISSN 2006-1781
www.ajocict.net

A Comparative Study of Sorting Algorithms


D.R. Aremu
Department of Computer Science
Faculty of Communication and Information Sciences
University of Ilorin
Ilorin, Nigeria
draremu2006@gmail.com

O.O. Adesina, O.E. Makinde, O. Ajibola & O.O. Agbo-Ajala


Department of Physical Sciences
Faculty of Natural Sciences
Ajayi Crowther University
Oyo, Oyo State, Nigeria
opeyemi.adesina@gmail.com, oladayo.makinde@yahoo.com, ajalabosun@gmail.com

ABSTRACT
This study presents a comparative study of some sorting algorithms with the aim of identifying the most efficient one. The methodology was to evaluate the performance of the median, heap, and quick sort techniques using CPU time and memory space as performance indices. This was achieved by reviewing the literature of relevant works. We also formulated an architectural model which serves as a guideline for implementing and evaluating the sorting techniques. The techniques were implemented in C, and the profile of each technique was obtained with G-Profiler. The results obtained show that, in the majority of the cases considered, heap sort is faster and requires less space than the median and quick sort algorithms for any input data size. Similarly, the results show that the slowest of the three techniques is median sort, while quick sort is faster and requires less memory than median sort but is slower and requires more memory than heap sort. The number of sorting algorithms considered in this study for complexity measurement is limited to the median, heap, and quick sort techniques. Future work will investigate the complexities of other sorting techniques in the literature based on CPU time and memory space, with the goal of adopting the most efficient sorting technique in the development of a job scheduler for the grid computing community.

Keywords— Sorting algorithms, Comparative Study, Heap Sort, Quick Sort, Median Sort

African Journal of Computing & ICT Reference Format:
D.R. Aremu, O.O. Adesina, O.E. Makinde, O. Ajibola & O.O. Agbo-Ajala (2013). A Comparative Study of Sorting Algorithms. Afr J. of Comp & ICTs. Vol 6, No. 5. Pp 199-206.

1. INTRODUCTION
This paper presents a comparative study of sorting algorithms. Sorting [4], a mechanism that organizes the elements of a list into a predefined order, is important for various reasons. For instance, numerical computation with estimated values (e.g. floating point representations of real numbers) is concerned with accuracy, and scientists often apply sorting to control the accuracy of computations of this kind. In string processing, finding the longest common prefix in a set of strings or the longest repeated substring in a given string is often based on sorting. Developing job schedules that maximize customer satisfaction by minimizing the average completion time often requires methods (such as the longest-processing-time-first rule) based on sorting. In the literature, various implementation solutions exist for sorting.

However, an important question, namely which sorting algorithm is the most efficient or least complex, is yet to be fully addressed. The goal of this paper is to present a comparative study of some sorting algorithms with the aim of identifying the most efficient one. The methodology was to evaluate the performance of the median, heap, and quick sort techniques using CPU time and memory space as performance indices. This was achieved by reviewing the literature of relevant works. We also formulated an architectural model which serves as a guideline for implementing and evaluating the sorting techniques. The techniques were implemented in C, and the profile of each technique was obtained with G-Profiler. The results obtained show that, in the majority of the cases considered, heap sort is faster and requires less space than the median and quick sort algorithms for any input data size.


Similarly, the results show that the slowest of the three techniques is median sort, while quick sort is faster and requires less memory than median sort but is slower and requires more memory than heap sort. The rest of this paper is organized as follows. Section 2 discusses related works. Section 3 presents the architectural model designed for this work. Section 4 presents the algorithms designed for simulating the sorting techniques. Section 5 discusses the results obtained from the evaluation of the sorting techniques considered. Section 6 concludes the paper, while Section 7 presents future thoughts.

2. RELATED WORKS

In the literature, there are various studies on evaluating the performance or complexity of sorting algorithms. In [6], a comparative study of parallel sort algorithms such as map sort, merge sort, the Tsigas-Zhang parallel quicksort, an alternative quicksort, and STL sort was carried out. The goal of [4] was to introduce the merge sort algorithm and its ability to sort an n-sized array of elements in O(n log n) time. The approach adopted in developing this algorithm was the divide-and-conquer method. Empirical and theoretical analyses of the merge sort algorithm were presented in addition to its pseudo code. Merge sort was compared with its counterparts (bubble, insertion, and selection sort), each of which was recorded as having quadratic O(n^2) time. The results presented pit merge sort against insertion sort and show that merge sort is significantly faster than insertion sort for large arrays: merge sort is 24 to 241 times faster than insertion sort for n-values of 10,000 and 60,000 respectively. The results also show that the difference between merge and insertion sort is statistically significant with more than 90 percent confidence.

Similarly, [5] gave a practical performance comparison of parallel sorting algorithms: odd-even transposition sort, parallel rank sort, and parallel merge sort. In [7], robustness was studied as a function of the complexity of some sorting techniques. Finally, [4] evaluated merge sort empirically and theoretically, finding that it sorts an N-sized dataset in O(n log n) time and is the most efficient among bubble, insertion, and selection sort, which have quadratic time complexity.

According to [5], a great number of research works have addressed issues related to dedicated and parallel machines [12], but only little research has been carried out on the performance evaluation of parallel sorting algorithms on clustered workstations. Hence, the goal of [5] was to compare some parallel sorting algorithms using overall execution time as the performance parameter. The algorithms considered include odd-even transposition sort, parallel rank sort, and parallel merge sort. These algorithms were evaluated theoretically and empirically. In theory, odd-even transposition sort has a complexity of O(bn^2), such that b = 1/2; this implies that the time is reduced by the factor b. Similarly, parallel rank sort has a theoretical complexity of O(cn^2), with c = 1/p, and the theoretical complexity of parallel merge sort is O(n/p log n/p) [13], where p is the number of processes. Empirical results show that the fastest of the three algorithms is the parallel merge technique, followed by the odd-even transposition algorithm, while the slowest is the parallel rank sorting algorithm.

The motivation of [7] was to research sorting algorithms under potentially failing comparisons. Faulty comparisons are expected to occur in sorting due to random fluctuations in the evaluation functions that compare two elements. Hence, the aim of the work was to find a method that is robust against faulty comparisons. The null hypothesis of the research is that a very efficient (i.e. low complexity order) sorting algorithm might be more susceptible to errors from imprecise comparisons than less efficient sorting algorithms, which implicitly perform many redundant comparisons. The sorting algorithms considered include bubble sort, merge sort, quick sort, heap sort, and selection sort. The result of the research supports its null hypothesis: bubble sort is the most robust sorting algorithm, followed by merge, quick, heap, and selection sort. The work contributed to the state of the art by analyzing existing sorting algorithms based on their robustness against imprecise and noisy comparisons.

In [14], a performance evaluation of the bubble, selection, and insertion sort techniques was conducted. The motivation of the work was the unprecedented growth of data in knowledge-based sectors and the resulting need for mechanisms to analyze these data, which often demands sorting. The sorting techniques evaluated in this work were subjected to CPU time and memory space as performance indices. Experimental results showed that the most efficient of the techniques evaluated, for both performance parameters, is insertion sort. Similarly, selection sort was observed to be less efficient than insertion sort, and the least efficient of the techniques is bubble sort.

3. ARCHITECTURAL MODEL

The knowledge acquired from the literature was applied in designing the architectural model presented in Figure 1. It is divided into three basic components: the repository, the shuffling module, and the sorting module. This model was formulated to serve as a benchmark of the components to be developed when evaluating the performance of sorting algorithms. The data designed for evaluating the algorithms is stored in the repository. This data may be ordered or unordered in nature.
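The control flow of the architectural model in Figure 1 (repository, then shuffling module, then sorting module, and back to the repository) can be sketched as a small C driver. This is a hypothetical sketch, not the paper's code: `evaluate_pass` is our name, and the standard library `qsort` stands in for whichever sorting technique is under evaluation.

```c
#include <stdlib.h>
#include <string.h>

/* Shuffling module: Fisher-Yates shuffle, randomizing element positions. */
static void shuffle(int *a, int n) {
    for (int i = n - 1; i > 0; i--) {
        int j = rand() % (i + 1);
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }
}

/* Comparator for the stand-in sorting module. */
static int cmp_int(const void *x, const void *y) {
    int a = *(const int *)x, b = *(const int *)y;
    return (a > b) - (a < b);
}

/* One evaluation pass over the model of Figure 1: take the data from the
 * repository, shuffle it, sort it, and store the result back. */
void evaluate_pass(int *repository, int n) {
    int *work = malloc(n * sizeof *work);  /* working copy of the data */
    memcpy(work, repository, n * sizeof *work);
    shuffle(work, n);                      /* shuffling module */
    qsort(work, n, sizeof *work, cmp_int); /* sorting module */
    memcpy(repository, work, n * sizeof *work);
    free(work);
}
```

A real evaluation pass would swap `qsort` for the median, heap, or quick sort routine being profiled.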


However, since the aim of the work is to determine the efficiency of sorting mechanisms in the solution space, it is important to realize uniformly shuffled data. To realize this, a shuffling module (which randomizes the positions of the data) is introduced. The shuffled data resulting from the shuffling process is passed to the sorter, which applies its logic to sort the input data. The architecture also shows the flow of control from one component to another. Control flows from the repository to the shuffling module, and from the shuffling module to the sorting module. The output obtained from sorting is passed back into the repository for storage.

Figure 1: Architectural Model (components: Repository, Shuffling Module, and Sorting Module)

4. ALGORITHM DESIGN

This section describes and presents the basic algorithms which are the foundation of the majority of the sorting algorithms. The algorithms in this category include the shuffling and swapping algorithms.

4.1 Shuffling Algorithm

Figure 2 presents the pseudo code of the shuffling algorithm. Its theoretical complexity is O(n) in the worst case and O(1) in the best case. We have assumed that the swapping algorithm takes a constant time, O(1), to execute in theory.

shuffle( A , n )
1. i ← n – 1
2. while i > 0 do
3.     j ← rand() % (i + 1)
4.     swap( A[i] , A[j] )
5.     i ← i – 1
6. end-while

Figure 2: Shuffling Algorithm

4.2 Sorting Algorithms

The sorting module component can be designed with any sorting technique in the solution space. To design this component, we consider the heap, quick, and median sort techniques. These algorithms are discussed theoretically and with pseudo codes.

4.2.1 Heap Sort Algorithm

Figure 3 presents the pseudo code of the make-heap algorithm, which constructs the heap to be sorted by the heap sort algorithm. The heap sort algorithm presented in Figure 4 is then applied to sort the constructed heap. According to [8], the heap sort algorithm has a theoretical complexity of O(n log n) in its best, average, and worst case scenarios.

make-heap( A , n )
1. for i ← 1 to n – 1 do
2.     val ← A[i]
3.     s ← i
4.     f ← (s – 1) / 2
5.     while s > 0 AND A[f] < val do
6.         A[s] ← A[f]
7.         s ← f
8.         f ← (s – 1) / 2
9.     end-while
10.    A[s] ← val
11. end-for

Figure 3: Make-Heap Algorithm
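The sift-up construction of Figure 3 transcribes almost line for line into C. The sketch below is our transcription (variable names follow the pseudo code); after the loop, the array satisfies the max-heap property A[(s-1)/2] >= A[s].

```c
/* C transcription of the make-heap pseudo code (Figure 3): each element
 * A[i] is sifted up past any smaller ancestors, so the array ends up
 * satisfying the max-heap property A[(s-1)/2] >= A[s]. */
void make_heap(int *A, int n) {
    for (int i = 1; i < n; i++) {
        int val = A[i];
        int s = i;            /* slot currently being filled */
        int f = (s - 1) / 2;  /* parent of slot s */
        while (s > 0 && A[f] < val) {
            A[s] = A[f];      /* pull the smaller parent down */
            s = f;
            f = (s - 1) / 2;
        }
        A[s] = val;           /* drop val into its final slot */
    }
}
```

After `make_heap`, the maximum element sits at `A[0]`, ready for the extraction phase of the heap sort in Figure 4.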


heap-sort( A , n )
1. make-heap( A , n )
2. for i ← n – 1 to 1 do
3.     ivalue ← A[i]
4.     A[i] ← A[0]
5.     f ← 0
6.     if i == 1 then
7.         s ← -1
8.     else
9.         s ← 1
10.    end-if
11.    if i > 2 AND A[2] > A[1] then
12.        s ← 2
13.    end-if
14.    while s >= 0 AND ivalue < A[s] do
15.        A[f] ← A[s]
16.        f ← s
17.        s ← 2 * f + 1
18.        if s > i – 1 then
19.            s ← -1
20.        else if s + 1 <= i – 1 AND A[s] < A[s+1] then
21.            s ← s + 1
22.        end-if
23.    end-while
24.    A[f] ← ivalue
25. end-for

Figure 4: Heap Sort Algorithm

4.2.2 Quick Sort Algorithm
Figures 5, 6, and 7 collectively form the quick sort algorithm. Figure 5 generates a pivot randomly for dividing the data set into two distinct partitions. Figure 6 is the pseudo code of the algorithm for partitioning a block of the data set based on the pivot generated. Figure 7 presents the quick sort algorithm. It is developed with the divide-and-conquer approach: the method recursively calls itself to sort each partition once it has been created. According to [8], it has a theoretical complexity of O(n log n) in its best and average cases and O(n^2) in its worst case scenario.

pivot( left , right )
1. x ← right – left
2. y ← rand() % x
3. piv ← y + left
4. return piv

Figure 5: Pivot Generating Algorithm

partition( A, left , right )
1. p ← pivot( left, right )
2. swap( A[p] , A[right] )
3. store ← left
4. for i ← left to right – 1 do
5.     if A[i] < A[right] then
6.         swap( A[i] , A[store] )
7.         store ← store + 1
8.     end-if
9. end-for
10. swap( A[store] , A[right] )
11. return store

Figure 6: Partitioning Algorithm

quick-sort( A, left , right )
1. if left < right then
2.     p ← partition( A, left, right )
3.     quick-sort( A, left, p – 1 )
4.     quick-sort( A, p + 1, right )
5. end-if

Figure 7: Quick Sort Algorithm

4.2.3 Median Sort Algorithm
Figures 8, 9, and 10 collectively form the median sort algorithm. Figure 8 generates a pivot by computing the average of the left and right positions; this is used for dividing the data set into two distinct partitions. Figure 9 is the pseudo code of the algorithm for partitioning a block of the data set based on the pivot generated. Figure 10 presents the median sort algorithm. It is developed with the divide-and-conquer approach: the method recursively calls itself to sort each partition once it has been created. According to [8], it has a theoretical complexity of O(n log n) in its best and average cases and O(n^2) in its worst case scenario.
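Quick sort (Figures 5 to 7) and median sort (Figures 8 to 10) share the same partitioning logic and differ only in how the pivot index is chosen. The C sketch below makes that explicit by passing the pivot rule as a function pointer; it is our sketch, not the paper's implementation, and it follows the standard divide-and-conquer formulation in which the recursion excludes the placed pivot.

```c
#include <stdlib.h>

/* A pivot rule returns an index in [left, right]. */
typedef int (*pivot_fn)(int left, int right);

/* Random rule in the spirit of Figure 5 (left is added to keep the
 * index inside the block). */
static int rand_pivot(int left, int right) {
    return left + rand() % (right - left + 1);
}

/* Midpoint rule of Figure 8 (median sort). */
static int mid_pivot(int left, int right) {
    return (left + right) / 2;
}

static void swap_int(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Partitioning of Figures 6 and 9: move the pivot value to the right
 * end, sweep the block once, and return the pivot's final position. */
static int partition(int *A, int left, int right, pivot_fn pick) {
    int p = pick(left, right);
    swap_int(&A[p], &A[right]);
    int store = left;
    for (int i = left; i < right; i++) {
        if (A[i] < A[right]) {
            swap_int(&A[i], &A[store]);
            store++;
        }
    }
    swap_int(&A[store], &A[right]);
    return store;
}

/* Recursive driver of Figures 7 and 10. */
void part_sort(int *A, int left, int right, pivot_fn pick) {
    if (left < right) {
        int p = partition(A, left, right, pick);
        part_sort(A, left, p - 1, pick);
        part_sort(A, p + 1, right, pick);
    }
}
```

Calling `part_sort(A, 0, n - 1, mid_pivot)` gives median sort; swapping in `rand_pivot` gives quick sort with no other change.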


m-pivot( left , right )
1. return (left + right) / 2

Figure 8: Median Sort Pivot Generator Algorithm

m-partition( A, left , right )
1. p ← m-pivot( left , right )
2. swap( A[p] , A[right] )
3. store ← left
4. for i ← left to right – 1 do
5.     if A[i] < A[right] then
6.         swap( A[i] , A[store] )
7.         store ← store + 1
8.     end-if
9. end-for
10. swap( A[store] , A[right] )
11. return store

Figure 9: Median Sort Partitioning Algorithm

median-sort( A, left , right )
1. if left < right then
2.     pi ← m-partition( A , left , right )
3.     median-sort( A , left , pi – 1 )
4.     median-sort( A , pi + 1 , right )
5. end-if

Figure 10: Median Sort Algorithm

5. RESULTS AND DISCUSSION

This section presents the discussion of the results obtained from the empirical evaluation of the heap, quick, and median sort algorithms. The efficiency of these algorithms was measured in CPU time, using the system clock on a machine with minimal background processes running, with respect to the size of the input data. The algorithms were implemented in C. The tests were carried out using G-Profiler in the GCC suite on Linux Ubuntu 13.04 and were run on a Dell Inspiron 6400 PC with the following specifications: Intel Dual Core CPU at 1.60 GHz and 1.00 GB of RAM. Empirical results for the heap, quick, and median sort techniques, using various data sizes and the corresponding CPU time for each technique, are presented in Table 1. Similarly, Table 2 presents the empirical results of the techniques considered, using various input data sizes and the corresponding memory required for execution.

Table 1: Results of CPU Time vs. Data Size

Data Size (x1000)   Heap Sort (s)   Quick Sort (s)   Median Sort (s)
10                  0               0.01             0.01
30                  0.01            0.01             0.02
50                  0.02            0.02             0.06
100                 0.05            0.06             0.15
200                 0.09            0.08             0.21

Table 2: Results of Memory vs. Data Size

Data Size (x1000)   Heap Sort (Bytes)   Quick Sort (Bytes)   Median Sort (Bytes)
10                  0.00                4.00                 4.00
30                  4.00                4.00                 8.00
50                  8.00                8.00                 24.00
100                 20.00               24.00                60.02
200                 36.00               32.00                83.93

Also, Graph 1 illustrates the results of an experiment with the complexities of the sorting techniques, plotting input data size against CPU time; M-Sort, H-Sort, and Q-Sort denote the median, heap, and quick sort techniques respectively. We observed that for the median and heap sorting techniques considered in this work, the CPU time is directly proportional to the input data size, while the CPU time required by the quick sort algorithm is the same for data sizes of 10,000 and 30,000; for the other data sizes, its CPU time is likewise proportional to the input data size. We also observed that the CPU time required by median sort is about 2.5 times higher than that of its counterparts. The heap and quick sort algorithms compete in almost all the cases of input data size.
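The CPU-time figures above were collected with G-Profiler; a lighter-weight way to reproduce the same kind of measurement is the `clock()`-based harness sketched below. This is our sketch, not the paper's instrumentation, and the standard library `qsort` is a stand-in for the techniques under test.

```c
#include <stdlib.h>
#include <time.h>

static int cmp_int(const void *x, const void *y) {
    int a = *(const int *)x, b = *(const int *)y;
    return (a > b) - (a < b);
}

/* Fill an array with n random keys, sort it with the stand-in sorting
 * routine, and return the CPU seconds consumed by the sort. */
double time_sort(int n) {
    int *a = malloc(n * sizeof *a);
    for (int i = 0; i < n; i++) a[i] = rand();
    clock_t t0 = clock();                  /* start of the measured region */
    qsort(a, n, sizeof *a, cmp_int);       /* sorting technique under test */
    double secs = (double)(clock() - t0) / CLOCKS_PER_SEC;
    free(a);
    return secs;
}
```

Repeating the call for each data size in Table 1 and averaging over several runs smooths out clock granularity.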


Graph 1: Graph of CPU Time Vs. Data Size

Also, Graph 2 illustrates the results of an experiment with the complexities of the sorting techniques, plotting input data size against memory space; in a similar manner, M-Sort, H-Sort, and Q-Sort denote the median, heap, and quick sort techniques. We observed that for the median and heap sorting techniques considered in this work, the memory space is directly proportional to the input data size, while the memory space required by the quick sort algorithm is the same for data sizes of 10,000 and 30,000; for the other data sizes, its memory space is likewise proportional to the input data size. We also observed that the memory space required by median sort is about 2.5 times larger than that of its counterparts. The heap and quick sort algorithms compete in almost all the cases of input data size.

Graph 2: Graph of Memory Space Vs. Data Size


6. CONCLUSION
In conclusion, we have evaluated the performance of the median, heap, and quick sort techniques using CPU time and memory space as performance indices. This was achieved by reviewing the literature of relevant works. We also formulated an architectural model which serves as a guideline for implementing and evaluating the sorting techniques. The techniques were implemented in C, and the profile of each technique was obtained with G-Profiler [15].

Empirical results were tabulated and graphically presented. The results obtained show that, in the majority of the cases considered, heap sort is faster and requires less space than the median and quick sort algorithms for any input data size. Similarly, the results show that the slowest of the three techniques is median sort, while quick sort is faster and requires less memory than median sort but is slower and requires more memory than heap sort. We infer from this study that the empirically measured complexities are consistent with the theoretical analysis.
7. FUTURE THOUGHTS
We realize that numerous implementation solutions of sorting techniques exist in the solution space. However, time constraints have limited this study to the median, heap, and quick sort techniques. We intend to investigate the complexities of other sorting techniques in the literature based on CPU time and memory space. We also intend to adopt the most efficient sorting technique in the development of a job scheduler for the grid computing community.


REFERENCES
[1] Wright A., Glut: Mastering Information
through the ages, Henry, Washington D.C.,
2007.
[2] Dubitzky W., Data mining techniques in grid
computing environment, John Wiley and Sons,
West Sussex, U.K., 2008.
[3] Aremu D.R., Adesina O.O., “Web services: A
solution to interoperability problems in sharing
grid resources”, ARPN Journal of Systems and
Software, vol. 1, no. 4, pp. 141 – 148, 2011.
[4] Qin M., “Merge Sort Algorithm”, Department
of Computer Sciences, Florida Institute of
Technology, Melbourne, FL 32901.
[5] Qureshi K., “A Practical Performance
Comparison of Parallel Sorting Algorithms on
Homogeneous Network of Workstations”,
Department of Mathematics and Computer
Science, Kuwait University, Kuwait.
[6] Pasetto D., Akhriev A., “A comparative study
of parallel sort algorithms”, IBM Dublin
Research Laboratory, Dublin 15, Ireland.
[7] Elmenreich W., Ibounig T., Fehérvári I.,
“Robustness versus performance in sorting and
tournament algorithms”, Acta Polytechnica
Hungarica, vol. 6, no. 5, pp. 7 – 17, 2009.
[8] Heineman G.T., Pollice G., Selkow S., Algorithms
in a Nutshell, O'Reilly, California, 2009.
[9] Edahiro M., “Parallelizing fundamental
algorithms such as sorting on multi-core
processors for EDA acceleration”. Asia and
South Pacific Design Automation Conference,
pp. 230 – 233, 2009.
[10] Varman P.J., Scheufler S.D., Iyer B.R., Ricard
G.R., “Merging multiple lists on hierarchical-
memory multiprocessors”, Journal of Parallel
and Distributed Computing, vol. 12, no. 2, pp.
171 – 177, 1991.
[11] Tsigas P., Zhang Y., “A simple, fast parallel
implementation of quicksort and its
performance evaluation on Sun Enterprise
10000”, 11th Euromicro Conference on Parallel
Distributed and Network-based Processing, pp.
372 – 384, 2003.

[12] Sado K., Igarashi Y., “Some parallel sorts on a
mesh-connected processor array and their time
efficiency”, Journal of Parallel and Distributed
Computing, vol. 3, pp. 398 – 410, 1999.
[13] Bitton D., DeWitt D., Hsiao D.K., Menon J.,
“A taxonomy of parallel sorting”, ACM
Computing Surveys, vol. 16, no. 3, pp. 287 –
318, 1984.
[14] Makinde O.E., Adesina O.O., Aremu D.R.,
Agbo-Ajala O.O., “Performance evaluation of
sorting techniques”, In press.
[15] Blum R., “Professional Assembly Language”,
Wiley Publishing, Indianapolis, 2005.
