
Proceedings of the 2009 13th International Conference on Computer Supported Cooperative Work in Design

A Cooperative Sort Algorithm Based on Indexing


Guigang Zheng, Shaohua Teng, Wei Zhang, Xiufen Fu
Faculty of Computer, Guangdong University of Technology, Guangzhou, P.R. China
diamond2015@163.com, {shteng, weizhang, xffu}@gdut.edu.cn

978-1-4244-3535-7/09/$25.00 ©2009 IEEE
Authorized licensed use limited to: Ingrid Nurtanio. Downloaded on October 01,2020 at 02:16:13 UTC from IEEE Xplore. Restrictions apply.

Abstract

Based on the insertion, Quick-Sort and Merge-Sort algorithms, this paper proposes an improved sorting method that uses an index table and presents the corresponding parallel algorithm. Introducing an index table increases memory consumption but reduces the cost of moving records during sorting. Experiments show that the CPU time of the index-based sort algorithms is clearly lower than that of the original sort algorithms. Combined with parallel computing, the index-based Merge-Sort algorithm saves the waiting and processing time that a single-processor computer spends merging each pair of sub-sequences in turn, and therefore achieves better efficiency than the original Merge-Sort algorithm.

Keywords: Sort Algorithm, Parallel Computing, Merge-Sort, Index Table, Time Efficiency.

1. Introduction

Sorting is a fundamental problem in computing. The complexity of a sorting algorithm is closely related to the number of key comparisons, the number of record moves and the storage space needed during sorting. Many methods have been used to speed up sorting. One approach is to reduce the number of comparisons and record moves. Li [1] constructs a block structure type for an indexing-based sorting algorithm, but it increases the cost of index memory space. We use an index table to improve the efficiency of sorting algorithms. This paper also proposes an improved Merge-Sort algorithm based on parallel computing. Experiments show that the efficiency gain of this algorithm is directly proportional to the size of the data records.
In recent years, many researchers have put forward improvements and optimizations of sorting. Liu [2] proposes a concurrent Quick-Sort algorithm. It adopts the divide-and-conquer method, which divides a big problem into small sub-problems of the same structure and solves them separately. Because the subtasks are all similar, Liu uses parallel computing to process them. Lan [3] optimizes the bubble sort algorithm by alternating the scan direction to obtain a bidirectional sort, which reduces the number of comparisons. To improve the efficiency of data processing, Zhang [4] discusses a parallel computing method for aerial digital images based on a cluster computer; the experimental results demonstrate the feasibility of the method and the high efficiency of the system. Wang [5] exploits the property that the lists in a concurrent system have almost equal lengths after sorting, and presents an improved Merge-Sort algorithm based on parallel computing. Using an optimal parallel sorting algorithm for multisets, Zhong [6] presents an improved parallel selection sort algorithm. Because sorting attracts so much research attention, many sort algorithms and improvements to them have been proposed, which in turn leads to more applications of sorting.

2. Sort algorithm based on index table

2.1. Sort algorithm

Suppose the dataset is stored in an array. We establish an index table that has the same length as the dataset. Each entry of the index table refers to one array element and is called an index pointer. During sorting, index pointers are modified instead of moving records. After sorting, the data can be accessed in sorted order by following the index table. The number of index pointer exchanges depends on the order of the original data records. When a data record occupies more memory than an index pointer, exchanging index pointers takes less time than exchanging the data records directly. The larger the elements of the dataset are, the greater the benefit is. The abstract description of this algorithm is as follows:
1) Index table generation: given a dataset A[N], we define an index table of N elements and initialize it to index[i]=i, i=0, 1, ..., N-1.
2) By comparing the original data, we find the records that need to be exchanged.
3) Exchange the corresponding index pointers according to the comparison result.
4) If the sorting requirement is met, the sorting process ends; otherwise, go to 2).

2.2. C language implementation

The algorithm below is a binary ("dichotomy") insertion sort [7], implemented in C on top of an index table. It sorts in ascending order. Let h[i] be an element of the original data structure type. The index pointers are stored in index[i], where i corresponds to the position of the element in the original array, and h[i].key is the key. i=0, 1, ..., N-1.

void indexBIS (element *h, int *index)
{ int i, j, low, high, mid, m;
  // N is the number of records; it is assumed to be defined globally
  for (i=1; i<N; i++)
  {
    m=index[i]; // index[i] temporarily kept in m
    low=0; // the sorted part is index[0..i-1]
    high=i-1;
    while (low<=high) // look for insertion position between h[index[low]] and h[index[high]]
    {
      mid=(low+high)/2;
      if (strcmp (h[m].key, h[index[mid]].key)>=0)
        low=mid+1; // insertion point is in high half zone
      else
        high=mid-1; // insertion point is in low half zone
    }
    for (j=i-1; j>=low; j--)
      index[j+1]=index[j]; // index pointers move backwards
    index[high+1]=m; // change index pointer
  }
}

After the program executes, the records can be read in sorted order by following the index pointers in sequence.

3. A cooperative Merge-Sort algorithm

3.1. Parallel computer and cooperative computing

Cooperative computing means that, in a parallel computer, an application is decomposed into multiple subtasks that are distributed to different processors. The processors carry out the subtasks cooperatively and concurrently, so it takes less time to solve the problem.
A parallel computer consists of two or more processors connected by an interconnection network. These processors communicate with each other.
The structure of an SMP parallel computer system [8] is shown in figure 1.

Fig.1. Typical SMP system architecture

In which:
Microprocessor (CPU & Cache): Direct input-output (I/O)
System bus: all microprocessors connect and communicate with each other through the system bus.
Memory (Mem): Memory is composed of multiple memory modules, as shown in figure 1. These modules and nodes are distributed symmetrically on the two sides of the system bus.

3.2. Parallel algorithm

Based on the index sort algorithm, an improved parallel Merge-Sort algorithm is proposed in this paper. Suppose that the dataset to be sorted is A[N], where N is the number of elements. First, the records can be regarded as N ordered sub-sequences of length 1. By merging every two sub-sequences, we get ⌈N/2⌉ ordered sub-sequences of length 2 or 1. We repeatedly merge every two sub-sequences in the same way until we get one ordered sequence of length N. Every processor in the parallel system can merge two sub-sequences independently, so compared with a single processor the waiting time can be reduced or avoided. While a processor merges two sub-sequences, index pointers are exchanged in place of data records. Based on this idea, the abstract description of the algorithm is as follows:
1) Given an array A[N], the first step is to merge every two sub-sequences using ⌈N/2⌉ microprocessors. If the time spent in merging two sub-sequences is T, we save (⌈N/2⌉-1)T in contrast with a single microprocessor.
2) We get ⌈N/2⌉ sub-sequences from 1). Then every two sub-sequences are merged again, with ⌈N/4⌉ processors working in parallel. If the time spent in merging two sub-sequences is T1, we save (⌈N/4⌉-1)T1 compared with a single microprocessor.
3) In the same way, the ⌈N/4⌉, ⌈N/8⌉, ... sub-sequences produced by the following passes are handled with the method of 1) and 2).
4) Finally, two ordered sub-sequences are left. One processor merges them into a single ordered sequence of length N.
In practice, it is not feasible to provide enough microprocessors for every merge: the number of elements in the sequence is not known in advance, and the hardware cost would be too high. Even with only a few microprocessors, however, the parallel algorithm runs noticeably more efficiently. Accordingly, the architecture of the index-based Merge-Sort algorithm is shown in figure 2 (P stands for a processor and T for a sub-sequence).
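The passes described in Section 3.2 can be sketched in C as a bottom-up merge sort over an index table. This is a minimal sketch under stated assumptions, not the authors' implementation: the `record` type and the names `merge_runs` and `index_merge_sort` are hypothetical, and the pair merges run one after another on a single processor. The inner loop only marks which merges are independent of each other and could therefore be handed to separate processors.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type: a string key stands in for a large record. */
typedef struct { char key[32]; } record;

/* Merge the sorted index runs idx[lo..mid) and idx[mid..hi) into out[lo..hi).
   Only index pointers move; the records in h stay where they are. */
static void merge_runs(const record *h, const int *idx, int *out,
                       int lo, int mid, int hi)
{
    int i = lo, j = mid, k = lo;
    while (i < mid && j < hi) {
        /* <= keeps equal keys in their original order (stable merge) */
        if (strcmp(h[idx[i]].key, h[idx[j]].key) <= 0)
            out[k++] = idx[i++];
        else
            out[k++] = idx[j++];
    }
    while (i < mid) out[k++] = idx[i++];
    while (j < hi)  out[k++] = idx[j++];
}

/* Bottom-up merge sort on an index table of n entries.
   Returns the number of passes, or -1 if the scratch table cannot be allocated. */
static int index_merge_sort(const record *h, int *index, int n)
{
    int *scratch, *src, *dst;
    int passes = 0;

    assert(n >= 0);
    scratch = malloc((size_t)(n > 0 ? n : 1) * sizeof *scratch);
    if (scratch == NULL) return -1;
    src = index;
    dst = scratch;

    for (int i = 0; i < n; i++) index[i] = i;   /* index[i] = i */

    for (int width = 1; width < n; width *= 2) {
        /* Each iteration below works on its own range [lo, hi), so the
           iterations of one pass are independent of each other. Here they
           simply run sequentially. */
        for (int lo = 0; lo < n; lo += 2 * width) {
            int mid = lo + width < n ? lo + width : n;
            int hi  = lo + 2 * width < n ? lo + 2 * width : n;
            merge_runs(h, src, dst, lo, mid, hi);
        }
        int *tmp = src; src = dst; dst = tmp;   /* swap source and target */
        passes++;
    }
    if (src != index)                            /* result is in scratch */
        memcpy(index, src, (size_t)n * sizeof *index);
    free(scratch);
    return passes;
}
```

After the call, visiting h[index[0]], h[index[1]], ... reads the records in ascending key order while the array h itself is unchanged. For N = 100 the outer loop makes 7 passes, since ⌈log₂ 100⌉ = 7.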

Fig.2. An architecture of parallel Merge-Sort algorithm based on indexing

3.3. Time complexity of the algorithm

In the parallel computation above, the hardware determines the efficiency of the Merge-Sort algorithm to some degree. If the length of the dataset to be sorted is N, the best case in theory is ⌈N/2⌉ processors. Therefore, the closer the number of processors is to ⌈N/2⌉, the higher the efficiency will be.
Time complexity of the algorithm:
1) The time complexity of the original Merge-Sort algorithm is O(n log₂ n). With parallel computing, the repeated merging that a single processor performs one merge after another is greatly reduced, because the merges of each pass run concurrently. Counted in merge passes, the time complexity is then O(log₂ n).
2) In parallel computing, each processor needs to establish private space when it merges the sub-sequences. The time spent in establishing this auxiliary space is O(m1+m2) (m1 and m2 are the lengths of the two sub-sequences).
3) Meanwhile, the communication delay among the processors should be considered in parallel computing. In a dedicated PC cluster environment, however, this delay is less than 10 μs, which is small enough to be neglected in the time complexity.
4) Therefore, the time complexity of the parallel Merge-Sort algorithm is O(log₂ n) + O(m1+m2). The total time complexity is O(log₂ n).
In the parallel Merge-Sort algorithm, a single processor does not have to execute the merges one after another, so waiting time is reduced or avoided. Although the execution time of each processor is less than that of the original Merge-Sort algorithm, the sum of the time spent on all processors is greater.

4. Experiment and analysis

4.1. Question

To verify the efficiency of the index-based Merge-Sort algorithm, we carry out several experiments. We use an array A[N] to store the original records, which are generated randomly. N1 is the length of the dataset A[N], and we set N1 = 100. N2 is the storage size of a single data record in the dataset; the length of every data record is less than N2. We pick 5, 10, 100, 200 and 500 as values of N2 for the comparison experiments. We use the Quick-Sort, insertion and Merge-Sort algorithms in these experiments.

4.2. Index-based sort algorithm

To measure CPU time, each sort algorithm is run 10000 times on the same dataset. The CPU time of every algorithm is shown in Table 1.

Table.1. Executing CPU time in every algorithm

The length of the dataset is N1 (N1=100), and the length of each data record is less than N2. The time cost of each algorithm is shown in Table 1. The results of the experiments are plotted as histograms in figures 3 to 5, which give a more intuitive picture.

Fig.3. Quick and index-quick sort

Fig.4. Merging and index-merging sort
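The measurement procedure of Section 4.2 (the same dataset sorted repeatedly, CPU time accumulated over all runs) can be sketched as follows. This is an illustrative sketch, not the authors' test program: the record layout, the fixed random seed and the names `sort_records`, `sort_index`, `fill_random` and `time_sorts` are assumptions, and straight insertion sort is used for both variants only to keep the example short. CPU time is read with the standard C function clock().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define N1 100   /* number of records, as in the experiments */
#define N2 200   /* bytes per record; one of the sizes tried in the text */

typedef struct { char key[N2]; } record;   /* hypothetical record layout */

/* Straight insertion sort that moves whole records. */
static void sort_records(record *a, int n)
{
    for (int i = 1; i < n; i++) {
        record tmp = a[i];
        int j = i - 1;
        while (j >= 0 && strcmp(a[j].key, tmp.key) > 0) {
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = tmp;
    }
}

/* The same algorithm, but only index pointers move. */
static void sort_index(const record *a, int *index, int n)
{
    for (int i = 0; i < n; i++) index[i] = i;
    for (int i = 1; i < n; i++) {
        int m = index[i];
        int j = i - 1;
        while (j >= 0 && strcmp(a[index[j]].key, a[m].key) > 0) {
            index[j + 1] = index[j];
            j--;
        }
        index[j + 1] = m;
    }
}

/* Fill n records with random lowercase keys (fixed seed for repeatability). */
static void fill_random(record *a, int n)
{
    srand(12345u);
    for (int i = 0; i < n; i++) {
        for (int k = 0; k < N2 - 1; k++)
            a[i].key[k] = (char)('a' + rand() % 26);
        a[i].key[N2 - 1] = '\0';
    }
}

/* CPU seconds for `repeats` runs of each variant; data must hold N1 records. */
static void time_sorts(const record *data, int repeats,
                       double *direct_s, double *indexed_s)
{
    static record work[N1];
    static int index[N1];
    clock_t t0;

    assert(repeats > 0);
    t0 = clock();
    for (int r = 0; r < repeats; r++) {
        memcpy(work, data, sizeof work);   /* restore the unsorted input */
        sort_records(work, N1);
    }
    *direct_s = (double)(clock() - t0) / CLOCKS_PER_SEC;

    t0 = clock();
    for (int r = 0; r < repeats; r++)
        sort_index(data, index, N1);       /* data itself is never modified */
    *indexed_s = (double)(clock() - t0) / CLOCKS_PER_SEC;
}
```

Calling time_sorts(data, 10000, &direct, &indexed) after fill_random(data, N1) mirrors the 10000 repetitions used in the text. The direct variant also pays for one memcpy per run to restore the unsorted input, so its reported time slightly overstates the pure sorting cost.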

Fig.5. Insertion and index-insertion sort

Based on Table 1 and figures 3, 4 and 5, we can conclude:
1) The time spent by each index-based sorting algorithm is clearly less than that of the original sorting algorithm.
2) As the records of the dataset grow larger, the difference in time consumption increases.
3) The index-based insertion sort algorithm is more efficient than the original insertion sort algorithm. When the records in the dataset become large, the index-based Merge-Sort algorithm has the best performance.

We obtain table 2 by comparing the time cost of every sort algorithm in table 1.

Table.2. Executing CPU time difference in every algorithm

Table 2 lists the time difference between each original sorting algorithm and the corresponding sorting algorithm based on the index table. Plotting table 2 as line charts gives figure 6, figure 7 and figure 8, which are more intuitive.

Fig.6. Time difference between quick and index-quick

Fig.7. Time difference between merging and index-merge

Fig.8. Time difference between insertion and index-insertion

Analyzing figure 6, figure 7 and figure 8, we can see:
1) As the records of the dataset grow larger, the curves show a continuously increasing time difference.
2) Figures 6 to 8 show that the merging algorithm based on the index table is more efficient than the original merging algorithm.

4.3. Index-based Merge-Sort algorithm under the parallel computing

Experiments on parallel computing need two or more processors. Because of the limitations of our laboratory conditions, we use two processors in the parallel computing experiments. It must be pointed out that, in theory, the parallel merging algorithm reaches its maximum efficiency [7] only when enough processors are available. For instance, with 50 microprocessors, a sequence of length 100 would be the best choice in this experiment.
To measure CPU time, each sort algorithm is run 10000 times on the dataset. The CPU time of every algorithm is shown in table 3.

Table.3. Executing CPU time in every algorithm

The length of the dataset is N1 (N1=100), and the length of each data record is less than N2. The time cost of each algorithm is shown in table 3. The results of these experiments are plotted as a histogram in figure 9. (Note: two microprocessors are used in the parallel Merge-Sort algorithm.)

Fig.9. Merging, index-merging, parallel index-merging

From figure 9, we get:
• The index-merging sort algorithm based on parallel computing has the best efficiency.
Comparing the time cost of every sorting algorithm in table 3, we obtain table 4.

Table.4. Executing CPU time difference in every algorithm

Table 4 shows the time difference between the original sorting algorithm and the sorting algorithms based on the index table. Plotting table 4 as a line chart gives figure 10.

Fig.10. Time difference among merge, index-based merge and parallel index-merge algorithm

Analyzing figure 10, we have:
1) The time saved by the Merge-Sort algorithm based on parallel computing depends on the size of the records in the dataset.
2) The efficiency of parallel computing depends on the number of processors.

In short, for the index-based sort algorithm, the bigger the data records are, the greater the time difference (the saved time) is. For the Merge-Sort algorithm based on parallel computing, the more elements the dataset has, the greater the time difference is, provided that enough processors are available.

5. Conclusion
The paper demonstrates that the time efficiency of a sort algorithm based on an index table is directly proportional to the size of the data records. With the help of the index table, the paper proposes an improved Merge-Sort algorithm based on parallel computing. The experiments show the effect of the algorithm, which was still clear even though we used only two processors (the minimum required for parallel computing). The processors in the parallel algorithm need additional private space when they merge sub-sequences. The index table also needs auxiliary space in any sorting algorithm based on it. The experiments show that the parallel Merge-Sort algorithm based on the index table performs better than the original sorting algorithm.
Sorting is widely applied in scientific research and practical engineering. Therefore, improving sort algorithms is of great significance.

Acknowledgement
This work was supported by Guangdong Provincial
Natural Science Foundation (Grant No. 06021484),
Guangdong Provincial science and technology project
(Grant No.2005B10101077) and Yuexiu Zone
Guangzhou city science & technology project (Grant
No. 2007-GX-075).

References

[1] Y. Li, D. Wang, G. Wang, et al., “Block Sorting Index-based Techniques for Local Alignment Searches on Biological Sequences”, Computer Science, 2005, 32(12), pp.159-163, 205.
[2] N. Liu, Z. Tong, “Analysis of improving the C language quick sort algorithm”, Applications of the Computer System, 2008, 1, pp.113-116.
[3] C. Lan, “Optimizing of Bubble Sort Algorithm”, Ordnance Industry Automaton, 2006, 25(12), pp.50-52.
[4] J. Zhang, T. Ke, M. Sun, “The Parallel Computing Based on Cluster Computer in the Processing of Mass Aerial Digital Images”, 2008 International Symposiums on Information Processing, Chengdu, China, Jul 2008, pp.389-393.
[5] W. Wang, Y. Qiou, “A new algorithm for parallel Mergesorting”, Computer Engineering and Applications, 2005, 41(5), pp.71-72, 81.
[6] C. Zhong, G. Chen, “An Optimal Parallel Sorting Algorithm for Multisets”, Journal of Computer Research and Development, 2003, 40(2), pp.336-341.
[7] W. Yan, W. Wu, “Data Structure (C language)”, Tsinghua University Press, April 1997.
[8] L. Zhang, “Parallel computing introduction”, Tsinghua University Press, Nov. 2006.
