
LINEAR ARRAY

JYOTIKA JAIN

MTECH 2009 IS 13
ABV-IIITM Gwalior
Gwalior-474 010, MP, India

September 12, 2010

OUTLINE

- Introduction
- Odd-Even Transposition Sort
- Merge-Splitting Sort
- Merge Sort on a Pipeline
- Enumeration Sort

INTRODUCTION

- Parallel sorting algorithms for SIMD machines in which the processors are interconnected to form a linear array
- Processor P_i is linked by a communication path to processors P_{i-1} and P_{i+1}
- No other links are available

ODD-EVEN TRANSPOSITION SORT

- Number of processors = number of elements in the input sequence
- Algorithm:
  for k = 1 to ⌈n/2⌉ do
    1. for i = 1, 3, ..., 2⌊n/2⌋ - 1 do in parallel
         if y_i > y_{i+1} then y_i ↔ y_{i+1} end if
       end for
    2. for i = 2, 4, ..., 2⌊(n-1)/2⌋ do in parallel
         if y_i > y_{i+1} then y_i ↔ y_{i+1} end if
       end for
  end for
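
The two alternating phases can be sketched sequentially in Python. This is a minimal simulation, not a parallel implementation: the function name `odd_even_transposition_sort` is an assumption made for illustration, and the inner loop visits disjoint pairs one after another, which gives the same result as the "do in parallel" steps on the slide.

```python
def odd_even_transposition_sort(seq):
    """Sequential simulation of odd-even transposition sort on a linear array."""
    y = list(seq)
    n = len(y)
    # Outer loop: k = 1 .. ceil(n/2)
    for _ in range((n + 1) // 2):
        # start = 0 is the odd phase (1-based i = 1, 3, ...),
        # start = 1 is the even phase (1-based i = 2, 4, ...).
        for start in (0, 1):
            # Pairs (i, i+1) within one phase are disjoint, so visiting them
            # in sequence matches one parallel compare-exchange step.
            for i in range(start, n - 1, 2):
                if y[i] > y[i + 1]:
                    y[i], y[i + 1] = y[i + 1], y[i]
    return y


print(odd_even_transposition_sort([7, 6, 5, 4, 3, 2, 1]))
```

The input used here is the reversed sequence from the example slide, which needs every one of the compare-exchange phases.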

EXAMPLE

Figure: S=7,6,5,4,3,2,1

ANALYSIS

- Running time: t(n) = O(n)
- Cost: c(n) = t(n) × p(n) = O(n) × n = O(n^2)

MERGE-SPLITTING SORT

- Each of the p processors P_i holds a subsequence S_i of n/p input elements
- Algorithm:
  Preprocessing step
    for i = 1, 2, ..., p do in parallel
      processor P_i sorts S_i using a sequential algorithm
    end for
  End of preprocessing
  for k = 1 to ⌈p/2⌉ do
    1. for i = 1, 3, ..., 2⌊p/2⌋ - 1 do in parallel
         1.1 merge S_i and S_{i+1} into a sorted subsequence A_i
         1.2 S_i ← first n/p elements of A_i
         1.3 S_{i+1} ← second n/p elements of A_i
       end for
    2. for i = 2, 4, ..., 2⌊(p-1)/2⌋ do in parallel
         2.1 merge S_i and S_{i+1} into a sorted subsequence A_i
         2.2 S_i ← first n/p elements of A_i
         2.3 S_{i+1} ← second n/p elements of A_i
       end for
  end for
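
The merge-split steps above can be sketched as a sequential Python simulation. The function name `merge_splitting_sort` and the requirement that p divides n evenly are assumptions made for this sketch. The preprocessing here uses Python's built-in `sorted` rather than the heapsort mentioned in the analysis; any sequential sort would do for illustration.

```python
import heapq


def merge_splitting_sort(seq, p):
    """Sequential simulation of merge-splitting sort with p processors."""
    n = len(seq)
    if p < 1 or n % p != 0:
        raise ValueError("this sketch assumes p >= 1 and p divides n")
    m = n // p  # each processor holds n/p elements

    # Preprocessing: every processor sorts its own block S_i.
    S = [sorted(seq[i * m:(i + 1) * m]) for i in range(p)]

    # Main loop: k = 1 .. ceil(p/2), each with an odd and an even phase.
    for _ in range((p + 1) // 2):
        for start in (0, 1):
            for i in range(start, p - 1, 2):
                # Merge the two sorted neighbours into A_i, then split it.
                A = list(heapq.merge(S[i], S[i + 1]))
                S[i], S[i + 1] = A[:m], A[m:]

    # Concatenating the blocks in processor order gives the sorted sequence.
    return [x for block in S for x in block]


print(merge_splitting_sort([12, 9, 10, 11, 7, 4, 3, 6, 2, 1, 8, 5], 4))
```

With p = n this reduces to the odd-even transposition sort, since each block then holds a single element.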
EXAMPLE

Figure: Sort (12,9,10,11,7,4,3,6,2,1,8,5)

ANALYSIS

- Heapsort is used in the preprocessing step, which takes O((n/p) log(n/p)) time
- Total running time:
  t(n) = O((n/p) log(n/p)) + O(n) = O((n log n)/p) + O(n)
- Cost of the algorithm:
  c(n) = t(n) × p = O(n log n) + O(np)
  which is optimal for p ≤ log n, since then O(np) = O(n log n)

MERGING SORT ON A PIPELINE

Do steps 1, 2 and 3 in parallel

1. P_1 performs the following steps
   1.1 read x_1 from q_1
   1.2 j ← 0
   1.3 for i = 2 to n do
         - place x_{i-1} on q_{2+j}
         - read x_i from q_1
         - j ← (j + 1) mod 2
       end for
   1.4 place x_n on q_3

STEP 2
for i = 2 to r do in parallel
  1. j ← 0
  2. k ← 1
  3. while k ≤ n do
       if q_{2i-2} is 2^{i-2} elements long and q_{2i-1} contains one element then
         3.1 for m = 1 to 2^{i-1} do
               P_i compares the first element in q_{2i-2} to the first element in q_{2i-1},
               removes the larger of the two and places it on q_{2i+j}
             end for
         3.2 j ← (j + 1) mod 2
         3.3 k ← k + 2^{i-1}
       end if
     end while
end for
STEP 3

if q_{2r} is 2^{r-1} elements long and q_{2r+1} contains one element then
  for m = 1 to 2^r do
    P_{r+1} compares the first element in q_{2r} to the first element in q_{2r+1},
    removes the larger of the two and places it on q_{2(r+1)}
  end for
end if
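
The data flow through the pipeline can be sketched in Python, one stage at a time. This is a functional sketch only: it does not model the queues q_1, q_2, ... or the overlap in time between processors, and it assumes n is a power of two (n = 2^r), as the loop bounds in steps 2 and 3 suggest. The names `merge_larger_first` and `pipeline_merge_sort` are hypothetical. Because every comparison removes the larger element first, the result comes out largest first.

```python
def merge_larger_first(a, b):
    """Merge two runs that each have their largest element at the front."""
    a, b = list(a), list(b)
    out = []
    while a or b:
        # Take from a when b is exhausted or a's head is at least as large.
        if not b or (a and a[0] >= b[0]):
            out.append(a.pop(0))
        else:
            out.append(b.pop(0))
    return out


def pipeline_merge_sort(xs):
    """Stage-by-stage view of the merge pipeline (no timing model)."""
    n = len(xs)
    if n == 0 or n & (n - 1):
        raise ValueError("this sketch assumes n is a power of two")
    r = n.bit_length() - 1

    # P_1 sends the inputs alternately to its two output queues,
    # which amounts to producing runs of length 1.
    runs = [[x] for x in xs]

    # P_2 .. P_{r+1}: each stage merges consecutive pairs of runs,
    # doubling the run length.
    for _ in range(r):
        runs = [merge_larger_first(runs[t], runs[t + 1])
                for t in range(0, len(runs), 2)]
    return runs[0]


print(pipeline_merge_sort([1, 5, 3, 2, 8, 7, 4, 6]))
```

In the real pipeline each processor starts merging as soon as its first input queue holds a full run and the second holds one element, so the stages overlap in time instead of running one after another.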

EXAMPLE

Figure: Sort (1,5,3,2,8,7,4,6)

Figure: Sort (1,5,3,2,8,7,4,6)

ANALYSIS

- Running time: t(n) = O(n)
- Cost: c(n) = t(n) × p(n) = O(n) × (log n + 1) = O(n log n)

ENUMERATION SORT

ALGORITHM

1. for i = 1 to n do in parallel
     P_i sets its register C to 1
   end for
2. for k = 1 to 2n do
   2.1 if k ≤ n then h ← 1 else h ← k - n end if
   2.2 for i = h to n do in parallel
         if registers X and Y of P_i are nonempty and X < Y
         then processor P_i increments its register C by 1
         end if
       end for

CONTD..

   2.3 for i = h to n - 1 do in parallel
         if register Y of P_i is nonempty then processor P_i shifts the integer in it
         to P_{i+1}, which stores it in its own register Y end if
       end for
   2.4 if k ≤ n then processors P_1 and P_k read the next integer x_k from the
       input queue and store it in their registers Y and X, respectively end if
   2.5 if k > n then processor P_{k-n} stores the content of its register X in
       register Z of P_j, where j is the value stored in its register C end if
end for

CONTD..

3. for k = 1 to n do
   3.1 processor P_n places the contents of its register Z on the output queue
   3.2 for i = k to n - 1 do in parallel
         processor P_i shifts the contents of its register Z to register Z of P_{i+1}
       end for
   end for
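
Steps 1 to 3 can be traced with a small register-level simulation in Python. This is a sketch under stated assumptions: the function name `enumeration_sort` is hypothetical, registers are modelled as 1-indexed lists with `None` standing for "empty", and the parallel shifts are carried out right to left so that no value is overwritten before it has moved. The comparison X < Y is taken directly from step 2.2.

```python
def enumeration_sort(xs):
    """Register-level simulation of enumeration sort on a linear array."""
    n = len(xs)
    # Index 0 is unused so that P_1 .. P_n map to indices 1 .. n.
    X = [None] * (n + 1)
    Y = [None] * (n + 1)
    Z = [None] * (n + 1)
    C = [1] * (n + 1)  # step 1: every register C starts at 1

    for k in range(1, 2 * n + 1):  # step 2
        h = 1 if k <= n else k - n  # 2.1
        for i in range(h, n + 1):  # 2.2: compare own value X with passing Y
            if X[i] is not None and Y[i] is not None and X[i] < Y[i]:
                C[i] += 1
        for i in range(n - 1, h - 1, -1):  # 2.3: shift Y one processor right
            if Y[i] is not None:
                Y[i + 1], Y[i] = Y[i], None
        if k <= n:  # 2.4: P_1 and P_k read x_k
            Y[1] = xs[k - 1]
            X[k] = xs[k - 1]
        else:  # 2.5: P_{k-n} sends X to register Z of P_j, j = C
            Z[C[k - n]] = X[k - n]

    out = []
    for k in range(1, n + 1):  # step 3
        out.append(Z[n])  # 3.1: P_n writes Z to the output queue
        for i in range(n - 1, k - 1, -1):  # 3.2: shift Z one processor right
            Z[i + 1] = Z[i]
    return out


print(enumeration_sort([8, 9, 7]))
```

For the input (8, 9, 7) the registers C end up as 2, 1, 3, so the Z registers hold 9, 8, 7 and P_n emits the smallest value first. With repeated values two processors compute the same C and write to the same register Z, which is why the algorithm is restricted to distinct inputs.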

EXAMPLE

Figure: Sort (8,9,7)

Figure: Sort (8,9,7)

ANALYSIS

- Running time: t(n) = O(n)
- Cost: c(n) = t(n) × p(n) = O(n) × n = O(n^2)
- Cannot handle sequences with repeated numbers

THANK YOU
