

3-way merge
I. 2-way merge sort:
Let’s start with merge sort:
a. In merge sort we divide our data into 2 parts, and keep dividing till our array size
becomes 1. Once it is 1, we begin the merge process and keep merging until we have a
complete sorted array.
b. In the recursive merge sort, the array of size n is split into 2 halves, so each recursive
call costs T(n/2), and since there are 2 such calls we get 2T(n/2). The merge step takes O(n) time.
The entire recurrence for 2-way merge sort is T(n) = 2T(n/2) + O(n).
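
To see where O(n log n) comes from, the recurrence above can be unrolled by hand. This is only a sketch: the constant c (the cost per element of the merge) and the assumption that n is a power of 2 are mine, added to keep the arithmetic simple.

```latex
\begin{aligned}
T(n) &= 2T(n/2) + cn \\
     &= 4T(n/4) + 2cn \\
     &= 8T(n/8) + 3cn \\
     &= 2^{i}\,T(n/2^{i}) + i\,cn \\
     &= n\,T(1) + cn\log_2 n \quad (\text{taking } i = \log_2 n) \\
     &= O(n \log n)
\end{aligned}
```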
II. 3-way merge sort.
- As the name suggests, we divide our array into 3 parts instead of 2.
- The logic for 3-way merge sort would be similar to that of 2-way, where we recursively
divide the array into 3 parts and keep dividing till we get size 1.
- The pseudo-code for 3-way merge sort can be written as:
MergeSort3(A[0..n-1]):
1. If n <= 1, return A[0..n-1]
2. Let k = floor(n/3) and m = floor(2n/3)
3. Return Merge3(MergeSort3(A[0..k-1]), MergeSort3(A[k..m-1]), MergeSort3(A[m..n-1]))

Merge3(P0, P1, P2):

1. Return Merge(P0, Merge(P1, P2))

Merge(X, Y) merges two sorted lists into one sorted list.
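
The pseudo-code above could be turned into runnable Python roughly as follows. This is a minimal sketch, not a tuned implementation: the function names merge, merge3 and merge_sort3 are my own, and I am assuming the brackets around n/3 and 2n/3 mean rounding down.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Take the smaller front element; <= keeps equal items in order.
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    # One of the lists is exhausted; copy whatever remains of the other.
    out.extend(left[i:])
    out.extend(right[j:])
    return out


def merge3(p0, p1, p2):
    """Merge three sorted lists using two 2-way merges."""
    return merge(p0, merge(p1, p2))


def merge_sort3(a):
    """Sort a list by splitting it into three parts (assumes floor division)."""
    n = len(a)
    if n <= 1:
        return list(a)
    k = n // 3        # end of the first part
    m = (2 * n) // 3  # end of the second part
    return merge3(merge_sort3(a[:k]), merge_sort3(a[k:m]), merge_sort3(a[m:]))
```

For example, merge_sort3([5, 2, 9, 1, 5, 6, 0]) returns a new sorted list and leaves the input untouched. Slicing copies the data at every level, which keeps the sketch close to the pseudo-code at the cost of extra memory.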

- Similar to 2-way merge, the recurrence for division would be 3T(n/3). It is so because we
make 3 recursive calls to MergeSort3, each on a list of size n/3.
- Assuming that P0, P1 and P2 are three sorted arrays of length n/3 each, we need
to find the time taken to merge them.
- The inner call Merge(P1, P2) takes n/3 + n/3 = 2n/3 time.
- The outer call Merge(P0, ...) takes n/3 + 2n/3 = n time.
- Adding the two gives 2n/3 + n = 5n/3. Since big-O notation ignores constant
factors, we get O(n) for our merge.
- The recurrence relation for 3-way merge sort is T(n) = 3T(n/3) + O(n).
- If we solve the recurrence relation using the master theorem (with a = 3, b = 3 and
f(n) = O(n)), we get a time complexity of O(n log_3 n).
- The recursion depth of 3-way merge sort (log to the base 3 of n) is smaller than that of
2-way merge sort (log to the base 2 of n), since the larger the base, the smaller the value
of the logarithm. However, log_3 n = log_2 n / log_2 3, so the two differ only by a constant
factor of about 1.58, and both algorithms are O(n log n).
- But we also need to keep in mind that as the number of parts increases, the number of
comparisons per merge step also increases. At each step in the merge we need to find which
of the lists has the smallest unused element.
- In 2-way merge we compare the first elements of the two lists against each other, which is
one comparison per output element. In 3-way merge, finding the smallest of three first
elements can take up to two comparisons.
- Based on the time complexity there isn’t much difference, but I do feel that if we have
a lot of data and just a single thread, then 3-way would take longer than 2-way merge
sort.
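
One way to check the point about comparisons is to count them directly. The sketch below is an assumption-laden experiment rather than a benchmark: the helper names (merge_count, sort2, sort3, count_comparisons) are mine, the 3-way version merges with two nested 2-way merges as in the pseudo-code, and the counter only tracks element comparisons, not copying.

```python
import random


def merge_count(left, right, counter):
    """Merge two sorted lists, adding one to counter[0] per comparison."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        counter[0] += 1
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out


def sort2(a, counter):
    """2-way merge sort that records comparisons in counter."""
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    return merge_count(sort2(a[:mid], counter), sort2(a[mid:], counter), counter)


def sort3(a, counter):
    """3-way merge sort (nested 2-way merges) that records comparisons."""
    n = len(a)
    if n <= 1:
        return list(a)
    k, m = n // 3, (2 * n) // 3
    inner = merge_count(sort3(a[k:m], counter), sort3(a[m:], counter), counter)
    return merge_count(sort3(a[:k], counter), inner, counter)


def count_comparisons(sort_fn, data):
    """Run sort_fn on data and return how many comparisons it made."""
    counter = [0]
    result = sort_fn(data, counter)
    assert result == sorted(data)  # sanity check on the sketch itself
    return counter[0]


if __name__ == "__main__":
    random.seed(0)
    data = [random.random() for _ in range(10_000)]
    print("2-way comparisons:", count_comparisons(sort2, data))
    print("3-way comparisons:", count_comparisons(sort3, data))
```

Using the 5n/3 merge cost from above as a rough guide, 3-way does about (5/3) / log_2 3, or roughly 1.05 times, the comparison work of 2-way, so I would expect the two counts to be close, with 3-way somewhat higher. The exact numbers depend on the input.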
III. Extending to k parts:
- In a k-way merge we have k sorted arrays, each of size m.
- The merging operation can be built from 2-way merges: pair up the arrays, merge each
pair, and repeat, so it takes about log k rounds to merge k arrays into 1 sorted array.
- Merging 2 arrays of size m takes O(2m), and each round touches all mk elements once, so
the overall time complexity is O(mk log k).
- So, if n = mk is the total number of elements being merged, the recurrence is
T(k) = 2T(k/2) + n + n, where each recursive call handles half of the arrays and
therefore half of the elements.
One n is for copying elements of the arrays to a structure.
The other n is for merging.
2T(k/2) is for the recursive calls.
Hence the time complexity of the algorithm is O(n log k), and since n = mk, that is O(mk log k).
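
The recurrence T(k) = 2T(k/2) + ... can be sketched in Python by recursively merging the two halves of the list of arrays. This is an illustration under my own naming (merge, merge_k), not a reference implementation, and it skips the separate copying step mentioned above.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out


def merge_k(arrays):
    """Merge k sorted lists by merging the two halves of `arrays` recursively."""
    k = len(arrays)
    if k == 0:
        return []
    if k == 1:
        return list(arrays[0])
    mid = k // 2
    # Each half holds about k/2 arrays: this is the 2T(k/2) term.
    return merge(merge_k(arrays[:mid]), merge_k(arrays[mid:]))
```

For example, merge_k([[1, 4, 7], [2, 5, 8], [3, 6, 9]]) gives the numbers 1 to 9 in order. Python's standard library also has heapq.merge, which merges sorted inputs with a heap instead of pairwise recursion.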

Additional resources I looked at that may be related to k-way merge sort and determining if 2-way is
better than k-way are:

1. Polyphase merge sort


- Basically, it is a variation of merge sort that divides the data into uneven lists, and
decreases the number of merge passes at every iteration of the main loop by merging
the data into larger runs.
- It is used for external sorting and can also be defined as a non-balanced k-way
merge which reduces the number of output files needed by reusing the emptied input
file or device as one of the output devices. This is most efficient if the number of
output runs in each output file is different.
- According to Wikipedia, if the number of files that need to be merged is less than 8,
then polyphase merge is better than the original/2-way merge sort.
- In conclusion for this one, we can decide which merge is better based on the number of files.

In conclusion, I feel like 2-way merge sort will work pretty well if I have larger data, since it
needs fewer comparisons than 3-way or k-way merge. That said, I don’t think 3-way or k-way merge
should be ruled out entirely. For example, we saw k-way merging put to use in polyphase merge sort.
(The 3-way merge that tools like GitHub and GitLab use to merge files shares the name, but it
reconciles two versions of a file against a common ancestor and is not a sorting algorithm.)

An additional thing I was looking at was parallelism. With parallelism, k-way merge sort could run
much faster than 2-way merge sort, since the k parts can be sorted independently on separate threads.

With all the above readings and observations, we can say that merge sort in general, be it 2-way, 3-way
or k-way, is better than most of the other sorting algorithms on the basis that it has a consistent running
time: O(n log n) in the best, average and worst cases. Overall, 2-way is better than 3-way or k-way.
