BJORN POONEN
1. Introduction
Shellsort is a general-purpose sorting algorithm that was invented by Shell in 1959 [14].
Empirical results show that it is competitive with the fastest sorting algorithms, especially
when N , the number of elements to be sorted, is not too large. The algorithm is very simple
to describe. If h ≥ 1, to h-sort a list L[1], . . . , L[N ] is to apply insertion sort to each of the
sublists L[i], L[i + h], L[i + 2h], . . . , where 1 ≤ i ≤ h. A list for which these sublists are in
order is said to be h-ordered. Shellsort sorts a list by performing an h1 -sort, an h2 -sort, . . . ,
and finally an hm -sort, where h1 , h2 , . . . , hm are positive integers called the increments and
hm = 1. The choice hm = 1 guarantees that the resulting list is in order, because 1-sorting
a list is the same as sorting it using insertion sort.
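The description above translates directly into code. The sketch below (0-based indexing, and an arbitrary illustrative increment sequence ending in 1) is the definition made executable, not a tuned implementation:

```python
def h_sort(L, h):
    """Insertion-sort each of the h interleaved sublists L[i::h], in place."""
    for j in range(h, len(L)):
        x = L[j]
        k = j
        # Shift larger elements of x's sublist right, in steps of h.
        while k >= h and L[k - h] > x:
            L[k] = L[k - h]
            k -= h
        L[k] = x

def shellsort(L, increments):
    """h-sort for each increment in turn; the final increment 1 sorts the list."""
    assert increments[-1] == 1
    for h in increments:
        h_sort(L, h)

data = [5, 2, 9, 1, 7, 3, 8, 6, 4]
shellsort(data, [4, 2, 1])   # data is now [1, 2, ..., 9]
```

Since the last pass is a plain insertion sort, correctness never depends on the earlier increments; they only reduce the work the final pass must do.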
As far as complexity is concerned, insertion sort is a series of compare-exchange operations
between pairs of adjacent elements, where a compare-exchange consists of comparing two
elements and exchanging them, if necessary, to put them in order. Thus Shellsort also can
be considered a series of compare-exchanges, between pairs of elements L[i] and L[i + h] for
various h’s (the increments).
Shaker-sort, proposed by Incerpi and Sedgewick [5], is another general-purpose sorting
algorithm which sorts a list in a number of passes, each having an associated increment h.
To h-shake a list L[1], . . . , L[N ] is to perform compare-exchanges on the following pairs of
positions, in the order listed:
(1, 1 + h), (2, 2 + h), . . . , (N − h − 1, N − 1), (N − h, N ),
(N − h − 1, N − 1), (N − h − 2, N − 2), . . . , (2, 2 + h), (1, 1 + h).
Note that the distance between the elements of each pair is h. Shaker-sort sorts a list by
performing an h1 -shake, an h2 -shake, . . . , and an hm -shake, for some sequence h1 , h2 , . . . , hm .
Instead of assuming hm = 1, which would not guarantee that the result was sorted, we assume
that insertion sort is applied at the end, after the hm -shake. (Alternatively, we could perform
1-shakes repeatedly at the end until the list is sorted.)
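In code (again 0-based, so the paper's pair (i, i + h) becomes indices (i − 1, i − 1 + h)), one shake and the overall algorithm might look like the following sketch; the clean-up pass at the end is a plain insertion sort, as assumed above:

```python
def h_shake(L, h):
    """Compare-exchange pairs at distance h: a left-to-right pass, then right-to-left."""
    n = len(L)
    forward = range(n - h)                  # pairs (0, h), ..., (n-h-1, n-1)
    backward = range(n - h - 2, -1, -1)     # pairs (n-h-2, n-2), ..., (0, h)
    for i in list(forward) + list(backward):
        if L[i] > L[i + h]:
            L[i], L[i + h] = L[i + h], L[i]

def insertion_sort(L):
    for j in range(1, len(L)):
        x, k = L[j], j
        while k > 0 and L[k - 1] > x:
            L[k] = L[k - 1]
            k -= 1
        L[k] = x

def shaker_sort(L, increments):
    """h-shake for each increment, then insertion sort to guarantee a sorted result."""
    for h in increments:
        h_shake(L, h)
    insertion_sort(L)
```

Unlike Shellsort's h-sort, a single h-shake does not fully h-sort the list, which is why the final insertion sort (or repeated 1-shakes) is needed.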
2. Shellsort-type Algorithms
We now describe the general class of algorithms we will consider. A sorting algorithm is
a Shellsort-type algorithm if it satisfies the following:
(1) The algorithm sorts a list L[1], . . . , L[N ] in a number of passes, each with an associ-
ated increment h. During a pass with increment h, the algorithm performs compare-
exchanges (one at a time) on pairs L[i], L[i + h] for 1 ≤ i ≤ N − h.
(2) At least Ω(N ) comparisons are done during each pass with increment h < N/2.
(3) If the list is k-ordered at the beginning of a pass, the list is k-ordered at the end of
the pass also.
The sequence of increments need not be decreasing, and it need not end in 1, so long as the
algorithm actually sorts. It may contain duplicates. The sequence also may depend on N ,
but it must be fixed before looking at the list to be sorted.
Because of condition 3, it is not obvious that even Shellsort is a Shellsort-type algorithm.
(That Shellsort satisfies 3 is Theorem K in Section 5.2.1 of [6].) Even more surprising is
that Shaker-sort satisfies 3. To prove this, we will use the following easy result, which is
Exercise 21 in Section 5.3.4 of [6].
Proposition 1. If L, L′ are lists of the same length N , and L[i] ≤ L′[i] for all i, then the
inequalities still hold after the same series of compare-exchanges is performed on both lists.
Proof. It suffices to consider a single compare-exchange, say on (j, k), with j < k. Let M
and M ′ be the lists after the compare-exchange. For i ≠ j, k, M [i] = L[i] ≤ L′[i] = M ′[i].
Also

    M [j] = min{L[j], L[k]} ≤ min{L′[j], L′[k]} = M ′[j],
    M [k] = max{L[j], L[k]} ≤ max{L′[j], L′[k]} = M ′[k],

so the result follows.
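Proposition 1 is easy to sanity-check empirically; the sketch below applies the same random series of compare-exchanges to a dominated list L and a dominating list L′ and verifies that the coordinatewise inequalities survive:

```python
import random

def compare_exchange(L, j, k):
    """Compare positions j < k and exchange if out of order."""
    if L[j] > L[k]:
        L[j], L[k] = L[k], L[j]

random.seed(1)
N = 50
L  = [random.randint(0, 100) for _ in range(N)]
Lp = [x + random.randint(0, 10) for x in L]     # L[i] <= Lp[i] for all i
for _ in range(2000):
    j, k = sorted(random.sample(range(N), 2))
    compare_exchange(L, j, k)                   # same series applied to both lists
    compare_exchange(Lp, j, k)
assert all(a <= b for a, b in zip(L, Lp))       # inequalities still hold
```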
Proposition 2. Shellsort and Shaker-sort are Shellsort-type algorithms.
Proof. It is clear that condition 1 is satisfied, and that 2 is satisfied by Shaker-sort. Condition
2 is also true for Shellsort, because N − h comparisons are necessary in an h-sort, even if
only to check that the list is h-ordered. (This is Ω(N ) when h < N/2.) So it remains to
show that 3 holds.
Let L denote the k-ordered list at the beginning of a pass with increment h. Consider the
following 2 by (N + k) array, whose first row begins with k copies of −∞ and whose second
row ends with k copies of +∞ (so each L[i] in the first row sits above L[i + k] in the second):

    −∞   · · ·  −∞    L[1]      · · ·  L[N − k]  L[N − k + 1]  · · ·  L[N ]
    L[1] · · ·  L[k]  L[k + 1]  · · ·  L[N ]     +∞            · · ·  +∞
Since L is k-ordered, each entry in the first row is less than or equal to the entry below it.
Consider performing an h-sort on both rows or an h-shake on both rows. This has the same
effect as performing it on the two copies of L within the array, because the −∞ and +∞
terms stay where they are. But this can be done by a series of compare-exchanges, so by
Proposition 1, in the resulting array each entry of the first row is less than or equal to the
entry below it. So the resulting list is k-ordered.
Proposition 2 holds for Shellsort even if the method used to do the h-sort is not insertion
sort, as long as conditions 1 and 2 still hold. This is because the result of an h-sort done by
compare-exchanges at distance h is independent of the method. Shellsort-type algorithms
have the following property, which will be used in proving our lower bound theorems.
Lemma 1. If a Shellsort-type algorithm is applied to a list which is h-ordered for all incre-
ments h ≥ β, where β is some fixed cutoff, then the passes with increments h ≥ β do nothing
to the list.
Proof. By condition 3, the list will be h-ordered at the beginning of every pass, for each
increment h ≥ β. So the compare-exchanges at distance h done during a pass with increment
h ≥ β can’t do anything.
If we dropped condition 3 from the definition of Shellsort-type algorithms, we could still
prove Lemma 1 under the assumption that the increments were decreasing, because then at
the beginning of a pass with increment h ≥ β, the list would be in its initial state, and would
hence be h-ordered already.
3. The Frobenius Problem
Suppose a1 , . . . , ar is a sequence of positive integers with gcd(a1 , . . . , ar ) = 1. Then every
integer k can be expressed in the form n1 a1 + · · · + nr ar for some integers n1 , . . . , nr . We say
k is expressible if the ni ’s can be chosen nonnegative, and inexpressible otherwise. It is not
hard to show that every sufficiently large k is expressible. So a natural question to ask
is: what is the largest inexpressible integer g = g(a1 , . . . , ar )? For example, suppose r = 2,
a1 = 3, a2 = 5. Then the expressible integers are 0, 3, 5, 6, 8, 9, 10, 11, 12, . . ., so g = 7. (Note
that if a1 consecutive integers are expressible, then all larger integers will be expressible.)
The problem of determining g is known as the Frobenius problem, because according to
Brauer [1], Frobenius mentioned it in his lectures. Although the problem dates back to the
nineteenth century [13], no general formula for g in terms of a1 , . . . , ar has been found. The
problem has been solved in some special cases, however. (See [12] for some of these.) For
example, the following result is well-known. For a proof, see Exercise 21 in Section 5.2.1
of [6] or Example 1 on page 4 of [12].
Proposition 3. If gcd(a1 , a2 ) = 1, then g(a1 , a2 ) = a1 a2 − a1 − a2 .
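A brute-force computation of g can serve as a check. The sketch below uses the remark above that once min(ai ) consecutive integers are expressible, all larger integers are; it is illustrative, not an efficient algorithm:

```python
from math import gcd
from functools import reduce

def frobenius(*a):
    """Largest inexpressible integer g(a_1, ..., a_r); requires gcd(a_1, ..., a_r) = 1.

    Scan upward, stopping once min(a) consecutive expressible integers appear,
    since adding copies of min(a) then makes every larger integer expressible."""
    assert reduce(gcd, a) == 1
    step = min(a)
    expressible = {0}
    g, k, run = 0, 0, 0
    while run < step:
        k += 1
        if any(k >= ai and (k - ai) in expressible for ai in a):
            expressible.add(k)
            run += 1
        else:
            g, run = k, 0
    return g

print(frobenius(3, 5))    # 7, matching the example above
print(frobenius(7, 11))   # 59 = 7*11 - 7 - 11, matching Proposition 3
```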
Our next result shows how knowledge of g can be useful in obtaining upper bounds on the
time required to do an h-sort.
Lemma 2. Let a1 , . . . , ar be positive integers with greatest common divisor 1. The number
of steps required to h-sort a list L[1], . . . , L[N ] which is already (a1 h)-ordered, . . . , (ar h)-
ordered, is O(N g), where g = g(a1 , . . . , ar ).
Proof. By definition of g, every multiple of h greater than gh is a sum n1 (a1 h) + · · · + nr (ar h)
for some ni ≥ 0. Since L is (a1 h)-ordered, . . . , (ar h)-ordered, it follows from transitivity
that L[j − bh] ≤ L[j] for all b > g. So during the h-sort, inserting each element L[j] into
its proper position in its sublist will involve moving it at most g steps of h. There are N
elements to put into place, so the total number of steps required is at most N g.
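Lemma 2 can be checked empirically in the case h = 1, a1 = 2, a2 = 3: after a 2-sort and a 3-sort, g(2, 3) = 1, so the final insertion sort should move no element more than one position. A sketch:

```python
import random

def h_sort_max_move(L, h):
    """h-sort by insertion; return the largest number of h-steps any element moved."""
    worst = 0
    for j in range(h, len(L)):
        x, k = L[j], j
        while k >= h and L[k - h] > x:
            L[k] = L[k - h]
            k -= h
        L[k] = x
        worst = max(worst, (j - k) // h)
    return worst

random.seed(0)
L = [random.randint(0, 999) for _ in range(500)]
h_sort_max_move(L, 2)
h_sort_max_move(L, 3)               # L is now 2-ordered and 3-ordered
assert h_sort_max_move(L, 1) <= 1   # g(2, 3) = 1: each element moves at most 1 step
assert L == sorted(L)
```

(The 3-sort preserves 2-orderedness by condition 3, which is why the bound applies after both passes.)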
4. Upper Bounds on the Worst Case
We now prove the best known upper bound for Shellsort, due to Pratt [10]. All logarithms
in this paper are to the base 2 unless explicitly written otherwise.
Proposition 4. If the numbers of the form 2^a 3^b less than N are used in decreasing
order as increments in Shellsort, then the worst case and average case running times are
Θ(N (log N )^2 ).
Proof. By the time the list is to be h-sorted, the list will have already been (2h)-sorted and
(3h)-sorted, and it will still be (2h)-ordered and (3h)-ordered by condition 3 in the definition
of Shellsort-type algorithms. So by Lemma 2, the h-sort requires only O(N g(2, 3)) = O(N )
steps. There are ⌈log2 N ⌉ powers of 2 less than N and ⌈log3 N ⌉ powers of 3 less than N ,
so there are at most ⌈log2 N ⌉⌈log3 N ⌉ = O((log N )^2 ) increments. Thus the total number of
steps is always O(N (log N )^2 ).
By condition 2 in the definition of Shellsort-type algorithms, Ω(N ) comparisons are
required for each increment h < N/2. And there are Ω((log N )^2 ) increments less than N/2,
since each of the ⌈log2 √(N/2)⌉ powers of 2 less than √(N/2) may be paired with each of the
⌈log3 √(N/2)⌉ powers of 3 less than √(N/2), and each such product 2^a 3^b is less than N/2.
So the total number of steps is Ω(N (log N )^2 ) also.
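Pratt's increment sequence is easy to generate; the sketch below lists all numbers 2^a 3^b below N in decreasing order and can be fed to any Shellsort implementation:

```python
def pratt_increments(N):
    """All numbers of the form 2**a * 3**b below N, in decreasing order."""
    incs = []
    p2 = 1
    while p2 < N:           # p2 runs over powers of 2 below N
        v = p2
        while v < N:        # v runs over p2 * (powers of 3) below N
            incs.append(v)
            v *= 3
        p2 *= 2
    return sorted(incs, reverse=True)

print(pratt_increments(20))   # [18, 16, 12, 9, 8, 6, 4, 3, 2, 1]
```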
Even though Pratt’s increment sequence gives the best known theoretical bound, it does
not give the fastest algorithm in practice. (See [4] for some empirical results.) This may be
because the other sequences actually have an average case better than Θ(N (log N )^2 ). Or
it may be that N is not sufficiently large yet. But certainly one reason is that too many
increments are being used. (There are Θ((log N )^2 ) of them.)
running times in practice, it seems necessary to limit the number of increments to O(log N ).
So what time bounds can we get under this restriction? We shall see that the asymptotic
worst case running time worsens as we reduce the number of increments we are allowed to
use.
For now, we give upper bounds where the number of increments is restricted. A result
very similar to the following (the special case m = Θ(log N )) was proved by Chazelle. (See
Theorem 3 (non-uniform case) in [4].)
Theorem 1. For all m, there exists a sequence of at most m increments for which Shellsort
requires O(mN^{1+O(1/√m)} ) steps in the worst case.
Proof. The proof is a generalization of the proof of Pratt’s result. If m ≥ (log N )^2 we are
done, since we can take the sequence of Proposition 4. Otherwise, pick the smallest integer
α ≥ 3 such that ⌊log_{α−1} N ⌋⌊log_α N ⌋ ≤ m. There are at most m numbers of the form
(α − 1)^a α^b less than N , so we may use them as increments, in decreasing order. As in the
proof of Proposition 4, each h-sort will require O(N g(α − 1, α)) steps. By Proposition 3, this
is O(N α^2 ). Since there are at most m increments, the total running time will be O(mN α^2 ).
Now m < (log N )^2 , so by definition of α,

    (log N / log α)^2 = Θ(m),
    α = N^{Θ(1/√m)},
    O(mN α^2 ) = O(mN^{1+O(1/√m)} ),
as desired.
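The construction in the proof can be sketched as follows. Floating-point logarithms are used for the floor computation, which is adequate for illustration though borderline cases would need exact integer arithmetic, and the easy case m ≥ (log N )^2 (where Proposition 4's sequence suffices) is omitted:

```python
import math

def limited_increments(N, m):
    """Increments of Theorem 1: find the smallest alpha >= 3 with
    floor(log_{alpha-1} N) * floor(log_alpha N) <= m, then return all numbers
    (alpha-1)**a * alpha**b below N, in decreasing order."""
    alpha = 3
    while math.floor(math.log(N, alpha - 1)) * math.floor(math.log(N, alpha)) > m:
        alpha += 1
    incs = set()
    p = 1
    while p < N:            # p runs over powers of (alpha - 1) below N
        v = p
        while v < N:        # v runs over p * (powers of alpha) below N
            incs.add(v)
            v *= alpha
        p *= alpha - 1
    return sorted(incs, reverse=True)

incs = limited_increments(10**6, 20)   # at most 20 increments, ending in 1
```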
Theorem 2. There is a constant c1 such that for all sufficiently large N and all m ≤
(log N/ log log N )^2 , there exists a sequence of at most m increments for which Shellsort
requires at most N^{1+c1/√m} steps.
Proof. By Theorem 1, there are constants c, c′ such that we can find a sequence of at most
m increments for which Shellsort requires at most cmN^{1+c′/√m} steps. But for sufficiently
large N ,

    log N^{1/√m} = (log N )/√m ≥ log log N ≥ log c

and

    log N^{2/√m} ≥ 2 log log N ≥ log m,

so

    N^{1+(3+c′)/√m} ≥ cmN^{1+c′/√m},

and we may take c1 = 3 + c′.
Proof. Let t = ⌊n/2⌋ ≥ 1 and q = ⌈N/t⌉ ≥ 2. Then t ≥ n/4, and q ≤ 2N/t ≤ 8N/n. Make
a list of length N ′ = qt with I inversions by appending N ′ − N ones to the end of L. Divide
this list into q blocks Bi of length t. Let bi be the number of ones in Bi , and let ni be the
number of inversions within Bi . Then

    I = Σ_{i<j} bi (t − bj ) + Σ_{i=1}^{q} ni ,

since the first sum counts the number of inversions involving a pair of elements from different
blocks. If ni ≥ I/2q for some i, then any sublist of length n containing the part of Bi
belonging to the original list L will have at least

    I/2q ≥ (I/2)(n/8N ) ≥ n^2 I/(256N^2 )

inversions. Otherwise

    Σ_{i<j} bi (t − bj ) = I − Σ_{i=1}^{q} ni ≥ I − q(I/2q) = I/2,

so for some i < j,

    bi (t − bj ) ≥ (I/2)/(q(q − 1)/2) ≥ I/q^2 ≥ n^2 I/(64N^2 ).

By the previous lemma applied to bi , bi+1 , . . . , bj , we get bs (t − bs+1 ) ≥ n^2 I/(256N^2 ) for
some s. So any sublist of length n containing the part of Bs ∪ Bs+1 belonging to the original
list L will have at least n^2 I/(256N^2 ) inversions.
so

    log a1 ≥ (1/2) log N = 9 · (log N/(18 log log N )) · log log N ≥ 9r log r,

    l ≤ 2β_{i+1}
      = 2N^{1/(72√m)} β_i
      ≤ (1/2)√(log N ) · N^{1/(72√m)} β_i     (for sufficiently large N )
      ≤ (1/2)N^{c2/√m} N^{1/(72√m)} β_i       (by (1))
      ≤ (1/2)N^{9/√m} β_i                     (since c2 + 1/72 ≤ 9)
      ≤ (1/2)a1^{18/√m} a1                    (by (6))
      ≤ (1/2)a1^{1+1/r}                       (by (3)).
For the particular case of Shaker-sort, we can prove an even stronger result for some
increment sequences.
Theorem 5. If hj = Θ(α^j ) for some fixed real α > 1, and the increments hj less than N
are used in any order in Shaker-sort, then Ω(N^2 ) steps are required in the worst case.
Proof. Let hm be the largest increment used and let ai = h_{m−r+i} , 1 ≤ i ≤ r, be the r largest
increments in increasing order. (We’ll choose r later.) For fixed r, a1 = Θ(N/α^r ) = Θ(N ),
so N ≤ (1/2)a1^{1+1/r} for large N . Thus by Lemma 18 with l = N , there exists a list L of
length N which is a1 -ordered, . . ., ar -ordered and which has at least 2^{−17} N^2 /r^2 inversions.
An h-shake involves at most 2N compare-exchanges at a distance of h, so by Proposition 5,
the increments h1 , . . . , h_{m−r} can remove at most

    4N (h1 + h2 + · · · + h_{m−r} ) = Θ(N (α + α^2 + · · · + α^{m−r} ))
                                    = Θ(N α^{m−r} )
                                    = Θ(α^{−r} N^2 )

inversions. Pick r large enough that this is less than or equal to 2^{−18} N^2 /r^2 . Then when
Shaker-sort is applied to L, the increments a1 , . . . , ar do no work (by Lemma 1), so after all
the h-shakes, at least

    2^{−17} N^2 /r^2 − 2^{−18} N^2 /r^2 = 2^{−18} N^2 /r^2

inversions remain to be removed by the insertion sort at the end, and this will take at least
2^{−18} N^2 /r^2 steps. Since r is independent of N , this is Ω(N^2 ).
References
[1] Brauer, A., On a Problem of Partitions, Amer. J. Math. 64 (1942), 299–312.
[2] Cypher, R., A Lower Bound on the Size of Shellsort Sorting Networks, Proceedings of the 1989 Inter-
national Conference on Parallel Processing, 58–63.
[3] Hibbard, T., An Empirical Study of Minimal Storage Sorting, Comm. of the ACM 6, no. 5 (May
1963), 206–213.
[4] Incerpi, J. and R. Sedgewick, Improved Upper Bounds on Shellsort, J. Comput. System. Sci. 31,
no. 2 (1985), 210–224.
[5] Incerpi, J. and R. Sedgewick, Practical Variations on Shellsort, Inform. Process. Lett. 26 (1987),
37–43.
[6] Knuth, D., “The Art of Computer Programming. 3. Sorting and Searching”, Addison-Wesley, Reading,
Mass., 1973.
[7] Lazarus, R. and R. Frank, A High-Speed Sorting Procedure, Comm. of the ACM 3, no. 1 (1960),
20–22.
[8] Nijenhuis, A. and H. Wilf, Representations of Integers by Linear Forms in Nonnegative Integers, J.
Number Theory 4 (1972), 98–106.
[9] Papernov, A. and G. Stasevich, A Method of Information Sorting in Computer Memories, Problems
of Inform. Transmission 1, no. 3 (1965), 63–75.
[10] Pratt, V., “Shellsort and Sorting Networks”, Garland Publishing, New York, 1979. (Originally pre-
sented as the author’s Ph. D. thesis, Stanford University, 1971.)
[11] Sedgewick, R., A New Upper Bound for Shellsort, J. of Algorithms 2 (1986), 159–173.
[12] Selmer, E., On the Linear Diophantine Problem of Frobenius, J. reine angew. Math. 294 (1977), 1–17.
[13] Sharp, W. J. Curran, Solution to Problem 7382 (Mathematics) proposed by J. J. Sylvester, Ed.
Times 41 (1884), London.
[14] Shell, D., A High-Speed Sorting Procedure, Comm. of the ACM 2, no. 7 (1959), 30–32.
[15] Weiss, M. and R. Sedgewick, Bad Cases for Shaker Sort, Inform. Process. Lett. 28 (1988), 133–136.
[16] Weiss, M. and R. Sedgewick, Tight Lower Bounds for Shellsort, J. of Algorithms 11 (1990), 242–251.
[17] Yao, A., An Analysis of (h, k, 1)-Shellsort, J. of Algorithms 1 (1980), 14–50.