Article history:
Received 11 October 2015
Accepted 1 October 2016
Available online 23 March 2017

Keywords:
Root separation bounds
Polynomial roots
Polynomial systems

Abstract. Let f be a polynomial (or polynomial system) with all simple roots. The root separation of f is the minimum of the pairwise distances between the complex roots. A root separation bound is a lower bound on the root separation. Finding a root separation bound is a fundamental problem, arising in numerous disciplines. We present two new root separation bounds: one univariate bound, and one multivariate bound. The new bounds improve on the old bounds in two ways:

(1) The new bounds are usually significantly bigger (hence better) than the previous bounds.
(2) The new bounds scale correctly, unlike the previous bounds.

Crucially, the new bounds are not harder to compute than the previous bounds.

© 2017 Elsevier Ltd. All rights reserved.
1. Introduction
In this paper we present improved root separation bounds. A root separation bound is a lower
bound on the distances between the roots of a polynomial (or polynomial system). First, we introduce
the notion of the root separation by an example.
E-mail addresses: aherman@ncsu.edu (A. Herman), hong@ncsu.edu (H. Hong), elias.tsigaridas@inria.fr (E. Tsigaridas).
http://dx.doi.org/10.1016/j.jsc.2017.03.001
0747-7171/© 2017 Elsevier Ltd. All rights reserved.
26 A. Herman et al. / Journal of Symbolic Computation 84 (2018) 25–56
Fig. 1. Roots of f (x) (top left), distances between roots (top right), minimum separation highlighted (bottom center).
Example 1. Let us consider the case of a single polynomial in a single variable. Let f(x) = x^4 − 60x^3 + 1000x^2 − 8000x. The roots of f(x) are plotted in Fig. 1. The lengths of the red line segments are the distances between the roots of f(x). The root separation is the smallest of these distances. The root separation of f(x) is √200, so any number less than or equal to √200 is a root separation bound. □
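The separation in this example is easy to check numerically. The sketch below (assuming NumPy is available; the roots are recovered numerically rather than symbolically) computes the minimum pairwise distance between the roots of f:

```python
import numpy as np
from itertools import combinations

# f(x) = x^4 - 60x^3 + 1000x^2 - 8000x, coefficients from highest degree down
f = [1, -60, 1000, -8000, 0]
roots = np.roots(f)                     # 0, 40, 10+10i, 10-10i
sep = min(abs(a - b) for a, b in combinations(roots, 2))
print(sep)                              # ~14.142, i.e. sqrt(200)
```

The minimum is attained between the root 0 and either of the roots 10 ± 10i.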
Root separation bounds are fundamental tools in algorithmic mathematics, with numerous ap-
plications (Li et al., 2004; Emiris and Tsigaridas, 2004; Burnikel et al., 2001; Schultz and Moller,
2005; Tsigaridas and Emiris, 2008; Strzebonski and Tsigaridas, 2011). As a consequence, there has
been intensive effort in finding such bounds (Mahler, 1964; Mignotte, 1974, 1995; Rump, 1979;
Tsigaridas and Emiris, 2008; Emiris et al., 2010; Collins and Horowitz, 1974; Bugeaud and Mignotte,
2004), resulting in many important bounds. Unfortunately, it is well known that current bounds
are very pessimistic. Furthermore, we have found another issue with current bounds. If the roots
of a polynomial are doubled, the root separation is obviously doubled. Hence we naturally ex-
pect that a root separation bound would double if the roots are doubled. This does not happen:
for the polynomial in the above example, the well known Mahler–Mignotte bound (Mahler, 1964;
Mignotte, 1974) becomes smaller when the roots are doubled. If the roots are tripled, the Mahler–
Mignotte bound is even smaller. In other words, the Mahler–Mignotte bound does not scale correctly;
the bound is not compatible with the geometry of the roots. To the best of the authors’ knowledge,
the same observation holds for all efficiently computable root separation bounds.1 We elaborate further
on this phenomenon in the next section.
This discussion leads us to the following challenge: find new root separation bounds such that
(1) the new bounds are (almost always) less pessimistic than previous bounds,
(2) the new bounds scale correctly, and
(3) the new bounds can be computed as efficiently as previous bounds.
¹ There do exist many bounds in the literature which scale correctly but are not efficiently computable. For example, the root
separation bound which simply returns the exact root separation scales correctly with the roots. Other examples include the
Mahler bound with the Mahler measure in the denominator (Theorem 2 of Mahler, 1964) and several bounds due to Mignotte
(1995), all of which depend on the magnitudes of the roots.
The main contribution of this paper is to provide two new bounds which meet the challenge:
one univariate root separation bound, and one multivariate root separation bound. We found the new
bounds by transforming known bounds into new bounds which meet the challenge. In the univariate
case, we transform the celebrated bound of Mahler and Mignotte (Mahler, 1964; Mignotte, 1974). In
the multivariate case, we transform the DMM bound due to Emiris, Mourrain, and Tsigaridas (Emiris
et al., 2010). Experimental evidence indicates that the improvement is usually very large, especially
when the magnitudes of the roots are different from 1.
The structure of this paper is as follows. In Section 2 we elaborate on the challenge discussed
above. In Section 3 we present the main contributions of this paper: one univariate root separation
bound and one multivariate root separation bound, both of which meet the challenge. In Section 4
we derive the two new bounds. In Section 5 we discuss the experimental performance of the new
bounds.
2. Challenge
In order to motivate our search for new root separation bounds, we recall the celebrated Mahler–Mignotte root separation bound (Mahler, 1964; Mignotte, 1974):

    B_MM(f) = √(3 |discr(f)|) / ( d^{d/2+1} ‖f‖_2^{d−1} )

where discr(f) is the discriminant² of f and d is the degree of f. Let us apply the Mahler–Mignotte bound to an example.
Example 2. Let f(x) = x^4 − 60x^3 + 1000x^2 − 8000x. As we saw in Example 1, the root separation of f is √200 (≈ 14.14). How does the Mahler–Mignotte bound perform on this polynomial? We have

² Recall that the discriminant can be calculated via the resultant: discr(f) = (−1)^{d(d−1)/2} (1/a_d) res(f, f′), where a_d is the leading coefficient of f.
Fig. 2. B_MM(f(x/s)).
the Mahler–Mignotte bound is approaching zero as the root separation increases. The situation is
equally strange to the left of the Mahler–Mignotte bound of f . When we decrease s, we see that until
s reaches a value around .18, the Mahler–Mignotte bound is increasing. In other words, the Mahler–
Mignotte bound is increasing when the root separation is decreasing. This is very surprising and also undesirable. □
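The failure to scale can be checked directly. The sketch below (a numerical check: the discriminant is computed from the numerically computed roots rather than symbolically) evaluates the 2-norm Mahler–Mignotte bound for f and for the polynomial whose roots are those of f doubled:

```python
import numpy as np
from itertools import combinations

def mahler_mignotte(coeffs):
    """B_MM(f) = sqrt(3|discr(f)|) / (d^(d/2+1) * ||f||_2^(d-1)),
    with discr(f) computed numerically from the roots."""
    d = len(coeffs) - 1
    roots = np.roots(coeffs)
    disc = abs(coeffs[0]) ** (2 * d - 2) * np.prod(
        [abs(a - b) ** 2 for a, b in combinations(roots, 2)])
    return np.sqrt(3 * disc) / (d ** (d / 2 + 1) * np.linalg.norm(coeffs) ** (d - 1))

f  = [1, -60, 1000, -8000, 0]      # roots 0, 40, 10+-10i; separation ~14.14
f2 = [1, -120, 4000, -64000, 0]    # roots doubled; separation ~28.28
print(mahler_mignotte(f))          # ~8.3e-6
print(mahler_mignotte(f2))         # ~1.05e-6: the bound shrinks as the roots grow
```

Doubling the roots doubles the separation, yet the bound drops by roughly a factor of 8.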
To summarize, we have observed two issues:

(1) The Mahler–Mignotte bound is very pessimistic (several orders of magnitude smaller than the root separation).
(2) The Mahler–Mignotte bound does not scale correctly ("covariantly") with the roots of f.
We have also observed similar phenomena for other efficiently computable root separation bounds.
Thus we have a challenge.
3. Main results
The main results of this paper are two new root separation bounds: one univariate root separation
bound and one multivariate root separation bound. The two new bounds meet the challenge posed in
the previous section. In this section we will precisely state the main results of the paper. We use the
following notation.
Notation 1.

    f = ∑_{i=0}^{d} a_i x^i = a_d ∏_{i=1}^{d} (x − α_i) ∈ C[x]
    F_n = { F ∈ (C[x_1, …, x_n])^n : F has finitely many (at least two) solutions, and all solutions are simple }
    F = (f_1, …, f_n) ∈ F_n
    Δ(F) = min_{β_1 ≠ β_2 ∈ C^n, F(β_1) = F(β_2) = 0} ‖β_1 − β_2‖_2
    discr(f) = a_d^{2d−2} ∏_{i≠j} (α_i − α_j)
    E(f) = Support(f)
    d_i = deg(f_i)
    D = d_1 ⋯ d_n
    M_i = ∏_{j≠i} d_j
We begin by recalling two root separation bounds: a univariate bound due to Mahler and Mignotte (Mahler, 1964; Mignotte, 1974)³

    B_{MM,k}(f) = ( √|discr(f)| / ‖f‖_k^{d−1} ) · P_k(d)

where

    P_k(d) = √3 / ( d^{d/2+1} (d+1)^{(1/2 − 1/k)(d−1)} )
and a multivariate bound due to Emiris, Mourrain, and Tsigaridas, known as the Davenport–Mahler–Mignotte bound (or DMM bound) (Emiris et al., 2010)⁴

    B_DMM(F) = ( √|discr(T_{f_0})| / ( ∏_{i=1}^{n} ‖f_i‖_∞^{M_i} )^{D−1} ) · P(d_1, …, d_n, n)

where P(d_1, …, d_n, n) is an explicit constant depending only on n and the degrees; it involves √3, D^{D/2+1}, √(D+1), powers of n+1, and the binomial coefficients (d_i+n choose d_i), and we refer to Emiris et al. (2010) for its precise form.
    s̃_k = max_{q : h(q)<0} min_{p : h(p)>0} ( (|h(q)| / |h(p)|)^{1/k} · |a_q / a_p| )^{1/(q−p)}

    h(i) = d/2 − i + 1/(d−1).
³ The Mahler–Mignotte bound is usually presented with the 2-norm (as in the previous section) or the ∞-norm. It can easily be extended to an arbitrary k-norm (k ≥ 2).
⁴ We present a slight modification of the bound from Emiris et al. (2010). See Lemma 8 for details.
Example 3. Let f = x^4 − 60x^3 + 1000x^2 − 8000x. Recall that the root separation of f is approximately 14.14. We have

    B_{MM,∞}(f) = 7.56 × 10^{−7}

    s̃_∞ = max_{q∈{3,4}} min_{p∈{1,2}} |a_q / a_p|^{1/(q−p)}
        = max( min( (|−60|/|−8000|)^{1/(3−1)}, (|−60|/|1000|)^{1/(3−2)} ), min( (|1|/|−8000|)^{1/(4−1)}, (|1|/|1000|)^{1/(4−2)} ) )
        = max( 6.00 × 10^{−2}, 3.16 × 10^{−2} )
        = 6.00 × 10^{−2}

    H_∞ = ‖x^4 − 3.60x^3 + 3.60x^2 − 1.73x‖_∞ / (6.00 × 10^{−2})^{4/2 − 1/(4−1)}
        = 3.60 / (6.00 × 10^{−2})^{5/3}
        = 3.91 × 10^2

    B_{New,∞}(f) = 6.45 × 10^{−3}.

Note that B_{New,∞} is a root separation bound for f, and is significantly larger than B_{MM,∞}. To demonstrate the covariance, we plot the function B_{New,∞}(f(x/s)) in Fig. 3. □
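Example 3 can be reproduced end to end. The sketch below is a numerical illustration (the discriminant is again taken from the numerically computed roots, and the |h(q)/h(p)|^{1/k} factors are dropped since they tend to 1 as k → ∞); it evaluates s̃_∞, H_∞, and the new bound:

```python
import numpy as np
from itertools import combinations

d = 4
a = {4: 1, 3: -60, 2: 1000, 1: -8000, 0: 0}   # f = x^4 - 60x^3 + 1000x^2 - 8000x
h = lambda i: d / 2 - i + 1 / (d - 1)

# tilde s_inf = max over h(q)<0 of min over h(p)>0 of |a_q/a_p|^(1/(q-p))
neg = [q for q in a if h(q) < 0 and a[q] != 0]
pos = [p for p in a if h(p) > 0 and a[p] != 0]
s = max(min((abs(a[q]) / abs(a[p])) ** (1 / (q - p)) for p in pos) for q in neg)

# H_inf = ||f^[s]||_inf / s^(d/2 - 1/(d-1))
H = max(abs(a[i]) * s ** (d - i) for i in a) / s ** (d / 2 - 1 / (d - 1))

roots = np.roots([a[i] for i in range(d, -1, -1)])
disc = np.prod([abs(x - y) ** 2 for x, y in combinations(roots, 2)])
P = np.sqrt(3) / (d ** (d / 2 + 1) * (d + 1) ** ((d - 1) / 2))   # P_inf(d)
b_new = np.sqrt(disc) * P / H ** (d - 1)
print(s, H, b_new)    # 0.06, ~391.6, ~6.45e-3
```

All three values match the ones displayed in Example 3.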
⁵ Arithmetic and radicals.
Furthermore, B_{New,k} is almost always larger for smaller k, as the same example illustrates:

    B_{New,2}(f) = 1.29 × 10^{−3}
    B_{MM,2}(f) = 1.32 × 10^{−3}.
Remark 2. For square-free integer polynomials, the discriminant has a lower bound of 1. Hence in
practice the discriminant is almost always replaced by 1. In this case, part (4) of Theorem 1 implies
that B Ne w ,k can be computed in O (d) algebraic operations and comparisons. Note that removing the
discriminant sacrifices the scaling covariance. When the coefficients are not rational it is difficult to
obtain a lower bound on the discriminant.
Remark 3. It is possible to replace s̃_k by a number which is computed without using radicals if we allow ourselves to compute to arbitrary accuracy the real root of a polynomial.⁷

    H = min_{s>0} R(s)

    R(s) = ∏_{i=1}^{n} ‖f_i^{[s]}‖_∞^{M_i} / s^{D/2 − 1/(D−1)} = ∏_{i=1}^{n} ( max_{e∈E(f_i)} s^{d_i−|e|} |a_e| )^{M_i} / s^{D/2 − 1/(D−1)}.
⁶ A curious reader may wonder how this example was constructed. In the notation of Section 4.2, g ≈ f^{[s_2*]}, where f is the same polynomial we considered in the previous examples.
⁷ The polynomial is obtained by replacing t with s^k in the definition of Q_k(t) in Lemma 3.
    m = # monomials of F
    d = ∑_{i=1}^{n} d_i.

    f_0 = u − x_1 − x_2

is a separating element in the set

    { u − x_1 − i·x_2 : 0 ≤ i ≤ 6 }.
We compute

    T_{f_0} = 4u^4 − 800u^2 + 2500
    √|discr(T_{f_0})| = 2.40 × 10^8
    P = √30 / 48348866242924385372681011200
    ‖f_1‖_∞ = 100
    ‖f_2‖_∞ = 25
    ‖f_1‖_∞^{M_1} ‖f_2‖_∞^{M_2} = 100^2 · 25^2 = 6.25 × 10^6.

Hence

    B_DMM(F) = ( 2.40 × 10^8 / (6.25 × 10^6)^{4−1} ) · ( √30 / 48348866242924385372681011200 ) ≈ 1.11 × 10^{−40}.

    H = R(s*) = 4.64 × 10^1.

Hence

    B_New(F) = ( 2.40 × 10^8 / (4.64 × 10^1)^{4−1} ) · ( √30 / 48348866242924385372681011200 ) ≈ 2.71 × 10^{−25}.
Note that this number is still quite pessimistic; however, the new bound is significantly larger than B_DMM(F). To demonstrate the covariance, we plot the function B_New(F(x_1/s, x_2/s)) in Fig. 4. □
Remark 4. Note that B_New is only defined for the ∞-norm. It turns out that generalizing the result
to arbitrary norms is more difficult than in the univariate case.
Remark 5. For F ∈ Fn with integer coefficients, T f 0 is a square-free integer polynomial; in this case
discr ( T f 0 ) has a lower bound of 1. Hence in practice the discriminant is almost always replaced by 1.
In this case, part (4) of Theorem 2 implies that B Ne w can be computed in O (n · m + n · d) algebraic
operations and comparisons. As with the new univariate bound, removing the discriminant sacrifices
the scaling covariance. When the coefficients are not rational it is difficult to obtain a lower bound on
the discriminant.
4. Derivation
In this section, we present the framework we use to derive the two new bounds. To make the
presentation as general as possible, the framework will be derived for square polynomial systems.
Of course, the univariate case is included by considering a square polynomial system which contains
only one polynomial. We use the following notation.
Notation 2.
Note that in the above notation we scale the roots of F using a slight modification of the scal-
ing operation in the introduction. Since the only difference between the two scaling operations are
the leading coefficients, the two operations are equivalent. We use this scaling operation for later
convenience.
In Propositions 1–3 we will incrementally develop the framework used to meet the challenge
stated at the beginning of this paper.
    B* : F ↦ B(F^{[s]}) / s.
Then
We will illustrate the result by a simple example, since the proof is simple.

    B_{MM,2}(f^{[2]}) ≤ Δ(f^{[2]}) = 2Δ(f).

Rearranging yields

    B_{MM,2}(f^{[2]}) / 2 ≤ Δ(f).    (2)

Combining (1) and (2) we have

    1.05 × 10^{−6} / 2 = 5.25 × 10^{−7} ≤ Δ(f).

Note that 5.25 × 10^{−7} ≤ B_{MM,2}(f). So 2 was not a good choice for s.

How should we choose s? In Fig. 5 we plot the function B_{MM,2}(f^{[s]})/s. Clearly, we should choose s so that the function is maximized. We see that for s ≈ .06, the new bound is approximately 2.00 × 10^{−2}. This new bound is significantly larger than B_{MM,2}(f) = 8.26 × 10^{−6}. □
    B* : F ↦ B(F^{[σ(F)]}) / σ(F).

If ∀F ∈ F_n and ∀γ > 0 we have

    σ(F^{[γ]}) = (1/γ) σ(F)

then
Hence
[γ ]
[σ ( F [γ )] )]
B F
B ( F [σ ( F )] )
B ∗ ( F [γ ] ) = = from (3)
σ ( F [γ )] ) σ ( F [γ )] )
B ( F [σ ( F )] ) B ( F [σ ( F )] )
= 1
= γ = γ B ∗ ( F ).
γ σ (F )
σ (F )
We have proved that B ∗ scales covariantly. 2
Proposition 3 (Optimal bound from known bound). Let B : F_n → R_+ be a root separation bound. Let

    B* : F ↦ max_{s>0} B(F^{[s]}) / s.

Then

Proof. Let

    σ(F) = arg max_{s>0} B(F^{[s]}) / s.

Clearly

    B* : F ↦ B(F^{[σ(F)]}) / σ(F).
Let F ∈ F_n and γ > 0. We will show that σ(F^{[γ]}) = (1/γ) σ(F). We have

    σ(F^{[γ]}) = arg max_{s>0} B( (F^{[γ]})^{[s]} ) / s
              = arg max_{s>0} B(F^{[γs]}) / s
              = arg max_{s>0} B(F^{[γs]}) / (γs)        since γ > 0
              = (1/γ) arg max_{s>0} B(F^{[s]}) / s
              = (1/γ) σ(F).
Hence by Proposition 2, B* scales covariantly.
We will now prove the third property. We have

    B*(F) = max_{s>0} B(F^{[s]}) / s ≥ B(F^{[1]}) / 1 = B(F).

We have proved the Proposition. □
Let us summarize the framework built up in this section. We have seen that for a given root
separation bound B
B ( F [s] )
max
s >0 s
meets the challenge if the maximum can be computed efficiently. If the maximum cannot be computed
efficiently, we can approximate the maximum. We can then use Proposition 2 to guarantee that the
new bound is scaling covariant.
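Proposition 3 can be exercised numerically. The sketch below is a brute-force stand-in for the framework (it maximizes B_MM,2(f^[s])/s over a log-spaced grid rather than computing the argmax in closed form, and takes the discriminant from the numerically computed roots); it recovers the improvement reported for the running example:

```python
import numpy as np
from itertools import combinations

def b_mm2(coeffs):
    # B_MM,2(f) = sqrt(3|discr(f)|) / (d^(d/2+1) ||f||_2^(d-1)), discr from roots
    d = len(coeffs) - 1
    roots = np.roots(coeffs)
    disc = abs(coeffs[0]) ** (2 * d - 2) * np.prod(
        [abs(x - y) ** 2 for x, y in combinations(roots, 2)])
    return np.sqrt(3 * disc) / (d ** (d / 2 + 1) * np.linalg.norm(coeffs) ** (d - 1))

def scale_roots(coeffs, s):
    # f^[s] = s^d f(x/s): the coefficient of degree d-j is multiplied by s^j
    return [c * s ** j for j, c in enumerate(coeffs)]

f = [1, -60, 1000, -8000, 0]
best = max(b_mm2(scale_roots(f, s)) / s for s in np.geomspace(1e-3, 1e3, 4000))
print(b_mm2(f), best)   # ~8.3e-6 versus ~2.1e-2: over three orders of magnitude better
```

The grid maximum sits near s ≈ 0.06, in agreement with Fig. 5.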
In this section we derive the new univariate bound. We will find a tight approximation s̃_k of

    s_k* = arg max_{s>0} B_{MM,k}(f^{[s]}) / s.

We will then use Proposition 2 and a result due to Mehlhorn and Ray (2010) to show that the bound

    B_{New,k} = B_{MM,k}(f^{[s̃_k]}) / s̃_k

meets the challenge.
We will use our first few Lemmas to find a simplified expression for sk∗ . Our eventual goal is to
find an expression for sk∗ that we can use to tightly approximate sk∗ . We will take advantage of the
following easily verifiable identities:
As our first simplification step, we will find an expression for sk∗ which does not include the
discriminant or P k (d).
where

    R_k(s) = ‖f^{[s]}‖_k / s^{d/2 − 1/(d−1)}.
Proof. We begin with the expression

    B_{MM,k}(f^{[s]}) / s

then simplify this expression with the identities of Lemma 1. We have

    B_{MM,k}(f^{[s]}) = ( √|discr(f^{[s]})| / ‖f^{[s]}‖_k^{d−1} ) · P_k(d).    (4)

Since
    f^{[s]} = s^d f(x/s) = s^d a_d ∏_{i=1}^{d} (x/s − α_i) = a_d ∏_{i=1}^{d} (x − sα_i)

we have

    discr(f^{[s]}) = a_d^{2d−2} ∏_{i≠j} (sα_i − sα_j)
                  = a_d^{2d−2} s^{d(d−1)} ∏_{i≠j} (α_i − α_j)
                  = s^{d(d−1)} discr(f).    (5)
Hence

    B_{MM,k}(f^{[s]}) / s = (1/s) · ( √|discr(f^{[s]})| / ‖f^{[s]}‖_k^{d−1} ) · P_k(d)
        = (1/s) · ( √( s^{d(d−1)} |discr(f)| ) / ‖f^{[s]}‖_k^{d−1} ) · P_k(d)        from (5)
        = ( s^{d(d−1)/2} / ( s ‖f^{[s]}‖_k^{d−1} ) ) · √|discr(f)| · P_k(d)
        = ( s^{d(d−1)/2 − 1} / ‖f^{[s]}‖_k^{d−1} ) · √|discr(f)| · P_k(d)
        = ( s^{d/2 − 1/(d−1)} / ‖f^{[s]}‖_k )^{d−1} · √|discr(f)| · P_k(d)
        = ( 1 / R_k(s) )^{d−1} · √|discr(f)| · P_k(d).    (6)
Now we apply the identities from Lemma 1 to the expression in (6):

    arg max_{s>0} B_{MM,k}(f^{[s]}) / s = arg max_{s>0} ( 1 / R_k(s) )^{d−1} · √|discr(f)| · P_k(d)
        = arg max_{s>0} ( 1 / R_k(s) )^{d−1}        (Identity 1)
        = arg max_{s>0} 1 / R_k(s)                  (Identity 2)
        = arg min_{s>0} R_k(s).                     (Identity 3)
We will now find an even simpler expression for s_k* which depends only on the unique positive root of a polynomial with a particularly nice structure.

    Q_k(t) = ∑_{i=0}^{d} h(i) |a_i|^k · t^{d−i}

and

    h(i) = d/2 − i + 1/(d−1).
Proof. For later convenience, we first rewrite R_k(s). We will show that

    R_k(s) = R̃_k(s)^{1/k}

where

    R̃_k(s) = ∑_{i=0}^{d} (s^k)^{h(i)} |a_i|^k.

We have

    R_k(s) = ‖f^{[s]}‖_k / s^{d/2 − 1/(d−1)}
           = ( ∑_{i=0}^{d} s^{kd−ki} |a_i|^k )^{1/k} / s^{d/2 − 1/(d−1)}
           = ( ∑_{i=0}^{d} s^{kd/2 − ki + k/(d−1)} |a_i|^k )^{1/k}
           = ( ∑_{i=0}^{d} (s^k)^{d/2 − i + 1/(d−1)} |a_i|^k )^{1/k}
           = ( ∑_{i=0}^{d} (s^k)^{h(i)} |a_i|^k )^{1/k}        since h(i) = d/2 − i + 1/(d−1)
           = R̃_k(s)^{1/k}.    (7)

Combining Lemma 1 and (7), we have

    R̃_k′(s_k*) = 0.    (9)

Note that
Note that
    R̃_k′(s) = ∑_{i=0}^{d} s^{kh(i)−1} · kh(i) |a_i|^k.

Recall

    Q_k(t) = ∑_{i=0}^{d} h(i) |a_i|^k · t^{d−i}.

We have

    k s^{−kd/2 + k/(d−1) − 1} Q_k(s^k) = s^{−kd/2 + k/(d−1) − 1} ∑_{i=0}^{d} kh(i) |a_i|^k · (s^k)^{d−i}
        = ∑_{i=0}^{d} s^{kd − ki − kd/2 + k/(d−1) − 1} · kh(i) |a_i|^k
        = ∑_{i=0}^{d} s^{k(d/2 − i + 1/(d−1)) − 1} · kh(i) |a_i|^k
        = ∑_{i=0}^{d} s^{kh(i)−1} · kh(i) |a_i|^k
        = R̃_k′(s).
Hence R̃_k′(s) = 0 if and only if Q_k(s^k) = 0; in particular, s_k* = (t*)^{1/k}, where t* is the unique positive root of Q_k.
Since Q_k is a polynomial with a single sign change⁸, we can derive a tight approximation of its single positive root with the following result.

Theorem 3 (Herman and Hong, 2015). Let f = ∑_{i=0}^{m} c_i x^{e_i} have a single sign change, and x* be the unique positive root of f. Then

    L ≤ x* ≤ U

⁸ The number of sign changes of a real polynomial is the number of times the signs of the coefficients change from positive to negative, when the coefficients are ordered by degree.
where

    L = (1/2) H(f)
    U = 2 H(f)
    H(f) = max_{q : c_q<0} min_{p : c_p>0, e_p>e_q} ( |c_q| / |c_p| )^{1/(e_p − e_q)}.
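Theorem 3 is easy to test on a small instance. The sketch below (the helper name hong_bound and the term-list representation are conventions of this illustration, not the paper's) computes H(f), L, and U for f(x) = x^3 − 7x − 6, whose unique positive root is 3:

```python
def hong_bound(terms):
    """H(f) for f = sum c_i x^{e_i} with a single sign change:
    max over negative terms (c_q, e_q) of the min over positive terms (c_p, e_p)
    with e_p > e_q of (|c_q|/|c_p|)^(1/(e_p - e_q))."""
    neg = [(abs(c), e) for c, e in terms if c < 0]
    pos = [(c, e) for c, e in terms if c > 0]
    return max(
        min((cq / cp) ** (1.0 / (ep - eq)) for cp, ep in pos if ep > eq)
        for cq, eq in neg)

# f(x) = x^3 - 7x - 6 = (x - 3)(x + 1)(x + 2): single sign change, x* = 3
H = hong_bound([(1, 3), (-7, 1), (-6, 0)])
L, U = H / 2, 2 * H
print(L, H, U)          # ~1.32, ~2.65, ~5.29, and indeed L <= 3 <= U
```

Here H(f) = (7/1)^{1/2} = √7, so the two-sided enclosure of the positive root holds with room to spare.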
    L ≤ t* ≤ U    (12)

where

    L = (1/2) H(Q_k)
    U = 2 H(Q_k).

We will use the next two Lemmas to show that s̃_k tightly approximates s_k*. We split the Lemmas up for the sake of clarity.
and

where

    G = ∑_{p : h(p)>0} |a_p| s^{d−p} − ∑_{q : h(q)<0} |a_q| s^{d−q}.
Proof. We have

    ( H(Q_k) )^{1/k} = ( max_{q : h(q)<0} min_{p : h(p)>0} ( |h(q)| |a_q|^k / ( |h(p)| |a_p|^k ) )^{1/((d−p)−(d−q))} )^{1/k}
        = ( max_{q : h(q)<0} min_{p : h(p)>0} ( |h(q)| |a_q|^k / ( |h(p)| |a_p|^k ) )^{1/(q−p)} )^{1/k}
        = max_{q : h(q)<0} min_{p : h(p)>0} ( |h(q)| |a_q|^k / ( |h(p)| |a_p|^k ) )^{1/(k(q−p))}
        = max_{q : h(q)<0} min_{p : h(p)>0} ( (|h(q)| / |h(p)|)^{1/k} |a_q / a_p| )^{1/(q−p)}
        = s̃_k.
We now consider the limit. We have

    lim_{k→∞} s̃_k = lim_{k→∞} max_{q : h(q)<0} min_{p : h(p)>0} ( (|h(q)| / |h(p)|)^{1/k} |a_q / a_p| )^{1/(q−p)}
                  = max_{q : h(q)<0} min_{p : h(p)>0} |a_q / a_p|^{1/(q−p)}.
We also have

    H(G) = max_{q : h(q)<0} min_{p : h(p)>0, (d−p)>(d−q)} ( |a_q| / |a_p| )^{1/((d−p)−(d−q))}
         = max_{q : h(q)<0} min_{p : h(p)>0, p<q} ( |a_q| / |a_p| )^{1/(q−p)}
         = max_{q : h(q)<0} min_{p : h(p)>0} ( |a_q| / |a_p| )^{1/(q−p)}        since h(i) is strictly decreasing with i
         = lim_{k→∞} s̃_k. □
    s_k* = arg max_{s>0} B_{MM,k}(f^{[s]}) / s.

Then

    (1/2)^{1/k} s̃_k ≤ s_k* ≤ 2^{1/k} s̃_k

where

    s̃_k = max_{q : h(q)<0} min_{p : h(p)>0} ( (|h(q)| / |h(p)|)^{1/k} |a_q / a_p| )^{1/(q−p)}

    h(i) = d/2 − i + 1/(d−1).
Proof. Let

    s_k* = arg max_{s>0} B_{MM,k}(f^{[s]}) / s.

Combining Lemmas 2, 4 and 5, we have

    (1/2)^{1/k} s̃_k ≤ s_k* ≤ 2^{1/k} s̃_k.

We have proved the Lemma. □
We are now ready to define the new bound. In Lemma 6, we showed that s̃_k is a tight approximation of s_k*. As k increases, the approximation becomes tighter. Thus we choose to approximate the bound

    max_{s>0} B_{MM,k}(f^{[s]}) / s

with the bound

    B_{New,k}(f) = B_{MM,k}(f^{[s̃_k]}) / s̃_k = ( √|discr(f)| / H_k^{d−1} ) · P_k(d).    (13)
Before proving Theorem 1, we present an algorithm for computing s̃_k. We combine Lemma 5 and an ingenious algorithm due to Mehlhorn and Ray (2010) to compute H(Q) in O(d) algebraic operations and comparisons. We formally state their complexity results in the Lemma below.

Lemma 7 (Mehlhorn and Ray, 2010). Let g ∈ R[x] with m non-zero coefficients. Then H(g) can be computed in O(m) algebraic operations and comparisons with the algorithm ComputeH (Algorithm 3).
Remark 6. In Mehlhorn and Ray (2010), the point T and line l are not reset when T is removed from L (as we do in Algorithm 1). This appears to be a minor oversight which we correct here.
    s̃_∞ = s_∞*.

Hence

    B_{New,∞}(f) = B_{MM,∞}(f^{[s̃_∞]}) / s̃_∞
                 = B_{MM,∞}(f^{[s_∞*]}) / s_∞*
                 = max_{s>0} B_{MM,∞}(f^{[s]}) / s.

Hence by Proposition 3, B_{New,∞}(f) ≥ B_{MM,∞}(f) for all f.
(3) Let γ > 0. We have

    s̃_k(f^{[γ]}) = max_{q : h(q)<0} min_{p : h(p)>0} ( (|h(q)| / |h(p)|)^{1/k} | γ^{d−q} a_q / ( γ^{d−p} a_p ) | )^{1/(q−p)}
        = max_{q : h(q)<0} min_{p : h(p)>0} ( (|h(q)| / |h(p)|)^{1/k} |a_q / a_p| · (1/γ^{q−p}) )^{1/(q−p)}
        = (1/γ) max_{q : h(q)<0} min_{p : h(p)>0} ( (|h(q)| / |h(p)|)^{1/k} |a_q / a_p| )^{1/(q−p)}
        = (1/γ) s̃_k(f).
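The scaling covariance of s̃_k in part (3) can be spot-checked for k = ∞. In the sketch below (the coefficient-dictionary representation and the exclusion of zero coefficients are conventions of this illustration), doubling the roots of the running example halves s̃_∞:

```python
def s_tilde_inf(a, d):
    """tilde s_inf = max_{h(q)<0} min_{h(p)>0} |a_q/a_p|^(1/(q-p)),
    h(i) = d/2 - i + 1/(d-1); a maps degree i -> coefficient a_i."""
    h = lambda i: d / 2 - i + 1 / (d - 1)
    neg = [q for q in a if h(q) < 0 and a[q] != 0]
    pos = [p for p in a if h(p) > 0 and a[p] != 0]
    return max(min(abs(a[q] / a[p]) ** (1 / (q - p)) for p in pos) for q in neg)

f  = {4: 1, 3: -60, 2: 1000, 1: -8000, 0: 0}
f2 = {i: 2 ** (4 - i) * c for i, c in f.items()}    # f^[2]: roots doubled
print(s_tilde_inf(f, 4), s_tilde_inf(f2, 4))        # 0.06 and 0.03
```

This is exactly the covariance σ(f^[γ]) = σ(f)/γ that Proposition 2 requires of the approximation.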
In this section, we derive the new multivariate bound. We first briefly discuss the bound B D M M
presented in Section 3.
Proof. To prove the result, we follow a proof almost identical to that in Emiris et al. (2010). Instead
of using the sparse resultant, we will use the multivariate resultant. Let f 0 be a separating element
and T f 0 be the resultant of F and f 0 which eliminates x1 , . . . , xn . We use the same coefficient bounds
as in Emiris et al. (2010) to show that
    ‖T_{f_0}‖_∞ ≤ ( ∏_{i=1}^{n} ‖f_i‖_∞^{M_i} ) · C^{D} (n+1)^{D} ∏_{i=1}^{n} (n+d_i choose d_i)^{M_i}    (14)

    Δ(F) ≥ B_DMM(F). □
For the remainder of this section, let F ∈ Fn be fixed, and f 0 a fixed separating element of F .
Similar to the previous section, we will begin by deriving a simplified expression for

    s* = arg max_{s>0} B_DMM(F^{[s]}) / s.

Note that s* does not depend on the choice of norm (unlike the univariate case), since B_DMM is only derived with the infinity norm.
We first need to understand the effect that root scaling has on the discriminant of T f 0 . We make
use of the following result from the proof of Proposition 5.8 of Cox et al. (2005).
Lemma 9. Let F be zero-dimensional, have no solutions at infinity, and have no singular solutions. Let
f 0 = u + r1 x1 + · · · + rn xn
where

    C = Res(F̄)
    F̄ = (f̄_1, …, f̄_n)
    f̄_i = ∑_{e∈E(f_i), |e|=d_i} a_e x^e.
Lemma 10. Let s > 0. Let T_{f_0}^{[s]} be the resultant of F^{[s]} and f_0. Then

    discr(T_{f_0}^{[s]}) = s^{D(D−1)} discr(T_{f_0}).
Proof. To prove the claim, we will first show that the leading coefficients of T_{f_0} and T_{f_0}^{[s]} are the same. Then we will use the definition of the discriminant to complete the proof.

Let C be the leading coefficient of T_{f_0} and C^{[s]} the leading coefficient of T_{f_0}^{[s]}. From Lemma 9, we have
Note that (writing f̄_i^{[s]} for the leading form of f_i^{[s]} = s^{d_i} f_i(x_1/s, …, x_n/s))

    f̄_i^{[s]} = s^{d_i} ∑_{|e|=d_i} a_e (x_1/s)^{e_1} ⋯ (x_n/s)^{e_n}
             = s^{d_i} ∑_{|e|=d_i} a_e (1/s)^{e_1+⋯+e_n} x_1^{e_1} ⋯ x_n^{e_n}
             = s^{d_i} ∑_{|e|=d_i} a_e (1/s)^{d_i} x^e
             = s^{d_i} (1/s)^{d_i} ∑_{|e|=d_i} a_e x^e
             = ∑_{|e|=d_i} a_e x^e
             = f̄_i.

Hence

    F̄ = F̄^{[s]}.    (17)

Combining (16) and (17), we have

    C = C^{[s]}.    (18)

Note that the roots of T_{f_0} are

    { r_1 γ_{i,1} + ⋯ + r_n γ_{i,n} }_{i=1}^{D}

and the roots of T_{f_0}^{[s]} are

    { s · (r_1 γ_{i,1} + ⋯ + r_n γ_{i,n}) }_{i=1}^{D}.

We will now expand the discriminant of T_{f_0}^{[s]}. We have

    discr(T_{f_0}^{[s]}) = (C^{[s]})^{2D−2} ∏_{i≠j} ( s·(r_1γ_{i,1}+⋯+r_nγ_{i,n}) − s·(r_1γ_{j,1}+⋯+r_nγ_{j,n}) )
        = C^{2D−2} ∏_{i≠j} ( s·(r_1γ_{i,1}+⋯+r_nγ_{i,n}) − s·(r_1γ_{j,1}+⋯+r_nγ_{j,n}) )        from (18)
        = s^{D(D−1)} C^{2D−2} ∏_{i≠j} ( (r_1γ_{i,1}+⋯+r_nγ_{i,n}) − (r_1γ_{j,1}+⋯+r_nγ_{j,n}) )
        = s^{D(D−1)} discr(T_{f_0}). □
Now that we know the effect root scaling has on the discriminant of T_{f_0}, we can study the effect of root scaling on B_DMM. We will follow a similar procedure to the derivation of the univariate bound. First we will find an expression for the scaled bound which does not depend on the discriminant of T_{f_0} or P(d_1, …, d_n, n).
Proof. We have

    B_DMM(F^{[s]}) / s = (1/s) · ( √|discr(T_{f_0}^{[s]})| / ( ∏_{i=1}^{n} ‖f_i^{[s]}‖_∞^{M_i} )^{D−1} ) · P(d_1, …, d_n, n)
        = (1/s) · ( s^{D(D−1)/2} √|discr(T_{f_0})| / ( ∏_{i=1}^{n} ‖f_i^{[s]}‖_∞^{M_i} )^{D−1} ) · P(d_1, …, d_n, n)        from Lemma 10
        = ( s^{D/2 − 1/(D−1)} / ∏_{i=1}^{n} ‖f_i^{[s]}‖_∞^{M_i} )^{D−1} · √|discr(T_{f_0})| · P(d_1, …, d_n, n)
        = ( √|discr(T_{f_0})| / R(s)^{D−1} ) · P(d_1, …, d_n, n). □
Next, we find a simplified expression for s∗ . As in the univariate case, our eventual goal is to find
an expression for s∗ which leads to an efficient computation of s∗ .
Proof. To prove the claim, we will again make use of the identities in Lemma 1. We have

    arg max_{s>0} B_DMM(F^{[s]}) / s = arg max_{s>0} ( √|discr(T_{f_0})| / R(s)^{D−1} ) · P(d_1, …, d_n, n)        from Lemma 11
        = arg max_{s>0} 1 / R(s)^{D−1}        (Identity 1)
        = arg max_{s>0} 1 / R(s)              (Identity 2)
        = arg min_{s>0} R(s).                 (Identity 3) □
We will now consider the computation of arg min_{s>0} R(s). For the sake of generality, we will study all functions of the form

    R(s) = ∏_{i=1}^{n} ‖f_i^{[s]}‖_∞^{U_i} / s^{V}

where U_1, …, U_n, V ∈ R_{>0}. Let s* = arg min_{s>0} R(s). We will show that s* can be computed in O(n·m + n·d) algebraic operations and comparisons.⁹ Our overall strategy will be to transform the problem into a new problem which is stated in terms of linear functions. More precisely, we will show that log(R(s)) can be viewed as the upper envelope of a set of linear functions. We will make use of a technique for efficiently computing upper envelopes known as the Convex Hull Trick to compute s* efficiently.
    log(R(s)) = ∑_{i=1}^{n} U_i · max_{e∈E(f_i)} ( (d_i − |e|)·t + log(|a_e|) ) − V·t,    where t = log(s).
Proof. We have

    log(R(s)) = log( ∏_{i=1}^{n} ‖f_i^{[s]}‖_∞^{U_i} / s^{V} )
             = ∑_{i=1}^{n} U_i · log(‖f_i^{[s]}‖_∞) − V · log(s).    (19)

Note that

    log(‖f_i^{[s]}‖_∞) = log( max_{e∈E(f_i)} s^{d_i−|e|} |a_e| )
                       = max_{e∈E(f_i)} log( s^{d_i−|e|} |a_e| )
                       = max_{e∈E(f_i)} ( (d_i − |e|)·log(s) + log(|a_e|) ).

Substituting t = log(s) into (19) yields

    log(R(s)) = ∑_{i=1}^{n} U_i · max_{e∈E(f_i)} ( (d_i − |e|)·t + log(|a_e|) ) − V·t. □

⁹ Recall that m = # monomials of F and d = ∑_{i=1}^{n} d_i.
Since the sum of upper envelopes is an upper envelope, log(R(s)) is an upper envelope. The upper envelope of a set of linear functions l_i(t) = β_i·t + ξ_i on t > 0 is represented by an ordered sequence (l_{i_1}, 0), (l_{i_2}, t_{i_1,i_2}), …, (l_{i_r}, t_{i_{r−1},i_r}) such that

    max_i l_i(t) = l_{i_1}(t)   for −∞ ≤ t ≤ t_{i_1,i_2}
                 = l_{i_2}(t)   for t_{i_1,i_2} ≤ t ≤ t_{i_2,i_3}
                   ⋮
                 = l_{i_r}(t)   for t_{i_{r−1},i_r} ≤ t ≤ ∞.

Given such a representation, finding the t which minimizes the upper envelope is trivial: we simply find the corner point t where the slopes of the lines in the upper envelope switch from negative to positive. In fact, this representation contains more information than is necessary to find the minimizer. We need only store the slopes of the functions which lie on the upper envelope, as well as the corner points.
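The minimizer of such an envelope is easy to approximate. The paper's algorithm uses the Convex Hull Trick to build the exact envelope representation in linear time; as a simpler executable illustration (ternary search is a stand-in here, not the paper's method), note that an upper envelope of lines is convex, so its minimum can be located by ternary search:

```python
def envelope_min(lines, lo=-50.0, hi=50.0, iters=200):
    """Minimize g(t) = max_i (beta_i * t + xi_i). g is convex, so ternary
    search homes in on the corner where the envelope slope changes sign."""
    g = lambda t: max(b * t + c for b, c in lines)
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if g(m1) < g(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

# Envelope of -2t, t+3, 3t-1: the slopes switch sign at the corner t = -1
t_star = envelope_min([(-2.0, 0.0), (1.0, 3.0), (3.0, -1.0)])
print(t_star)   # ~ -1.0
```

In the application, t = log(s), so the minimizer t* yields s* = e^{t*}.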
Hence we have the following initial strategy. For i = 1, …, n, we compute the upper envelope representation of

    U_i · max_{e∈E(f_i)} ( (d_i − |e|)·t + log(|a_e|) ).    (21)

The most efficient algorithm for computing upper envelope representations of linear functions is known as the Convex Hull Trick. It is not clear who deserves credit for this trick; it appears to be folklore, not published in the literature. See Convex hull trick (2017) for a concise summary. We can combine the upper envelope representations to find the representation of

    log(R(s)) = ∑_{i=1}^{n} U_i · max_{e∈E(f_i)} ( (d_i − |e|)·t + log(|a_e|) ) − V·t.
We will now discuss improvements to the above strategy. Note that in the above strategy we must
take logarithms. Recall that the current goal is to present an algorithm which produces the minimizer
in
O(n · m + n · d)
algebraic operations and comparisons. It turns out that it is a relatively trivial matter to modify the Convex Hull Trick algorithm to avoid logarithm computations for the current application. In the Convex Hull Trick algorithm, we compare corner points t_{i_1,i_2} and t_{i_3,i_4}. In our case, the corner points for the upper envelope of (21) are the points where

where

    d = deg(f)
    b_k = max_{e∈E(f), |e|=k} |a_e|.
Hence
We are now ready to present FindMinimizer (Algorithm 6). For each f_i, we first find the coefficient of largest magnitude for each total degree (motivated by Lemma 14). We then use the sub-algorithm UpperEnvelopeSlopes (Algorithm 5) to compute the slopes of the lines which lie on the upper envelope of log(‖f_i^{[s]}‖_∞), as well as the points s_{e_i,e_j} such that t_{e_i,e_j} = log(s_{e_i,e_j}) is a corner point of the upper envelope. UpperEnvelopeSlopes is a straightforward modification of the Convex Hull Trick algorithm. Once the upper envelope slopes are computed for each log(‖f_i^{[s]}‖_∞), we search for the smallest s such that the slope of log(R) is positive for t > log(s).
We are now ready to discuss the complexity of FindMinimizer.

    m = # monomials of F
    d = ∑_{i=1}^{n} d_i.

It is straightforward to see that UpperEnvelopeSlopes requires O(r) algebraic operations and comparisons when r linear functions are input. Since L_i has O(d_i) elements, line 7 requires O(d_i) algebraic operations and comparisons. Hence the total amount of work performed in Line 9 is O(∑_{i=1}^{n} d_i).

In Line 8 we compute

    M ← the list of triples (β, i, s), sorted in ascending order with respect to s

where (β, s) is an element of Z_i.

    O( n · ∑_{i=1}^{n} #E(f_i) + n · ∑_{i=1}^{n} d_i ) = O( n·m + n·d ). □
5. Performance
In this section, we discuss the experimental performance of the new bounds. We first repeat the observation of Remark 1: experimental evidence indicates that B_{New,k} is almost always larger for smaller k. This is unsurprising once we consider the derivation strategy in the previous section. Let k_1 ≤ k_2. We have

where the third inequality holds due to known inequalities on polynomial norms.
We have also observed that the improvement is usually very large for the new bounds, especially when the magnitudes of the roots are different from 1. To generate data points, we generated 100 random monic polynomials (or square Pham polynomial systems) with fixed degree and height (defined below) and calculated the average value of the improvement:

    B_{New,k}(f) / B_{MM,k}(f) = ( ‖f‖_k / H_k )^{d−1}    and    B_New(F) / B_DMM(F) = ( ∏_{i=1}^{n} ‖f_i‖_∞^{M_i} / H )^{D−1}.

Note that the improvement is independent of the discriminant for both new bounds. This observation allowed us to avoid many expensive computations when performing experiments (in particular, no resultants need be computed in the multivariate case).
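The improvement ratio can indeed be evaluated without any discriminant. The sketch below (for k = ∞, taking s̃_∞ = 0.06 from Example 3 as an assumed input) computes H_∞ and the ratio (‖f‖_∞/H_∞)^{d−1} for the running example:

```python
d = 4
a = {4: 1, 3: -60, 2: 1000, 1: -8000, 0: 0}
s = 0.06                                            # tilde s_inf from Example 3

# H_inf = R_inf(s) = ||f^[s]||_inf / s^(d/2 - 1/(d-1))
H = max(abs(c) * s ** (d - i) for i, c in a.items()) / s ** (d / 2 - 1 / (d - 1))
ratio = (max(abs(c) for c in a.values()) / H) ** (d - 1)
print(H, ratio)    # ~391.6 and ~8.5e3: B_New,inf improves on B_MM,inf by ~10^4
```

The ratio agrees with dividing the two bounds from Example 3 directly: 6.45 × 10^{−3} / 7.56 × 10^{−7} ≈ 8.5 × 10^3.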
We will measure the size of the coefficients of a monic univariate polynomial with the following
expression:
1
|ai | d−i
|| f || B = max d
0≤i ≤d−1
i
We will call this the B-Height (B for “binomial”). We can naturally extend the B-Height to Pham
polynomials (see Gonzalez-Vega and Gonzalez-Campos, 1999 for a precise definition) of degree d
with the following expression
$$\|f\|_B = \max_{e \,\in\, \mathrm{Support}(\text{trailing polynomial of } f)} \left( \frac{|a_e|}{\binom{d}{e}} \right)^{\frac{1}{d-|e|}}.$$
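In the univariate case the B-Height is straightforward to evaluate. The sketch below assumes the trailing coefficients are given as a plain list with the monic leading coefficient implicit (the Pham case is analogous, with multinomial coefficients in place of $\binom{d}{i}$):

```python
from math import comb

def b_height(coeffs):
    """B-Height of the monic polynomial x^d + a_{d-1} x^{d-1} + ... + a_0,
    given coeffs = [a_0, a_1, ..., a_{d-1}] (leading coefficient 1 implicit):
    the maximum over i of (|a_i| / C(d, i)) ** (1 / (d - i))."""
    d = len(coeffs)
    return max((abs(a) / comb(d, i)) ** (1.0 / (d - i))
               for i, a in enumerate(coeffs))
```

As a sanity check on the relation to root size: for $(x - r)^d$ we have $|a_i|/\binom{d}{i} = |r|^{d-i}$, so the B-Height is exactly $|r|$.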
It is well known that both height definitions above are linearly related to the size of the roots. To generate a polynomial (or polynomial system) with height $r_n/r_d$, we uniformly generated an integer $c$ in the range $(-r_n, r_d)$ for every trailing coefficient; for one randomly chosen coefficient, this integer was instead fixed at $r_n$. We then set
$$|a_e| = \left( \frac{r_n}{r_d} \right)^{d-|e|} \binom{d}{e}.$$
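The construction above can be sketched as follows in the univariate case. Applying the displayed formula with the drawn integer $c$ in place of $r_n$ for the non-pinned coefficients is our reading of the scheme, and the function name is ours:

```python
import random
from math import comb

def random_monic_with_height(d, rn, rd, seed=None):
    """Sample trailing coefficients a_0..a_{d-1} of a degree-d monic
    polynomial so its B-Height is exactly rn/rd (assumes rn >= rd).
    Sketch of the scheme described in the text: draw an integer c per
    coefficient, pin one randomly chosen c to rn, and set
    |a_i| = (|c|/rd)^(d-i) * C(d, i), with the sign taken from c."""
    rng = random.Random(seed)
    cs = [rng.randint(-rn, rd) for _ in range(d)]
    cs[rng.randrange(d)] = rn          # forces the maximum, hence the height
    coeffs = []
    for i, c in enumerate(cs):
        mag = (abs(c) / rd) ** (d - i) * comb(d, i)
        coeffs.append(mag if c >= 0 else -mag)
    return coeffs
```

Plugging the result into the B-Height formula gives $\max_i |c_i|/r_d = r_n/r_d$, since the pinned coefficient attains the maximum.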
It is well known that Mignotte polynomials have very small root separation (approximately $h^{-d}$). In Fig. 7, we plot the logarithm of the exact root separation, of $B_{New,2}$, and of $B_{MM,2}$ for certain values of $h$ and $d$. In the top plot, we fix $h$ to be 10 and vary the degree. In the bottom plot, we fix the degree to be 4 and vary $h$. We can see from the plots that the new bound is consistently a tighter lower bound on the root separation. Furthermore, in the top plot we see that as the degree increases, the improvement of the new bound over $B_{MM,2}$ increases.
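As a quick numerical illustration of how small this separation is, the sketch below locates the two clustered real roots of the classical Mignotte polynomial $x^d - 2(hx - 1)^2$ near $x = 1/h$ by bisection. This family and the bracketing intervals are our choice for illustration and need not match the exact family used in the experiments:

```python
def mignotte(h, d):
    """The classical Mignotte polynomial p(x) = x^d - 2*(h*x - 1)^2."""
    return lambda x: x ** d - 2.0 * (h * x - 1.0) ** 2

def bisect(f, lo, hi, iters=200):
    """Plain bisection; assumes f changes sign exactly once on [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (f(lo) > 0) == (f(mid) > 0):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

h, d = 10, 4
p = mignotte(h, d)
# p(1/h) = (1/h)^d > 0, while p is negative a little to either side,
# so two real roots straddle x = 1/h:
left = bisect(p, 0.5 / h, 1.0 / h)
right = bisect(p, 1.0 / h, 2.0 / h)
separation = right - left
```

For $h = 10$, $d = 4$ the separation is on the order of $10^{-3}$, already far smaller than the distances between the remaining roots.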
6. Conclusion
In this paper we presented two improved root separation bounds. The new bounds improve on the
previous bounds in two ways:
(1) The new bounds are usually significantly bigger (hence better) than the previous bounds.
(2) The new bounds scale correctly, unlike the previous bounds.
Fig. 7. Log plots of exact separation bound (red diamonds), $B_{New,2}$ (blue circles) and $B_{MM,2}$ (green boxes) for Mignotte polynomials. (For interpretation of the references to color in this figure, the reader is referred to the web version of this article.)
Crucially, the improved bounds are not harder to compute than the previous bounds. The improved
bounds meet an important challenge facing researchers in our field (Section 2). To the best of the
authors’ knowledge, the new improved bounds are significantly bigger than all known (efficiently
computable) root separation bounds and are the only known (efficiently computable) root separation
bounds which scale correctly.
Of course, there remains plenty of room for improvement in the new bounds. In particular, the
new multivariate bound presented in this paper is still quite pessimistic. One possible strategy for
improving the new bound is to modify the bound to account for sparsity. This strategy does present a
challenge: the derivation in this paper requires an understanding of the scaling behavior of the leading
coefficient of $T_{f_0}$ (see Lemmas 9 and 10). If $T_{f_0}$ is calculated by computing the sparse resultant instead of the univariate resultant, the formula for the leading coefficient is much more challenging to work
with (see D’Andrea and Sombra, 2015).
Acknowledgements
The authors would like to thank Carlos D’Andrea for his insights regarding the multivariate resul-
tant. Hoon Hong acknowledges the partial support from the grant US NSF 1319632. Elias Tsigaridas is
partially supported by GeoLMI (ANR 2011 BS03 011 06), HPAC (ANR-11-BS02-013), and an FP7
Marie Curie Career Integration Grant.
References
Bugeaud, Y., Mignotte, M., 2004. On the distance between roots of integer polynomials. Proc. Edinb. Math. Soc. 47 (3), 553–556.
Burnikel, C., Funke, S., Mehlhorn, K., Schirra, S., Schmitt, S., 2001. A separation bound for real algebraic expressions. In: Lecture Notes in Computer Science, pp. 254–265.
Collins, G., Horowitz, E., 1974. The minimum root separation of a polynomial. Math. Comput. 28 (126), 589–597.
Convex hull trick. http://wcipeg.com/wiki/Convex_hull_trick, 2017.
Cox, D., Little, J., O’Shea, D., 2005. Using Algebraic Geometry, 2nd edition. Springer.
D’Andrea, C., Sombra, M., 2015. A Poisson formula for the sparse resultant. Proc. Lond. Math. Soc. (3) 110, 932–964.
Emiris, I., Mourrain, B., Tsigaridas, E., 2010. The DMM bound: multivariate (aggregate) separation bounds. In: Proceedings of the
2010 International Symposium on Symbolic and Algebraic Computation, pp. 243–250.
Emiris, I., Tsigaridas, E., 2004. Comparing Real Algebraic Numbers of Small Degree. Lecture Notes in Computer Science, vol. 3221.
Gonzalez-Vega, L., Gonzalez-Campos, Neila, 1999. Simultaneous elimination by using several tools from real algebraic geometry.
J. Symb. Comput. 28, 89–103.
Herman, A., Hong, H., 2015. Quality of positive root bounds. J. Symb. Comput. 74, 592–602.
Li, C., Pion, S., Yap, C., 2004. Recent progress in exact geometric computation. J. Log. Algebraic Program. 64, 85–111.
Mahler, K., 1964. An inequality for the discriminant of a polynomial. Mich. Math. J. 11 (3), 257.
Mehlhorn, K., Ray, S., 2010. Faster algorithms for computing Hong's bound on absolute positivity. J. Symb. Comput. 45, 677–683.
Mignotte, M., 1974. An inequality about factors of polynomials. Math. Comput. 28 (128), 1153–1157.
Mignotte, M., 1995. On the distance between the roots of a polynomial. Appl. Algebra Eng. Commun. Comput. 6, 327–332.
Rump, S., 1979. Polynomial minimum root separation. Math. Comput. 33 (145), 327–336.
Schultz, C., Moller, R., 2005. Quantifier elimination over real closed fields in the context of applied description logics. In: Univ.,
Bibl. des Fachbereichs Informatik.
Strzebonski, A., Tsigaridas, E., 2011. Univariate real root isolation in an extension field and applications. hal-01248390.
Tsigaridas, E., Emiris, I., 2008. On the complexity of real root isolation using continued fractions. Theor. Comput. Sci. 392,
158–173.