
SIAM J. MATRIX ANAL. APPL.
Vol. 37, No. 2, pp. 676–694

© 2016 Society for Industrial and Applied Mathematics

HOW BAD ARE VANDERMONDE MATRICES?∗


Downloaded 11/01/16 to 121.49.106.94. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php

VICTOR Y. PAN†

Abstract. Estimation of the condition numbers of Vandermonde matrices, motivated by applications to interpolation and quadrature, can be traced back to at least the 1970s. Empirical study has consistently shown that Vandermonde matrices tend to be badly ill-conditioned, with a narrow class of exceptions, such as the matrices of the discrete Fourier transform (hereafter we use the acronym DFT). So far, however, formal support for this empirical observation has been limited to the matrices defined by a real set of knots. We prove that, more generally, any Vandermonde matrix of a large size is badly ill-conditioned unless its knots are more or less equally spaced on or about the circle {x : |x| = 1}. The matrices of the DFT are unitary up to scaling and therefore perfectly conditioned. They are defined by a cyclic sequence of knots, equally spaced on that circle, but we prove that even a slight modification of the knots into the so-called quasi-cyclic sequence on this circle defines badly ill-conditioned Vandermonde matrices. Likewise, we prove that the half-size leading (that is, northwestern) block of a large DFT matrix is badly ill-conditioned. (This result was motivated by an application to preconditioning of an input matrix for Gaussian elimination with no pivoting and for block Gaussian elimination.) Our analysis involves the Eckart–Young theorem, the Vandermonde-to-Cauchy transformation of matrix structure, our new inversion formula for a Cauchy matrix, and low-rank approximation of its large submatrices.

Key words. Vandermonde matrices, condition number, Cauchy matrices, Cauchy inversion
formulae, Vandermonde matrix inversion

AMS subject classifications. 15A12, 65F35, 47A65, 12Y05

DOI. 10.1137/15M1030170

1. Introduction. Vandermonde matrices frequently appear in various areas of modern computing (see, e.g., [BTG04], [MM92, paragraph 2.6.2, Vandermonde matrix, pp. 15–16], [MS58], [OO10], [P03], [PTVF07, section 2.8.1, Vandermonde matrices], [RD09], and [T66]). Our main subject, motivated by applications to interpolation and quadrature, is the estimation of the condition number, κ(V_s), of a nonsingular n × n Vandermonde matrix, V_s, where

(1) κ(V_s) = ||V_s|| ||V_s^{-1}||, V_s = (s_i^j)_{i,j=0}^{n−1},

where s = (s_i)_{i=0}^{n−1} denotes the vector of n distinct knots s_0, . . . , s_{n−1}, and ||M|| = ||M||_2
denotes the spectral norm of a matrix M . First, one would like to know whether
this number is reasonably bounded or large, that is, whether the matrix is well- or
ill-conditioned. In the latter case, one must perform computations with extended
precision in order to obtain an accurate solution to a linear system of equations with
this matrix (cf. [GL13]) and to the equivalent problem of polynomial interpolation.
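The two regimes contrasted throughout the paper are easy to observe directly. The following sketch (our own illustration using NumPy, not part of the paper) computes κ(V_s) for real knots and for knots at the roots of unity:

```python
import numpy as np

def vandermonde(knots):
    # V_s = (s_i^j), rows indexed by the knots, as in (1)
    return np.vander(knots, increasing=True)

# Real knots: the condition number grows rapidly with n.
for n in (5, 10, 15):
    s = np.linspace(0.1, 1.0, n)                 # n distinct real knots
    print(n, np.linalg.cond(vandermonde(s)))

# Knots equally spaced on the unit circle: V_s is the DFT matrix,
# unitary up to scaling, so its condition number is 1.
s = np.exp(2j * np.pi * np.arange(16) / 16)
print(np.linalg.cond(vandermonde(s)))
```

The particular knot sets here are hypothetical examples; any real knot set exhibits the same exponential growth, per [GI88], [G90], and [T94].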
It has been proven in [GI88], [G90], and [T94] that the condition number κ(Vs )
is exponential in n if all knots s0 , s1 , . . . , sn−1 are real. This means that such a
matrix is badly ill-conditioned already for moderately large integers n, but empirically
almost all Vandermonde matrices of large, or even reasonably large, sizes are badly
∗Received by the editors July 10, 2015; accepted for publication (in revised form) February 1, 2016 by J. L. Barlow; published electronically May 25, 2016. This work was supported by NSF grant CCF–1116736 and PSC CUNY Award 67699-00 45.
http://www.siam.org/journals/simax/37-2/M103017.html
†Department of Mathematics and Computer Science, Lehman College of the City University of New York, Bronx, NY 10468, and Ph.D. Programs in Mathematics and Computer Science, The Graduate Center of the City University of New York, New York, NY 10036 (victor.pan@lehman.cuny.edu, http://comet.lehman.cuny.edu/vpan/).

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.



ill-conditioned as well, with a narrow class of exceptions, such as the Vandermonde matrices defined by cyclic sequences of knots, equally spaced on the circle C(0, 1) = {x : |x| = 1} on the complex plane. Such matrices are unitary up to scaling by 1/√n and thus are perfectly conditioned, that is, have the minimum condition number 1.
A celebrated example is the matrix of the discrete Fourier transform (DFT),

(2) Ω = (ω_n^{ij})_{i,j=0}^{n−1} for ω_q = exp((2π/q)√−1),

denoting a primitive qth root of 1.
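As a quick numerical sanity check (our own illustration, using NumPy), the matrix Ω of (2) is indeed unitary up to scaling by 1/√n and has condition number 1:

```python
import numpy as np

n = 8
omega = np.exp(2j * np.pi / n)                   # primitive nth root of 1
i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
Omega = omega ** (i * j)                         # the DFT matrix of (2)

U = Omega / np.sqrt(n)                           # scaled DFT matrix
assert np.allclose(U @ U.conj().T, np.eye(n))    # unitary up to scaling
assert abs(np.linalg.cond(Omega) - 1.0) < 1e-12  # perfectly conditioned
```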


However, a slight modification of the knots into the so-called quasi-cyclic sequence
can make the matrices badly ill-conditioned. This was empirically observed in [G90,
section IV], but formal support of that observation turned out to be elusive. The
critical step was the estimation of the norm of the inverse matrix, ||Vs−1 ||. The known
expressions for the entries of this matrix are not explicit enough. In particular, write
(3) s(x) = s_m(x) = ∏_{i=0}^{m−1} (x − s_i)

and

s_i(x) = s(x)/((x − s_i) s'(s_i)) = Σ_{j=0}^{n−1} s_{i,j} x^j for m = n,

and recall from [G90, eq. (2.3)] that

(4) V_s^{-1} = (s_{i,j})_{i,j=0}^{n−1}, and so ||V_s^{-1}|| ≥ max_{i,j=0}^{n−1} |s_{i,j}|

(see other expressions for the inverse matrix Vs−1 in [MS58], [M03], [T66], [SEMC11],
and the references therein). However, the known lower and upper bounds on the
condition number of a Vandermonde matrix, as Gautschi wrote in [G90, section IV],
“unfortunately, are too far apart to give much useful information”, and so he “computed the condition number . . . numerically”.
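The entries s_{i,j} of (4) are the coefficients of the Lagrange basis polynomials s_i(x) of (3), so the norm bound is straightforward to check numerically. A NumPy sketch of ours (with the row-by-knot convention used here the inverse comes out as the transpose of (s_{i,j}), which does not affect the norm bound):

```python
import numpy as np

def lagrange_coeffs(s):
    # Row i holds the coefficients s_{i,j} of s_i(x) = s(x) / ((x - s_i) s'(s_i))
    n = len(s)
    S = np.empty((n, n))
    for i in range(n):
        others = np.delete(s, i)
        num = np.poly(others)                    # coeffs of prod_{m != i} (x - s_m), highest power first
        S[i] = num[::-1] / np.prod(s[i] - others)  # divide by s'(s_i); reorder to increasing powers
    return S

s = np.array([0.0, 0.5, -0.5, 0.9])
V = np.vander(s, increasing=True)
S = lagrange_coeffs(s)
assert np.allclose(np.linalg.inv(V), S.T)        # (4), up to transposition
assert np.linalg.norm(S, 2) >= np.abs(S).max()   # ||V_s^{-1}|| >= max |s_{i,j}|
```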
Our new lower bounds enable us to formally support the cited empirical observa-
tions for a very general class of Vandermonde matrices. Our results in sections 4 and
8 indicate that the condition number of an n × n Vandermonde matrix is exponential
in n unless its knots are more or less equally spaced on or about the unit circle C(0, 1).
Our study covers the matrices defined with the quasi-cyclic sequence of knots
as a special case (see section 5), and we further observe in section 6 that the k × k
blocks of the n × n matrix of the DFT for n = 2k are badly ill-conditioned as long as
the integers k and n are sufficiently large, even though DFT matrices themselves are
perfectly conditioned.
In order to prove our lower bounds on the condition number κ(Vs ), we shift from
Vandermonde matrices Vs of (1) to Cauchy matrices,
(5) C_{s,t} = (1/(s_i − t_j))_{i,j=0}^{n−1},
defined by 2n distinct knots s0 , . . . , sn−1 ; t0 , . . . , tn−1 . The matrices of these two
classes are much different, and it is not obvious that their studies can be linked
together. Unlike a Vandermonde matrix Vs , a Cauchy matrix Cs,t has rational en-
tries, every one of its submatrices is a Cauchy matrix as well, and the Cauchy struc-
ture is invariant in column interchange as well as in scaling and shifting all entries

by the same scalar, that is, aC_{as,at} = C_{s,t} for a ≠ 0 and C_{s+ae,t+ae} = C_{s,t} for e = (1, . . . , 1)^T. Moreover, some powerful techniques have been developed in [GR87], [CGR88], [MRT05], [R06], [B10], [XXG12], [XXCB14], [P15], and [Pa] for the approximation of reasonably large submatrices of Cauchy matrices of a large class by lower-rank matrices.
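The invariance properties just listed are immediate to verify numerically (a NumPy sketch of ours; the knot values are hypothetical):

```python
import numpy as np

def cauchy(s, t):
    # C_{s,t} = (1 / (s_i - t_j)), as in (5)
    return 1.0 / (s[:, None] - t[None, :])

rng = np.random.default_rng(0)
s = rng.standard_normal(5)
t = rng.standard_normal(5) + 10.0                # keep the two knot sets disjoint
C = cauchy(s, t)

a = 3.0
assert np.allclose(a * cauchy(a * s, a * t), C)  # a C_{as,at} = C_{s,t}
assert np.allclose(cauchy(s + a, t + a), C)      # C_{s+ae,t+ae} = C_{s,t}
assert np.allclose(C[1:4, ::2], cauchy(s[1:4], t[::2]))  # every submatrix is Cauchy
```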
Nevertheless there exists a simple Vandermonde–Cauchy link, and this very link
has become the basis for not only our progress in this paper but also the work in [P15]
and [Pa]. (The paper [P90] revealed a more general link among the matrix structures
of Vandermonde, Cauchy, Toeplitz, and Hankel types, based on their displacement
representation. The paper also pointed out potential algorithmic applications, and
this link has indeed become the springboard for devising the highly efficient algo-
rithms of the papers [GKO95], [HB97], [HB98], [G98], [MRT05], [R06], [AR10], [P10],
[XXG12], [XXCB14], [P15], and [Pa].)
Restatement of our original task in terms of Cauchy matrices has opened new
opportunities for our study of Vandermonde matrices, and we obtained the desired
progress by extending the known formula for the Cauchy determinant to the following
simple expression for all entries of the inverse of a Cauchy matrix Cs,t :

(6) (C_{s,t}^{-1})_{i',j'} = ((−1)^n/(t_{j'} − s_{i'})) s(t_{j'}) t(s_{i'}) for all pairs (i', j').

This expression involves the values of the polynomial s(x) of (3) and the polynomial
(7) t(x) = t_n(x) = ∏_{j=0}^{n−1} (x − t_j).

We apply formula (6) in order to prove exponential lower bounds on the norms of the
inverses and the condition numbers of Cauchy matrices of a large class, and then we
extend these bounds to Vandermonde matrices. We also deduce some alternative ex-
ponential lower bounds by applying the known techniques for low-rank approximation
of certain large submatrices of Cauchy matrices of a large class.
In section 3, by applying the Eckart–Young theorem, we more readily prove the
exponential growth of the condition numbers of Vandermonde matrices of a narrower
class, namely, those having at least a single knot outside the disc D(0, 1) = {x : |x| ≤
1} or having many knots lying inside this disc and not close to its boundary circle
C(0, 1). These simple results can be of independent interest, and they provide some
initial insight into the subject.
We organize our presentation as follows. In section 2 we recall some basic defini-
tions and auxiliary results. In section 4 we first deduce our basic lower bounds on the
condition number κ(Vs ) by applying a link between Vandermonde and Cauchy matri-
ces, and then we informally interpret these bounds in terms of some restrictions on the
knot set {s0 , s1 , . . . , sn−1 } of any well-conditioned Vandermonde matrix. We prove
exponential growth of the condition number of (a) Vandermonde matrices defined by
the quasi-cyclic sequence of knots in section 5 and (b) the half-size leading blocks of
the DFT matrices in section 6. In section 7 we deduce an exponential lower bound
on the condition numbers of some Cauchy matrices closely linked to Vandermonde
matrices. In section 8 we extend this bound to an alternative exponential lower bound
on the condition numbers κ(Vs ) of Vandermonde matrices Vs . This bound is weaker
than the bounds of section 4 for the matrix classes of sections 5 and 6, but its domain
of applications is shown more explicitly. We devote section 9 to our numerical tests.

In section 10 we briefly summarize our results. In section 11 we list some natural extensions of our study.

2. Some definitions and auxiliary results. Next we introduce some definitions and recall some basic results (cf. [GL13]).
• σ_j(M) denotes the jth largest singular value of an n × n matrix M.
• ||M|| = ||M||_2 = σ_1(M) is its spectral norm.
• κ(M) = σ_1(M)/σ_ρ(M) is its condition number provided that ρ = rank(M).
• σ_ρ(M) = 1/||M^{-1}|| and κ(M) = ||M|| ||M^{-1}|| if ρ = n, that is, if M is a nonsingular matrix.
• (m_{i,j})_{i,j=1}^{k,l} is a k × l matrix with (i, j)th entries m_{i,j} for i = 1, 2, . . . , k and j = 1, 2, . . . , l.
Theorem 2.1 (the Eckart–Young theorem). σ_j(M) is equal to the distance ||M − M_{j−1}|| between M and its closest approximation by a matrix M_{j−1} of rank at most j − 1.
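The closest approximation of rank at most j − 1 is obtained by truncating the SVD, at distance exactly σ_j(M). A quick NumPy check (our illustration, not part of the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((8, 8))
U, sigma, Vt = np.linalg.svd(M)

for j in (1, 3, 5):                              # sigma_j of the paper is sigma[j - 1] (0-based)
    # Best approximation of rank <= j - 1: keep the j - 1 largest singular triplets.
    Mj = (U[:, : j - 1] * sigma[: j - 1]) @ Vt[: j - 1]
    assert np.isclose(np.linalg.norm(M - Mj, 2), sigma[j - 1])
```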
Corollary 2.1.1. Suppose that a matrix B has been obtained by appending k
new rows or k new columns to a matrix A. Then σj (A) ≥ σj+k (B) for all j.
Proposition 2.1.1. For the DFT matrix Ω and ω_n of (2), it holds that
(i) nΩ^{-1} = Ω^H = (ω_n^{−ij})_{i,j=0}^{n−1} and
(ii) ||Ω|| = ||Ω^H|| = √n.

3. The condition of Vandermonde matrices: Simple bounds. For a nonsingular Vandermonde matrix V_s = (s_i^j)_{i,j=0}^{n−1} of (1), write

(8) s_+ = max_{i=0}^{n−1} |s_i|, |V_s| = max{1, s_+^{n−1}}.

Next we prove that the condition number κ(V_s) is exponential in n if s_+ ≥ ν for a constant ν > 1 (see Theorem 3.1(i)) and is exponential in k if 1/|s_i| ≥ ν for at least k knots s_i, i = i_1, . . . , i_k, and for a constant ν > 1 (see Corollary 3.2.1).
Theorem 3.1. For a Vandermonde matrix V_s of (1), it holds that
(i) |V_s| ≤ ||V_s|| ≤ n |V_s| and
(ii) ||V_s|| ≥ √n.

Proof. The theorem readily follows from [GL13, eqs. (2.3.8) and (2.3.11)].
Theorem 3.2. For a Vandermonde matrix V_s of (1), it holds that
(i) σ_n(V_s) ≤ √n and
(ii) σ_n(V_s) ≤ ν^{1−k} √k max{k, ν/(ν − 1)} if 1/|s_i| ≥ ν > 1 for i = 0, 1, . . . , k − 1, k ≥ 1.

Proof. First, turn the matrix V_s into a rank deficient matrix by setting to 0 the entries of its first column. Then note that the norm of this perturbation equals √n and deduce part (i) by applying Theorem 2.1.

Likewise, turn the matrix V_s into a rank deficient matrix by setting to 0 its entries that lie in its first k rows, but not the entries in its first k − 1 columns. This perturbation, diag(s_i^{k−1})_{i=0}^{k−1} (s_i^j)_{i,j=0}^{k−1,n−k}, has column norm, ||·||_1, at most ν^{1−k} k and has row norm, ||·||_∞, at most ν^{1−k} Σ_{i=k}^{n} ν^{k−i} ≤ ν^{2−k}/(ν − 1). So its spectral norm, which is at most the geometric mean of these two norms, is at most ν^{1−k} √k max{k, ν/(ν − 1)}. Hence Theorem 2.1 implies part (ii).

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.


Corollary 3.2.1. For a Vandermonde matrix V_s of (1), keep defining s_+ and |V_s| by (8). Then
(i) κ(V_s) ≥ max{1, s_+^{n−1}}/√n (cf. Table 1) and
(ii) κ(V_s) ≥ k^{−0.5} |V_s| ν^{k−1}/ max{k, ν/(ν − 1)} if 1/|s_i| ≥ ν > 1 for i = 0, 1, . . . , k − 1 (cf. Table 2).

Proof. Combine Theorems 3.1 and 3.2 with (8).

Remark 3.3. Observe that ||V_s||_∞ = (s_+^n − 1)/(s_+ − 1) for s_+ ≠ 1, and so ||V_s|| ≥ (s_+^n − 1)/((s_+ − 1)√n) (cf. [GL13, eq. (2.3.7)]). This enables us to slightly strengthen the bounds in part (i) of Theorem 3.1 and in Corollary 3.2.1 implied by (8).
Corollary 3.2.1 shows that an n×n Vandermonde matrix Vs is badly ill-conditioned
already for a moderately large integer n if s+ ≥ ν as well as for a reasonably large
integer k if 1/|si | ≥ ν, ν is a constant exceeding 1, and i = 0, 1, . . . , k − 1. Therefore,
a Vandermonde matrix V_s of a large size is badly ill-conditioned unless all knots
s_0, . . . , s_{n−1} lie in or near the disc D(0, 1) = {x : |x| ≤ 1} and lie mostly on or near
its boundary circle C(0, 1). In the following we prove even stronger restrictions on
the knots of a well-conditioned Vandermonde matrix.
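Corollary 3.2.1(ii) is easy to probe numerically. With the hypothetical knot choice s_i = 2^{−i−1} (so that 1/|s_i| ≥ ν = 2 for every knot, k = n, and |V_s| = 1), the computed condition number exceeds the proven lower bound by many orders of magnitude (a NumPy sketch of ours):

```python
import numpy as np

nu = 2.0
for n in (6, 8, 10):
    s = nu ** -(np.arange(n) + 1.0)              # 1/|s_i| >= nu for all n knots, so k = n
    V = np.vander(s, increasing=True)
    # Corollary 3.2.1(ii) with |V_s| = max{1, s_+^{n-1}} = 1:
    bound = nu ** (n - 1) / (np.sqrt(n) * max(n, nu / (nu - 1.0)))
    assert np.linalg.cond(V) >= bound
    print(n, np.linalg.cond(V), bound)
```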
4. The condition estimates via the inversion formulae. Let Mi,j denote
the (i, j)th minor of a nonsingular n × n matrix M , that is, the determinant of its
submatrix, obtained by deleting the ith row and jth column of the matrix M . Then

(9) M^{-1} = ((−1)^{i+j} M_{j,i} / det(M))_{i,j}.

Next, we deduce expressions (6) for the inverse of a nonsingular Cauchy matrix
Cs,t of (5) by combining (9) with the following well-known formula:
(10) det(C_{s,t}) = ∏_{i<j} (s_j − s_i)(t_i − t_j) / ∏_{i,j} (s_i − t_j).
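Formula (10) is the classical Cauchy determinant, and a direct numerical check is immediate (our illustration, with NumPy and hypothetical random knots):

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(2)
n = 4
s = rng.uniform(0.0, 1.0, n)
t = rng.uniform(2.0, 3.0, n)                     # 2n distinct knots

C = 1.0 / (s[:, None] - t[None, :])
num = np.prod([(s[j] - s[i]) * (t[i] - t[j]) for i, j in combinations(range(n), 2)])
den = np.prod(s[:, None] - t[None, :])
assert np.isclose(np.linalg.det(C), num / den, rtol=1e-8, atol=0.0)  # formula (10)
```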

Since every square submatrix of a nonsingular Cauchy matrix is again a nonsin-


gular Cauchy matrix, we can extend (10) to its minors, combine these extensions with
(10) itself and with (9), and deduce (after some simplifications) that
(11) (C_{s,t}^{-1})_{i',j'} = ((−1)^{n^2+i'+j'}/(s_{i'} − t_{j'})) ∏_i (s_i − t_{i'}) ∏_j (s_{j'} − t_j) / ( ∏_{i<j'} (s_{j'} − s_i)(t_i − t_{j'}) ∏_{j>i'} (s_j − s_{i'})(t_{i'} − t_j) )

for every pair (i', j'). In particular, note that n^2 + n − 1 is an odd number and obtain

(12) (C_{s,t}^{-1})_{n−1,0} = (1/(t_0 − s_{n−1})) ∏_i (s_i − t_0) ∏_j (s_{n−1} − t_j) = ((−1)^n/(t_0 − s_{n−1})) s(t_0) t(s_{n−1})

for the polynomials s(x) of (3) and t(x) of (7). By re-enumerating the knots si and
tj , we can turn any pair of them into the pair of sn−1 and t0 and thus arrive at
expression (6).
We are going to combine this expression with the following equations (cf. [P15,
eq. (5)]), which link Vandermonde and Cauchy matrices:

(13) V_s = D C_{s,t} D_1^{-1} V_t and V_s^{-1} = V_t^{-1} D_1 C_{s,t}^{-1} D^{-1}.

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.


Here

(14) D = diag(t(s_i))_{i=0}^{n−1} and D_1 = diag(t'(t_j))_{j=0}^{n−1}

for the polynomial t(x) of (7).
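Equation (13) with D and D_1 of (14) amounts to Lagrange interpolation of x^j at the knots t_0, . . . , t_{n−1}, and can be verified directly (a NumPy check of ours with hypothetical random knots):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6
s = rng.standard_normal(n)
t = rng.standard_normal(n) + 5.0                 # 2n distinct knots

Vs = np.vander(s, increasing=True)
Vt = np.vander(t, increasing=True)
C = 1.0 / (s[:, None] - t[None, :])              # C_{s,t} of (5)

t_at_s = np.prod(s[:, None] - t[None, :], axis=1)  # t(s_i), with t(x) of (7)
t_prime = np.array([np.prod(t[j] - np.delete(t, j)) for j in range(n)])  # t'(t_j)

D, D1 = np.diag(t_at_s), np.diag(t_prime)
assert np.allclose(Vs, D @ C @ np.linalg.inv(D1) @ Vt)   # the first identity of (13)
```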


Equations (12)–(14) combined can be considered inversion formulae for a nonsingular Vandermonde matrix V_s, where t_0, . . . , t_{n−1} are n parameters of our choice. Let us express these parameters and, consequently, the inverse matrix V_s^{-1}, through a single complex nonzero parameter f. By substituting f ω_n^j for t_j into the expression C_{s,t} = (1/(s_i − t_j))_{i,j=0}^{n−1} for j = 0, . . . , n − 1, define the subclass of the CV matrices, C_{s,f} = (1/(s_i − f ω_n^j))_{i,j=0}^{n−1}, which are Cauchy matrices linked to Vandermonde matrices.

Substitute C_{s,t} = C_{s,f}, V_t = V_f = Ω diag(f^j)_{j=0}^{n−1}, and t(x) = x^n − f^n into (13) and obtain

V_s = D_f C_{s,f} D_{1,f}^{-1} Ω diag(f^j)_{j=0}^{n−1}

for

D_f = diag(s_i^n − f^n)_{i=0}^{n−1} and D_{1,f} = n diag((f ω_n^j)^{n−1})_{j=0}^{n−1}.
Combine these equations with Proposition 2.1.1, substitute ω_n^{(n−1)j} = ω_n^{−j} for all j, and obtain

(15) V_s^{-1} = diag(f^{n−1−j})_{j=0}^{n−1} Ω^H diag(ω_n^{−j})_{j=0}^{n−1} C_{s,f}^{-1} diag(1/(s_i^n − f^n))_{i=0}^{n−1},

where the matrix C_{s,f}^{-1} is defined by (11) for t_j = f ω_n^j and j = 0, . . . , n − 1.

By combining (11), (15), and t_j = f ω_n^j for |f| = 1 and j = 0, . . . , n − 1, we obtain inversion formulae for a nonsingular Vandermonde matrix V_s depending only on the single nonzero parameter f.
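Since each step above is an exact matrix identity, (15) can be checked numerically once C_{s,f}^{-1} is computed by any means. A NumPy sketch of ours (the knots below are a hypothetical well-separated choice inside the unit circle):

```python
import numpy as np

n = 8
f = np.exp(0.3j)                                 # any f with |f| = 1 off the knot set
omega = np.exp(2j * np.pi / n)
i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
Omega = omega ** (i * j)                         # the DFT matrix of (2)

s = 0.9 * np.exp(2j * np.pi * (np.arange(n) + 0.5) / n)  # distinct knots, |s_i| = 0.9
t = f * omega ** np.arange(n)                    # t_j = f w_n^j
C = 1.0 / (s[:, None] - t[None, :])              # the CV matrix C_{s,f}

jj = np.arange(n)
rhs = (np.diag(f ** (n - 1 - jj)) @ Omega.conj().T @ np.diag(omega ** -jj)
       @ np.linalg.inv(C) @ np.diag(1.0 / (s ** n - f ** n)))
Vs = np.vander(s, increasing=True)
assert np.allclose(np.linalg.inv(Vs), rhs)       # equation (15)
```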
Furthermore, we can substitute t(x) = x^n − f^n into (12) and express any entry (C_{s,f}^{-1})_{i',j'} of the matrix C_{s,f}^{-1} as follows:

(16) (C_{s,f}^{-1})_{i',j'} = (−1)^n s(t_{j'})(s_{i'}^n − f^n)/(s_{i'} − t_{j'}).

Equations (15) and (16), combined for all pairs i' and j', provide a simplified representation of the inverses of CV and Vandermonde matrices. Next, we apply this representation in order to estimate the condition number of a Vandermonde matrix.
Theorem 4.1. Let V_s be a nonsingular n × n Vandermonde matrix of (1). Fix a scalar f such that |f| = 1 and s_i ≠ f ω_n^j for all pairs of i and j, thus implying that the CV matrix C_{s,f} is well-defined and nonsingular. Then

κ(V_s) ≥ √n ||C_{s,f}^{-1}|| / max_{i=0}^{n−1} |s_i^n − f^n|.

Proof. Apply (15) and note that the matrix diag(f^{n−1−j})_{j=0}^{n−1} Ω^H diag(ω_n^{−j})_{j=0}^{n−1} is unitary up to scaling by 1/√n. Therefore, ||V_s^{-1}|| ≥ ||C_{s,f}^{-1} diag(1/(s_i^n − f^n))_{i=0}^{n−1}||, and consequently ||V_s^{-1}|| ≥ ||C_{s,f}^{-1}|| / max_{i=0}^{n−1} |s_i^n − f^n|. Combine this estimate with the bound κ(V_s) ≥ √n ||V_s^{-1}|| of Theorem 3.1.
Corollary 4.1.1. Let V_s be a nonsingular n × n Vandermonde matrix of (1) with s_+ ≤ 1 for s_+ of (8). Then

κ(V_s) ≥ 0.5 √n max_{|f|=1} |s(f)| for s(x) of (3).

Proof. Suppose that the matrix C_{s,f} is well-defined and nonsingular and apply Theorem 4.1.

Clearly, ||C_{s,f}^{-1}|| ≥ |(C_{s,f}^{-1})_{i',j'}| for all pairs i' and j'. Substitute this inequality into Theorem 4.1 and obtain κ(V_s) ≥ √n max_{i',j'} |(C_{s,f}^{-1})_{i',j'}| / max_{i=0}^{n−1} |s_i^n − f^n|.

Substitute (16) for a complex number f and integers i' and j' that satisfy the equations t_{j'} = f and |f| = 1 and maximize the value |s_{i'}^n − f^n|. (We keep the assumption that the matrix C_{s,f} is well-defined and nonsingular because otherwise we could fix the problem by means of an infinitesimal perturbation of the parameter f.) Obtain that κ(V_s) ≥ √n |s(f)|/|s_{i'} − f|.

Observe that |s_{i'} − f| ≤ 2 because |f| = 1 and s_+ ≤ 1 by assumption. Substitute this bound on |s_{i'} − f| and obtain the corollary.
Corollary 4.1.2. Assume a nonsingular n × n Vandermonde matrix V_s with s_+ ≤ 1 for s_+ of (8). Write v_i = s(ω_n^i), i = 0, . . . , n − 1, and v = (v_i)_{i=0}^{n−1}. Then

κ(V_s) ≥ 0.5 √n max_{|f|=1} |s(f)| ≥ 0.5 √n ||v||_∞ ≥ ||v||/2.

Proof. Note that max_{|f|=1} |s(f)| ≥ ||v||_∞ ≥ ||v||/√n.
Part (i) of Corollary 3.2.1 implies that a Vandermonde matrix V_s of a large size is badly ill-conditioned if s_+ ≥ ν for a constant ν exceeding 1. Corollary 4.1.2 implies that even if s_+ ≤ 1, this property of the matrix V_s still holds as long as the knot set {s_0, . . . , s_{n−1}} is sparse in some disc that covers a reasonably thick neighborhood of a reasonably long arc of the circle C(0, 1). Thus, informally, the knots of a well-conditioned Vandermonde matrix must be more or less equally spaced on or about the circle C(0, 1).
In sections 7 and 8, by applying Theorem 4.1 and low-rank approximation of CV
matrices, we obtain an alternative proof of this property for the inputs of large sizes.
In sections 5 and 6, we estimate exponentially growing condition numbers κ(Vs ) in
two specific cases when the knots si deviate from equally spaced distributions on the
unit circle C(0, 1).
We conclude this section by estimating the condition number κ(Vs ) in terms of
the coefficients of the polynomial s(x) rather than its values.

Corollary 4.1.3. Under the assumptions of Corollary 4.1.2, write s(x) = Σ_{i=0}^{n} f_i x^i, f_n = 1, and f = (f_i)_{i=0}^{n−1}. Then κ(V_s) ≥ 0.5 √n (||f|| − 1).

Proof. Note that v = Ωf + e for the vector v of Corollary 4.1.2, the DFT matrix Ω of (2), and e = (1)_{i=1}^{n} = (1, 1, . . . , 1)^T. Therefore Ω^{-1} v = f + Ω^{-1} e and, consequently, √n Ω^{-1} v = √n f + √n Ω^{-1} e. Recall that the matrix √n Ω^{-1} is unitary, and so ||√n Ω^{-1} v|| = ||v|| and ||√n Ω^{-1} e|| = ||e|| = √n. It follows that ||v|| ≥ √n ||f|| − ||e|| = √n (||f|| − 1). Substitute this bound into Corollary 4.1.2.
For a fixed set {s0 , . . . , sn−1 } one can compute the coefficients f0 , . . . , fn−1 of s(x)
by using O(n log2 (n)) arithmetic operations (cf. [P01, section 2.4]), and then one can
bound the condition number κ(Vs ) by applying Corollary 4.1.3.
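A direct (quadratic-time) way to obtain the coefficients of s(x) from a knot set is NumPy's poly routine (our illustration, with a hypothetical knot set; the O(n log^2 n) algorithm cited above is a fast refinement of the same computation):

```python
import numpy as np

s_knots = np.array([0.5, -0.5, 0.25j, -0.25j])   # an example knot set
coeffs = np.poly(s_knots)                        # s(x) = prod (x - s_i), highest power first
f = coeffs[::-1]                                 # f_0, ..., f_n, with f_n = 1

assert f[-1] == 1.0
x = 0.7 + 0.2j                                   # spot-check the coefficients
assert np.isclose(np.polyval(coeffs, x), np.prod(x - s_knots))
```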
5. The case of the quasi-cyclic sequence of knots. If all of the knots
s0 , . . . , sn−1
(a) lie on the unit circle C(0, 1) and
(b) are equally spaced on it,
then we arrive at perfectly conditioned Vandermonde matrices, such as the DFT ma-
trix Ω, satisfying κ(Ω) = 1. A rather minor deviation from assumption (b), however,



can make the matrix ill-conditioned, as [G90] has shown empirically, and we formally deduce this from Theorem 4.1.

Example 5.0.1 (the quasi-cyclic sequence of knots). In applications to interpolation and quadrature one seeks a sequence of n × n well-conditioned Vandermonde matrices for n = 2, 3, . . . , such that all the knots s_i are reused recursively as n increases (cf. [G90, section IV]). For n = 2^k we choose the n × n matrix of DFT at n points, defined by the equally spaced knots s_i = exp(2πi√−1/n) for i = 0, 1, . . . , n − 1. For 2^k < n ≤ 2^{k+1} we keep all the 2^k th roots of 1 among the n knots and select the remaining n − 2^k knots among the 2^{k+1} st roots of 1. It turns out that for n near 3 · 2^k the condition number κ(V_s) of the resulting matrix V_s strongly depends on this selection. In particular (see Theorem 5.2), κ(V_s) < √(2n) for an n × n Vandermonde matrix V_s whose remaining n − 2^k knots we define later in this section by (23) and (24).

The condition numbers of n × n Vandermonde matrices V_s behave, however, in a rather opposite way if we select the remaining 2^{k+1} st roots of 1 in succession in counterclockwise order on the circle C(0, 1), e.g., if we select the values ω_{2^{k+1}}^{2i+1} for i = 0, 1, . . . , n − 2^k − 1.

By including also all of the 2^k th roots of 1, we arrive at a sequence of knots s_i = exp(2πf_i√−1), i = 0, 1, . . . , n − 1, called quasi-cyclic in [G90, section IV] and defined by the following sequence of fractions f_i, i = 0, 1, . . . , n − 1:

0, 1/2, 1/4, 3/4, 1/8, 3/8, 5/8, 7/8, 1/16, 3/16, 5/16, 7/16, 9/16, 11/16, 13/16, 15/16, . . . .

For n = 2^k as well as n = 2^{k+1} the first n knots of the sequence define the unitary matrices of the DFT, but for n = 3 · 2^k and k ≥ 4 the condition numbers of the resulting matrices V_s grow large as the values k increase, so that the condition number κ(V_s) exceeds 1,000,000 already for k = 4, n = 48 (cf. [G90, section IV]). This has remained an empirical observation since 1990, but next we deduce from Corollary 4.1.2 that log(κ(V_s)) for n = 3 · 2^{k−1} grows at least proportionally to 2^k, and so κ(V_s) grows exponentially in n.
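The quasi-cyclic knot configuration for n = 3q is easy to reproduce, and the computed condition number at n = 48 (a NumPy sketch of ours) agrees in magnitude with the report in [G90]:

```python
import numpy as np

q = 16                                           # n = 3q = 48, the case k = 4 of [G90]
n = 3 * q
cyclic = np.exp(2j * np.pi * np.arange(2 * q) / (2 * q))       # all (2q)th roots of 1
extra = np.exp(2j * np.pi * (2 * np.arange(q) + 1) / (4 * q))  # q odd (4q)th roots, upper half circle
s = np.concatenate([cyclic, extra])

V = np.vander(s, increasing=True)
print(np.linalg.cond(V))                         # large; [G90] reports > 1,000,000
```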
Theorem 5.1. Let V_q denote the n × n Vandermonde matrix defined by the quasi-cyclic sequence of knots on the unit circle C(0, 1) for n = 3q, q = 2^{k−1}, and a positive integer k. Then κ(V_q) ≥ 2^{q/2} √n.

Proof. For n = 3q the sequence has the 2q knots s_i = ω_{2q}^i, i = 0, 1, . . . , 2q − 1, equally spaced on the circle C(0, 1), and the q additional knots s_i = ω_{4q}^{2i+1}, i = 0, 1, . . . , q − 1, equally spaced on the upper half of that circle, such that s(x) = (x^{2q} − 1) p_q(x) for

(17) p_q(x) = ∏_{i=0}^{q−1} (x − ω_{4q}^{2i+1}).

Corollary 4.1.2 implies that κ(V_q) ≥ |s(f)| √n/2, where we can choose any complex number f such that |f| = 1. Choose f = −√−1 ω_{4q} and observe that f^{2q} − 1 = −2, and so in this case,

(18) |s(f)| = 2|p_q(f)|.

Furthermore, |f − ω_{4q}^{2i+1}| ≥ √2 for i = 0, 1, . . . , q − 1 (see Figure 1). Consequently |s(f)| ≥ 2^{q/2+1}, and the theorem follows because

(19) κ(V_q) ≥ 0.5 |s(f)| √n.


Fig. 1. The minimum length of the line intervals [f, ω_{4q}^{2i+1}], i = 0, 1, . . . , q − 1, is √2, reached where i = 1.

The theorem implies that the condition number κ(V_q) grows exponentially in n for n = 3q and q = 2^{k−1}, but we can strengthen the estimate a little further because of the following observation.

Proposition 5.1.1. |f − ω_{4q}^{2i+1}| ≥ 2 cos((0.5 − i/q)π/2) for q being a power of 2 and i in the range between 0 and q.

This proposition implies that the distance |f − ω_{4q}^{2i+1}| grows from √2 to 2 as i grows from 0 to q/2, but this distance decreases to √2 as i grows from q/2 to q (cf. Figure 1). In the limit, as q → ∞, we obtain

(20) |p_q(f)| ≥ exp(y) for y = ∫_0^q ln(2 cos((0.5 − x/q)π/2)) dx.

For any q we obtain a lower bound on the value |p_q(f)| by replacing the integral with a partial sum. Let us do this for q divisible by 12. Replace the integral with a partial sum by applying Proposition 5.1.1 for i = q/6 and 2 cos((0.5 − i/q)π/2) = √3 and obtain

(21) |p_q(f)| ≥ 18^{q/6}.

This lower estimate for the integral is quite tight in the case of large integers q, but next we refine it further by applying Proposition 5.1.1 for i = q/3 and 2 cos((0.5 − i/q)π/2) ≈ 1.9318517. In this way, we deduce that

(22) |p_q(f)| ≥ (2 cos(π/12) √6)^{q/3}.

For n = 48 the lower bound of Theorem 5.1 on κ(V_q) is just 2^8 √48 = 1024 √3 > 1,773.6, but (21) and (22) enable us to increase it to 15,417 and 27,598, respectively.
By engaging Proposition 5.1.1 at first for i = q/4 and 2 cos((0.5 − i/q)π/2) ≈ 1.9615706 and then for i = q/12 and 2 cos((0.5 − i/q)π/2) ≈ 1.586707, we increase these lower bounds at first to 38,453 and then to 71,174.

We leave further refinement of the bounds as an exercise for the reader.


Can one still define a sequence of well-conditioned h × h Vandermonde matrices for h = 1, . . . , n, which would include all the 2^k th roots of 1 for k = ⌊log_2 n⌋ and consist of only n knots overall? Gautschi [G90, section IV] proves that such a sequence of h × h Vandermonde matrices V_s is obtained if we define the knots

(23) s_i = exp(2πf_i√−1), i = 0, 1, . . . , n − 1,

by the following van der Corput sequence of fractions f_i. For 2^{k−1} < n ≤ 2^k, the resulting knot set consists of n knots selected as 2^k th roots of 1 ordered in a zigzag manner:

(24) 0, 1/2, 1/4, 3/4, 1/8, 5/8, 3/8, 7/8, 1/16, 9/16, 5/16, 13/16, 3/16, 11/16, 7/16, 15/16, . . . .

The knots s_0, . . . , s_{n−1} are equally spaced on the unit circle C(0, 1) for n = 2^k being the powers of 2 and are distributed on it quite evenly for all n. Hereafter, we call a van der Corput matrix an n × n Vandermonde matrix V_s defined by the knots s_0, . . . , s_{n−1} of (23) with the fractions f_0, . . . , f_{n−1} from the van der Corput sequence (24). Gautschi has proved the following estimate in [G90, section IV].

Theorem 5.2. κ(V_s) < √(2n) for an n × n van der Corput matrix V_s for any n.
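The van der Corput fractions of (24) are obtained by base-2 bit reversal, and the resulting condition numbers stay modest for every n. A sketch of ours in NumPy (the helper name `van_der_corput` is our own):

```python
import numpy as np

def van_der_corput(i, bits):
    # base-2 bit reversal of i, read as a binary fraction in [0, 1)
    return int(format(i, f"0{bits}b")[::-1], 2) / 2 ** bits

for n in (8, 12, 24, 48):
    k = max(1, (n - 1).bit_length())             # 2^(k-1) < n <= 2^k
    fractions = [van_der_corput(i, k) for i in range(n)]
    s = np.exp(2j * np.pi * np.array(fractions)) # knots (23)-(24)
    V = np.vander(s, increasing=True)
    assert np.linalg.cond(V) < 2 * n             # well-conditioned, cf. Theorem 5.2
```

For n a power of 2 the knot set is exactly the 2^k th roots of 1, and the condition number is 1.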
6. Numerical stability issues for Gaussian elimination with no pivot-
ing applied to well-conditioned Vandermonde matrices. Gaussian elimination
with partial pivoting, that is, with appropriate row interchange, is extensively used
in matrix computations, even though pivoting is a communication intensive process,
and thus is a substantial burden in the context of modern computer technology. The
papers [PQY15] and [PZ15] list the following problems with pivoting: it interrupts
the stream of arithmetic operations with foreign operations of comparison, involves
bookkeeping, compromises data locality, complicates parallelization of the computa-
tions, and increases communication overhead and data dependence. The user prefers
to apply Gaussian elimination with no pivoting (hereafter we use the acronym GENP )
unless it fails because of a division by 0 or runs into numerical problems. The following
result states sufficient conditions for such a failure.
Hereafter, we call a matrix strongly nonsingular and strongly well-conditioned if
this matrix itself and all its square leading blocks are both nonsingular and well-
conditioned.
Theorem 6.1. GENP fails or is unsafe numerically unless an input matrix A is
strongly nonsingular and strongly well-conditioned.
Proof. GENP recursively produces a sequence of LU factorizations of the k × k
leading blocks A(k) of an n × n input matrix A for k = 1, 2, . . . , n, but fails because
of divisions by 0 if any of these blocks is singular. If such a block is nonsingular but
ill-conditioned, then GENP runs into numerical problems.
Typically, deciding whether a nonsingular and well-conditioned matrix is strongly
nonsingular and strongly well-conditioned is not simpler than applying to it GENP,
but in some cases the answer is readily available; e.g., Theorem 5.2 implies the fol-
lowing corollary.
Corollary 6.1.1. Every van der Corput matrix is strongly nonsingular and
strongly well-conditioned.

Fig. 2. The distances |f − ω_n^j| ≥ √2 for j = 0, 1, . . . , n/2 − 1.

In contrast, by virtue of our next theorem, a DFT matrix Ω of a large size cannot
be strongly well-conditioned, and thus such a matrix is numerically hard for GENP.
(Surely we do not need GENP in order to invert the DFT matrix, but we will show
some interesting implications of the theorem.)
Theorem 6.2. Assume for convenience that n is even, and let Ω^(q) denote the q × q leading (that is, northwestern) block of the matrix Ω for q = n/2. Then κ(Ω^(q)) ≥ 2^{n/4−1}√n.

Proof. As follows from Corollary 4.1.2, κ(Ω^(q)) ≥ |s(f ω_q^j)|√n/2 for j = 0, . . . , q − 1, where s(x) = ∏_{j=0}^{q−1}(x − ω_n^j) and where we can choose f = −√−1. Under such a choice, we get |f − ω_n^j| ≥ √2 for j = 0, . . . , q − 1 (see Figure 2), and so |s(f)| ≥ 2^{n/4}. Hence κ(Ω^(q)) ≥ 2^{n/4}√n/2, and Theorem 6.2 follows.
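The bound of Theorem 6.2 is easy to confirm numerically for small n; the following NumPy sketch (ours, not part of the paper's computations) compares κ(Ω^(q)) with the lower bound 2^{n/4−1}√n:

```python
import numpy as np

def dft(n):
    """Unscaled n x n DFT matrix Omega = (omega_n^{ij})."""
    j = np.arange(n)
    return np.exp(2j * np.pi * np.outer(j, j) / n)

for n in (8, 16, 32):
    q = n // 2
    kappa = np.linalg.cond(dft(n)[:q, :q])   # condition number of the leading q x q block
    bound = 2 ** (n / 4 - 1) * np.sqrt(n)    # lower bound of Theorem 6.2
    assert kappa >= bound
```

For n = 8 the bound is about 5.66, while κ(Ω^(4)) ≈ 15.3; cf. Table 4 in section 9.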
Remark 6.3. We can strengthen the bound of this theorem by extending the estimates of (18)–(22): they still hold if we substitute p_q(x) = s(x) and n = 2q (instead of n = 3q) into these equations. Moreover, by extending the proof of Theorem 6.2, we can show that all of the k × k blocks of the unitary matrix (1/√n)Ω are badly ill-conditioned as long as both of the integers k and n − k are large.
Corollary 6.3.1. GENP is numerically unsafe for (i) a DFT matrix Ω of a large size, as well as for its products (ii) DΩ and (iii) ΩZ with any diagonal matrix D and any circulant matrix Z of the same size.¹
Proof. Part (i) follows from Theorems 6.1 and 6.2 combined.
In order to prove part (ii), first note that κ(DΩ) = κ(D) because Ω is a unitary
matrix up to scaling. Therefore, GENP applied to the matrix DΩ fails numerically if
the matrix D is ill-conditioned.
¹ See [D70], [GL13, p. 220], or [P01, p. 33] for the definition of circulant matrices.


Next, suppose that an n × n diagonal matrix D and, consequently, its leading block D^(k) for large k = n/2 are well-conditioned. Then the matrix (DΩ)^(k) is ill-conditioned because the matrix Ω^(k) is ill-conditioned by virtue of Theorem 6.2 and because κ((DΩ)^(k)) ≥ κ(Ω^(k))/κ(D^(k)) since D^(k) is a diagonal matrix.
Now part (iii) follows because (cf. [CPW74] or [P01, Thm. 2.6.4]) ΩZ = DΩ for any circulant matrix Z and the diagonal matrix D = diag(v_i)_{i=1}^n such that (v_i)_{i=1}^n is the first column vector of the matrix ΩZ.
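The factorization ΩZ = DΩ invoked in the proof of part (iii) can be checked directly; in the NumPy sketch below (our illustration) the diagonal of D is computed as Ωz, which is the first column of ΩZ, for z the first column of the circulant matrix Z:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
j = np.arange(n)
Omega = np.exp(2j * np.pi * np.outer(j, j) / n)           # DFT matrix
z = rng.standard_normal(n)                                # first column of Z
Z = np.stack([np.roll(z, k) for k in range(n)], axis=1)   # circulant matrix Z
D = np.diag(Omega @ z)                                    # diagonal of D = first column of Omega Z
assert np.allclose(Omega @ Z, D @ Omega)
```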
The corollary is in good accordance with the data from the tests in [PQZ13] and [PQY15], whereas in extensive numerical tests in the same papers GENP consistently produced accurate outputs when it was applied with random circulant multipliers Z to nonsingular and well-conditioned input matrices of various other classes. Let us recall some formal results from [PQZ13], [PQY15], and [PZa] that partially support the latter empirical data.
Theorem 6.4 (cf. [PQZ13, Thm. 5.1]). Assume that GENP or block Gaussian elimination is applied to an n × n matrix A. Let A^(j) denote the j × j leading block of this matrix for j = 1, 2, . . . , n, and write N = ||A||_2 and N_− = max_{j=1}^n ||(A^(j))^{−1}||_2. (One can normalize the matrix A by scaling to ensure that N = 1.) Then the absolute values of all pivot elements of GENP and the norms of all pivot blocks of block Gaussian elimination do not exceed N + N_−N², whereas the absolute values of the reciprocals of these elements and the norms of the inverses of the blocks do not exceed N_−.
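Both bounds of Theorem 6.4 can be observed on a random, strongly well-conditioned input; the following NumPy sketch (ours, with an arbitrarily chosen diagonally shifted random matrix) records the GENP pivots and checks them against N + N_−N² and N_−:

```python
import numpy as np

def genp_pivots(A):
    """Return the GENP pivot elements produced by elimination with no pivoting."""
    U = A.astype(complex)
    n = U.shape[0]
    pivots = []
    for k in range(n):
        pivots.append(U[k, k])
        U[k+1:, k:] -= np.outer(U[k+1:, k] / U[k, k], U[k, k:])
    return np.array(pivots)

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6)) + 6 * np.eye(6)   # very likely strongly nonsingular
N = np.linalg.norm(A, 2)
N_minus = max(np.linalg.norm(np.linalg.inv(A[:j, :j]), 2) for j in range(1, 7))
piv = genp_pivots(A)
assert np.all(np.abs(piv) <= N + N_minus * N**2)              # first bound of Theorem 6.4
assert np.all(1 / np.abs(piv) <= N_minus * (1 + 1e-9))        # second bound (with rounding slack)
```

The second assertion reflects the identity 1/pivot_k = ((A^(k))^{−1})_{k,k}, whose absolute value cannot exceed ||(A^(k))^{−1}||_2.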
Theorem 6.4 implies that all pivot blocks involved in any block Gaussian elimination process applied to a strongly nonsingular and strongly well-conditioned matrix are nonsingular and well-conditioned as well. By using this property, one can readily deduce (cf. [PQY15, Remark 3.2]) a moderately large upper bound (N N_−)^{log_2(n)} on the norms of the factors L and U produced by GENP and block Gaussian elimination applied to such a matrix A. This upper bound holds in the worst case of strongly nonsingular and strongly well-conditioned inputs and is more acceptable than the growth factor 2^{n−1} of the entries of the factor U in PLU factorization, proved for Gaussian elimination with partial pivoting (hereafter abbreviated GEPP); cf. [GL13, p. 131]. The average growth factors have orders n^{2/3} and n^{3/2} in the cases of GEPP and GENP, respectively [TS90], [YC97].
The average case estimate can be irrelevant to some important classes of inputs for
GENP and block Gaussian elimination, but preprocessing of the inputs with various
random, specialized, or heuristic multipliers can alleviate this problem (see some rel-
evant formal and empirical studies in [PQZ13], [PQY15], and [PZa], and see [PQY15]
and [BCD14] for historical accounts).

7. Low-rank approximation of Cauchy and CV matrices. Next, under some mild assumptions on the knots s_0, . . . , s_{m−1}, we deduce an exponential lower bound on the norm of the inverse of a nonsingular CV matrix C_{s,f}, which is the reciprocal of its smallest singular value, ||C_{s,f}^{−1}|| = 1/σ_n(C_{s,f}). First, we deduce such a bound for a submatrix and then extend it to the matrix itself by applying Corollary 2.1.1. In the next section, by extending this estimate, we obtain our alternative exponential lower bounds on the norm ||V_s^{−1}|| and the condition number κ(V_s) of a matrix V_s whose knot set deviates substantially from the evenly spaced distribution on the circle C(0, 1).
Definition 7.1 (see [CGS07, p. 1254]). Two complex points s and t are (η, c)-separated, for η > 1 and a complex center c, if |(t − c)/(s − c)| ≤ 1/η.


Fig. 3. The figure shows the sets S_{m,−} = {s_10, s_11} and S_{m,+} = {s_0, s_1, . . . , s_9} as well as the distance r from the center c to the nearest points t_{j′} and t_{j″} of the set T_{j′,j″}.

Two complex sets S and T are (η, c)-separated if every pair of points s ∈ S and t ∈ T is (η, c)-separated.
Theorem 7.2 (cf. [CGS07, section 2.2], [P15, Thm. 29]). Let m and l denote a pair of positive integers, and let S_m = {s_0, . . . , s_{m−1}} and T_l = {t_0, . . . , t_{l−1}} denote two sets that are (η, c)-separated for a real number η > 1 and a complex number c. Define the m × l Cauchy matrix C = C_{s,t} = (1/(s_i − t_j))_{i,j=0}^{m−1,l−1}, and write

δ = δ_{c,S} = min_{i=0}^{m−1} |s_i − c|.

Then 1/σ_ρ(C) ≥ (η − 1)η^{ρ−1}δ for all positive integers ρ.
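Theorem 7.2 is in essence a statement about the fast decay of the singular values of a Cauchy matrix defined by two separated knot sets. The sketch below (our illustration, with hypothetical knot sets, showing the qualitative geometric decay rather than the exact constants of the theorem) uses η = 10 and c = 0:

```python
import numpy as np

n, eta = 8, 10.0
t = np.exp(2j * np.pi * np.arange(n) / n)                 # |t_j - c| = 1 for c = 0
s = eta * np.exp(2j * np.pi * (np.arange(n) + 0.5) / n)   # |s_i - c| = eta
C = 1.0 / (s[:, None] - t[None, :])                       # (eta, 0)-separated Cauchy matrix
sigma = np.linalg.svd(C, compute_uv=False)                # singular values, descending
# the singular values drop roughly by a factor eta per index:
assert sigma[5] / sigma[0] < 1e-3
```

The decay comes from the expansion 1/(s_i − t_j) = Σ_{k≥0} t_j^k/s_i^{k+1}: truncating it after ρ terms yields a rank-ρ approximation of C with entrywise error O(η^{−ρ}/δ).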
We will apply the theorem to an arbitrary knot set S_m and to a specific knot set T_l. In the next section we comment on the implications of the resulting basic theorem.
Theorem 7.3 (cf. Figure 3). Let us be given a complex number f such that |f| = 1 and two knot sets S_m = {s_0, . . . , s_{m−1}} and T_n = {t_0, . . . , t_{n−1}}, where t_j = f ω^j for j = 0, . . . , n − 1 and ω = exp(2π√−1/n). Define the m × n CV matrix C_{s,f} = C(S_m, T_n) = (1/(s_i − t_j))_{i∈S_m, j∈T_n}.
Fix two integers j′ and j″ in the range [0, n − 1] such that 0 < l = j″ − j′ + 1 ≤ n/2, and define the subset T_{j′,j″} = {f ω^j}_{j=j′}^{j″} of the set T_n.
Let c = 0.5(t_{j′} + t_{j″}) denote the midpoint of the line interval with the endpoints t_{j′} and t_{j″}, and let r = |c − t_{j′}| = |c − t_{j″}| = sin((j″ − j′)π/n) denote the distance from this midpoint to the points t_{j′} and t_{j″}. (Note that (j″ − j′)π/n is a quite tight upper bound on r for large integers n and bounded differences j″ − j′.)
Fix a constant η > 1, and partition the set S_m into the two subsets S_{m,−} and S_{m,+} of cardinalities m_− and m_+, respectively (m_− = 2 and m_+ = 10 in Figure 3), such that the knots of S_{m,−} lie in the open disc D(c, ηr) = {z : |z − c| < ηr}, while the knots of S_{m,+} lie in the exterior of that disc.


(i) Then the two sets S_{m,+} and T_{j′,j″} are (η, c)-separated, and
(ii) 1/σ_{ρ+m_−+n−l}(C_{s,f}) ≥ (η − 1)η^ρ r for all positive integers ρ.
Proof. Part (i) follows because distance(c, T_{j′,j″}) = r and because distance(c, S_{m,+}) ≥ ηr by virtue of the definition of the set S_{m,+}.
In order to prove part (ii), first define the two following submatrices of the CV matrix C_{s,f} = C(S_m, T_n):

C(S_m, T_{j′,j″}) = (1/(s_i − t_j))_{i∈S_m, j∈T_{j′,j″}} and C(S_{m,+}, T_{j′,j″}) = (1/(s_i − t_j))_{i∈S_{m,+}, j∈T_{j′,j″}}.

Apply Theorem 7.2 to the matrix C = C(S_{m,+}, T_{j′,j″}) and to δ = ηr, and obtain that

1/σ_ρ(C(S_{m,+}, T_{j′,j″})) ≥ (η − 1)η^ρ r for all positive integers ρ.

Combine this bound with Corollary 2.1.1 applied for k = m_−, and obtain that

1/σ_{ρ+m_−}(C(S_m, T_{j′,j″})) ≥ (η − 1)η^ρ r for all positive integers ρ.

Combine the latter bound with Corollary 2.1.1 applied for k = n − l, and obtain part (ii) of the theorem.
Corollary 7.3.1. Under the assumptions of Theorem 7.3, let m = n and ρ = ρ̄ = l − m_− > 0. Then C_{s,f} = C(S_m, T_n) is an n × n matrix and

||C_{s,f}^{−1}|| = 1/σ_n(C_{s,f}) ≥ (η − 1)η^{ρ̄} r.

8. A typical large Vandermonde matrix is badly ill-conditioned. Combine Theorem 4.1 and Corollary 7.3.1, note that in our case max_{i=0}^{n−1} |s_i^n − f^n| ≤ 2, and obtain the following result.
Corollary 8.0.1. Under the assumptions of Corollary 7.3.1, let s_+ ≤ 1 for s_+ of (8). Then

κ(V_s) ≥ (η − 1)η^{ρ̄} r √n/2.

The corollary shows that, under its mild assumptions, the condition number κ(V_s) of an n × n Vandermonde matrix V_s grows exponentially in ρ̄, and thus also in n if ρ̄ grows proportionally to n.
What do these assumptions mean in terms of the properties of the knots s_0, . . . , s_{n−1}? Fix any real number η > 1 and any pair of integers j′ and j″ in the range [0, n − 1] such that 0 < j″ − j′ ≤ n/2. Then the disc D(c, r) contains precisely l = j″ − j′ + 1 knots from the set T, and m_− denotes the number of knots of the set S in the disc D(c, ηr). Hence Corollary 8.0.1 implies that κ(V_s) is exponential in n unless l − m_− = o(n) for all such triples (η, j′, j″) with l = j″ − j′ + 1 of order n, that is, unless the number of knots s_i in the disc D(c, ηr) exceeds, matches, or nearly matches the number of knots t_j in the disc D(c, r) for all such triples. Clearly, this property is only satisfied for all such triples (η, j′, j″) if the associated knot set S is more or less evenly spaced on or about the unit circle C(0, 1).
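This dichotomy is easy to observe numerically. In the following NumPy sketch (ours, with a hypothetical half-circle knot set) the same number of knots is placed first evenly on the whole unit circle and then crowded onto its upper half only:

```python
import numpy as np

n = 64
# knots evenly spaced on the whole unit circle: V_s is the DFT matrix, kappa = 1
s = np.exp(2j * np.pi * np.arange(n) / n)
kappa_circle = np.linalg.cond(s[:, None] ** np.arange(n)[None, :])
# the same number of knots crowded onto the upper half of the circle
t = np.exp(1j * np.pi * np.arange(n) / n)
kappa_half = np.linalg.cond(t[:, None] ** np.arange(n)[None, :])
assert np.isclose(kappa_circle, 1.0)   # perfectly conditioned
assert kappa_half > 1e6                # badly ill-conditioned
```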
9. Numerical tests. Numerical tests were performed by Liang Zhao at the Graduate Center of the City University of New York. He ran them on a Dell server
using Windows and MATLAB R2014a (with the IEEE standard double precision).
In particular, he estimated condition numbers by applying MATLAB function cond()
and generated Gaussian random numbers by applying MATLAB function randn().


Table 1 displays the condition numbers κ(V_s) of n × n Vandermonde matrices, with the knots s_i = ω_n^i being the nth roots of 1 for i = 0, 1, . . . , n − 2 and with the knot s_{n−1} in the range from 1.14 to 10. For comparison, the table also displays the lower estimates of part (i) of Corollary 3.2.1 in the case where m = n. By setting s_{n−1} = ω_n^{n−1} we would have had κ(V_s) = 1, but with the growth of the value |s_{n−1}|, and even with the growth of n for s_{n−1} fixed at 1.14, the matrix quickly became ill-conditioned.
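The experiments behind Table 1 are simple to reproduce; below is a NumPy sketch of ours (using numpy.linalg.cond in place of MATLAB's cond) for the case n = 64, s_{n−1} = 10, where Table 1 reports κ(V_s) = 5.47E+63:

```python
import numpy as np

n = 64
s = np.exp(2j * np.pi * np.arange(n) / n)   # the nth roots of 1
s[-1] = 10.0                                # replace the last knot, as in Table 1
V = s[:, None] ** np.arange(n)[None, :]     # Vandermonde matrix V_s = (s_i^j)
kappa = np.linalg.cond(V)
# a very conservative threshold; Table 1 reports 5.47E+63 for this matrix
assert kappa > 1e12
```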
Table 2 displays the condition numbers of n × n Vandermonde matrices defined by various pairs of integers k and n and the knots s_0, . . . , s_{n−1} such that s_i = ω_{n−k}^i for i = 0, 1, . . . , n − k − 1 are the (n − k)th roots of 1, while s_{n−k+i} = ρω_k^i for i = 0, 1, . . . , k − 1 are the kth roots of 1 scaled by ρ = 3/4, 1/2. For comparison, the table also displays the lower estimates κ_− = ||V_s|| ν^{k−1}/(√k max{k, ν/(ν − 1)}) of part (ii) of Corollary 3.2.1 for ν = 1/ρ.

Table 1
Condition numbers of Vandermonde matrices with a single absolutely large knot.

n      s_{n−1}      κ(V_s)      s_+^{n−1}/√n
64 1.14E+00 3.36E+03 4.98E+02
64 1.56E+00 6.88E+11 2.03E+11
64 3.25E+00 4.62E+32 2.22E+31
64 1.00E+01 5.47E+63 1.25E+62
128 1.14E+00 1.08E+07 1.60E+06
128 1.56E+00 3.24E+32 3.64E+23
128 3.25E+00 6.40E+76 9.03E+63
128 1.00E+01 1.30E+130 8.84E+125
256 1.14E+00 1.57E+14 2.33E+13
256 1.56E+00 6.61E+62 1.66E+48
256 3.25E+00 2.89E+132 2.12E+129
256 1.00E+01 1.18E+265 6.25E+253

Table 2
Condition numbers of Vandermonde matrices with k absolutely small knots.

              ρ = 3/4                 ρ = 1/2
n    k        κ(V_s), κ_−             κ(V_s), κ_−
64 8 4.04e+01, 3.57e+00 6.90e+02, 6.10e+01
64 16 2.71e+02, 1.20e+01 1.19e+05, 5.24e+03
64 32 1.71e+04, 3.78e+02 4.91e+09, 1.09e+08
128 8 5.85e+01, 5.17e+00 1.00e+03, 8.84e+01
128 16 4.03e+02, 1.78e+01 1.77e+05, 7.80e+03
128 32 2.70e+04, 5.97e+02 7.77e+09, 1.72e+08
256 8 8.38e+01, 7.40e+00 1.43e+03, 1.26e+02
256 16 5.85e+02, 2.58e+01 2.56e+05, 1.13e+04
256 32 4.02e+04, 8.89e+02 1.16e+10, 2.56e+08

Table 3 displays the condition numbers κ(V_s) of the 3q × 3q Vandermonde matrices defined by the quasi-cyclic sequence of knots of Example 5.0.1. The table also displays the lower bounds κ and κ′ on these numbers, calculated based on (22) and (20), respectively.
Table 4 displays the condition numbers κ(Ω^(q)) of the q × q leading blocks of the n × n DFT matrices Ω = (ω_n^{ij})_{i,j=0}^{n−1} for n = 2q = 8, 16, 32, 64. The table also displays the lower bounds κ_− and κ′_− on these numbers, calculated for s(x) = p_q(x) and for the same pairs of n and q based on Theorem 6.2 and (20), respectively.

Table 3
Condition numbers of Vandermonde matrices defined by the quasi-cyclic sequence of knots.

n    q    κ(V_s)       κ           κ′
12 4 2.16e+01 1.96e+01 1.03E+01
24 8 1.50e+03 2.22e+02 1.06E+02
48 16 1.16e+07 2.01e+04 1.13E+04
96 32 9.86e+14 2.16e+08 1.27E+08

Table 4
Condition numbers of the leading blocks of DFT matrices.

n    q    κ(Ω^(q))     κ_−         κ′_−
8 4 1.53E+01 8.00E+00 1.03E+01
16 8 1.06E+03 4.53E+01 1.06E+02
32 16 8.18E+06 1.02E+03 1.13E+04
64 32 8.44E+14 3.71E+05 1.27E+08

According to our test results, the lower bounds based on (22) and (20)—and even
those based on Theorem 6.2—grow very fast, but the condition numbers grow even
faster.
Table 5 shows the mean and standard deviation (std) of the relative residual norms r_n = ||Ω_n x̃ − b||/||b|| and relative error norms e_n = ||x̃ − x||/||x|| when GENP was applied to the linear systems of n equations Ω_n x = b for the n × n DFT matrix Ω_n and the vectors b = Ω_n e + τg of dimension n, and when 100 linear systems Ω_n x = b were solved for each n, n = 16, 32, 64, 128, 256, 512, 1024. Here e^T = (1, 1, . . . , 1), τ = 0.001, g is a vector filled with independent and identically distributed (i.i.d.) standard Gaussian random variables, and x̃ denotes the computed approximation to the solution vector x.
According to the data in this table and all other tables, our formal estimates and our test results are in good agreement with each other.
Table 5
GENP with DFT matrices: relative residual and error norms.

        r_n = ||Ω_n x̃ − b||/||b||        e_n = ||x̃ − x||/||x||
n       mean          std                mean          std
16 8.75e-14 1.09e-14 2.44E-04 4.23E-05
32 4.18e-10 6.62e-11 1.76E-04 2.21E-05
64 2.72e-03 8.41e-04 2.71E-03 8.40E-04
128 2.03e+00 3.85e-01 2.03E+00 3.85E-01
256 1.02e+03 3.58e+02 1.01E+03 3.58E+02
512 2.85e+03 9.83e+02 2.85E+03 9.81E+02
1024 5.57e+04 1.96e+04 5.56E+04 1.96E+04
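The qualitative behavior reported in Table 5 can be reproduced even without the random perturbation term τg; the following NumPy sketch (ours, a simplified variant of the test, with deterministic right-hand sides) applies GENP to the system Ω_n x = b with the exact solution x = e and shows the loss of accuracy between n = 16 and n = 64:

```python
import numpy as np

def genp_solve(A, b):
    """Solve Ax = b by Gaussian elimination with no pivoting (GENP)."""
    U = A.astype(complex)
    y = b.astype(complex)
    n = len(b)
    for k in range(n):                 # forward elimination, no row exchanges
        m = U[k+1:, k] / U[k, k]
        U[k+1:, k:] -= np.outer(m, U[k, k:])
        y[k+1:] -= m * y[k]
    x = np.zeros(n, dtype=complex)     # back substitution
    for k in range(n - 1, -1, -1):
        x[k] = (y[k] - U[k, k+1:] @ x[k+1:]) / U[k, k]
    return x

def residual(n):
    j = np.arange(n)
    Omega = np.exp(2j * np.pi * np.outer(j, j) / n)   # DFT matrix Omega_n
    b = Omega @ np.ones(n)                            # exact solution x = e = (1, ..., 1)
    x = genp_solve(Omega, b)
    return np.linalg.norm(Omega @ x - b) / np.linalg.norm(b)

assert residual(16) < 1e-9   # GENP is still accurate for small n
assert residual(64) > 1e-7   # but it already loses accuracy badly for n = 64
```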

10. A brief summary of our results. First, we have estimated the condition numbers κ(V_s) of n × n Vandermonde matrices V_s = (s_i^j)_{i,j=0}^{n−1} with complex knots s_0, . . . , s_{n−1} by applying the Eckart–Young theorem, and we have readily deduced that these numbers grow exponentially in n unless all knots s_0, . . . , s_{n−1} lie in or near the unit disc {x : |x| ≤ 1} and, furthermore, unless they lie mostly on or near the unit circle C(0, 1) = {x : |x| = 1}.
Then, by applying more involved techniques based on the Vandermonde-to-Cauchy transformation of matrix structure, our formulae for the inversion of a Cauchy matrix, their extension to Vandermonde matrix inversion, and the techniques of low-rank approximation of Cauchy matrices, we proved our exponential lower bounds on the condition numbers of an even more general class of Vandermonde matrices.

Namely, we provided the missing formal support for the well-known empirical observation that Vandermonde matrices V_s tend to be badly ill-conditioned already for moderately large integers n, and we also proved that all of the exceptions can be narrowed down to the Vandermonde matrices (such as the matrices of the DFT) whose knots are more or less equally spaced on or about the unit circle C(0, 1).
Our auxiliary inversion formulae for Cauchy and Vandermonde matrices may be
of independent interest.
11. Some extensions of our study.
11.1. Straightforward extensions. All of our estimates hold for the transpose of a Vandermonde matrix as well.
Furthermore, we can immediately extend our upper bounds on the smallest positive singular value σ_n(V_s) = 1/||V_s^{−1}|| to the upper bounds

σ_n(V) = σ_n(V^T) ≤ σ_n(V_s) Σ_{j=1}^d ||C_j|| ||D_j||   and   σ_n(C) ≤ σ_n(C_{s,t}) Σ_{j=1}^d ||D′_j|| ||D″_j||

on the smallest positive singular values σ_n(V) and σ_n(C), respectively, of the "Vandermonde-like" and "Cauchy-like" matrices of the important classes V = Σ_{j=1}^d D_j V_s C_j, V^T = Σ_{j=1}^d D_j V_s^T C_j, and C = Σ_{j=1}^d D′_j C_{s,t} D″_j, respectively, for a bounded positive integer d, diagonal matrices D_j, D′_j, and D″_j, and circulant matrices C_j, j = 1, . . . , d (cf. [P01, Examples 1.4.1 and 4.4.6] and [PW03]).
Most of our results can be readily extended to rectangular Vandermonde, Cauchy, Vandermonde-like, and Cauchy-like matrices.
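For d = 1 the first of the displayed bounds reads σ_n(DV_sC) ≤ σ_n(V_s) ||C|| ||D||; a quick NumPy check (our illustration, with random knots and random diagonal and circulant multipliers):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 8
s = rng.standard_normal(n) + 1j * rng.standard_normal(n)
Vs = s[:, None] ** np.arange(n)[None, :]            # Vandermonde matrix V_s
D = np.diag(rng.standard_normal(n))                 # diagonal multiplier
c = rng.standard_normal(n)
C = np.stack([np.roll(c, k) for k in range(n)], 1)  # circulant multiplier
smin = lambda M: np.linalg.svd(M, compute_uv=False)[-1]
lhs = smin(D @ Vs @ C)                              # sigma_n of the Vandermonde-like matrix
rhs = smin(Vs) * np.linalg.norm(C, 2) * np.linalg.norm(D, 2)
assert lhs <= rhs * (1 + 1e-10)                     # the d = 1 case of the bound
```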
11.2. Extensions that need more work. One can try to narrow the remaining gap between the estimated and observed values of the condition number κ(V_s) by varying the parameters t_j and f that we substitute into our basic equation (12).
A natural research challenge is the extension of our estimates to polynomial Vandermonde matrices V_{P,s} = (p_j(x_i))_{i,j=0}^{m−1,n−1}, where P = (p_j(x))_{j=0}^{n−1} is any basis in the space of polynomials of degree less than n, for example, the basis made up of Chebyshev polynomials. Equation (12) is extended to this case as follows (cf. [P01, eq. (3.6.8)]):

C_{s,t} = (1/(s_i − t_j))_{i,j=0}^{m−1,n−1} = diag(t(s_i)^{−1})_{i=0}^{m−1} V_{P,s} V_{P,u}^{−1} diag(t′(t_j))_{j=0}^{n−1}.

Here t(x) = ∏_{j=0}^{n−1}(x − t_j) is the polynomial of (7), and u = (u_j)_{j=0}^{n−1} denotes the coefficient vector of the polynomial t(x) − x^n. The decisive step could be the selection of a proper set of knots T = {t_j}_{j=0}^{n−1}, which would enable us to reduce the estimation of the condition number of a polynomial Vandermonde matrix V_{P,s} to the same task for a Cauchy matrix C_{s,t}. Sections V and VI of [G90] might give us some initial guidance for the selection of the knot set that would extend the set of equally spaced knots on the unit circle used in the case of Vandermonde matrices. Then it would remain to extend our estimates for CV matrices to the new subclass of Cauchy matrices.
Acknowledgments. I am grateful to the reviewers for their valuable and
detailed comments.


REFERENCES

[AR10] A. Aricò and G. Rodriguez, A fast solver for linear systems with displacement structure, Numer. Algorithms, 55 (2010), pp. 529–556.
[B10] S. Börm, Efficient Numerical Methods for Non-local Operators: H2 -Matrix Compres-
sion, Algorithms and Analysis, European Math. Society, Zürich, 2010.
[BCD14] G. Ballard, E. Carson, J. Demmel, M. Hoemmen, N. Knight, and O. Schwartz,
Communication lower bounds and optimal algorithms for numerical linear algebra,
Acta Numer., 23 (2014), pp. 1–155.
[BTG04] H.A. Barker, A.H. Tan, and K.R. Godfrey, Optimal levels of perturbation signals
for nonlinear system identification, IEEE Trans. Automat. Control, 49 (2004), pp.
1404–1407.
[CGR88] J. Carrier, L. Greengard, and V. Rokhlin, A fast adaptive multipole algorithm
for particle simulations, SIAM J. Sci. Stat. Comput., 9 (1988), pp. 669–686.
doi:10.1137/0909044.
[CGS07] S. Chandrasekaran, M. Gu, X. Sun, J. Xia, and J. Zhu, A superfast algorithm for
Toeplitz systems of linear equations, SIAM J. Matrix Anal. Appl., 29 (2007), pp.
1247–1266. doi:10.1137/040617200.
[CPW74] R.E. Cline, R.J. Plemmons, and G. Worm, Generalized inverses of certain Toeplitz
matrices, Linear Algebra Appl., 8 (1974), pp. 25–33.
[D70] P.J. Davis, Circulant Matrices, Wiley, New York, 1970.
[G90] W. Gautschi, How unstable are Vandermonde systems?, in Proceedings of the Inter-
national Symposium on Asymptotic and Computational Analysis: Conference in
Honor of Frank W. J. Olver’s 65th Birthday, Lecture Notes in Pure Appl. Math.
124, R. Wong, ed., Marcel Dekker, New York, 1990, pp. 193–210.
[G98] M. Gu, Stable and efficient algorithms for structured systems of linear equations, SIAM
J. Matrix Anal. Appl., 19 (1998), pp. 279–306. doi:10.1137/S0895479895291273.
[GI88] W. Gautschi and G. Inglese, Lower bounds for the condition number of Vandermonde
matrices, Numer. Math., 52 (1988), pp. 241–250.
[GKO95] I. Gohberg, T. Kailath, and V. Olshevsky, Fast Gaussian elimination with partial
pivoting for matrices with displacement structure, Math. Comp., 64 (1995), pp.
1557–1576.
[GL13] G.H. Golub and C.F. Van Loan, Matrix Computations, 4th ed., Johns Hopkins Uni-
versity Press, Baltimore, MD, 2013.
[GR87] L. Greengard and V. Rokhlin, A fast algorithm for particle simulation, J. Comput.
Phys., 73 (1987), pp. 325–348.
[HB97] G. Heinig and A. Bojanczyk, Transformation techniques for Toeplitz and Toeplitz-
plus-Hankel matrices, I: Transformations, Linear Algebra Appl., 254 (1997), pp.
193–226.
[HB98] G. Heinig and A. Bojanczyk, Transformation techniques for Toeplitz and Toeplitz-
plus-Hankel matrices. II: Algorithms, Linear Algebra Appl., 278 (1998), pp. 11–36.
[M03] M.E.A. El-Mikkawy, Explicit inverse of a generalized Vandermonde matrix, Appl.
Math. Comput., 146 (2003), pp. 643–651.
[MS58] N. Macon and A. Spitzbart, Inverses of Vandermonde matrices, Amer. Math.
Monthly, 65 (1958), pp. 95–100.
[MM92] M. Marcus and H. Minc, A Survey of Matrix Theory and Matrix Inequalities, Dover,
New York, 1992.
[MRT05] P.G. Martinsson, V. Rokhlin, and M. Tygert, A fast algorithm for the inversion of
Toeplitz matrices, Comput. Math. Appl., 50 (2005), pp. 741–752.
[OO10] H. Olkkonen and J.T. Olkkonen, Sampling and reconstruction of transient signals
by parallel exponential filters, IEEE Trans. Circuits Systems II: Express Briefs, 57
(2010), pp. 426–429.
[P90] V.Y. Pan, On computations with dense structured matrices, Math. Comp., 55 (1990),
pp. 179–190.
[P01] V.Y. Pan, Structured Matrices and Polynomials: Unified Superfast Algorithms,
Birkhäuser Boston, Inc., Boston, MA; Springer-Verlag, New York, 2001.
[P03] G.M. Phillips, Interpolation and Approximation by Polynomials, Springer, New York,
2003.
[P10] F. Poloni, A note on the O(n)-storage implementation of the GKO algorithm and its
adaptation to Trummer-like matrices, Numer. Algorithms, 55 (2010), pp. 115–139.
[P15] V.Y. Pan, Transformations of matrix structures work again, Linear Algebra Appl., 465
(2015), pp. 1–32.


[Pa] V.Y. Pan, Fast Approximate Computations with Cauchy Matrices and Polynomials,
preprint, 2015. arXiv:1506.02285 [math.NA].
[PQY15] V.Y. Pan, G. Qian, and X. Yan, Random multipliers numerically stabilize Gaussian and block Gaussian elimination: Proofs and an extension to low-rank approximation, Linear Algebra Appl., 481 (2015), pp. 202–234.
[PQZ13] V.Y. Pan, G. Qian, and A. Zheng, Randomized preprocessing versus pivoting, Linear
Algebra Appl., 438 (2013), pp. 1883–1899.
[PTVF07] W.H. Press, S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery, Numerical
Recipes: The Art of Scientific Computing, 3rd ed., Cambridge University Press,
Cambridge, UK, 2007.
[PW03] V.Y. Pan and X. Wang, Inversion of displacement operators, SIAM J. Matrix Anal.
Appl., 24 (2003), pp. 660–677. doi:10.1137/S089547980238627X.
[PZ15] V.Y. Pan and L. Zhao, Randomized circulant and Gaussian preprocessing, in Proceed-
ings of the 17th International Workshop on Computer Algebra in Scientific Comput-
ing (CASC 2015), Lecture Notes in Comput. Sci. 9301, V.P. Gerdt, V. Koepf, and
E.V. Vorozhtsov, eds., Springer, Heidelberg, 2015, pp. 359–373.
[PZa] V.Y. Pan and L. Zhao, How Much Randomization Do We Need for Supporting
Gaussian Elimination, Block Gaussian Elimination, and Low-Rank Approximation?
preprint, 2015. arXiv:1501.05385 [cs.SC].
[R06] G. Rodriguez, Fast solution of Toeplitz- and Cauchy-like least-squares problems, SIAM
J. Matrix Anal. Appl., 28 (2006), pp. 724–748. doi:10.1137/050629148.
[RD09] O. Ryan and M. Debbah, Asymptotic behaviour of random Vandermonde matrices with entries on the unit circle, IEEE Trans. Inform. Theory, 55 (2009), pp. 3115–3147.
[SEMC11] F. Soto-Eguibar and H. Moya-Cessa, Inverse of the Vandermonde and Vandermonde
confluent matrices, Appl. Math. Inf. Sci., 5 (2011), pp. 361–366.
[T66] L.R. Turner, Inverse of the Vandermonde Matrix with Applications, Technical report
NASA TN D-3547, NASA, Washington, DC, 1966.
[T94] E.E. Tyrtyshnikov, How bad are Hankel matrices?, Numer. Math., 67 (1994), pp. 261–
269.
[TS90] L.N. Trefethen and R.S. Schreiber, Average-case stability of Gaussian elimination,
SIAM J. Matrix Anal. Appl., 11 (1990), pp. 335–360. doi:10.1137/0611023.
[XXG12] J. Xia, Y. Xi, and M. Gu, A superfast structured solver for Toeplitz linear systems
via randomized sampling, SIAM J. Matrix Anal. Appl., 33 (2012), pp. 837–858.
doi:10.1137/110831982.
[XXCB14] Y. Xi, J. Xia, S. Cauley, and V. Balakrishnan, Superfast and stable structured solvers
for Toeplitz least squares via randomized sampling, SIAM J. Matrix Anal. Appl., 35
(2014), pp. 44–72. doi:10.1137/120895755.
[YC97] M.-C. Yeung and T.F. Chan, Probabilistic analysis of Gaussian elimination
without pivoting, SIAM J. Matrix Anal. Appl., 18 (1997), pp. 499–517.
doi:10.1137/S0895479895291741.
