Quantum algorithm for solving linear systems of equations
Aram W. Harrow∗, Avinatan Hassidim† and Seth Lloyd‡

June 2, 2009
Abstract
Solving linear systems of equations is a common problem that arises both on its own and as a subroutine in more complex problems: given a matrix $A$ and a vector $\vec{b}$, find a vector $\vec{x}$ such that $A\vec{x} = \vec{b}$. We consider the case where one doesn't need to know the solution $\vec{x}$ itself, but rather an approximation of the expectation value of some operator associated with $\vec{x}$, e.g., $x^\dagger M x$ for some matrix $M$. In this case, when $A$ is sparse and well-conditioned, with largest dimension $N$, the best classical algorithms can find $\vec{x}$ and estimate $x^\dagger M x$ in $O(N \,\mathrm{poly}\log(N))$ time. Here, we exhibit a quantum algorithm for this task that runs in $\mathrm{poly}(\log N)$ time, an exponential improvement over the best classical algorithm.
Quantum computers are devices that harness quantum mechanics to perform computations in ways that classical computers cannot. For certain problems, quantum algorithms supply exponential speedups over their classical counterparts, the most famous example being Shor's factoring algorithm [1]. Few such exponential speedups are known, and those that are (such as the use of quantum computers to simulate other quantum systems [2]) have found little use outside the domain of quantum information theory. This paper presents a quantum algorithm that can give an exponential speedup for a broad range of applications.
Linear equations play an important role in virtually all ﬁelds of science and
engineering. The sizes of the data sets that deﬁne the equations are growing
rapidly over time, so that terabytes and even petabytes of data may need to
be processed to obtain a solution. The minimum time it takes to exactly solve
such a set on a classical computer scales at least as N, where N is the size of
the data set. Indeed, merely to write out the solution takes time of order N.
Frequently, however, one is interested not in the full solution to the equations,
but rather in computing some function of that solution, such as determining
the total weight of some subset of the indices. We show that in some cases, a
∗ Department of Mathematics, University of Bristol, Bristol, BS8 1TW, U.K.
† MIT, Research Laboratory for Electronics, Cambridge, MA 02139, USA
‡ MIT, Research Laboratory for Electronics and Department of Mechanical Engineering, Cambridge, MA 02139, USA
quantum computer can approximate the value of such a function in time which
is polylogarithmic in N, an exponential speedup over the best known classical
algorithms. In fact, under standard complexity-theoretic assumptions, we prove
that in performing this task any classical algorithm must be exponentially slower
than the algorithm presented here. Moreover, we show that our algorithm is
almost the optimal quantum algorithm for the task.
We begin by presenting the main ideas behind the construction. Then we give an informal description of the algorithm, making many simplifying assumptions. Finally we present generalizations and extensions. The full proofs appear in the supporting online material [3].
Assume we are given the equation $A\vec{x} = \vec{b}$, where $\vec{b}$ has $N$ entries. The algorithm works by mapping $\vec{b}$ to a quantum state $|b\rangle$ and by mapping $A$ to a suitable quantum operator. For example, $A$ could represent a discretized differential operator which is mapped to a Hermitian matrix with efficiently computable entries, and $|b\rangle$ could be the ground state of a physical system, or the output of some other quantum computation. Alternatively, the entries of $A$ and $\vec{b}$ could represent classical data stored in memory. The key requirement here, as in all quantum information theory, is the ability to perform actions in superposition (also called "quantum parallelism"). We present an informal discussion of superposition, and its meaning in this context. Suppose that an algorithm (which we can take to be reversible without loss of generality) exists to map input $x, 0$ to output $x, f(x)$. Quantum mechanics predicts that given a superposition of $x, 0$ and $y, 0$, evaluating this function on a quantum computer will produce a superposition of $x, f(x)$ and $y, f(y)$, while requiring no extra time to execute. One can view accessing a classical memory cell as applying a function whose input is the address of the cell and whose output is the contents of that cell. We require that we can access this function in superposition.
In the following paragraphs we assume that $A$ is sparse and Hermitian. Both assumptions can be relaxed, but this complicates the presentation. We also ignore some normalization issues (which are treated in the supplementary material). The exponential speedup is attained when the condition number of $A$ is polylogarithmic in $N$, and the required accuracy is $1/\mathrm{poly}\log(N)$.
The algorithm maps the $N$ entries of $\vec{b}$ onto the $\log_2 N$ qubits required to represent the state $|b\rangle$. When $A$ is sparse, the transformation $e^{iAt}|b\rangle$ can be implemented efficiently. This ability to exponentiate $A$ translates, via the well-known technique of phase estimation, into the ability to decompose $|b\rangle$ in the eigenbasis of $A$ and to find the corresponding eigenvalues $\lambda_j$. Informally, the state of the system after this stage is close to $\sum_j \beta_j |u_j\rangle|\lambda_j\rangle$, where $|u_j\rangle$ is the eigenvector basis of $A$, and $|b\rangle = \sum_j \beta_j |u_j\rangle$. As the eigenvalue which corresponds to each eigenvector is entangled with it, one can hope to apply an operation which would take $\sum_j \beta_j |u_j\rangle|\lambda_j\rangle$ to $\sum_j \lambda_j^{-1}\beta_j |u_j\rangle|\lambda_j\rangle$. However, this is not a linear operation, and therefore performing it requires a unitary, followed by a successful measurement. This allows us to extract the state $|x\rangle = A^{-1}|b\rangle$. The total number of resources required to perform these transformations scales polylogarithmically with $N$.
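The eigenbasis manipulation described above can be illustrated classically. The following NumPy sketch (our illustration, not the quantum procedure itself) decomposes $\vec{b}$ in the eigenbasis of a Hermitian $A$, rescales each amplitude $\beta_j$ by $\lambda_j^{-1}$, and checks that the result agrees with a direct solve:

```python
import numpy as np

# Classical emulation of the eigenbasis step: decompose b in the
# eigenbasis of a Hermitian A, multiply each amplitude beta_j by
# 1/lambda_j, and compare with a direct linear solve.
rng = np.random.default_rng(0)

M = rng.standard_normal((4, 4))
A = (M + M.T) / 2 + 6 * np.eye(4)   # symmetric, shifted away from singular
b = rng.standard_normal(4)

lam, U = np.linalg.eigh(A)          # A = U diag(lam) U^T
beta = U.T @ b                      # beta_j = <u_j | b>
x_eig = U @ (beta / lam)            # sum_j (beta_j / lambda_j) |u_j>

x_direct = np.linalg.solve(A, b)
print(np.allclose(x_eig, x_direct))  # prints True
```

On a quantum computer the rescaling by $\lambda_j^{-1}$ is the step that requires the conditional rotation and postselection described in the text.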
This procedure yields a quantum-mechanical representation $|x\rangle$ of the desired vector $\vec{x}$. Clearly, to read out all the components of $\vec{x}$ would require one to perform the procedure at least $N$ times. Often, however, one is interested not in $\vec{x}$ itself, but in some expectation value $x^T M x$, where $M$ is some linear operator (our procedure also accommodates nonlinear operators as described below). By mapping $M$ to a quantum-mechanical operator, and performing the quantum measurement corresponding to $M$, we obtain an estimate of the expectation value $\langle x| M |x\rangle = x^T M x$, as desired. A wide variety of features of the vector $\vec{x}$ can be extracted in this way, including normalization, weights in different parts of the state space, moments, etc.
A simple example where the algorithm can be used is to see if two different stochastic processes have similar stable states [4]. Consider a stochastic process $\vec{x}_t = A\vec{x}_{t-1} + \vec{b}$, where the $i$'th coordinate in the vector $\vec{x}_t$ represents the abundance of item $i$ at time $t$. The stable state of this distribution is given by $|x\rangle = (I - A)^{-1}|b\rangle$. Let $\tilde{x}_t = \tilde{A}\tilde{x}_{t-1} + \tilde{b}$, and $|\tilde{x}\rangle = (I - \tilde{A})^{-1}|\tilde{b}\rangle$. To know if $|x\rangle$ and $|\tilde{x}\rangle$ are similar, we perform the SWAP test between them [5]. We note that classically finding out if two probability distributions are similar requires at least $O(\sqrt{N})$ samples [6]. One can apply similar ideas to determine whether different pictures are similar, or to identify the relation between two pictures. In general, different problems require us to extract different features, and it is an important question to identify which features are important to extract.
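The comparison task above can be sketched classically. The toy matrices below are our own illustrative choices; the squared overlap computed at the end is the quantity that repeated SWAP tests estimate on a quantum computer:

```python
import numpy as np

# Compute the stable states x = (I - A)^{-1} b of two processes,
# normalize them as quantum states, and compute the squared overlap
# |<x|x~>|^2, which a SWAP test estimates (it accepts with
# probability (1 + overlap) / 2).
def stable_state(A, b):
    x = np.linalg.solve(np.eye(len(b)) - A, b)
    return x / np.linalg.norm(x)

A1 = np.array([[0.5, 0.2], [0.1, 0.4]])
b1 = np.array([1.0, 1.0])
A2 = A1 + np.array([[0.02, 0.0], [0.0, -0.01]])  # slightly perturbed process
b2 = b1

x1, x2 = stable_state(A1, b1), stable_state(A2, b2)
overlap = abs(x1 @ x2) ** 2
print(overlap)   # close to 1 for similar processes
```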
Estimating expectation values on solutions of systems of linear equations is quite powerful. In particular, we show that it is universal for quantum computation: anything that a quantum computer can do can be written as a set of linear equations, such that the result of the computation is encoded in some expectation value of the solution of the system. Thus, matrix inversion can be thought of as an alternate paradigm for quantum computing, along with [7, 8, 9, 10, 11, 12]. Matrix inversion has the advantage of being a natural problem that is not obviously related to quantum mechanics. We use the universality result to show that our algorithm is almost optimal and that classical algorithms cannot match its performance.
An important factor in the performance of the matrix inversion algorithm is $\kappa$, the condition number of $A$, or the ratio between $A$'s largest and smallest eigenvalues. As the condition number grows, $A$ becomes closer to a matrix which cannot be inverted, and the solutions become less stable. Such a matrix is said to be "ill-conditioned." Our algorithms will generally assume that the singular values of $A$ lie between $1/\kappa$ and 1; equivalently $\kappa^{-2} I \leq A^\dagger A \leq I$. In this case, we will achieve a runtime proportional to $\kappa^2 \log N$. However, we also present a technique to handle ill-conditioned matrices. The runtime also scales as $1/\epsilon$ if we allow an additive error of $\epsilon$ in the output state $|x\rangle$. Therefore, if $\kappa$ and $1/\epsilon$ are both $\mathrm{poly}\log(N)$, the runtime will also be $\mathrm{poly}\log(N)$. In this case, our quantum algorithm is exponentially faster than any classical method.

Previous papers utilized quantum computers to perform linear algebraic operations in a limited setting [13]. Our work was extended by [14] to solving nonlinear differential equations.
We now give a more detailed explanation of the algorithm. First, we want to transform a given Hermitian matrix $A$ into a unitary operator $e^{iAt}$ which we can apply at will. This is possible (for example) if $A$ is $s$-sparse and efficiently row computable, meaning it has at most $s$ nonzero entries per row and given a row index these entries can be computed in time $O(s)$. Under these assumptions, Ref. [15] shows how to simulate $e^{iAt}$ in time
$$\tilde{O}(\log(N)\, s^2\, t),$$
where the $\tilde{O}$ suppresses more slowly growing terms (included in the supporting material [3]). If $A$ is not Hermitian, define
$$C = \begin{pmatrix} 0 & A \\ A^\dagger & 0 \end{pmatrix}. \qquad (1)$$
As $C$ is Hermitian, we can solve the equation $C\vec{y} = \begin{pmatrix}\vec{b}\\ 0\end{pmatrix}$ to obtain $\vec{y} = \begin{pmatrix}0\\ \vec{x}\end{pmatrix}$. Applying this reduction if necessary, the rest of the paper assumes that $A$ is Hermitian.
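The embedding (1) is easy to verify numerically. This sketch (a classical check, with a random non-Hermitian $A$ of our choosing) confirms that solving the enlarged Hermitian system returns the original solution in the bottom block:

```python
import numpy as np

# Build C = [[0, A], [A^dagger, 0]] for a non-Hermitian A and check
# that solving C y = (b, 0) gives y = (0, x) with A x = b.
rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
b = rng.standard_normal(3)

C = np.block([[np.zeros((3, 3)), A],
              [A.conj().T, np.zeros((3, 3))]])
y = np.linalg.solve(C, np.concatenate([b, np.zeros(3)]))

top, x = y[:3], y[3:]
print(np.allclose(top, 0), np.allclose(A @ x, b))  # prints True True
```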
We also need an efficient procedure to prepare $|b\rangle$. For example, if the $b_i$ and $\sum_{i=i_1}^{i_2} |b_i|^2$ are efficiently computable then we can use the procedure of Ref. [16] to prepare $|b\rangle$.
The next step is to decompose $|b\rangle$ in the eigenvector basis, using phase estimation [17, 18]. Denote by $|u_j\rangle$ the eigenvectors of $e^{iAt}$, and by $\lambda_j$ the corresponding eigenvalues. Let
$$|\Psi_0\rangle := \sqrt{\frac{2}{T}} \sum_{\tau=0}^{T-1} \sin\frac{\pi(\tau+\frac{1}{2})}{T}\, |\tau\rangle \qquad (2)$$
for some large $T$. The coefficients of $|\Psi_0\rangle$ are chosen (following [18]) to minimize a certain quadratic loss function which appears in our error analysis (see supplementary material for details).
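As a quick sanity check of Eq. (2), the amplitudes $\sqrt{2/T}\,\sin(\pi(\tau+\tfrac12)/T)$ do form a normalized state; the following sketch verifies this numerically for one choice of $T$:

```python
import numpy as np

# Amplitudes of |Psi_0> from Eq. (2); their squares sum to 1 exactly,
# since sum_tau sin^2(pi (tau + 1/2) / T) = T / 2.
T = 64
tau = np.arange(T)
psi0 = np.sqrt(2 / T) * np.sin(np.pi * (tau + 0.5) / T)

print(np.sum(psi0 ** 2))   # 1 up to floating-point rounding
```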
Next we apply the conditional Hamiltonian evolution $\sum_{\tau=0}^{T-1} |\tau\rangle\langle\tau|^C \otimes e^{iA\tau t_0/T}$ on $|\Psi_0\rangle^C \otimes |b\rangle$, where $t_0 = O(\kappa/\epsilon)$. Fourier transforming the first register gives the state
$$\sum_{j=1}^{N} \sum_{k=0}^{T-1} \alpha_{kj}\, \beta_j\, |k\rangle |u_j\rangle, \qquad (3)$$
where $|k\rangle$ are the Fourier basis states, and $|\alpha_{kj}|$ is large if and only if $\lambda_j \approx \frac{2\pi k}{t_0}$.
Defining $\tilde{\lambda}_k := 2\pi k/t_0$, we can relabel our $|k\rangle$ register to obtain
$$\sum_{j=1}^{N} \sum_{k=0}^{T-1} \alpha_{kj}\, \beta_j\, |\tilde{\lambda}_k\rangle |u_j\rangle.$$
Adding an ancilla qubit and rotating conditioned on $|\tilde{\lambda}_k\rangle$ yields
$$\sum_{j=1}^{N} \sum_{k=0}^{T-1} \alpha_{kj}\, \beta_j\, |\tilde{\lambda}_k\rangle |u_j\rangle \left( \sqrt{1 - \frac{C^2}{\tilde{\lambda}_k^2}}\, |0\rangle + \frac{C}{\tilde{\lambda}_k}\, |1\rangle \right),$$
where $C = O(1/\kappa)$. We now undo the phase estimation to uncompute the $|\tilde{\lambda}_k\rangle$. If the phase estimation were perfect, we would have $\alpha_{kj} = 1$ if $\tilde{\lambda}_k = \lambda_j$, and 0 otherwise. Assuming this for now, we obtain
$$\sum_{j=1}^{N} \beta_j\, |u_j\rangle \left( \sqrt{1 - \frac{C^2}{\lambda_j^2}}\, |0\rangle + \frac{C}{\lambda_j}\, |1\rangle \right).$$
To finish the inversion we measure the last qubit. Conditioned on seeing 1, we have the state
$$\sqrt{\frac{1}{\sum_{j=1}^{N} C^2 |\beta_j|^2 / |\lambda_j|^2}}\; \sum_{j=1}^{N} \beta_j\, \frac{C}{\lambda_j}\, |u_j\rangle,$$
which corresponds to $|x\rangle = \sum_{j=1}^{N} \beta_j \lambda_j^{-1} |u_j\rangle$ up to normalization. We can determine the normalization constant from the probability of obtaining 1. Finally, we make a measurement $M$ whose expectation value $\langle x| M |x\rangle$ corresponds to the feature of $\vec{x}$ that we wish to evaluate.
We present an informal description of the sources of error; the exact error analysis and runtime considerations are presented in [3]. Performing the phase estimation is done by simulating $e^{iAt}$. Assuming that $A$ is $s$-sparse, this can be done with negligible error in time nearly linear in $t$ and quadratic in $s$.

The dominant source of error is phase estimation. This step errs by $O(1/t_0)$ in estimating $\lambda$, which translates into a relative error of $O(1/\lambda t_0)$ in $\lambda^{-1}$. If $\lambda \geq 1/\kappa$, taking $t_0 = O(\kappa/\epsilon)$ induces a final error of $\epsilon$. Finally, we consider the success probability of the postselection process. Since $C = O(1/\kappa)$ and $\lambda \leq 1$, this probability is at least $\Omega(1/\kappa^2)$. Using amplitude amplification [19], we find that $O(\kappa)$ repetitions are sufficient. Putting this all together, the runtime is
$$\tilde{O}\!\left(\log(N)\, s^2\, \kappa^2 / \epsilon\right).$$
By contrast, one of the best general-purpose classical matrix inversion algorithms is the conjugate gradient method [20], which, when $A$ is positive definite, uses $O(\sqrt{\kappa}\log(1/\epsilon))$ matrix-vector multiplications, each taking time $O(Ns)$, for a total runtime of $O(Ns\sqrt{\kappa}\log(1/\epsilon))$. (If $A$ is not positive definite, $O(\kappa\log(1/\epsilon))$ multiplications are required, for a total time of $O(Ns\kappa\log(1/\epsilon))$.) An important question is whether classical methods can be improved when only a summary statistic of the solution, such as $x^\dagger M x$, is required. Another question is whether our quantum algorithm could be improved, say to achieve error $\epsilon$ in time proportional to $\mathrm{poly}\log(1/\epsilon)$. We show that the answer to both questions is negative, using an argument from complexity theory. Our strategy is to prove that the ability to invert matrices (with the right choice of parameters) can be used to simulate a general quantum computation.
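The conjugate gradient baseline can be sketched in a few lines. This is a textbook version (our sketch, not code from the paper or from [20]); each loop iteration costs one matrix-vector product, which is $O(Ns)$ for an $s$-sparse matrix:

```python
import numpy as np

# Minimal conjugate gradient for symmetric positive definite A.
def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    x = np.zeros_like(b)
    r = b - A @ x              # residual
    p = r.copy()               # search direction
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p             # the one matrix-vector product per step
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

rng = np.random.default_rng(2)
M = rng.standard_normal((6, 6))
A = M @ M.T + 6 * np.eye(6)    # symmetric positive definite
b = rng.standard_normal(6)
x = conjugate_gradient(A, b)
print(np.allclose(A @ x, b))   # prints True
```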
We show that a quantum circuit using $n$ qubits and $T$ gates can be simulated by inverting an $O(1)$-sparse matrix $A$ of dimension $N = O(2^n \kappa)$. The condition number $\kappa$ is $O(T^2)$ if we need $A$ to be positive definite or $O(T)$ if not. This implies that a classical $\mathrm{poly}(\log N, \kappa, 1/\epsilon)$-time algorithm would be able to simulate a $\mathrm{poly}(n)$-gate quantum algorithm in $\mathrm{poly}(n)$ time. Such a simulation is
strongly conjectured to be false, and is known to be impossible in the presence
of oracles [21].
The reduction from a general quantum circuit to a matrix inversion problem also implies that our algorithm cannot be substantially improved (under standard assumptions). If the runtime could be made polylogarithmic in $\kappa$, then any problem solvable on $n$ qubits could be solved in $\mathrm{poly}(n)$ time (i.e. BQP=PSPACE), a highly implausible result [22]. Even improving our $\kappa$-dependence to $\kappa^{1-\delta}$ for $\delta > 0$ would allow any time-$T$ quantum algorithm to be simulated in time $o(T)$; iterating this would again imply that BQP=PSPACE. Similarly, improving the error dependence to $\mathrm{poly}\log(1/\epsilon)$ would imply that BQP includes PP, and even minor improvements would contradict oracle lower bounds [22].
We now present the key reduction from simulating a quantum circuit to matrix inversion. Let $\mathcal{C}$ be a quantum circuit acting on $n = \log N$ qubits which applies $T$ two-qubit gates $U_1, \ldots, U_T$. The initial state is $|0\rangle^{\otimes n}$ and the answer is determined by measuring the first qubit of the final state.

Now adjoin an ancilla register of dimension $3T$ and define a unitary
$$U = \sum_{t=1}^{T} |t+1\rangle\langle t| \otimes U_t + |t+T+1\rangle\langle t+T| \otimes I + |t+2T+1 \bmod 3T\rangle\langle t+2T| \otimes U^\dagger_{3T+1-t}. \qquad (4)$$
We have chosen $U$ so that for $T + 1 \leq t \leq 2T$, applying $U^t$ to $|1\rangle|\psi\rangle$ yields $|t+1\rangle \otimes U_T \cdots U_1 |\psi\rangle$. If we now define $A = I - U e^{-1/T}$ then $\kappa(A) = O(T)$, and we can expand
$$A^{-1} = \sum_{k \geq 0} U^k e^{-k/T}. \qquad (5)$$
This can be interpreted as applying $U^t$ for $t$ a geometrically distributed random variable. Since $U^{3T} = I$, we can assume $1 \leq t \leq 3T$. If we measure the first register and obtain $T + 1 \leq t \leq 2T$ (which occurs with probability $e^{-2}/(1 + e^{-2} + e^{-4}) \geq 1/10$) then we are left with the second register in the state $U_T \cdots U_1 |\psi\rangle$, corresponding to a successful computation. Sampling from $|x\rangle$ allows us to sample from the results of the computation. This establishes that matrix inversion is BQP-complete, and proves our above claims about the difficulty of improving our algorithm.
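The expansion (5) is just a convergent geometric series in the contraction $U e^{-1/T}$, and can be checked numerically. The sketch below uses a random unitary (our toy stand-in for the circuit unitary of Eq. (4)):

```python
import numpy as np

# Check Eq. (5): for unitary U and A = I - U e^{-1/T}, the truncated
# series sum_k U^k e^{-k/T} converges to A^{-1}, since the terms
# decay in norm like e^{-k/T}.
rng = np.random.default_rng(3)
T = 5
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))   # random unitary

A = np.eye(4) - Q * np.exp(-1 / T)
series = sum(np.linalg.matrix_power(Q, k) * np.exp(-k / T)
             for k in range(200))

print(np.allclose(series, np.linalg.inv(A), atol=1e-8))  # prints True
```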
We now discuss ways to extend our algorithm and relax the assumptions we
made while presenting it. First, we show how a broader class of matrices can
be inverted, and then consider measuring other features of x and performing
operations on A other than inversion.
Certain non-sparse $A$ can also be simulated and therefore inverted; see [23] for a list of examples. It is also possible to invert non-square matrices, using the reduction presented from the non-Hermitian case to the Hermitian one.
The matrix inversion algorithm can also handle ill-conditioned matrices by inverting only the part of $|b\rangle$ which is in the well-conditioned part of the matrix. Formally, instead of transforming $|b\rangle = \sum_j \beta_j |u_j\rangle$ to $|x\rangle = \sum_j \lambda_j^{-1} \beta_j |u_j\rangle$, we transform it to a state which is close to
$$\sum_{j,\, \lambda_j \geq 1/\kappa} \lambda_j^{-1} \beta_j\, |u_j\rangle |\mathrm{well}\rangle + \sum_{j,\, \lambda_j < 1/\kappa} \beta_j\, |u_j\rangle |\mathrm{ill}\rangle$$
in time proportional to $\kappa^2$ for any chosen $\kappa$ (i.e. not necessarily the true condition number of $A$). The last qubit is a flag which enables the user to estimate the size of the ill-conditioned part, or to handle it in any other way she wants. This behavior can be advantageous if we know that $A$ is not invertible and we are interested in the projection of $|b\rangle$ on the well-conditioned part of $A$.
Another method that is often used in classical algorithms to handle ill-conditioned matrices is to apply a preconditioner [24]. If we have a method of generating a preconditioner matrix $B$ such that $\kappa(AB)$ is smaller than $\kappa(A)$, then we can solve $A\vec{x} = \vec{b}$ by instead solving $(AB)\vec{c} = B\vec{b}$, a matrix inversion problem with a better-conditioned matrix. Further, if $A$ and $B$ are both sparse, then $AB$ is as well. Thus, as long as a state proportional to $B|b\rangle$ can be efficiently prepared, our algorithm could potentially run much faster if a suitable preconditioner is used.
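The effect of a preconditioner on the condition number is easy to see on a toy example. The matrix and the diagonal (Jacobi-style) preconditioner below are our own illustrative choices, not a construction from the paper:

```python
import numpy as np

# A nearly diagonal but badly scaled matrix: kappa(A) is about 1000.
A = np.diag([1.0, 10.0, 100.0, 1000.0])
A[0, 1] = 0.1                     # small off-diagonal coupling

# Jacobi-style preconditioner: invert the diagonal. Then A @ B has
# columns rescaled to unit diagonal, and its condition number is
# close to 1.
B = np.diag(1.0 / np.diag(A))

print(np.linalg.cond(A), np.linalg.cond(A @ B))
```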
The outputs of the algorithm can also be generalized. We can estimate degree-$2k$ polynomials in the entries of $\vec{x}$ by generating $k$ copies of $|x\rangle$ and measuring the $nk$-qubit observable
$$\sum_{i_1,\ldots,i_k,\, j_1,\ldots,j_k} M_{i_1,\ldots,i_k,\, j_1,\ldots,j_k}\, |i_1, \ldots, i_k\rangle \langle j_1, \ldots, j_k|$$
on the state $|x\rangle^{\otimes k}$. Alternatively, one can use our algorithm to generate a quantum analogue of Monte Carlo, where given $A$ and $\vec{b}$ we sample from the vector $\vec{x}$, meaning that the value $i$ occurs with probability $|x_i|^2$.
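The Monte Carlo output mode corresponds classically to the following sketch (the diagonal system is our toy example): solve the system, then sample indices with probability proportional to $|x_i|^2$, which is the distribution obtained by measuring $|x\rangle$ in the computational basis:

```python
import numpy as np

# Solve A x = b, then sample index i with probability |x_i|^2 / ||x||^2.
rng = np.random.default_rng(4)
A = np.diag([1.0, 2.0, 4.0])   # simple invertible example
b = np.array([1.0, 1.0, 1.0])

x = np.linalg.solve(A, b)      # x = [1, 0.5, 0.25]
p = np.abs(x) ** 2
p /= p.sum()                   # measurement probabilities

samples = rng.choice(len(x), size=20000, p=p)
freq = np.bincount(samples, minlength=len(x)) / len(samples)
print(np.allclose(freq, p, atol=0.02))  # empirical frequencies match
```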
Perhaps the most far-reaching generalization of the matrix inversion algorithm is not to invert matrices at all! Instead, it can compute $f(A)|b\rangle$ for any computable $f$. Depending on the degree of nonlinearity of $f$, nontrivial tradeoffs between accuracy and efficiency arise. Some variants of this idea are considered in [25, 14].
Finally, we consider physical aspects regarding the implementation and performance of the quantum matrix inversion algorithm. The algorithm allows considerable flexibility in how the matrix to be inverted $A$, the measurement matrix $M$, and the vector $\vec{b}$ are represented. If the entries of $A$ and $M$ represent data, they can be stored in quantum memory: a large quantum memory ($O(Ns)$ slots) is needed, but the physical requirements for quantum memory are significantly less demanding than those for full-blown quantum computation [26]. Alternatively, if $A$ or $M$ represents some computable transformation, e.g., a discretized differential operator, then its entries can be computed in quantum parallel, in which case no quantum memory is required. Similarly, the components of $\vec{b}$ could either be computed or stored in quantum memory in a form that allows the construction of $|b\rangle$, e.g. using Ref. [16]. No matter how the inputs are represented, the quantum processor that performs the algorithm itself is exponentially smaller than the matrix to be inverted: a quantum computer with under one hundred qubits suffices to invert a matrix with Avogadro's number of entries. Similarly, the exponential speedup of the algorithm allows it to be performed with a relatively small number of quantum operations, thereby reducing the overhead required for quantum error correction.
Acknowledgements. We thank the W.M. Keck foundation for support,
and AWH thanks them as well as MIT for hospitality while this work was carried
out. AWH was also funded by the U.K. EPSRC grant “QIP IRC.” SL thanks R.
Zecchina for encouraging him to work on this problem. D. Farmer, M. Tegmark,
S. Mitter, and P. Parillo supplied useful applications for this algorithm. We
are grateful as well to R. Cleve, S. Gharabian and D. Spielman for helpful
discussions.
References

[1] P. W. Shor. Algorithms for quantum computation: discrete logarithms and factoring. In S. Goldwasser, editor, Proceedings: 35th Annual Symposium on Foundations of Computer Science, pages 124–134. IEEE Computer Society Press, 1994.

[2] S. Lloyd. Universal quantum simulators. Science, 273:1073–1078, August 1996.

[3] Aram W. Harrow, Avinatan Hassidim, and Seth Lloyd. Quantum algorithm for solving linear systems of equations, 2009. Supplementary material.

[4] D. G. Luenberger. Introduction to Dynamic Systems: Theory, Models, and Applications. Wiley, New York, 1979.

[5] H. Buhrman, R. Cleve, J. Watrous, and R. De Wolf. Quantum fingerprinting. Physical Review Letters, 87(16):167902, 2001.

[6] Paul Valiant. Testing symmetric properties of distributions. In STOC, pages 383–392, 2008.

[7] Edward Farhi, Jeffrey Goldstone, Sam Gutmann, Joshua Lapan, Andrew Lundgren, and Daniel Preda. A Quantum Adiabatic Evolution Algorithm Applied to Random Instances of an NP-Complete Problem. Science, 292(5516):472–475, 2001.

[8] E. Knill, R. Laflamme, and G. J. Milburn. A scheme for efficient quantum computation with linear optics. Nature, 409:46–52, 2001.

[9] D. Aharonov and I. Arad. The BQP-hardness of approximating the Jones Polynomial, 2006. arXiv:quant-ph/0605181.

[10] M. A. Nielsen, M. R. Dowling, M. Gu, and A. C. Doherty. Quantum Computation as Geometry, 2006.

[11] M. H. Freedman, M. Larsen, and Z. Wang. A modular functor which is universal for quantum computation. Comm. Math. Phys., 227(3):605–622, 2002.

[12] M. H. Freedman, A. Kitaev, M. J. Larsen, and Z. Wang. Topological quantum computation. Bull. Am. Math. Soc., 40(1):31–38, 2003.

[13] A. Klappenecker and M. Roetteler. Engineering functional quantum algorithms. Phys. Rev. A, 67:010302, 2003.

[14] S. K. Leyton and T. J. Osborne. A quantum algorithm to solve nonlinear differential equations, 2008. arXiv:0812.4423.

[15] D. W. Berry, G. Ahokas, R. Cleve, and B. C. Sanders. Efficient Quantum Algorithms for Simulating Sparse Hamiltonians. Comm. Math. Phys., 270(2):359–371, 2007. arXiv:quant-ph/0508139.

[16] L. Grover and T. Rudolph. Creating superpositions that correspond to efficiently integrable probability distributions. arXiv:quant-ph/0208112.

[17] R. Cleve, A. Ekert, C. Macchiavello, and M. Mosca. Quantum Algorithms Revisited, 1997. arXiv:quant-ph/9708016.

[18] V. Buzek, R. Derka, and S. Massar. Optimal quantum clocks. Phys. Rev. Lett., 82:2207–2210, 1999. arXiv:quant-ph/9808042.

[19] G. Brassard, P. Høyer, M. Mosca, and A. Tapp. Quantum Amplitude Amplification and Estimation, volume 305 of Contemporary Mathematics Series Millennium Volume. AMS, 2002. arXiv:quant-ph/0005055.

[20] Jonathan R. Shewchuk. An Introduction to the Conjugate Gradient Method Without the Agonizing Pain. Technical Report CMU-CS-94-125, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, March 1994.

[21] Daniel R. Simon. On the power of quantum computation. SIAM J. Comp., 26:116–123, 1997.

[22] E. Farhi, J. Goldstone, S. Gutmann, and M. Sipser. A limit on the speed of quantum computation in determining parity. Phys. Rev. Lett., 81:5442–5444, 1998. arXiv:quant-ph/9802045.

[23] Andrew M. Childs. On the relationship between continuous- and discrete-time quantum walk, 2008. arXiv:0810.0312.

[24] K. Chen. Matrix Preconditioning Techniques and Applications. Cambridge Univ. Press, Cambridge, U.K., 2005.

[25] L. Sheridan, D. Maslov, and M. Mosca. Approximating fractional time quantum evolution, 2008. arXiv:0810.3843.

[26] Vittorio Giovannetti, Seth Lloyd, and Lorenzo Maccone. Quantum random access memory. Phys. Rev. Lett., 100:160501, 2008.

[27] D. Aharonov and A. Ta-Shma. Adiabatic quantum state generation and statistical zero knowledge. In Proceedings of the Thirty-Fifth Annual ACM Symposium on Theory of Computing (STOC), pages 20–29. ACM Press, New York, NY, USA, 2003. arXiv:quant-ph/0301023.

[28] P. C. Hansen. Rank-Deficient and Discrete Ill-Posed Problems: Numerical Aspects of Linear Inversion. SIAM, Philadelphia, PA, 1998.

[29] M. Sipser. Introduction to the Theory of Computation. International Thomson Publishing, 1996.

[30] C. H. Bennett, E. Bernstein, G. Brassard, and U. Vazirani. The strengths and weaknesses of quantum computation. SIAM Journal on Computing, 26:1510–1523, 1997.
A Supplementary Online Material

In this appendix, we describe and analyze our algorithm in full detail. While the body of the paper attempted to convey the spirit of the procedure and left out various improvements, here we take the opposite approach and describe everything, albeit possibly in a less intuitive way. We also describe in more detail our reductions from non-Hermitian matrix inversion to Hermitian matrix inversion (Section A.4) and from a general quantum computation to matrix inversion (Section A.5).
As inputs we require a procedure to produce the state $|b\rangle$, a method of producing the $\leq s$ nonzero elements of any row of $A$, and a choice of cutoff $\kappa$. Our runtime will be roughly quadratic in $\kappa$ and our algorithm is guaranteed to be correct if $\|A\| \leq 1$ and $\|A^{-1}\| \leq \kappa$.
The condition number is a crucial parameter in the algorithm. Here we present one possible method of handling ill-conditioned matrices. We will define the well-conditioned part of $A$ to be the span of the eigenspaces corresponding to eigenvalues $\geq 1/\kappa$ and the ill-conditioned part to be the rest. Our strategy will be to flag the ill-conditioned part of the matrix (without inverting it), and let the user choose how to further handle this. Since we cannot exactly resolve any eigenvalue, we can only approximately determine whether vectors are in the well- or ill-conditioned subspaces. Accordingly, we choose some $\kappa' > \kappa$ (say $\kappa' = 2\kappa$). Our algorithm then inverts the well-conditioned part of the matrix, flags any eigenvector with eigenvalue $\leq 1/\kappa'$ as ill-conditioned, and interpolates between these two behaviors when $1/\kappa' < |\lambda| < 1/\kappa$. This is described formally in the next section. We present this strategy not because it is necessarily ideal in all cases, but because it gives a concrete illustration of the key components of our algorithm.
Finally, the algorithm produces $|x\rangle$ only up to some error $\epsilon$ which is given as part of the input. We work only with pure states, and so define error in terms of distance between vectors, i.e. $\| |\alpha\rangle - |\beta\rangle \| = \sqrt{2(1 - \mathrm{Re}\langle\alpha|\beta\rangle)}$. Since ancilla states are produced and then imperfectly uncomputed by the algorithm, our output state will technically have high fidelity not with $|x\rangle$ but with $|x\rangle|000\ldots\rangle$. In general we do not write down ancilla qubits in the $|0\rangle$ state, so we write $|x\rangle$ instead of $|x\rangle|000\ldots\rangle$ for the target state, $|b\rangle$ instead of $|b\rangle|000\ldots\rangle$ for the initial state, and so on.
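The distance identity above holds for any pair of unit vectors, since $\||\alpha\rangle - |\beta\rangle\|^2 = 2 - 2\,\mathrm{Re}\langle\alpha|\beta\rangle$; a quick numerical check on random states:

```python
import numpy as np

# Verify || |a> - |b> || = sqrt(2 (1 - Re <a|b>)) for random unit vectors.
rng = np.random.default_rng(5)

def random_state(n):
    v = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    return v / np.linalg.norm(v)

a, b = random_state(8), random_state(8)
lhs = np.linalg.norm(a - b)
rhs = np.sqrt(2 * (1 - np.real(np.vdot(a, b))))  # vdot conjugates a
print(np.isclose(lhs, rhs))  # prints True
```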
A.1 Detailed description of the algorithm
To produce the input state $|b\rangle$, we assume that there exists an efficiently implementable unitary $B$, which when applied to $|\mathrm{initial}\rangle$ produces the state $|b\rangle$, possibly along with garbage in an ancilla register. We make no further assumption about $B$; it may represent another part of a larger algorithm, or a standard state-preparation procedure such as [16]. Let $T_B$ be the number of gates required to implement $B$. We neglect the possibility that $B$ errs in producing $|b\rangle$ since, without any other way of producing or verifying the state $|b\rangle$, we have no way to mitigate these errors. Thus, any errors in producing $|b\rangle$ necessarily translate directly into errors in the final state $|x\rangle$.
Next, we define the state
$$|\Psi_0\rangle = \sqrt{\frac{2}{T}} \sum_{\tau=0}^{T-1} \sin\frac{\pi(\tau+\frac{1}{2})}{T}\, |\tau\rangle \qquad (6)$$
for a $T$ to be chosen later. Using [16], we can prepare $|\Psi_0\rangle$ up to error $\epsilon_\Psi$ in time $\mathrm{poly}\log(T/\epsilon_\Psi)$.
One other subroutine we will need is Hamiltonian simulation. Using the reductions described in Section A.4, we can assume that $A$ is Hermitian. To simulate $e^{iAt}$ for some $t \geq 0$, we use the algorithm of [15]. If $A$ is $s$-sparse, $t \leq t_0$ and we want to guarantee that the error is $\leq \epsilon_H$, then this requires time
$$T_H = O\!\left(\log(N)\,(\log^*(N))^2\, s^2\, t_0\, 9^{\sqrt{\log(s^2 t_0/\epsilon_H)}}\right) = \tilde{O}(\log(N)\, s^2\, t_0). \qquad (7)$$
The scaling here is better than any power of $1/\epsilon_H$, which means that the additional error introduced by this step is negligible compared with the rest of the algorithm, and the runtime is almost linear in $t_0$. Note that this is the only step where we require that $A$ be sparse; as there are some other types of Hamiltonians which can be simulated efficiently (e.g. [27, 15, 23]), this broadens the set of matrices we can handle.
The key subroutine of the algorithm, denoted $U_{\mathrm{invert}}$, is defined as follows:

1. Prepare $|\Psi_0\rangle^C$ from $|0\rangle$ up to error $\epsilon_\Psi$.

2. Apply the conditional Hamiltonian evolution $\sum_{\tau=0}^{T-1} |\tau\rangle\langle\tau|^C \otimes e^{iA\tau t_0/T}$ up to error $\epsilon_H$.

3. Apply the Fourier transform to the register $C$. Denote the resulting basis states with $|k\rangle$, for $k = 0, \ldots, T-1$. Define $\tilde{\lambda}_k := 2\pi k/t_0$.
4. Adjoin a three-dimensional register $S$ in the state
$$|h(\tilde{\lambda}_k)\rangle^S := \sqrt{1 - f(\tilde{\lambda}_k)^2 - g(\tilde{\lambda}_k)^2}\, |\mathrm{nothing}\rangle^S + f(\tilde{\lambda}_k)\, |\mathrm{well}\rangle^S + g(\tilde{\lambda}_k)\, |\mathrm{ill}\rangle^S,$$
for functions $f(\lambda), g(\lambda)$ defined below in (8). Here 'nothing' indicates that the desired matrix inversion hasn't taken place, 'well' indicates that it has, and 'ill' means that part of $|b\rangle$ is in the ill-conditioned subspace of $A$.

5. Reverse steps 1–3, uncomputing any garbage produced along the way.
The functions $f(\lambda), g(\lambda)$ are known as filter functions [28], and are chosen so that for some constant $C > 1$: $f(\lambda) = 1/C\kappa\lambda$ for $\lambda \geq 1/\kappa$, $g(\lambda) = 1/C$ for $\lambda \leq 1/\kappa' := 1/2\kappa$ and $f^2(\lambda) + g^2(\lambda) \leq 1$ for all $\lambda$. Additionally, $f(\lambda)$ should satisfy a certain continuity property that we will describe in the next section. Otherwise the functions are arbitrary. One possible choice is
$$f(\lambda) = \begin{cases} \dfrac{1}{2\kappa\lambda} & \text{when } \lambda \geq 1/\kappa \\[1ex] \dfrac{1}{2}\sin\!\left(\dfrac{\pi}{2}\cdot\dfrac{\lambda - \frac{1}{\kappa'}}{\frac{1}{\kappa} - \frac{1}{\kappa'}}\right) & \text{when } \dfrac{1}{\kappa} > \lambda \geq \dfrac{1}{\kappa'} \\[1ex] 0 & \text{when } \dfrac{1}{\kappa'} > \lambda \end{cases} \qquad (8a)$$
$$g(\lambda) = \begin{cases} 0 & \text{when } \lambda \geq 1/\kappa \\[1ex] \dfrac{1}{2}\cos\!\left(\dfrac{\pi}{2}\cdot\dfrac{\lambda - \frac{1}{\kappa'}}{\frac{1}{\kappa} - \frac{1}{\kappa'}}\right) & \text{when } \dfrac{1}{\kappa} > \lambda \geq \dfrac{1}{\kappa'} \\[1ex] \dfrac{1}{2} & \text{when } \dfrac{1}{\kappa'} > \lambda \end{cases} \qquad (8b)$$
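The choice (8) can be checked numerically: $f$ and $g$ are continuous at the branch points, and $f^2 + g^2 \leq 1$ everywhere, so the 'nothing' amplitude $\sqrt{1 - f^2 - g^2}$ is always real. A sketch (with the paper's choice $C = 2$, $\kappa' = 2\kappa$; the particular $\kappa$ and grid are ours):

```python
import numpy as np

# Filter functions of Eq. (8) with C = 2 and kappa' = 2 * kappa:
# invert above 1/kappa, flag 'ill' below 1/kappa', interpolate between.
def f(lam, kappa):
    kp = 2 * kappa                      # kappa'
    if lam >= 1 / kappa:
        return 1 / (2 * kappa * lam)
    if lam >= 1 / kp:
        return 0.5 * np.sin(np.pi / 2 * (lam - 1/kp) / (1/kappa - 1/kp))
    return 0.0

def g(lam, kappa):
    kp = 2 * kappa
    if lam >= 1 / kappa:
        return 0.0
    if lam >= 1 / kp:
        return 0.5 * np.cos(np.pi / 2 * (lam - 1/kp) / (1/kappa - 1/kp))
    return 0.5

kappa = 10.0
lams = np.linspace(0.001, 1.0, 1000)
vals = [(f(l, kappa), g(l, kappa)) for l in lams]
print(all(ff**2 + gg**2 <= 1.0 + 1e-12 for ff, gg in vals))  # prints True
```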
If $U_{\mathrm{invert}}$ is applied to $|u_j\rangle$ it will, up to an error we will discuss below, adjoin the state $|h(\lambda_j)\rangle$. If instead we apply $U_{\mathrm{invert}}$ to $|b\rangle$ (i.e. a superposition of different $|u_j\rangle$), measure $S$ and obtain the outcome 'well', then we will have approximately applied an operator proportional to $A^{-1}$. Let $\tilde{p}$ (computed in the next section) denote the success probability of this measurement. Rather than repeating $1/\tilde{p}$ times, we will use amplitude amplification [19] to obtain the same results with $O(1/\sqrt{\tilde{p}})$ repetitions. To describe the procedure, we introduce two new operators:
$$R_{\mathrm{succ}} = I^S - 2|\mathrm{well}\rangle\langle\mathrm{well}|^S,$$
acting only on the $S$ register, and
$$R_{\mathrm{init}} = I - 2|\mathrm{initial}\rangle\langle\mathrm{initial}|.$$
Our main algorithm then follows the amplitude amplification procedure: we start with $U_{\mathrm{invert}} B |\mathrm{initial}\rangle$ and repeatedly apply $U_{\mathrm{invert}} B R_{\mathrm{init}} B^\dagger U_{\mathrm{invert}}^\dagger R_{\mathrm{succ}}$. Finally we measure $S$ and stop when we obtain the result 'well'. The number of repetitions would ideally be $\pi/4\sqrt{\tilde{p}}$, which in the next section we will show is $O(\kappa)$. While $\tilde{p}$ is initially unknown, the procedure has a constant probability of success if the number of repetitions is a constant fraction of $\pi/4\sqrt{\tilde{p}}$. Thus, following [19] we repeat the entire procedure with a geometrically increasing number of repetitions each time: $1, 2, 4, 8, \ldots$, until we have reached a power of two that is $\geq \kappa$. This yields a constant probability of success using $\leq 4\kappa$ repetitions.
Putting everything together, the runtime is $\tilde{O}(\kappa(T_B + t_0 s^2 \log(N)))$, where the $\tilde{O}$ suppresses the more slowly growing terms of $(\log^*(N))^2$, $\exp(O(1/\sqrt{\log(t_0/\epsilon_H)}))$ and $\mathrm{poly}\log(T/\epsilon_\Psi)$. In the next section, we will show that $t_0$ can be taken to be $O(\kappa/\epsilon)$ so that the total runtime is $\tilde{O}(\kappa T_B + \kappa^2 s^2 \log(N)/\epsilon)$.
A.2 Error Analysis
In this section we show that taking t₀ = O(κ/ε) introduces an error of ≤ ε in the final state. The main subtlety in analyzing the error comes from the postselection step, in which we choose only the part of the state attached to the |well⟩ register. This can potentially magnify errors in the overall state. On the other hand, we may also be interested in the non-postselected state, which results from applying U_invert a single time to |b⟩. For instance, this could be used to estimate the amount of weight of |b⟩ lying in the ill-conditioned components of A. Somewhat surprisingly, we show that the error in both cases is upper-bounded by O(κ/t₀).
In this section, it will be convenient to ignore the error terms ε_H and ε_Ψ, as these can be made negligible with relatively little effort and it is the errors from phase estimation that will dominate. Let Ũ denote a version of U_invert in which everything except the phase estimation is exact. Since ‖Ũ − U_invert‖ ≤ O(ε_H + ε_Ψ), it is sufficient to work with Ũ. Define U to be the ideal version of U_invert in which there is no error in any step.
Theorem A.1 (Error bounds).

1. In the case when no postselection is performed, the error is bounded as
$$ \|\tilde U - U\| \le O(\kappa/t_0). \tag{9} $$

2. If we postselect on the flag register being in the space spanned by {|well⟩, |ill⟩} and define the normalized ideal state to be |x⟩ and our actual state to be |x̃⟩, then
$$ \|\,|\tilde x\rangle - |x\rangle\,\| \le O(\kappa/t_0). \tag{10} $$

3. If |b⟩ is entirely within the well-conditioned subspace of A and we postselect on the flag register being |well⟩, then
$$ \|\,|\tilde x\rangle - |x\rangle\,\| \le O(\kappa/t_0). \tag{11} $$
The third claim is often of the most practical interest, but the other two are useful if we want to work with the ill-conditioned space, or estimate its weight.

The rest of the section is devoted to the proof of Theorem A.1. We first show that the third claim is a corollary of the second, and then prove the first two claims more or less independently. To prove (11) assuming (10), observe that if |b⟩ is entirely in the well-conditioned space, the ideal state |x⟩ is proportional to A^{−1}|b⟩|well⟩. Model the postselection on |well⟩ by a postselection first on the space spanned by {|well⟩, |ill⟩}, followed by a postselection onto |well⟩. By (10), the first postselection leaves us with error O(κ/t₀). This implies that the second postselection will succeed with probability ≥ 1 − O(κ²/t₀²) and therefore will increase the error by at most O(κ/t₀). The final error is then O(κ/t₀), as claimed in (11).
Now we turn to the proof of (9). A crucial piece of the proof will be the following statement about the continuity of |h(λ)⟩.

Lemma A.2. The map λ ↦ |h(λ)⟩ is O(κ)-Lipschitz, meaning that for any λ₁ ≠ λ₂,

$$ \|\,|h(\lambda_1)\rangle - |h(\lambda_2)\rangle\,\| = \sqrt{2\big(1 - \mathrm{Re}\,\langle h(\lambda_1)|h(\lambda_2)\rangle\big)} \le c\kappa\,|\lambda_1 - \lambda_2|, $$

for some c = O(1).
Proof. Since λ ↦ |h(λ)⟩ is continuous everywhere and differentiable everywhere except at 1/κ and 1/κ′, it suffices to bound the norm of (d/dλ)|h(λ)⟩. We consider it piece by piece. When λ > 1/κ,

$$ \frac{d}{d\lambda}|h(\lambda)\rangle = \frac{1}{4\kappa^2\lambda^3\sqrt{1 - \frac{1}{4\kappa^2\lambda^2}}}\,|\text{nothing}\rangle - \frac{1}{2\kappa\lambda^2}\,|\text{well}\rangle, $$

which has squared norm 1/(4κ²λ⁴(4κ²λ² − 1)) + 1/(4κ²λ⁴) ≤ κ². Next, when 1/κ′ < λ < 1/κ, the squared amplitudes f(λ)² + g(λ)² = 1/4 are constant, so the |nothing⟩ component does not vary and the norm of (d/dλ)|h(λ)⟩ is (1/2)(π/2) · 1/(1/κ − 1/κ′) = (π/2)κ. Finally (d/dλ)|h(λ)⟩ = 0 when λ < 1/κ′. This completes the proof, with c = π/2. ∎
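Lemma A.2 can be spot-checked numerically. The sketch below builds |h(λ)⟩ from the filter functions (assuming κ′ = 2κ and the decomposition |h(λ)⟩ = √(1 − f² − g²)|nothing⟩ + f|well⟩ + g|ill⟩, with illustrative parameter values) and verifies that finite differences never exceed the Lipschitz constant (π/2)κ:

```python
import math

def h_state(lam, kappa):
    """Components of |h(lambda)> in the (nothing, well, ill) basis."""
    kp = 2.0 * kappa
    if lam >= 1.0 / kappa:
        f, g = 1.0 / (2.0 * kappa * lam), 0.0
    elif lam >= 1.0 / kp:
        th = (math.pi / 2) * (lam - 1 / kp) / (1 / kappa - 1 / kp)
        f, g = 0.5 * math.sin(th), 0.5 * math.cos(th)
    else:
        f, g = 0.0, 0.5
    return (math.sqrt(1.0 - f * f - g * g), f, g)

def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

kappa = 8.0                                   # illustrative value
pts = [i / 1000.0 for i in range(1, 1001)]    # lambda samples in (0, 1]
worst = 0.0
for l1, l2 in zip(pts, pts[1:]):
    worst = max(worst, dist(h_state(l1, kappa), h_state(l2, kappa)) / (l2 - l1))
```

The largest finite-difference ratio occurs in the transition region, where the derivative norm equals (π/2)κ.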
Now we return to the proof of (9). Let P̃ denote the first three steps of the algorithm. They can be thought of as mapping the initial zero qubits to a |k⟩ register, together with some garbage, as follows:

$$ \tilde P = \sum_{j=1}^{N} |u_j\rangle\langle u_j| \otimes \sum_k \alpha_{kj}\,|k\rangle\,|\text{garbage}(j,k)\rangle\,\langle\text{initial}|, $$

where the guarantee that the phase estimation algorithm gives us is that α_{kj} is concentrated around λ_j ≈ 2πk/t₀ =: λ̃_k. Technically, P̃ should be completed to make it a unitary operator by defining some arbitrary behavior on inputs other than |initial⟩ in the last register.

Consider a test state |b⟩ = Σ_{j=1}^N β_j|u_j⟩. The ideal functionality is defined by

$$ |\varphi\rangle = U|b\rangle = \sum_{j=1}^{N} \beta_j\,|u_j\rangle\,|h(\lambda_j)\rangle, $$
while the actual algorithm produces the state

$$ |\tilde\varphi\rangle = \tilde U|b\rangle = \tilde P^\dagger \sum_{j=1}^{N} \beta_j\,|u_j\rangle \sum_k \alpha_{kj}\,|k\rangle\,|h(\tilde\lambda_k)\rangle. $$

We wish to calculate ⟨φ̃|φ⟩, or equivalently the inner product between P̃|φ̃⟩ and P̃|φ⟩ = Σ_{j,k} β_j α_{kj} |u_j⟩|k⟩|h(λ_j)⟩. This inner product is

$$ \langle\tilde\varphi|\varphi\rangle = \sum_{j=1}^{N} |\beta_j|^2 \sum_k |\alpha_{kj}|^2\,\big\langle h(\tilde\lambda_k)\big|h(\lambda_j)\big\rangle =: \mathbb{E}_j\,\mathbb{E}_k\,\big\langle h(\tilde\lambda_k)\big|h(\lambda_j)\big\rangle, $$

where we think of j and k as random variables with joint distribution Pr(j, k) = |β_j|²|α_{kj}|². Thus

$$ \mathrm{Re}\,\langle\tilde\varphi|\varphi\rangle = \mathbb{E}_j\,\mathbb{E}_k\,\mathrm{Re}\,\big\langle h(\tilde\lambda_k)\big|h(\lambda_j)\big\rangle. $$
Let δ = λ_j t₀ − 2πk = t₀(λ_j − λ̃_k). From Lemma A.2, Re⟨h(λ̃_k)|h(λ_j)⟩ ≥ 1 − c²κ²δ²/2t₀², where c ≤ π/2 is a constant. There are two sources of infidelity. For δ ≤ 2π, the inner product is at least 1 − 2π²c²κ²/t₀². For larger values of δ, we use the bound |α_{kj}|² ≤ 64π²/(λ_j t₀ − 2πk)⁴ (proved in Section A.3) to find an infidelity contribution that is
an inﬁdelity contribution that is
≤ 2
∞
k=
λ
j
t
0
2π
+1
64π
2
δ
4
c
2
κ
2
δ
2
2t
2
0
=
64π
2
c
2
κ
2
t
2
0
∞
k=1
1
4π
2
k
2
=
8π
2
c
2
3
κ
2
t
2
0
.
Summarizing, we ﬁnd that Re ¸ ˜ ϕ[ϕ) ≥ 1 −5π
2
c
2
κ
2
/t
2
0
, which translates into
 [ ˜ ϕ) −[ϕ)  ≤ 4πcκ/t
0
= 2π
2
κ/t
0
. Since the initial state [b) was arbitrary, this
bounds the operator distance 
˜
U −U as claimed in (9).
Turning now to the postselected case, we observe that

$$ |x\rangle := \frac{f(A)|b\rangle|\text{well}\rangle + g(A)|b\rangle|\text{ill}\rangle}{\sqrt{\langle b|\,(f(A)^2 + g(A)^2)\,|b\rangle}} \tag{12} $$
$$ = \frac{\sum_j \beta_j|u_j\rangle\,\big(f(\lambda_j)|\text{well}\rangle + g(\lambda_j)|\text{ill}\rangle\big)}{\sqrt{\sum_j |\beta_j|^2\,(f(\lambda_j)^2 + g(\lambda_j)^2)}} \tag{13} $$
$$ =: \frac{\sum_j \beta_j|u_j\rangle\,\big(f(\lambda_j)|\text{well}\rangle + g(\lambda_j)|\text{ill}\rangle\big)}{\sqrt{p}}, \tag{14} $$

where in the last step we have defined

$$ p := \mathbb{E}_j\big[f(\lambda_j)^2 + g(\lambda_j)^2\big] $$

to be the probability that the postselection succeeds. Naively, this postselection could magnify the errors by as much as 1/√p, but by careful examination of the errors, we find that this worst-case situation only occurs when the errors are small in the first place. This is what will allow us to obtain the same O(κ/t₀) error bound even in the postselected state.
Now write the actual state that we produce as

$$ |\tilde x\rangle := \frac{\tilde P^\dagger \sum_{j=1}^{N} \beta_j|u_j\rangle \sum_k \alpha_{kj}|k\rangle\,\big(f(\tilde\lambda_k)|\text{well}\rangle + g(\tilde\lambda_k)|\text{ill}\rangle\big)}{\sqrt{\mathbb{E}_{j,k}\big[f(\tilde\lambda_k)^2 + g(\tilde\lambda_k)^2\big]}} \tag{15} $$
$$ =: \frac{\tilde P^\dagger \sum_{j=1}^{N} \beta_j|u_j\rangle \sum_k \alpha_{kj}|k\rangle\,\big(f(\tilde\lambda_k)|\text{well}\rangle + g(\tilde\lambda_k)|\text{ill}\rangle\big)}{\sqrt{\tilde p}}, \tag{16} $$

where we have defined p̃ = E_{j,k}[f(λ̃_k)² + g(λ̃_k)²].
Recall that j and k are random variables with joint distribution Pr(j, k) = |β_j|²|α_{kj}|². We evaluate the contribution of a single j value. Define λ := λ_j and λ̃ := 2πk/t₀. Note that δ = t₀(λ − λ̃) and that Eδ, Eδ² = O(1). Here δ depends implicitly on both j and k, and the above bounds on its expectations hold even when conditioning on an arbitrary value of j. We further abbreviate f := f(λ), f̃ := f(λ̃), g := g(λ) and g̃ := g(λ̃). Thus p = E[f² + g²] and p̃ = E[f̃² + g̃²].
Our goal is to bound ‖|x̃⟩ − |x⟩‖ in (10). We work instead with the fidelity

$$ F := \langle\tilde x|x\rangle = \frac{\mathbb{E}[f\tilde f + g\tilde g]}{\sqrt{p\tilde p}} = \frac{\mathbb{E}[f^2 + g^2] + \mathbb{E}[(\tilde f - f)f + (\tilde g - g)g]}{p\sqrt{1 + \frac{\tilde p - p}{p}}} \tag{17} $$
$$ = \frac{1 + \frac{\mathbb{E}[(\tilde f - f)f + (\tilde g - g)g]}{p}}{\sqrt{1 + \frac{\tilde p - p}{p}}} \ge \left(1 + \frac{\mathbb{E}[(\tilde f - f)f + (\tilde g - g)g]}{p}\right)\left(1 - \frac{1}{2}\cdot\frac{\tilde p - p}{p}\right). \tag{18} $$

Next we expand

$$ \tilde p - p = \mathbb{E}[\tilde f^2 - f^2] + \mathbb{E}[\tilde g^2 - g^2] \tag{19} $$
$$ = \mathbb{E}[(\tilde f - f)(\tilde f + f)] + \mathbb{E}[(\tilde g - g)(\tilde g + g)] \tag{20} $$
$$ = 2\,\mathbb{E}[(\tilde f - f)f] + 2\,\mathbb{E}[(\tilde g - g)g] + \mathbb{E}[(\tilde f - f)^2] + \mathbb{E}[(\tilde g - g)^2]. \tag{21} $$

Substituting into (18), we find

$$ F \ge 1 - \frac{\mathbb{E}[(\tilde f - f)^2 + (\tilde g - g)^2]}{2p} - \frac{\mathbb{E}[(\tilde f - f)f + (\tilde g - g)g]}{p}\cdot\frac{\tilde p - p}{2p}. \tag{22} $$
We now need an analogue of the Lipschitz condition given in Lemma A.2.
Lemma A.3. Let f, f̃, g, g̃ be defined as above, with κ′ = 2κ. Then

$$ |f - \tilde f|^2 + |g - \tilde g|^2 \le c\,\frac{\kappa'^2}{t_0^2}\,\delta^2\,(f^2 + g^2), $$

where c = π²/2.

Proof. Remember that f̃ = f(λ − δ/t₀) and similarly for g̃.
Consider first the case when λ ≥ 1/κ. In this case g = 0, and we need to show that

$$ |f - \tilde f| \le \frac{2\kappa|\delta| f}{t_0} = \frac{|\lambda - \tilde\lambda|}{\lambda}. \tag{23} $$

To prove this, we consider four cases. First, suppose λ̃ ≥ 1/κ. Then |f − f̃| = (1/2κ)·|λ̃ − λ|/(λ̃λ) ≤ |δ|/(2t₀λ). Next, suppose λ = 1/κ (so f = 1/2) and λ̃ < 1/κ. Since sin((π/2)α) ≥ α for 0 ≤ α ≤ 1, we have

$$ |f - \tilde f| \le \frac{1}{2} - \frac{1}{2}\cdot\frac{\tilde\lambda - \frac{1}{\kappa'}}{\frac{1}{\kappa} - \frac{1}{\kappa'}} = \frac{1}{2} - \kappa\Big(\tilde\lambda - \frac{1}{2\kappa}\Big) = \kappa\Big(\frac{1}{\kappa} - \tilde\lambda\Big), \tag{24} $$

and using λ = 1/κ we find that |f − f̃| ≤ (λ − λ̃)/λ, as desired. Next, if λ̃ < 1/κ < λ and f < f̃ then replacing λ with 1/κ only makes the inequality tighter. Finally, suppose λ̃ < 1/κ < λ and f̃ < f. Using (24) and λ > 1/κ we find that f − f̃ ≤ 1 − κλ̃ < 1 − λ̃/λ = (λ − λ̃)/λ, as desired.
Now, suppose that λ < 1/κ. Then

$$ |f - \tilde f|^2 \le \frac{\delta^2}{t_0^2}\,\max|f'|^2 = \frac{\pi^2}{4}\cdot\frac{\delta^2}{t_0^2}\,\kappa^2, $$

and similarly

$$ |g - \tilde g|^2 \le \frac{\delta^2}{t_0^2}\,\max|g'|^2 = \frac{\pi^2}{4}\cdot\frac{\delta^2}{t_0^2}\,\kappa^2. $$

Finally f(λ)² + g(λ)² = 1/4 for any λ ≤ 1/κ, so (recalling κ′ = 2κ and c = π²/2) the sum of the two bounds is at most c(κ′²/t₀²)δ²(f² + g²), implying the result. ∎
Now we use Lemma A.3 to bound the two error contributions in (18). First we bound

$$ \frac{\mathbb{E}[(\tilde f - f)^2 + (\tilde g - g)^2]}{2p} \le O\!\left(\frac{\kappa^2}{t_0^2}\right)\frac{\mathbb{E}[(f^2+g^2)\,\delta^2]}{\mathbb{E}[f^2+g^2]} \le O\!\left(\frac{\kappa^2}{t_0^2}\right). \tag{25} $$

The first inequality used Lemma A.3 and the second used the fact that E[δ²] ≤ O(1) even when conditioned on an arbitrary value of j (or equivalently λ_j).
Next,

$$ \frac{\mathbb{E}[(\tilde f - f)f + (\tilde g - g)g]}{p} \le \frac{\mathbb{E}\!\left[\sqrt{\big((\tilde f - f)^2 + (\tilde g - g)^2\big)\,(f^2+g^2)}\right]}{p} \tag{26} $$
$$ \le \frac{\mathbb{E}\!\left[\sqrt{c\,\frac{\delta^2\kappa'^2}{t_0^2}\,(f^2+g^2)^2}\right]}{p} \tag{27} $$
$$ \le O\!\left(\frac{\kappa}{t_0}\right), \tag{28} $$

where the first inequality is Cauchy-Schwarz, the second is Lemma A.3 and the last uses the fact that E[|δ|] ≤ √(E[δ²]) = O(1) even when conditioned on j.
We now substitute (25) and (28) into (21) (and assume κ ≤ t₀) to find

$$ \frac{|\tilde p - p|}{p} \le O\!\left(\frac{\kappa}{t_0}\right). \tag{29} $$

Substituting (25), (28) and (29) into (22), we find Re⟨x̃|x⟩ ≥ 1 − O(κ²/t₀²), or equivalently, that ‖|x̃⟩ − |x⟩‖ ≤ O(κ/t₀). This completes the proof of Theorem A.1. ∎
A.3 Phase estimation calculations

Here we describe, in our notation, the improved phase-estimation procedure of [18], and prove the concentration bounds on |α_{kj}|. Adjoin the state

$$ |\Psi_0\rangle = \sqrt{\frac{2}{T}} \sum_{\tau=0}^{T-1} \sin\frac{\pi(\tau + \frac12)}{T}\,|\tau\rangle. $$
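The prefactor √(2/T) makes |Ψ₀⟩ a unit vector, since Σ_τ sin²(π(τ + ½)/T) = T/2. A quick numerical check:

```python
import math

def psi0(T):
    """Amplitudes of |Psi_0>: sqrt(2/T) * sin(pi*(tau+1/2)/T), tau=0..T-1."""
    return [math.sqrt(2.0 / T) * math.sin(math.pi * (t + 0.5) / T)
            for t in range(T)]

norm2 = sum(a * a for a in psi0(64))   # squared norm; should be 1
```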
Apply the conditional Hamiltonian evolution Σ_τ |τ⟩⟨τ| ⊗ e^{iAτt₀/T}. Assume the target state is |u_j⟩, so this becomes simply the conditional phase Σ_τ |τ⟩⟨τ| e^{iλ_j t₀ τ/T}. The resulting state is

$$ \big|\Psi_{\lambda_j t_0}\big\rangle = \sqrt{\frac{2}{T}} \sum_{\tau=0}^{T-1} e^{\frac{i\lambda_j t_0 \tau}{T}} \sin\frac{\pi(\tau+\frac12)}{T}\,|\tau\rangle\,|u_j\rangle. $$
We now measure in the Fourier basis, and find that the inner product with (1/√T) Σ_{τ=0}^{T−1} e^{2πikτ/T} |τ⟩|u_j⟩ is (defining δ := λ_j t₀ − 2πk):
$$ \alpha_{kj} = \frac{\sqrt 2}{T} \sum_{\tau=0}^{T-1} e^{\frac{i\tau}{T}(\lambda_j t_0 - 2\pi k)} \sin\frac{\pi(\tau+\frac12)}{T} \tag{30} $$
$$ = \frac{1}{i\sqrt 2\,T} \sum_{\tau=0}^{T-1} e^{\frac{i\tau\delta}{T}} \left( e^{\frac{i\pi(\tau+1/2)}{T}} - e^{-\frac{i\pi(\tau+1/2)}{T}} \right) \tag{31} $$
$$ = \frac{1}{i\sqrt 2\,T} \sum_{\tau=0}^{T-1} \left( e^{\frac{i\pi}{2T}}\, e^{i\tau\frac{\delta+\pi}{T}} - e^{-\frac{i\pi}{2T}}\, e^{i\tau\frac{\delta-\pi}{T}} \right) \tag{32} $$
$$ = \frac{1}{i\sqrt 2\,T} \left( e^{\frac{i\pi}{2T}}\,\frac{1 - e^{i\pi+i\delta}}{1 - e^{i\frac{\delta+\pi}{T}}} - e^{-\frac{i\pi}{2T}}\,\frac{1 - e^{i\pi+i\delta}}{1 - e^{i\frac{\delta-\pi}{T}}} \right) \tag{33} $$
$$ = \frac{1 + e^{i\delta}}{i\sqrt 2\,T} \left( \frac{e^{-i\delta/2T}}{e^{-\frac{i}{2T}(\delta+\pi)} - e^{\frac{i}{2T}(\delta+\pi)}} - \frac{e^{-i\delta/2T}}{e^{-\frac{i}{2T}(\delta-\pi)} - e^{\frac{i}{2T}(\delta-\pi)}} \right) \tag{34} $$
$$ = \frac{(1 + e^{i\delta})\,e^{-i\delta/2T}}{i\sqrt 2\,T} \left( \frac{1}{-2i\sin\!\left(\frac{\delta+\pi}{2T}\right)} - \frac{1}{-2i\sin\!\left(\frac{\delta-\pi}{2T}\right)} \right) \tag{35} $$
$$ = e^{\frac{i\delta}{2}(1-\frac1T)}\,\frac{\sqrt 2\cos(\frac{\delta}{2})}{2T} \left( \frac{1}{\sin\!\left(\frac{\delta+\pi}{2T}\right)} - \frac{1}{\sin\!\left(\frac{\delta-\pi}{2T}\right)} \right) \tag{36} $$
$$ = e^{\frac{i\delta}{2}(1-\frac1T)}\,\frac{\sqrt 2\cos(\frac{\delta}{2})}{2T}\cdot \frac{\sin\!\left(\frac{\delta-\pi}{2T}\right) - \sin\!\left(\frac{\delta+\pi}{2T}\right)}{\sin\!\left(\frac{\delta+\pi}{2T}\right)\sin\!\left(\frac{\delta-\pi}{2T}\right)} \tag{37} $$
$$ = -\,e^{\frac{i\delta}{2}(1-\frac1T)}\,\frac{\sqrt 2\cos(\frac{\delta}{2})}{2T}\cdot \frac{2\cos\!\left(\frac{\delta}{2T}\right)\sin\!\left(\frac{\pi}{2T}\right)}{\sin\!\left(\frac{\delta+\pi}{2T}\right)\sin\!\left(\frac{\delta-\pi}{2T}\right)}. \tag{38} $$
Following [18], we make the assumption that 2π ≤ δ ≤ T/10. Further using α − α³/6 ≤ sin α ≤ α and ignoring phases, we find that

$$ |\alpha_{kj}| \le \frac{2\pi\sqrt 2}{(\delta^2 - \pi^2)\left(1 - \frac{\delta^2 + \pi^2}{12T^2}\right)} \le \frac{8\pi}{\delta^2}. \tag{39} $$

Thus |α_{kj}|² ≤ 64π²/δ⁴ whenever |k − λ_j t₀/2π| ≥ 1.
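The concentration of α_{kj} can be checked by summing the series (30) directly. The sketch below (with illustrative T, t₀ and λ_j of our own choosing) verifies the |α_{kj}| ≤ 8π/δ² tail bound, in squared form, over the regime 2π ≤ |δ| ≤ T/10:

```python
import cmath, math

def alpha(k, lam, t0, T):
    """alpha_{kj} from Eq. (30), computed by direct summation."""
    s = 0j
    for tau in range(T):
        s += (cmath.exp(1j * tau * (lam * t0 - 2 * math.pi * k) / T)
              * math.sin(math.pi * (tau + 0.5) / T))
    return s * math.sqrt(2.0) / T

T, t0, lam = 512, 64.0, 1.0       # peak sits near k = lam*t0/(2*pi) ~ 10
ok, checked = True, 0
for k in range(T):
    delta = lam * t0 - 2 * math.pi * k
    if 2 * math.pi <= abs(delta) <= T / 10:
        checked += 1
        ok = ok and abs(alpha(k, lam, t0, T)) ** 2 <= 64 * math.pi ** 2 / delta ** 4
```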
A.4 The non-Hermitian case

Suppose A ∈ C^{M×N} with M ≤ N. Generically Ax = b is now underconstrained. Let the singular value decomposition of A be

$$ A = \sum_{j=1}^{M} \lambda_j\,|u_j\rangle\langle v_j|, $$

with |u_j⟩ ∈ C^M, |v_j⟩ ∈ C^N and λ₁ ≥ ⋯ ≥ λ_M ≥ 0. Let V = span{|v₁⟩, …, |v_M⟩}. Define

$$ H = \begin{pmatrix} 0 & A \\ A^\dagger & 0 \end{pmatrix}. \tag{40} $$
H is Hermitian with eigenvalues ±λ₁, …, ±λ_M, corresponding to eigenvectors |w_j^±⟩ := (1/√2)(|0⟩|u_j⟩ ± |1⟩|v_j⟩). It also has N − M zero eigenvalues, corresponding to the orthogonal complement of V.

To run our algorithm we use the input |0⟩|b⟩. If |b⟩ = Σ_{j=1}^M β_j|u_j⟩ then

$$ |0\rangle|b\rangle = \sum_{j=1}^{M} \beta_j\,\frac{1}{\sqrt 2}\big(|w_j^+\rangle + |w_j^-\rangle\big), $$
and running the inversion algorithm yields a state proportional to

$$ H^{-1}|0\rangle|b\rangle = \sum_{j=1}^{M} \beta_j \lambda_j^{-1}\,\frac{1}{\sqrt 2}\big(|w_j^+\rangle - |w_j^-\rangle\big) = \sum_{j=1}^{M} \beta_j \lambda_j^{-1}\,|1\rangle|v_j\rangle. $$

Dropping the initial |1⟩, this defines our solution |x⟩. Note that our algorithm does not produce any component in V^⊥, although doing so would have also yielded valid solutions. In this sense, it could be said to be finding the |x⟩ that minimizes ⟨x|x⟩ while solving A|x⟩ = |b⟩.
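The eigenstructure of H can be illustrated on a tiny example. Below, a hypothetical 1×2 matrix A with a single singular triple (λ = 1, u = [1], v = [0.6, 0.8], values of our own choosing) is embedded as in (40), and H is checked to send |w_j^±⟩ to ±λ_j|w_j^±⟩:

```python
import math

# A = [[0.6, 0.8]]  (M=1, N=2): singular value 1, u = [1], v = [0.6, 0.8].
# H = [[0, A], [A^dagger, 0]] acts on C^{M+N} = C^3.
H = [[0.0, 0.6, 0.8],
     [0.6, 0.0, 0.0],
     [0.8, 0.0, 0.0]]

def matvec(M, x):
    return [sum(row[i] * x[i] for i in range(len(x))) for row in M]

s = 1 / math.sqrt(2)
w_plus  = [s * 1.0,  s * 0.6,  s * 0.8]   # (|0>|u> + |1>|v>)/sqrt(2)
w_minus = [s * 1.0, -s * 0.6, -s * 0.8]   # (|0>|u> - |1>|v>)/sqrt(2)

Hw_plus  = matvec(H, w_plus)    # expect +1 * w_plus
Hw_minus = matvec(H, w_minus)   # expect -1 * w_minus
```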
On the other hand, if M ≥ N then the problem is overconstrained. Let U = span{|u₁⟩, …, |u_N⟩}. The equation A|x⟩ = |b⟩ is satisfiable only if |b⟩ ∈ U. In this case, applying H^{−1} to |0⟩|b⟩ will return a valid solution. But if |b⟩ has some weight in U^⊥, then |0⟩|b⟩ will have some weight in the zero eigenspace of H, which will be flagged as ill-conditioned by our algorithm. We might choose to ignore this part, in which case the algorithm will return an |x⟩ satisfying A|x⟩ = Σ_{j=1}^N |u_j⟩⟨u_j| |b⟩.
A.5 Optimality

In this section, we explain in detail two important ways in which our algorithm is optimal up to polynomial factors. First, no classical algorithm can perform the same matrix inversion task; and second, our dependence on condition number and accuracy cannot be substantially improved.

We present two versions of our lower bounds; one based on complexity theory, and one based on oracles. We say that an algorithm solves matrix inversion if its input and output are

1. Input: An O(1)-sparse matrix A specified either via an oracle or via a poly(log(N))-time algorithm that returns the nonzero elements in a row.

2. Output: A bit that equals one with probability ⟨x|M|x⟩ ± ε, where M = |0⟩⟨0| ⊗ I_{N/2} corresponds to measuring the first qubit and |x⟩ is a normalized state proportional to A^{−1}|b⟩ for |b⟩ = |0⟩.

Further we demand that A is Hermitian and κ^{−1}I ≤ A ≤ I. We take ε to be a fixed constant, such as 1/100, and later deal with the dependency on ε. If the algorithm works when A is specified by an oracle, we say that it is relativizing. Even though this is a very weak definition of inverting matrices, this task is still hard for classical computers.
Theorem A.4.

1. If a quantum algorithm exists for matrix inversion running in time κ^{1−δ}·poly log(N) for some δ > 0, then BQP=PSPACE.

2. No relativizing quantum algorithm can run in time κ^{1−δ}·poly log(N).

3. If a classical algorithm exists for matrix inversion running in time poly(κ, log(N)), then BPP=BQP.
Given an n-qubit T-gate quantum computation, define U as in (4). Define

$$ A = \begin{pmatrix} 0 & I - Ue^{-\frac1T} \\ I - U^\dagger e^{-\frac1T} & 0 \end{pmatrix}. \tag{41} $$
Note that A is Hermitian, has condition number κ ≤ 2T and dimension N = 6T·2^n. Solving the matrix inversion problem corresponding to A produces an approximation of the quantum computation corresponding to applying U₁, …, U_T, assuming we are allowed to make any two-outcome measurement on the output state |x⟩. Recall that

$$ \left(I - Ue^{-\frac1T}\right)^{-1} = \sum_{k\ge 0} U^k e^{-k/T}. \tag{42} $$
We define a measurement M₀, which outputs zero if the time register t is between T+1 and 2T, and the original measurement's output was one. As Pr(T+1 ≤ k ≤ 2T) = e^{−2}/(1 + e^{−2} + e^{−4}) and is independent of the result of the measurement M, we can estimate the expectation of M with accuracy ε by iterating this procedure O(1/ε²) times.
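Both quantities used here, the geometric series (42) and the value of Pr(T+1 ≤ k ≤ 2T), can be checked numerically. The sketch below uses a scalar "unitary" U = e^{iθ} (an illustrative stand-in for the circuit unitary) so that (I − Ue^{−1/T})^{−1} becomes an ordinary geometric series:

```python
import cmath, math

T = 20                            # illustrative value
U = cmath.exp(0.7j)               # scalar unitary stand-in, |U| = 1
ratio = U * cmath.exp(-1.0 / T)   # |ratio| = e^{-1/T} < 1, so the series converges

# Truncated geometric series vs. the closed form (I - U e^{-1/T})^{-1}
series = sum(ratio ** k for k in range(2000))
exact = 1.0 / (1.0 - ratio)

# Weight of the geometric distribution e^{-2k/T} on T+1 <= k <= 2T
p = math.exp(-2) / (1 + math.exp(-2) + math.exp(-4))   # >= 1/10
```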
In order to perform the simulation when measuring only the first qubit, define

$$ B = \begin{pmatrix} I_{6T2^n} & 0 \\ 0 & I_{3T2^n} - Ue^{-\frac1T} \end{pmatrix}. \tag{43} $$

We now define B̃ to be the matrix B, after we have permuted the rows and columns such that if

$$ C = \begin{pmatrix} 0 & \tilde B \\ \tilde B^\dagger & 0 \end{pmatrix} \tag{44} $$

and Cy = (b; 0), then measuring the first qubit of |y⟩ would correspond to performing M₀ on |x⟩. The condition number of C is equal to that of A, but the dimension is now N = 18T·2^n.
Now suppose we could solve matrix inversion in time κ^{1−δ}(log(N)/ε)^{c₁} for constants c₁ ≥ 2, δ > 0. Given a computation with T ≤ 2^{2n}/18, let m = (2/δ)·log(2n)/log(log(n)) and ε = 1/100m. For sufficiently large n, ε ≥ 1/log(n). Then

$$ \kappa^{1-\delta}\left(\frac{\log(N)}{\epsilon}\right)^{c_1} \le (2T)^{1-\delta}\left(\frac{3n}{\epsilon}\right)^{c_1} \le T^{1-\delta} c_2\,(n\log(n))^{c_1}, $$

where c₂ = 2^{1−δ}·3^{c₁} is another constant.
We now have a recipe for simulating an n_i-qubit T_i-gate computation with n_{i+1} = n_i + log(18T_i) qubits, T_{i+1} = T_i^{1−δ} c₃ (n_i log(n_i))^{c₁} gates and error ε. Our strategy is to start with an n₀-qubit T₀-gate computation and iterate this simulation ≤ m times, ending with an n_ℓ-qubit T_ℓ-gate computation with error ≤ mε ≤ 1/100. We stop iterating either after m steps, or whenever T_{i+1} > T_i^{1−δ/2}, whichever comes first. In the latter case, we set ℓ equal to the first i for which T_{i+1} > T_i^{1−δ/2}.
In the case where we iterated the reduction m times, we have T_i ≤ T₀^{(1−δ/2)^i} ≤ 2^{(1−δ/2)^i·2n₀}, implying that T_m ≤ n₀. On the other hand, suppose we stop for some ℓ < m. For each i < ℓ we have T_{i+1} ≤ T_i^{1−δ/2}. Thus T_i ≤ 2^{(1−δ/2)^i·2n₀} for each i ≤ ℓ. This allows us to bound n_i = n₀ + Σ_{j=0}^{i−1} log(18T_j) ≤ n₀ + 2n₀ Σ_{j=0}^{i−1}(1 − δ/2)^j + i·log(18) ≤ (4/δ + 1)n₀ + m·log(18). Defining yet another constant, this implies that T_{i+1} ≤ T_i^{1−δ} c₄ (n₀ log(n₀))^{c₁}. Combining this with our stopping condition T_{ℓ+1} > T_ℓ^{1−δ/2}, we find that

$$ T_\ell \le \big(c_4\,(n_0 \log(n_0))^{c_1}\big)^{\frac{2}{\delta}} = \mathrm{poly}(n_0). $$
Therefore, the runtime of the procedure is polynomial in n₀ regardless of the reason we stopped iterating the procedure. The number of qubits used increases only linearly.
Recall that the TQBF (totally quantified Boolean formula satisfiability) problem is PSPACE-complete, meaning that any k-bit problem instance for any language in PSPACE can be reduced to a TQBF problem of length n = poly(k) (see [29] for more information). The formula can be solved in time T ≤ 2^{2n}/18 by exhaustive enumeration over the variables. Thus a PSPACE computation can be solved in quantum polynomial time. This proves the first part of the theorem.
To incorporate oracles, note that our construction of U in (4) could simply replace some of the U_i's with oracle queries. This preserves sparsity, although we need the rows of A to now be specified by oracle queries. We can now iterate the speedup in exactly the same manner. However, we conclude with the ability to solve the OR problem on 2^n inputs in poly(n) time and queries. This, of course, is impossible [30], and so the purported relativizing quantum algorithm must also be impossible.
The proof of part 3 of Theorem A.4 simply formulates a poly(n)-time, n-qubit quantum computation as a κ = poly(n), N = 2^n·poly(n) matrix inversion problem and applies the classical algorithm which we have assumed exists.
Theorem A.4 established the universality of the matrix inversion algorithm. To extend the simulation to problems which are not decision problems, note that the algorithm actually supplies us with |x⟩ (up to some accuracy). For example, instead of measuring an observable M, we can measure |x⟩ in the computational basis, obtaining the result i with probability |⟨i|x⟩|². This gives a way to simulate quantum computation by classical matrix inversion algorithms. In turn, this can be used to prove lower bounds on classical matrix inversion algorithms, where we assume that the classical algorithms output samples according to this distribution.
Theorem A.5. No relativizing classical matrix inversion algorithm can run in time N^α 2^{βκ} unless 3α + 4β ≥ 1/2.

If we consider matrix inversion algorithms that work only on positive definite matrices, then the N^α 2^{βκ} bound becomes N^α 2^{β√κ}.
Proof. Recall Simon’s problem [21], in which we are given f : Z
n
2
→ ¦0, 1¦
2n
such that f(x) = f(y) iﬀ x + y = a for some a ∈ Z
n
2
that we would like to
ﬁnd. It can be solved by running a 3nqubit 2n +1gate quantum computation
O(n) times and performing a poly(n) classical computation. The randomized
classical lower bound is Ω(2
n/2
) from birthday arguments.
Converting Simon’s algorithm to a matrix A yields κ ≈ 4n and N ≈ 36n2
3n
.
The runtime is N
α
2
βκ
≈ 2
(3α+4β)n
poly(n). To avoid violating the oracle
lower bound, we must have 3α + 4β ≥ 1/2, as required.
Next, we argue that the accuracy of the algorithm cannot be substantially improved. Returning now to the problem of estimating ⟨x|M|x⟩, we recall that classical algorithms can approximate this to accuracy ε in time O(Nκ·poly(log(1/ε))). This poly(log(1/ε)) dependence arises because, when the vectors |b⟩ and |x⟩ are written as bit strings, adding one additional bit doubles the accuracy. However, sampling-based algorithms such as ours cannot hope for a better than poly(1/ε) dependence of the runtime on the error. Thus proving that our algorithm's error performance cannot be improved requires a slight redefinition of the problem.

Define the matrix inversion estimation problem as follows. Given A, b, M, ε, κ, s with ‖A‖ ≤ 1, ‖A^{−1}‖ ≤ κ, A s-sparse and efficiently row-computable, |b⟩ = |0⟩ and M = |0⟩⟨0| ⊗ I_{N/2}: output a number that is within ε of ⟨x|M|x⟩ with probability ≥ 2/3, where |x⟩ is the unit vector proportional to A^{−1}|b⟩.
The algorithm presented in our paper can be used to solve this problem with a small amount of overhead. By producing |x⟩ up to trace distance ε/2 in time Õ(log(N)κ²s²/ε), we can obtain a sample of a bit which equals one with probability μ with |μ − ⟨x|M|x⟩| ≤ ε/2. Since the variance of this bit is ≤ 1/4, taking O(1/ε²) samples gives us a ≥ 2/3 probability of obtaining an estimate within ε/2 of μ. Thus quantum computers can solve the matrix inversion estimation problem in time Õ(log(N)κ²s²/ε³).
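The sampling step can be simulated classically. The sketch below (with hypothetical μ and ε of our own choosing) averages O(1/ε²) Bernoulli samples, matching the Chebyshev argument above:

```python
import random

random.seed(7)                       # fixed seed for a reproducible run
mu, eps = 0.3, 0.05                  # hypothetical bias and target accuracy
n = round(3.0 / eps ** 2)            # O(1/eps^2) samples suffice by Chebyshev
est = sum(1 if random.random() < mu else 0 for _ in range(n)) / n
```

With variance ≤ 1/4 per bit, the empirical mean of n = 3/ε² samples deviates from μ by more than ε/2 with probability at most 1/3.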
We can now show that the error dependence of our algorithm cannot be substantially improved.

Theorem A.6.

1. If a quantum algorithm exists for the matrix inversion estimation problem running in time poly(κ, log(N), log(1/ε)) then BQP=PP.

2. No relativizing quantum algorithm for the matrix inversion estimation problem can run in time N^α·poly(κ)/ε^β unless α + β ≥ 1.
Proof.

1. A complete problem for the class PP is to count the number of satisfying assignments to a SAT formula. Given such a formula φ, a quantum circuit can apply it on a superposition of all 2^n assignments of the variables, generating the state

$$ 2^{-n/2} \sum_{z_1,\dots,z_n \in \{0,1\}} |z_1,\dots,z_n\rangle\,|\varphi(z_1,\dots,z_n)\rangle. $$

The probability of obtaining 1 when measuring the last qubit is equal to the number of satisfying truth assignments divided by 2^n. A matrix inversion estimation procedure which runs in time poly log(1/ε) would enable us to estimate this probability to accuracy 2^{−2n} in time poly(log(2^{2n})) = poly(n). This would imply that BQP = PP, as required.

2. Now assume that φ(z) is provided by the output of an oracle. Let C denote the number of z ∈ {0, 1}^n such that φ(z) = 1. From [22], we know that determining the parity of C requires Ω(2^n) queries to φ. However, exactly determining C reduces to the matrix inversion estimation problem with N = 2^n, κ = O(n²) and ε = 2^{−n−2}. By assumption we can solve this in time 2^{(α+β)n}·poly(n), implying that α + β ≥ 1. ∎
quantum computer can approximate the value of such a function in time which is polylogarithmic in N, an exponential speedup over the best known classical algorithms. In fact, under standard complexity-theoretic assumptions, we prove that in performing this task any classical algorithm must be exponentially slower than the algorithm presented here. Moreover, we show that our algorithm is almost the optimal quantum algorithm for the task.

We begin by presenting the main ideas behind the construction. Then we give an informal description of the algorithm, making many simplifying assumptions. Finally we present generalizations and extensions. The full proofs appear in the supporting online material [3].

Assume we are given the equation Ax = b, where b has N entries. The algorithm works by mapping b to a quantum state |b⟩ and by mapping A to a suitable quantum operator. For example, A could represent a discretized differential operator which is mapped to a Hermitian matrix with efficiently computable entries, and b could be the ground state of a physical system, or the output of some other quantum computation. Alternatively, the entries of A and b could represent classical data stored in memory. The key requirement here, as in all quantum information theory, is the ability to perform actions in superposition (also called "quantum parallelism").

We present an informal discussion of superposition, and its meaning in this context. Suppose that an algorithm (which we can take to be reversible without loss of generality) exists to map input (x, 0) to output (x, f(x)). Quantum mechanics predicts that, given a superposition of (x, 0) and (y, 0), evaluating this function on a quantum computer will produce a superposition of (x, f(x)) and (y, f(y)), while requiring no extra time to execute. One can view accessing a classical memory cell as applying a function whose input is the address of the cell and whose output is the contents of this cell.
We require that we can access this function in superposition. In the following paragraphs we assume that A is sparse and Hermitian. Both assumptions can be relaxed, but this complicates the presentation. We also ignore some normalization issues (which are treated in the supplementary material). The exponential speedup is attained when the condition number of A is polylogarithmic in N, and the required accuracy is 1/poly log(N).

The algorithm maps the N entries of b onto the log₂ N qubits required to represent the state |b⟩. When A is sparse, the transformation e^{iAt}|b⟩ can be implemented efficiently. This ability to exponentiate A translates, via the well-known technique of phase estimation, into the ability to decompose |b⟩ in the eigenbasis of A and to find the corresponding eigenvalues λ_j. Informally, the state of the system after this stage is close to Σ_j β_j|u_j⟩|λ_j⟩, where |u_j⟩ is the eigenvector basis of A and |b⟩ = Σ_j β_j|u_j⟩. As the eigenvalue corresponding to each eigenvector is entangled with it, one can hope to apply an operation which would take Σ_j β_j|u_j⟩|λ_j⟩ to Σ_j λ_j^{−1}β_j|u_j⟩|λ_j⟩. However, this is not a unitary operation, and therefore performing it requires a non-unitary step, followed by a successful measurement. This allows us to extract the state |x⟩ = A^{−1}|b⟩. The total number of resources required to perform these transformations scales polylogarithmically with N.
This procedure yields a quantum-mechanical representation |x⟩ of the desired vector x. Clearly, to read out all the components of x would require one to perform the procedure at least N times. However, matters are different when one is interested not in x itself but in some expectation value x^T M x, where M is some linear operator (our procedure also accommodates nonlinear operators as described below). By mapping M to a quantum-mechanical operator and performing the quantum measurement corresponding to M, we obtain an estimate of the expectation value ⟨x|M|x⟩ = x^T M x, as desired. A wide variety of features of the vector x can be extracted in this way, including normalization, weights in different parts of the state space, moments, etc.

A simple example where the algorithm can be used is to see if two different stochastic processes have similar stable states [4]. Consider a stochastic process x_t = A x_{t−1} + b, where the i'th coordinate in the vector x_t represents the abundance of item i in time t. The stable state of this distribution is given by x = (I − A)^{−1} b. Let x̃_t = Ã x̃_{t−1} + b̃ and x̃ = (I − Ã)^{−1} b̃. To know if x and x̃ are similar, we perform the SWAP test between them [5]. We note that classically finding out if two probability distributions are similar requires at least O(√N) samples [6].

Estimating expectation values on solutions of systems of linear equations is quite powerful. One can apply similar ideas to know if different pictures are similar, or to identify what is the relation between two pictures. In general, different problems require us to extract different features, and it is an important question to identify what are the important features to extract.

An important factor in the performance of the matrix inversion algorithm is κ, the condition number of A, or the ratio between A's largest and smallest eigenvalues. As the condition number grows, A becomes closer to a matrix which cannot be inverted, and the solutions become less stable. Such a matrix is said to be "ill-conditioned." Our algorithms will generally assume that the singular values of A lie between 1/κ and 1; equivalently κ^{−2}I ≤ A†A ≤ I. In this case, we will achieve a runtime proportional to κ² log N. The runtime also scales as 1/ε if we allow an additive error of ε in the output state |x⟩. Therefore, if κ and 1/ε are both poly log(N), the runtime will also be poly log(N), and our quantum algorithm is exponentially faster than any classical method. We also present a technique to handle ill-conditioned matrices.

Previous papers utilized quantum computers to perform linear algebraic operations in a limited setting [13]. Our work was extended by [14] to solving nonlinear differential equations. Matrix inversion has the advantage of being a natural problem that is not obviously related to quantum mechanics. We show that it is universal for quantum computation: anything that a quantum computer can do can be written as a set of linear equations such that the result of the computation is encoded in some expectation value of the solution of the system. Thus, matrix inversion can be thought of as an alternate paradigm for quantum computing, along with [7, 8, 9, 10, 11, 12]. We use the universality result to show that our algorithm is almost optimal and that classical algorithms cannot match its performance.
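The stochastic-process example can be sanity-checked classically. The sketch below (with a small illustrative A and b of our own choosing) iterates x_t = Ax_{t−1} + b and compares the result against the stable state (I − A)^{−1}b:

```python
# x_t = A x_{t-1} + b converges (spectral radius of A < 1 here) to the
# stable state x = (I - A)^{-1} b.  Illustrative 2x2 example.
A = [[0.5, 0.2],
     [0.1, 0.4]]
b = [1.0, 2.0]

x = [0.0, 0.0]
for _ in range(200):              # iterate the process to convergence
    x = [sum(A[i][j] * x[j] for j in range(2)) + b[i] for i in range(2)]

# Closed-form stable state, solving (I - A) x = b by hand:
exact = [1.0 / 0.28, 1.1 / 0.28]
```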
For example. If A is not Hermitian. we can relabel our k register to obtain N T −1 ˜ αkj βj λk uj j=1 k=0 ˜ Adding an ancilla qubit and rotating conditioned on λk N T −1 yields . This is possible (for example) if A is ssparse and eﬃciently row computable. and αkj  is large if and only if λj ≈ ˜ Deﬁning λk := 2πk/t0 . ˜ where the O suppresses more slowlygrowing terms (included in the supporting material [3]). j=1 k=0 (3) 2πk t0 . [15] shows how to simulate eiAt in time ˜ O(log(N )s2 t). meaning it has at most s nonzero entries per row and given a row index these entries can be computed in time O(s). the rest of the paper assumes that A is Hermitian. where k are the Fourier basis states. The next step is to decompose b in the eigenvector basis. [16] to prepare b . using phase estimation [17. We also need an eﬃcient procedure to prepare b . Ref. Let As C is Hermitian. ˜ αkj βj λk uj j=1 k=0 1− C2 C 1 0 + ˜ ˜ λ2 λk k 4 . The coeﬃcients of Ψ0 are chosen (following [18]) to minimize a certain quadratic loss function which appears in our error analysis (see supplementary material for details). we can solve the equation Cy = Ψ0 := 2 T T −1 sin τ =0 1 π(τ + 2 ) τ T (2) for some large T . 18]. and by λj the corresponding eigenvalues. we want to transform a given Hermitian matrix A into a unitary operator eiAt which we can apply at will. Fourier transforming the ﬁrst register gives the state N T −1 αkj βj k uj . if bi and i2 2 i=i1 bi  are eﬃciently computable then we can use the procedure of Ref. T −1 Next we apply the conditional Hamiltonian evolution τ =0 τ τ C ⊗eiAτ t0 /T C on Ψ0 ⊗ b . First. x 0 Applying this reduction if necessary.We now give a more detailed explanation of the algorithm. Under these assumptions. Denote by uj the eigenvectors of eiAt . where t0 = O(κ/ ). deﬁne C= 0 A† A 0 (1) 0 b to obtain y = .
we would have αkj = 1 if λk = λj . We show that a quantum circuit using n qubits and T gates can be simulated by inverting an O(1)sparse matrix A of dimension N = O(2n κ). such as x† M x. we have the state N 1 C βj uj N 2 /λ 2 λj j j=1 C j=1 which corresponds to x = j=1 βj λ−1 uj up to normalization. Performing the phase estimation is done by simulating eiAt . If λ ≥ 1/κ taking t0 = O(κ/ ) induces a ﬁnal error of . The condition number κ is O(T 2 ) if we need A to be positive deﬁnite or O(T ) if not. We show that the answer to both questions is negative. the runtime is ˜ O log(N )s2 κ2 / By contrast. We present an informal description of the sources of error. Assuming this for now. Using amplitude ampliﬁcation [19]. one of the best generalpurpose classical matrix inversion algorithms is the conjugate gradient method [20].) An important question is whether classical methods can be improved when only a summary statistic of the solution. We now undo the phase estimation to uncompute the λk . ˜ If the phase estimation were perfect. we ﬁnd that O(κ) repetitions are suﬃcient. (If A is not positive deﬁnite. Another question is whether our quantum algorithm could be improved. is required. Our strategy is to prove that the ability to invert matrices (with the right choice of parameters) can be used to simulate a general quantum computation. the exact error analysis and runtime considerations are presented in [3]. and 0 otherwise. κ. This implies that a classical poly(log N. Putting this all together. we obtain N βj uj j=1 1− C2 C 2 0 + λ 1 λj j To ﬁnish the inversion we measure the last qubit. Since C = O(1/κ) and λ ≤ 1. O(κ log(1/ )) multiplications are required. when A is positive deﬁnite. √ uses O( κ log(1/ )) matrixvector multiplications each taking time O(N s) for a √ total runtime of O(N s κ log(1/ )). using an argument from complexity theory. which translates into a relative error of O(1/λt0 ) in λ−1 . for a total time of O(N sκ log(1/ )). Assuming that A is ssparse. 
We can determine the normalization constant from the probability of obtaining 1. Finally, we make a measurement M whose expectation value ⟨x|M|x⟩ corresponds to the feature of x that we wish to evaluate.

We now present an informal description of the sources of error; the exact error analysis and runtime considerations are presented in [3]. Performing the phase estimation is done by simulating e^{iAt}. Assuming that A is s-sparse, this can be done with negligible error in time nearly linear in t and quadratic in s. The dominant source of error is phase estimation. This step errs by O(1/t_0) in estimating λ, which translates into a relative error of O(1/λt_0) in λ^{−1}. If λ ≥ 1/κ, taking t_0 = O(κ/ε) induces a final error of ε. Finally, we consider the success probability of the postselection process. Since C = O(1/κ) and λ ≤ 1, this probability is at least Ω(1/κ²). Using amplitude amplification [19], we find that O(κ) repetitions are sufficient. Putting this all together, the runtime is Õ(log(N) s² κ² / ε).

By contrast, one of the best general-purpose classical matrix inversion algorithms is the conjugate gradient method [20], which, when A is positive definite, uses O(√κ log(1/ε)) matrix-vector multiplications, each taking time O(Ns), for a total runtime of O(Ns√κ log(1/ε)). (If A is not positive definite, O(κ log(1/ε)) multiplications are required, for a total time of O(Nsκ log(1/ε)).) An important question is whether classical methods can be improved when only a summary statistic of the solution, such as x†Mx, is required. Another question is whether our quantum algorithm could be improved, say to achieve error ε in time proportional to poly log(1/ε). We show that the answer to both questions is negative, using an argument from complexity theory. Our strategy is to prove that the ability to invert matrices (with the right choice of parameters) can be used to simulate a general quantum computation. We show that a quantum circuit using n qubits and T gates can be simulated by inverting an O(1)-sparse matrix A of dimension N = O(2^n κ). The condition number κ is O(T²) if we need A to be positive definite, or O(T) if not. This implies that a classical poly(log N, κ, 1/ε)-time algorithm would be able to simulate a poly(n)-gate quantum algorithm in poly(n) time. Such a simulation is strongly conjectured to be false, and is known to be impossible in the presence of oracles [21].
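For concreteness, the classical baseline can be sketched as follows: a textbook conjugate gradient loop (our own minimal implementation, not taken from [20]), whose only access to A is through matrix-vector products, each costing O(Ns) for an s-sparse matrix.

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    # Solve A x = b for symmetric positive definite A; converges in
    # O(sqrt(kappa) log(1/eps)) iterations, one mat-vec per iteration.
    n = len(b)
    if max_iter is None:
        max_iter = 10 * n
    x = np.zeros(n)
    r = b.copy()                  # residual b - A x, starting from x = 0
    p = r.copy()                  # search direction
    rs = float(r @ r)
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs / float(p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = float(r @ r)
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x
```

Note that even reading out all N entries of the returned vector already takes Ω(N) time, which is the gap the quantum algorithm exploits when only ⟨x|M|x⟩ is needed.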
We now present the key reduction from simulating a quantum circuit to matrix inversion. Let C be a quantum circuit acting on n = log N qubits which applies T two-qubit gates U_1, ..., U_T. The initial state is |0⟩^{⊗n}, and the answer is determined by measuring the first qubit of the final state. Now adjoin an ancilla register of dimension 3T and define a unitary

  U = Σ_{t=1}^{T} |t+1⟩⟨t| ⊗ U_t + Σ_{t=T+1}^{2T} |t+1⟩⟨t| ⊗ I + Σ_{t=2T+1}^{3T} |(t+1) mod 3T⟩⟨t| ⊗ U†_{3T+1−t}.   (4)

We have chosen U so that for T+1 ≤ t ≤ 2T, applying U^t to |1⟩|ψ⟩ yields |t+1⟩ ⊗ U_T···U_1|ψ⟩. If we now define A = I − U e^{−1/T}, then κ(A) = O(T), and we can expand

  A^{−1} = Σ_{k≥0} U^k e^{−k/T}.   (5)

This can be interpreted as applying U^t for t a geometrically distributed random variable. Since U^{3T} = I, we can assume 1 ≤ t ≤ 3T. If we measure the first register and obtain T+1 ≤ t ≤ 2T (which occurs with probability e^{−2}/(1 + e^{−2} + e^{−4}) ≥ 1/10), then we are left with the second register in the state U_T···U_1|ψ⟩, corresponding to a successful computation. Sampling from |x⟩ allows us to sample from the results of the computation. This establishes that matrix inversion is BQP-complete, and proves our above claims about the difficulty of improving our algorithm: if the runtime could be made polylogarithmic in κ, then any problem solvable on n qubits could be solved in poly(n) time (i.e., BQP = PSPACE), a highly implausible result [22]. Even improving our κ-dependence to κ^{1−δ} for δ > 0 would allow any time-T quantum algorithm to be simulated in time o(T); iterating this would again imply that BQP = PSPACE. Similarly, improving the error dependence to poly log(1/ε) would imply that BQP includes PP, and even minor improvements would contradict oracle lower bounds [22].

We now discuss ways to extend our algorithm and relax the assumptions we made while presenting it. First,
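The clock construction in Eq. (4) is easy to check numerically on a toy instance. In this sketch (our own code; the particular gates are arbitrary choices) we build U for three single-qubit gates and verify both that U^{3T} = I and that clock values T+1 ≤ t ≤ 2T carry the finished computation U_T···U_1|ψ⟩.

```python
import numpy as np

# Toy instance: n = 1 qubit, T = 3 gates (arbitrary illustrative choices).
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
S = np.diag([1, 1j])
X = np.array([[0, 1], [1, 0]])
gates = [H, S, X]                        # U_1, U_2, U_3
T = len(gates)
d = 2                                    # dimension of the computational register

U = np.zeros((3 * T * d, 3 * T * d), dtype=complex)

def block(r, c, G):
    # Place gate G in the (r, c) clock block of U.
    U[r * d:(r + 1) * d, c * d:(c + 1) * d] = G

for t in range(1, 3 * T + 1):            # clock values 1..3T, stored 0-indexed
    src, dst = t - 1, t % (3 * T)
    if t <= T:
        block(dst, src, gates[t - 1])    # forward phase: apply U_t
    elif t <= 2 * T:
        block(dst, src, np.eye(d))       # idle phase: identity
    else:
        block(dst, src, gates[3 * T - t].conj().T)  # unwind: apply U_{3T+1-t}^dagger
```

Because U permutes the clock while applying unitary blocks, it is unitary, and one full cycle of 3T steps applies the gates, idles, then undoes them, giving U^{3T} = I as used in the geometric-series argument.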
we show how a broader class of matrices can be inverted. Certain non-sparse A can also be simulated and therefore inverted; see [23] for a list of examples. It is also possible to invert non-square matrices, using the reduction presented from the non-Hermitian case to the Hermitian one. The matrix inversion algorithm can also handle ill-conditioned matrices by inverting only the part of |b⟩ which is in the well-conditioned part of the matrix. Formally, instead of transforming |b⟩ = Σ_j β_j |u_j⟩ to |x⟩ = Σ_j λ_j^{−1} β_j |u_j⟩, we transform it to a state which is close to

  Σ_{j: λ_j ≥ 1/κ} λ_j^{−1} β_j |u_j⟩|well⟩ + Σ_{j: λ_j < 1/κ} β_j |u_j⟩|ill⟩

in time proportional to κ² for any chosen κ (i.e., not necessarily the true condition number of A).
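The well/ill split has a simple classical analogue (illustrative code; `filtered_inverse` and its conventions are ours, not the paper's): eigencomponents with λ ≥ 1/κ are inverted, while the remaining weight is passed through untouched as the part the algorithm would flag |ill⟩.

```python
import numpy as np

def filtered_inverse(A, b, kappa):
    # Invert only the well-conditioned eigenspaces (|lambda| >= 1/kappa);
    # weight on smaller eigenvalues is returned separately, mimicking the
    # |ill> flag rather than dividing by a near-zero eigenvalue.
    lam, U = np.linalg.eigh(A)
    beta = U.T @ b
    well = np.abs(lam) >= 1.0 / kappa
    safe_lam = np.where(well, lam, 1.0)        # placeholder avoids division by ~0
    x_well = U @ np.where(well, beta / safe_lam, 0.0)
    b_ill = U @ np.where(well, 0.0, beta)
    return x_well, b_ill

rng = np.random.default_rng(4)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
A = Q @ np.diag([1e-9, 0.5, 1.0]) @ Q.T        # one nearly singular direction
b = rng.standard_normal(3)
x_well, b_ill = filtered_inverse(A, b, kappa=100.0)
```

This is useful exactly as described above: when A is not invertible, `x_well` is a solution for the projection of b onto the well-conditioned part, and the norm of `b_ill` estimates the flagged weight.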
Finally. it can compute f (A) b for any computable f . M represent data.ik . The last qubit is a ﬂag which enables the user to estimate what the size of the illconditioned part. e. The algorithm allows considerable ﬂexibility in how the matrix to be inverted A. [16].jk i1 . If the entries of A...... they can be stored in quantum memory: a large quantum memory (O(N s) slots) is needed. Thus.. Alternatively. ik i1 . .j1 ..jk ⊗k j1 . our algorithm could potentially run much faster if a suitable preconditioner is used.transform it to a state which is close to λ−1 βj uj well + j j. . the quantum processor that performs the algorithm itself 7 . and the vector b are represented. the components of b could either be computed or stored in quantum memory in a form that allows the construction of b . If we have a method of generating a preconditioner matrix B such that κ(AB) is smaller than κ(A). where given A and b we sample from the vector x. if A or M represents some computable transformation. . . 14]. . j k  on the state x . e. . We can estimate degree2k polynomials in the entries of x by generating k copies of x and measuring the nkqubit observable Mi1 .. Similarly.. then AB is as well. as long as a state proportional to B b can be eﬃciently prepared. a matrix inversion problem with a betterconditioned matrix. the measurement matrix M . Perhaps the most farreaching generalization of the matrix inversion algorithm is not to invert matrices at all! Instead. then we can solve Ax = b by instead solving (AB)c = B b.. not necessarily the true condition number of A).. . one can use our algorithm to generate a quantum analogue of MonteCarlo.ik . we consider physical aspects regarding the implementation and performance of the quantum matrix inversion algorithm. No matter how the inputs are represented. a discretized diﬀerential operator.. if A and B are both sparse.λj ≥1/κ βj uj ill in time proportional to κ2 for any chosen κ (i..λj <1/κ j.g. Alternatively.. 
Depending on the degree of nonlinearity of f, nontrivial trade-offs between accuracy and efficiency arise. Some variants of this idea are considered in [25, 14].

Finally, we consider physical aspects regarding the implementation and performance of the quantum matrix inversion algorithm. No matter how the inputs are represented, the quantum processor that performs the algorithm itself
is exponentially smaller than the matrix to be inverted: a quantum computer with under one hundred qubits suffices to invert a matrix with Avogadro's number of entries. Similarly, the exponential speedup of the algorithm allows it to be performed with a relatively small number of quantum operations, thereby reducing the overhead required for quantum error correction.

Acknowledgements. We thank the W. M. Keck Foundation for support, and AWH thanks them as well as MIT for hospitality while this work was carried out. AWH was also funded by the U.K. EPSRC grant "QIP IRC". SL thanks R. Zecchina for encouraging him to work on this problem. We are grateful as well to S. Gharibian and D. Spielman for helpful discussions; P. Parrilo supplied useful applications for this algorithm.

References

[1] P. W. Shor. Algorithms for quantum computation: discrete logarithms and factoring. In S. Goldwasser, editor, Proceedings: 35th Annual Symposium on Foundations of Computer Science, pages 124–134. IEEE Computer Society Press, 1994.
[2] S. Lloyd. Universal quantum simulators. Science, 273:1073–1078, August 1996.
[3] A. W. Harrow, A. Hassidim, and S. Lloyd. Quantum algorithm for solving linear systems of equations. 2009.
[4] D. G. Luenberger. Introduction to Dynamic Systems: Theory, Models, and Applications. Wiley, New York, 1979.
[5] H. Buhrman, R. Cleve, J. Watrous, and R. de Wolf. Quantum fingerprinting. Physical Review Letters, 87(16):167902, 2001.
[6] P. Valiant. Testing symmetric properties of distributions. In STOC, pages 383–392, 2008.
[7] E. Farhi, J. Goldstone, S. Gutmann, J. Lapan, A. Lundgren, and D. Preda. A quantum adiabatic evolution algorithm applied to random instances of an NP-complete problem. Science, 292(5516):472–475, 2001.
[8] E. Knill, R. Laflamme, and G. J. Milburn. A scheme for efficient quantum computation with linear optics. Nature, 409:46–52, 2001.
[9] D. Aharonov and I. Arad. The BQP-hardness of approximating the Jones polynomial. arXiv:quant-ph/0605181, 2006.
[10] M. A. Nielsen, M. R. Dowling, M. Gu, and A. C. Doherty. Quantum computation as geometry. Science, 2006.
[11] M. H. Freedman, A. Kitaev, M. J. Larsen, and Z. Wang. Topological quantum computation. Bull. Amer. Math. Soc., 40(1):31–38, 2003.
[12] M. H. Freedman, M. J. Larsen, and Z. Wang. A modular functor which is universal for quantum computation. Comm. Math. Phys., 227(3):605–622, 2002.
[13] A. Klappenecker and M. Roetteler. Engineering functional quantum algorithms. Phys. Rev. A, 67:010302, 2003.
[14] S. Leyton and T. Osborne. A quantum algorithm to solve nonlinear differential equations. arXiv:0812.4423, 2008.
[15] D. W. Berry, G. Ahokas, R. Cleve, and B. C. Sanders. Efficient quantum algorithms for simulating sparse Hamiltonians. Comm. Math. Phys., 270(2):359–371, 2007. arXiv:quant-ph/0508139.
[16] L. Grover and T. Rudolph. Creating superpositions that correspond to efficiently integrable probability distributions. arXiv:quant-ph/0208112, 2002.
[17] R. Cleve, A. Ekert, C. Macchiavello, and M. Mosca. Quantum algorithms revisited. Proc. R. Soc. Lond. A, 1998. arXiv:quant-ph/9708016.
[18] V. Buzek, R. Derka, and S. Massar. Optimal quantum clocks. Phys. Rev. Lett., 82:2207–2210, 1999. arXiv:quant-ph/9808042.
[19] G. Brassard, P. Høyer, M. Mosca, and A. Tapp. Quantum amplitude amplification and estimation. Volume 305 of Contemporary Mathematics Series Millennium Volume. AMS, 2002. arXiv:quant-ph/0005055.
[20] J. R. Shewchuk. An introduction to the conjugate gradient method without the agonizing pain. Technical Report CMU-CS-94-125, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, March 1994.
[21] D. R. Simon. On the power of quantum computation. SIAM J. Comp., 26:116–123, 1997.
[22] E. Farhi, J. Goldstone, S. Gutmann, and M. Sipser. A limit on the speed of quantum computation in determining parity. Phys. Rev. Lett., 81:5442–5444, 1998. arXiv:quant-ph/9802045.
[23] A. M. Childs. On the relationship between continuous- and discrete-time quantum walk. arXiv:0810.0312, 2008.
[24] K. Chen. Matrix Preconditioning Techniques and Applications. Cambridge Univ. Press, Cambridge, U.K., 2005.
[25] L. Sheridan, D. Maslov, and M. Mosca. Approximating fractional time quantum evolution. arXiv:0810.3843, 2008.
ACM Press New York. we choose some κ > κ (say κ = 2κ). 1996. As inputs we require a procedure to produce the state b . pages 20–29. a method of producing the ≤ s nonzero elements of any row of A and a choice of cutoﬀ κ. Finally. Hansen. Vazirani. 2003. 1997. A Supplementary Online Material In this appendix. Our strategy will be to ﬂag the illconditioned part of the matrix (without inverting it). SIAM Journal on Computing. Introduction to the Theory of Computation. the algorithm produces x only up to some error which is given as part of the input. Rev. [30] C. In Proceedings of the thirtyﬁfth annual ACM symposium on Theory of computing (STOC). Brassard. Philadelphia. [27] D. arXiv:quantph/0301023. G. International Thomson Publishing. This is described formally in the next section. ﬂags any eigenvector with eigenvalue ≤ 1/κ as illconditioned. Quantum random access memory. and interpolates between these two behaviors when 1/κ < λ < 1/κ. SIAM.4) and from a general quantum computation to matrix inversion (Section A. Bennett. 2008. albeit possibly in a less intuitive way. 100:160501. PA. Adiabatic quantum state generation and statistical zero knowledge. 26:1510–1523. here we take the opposite approach and describe everything. and so deﬁne error in terms 10 . Bernstein.or illconditioned subspaces. E.. 1998. We present this strategy not because it is necessarily ideal in all cases. USA. Our algorithm then inverts the wellconditioned part of the matrix. and Lorenzo Maccone.[26] Vittorio Giovannetti. The condition number is a crucial parameter in the algorithm. We also describe in more detail our reductions from nonHermitian matrix inversion to Hermitian matrix inversion (Section A. We will deﬁne the wellconditioned part of A to be the span of the eigenspaces corresponding to eigenvalues ≥ 1/κ and the illconditioned part to be the rest. Here we present one possible method of handling illconditioned matrices. and U. Phys. 
While the body of the paper attempted to convey the spirit of the procedure and left out various improvements, here we take the opposite approach and describe everything, albeit possibly in a less intuitive way. We also describe in more detail our reductions from non-Hermitian matrix inversion to Hermitian matrix inversion (Section A.4) and from a general quantum computation to matrix inversion (Section A.5).

As inputs we require a procedure to produce the state |b⟩, a method of producing the ≤ s nonzero elements of any row of A, and a choice of cutoff κ. Our runtime will be roughly quadratic in κ, and our algorithm is guaranteed to be correct if ‖A‖ ≤ 1 and ‖A^{−1}‖ ≤ κ.

The condition number is a crucial parameter in the algorithm. Here we present one possible method of handling ill-conditioned matrices. We will define the well-conditioned part of A to be the span of the eigenspaces corresponding to eigenvalues ≥ 1/κ, and the ill-conditioned part to be the rest. Our strategy will be to flag the ill-conditioned part of the matrix (without inverting it), and let the user choose how to further handle this. Since we cannot exactly resolve any eigenvalue, we can only approximately determine whether vectors are in the well- or ill-conditioned subspaces. Accordingly, we choose some κ′ > κ (say κ′ = 2κ). Our algorithm then inverts the well-conditioned part of the matrix, flags any eigenvector with eigenvalue ≤ 1/κ′ as ill-conditioned, and interpolates between these two behaviors when 1/κ′ < λ < 1/κ. This is described formally in the next section. We present this strategy not because it is necessarily ideal in all cases, but because it gives a concrete illustration of the key components of our algorithm.

Finally, the algorithm produces |x⟩ only up to some error ε, which is given as part of the input. We work only with pure states, and so define error in terms
of distance between vectors, i.e., ‖ |α⟩ − |β⟩ ‖ = √( 2(1 − Re⟨α|β⟩) ). Since ancilla states are produced and then imperfectly uncomputed by the algorithm, our output state will technically have high fidelity not with |x⟩ but with |x⟩|000...⟩. In general we do not write down ancilla qubits in the |0⟩ state, so we write |x⟩ instead of |x⟩|000⟩, |b⟩ instead of |b⟩|000⟩, and so on.

A.1  Detailed description of the algorithm

To produce the input state |b⟩, we assume that there exists an efficiently implementable unitary B which, when applied to |initial⟩, produces the state |b⟩, possibly along with garbage in an ancilla register. We make no further assumption about B; it may represent another part of a larger algorithm, or a standard state-preparation procedure such as [16]. Let T_B be the number of gates required to implement B. We neglect the possibility that B errs in producing |b⟩ since, without any other way of producing or verifying the state |b⟩, we have no way to mitigate these errors. Thus, any errors in producing |b⟩ necessarily translate directly into errors in the final state |x⟩.

Next, using the reductions described in Section A.4, we can assume that A is Hermitian. One other subroutine we will need is Hamiltonian simulation. To simulate e^{iAt} for some t ≥ 0, we use the algorithm of [15]. If A is s-sparse, t ≤ t_0, and we want to guarantee that the error is ≤ ε_H, then this requires time

  T_H = O( log(N) (log*(N))² s² t_0 · 9^{√(log(s² t_0 / ε_H))} ) = Õ( log(N) s² t_0 ).   (7)

The scaling here is better than any power of 1/ε_H, which means that the additional error introduced by this step is negligible compared with the rest of the algorithm. Note that this is the only step where we require that A be sparse; there are some other types of Hamiltonians which can be simulated efficiently (e.g., [27, 15, 23]), and this broadens the set of matrices we can handle.

Next, we define the state

  |Ψ_0⟩ = √(2/T) Σ_{τ=0}^{T−1} sin( π(τ + ½)/T ) |τ⟩   (6)

for a T to be chosen later. Using [16], we can prepare |Ψ_0⟩ up to error ε_Ψ in time poly log(T/ε_Ψ).

The key subroutine of the algorithm, denoted U_invert, is defined as follows:

1. Prepare |Ψ_0⟩^C from |0⟩ up to error ε_Ψ.

2. Apply the conditional Hamiltonian evolution Σ_{τ=0}^{T−1} |τ⟩⟨τ|^C ⊗ e^{iAτ t_0/T} up to error ε_H.

3. Apply the Fourier transform to the register C. Denote the resulting basis states |k⟩ for k = 0, ...,
T − 1. Define λ̃_k := 2πk/t_0.
4. Adjoin a three-dimensional register S in the state

  |h(λ̃_k)⟩^S := √( 1 − f(λ̃_k)² − g(λ̃_k)² ) |nothing⟩^S + f(λ̃_k) |well⟩^S + g(λ̃_k) |ill⟩^S,

for functions f(λ), g(λ) defined below in (8). Here 'nothing' indicates that the desired matrix inversion hasn't taken place, 'well' indicates that it has, and 'ill' means that part of |b⟩ is in the ill-conditioned subspace of A.

5. Reverse steps 1-3, uncomputing any garbage produced along the way.

The functions f(λ), g(λ) are known as filter functions [28], and are chosen so that for some constant C > 1: f(λ) = 1/(Cκλ) for λ ≥ 1/κ; g(λ) = 1/C for λ ≤ 1/κ′ := 1/(2κ); and f(λ)² + g(λ)² ≤ 1 for all λ. Additionally, f(λ) should satisfy a certain continuity property that we will describe in the next section. Otherwise the functions are arbitrary. One possible choice is

  f(λ) = 1/(2κλ)                                        when λ ≥ 1/κ,
         (1/2) sin( (π/2) · (λ − 1/κ′)/(1/κ − 1/κ′) )    when 1/κ > λ ≥ 1/κ′,   (8a)
         0                                              when 1/κ′ > λ;

  g(λ) = 0                                              when λ ≥ 1/κ,
         (1/2) cos( (π/2) · (λ − 1/κ′)/(1/κ − 1/κ′) )    when 1/κ > λ ≥ 1/κ′,   (8b)
         1/2                                            when 1/κ′ > λ.

If U_invert is applied to |u_j⟩, it will, up to an error we will discuss below, adjoin the state |h(λ_j)⟩. Instead, if we apply U_invert to |b⟩ (i.e., a superposition of different |u_j⟩), measure S, and obtain the outcome 'well', then we will have approximately applied an operator proportional to A^{−1}.

Our main algorithm then follows the amplitude amplification procedure: we start with U_invert B|initial⟩ and repeatedly apply U_invert B R_init B† U_invert† R_succ, where

  R_succ = I^S − 2|well⟩⟨well|^S,

acting only on the S register, and

  R_init = I − 2|initial⟩⟨initial|.

Finally we measure S and stop when we obtain the result 'well'. Let p̃ (computed in the next section) denote the success probability of this measurement, which in the next section we will show is Ω(1/κ²). Rather than repeating 1/p̃ times, we will use amplitude amplification [19] to obtain the same results with O(1/√p̃) repetitions.
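A direct transcription of the filter functions (8a)-(8b), with κ′ = 2κ as above (our own code, written for checking rather than performance):

```python
import numpy as np

def f(lam, kappa):
    # Filter for the well-conditioned part: proportional to 1/lambda above
    # 1/kappa, smoothly switched off between 1/kappa' and 1/kappa (Eq. 8a).
    kp = 2 * kappa                       # kappa' = 2 * kappa
    if lam >= 1 / kappa:
        return 1 / (2 * kappa * lam)
    if lam >= 1 / kp:
        return 0.5 * np.sin(0.5 * np.pi * (lam - 1 / kp) / (1 / kappa - 1 / kp))
    return 0.0

def g(lam, kappa):
    # Flag amplitude for the ill-conditioned part (Eq. 8b).
    kp = 2 * kappa
    if lam >= 1 / kappa:
        return 0.0
    if lam >= 1 / kp:
        return 0.5 * np.cos(0.5 * np.pi * (lam - 1 / kp) / (1 / kappa - 1 / kp))
    return 0.5
```

The sin/cos interpolation makes both functions continuous at the crossover points, which is exactly the Lipschitz-type property the error analysis below relies on, and keeps f² + g² ≤ 1 so that |h(λ)⟩ is a valid state.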
The number of repetitions would ideally be π/(4√p̃). While p̃ is initially unknown, the procedure has a constant probability of success if the number of repetitions is a constant fraction of π/(4√p̃). Following [19], we repeat the entire procedure with a geometrically increasing number of repetitions each time (1, 2, 4, 8, ...), until we have reached a power
of two that is ≥ κ. This yields a constant probability of success using ≤ 4κ repetitions. Putting everything together, the runtime is Õ( κ (T_B + t_0 s² log(N)) ), where the Õ suppresses the more slowly growing terms of (log*(N))², exp(O(1/√log(t_0/ε_H))), and poly log(T/ε_Ψ). In the next section, we will show that t_0 can be taken to be O(κ/ε), so that the total runtime is Õ( κ T_B + κ² s² log(N)/ε ).

A.2  Error Analysis

In this section we show that taking t_0 = O(κ/ε) introduces an error of ≤ ε in the final state. The main subtlety in analyzing the error comes from the postselection step, in which we choose only the part of the state attached to the |well⟩ register. This can potentially magnify errors in the overall state. On the other hand, we may also be interested in the non-postselected state, which results from applying U_invert a single time to |b⟩. Somewhat surprisingly, we show that the error in both cases is upper-bounded by O(κ/t_0).

In this section, it will be convenient to ignore the error terms ε_H and ε_Ψ, as these can be made negligible with relatively little effort, and it is the errors from phase estimation that will dominate. Let Ũ denote a version of U_invert in which everything except the phase estimation is exact. Since ‖Ũ − U_invert‖ ≤ O(ε_H + ε_Ψ), it is sufficient to work with Ũ. Define U to be the ideal version of U_invert, in which there is no error in any step.

Theorem A.1 (Error bounds).

1. In the case when no postselection is performed, the error is bounded as

  ‖Ũ − U‖ ≤ O(κ/t_0).   (9)

2. If we postselect on the flag register being in the space spanned by {|well⟩, |ill⟩} and define the normalized ideal state to be |x⟩ and our actual state to be |x̃⟩, then

  ‖ |x̃⟩ − |x⟩ ‖ ≤ O(κ/t_0).   (10)

3. If |b⟩ is entirely within the well-conditioned subspace of A and we postselect on the flag register being |well⟩, then

  ‖ |x̃⟩ − |x⟩ ‖ ≤ O(κ/t_0).   (11)

The third claim is often of the most practical interest, but the other two are useful if we want to work with the ill-conditioned space, or estimate its weight. For instance, this could be used to estimate the amount of weight of |b⟩ lying in the ill-conditioned components of A.

The rest of the section is devoted to the proof of Theorem A.1. We first show that the third claim is a corollary of the second, and then prove the first two claims more or less independently. To prove (11) assuming (10), observe that if |b⟩ is entirely in the well-conditioned space, the ideal state |x⟩ is proportional
˜ Putting everything together.1. Somewhat surprisingly. ill } and deﬁne the normalized ideal state to be x and our actual state to be ˜ then x ˜ − x ≤ O(κ/t0 ). observe that if b is entirely in the wellconditioned space. This can potentially magnify errors in the overall state. We ﬁrst show that the third claim is a corollary of the second. exp(O(1/ log(t0 / and poly log(T / Ψ ). x (10) 3. we show that the error in both cases is upperbounded by O(κ/t0 ).
which has squared norm

  1/( 4κ²λ⁴ (4κ²λ² − 1) ) + 1/( 4κ²λ⁴ ),

and for λ ≥ 1/κ this is ≤ κ². When 1/κ′ < λ < 1/κ, the norm of (d/dλ)|h(λ)⟩ is

  (1/2) · (π/2) · ( 1/κ − 1/κ′ )^{−1} = πκ/2.

Thus ‖(d/dλ)|h(λ)⟩‖ ≤ πκ wherever it is defined, and this completes the proof, with c ≤ π.

Now we turn to the proof of (9). Let P̃ denote the first three steps of the algorithm. They can be thought of as mapping the initial zero qubits to a |k⟩ register, together with some garbage, as follows:

  P̃ = Σ_{j=1}^N |u_j⟩⟨u_j| ⊗ Σ_k α_{k|j} |k⟩ |garbage(j, k)⟩ ⟨initial|,

where the guarantee that the phase estimation algorithm gives us is that α_{k|j} is concentrated around λ_j ≈ 2πk/t_0 =: λ̃_k. Technically, P̃ should be completed into a unitary operator by defining some arbitrary behavior on inputs other than |initial⟩ in the last register.

Consider a test state |b⟩ = Σ_{j=1}^N β_j |u_j⟩. The ideal functionality is defined by

  |ϕ⟩ = U|b⟩ = Σ_{j=1}^N β_j |u_j⟩ |h(λ_j)⟩,

while the actual algorithm produces the state

  |ϕ̃⟩ = Ũ|b⟩ = P̃† Σ_{j=1}^N β_j |u_j⟩ Σ_k α_{k|j} |k⟩ |h(λ̃_k)⟩.
Turning now to the postselected case. There are two sources of inﬁdelity. but by careful examination of the errors.k βj αkj uj k h(λj ) . k) = βj 2 αkj 2 . where c ≤ π is a constant. For larger values of δ. Naively. 0 we use the bound αkj 2 ≤ 64π 2 /(λj t0 − 2πk)4 (proved in Section A.3) to ﬁnd an inﬁdelity contribution that is ∞ ≤2 k= λj t0 2π +1 64π 2 c2 κ2 δ 2 64π 2 c2 κ2 = 2 4 δ 2t0 t2 0 ∞ k=1 1 8π 2 c2 κ2 · 2. √ p Where in the last step we have deﬁned p := Ej [f (λj )2 + g(λj )2 ] to be the probability that the postselection succeeds. Since the initial state b was arbitrary. Thus ˜ Re ϕϕ = Ej Ek Re h(λk )h(λj ) . = 2 k2 4π 3 t0 Summarizing.while the actual algorithm produces the state N ˜ ˜ ϕ = U b = P † ˜ j=1 βj uj k ˜ αkj k h(λk ) . we observe that x := = f (A) b well + g(A) b ill b (f (A)2 + g(A)2 ) b j βj uj (f (λj ) well + g(λj ) ill ) j (12) (13) βj 2 (f (λj )2 + g(λj )2 ) (14) =: j βj uj (f (λj ) well + g(λj ) ill ) . we ﬁnd that this worstcase situation only occurs when 15 . Re h(λk )h(λj ) ≥ 1 − c2 κ2 δ 2 /2t2 . ˜ ˜ We wish to calculate ϕϕ . ˜ ˜ ˜ Let δ = λj t0 − 2πk = t0 (λj − λk ). From Lemma A. where we think of j and k as random variables with joint distribution Pr(j. the inner product is at least 1 − 2π 2 c2 κ2 /t2 . which translates into ˜ 0 ϕ − ϕ ≤ 4πcκ/t0 = 2π 2 κ/t0 . this ˜ ˜ bounds the operator distance U − U as claimed in (9). this post√ selection could magnify the errors by as much as 1/ p. we ﬁnd that Re ϕϕ ≥ 1 − 5π 2 c2 κ2 /t2 .2. 0 2 For δ ≤ 2π. or equivalently the inner product between P ϕ and ˜ ˜ ϕ = P j. This inner product is N ϕϕ = ˜ j=1 βj 2 k ˜ ˜ αkj 2 h(λk )h(λj ) := Ej Ek h(λk )h(λj ) .
the errors are small in the first place. This is what will allow us to obtain the same O(κ/t_0) error bound even in the postselected state. Our goal is to bound ‖ |x̃⟩ − |x⟩ ‖ in (10), where the actual state that we produce is

  |x̃⟩ := P̃† Σ_{j=1}^N β_j |u_j⟩ Σ_k α_{k|j} |k⟩ ( f(λ̃_k)|well⟩ + g(λ̃_k)|ill⟩ ) / √p̃,   (15), (16)

with p̃ = E_{j,k}[ f(λ̃_k)² + g(λ̃_k)² ]. Recall that j and k are random variables with joint distribution Pr(j, k) = |β_j|² |α_{k|j}|². Define λ := λ_j and λ̃ := 2πk/t_0, and abbreviate f := f(λ), f̃ := f(λ̃), g := g(λ), g̃ := g(λ̃). Here δ = t_0(λ − λ̃) depends implicitly on both j and k; note that Eδ² = O(1), and that this bound holds even when conditioning on an arbitrary value of j. Thus p = E[f² + g²] and p̃ = E[f̃² + g̃²], where p is the success probability of postselection on the ideal state.

We work instead with the fidelity

  F := ⟨x|x̃⟩ = E[ f f̃ + g g̃ ] / √(p p̃)
     = ( 1 + E[(f̃ − f)f + (g̃ − g)g]/p ) / √( 1 + (p̃ − p)/p )   (17)
     ≥ ( 1 + E[(f̃ − f)f + (g̃ − g)g]/p ) ( 1 − (1/2)(p̃ − p)/p ).   (18)

Next we expand

  p̃ − p = E[f̃² − f²] + E[g̃² − g²] = 2E[(f̃ − f)f] + 2E[(g̃ − g)g] + E[(f̃ − f)²] + E[(g̃ − g)²].

Substituting into (18), we find

  F ≥ 1 − E[(f̃ − f)² + (g̃ − g)²]/(2p) − ( |E[(f̃ − f)f + (g̃ − g)g]| / p ) · ( |p̃ − p| / (2p) ).   (19)-(22)

We now need an analogue of the Lipschitz condition given in Lemma A.2.

Lemma A.3. Let f, f̃, g, g̃ be defined as above. Then

  (f̃ − f)² + (g̃ − g)² ≤ c · (κ²/t_0²) · δ² · (f² + g²),

where c = π²/2.
Finally. suppose that λ < 1/κ. suppose λ = 1/κ (so f = 1/2) and λ < 1/κ. if λ < 1/κ < λ λ ˜ and f < f then replacing λ with 1/κ only makes the inequality tighter. First. Then f − f  = ˜ 1 λ−λ ˜ ˜ 2κ λ·λ ≤ δ/2t0 λ. Next. we have 2 ˜ 1 ˜ 1 1 λ− 1 1 ˜ 1 ˜ f − f  ≤ − 1 κ = − κ(λ − ) = κ( − λ). Now. t0 4 t2 0 And similarly g − g 2 ≤ ˜ δ2 π2 δ2 2 max g 2 = κ . First bound ˜ E[(f − f )2 + (˜ − g)2 ] g ≤O 2p κ2 t2 0 · E[(f 2 + g 2 )δ 2 ] ≤O E[f 2 + g 2 ] κ2 t2 0 (25) The ﬁrst inequality used Lemma A. Using (24) and λ > 1/κ we ﬁnd that ˜ ˜ ˜ ˜ f − f ≤ 1 − κλ < 1 − λ/λ = (λ − λ)/λ. as desired. Then δ2 π2 δ2 2 ˜ f − f 2 ≤ 2 max f 2 = κ . suppose λ ≥ 1/κ. Since sin π α ≥ α for 0 ≤ α ≤ 1. 1 2 2κ−κ 2 2 κ (24) ˜ ˜ ˜ and using λ = 1/κ we ﬁnd that f − f  = λ−λ . ˜ E[(f − f )f + (˜ − g)g] g ≤ p ≤ E ˜ (f − f )2 + (˜ − g)2 (f 2 + g 2 ) g p E δ 2 κ2 (f 2 t2 0 (26) + g 2 )2 (27) (28) p κ t0 .˜ Proof.3 and the second used the fact that E[δ 2 ] ≤ O(1) even when conditioned on an arbitrary value of j (or equivalently λj ). and we need to show that ˜ κδf  λ − λ ˜ f − f  ≤ 2 = (23) t0 λ ˜ ˜ To prove this. as desired. ≤O 17 . Now we use Lemma A. we consider four cases. Remember that f = f (λ − δ/t0 ) and similarly for g . implying the result. ˜ ˜ suppose λ < 1/κ < λ and f < f .3 to bound the two error contributions in (18). ˜ Consider the case ﬁrst when λ ≥ 1/κ. t2 4 t2 0 0 Finally f (λ)2 + g(λ)2 = 1/2 for any λ ≤ 1/κ. Next. In this case g = 0. Next.
Now we use Lemma A.3 to bound the two error contributions in (18). First,

  E[(f̃ − f)² + (g̃ − g)²] / (2p) ≤ O(κ²/t_0²) · E[(f² + g²)δ²] / E[f² + g²] ≤ O(κ²/t_0²),   (25)

where the first inequality used Lemma A.3 and the second used the fact that E[δ²] = O(1) even when conditioned on an arbitrary value of j (or equivalently λ_j). Next,

  |E[(f̃ − f)f + (g̃ − g)g]| / p ≤ E[ √((f̃ − f)² + (g̃ − g)²) · √(f² + g²) ] / p ≤ √c (κ/t_0) E[ |δ| (f² + g²) ] / p ≤ O(κ/t_0),   (26)-(28)

where the first inequality is Cauchy-Schwarz, the second is Lemma A.3, and the last uses E[|δ|] ≤ √(E[δ²]) = O(1) even when conditioned on j. Substituting these into the expansion of p̃ − p (and assuming κ ≤ t_0), we find

  |p̃ − p| / p ≤ O(κ/t_0).   (29)

Substituting (25), (28) and (29) into (22), we find Re⟨x|x̃⟩ ≥ 1 − O(κ²/t_0²), or equivalently ‖ |x̃⟩ − |x⟩ ‖ ≤ O(κ/t_0). This completes the proof of Theorem A.1.

A.3  Phase estimation calculations

Here we describe, in our notation, the improved phase-estimation procedure of [18], and prove the concentration bounds on α_{k|j}. Assume the target state is |u_j⟩.

1. Adjoin the state

  |Ψ_0⟩ = √(2/T) Σ_{τ=0}^{T−1} sin( π(τ + ½)/T ) |τ⟩.

2. Apply the conditional Hamiltonian evolution Σ_τ |τ⟩⟨τ| ⊗ e^{iAτ t_0/T}. On the eigenvector |u_j⟩, this becomes simply the conditional phase e^{iλ_j t_0 τ/T}, and the resulting state is

  |Ψ_{λ_j t_0}⟩ = √(2/T) Σ_{τ=0}^{T−1} e^{iλ_j t_0 τ/T} sin( π(τ + ½)/T ) |τ⟩ |u_j⟩.
3. We now measure in the Fourier basis, and find that the inner product with (1/√T) Σ_τ e^{2πikτ/T} |τ⟩ |u_j⟩ is (defining δ := λ_j t_0 − 2πk):

  α_{k|j} = (√2/T) Σ_{τ=0}^{T−1} e^{iτδ/T} sin( π(τ + ½)/T ).   (30)

Writing the sine as a difference of exponentials and summing the two resulting geometric series, this simplifies to

  α_{k|j} = e^{(iδ/2)(1 − 1/T)} · ( √2 cos(δ/2) / T ) · ( 2 cos(δ/2T) sin(π/2T) ) / ( sin((δ+π)/2T) · sin((δ−π)/2T) ).   (31)-(38)

Following [18], we make the assumption that 2π ≤ |δ| ≤ T/10. Further using α − α³/6 ≤ sin α ≤ α and ignoring phases, we find that

  |α_{k|j}| ≤ 4√2 π / ( (δ² − π²)( 1 − (δ² + π²)/(3T²) ) ) ≤ 8π/δ².   (39)

Thus |α_{k|j}|² ≤ 64π²/δ⁴ whenever |k − λ_j t_0/2π| ≥ 1.
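These concentration bounds can be checked numerically (illustrative code; the parameter values are arbitrary): the exact α_{k|j} for one eigenvalue are obtained by Fourier-transforming the phase-kicked state and then compared against 64π²/δ⁴ in the regime 2π ≤ |δ| ≤ T/10 analyzed above.

```python
import numpy as np

# Exact alpha_{k|j} for a single eigenvalue lam: Fourier transform of
# sqrt(2/T) * e^{i lam t0 tau / T} * sin(pi (tau + 1/2) / T).
T, t0, lam = 512, 30.0, 0.8              # illustrative parameters
tau = np.arange(T)
state = np.sqrt(2.0 / T) * np.sin(np.pi * (tau + 0.5) / T) \
        * np.exp(1j * lam * t0 * tau / T)
alpha = np.fft.fft(state) / np.sqrt(T)   # alpha_{k|j}, k = 0..T-1
delta = lam * t0 - 2 * np.pi * np.arange(T)
```

The amplitudes form a normalized distribution (the Fourier transform is unitary), they peak at the k nearest λ t_0/2π, and the tail obeys the quartic decay used in the proof of (9).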
A.4  The non-Hermitian case

Suppose A ∈ C^{M×N} with M ≤ N. Generically, Ax = b is now underconstrained. Let the singular value decomposition of A be

  A = Σ_{j=1}^M λ_j |u_j⟩⟨v_j|,

with |u_j⟩ ∈ C^M, |v_j⟩ ∈ C^N, and λ_1 ≥ ··· ≥ λ_M ≥ 0. Let V = span{|v_1⟩, ..., |v_M⟩} and U = span{|u_1⟩, ..., |u_M⟩}. Define

  H = ( 0 , A ; A† , 0 ).   (40)

H is Hermitian with eigenvalues ±λ_1, ..., ±λ_M, corresponding to eigenvectors |w_j^±⟩ := (1/√2)( |0⟩|u_j⟩ ± |1⟩|v_j⟩ ). It also has N − M zero eigenvalues, corresponding to the orthogonal complement of V. To run our algorithm we use the input |0⟩|b⟩. If |b⟩ = Σ_{j=1}^M β_j |u_j⟩, then

  |0⟩|b⟩ = Σ_{j=1}^M (β_j/√2) ( |w_j^+⟩ + |w_j^−⟩ ),

and running the inversion algorithm yields a state proportional to

  H^{−1} |0⟩|b⟩ = Σ_{j=1}^M β_j λ_j^{−1} (1/√2)( |w_j^+⟩ − |w_j^−⟩ ) = Σ_{j=1}^M β_j λ_j^{−1} |1⟩|v_j⟩.

Dropping the initial |1⟩, this defines our solution |x⟩. Note that our algorithm does not produce any component in V^⊥, although doing so would have also yielded valid solutions. In this sense, it could be said to be finding the |x⟩ that minimizes ⟨x|x⟩ while solving A|x⟩ = |b⟩.

On the other hand, if M ≥ N, then the problem is overconstrained: the equation A|x⟩ = |b⟩ is satisfiable only if |b⟩ ∈ U. In this case, inverting H on |0⟩|b⟩ returns a valid solution.
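The embedding (40) can be verified directly (our own sketch; the dimensions are arbitrary): the eigenvalues of H are ±σ_j plus N − M zeros, and the minimum-norm solve of H y = (b, 0) returns y = (0, x) with x the minimum-norm solution of A x = b, matching the pseudoinverse.

```python
import numpy as np

rng = np.random.default_rng(3)
Md, Nd = 3, 5                            # underconstrained: 3 equations, 5 unknowns
A = rng.standard_normal((Md, Nd))
b = rng.standard_normal(Md)

# Hermitian embedding of Eq. (40): H = [[0, A], [A^dagger, 0]].
H = np.block([[np.zeros((Md, Md)), A],
              [A.conj().T, np.zeros((Nd, Nd))]])

# Minimum-norm solve of the singular system H y = (b, 0); the bottom
# block of y is the solution x, with no component in V-perp.
y, *_ = np.linalg.lstsq(H, np.concatenate([b, np.zeros(Nd)]), rcond=None)
x = y[Md:]
```

The vanishing top block of y reflects the statement above that the output lies entirely in the |1⟩|v_j⟩ sector.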
But if |b⟩ has some weight in U^⊥, then |0⟩|b⟩ will have some weight in the zero eigenspace of H, which will be flagged as ill-conditioned by our algorithm. We might choose to ignore this part and estimate its weight, or to handle it in any other way we want; in the latter case, the algorithm will return an |x⟩ satisfying

  A|x⟩ = Σ_{j=1}^N |u_j⟩⟨u_j| |b⟩.

A.5  Optimality

In this section, we explain in detail two important ways in which our algorithm is optimal up to polynomial factors. First, no classical algorithm can perform the same matrix inversion task; and second, our dependence on condition number and accuracy cannot be substantially improved. We present two versions of our lower bounds: one based on complexity theory, and one based on oracles.

We say that an algorithm solves matrix inversion if its input and output are:

1. Input: an O(1)-sparse matrix A specified either via an oracle or via a poly(log(N))-time algorithm that returns the nonzero elements in a row.

2. Output: a bit that equals one with probability ⟨x|M|x⟩ ± ε, where M = |0⟩⟨0| ⊗ I_{N/2} corresponds to measuring the first qubit and |x⟩ is a normalized state proportional to A^{−1}|b⟩ for |b⟩ = |0⟩.

Further, we demand that A is Hermitian and κ^{−1} I ≤ A ≤ I. We take ε to be a fixed constant, such as 1/100, and later deal with the dependency in ε. If the algorithm works when A is specified by an oracle, we say that it is relativizing. Even though this is a very weak definition of inverting matrices, this task is still hard for classical computers.
Theorem A.4.

1. If a quantum algorithm exists for matrix inversion running in time κ^{1−δ} · poly log(N) for some δ > 0, then BQP = PSPACE.

2. No relativizing quantum algorithm can run in time κ^{1−δ} · poly log(N).

3. If a classical algorithm exists for matrix inversion running in time poly(κ, log(N)), then BPP = BQP.

Given an n-qubit, T-gate quantum computation, define U as in (4), and recall that

  ( I − U e^{−1/T} )^{−1} = Σ_{k≥0} U^k e^{−k/T}.   (42)

Define

  A = ( 0 , I − U e^{−1/T} ; I − U† e^{−1/T} , 0 ).   (41)

Note that A is Hermitian, has condition number κ ≤ 2T, and dimension N = 6T·2^n. Solving the matrix inversion problem corresponding to A produces an approximation of the quantum computation corresponding to applying U_1, ..., U_T, assuming we are allowed to make any two-outcome measurement on the output state |x⟩. In order to perform the simulation when measuring only the first qubit, we define a measurement M_0 which outputs zero if the time register t is between T+1 and 2T and the original measurement's output was one. Define

  B = ( I_{6T·2^n} , 0 ; 0 , I_{3T·2^n} − U e^{−1/T} ).   (43)

We now define B̃ to be the matrix B after we have permuted its rows and columns such that, if

  C = ( 0 , B̃ ; B̃† , 0 )   (44)

and C|y⟩ = |b, 0⟩, then measuring the first qubit of |y⟩ corresponds to performing M_0 on |x⟩. The condition number of C is equal to that of A, but the dimension is now N = 18T·2^n. As Pr(T+1 ≤ t ≤ 2T) = e^{−2}/(1 + e^{−2} + e^{−4}) and is independent of the result of the measurement M, we can estimate the expectation of M with accuracy ε by iterating this procedure O(1/ε²) times.

Now suppose we could solve matrix inversion in time κ^{1−δ} (log(N)/ε)^{c_1} for constants c_1 ≥ 2 and δ > 0. Given a computation with T ≤ 2^{2n}/18, this yields a simulation taking time

  κ^{1−δ} (log(N)/ε)^{c_1} ≤ (2T)^{1−δ} (3n/ε)^{c_1} ≤ T^{1−δ} c_2 (n log(n))^{c_1},

where c_2 = 2^{1−δ} 3^{c_1} is another constant. For sufficiently large n, let m = 2 log(2n) / (δ log log(n)) and ε = 1/100m.
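The expansion (42) and the κ = O(T) bound depend only on U being unitary with U^{3T} = I, so they can be checked with a cyclic-shift stand-in for U (our own toy choice): the infinite geometric series folds to one period divided by 1 − e^{−3}.

```python
import numpy as np

T = 5
dim = 3 * T
U = np.roll(np.eye(dim), 1, axis=0)      # cyclic shift: a unitary with U^{3T} = I
Ainv_direct = np.linalg.inv(np.eye(dim) - U * np.exp(-1.0 / T))

# Geometric series of Eq. (42), folded to one period using U^{3T} = I:
# sum_{k>=0} U^k e^{-k/T} = (sum_{k=0}^{3T-1} U^k e^{-k/T}) / (1 - e^{-3}).
partial = sum(np.linalg.matrix_power(U, k) * np.exp(-k / T) for k in range(dim))
Ainv_series = partial / (1.0 - np.exp(-3.0))
```

The singular values of I − U e^{−1/T} lie in [1 − e^{−1/T}, 1 + e^{−1/T}], so the condition number is roughly 2T, consistent with the κ ≤ 2T claim above.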
We now have a recipe for simulating an ni-qubit, Ti-gate computation with ni+1 = ni + log(18Ti) qubits and

Ti+1 = Ti^(1−δ) c2 (ni log(ni))^(c1)

gates, with error ε. We can now iterate the speedup in exactly the same manner. Our strategy is to start with an n0-qubit, T0-gate computation and iterate this simulation ℓ ≤ m times, ending with an nℓ-qubit, Tℓ-gate computation with error ≤ mε ≤ 1/100. We stop iterating either after m steps, or whenever Ti+1 > Ti^(1−δ/2), whichever comes first; in the latter case, we set ℓ equal to the first i for which Ti+1 > Ti^(1−δ/2).

For each i < ℓ we have Ti+1 ≤ Ti^(1−δ/2), and thus Ti ≤ T0^((1−δ/2)^i) ≤ 2^((1−δ/2)^i · 2n0) for each i ≤ ℓ. This allows us to bound

ni = n0 + Σ_{j=0}^{i−1} log(18Tj) ≤ n0 + 2n0 Σ_{j=0}^{i−1} (1−δ/2)^j + i log(18) ≤ (4/δ + 1) n0 + m log(18).

In turn, defining yet another constant c3, this implies that Ti+1 ≤ Ti^(1−δ) c3 (n0 log(n0))^(c1). The number of qubits used increases only linearly.

In the case where we iterated the reduction m times, we have Tm ≤ 2^((1−δ/2)^m · 2n0), implying that Tm ≤ poly(n0). On the other hand, suppose we stop for some ℓ < m. Combining the recurrence with our stopping condition Tℓ+1 > Tℓ^(1−δ/2), we find that Tℓ^(δ/2) < c3 (n0 log(n0))^(c1), and therefore

Tℓ ≤ (c3 (n0 log(n0))^(c1))^(2/δ) = poly(n0).

Therefore, the runtime of the procedure is polynomial in n0 regardless of the reason we stopped iterating.

Recall that the TQBF (totally quantified Boolean formula satisfiability) problem is PSPACE-complete, meaning that any k-bit problem instance for any language in PSPACE can be reduced to a TQBF problem of length n = poly(k) (see [29] for more information). The formula can be solved in time T ≤ 2^(2n)/18 by exhaustive enumeration over the variables, and the simulation above compresses this into a quantum computation of poly(n0) gates. Thus a PSPACE computation can be solved in quantum polynomial time. This proves the first part of the theorem.

The proof of part 3 of Theorem A.4 simply formulates a poly(n)-time, n-qubit quantum computation as a κ = poly(n), N = 2^n · poly(n) matrix inversion problem and applies the classical algorithm which we have assumed exists. This gives a way to simulate quantum computation by classical matrix inversion algorithms; Theorem A.4 thus establishes the universality of the matrix inversion algorithm, and in turn this can be used to prove lower bounds on classical matrix inversion algorithms.

To incorporate oracles, note that our construction of U in (4) could simply replace some of the Ui's with oracle queries. This preserves sparsity, although we need the rows of A to now be specified by oracle queries. Iterating the speedup as above, we conclude with the ability to solve the OR problem on 2^n inputs in poly(n) time and queries. This, of course, is impossible [30], and so the purported relativizing quantum algorithm must also be impossible.

To extend the simulation to problems which are not decision problems, note that the algorithm actually supplies us with |x⟩ (up to some accuracy). For example, instead of measuring an observable M, we can measure |x⟩ in the computational basis, obtaining the result i with probability |⟨i|x⟩|².
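The behavior of the iterated recurrence can be illustrated numerically. In the sketch below (ours; the constants c1, c3, δ, and the starting size n0 are arbitrary choices, not values from the proof), the gate count starts at the exhaustive-enumeration bound 2^(2n0)/18 and shrinks under the recurrence until the stopping rule fires, at which point the claimed poly(n0) bound holds:

```python
import math

# Arbitrary illustrative constants (not the paper's).
c1, c3, delta = 2, 4.0, 0.5
n0 = 50
T = 2.0 ** (2 * n0) / 18      # exhaustive-enumeration bound T <= 2^(2n)/18
n = float(n0)

m = int(2 * math.log(2 * n0) / (delta * math.log(math.log(n0))))
for i in range(m):
    # T_{i+1} = T_i^(1-delta) * c3 * (n_i log n_i)^c1
    T_next = (T ** (1 - delta)) * c3 * (n * math.log(n)) ** c1
    if T_next > T ** (1 - delta / 2):   # stopping rule from the proof
        break
    n = n + math.log(18 * T_next, 2)    # qubit count grows only additively
    T = T_next

# The stopping condition forces T^(delta/2) < c3 (n log n)^c1, i.e.
# T < (c3 (n log n)^c1)^(2/delta), polynomial in the (linearly grown) n.
assert T < (c3 * (n * math.log(n)) ** c1) ** (2 / delta)
assert T < 2.0 ** (2 * n0) / 18
```

With these constants the loop stops after a couple of rounds, and the final T is far below the initial exponential, exactly as the stopping-condition argument predicts.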
When comparing with classical algorithms, we assume that they too output samples according to this distribution. This gives our first lower bound.

Theorem A.5. No relativizing classical matrix inversion algorithm can run in time N^α 2^(βκ) unless 3α + 4β ≥ 1/2.

Proof. Recall Simon's problem [21], in which we are given f : Z2^n → {0,1}^(2n) such that f(x) = f(y) iff x + y = a for some a ∈ Z2^n that we would like to find. It can be solved by running a 3n-qubit, (2n+1)-gate quantum computation O(n) times and performing a poly(n) classical computation. The randomized classical lower bound is Ω(2^(n/2)), from birthday arguments. Converting Simon's algorithm to a matrix A yields κ ≈ 4n and N ≈ 36n·2^(3n). The runtime of the purported classical algorithm is then N^α 2^(βκ) ≈ 2^((3α+4β)n) · poly(n). To avoid violating the oracle lower bound, we must have 3α + 4β ≥ 1/2.

If we consider matrix inversion algorithms that work only on positive definite matrices, then the N^α 2^(βκ) bound becomes N^α 2^(β√κ).

Next, we argue that the accuracy ε of the algorithm cannot be substantially improved. Recall that classical algorithms can approximate x†Mx to accuracy ε in time O(Nκ poly(log(1/ε))). This poly(log(1/ε)) dependence arises because, when writing the vectors b and x as bit strings, adding an additional bit doubles the accuracy. However, sampling-based algorithms such as ours cannot hope for a better than poly(1/ε) dependence of the runtime on the error. Thus proving that our algorithm's error performance cannot be improved will require a slight redefinition of the problem.

Define the matrix inversion estimation problem as follows. Given A, b, M, κ, ε, s with ‖A‖ ≤ 1, ‖A^(−1)‖ ≤ κ, A s-sparse and efficiently row-computable, |b⟩ = |0⟩ and M = |0⟩⟨0| ⊗ I_{N/2}: output a number that is within ε of ⟨x|M|x⟩ with probability ≥ 2/3, where |x⟩ is the unit vector proportional to A^(−1)|b⟩.

The algorithm presented in our paper can be used to solve this problem with a small amount of overhead. By producing |x⟩ up to trace distance ε/2 in time Õ(log(N) κ² s² / ε), we can obtain a sample of a bit which equals one with probability μ satisfying |μ − ⟨x|M|x⟩| ≤ ε/2. Since the variance of this bit is ≤ 1/4, taking 3/ε² samples gives us a ≥ 2/3 probability of obtaining an estimate within ε/2 of μ, as required. Thus quantum computers can solve the matrix inversion estimation problem in time Õ(log(N) κ² s² / ε³).

We can now show that the error dependence of our algorithm cannot be substantially improved.

Theorem A.6.
1. If a quantum algorithm exists for the matrix inversion estimation problem running in time poly(κ, log(N), log(1/ε)), then BQP = PP.
2. No relativizing quantum algorithm for the matrix inversion estimation problem can run in time N^α poly(κ)/ε^β unless α + β ≥ 1.
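The sampling step above is a standard Chebyshev argument, which a short Monte Carlo sketch makes concrete (ours; the values of μ and ε are arbitrary): a {0,1}-valued sample has variance ≤ 1/4, so averaging ⌈3/ε²⌉ samples lands within ε/2 of μ with probability at least 2/3.

```python
import math
import numpy as np

rng = np.random.default_rng(2)

mu, eps = 0.37, 0.05            # arbitrary illustrative values
k = math.ceil(3 / eps**2)       # number of samples per estimate

# Repeat the whole estimation many times and measure its success rate.
trials = 2000
estimates = rng.binomial(k, mu, size=trials) / k
success = np.mean(np.abs(estimates - mu) <= eps / 2)
assert success >= 2 / 3
```

In practice the empirical success rate is well above 2/3, since Chebyshev is a loose bound; the point is only that O(1/ε²) samples suffice, which is the source of the 1/ε³ factor in the overall runtime.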
Proof. 1. A complete problem for the class PP is to count the number of satisfying assignments to a SAT formula. Given such a formula φ, a quantum circuit can apply it on a superposition of all 2^n assignments of the variables, generating the state

2^(−n/2) Σ_{z1,...,zn ∈ {0,1}} |z1, . . . , zn⟩ |φ(z1, . . . , zn)⟩.

The probability of obtaining 1 when measuring the last qubit is equal to the number of satisfying truth assignments divided by 2^n. Let C denote the number of z ∈ {0,1}^n such that φ(z) = 1. Exactly determining C thus reduces to the matrix inversion estimation problem with N = 2^n, κ = O(n²) and ε = 2^(−n−2). A matrix inversion estimation procedure which runs in time poly(log(1/ε)) would enable us to estimate this probability to accuracy 2^(−2n) in time poly(log(2^(2n))) = poly(n). This would imply that BQP = PP, as required.

2. Now assume that φ(z) is provided by the output of an oracle. From [22], we know that determining the parity of C requires Ω(2^n) queries to φ. Resolving the parity requires estimating the probability to accuracy 2^(−n), i.e., taking ε ≈ 2^(−n); by assumption we could then solve this in time N^α poly(κ)/ε^β = 2^((α+β)n) · poly(n), implying that α + β ≥ 1.
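The counting step in part 1 can be illustrated with a direct statevector computation (our sketch; the predicate φ below is an arbitrary toy example, not a SAT instance from the text): building the uniform superposition over assignments with φ(z) written into the last qubit, the probability of measuring that qubit as 1 is exactly C/2^n.

```python
import numpy as np

n = 4

def phi(z):
    """Hypothetical toy predicate standing in for a SAT formula."""
    return int(bin(z).count("1") % 3 == 0)

# State 2^{-n/2} sum_z |z>|phi(z)>; index 2*z + phi(z) encodes |z>|phi(z)>.
amp = np.zeros(2 ** (n + 1))
for z in range(2 ** n):
    amp[2 * z + phi(z)] = 2 ** (-n / 2)

C = sum(phi(z) for z in range(2 ** n))     # number of satisfying assignments
p_one = np.sum(amp[1::2] ** 2)             # probability last qubit reads 1
assert np.isclose(p_one, C / 2 ** n)
```

Estimating p_one to accuracy better than 2^(−n−2) pins down the integer C exactly, which is why a poly(log(1/ε))-time estimation routine would put PP inside BQP.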