
PHYSICAL REVIEW RESEARCH 4, 033029 (2022)

Adaptive quantum approximate optimization algorithm for solving combinatorial problems on a quantum computer

Linghua Zhu,1,* Ho Lun Tang,1 George S. Barron,1 F. A. Calderon-Vargas,1 Nicholas J. Mayhall,2 Edwin Barnes,1 and Sophia E. Economou1,†
1 Department of Physics, Virginia Tech, Blacksburg, Virginia 24061, USA
2 Department of Chemistry, Virginia Tech, Blacksburg, Virginia 24061, USA

(Received 25 May 2020; revised 22 December 2020; accepted 17 June 2022; published 11 July 2022)

The quantum approximate optimization algorithm (QAOA) is a hybrid variational quantum-classical algorithm
that solves combinatorial optimization problems. While there is evidence suggesting that the fixed form of the
standard QAOA Ansatz is not optimal, there is no systematic approach for finding better Ansätze. We address
this problem by developing an iterative version of QAOA that is problem tailored, and which can also be adapted
to specific hardware constraints. We simulate the algorithm on a class of Max-Cut graph problems and show
that it converges much faster than the standard QAOA, while simultaneously reducing the required number of
CNOT gates and optimization parameters. We provide evidence that this speedup is connected to the concept of
shortcuts to adiabaticity.

DOI: 10.1103/PhysRevResearch.4.033029

*zlinghua18@vt.edu
†economou@vt.edu

Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article's title, journal citation, and DOI.

2643-1564/2022/4(3)/033029(9)

I. INTRODUCTION

Many important computationally hard combinatorial optimization problems such as Max-Cut, graph coloring, traveling salesman, and scheduling management [1–3] can be mapped to Ising Hamiltonians whose ground states provide the solutions. One can in principle solve these optimization problems on a quantum computer by initializing the quantum device in the ground state of a Hamiltonian that is easy to prepare and adiabatically tuning the latter into the problem Hamiltonian. In a digital quantum computer, this translates into a Trotterized version of the adiabatic evolution operator, which is the alternating product of the evolution operators corresponding to the initial mixer and the problem (Ising) Hamiltonians. In the limit of an infinite product, this Trotterized form becomes exact.

QAOA is a hybrid quantum-classical variational algorithm that uses a finite-order version of this evolution operator to prepare wave-function Ansätze on a quantum processor [4–7]. QAOA is performed by variationally minimizing the expectation value of the Ising Hamiltonian with respect to the parameters in the Ansatz. The quantum processor is also used to measure energy expectation values, while the optimization is done on a classical computer. There has been a lot of progress on QAOA recently on both the experimental and theoretical fronts [1,8–14]. There is evidence suggesting that QAOA may provide a significant quantum advantage over classical algorithms [14–16], and that it is computationally universal [17,18].

Despite these advances, there are limitations and potential issues with this algorithm. The performance improves with the number of alternating layers in the Ansatz, but the latter is limited by coherence times in existing and near-term quantum processors. Moreover, more layers imply more variational parameters, which introduces challenges for the classical optimizer [19]. Furthermore, Ref. [20] points out that the locality and symmetry of QAOA can also severely limit its performance. These issues can be attributed to, or are at least exacerbated by, the form of the QAOA Ansatz. In particular, short-depth QAOA is not really the digitized version of the adiabatic problem, but rather an ad hoc Ansatz, and as a result should not be expected to perform optimally, or even well. A short-depth Ansatz that is further tailored to a given combinatorial problem could therefore address the issues with the standard QAOA Ansatz. However, identifying such an alternative is a highly nontrivial problem given the vast space of possible Ansätze.

In this work, we propose an iterative version of QAOA termed adaptive derivative assembled problem tailored-quantum approximate optimization algorithm (ADAPT-QAOA). Our algorithm grows the Ansatz two operators at a time by using a gradient criterion to systematically select the QAOA mixer from a predefined operator pool. While ADAPT-QAOA is general and can be applied to any optimization problem, we focus on Max-Cut to quantify its performance. We find that when entangling gates are included in the operator pool, there is a dramatically faster convergence compared to standard QAOA. Surprisingly, despite the introduction of entangling gates as mixers, these improvements come with a reduction in the numbers of both the optimization parameters and the CNOT gates by approximately 50% each compared to
standard QAOA. We summarize the details of our algorithm and its performance in Sec. II and provide evidence that this improved performance is related to the concept of shortcuts to adiabaticity [21–24] in Sec. III. Finally, we conclude in Sec. IV with a look ahead.

II. ADAPT-QAOA

A. Framework

In QAOA [4,5], the variational Ansatz consists of $p$ layers, each containing the cost Hamiltonian $H_C$ and a mixer, $H_M$:

$$|\psi_p(\vec{\gamma},\vec{\beta})\rangle = \left(\prod_{k=1}^{p}\left[e^{-iH_M\beta_k}e^{-iH_C\gamma_k}\right]\right)|\psi_{\rm ref}\rangle, \tag{1}$$

where $|\psi_{\rm ref}\rangle = |+\rangle^{\otimes n}$, $n$ is the number of qubits, and $\vec{\gamma}$ and $\vec{\beta}$ are sets of variational parameters. If these parameters are chosen such that $\langle\psi_p(\vec{\gamma},\vec{\beta})|H_C|\psi_p(\vec{\gamma},\vec{\beta})\rangle$ is minimized, then the resulting energy and state provide an approximate solution to the optimization problem encoded in $H_C$. The accuracy of the result and the efficiency with which it can be obtained depend sensitively on $H_M$. In the standard QAOA Ansatz, the mixer is chosen to be a single-qubit X rotation applied to all qubits. A few papers have suggested modifications to the standard QAOA Ansatz for specific problems and hardware architectures [25–27]. These interesting results reveal the potential advantages of the QAOA Ansatz but do not provide a universal strategy for choosing mixers that works across a broad range of optimization problems.

In this work, we replace the single, fixed mixer $H_M$ by a set of mixers $A_k$ that change from one layer to the next:

$$|\psi_p(\vec{\gamma},\vec{\beta})\rangle = \left(\prod_{k=1}^{p}\left[e^{-iA_k\beta_k}e^{-iH_C\gamma_k}\right]\right)|\psi_{\rm ref}\rangle. \tag{2}$$

We build up this Ansatz iteratively, one layer at a time, in a way that is determined by $H_C$. This iterative process is inspired by the variational quantum eigensolver algorithm, ADAPT-VQE [28,29]. It can be summarized by three basic steps: First, define the operator set $\{A_j\}$ (called the mixer pool, and where $A_j = A_j^\dagger$) and select a suitable reference state to be the initial state: $|\psi^{(0)}\rangle = |\psi_{\rm ref}\rangle$. Here, we choose $|\psi_{\rm ref}\rangle = |+\rangle^{\otimes n}$ as in the standard QAOA. We will return shortly to the question of how to choose the pool. Second, prepare the current Ansatz $|\psi^{(k-1)}\rangle$ on the quantum processor and measure the energy gradient with respect to the pool, the $j$th component of which is given by $-i\langle\psi^{(k-1)}|e^{iH_C\gamma_k}[H_C, A_j]e^{-iH_C\gamma_k}|\psi^{(k-1)}\rangle$, where the new variational parameter $\gamma_k$ is set to a predefined value $\gamma_0$. For the measurement, we can decompose the commutator into linear combinations of Pauli strings and measure the expectation values of the strings using general variational quantum algorithm methods [30]. If the norm of the gradient is below a predefined threshold, then the algorithm stops, and the current state and energy estimate approximate the desired solution. If the gradient threshold is not met, modify the Ansatz by adding the operator, $A_{\max}^{(k)}$, associated with the largest component of the gradient: $|\psi^{(k)}\rangle = e^{-iA_{\max}^{(k)}\beta_k}e^{-iH_C\gamma_k}|\psi^{(k-1)}\rangle$, where $\beta_k$ is a new variational parameter. Third, optimize all parameters currently in the Ansatz, $\beta_m$, $\gamma_m$, $m = 1, \ldots, k$, such that $\langle\psi^{(k)}|H_C|\psi^{(k)}\rangle$ is minimized, and return to the second step. This algorithm, which we call ADAPT-QAOA, lies somewhere between standard QAOA and ADAPT-VQE in the sense that it possesses the alternating-operator structure of QAOA but enjoys additional flexibility by allowing the mixers to vary over the course of the iterative construction.

B. Operator pool

The first step in running this algorithm is to define the mixer pool. Define $Q$ to be the set of qubits. The pool corresponding to the standard QAOA contains only one operator, $P_{\rm QAOA} = \{\sum_{i\in Q} X_i\}$. Here, we introduce two qualitatively different pools: one consisting entirely of single-qubit mixers, and one with both single-qubit and multiqubit entangling gates: $P_{\rm single} = \cup_{i\in Q}\{X_i, Y_i\} \cup \{\sum_{i\in Q} Y_i\} \cup P_{\rm QAOA}$, $P_{\rm multi} = \cup_{i,j\in Q\times Q}\{B_iC_j \mid B_i, C_j \in \{X, Y, Z\}\} \cup P_{\rm single}$. Because $P_{\rm QAOA} \subset P_{\rm single} \subset P_{\rm multi}$, we expect that $P_{\rm multi}$ will provide the best performance. The QAOA, single-qubit, and multiqubit pools have $O(1)$, $O(n)$, and $O(n^2)$ elements, respectively.

Max-Cut is a classic (NP-hard) quadratic unconstrained binary optimization problem, and it can be used to solve other optimization problems. Thus, it is a useful benchmarking problem for QAOA and has been used as such in prior works [5,7,11]. It is defined as follows: Given a graph $G = (V, E)$, with weight $w_{i,j}$ for edge $(i, j)$, find a cut $S \subseteq V$ such that $S \cup \bar{S} = V$, and $\sum_{i\in S,\, j\in\bar{S},\, (i,j)\in E} w_{i,j}$ is maximized. This problem can be encoded in the Ising Hamiltonian

$$H_C = -\frac{1}{2}\sum_{i,j} w_{i,j}\,(I - Z_iZ_j), \tag{3}$$

where the couplings are given by the edge weights. Each classical state (i.e., tensor product of Z eigenstates) represents a possible cut. $H_C$ counts the sum of the weights of the edges connecting one subgraph to the other, and its ground state corresponds to the maximum cut. $H_C$ has a $Z_2$ symmetry generated by $F = \otimes_i X_i$. Only the $A_j$ that commute with $F$ have a nonzero gradient (see Appendix A), so we retain only these Pauli strings (which have an even number of Y or Z operators) in our mixer pool.

C. Performance and resource comparison

We use the Max-Cut problem on regular graphs with $n = 6$ vertices and degrees $D = 3$ and $D = 5$ to benchmark the performance of ADAPT-QAOA. For each type of graph, we analyze 20 instances of random edge weights, which are drawn from the uniform distribution $U(0, 1)$ [31]. We use Nelder-Mead for the optimization of the variational parameters $\vec{\beta}$ and $\vec{\gamma}$. The gradients used to select new operators are sensitive to the initial values for $\vec{\gamma}$. It is natural to initialize these parameters at $\gamma_0 = 0$ to avoid biasing the optimization. However, as we show in Appendix B, $\gamma_0 = 0$ is a critical point of the cost function [32]. Moreover, the minimum of the energy in the first layer of ADAPT-QAOA never occurs at $\gamma_0 = 0$. Therefore, we shift the initial value $\gamma_0$ slightly away from zero ($\gamma_0 = 0.01$) to avoid these issues.

In Fig. 1 we show the error as a function of the number of Ansatz layers for the standard QAOA and for ADAPT-QAOA using single-qubit and multiqubit mixer pools. For
both three- and five-regular graphs, we find that using the single-qubit mixer pool provides a modest improvement over standard QAOA. On the other hand, the multiqubit pool performs dramatically better, leading to a rapid convergence to the exact solution after only ∼3 layers. We also find that for the degree-five graphs, standard QAOA and ADAPT-QAOA with single-qubit mixers converge more slowly than in the degree-three case, whereas the performance of ADAPT-QAOA with the multiqubit operator pool remains approximately the same. Note that the particular form of the two-qubit operators in the pool was chosen for its simplicity. In general, one can choose a hardware-tailored operator pool, in the spirit of Ref. [33].

FIG. 1. Comparison of the performance of standard QAOA (blue) with ADAPT-QAOA for the single-qubit (orange) and multiqubit (green) pools. The algorithms are run on the Max-Cut problem for the regular graphs shown in the figure, which have $n = 6$ vertices and are of degree (a) $D = 3$ and (b) $D = 5$. The energy error (the difference between the energy estimate obtained by the algorithm and the exact ground-state energy of $H_C$) is shown as a function of the number of layers in the Ansatz. Results are shown for 20 different instances of edge weights, which are randomly sampled from the uniform distribution $U(0, 1)$. The shaded regions indicate 95% confidence intervals.
[Plots omitted. Panels (a) and (b); vertical axis: energy error (arb. units), logarithmic scale from $10^{-10}$ to $10^{0}$; horizontal axis: layer $p$ from 0 to 14; curves: QAOA, ADAPT-Single, ADAPT-Multi.]

In Appendix C, we show similar results for $n = 8$ and $n = 10$ graphs of degree $D = 2$, where ADAPT-QAOA with the multiqubit pool substantially outperforms the standard QAOA again. Going to larger values of $D$ or $n$ is made challenging by a sharp increase in the number of layers needed to reach convergence, as reported for standard QAOA in Ref. [34].

It is interesting to ask how much the ADAPT-QAOA Ansätze differ from the standard QAOA Ansatz. We find that when the single-qubit mixer pool is used, the single-qubit operators $X_i$ are chosen instead of the standard mixer approximately 36.6% of the time for $n = 6$, $D = 3$ graphs and 25% of the time for $n = 6$, $D = 5$ graphs. For the multiqubit mixer pool, the algorithm chooses operators other than the standard mixer approximately 75% of the time for $n = 6$, $D = 3$ graphs and 80% of the time for $n = 6$, $D = 5$ graphs (see Appendix D). This trend supports the intuitive idea that a more connected graph requires more entanglement for a rapid convergence to the solution.

A crucial question, especially for near-term platforms, is how the different mixer pools compare with respect to resource overhead. Figure 2 shows the number of CNOTs and number of parameters for the three algorithms. The CNOT counts are determined by decomposing each two-qubit operator into two CNOT gates and one or two single-qubit gates. Surprisingly, we find that both the standard QAOA and the single-qubit mixer Ansätze in fact have more CNOTs compared to that constructed from the entangling multiqubit mixer pool. Moreover, on average, the standard QAOA algorithm uses more parameters and CNOTs to reach the same convergence threshold than either version of ADAPT-QAOA. About half as many CNOTs are required for the ADAPT-QAOA multiqubit pool case, despite the fact that the mixers in the multiqubit pool themselves introduce additional CNOT gates on top of those coming from $H_C$. Reference [25] proposed using a restricted form of entangling gates in the Ansatz to obtain better performance in combinatorial problems at the cost of introducing more variational parameters. In contrast, ADAPT-QAOA provides a systematic way to both improve performance and reduce the number of parameters and CNOTs.

III. SHORTCUTS TO ADIABATICITY

One may wonder whether there is a physically intuitive way to understand the strikingly better performance of ADAPT-QAOA. Considering that the standard QAOA Ansatz has a structure dictated by the adiabatic theorem, a possible explanation is that the ADAPT algorithm is related to shortcuts to adiabaticity (STA). STA, also known as counterdiabatic or transitionless driving, was introduced for quantum systems by Demirplak and Rice [21] and later, independently, by Berry [22,23]. STA has also been explored in the classical context [35,36], including a recent application in biology [37]. The idea is that if we want to drive a system such that it remains in the instantaneous ground state at all times, then by adding a certain term $H_{CD}$ to the Hamiltonian, we can achieve this without paying the price of a slow evolution. Although the instantaneous eigenstates of the original Hamiltonian only solve the time-dependent Schrödinger equation in the adiabatic limit, they become exact solutions when the Hamiltonian is updated to include $H_{CD}$. The advantage of STA is that the evolution can be achieved nonadiabatically. Below, we provide evidence that ADAPT-QAOA is indeed related to STA, a likely explanation for why it converges to
the solution much faster than its adiabatic counterpart, the standard QAOA. Before we present this evidence, we must first explain how $H_{CD}$ can be constructed using the concept of adiabatic gauge potentials.

FIG. 2. Resource comparison of the standard QAOA, ADAPT-QAOA with the single-qubit mixer pool, and ADAPT-QAOA with the multiqubit mixer pool for the Max-Cut problem on regular graphs with $n = 6$ vertices and random edge weights. (a) and (b) show the comparison for graphs of degree $D = 3$ and $D = 5$, respectively. For all cases except the standard QAOA applied to $D = 5$ graphs, we count the number of parameters and CNOTs needed to reach an energy error of $\delta E = 10^{-3}$. As standard QAOA for $D = 5$ graphs never reaches this error threshold, we instead count the CNOT gates and parameters at the end of the simulation (15 layers). The dark (light) red bars show variational parameter (CNOT gate) counts. The error bars show variances obtained by sampling over 20 different instances of edge weights.
[Bar charts omitted. Panels (a) and (b); left axis: parameter count, 0 to 35; right axis: CNOT count, 0 to 500; categories: QAOA, ADAPT-Single, ADAPT-Multi.]

A. Approximate adiabatic gauge potentials

Here we briefly review the mathematical machinery of STA and adiabatic gauge potentials [38–40]. Let $|\psi\rangle$ be a state evolving under $H(\theta(t))$, $i\partial_t|\psi\rangle = H(\theta(t))|\psi\rangle$, where $\theta$ is a continuous variable that parameterizes the Hamiltonian. A unitary transformation $U(\theta(t))$ can be applied to move the Hamiltonian $H(\theta(t))$ from the initial basis to its instantaneous eigenbasis, where $\tilde{H}(\theta) = U^\dagger(\theta)H(\theta)U(\theta)$ is diagonal at all times. The Schrödinger equation in the instantaneous eigenbasis is $i\partial_t|\tilde{\psi}\rangle = [\tilde{H} - \dot{\theta}\tilde{A}_\theta]|\tilde{\psi}\rangle$, where $|\tilde{\psi}\rangle = U^\dagger|\psi\rangle$, $\dot{\theta} = d\theta/dt$, and $\tilde{A}_\theta = iU^\dagger\partial_\theta U$ is the adiabatic gauge potential in the rotated frame. It is evident that the term $-\dot{\theta}\tilde{A}_\theta$ drives transitions between the energy levels of the original Hamiltonian $H$. Therefore, one can add the counterdiabatic term $H_{CD} = \dot{\theta}A_\theta$ to $H(\theta)$, with $A_\theta = U\tilde{A}_\theta U^\dagger$, to eliminate such transitions in the rotated frame. This is the core of transitionless driving protocols.

Now, the matrix elements of the adiabatic gauge potential in the instantaneous eigenbasis are

$$\langle m(\theta)|A_\theta|n(\theta)\rangle = \langle m(\theta)|U\tilde{A}_\theta U^\dagger|n(\theta)\rangle = i\langle m(\theta)|\partial_\theta U\,U^\dagger|n(\theta)\rangle = i\langle m(\theta)|\partial_\theta n(\theta)\rangle, \tag{4}$$

where we used $\tilde{A}_\theta = iU^\dagger\partial_\theta U$ and $|n(\theta)\rangle = U(\theta)|n_0\rangle$ with $|n_0\rangle$ being independent of $\theta$. Moreover, the adiabatic gauge potential $A_\theta$ satisfies [22,38]

$$\langle m|A_\theta|n\rangle = i\langle m|\partial_\theta n\rangle = i\,\frac{\langle m|\partial_\theta H|n\rangle}{E_n - E_m}, \tag{5}$$

which is obtained by differentiating the eigenvalue equation $H(\theta)|n(\theta)\rangle = E_n(\theta)|n(\theta)\rangle$ with respect to $\theta$. Note that increasing the size of the system can lead to divergent matrix elements due to exponentially small denominators $(E_n - E_m)$. In this regard, Ref. [40] proposes an approximate gauge potential

$$A_\theta^{(p)} = i\sum_{k=1}^{p} a_k\,[H, \partial_\theta H]_{2k-1}, \tag{6}$$

where $[X, Y]_{k+1} = [X, [X, Y]_k]$ and $\{a_1, a_2, \ldots, a_p\}$ is a set of coefficients with $p$ being the order of the expansion. This set of coefficients is found by minimizing $\mathrm{Tr}[G^2(A_\theta^{(p)})]$, where $G(A_\theta^{(p)}) = \partial_\theta H - i[H, A_\theta^{(p)}]$ [40]. In fact, $\mathrm{Tr}[G^2(X)]$ is minimized when $X$ is equal to the exact adiabatic gauge potential $A_\theta$ [38,39]. Using matrix calculus identities and properties of the trace, it is straightforward to show that

$$\frac{\partial\,\mathrm{Tr}[G^2(X)]}{\partial X} = 2[H, i\partial_\theta H - [X, H]]. \tag{7}$$

Only adiabatic gauge potentials satisfy $[H, i\partial_\theta H - [A_\theta, H]] = 0$. This is easily proven by differentiating $\tilde{H}(\theta) = U^\dagger(\theta)H(\theta)U(\theta)$ with respect to $\theta$,

$$\partial_\theta\tilde{H} = \partial_\theta U^\dagger\,UU^\dagger HU + U^\dagger\partial_\theta H\,U + U^\dagger HUU^\dagger\partial_\theta U, \tag{8}$$

and noting that $\tilde{A}_\theta = -i\partial_\theta U^\dagger\,U = iU^\dagger\partial_\theta U$ and $H = \sum_n E_n(\theta)|n(\theta)\rangle\langle n(\theta)|$, then

$$[A_\theta, H] = i\Big(\partial_\theta H - \sum_n \partial_\theta E_n(\theta)\,|n(\theta)\rangle\langle n(\theta)|\Big). \tag{9}$$

Given that $[H, \sum_n \partial_\theta E_n(\theta)|n(\theta)\rangle\langle n(\theta)|] = 0$, adiabatic gauge potentials clearly satisfy

$$[H, i\partial_\theta H - [A_\theta, H]] = 0. \tag{10}$$

B. Connection between ADAPT-QAOA and STA

To investigate the connection between ADAPT-QAOA and STA, we apply the above formalism using the Hamiltonian $H = f(t)H_C + [1 - f(t)]\sum_i^n X_i$ with $f(t) = t/T$ and, since there is no other continuous variable that parameterizes the
Hamiltonian, we simply set $\theta = t$ in the equations above. $T$ is the duration of the evolution from the initial state $|\psi_{\rm ref}\rangle = |+\rangle^{\otimes n}$ to the ground state of the cost Hamiltonian $H_C$, which is given by Eq. (3). The counterdiabatic Hamiltonian $H_{CD}$ is approximated using Eq. (6), where $p$ is the order of the approximation.

As a concrete example, we study the Max-Cut problem on 32 instances of regular graphs ($n = 6$, $D = 3$) with random edge weights. Figure 3 shows the probability that an operator in the ADAPT-QAOA Ansatz is also one of the dominant operators in $H_{CD}$. For each of the 32 cases, we define a set $O_{CD}^{(i)}$ (with $i = 1, \ldots, 32$) comprised of the five operators with the largest coefficient in the time-averaged $H_{CD}$.$^1$ The probability $P$ in Fig. 3 is constructed by taking the total number of times the mixer operator at layer $p$ is also an element of the corresponding set $O_{CD}^{(i)}$ and dividing it by the total number of cases. In all cases, the mixer operator at the first layer is also an element of the set $O_{CD}^{(i)}$. For higher layers, the probability of the mixer operator to be in $O_{CD}^{(i)}$ is inversely proportional to the layer number. We attribute this to the fact that $H_{CD}$ is computed for a specific mixer Hamiltonian ($\sum_i^n X_i$), while information about this choice does not enter into ADAPT-QAOA, which only relies on the initial state $|+\rangle^{\otimes n}$.$^2$ Interestingly, from Fig. 3 we see that going to higher order in the $H_{CD}$ approximation increases the probability of finding the mixers in the set $O_{CD}^{(i)}$. It therefore appears that ADAPT-QAOA finds the appropriate rotation axes in Hilbert space for faster convergence to the solution, and that these axes may in some sense be universal across all possible choices of $H(t)$ that interpolate between the initial and target states. This suggests that STA can be used as a tool to construct operator pools for ADAPT-QAOA.

FIG. 3. Probability $P$ of the operator at layer $p$ of the ADAPT-QAOA Ansatz to be among the Pauli strings with the largest coefficient in $H_{CD}$ averaged over 32 graphs with $n = 6$, $D = 3$. The different curves correspond to different orders of the approximation.
[Plot omitted. Vertical axis: $P$ from 0.0 to 1.0; horizontal axis: layer $p$ from 1 to 5; curves: 1st order through 8th order.]

$^1$ We have seen in our simulations that only Pauli string operators are chosen by ADAPT-QAOA until deep into the layer number. Since we stop at layer five, only Pauli strings are chosen in all 32 cases.

$^2$ In principle, the mixer Hamiltonian in $H(t)$ could be replaced by any other Hamiltonian that has $|+\rangle^{\otimes n}$ as the ground state, but this would modify the counterdiabatic Hamiltonian and any resemblance to the ADAPT-QAOA Ansatz would be reduced or even lost. Interestingly, using the mixer Hamiltonian $\sum_i^n X_i$ involves the lowest possible energetic cost [41,42] of implementing the shortcut to adiabaticity.

IV. CONCLUSION

In conclusion, we introduced ADAPT-QAOA, an optimization algorithm that grows the Ansatz iteratively in a way that is naturally tailored to a given problem. We tested several instances of random diagonal Hamiltonians and found that ADAPT-QAOA always outperforms the standard QAOA. Given its flexibility with the choice of mixer pool, the algorithm can be tailored to the native gates, connectivities, and experimental constraints of hardware. It would also be fruitful to employ ADAPT-QAOA for optimization problems that use higher-dimensional Hilbert spaces, such as graph coloring [26,27]. Finally, more work into the connection to STA would be of both fundamental and practical interest.

ACKNOWLEDGMENTS

We thank Bryan T. Gard and Ada Warren for helpful discussions. S.E.E. acknowledges support from the US Department of Energy (Award No. DE-SC0019318). E.B. and N.J.M. acknowledge support from the US Department of Energy (Award No. DE-SC0019199).

APPENDIX A: ISING SYMMETRY AND MIXER POOL OPERATORS

In this work, we focus on Ising Hamiltonians of the form

$$H_C = -\frac{1}{2}\sum_{i,j} w_{i,j}\,(I - Z_iZ_j), \tag{A1}$$

which have a $Z_2$ symmetry associated with the operator $F = \otimes_i X_i$. Since $[F, H_C] = 0$ and $F|\psi_{\rm ref}\rangle = |\psi_{\rm ref}\rangle$, we can rewrite the gradient in the first iteration as

$$\langle\psi_{\rm ref}|e^{iH_C\gamma_1}[H_C, A_j]e^{-iH_C\gamma_1}|\psi_{\rm ref}\rangle = \langle\psi_{\rm ref}|e^{iH_C\gamma_1}F[H_C, A_j]Fe^{-iH_C\gamma_1}|\psi_{\rm ref}\rangle, \tag{A2}$$

where $A_j$ is an operator from the mixer pool. However, we also know that $FA_j = \pm A_jF$ because $F$ and $A_j$ are Pauli strings (except when $A_j$ is the standard QAOA mixer $\sum_{i\in Q} X_i$ or $\sum_{i\in Q} Y_i$, but the former commutes and the latter anticommutes with $F$), so $F[H_C, A_j]F = \pm[H_C, A_j]$. Comparing this to Eq. (A2), we see that to have a nonzero gradient, we need $[F, A_j] = 0$. This holds for all steps of the algorithm, because only operators that commute with $F$ appear in the Ansätze, so a formula like Eq. (A2) holds at every iteration. The $A_j$ that commute with $F$ are Pauli strings that have an even number of Y or Z operators, so we retain only these Pauli strings in our mixer pool.

APPENDIX B: FIRST LAYER OF ADAPT-QAOA ANSATZ

Here we analyze the ADAPT-QAOA cost function in the first layer, and the results show that the minimum of the energy in the first layer of ADAPT-QAOA never occurs at $\gamma = 0$ for
any operator included in the pool. We also show that $\gamma = 0$ is a critical point of the cost function.

FIG. 4. The performance of three algorithms for the Max-Cut problem on an $n = 8$, $D = 2$ graph with randomly chosen edge weights.

FIG. 5. The performance of three algorithms for the Max-Cut problem on an $n = 10$, $D = 2$ graph with randomly chosen edge weights.

At level $p$ of ADAPT-QAOA, the cost function is

$$E_p(\beta, \gamma) = \langle\psi^{(p-1)}|e^{-i\gamma H}e^{-i\beta M}He^{i\beta M}e^{i\gamma H}|\psi^{(p-1)}\rangle, \tag{B1}$$

where $H$ is a linear combination of Pauli strings that are tensor products of the identity and Z. All the terms in $H$ commute with each other. $M$ is the mixer. If the mixers $M$ are single-Pauli strings, we have

$$e^{-i\beta M}He^{i\beta M} = H_c + \cos(2\beta)H_a - i\sin(2\beta)MH_a, \tag{B2}$$

where $H_c$ is the part of $H$ that commutes with $M$, and $H_a$ is the part of $H$ that anticommutes with $M$. We then have

$$e^{-i\gamma H}e^{-i\beta M}He^{i\beta M}e^{i\gamma H} = H_c + \cos(2\beta)H_a - i\sin(2\beta)e^{-i\gamma H}Me^{i\gamma H}H_a = H_c + \cos(2\beta)H_a - i\sin(2\beta)MH_ae^{2i\gamma H_a}. \tag{B3}$$

We know that $E_p(\beta, \gamma)$ is periodic in $\beta$ with period $\pi$. Therefore, we can restrict $\beta$ to the range $\beta \in [-\pi/2, \pi/2]$ without loss of generality. Let us define

$$G(\gamma) \equiv -iMH_ae^{2i\gamma H_a}. \tag{B4}$$

The cost function is then

$$E_p(\beta, \gamma) = \langle H_c\rangle + \cos(2\beta)\langle H_a\rangle + \sin(2\beta)\langle G(\gamma)\rangle, \tag{B5}$$

where the expectation values are taken with respect to $|\psi^{(p-1)}\rangle$. Therefore,

$$\frac{\partial E_p}{\partial\beta} = -2\sin(2\beta)\langle H_a\rangle + 2\cos(2\beta)\langle G(\gamma)\rangle = 0 \;\Rightarrow\; \tan(2\beta) = \frac{\langle G(\gamma)\rangle}{\langle H_a\rangle}, \tag{B6}$$

and

$$\frac{\partial E_p}{\partial\gamma} = \sin(2\beta)\langle G'(\gamma)\rangle = 0 \;\Rightarrow\; \beta = 0, \pm\pi/2 \ \text{or}\ \langle G'(\gamma)\rangle = 0. \tag{B7}$$

For the first layer, $p = 1$, the state is $|\psi^{(0)}\rangle = |+\rangle^{\otimes n}$, and so $\langle H_a\rangle = 0$. From Eq. (B6), we see that $\beta = \pm\pi/4$ assuming $\langle 0|G(\gamma)|0\rangle \neq 0$. Eq. (B7) then requires $\langle G'(\gamma)\rangle = 0$. Using Eq. (B4), for $p = 1$ this is

$$\langle 0|G'(\gamma)|0\rangle = 2\langle 0|MH_a^2e^{2i\gamma H_a}|0\rangle = 2\langle 0|H_a^2e^{2i\gamma H_a}|0\rangle = 0. \tag{B8}$$

Notice that $\gamma = 0$ is not a solution of this equation because $\langle 0|H_a^2|0\rangle > 0$, which follows from the fact that this is the norm of a nonzero state, $H_a|+\rangle^{\otimes n}$. Numerics are needed to determine if there is a nonzero value of $\gamma$ that satisfies $\langle 0|G'(\gamma)|0\rangle = 0$.

On the other hand, if $\langle 0|G(\gamma)|0\rangle = 0$, then there is no constraint on $\beta$ from $\partial E_p/\partial\beta = 0$. Notice that this happens when $\gamma = 0$ since $\langle 0|G(0)|0\rangle = -i\langle 0|MH_a|0\rangle = 0$, which is true for regular graphs where $H_a$ is a sum of terms like $Z_jZ_k$, and $M$ is a Pauli string that commutes with $F = \otimes_i X_i$. In this case, we must have $\beta = 0$ or $\pm\pi/2$. There could be other values of $\gamma$ that satisfy $\langle 0|G(\gamma)|0\rangle = 0$, but it seems unlikely that both this condition and $\langle 0|G'(\gamma)|0\rangle = 0$ can be satisfied simultaneously, so it is probably still true that $\beta$ must be 0 or $\pm\pi/2$ in this case.

In summary, there are two classes of possible solutions for $p = 1$: (i) $\beta = \pm\pi/4$ and $\gamma = \gamma^* \neq 0$ where $\langle 0|G'(\gamma^*)|0\rangle = 0$; (ii) $\gamma = \gamma^*$ where $\langle 0|G(\gamma^*)|0\rangle = 0$ and either $\langle 0|G'(\gamma^*)|0\rangle = 0$ or $\beta = 0, \pm\pi/2$. To clarify whether these solutions are maxima or minima, we calculated the second derivatives:

$$\frac{\partial^2E_p}{\partial\beta^2} = -4\cos(2\beta)\langle 0|H_a|0\rangle - 4\sin(2\beta)\langle 0|G(\gamma)|0\rangle, \tag{B9}$$

$$\frac{\partial^2E_p}{\partial\gamma^2} = \sin(2\beta)\langle 0|G''(\gamma)|0\rangle, \tag{B10}$$

$$\frac{\partial^2E_p}{\partial\beta\,\partial\gamma} = 2\cos(2\beta)\langle 0|G'(\gamma)|0\rangle. \tag{B11}$$
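The closed form of Eq. (B5) can be checked numerically against a direct evaluation of Eq. (B1). The following is a minimal sketch under stated assumptions, not the authors' simulation code: the four-vertex weighted ring in `EDGES`, the choice of mixer $M = X_Q$ on a single qubit `Q`, and all function names are hypothetical and chosen only for illustration. The state is $|+\rangle^{\otimes n}$, and the sign conventions follow Eq. (B1).

```python
import cmath
import math

# Hypothetical 4-vertex weighted ring (illustrative weights, not from the paper).
EDGES = {(0, 1): 0.3, (1, 2): 0.8, (2, 3): 0.5, (0, 3): 0.9}
N_QUBITS = 4
DIM = 1 << N_QUBITS
Q = 1  # the mixer is assumed to be the single Pauli string M = X_Q


def spin(z, i):
    """Z eigenvalue (+1 or -1) of qubit i in computational basis state z."""
    return 1 - 2 * ((z >> i) & 1)


def diagonals():
    """Diagonals of H_c and H_a, where H = H_c + H_a is the Hamiltonian of Eq. (3)."""
    h_c, h_a = [], []
    for z in range(DIM):
        c = a = 0.0
        for (i, j), w in EDGES.items():
            zz = 0.5 * w * spin(z, i) * spin(z, j)
            c -= 0.5 * w          # identity part of each term commutes with M
            if Q in (i, j):
                a += zz           # Z_i Z_j terms touching Q anticommute with X_Q
            else:
                c += zz           # all other Z_i Z_j terms commute with X_Q
        h_c.append(c)
        h_a.append(a)
    return h_c, h_a


def energy_brute_force(beta, gamma):
    """Eq. (B1) evaluated on the full state vector, starting from |+>^n."""
    h_c, h_a = diagonals()
    h = [c + a for c, a in zip(h_c, h_a)]
    amp = 1 / math.sqrt(DIM)
    phi = [amp * cmath.exp(1j * gamma * h[z]) for z in range(DIM)]  # e^{i gamma H}|+>^n
    mask = 1 << Q
    # e^{i beta X_Q} = cos(beta) + i sin(beta) X_Q, and X_Q flips bit Q.
    phi = [math.cos(beta) * phi[z] + 1j * math.sin(beta) * phi[z ^ mask]
           for z in range(DIM)]
    return sum(h[z] * abs(phi[z]) ** 2 for z in range(DIM))


def energy_closed_form(beta, gamma):
    """Eq. (B5): <H_c> + cos(2 beta) <H_a> + sin(2 beta) <G(gamma)>."""
    h_c, h_a = diagonals()
    mean_c = sum(h_c) / DIM
    mean_a = sum(h_a) / DIM
    # <G> = -i <+| X_Q H_a e^{2 i gamma H_a} |+>, and X_Q leaves <+|^n unchanged.
    g = -1j * sum(a * cmath.exp(2j * gamma * a) for a in h_a) / DIM
    return mean_c + math.cos(2 * beta) * mean_a + math.sin(2 * beta) * g.real


if __name__ == "__main__":
    for beta, gamma in [(0.3, 0.7), (-0.9, 1.4), (math.pi / 4, 0.01)]:
        diff = energy_brute_force(beta, gamma) - energy_closed_form(beta, gamma)
        print(f"beta={beta:.3f} gamma={gamma:.3f} difference={diff:.2e}")
```

Because $H$ is diagonal in the computational basis and $X_Q$ only flips one bit, no matrix library is needed; the same check with a mixer containing Y or Z factors would require tracking the extra phases.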


FIG. 6. Probability of operators picked by the original QAOA, ADAPT-QAOA with the single-qubit mixer pool, and ADAPT-QAOA with the multiqubit pool for the Max-Cut problem on regular graphs with $n = 6$ vertices and degree (a), (b) $D = 3$ and (c), (d) $D = 5$, with random edge weights sampled from the uniform distribution $U(0, 1)$. The blue bars show the probability of each particular operator being used in the Ansatz, and the green bars show the probability of the original mixer, the sum over all single-qubit gates, and the sum over all entangling gates used in the Ansatz. The results are from 20 instances of random edge weights.

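Let us consider the class (i) extrema first. In this case

$$\left.\frac{\partial^2E_p}{\partial\beta^2}\right|_{\beta=\pm\pi/4,\,\gamma=\gamma^*} = \mp 4\langle 0|G(\gamma^*)|0\rangle, \tag{B12}$$

and the Hessian determinant is

$$\left.\left[\frac{\partial^2E_p}{\partial\beta^2}\frac{\partial^2E_p}{\partial\gamma^2} - \left(\frac{\partial^2E_p}{\partial\beta\,\partial\gamma}\right)^2\right]\right|_{\beta=\pm\pi/4,\,\gamma=\gamma^*} = -4\langle 0|G(\gamma^*)|0\rangle\langle 0|G''(\gamma^*)|0\rangle. \tag{B13}$$

Thus, we need $\langle 0|G(\gamma^*)|0\rangle\langle 0|G''(\gamma^*)|0\rangle < 0$, in which case $\beta = \pi/4$ is a maximum and $\beta = -\pi/4$ is a minimum if $\langle 0|G(\gamma^*)|0\rangle > 0$, while $\beta = -\pi/4$ is a maximum and $\beta = \pi/4$ is a minimum if $\langle 0|G(\gamma^*)|0\rangle < 0$.

For type (ii) extrema, we have

$$\left.\frac{\partial^2E_p}{\partial\beta^2}\right|_{\beta=0,\pm\pi/2,\,\gamma=\gamma^*} = \pm 4\langle 0|H_a|0\rangle = 0, \tag{B14}$$

and the Hessian determinant is

$$\left.\left[\frac{\partial^2E_p}{\partial\beta^2}\frac{\partial^2E_p}{\partial\gamma^2} - \left(\frac{\partial^2E_p}{\partial\beta\,\partial\gamma}\right)^2\right]\right|_{\beta=0,\pm\pi/2,\,\gamma=\gamma^*} = -4\langle 0|G'(\gamma^*)|0\rangle^2, \tag{B15}$$

so these extrema are saddle points. Notice that $\gamma = 0$ is one of these saddle points.

We can see that the minimum of the energy in the first layer of ADAPT-QAOA never occurs at $\gamma = 0$, and the only minima occur at $\beta = \pm\pi/4$ and $\gamma = \gamma^* \neq 0$, which is consistent with our numerical simulation results. The fact that $\gamma = 0$ is a saddle point (at least in the first layer) makes this a problematic choice for the initial value of $\gamma$ in the classical optimization subroutine of ADAPT-QAOA. For this reason, we instead choose the initial value, $\gamma_0$, to be a small nonzero value, e.g., $\gamma_0 = 0.01$. Similar performance is achieved for any value of $\gamma_0 \lesssim 0.1$. Keeping $\gamma_0$ close to zero is preferable to avoid biasing the optimization.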
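The first-layer landscape described above can be explored with a short script. This is only an illustrative sketch: it assumes a hypothetical four-vertex weighted ring (`EDGES`), a single-qubit mixer $M = X_Q$, and the appendix's sign conventions, for which Eq. (B5) with $|\psi^{(0)}\rangle = |+\rangle^{\otimes n}$ reduces to $E_1 - \langle H_c\rangle = \sin(2\beta)\,\langle G(\gamma)\rangle$ with $\langle G(\gamma)\rangle$ equal to the average of $H_a\sin(2\gamma H_a)$ over the diagonal of $H_a$. The grid search is a stand-in for the classical optimizer, not the Nelder-Mead routine used in the paper.

```python
import math

# Hypothetical weighted 4-cycle; the weights are illustrative, not from the paper.
EDGES = {(0, 1): 0.3, (1, 2): 0.8, (2, 3): 0.5, (0, 3): 0.9}
N_QUBITS = 4
Q = 1  # the mixer is assumed to be M = X_Q


def anticommuting_diagonal():
    """Diagonal of H_a: the (w/2) Z_i Z_j terms of Eq. (3) with i = Q or j = Q."""
    diag = []
    for z in range(1 << N_QUBITS):
        s = [1 - 2 * ((z >> i) & 1) for i in range(N_QUBITS)]
        diag.append(sum(0.5 * w * s[i] * s[j]
                        for (i, j), w in EDGES.items() if Q in (i, j)))
    return diag


H_A = anticommuting_diagonal()


def g_expectation(gamma):
    """<G(gamma)> in |+>^n for M = X_Q: the mean of H_a sin(2 gamma H_a)."""
    return sum(a * math.sin(2 * gamma * a) for a in H_A) / len(H_A)


def energy_shift(beta, gamma):
    """E_1(beta, gamma) - <H_c> from Eq. (B5); <H_a> = 0 in the first layer."""
    return math.sin(2 * beta) * g_expectation(gamma)


def gradient(beta, gamma, eps=1e-6):
    """Central finite differences for (dE/dbeta, dE/dgamma)."""
    d_beta = (energy_shift(beta + eps, gamma) - energy_shift(beta - eps, gamma)) / (2 * eps)
    d_gamma = (energy_shift(beta, gamma + eps) - energy_shift(beta, gamma - eps)) / (2 * eps)
    return d_beta, d_gamma


def grid_minimum(steps=200, gamma_max=2.0):
    """Coarse grid search over beta in [-pi/2, pi/2] and gamma in [0, gamma_max]."""
    best = (float("inf"), 0.0, 0.0)
    for kb in range(steps + 1):
        beta = -math.pi / 2 + kb * math.pi / steps
        for kg in range(steps + 1):
            gamma = kg * gamma_max / steps
            best = min(best, (energy_shift(beta, gamma), beta, gamma))
    return best


if __name__ == "__main__":
    print("gradient at (beta, gamma) = (0, 0):", gradient(0.0, 0.0))
    print("dE/dbeta at beta = 0 for gamma0 = 0 and gamma0 = 0.01:",
          gradient(0.0, 0.0)[0], gradient(0.0, 0.01)[0])
    print("grid minimum (energy shift, beta, gamma):", grid_minimum())
```

In this toy setting the gradient vanishes at $(\beta, \gamma) = (0, 0)$ while the energy changes sign around that point, and the $\beta$-derivative at $\beta = 0$ becomes nonzero once $\gamma_0$ is moved to 0.01, which is the behavior the choice of a small nonzero $\gamma_0$ is meant to exploit.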


n = 10 qubits, with results shown in Figs. 4 and 5. Clearly, even for larger system sizes and less connected graphs where D/n < 1/2, our algorithm exhibits substantially better performance compared to standard QAOA.

APPENDIX D: ROLE OF ENTANGLING MIXERS VERSUS ENTANGLING GATES

In Fig. 2, we compared the resources used by three different algorithms for the Max-Cut problem on regular graphs. These resources include the number of CNOT gates and the number of optimization parameters. From the comparison we can see that including entangling mixers in the Ansatz produces a dramatically faster convergence to the exact solution compared to the original QAOA. Surprisingly, despite the inclusion of these entangling mixers, the improvement in convergence comes with a simultaneous reduction in both the number of optimization parameters and the number of entangling gates in the compiled Ansatz.
To further investigate the role of entangling mixers, we
consider the n = 6, D = 3 [in Figs. 6(a), 6(b)] and n =
6, D = 5 graphs [in Figs. 6(c), 6(d)] and show the probability
for an operator to be picked by the original QAOA and by
ADAPT-QAOA with a single-qubit mixer pool in Fig. S1.
Similar results for ADAPT-QAOA with a multiqubit mixer
pool are also shown in Figs. 6(b), 6(d). We find that when only
the single-qubit mixer pool is used, the single-qubit operators
Xi are chosen instead of the original mixer approximately
25% of the time. For the multiqubit mixer pool, the algorithm
chooses two-qubit entangling operators approximately 70% of
the time. Clearly, entangling mixers play a central role in the
improved performance of ADAPT-QAOA.
Additionally, to understand the importance of CNOT gates
in the compiled Ansatz, in Fig. 7 we show the error for the
solution determined by each algorithm as a function of the
number of CNOTs used in the Ansatz for both n = 6, D = 3,
and n = 6, D = 5 graphs. Based on the figure, we can see
FIG. 7. Comparison of the performance of the original QAOA that ADAPT-QAOA with the multiqubit mixer pool converges
algorithm (blue) with the ADAPT-QAOA algorithm for the single- using a much smaller number of CNOTs compared to the orig-
qubit (orange) and multiqubit (green) pools run for the Max-Cut inal QAOA or to ADAPT-QAOA with the single-qubit mixer
problem on graphs with n = 6 vertices with degree (a) D = 3 and pool. On the other hand, the latter only provides a modest
(b) D = 5. The energy error is shown as a function of the number of reduction in CNOT count compared to the original QAOA.
total CNOTs used in the Ansatz. Interestingly, we see that while entangling mixers are crucial
for the dramatic improvement in the performance afforded by
APPENDIX C: SCALABILITY OF ADAPT-QAOA
ADAPT-QAOA with the multiqubit mixer pool, at the same
Here, we investigate the scalability of our algorithm. We time far fewer entangling gates are needed in the Ansatz. Thus,
perform simulations for degree D = 2 graphs with n = 8 and the role of entanglement in this algorithm is rather subtle.
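The CNOT counts discussed in this appendix depend on how each exponentiated Pauli string is compiled. The sketch below is only a rough bookkeeping aid, and it assumes the textbook CNOT-ladder decomposition, in which a rotation generated by a Pauli string acting nontrivially on k qubits costs 2(k − 1) CNOTs. The compilation actually used for Fig. 7 may differ. The ring graph, the example mixers, and the helper names (`pauli_rotation_cnots`, `cost_layer_cnots`, `ansatz_cnots`) are hypothetical.

```python
def pauli_rotation_cnots(pauli_string):
    """CNOTs for exp(-i theta P) under the ladder construction: 2 * (weight - 1)."""
    weight = sum(1 for p in pauli_string if p != "I")
    return max(0, 2 * (weight - 1))


def cost_layer_cnots(edges):
    """Each Z_i Z_j term in exp(-i gamma H_C) is a weight-2 rotation (2 CNOTs)."""
    return 2 * len(edges)


def ansatz_cnots(edges, mixers):
    """Total CNOTs for an Ansatz with one cost layer per entry of `mixers`.

    Each entry of `mixers` is a list of Pauli strings (e.g. "XIII", "YZII")
    that are exponentiated in that layer.
    """
    total = 0
    for layer in mixers:
        total += cost_layer_cnots(edges)
        total += sum(pauli_rotation_cnots(p) for p in layer)
    return total


if __name__ == "__main__":
    ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
    standard_mixer = ["XIII", "IXII", "IIXI", "IIIX"]  # sum of single-qubit X_i
    entangling_mixer = ["YZII"]                        # one two-qubit mixer
    print(ansatz_cnots(ring, [standard_mixer] * 3))
    print(ansatz_cnots(ring, [entangling_mixer] * 2))
```

Under this counting, a two-qubit mixer adds two CNOTs to its layer, whereas single-qubit mixers add none. A net reduction in CNOT count with the multiqubit pool would therefore have to come from reaching the target accuracy in fewer layers.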

[1] Y.-H. Oh, H. Mohammadbagherpoor, P. Dreher, A. Singh, [4] E. Farhi, J. Goldstone, and S. Gutmann, A quantum approxi-
X. Yu, and A. J. Rindos, Solving multi-coloring combinato- mate optimization algorithm, arXiv:1411.4028.
rial optimization problems using hybrid quantum algorithms, [5] E. Farhi, J. Goldstone, and P. Shor, Design of quantum al-
arXiv:1911.00595. gorithms using physics tools, Tech. Rep. 56291-PH-QC.10
[2] G. Kochenberger, J.-K. Hao, F. Glover, M. Lewis, Z. Lü, (Massachusetts Institute of Technology, Cambridge, 2014).
H. Wang, and Y. Wang, The unconstrained binary quadratic [6] E. Farhi and A. W. Harrow, Quantum supremacy
programming problem: A survey, J. Comb. Opt. 28, 58 through the quantum approximate optimization algorithm,
(2014). arXiv:1602.07674.
[3] A. Lucas, Ising formulations of many NP problems, Front. [7] R. Shaydulin, I. Safro, and J. Larson, Multistart methods for
Phys. 2, 5 (2014). quantum approximate optimization, in Proceedings of the IEEE

033029-8
ADAPTIVE QUANTUM APPROXIMATE OPTIMIZATION … PHYSICAL REVIEW RESEARCH 4, 033029 (2022)

High Performance Extreme Computing Conference (HPEC), optimization algorithm to a quantum alternating operator
Waltham, MA, 2019 (IEEE, Piscataway, NJ, 2019). Ansatz, Algorithms 12, 34 (2019).
[8] G. Pagano, A. Bapat, P. Becker, K. Collins, A. De, P. Hess, H. [27] Z. Wang, N. C. Rubin, J. M. Dominy, and E. G. Rieffel, XY
Kaplan, A. Kyprianidis, W. Tan, C. Baldwin et al., Quantum mixers: Analytical and numerical results for the quantum alter-
approximate optimization of the long-range ising model with nating operator Ansatz, Phys. Rev. A 101, 012320 (2020).
a trapped-ion quantum simulator, Proc. Natl. Acad. Sci. 117, [28] H. R. Grimsley, S. E. Economou, E. Barnes, and N. J. Mayhall,
25396 (2020). An adaptive variational algorithm for exact molecular simu-
[9] Z. Wang, S. Hadfield, Z. Jiang, and E. G. Rieffel, Quantum lations on a quantum computer, Nature Commun. 10, 3007
approximate optimization algorithm for MaxCut: A fermionic (2019).
view, Phys. Rev. A 97, 022304 (2018). [29] H. L. Tang, V. O. Shkolnikov, G. S. Barron, H. R. Grimsley,
[10] L. Zhou, S.-T. Wang, S. Choi, H. Pichler, and M. D. Lukin, N. J. Mayhall, E. Barnes, and S. E. Economou, qubit-adapt-vqe:
Quantum Approximate Optimization Algorithm: Performance, An adaptive algorithm for constructing hardware-efficient an-
Mechanism, and Implementation on Near-Term Devices, Phys. sätze on a quantum processor, PRX Quantum 2, 020310 (2021).
Rev. X 10, 021067 (2020). [30] W. J. Huggins, J. R. McClean, N. C. Rubin, Z. Jiang, N. Wiebe,
[11] G. E. Crooks, Performance of the quantum approximate K. B. Whaley, and R. Babbush, Efficient and noise resilient
optimization algorithm on the maximum cut problem, measurements for quantum chemistry on near-term quantum
arXiv:1811.08419. computers, npj Quantum Info. 7, 23 (2021).
[12] Z.-C. Yang, A. Rahmani, A. Shabani, H. Neven, and C. [31] L. Li, M. Fan, M. Coram, P. Riley, and S. Leichenauer, Quan-
Chamon, Optimizing Variational Quantum Algorithms using tum optimization with a novel Gibbs objective function and
Pontryagin’s Minimum Principle, Phys. Rev. X 7, 021027 Ansatz architecture search, Phys. Rev. Research 2, 023074
(2017). (2020).
[13] Z. Jiang, E. G. Rieffel, and Z. Wang, Near-optimal quantum [32] R. Shaydulin, I. Safro, and J. Larson, Multistart methods for
circuit for Grover’s unstructured search using a transverse field, quantum approximate optimization, in 2019 IEEE High Per-
Phys. Rev. A 95, 062317 (2017). formance Extreme Computing Conference (HPEC) (IEEE, New
[14] P. K. Barkoutsos, G. Nannicini, A. Robert, I. Tavernelli, and York, 2019), pp. 1–8.
S. Woerner, Improving variational quantum optimization using [33] M. P. Harrigan, K. J. Sung, M. Neeley, K. J. Satzinger, F.
CVaR, Quantum 4, 256 (2020). Arute, K. Arya, J. Atalaya, J. C. Bardin, R. Barends, S. Boixo
[15] G. G. Guerreschi and A. Matsuura, QAOA for Max-Cut re- et al., Quantum approximate optimization of non-planar graph
quires hundreds of qubits for quantum speed-up, Sci. Rep. 9, problems on a planar superconducting processor, Nat. Phys. 17,
6903 (2019). 332 (2021).
[16] M. Y. Niu, S. Lu, and I. L. Chuang, Optimizing QAOA: [34] V. Akshay, H. Philathong, M. E. S. Morales, and J. D.
Success probability and runtime dependence on circuit depth, Biamonte, Reachability Deficits in Quantum Approximate Op-
arXiv:1905.12134. timization, Phys. Rev. Lett. 124, 090504 (2020).
[17] M. E. S. Morales, J. D. Biamonte, and Z. Zimborás, On the uni- [35] C. Jarzynski, Generating shortcuts to adiabaticity in quantum
versality of the quantum approximate optimization algorithm, and classical dynamics, Phys. Rev. A 88, 040101(R) (2013).
Quantum Info. Proc. 19, 291 (2020). [36] J. Deng, Q.-h. Wang, Z. Liu, P. Hänggi, and J. Gong, Boosting
[18] S. Lloyd, Quantum approximate optimization is computation- work characteristics and overall heat-engine performance via
ally universal, arXiv:1812.11075. shortcuts to adiabaticity: Quantum and classical systems, Phys.
[19] D. Wierichs, C. Gogolin, and M. Kastoryano, Avoiding local Rev. E 88, 062122 (2013).
minima in variational quantum eigensolvers with the natural [37] S. Iram, E. Dolson, J. Chiel, J. Pelesko, N. Krishnan, Ö.
gradient optimizer, Phys. Rev. Research 2, 043246 (2020). Güngör, B. Kuznets-Speck, S. Deffner, E. Ilker, J. G. Scott,
[20] S. Bravyi, A. Kliesch, R. Koenig, and E. Tang, Obstacles to and M. Hinczewski, Controlling the speed and trajectory of
Variational Quantum Optimization from Symmetry Protection, evolution with counterdiabatic driving, Nature Phys. 17, 135
Phys. Rev. Lett. 125, 260505 (2020). (2020).
[21] M. Demirplak and S. A. Rice, Adiabatic population transfer [38] M. Kolodrubetz, D. Sels, P. Mehta, and A. Polkovnikov, Ge-
with control fields, J. Phys. Chem. A 107, 9937 (2003). ometry and non-adiabatic response in quantum and classical
[22] M. V. Berry, Transitionless quantum driving, J. Phys. A: Math. systems, Phys. Rep. 697, 1 (2017).
Theor. 42, 365303 (2009). [39] D. Sels and A. Polkovnikov, Minimizing irreversible losses in
[23] D. Guéry-Odelin, A. Ruschhaupt, A. Kiely, E. Torrontegui, quantum systems by local counterdiabatic driving, Proc. Natl.
S. Martínez-Garaot, and J. G. Muga, Shortcuts to adiabatic- Acad. Sci. USA 114, E3909 (2017).
ity: Concepts, methods, and applications, Rev. Mod. Phys. 91, [40] P. W. Claeys, M. Pandey, D. Sels, and A. Polkovnikov,
045001 (2019). Floquet-Engineering Counterdiabatic Protocols in Quan-
[24] N. N. Hegade, K. Paul, Y. Ding, M. Sanz, F. Albarrán- tum Many-Body Systems, Phys. Rev. Lett. 123, 090602
Arriagada, E. Solano, and X. Chen, Shortcuts to Adiabaticity (2019).
in Digitized Adiabatic Quantum Computing, Phys. Rev. Appl. [41] A. C. Santos and M. S. Sarandy, Superadiabatic controlled evo-
15, 024038 (2021). lutions and universal quantum computation, Sci. Rep. 5, 15775
[25] E. Farhi, J. Goldstone, S. Gutmann, and H. Neven, Quantum (2015).
algorithms for fixed qubit architectures, arXiv:1703.06199. [42] Y. Zheng, S. Campbell, G. De Chiara, and D. Poletti, Cost
[26] S. Hadfield, Z. Wang, B. O’Gorman, E. G. Rieffel, D. of counterdiabatic driving and work output, Phys. Rev. A 94,
Venturelli, and R. Biswas, From the quantum approximate 042132 (2016).
