Applications in Finance
Mingyu Lee∗
Department of Computer Science and Engineering,
Seoul National University, Seoul 08826, Korea and
QDQ Corp., Seoul 06999, Korea
Myeongjin Shin†
School of Computing, KAIST, Daejeon 34141, Korea and
QDQ Corp., Seoul 06999, Korea
Junseo Lee‡
School of Electrical and Electronic Engineering, Yonsei University, Seoul 03722, Korea and
Quantum Computing R&D, Norma Inc., Seoul 04799, Korea
Kabgyun Jeong§
Research Institute of Mathematics, Seoul National University, Seoul 08826, Korea and
School of Computational Sciences, Korea Institute for Advanced Study, Seoul 02455, Korea
arXiv:2309.01363v1 [quant-ph] 4 Sep 2023
(Dated: September 6, 2023)
One of the most promising applications in the era of NISQ (Noisy Intermediate-Scale Quantum)
computing is quantum machine learning. Quantum machine learning offers significant quantum
advantages over classical machine learning across various domains. Specifically, generative
adversarial networks have been recognized for their potential utility in diverse fields such as image
generation, finance, and probability distribution modeling. However, these networks necessitate
solutions for inherent challenges like mode collapse. In this study, we capitalize on the concept that
the estimation of mutual information between high-dimensional continuous random variables can
be achieved through gradient descent using neural networks. We introduce a novel approach named
InfoQGAN, which employs the Mutual Information Neural Estimator (MINE) within the framework
of quantum generative adversarial networks to tackle the mode collapse issue. Furthermore, we
elaborate on how this approach can be applied to a financial scenario, specifically addressing
the problem of generating portfolio return distributions through dynamic asset allocation. This
illustrates the potential practical applicability of InfoQGAN in real-world contexts.
The theoretical demonstration that QGAN can offer an exponential advantage over classical GANs when dealing with data composed of measurements in high-dimensional spaces [25] has captured significant attention. Leveraging these advantages, diverse research efforts have harnessed the potential of QGAN, including its application to option pricing via methodologies that learn random probabilities and map them onto quantum states [26], to financial challenges through modeling joint probability distributions [27], and even to image generation [28]. These studies collectively highlight successful instances where QGAN has proven its effectiveness.

In this paper, we introduce a novel quantum-classical hybrid machine learning model called InfoQGAN. Inspired by InfoGAN, which maximizes the mutual information between a subset of the latent variables and the observations of a GAN, we apply the same concept to QGAN. In particular, we leverage the efficient estimation of mutual information by neural networks, using MINE (Mutual Information Neural Estimation) [29], within the framework of QGAN. We then employ InfoQGAN to experimentally draw the standard deviation-expected return diagram for two assets derived from Markowitz portfolio theory. Our goal is to demonstrate that InfoQGAN significantly enhances modeling performance compared with using QGAN alone.

The structure of our paper is as follows. Section II explains the theoretical background that underpins our results. Section III presents the details of InfoQGAN: the structure of the QGAN and the method of integrating it with MINE (Mutual Information Neural Estimation). Section IV describes experiments that use InfoQGAN to model two-dimensional probability distributions; notably, we experimentally demonstrate its capability to address the mode collapse issue typically observed in GANs. In Section V, we turn to practical applications, employing InfoQGAN to tackle a real-world financial problem based on Markowitz portfolio theory, and conduct experiments to generate the standard deviation-expected return diagram. Section VI outlines the significance and limitations of our study, along with prospects for future work. This concluding section encapsulates the essence of our research.

II. PRELIMINARIES

A. Mutual Information

Mutual information serves as a quantitative measure of the interdependence between random variables. The mutual information I(X; Y), defined by equation (1), quantifies this relationship for two random variables X and Y in terms of Shannon entropy [30]:

I(X; Y) = H(X) − H(X|Y). (1)

In this context, H(X) represents the Shannon entropy of a random variable X, while H(X|Y) denotes the conditional entropy of X given the value of Y. Let the pair of random variables (X, Y) have joint distribution P_{X,Y} over the space X × Y, with marginal distributions P_X and P_Y. These entropy measures are formally given by equations (2) and (3):

H(X) = − Σ_{x∈X} p(x) log p(x), (2)

H(X|Y) = − Σ_{(x,y)∈X×Y} p_{X,Y}(x, y) log ( p_{X,Y}(x, y) / p_Y(y) ). (3)

Furthermore, mutual information can be expressed as a Kullback-Leibler divergence [31], where D_KL denotes the Kullback-Leibler divergence and P_X ⊗ P_Y the product distribution that assigns probability P_X(x) · P_Y(y) to each pair (x, y). In this form, the mutual information is succinctly formulated as

I(X; Y) = D_KL(P_{X,Y} || P_X ⊗ P_Y). (4)

Several methodologies have been proposed to estimate this measure, with the Mutual Information Neural Estimator (MINE) [29] standing out as a successful approach.

B. Mutual Information Neural Estimation (MINE)

The Mutual Information Neural Estimator (MINE), introduced by Belghazi and colleagues [29], is a parametric methodology that exploits the capabilities of neural networks to approximate mutual information. The networks are trained to optimize a loss function tailored to quantify mutual information effectively, whose construction is derived from the Donsker-Varadhan representation [32]. This representation is given in equation (5), where F denotes an arbitrary class of functions T : Ω → R, and P and Q denote two distributions defined on a compact domain Ω ⊂ R^d, each characterized by a probability density function (p and q, respectively) with respect to the Lebesgue measure:

D_KL(P || Q) ≥ sup_{T∈F} ( E_P[T] − log E_Q[e^T] ). (5)
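As an illustrative sketch (our own addition, not part of the experiments in this paper), the snippet below first checks numerically that the entropy form (1) and the Kullback-Leibler form (4) of mutual information coincide on an arbitrary toy discrete joint distribution, and then mimics MINE's core idea on Gaussian data: gradient ascent on the Donsker-Varadhan bound (5), with a deliberately simple one-parameter critic T_a(x, z) = a·x·z standing in for the neural network T_θ. All numbers here are hypothetical.

```python
import numpy as np

# --- Equations (1)-(4) on a toy discrete joint distribution (arbitrary values) ---
p_xy = np.array([[0.30, 0.10],
                 [0.15, 0.45]])              # p(x, y); rows index x, columns y
p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)

H_x = -np.sum(p_x * np.log(p_x))                              # eq. (2)
H_x_given_y = -np.sum(p_xy * np.log(p_xy / p_y[None, :]))     # eq. (3)
mi_entropy = H_x - H_x_given_y                                # eq. (1)
mi_kl = np.sum(p_xy * np.log(p_xy / np.outer(p_x, p_y)))      # eq. (4)
assert np.isclose(mi_entropy, mi_kl)

# --- Equation (5): maximizing the Donsker-Varadhan bound, as MINE does ---
rng = np.random.default_rng(0)
rho, n = 0.6, 200_000
x = rng.standard_normal(n)
z = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)    # joint samples (P)
z_marg = rng.permutation(z)                                   # product samples (Q)

def dv_bound(a):
    # E_P[T_a] - log E_Q[e^{T_a}] with the toy critic T_a(x, z) = a * x * z
    return np.mean(a * x * z) - np.log(np.mean(np.exp(a * x * z_marg)))

a, lr = 0.0, 0.05
for _ in range(300):                                          # gradient ascent on (5)
    w = np.exp(a * x * z_marg)
    grad = np.mean(x * z) - np.mean(x * z_marg * w) / np.mean(w)
    a += lr * grad

true_mi = -0.5 * np.log(1 - rho**2)   # closed-form I(X;Z) for correlated Gaussians
print(dv_bound(a), true_mi)
```

With this restricted critic the bound plateaus strictly below the true mutual information; MINE replaces T_a by a neural network T_θ, which tightens the bound, and estimates both expectations from minibatches.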
In this context, the parameterized neural network T_θ comes into play. It has been established that Î(X; Z)_n ≤ I(X; Z), creating a foundation for training the neural network to maximize Î(X; Z)_n. This optimization process, in turn, facilitates the estimation of mutual information. Furthermore, the strong consistency of MINE has been rigorously demonstrated [29, Theorem 5]. This theorem establishes that for every ε > 0 there exist a positive integer N and neural network parameters θ ∈ Θ such that, for all n ≥ N, the inequality I(X; Z) − Î(X; Z)_n ≤ ε holds.

C. (Quantum) Generative Adversarial Networks

The Generative Adversarial Network (GAN) [33] is a machine learning framework in which two neural networks engage in a competitive interplay, akin to a zero-sum game, with the objective of generating new data that resembles a predefined training dataset. Originally conceived as a generative model for unsupervised learning, GANs have demonstrated their utility across diverse learning paradigms. At the heart of this concept lies a generator that is trained, through dynamic adjustments, to outwit a discriminator rather than being directly optimized to reproduce specific samples. Through adversarial learning, the classification and generation models iteratively refine one another, enabling the generator to craft highly authentic data that is indistinguishable from real data by consistently confounding the discriminator's classification ability. The structural framework of GANs is illustrated in Figure 1. The GAN learning process is cast as a minimax problem, encapsulated by

min_G max_D V(D, G) = E_{x∼p_data(x)}[log D(x)] + E_{z∼p_z(z)}[log(1 − D(G(z)))], (7)

where D(x) represents the classifier's evaluation of real data, and D(G(z)) gauges the authenticity of the generated data.

FIG. 1: The basic structure of a Generative Adversarial Network (GAN).

Incorporating the principles of quantum mechanics, a Quantum Generative Adversarial Network (QGAN) expands the traditional GAN framework into the realm of quantum computing. GANs consist of a generator and a discriminator: the generator fabricates data instances that mirror the training data, while the discriminator distinguishes genuine data from synthesized counterparts. In the quantum domain, a QGAN harnesses quantum states and operations to both learn and generate quantum data. The quantum generator employs a quantum circuit to map a latent space onto a data space. The quantum discriminator, whether realized as a quantum circuit or a conventional neural network, discerns between authentic and generated quantum data, which requires an initial measurement step for processing. By capitalizing on the tenets of quantum mechanics, a QGAN imitates and manipulates quantum states, potentially augmenting the generation of quantum data in comparison with classical techniques. It is noteworthy that QGANs are an evolving area of study, with practical implementations on quantum computers confronting obstacles such as the management of noise and error correction. The structural aspects of the QGAN employed in our experiments are elaborated in Section III.

D. InfoGAN

InfoGAN [18] extends the concept of GAN to encompass unsupervised learning through the utilization of mutual information. The core notion revolves around maximizing the mutual information between the output of the generator and a subset of latent codes which, together with noise, act as inputs to the generator. This strategic manipulation of the latent codes enables them to influence the behavior of the generated output in an unsupervised manner.

Unlike traditional GANs, where only the noise z serves as input to the generator, InfoGAN introduces the latent code c as an additional input, transforming the generator's distribution to G(z, c). The primary objective of InfoGAN is to maximize the mutual information I(c; G(z, c)), achieved by altering the loss function of the minimax game as defined by
I(c; G(z, c)) ≥ E_{x∼G(z,c)} E_{c′∼P(c|x)}[log Q(c′|x)] + H(c). (9)

Here, Q(c|x) functions as an auxiliary distribution approximating P(c|x). A recognition network, which shares most of its structure with the discriminator and differs only in the final layer, is employed to parameterize Q. Empirical results showcase the efficacy of InfoGAN in disentangling writing styles from digit shapes within the MNIST dataset. It also demonstrates the discovery of visual concepts such as hairstyles, eyeglasses, and emotions, along with the ability to discern pose variations due to lighting conditions in 3D rendered images.

E. Markowitz Portfolio Theory

σ_p = √( w_A² σ_A² + w_B² σ_B² + 2 w_A w_B ρ_AB ). (10)

In this context, w_A and w_B denote the portfolio weights assigned to assets A and B, respectively, subject to the constraint w_A + w_B = 1, underscoring the allocation of resources. The term ρ_AB = Cov(R_A, R_B) is the covariance between the returns of assets A and B, encapsulating the dynamic interaction embedded within their respective returns. This covariance plays a pivotal role in assessing the interrelationship and diversification potential of these assets within the portfolio framework.
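As a sanity check on equation (10) — our own illustration, using synthetic stand-ins for the two return series rather than the data analyzed in this paper — the identity can be verified directly against the standard deviation of the combined return series:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic daily return series for two hypothetical assets A and B.
r_a = rng.normal(0.0005, 0.010, 2500)
r_b = 0.5 * r_a + rng.normal(0.0010, 0.020, 2500)

w_a, w_b = 0.6, 0.4                        # portfolio weights, w_A + w_B = 1
cov_ab = np.cov(r_a, r_b, ddof=0)[0, 1]    # rho_AB = Cov(R_A, R_B) in our notation

# Equation (10): sigma_p = sqrt(w_A^2 s_A^2 + w_B^2 s_B^2 + 2 w_A w_B rho_AB)
sigma_p = np.sqrt(w_a**2 * r_a.var() + w_b**2 * r_b.var() + 2 * w_a * w_b * cov_ab)

# Direct check: the standard deviation of the combined series matches eq. (10).
r_p = w_a * r_a + w_b * r_b
assert np.isclose(sigma_p, r_p.std())
print(sigma_p)
```

Sweeping w_A over [0, 1] and plotting (σ_p, mean return) traces out the standard deviation-expected return curve targeted in Section V.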
III. INTEGRATION OF QGAN AND MINE

We propose a novel quantum machine learning framework that seamlessly integrates the QGAN and MINE methodologies. We begin by introducing an enhanced generator design. Supplied with a stochastic noise vector z, this generator creates an initial state U_I(z)|0⟩. Following this initialization, the generator applies the ansatz, so that its final state becomes U_G(θ)U_I(z)|0⟩.

Within the scope of this study, careful attention has been devoted to the composition of each layer. Arranged in sequence, the layers comprise a combination of RX, RY, and RZ (rotation-X, -Y, and -Z) operations interwoven with strategically positioned CNOT gates. These CNOT gates facilitate interactions spanning from the initial to the final qubits, while also extending their influence across all adjacent qubits. For a comprehensive view of the ansatz, we refer to Figure 3. (The circuit can be subject to various modifications; note that certain circuit structures differed for the experiments conducted in Section V.)

FIG. 3: Generator ansatz corresponding to a 4-qubit quantum circuit.

In the final state, we measure the expectation values {⟨O_j⟩} of local and nonlocal observables. The values F_1({⟨O_j⟩}), F_2({⟨O_j⟩}), ..., F_N({⟨O_j⟩}), obtained by passing these expectation values through activation functions F_i : R^n → R, where N is the output dimensionality, constitute the N-dimensional output of the generator. For convenience, let {f_i} = {F_i({⟨O_j⟩})}; the model's output is then defined as G(z) = {f_i}. To ensure that the activation functions can be applied within a quantum computer, they must support gradient calculations through the parameter shift rule [35, 36]. Moreover, since quantum circuits implement linear transformations, the activation functions must be nonlinear in order to introduce nonlinearity into the model.

Let us now select a subset of the noise vector z as the code, designating the remaining portion as the noise variable. This separation can be represented as z = (ϵ, c), where ϵ represents the noise and c represents the code. Both the discriminator and the MINE model operate as classical neural networks. The MINE model takes (c, {f_i}) as its input and produces an estimate of mutual information. The QGAN, the discriminator, and the MINE model are trained concurrently at each epoch. It is important to reiterate that the generator's loss function is adjusted according to

max_G E_{(z,c)∼p(z,c)}[log D(G(z, c)) + β · MINE(c, G(z, c))], (11)

while the loss functions of the other models remain unchanged.

InfoGAN demonstrated that, by separating a small subset of the latent variables given as input into a code and maximizing the mutual information between the code and the model's output, one can control the attributes of the generated data through the code. The way mutual information is computed in InfoGAN, which is based on maximizing a variational bound, can be replaced by MINE. Hence, although our model is a combination of MINE and QGAN, we refer to it as InfoQGAN, signifying a Mutual Information Maximizing QGAN.

IV. INFOQGAN FOR GENERATING 2D DISTRIBUTIONS

We employed InfoQGAN to generate a predefined 2D distribution and compared the performance of QGAN and InfoQGAN within this experimental framework. The generator of a GAN is trained to deceive the discriminator, while the discriminator is trained not to be fooled by the generator. GAN training is thus a minimax optimization, which makes training both the generator and the discriminator difficult and ultimately leads to mode collapse [37–43], a phenomenon in which the generator produces a limited distribution with reduced variety.

MINE [29] introduced a technique in which the generator's input z = (ϵ, c) is partitioned into two components, noise ϵ and code c, with the aim of maximizing the mutual information between G(ϵ, c) and c. This approach effectively addressed the issue of mode collapse in GANs designed for generating 2D distributions. This study underscores the effectiveness of InfoQGAN in mitigating mode collapse in QGANs. Furthermore, the code variable empowers us to exert control over certain properties of the generated data distribution. The input noise z ∈ [−1, 1]^n is integrated into the generator circuit through a noise embedding process:

U_I(z) = ∏_{i=1}^{n} RY_i( π z_i / 2 ), (12)
wherein RY(θ) signifies the RY gate, i.e., a rotation-Y gate operating on a single qubit with angle θ. The objective is to generate a two-dimensional distribution in which the x-coordinate is extracted from the first qubit and the y-coordinate from the second. For i ∈ {1, 2}, where P_i denotes the probability of measuring the i-th qubit as 0, the generator output is given by

(4/π) sin⁻¹(√P_i) − 1/2. (13)

The generator G takes the form of a variational quantum circuit and is trained through classical backpropagation within a quantum circuit simulator. If the generator circuit were executed on a physical quantum processor, the training procedure could instead incorporate techniques such as the parameter shift rule or SPSA [44], which do not require backpropagation. Both the discriminator D and the MINE model are classical feed-forward neural networks and were likewise trained using classical backpropagation. To streamline the training process and mitigate complexity, we adopted stochastic gradient descent: in each epoch, a subset comprising 25% of the samples from the distribution was selected for mini-batch training. Refer to Tables I and III for the experimental setup, including hyperparameters related to training and the target probability distributions.

Experimental Settings                                QGAN      InfoQGAN
Points                                               2000
Target Domain                                        [0, 1] × [0, 1]
Distribution Center                                  (0.3, 0.3)
Shape                                                Figure 4a
Generator (G)           Learning Rate                0.001
                        Optimizer                    Adam
                        Scheduler Step Size          30
                        Gamma                        0.7
Discriminator (D)       Learning Rate                0.0003
                        Optimizer                    Adam
                        Scheduler Step Size          30
                        Gamma                        0.85
Mutual Information      Learning Rate                -         0.001
Neural Estimator        Optimizer                    -         Adam
(MINE)                  Scheduler Step Size          -         30
                        Gamma                        -         0.7
Training                Epoch                        300
                        Layers                       20
Qubits                  Noise (ϵ)                    5         3
                        Code (c)                     -         2
β in Loss Function (Equation (11))                   -         0.5

TABLE I: Details of the experimental settings for the experiment Reducing Mode Collapse.

To determine the equivalence of the generated point
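To make the generator pipeline concrete — the RY noise embedding of equation (12), one ansatz layer of RX/RY/RZ rotations with adjacent CNOTs as in Figure 3, and the activation map of equation (13) — here is a minimal plain-NumPy statevector sketch of our own. It is an illustration with random, untrained parameters and a single layer; the actual implementation uses a quantum circuit simulator with 20 layers.

```python
import numpy as np

def rx(t): return np.array([[np.cos(t/2), -1j*np.sin(t/2)], [-1j*np.sin(t/2), np.cos(t/2)]])
def ry(t): return np.array([[np.cos(t/2), -np.sin(t/2)], [np.sin(t/2), np.cos(t/2)]])
def rz(t): return np.array([[np.exp(-1j*t/2), 0], [0, np.exp(1j*t/2)]])

def on_qubit(u, q, n):
    """Embed a single-qubit gate u acting on qubit q into the n-qubit space."""
    full = np.array([[1.0]])
    for k in range(n):
        full = np.kron(full, u if k == q else np.eye(2))
    return full

def cnot(c, t, n):
    """Permutation matrix of CNOT(control=c, target=t) on n qubits."""
    dim = 2**n
    m = np.zeros((dim, dim))
    for i in range(dim):
        j = i ^ (1 << (n - 1 - t)) if (i >> (n - 1 - c)) & 1 else i
        m[j, i] = 1.0
    return m

def generator(z, theta, n=2):
    state = np.zeros(2**n, dtype=complex)
    state[0] = 1.0
    for i in range(n):  # noise embedding, eq. (12): RY(pi * z_i / 2) on each qubit
        state = on_qubit(ry(np.pi * z[i] / 2), i, n) @ state
    for i in range(n):  # one ansatz layer: RX, RY, RZ rotation per qubit
        for rot, angle in zip((rx, ry, rz), theta[i]):
            state = on_qubit(rot(angle), i, n) @ state
    for i in range(n):  # entangling CNOTs between adjacent qubits (ring)
        state = cnot(i, (i + 1) % n, n) @ state
    probs = np.abs(state)**2
    # P_i: probability of measuring qubit i as 0
    p0 = [sum(p for j, p in enumerate(probs) if not ((j >> (n - 1 - i)) & 1)) for i in range(n)]
    # activation, eq. (13): (4/pi) * arcsin(sqrt(P_i)) - 1/2
    return [4/np.pi * np.arcsin(np.sqrt(np.clip(p, 0.0, 1.0))) - 0.5 for p in p0]

rng = np.random.default_rng(7)
point = generator(rng.uniform(-1, 1, 2), rng.uniform(0, 2*np.pi, (2, 3)))
print(point)  # one generated 2D point
```

In training, many such forward passes are drawn per mini-batch and the outputs are scored by the discriminator and the MINE model.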
FIG. 6: The graph represents the trend of the generator, discriminator, and mutual information during the training process.

FIG. 7: The graph illustrates the trend of correlation between the x, y-axes and codes 0, 1 during the training process.
FIG. 8: Scatter plot representing the distribution of points generated using QGAN and the separation of features according to codes. It can be observed that the feature separation is not well executed.

FIG. 9: Scatter plot representing the distribution of points generated using InfoQGAN and the separation of features according to codes. It can be seen that the feature separation is distinctly taking place.
Additionally, it is feasible to compute essential statistical indicators for both of these standalone assets and for the consolidated portfolio. These indicators encompass the mean and standard deviation of daily returns, derived from the individual distributions of daily returns. These insights shed light on anticipated performance and fluctuation levels. Grasping these metrics is vital for making informed decisions in investment management, as they furnish critical data regarding potential investment outcomes under varying market scenarios.

Figure 11 displays the distribution of daily returns for Apple and Tesla over the span from 2011 to 2022. As discernible from a histogram with 16 bins, Tesla demonstrates greater volatility than Apple, along with superior average returns. If the return series for Apple within this time-series data is denoted R_i^A, and that for Tesla R_i^T, then the return series of a portfolio constructed by combining these assets in proportions α and 1 − α, respectively, is αR_i^A + (1 − α)R_i^T. The mean return and standard deviation of this combined portfolio can be readily computed from this information.

In this experiment, we devise a generator circuit with the specific aim of generating the daily return distribution of a portfolio constituted by combining two assets. The distribution is truncated to [−0.1, 0.1] and discretized using 2^n grid points, where n is the number of qubits employed. The unitary operator within the generator transforms the given input as follows:

|0⟩_n → Σ_{i=0}^{2^n − 1} √p_i |i⟩_n. (14)

Here, |i⟩ corresponds to the discretized interval [−0.1 + (0.2/2^n) · i, −0.1 + (0.2/2^n) · (i + 1)], which lies within [−0.1, 0.1].

We employed daily return data for Apple (AAPL) and Tesla (TSLA) spanning from 2011 to 2022 as the foundation of our raw training dataset. The generator, in this case, is implemented as a 4-qubit quantum circuit, segmenting the [−0.1, 0.1] range into 16 distinct sections. This segmentation facilitates the extraction of probability distribution data from the training dataset. Denoting the weight assigned to AAPL's performance as α and TSLA's weight as 1 − α, we generated a total of 2000 return probability distribution datasets. This was achieved by uniformly partitioning the interval [0, 1] into 1000 values of α, each creating one dataset, and then separately repeating this procedure a thousand times, on each repetition selecting a value of α uniformly from within the [0, 1] range. The generator circuit underwent a training process with the aim of aligning the probability distribution p_i of the output state with the probability distribution of portfolio returns. Taking inspiration from the structure of the QGAN of Ref. [26], which addresses analogous problems, we devised our generator circuit, visually represented in Figure 13. The range of the noise is [0, 1], and the noise embedding circuit has been structured according to

U_I(z) = ∏_{i=1}^{n} RY_i( π(z_i − 0.5)/2 ). (15)

This design corresponds to the rotation-Y gates located at the forefront of Figure 13. We refer to Table IV for the detailed structure of the model and experimental settings.

In Figure 14, return distributions generated by modifying the code values within the finalized model are depicted in the mean-standard deviation space. For InfoQGAN, the graph manifests a concave shape towards the left as the proportions of the two assets change. This outcome aligns with the successful learning described in Section II E, reminiscent of Figure 2, where the Markowitz theory is aptly explained. However, QGAN did not exhibit such behavior. Furthermore, in the case of QGAN, adjusting the code values only yielded return distributions with mean values confined to a highly restricted interval, specifically [0.0004, 0.0009]. In contrast, InfoQGAN showcased a significantly broader range of mean values, spanning approximately 9 times wider and falling within the interval [0.0005, 0.0040]. This outcome can also be interpreted as a resolution of mode collapse, providing further validation for the conclusions presented in Section IV A.

Figure 15 shows the portfolio return distribution generated by InfoQGAN. From the shape of the histogram, it can be seen that there is a high probability of return values around 0. Furthermore, as the code value increases, the probability that the absolute value of the return exceeds 0.05 also increases. This phenomenon is presumed to occur because the portfolio variance, which is changed by the code, is adjusted during the process of mutual information maximization.

VI. DISCUSSION AND CONCLUSIONS

Our work has proposed a quantum-classical hybrid version of InfoGAN [18]. Experimental results indicate that InfoQGAN is also effective in mitigating mode collapse and enables control over generated features, much like InfoGAN. Moreover, applying InfoQGAN in a financial context has yielded promising outcomes. In future research, we intend to conduct experiments using noisy quantum circuit simulators and real quantum hardware. Exploring alternative embedding strategies may yield improved results. Additionally, investigating methods to leverage quantum mutual information instead of classical mutual information presents an intriguing avenue. This has the potential to enhance the likelihood of achieving quantum advantage, particularly
FIG. 15: The distribution is generated by the trained generator G. The code c values are uniformly partitioned into
16 sections within the range of [0, 1], and for each section, the remaining 4 noise ϵ values are randomly determined.
This iterative process is repeated 512 times, and the outcomes are averaged and then plotted.
considering the availability of approaches for estimating quantum mutual information in existing works [46–48].

The pricing calculation for various derivatives depends on the profits obtained from a specific portfolio and is inherently linked to predicting its yield distribution. The incorporation of k auxiliary qubits into the InfoQGAN-trained model, along with a dedicated circuit for controlling the code qubits, facilitates the generation of a superposition that encompasses 2^k diverse portfolio combinations. This advancement enables the identification of the maximum value by utilizing an evaluation observable related to the portfolio's return distribution, alongside Grover's search algorithm [49–51], which operates with a time complexity of O(√(2^k)). This development holds the potential for quantum advantages not only in portfolio evaluation but also in broader applications.

Our research not only contributes to the advancement of theoretical aspects of quantum machine learning but also holds significant meaning in its innovative approach to addressing financial problems through quantum methodologies. Beyond the portfolio optimization problem discussed in this paper, we are enthusiastic about the application of quantum algorithms and quantum machine learning in a diverse array of financial problem-solving investigations. Moreover, this approach lends itself to versatile portfolio generation and evaluation through quantum superposition, proving useful in various scenarios, including portfolio design, yield prediction for ELS (Equity-Linked Securities), option pricing, and more. This underscores the potential of harnessing quantum advancements in financial problem-solving. This facet of our study holds immense potential and is expected to create a profound impact across the broader spectrum of quantum applications in the field of finance.

Acknowledgments

This work was supported by the National Research Foundation of Korea (NRF) through grants funded by the Ministry of Science and ICT (NRF-2022M3H3A1098237) and the Ministry of Education (NRF-2021R1I1A1A01042199). This work was partially supported by an Institute for Information & Communications Technology Promotion (IITP) grant funded by the Korean government (MSIP) (No. 2019-0-00003; Research and Development of Core Technologies for Programming, Running, Implementing, and Validating of Fault-Tolerant Quantum Computing Systems), and by the Korea Institute of Science and Technology Information (KISTI).
[1] B. Huang, Y. Huan, L. D. Xu, L. Zheng, and Z. Zou, Enterprise Information Systems 13, 132 (2019).
[2] K. B. Hansen, Big Data & Society 7, 2053951720926558 (2020).
[3] Y. Li, W. Zheng, and Z. Zheng, IEEE Access 7, 108014 (2019).
[4] M. Leo, S. Sharma, and K. Maddulety, Risks 7, 29 (2019).
[5] A. Mashrur, W. Luo, N. A. Zaidi, and A. Robles-Kelly, IEEE Access 8, 203203 (2020).
[6] N. Bussmann, P. Giudici, D. Marinelli, and J. Papenbrock, Computational Economics 57, 203 (2021).
[7] J. Perols, Auditing: A Journal of Practice & Theory 30, 19 (2011).
[8] J. I.-Z. Chen and K.-L. Lai, Journal of Artificial Intelligence and Capsule Networks 3, 101 (2021).
[9] I. Sadgali, N. Sael, and F. Benabbou, Procedia Computer Science 148, 45 (2019).
[10] G.-Y. Ban, N. El Karoui, and A. E. Lim, Management Science 64, 1136 (2018).
[11] Z. Zhang, S. Zohren, and S. Roberts, The Journal of Financial Data Science (2020).
[12] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, Advances in Neural Information Processing Systems 27 (2014).
[13] M.-Y. Liu and O. Tuzel, Advances in Neural Information Processing Systems 29 (2016).
[14] X. Mao, Q. Li, H. Xie, R. Y. Lau, Z. Wang, and S. Paul Smolley, in Proceedings of the IEEE International Conference on Computer Vision (2017), pp. 2794–2802.
[15] T. Karras, S. Laine, and T. Aila, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2019), pp. 4401–4410.
[16] M. Mirza and S. Osindero, arXiv preprint arXiv:1411.1784 (2014).
[17] D. Berthelot, T. Schumm, and L. Metz, arXiv preprint arXiv:1703.10717 (2017).
[18] X. Chen, Y. Duan, R. Houthooft, J. Schulman, I. Sutskever, and P. Abbeel, Advances in Neural Information Processing Systems 29 (2016).
[19] J. Biamonte, P. Wittek, N. Pancotti, P. Rebentrost, N. Wiebe, and S. Lloyd, Nature 549, 195 (2017).
[20] M. Schuld, I. Sinayskiy, and F. Petruccione, Contemporary Physics 56, 172 (2015).
[21] A. W. Harrow, A. Hassidim, and S. Lloyd, Physical Review Letters 103, 150502 (2009).
[22] P. Rebentrost, M. Mohseni, and S. Lloyd, Physical Review Letters 113, 130503 (2014).
[23] M. Schuld and N. Killoran, Physical Review Letters 122, 040504 (2019).
[24] P.-L. Dallaire-Demers and N. Killoran, Physical Review A 98, 012324 (2018).
[25] S. Lloyd and C. Weedbrook, Physical Review Letters 121, 040502 (2018).
[26] C. Zoufal, A. Lucchi, and S. Woerner, npj Quantum Information 5, 103 (2019).
[27] E. Y. Zhu, S. Johri, D. Bacon, M. Esencan, J. Kim, M. Muir, N. Murgai, J. Nguyen, N. Pisenti, A. Schouela, et al., Physical Review Research 4, 043092 (2022).
[28] H.-L. Huang, Y. Du, M. Gong, Y. Zhao, Y. Wu, C. Wang, S. Li, F. Liang, J. Lin, Y. Xu, et al., Physical Review Applied 16, 024051 (2021).
[29] M. I. Belghazi, A. Baratin, S. Rajeswar, S. Ozair, Y. Bengio, A. Courville, and R. D. Hjelm, arXiv preprint arXiv:1801.04062 (2018).
[30] C. E. Shannon, The Bell System Technical Journal 27, 379 (1948).
[31] S. Kullback, Information Theory and Statistics (Courier Corporation, 1997).
[32] J. von Neumann, Mathematische Grundlagen der Quantenmechanik, vol. 38 (Springer-Verlag, 2013).
[33] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, Communications of the ACM 63, 139 (2020).
[34] D. G. Luenberger et al., OUP Catalogue (1997).
[35] G. E. Crooks, arXiv preprint arXiv:1905.13311 (2019).
[36] D. Wierichs, J. Izaac, C. Wang, and C. Y.-Y. Lin, Quantum 6, 677 (2022).
[37] T. Che, Y. Li, A. P. Jacob, Y. Bengio, and W. Li, arXiv preprint arXiv:1612.02136 (2016).
[38] V. Dumoulin, I. Belghazi, B. Poole, O. Mastropietro, A. Lamb, M. Arjovsky, and A. Courville, arXiv preprint arXiv:1606.00704 (2016).
[39] T. Salimans, I. Goodfellow, W. Zaremba, V. Cheung, A. Radford, and X. Chen, Advances in Neural Information Processing Systems 29 (2016).
[40] Y. Saatci and A. G. Wilson, Advances in Neural Information Processing Systems 30 (2017).
[41] T. Nguyen, T. Le, H. Vu, and D. Phung, Advances in Neural Information Processing Systems 30 (2017).
[42] Z. Lin, A. Khetan, G. Fanti, and S. Oh, Advances in Neural Information Processing Systems 31 (2018).
[43] A. Ghosh, V. Kulharia, V. P. Namboodiri, P. H. Torr, and P. K. Dokania, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2018), pp. 8513–8521.
[44] J. C. Spall, in 1987 American Control Conference (IEEE, 1987), pp. 1161–1167.
[45] J. A. Peacock, Monthly Notices of the Royal Astronomical Society 202, 615 (1983).
[46] M. Shin, J. Lee, and K. Jeong, arXiv preprint arXiv:2306.14566 (2023).
[47] S. Lee, H. Kwon, and J. S. Lee, arXiv preprint arXiv:2307.13511 (2023).
[48] Z. Goldfeld, D. Patel, S. Sreekumar, and M. M. Wilde, arXiv preprint arXiv:2307.01171 (2023).
[49] L. K. Grover, in Proceedings of the Twenty-Eighth Annual ACM Symposium on Theory of Computing (1996), pp. 212–219.
[50] G.-L. Long, Physical Review A 64, 022307 (2001).
[51] D. Biron, O. Biham, E. Biham, M. Grassl, and D. A.