networks
Natansh Mathur, Jonas Landman, Yun Yvonna Li, Martin Strahm, Skander
Kazdaghli, Anupam Prakash, Iordanis Kerenidis
by employing quantum circuits in training of classical neural networks, and second, by designing
and training quantum orthogonal neural networks. We benchmark our techniques on two different
imaging modalities, retinal color fundus images and chest X-rays. The results show the promise of such techniques and the limitations of current quantum hardware.
cal machine learning methods for medical image classification and experts in quantum algorithms to advance the state-of-the-art of quantum methods for medical image classification. The hardware demonstration remains a simple proof of concept, which nevertheless marks progress towards unblocking a number of theoretical and practical bottlenecks for quantum machine learning.

We focus on quantum neural networks for medical image classification and implement two different methods: the first uses quantum circuits to assist the training and inference of classical neural networks by adapting the work in [3, 21] to be amenable to current quantum hardware; the second method builds on the recent work on quantum orthogonal neural networks [22], where quantum parametrized circuits are trained to enforce orthogonality of weight matrices. Orthogonality is an inherent property of quantum operations, which are described by unitary matrices, and it has been shown that it can improve the performance of deep neural nets and help avoid vanishing or exploding gradients [19, 29, 32].

The datasets used to demonstrate the utility of the two different quantum techniques come from the MedMNIST series [33]. Two very different imaging modalities, the RetinaMNIST dataset of retina fundus images [1] and the PneumoniaMNIST dataset of chest X-rays [23], are chosen. The same techniques can also be easily adapted to be tested on all other datasets from the MedMNIST.

The hardware experiments, both for training the quantum neural networks and for forward inference, have been performed on IBM quantum hardware. More specifically, 5-qubit experiments have been performed on three different machines with 5, 7, and 16 qubits, and 9-qubit experiments have been performed on a 16-qubit machine. The IBM quantum hardware is based on superconducting qubits, one of the two most advanced technologies for quantum hardware right now along with trapped ions. We also performed the same training and inference experiments using classical methods as well as simulators of the quantum neural networks for benchmarking the accuracy of the quantum methods.

Our results show quantum neural networks could be used as an alternative to classical neural networks for medical image classification. Results of simulation of the quantum neural networks provide similar accuracy to the classical ones. From the hardware experiments, we see that for a number of classification tasks the quantum neural networks can indeed be trained to the same level of accuracy as their classical counterparts, while for more difficult tasks, quantum accuracies drop due to the quantum hardware limitations. The MedMNIST datasets provide good benchmarks for tracking the advances in quantum hardware, acting as a proxy for checking how well different hardware perform in classification tasks, and also for testing different classification algorithms and neural network architectures in the literature.

II. RESULTS

A. Datasets and pre-processing

In order to benchmark our quantum neural network techniques for medical images, we used datasets from MedMNIST, a collection of 10 pre-processed medical open datasets [33]. The collection has been standardized for classification tasks on lightweight 28×28 medical images of 10 different imaging modalities.

In this work we looked specifically at two different datasets. The first is PneumoniaMNIST [23], a dataset of pediatric chest X-ray images. The task is binary classification between pneumonia-infected and healthy chest X-rays. The second is RetinaMNIST, which is based on DeepDRiD [1], a dataset of retinal fundus images. The original task is ordinal regression on a 5-level grading of diabetic retinopathy severity, which has been adapted in this work to a binary classification task to distinguish between normal (class 0) and different levels of retinopathy (classes 1 to 4).

In the PneumoniaMNIST the training set consists of 4708 images (class 0: 1214, class 1: 3494) and the test set has 624 points (class 0: 234, class 1: 390). In the RetinaMNIST the training set has 1080 points (class 0: 486, classes 1–4: 594) and the test set has 400 points (class 0: 174, classes 1–4: 226). Note that in the RetinaMNIST we have considered class 0 (normal) versus classes {1, 2, 3, 4} together (different levels of retinopathy).

Though the image dimension of the MedMNIST is small compared to the original images, namely 784 pixels, one cannot load such data on the currently available quantum computers, and thus a standard dimensionality reduction pre-processing has been done with Principal Component Analysis to reduce the images to 4 or 8 dimensions. Such a pre-processing may indeed reduce the possible accuracy of both classical and quantum methods, but our goal is to benchmark such approaches on current quantum hardware and understand if and when quantum machine learning methods may become competitive.
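As an illustration of this pre-processing step, the following sketch (our own; the array names and the final unit-norm rescaling are assumptions, not the exact pipeline used in the experiments) flattens 28×28 MedMNIST-style images and projects them onto 4 or 8 principal components with the scikit-learn implementation of PCA.

```python
import numpy as np
from sklearn.decomposition import PCA

def reduce_images(x_train, x_test, n_components=8):
    """Flatten 28x28 grayscale images and project them onto the leading
    principal components, as for the 4- and 8-dimensional experiments."""
    flat_train = x_train.reshape(len(x_train), -1).astype(np.float64)  # (N, 784)
    flat_test = x_test.reshape(len(x_test), -1).astype(np.float64)

    pca = PCA(n_components=n_components)
    z_train = pca.fit_transform(flat_train)   # fit on the training set only
    z_test = pca.transform(flat_test)

    # Rescale each reduced sample to unit norm so it can later be used
    # directly as a unary amplitude encoding (our assumption).
    z_train /= np.linalg.norm(z_train, axis=1, keepdims=True)
    z_test /= np.linalg.norm(z_test, axis=1, keepdims=True)
    return z_train, z_test
```

Fitting PCA on the training set only and reusing the same projection for the test set keeps the pre-processing free of test-set information.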
B. Quantum data-loaders

Once the classical medical image datasets have been pre-processed, we need to find ways of loading the data efficiently into the quantum computer. We use exactly one qubit per feature and outline three different ways of performing what is called a unary amplitude encoding of the classical data points in the following section.

We use a two-qubit parametrized gate, called the Reconfigurable Beam Splitter (RBS) gate in [20], which has also appeared as the partial SWAP or fSIM gate, and which is defined as

\mathrm{RBS}(\theta) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & \sin\theta & 0 \\ 0 & -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \qquad (1)
circuit for some vector w, which in our case will be a row of the weight matrix. The parameters of the gates in the first part of the circuit are fixed and correspond to the input data points, while the parameters of the gates of the second part are updated as we update the weight matrix through a gradient descent computation. Note that in [20] the parallel loader was used, since the experiments were performed on a fully connected trapped-ion quantum computer, while here we use the semi-diagonal ones, due to the restricted connectivity of the superconducting hardware.

multiple measurements that form a complete operator basis on the Hilbert space of the system.

In short, in order to compute the inner product of two vectors x and w, the extra qubit, denoted as q0, is initialized with an X gate to |1⟩ and used as a control qubit through an RBS gate between the qubits q0 and q1 with θ = π/4. This way, when the qubit q0 is 1 nothing happens, and when it is 0, the quantum circuit described above, consisting of a quantum data loader for x and the adjoint quantum data loader for w, is performed. By performing a final RBS gate with θ = π/4 we end up with a final state of the form

\left(\frac{1}{2} - \frac{1}{2}\, w \cdot x\right) |e_1\rangle + |G\rangle \qquad (4)
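To make Eq. (4) concrete, here is a small Monte-Carlo sketch (our own; it assumes x and w are unit-norm, so that w·x ∈ [−1, 1] and the amplitude (1 − w·x)/2 on |e1⟩ is non-negative). The probability of observing the outcome e1 is then ((1 − w·x)/2)², so the inner product can be estimated from the observed frequency; the sign analysis used for the actual experiments is more careful than this.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two unit-norm vectors whose inner product we want to estimate.
x = np.array([0.6, 0.8, 0.0, 0.0])
w = np.array([0.0, 0.6, 0.8, 0.0])
true_ip = x @ w

# By Eq. (4), the amplitude on |e1> is (1 - w.x)/2, so the probability
# of that measurement outcome is its square.
p_e1 = ((1.0 - true_ip) / 2.0) ** 2

shots = 10_000
counts_e1 = rng.binomial(shots, p_e1)               # simulated measurement shots
estimate = 1.0 - 2.0 * np.sqrt(counts_e1 / shots)   # invert the relation above

print(true_ip, estimate)  # the estimate converges to w.x as the number of shots grows
```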
form of a pyramid made of RBS gates is applied in order to implement an orthogonal layer of the neural network.

we restrict to the unary basis vectors. In other words, given an orthogonal weight matrix one can find the corresponding parameters of the quantum gates and vice versa. The mapping holds for the case of n × d matrices as well.

If we denote by x the classical data point that is loaded by the diagonal loader, and by W the orthogonal weight matrix corresponding to the pyramid circuit with rows W_j, then the state that comes out of the circuit in Figure 4 is

\sum_j (W_j \cdot x)\, |e_j\rangle \qquad (5)
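The following NumPy sketch emulates Eq. (5) on the unary subspace (our own toy construction; the exact pyramid layout of the RBS gates is specified in [22], and the layer ordering below is only illustrative). Each RBS gate acts as a Givens rotation on two neighbouring unary basis vectors, the product of these rotations is an orthogonal matrix W, and the output amplitudes are exactly the entries of Wx.

```python
import numpy as np

def givens(n, i, theta):
    """Unary-subspace action of an RBS gate on neighbouring basis vectors e_i, e_{i+1}."""
    g = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    g[i, i], g[i, i + 1], g[i + 1, i], g[i + 1, i + 1] = c, s, -s, c
    return g

def pyramid(angles, n):
    """Compose the rotations of a pyramid-like circuit into an n x n orthogonal matrix W."""
    w = np.eye(n)
    k = 0
    for layer in range(n):
        for i in range(layer % 2, n - 1, 2):   # brick-like layers of neighbouring pairs
            w = givens(n, i, angles[k]) @ w
            k += 1
    return w

n = 4
rng = np.random.default_rng(1)
angles = rng.uniform(0, 2 * np.pi, 6)   # 6 = n(n-1)/2 angles for n = 4
W = pyramid(angles, n)

x = rng.normal(size=n)
x /= np.linalg.norm(x)                  # unary-encoded, unit-norm input

assert np.allclose(W @ W.T, np.eye(n))  # W is orthogonal
print(W @ x)                            # output amplitudes: one value W_j . x per unary vector e_j
```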
1 or 0 and the remaining qubits in the unary state e_j is given by

\Pr[0, e_j] - \Pr[1, e_j] = \frac{1}{4}\left(\frac{1}{\sqrt{n}} + W_j x\right)^2 - \frac{1}{4}\left(\frac{1}{\sqrt{n}} - W_j x\right)^2 = \frac{W_j x}{\sqrt{n}}.

Therefore, for each j, we can deduce both the amplitude and the sign of W_j x. More details about these calculations appear in [22].
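A short sketch of this read-out step (our own illustration, with made-up probabilities): given estimated probabilities for the joint outcomes (0, e_j) and (1, e_j), the signed value W_j x is recovered by taking the difference and rescaling by √n.

```python
import numpy as np

def recover_wx(p0, p1, n):
    """Recover the signed values W_j x from outcome probabilities,
    using Pr[0, e_j] - Pr[1, e_j] = W_j x / sqrt(n)."""
    return np.sqrt(n) * (np.asarray(p0) - np.asarray(p1))

# Example with n = 4 features; the frequencies below are hypothetical numbers.
p0 = [0.09, 0.15, 0.02, 0.05]   # observed frequency of outcome (0, e_j)
p1 = [0.04, 0.05, 0.06, 0.03]   # observed frequency of outcome (1, e_j)
print(recover_wx(p0, p1, n=4))  # signed estimates of W_j x, here [0.1, 0.2, -0.08, 0.04]
```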
Let us provide some remarks about these orthogonal neural networks. First, the quantum circuits have only linear depth and thus one can use a quantum computer to perform inference on the datasets in only a linear number of steps. Of course, one can imagine combining such quantum orthogonal layers with other types of quantum layers, in the form of more general parametrized quantum circuits, thus creating more advanced quantum neural network architectures. Second, in [22] an efficient classical algorithm for training such orthogonal neural networks is presented that takes the same asymptotic time as training normal dense neural networks, while previously used training algorithms were based on singular value decompositions that have a much worse complexity. The new training of the orthogonal neural network is done with gradient descent in the space of the angles of the gates of the quantum circuit and not on the elements of the weight matrix. We denote this training algorithm by QPC, after the quantum pyramid circuit which inspired its creation. Note here that the fact that the optimization is performed in the angle landscape, and not in the weight landscape, can produce very different and at times better models, since these landscapes are not the same. This is indeed corroborated by our experimental results, as we will discuss below.
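To illustrate what optimizing in the angle landscape means, here is a small, purely classical toy sketch (our own; it uses a finite-difference gradient and an illustrative gate layout, not the QPC backpropagation rules derived in [22]). The orthogonal layer is parametrized by the Givens-rotation angles of a pyramid-like circuit, and gradient descent updates these angles directly, so the weight matrix W(θ) remains exactly orthogonal at every step.

```python
import numpy as np

def orthogonal_layer(angles, n):
    """Orthogonal matrix W(theta) built from Givens rotations (illustrative layout)."""
    w = np.eye(n)
    k = 0
    for layer in range(n):
        for i in range(layer % 2, n - 1, 2):
            c, s = np.cos(angles[k]), np.sin(angles[k])
            g = np.eye(n)
            g[i, i], g[i, i + 1], g[i + 1, i], g[i + 1, i + 1] = c, s, -s, c
            w = g @ w
            k += 1
    return w

def loss(angles, X, Y, n):
    # Mean squared error of the orthogonal layer on a toy regression task.
    return np.mean((X @ orthogonal_layer(angles, n).T - Y) ** 2)

n, n_angles = 4, 6                      # 4 features, 6 angles in the layout above
rng = np.random.default_rng(0)
X = rng.normal(size=(32, n))
X /= np.linalg.norm(X, axis=1, keepdims=True)     # unit-norm inputs (unary encoding)
Y = X @ orthogonal_layer(rng.uniform(0, 2 * np.pi, n_angles), n).T  # hidden target map

angles = rng.uniform(0, 2 * np.pi, n_angles)
lr, eps = 0.1, 1e-6
for step in range(300):
    base = loss(angles, X, Y, n)
    grad = np.zeros_like(angles)
    for k in range(n_angles):           # finite-difference gradient in the angle landscape
        shifted = angles.copy()
        shifted[k] += eps
        grad[k] = (loss(shifted, X, Y, n) - base) / eps
    angles -= lr * grad                 # update the angles, never the entries of W directly
print(loss(angles, X, Y, n))            # W(angles) is exactly orthogonal at every step
```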
E. Hardware demonstration
The other two machines that were used were the 7-qubit ibmq_casablanca and the 5-qubit ibmq_bogota. The quantum hardware was accessed through the IBM Quantum Experience for Business cloud service.

2. Quantum software

The quantum software development was performed using tools from the QC Ware Forge platform, including the quasar language, the data loader and the inner product estimation procedures, as well as the quantum pyramid circuits. The final circuits were translated into qiskit circuits that were then sent to the IBM hardware.

The datasets were downloaded from the MedMNIST repository [33] and pre-processed with the sklearn implementation of PCA.

For benchmarking as accurately as possible against classical fully-connected neural networks, we used the code from [28] for training classical neural networks, and for the quantum-assisted neural networks we adapted the code to use a quantum procedure for the dot product computations. For the orthogonal neural networks, we first designed a new training algorithm for quantum orthogonal neural networks, based on [22], and we also developed code for classical orthogonal neural networks based on singular value decomposition, following [19].

3. Optimizations for the hardware demonstration

4. Experimental results

We tested our methods using a number of different datasets, classification tasks, and architectures that we detail below, and we performed both simulations and hardware experiments.

FIG. 9: Results of experiments for the quantum-assisted neural network.

In Figures 9 and 10 and Table 2, we show our results, where we provide the AUC (area under curve) and ACC (accuracy) for all different types of neural network experiments, for both the training and test sets, for the Pneumonia and Retina datasets.

As general remarks from these results, one can see that for the PneumoniaMNIST dataset, both the AUC and ACC are quite close for all different experiments, showing that the quantum simulations and the quantum hardware experiments reach the performance of the classical neural networks. We also note that the quantum-assisted neural networks achieved somewhat higher accuracies than the orthogonal neural networks. For the RetinaMNIST dataset, the experiments with 8-dimensional data achieve higher AUC and ACC than the ones with 4-dimensional data. The quantum simulations achieve similar performance to the classical ones, while for the case where both the training and the inference were performed on a quantum computer, a drop in performance is recorded, due to the noise present in the hardware and the higher difficulty of the dataset.

We provide now a more detailed description of the per-
TABLE I: Results of our experiments on Pneumonia and Retina datasets, reported for the train set and the test set,
both using the AUC and ACC metrics. Methods are quantum-assisted neural networks (qNN) or quantum
orthogonal neural network (qOrthNN) with different layer architectures. The training and inference are done
classically via standard feedforward/backpropagation (CLA), or in a quantum-assisted way both on a quantum
simulator (SIM) and on quantum hardware (QHW) for the qNN; and with the Singular-Value Bounded algorithm
(SVB), or with a quantum pyramid circuit algorithm both on a quantum simulator (QPC-SIM) or on quantum
hardware (QPC-QHW) for the qOrthNN. The experiments that do not involve quantum hardware have been
repeated ten times; the mean values appear in the table and the standard deviation is in most cases ±0.01 and up
to ±0.04.
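For reference, AUC, ACC and confusion matrices such as those reported here can be computed with standard scikit-learn calls; the sketch below (our own, with made-up labels and scores) assumes the score is the estimated value of the first output node of the network.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, accuracy_score, confusion_matrix

y_true = np.array([0, 1, 1, 0, 1, 0, 1, 1])                    # ground-truth binary labels (toy values)
y_score = np.array([0.2, 0.8, 0.6, 0.4, 0.9, 0.1, 0.3, 0.7])   # first-node activations (toy values)
y_pred = (y_score >= 0.5).astype(int)                           # threshold at 0.5 for hard predictions

print("AUC:", roc_auc_score(y_true, y_score))
print("ACC:", accuracy_score(y_true, y_pred))
print("Confusion matrix:\n", confusion_matrix(y_true, y_pred))
```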
III. METHODS
In Figures 17, 18, and 19 we provide the AUC curves, ACC and confusion matrices for some more of the experiments we performed, including both the PneumoniaMNIST and RetinaMNIST datasets, both quantum-assisted and orthogonal neural networks, and for training and inference on simulators or quantum hardware.

We also present here an analysis of the quantum circuits we used in order to perform the quantum-assisted neural networks, namely the quantum inner product estimation circuits, which are useful for other applications beyond training neural networks.

In particular, we present inference results from the 624 data points in the test set of the PneumoniaMNIST and the 400 data points in the test set of the RetinaMNIST, both with the [4,4,2] and [8,4,2] quantum-assisted neural networks, and we check the estimated value of the first node in the final layer of the neural network (which corresponds to the first class of the binary
V. DATA AVAILABILITY

The data supporting the findings is available from the corresponding author upon reasonable request.

VIII. COMPETING INTERESTS

The authors declare no competing interests.
[1] DeepDR Diabetic Retinopathy Image Dataset (DeepDRiD), "The 2nd diabetic retinopathy – grading and image quality estimation challenge". https://isbi.deepdr.org/data.html, 2020.
[2] Amira Abbas, David Sutter, Christa Zoufal, Aurélien Lucchi, Alessio Figalli, and Stefan Woerner. The power of quantum neural networks. Nature Computational Science, 1(6):403–409, 2021.
[3] J. Allcock, C.Y. Hsieh, I. Kerenidis, and S. Zhang. Quantum algorithms for feedforward neural networks. ACM Transactions on Quantum Computing, 1(1):1–24, 2020.
[4] Kerstin Beer, Dmytro Bondarenko, Terry Farrelly, Tobias J. Osborne, Robert Salzmann, Daniel Scheiermann, and Ramona Wolf. Training deep quantum neural networks. Nature Communications, 11(1):1–6, 2020.
[5] Stan Benjamens, Pranavsingh Dhunnoo, and Bertalan Mesko. The state of artificial intelligence-based FDA-approved medical devices and algorithms: an online database. npj Digital Medicine, 3, Article 118, 2020.
[6] Kishor Bharti, Alba Cervera-Lierta, Thi Ha Kyaw, Tobias Haug, Sumner Alperin-Lea, Abhinav Anand, Matthias Degroote, Hermanni Heimonen, Jakob S. Kottmann, Tim Menke, et al. Noisy intermediate-scale quantum (NISQ) algorithms. arXiv preprint arXiv:2101.08448, 2021.
[7] William Cappelletti, Rebecca Erbanni, and Joaquín Keller. Polyadic quantum classifier. arXiv:2007.14044, 2020.
[8] M. Cerezo, A. Sone, L. Cincio, and P. Coles. Barren plateau issues for variational quantum-classical algorithms. Bulletin of the American Physical Society, 65, 2020.
[9] M. Cerezo, Akira Sone, Tyler Volkoff, Lukasz Cincio, and Patrick J. Coles. Cost-function-dependent barren plateaus in shallow quantum neural networks. arXiv:2001.00550, 2020.
[10] Shouvanik Chakrabarti, Huang Yiming, Tongyang Li, Soheil Feizi, and Xiaodi Wu. Quantum Wasserstein generative adversarial networks. In Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc., 2019.
[11] Iris Cong, Soonwon Choi, and Mikhail D. Lukin. Quantum convolutional neural networks. Nature Physics, 15, 2019.
[12] Iris Cong, Soonwon Choi, and Mikhail D. Lukin. Quantum convolutional neural networks. Nature Physics, 15(12):1273–1278, 2019.
[13] Brian Coyle, Daniel Mills, Vincent Danos, and Elham Kashefi. The Born supremacy: quantum advantage and training of an Ising Born machine. npj Quantum Information, 6(1):1–11, 2020.
[14] Edward Farhi and Hartmut Neven. Classification with quantum neural networks on near term processors. arXiv:1802.06002, 2018.
[15] Hector Ivan García-Hernandez, Raymundo Torres-Ruiz, and Guo-Hua Sun. Image classification via quantum machine learning. arXiv:2011.02831, 2020.
[16] Edward Grant, Marcello Benedetti, Shuxiang Cao, Andrew Hallam, Joshua Lockhart, Vid Stojevic, Andrew G. Green, and Simone Severini. Hierarchical quantum classifiers. npj Quantum Information, 4, arXiv:1804.03680, 2018.
[17] Nikodem Grzesiak, Reinhold Blümel, Kenneth Wright, Kristin M. Beck, Neal C. Pisenti, Ming Li, Vandiver Chaplin, Jason M. Amini, Shantanu Debnath, Jwo-Sy Chen, and Yunseong Nam. Efficient arbitrary simultaneously entangling gates on a trapped-ion quantum computer. Nature Communications, 11, 2020.
[18] Vojtech Havlicek, Antonio D. Córcoles, Kristan Temme, Aram W. Harrow, Abhinav Kandala, Jerry M. Chow, and Jay M. Gambetta. Supervised learning with quantum enhanced feature spaces. Nature, 567, arXiv:1804.11326, 2018.
[19] Kui Jia, Shuai Li, Yuxin Wen, Tongliang Liu, and Dacheng Tao. Orthogonal deep neural networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019.
[20] S. Johri, S. Debnath, A. Mocherla, A. Singh, A. Prakash, J. Kim, and I. Kerenidis. Nearest centroid classification on a trapped ion quantum computer. npj Quantum Information (to appear), arXiv:2012.04145, 2021.
[21] I. Kerenidis, J. Landman, and A. Prakash. Quantum algorithms for deep convolutional neural networks. Eighth International Conference on Learning Representations (ICLR), 2019.
[22] Iordanis Kerenidis, Jonas Landman, and Natansh Mathur. Classical and quantum algorithms for orthogonal neural networks. arXiv:2106.07198, 2021.
[23] Daniel S. Kermany, Michael Goldbaum, et al. Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell, 172(5):1122–1131.e9, 2018.
[24] Bobak Toussi Kiani, Agnes Villanyi, and Seth Lloyd. Quantum medical imaging algorithms. arXiv:2004.02036, 2020.
[25] Saurabh Kumar, Siddharth Dangwal, and Debanjan Bhowmik. Supervised learning using a dressed quantum network with "super compressed encoding": Algorithm and quantum-hardware-based implementation. arXiv:2007.10242, 2020.
[26] Kosuke Mitarai, Makoto Negoro, Masahiro Kitagawa, and Keisuke Fujii. Quantum circuit learning. Physical Review A, 98(3):032309, 2018.
[27] Kouhei Nakaji and Naoki Yamamoto. Quantum semi-supervised generative adversarial network for enhanced data classification. arXiv:2010.13727, 2020.
[28] Michael A. Nielsen. Neural networks and deep learning. Determination Press, 2015.
[29] Benjamin Nosarzewski. Deep orthogonal neural networks. 2018.
[30] Sergi Ramos-Calderer, Adrián Pérez-Salinas, Diego García-Martín, Carlos Bravo-Prieto, Jorge Cortada, Jordi Planagumà, and José I. Latorre. Quantum unary approach to option pricing. arXiv:1912.01618, 2019.
[31] K. Sharma, M. Cerezo, L. Cincio, and P.J. Coles. Trainability of dissipative perceptron-based quantum neural networks. arXiv preprint arXiv:2005.12458, 2020.
[32] Jiayun Wang, Yubei Chen, Rudrasis Chakraborty, and Stella X. Yu. Orthogonal convolutional neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 11505–11515, 2020.
[33] Jiancheng Yang, Rui Shi, and Bingbing Ni. MedMNIST classification decathlon: A lightweight AutoML benchmark for medical image analysis. arXiv preprint arXiv:2010.14925, 2020.
[34] Xiaohua Zhai, Alexander Kolesnikov, Neil Houlsby, and Lucas Beyer. Scaling vision transformers. arXiv:2106.04560, 2021.