Daniel Silver, Tirthak Patel, Aditya Ranjan, Harshitta Gandhi, William Cutler, Devesh Tiwari
Northeastern University
{silver.da, patel.ti, ranjan.ad, gandhi.ha, cutler.wi, d.tiwari}@northeastern.edu
SliQ's design demonstrates how to exploit the superposition and entanglement properties of quantum computing systems for similarity detection. It builds a resource-efficient training pipeline by creating interwoven, entangled input pairs on a VQC, and it applies new robust methods for the

[Figure 1: a Variational Quantum Circuit (VQC) with parameterized R3(p1, ..., p9) gates on qubits Q1-Q3; the qubit measurements drive iterative optimization of the quantum gate parameters on a classical machine.]
Figure 2: The baseline design inputs one image at a time, requiring three separate runs for A, P, and N.

Figure 3: SliQ's procedure of combining A-P and A-N inputs into two runs, interweaving their feature space, and updating the variational structure to leverage the properties of quantum superposition and entanglement to reduce the number of runs and generate better results.
for all three inputs (Fig. 2). The amplitude embedding procedure embeds classical data as quantum data in a Hilbert space. Although recent prior works have utilized principal component analysis (PCA) prior to amplitude embedding for feature encoding (Silver, Patel, and Tiwari 2022), the baseline does not employ PCA because higher performance is observed when keeping features intact and padding with 0s as necessary to maintain the full variance of the features. The second component is to feed these encoded features to a VQC for training and to optimize the VQC parameters to minimize the training loss (Fig. 2). The training loss is estimated by calculating the distance between the projections of the inputs on a 2-D space; the projection is obtained by measuring two qubits at the end of the VQC circuit (this is the third and final component). The loss is calculated as the squared distance between the anchor projection, A, and the positive projection, P, minus the squared distance between the anchor projection and the negative projection, N. This is an L2 variant of the triplet embedding loss and is formally defined as

L2 = {(A_x - P_x)^2 + (A_y - P_y)^2} - {(A_x - N_x)^2 + (A_y - N_y)^2}    (1)

We note that the effectiveness of training can be adapted to any choice of dimension and shape of the projection space (a 2-D square box bounded between -1 and 1, in our case) as long as the choice is consistent among all input types (Anchor, Positive, and Negative inputs). A more critical feature is the repeating layers of the VQC, which the baseline design chooses to be the same as other widely-used VQCs to make it competitive (Chicco 2021; Li, Kan, and He 2020).

SliQ: Key Design Elements

SliQ builds off the baseline design and introduces multiple novel design aspects to mitigate the limitations of the baseline design. First, SliQ introduces the concept of training an input pair in the same run to leverage the superposition and entanglement properties of quantum computing systems.

Input Feature Entanglement and Interweaving. Recall that in the baseline design, each image type (Anchor, Positive, Negative) traverses the variational quantum circuit one by one. The corresponding measurements at the end of each image run produce coordinates on a 2-D plane that allow us to calculate the similarity distance between the A-P and A-N inputs, and hence the loss that training aims to minimize over multiple runs. Unfortunately, this procedure requires performing three runs before the loss for a single (A, P, N) triplet input can be estimated, which is resource-inefficient.

SliQ's design introduces multiple new elements to address this resource inefficiency. The first idea is to create two training pairs (Anchor-Positive and Anchor-Negative), where each pair is trained in a single run (demonstrated visually in Fig. 3). This design element improves the resource efficiency of the training process: instead of three runs (one for each image type), SliQ requires only two runs. Note that, theoretically, it is possible to combine all three input types and perform combined amplitude embedding. However, in practice, this process is not effective because it be-
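The L2 variant of the triplet embedding loss in Eq. (1) can be sketched in a few lines of plain Python; the 2-D projection tuples below are hypothetical stand-ins for the two-qubit measurement results, not values from the paper.

```python
def triplet_l2_loss(anchor, positive, negative):
    """L2 variant of the triplet embedding loss (Eq. 1): the squared A-P
    distance minus the squared A-N distance on the 2-D projection plane."""
    ax, ay = anchor
    px, py = positive
    nx, ny = negative
    return ((ax - px) ** 2 + (ay - py) ** 2) - ((ax - nx) ** 2 + (ay - ny) ** 2)

# An anchor close to the positive and far from the negative yields a
# negative (i.e., favorable) loss, which training drives further down.
loss = triplet_l2_loss((0.1, 0.1), (0.2, 0.1), (-0.8, -0.9))
```

Minimizing this quantity pulls the A-P projections together while pushing the A-N projections apart, exactly as the text describes.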
Figure 4: While the baseline design has zero projection variance, SliQ has to take additional steps to mitigate it.

Figure 5: Overview of the design of SliQ including the steps of preparing the data for training, training the quantum network, and using the network for inference post-training.
[Figure 6 panel titles: Correlation = 0.63, 0.47, 0.46, 0.68, 0.65, 0.64; axis label: SliQ's Calculated Loss.]
Figure 6: SliQ's loss normalized against ground truth distance between color frequencies. The anchor image was compared to 100 images. Only the top 3 correlations are shown. The steep drop in correlation metric shows that the baseline is not effective.
Once trained, SliQ performs similarity detection by performing inference on new pairs of data. This inference can be used to identify the most similar samples to one another, the most distant samples, and even serve as a classifier if an unsupervised clustering algorithm is used to generate clusters.

Experimental Methodology

Training and Testing Datasets. SliQ is evaluated on NIH AIDS Antiviral Screen Data (Kramer, De Raedt, and Helma 2001), MNIST (Deng 2012), Fashion-MNIST (Xiao, Rasul, and Vollgraf 2017), and Flickr Landscape (Chen, Lai, and Liu 2018). These datasets are chosen because (1) they cover both unlabeled (e.g., Flickr Landscape) and labeled datasets (e.g., AIDS, MNIST, Fashion-MNIST), and (2) they represent different characteristics (e.g., medical dataset, images, handwritten characters) and have real-world use cases (e.g., AIDS detection). We note that the size of the Flickr Landscape dataset is ≈25× larger than the commonly used quantum machine learning image datasets of MNIST and Fashion-MNIST (Xiao, Rasul, and Vollgraf 2017; Silver, Patel, and Tiwari 2022). This presents additional scaling challenges that we mitigate with the scalable design of SliQ.

Flickr Landscape is an unlabeled dataset consisting of 4300 images of different landscapes spanning general landscapes, mountains, deserts, seas, beaches, islands, and Japan. The images are of different sizes, but are cropped to 80×80×3 for consistency; the center is kept intact with all color channels. This dataset is unlabeled and serves to show how SliQ performs on unlabeled data. NIH AIDS Antiviral Screen Data contains 50,000 samples of features alongside a label indicating the status of the AIDS virus ("CA" for confirmed active, "CM" for confirmed moderately active, and "CI" for confirmed inactive). The MNIST dataset is a grayscale handwritten digit dataset (Deng 2012) where we perform binary classification on '1's and '0's. The Fashion-MNIST dataset contains the same number of pixels and classes as MNIST, but the images are more complex in nature. In all datasets, 80% of the data is reserved for training and 20% for testing.

Experimental Framework. The environment for SliQ is Python3 with the PennyLane (Bergholm et al. 2018) and Qiskit (Aleksandrowicz et al. 2019) frameworks. Our quantum experiment evaluations are simulated classically, and the inference results collected on real IBM quantum machines are specified in the text. For setting up the datasets for testing and training, triplets are created in the form of an anchor image, a positive image, and a negative image.

For the unlabeled dataset, three images are selected at random. The color frequency is evaluated for each image on the R, G, and B channels individually, then placed into eight bins per channel. The 24 total bins for each image are used to establish a ground-truth similarity between the images: the first image is the anchor, the image closest in L1 norm to the anchor is set as the positive image, and the image further away is the negative image. Once the triplets are created, the pixels within the images are interwoven with one another. The image is then padded with 0s to the nearest power of 2 for the amplitude embedding process.

In the case of the labeled datasets, the anchor and positive are chosen to be from the same class, while the negative image is chosen at random from another class. For evaluation, an anchor and positive are passed into the embedding and the four dimensions are used in a Gaussian Mixture Model to form clusters, which are then evaluated for accuracy. We use a batch size of 30 for all experiments with a GradientDescentOptimizer and a learning rate of 0.01. We train for 500 epochs on a four-layer network. The size of the network changes based on the dataset used: a four-qubit circuit is used for the AIDS dataset, 14 qubits for Flickr Landscape, and 11 qubits for MNIST and Fashion-MNIST. For the baseline, one less qubit is used for all circuits, as it does not necessitate the additional qubit to store an additional sample on each run.

Competing Techniques. Our baseline scheme is the same as described earlier: it is a quantum analogue of a triplet network, where weights are shared and a single network is used for training and inference. Although SliQ is not designed for classification tasks on labeled data, we also provide a quantitative comparison with the state-of-the-art quantum machine learning classifiers: Projected Quantum Kernel (PQK) (published in Nature'2021) (Huang et al. 2021b) and Quilt (published in AAAI'2022) (Silver, Patel, and Tiwari 2022). The former, PQK, trains a classical network on quantum features generated by a specially designed quantum circuit; datasets are modified to have quantum features that show the quantum advantage in training. Because PQK is not feasible to implement on current quantum computers, we modify its architecture to use fewer intermediate layers, reducing the number of gates and the training time. The other comparison used, Quilt, is an ensemble of diverse quantum classifiers for multi-class classification (Silver, Patel, and Tiwari 2022).
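The triplet construction and input preparation described above can be sketched in stdlib-only Python. The helper names (`color_histogram`, `make_triplet`, `interweave_and_pad`) are illustrative stand-ins, not the paper's actual code; images are assumed to be lists of (R, G, B) tuples with 0-255 values.

```python
import math

def color_histogram(pixels, bins=8):
    """24-bin color signature: 8 bins per R, G, B channel (values 0-255)."""
    hist = [0] * (3 * bins)
    for r, g, b in pixels:
        hist[r * bins // 256] += 1
        hist[bins + g * bins // 256] += 1
        hist[2 * bins + b * bins // 256] += 1
    return hist

def l1(h1, h2):
    """L1 distance between two histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def make_triplet(img_a, img_b, img_c):
    """Anchor = first image; of the other two, the one closer to the anchor
    (L1 norm over the 24-bin histograms) is the positive, the other the negative."""
    ha, hb, hc = map(color_histogram, (img_a, img_b, img_c))
    if l1(ha, hb) <= l1(ha, hc):
        return img_a, img_b, img_c
    return img_a, img_c, img_b

def interweave_and_pad(x, y):
    """Alternate the features of the two inputs, then zero-pad to the next
    power of two so the vector fits the amplitude embedding."""
    merged = [v for pair in zip(x, y) for v in pair]
    size = 1 << max(1, math.ceil(math.log2(len(merged))))
    return merged + [0] * (size - len(merged))

# Example: interweave two 3-feature vectors and pad to length 8.
# interweave_and_pad([1, 2, 3], [4, 5, 6]) -> [1, 4, 2, 5, 3, 6, 0, 0]
```

The interweaving step places the two inputs' features in alternating amplitude positions, which is what lets a single run carry an A-P or A-N pair.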
Percentile   Baseline   SliQ
25th         -0.06      0.23
50th          0.05      0.36
75th          0.19      0.48
100th         0.63      0.68

Table 1: Spearman correlation results show that SliQ outperforms the baseline in similarity detection. SliQ's median Spearman correlation is 0.36 vs. the Baseline's 0.05.

[Figure 8 panels: Aids, MNIST, and Fashion-MNIST; y-axis: Accuracy (%); bars: Baseline, PQK, SliQ, and Quilt.]
Figure 8: SliQ performs competitively against the comparative classification techniques for all datasets, despite classification not being a learned objective of SliQ.
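The Spearman correlations reported in Table 1 can be computed with a minimal, stdlib-only sketch. This uses the no-ties rank formula rho = 1 - 6*sum(d_i^2)/(n*(n^2 - 1)); a library routine such as `scipy.stats.spearmanr` would be used in practice to handle ties.

```python
def spearman(xs, ys):
    """Spearman rank correlation via the no-ties formula.

    rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)), where d_i is the
    difference between the ranks of xs[i] and ys[i]."""
    def ranks(vs):
        order = sorted(range(len(vs)), key=lambda i: vs[i])
        r = [0] * len(vs)
        for rank, i in enumerate(order):
            r[i] = rank
        return r

    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Perfectly monotone rankings give rho = 1.0; reversed rankings give -1.0.
rho = spearman([1, 2, 3, 4], [10, 20, 30, 40])
```

A rank correlation is the right metric here because the similarity scores only need to order the candidate images the same way the ground-truth color distances do.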
Environment     Baseline   PQK     Quilt   SliQ
Simulation      60.8%      81.6%   51%     71.54%
Real Computer   64.4%      N/A     21.4%   68.8%

Table 2: SliQ's real quantum computer results for the AIDS dataset are consistent with the simulation results, showing that it is resilient to hardware noise due to its low number of parameters. The PQK column is denoted as N/A because its circuit is prohibitively deep to be feasible to run on error-prone NISQ computers (30× more parameters than SliQ).

entangled together throughout the entire circuit and will not be separated in the output unless a constraint is put in place to enforce this. This separability can be demonstrated in Fig. 9, where SliQ is compared to a trained version without the consistency loss. SliQ has more precise outputs throughout the entire CDF, evaluated over 1000 MNIST test images. Enforcing consistency amounts to an average decrease of 80% in projection variance when changing the order of the inputs, demonstrating the effectiveness of SliQ's projection invariance method. As shown in Fig. 9, interweaving increases robustness, leading to a decrease in projection variance.
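The projection-variance metric discussed above can be illustrated with a small sketch: each sample pair is projected twice, once in each input-slot order, and the variance between the two outputs is averaged. The `project` callable is a hypothetical stand-in for a trained circuit's 2-D projection, not the paper's code.

```python
def order_swap_variance(project, samples):
    """Average per-pair variance between projecting (a, b) and (b, a).

    `project` maps two inputs to an (x, y) projection. A result of zero
    means the projection is invariant to the order of its inputs."""
    total = 0.0
    for a, b in samples:
        x1, y1 = project(a, b)
        x2, y2 = project(b, a)
        mx, my = (x1 + x2) / 2, (y1 + y2) / 2
        # Variance of the two projections around their mean, per pair.
        total += ((x1 - mx) ** 2 + (x2 - mx) ** 2
                  + (y1 - my) ** 2 + (y2 - my) ** 2) / 2
    return total / len(samples)

# A projection built from symmetric functions is order-invariant,
# so its swap variance is exactly zero.
symmetric = lambda a, b: (a + b, a * b)
assert order_swap_variance(symmetric, [(1, 2), (3, 4)]) == 0.0
```

The reported 80% drop in projection variance corresponds to this quantity shrinking once the consistency constraint is enforced during training.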
Acknowledgements

We thank the anonymous reviewers for their constructive feedback. This work was supported in part by Northeastern University and NSF Award 2144540.

References

Aaronson, S. 2015. Read the fine print. Nature Physics, 11(4): 291–293.

Aleksandrowicz, G.; Alexander, T.; Barkoutsos, P.; Bello, L.; Ben-Haim, Y.; Bucher, D.; Cabrera-Hernández, F. J.; Carballo-Franquis, J.; Chen, A.; Chen, C.-F.; et al. 2019. Qiskit: An open-source framework for quantum computing. Accessed on: Mar, 16.

Beer, K.; Bondarenko, D.; Farrelly, T.; Osborne, T. J.; Salzmann, R.; Scheiermann, D.; and Wolf, R. 2020. Training deep quantum neural networks. Nature Communications, 11(1): 808.

Bergholm, V.; Izaac, J.; Schuld, M.; Gogolin, C.; Alam, M. S.; Ahmed, S.; Arrazola, J. M.; Blank, C.; Delgado, A.; Jahangiri, S.; et al. 2018. PennyLane: Automatic differentiation of hybrid quantum-classical computations. arXiv preprint arXiv:1811.04968.

Cerezo, M.; Arrasmith, A.; Babbush, R.; Benjamin, S. C.; Endo, S.; Fujii, K.; McClean, J. R.; Mitarai, K.; Yuan, X.; Cincio, L.; et al. 2021. Variational quantum algorithms. Nature Reviews Physics, 3(9): 625–644.

Chen, S. Y.-C.; Yang, C.-H. H.; Qi, J.; Chen, P.-Y.; Ma, X.; and Goan, H.-S. 2020. Variational quantum circuits for deep reinforcement learning. IEEE Access, 8: 141007–141024.

Chen, Y.; Lai, Y.-K.; and Liu, Y.-J. 2018. CartoonGAN: Generative adversarial networks for photo cartoonization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 9465–9474.

Chicco, D. 2021. Siamese Neural Networks: An Overview, 73–94. New York, NY: Springer US. ISBN 978-1-0716-0826-5.

Daley, A. J.; Bloch, I.; Kokail, C.; Flannigan, S.; Pearson, N.; Troyer, M.; and Zoller, P. 2022. Practical quantum advantage in quantum simulation. Nature, 607(7920): 667–676.

Deng, L. 2012. The MNIST database of handwritten digit images for machine learning research. IEEE Signal Processing Magazine, 29(6): 141–142.

Gichane, M. M.; Byiringiro, J. B.; Chesang, A. K.; Nyaga, P. M.; Langat, R. K.; Smajic, H.; and Kiiru, C. W. 2020. Digital Triplet Approach for Real-Time Monitoring and Control of an Elevator Security System. Designs, 4(2).

Guan, W.; Perdue, G.; Pesah, A.; Schuld, M.; Terashi, K.; Vallecorsa, S.; and Vlimant, J.-R. 2021. Quantum machine learning in high energy physics. Machine Learning: Science and Technology, 2(1): 011003.

Heidari, M.; Grama, A.; and Szpankowski, W. 2022. Toward Physically Realizable Quantum Neural Networks. Proceedings of the AAAI Conference on Artificial Intelligence, 36(6): 6902–6909.

Hoffer, E.; and Ailon, N. 2015. Deep metric learning using triplet network. In Similarity-Based Pattern Recognition: Third International Workshop, SIMBAD 2015, Copenhagen, Denmark, October 12-14, 2015, Proceedings 3, 84–92. Springer.

Huang, H.-Y.; Broughton, M.; Cotler, J.; Chen, S.; Li, J.; Mohseni, M.; Neven, H.; Babbush, R.; Kueng, R.; Preskill, J.; and McClean, J. R. 2021a. Quantum advantage in learning from experiments. Science, 376: 1182–1186.

Huang, H.-Y.; Broughton, M.; Mohseni, M.; Babbush, R.; Boixo, S.; Neven, H.; and McClean, J. R. 2021b. Power of data in quantum machine learning. Nature Communications, 12: 2631.

Hur, T.; Kim, L.; and Park, D. K. 2022. Quantum convolutional neural network for classical data classification. Quantum Machine Intelligence, 4(1): 3.

Johnson, J.; Douze, M.; and Jegou, H. 2021. Billion-Scale Similarity Search with GPUs. IEEE Transactions on Big Data, 7(03): 535–547.

Khairy, S.; Shaydulin, R.; Cincio, L.; Alexeev, Y.; and Balaprakash, P. 2020. Learning to Optimize Variational Quantum Circuits to Solve Combinatorial Problems. Proceedings of the AAAI Conference on Artificial Intelligence, 34(03): 2367–2375.

Koch, G.; Zemel, R.; Salakhutdinov, R.; et al. 2015. Siamese neural networks for one-shot image recognition. In ICML Deep Learning Workshop, volume 2. Lille.

Kramer, S.; De Raedt, L.; and Helma, C. 2001. Molecular feature mining in HIV data. In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 136–143.

Li, G.; Song, Z.; and Wang, X. 2021. VSQL: Variational Shadow Quantum Learning for Classification. Proceedings of the AAAI Conference on Artificial Intelligence, 35(9): 8357–8365.

Li, J.; and Kais, S. 2021. Quantum cluster algorithm for data classification. Materials Theory, 5(1): 6.

Li, P.; Li, Y.; Hsieh, C.-Y.; Zhang, S.; Liu, X.; Liu, H.; Song, S.; and Yao, X. 2020. TrimNet: learning molecular representation from triplet messages for biomedicine. Briefings in Bioinformatics, 22(4). Bbaa266.

Li, Y.; Kan, S.; and He, Z. 2020. Unsupervised Deep Metric Learning with Transformed Attention Consistency and Contrastive Clustering Loss. In Vedaldi, A.; Bischof, H.; Brox, T.; and Frahm, J.-M., eds., Computer Vision – ECCV 2020, 141–157. Cham: Springer International Publishing.

Liu, X.; Zhou, R.-G.; El-Rafei, A.; Li, F.-X.; and Xu, R.-Q. 2019. Similarity assessment of quantum images. Quantum Information Processing, 18(8): 1–19.

Liu, Y.-h.; Qi, Z.-d.; and Liu, Q. 2022. Comparison of the similarity between two quantum images. Scientific Reports, 12(1): 1–10.

Lloyd, S.; Schuld, M.; Ijaz, A.; Izaac, J.; and Killoran, N. 2020. Quantum embeddings for machine learning. arXiv preprint arXiv:2001.03622.
Lockwood, O.; and Si, M. 2020. Reinforcement Learning with Quantum Variational Circuit. Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, 16(1): 245–251.

Ma, Z.; Dong, J.; Long, Z.; Zhang, Y.; He, Y.; Xue, H.; and Ji, S. 2020. Fine-grained fashion similarity learning by attribute-specific embedding network. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, 11741–11748.

Myers, L.; and Sirois, M. J. 2004. Spearman correlation coefficients, differences between. Encyclopedia of Statistical Sciences, 12.

Nandy Pal, M.; Banerjee, M.; and Sarkar, A. 2022. Retrieval of exudate-affected retinal image patches using Siamese quantum classical neural network. IET Quantum Communication, 3(1): 85–98.

Patel, Z.; Merali, E.; and Wetzel, S. J. 2022. Unsupervised learning of Rydberg atom array phase diagram with Siamese neural networks. New Journal of Physics, 24(11): 113021.

Preskill, J. 2018. Quantum Computing in the NISQ era and beyond. Quantum, 2: 79.

Radha, S. K.; and Jao, C. 2022. Generalized quantum similarity learning. arXiv preprint arXiv:2201.02310.

Reimers, N.; and Gurevych, I. 2019. Sentence-BERT: Sentence embeddings using Siamese BERT-networks. arXiv preprint arXiv:1908.10084.

Roy, S.; Harandi, M.; Nock, R.; and Hartley, R. 2019. Siamese Networks: The Tale of Two Manifolds. In 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 3046–3055.

Schroff, F.; Kalenichenko, D.; and Philbin, J. 2015. FaceNet: A unified embedding for face recognition and clustering. In 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 815–823.

Schuld, M.; and Petruccione, F. 2018. Supervised learning with quantum computers, volume 17. Springer.

Silver, D.; Patel, T.; and Tiwari, D. 2022. QUILT: Effective Multi-Class Classification on Quantum Computers Using an Ensemble of Diverse Quantum Classifiers. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, 8324–8332.

Tannu, S. S.; and Qureshi, M. K. 2019. Not all qubits are created equal: A case for variability-aware policies for NISQ-era quantum computers. In Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 987–999.

Tiwari, P.; and Melucci, M. 2019. Binary Classifier Inspired by Quantum Theory. Proceedings of the AAAI Conference on Artificial Intelligence, 33(01): 10051–10052.

Utkin, L.; Meldo, A.; Kovalev, M.; and Kasimov, E. 2019. An ensemble of triplet neural networks for differential diagnostics of lung cancer. In 2019 25th Conference of Open Innovations Association (FRUCT), 346–352. IEEE.

Veit, A.; Belongie, S.; and Karaletsos, T. 2017. Conditional similarity networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 830–838.

Wan, W.; Gao, Y.; and Lee, H. J. 2019. Transfer deep feature learning for face sketch recognition. Neural Computing and Applications, 31(12): 9175–9184.

Wei, Y.; Cheng, Z.; Yu, X.; Zhao, Z.; Zhu, L.; and Nie, L. 2019. Personalized hashtag recommendation for micro-videos. In Proceedings of the 27th ACM International Conference on Multimedia, 1446–1454.

Xiao, H.; Rasul, K.; and Vollgraf, R. 2017. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747.