
PHYSICAL REVIEW A 103, 032406 (2021)

Time-dependent atomic magnetometry with a recurrent neural network


Maryam Khanahmadi1,2,* and Klaus Mølmer3,†
1Department of Physics, Institute for Advanced Studies in Basic Sciences, Zanjan 45137, Iran
2Department of Physics and Astronomy, University of Aarhus, Ny Munkegade 120, DK-8000 Aarhus C, Denmark
3Center for Complex Quantum Systems, Department of Physics and Astronomy, University of Aarhus, Ny Munkegade 120, DK-8000 Aarhus C, Denmark

(Received 3 September 2020; revised 7 December 2020; accepted 24 February 2021; published 10 March 2021)

We propose to employ a recurrent neural network to estimate a fluctuating magnetic field from continuous optical Faraday rotation measurements on an atomic ensemble. We show that a neural network with an encoder-decoder architecture can process measurement data and learn an accurate map between recorded signals and the time-dependent magnetic field. The performance of this method is comparable to that of Kalman filters, while it is free of the modeling assumptions that restrict their application to particular measurement schemes and physical systems.

DOI: 10.1103/PhysRevA.103.032406

I. INTRODUCTION

Quantum systems can be affected by external perturbations such as electric or magnetic fields contributing to the system Hamiltonian, or by thermal baths that contribute damping and decoherence processes. In quantum metrology, the quantum system is probed in order to estimate these perturbations, and a host of results have been obtained about how one can best extract the relevant information from the measurement outcomes [1,2]. Special attention is currently devoted to the use of nonclassical states, including squeezing and entanglement, and of quantum many-body states using, e.g., their critical behavior around phase transitions [3–8]. If the task is to estimate a time-dependent perturbation, one should apply continuous or repeated measurements on the quantum probe, and thus follow the dynamical evolution of its quantum state conditioned on the time series of random measurement outcomes. In the absence of unknown perturbations, this dynamics is described by the theory of quantum trajectories [9]. Before their introduction in quantum optics, the pertaining stochastic master equations were derived by Belavkin and referred to as filter equations [10], and Belavkin also formulated hybrid classical-quantum filter equations [11], which at the same time update the quantum state and the probabilistic information about unknown parameters contributing to the system dynamics [12]. In short, these filters employ a combination of Born's rule, which provides the probability of measurement outcomes conditioned on the candidate quantum state, and Bayes's rule for the update of prior probabilities for candidate parameter values, conditioned on the given outcome, p_Bayes(param|outcome) ∝ p_Born(outcome|param) × p_prior(param). However, for many experiments, filter theory, because of its modeling of the system dynamics, may not be valid or practically feasible for the inclusion of complicating features such as added noise, finite detector bandwidths, saturation, and dead time (see, e.g., [13]).

In this paper we study an alternative approach, namely, the use of machine learning (ML), to estimate time-dependent perturbations from measurement data. ML techniques provide powerful tools with a remarkable ability to recognize and characterize complicated, noisy, and correlated data. ML has made impressive achievements in numerous difficult tasks, such as data mining, classification, and pattern recognition [14]. Neural networks (NNs), as one of the modern tools of ML, have enabled great progress in modeling complicated tasks such as language translation and image and speech recognition [15–18]. The latter tasks have several similarities with the estimation of a time-dependent perturbation from time-dependent measurement data; this motivates the present paper. Note that the NN does not employ any knowledge about the properties of the probe system, how it interacts with the perturbation, or how the data are obtained and processed by the laboratory hardware. We merely assume that we can expose the NN to large amounts of training data obtained under equivalent physical conditions. As an example, we provide measurement data obtained by quantum trajectory simulation of the probing of an atomic ensemble subject to a magnetic field that fluctuates in time according to an Ornstein-Uhlenbeck process [19].

NNs and their combinations in so-called deep neural networks (DNNs) [20] have been used in quantum physics to model quantum dynamics [21–26], phase and parameter estimation [27–30], and quantum tomography [31,32], and they are increasingly employed in calibration and feedback tasks in quantum experiments [33–36]. We use a deep neural network for the estimation of a stochastically varying perturbation such as a magnetic field interacting with a continuously monitored atomic system.

The paper is structured as follows: In Sec. II we describe the atomic system and our procedure to simulate the fluctuating perturbation and a realistic measurement signal. In Sec. III we describe the structure and motivate our choice of DNN to analyze time-dependent experimental data. In Sec. IV we present quantitative results and discuss the performance of the DNN for the estimation of the magnetic field. In Sec. V we conclude and present an outlook.

*m.khanahmadi@phys.au.dk
†moelmer@phys.au.dk





FIG. 1. (a) Schematic of the physical model. The atomic spins are polarized along the x direction and interact with a stochastic magnetic field directed along the y axis. The interaction of the system and the magnetic field causes a Larmor rotation towards the z axis. The system is probed by a laser beam propagating along z, and the polarization rotation, represented by a canonical field quadrature variable x_ph, is measured (see text). Panel (b) shows an example of such a simulated Faraday rotation signal x_ph. Panel (c) shows the noisy magnetic field B(t) that causes the atomic precession and gives rise to the optical signal in panel (b). We note that the data in panel (b) are compatible with (minus) the integral over time of the data in panel (c).

II. PHYSICAL MODEL

The experimental scheme is presented in Fig. 1(a). We consider atoms with two Zeeman ground sublevels, equivalent to a spin-1/2 particle, which is polarized along the x axis. A large ensemble of such atoms is described by a collective spin operator (ℏ = 1), J = (1/2) Σ_j σ_j, where σ_j is the vector of Pauli matrices describing the individual two-level atoms.

We assume that the atoms are subject to a time-dependent magnetic field directed along the y direction, which causes a Larmor rotation of the atomic spin toward the z axis. In our example, the magnetic field fluctuates stochastically as given by an Ornstein-Uhlenbeck process [19],

    dB(t) = −γ_b B(t) dt + √σ_b dW_b,   (1)

with steady-state zero mean and variance σ_b/2γ_b. The noisy Wiener increment dW_b has a Gaussian distribution with mean zero and variance dt.

We note that the rate γ_b yields the time scale over which the magnetic field loses its correlation with previous values and hence estimates lose their credibility. A random realization of Eq. (1) is shown in Fig. 1(c).

For a large number of spin-polarized atoms N_at, we may employ the Holstein-Primakoff approximation [37] and introduce effective canonical position and momentum variables as x_at = J_y/√⟨J_x⟩ and p_at = J_z/√⟨J_x⟩, respectively. If the depolarization during the interaction is small, the system retains its spin polarization along the x axis, ⟨J_x⟩ = ℏN_at/2. The interaction Hamiltonian between the atoms and the magnetic field can then be written H_at−B(t) = βB(t)J_y = μB(t)x_at(t), and the atomic momentum quadrature, representing the z component of the atomic spins, evolves as p_at(t) → p_at(t − τ) − μτB(t), where β is the magnetic moment and μ ∝ β√N_at.

To probe the system, we consider an off-resonant laser beam which propagates along the z axis and is linearly polarized along the x direction. Any short segment of this field, during a time interval τ, is represented by the Stokes vector of operators S with the x component ⟨S_x⟩ = N_ph/2, where N_ph is the number of photons. The light field can be decomposed as a superposition of orthogonal σ+ and σ− circularly polarized components, which interact dispersively with the atoms and acquire different phase shifts depending on the state occupied by the atoms. A difference in the phase shift accumulated by the two field components during their passage of the atomic ensemble translates into a (Faraday) rotation of the optical polarization. This rotation is proportional to the population difference between the two atomic sublevels, and hence to the collective p_at ∝ J_z component, which, in turn, evolves in time according to the value of the magnetic field. In the absence of the quantum-mechanical uncertainty of the spin observable and the random photodetection noise, the magnetic field would be simply given by (minus) the time derivative of the Faraday signal.

To properly model the atom-light interaction and the optical detection process, we discretize the continuously evolving dynamics in small time intervals τ. During each short time interval, we consider canonical field variables x_ph = S_y/√⟨S_x⟩ and p_ph = S_z/√⟨S_x⟩, and write the interaction Hamiltonian of the atoms and light as H_at−ph = κ p_at(t)p_ph(t)/√τ, where κ ∝ √(N_at Φ) is the atom-light coupling strength (Φ = N_ph/τ is the photon flux). This yields the effective total Hamiltonian

    H(t) = κ p_at(t)p_ph(t)/√τ + μB(t)x_at(t).   (2)
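For concreteness, the generation of such training data can be sketched in a few lines of code. The following Python snippet is our own minimal illustration, not the authors' published code: it integrates Eq. (1) by the Euler-Maruyama rule, rotates the mean atomic quadrature ⟨p_at⟩ by the field, and draws each measured x_ph from the Gaussian distribution with mean κ√τ⟨p_at⟩ and variance 1/2 described in the following paragraphs; the conditional measurement back-action on the atomic state is neglected in this simplified sketch.

    import numpy as np

    # Minimal sketch of the data generation (our illustration; back-action on
    # the conditional atomic state is neglected). Parameter values follow Fig. 3.
    rng = np.random.default_rng(seed=1)

    tau, T = 0.01, 1.0                    # time step and total time (ms)
    n_steps = int(round(T / tau)) + 1     # N_T = 101 samples per record
    gamma_b, sigma_b = 1.0, 2.0           # OU rate (1/ms) and diffusion (pT^2/ms)
    kappa, mu = np.sqrt(18.0), 90.0       # probing and coupling strengths

    B = np.zeros(n_steps)                 # magnetic field B(t), Eq. (1)
    x_ph = np.zeros(n_steps)              # recorded Faraday signal
    p_at = 0.0                            # mean atomic quadrature <p_at>

    B[0] = rng.normal(0.0, np.sqrt(sigma_b / (2 * gamma_b)))  # steady-state draw
    for k in range(1, n_steps):
        dW = rng.normal(0.0, np.sqrt(tau))                    # Wiener increment
        B[k] = B[k - 1] - gamma_b * B[k - 1] * tau + np.sqrt(sigma_b) * dW
        p_at -= mu * tau * B[k]                               # Larmor rotation
        x_ph[k] = rng.normal(kappa * np.sqrt(tau) * p_at, np.sqrt(0.5))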


After each τ, the observable x_ph of the incident probe field is displaced by an amount proportional to the atomic spin observable, and when measured we obtain noisy data as depicted in Fig. 1(b). The subsequent segment of the probing beam, for the next time period τ, is treated by a new set of observables (x_ph, p_ph), and the succession of interactions and measurements thus constitutes the continuous measurement scheme. In our analysis below, we shall employ signals obtained by simulation of the magnetic field according to Eq. (1), and of the optical measurement according to the distribution of x_ph due to Eq. (2) (i.e., a Gaussian with mean value κ√τ⟨p_at⟩ and variance 1/2).

Both the optical fields and the atomic collective spin are quantum degrees of freedom, so the measurements are subject to randomness, and the systems are subject to measurement back-action. These are important components that govern the dynamics and the measurement signal and thus the ability to estimate the time-dependent magnetic field. Estimating the magnetic field by the numerical derivative of the noisy Faraday signal would yield unphysically large fluctuations, while smoothed derivatives may suppress too much information and overlook real fluctuations of the field. A Bayesian analysis that incorporates all measurement data is compatible with a quantum description of the system by means of Gaussian Wigner functions. These can be fully characterized by their mean values and covariances and yield the best possible estimates of an unknown constant [38,39] or time-dependent magnetic field B(t) [40,41]. The covariance matrix analysis, however, relies on complete knowledge of all physical parameters and ideal detectors, and we thus explore an alternative strategy to infer time-dependent magnetic fields from noisy measurement records.

FIG. 2. Schematic of the encoder-decoder RNN that processes the sequence-to-sequence translation (measurement data to magnetic-field estimate). First, each sequence of the recorded signal {x_0, x_τ, ..., x_T} is fed to the encoder. The encoder sequentially updates its hidden state h_t, such that the final state holds information about the entire signal from t = 0 to T. The final hidden state h_T is subsequently treated as the initial state of the decoder, which sequentially extracts the predicted magnetic field as its output. Both the encoder and decoder are LSTM cells (see the Appendix). They have different nonlinear layers, which are followed in the decoder by a fully connected layer with one neuron that outputs the predicted value of the field. As shown in the figure, the input of the decoder at each time step t is the previous output estimate B_{t−τ} and the hidden state h'_{t−τ}.

III. ESTIMATION BY A DEEP NEURAL NETWORK

The NN approach is ignorant of the atomic system and quantum measurement theory and has the sole purpose to learn, from a large collection of simulated realizations of the true magnetic field {B} and the corresponding optical detection signals {X} at discrete times separated by τ, how well an unknown time-dependent magnetic field B = {B_0, B_τ, ..., B_t, ..., B_T} can be estimated from a new signal record X = {x_0, x_τ, ..., x_t, ..., x_T}. To this end, we employ the method described in the previous section to prepare a collection of simulated magnetic fields and accompanying noisy optical signals. Both are represented as row vectors of dimension N_T for each record of total duration T, where (N_T − 1)τ = T.

The training data {{X}, {B}} are fed to a recurrent NN (RNN), which has the ability to extract correlations in sequential data. The characteristic feature of an RNN is its internal (hidden) loop memory, allowing it to maintain a state that contains information depending on all previous input as it processes through the sequence of data [42]. This is accomplished by the encoder-decoder architecture shown in Fig. 2, which has a remarkable ability to process sequence-to-sequence data [16–18] by first processing the input measurement data sequence {X} to form the hidden vector h, and subsequently processing h to form the output candidate estimate of {B}. For our purpose, we use long short-term memory (LSTM) units for both the encoder and decoder; these are a specific type of RNN with the ability to process long sequence data [43]; for details see the Appendix.

Technically, the encoder, at each time step t, encodes the information of its input x_t in the internal state h_t of the NN, which in the LSTM is formally treated by two sets of m nodes, h_t and c_t, which are updated according to the nonlinear activation functions

    h_t = F(x_t, h_{t−τ}, c_{t−τ}),
    c_t = Q(x_t, h_{t−τ}, c_{t−τ}),   (3)

with a set of weight coefficients W_encoder. After the encoding is finished, the hidden state at the last time step, h̃_T = (h_T, c_T), is applied as the initial state (h'_init, c'_init) = h̃_T of the decoder. The internal states of the decoder are updated according to the previous states and estimated values of the magnetic field as

    h'_t = F'(B_{t−τ}, h'_{t−τ}, c'_{t−τ}),
    c'_t = Q'(B_{t−τ}, h'_{t−τ}, c'_{t−τ}),   (4)

where F', Q' are nonlinear functions with weight parameters W_decoder; for details of the nonlinear functions see the Appendix. The decoder employs a fully connected layer with one neuron and linear activation function f(x) = x. At each time step, a copy of h'_t is fed to the fully connected layer, and the candidate subsequent value of the time-dependent magnetic field is obtained with the weight matrix W_{h'B} and output bias b_B:

    B_t = f(W_{h'B} h'_t + b_B) = W_{h'B} h'_t + b_B.   (5)

Note that during the learning procedure, the variable B_{t−τ} in Eq. (4) is the true magnetic field from training experiments (simulated in our numerical study). But, for the prediction of an unknown signal, it is the estimated magnetic field at the previous step of the decoding process [see Eq. (5); see the Appendix for the details of the learning and prediction algorithms].
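To make the architecture of Eqs. (3)–(5) concrete, the following Keras sketch is a hedged reconstruction, our own illustration rather than the authors' published code: an LSTM encoder with m = 80 units consumes the scalar signal sequence, its final states initialize an LSTM decoder that receives the time-shifted field values during training (teacher forcing, as in Algorithm 1 of the Appendix), and a one-neuron linear layer implements Eq. (5).

    from tensorflow import keras
    from tensorflow.keras import layers

    m, n_steps = 80, 101                         # hidden dimension and N_T

    # Encoder: consumes the measurement record {x_t}; only its final states matter.
    enc_in = keras.Input(shape=(n_steps, 1))
    _, h_T, c_T = layers.LSTM(m, return_state=True)(enc_in)

    # Decoder: consumes B_{t-tau} (true values during training), initialized with
    # the final encoder states; the one-neuron linear readout implements Eq. (5).
    dec_in = keras.Input(shape=(n_steps, 1))
    dec_seq = layers.LSTM(m, return_sequences=True)(dec_in,
                                                    initial_state=[h_T, c_T])
    B_est = layers.TimeDistributed(layers.Dense(1, activation="linear"))(dec_seq)

    model = keras.Model([enc_in, dec_in], B_est)
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.01), loss="mse")
    # model.fit([X_train, B_shifted], B_train, batch_size=256, epochs=30)

At prediction time the decoder is instead run autoregressively, feeding each estimate B_t back as the next input; see Algorithm 2 in the Appendix.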


FIG. 3. (a) The loss function (7) is reduced according to the number of epochs of the gradient descent procedure (6) applied on the training data. After 30 epochs, the loss function has converged, i.e., it changes by less than or on the order of 10^−6 over the last two epochs. (b) The time-dependent mean-squared error (8) between the predicted and the true magnetic field attains a constant low value inside the probing interval, while increasing at both ends of the time interval, where the lack of prior and posterior measurement data, respectively, causes larger uncertainty. Panels (c) and (d) show the true magnetic field with red dots together with the value inferred by the RNN, which is shown by the blue curve. The optical signal leading to the inferred magnetic field B(t) in panel (c) is shown in Fig. 1(b). The simulations are made with the physical parameters κ² = 18 (ms)^−1, μ = 90 (ms)^−1, N_at = 2 × 10^12, and N_ph = 5 × 10^9 during each time interval τ = 0.01 ms. The magnetic field is sampled with σ_b = 2 pT²/ms and γ_b = 1 (ms)^−1.

The encoder-decoder RNN is thus fully specified by the elements of the matrices and vectors W = (W_encoder, W_decoder, W_{h'B}, b_B), and it is trained by processing independent sequences of data corresponding to different simulated realizations of the magnetic noise and the optical detection noise.

The training consists in exposing the RNN to numerous simulated pairs of magnetic fields and measurement data, and minimizing a loss function L which quantifies the difference between the true {B_true} and the predicted field {B_est}. The loss function is differentiable, and the trainable parameters W of the RNN are adjusted to minimize L by applying gradient descent steps:

    W ← W − η ∂L/∂W,   (6)

where η is the learning rate. Depending on the learning problem, different kinds of loss functions can be applied, e.g., the cross entropy and the mean-square error, which are suitable for classification and regression, respectively. Here, we consider the loss function

    L = (1/(M N_T)) Σ_{t=0}^{T} Σ_{i=1}^{M} (B^i_{t,true} − B^i_{t,est})²,   (7)

where M is the number of sequences in the minibatch on which L is calculated and N_T is the dimension of each sequence. The parameters of the RNN are changed along the negative direction of the gradient of L until the minimum value is obtained.

We use the ADAM optimizer [44] to update the parameters of the NN on every minibatch of size M = 256 with the initial learning rate 0.01. We refer to each iteration over minibatches covering the full data set as an epoch, and we find that applying the gradient descent for 30 such epochs suffices for convergence of the loss function. We prepare a total of 3 × 10^6 data records, of which N = 2.8 × 10^6 are used as training data and N' = 2 × 10^5 as unseen (test) data to validate the RNN. Each sequence has dimension N_T = 101, from t = 0 to T = 1 ms, with τ = 0.01 ms. As mentioned, we use two LSTMs with hidden dimension m = 80. The LSTMs alleviate the vanishing or diverging gradient problems of simpler RNNs, and they improve the learning of long-term dependencies [45].

IV. RESULTS

During the learning procedure, the RNN minimizes the loss function (7), and the learning is iterated until the loss function converges; see Fig. 3(a). After the learning process, we validate the NN on the unexplored test data, employing the error function

    Error(t) = (1/N') Σ_{i=1}^{N'} (B^i_{t,true} − B^i_{t,est})²,   (8)

where N' is the number of sequences of test data. Equation (8) yields the mean-squared error of the prediction by the encoder-decoder for the magnetic field from t = 0 to T.
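In code, Eqs. (7) and (8) differ only in how the squared errors are averaged; a small numpy sketch (our illustration, with the array shapes as stated assumptions):

    import numpy as np

    def loss_eq7(B_true, B_est):
        # Training loss, Eq. (7): average over a minibatch of M sequences,
        # each with N_T time steps; B_true, B_est have shape (M, N_T).
        M, N_T = B_true.shape
        return np.sum((B_true - B_est) ** 2) / (M * N_T)

    def error_eq8(B_true, B_est):
        # Validation error, Eq. (8): average over the N' test sequences only,
        # returning one value per time step; inputs have shape (N', N_T).
        return np.mean((B_true - B_est) ** 2, axis=0)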


The performance of the RNN is shown in Fig. 3(b), where the mean-squared error is given in units of the variance of the unobserved magnetic-field fluctuations.

Despite the similarity of Eqs. (7) and (8), the loss function shown in Fig. 3(a) is smaller than the time average of the error shown in Fig. 3(b). The learning procedure provides the true value of the magnetic field at each time step to the decoder, and the loss function concerns only the difference between the true and the guessed value at the next time step. In contrast, after training, the network has access only to the measurement data and thus it applies the previously guessed values at each time step; for the details of the algorithms see the Appendix.

Two examples comparing the true and the inferred magnetic field are shown in Fig. 3(c), based on the measurement sequence of Fig. 1(b), and in Fig. 3(d). The same estimation problem can be treated by Bayesian quantum filter approaches [38,41,46,47], and the NN and filter methods yield very similar results. A noticeable feature in Fig. 3(b) is the low, almost constant estimation error well within the probing interval, accompanied by three to four times higher values at the ends of the interval. This result is also found in the Bayesian filter analyses, where it is directly associated with the ability to use both earlier and later measurement data to infer unknown values at any given time. Such earlier and later measurement outcomes are not available at the beginning and the end of the probing time interval, and hence the estimation error is larger there. For a simple Gaussian process, this would typically change the variance by a factor of 2, but in our case, both the atomic ensemble state and the magnetic field are unknown, and their mutual correlation is the cause of the larger gain in information by the combination of earlier and later measurements. The minimum value in Fig. 3(b) depends on the values of the probing strength κ and the rate of fluctuations of the magnetic field γ_b, since a high rate of fluctuations hinders the reliable estimation of the field unless we can increase the probing strength. The relationship between the estimation error, the probing strength, and the rate of Ornstein-Uhlenbeck fluctuations is investigated quantitatively for the Bayesian filter in Ref. [41].

FIG. 4. Schematic diagram of the LSTM cell. Each LSTM cell has three inputs, where c_{t−τ}, h_{t−τ} represent the cell and hidden states at the previous time step t − τ, and r_t is the input to the cell. Rectangles represent LSTM layers and circles are entrywise operations; see Eqs. (A1)–(A5). For the encoder, the input is the recorded signals, and we retain the final hidden and cell states h_T, c_T as the initial internal state for the decoder, by which the output, the magnetic field, is predicted. Bifurcations indicate the copy operation.

V. SUMMARY AND CONCLUSIONS

We have demonstrated the use of a recurrent neural network with an encoder-decoder architecture to estimate the value of a stochastic magnetic field B(t) interacting with an atomic system during a finite time interval. The input data is the optical signal from continuous Faraday polarization monitoring of the system, which was simulated by quantum trajectory theory. Our RNN learned to estimate B(t) from the optical signal alone and without any information about the dynamics of the system and magnetic field. The performance of our approach is quantitatively similar to the one obtained by a Bayesian analysis of the same data. Although a Bayesian quantum filter approach can also be applied to the example presented here, such an approach relies crucially on several assumptions. Non-Markovian effects related to finite detector bandwidth, saturation, and dead time, for example, invalidate or substantially complicate the Bayesian analysis [13]. Other noise processes would not be compatible with a Gaussian distribution of the magnetic field and hence invalidate the strongly simplifying Gaussian description employed in the Bayesian filters [40,41,46], while they would not impose any fundamental problem for the RNN. Our paper confirms the excellent prospects for the use of an RNN in quantum metrology, and since the goal is always to infer classical functions from classical data, we emphasize that no particular quantum modification of the RNN concept is needed. It is likely that neural networks may be combined with other effective methods, e.g., making use of sparsity in the signal in compressed sensing [48], as our analysis does not rely on any assumptions about the way the measurement data are provided.

ACKNOWLEDGMENTS

This work is supported by the Ministry of Science, Research and Technology of Iran; the Danish National Research Foundation through the Center of Excellence "CCQ" (Grant Agreement No. DNRF156); and the European QuantERA grant C'MON-QSENS!, by Innovation Fund Denmark Grant No. 9085-00002B. The authors acknowledge C. Zhang for sharing her results on quantum filter estimation and A. T. Rezakhani for careful reading of the paper and helpful comments.

APPENDIX: LONG SHORT-TERM MEMORY

For our RNN we employ the LSTM unit, which is a variant of the simple RNN layer, permitting us to process complicated data and carry information across many time steps. The LSTM employs two internal states, h_t and c_t, known as the hidden and the cell state, respectively, and it processes information by the application of three gates: a forget, an input/update, and an output gate.

For the encoder (or decoder), at time step t, the input r_t = x_t (or B_t) and the two internal states h_{t−τ}, c_{t−τ} that were updated in previous steps are fed to the LSTM cell. These states and input are employed and updated as follows:


The forget gate F decides what information to discard from the cell, the input/update gate I decides which input values are employed to update the memory state, and, based on the input and the memory of the cell, the output gate O decides how the output value is determined. The equations of the gates yielding the subsequent hidden state values h_t, c_t are [42,43]

    I_t = σ(W_ri r_t + W_hi h_{t−τ} + b_i),   (A1)
    F_t = σ(W_rf r_t + W_hf h_{t−τ} + b_f),   (A2)
    c_t = F_t c_{t−τ} + I_t tanh(W_rc r_t + W_hc h_{t−τ} + b_c),   (A3)
    O_t = σ(W_ro r_t + W_ho h_{t−τ} + b_o),   (A4)
    h_t = O_t tanh(c_t),   (A5)

where σ(x) = 1/(1 + e^−x) is the sigmoid function (note that σ(x) and tanh(x) are applied elementwise to each of their arguments, and the products in Eqs. (A3) and (A5) are entrywise; cf. Fig. 4). In the decoder, the LSTM cells are followed by a fully connected layer. As shown in Fig. 4, one copy of h_t is considered as the LSTM output and is fed to the fully connected layer of Eq. (5), the output of which is the estimate of the time-dependent magnetic field; see Fig. 2. We use LSTM cells for both the encoder and decoder architecture with hidden dimension m = 80, yielding (m × m) matrices W_hj, (m × 1) matrices W_rj for j = i, f, c, o, and m-dimensional bias vectors. All the parameters W, b of Eqs. (A1)–(A5) are updated in the gradient descent learning of the NN, Eqs. (6) and (7). The network was implemented with the KERAS 2.3.1 and TENSORFLOW 2.1.0 frameworks on PYTHON 3.6.3 [49,50]. All weights were initialized with the KERAS defaults.

Learning and prediction algorithm

According to Fig. 2, two LSTMs are employed to process the training data in two parts: encoding and decoding. In this section we describe the algorithms of the RNN (encoder and decoder) for the learning and prediction procedures.
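A literal numpy transcription of one cell update, Eqs. (A1)–(A5), may help fix the notation; this is our own sketch, with the weights stored per gate j = i, f, c, o as a pair (W_rj, W_hj) and a bias b_j:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def lstm_step(r_t, h_prev, c_prev, W, b):
        # One LSTM update, Eqs. (A1)-(A5), for a scalar input r_t (x_t or B_t).
        # W[j] = (W_rj, W_hj) with shapes (m,) and (m, m); b[j] has shape (m,).
        I_t = sigmoid(W["i"][0] * r_t + W["i"][1] @ h_prev + b["i"])    # (A1)
        F_t = sigmoid(W["f"][0] * r_t + W["f"][1] @ h_prev + b["f"])    # (A2)
        c_t = F_t * c_prev + I_t * np.tanh(
            W["c"][0] * r_t + W["c"][1] @ h_prev + b["c"])              # (A3)
        O_t = sigmoid(W["o"][0] * r_t + W["o"][1] @ h_prev + b["o"])    # (A4)
        h_t = O_t * np.tanh(c_t)                                        # (A5)
        return h_t, c_t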

Algorithm 1. Algorithm of the learning procedure.

Initialize the parameters W = (W_encoder, W_decoder, W_{h'B}, b_B) according to the KERAS default;
for counter < epochs do
    for number < N (training data) do
        "Encoding part"
        Initialize h_init, c_init, t = 0;
        while t ≤ T do
            Use x_t from the training data as the input;
            Use h_{t−τ}, c_{t−τ} from the previous step;
            Apply Eqs. (A1)–(A5);
            Keep the updated h_t, c_t for the next time step;
            t ← t + τ;
        end
        Keep h_T, c_T as the encoded vector;
        "Decoding part"
        Initialize h'_init = h_T, c'_init = c_T, t = 0;
        Feed an arbitrary initial input B = 0 to the decoder;
        while t ≤ T do
            Use B_t from the training data as the input;
            Use h'_{t−τ}, c'_{t−τ} from the previous step;
            Apply Eqs. (A1)–(A5);
            Keep the updated h'_t, c'_t for the next time step;
            Feed the hidden state h'_t to the fully connected layer;
            Calculate the output B_t of the fully connected layer as in Eq. (5);
            Find the squared error of the output and the true magnetic field;
            t ← t + τ;
        end
        if number is a multiple of the minibatch size then
            Evaluate the loss function (7) from the squared errors calculated in the decoding part;
            Update the parameters W according to Eq. (6) with the ADAM optimizer;
        end
    end
end
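The teacher forcing used in Algorithm 1 amounts to a simple shift of the target sequences; a brief numpy sketch (our illustration, with a hypothetical stand-in for the simulated fields):

    import numpy as np

    B_train = np.random.randn(4, 101)       # stand-in for N simulated fields B(t)
    # Decoder input at step t is the true B_{t-tau}; the first entry is the
    # arbitrary initial value B = 0 fed to the decoder in Algorithm 1.
    B_shifted = np.concatenate(
        [np.zeros((B_train.shape[0], 1)), B_train[:, :-1]], axis=1)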


Algorithm 2. Algorithm of the prediction procedure.

The parameters W = (W_encoder, W_decoder, W_{h'B}, b_B) are learned according to Algorithm 1;
for number < N' (test data) do
    "Encoding part"
    Initialize h_init, c_init, t = 0;
    while t ≤ T do
        Use x_t from the test (unseen) data as the input;
        Use h_{t−τ}, c_{t−τ} from the previous step;
        Apply Eqs. (A1)–(A5);
        Keep the updated h_t, c_t for the next time step;
        t ← t + τ;
    end
    Keep h_T, c_T as the encoded vector;
    "Decoding part"
    Initialize h'_init = h_T, c'_init = c_T, t = 0;
    Feed an arbitrary initial input B = 0 to the decoder;
    while t ≤ T do
        Use B_t from the previous output as the input;
        Use h'_{t−τ}, c'_{t−τ} from the previous step;
        Apply Eqs. (A1)–(A5);
        Keep the updated h'_t, c'_t for the next time step;
        Feed the hidden state h'_t to the fully connected layer;
        Calculate the output B_t of the fully connected layer as in Eq. (5);
        Report the output as the estimated magnetic field;
        Find the squared error of the output and the true magnetic field;
        t ← t + τ;
    end
end
Find the average error, cf. Eq. (8);
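Using the hypothetical lstm_step sketched above, the prediction loop of Algorithm 2 can be written compactly as follows (our illustration; enc_W, enc_b, dec_W, dec_b, W_out, b_out denote the learned encoder, decoder, and readout parameters):

    import numpy as np

    def predict_field(x_record, enc_W, enc_b, dec_W, dec_b, W_out, b_out, m=80):
        # Encoding part: run the encoder over the recorded signal {x_t}.
        h, c = np.zeros(m), np.zeros(m)
        for x_t in x_record:
            h, c = lstm_step(x_t, h, c, enc_W, enc_b)
        # Decoding part: start from the encoder states and the initial input
        # B = 0, feeding each estimate back as the next decoder input.
        B_t, estimates = 0.0, []
        for _ in range(len(x_record)):
            h, c = lstm_step(B_t, h, c, dec_W, dec_b)
            B_t = float(W_out @ h + b_out)      # linear readout, Eq. (5)
            estimates.append(B_t)
        return np.array(estimates)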

[1] V. Giovannetti, S. Lloyd, and L. Maccone, Quantum-enhanced measurements: Beating the standard quantum limit, Science 306, 1330 (2004).
[2] V. Giovannetti, S. Lloyd, and L. Maccone, Advances in quantum metrology, Nat. Photonics 5, 222 (2011).
[3] C. M. Caves, Quantum-mechanical noise in an interferometer, Phys. Rev. D 23, 1693 (1981).
[4] P. Zanardi, M. G. A. Paris, and L. Campos Venuti, Quantum criticality as a resource for quantum estimation, Phys. Rev. A 78, 042105 (2008).
[5] M. Tsang and C. M. Caves, Evading Quantum Mechanics: Engineering a Classical Subsystem within a Quantum Environment, Phys. Rev. X 2, 031016 (2012).
[6] O. Hosten, N. J. Engelsen, R. Krishnakumar, and M. A. Kasevich, Measurement noise 100 times lower than the quantum-projection limit using entangled atoms, Nature (London) 529, 505 (2016).
[7] L. Pezzé, A. Smerzi, M. K. Oberthaler, R. Schmied, and P. Treutlein, Quantum metrology with nonclassical states of atomic ensembles, Rev. Mod. Phys. 90, 035005 (2018).
[8] M. M. Rams, P. Sierant, O. Dutta, P. Horodecki, and J. Zakrzewski, At the Limits of Criticality-Based Quantum Metrology: Apparent Super-Heisenberg Scaling Revisited, Phys. Rev. X 8, 021022 (2018).
[9] H. Carmichael, An Open Systems Approach to Quantum Optics (Springer-Verlag, Berlin, 1993).
[10] V. P. Belavkin, Measurement, filtering and control in quantum open dynamical systems, Rep. Math. Phys. 43, 405 (1999).
[11] V. P. Belavkin, Quantum Filtering of Markov Signals with White Quantum Noise (Springer, New York, 1995).
[12] L. Bouten, R. van Handel, and M. R. James, An introduction to quantum filtering, SIAM J. Control Optim. 46, 2199 (2007).
[13] P. Warszawski, H. M. Wiseman, and H. Mabuchi, Quantum trajectories for realistic detection, Phys. Rev. A 65, 023802 (2002).
[14] E. Alpaydin, Introduction to Machine Learning (MIT, Cambridge, MA, 2010).
[15] K. Cho, B. van Merrienboer, C. Gulcehre, F. Bougares, H. Schwenk, and Y. Bengio, Learning phrase representations using RNN encoder-decoder for statistical machine translation, arXiv:1406.1078.
[16] I. Sutskever, O. Vinyals, and Q. V. Le, Sequence to sequence learning with neural networks, in NIPS'14: Proceedings of the 27th International Conference on Neural Information Processing Systems, Montreal, Canada (MIT, Cambridge, MA, 2014).
[17] T. Mikolov, M. Karafiat, L. Burget, J. Cernocky, and S. Khudanpur, Recurrent neural network based language model, in Proceedings of the Eleventh Annual Conference of the International Speech Communication Association, 2010 (unpublished).


[18] A. Graves, A.-R. Mohamed, and G. Hinton, Speech recognition with deep recurrent neural networks, arXiv:1303.5778.
[19] C. W. Gardiner, Handbook of Stochastic Methods (Springer, New York, 1985).
[20] Y. LeCun, Y. Bengio, and G. Hinton, Deep learning, Nature (London) 521, 436 (2015).
[21] E. Flurin, L. S. Martin, S. Hacohen-Gourgy, and I. Siddiqi, Using a Recurrent Neural Network to Reconstruct Quantum Dynamics of a Superconducting Qubit from Physical Observations, Phys. Rev. X 10, 011006 (2020).
[22] M. J. Hartmann and G. Carleo, Neural-Network Approach to Dissipative Quantum Many-Body Dynamics, Phys. Rev. Lett. 122, 250502 (2019).
[23] L. Banchi, E. Grant, A. Rocchetto, and S. Severini, Modelling non-Markovian quantum processes with recurrent neural networks, New J. Phys. 20, 123030 (2018).
[24] N. Yoshioka and R. Hamazaki, Constructing neural stationary states for open quantum many-body systems, Phys. Rev. B 99, 214306 (2019).
[25] F. Vicentini, A. Biella, N. Regnault, and C. Ciuti, Variational Neural-Network Ansatz for Steady States in Open Quantum Systems, Phys. Rev. Lett. 122, 250503 (2019).
[26] A. Nagy and V. Savona, Variational Quantum Monte Carlo with Neural Network Ansatz for Open Quantum Systems, Phys. Rev. Lett. 122, 250501 (2019).
[27] J. Carrasquilla and R. G. Melko, Machine learning phases of matter, Nat. Phys. 13, 431 (2017).
[28] E. Greplova, C. K. Andersen, and K. Mølmer, Quantum parameter estimation with a neural network, arXiv:1711.05238.
[29] S. Nolan, A. Smerzi, and L. Pezzè, A machine learning approach to Bayesian parameter estimation, arXiv:2006.02369.
[30] E. P. van Nieuwenburg, Y.-H. Liu, and S. D. Huber, Learning phase transitions by confusion, Nat. Phys. 13, 435 (2017).
[31] G. Torlai, G. Mazzola, J. Carrasquilla, M. Troyer, R. Melko, and G. Carleo, Neural-network quantum state tomography, Nat. Phys. 14, 447 (2018).
[32] G. Carleo and M. Troyer, Solving the quantum many-body problem with artificial neural networks, Science 355, 602 (2017).
[33] F. Breitling, R. S. Weigel, M. C. Downer, and T. Tajima, Laser pointing stabilization and control in the submicroradian regime with neural networks, Rev. Sci. Instrum. 72, 1339 (2001).
[34] P. Kobel, M. Link, and M. Köhl, Exponentially improved detection and correction of errors in experimental systems using neural networks, arXiv:2005.09119.
[35] D. F. Wise, J. J. L. Morton, and S. Dhomkar, Using deep learning to understand and mitigate the qubit noise environment, PRX Quantum 2, 010316 (2021).
[36] R. K. Gupta, G. D. Bruce, S. J. Powis, and K. Dholakia, Deep learning enabled laser speckle wavemeter with a high dynamic range, Laser Photonics Rev. 14, 2000120 (2020).
[37] T. Holstein and H. Primakoff, Field dependence of the intrinsic domain magnetization of a ferromagnet, Phys. Rev. 58, 1098 (1940).
[38] K. Mølmer and L. B. Madsen, Estimation of a classical parameter with Gaussian probes: Magnetometry with collective atomic spins, Phys. Rev. A 70, 052102 (2004).
[39] J. M. Geremia, J. K. Stockton, and H. Mabuchi, Suppression of Spin Projection Noise in Broadband Atomic Magnetometry, Phys. Rev. Lett. 94, 203002 (2005).
[40] V. Petersen and K. Mølmer, Estimation of fluctuating magnetic fields by an atomic magnetometer, Phys. Rev. A 74, 043802 (2006).
[41] C. Zhang and K. Mølmer, Estimating a fluctuating magnetic field with a continuously monitored atomic ensemble, Phys. Rev. A 102, 063716 (2020).
[42] F. Chollet, Deep Learning with Python (Manning, Shelter Island, NY, 2018).
[43] S. Hochreiter and J. Schmidhuber, Long short-term memory, Neural Comput. 9, 1735 (1997).
[44] D. P. Kingma and J. Ba, Adam: A method for stochastic optimization, arXiv:1412.6980.
[45] S. Hochreiter, Y. Bengio, P. Frasconi, and J. Schmidhuber, Gradient flow in recurrent nets: The difficulty of learning long-term dependencies, in A Field Guide to Dynamical Recurrent Networks, edited by J. F. Kolen and S. C. Kremer (Wiley, New York, 2001).
[46] M. Tsang, Optimal waveform estimation for classical and quantum systems via time-symmetric smoothing, Phys. Rev. A 80, 033840 (2009).
[47] M. Tsang, Time-Symmetric Quantum Theory of Smoothing, Phys. Rev. Lett. 102, 250403 (2009).
[48] E. J. Candès, J. Romberg, and T. Tao, Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information, IEEE Trans. Inf. Theory 52, 489 (2006).
[49] F. Chollet, KERAS, available at keras.io.
[50] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, and M. Kudlur, TensorFlow: A system for large-scale machine learning, in Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (2016), pp. 265–283.
