
Large deviation theory to model systems under an external feedback

Alessio Gagliardi (1), Alessandro Pecchia (2), Aldo Di Carlo (3)

(1) Technische Universitaet Muenchen, Arcisstrasse 21, 80333, Munich (Germany); alessio.gagliardi@tum.de
(2) CNR, Via Salaria Km 29,600, 0017 Monte Rotondo (Italy)
(3) University of Rome "Tor Vergata", Via del Politecnico 1, 00133, Rome (Italy)

arXiv:1603.03786v2 [cond-mat.stat-mech] 15 Apr 2016
(Dated: May 15, 2019)
In this paper we address the problem of systems under an external feedback. This is performed using a large deviation approach and rate distortion from information theory. In particular, we define a lower boundary for the maximum entropy reduction that can be obtained using a feedback apparatus with a well defined accuracy in terms of measurement of the state of the system. The large deviation approach also allows us to define a new set of potentials, including information, which, similarly to more conventional thermodynamic potentials, define the state with optimal use of the information, given the accuracy of the feedback apparatus.

I. INTRODUCTION

The idea of a thermodynamic theory of systems under an external feedback dates back to the origin of statistical physics, when Maxwell devised a gedanken experiment about the work that could be extracted from a system controlled by an external apparatus. In principle the external feedback can probe the state of the system and manipulate it in order to extract work. In his first formulation the feedback was treated like an oracle that can make measurements and manipulate the system without any cost in terms of work and entropy production; for this reason it was called a "Demon", to stress its abstract nature. However, the idea was reconsidered throughout the past century [1,2] with different approaches and conclusions. One of the central works on the topic was the paper by Szilard and his famous Szilard engine [3], a still very idealized machine that nevertheless already shows some more realistic characteristics in terms of its components and the nature of the feedback controller. Starting from the Szilard engine, but generalizing the concept, several authors [4-10], and in particular Sagawa and Ueda, have developed a new branch of thermodynamics of systems under an external feedback [11-13], i.e. information thermodynamics. In particular, in [14] it was shown, using an ingenious extension of fluctuation theorems that includes the information gathered by the feedback, that the presence of the external controller can increase the average work that can be extracted from a system when it is driven from one equilibrium state to another, according to

⟨W⟩ ≤ ∆F + k_B T I,  (1)

with ⟨W⟩ the average work, ∆F the free energy variation between the final and initial equilibrium states, k_B the Boltzmann constant, T the bath temperature and I the mutual information. For a cyclic transformation, for which ∆F = 0, the maximum work reduces to k_B T I.

The mutual information, for a classical system, is described by the mutual information functional of information theory. There is an understandable debate about the validity of using information theoretical quantities and their interpretations in thermodynamics [15]. However, for many cases of interest [16] the Shannon entropy represents a good choice for the entropy functional, especially in equilibrium cases. In particular, the Shannon entropy gives the right entropy functional, but only physical and experimental evidence can determine which is the correct probability density function (PDF) of the problem under investigation, as information theory has nothing to say about that [17].

An interesting alternative approach to non equilibrium thermodynamics is arising within the framework of a different branch of information theory, i.e. large deviation theory (LDT). LDT is a mathematical framework used to establish the probability of fluctuations of statistical quantities from their typical values. Typicality in information theory leads to the definition of thermodynamic averages and state variables [18,19].

In particular, LDT demonstrates that for a broad class of PDFs the probability of fluctuations from the average value drops exponentially with the fluctuation magnitude. The exponent is proportional to the number of degrees of freedom of the system, explaining why, in the thermodynamic limit, fluctuations of state variables become negligible. However, in the emerging field of stochastic thermodynamics, which also copes with small systems in non equilibrium conditions, fluctuations can be an important aspect of their behavior [20-24].

Many systems of interest have the peculiarity of being the interconnection between a physical system probed and controlled by a feedback apparatus; for example, biological processes in cells and other living organisms belong to this class. Recently, several studies have investigated the effect of including information terms in the analysis of biological processes within the framework of
information thermodynamics with great success, see for example [25-32]. Thus the investigation of thermodynamics under feedback is rapidly raising interest in many fields beyond statistical physics.

In this paper we give a new insight into the problem of a system under an external feedback using a special case of LDT and the concept of typicality. The link between systems under feedback and LDT is obtained using rate distortion theory (RDT), a fundamental part of information theory. The chain of relationships between LDT and RDT that we outline is particularly pleasing, as it makes a direct connection between the information the feedback controller apparatus gathers about a system and the associated entropy reduction, also leading to an explicit construction of a thermodynamic potential for a system under feedback. The possibility to construct potentials including the effect of information opens the perspective of a completely new analysis of those systems, more similar to the conventional equilibrium thermodynamic formalism. For a similar approach see also [33].

The paper is organized as follows: in the first part a brief summary of the main information theoretical concepts, Shannon entropy, conditional entropy and mutual information, is given. Then the concept of typicality is introduced. In the next section we present large deviation theory and the distortion coding problem. The final part is devoted to introducing rate distortion theory and the final link to thermodynamics under an external feedback.

II. A BRIEF EXCURSUS IN INFORMATION THEORY

The central quantity of information theory is the Shannon entropy, defined as

S(P) = −k_B Σ_i p_i ln p_i,  (2)

where p_i represents the probability of the i-th event and P the PDF. Shannon entropy is usually expressed in bits and is adimensional; we have chosen here the convention to include the Boltzmann constant and express the entropy using the natural logarithm. However, this is totally immaterial for the present discussion. Shannon entropy plays a central role in thermodynamics, even for a large class of systems under non equilibrium conditions [16].

More generally, if we have a PDF p(ξ) which depends on a set of degrees of freedom, ξ, we can define the proper discrete (or continuous) Shannon entropy. The degrees of freedom depend on the problem at hand; they could be, for example, the set of positions and momenta of a collection of particles in gas phase. From the Shannon entropy it is possible to promptly derive four other connected quantities. The first is the joint (Shannon) entropy, which is just the entropy of two random vectors ξ and π:

S(Ξ, Π) = −k_B Σ p(ξ, π) ln p(ξ, π),  (3)

with p(ξ, π) the joint probability. If the two vectors are independent the joint entropy reduces to the sum of the individual entropies; in the other case we have

S(Ξ, Π) = S(Ξ) + S(Π) − k_B I(Ξ, Π),  (4)

with I(Ξ, Π) the mutual information functional, defined as

I(Ξ, Π) = Σ p(ξ, π) ln [ p(ξ, π) / (p(ξ)p(π)) ].  (5)

The mutual information is a special case of the Kullback-Leibler divergence (KLd), defined as

D(P‖Q) = Σ p(ξ) ln [ p(ξ) / q(ξ) ],  (6)

where P and Q are two PDFs. The KLd is a pseudo distance between PDFs and has an important role in LDT due to the Chernoff bound [19], in stochastic thermodynamics [34], within fluctuation theorems [35] and in the entropy production within the Boltzmann equation [36]. In particular, we observe that the mutual information has the form of a KLd between the joint PDF, p(ξ, π), and the independent PDF, p(ξ, π) = p(ξ)p(π) ≡ q(ξ, π).

Finally, mutual information and entropy of two random vectors are connected by the conditional entropy:

S(Ξ|Π) = S(Ξ) − k_B I(Ξ, Π) = −k_B Σ p(ξ, π) ln p(ξ|π).  (7)

Conditional entropy represents the entropy (uncertainty) left in Ξ after the conditioning to Π.

These five are the most relevant functionals in information theory, as practically every important theorem is related to one or another in some form.

III. TYPICAL SET IN THE PHASE SPACE

The Shannon entropy has a nice geometrical interpretation in the typical set theorem, a consequence of the asymptotic equipartition principle [37]. The theorem states that, given a PDF p(ξ), where ξ is the vector of degrees of freedom of the problem, with entropy S(Ξ), the "typical set" within the phase space has a volume equal to

Ω_typ ∼ e^{S(Ξ)/k_B},  (8)

where "typical" means that the probability for the system to be found in a microstate within the typical set converges to 1 in the thermodynamic limit, namely

p(ξ ∈ Ω_typ) → 1.  (9)

In other words, the concept of typicality states that only a portion of the entire phase space is really relevant to
compute ensemble averages of thermodynamic quantities, considering that the volume Ω_typ fundamentally collects all the probability (see Fig. 1).

FIG. 1. (color online) The phase space and the typical subset. Usually, except for the uniform distribution, the typical set is indeed a proper subset of the entire class of possible microstates in the phase space. Its volume grows exponentially with the entropy of the problem.

In particular, for any quantity A(ξ) defined over the phase space, the ensemble average is defined as

⟨A⟩ = ∫ Ã q(Ã) dÃ,  (10)

with

q(Ã) = ∫ p(ξ) δ(Ã − A(ξ)) dξ.  (11)

If it happens that for all microstates within the typical set A(ξ ∈ Ω_typ) = A∗ (constant), then A is a state variable of the problem and ⟨A⟩ = A∗.

Several generalizations of the typical set concept exist, for example for a joint PDF in the joint typical set theorem [37]. The concept of typicality is extremely important also in thermodynamics. In the microcanonical ensemble, where the PDF over the phase space is a uniform PDF, the entropy reduces to the integration of the density of accessible microstates.

However, as recent studies on stochastic thermodynamics have shown [15,34], it is possible to extend many thermodynamic results also to small systems, i.e., systems where significant fluctuations of the state variables are not only possible, but also probable. Such systems are also those for which a practical implementation of a feedback controller is more feasible, due to their limited number of degrees of freedom.

IV. LARGE DEVIATION THEORY

There is an elegant formalism within information theory connecting the statistical analysis of microscopic fluctuations with the macroscopic behavior of the system described by thermodynamic potentials; this formalism is large deviation theory (LDT). This connection is a well established fact dating back to the '70s; several authors [38-43] used LDT to derive many results of equilibrium thermodynamics. In particular, it was possible to derive in a very elegant way the maximum entropy principle/minimum free energy for systems in the thermodynamic limit in the microcanonical or canonical ensemble.

The entire idea of LDT is to estimate the probability of fluctuations departing from the average in stochastic problems. This can be directly applied to evaluate the probability of observing a fluctuation of a state variable in a thermodynamic system.

Let us assume we have a quantity A with average value A∗ and N degrees of freedom in the system. We define the contribution to A per degree of freedom, a = A/N, and a∗ = A∗/N. It is said that a stochastic problem follows a large deviation (LD) law if the probability that a departs from a∗ follows an exponential law:

q(a ≠ a∗) ∼ e^{−N K(a)},  (12)

where K(a) is an exponent which depends on the thermodynamic quantity, while q(a) is defined as in eq. 11 from the PDF per microstate p(ξ).

A powerful theorem to check if a problem satisfies a LD law is the Gärtner-Ellis theorem (GET) [39]. We first define the Scaled Cumulant Generating Function (SCGF) as

λ(α) = lim_{N→∞} (1/N) ln ⟨e^{Nαa}⟩,  (13)

with

⟨e^{Nαa}⟩ = ∫ p(ξ) e^{αNa(ξ)} dξ,  (14)

with α a real number. The GET states that if the SCGF exists and is differentiable everywhere in α, then the system fluctuations follow an exponential law.

We notice that the SCGF is very similar to a partition function, but scaled with respect to N and with every exponential term weighted by the probability per microstate, p(ξ). If we make a simple variable change α = −β, the new parameter β plays the same role as the inverse temperature, β = 1/(k_B T), and the SCGF can be rewritten as −φ(β) = λ(α). The GET does not only provide a condition for existence, but also gives an operative way to evaluate the exponent K(a). In fact, it demonstrates that K(a) is related to the Legendre-Fenchel transform of the SCGF:

K(a) = − min_{β≥0} [βa − φ(β)].  (15)

It is possible to demonstrate [44] that the previous equation can be rewritten in terms of the real partition function (without the PDF weighting the exponents as in the
SCGF), but adding a constant:

J(a) = − min_{β≥0} [βa − φ(β)] + (1/N) ln Λ = K(a) + (1/N) ln Λ.  (16)

The latter constant has the form of the entropy of a uniform PDF U(ξ) = 1/Λ, with Λ a particular volume within the phase space (see [44]), which depends on the prior PDF p(ξ). In the case of a microcanonical ensemble it reduces to all the microstates with the same energy Ē. The function φ(β) is related to the free energy potential. Specifically, we have that the free energy per degree of freedom (f = F/N) is equal to

f = φ(β)/β,  (17)

thus the Legendre-Fenchel transform of the SCGF is linked to the entropy of the system per degree of freedom, s(a), by

J(a) = −s(a)/k_B + (1/N) ln Λ.  (18)

If we consider the constant term as the entropy associated to a uniform PDF, the exponent J(a) can be treated as follows:

J(a) = −s(a)/k_B + (1/N) ln Λ
     = (1/N) [ Σ_ξ p(ξ) ln p(ξ) + Σ_ξ p(ξ) ln Λ ]
     = (1/N) D(Ξ‖U),  (19)

where U = 1/Λ is the uniform PDF over the phase space volume Λ, D is the KLd and N s(a) = S_tot = −k_B Σ p(ξ) ln p(ξ) the total entropy. Thus the probability q(a ≠ a∗) goes like the following:

q(a ≠ a∗) ≈ e^{−D(Ξ‖U)},  (20)

recovering the Chernoff bound for large fluctuations within the typical set formalism [37]. The LDT provides a clear understanding of why entropy maximization is at the essence of equilibrium thermodynamics. Similar results are obtained for a canonical ensemble [18]. For a general discussion about the GET and the LDT applied to thermodynamics we refer to [39].

Notably, LDT has been applied to non equilibrium systems [38], substituting the PDF for a microstate with the PDF of entire time dependent trajectories to take into account the time evolution of the system. Even more important, LDT can be used to derive fluctuation theorems. In fact, if the exponent has the symmetry relation K(−a) − K(a) = γa (γ a positive real number) for a certain thermodynamic quantity A = Na, we immediately get the fluctuation theorem [18,34,45-47]:

q(a)/q(−a) ≈ e^{N(K(−a)−K(a))} = e^{Nγa}.  (21)

A dual theorem of the GET is the Varadhan theorem [48], which allows to invert the Legendre-Fenchel transform. This states that if

K(a) = − min_{β≥0} [βa − φ(β)],  (22)

is valid, then the following relation also holds:

φ(β) = min_a [βa + K(a)].  (23)

The LD law in the microcanonical or canonical ensemble explains, through the Varadhan theorem, why the minima of the free energy potentials are associated to the state variable values of the equilibrium state.

V. RATE DISTORTION THEORY AND LARGE DEVIATION THEORY

In this paragraph we make explicit the connection between LDT and the information theory quantities presented in the previous sections, and consider a system under feedback. In order to do so we need to introduce the rate distortion function (RDF), a central functional in rate distortion theory (RDT). RDT copes with a fundamental problem in communication, namely estimating the minimal information content that any message sent over a communication channel must contain such that the receiver can still have a good reconstruction of the original signal. The error source can be distorting noise, a finite channel capacity, or a lossy compression operated by the sender. In mathematical form, if we define a distance, d(π, ξ), between the message sent, ξ, and the message received, π (eventually after decompression, noise deconvolution, etc...), the target of RDT is to find under which conditions the average distance can be kept lower than a given threshold, Γ [37]. Formally we ask

⟨d(ξ, π)⟩ ≤ Γ,  (24)

where the average is computed over the joint PDF p(ξ, π). The distance d can be any functional with the properties of a distance (symmetry, positive definiteness, d = 0 iff ξ = π, and it must satisfy the Schwartz inequality). The most important theorem of RDT states that, given d and Γ, there exists a function R(Γ) (the rate distortion function) representing the minimum information required in order to send messages with an average distortion not greater than Γ. The function R(Γ) has some remarkable properties: it is convex in the argument Γ, it converges to the entropy of the source (for a discrete PDF) for Γ = 0, while for Γ ≥ Γ∗ it is zero. In practice, Γ∗ is the limit distortion value after which the information content is completely lost and the receiver has equal chance by just guessing at random the most likely message, based on the joint PDF [37]. For an example of a RDF for a Gaussian PDF see Fig. 2.
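Outside of special cases such as the Gaussian source treated later, R(Γ) has no closed form and must be computed numerically. A standard tool for this (not discussed in the paper) is the Blahut-Arimoto algorithm, which iterates between a conditional PDF p(π|ξ) of the same exponential form as the optimal joint PDF of eq. (30) and the marginal q(π). The sketch below is a minimal illustration under the assumption of a discrete source; the function name and parameter values are ours, and the result is checked against the known rate distortion function of a binary symmetric source with Hamming distortion, R(Γ) = ln 2 − H_b(Γ) (in nats).

```python
import numpy as np

def blahut_arimoto(p_x, dist, lam, n_iter=500):
    """One point (Gamma, R) on the rate-distortion curve of a discrete source.

    p_x  : source PDF p(xi) over a finite alphabet
    dist : distortion matrix d(xi, pi)
    lam  : Lagrange multiplier, the analogue of the Chernoff coefficient lambda
    Returns the average distortion Gamma and the rate R in nats.
    """
    n_x, n_y = dist.shape
    q_y = np.full(n_y, 1.0 / n_y)                 # initial guess for q(pi)
    for _ in range(n_iter):
        # conditional PDF p(pi|xi) proportional to q(pi) exp(-lam d), cf. eq. (30)
        w = q_y * np.exp(-lam * dist)
        w /= w.sum(axis=1, keepdims=True)
        q_y = p_x @ w                             # updated marginal of pi
    gamma = float(np.sum(p_x[:, None] * w * dist))             # <d(xi, pi)>
    rate = float(np.sum(p_x[:, None] * w * np.log(w / q_y)))   # I(Xi, Pi) in nats
    return gamma, rate

# Binary symmetric source with Hamming distortion:
p_x = np.array([0.5, 0.5])
dist = np.array([[0.0, 1.0], [1.0, 0.0]])
gamma, rate = blahut_arimoto(p_x, dist, lam=2.0)
# closed-form check: R(Gamma) = ln 2 - H_b(Gamma)
hb = -gamma * np.log(gamma) - (1.0 - gamma) * np.log(1.0 - gamma)
```

For this symmetric example the fixed point is reached immediately, with Γ = e^{−λ}/(1 + e^{−λ}); sweeping λ ≥ 0 traces the whole convex R(Γ) curve with λ = −dR/dΓ, consistently with the Legendre-Fenchel structure of eq. (27).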
FIG. 2. Typical shape of a RDF. In this case the RDF of a Gaussian PDF with variance σ² = 9 is plotted. For Γ larger than σ² the RDF is 0.

The second important aspect of the rate distortion function is that it can be computed as a constrained minimization of a mutual information functional,

R(Γ) = min_{p(π|ξ): ⟨d(ξ,π)⟩≤Γ} I(Ξ, Π).  (25)

In the latter the minimization is with respect to the conditional PDF, p(π|ξ), and the constraint is that the average distortion remains always smaller than Γ.

The connection between LDT and RDT is materialized by the distortion coding problem (DCP) [49]. The DCP can be formulated in the following way: let us assume we have two random vectors ξ and π, with PDFs p(ξ) and q(π) respectively, and we want to know what is the probability that, picking at random two vectors ξ and π, one from each distribution, the distance is such that d(ξ, π) ≤ Γ. If we assume the only constraint that ξ should belong to the typical set of Ξ, the probability of such condition follows a LD law with an exponent equal to the rate distortion function:

p(ξ, π : ⟨d(ξ, π)⟩ ≤ Γ) ∝ e^{−R(Γ)}.  (26)

The function R(Γ) monotonically decreases in the range 0 < Γ < Γ∗. In the two limiting cases: when Γ = 0, R(0) = S(Ξ)/k_B; when Γ ≥ Γ∗, the probability converges to 1 because R(Γ ≥ Γ∗) = 0.

Since the rate distortion function appears in the form of a LD law, there is a second way to define it, using the GET and the Legendre-Fenchel transform of the SCGF [49,50]:

R(Γ) = − min_{λ≥0} [ λΓ + Σ_ξ p(ξ) ln(Z_π(λ)) ],  (27)

with

Z_π(λ) = Σ_π q(π) e^{−λ d(ξ,π)}.  (28)

Z_π has the form of a generalized partition function, linked to the distribution of π, where the distance d has a role similar to the energy in the conventional partition function [51]. An interesting aspect of this particular case of the LDT is that, minimizing R(Γ) w.r.t. λ, we get an important relation,

Γ = Σ_{ξ,π} d(ξ, π) p(ξ)q(π)e^{−λ∗ d(ξ,π)} / Z_π(λ∗) = ⟨d(ξ, π)⟩.  (29)

The average is made with respect to the joint PDF

p̃(ξ, π) = p(ξ)q(π)e^{−λ∗ d(ξ,π)} / Z_π(λ∗),  (30)

which is the joint probability that fulfills the minimal rate function for a defined average error Γ, and λ∗ is the value minimizing the Legendre-Fenchel transform.

VI. LARGE DEVIATION, RATE DISTORTION AND FEEDBACK CONTROL

We can now use the results of RDT applied to LDT to obtain our most important result concerning the thermodynamics of systems with feedback control. Let us assume that we have a feedback controller that performs measurements of the state of a system and afterwards manipulates it. If we assume that the measurement, π_k, has some correlation with the state ξ, then the entropy of the system after the measurement will be given by the PDF, p(ξ|π = π_k), conditioned by the outcome π_k. This is equal to:

S(Ξ|π = π_k) = −k_B Σ p(ξ|π = π_k) ln p(ξ|π = π_k).  (31)

A typical set volume is always associated to this entropy,

Ω_typ(ξ|π = π_k) = e^{S(Ξ|π=π_k)/k_B},  (32)

hence the average volume of the typical set after a measurement is:

⟨Ω_typ(ξ|π = π_k)⟩ = ⟨e^{S(Ξ|π=π_k)/k_B}⟩,  (33)

where the average is made w.r.t. q(π).

The lower boundary of this formula can be obtained using the Jensen inequality, thanks to the convexity of the exponential function, ⟨exp(f)⟩ ≥ exp(⟨f⟩), to obtain:

⟨Ω_typ(ξ|π = π_k)⟩ ≥ e^{⟨S(Ξ|π=π_k)⟩/k_B} = e^{S(Ξ|Π)/k_B},  (34)

exploiting the fact that ⟨S(Ξ|π = π_k)⟩ = S(Ξ|Π).

Thus, we find that the feedback can, at best, constrain the volume of phase space by the conditional entropy. Splitting the conditional entropy in the usual way, S(Ξ|Π) = S(Ξ) − k_B I(Ξ, Π), we obtain that the effect
of the feedback is to compress the volume of the typical set with an exponent at best equal to the mutual information:

Ω_typ(ξ|π) ≥ Ω_typ(ξ) e^{−I(Ξ,Π)}.  (35)

The fact that the mutual information is always non negative ensures that the effect of the feedback is always to compress the original typical set.

If we want to connect the mutual information to the effective measurement operated by the feedback apparatus (FA), we can define a distance functional, d, and an average distortion Γ between state and estimation, and use the rate distortion function to find the minimal mutual information required to have a certain average distortion Γ. The final result is the lower boundary

⟨Ω_typ(ξ|π = π_k)⟩ ≥ Ω_typ(ξ) e^{−R(Γ)}.  (36)

The appeal of equation (36) is that it gives a direct connection between the effect of the feedback information and the effective operative measurement performed by the FA.

A final note regarding the effect of the FA. The reduction in the volume of the typical set remains only "virtual" until the FA operates and manipulates the system. We can thus imagine this shrinking of the typical set linked to R(Γ) as the best average reduction in uncertainty about the state of the system on the FA side, once the measurements have been made. It is anyway clear that the entropy and the physics of the system are left unchanged until the FA directly operates.

VII. TOWARDS A THERMODYNAMIC POTENTIAL INCLUDING INFORMATION

A thermodynamic potential is nothing else than a quantity that, when minimized/maximized, allows to find the equilibrium or stationary states of a certain thermodynamic system, subject to given constraints. We have already discussed in sections IV and V how the LDT provides an elegant way to derive the maximal entropy and minimal energy principles for the microcanonical and canonical ensembles. In particular, we saw how the free energy potential comes as part of the Legendre-Fenchel transform of the large deviation exponent, thanks to the Gärtner-Ellis theorem. In this section we put together this result and the results of section VI in order to construct an explicit thermodynamic potential for a system under feedback control.

First of all we define a simplified feedback apparatus (SFA). We assume a system in thermal equilibrium with an external bath at temperature T_0, and we assume that the system can be coupled to the bath or disconnected from it and coupled to the SFA. The SFA is a system that can make a measurement and manipulate the system accordingly in a time τ much smaller than any relaxation time of the system. Moreover, we assume a certain level of ideality in the feedback: i) during the measurement no perturbation of the system occurs; ii) the feedback uses the entire information during the manipulation, achieving the maximal efficiency. Once the system has been manipulated by the feedback, it is allowed to relax and finally it is connected again to the external bath until it thermalizes with it. This simple cycle assures that every time the feedback performs its new measurement/manipulation the system is at thermal equilibrium with temperature T_0.

With these approximations we can decouple the initial state before the feedback operation from the feedback action. A completely different and more complex scenario occurs when the system is not allowed to thermalize, or the feedback acts continuously in such a way that the system state depends on the previous feedback history.

We finally assume that the feedback can obtain an estimate, π, of the real state, ξ, of the system such that ⟨d(ξ, π)⟩ ≤ Γ for a well-defined distance functional.

We have shown in section VI that the feedback action entails an entropy reduction after measurement by a factor I(ξ, π) = R(Γ). Thus we can finally define the maximum increase in free energy due to the presence of the feedback:

∆F = F − F_eq = U − T_0(S − k_B R(Γ)) − U + T_0 S = k_B T_0 R(Γ) = ⟨W⟩,  (37)

with U the internal energy. This recovers the result found by Sagawa with eq. 1, as the maximum work for a cyclic transformation, but now with a direct connection between the type and accuracy of the measurement, d(ξ, π) and Γ, and the average extracted work ⟨W⟩.

The R(Γ) exponent of equation (36), following a LD law, can be expanded in the Legendre-Fenchel transform as in equation (15), leading to a joint probability between the microstate and the guess equal to equation (29). Using the Varadhan theorem we can invert the Legendre-Fenchel transform for the RDF and obtain something equivalent to a thermodynamic potential for the information:

⟨ln(Z_π(λ))⟩ = Σ_ξ p(ξ) ln(Z_π(λ)) = min_Γ [λΓ + R(Γ)].  (38)

The structure of this equation is similar to the relation between free energy and partition function in standard thermodynamics, where now the Chernoff coefficient λ plays the role of the inverse temperature:

φ_I(λ) ≡ −(1/λ) ⟨ln(Z_π(λ))⟩.  (39)

What is the meaning of this functional? We can understand its role by calculating the distance, using the
KLd, between a generic joint probability p(ξ, π) and the optimal p̃(ξ, π) as defined in eq. (29):

D(P‖P̃) = Σ_{ξπ} p(ξ, π) ln [ p(ξ, π) / p̃(ξ, π) ]
        = Σ_{ξπ} p(ξ, π) ln [ p(ξ, π) / (p(ξ)q(π)) ] + λ∗⟨d(ξ, π)⟩ + ⟨ln Z_π(λ∗)⟩
        = I(Ξ, Π) + λ∗⟨d(ξ, π)⟩ − λ∗ φ_I(λ∗).  (40)

The latter equation can be rewritten as:

λ∗ φ_I(λ∗) = λ∗⟨d(ξ, π)⟩ + I(Ξ, Π) − D(P‖P̃).  (41)

Considering that the three terms on the rhs are all positive, it is easy to demonstrate that the maximum of the potential φ_I(λ∗) is obtained for D(P‖P̃) = 0, that is, when the joint probability is the ideal one for which the RDF is achieved.

VIII. A SIMPLE PHYSICAL MODEL

We apply the formalism to a simple model: a set of single particles in a box in gas phase. We assume that every particle is under a feedback and that it is in thermal equilibrium with a bath at temperature T. The Hamiltonian of the system is given by E = Ap², where A = 1/2m, m being the particle mass. The particle state, ξ = (r, p), is characterized by a homogeneous spatial distribution for the position r within the box, and a normal distribution of the momentum, p. Therefore, neglecting position, we identify the particle state just with the momentum, ξ = p. The equilibrium distribution is a normal distribution with 0 mean value and a variance σ_ξ² = k_B T/2A.

The model includes a feedback that can probe the momentum of the particle. The feedback uses a distance d(ξ, π) = (ξ − π)². The measurement is affected by error, so the measurement π is a random variable, statistically correlated with the dynamical state of the particle. The model assumes that between every probe and manipulation the system is allowed to relax to thermal equilibrium, which means that at every measurement the system is found in the same equilibrium state. Finally, we also assume that π is distributed like a normal random variable. The situation can be formalized as in [37] by assuming that the feedback and the source are connected by a channel with Gaussian noise (see Fig. 3). The Gaussian noise has distribution:

p(z) = N(0, Γ) = (1/√(2πΓ)) e^{−z²/(2Γ)}.  (42)

For simplicity we assume that the source has mean value equal to zero (µ_ξ = 0), p(ξ) = N(0, σ_ξ²). This assumption is totally immaterial for the generality of the discussion.

FIG. 3. (color online) Scheme of a channel between two jointly Gaussian random variables.

Following [37], the feedback has a PDF equal to:

p(π) = N(0, σ_ξ² − Γ) = (1/√(2π(σ_ξ² − Γ))) e^{−π²/(2(σ_ξ²−Γ))}.  (43)

In the latter we have assumed that 0 ≤ Γ ≤ σ_ξ². With the previously defined distance functional, the average distance Γ is equal to the mean square error ⟨(ξ − π)²⟩.

For this simple model the form of the rate distortion function is well known:

R(Γ) = (1/2) ln(σ_ξ²/Γ),  (44)

with R(Γ) = 0 for Γ ≥ σ_ξ². It is a simple matter of calculation to demonstrate that indeed eq. 27, using p(ξ), p(π) and d(ξ, π) = (ξ − π)², gives back the correct rate distortion function. The relation between λ and Γ is very simple:

λ = 1/(2Γ).  (45)

We can insert the PDF and the distance functional in eq. 30 in order to get the optimal joint distribution for such marginal PDFs. The solution is a joint Gaussian distribution:

p̃(ξ, π) = 1/(2πσ_ξσ_π√(1 − ρ²)) exp( −(1/(2(1 − ρ²))) [ ξ²/σ_ξ² − 2ρξπ/(σ_ξσ_π) + π²/σ_π² ] ),  (46)

with correlation coefficient equal to ρ = √(1 − Γ/σ_ξ²). Using the result in eq. 45 we obtain the relation between λ and the correlation coefficient:

λ = 1/(2σ_ξ²(1 − ρ²)).  (47)

As expected, if Γ is related to the distance between state and estimation by the feedback, λ has a simple interpretation in terms of the correlation between the two random variables. In the case of random vectors (with N elements) the total average distance Γ = Σ_i^N Γ_i, with Γ_i the distance associated to the i-th component. Clearly the value
Γ has an extensive character, being a function of the number of degrees of freedom (N) of the system. On the contrary, λ is the intensive counterpart, being connected to the correlation between state and estimate.

In this particular case we can also calculate the optimal work that such a feedback apparatus can recover from the system as a function of the average distance Γ, using eq. 37:

    \langle W \rangle = T k_B R(\Gamma) = -\frac{T k_B}{2} \ln(1 - \rho^2) = T k_B I(\Xi, \Pi) .    (48)

The latter shows that the maximum extracted work is proportional, as it should be, to the mutual information between two Gaussian random variables.

IX. CONCLUSIONS

In this work we have analyzed a system under feedback using typicality and large deviation theory. This connection provides a simple and natural link between the way the measurement is performed, its accuracy, and the maximum work that can be extracted in terms of entropy reduction. Clearly, in this discussion we have not considered the effect of the manipulation by the feedback and its cost in terms of overall efficiency; thus all our results must be considered as boundary limits.

The main result of the paper is the possibility, using rate distortion theory, of developing the equivalent of thermodynamic potentials even in the case of systems under an external feedback.

This perspective on the problem is particularly appealing not only because it establishes a clean relation between the measurement and the respective entropy and information functionals, but also because it can be generalized to the large class of problems where large deviation theory applies.

X. ACKNOWLEDGEMENTS

We acknowledge Prof. Merhav for the useful comments.
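Eq. 48 is straightforward to check numerically for the scalar Gaussian channel of eq. 42. The sketch below (Python with NumPy; the source variance, distortion Γ, temperature and sample size are illustrative assumptions, not values from the paper) draws correlated samples of state and estimate and compares the correlation-based expression −½ ln(1 − ρ²) with the closed-form mutual information of the additive Gaussian channel:

```python
import numpy as np

# Numerical check of eqs. 47-48 for the scalar Gaussian channel:
# source xi ~ N(0, sigma2), feedback estimate pi = xi + z with z ~ N(0, Gamma).
# sigma2, Gamma, T and the sample size n are illustrative assumptions.
rng = np.random.default_rng(0)
sigma2, Gamma, n = 1.0, 0.5, 500_000

xi = rng.normal(0.0, np.sqrt(sigma2), n)   # state of the system
z = rng.normal(0.0, np.sqrt(Gamma), n)     # measurement (channel) noise
pi = xi + z                                # feedback's estimate of the state

rho = np.corrcoef(xi, pi)[0, 1]            # empirical correlation coefficient
I_rho = -0.5 * np.log(1.0 - rho**2)        # mutual information as in eq. 48
I_channel = 0.5 * np.log(1.0 + sigma2 / Gamma)  # closed form for this channel

lam = 1.0 / (2.0 * sigma2 * (1.0 - rho**2))     # intensive parameter, eq. 47

kB, T = 1.380649e-23, 300.0                # Boltzmann constant, room temperature
W_max = kB * T * I_rho                     # bound on extractable work, eq. 48

print(f"I from rho: {I_rho:.4f}, channel value: {I_channel:.4f}, lambda: {lam:.3f}")
```

For Γ → 0 (a perfect measurement) ρ → 1 and the bound diverges, while for Γ ≫ σ_ξ² the mutual information, and hence the extractable work, vanishes, in line with the rate distortion interpretation.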

[1] H. S. Leff and A. F. Rex, "Maxwell Demon: Entropy, Information, Computing", Princeton University Press, Princeton, NJ (1990).
[2] H. S. Leff and A. F. Rex, "Maxwell Demon 2: Entropy, Classical and Quantum Information, Computing", Institute of Physics, Bristol (2003).
[3] L. Szilard, "On the Decrease of Entropy in a Thermodynamic System by the Intervention of Intelligent Beings", Z. Phys. 53, 840 (1929); English translation reprinted in Behavioral Science 9, 301 (1964).
[4] H. Touchette and S. Lloyd, Phys. Rev. Lett. 84, 1156 (2000).
[5] A. E. Allahverdyan, D. Janzing and G. Mahler, J. Stat. Mech. P09011 (2009).
[6] J. Horowitz, T. Sagawa and J. M. R. Parrondo, Phys. Rev. Lett. 111, 010602 (2013).
[7] J. M. Horowitz and M. Esposito, Phys. Rev. X 4, 031015 (2014).
[8] J. M. Horowitz and H. Sandberg, New J. Phys. 16, 125007 (2014).
[9] N. Shiraishi, S. Ito, K. Kawaguchi and T. Sagawa, New J. Phys. 17, 045012 (2015).
[10] D. Hartich, A. C. Barato and U. Seifert, J. Stat. Mech. P02016 (2014).
[11] S. Toyabe, T. Sagawa, M. Ueda, E. Muneyuki and M. Sano, Nature Physics 6, 988 (2010).
[12] T. Sagawa and M. Ueda, Phys. Rev. Lett. 100, 080403 (2008).
[13] T. Sagawa and M. Ueda, Phys. Rev. Lett. 104, 090602 (2010).
[14] T. Sagawa and M. Ueda, Chap. 6 in "Nonequilibrium Statistical Physics of Small Systems", Wiley-VCH (2013).
[15] S. Hilbert, P. Haenggi and J. Dunkel, Phys. Rev. E 90, 062116 (2014).
[16] J. M. R. Parrondo, J. M. Horowitz and T. Sagawa, Nature Phys. 11, 131 (2015).
[17] A. Gagliardi and A. Pecchia, arXiv:1503.02824v1 (2015).
[18] H. Touchette, Phys. Rep. 478, 1-69 (2009).
[19] N. Merhav, IEEE Trans. Inform. Theory 54, no. 8, 3710-3721 (2008).
[20] A. Kis and A. Zettl, Philos. Trans. R. Soc. A 366, 1591 (2008).
[21] J. R. Gomez-Solano, L. Bellon, A. Petrosyan and S. Ciliberto, Europhys. Lett. 89, 60003 (2010).
[22] L. Bellon, L. Buisson, S. Ciliberto and F. Vittoz, Rev. Sci. Instrum. 73, 3286 (2002).
[23] A. Berut, A. Arakelyan, A. Petrosyan, S. Ciliberto, R. Dillenschneider and E. Lutz, Nature 483, 187 (2012).
[24] J. V. Koski, V. F. Maisi, T. Sagawa and J. P. Pekola, Phys. Rev. Lett. 113, 030601 (2014).
[25] A. C. Barato, D. Hartich and U. Seifert, Phys. Rev. E 87, 042104 (2013).
[26] S. Ito and T. Sagawa, Nat. Comm. 6, 7498 (2015).
[27] A. H. Lang, C. K. Fisher, T. Mora and P. Mehta, Phys. Rev. Lett. 113, 148103 (2014).
[28] P. Sartori, L. Granger, C. F. Lee and J. M. Horowitz, PLoS Comput. Biol. 10, e1003974 (2014).
[29] R. G. Endres and N. S. Wingreen, Phys. Rev. Lett. 103, 158101 (2009).
[30] H. Qian and T. C. Reluga, Phys. Rev. Lett. 94, 028101 (2005).
[31] P. Mehta and D. Schwab, Proc. Natl Acad. Sci. USA 109, 17978 (2012).
[32] Y. Tu, Proc. Natl Acad. Sci. USA 105, 11737 (2008).
[33] N. Merhav, J. Stat. Mech. P01029, doi:10.1088/1742-5468/2011/01/P01029 (2011).
[34] U. Seifert, Rep. Prog. Phys. 75, 126001 (2012).
[35] C. Jarzynski, "Nonequilibrium equality for free energy differences", Phys. Rev. Lett. 78, 2690 (1997).
[36] F. Rezakhanlou and C. Villani, "Entropy Methods for the Boltzmann Equation", Springer-Verlag, Berlin Heidelberg (2008).
[37] T. M. Cover and J. A. Thomas, "Elements of Information Theory", Wiley (2006).
[38] H. Touchette and R. J. Harris, Chap. 11 in "Nonequilibrium Statistical Physics of Small Systems", Wiley-VCH (2013).
[39] R. S. Ellis, "Entropy, Large Deviations and Statistical Mechanics", Springer, New York (1985).
[40] R. S. Ellis, Physica D 133, 106-136 (1999).
[41] Y. Oono, Progr. Theoret. Phys. Suppl. 99, 165-205 (1989).
[42] R. S. Ellis, Scand. Actuar. J. 1, 97-142 (1995).
[43] O. E. Lanford, "Entropy and equilibrium states in classical statistical mechanics", in A. Lenard (Ed.), Statistical Mechanics and Mathematical Problems, vol. 20, Springer, Berlin, pp. 1-113 (1973).
[44] N. Merhav, IEEE ISIT 2008, Toronto, p. 499 (2008).
[45] D. J. Evans and D. J. Searles, Phys. Rev. E 50, 1645 (1994).
[46] G. Gallavotti and E. G. D. Cohen, Phys. Rev. Lett. 74, 2694 (1995).
[47] J. L. Lebowitz and H. Spohn, J. Stat. Phys. 95, 333 (1999).
[48] S. R. S. Varadhan, Comm. Pure Appl. Math. 19, 261 (1966).
[49] N. Merhav, "Statistical Physics and Information Theory", Foundations and Trends in Communications and Information Theory, vol. 6, 1-212 (2009).
[50] T. Berger, "Rate Distortion Theory: A Mathematical Basis for Data Compression", Prentice-Hall, Englewood Cliffs, NJ (1971).
[51] R. M. Gray, "Source Coding Theory", Kluwer Academic Publishers (1990).
