
Appendix A

ENTROPY AND INFORMATION



In Chapter 4, we mentioned that Boltzmann was able to establish a relationship between entropy and missing information. In this appendix, we will look in detail at his reasoning.
The reader will remember that Boltzmann’s statistical mechanics (seen from a modern point of view) deals with an ensemble of N weakly-interacting identical systems which may be in one or another of a set of discrete states, i = 1, 2, 3, ..., with energies ε_i, the number of systems occupying a particular state being denoted by n_i:

$$\begin{array}{lcccccc}
\text{State number} & 1 & 2 & 3 & \cdots & i & \cdots \\
\text{Energy} & \epsilon_1 & \epsilon_2 & \epsilon_3 & \cdots & \epsilon_i & \cdots \\
\text{Occupation number} & n_1 & n_2 & n_3 & \cdots & n_i & \cdots
\end{array} \qquad (A.1)$$

A “macrostate” of the N identical systems can be specified by writing down the energy levels and their occupation numbers. This macrostate can be constructed in many ways, and each of these ways is called a “microstate”. From combinatorial analysis it is possible to show that the number of microstates corresponding to a given macrostate is given by:

$$W = \frac{N!}{n_1!\,n_2!\,n_3!\cdots n_i!\cdots} \qquad (A.2)$$
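Since (A.2) is simply the multinomial coefficient, it can be checked numerically. The following minimal Python sketch (our own illustration, not part of Boltzmann’s argument; the function names are invented for this example) computes W for a small macrostate and confirms the count by brute-force enumeration of microstates.

```python
import math
from itertools import product

def microstate_count(occupations):
    """W = N! / (n1! n2! ...) for a macrostate -- equation (A.2)."""
    N = sum(occupations)
    W = math.factorial(N)
    for n in occupations:
        W //= math.factorial(n)  # exact integer division at each step
    return W

def brute_force_count(occupations):
    """Assign each of the N systems a state label and count the
    assignments that reproduce the given occupation numbers."""
    N = sum(occupations)
    states = range(len(occupations))
    return sum(
        1
        for assignment in product(states, repeat=N)
        if all(assignment.count(s) == occupations[s] for s in states)
    )

occ = [2, 1, 1]  # N = 4 systems: two in state 1, one each in states 2 and 3
print(microstate_count(occ))   # 12
print(brute_force_count(occ))  # 12 -- agrees with (A.2)
```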
Boltzmann assumed that for very large values of N, the most probable macrostate predominates over all others. He also assumed that the amount of energy which is shared by the N identical systems has a constant value, E, so that

$$\sum_i n_i \epsilon_i - E = 0 \qquad (A.3)$$


He knew, in addition, that the sum of the occupation numbers must be equal to the number of weakly-interacting identical systems:

$$\sum_i n_i - N = 0 \qquad (A.4)$$

It is logical to assume that all microstates which fulfill these two conditions are equally probable, since the N systems are identical. It then follows that the probability of a particular macrostate is proportional to the number of microstates from which it can be constructed, i.e., proportional to W, so that if we wish to find the most probable macrostate, we need to maximize W subject to the constraints (A.3) and (A.4). It turns out to be more convenient to maximize ln W subject to these two constraints, but maximizing ln W will of course also maximize W. Using the method of undetermined Lagrange multipliers, we look for an absolute maximum of the function

$$\ln W - \lambda\left(\sum_i n_i - N\right) - \beta\left(\sum_i n_i \epsilon_i - E\right) \qquad (A.5)$$

Having found this maximum, we can use the conditions (A.3) and (A.4) to determine the values of the Lagrangian multipliers λ and β. For the function shown in equation (A.5) to be a maximum, it is necessary that its partial derivative with respect to each of the occupation numbers shall vanish. This gives us the set of equations

$$\frac{\partial}{\partial n_i}\left[\ln(N!) - \sum_i \ln(n_i!)\right] - \lambda - \beta\epsilon_i = 0 \qquad (A.6)$$

which must hold for all values of i. For very large values of N and n_i, Stirling’s approximation,

$$\ln(n_i!) \approx n_i\left(\ln n_i - 1\right) \qquad (A.7)$$

can be used to simplify the calculation. With the help of Stirling’s approximation and the identity

$$\frac{\partial}{\partial n_i}\left[n_i\left(\ln n_i - 1\right)\right] = \ln n_i \qquad (A.8)$$

we obtain the relationship

$$-\ln n_i - \lambda - \beta\epsilon_i = 0 \qquad (A.9)$$

which can be rewritten in the form

$$n_i = e^{-\lambda - \beta\epsilon_i} \qquad (A.10)$$


and for the most probable macrostate, this relationship must hold for all values of i. Substituting (A.10) into (A.4), we obtain:

$$N = \sum_i n_i = e^{-\lambda}\sum_i e^{-\beta\epsilon_i} \qquad (A.11)$$

so that

$$\frac{n_i}{N} = \frac{e^{-\beta\epsilon_i}}{\sum_i e^{-\beta\epsilon_i}} \equiv \frac{e^{-\beta\epsilon_i}}{Z} \qquad (A.12)$$

where

$$Z \equiv \sum_i e^{-\beta\epsilon_i} \qquad (A.13)$$
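To make the result concrete, here is a small Python sketch (our own illustration, using arbitrary energy levels) that evaluates the partition function Z of equation (A.13) and the occupation probabilities of equation (A.12) for a given β.

```python
import math

def boltzmann_distribution(energies, beta):
    """Return (Z, probabilities) for discrete levels at inverse
    temperature beta, following equations (A.12) and (A.13)."""
    factors = [math.exp(-beta * e) for e in energies]  # Boltzmann factors
    Z = sum(factors)                                   # partition function
    return Z, [f / Z for f in factors]

# Three arbitrary levels, in units where beta * epsilon is dimensionless
energies = [0.0, 1.0, 2.0]
Z, P = boltzmann_distribution(energies, beta=1.0)
print(Z)  # 1 + e^-1 + e^-2 ≈ 1.5032
print(P)  # probabilities fall off with energy and sum to 1
assert abs(sum(P) - 1.0) < 1e-12
```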

The sum Z is called the “partition function” (or in German, Zustandssumme) of a system, and it plays a very central role in statistical mechanics. All of the thermodynamic functions of a system can be derived from it. The factor e^{−βε_i} is called the “Boltzmann factor”. Looking at equation (A.12), we can see that because of the Boltzmann factor, the probability

$$\frac{n_i}{N} = P_i \equiv \frac{e^{-\beta\epsilon_i}}{Z} \qquad (A.14)$$

that a particular system will be in a state i is smaller for the states of high energy than it is for those of lower energy. We mentioned above that the constraints (A.3) and (A.4) can be used to find the values of the Lagrangian multipliers λ and β. The condition

$$E = N\sum_i P_i\,\epsilon_i \qquad (A.15)$$

can be used to determine β. By applying his statistical methods to a monatomic gas at low pressure, Boltzmann found that

$$\beta = \frac{1}{kT} \qquad (A.16)$$

where T is the absolute temperature and k is the constant which appears in the empirical law relating the pressure, volume and temperature of a perfect gas:

$$PV = nkT \qquad (A.17)$$

From experiments on monatomic gases at low pressures, one finds that the “Boltzmann constant” k is given by

$$k = 1.38062 \times 10^{-23}\ \frac{\text{Joules}}{\text{Kelvin}} \qquad (A.18)$$
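As a rough illustration (ours, not from the text), the value of k in (A.18) fixes β at any given temperature, and hence, via (A.14), the relative population of two energy levels. A minimal sketch, assuming an arbitrary illustrative energy gap of 0.025 eV:

```python
import math

k = 1.38062e-23        # Boltzmann constant, J/K, as quoted in (A.18)
T = 300.0              # room temperature, K
beta = 1.0 / (k * T)   # equation (A.16)

delta_e = 0.025 * 1.602e-19  # illustrative 0.025 eV gap, converted to Joules
ratio = math.exp(-beta * delta_e)  # ratio of the two Boltzmann factors
print(ratio)  # ≈ 0.38: the upper level is roughly 2.6x less populated
```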


We mentioned that Boltzmann’s equation relating entropy to disorder is carved on his tombstone. With one minor difference, this equation is

$$S_N = k \ln W \qquad (A.19)$$

(The minor difference is that on the tombstone, the S lacks a subscript.) How did Boltzmann identify k ln W with the entropy of Clausius, dS = dq/T? In answering this question we will continue to use the modern picture of a system with a set of discrete states i, whose energies are ε_i. Making use of Stirling’s approximation, equation (A.7), and remembering the definition of W, (A.2), we can rewrite (A.19) as

$$S_N = k\ln\left(\frac{N!}{n_1!\,n_2!\,n_3!\cdots n_i!\cdots}\right) = k\left[\ln(N!) - \sum_i \ln(n_i!)\right] \approx -kN\sum_i \frac{n_i}{N}\ln\frac{n_i}{N} \qquad (A.20)$$
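The Stirling step in (A.20) is easy to test numerically. The sketch below (our own check, with an arbitrarily chosen macrostate, in units where k = 1) compares the exact ln W with the approximate form −N Σ_i (n_i/N) ln(n_i/N):

```python
import math

occ = [400, 300, 200, 100]      # an arbitrary macrostate with N = 1000
N = sum(occ)

# Exact: ln W = ln N! - sum of ln n_i!   (equations (A.2) and (A.19))
exact = math.lgamma(N + 1) - sum(math.lgamma(n + 1) for n in occ)

# Stirling form: -N * sum (n_i/N) ln(n_i/N)   (equation (A.20))
approx = -N * sum((n / N) * math.log(n / N) for n in occ)

print(exact, approx)  # ≈ 1269.8 vs ≈ 1279.9: within ~1% already at N = 1000
```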

Equation (A.20) gives us the entropy of the entire collection of N identical weakly-interacting systems. The entropy of a single system is just this quantity divided by N:

$$S = \frac{S_N}{N} = -k\sum_i P_i \ln P_i \equiv -k\,\langle \ln P \rangle \qquad (A.21)$$

where P_i = n_i/N, defined by equation (A.14), is the probability that the system is in state i. According to equation (A.14), this probability is just equal to the Boltzmann factor, e^{−βε_i}, divided by the partition function, Z, so that

$$S = -k\sum_i \frac{e^{-\beta\epsilon_i}}{Z}\,\ln\!\left(\frac{e^{-\beta\epsilon_i}}{Z}\right) = \frac{k}{Z}\sum_i e^{-\beta\epsilon_i}\left(\beta\epsilon_i + \ln Z\right) = \frac{1}{T}\sum_i \frac{e^{-\beta\epsilon_i}}{Z}\,\epsilon_i + \frac{k\ln Z}{Z}\sum_i e^{-\beta\epsilon_i} \qquad (A.22)$$

or

$$S = \frac{U}{T} + k\ln Z \qquad (A.23)$$

where

$$U \equiv \sum_i P_i\,\epsilon_i \qquad (A.24)$$
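The identity connecting (A.21) and (A.23) can be verified numerically. A minimal sketch (our own, with arbitrary levels, in units where k = 1 so that β = 1/T):

```python
import math

energies = [0.0, 1.0, 2.0, 3.0]  # arbitrary illustrative levels
T = 1.5
beta = 1.0 / T

factors = [math.exp(-beta * e) for e in energies]
Z = sum(factors)
P = [f / Z for f in factors]

S_direct = -sum(p * math.log(p) for p in P)    # equation (A.21), k = 1
U = sum(p * e for p, e in zip(P, energies))    # equation (A.24)
S_thermo = U / T + math.log(Z)                 # equation (A.23)

print(S_direct, S_thermo)                      # the two values agree
assert abs(S_direct - S_thermo) < 1e-12
```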


The quantity U defined in equation (A.24) is called the “internal energy” of a system. Let us now imagine that a very small change in U is induced by an arbitrary process, which may involve interactions between the system and the outside world. We can express the fact that this infinitesimal alteration in internal energy may be due either to slight changes in the energy levels ε_i or to slight changes in the probabilities P_i by writing:

$$dU = \sum_i P_i\,d\epsilon_i + \sum_i \epsilon_i\,dP_i \qquad (A.25)$$

To the first term on the right-hand side of equation (A.25) we give the name “dw”:

$$dw \equiv \sum_i P_i\,d\epsilon_i \qquad (A.26)$$

while the other term is named “dq”:

$$dq \equiv dU - dw = \sum_i \epsilon_i\,dP_i \qquad (A.27)$$
What is the physical interpretation of these two terms? The first term, dw, involves changes in the energy levels of the system, and this can only happen if we change the parameters defining the system in some way. For example, if the system is a cylinder filled with gas particles and equipped with a piston, we can push on the piston and decrease the volume available to the gas particles. This action will raise the energy levels, and when we perform it we do work on the system, in the sense defined by Carnot: force times distance, the force which we apply to the piston multiplied by the distance through which we push it. Thus dw can be interpreted as a small amount of work performed on the system by someone or something on the outside. Another way to change the internal energy of the system is to transfer heat to it; and when a small amount of heat is transferred, the energy levels do not change, but the probabilities P_i must change slightly, as can be seen from equations (A.13), (A.14) and (A.16). Thus the quantity dq in equation (A.27) can be interpreted as an infinitesimal amount of heat transferred to the system. We have in fact anticipated this interpretation by giving it the same name as the dq of equations (4.2) and (4.3). If the probabilities P_i are changed very slightly, then from equation (A.21) it follows that the resulting small change in entropy is

$$dS = -k\sum_i \left[\ln P_i\,dP_i + dP_i\right] \qquad (A.28)$$
From equations (A.13) and (A.14) it follows that

$$\sum_i P_i = 1 \qquad (A.29)$$


as we would expect from the fact that P_i is interpreted as the probability that the system is in a particular state i. Therefore

$$\sum_i dP_i = d\sum_i P_i = 0 \qquad (A.30)$$

and as a consequence, the second term on the right-hand side of equation (A.28) vanishes. Making use of equation (A.14) to rewrite ln P_i, we then have:

$$dS = -k\sum_i \left(-\beta\epsilon_i - \ln Z\right)dP_i \qquad (A.31)$$

or

$$dS = \frac{1}{T}\sum_i \epsilon_i\,dP_i = \frac{dq}{T} \qquad (A.32)$$
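As a final numerical check (ours, not part of the original argument), one can perturb the probabilities by slightly changing β and confirm that the resulting change in S equals Σ_i ε_i dP_i / T to first order, again with k = 1:

```python
import math

def dist(energies, beta):
    """Boltzmann probabilities of equation (A.14), k = 1."""
    f = [math.exp(-beta * e) for e in energies]
    Z = sum(f)
    return [x / Z for x in f]

def entropy(P):
    """S = -sum P ln P, equation (A.21) with k = 1."""
    return -sum(p * math.log(p) for p in P)

energies = [0.0, 1.0, 2.0]  # arbitrary fixed levels, so dw = 0
T = 2.0
eps = 1e-6                  # a tiny change in inverse temperature

P1 = dist(energies, 1.0 / T)
P2 = dist(energies, 1.0 / T + eps)

dS = entropy(P2) - entropy(P1)
dq = sum(e * (p2 - p1) for e, p2, p1 in zip(energies, P2, P1))  # (A.27)

print(dS, dq / T)  # agree to first order in eps, confirming (A.32)
```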
The somewhat complicated discussion which we have just gone through is a simplified paraphrase of Boltzmann’s argument showing that if he defined entropy to be proportional to ln W (the equation engraved on his tombstone) then the function which he defined in this way must be identical with the entropy of Clausius. (We can perhaps sympathize with Ostwald and Mach, who failed to understand Boltzmann!)
