Entropy and the Second Law
Interpretation and Misss-Interpretationsss

Arieh Ben-Naim

This book presents a clear and readable description of one of the most mysterious concepts of physics: entropy.

ISBN-13 978-981-4374-89-7 (pbk)
ISBN-10 981-4354-89-X (pbk)
www.worldscientific.com
Preface
Dear Reader,
I assume that you are browsing through this book and
pondering whether or not to read it. Here are some points
which I hope will help you in making your decision.
List of Abbreviations

1-D   one-dimensional
SMI   Shannon's measure of information
lhs   left-hand side
ln    natural logarithm
log   general logarithm, and in particular, logarithm in base 2
rhs   right-hand side
D     distinguishable
ID    indistinguishable
20-Q  20 questions
Contents

Preface
Acknowledgments
List of Abbreviations
1. Introduction: From Heat Engines to Disorder, Information Spreading, Freedom, and More …
2. Forget about Entropy for a While, Let Us Go and Play Games
3. The Astounding Emergence of Entropy of a Classical Ideal Gas out of Shannon's Measure of Information
4. Examples and Their Interpretations. Challenges for any Descriptor of Entropy
5. Finally, Let Us Discuss the Most Mysterious Second Law of Thermodynamics
Appendix A
Epilogue
Notes
Index
1
Introduction: From Heat Engines to Disorder, Information Spreading, Freedom, and More …
Fig. 1.2 Three spontaneous processes: (a) expansion of a gas, (b) mixing of two gases, and (c) heat transfer from a hot to a cold body.
are quite different processes and that it is far from being clear
that they are all governed by the same law. Perhaps there is
one law for the spontaneous expansion of a gas, another for
the spontaneous mixing of two gases, and still another for the
spontaneous flow of heat from the hot to the cold body.
It was Clausius who realized the common principle and
postulated that there is only one law that governs all these
processes. Even before formulating the Second Law, Clausius’
postulate was an outstanding achievement considering the
fact that none of these processes was understood on a
molecular level. You watch how a colored gas expands and
fills a larger volume. You watch a blue drop of ink mixing
with a glass of water, coloring the entire liquid. You watch the
hot body cooling and the cold body heating up. You watch
all of these with your macroscopic eyes, but you have no idea
what drives these processes, what goes on inside the system
you are watching. Such an insight was not even possible before
the atomic nature of matter was embraced by the scientific
community, allowing us to use our “microscopic eyes” to “see”
what is going on when such processes occur. “Seeing,” even
with our microscopic eyes, is one thing, and understanding
what we see is quite another.
Clausius started from one particular process: The sponta-
neous flow of heat from a hot to a cold body. Based on this
specific process, Clausius defined a new quantity he called
“entropy.” Let dQ > 0 be a small quantity of heat flowing
into a body at a given temperature T . The change in entropy
was defined as3

dS = dQ/T. (1.1)
The letter d stands here for a very small quantity, and
T is the absolute temperature. Q has the units of energy,
and T has the units of temperature. Therefore, the entropy
change has the units of energy divided by temperature.
Clausius’ extraordinary achievement was the enormous
generalization from one spontaneous process to any sponta-
neous process. Clausius postulated that there exists a quantity
he called entropy which is assigned to any macroscopic
system, and when a spontaneous process occurs, the entropy
always increases. This was the birth of the Second Law of
Thermodynamics. This law introduced a new quantity into
the vocabulary of physics, and at the same time brought
together many processes under the same umbrella.
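To make Eq. (1.1) concrete, here is a minimal numerical sketch (mine, not the book's; the temperatures and the amount of heat are arbitrary illustrative values) of a small quantity of heat leaving a hot body and entering a cold one:

```python
# A minimal sketch (illustrative numbers): Clausius' dS = dQ/T for a small
# amount of heat dQ leaving a hot body and entering a cold body.
dQ = 1.0        # joules; assumed small enough that each T stays constant
T_hot = 400.0   # K
T_cold = 300.0  # K

dS_hot = -dQ / T_hot     # the hot body loses heat: its entropy decreases
dS_cold = +dQ / T_cold   # the cold body gains heat: its entropy increases
dS_total = dS_hot + dS_cold

print(f"dS_hot   = {dS_hot:+.5f} J/K")
print(f"dS_cold  = {dS_cold:+.5f} J/K")
print(f"dS_total = {dS_total:+.5f} J/K (positive when heat flows hot -> cold)")
```

The total entropy change is positive whenever heat flows from the hotter to the colder body, which is exactly the spontaneous direction Clausius postulated.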
S = k log W, (1.2)

d = |n − N/2|. (1.3)

Fig. 1.5 Which system is more ordered, the right-hand or the left-hand figure? The upper or the lower figure?
If you read the entire book, you will not encounter any of
the “verification” that is promised in this quotation, nor will
you find a plausible definition of disorder.16
In a very recent book Greene (2011) writes:
“More entropy means more disordered particles.”
“This law … established that disorder — entropy — increases
over time.”
ΔS = nR ln(2V/V) = nR ln 2, (1.8)
where n is the number of moles, and R is the gas constant.
You can repeat the same process at different temperatures; as long as you have an ideal gas, the change in entropy in this process is the same nR ln 2, independent of the temperature (as well as of the energy of the gas).
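One can check this temperature independence directly. The sketch below uses the standard Sackur–Tetrode expression for the classical ideal-gas entropy (the closed form that emerges from the SMI treatment in Chapter 3); the choice of argon, one mole, and the particular temperatures are my own illustrative assumptions, not values from the book:

```python
import math

# Sketch (assumed parameters: argon, one mole, ~22.4 L): the entropy change
# for the expansion V -> 2V from the Sackur-Tetrode equation equals
# n*R*ln(2) at every temperature.
kB = 1.380649e-23    # J/K
h = 6.62607015e-34   # J*s
NA = 6.02214076e23   # 1/mol
R = kB * NA

def sackur_tetrode(N, V, T, m):
    """Entropy (J/K) of a classical monatomic ideal gas."""
    lam = h / math.sqrt(2 * math.pi * m * kB * T)  # thermal de Broglie wavelength
    return N * kB * (math.log(V / (N * lam**3)) + 2.5)

m = 39.948e-3 / NA   # mass of one argon atom (kg)
N, V = NA, 0.0224    # one mole in 22.4 liters

for T in (10.0, 300.0, 5000.0):
    dS = sackur_tetrode(N, 2 * V, T, m) - sackur_tetrode(N, V, T, m)
    print(f"T = {T:6.0f} K:  dS = {dS:.4f} J/K   (R ln 2 = {R * math.log(2):.4f})")
```

The temperature-dependent terms cancel in the difference, leaving exactly R ln 2 at each temperature.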
But according to Lambert, the change in entropy in this
process must also depend on the temperature. Unfortunately,
this is in disagreement with both experiments and theoretical
calculations. More on this in Chapters 4 and 5.
A more serious attempt to support the spreading metaphor
was published by Leff.23 I will not go into a detailed criticism of Leff's article.23 All I will say here is that I am not
convinced that a bona fide “spreading function” exists which
has the properties postulated by Leff. For more details, see
Appendix A.
Fig. 1.9 (a) Two systems at the same P, T , N , each having two phases
at equilibrium. (b) We bring the two systems to thermal contact. What
will happen?
2
Forget about Entropy for a While, Let Us Go and Play Games
Fig. 2.2 Two games with (a) eight boxes and (b) 16 boxes.

Fig. 2.3 Two strategies of asking questions: (a) the "dumbest" strategy and (b) the smartest strategy.
one. More efficient in the sense that “on average” we shall need
to ask fewer questions to obtain the required information. We
shall also refer to these kinds of questions as “smart questions.”
The second conclusion is that the game with the larger
number of boxes requires more questions in order to find
the coin, compared to the game with the smaller number of
boxes.
These two conclusions were derived for the specific
examples in Fig. 2.2. It turns out that these conclusions have
more general validity.6 Here, we provide some qualitative
plausible arguments. Suppose we have a game with N equally
probable boxes (that means that we have selected the box in
which to place the coin at random, giving each box the same
“chances” or the same probability). Suppose that N is of the
form

N = 2^n, (2.1)
where n is an integer. In this case the two strategies of asking
questions are:
Dumbest:
1. Is the coin in box number 1?
2. Is the coin in box number 2?
and so on …
Smartest: At each step, we divide the total number of boxes
into exactly two halves [this is possible because we have chosen
N to be of the form in Eq. (2.1)], and ask: Is the coin in the
rhs or the lhs half of the boxes?
Intuitively, it is clear that by using the smartest strategy
we shall find the coin in exactly n steps. The reason is simple:
At each step, we divide into two halves, hence the sizes of the
two groups of boxes at each step are
Initial number: N = 2^n
First step: (N/2 = 2^(n−1), N/2 = 2^(n−1))
Second step: (N/4 = 2^(n−2), N/4 = 2^(n−2))
…
(n − 1)th step: (1, 1)
nth step: we find the coin.

Thus, for N = 2^n, we shall find the coin in exactly n steps, which is given by

n = log2 N = log2 2^n. (2.2)
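The following sketch (an illustration of mine, not taken from the book) simulates both strategies for N = 2^5 = 32 equally probable boxes and compares the average number of questions:

```python
import math
import random

# Sketch: average number of questions for the two strategies, N = 2**5 boxes.
def questions_dumbest(box, N):
    # Ask "is it box 1?", "is it box 2?", ...; after N-1 NOs the last box is known.
    return min(box + 1, N - 1)

def questions_smartest(box, N):
    # Halve the remaining range at every step (binary search).
    lo, hi, q = 0, N, 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        q += 1
        if box < mid:
            hi = mid
        else:
            lo = mid
    return q

N = 2**5
boxes = [random.randrange(N) for _ in range(100_000)]  # coin placed at random
avg_dumb = sum(questions_dumbest(b, N) for b in boxes) / len(boxes)
avg_smart = sum(questions_smartest(b, N) for b in boxes) / len(boxes)
print(f"dumbest : {avg_dumb:.2f} questions on average")
print(f"smartest: {avg_smart:.2f} questions (log2 N = {math.log2(N):.0f})")
```

The smartest strategy always needs exactly log2 N = 5 questions here, while the dumbest needs about N/2 on average.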
Initial number: N = 2^n
First step: (1, N − 1)
If the first step fails, the second step: (1, N − 2)
and so on.
Failing the (N − 2)th step, we reach the (N − 1)th step: (1, 1).
At the Nth step, we get the answer.
only in the smartest one, which we shall use to define the “size”
of the game, or the “size” of the required information.
Note that we could have used the number of objects N
as a measure of the “size” of the game. The larger N is, the
harder it is to play the game. We could also have chosen N^2 or N^10 or exp[N] to measure the size of the game. However,
one choice has a particularly simple interpretation:
For any game or any experiment with N equally probable outcomes, we define the Shannon measure of information (SMI) by

SMI = log2 N.
Fig. 2.5 Three possible divisions of a board into two areas: (a) symmetrical, (b) asymmetrical, (c) extreme asymmetrical.
These three cases are shown in Fig. 2.5. In the first case,
you must ask one question in order to know where the dart
hit. In the third case, you do not have to ask any question;
being given the distribution (0, 1) is equivalent to having the
knowledge that the dart hit the right-hand area, i.e. you know
where the dart is. In any intermediate case of 0 < p < 1/2,
you feel that you “know” more than in the first case but less
than the third case. On the other hand, in all cases except
the third one, one must ask one question in order to find
out where the dart is. Thus, we cannot use the “number of
questions” to quantify the amount of “information” we have
in the general case of ( p, 1 − p). To do so, we modify the rules
of the game.
Again, you know that the dart hit the board, and you are
given the distribution ( p, 1 − p), i.e. you know the areas A1
and A2 . You choose one area and ask if the dart is in that area.
If the answer is YES, you win, and if the answer is NO, you
ask: Is the dart in the second area, and the answer you will
certainly get is YES.
The difference between the rules of this game and the
previous ones is simple. When you have only two possibilities,
you will know where the dart is after the first question,
whatever the answer is. However, in the modified game, if
the answer is NO, we add one more “verification” step, for
which the answer is always YES.
Now let us calculate the number of questions in this
modified game.7
For the symmetrical case, the probability of obtaining a YES answer on the first step is 1/2, and of obtaining a NO answer is also 1/2. For the general distribution (p, 1 − p), the average number of questions in the modified game is
(1 − p) × 1 + p × 2 = 1 + p. (2.7)
This result means that for 0 < p < 1/2, the average
number of questions is between 1 and 1.5, provided we
always start by asking whether the dart is in the area of
higher probability (i.e. the one with probability 1 − p, if
0 < p < 1/2). If we decide to start by asking on the area of
the lower probability, the average number of questions will be
p × 1 + (1 − p) × 2 = 2 − p. (2.8)
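A short simulation (my own sketch; the value p = 0.2 is an arbitrary choice) confirms both averages for the modified game:

```python
import random

# Sketch: the modified two-area game with distribution (p, 1-p), p = 0.2.
def questions(p, ask_larger_first):
    in_small = random.random() < p                 # dart hits the small area
    hit_first = (not in_small) if ask_larger_first else in_small
    return 1 if hit_first else 2                   # NO costs a verification step

p, n = 0.2, 200_000
avg_large = sum(questions(p, True) for _ in range(n)) / n
avg_small = sum(questions(p, False) for _ in range(n)) / n
print(f"larger area first : {avg_large:.3f}   (1 + p = {1 + p:.3f})")
print(f"smaller area first: {avg_small:.3f}   (2 - p = {2 - p:.3f})")
```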
Fig. 2.8 A board of area A is divided into five regions of area Ai such that A = Σi Ai.
written as
Ai · Aj = φ for any i ≠ j, (2.13)
where Ai · Aj means the intersection of the two events, i.e.
both Ai and Aj occurred simultaneously; φ is the empty event,
the probability of which is zero
p(Ai · Aj ) = p(φ) = 0. (2.14)
For instance, the experiment of throwing a single die has
six outcomes: 1, 2, 3, 4, 5, 6. If the die is fair, or well balanced,
the probability distribution of this game is p1 = p2 = p3 =
p4 = p5 = p6 = 1/6. We say that in this case, the distribution
is uniform. On the other hand, the distribution of the events
“the dart fell in Ai ” is not uniform.
Qualitatively, knowing the distribution is equivalent to
having some information about the experiment. The question
posed by Shannon is how we can measure the “size” of this
information.
Before we elaborate on Shannon’s measure of information,
consider the following simple example. I tossed a coin, looked
at the outcome, and you have to guess which outcome has
occurred. If you guess correctly you win a prize, if not, you
get nothing.
Consider the following two cases:
A. I tell you that the coin is fair. This means that the
probability distribution is uniform, and the probability
of “heads” (H ), or “tails” (T ) is 1/2, 1/2.
B. I tell you that the coin is “unfair”, it is heavier on
one side, and that the probability distribution of the
the outcome for certain, but it is more likely that the result
is H rather than T . When p = 7/10, 1 − p = 3/10 again you
know that H is more likely, but the degree of certainty is lower
(or the degree of uncertainty is higher) than in the previous
case. When p = 1/2, 1 − p = 1/2, your degree of certainty is
the lowest (or the degree of uncertainty is the highest). Next,
suppose that p = 3/10, 1 − p = 7/10. In this case, the degree of uncertainty is smaller than in the uniform case; you should guess
that T is the outcome, and you will win 70% of the time. In
the case p = 1/10, 1 − p = 9/10, if your guess is T , you will
win 90% of the time. When p = 0, 1 − p = 1, you know
that the result is T , and you can win 100% of the time when
you guess T . In this case, you have been given maximum
information.
Thus, when p changes from zero to a half (and 1 − p
changes from one to a half ), the amount of certainty as to
the outcome of the experiment changes from certainty to a
minimum (or the uncertainty changes to a maximum). The
same is true when p changes from 1 to 1/2.
Shannon sought a measure of the information contained
in a more general distribution, say, p1 , . . . , pn . For instance,
a coin is hidden in one of n boxes (Fig. 2.9). We are given the
distribution p1 , . . . , pn , i.e. pi is the probability that the coin
is in box number i.
As we can see from the quotation in Section 2.1, Shannon
started with the general concept of information. Information
can be subjective or objective, it can be interesting or dull, or
even meaningless, it can be important or totally irrelevant.
Then he restricted himself to one kind of information,
Fig. 2.9 Games with different distributions.
(loge 2) d²H/dp² = −1/(1 − p) − 1/p = −1/[p(1 − p)] ≤ 0. (2.22)
Thus, the second derivative is always negative, which
means that the function is concave downwards (Fig. 2.7).
Note that the quantity p log2 p tends to zero when p tends to zero. This can be easily seen by using L'Hôpital's theorem, i.e.

lim(x→0) (x log x) = lim(x→0) [log x / (1/x)] = lim(x→0) [(d/dx) log x] / [(d/dx)(1/x)] = lim(x→0) (1/x)/(−1/x²) = lim(x→0) (−x) = 0. (2.23)
converges or diverges.
The case of a continuous distribution is problematic.10 If
we start from the discrete case and proceed to the continuous
limit, we get into some difficulties. We shall not discuss this
problem here. Here, we shall follow Shannon’s treatment for
a continuous distribution for which a probability density
function f (x) exists. In analogy with the definition of the
H function for a discrete probability distribution, we define
the quantity H for a continuous distribution. Let f (x)
be the density distribution, i.e. f (x)dx is the probability
of finding the random variable having values between x
and x + dx. We define the H function as

H = −∫(−∞ to ∞) f(x) log f(x) dx. (2.35)
A similar definition applies to an n-dimensional distribution
function f (x1 , . . . , xn ).
In Chapter 3, we shall use the definition (2.35) to derive
the entropy of an ideal gas.
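As a numerical illustration of Eq. (2.35) (my sketch, not from the book), one can integrate −f ln f for a normal density, whose H is known in closed form to be (1/2) ln(2πeσ²) in natural-log units:

```python
import math

# Sketch: evaluate Eq. (2.35) numerically for a normal density and compare
# with the closed form H = (1/2) ln(2*pi*e*sigma^2) (natural logs, "nats").
sigma = 1.5
dx = 0.001
xs = [i * dx for i in range(-20000, 20001)]   # grid from x = -20 to x = +20

def f(x):
    return math.exp(-x * x / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))

H_numeric = sum(-f(x) * math.log(f(x)) for x in xs) * dx
H_exact = 0.5 * math.log(2 * math.pi * math.e * sigma**2)
print(f"numerical H = {H_numeric:.6f} nats")
print(f"exact     H = {H_exact:.6f} nats")
```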
(1/4, 1/4, 1/4, 1/4). In case (b), you are told that the distribution
is
p1 = A1/A = 9/10, p2 = p3 = p4 = 1/30.
Which game is easier to play? If you have any problems
answering this question, try to play these two games and
“experiment” with different strategies of asking questions.
You will soon find that the best way of playing game (a)
is to divide the four regions into two groups, say, (1, 2) and
(3, 4). Once you find out in which group the dart is in, you
will have to ask one more question to find where the dart is.
On the other hand, in game (b), you can do better. If you
ask: Is the dart in region 1? The probability of getting a YES
answer is 9/10, and a NO is 1/10.
Clearly, if you play these two games many times, you will
need on average to ask fewer questions for case (b) than for
case (a). Case (c) in Fig. 2.9 is somewhere in between (a)
and (b).
The SMI for the three games is:

SMI(1/4, 1/4, 1/4, 1/4) = 2

SMI(9/10, 1/30, 1/30, 1/30) = 0.627

SMI(1/2, 1/4, 1/8, 1/8) = 1.75.
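These values follow directly from SMI(p1, . . . , pn) = −Σ pi log2 pi; a few lines of code (my sketch) reproduce them:

```python
import math

# Sketch: SMI(p1, ..., pn) = -sum p_i log2 p_i for the three games of Fig. 2.9.
def smi(ps):
    return -sum(p * math.log2(p) for p in ps if p > 0)

print(f"SMI(1/4, 1/4, 1/4, 1/4)     = {smi([1/4] * 4):.3f}")
print(f"SMI(9/10, 1/30, 1/30, 1/30) = {smi([9/10, 1/30, 1/30, 1/30]):.3f}")
print(f"SMI(1/2, 1/4, 1/8, 1/8)     = {smi([1/2, 1/4, 1/8, 1/8]):.3f}")
```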
Fig. 2.10 (a) Game with four outcomes; (b) the SMI values for all possible divisions of the outcomes into two groups.
pa; (pb, pc, pd)   (p = pa, 1 − p = pb + pc + pd)   0.97
pb; (pa, pc, pd)   (p = pb, 1 − p = pa + pc + pd)   0.72
pc; (pa, pb, pd)   (p = pc, 1 − p = pa + pb + pd)   0.61
pd; (pa, pb, pc)   (p = pd, 1 − p = pa + pb + pc)   0.81
(pa, pb); (pc, pd)   (p = pa + pb, 1 − p = pc + pd)   0.97
(pa, pc); (pb, pd)   (p = pa + pc, 1 − p = pb + pd)   0.99
(pa, pd); (pb, pc)   (p = pa + pd, 1 − p = pb + pc)   0.93
p1 = 9/10, p2 = p3 = · · · = p10 = 1/90;
There are several other interpretations of the term SMI,
or H in Shannon’s notation. We shall specifically refrain
from using the meaning of H as a measure of “entropy,”
or as a measure of disorder. The first is an outright mis-
leading term (see Chapter 3). The second, although very
commonly used, is very problematic. First, because the
terms “order” and “disorder” are fuzzy concepts. Order and
disorder, like beauty, are very subjective terms and lie in the
eye of the beholder. Second, many examples can be given
showing that the amount of information does not correlate
with what we perceive as order or disorder [see Ben-Naim
(2008)].
and

qj = Σ(i=1 to n) p(i, j) = pY(j). (2.38)
inequality holds13:

H(q1, . . . , qn) = −Σ(i=1 to n) qi log qi ≤ −Σ(i=1 to n) qi log pi. (2.41)
From Eqs. (2.36) to (2.40), we obtain

H(X) + H(Y) = −Σi pX(i) log pX(i) − Σj pY(j) log pY(j)
= −Σ(i,j) p(i, j) log pX(i) − Σ(i,j) p(i, j) log pY(j)
= −Σ(i,j) p(i, j) log[pX(i)pY(j)]. (2.42)

Applying inequality (2.41) to the two distributions p(i, j) and pX(i)pY(j) gives

H(X, Y) = −Σ(i,j) p(i, j) log p(i, j) ≤ −Σ(i,j) p(i, j) log[pX(i)pY(j)] = H(X) + H(Y). (2.43)

Thus, we found

H(X, Y) ≤ H(X) + H(Y). (2.44)
The equality holds if and only if the two random variables are independent, i.e.

p(i, j) = pX(i)pY(j) for all i and j. (2.45)

Whence

H(X, Y) = H(X) + H(Y). (2.46)

The conditional probability of the event yj given the event xi is defined by

p(yj/xi) = p(xi · yj)/p(xi). (2.47)
H(Y/X) = H(X, Y) − H(X). (2.49)
Thus, H (Y /X ) measures the difference between the SMI
of X and Y and the SMI of X . This can be rewritten as
H (X , Y ) = H (X ) + H (Y /X )
= H (Y ) + H (X /Y ). (2.50)
From Eq. (2.44) and Eq. (2.50), we also obtain the inequality
H (Y /X ) ≤ H (Y ) (2.51)
and similarly,
H(X/Y) ≤ H(X).
Inequality (2.51) means that the SMI of Y can never increase by knowing X. Alternatively, H(Y/X) is the average uncertainty that remains about Y when X is known. This uncertainty is never larger than the uncertainty about Y. If X and Y are independent, then the equality sign in Eq. (2.51) holds.
I (X ; Y ) ≡ H (X ) + H (Y ) − H (X , Y ). (2.52)
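The following sketch (with an arbitrary joint distribution of my own choosing) computes all the quantities defined above and checks inequality (2.44) together with the definition (2.52):

```python
import math

# Sketch: an arbitrary joint distribution p(i, j) over two binary variables.
p = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}

def H(dist):
    return -sum(q * math.log2(q) for q in dist if q > 0)

pX = [sum(v for (i, j), v in p.items() if i == a) for a in (0, 1)]  # marginal of X
pY = [sum(v for (i, j), v in p.items() if j == b) for b in (0, 1)]  # marginal of Y

HX, HY, HXY = H(pX), H(pY), H(p.values())
print(f"H(X) = {HX:.4f}, H(Y) = {HY:.4f}, H(X,Y) = {HXY:.4f}")
print(f"H(Y/X) = {HXY - HX:.4f} <= H(Y), H(X/Y) = {HXY - HY:.4f} <= H(X)")
print(f"I(X;Y) = {HX + HY - HXY:.4f} >= 0, so H(X,Y) <= H(X) + H(Y)")
```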
[Diagram: the schematic relationships among H(X, Y), H(X), H(Y), H(X/Y) and H(Y/X).]
If the equations deter you, you can read the prose in between the equations.
I hope that by doing so, you will get a rudimentary under-
standing of what SMI is, and be ready to comprehend the
meaning of entropy as introduced in Chapter 3.
Exercise: Before we plunge into thermodynamics, I suggest
that you pause and think about two quantities that you
are familiar with in connection with the 20-Q game. The
analogues of these two quantities will be important in the
understanding of entropy.
Let us denote by W the number of objects from which
we select one object. Let S be the minimal number of binary
questions you need to ask in order to find out which object
out of W was chosen.
Answer the following questions:
1. Can you describe in one word either W or S?
2. Is either of the two quantities W or S subjective or
objective?
3. I tell you that W is the “number of objects from which I
chose one object.” Which of the following statements are
acceptable as a valid interpretation of S?
(a) S is a measure of W .
(b) S is a measure of freedom I have in choosing one object
out of W objects.
(c) S is a measure of sharing of the information on the
W objects.
(d) S is a measure of the disorder in the W objects.
(e) S is a measure of the spreading of energy among the
W objects.
3
The Astounding Emergence of Entropy of a Classical Ideal Gas out of Shannon's Measure of Information
Fig. 3.1 A particle in a 1-D "box" of length L. We are interested in the probability of finding the center of the particle between x and x + dx.
Fig. 3.2 The locational vector Ri of the ith particle.
Fig. 3.3 Two different configurations (a) and (b) of two distinguish-
able particles become a single configuration (c) when the particles are
indistinguishable.
H^ID(1, 2) = log(M^2/2). (3.11)

The difference between H^D(1, 2) and H^ID(1, 2) can be
interpreted in terms of mutual information (see Section 2.6),
i.e.
H ID (1, 2) = H (1) + H (2) − I (1; 2), (3.12)
where
I (1; 2) = log 2 > 0. (3.13)
In the more general case of N indistinguishable particles in M cells, such that N ≪ M, we have

H^D(1, 2, . . . , N) = Σ(i=1 to N) H(i) = log M^N (3.14)
Fig. 3.6 The most probable (mp), the mean and the root mean square
(rms) speed.
Figure 3.6 shows the most probable, the mean and the root
mean square speed. The most probable (mp) speed is simply the
location vmp for which f ∗ has the maximum, i.e. the solution
of the equation df /dv = 0. The result is
vmp = √(2kB T/m). (3.35)
On the other hand, the average speed is calculated from
⟨v⟩ = ∫(0 to ∞) v f*(v) dv = √(8kB T/πm). (3.36)
Another useful average is the root mean square speed vrms ,
which is defined by
vrms = [∫(0 to ∞) v² f*(v) dv]^(1/2) = √(3kB T/m). (3.37)
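As a numerical check of Eqs. (3.35)-(3.37), one can integrate the standard Maxwell–Boltzmann speed distribution f*(v) directly; the sketch below, with argon at 300 K as an arbitrary illustration, is mine and not from the book:

```python
import math

# Sketch: numerical check of Eqs. (3.35)-(3.37) for argon at T = 300 K.
kB = 1.380649e-23                 # J/K
m = 39.948e-3 / 6.02214076e23     # mass of one argon atom (kg)
T = 300.0

def f_star(v):
    """Maxwell-Boltzmann speed distribution (probability density in speed)."""
    a = m / (2 * math.pi * kB * T)
    return 4 * math.pi * a**1.5 * v * v * math.exp(-m * v * v / (2 * kB * T))

dv = 0.1
vs = [i * dv for i in range(1, 50_000)]            # speeds up to 5000 m/s
mean_v = sum(v * f_star(v) for v in vs) * dv       # <v>
mean_v2 = sum(v * v * f_star(v) for v in vs) * dv  # <v^2>
v_mp = max(vs, key=f_star)                         # location of the maximum

print(f"v_mp  = {v_mp:7.1f}  vs sqrt(2 kB T/m)      = {math.sqrt(2*kB*T/m):7.1f} m/s")
print(f"<v>   = {mean_v:7.1f}  vs sqrt(8 kB T/pi m) = {math.sqrt(8*kB*T/(math.pi*m)):7.1f} m/s")
print(f"v_rms = {math.sqrt(mean_v2):7.1f}  vs sqrt(3 kB T/m)      = {math.sqrt(3*kB*T/m):7.1f} m/s")
```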
H^ID(1, 2, . . . , N) = H^D_max(R^N) + H^D_max(p^N) − log N! − 3N log h. (3.44)
SB = kB ln W , (3.47)
Fig. 3.7 Maximum of a function f(x), (a), and of a function f(x, y), (b).
Fig. 3.8 The general form of the entropy function S(E , V , N ), with
respect to E , V and N .
Fig. 3.9 Schematic relationship between entropy, SMI and the general concept of information (see text).
4
Examples and Their Interpretations. Challenges for any Descriptor of Entropy
Fig. 4.3 The actual events that we see when the blue-colored gas expands from V to 2V.
Fig. 4.4 A process of mixing of two different gases (a) and "mixing" the same gas (b). In both processes, the volume accessible to each particle changes from V to 2V. (c) The process of mixing as seen with our eyes.
Fig. 4.5 A process of mixing two different gases (a) and "mixing" the same gas (b). Here, the volume accessible to each particle does not change.
ΔS_III = 0. (4.6)

Δ(SMI) = 0. (4.7)
not made for the mixing process depicted in Fig. 4.5(a), but rather for the process depicted in Fig. 4.4(a). That comment is correct; nevertheless, the conclusion reached above is also correct, independently of the kind of mixing process we discuss. In the next two sections, we examine two more processes involving mixing.
this process

Δ(SMI) = 2N log2(2V/V) = 2N. (4.9)
Exercise: Calculate the change in entropy and the change in SMI for the more general case where NA ≠ NB, the temperature is the same in the two subsystems, and the volumes of the two subsystems are also the same.
In Eqs. (4.8) and (4.9), we added an intermediate step
before the cancellation of the volume V . This is important
because it means that the change in entropy, as well as the
change in SMI, is due only to the change in the accessible
volume for each particle. In other words, the change in entropy
is due to the change in the locational SMI of the particles.
There are altogether NA + NB = 2N particles; the volume
accessible for each has increased from V to 2V . This is exactly
the same cause of the change in the entropy as in the expansion
of the ideal gas shown in Fig. 4.1. The only difference is that
we have NA + NB particles here compared with N particles
in the expansion process. Therefore, the process depicted in
Fig. 4.4(a) is thermodynamically equivalent to two processes
of expansion; NA particles expanding from V to 2V , and
another NB particles expanding from V to 2V . The identity
of the two particles is inconsequential for this process. The
mixing in itself did not contribute anything to the change in
the entropy of a system.
This is a remarkable result. Take any two different gases;
say, argon and xenon, or methane and ethane, and mix them as
in the process of Fig. 4.4(a); the change in entropy is the same
most of the A particles (say N′A) will flow to the rhs volume V, and most of the B particles (say N′B) will flow to the lhs volume V.
Clearly, the larger the volume V , the more complete is the
demixing of the initial mixture of A and B.
We now close the impermeable walls and calculate the
change in entropy for this process either from thermodynam-
ics or from statistical mechanics; the result is
ΔS = N′A kB ln(V/v) + N′B kB ln(V/v). (4.10)
We can also calculate the change in the locational SMI in
this process
Δ(SMI) = N′A log2(V/v) + N′B log2(V/v). (4.11)
We note also that we can demix reversibly the mixture that remains in the original volume v, which consists of (NA − N′A)
2^(2N) = (1 + 1)^(2N) = Σ(i=0 to 2N) (2N)!/[i!(2N − i)!] > (2N)!/(N!N!). (4.19)
ΔS ≈ 2NkB ln 2 − kB 2N ln 2 = 0. (4.20)
The change in the SMI for this process, and for any finite N, is

Δ(SMI) = 2N log2 2 − log2[(2N)!/(N!)²] > 0. (4.21)
In calculating the SMI of a system of N indistinguishable
particles, we are not committed to a macroscopically large N .
The change in SMI in Eq. (4.21) is valid for any finite N .
In general, one must be careful to use the correct statistics to
calculate the total number of configurations.4 However, for
our purpose in this section, suffice it to say that the nearly-zero
change in entropy in this process arises from two effects; there
is an increase in the locational SMI for each particle, and there
is a change in the mutual information between the initial and
the final state. The two effects approximately cancel out each
other in the limit of very large N .
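A quick computation of Eq. (4.21) (my sketch) shows both facts at once: Δ(SMI) is positive for every finite N, while per particle it becomes negligible as N grows:

```python
import math

# Sketch: Eq. (4.21), Delta(SMI) = 2N - log2((2N)!/(N!)^2), computed via lgamma.
def d_smi(N):
    log2_binom = (math.lgamma(2 * N + 1) - 2 * math.lgamma(N + 1)) / math.log(2)
    return 2 * N - log2_binom

for N in (10, 100, 1000, 10_000):
    d = d_smi(N)
    print(f"N = {N:6d}: Delta(SMI) = {d:8.3f} bits, per particle = {d/(2*N):.2e}")
```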
It should be noted that Gibbs, who gave much thought to
this process and the comparison with the process of mixing of
different gases [Fig. 4.4(a), process I] was well aware that when
we remove the partition between the two compartments,
there is an expansion process, in the sense that each particle
originally being confined to a volume V can now access the
volume 2V . However, Gibbs also realizes that the particles
are indistinguishable. Therefore, he concluded that there is
no way we can reverse the process II, in the sense of bringing
back each particle that originated from the left compartment
back to the left compartment, and each particle originating
from the right compartment back to the right compartment.
Gibbs also noted that the spontaneous mixing in process I
can be reversed. This will require investing work; a possible
Fig. 4.8 Reversing the process of mixing in two steps: first, demix the mixture reversibly, then compress each gas from 2V to V.
Fig. 4.10 (a) A molecule, alanine, with a chiralic center. (b) Spontaneous process of deassimilation.
Fig. 4.12 Process of "delocalization" and "assimilation" of N particles.
Fig. 4.15 The change in entropy in the second step in the process
of Fig. 4.13.
dS = dSA + dSB = dQ/TA + (−dQ)/TB = dQ(TB − TA)/(TA TB) > 0. (4.29)
Fig. 4.18 (a) Heat transfer from the high-energy system to the low-energy system, E1 > E2. (b) Heat transfer from the low-energy system to the high-energy system.
Fig. 4.22 The excluded volume vEX around a particle of diameter σ.
I(1, 2, . . . , N) = −(1/2) N(N − 1) ∫∫ p(r1, r2) ln g(r1, r2) dr1 dr2. (4.30)
Fig. 4.24 (a) Ten compartments, each containing an ideal gas of the same compound A. The fractions of molecules in each compartment in the initial state are indicated. (b) After removal of the partitions, the fraction of molecules in each compartment is 1/10.
4.8 Conclusion
The reader is urged to study each of the examples discussed in
this chapter and to try to apply his or her favorite descriptor
of entropy. The reader is also urged to “invent” new processes
which defy an interpretation based on the common entropy
descriptors. It is only after working with these examples that
one can appreciate the superior interpretive power of the SMI.
5
Finally, Let Us Discuss the Most
Mysterious Second Law
of Thermodynamics
The title of the book Entropy and The Second Law of Ther-
modynamics was chosen to reflect the two issues associated
with the Second Law. One is concerned with the meaning of entropy, or with the question: What is entropy? The second concerns the Second Law, or the question: Why does entropy always increase (in a spontaneous process in an isolated system)? The answers to these two questions are quite different. Understanding what entropy is leaves the "Why" question unanswered. This distinction between
the two questions is not always understood, even by scientists
who attempt to explain the entropy and the Second Law. In
my previous book, I wrote1 : “It should be stressed again that
the interpretation of entropy as a measure of information cannot
be used to explain the Second Law of Thermodynamics.”
(a) N = 2
Suppose we have a total of N = 2 particles. In this
case, we have the following possible configurations and
the corresponding probabilities:
n = 0, n = 1, n = 2,

PN(0) = 1/4, PN(1) = 1/2, PN(2) = 1/4. (5.9)
This means that on average, we can expect to find the con-
figuration n = 1 (i.e. one particle in each compartment)
about half of the time, and the configurations n = 0 and
n = 2 only a quarter of the time (Fig. 5.2).
(b) N = 4
For the case N = 4, we have the distribution as shown
in Fig. 5.2. We see again that the maximal probability
is PN (2) = 6/16 = 0.375, which is smaller than 1/2. In
this case, the system will spend only 3/8 of the time in the
maximal state n∗ = 2.
(c) For N = 10, we can calculate the maximum at n∗ = 5, which is PN(5) = 252/1024 ≈ 0.246.
Fig. 5.3 The maximum value of log W (n) and the maximum value
of the probability PN (n) for large N .
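These probabilities are just PN(n) = C(N, n)/2^N; a short sketch (mine, not from the book) reproduces the numbers quoted in the examples above:

```python
from math import comb

# Sketch: P_N(n) = C(N, n) / 2**N; the maximal term P_N(N/2) for small N.
for N in (2, 4, 10):
    P_max = comb(N, N // 2) / 2**N
    print(f"N = {N:2d}: P_N({N // 2}) = {P_max:.4f}")
# prints 0.5000 (= 1/2), 0.3750 (= 6/16), 0.2461 (= 252/1024)
```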
SMI ( p, q) = −p ln p − q ln q. (5.16)
Fig. 5.6 Symbolic relationship between the questions What and Why.
distribution is

Pr(p1, . . . , pn) = C N!/(∏(i=1 to n) Ni!), (5.19)
where C is a normalization constant.
From one of Shannon’s theorems,7 we know that the
Maxwell–Boltzmann distribution is the one that maximized
the SMI, subject to two conditions
pi = 1, (5.20)
m 2
pi vi = const. (5.21)
2
For large N, we use the Stirling approximation for ln Pr:

ln Pr = ln C + ln N! − Σi ln Ni!
≈ ln C + N ln N − Σi Ni ln Ni
= ln C − N Σi pi ln pi = ln C + N × SMI. (5.22)
Clearly, the distribution that maximizes the SMI, subject to the two conditions Eqs. (5.20) and (5.21), is the same as the distribution that maximizes ln Pr, hence also Pr itself.
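A numerical check of Eq. (5.22) (my sketch, with an arbitrary three-state distribution) shows that ln(N!/∏ Ni!)/N indeed approaches the SMI (in natural-log units) as N grows:

```python
import math

# Sketch: ln(N!/prod Ni!)/N approaches the SMI (natural logs) as N grows.
p = [0.5, 0.3, 0.2]
smi_nats = -sum(pi * math.log(pi) for pi in p)

for N in (100, 10_000, 1_000_000):
    Ni = [round(pi * N) for pi in p]                 # occupation numbers N_i
    ln_pr = math.lgamma(N + 1) - sum(math.lgamma(n + 1) for n in Ni)
    print(f"N = {N:9d}: ln(N!/prod Ni!)/N = {ln_pr / N:.6f}   SMI = {smi_nats:.6f}")
```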
This is essentially the same argument as given in the first
example. The conclusion is also the same: The answer to the
question of “Why the system evolves towards the equilibrium
state” is simply that the equilibrium state is the state of
maximal probability. Because Pr is related to the SMI by a
monotonic increasing function, it is also correct to say that
Furthermore, he adds:
“Therefore, statistical mechanics, or an exclusive focus on prob-
ability considerations in entropy change, smuggle-in entropy
change by ignoring its enablement by molecular motional
energy.”
In my opinion, Lambert’s criticism is unjustified for two
reasons. First, to the best of my knowledge, no one has
implied that “matter can spread out without any involvement
of energy.” Second, to the best of my knowledge, no one is
smuggling-in “entropy change by ignoring its enablement.”
Indeed, statistical mechanics deals with probabilities, and
with these probabilities, one can calculate all average quan-
tities of interest, and in particular, entropy changes between
different states. Statistical mechanics, like thermodynamics,
is not concerned with the way or the path leading from state
A to state B. It does not ignore the molecular motional energy.
It simply does not need it for the calculation of entropy
difference. The fact is that matter is composed of atoms and
molecules, and that these atoms and molecules possess energy;
this is undeniable. It is not used explicitly in some calculations
of entropy simply because it is not needed, not because it is
ignored. The simplest example where energy is not needed is
the case of a gas expanding from a volume V to 2V .
When one mole of ideal gas expands from volume V to
2V , the entropy change is R ln 2. This is independent of the
molecular motion. The same entropy change is calculated
for the expansion process at any temperature, even at very
low temperatures. Similarly, the entropy change associated
with the mixing of two ideal gases is independent of the
temperature.
a vertical axis going through the faces “1” and “6”, and
initially the die shows the face “6” upwards [Fig. 5.8(c)].
I tell you that I gave the die a spin, and it responds by
swirling around while still stuck to the fluid. What is the
probability of the outcome “4”? The answer is zero. So is
the answer for any other number except for “6”, which has
a probability of one.
I hope that these examples are sufficient in illustrating
what I have said about the probability being a conditional
probability.
The moral of all these examples is that whenever we ask
what is the probability of, say, “4” in throwing a die, we
implicitly imply, although we do not say it explicitly, that
the die has been thrown, or will be thrown in such a way that
one of the possible results has occurred or will occur. Of course,
one must specify the condition (which is different in each of the examples discussed above). But once you specify the condition, then whenever you say P(A), you always mean the conditional probability P(A/condition).
What does all this die business have to do with the gas
expanding from one compartment to another?
The answer is: When we speak of the probability of occurrence of any molecular event, say, a particle moving from one compartment to another, it is always assumed that an experiment is done, and one of the possible outcomes has occurred
(or will occur). The “enablement” of the occurrence of the out-
come is always assumed implicitly. Moreover, the insistence
on mentioning the molecular motions is not only unnecessary
for the discussion of what entropy is and why it increases in
a spontaneous process, it might also be misleading.
If you have enough “patience,” you will see all the 1000
particles in one compartment once in many billions of ages
of the universe. Again, there is no conflict with Newtonian
mechanics.
The calculation made above is only for N = 1000. For a thermodynamic system with N = 10^23, the probability of
all the particles in the system, the state of the system will be
reversed. In particular, if we start with the gas filling the two
compartments, or the two mixed gases in two compartments,
and reverse the direction of the velocities of all the particles, we
shall observe the gas condensing in one compartment, and the
mixture of the two gases separating into two compartments.
Does that mean that the entropy will also decrease after
the reversal of the velocities? Most of the discussions about
the association of the arrow of time with the Second Law
overlook the difference between the two questions:
1. Will the reversal of all velocities bring the system to its
initial microscopic state, i.e. from state (b) to state (c) in
Fig. 5.10?
2. Will the reversal of all velocities cause a decrease in the
entropy?
Regarding the first question, the answer was already
provided above. The system will return to its initial state.
One can ask how long it will take to get back to the initial
state, but in principle, we can be sure that it will.18
Thus, while we have no difficulty in answering the first
question in the affirmative, the answer to the second depends
on how we define the macroscopic equilibrium state.
is also valid for the process of expansion, i.e. from (a) to (b)
in Fig. 5.10.
In the expansion of the gas from volume V to 2V , we
start with an initial equilibrium state defined by (E , V , N )
for which the entropy is S(E , V , N ). After the expansion
has occurred, we have a new equilibrium state and a new
value of the entropy S(E , 2V , N ) which is higher than the
value S(E , V , N ). In this process, the value of the entropy did
increase from the initial value of S(E, V, N) to the final value S(E, 2V, N), in some period of time, say, Δt. However, there exists no function S(t) that gives us the dependence of entropy on time. In the particular example of the expansion process, the time interval Δt, or the "speed" of the expansion process, could depend on the size of the opening between the two compartments; the smaller the opening, the longer it will take for the system to get from the initial equilibrium state to the final equilibrium state, but the change in the value of the entropy, ΔS, will be the same R ln 2. The speed could also depend on the temperature, but again ΔS will be the same. Similarly, in
the heat transfer from a body of high temperature to a body
of low temperature, the entropy change of the entire system is
fixed by the initial and final equilibrium states. On the other
hand, the time it takes to proceed from one equilibrium state
to the other would depend on the thermal conductivity of the
partition separating between the two bodies.
Thus, at the very moment when we remove the partition
in the expansion process, the system is not in an equilibrium
state and therefore it is meaningless to talk about the entropy
of the system19 when it is moving from one equilibrium state
to another. The variation of the entropy with time will look
like the one in Fig. 5.11(a), where t0 is the time we removed the
constraint (here the partition), and t1 is the time the system
reached a new equilibrium state.
We could do expansion processes in very small steps,
each time opening a small door on the partition for a short
period of time then closing it and waiting until equilibrium is
reached, then opening it again and so on and so forth.19 In this
case, the entropy will change with time as in Fig. 5.11(b). Note,
however, that the stepwise functional dependence of S on t is
not because there exists a function S(t), but because we chose to
open and close the door at some arbitrary sequence of times.19
We could, of course, calculate the distribution of the
locations and velocities of the particles at each point of
time. We could also define the SMI associated with these
distributions, and follow the value of the SMI as it changes
with time. However, the entropy of the system is obtained
only for those distributions that maximize the SMI, i.e. for
the equilibrium distribution. This reasoning suggests that the
best way to extend thermodynamics into the realm of non-
equilibrium states is by using the concept of SMI. The SMI
is well defined for any distribution, whereas the entropy is
defined only for the distributions that maximize the SMI.
5.7 Conclusion
It is time to summarize the message of the book in a few
sentences.
Two caveats before attempting an interpretation of entropy
and the Second Law:
1. Descriptors of either the state of the system or of changes
in the state of the system are not necessarily the same as
the descriptor of entropy.
2. Interpretation of W as the number of accessible
microstates of a system at equilibrium is not necessarily
the same as an interpretation of log W , or entropy.
With these two caveats in mind my summary of the book in
brief is:
1. Entropy is a special case of Shannon’s measure of infor-
mation. Special, in the sense that SMI is defined for any
probability distribution. Entropy is an SMI defined on a
very limited range of probability distributions (see Fig. 3.9
for orientation). As such, the interpretation of entropy is
derived from the interpretation of the SMI.
2. The Second Law of thermodynamics states that in a
spontaneous process in an isolated system, entropy can
only increase. The reason for the spontaneous change in
the state of the system is simply because the probability of
the new equilibrium state is far larger than the probability
of the initial state. The reason for the increase in the
entropy of the system in a spontaneous process in an
Appendix A
Notes
13. Use the inequality ln(pi/qi) ≤ pi/qi − 1. Multiplying by qi and summing over i, we get Σi qi ln(pi/qi) ≤ Σi pi − Σi qi = 0, from which Eq. (2.41) follows.
14. The quantity H (Y /X ) is sometimes denoted by HX (Y ).
15. Note that sometimes instead of I (X ; Y ), the notation
H (X ; Y ) is used. This is potentially a confusing nota-
tion because of the similarities between H (X , Y ) and
H (X ; Y ).
Notes to Chapter 5
1. Ben-Naim (2008).
2. See Lambert’s “review” of my book Entropy Demystified,
posted on entropysite.oxy.edu, and on Amazon.com.
3. See any book on probability, e.g. Papoulis and Pillai,
4th edition (2002). A shorter discussion is also available
in Ben-Naim (2008). It should be said that the high
probability of an event does not mean that “probability”
is the cause of that event. Any definition of probability
is circular. However, when we are considering a very
large probability ratio, say, between two events, then we
can safely use the frequency definition of probability to
infer that if you start from one event and repeat the
experiment many times, the event having the higher
probability will occur most of the time. There is an
awkward statement in Brillouin’s book (1962): “The
probability has a tendency to increase and so thus entropy.”
I believe this is a slip of the tongue. The state of a system
changes from low-probability states to high-probability
states. The probabilities of the states do not change
with time.
4. This fact has caused some confusion in the literature. Though it is true that as N becomes larger, W(n∗) increases with N, the ratio of W(n∗) and 2^N decreases with N.
5. If your answer is any number different from zero, you
should reread the whole of Section 5.1 or consult any
book on probability. Note that we did not remove the
partition in this “experiment.” If, on the other hand,
you are wondering why I asked such a silly question,
read Section 5.4.
6. This can be computed exactly from Eq. (5.3).
P100(50 − 1 ≤ n ≤ 50 + 1) ≅ 0.235
P1000(500 − 10 ≤ n ≤ 500 + 10) ≅ 0.493
P10000(5000 − 100 ≤ n ≤ 5000 + 100) ≅ 0.955.
Atkins P. (2007) Four Laws that Drive the Universe. Oxford University
Press, Oxford, UK.
Baierlein R. (1994) Entropy and the Second Law: A pedagogical
alternative. Am J Phys 62: 15.
Battino R. (2007) “Mysteries” of the First and Second Laws of
Thermodynamics. J Chem Educ 84: 753.
Ben-Naim A. (1987) Is Mixing a Thermodynamic Process? Am J Phys
55: 725.
Ben-Naim A. (2006) A Molecular Theory of Solutions. Oxford University
Press, Oxford.
Ben-Naim A. (2007) Entropy Demystified, The Second Law of Ther-
modynamics Reduced to Plain Common Sense. World Scientific,
Singapore.
Ben-Naim A. (2008) A Farewell to Entropy: Statistical Thermodynamics
Based on Information. World Scientific, Singapore.
Ben-Naim A. (2009) An Informational–Theoretical Formulation of the
Second Law of Thermodynamics. J Chem Educ 86: 99.
Ben-Naim A. (2010) Discover Entropy and the Second Law of Ther-
modynamics: A Playful Way of Discovering a Law of Nature, World
Scientific, Singapore.
Ben-Naim A. (2011) Entropy: Order or Information. J Chem Educ
88: 594.
Bent HA. (1965) The Second Law. Oxford University Press, New York.
Boltzmann L. (1877) Vienna Academy 42 “Gesammelte Werke” p. 193.
Boltzmann L. (1896) Lectures on Gas Theory. Translated by Brush SG
(1995). Dover, New York.