Thermodynamics and Statistical Mechanics:
An intermediate level course
Richard Fitzpatrick
Associate Professor of Physics
The University of Texas at Austin
1 Introduction
1.1 Intended audience
These lecture notes outline a single semester course intended for upper division
undergraduates.
1.2 Major sources
The textbooks which I have consulted most frequently whilst developing course
material are:
Fundamentals of statistical and thermal physics: F. Reif (McGraw-Hill, New York NY,
1965).
Introduction to quantum theory: D. Park, 3rd Edition (McGraw-Hill, New York NY,
1992).
1.3 Why study thermodynamics?
Thermodynamics is essentially the study of the internal motions of many body
systems. Virtually all substances which we encounter in everyday life are many
body systems of some sort or other (e.g., solids, liquids, gases, and light). Not
surprisingly, therefore, thermodynamics is a discipline with an exceptionally wide
range of applicability. Thermodynamics is certainly the most ubiquitous subfield
of Physics outside Physics Departments. Engineers, Chemists, and Material Scientists
do not study relativity or particle physics, but thermodynamics is an integral,
and very important, part of their degree courses.
Many people are drawn to Physics because they want to understand why the
world around us is like it is. For instance, why the sky is blue, why raindrops are
spherical, why we do not fall through the ﬂoor, etc. It turns out that statistical
thermodynamics can explain more things about the world around us than all
of the other physical theories studied in the undergraduate Physics curriculum
put together. For instance, in this course we shall explain why heat ﬂows from
hot to cold bodies, why the air becomes thinner and colder at higher altitudes,
why the Sun appears yellow whereas colder stars appear red and hotter stars
appear bluish-white, why it is impossible to measure a temperature below −273◦
centigrade, why there is a maximum theoretical efficiency of a power generation
unit which can never be exceeded no matter what the design, why high mass
stars must ultimately collapse to form black holes, and much more!
1.4 The atomic theory of matter
According to the well-known atomic theory of matter, the familiar objects which
make up the world around us, such as tables and chairs, are themselves made up
of a great many microscopic particles.
Atomic theory was invented by the ancient Greek philosophers Leucippus and
Democritus, who speculated that the world essentially consists of myriads of tiny
indivisible particles, which they called atoms, from the Greek atomon, meaning
“uncuttable.” They speculated, further, that the observable properties of everyday
materials can be explained either in terms of the different shapes of the atoms
which they contain, or the different motions of these atoms. In some respects
modern atomic theory differs substantially from the primitive theory of Leucippus
and Democritus, but the central ideas have remained essentially unchanged. In
particular, Leucippus and Democritus were right to suppose that the properties of
materials depend not only on the nature of the constituent atoms or molecules,
but also on the relative motions of these particles.
1.5 Thermodynamics
In this course, we shall focus almost exclusively on those physical properties of
everyday materials which are associated with the motions of their constituent
atoms or molecules. In particular, we shall be concerned with the type of motion
which we normally call “heat.” We shall try to establish what controls the ﬂow of
heat from one body to another when they are brought into thermal contact. We
shall also attempt to understand the relationship between heat and mechanical
work. For instance, does the heat content of a body increase when mechanical
work is done on it? More importantly, can we extract heat from a body in order
to do useful work? This subject area is called “thermodynamics,” from the Greek
roots thermos, meaning “heat,” and dynamis, meaning “power.”
1.6 The need for a statistical approach
It is necessary to emphasize from the very outset that this is a difﬁcult subject.
In fact, this subject is so difﬁcult that we are forced to adopt a radically different
approach to that employed in other areas of Physics.
In all of the Physics courses which you have taken up to now, you were eventually
able to formulate some exact, or nearly exact, set of equations which governed
the system under investigation. For instance, Newton's equations of motion,
or Maxwell's equations for electromagnetic fields. You were then able to
analyze the system by solving these equations, either exactly or approximately.
In thermodynamics we have no problem formulating the governing equations.
The motions of atoms and molecules are described exactly by the laws of quantum
mechanics. In many cases, they are also described to a reasonable approximation
by the much simpler laws of classical mechanics. We shall not be dealing with
systems sufﬁciently energetic for atomic nuclei to be disrupted, so we can forget
about nuclear forces. Also, in general, the gravitational forces between atoms and
molecules are completely negligible. This means that the forces between atoms
and molecules are predominantly electromagnetic in origin, and are, therefore,
very well understood. So, in principle, we could write down the exact laws of
motion for a thermodynamical system, including all of the interatomic forces.
The problem is the sheer complexity of this type of system. In one mole of a
substance (e.g., in twelve grams of carbon, or eighteen grams of water) there are
Avogadro's number of atoms or molecules. That is, about

N_A = 6 × 10^{23}

particles, which is a gigantic number of particles! To solve the system exactly
we would have to write down about 10^{24} coupled equations of motion, with the
same number of initial conditions, and then try to integrate the system. Quite
plainly, this is impossible. It would also be complete overkill. We are not at all
interested in knowing the position and velocity of every particle in the system as a
function of time. Instead, we want to know things like the volume of the system,
the temperature, the pressure, the heat capacity, the coefﬁcient of expansion, etc.
We would certainly be hard put to specify more than about ﬁfty, say, properties
of a thermodynamic system in which we are really interested. So, the number of
pieces of information we require is absolutely minuscule compared to the number
of degrees of freedom of the system. That is, the number of pieces of information
needed to completely specify the internal motion. Moreover, the quantities which
we are interested in do not depend on the motions of individual particles, or on
some small subset of particles, but, instead, depend on the average motions of
all the particles in the system. In other words, these quantities depend on the
statistical properties of the atomic or molecular motion.
The method adopted in this subject area is essentially dictated by the enormous
complexity of thermodynamic systems. We start with some statistical information
about the motions of the constituent atoms or molecules, such as their
average kinetic energy, but we possess virtually no information about the motions
of individual particles. We then try to deduce some other properties of the system
from a statistical treatment of the governing equations. In fact, our approach has
to be statistical in nature, because we lack most of the information required to
specify the internal state of the system. The best we can do is to provide a few
overall constraints, such as the average volume and the average energy.
Thermodynamic systems are ideally suited to a statistical approach because of
the enormous numbers of particles they contain. As you probably know already,
statistical arguments actually get more exact as the numbers involved get larger.
For instance, whenever I see an opinion poll published in a newspaper, I immediately
look at the small print at the bottom where it says how many people were
interviewed. I know that even if the polling was done without bias, which is
extremely unlikely, the laws of statistics say that there is an intrinsic error of order
one over the square root of the number of people questioned. It follows that if
a thousand people were interviewed, which is a typical number, then the error
is at least three percent. Hence, if the headline says that so and so is ahead by
one percentage point, and only a thousand people were polled, then I know the
result is statistically meaningless. We can easily appreciate that if we do statistics
on a thermodynamic system containing 10^{24} particles then we are going to obtain
results which are valid to incredible accuracy. In fact, in most situations we can
forget that the results are statistical at all, and treat them as exact laws of Physics.
For instance, the familiar equation of state of an ideal gas,
P V = νRT,
is actually a statistical result. In other words, it relates the average pressure and
the average volume to the average temperature. However, for one mole of gas the
statistical deviations from average values are only about 10^{-12}, according to the
1/√N law. Actually, it is virtually impossible to measure the pressure, volume, or
temperature of a gas to such accuracy, so most people just forget about the fact
that the above expression is a statistical result, and treat it as a law of Physics
interrelating the actual pressure, volume, and temperature of an ideal gas.
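The 1/√N scaling quoted above is easy to demonstrate numerically. The following Python sketch is not part of the notes; the function name and trial counts are arbitrary choices. It estimates the relative statistical error of a sample average, using coin tosses (p = 1/2) as a stand-in for any statistical measurement:

```python
import random

# Illustrative sketch (not from the notes): estimate the statistical
# error of a sample average as a function of sample size N, using
# coin tosses (p = 1/2) as a stand-in for any statistical measurement.
def relative_error(N, trials=1000):
    """Root-mean-square relative deviation of the sample mean from p."""
    p = 0.5
    sq_dev = 0.0
    for _ in range(trials):
        heads = sum(random.random() < p for _ in range(N))
        sq_dev += (heads / N - p) ** 2
    return (sq_dev / trials) ** 0.5 / p

for N in (100, 400, 1600):
    # The error falls roughly as 1/sqrt(N): quadrupling N should halve it.
    print(N, relative_error(N))
```

Quadrupling the number of "people polled" roughly halves the error, in line with the 1/√N law.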
1.7 Microscopic and macroscopic systems
It is useful, at this stage, to make a distinction between the different sizes of the
systems that we are going to examine. We shall call a system microscopic if it
is roughly of atomic dimensions, or smaller. On the other hand, we shall call a
system macroscopic when it is large enough to be visible in the ordinary sense.
This is a rather inexact deﬁnition. The exact deﬁnition depends on the number
of particles in the system, which we shall call N. A system is macroscopic if

1/√N ≪ 1,
which means that statistical arguments can be applied to reasonable accuracy.
For instance, if we wish to keep the statistical error below one percent then a
macroscopic system would have to contain more than about ten thousand particles.
Any system containing less than this number of particles would be regarded
as essentially microscopic, and, hence, statistical arguments could not be applied
to such a system without unacceptable error.
1.8 Thermodynamics and statistical thermodynamics
In this course, we are going to develop some machinery for interrelating the
statistical properties of a system containing a very large number of particles, via a
statistical treatment of the laws of atomic or molecular motion. It turns out that
once we have developed this machinery, we can obtain some very general results
which do not depend on the exact details of the statistical treatment. These
results can be described without reference to the underlying statistical nature of the
system, but their validity depends ultimately on statistical arguments. They take
the form of general statements regarding heat and work, and are usually referred
to as classical thermodynamics, or just thermodynamics, for short. Historically,
classical thermodynamics was the ﬁrst sort of thermodynamics to be discovered.
In fact, for many years the laws of classical thermodynamics seemed rather
mysterious, because their statistical justification had yet to be discovered. The strength
of classical thermodynamics is its great generality, which comes about because it
does not depend on any detailed assumptions about the statistical properties of
the system under investigation. This generality is also the principal weakness of
classical thermodynamics. Only a relatively few statements can be made on such
general grounds, so many interesting properties of the system remain outside the
scope of this theory.
If we go beyond classical thermodynamics, and start to investigate the statistical
machinery which underpins it, then we get all of the results of classical
thermodynamics, plus a large number of other results which enable the macroscopic
parameters of the system to be calculated from a knowledge of its microscopic
constituents. This approach is known as statistical thermodynamics, and is
extremely powerful. The only drawback is that the further we delve inside the
statistical machinery of thermodynamics, the harder it becomes to perform the
necessary calculations.
Note that both classical and statistical thermodynamics are only valid for systems
in equilibrium. If the system is not in equilibrium then the problem becomes
considerably more difficult. In fact, the thermodynamics of non-equilibrium systems,
which is generally called irreversible thermodynamics, is a graduate level
subject.
1.9 Classical and quantum approaches
We mentioned earlier that the motions (by which we really meant the translational
motions) of atoms and molecules are described exactly by quantum mechanics,
and only approximately by classical mechanics. It turns out that the
non-translational motions of molecules, such as their rotation and vibration, are
very poorly described by classical mechanics. So, why bother using classical
mechanics at all? Unfortunately, quantum mechanics deals with the translational
motions of atoms and molecules (via wave mechanics) in a rather awkward manner.
The classical approach is far more straightforward, and, under most circumstances,
yields the same statistical results. Hence, in the bulk of this course,
we shall use classical mechanics, as much as possible, to describe translational
motions, and reserve quantum mechanics for dealing with non-translational motions.
However, towards the end of this course, we shall switch to a purely quantum
mechanical approach.
2 Probability theory
2.1 Introduction
The ﬁrst part of this course is devoted to a brief, and fairly low level, introduction
to a branch of mathematics known as probability theory. In fact, we do not need
to know very much about probability theory in order to understand statistical
thermodynamics, since the probabilistic “calculation” which underpins all of this
subject is extraordinarily simple.
2.2 What is probability?
What is the scientiﬁc deﬁnition of probability? Well, let us consider an observation
made on a general system S. This can result in any one of a number of different
possible outcomes. We want to ﬁnd the probability of some general outcome X.
In order to ascribe a probability, we have to consider the system as a member of
a large set Σ of similar systems. Mathematicians have a fancy name for a large
group of similar systems. They call such a group an ensemble, which is just the
French for “group.” So, let us consider an ensemble Σ of similar systems S. The
probability of the outcome X is deﬁned as the ratio of the number of systems in
the ensemble which exhibit this outcome to the total number of systems, in the
limit where the latter number tends to infinity. We can write this symbolically as

P(X) = lim_{Ω(Σ)→∞} Ω(X)/Ω(Σ), (2.1)

where Ω(Σ) is the total number of systems in the ensemble, and Ω(X) is the
number of systems exhibiting the outcome X. We can see that the probability
P(X) must be a number between 0 and 1. The probability is zero if no systems
exhibit the outcome X, even when the number of systems goes to inﬁnity. This
is just another way of saying that there is no chance of the outcome X. The
probability is unity if all systems exhibit the outcome X in the limit as the number
of systems goes to inﬁnity. This is another way of saying that the outcome X is
bound to occur.
2.3 Combining probabilities
Consider two distinct possible outcomes, X and Y, of an observation made on
the system S, with probabilities of occurrence P(X) and P(Y), respectively. Let us
determine the probability of obtaining the outcome X or Y, which we shall denote
P(X | Y). From the basic definition of probability

P(X | Y) = lim_{Ω(Σ)→∞} Ω(X | Y)/Ω(Σ), (2.2)

where Ω(X | Y) is the number of systems in the ensemble which exhibit either
the outcome X or the outcome Y. It is clear that

Ω(X | Y) = Ω(X) + Ω(Y) (2.3)

if the outcomes X and Y are mutually exclusive (which must be the case if
they are two distinct outcomes). Thus,

P(X | Y) = P(X) + P(Y). (2.4)
So, the probability of the outcome X or the outcome Y is just the sum of the
individual probabilities of X and Y. For instance, with a six sided die the probability
of throwing any particular number (one to six) is 1/6, because all of the possible
outcomes are considered to be equally likely. It follows from what has just been
said that the probability of throwing either a one or a two is simply 1/6 + 1/6,
which equals 1/3.
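The die example above can be checked mechanically. A minimal Python sketch, not from the notes, using exact fractions to avoid any round-off:

```python
from fractions import Fraction

# A minimal check (not from the notes) of the "or" rule for a fair
# six-sided die. Each face has probability 1/6, and the distinct
# outcomes "one" and "two" are mutually exclusive.
P = {face: Fraction(1, 6) for face in range(1, 7)}

p_one_or_two = P[1] + P[2]     # the sum rule for mutually exclusive outcomes
print(p_one_or_two)            # 1/3

# The six probabilities of a complete set of outcomes sum to unity.
print(sum(P.values()))         # 1
```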
Let us denote all of the M, say, possible outcomes of an observation made on
the system S by X_i, where i runs from 1 to M. Let us determine the probability
of obtaining any of these outcomes. This quantity is clearly unity, from the basic
definition of probability, because every one of the systems in the ensemble must
exhibit one of the possible outcomes. But, this quantity is also equal to the sum
of the probabilities of all the individual outcomes, by (2.4), so we conclude that
this sum is equal to unity. Thus,

∑_{i=1}^{M} P(X_i) = 1, (2.5)
which is called the normalization condition, and must be satisﬁed by any complete
set of probabilities. This condition is equivalent to the self-evident statement that
an observation of a system must deﬁnitely result in one of its possible outcomes.
There is another way in which we can combine probabilities. Suppose that
we make an observation on a state picked at random from the ensemble and
then pick a second state completely independently and make another observation.
We are assuming here that the ﬁrst observation does not inﬂuence the second
observation in any way. The fancy mathematical way of saying this is that the
two observations are statistically independent. Let us determine the probability of
obtaining the outcome X in the ﬁrst state and the outcome Y in the second state,
which we shall denote P(X ⊗ Y). In order to determine this probability, we have
to form an ensemble of all of the possible pairs of states which we could choose
from the ensemble Σ. Let us denote this ensemble Σ ⊗ Σ. It is obvious that the
number of pairs of states in this new ensemble is just the square of the number
of states in the original ensemble, so
Ω(Σ ⊗Σ) = Ω(Σ) Ω(Σ). (2.6)
It is also fairly obvious that the number of pairs of states in the ensemble Σ ⊗ Σ
which exhibit the outcome X in the ﬁrst state and Y in the second state is just the
product of the number of states which exhibit the outcome X and the number of
states which exhibit the outcome Y in the original ensemble, so
Ω(X ⊗Y) = Ω(X) Ω(Y). (2.7)
It follows from the basic definition of probability that

P(X ⊗ Y) = lim_{Ω(Σ)→∞} Ω(X ⊗ Y)/Ω(Σ ⊗ Σ) = P(X) P(Y). (2.8)
Thus, the probability of obtaining the outcomes X and Y in two statistically
independent observations is just the product of the individual probabilities of X and
Y. For instance, the probability of throwing a one and then a two on a six sided
die is 1/6 × 1/6, which equals 1/36.
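The "and" rule can likewise be verified by brute-force enumeration of the ensemble of pairs, in the spirit of Σ ⊗ Σ above. An illustrative Python sketch, not from the notes:

```python
from fractions import Fraction
from itertools import product

# Sketch (not from the notes) of the "and" rule: enumerate the
# ensemble of pairs of die throws, and count pairs showing a one
# on the first throw and a two on the second.
faces = range(1, 7)
pairs = list(product(faces, faces))      # plays the role of Σ ⊗ Σ
hits = sum(1 for a, b in pairs if (a, b) == (1, 2))

print(Fraction(hits, len(pairs)))        # 1/36, i.e. (1/6)(1/6)
```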
2.4 The two-state system
The simplest nontrivial system which we can investigate using probability theory
is one for which there are only two possible outcomes. There would obviously be
little point in investigating a one outcome system. Let us suppose that there are
two possible outcomes to an observation made on some system S. Let us denote
these outcomes 1 and 2, and let their probabilities of occurrence be
P(1) = p, (2.9)
P(2) = q. (2.10)
It follows immediately from the normalization condition (2.5) that
p +q = 1, (2.11)
so q = 1 − p. The best known example of a two-state system is a tossed coin.
The two outcomes are “heads” and “tails,” each with equal probabilities 1/2. So,
p = q = 1/2 for this system.
Suppose that we make N statistically independent observations of S. Let us
determine the probability of n_1 occurrences of the outcome 1 and N − n_1 occurrences
of the outcome 2, with no regard to the order of these occurrences. Denote
this probability P_N(n_1). This type of calculation crops up again and again in
probability theory. For instance, we might want to know the probability of getting nine
"heads" and only one "tails" in an experiment where a coin is tossed ten times, or
where ten coins are tossed simultaneously.
Consider a simple case in which there are only three observations. Let us try to
evaluate the probability of two occurrences of the outcome 1 and one occurrence
of the outcome 2. There are three different ways of getting this result. We could
get the outcome 1 on the ﬁrst two observations and the outcome 2 on the third.
Or, we could get the outcome 2 on the ﬁrst observation and the outcome 1 on
the latter two observations. Or, we could get the outcome 1 on the ﬁrst and
last observations and the outcome 2 on the middle observation. Writing this
symbolically

P_3(2) = P(1 ⊗ 1 ⊗ 2 | 2 ⊗ 1 ⊗ 1 | 1 ⊗ 2 ⊗ 1). (2.12)
This formula looks a bit scary, but all we have done here is to write out
symbolically what was just said in words. Where we said "and" we have written the
symbolic operator ⊗, and where we said "or" we have written the symbolic
operator |. This symbolic representation is helpful because of the two basic rules for
combining probabilities which we derived earlier

P(X | Y) = P(X) + P(Y), (2.13)

P(X ⊗ Y) = P(X) P(Y). (2.14)
The straightforward application of these rules gives

P_3(2) = p p q + q p p + p q p = 3 p^2 q (2.15)

in the case under consideration.
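The two rules can be applied mechanically: enumerate every ordered sequence of three observations, multiply the probabilities within a sequence ("and"), and sum over the qualifying sequences ("or"). A Python sketch, not from the notes, with an arbitrary choice of p:

```python
from itertools import product

# Sketch (not from the notes): enumerate all ordered sequences of
# three observations of a two-state system and reproduce
# P_3(2) = 3 p^2 q. The value p = 0.3 is an arbitrary choice.
p, q = 0.3, 0.7
prob = {1: p, 2: q}

P3_2 = sum(
    prob[a] * prob[b] * prob[c]      # the "and" rule within a sequence
    for a, b, c in product((1, 2), repeat=3)
    if (a, b, c).count(1) == 2       # the "or" rule sums the sequences
)
print(P3_2)                          # ≈ 3 * 0.3**2 * 0.7 = 0.189
```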
The probability of obtaining n_1 occurrences of the outcome 1 in N observations
is given by

P_N(n_1) = C^N_{n_1, N−n_1} p^{n_1} q^{N−n_1}, (2.16)

where C^N_{n_1, N−n_1} is the number of ways of arranging two distinct sets of n_1 and
N − n_1 indistinguishable objects. Hopefully, this is at least plausible from
the example we just discussed. There, the probability of getting two occurrences
of the outcome 1 and one occurrence of the outcome 2 was obtained by writing
out all of the possible arrangements of two ps (the probability of outcome 1) and
one q (the probability of outcome 2), and then adding them all together.
2.5 Combinatorial analysis
The branch of mathematics which studies the number of different ways of
arranging things is called combinatorial analysis. We need to know how many different
ways there are of arranging N objects which are made up of two groups of n_1
and N − n_1 indistinguishable objects. This is a pretty tough problem! Let us try
something a little easier to begin with. How many ways are there of arranging N
distinguishable objects? For instance, suppose that we have six pool balls,
numbered one through six, and we pot one each into every one of the six pockets of a
pool table (that is, top-left, top-right, middle-left, middle-right, bottom-left, and
bottom-right). How many different ways are there of doing this? Well, let us start
with the top-left pocket. We could pot any one of the six balls into this pocket,
so there are 6 possibilities. For the top-right pocket we only have 5 possibilities,
because we have already potted a ball into the top-left pocket, and it cannot be in
two pockets simultaneously. So, our 6 original possibilities combined with these
5 new possibilities gives 6 × 5 ways of potting two balls into the top two pockets.
For the middle-left pocket we have 4 possibilities, because we have already potted
two balls. These possibilities combined with our 6 × 5 possibilities gives 6 × 5 × 4
ways of potting three balls into three pockets. At this stage, it should be clear
that the final answer is going to be 6 × 5 × 4 × 3 × 2 × 1. Well, 6 × 5 × 4 × 3 × 2 × 1
is a bit of a mouthful, so to prevent us having to say (or write) things like this,
mathematicians have invented a special function called a factorial. The factorial
of a general positive integer n is deﬁned
n! = n (n − 1) (n − 2) · · · 3 · 2 · 1. (2.17)
So, 1! = 1, and 2! = 2 × 1 = 2, and 3! = 3 × 2 × 1 = 6, and so on. Clearly, the
number of ways of potting six pool balls into six pockets is 6! (which incidentally
equals 720). Since there is nothing special about pool balls, or the number six, we
can safely infer that the number of different ways of arranging N distinguishable
objects, denoted C_N, is given by

C_N = N!. (2.18)
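The pool-ball count can be checked directly by enumerating the arrangements. An illustrative Python sketch, not from the notes:

```python
from itertools import permutations
from math import factorial

# Check (not from the notes): the number of ways of arranging six
# distinguishable pool balls in six pockets, by explicit enumeration,
# compared with 6! from the factorial formula.
arrangements = list(permutations(range(1, 7)))
print(len(arrangements))   # 720
print(factorial(6))        # 720
```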
Suppose that we take the number four ball off the pool table and replace it
by a second number ﬁve ball. How many different ways are there of potting
the balls now? Well, consider a previous arrangement in which the number five
ball was potted into the top-left pocket and the number four ball was potted into
the top-right pocket, and then consider a second arrangement which only differs
from the first because the number four and five balls have been swapped around.
These arrangements are now indistinguishable, and are therefore counted as a
single arrangement, whereas previously they were counted as two separate
arrangements. Clearly, the previous arrangements can be divided into two groups,
containing equal numbers of arrangements, which differ only by the permutation
of the number four and ﬁve balls. Since these balls are now indistinguishable, we
conclude that there are only half as many different arrangements as there were
before. If we take the number three ball off the table and replace it by a third
number ﬁve ball, we can split the original arrangements into six equal groups
of arrangements which differ only by the permutation of the number three, four,
and five balls. There are six groups because there are 3! = 6 separate permutations
of these three balls. Since the number three, four, and five balls are now
indistinguishable, we conclude that there are only 1/6 the number of original
arrangements. Generalizing this result, we conclude that the number of
arrangements of n_1 indistinguishable and N − n_1 distinguishable objects is

C^N_{n_1} = N!/n_1!. (2.19)
We can see that if all the balls on the table are replaced by number ﬁve balls then
there is only N!/N! = 1 possible arrangement. This corresponds, of course, to a
number ﬁve ball in each pocket. A further straightforward generalization tells us
that the number of arrangements of two groups of n_1 and N − n_1 indistinguishable
objects is

C^N_{n_1, N−n_1} = N!/(n_1! (N − n_1)!). (2.20)
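The count (2.20) is exactly the binomial coefficient, so it can be cross-checked against the standard library. A quick Python sketch, not from the notes, with N = 6 and n_1 = 2 as arbitrary values:

```python
from math import comb, factorial

# Check (not from the notes): the number of distinct arrangements of
# n1 and N - n1 indistinguishable objects equals the binomial
# coefficient "N choose n1". N = 6, n1 = 2 are arbitrary choices.
N, n1 = 6, 2
print(factorial(N) // (factorial(n1) * factorial(N - n1)))   # 15
print(comb(N, n1))                                           # 15
```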
2.6 The binomial distribution
It follows from Eqs. (2.16) and (2.20) that the probability of obtaining n_1
occurrences of the outcome 1 in N statistically independent observations of a
two-state system is

P_N(n_1) = N!/(n_1! (N − n_1)!) p^{n_1} q^{N−n_1}. (2.21)
This probability function is called the binomial distribution function. The reason
for this is obvious if we tabulate the probabilities for the ﬁrst few possible values
of N (see Tab. 1). Of course, we immediately recognize these expressions: they
appear in the standard algebraic expansions of (p + q), (p + q)^2, (p + q)^3, and
(p + q)^4, respectively. In algebra, the expansion of (p + q)^N is called the binomial
expansion (hence, the name given to the probability distribution function), and
        n_1 = 0    n_1 = 1    n_1 = 2      n_1 = 3    n_1 = 4
N = 1   q          p
N = 2   q^2        2 p q      p^2
N = 3   q^3        3 p q^2    3 p^2 q      p^3
N = 4   q^4        4 p q^3    6 p^2 q^2    4 p^3 q    p^4

Table 1: The binomial probability distribution
can be written

(p + q)^N ≡ ∑_{n=0}^{N} N!/(n! (N − n)!) p^n q^{N−n}. (2.22)
Equations (2.21) and (2.22) can be used to establish the normalization condition
for the binomial distribution function:

∑_{n_1=0}^{N} P_N(n_1) = ∑_{n_1=0}^{N} N!/(n_1! (N − n_1)!) p^{n_1} q^{N−n_1} ≡ (p + q)^N = 1, (2.23)

since p + q = 1.
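The normalization condition (2.23) can be verified numerically for any N. A Python sketch, not from the notes, using exact rational arithmetic so the sums come out to exactly unity (p = 1/3 is an arbitrary choice):

```python
from fractions import Fraction
from math import comb

# Check (not from the notes) of the normalization condition: the
# binomial probabilities P_N(n1) sum to (p + q)^N = 1 for any N.
def P(N, n1, p):
    return comb(N, n1) * p**n1 * (1 - p)**(N - n1)

p = Fraction(1, 3)        # exact arithmetic avoids round-off error
for N in (1, 4, 10):
    print(sum(P(N, n1, p) for n1 in range(N + 1)))   # 1 in each case
```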
2.7 The mean, variance, and standard deviation
What is meant by the mean or average of a quantity? Well, suppose that we
wanted to calculate the average age of undergraduates at the University of Texas
at Austin. We could go to the central administration building and find out how
many eighteen year-olds, nineteen year-olds, etc. were currently enrolled. We
would then write something like

Average Age ≃ (N_18 × 18 + N_19 × 19 + N_20 × 20 + · · ·)/(N_18 + N_19 + N_20 + · · ·), (2.24)
where N_18 is the number of enrolled eighteen year-olds, etc. Suppose that we
were to pick a student at random and then ask "What is the probability of this
student being eighteen?" From what we have already discussed, this probability
is defined

P_18 = N_18/N_students, (2.25)
where N_students is the total number of enrolled students. We can now see that the
average age takes the form

Average Age ≃ P_18 × 18 + P_19 × 19 + P_20 × 20 + · · · . (2.26)
Well, there is nothing special about the age distribution of students at UT
Austin. So, for a general variable u, which can take on any one of M possible
values u_1, u_2, . . . , u_M, with corresponding probabilities P(u_1), P(u_2), . . . , P(u_M),
the mean or average value of u, which is denoted ū, is defined as

ū ≡ ∑_{i=1}^{M} P(u_i) u_i. (2.27)
Suppose that f(u) is some function of u. Then, for each of the M possible
values of u, there is a corresponding value of f(u) which occurs with the same
probability. Thus, f(u_1) corresponds to u_1 and occurs with the probability P(u_1),
and so on. It follows from our previous definition that the mean value of f(u) is
given by

\overline{f(u)} ≡ ∑_{i=1}^{M} P(u_i) f(u_i). (2.28)
Suppose that f(u) and g(u) are two general functions of u. It follows that

\overline{f(u) + g(u)} = ∑_{i=1}^{M} P(u_i) [f(u_i) + g(u_i)] = ∑_{i=1}^{M} P(u_i) f(u_i) + ∑_{i=1}^{M} P(u_i) g(u_i), (2.29)

so

\overline{f(u) + g(u)} = \overline{f(u)} + \overline{g(u)}. (2.30)

Finally, if c is a general constant then it is clear that

\overline{c f(u)} = c \overline{f(u)}. (2.31)
We now know how to define the mean value of the general variable u. But,
how can we characterize the scatter around the mean value? We could investigate
the deviation of u from its mean value ū, which is denoted

∆u ≡ u − ū. (2.32)
In fact, this is not a particularly interesting quantity, since its average is obviously
zero:

\overline{∆u} = \overline{(u − ū)} = ū − ū = 0. (2.33)
This is another way of saying that the average deviation from the mean vanishes.
A more interesting quantity is the square of the deviation. The average value of
this quantity,

\overline{(∆u)^2} = ∑_{i=1}^{M} P(u_i) (u_i − ū)^2, (2.34)
is usually called the variance. The variance is clearly a positive number, unless
there is no scatter at all in the distribution, so that all possible values of u
correspond to the mean value ū, in which case it is zero. The following general relation
is often useful

\overline{(u − ū)^2} = \overline{(u^2 − 2 u ū + ū^2)} = \overline{u^2} − 2 ū ū + ū^2, (2.35)

giving

\overline{(u − ū)^2} = \overline{u^2} − ū^2. (2.36)
The variance of u is proportional to the square of the scatter of u around its
mean value. A more useful measure of the scatter is given by the square root of
the variance,

∆^∗u = [\overline{(∆u)^2}]^{1/2}, (2.37)

which is usually called the standard deviation of u. The standard deviation is
essentially the width of the range over which u is distributed around its mean
value ū.
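The definitions of the mean and variance, and the useful relation (2.36), are easy to verify for a small discrete distribution. A Python sketch, not from the notes; the four values and probabilities are hand-picked for illustration, and exact fractions keep the arithmetic exact:

```python
from fractions import Fraction as F

# Sketch (not from the notes) of the definitions of the mean,
# variance, and standard deviation for a small hand-picked discrete
# distribution (the numbers are purely illustrative).
values = [1, 2, 3, 4]
probs  = [F(1, 10), F(2, 10), F(3, 10), F(4, 10)]   # normalized to 1

mean     = sum(P * u for P, u in zip(probs, values))               # the mean
variance = sum(P * (u - mean)**2 for P, u in zip(probs, values))   # the variance
mean_sq  = sum(P * u**2 for P, u in zip(probs, values))

print(mean)                            # 3
print(variance)                        # 1
print(variance == mean_sq - mean**2)   # True: mean of square minus square of mean
print(float(variance) ** 0.5)          # the standard deviation
```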
2.8 Application to the binomial distribution
Let us now apply what we have just learned about the mean, variance, and stan
dard deviation of a general distribution function to the speciﬁc case of the bino
mial distribution function. Recall, that if a simple system has just two possible
outcomes, denoted 1 and 2, with respective probabilities p and q = 1 − p, then
the probability of obtaining n_1 occurrences of outcome 1 in N observations is

P_N(n_1) = \frac{N!}{n_1! (N - n_1)!} p^{n_1} q^{N - n_1}.   (2.38)
Thus, the mean number of occurrences of outcome 1 in N observations is given by

\bar{n}_1 = \sum_{n_1=0}^{N} P_N(n_1) n_1 = \sum_{n_1=0}^{N} \frac{N!}{n_1! (N - n_1)!} p^{n_1} q^{N - n_1} n_1.   (2.39)
This is a rather nasty looking expression! However, we can see that if the final factor n_1 were absent, it would just reduce to the binomial expansion, which we know how to sum. We can take advantage of this fact by using a rather elegant mathematical sleight of hand. Observe that since

n_1 p^{n_1} \equiv p \frac{\partial}{\partial p} p^{n_1},   (2.40)
the summation can be rewritten as

\sum_{n_1=0}^{N} \frac{N!}{n_1! (N - n_1)!} p^{n_1} q^{N - n_1} n_1 \equiv p \frac{\partial}{\partial p} \left[ \sum_{n_1=0}^{N} \frac{N!}{n_1! (N - n_1)!} p^{n_1} q^{N - n_1} \right].   (2.41)

This is just algebra, and has nothing to do with probability theory. The term in square brackets is the familiar binomial expansion, and can be written more succinctly as (p + q)^N. Thus,
\sum_{n_1=0}^{N} \frac{N!}{n_1! (N - n_1)!} p^{n_1} q^{N - n_1} n_1 \equiv p \frac{\partial}{\partial p} (p + q)^N \equiv p N (p + q)^{N-1}.   (2.42)

However, p + q = 1 for the case in hand, so

\bar{n}_1 = N p.   (2.43)
In fact, we could have guessed this result. By definition, the probability p is the number of occurrences of the outcome 1 divided by the number of trials, in the limit as the number of trials goes to infinity:

p = \lim_{N \to \infty} \frac{n_1}{N}.   (2.44)
If we think carefully, however, we can see that taking the limit as the number of trials goes to infinity is equivalent to taking the mean value, so that

p = \overline{\left( \frac{n_1}{N} \right)} = \frac{\bar{n}_1}{N}.   (2.45)

But, this is just a simple rearrangement of Eq. (2.43).
Let us now calculate the variance of n_1. Recall that

\overline{(\Delta n_1)^2} = \overline{(n_1)^2} - (\bar{n}_1)^2.   (2.46)

We already know \bar{n}_1, so we just need to calculate \overline{(n_1)^2}. This average is written

\overline{(n_1)^2} = \sum_{n_1=0}^{N} \frac{N!}{n_1! (N - n_1)!} p^{n_1} q^{N - n_1} (n_1)^2.   (2.47)
The sum can be evaluated using a simple extension of the mathematical trick we used earlier to evaluate \bar{n}_1. Since

(n_1)^2 p^{n_1} \equiv \left( p \frac{\partial}{\partial p} \right)^2 p^{n_1},   (2.48)
then

\sum_{n_1=0}^{N} \frac{N!}{n_1! (N - n_1)!} p^{n_1} q^{N - n_1} (n_1)^2 \equiv \left( p \frac{\partial}{\partial p} \right)^2 \sum_{n_1=0}^{N} \frac{N!}{n_1! (N - n_1)!} p^{n_1} q^{N - n_1}
\equiv \left( p \frac{\partial}{\partial p} \right)^2 (p + q)^N   (2.49)
\equiv \left( p \frac{\partial}{\partial p} \right) \left[ p N (p + q)^{N-1} \right] \equiv p \left[ N (p + q)^{N-1} + p N (N - 1) (p + q)^{N-2} \right].
Using p + q = 1 yields

\overline{(n_1)^2} = p [N + p N (N - 1)] = N p [1 + p N - p] = (N p)^2 + N p q = (\bar{n}_1)^2 + N p q,   (2.50)

since \bar{n}_1 = N p. It follows that the variance of n_1 is given by

\overline{(\Delta n_1)^2} = \overline{(n_1)^2} - (\bar{n}_1)^2 = N p q.   (2.51)
The standard deviation of n_1 is just the square root of the variance, so

\Delta^* n_1 = \sqrt{N p q}.   (2.52)

Recall that this quantity is essentially the width of the range over which n_1 is distributed around its mean value. The relative width of the distribution is characterized by

\frac{\Delta^* n_1}{\bar{n}_1} = \frac{\sqrt{N p q}}{N p} = \sqrt{\frac{q}{p}} \frac{1}{\sqrt{N}}.   (2.53)

It is clear from this formula that the relative width decreases like N^{-1/2} with increasing N. So, the greater the number of trials, the more likely it is that an observation of n_1 will yield a result which is relatively close to the mean value \bar{n}_1. This is a very important result.
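These results are easy to verify by direct summation over the distribution. A Python sketch (illustrative; the values of N and p are arbitrary choices, and log-factorials are used so that large N does not overflow):

```python
from math import exp, lgamma, log, sqrt

def P_N(N, n, p):
    """Binomial probability P_N(n), Eq. (2.38), evaluated via log-factorials."""
    q = 1.0 - p
    logP = (lgamma(N + 1) - lgamma(n + 1) - lgamma(N - n + 1)
            + n * log(p) + (N - n) * log(q))
    return exp(logP)

def moments(N, p):
    """Mean and variance of n_1, computed by direct summation over P_N(n_1)."""
    P = [P_N(N, n, p) for n in range(N + 1)]
    mean = sum(n * Pn for n, Pn in enumerate(P))
    var = sum((n - mean)**2 * Pn for n, Pn in enumerate(P))
    return mean, var

p = 0.3
mean, var = moments(100, p)
assert abs(mean - 100 * p) < 1e-6            # Eq. (2.43): mean = N p
assert abs(var - 100 * p * (1 - p)) < 1e-6   # Eq. (2.51): variance = N p q

# The relative width falls off like 1/sqrt(N), Eq. (2.53): each fourfold
# increase in N halves the ratio sqrt(Npq)/(Np).
for N in (100, 400, 1600):
    m, v = moments(N, p)
    print(N, sqrt(v) / m)
```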
2.9 The Gaussian distribution
Consider a very large number of observations, N \gg 1, made on a system with two possible outcomes. Suppose that the probability of outcome 1 is sufficiently large that the average number of occurrences after N observations is much greater than unity:

\bar{n}_1 = N p \gg 1.   (2.54)

In this limit, the standard deviation of n_1 is also much greater than unity,

\Delta^* n_1 = \sqrt{N p q} \gg 1,   (2.55)

implying that there are very many probable values of n_1 scattered about the mean value \bar{n}_1. This suggests that the probability of obtaining n_1 occurrences of outcome 1 does not change significantly in going from one possible value of n_1 to an adjacent value:

\frac{P_N(n_1 + 1) - P_N(n_1)}{P_N(n_1)} \ll 1.   (2.56)
In this situation, it is useful to regard the probability as a smooth function of n_1. Let n be a continuous variable which is interpreted as the number of occurrences of outcome 1 (after N observations) whenever it takes on a positive integer value. The probability that n lies between n and n + dn is defined

P(n, n + dn) = T(n) dn,   (2.57)

where T(n) is called the probability density, and is independent of dn. The probability can be written in this form because P(n, n + dn) can always be expanded as a Taylor series in dn, and must go to zero as dn \to 0. We can write

\int_{n_1 - 1/2}^{n_1 + 1/2} T(n) dn = P_N(n_1),   (2.58)

which is equivalent to smearing out the discrete probability P_N(n_1) over the range n_1 \pm 1/2. Given Eq. (2.56), the above relation can be approximated

T(n) \simeq P_N(n) = \frac{N!}{n! (N - n)!} p^n q^{N - n}.   (2.59)
For large N, the relative width of the probability distribution function is small:

\frac{\Delta^* n_1}{\bar{n}_1} = \sqrt{\frac{q}{p}} \frac{1}{\sqrt{N}} \ll 1.   (2.60)

This suggests that T(n) is strongly peaked around the mean value n = \bar{n}_1. Suppose that \ln T(n) attains its maximum value at n = \tilde{n} (where we expect \tilde{n} \sim \bar{n}). Let us Taylor expand \ln T around n = \tilde{n}. Note that we expand the slowly varying function \ln T(n), instead of the rapidly varying function T(n), because the Taylor expansion of T(n) does not converge sufficiently rapidly in the vicinity of n = \tilde{n} to be useful. We can write

\ln T(\tilde{n} + \eta) \simeq \ln T(\tilde{n}) + \eta B_1 + \frac{\eta^2}{2} B_2 + \cdots,   (2.61)
where

B_k = \left. \frac{d^k \ln T}{dn^k} \right|_{n = \tilde{n}}.   (2.62)

By definition,

B_1 = 0,   (2.63)
B_2 < 0,   (2.64)
if n = \tilde{n} corresponds to the maximum value of \ln T(n).

It follows from Eq. (2.59) that

\ln T = \ln N! - \ln n! - \ln (N - n)! + n \ln p + (N - n) \ln q.   (2.65)

If n is a large integer, such that n \gg 1, then \ln n! is almost a continuous function of n, since \ln n! changes by only a relatively small amount when n is incremented by unity. Hence,

\frac{d \ln n!}{dn} \simeq \frac{\ln (n + 1)! - \ln n!}{1} = \ln \left[ \frac{(n + 1)!}{n!} \right] = \ln (n + 1),   (2.66)

giving

\frac{d \ln n!}{dn} \simeq \ln n,   (2.67)

for n \gg 1. The integral of this relation,

\ln n! \simeq n \ln n - n + O(1),   (2.68)

valid for n \gg 1, is called Stirling's approximation, after the Scottish mathematician James Stirling who first obtained it in 1730.
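Stirling's approximation is easily checked numerically. In the Python sketch below (an illustration, not part of the notes), math.lgamma(n + 1) supplies the exact value of ln n!:

```python
import math

# Relative accuracy of Stirling's approximation, ln n! ~ n ln n - n.
for n in (10, 100, 1000, 10**6):
    exact = math.lgamma(n + 1)              # lgamma(n + 1) = ln n!
    stirling = n * math.log(n) - n
    rel_err = (exact - stirling) / exact
    print(f"n = {n:>7}: relative error = {rel_err:.2e}")
```

The neglected terms grow only logarithmically with n, so the relative error shrinks rapidly: from roughly 14% at n = 10 to around one part in a million at n = 10^6.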
According to Eq. (2.65),

B_1 = - \ln \tilde{n} + \ln (N - \tilde{n}) + \ln p - \ln q.   (2.69)

Hence, if B_1 = 0 then

(N - \tilde{n}) p = \tilde{n} q,   (2.70)

giving

\tilde{n} = N p = \bar{n}_1,   (2.71)

since p + q = 1. Thus, the maximum of \ln T(n) occurs exactly at the mean value of n, which equals \bar{n}_1.
Further differentiation of Eq. (2.65) yields

B_2 = - \frac{1}{\tilde{n}} - \frac{1}{N - \tilde{n}} = - \frac{1}{N p} - \frac{1}{N (1 - p)} = - \frac{1}{N p q},   (2.72)

since p + q = 1. Note that B_2 < 0, as required. The above relation can also be written

B_2 = - \frac{1}{(\Delta^* n_1)^2}.   (2.73)
It follows from the above that the Taylor expansion of \ln T can be written

\ln T(\bar{n}_1 + \eta) \simeq \ln T(\bar{n}_1) - \frac{\eta^2}{2 (\Delta^* n_1)^2} + \cdots.   (2.74)

Taking the exponential of both sides yields

T(n) \simeq T(\bar{n}_1) \exp \left[ - \frac{(n - \bar{n}_1)^2}{2 (\Delta^* n_1)^2} \right].   (2.75)
The constant T(\bar{n}_1) is most conveniently fixed by making use of the normalization condition

\sum_{n_1=0}^{N} P_N(n_1) = 1,   (2.76)

which translates to

\int_0^N T(n) dn \simeq 1   (2.77)

for a continuous distribution function. Since we only expect T(n) to be significant when n lies in the relatively narrow range \bar{n}_1 \pm \Delta^* n_1, the limits of integration in the above expression can be replaced by \pm\infty with negligible error. Thus,

T(\bar{n}_1) \int_{-\infty}^{\infty} \exp \left[ - \frac{(n - \bar{n}_1)^2}{2 (\Delta^* n_1)^2} \right] dn = T(\bar{n}_1) \sqrt{2} \Delta^* n_1 \int_{-\infty}^{\infty} \exp(-x^2) dx \simeq 1.   (2.78)
As is well known,

\int_{-\infty}^{\infty} \exp(-x^2) dx = \sqrt{\pi},   (2.79)

so it follows from the normalization condition (2.78) that

T(\bar{n}_1) \simeq \frac{1}{\sqrt{2\pi} \Delta^* n_1}.   (2.80)
Finally, we obtain

T(n) \simeq \frac{1}{\sqrt{2\pi} \Delta^* n_1} \exp \left[ - \frac{(n - \bar{n}_1)^2}{2 (\Delta^* n_1)^2} \right].   (2.81)

This is the famous Gaussian distribution function, named after the German mathematician Carl Friedrich Gauss, who discovered it whilst investigating the distribution of errors in measurements. The Gaussian distribution is only valid in the limits N \gg 1 and \bar{n}_1 \gg 1.
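The quality of the approximation can be checked directly. A Python sketch (the values of N and p are arbitrary illustrative choices) compares the exact binomial probability with the Gaussian form of Eq. (2.81):

```python
from math import comb, exp, pi, sqrt

N, p = 1000, 0.5
q = 1.0 - p
mean, sigma = N * p, sqrt(N * p * q)    # the mean and standard deviation of n_1

def binomial(n):
    """Exact probability, Eq. (2.38)."""
    return comb(N, n) * p**n * q**(N - n)

def gaussian(n):
    """Gaussian approximation, Eq. (2.81)."""
    return exp(-(n - mean)**2 / (2 * sigma**2)) / (sqrt(2 * pi) * sigma)

# The two agree to within a fraction of a percent near the peak, and to
# within a few percent even three standard deviations out.
for n in (450, 480, 500, 520, 550):
    print(n, binomial(n), gaussian(n))
```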
Suppose we were to plot the probability P_N(n_1) against the integer variable n_1, and then fit a continuous curve through the discrete points thus obtained. This curve would be equivalent to the continuous probability density curve T(n), where n is the continuous version of n_1. According to Eq. (2.81), the probability density attains its maximum value when n equals the mean of n_1, and is also symmetric about this point. In fact, when plotted with the appropriate ratio of vertical to horizontal scalings, the Gaussian probability density curve looks rather like the outline of a bell centred on n = \bar{n}_1. Hence, this curve is sometimes called a bell curve. At one standard deviation away from the mean value, i.e., n = \bar{n}_1 \pm \Delta^* n_1, the probability density is about 61% of its peak value. At two standard deviations away from the mean value, the probability density is about 13.5% of its peak value. Finally, at three standard deviations away from the mean value, the probability density is only about 1% of its peak value. We conclude that there is very little chance indeed that n_1 lies more than about three standard deviations away from its mean value. In other words, n_1 is almost certain to lie in the relatively narrow range \bar{n}_1 \pm 3 \Delta^* n_1. This is a very well-known result.
In the above analysis, we have gone from a discrete probability function P_N(n_1) to a continuous probability density T(n). The normalization condition becomes

1 = \sum_{n_1=0}^{N} P_N(n_1) \simeq \int_{-\infty}^{\infty} T(n) dn   (2.82)

under this transformation. Likewise, the evaluations of the mean and variance of the distribution are written

\bar{n}_1 = \sum_{n_1=0}^{N} P_N(n_1) n_1 \simeq \int_{-\infty}^{\infty} T(n) n dn,   (2.83)
and

\overline{(\Delta n_1)^2} \equiv (\Delta^* n_1)^2 = \sum_{n_1=0}^{N} P_N(n_1) (n_1 - \bar{n}_1)^2 \simeq \int_{-\infty}^{\infty} T(n) (n - \bar{n}_1)^2 dn,   (2.84)

respectively. These results follow as simple generalizations of previously established results for the discrete function P_N(n_1). The limits of integration in the above expressions can be approximated as \pm\infty because T(n) is only non-negligible in a relatively narrow range of n. Finally, it is easily demonstrated that Eqs. (2.82)–(2.84) are indeed true by substituting in the Gaussian probability density, Eq. (2.81), and then performing a few elementary integrals.
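Those elementary integrals can also be carried out numerically. The Python sketch below (with arbitrary stand-in values for the mean and standard deviation) approximates the three integrals (2.82)–(2.84) by a simple Riemann sum:

```python
from math import exp, pi, sqrt

mean, sigma = 50.0, 5.0    # stand-in values for the mean and standard deviation

def T(n):
    """Gaussian probability density, Eq. (2.81)."""
    return exp(-(n - mean)**2 / (2 * sigma**2)) / (sqrt(2 * pi) * sigma)

# Riemann sum over mean +/- 10 sigma; T(n) is negligible outside this range.
dn = 0.001
ns = [mean - 10 * sigma + i * dn for i in range(int(20 * sigma / dn) + 1)]
norm = sum(T(n) for n in ns) * dn                  # Eq. (2.82) -> 1
avg = sum(T(n) * n for n in ns) * dn               # Eq. (2.83) -> mean
var = sum(T(n) * (n - mean)**2 for n in ns) * dn   # Eq. (2.84) -> sigma**2

print(norm, avg, var)   # -> approximately 1.0, 50.0, 25.0
```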
2.10 The central limit theorem
Now, you may be thinking that we got a little carried away in our discussion of the Gaussian distribution function. After all, this distribution only seems to be relevant to two-state systems. In fact, as we shall see, the Gaussian distribution is of crucial importance to statistical physics because, under certain circumstances, it applies to all systems.

Let us briefly review how we obtained the Gaussian distribution function in the first place. We started from a very simple system with only two possible outcomes. Of course, the probability distribution function (for n_1) for this system did not look anything like a Gaussian. However, when we combined very many of these simple systems together, to produce a complicated system with a great number of possible outcomes, we found that the resultant probability distribution function (for n_1) reduced to a Gaussian in the limit as the number of simple systems tended to infinity. We started from a two outcome system because it was easy to calculate the final probability distribution function when a finite number of such systems were combined together. Clearly, if we had started from a more complicated system then this calculation would have been far more difficult.
Let me now tell you something which is quite astonishing! Suppose that we start from any system, with any distribution function (for some measurable quantity x). If we combine a sufficiently large number of such systems together, the resultant distribution function (for x) is always Gaussian. This proposition is known as the central limit theorem. As far as Physics is concerned, it is one of the most important theorems in the whole of mathematics.
Unfortunately, the central limit theorem is notoriously difﬁcult to prove. A
somewhat restricted proof is presented in Sections 1.10 and 1.11 of Reif.
The central limit theorem guarantees that the probability distribution of any
measurable quantity is Gaussian, provided that a sufﬁciently large number of
statistically independent observations are made. We can, therefore, conﬁdently
predict that Gaussian distributions are going to crop up all over the place in
statistical thermodynamics.
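Although the proof is difficult, the theorem is easy to demonstrate empirically. In the Python sketch below (the parameters are arbitrary illustrative choices), sums of 50 samples drawn from a decidedly non-Gaussian distribution — the uniform distribution on [0, 1] — turn out to be distributed like a Gaussian: roughly 68% of the sums lie within one standard deviation of the mean, just as a Gaussian predicts.

```python
import random

random.seed(1)
K = 50            # number of simple systems combined
trials = 20000
sums = [sum(random.random() for _ in range(K)) for _ in range(trials)]

mean = K * 0.5               # mean of a sum of K uniform variables
sigma = (K / 12.0) ** 0.5    # each uniform variable has variance 1/12

within = sum(abs(s - mean) <= sigma for s in sums) / trials
print(within)   # close to 0.683, the one-sigma fraction for a Gaussian
```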
3 Statistical mechanics
3.1 Introduction
Let us now analyze the internal motions of a many particle system using probability theory. This subject area is known as statistical mechanics.
3.2 Speciﬁcation of the state of a many particle system
How do we determine the state of a many particle system? Well, let us, ﬁrst
of all, consider the simplest possible many particle system, which consists of a
single spinless particle moving classically in one dimension. Assuming that we
know the particle’s equation of motion, the state of the system is fully speciﬁed
once we simultaneously measure the particle’s position q and momentum p. In
principle, if we know q and p then we can calculate the state of the system at
all subsequent times using the equation of motion. In practice, it is impossible to
specify q and p exactly, since there is always an intrinsic error in any experimental
measurement.
Consider the time evolution of q and p. This can be visualized by plotting the point (q, p) in the q-p plane. This plane is generally known as phase-space. In general, the point (q, p) will trace out some very complicated pattern in phase-space. Suppose that we divide phase-space into rectangular cells of uniform dimensions δq and δp. Here, δq is the intrinsic error in the position measurement, and δp the intrinsic error in the momentum measurement. The "area" of each cell is

δq δp = h_0,   (3.1)

where h_0 is a small constant having the dimensions of angular momentum. The coordinates q and p can now be conveniently specified by indicating the cell in phase-space into which they plot at any given time. This procedure automatically ensures that we do not attempt to specify q and p to an accuracy greater than our experimental error, which would clearly be pointless.
Let us now consider a single spinless particle moving in three dimensions. In order to specify the state of the system we now need to know three q-p pairs: i.e., q_x-p_x, q_y-p_y, and q_z-p_z. Incidentally, the number of q-p pairs needed to specify the state of the system is usually called the number of degrees of freedom of the system. Thus, a single particle moving in one dimension constitutes a one degree of freedom system, whereas a single particle moving in three dimensions constitutes a three degree of freedom system.

Consider the time evolution of q and p, where q = (q_x, q_y, q_z), etc. This can be visualized by plotting the point (q, p) in the six dimensional q-p phase-space. Suppose that we divide the q_x-p_x plane into rectangular cells of uniform dimensions δq and δp, and do likewise for the q_y-p_y and q_z-p_z planes. Here, δq and δp are again the intrinsic errors in our measurements of position and momentum, respectively. This is equivalent to dividing phase-space up into regular six dimensional cells of volume h_0^3. The coordinates q and p can now be conveniently specified by indicating the cell in phase-space into which they plot at any given time. Again, this procedure automatically ensures that we do not attempt to specify q and p to an accuracy greater than our experimental error.
Finally, let us consider a system consisting of N spinless particles moving classically in three dimensions. In order to specify the state of the system, we need to specify a large number of q-p pairs. The requisite number is simply the number of degrees of freedom, f. For the present case, f = 3N. Thus, phase-space (i.e., the space of all the q-p pairs) now possesses 2f = 6N dimensions. Consider a particular pair of conjugate coordinates, q_i and p_i. As before, we divide the q_i-p_i plane into rectangular cells of uniform dimensions δq and δp. This is equivalent to dividing phase-space into regular 2f dimensional cells of volume h_0^f. The state of the system is specified by indicating which cell it occupies in phase-space at any given time.
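The cell-labelling procedure is simple enough to sketch in code. In the following Python fragment (the numerical values of δq and δp are invented for illustration), a phase-space point for one degree of freedom is reduced to the integer indices of its cell, so two measurements that agree to within the experimental errors specify the same state:

```python
dq, dp = 0.01, 0.05   # assumed intrinsic errors in position and momentum
h0 = dq * dp          # the cell "area", Eq. (3.1)

def cell(q, p):
    """Integer indices of the phase-space cell containing the point (q, p)."""
    return (int(q // dq), int(p // dp))

# Two measurements differing by less than the experimental error land in
# the same cell, and so specify the same state.
assert cell(0.503, -1.230) == cell(0.508, -1.232)
print(cell(0.503, -1.230))
```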
In principle, we can specify the state of the system to arbitrary accuracy by taking the limit h_0 → 0. In reality, we know from quantum mechanics that it is impossible to simultaneously measure a coordinate q_i and its conjugate momentum p_i to greater accuracy than δq_i δp_i = ℏ. This implies that

h_0 ≥ ℏ.   (3.2)
In other words, the uncertainty principle sets a lower limit on how finely we can chop up classical phase-space.

In quantum mechanics we can specify the state of the system by giving its wavefunction at time t,

ψ(q_1, …, q_f, s_1, …, s_g, t),   (3.3)

where f is the number of translational degrees of freedom, and g the number of internal (e.g., spin) degrees of freedom. For instance, if the system consists of N spin-one-half particles then there will be 3N translational degrees of freedom, and N spin degrees of freedom (i.e., the spin of each particle can either point up or down along the z-axis). Alternatively, if the system is in a stationary state (i.e., an eigenstate of the Hamiltonian) then we can just specify f + g quantum numbers. Either way, the future time evolution of the wavefunction is fully determined by Schrödinger's equation.
In reality, this approach does not work because the Hamiltonian of the system is only known approximately. Typically, we are dealing with a system consisting of many weakly interacting particles. We usually know the Hamiltonian for completely non-interacting particles, but the component of the Hamiltonian associated with particle interactions is either impossibly complicated, or not very well known (often, it is both!). We can define approximate stationary eigenstates using the Hamiltonian for non-interacting particles. The state of the system is then specified by the quantum numbers identifying these eigenstates. In the absence of particle interactions, if the system starts off in a stationary state then it stays in that state for ever, so its quantum numbers never change. The interactions allow the system to make transitions between different "stationary" states, causing its quantum numbers to change in time.
3.3 The principle of equal a priori probabilities
We now know how to specify the instantaneous state of a many particle system. In principle, such a system is completely deterministic. Once we know the initial state and the equations of motion (or the Hamiltonian) we can evolve the system forward in time and, thereby, determine all future states. In reality, it is quite impossible to specify the initial state or the equations of motion to sufficient accuracy for this method to have any chance of working. Furthermore, even if it were possible, it would still not be a practical proposition to evolve the equations of motion. Remember that we are typically dealing with systems containing Avogadro's number of particles: i.e., about 10^{24} particles. We cannot evolve 10^{24} simultaneous differential equations! Even if we could, we would not want to. After all, we are not particularly interested in the motions of individual particles. What we really want is statistical information regarding the motions of all particles in the system.

Clearly, what is required here is a statistical treatment of the problem. Instead of focusing on a single system, let us proceed in the usual manner and consider a statistical ensemble consisting of a large number of identical systems. In general, these systems are distributed over many different states at any given time. In order to evaluate the probability that the system possesses a particular property, we merely need to find the number of systems in the ensemble which exhibit this property, and then divide by the total number of systems, in the limit as the latter number tends to infinity.
We can usually place some general constraints on the system. Typically, we know the total energy E, the total volume V, and the total number of particles N. To be more honest, we can only really say that the total energy lies between E and E + δE, etc., where δE is an experimental error. Thus, we only need concern ourselves with those systems in the ensemble exhibiting states which are consistent with the known constraints. We call these the states accessible to the system. In general, there are a great many such states.
We now need to calculate the probability of the system being found in each of
its accessible states. Well, perhaps “calculate” is the wrong word. The only way
we could calculate these probabilities would be to evolve all of the systems in the
ensemble and observe how long on average they spend in each accessible state.
But, as we have already mentioned, such a calculation is completely out of the
question. So what do we do instead? Well, we effectively guess the probabilities.
Let us consider an isolated system in equilibrium. In this situation, we would
expect the probability of the system being found in one of its accessible states to
be independent of time. This implies that the statistical ensemble does not evolve
with time. Individual systems in the ensemble will constantly change state, but
the average number of systems in any given state should remain constant. Thus,
all macroscopic parameters describing the system, such as the energy and the
volume, should also remain constant. There is nothing in the laws of mechanics
which would lead us to suppose that the system will be found more often in one
of its accessible states than in another. We assume, therefore, that the system is
equally likely to be found in any of its accessible states. This is called the assumption
of equal a priori probabilities, and lies at the very heart of statistical mechanics.
In fact, we use assumptions like this all of the time without really thinking about them. Suppose that we were asked to pick a card at random from a well-shuffled pack. I think that most people would accept that we have an equal probability of picking any card in the pack. There is nothing which would favour one particular card over all of the others. So, since there are fifty-two cards in a normal pack, we would expect the probability of picking the Ace of Spades, say, to be 1/52. We could now place some constraints on the system. For instance, we could only count red cards, in which case the probability of picking the Ace of Hearts, say, would be 1/26, by the same reasoning. In both cases, we have used the principle of equal a priori probabilities. People really believe that this principle applies to games of chance such as cards, dice, and roulette. In fact, if the principle were found not to apply to a particular game most people would assume that the game was "crooked." But, imagine trying to prove that the principle actually does apply to a game of cards. This would be very difficult! We would have to show that the way most people shuffle cards is effective at randomizing their order. A convincing study would have to be part mathematics and part psychology!
In statistical mechanics, we treat a many particle system a bit like an extremely
large game of cards. Each accessible state corresponds to one of the cards in the
pack. The interactions between particles cause the system to continually change
state. This is equivalent to constantly shufﬂing the pack. Finally, an observation
of the state of the system is like picking a card at random from the pack. The
principle of equal a priori probabilities then boils down to saying that we have an
equal chance of choosing any particular card.
It is, unfortunately, impossible to prove with mathematical rigor that the principle of equal a priori probabilities applies to many-particle systems. Over the years, many people have attempted this proof, and all have failed miserably. Not surprisingly, therefore, statistical mechanics was greeted with a great deal of scepticism when it was first proposed just over one hundred years ago. One of its main proponents, Ludwig Boltzmann, became so dispirited by all of the criticism that he eventually took his own life. Nowadays, statistical mechanics is completely accepted into the canon of physics. The reason for this is quite simple: it works!
It is actually possible to formulate a reasonably convincing scientific case for the principle of equal a priori probabilities. To achieve this we have to make use of the so-called H theorem.
3.4 The H theorem
Consider a system of weakly interacting particles. In quantum mechanics we can write the Hamiltonian for such a system as

H = H_0 + H_1,   (3.4)

where H_0 is the Hamiltonian for completely non-interacting particles, and H_1 is a small correction due to the particle interactions. We can define approximate stationary eigenstates of the system using H_0. Thus,

H_0 Ψ_r = E_r Ψ_r,   (3.5)

where the index r labels a state of energy E_r and eigenstate Ψ_r. In general, there are many different eigenstates with the same energy: these are called degenerate states.
For example, consider N non-interacting spinless particles of mass m confined in a cubic box of dimension L. According to standard wave-mechanics, the energy levels of the ith particle are given by

e_i = \frac{\hbar^2 \pi^2}{2 m L^2} \left( n_{i1}^2 + n_{i2}^2 + n_{i3}^2 \right),   (3.6)

where n_{i1}, n_{i2}, and n_{i3} are three (positive integer) quantum numbers. The overall energy of the system is the sum of the energies of the individual particles, so that for a general state r

E_r = \sum_{i=1}^{N} e_i.   (3.7)

The overall state of the system is thus specified by 3N quantum numbers (i.e., three quantum numbers per particle). There are clearly very many different arrangements of these quantum numbers which give the same overall energy.
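The degeneracy is easy to count by brute force. The following Python sketch (an illustration, not part of the notes) enumerates the quantum-number triples of a single particle and tallies how many share each value of n_1^2 + n_2^2 + n_3^2, which fixes the energy through Eq. (3.6):

```python
from itertools import product

nmax = 20
degeneracy = {}
for n1, n2, n3 in product(range(1, nmax + 1), repeat=3):
    s = n1 * n1 + n2 * n2 + n3 * n3   # energy in units of hbar^2 pi^2 / (2 m L^2)
    degeneracy[s] = degeneracy.get(s, 0) + 1

print(degeneracy[3])    # (1,1,1): the ground state is non-degenerate -> 1
print(degeneracy[6])    # (1,1,2) in any order -> 3
print(degeneracy[14])   # the permutations of (1,2,3) -> 6
```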
Consider, now, a statistical ensemble of systems made up of weakly interacting
particles. Suppose that this ensemble is initially very far from equilibrium. For
instance, the systems in the ensemble might only be distributed over a very small
subset of their accessible states. If each system starts off in a particular stationary
state (i.e., with a particular set of quantum numbers) then, in the absence of
particle interactions, it will remain in that state for ever. Hence, the ensemble will
always stay far from equilibrium, and the principle of equal a priori probabilities
will never be applicable. In reality, particle interactions cause each system in
the ensemble to make transitions between its accessible “stationary” states. This
allows the overall state of the ensemble to change in time.
Let us label the accessible states of our system by the index r. We can ascribe a time dependent probability P_r(t) of finding the system in a particular approximate stationary state r at time t. Of course, P_r(t) is proportional to the number of systems in the ensemble in state r at time t. In general, P_r is time dependent because the ensemble is evolving towards an equilibrium state. We assume that the probabilities are properly normalized, so that the sum over all accessible states always yields

\sum_r P_r(t) = 1.   (3.8)
Small interactions between particles cause transitions between the approximate stationary states of the system. There then exists some transition probability per unit time W_{rs} that a system originally in state r ends up in state s as a result of these interactions. Likewise, there exists a probability per unit time W_{sr} that a system in state s makes a transition to state r. These transition probabilities are meaningful in quantum mechanics provided that the particle interaction strength is sufficiently small, there is a nearly continuous distribution of accessible energy levels, and we consider time intervals which are not too small. These conditions are easily satisfied for the types of systems usually analyzed via statistical mechanics (e.g., nearly ideal gases). One important conclusion of quantum mechanics is that the forward and inverse transition probabilities between two states are the same, so that

W_{rs} = W_{sr}   (3.9)

for any two states r and s. This result follows from the time reversal symmetry of quantum mechanics. On the microscopic scale of individual particles, all fundamental laws of physics (in particular, classical and quantum mechanics) possess this symmetry. So, if a certain motion of particles satisfies the classical equations of motion (or Schrödinger's equation) then the reversed motion, with all particles starting off from their final positions and then retracing their paths exactly until they reach their initial positions, satisfies these equations just as well.
Suppose that we were to “ﬁlm” a microscopic process, such as two classical
particles approaching one another, colliding, and moving apart. We could then
gather an audience together and show them the ﬁlm. To make things slightly
more interesting we could play it either forwards or backwards. Because of the
time reversal symmetry of classical mechanics, the audience would not be able to
tell which way the ﬁlm was running (unless we told them!). In both cases, the
ﬁlm would show completely plausible physical events.
We can play the same game for a quantum process. For instance, we could "film" a group of photons impinging on some atoms. Occasionally, one of the atoms will absorb a photon and make a transition to an "excited" state (i.e., a state with higher than normal energy). We could easily estimate the rate constant for this process by watching the film carefully. If we play the film backwards then it will appear to show excited atoms occasionally emitting a photon and decaying back to their unexcited state. If quantum mechanics possesses time reversal symmetry (which it certainly does!) then both films should appear equally plausible.
This means that the rate constant for the absorption of a photon to produce an
excited state must be the same as the rate constant for the excited state to decay
by the emission of a photon. Otherwise, in the backwards ﬁlm the excited atoms
would appear to emit photons at the wrong rate, and we could then tell that the
ﬁlm was being played backwards. It follows, therefore, that as a consequence of
time reversal symmetry, the rate constant for any process in quantum mechanics
must equal the rate constant for the inverse process.
The probability P_r of finding the systems in the ensemble in a particular state r changes with time for two reasons. Firstly, systems in another state s can make transitions to the state r. The rate at which this occurs is P_s, the probability that the systems are in the state s to begin with, times the rate constant of the transition W_{sr}. Secondly, systems in the state r can make transitions to other states such as s. The rate at which this occurs is clearly P_r times W_{rs}. We can write a simple differential equation for the time evolution of P_r:

\frac{dP_r}{dt} = \sum_{s \neq r} P_s W_{sr} - \sum_{s \neq r} P_r W_{rs},   (3.10)

or

\frac{dP_r}{dt} = \sum_s W_{rs} (P_s - P_r),   (3.11)

where use has been made of the symmetry condition (3.9). The summation is over all accessible states.
Consider now the quantity H (from which the H theorem derives its name), which is the mean value of $\ln P_r$ over all accessible states:
\[ H \equiv \overline{\ln P_r} \equiv \sum_r P_r \ln P_r. \tag{3.12} \]
This quantity changes as the individual probabilities $P_r$ vary in time. Straightforward differentiation of the above equation yields
\[ \frac{dH}{dt} = \sum_r \left( \frac{dP_r}{dt}\,\ln P_r + \frac{dP_r}{dt} \right) = \sum_r \frac{dP_r}{dt}\,(\ln P_r + 1). \tag{3.13} \]
According to Eq. (3.11), this can be written
\[ \frac{dH}{dt} = \sum_r \sum_s W_{rs}\,(P_s - P_r)\,(\ln P_r + 1). \tag{3.14} \]
We can now interchange the dummy summation indices r and s to give
\[ \frac{dH}{dt} = \sum_r \sum_s W_{sr}\,(P_r - P_s)\,(\ln P_s + 1). \tag{3.15} \]
We can write dH/dt in a more symmetric form by adding the previous two equations and making use of Eq. (3.9):
\[ \frac{dH}{dt} = -\frac{1}{2} \sum_r \sum_s W_{rs}\,(P_r - P_s)\,(\ln P_r - \ln P_s). \tag{3.16} \]
Note, however, that $\ln P_r$ is a monotonically increasing function of $P_r$. It follows that $\ln P_r > \ln P_s$ whenever $P_r > P_s$, and vice versa. Thus, in general, the right-hand side of the above equation is the sum of many negative contributions. Hence, we conclude that
\[ \frac{dH}{dt} \leq 0. \tag{3.17} \]
The equality sign only holds in the special case where all accessible states are equally probable, so that $P_r = P_s$ for all r and s. This result is called the H theorem, and was first proved by the unfortunate Professor Boltzmann.
The H theorem tells us that if an isolated system is initially not in equilibrium
then it will evolve under the inﬂuence of particle interactions in such a manner
that the quantity H always decreases. This process will continue until H reaches
its minimum possible value, at which point dH/dt = 0, and there is no further
evolution of the system. According to Eq. (3.16), in this ﬁnal equilibrium state
the system is equally likely to be found in any one of its accessible states. This is,
of course, the situation predicted by the principle of equal a priori probabilities.
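The evolution described by Eq. (3.11), and the monotonic decrease of H, are easy to check numerically. The sketch below is purely illustrative: the four states and the randomly chosen (symmetric) rate constants are invented, and the master equation is integrated with a simple Euler step.

```python
import math
import random

# Hypothetical example: 4 accessible states with symmetric rate constants,
# W[r][s] = W[s][r], as demanded by time reversal symmetry [Eq. (3.9)].
random.seed(1)
n = 4
W = [[0.0] * n for _ in range(n)]
for r in range(n):
    for s in range(r + 1, n):
        W[r][s] = W[s][r] = random.uniform(0.5, 1.5)

def H(P):
    # H = sum_r P_r ln P_r  [Eq. (3.12)]
    return sum(p * math.log(p) for p in P if p > 0)

P = [0.7, 0.2, 0.05, 0.05]   # initial non-equilibrium distribution
dt = 0.01
H_prev = H(P)
for step in range(2000):
    # Euler step of dP_r/dt = sum_s W_rs (P_s - P_r)  [Eq. (3.11)]
    dP = [sum(W[r][s] * (P[s] - P[r]) for s in range(n)) for r in range(n)]
    P = [p + dt * d for p, d in zip(P, dP)]
    H_now = H(P)
    assert H_now <= H_prev + 1e-12   # H never increases [Eq. (3.17)]
    H_prev = H_now

print([round(p, 3) for p in P])      # all four probabilities end up at 0.25
```

Whatever symmetric rate constants are chosen, the distribution relaxes to the uniform one, which is just the principle of equal a priori probabilities.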
You may be wondering why the above argument does not constitute a mathematically rigorous proof that the principle of equal a priori probabilities applies to many particle systems. The answer is that we tacitly made an unwarranted assumption: i.e., we assumed that the probability of the system making a transition from some state r to another state s is independent of the past history of the system. In general, this is not the case in physical systems, although there are many situations in which it is a pretty good approximation. Thus, the epistemological status of the principle of equal a priori probabilities is that it is plausible,
but remains unproven. As we have already mentioned, the ultimate justiﬁcation
for this principle is empirical: i.e., it leads to theoretical predictions which are in
accordance with experimental observations.
3.5 The relaxation time
The H theorem guarantees that an isolated many particle system will eventually
reach equilibrium, irrespective of its initial state. The typical timescale for this
process is called the relaxation time, and depends in detail on the nature of the
interparticle interactions. The principle of equal a priori probabilities is only
valid for equilibrium states. It follows that we can only safely apply this principle
to systems which have remained undisturbed for many relaxation times since
they were set up, or last interacted with the outside world. The relaxation time
for the air in a typical classroom is very much less than one second. This suggests
that such air is probably in equilibrium most of the time, and should, therefore,
be governed by the principle of equal a priori probabilities. In fact, this is known
to be the case. Consider another example. Our galaxy, the “Milky Way,” is an
isolated dynamical system made up of about $10^{11}$ stars. In fact, it can be thought of as a self-gravitating “gas” of stars. At first sight, the “Milky Way” would seem
to be an ideal system on which to test out the ideas of statistical mechanics. Stars
in the Galaxy interact via occasional “near miss” events in which they exchange
energy and momentum. Actual collisions are very rare indeed. Unfortunately,
such interactions take place very infrequently, because there is an awful lot of
empty space between stars. The best estimate for the relaxation time of the
“Milky Way” is about $10^{13}$ years. This should be compared with the estimated age of the Galaxy, which is only about $10^{10}$ years. It is clear that, despite its great age,
the “Milky Way” has not been around long enough to reach an equilibrium state.
This suggests that the principle of equal a priori probabilities cannot be used to
describe stellar dynamics. Not surprisingly, the observed velocity distribution of
the stars in the vicinity of the Sun is not governed by this principle.
3.6 Reversibility and irreversibility
Previously, we mentioned that on a microscopic level the laws of physics are invariant under time reversal. In other words, microscopic phenomena look physically plausible when run in reverse. We usually say that these phenomena are reversible. What about macroscopic phenomena? Are they reversible? Well, consider an isolated many particle system which starts off far from equilibrium. According to the H theorem, it will evolve towards equilibrium and, as it does so, the macroscopic quantity H will decrease. But, if we run this process backwards the system will appear to evolve away from equilibrium, and the quantity H will increase. This type of behaviour is not physical because it violates the H theorem.
So, if we saw a ﬁlm of a macroscopic process we could very easily tell if it was
being run backwards. For instance, suppose that by some miracle we were able
to move all of the Oxygen molecules in the air in some classroom to one side
of the room, and all of the Nitrogen molecules to the opposite side. We would
not expect this state to persist for very long. Pretty soon the Oxygen and Nitrogen molecules would start to intermingle, and this process would continue until they were thoroughly mixed together throughout the room. This, of course, is the equilibrium state for air. In reverse, this process looks crazy! We would start off from perfectly normal air, and suddenly, for no good reason, the Oxygen and Nitrogen molecules would appear to separate and move to opposite sides of the room. This scenario is not impossible, but, from everything we know about the world around us, it is spectacularly unlikely! We conclude, therefore, that macroscopic phenomena are generally irreversible, because they look “wrong” when run in reverse.
How does the irreversibility of macroscopic phenomena arise? It certainly
does not come from the fundamental laws of physics, because these laws are
all reversible. In the previous example, the Oxygen and Nitrogen molecules got
mixed up by continually scattering off one another. Each individual scattering
event would look perfectly reasonable viewed in reverse, but when we add them
all together we obtain a process which would look stupid run backwards. How
can this be? How can we obtain an irreversible process from the combined effects
of very many reversible processes? This is a vitally important question. Unfortunately, we are not quite at the stage where we can formulate a convincing answer.
Note, however, that the essential irreversibility of macroscopic phenomena is one
of the key results of statistical thermodynamics.
3.7 Probability calculations
The principle of equal a priori probabilities is fundamental to all statistical mechanics, and allows a complete description of the properties of macroscopic systems in equilibrium. In principle, statistical mechanics calculations are very simple. Consider a system in equilibrium which is isolated, so that its total energy is
known to have a constant value somewhere in the range E to E +δE. In order to
make statistical predictions, we focus attention on an ensemble of such systems,
all of which have their energy in this range. Let Ω(E) be the total number of
different states of the system with energies in the speciﬁed range. Suppose that
among these states there are a number $\Omega(E; y_k)$ for which some parameter y of the system assumes the discrete value $y_k$. (This discussion can easily be generalized to deal with a parameter which can assume a continuous range of values.) The principle of equal a priori probabilities tells us that all the $\Omega(E)$ accessible states of the system are equally likely to occur in the ensemble. It follows that the probability $P(y_k)$ that the parameter y of the system assumes the value $y_k$ is simply
\[ P(y_k) = \frac{\Omega(E; y_k)}{\Omega(E)}. \tag{3.18} \]
Clearly, the mean value of y for the system is given by
\[ \bar{y} = \frac{\sum_k \Omega(E; y_k)\,y_k}{\Omega(E)}, \tag{3.19} \]
where the sum is over all possible values that y can assume. In the above, it is
tacitly assumed that Ω(E) → ∞, which is generally the case in thermodynamic
systems.
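Equations (3.18) and (3.19) really do reduce to counting states. As a small, entirely hypothetical illustration, consider four distinguishable particles sharing three quanta of energy, and let the parameter y be the number of quanta held by the first particle:

```python
from itertools import product

# Hypothetical system: 4 distinguishable particles, each carrying an
# integer number of energy quanta, with total energy fixed at E = 3.
N, E = 4, 3
states = [s for s in product(range(E + 1), repeat=N) if sum(s) == E]
Omega_E = len(states)                    # total accessible states, Omega(E)

# Let y be the number of quanta held by particle 1, and let
# Omega(E; y_k) be the number of accessible states with y = y_k.
Omega_Ey = {y: sum(1 for s in states if s[0] == y) for y in range(E + 1)}

# Equal a priori probabilities: P(y_k) = Omega(E; y_k)/Omega(E)  [Eq. (3.18)]
P = {y: Omega_Ey[y] / Omega_E for y in Omega_Ey}

# Mean value: y_bar = sum_k Omega(E; y_k) y_k / Omega(E)  [Eq. (3.19)]
y_bar = sum(Omega_Ey[y] * y for y in Omega_Ey) / Omega_E

print(Omega_E, P[0], y_bar)    # 20 states; P(0) = 0.5; y_bar = 0.75 = E/N
```

The mean comes out to E/N, as symmetry demands: with every microstate equally weighted, no particle is privileged over any other.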
It can be seen that, using the principle of equal a priori probabilities, all calculations in statistical mechanics reduce to simply counting states, subject to various
constraints. In principle, this is fairly straightforward. In practice, problems arise
if the constraints become too complicated. These problems can usually be overcome with a little mathematical ingenuity. Nevertheless, there is no doubt that this type of calculation is far easier than trying to solve the classical equations of motion (or Schrödinger's equation) directly for a many-particle system.
3.8 Behaviour of the density of states
Consider an isolated system in equilibrium whose volume is V, and whose energy
lies in the range E to E+δE. Let Ω(E, V) be the total number of microscopic states
which satisfy these constraints. It would be useful if we could estimate how this
number typically varies with the macroscopic parameters of the system. The
easiest way to do this is to consider a speciﬁc example. For instance, an ideal gas
made up of spinless monatomic particles. This is a particularly simple example,
because for such a gas the particles possess translational but no internal (e.g.,
vibrational, rotational, or spin) degrees of freedom. By deﬁnition, interatomic
forces are negligible in an ideal gas. In other words, the individual particles move
in an approximately uniform potential. It follows that the energy of the gas is just
the total translational kinetic energy of its constituent particles. Thus,
\[ E = \frac{1}{2m} \sum_{i=1}^{N} p_i^{\,2}, \tag{3.20} \]
where m is the particle mass, N the total number of particles, and $p_i$ the vector momentum of the ith particle.
Consider the system in the limit in which the energy E of the gas is much
greater than the ground-state energy, so that all of the quantum numbers are large. The classical version of statistical mechanics, in which we divide up phase-space into cells of equal volume, is valid in this limit. The number of states Ω(E, V) lying between the energies E and E+δE is simply equal to the number of cells in phase-space contained between these energies. In other words, Ω(E, V) is proportional to the volume of phase-space between these two energies:
\[ \Omega(E, V) \propto \int\!\cdots\!\int_{E}^{E+\delta E} d^3 r_1 \cdots d^3 r_N\; d^3 p_1 \cdots d^3 p_N. \tag{3.21} \]
Here, the integrand is the element of volume of phase-space, with
\[ d^3 r_i \equiv dx_i\, dy_i\, dz_i, \tag{3.22} \]
\[ d^3 p_i \equiv dp_{i\,x}\, dp_{i\,y}\, dp_{i\,z}, \tag{3.23} \]
where $(x_i, y_i, z_i)$ and $(p_{i\,x}, p_{i\,y}, p_{i\,z})$ are the Cartesian coordinates and momentum components of the ith particle, respectively. The integration is over all coordinates and momenta such that the total energy of the system lies between E and E + δE.
For an ideal gas, the total energy E does not depend on the positions of the
particles [see Eq. (3.20)]. This means that the integration over the position vectors $r_i$ can be performed immediately. Since each integral over $r_i$ extends over the volume of the container (the particles are, of course, not allowed to stray outside the container), $\int d^3 r_i = V$. There are N such integrals, so Eq. (3.21) reduces to
\[ \Omega(E, V) \propto V^N \chi(E), \tag{3.24} \]
where
\[ \chi(E) \propto \int\!\cdots\!\int_{E}^{E+\delta E} d^3 p_1 \cdots d^3 p_N \tag{3.25} \]
is a momentum-space integral which is independent of the volume.
The energy of the system can be written
\[ E = \frac{1}{2m} \sum_{i=1}^{N} \sum_{\alpha=1}^{3} p_{i\,\alpha}^{\,2}, \tag{3.26} \]
since $p_i^{\,2} = p_{i\,1}^{\,2} + p_{i\,2}^{\,2} + p_{i\,3}^{\,2}$, denoting the (x, y, z) components by (1, 2, 3), respectively. The above sum contains 3N square terms. For E = constant, Eq. (3.26) describes the locus of a sphere of radius $R(E) = (2mE)^{1/2}$ in the 3N-dimensional space of the momentum components. Hence, χ(E) is proportional to the volume of momentum phase-space contained in the spherical shell lying between the sphere of radius R(E) and that of slightly larger radius R(E + δE). This volume is proportional to the area of the inner sphere multiplied by $\delta R \equiv R(E+\delta E) - R(E)$. Since the area varies like $R^{3N-1}$, and $\delta R \propto \delta E/E^{1/2}$, we have
\[ \chi(E) \propto R^{3N-1}/E^{1/2} \propto E^{3N/2-1}. \tag{3.27} \]
Combining this result with (3.24) yields
\[ \Omega(E, V) = B\,V^N E^{3N/2}, \tag{3.28} \]
where B is a constant independent of V or E, and we have also made use of $N \gg 1$. Note that, since the number of degrees of freedom of the system is f = 3N, the above relation can be very approximately written
\[ \Omega(E, V) \propto V^f E^f. \tag{3.29} \]
In other words, the density of states varies like the extensive macroscopic parameters of the system raised to the power of the number of degrees of freedom. An extensive parameter is one which scales with the size of the system (e.g., the volume). Since thermodynamic systems generally possess a very large number of degrees of freedom, this result implies that the density of states is an exceptionally rapidly increasing function of the energy and volume. This result, which turns out to be quite general, is very useful in statistical thermodynamics.
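The $E^{3N/2}$ scaling can be checked against the exact volume of a d-dimensional hypersphere, $V_d(R) = \pi^{d/2} R^d / \Gamma(d/2 + 1)$ (a standard result). The sketch below measures the effective exponent of χ(E) for N = 100 particles; the shell thickness δE is an arbitrary illustrative value.

```python
import math

def log_shell_volume(N, E, dE=1e-3, m=1.0):
    """log of the momentum-space volume between the spheres of radii
    R(E) = sqrt(2 m E) and R(E + dE), in d = 3N dimensions."""
    d = 3 * N
    def log_ball(energy):
        R = math.sqrt(2 * m * energy)
        # log V_d(R) = (d/2) log(pi) + d log(R) - log Gamma(d/2 + 1)
        return 0.5 * d * math.log(math.pi) + d * math.log(R) - math.lgamma(d / 2 + 1)
    a, b = log_ball(E), log_ball(E + dE)
    return b + math.log1p(-math.exp(a - b))    # log(V(E + dE) - V(E))

# chi(E) should scale like E^(3N/2 - 1): doubling E should multiply
# log chi by roughly (3N/2 - 1) log 2.
N = 100
slope = (log_shell_volume(N, 2.0) - log_shell_volume(N, 1.0)) / math.log(2.0)
print(round(slope, 1))    # close to 3N/2 - 1 = 149
```

Working with logarithms (via `math.lgamma`) is essential here: the volumes themselves overflow ordinary floating point long before N reaches thermodynamic sizes.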
4 Heat and work
4.1 A brief history of heat and work
In 1789 the French scientist Antoine Lavoisier published a famous treatise on
Chemistry which, amongst other things, demolished the then prevalent theory
of combustion. This theory, known to history as the phlogiston theory, is so extraordinarily stupid that it is not even worth describing. In place of phlogiston
theory, Lavoisier proposed the ﬁrst reasonably sensible scientiﬁc interpretation of
heat. Lavoisier pictured heat as an invisible, tasteless, odourless, weightless ﬂuid,
which he called caloriﬁc ﬂuid. He postulated that hot bodies contain more of this
ﬂuid than cold bodies. Furthermore, he suggested that the constituent particles
of caloriﬁc ﬂuid repel one another, causing heat to ﬂow from hot to cold bodies
when they are placed in thermal contact.
The modern interpretation of heat is, of course, somewhat different to Lavoisier's
caloriﬁc theory. Nevertheless, there is an important subset of problems involving
heat ﬂow for which Lavoisier’s approach is rather useful. These problems often
crop up as examination questions. For example: “A clean dry copper calorimeter contains 100 grams of water at 30° centigrade. A 10 gram block of copper heated to 60° centigrade is added. What is the final temperature of the mixture?”
How do we approach this type of problem? Well, according to Lavoisier’s theory,
there is an analogy between heat flow and incompressible fluid flow under gravity. The same volume of liquid added to containers of different cross-sectional area fills them to different heights. If the volume is V, and the cross-sectional
area is A, then the height is h = V/A. In a similar manner, the same quantity of
heat added to different bodies causes them to rise to different temperatures. If Q
is the heat and θ is the (absolute) temperature then θ = Q/C, where the constant
C is termed the heat capacity. [This is a somewhat oversimpliﬁed example. In
general, the heat capacity is a function of temperature, so that C = C(θ).] Now,
if two containers ﬁlled to different heights with a free ﬂowing incompressible
ﬂuid are connected together at the bottom, via a small pipe, then ﬂuid will ﬂow
under gravity, from one to the other, until the two heights are the same. The ﬁnal
height is easily calculated by equating the total ﬂuid volume in the initial and
44
4.1 A brief history of heat and work 4 HEAT AND WORK
final states. Thus,
\[ h_1 A_1 + h_2 A_2 = h\,A_1 + h\,A_2, \tag{4.1} \]
giving
\[ h = \frac{h_1 A_1 + h_2 A_2}{A_1 + A_2}. \tag{4.2} \]
Here, $h_1$ and $h_2$ are the initial heights in the two containers, $A_1$ and $A_2$ are the corresponding cross-sectional areas, and h is the final height. Likewise, if two
bodies, initially at different temperatures, are brought into thermal contact then
heat will ﬂow, from one to the other, until the two temperatures are the same.
The ﬁnal temperature is calculated by equating the total heat in the initial and
ﬁnal states. Thus,
\[ \theta_1 C_1 + \theta_2 C_2 = \theta\,C_1 + \theta\,C_2, \tag{4.3} \]
giving
\[ \theta = \frac{\theta_1 C_1 + \theta_2 C_2}{C_1 + C_2}, \tag{4.4} \]
where the meaning of the various symbols should be self-evident.
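Plugging the examination question quoted earlier in this section into Eq. (4.4) gives, for instance, the following. The specific heats are standard textbook values (roughly 4.18 J g⁻¹ °C⁻¹ for water and 0.385 J g⁻¹ °C⁻¹ for copper), and the heat capacity of the calorimeter itself is assumed to be negligible.

```python
# Heat capacity C = (mass) x (specific heat); both specific heats are
# standard textbook values, and the heat capacity of the calorimeter
# itself is assumed to be negligible.
m_water, c_water = 100.0, 4.18      # grams, J / (g degC) -- water
m_copper, c_copper = 10.0, 0.385    # grams, J / (g degC) -- copper

C1, theta1 = m_water * c_water, 30.0      # water, initially at 30 degC
C2, theta2 = m_copper * c_copper, 60.0    # copper block, initially at 60 degC

# Final temperature from equating total heat before and after [Eq. (4.4)]
theta = (theta1 * C1 + theta2 * C2) / (C1 + C2)
print(round(theta, 2))    # 30.27: the small copper block barely warms the water
```

The answer is dominated by the water because its heat capacity (418 J/°C) dwarfs that of the small copper block (about 3.9 J/°C).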
The analogy between heat ﬂow and ﬂuid ﬂow works because in Lavoisier’s
theory heat is a conserved quantity, just like the volume of an incompressible
ﬂuid. In fact, Lavoisier postulated that heat was an element. Note that atoms
were thought to be indestructible before nuclear reactions were discovered, so
the total amount of each element in the Universe was assumed to be a constant.
If Lavoisier had cared to formulate a law of thermodynamics from his caloriﬁc
theory then he would have said that the total amount of heat in the Universe was
a constant.
In 1798 Benjamin Thompson, an Englishman who spent his early years in pre-revolutionary America, was minister for war and police in the German state of
Bavaria. One of his jobs was to oversee the boring of cannons in the state arsenal.
Thompson was struck by the enormous, and seemingly inexhaustible, amount of
heat generated in this process. He simply could not understand where all this
heat was coming from. According to Lavoisier’s caloriﬁc theory, the heat must
ﬂow into the cannon from its immediate surroundings, which should, therefore,
become colder. The ﬂow should also eventually cease when all of the available
heat has been extracted. In fact, Thompson observed that the surroundings of the
cannon got hotter, not colder, and that the heating process continued unabated as
long as the boring machine was operating. Thompson postulated that some of the
mechanical work done on the cannon by the boring machine was being converted
into heat. At the time, this was quite a revolutionary concept, and most people
were not ready to accept it. This is somewhat surprising, since by the end of
the eighteenth century the conversion of heat into work, by steam engines, was
quite commonplace. Nevertheless, the conversion of work into heat did not gain
broad acceptance until 1849, when an English physicist called James Prescott
Joule published the results of a long and painstaking series of experiments. Joule
conﬁrmed that work could indeed be converted into heat. Moreover, he found
that the same amount of work always generates the same quantity of heat. This
is true regardless of the nature of the work (e.g., mechanical, electrical, etc.).
Joule was able to formulate what became known as the work equivalent of heat.
Namely, that 1 newton meter of work is equivalent to 0.241 calories of heat. A
calorie is the amount of heat required to raise the temperature of 1 gram of water
by 1 degree centigrade. Nowadays, we measure both heat and work in the same
units, so that one newton meter, or joule, of work is equivalent to one joule of
heat.
In 1850, the German physicist Clausius correctly postulated that the essential conserved quantity is neither heat nor work, but some combination of the
two which quickly became known as energy, from the Greek energia meaning “in
work.” According to Clausius, the change in the internal energy of a macroscopic
body can be written
\[ \Delta E = Q - W, \tag{4.5} \]
where Q is the heat absorbed from the surroundings, and W is the work done on
the surroundings. This relation is known as the ﬁrst law of thermodynamics.
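In these units the first law is simple bookkeeping, though the sign convention trips people up. A minimal sketch, with invented numbers:

```python
def delta_E(Q, W):
    """First law [Eq. (4.5)]: Q is the heat absorbed FROM the surroundings,
    W is the work done ON the surroundings (both in joules)."""
    return Q - W

# Invented numbers: a gas absorbs 500 J of heat while doing 200 J of work
# on its surroundings, so its internal energy rises by 300 J.
print(delta_E(500.0, 200.0))    # 300.0
```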
4.2 Macrostates and microstates
In describing a system made up of a great many particles, it is usually possible to
specify some macroscopically measurable independent parameters $x_1, x_2, \ldots, x_n$
which affect the particles’ equations of motion. These parameters are termed the
external parameters of the system. Examples of such parameters are the volume (this
gets into the equations of motion because the potential energy becomes inﬁnite
when a particle strays outside the available volume) and any applied electric and
magnetic ﬁelds. A microstate of the system is deﬁned as a state for which the
motions of the individual particles are completely speciﬁed (subject, of course,
to the unavoidable limitations imposed by the uncertainty principle of quantum
mechanics). In general, the overall energy of a given microstate r is a function of
the external parameters:
\[ E_r \equiv E_r(x_1, x_2, \ldots, x_n). \tag{4.6} \]
A macrostate of the system is deﬁned by specifying the external parameters, and
any other constraints to which the system is subject. For example, if we are
dealing with an isolated system (i.e., one that can neither exchange heat with nor
do work on its surroundings) then the macrostate might be speciﬁed by giving the
values of the volume and the constant total energy. For a manyparticle system,
there are generally a very great number of microstates which are consistent with
a given macrostate.
4.3 The microscopic interpretation of heat and work
Consider a macroscopic system A which is known to be in a given macrostate.
To be more exact, consider an ensemble of similar macroscopic systems A, where
each system in the ensemble is in one of the many microstates consistent with
the given macrostate. There are two fundamentally different ways in which the
average energy of A can change due to interaction with its surroundings. If the
external parameters of the system remain constant then the interaction is termed
a purely thermal interaction. Any change in the average energy of the system is
attributed to an exchange of heat with its environment. Thus,
\[ \Delta \bar{E} = Q, \tag{4.7} \]
where Q is the heat absorbed by the system. On a microscopic level, the energies
of the individual microstates are unaffected by the absorption of heat. In fact,
it is the distribution of the systems in the ensemble over the various microstates
which is modiﬁed.
Suppose that the system A is thermally insulated from its environment. This
can be achieved by surrounding it by an adiabatic envelope (i.e., an envelope
fabricated out of a material which is a poor conductor of heat, such as fiber glass).
Incidentally, the term adiabatic is derived from the Greek adiabatos which means
“impassable.” In scientiﬁc terminology, an adiabatic process is one in which there
is no exchange of heat. The system A is still capable of interacting with its environment via its external parameters. This type of interaction is termed mechanical
interaction, and any change in the average energy of the system is attributed to
work done on it by its surroundings. Thus,
\[ \Delta \bar{E} = -W, \tag{4.8} \]
where W is the work done by the system on its environment. On a microscopic
level, the energy of the system changes because the energies of the individual
microstates are functions of the external parameters [see Eq. (4.6)]. Thus, if the
external parameters are changed then, in general, the energies of all of the systems in the ensemble are modified (since each is in a specific microstate). Such a modification usually gives rise to a redistribution of the systems in the ensemble over the accessible microstates (without any heat exchange with the environment). Clearly, from a microscopic viewpoint, performing work on a macroscopic
system is quite a complicated process. Nevertheless, macroscopic work is a quantity which can be readily measured experimentally. For instance, if the system
A exerts a force F on its immediate surroundings, and the change in external
parameters corresponds to a displacement x of the center of mass of the system,
then the work done by A on its surroundings is simply
\[ W = F\,x, \tag{4.9} \]
i.e., the product of the force and the displacement along the line of action of the
force. In a general interaction of the system A with its environment there is both
heat exchange and work performed. We can write
\[ Q \equiv \Delta \bar{E} + W, \tag{4.10} \]
which serves as the general definition of the absorbed heat Q (hence, the equivalence sign). The quantity Q is simply the change in the mean energy of the system which is not due to the modification of the external parameters. Note that the notion of a quantity of heat has no independent meaning apart from Eq. (4.10). The mean energy $\bar{E}$ and work performed W are both physical quantities which can be determined experimentally, whereas Q is merely a derived quantity.
4.4 Quasistatic processes
Consider the special case of an interaction of the system A with its surroundings
which is carried out so slowly that A remains arbitrarily close to equilibrium at
all times. Such a process is said to be quasistatic for the system A. In practice, a
quasistatic process must be carried out on a timescale which is much longer than
the relaxation time of the system. Recall that the relaxation time is the typical
timescale for the system to return to equilibrium after being suddenly disturbed
(see Sect. 3.5).
A ﬁnite quasistatic change can be built up out of many inﬁnitesimal changes.
The infinitesimal heat $\bar{d}Q$ absorbed by the system when infinitesimal work $\bar{d}W$ is done on its environment and its average energy changes by $d\bar{E}$ is given by
\[ \bar{d}Q \equiv d\bar{E} + \bar{d}W. \tag{4.11} \]
The special symbols $\bar{d}W$ and $\bar{d}Q$ are introduced to emphasize that the work done
and heat absorbed are inﬁnitesimal quantities which do not correspond to the
difference between two works or two heats. Instead, the work done and heat
absorbed depend on the interaction process itself. Thus, it makes no sense to
talk about the work in the system before and after the process, or the difference
between these.
If the external parameters of the system have the values $x_1, \ldots, x_n$ then the energy of the system in a definite microstate r can be written
\[ E_r = E_r(x_1, \ldots, x_n). \tag{4.12} \]
Hence, if the external parameters are changed by infinitesimal amounts, so that $x_\alpha \rightarrow x_\alpha + dx_\alpha$ for α in the range 1 to n, then the corresponding change in the
energy of the microstate is
\[ dE_r = \sum_{\alpha=1}^{n} \frac{\partial E_r}{\partial x_\alpha}\, dx_\alpha. \tag{4.13} \]
The work $\bar{d}W$ done by the system when it remains in this particular state r is
\[ \bar{d}W_r = -dE_r = \sum_{\alpha=1}^{n} X_{\alpha\,r}\, dx_\alpha, \tag{4.14} \]
where
\[ X_{\alpha\,r} \equiv -\frac{\partial E_r}{\partial x_\alpha} \tag{4.15} \]
is termed the generalized force (conjugate to the external parameter $x_\alpha$) in the state r. Note that if $x_\alpha$ is a displacement then $X_{\alpha\,r}$ is an ordinary force.
Consider now an ensemble of systems. Provided that the external parameters
of the system are changed quasistatically, the generalized forces $X_{\alpha\,r}$ have well-defined mean values which are calculable from the distribution of systems in the ensemble characteristic of the instantaneous macrostate. The macroscopic work $\bar{d}W$ resulting from an infinitesimal quasistatic change of the external parameters is obtained by calculating the decrease in the mean energy resulting from the parameter change. Thus,
\[ \bar{d}W = \sum_{\alpha=1}^{n} \bar{X}_\alpha\, dx_\alpha, \tag{4.16} \]
where
\[ \bar{X}_\alpha \equiv -\overline{\frac{\partial E_r}{\partial x_\alpha}} \tag{4.17} \]
is the mean generalized force conjugate to $x_\alpha$. The mean value is calculated from the equilibrium distribution of systems in the ensemble corresponding to the external parameter values $x_\alpha$. The macroscopic work W resulting from a finite quasistatic change of external parameters can be obtained by integrating Eq. (4.16).
The most well-known example of quasistatic work in thermodynamics is that
done by pressure when the volume changes. For simplicity, suppose that the
volume V is the only external parameter of any consequence. The work done
in changing the volume from V to V + dV is simply the product of the force
and the displacement (along the line of action of the force). By deﬁnition, the
mean equilibrium pressure $\bar{p}$ of a given macrostate is equal to the normal force per unit area acting on any surface element. Thus, the normal force acting on a surface element $dS_i$ is $\bar{p}\,dS_i$. Suppose that the surface element is subject to a displacement $dx_i$. The work done by the element is $\bar{p}\,dS_i\,dx_i$. The total work
done by the system is obtained by summing over all of the surface elements.
Thus,
\[ \bar{d}W = \bar{p}\,dV, \tag{4.18} \]
where
\[ dV = \sum_i dS_i\, dx_i \tag{4.19} \]
is the infinitesimal volume change due to the displacement of the surface. It follows from (4.17) that
\[ \bar{p} = -\frac{\partial \bar{E}}{\partial V}, \tag{4.20} \]
so the mean pressure is the generalized force conjugate to the volume V.
Suppose that a quasistatic process is carried out in which the volume is changed from $V_i$ to $V_f$. In general, the mean pressure is a function of the volume, so $\bar{p} = \bar{p}(V)$. It follows that the macroscopic work done by the system is given by
\[ W_{i\,f} = \int_{V_i}^{V_f} \bar{d}W = \int_{V_i}^{V_f} \bar{p}(V)\, dV. \tag{4.21} \]
This quantity is just the “area under the curve” in a plot of $\bar{p}(V)$ versus V.
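The “area under the curve” can be evaluated numerically for any pressure law. As a sketch, take the isothermal ideal-gas law $\bar{p}(V) = nRT/V$, an assumed example chosen because the integral is also known in closed form, $nRT\,\ln(V_f/V_i)$:

```python
import math

def work_done(p_of_V, V_i, V_f, steps=100_000):
    """W = integral from V_i to V_f of p(V) dV  [Eq. (4.21)], midpoint rule."""
    dV = (V_f - V_i) / steps
    return sum(p_of_V(V_i + (k + 0.5) * dV) for k in range(steps)) * dV

# Assumed example: quasistatic isothermal expansion of an ideal gas,
# p(V) = n R T / V, with n = 1 mol and T = 300 K.
n, R, T = 1.0, 8.314, 300.0
V_i, V_f = 1.0e-3, 2.0e-3       # cubic metres

W = work_done(lambda V: n * R * T / V, V_i, V_f)
print(round(W, 2))              # matches n R T ln(V_f/V_i) = 1728.85 J
```

For pressure laws without a closed-form antiderivative, the same numerical quadrature still delivers the work directly from Eq. (4.21).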
4.5 Exact and inexact differentials
In our investigation of heat and work we have come across various inﬁnitesimal
objects such as $d\bar{E}$ and $\bar{d}W$. It is instructive to examine these infinitesimals more closely.
Consider the purely mathematical problem where F(x, y) is some general function of two independent variables x and y. Consider the change in F in going from
the point (x, y) in the xy plane to the neighbouring point (x +dx, y +dy). This
is given by
dF = F(x +dx, y +dy) −F(x, y), (4.22)
which can also be written
dF = X(x, y) dx +Y(x, y) dy, (4.23)
where X = ∂F/∂x and Y = ∂F/∂y. Clearly, dF is simply the inﬁnitesimal difference
between two adjacent values of the function F. This type of inﬁnitesimal quantity
is termed an exact differential to distinguish it from another type to be discussed
presently. If we move in the xy plane from an initial point $i \equiv (x_i, y_i)$ to a final point $f \equiv (x_f, y_f)$ then the corresponding change in F is given by
\[ \Delta F = F_f - F_i = \int_i^f dF = \int_i^f (X\,dx + Y\,dy). \tag{4.24} \]
Note that since the difference on the left-hand side depends only on the initial and final points, the integral on the right-hand side can only depend on these points as well. In other words, the value of the integral is independent of the path taken in going from the initial to the final point. This is the distinguishing feature of an exact differential. Consider an integral taken around a closed circuit in the xy plane. In this case, the initial and final points correspond to the same point, so the difference $F_f - F_i$ is clearly zero. It follows that the integral of an exact differential over a closed circuit is always zero:
\[ \oint dF \equiv 0. \tag{4.25} \]
Of course, not every inﬁnitesimal quantity is an exact differential. Consider
the infinitesimal object
\[ \bar{d}G \equiv X'(x, y)\,dx + Y'(x, y)\,dy, \tag{4.26} \]
where X' and Y' are two general functions of x and y. It is easy to test whether or not an infinitesimal quantity is an exact differential. Consider the expression
(4.23). It is clear that since X = ∂F/∂x and Y = ∂F/∂y then
\[ \frac{\partial X}{\partial y} = \frac{\partial Y}{\partial x} = \frac{\partial^2 F}{\partial x\,\partial y}. \tag{4.27} \]
Thus, if
\[ \frac{\partial X'}{\partial y} \neq \frac{\partial Y'}{\partial x} \tag{4.28} \]
(as is assumed to be the case), then $\bar{d}G$ cannot be an exact differential, and is instead termed an inexact differential. The special symbol $\bar{d}$ is used to denote an inexact differential. Consider the integral of $\bar{d}G$ over some path in the xy plane.
In general, it is not true that
∫_i^f đG = ∫_i^f (X′ dx + Y′ dy) (4.29)
is independent of the path taken between the initial and final points. This is the distinguishing feature of an inexact differential. In particular, the integral of an inexact differential around a closed circuit is not necessarily zero, so
∮ đG ≠ 0. (4.30)
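A minimal numerical illustration, with đG = y dx (so X′ = y, Y′ = 0, and ∂X′/∂y = 1 ≠ 0 = ∂Y′/∂x): the integral around a closed circuit does not vanish. The example differential is an illustrative choice, not taken from the text.

```python
# dG = y dx + 0 dy fails the exactness test (dX'/dy = 1, dY'/dx = 0),
# so its closed-loop integral need not vanish. Walk counterclockwise
# around the unit square and accumulate X' dx + Y' dy edge by edge.
def line_integral(path):
    total = 0.0
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        ym = 0.5 * (y0 + y1)     # y is constant on each edge here, so this is exact
        total += ym * (x1 - x0)  # X' = y contributes y*dx; Y' = 0 contributes nothing
    return total

square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
print(line_integral(square))  # -1.0, not zero
```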
Consider, for the moment, the solution of
đG = 0, (4.31)
which reduces to the ordinary differential equation
dy/dx = −X′/Y′. (4.32)
Since the right-hand side is a known function of x and y, the above equation defines a definite direction (i.e., gradient) at each point in the x-y plane. The solution simply consists of drawing a system of curves in the x-y plane such that at any point the tangent to the curve is as specified in Eq. (4.32). This defines a set of curves which can be written σ(x, y) = c, where c is a labeling parameter. It follows that
dσ/dx ≡ ∂σ/∂x + (∂σ/∂y) (dy/dx) = 0. (4.33)
The elimination of dy/dx between Eqs. (4.32) and (4.33) yields
Y′ (∂σ/∂x) = X′ (∂σ/∂y) = X′Y′/τ, (4.34)
where τ(x, y) is a function of x and y. The above equation could equally well be written
X′ = τ ∂σ/∂x,  Y′ = τ ∂σ/∂y. (4.35)
Inserting Eq. (4.35) into Eq. (4.26) gives
đG = τ [(∂σ/∂x) dx + (∂σ/∂y) dy] = τ dσ, (4.36)
or
đG/τ = dσ. (4.37)
Thus, dividing the inexact differential đG by τ yields the exact differential dσ. A
factor τ which possesses this property is termed an integrating factor. Since the
above analysis is quite general, it is clear that an inexact differential involving
two independent variables always admits of an integrating factor. Note, however,
this is not generally the case for inexact differentials involving more than two
variables.
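The integrating-factor construction can be checked numerically. In the sketch below, đG = y dx + 2x dy is inexact (∂X′/∂y = 1 ≠ 2 = ∂Y′/∂x), but dividing by τ = 1/y produces d(x y^2), which is exact, so its closed-loop integral collapses to zero. The particular đG and τ are illustrative choices, not taken from the text.

```python
import math

# Compare the closed-loop integrals of an inexact differential
# dG = y dx + 2x dy and of dG/tau with tau = 1/y, i.e. y**2 dx + 2*x*y dy,
# which equals d(x*y**2) exactly. Loop: unit circle centred at (2, 2),
# safely away from y = 0 where tau would blow up.
def loop_integral(form, n=200000):
    total = 0.0
    for k in range(n):
        t = 2.0 * math.pi * (k + 0.5) / n
        x, y = 2.0 + math.cos(t), 2.0 + math.sin(t)
        dx = -math.sin(t) * (2.0 * math.pi / n)
        dy = math.cos(t) * (2.0 * math.pi / n)
        total += form(x, y, dx, dy)
    return total

dG = lambda x, y, dx, dy: y * dx + 2.0 * x * dy
dG_over_tau = lambda x, y, dx, dy: y * (y * dx + 2.0 * x * dy)

print(loop_integral(dG))           # nonzero (≈ pi, the Green's-theorem value here)
print(loop_integral(dG_over_tau))  # essentially zero: the differential is now exact
```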
After this mathematical excursion, let us return to the physical situation of interest. The macrostate of a macroscopic system can be specified by the values of the external parameters (e.g., the volume) and the mean energy Ē. This, in turn, fixes other parameters such as the mean pressure p̄. Alternatively, we can specify the external parameters and the mean pressure, which fixes the mean energy. Quantities such as dp̄ and dĒ are infinitesimal differences between well-defined quantities: i.e., they are exact differentials. For example, dĒ = Ē_f − Ē_i is just the difference between the mean energy of the system in the final macrostate f and the initial macrostate i, in the limit where these two states are nearly the same. It follows that if the system is taken from an initial macrostate i to any final macrostate f the mean energy change is given by
∆Ē = Ē_f − Ē_i = ∫_i^f dĒ. (4.38)
However, since the mean energy is just a function of the macrostate under consideration, Ē_f and Ē_i depend only on the final and initial states, respectively. Thus, the integral ∫ dĒ depends only on the initial and final states, and not on the particular process used to get between them.
Consider, now, the infinitesimal work done by the system in going from some initial macrostate i to some neighbouring final macrostate f. In general, đW = Σ_α X̄_α dx_α is not the difference between two numbers referring to the properties of two neighbouring macrostates. Instead, it is merely an infinitesimal quantity characteristic of the process of going from state i to state f. In other words, the work đW is in general an inexact differential. The total work done by the system in going from any macrostate i to some other macrostate f can be written as
W_if = ∫_i^f đW, (4.39)
where the integral represents the sum of the infinitesimal amounts of work đW performed at each stage of the process. In general, the value of the integral does depend on the particular process used in going from macrostate i to macrostate f.
Recall that in going from macrostate i to macrostate f the change ∆Ē does not depend on the process used, whereas the work W, in general, does. Thus, it follows from the first law of thermodynamics, Eq. (4.10), that the heat Q, in general, also depends on the process used. It follows that
đQ ≡ dĒ + đW (4.40)
is an inexact differential. However, by analogy with the mathematical example discussed previously, there must exist some integrating factor, T, say, which converts the inexact differential đQ into an exact differential. So,
đQ/T ≡ dS. (4.41)
It will be interesting to find out what physical quantities correspond to the functions T and S.
Suppose that the system is thermally insulated, so that Q = 0. In this case, the first law of thermodynamics implies that
W_if = −∆Ē. (4.42)
Thus, in this special case, the work done depends only on the energy difference between the initial and final states, and is independent of the process. In fact,
when Clausius ﬁrst formulated the ﬁrst law in 1850 this is how he expressed it:
If a thermally isolated system is brought from some initial to some ﬁnal state
then the work done by the system is independent of the process used.
If the external parameters of the system are kept fixed, so that no work is done, then đW = 0, so Eq. (4.11) reduces to
đQ = dĒ, (4.43)
and đQ becomes an exact differential. The amount of heat Q absorbed in going
from one macrostate to another depends only on the mean energy difference
between them, and is independent of the process used to effect the change. In
this situation, heat is a conserved quantity, and acts very much like the invisible
indestructible ﬂuid of Lavoisier’s caloriﬁc theory.
5 Statistical thermodynamics
5.1 Introduction
Let us brieﬂy review the material which we have covered so far in this course.
We started off by studying the mathematics of probability. We then used probabilistic reasoning to analyze the dynamics of many particle systems, a subject area known as statistical mechanics. Next, we explored the physics of heat and work, the study of which is termed thermodynamics. The final step in our investigation is to combine statistical mechanics with thermodynamics: in other words,
to investigate heat and work via statistical arguments. This discipline is called
statistical thermodynamics, and forms the central subject matter of this course.
This section is devoted to the study of the fundamental concepts of statistical
thermodynamics. The remaining sections will then explore the application and
elucidation of these concepts.
5.2 Thermal interaction between macrosystems
Let us begin our investigation of statistical thermodynamics by examining a purely thermal interaction between two macroscopic systems, A and A′, from a microscopic point of view. Suppose that the energies of these two systems are E and E′, respectively. The external parameters are held fixed, so that A and A′ cannot do work on one another. However, we assume that the systems are free to exchange heat energy (i.e., they are in thermal contact). It is convenient to divide the energy scale into small subdivisions of width δE. The number of microstates of A consistent with a macrostate in which the energy lies in the range E to E + δE is denoted Ω(E). Likewise, the number of microstates of A′ consistent with a macrostate in which the energy lies between E′ and E′ + δE is denoted Ω′(E′).
The combined system A^(0) = A + A′ is assumed to be isolated (i.e., it neither does work on nor exchanges heat with its surroundings). It follows from the first law of thermodynamics that the total energy E^(0) is constant. When speaking of thermal contact between two distinct systems, we usually assume that the mutual
interaction is sufficiently weak for the energies to be additive. Thus,
E + E′ ≃ E^(0) = constant. (5.1)
Of course, in the limit of zero interaction the energies are strictly additive. However, a small residual interaction is always required to enable the two systems to exchange heat energy and, thereby, eventually reach thermal equilibrium (see Sect. 3.4). In fact, if the interaction between A and A′ is too strong for the energies to be additive then it makes little sense to consider each system in isolation, since the presence of one system clearly strongly perturbs the other, and vice versa. In this case, the smallest system which can realistically be examined in isolation is A^(0).
According to Eq. (5.1), if the energy of A lies in the range E to E + δE then the energy of A′ must lie between E^(0) − E − δE and E^(0) − E. Thus, the number of microstates accessible to each system is given by Ω(E) and Ω′(E^(0) − E), respectively. Since every possible state of A can be combined with every possible state of A′ to form a distinct microstate, the total number of distinct states accessible to A^(0) when the energy of A lies in the range E to E + δE is
Ω^(0) = Ω(E) Ω′(E^(0) − E). (5.2)
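The product rule of Eq. (5.2) can be verified by brute-force enumeration for small model systems. Below, a microstate of each system is a tuple of 0/1 "excitation" flags and its energy is the number of excited units; all names and numbers are toy choices for illustration.

```python
from itertools import product
from math import comb

# Two small model spin systems. For a fixed split (E, E0 - E) of the total
# energy, every accessible state of A pairs with every accessible state of A',
# so the combined count is the product Omega(E) * Omega'(E0 - E).
N, N_prime, E0, E = 4, 5, 3, 2

states_A = [s for s in product((0, 1), repeat=N) if sum(s) == E]
states_A_prime = [s for s in product((0, 1), repeat=N_prime) if sum(s) == E0 - E]
pairs = [(a, b) for a in states_A for b in states_A_prime]

print(len(states_A), len(states_A_prime), len(pairs))  # 6 5 30
# counts match the binomial-coefficient formula Omega = C(N, E):
assert len(pairs) == comb(N, E) * comb(N_prime, E0 - E)
```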
Consider an ensemble of pairs of thermally interacting systems, A and A′, which are left undisturbed for many relaxation times so that they can attain thermal equilibrium. The principle of equal a priori probabilities is applicable to this situation (see Sect. 3). According to this principle, the probability of occurrence of a given macrostate is proportional to the number of accessible microstates, since all microstates are equally likely. Thus, the probability that the system A has an energy lying in the range E to E + δE can be written
P(E) = C Ω(E) Ω′(E^(0) − E), (5.3)
where C is a constant which is independent of E.
We know, from Sect. 3.8, that the typical variation of the number of accessible states with energy is of the form
Ω ∝ E^f, (5.4)
where f is the number of degrees of freedom. For a macroscopic system f is an
exceedingly large number. It follows that the probability P(E) in Eq. (5.3) is the
product of an extremely rapidly increasing function of E and an extremely rapidly
decreasing function of E. Hence, we would expect the probability to exhibit a very
pronounced maximum at some particular value of the energy.
Let us Taylor expand the logarithm of P(E) in the vicinity of its maximum value, which is assumed to occur at E = Ẽ. We expand the relatively slowly varying logarithm, rather than the function itself, because the latter varies so rapidly with the energy that the radius of convergence of its Taylor expansion is too small for this expansion to be of any practical use. The expansion of ln Ω(E) yields
ln Ω(E) = ln Ω(Ẽ) + β(Ẽ) η − (1/2) λ(Ẽ) η^2 + …, (5.5)
where
η = E − Ẽ, (5.6)
β = ∂lnΩ/∂E, (5.7)
λ = −∂^2 lnΩ/∂E^2 = −∂β/∂E. (5.8)
Now, since E′ = E^(0) − E, we have
E′ − Ẽ′ = −(E − Ẽ) = −η. (5.9)
It follows that
ln Ω′(E′) = ln Ω′(Ẽ′) + β′(Ẽ′) (−η) − (1/2) λ′(Ẽ′) (−η)^2 + …, (5.10)
where β′ and λ′ are defined in an analogous manner to the parameters β and λ.
Equations (5.5) and (5.10) can be combined to give
ln[Ω(E) Ω′(E′)] = ln[Ω(Ẽ) Ω′(Ẽ′)] + [β(Ẽ) − β′(Ẽ′)] η − (1/2) [λ(Ẽ) + λ′(Ẽ′)] η^2 + …. (5.11)
At the maximum of ln[Ω(E) Ω′(E′)] the linear term in the Taylor expansion must vanish, so
β(Ẽ) = β′(Ẽ′), (5.12)
which enables us to determine Ẽ. It follows that
ln P(E) = ln P(Ẽ) − (1/2) λ_0 η^2, (5.13)
or
P(E) = P(Ẽ) exp[−(1/2) λ_0 (E − Ẽ)^2], (5.14)
where
λ_0 = λ(Ẽ) + λ′(Ẽ′). (5.15)
Now, the parameter λ_0 must be positive, otherwise the probability P(E) does not exhibit a pronounced maximum value: i.e., the combined system A^(0) does not possess a well-defined equilibrium state as, physically, we know it must. It is clear that λ(Ẽ) must also be positive, since we could always choose for A′ a system with a negligible contribution to λ_0, in which case the constraint λ_0 > 0 would effectively correspond to λ(Ẽ) > 0. [A similar argument can be used to show that λ′(Ẽ′) must be positive.] The same conclusion also follows from the estimate Ω ∝ E^f, which implies that
λ(Ẽ) ∼ f/Ẽ^2 > 0. (5.16)
According to Eq. (5.14), the probability distribution function P(E) is a Gaussian. This is hardly surprising, since the central limit theorem ensures that the probability distribution for any macroscopic variable, such as E, is Gaussian in nature (see Sect. 2.10). It follows that the mean value of E corresponds to the situation of maximum probability (i.e., the peak of the Gaussian curve), so that
Ē = Ẽ. (5.17)
The standard deviation of the distribution is
∆*E = λ_0^(−1/2) ∼ Ē/√f, (5.18)
where use has been made of Eq. (5.4) (assuming that system A makes the dominant contribution to λ_0). It follows that the fractional width of the probability distribution function is given by
∆*E/Ē ∼ 1/√f. (5.19)
Hence, if A contains 1 mole of particles then f ∼ N_A ≃ 10^24 and ∆*E/Ē ∼ 10^−12.
Clearly, the probability distribution for E has an exceedingly sharp maximum. Experimental measurements of this energy will almost always yield the mean value,
and the underlying statistical nature of the distribution may not be apparent.
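The 1/√f scaling of the fractional width can be illustrated numerically with toy values of f (astronomically smaller than 10^24, of course). The sketch below takes P(E) ∝ E^f (E_0 − E)^f for two identical systems sharing a fixed total energy E_0, an assumed form consistent with the Ω ∝ E^f estimate.

```python
import math

# Measure the fractional width of the peak of P(E) ∝ E**f * (E0 - E)**f.
# Work with log P to avoid overflow, and collect the points within
# exp(-1/2) of the peak (the one-standard-deviation points of a Gaussian).
def fractional_width(f, E0=1.0, n=20001):
    Es = [E0 * (k + 1) / (n + 1) for k in range(n)]
    logP = [f * math.log(E) + f * math.log(E0 - E) for E in Es]
    peak = max(logP)
    inside = [E for E, lp in zip(Es, logP) if lp > peak - 0.5]
    mean = E0 / 2.0  # the peak sits at half the total energy by symmetry
    return (max(inside) - min(inside)) / mean

for f in (100, 400, 1600):
    print(f, fractional_width(f))  # width drops ~2x for each 4x increase in f
```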
5.3 Temperature
Suppose that the systems A and A′ are initially thermally isolated from one another, with respective energies E_i and E′_i. (Since the energy of an isolated system cannot fluctuate, we do not have to bother with mean energies here.) If the two systems are subsequently placed in thermal contact, so that they are free to exchange heat energy, then, in general, the resulting state is an extremely improbable one [i.e., P(E_i) is much less than the peak probability]. The configuration will, therefore, tend to change in time until the two systems attain final mean energies Ē_f and Ē′_f which are such that
β_f = β′_f, (5.20)
where β_f ≡ β(Ē_f) and β′_f ≡ β′(Ē′_f). This corresponds to the state of maximum probability (see Sect. 5.2). In the special case where the initial energies, E_i and E′_i, lie very close to the final mean energies, Ē_f and Ē′_f, respectively, there is no change in the two systems when they are brought into thermal contact, since the initial state already corresponds to a state of maximum probability.
It follows from energy conservation that
Ē_f + Ē′_f = E_i + E′_i. (5.21)
The mean energy change in each system is simply the net heat absorbed, so that
Q ≡ Ē_f − E_i, (5.22)
Q′ ≡ Ē′_f − E′_i. (5.23)
The conservation of energy then reduces to
Q + Q′ = 0: (5.24)
i.e., the heat given off by one system is equal to the heat absorbed by the other
(in our notation absorbed heat is positive and emitted heat is negative).
It is clear that if the systems A and A′ are suddenly brought into thermal contact then they will only exchange heat and evolve towards a new equilibrium state if the final state is more probable than the initial one. In other words, if
P(Ē_f) > P(E_i), (5.25)
or
ln P(Ē_f) > ln P(E_i), (5.26)
since the logarithm is a monotonic function. The above inequality can be written
ln Ω(Ē_f) + ln Ω′(Ē′_f) > ln Ω(E_i) + ln Ω′(E′_i), (5.27)
with the aid of Eq. (5.3). Taylor expansion to first order yields
[∂lnΩ(E_i)/∂E] (Ē_f − E_i) + [∂lnΩ′(E′_i)/∂E′] (Ē′_f − E′_i) > 0, (5.28)
which finally gives
(β_i − β′_i) Q > 0, (5.29)
where β_i ≡ β(E_i), β′_i ≡ β′(E′_i), and use has been made of Eq. (5.24).
It is clear, from the above, that the parameter β, defined by
β = ∂lnΩ/∂E, (5.30)
has the following properties:
1. If two systems separately in equilibrium have the same value of β then the systems
will remain in equilibrium when brought into thermal contact with one another.
2. If two systems separately in equilibrium have diﬀerent values of β then the systems
will not remain in equilibrium when brought into thermal contact with one another.
Instead, the system with the higher value of β will absorb heat from the other
system until the two β values are the same [see Eq. (5.29)].
Incidentally, a partial derivative is used in Eq. (5.30) because in a purely thermal
interaction the external parameters of the system are held constant whilst the
energy changes.
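These properties are easy to exhibit with the generic estimate Ω ∝ E^f, for which β(E) = f/E. In the sketch below (all numbers are toy values chosen for illustration), the equilibrium partition of the total energy follows from equating the two β values, f/E = f′/(E_0 − E), and the heat absorbed satisfies the inequality (5.29).

```python
# For Omega ∝ E**f the temperature parameter is beta(E) = dlnOmega/dE = f/E.
# Equality of beta fixes the equilibrium split of the total energy E0:
# f/E = f'/(E0 - E)  =>  E = E0 * f / (f + f').  Toy numbers throughout.
f_A, f_B = 3.0e23, 1.0e23     # degrees of freedom of A and A'
E_i, E_i_prime = 2.0, 6.0     # initial energies (arbitrary units)
E0 = E_i + E_i_prime

beta = lambda f, E: f / E
E_eq = E0 * f_A / (f_A + f_B)  # equilibrium energy of A
Q = E_eq - E_i                 # heat absorbed by A

print(E_eq, Q)  # A ends up with ~6.0, so it absorbs Q ~ 4.0
# A started with the higher beta (the lower temperature), so it absorbs
# heat, consistent with (beta_i - beta_i') * Q > 0:
assert (beta(f_A, E_i) - beta(f_B, E_i_prime)) * Q > 0
# at equilibrium the two beta values agree:
assert abs(beta(f_A, E_eq) - beta(f_B, E0 - E_eq)) < 1e-9 * beta(f_A, E_eq)
```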
Let us define the dimensionless parameter T, such that
1/(k T) ≡ β ≡ ∂lnΩ/∂E, (5.31)
where k is a positive constant having the dimensions of energy. The parameter
T is termed the thermodynamic temperature, and controls heat ﬂow in much the
same manner as a conventional temperature. Thus, if two isolated systems in
equilibrium possess the same thermodynamic temperature then they will remain
in equilibrium when brought into thermal contact. However, if the two systems
have different thermodynamic temperatures then heat will ﬂow from the system
with the higher temperature (i.e., the “hotter” system) to the system with the
lower temperature until the temperatures of the two systems are the same. In
addition, suppose that we have three systems A, B, and C. We know that if A and B remain in equilibrium when brought into thermal contact then their temperatures are the same, so that T_A = T_B. Similarly, if B and C remain in equilibrium when brought into thermal contact, then T_B = T_C. But, we can then conclude that T_A = T_C, so systems A and C will also remain in equilibrium when
brought into thermal contact. Thus, we arrive at the following statement, which
is sometimes called the zeroth law of thermodynamics:
If two systems are separately in thermal equilibrium with a third system then
they must also be in thermal equilibrium with one another.
The thermodynamic temperature of a macroscopic body, as deﬁned in Eq. (5.31),
depends only on the rate of change of the number of accessible microstates with
the total energy. Thus, it is possible to deﬁne a thermodynamic temperature
for systems with radically different microscopic structures (e.g., matter and radiation). The thermodynamic, or absolute, scale of temperature is measured in
degrees kelvin. The parameter k is chosen to make this temperature scale accord
as much as possible with more conventional temperature scales. The choice
k = 1.381 × 10^−23 joules/kelvin, (5.32)
ensures that there are 100 degrees kelvin between the freezing and boiling points
of water at atmospheric pressure (the two temperatures are 273.15 and 373.15
degrees kelvin, respectively). The above number is known as the Boltzmann constant. In fact, the Boltzmann constant is fixed by international convention so as to make the triple point of water (i.e., the unique temperature at which the three phases of water coexist in thermal equilibrium) exactly 273.16° K. Note that the
zero of the thermodynamic scale, the so called absolute zero of temperature, does
not correspond to the freezing point of water, but to some far more physically
signiﬁcant temperature which we shall discuss presently.
The familiar Ω ∝ E^f scaling for translational degrees of freedom yields
k T ∼ Ē/f, (5.33)
using Eq. (5.31), so k T is a rough measure of the mean energy associated with
each degree of freedom in the system. In fact, for a classical system (i.e., one
in which quantum effects are unimportant) it is possible to show that the mean
energy associated with each degree of freedom is exactly (1/2) k T. This result,
which is known as the equipartition theorem, will be discussed in more detail later
on in this course.
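As a quick order-of-magnitude illustration (the values of k and T below are standard, but the example itself is not from the text):

```python
# Rough magnitudes implied by kT ~ E/f: the thermal energy per degree of
# freedom at room temperature, and the equipartition estimate (3/2) k T
# for the mean kinetic energy of a monatomic gas atom.
k = 1.381e-23       # Boltzmann constant, J/K
T = 300.0           # room temperature, K

kT = k * T
mean_ke = 1.5 * kT  # three translational degrees of freedom, (1/2) kT each
print(kT, mean_ke)  # ~4.1e-21 J and ~6.2e-21 J
```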
The absolute temperature T is usually positive, since Ω(E) is ordinarily a very
rapidly increasing function of energy. In fact, this is the case for all conventional
systems where the kinetic energy of the particles is taken into account, because
there is no upper bound on the possible energy of the system, and Ω(E) consequently increases roughly like E^f. It is, however, possible to envisage a situation in which we ignore the translational degrees of freedom of a system, and concentrate only on its spin degrees of freedom. In this case, there is an upper bound to the possible energy of the system (i.e., all spins lined up antiparallel to an applied magnetic field). Consequently, the total number of states available to the system is finite. In this situation, the density of spin states Ω_spin(E) first increases
with increasing energy, as in conventional systems, but then reaches a maximum
and decreases again. Thus, it is possible to get absolute spin temperatures which
are negative, as well as positive.
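This behaviour is easy to demonstrate for a model spin system. With N two-state spins and unit energy per excited spin, Ω(E) = C(N, E), and a finite-difference estimate of β = ∂lnΩ/∂E changes sign past the maximum at E = N/2 (a toy model for illustration):

```python
from math import comb, log

# N two-state spins with unit energy per excited spin: Omega(E) = C(N, E).
# beta = dlnOmega/dE (centred finite difference here) is positive on the
# rising side of Omega and negative past the maximum at E = N/2 --
# a negative absolute spin temperature.
N = 1000
ln_omega = lambda E: log(comb(N, E))
beta = lambda E: (ln_omega(E + 1) - ln_omega(E - 1)) / 2.0

print(beta(100), beta(900))  # positive at low energy, negative at high energy
assert beta(100) > 0 > beta(900)
assert abs(beta(100) + beta(900)) < 1e-9  # symmetric about E = N/2
```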
In Lavoisier’s caloriﬁc theory, the basic mechanism which forces heat to ﬂow
from hot to cold bodies is the supposed mutual repulsion of the constituent particles of calorific fluid. In statistical mechanics, the explanation is far less contrived. Heat flow occurs because statistical systems tend to evolve towards their most probable states, subject to the imposed physical constraints. When two bodies at different temperatures are suddenly placed in thermal contact, the initial
state corresponds to a spectacularly improbable state of the overall system. For
systems containing of order 1 mole of particles, the only reasonably probable final equilibrium states are such that the two bodies differ in temperature by less than 1 part in 10^12. The evolution of the system towards these final states (i.e., towards thermal equilibrium) is effectively driven by probability.
5.4 Mechanical interaction between macrosystems
Let us now examine a purely mechanical interaction between macrostates, where
one or more of the external parameters is modiﬁed, but there is no exchange
of heat energy. Consider, for the sake of simplicity, a situation where only one
external parameter x of the system is free to vary. In general, the number of
microstates accessible to the system when the overall energy lies between E and
E +δE depends on the particular value of x, so we can write Ω ≡ Ω(E, x).
When x is changed by the amount dx, the energy E_r(x) of a given microstate r changes by (∂E_r/∂x) dx. The number of states σ(E, x) whose energy is changed from a value less than E to a value greater than E when the parameter changes from x to x + dx is given by the number of microstates per unit energy range multiplied by the average shift in energy of the microstates. Hence,
σ(E, x) = (Ω(E, x)/δE) ⟨∂E_r/∂x⟩ dx, (5.34)
where the mean value of ∂E_r/∂x is taken over all accessible microstates (i.e., all states where the energy lies between E and E + δE and the external parameter takes the value x). The above equation can also be written
σ(E, x) = −(Ω(E, x)/δE) X̄ dx, (5.35)
where
X̄(E, x) = −⟨∂E_r/∂x⟩ (5.36)
is the mean generalized force conjugate to the external parameter x (see Sect. 4.4).
Consider the total number of microstates between E and E + δE. When the external parameter changes from x to x + dx, the number of states in this energy range changes by (∂Ω/∂x) dx. This change is due to the difference between the number of states which enter the range because their energy is changed from a value less than E to one greater than E and the number which leave because their energy is changed from a value less than E + δE to one greater than E + δE. In symbols,
[∂Ω(E, x)/∂x] dx = σ(E) − σ(E + δE) ≃ −(∂σ/∂E) δE, (5.37)
which yields
∂Ω/∂x = ∂(Ω X̄)/∂E, (5.38)
where use has been made of Eq. (5.35). Dividing both sides by Ω gives
∂lnΩ/∂x = (∂lnΩ/∂E) X̄ + ∂X̄/∂E. (5.39)
However, according to the usual estimate Ω ∝ E^f, the first term on the right-hand side is of order (f/Ē) X̄, whereas the second term is only of order X̄/Ē. Clearly, for a macroscopic system with many degrees of freedom, the second term is utterly negligible, so we have
∂lnΩ/∂x = (∂lnΩ/∂E) X̄ = β X̄. (5.40)
When there are several external parameters x_1, …, x_n, so that Ω ≡ Ω(E, x_1, …, x_n), the above derivation is valid for each parameter taken in isolation. Thus,
∂lnΩ/∂x_α = β X̄_α, (5.41)
where X̄_α is the mean generalized force conjugate to the parameter x_α.
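As a consistency check, combining Eq. (5.41) with the estimate Ω ∝ V^N E^f for an ideal gas recovers the ideal-gas law p̄ = N k T/V. The sketch below differentiates a model lnΩ numerically; the specific numbers are arbitrary toy values, not taken from the text.

```python
from math import log

# Model lnOmega for an ideal gas: Omega ∝ V**N * E**f (additive constant
# dropped). Numerical differentiation gives beta = dlnOmega/dE = f/E and
# beta*p = dlnOmega/dV = N/V, so p*V should equal N*kT = N/beta.
N, f = 6.0e23, 9.0e23   # particle number and degrees of freedom (toy values)
E, V = 3.7e3, 2.5e-2    # a sample macrostate (J, m^3)

ln_omega = lambda E, V: N * log(V) + f * log(E)

dE, dV = E * 1e-6, V * 1e-6
beta = (ln_omega(E + dE, V) - ln_omega(E - dE, V)) / (2 * dE)
beta_p = (ln_omega(E, V + dV) - ln_omega(E, V - dV)) / (2 * dV)
p = beta_p / beta       # mean pressure from Eq. (5.41)

print(p * V, N / beta)  # the two agree: p V = N k T
```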
5.5 General interaction between macrosystems
Consider two systems, A and A′, which can interact by exchanging heat energy and doing work on one another. Let the system A have energy E and adjustable external parameters x_1, …, x_n. Likewise, let the system A′ have energy E′ and adjustable external parameters x′_1, …, x′_n. The combined system A^(0) = A + A′ is assumed to be isolated. It follows from the first law of thermodynamics that
E + E′ = E^(0) = constant. (5.42)
Thus, the energy E′ of system A′ is determined once the energy E of system A is given, and vice versa. In fact, E′ could be regarded as a function of E. Furthermore, if the two systems can interact mechanically then, in general, the parameters x′ are some function of the parameters x. As a simple example, if the two systems are separated by a movable partition in an enclosure of fixed volume V^(0), then
V + V′ = V^(0) = constant, (5.43)
where V and V′ are the volumes of systems A and A′, respectively.
The total number of microstates accessible to A^(0) is clearly a function of E and the parameters x_α (where α runs from 1 to n), so Ω^(0) ≡ Ω^(0)(E, x_1, …, x_n). We have already demonstrated (in Sect. 5.2) that Ω^(0) exhibits a very pronounced maximum at one particular value of the energy E = Ẽ when E is varied but the external parameters are held constant. This behaviour comes about because of the very strong,
Ω ∝ E^f, (5.44)
increase in the number of accessible microstates of A (or A′) with energy. However, according to Sect. 3.8, the number of accessible microstates exhibits a similar strong increase with the volume, which is a typical external parameter, so that
Ω ∝ V^f. (5.45)
It follows that the variation of Ω^(0) with a typical parameter x_α, when all the other parameters and the energy are held constant, also exhibits a very sharp maximum at some particular value x_α = x̃_α. The equilibrium situation corresponds to the
configuration of maximum probability, in which virtually all systems A^(0) in the ensemble have values of E and x_α very close to Ẽ and x̃_α. The mean values of these quantities are thus given by Ē = Ẽ and x̄_α = x̃_α.
Consider a quasi-static process in which the system A is brought from an equilibrium state described by Ē and x̄_α to an infinitesimally different equilibrium state described by Ē + dĒ and x̄_α + dx̄_α. Let us calculate the resultant change in the number of microstates accessible to A. Since Ω ≡ Ω(E, x_1, …, x_n), the change in lnΩ follows from standard mathematics:
dlnΩ = (∂lnΩ/∂E) dĒ + Σ_{α=1}^{n} (∂lnΩ/∂x_α) dx̄_α. (5.46)
However, we have previously demonstrated that
β = ∂lnΩ/∂E,  β X̄_α = ∂lnΩ/∂x_α (5.47)
[from Eqs. (5.30) and (5.41)], so Eq. (5.46) can be written
dlnΩ = β (dĒ + Σ_α X̄_α dx̄_α). (5.48)
Note that the temperature parameter β and the mean conjugate forces X̄_α are only well-defined for equilibrium states. This is why we are only considering quasi-static changes in which the two systems are always arbitrarily close to equilibrium.
Let us rewrite Eq. (5.48) in terms of the thermodynamic temperature T, using the relation β ≡ 1/k T. We obtain
dS = (dĒ + Σ_α X̄_α dx̄_α)/T, (5.49)
where
S = k lnΩ. (5.50)
Equation (5.49) is a differential relation which enables us to calculate the quantity S as a function of the mean energy Ē and the mean external parameters x̄_α,
assuming that we can calculate the temperature T and mean conjugate forces X̄_α for each equilibrium state. The function S(Ē, x̄_α) is termed the entropy of system A. The word entropy is derived from the Greek en + trepein, which means “in change.” The reason for this etymology will become apparent presently. It can be seen from Eq. (5.50) that the entropy is merely a parameterization of the number of accessible microstates. Hence, according to statistical mechanics, S(Ē, x̄_α) is essentially a measure of the relative probability of a state characterized by values of the mean energy and mean external parameters Ē and x̄_α, respectively.
According to Eq. (4.16), the net amount of work performed during a quasi-static change is given by
đW = Σ_α X̄_α dx̄_α. (5.51)
It follows from Eq. (5.49) that
dS = (dĒ + đW)/T = đQ/T. (5.52)
Thus, the thermodynamic temperature T is the integrating factor for the first law of thermodynamics,
đQ = dĒ + đW, (5.53)
which converts the inexact differential đQ into the exact differential dS (see Sect. 4.5). It follows that the entropy difference between any two macrostates i and f can be written
S_f − S_i = ∫_i^f dS = ∫_i^f đQ/T, (5.54)
where the integral is evaluated for any process through which the system is brought quasi-statically via a sequence of near-equilibrium configurations from its initial to its final macrostate. The process has to be quasi-static because the temperature T, which appears in the integrand, is only well-defined for an equilibrium state. Since the left-hand side of the above equation only depends on the initial and final states, it follows that the integral on the right-hand side is independent of the particular sequence of quasi-static changes used to get from i to f. Thus, ∫_i^f đQ/T is independent of the process (provided that it is quasi-static).
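This can be illustrated for a concrete system. For an ideal monatomic gas taken quasi-statically between two given states, đQ = C_V dT + (N k T/V) dV, and the sketch below (a standard textbook-style example, though not worked in this text) compares two different paths: the total heat absorbed differs, but ∫ đQ/T does not.

```python
from math import log

# Path independence of the integral of dQ/T for an ideal monatomic gas:
# dQ = Cv dT + (N k T / V) dV along quasi-static paths. Compare an
# isochoric-then-isothermal path with an isothermal-then-isochoric one
# between the same end states (T1, V1) -> (T2, V2).
Nk = 8.314          # N k for one mole (the gas constant), J/K
Cv = 1.5 * Nk       # monatomic heat capacity at constant volume
T1, V1 = 300.0, 1.0e-2
T2, V2 = 450.0, 3.0e-2

# Path A: heat at fixed V1, then expand at fixed T2.
Q_A = Cv * (T2 - T1) + Nk * T2 * log(V2 / V1)
S_A = Cv * log(T2 / T1) + Nk * log(V2 / V1)

# Path B: expand at fixed T1, then heat at fixed V2.
Q_B = Nk * T1 * log(V2 / V1) + Cv * (T2 - T1)
S_B = Nk * log(V2 / V1) + Cv * log(T2 / T1)

print(Q_A - Q_B)  # nonzero: dQ is inexact
print(S_A - S_B)  # (essentially) zero: dQ/T is exact
```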
All of the concepts which we have encountered up to now in this course, such
as temperature, heat, energy, volume, pressure, etc., have been fairly familiar to
us from other branches of Physics. However, entropy, which turns out to be of
crucial importance in thermodynamics, is something quite new. Let us consider
the following questions. What does the entropy of a system actually signify?
What use is the concept of entropy?
5.6 Entropy
Consider an isolated system whose energy is known to lie in a narrow range. Let
Ω be the number of accessible microstates. According to the principle of equal
a priori probabilities, the system is equally likely to be found in any one of these
states when it is in thermal equilibrium. The accessible states are just that set
of microstates which are consistent with the macroscopic constraints imposed on
the system. These constraints can usually be quantiﬁed by specifying the values
of some parameters y_1, …, y_n which characterize the macrostate. Note that these
parameters are not necessarily external: e.g., we could specify either the volume
(an external parameter) or the mean pressure (the mean force conjugate to the
volume). The number of accessible states is clearly a function of the chosen
parameters, so we can write Ω ≡ Ω(y_1, …, y_n) for the number of microstates consistent with a macrostate in which the general parameter y_α lies in the range y_α to y_α + dy_α.
Suppose that we start from a system in thermal equilibrium. According to statistical mechanics, each of the Ω_i, say, accessible states is equally likely. Let us now remove, or relax, some of the constraints imposed on the system. Clearly, all of the microstates formerly accessible to the system are still accessible, but many additional states will, in general, become accessible. Thus, removing or relaxing constraints can only have the effect of increasing, or possibly leaving unchanged, the number of microstates accessible to the system. If the final number of accessible states is Ω_f, then we can write
Ω_f ≥ Ω_i. (5.55)
Immediately after the constraints are relaxed, the systems in the ensemble are
not in any of the microstates from which they were previously excluded. So the
systems only occupy a fraction
P_i = Ω_i/Ω_f (5.56)
of the Ω_f states now accessible to them. This is clearly not an equilibrium situation. Indeed, if Ω_f ≫ Ω_i then the configuration in which the systems are only distributed over the original Ω_i states is an extremely unlikely one. In fact, its probability of occurrence is given by Eq. (5.56). According to the H theorem (see Sect. 3.4), the ensemble will evolve in time until a more probable final state is reached in which the systems are evenly distributed over the Ω_f available states.
As a simple example, consider a system consisting of a box divided into two
regions of equal volume. Suppose that, initially, one region is ﬁlled with gas
and the other is empty. The constraint imposed on the system is, thus, that the
coordinates of all of the molecules must lie within the ﬁlled region. In other
words, the volume accessible to the system is V = V_i, where V_i is half the volume
of the box. The constraints imposed on the system can be relaxed by removing
the partition and allowing gas to flow into both regions. The volume accessible
to the gas is now V = V_f = 2 V_i. Immediately after the partition is removed,
the system is in an extremely improbable state. We know, from Sect. 3.8, that at
constant energy the variation of the number of accessible states of an ideal gas
with the volume is

Ω ∝ V^N,  (5.57)
where N is the number of particles. Thus, the probability of observing the state
immediately after the partition is removed in an ensemble of equilibrium systems
with volume V = V_f is

P_i = Ω_i / Ω_f = (V_i / V_f)^N = (1/2)^N.  (5.58)

If the box contains of order 1 mole of molecules then N ∼ 10^24 and this probability
is fantastically small:

P_i ∼ exp(−10^24).  (5.59)
Clearly, the system will evolve towards a more probable state.
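The scaling of this probability with particle number is easy to illustrate numerically; the following is a minimal sketch (the particle numbers are chosen purely for illustration), working with logarithms because (1/2)^N underflows for any macroscopic N:

```python
import math

def log10_half_volume_probability(n_particles):
    # P_i = (1/2)^N, Eq. (5.58), returned as log10(P_i) to avoid underflow
    return n_particles * math.log10(0.5)

for n in (10, 100, 10**4):
    print(f"N = {n:>6}: P_i = 10^({log10_half_volume_probability(n):.1f})")

# For N ~ 10^24 the exponent is of order -3e23: the configuration in which
# the gas occupies only half the box is never observed.
print(f"N = 1e24: log10(P_i) = {log10_half_volume_probability(10**24):.3g}")
```

Even for a mere ten thousand molecules the probability is already of order 10^(−3000), which makes the irreversibility of the expansion effectively absolute.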
This discussion can also be phrased in terms of the parameters y_1, ..., y_n of
the system. Suppose that a constraint is removed. For instance, one of the
parameters, y, say, which originally had the value y = y_i, is now allowed to vary.
According to statistical mechanics, all states accessible to the system are equally
likely. So, the probability P(y) of finding the system in equilibrium with the
parameter in the range y to y + δy is just proportional to the number of microstates
in this interval: i.e.,

P(y) ∝ Ω(y).  (5.60)

Usually, Ω(y) has a very pronounced maximum at some particular value ỹ (see
Sect. 5.2). This means that practically all systems in the final equilibrium ensemble
have values of y close to ỹ. Thus, if y_i ≠ ỹ initially then the parameter y
will change until it attains a final value close to ỹ, where Ω is maximum. This
discussion can be summed up in a single phrase:

If some of the constraints of an isolated system are removed then the parameters
of the system tend to readjust themselves in such a way that

Ω(y_1, ..., y_n) → maximum.  (5.61)
Suppose that the final equilibrium state has been reached, so that the systems
in the ensemble are uniformly distributed over the Ω_f accessible final states. If the
original constraints are reimposed then the systems in the ensemble still occupy
these Ω_f states with equal probability. Thus, if Ω_f > Ω_i, simply restoring the
constraints does not restore the initial situation. Once the systems are randomly
distributed over the Ω_f states they cannot be expected to spontaneously move
out of some of these states and occupy a more restricted class of states merely in
response to the reimposition of a constraint. The initial condition can also not
be restored by removing further constraints. This could only lead to even more
states becoming accessible to the system.
Suppose that some process occurs in which an isolated system goes from some
initial conﬁguration to some ﬁnal conﬁguration. If the ﬁnal conﬁguration is such
that the imposition or removal of constraints cannot by itself restore the initial
condition then the process is deemed irreversible. On the other hand, if it is such
that the imposition or removal of constraints can restore the initial condition then
the process is deemed reversible. From what we have already said, an irreversible
process is clearly one in which the removal of constraints leads to a situation
where Ω_f > Ω_i. A reversible process corresponds to the special case where the
removal of constraints does not change the number of accessible states, so that
Ω_f = Ω_i. In this situation, the systems remain distributed with equal probability
over these states irrespective of whether the constraints are imposed or not.
Our microscopic definition of irreversibility is in accordance with the macroscopic
definition discussed in Sect. 3.6. Recall that on a macroscopic level an
irreversible process is one which “looks unphysical” when viewed in reverse. On a
microscopic level it is clearly plausible that a system should spontaneously evolve
from an improbable to a probable conﬁguration in response to the relaxation of
some constraint. However, it is quite clearly implausible that a system should
ever spontaneously evolve from a probable to an improbable conﬁguration. Let
us consider our example again. If a gas is initially restricted to one half of a box,
via a partition, then the ﬂow of gas from one side of the box to the other when
the partition is removed is an irreversible process. This process is irreversible
on a microscopic level because the initial conﬁguration cannot be recovered by
simply replacing the partition. It is irreversible on a macroscopic level because
it is obviously unphysical for the molecules of a gas to spontaneously distribute
themselves in such a manner that they only occupy half of the available volume.
It is actually possible to quantify irreversibility. In other words, in addition to
stating that a given process is irreversible, we can also give some indication of
how irreversible it is. The parameter which measures irreversibility is just the
number of accessible states Ω. Thus, if Ω for an isolated system spontaneously
increases then the process is irreversible, the degree of irreversibility being
proportional to the amount of the increase. If Ω stays the same then the process is
reversible. Of course, it is unphysical for Ω to ever spontaneously decrease. In
symbols, we can write

Ω_f − Ω_i ≡ ∆Ω ≥ 0,  (5.62)
for any physical process operating on an isolated system. In practice, Ω itself is a
rather unwieldy parameter with which to measure irreversibility. For instance, in
the previous example, where an ideal gas doubles in volume (at constant energy)
due to the removal of a partition, the fractional increase in Ω is

Ω_f / Ω_i ≃ 10^(2ν×10^23),  (5.63)
where ν is the number of moles. This is an extremely large number! It is far more
convenient to measure irreversibility in terms of ln Ω. If Eq. (5.62) is true then it
is certainly also true that

ln Ω_f − ln Ω_i ≡ ∆ ln Ω ≥ 0  (5.64)

for any physical process operating on an isolated system. The increase in ln Ω
when an ideal gas doubles in volume (at constant energy) is

ln Ω_f − ln Ω_i = ν N_A ln 2,  (5.65)
where N_A = 6 × 10^23. This is a far more manageable number! Since we usually
deal with particles by the mole in laboratory physics, it makes sense to
pre-multiply our measure of irreversibility by a number of order 1/N_A. For historical
reasons, the number which is generally used for this purpose is the Boltzmann
constant k, which can be written

k = R / N_A  joules/kelvin,  (5.66)

where

R = 8.3143 joules/kelvin/mole  (5.67)

is the ideal gas constant which appears in the well-known equation of state for
an ideal gas, P V = ν R T. Thus, the final form for our measure of irreversibility is

S = k ln Ω.  (5.68)
This quantity is termed “entropy”, and is measured in joules per degree kelvin.
The increase in entropy when an ideal gas doubles in volume (at constant energy)
is

S_f − S_i = ν R ln 2,  (5.69)

which is order unity for laboratory scale systems (i.e., those containing about one
mole of particles). The essential irreversibility of macroscopic phenomena can be
summed up as follows:

S_f − S_i ≡ ∆S ≥ 0,  (5.70)
for a process acting on an isolated system [this is equivalent to Eqs. (5.61),
(5.62), and (5.64)]. Thus, the entropy of an isolated system tends to increase
with time and can never decrease. This proposition is known as the second law of
thermodynamics.
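The entropy increase of Eq. (5.69) is easily evaluated with the numerical values of R and N_A quoted above; a minimal sketch:

```python
import math

R = 8.3143      # ideal gas constant, J K^-1 mol^-1 (Eq. 5.67)
N_A = 6.022e23  # Avogadro's number, mol^-1
k = R / N_A     # Boltzmann constant, J/K (Eq. 5.66)

def free_expansion_entropy(nu, volume_ratio):
    """Entropy increase Delta S = nu R ln(V_f/V_i) for an ideal gas
    expanding at constant energy (Eq. 5.69 for volume_ratio = 2)."""
    return nu * R * math.log(volume_ratio)

dS = free_expansion_entropy(1.0, 2.0)          # one mole, volume doubles
print(f"Delta S = {dS:.3f} J/K")               # ~ 5.76 J/K: order unity
print(f"Delta ln Omega = {dS / k:.3e}")        # ~ 4e23: astronomically large
```

This makes concrete why the logarithm is the right measure: the same physical process that multiplies Ω by a factor of 10^(10^23) only raises S by a few joules per kelvin.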
One way of thinking of the number of accessible states Ω is that it is a measure
of the disorder associated with a macrostate. For a system exhibiting a high
degree of order we would expect a strong correlation between the motions of the
individual particles. For instance, in a fluid there might be a strong tendency for
the particles to move in one particular direction, giving rise to an ordered flow
of the system in that direction. On the other hand, for a system exhibiting a low
degree of order we expect far less correlation between the motions of individual
particles. It follows that, all other things being equal, an ordered system is
more constrained than a disordered system, since the former is excluded from
microstates in which there is not a strong correlation between individual particle
motions, whereas the latter is not. Another way of saying this is that an ordered
system has fewer accessible microstates than a corresponding disordered system.
Thus, entropy is effectively a measure of the disorder in a system (the disorder
increases with S). With this interpretation, the second law of thermodynamics
reduces to the statement that isolated systems tend to become more disordered
with time, and can never become more ordered.
Note that the second law of thermodynamics only applies to isolated systems.
The entropy of a nonisolated system can decrease. For instance, if a gas expands
(at constant energy) to twice its initial volume after the removal of a partition,
we can subsequently recompress the gas to its original volume. The energy of the
gas will increase because of the work done on it during compression, but if we
absorb some heat from the gas then we can restore it to its initial state. Clearly, in
restoring the gas to its original state, we have restored its original entropy. This
appears to violate the second law of thermodynamics because the entropy should
have increased in what is obviously an irreversible process (just try to make a
gas spontaneously occupy half of its original volume!). However, if we consider
a new system consisting of the gas plus the compression and heat absorption
machinery, then it is still true that the entropy of this system (which is assumed
to be isolated) must increase in time. Thus, the entropy of the gas is only kept
the same at the expense of increasing the entropy of the rest of the system, and
the total entropy is increased. If we consider the system of everything in the
Universe, which is certainly an isolated system since there is nothing outside it
with which it could interact, then the second law of thermodynamics becomes:

The disorder of the Universe tends to increase with time and can never decrease.
An irreversible process is clearly one which increases the disorder of the Universe,
whereas a reversible process neither increases nor decreases disorder. This
definition is in accordance with our previous definition of an irreversible process
as one which “does not look right” when viewed backwards. One easy way of
viewing macroscopic events in reverse is to film them, and then play the film
backwards through a projector. There is a famous passage in the novel “Slaughterhouse 5,”
by Kurt Vonnegut, in which the hero, Billy Pilgrim, views a propaganda
film of an American World War II bombing raid on a German city in
reverse. This is what the film appeared to show:
“American planes, full of holes and wounded men and corpses took off backwards
from an airfield in England. Over France, a few German fighter planes
ﬂew at them backwards, sucked bullets and shell fragments from some of the
planes and crewmen. They did the same for wrecked American bombers on
the ground, and those planes ﬂew up backwards and joined the formation.
The formation ﬂew backwards over a German city that was in ﬂames. The
bombers opened their bomb bay doors, exerted a miraculous magnetism
which shrunk the ﬁres, gathered them into cylindrical steel containers, and
lifted the containers into the bellies of the planes. The containers were stored
neatly in racks. The Germans had miraculous devices of their own, which were
long steel tubes. They used them to suck more fragments from the crewmen
and planes. But there were still a few wounded Americans, though, and some
of the bombers were in bad repair. Over France, though, German ﬁghters
came up again, made everything and everybody as good as new.”
Vonnegut’s point, I suppose, is that the morality of actions is inverted when you
view them in reverse.
What is there about this passage which strikes us as surreal and fantastic?
What is there that immediately tells us that the events shown in the ﬁlm could
never happen in reality? It is not so much that the planes appear to ﬂy backwards
and the bombs appear to fall upwards. After all, given a little ingenuity and a
sufﬁciently good pilot, it is probably possible to ﬂy a plane backwards. Likewise,
if we were to throw a bomb up in the air with just the right velocity we could, in
principle, ﬁx it so that the velocity of the bomb matched that of a passing bomber
when their paths intersected. Certainly, if you had never seen a plane before it
would not be obvious which way around it was supposed to ﬂy. However, certain
events are depicted in the ﬁlm, “miraculous” events in Vonnegut’s words, which
would immediately strike us as the wrong way around even if we had never seen
them before. For instance, the ﬁlm might show thousands of white hot bits of
shrapnel approach each other from all directions at great velocity, compressing
an explosive gas in the process, which slows them down such that when they meet
they ﬁt together exactly to form a metal cylinder enclosing the gases and moving
upwards at great velocity. What strikes us as completely implausible about this
event is the spontaneous transition from the disordered motion of the gases and
metal fragments to the ordered upward motion of the bomb.
5.7 Properties of entropy
Entropy, as we have defined it, has some dependence on the resolution δE to
which the energy of macrostates is measured. Recall that Ω(E) is the number of
accessible microstates with energy in the range E to E + δE. Suppose that we
choose a new resolution δ*E and define a new density of states Ω*(E) which is
the number of states with energy in the range E to E + δ*E. It can easily be seen
that

Ω*(E) = (δ*E / δE) Ω(E).  (5.71)

It follows that the new entropy S* = k ln Ω* is related to the previous entropy
S = k ln Ω via

S* = S + k ln(δ*E / δE).  (5.72)
Now, our usual estimate that Ω ∼ E^f gives S ∼ kf, where f is the number of
degrees of freedom. It follows that even if δ*E were to differ from δE by of order
f (i.e., twenty-four orders of magnitude), which is virtually inconceivable, the
second term on the right-hand side of the above equation is still only of order
k ln f, which is utterly negligible compared to kf. It follows that

S* = S  (5.73)

to an excellent approximation, so our definition of entropy is completely insensitive
to the resolution to which we measure energy (or any other macroscopic
parameter).
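The insensitivity claimed in Eq. (5.73) can be checked with numbers; a sketch, with f chosen at a laboratory scale:

```python
import math

f = 1e24          # degrees of freedom for a laboratory-scale system
S_over_k = f      # our usual estimate S ~ k f, expressed in units of k

# Change the energy resolution by a factor of order f itself
# (twenty-four orders of magnitude):
resolution_ratio = f
correction_over_k = math.log(resolution_ratio)   # k ln(delta*E/deltaE), Eq. (5.72)

print(f"S/k              ~ {S_over_k:.1e}")
print(f"correction/k     ~ {correction_over_k:.1f}")          # ~ 55
print(f"fractional shift ~ {correction_over_k / S_over_k:.1e}")  # ~ 5.5e-23
```

Even this absurdly large change of resolution shifts S by about fifty parts in 10^24, which is why the additive term in Eq. (5.72) can be dropped.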
Note that, like the temperature, the entropy of a macrostate is only well-defined
if the macrostate is in equilibrium. The crucial point is that it only makes
sense to talk about the number of accessible states if the systems in the ensemble
are given sufficient time to thoroughly explore all of the possible microstates
consistent with the known macroscopic constraints. In other words, we can only
be sure that a given microstate is inaccessible when the systems in the ensemble
have had ample opportunity to move into it, and yet have not done so. Note
that for an equilibrium state, the entropy is just as well-defined as more familiar
quantities such as the temperature and the mean pressure.
Consider, again, two systems A and A′ which are in thermal contact but can do
no work on one another (see Sect. 5.2). Let E and E′ be the energies of the two
systems, and Ω(E) and Ω′(E′) the respective densities of states. Furthermore, let
E^(0) be the conserved energy of the system as a whole and Ω^(0) the corresponding
density of states. We have from Eq. (5.2) that

Ω^(0)(E) = Ω(E) Ω′(E′),  (5.74)

where E′ = E^(0) − E. In other words, the number of states accessible to the whole
system is the product of the numbers of states accessible to each subsystem, since
every microstate of A can be combined with every microstate of A′ to form a
distinct microstate of the whole system. We know, from Sect. 5.2, that in equilibrium
the mean energy of A takes the value Ē = Ẽ for which Ω^(0)(E) is maximum,
and the temperatures of A and A′ are equal. The distribution of E around the
mean value is of order ∆*E = Ẽ/√f, where f is the number of degrees of freedom.
It follows that the total number of accessible microstates is approximately
the number of states which lie within ∆*E of Ẽ. Thus,

Ω^(0)_tot ≃ [Ω^(0)(Ẽ) / δE] ∆*E.  (5.75)
The entropy of the whole system is given by

S^(0) = k ln Ω^(0)_tot = k ln Ω^(0)(Ẽ) + k ln(∆*E / δE).  (5.76)
According to our usual estimate, Ω ∼ E^f, the first term on the right-hand side is
of order kf whereas the second term is of order k ln(Ẽ/√f δE). Any reasonable
choice for the energy subdivision δE should be greater than Ẽ/f, otherwise there
would be less than one microstate per subdivision. It follows that the second term
is less than or of order k ln f, which is utterly negligible compared to kf. Thus,
S^(0) = k ln Ω^(0)(Ẽ) = k ln[Ω(Ẽ) Ω′(Ẽ′)] = k ln Ω(Ẽ) + k ln Ω′(Ẽ′)  (5.77)

to an excellent approximation, giving

S^(0) = S(Ẽ) + S′(Ẽ′).  (5.78)
It can be seen that the probability distribution for Ω^(0)(E) is so strongly peaked
around its maximum value that, for the purpose of calculating the entropy, the
total number of states is equal to the maximum number of states [i.e., Ω^(0)_tot ∼
Ω^(0)(Ẽ)]. One consequence of this is that the entropy has the simple additive
property shown in Eq. (5.78). Thus, the total entropy of two thermally interacting
systems in equilibrium is the sum of the entropies of each system in isolation.
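The additivity of Eq. (5.78) follows directly from the product rule of Eq. (5.74): multiplicities multiply, so their logarithms add. A sketch, with small collections of two-state spins standing in for A and A′ (the system sizes are illustrative):

```python
import math

# Multiplicities of two subsystems: N independent two-state spins
# have Omega = 2^N microstates.
omega_A = 2 ** 10       # subsystem A: 10 spins
omega_B = 2 ** 15       # subsystem A': 15 spins

# Every microstate of A pairs with every microstate of A' (Eq. 5.74):
omega_total = omega_A * omega_B

# Entropies in units of k: S/k = ln Omega (Eq. 5.68)
S_A = math.log(omega_A)
S_B = math.log(omega_B)
S_total = math.log(omega_total)

assert math.isclose(S_total, S_A + S_B)   # Eq. (5.78): entropies add
print(S_A, S_B, S_total)
```

The same counting argument holds whatever the microstates are; the logarithm is precisely what converts the multiplicative structure of Ω into an extensive (additive) quantity.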
5.8 Uses of entropy
We have defined a new function called entropy, denoted S, which parameterizes
the amount of disorder in a macroscopic system. The entropy of an equilibrium
macrostate is related to the number of accessible microstates Ω via

S = k ln Ω.  (5.79)

On a macroscopic level, the increase in entropy due to a quasi-static change in
which an infinitesimal amount of heat đQ is absorbed by the system is given by

dS = đQ / T,  (5.80)

where T is the absolute temperature of the system. The second law of thermodynamics
states that the entropy of an isolated system can never spontaneously
decrease. Let us now briefly examine some consequences of these results.
Consider two bodies, A and A′, which are in thermal contact but can do no
work on one another. We know what is supposed to happen here. Heat flows from
the hotter to the colder of the two bodies until their temperatures are the same.
Consider a quasi-static exchange of heat between the two bodies. According to
the first law of thermodynamics, if an infinitesimal amount of heat đQ is absorbed
by A then infinitesimal heat đQ′ = −đQ is absorbed by A′. The increase in
the entropy of system A is dS = đQ/T and the corresponding increase in the
entropy of A′ is dS′ = đQ′/T′. Here, T and T′ are the temperatures of the two
systems, respectively. Note that đQ is assumed to be sufficiently small that the
heat transfer does not substantially modify the temperatures of either system.
The change in entropy of the whole system is

dS^(0) = dS + dS′ = (1/T − 1/T′) đQ.  (5.81)
This change must be positive or zero, according to the second law of thermodynamics,
so dS^(0) ≥ 0. It follows that đQ is positive (i.e., heat flows from A′ to
A) when T′ > T, and vice versa. The spontaneous flow of heat only ceases when
T = T′. Thus, the direction of spontaneous heat flow is a consequence of the
second law of thermodynamics. Note that the spontaneous flow of heat between
bodies at different temperatures is always an irreversible process which increases
the entropy, or disorder, of the Universe.
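The sign argument following Eq. (5.81) can be sketched numerically; the temperatures and the amount of heat below are illustrative values only:

```python
def total_entropy_change(dQ, T, T_prime):
    """dS0 = (1/T - 1/T') dQ for heat dQ absorbed by body A (at T)
    from body A' (at T'), Eq. (5.81)."""
    return (1.0 / T - 1.0 / T_prime) * dQ

# Heat flowing from the hotter body A' (400 K) into the colder body A (300 K):
print(total_entropy_change(dQ=1.0, T=300.0, T_prime=400.0) > 0)   # True: allowed

# The reverse flow (dQ < 0 while T' > T) would decrease the total entropy,
# so the second law forbids it:
print(total_entropy_change(dQ=-1.0, T=300.0, T_prime=400.0) < 0)  # True: forbidden

# At equal temperatures the net entropy change vanishes: equilibrium.
print(total_entropy_change(dQ=1.0, T=350.0, T_prime=350.0))
```

The prefactor (1/T − 1/T′) does all the work: its sign fixes which direction of heat flow is compatible with dS^(0) ≥ 0, and it vanishes exactly when T = T′.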
Consider, now, the slightly more complicated situation in which the two systems
can exchange heat and also do work on one another via a movable partition.
Suppose that the total volume is invariant, so that

V^(0) = V + V′ = constant,  (5.82)
where V and V′ are the volumes of A and A′, respectively. Consider a quasi-static
change in which system A absorbs an infinitesimal amount of heat đQ
and its volume simultaneously increases by an infinitesimal amount dV. The
infinitesimal amount of work done by system A is đW = p̄ dV (see Sect. 4.4),
where p̄ is the mean pressure of A. According to the first law of thermodynamics,

đQ = dE + đW = dE + p̄ dV,  (5.83)

where dE is the change in the internal energy of A. Since dS = đQ/T, the increase
in entropy of system A is written

dS = (dE + p̄ dV) / T.  (5.84)
Likewise, the increase in entropy of system A′ is given by

dS′ = (dE′ + p̄′ dV′) / T′.  (5.85)
According to Eq. (5.84),

1/T = (∂S/∂E)_V,  (5.86)
p̄/T = (∂S/∂V)_E,  (5.87)

where the subscripts are to remind us what is held constant in the partial derivatives.
We can write a similar pair of equations for the system A′.
The overall system is assumed to be isolated, so conservation of energy gives
dE + dE′ = 0. Furthermore, Eq. (5.82) implies that dV + dV′ = 0. It follows that
the total change in entropy is given by

dS^(0) = dS + dS′ = (1/T − 1/T′) dE + (p̄/T − p̄′/T′) dV.  (5.88)
The equilibrium state is the most probable state (see Sect. 5.2). According to
statistical mechanics, this is equivalent to the state with the largest number of
accessible microstates. Finally, Eq. (5.79) implies that this is the maximum entropy
state. The system can never spontaneously leave a maximum entropy state,
since this would imply a spontaneous reduction in entropy, which is forbidden
by the second law of thermodynamics. A maximum or minimum entropy state
must satisfy dS^(0) = 0 for arbitrary small variations of the energy and external
parameters. It follows from Eq. (5.88) that

T = T′,  (5.89)
p̄ = p̄′,  (5.90)
for such a state. This corresponds to a maximum entropy state (i.e., an equilibrium
state) provided

(∂²S/∂E²)_V < 0,  (5.91)
(∂²S/∂V²)_E < 0,  (5.92)

with a similar pair of inequalities for system A′. The usual estimate Ω ∝ E^f V^f,
giving S = kf ln E + kf ln V + ..., ensures that the above inequalities are satisfied
in conventional macroscopic systems. In the maximum entropy state the systems
A and A′ have equal temperatures (i.e., they are in thermal equilibrium) and
equal pressures (i.e., they are in mechanical equilibrium). The second law of
thermodynamics implies that the two interacting systems will evolve towards this
state, and will then remain in it indefinitely (if left undisturbed).
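That the estimate S = kf ln E + kf ln V satisfies the stability conditions (5.91) and (5.92) can be checked by finite differences; a sketch in arbitrary units (the evaluation point and kf are illustrative):

```python
import math

def S(E, V, kf=1.0):
    """Entropy estimate S = k f (ln E + ln V), additive constant omitted."""
    return kf * (math.log(E) + math.log(V))

def second_derivative(g, x, h=1e-4):
    """Central finite-difference estimate of g''(x)."""
    return (g(x + h) - 2.0 * g(x) + g(x - h)) / h**2

# (d2S/dE2)_V and (d2S/dV2)_E at E = V = 2 (arbitrary units):
d2S_dE2 = second_derivative(lambda E: S(E, 2.0), 2.0)
d2S_dV2 = second_derivative(lambda V: S(2.0, V), 2.0)
print(d2S_dE2 < 0, d2S_dV2 < 0)   # both True: a maximum entropy state
```

Analytically, (∂²S/∂E²)_V = −kf/E², which is negative for any positive energy, so the stationary point found from Eqs. (5.89) and (5.90) really is a maximum.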
5.9 Entropy and quantum mechanics
The entropy of a system is defined in terms of the number Ω of accessible microstates
consistent with an overall energy in the range E to E + δE via

S = k ln Ω.  (5.93)

We have already demonstrated that this definition is utterly insensitive to the resolution
δE to which the macroscopic energy is measured (see Sect. 5.7). In classical
mechanics, if a system possesses f degrees of freedom then phase-space is
conventionally subdivided into cells of arbitrarily chosen volume h_0^f (see Sect. 3.2).
The number of accessible microstates is equivalent to the number of these cells in
the volume of phase-space consistent with an overall energy of the system lying
in the range E to E + δE. Thus,

Ω = (1/h_0^f) ∫···∫ dq_1 ··· dq_f dp_1 ··· dp_f,  (5.94)

giving

S = k ln(∫···∫ dq_1 ··· dq_f dp_1 ··· dp_f) − kf ln h_0.  (5.95)
Thus, in classical mechanics the entropy is undetermined to an arbitrary additive
constant which depends on the size of the cells in phase-space. In fact, S increases
as the cell size decreases. The second law of thermodynamics is only concerned
with changes in entropy, and is, therefore, unaffected by an additive constant.
Likewise, macroscopic thermodynamical quantities, such as the temperature and
pressure, which can be expressed as partial derivatives of the entropy with respect
to various macroscopic parameters [see Eqs. (5.86) and (5.87)] are unaffected by
such a constant. So, in classical mechanics the entropy is rather like a gravitational
potential: it is undetermined to an additive constant, but this does not
affect any physical laws.
The non-unique value of the entropy comes about because there is no limit to
the precision to which the state of a classical system can be specified. In other
words, the cell size h_0 can be made arbitrarily small, which corresponds to
specifying the particle coordinates and momenta to arbitrary accuracy. However, in
quantum mechanics the uncertainty principle sets a definite limit to how accurately
the particle coordinates and momenta can be specified. In general,

δq_i δp_i ≥ h,  (5.96)

where p_i is the momentum conjugate to the generalized coordinate q_i, and δq_i,
δp_i are the uncertainties in these quantities, respectively. In fact, in quantum
mechanics the number of accessible quantum states with the overall energy in
the range E to E + δE is completely determined. This implies that, in reality, the
entropy of a system has a unique and unambiguous value. Quantum mechanics
can often be “mocked up” in classical mechanics by setting the cell size in phase-space
equal to Planck’s constant, so that h_0 = h. This automatically enforces the
most restrictive form of the uncertainty principle, δq_i δp_i = h. In many systems,
the substitution h_0 → h in Eq. (5.95) gives the same, unique value for S as that
obtained from a full quantum mechanical calculation.
Consider a simple quantum mechanical system consisting of N non-interacting
spinless particles of mass m confined in a cubic box of dimension L. The energy
levels of the ith particle are given by

e_i = [ħ²π²/(2mL²)] (n_i1² + n_i2² + n_i3²),  (5.97)
where n_i1, n_i2, and n_i3 are three (positive) quantum numbers. The overall energy
of the system is the sum of the energies of the individual particles, so that for a
general state r

E_r = Σ_{i=1}^{N} e_i.  (5.98)
The overall state of the system is completely specified by 3N quantum numbers,
so the number of degrees of freedom is f = 3N. The classical limit corresponds
to the situation where all of the quantum numbers are much greater than unity.
In this limit, the number of accessible states varies with energy according to our
usual estimate Ω ∝ E^f. The lowest possible energy state of the system, the so-called
ground-state, corresponds to the situation where all quantum numbers
take their lowest possible value, unity. Thus, the ground-state energy E_0 is given
by

E_0 = f ħ²π² / (2mL²).  (5.99)

There is only one accessible microstate at the ground-state energy (i.e., that
where all quantum numbers are unity), so by our usual definition of entropy

S(E_0) = k ln 1 = 0.  (5.100)

In other words, there is no disorder in the system when all the particles are in
their ground-states.
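The statement that only one microstate exists at the ground-state energy can be checked by brute-force enumeration for a single particle; a sketch, with energies measured in units of ħ²π²/(2mL²) so that the state (n1, n2, n3) has energy n1² + n2² + n3²:

```python
from itertools import product

def count_states(E, dE, n_max=20):
    """Number of single-particle states (n1, n2, n3), n_i >= 1, with
    n1^2 + n2^2 + n3^2 in [E, E + dE); energies in units of
    hbar^2 pi^2 / (2 m L^2), cf. Eq. (5.97)."""
    return sum(1 for n in product(range(1, n_max + 1), repeat=3)
               if E <= n[0]**2 + n[1]**2 + n[2]**2 < E + dE)

print(count_states(E=3, dE=1))     # 1: the unique ground state (1, 1, 1)
print(count_states(E=100, dE=10))  # many more states at higher energy
```

The count in a fixed energy window grows rapidly with E, in accordance with the classical estimate Ω ∝ E^f, yet collapses to a single state at E = E_0, which is exactly the behaviour the third-law argument below relies on.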
Clearly, as the energy approaches the ground-state energy, the number of accessible
states becomes far less than the usual classical estimate E^f. This is true
for all quantum mechanical systems. In general, the number of microstates varies
roughly like

Ω(E) ∼ 1 + C (E − E_0)^f,  (5.101)
where C is a positive constant. According to Eq. (5.31), the temperature varies
approximately like

T ∼ (E − E_0) / (k f),  (5.102)

provided Ω ≫ 1. Thus, as the absolute temperature of a system approaches
zero, the internal energy approaches a limiting value E_0 (the quantum mechanical
ground-state energy), and the entropy approaches the limiting value zero. This
proposition is known as the third law of thermodynamics.
At low temperatures, great care must be taken to ensure that equilibrium
thermodynamical arguments are applicable, since the rate of attaining equilibrium
may be very slow. Another difficulty arises when dealing with a system in which
the atoms possess nuclear spins. Typically, when such a system is brought to a
very low temperature the entropy associated with the degrees of freedom not
involving nuclear spins becomes negligible. Nevertheless, the number of microstates
Ω_s corresponding to the possible nuclear spin orientations may be very
large. Indeed, it may be just as large as the number of states at room temperature.
The reason for this is that nuclear magnetic moments are extremely small,
and, therefore, have extremely weak mutual interactions. Thus, it only takes a
tiny amount of heat energy in the system to completely randomize the spin
orientations. Typically, a temperature as small as 10^−3 degrees kelvin above absolute
zero is sufficient to randomize the spins.
Suppose that the system consists of N atoms of spin 1/2. Each spin can have
two possible orientations. If there is enough residual heat energy in the system to
randomize the spins then each orientation is equally likely. It follows that there
are Ω_s = 2^N accessible spin states. The entropy associated with these states is
S_0 = k ln Ω_s = νR ln 2. Below some critical temperature, T_0, the interaction
between the nuclear spins becomes significant, and the system settles down in some
unique quantum mechanical ground-state (e.g., with all spins aligned). In this
situation, S → 0, in accordance with the third law of thermodynamics. However,
for temperatures which are small, but not small enough to “freeze out” the nuclear
spin degrees of freedom, the entropy approaches a limiting value S_0 which
depends only on the kinds of atomic nuclei in the system. This limiting value is
independent of the spatial arrangement of the atoms, or the interactions between
them. Thus, for most practical purposes the third law of thermodynamics can be
written
as T → 0⁺,  S → S_0,  (5.103)

where 0⁺ denotes a temperature which is very close to absolute zero, but still
much larger than T_0. This modification of the third law is useful because it can
be applied at temperatures which are not prohibitively low.
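The two regimes just described, a plateau at S_0 = k ln 2 per spin well above the ordering scale, and a collapse to zero below it, can be sketched with a two-level model. This is an illustration, not the nuclear-spin calculation itself: the level splitting eps stands in for the spin-spin interaction energy that sets T_0, and all values are in arbitrary units:

```python
import math

def spin_entropy_per_particle(T, eps=1.0, k=1.0):
    """Entropy per spin, in units of k, for a two-level system with level
    splitting eps, from the partition function Z = 1 + exp(-eps/kT):
    S/k = ln Z + <E>/(k T)."""
    x = eps / (k * T)
    Z = 1.0 + math.exp(-x)
    mean_E = eps * math.exp(-x) / Z
    return math.log(Z) + mean_E / (k * T)

print(spin_entropy_per_particle(100.0))  # ~ ln 2: spins fully randomized
print(spin_entropy_per_particle(0.01))   # ~ 0: third-law limit below T_0
```

For kT ≫ eps the entropy sits at the limiting value ln 2 per spin (i.e., S_0 = νR ln 2 per mole), while for kT ≪ eps the spins settle into the unique ground-state and S → 0.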
5.10 The laws of thermodynamics
We have now come to the end of our investigation of the fundamental postulates
of classical and statistical thermodynamics. The remainder of this course is
devoted to the application of the ideas we have just discussed to various situations
of interest in Physics. Before we proceed, however, it is useful to summarize the
results of our investigations. Our summary takes the form of a number of general
statements regarding macroscopic variables which are usually referred to as the
laws of thermodynamics:
Zeroth Law: If two systems are separately in thermal equilibrium with a third
system then they must be in equilibrium with one another (see Sect. 5.3).
First Law: The change in internal energy of a system in going from one
macrostate to another is the diﬀerence between the net heat absorbed by the
system from its surroundings and the net work done by the system on its
surroundings (see Sect. 4.1).
Second Law: The entropy of an isolated system can never spontaneously
decrease (see Sect. 5.6).
Third Law: In the limit as the absolute temperature tends to zero the
entropy also tends to zero (see Sect. 5.9).
6 Classical thermodynamics
6.1 Introduction
We have learned that macroscopic quantities such as energy, temperature, and
pressure are, in fact, statistical in nature: i.e., in equilibrium they exhibit ran
dom ﬂuctuations about some mean value. If we were to plot out the probability
distribution for the energy, say, of a system in thermal equilibrium with its sur
roundings we would obtain a Gaussian with a very small fractional width. In fact,
we expect

∆*E/Ē ∼ 1/√f, (6.1)

where the number of degrees of freedom f is about 10^24 for laboratory scale
systems. This means that the statistical fluctuations of macroscopic quantities
about their mean values are typically only about 1 part in 10^12.
Since the statistical ﬂuctuations of equilibrium quantities are so small, we can
neglect them to an excellent approximation, and replace macroscopic quantities,
such as energy, temperature, and pressure, by their mean values. So, p → p̄, and
T → T̄, etc. In the following discussion, we shall drop the overbars altogether, so
that p should be understood to represent the mean pressure ¯ p, etc. This prescrip
tion, which is the essence of classical thermodynamics, is equivalent to replacing
all statistically varying quantities by their most probable values.
Although there are formally four laws of thermodynamics (i.e., the zeroth to
the third), the zeroth law is really a consequence of the second law, and the third
law is actually only important at temperatures close to absolute zero. So, for
most purposes, the two laws which really matter are the ﬁrst law and the second
law. For an inﬁnitesimal process, the ﬁrst law is written
¯ dQ = dE + ¯ dW, (6.2)
where dE is the change in internal energy of the system, ¯ dQ is the heat absorbed
by the system, and ¯ dW is the work done by the system on its surroundings. Note
that this is just a convention. We could equally well write the ﬁrst law in terms of
the heat emitted by the system or the work done on the system. It does not really
matter, as long as we are consistent in our deﬁnitions.
The second law of thermodynamics implies that
¯ dQ = T dS, (6.3)
for a quasistatic process, where T is the thermodynamic temperature, and dS is
the change in entropy of the system. Furthermore, for systems in which the only
external parameter is the volume (i.e., gases), the work done on the environment
is
¯ dW = pdV, (6.4)
where p is the pressure, and dV is the change in volume. Thus, it follows from
the ﬁrst and second laws of thermodynamics that
T dS = dE +pdV. (6.5)
6.2 The equation of state of an ideal gas
Let us start our discussion by considering the simplest possible macroscopic sys
tem: i.e., an ideal gas. All of the thermodynamic properties of an ideal gas are
summed up in its equation of state, which determines the relationship between
its pressure, volume, and temperature. Unfortunately, classical thermodynamics
is unable to tell us what this equation of state is from ﬁrst principles. In fact,
classical thermodynamics cannot tell us anything from ﬁrst principles. We always
have to provide some information to begin with before classical thermodynamics
can generate any new results. This initial information may come from statistical
physics (i.e., from our knowledge of the microscopic structure of the system under
consideration), but, more usually, it is entirely empirical in nature (i.e., it is the
result of experiments). Of course, the ideal gas law was ﬁrst discovered empiri
cally by Robert Boyle, but, nowadays, we can justify it from statistical arguments.
Recall (from Sect. 3.8) that the number of accessible states of a monatomic ideal
gas varies like

Ω ∝ V^N χ(E), (6.6)
where N is the number of atoms, and χ(E) depends only on the energy of the gas
(and is independent of the volume). We obtained this result by integrating over
the volume of accessible phasespace. Since the energy of an ideal gas is inde
pendent of the particle coordinates (because there are no interatomic forces), the
integrals over the coordinates just reduced to N simultaneous volume integrals,
giving the V^N factor in the above expression. The integrals over the particle
momenta were more complicated, but were clearly completely independent of V,
giving the χ(E) factor in the above expression. Now, we have a statistical rule
which tells us that
X̄_α = (1/β) ∂lnΩ/∂x_α (6.7)

[see Eq. (5.41)], where X̄_α is the mean force conjugate to the external parameter
x_α (i.e., ¯dW = Σ_α X̄_α dx_α), and β = 1/kT. For an ideal gas, the only external
parameter is the volume, and its conjugate force is the pressure (since ¯dW = p dV).
So, we can write

p = (1/β) ∂lnΩ/∂V. (6.8)

If we simply apply this rule to Eq. (6.6), we obtain

p = NkT/V. (6.9)

However, N = νN_A, where ν is the number of moles, and N_A is Avogadro's
number. Also, kN_A = R, where R is the ideal gas constant. This allows us to
write the equation of state in its usual form

pV = νRT. (6.10)
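The equation of state is easy to exercise numerically. The sketch below (plain Python in SI units; the constant value is the one quoted later in Sect. 6.4, and the helper name `ideal_gas_pressure` is ours, not from the notes) evaluates p = νRT/V for one mole of gas at standard conditions:

```python
# Ideal gas equation of state, Eq. (6.10): p V = nu R T.
# Illustrative sketch in SI units; the function name is ours.

R = 8.3143  # ideal gas constant (joules/mole/deg.)

def ideal_gas_pressure(nu, T, V):
    """Pressure (N m^-2) of nu moles of ideal gas at temperature T (K) in volume V (m^3)."""
    return nu * R * T / V

# One mole at 273 K in 22.4 litres comes out at about one atmosphere.
p = ideal_gas_pressure(nu=1.0, T=273.0, V=22.4e-3)
print(p)  # ~1.0e5 N m^-2, i.e., about 1 atmosphere
```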
The above derivation of the ideal gas equation of state is rather elegant. It is
certainly far easier to obtain the equation of state in this manner than to treat
the atoms which make up the gas as little billiard balls which continually bounce
off the walls of a container. The latter derivation is difficult to perform correctly
because it is necessary to average over all possible directions of atomic motion. It
is clear, from the above derivation, that the crucial element needed to obtain the
ideal gas equation of state is the absence of interatomic forces. This automatically
gives rise to a variation of the number of accessible states with E and V of the
form (6.6), which, in turn, implies the ideal gas law. So, the ideal gas law should
also apply to polyatomic gases with no interatomic forces. Polyatomic gases are
more complicated than monatomic gases because the molecules can rotate and
vibrate, giving rise to extra degrees of freedom, in addition to the translational
degrees of freedom of a monatomic gas. In other words, χ(E), in Eq. (6.6),
becomes a lot more complicated in polyatomic gases. However, as long as there
are no interatomic forces, the volume dependence of Ω is still V^N, and the ideal
gas law should still hold true. In fact, we shall discover that the extra degrees of
gas law should still hold true. In fact, we shall discover that the extra degrees of
freedom of polyatomic gases manifest themselves by increasing the speciﬁc heat
capacity.
There is one other conclusion we can draw from Eq. (6.6). The statistical
deﬁnition of temperature is [Eq. (5.31)]
1/(kT) = ∂lnΩ/∂E. (6.11)

It follows that

1/(kT) = ∂lnχ/∂E. (6.12)
We can see that since χ is a function of the energy, but not the volume, then the
temperature must be a function of the energy, but not the volume. We can turn
this around and write
E = E(T). (6.13)
In other words, the internal energy of an ideal gas depends only on the tem
perature of the gas, and is independent of the volume. This is pretty obvious,
since if there are no interatomic forces then increasing the volume, which effec
tively increases the mean separation between molecules, is not going to affect the
molecular energies in any way. Hence, the energy of the whole gas is unaffected.
The volume independence of the internal energy can also be obtained directly
from the ideal gas equation of state. The internal energy of a gas can be consid
ered as a general function of the temperature and volume, so
E = E(T, V). (6.14)
It follows from mathematics that
dE = (∂E/∂T)_V dT + (∂E/∂V)_T dV, (6.15)
where the subscript V reminds us that the ﬁrst partial derivative is taken at con
stant volume, and the subscript T reminds us that the second partial derivative
is taken at constant temperature. Thermodynamics tells us that for a quasistatic
change of parameters
T dS = dE +pdV. (6.16)
The ideal gas law can be used to express the pressure in term of the volume and
the temperature in the above expression. Thus,
dS = (1/T) dE + (νR/V) dV. (6.17)
Using Eq. (6.15), this becomes
dS = (1/T)(∂E/∂T)_V dT + [(1/T)(∂E/∂V)_T + νR/V] dV. (6.18)
However, dS is the exact differential of a welldeﬁned state function, S. This
means that we can consider the entropy to be a function of temperature and
volume. Thus, S = S(T, V), and mathematics immediately tells us that
dS = (∂S/∂T)_V dT + (∂S/∂V)_T dV. (6.19)
The above expression is true for all small values of dT and dV, so a comparison
with Eq. (6.18) gives
(∂S/∂T)_V = (1/T)(∂E/∂T)_V, (6.20)

(∂S/∂V)_T = (1/T)(∂E/∂V)_T + νR/V. (6.21)
One wellknown property of partial differentials is the equality of second deriva
tives, irrespective of the order of differentiation, so
∂²S/∂V∂T = ∂²S/∂T∂V. (6.22)
This implies that
(∂/∂V)_T (∂S/∂T)_V = (∂/∂T)_V (∂S/∂V)_T. (6.23)
The above expression can be combined with Eqs. (6.20) and (6.21) to give
(1/T) ∂²E/∂V∂T = −(1/T²)(∂E/∂V)_T + (1/T) ∂²E/∂T∂V. (6.24)
Since second derivatives are equivalent, irrespective of the order of differentia
tion, the above relation reduces to
(∂E/∂V)_T = 0, (6.25)
which implies that the internal energy is independent of the volume for any gas
obeying the ideal equation of state. This result was conﬁrmed experimentally by
James Joule in the middle of the nineteenth century.
6.3 Heat capacity or speciﬁc heat
Suppose that a body absorbs an amount of heat ∆Q and its temperature conse
quently rises by ∆T. The usual deﬁnition of the heat capacity, or speciﬁc heat, of
the body is
C = ∆Q/∆T. (6.26)
If the body consists of ν moles of some substance then the molar speciﬁc heat (i.e.,
the specific heat of one mole of this substance) is defined

c = (1/ν) ∆Q/∆T. (6.27)
In writing the above expressions, we have tacitly assumed that the speciﬁc heat
of a body is independent of its temperature. In general, this is not true. We can
overcome this problem by only allowing the body in question to absorb a very
small amount of heat, so that its temperature only rises slightly, and its speciﬁc
heat remains approximately constant. In the limit as the amount of absorbed heat
becomes inﬁnitesimal, we obtain
c = (1/ν) ¯dQ/dT. (6.28)
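Definition (6.28) can be checked numerically as a finite difference. The sketch below (our own Python, not part of the notes) anticipates the result E = (3/2)νRT, derived for a monatomic ideal gas in Sect. 6.4, and estimates the molar specific heat at constant volume from a small absorbed heat ∆Q = ∆E:

```python
# Finite-difference estimate of a molar specific heat, Eq. (6.28):
# c = (1/nu) dQ/dT, evaluated at constant volume, so dQ = dE.
# Sketch only: E = (3/2) nu R T (Sect. 6.4) is used as the test case.

R = 8.3143  # ideal gas constant (joules/mole/deg.)

def internal_energy(nu, T):
    """Internal energy of nu moles of monatomic ideal gas (Sect. 6.4)."""
    return 1.5 * nu * R * T

def c_v_estimate(nu, T, dT=1e-3):
    """Estimate c_V from a small absorbed heat dQ = dE and temperature rise dT."""
    dE = internal_energy(nu, T + dT) - internal_energy(nu, T)
    return dE / (nu * dT)

print(c_v_estimate(nu=2.0, T=300.0))  # ~12.47 joules/mole/deg., i.e., (3/2) R
```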
In classical thermodynamics, it is usual to deﬁne two speciﬁc heats. Firstly, the
molar specific heat at constant volume, denoted

c_V = (1/ν) (¯dQ/dT)_V, (6.29)

and, secondly, the molar specific heat at constant pressure, denoted

c_p = (1/ν) (¯dQ/dT)_p. (6.30)
Consider the molar speciﬁc heat at constant volume of an ideal gas. Since
dV = 0, no work is done by the gas, and the ﬁrst law of thermodynamics reduces
to
¯ dQ = dE. (6.31)
It follows from Eq. (6.29) that

c_V = (1/ν) (∂E/∂T)_V. (6.32)
Now, for an ideal gas the internal energy is volume independent. Thus, the above
expression implies that the speciﬁc heat at constant volume is also volume inde
pendent. Since E is a function only of T, we can write
dE = (∂E/∂T)_V dT. (6.33)

The previous two expressions can be combined to give

dE = ν c_V dT (6.34)
for an ideal gas.
Let us now consider the molar specific heat at constant pressure of an ideal gas.
In general, if the pressure is kept constant then the volume changes, and so the
gas does work on its environment. According to the ﬁrst law of thermodynamics,
¯dQ = dE + p dV = ν c_V dT + p dV. (6.35)
The equation of state of an ideal gas tells us that if the volume changes by dV,
the temperature changes by dT, and the pressure remains constant, then
p dV = νR dT. (6.36)
The previous two equations can be combined to give
¯dQ = ν c_V dT + νR dT. (6.37)
Now, by deﬁnition
c_p = (1/ν) (¯dQ/dT)_p, (6.38)

so we obtain

c_p = c_V + R (6.39)
for an ideal gas. This is a very famous result. Note that at constant volume all of
the heat absorbed by the gas goes into increasing its internal energy, and, hence,
its temperature, whereas at constant pressure some of the absorbed heat is used
to do work on the environment as the volume increases. This means that, in the
latter case, less heat is available to increase the temperature of the gas. Thus, we
expect the speciﬁc heat at constant pressure to exceed that at constant volume,
as indicated by the above formula.
The ratio of the two specific heats c_p/c_V is conventionally denoted γ. We have

γ ≡ c_p/c_V = 1 + R/c_V (6.40)
for an ideal gas. In fact, γ is very easy to measure because the speed of sound in
an ideal gas is written
c_s = √(γp/ρ), (6.41)

where ρ is the density. Table 2 lists some experimental measurements of c_V and
γ for common gases. The extent of the agreement between γ calculated from
Eq. (6.40) and the experimental γ is quite remarkable.
Gas             Symbol   c_V (experiment)   γ (experiment)   γ (theory)
Helium          He       12.5               1.666            1.666
Argon           Ar       12.5               1.666            1.666
Nitrogen        N2       20.6               1.405            1.407
Oxygen          O2       21.1               1.396            1.397
Carbon Dioxide  CO2      28.2               1.302            1.298
Ethane          C2H6     39.3               1.220            1.214

Table 2: Specific heats of common gases in joules/mole/deg. (at 15°C and 1 atm.) From Reif.
6.4 Calculation of speciﬁc heats
Now that we know the relationship between the speciﬁc heats at constant vol
ume and constant pressure for an ideal gas, it would be nice if we could calculate
either one of these quantities from first principles. Classical thermodynamics
cannot help us here. However, it is quite easy to calculate the specific heat at constant
volume using our knowledge of statistical physics. Recall, that the variation of
the number of accessible states of an ideal gas with energy and volume is written
Ω(E, V) ∝ V^N χ(E). (6.42)

For the specific case of a monatomic ideal gas, we worked out a more exact
expression for Ω in Sect. 3.8: i.e.,

Ω(E, V) = B V^N E^{3N/2}, (6.43)

where B is a constant independent of the energy and volume. It follows that

ln Ω = ln B + N ln V + (3N/2) ln E. (6.44)
The temperature is given by

1/(kT) = ∂lnΩ/∂E = (3N/2)(1/E), (6.45)

so

E = (3/2) NkT. (6.46)
Since N = νN_A, and N_A k = R, we can rewrite the above expression as

E = (3/2) νRT, (6.47)
where R = 8.3143 joules/mole/deg. is the ideal gas constant. The above formula
tells us exactly how the internal energy of a monatomic ideal gas depends on its
temperature.
The molar specific heat at constant volume of a monatomic ideal gas is clearly

c_V = (1/ν) (∂E/∂T)_V = (3/2) R. (6.48)

This has the numerical value

c_V = 12.47 joules/mole/deg. (6.49)
Furthermore, we have

c_p = c_V + R = (5/2) R, (6.50)

and

γ ≡ c_p/c_V = 5/3 = 1.667. (6.51)
We can see from the previous table that these predictions are borne out pretty
well for the monatomic gases Helium and Argon. Note that the speciﬁc heats
of polyatomic gases are larger than those of monatomic gases. This is because
polyatomic molecules can rotate around their centres of mass, as well as translate,
so polyatomic gases can store energy in the rotational, as well as the translational,
energy states of their constituent particles. We shall analyze this effect in greater
detail later on in this course.
6.5 Isothermal and adiabatic expansion
Suppose that the temperature of an ideal gas is held constant by keeping the gas
in thermal contact with a heat reservoir. If the gas is allowed to expand quasi
statically under these so called isothermal conditions then the ideal equation of
state tells us that
pV = constant. (6.52)
This is usually called the isothermal gas law.
Suppose, now, that the gas is thermally isolated from its surroundings. If the
gas is allowed to expand quasistatically under these so called adiabatic condi
tions then it does work on its environment, and, hence, its internal energy is
reduced, and its temperature changes. Let us work out the relationship between
the pressure and volume of the gas during adiabatic expansion.
According to the ﬁrst law of thermodynamics,
¯dQ = ν c_V dT + p dV = 0, (6.53)
in an adiabatic process (in which no heat is absorbed). The ideal gas equation of
state can be differentiated, yielding
p dV + V dp = νR dT. (6.54)
The temperature increment dT can be eliminated between the above two expres
sions to give
0 = (c_V/R)(p dV + V dp) + p dV = (c_V/R + 1) p dV + (c_V/R) V dp, (6.55)
which reduces to
(c_V + R) p dV + c_V V dp = 0. (6.56)
Dividing through by c_V pV yields

γ dV/V + dp/p = 0, (6.57)

where

γ ≡ c_p/c_V = (c_V + R)/c_V. (6.58)
It turns out that c_V is a very slowly varying function of temperature in most gases.
So, it is always a fairly good approximation to treat the ratio of specific heats γ
as a constant, at least over a limited temperature range. If γ is constant then we
can integrate Eq. (6.57) to give

γ ln V + ln p = constant, (6.59)
or

pV^γ = constant. (6.60)

This is the famous adiabatic gas law. It is very easy to obtain similar relationships
between V and T, and between p and T, during adiabatic expansion or contraction.
Since p = νRT/V, the above formula also implies that

T V^{γ−1} = constant, (6.61)

and

p^{1−γ} T^γ = constant. (6.62)
Equations (6.60)–(6.62) are all completely equivalent.
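The equivalence of (6.60)–(6.62) is easy to verify numerically. A sketch (our own Python, SI units) follows one mole of diatomic gas (γ = 1.4) along an adiabat and checks that all three combinations stay constant:

```python
# Follow an ideal gas (gamma = 1.4) along an adiabat and verify that
# p V^gamma, T V^(gamma-1), and p^(1-gamma) T^gamma, Eqs. (6.60)-(6.62),
# are all conserved. Illustrative sketch; SI units, nu = 1 mole.

R = 8.3143
gamma = 1.4
nu = 1.0

# Starting state: 1 atmosphere, 300 K.
p1, T1 = 1.0e5, 300.0
V1 = nu * R * T1 / p1

# Expand adiabatically to twice the volume.
V2 = 2.0 * V1
p2 = p1 * (V1 / V2) ** gamma   # from p V^gamma = constant
T2 = p2 * V2 / (nu * R)        # the equation of state then fixes T

assert abs(p1 * V1**gamma - p2 * V2**gamma) < 1e-6 * p1 * V1**gamma
assert abs(T1 * V1**(gamma - 1) - T2 * V2**(gamma - 1)) < 1e-6 * T1
assert abs(p1**(1 - gamma) * T1**gamma - p2**(1 - gamma) * T2**gamma) < 1e-3
print(T2)  # ~227 K: the gas cools as it expands and does work
```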
6.6 Hydrostatic equilibrium of the atmosphere
The gas which we are most familiar with in everyday life is, of course, the Earth’s
atmosphere. In fact, we can use the isothermal and adiabatic gas laws to explain
most of the observable features of the atmosphere.
Let us, ﬁrst of all, consider the hydrostatic equilibrium of the atmosphere.
Consider a thin vertical slice of the atmosphere of crosssectional area A which
starts at height z above ground level and extends to height z + dz. The upwards
force exerted on this slice from the gas below is p(z) A, where p(z) is the pressure
at height z. Likewise, the downward force exerted by the gas above the slice is
p(z+dz) A. The net upward force is clearly [p(z)−p(z+dz)]A. In equilibrium, this
upward force must be balanced by the downward force due to the weight of the
slice: this is ρ Adz g, where ρ is the density of the gas, and g is the acceleration
due to gravity. It follows that the force balance condition can be written
[p(z) − p(z + dz)] A = ρ A dz g, (6.63)
which reduces to
dp/dz = −ρg. (6.64)
This is called the equation of hydrostatic equilibrium for the atmosphere.
We can write the density of a gas in the following form,
ρ = νµ/V, (6.65)
, (6.65)
where µ is the molecular weight of the gas, and is equal to the mass of one mole
of gas particles. For instance, the molecular weight of Nitrogen gas is 28 grams.
The above formula for the density of a gas combined with the ideal gas law
pV = νRT yields
ρ = pµ/(RT). (6.66)
It follows that the equation of hydrostatic equilibrium can be rewritten
dp/p = −(µg/RT) dz. (6.67)
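Equation (6.67) can be integrated numerically as a sanity check. The sketch below (our Python; crude Euler steps, constant temperature, SI units) marches the hydrostatic relation upward and recovers an exponential fall-off with scale-height RT/(µg):

```python
import math

# Euler integration of the hydrostatic relation dp/p = -(mu g / R T) dz,
# Eq. (6.67), at constant temperature. Sketch only; SI units assumed.

R = 8.314       # ideal gas constant (joules/mole/degree)
mu = 0.029      # mean molecular weight of air (kg per mole)
g = 9.81        # acceleration due to gravity (m s^-2)
T = 288.0       # uniform temperature (K)

p = 1.0e5       # ground-level pressure (about 1 atmosphere)
dz = 10.0       # step size (metres)
z = 0.0
while z < 8400.0:              # integrate up through about one scale-height
    p += -p * mu * g / (R * T) * dz
    z += dz

# Compare with the exact isothermal result p0 exp(-z/z0), z0 = R T / (mu g).
z0 = R * T / (mu * g)
print(p, 1.0e5 * math.exp(-z / z0))  # the two agree to a fraction of a percent
```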
6.7 The isothermal atmosphere
As a ﬁrst approximation, let us assume that the temperature of the atmosphere is
uniform. In such an isothermal atmosphere, we can directly integrate the previous
equation to give
p = p_0 exp(−z/z_0). (6.68)
Here, p_0 is the pressure at ground level (z = 0), which is generally about 1 bar,
or 1 atmosphere (10^5 N m^−2 in SI units). The quantity

z_0 = RT/(µg) (6.69)
is called the isothermal scale-height of the atmosphere. At ground level, the
temperature is on average about 15° centigrade, which is 288 kelvin on the absolute
scale. The mean molecular weight of air at sea level is 29 (i.e., the molecular
weight of a gas made up of 78% Nitrogen, 21% Oxygen, and 1% Argon). The
acceleration due to gravity is 9.81 m s^−2 at ground level. Also, the ideal gas
constant is 8.314 joules/mole/degree. Putting all of this information together, the
isothermal scale-height of the atmosphere comes out to be about 8.4 kilometers.
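Plugging in the numbers is a one-liner. A sketch (our Python, using exactly the values quoted in the paragraph above):

```python
import math

# Isothermal scale-height z0 = R T / (mu g), Eq. (6.69), evaluated with
# the sea-level values quoted in the text.

R = 8.314      # ideal gas constant (joules/mole/degree)
T = 288.0      # mean ground-level temperature (K)
mu = 0.029     # mean molecular weight of air (kg per mole)
g = 9.81       # acceleration due to gravity (m s^-2)

z0 = R * T / (mu * g)
print(z0 / 1000.0)                    # ~8.4 kilometers
print(math.log(10.0) * z0 / 1000.0)   # distance per factor-of-10 pressure drop
```

The second number comes out near 19.4 km; the 19.3 km quoted in the text follows from rounding z_0 to 8.4 km first.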
We have discovered that in an isothermal atmosphere the pressure decreases
exponentially with increasing height. Since the temperature is assumed to be
constant, and ρ ∝ p/T [see Eq. (6.66)], it follows that the density also decreases
exponentially with the same scale-height as the pressure. According to Eq. (6.68),
the pressure, or density, decreases by a factor of 10 for every ln 10 z_0, or 19.3 kilometers, that
we move vertically upwards. Clearly, the effective height of the atmosphere is
pretty small compared to the Earth’s radius, which is about 6,400 kilometers. In
other words, the atmosphere constitutes a very thin layer covering the surface
of the Earth. Incidentally, this justiﬁes our neglect of the decrease of g with
increasing altitude.
One of the highest points in the United States of America is the peak of Mount
Elbert in Colorado. This peak lies 14,432 feet, or about 4.4 kilometers, above sea
level. At this altitude, our formula says that the air pressure should be about 0.6
atmospheres. Surprisingly enough, after a few days acclimatization, people can
survive quite comfortably at this sort of pressure. In the highest inhabited regions
of the Andes and Tibet, the air pressure falls to about 0.5 atmospheres. Humans
can just about survive at such pressures. However, people cannot survive for any
extended period in air pressures below half an atmosphere. This sets an upper
limit on the altitude of permanent human habitation, which is about 19,000 feet,
or 5.8 kilometers, above sea level. Incidentally, this is also the maximum altitude
at which a pilot can ﬂy an unpressurized aircraft without requiring additional
Oxygen.
The highest point in the world is, of course, the peak of Mount Everest in
Nepal. This peak lies at an altitude of 29,028 feet, or 8.85 kilometers, above sea
level, where we expect the air pressure to be a mere 0.35 atmospheres. This ex
plains why Mount Everest was only conquered after lightweight portable oxygen
cylinders were invented. Admittedly, some climbers have subsequently ascended
Mount Everest without the aid of additional oxygen, but this is a very foolhardy
venture, because above 19,000 feet the climbers are slowly dying.
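The altitude figures quoted in the last few paragraphs follow directly from Eq. (6.68). A sketch (our Python, with the scale-height of 8.4 km computed in the text):

```python
import math

# Isothermal pressure at altitude, p = p0 exp(-z/z0), Eq. (6.68), with
# z0 = 8.4 km, evaluated at the altitudes discussed in the text.

z0 = 8.4  # isothermal scale-height (km)

for name, z in [("Mount Elbert", 4.4),
                ("habitation limit", 5.8),
                ("Mount Everest", 8.85)]:
    p = math.exp(-z / z0)   # pressure in atmospheres (taking p0 = 1 atm)
    print(name, round(p, 2))
# Mount Elbert ~0.6 atm, habitation limit ~0.5 atm, Everest ~0.35 atm
```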
Commercial airliners fly at a cruising altitude of 32,000 feet. At this altitude,
we expect the air pressure to be only 0.3 atmospheres, which explains why air
line cabins are pressurized. In fact, the cabins are only pressurized to 0.85 atmo
spheres (which accounts for the “popping” of passengers’ ears during air travel).
The reason for this partial pressurization is quite simple. At 32,000 feet, the
pressure difference between the air in the cabin and that outside is about half an
atmosphere. Clearly, the walls of the cabin must be strong enough to support
this pressure difference, which means that they must be of a certain thickness,
and, hence, the aircraft must be of a certain weight. If the cabin were fully pres
surized then the pressure difference at cruising altitude would increase by about
30%, which means that the cabin walls would have to be much thicker, and,
hence, the aircraft would have to be substantially heavier. So, a fully pressurized
aircraft would be more comfortable to ﬂy in (because your ears would not “pop”),
but it would also be far less economical to operate.
6.8 The adiabatic atmosphere
Of course, we know that the atmosphere is not isothermal. In fact, air temper
ature falls quite noticeably with increasing altitude. In ski resorts, you are told
to expect the temperature to drop by about 1 degree per 100 meters you go
upwards. Many people cannot understand why the atmosphere gets colder the
higher up you go. They reason that as higher altitudes are closer to the Sun they
ought to be hotter. In fact, the explanation is quite simple. It depends on three
important properties of air. The ﬁrst important property is that air is transpar
ent to most, but by no means all, of the electromagnetic spectrum. In particular,
most infrared radiation, which carries heat energy, passes straight through the
lower atmosphere and heats the ground. In other words, the lower atmosphere
is heated from below, not from above. The second important property of air is
that it is constantly in motion. In fact, the lower 20 kilometers of the atmosphere
(the so called troposphere) are fairly thoroughly mixed. You might think that this
would imply that the atmosphere is isothermal. However, this is not the case
because of the final important property of air: i.e., it is a very poor conductor of
heat. This, of course, is why woolly sweaters work: they trap a layer of air close
to the body, and because air is such a poor conductor of heat you stay warm.
Imagine a packet of air which is being swirled around in the atmosphere. We
would expect it to always remain at the same pressure as its surroundings, other
wise it would be mechanically unstable. It is also plausible that the packet moves
around too quickly to effectively exchange heat with its surroundings, since air is
a very poor heat conductor, and heat flow is consequently quite a slow process.
So, to a ﬁrst approximation, the air in the packet is adiabatic. In a steadystate
atmosphere, we expect that as the packet moves upwards, expands due to the re
duced pressure, and cools adiabatically, its temperature always remains the same
as that of its immediate surroundings. This means that we can use the adiabatic
gas law to characterize the cooling of the atmosphere with increasing altitude. In
this particular case, the most useful manifestation of the adiabatic law is
p^{1−γ} T^γ = constant, (6.70)

giving

dp/p = [γ/(γ − 1)] dT/T. (6.71)
Combining the above relation with the equation of hydrostatic equilibrium, (6.67),
we obtain
[γ/(γ − 1)] dT/T = −(µg/RT) dz, (6.72)

or

dT/dz = −[(γ − 1)/γ] (µg/R). (6.73)
Now, the ratio of speciﬁc heats for air (which is effectively a diatomic gas) is
about 1.4 (see Tab. 2). Hence, we can calculate, from the above expression, that
the temperature of the atmosphere decreases with increasing height at a constant
rate of 9.8° centigrade per kilometer. This value is called the adiabatic lapse rate
of the atmosphere. Our calculation accords well with the “1 degree colder per
100 meters higher” rule of thumb used in ski resorts. The basic reason why
air is colder at higher altitudes is that it expands as its pressure decreases with
height. It, therefore, does work on its environment, without absorbing any heat
(because of its low thermal conductivity), so its internal energy, and, hence, its
temperature decreases.
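The lapse rate quoted above comes straight from Eq. (6.73). A quick sketch (our Python, with the same values used earlier for the isothermal scale-height):

```python
# Adiabatic lapse rate -dT/dz = [(gamma - 1)/gamma] (mu g / R), Eq. (6.73),
# for dry air. Sketch using the values quoted in the text.

R = 8.314       # ideal gas constant (joules/mole/degree)
mu = 0.029      # mean molecular weight of air (kg per mole)
g = 9.81        # acceleration due to gravity (m s^-2)
gamma = 1.4     # ratio of specific heats for a diatomic gas

lapse = (gamma - 1.0) / gamma * mu * g / R   # degrees per metre
print(lapse * 1000.0)  # ~9.8 degrees centigrade per kilometer
```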
According to the adiabatic lapse rate calculated above, the air temperature at
the cruising altitude of airliners (32,000 feet) should be about −80° centigrade
(assuming a sea level temperature of 15° centigrade). In fact, this is somewhat of
an underestimate. A more realistic value is about −60° centigrade. The explanation
for this discrepancy is the presence of water vapour in the atmosphere. As
air rises, expands, and cools, water vapour condenses out releasing latent heat
which prevents the temperature from falling as rapidly with height as the adia
batic lapse rate would indicate. In fact, in the Tropics, where the humidity is very
high, the lapse rate of the atmosphere (i.e., the rate of decrease of temperature
with altitude) is signiﬁcantly less than the adiabatic value. The adiabatic lapse
rate is only observed when the humidity is low. This is the case in deserts, in the
Arctic (where water vapour is frozen out of the atmosphere), and, of course, in
ski resorts.
Suppose that the lapse rate of the atmosphere differs from the adiabatic value.
Let us ignore the complication of water vapour and assume that the atmosphere
is dry. Consider a packet of air which moves slightly upwards from its equilibrium
height. The temperature of the packet will decrease with altitude according to
the adiabatic lapse rate, because its expansion is adiabatic. We assume that the
packet always maintains pressure balance with its surroundings. It follows that
since ρ T ∝ p, according to the ideal gas law, then
(ρT)_packet = (ρT)_atmosphere. (6.74)
If the atmospheric lapse rate is less than the adiabatic value then T_atmosphere >
T_packet, implying that ρ_packet > ρ_atmosphere. So, the packet will be denser than its
immediate surroundings, and will, therefore, tend to fall back to its original height.
Clearly, an atmosphere whose lapse rate is less than the adiabatic value is sta
ble. On the other hand, if the atmospheric lapse rate exceeds the adiabatic value
then, after rising a little way, the packet will be less dense than its immediate sur
roundings, and will, therefore, continue to rise due to buoyancy effects. Clearly,
an atmosphere whose lapse rate is greater than the adiabatic value is unstable.
This effect is of great importance in Meteorology. The normal stable state of the
atmosphere is for the lapse rate to be slightly less than the adiabatic value. Oc
casionally, however, the lapse rate exceeds the adiabatic value, and this is always
associated with extremely disturbed weather patterns.
Let us consider the temperature, pressure, and density proﬁles in an adiabatic
atmosphere. We can directly integrate Eq. (6.73) to give
T = T_0 [1 − ((γ − 1)/γ)(z/z_0)], (6.75)
where T_0 is the ground level temperature, and

z_0 = RT_0/(µg) (6.76)
is the isothermal scale-height calculated using this temperature. The pressure
profile is easily calculated from the adiabatic gas law p^{1−γ} T^γ = constant, or
p ∝ T^{γ/(γ−1)}. It follows that

p = p_0 [1 − ((γ − 1)/γ)(z/z_0)]^{γ/(γ−1)}. (6.77)
Consider the limit γ → 1. In this limit, Eq. (6.75) yields T independent of height
(i.e., the atmosphere becomes isothermal). We can evaluate Eq. (6.77) in the
limit as γ → 1 using the mathematical identity

lim_{m→0} (1 + mx)^{1/m} ≡ exp(x). (6.78)

We obtain

p = p_0 exp(−z/z_0), (6.79)
which, not surprisingly, is the predicted pressure variation in an isothermal
atmosphere. In reality, the ratio of specific heats of the atmosphere is not unity;
it is about 1.4 (i.e., the ratio for diatomic gases), which implies that in the real
atmosphere

p = p_0 [1 − z/(3.5 z_0)]^{3.5}. (6.80)
In fact, this formula gives very similar results to the exponential formula, Eq. (6.79),
for heights below one scale-height (i.e., z < z_0). For heights above one scale-height,
the exponential formula tends to predict too high a pressure. So, in an adiabatic
atmosphere, the pressure falls off more quickly with altitude than in an isothermal
atmosphere, but this effect is only really noticeable at pressures significantly below
one atmosphere. In fact, the isothermal formula is a pretty good
approximation below altitudes of about 10 kilometers. Since ρ ∝ p/T, the variation
of density with height goes like
ρ = ρ_0 [1 − ((γ − 1)/γ)(z/z_0)]^{1/(γ−1)}, (6.81)

where ρ_0 is the density at ground level. Thus, the density falls off more rapidly
with altitude than the temperature, but less rapidly than the pressure.
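The three adiabatic profiles, and their comparison with the isothermal formula, can be tabulated numerically. A sketch (our Python; for γ = 1.4 the exponents are 3.5 for pressure, 2.5 for density, and 1 for temperature):

```python
import math

# Adiabatic profiles, Eqs. (6.75), (6.77), (6.81), with gamma = 1.4,
# compared against the isothermal pressure exp(-z/z0). Sketch only;
# heights are measured in units of the scale-height z0.

gamma = 1.4
a = (gamma - 1.0) / gamma   # = 1/3.5 for gamma = 1.4

for x in [0.5, 1.0, 2.0]:   # x = z / z0
    T_ratio = 1.0 - a * x                              # T/T0,   Eq. (6.75)
    p_adiabatic = T_ratio ** (gamma / (gamma - 1.0))   # p/p0,   Eq. (6.77)
    rho_ratio = T_ratio ** (1.0 / (gamma - 1.0))       # rho/rho0, Eq. (6.81)
    p_isothermal = math.exp(-x)
    print(x, round(T_ratio, 3), round(p_adiabatic, 3),
          round(rho_ratio, 3), round(p_isothermal, 3))
```

At one scale-height (x = 1), for instance, the ratios come out at about T/T_0 ≈ 0.71, p/p_0 ≈ 0.31, and ρ/ρ_0 ≈ 0.43, against 0.37 for the isothermal pressure; the density ratio indeed lies between the temperature and pressure ratios.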
Note that an adiabatic atmosphere has a sharp upper boundary. Above height
z_1 = [γ/(γ − 1)] z_0 the temperature, pressure, and density are all zero: i.e., there is
no atmosphere. For real air, with γ = 1.4, z_1 ≃ 3.5 z_0 ≃ 29.4 kilometers. This
behaviour is quite different to that of an isothermal atmosphere, which has a diffuse
upper boundary. In reality, there is no sharp upper boundary to the atmosphere.
The adiabatic gas law does not apply above about 20 kilometers (i.e., in the
stratosphere) because at these altitudes the air is no longer strongly mixed. Thus,
in the stratosphere the pressure falls off exponentially with increasing height.
In conclusion, we have demonstrated that the temperature of the lower atmosphere should fall off approximately linearly with increasing height above ground level, whilst the pressure should fall off far more rapidly than this, and the density should fall off at some intermediate rate. We have also shown that the lapse rate of the temperature should be about 10° centigrade per kilometer in dry air, but somewhat less than this in wet air. In fact, all of these predictions are, more or less, correct. It is amazing that such accurate predictions can be obtained from the two simple laws, pV = constant for an isothermal gas, and pV^γ = constant for an adiabatic gas.
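These profiles are easy to tabulate numerically. The following sketch (plain Python; the function names and the choice γ = 1.4 are our own, for illustration) evaluates the adiabatic pressure and density laws, Eqs. (6.77) and (6.81), alongside the isothermal pressure, Eq. (6.79):

```python
import math

GAMMA = 1.4  # ratio of specific heats for diatomic air (an assumed value)

def adiabatic_profiles(z, z0):
    """Return (T/T0, p/p0, rho/rho0) at height z in an adiabatic atmosphere.

    z0 is the isothermal scale-height.  Above z1 = [gamma/(gamma-1)] z0
    the atmosphere terminates, and all three ratios are zero.
    """
    x = 1.0 - (GAMMA - 1.0) / GAMMA * (z / z0)
    if x <= 0.0:  # beyond the sharp upper boundary
        return (0.0, 0.0, 0.0)
    T = x                                 # linear temperature lapse
    p = x ** (GAMMA / (GAMMA - 1.0))      # Eq. (6.77)
    rho = x ** (1.0 / (GAMMA - 1.0))      # Eq. (6.81)
    return (T, p, rho)

def isothermal_pressure(z, z0):
    """Eq. (6.79): p/p0 in an isothermal atmosphere."""
    return math.exp(-z / z0)
```

Evaluating at one scale-height confirms the ordering derived above: the pressure has fallen furthest, the temperature least, with the density in between.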
6.9 Heat engines
Thermodynamics was invented, almost by accident, in 1824 by a young French engineer called Sadi Carnot who was investigating the theoretical limitations on the efficiency of steam engines. Although we are not particularly interested in steam engines nowadays, it is still highly instructive to review some of Carnot's
arguments. We know, by observation, that it is possible to do mechanical work w
upon a device M, and then to extract an equivalent amount of heat q, which goes
to increase the internal energy of some heat reservoir. (Here, we use small letters
w and q to denote intrinsically positive amounts of work and heat, respectively.)
An example of this is Joule’s classic experiment by which he veriﬁed the ﬁrst law
of thermodynamics: a paddle wheel is spun in a liquid by a falling weight, and the
work done by the weight on the wheel is converted into heat, and absorbed by the
liquid. Carnot’s question was this: is it possible to reverse this process and build
a device, called a heat engine, which extracts heat energy from a reservoir and
converts it into useful macroscopic work? For instance, is it possible to extract
heat from the ocean and use it to run an electric generator?
There are a few caveats to Carnot’s question. First of all, the work should not
be done at the expense of the heat engine itself, otherwise the conversion of heat
into work could not continue indeﬁnitely. We can ensure that this is the case if the
heat engine performs some sort of cycle, by which it periodically returns to the
same macrostate, but, in the meantime, has extracted heat from the reservoir and
done an equivalent amount of useful work. Furthermore, a cyclic process seems
reasonable because we know that both steam engines and internal combustion
engines perform continuous cycles. The second caveat is that the work done by
the heat engine should be such as to change a single parameter of some external
device (e.g., by lifting a weight) without doing it at the expense of affecting the
other degrees of freedom, or the entropy, of that device. For instance, if we are
extracting heat from the ocean to generate electricity, we want to spin the shaft of
the electrical generator without increasing the generator’s entropy; i.e., causing
the generator to heat up or fall to bits.
Let us examine the feasibility of a heat engine using the laws of thermody
namics. Suppose that a heat engine M performs a single cycle. Since M has
returned to its initial macrostate, its internal energy is unchanged, and the first law of thermodynamics tells us that the work done by the engine w must equal the heat extracted from the reservoir q, so
w = q. (6.82)
The above condition is certainly a necessary condition for a feasible heat engine,
but is it also a sufficient condition? In other words, does every device which satisfies this condition actually work? Let us think a little more carefully about what
we are actually expecting a heat engine to do. We want to construct a device which will extract energy from a heat reservoir, where it is randomly distributed over very many degrees of freedom, and convert it into energy distributed over a single degree of freedom associated with some parameter of an external device. Once we have expressed the problem in these terms, it is fairly obvious that what we are really asking for is a spontaneous transition from a probable to an improbable state, which we know is forbidden by the second law of thermodynamics. So, unfortunately, we cannot run an electric generator off heat extracted from the ocean, because it is like asking all of the molecules in the ocean, which are jiggling about every which way, to all suddenly jig in the same direction, so as to exert a force on some lever, say, which can then be converted into a torque on the generator shaft. We know from our investigation of statistical thermodynamics that such a process is possible, in principle, but is fantastically improbable.
The improbability of the scenario just outlined is summed up in the second
law of thermodynamics. This says that the total entropy of an isolated system
can never spontaneously decrease, so
∆S ≥ 0. (6.83)
For the case of a heat engine, the isolated system consists of the engine, the
reservoir from which it extracts heat, and the outside device upon which it does
work. The engine itself returns periodically to the same state, so its entropy is
clearly unchanged after each cycle. We have already speciﬁed that there is no
change in the entropy of the external device upon which the work is done. On
the other hand, the entropy change per cycle of the heat reservoir, which is at absolute temperature T_1, say, is given by

ΔS_reservoir = ∮ ¯dQ/T_1 = −q/T_1, (6.84)
where ¯dQ is the infinitesimal heat absorbed by the reservoir, and the integral is taken over a whole cycle of the heat engine. The integral can be converted into the expression −q/T_1 because the amount of heat extracted by the engine is assumed to be too small to modify the temperature of the reservoir (this is
the definition of a heat reservoir), so that T_1 is a constant during the cycle. The second law of thermodynamics clearly reduces to

−q/T_1 ≥ 0, (6.85)

or, making use of the first law of thermodynamics,

q/T_1 = w/T_1 ≤ 0. (6.86)
Since we wish the work w done by the engine to be positive, the above relation clearly cannot be satisfied, which proves that an engine which converts heat directly into work is thermodynamically impossible.
A perpetual motion device, which continuously executes a cycle without extracting heat from, or doing work on, its surroundings, is just about possible according to Eq. (6.86). In fact, such a device corresponds to the equality sign in Eq. (6.83), which means that it must be completely reversible. In reality, there is no such thing as a completely reversible engine. All engines, even the most efficient, have frictional losses which make them, at least, slightly irreversible. Thus, the equality sign in Eq. (6.83) corresponds to an asymptotic limit which reality can closely approach, but never quite attain. It follows that a perpetual motion device is thermodynamically impossible. Nevertheless, the U.S. patent office receives about 100 patent applications a year regarding perpetual motion devices. The British patent office, being slightly less open-minded than its American counterpart, refuses to entertain such applications on the basis that perpetual motion devices are forbidden by the second law of thermodynamics.
According to Eq. (6.86), there is no thermodynamic objection to a heat engine
which runs backwards, and converts work directly into heat. This is not surpris
ing, since we know that this is essentially what frictional forces do. Clearly, we
have, here, another example of a natural process which is fundamentally irre
versible according to the second law of thermodynamics. In fact, the statement
It is impossible to construct a perfect heat engine which converts heat directly into work

is called Kelvin's formulation of the second law.
We have demonstrated that a perfect heat engine, which converts heat directly
into work, is impossible. But, there must be some way of obtaining useful work
from heat energy, otherwise steam engines would not operate. Well, the reason
that our previous scheme did not work was that it decreased the entropy of a heat reservoir, at some temperature T_1, by extracting an amount of heat q per cycle, without any compensating increase in the entropy of anything else, so the second law of thermodynamics was violated. How can we remedy this situation? We still want the heat engine itself to perform periodic cycles (so, by definition, its entropy cannot increase over a cycle), and we also do not want to increase the entropy of the external device upon which the work is done. Our only other option is to increase the entropy of some other body. In Carnot's analysis, this other body is a second heat reservoir at temperature T_2. We can increase the entropy of the second reservoir by dumping some of the heat we extracted from the first reservoir into it. Suppose that the heat per cycle we extract from the first reservoir is q_1, and the heat per cycle we reject into the second reservoir is q_2. Let the work done on the external device be w per cycle. The first law of thermodynamics tells us that

q_1 = w + q_2. (6.87)
Note that q_2 < q_1 if positive (i.e., useful) work is done on the external device. The total entropy change per cycle is due to the heat extracted from the first reservoir and the heat dumped into the second, and has to be positive (or zero) according to the second law of thermodynamics. So,

ΔS = −q_1/T_1 + q_2/T_2 ≥ 0. (6.88)
We can combine the previous two equations to give

−q_1/T_1 + (q_1 − w)/T_2 ≥ 0, (6.89)

or

w/T_2 ≤ q_1 (1/T_2 − 1/T_1). (6.90)
It is clear that the engine is only going to perform useful work (i.e., w is only going to be positive) if T_2 < T_1. So, the second reservoir has to be colder than the first if the heat dumped into the former is to increase the entropy of the Universe more than the heat extracted from the latter decreases it. It is useful to define the efficiency η of a heat engine. This is the ratio of the work done per cycle on the external device to the heat energy absorbed per cycle from the first reservoir.
The efﬁciency of a perfect heat engine is unity, but we have already shown that
such an engine is impossible. What is the efficiency of a realizable engine? It is clear from the previous equation that

η ≡ w/q_1 ≤ 1 − T_2/T_1 = (T_1 − T_2)/T_1. (6.91)
Note that the efficiency is always less than unity. A real engine must always reject some energy into the second heat reservoir in order to satisfy the second law of thermodynamics, so less energy is available to do external work, and the efficiency of the engine is reduced. The equality sign in the above expression corresponds to a completely reversible heat engine (i.e., one which is quasi-static). It is clear that real engines, which are always irreversible to some extent, are less efficient than reversible engines. Furthermore, all reversible engines which operate between the two temperatures T_1 and T_2 must have the same efficiency,

η = (T_1 − T_2)/T_1, (6.92)
irrespective of the way in which they operate.
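The chain of deductions in Eqs. (6.87)–(6.92) amounts to simple bookkeeping, and can be sketched in a few lines of Python (the function names, and the joule/kelvin units, are our own choices, not the text's):

```python
def max_work_per_cycle(q1, T1, T2):
    """Eq. (6.90) rearranged: the largest w compatible with DS >= 0 is
    w_max = q1 (1 - T2/T1), attained by a reversible engine."""
    if not (T1 > T2 > 0.0):
        raise ValueError("need T1 > T2 > 0 for useful work")
    return q1 * (1.0 - T2 / T1)

def total_entropy_change(q1, w, T1, T2):
    """Eq. (6.88), with q2 = q1 - w supplied by the first law, Eq. (6.87)."""
    q2 = q1 - w
    return -q1 / T1 + q2 / T2

def carnot_efficiency(T1, T2):
    """Eq. (6.92): the common efficiency of all reversible engines."""
    return (T1 - T2) / T1
```

Running the engine reversibly (w equal to the maximum) makes the total entropy change vanish; demanding any more work per cycle would drive ΔS negative, which the second law forbids.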
Let us consider how we might construct one of these reversible heat engines.
Suppose that we have some gas in a cylinder equipped with a frictionless piston.
The gas is not necessarily a perfect gas. Suppose that we also have two heat reservoirs at temperatures T_1 and T_2 (where T_1 > T_2). These reservoirs might
take the form of large water baths. Let us start off with the gas in thermal contact
with the ﬁrst reservoir. We now pull the piston out very slowly so that heat energy
ﬂows reversibly into the gas from the reservoir. Let us now thermally isolate the
gas and slowly pull out the piston some more. During this adiabatic process the
temperature of the gas falls (since there is no longer any heat ﬂowing into it to
compensate for the work it does on the piston). Let us continue this process until
the temperature of the gas falls to T_2. We now place the gas in thermal contact
with the second reservoir and slowly push the piston in. During this isothermal
Figure 1: An ideal gas Carnot engine.
process heat ﬂows out of the gas into the reservoir. We next thermally isolate
the gas a second time and slowly compress it some more. In this process the
temperature of the gas increases. We stop the compression when the temperature
reaches T_1. If we carry out each step properly we can return the gas to its initial
state and then repeat the cycle ad inﬁnitum. We now have a set of reversible
processes by which a quantity of heat is extracted from the ﬁrst reservoir and
a quantity of heat is dumped into the second. We can best evaluate the work
done by the system during each cycle by plotting out the locus of the gas in a
pV diagram. The locus takes the form of a closed curve—see Fig. 1. The net
work done per cycle is the “area” contained inside this curve, since ¯ dW = pdV
[if p is plotted vertically and V horizontally, then pdV is clearly an element of
area under the curve p(V)]. The engine we have just described is called a Carnot
engine, and is the simplest conceivable device capable of converting heat energy
into useful work.
For the specific case of an ideal gas, we can actually calculate the work done per cycle, and, thereby, verify Eq. (6.92). Consider the isothermal expansion phase of the gas. For an ideal gas, the internal energy is a function of the temperature alone. The temperature does not change during isothermal expansion, so the internal energy remains constant, and the net heat absorbed by the gas must
equal the work it does on the piston. Thus,

q_1 = ∫_a^b p dV, (6.93)
where the expansion takes the gas from state a to state b. Since pV = νRT, for
an ideal gas, we have
q_1 = ∫_a^b νRT_1 dV/V = νRT_1 ln(V_b/V_a). (6.94)
Likewise, during the isothermal compression phase, in which the gas goes from
state c to state d, the net heat rejected to the second reservoir is
q_2 = νRT_2 ln(V_c/V_d). (6.95)
Now, during adiabatic expansion or compression
T V^{γ−1} = constant. (6.96)
It follows that during the adiabatic expansion phase, which takes the gas from
state b to state c,
T_1 V_b^{γ−1} = T_2 V_c^{γ−1}. (6.97)
Likewise, during the adiabatic compression phase, which takes the gas from state
d to state a,
T_1 V_a^{γ−1} = T_2 V_d^{γ−1}. (6.98)
If we take the ratio of the previous two equations we obtain
V_b/V_a = V_c/V_d. (6.99)
Hence, the work done by the engine, which we can calculate using the ﬁrst law
of thermodynamics,
w = q_1 − q_2, (6.100)

is

w = νR (T_1 − T_2) ln(V_b/V_a). (6.101)
Thus, the efﬁciency of the engine is
η = w/q_1 = (T_1 − T_2)/T_1, (6.102)

which, not surprisingly, is exactly the same as Eq. (6.92).
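The verification in Eqs. (6.93)–(6.102) can also be done numerically. The sketch below (our own construction; νR is passed as a single coefficient, and the monatomic value γ = 5/3 is just an example) builds the unknown corner volumes from the adiabatic law and returns the heats, the work, and the efficiency:

```python
import math

def carnot_cycle(nuR, T1, T2, Va, Vb, gamma):
    """Work through one ideal-gas Carnot cycle a -> b -> c -> d -> a.

    Isothermal strokes obey Eqs. (6.94)-(6.95); the adiabatic strokes fix
    Vc and Vd through T V^(gamma-1) = constant, Eqs. (6.97)-(6.98).
    Returns (q1, q2, w, eta).
    """
    r = (T1 / T2) ** (1.0 / (gamma - 1.0))   # volume ratio across an adiabat
    Vc, Vd = Vb * r, Va * r                  # hence Vb/Va = Vc/Vd, Eq. (6.99)
    q1 = nuR * T1 * math.log(Vb / Va)        # heat absorbed, Eq. (6.94)
    q2 = nuR * T2 * math.log(Vc / Vd)        # heat rejected, Eq. (6.95)
    w = q1 - q2                              # first law, Eq. (6.100)
    return q1, q2, w, w / q1
```

Whatever values of νR, V_a, V_b, and γ we pick, the efficiency comes out as (T_1 − T_2)/T_1, as Eq. (6.102) requires.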
The engine described above is very idealized. Of course, real engines are far
more complicated than this. Nevertheless, the maximum efﬁciency of an ideal
heat engine places severe constraints on real engines. Conventional power stations have many different "front ends" (e.g., coal-fired furnaces, oil-fired furnaces, nuclear reactors), but their "back ends" are all very similar, and consist of a steam-driven turbine connected to an electric generator. The "front end" heats water extracted from a local river and turns it into steam, which is then used to drive the turbine, and, hence, to generate electricity. Finally, the steam is sent through a heat exchanger so that it can heat up the incoming river water, which means that the incoming water does not have to be heated so much by the "front end." At this stage, some heat is rejected to the environment, usually as clouds of steam escaping from the top of cooling towers. We can see that a power station possesses many of the same features as our idealized heat engine. There is a cycle which operates between two temperatures. The upper temperature is the temperature to which the steam is heated by the "front end," and the lower temperature is the temperature of the environment into which heat is rejected. Suppose that the steam is only heated to 100°C (or 373°K), and the temperature of the environment is 15°C (or 288°K). It follows from Eq. (6.91) that the maximum possible efficiency of the steam cycle is
η = (373 − 288)/373 ≃ 0.23. (6.103)

So, at least 77% of the heat energy generated by the "front end" goes straight up the cooling towers! Not surprisingly, commercial power stations do not operate with 100°C steam. The only way in which the thermodynamic efficiency of the steam cycle can be raised to an acceptable level is to use very hot steam (clearly, we cannot refrigerate the environment). Using 400°C steam, which is not uncommon, the maximum efficiency becomes

η = (673 − 288)/673 ≃ 0.57, (6.104)
which is more reasonable. In fact, the steam cycles of modern power stations are so well designed that they come surprisingly close to their maximum thermodynamic efficiencies.
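The arithmetic of Eqs. (6.103) and (6.104) is easily reproduced (a small sketch; the 273 offset matches the rounding used in the text):

```python
def max_steam_efficiency(steam_C, env_C):
    """Eq. (6.91) applied to a power-station steam cycle: inputs in degrees
    centigrade, converted to absolute temperatures with the text's 273 offset."""
    T1, T2 = steam_C + 273.0, env_C + 273.0
    return (T1 - T2) / T1
```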
6.10 Refrigerators
Let us now move on to consider refrigerators. An idealized refrigerator is an engine which extracts heat from a cold heat reservoir (temperature T_2, say) and rejects it to a somewhat hotter heat reservoir, which is usually the environment (temperature T_1, say). To make this machine work we always have to do some external work on the engine. For instance, the refrigerator in your home contains a small electric pump which does work on the freon (or whatever) in the cooling circuit. We can see that, in fact, a refrigerator is just a heat engine run in reverse. Hence, we can immediately carry over most of our heat engine analysis. Let q_2 be the heat absorbed per cycle from the colder reservoir, q_1 the heat rejected per cycle into the hotter reservoir, and w the external work done per cycle on the engine. The first law of thermodynamics tells us that

w + q_2 = q_1. (6.105)
The second law says that

q_1/T_1 − q_2/T_2 ≥ 0. (6.106)
We can combine these two laws to give

w/T_1 ≥ q_2 (1/T_2 − 1/T_1). (6.107)
The most sensible way of defining the efficiency of a refrigerator is as the ratio of the heat extracted per cycle from the cold reservoir to the work done per cycle on the engine. With this definition

η = T_2/(T_1 − T_2). (6.108)
We can see that this efﬁciency is, in general, greater than unity. In other words,
for one joule of work done on the engine, or pump, more than one joule of
energy is extracted from whatever it is we are cooling. Clearly, refrigerators are intrinsically very efficient devices. Domestic refrigerators cool stuff down to about 4°C (277°K) and reject heat to the environment at about 15°C (288°K). The maximum theoretical efficiency of such devices is

η = 277/(288 − 277) = 25.2. (6.109)
So, for every joule of electricity we put into a refrigerator we can extract up to 25
joules of heat from its contents.
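Eq. (6.108) and the worked example (6.109) can be sketched the same way (the function name is ours):

```python
def refrigerator_cop(T1, T2):
    """Eq. (6.108): maximum heat extracted from the cold reservoir (T2)
    per unit of work done, rejecting to the hotter reservoir (T1 > T2)."""
    return T2 / (T1 - T2)
```

Note how the efficiency diverges as T_1 → T_2: moving heat across a vanishing temperature difference costs almost no work.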
7 Applications of statistical thermodynamics
7.1 Introduction
In our study of classical thermodynamics, we concentrated on the application of
statistical physics to macroscopic systems. Somewhat paradoxically, statistical arguments did not figure very prominently in this investigation. In fact, the only statistical statement we made was that it was extremely unlikely that a macroscopic system could violate the second law of thermodynamics. The resolution of
this paradox is, of course, that macroscopic systems contain a very large number
of particles, and their statistical ﬂuctuations are, therefore, negligible. Let us now
apply statistical physics to microscopic systems, such as atoms and molecules. In
this study, the underlying statistical nature of thermodynamics will become far
more apparent.
7.2 Boltzmann distributions
We have gained some understanding of the macroscopic properties of the air around us. For instance, we know something about its internal energy and specific heat capacity. How can we obtain some information about the statistical properties of the molecules which make up air? Consider a specific molecule: it constantly collides with its immediate neighbour molecules, and occasionally bounces off the walls of the room. These interactions "inform" it about the macroscopic state of the air, such as its temperature, pressure, and volume. The statistical distribution of the molecule over its own particular microstates must be consistent with this macrostate. In other words, if we have a large group of such molecules with similar statistical distributions, then they must be equivalent to air with the appropriate macroscopic properties. So, it ought to be possible to calculate the probability distribution of the molecule over its microstates from a knowledge of these macroscopic properties.
We can think of the interaction of a molecule with the air in a classroom as analogous to the interaction of a small system A in thermal contact with a heat reservoir A′. The air acts like a heat reservoir because its energy fluctuations due to any interactions with the molecule are far too small to affect any of its macroscopic parameters. Let us determine the probability P_r of finding system A in one particular microstate r of energy E_r when it is in thermal equilibrium with the heat reservoir A′.
As usual, we assume fairly weak interaction between A and A′, so that the energies of these two systems are additive. The energy of A is not known at this stage. In fact, only the total energy of the combined system A^(0) = A + A′ is known. Suppose that the total energy lies in the range E^(0) to E^(0) + δE. The overall energy is constant in time, since A^(0) is assumed to be an isolated system, so

E_r + E′ = E^(0), (7.1)
where E′ denotes the energy of the reservoir A′. Let Ω′(E′) be the number of microstates accessible to the reservoir when its energy lies in the range E′ to E′ + δE. Clearly, if system A has an energy E_r then the reservoir A′ must have an energy close to E′ = E^(0) − E_r. Hence, since A is in one definite state (i.e., state r), and the total number of states accessible to A′ is Ω′(E^(0) − E_r), it follows that the total number of states accessible to the combined system is simply Ω′(E^(0) − E_r). The principle of equal a priori probabilities tells us that the probability of occurrence of a particular situation is proportional to the number of accessible microstates. Thus,

P_r = C′ Ω′(E^(0) − E_r), (7.2)
where C′ is a constant of proportionality which is independent of r. This constant can be determined by the normalization condition

Σ_r P_r = 1, (7.3)

where the sum is over all possible states of system A, irrespective of their energy.
Let us now make use of the fact that system A is far smaller than system A′. It follows that E_r ≪ E^(0), so the slowly varying logarithm of P_r can be Taylor expanded about E′ = E^(0). Thus,

ln P_r = ln C′ + ln Ω′(E^(0)) − [∂ln Ω′/∂E′]_0 E_r + ⋯ . (7.4)
Note that we must expand ln P_r, rather than P_r itself, because the latter function varies so rapidly with energy that the radius of convergence of its Taylor series is far too small for the series to be of any practical use. The higher order terms in Eq. (7.4) can be safely neglected, because E_r ≪ E^(0). Now the derivative

[∂ln Ω′/∂E′]_0 ≡ β (7.5)
is evaluated at the fixed energy E′ = E^(0), and is, thus, a constant independent of the energy E_r of A. In fact, we know, from Sect. 5, that this derivative is just the temperature parameter β = (kT)^{−1} characterizing the heat reservoir A′. Hence, Eq. (7.4) becomes

ln P_r = ln C′ + ln Ω′(E^(0)) − βE_r, (7.6)
giving

P_r = C exp(−βE_r), (7.7)

where C is a constant independent of r. The parameter C is determined by the normalization condition, which gives

C^{−1} = Σ_r exp(−βE_r), (7.8)

so that the distribution becomes

P_r = exp(−βE_r) / Σ_r exp(−βE_r). (7.9)
This is known as the Boltzmann probability distribution, and is undoubtedly the most famous result in statistical physics.
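Eq. (7.9) translates directly into code. The sketch below (names ours; energies and kT in the same arbitrary units) normalizes the Boltzmann factors over a finite list of microstate energies:

```python
import math

def boltzmann_probabilities(energies, kT):
    """Eq. (7.9): probability of each microstate r of a small system in
    equilibrium with a heat bath, with beta = 1/(kT)."""
    beta = 1.0 / kT
    factors = [math.exp(-beta * E) for E in energies]  # Boltzmann factors
    Z = sum(factors)                                   # normalization, Eq. (7.8)
    return [f / Z for f in factors]
```

Note that the ratio of two probabilities depends only on the energy difference: P_r/P_s = exp[−β(E_r − E_s)], with the normalization sum cancelling out.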
The Boltzmann distribution often causes confusion. People who are used to the principle of equal a priori probabilities, which says that all microstates are equally probable, are understandably surprised when they come across the Boltzmann distribution, which says that high energy microstates are markedly less probable than low energy states. However, there is no need for any confusion. The principle of equal a priori probabilities applies to the whole system, whereas the Boltzmann distribution only applies to a small part of the system. The two results are perfectly consistent. If the small system is in a microstate with a comparatively high energy E_r then the rest of the system (i.e., the reservoir) has a slightly lower energy E′ than usual (since the overall energy is fixed). The number of accessible microstates of the reservoir is a very strongly increasing function of its energy. It follows that when the small system has a high energy then significantly fewer states than usual are accessible to the reservoir, so the number of microstates accessible to the overall system is reduced, and, hence, the configuration is comparatively unlikely. The strong increase in the number of accessible microstates of the reservoir with increasing E′ gives rise to the strong (i.e., exponential) decrease in the likelihood of a state r of the small system with increasing E_r. The exponential factor exp(−βE_r) is called the Boltzmann factor.
The Boltzmann distribution gives the probability of finding the small system A in one particular state r of energy E_r. The probability P(E) that A has an energy in the small range between E and E + δE is just the sum of all the probabilities of the states which lie in this range. However, since each of these states has approximately the same Boltzmann factor this sum can be written

P(E) = C Ω(E) exp(−βE), (7.10)
where Ω(E) is the number of microstates of A whose energies lie in the appropriate range. Suppose that system A is itself a large system, but still very much smaller than system A′. For a large system, we expect Ω(E) to be a very rapidly
increasing function of energy, so the probability P(E) is the product of a rapidly increasing function of E and another rapidly decreasing function (i.e., the Boltzmann factor). This gives a sharp maximum of P(E) at some particular value of the energy. The larger the system A, the sharper this maximum becomes. Eventually, the maximum becomes so sharp that the energy of system A is almost bound to lie at the most probable energy. As usual, the most probable energy is evaluated by looking for the maximum of ln P, so

∂ln P/∂E = ∂ln Ω/∂E − β = 0, (7.11)

giving

∂ln Ω/∂E = β. (7.12)
Of course, this corresponds to the situation in which the temperature of A is
the same as that of the reservoir. This is a result which we have seen before
(see Sect. 5). Note, however, that the Boltzmann distribution is applicable no
matter how small system A is, so it is a far more general result than any we have
previously obtained.
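The sharpening of the maximum of P(E) can be made concrete with the illustrative (and assumed, not taken from the text) density of states Ω(E) ∝ E^f, where f plays the role of the number of degrees of freedom. Maximizing ln P = f ln E − βE as in Eq. (7.11) gives E* = f/β, and expanding ln P about E* gives a fractional width of order 1/√f:

```python
import math

def most_probable_energy(f, beta):
    """Solve Eq. (7.12) for the model Omega(E) = E**f:
    d(f ln E)/dE = f/E = beta, so E* = f/beta."""
    return f / beta

def fractional_width(f):
    """d^2(ln P)/dE^2 at E* is -f/E*^2, so Delta E / E* = 1/sqrt(f):
    the peak sharpens as the system grows."""
    return 1.0 / math.sqrt(f)
```

For f ~ 10^24, the fractional width is ~10^−12: the energy of a macroscopic system in contact with a heat bath is, for all practical purposes, fixed at the most probable value.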
7.3 Paramagnetism
The simplest microscopic system which we can analyze using the Boltzmann distribution is one which has only two possible states (there would clearly be little point in analyzing a system with only one possible state). Most elements, and some compounds, are paramagnetic: i.e., their constituent atoms, or molecules, possess a permanent magnetic moment due to the presence of one or more unpaired electrons. Consider a substance whose constituent particles contain only one unpaired electron. Such particles have spin 1/2, and consequently possess an intrinsic magnetic moment µ. According to quantum mechanics, the magnetic moment of a spin 1/2 particle can point either parallel or antiparallel to an external magnetic field B. Let us determine the mean magnetic moment µ_B (in the direction of B) of the constituent particles of the substance when its absolute temperature is T. We assume, for the sake of simplicity, that each atom (or molecule) only interacts weakly with its neighbouring atoms. This enables us to focus attention on a single atom, and treat the remaining atoms as a heat bath at temperature T.
Our atom can be in one of two possible states: the (+) state in which its spin points up (i.e., parallel to B), and the (−) state in which its spin points down (i.e., antiparallel to B). In the (+) state, the atomic magnetic moment is parallel to the magnetic field, so that µ_B = µ. The magnetic energy of the atom is ε_+ = −µB. In the (−) state, the atomic magnetic moment is antiparallel to the magnetic field, so that µ_B = −µ. The magnetic energy of the atom is ε_− = µB.
According to the Boltzmann distribution, the probability of finding the atom in the (+) state is

P_+ = C exp(−β ε_+) = C exp(βµB), (7.13)

where C is a constant, and β = (kT)^{−1}. Likewise, the probability of finding the atom in the (−) state is

P_− = C exp(−β ε_−) = C exp(−βµB). (7.14)
Clearly, the most probable state is the state with the lowest energy [i.e., the (+)
state]. Thus, the mean magnetic moment points in the direction of the magnetic
ﬁeld (i.e., the atom is more likely to point parallel to the ﬁeld than antiparallel).
It is clear that the critical parameter in a paramagnetic system is

y ≡ βµB = µB/(kT). (7.15)
This parameter measures the ratio of the typical magnetic energy of the atom to its typical thermal energy. If the thermal energy greatly exceeds the magnetic energy then y ≪ 1, and the probability that the atomic moment points parallel to the magnetic field is about the same as the probability that it points antiparallel. In this situation, we expect the mean atomic moment to be small, so that µ_B ≃ 0. On the other hand, if the magnetic energy greatly exceeds the thermal energy then y ≫ 1, and the atomic moment is far more likely to point parallel to the magnetic field than antiparallel. In this situation, we expect µ_B ≃ µ.
Let us calculate the mean atomic moment µ_B. The usual definition of a mean value gives

µ_B = [P_+ µ + P_− (−µ)] / (P_+ + P_−) = µ [exp(βµB) − exp(−βµB)] / [exp(βµB) + exp(−βµB)]. (7.16)
This can also be written

µ_B = µ tanh(µB/kT), (7.17)

where the hyperbolic tangent is defined

tanh y ≡ [exp(y) − exp(−y)] / [exp(y) + exp(−y)]. (7.18)
For small arguments, y ≪ 1,

tanh y ≃ y − y³/3 + ⋯ , (7.19)

whereas for large arguments, y ≫ 1,

tanh y ≃ 1. (7.20)
It follows that at comparatively high temperatures, kT ≫ µB,

µ_B ≃ µ²B/(kT), (7.21)

whereas at comparatively low temperatures, kT ≪ µB,

µ_B ≃ µ. (7.22)
Suppose that the substance contains N_0 atoms (or molecules) per unit volume. The magnetization is defined as the mean magnetic moment per unit volume, and is given by

M_0 = N_0 µ_B. (7.23)

At high temperatures, kT ≫ µB, the mean magnetic moment, and, hence, the magnetization, is proportional to the applied magnetic field, so we can write

M_0 ≃ χB, (7.24)
where χ is a constant of proportionality known as the magnetic susceptibility. It is
clear that the magnetic susceptibility of a spin 1/2 paramagnetic substance takes
the form
χ =
N
0
µ
2
k T
. (7.25)
The fact that \chi \propto T^{-1} is known as Curie's law, because it was discovered
experimentally by Pierre Curie at the end of the nineteenth century. At low
temperatures, k T ≪ µB,

M_0 \rightarrow N_0\, \mu,   (7.26)
so the magnetization becomes independent of the applied field. This corresponds
to the maximum possible magnetization, where all atomic moments are lined up
parallel to the field. The breakdown of the M_0 \propto B law at low temperatures (or
high magnetic fields) is known as saturation.
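Both regimes can be verified from Eq. (7.17) directly. The following sketch uses arbitrary illustrative values of N_0, µ, and kT (with k = 1), which are assumptions rather than anything from the notes:

```python
import math

# Sketch of Curie's law: at kT >> mu*B the magnetization
# M0 = N0*mu*tanh(mu*B/(kT)) is linear in B with slope chi = N0*mu^2/(kT),
# Eq. (7.25), and it saturates at N0*mu for kT << mu*B.
N0, mu, kT = 1.0e3, 1.0, 50.0   # arbitrary illustrative values, k = 1

def magnetization(B):
    return N0 * mu * math.tanh(mu * B / kT)

chi_numeric = magnetization(1.0e-3) / 1.0e-3   # slope at small B
chi_curie = N0 * mu**2 / kT
print(chi_numeric, chi_curie)    # agree closely
print(magnetization(1.0e4))      # saturated: ~ N0*mu = 1000
```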
The above analysis is only valid for paramagnetic substances made up of spin
one-half (J = 1/2) atoms or molecules. However, the analysis can easily be
Figure 2: The magnetization (vertical axis) versus B/T (horizontal axis) curves for (I) chromium
potassium alum (J = 3/2), (II) iron ammonium alum (J = 5/2), and (III) gadolinium sulphate
(J = 7/2). The solid lines are the theoretical predictions whereas the data points are experimental
measurements. From W.E. Henry, Phys. Rev. 88, 561 (1952).
generalized to take account of substances whose constituent particles possess
higher spin (i.e., J > 1/2). Figure 2 compares the experimental and theoretical
magnetization versus field-strength curves for three different substances made
up of spin 3/2, spin 5/2, and spin 7/2 particles, showing the excellent agreement
between the two sets of curves. Note that, in all cases, the magnetization is
proportional to the magnetic field-strength at small field-strengths, but saturates
at some constant value as the field-strength increases.
The previous analysis completely neglects any interaction between the spins
of neighbouring atoms or molecules. It turns out that this is a fairly good
approximation for paramagnetic substances. However, for ferromagnetic substances, in
which the spins of neighbouring atoms interact very strongly, this approximation
breaks down completely. Thus, the above analysis does not apply to ferromagnetic
substances.
7.4 Mean values
Consider a system in contact with a heat reservoir. The systems in the
representative ensemble are distributed over their accessible states in accordance with the
Boltzmann distribution. Thus, the probability of occurrence of some state r with
energy E_r is given by

P_r = \frac{\exp(-\beta E_r)}{\sum_r \exp(-\beta E_r)}.   (7.27)
The mean energy is written

\overline{E} = \frac{\sum_r \exp(-\beta E_r)\, E_r}{\sum_r \exp(-\beta E_r)},   (7.28)

where the sum is taken over all states of the system, irrespective of their energy.
Note that

\sum_r \exp(-\beta E_r)\, E_r = -\sum_r \frac{\partial}{\partial\beta} \exp(-\beta E_r) = -\frac{\partial Z}{\partial\beta},   (7.29)

where

Z = \sum_r \exp(-\beta E_r).   (7.30)
It follows that

\overline{E} = -\frac{1}{Z}\frac{\partial Z}{\partial\beta} = -\frac{\partial \ln Z}{\partial\beta}.   (7.31)
The quantity Z, which is deﬁned as the sum of the Boltzmann factor over all
states, irrespective of their energy, is called the partition function. We have just
demonstrated that it is fairly easy to work out the mean energy of a system using
its partition function. In fact, as we shall discover, it is easy to calculate virtually
any piece of statistical information using the partition function.
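The identity (7.31) can be illustrated with a toy spectrum. The sketch below is not from the notes; the level energies and β are arbitrary:

```python
import math

# Toy check of Eq. (7.31): for a made-up four-level spectrum, the direct
# Boltzmann average of E agrees with -d(ln Z)/d(beta) computed by a
# small central finite difference.  The level energies are arbitrary.
levels = [0.0, 1.0, 1.5, 3.0]

def Z(beta):
    return sum(math.exp(-beta * E) for E in levels)

def mean_E_direct(beta):
    return sum(E * math.exp(-beta * E) for E in levels) / Z(beta)

def mean_E_from_lnZ(beta, h=1e-6):
    return -(math.log(Z(beta + h)) - math.log(Z(beta - h))) / (2 * h)

print(mean_E_direct(0.7), mean_E_from_lnZ(0.7))  # agree
```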
Let us evaluate the variance of the energy. We know that

\overline{(\Delta E)^2} = \overline{E^2} - \overline{E}^{\,2}   (7.32)

(see Sect. 2). Now, according to the Boltzmann distribution,

\overline{E^2} = \frac{\sum_r \exp(-\beta E_r)\, E_r^{\,2}}{\sum_r \exp(-\beta E_r)}.   (7.33)
However,

\sum_r \exp(-\beta E_r)\, E_r^{\,2} = -\frac{\partial}{\partial\beta}\left[\sum_r \exp(-\beta E_r)\, E_r\right] = \left(-\frac{\partial}{\partial\beta}\right)^{\!2} \left[\sum_r \exp(-\beta E_r)\right].   (7.34)
Hence,

\overline{E^2} = \frac{1}{Z}\frac{\partial^2 Z}{\partial\beta^2}.   (7.35)
We can also write

\overline{E^2} = \frac{\partial}{\partial\beta}\!\left(\frac{1}{Z}\frac{\partial Z}{\partial\beta}\right) + \frac{1}{Z^2}\left(\frac{\partial Z}{\partial\beta}\right)^{\!2} = -\frac{\partial\overline{E}}{\partial\beta} + \overline{E}^{\,2},   (7.36)

where use has been made of Eq. (7.31). It follows from Eq. (7.32) that

\overline{(\Delta E)^2} = -\frac{\partial\overline{E}}{\partial\beta} = \frac{\partial^2 \ln Z}{\partial\beta^2}.   (7.37)
Thus, the variance of the energy can be worked out from the partition function
almost as easily as the mean energy. Since, by definition, a variance can never be
negative, it follows that \partial\overline{E}/\partial\beta \leq 0, or, equivalently, \partial\overline{E}/\partial T \geq 0. Hence, the mean
energy of a system governed by the Boltzmann distribution always increases with
temperature.
Suppose that the system is characterized by a single external parameter x (such
as its volume). The generalization to the case where there are several external
parameters is obvious. Consider a quasi-static change of the external parameter
from x to x + dx. In this process, the energy of the system in state r changes by

\delta E_r = \frac{\partial E_r}{\partial x}\, dx.   (7.38)
The macroscopic work \bar{d}W done by the system due to this parameter change is

\bar{d}W = \frac{\sum_r \exp(-\beta E_r)\,(-\partial E_r/\partial x\; dx)}{\sum_r \exp(-\beta E_r)}.   (7.39)

In other words, the work done is minus the average change in internal energy of
the system, where the average is calculated using the Boltzmann distribution. We
can write

\sum_r \exp(-\beta E_r)\, \frac{\partial E_r}{\partial x} = -\frac{1}{\beta}\frac{\partial}{\partial x}\left[\sum_r \exp(-\beta E_r)\right] = -\frac{1}{\beta}\frac{\partial Z}{\partial x},   (7.40)

which gives

\bar{d}W = \frac{1}{\beta Z}\frac{\partial Z}{\partial x}\, dx = \frac{1}{\beta}\frac{\partial \ln Z}{\partial x}\, dx.   (7.41)
We also have the following general expression for the work done by the system:

\bar{d}W = \overline{X}\, dx,   (7.42)

where

\overline{X} = -\overline{\frac{\partial E_r}{\partial x}}   (7.43)

is the mean generalized force conjugate to x (see Sect. 4). It follows that

\overline{X} = \frac{1}{\beta}\frac{\partial \ln Z}{\partial x}.   (7.44)
Suppose that the external parameter is the volume, so x = V. It follows that

\bar{d}W = \overline{p}\, dV = \frac{1}{\beta}\frac{\partial \ln Z}{\partial V}\, dV   (7.45)

and

\overline{p} = \frac{1}{\beta}\frac{\partial \ln Z}{\partial V}.   (7.46)

Since the partition function is a function of β and V (the energies E_r depend
on V), it is clear that the above equation relates the mean pressure \overline{p} to T (via
β = 1/k T) and V. In other words, the above expression is the equation of state.
Hence, we can work out the pressure, and even the equation of state, using the
partition function.
7.5 Partition functions
It is clear that all important macroscopic quantities associated with a system can
be expressed in terms of its partition function Z. Let us investigate how the
partition function is related to thermodynamical quantities. Recall that Z is a
function of both β and x (where x is the single external parameter). Hence,
Z = Z(β, x), and we can write

d\ln Z = \frac{\partial \ln Z}{\partial x}\, dx + \frac{\partial \ln Z}{\partial\beta}\, d\beta.   (7.47)
Consider a quasi-static change by which x and β change so slowly that the system
stays close to equilibrium, and, thus, remains distributed according to the
Boltzmann distribution. It follows from Eqs. (7.31) and (7.41) that

d\ln Z = \beta\, \bar{d}W - \overline{E}\, d\beta.   (7.48)

The last term can be rewritten

d\ln Z = \beta\, \bar{d}W - d(\overline{E}\beta) + \beta\, d\overline{E},   (7.49)

giving

d(\ln Z + \beta\overline{E}) = \beta\,(\bar{d}W + d\overline{E}) \equiv \beta\, \bar{d}Q.   (7.50)

The above equation shows that although the heat absorbed by the system \bar{d}Q
is not an exact differential, it becomes one when multiplied by the temperature
parameter β. This is essentially the second law of thermodynamics. In fact, we
know that

dS = \frac{\bar{d}Q}{T}.   (7.51)

Hence,

S \equiv k\,(\ln Z + \beta\overline{E}).   (7.52)

This expression enables us to calculate the entropy of a system from its partition
function.
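Equation (7.52) is equivalent to the familiar Gibbs entropy −k Σ p_r ln p_r, which is easy to verify on a toy spectrum. The sketch below (not from the notes) sets k = 1 and uses arbitrary level energies:

```python
import math

# Check of Eq. (7.52) against the Gibbs entropy -sum p_r ln p_r for a
# toy three-level system (k = 1 here; levels and beta are arbitrary).
levels = [0.0, 0.5, 2.0]
beta = 1.3

Z = sum(math.exp(-beta * E) for E in levels)
p = [math.exp(-beta * E) / Z for E in levels]
E_mean = sum(pr * E for pr, E in zip(p, levels))

S_partition = math.log(Z) + beta * E_mean        # Eq. (7.52) with k = 1
S_gibbs = -sum(pr * math.log(pr) for pr in p)
print(S_partition, S_gibbs)  # identical
```

The agreement is exact: substituting p_r = exp(−βE_r)/Z into −Σ p_r ln p_r gives β\overline{E} + ln Z term by term.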
Suppose that we are dealing with a system A^{(0)} consisting of two systems A
and A' which only interact weakly with one another. Let each state of A be
denoted by an index r and have a corresponding energy E_r. Likewise, let each
state of A' be denoted by an index s and have a corresponding energy E'_s. A state
of the combined system A^{(0)} is then denoted by two indices r and s. Since A and
A' only interact weakly, their energies are additive, and the energy of state rs is

E^{(0)}_{rs} = E_r + E'_s.   (7.53)
By definition, the partition function of A^{(0)} takes the form

Z^{(0)} = \sum_{r,s} \exp[-\beta E^{(0)}_{rs}]
        = \sum_{r,s} \exp(-\beta\,[E_r + E'_s])
        = \sum_{r,s} \exp(-\beta E_r)\, \exp(-\beta E'_s)
        = \left[\sum_r \exp(-\beta E_r)\right] \left[\sum_s \exp(-\beta E'_s)\right].   (7.54)
Hence,

Z^{(0)} = Z\, Z',   (7.55)

giving

\ln Z^{(0)} = \ln Z + \ln Z',   (7.56)

where Z and Z' are the partition functions of A and A', respectively. It follows
from Eq. (7.31) that the mean energies of A^{(0)}, A, and A' are related by

\overline{E}^{(0)} = \overline{E} + \overline{E}'.   (7.57)
It also follows from Eq. (7.52) that the respective entropies of these systems are
related via

S^{(0)} = S + S'.   (7.58)
Hence, the partition function tells us that the extensive thermodynamic functions
of two weakly interacting systems are simply additive.
It is clear that we can perform statistical thermodynamical calculations using
the partition function Z instead of the more direct approach in which we use the
density of states Ω. The former approach is advantageous because the partition
function is an unrestricted sum of Boltzmann factors over all accessible states,
irrespective of their energy, whereas the density of states is a restricted sum over
all states whose energies lie in some narrow range. In general, it is far easier to
perform an unrestricted sum than a restricted sum. Thus, it is generally easier to
derive statistical thermodynamical results using Z rather than Ω, although Ω has
a far more direct physical signiﬁcance than Z.
7.6 Ideal monatomic gases
Let us now practice calculating thermodynamic relations using the partition
function by considering an example with which we are already quite familiar: i.e.,
an ideal monatomic gas. Consider a gas consisting of N identical monatomic
molecules of mass m enclosed in a container of volume V. Let us denote the
position and momentum vectors of the ith molecule by r_i and p_i, respectively. Since
the gas is ideal, there are no interatomic forces, and the total energy is simply the
sum of the individual kinetic energies of the molecules:

E = \sum_{i=1}^{N} \frac{p_i^{\,2}}{2 m},   (7.59)

where p_i^{\,2} = p_i \cdot p_i.
Let us treat the problem classically. In this approach, we divide up phase-space
into cells of equal volume h_0^{\,f}. Here, f is the number of degrees of freedom, and h_0
is a small constant with dimensions of angular momentum which parameterizes
the precision to which the positions and momenta of molecules are determined
(see Sect. 3.2). Each cell in phase-space corresponds to a different state. The
partition function is the sum of the Boltzmann factor exp(−βE_r) over all possible
states, where E_r is the energy of state r. Classically, we can approximate the
summation over cells in phase-space as an integration over all phase-space. Thus,
Z = \int \exp(-\beta E)\, \frac{d^3 r_1 \cdots d^3 r_N\; d^3 p_1 \cdots d^3 p_N}{h_0^{\,3N}},   (7.60)
where 3N is the number of degrees of freedom of a monatomic gas containing N
molecules. Making use of Eq. (7.59), the above expression reduces to

Z = \frac{V^N}{h_0^{\,3N}} \int \exp[-(\beta/2m)\, p_1^{\,2}]\, d^3 p_1 \cdots \int \exp[-(\beta/2m)\, p_N^{\,2}]\, d^3 p_N.   (7.61)
Note that the integral over the coordinates of a given molecule simply yields the
volume of the container, V, since the energy E is independent of the locations
of the molecules in an ideal gas. There are N such integrals, so we obtain the
factor V^N in the above expression. Note, also, that each of the integrals over
the molecular momenta in Eq. (7.61) is identical: they differ only by irrelevant
dummy variables of integration. It follows that the partition function Z of the gas
is made up of the product of N identical factors: i.e.,

Z = \zeta^N,   (7.62)
where

\zeta = \frac{V}{h_0^{\,3}} \int \exp[-(\beta/2m)\, p^2]\, d^3 p   (7.63)

is the partition function for a single molecule. Of course, this result is obvious,
since we have already shown that the partition function for a system made up
of a number of weakly interacting subsystems is just the product of the partition
functions of the subsystems (see Sect. 7.5).
The integral in Eq. (7.63) is easily evaluated:

\int \exp[-(\beta/2m)\, p^2]\, d^3 p
  = \int_{-\infty}^{\infty} \exp[-(\beta/2m)\, p_x^{\,2}]\, dp_x
    \int_{-\infty}^{\infty} \exp[-(\beta/2m)\, p_y^{\,2}]\, dp_y
    \int_{-\infty}^{\infty} \exp[-(\beta/2m)\, p_z^{\,2}]\, dp_z
  = \left(\frac{2\pi m}{\beta}\right)^{\!3/2},   (7.64)
where use has been made of Eq. (2.79). Thus,

\zeta = V \left(\frac{2\pi m}{h_0^{\,2}\,\beta}\right)^{\!3/2},   (7.65)

and

\ln Z = N \ln \zeta = N \left[\ln V - \frac{3}{2}\ln\beta + \frac{3}{2}\ln\!\left(\frac{2\pi m}{h_0^{\,2}}\right)\right].   (7.66)
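The one-dimensional Gaussian integral behind Eq. (7.64) can be checked by brute-force quadrature. The values of β and m, and the integration grid, in the sketch below are arbitrary assumptions:

```python
import math

# Brute-force check of the Gaussian momentum integral behind Eq. (7.64):
# the integral over p of exp[-(beta/2m) p^2] equals sqrt(2*pi*m/beta).
beta, m = 2.0, 3.0           # arbitrary illustrative values
a = beta / (2.0 * m)

N, L = 200000, 50.0          # wide symmetric range, fine grid
dp = 2.0 * L / N
total = sum(math.exp(-a * (-L + i * dp) ** 2) for i in range(N + 1)) * dp

print(total, math.sqrt(2.0 * math.pi * m / beta))  # agree closely
```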
The expression for the mean pressure (7.46) yields

\overline{p} = \frac{1}{\beta}\frac{\partial \ln Z}{\partial V} = \frac{1}{\beta}\frac{N}{V},   (7.67)

which reduces to the ideal gas equation of state

\overline{p}\, V = N k T = \nu R T,   (7.68)

where use has been made of N = ν N_A and R = N_A k. According to Eq. (7.31),
the mean energy of the gas is given by
\overline{E} = -\frac{\partial \ln Z}{\partial\beta} = \frac{3}{2}\frac{N}{\beta} = \frac{3}{2}\,\nu R T.   (7.69)
Note that the internal energy is a function of temperature alone, with no
dependence on volume. The molar heat capacity at constant volume of the gas is given
by

c_V = \frac{1}{\nu}\left(\frac{\partial\overline{E}}{\partial T}\right)_{\!V} = \frac{3}{2}\, R,   (7.70)

so the mean energy can be written

\overline{E} = \nu\, c_V\, T.   (7.71)
We have seen all of the above results before. Let us now use the partition
function to calculate a new result. The entropy of the gas can be calculated quite
simply from the expression
S = k\,(\ln Z + \beta\overline{E}).   (7.72)
Thus,

S = \nu R \left[\ln V - \frac{3}{2}\ln\beta + \frac{3}{2}\ln\!\left(\frac{2\pi m}{h_0^{\,2}}\right) + \frac{3}{2}\right],   (7.73)
or

S = \nu R \left(\ln V + \frac{3}{2}\ln T + \sigma\right),   (7.74)

where

\sigma = \frac{3}{2}\ln\!\left(\frac{2\pi m k}{h_0^{\,2}}\right) + \frac{3}{2}.   (7.75)
The above expression for the entropy of an ideal gas is certainly new.
Unfortunately, it is also quite obviously incorrect!
7.7 Gibbs' paradox
What has gone wrong? First of all, let us be clear why Eq. (7.74) is incorrect.
We can see that S → −∞ as T → 0, which contradicts the third law of
thermodynamics. However, this is not a problem. Equation (7.74) was derived using
classical physics, which breaks down at low temperatures. Thus, we would not
expect this equation to give a sensible answer close to the absolute zero of
temperature.
Equation (7.74) is wrong because it implies that the entropy does not behave
properly as an extensive quantity. Thermodynamic quantities can be divided into
two groups, extensive and intensive. Extensive quantities increase by a factor α
when the size of the system under consideration is increased by the same
factor. Intensive quantities stay the same. Energy and volume are typical extensive
quantities. Pressure and temperature are typical intensive quantities. Entropy is
very definitely an extensive quantity. We have shown [see Eq. (7.58)] that the
entropies of two weakly interacting systems are additive. Thus, if we double the
size of a system we expect the entropy to double as well. Suppose that we have a
system of volume V containing ν moles of ideal gas at temperature T. Doubling
the size of the system is like joining two identical systems together to form a new
system of volume 2V containing 2ν moles of gas at temperature T. Let
S = \nu R \left(\ln V + \frac{3}{2}\ln T + \sigma\right)   (7.76)

denote the entropy of the original system, and let

S' = 2\,\nu R \left(\ln 2V + \frac{3}{2}\ln T + \sigma\right)   (7.77)

denote the entropy of the double-sized system. Clearly, if entropy is an extensive
quantity (which it is!) then we should have

S' = 2\, S.   (7.78)

But, in fact, we find that

S' - 2\, S = 2\,\nu R \ln 2.   (7.79)
So, the entropy of the double-sized system is more than double the entropy of the
original system.
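The excess is easy to reproduce numerically from Eq. (7.74). In the sketch below the values of ν, V, and T are arbitrary, and σ is set to zero because it cancels in the comparison:

```python
import math

# Non-extensive behaviour of Eq. (7.74): doubling both V and nu at
# fixed T leaves an excess 2*nu*R*ln 2 -- Eq. (7.79).  The constant
# sigma cancels in the comparison, so its value is irrelevant here.
R, sigma = 8.314, 0.0

def S(nu, V, T):
    return nu * R * (math.log(V) + 1.5 * math.log(T) + sigma)

nu, V, T = 2.0, 5.0, 300.0   # arbitrary illustrative values
excess = S(2 * nu, 2 * V, T) - 2 * S(nu, V, T)
print(excess, 2 * nu * R * math.log(2.0))  # equal
```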
Where does this extra entropy come from? Well, let us consider a little more
carefully how we might go about doubling the size of our system. Suppose that
we put another identical system adjacent to it, and separate the two systems by a
partition. Let us now suddenly remove the partition. If entropy is a properly
extensive quantity then the entropy of the overall system should be the same before
and after the partition is removed. It is certainly the case that the energy (another
extensive quantity) of the overall system stays the same. However, according to
Eq. (7.79), the overall entropy of the system increases by 2νR ln2 after the
partition is removed. Suppose, now, that the second system is identical to the first
system in all respects except that its molecules are in some way slightly different
to the molecules in the first system, so that the two sets of molecules are
distinguishable. In this case, we would certainly expect an overall increase in entropy
when the partition is removed. Before the partition is removed, it separates type
1 molecules from type 2 molecules. After the partition is removed, molecules of
both types become jumbled together. This is clearly an irreversible process. We
cannot imagine the molecules spontaneously sorting themselves out again. The
increase in entropy associated with this jumbling is called entropy of mixing, and
is easily calculated. We know that the number of accessible states of an ideal
gas varies with volume like \Omega \propto V^N. The volume accessible to type 1 molecules
clearly doubles after the partition is removed, as does the volume accessible to
type 2 molecules. Using the fundamental formula S = k ln Ω, the increase in
entropy due to mixing is given by
S = 2\,k \ln\frac{\Omega_f}{\Omega_i} = 2\,N k \ln\frac{V_f}{V_i} = 2\,\nu R \ln 2.   (7.80)
It is clear that the additional entropy 2νR ln2, which appears when we double
the size of an ideal gas system by joining together two identical systems, is
entropy of mixing of the molecules contained in the original systems. But, if the
molecules in these two systems are indistinguishable, why should there be any
entropy of mixing? Well, clearly, there is no entropy of mixing in this case. At this
point, we can begin to understand what has gone wrong in our calculation. We
have calculated the partition function assuming that all of the molecules in our
system have the same mass and temperature, but we have never explicitly taken
into account the fact that we consider the molecules to be indistinguishable. In
other words, we have been treating the molecules in our ideal gas as if each
carried a little license plate, or a social security number, so that we could always tell
one from another. In quantum mechanics, which is what we really should be
using to study microscopic phenomena, the essential indistinguishability of atoms
and molecules is hard-wired into the theory at a very low level. Our problem is
that we have been taking the classical approach a little too seriously. It is plainly
silly to pretend that we can distinguish molecules in a statistical problem, where
we do not closely follow the motions of individual particles. A paradox arises
if we try to treat molecules as if they were distinguishable. This is called Gibbs'
paradox, after the American physicist Josiah Gibbs who first discussed it. The
resolution of Gibbs' paradox is quite simple: treat all molecules of the same species
as if they were indistinguishable.
In our previous calculation of the ideal gas partition function, we inadvertently
treated each of the N molecules in the gas as distinguishable. Because of this, we
over-counted the number of states of the system. Since the N! possible
permutations of the molecules amongst themselves do not lead to physically different
situations, and, therefore, cannot be counted as separate states, the number of
actual states of the system is a factor N! less than what we initially thought. We
can easily correct our partition function by simply dividing by this factor, so that

Z = \frac{\zeta^N}{N!}.   (7.81)
This gives

\ln Z = N \ln \zeta - \ln N!,   (7.82)

or

\ln Z = N \ln \zeta - N \ln N + N,   (7.83)

using Stirling's approximation. Note that our new version of ln Z differs from
our previous version by an additive term involving the number of particles in
the system. This explains why our calculations of the mean pressure and mean
energy, which depend on partial derivatives of ln Z with respect to the volume
and the temperature parameter β, respectively, came out all right. However, our
expression for the entropy S is modified by this additive term. The new expression
is
S = \nu R \left[\ln V - \frac{3}{2}\ln\beta + \frac{3}{2}\ln\!\left(\frac{2\pi m}{h_0^{\,2}}\right) + \frac{3}{2}\right] + k\,(-N \ln N + N).   (7.84)
This gives

S = \nu R \left(\ln\frac{V}{N} + \frac{3}{2}\ln T + \sigma_0\right),   (7.85)

where

\sigma_0 = \frac{3}{2}\ln\!\left(\frac{2\pi m k}{h_0^{\,2}}\right) + \frac{5}{2}.   (7.86)
It is clear that the entropy behaves properly as an extensive quantity in the above
expression: i.e., it is multiplied by a factor α when ν, V, and N are multiplied by
the same factor.
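This extensivity is straightforward to confirm from Eq. (7.85). The sketch below uses arbitrary illustrative values, including a placeholder value of σ₀ (the check does not depend on it):

```python
import math

# Extensivity of the corrected entropy (7.85): scaling nu and V (and
# hence N = nu*NA) by alpha scales S by alpha.  The numbers below,
# including the placeholder sigma0, are illustrative assumptions.
R, NA, sigma0, T = 8.314, 6.022e23, 10.0, 300.0

def S(nu, V):
    N = nu * NA
    return nu * R * (math.log(V / N) + 1.5 * math.log(T) + sigma0)

alpha = 3.0
print(S(alpha * 1.0, alpha * 2.0e-2), alpha * S(1.0, 2.0e-2))  # equal
```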
7.8 The equipartition theorem
The internal energy of a monatomic ideal gas containing N particles is (3/2) N k T.
This means that each particle possesses, on average, (3/2) k T units of energy.
Monatomic particles have only three translational degrees of freedom, corresponding
to their motion in three dimensions. They possess no internal rotational or
vibrational degrees of freedom. Thus, the mean energy per degree of freedom in a
monatomic ideal gas is (1/2) k T. In fact, this is a special case of a rather general
result. Let us now try to prove this.
Suppose that the energy of a system is determined by some f generalized
coordinates q_k and corresponding f generalized momenta p_k, so that

E = E(q_1, \ldots, q_f, p_1, \ldots, p_f).   (7.87)
Suppose further that:

1. The total energy splits additively into the form

   E = \epsilon_i(p_i) + E'(q_1, \ldots, p_f),   (7.88)

   where \epsilon_i involves only one variable p_i, and the remaining part E' does not
   depend on p_i.

2. The function \epsilon_i is quadratic in p_i, so that

   \epsilon_i(p_i) = b\, p_i^{\,2},   (7.89)

   where b is a constant.
The most common situation in which the above assumptions are valid is where p_i
is a momentum. This is because the kinetic energy is usually a quadratic function
of each momentum component, whereas the potential energy does not involve
the momenta at all. However, if a coordinate q_i were to satisfy assumptions 1
and 2 then the theorem we are about to establish would hold just as well.
What is the mean value of \epsilon_i in thermal equilibrium if conditions 1 and 2 are
satisfied? If the system is in equilibrium at absolute temperature T ≡ (kβ)^{-1}
then it is distributed according to the Boltzmann distribution. In the classical
approximation, the mean value of \epsilon_i is expressed in terms of integrals over all
phase-space:

\overline{\epsilon}_i = \frac{\int_{-\infty}^{\infty} \exp[-\beta E(q_1, \ldots, p_f)]\, \epsilon_i\; dq_1 \cdots dp_f}{\int_{-\infty}^{\infty} \exp[-\beta E(q_1, \ldots, p_f)]\; dq_1 \cdots dp_f}.   (7.90)
Condition 1 gives

\overline{\epsilon}_i = \frac{\int_{-\infty}^{\infty} \exp[-\beta(\epsilon_i + E')]\, \epsilon_i\; dq_1 \cdots dp_f}{\int_{-\infty}^{\infty} \exp[-\beta(\epsilon_i + E')]\; dq_1 \cdots dp_f}
      = \frac{\int_{-\infty}^{\infty} \exp(-\beta\epsilon_i)\, \epsilon_i\, dp_i \int_{-\infty}^{\infty} \exp(-\beta E')\; dq_1 \cdots dp_f}{\int_{-\infty}^{\infty} \exp(-\beta\epsilon_i)\, dp_i \int_{-\infty}^{\infty} \exp(-\beta E')\; dq_1 \cdots dp_f},   (7.91)

where use has been made of the multiplicative property of the exponential
function, and where the last integrals in both the numerator and denominator extend
over all variables q_k and p_k except p_i. These integrals are equal and, thus, cancel.
Hence,

\overline{\epsilon}_i = \frac{\int_{-\infty}^{\infty} \exp(-\beta\epsilon_i)\, \epsilon_i\, dp_i}{\int_{-\infty}^{\infty} \exp(-\beta\epsilon_i)\, dp_i}.   (7.92)
This expression can be simplified further since

\int_{-\infty}^{\infty} \exp(-\beta\epsilon_i)\, \epsilon_i\, dp_i \equiv -\frac{\partial}{\partial\beta}\left[\int_{-\infty}^{\infty} \exp(-\beta\epsilon_i)\, dp_i\right],   (7.93)

so

\overline{\epsilon}_i = -\frac{\partial}{\partial\beta} \ln\!\left[\int_{-\infty}^{\infty} \exp(-\beta\epsilon_i)\, dp_i\right].   (7.94)
According to condition 2,

\int_{-\infty}^{\infty} \exp(-\beta\epsilon_i)\, dp_i = \int_{-\infty}^{\infty} \exp(-\beta b\, p_i^{\,2})\, dp_i = \frac{1}{\sqrt{\beta}} \int_{-\infty}^{\infty} \exp(-b\, y^2)\, dy,   (7.95)

where y = \sqrt{\beta}\, p_i. Thus,
\ln \int_{-\infty}^{\infty} \exp(-\beta\epsilon_i)\, dp_i = -\frac{1}{2}\ln\beta + \ln \int_{-\infty}^{\infty} \exp(-b\, y^2)\, dy.   (7.96)
Note that the integral on the right-hand side does not depend on β at all. It
follows from Eq. (7.94) that

\overline{\epsilon}_i = -\frac{\partial}{\partial\beta}\left(-\frac{1}{2}\ln\beta\right) = \frac{1}{2\beta},   (7.97)

giving

\overline{\epsilon}_i = \frac{1}{2}\, k T.   (7.98)
This is the famous equipartition theorem of classical physics. It states that the
mean value of every independent quadratic term in the energy is equal to (1/2) k T.
If all terms in the energy are quadratic then the mean energy is spread equally
over all degrees of freedom (hence the name "equipartition").
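The theorem is easy to verify by direct quadrature for a single quadratic term \epsilon(p) = b p². In the sketch below the values of b and β, and the integration grid, are arbitrary assumptions (with k = 1):

```python
import math

# Quadrature check of equipartition: for epsilon(p) = b*p^2 the
# Boltzmann-weighted mean of epsilon is 1/(2*beta), independent of b
# (Eq. 7.98, with k = 1).  b, beta, and the grid are arbitrary.
b, beta = 2.5, 0.8

N, L = 200000, 40.0
dp = 2.0 * L / N
num = den = 0.0
for i in range(N + 1):
    p = -L + i * dp
    w = math.exp(-beta * b * p * p)
    num += w * b * p * p
    den += w

print(num / den, 1.0 / (2.0 * beta))  # agree
```

Changing b leaves the result unchanged, which is exactly the content of the theorem.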
7.9 Harmonic oscillators
Our proof of the equipartition theorem depends crucially on the classical
approximation. To see how quantum effects modify this result, let us examine a
particularly simple system which we know how to analyze using both classical and
quantum physics: i.e., a simple harmonic oscillator. Consider a one-dimensional
harmonic oscillator in equilibrium with a heat reservoir at temperature T. The
energy of the oscillator is given by

E = \frac{p^2}{2 m} + \frac{1}{2}\,\kappa\, x^2,   (7.99)
where the first term on the right-hand side is the kinetic energy, involving the
momentum p and mass m, and the second term is the potential energy, involving
the displacement x and the force constant κ. Each of these terms is quadratic
in the respective variable. So, in the classical approximation the equipartition
theorem yields

\overline{\frac{p^2}{2 m}} = \frac{1}{2}\, k T,   (7.100)

\overline{\frac{1}{2}\,\kappa\, x^2} = \frac{1}{2}\, k T.   (7.101)

That is, the mean kinetic energy of the oscillator is equal to the mean potential
energy, which equals (1/2) k T. It follows that the mean total energy is

\overline{E} = \frac{1}{2}\, k T + \frac{1}{2}\, k T = k T.   (7.102)
According to quantum mechanics, the energy levels of a harmonic oscillator
are equally spaced and satisfy

E_n = (n + 1/2)\, \hbar\omega,   (7.103)

where n is a non-negative integer, and

\omega = \sqrt{\frac{\kappa}{m}}.   (7.104)
The partition function for such an oscillator is given by

Z = \sum_{n=0}^{\infty} \exp(-\beta E_n) = \exp[-(1/2)\,\beta\hbar\omega] \sum_{n=0}^{\infty} \exp(-n\,\beta\hbar\omega).   (7.105)

Now,

\sum_{n=0}^{\infty} \exp(-n\,\beta\hbar\omega) = 1 + \exp(-\beta\hbar\omega) + \exp(-2\,\beta\hbar\omega) + \cdots   (7.106)

is simply the sum of an infinite geometric series, and can be evaluated
immediately,

\sum_{n=0}^{\infty} \exp(-n\,\beta\hbar\omega) = \frac{1}{1 - \exp(-\beta\hbar\omega)}.   (7.107)
Thus, the partition function takes the form

Z = \frac{\exp[-(1/2)\,\beta\hbar\omega]}{1 - \exp(-\beta\hbar\omega)},   (7.108)

and

\ln Z = -\frac{1}{2}\,\beta\hbar\omega - \ln[1 - \exp(-\beta\hbar\omega)].   (7.109)
The mean energy of the oscillator is given by [see Eq. (7.31)]

\overline{E} = -\frac{\partial}{\partial\beta}\ln Z = -\left[-\frac{1}{2}\,\hbar\omega - \frac{\exp(-\beta\hbar\omega)\,\hbar\omega}{1 - \exp(-\beta\hbar\omega)}\right],   (7.110)

or

\overline{E} = \hbar\omega\left[\frac{1}{2} + \frac{1}{\exp(\beta\hbar\omega) - 1}\right].   (7.111)
Consider the limit

\beta\hbar\omega = \frac{\hbar\omega}{k T} \ll 1,   (7.112)

in which the thermal energy k T is large compared to the separation ℏω between
the energy levels. In this limit,

\exp(\beta\hbar\omega) \simeq 1 + \beta\hbar\omega,   (7.113)

so

\overline{E} \simeq \hbar\omega\left[\frac{1}{2} + \frac{1}{\beta\hbar\omega}\right] \simeq \hbar\omega\left[\frac{1}{\beta\hbar\omega}\right],   (7.114)

giving

\overline{E} \simeq \frac{1}{\beta} = k T.   (7.115)
Thus, the classical result (7.102) holds whenever the thermal energy greatly
exceeds the typical spacing between quantum energy levels.

Consider the limit

\beta\hbar\omega = \frac{\hbar\omega}{k T} \gg 1,   (7.116)

in which the thermal energy is small compared to the separation between the
energy levels. In this limit,

\exp(\beta\hbar\omega) \gg 1,   (7.117)

and so

\overline{E} \simeq \hbar\omega\,[1/2 + \exp(-\beta\hbar\omega)] \simeq \frac{1}{2}\,\hbar\omega.   (7.118)
Thus, if the thermal energy is much less than the spacing between quantum states
then the mean energy approaches that of the ground-state (the so-called
zero-point energy). Clearly, the equipartition theorem is only valid in the former limit,
where k T ≫ ℏω, and the oscillator possesses sufficient thermal energy to explore
many of its possible quantum states.
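The two limits of Eq. (7.111) can be checked directly. The sketch below works in units where ℏω = 1 (an illustrative choice, not from the notes), so x = βℏω and energies are measured in units of ℏω:

```python
import math

# Limits of the quantum oscillator mean energy, Eq. (7.111), in units
# where hbar*omega = 1, so x = beta*hbar*omega.
def mean_E(x):
    return 0.5 + 1.0 / (math.exp(x) - 1.0)

# High temperature, x << 1: classical result E ~ kT, i.e. 1/x here
print(mean_E(1e-3), 1.0 / 1e-3)   # close
# Low temperature, x >> 1: zero-point value 1/2
print(mean_E(50.0))               # ~ 0.5
```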
7.10 Speciﬁc heats
We have discussed the internal energies and entropies of substances (mostly ideal
gases) at some length. Unfortunately, these quantities cannot be directly
measured. Instead, they must be inferred from other information. The
thermodynamic property of substances which is the easiest to measure is, of course, the
heat capacity, or specific heat. In fact, once the variation of the specific heat
with temperature is known, both the internal energy and entropy can be easily
reconstructed via

E(T, V) = \nu \int_0^T c_V(T, V)\, dT + E(0, V),   (7.119)
S(T, V) = \nu \int_0^T \frac{c_V(T, V)}{T}\, dT.   (7.120)
Here, use has been made of dS = \bar{d}Q/T, and the third law of thermodynamics.
Clearly, the optimum way of verifying the results of statistical thermodynamics
is to compare the theoretically predicted heat capacities with the experimentally
measured values.
Classical physics, in the guise of the equipartition theorem, says that each
independent degree of freedom associated with a quadratic term in the energy
possesses an average energy (1/2) k T in thermal equilibrium at temperature T.
Consider a substance made up of N molecules. Every molecular degree of
freedom contributes (1/2) N k T, or (1/2) νRT, to the mean energy of the substance
(with the tacit proviso that each degree of freedom is associated with a quadratic
term in the energy). Thus, the contribution to the molar heat capacity at constant
volume (we wish to avoid the complications associated with any external work
done on the substance) is

\frac{1}{\nu}\left(\frac{\partial\overline{E}}{\partial T}\right)_{\!V} = \frac{1}{\nu}\frac{\partial[(1/2)\,\nu R T]}{\partial T} = \frac{1}{2}\, R,   (7.121)
per molecular degree of freedom. The total classical heat capacity is therefore

c_V = \frac{g}{2}\, R,   (7.122)
where g is the number of molecular degrees of freedom. Since large complicated
molecules clearly have very many more degrees of freedom than small
simple molecules, the above formula predicts that the molar heat capacities of
substances made up of the former type of molecules should greatly exceed those
of substances made up of the latter. In fact, the experimental heat capacities
of substances containing complicated molecules are generally greater than those
of substances containing simple molecules, but by nowhere near the large factor
predicted by Eq. (7.122). This equation also implies that heat capacities are
temperature independent. In fact, this is not the case for most substances.
Experimental heat capacities generally increase with increasing temperature. These
two experimental facts pose severe problems for classical physics. Incidentally,
these problems were fully appreciated as far back as 1850. Stories that physicists
at the end of the nineteenth century thought that classical physics explained
absolutely everything are largely apocryphal.
The equipartition theorem (and the whole classical approximation) is only
valid when the typical thermal energy k T greatly exceeds the spacing between
quantum energy levels. Suppose that the temperature is sufficiently low that
this condition is not satisfied for one particular molecular degree of freedom. In
fact, suppose that k T is much less than the spacing between the energy levels.
According to Sect. 7.9, in this situation the degree of freedom only contributes the
ground-state energy, E_0, say, to the mean energy of the molecule. The
ground-state energy can be a quite complicated function of the internal properties of
the molecule, but is certainly not a function of the temperature, since the latter is a
collective property of all molecules. It follows that the contribution to the molar
heat capacity is

\frac{1}{\nu}\left(\frac{\partial[N E_0]}{\partial T}\right)_{\!V} = 0.   (7.123)

Thus, if k T is much less than the spacing between the energy levels then the
degree of freedom contributes nothing at all to the molar heat capacity. We say that
this particular degree of freedom is frozen out. Clearly, at very low temperatures
just about all degrees of freedom are frozen out. As the temperature is gradually
increased, degrees of freedom successively "kick in," and eventually contribute
their full (1/2) R to the molar heat capacity, as k T approaches, and then greatly
exceeds, the spacing between their quantum energy levels. We can use these
simple ideas to explain the behaviours of most experimental heat capacities.
To make further progress, we need to estimate the typical spacing between
the quantum energy levels associated with various degrees of freedom. We can
142
7.11 Speciﬁc heats of gases 7 APPLICATIONS OF STATISTICAL THERMODYNAMICS
Radiation type Frequency (Hz) T
rad
(
◦
K)
Radio < 10
9
< 0.05
Microwave 10
9
– 10
11
0.05 – 5
Infrared 10
11
– 10
14
5 – 5000
Visible 5 10
14
2 10
4
Ultraviolet 10
15
– 10
17
5 10
4
– 5 10
6
Xray 10
17
– 10
20
5 10
6
– 5 10
9
γray > 10
20
> 5 10
9
Table 3: Effective “temperatures” of various types of electromagnetic radiation
do this by observing the frequency of the electromagnetic radiation emitted and
absorbed during transitions between these energy levels. If the typical spacing between energy levels is ∆E, then transitions between the various levels are associated with photons of frequency ν, where hν = ∆E. We can define an effective temperature of the radiation via hν = k T_rad. If T ≫ T_rad then k T ≫ ∆E, and the degree of freedom makes its full contribution to the heat capacity. On the other hand, if T ≪ T_rad then k T ≪ ∆E, and the degree of freedom is frozen out. Table 3 lists the "temperatures" of various different types of radiation. It is clear that degrees of freedom which give rise to emission or absorption of radio or microwave radiation contribute their full (1/2) R to the molar heat capacity at room temperature. Degrees of freedom which give rise to emission or absorption in the visible, ultraviolet, X-ray, or γ-ray regions of the electromagnetic spectrum are frozen out at room temperature. Degrees of freedom which emit or absorb infrared radiation are on the borderline.
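The conversion behind Table 3 is just T_rad = hν/k, and is easy to check numerically. A minimal sketch with rounded physical constants (the function name is my own):

```python
# Effective radiation "temperature" T_rad = h nu / k for a photon of frequency nu.
H = 6.626e-34    # Planck constant (J s)
K_B = 1.381e-23  # Boltzmann constant (J/K)

def radiation_temperature(nu_hz):
    """Temperature (K) at which k T equals the photon energy h nu."""
    return H * nu_hz / K_B

# The radio/microwave boundary (10^9 Hz) should land near 0.05 K,
# and the top of the infrared band (10^14 Hz) near 5000 K.
for nu in (1e9, 1e11, 1e14):
    print(f"nu = {nu:.0e} Hz -> T_rad = {radiation_temperature(nu):.3g} K")
```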
7.11 Speciﬁc heats of gases
Let us now investigate the specific heats of gases. Consider, first of all, translational degrees of freedom. Every molecule in a gas is free to move in three
dimensions. If one particular molecule has mass m and momentum p = mv then
its kinetic energy of translation is
K = (1/2m)(p_x² + p_y² + p_z²).  (7.124)
The kinetic energy of other molecules does not involve the momentum p of
this particular molecule. Moreover, the potential energy of interaction between
molecules depends only on their position coordinates, and, thus, certainly does
not involve p. Any internal rotational, vibrational, electronic, or nuclear degrees
of freedom of the molecule also do not involve p. Hence, the essential conditions
of the equipartition theorem are satisfied (at least, in the classical approximation). Since Eq. (7.124) contains three independent quadratic terms, there are
clearly three degrees of freedom associated with translation (one for each dimension of space), so the translational contribution to the molar heat capacity of
gases is
(c_V)_translation = (3/2) R.  (7.125)
Suppose that our gas is contained in a cubic enclosure of dimension L. According to Schrödinger's equation, the quantized translational energy levels of an individual molecule are given by
E = (ℏ²π²/2mL²)(n₁² + n₂² + n₃²),  (7.126)
where n₁, n₂, and n₃ are positive integer quantum numbers. Clearly, the spacing
between the energy levels can be made arbitrarily small by increasing the size of
the enclosure. This implies that translational degrees of freedom can be treated
classically, so that Eq. (7.125) is always valid (except very close to absolute zero).
We conclude that all gases possess a minimum molar heat capacity of (3/2) R due
to the translational degrees of freedom of their constituent molecules.
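To see just how classical these translational levels are, the sketch below evaluates the level-spacing scale ℏ²π²/(2mL²) of Eq. (7.126) against k T; the nitrogen-like mass and centimetre-sized box are illustrative choices of mine, not values from the text.

```python
import math

HBAR = 1.055e-34  # reduced Planck constant (J s)
K_B = 1.381e-23   # Boltzmann constant (J/K)

m = 4.7e-26  # molecular mass (kg), roughly N2
L = 1e-2     # enclosure dimension (m)
T = 300.0    # room temperature (K)

# Characteristic spacing of the translational levels in Eq. (7.126).
delta_E = HBAR**2 * math.pi**2 / (2 * m * L**2)
print(f"level spacing ~ {delta_E:.2e} J, k T = {K_B * T:.2e} J")
print(f"ratio ~ {delta_E / (K_B * T):.1e}")  # vanishingly small: fully classical
```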
The electronic degrees of freedom of gas molecules (i.e., the possible configurations of electrons orbiting the atomic nuclei) typically give rise to absorption
and emission in the ultraviolet or visible regions of the spectrum. It follows from
Tab. 3 that electronic degrees of freedom are frozen out at room temperature.
Similarly, nuclear degrees of freedom (i.e., the possible conﬁgurations of protons
and neutrons in the atomic nuclei) are frozen out because they are associated
with absorption and emission in the X-ray and γ-ray regions of the electromagnetic spectrum. In fact, the only additional degrees of freedom we need worry
about for gases are rotational and vibrational degrees of freedom. These typically
give rise to absorption lines in the infrared region of the spectrum.
The rotational kinetic energy of a molecule tumbling in space can be written
K = (1/2) I_x ω_x² + (1/2) I_y ω_y² + (1/2) I_z ω_z²,  (7.127)
where the x-, y-, and z-axes are the so-called principal axes of inertia of the molecule (these are mutually perpendicular), ω_x, ω_y, and ω_z are the angular velocities of rotation about these axes, and I_x, I_y, and I_z are the moments of inertia of the molecule about these axes. No other degrees of freedom depend on
the angular velocities of rotation. Since the kinetic energy of rotation is the sum
of three quadratic terms, the rotational contribution to the molar heat capacity
of gases is
(c_V)_rotation = (3/2) R,  (7.128)
according to the equipartition theorem. Note that the typical magnitude of a molecular moment of inertia is m d², where m is the molecular mass, and d is the typical interatomic spacing in the molecule. A special case arises if the molecule is linear (e.g., if the molecule is diatomic). In this case, one of the principal axes lies along the line of centres of the atoms. The moment of inertia about this axis is of order m a², where a is a typical nuclear dimension (remember that nearly all of the mass of an atom resides in the nucleus). Since a ∼ 10⁻⁵ d, it follows that the moment of inertia about the line of centres is minuscule compared to the moments of inertia about the other two principal axes. In quantum mechanics,
angular momentum is quantized in units of ℏ. The energy levels of a rigid rotator are written

E = (ℏ²/2I) J(J + 1),  (7.129)
where I is the moment of inertia and J is an integer. Note the inverse dependence
of the spacing between energy levels on the moment of inertia. It is clear that for
the case of a linear molecule, the rotational degree of freedom associated with
spinning along the line of centres of the atoms is frozen out at room temperature,
given the very small moment of inertia along this axis, and, hence, the very widely
spaced rotational energy levels.
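These spacings are conveniently expressed as a characteristic rotational temperature θ_rot = ℏ²/(2Ik), below which the levels (7.129) freeze out. The moment of inertia used in the sketch below is an illustrative HCl-like value, not a number from the text.

```python
HBAR = 1.055e-34  # reduced Planck constant (J s)
K_B = 1.381e-23   # Boltzmann constant (J/K)

# Moment of inertia about an axis perpendicular to the bond (kg m^2), ~ m d^2.
I_perp = 2.7e-47

theta_rot = HBAR**2 / (2 * I_perp * K_B)
print(f"theta_rot ~ {theta_rot:.1f} K")  # far below room temperature

# About the line of centres, I is smaller by ~(a/d)^2 ~ 10^-10, so the
# characteristic temperature is larger by ~10^10: hopelessly frozen out.
theta_spin = theta_rot * 1e10
print(f"theta_spin ~ {theta_spin:.1e} K")
```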
Classically, the vibrational degrees of freedom of a molecule are studied by
standard normal mode analysis of the molecular structure. Each normal mode
behaves like an independent harmonic oscillator, and, therefore, contributes R to
the molar speciﬁc heat of the gas [(1/2) R from the kinetic energy of vibration and
(1/2) R from the potential energy of vibration]. A molecule containing n atoms
has n − 1 normal modes of vibration. For instance, a diatomic molecule has just
one normal mode (corresponding to periodic stretching of the bond between the
two atoms). Thus, the classical contribution to the speciﬁc heat from vibrational
degrees of freedom is
(c_V)_vibration = (n − 1) R.  (7.130)
Figure 3: The infrared vibration-absorption spectrum of HCl.
So, do any of the rotational and vibrational degrees of freedom actually make
a contribution to the speciﬁc heats of gases at room temperature, once quantum
effects are taken into consideration? We can answer this question by examin
ing just one piece of data. Figure 3 shows the infrared absorption spectrum of
Hydrogen Chloride. The absorption lines correspond to simultaneous transitions
between different vibrational and rotational energy levels. Hence, this is usu
ally called a vibrationrotation spectrum. The missing line at about 3.47 microns
corresponds to a pure vibrational transition from the groundstate to the ﬁrst
excited state (pure vibrational transitions are forbidden: HCl molecules always
have to simultaneously change their rotational energy level if they are to couple
effectively to electromagnetic radiation). The longer wavelength absorption lines
correspond to vibrational transitions in which there is a simultaneous decrease
in the rotational energy level. Likewise, the shorter wavelength absorption lines
correspond to vibrational transitions in which there is a simultaneous increase in
the rotational energy level. It is clear that the rotational energy levels are more
closely spaced than the vibrational energy levels. The pure vibrational transition
gives rise to absorption at about 3.47 microns, which corresponds to infrared radiation of frequency 8.5 × 10¹³ hertz, with an associated radiation "temperature" of 4400 degrees kelvin. We conclude that the vibrational degrees of freedom of HCl, or any other small molecule, are frozen out at room temperature. The rotational transitions split the vibrational lines by about 0.2 microns. This implies that pure rotational transitions would be associated with infrared radiation of frequency 5 × 10¹² hertz and a corresponding radiation "temperature" of 260 degrees kelvin. We conclude that the rotational degrees of freedom of HCl, or any other small molecule, are not frozen out at room temperature, and probably contribute the classical (1/2) R to the molar specific heat. There is one proviso, however. Linear molecules (like HCl) effectively only have two rotational degrees of freedom (instead of the usual three), because of the very small moment of inertia of such molecules along the line of centres of the atoms.
We are now in a position to make some predictions regarding the speciﬁc heats
of various gases. Monatomic molecules only possess three translational degrees
of freedom, so monatomic gases should have a molar heat capacity (3/2) R =
12.47 joules/degree/mole. The ratio of specific heats γ = c_p/c_V = (c_V + R)/c_V should be 5/3 = 1.667. It can be seen from Tab. 2 that both of these predictions are borne out pretty well for Helium and Argon. Diatomic molecules possess three
translational degrees of freedom and two rotational degrees of freedom (all other
degrees of freedom are frozen out at room temperature). Thus, diatomic gases
should have a molar heat capacity (5/2) R = 20.8 joules/degree/mole. The ratio
of speciﬁc heats should be 7/5 = 1.4. It can be seen from Tab. 2 that these are
pretty accurate predictions for Nitrogen and Oxygen. The freezing out of vibrational degrees of freedom becomes gradually less effective as molecules become heavier and more complex. This is partly because such molecules are generally less stable, so the force constant κ is reduced, and partly because the molecular mass is increased. Both these effects reduce the frequency of vibration of
the molecular normal modes [see Eq. (7.104)], and, hence, the spacing between
vibrational energy levels [see Eq. (7.103)]. This accounts for the obviously non-classical [i.e., not a multiple of (1/2) R] specific heats of Carbon Dioxide and
Ethane in Tab. 2. In both molecules, vibrational degrees of freedom contribute
to the molar speciﬁc heat (but not the full R because the temperature is not high
enough).
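The degree-of-freedom counting used above reduces to a one-line calculation. A minimal sketch (the function name is my own):

```python
R = 8.314  # gas constant (J/mol/K)

def gas_predictions(dof):
    """Return (c_V, gamma) for a gas with `dof` unfrozen quadratic degrees of freedom."""
    c_v = dof * R / 2
    return c_v, (c_v + R) / c_v

cv_mono, gamma_mono = gas_predictions(3)  # translation only
cv_di, gamma_di = gas_predictions(5)      # translation plus two rotations
print(f"monatomic: c_V = {cv_mono:.2f} J/mol/K, gamma = {gamma_mono:.3f}")
print(f"diatomic:  c_V = {cv_di:.2f} J/mol/K, gamma = {gamma_di:.3f}")
```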
Figure 4: The molar heat capacity at constant volume (in units of R) of gaseous H₂ versus temperature.
Figure 4 shows the variation of the molar heat capacity at constant volume
(in units of R) of gaseous hydrogen with temperature. The expected contribution
from the translational degrees of freedom is (3/2) R (there are three translational
degrees of freedom per molecule). The expected contribution at high temperatures from the rotational degrees of freedom is R (there are effectively two rotational degrees of freedom per molecule). Finally, the expected contribution at high temperatures from the vibrational degrees of freedom is R (there is one vibrational degree of freedom per molecule). It can be seen that as the temperature
rises the rotational, and then the vibrational, degrees of freedom eventually make
their full classical contributions to the heat capacity.
7.12 Speciﬁc heats of solids
Consider a simple solid containing N atoms. Now, atoms in solids cannot translate (unlike those in gases), but are free to vibrate about their equilibrium positions. Such vibrations are called lattice vibrations, and can be thought of as sound
waves propagating through the crystal lattice. Each atom is speciﬁed by three
independent position coordinates, and three conjugate momentum coordinates.
Let us only consider small amplitude vibrations. In this case, we can expand the
potential energy of interaction between the atoms to give an expression which
is quadratic in the atomic displacements from their equilibrium positions. It is
always possible to perform a normal mode analysis of the oscillations. In effect,
we can ﬁnd 3 N independent modes of oscillation of the solid. Each mode has
its own particular oscillation frequency, and its own particular pattern of atomic
displacements. Any general oscillation can be written as a linear combination of
these normal modes. Let q_i be the (appropriately normalized) amplitude of the ith normal mode, and p_i the momentum conjugate to this coordinate. In normal mode coordinates, the total energy of the lattice vibrations takes the particularly simple form
E = (1/2) Σ_{i=1}^{3N} (p_i² + ω_i² q_i²),  (7.131)
where ω_i is the (angular) oscillation frequency of the ith normal mode. It is clear
that in normal mode coordinates, the linearized lattice vibrations are equivalent
to 3 N independent harmonic oscillators (of course, each oscillator corresponds
to a different normal mode).
The typical value of ω_i is the (angular) frequency of a sound wave propagating through the lattice. Sound wave frequencies are far lower than the typical vibration frequencies of gaseous molecules. In the latter case, the mass involved in the vibration is simply that of the molecule, whereas in the former case the mass involved is that of very many atoms (since lattice vibrations are non-localized). The strength of interatomic bonds in gaseous molecules is similar to those in solids, so we can use the estimate ω ∼ √(κ/m) (κ is the force constant which measures the strength of interatomic bonds, and m is the mass involved in the oscillation) as proof that the typical frequencies of lattice vibrations are very much less than
the vibration frequencies of simple molecules. It follows from ∆E = ¯hω that
the quantum energy levels of lattice vibrations are far more closely spaced than
the vibrational energy levels of gaseous molecules. Thus, it is likely (and is, in
deed, the case) that lattice vibrations are not frozen out at room temperature,
but, instead, make their full classical contribution to the molar speciﬁc heat of
the solid.
If the lattice vibrations behave classically then, according to the equipartition
theorem, each normal mode of oscillation has an associated mean energy k T
in equilibrium at temperature T [(1/2) k T resides in the kinetic energy of the
oscillation, and (1/2) k T resides in the potential energy]. Thus, the mean internal
energy per mole of the solid is
E = 3 Nk T = 3 νRT. (7.132)
It follows that the molar heat capacity at constant volume is
c_V = (1/ν)(∂E/∂T)_V = 3 R  (7.133)
for solids. This gives a value of 24.9 joules/mole/degree. In fact, at room temperature most solids (in particular, metals) have heat capacities which lie remarkably close to this value. This fact was discovered experimentally by Dulong and Petit at the beginning of the nineteenth century, and was used to make some of the first crude estimates of the molecular weights of solids (if we know the molar heat capacity of a substance then we can easily work out how much of it corresponds to one mole, and by weighing this amount, and then dividing the result by Avogadro's number, we can obtain an estimate of the molecular weight). Table 4 lists the experimental molar heat capacities c_p at constant pressure for various
solids. The heat capacity at constant volume is somewhat less than the constant
pressure value, but not by much, because solids are fairly incompressible. It can
be seen that Dulong and Petit's law (i.e., that all solids have molar heat capacities close to 24.9 joules/mole/degree) holds pretty well for metals. However, the law fails badly for diamond. This is not surprising. As is well-known, diamond is an extremely hard substance, so its interatomic bonds must be very strong,
suggesting that the force constant κ is large. Diamond is also a fairly low density
Solid     c_p      Solid                c_p
Copper    24.5     Aluminium            24.4
Silver    25.5     Tin (white)          26.4
Lead      26.4     Sulphur (rhombic)    22.4
Zinc      25.4     Carbon (diamond)     6.1

Table 4: Values of c_p (joules/mole/degree) for some solids at T = 298 K. From Reif.
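The molecular-weight estimate described above amounts to dividing 3R by the measured specific heat per gram. A sketch, using an illustrative handbook-style value for copper (not a number from the text):

```python
R = 8.314  # gas constant (J/mol/K)

def molar_mass_estimate(c_per_gram):
    """Molar mass (g/mol) of a solid assumed to obey Dulong and Petit's law."""
    return 3 * R / c_per_gram

# Copper's specific heat is roughly 0.385 J/g/K; the estimate lands near
# the true molar mass of 63.5 g/mol.
print(f"copper: ~{molar_mass_estimate(0.385):.1f} g/mol")
```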
substance, so the mass m involved in lattice vibrations is comparatively small. Both these facts suggest that the typical lattice vibration frequency of diamond (ω ∼ √(κ/m)) is high. In fact, the spacing between the different vibration energy levels (which scales like ℏω) is sufficiently large in diamond for the vibrational degrees of freedom to be largely frozen out at room temperature. This accounts for the anomalously low heat capacity of diamond in Tab. 4.
Dulong and Petite’s law is essentially a high temperature limit. The molar
heat capacity cannot remain a constant as the temperature approaches absolute
zero, since, by Eq. (7.120), this would imply S → ∞, which violates the third
law of thermodynamics. We can make a crude model of the behaviour of c
V
at
low temperatures by assuming that all the normal modes oscillate at the same
frequency, ω, say. This approximation was ﬁrst employed by Einstein in a paper
published in 1907. According to Eq. (7.131), the solid acts like a set of 3N
independent oscillators which, making use of Einstein’s approximation, all vibrate
at the same frequency. We can use the quantum mechanical result (7.111) for a
single oscillator to write the mean energy of the solid in the form
E = 3Nℏω [1/2 + 1/(exp(βℏω) − 1)].  (7.134)
The molar heat capacity is deﬁned
c_V = (1/ν)(∂E/∂T)_V = (1/ν)(∂E/∂β)_V (∂β/∂T) = −(1/νkT²)(∂E/∂β)_V,  (7.135)
giving
c_V = −(3N_A ℏω/kT²) [−exp(βℏω) ℏω/(exp(βℏω) − 1)²],  (7.136)
which reduces to
c_V = 3R (θ_E/T)² exp(θ_E/T)/[exp(θ_E/T) − 1]².  (7.137)
Here,
θ_E = ℏω/k  (7.138)
is called the Einstein temperature. If the temperature is sufficiently high that T ≫ θ_E then k T ≫ ℏω, and the above expression reduces to c_V = 3 R, after expansion of the exponential functions. Thus, the law of Dulong and Petit is recovered for temperatures significantly in excess of the Einstein temperature.
On the other hand, if the temperature is sufficiently low that T ≪ θ_E then the exponential factors in Eq. (7.137) become very much larger than unity, giving

c_V ∼ 3R (θ_E/T)² exp(−θ_E/T).  (7.139)
So, in this simple model the speciﬁc heat approaches zero exponentially as T →0.
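Both limits of Eq. (7.137) are easy to confirm numerically. A minimal sketch of the Einstein heat capacity as a function of T/θ_E:

```python
import math

R = 8.314  # gas constant (J/mol/K)

def c_v_einstein(t_over_theta):
    """Einstein-model molar heat capacity, Eq. (7.137), for a given T/theta_E."""
    x = 1.0 / t_over_theta  # x = theta_E / T
    e = math.exp(x)
    return 3 * R * x**2 * e / (e - 1)**2

print(f"T = 10 theta_E : c_V = {c_v_einstein(10.0):.2f} J/mol/K (3R = {3*R:.2f})")
print(f"T = 0.1 theta_E: c_V = {c_v_einstein(0.1):.2e} J/mol/K (exponentially small)")
```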
In reality, the speciﬁc heats of solids do not approach zero quite as quickly as
suggested by Einstein’s model when T → 0. The experimentally observed low
temperature behaviour is more like c_V ∝ T³ (see Fig. 6). The reason for this
discrepancy is the crude approximation that all normal modes have the same
frequency. In fact, long wavelength modes have lower frequencies than short
wavelength modes, so the former are much harder to freeze out than the latter (because the spacing between quantum energy levels, ℏω, is smaller in the
former case). The molar heat capacity does not decrease with temperature as
rapidly as suggested by Einstein’s model because these long wavelength modes
are able to make a signiﬁcant contribution to the heat capacity even at very low
temperatures. A more realistic model of lattice vibrations was developed by the
Dutch physicist Peter Debye in 1912. In the Debye model, the frequencies of
the normal modes of vibration are estimated by treating the solid as an isotropic
continuous medium. This approach is reasonable because the only modes which
really matter at low temperatures are the long wavelength modes: i.e., those
whose wavelengths greatly exceed the interatomic spacing. It is plausible that
these modes are not particularly sensitive to the discrete nature of the solid: i.e.,
the fact that it is made up of atoms rather than being continuous.
Consider a sound wave propagating through an isotropic continuous medium.
The disturbance varies with position vector r and time t like exp[−i (k·r − ω t)], where the wavevector k and the frequency of oscillation ω satisfy the dispersion relation for sound waves in an isotropic medium:

ω = k c_s.  (7.140)
Here, c_s is the speed of sound in the medium. Suppose, for the sake of argument, that the medium is periodic in the x-, y-, and z-directions with periodicity lengths L_x, L_y, and L_z, respectively. In order to maintain periodicity we need
k_x (x + L_x) = k_x x + 2π n_x,  (7.141)
where n_x is an integer. There are analogous constraints on k_y and k_z. It follows that in a periodic medium the components of the wavevector are quantized, and can only take the values

k_x = (2π/L_x) n_x,  (7.142)
k_y = (2π/L_y) n_y,  (7.143)
k_z = (2π/L_z) n_z,  (7.144)
where n_x, n_y, and n_z are all integers. It is assumed that L_x, L_y, and L_z are macroscopic lengths, so the allowed values of the components of the wavevector are very closely spaced. For given values of k_y and k_z, the number of allowed values of k_x which lie in the range k_x to k_x + dk_x is given by
∆n_x = (L_x/2π) dk_x.  (7.145)
It follows that the number of allowed values of k (i.e., the number of allowed modes) when k_x lies in the range k_x to k_x + dk_x, k_y lies in the range k_y to k_y + dk_y, and k_z lies in the range k_z to k_z + dk_z, is
ρ d³k = (L_x dk_x/2π)(L_y dk_y/2π)(L_z dk_z/2π) = [V/(2π)³] dk_x dk_y dk_z,  (7.146)
where V = L_x L_y L_z is the periodicity volume, and d³k ≡ dk_x dk_y dk_z. The quantity
ρ is called the density of modes. Note that this density is independent of k,
and proportional to the periodicity volume. Thus, the density of modes per unit
volume is a constant independent of the magnitude or shape of the periodicity
volume. The density of modes per unit volume when the magnitude of k lies in
the range k to k+dk is given by multiplying the density of modes per unit volume
by the “volume” in kspace of the spherical shell lying between radii k and k+dk.
Thus,
ρ_k dk = 4πk² dk/(2π)³ = (k²/2π²) dk.  (7.147)
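Equation (7.147) can be checked by brute-force enumeration of the allowed wavevectors (7.142)–(7.144) in a thin spherical shell; the cube size and shell radii below are arbitrary illustrative choices.

```python
import math

L = 1.0             # cubic periodicity length (arbitrary units)
V = L**3
k0, dk = 60.0, 6.0  # spherical shell in k-space

step = 2 * math.pi / L           # spacing of the allowed wavevector components
n_max = int((k0 + dk) / step) + 1
count = 0
for nx in range(-n_max, n_max + 1):
    for ny in range(-n_max, n_max + 1):
        for nz in range(-n_max, n_max + 1):
            if k0 <= step * math.sqrt(nx*nx + ny*ny + nz*nz) < k0 + dk:
                count += 1

# Compare with (k^2 / 2 pi^2) V dk, evaluated at the middle of the shell.
predicted = V * (k0 + dk / 2)**2 * dk / (2 * math.pi**2)
print(f"counted {count} modes; formula predicts about {predicted:.0f}")
```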
Consider an isotropic continuous medium of volume V. According to the above relation, the number of normal modes whose frequencies lie between ω and ω + dω (which is equivalent to the number of modes whose k values lie in the range ω/c_s to ω/c_s + dω/c_s) is
σ_c(ω) dω = 3 (k²V/2π²) dk = (3V/2π²c_s³) ω² dω.  (7.148)
The factor of 3 comes from the three possible polarizations of sound waves in solids. For every allowed wavenumber (or frequency) there are two independent transverse modes, where the displacement is perpendicular to the direction of propagation, and one longitudinal mode, where the displacement is parallel to the direction of propagation. Transverse waves are vaguely analogous to electromagnetic waves (these also have two independent polarizations). The longitudinal mode is very similar to the compressional sound wave in gases. Of course, transverse waves cannot propagate in gases, because gases have no resistance to deformation without change of volume.
The Debye approach consists in approximating the actual density of normal modes σ(ω) by the density in a continuous medium σ_c(ω), not only at low frequencies (long wavelengths), where these should be nearly the same, but also at higher frequencies, where they may differ substantially. Suppose that we are dealing with a solid consisting of N atoms. We know that there are only 3N independent normal modes. It follows that we must cut off the density of states above some critical frequency, ω_D say, otherwise we will have too many modes.
Thus, in the Debye approximation the density of normal modes takes the form
σ_D(ω) = σ_c(ω)   for ω < ω_D,
σ_D(ω) = 0        for ω > ω_D.  (7.149)

Here, ω_D is the Debye frequency. This critical frequency is chosen such that the total number of normal modes is 3N, so

∫₀^∞ σ_D(ω) dω = ∫₀^{ω_D} σ_c(ω) dω = 3N.  (7.150)
Substituting Eq. (7.148) into the previous formula yields
(3V/2π²c_s³) ∫₀^{ω_D} ω² dω = (V/2π²c_s³) ω_D³ = 3N.  (7.151)
This implies that
ω_D = c_s (6π²N/V)^{1/3}.  (7.152)
Thus, the Debye frequency depends only on the sound velocity in the solid and the number of atoms per unit volume. The wavelength corresponding to the Debye frequency is 2π c_s/ω_D, which is clearly on the order of the interatomic spacing a ∼ (V/N)^{1/3}. It follows that the cutoff of normal modes whose frequencies exceed the Debye frequency is equivalent to a cutoff of normal modes whose wavelengths are less than the interatomic spacing. Of course, it makes physical sense that such modes should be absent.
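Equation (7.152) gives a quick estimate of the Debye frequency, and ℏω_D/k converts it to a temperature scale. The copper-like sound speed and atomic density below are rough illustrative estimates of mine, not values from the text:

```python
import math

HBAR = 1.055e-34  # reduced Planck constant (J s)
K_B = 1.381e-23   # Boltzmann constant (J/K)

c_s = 2600.0  # effective (Debye-averaged) sound speed (m/s), rough copper-like value
n = 8.5e28    # atoms per cubic metre, rough copper-like value

omega_D = c_s * (6 * math.pi**2 * n)**(1 / 3)
theta_D = HBAR * omega_D / K_B
print(f"omega_D ~ {omega_D:.2e} rad/s, theta_D ~ {theta_D:.0f} K")
```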
Figure 5 compares the actual density of normal modes in diamond with the
density predicted by Debye theory. Not surprisingly, there is not a particularly
strong resemblance between these two curves, since Debye theory is highly idealized. Nevertheless, both curves exhibit sharp cutoffs at high frequencies, and
coincide at low frequencies. Furthermore, the areas under both curves are the
same. As we shall see, this is sufﬁcient to allow Debye theory to correctly account
for the temperature variation of the speciﬁc heat of solids at low temperatures.
We can use the quantum-mechanical expression for the mean energy of a single oscillator, Eq. (7.111), to calculate the mean energy of lattice vibrations in the
Figure 5: The true density of normal modes in diamond compared with the density of normal modes
predicted by Debye theory. From C.B. Walker, Phys. Rev. 103, 547 (1956).
Debye approximation. We obtain
E = ∫₀^∞ σ_D(ω) ℏω [1/2 + 1/(exp(βℏω) − 1)] dω.  (7.153)
According to Eq. (7.135), the molar heat capacity takes the form
c_V = (1/νkT²) ∫₀^∞ σ_D(ω) ℏω [exp(βℏω) ℏω/(exp(βℏω) − 1)²] dω.  (7.154)
Substituting in Eq. (7.149), we ﬁnd that
c_V = (k/ν) ∫₀^{ω_D} [exp(βℏω) (βℏω)²/(exp(βℏω) − 1)²] (3V/2π²c_s³) ω² dω,  (7.155)
giving
c_V = [3Vk/2π²ν(c_s βℏ)³] ∫₀^{βℏω_D} [exp x/(exp x − 1)²] x⁴ dx,  (7.156)
in terms of the dimensionless variable x = βℏω. According to Eq. (7.152), the volume can be written

V = 6π²N (c_s/ω_D)³,  (7.157)
so the heat capacity reduces to
c_V = 3R f_D(βℏω_D) = 3R f_D(θ_D/T),  (7.158)
where the Debye function is deﬁned
f_D(y) ≡ (3/y³) ∫₀^y [exp x/(exp x − 1)²] x⁴ dx.  (7.159)
We have also defined the Debye temperature θ_D as

k θ_D = ℏω_D.  (7.160)
Consider the asymptotic limit in which T ≫ θ_D. For small y, we can approximate exp x as 1 + x in the integrand of Eq. (7.159), so that

f_D(y) → (3/y³) ∫₀^y x² dx = 1.  (7.161)
Thus, if the temperature greatly exceeds the Debye temperature we recover the law of Dulong and Petit that c_V = 3 R. Consider, now, the asymptotic limit in which T ≪ θ_D. For large y,

∫₀^y [exp x/(exp x − 1)²] x⁴ dx ≃ ∫₀^∞ [exp x/(exp x − 1)²] x⁴ dx = 4π⁴/15.  (7.162)
The latter integration is standard (if rather obscure), and can be looked up in any (large) reference book on integration. Thus, in the low temperature limit

f_D(y) → (4π⁴/5) (1/y³).  (7.163)
This yields

c_V ≃ (12π⁴/5) R (T/θ_D)³  (7.164)

in the limit T ≪ θ_D: i.e., c_V varies with temperature like T³.
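The Debye function (7.159), together with both asymptotic limits just derived, can be checked by direct numerical integration. A minimal sketch using the midpoint rule (the function name is my own):

```python
import math

def f_debye(y, n=2000):
    """Evaluate f_D(y) = (3/y^3) * integral_0^y x^4 e^x / (e^x - 1)^2 dx."""
    h = y / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h  # midpoint rule avoids the x = 0 endpoint
        e = math.exp(x)
        total += x**4 * e / (e - 1)**2 * h
    return 3.0 / y**3 * total

print(f"f_D(0.1) = {f_debye(0.1):.4f} (high temperature limit: 1)")
print(f"f_D(50) * 50^3 = {f_debye(50.0) * 50.0**3:.2f} (low temperature limit: "
      f"4 pi^4 / 5 = {4 * math.pi**4 / 5:.2f})")
```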
The fact that c_V goes like T³ at low temperatures is quite well verified experimentally, although it is sometimes necessary to go to temperatures as low as
Solid   θ_D from low temp.   θ_D from sound speed
NaCl    308                  320
KCl     230                  246
Ag      225                  216
Zn      308                  305

Table 5: Comparison of Debye temperatures (in degrees kelvin) obtained from the low temperature behaviour of the heat capacity with those calculated from the sound speed. From C. Kittel, Introduction to solid-state physics, 2nd Ed. (John Wiley & Sons, New York NY, 1956).
0.02 θ_D to obtain this asymptotic behaviour. Theoretically, θ_D should be calculable from Eq. (7.152) in terms of the sound speed in the solid and the molar volume. Table 5 shows a comparison of Debye temperatures evaluated by this means with temperatures obtained empirically by fitting the law (7.164) to the low temperature variation of the heat capacity. It can be seen that there is fairly good agreement between the theoretical and empirical Debye temperatures. This suggests that the Debye theory affords a good, though not perfect, representation of the behaviour of c_V in solids over the entire temperature range.
Figure 6: The molar heat capacity of various solids.
Finally, Fig. 6 shows the actual temperature variation of the molar heat capacities of various solids, as well as that predicted by Debye's theory. The prediction of Einstein's theory is also shown for the sake of comparison. Note that 24.9 joules/mole/degree is about 6 calories/gram-atom/degree (the latter are chemists' units).
7.13 The Maxwell distribution
Consider a molecule of mass m in a gas which is sufficiently dilute for the intermolecular forces to be negligible (i.e., an ideal gas). The energy of the molecule
is written

ε = p²/2m + ε^int,  (7.165)
where p is its momentum vector, and ε^int is its internal (i.e., non-translational) energy. The latter energy is due to molecular rotation, vibration, etc. Translational degrees of freedom can be treated classically to an excellent approximation, whereas internal degrees of freedom usually require a quantum mechanical approach. Classically, the probability of finding the molecule in a given internal
state with a position vector in the range r to r + dr, and a momentum vector in the range p to p + dp, is proportional to the number of cells (of "volume" h₀) contained in the corresponding region of phase-space, weighted by the Boltzmann factor. In fact, since classical phase-space is divided up into uniform cells, the number of cells is just proportional to the "volume" of the region under consideration. This "volume" is written d³r d³p. Thus, the probability of finding the molecule in a given internal state s is
P_s(r, p) d³r d³p ∝ exp(−βp²/2m) exp(−β ε_s^int) d³r d³p,  (7.166)
where P_s is a probability density defined in the usual manner. The probability P(r, p) d³r d³p of finding the molecule in any internal state with position and momentum vectors in the specified range is obtained by summing the above expression over all possible internal states. The sum over exp(−β ε_s^int) just contributes a constant of proportionality (since the internal states do not depend on r or p), so
P(r, p) d³r d³p ∝ exp(−βp²/2m) d³r d³p.  (7.167)
Of course, we can multiply this probability by the total number of molecules N
in order to obtain the mean number of molecules with position and momentum
vectors in the speciﬁed range.
Suppose that we now want to determine f(r, v) d³r d³v: i.e., the mean number of molecules with positions between r and r + dr, and velocities in the range v to v + dv. Since v = p/m, it is easily seen that

f(r, v) d³r d³v = C exp(−βmv²/2) d³r d³v,  (7.168)
where C is a constant of proportionality. This constant can be determined by the condition

∫_(r) ∫_(v) f(r, v) d³r d³v = N:  (7.169)
i.e., the sum over molecules with all possible positions and velocities gives the total number of molecules, N. The integral over the molecular position coordinates just gives the volume V of the gas, since the Boltzmann factor is independent of position. The integration over the velocity coordinates can be reduced to the product of three identical integrals (one for v_x, one for v_y, and one for v_z), so we have

C V [∫_{−∞}^{∞} exp(−βmv_z²/2) dv_z]³ = N.  (7.170)
Now,

∫_{−∞}^{∞} exp(−βmv_z²/2) dv_z = √(2/βm) ∫_{−∞}^{∞} exp(−y²) dy = √(2π/βm),  (7.171)
so C = (N/V)(βm/2π)^{3/2}. Thus, the properly normalized distribution function for molecular velocities is written

f(v) d³r d³v = n (m/2πkT)^{3/2} exp(−mv²/2kT) d³r d³v.  (7.172)
Here, n = N/V is the number density of the molecules. We have omitted the
variable r in the argument of f, since f clearly does not depend on position. In
other words, the distribution of molecular velocities is uniform in space. This is
hardly surprising, since there is nothing to distinguish one region of space from
another in our calculation. The above distribution is called the Maxwell velocity distribution, because it was discovered by James Clerk Maxwell in the middle of the nineteenth century. The average number of molecules per unit volume with velocities in the range v to v + dv is obviously f(v) d³v.
Let us consider the distribution of a given component of velocity: the z-component, say. Suppose that g(v_z) dv_z is the average number of molecules per unit volume with the z-component of velocity in the range v_z to v_z + dv_z, irrespective of the values of their other velocity components. It is fairly obvious that this distribution is obtained from the Maxwell distribution by summing (integrating actually) over all possible values of v_x and v_y, with v_z in the specified range. Thus,

g(v_z) dv_z = ∫_(v_x) ∫_(v_y) f(v) d³v. (7.173)
This gives

g(v_z) dv_z = n (m/2π k T)^{3/2} ∫_(v_x) ∫_(v_y) exp[−(m/2 k T)(v_x² + v_y² + v_z²)] dv_x dv_y dv_z
            = n (m/2π k T)^{3/2} exp(−m v_z²/2 k T) [∫_{−∞}^{∞} exp(−m v_x²/2 k T) dv_x]²
            = n (m/2π k T)^{3/2} exp(−m v_z²/2 k T) [√(2π k T/m)]², (7.174)

or

g(v_z) dv_z = n (m/2π k T)^{1/2} exp(−m v_z²/2 k T) dv_z. (7.175)
Of course, this expression is properly normalized, so that

∫_{−∞}^{∞} g(v_z) dv_z = n. (7.176)

It is clear that each component (since there is nothing special about the z-component) of the velocity is distributed with a Gaussian probability distribution (see Sect. 2), centred on a mean value

⟨v_z⟩ = 0, (7.177)

with variance

⟨v_z²⟩ = k T/m. (7.178)
Equation (7.177) implies that each molecule is just as likely to be moving in the plus z-direction as in the minus z-direction. Equation (7.178) can be rearranged to give

(1/2) m ⟨v_z²⟩ = (1/2) k T, (7.179)

in accordance with the equipartition theorem.
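These one-dimensional results are easy to check numerically. The sketch below works in reduced units with k T/m = 1 (an assumption made purely for illustration), integrates the normalized form of Eq. (7.175) with a simple midpoint rule, and recovers the normalization (7.176), the zero mean (7.177), and the variance (7.178).

```python
import math

kT_over_m = 1.0  # reduced units: k T / m = 1 (illustrative assumption)

def g_over_n(vz):
    """Normalized 1-D Maxwell distribution g(v_z)/n, Eq. (7.175)."""
    return ((1.0 / (2.0 * math.pi * kT_over_m)) ** 0.5
            * math.exp(-vz ** 2 / (2.0 * kT_over_m)))

# Midpoint-rule quadrature on [-10, 10], wide enough for the Gaussian tails.
n_steps = 20000
dv = 20.0 / n_steps
grid = [-10.0 + (i + 0.5) * dv for i in range(n_steps)]

norm = sum(g_over_n(v) * dv for v in grid)          # Eq. (7.176): should be 1
mean = sum(v * g_over_n(v) * dv for v in grid)      # Eq. (7.177): should be 0
var = sum(v ** 2 * g_over_n(v) * dv for v in grid)  # Eq. (7.178): should be kT/m

print(norm, mean, var)
```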
Note that Eq. (7.172) can be rewritten

f(v) d³v / n = [g(v_x) dv_x / n] [g(v_y) dv_y / n] [g(v_z) dv_z / n], (7.180)
where g(v_x) and g(v_y) are defined in an analogous way to g(v_z). Thus, the probability that the velocity lies in the range v to v + dv is just equal to the product of the probabilities that the velocity components lie in their respective ranges. In other words, the individual velocity components act like statistically independent variables.
Suppose that we now want to calculate F(v) dv: i.e., the average number of molecules per unit volume with a speed v = |v| in the range v to v + dv. It is obvious that we can obtain this quantity by adding up all molecules with speeds in this range, irrespective of the direction of their velocities. Thus,

F(v) dv = ∫ f(v) d³v, (7.181)

where the integral extends over all velocities satisfying

v < |v| < v + dv. (7.182)

This inequality is satisfied by a spherical shell of radius v and thickness dv in velocity space. Since f(v) only depends on |v|, so f(v) ≡ f(v), the above integral is just f(v) multiplied by the volume of the spherical shell in velocity space. So,

F(v) dv = 4π f(v) v² dv, (7.183)
which gives

F(v) dv = 4π n (m/2π k T)^{3/2} v² exp(−m v²/2 k T) dv. (7.184)

This is the famous Maxwell distribution of molecular speeds. Of course, it is properly normalized, so that

∫_0^∞ F(v) dv = n. (7.185)
Note that the Maxwell distribution exhibits a maximum at some non-zero value of v. The reason for this is quite simple. As v increases, the Boltzmann factor decreases, but the volume of phase-space available to the molecule (which is proportional to v²) increases: the net result is a distribution with a non-zero maximum.
Figure 7: The Maxwell velocity distribution as a function of molecular speed, in units of the most probable speed (v_mp). Also shown are the mean speed (c̄) and the root mean square speed (v_rms).
The mean molecular speed is given by

v̄ = (1/n) ∫_0^∞ F(v) v dv. (7.186)

Thus, we obtain

v̄ = 4π (m/2π k T)^{3/2} ∫_0^∞ v³ exp(−m v²/2 k T) dv, (7.187)
or

v̄ = 4π (m/2π k T)^{3/2} (2 k T/m)² ∫_0^∞ y³ exp(−y²) dy. (7.188)

Now

∫_0^∞ y³ exp(−y²) dy = 1/2, (7.189)

so

v̄ = √(8 k T/π m). (7.190)
A similar calculation gives

v_rms = √(⟨v²⟩) = √(3 k T/m). (7.191)
However, this result can also be obtained from the equipartition theorem. Since

(1/2) m ⟨v²⟩ = (1/2) m (⟨v_x²⟩ + ⟨v_y²⟩ + ⟨v_z²⟩) = 3 [(1/2) k T], (7.192)
then Eq. (7.191) follows immediately. It is easily demonstrated that the most probable molecular speed (i.e., the maximum of the Maxwell distribution function) is

ṽ = √(2 k T/m). (7.193)
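The three characteristic speeds can be verified against Eq. (7.184) by direct numerical integration. The sketch below again assumes reduced units, k T/m = 1, so the expected values are √(8/π) for the mean speed, √3 for the rms speed, and √2 for the most probable speed.

```python
import math

kT_over_m = 1.0  # reduced units (illustrative assumption)

def F_over_n(v):
    """Normalized Maxwell speed distribution F(v)/n, Eq. (7.184)."""
    a = kT_over_m
    return (4.0 * math.pi * (1.0 / (2.0 * math.pi * a)) ** 1.5
            * v ** 2 * math.exp(-v ** 2 / (2.0 * a)))

n_steps = 120000
dv = 12.0 / n_steps
vs = [(i + 0.5) * dv for i in range(n_steps)]

mean_speed = sum(v * F_over_n(v) * dv for v in vs)                 # Eq. (7.190)
rms_speed = math.sqrt(sum(v ** 2 * F_over_n(v) * dv for v in vs))  # Eq. (7.191)
v_peak = max(vs, key=F_over_n)                                     # Eq. (7.193)

print(mean_speed, rms_speed, v_peak)
```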
The speed of sound in an ideal gas is given by

c_s = √(γ p/ρ), (7.194)

where γ is the ratio of specific heats. This can also be written

c_s = √(γ k T/m), (7.195)

since p = n k T and ρ = n m. It is clear that the various average speeds which we have just calculated are all of order the sound speed (i.e., a few hundred meters per second at room temperature). In ordinary air (γ = 1.4) the sound speed is about 84% of the most probable molecular speed, and about 74% of the mean molecular speed. Since sound waves ultimately propagate via molecular motion, it makes sense that they travel at slightly less than the most probable and mean molecular speeds.
Figure 7 shows the Maxwell velocity distribution as a function of molecular
speed in units of the most probable speed. Also shown are the mean speed and
the root mean square speed.
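Since m and T cancel in the ratios, the two percentages quoted above follow from Eqs. (7.190), (7.193), and (7.195) alone; a two-line check with γ = 1.4 for air:

```python
import math

gamma = 1.4  # ratio of specific heats for air

# m and T cancel in the ratio of Eq. (7.195) to Eqs. (7.193) and (7.190):
# c_s / v_mp = sqrt(gamma/2),  c_s / v_mean = sqrt(gamma*pi/8).
cs_over_vmp = math.sqrt(gamma / 2.0)
cs_over_mean = math.sqrt(gamma * math.pi / 8.0)

print(round(100 * cs_over_vmp), round(100 * cs_over_mean))  # → 84 74
```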
It is difficult to directly verify the Maxwell velocity distribution. However, this distribution can be verified indirectly by measuring the velocity distribution of atoms exiting from a small hole in an oven. The velocity distribution of the escaping atoms is closely related to, but slightly different from, the velocity distribution inside the oven, since high velocity atoms escape more readily than low velocity atoms. In fact, the predicted velocity distribution of the escaping atoms varies like v³ exp(−m v²/2 k T), in contrast to the v² exp(−m v²/2 k T) variation of the velocity distribution inside the oven. Figure 8 compares the measured and theoretically predicted velocity distributions of potassium atoms escaping from an oven at 157°C. There is clearly very good agreement between the two.
Figure 8: Comparison of the measured and theoretically predicted velocity distributions of potassium atoms escaping from an oven at 157°C. Here, the measured transit time is directly proportional to the atomic speed.
8 Quantum statistics
8.1 Introduction
Previously, we investigated the statistical thermodynamics of ideal gases using a
rather ad hoc combination of classical and quantum mechanics (see Sects. 7.6
and 7.7). In fact, we employed classical mechanics to deal with the translational
degrees of freedom of the constituent particles, and quantum mechanics to deal
with the nontranslational degrees of freedom. Let us now discuss ideal gases
from a purely quantum mechanical standpoint. It turns out that this approach
is necessary to deal with either low temperature or high density gases. Furthermore, it also allows us to investigate completely non-classical “gases,” such as
photons or the conduction electrons in a metal.
8.2 Symmetry requirements in quantum mechanics
Consider a gas consisting of N identical, non-interacting, structureless particles enclosed within a container of volume V. Let Q_i denote collectively all the coordinates of the ith particle: i.e., the three Cartesian coordinates which determine its spatial position, as well as the spin coordinate which determines its internal state. Let s_i be an index labeling the possible quantum states of the ith particle: i.e., each possible value of s_i corresponds to a specification of the three momentum components of the particle, as well as the direction of its spin orientation. According to quantum mechanics, the overall state of the system when the ith particle is in state s_i, etc., is completely determined by the complex wavefunction

Ψ_{s_1,…,s_N}(Q_1, Q_2, …, Q_N). (8.1)
In particular, the probability of an observation of the system finding the ith particle with coordinates in the range Q_i to Q_i + dQ_i, etc., is simply

|Ψ_{s_1,…,s_N}(Q_1, Q_2, …, Q_N)|² dQ_1 dQ_2 … dQ_N. (8.2)
One of the fundamental postulates of quantum mechanics is the essential indistinguishability of particles of the same species. What this means, in practice, is
that we cannot label particles of the same species: i.e., a proton is just a proton—
we cannot meaningfully talk of proton number 1 and proton number 2, etc. Note
that no such constraint arises in classical mechanics. Thus, in classical mechanics particles of the same species are regarded as being distinguishable, and can,
therefore, be labelled. Of course, the quantum mechanical approach is the correct
one.
Suppose that we interchange the ith and jth particles: i.e.,

Q_i ↔ Q_j, (8.3)
s_i ↔ s_j. (8.4)
If the particles are truly indistinguishable then nothing has changed: i.e., we have a particle in quantum state s_i and a particle in quantum state s_j both before and after the particles are swapped. Thus, the probability of observing the system in a given state also cannot have changed: i.e.,

|Ψ(… Q_i … Q_j …)|² = |Ψ(… Q_j … Q_i …)|². (8.5)
Here, we have omitted the subscripts s_1, …, s_N for the sake of clarity. Note that we cannot conclude that the wavefunction Ψ is unaffected when the particles are swapped, because Ψ cannot be observed experimentally. Only the probability density |Ψ|² is observable. Equation (8.5) implies that

Ψ(… Q_i … Q_j …) = A Ψ(… Q_j … Q_i …), (8.6)

where A is a complex constant of modulus unity: i.e., |A|² = 1.
Suppose that we interchange the ith and jth particles a second time. Swapping the ith and jth particles twice leaves the system completely unchanged: i.e., it is equivalent to doing nothing to the system. Thus, the wavefunctions before and after this process must be identical. It follows from Eq. (8.6) that

A² = 1. (8.7)

Of course, the only solutions to the above equation are A = ±1.
We conclude, from the above discussion, that the wavefunction Ψ is either completely symmetric under the interchange of particles, or it is completely antisymmetric. In other words, either

Ψ(… Q_i … Q_j …) = +Ψ(… Q_j … Q_i …), (8.8)

or

Ψ(… Q_i … Q_j …) = −Ψ(… Q_j … Q_i …). (8.9)
In 1940 the Nobel prize winning physicist Wolfgang Pauli demonstrated, via arguments involving relativistic invariance, that the wavefunction associated with a collection of identical integer-spin (i.e., spin 0, 1, 2, etc.) particles satisfies Eq. (8.8), whereas the wavefunction associated with a collection of identical half-integer-spin (i.e., spin 1/2, 3/2, 5/2, etc.) particles satisfies Eq. (8.9). The former type of particles are known as bosons [after the Indian physicist S.N. Bose, who first put forward Eq. (8.8) on empirical grounds]. The latter type of particles are called fermions (after the Italian physicist Enrico Fermi, who first studied the properties of fermion gases). Common examples of bosons are photons and He⁴ atoms. Common examples of fermions are protons, neutrons, and electrons.
Consider a gas made up of identical bosons. Equation (8.8) implies that the interchange of any two particles does not lead to a new state of the system. Bosons must, therefore, be considered as genuinely indistinguishable when enumerating the different possible states of the gas. Note that Eq. (8.8) imposes no restriction on how many particles can occupy a given single-particle quantum state s.
Consider a gas made up of identical fermions. Equation (8.9) implies that the interchange of any two particles does not lead to a new physical state of the system (since |Ψ|² is invariant). Hence, fermions must also be considered genuinely indistinguishable when enumerating the different possible states of the gas. Consider the special case where particles i and j lie in the same quantum state. In this case, the act of swapping the two particles is equivalent to leaving the system unchanged, so

Ψ(… Q_i … Q_j …) = Ψ(… Q_j … Q_i …). (8.10)

However, Eq. (8.9) is also applicable, since the two particles are fermions. The only way in which Eqs. (8.9) and (8.10) can be reconciled is if

Ψ = 0 (8.11)
wherever particles i and j lie in the same quantum state. This is another way
of saying that it is impossible for any two particles in a gas of fermions to lie in
the same singleparticle quantum state. This proposition is known as the Pauli
exclusion principle, since it was ﬁrst proposed by W. Pauli in 1924 on empirical
grounds.
Consider, for the sake of comparison, a gas made up of identical classical
particles. In this case, the particles must be considered distinguishable when
enumerating the different possible states of the gas. Furthermore, there are no
constraints on how many particles can occupy a given quantum state.
According to the above discussion, there are three different sets of rules which can be used to enumerate the states of a gas made up of identical particles. For a boson gas, the particles must be treated as being indistinguishable, and there is no limit to how many particles can occupy a given quantum state. This set of rules is called Bose-Einstein statistics, after S.N. Bose and A. Einstein, who first developed them. For a fermion gas, the particles must be treated as being indistinguishable, and there can never be more than one particle in any given quantum state. This set of rules is called Fermi-Dirac statistics, after E. Fermi and P.A.M. Dirac, who first developed them. Finally, for a classical gas, the particles must be treated as being distinguishable, and there is no limit to how many particles can occupy a given quantum state. This set of rules is called Maxwell-Boltzmann statistics, after J.C. Maxwell and L. Boltzmann, who first developed them.
8.3 An illustrative example
Consider a very simple gas made up of two identical particles. Suppose that each particle can be in one of three possible quantum states, s = 1, 2, 3. Let us enumerate the possible states of the whole gas according to Maxwell-Boltzmann, Bose-Einstein, and Fermi-Dirac statistics, respectively.
For the case of Maxwell-Boltzmann (MB) statistics, the two particles are considered to be distinguishable. Let us denote them A and B. Furthermore, any number of particles can occupy the same quantum state. The possible different states of the gas are shown in Tab. 6. There are clearly 9 distinct states.

  1    2    3
 AB
      AB
           AB
  A    B
  B    A
  A         B
  B         A
       A    B
       B    A

Table 6: Two particles distributed amongst three states according to Maxwell-Boltzmann statistics.
For the case of Bose-Einstein (BE) statistics, the two particles are considered to be indistinguishable. Let us denote them both as A. Furthermore, any number of particles can occupy the same quantum state. The possible different states of the gas are shown in Tab. 7. There are clearly 6 distinct states.

  1    2    3
 AA
      AA
           AA
  A    A
  A         A
       A    A

Table 7: Two particles distributed amongst three states according to Bose-Einstein statistics.
Finally, for the case of Fermi-Dirac (FD) statistics, the two particles are considered to be indistinguishable. Let us again denote them both as A. Furthermore, no more than one particle can occupy a given quantum state. The possible different states of the gas are shown in Tab. 8. There are clearly only 3 distinct states.

  1    2    3
  A    A
  A         A
       A    A

Table 8: Two particles distributed amongst three states according to Fermi-Dirac statistics.

It follows, from the above example, that Fermi-Dirac statistics are more restrictive (i.e., there are fewer possible states of the system) than Bose-Einstein statistics,
which are, in turn, more restrictive than Maxwell-Boltzmann statistics. Let

ξ ≡ (probability that the two particles are found in the same state) / (probability that the two particles are found in different states). (8.12)

For the case under investigation,

ξ_MB = 1/2, (8.13)
ξ_BE = 1, (8.14)
ξ_FD = 0. (8.15)

We conclude that in Bose-Einstein statistics there is a greater relative tendency for particles to cluster in the same state than in classical statistics. On the other hand, in Fermi-Dirac statistics there is less tendency for particles to cluster in the same state than in classical statistics.
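The three enumerations above can be reproduced mechanically. The sketch below builds the state lists for two particles and three single-particle states under each set of rules, and evaluates the ratio ξ of Eq. (8.12) by counting:

```python
from itertools import product
from fractions import Fraction

states = (1, 2, 3)

# Maxwell-Boltzmann: distinguishable particles (A, B) -> ordered pairs.
mb = list(product(states, repeat=2))

# Bose-Einstein: indistinguishable -> unordered pairs, double occupancy allowed.
be = [(a, b) for (a, b) in mb if a <= b]

# Fermi-Dirac: indistinguishable, at most one particle per state.
fd = [(a, b) for (a, b) in be if a != b]

def xi(gas):
    """Eq. (8.12): P(same state) / P(different states), states equally likely."""
    same = sum(1 for a, b in gas if a == b)
    return Fraction(same, len(gas) - same)

print(len(mb), len(be), len(fd))  # → 9 6 3 (Tabs. 6, 7, 8)
print(xi(mb), xi(be), xi(fd))     # → 1/2 1 0
```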
8.4 Formulation of the statistical problem
Consider a gas consisting of N identical non-interacting particles occupying volume V and in thermal equilibrium at temperature T. Let us label the possible quantum states of a single particle by r (or s). Let the energy of a particle in state r be denoted ε_r. Let the number of particles in state r be written n_r. Finally, let us label the possible quantum states of the whole gas by R.

The particles are assumed to be non-interacting, so the total energy of the gas in state R, where there are n_r particles in quantum state r, etc., is simply

E_R = Σ_r n_r ε_r, (8.16)
where the sum extends over all possible quantum states r. Furthermore, since the total number of particles in the gas is known to be N, we must have

N = Σ_r n_r. (8.17)
In order to calculate the thermodynamic properties of the gas (i.e., its internal energy or its entropy), it is necessary to calculate its partition function,

Z = Σ_R e^{−β E_R} = Σ_R e^{−β(n_1 ε_1 + n_2 ε_2 + …)}. (8.18)

Here, the sum is over all possible states R of the whole gas: i.e., over all the various possible values of the numbers n_1, n_2, ….
Now, exp[−β(n_1 ε_1 + n_2 ε_2 + …)] is the relative probability of finding the gas in a particular state in which there are n_1 particles in state 1, n_2 particles in state 2, etc. Thus, the mean number of particles in quantum state s can be written

n̄_s = Σ_R n_s exp[−β(n_1 ε_1 + n_2 ε_2 + …)] / Σ_R exp[−β(n_1 ε_1 + n_2 ε_2 + …)]. (8.19)
A comparison of Eqs. (8.18) and (8.19) yields the result

n̄_s = −(1/β) ∂lnZ/∂ε_s. (8.20)

Here, β ≡ 1/k T.
8.5 Fermi-Dirac statistics

Let us, first of all, consider Fermi-Dirac statistics. According to Eq. (8.19), the average number of particles in quantum state s can be written

n̄_s = [Σ_{n_s} n_s e^{−β n_s ε_s} Σ^{(s)}_{n_1,n_2,…} e^{−β(n_1 ε_1 + n_2 ε_2 + …)}] / [Σ_{n_s} e^{−β n_s ε_s} Σ^{(s)}_{n_1,n_2,…} e^{−β(n_1 ε_1 + n_2 ε_2 + …)}]. (8.21)
Here, we have rearranged the order of summation, using the multiplicative properties of the exponential function. Note that the first sums in the numerator and denominator only involve n_s, whereas the last sums omit the particular state s from consideration (this is indicated by the superscript s on the summation symbol). Of course, the sums in the above expression range over all values of the numbers n_1, n_2, … such that n_r = 0 and 1 for each r, subject to the overall constraint that

Σ_r n_r = N. (8.22)
Let us introduce the function

Z_s(N) = Σ^{(s)}_{n_1,n_2,…} e^{−β(n_1 ε_1 + n_2 ε_2 + …)}, (8.23)
which is defined as the partition function for N particles distributed over all quantum states, excluding state s, according to Fermi-Dirac statistics. By explicitly performing the sum over n_s = 0 and 1, the expression (8.21) reduces to

n̄_s = [0 + e^{−β ε_s} Z_s(N−1)] / [Z_s(N) + e^{−β ε_s} Z_s(N−1)], (8.24)

which yields

n̄_s = 1 / {[Z_s(N)/Z_s(N−1)] e^{β ε_s} + 1}. (8.25)
In order to make further progress, we must somehow relate Z_s(N−1) to Z_s(N). Suppose that ∆N ≪ N. It follows that lnZ_s(N−∆N) can be Taylor expanded to give

lnZ_s(N−∆N) ≃ lnZ_s(N) − (∂lnZ_s/∂N) ∆N = lnZ_s(N) − α_s ∆N, (8.26)

where

α_s ≡ ∂lnZ_s/∂N. (8.27)

As always, we Taylor expand the slowly varying function lnZ_s(N), rather than the rapidly varying function Z_s(N), because the radius of convergence of the latter Taylor series is too small for the series to be of any practical use. Equation (8.26) can be rearranged to give

Z_s(N−∆N) = Z_s(N) e^{−α_s ∆N}. (8.28)
Now, since Z_s(N) is a sum over very many different quantum states, we would not expect the logarithm of this function to be sensitive to which particular state s is excluded from consideration. Let us, therefore, introduce the approximation that α_s is independent of s, so that we can write

α_s ≃ α (8.29)

for all s. It follows that the derivative (8.27) can be expressed approximately in terms of the derivative of the full partition function Z(N) (in which the N particles are distributed over all quantum states). In fact,

α ≃ ∂lnZ/∂N. (8.30)
Making use of Eq. (8.28), with ∆N = 1, plus the approximation (8.29), the expression (8.25) reduces to

n̄_s = 1/(e^{α + β ε_s} + 1). (8.31)

This is called the Fermi-Dirac distribution. The parameter α is determined by the constraint that Σ_r n̄_r = N: i.e.,

Σ_r 1/(e^{α + β ε_r} + 1) = N. (8.32)
Note that n̄_s → 0 if ε_s becomes sufficiently large. On the other hand, since the denominator in Eq. (8.31) can never become less than unity, no matter how small ε_s becomes, it follows that n̄_s ≤ 1. Thus,

0 ≤ n̄_s ≤ 1, (8.33)

in accordance with the Pauli exclusion principle.
in accordance with the Pauli exclusion principle.
Equations (8.20) and (8.30) can be integrated to give

lnZ = α N + Σ_r ln(1 + e^{−α − β ε_r}), (8.34)

where use has been made of Eq. (8.31).
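In practice, α must be found numerically from the constraint (8.32). The sketch below does this by bisection for a small, entirely hypothetical set of energy levels (five equally spaced levels, β = 1, N = 2), and confirms that the resulting occupation numbers respect Eq. (8.33):

```python
import math

beta = 1.0
levels = [0.0, 0.5, 1.0, 1.5, 2.0]  # hypothetical single-particle energies
N = 2.0                              # total number of particles

def n_bar(alpha, eps):
    """Fermi-Dirac occupation number, Eq. (8.31)."""
    return 1.0 / (math.exp(alpha + beta * eps) + 1.0)

def total(alpha):
    return sum(n_bar(alpha, e) for e in levels)

# total(alpha) falls monotonically from 5 to 0 as alpha grows, so the
# constraint Eq. (8.32) can be solved for alpha by bisection.
lo, hi = -50.0, 50.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if total(mid) > N:
        lo = mid
    else:
        hi = mid
alpha = 0.5 * (lo + hi)

occupations = [n_bar(alpha, e) for e in levels]
print(alpha, occupations)
```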
8.6 Photon statistics
Up to now, we have assumed that the number of particles N contained in a given system is a fixed number. This is a reasonable assumption if the particles possess non-zero mass, since we are not generally considering relativistic systems in this course. However, this assumption breaks down for the case of photons, which are zero-mass bosons. In fact, photons enclosed in a container of volume V, maintained at temperature T, can readily be absorbed or emitted by the walls. Thus, for the special case of a gas of photons there is no requirement which limits the total number of particles.

It follows, from the above discussion, that photons obey a simplified form of Bose-Einstein statistics in which there is an unspecified total number of particles. This type of statistics is called photon statistics.
Consider the expression (8.21). For the case of photons, the numbers n_1, n_2, … assume all values n_r = 0, 1, 2, … for each r, without any further restriction. It follows that the sums Σ^{(s)} in the numerator and denominator are identical and, therefore, cancel. Hence, Eq. (8.21) reduces to

n̄_s = Σ_{n_s} n_s e^{−β n_s ε_s} / Σ_{n_s} e^{−β n_s ε_s}. (8.35)
However, the above expression can be rewritten

n̄_s = −(1/β) ∂/∂ε_s [ln Σ_{n_s} e^{−β n_s ε_s}]. (8.36)
Now, the sum on the right-hand side of the above equation is an infinite geometric series, which can easily be evaluated. In fact,

Σ_{n_s=0}^{∞} e^{−β n_s ε_s} = 1 + e^{−β ε_s} + e^{−2β ε_s} + … = 1/(1 − e^{−β ε_s}). (8.37)
Thus, Eq. (8.36) gives

n̄_s = (1/β) ∂/∂ε_s [ln(1 − e^{−β ε_s})] = e^{−β ε_s}/(1 − e^{−β ε_s}), (8.38)
or

n̄_s = 1/(e^{β ε_s} − 1). (8.39)

This is known as the Planck distribution, after the German physicist Max Planck who first proposed it in 1900 on purely empirical grounds.
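The geometric-series evaluation leading to Eq. (8.39) can be checked directly by truncating the sums of Eq. (8.35) at a large cutoff (the value β ε_s = 0.7 below is an arbitrary illustration):

```python
import math

beta_eps = 0.7  # an arbitrary value of beta * epsilon_s, for illustration only

# Truncate the sums of Eq. (8.35); the terms fall off geometrically,
# so 200 of them is far more than enough.
weights = [math.exp(-beta_eps * n) for n in range(200)]
n_bar_direct = sum(n * w for n, w in enumerate(weights)) / sum(weights)

# Closed form, Eq. (8.39).
n_bar_planck = 1.0 / (math.exp(beta_eps) - 1.0)

print(n_bar_direct, n_bar_planck)  # the two agree to machine precision
```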
Equation (8.20) can be integrated to give

lnZ = −Σ_r ln(1 − e^{−β ε_r}), (8.40)

where use has been made of Eq. (8.39).
8.7 Bose-Einstein statistics

Let us now consider Bose-Einstein statistics. The particles in the system are assumed to be massive, so the total number of particles N is a fixed number.

Consider the expression (8.21). For the case of massive bosons, the numbers n_1, n_2, … assume all values n_r = 0, 1, 2, … for each r, subject to the constraint that Σ_r n_r = N. Performing explicitly the sum over n_s, this expression reduces to
n̄_s = [0 + e^{−β ε_s} Z_s(N−1) + 2 e^{−2β ε_s} Z_s(N−2) + …] / [Z_s(N) + e^{−β ε_s} Z_s(N−1) + e^{−2β ε_s} Z_s(N−2) + …], (8.41)

where Z_s(N) is the partition function for N particles distributed over all quantum states, excluding state s, according to Bose-Einstein statistics [cf., Eq. (8.23)].
Using Eq. (8.28), and the approximation (8.29), the above equation reduces to

n̄_s = Σ_{n_s} n_s e^{−n_s(α + β ε_s)} / Σ_{n_s} e^{−n_s(α + β ε_s)}. (8.42)
Note that this expression is identical to (8.35), except that β ε_s is replaced by α + β ε_s. Hence, an analogous calculation to that outlined in the previous subsection yields

n̄_s = 1/(e^{α + β ε_s} − 1). (8.43)
This is called the Bose-Einstein distribution. Note that n̄_s can become very large in this distribution. The parameter α is again determined by the constraint on the total number of particles: i.e.,

Σ_r 1/(e^{α + β ε_r} − 1) = N. (8.44)
Equations (8.20) and (8.30) can be integrated to give

lnZ = α N − Σ_r ln(1 − e^{−α − β ε_r}), (8.45)

where use has been made of Eq. (8.43).

Note that photon statistics correspond to the special case of Bose-Einstein statistics in which the parameter α takes the value zero, and the constraint (8.44) does not apply.
8.8 Maxwell-Boltzmann statistics

For the purpose of comparison, it is instructive to consider the purely classical case of Maxwell-Boltzmann statistics. The partition function is written

Z = Σ_R e^{−β(n_1 ε_1 + n_2 ε_2 + …)}, (8.46)
where the sum is over all distinct states R of the gas, and the particles are treated as distinguishable. For given values of n_1, n_2, … there are

N!/(n_1! n_2! …) (8.47)

possible ways in which N distinguishable particles can be put into individual quantum states such that there are n_1 particles in state 1, n_2 particles in state 2, etc. Each of these possible arrangements corresponds to a distinct state for the whole gas. Hence, Eq. (8.46) can be written
Z = Σ_{n_1,n_2,…} [N!/(n_1! n_2! …)] e^{−β(n_1 ε_1 + n_2 ε_2 + …)}, (8.48)
where the sum is over all values of n_r = 0, 1, 2, … for each r, subject to the constraint that

Σ_r n_r = N. (8.49)
Now, Eq. (8.48) can be written

Z = Σ_{n_1,n_2,…} [N!/(n_1! n_2! …)] (e^{−β ε_1})^{n_1} (e^{−β ε_2})^{n_2} …, (8.50)
which, by virtue of Eq. (8.49), is just the result of expanding a polynomial. In fact,

Z = (e^{−β ε_1} + e^{−β ε_2} + …)^N, (8.51)

or

lnZ = N ln(Σ_r e^{−β ε_r}). (8.52)

Note that the argument of the logarithm is simply the partition function for a single particle.
Equations (8.20) and (8.52) can be combined to give

n̄_s = N e^{−β ε_s} / Σ_r e^{−β ε_r}. (8.53)

This is known as the Maxwell-Boltzmann distribution. It is, of course, just the result obtained by applying the Boltzmann distribution to a single particle (see Sect. 7).
8.9 Quantum statistics in the classical limit
The preceding analysis regarding the quantum statistics of ideal gases is summarized in the following statements. The mean number of particles occupying quantum state s is given by

n̄_s = 1/(e^{α + β ε_s} ± 1), (8.54)

where the upper sign corresponds to Fermi-Dirac statistics and the lower sign corresponds to Bose-Einstein statistics. The parameter α is determined via

Σ_r n̄_r = Σ_r 1/(e^{α + β ε_r} ± 1) = N. (8.55)

Finally, the partition function of the gas is given by

lnZ = α N ± Σ_r ln(1 ± e^{−α − β ε_r}). (8.56)
Let us investigate the magnitude of α in some important limiting cases. Consider, first of all, the case of a gas at a given temperature when its concentration is made sufficiently low: i.e., when N is made sufficiently small. The relation (8.55) can only be satisfied if each term in the sum over states is made sufficiently small; i.e., if n̄_r ≪ 1 or exp(α + β ε_r) ≫ 1 for all states r.

Consider, next, the case of a gas made up of a fixed number of particles when its temperature is made sufficiently large: i.e., when β is made sufficiently small. In the sum in Eq. (8.55), the terms of appreciable magnitude are those for which β ε_r ≪ α. Thus, it follows that as β → 0 an increasing number of terms with large values of ε_r contribute substantially to this sum. In order to prevent the sum from exceeding N, the parameter α must become large enough that each term is made sufficiently small: i.e., it is again necessary that n̄_r ≪ 1 or exp(α + β ε_r) ≫ 1 for all states r.
The above discussion suggests that if the concentration of an ideal gas is made sufficiently low, or the temperature is made sufficiently high, then α must become so large that

e^{α + β ε_r} ≫ 1 (8.57)

for all r. Equivalently, this means that the number of particles occupying each quantum state must become so small that

n̄_r ≪ 1 (8.58)

for all r. It is conventional to refer to the limit of sufficiently low concentration, or sufficiently high temperature, in which Eqs. (8.57) and (8.58) are satisfied, as the classical limit.
According to Eqs. (8.54) and (8.57), both the Fermi-Dirac and Bose-Einstein distributions reduce to

n̄_s = e^{−α − β ε_s} (8.59)

in the classical limit, whereas the constraint (8.55) yields

Σ_r e^{−α − β ε_r} = N. (8.60)

The above expressions can be combined to give

n̄_s = N e^{−β ε_s} / Σ_r e^{−β ε_r}. (8.61)

It follows that in the classical limit of sufficiently low density, or sufficiently high temperature, the quantum distribution functions, whether Fermi-Dirac or Bose-Einstein, reduce to the Maxwell-Boltzmann distribution. It is easily demonstrated that the physical criterion for the validity of the classical approximation is that the mean separation between particles should be much greater than their mean de Broglie wavelengths.
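This reduction is easy to see numerically: once exp(α) is large, the ±1 in the denominator of Eq. (8.54) is negligible. The sketch below uses an arbitrary set of levels and α = 10:

```python
import math

beta = 1.0
levels = [0.0, 0.5, 1.0, 1.5, 2.0]  # arbitrary single-particle energies
alpha = 10.0                         # large alpha: the classical limit, Eq. (8.57)

fd = [1.0 / (math.exp(alpha + beta * e) + 1.0) for e in levels]  # Eq. (8.54), upper sign
be = [1.0 / (math.exp(alpha + beta * e) - 1.0) for e in levels]  # Eq. (8.54), lower sign
mb = [math.exp(-alpha - beta * e) for e in levels]               # Eq. (8.59)

for f, b, m in zip(fd, be, mb):
    print(f / m, b / m)  # both ratios lie within ~5e-5 of unity
```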
Let us now consider the behaviour of the partition function (8.56) in the classical limit. We can expand the logarithm to give

lnZ = α N ± Σ_r (± e^{−α − β ε_r}) = α N + N. (8.62)

However, according to Eq. (8.60),

α = −lnN + ln(Σ_r e^{−β ε_r}). (8.63)

It follows that

lnZ = −N lnN + N + N ln(Σ_r e^{−β ε_r}). (8.64)
Note that this does not equal the partition function Z_MB computed in Eq. (8.52) from Maxwell-Boltzmann statistics: i.e.,

lnZ_MB = N ln(Σ_r e^{−β ε_r}). (8.65)
In fact,

lnZ = lnZ_MB − lnN!, (8.66)

or

Z = Z_MB/N!, (8.67)

where use has been made of Stirling's approximation (lnN! ≃ N lnN − N), since N is large. Here, the factor N! simply corresponds to the number of different permutations of the N particles: permutations which are physically meaningless when the particles are identical. Recall that we had to introduce precisely this factor, in an ad hoc fashion, in Sect. 7.7 in order to avoid the non-physical consequences of the Gibbs paradox. Clearly, there is no Gibbs paradox when an ideal gas is treated properly via quantum mechanics.
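The accuracy of Stirling's approximation improves rapidly with N, which is why Eq. (8.66) is safe for macroscopic particle numbers. A quick check of the relative error of ln N! ≃ N ln N − N:

```python
import math

# Relative error of Stirling's approximation ln N! ≃ N ln N − N.
errors = {}
for N in (10, 100, 1000):
    exact = math.lgamma(N + 1)       # ln N!
    approx = N * math.log(N) - N
    errors[N] = (exact - approx) / exact
    print(N, errors[N])
```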
In the classical limit, a full quantum mechanical analysis of an ideal gas reproduces the results obtained in Sects. 7.6 and 7.7, except that the arbitrary parameter h_0 is replaced by Planck's constant h = 6.63 × 10⁻³⁴ J s.
A gas in the classical limit, where the typical de Broglie wavelength of the constituent particles is much smaller than the typical inter-particle spacing, is said to be non-degenerate. In the opposite limit, where the concentration and temperature are such that the typical de Broglie wavelength becomes comparable with the typical inter-particle spacing, and the actual Fermi-Dirac or Bose-Einstein distributions must be employed, the gas is said to be degenerate.
8.10 The Planck radiation law
Let us now consider the application of statistical thermodynamics to electromagnetic radiation. According to Maxwell's theory, an electromagnetic wave is a coupled self-sustaining oscillation of electric and magnetic fields which propagates through a vacuum at the speed of light, c = 3 × 10⁸ m s⁻¹. The electric component of the wave can be written

E = E_0 exp[ i (k·r − ω t)], (8.68)
where E_0 is a constant, k is the wavevector which determines the wavelength and direction of propagation of the wave, and ω is the frequency. The dispersion relation

ω = k c (8.69)

ensures that the wave propagates at the speed of light. Note that this dispersion relation is very similar to that of sound waves in solids [see Eq. (7.140)]. Electromagnetic waves always propagate in the direction perpendicular to the coupled electric and magnetic fields (i.e., electromagnetic waves are transverse waves). This means that k·E_0 = 0. Thus, once k is specified, there are only two possible independent directions for the electric field. These correspond to the two independent polarizations of electromagnetic waves.
Consider an enclosure whose walls are maintained at fixed temperature T. What is the nature of the steady-state electromagnetic radiation inside the enclosure? Suppose that the enclosure is a parallelepiped with sides of lengths L_x, L_y, and L_z. Alternatively, suppose that the radiation field inside the enclosure is periodic in the x-, y-, and z-directions, with periodicity lengths L_x, L_y, and L_z, respectively. As long as the smallest of these lengths, L, say, is much greater than the longest wavelength of interest in the problem, λ = 2π/k, then these assumptions should not significantly affect the nature of the radiation inside the enclosure. We find, just as in our earlier discussion of sound waves (see Sect. 7.12), that the periodicity constraints ensure that there is only a discrete set of allowed wavevectors (i.e., a discrete set of allowed modes of oscillation of the electromagnetic field inside the enclosure). Let ρ(k) d³k be the number of allowed modes per unit volume with wavevectors in the range k to k + dk. We know, by analogy with Eq. (7.146), that

ρ(k) d³k = d³k/(2π)³.  (8.70)
The number of modes per unit volume for which the magnitude of the wavevector lies in the range k to k + dk is just the density of modes, ρ(k), multiplied by the "volume" in k-space of the spherical shell lying between radii k and k + dk. Thus,

ρ_k(k) dk = 4π k² dk/(2π)³ = [k²/(2π²)] dk.  (8.71)
Finally, the number of modes per unit volume whose frequencies lie between ω and ω + dω is, by Eq. (8.69),

σ(ω) dω = 2 ω²/(2π² c³) dω.  (8.72)

Here, the additional factor 2 is to take account of the two independent polarizations of the electromagnetic field for a given wavevector k.
Let us consider the situation classically. By analogy with sound waves, we can treat each allowable mode of oscillation of the electromagnetic field as an independent harmonic oscillator. According to the equipartition theorem (see Sect. 7.8), each mode possesses a mean energy k T in thermal equilibrium at temperature T. In fact, (1/2) k T resides with the oscillating electric field, and another (1/2) k T with the oscillating magnetic field. Thus, the classical energy density of electromagnetic radiation (i.e., the energy per unit volume associated with modes whose frequencies lie in the range ω to ω + dω) is

u(ω) dω = k T σ(ω) dω = [k T/(π² c³)] ω² dω.  (8.73)

This result is known as the Rayleigh-Jeans radiation law, after Lord Rayleigh and James Jeans who first proposed it in the late nineteenth century.
According to Debye theory (see Sect. 7.12), the energy density of sound waves in a solid is analogous to the Rayleigh-Jeans law, with one very important difference. In Debye theory there is a cutoff frequency (the Debye frequency) above which no modes exist. This cutoff comes about because of the discrete nature of solids (i.e., because solids are made up of atoms instead of being continuous). It is, of course, impossible to have sound waves whose wavelengths are much less than the interatomic spacing. On the other hand, electromagnetic waves propagate through a vacuum, which possesses no discrete structure. It follows that there is no cutoff frequency for electromagnetic waves, and so the Rayleigh-Jeans law holds for all frequencies. This immediately poses a severe problem.
The total classical energy density of electromagnetic radiation is given by

U = ∫₀^∞ u(ω) dω = [k T/(π² c³)] ∫₀^∞ ω² dω.  (8.74)
This is an integral which obviously does not converge. Thus, according to classical
physics, the total energy density of electromagnetic radiation inside an enclosed
cavity is inﬁnite! This is clearly an absurd result, and was recognized as such
in the latter half of the nineteenth century. In fact, this prediction is known
as the ultraviolet catastrophe, because the Rayleigh-Jeans law usually starts to
diverge badly from experimental observations (by overestimating the amount of
radiation) in the ultraviolet region of the spectrum.
So, how do we obtain a sensible answer? Well, as usual, quantum mechanics comes to our rescue. According to quantum mechanics, each allowable mode of oscillation of the electromagnetic field corresponds to a photon state with energy and momentum

ε = ħω,  (8.75)
p = ħk,  (8.76)

respectively. Incidentally, it follows from Eq. (8.69) that

ε = p c,  (8.77)

which implies that photons are massless particles which move at the speed of light. According to the Planck distribution (8.39), the mean number of photons occupying a photon state of frequency ω is

n(ω) = 1/[exp(βħω) − 1].  (8.78)

Hence, the mean energy of such a state is given by

ε̄(ω) = ħω n(ω) = ħω/[exp(βħω) − 1].  (8.79)

Note that low frequency states (i.e., ħω ≪ k T) behave classically: i.e.,

ε̄ ≃ k T.  (8.80)

On the other hand, high frequency states (i.e., ħω ≫ k T) are completely "frozen out": i.e.,

ε̄ ≪ k T.  (8.81)
The reason for this is simply that it is very difficult for a thermal fluctuation to create a photon with an energy greatly in excess of k T, since k T is the characteristic energy associated with such fluctuations.
According to the above discussion, the true energy density of electromagnetic radiation inside an enclosed cavity is written

u(ω) dω = ε̄(ω) σ(ω) dω,  (8.82)

giving

u(ω) dω = [ħ/(π² c³)] ω³ dω/[exp(βħω) − 1].  (8.83)

This famous result is known as the Planck radiation law. The Planck law approximates to the classical Rayleigh-Jeans law for ħω ≪ k T, peaks at about ħω ≃ 3 k T, and falls off exponentially for ħω ≫ k T. The exponential fall-off at high frequencies ensures that the total energy density remains finite.
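These limiting behaviours can be verified numerically. In terms of the dimensionless variable x = βħω, the ratio of the Planck to the Rayleigh-Jeans energy density is x/(eˣ − 1), and the Planck spectral shape is proportional to x³/(eˣ − 1). The following sketch checks the classical limit, the high-frequency freeze-out, and the location of the peak:

```python
import math

def ratio(x):
    """u_Planck / u_RayleighJeans as a function of x = beta*hbar*omega."""
    return x / math.expm1(x)

def planck_shape(x):
    """Dimensionless Planck spectral shape x^3 / (e^x - 1)."""
    return x**3 / math.expm1(x)

print(ratio(0.01))   # ~1 : classical (Rayleigh-Jeans) limit
print(ratio(10.0))   # ~0 : high-frequency modes are frozen out

# Locate the peak of the Planck spectrum by a simple grid search.
xs = [i * 0.001 for i in range(1, 10000)]
x_peak = max(xs, key=planck_shape)
print(x_peak)        # close to 3 (more precisely ~2.82)
```

The peak found numerically sits at x ≈ 2.82, consistent with the rough statement ħω ≃ 3 k T in the text.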
8.11 Blackbody radiation
Suppose that we were to make a small hole in the wall of our enclosure, and observe the emitted radiation. A small hole is the best approximation in Physics to a blackbody, which is defined as an object which absorbs, and, therefore, emits, radiation perfectly at all wavelengths. What is the power radiated by the hole? Well, the power density inside the enclosure can be written

u(ω) dω = ħω n(ω) dω,  (8.84)

where n(ω) is the mean number of photons per unit volume whose frequencies lie in the range ω to ω + dω. The radiation field inside the enclosure is isotropic (we are assuming that the hole is sufficiently small that it does not distort the field). It follows that the mean number of photons per unit volume whose frequencies lie in the specified range, and whose directions of propagation make an angle in the range θ to θ + dθ with the normal to the hole, is

n(ω, θ) dω dθ = (1/2) n(ω) dω sinθ dθ,  (8.85)
where sinθ is proportional to the solid angle in the specified range of directions, and

∫₀^π n(ω, θ) dω dθ = n(ω) dω.  (8.86)

Photons travel at the velocity of light, so the power per unit area escaping from the hole in the frequency range ω to ω + dω is

P(ω) dω = ∫₀^{π/2} c cosθ ħω n(ω, θ) dω dθ,  (8.87)

where c cosθ is the component of the photon velocity in the direction of the hole. This gives

P(ω) dω = c u(ω) dω (1/2) ∫₀^{π/2} cosθ sinθ dθ = (c/4) u(ω) dω,  (8.88)
so

P(ω) dω = [ħ/(4π² c²)] ω³ dω/[exp(βħω) − 1]  (8.89)

is the power per unit area radiated by a blackbody in the frequency range ω to ω + dω.
A blackbody is very much an idealization. The power spectra of real radiating bodies can deviate quite substantially from blackbody spectra. Nevertheless, we can make some useful predictions using this model. The blackbody power spectrum peaks when ħω ≃ 3 k T. This means that the peak radiation frequency scales linearly with the temperature of the body. In other words, hot bodies tend to radiate at higher frequencies than cold bodies. This result (in particular, the linear scaling) is known as Wien's displacement law. It allows us to estimate the surface temperatures of stars from their colours (surprisingly enough, stars are fairly good blackbodies). Table 9 shows some stellar temperatures determined by this method (in fact, the whole emission spectrum is fitted to a blackbody spectrum). It can be seen that the apparent colours (which correspond quite well to the colours of the peak radiation) scan the whole visible spectrum, from red to blue, as the stellar surface temperatures gradually rise.
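The linear scaling can be illustrated with the stellar temperatures quoted in Table 9. The sketch below takes the peak condition ħω ≃ 3 k T literally (a rough rule, as noted above) and evaluates the peak radiation frequency for a few stars:

```python
hbar = 1.055e-34  # reduced Planck constant (J s)
k = 1.38e-23      # Boltzmann constant (J/K)

def omega_peak(T):
    """Peak radiation frequency (rad/s), using hbar*omega ~ 3 k T."""
    return 3.0 * k * T / hbar

# Surface temperatures (K) from Table 9.
stars = {"Antares": 3300, "Sun": 5770, "Rigel": 11200}
for name, T in stars.items():
    print(name, omega_peak(T))

# Wien's displacement law: peak frequency is proportional to T.
print(omega_peak(11200) / omega_peak(3300))  # = 11200/3300, about 3.4
```

Hotter stars peak at proportionally higher frequencies, which is why the apparent colours in Table 9 run from red to bluish-white as the surface temperature rises.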
Name       Constellation   Spectral Type   Surf. Temp. (◦K)   Colour
Antares    Scorpio         M               3300               Very Red
Aldebaran  Taurus          K               3800               Reddish
Sun                        G               5770               Yellow
Procyon    Canis Minor     F               6570               Yellowish
Sirius     Canis Major     A               9250               White
Rigel      Orion           B               11,200             Bluish White

Table 9: Physical properties of some well-known stars

Probably the most famous blackbody spectrum is cosmological in origin. Just after the "big bang" the Universe was essentially a "fireball," with the energy associated with radiation completely dominating that associated with matter. The early Universe was also pretty well described by equilibrium statistical thermodynamics, which means that the radiation had a blackbody spectrum. As the Universe expanded, the radiation was gradually Doppler shifted to ever larger wavelengths (in other words, the radiation did work against the expansion of the Universe, and, thereby, lost energy), but its spectrum remained invariant. Nowadays, this primordial radiation is detectable as a faint microwave background which pervades the whole universe. The microwave background was discovered accidentally by Penzias and Wilson in 1965. Until recently, it was difficult to measure the full spectrum with any degree of precision, because of strong microwave absorption and scattering by the Earth's atmosphere. However, all of this changed when the COBE satellite was launched in 1989. It took precisely nine minutes to measure the perfect blackbody spectrum reproduced in Fig. 9. This data can be fitted to a blackbody curve of characteristic temperature 2.735 ◦K. In a very real sense, this can be regarded as the "temperature of the Universe."
8.12 The Stefan-Boltzmann law
The total power radiated per unit area by a blackbody at all frequencies is given by

P_tot(T) = ∫₀^∞ P(ω) dω = [ħ/(4π² c²)] ∫₀^∞ ω³ dω/[exp(ħω/k T) − 1],  (8.90)

or

P_tot(T) = [k⁴ T⁴/(4π² c² ħ³)] ∫₀^∞ η³ dη/(exp η − 1),  (8.91)
Figure 9: Cosmic background radiation spectrum measured by the Far Infrared Absolute Spectrometer (FIRAS) aboard the Cosmic Background Explorer satellite (COBE).
where η = ħω/k T. The above integral can easily be looked up in standard mathematical tables. In fact,

∫₀^∞ η³ dη/(exp η − 1) = π⁴/15.  (8.92)

Thus, the total power radiated per unit area by a blackbody is

P_tot(T) = (π²/60) [k⁴/(c² ħ³)] T⁴ = σ T⁴.  (8.93)
This T⁴ dependence of the radiated power is called the Stefan-Boltzmann law, after Josef Stefan, who first obtained it experimentally, and Ludwig Boltzmann, who first derived it theoretically. The parameter

σ = (π²/60) k⁴/(c² ħ³) = 5.67 × 10⁻⁸ W m⁻² K⁻⁴,  (8.94)

is called the Stefan-Boltzmann constant.
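As a quick numerical check, evaluating Eq. (8.94) with standard values of k, c, and ħ reproduces the quoted constant:

```python
import math

k = 1.381e-23      # Boltzmann constant (J/K)
c = 2.998e8        # speed of light (m/s)
hbar = 1.055e-34   # reduced Planck constant (J s)

# Stefan-Boltzmann constant, Eq. (8.94).
sigma = math.pi**2 / 60.0 * k**4 / (c**2 * hbar**3)

print(sigma)  # ~5.67e-8 W m^-2 K^-4
```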
We can use the Stefan-Boltzmann law to estimate the temperature of the Earth from first principles. The Sun is a ball of glowing gas of radius R_⊙ ≃ 7 × 10⁵ km and surface temperature T_⊙ ≃ 5770 ◦K. Its luminosity is

L_⊙ = 4π R_⊙² σ T_⊙⁴,  (8.95)
according to the Stefan-Boltzmann law. The Earth is a globe of radius R_⊕ ∼ 6000 km located an average distance r_⊕ ≃ 1.5 × 10⁸ km from the Sun. The Earth intercepts an amount of energy

P_⊕ = L_⊙ (π R_⊕²/r_⊕²)/4π  (8.96)
per second from the Sun's radiative output: i.e., the power output of the Sun reduced by the ratio of the solid angle subtended by the Earth at the Sun to the total solid angle 4π. The Earth absorbs this energy, and then re-radiates it at longer wavelengths. The luminosity of the Earth is

L_⊕ = 4π R_⊕² σ T_⊕⁴,  (8.97)

according to the Stefan-Boltzmann law, where T_⊕ is the average temperature of the Earth's surface. Here, we are ignoring any surface temperature variations between polar and equatorial regions, or between day and night. In steady-state, the luminosity of the Earth must balance the radiative power input from the Sun, so equating L_⊕ and P_⊕ we arrive at

T_⊕ = [R_⊙/(2 r_⊕)]^{1/2} T_⊙.  (8.98)
Remarkably, the ratio of the Earth's surface temperature to that of the Sun depends only on the Earth-Sun distance and the solar radius. The above expression yields T_⊕ ∼ 279 ◦K or 6 ◦C (or 43 ◦F). This is slightly on the cold side, by a few degrees, because of the greenhouse action of the Earth's atmosphere, which was neglected in our calculation. Nevertheless, it is quite encouraging that such a crude calculation comes so close to the correct answer.
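The arithmetic of this estimate is easily reproduced:

```python
import math

T_sun = 5770.0    # solar surface temperature (K)
R_sun = 7.0e8     # solar radius (m)
r = 1.5e11        # Earth-Sun distance (m)

# Eq. (8.98): T_earth = sqrt(R_sun / 2r) * T_sun
T_earth = math.sqrt(R_sun / (2 * r)) * T_sun

print(T_earth)        # ~279 K
print(T_earth - 273)  # ~6 degrees centigrade
```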
8.13 Conduction electrons in a metal
The conduction electrons in a metal are non-localized (i.e., they are not tied to any particular atoms). In conventional metals, each atom contributes a single such electron. To a first approximation, it is possible to neglect the mutual interaction of the conduction electrons, since this interaction is largely shielded out by the stationary atoms. The conduction electrons can, therefore, be treated as an ideal gas. However, the concentration of such electrons in a metal far exceeds the concentration of particles in a conventional gas. It is, therefore, not surprising that conduction electrons cannot normally be analyzed using classical statistics: in fact, they are subject to Fermi-Dirac statistics (since electrons are fermions).
Recall, from Sect. 8.5, that the mean number of particles occupying state s (energy ε_s) is given by

n̄_s = 1/{exp[β(ε_s − µ)] + 1},  (8.99)

according to the Fermi-Dirac distribution. Here,

µ ≡ −k T α  (8.100)

is termed the Fermi energy of the system. This energy is determined by the condition that

Σ_r n̄_r = Σ_r 1/{exp[β(ε_r − µ)] + 1} = N,  (8.101)

where N is the total number of particles contained in the volume V. It is clear, from the above equation, that the Fermi energy µ is generally a function of the temperature T.
Let us investigate the behaviour of the Fermi function

F(ε) = 1/{exp[β(ε − µ)] + 1}  (8.102)

as ε varies. Here, the energy is measured from its lowest possible value ε = 0. If the Fermi energy µ is such that βµ ≪ −1 then β(ε − µ) ≫ 1, and F reduces to the Maxwell-Boltzmann distribution. However, for the case of conduction electrons in a metal we are interested in the opposite limit, where

βµ ≡ µ/(k T) ≫ 1.  (8.103)
In this limit, if ε ≪ µ then β(ε − µ) ≪ −1, so that F(ε) ≃ 1. On the other hand, if ε ≫ µ then β(ε − µ) ≫ 1, so that F(ε) ≃ exp[−β(ε − µ)] falls off exponentially with increasing ε, just like a classical Boltzmann distribution.

Figure 10: The Fermi function.

Note that F = 1/2 when ε = µ. The transition region in which F goes from a value close to unity to a value close to zero corresponds to an energy interval of order k T, centred on ε = µ. This is illustrated in Fig. 10.
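The shape of the Fermi function in Fig. 10 is easy to reproduce numerically. The sketch below uses an illustrative µ = 7 eV, a value typical of conventional metals:

```python
import math

k = 8.617e-5  # Boltzmann constant (eV/K)

def fermi(eps, mu, T):
    """Fermi function F(eps) = 1 / (exp[(eps - mu)/kT] + 1)."""
    return 1.0 / (math.exp((eps - mu) / (k * T)) + 1.0)

mu, T = 7.0, 300.0  # Fermi energy (eV) and temperature (K)
kT = k * T          # ~0.026 eV

print(fermi(mu, mu, T))            # exactly 0.5 at eps = mu
print(fermi(mu - 10 * kT, mu, T))  # ~1 : states well below mu are filled
print(fermi(mu + 10 * kT, mu, T))  # ~0 : states well above mu are empty
```

The transition from F ≃ 1 to F ≃ 0 takes place over an energy interval of only a few k T around ε = µ, as described above.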
In the limit as T → 0, the transition region becomes infinitesimally narrow. In this case, F = 1 for ε < µ and F = 0 for ε > µ, as illustrated in Fig. 10. This is an obvious result, since when T = 0 the conduction electrons attain their lowest energy, or ground-state, configuration. Since the Pauli exclusion principle requires that there be no more than one electron per single-particle quantum state, the lowest energy configuration is obtained by piling electrons into the lowest available unoccupied states until all of the electrons are used up. Thus, the last electron added to the pile has quite a considerable energy, ε = µ, since all of the lower energy states are already occupied. Clearly, the exclusion principle implies that a Fermi-Dirac gas possesses a large mean energy, even at absolute zero.
Let us calculate the Fermi energy µ = µ_0 of a Fermi-Dirac gas at T = 0. The energy of each particle is related to its momentum p = ħk via

ε = p²/(2 m) = ħ² k²/(2 m),  (8.104)

where k is the de Broglie wavevector. At T = 0 all quantum states whose energy is less than the Fermi energy µ_0 are filled. The Fermi energy corresponds to a Fermi momentum p_F = ħ k_F which is such that

µ_0 = p_F²/(2 m) = ħ² k_F²/(2 m).  (8.105)

Thus, at T = 0 all quantum states with k < k_F are filled, and all those with k > k_F are empty.
Now, we know, by analogy with Eq. (7.146), that there are (2π)⁻³ V allowable translational states per unit volume of k-space. The volume of the sphere of radius k_F in k-space is (4/3) π k_F³. It follows that the Fermi sphere of radius k_F contains (4/3) π k_F³ (2π)⁻³ V translational states. The number of quantum states inside the sphere is twice this, because electrons possess two possible spin states for every possible translational state. Since the total number of occupied states (i.e., the total number of quantum states inside the Fermi sphere) must equal the total number of particles in the gas, it follows that

2 [V/(2π)³] (4/3) π k_F³ = N.  (8.106)

The above expression can be rearranged to give

k_F = (3π² N/V)^{1/3}.  (8.107)
Hence,

λ_F ≡ 2π/k_F = [2π/(3π²)^{1/3}] (V/N)^{1/3},  (8.108)

which implies that the de Broglie wavelength λ_F corresponding to the Fermi energy is of order the mean separation between particles (V/N)^{1/3}. All quantum states with de Broglie wavelengths λ ≡ 2π/k > λ_F are occupied at T = 0, whereas all those with λ < λ_F are empty.
According to Eq. (8.105), the Fermi energy at T = 0 takes the form

µ_0 = [ħ²/(2 m)] (3π² N/V)^{2/3}.  (8.109)

It is easily demonstrated that µ_0 ≫ k T for conventional metals at room temperature.
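This is easily checked numerically. For copper, with roughly one conduction electron per atom (N/V ≈ 8.5 × 10²⁸ m⁻³, an illustrative textbook value), Eq. (8.109) gives a Fermi energy of about 7 eV, several hundred times k T at room temperature:

```python
import math

hbar = 1.055e-34   # reduced Planck constant (J s)
m_e = 9.11e-31     # electron mass (kg)
k = 1.38e-23       # Boltzmann constant (J/K)
eV = 1.602e-19     # J per electron-volt

n = 8.5e28         # conduction electron density of copper (m^-3), illustrative

# Fermi energy at T = 0, Eq. (8.109).
mu0 = (hbar**2 / (2 * m_e)) * (3 * math.pi**2 * n) ** (2.0 / 3.0)

print(mu0 / eV)           # ~7 eV
print(mu0 / (k * 300.0))  # ~270 : mu_0 >> kT at room temperature
```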
The majority of the conduction electrons in a metal occupy a band of completely filled states with energies far below the Fermi energy. In many cases, such electrons have very little effect on the macroscopic properties of the metal. Consider, for example, the contribution of the conduction electrons to the specific heat of the metal. The heat capacity C_V at constant volume of these electrons can be calculated from a knowledge of their mean energy Ē(T) as a function of T: i.e.,

C_V = (∂Ē/∂T)_V.  (8.110)
If the electrons obeyed classical Maxwell-Boltzmann statistics, so that F ∝ exp(−βε) for all electrons, then the equipartition theorem would give

Ē = (3/2) N k T,  (8.111)
C_V = (3/2) N k.  (8.112)
However, the actual situation, in which F has the form shown in Fig. 10, is very different. A small change in T does not affect the mean energies of the majority of the electrons, with ε ≪ µ, since these electrons lie in states which are completely filled, and remain so when the temperature is changed. It follows that these electrons contribute nothing whatsoever to the heat capacity. On the other hand, the relatively small number of electrons N_eff in the energy range of order k T, centred on the Fermi energy, in which F is significantly different from 0 and 1, do contribute to the specific heat. In the tail end of this region F ∝ exp(−βε), so the distribution reverts to a Maxwell-Boltzmann distribution. Hence, from Eq. (8.112), we expect each electron in this region to contribute roughly an amount (3/2) k to the heat capacity. Hence, the heat capacity can be written

C_V ≃ (3/2) N_eff k.  (8.113)
However, since only a fraction k T/µ of the total conduction electrons lie in the tail region of the Fermi-Dirac distribution, we expect

N_eff ≃ (k T/µ) N.  (8.114)

It follows that

C_V ≃ (3/2) N k (k T/µ).  (8.115)
Since k T ≪ µ in conventional metals, the molar specific heat of the conduction electrons is clearly very much less than the classical value (3/2) R. This accounts for the fact that the molar specific heat capacities of metals at room temperature are about the same as those of insulators. Before the advent of quantum mechanics, the classical theory predicted incorrectly that the presence of conduction electrons should raise the heat capacities of metals by 50 percent [i.e., (3/2) R] compared to those of insulators.
Note that the specific heat (8.115) is not temperature independent. In fact, using the superscript e to denote the electronic specific heat, the molar specific heat can be written

c_V^(e) = γ T,  (8.116)

where γ is a (positive) constant of proportionality. At room temperature c_V^(e) is completely masked by the much larger specific heat c_V^(L) due to lattice vibrations. However, at very low temperatures c_V^(L) = A T³, where A is a (positive) constant of proportionality (see Sect. 7.12). Clearly, at low temperatures c_V^(L) = A T³ approaches zero far more rapidly than the electronic specific heat, as T is reduced. Hence, it should be possible to measure the electronic contribution to the molar specific heat at low temperatures.
The total molar specific heat of a metal at low temperatures takes the form

c_V = c_V^(e) + c_V^(L) = γ T + A T³.  (8.117)
Figure 11: The low temperature heat capacity of potassium, plotted as C_V/T versus T². From C. Kittel and H. Kroemer, Thermal physics (W.H. Freeman & co., New York NY, 1980).
Hence,

c_V/T = γ + A T².  (8.118)

It follows that a plot of c_V/T versus T² should yield a straight line whose intercept on the vertical axis gives the coefficient γ. Figure 11 shows such a plot. The fact that a good straight line is obtained verifies that the temperature dependence of the heat capacity predicted by Eq. (8.117) is indeed correct.
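Extracting γ from such a plot amounts to a straight-line fit of c_V/T against T². A minimal sketch with synthetic data follows; the values γ = 2.08 mJ mol⁻¹ K⁻² and A = 2.57 mJ mol⁻¹ K⁻⁴, roughly appropriate for potassium, are illustrative only:

```python
# Least-squares fit of y = gamma + A * x, where y = c_V/T and x = T^2.
gamma_true, A_true = 2.08, 2.57  # mJ/(mol K^2), mJ/(mol K^4), illustrative

Ts = [0.1 * i for i in range(1, 11)]        # temperatures 0.1 .. 1.0 K
xs = [T**2 for T in Ts]                     # abscissa: T^2
ys = [gamma_true + A_true * x for x in xs]  # ordinate: c_V / T

# Standard least-squares formulas for slope and intercept.
n = len(xs)
xbar = sum(xs) / n
ybar = sum(ys) / n
slope = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / \
        sum((x - xbar)**2 for x in xs)
intercept = ybar - slope * xbar

print(intercept)  # recovers gamma (the electronic coefficient)
print(slope)      # recovers A (the lattice coefficient)
```

With real measurements the recovered intercept would carry experimental scatter, but the procedure is exactly the one used to produce Fig. 11.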
8.14 White-dwarf stars
A main-sequence hydrogen-burning star, such as the Sun, is maintained in equilibrium via the balance of the gravitational attraction tending to make it collapse, and the thermal pressure tending to make it expand. Of course, the thermal energy of the star is generated by nuclear reactions occurring deep inside its core. Eventually, however, the star will run out of burnable fuel, and, therefore, start to collapse, as it radiates away its remaining thermal energy. What is the ultimate fate of such a star?
A burnt-out star is basically a gas of electrons and ions. As the star collapses, its density increases, so the mean separation between its constituent particles decreases. Eventually, the mean separation becomes of order the de Broglie wavelength of the electrons, and the electron gas becomes degenerate. Note that the de Broglie wavelength of the ions is much smaller than that of the electrons, so the ion gas remains non-degenerate. Now, even at zero temperature, a degenerate electron gas exerts a substantial pressure, because the Pauli exclusion principle prevents the mean electron separation from becoming significantly smaller than the typical de Broglie wavelength (see the previous section). Thus, it is possible for a burnt-out star to maintain itself against complete collapse under gravity via the degeneracy pressure of its constituent electrons. Such stars are termed white-dwarfs. Let us investigate the physics of white-dwarfs in more detail.
The total energy of a white-dwarf star can be written

E = K + U,  (8.119)

where K is the total kinetic energy of the degenerate electrons (the kinetic energy of the ions is negligible) and U is the gravitational potential energy. Let us assume, for the sake of simplicity, that the density of the star is uniform. In this case, the gravitational potential energy takes the form

U = −(3/5) G M²/R,  (8.120)

where G is the gravitational constant, M is the stellar mass, and R is the stellar radius.
Let us assume that the electron gas is highly degenerate, which is equivalent to taking the limit T → 0. In this case, we know, from the previous section, that the Fermi momentum can be written

p_F = Λ (N/V)^{1/3},  (8.121)

where

Λ = (3π²)^{1/3} ħ.  (8.122)

Here,

V = (4π/3) R³  (8.123)
is the stellar volume, and N is the total number of electrons contained in the star. Furthermore, the number of electron states contained in an annular radius of p-space lying between radii p and p + dp is

dN = (3 V/Λ³) p² dp.  (8.124)

Hence, the total kinetic energy of the electron gas can be written

K = (3 V/Λ³) ∫₀^{p_F} [p²/(2 m)] p² dp = (3/5) (V/Λ³) p_F⁵/(2 m),  (8.125)
where m is the electron mass. It follows that

K = (3/5) N [Λ²/(2 m)] (N/V)^{2/3}.  (8.126)
The interior of a white-dwarf star is composed of atoms like C¹² and O¹⁶ which contain equal numbers of protons, neutrons, and electrons. Thus,

M = 2 N m_p,  (8.127)

where m_p is the proton mass.
Equations (8.119), (8.120), (8.122), (8.123), (8.126), and (8.127) can be combined to give

E = A/R² − B/R,  (8.128)

where

A = (3/20) (9π/8)^{2/3} (ħ²/m) (M/m_p)^{5/3},  (8.129)
B = (3/5) G M².  (8.130)
The equilibrium radius of the star R_∗ is that which minimizes the total energy E. In fact, it is easily demonstrated that

R_∗ = 2 A/B,  (8.131)
which yields

R_∗ = [(9π)^{2/3}/8] (ħ²/m) 1/(G m_p^{5/3} M^{1/3}).  (8.132)
The above formula can also be written

R_∗/R_⊙ = 0.010 (M_⊙/M)^{1/3},  (8.133)

where R_⊙ = 7 × 10⁵ km is the solar radius, and M_⊙ = 2 × 10³⁰ kg is the solar mass. It follows that the radius of a typical solar mass white-dwarf is about 7000 km: i.e., about the same as the radius of the Earth. The first white-dwarf to be discovered (in 1862) was the companion of Sirius. Nowadays, thousands of white-dwarfs have been observed, all with properties similar to those described above.
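Equation (8.132) is straightforward to evaluate numerically. The sketch below, using standard values of the constants, checks that a solar-mass white-dwarf has a radius of roughly 7000 km:

```python
import math

hbar = 1.055e-34   # reduced Planck constant (J s)
G = 6.67e-11       # gravitational constant (m^3 kg^-1 s^-2)
m_e = 9.11e-31     # electron mass (kg)
m_p = 1.67e-27     # proton mass (kg)
Msun = 2e30        # solar mass (kg)

def wd_radius(M):
    """Equilibrium white-dwarf radius, Eq. (8.132)."""
    return ((9 * math.pi) ** (2.0 / 3.0) / 8.0) * hbar**2 / m_e \
           / (G * m_p ** (5.0 / 3.0) * M ** (1.0 / 3.0))

R = wd_radius(Msun)
print(R / 1e3)  # ~7000 km, about the radius of the Earth

# Note the inverse mass dependence: heavier white-dwarfs are smaller.
print(wd_radius(2 * Msun) < R)
```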
8.15 The Chandrasekhar limit
One curious feature of white-dwarf stars is that their radius decreases as their mass increases [see Eq. (8.133)]. It follows, from Eq. (8.126), that the mean energy of the degenerate electrons inside the star increases strongly as the stellar mass increases: in fact, K ∝ M^{4/3}. Hence, if M becomes sufficiently large the electrons become relativistic, and the above analysis needs to be modified. Strictly speaking, the non-relativistic analysis described in the previous section is only valid in the low mass limit M ≪ M_⊙. Let us, for the sake of simplicity, consider the ultra-relativistic limit in which p ≫ m c.
The total electron energy (including the rest mass energy) can be written

K = (3 V/Λ³) ∫₀^{p_F} (p² c² + m² c⁴)^{1/2} p² dp,  (8.134)

by analogy with Eq. (8.125). Thus,

K ≃ (3 V c/Λ³) ∫₀^{p_F} [p³ + (m² c²/2) p + ⋯] dp,  (8.135)
giving

K ≃ (3/4) (V c/Λ³) [p_F⁴ + m² c² p_F² + ⋯].  (8.136)
It follows, from the above, that the total energy of an ultra-relativistic white-dwarf star can be written in the form

E ≃ (A − B)/R + C R,  (8.137)

where

A = (3/8) (9π/8)^{1/3} ħ c (M/m_p)^{4/3},  (8.138)
B = (3/5) G M²,  (8.139)
C = (3/4) [1/(9π)^{1/3}] (m² c³/ħ) (M/m_p)^{2/3}.  (8.140)
As before, the equilibrium radius R_∗ is that which minimizes the total energy E. However, in the ultra-relativistic case, a non-zero value of R_∗ only exists for A − B > 0. When A − B < 0 the energy decreases monotonically with decreasing stellar radius: in other words, the degeneracy pressure of the electrons is incapable of halting the collapse of the star under gravity. The criterion which must be satisfied for a relativistic white-dwarf star to be maintained against gravity is that

A/B > 1.  (8.141)

This criterion can be rewritten

M < M_C,  (8.142)
where

M_C = (15/64) (5π)^{1/2} (ħ c/G)^{3/2}/m_p² = 1.72 M_⊙  (8.143)

is known as the Chandrasekhar limit, after A. Chandrasekhar who first derived it in 1931. A more realistic calculation, which does not assume constant density, yields

M_C = 1.4 M_⊙.  (8.144)
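Evaluating Eq. (8.143) numerically, with standard values of the constants, confirms the quoted uniform-density figure:

```python
import math

hbar = 1.055e-34   # reduced Planck constant (J s)
c = 3.0e8          # speed of light (m/s)
G = 6.67e-11       # gravitational constant (m^3 kg^-1 s^-2)
m_p = 1.67e-27     # proton mass (kg)
Msun = 2e30        # solar mass (kg)

# Chandrasekhar mass for the uniform-density, ultra-relativistic model.
M_C = (15.0 / 64.0) * math.sqrt(5 * math.pi) * (hbar * c / G) ** 1.5 / m_p**2

print(M_C / Msun)  # ~1.7 solar masses
```

Note that the combination (ħc/G)^{3/2}/m_p² sets the mass scale entirely from fundamental constants, which is why the limit is so robust.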
Thus, if the stellar mass exceeds the Chandrasekhar limit then the star in question cannot become a white-dwarf when its nuclear fuel is exhausted, but, instead, must continue to collapse. What is the ultimate fate of such a star?
8.16 Neutron stars
At stellar densities which greatly exceed white-dwarf densities, the extreme pressures cause electrons to combine with protons to form neutrons. Thus, any star which collapses to such an extent that its radius becomes significantly less than that characteristic of a white-dwarf is effectively transformed into a gas of neutrons. Eventually, the mean separation between the neutrons becomes comparable with their de Broglie wavelength. At this point, it is possible for the degeneracy pressure of the neutrons to halt the collapse of the star. A star which is maintained against gravity in this manner is called a neutron star.
Neutron stars can be analyzed in a very similar manner to white-dwarf stars. In fact, the previous analysis can be simply modified by letting m_p → m_p/2 and m → m_p. Thus, we conclude that non-relativistic neutron stars satisfy the mass-radius law:

R_∗/R_⊙ = 0.000011 (M_⊙/M)^{1/3}.  (8.145)

It follows that the radius of a typical solar mass neutron star is a mere 10 km.
In 1967 Antony Hewish and Jocelyn Bell discovered a class of compact radio sources, called pulsars, which emit extremely regular pulses of radio waves. Pulsars have subsequently been identified as rotating neutron stars. To date, many hundreds of these objects have been observed.
When relativistic effects are taken into account, it is found that there is a critical mass above which a neutron star cannot be maintained against gravity. According to our analysis, this critical mass, which is known as the Oppenheimer-Volkoff limit, is given by

M_OV = 4 M_C = 6.9 M_⊙.  (8.146)
A more realistic calculation, which does not assume constant density, does not treat the neutrons as point particles, and takes general relativity into account, gives a somewhat lower value of

M_OV = 1.5–2.5 M_⊙.  (8.147)

A star whose mass exceeds the Oppenheimer-Volkoff limit cannot be maintained against gravity by degeneracy pressure, and must ultimately collapse to form a black hole.
1 INTRODUCTION
1 Introduction
1.1 Intended audience These lecture notes outline a single semester course intended for upper division undergraduates.
1.2 Major sources The textbooks which I have consulted most frequently whilst developing course material are: Fundamentals of statistical and thermal physics: F. Reif (McGrawHill, New York NY, 1965). Introduction to quantum theory: D. Park, 3rd Edition (McGrawHill, New York NY, 1992).
1.3 Why study thermodynamics? Thermodynamics is essentially the study of the internal motions of many body systems. Virtually all substances which we encounter in everyday life are many body systems of some sort or other (e.g., solids, liquids, gases, and light). Not surprisingly, therefore, thermodynamics is a discipline with an exceptionally wide range of applicability. Thermodynamics is certainly the most ubiquitous subﬁeld of Physics outside Physics Departments. Engineers, Chemists, and Material Scientists do not study relatively or particle physics, but thermodynamics is an integral, and very important, part of their degree courses. Many people are drawn to Physics because they want to understand why the world around us is like it is. For instance, why the sky is blue, why raindrops are spherical, why we do not fall through the ﬂoor, etc. It turns out that statistical
2
1.4 The atomic theory of matter
1 INTRODUCTION
thermodynamics can explain more things about the world around us than all of the other physical theories studied in the undergraduate Physics curriculum put together. For instance, in this course we shall explain why heat ﬂows from hot to cold bodies, why the air becomes thinner and colder at higher altitudes, why the Sun appears yellow whereas colder stars appear red and hotter stars appear bluishwhite, why it is impossible to measure a temperature below 273◦ centigrade, why there is a maximum theoretical efﬁciency of a power generation unit which can never be exceeded no matter what the design, why high mass stars must ultimately collapse to form blackholes, and much more!
1.4 The atomic theory of matter

According to the well-known atomic theory of matter, the familiar objects which make up the world around us, such as tables and chairs, are themselves made up of a great many microscopic particles. Atomic theory was invented by the ancient Greek philosophers Leucippus and Democritus, who speculated that the world essentially consists of myriads of tiny indivisible particles, which they called atoms, from the Greek atomon, meaning "uncuttable." They speculated, further, that the observable properties of everyday materials can be explained either in terms of the different shapes of the atoms which they contain, or the different motions of these atoms. In some respects modern atomic theory differs substantially from the primitive theory of Leucippus and Democritus, but the central ideas have remained essentially unchanged. In particular, Leucippus and Democritus were right to suppose that the properties of materials depend not only on the nature of the constituent atoms or molecules, but also on the relative motions of these particles.
1.5 Thermodynamics

In this course, we shall focus almost exclusively on those physical properties of everyday materials which are associated with the motions of their constituent
atoms or molecules. In particular, we shall be concerned with the type of motion which we normally call “heat.” We shall try to establish what controls the ﬂow of heat from one body to another when they are brought into thermal contact. We shall also attempt to understand the relationship between heat and mechanical work. For instance, does the heat content of a body increase when mechanical work is done on it? More importantly, can we extract heat from a body in order to do useful work? This subject area is called “thermodynamics,” from the Greek roots thermos, meaning “heat,” and dynamis, meaning “power.”
1.6 The need for a statistical approach

It is necessary to emphasize from the very outset that this is a difficult subject. In fact, this subject is so difficult that we are forced to adopt a radically different approach to that employed in other areas of Physics. In all of the Physics courses which you have taken up to now, you were eventually able to formulate some exact, or nearly exact, set of equations which governed the system under investigation. For instance, Newton's equations of motion, or Maxwell's equations for electromagnetic fields. You were then able to analyze the system by solving these equations, either exactly or approximately. In thermodynamics we have no problem formulating the governing equations. The motions of atoms and molecules are described exactly by the laws of quantum mechanics. In many cases, they are also described to a reasonable approximation by the much simpler laws of classical mechanics. We shall not be dealing with systems sufficiently energetic for atomic nuclei to be disrupted, so we can forget about nuclear forces. Also, in general, the gravitational forces between atoms and molecules are completely negligible. This means that the forces between atoms and molecules are predominantly electromagnetic in origin, and are, therefore, very well understood. So, in principle, we could write down the exact laws of motion for a thermodynamical system, including all of the interatomic forces. The problem is the sheer complexity of this type of system. In one mole of a substance (e.g., in twelve grams of carbon, or eighteen grams of water) there are
Avogadro's number of atoms or molecules. That is, about N_A = 6 × 10²³ particles, which is a gigantic number of particles! To solve the system exactly we would have to write down about 10²⁴ coupled equations of motion, with the same number of initial conditions, and then try to integrate the system. Quite plainly, this is impossible. It would also be complete overkill. We are not at all interested in knowing the position and velocity of every particle in the system as a function of time. Instead, we want to know things like the volume of the system, the temperature, the pressure, the heat capacity, the coefficient of expansion, etc. We would certainly be hard put to specify more than about fifty, say, properties of a thermodynamic system in which we are really interested. So, the number of pieces of information we require is absolutely minuscule compared to the number of degrees of freedom of the system. That is, the number of pieces of information needed to completely specify the internal motion. Moreover, the quantities which we are interested in do not depend on the motions of individual particles, or some small subset of particles, but, instead, depend on the average motions of all the particles in the system. In other words, these quantities depend on the statistical properties of the atomic or molecular motion. The method adopted in this subject area is essentially dictated by the enormous complexity of thermodynamic systems. We start with some statistical information about the motions of the constituent atoms or molecules, such as their average kinetic energy, but we possess virtually no information about the motions of individual particles. We then try to deduce some other properties of the system from a statistical treatment of the governing equations. In fact, our approach has to be statistical in nature, because we lack most of the information required to specify the internal state of the system.
The best we can do is to provide a few overall constraints, such as the average volume and the average energy. Thermodynamic systems are ideally suited to a statistical approach because of the enormous numbers of particles they contain. As you probably know already, statistical arguments actually get more exact as the numbers involved get larger. For instance, whenever I see an opinion poll published in a newspaper, I immediately look at the small print at the bottom where it says how many people were
interviewed. I know that even if the polling was done without bias, which is extremely unlikely, the laws of statistics say that there is an intrinsic error of order one over the square root of the number of people questioned. It follows that if a thousand people were interviewed, which is a typical number, then the error is at least three percent. Hence, if the headline says that so and so is ahead by one percentage point, and only a thousand people were polled, then I know the result is statistically meaningless. We can easily appreciate that if we do statistics on a thermodynamic system containing 10²⁴ particles then we are going to obtain results which are valid to incredible accuracy. In fact, in most situations we can forget that the results are statistical at all, and treat them as exact laws of Physics. For instance, the familiar equation of state of an ideal gas, P V = ν R T, is actually a statistical result. In other words, it relates the average pressure and the average volume to the average temperature. However, for one mole of gas the statistical deviations from average values are only about 10⁻¹², according to the 1/√N law. Actually, it is virtually impossible to measure the pressure, volume, or temperature of a gas to such accuracy, so most people just forget about the fact that the above expression is a statistical result, and treat it as a law of Physics interrelating the actual pressure, volume, and temperature of an ideal gas.
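The 1/√N scaling invoked above is easy to make concrete. A minimal sketch (the sample sizes are purely illustrative):

```python
import math

def relative_error(n):
    """Intrinsic fractional statistical error of a sample of n members: 1/sqrt(n)."""
    return 1.0 / math.sqrt(n)

poll = relative_error(1000)      # a typical opinion poll: roughly 3 percent
cutoff = relative_error(10_000)  # exactly 1 percent
mole = relative_error(6e23)      # one mole of molecules: about 1e-12
```

The same one-line law covers both the newspaper poll and the mole of gas; only the size of N differs.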
1.7 Microscopic and macroscopic systems

It is useful, at this stage, to make a distinction between the different sizes of the systems that we are going to examine. We shall call a system microscopic if it is roughly of atomic dimensions, or smaller. On the other hand, we shall call a system macroscopic when it is large enough to be visible in the ordinary sense. This is a rather inexact definition. The exact definition depends on the number of particles in the system, which we shall call N. A system is macroscopic if

1/√N ≪ 1,

which means that statistical arguments can be applied to reasonable accuracy. For instance, if we wish to keep the statistical error below one percent then a
macroscopic system would have to contain more than about ten thousand particles. Any system containing less than this number of particles would be regarded as essentially microscopic, and, hence, statistical arguments could not be applied to such a system without unacceptable error.
1.8 Thermodynamics and statistical thermodynamics

In this course, we are going to develop some machinery for interrelating the statistical properties of a system containing a very large number of particles, via a statistical treatment of the laws of atomic or molecular motion. It turns out that once we have developed this machinery, we can obtain some very general results which do not depend on the exact details of the statistical treatment. These results can be described without reference to the underlying statistical nature of the system, but their validity depends ultimately on statistical arguments. They take the form of general statements regarding heat and work, and are usually referred to as classical thermodynamics, or just thermodynamics, for short. Historically, classical thermodynamics was the first sort of thermodynamics to be discovered. In fact, for many years the laws of classical thermodynamics seemed rather mysterious, because their statistical justification had yet to be discovered. The strength of classical thermodynamics is its great generality, which comes about because it does not depend on any detailed assumptions about the statistical properties of the system under investigation. This generality is also the principal weakness of classical thermodynamics. Only a relatively few statements can be made on such general grounds, so many interesting properties of the system remain outside the scope of this theory. If we go beyond classical thermodynamics, and start to investigate the statistical machinery which underpins it, then we get all of the results of classical thermodynamics, plus a large number of other results which enable the macroscopic parameters of the system to be calculated from a knowledge of its microscopic constituents. This approach is known as statistical thermodynamics, and is extremely powerful. The only drawback is that the further we delve inside the statistical machinery of thermodynamics, the harder it becomes to perform the
necessary calculations. Note that both classical and statistical thermodynamics are only valid for systems in equilibrium. If the system is not in equilibrium then the problem becomes considerably more difﬁcult. In fact, the thermodynamics of nonequilibrium systems, which is generally called irreversible thermodynamics, is a graduate level subject.
1.9 Classical and quantum approaches

We mentioned earlier that the motions (by which we really meant the translational motions) of atoms and molecules are described exactly by quantum mechanics, and only approximately by classical mechanics. It turns out that the non-translational motions of molecules, such as their rotation and vibration, are very poorly described by classical mechanics. So, why bother using classical mechanics at all? Unfortunately, quantum mechanics deals with the translational motions of atoms and molecules (via wave mechanics) in a rather awkward manner. The classical approach is far more straightforward, and, under most circumstances, yields the same statistical results. Hence, in the bulk of this course, we shall use classical mechanics, as much as possible, to describe translational motions, and reserve quantum mechanics for dealing with non-translational motions. However, towards the end of this course, we shall switch to a purely quantum mechanical approach.
2 Probability theory

2.1 Introduction

The first part of this course is devoted to a brief, and fairly low level, introduction to a branch of mathematics known as probability theory. In fact, we do not need to know very much about probability theory in order to understand statistical thermodynamics, since the probabilistic "calculation" which underpins all of this subject is extraordinarily simple.

2.2 What is probability?

What is the scientific definition of probability? Well, let us consider an observation made on a general system S. This can result in any one of a number of different possible outcomes. We want to find the probability of some general outcome X. In order to ascribe a probability, we have to consider the system as a member of a large set Σ of similar systems. Mathematicians have a fancy name for a large group of similar systems. They call such a group an ensemble, which is just the French for "group." So, let us consider an ensemble Σ of similar systems S. The probability of the outcome X is defined as the ratio of the number of systems in the ensemble which exhibit this outcome to the total number of systems, in the limit where the latter number tends to infinity. We can write this symbolically as

P(X) = lim_{Ω(Σ)→∞} Ω(X)/Ω(Σ),   (2.1)

where Ω(Σ) is the total number of systems in the ensemble, and Ω(X) is the number of systems exhibiting the outcome X. We can see that the probability P(X) must be a number between 0 and 1. The probability is zero if no systems exhibit the outcome X, even when the number of systems goes to infinity. This is just another way of saying that there is no chance of the outcome X. The probability is unity if all systems exhibit the outcome X in the limit as the number of systems goes to infinity. This is another way of saying that the outcome X is bound to occur.
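The limiting ratio Ω(X)/Ω(Σ) in this definition can be illustrated numerically. A minimal sketch, with a pseudo-random die standing in for the ensemble (the seed and ensemble sizes are arbitrary choices made for reproducibility):

```python
import random

def estimated_probability(outcome, ensemble_size, seed=0):
    """Fraction of systems in a finite ensemble exhibiting the given die outcome."""
    rng = random.Random(seed)  # fixed seed: a reproducible stand-in ensemble
    hits = sum(1 for _ in range(ensemble_size) if rng.randint(1, 6) == outcome)
    return hits / ensemble_size

# The estimate settles toward 1/6 as the ensemble is made larger.
p_small = estimated_probability(3, 100)
p_large = estimated_probability(3, 100_000)
```

No finite ensemble gives the probability exactly; the definition only fixes the limit as Ω(Σ) → ∞.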
2.3 Combining probabilities

Consider two distinct possible outcomes, X and Y, of an observation made on the system S, with probabilities of occurrence P(X) and P(Y). Let us determine the probability of obtaining the outcome X or Y, which we shall denote P(X | Y). From the basic definition of probability,

P(X | Y) = lim_{Ω(Σ)→∞} Ω(X | Y)/Ω(Σ),   (2.2)

where Ω(X | Y) is the number of systems in the ensemble which exhibit either the outcome X or the outcome Y. It is clear that

Ω(X | Y) = Ω(X) + Ω(Y)   (2.3)

if the outcomes X and Y are mutually exclusive (which must be the case if they are two distinct outcomes). Thus,

P(X | Y) = P(X) + P(Y).   (2.4)

So, the probability of the outcome X or the outcome Y is just the sum of the individual probabilities of X and Y. For instance, with a six sided die the probability of throwing any particular number (one to six) is 1/6, because all of the possible outcomes are considered to be equally likely. It follows from what has just been said that the probability of throwing either a one or a two is simply 1/6 + 1/6, which equals 1/3.

Let us denote all of the M, say, possible outcomes of an observation made on the system S by Xi, where i runs from 1 to M. Let us determine the probability of obtaining any of these outcomes. This quantity is clearly unity, from the basic definition of probability, because every one of the systems in the ensemble must exhibit one of the possible outcomes. But, this quantity is also equal to the sum of the probabilities of all the individual outcomes, by (2.4), so we conclude that this sum is equal to unity. Thus,

Σ_{i=1}^{M} P(Xi) = 1,   (2.5)
which is called the normalization condition, and must be satisfied by any complete set of probabilities. This condition is equivalent to the self-evident statement that an observation of a system must definitely result in one of its possible outcomes.

There is another way in which we can combine probabilities. Suppose that we make an observation on a state picked at random from the ensemble and then pick a second state completely independently and make another observation. We are assuming here that the first observation does not influence the second observation in any way. The fancy mathematical way of saying this is that the two observations are statistically independent. Let us determine the probability of obtaining the outcome X in the first state and the outcome Y in the second state, which we shall denote P(X ⊗ Y). In order to determine this probability, we have to form an ensemble of all of the possible pairs of states which we could choose from the ensemble Σ. Let us denote this ensemble Σ ⊗ Σ. It is obvious that the number of pairs of states in this new ensemble is just the square of the number of states in the original ensemble, so

Ω(Σ ⊗ Σ) = Ω(Σ) Ω(Σ).   (2.6)

It is also fairly obvious that the number of pairs of states in the ensemble Σ ⊗ Σ which exhibit the outcome X in the first state and Y in the second state is just the product of the number of states which exhibit the outcome X and the number of states which exhibit the outcome Y in the original ensemble, so

Ω(X ⊗ Y) = Ω(X) Ω(Y).   (2.7)

It follows from the basic definition of probability that

P(X ⊗ Y) = lim_{Ω(Σ)→∞} Ω(X ⊗ Y)/Ω(Σ ⊗ Σ) = P(X) P(Y).   (2.8)

Thus, the probability of obtaining the outcomes X and Y in two statistically independent observations is just the product of the individual probabilities of X and Y. For instance, the probability of throwing a one and then a two on a six sided die is 1/6 × 1/6, which equals 1/36.
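Both combining rules can be checked by direct enumeration of a finite ensemble. A sketch for the die examples in the text (the pair list plays the role of Σ ⊗ Σ):

```python
# The six faces of the die, and the 36 ordered pairs of two independent throws.
faces = range(1, 7)
pairs = [(a, b) for a in faces for b in faces]  # the ensemble Sigma x Sigma

p_one = sum(1 for f in faces if f == 1) / 6
p_two = sum(1 for f in faces if f == 2) / 6

# "or" rule: one and two are mutually exclusive outcomes of a single throw.
p_one_or_two = sum(1 for f in faces if f in (1, 2)) / 6

# "and" rule: a one on the first throw and a two on the second.
p_one_then_two = sum(1 for a, b in pairs if (a, b) == (1, 2)) / 36
```

Counting systems in the ensemble reproduces 1/3 for the sum rule and 1/36 for the product rule, exactly as the two formulas predict.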
2.4 The two-state system

The simplest non-trivial system which we can investigate using probability theory is one for which there are only two possible outcomes. There would obviously be little point in investigating a one outcome system. Let us suppose that there are two possible outcomes to an observation made on some system S. Let us denote these outcomes 1 and 2, and let their probabilities of occurrence be

P(1) = p,   (2.9)
P(2) = q.   (2.10)

It follows immediately from the normalization condition (2.5) that

p + q = 1,   (2.11)

so q = 1 − p. The best known example of a two-state system is a tossed coin. The two outcomes are "heads" and "tails," each with equal probabilities 1/2. So, p = q = 1/2 for this system.

Suppose that we make N statistically independent observations of S. Let us determine the probability of n1 occurrences of the outcome 1 and N − n1 occurrences of the outcome 2, with no regard to the order of these occurrences. Denote this probability PN(n1). This type of calculation crops up again and again in probability theory. For instance, we might want to know the probability of getting nine "heads" and only one "tails" in an experiment where a coin is tossed ten times, or where ten coins are tossed simultaneously.

Consider a simple case in which there are only three observations. Let us try to evaluate the probability of two occurrences of the outcome 1 and one occurrence of the outcome 2. There are three different ways of getting this result. We could get the outcome 1 on the first two observations and the outcome 2 on the third. Or, we could get the outcome 2 on the first observation and the outcome 1 on the latter two observations. Or, we could get the outcome 1 on the first and last observations and the outcome 2 on the middle observation. Writing this symbolically,

P3(2) = P(1 ⊗ 1 ⊗ 2 | 2 ⊗ 1 ⊗ 1 | 1 ⊗ 2 ⊗ 1).   (2.12)
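The three-observation case can also be checked by brute-force enumeration of all 2³ ordered sequences, anticipating the closed form 3 p² q worked out below (the value of p used in the check is arbitrary):

```python
from itertools import product

def p3_two_ones(p):
    """Total probability of exactly two 1-outcomes in three independent observations."""
    q = 1 - p
    total = 0.0
    for seq in product((1, 2), repeat=3):  # all 2**3 ordered outcome sequences
        if seq.count(1) == 2:
            total += p ** 2 * q            # each such sequence has probability p*p*q
    return total
```

Exactly three of the eight sequences qualify, so the enumeration agrees with adding the three probabilities by hand.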
This formula looks a bit scary, but all we have done here is to write out symbolically what was just said in words. Where we said "and" we have written the symbolic operator ⊗, and where we said "or" we have written the symbolic operator |. This symbolic representation is helpful because of the two basic rules for combining probabilities which we derived earlier:

P(X | Y) = P(X) + P(Y),   (2.13)
P(X ⊗ Y) = P(X) P(Y).   (2.14)

The straightforward application of these rules gives

P3(2) = p p q + q p p + p q p = 3 p² q   (2.15)

in the case under consideration.

Let us determine the probability of n1 occurrences of the outcome 1 and N − n1 occurrences of the outcome 2 in N statistically independent observations, with no regard to the order of these occurrences. The probability of obtaining n1 occurrences of the outcome 1 in N observations is given by

PN(n1) = C^N_{n1, N−n1} p^{n1} q^{N−n1},   (2.16)

where C^N_{n1, N−n1} is the number of ways of arranging two distinct sets of n1 and N − n1 indistinguishable objects. Hopefully, this is, at least, plausible from the example we just discussed. There, the probability of getting two occurrences of the outcome 1 and one occurrence of the outcome 2 was obtained by writing out all of the possible arrangements of two p's (the probability of outcome 1) and one q (the probability of outcome 2), and then adding them all together.

2.5 Combinatorial analysis

The branch of mathematics which studies the number of different ways of arranging things is called combinatorial analysis. We need to know how many different ways there are of arranging N objects which are made up of two groups of n1 and N − n1 indistinguishable objects. This is a pretty tough problem! Let us try something a little easier to begin with. How many ways are there of arranging N distinguishable objects? For instance, suppose that we have six pool balls, numbered one through six, and we pot one each into every one of the six pockets of a pool table (that is, top-left, top-right, middle-left, middle-right, bottom-left, and bottom-right). How many different ways are there of doing this? Well, let us start with the top-left pocket. We could pot any one of the six balls into this pocket, so there are 6 possibilities. For the top-right pocket we only have 5 possibilities, because we have already potted a ball into the top-left pocket, and it cannot be in two pockets simultaneously. So, our 6 original possibilities combined with these 5 new possibilities gives 6 × 5 ways of potting two balls into the top two pockets. For the middle-left pocket we have 4 possibilities, because we have already potted two balls. These possibilities combined with our 6 × 5 possibilities gives 6 × 5 × 4 ways of potting three balls into three pockets. At this stage, it should be clear that the final answer is going to be 6 × 5 × 4 × 3 × 2 × 1. Well, 6 × 5 × 4 × 3 × 2 × 1 is a bit of a mouthful, so to prevent us having to say (or write) things like this, mathematicians have invented a special function called a factorial. The factorial of a general positive integer n is defined

n! = n (n − 1) (n − 2) · · · 3 · 2 · 1.   (2.17)

So, 1! = 1, and 2! = 2 × 1 = 2, and 3! = 3 × 2 × 1 = 6, and so on. Clearly, the number of ways of potting six pool balls into six pockets is 6! (which incidentally equals 720). Since there is nothing special about pool balls, or the number six, we can safely infer that the number of different ways of arranging N distinguishable objects, denoted C_N, is given by

C_N = N!.   (2.18)

Suppose that we take the number four ball off the pool table and replace it by a second number five ball. How many different ways are there of potting the balls now? Well, consider a previous arrangement in which the number five ball was potted into the top-left pocket and the number four ball was potted into the top-right pocket, and then consider a second arrangement which only differs from the first because the number four and five balls have been swapped around. These two arrangements are now indistinguishable, and are therefore counted as a single arrangement, whereas previously they were counted as two separate arrangements. Clearly, the previous arrangements can be divided into two groups, containing equal numbers of arrangements, which differ only by the permutation of the number four and five balls. Since these balls are now indistinguishable, we conclude that there are only half as many different arrangements as there were before.
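The counting argument above can be sketched numerically, with Python's factorial standing in for the hand count (the ball numbers follow the text's example):

```python
import math

# Six distinguishable balls in six pockets: 6! arrangements.
total = math.factorial(6)                            # 720

# Two identical number-five balls: each arrangement counted 2! times before.
two_same = math.factorial(6) // math.factorial(2)    # 360, half as many

# Three identical number-five balls: each counted 3! times before.
three_same = math.factorial(6) // math.factorial(3)  # 120, one sixth as many
```

Dividing by the factorial of the size of each indistinguishable group is exactly the permutation argument in the text.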
If we take the number three ball off the table and replace it by a third number five ball, we can split the original arrangements into six equal groups of arrangements which differ only by the permutation of the number three, four, and five balls. There are six groups because there are 3! = 6 separate permutations of these three balls. Since the number three, four, and five balls are now indistinguishable, we conclude that there are only 1/6 the number of original arrangements. Generalizing this result, we conclude that the number of arrangements of n1 indistinguishable and N − n1 distinguishable objects is

C^N_{n1} = N!/n1!.   (2.19)

We can see that if all the balls on the table are replaced by number five balls then there is only N!/N! = 1 possible arrangement. This corresponds, of course, to a number five ball in each pocket. A further straightforward generalization tells us that the number of arrangements of two groups of n1 and N − n1 indistinguishable objects is

C^N_{n1, N−n1} = N!/(n1! (N − n1)!).   (2.20)

2.6 The binomial distribution

It follows from Eqs. (2.16) and (2.20) that the probability of obtaining n1 occurrences of the outcome 1 in N statistically independent observations of a two-state system is

PN(n1) = N!/(n1! (N − n1)!) p^{n1} q^{N−n1}.   (2.21)

This probability function is called the binomial distribution function. The reason for this is obvious if we tabulate the probabilities for the first few possible values of N (see Tab. 1). Of course, we immediately recognize these expressions: they appear in the standard algebraic expansions of (p + q), (p + q)², (p + q)³, and (p + q)⁴, respectively. In algebra, the expansion of (p + q)^N is called the binomial expansion (hence, the name given to the probability distribution function).
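Eq. (2.21) translates directly into code; Python's math.comb supplies the arrangement count N!/(n1!(N − n1)!), and the coin-tossing numbers follow the earlier example:

```python
import math

def binomial_p(N, n1, p):
    """P_N(n1): probability of n1 occurrences of outcome 1 in N observations."""
    q = 1 - p
    return math.comb(N, n1) * p ** n1 * q ** (N - n1)

# Nine "heads" and one "tails" in ten tosses of a fair coin:
nine_heads = binomial_p(10, 9, 0.5)   # 10/1024

# The probabilities over all possible n1 should sum to unity (normalization).
total = sum(binomial_p(10, n, 0.5) for n in range(11))
```

The normalization check is just the binomial expansion of (p + q)^N with p + q = 1.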
variance.23) since p + q = 1. and standard deviation n1 2 2 PROBABILITY THEORY N 1 2 3 4 0 1 3 4 q p q2 2 p q p2 q3 3 p q2 3 p2 q p3 q4 4 p q3 6 p2 q2 4 p3 q p4 Table 1: The binomial probability distribution can be written (p + q) ≡ N N n=0 N! pn qN−n .7 The mean.21) and (2.22) can be used to establish the normalization condition for the binomial distribution function: N N PN (n1 ) = n1 =0 n1 N! pn1 qN−n1 ≡ (p + q)N = 1.2. 2. We would then write something like Average Age N18 × 18 + N19 × 19 + N20 × 20 + · · · . (2. and standard deviation What is meant by the mean or average of a quantity? Well. were currently enrolled.7 The mean. suppose that we wanted to calculate the average age of undergraduates at the University of Texas at Austin. nineteen yearolds. n ! (N − n1 )! =0 1 (2. n! (N − n)! (2. etc. N18 + N19 + N20 · · · (2.25) Nstudents 16 . Suppose that we were to pick a student at random and then ask “What is the probability of this student being eighteen?” From what we have already discussed.24) where N18 is the number of enrolled eighteen yearolds. We could go to the central administration building and ﬁnd out how many eighteen yearolds. etc. variance. this probability is deﬁned N18 P18 = .22) Equations (2.
Here, Nstudents is the total number of enrolled students. We can now see that the average age takes the form

Average Age ≃ P18 × 18 + P19 × 19 + P20 × 20 + · · · .   (2.26)

Well, there is nothing special about the age distribution of students at UT Austin. So, for a general variable u, which can take on any one of M possible values u1, u2, · · ·, uM, with corresponding probabilities P(u1), P(u2), · · ·, P(uM), the mean or average value of u, which is denoted ⟨u⟩, is defined as

⟨u⟩ ≡ Σ_{i=1}^{M} P(ui) ui.   (2.27)

Suppose that f(u) is some function of u. Then, for each of the M possible values of u, there is a corresponding value of f(u) which occurs with the same probability. Thus, f(u1) corresponds to u1 and occurs with the probability P(u1), and so on. It follows from our previous definition that the mean value of f(u) is given by

⟨f(u)⟩ ≡ Σ_{i=1}^{M} P(ui) f(ui).   (2.28)

Suppose that f(u) and g(u) are two general functions of u. It follows that

⟨f(u) + g(u)⟩ = Σ_{i=1}^{M} P(ui) [f(ui) + g(ui)] = Σ_{i=1}^{M} P(ui) f(ui) + Σ_{i=1}^{M} P(ui) g(ui),   (2.29)

so

⟨f(u) + g(u)⟩ = ⟨f(u)⟩ + ⟨g(u)⟩.   (2.30)

Finally, if c is a general constant then it is clear that

⟨c f(u)⟩ = c ⟨f(u)⟩.   (2.31)

We now know how to define the mean value of the general variable u. But, how can we characterize the scatter around the mean value? We could investigate the deviation of u from its mean value ⟨u⟩, which is denoted

∆u ≡ u − ⟨u⟩.   (2.32)
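These definitions can be sketched for a concrete distribution; a fair die is assumed here purely for illustration:

```python
values = [1, 2, 3, 4, 5, 6]   # the possible values u_i of a fair die
probs = [1 / 6] * 6           # equal probabilities P(u_i)

mean_u = sum(P * u for P, u in zip(probs, values))        # <u> = 3.5
mean_f = sum(P * u ** 2 for P, u in zip(probs, values))   # <f(u)> with f(u) = u**2

# Linearity of the mean: <f(u) + g(u)> = <f(u)> + <g(u)> with g(u) = u.
mean_f_plus_g = sum(P * (u ** 2 + u) for P, u in zip(probs, values))
```

The mean of a function of u is just a probability-weighted sum over the possible values, and the sum rule for means follows term by term.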
In fact, this is not a particularly interesting quantity, since its average is obviously zero:

⟨∆u⟩ = ⟨u − ⟨u⟩⟩ = ⟨u⟩ − ⟨u⟩ = 0.   (2.33)

This is another way of saying that the average deviation from the mean vanishes. A more interesting quantity is the square of the deviation. The average value of this quantity,

⟨(∆u)²⟩ = Σ_{i=1}^{M} P(ui) (ui − ⟨u⟩)²,   (2.34)

is usually called the variance. The variance is clearly a positive number, unless there is no scatter at all in the distribution, so that all possible values of u correspond to the mean value ⟨u⟩, in which case it is zero. The following general relation is often useful:

⟨(u − ⟨u⟩)²⟩ = ⟨u² − 2 u ⟨u⟩ + ⟨u⟩²⟩ = ⟨u²⟩ − 2 ⟨u⟩ ⟨u⟩ + ⟨u⟩²,   (2.35)

giving

⟨(u − ⟨u⟩)²⟩ = ⟨u²⟩ − ⟨u⟩².   (2.36)

The variance of u is proportional to the square of the scatter of u around its mean value. A more useful measure of the scatter is given by the square root of the variance,

∆*u = [⟨(∆u)²⟩]^{1/2},   (2.37)

which is usually called the standard deviation of u. The standard deviation is essentially the width of the range over which u is distributed around its mean value ⟨u⟩.

2.8 Application to the binomial distribution

Let us now apply what we have just learned about the mean, variance, and standard deviation of a general distribution function to the specific case of the binomial distribution function. Recall that if a simple system has just two possible outcomes, denoted 1 and 2, with respective probabilities p and q = 1 − p, then
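Before turning to the binomial case, the shortcut relation for the variance (mean of the square minus square of the mean) is easy to verify numerically; again a fair die is assumed for illustration:

```python
values = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6

mean = sum(P * u for P, u in zip(probs, values))

# Variance computed directly from the definition ...
var_direct = sum(P * (u - mean) ** 2 for P, u in zip(probs, values))

# ... and via the shortcut <u**2> - <u>**2; the two must agree.
var_shortcut = sum(P * u ** 2 for P, u in zip(probs, values)) - mean ** 2

std_dev = var_shortcut ** 0.5   # the standard deviation
```

For a fair die the variance comes out to 35/12, whichever route is taken.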
the probability of obtaining n1 occurrences of outcome 1 in N observations is

PN(n1) = N!/(n1! (N − n1)!) p^{n1} q^{N−n1}.   (2.38)

Thus, the mean number of occurrences of outcome 1 in N observations is given by

⟨n1⟩ = Σ_{n1=0}^{N} PN(n1) n1 = Σ_{n1=0}^{N} N!/(n1! (N − n1)!) p^{n1} q^{N−n1} n1.   (2.39)

This is a rather nasty looking expression! However, we can see that if the final factor n1 were absent, it would just reduce to the binomial expansion, which we know how to sum. We can take advantage of this fact by using a rather elegant mathematical sleight of hand. Observe that since

n1 p^{n1} ≡ p (∂/∂p) p^{n1},   (2.40)

the summation can be rewritten as

Σ_{n1=0}^{N} N!/(n1! (N − n1)!) p^{n1} q^{N−n1} n1 ≡ p (∂/∂p) [Σ_{n1=0}^{N} N!/(n1! (N − n1)!) p^{n1} q^{N−n1}].   (2.41)

This is just algebra, and has nothing to do with probability theory. The term in square brackets is the familiar binomial expansion, and can be written more succinctly as (p + q)^N. Thus,

Σ_{n1=0}^{N} N!/(n1! (N − n1)!) p^{n1} q^{N−n1} n1 ≡ p (∂/∂p) (p + q)^N ≡ p N (p + q)^{N−1}.   (2.42)

However, p + q = 1 for the case in hand, so

⟨n1⟩ = N p.   (2.43)

In fact, we could have guessed this result. By definition, the probability p is the number of occurrences of the outcome 1 divided by the number of trials, in the limit as the number of trials goes to infinity:

p = lim_{N→∞} n1/N.   (2.44)
If we think carefully, however, we can see that taking the limit as the number of trials goes to infinity is equivalent to taking the mean value, so that

p = lim_{N→∞} n1/N = ⟨n1⟩/N.   (2.45)

But, this is just a simple rearrangement of Eq. (2.43).

Let us now calculate the variance of n1. Recall that

⟨(∆n1)²⟩ = ⟨(n1)²⟩ − ⟨n1⟩².   (2.46)

We already know ⟨n1⟩, so we just need to calculate ⟨(n1)²⟩. This average is written

⟨(n1)²⟩ = Σ_{n1=0}^{N} N!/(n1! (N − n1)!) p^{n1} q^{N−n1} (n1)².   (2.47)

The sum can be evaluated using a simple extension of the mathematical trick we used earlier to evaluate ⟨n1⟩. Since

(n1)² p^{n1} ≡ [p (∂/∂p)]² p^{n1},   (2.48)

then

Σ_{n1=0}^{N} N!/(n1! (N − n1)!) p^{n1} q^{N−n1} (n1)² ≡ [p (∂/∂p)]² Σ_{n1=0}^{N} N!/(n1! (N − n1)!) p^{n1} q^{N−n1} ≡ [p (∂/∂p)]² (p + q)^N ≡ p (∂/∂p) [p N (p + q)^{N−1}] ≡ p [N (p + q)^{N−1} + p N (N − 1) (p + q)^{N−2}].   (2.49)

Using p + q = 1 yields

⟨(n1)²⟩ = p [N + p N (N − 1)] = N p [1 + p N − p] = (N p)² + N p q = ⟨n1⟩² + N p q,   (2.50)

since ⟨n1⟩ = N p. It follows that the variance of n1 is given by

⟨(∆n1)²⟩ = ⟨(n1)²⟩ − ⟨n1⟩² = N p q.   (2.51)
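The results ⟨n1⟩ = N p (2.43) and variance N p q (2.51) can be cross-checked by summing over the distribution directly, without the differentiation trick (the values of N and p below are arbitrary):

```python
import math

def binomial_mean_var(N, p):
    """Mean and variance of n1, summed directly over the binomial distribution."""
    q = 1 - p
    probs = [math.comb(N, n) * p ** n * q ** (N - n) for n in range(N + 1)]
    mean = sum(P * n for n, P in enumerate(probs))
    mean_sq = sum(P * n ** 2 for n, P in enumerate(probs))
    return mean, mean_sq - mean ** 2   # variance via <n1**2> - <n1>**2

mean, var = binomial_mean_var(20, 0.3)  # expect N p = 6 and N p q = 4.2
```

The brute-force sums reproduce the closed forms, which is reassuring: the sleight of hand with p ∂/∂p is purely a labor-saving device.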
The standard deviation of n1 is just the square root of the variance, so

$$\Delta^{*} n_1 = \sqrt{N\,p\,q}. \tag{2.52}$$

Recall that this quantity is essentially the width of the range over which n1 is distributed around its mean value. The relative width of the distribution is characterized by

$$\frac{\Delta^{*} n_1}{\overline{n_1}} = \frac{\sqrt{N\,p\,q}}{N\,p} = \sqrt{\frac{q}{p}}\,\frac{1}{\sqrt{N}}. \tag{2.53}$$

It is clear from this formula that the relative width decreases like N^{-1/2} with increasing N. So, the greater the number of trials, the more likely it is that an observation of n1 will yield a result which is relatively close to the mean value $\overline{n_1}$. This is a very important result.

2.9 The Gaussian distribution

Consider a very large number of observations, N ≫ 1, made on a system with two possible outcomes. Suppose that the probability of outcome 1 is sufficiently large that the average number of occurrences after N observations is much greater than unity:

$$\overline{n_1} = N\,p \gg 1. \tag{2.54}$$

In this limit, the standard deviation of n1 is also much greater than unity,

$$\Delta^{*} n_1 = \sqrt{N\,p\,q} \gg 1, \tag{2.55}$$

implying that there are very many probable values of n1 scattered about the mean value $\overline{n_1}$. This suggests that the probability of obtaining n1 occurrences of outcome 1 does not change significantly in going from one possible value of n1 to an adjacent value:

$$\frac{|P_N(n_1+1) - P_N(n_1)|}{P_N(n_1)} \ll 1. \tag{2.56}$$

In this situation, it is useful to regard the probability as a smooth function of n1. Let n be a continuous variable which is interpreted as the number of occurrences
of outcome 1 (after N observations) whenever it takes on a positive integer value. The probability that n lies between n and n + dn is defined

$$P(n, n+dn) = P(n)\,dn, \tag{2.57}$$

where P(n) is called the probability density, and is independent of dn. The probability can be written in this form because P(n, n+dn) can always be expanded as a Taylor series in dn, and must go to zero as dn → 0. We can write

$$\int_{n_1-1/2}^{\,n_1+1/2} P(n)\,dn = P_N(n_1), \tag{2.58}$$

which is equivalent to smearing out the discrete probability P_N(n1) over the range n1 ± 1/2. Given Eq. (2.56), the above relation can be approximated

$$P(n) \simeq P_N(n) = \frac{N!}{n!\,(N-n)!}\,p^{\,n} q^{\,N-n}. \tag{2.59}$$

For large N, the relative width of the probability distribution function is small:

$$\frac{\Delta^{*} n_1}{\overline{n_1}} = \sqrt{\frac{q}{p}}\,\frac{1}{\sqrt{N}} \ll 1. \tag{2.60}$$

This suggests that P(n) is strongly peaked around the mean value $n = \overline{n_1}$. Suppose that ln P(n) attains its maximum value at $n = \tilde{n}$ (where we expect $\tilde{n} \simeq \overline{n_1}$). Let us Taylor expand ln P around $n = \tilde{n}$. Note that we expand the slowly varying function ln P(n), instead of the rapidly varying function P(n), because the Taylor expansion of P(n) does not converge sufficiently rapidly in the vicinity of $n = \tilde{n}$ to be useful. We can write

$$\ln P(\tilde{n} + \eta) \simeq \ln P(\tilde{n}) + \eta\,B_1 + \frac{\eta^2}{2}\,B_2 + \cdots, \tag{2.61}$$

where

$$B_k = \left.\frac{d^k \ln P}{dn^k}\right|_{n=\tilde{n}}. \tag{2.62}$$

By definition,

$$B_1 = 0, \tag{2.63}$$

$$B_2 < 0, \tag{2.64}$$
if $n = \tilde{n}$ corresponds to the maximum value of ln P(n).

It follows from Eq. (2.59) that

$$\ln P = \ln N! - \ln n! - \ln\,(N-n)! + n \ln p + (N-n) \ln q. \tag{2.65}$$

If n is a large integer, such that n ≫ 1, then ln n! is almost a continuous function of n, since ln n! changes by only a relatively small amount when n is incremented by unity. Hence,

$$\frac{d \ln n!}{dn} \simeq \ln\,(n+1)! - \ln n! = \ln\!\left[\frac{(n+1)!}{n!}\right] = \ln\,(n+1), \tag{2.66}$$

giving

$$\frac{d \ln n!}{dn} \simeq \ln n, \tag{2.67}$$

for n ≫ 1. The integral of this relation,

$$\ln n! \simeq n \ln n - n + \mathcal{O}(1), \tag{2.68}$$

valid for n ≫ 1, is called Stirling's approximation, after the Scottish mathematician James Stirling who first obtained it in 1730.

According to Eq. (2.65),

$$B_1 = -\ln \tilde{n} + \ln\,(N-\tilde{n}) + \ln p - \ln q. \tag{2.69}$$

Hence, if B_1 = 0 then

$$(N - \tilde{n})\,p = \tilde{n}\,q, \tag{2.70}$$

giving

$$\tilde{n} = N\,p = \overline{n_1}, \tag{2.71}$$

since p + q = 1. Thus, the maximum of ln P(n) occurs exactly at the mean value of n, which equals $\overline{n_1}$.

Further differentiation of Eq. (2.65) yields

$$B_2 = -\frac{1}{\tilde{n}} - \frac{1}{N-\tilde{n}} = -\frac{1}{N\,p} - \frac{1}{N\,(1-p)} = -\frac{1}{N\,p\,q}. \tag{2.72}$$
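As an aside, Stirling's approximation (2.68) is easy to test numerically. In the sketch below (standard-library Python; the sample values of n are arbitrary), the log-gamma function supplies the exact value of ln n!:

```python
from math import lgamma, log

# Compare the exact ln n! (via the log-gamma function, since
# lgamma(n + 1) = ln n!) with Stirling's approximation
# ln n! ≈ n ln n - n, Eq. (2.68).
for n in (10, 100, 10000):
    exact = lgamma(n + 1)          # ln n!
    stirling = n * log(n) - n
    rel_err = (exact - stirling) / exact
    print(n, rel_err)              # relative error shrinks as n grows
```

The neglected O(1) terms matter at n = 10, but by n = 10000 the relative error is below one part in ten thousand, which is why the approximation is so useful for the astronomically large factorials of statistical physics.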
In obtaining the final form of Eq. (2.72), p + q = 1 has again been used. Note that B_2 < 0, as required. The above relation can also be written

$$B_2 = -\frac{1}{(\Delta^{*} n_1)^2}. \tag{2.73}$$

It follows from the above that the Taylor expansion of ln P can be written

$$\ln P(\overline{n_1} + \eta) \simeq \ln P(\overline{n_1}) - \frac{\eta^2}{2\,(\Delta^{*} n_1)^2} + \cdots. \tag{2.74}$$

Taking the exponential of both sides yields

$$P(n) \simeq P(\overline{n_1}) \exp\!\left[-\frac{(n-\overline{n_1})^2}{2\,(\Delta^{*} n_1)^2}\right]. \tag{2.75}$$

The constant $P(\overline{n_1})$ is most conveniently fixed by making use of the normalization condition

$$\sum_{n_1=0}^{N} P_N(n_1) = 1, \tag{2.76}$$

which translates to

$$\int_0^N P(n)\,dn \simeq 1 \tag{2.77}$$

for a continuous distribution function. Since we only expect P(n) to be significant when n lies in the relatively narrow range $\overline{n_1} \pm \Delta^{*} n_1$, the limits of integration in the above expression can be replaced by ±∞ with negligible error. Thus,

$$P(\overline{n_1}) \int_{-\infty}^{\infty} \exp\!\left[-\frac{(n-\overline{n_1})^2}{2\,(\Delta^{*} n_1)^2}\right] dn = P(\overline{n_1})\,\sqrt{2}\,\Delta^{*} n_1 \int_{-\infty}^{\infty} \exp(-x^2)\,dx \simeq 1. \tag{2.78}$$

As is well known,

$$\int_{-\infty}^{\infty} \exp(-x^2)\,dx = \sqrt{\pi}, \tag{2.79}$$

so it follows from the normalization condition (2.78) that

$$P(\overline{n_1}) \simeq \frac{1}{\sqrt{2\pi}\,\Delta^{*} n_1}. \tag{2.80}$$
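As a sanity check on Eq. (2.80), the sketch below (plain Python; N and p are arbitrary illustrative values) compares the exact binomial probability at the mean with the predicted peak height $1/(\sqrt{2\pi}\,\Delta^{*} n_1)$, and also evaluates the relative probability density $\exp(-k^2/2)$ at k standard deviations from the mean:

```python
from math import comb, exp, pi, sqrt

N, p = 1000, 0.5                      # arbitrary illustrative values
q = 1 - p
mean, sigma = N * p, sqrt(N * p * q)  # mean and standard deviation

# Exact binomial probability at the peak, n1 = N p.
n = int(mean)
peak_exact = comb(N, n) * p**n * q**(N - n)

# Predicted peak height, Eq. (2.80).
peak_gauss = 1.0 / (sqrt(2 * pi) * sigma)

print(peak_exact, peak_gauss)         # agree to better than 0.1%

# Relative density k standard deviations from the mean: exp(-k^2/2).
for k in (1, 2, 3):
    print(k, exp(-k**2 / 2))          # ≈ 0.607, 0.135, 0.011
```

The last three numbers are the 61%, 13.5%, and roughly 1% figures quoted in the discussion of the bell curve below Eq. (2.81).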
In other words. the probability density is about 13.. and then ﬁt a continuous curve through the discrete points thus obtained.9 The Gaussian distribution 2 PROBABILITY THEORY Finally. This curve would be equivalent to the continuous probability density curve P(n).e. At two standard deviations away from the mean value.81).82) under this transformation. and is also symmetric about this point. n1 is almost certain to lie in the relatively narrow range n1 ± 3 ∆∗ n1 . Likewise. who discovered it whilst investigating the distribution of errors in measurements.81) P(n) √ exp 2 (∆∗ n1 )2 2π ∆∗ n1 This is the famous Gaussian distribution function.83) . where n is the continuous version of n1 . Suppose we were to plot the probability PN (n1 ) against the integer variable n1 . In the above analysis. we have gone from a discrete probability function PN (n1 ) to a continuous probability density P(n). named after the German mathematician Carl Friedrich Gauss. the probability density is about 61% of its peak value. (2. This is a very wellknown result. the evaluations of the mean and variance of the distribution are written N n1 = n1 =0 PN (n1 ) n1 25 −∞ ∞ P(n) n dn. we obtain 1 (n − n1 )2 − . (2. when plotted with the appropriate ratio of vertical to horizontal scalings. We conclude that there is very little chance indeed that n1 lies more than about three standard deviations away from its mean value. at three standard deviations away from the mean value.2. n = n1 ± ∆∗ n1 . The Gaussian distribution is only valid in the limits N 1 and n1 1. the probability density is only about 1% of its peak value.5% of its peak value. the Gaussian probability density curve looks rather like the outline of a bell centred on n = n1 . Finally. this curve is sometimes called a bell curve. Hence. i. At one standard deviation away from the mean value. According to Eq. In fact. (2. the probability density attains its maximum value when n equals the mean of n1 . 
Likewise, the normalization condition becomes

$$1 = \sum_{n_1=0}^{N} P_N(n_1) \simeq \int_{-\infty}^{\infty} P(n)\,dn, \tag{2.83}$$
and the evaluation of the variance becomes

$$\overline{(\Delta n_1)^2} \equiv (\Delta^{*} n_1)^2 = \sum_{n_1=0}^{N} P_N(n_1)\,(n_1 - \overline{n_1})^2 \simeq \int_{-\infty}^{\infty} P(n)\,(n - \overline{n_1})^2\,dn. \tag{2.84}$$

These results follow as simple generalizations of previously established results for the discrete function P_N(n1). The limits of integration in the above expressions can be approximated as ±∞ because P(n) is only non-negligible in a relatively narrow range of n. Finally, it is easily demonstrated that Eqs. (2.82)–(2.84) are indeed true by substituting in the Gaussian probability density, Eq. (2.81), and then performing a few elementary integrals.

2.10 The central limit theorem

Now, you may be thinking that we got a little carried away in our discussion of the Gaussian distribution function. After all, this distribution only seems to be relevant to two-state systems. In fact, as we shall see, the Gaussian distribution is of crucial importance to statistical physics because, under certain circumstances, it applies to all systems.

Let us briefly review how we obtained the Gaussian distribution function in the first place. We started from a very simple system with only two possible outcomes. Of course, the probability distribution function (for n1) for this system did not look anything like a Gaussian. However, when we combined very many of these simple systems together, to produce a complicated system with a great number of possible outcomes, we found that the resultant probability distribution function (for n1) reduced to a Gaussian in the limit as the number of simple systems tended to infinity. We started from a two outcome system because it was easy to calculate the final probability distribution function when a finite number of such systems were combined together. Clearly, if we had started from a more complicated system then this calculation would have been far more difficult.

Let me now tell you something which is quite astonishing! Suppose that we start from any system, with any distribution function (for some measurable quantity x). If we combine a sufficiently large number of such systems together, the
resultant distribution function (for x) is always Gaussian, provided that a sufficiently large number of statistically independent observations are made. This proposition is known as the central limit theorem. As far as Physics is concerned, it is one of the most important theorems in the whole of mathematics. Unfortunately, the central limit theorem is notoriously difficult to prove. A somewhat restricted proof is presented in Sections 1.10 and 1.11 of Reif.

The central limit theorem guarantees that the probability distribution of any measurable quantity is Gaussian, provided that a sufficiently large number of statistically independent observations are made. We can, therefore, confidently predict that Gaussian distributions are going to crop up all over the place in statistical thermodynamics.
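The theorem is easy to see in action. The sketch below (plain Python; the choice of a uniform distribution, the sample counts, and the seed are all arbitrary) adds together many uniformly distributed variables, whose individual distribution looks nothing like a Gaussian, and checks that about 68% of the sums land within one standard deviation of the mean, as a Gaussian requires:

```python
import random

random.seed(42)                      # arbitrary seed, for reproducibility
M, n_terms = 20000, 50               # M samples, each a sum of n_terms uniforms

# Each uniform variable on [0, 1) has mean 1/2 and variance 1/12,
# so the sum has mean n_terms/2 and variance n_terms/12.
sums = [sum(random.random() for _ in range(n_terms)) for _ in range(M)]

mean = sum(sums) / M
sigma = (sum((s - mean)**2 for s in sums) / M) ** 0.5

# Fraction of samples within one standard deviation of the mean;
# for a Gaussian this is erf(1/sqrt(2)) ≈ 0.683.
frac = sum(abs(s - mean) < sigma for s in sums) / M
print(mean, sigma, frac)
```

Repeating the experiment with any other starting distribution of finite variance gives the same Gaussian behaviour, which is precisely the content of the central limit theorem.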
3 Statistical mechanics

3.1 Introduction

Let us now analyze the internal motions of a many particle system using probability theory. This subject area is known as statistical mechanics.

3.2 Specification of the state of a many particle system

How do we determine the state of a many particle system? Well, let us, first of all, consider the simplest possible many particle system, which consists of a single spinless particle moving classically in one dimension. Assuming that we know the particle's equation of motion, the state of the system is fully specified once we simultaneously measure the particle's position q and momentum p. In principle, if we know q and p then we can calculate the state of the system at all subsequent times using the equation of motion. In practice, it is impossible to specify q and p exactly, since there is always an intrinsic error in any experimental measurement.

Consider the time evolution of q and p. This can be visualized by plotting the point (q, p) in the q-p plane. This plane is generally known as phase-space. In general, the point (q, p) will trace out some very complicated pattern in phase-space. Suppose that we divide phase-space into rectangular cells of uniform dimensions δq and δp. Here, δq is the intrinsic error in the position measurement, and δp the intrinsic error in the momentum measurement. The "area" of each cell is

$$\delta q\,\delta p = h_0, \tag{3.1}$$

where h_0 is a small constant having the dimensions of angular momentum. The coordinates q and p can now be conveniently specified by indicating the cell in phase-space into which they plot at any given time. This procedure automatically ensures that we do not attempt to specify q and p to an accuracy greater than our experimental error, which would clearly be pointless.
Let us now consider a single spinless particle moving in three dimensions. In order to specify the state of the system we now need to know three q-p pairs: i.e., qx-px, qy-py, and qz-pz. Incidentally, the number of q-p pairs needed to specify the state of the system is usually called the number of degrees of freedom of the system. Thus, a single particle moving in one dimension constitutes a one degree of freedom system, whereas a single particle moving in three dimensions constitutes a three degree of freedom system.

Consider the time evolution of q and p, where q = (qx, qy, qz), etc. This can be visualized by plotting the point (q, p) in the six dimensional q-p phase-space. Suppose that we divide the qx-px plane into rectangular cells of uniform dimensions δq and δp, and do likewise for the qy-py and qz-pz planes. Here, δq and δp are again the intrinsic errors in our measurements of position and momentum, respectively. This is equivalent to dividing phase-space up into regular six dimensional cells of volume h_0^3. The coordinates q and p can now be conveniently specified by indicating the cell in phase-space into which they plot at any given time. Again, this procedure automatically ensures that we do not attempt to specify q and p to an accuracy greater than our experimental error.

Finally, let us consider a system consisting of N spinless particles moving classically in three dimensions. In order to specify the state of the system, we need to specify a large number of q-p pairs. The requisite number is simply the number of degrees of freedom, f. For the present case, f = 3N. Thus, phase-space (i.e., the space of all the q-p pairs) now possesses 2f = 6N dimensions. Consider a particular pair of conjugate coordinates, qi and pi. As before, we divide the qi-pi plane into rectangular cells of uniform dimensions δq and δp. This is equivalent to dividing phase-space into regular 2f dimensional cells of volume h_0^f. The state of the system is specified by indicating which cell it occupies in phase-space at any given time.

In principle, we can specify the state of the system to arbitrary accuracy by taking the limit h_0 → 0. In reality, we know from quantum mechanics that it is impossible to simultaneously measure a coordinate qi and its conjugate momentum pi to greater accuracy than δqi δpi = ℏ. This implies that

$$h_0 \geq \hbar. \tag{3.2}$$
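The bookkeeping described above can be sketched in a few lines of Python (the cell sizes and sample points are arbitrary illustrative values): two measurements that differ by less than the experimental errors plot into the same phase-space cell, and therefore specify one and the same state.

```python
# Divide the q-p plane into cells of area delta_q * delta_p = h0,
# as in Eq. (3.1). Units and values here are arbitrary.
delta_q, delta_p = 1e-3, 1e-3        # intrinsic measurement errors

def cell(q, p):
    """Return the indices of the phase-space cell into which (q, p) plots."""
    return (int(q // delta_q), int(p // delta_p))

# Two points closer together than the measurement errors land in
# the same cell, i.e. they represent the same state of the system.
print(cell(0.12345, 0.67890))
print(cell(0.12399, 0.67850))        # same cell as above
```

Shrinking delta_q and delta_p refines the description, but, as Eq. (3.2) states, quantum mechanics forbids taking the cell area below ℏ.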
In other words, the uncertainty principle sets a lower limit on how finely we can chop up classical phase-space.

In quantum mechanics we can specify the state of the system by giving its wavefunction at time t,

$$\psi(q_1, \cdots, q_f, s_1, \cdots, s_g, t), \tag{3.3}$$

where f is the number of translational degrees of freedom, and g the number of internal (e.g., spin) degrees of freedom. For instance, if the system consists of N spin-one-half particles then there will be 3N translational degrees of freedom, and N spin degrees of freedom (i.e., the spin of each particle can either point up or down along the z-axis). Alternatively, if the system is in a stationary state (i.e., an eigenstate of the Hamiltonian) then we can just specify f + g quantum numbers. Either way, the future time evolution of the wavefunction is fully determined by Schrödinger's equation.

In reality, this approach does not work because the Hamiltonian of the system is only known approximately. Typically, we are dealing with a system consisting of many weakly interacting particles. We usually know the Hamiltonian for completely non-interacting particles, but the component of the Hamiltonian associated with particle interactions is either impossibly complicated, or not very well known (often, it is both!). We can define approximate stationary eigenstates using the Hamiltonian for non-interacting particles. The state of the system is then specified by the quantum numbers identifying these eigenstates. In the absence of particle interactions, if the system starts off in a stationary state then it stays in that state for ever, so its quantum numbers never change. The interactions allow the system to make transitions between different "stationary" states, causing its quantum numbers to change in time.

3.3 The principle of equal a priori probabilities

We now know how to specify the instantaneous state of a many particle system. In principle, such a system is completely deterministic. Once we know the initial state and the equations of motion (or the Hamiltonian) we can evolve the
system forward in time and, thereby, determine all future states. In reality, it is quite impossible to specify the initial state or the equations of motion to sufficient accuracy for this method to have any chance of working. Furthermore, even if it were possible, it would still not be a practical proposition to evolve the equations of motion. Remember that we are typically dealing with systems containing Avogadro's number of particles: i.e., about 10^24 particles. We cannot evolve 10^24 simultaneous differential equations! Even if we could, we would not want to. After all, we are not particularly interested in the motions of individual particles. What we really want is statistical information regarding the motions of all particles in the system.

Clearly, what is required here is a statistical treatment of the problem. Instead of focusing on a single system, let us proceed in the usual manner and consider a statistical ensemble consisting of a large number of identical systems. In general, these systems are distributed over many different states at any given time. In order to evaluate the probability that the system possesses a particular property, we merely need to find the number of systems in the ensemble which exhibit this property, and then divide by the total number of systems, in the limit as the latter number tends to infinity.

We can usually place some general constraints on the system. Typically, we know the total energy E, the total volume V, and the total number of particles N. To be more honest, we can only really say that the total energy lies between E and E + δE, where δE is an experimental error. Thus, we only need concern ourselves with those systems in the ensemble exhibiting states which are consistent with the known constraints. We call these the states accessible to the system. In general, there are a great many such states.

We now need to calculate the probability of the system being found in each of its accessible states. Well, perhaps "calculate" is the wrong word. The only way we could calculate these probabilities would be to evolve all of the systems in the ensemble and observe how long on average they spend in each accessible state. But, as we have already mentioned, such a calculation is completely out of the question. So what do we do instead? Well, we effectively guess the probabilities.

Let us consider an isolated system in equilibrium. In this situation, we would
expect the probability of the system being found in one of its accessible states to be independent of time. This implies that the statistical ensemble does not evolve with time. Individual systems in the ensemble will constantly change state, but the average number of systems in any given state should remain constant. Thus, all macroscopic parameters describing the system, such as the energy and the volume, should also remain constant. There is nothing in the laws of mechanics which would lead us to suppose that the system will be found more often in one of its accessible states than in another. We assume, therefore, that the system is equally likely to be found in any of its accessible states. This is called the assumption of equal a priori probabilities, and lies at the very heart of statistical mechanics.

In fact, we use assumptions like this all of the time without really thinking about them. Suppose that we were asked to pick a card at random from a well-shuffled pack. I think that most people would accept that we have an equal probability of picking any card in the pack. There is nothing which would favour one particular card over all of the others. So, since there are fifty-two cards in a normal pack, we would expect the probability of picking the Ace of Spades, say, to be 1/52. We could now place some constraints on the system. For instance, we could only count red cards, in which case the probability of picking the Ace of Hearts, say, would be 1/26, by the same reasoning. In both cases, we have used the principle of equal a priori probabilities. People really believe that this principle applies to games of chance such as cards, dice, and roulette. In fact, if the principle were found not to apply to a particular game most people would assume that the game was "crooked." But, imagine trying to prove that the principle actually does apply to a game of cards. This would be very difficult! We would have to show that the way most people shuffle cards is effective at randomizing their order. A convincing study would have to be part mathematics and part psychology!

In statistical mechanics, we treat a many particle system a bit like an extremely large game of cards. Each accessible state corresponds to one of the cards in the pack. The interactions between particles cause the system to continually change state. This is equivalent to constantly shuffling the pack. Finally, an observation of the state of the system is like picking a card at random from the pack. The principle of equal a priori probabilities then boils down to saying that we have an
equal chance of choosing any particular card.

It is, unfortunately, impossible to prove with mathematical rigor that the principle of equal a priori probabilities applies to many-particle systems. Over the years, many people have attempted this proof, and all have failed miserably. Not surprisingly, therefore, statistical mechanics was greeted with a great deal of scepticism when it was first proposed just over one hundred years ago. One of its main proponents, Ludwig Boltzmann, got so fed up with all of the criticism that he eventually threw himself off a bridge! Nowadays, statistical mechanics is completely accepted into the canon of physics. The reason for this is quite simple: it works!

It is actually possible to formulate a reasonably convincing scientific case for the principle of equal a priori probabilities. To achieve this we have to make use of the so-called H theorem.

3.4 The H theorem

Consider a system of weakly interacting particles. In quantum mechanics we can write the Hamiltonian for such a system as

$$H = H_0 + H_1, \tag{3.4}$$

where H_0 is the Hamiltonian for completely non-interacting particles, and H_1 is a small correction due to the particle interactions. We can define approximate stationary eigenstates of the system using H_0. Thus,

$$H_0\,\Psi_r = E_r\,\Psi_r, \tag{3.5}$$

where the index r labels a state of energy E_r and eigenstate Ψ_r. In general, there are many different eigenstates with the same energy: these are called degenerate states. For example, consider N non-interacting spinless particles of mass m confined in a cubic box of dimension L. According to standard wave-mechanics, the energy
levels of the ith particle are given by

$$e_i = \frac{\hbar^2 \pi^2}{2\,m\,L^2}\left(n_{i1}^{\,2} + n_{i2}^{\,2} + n_{i3}^{\,2}\right), \tag{3.6}$$

where n_{i1}, n_{i2}, and n_{i3} are three (positive integer) quantum numbers. The overall energy of the system is the sum of the energies of the individual particles, so that for a general state r

$$E_r = \sum_{i=1}^{N} e_i. \tag{3.7}$$

The overall state of the system is thus specified by 3N quantum numbers (i.e., three quantum numbers per particle). There are clearly very many different arrangements of these quantum numbers which give the same overall energy.

Consider, now, a statistical ensemble of systems made up of weakly interacting particles. Suppose that this ensemble is initially very far from equilibrium. For instance, the systems in the ensemble might only be distributed over a very small subset of their accessible states. If each system starts off in a particular stationary state (i.e., with a particular set of quantum numbers) then, in the absence of particle interactions, it will remain in that state for ever. Hence, the ensemble will always stay far from equilibrium, and the principle of equal a priori probabilities will never be applicable. In reality, particle interactions cause each system in the ensemble to make transitions between its accessible "stationary" states. This allows the overall state of the ensemble to change in time.

Let us label the accessible states of our system by the index r. We can ascribe a time dependent probability P_r(t) of finding the system in a particular approximate stationary state r at time t. Here, P_r(t) is proportional to the number of systems in the ensemble in state r at time t. In general, P_r is time dependent because the ensemble is evolving towards an equilibrium state. We assume that the probabilities are properly normalized, so that the sum over all accessible states always yields

$$\sum_r P_r(t) = 1. \tag{3.8}$$

Small interactions between particles cause transitions between the approximate stationary states of the system.
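Returning briefly to Eq. (3.6): the degeneracies mentioned above are easy to count. The sketch below (plain Python; the cutoff nmax is an arbitrary choice) tallies how many triples of positive-integer quantum numbers share the same value of n1² + n2² + n3², i.e. the same single-particle energy in units of ℏ²π²/2mL²:

```python
from itertools import product

# Count the (n1, n2, n3) triples sharing each value of
# n1^2 + n2^2 + n3^2, which sets the energy of Eq. (3.6)
# in units of hbar^2 pi^2 / (2 m L^2).
nmax = 10                                     # arbitrary cutoff
degeneracy = {}
for n1, n2, n3 in product(range(1, nmax + 1), repeat=3):
    s = n1 * n1 + n2 * n2 + n3 * n3
    degeneracy[s] = degeneracy.get(s, 0) + 1

print(degeneracy[3])    # 1: only (1,1,1)
print(degeneracy[6])    # 3: permutations of (1,1,2)
print(degeneracy[14])   # 6: permutations of (1,2,3)
```

Even for one particle the degeneracy grows quickly with energy; for N particles, whose total energy (3.7) is a sum of 3N such squares, the number of arrangements sharing a given E_r becomes astronomically large.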
There then exists some transition probability per unit time W_{rs} that a system originally in state r ends up in state s as a result of these interactions. Likewise, there exists a probability per unit time W_{sr} that a system in state s makes a transition to state r. These transition probabilities are meaningful in quantum mechanics provided that the particle interaction strength is sufficiently small, there is a nearly continuous distribution of accessible energy levels, and we consider time intervals which are not too small. These conditions are easily satisfied for the types of systems usually analyzed via statistical mechanics (e.g., nearly ideal gases). One important conclusion of quantum mechanics is that the forward and inverse transition probabilities between two states are the same, so that

$$W_{rs} = W_{sr} \tag{3.9}$$

for any two states r and s. This result follows from the time reversal symmetry of quantum mechanics. On the microscopic scale of individual particles, all fundamental laws of physics (in particular, classical and quantum mechanics) possess this symmetry. So, if a certain motion of particles satisfies the classical equations of motion (or Schrödinger's equation) then the reversed motion, with all particles starting off from their final positions and then retracing their paths exactly until they reach their initial positions, satisfies these equations just as well.

Suppose that we were to "film" a microscopic process, such as two classical particles approaching one another, colliding, and moving apart. We could then gather an audience together and show them the film. To make things slightly more interesting we could play it either forwards or backwards. Because of the time reversal symmetry of classical mechanics, the audience would not be able to tell which way the film was running (unless we told them!). In both cases, the film would show completely plausible physical events.

We can play the same game for a quantum process. For instance, we could "film" a group of photons impinging on some atoms. Occasionally, one of the atoms will absorb a photon and make a transition to an "excited" state (i.e., a state with higher than normal energy). We could easily estimate the rate constant for this process by watching the film carefully. If we play the film backwards then it will appear to show excited atoms occasionally emitting a photon and decaying back to their unexcited state. If quantum mechanics possesses time reversal symmetry (which it certainly does!) then both films should appear equally plausible. This means that the rate constant for the absorption of a photon to produce an excited state must be the same as the rate constant for the excited state to decay by the emission of a photon. Otherwise, in the backwards film the excited atoms would appear to emit photons at the wrong rate, and we could then tell that the film was being played backwards. It follows, therefore, that as a consequence of time reversal symmetry, the rate constant for any process in quantum mechanics must equal the rate constant for the inverse process.
The probability P_r of finding the systems in the ensemble in a particular state r changes with time for two reasons. Firstly, systems in the state r can make transitions to other states such as s. The rate at which this occurs is P_r, the probability that the systems are in the state r to begin with, times the rate constant of the transition W_{rs}. Secondly, systems in another state s can make transitions to the state r. The rate at which this occurs is clearly P_s times W_{sr}. We can write a simple differential equation for the time evolution of P_r:

$$\frac{dP_r}{dt} = \sum_{s\neq r} P_s\,W_{sr} - \sum_{s\neq r} P_r\,W_{rs}, \tag{3.10}$$

or

$$\frac{dP_r}{dt} = \sum_{s} W_{rs}\,(P_s - P_r), \tag{3.11}$$

where use has been made of the symmetry condition (3.9). The summation is over all accessible states.

Consider now the quantity H (from which the H theorem derives its name), which is the mean value of ln P_r over all accessible states:

$$H \equiv \overline{\ln P_r} \equiv \sum_r P_r \ln P_r. \tag{3.12}$$

This quantity changes as the individual probabilities P_r vary in time. Straightforward differentiation of the above equation yields

$$\frac{dH}{dt} = \sum_r \left(\frac{dP_r}{dt}\,\ln P_r + \frac{dP_r}{dt}\right) = \sum_r \frac{dP_r}{dt}\,(\ln P_r + 1). \tag{3.13}$$

According to Eq. (3.11), this can be written

$$\frac{dH}{dt} = \sum_r \sum_s W_{rs}\,(P_s - P_r)\,(\ln P_r + 1). \tag{3.14}$$
We can now interchange the dummy summation indices r and s to give

$$\frac{dH}{dt} = \sum_r \sum_s W_{sr}\,(P_r - P_s)\,(\ln P_s + 1). \tag{3.15}$$

We can write dH/dt in a more symmetric form by adding the previous two equations, and making use of Eq. (3.9):

$$\frac{dH}{dt} = -\frac{1}{2}\sum_r \sum_s W_{rs}\,(P_r - P_s)\,(\ln P_r - \ln P_s). \tag{3.16}$$

Note, however, that ln P_r is a monotonically increasing function of P_r. It follows that ln P_r > ln P_s whenever P_r > P_s, and vice versa. Thus, in general, the right-hand side of the above equation is the sum of many negative contributions. Hence, we conclude that

$$\frac{dH}{dt} \leq 0. \tag{3.17}$$

The equality sign only holds in the special case where all accessible states are equally probable, so that P_r = P_s for all r and s. This result is called the H theorem, and was first proved by the unfortunate Professor Boltzmann.

The H theorem tells us that if an isolated system is initially not in equilibrium then it will evolve under the influence of particle interactions in such a manner that the quantity H always decreases. This process will continue until H reaches its minimum possible value, at which point dH/dt = 0, and there is no further evolution of the system. According to Eq. (3.16), in this final equilibrium state the system is equally likely to be found in any one of its accessible states. This is, of course, the situation predicted by the principle of equal a priori probabilities.

You may be wondering why the above argument does not constitute a mathematically rigorous proof that the principle of equal a priori probabilities applies to many particle systems. The answer is that we tacitly made an unwarranted assumption: i.e., we assumed that the probability of the system making a transition from some state r to another state s is independent of the past history of the system. In general, this is not the case in physical systems, although there are many situations in which it is a pretty good approximation. Thus, the epistemological status of the principle of equal a priori probabilities is that it is plausible,
be governed by the principle of equal a priori probabilities. and depends in detail on the nature of the interparticle interactions. irrespective of its initial state. the observed velocity distribution of the stars in the vicinity of the Sun is not governed by this principle. which is only about 1010 years. such interactions take place very infrequently.” is an isolated dynamical system made up of about 1011 stars. the “Milky Way” has not been around long enough to reach an equilibrium state. The relaxation time for the air in a typical classroom is very much less than one second.5 The relaxation time 3 STATISTICAL MECHANICS but remains unproven. 38 . the ultimate justiﬁcation for this principle is empirical: i. the “Milky Way.3.e. The principle of equal a priori probabilities is only valid for equilibrium states. At ﬁrst sight. despite its great age. Stars in the Galaxy interact via occasional “near miss” events in which they exchange energy and momentum. the “Milky Way” would seem to be an ideal system on which to test out the ideas of statistical mechanics. It is clear that. it can be thought of as a selfgravitating “gas” of stars. it leads to theoretical predictions which are in accordance with experimental observations. The best estimate for the relaxation time of the “Milky Way” is about 1013 years.. As we have already mentioned. In fact. The typical timescale for this process is called the relaxation time. and should. Our galaxy. Unfortunately. therefore.5 The relaxation time The H theorem guarantees that an isolated many particle system will eventually reach equilibrium. or last interacted with the outside world. In fact. because there is an awful lot of empty space between stars. This should be compared with the estimated age of the Galaxy. 3. Not surprisingly. Consider another example. This suggests that the principle of equal a priori probabilities cannot be used to describe stellar dynamics. this is known to be the case. 
It follows that we can only safely apply this principle to systems which have remained undisturbed for many relaxation times since they were setup. This suggests that such air is probably in equilibrium most of the time. Actual collisions are very rare indeed.
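The "many relaxation times" criterion can be expressed as a one-line check. The following sketch (the variable names and the factor-of-ten safety margin are my own choices; the timescales are the order-of-magnitude estimates quoted above) compares the two systems just discussed:

```python
# Quick check of the equilibrium criterion: a system can be treated with the
# principle of equal a priori probabilities only if the time it has been left
# undisturbed greatly exceeds its relaxation time.

def in_equilibrium(age, relaxation_time, margin=10.0):
    """Return True if the system has been undisturbed for at least
    `margin` relaxation times (a rough rule of thumb)."""
    return age > margin * relaxation_time

# Order-of-magnitude estimates from the text (times in years).
classroom_air = in_equilibrium(age=1.0, relaxation_time=1.0 / 3.15e7)  # ~1 s relaxation
milky_way = in_equilibrium(age=1e10, relaxation_time=1e13)

print(classroom_air)  # True: air equilibrates almost instantly
print(milky_way)      # False: the Galaxy is far younger than its relaxation time
```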
3.6 Reversibility and irreversibility

Previously, we mentioned that on a microscopic level the laws of physics are invariant under time reversal. In other words, microscopic phenomena look physically plausible when run in reverse. We usually say that these phenomena are reversible. What about macroscopic phenomena? Are they reversible? Well, consider an isolated many particle system which starts off far from equilibrium. According to the H theorem, it will evolve towards equilibrium and, as it does so, the macroscopic quantity H will decrease. But, if we run this process backwards the system will appear to evolve away from equilibrium, and the quantity H will increase. This type of behaviour is not physical because it violates the H theorem. So, if we saw a film of a macroscopic process we could very easily tell if it was being run backwards.

For instance, suppose that by some miracle we were able to move all of the Oxygen molecules in the air in some classroom to one side of the room, and all of the Nitrogen molecules to the opposite side. We would not expect this state to persist for very long. Pretty soon the Oxygen and Nitrogen molecules would start to intermingle, and this process would continue until they were thoroughly mixed together throughout the room. This, of course, is the equilibrium state for air. In reverse, this process looks crazy! We would start off from perfectly normal air, and suddenly, for no good reason, the Oxygen and Nitrogen molecules would appear to separate and move to opposite sides of the room. This scenario is not impossible, but, from everything we know about the world around us, it is spectacularly unlikely! We conclude, therefore, that macroscopic phenomena are generally irreversible, because they look "wrong" when run in reverse.

How does the irreversibility of macroscopic phenomena arise? It certainly does not come from the fundamental laws of physics, because these laws are all reversible. In the previous example, the Oxygen and Nitrogen molecules got mixed up by continually scattering off one another. Each individual scattering event would look perfectly reasonable viewed in reverse, but when we add them all together we obtain a process which would look stupid run backwards. How can this be? How can we obtain an irreversible process from the combined effects of very many reversible processes? This is a vitally important question. Unfortunately, we are not quite at the stage where we can formulate a convincing answer. Note, however, that the essential irreversibility of macroscopic phenomena is one of the key results of statistical thermodynamics.

3.7 Probability calculations

The principle of equal a priori probabilities is fundamental to all statistical mechanics, and allows a complete description of the properties of macroscopic systems in equilibrium. In principle, statistical mechanics calculations are very simple. Consider a system in equilibrium which is isolated, so that its total energy is known to have a constant value somewhere in the range E to E + δE. In order to make statistical predictions, we focus attention on an ensemble of such systems, all of which have their energy in this range. Let Ω(E) be the total number of different states of the system with energies in the specified range. Suppose that among these states there are a number Ω(E; y_k) for which some parameter y of the system assumes the discrete value y_k. (This discussion can easily be generalized to deal with a parameter which can assume a continuous range of values.) The principle of equal a priori probabilities tells us that all the Ω(E) accessible states of the system are equally likely to occur in the ensemble. It follows that the probability P(y_k) that the parameter y of the system assumes the value y_k is simply

P(y_k) = Ω(E; y_k) / Ω(E).    (3.18)

Clearly, the mean value of y for the system is given by

ȳ = Σ_k Ω(E; y_k) y_k / Ω(E),    (3.19)

where the sum is over all possible values that y can assume. In the above, it is tacitly assumed that Ω(E) → ∞, which is generally the case in thermodynamic systems. It can be seen that, using the principle of equal a priori probabilities, all calculations in statistical mechanics reduce to simply counting states, subject to various constraints. In principle, this is fairly straightforward. In practice, problems arise if the constraints become too complicated.
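As an illustration of Eqs. (3.18) and (3.19), consider a toy system (my own example, not from the text): three distinguishable particles, each occupying one of six equally spaced energy levels, isolated so that the total energy is fixed. Every calculation reduces to counting the accessible states:

```python
from itertools import product
from fractions import Fraction

# Toy system: three distinguishable particles, each with energy 1..6 (arbitrary
# units), isolated so that the total energy is constrained to E_total = 9.
E_total = 9
levels = range(1, 7)

# Enumerate all accessible microstates; by the principle of equal a priori
# probabilities, each one is equally likely.
states = [s for s in product(levels, repeat=3) if sum(s) == E_total]
omega = len(states)  # Omega(E): total number of accessible states

# Omega(E; y_k): number of states in which particle 1 has energy y_k -- Eq. (3.18).
P = {y: Fraction(sum(1 for s in states if s[0] == y), omega) for y in levels}

# Mean energy of particle 1 -- Eq. (3.19).
y_bar = sum(y * p for y, p in P.items())

print(omega)   # 25 accessible microstates
print(y_bar)   # mean of y: by symmetry, E_total/3 = 3
```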
These problems can usually be overcome with a little mathematical ingenuity. Nevertheless, there is no doubt that this type of calculation is far easier than trying to solve the classical equations of motion (or Schrödinger's equation) directly for a many-particle system.

3.8 Behaviour of the density of states

Consider an isolated system in equilibrium whose volume is V, and whose energy lies in the range E to E + δE. Let Ω(E, V) be the total number of microscopic states which satisfy these constraints. It would be useful if we could estimate how this number typically varies with the macroscopic parameters of the system. The easiest way to do this is to consider a specific example: for instance, an ideal gas made up of spinless monatomic particles. This is a particularly simple example, because for such a gas the particles possess translational but no internal (e.g., rotational, vibrational, or spin) degrees of freedom. By definition, interatomic forces are negligible in an ideal gas. In other words, the individual particles move in an approximately uniform potential. It follows that the energy of the gas is just the total translational kinetic energy of its constituent particles. Thus,

E = (1/2m) Σ_{i=1}^{N} p_i²,    (3.20)

where m is the particle mass, N the total number of particles, and p_i the vector momentum of the ith particle.

Consider the system in the limit in which the energy E of the gas is much greater than the ground-state energy, so that all of the quantum numbers are large. The classical version of statistical mechanics, in which we divide up phase-space into cells of equal volume, is valid in this limit. The number of states Ω(E, V) lying between the energies E and E + δE is simply equal to the number of cells in phase-space contained between these energies. In other words, Ω(E, V) is proportional to the volume of phase-space between these two energies:

Ω(E, V) ∝ ∫_E^{E+δE} d³r_1 ⋯ d³r_N d³p_1 ⋯ d³p_N,    (3.21)
where the integrand is the element of volume of phase-space, with

d³r_i ≡ dx_i dy_i dz_i,    (3.22)
d³p_i ≡ dp_{ix} dp_{iy} dp_{iz},    (3.23)

where (x_i, y_i, z_i) and (p_{ix}, p_{iy}, p_{iz}) are the Cartesian coordinates and momentum components of the ith particle, respectively. The integration is over all coordinates and momenta such that the total energy of the system lies between E and E + δE. For an ideal gas, the total energy E does not depend on the positions of the particles [see Eq. (3.20)]. This means that the integration over the position vectors r_i can be performed immediately. Since each integral over r_i extends over the volume of the container (the particles are, of course, not allowed to stray outside the container), ∫ d³r_i = V. There are N such integrals, so Eq. (3.21) reduces to

Ω(E, V) ∝ V^N χ(E),    (3.24)

where

χ(E) ∝ ∫_E^{E+δE} d³p_1 ⋯ d³p_N    (3.25)

is a momentum space integral which is independent of the volume. The energy of the system can be written

E = (1/2m) Σ_{i=1}^{N} Σ_{α=1}^{3} p_{iα}²,    (3.26)

since p_i² = p_{i1}² + p_{i2}² + p_{i3}², denoting the (x, y, z) components by (1, 2, 3), respectively. The above sum contains 3N square terms. For E = constant, Eq. (3.26) describes the locus of a sphere of radius R(E) = (2mE)^{1/2} in the 3N-dimensional space of the momentum components. Hence, χ(E) is proportional to the volume of momentum phase-space contained in the spherical shell lying between the sphere of radius R(E) and that of slightly larger radius R(E + δE). This volume is proportional to the area of the inner sphere multiplied by δR ≡ R(E + δE) − R(E). Since the area varies like R^{3N−1}, and δR ∝ δE/E^{1/2}, we have

χ(E) ∝ R^{3N−1}/E^{1/2} ∝ E^{3N/2−1}.    (3.27)
Combining this result with Eq. (3.24) yields

Ω(E, V) = B V^N E^{3N/2},    (3.28)

where B is a constant independent of V or E, and where we have also made use of N ≫ 1. Note that, since the number of degrees of freedom of the system is f = 3N, the above relation can be very approximately written

Ω(E, V) ∝ V^f E^f.    (3.29)

In other words, the density of states varies like the extensive macroscopic parameters of the system raised to the power of the number of degrees of freedom. An extensive parameter is one which scales with the size of the system (e.g., the volume). Since thermodynamic systems generally possess a very large number of degrees of freedom, this result implies that the density of states is an exceptionally rapidly increasing function of the energy and volume. This result, which turns out to be quite general, is very useful in statistical thermodynamics.
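The scaling law (3.27) can be checked numerically using the standard formula for the volume of a d-dimensional ball, V_d(R) = π^{d/2} R^d / Γ(d/2 + 1) (assumed here; it is not derived in the text). Estimating the exponent of χ(E) from the shell volume in d = 3N dimensions:

```python
import math

def ball_volume(d, R):
    """Volume of a d-dimensional ball of radius R (standard formula)."""
    return math.pi ** (d / 2) * R ** d / math.gamma(d / 2 + 1)

def shell_volume(E, dE, N, m=1.0):
    """Momentum-space shell between R(E) and R(E+dE), with R(E) = sqrt(2mE),
    in d = 3N dimensions; this is (proportional to) chi(E)."""
    d = 3 * N
    return ball_volume(d, math.sqrt(2 * m * (E + dE))) - ball_volume(d, math.sqrt(2 * m * E))

# Estimate the scaling exponent chi(E) ~ E^x by comparing two energies.
N, dE = 10, 1e-6
E1, E2 = 1.0, 2.0
x = math.log(shell_volume(E2, dE, N) / shell_volume(E1, dE, N)) / math.log(E2 / E1)

print(x)  # close to 3N/2 - 1 = 14 for N = 10, as Eq. (3.27) predicts
```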
4 Heat and work

4.1 A brief history of heat and work

In 1789 the French scientist Antoine Lavoisier published a famous treatise on Chemistry which, amongst other things, demolished the then prevalent theory of combustion. This theory, known to history as the phlogiston theory, is so extraordinarily stupid that it is not even worth describing. In place of phlogiston theory, Lavoisier proposed the first reasonably sensible scientific interpretation of heat. Lavoisier pictured heat as an invisible, tasteless, odourless, weightless fluid, which he called calorific fluid. He postulated that hot bodies contain more of this fluid than cold bodies. Furthermore, he suggested that the constituent particles of calorific fluid repel one another, causing heat to flow from hot to cold bodies when they are placed in thermal contact.

The modern interpretation of heat is, of course, somewhat different to Lavoisier's calorific theory. Nevertheless, there is an important subset of problems involving heat flow for which Lavoisier's approach is rather useful. These problems often crop up as examination questions. For example: "A clean dry copper calorimeter contains 100 grams of water at 30 degrees centigrade. A 10 gram block of copper heated to 60 degrees centigrade is added. What is the final temperature of the mixture?" How do we approach this type of problem? Well, according to Lavoisier's theory, there is an analogy between heat flow and incompressible fluid flow under gravity. The same volume of liquid added to containers of different cross-sectional area fills them to different heights. If the volume is V, and the cross-sectional area is A, then the height is h = V/A. In a similar manner, the same quantity of heat added to different bodies causes them to rise to different temperatures. If Q is the heat and θ is the (absolute) temperature then θ = Q/C, where the constant C is termed the heat capacity. [This is a somewhat oversimplified example. In general, the heat capacity is a function of temperature, so that C = C(θ).]

Now, if two containers filled to different heights with a free flowing incompressible fluid are connected together at the bottom, via a small pipe, then fluid will flow under gravity, from one to the other, until the two heights are the same. The final height is easily calculated by equating the total fluid volume in the initial and final states. Thus,

h₁ A₁ + h₂ A₂ = h A₁ + h A₂,    (4.1)

giving

h = (h₁ A₁ + h₂ A₂) / (A₁ + A₂),    (4.2)

where h₁ and h₂ are the initial heights in the two containers, A₁ and A₂ are the corresponding cross-sectional areas, and h is the final height. Likewise, if two bodies, initially at different temperatures, are brought into thermal contact then heat will flow, from one to the other, until the two temperatures are the same. The final temperature is calculated by equating the total heat in the initial and final states. Thus,

θ₁ C₁ + θ₂ C₂ = θ C₁ + θ C₂,    (4.3)

giving

θ = (θ₁ C₁ + θ₂ C₂) / (C₁ + C₂),    (4.4)

where the meaning of the various symbols should be self-evident.

The analogy between heat flow and fluid flow works because in Lavoisier's theory heat is a conserved quantity, just like the volume of an incompressible fluid. In fact, Lavoisier postulated that heat was an element. Note that atoms were thought to be indestructible before nuclear reactions were discovered, so the total amount of each element in the Universe was assumed to be a constant. If Lavoisier had cared to formulate a law of thermodynamics from his calorific theory then he would have said that the total amount of heat in the Universe was a constant.

In 1798 Benjamin Thompson, an Englishman who spent his early years in pre-revolutionary America, was minister for war and police in the German state of Bavaria. One of his jobs was to oversee the boring of cannons in the state arsenal. Thompson was struck by the enormous, and seemingly inexhaustible, amount of heat generated in this process. He simply could not understand where all this heat was coming from. According to Lavoisier's calorific theory, the heat must flow into the cannon from its immediate surroundings, which should, therefore, become colder. The flow should also eventually cease when all of the available
heat has been extracted. In fact, Thompson observed that the surroundings of the cannon got hotter, not colder, and that the heating process continued unabated as long as the boring machine was operating. Thompson postulated that some of the mechanical work done on the cannon by the boring machine was being converted into heat. At the time, this was quite a revolutionary concept, and most people were not ready to accept it. This is somewhat surprising, since by the end of the eighteenth century the conversion of heat into work, by steam engines, was quite commonplace. Nevertheless, the conversion of work into heat did not gain broad acceptance until 1849, when an English physicist called James Prescott Joule published the results of a long and painstaking series of experiments. Joule confirmed that work could indeed be converted into heat. Moreover, he found that the same amount of work always generates the same quantity of heat. This is true regardless of the nature of the work (e.g., mechanical, electrical, etc.). Joule was able to formulate what became known as the work equivalent of heat: namely, that 1 newton meter of work is equivalent to 0.241 calories of heat. A calorie is the amount of heat required to raise the temperature of 1 gram of water by 1 degree centigrade. Nowadays, we measure both heat and work in the same units, so that one newton meter, or joule, of work is equivalent to one joule of heat.

In 1850, the German physicist Clausius correctly postulated that the essential conserved quantity is neither heat nor work, but some combination of the two which quickly became known as energy, from the Greek energia meaning "in work." According to Clausius, the change in the internal energy of a macroscopic body can be written

∆E = Q − W,    (4.5)

where Q is the heat absorbed from the surroundings, and W is the work done on the surroundings. This relation is known as the first law of thermodynamics.
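Returning to the examination question quoted at the start of this section, Eq. (4.4) gives the answer directly. Because Eq. (4.4) is a weighted average, centigrade temperatures can be used in place of absolute ones. The specific heats below are assumed textbook values, not quantities given in the text, and the heat capacity of the calorimeter itself is neglected:

```python
# Final temperature of the mixture via Eq. (4.4):
#   theta = (theta1*C1 + theta2*C2) / (C1 + C2).
# Specific heats are assumed textbook values (J per gram per kelvin).
c_water, c_copper = 4.18, 0.385

C1 = 100 * c_water    # heat capacity of 100 g of water
C2 = 10 * c_copper    # heat capacity of 10 g of copper
theta1, theta2 = 30.0, 60.0  # initial temperatures in centigrade

theta = (theta1 * C1 + theta2 * C2) / (C1 + C2)
print(round(theta, 2))  # the large water reservoir barely warms: ~30.27 C
```

The answer lies, as it must, between the two initial temperatures, and very close to that of the body with the much larger heat capacity.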
4.2 Macrostates and microstates

In describing a system made up of a great many particles, it is usually possible to specify some macroscopically measurable independent parameters x₁, x₂, ⋯, x_n
which affect the particles' equations of motion. These parameters are termed the external parameters of the system. Examples of such parameters are the volume (this gets into the equations of motion because the potential energy becomes infinite when a particle strays outside the available volume) and any applied electric and magnetic fields. A microstate of the system is defined as a state for which the motions of the individual particles are completely specified (subject, of course, to the unavoidable limitations imposed by the uncertainty principle of quantum mechanics). In general, the overall energy of a given microstate r is a function of the external parameters:

E_r ≡ E_r(x₁, x₂, ⋯, x_n).    (4.6)
A macrostate of the system is deﬁned by specifying the external parameters, and any other constraints to which the system is subject. For example, if we are dealing with an isolated system (i.e., one that can neither exchange heat with nor do work on its surroundings) then the macrostate might be speciﬁed by giving the values of the volume and the constant total energy. For a manyparticle system, there are generally a very great number of microstates which are consistent with a given macrostate.
4.3 The microscopic interpretation of heat and work

Consider a macroscopic system A which is known to be in a given macrostate. To be more exact, consider an ensemble of similar macroscopic systems A, where each system in the ensemble is in one of the many microstates consistent with the given macrostate. There are two fundamentally different ways in which the average energy of A can change due to interaction with its surroundings. If the external parameters of the system remain constant then the interaction is termed a purely thermal interaction. Any change in the average energy of the system is attributed to an exchange of heat with its environment. Thus,

∆Ē = Q,    (4.7)
where Q is the heat absorbed by the system. On a microscopic level, the energies of the individual microstates are unaffected by the absorption of heat. In fact,
it is the distribution of the systems in the ensemble over the various microstates which is modified.

Suppose that the system A is thermally insulated from its environment. This can be achieved by surrounding it with an adiabatic envelope (i.e., an envelope fabricated out of a material which is a poor conductor of heat, such as fiber glass). Incidentally, the term adiabatic is derived from the Greek adiabatos which means "impassable." In scientific terminology, an adiabatic process is one in which there is no exchange of heat. The system A is still capable of interacting with its environment via its external parameters. This type of interaction is termed mechanical interaction, and any change in the average energy of the system is attributed to work done on it by its surroundings. Thus,

∆Ē = −W,    (4.8)
where W is the work done by the system on its environment. On a microscopic level, the energy of the system changes because the energies of the individual microstates are functions of the external parameters [see Eq. (4.6)]. Thus, if the external parameters are changed then, in general, the energies of all of the systems in the ensemble are modified (since each is in a specific microstate). Such a modification usually gives rise to a redistribution of the systems in the ensemble over the accessible microstates (without any heat exchange with the environment). Clearly, from a microscopic viewpoint, performing work on a macroscopic system is quite a complicated process. Nevertheless, macroscopic work is a quantity which can be readily measured experimentally. For instance, if the system A exerts a force F on its immediate surroundings, and the change in external parameters corresponds to a displacement x of the center of mass of the system, then the work done by A on its surroundings is simply

W = F·x:    (4.9)
i.e., the product of the force and the displacement along the line of action of the force. In a general interaction of the system A with its environment there is both heat exchange and work performed. We can write

Q ≡ ∆Ē + W,    (4.10)
which serves as the general definition of the absorbed heat Q (hence, the equivalence sign). The quantity Q is simply the change in the mean energy of the system which is not due to the modification of the external parameters. Note that the notion of a quantity of heat has no independent meaning apart from Eq. (4.10). The mean energy Ē and work performed W are both physical quantities which can be determined experimentally, whereas Q is merely a derived quantity.
4.4 Quasi-static processes

Consider the special case of an interaction of the system A with its surroundings which is carried out so slowly that A remains arbitrarily close to equilibrium at all times. Such a process is said to be quasi-static for the system A. In practice, a quasi-static process must be carried out on a timescale which is much longer than the relaxation time of the system. Recall that the relaxation time is the typical timescale for the system to return to equilibrium after being suddenly disturbed (see Sect. 3.5).

A finite quasi-static change can be built up out of many infinitesimal changes. The infinitesimal heat đQ absorbed by the system when infinitesimal work đW is done on its environment and its average energy changes by dĒ is given by

đQ ≡ dĒ + đW.    (4.11)
The special symbols đW and đQ are introduced to emphasize that the work done and heat absorbed are infinitesimal quantities which do not correspond to the difference between two works or two heats. Instead, the work done and heat absorbed depend on the interaction process itself. Thus, it makes no sense to talk about the work in the system before and after the process, or the difference between these.

If the external parameters of the system have the values x₁, ⋯, x_n then the energy of the system in a definite microstate r can be written

E_r = E_r(x₁, ⋯, x_n).    (4.12)

Hence, if the external parameters are changed by infinitesimal amounts, so that x_α → x_α + dx_α for α in the range 1 to n, then the corresponding change in the
energy of the microstate is

dE_r = Σ_{α=1}^{n} (∂E_r/∂x_α) dx_α.    (4.13)

The work đW_r done by the system when it remains in this particular state r is

đW_r = −dE_r = Σ_{α=1}^{n} X_{αr} dx_α,    (4.14)

where

X_{αr} ≡ −∂E_r/∂x_α    (4.15)

is termed the generalized force (conjugate to the external parameter x_α) in the state r. Note that if x_α is a displacement then X_{αr} is an ordinary force.

Consider now an ensemble of systems. Provided that the external parameters of the system are changed quasi-statically, the generalized forces X_{αr} have well defined mean values which are calculable from the distribution of systems in the ensemble characteristic of the instantaneous macrostate. The macroscopic work đW resulting from an infinitesimal quasi-static change of the external parameters is obtained by calculating the decrease in the mean energy resulting from the parameter change. Thus,

đW = Σ_{α=1}^{n} X̄_α dx_α,    (4.16)

where

X̄_α ≡ −∂E_r/∂x_α    (4.17)

(the bar denoting an ensemble average) is the mean generalized force conjugate to x_α. The mean value is calculated from the equilibrium distribution of systems in the ensemble corresponding to the external parameter values x_α. The macroscopic work W resulting from a finite quasi-static change of external parameters can be obtained by integrating Eq. (4.16).

The most well-known example of quasi-static work in thermodynamics is that done by pressure when the volume changes. For simplicity, suppose that the volume V is the only external parameter of any consequence. The work done in changing the volume from V to V + dV is simply the product of the force and the displacement (along the line of action of the force). By definition, the mean equilibrium pressure p̄ of a given macrostate is equal to the normal force per unit area acting on any surface element. Thus, the normal force acting on a surface element dS_i is p̄ dS_i. Suppose that the surface element is subject to a displacement dx_i. The work done by the element is p̄ dS_i · dx_i. The total work done by the system is obtained by summing over all of the surface elements. Thus,

đW = p̄ dV,    (4.18)

where

dV = Σ_i dS_i · dx_i    (4.19)

is the infinitesimal volume change due to the displacement of the surface. It follows from Eq. (4.17) that

p̄ = −∂Ē/∂V,    (4.20)

so the mean pressure is the generalized force conjugate to the volume V.

Suppose that a quasi-static process is carried out in which the volume is changed from V_i to V_f. In general, the mean pressure is a function of the volume, so p̄ = p̄(V). It follows that the macroscopic work done by the system is given by

W_{if} = ∫_{V_i}^{V_f} đW = ∫_{V_i}^{V_f} p̄(V) dV.    (4.21)

This quantity is just the "area under the curve" in a plot of p̄(V) versus V.

4.5 Exact and inexact differentials

In our investigation of heat and work we have come across various infinitesimal objects such as dĒ and đW. It is instructive to examine these infinitesimals more closely.
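As a brief aside, the "area under the curve" of Eq. (4.21) is easy to evaluate numerically. The sketch below assumes a pressure law p̄(V) = K/V with K constant along the process (an illustrative choice of mine), for which the integral can also be done analytically:

```python
import math

def work_done(p, V_i, V_f, steps=100_000):
    """Quasi-static work W_if = integral of p(V) dV, Eq. (4.21),
    evaluated with the trapezoidal rule."""
    h = (V_f - V_i) / steps
    total = 0.5 * (p(V_i) + p(V_f))
    total += sum(p(V_i + k * h) for k in range(1, steps))
    return total * h

# Assumed pressure law p(V) = K/V, with K constant along the process.
K = 2.0
V_i, V_f = 1.0, 3.0

W_numeric = work_done(lambda V: K / V, V_i, V_f)
W_exact = K * math.log(V_f / V_i)  # analytic area under this particular curve

print(W_numeric, W_exact)  # the two agree to high accuracy
```

Any other single-valued p̄(V) could be substituted for the lambda without changing the integration routine.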
Consider the purely mathematical problem where F(x, y) is some general function of two independent variables x and y. Consider the change in F in going from the point (x, y) in the x–y plane to the neighbouring point (x + dx, y + dy). This is given by

dF = F(x + dx, y + dy) − F(x, y),    (4.22)

which can also be written

dF = X(x, y) dx + Y(x, y) dy,    (4.23)

where X = ∂F/∂x and Y = ∂F/∂y. Clearly, dF is simply the infinitesimal difference between two adjacent values of the function F. This type of infinitesimal quantity is termed an exact differential, to distinguish it from another type to be discussed presently. If we move in the x–y plane from an initial point i ≡ (x_i, y_i) to a final point f ≡ (x_f, y_f) then the corresponding change in F is given by

∆F = F_f − F_i = ∫_i^f dF = ∫_i^f (X dx + Y dy).    (4.24)

Note that since the difference on the left-hand side depends only on the initial and final points, the integral on the right-hand side can only depend on these points as well. In other words, the value of the integral is independent of the path taken in going from the initial to the final point. This is the distinguishing feature of an exact differential. Consider an integral taken around a closed circuit in the x–y plane. In this case, the initial and final points correspond to the same point, so the difference F_f − F_i is clearly zero. It follows that the integral of an exact differential over a closed circuit is always zero:

∮ dF ≡ 0.    (4.25)

Of course, not every infinitesimal quantity is an exact differential. Consider the expression

đG ≡ X′(x, y) dx + Y′(x, y) dy,    (4.26)

where X′ and Y′ are two general functions of x and y.
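The difference between the exact differential of Eq. (4.23) and the general expression (4.26) can be seen numerically. Below, the exact differential dF = 2xy dx + x² dy (so F = x²y) and the illustrative inexact choice X′ = y, Y′ = 0 (both examples are mine, not the text's) are integrated around the unit circle:

```python
import math

def loop_integral(X, Y, n=200_000):
    """Numerically integrate X dx + Y dy around the unit circle
    x = cos(t), y = sin(t), using dx = -sin(t) dt and dy = cos(t) dt."""
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = k * dt
        x, y = math.cos(t), math.sin(t)
        total += (X(x, y) * (-math.sin(t)) + Y(x, y) * math.cos(t)) * dt
    return total

# Exact: dF = 2xy dx + x^2 dy, i.e. F = x^2 y; mixed partials agree (both 2x).
exact_loop = loop_integral(lambda x, y: 2 * x * y, lambda x, y: x * x)

# Inexact: dG = y dx + 0 dy; mixed partials disagree (1 versus 0).
inexact_loop = loop_integral(lambda x, y: y, lambda x, y: 0.0)

print(exact_loop)    # ~0: the closed-circuit integral of an exact differential vanishes
print(inexact_loop)  # ~ -pi: nonzero for the inexact differential
```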
It is easy to test whether or not an infinitesimal quantity is an exact differential. It is clear that since X = ∂F/∂x and Y = ∂F/∂y then

∂X/∂y = ∂Y/∂x = ∂²F/∂x∂y.    (4.27)

Thus, if

∂X′/∂y ≠ ∂Y′/∂x    (4.28)

(as is assumed to be the case), then đG cannot be an exact differential. Consider the integral of đG over some path in the x–y plane. In general, it is not true that

∫_i^f đG = ∫_i^f (X′ dx + Y′ dy)    (4.29)

is independent of the path taken between the initial and final points. This is the distinguishing feature of an inexact differential. In general, the integral of an inexact differential around a closed circuit is not necessarily zero, so

∮ đG ≠ 0.    (4.30)

The special symbol đ is used to denote an inexact differential.

Consider, for the moment, the solution of

đG = 0,    (4.31)

which reduces to the ordinary differential equation

dy/dx = −X′/Y′.    (4.32)

Since the right-hand side is a known function of x and y, the above equation defines a definite direction (i.e., gradient) at each point in the x–y plane. The solution simply consists of drawing a system of curves in the x–y plane such that at any point the tangent to the curve is as specified in Eq. (4.32). This defines a set of curves which can be written σ(x, y) = c, where c is a labeling parameter. It follows that, on a particular curve,

dσ/dx ≡ ∂σ/∂x + (∂σ/∂y)(dy/dx) = 0.    (4.33)
The elimination of dy/dx between Eqs. (4.32) and (4.33) yields

Y′ ∂σ/∂x = X′ ∂σ/∂y = (X′ Y′)/τ,    (4.34)

where τ(x, y) is some function of x and y. The above equation could equally well be written

X′ = τ ∂σ/∂x,  Y′ = τ ∂σ/∂y.    (4.35)

Inserting Eq. (4.35) into Eq. (4.26) gives

đG = τ (∂σ/∂x dx + ∂σ/∂y dy) = τ dσ,    (4.36)

or

đG/τ = dσ.    (4.37)

Thus, dividing the inexact differential đG by τ yields the exact differential dσ. A factor τ which possesses this property is termed an integrating factor. Since the above analysis is quite general, it is clear that an inexact differential involving two independent variables always admits of an integrating factor. Note, however, that this is not generally the case for inexact differentials involving more than two variables.

After this mathematical excursion, let us return to the physical situation of interest. The macrostate of a macroscopic system can be specified by the values of the external parameters (e.g., the volume) and the mean energy Ē. This, in turn, fixes other parameters such as the mean pressure p̄. Alternatively, we can specify the external parameters and the mean pressure, which fixes the mean energy. Quantities such as dp̄ and dĒ are infinitesimal differences between well-defined quantities: i.e., they are exact differentials. For example, dĒ = Ē_f − Ē_i is just the difference between the mean energy of the system in the final macrostate f and the initial macrostate i, in the limit where these two states are nearly the same. It follows that if the system is taken from an initial macrostate i to any final macrostate f the mean energy change is given by

∆Ē = Ē_f − Ē_i = ∫_i^f dĒ.    (4.38)
However, since the mean energy is just a function of the macrostate under consideration, Ē_f and Ē_i depend only on the initial and final states, respectively. Thus, the integral ∫ dĒ depends only on the initial and final states, and not on the particular process used to get between them.

Consider, now, the infinitesimal work done by the system in going from some initial macrostate i to some neighbouring final macrostate f. In general, đW = Σ_α X̄_α dx_α is not the difference between two numbers referring to the properties of two neighbouring macrostates. Instead, it is merely an infinitesimal quantity characteristic of the process of going from state i to state f. In other words, the work đW is, in general, an inexact differential. The total work done by the system in going from any macrostate i to some other macrostate f can be written as

W_{if} = ∫_i^f đW,    (4.39)

where the integral represents the sum of the infinitesimal amounts of work đW performed at each stage of the process. In general, the value of the integral does depend on the particular process used in going from macrostate i to macrostate f.

Recall that in going from macrostate i to macrostate f the change ∆Ē does not depend on the process used, whereas the work W, in general, does. Thus, it follows from the first law of thermodynamics, Eq. (4.10), that the heat Q, in general, also depends on the process used. It follows that

đQ ≡ dĒ + đW    (4.40)

is an inexact differential. However, by analogy with the mathematical example discussed previously, there must exist some integrating factor, T, say, which converts the inexact differential đQ into an exact differential. So,

đQ/T ≡ dS.    (4.41)

It will be interesting to find out what physical quantities correspond to the functions T and S.
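The integrating-factor construction of Eqs. (4.34)–(4.37) can be made concrete with a toy example (mine, not the text's): đG = y dx − x dy is inexact, but dividing by τ = y² yields d(x/y), which is exact away from y = 0. Integrating both around a closed loop that avoids y = 0 mirrors the role claimed for T in Eq. (4.41):

```python
import math

def loop_integral(X, Y, cx, cy, r, n=200_000):
    """Integrate X dx + Y dy around a circle of radius r centred on (cx, cy)."""
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = k * dt
        x, y = cx + r * math.cos(t), cy + r * math.sin(t)
        dx, dy = -r * math.sin(t), r * math.cos(t)
        total += (X(x, y) * dx + Y(x, y) * dy) * dt
    return total

# dG = y dx - x dy; loop centred on (0, 2) so that y never vanishes.
G_loop = loop_integral(lambda x, y: y, lambda x, y: -x, 0.0, 2.0, 1.0)

# dG / tau with tau = y^2: this is d(x/y), an exact differential.
sigma_loop = loop_integral(lambda x, y: 1.0 / y, lambda x, y: -x / y**2, 0.0, 2.0, 1.0)

print(G_loop)      # ~ -2*pi (nonzero: the original differential is inexact)
print(sigma_loop)  # ~0 (exact after dividing by the integrating factor)
```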
Suppose that the system is thermally insulated, so that Q = 0. In this case, the first law of thermodynamics implies that

W_{if} = −∆Ē.    (4.42)

Thus, in this special case, the work done depends only on the energy difference between the initial and final states, and is independent of the process used to effect the change. In fact, when Clausius first formulated the first law in 1850 this is how he expressed it: if a thermally isolated system is brought from some initial to some final state then the work done by the system is independent of the process used.

If the external parameters of the system are kept fixed, so that no work is done, then đW = 0, and Eq. (4.11) reduces to

đQ = dĒ,    (4.43)

and đQ becomes an exact differential. The amount of heat Q absorbed in going from one macrostate to another depends only on the mean energy difference between them, and is independent of the process used to effect the change. In this situation, heat is a conserved quantity, and acts very much like the invisible indestructible fluid of Lavoisier's calorific theory.
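The process dependence of the work, which disappears only in the special cases just described, is easy to exhibit numerically (an illustrative example of mine, using đW = p̄ dV from Eq. (4.18)). Take two quasi-static routes between the same endpoints in the p̄–V plane and compare the work done along each:

```python
import math

# Endpoints in the p-V plane: (V1, p1) -> (V2, p2), with p2 = p1*V1/V2
# so that both endpoints lie on the same hyperbola p*V = const.
V1, p1 = 1.0, 1.0
V2 = 4.0
p2 = p1 * V1 / V2

# Path (a): along p(V) = p1*V1/V.  W = integral of p dV = p1*V1*ln(V2/V1).
W_a = p1 * V1 * math.log(V2 / V1)

# Path (b): constant pressure p1 from V1 to V2 (W = p1*(V2 - V1)), then a
# pressure drop to p2 at constant volume (dV = 0, so no work is done).
W_b = p1 * (V2 - V1) + 0.0

print(W_a, W_b)  # different values: dW is an inexact differential
```

Both routes connect the same two macrostates, yet the work differs, which is exactly what the inexactness of đW (and hence of đQ) means.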
5 Statistical thermodynamics

5.1 Introduction

Let us briefly review the material which we have covered so far in this course. We started off by studying the mathematics of probability. We then used probabilistic reasoning to analyze the dynamics of many particle systems, a subject area known as statistical mechanics. Next, we explored the physics of heat and work, the study of which is termed thermodynamics. The final step in our investigation is to combine statistical mechanics with thermodynamics: in other words, to investigate heat and work via statistical arguments. This discipline is called statistical thermodynamics, and forms the central subject matter of this course. This section is devoted to the study of the fundamental concepts of statistical thermodynamics. The remaining sections will then explore the application and elucidation of these concepts.

5.2 Thermal interaction between macrosystems

Let us begin our investigation of statistical thermodynamics by examining a purely thermal interaction between two macroscopic systems, A and A′. Suppose that the energies of these two systems are E and E′, respectively. The external parameters are held fixed, so that A and A′ cannot do work on one another. However, we assume that the systems are free to exchange heat energy (i.e., they are in thermal contact). It is convenient to divide the energy scale into small subdivisions of width δE. The number of microstates of A consistent with a macrostate in which the energy lies in the range E to E + δE is denoted Ω(E). Likewise, the number of microstates of A′ consistent with a macrostate in which the energy lies between E′ and E′ + δE is denoted Ω′(E′).

The combined system A(0) = A + A′ is assumed to be isolated (i.e., it neither does work on nor exchanges heat with its surroundings). It follows from the first law of thermodynamics that the total energy E(0) is constant. When speaking of thermal contact between two distinct systems, we usually assume that the mutual
interaction is sufficiently weak for the energies to be additive. Thus,

E + E′ ≃ E(0) = constant.   (5.1)

Of course, in the limit of zero interaction the energies are strictly additive. However, a small residual interaction is always required to enable the two systems to exchange heat energy and, thereby, eventually reach thermal equilibrium (see Sect. 3.4). In fact, if the interaction between A and A′ is too strong for the energies to be additive then it makes little sense to consider each system in isolation, since the presence of one system clearly strongly perturbs the other. In this case, the smallest system which can realistically be examined in isolation is A(0).

According to Eq. (5.1), if the energy of A lies in the range E to E + δE then the energy of A′ must lie between E(0) − E − δE and E(0) − E. Thus, the number of microstates accessible to each system is given by Ω(E) and Ω′(E(0) − E), respectively. Since every possible state of A can be combined with every possible state of A′ to form a distinct microstate, the total number of distinct states accessible to A(0) when the energy of A lies in the range E to E + δE is

Ω(0) = Ω(E) Ω′(E(0) − E).   (5.2)

Consider an ensemble of pairs of thermally interacting systems, A and A′, which are left undisturbed for many relaxation times so that they can attain thermal equilibrium. The principle of equal a priori probabilities is applicable to this situation (see Sect. 3.3). According to this principle, the probability of occurrence of a given macrostate is proportional to the number of accessible microstates, since all microstates are equally likely. Thus, the probability that the system A has an energy lying in the range E to E + δE can be written

P(E) = C Ω(E) Ω′(E(0) − E),   (5.3)

where C is a constant which is independent of E.

We know, from Sect. 3.8, that the typical variation of the number of accessible states with energy is of the form

Ω ∝ E^f,   (5.4)
where f is the number of degrees of freedom. For a macroscopic system f is an exceedingly large number. It follows that the probability P(E) in Eq. (5.3) is the product of an extremely rapidly increasing function of E and an extremely rapidly decreasing function of E. Hence, we would expect the probability to exhibit a very pronounced maximum at some particular value of the energy.

Let us Taylor expand the logarithm of P(E) in the vicinity of its maximum value, which is assumed to occur at E = Ẽ. We expand the relatively slowly varying logarithm, rather than the function itself, because the latter varies so rapidly with the energy that the radius of convergence of its Taylor expansion is too small for this expansion to be of any practical use. The expansion of ln Ω(E) yields

ln Ω(E) = ln Ω(Ẽ) + β(Ẽ) η − (1/2) λ(Ẽ) η² + · · · ,   (5.5)

where

η = E − Ẽ,   (5.6)

β = ∂ ln Ω/∂E,   (5.7)

λ = −∂² ln Ω/∂E² = −∂β/∂E.   (5.8)

Now, since E′ = E(0) − E, we have

E′ − Ẽ′ = −(E − Ẽ) = −η.   (5.9)

It follows that

ln Ω′(E′) = ln Ω′(Ẽ′) + β′(Ẽ′) (−η) − (1/2) λ′(Ẽ′) (−η)² + · · · ,   (5.10)

where β′ and λ′ are defined in an analogous manner to the parameters β and λ. Equations (5.5) and (5.10) can be combined to give

ln [Ω(E) Ω′(E′)] = ln [Ω(Ẽ) Ω′(Ẽ′)] + [β(Ẽ) − β′(Ẽ′)] η − (1/2) [λ(Ẽ) + λ′(Ẽ′)] η² + · · · .   (5.11)

At the maximum of ln [Ω(E) Ω′(E′)] the linear term in the Taylor expansion must vanish, so

β(Ẽ) = β′(Ẽ′),   (5.12)
which enables us to determine Ẽ. It follows that

ln P(E) = ln P(Ẽ) − (1/2) λ0 η²,   (5.13)

or

P(E) = P(Ẽ) exp[−(1/2) λ0 (E − Ẽ)²],   (5.14)

where

λ0 = λ(Ẽ) + λ′(Ẽ′).   (5.15)

Now, the parameter λ0 must be positive, otherwise the probability P(E) does not exhibit a pronounced maximum value: i.e., the combined system A(0) does not possess a well-defined equilibrium state as, physically, we know it must. It is clear that λ(Ẽ) must also be positive, since we could always choose for A′ a system with a negligible contribution to λ0, in which case the constraint λ0 > 0 would effectively correspond to λ(Ẽ) > 0. [A similar argument can be used to show that λ′(Ẽ′) must be positive.] The same conclusion also follows from the estimate Ω ∝ E^f, which implies that

λ(Ẽ) ∼ f/Ẽ² > 0.   (5.16)

According to Eq. (5.14), the probability distribution function P(E) is a Gaussian. This is hardly surprising, since the central limit theorem ensures that the probability distribution for any macroscopic variable, such as E, is Gaussian in nature (see Sect. 2). It follows that the mean value of E corresponds to the situation of maximum probability (i.e., the peak of the Gaussian curve), so that

Ē = Ẽ.   (5.17)

The standard deviation of the distribution is

∆*E = λ0^(−1/2) ∼ Ē/√f,   (5.18)

where use has been made of Eq. (5.16) (assuming that system A makes the dominant contribution to λ0). It follows that the fractional width of the probability distribution function is given by

∆*E/Ē ∼ 1/√f.   (5.19)
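The sharpness of P(E) can be illustrated with a toy model. The sketch below is not from the notes: it takes Ω ∝ E^f and Ω′ ∝ E′^fp with small, arbitrary values of f and fp, locates the maximum of ln P(E) numerically, and compares the Gaussian width (5.18) with the rough estimate Ẽ/√f.

```python
import math

# Toy model of the two-system distribution (5.3): Omega ~ E**f and
# Omega' ~ E'**fp, so ln P(E) = f ln E + fp ln(E0 - E) + const.
# (f, fp, E0 are arbitrary illustrative values; real systems have f ~ 1e24.)
f, fp, E0 = 300, 500, 1.0

def lnP(E):
    return f * math.log(E) + fp * math.log(E0 - E)

# Locate the most probable energy on a grid.  Setting the linear term of the
# expansion (5.11) to zero, i.e. beta = beta', gives f/E = fp/(E0 - E).
Es = [E0 * i / 10000 for i in range(1, 10000)]
E_max = max(Es, key=lnP)
E_tilde = f * E0 / (f + fp)
print(E_max, E_tilde)             # both 0.375

# Gaussian width: lambda0 = f/E_t**2 + fp/(E0 - E_t)**2, Delta*E = lambda0**-0.5.
lam0 = f / E_tilde**2 + fp / (E0 - E_tilde)**2
print(lam0 ** -0.5)               # ~0.017, of order E_tilde/sqrt(f) ~ 0.02

# Fractional width (5.19) for f comparable to a mole of particles:
print(1 / math.sqrt(6e23))        # ~1.3e-12
```

Even with only a few hundred degrees of freedom the distribution is already narrow; for macroscopic f the fractional width collapses to the 10^−12 level quoted in the text.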
Hence, if A contains 1 mole of particles then f ∼ NA ≃ 10^24 and ∆*E/Ē ∼ 10^−12. Clearly, the probability distribution for E has an exceedingly sharp maximum. Experimental measurements of this energy will almost always yield the mean value, and the underlying statistical nature of the distribution may not be apparent.

5.3 Temperature

Suppose that the systems A and A′ are initially thermally isolated from one another, with respective energies Ei and Ei′. (Since the energy of an isolated system cannot fluctuate, we do not have to bother with mean energies here.) If the two systems are subsequently placed in thermal contact, so that they are free to exchange heat energy, then, in general, the resulting state is an extremely improbable one [i.e., P(Ei) is much less than the peak probability]. The configuration will, therefore, tend to change in time until the two systems attain final mean energies Ēf and Ēf′ which are such that

β̄f = β̄f′,   (5.20)

where β̄f ≡ β(Ēf) and β̄f′ ≡ β′(Ēf′). This corresponds to the state of maximum probability (see Sect. 5.2). In the special case where the initial energies, Ei and Ei′, lie very close to the final mean energies, Ēf and Ēf′, respectively, there is no change in the two systems when they are brought into thermal contact, since the initial state already corresponds to a state of maximum probability.

It follows from energy conservation that

Ēf + Ēf′ = Ei + Ei′.   (5.21)

The mean energy change in each system is simply the net heat absorbed, so that

Q ≡ Ēf − Ei,   (5.22)

Q′ ≡ Ēf′ − Ei′.   (5.23)

The conservation of energy then reduces to

Q + Q′ = 0:   (5.24)
i.e., the heat given off by one system is equal to the heat absorbed by the other (in our notation absorbed heat is positive and emitted heat is negative).

It is clear that if the systems A and A′ are suddenly brought into thermal contact then they will only exchange heat and evolve towards a new equilibrium state if the final state is more probable than the initial one. In other words, heat is exchanged if

P(Ēf) > P(Ei),   (5.25)

or

ln P(Ēf) > ln P(Ei),   (5.26)

since the logarithm is a monotonic function. The above inequality can be written

ln Ω(Ēf) + ln Ω′(Ēf′) > ln Ω(Ei) + ln Ω′(Ei′),   (5.27)

with the aid of Eq. (5.3). Taylor expansion to first order yields

[∂ ln Ω(Ei)/∂E] (Ēf − Ei) + [∂ ln Ω′(Ei′)/∂E′] (Ēf′ − Ei′) > 0,   (5.28)

which finally gives

(βi − βi′) Q > 0,   (5.29)

where βi ≡ β(Ei), βi′ ≡ β′(Ei′), and use has been made of Eq. (5.24).

It is clear, from the above, that the parameter β, defined

β = ∂ ln Ω/∂E,   (5.30)

has the following properties:

1. If two systems separately in equilibrium have the same value of β then the systems will remain in equilibrium when brought into thermal contact with one another.

2. If two systems separately in equilibrium have different values of β then the systems will not remain in equilibrium when brought into thermal contact with one another. Instead, the system with the higher value of β will absorb heat from the other system until the two β values are the same [see Eq. (5.29)].
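The approach to equilibrium described in property 2 can be illustrated numerically. The sketch below is not from the notes: it uses the toy density of states Ω ∝ E^f, for which Eq. (5.30) gives β(E) = f/E, with arbitrary illustrative values of f and the initial energies.

```python
# Two systems with Omega ~ E**f, for which Eq. (5.30) gives beta(E) = f/E.
# Transfer small packets of heat to whichever system has the larger beta
# (property 2 above) and watch the two betas converge.
f1, f2 = 200.0, 600.0
E1, E2 = 900.0, 100.0           # initial energies (arbitrary units)
dQ = 0.01                       # size of each heat packet

def beta(f, E):
    return f / E

for _ in range(200_000):
    if beta(f1, E1) > beta(f2, E2):   # system 1 has the higher beta: it absorbs heat
        E1, E2 = E1 + dQ, E2 - dQ
    else:
        E1, E2 = E1 - dQ, E2 + dQ

# Equilibrium (5.20): f1/E1 = f2/E2 with E1 + E2 = 1000, i.e. E1 -> 250.
print(E1, E2)                        # ~250, ~750
print(beta(f1, E1), beta(f2, E2))    # both ~0.8
```

The energies settle at the partition that equalizes β, exactly the condition β(Ẽ) = β′(Ẽ′) of Eq. (5.12).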
Incidentally, a partial derivative is used in Eq. (5.30) because in a purely thermal interaction the external parameters of the system are held constant whilst the energy changes.

Let us define the dimensionless parameter T, such that

1/(k T) ≡ β ≡ ∂ ln Ω/∂E,   (5.31)

where k is a positive constant having the dimensions of energy. The parameter T is termed the thermodynamic temperature, and controls heat flow in much the same manner as a conventional temperature. Thus, if two isolated systems in equilibrium possess the same thermodynamic temperature then they will remain in equilibrium when brought into thermal contact. However, if the two systems have different thermodynamic temperatures then heat will flow from the system with the higher temperature (i.e., the "hotter" system) to the system with the lower temperature until the temperatures of the two systems are the same.

In addition, suppose that we have three systems A, B, and C. We know that if A and B remain in equilibrium when brought into thermal contact then their temperatures are the same, so that TA = TB. Similarly, if B and C remain in equilibrium when brought into thermal contact, then TB = TC. But, we can then conclude that TA = TC, so systems A and C will also remain in equilibrium when brought into thermal contact. Thus, we arrive at the following statement, which is sometimes called the zeroth law of thermodynamics:

If two systems are separately in thermal equilibrium with a third system then they must also be in thermal equilibrium with one another.

The thermodynamic temperature of a macroscopic body, as defined in Eq. (5.31), depends only on the rate of change of the number of accessible microstates with the total energy. Thus, it is possible to define a thermodynamic temperature for systems with radically different microscopic structures (e.g., matter and radiation). The thermodynamic, or absolute, scale of temperature is measured in degrees kelvin. The parameter k is chosen to make this temperature scale accord as much as possible with more conventional temperature scales. The choice

k = 1.381 × 10^−23 joules/kelvin   (5.32)
ensures that there are 100 degrees kelvin between the freezing and boiling points of water at atmospheric pressure (the two temperatures are 273.15 and 373.15 degrees kelvin, respectively). The above number is known as the Boltzmann constant. In fact, the Boltzmann constant is fixed by international convention so as to make the triple point of water (i.e., the unique temperature at which the three phases of water coexist in thermal equilibrium) exactly 273.16 °K. Note that the zero of the thermodynamic scale, the so called absolute zero of temperature, does not correspond to the freezing point of water, but to some far more physically significant temperature which we shall discuss presently.

The familiar Ω ∝ E^f scaling for translational degrees of freedom yields

k T ∼ Ē/f,   (5.33)

using Eq. (5.31), so k T is a rough measure of the mean energy associated with each degree of freedom in the system. In fact, for a classical system (i.e., one in which quantum effects are unimportant) it is possible to show that the mean energy associated with each degree of freedom is exactly (1/2) k T. This result, which is known as the equipartition theorem, will be discussed in more detail later on in this course.

The absolute temperature T is usually positive, since Ω(E) is ordinarily a very rapidly increasing function of energy. In fact, this is the case for all conventional systems where the kinetic energy of the particles is taken into account, because there is no upper bound on the possible energy of the system, and Ω(E) consequently increases roughly like E^f. It is, however, possible to envisage a situation in which we ignore the translational degrees of freedom of a system, and concentrate only on its spin degrees of freedom. In this situation, there is an upper bound to the possible energy of the system (i.e., all spins lined up antiparallel to an applied magnetic field). Consequently, the total number of states available to the system is finite. In this case, the density of spin states Ωspin(E) first increases with increasing energy, as in conventional systems, but then reaches a maximum and decreases again. Thus, it is possible to get absolute spin temperatures which are negative, as well as positive.
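The sign change of β for a spin system is easy to see in a small numerical example. The sketch below is not from the notes: it counts states of N independent two-level spins with binomial coefficients (N and the level spacing eps are arbitrary illustrative values) and estimates β = ∂ ln Ω/∂E by a finite difference.

```python
import math

# N spins in a magnetic field: with n spins antiparallel to the field, the
# energy is E = n*eps and Omega_spin(E) = C(N, n), which rises to a maximum
# at n = N/2 and then falls again.  (N and eps are illustrative values.)
N, eps = 1000, 1.0

def ln_omega(n):
    # ln C(N, n) computed via log-gamma, to avoid huge integers
    return math.lgamma(N + 1) - math.lgamma(n + 1) - math.lgamma(N - n + 1)

def beta(n):
    # central-difference estimate of beta = d(ln Omega)/dE at E = n*eps (5.30)
    return (ln_omega(n + 1) - ln_omega(n - 1)) / (2 * eps)

print(beta(100))   # positive: below the maximum of Omega_spin, T > 0
print(beta(900))   # negative: above the maximum, beta < 0, i.e. a negative T
```

Below half filling the number of states grows with energy and β is positive; above half filling Ωspin falls with energy and β, and hence the spin temperature, is negative.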
In Lavoisier's calorific theory, the basic mechanism which forces heat to flow from hot to cold bodies is the supposed mutual repulsion of the constituent particles of calorific fluid. In statistical mechanics, the explanation is far less contrived. Heat flow occurs because statistical systems tend to evolve towards their most probable states, subject to the imposed physical constraints. When two bodies at different temperatures are suddenly placed in thermal contact, the initial state corresponds to a spectacularly improbable state of the overall system. For systems containing of order 1 mole of particles, the only reasonably probable final equilibrium states are such that the two bodies differ in temperature by less than 1 part in 10^12. The evolution of the system towards these final states (i.e., towards thermal equilibrium) is effectively driven by probability.

5.4 Mechanical interaction between macrosystems

Let us now examine a purely mechanical interaction between macrostates, where one or more of the external parameters is modified, but there is no exchange of heat energy. Consider, for the sake of simplicity, a situation where only one external parameter x of the system is free to vary. In general, the number of microstates accessible to the system when the overall energy lies between E and E + δE depends on the particular value of x, so we can write Ω ≡ Ω(E, x).

When x is changed by the amount dx, the energy Er(x) of a given microstate r changes by (∂Er/∂x) dx. The number of states σ(E, x) whose energy is changed from a value less than E to a value greater than E when the parameter changes from x to x + dx is given by the number of microstates per unit energy range multiplied by the average shift in energy of the microstates. Hence,

σ(E, x) = [Ω(E, x)/δE] (∂Er/∂x)‾ dx,   (5.34)

where the mean value of ∂Er/∂x is taken over all accessible microstates (i.e., all states where the energy lies between E and E + δE and the external parameter takes the value x). The above equation can also be written

σ(E, x) = −[Ω(E, x)/δE] X̄ dx,   (5.35)
where

X̄(E, x) = −(∂Er/∂x)‾   (5.36)

is the mean generalized force conjugate to the external parameter x (see Sect. 4.4).

Consider the total number of microstates between E and E + δE. When the external parameter changes from x to x + dx, the number of states in this energy range changes by (∂Ω/∂x) dx. This change is due to the difference between the number of states which enter the range because their energy is changed from a value less than E to one greater than E and the number which leave because their energy is changed from a value less than E + δE to one greater than E + δE. In symbols,

[∂Ω(E, x)/∂x] dx = σ(E) − σ(E + δE) = −(∂σ/∂E) δE,   (5.37)

which yields

∂Ω/∂x = ∂(Ω X̄)/∂E,   (5.38)

where use has been made of Eq. (5.35). Dividing both sides by Ω gives

∂ ln Ω/∂x = (∂ ln Ω/∂E) X̄ + ∂X̄/∂E.   (5.39)

Now, for a macroscopic system with many degrees of freedom, the first term on the right-hand side is of order (f/Ē) X̄, whereas the second term is only of order X̄/Ē, according to the usual estimate Ω ∝ E^f. Clearly, the second term is utterly negligible, so we have

∂ ln Ω/∂x = (∂ ln Ω/∂E) X̄ = β X̄.   (5.40)

When there are several external parameters x1, · · · , xn, so that Ω ≡ Ω(E, x1, · · · , xn), the above derivation is valid for each parameter taken in isolation. Thus,

∂ ln Ω/∂xα = β X̄α,   (5.41)

where X̄α is the mean generalized force conjugate to the parameter xα.
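Equation (5.40) can be sanity checked numerically. The sketch below is not from the notes: it uses the standard ideal-gas result ln Ω(E, V) = N ln V + (3N/2) ln E + const (quoted in Sect. 3.8), takes x = V so that X̄ is the mean pressure, and evaluates both partial derivatives by central differences (N, E, V are arbitrary illustrative values).

```python
import math

# For an ideal gas, ln Omega(E, V) = N ln V + (3N/2) ln E + const (Sect. 3.8).
# Eq. (5.40) with x = V gives the mean pressure as
#   p_bar = (1/beta) * d(ln Omega)/dV,  with  beta = d(ln Omega)/dE.
# (N, E, V below are illustrative values.)
N, E, V = 1.0e22, 50.0, 1.0e-3

def ln_omega(E, V):
    return N * math.log(V) + 1.5 * N * math.log(E)   # additive constant omitted

h = 1e-7                          # relative step for central differences
dlnO_dV = (ln_omega(E, V * (1 + h)) - ln_omega(E, V * (1 - h))) / (2 * V * h)
beta = (ln_omega(E * (1 + h), V) - ln_omega(E * (1 - h), V)) / (2 * E * h)

p_bar = dlnO_dV / beta
print(p_bar, 2 * E / (3 * V))     # both ~3.33e4 Pa: p_bar V = (2/3) E
```

The ratio of the two derivatives reproduces p̄ V = (2/3) E, which is just the ideal-gas equation of state once Ē = (3/2) N k T is inserted. This is the sense in which Eq. (5.40) connects the microstate count to a measurable mechanical force.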
5.5 General interaction between macrosystems

Consider two systems, A and A′, which can interact by exchanging heat energy and doing work on one another. Let the system A have energy E and adjustable external parameters x1, · · · , xn. Likewise, let the system A′ have energy E′ and adjustable external parameters x1′, · · · , xn′. The combined system A(0) = A + A′ is assumed to be isolated. It follows from the first law of thermodynamics that

E + E′ = E(0) = constant.   (5.42)

Thus, the energy E′ of system A′ is determined once the energy E of system A is given, and vice versa. In fact, E′ could be regarded as a function of E. Furthermore, if the two systems can interact mechanically then, in general, the parameters x′ are some function of the parameters x. As a simple example, if the two systems are separated by a movable partition in an enclosure of fixed volume V(0), then

V + V′ = V(0) = constant,   (5.43)

where V and V′ are the volumes of systems A and A′, respectively.

The total number of microstates accessible to A(0) is clearly a function of E and the parameters xα (where α runs from 1 to n), so Ω(0) ≡ Ω(0)(E, x1, · · · , xn). We have already demonstrated (in Sect. 5.2) that Ω(0) exhibits a very pronounced maximum at one particular value of the energy E = Ẽ when E is varied but the external parameters are held constant. This behaviour comes about because of the very strong,

Ω ∝ E^f,   (5.44)

increase in the number of accessible microstates of A (or A′) with energy. However, the number of accessible microstates exhibits a similar strong increase with the volume, which is a typical external parameter, so that, according to Sect. 3.8,

Ω ∝ V^f.   (5.45)

It follows that the variation of Ω(0) with a typical parameter xα, when all the other parameters and the energy are held constant, also exhibits a very sharp maximum at some particular value xα = x̃α. The equilibrium situation corresponds to the
configuration of maximum probability, in which virtually all systems A(0) in the ensemble have values of E and xα very close to Ẽ and x̃α. The mean values of these quantities are thus given by Ē = Ẽ and x̄α = x̃α.

Consider a quasi-static process in which the system A is brought from an equilibrium state described by Ē and x̄α to an infinitesimally different equilibrium state described by Ē + dĒ and x̄α + dx̄α. Let us calculate the resultant change in the number of microstates accessible to A. Since Ω ≡ Ω(E, x1, · · · , xn), the change in ln Ω follows from standard mathematics:

d ln Ω = (∂ ln Ω/∂E) dĒ + Σα=1..n (∂ ln Ω/∂xα) dx̄α.   (5.46)

However, we have previously demonstrated that

β = ∂ ln Ω/∂E,    β X̄α = ∂ ln Ω/∂xα   (5.47)

[from Eqs. (5.30) and (5.41)], so Eq. (5.46) can be written

d ln Ω = β (dĒ + Σα X̄α dx̄α).   (5.48)

Note that the temperature parameter β and the mean conjugate forces X̄α are only well-defined for equilibrium states. This is why we are only considering quasi-static changes, in which the two systems are always arbitrarily close to equilibrium.

Let us rewrite Eq. (5.48) in terms of the thermodynamic temperature T, using the relation β ≡ 1/(k T). We obtain

dS = (dĒ + Σα X̄α dx̄α)/T,   (5.49)

where

S = k ln Ω.   (5.50)

Equation (5.49) is a differential relation which enables us to calculate the quantity S as a function of the mean energy Ē and the mean external parameters x̄α,
assuming that we can calculate the temperature T and mean conjugate forces X̄α for each equilibrium state. The function S(Ē, x̄α) is termed the entropy of system A. The word entropy is derived from the Greek en + trepein, which means "in change." The reason for this etymology will become apparent presently. It can be seen from Eq. (5.50) that the entropy is merely a parameterization of the number of accessible microstates. Hence, according to statistical mechanics, S(Ē, x̄α) is essentially a measure of the relative probability of a state characterized by values of the mean energy and mean external parameters Ē and x̄α.

According to Eq. (4.16), the net amount of work performed during a quasi-static change is given by

đW = Σα X̄α dx̄α.   (5.51)

It follows from Eq. (5.49) that

dS = (dĒ + đW)/T = đQ/T.   (5.52)

Thus, the thermodynamic temperature T is the integrating factor for the first law of thermodynamics,

đQ = dĒ + đW,   (5.53)

which converts the inexact differential đQ into the exact differential dS (see Sect. 4.5). It follows that the entropy difference between any two macrostates i and f can be written

Sf − Si = ∫if dS = ∫if đQ/T,   (5.54)

where the integral is evaluated for any process through which the system is brought quasi-statically via a sequence of near-equilibrium configurations from its initial to its final macrostate. The process has to be quasi-static because the temperature T, which appears in the integrand, is only well-defined for an equilibrium state. Since the left-hand side of the above equation only depends on the initial and final states, it follows that the integral on the right-hand side is independent of the particular sequence of quasi-static changes used to get from i to f. Thus, ∫if đQ/T is independent of the process (provided that it is quasi-static).
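The consistency of the two routes to the entropy, the heat integral (5.54) and S = k ln Ω (5.50), can be checked for a concrete process. The sketch below is not from the notes: it takes an ideal gas expanding isothermally and quasi-statically, so that the mean energy is fixed and đQ = đW = p dV with p = N k T/V, and uses the constant-energy scaling Ω ∝ V^N (the numerical values are illustrative, with N about one mole of particles).

```python
import math

# Check Eq. (5.54) against S = k ln Omega (5.50) for a quasi-static process:
# an ideal gas expanding isothermally from V_i to V_f.  At constant
# temperature the mean energy is fixed, so dQ = dW = p dV with p = N k T / V.
# (Illustrative numbers; N is roughly one mole of particles.)
k = 1.381e-23
N, T = 6.0e23, 300.0
V_i, V_f = 1.0e-3, 2.0e-3

# Integrate dQ/T over a long sequence of near-equilibrium steps.
steps = 100_000
dV = (V_f - V_i) / steps
S_from_heat, V = 0.0, V_i
for _ in range(steps):
    p = N * k * T / (V + 0.5 * dV)     # midpoint pressure during the step
    S_from_heat += p * dV / T          # dS = dQ/T
    V += dV

# Statistical result: at constant energy Omega ~ V**N, so
# Delta S = k ln(Omega_f / Omega_i) = N k ln(V_f / V_i).
S_from_omega = N * k * math.log(V_f / V_i)
print(S_from_heat, S_from_omega)       # both ~5.74 J/K
```

The macroscopic heat integral and the microscopic state count give the same entropy change, which is exactly what Eqs. (5.50) and (5.54) assert.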
All of the concepts which we have encountered up to now in this course, such as temperature, heat, energy, volume, pressure, etc., have been fairly familiar to us from other branches of Physics. However, entropy, which turns out to be of crucial importance in thermodynamics, is something quite new. Let us consider the following questions. What does the entropy of a system actually signify? What use is the concept of entropy?

5.6 Entropy

Consider an isolated system whose energy is known to lie in a narrow range. Let Ω be the number of accessible microstates. According to statistical mechanics, the system is equally likely to be found in any one of these states when it is in thermal equilibrium. The accessible states are just that set of microstates which are consistent with the macroscopic constraints imposed on the system. These constraints can usually be quantified by specifying the values of some parameters y1, · · · , yn which characterize the macrostate. Note that these parameters are not necessarily external: e.g., we could specify either the volume (an external parameter) or the mean pressure (the mean force conjugate to the volume). The number of accessible states is clearly a function of the chosen parameters, so we can write Ω ≡ Ω(y1, · · · , yn) for the number of microstates consistent with a macrostate in which the general parameter yα lies in the range yα to yα + dyα.

Suppose that we start from a system in thermal equilibrium. According to the principle of equal a priori probabilities, the systems in the ensemble are distributed with equal probability over each of the Ωi accessible states. Let us now remove, or relax, some of the constraints imposed on the system. Clearly, all of the microstates formerly accessible to the system are still accessible, but many additional states will, in general, become accessible. Thus, removing or relaxing constraints can only have the effect of increasing, or possibly leaving unchanged, the number of microstates accessible to the system. If the final number of accessible states is Ωf, then we can write

Ωf ≥ Ωi.   (5.55)

Immediately after the constraints are relaxed, the systems in the ensemble are not in any of the microstates from which they were previously excluded. So the
systems only occupy a fraction

Pi = Ωi/Ωf   (5.56)

of the Ωf states now accessible to them. This is clearly not an equilibrium situation. Indeed, if Ωf ≫ Ωi then the configuration in which the systems are only distributed over the original Ωi states is an extremely unlikely one. In fact, its probability of occurrence is given by Eq. (5.56). According to the H theorem (see Sect. 3.4), the ensemble will evolve in time until a more probable final state is reached in which the systems are evenly distributed over the Ωf available states.

As a simple example, consider a system consisting of a box divided into two regions of equal volume. Suppose that, initially, one region is filled with gas and the other is empty. The constraint imposed on the system is, thus, that the coordinates of all of the molecules must lie within the filled region. In other words, the volume accessible to the system is V = Vi, where Vi is half the volume of the box. The constraints imposed on the system can be relaxed by removing the partition and allowing gas to flow into both regions. The volume accessible to the gas is now V = Vf = 2 Vi. Immediately after the partition is removed, the system is in an extremely improbable state. We know, from Sect. 3.8, that at constant energy the variation of the number of accessible states of an ideal gas with the volume is

Ω ∝ V^N,   (5.57)

where N is the number of particles. Thus, the probability of observing the state immediately after the partition is removed in an ensemble of equilibrium systems with volume V = Vf is

Pi = Ωi/Ωf = (Vi/Vf)^N = (1/2)^N.   (5.58)

If the box contains of order 1 mole of molecules then N ∼ 10^24 and this probability is fantastically small:

Pi ∼ exp(−10^24).   (5.59)

Clearly, the system will evolve towards a more probable state.

This discussion can also be phrased in terms of the parameters y1, · · · , yn of the system. Suppose that a constraint is removed. For instance, one of the parameters, y, say, which originally had the value y = yi, is now allowed to vary. According to statistical mechanics, all states accessible to the system are equally likely. Thus, the probability P(y) of finding the system in equilibrium with the parameter in the range y to y + δy is just proportional to the number of microstates in this interval: i.e.,

P(y) ∝ Ω(y).   (5.60)

Usually, Ω(y) has a very pronounced maximum at some particular value ỹ (see Sect. 5.2). This means that practically all systems in the final equilibrium ensemble have values of y close to ỹ. Thus, if yi ≠ ỹ initially then the parameter y will change until it attains a final value close to ỹ, where Ω is maximum. This discussion can be summed up in a single phrase:

If some of the constraints of an isolated system are removed then the parameters of the system tend to readjust themselves in such a way that

Ω(y1, · · · , yn) → maximum.   (5.61)

Suppose that the final equilibrium state has been reached, so that the systems in the ensemble are uniformly distributed over the Ωf accessible final states. If the original constraints are reimposed then the systems in the ensemble still occupy these Ωf states with equal probability. Thus, if Ωf > Ωi, simply restoring the constraints does not restore the initial situation. Once the systems are randomly distributed over the Ωf states they cannot be expected to spontaneously move out of some of these states and occupy a more restricted class of states merely in response to the reimposition of a constraint. The initial condition can also not be restored by removing further constraints. This could only lead to even more states becoming accessible to the system.

Suppose that some process occurs in which an isolated system goes from some initial configuration to some final configuration. If the final configuration is such that the imposition or removal of constraints cannot by itself restore the initial condition then the process is deemed irreversible. On the other hand, if it is such that the imposition or removal of constraints can restore the initial condition then the process is deemed reversible.
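The half-box example above can be made quantitative. The sketch below is not from the notes: it evaluates the probability (5.58) for a few particle numbers, switches to logarithms where the direct power underflows, and also computes the associated entropy increase k ln(Ωf/Ωi) = N k ln 2 that the following pages derive.

```python
import math

# Probability (5.58) that all N molecules are found in one half of the box,
# and the corresponding entropy jump when the partition is removed.
k = 1.381e-23

for N in (10, 100, 1000):
    print(N, 0.5 ** N)           # already ~1e-301 for N = 1000

# For a mole of gas 0.5**N underflows to zero, so work with logarithms:
N = 6.0e23
print(N * math.log(0.5))         # ln P_i ~ -4.2e23, i.e. P_i ~ exp(-1e24)
print(k * N * math.log(2))       # Delta S = k N ln 2 ~ 5.7 J/K for one mole
```

Even a thousand molecules make the spontaneous un-mixing absurdly improbable; for a mole the probability is so small that only its logarithm is representable, which is one practical reason for working with ln Ω (i.e., with entropy) rather than Ω itself.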
On a microscopic level, it is clearly plausible that a system should spontaneously evolve from an improbable to a probable configuration in response to the relaxation of some constraint, but it is quite clearly implausible that a system should ever spontaneously evolve from a probable to an improbable configuration. In other words, it is unphysical for Ω ever to decrease spontaneously. In symbols, we can write

    Ω_f − Ω_i ≡ ∆Ω ≥ 0                                    (5.62)

for any physical process operating on an isolated system. If Ω for an isolated system spontaneously increases, then the process is irreversible; if Ω stays the same, so that Ω_f = Ω_i, then the process is reversible. A reversible process therefore corresponds to the special case where the removal of constraints does not change the number of accessible states: the systems remain distributed with equal probability over these states irrespective of whether the constraints are imposed or not.

Our microscopic definition of irreversibility is in accordance with the macroscopic definition discussed in Sect. 3.6. Recall that on a macroscopic level an irreversible process is one which "looks unphysical" when viewed in reverse. If a gas is initially restricted to one half of a box, via a partition, then the flow of gas from one side of the box to the other when the partition is removed is an irreversible process. It is irreversible on a macroscopic level because it is obviously unphysical for the molecules of a gas to spontaneously distribute themselves in such a manner that they only occupy half of the available volume. It is irreversible on a microscopic level because the initial configuration cannot be recovered by simply replacing the partition.

It is actually possible to quantify irreversibility. In other words, in addition to stating that a given process is irreversible, we can also give some indication of how irreversible it is. The parameter which measures irreversibility is just the number of accessible states Ω, the degree of irreversibility being proportional to the amount of the increase in Ω. Let us consider our example again, where an ideal gas doubles in volume (at constant energy) due to the removal of a partition. In this case, the fractional increase in Ω is

    Ω_f / Ω_i ∼ 10^(2 ν×10^23),                           (5.63)

where ν is the number of moles.
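The size of this ratio is easy to make concrete with a short illustrative calculation. It uses the fact, discussed in the surrounding text, that Ω varies as V^N for an ideal gas, so doubling the volume multiplies Ω by 2^N with N = ν N_A molecules; the variable names and the choice of one mole are my own.

```python
import math

# When an ideal gas doubles in volume at constant energy, Omega ~ V^N,
# so the number of accessible states grows by Omega_f/Omega_i = 2^N,
# where N = nu * N_A is the number of molecules.
N_A = 6.0e23      # Avogadro's number, as quoted in the notes
nu = 1.0          # number of moles (illustrative)
N = nu * N_A

# 2^N overflows any floating-point type, so work with the base-10 logarithm.
log10_ratio = N * math.log10(2.0)
print(f"Omega_f/Omega_i ~ 10^({log10_ratio:.2e})")
```

The logarithm comes out at about 1.8×10^23, consistent with the estimate Ω_f/Ω_i ∼ 10^(2 ν×10^23).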
This is an extremely large number! It is far more convenient to measure irreversibility in terms of ln Ω. If Eq. (5.62) is true then it is certainly also true that

    ln Ω_f − ln Ω_i ≡ ∆ ln Ω ≥ 0                          (5.64)

for any physical process operating on an isolated system. The increase in ln Ω when an ideal gas doubles in volume (at constant energy) is

    ln Ω_f − ln Ω_i = ν N_A ln 2,                         (5.65)

where N_A = 6×10^23. This is a far more manageable number! Since we usually deal with particles by the mole in laboratory physics, it makes sense to pre-multiply our measure of irreversibility by a number of order 1/N_A. For historical reasons, the number which is generally used for this purpose is the Boltzmann constant k, which can be written

    k = R / N_A  joules/kelvin,                           (5.66)

where

    R = 8.3143  joules/kelvin/mole                        (5.67)

is the ideal gas constant which appears in the well-known equation of state for an ideal gas, P V = ν R T. Thus, the final form for our measure of irreversibility is

    S = k ln Ω.                                           (5.68)

This quantity is termed "entropy", and is measured in joules per degree kelvin. The increase in entropy when an ideal gas doubles in volume (at constant energy) is

    S_f − S_i = ν R ln 2,                                 (5.69)

which is order unity for laboratory scale systems (i.e., those containing about one mole of particles). The essential irreversibility of macroscopic phenomena can be summed up as follows:

    S_f − S_i ≡ ∆S ≥ 0                                    (5.70)

for a process acting on an isolated system [this is equivalent to Eqs. (5.61), (5.62), and (5.64)].
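The volume-doubling example can be checked numerically. This is a minimal sketch using the value of R quoted in the text; the choice of one mole is an assumption of the example.

```python
import math

# Entropy increase when nu moles of ideal gas double in volume at constant
# energy: S_f - S_i = nu * R * ln 2, with R the ideal gas constant.
R = 8.3143        # joules/kelvin/mole, as quoted in the notes
nu = 1.0          # number of moles (illustrative)

delta_S = nu * R * math.log(2.0)
print(f"S_f - S_i = {delta_S:.2f} J/K")
```

The result, roughly 5.8 J/K, is indeed of order unity, in contrast to the astronomically large ratio Ω_f/Ω_i.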
Thus, the entropy of an isolated system tends to increase with time, and can never decrease. This proposition is known as the second law of thermodynamics.

One way of thinking of the number of accessible states Ω is that it is a measure of the disorder associated with a macrostate. For a system exhibiting a high degree of order, we would expect a strong correlation between the motions of the individual particles. For instance, in a fluid there might be a strong tendency for the particles to move in one particular direction, giving rise to an ordered flow of the system in that direction. On the other hand, for a system exhibiting a low degree of order, we expect far less correlation between the motions of individual particles. Clearly, an ordered system is more constrained than a disordered system. It follows that, all other things being equal, an ordered system has less accessible microstates than a corresponding disordered system, since the former is excluded from microstates in which there is not a strong correlation between individual particle motions, whereas the latter is not. Thus, entropy is effectively a measure of the disorder in a system (the disorder increases with S). With this interpretation, the second law of thermodynamics reduces to the statement that isolated systems tend to become more disordered with time, and can never become more ordered.

Note that the second law of thermodynamics only applies to isolated systems. The entropy of a non-isolated system can decrease. For instance, if a gas expands (at constant energy) to twice its initial volume after the removal of a partition, we can subsequently recompress the gas to its original volume. The energy of the gas will increase because of the work done on it during compression, but if we absorb some heat from the gas then we can restore it to its initial state. Thus, in restoring the gas to its original state, we have restored its original entropy. This appears to violate the second law of thermodynamics, because the entropy should have increased in what is obviously an irreversible process (just try to make a gas spontaneously occupy half of its original volume!). However, if we consider a new system consisting of the gas plus the compression and heat absorption machinery, then it is still true that the entropy of this system (which is assumed to be isolated) must increase in time. Thus, the entropy of the gas is only kept the same at the expense of increasing the entropy of the rest of the system, and the total entropy is increased.
If we consider the system of everything in the Universe, which is certainly an isolated system since there is nothing outside it with which it could interact, then the second law of thermodynamics becomes: the disorder of the Universe tends to increase with time and can never decrease. An irreversible process is clearly one which increases the disorder of the Universe, whereas a reversible process neither increases nor decreases disorder. This definition is in accordance with our previous definition of an irreversible process as one which "does not look right" when viewed backwards. One easy way of viewing macroscopic events in reverse is to film them, and then play the film backwards through a projector.

There is a famous passage in the novel "Slaughterhouse 5," by Kurt Vonnegut, in which the hero, Billy Pilgrim, views a propaganda film of an American World War II bombing raid on a German city in reverse. This is what the film appeared to show:

"American planes, full of holes and wounded men and corpses, took off backwards from an airfield in England. Over France, a few German fighter planes flew at them backwards, sucked bullets and shell fragments from some of the planes and crewmen. They did the same for wrecked American bombers on the ground, and those planes flew up backwards and joined the formation. The formation flew backwards over a German city that was in flames. The bombers opened their bomb bay doors, exerted a miraculous magnetism which shrunk the fires, gathered them into cylindrical steel containers, and lifted the containers into the bellies of the planes. The containers were stored neatly in racks. The Germans had miraculous devices of their own, which were long steel tubes. They used them to suck more fragments from the crewmen and planes. But there were still a few wounded Americans, though, and some of the bombers were in bad repair. Over France, though, German fighters came up again, made everything and everybody as good as new."

Vonnegut's point, I suppose, is that the morality of actions is inverted when you view them in reverse.
What is there about this passage which strikes us as surreal and fantastic? What is there that immediately tells us that the events shown in the film could never happen in reality? It is not so much that the planes appear to fly backwards and the bombs appear to fall upwards. After all, if you had never seen a plane before it would not be obvious which way around it was supposed to fly, and, given a little ingenuity and a sufficiently good pilot, it is probably possible to fly a plane backwards. Likewise, if we were to throw a bomb up in the air with just the right velocity we could, in principle, fix it so that the velocity of the bomb matched that of a passing bomber when their paths intersected. Certainly, though, certain events depicted in the film, "miraculous" events in Vonnegut's words, would immediately strike us as the wrong way around even if we had never seen them before. For instance, the film might show thousands of white hot bits of shrapnel approach each other from all directions at great velocity, compressing an explosive gas in the process, which slows them down such that when they meet they fit together exactly to form a metal cylinder enclosing the gases and moving upwards at great velocity. What strikes us as completely implausible about this event is the spontaneous transition from the disordered motion of the gases and metal fragments to the ordered upward motion of the bomb.

5.7 Properties of entropy

Entropy, as we have defined it, has some dependence on the resolution δE to which the energy of macrostates is measured. Recall that Ω(E) is the number of accessible microstates with energy in the range E to E + δE. Suppose that we choose a new resolution δ*E and define a new density of states Ω*(E) which is the number of states with energy in the range E to E + δ*E. It can easily be seen that

    Ω*(E) = (δ*E / δE) Ω(E).                              (5.71)

It follows that the new entropy S* = k ln Ω* is related to the previous entropy S = k ln Ω via

    S* = S + k ln(δ*E / δE).                              (5.72)
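The correction term k ln(δ*E/δE) is tiny compared with the entropy itself. The following sketch makes the comparison for a laboratory-scale system; the values of k and f are assumptions of the example, not quantities fixed by the text.

```python
import math

# Compare the resolution-dependent correction k*ln(ratio) with the typical
# entropy scale S ~ k*f, taking the resolution ratio to be as large as f itself.
k = 1.381e-23     # Boltzmann constant in J/K (assumed value for this sketch)
f = 1.0e24        # degrees of freedom of a laboratory-scale system

S_typical = k * f              # usual estimate S ~ k f
correction = k * math.log(f)   # correction even for a resolution ratio of order f

print(correction / S_typical)  # a number of order 1e-22: utterly negligible
```

Even when the two resolutions differ by twenty-four orders of magnitude, the correction is smaller than the entropy by a factor of order 10^22.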
Now, our usual estimate that Ω ∼ E^f gives S ∼ k f, where f is the number of degrees of freedom. It follows that even if δ*E were to differ from δE by of order f (i.e., twenty four orders of magnitude), the second term on the right-hand side of the above equation is still only of order k ln f, which is utterly negligible compared to k f. It follows that

    S* = S                                                (5.73)

to an excellent approximation, so our definition of entropy is completely insensitive to the resolution to which we measure energy (or any other macroscopic parameter).

Note that, like the temperature, the entropy of a macrostate is only well-defined if the macrostate is in equilibrium. The crucial point is that it only makes sense to talk about the number of accessible states if the systems in the ensemble are given sufficient time to thoroughly explore all of the possible microstates consistent with the known macroscopic constraints. In other words, we can only be sure that a given microstate is inaccessible when the systems in the ensemble have had ample opportunity to move into it, and yet have not done so. For an equilibrium state, then, the entropy is just as well-defined as more familiar quantities such as the temperature and the mean pressure.

Consider, again, two systems A and A′ which are in thermal contact but can do no work on one another (see Sect. 5.2). Let E and E′ be the energies of the two systems, and Ω(E) and Ω′(E′) the respective densities of states. Furthermore, let E^(0) be the conserved energy of the system as a whole and Ω^(0) the corresponding density of states. We have from Eq. (5.2) that

    Ω^(0)(E) = Ω(E) Ω′(E′),                               (5.74)

where E′ = E^(0) − E, since every microstate of A can be combined with every microstate of A′ to form a distinct microstate of the whole system. In other words, the number of states accessible to the whole system is the product of the numbers of states accessible to each subsystem. We know, again from Sect. 5.2, that in equilibrium the mean energy of A takes the value Ē = Ẽ for which Ω^(0)(E) is maximum, and the temperatures of A and A′ are equal. The distribution of E around the
mean value is of order ∆*E = Ẽ/√f. It follows that the total number of accessible microstates is approximately the number of states which lie within ∆*E of Ẽ. Thus,

    Ω^(0)_tot ≈ Ω^(0)(Ẽ) ∆*E / δE.                        (5.75)

The entropy of the whole system is given by

    S^(0) = k ln Ω^(0)_tot = k ln Ω^(0)(Ẽ) + k ln(∆*E / δE).   (5.76)

According to our usual estimate, Ω ∼ E^f, the first term on the right-hand side is of order k f, whereas the second term is of order k ln(Ẽ/√f δE). Any reasonable choice for the energy subdivision δE should be greater than Ẽ/f, otherwise there would be less than one microstate per subdivision. It follows that the second term is less than or of order k ln f, which is utterly negligible compared to k f. It can be seen that the probability distribution for Ω^(0)(E) is so strongly peaked around its maximum value that, for the purpose of calculating the entropy, the total number of states can be taken equal to the maximum number of states [i.e., Ω^(0)_tot ∼ Ω^(0)(Ẽ)]. Thus, to an excellent approximation,

    S^(0) = k ln Ω^(0)(Ẽ) = k ln[Ω(Ẽ) Ω′(Ẽ′)] = k ln Ω(Ẽ) + k ln Ω′(Ẽ′),   (5.77)

giving

    S^(0) = S(Ẽ) + S′(Ẽ′).                                (5.78)

In other words, the total entropy of two thermally interacting systems in equilibrium is the sum of the entropies of each system in isolation.

5.8 Uses of entropy

We have defined a new function called entropy, denoted S, which parameterizes the amount of disorder in a macroscopic system. The entropy of an equilibrium macrostate is related to the number of accessible microstates Ω via

    S = k ln Ω.                                           (5.79)

One consequence of this definition is that the entropy has the simple additive property shown in Eq. (5.78).
On a macroscopic level, the increase in entropy due to a quasi-static change in which an infinitesimal amount of heat đQ is absorbed by the system is given by

    dS = đQ / T,                                          (5.80)

where T is the absolute temperature of the system. The second law of thermodynamics states that the entropy of an isolated system can never spontaneously decrease. Let us now briefly examine some consequences of these results.

Consider two bodies, A and A′, which are in thermal contact but can do no work on one another. We know what is supposed to happen here: heat flows from the hotter to the colder of the two bodies until their temperatures are the same. Consider a quasi-static exchange of heat between the two bodies. According to the first law of thermodynamics, if an infinitesimal amount of heat đQ is absorbed by A, then an infinitesimal amount of heat đQ′ = −đQ is absorbed by A′. The increase in the entropy of system A is dS = đQ/T, and the corresponding increase in the entropy of A′ is dS′ = đQ′/T′, where T and T′ are the temperatures of the two systems. Note that đQ is assumed to be sufficiently small that the heat transfer does not substantially modify the temperatures of either system. The change in entropy of the whole system is

    dS^(0) = dS + dS′ = (1/T − 1/T′) đQ.                  (5.81)

This change must be positive or zero, according to the second law of thermodynamics, so dS^(0) ≥ 0. It follows that đQ is positive (i.e., heat flows from A′ to A) when T′ > T, and vice versa. The spontaneous flow of heat only ceases when T = T′. Thus, the direction of spontaneous heat flow is a consequence of the second law of thermodynamics. Note that the spontaneous flow of heat between bodies at different temperatures is always an irreversible process which increases the entropy, or disorder, of the Universe.
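The sign argument above is easy to verify numerically. A minimal sketch, with illustrative temperatures and an illustrative function name of my own choosing:

```python
# Total entropy change dS(0) = (1/T - 1/T') dQ for a small quasi-static heat
# transfer dQ into body A at temperature T, out of body A' at temperature T'.
def total_entropy_change(dQ, T, T_prime):
    """dQ > 0 means heat is absorbed by A (and released by A')."""
    return dQ * (1.0 / T - 1.0 / T_prime)

# Heat flowing from the hotter body (T' = 400 K) into the colder one (T = 300 K)
# increases the total entropy, as the second law requires.
dS0 = total_entropy_change(dQ=1.0, T=300.0, T_prime=400.0)
print(f"dS(0) = {dS0:.2e} J/K")   # positive

# Heat flowing the "wrong" way would decrease the total entropy, and is
# therefore forbidden for an isolated pair of bodies.
print(total_entropy_change(dQ=-1.0, T=300.0, T_prime=400.0))   # negative
```

The entropy change vanishes only when T = T′, which is precisely the condition for the spontaneous heat flow to cease.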
Consider, now, the slightly more complicated situation in which the two systems can exchange heat and also do work on one another via a movable partition. Suppose that the total volume is invariant, so that

    V^(0) = V + V′ = constant,                            (5.82)

where V and V′ are the volumes of A and A′, respectively. Consider a quasi-static change in which system A absorbs an infinitesimal amount of heat đQ, and its volume simultaneously increases by an infinitesimal amount dV. The infinitesimal amount of work done by system A is đW = p̄ dV, where p̄ is the mean pressure of A. According to the first law of thermodynamics,

    đQ = dE + đW = dE + p̄ dV,                             (5.83)

where dE is the change in the internal energy of A. Since dS = đQ/T, the increase in entropy of system A is written

    dS = (dE + p̄ dV) / T.                                 (5.84)

Likewise, the increase in entropy of system A′ is given by

    dS′ = (dE′ + p̄′ dV′) / T′.                            (5.85)

According to Eq. (5.84),

    1/T = (∂S/∂E)_V,                                      (5.86)
    p̄/T = (∂S/∂V)_E,                                      (5.87)

where the subscripts are to remind us what is held constant in the partial derivatives. We can write a similar pair of equations for the system A′. The overall system is assumed to be isolated, so conservation of energy gives dE + dE′ = 0. Furthermore, Eq. (5.82) implies that dV + dV′ = 0. It follows that the total change in entropy is given by

    dS^(0) = dS + dS′ = (1/T − 1/T′) dE + (p̄/T − p̄′/T′) dV.   (5.88)

The equilibrium state is the most probable state (see Sect. 5.2). According to statistical mechanics, this is equivalent to the state with the largest number of accessible microstates. Finally, Eq. (5.79) implies that this is the maximum entropy state. The system can never spontaneously leave a maximum entropy state, since this would imply a spontaneous reduction in entropy, which is forbidden by the second law of thermodynamics. A maximum or minimum entropy state must satisfy dS^(0) = 0 for arbitrary small variations of the energy and external parameters. It follows from Eq. (5.88) that

    T = T′,                                               (5.89)
    p̄ = p̄′,                                               (5.90)

for such a state. In other words, in the maximum entropy state the systems A and A′ have equal temperatures (i.e., they are in thermal equilibrium) and equal pressures (i.e., they are in mechanical equilibrium). This corresponds to a maximum entropy state (i.e., an equilibrium state) provided

    (∂²S/∂E²)_V < 0,                                      (5.91)
    (∂²S/∂V²)_E < 0,                                      (5.92)

with a similar pair of inequalities for system A′. The usual estimate Ω ∝ E^f V^f, giving S = k f ln E + k f ln V + ⋯, ensures that the above inequalities are satisfied in conventional macroscopic systems. The second law of thermodynamics implies that the two interacting systems will evolve towards this state, and will then remain in it indefinitely (if left undisturbed).

5.9 Entropy and quantum mechanics

The entropy of a system is defined in terms of the number Ω of accessible microstates consistent with an overall energy in the range E to E + δE via

    S = k ln Ω.                                           (5.93)

We have already demonstrated that this definition is utterly insensitive to the resolution δE to which the macroscopic energy is measured (see Sect. 5.7). In classical mechanics, if a system possesses f degrees of freedom, then phase-space is conventionally subdivided into cells of arbitrarily chosen volume h0^f (see Sect. 3.2),
which corresponds to specifying the particle coordinates and momenta to arbitrary accuracy. The number of accessible microstates is equivalent to the number of these cells in the volume of phase-space consistent with an overall energy of the system lying in the range E to E + δE. Thus,

    Ω = (1/h0^f) ∫⋯∫ dq1 ⋯ dqf dp1 ⋯ dpf,                 (5.94)

giving

    S = k ln( ∫⋯∫ dq1 ⋯ dqf dp1 ⋯ dpf ) − k f ln h0.      (5.95)

Thus, in classical mechanics the entropy is undetermined to an arbitrary additive constant which depends on the size of the cells in phase-space: S increases as the cell size decreases. This non-unique value of the entropy comes about because there is no limit to the precision to which the state of a classical system can be specified; the cell size h0 can be made arbitrarily small. However, in quantum mechanics the uncertainty principle sets a definite limit to how accurately the particle coordinates and momenta can be specified. In general,

    δqi δpi ≥ h,                                          (5.96)

where pi is the momentum conjugate to the generalized coordinate qi, and δqi, δpi are the uncertainties in these quantities, respectively. Thus, in quantum mechanics the number of accessible quantum states with the overall energy in the range E to E + δE is completely determined, so the entropy of a system has a unique and unambiguous value. Quantum mechanics can often be "mocked up" in classical mechanics by setting the cell size in phase-space equal to Planck's constant, so that h0 = h. This automatically enforces the most restrictive form of the uncertainty principle, δqi δpi = h. In many systems, the substitution h0 → h in Eq. (5.95) gives the same, unique value for S as that obtained from a full quantum mechanical calculation.

The additive constant does not matter in practice. The second law of thermodynamics is only concerned with changes in entropy, and macroscopic thermodynamical quantities, such as the temperature and pressure, which can be expressed as partial derivatives of the entropy with respect to various macroscopic parameters [see Eqs. (5.86) and (5.87)], are unaffected by such a constant. In fact, in classical mechanics the entropy is rather like a gravitational potential: it is undetermined to an additive constant, but this does not affect any physical laws.

Consider a simple quantum mechanical system consisting of N non-interacting spinless particles of mass m confined in a cubic box of dimension L. The energy levels of the i-th particle are given by

    e_i = (ħ² π² / 2 m L²) (n_i1² + n_i2² + n_i3²),       (5.97)

where n_i1, n_i2, and n_i3 are three (positive) quantum numbers. The overall energy of the system is the sum of the energies of the individual particles, so that for a general state r

    E_r = Σ_{i=1}^{N} e_i.                                (5.98)

The overall state of the system is completely specified by 3N quantum numbers, so the number of degrees of freedom is f = 3N. The classical limit corresponds to the situation where all of the quantum numbers are much greater than unity. In this limit, the number of accessible states varies with energy according to our usual estimate Ω ∝ E^f. The lowest possible energy state of the system, the so-called ground-state, corresponds to the situation where all quantum numbers take their lowest possible value, unity. Thus, the ground-state energy E0 is given by

    E0 = f ħ² π² / (2 m L²).                              (5.99)

There is only one accessible microstate at the ground-state energy (i.e., that where all quantum numbers are unity), so by our usual definition of entropy

    S(E0) = k ln 1 = 0.                                   (5.100)

In other words, there is no disorder in the system when all the particles are in their ground-states.
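The ground-state energy formula can be sketched in a few lines of code. The particle mass, box size, and particle number below are illustrative choices of mine, not values from the text; the point is simply that summing the single-particle levels of Eq. (5.97) with all quantum numbers equal to unity reproduces Eq. (5.99).

```python
import math

hbar = 1.0546e-34   # J s (assumed value of the reduced Planck constant)
m = 6.6e-27         # kg, roughly the mass of a helium atom (illustrative)
L = 1.0e-2          # m, box dimension (illustrative)
N = 100             # number of particles, so f = 3N degrees of freedom

def single_particle_level(n1, n2, n3):
    """Energy of one particle with quantum numbers (n1, n2, n3)."""
    return (hbar**2 * math.pi**2 / (2.0 * m * L**2)) * (n1**2 + n2**2 + n3**2)

f = 3 * N
E0 = N * single_particle_level(1, 1, 1)   # every quantum number at its minimum
print(f"E0 = {E0:.3e} J")
```

Since each particle contributes 1² + 1² + 1² = 3 units, the total is 3N = f units of ħ²π²/(2mL²), exactly as in Eq. (5.99).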
Clearly, as the energy approaches the ground-state energy, the number of accessible states becomes far less than the usual classical estimate E^f. This is true for all quantum mechanical systems. In general, the number of microstates varies roughly like

    Ω(E) ∼ 1 + C (E − E0)^f,                              (5.101)

where C is a positive constant. According to Eq. (5.31), the temperature varies approximately like

    T ∼ (E − E0) / (k f),                                 (5.102)

provided Ω ≫ 1. Thus, as the absolute temperature of a system approaches zero, the internal energy approaches a limiting value E0 (the quantum mechanical ground-state energy), and the entropy approaches the limiting value zero. This proposition is known as the third law of thermodynamics.

At low temperatures, great care must be taken to ensure that equilibrium thermodynamical arguments are applicable, since the rate of attaining equilibrium may be very slow. Another difficulty arises when dealing with a system in which the atoms possess nuclear spins. Typically, when such a system is brought to a very low temperature, the entropy associated with the degrees of freedom not involving nuclear spins becomes negligible. Nevertheless, the number of microstates Ωs corresponding to the possible nuclear spin orientations may be very large; indeed, it may be just as large as the number of states at room temperature. The reason for this is that nuclear magnetic moments are extremely small, and, therefore, have extremely weak mutual interactions. Thus, it only takes a tiny amount of heat energy in the system to completely randomize the spin orientations. Typically, a temperature as small as 10^−3 degrees kelvin above absolute zero is sufficient to randomize the spins.

Suppose that the system consists of N atoms of spin 1/2. Each spin can have two possible orientations. If there is enough residual heat energy in the system to randomize the spins, then each orientation is equally likely. It follows that there are Ωs = 2^N accessible spin states. The entropy associated with these states is S0 = k ln Ωs = ν R ln 2. Below some critical temperature, T0, the interaction between the nuclear spins becomes significant, and the system settles down in some unique quantum mechanical ground-state (e.g., with all spins aligned). In this situation, S → 0, in accordance with the third law of thermodynamics.
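The spin entropy S0 = k ln 2^N can be evaluated for a mole of spin-1/2 atoms. A minimal sketch, with the values of k and N_A taken as in the notes; the trick is to compute k ln(2^N) as N k ln 2, since 2^N itself is astronomically large.

```python
import math

# Entropy of N randomized spin-1/2 nuclear spins: Omega_s = 2^N, so
# S0 = k ln(Omega_s) = N k ln 2 = nu R ln 2.
k = 1.381e-23     # Boltzmann constant, J/K (assumed value for this sketch)
N_A = 6.0e23      # Avogadro's number, as quoted in the notes
nu = 1.0          # number of moles (illustrative)
N = nu * N_A

S0 = N * k * math.log(2.0)    # k * ln(2^N), evaluated without forming 2^N
print(f"S0 = {S0:.2f} J/K")
```

Note that this is the same ν R ln 2 as the entropy increase in the volume-doubling example: in both cases each of the N particles contributes one binary choice, i.e., one factor of 2 in Ω.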
However, for temperatures which are small, but not small enough to "freeze out" the nuclear spin degrees of freedom, the entropy approaches a limiting value S0 which depends only on the kinds of atomic nuclei in the system. This limiting value is independent of the spatial arrangement of the atoms, or of the interactions between them. Thus, for most practical purposes the third law of thermodynamics can be written

    as T → 0+:  S → S0,                                   (5.103)

where 0+ denotes a temperature which is very close to absolute zero. This modification of the third law is useful because it can be applied at temperatures which are not prohibitively low.

5.10 The laws of thermodynamics

We have now come to the end of our investigation of the fundamental postulates of classical and statistical thermodynamics. The remainder of this course is devoted to the application of the ideas we have just discussed to various situations of interest in Physics. Before we proceed, however, it is useful to summarize the results of our investigations. Our summary takes the form of a number of general statements regarding macroscopic variables which are usually referred to as the laws of thermodynamics:

Zeroth Law: If two systems are separately in thermal equilibrium with a third system then they must be in equilibrium with one another (see Sect. 5.3).

First Law: The change in internal energy of a system in going from one macrostate to another is the difference between the net heat absorbed by the system from its surroundings and the net work done by the system on its surroundings (see Sect. 4.1).

Second Law: The entropy of an isolated system can never spontaneously decrease (see Sect. 5.6).

Third Law: In the limit as the absolute temperature tends to zero the entropy also tends to zero (see Sect. 5.9).
6 Classical thermodynamics

6.1 Introduction

We have learned that macroscopic quantities such as energy, temperature, and pressure are, in fact, statistical in nature: i.e., in equilibrium they exhibit random fluctuations about some mean value. If we were to plot out the probability distribution for the energy, say, of a system in thermal equilibrium with its surroundings, we would obtain a Gaussian with a very small fractional width. In fact, we expect

    ∆*E / Ē ∼ 1/√f,                                       (6.1)

where the number of degrees of freedom f is about 10^24 for laboratory scale systems. This means that the statistical fluctuations of macroscopic quantities about their mean values are typically only about 1 part in 10^12. Since the statistical fluctuations of equilibrium quantities are so small, we can neglect them to an excellent approximation, and replace macroscopic quantities, such as energy, temperature, and pressure, by their mean values: p → p̄, T → T̄, etc. In the following discussion, we shall drop the overbars altogether, so that p should be understood to represent the mean pressure p̄, etc. This prescription, which is the essence of classical thermodynamics, is equivalent to replacing all statistically varying quantities by their most probable values.

Although there are formally four laws of thermodynamics (i.e., the zeroth to the third), the zeroth law is really a consequence of the second law, and the third law is actually only important at temperatures close to absolute zero. So, for most purposes, the two laws which really matter are the first law and the second law. For an infinitesimal process, the first law is written

    đQ = dE + đW,                                         (6.2)

where dE is the change in internal energy of the system, đQ is the heat absorbed by the system, and đW is the work done by the system on its surroundings. Note that this is just a convention: we could equally well write the first law in terms of the heat emitted by the system, or the work done on the system. It does not really matter, as long as we are consistent in our definitions. The second law of thermodynamics implies that

    đQ = T dS                                             (6.3)

for a quasi-static process, where T is the thermodynamic temperature and dS is the change in entropy of the system. Furthermore, for systems in which the only external parameter is the volume (i.e., gases), the work done on the environment is

    đW = p dV,                                            (6.4)

where p is the pressure and dV is the change in volume. Thus, it follows from the first and second laws of thermodynamics that

    T dS = dE + p dV.                                     (6.5)

6.2 The equation of state of an ideal gas

Let us start our discussion by considering the simplest possible macroscopic system: i.e., an ideal gas. All of the thermodynamic properties of an ideal gas are summed up in its equation of state, which determines the relationship between its pressure, volume, and temperature. Unfortunately, classical thermodynamics is unable to tell us what this equation of state is from first principles. In fact, classical thermodynamics cannot tell us anything from first principles; we always have to provide some information to begin with before classical thermodynamics can generate any new results. This initial information may come from statistical physics (i.e., from our knowledge of the microscopic structure of the system under consideration), but, more usually, it is entirely empirical in nature (i.e., it is the result of experiments). Of course, the ideal gas law was first discovered empirically by Robert Boyle, but, nowadays, we can justify it from statistical arguments. Recall, from Sect. 3.8, that the number of accessible states of a monatomic ideal gas varies like

    Ω ∝ V^N χ(E),                                         (6.6)
where N is the number of atoms, and χ(E) depends only on the energy of the gas (and is independent of the volume). We obtained this result by integrating over the volume of accessible phase-space. Since the energy of an ideal gas is independent of the particle coordinates (because there are no interatomic forces), the integrals over the coordinates just reduced to N simultaneous volume integrals, giving the V^N factor in the above expression. The integrals over the particle momenta were more complicated, but were clearly completely independent of V, giving the χ(E) factor in the above expression. Now, we have a statistical rule which tells us that

    X̄α = (1/β) ∂ ln Ω / ∂xα                               (6.7)

[see Eq. (5.41)], where X̄α is the mean force conjugate to the external parameter xα (i.e., đW = Σα X̄α dxα), and β = 1/kT. For an ideal gas, the only external parameter is the volume, and its conjugate force is the pressure (since đW = p̄ dV). So, we can write

    p̄ = (1/β) ∂ ln Ω / ∂V.                                (6.8)

If we simply apply this rule to Eq. (6.6), we obtain

    p̄ = N k T / V.                                        (6.9)

This allows us to write the equation of state in its usual form

    p V = ν R T,                                          (6.10)

since N = ν N_A and k N_A = R, where ν is the number of moles, N_A is Avogadro's number, and R is the ideal gas constant.

The above derivation of the ideal gas equation of state is rather elegant. It is certainly far easier to obtain the equation of state in this manner than to treat the atoms which make up the gas as little billiard balls which continually bounce off the walls of a container. The latter derivation is difficult to perform correctly because it is necessary to average over all possible directions of atomic motion. It is clear, from the above derivation, that the crucial element needed to obtain the ideal gas equation of state is the absence of interatomic forces. This automatically gives rise to a variation of the number of accessible states with E and V of the form (6.6), which, in turn, implies the ideal gas law. Polyatomic gases are more complicated than monatomic gases because the molecules can rotate and vibrate, giving rise to extra degrees of freedom, in addition to the translational degrees of freedom of a monatomic gas. In other words, χ(E), in Eq. (6.6), becomes a lot more complicated in polyatomic gases. However, as long as there are no interatomic forces, the volume dependence of Ω is still V^N, and the ideal gas law should still hold true. So, the ideal gas law should also apply to polyatomic gases with no interatomic forces. In fact, we shall discover that the extra degrees of freedom of polyatomic gases manifest themselves by increasing the specific heat capacity.
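The equation of state p V = ν R T is simple to exercise numerically. A minimal sketch, with an illustrative function name and illustrative numbers (one mole at 273 K in 22.4 litres) of my own choosing:

```python
# Equation of state of an ideal gas: p V = nu R T.
R = 8.3143        # ideal gas constant, J/(K mol), as quoted in the notes

def ideal_gas_pressure(nu, T, V):
    """Pressure in pascals for nu moles at temperature T (kelvin) in volume V (m^3)."""
    return nu * R * T / V

p = ideal_gas_pressure(nu=1.0, T=273.0, V=22.4e-3)
print(f"p = {p:.4g} Pa")   # close to one atmosphere (~1.013e5 Pa)
```

Note that the result is independent of what kind of molecules make up the gas, in keeping with the observation that only the V^N factor in Ω, not χ(E), matters for the equation of state.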
There is one other conclusion we can draw from Eq. (6.6). The statistical definition of temperature is [Eq. (5.31)]

    1/(k T) = ∂ ln Ω / ∂E.                                (6.11)

It follows that

    1/(k T) = ∂ ln χ / ∂E.                                (6.12)

We can see that, since χ is a function of the energy, but not the volume, the temperature must be a function of the energy, but not the volume. We can turn this around and write

    E = E(T).                                             (6.13)

In other words, the internal energy of an ideal gas depends only on the temperature of the gas, and is independent of the volume. This is pretty obvious, since if there are no interatomic forces then increasing the volume, which effectively increases the mean separation between molecules, is not going to affect the molecular energies in any way. Hence, the energy of the whole gas is unaffected.

The volume independence of the internal energy can also be obtained directly from the ideal gas equation of state. The internal energy of a gas can be considered as a general function of the temperature and volume, so

    E = E(T, V).                                          (6.14)

It follows from mathematics that

    dE = (∂E/∂T)_V dT + (∂E/∂V)_T dV,                     (6.15)
where the subscript V reminds us that the first partial derivative is taken at constant volume, and the subscript T reminds us that the second partial derivative is taken at constant temperature.

Thermodynamics tells us that for a quasi-static change of parameters

T dS = dQ = dE + p dV. (6.16)

The ideal gas law can be used to express the pressure in terms of the volume and the temperature in the above expression. Thus,

dS = (1/T) dE + (ν R/V) dV. (6.17)

Using Eq. (6.15), this becomes

dS = (1/T) (∂E/∂T)_V dT + [(1/T) (∂E/∂V)_T + ν R/V] dV. (6.18)

However, dS is the exact differential of a well-defined state function, S. This means that we can consider the entropy to be a function of temperature and volume. Thus, S = S(T, V), and mathematics immediately tells us that

dS = (∂S/∂T)_V dT + (∂S/∂V)_T dV. (6.19)

The above expression is true for all small values of dT and dV, so a comparison with Eq. (6.18) gives

(∂S/∂T)_V = (1/T) (∂E/∂T)_V, (6.20)

(∂S/∂V)_T = (1/T) (∂E/∂V)_T + ν R/V. (6.21)

One well-known property of partial differentials is the equality of second derivatives, irrespective of the order of differentiation, so

∂²S/∂V ∂T = ∂²S/∂T ∂V. (6.22)

This implies that

(∂/∂V)_T (∂S/∂T)_V = (∂/∂T)_V (∂S/∂V)_T. (6.23)
The above expression can be combined with Eqs. (6.20) and (6.21) to give

(1/T) ∂²E/∂V ∂T = −(1/T²) (∂E/∂V)_T + (1/T) ∂²E/∂T ∂V. (6.24)

Since second derivatives are equivalent, irrespective of the order of differentiation, the above relation reduces to

(∂E/∂V)_T = 0, (6.25)

which implies that the internal energy is independent of the volume for any gas obeying the ideal equation of state. This result was confirmed experimentally by James Joule in the middle of the nineteenth century.

6.3 Heat capacity or specific heat

Suppose that a body absorbs an amount of heat ΔQ and its temperature consequently rises by ΔT. The usual definition of the heat capacity, or specific heat, of the body is

C = ΔQ/ΔT. (6.26)

If the body consists of ν moles of some substance then the molar specific heat (i.e., the specific heat of one mole of this substance) is defined

c = (1/ν) ΔQ/ΔT. (6.27)

In writing the above expressions, we have tacitly assumed that the specific heat of a body is independent of its temperature. In general, this is not true. We can overcome this problem by only allowing the body in question to absorb a very small amount of heat, so that its temperature only rises slightly, and its specific heat remains approximately constant. In the limit as the amount of absorbed heat becomes infinitesimal, we obtain

c = (1/ν) dQ/dT. (6.28)
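As a quick illustration of the molar specific heat just defined (a sketch, not taken from the notes: the gas, mole number, and temperature rise below are arbitrary choices, with the value of c borrowed from Table 2):

```python
# Illustrative use of the molar specific heat definition c = (1/nu) dQ/dT:
# the heat needed to warm nu moles through dT is dQ = nu * c * dT.
# The numbers are arbitrary examples; c_V for nitrogen is from Table 2.
c_V = 20.6      # molar specific heat of N2 at constant volume (joules/mole/deg)
nu = 3.0        # number of moles
dT = 10.0       # temperature rise (deg)

dQ = nu * c_V * dT   # heat absorbed (joules)
print(dQ)            # 618.0
```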
In classical thermodynamics, it is usual to define two specific heats. Firstly, the molar specific heat at constant volume, denoted

c_V = (1/ν) (dQ/dT)_V, (6.29)

and, secondly, the molar specific heat at constant pressure, denoted

c_p = (1/ν) (dQ/dT)_p. (6.30)

Consider the molar specific heat at constant volume of an ideal gas. Since dV = 0, no work is done by the gas, and the first law of thermodynamics reduces to

dQ = dE. (6.31)

It follows from Eq. (6.29) that

c_V = (1/ν) (∂E/∂T)_V. (6.32)

Now, for an ideal gas the internal energy is volume independent. Thus, the above expression implies that the specific heat at constant volume is also volume independent. Since E is a function only of T, we can write

dE = (∂E/∂T)_V dT. (6.33)

The previous two expressions can be combined to give

dE = ν c_V dT (6.34)

for an ideal gas.

Let us now consider the molar specific heat at constant pressure of an ideal gas. In general, if the pressure is kept constant then the volume changes, and so the gas does work on its environment. According to the first law of thermodynamics,

dQ = dE + p dV = ν c_V dT + p dV. (6.35)
The equation of state of an ideal gas tells us that if the volume changes by dV, the temperature changes by dT, and the pressure remains constant, then

p dV = ν R dT. (6.36)

The previous two equations can be combined to give

dQ = ν c_V dT + ν R dT. (6.37)

Now, by definition

c_p = (1/ν) (dQ/dT)_p, (6.38)

so we obtain

c_p = c_V + R (6.39)

for an ideal gas. This is a very famous result. Note that at constant volume all of the heat absorbed by the gas goes into increasing its internal energy, and, hence, its temperature, whereas at constant pressure some of the absorbed heat is used to do work on the environment as the volume increases. This means that, in the latter case, less heat is available to increase the temperature of the gas. Thus, we expect the specific heat at constant pressure to exceed that at constant volume, as indicated by the above formula.

The ratio of the two specific heats c_p/c_V is conventionally denoted γ. We have

γ ≡ c_p/c_V = 1 + R/c_V (6.40)

for an ideal gas. In fact, γ is very easy to measure because the speed of sound in an ideal gas is written

c_s = √(γ p/ρ), (6.41)

where ρ is the density. Table 2 lists some experimental measurements of c_V and γ for common gases. The extent of the agreement between γ calculated from Eq. (6.40) and the experimental γ is quite remarkable.
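Equation (6.41) can be turned around to extract γ from a measured sound speed. The sketch below uses rough sea-level values for air (my own illustrative numbers, not measurements from the notes):

```python
# Invert the sound-speed formula c_s = sqrt(gamma p / rho), Eq. (6.41),
# to recover gamma from a measured sound speed.  Rough sea-level values.
p = 1.013e5    # pressure (N/m^2)
rho = 1.22     # density of air (kg/m^3), approximate
c_s = 340.0    # measured speed of sound (m/s), approximate

gamma = c_s ** 2 * rho / p
print(gamma)   # close to the value 1.4 expected for a diatomic gas
```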
Gas             Symbol   c_V (experiment)   γ (experiment)   γ (theory)
Helium          He       12.5               1.666            1.666
Argon           Ar       12.5               1.666            1.666
Nitrogen        N2       20.6               1.405            1.407
Oxygen          O2       21.1               1.396            1.397
Carbon Dioxide  CO2      28.2               1.302            1.298
Ethane          C2H6     39.3               1.220            1.214

Table 2: Specific heats of common gases in joules/mole/deg. (at 15°C and 1 atm.) From Reif.

6.4 Calculation of specific heats

Now that we know the relationship between the specific heats at constant volume and constant pressure for an ideal gas, it would be nice if we could calculate either one of these quantities from first principles. Classical thermodynamics cannot help us here. However, it is quite easy to calculate the specific heat at constant volume using our knowledge of statistical physics. Recall that the variation of the number of accessible states of an ideal gas with energy and volume is written

Ω(E, V) ∝ V^N χ(E). (6.42)

For the specific case of a monatomic ideal gas, we worked out a more exact expression for Ω in Sect. 3.8: i.e.,

Ω(E, V) = B V^N E^(3N/2), (6.43)

where B is a constant independent of the energy and volume. It follows that

ln Ω = ln B + N ln V + (3N/2) ln E. (6.44)

The temperature is given by

1/(k T) = ∂ln Ω/∂E = (3N/2) (1/E), (6.45)

so

E = (3/2) N k T. (6.46)
Since N = ν N_A, and N_A k = R, we can rewrite the above expression as

E = (3/2) ν R T, (6.47)

where R = 8.3143 joules/mole/deg. is the ideal gas constant. The above formula tells us exactly how the internal energy of a monatomic ideal gas depends on its temperature. The molar specific heat at constant volume of a monatomic ideal gas is clearly

c_V = (1/ν) (∂E/∂T)_V = (3/2) R. (6.48)

This has the numerical value

c_V = 12.47 joules/mole/deg. (6.49)

Furthermore, we have

c_p = c_V + R = (5/2) R, (6.50)

and

γ ≡ c_p/c_V = 5/3 ≃ 1.667. (6.51)

We can see from the previous table that these predictions are borne out pretty well for the monatomic gases Helium and Argon. Note that the specific heats of polyatomic gases are larger than those of monatomic gases. This is because polyatomic molecules can rotate around their centres of mass, as well as translate, so polyatomic gases can store energy in the rotational, as well as the translational, energy states of their constituent particles. We shall analyze this effect in greater detail later on in this course.

6.5 Isothermal and adiabatic expansion

Suppose that the temperature of an ideal gas is held constant by keeping the gas in thermal contact with a heat reservoir. If the gas is allowed to expand quasi-statically under these so called isothermal conditions then the ideal equation of state tells us that

p V = constant. (6.52)
This is usually called the isothermal gas law.

Suppose, now, that the gas is thermally isolated from its surroundings. If the gas is allowed to expand quasi-statically under these so called adiabatic conditions then it does work on its environment, and, hence, its internal energy is reduced, and its temperature changes. Let us work out the relationship between the pressure and volume of the gas during adiabatic expansion. According to the first law of thermodynamics,

dQ = ν c_V dT + p dV = 0 (6.53)

in an adiabatic process (in which no heat is absorbed). The ideal gas equation of state can be differentiated, yielding

p dV + V dp = ν R dT. (6.54)

The temperature increment dT can be eliminated between the above two expressions to give

0 = (c_V/R) (p dV + V dp) + p dV = (c_V/R + 1) p dV + (c_V/R) V dp, (6.55)

which reduces to

(c_V + R) p dV + c_V V dp = 0. (6.56)

Dividing through by c_V p V yields

γ (dV/V) + (dp/p) = 0, (6.57)

where

γ ≡ c_p/c_V = (c_V + R)/c_V. (6.58)

It turns out that c_V is a very slowly varying function of temperature in most gases. So, it is always a fairly good approximation to treat the ratio of specific heats γ as a constant, at least over a limited temperature range. If γ is constant then we can integrate Eq. (6.57) to give

γ ln V + ln p = constant, (6.59)
or

p V^γ = constant. (6.60)

This is the famous adiabatic gas law. It is very easy to obtain similar relationships between V and T, and p and T, during adiabatic expansion or contraction. Since p = ν R T/V, the above formula also implies that

T V^(γ−1) = constant, (6.61)

and

p^(1−γ) T^γ = constant. (6.62)

Equations (6.60)–(6.62) are all completely equivalent.

6.6 Hydrostatic equilibrium of the atmosphere

The gas which we are most familiar with in everyday life is, of course, the Earth's atmosphere. In fact, we can use the isothermal and adiabatic gas laws to explain most of the observable features of the atmosphere. Let us, first of all, consider the hydrostatic equilibrium of the atmosphere. Consider a thin vertical slice of the atmosphere, of cross-sectional area A, which starts at height z above ground level and extends to height z + dz. The upward force exerted on this slice from the gas below is p(z) A, where p(z) is the pressure at height z. Likewise, the downward force exerted by the gas above the slice is p(z + dz) A. The net upward force is clearly [p(z) − p(z + dz)] A. In equilibrium, this upward force must be balanced by the downward force due to the weight of the slice, which is ρ A dz g, where ρ is the density of the gas, and g is the acceleration due to gravity. It follows that the force balance condition can be written

[p(z) − p(z + dz)] A = ρ A dz g, (6.63)

which reduces to

dp/dz = −ρ g. (6.64)

This is called the equation of hydrostatic equilibrium for the atmosphere.
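Before applying these results to the atmosphere, note that the equivalence of the three adiabatic laws, Eqs. (6.60)–(6.62), is easy to check numerically. The sketch below (arbitrary initial state, monatomic γ = 5/3) expands a gas adiabatically via p V^γ = constant and verifies that the other two combinations are also conserved:

```python
# Verify that T V^(gamma-1) and p^(1-gamma) T^gamma are conserved along an
# adiabat defined by p V^gamma = constant, using T = p V / (nu R).
gamma = 5.0 / 3.0        # monatomic ideal gas
nu_R = 8.3143            # nu * R for one mole (joules/deg)

p1, V1 = 1.0e5, 0.02     # arbitrary initial pressure (N/m^2) and volume (m^3)
T1 = p1 * V1 / nu_R      # initial temperature from p V = nu R T

V2 = 2.0 * V1                     # adiabatic doubling of the volume
p2 = p1 * (V1 / V2) ** gamma      # Eq. (6.60): p V^gamma = constant
T2 = p2 * V2 / nu_R

ratio_TV = (T2 * V2 ** (gamma - 1.0)) / (T1 * V1 ** (gamma - 1.0))
ratio_pT = (p2 ** (1.0 - gamma) * T2 ** gamma) / (p1 ** (1.0 - gamma) * T1 ** gamma)
print(ratio_TV, ratio_pT)   # both equal 1.0 up to rounding error
```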
We can write the density of a gas in the following form,

ρ = ν μ/V, (6.65)

where μ is the molecular weight of the gas, and is equal to the mass of one mole of gas particles. For instance, the molecular weight of Nitrogen gas is 28 grams. The above formula for the density of a gas, combined with the ideal gas law p V = ν R T, yields

ρ = p μ/(R T). (6.66)

It follows that the equation of hydrostatic equilibrium can be rewritten

dp/p = −[μ g/(R T)] dz. (6.67)

6.7 The isothermal atmosphere

As a first approximation, let us assume that the temperature of the atmosphere is uniform. In such an isothermal atmosphere, we can directly integrate the previous equation to give

p = p0 exp(−z/z0). (6.68)

Here, p0 is the pressure at ground level (z = 0), which is generally about 1 bar, or 1 atmosphere (10^5 N m^−2 in SI units). The quantity

z0 = R T/(μ g) (6.69)

is called the isothermal scale-height of the atmosphere. At ground level, the temperature is on average about 15° centigrade, which is 288° kelvin on the absolute scale. The mean molecular weight of air at sea level is 29 (i.e., the molecular weight of a gas made up of 78% Nitrogen, 21% Oxygen, and 1% Argon). The acceleration due to gravity is 9.81 m s^−2 at ground level. Also, the ideal gas constant is 8.314 joules/mole/degree. Putting all of this information together, the isothermal scale-height of the atmosphere comes out to be about 8.4 kilometers.
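The scale-height, and the pressures quoted in the next few paragraphs, follow directly from Eqs. (6.68) and (6.69); a minimal sketch using the sea-level values above:

```python
import math

# Isothermal scale-height, Eq. (6.69), and the pressure at altitude,
# Eq. (6.68), using the sea-level values quoted in the text.
R = 8.314      # ideal gas constant (joules/mole/deg)
T = 288.0      # mean ground-level temperature (K)
mu = 29.0e-3   # mean molecular weight of air (kg/mole)
g = 9.81       # acceleration due to gravity (m/s^2)

z0 = R * T / (mu * g)   # scale-height in meters

def pressure(z):
    """Pressure (in atmospheres) at height z meters, isothermal atmosphere."""
    return math.exp(-z / z0)

print(z0 / 1000.0)               # ~8.4 km
print(pressure(8850.0))          # Mount Everest: ~0.35 atmospheres
print(pressure(32000 * 0.3048))  # airliner cruising altitude: ~0.3 atmospheres
```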
We have discovered that in an isothermal atmosphere the pressure decreases exponentially with increasing height. Since the temperature is assumed to be constant, and ρ ∝ p/T [see Eq. (6.66)], it follows that the density also decreases exponentially with the same scale-height as the pressure. According to Eq. (6.68), the pressure, or density, decreases by a factor 10 every ln 10 z0, or 19.3 kilometers, we move vertically upwards. Clearly, the effective height of the atmosphere is pretty small compared to the Earth's radius, which is about 6,400 kilometers. In other words, the atmosphere constitutes a very thin layer covering the surface of the Earth. Incidentally, this justifies our neglect of the decrease of g with increasing altitude.

One of the highest points in the United States of America is the peak of Mount Elbert in Colorado. This peak lies 14,432 feet, or about 4.4 kilometers, above sea level. At this altitude, our formula says that the air pressure should be about 0.6 atmospheres. Surprisingly enough, after a few days acclimatization, people can survive quite comfortably at this sort of pressure. In the highest inhabited regions of the Andes and Tibet the air pressure falls to about 0.5 atmospheres. Humans can just about survive at such pressures. However, people cannot survive for any extended period in air pressures below half an atmosphere. This sets an upper limit on the altitude of permanent human habitation, which is about 19,000 feet, or 5.8 kilometers, above sea level. Incidentally, this is also the maximum altitude at which a pilot can fly an unpressurized aircraft without requiring additional Oxygen.

The highest point in the world is, of course, the peak of Mount Everest in Nepal. This peak lies at an altitude of 29,028 feet, or 8.85 kilometers, above sea level, where we expect the air pressure to be a mere 0.35 atmospheres. This explains why Mount Everest was only conquered after lightweight portable oxygen cylinders were invented. Admittedly, some climbers have subsequently ascended Mount Everest without the aid of additional oxygen, but this is a very foolhardy venture, because above 19,000 feet the climbers are slowly dying.

Commercial airliners fly at a cruising altitude of 32,000 feet. At this altitude, we expect the air pressure to be only 0.3 atmospheres, which explains why airline cabins are pressurized. In fact, the cabins are only pressurized to 0.85 atmospheres (which accounts for the "popping" of passengers' ears during air travel). The reason for this partial pressurization is quite simple. At 32,000 feet, the pressure difference between the air in the cabin and that outside is about half an atmosphere. Clearly, the walls of the cabin must be strong enough to support this pressure difference, which means that they must be of a certain thickness, and, hence, that the aircraft must be of a certain weight. If the cabin were fully pressurized then the pressure difference at cruising altitude would increase by about 30%, which means that the cabin walls would have to be much thicker, and, hence, the aircraft would have to be substantially heavier. So, a fully pressurized aircraft would be more comfortable to fly in (because your ears would not "pop"), but it would also be far less economical to operate.

6.8 The adiabatic atmosphere

Of course, we know that the atmosphere is not isothermal. In fact, air temperature falls quite noticeably with increasing altitude. In ski resorts, you are told to expect the temperature to drop by about 1 degree per 100 meters you go upwards. Many people cannot understand why the atmosphere gets colder the higher up you go. They reason that as higher altitudes are closer to the Sun they ought to be hotter. In fact, the explanation is quite simple. It depends on three important properties of air. The first important property is that air is transparent to most, but by no means all, of the electromagnetic spectrum. In particular, most infrared radiation, which carries heat energy, passes straight through the lower atmosphere and heats the ground. In other words, the lower atmosphere is heated from below, not from above. The second important property of air is that it is constantly in motion. In fact, the lower 20 kilometers of the atmosphere (the so called troposphere) are fairly thoroughly mixed. You might think that this would imply that the atmosphere is isothermal. However, this is not the case because of the final important property of air: i.e., it is a very poor conductor of heat. This, of course, is why woolly sweaters work: they trap a layer of air close to the body, and because air is such a poor conductor of heat you stay warm.

Imagine a packet of air which is being swirled around in the atmosphere. We
would expect it to always remain at the same pressure as its surroundings, otherwise it would be mechanically unstable. It is also plausible that the packet moves around too quickly to effectively exchange heat with its surroundings, since air is a very poor heat conductor, and heat flow is consequently quite a slow process. So, to a first approximation, the air in the packet is adiabatic. In a steady-state atmosphere, we expect that as the packet moves upwards, expands due to the reduced pressure, and cools adiabatically, its temperature always remains the same as that of its immediate surroundings. This means that we can use the adiabatic gas law to characterize the cooling of the atmosphere with increasing altitude. In this particular case, the most useful manifestation of the adiabatic law is

p^(1−γ) T^γ = constant, (6.70)

giving

dp/p = [γ/(γ−1)] dT/T. (6.71)

Combining the above relation with the equation of hydrostatic equilibrium, (6.67), we obtain

[γ/(γ−1)] dT/T = −[μ g/(R T)] dz, (6.72)

or

dT/dz = −[(γ−1)/γ] μ g/R. (6.73)

Now, the ratio of specific heats for air (which is effectively a diatomic gas) is about 1.4 (see Tab. 2). Hence, we can calculate, from the above expression, that the temperature of the atmosphere decreases with increasing height at a constant rate of 9.8° centigrade per kilometer. This value is called the adiabatic lapse rate of the atmosphere. Our calculation accords well with the "1 degree colder per 100 meters higher" rule of thumb used in ski resorts. The basic reason why air is colder at higher altitudes is that it expands as its pressure decreases with height. It, therefore, does work on its environment, without absorbing any heat (because of its low thermal conductivity), so its internal energy, and, hence, its temperature decreases.

According to the adiabatic lapse rate calculated above, the air temperature at the cruising altitude of airliners (32,000 feet) should be about −80° centigrade
(assuming a sea level temperature of 15° centigrade). In fact, this is somewhat of an underestimate. A more realistic value is about −60° centigrade. The explanation for this discrepancy is the presence of water vapour in the atmosphere. As air rises, expands, and cools, water vapour condenses out, releasing latent heat which prevents the temperature from falling as rapidly with height as the adiabatic lapse rate would indicate. In fact, in the Tropics, where the humidity is very high, the lapse rate of the atmosphere (i.e., the rate of decrease of temperature with altitude) is significantly less than the adiabatic value. The adiabatic lapse rate is only observed when the humidity is low. This is the case in deserts, in the Arctic (where water vapour is frozen out of the atmosphere), and, of course, in ski resorts.

Suppose that the lapse rate of the atmosphere differs from the adiabatic value. Let us ignore the complication of water vapour, and assume that the atmosphere is dry. Consider a packet of air which moves slightly upwards from its equilibrium height. The temperature of the packet will decrease with altitude according to the adiabatic lapse rate, because its expansion is adiabatic. We assume that the packet always maintains pressure balance with its surroundings. It follows, since ρ T ∝ p according to the ideal gas law, that

(ρ T)_packet = (ρ T)_atmosphere. (6.74)

If the atmospheric lapse rate is less than the adiabatic value then T_atmosphere > T_packet, implying that ρ_packet > ρ_atmosphere. So, the packet will be denser than its immediate surroundings, and will, therefore, tend to fall back to its original height. Clearly, an atmosphere whose lapse rate is less than the adiabatic value is stable. On the other hand, if the atmospheric lapse rate exceeds the adiabatic value then, after rising a little way, the packet will be less dense than its immediate surroundings, and will, therefore, continue to rise due to buoyancy effects.
Clearly, an atmosphere whose lapse rate is greater than the adiabatic value is unstable. This effect is of great importance in Meteorology. The normal stable state of the atmosphere is for the lapse rate to be slightly less than the adiabatic value. Occasionally, however, the lapse rate exceeds the adiabatic value, and this is always associated with extremely disturbed weather patterns. Let us consider the temperature, pressure, and density proﬁles in an adiabatic
atmosphere. We can directly integrate Eq. (6.73) to give

T = T0 [1 − ((γ−1)/γ) z/z0], (6.75)

where T0 is the ground level temperature, and

z0 = R T0/(μ g) (6.76)

is the isothermal scale-height calculated using this temperature. The pressure profile is easily calculated from the adiabatic gas law p^(1−γ) T^γ = constant, or p ∝ T^(γ/(γ−1)). It follows that

p = p0 [1 − ((γ−1)/γ) z/z0]^(γ/(γ−1)). (6.77)
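The lapse rate, Eq. (6.73), and the linear temperature profile, Eq. (6.75), are easy to evaluate; the sketch below reproduces the 9.8° centigrade per kilometer figure quoted earlier, using the same air values as before:

```python
# Adiabatic lapse rate, Eq. (6.73): dT/dz = -[(gamma-1)/gamma] mu g / R,
# and the linear temperature profile, Eq. (6.75).
gamma = 1.4    # ratio of specific heats for air
mu = 29.0e-3   # mean molecular weight of air (kg/mole)
g = 9.81       # acceleration due to gravity (m/s^2)
R = 8.314      # ideal gas constant (joules/mole/deg)
T0 = 288.0     # ground-level temperature (K)

lapse = (gamma - 1.0) / gamma * mu * g / R   # K per meter
z0 = R * T0 / (mu * g)                       # scale-height (m), Eq. (6.76)
T_1km = T0 * (1.0 - (gamma - 1.0) / gamma * 1000.0 / z0)  # Eq. (6.75) at 1 km

print(lapse * 1000.0)   # ~9.8 degrees per kilometer
print(T0 - T_1km)       # same temperature drop over the first kilometer
```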
Consider the limit γ → 1. In this limit, Eq. (6.75) yields T independent of height (i.e., the atmosphere becomes isothermal). We can evaluate Eq. (6.77) in the limit γ → 1 using the mathematical identity

lim_(m→0) (1 + m x)^(1/m) ≡ exp(x). (6.78)

We obtain

p = p0 exp(−z/z0), (6.79)

which, not surprisingly, is the predicted pressure variation in an isothermal atmosphere. In reality, the ratio of specific heats of the atmosphere is not unity, it is about 1.4 (i.e., the ratio for diatomic gases), which implies that in the real atmosphere

p = p0 [1 − z/(3.5 z0)]^3.5. (6.80)

In fact, this formula gives very similar results to the exponential formula, Eq. (6.79), for heights below one scale-height (i.e., z < z0). For heights above one scale-height, the exponential formula tends to predict too high a pressure. So, in an adiabatic atmosphere, the pressure falls off more quickly with altitude than in an isothermal atmosphere, but this effect is only really noticeable at pressures significantly below one atmosphere. In fact, the isothermal formula is a pretty good
approximation below altitudes of about 10 kilometers. Since ρ ∝ p/T, the variation of density with height goes like

ρ = ρ0 [1 − ((γ−1)/γ) z/z0]^(1/(γ−1)), (6.81)

where ρ0 is the density at ground level. Thus, the density falls off more rapidly with altitude than the temperature, but less rapidly than the pressure.

Note that an adiabatic atmosphere has a sharp upper boundary. Above height z1 = [γ/(γ−1)] z0 the temperature, pressure, and density are all zero: i.e., there is no atmosphere. For real air, with γ = 1.4, z1 ≃ 3.5 z0 ≃ 29.4 kilometers. This behaviour is quite different to that of an isothermal atmosphere, which has a diffuse upper boundary. In reality, there is no sharp upper boundary to the atmosphere. The adiabatic gas law does not apply above about 20 kilometers (i.e., in the stratosphere) because at these altitudes the air is no longer strongly mixed. Thus, in the stratosphere the pressure falls off exponentially with increasing height.

In conclusion, we have demonstrated that the temperature of the lower atmosphere should fall off approximately linearly with increasing height above ground level, whilst the pressure should fall off far more rapidly than this, and the density should fall off at some intermediate rate. We have also shown that the lapse rate of the temperature should be about 10° centigrade per kilometer in dry air, but somewhat less than this in wet air. In fact, all of these predictions are, more or less, correct. It is amazing that such accurate predictions can be obtained from the two simple laws: p V = constant for an isothermal gas, and p V^γ = constant for an adiabatic gas.
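A short numerical sketch of the adiabatic profiles, Eqs. (6.75), (6.77), and (6.81), the sharp upper boundary, and the comparison with the isothermal exponential, Eq. (6.79), working in units of the scale-height z0:

```python
import math

# Adiabatic profiles in units of the scale-height (x = z/z0), for gamma = 1.4.
gamma = 1.4

def adiabatic(x):
    """Return (T/T0, p/p0, rho/rho0) at x = z/z0; Eqs. (6.75), (6.77), (6.81)."""
    f = max(0.0, 1.0 - (gamma - 1.0) / gamma * x)
    return f, f ** (gamma / (gamma - 1.0)), f ** (1.0 / (gamma - 1.0))

z1 = gamma / (gamma - 1.0)   # sharp upper boundary, in units of z0

t, p, rho = adiabatic(1.0)   # one scale-height up
print(z1)                    # 3.5, i.e. ~29.4 km for z0 ~ 8.4 km
print(t, rho, p)             # density falls faster than T, slower than p
print(p, math.exp(-1.0))     # adiabatic p already below the exponential value
```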
6.9 Heat engines

Thermodynamics was invented, almost by accident, in 1825 by a young French engineer called Sadi Carnot, who was investigating the theoretical limitations on the efficiency of steam engines. Although we are not particularly interested in steam engines, nowadays, it is still highly instructive to review some of Carnot's arguments. We know, by observation, that it is possible to do mechanical work w
(Here. by which it periodically returns to the same macrostate. so w = q. we want to spin the shaft of the electrical generator without increasing the generator’s entropy.9 Heat engines 6 CLASSICAL THERMODYNAMICS upon a device M. called a heat engine. of that device.. Let us examine the feasibility of a heat engine using the laws of thermodynamics. Suppose that a heat engine M performs a single cycle. its internal energy is unchanged.) An example of this is Joule’s classic experiment by which he veriﬁed the ﬁrst law of thermodynamics: a paddle wheel is spun in a liquid by a falling weight. Carnot’s question was this: is it possible to reverse this process and build a device.82) The above condition is certainly a necessary condition for a feasible heat engine. and the ﬁrst law of thermodynamics tell us that the work done by the engine w must equal the heat extracted from the reservoir q. has extracted heat from the reservoir and done an equivalent amount of useful work. the work should not be done at the expense of the heat engine itself. otherwise the conversion of heat into work could not continue indeﬁnitely. we use small letters w and q to denote intrinsically positive amounts of work and heat. First of all. and the work done by the weight on the wheel is converted into heat. and then to extract an equivalent amount of heat q. by lifting a weight) without doing it at the expense of affecting the other degrees of freedom. but.6. We can ensure that this is the case if the heat engine performs some sort of cycle. if we are extracting heat from the ocean to generate electricity. in the meantime. does every device which satis106 . Since M has returned to its initial macrostate. The second caveat is that the work done by the heat engine should be such as to change a single parameter of some external device (e. respectively. and absorbed by the liquid. causing the generator to heat up or fall to bits. Furthermore.. i. or the entropy.g. 
which extracts heat energy from a reservoir and converts it into useful macroscopic work? For instance. (6. which goes to increase the internal energy of some heat reservoir.e. is it possible to extract heat from the ocean and use it to run an electric generator? There are a few caveats to Carnot’s question. a cyclic process seems reasonable because we know that both steam engines and internal combustion engines perform continuous cycles. For instance. but is it also a sufﬁcient condition? In other words.
Once we have expressed the problem in these terms. which are jiggling about every which way. it is fairly obvious that what we are really asking for is a spontaneous transition from a probable to an improbable state. the entropy change per cycle of the heat reservoir. (6. The engine itself returns periodically to the same state. We want to construct a device which will extract energy from a heat reservoir. which is at absolute temperature T1 . we cannot run an electric generator off heat extracted from the ocean.83) For the case of a heat engine. and the integral ¯ is taken over a whole cycle of the heat engine. which can then be converted into a torque on the generator shaft. because it is like asking all of the molecules in the ocean. but is fantastically improbable. which we know is forbidden by the second law of thermodynamics. unfortunately. so ∆S ≥ 0. say. T1 T1 (6. the reservoir from which it extracts heat. say. We have already speciﬁed that there is no change in the entropy of the external device upon which the work is done. so its entropy is clearly unchanged after each cycle. and the outside device upon which it does work. So. This says that the total entropy of an isolated system can never spontaneously decrease. and convert it into energy distributed over a single degree of freedom associated with some parameter of an external device. We know from our investigation of statistical thermodynamics that such a process is possible.6. the isolated system consists of the engine. The integral can be converted into the expression −q/T1 because the amount of heat extracted by the engine is assumed to be too small to modify the temperature of the reservoir (this is 107 . where it is randomly distributed over very many degrees of freedom. is given by ∆Sreservoir = dQ ¯ q =− . so as to exert a force on some lever.84) where dQ is the inﬁnitesimal heat absorbed by the reservoir. in principle. 
The improbability of the scenario just outlined is summed up in the second law of thermodynamics. to all suddenly jig in the same direction.9 Heat engines 6 CLASSICAL THERMODYNAMICS ﬁes this condition actually work? Let us think a little more carefully about what we are actually expecting a heat engine to do. On the other hand.
q w = ≤ 0. which continuously executes a cycle without extracting heat from. Clearly.83). making use of the ﬁrst law of thermodynamics. so that T1 is a constant during the cycle. In fact. its surroundings. at least. Thus. there is no such thing as a completely reversible engine. The British patent ofﬁce.86) (6.6. such a device corresponds to the equality sign in Eq.85) Since we wish the work w done by the engine to be positive. the above relation clearly cannot be satisﬁed.83) corresponds to an asymptotic limit which reality can closely approach. (6. This is not surprising. being slightly less openminded that its American counterpart. (6.S. In fact. since we know that this is essentially what frictional forces do.9 Heat engines 6 CLASSICAL THERMODYNAMICS the deﬁnition of a heat reservoir). here. the U. but never quite attain. (6. and converts work directly into heat. the statement It is impossible to construct a perfect heat engine which converts heat directly into work 108 . another example of a natural process which is fundamentally irreversible according to the second law of thermodynamics. or doing work on. slightly irreversible.86). A perpetual motion device. we have. T1 T1 (6. refuses to entertain such applications on the basis that perpetual motion devices are forbidden by the second law of thermodynamics. is just about possible according to Eq. The second law of thermodynamics clearly reduces to −q ≥0 T1 or. In reality. Nevertheless. It follows that a perpetual motion device is thermodynamically impossible. there is no thermodynamic objection to a heat engine which runs backwards. have frictional losses which make them.86). even the most efﬁcient. (6. All engines. which proves that an engine which converts heat directly into work is thermodynamically impossible. According to Eq. patent ofﬁce receives about 100 patent applications a year regarding perpetual motion devices. which means that it must be completely reversible. the equality sign in Eq.
(6.89) T1 T2 or w 1 1 ≤ q1 − . Let the work done on the external device be w per cycle. But. In Carnot’s analysis. otherwise steam engines would not operate.e.87) Note that q2 < q1 if positive (i. useful) work is done on the external device. (6. w is only going to be positive) if T2 < T1 . so the second law of thermodynamics was violated.88) T1 T2 We can combine the previous two equations to give −q1 q1 − w + ≥ 0. Our only other option is to increase the entropy of some other body. How can we remedy this situation? We still want the heat engine itself to perform periodic cycles (so. by extracting an amount of heat q per cycle. its entropy cannot increase over a cycle).9 Heat engines 6 CLASSICAL THERMODYNAMICS is called Kelvin’s formulation of the second law. Suppose that the heat per cycle we extract from the ﬁrst reservoir is q1 . (6. by deﬁnition. The ﬁrst law of thermodynamics tells us that q1 = w + q2 . and the heat per cycle we reject into the second reservoir is q2 . We have demonstrated that a perfect heat engine.6. Well. the reason that our previous scheme did not work was that it decreased the entropy of a heat reservoir. at some temperature T1 . So. So. (6. the second reservoir has to be colder than the 109 . is impossible.. −q1 q2 ∆S = + ≥ 0. this other body is a second heat reservoir at temperature T2 . there must be some way of obtaining useful work from heat energy.e. The total entropy change per cycle is due to the heat extracted from the ﬁrst reservoir and the heat dumped into the second.. We can increase the entropy of the second reservoir by dumping some of the heat we extracted from the ﬁrst reservoir into it. without any compensating increase in the entropy of anything else.90) T2 T2 T1 It is clear that the engine is only going to perform useful work (i. and has to be positive (or zero) according to the second law of thermodynamics. which converts heat directly into work. 
and we also do not want to increase the entropy of the external device upon which the work is done.
Let us consider how we might construct one of these reversible heat engines. This is the ratio of the work done per cycle on the external device to the heat energy absorbed per cycle from the ﬁrst reservoir..92) irrespective of the way in which they operate. Suppose that we also have two heat reservoirs at temperatures T1 and T2 (where T1 > T2 ). We now place the gas in thermal contact with the second reservoir and slowly push the piston in. Suppose that we have some gas in a cylinder equipped with a frictionless piston. which are always irreversible to some extent. These reservoirs might take the form of large water baths. We now pull the piston out very slowly so that heat energy ﬂows reversibly into the gas from the reservoir. Furthermore. During this adiabatic process the temperature of the gas falls (since there is no longer any heat ﬂowing into it to compensate for the work it does on the piston). The efﬁciency of a perfect heat engine is unity. Let us start off with the gas in thermal contact with the ﬁrst reservoir. A real engine must always reject some energy into the second heat reservoir in order to satisfy the second law of thermodynamics. The gas is not necessarily a perfect gas. It is clear that real engines. η= T1 − T2 . and the efﬁciency of the engine is reduced. but we have already shown that such an engine is impossible.6. one which is quasistatic). What is the efﬁciency of a realizable engine? It is clear from the previous equation that η≡ T2 T1 − T2 w ≤1− = . The equality sign in the above expression corresponds to a completely reversible heat engine (i. q1 T1 T1 (6. so less energy is available to do external work. Let us continue this process until the temperature of the gas falls to T2 . Let us now thermally isolate the gas and slowly pull out the piston some more. are less efﬁcient than reversible engines.91) Note that the efﬁciency is always less than unity. 
all reversible engines which operate between the two temperatures T1 and T2 must have the same efﬁciency.e. It is useful to deﬁne the efﬁciency η of a heat engine. During this isothermal 110 . T1 (6.9 Heat engines 6 CLASSICAL THERMODYNAMICS ﬁrst if the heat dumped into the former is to increase the entropy of the Universe more than the heat extracted from the latter decreases it.
process heat flows out of the gas into the reservoir. We next thermally isolate the gas a second time and slowly compress it some more. In this process the temperature of the gas increases. We stop the compression when the temperature reaches T1. If we carry out each step properly we can return the gas to its initial state and then repeat the cycle ad infinitum. We now have a set of reversible processes by which a quantity of heat is extracted from the first reservoir and a quantity of heat is dumped into the second. We can best evaluate the work done by the system during each cycle by plotting out the locus of the gas in a p-V diagram. The locus takes the form of a closed curve—see Fig. 1. The net work done per cycle is the "area" contained inside this curve, since dW̄ = p dV [if p is plotted vertically and V horizontally, then p dV is clearly an element of area under the curve p(V)]. The engine we have just described is called a Carnot engine, and is the simplest conceivable device capable of converting heat energy into useful work.

Figure 1: An ideal gas Carnot engine.

For the specific case of an ideal gas, we can actually calculate the work done per cycle, and, thereby, verify Eq. (6.92). Consider the isothermal expansion phase of the gas, where the expansion takes the gas from state a to state b. For an ideal gas, the internal energy is a function of the temperature alone. The temperature does not change during isothermal expansion, so the internal energy remains constant, and the net heat absorbed by the gas must equal the work it does on the piston. Thus,

q1 = ∫a→b p dV.  (6.93)

Since p V = ν R T, we have

q1 = ∫a→b (ν R T1/V) dV = ν R T1 ln(Vb/Va).  (6.94)

Likewise, during the isothermal compression phase, in which the gas goes from state c to state d, the net heat rejected to the second reservoir is

q2 = ν R T2 ln(Vc/Vd).  (6.95)

Now, for an ideal gas, during adiabatic expansion or compression

T V^(γ−1) = constant.  (6.96)

It follows that during the adiabatic expansion phase, which takes the gas from state b to state c,

T1 Vb^(γ−1) = T2 Vc^(γ−1).  (6.97)

Likewise, during the adiabatic compression phase, which takes the gas from state d to state a,

T1 Va^(γ−1) = T2 Vd^(γ−1).  (6.98)

If we take the ratio of the previous two equations we obtain

Vb/Va = Vc/Vd.  (6.99)

Hence, the work done by the engine, which we can calculate using the first law of thermodynamics,

w = q1 − q2,  (6.100)

is

w = ν R (T1 − T2) ln(Vb/Va).  (6.101)
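The ideal gas cycle relations above are easy to check numerically. The following sketch, assuming a monatomic ideal gas (γ = 5/3), one mole of gas, and arbitrarily chosen reservoir temperatures and starting volumes, evaluates q1, q2, and w from Eqs. (6.94), (6.95), and (6.100), and confirms that the result agrees with Eq. (6.101):

```python
import math

R = 8.314          # gas constant (J / mol K)
nu = 1.0           # moles of gas
gamma = 5.0 / 3.0  # assumed: monatomic ideal gas

T1, T2 = 600.0, 300.0    # reservoir temperatures (K), chosen arbitrarily
Va, Vb = 1.0e-3, 3.0e-3  # volumes at states a and b (m^3), chosen arbitrarily

# Adiabatic relations (6.97), (6.98): T V^(gamma - 1) = constant
Vc = Vb * (T1 / T2) ** (1.0 / (gamma - 1.0))
Vd = Va * (T1 / T2) ** (1.0 / (gamma - 1.0))

# Heat absorbed and rejected during the isothermal phases, Eqs. (6.94), (6.95)
q1 = nu * R * T1 * math.log(Vb / Va)
q2 = nu * R * T2 * math.log(Vc / Vd)

# Work per cycle from the first law, Eq. (6.100)
w = q1 - q2

# Eq. (6.99) guarantees Vb/Va = Vc/Vd, so Eq. (6.101) should agree:
w_check = nu * R * (T1 - T2) * math.log(Vb / Va)
assert abs(w - w_check) < 1e-9

print("efficiency w/q1 =", w / q1)  # equals (T1 - T2)/T1 = 0.5 here
```

The ratio w/q1 then reproduces the reversible engine efficiency (T1 − T2)/T1 of Eq. (6.92).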
Thus, the efficiency of the engine is

η = w/q1 = (T1 − T2)/T1,  (6.102)

which, not surprisingly, is exactly the same as Eq. (6.92).

The engine described above is very idealized. Of course, real engines are far more complicated than this. Nevertheless, the maximum efficiency of an ideal heat engine places severe constraints on real engines. Conventional power stations have many different "front ends" (e.g., coal fired furnaces, oil fired furnaces, nuclear reactors), but their "back ends" are all very similar, and consist of a steam driven turbine connected to an electric generator. The "front end" heats water extracted from a local river and turns it into steam, which is then used to drive the turbine, and, hence, to generate electricity. Finally, the steam is sent through a heat exchanger so that it can heat up the incoming river water, which means that the incoming water does not have to be heated so much by the "front end." At this stage, some heat is rejected to the environment, usually as clouds of steam escaping from the top of cooling towers. We can see that a power station possesses many of the same features as our idealized heat engine. There is a cycle which operates between two temperatures. The upper temperature is the temperature to which the steam is heated by the "front end," and the lower temperature is the temperature of the environment into which heat is rejected. Suppose that the steam is only heated to 100°C (or 373 K), and the temperature of the environment is 15°C (or 288 K). It follows from Eq. (6.91) that the maximum possible efficiency of the steam cycle is

η = (373 − 288)/373 ≃ 0.23.  (6.103)

So, at least 77% of the heat energy generated by the "front end" goes straight up the cooling towers! Not surprisingly, commercial power stations do not operate with 100°C steam. The only way in which the thermodynamic efficiency of the steam cycle can be raised to an acceptable level is to use very hot steam (clearly, we cannot refrigerate the environment). Using 400°C steam, which is not uncommon, the maximum efficiency becomes

η = (673 − 288)/673 ≃ 0.57,  (6.104)
which is more reasonable. In fact, the steam cycles of modern power stations are so well designed that they come surprisingly close to their maximum thermodynamic efficiencies.

6.10 Refrigerators

Let us now move on to consider refrigerators. An idealized refrigerator is an engine which extracts heat from a cold heat reservoir (temperature T2, say) and rejects it to a somewhat hotter heat reservoir, which is usually the environment (temperature T1, say). In other words, a refrigerator is just a heat engine run in reverse. Hence, we can immediately carry over most of our heat engine analysis. Let q2 be the heat absorbed per cycle from the colder reservoir, q1 the heat rejected per cycle into the hotter reservoir, and w the external work done per cycle on the engine. To make this machine work we always have to do some external work on the engine. For instance, the refrigerator in your home contains a small electric pump which does work on the freon (or whatever) in the cooling circuit. The first law of thermodynamics tells us that

w + q2 = q1.  (6.105)

The second law says that

q1/T1 − q2/T2 ≥ 0.  (6.106)

We can combine these two laws to give

w/T1 ≥ q2 (1/T2 − 1/T1).  (6.107)

The most sensible way of defining the efficiency of a refrigerator is as the ratio of the heat extracted per cycle from the cold reservoir to the work done per cycle on the engine. With this definition

η = T2/(T1 − T2).  (6.108)

We can see that this efficiency is, in general, greater than unity. In other words, for one joule of work done on the engine, or pump, more than one joule of energy is extracted from whatever it is we are cooling. Clearly, refrigerators are intrinsically very efficient devices. Domestic refrigerators cool stuff down to about 4°C (277 K) and reject heat to the environment at about 15°C (288 K). The maximum theoretical efficiency of such devices is

η = 277/(288 − 277) = 25.2.  (6.109)

So, for every joule of electricity we put into a refrigerator we can extract up to 25 joules of heat from its contents.
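The numerical estimates in Eqs. (6.103), (6.104), and (6.109) amount to one-line evaluations of the reversible limits (6.91) and (6.108). A minimal sketch:

```python
def engine_efficiency(T_hot, T_cold):
    """Maximum efficiency of a heat engine, Eq. (6.91): (T1 - T2)/T1."""
    return (T_hot - T_cold) / T_hot

def refrigerator_efficiency(T_hot, T_cold):
    """Maximum 'efficiency' of a refrigerator, Eq. (6.108): T2/(T1 - T2)."""
    return T_cold / (T_hot - T_cold)

# Steam cycle rejecting heat to a 288 K (15 C) environment:
print(engine_efficiency(373.0, 288.0))  # 100 C steam -> about 0.23
print(engine_efficiency(673.0, 288.0))  # 400 C steam -> about 0.57

# Domestic refrigerator: 277 K (4 C) interior, 288 K environment:
print(refrigerator_efficiency(288.0, 277.0))  # about 25.2
```

Note that the refrigerator figure diverges as T2 approaches T1: the smaller the temperature difference to be maintained, the cheaper the pumping of heat becomes.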
7 Applications of statistical thermodynamics

7.1 Introduction

In our study of classical thermodynamics, we concentrated on the application of statistical physics to macroscopic systems. Somewhat paradoxically, statistical arguments did not figure very prominently in this investigation. In fact, the only statistical statement we made was that it was extremely unlikely that a macroscopic system could violate the second law of thermodynamics. The resolution of this paradox is, of course, that macroscopic systems contain a very large number of particles, and their statistical fluctuations are, therefore, negligible. Let us now apply statistical physics to microscopic systems, such as atoms and molecules. In this study, the underlying statistical nature of thermodynamics will become far more apparent.

7.2 Boltzmann distributions

We have gained some understanding of the macroscopic properties of the air around us. For instance, we know something about its internal energy and specific heat capacity. How can we obtain some information about the statistical properties of the molecules which make up air? Consider a specific molecule: it constantly collides with its immediate neighbour molecules, and occasionally bounces off the walls of the room. These interactions "inform" it about the macroscopic state of the air, such as its temperature, pressure, and volume. The statistical distribution of the molecule over its own particular microstates must be consistent with this macrostate. In other words, if we have a large group of such molecules with similar statistical distributions, then they must be equivalent to air with the appropriate macroscopic properties. So, it ought to be possible to calculate the probability distribution of the molecule over its microstates from a knowledge of these macroscopic properties.

We can think of the interaction of a molecule with the air in a classroom as analogous to the interaction of a small system A in thermal contact with a heat
reservoir A′. The air acts like a heat reservoir because its energy fluctuations due to any interactions with the molecule are far too small to affect any of its macroscopic parameters. Let us determine the probability Pr of finding system A in one particular microstate r of energy Er when it is in thermal equilibrium with the heat reservoir A′.

As usual, we assume fairly weak interaction between A and A′, so that the energies of these two systems are additive. The energy of A is not known at this stage. In fact, only the total energy of the combined system A(0) = A + A′ is known. Suppose that the total energy lies in the range E(0) to E(0) + δE. The overall energy is constant in time, since A(0) is assumed to be an isolated system, so

Er + E′ = E(0),  (7.1)

where E′ denotes the energy of the reservoir A′. Let Ω′(E′) be the number of microstates accessible to the reservoir when its energy lies in the range E′ to E′ + δE. Clearly, if system A has an energy Er then the reservoir A′ must have an energy close to E′ = E(0) − Er. Hence, since A is in one definite state (i.e., state r), the total number of states accessible to A′ is Ω′(E(0) − Er), and it follows that the total number of states accessible to the combined system is simply Ω′(E(0) − Er). The principle of equal a priori probabilities tells us that the probability of occurrence of a particular situation is proportional to the number of accessible microstates. Thus,

Pr = C′ Ω′(E(0) − Er),  (7.2)

where C′ is a constant of proportionality which is independent of r. This constant can be determined by the normalization condition

Σ_r Pr = 1,  (7.3)

where the sum is over all possible states of system A, irrespective of their energy.

Let us now make use of the fact that system A is far smaller than system A′. It follows that Er ≪ E(0), so the slowly varying logarithm of Pr can be Taylor expanded about E′ = E(0). Thus,

ln Pr = ln C′ + ln Ω′(E(0)) − [∂ ln Ω′/∂E′]₀ Er + ⋯.  (7.4)
Note that we must expand ln Pr, rather than Pr itself, because the latter function varies so rapidly with energy that the radius of convergence of its Taylor series is far too small for the series to be of any practical use.

The higher order terms in Eq. (7.4) can be safely neglected, because Er ≪ E(0). Now the derivative

[∂ ln Ω′/∂E′]₀ ≡ β  (7.5)

is evaluated at the fixed energy E′ = E(0), and is, thus, a constant independent of the energy Er of A. In fact, we know, from Sect. 5, that this derivative is just the temperature parameter β = (k T)⁻¹ characterizing the heat reservoir A′. Hence, Eq. (7.4) becomes

ln Pr = ln C′ + ln Ω′(E(0)) − β Er,  (7.6)

giving

Pr = C exp(−β Er),  (7.7)

where C is a constant independent of r. The parameter C is determined by the normalization condition, which gives

C⁻¹ = Σ_r exp(−β Er),  (7.8)

so that the distribution becomes

Pr = exp(−β Er) / Σ_r exp(−β Er).  (7.9)

This is known as the Boltzmann probability distribution, and is undoubtedly the most famous result in statistical physics.

The Boltzmann distribution often causes confusion. People who are used to the principle of equal a priori probabilities, which says that all microstates are equally probable, are understandably surprised when they come across the Boltzmann distribution, which says that high energy microstates are markedly less probable than low energy states. However, there is no need for any confusion. The principle of equal a priori probabilities applies to the whole system, whereas the Boltzmann distribution only applies to a small part of the system. The two results are perfectly consistent. If the small system is in a microstate with a comparatively high energy Er then the rest of the system (i.e., the reservoir) has a slightly lower energy E′ than usual (since the overall energy is fixed). The number of accessible microstates of the reservoir is a very strongly increasing function of its energy. It follows that when the small system has a high energy then significantly less states than usual are accessible to the reservoir, and so the number of microstates accessible to the overall system is reduced, and, hence, the configuration is comparatively unlikely. The strong increase in the number of accessible microstates of the reservoir with increasing E′ gives rise to the strong (i.e., exponential) decrease in the likelihood of a state r of the small system with increasing Er. The exponential factor exp(−β Er) is called the Boltzmann factor.

The Boltzmann distribution gives the probability of finding the small system A in one particular state r of energy Er. The probability P(E) that A has an energy in the small range between E and E + δE is just the sum of all the probabilities of the states which lie in this range. However, since each of these states has approximately the same Boltzmann factor, this sum can be written

P(E) = C Ω(E) exp(−β E),  (7.10)

where Ω(E) is the number of microstates of A whose energies lie in the appropriate range. Suppose that system A is itself a large system, but still very much smaller than system A′. For a large system, we expect Ω(E) to be a very rapidly increasing function of energy, so the probability P(E) is the product of a rapidly increasing function of E and another rapidly decreasing function (i.e., the Boltzmann factor). This gives a sharp maximum of P(E) at some particular value of the energy. The larger the system A, the sharper this maximum becomes. Eventually, the maximum becomes so sharp that the energy of system A is almost bound to lie at the most probable energy. As usual, the most probable energy is evaluated by looking for the maximum of ln P, so

∂ ln P/∂E = ∂ ln Ω/∂E − β = 0,  (7.11)

giving

∂ ln Ω/∂E = β.  (7.12)

Of course, this corresponds to the situation in which the temperature of A is the same as that of the reservoir. This is a result which we have seen before (see Sect. 5).
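Equation (7.9) is easy to explore numerically. The sketch below, assuming an arbitrary set of microstate energies in units where k T = 1 (so β = 1), normalizes the Boltzmann factors and confirms that higher energy microstates are less probable:

```python
import math

def boltzmann_probabilities(energies, beta):
    """Eq. (7.9): P_r = exp(-beta E_r) / sum_r exp(-beta E_r)."""
    factors = [math.exp(-beta * E) for E in energies]  # Boltzmann factors
    Z = sum(factors)                                   # normalization, Eq. (7.8)
    return [f / Z for f in factors]

# Arbitrary microstate energies (in units where k T = 1, so beta = 1):
energies = [0.0, 1.0, 2.0, 5.0]
probs = boltzmann_probabilities(energies, beta=1.0)

assert abs(sum(probs) - 1.0) < 1e-12          # properly normalized
assert probs == sorted(probs, reverse=True)   # higher E_r -> smaller P_r
```

Making the reservoir hotter (smaller β) flattens the distribution, whereas β → ∞ concentrates essentially all the probability in the lowest energy microstate.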
Note, however, that the Boltzmann distribution is applicable no matter how small system A is, so it is a far more general result than any we have previously obtained.

7.3 Paramagnetism

The simplest microscopic system which we can analyze using the Boltzmann distribution is one which has only two possible states (there would clearly be little point in analyzing a system with only one possible state). Most elements, and some compounds, are paramagnetic: i.e., their constituent atoms, or molecules, possess a permanent magnetic moment due to the presence of one or more unpaired electrons. Consider a substance whose constituent particles contain only one unpaired electron. Such particles have spin 1/2, and consequently possess an intrinsic magnetic moment µ. According to quantum mechanics, the magnetic moment of a spin 1/2 particle can point either parallel or antiparallel to an external magnetic field B. Let us determine the mean magnetic moment µ̄B (in the direction of B) of the constituent particles of the substance when its absolute temperature is T. We assume, for the sake of simplicity, that each atom (or molecule) only interacts weakly with its neighbouring atoms. This enables us to focus attention on a single atom, and treat the remaining atoms as a heat bath at temperature T.

Our atom can be in one of two possible states: the (+) state in which its spin points up (i.e., parallel to B), and the (−) state in which its spin points down (i.e., antiparallel to B). In the (+) state, the atomic magnetic moment is parallel to the magnetic field, so that µB = µ. The magnetic energy of the atom is ε+ = −µ B. In the (−) state, the atomic magnetic moment is antiparallel to the magnetic field, so that µB = −µ. The magnetic energy of the atom is ε− = µ B.

According to the Boltzmann distribution, the probability of finding the atom in the (+) state is

P+ = C exp(−β ε+) = C exp(β µ B),  (7.13)

where C is a constant, and β = (k T)⁻¹. Likewise, the probability of finding the atom in the (−) state is

P− = C exp(−β ε−) = C exp(−β µ B).  (7.14)

Clearly, the most probable state is the state with the lowest energy [i.e., the (+) state]. Thus, the mean magnetic moment points in the direction of the magnetic field (i.e., the atom is more likely to point parallel to the field than antiparallel).

It is clear that the critical parameter in a paramagnetic system is

y ≡ β µ B = µ B/(k T).  (7.15)

This parameter measures the ratio of the typical magnetic energy of the atom to its typical thermal energy. If the thermal energy greatly exceeds the magnetic energy then y ≪ 1, and the probability that the atomic moment points parallel to the magnetic field is about the same as the probability that it points antiparallel. In this situation, we expect the mean atomic moment to be small, so that µ̄B ≃ 0. On the other hand, if the magnetic energy greatly exceeds the thermal energy then y ≫ 1, and the atomic moment is far more likely to point parallel to the magnetic field than antiparallel. In this situation, we expect µ̄B ≃ µ.

Let us calculate the mean atomic moment µ̄B. The usual definition of a mean value gives

µ̄B = [P+ µ + P− (−µ)]/(P+ + P−) = µ [exp(β µ B) − exp(−β µ B)]/[exp(β µ B) + exp(−β µ B)].  (7.16)

This can also be written

µ̄B = µ tanh(µ B/k T),  (7.17)

where the hyperbolic tangent is defined

tanh y ≡ [exp(y) − exp(−y)]/[exp(y) + exp(−y)].  (7.18)

For small arguments, y ≪ 1,

tanh y ≃ y − y³/3 + ⋯,  (7.19)

whereas for large arguments, y ≫ 1,

tanh y ≃ 1.  (7.20)

It follows that at comparatively high temperatures, k T ≫ µ B,

µ̄B ≃ µ² B/(k T),  (7.21)

whereas at comparatively low temperatures, k T ≪ µ B,

µ̄B ≃ µ.  (7.22)

Suppose that the substance contains N0 atoms (or molecules) per unit volume. The magnetization is defined as the mean magnetic moment per unit volume, and is given by

M̄0 = N0 µ̄B.  (7.23)

At high temperatures, k T ≫ µ B, the mean magnetic moment, and, hence, the magnetization, is proportional to the applied magnetic field, so we can write

M̄0 ≃ χ B,  (7.24)

where χ is a constant of proportionality known as the magnetic susceptibility. It is clear that the magnetic susceptibility of a spin 1/2 paramagnetic substance takes the form

χ = N0 µ²/(k T).  (7.25)

The fact that χ ∝ T⁻¹ is known as Curie's law, because it was discovered experimentally by Pierre Curie at the end of the nineteenth century. At low temperatures, k T ≪ µ B,

M̄0 → N0 µ,  (7.26)

so the magnetization becomes independent of the applied field. This corresponds to the maximum possible magnetization, where all atomic moments are lined up parallel to the field. The breakdown of the M̄0 ∝ B law at low temperatures (or high magnetic fields) is known as saturation.

The above analysis is only valid for paramagnetic substances made up of spin one-half (J = 1/2) atoms or molecules. However, the analysis can easily be generalized to take account of substances whose constituent particles possess higher spin (i.e., J > 1/2).
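The limiting forms (7.21) and (7.22) follow directly from Eq. (7.17), and are easy to verify numerically. A minimal sketch, in units chosen so that µ = k = 1, with an arbitrary field strength B:

```python
import math

def mean_moment(mu, B, T, k=1.0):
    """Mean atomic moment of a spin 1/2 paramagnet, Eq. (7.17)."""
    return mu * math.tanh(mu * B / (k * T))

mu, B = 1.0, 1.0  # units chosen so mu = k = 1; B is arbitrary

# High temperature limit, Eq. (7.21): mean moment ~ mu^2 B / (k T)
T_hot = 100.0
assert abs(mean_moment(mu, B, T_hot) - mu**2 * B / T_hot) < 1e-4

# Low temperature limit, Eq. (7.22): mean moment saturates at mu
T_cold = 0.01
assert abs(mean_moment(mu, B, T_cold) - mu) < 1e-12
```

Multiplying by N0 gives the magnetization of Eq. (7.23); in the high temperature regime the ratio M̄0/B reproduces the Curie susceptibility N0 µ²/(k T) of Eq. (7.25).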
Figure 2: The magnetization (vertical axis) versus B/T (horizontal axis) curves for (I) chromium potassium alum (J = 3/2), (II) iron ammonium alum (J = 5/2), and (III) gadolinium sulphate (J = 7/2). The solid lines are the theoretical predictions whereas the data points are experimental measurements. From W.E. Henry, Phys. Rev. 88, 561 (1952).
Figure 2 compares the experimental and theoretical magnetization versus field-strength curves for three different substances made up of spin 3/2, spin 5/2, and spin 7/2 particles, showing the excellent agreement between the two sets of curves. Note that, in all cases, the magnetization is proportional to the magnetic field-strength at small field-strengths, but saturates at some constant value as the field-strength increases.

The previous analysis completely neglects any interaction between the spins of neighbouring atoms or molecules. It turns out that this is a fairly good approximation for paramagnetic substances. However, for ferromagnetic substances, in which the spins of neighbouring atoms interact very strongly, this approximation breaks down completely. Thus, the above analysis does not apply to ferromagnetic substances.

7.4 Mean values

Consider a system in contact with a heat reservoir. The systems in the representative ensemble are distributed over their accessible states in accordance with the Boltzmann distribution. Thus, the probability of occurrence of some state r with energy Er is given by

Pr = exp(−β Er) / Σ_r exp(−β Er).  (7.27)

The mean energy is written

Ē = Σ_r exp(−β Er) Er / Σ_r exp(−β Er),  (7.28)

where the sum is taken over all states of the system, irrespective of their energy. Note that

Σ_r exp(−β Er) Er = −Σ_r ∂[exp(−β Er)]/∂β = −∂Z/∂β,  (7.29)

where

Z = Σ_r exp(−β Er).  (7.30)
Thus,

Ē = −(1/Z) ∂Z/∂β = −∂ ln Z/∂β.  (7.31)

The quantity Z, which is defined as the sum of the Boltzmann factor over all states, irrespective of their energy, is called the partition function. We have just demonstrated that it is fairly easy to work out the mean energy of a system using its partition function. In fact, as we shall discover, it is easy to calculate virtually any piece of statistical information using the partition function.

Let us evaluate the variance of the energy. We know that

(ΔE)² = E² − Ē²  (7.32)

(see Sect. 2). Now, according to the Boltzmann distribution,

E² = Σ_r exp(−β Er) Er² / Σ_r exp(−β Er).  (7.33)

However,

Σ_r exp(−β Er) Er² = −∂/∂β [Σ_r exp(−β Er) Er] = ∂²Z/∂β².  (7.34)

Hence,

E² = (1/Z) ∂²Z/∂β².  (7.35)

We can also write

E² = ∂/∂β [(1/Z) ∂Z/∂β] + (1/Z²)(∂Z/∂β)² = −∂Ē/∂β + Ē²,  (7.36)

where use has been made of Eq. (7.31). It follows from Eq. (7.32) that

(ΔE)² = −∂Ē/∂β = ∂² ln Z/∂β².  (7.37)

Thus, the variance of the energy can be worked out from the partition function almost as easily as the mean energy. Since, by definition, a variance can never be negative, it follows that ∂Ē/∂β ≤ 0, or, equivalently, ∂Ē/∂T ≥ 0. Hence, the mean energy of a system governed by the Boltzmann distribution always increases with temperature.

Suppose that the system is characterized by a single external parameter x (such as its volume). The generalization to the case where there are several external parameters is obvious. Consider a quasistatic change of the external parameter from x to x + dx. In this process, the energy of the system in state r changes by

δEr = (∂Er/∂x) dx.  (7.38)

The macroscopic work dW̄ done by the system due to this parameter change is

dW̄ = Σ_r exp(−β Er)(−∂Er/∂x dx) / Σ_r exp(−β Er).  (7.39)

In other words, the work done is minus the average change in internal energy of the system, where the average is calculated using the Boltzmann distribution. We can write

Σ_r exp(−β Er)(−∂Er/∂x) = (1/β) ∂/∂x [Σ_r exp(−β Er)] = (1/β) ∂Z/∂x,  (7.40)

which gives

dW̄ = (1/(β Z)) (∂Z/∂x) dx = (1/β) (∂ ln Z/∂x) dx.  (7.41)

We also have the following general expression for the work done by the system,

dW̄ = X̄ dx,  (7.42)

where

X̄ = −(∂Er/∂x)  (7.43)

(with the bar denoting an average over the Boltzmann distribution) is the mean generalized force conjugate to x (see Sect. 4). It follows that

X̄ = (1/β) ∂ ln Z/∂x.  (7.44)

Suppose that the external parameter is the volume, so x = V. It follows that

dW̄ = p̄ dV = (1/β) (∂ ln Z/∂V) dV.  (7.45)
Hence,

p̄ = (1/β) ∂ ln Z/∂V.  (7.46)

Since the partition function is a function of β and V (the energies Er depend on V), it is clear that the above equation relates the mean pressure p̄ to T (via β = 1/k T) and V. In other words, the above expression is the equation of state. Hence, we can work out the pressure, and even the equation of state, using the partition function.

7.5 Partition functions

It is clear that all important macroscopic quantities associated with a system can be expressed in terms of its partition function Z. Let us investigate how the partition function is related to thermodynamical quantities. Recall that Z is a function of both β and x (where x is the single external parameter), i.e., Z = Z(β, x).

Consider a quasistatic change by which x and β change so slowly that the system stays close to equilibrium, and, thus, remains distributed according to the Boltzmann distribution. We can write

d ln Z = (∂ ln Z/∂x) dx + (∂ ln Z/∂β) dβ.  (7.47)

It follows from Eqs. (7.31) and (7.41) that

d ln Z = β dW̄ − Ē dβ.  (7.48)

The last term can be rewritten, giving

d ln Z = β dW̄ − d(Ē β) + β dĒ,  (7.49)

or

d(ln Z + β Ē) = β (dW̄ + dĒ) ≡ β dQ̄.  (7.50)

The above equation shows that although the heat absorbed by the system dQ̄ is not an exact differential, it becomes one when multiplied by the temperature parameter β. This is essentially the second law of thermodynamics. In fact, since, by definition,

dS = dQ̄/T,  (7.51)

it follows that

S ≡ k (ln Z + β Ē).  (7.52)

This expression enables us to calculate the entropy of a system from its partition function.

Suppose that we are dealing with a system A(0) consisting of two systems A and A′ which only interact weakly with one another. Let each state of A be denoted by an index r and have a corresponding energy Er. Likewise, let each state of A′ be denoted by an index s and have a corresponding energy E′s. A state of the combined system A(0) is then denoted by two indices r and s. Since A and A′ only interact weakly, their energies are additive, and the energy of state rs is

E(0)rs = Er + E′s.  (7.53)

By definition, the partition function of A(0) takes the form

Z(0) = Σ_{r,s} exp[−β E(0)rs] = Σ_{r,s} exp(−β [Er + E′s]) = Σ_r exp(−β Er) Σ_s exp(−β E′s).  (7.54)

Hence,

Z(0) = Z Z′,  (7.55)

giving

ln Z(0) = ln Z + ln Z′,  (7.56)

where Z and Z′ are the partition functions of A and A′, respectively. It follows from Eq. (7.31) that the mean energies of A(0), A, and A′ are related by

Ē(0) = Ē + Ē′,  (7.57)

and it also follows from Eq. (7.52) that the respective entropies of these systems are related via

S(0) = S + S′.  (7.58)

Hence, the partition function tells us that the extensive thermodynamic functions of two weakly interacting systems are simply additive.
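These results are simple to check for a toy system. The sketch below, assuming a two-level system with arbitrarily chosen energies, obtains the mean energy from a numerical β-derivative of ln Z (Eq. (7.31)) and confirms the product rule Z(0) = Z Z′ of Eq. (7.55) for two such systems combined:

```python
import math
from itertools import product

def Z(energies, beta):
    """Partition function, Eq. (7.30): sum of Boltzmann factors over all states."""
    return sum(math.exp(-beta * E) for E in energies)

def mean_energy(energies, beta, h=1e-6):
    """Eq. (7.31): E = -d(ln Z)/d(beta), via a central finite difference."""
    return -(math.log(Z(energies, beta + h))
             - math.log(Z(energies, beta - h))) / (2 * h)

levels_A = [0.0, 1.0]  # arbitrary two-level system
levels_B = [0.0, 2.5]  # a second, weakly interacting system
beta = 0.7             # arbitrary temperature parameter

# Check Eq. (7.31) against the direct Boltzmann average, Eq. (7.28):
direct = sum(E * math.exp(-beta * E) for E in levels_A) / Z(levels_A, beta)
assert abs(mean_energy(levels_A, beta) - direct) < 1e-6

# Check Eq. (7.55): combined state energies are sums E_r + E'_s, so Z(0) = Z Z'.
combined = [Ea + Eb for Ea, Eb in product(levels_A, levels_B)]
assert abs(Z(combined, beta) - Z(levels_A, beta) * Z(levels_B, beta)) < 1e-12
```

The variance of Eq. (7.37) could be obtained in the same way from a second numerical derivative of ln Z.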
7.6 Ideal monatomic gases

It also follows from Eq. (7.52) that the respective entropies of these systems are related via

S⁽⁰⁾ = S + S′. (7.58)

Hence, the partition function tells us that the extensive thermodynamic functions of two weakly interacting systems are simply additive.

It is clear that we can perform statistical thermodynamical calculations using the partition function Z instead of the more direct approach in which we use the density of states Ω. The former approach is advantageous because the partition function is an unrestricted sum of Boltzmann factors over all accessible states, irrespective of their energy, whereas the density of states is a restricted sum over all states whose energies lie in some narrow range. In general, it is far easier to perform an unrestricted sum than a restricted sum. Thus, although Ω has a far more direct physical significance than Z, it is generally easier to derive statistical thermodynamical results using Z rather than Ω.

Let us now practice calculating thermodynamic relations using the partition function by considering an example with which we are already quite familiar: i.e., an ideal monatomic gas. Consider a gas consisting of N identical monatomic molecules of mass m enclosed in a container of volume V. Let us denote the position and momentum vectors of the ith molecule by r_i and p_i, respectively. Since the gas is ideal, there are no interatomic forces, and the total energy is simply the sum of the individual kinetic energies of the molecules:

E = Σ_{i=1}^{N} p_i²/(2m), (7.59)

where p_i² = p_i · p_i.

Let us treat the problem classically. In this approach, we divide up phase-space into cells of equal volume h₀^f. Here, f is the number of degrees of freedom, and h₀ is a small constant with dimensions of angular momentum which parameterizes the precision to which the positions and momenta of molecules are determined (see Sect. 3.2). Each cell in phase-space corresponds to a different state. The partition function is the sum of the Boltzmann factor exp(−β E_r) over all possible states, where E_r is the energy of state r. Classically, we can approximate the summation over cells in phase-space as an integration over all phase-space. Thus,

Z = ∫ ··· ∫ exp(−β E) d³r₁ ··· d³r_N d³p₁ ··· d³p_N / h₀^{3N}, (7.60)

where 3N is the number of degrees of freedom of a monatomic gas containing N molecules. Making use of Eq. (7.59), the above expression reduces to

Z = (V^N/h₀^{3N}) ∫ ··· ∫ exp[−(β/2m) p₁²] d³p₁ ··· exp[−(β/2m) p_N²] d³p_N. (7.61)

Note that the integral over the coordinates of a given molecule simply yields the volume of the container, V, since the energy E is independent of the locations of the molecules in an ideal gas. There are N such integrals, so we obtain the factor V^N in the above expression. Note, also, that the integrals over the molecular momenta in Eq. (7.61) are identical: they differ only by irrelevant dummy variables of integration. It follows that the partition function Z of the gas is made up of the product of N identical factors: i.e.,

Z = ζ^N, (7.62)

where

ζ = (V/h₀³) ∫ exp[−(β/2m) p²] d³p (7.63)

is the partition function for a single molecule. Of course, this result is obvious, since we have already shown that the partition function for a system made up of a number of weakly interacting subsystems is just the product of the partition functions of the subsystems (see Sect. 7.5).

The integral in Eq. (7.63) is easily evaluated:

∫ exp[−(β/2m) p²] d³p = ∫ exp[−(β/2m) p_x²] dp_x ∫ exp[−(β/2m) p_y²] dp_y ∫ exp[−(β/2m) p_z²] dp_z = (2πm/β)^{3/2}, (7.64)

where use has been made of the standard Gaussian integral, ∫ exp(−a x²) dx = √(π/a). Thus,

ζ = V (2πm/h₀²β)^{3/2}, (7.65)

and

ln Z = N ln ζ = N [ln V − (3/2) ln β + (3/2) ln(2πm/h₀²)]. (7.66)

The expression (7.46) for the mean pressure yields

p̄ = (1/β) ∂ln Z/∂V = (1/β) N/V, (7.67)

which reduces to the ideal gas equation of state

p̄ V = N k T = ν R T, (7.68)

where use has been made of N = ν N_A and R = N_A k. According to Eq. (7.31), the mean energy of the gas is given by

Ē = −∂ln Z/∂β = (3/2) N/β = (3/2) ν R T. (7.69)

Note that the internal energy is a function of temperature alone, with no dependence on volume. The molar heat capacity at constant volume of the gas is given by

c_V = (1/ν) (∂Ē/∂T)_V = (3/2) R, (7.70)

so the mean energy can be written

Ē = ν c_V T. (7.71)

We have seen all of the above results before. Let us now use the partition function to calculate a new result. The entropy of the gas can be calculated quite simply from the expression

S = k (ln Z + β Ē). (7.72)
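As a quick numerical cross-check of Eqs. (7.66) and (7.69), the following sketch differentiates ln Z with a central difference and recovers Ē = (3/2) N k T. The constants (a helium-like mass, the value of h₀, V, and N) are illustrative choices, not taken from the text; note that h₀ drops out of the energy entirely.

```python
import math

# Numerical check that E = -d(ln Z)/d(beta) = (3/2) N k T for the
# classical ideal monatomic gas of Eq. (7.66).
k = 1.380649e-23   # J/K
m = 6.64e-27       # kg (helium atom, for concreteness)
h0 = 6.63e-34      # J s (arbitrary small constant; it cancels out of E)
V, N = 1e-3, 1e22

def ln_Z(beta):
    # ln Z = N [ln V - (3/2) ln beta + (3/2) ln(2 pi m / h0^2)], Eq. (7.66)
    return N * (math.log(V) - 1.5*math.log(beta)
                + 1.5*math.log(2*math.pi*m/h0**2))

T = 300.0
beta = 1.0/(k*T)
db = beta*1e-6
E = -(ln_Z(beta + db) - ln_Z(beta - db))/(2*db)   # central difference
print(E / (1.5*N*k*T))   # ~1, confirming E = (3/2) N k T
```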
Thus,

S = ν R [ln V − (3/2) ln β + (3/2) ln(2πm/h₀²) + 3/2], (7.73)

or

S = ν R [ln V + (3/2) ln T + σ], (7.74)

where

σ = (3/2) ln(2πm k/h₀²) + 3/2. (7.75)

The above expression for the entropy of an ideal gas is certainly new. Unfortunately, it is also quite obviously incorrect!

7.7 Gibbs' paradox

What has gone wrong? First of all, let us be clear why Eq. (7.74) is incorrect. We can see that S → −∞ as T → 0, which contradicts the third law of thermodynamics. However, this is not a problem. Equation (7.74) was derived using classical physics, which breaks down at low temperatures. Thus, we would not expect this equation to give a sensible answer close to the absolute zero of temperature.

Equation (7.74) is wrong because it implies that the entropy does not behave properly as an extensive quantity. Thermodynamic quantities can be divided into two groups, extensive and intensive. Extensive quantities increase by a factor α when the size of the system under consideration is increased by the same factor. Intensive quantities stay the same. Energy and volume are typical extensive quantities; pressure and temperature are typical intensive quantities. Entropy is very definitely an extensive quantity. We have shown [see Eq. (7.58)] that the entropies of two weakly interacting systems are additive. Thus, if we double the size of a system we expect the entropy to double as well.

Suppose that we have a system of volume V containing ν moles of ideal gas at temperature T. Doubling the size of the system is like joining two identical systems together to form a new system of volume 2V containing 2ν moles of gas at temperature T. Let

S = ν R [ln V + (3/2) ln T + σ] (7.76)

denote the entropy of the original system, and let

S′ = 2ν R [ln 2V + (3/2) ln T + σ] (7.77)

denote the entropy of the double-sized system. Clearly, if entropy is an extensive quantity (which it is!) then we should have

S′ = 2 S. (7.78)

But, in fact, we find that

S′ − 2 S = 2ν R ln 2. (7.79)

So, the entropy of the double-sized system is more than double the entropy of the original system.

Where does this extra entropy come from? Well, let us consider a little more carefully how we might go about doubling the size of our system. Suppose that we put another identical system adjacent to the first, and separate the two systems by a partition. Let us now suddenly remove the partition. If entropy is a properly extensive quantity then the entropy of the overall system should be the same before and after the partition is removed. It is certainly the case that the energy (another extensive quantity) of the overall system stays the same. However, according to Eq. (7.79), the overall entropy of the system increases by 2ν R ln 2 after the partition is removed.

Suppose, now, that the second system is identical to the first in all respects except that its molecules are in some way slightly different from those in the first system, so that the two sets of molecules are distinguishable. In this case, we would certainly expect an overall increase in entropy when the partition is removed. Before the partition is removed, it separates type 1 molecules from type 2 molecules. After the partition is removed, molecules of both types become jumbled together. This is clearly an irreversible process: we cannot imagine the molecules spontaneously sorting themselves out again. The increase in entropy associated with this jumbling is called entropy of mixing.
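The 2ν R ln 2 discrepancy, and its disappearance once the molecules are treated as indistinguishable (the 1/N! correction discussed later in this section), can be checked numerically. A minimal Python sketch, with illustrative constants (a helium-like mass and an arbitrary h₀):

```python
import math

# Extensivity check: the uncorrected entropy, Eq. (7.74), over-counts by
# exactly 2 nu R ln 2 on doubling, while the 1/N!-corrected form is extensive.
k  = 1.380649e-23   # J/K
NA = 6.02214076e23  # 1/mol
R  = NA*k
m  = 6.64e-27       # kg (illustrative)
h0 = 6.63e-34       # J s (illustrative)
T  = 300.0

def S_uncorrected(nu, V):
    # S = nu R [ln V + (3/2) ln T + sigma], sigma from Eq. (7.75)
    sigma = 1.5*math.log(2*math.pi*m*k/h0**2) + 1.5
    return nu*R*(math.log(V) + 1.5*math.log(T) + sigma)

def S_corrected(nu, V):
    # S = nu R [ln(V/N) + (3/2) ln T + sigma0], sigma0 from Eq. (7.86)
    sigma0 = 1.5*math.log(2*math.pi*m*k/h0**2) + 2.5
    return nu*R*(math.log(V/(nu*NA)) + 1.5*math.log(T) + sigma0)

nu, V = 1.0, 1e-3
print(abs(S_corrected(2*nu, 2*V) - 2*S_corrected(nu, V)) < 1e-9)   # True
print((S_uncorrected(2*nu, 2*V) - 2*S_uncorrected(nu, V))
      / (2*nu*R*math.log(2)))                                      # ~1.0
```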
In this case, the increase in entropy is easily calculated. The volume accessible to type 1 molecules clearly doubles after the partition is removed, as does the volume accessible to type 2 molecules. Using the fundamental formula S = k ln Ω, the increase in entropy due to mixing is given by

ΔS = 2 k ln (Ω_f/Ω_i) = 2 N k ln (V_f/V_i) = 2ν R ln 2. (7.80)

It is clear that the additional entropy 2ν R ln 2, which appears when we double the size of an ideal gas system by joining together two identical systems, is entropy of mixing of the molecules contained in the original systems. But, if the molecules in these two systems are indistinguishable, why should there be any entropy of mixing? Clearly, there should be no entropy of mixing in this case. This is called Gibbs' paradox, after the American physicist Josiah Gibbs, who first discussed it. A paradox arises if we try to treat molecules as if they were distinguishable. The resolution of Gibbs' paradox is quite simple: treat all molecules of the same species as if they were indistinguishable.

At this point, we can begin to understand what has gone wrong in our calculation. In our previous calculation of the ideal gas partition function, we inadvertently treated each of the N molecules in the gas as distinguishable. In other words, we have been treating the molecules in our ideal gas as if each carried a little license plate, or a social security number, so that we could always tell one from another. Because of this, we overcounted the number of states of the system. It is plainly silly to pretend that we can distinguish molecules in a statistical problem, where we do not closely follow the motions of individual particles. Our problem is that we have been taking the classical approach a little too seriously. We have calculated the partition function assuming that all of the molecules in our system have the same mass and temperature, but we have never explicitly taken into account the fact that we consider the molecules to be indistinguishable. In quantum mechanics, which is what we really should be using to study microscopic phenomena, the essential indistinguishability of atoms and molecules is hard-wired into the theory at a very low level.

We know that the number of accessible states of an ideal gas varies with volume like Ω ∝ V^N. Since the N! possible permutations of the molecules amongst themselves do not lead to physically different situations, and, therefore, cannot be counted as separate states, the number of actual states of the system is a factor N! less than what we initially thought. We can easily correct our partition function by simply dividing by this factor, so that

Z = ζ^N/N!. (7.81)

This gives

ln Z = N ln ζ − ln N!, (7.82)

or

ln Z = N ln ζ − N ln N + N, (7.83)

using Stirling's approximation. Note that our new version of ln Z differs from our previous version by an additive term involving the number of particles in the system. This explains why our earlier calculations of the mean pressure and mean energy, which depend on partial derivatives of ln Z with respect to the volume and the temperature parameter β, respectively, came out all right. However, our expression for the entropy S is modified by this additive term. The new expression is

S = ν R [ln V − (3/2) ln β + (3/2) ln(2πm/h₀²) + 3/2] + k (−N ln N + N). (7.84)

This gives

S = ν R [ln(V/N) + (3/2) ln T + σ₀], (7.85)

where

σ₀ = (3/2) ln(2πm k/h₀²) + 5/2. (7.86)

It is clear that the entropy behaves properly as an extensive quantity in the above expression: i.e., it is multiplied by a factor α when ν, V, and N are all multiplied by the same factor.

7.8 The equipartition theorem

The internal energy of a monatomic ideal gas containing N particles is (3/2) N k T. This means that each particle possesses, on average, (3/2) k T units of energy. Monatomic particles have only three translational degrees of freedom, corresponding to their motion in three dimensions.
They possess no internal rotational or vibrational degrees of freedom. Thus, the mean energy per degree of freedom in a monatomic ideal gas is (1/2) k T. In fact, this is a special case of a rather general result. Let us now try to prove this.

Suppose that the energy of a system is determined by f generalized coordinates q_k and f corresponding generalized momenta p_k, so that

E = E(q₁, ···, q_f, p₁, ···, p_f). (7.87)

Suppose further that:

1. The total energy splits additively into the form

E = ε_i(p_i) + E′(q₁, ···, p_f), (7.88)

where ε_i involves only one variable, p_i, and the remaining part E′ does not depend on p_i.

2. The function ε_i is quadratic in p_i, so that

ε_i(p_i) = b p_i², (7.89)

where b is a constant.

The most common situation in which the above assumptions are valid is where p_i is a momentum. This is because the kinetic energy is usually a quadratic function of each momentum component, whereas the potential energy does not involve the momenta at all. However, if a coordinate q_i were to satisfy assumptions 1 and 2 then the theorem we are about to establish would hold just as well.

What is the mean value of ε_i in thermal equilibrium if conditions 1 and 2 are satisfied? If the system is in equilibrium at absolute temperature T ≡ (k β)⁻¹ then it is distributed according to the Boltzmann distribution. In the classical approximation, the mean value of ε_i is expressed in terms of integrals over all phase-space:

⟨ε_i⟩ = ∫ exp[−β E(q₁, ···, p_f)] ε_i dq₁ ··· dp_f / ∫ exp[−β E(q₁, ···, p_f)] dq₁ ··· dp_f. (7.90)

Condition 1 gives

⟨ε_i⟩ = ∫ exp[−β (ε_i + E′)] ε_i dq₁ ··· dp_f / ∫ exp[−β (ε_i + E′)] dq₁ ··· dp_f
      = [∫ exp(−β ε_i) ε_i dp_i ∫ exp(−β E′) dq₁ ··· dp_f] / [∫ exp(−β ε_i) dp_i ∫ exp(−β E′) dq₁ ··· dp_f], (7.91)

where use has been made of the multiplicative property of the exponential function, and where the last integrals in both the numerator and denominator extend over all variables q_k and p_k except p_i. These integrals are equal and, thus, cancel. Hence,

⟨ε_i⟩ = ∫ exp(−β ε_i) ε_i dp_i / ∫ exp(−β ε_i) dp_i. (7.92)

This expression can be simplified further, since

∫ exp(−β ε_i) ε_i dp_i ≡ −(∂/∂β) ∫ exp(−β ε_i) dp_i, (7.93)

so

⟨ε_i⟩ = −(∂/∂β) ln [∫ exp(−β ε_i) dp_i]. (7.94)

According to condition 2,

∫ exp(−β ε_i) dp_i = ∫ exp(−β b p_i²) dp_i = (1/√β) ∫ exp(−b y²) dy, (7.95)

where y = √β p_i. Thus,

ln ∫ exp(−β ε_i) dp_i = −(1/2) ln β + ln ∫ exp(−b y²) dy. (7.96)

Note that the integral on the right-hand side does not depend on β at all. It follows from Eq. (7.94) that

⟨ε_i⟩ = −(∂/∂β)[−(1/2) ln β] = 1/(2β), (7.97)

giving

⟨ε_i⟩ = (1/2) k T. (7.98)

This is the famous equipartition theorem of classical physics. It states that the mean value of every independent quadratic term in the energy is equal to (1/2) k T. If all terms in the energy are quadratic then the mean energy is spread equally over all degrees of freedom (hence the name "equipartition").

7.9 Harmonic oscillators

Our proof of the equipartition theorem depends crucially on the classical approximation. To see how quantum effects modify this result, let us examine a particularly simple system which we know how to analyze using both classical and quantum physics: i.e., a simple harmonic oscillator. Consider a one-dimensional harmonic oscillator in equilibrium with a heat reservoir at temperature T. The energy of the oscillator is given by

E = p²/(2m) + (1/2) κ x², (7.99)

where the first term on the right-hand side is the kinetic energy, involving the momentum p and mass m, and the second term is the potential energy, involving the displacement x and the force constant κ. Each of these terms is quadratic in the respective variable. So, in the classical approximation the equipartition theorem yields

⟨p²/(2m)⟩ = (1/2) k T, (7.100)

⟨(1/2) κ x²⟩ = (1/2) k T. (7.101)

That is, the mean kinetic energy of the oscillator is equal to the mean potential energy, and both equal (1/2) k T. It follows that the mean total energy is

Ē = (1/2) k T + (1/2) k T = k T. (7.102)
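The theorem invites a direct numerical test: for ε = b p², the Boltzmann weight exp(−β b p²) is a Gaussian in p, so we can sample it exactly and check that ⟨ε⟩ approaches (1/2) k T. A sketch in units where k = 1 (the values of b and T are arbitrary illustrative choices):

```python
import math, random

# Monte Carlo check of the equipartition theorem: for eps = b p^2 the
# Boltzmann distribution in p is Gaussian with variance 1/(2 beta b),
# so <eps> should equal (1/2) k T regardless of b.
random.seed(1)
k, T, b = 1.0, 2.5, 0.7
beta = 1.0/(k*T)
sigma = math.sqrt(1.0/(2*beta*b))     # std dev of the Gaussian exp(-beta b p^2)

samples = [b*random.gauss(0.0, sigma)**2 for _ in range(200_000)]
mean_eps = sum(samples)/len(samples)
print(mean_eps, 0.5*k*T)   # the two agree to within sampling error
```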
According to quantum mechanics, the energy levels of a harmonic oscillator are equally spaced and satisfy

E_n = (n + 1/2) ℏω, (7.103)

where n is a non-negative integer, and

ω = √(κ/m). (7.104)

The partition function for such an oscillator is given by

Z = Σ_{n=0}^∞ exp(−β E_n) = exp[−(1/2) β ℏω] Σ_{n=0}^∞ exp(−n β ℏω). (7.105)

Now,

Σ_{n=0}^∞ exp(−n β ℏω) = 1 + exp(−β ℏω) + exp(−2 β ℏω) + ··· (7.106)

is simply the sum of an infinite geometric series, and can be evaluated immediately:

Σ_{n=0}^∞ exp(−n β ℏω) = 1/[1 − exp(−β ℏω)]. (7.107)

Thus, the partition function takes the form

Z = exp[−(1/2) β ℏω] / [1 − exp(−β ℏω)], (7.108)

and

ln Z = −(1/2) β ℏω − ln[1 − exp(−β ℏω)]. (7.109)

The mean energy of the oscillator is given by [see Eq. (7.31)]

Ē = −∂ln Z/∂β = ℏω [1/2 + exp(−β ℏω)/(1 − exp(−β ℏω))], (7.110)

or

Ē = ℏω [1/2 + 1/(exp(β ℏω) − 1)]. (7.111)

Consider the limit

β ℏω = ℏω/(k T) ≪ 1, (7.112)

in which the thermal energy k T is large compared to the separation ℏω between the energy levels. In this limit,

exp(β ℏω) ≃ 1 + β ℏω, (7.113)

so

Ē ≃ ℏω [1/2 + 1/(β ℏω)] ≃ ℏω/(β ℏω) = 1/β, (7.114)

giving

Ē ≃ k T. (7.115)

Thus, the classical result (7.102) holds whenever the thermal energy greatly exceeds the typical spacing between quantum energy levels, so that the oscillator possesses sufficient thermal energy to explore many of its possible quantum states.

Consider, now, the limit

β ℏω = ℏω/(k T) ≫ 1, (7.116)

in which the thermal energy is small compared to the separation between the energy levels. In this limit,

exp(β ℏω) ≫ 1, (7.117)

and so

Ē ≃ ℏω [1/2 + exp(−β ℏω)] ≃ (1/2) ℏω. (7.118)

Thus, if the thermal energy is much less than the spacing between quantum states then the mean energy approaches that of the ground-state (the so-called zero-point energy). Clearly, the equipartition theorem is only valid in the former limit, where k T ≫ ℏω.

7.10 Specific heats

We have discussed the internal energies and entropies of substances (mostly ideal gases) at some length. Unfortunately, these quantities cannot be directly measured.
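Equation (7.111) and its two limits are easy to verify numerically. Here is a sketch in units where ℏω = 1 and k = 1:

```python
import math

# Mean energy of a quantum harmonic oscillator, Eq. (7.111), in units
# where hbar*omega = 1 and k = 1, checked against its two limits.
def E_mean(T):
    beta = 1.0/T
    return 0.5 + 1.0/(math.exp(beta) - 1.0)   # hbar*omega*[1/2 + 1/(e^x - 1)]

print(E_mean(100.0))   # high-T limit: ~ k T = 100 (classical equipartition)
print(E_mean(0.01))    # low-T limit: ~ 0.5 (the zero-point energy)
```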
Instead, they must be inferred from other information. The thermodynamic property of substances which is easiest to measure is, of course, the heat capacity, or specific heat. In fact, once the variation of the specific heat with temperature is known, both the internal energy and the entropy can be easily reconstructed via

Ē(T, V) = ν ∫₀^T c_V(T, V) dT + Ē(0, V), (7.119)

S(T, V) = ν ∫₀^T [c_V(T, V)/T] dT. (7.120)

Here, use has been made of dS = dQ/T, and of the third law of thermodynamics. Clearly, the optimum way of verifying the results of statistical thermodynamics is to compare the theoretically predicted heat capacities with the experimentally measured values.

Classical physics, in the guise of the equipartition theorem, says that each independent degree of freedom associated with a quadratic term in the energy possesses an average energy (1/2) k T in thermal equilibrium at temperature T. Consider a substance made up of N molecules. Every molecular degree of freedom contributes (1/2) N k T, or (1/2) ν R T, to the mean energy of the substance (with the tacit proviso that each degree of freedom is associated with a quadratic term in the energy). Thus, the contribution to the molar heat capacity at constant volume (we wish to avoid the complications associated with any external work done on the substance) is

(1/ν) (∂[(1/2) ν R T]/∂T)_V = (1/2) R (7.121)

per molecular degree of freedom. The total classical heat capacity is therefore

c_V = (g/2) R, (7.122)

where g is the number of molecular degrees of freedom. Since large complicated molecules clearly have very many more degrees of freedom than small simple molecules, the above formula predicts that the molar heat capacities of substances made up of the former type of molecule should greatly exceed those of substances made up of the latter. In fact, the experimental heat capacities of substances containing complicated molecules are generally greater than those of substances containing simple molecules, but by nowhere near the large factor predicted by Eq. (7.122). This equation also implies that heat capacities are temperature independent. In fact, this is not the case for most substances: experimental heat capacities generally increase with increasing temperature. These two experimental facts pose severe problems for classical physics. Incidentally, these problems were fully appreciated as far back as 1850. Stories that physicists at the end of the nineteenth century thought that classical physics explained absolutely everything are largely apocryphal.

The equipartition theorem (and the whole classical approximation) is only valid when the typical thermal energy k T greatly exceeds the spacing between quantum energy levels. Suppose that the temperature is sufficiently low that this condition is not satisfied for one particular molecular degree of freedom. In fact, suppose that k T is much less than the spacing between the energy levels. According to Sect. 7.9, in this situation the degree of freedom only contributes the ground-state energy, E₀, say, to the mean energy of the molecule. The ground-state energy can be a quite complicated function of the internal properties of the molecule, but it is certainly not a function of the temperature, since temperature is a collective property of all the molecules. It follows that the contribution to the molar heat capacity is

(1/ν) (∂[N E₀]/∂T)_V = 0. (7.123)

Thus, if k T is much less than the spacing between the energy levels then the degree of freedom contributes nothing at all to the molar heat capacity. We say that this particular degree of freedom is frozen out. Clearly, at very low temperatures just about all degrees of freedom are frozen out. As the temperature is gradually increased, degrees of freedom successively "kick in", as k T approaches, and then greatly exceeds, the spacing between their quantum energy levels, and eventually contribute their full (1/2) R to the molar heat capacity. We can use these simple ideas to explain the behaviours of most experimental heat capacities.

To make further progress, we need to estimate the typical spacing between the quantum energy levels associated with the various degrees of freedom. We can do this by observing the frequency of the electromagnetic radiation emitted and absorbed during transitions between these energy levels. If the typical spacing between energy levels is ΔE then transitions between the various levels are associated with photons of frequency ν, where h ν = ΔE. We can define an effective temperature of the radiation via h ν = k T_rad. If T ≫ T_rad then k T ≫ ΔE, and the degree of freedom makes its full contribution to the heat capacity. On the other hand, if T ≪ T_rad then k T ≪ ΔE, and the degree of freedom is frozen out. Table 3 lists the effective "temperatures" of various different types of electromagnetic radiation:

Radiation type   Frequency (Hz)     T_rad (K)
Radio            < 10⁹              < 0.05
Microwave        10⁹ – 10¹¹         0.05 – 5
Infrared         10¹¹ – 10¹⁴        5 – 5000
Visible          5 × 10¹⁴           2 × 10⁴
Ultraviolet      10¹⁵ – 10¹⁷        5 × 10⁴ – 5 × 10⁶
X-ray            10¹⁷ – 10²⁰        5 × 10⁶ – 5 × 10⁹
γ-ray            > 10²⁰             > 5 × 10⁹

Table 3: Effective "temperatures" of various types of electromagnetic radiation.

It is clear that degrees of freedom which give rise to emission or absorption of radio or microwave radiation contribute their full (1/2) R to the molar heat capacity at room temperature. Degrees of freedom which give rise to emission or absorption in the visible, ultraviolet, X-ray, or γ-ray regions of the electromagnetic spectrum are frozen out at room temperature. Degrees of freedom which emit or absorb infrared radiation are on the borderline.

7.11 Specific heats of gases

Let us now investigate the specific heats of gases. Consider, first of all, translational degrees of freedom. Every molecule in a gas is free to move in three dimensions. If one particular molecule has mass m and momentum p = m v then its kinetic energy of translation is

K = (p_x² + p_y² + p_z²)/(2m). (7.124)

The kinetic energy of the other molecules does not involve the momentum p of this particular molecule. Moreover, the potential energy of interaction between molecules depends only on their position coordinates, and, thus, certainly does not involve p. Any internal rotational, vibrational, electronic, or nuclear degrees of freedom of the molecule also do not involve p. Hence, the essential conditions of the equipartition theorem are satisfied (at least, in the classical approximation). Since Eq. (7.124) contains three independent quadratic terms, there are clearly three degrees of freedom associated with translation (one for each dimension of space), so the translational contribution to the molar heat capacity of gases is

(c_V)_translation = (3/2) R. (7.125)

Suppose that our gas is contained in a cubic enclosure of dimension L. According to Schrödinger's equation, the quantized translational energy levels of an individual molecule are given by

E = (ℏ² π²/2 m L²) (n₁² + n₂² + n₃²), (7.126)

where n₁, n₂, and n₃ are positive integer quantum numbers. Clearly, the spacing between the energy levels can be made arbitrarily small by increasing the size of the enclosure. This implies that translational degrees of freedom can be treated classically, so that Eq. (7.125) is always valid (except very close to absolute zero). We conclude that all gases possess a minimum molar heat capacity of (3/2) R due to the translational degrees of freedom of their constituent molecules.

The electronic degrees of freedom of gas molecules (i.e., the possible configurations of electrons orbiting the atomic nuclei) typically give rise to absorption and emission in the ultraviolet or visible regions of the spectrum. It follows from Tab. 3 that electronic degrees of freedom are frozen out at room temperature. Similarly, nuclear degrees of freedom (i.e., the possible configurations of protons and neutrons in the atomic nuclei) are frozen out because they are associated with absorption and emission in the X-ray and γ-ray regions of the electromagnetic spectrum. Hence, the only additional degrees of freedom we need worry about for gases are rotational and vibrational degrees of freedom. These typically give rise to absorption lines in the infrared region of the spectrum.
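To see just how small the translational level spacing of Eq. (7.126) is, consider a helium atom in a 1 cm box at room temperature (illustrative numbers, not from the text):

```python
import math

# Ground-state energy scale of the quantized translational levels,
# Eq. (7.126), compared with k T for helium in a 1 cm box: the ratio
# is so small that translation is effectively classical.
hbar = 1.054571817e-34   # J s
k = 1.380649e-23         # J/K
m, L, T = 6.64e-27, 1e-2, 300.0

E1 = hbar**2 * math.pi**2 / (2*m*L**2)
print(E1/(k*T))   # ~2e-17: utterly negligible compared with unity
```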
The rotational kinetic energy of a molecule tumbling in space can be written

K = (1/2) I_x ω_x² + (1/2) I_y ω_y² + (1/2) I_z ω_z², (7.127)

where the x-, y-, and z-axes are the so-called principal axes of inertia of the molecule (these are mutually perpendicular), ω_x, ω_y, and ω_z are the angular velocities of rotation about these axes, and I_x, I_y, and I_z are the corresponding moments of inertia of the molecule. No other degrees of freedom depend on the angular velocities of rotation. Since the kinetic energy of rotation is the sum of three quadratic terms, the rotational contribution to the molar heat capacity of gases is

(c_V)_rotation = (3/2) R, (7.128)

according to the equipartition theorem. Note that the typical magnitude of a molecular moment of inertia is m d², where m is the molecular mass and d is the typical interatomic spacing in the molecule. A special case arises if the molecule is linear (e.g., if the molecule is diatomic). In this case, one of the principal axes lies along the line of centres of the atoms. The moment of inertia about this axis is of order m a², where a is a typical nuclear dimension (remember that nearly all of the mass of an atom resides in the nucleus). Since a ∼ 10⁻⁵ d, it follows that the moment of inertia about the line of centres is minuscule compared to the moments of inertia about the other two principal axes.

In quantum mechanics, angular momentum is quantized in units of ℏ. The energy levels of a rigid rotator are written

E = (ℏ²/2I) J (J + 1), (7.129)

where I is the moment of inertia and J is an integer. Note the inverse dependence of the spacing between energy levels on the moment of inertia. It is clear that, for the case of a linear molecule, the rotational degree of freedom associated with spinning about the line of centres of the atoms is frozen out at room temperature, given the very small moment of inertia about this axis, and, hence, the very widely spaced rotational energy levels.

Classically, the vibrational degrees of freedom of a molecule are studied by standard normal mode analysis of the molecular structure. Each normal mode behaves like an independent harmonic oscillator, and, therefore, contributes R to the molar specific heat of the gas [(1/2) R from the kinetic energy of vibration and (1/2) R from the potential energy of vibration]. A molecule containing n atoms has n − 1 normal modes of vibration. For instance, a diatomic molecule has just one normal mode (corresponding to periodic stretching of the bond between the two atoms). Thus, the classical contribution of the vibrational degrees of freedom to the specific heat is

(c_V)_vibration = (n − 1) R. (7.130)

So, do any of the rotational and vibrational degrees of freedom actually make a contribution to the specific heats of gases at room temperature, once quantum effects are taken into consideration? We can answer this question by examining just one piece of data. Figure 3 shows the infrared absorption spectrum of hydrogen chloride.

Figure 3: The infrared vibration-absorption spectrum of HCl.

The absorption lines correspond to simultaneous transitions between different vibrational and rotational energy levels, so this is usually called a vibration-rotation spectrum. The missing line at about 3.47 microns corresponds to a pure vibrational transition from the ground-state to the first excited state (pure vibrational transitions are forbidden: HCl molecules always have to simultaneously change their rotational energy level if they are to couple effectively to electromagnetic radiation). The longer wavelength absorption lines correspond to vibrational transitions in which there is a simultaneous decrease in the rotational energy level.
Likewise, the shorter wavelength absorption lines correspond to vibrational transitions in which there is a simultaneous increase in the rotational energy level. It is clear from the spectrum that the rotational energy levels are more closely spaced than the vibrational energy levels. The pure vibrational transition gives rise to absorption at about 3.47 microns, which corresponds to infrared radiation of frequency 8.5 × 10¹³ hertz, with an associated radiation "temperature" of 4400 kelvin. We conclude that the vibrational degrees of freedom of HCl, or any other small molecule, are frozen out at room temperature. The rotational transitions split the vibrational lines by about 0.2 microns. This implies that pure rotational transitions would be associated with infrared radiation of frequency 5 × 10¹² hertz and a corresponding radiation "temperature" of 260 kelvin. We conclude that the rotational degrees of freedom of HCl, or any other small molecule, are not frozen out at room temperature, and probably contribute the classical (1/2) R per degree of freedom to the molar specific heat.

We are now in a position to make some predictions regarding the specific heats of various gases. Monatomic molecules only possess three translational degrees of freedom, so monatomic gases should have a molar heat capacity (3/2) R = 12.47 joules/degree/mole. Moreover, the ratio of specific heats γ = c_p/c_V = (c_V + R)/c_V should be 5/3 = 1.667. It can be seen from Tab. 2 that both of these predictions are borne out pretty well for helium and argon. Diatomic molecules possess three translational degrees of freedom and two rotational degrees of freedom (all other degrees of freedom are frozen out at room temperature). Linear molecules (like HCl) effectively have only two rotational degrees of freedom, instead of the usual three, because of the very small moment of inertia of such molecules about the line of centres of the atoms. Thus, diatomic gases should have a molar heat capacity (5/2) R = 20.8 joules/degree/mole, and the ratio of specific heats should be 7/5 = 1.4. It can be seen from Tab. 2 that these are pretty accurate predictions for nitrogen and oxygen.

There is one proviso, however. The freezing out of vibrational degrees of freedom becomes gradually less effective as molecules become heavier and more complex. This is partly because such molecules are generally less stable, so the force constant κ is reduced, and partly because the molecular mass is increased. Both these effects reduce the frequency of vibration of the molecular normal modes [see Eq. (7.104)], and, hence, the spacing between the vibrational energy levels [see Eq. (7.103)]. This accounts for the obviously non-classical [i.e., not a multiple of (1/2) R] specific heats of carbon dioxide and ethane in Tab. 2. In both molecules, vibrational degrees of freedom contribute to the molar specific heat (but not the full R, because the temperature is not sufficiently high).
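The two radiation "temperatures" quoted for HCl follow directly from h ν = k T_rad. A sketch with rounded constants (the values come out near, but not exactly equal to, the 4400 K and 260 K quoted in the text, which use slightly different rounding):

```python
# Radiation "temperatures" for the HCl vibration-rotation spectrum:
# the pure vibrational line at 3.47 microns, and the ~0.2 micron
# rotational splitting of the lines.
c = 2.998e8          # m/s
h = 6.626e-34        # J s
k = 1.381e-23        # J/K

lam = 3.47e-6
nu_vib = c/lam                    # ~8.6e13 Hz (text quotes 8.5e13 Hz)
nu_rot = c*0.2e-6/lam**2          # splitting in frequency, ~5e12 Hz

T_vib = h*nu_vib/k                # ~4100 K: vibration frozen out at 300 K
T_rot = h*nu_rot/k                # ~240 K: rotation active at 300 K
print(T_vib, T_rot)
```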
Figure 4 shows the variation of the molar heat capacity at constant volume (in units of R) of gaseous hydrogen with temperature. The expected contribution from the translational degrees of freedom is (3/2) R (there are three translational degrees of freedom per molecule). The expected contribution at high temperatures from the rotational degrees of freedom is R (there are effectively two rotational degrees of freedom per molecule). Finally, the expected contribution at high temperatures from the vibrational degrees of freedom is R (there is one vibrational degree of freedom per molecule). It can be seen that, as the temperature rises, first the rotational, and then the vibrational, degrees of freedom eventually make their full classical contributions to the heat capacity.

Figure 4: The molar heat capacity at constant volume (in units of R) of gaseous H₂ versus temperature.
7.12 Specific heats of solids

Consider a simple solid containing N atoms. Now, atoms in solids cannot translate (unlike those in gases), but are free to vibrate about their equilibrium positions. Such vibrations are called lattice vibrations, and can be thought of as sound waves propagating through the crystal lattice. Each atom is specified by three independent position coordinates, and three conjugate momentum coordinates. Let us only consider small amplitude vibrations. In this case, we can expand the potential energy of interaction between the atoms to give an expression which is quadratic in the atomic displacements from their equilibrium positions. It is always possible to perform a normal mode analysis of the oscillations. In effect, we can find 3N independent modes of oscillation of the solid. Each mode has its own particular oscillation frequency, and its own particular pattern of atomic displacements. Any general oscillation can be written as a linear combination of these normal modes. Let qi be the (appropriately normalized) amplitude of the ith normal mode, and pi the momentum conjugate to this coordinate. In normal mode coordinates, the total energy of the lattice vibrations takes the particularly simple form

E = (1/2) Σ_{i=1}^{3N} (pi² + ωi² qi²),   (7.131)

where ωi is the (angular) oscillation frequency of the ith normal mode. It is clear that, in normal mode coordinates, the linearized lattice vibrations are equivalent to 3N independent harmonic oscillators (of course, each oscillator corresponds to a different normal mode).

The typical value of ωi is the (angular) frequency of a sound wave propagating through the lattice. Sound wave frequencies are far lower than the typical vibration frequencies of gaseous molecules. In the latter case, the mass involved in the vibration is simply that of the molecule, whereas in the former case the mass involved is that of very many atoms (since lattice vibrations are non-localized). The strength of interatomic bonds in gaseous molecules is similar to those in solids, so we can use the estimate ω ∼ √(κ/m) (κ is the force constant which measures the strength of interatomic bonds, and m is the mass involved in the oscillation) as proof that the typical frequencies of lattice vibrations are very much less than the vibration frequencies of simple molecules.
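The mass argument above can be made concrete with a couple of lines of arithmetic. This is a hedged sketch: the force constant and molecular mass below are rough illustrative numbers, not values taken from the text; only the scaling with mass matters.

```python
import math

# omega ~ sqrt(kappa/m): take the same force constant kappa for a molecular
# bond and for the lattice (the text argues the bond strengths are similar),
# and vary only the oscillating mass.
kappa = 500.0            # N/m, illustrative bond force constant (assumption)
m_molecule = 3.3e-27     # kg, roughly the mass of an H2 molecule (assumption)
n_atoms = 1.0e6          # a non-localized lattice vibration involves very many atoms

omega_molecule = math.sqrt(kappa / m_molecule)
omega_lattice = math.sqrt(kappa / (n_atoms * m_molecule))

# The lattice frequency is smaller by sqrt(n_atoms): a factor of 1000 here.
print(omega_molecule / omega_lattice)
```

Since the effective mass of a collective lattice oscillation is that of very many atoms, its frequency is suppressed by the square root of that number, which is the point being made in the text.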
It follows from ΔE = ħω that the quantum energy levels of lattice vibrations are far more closely spaced than the vibrational energy levels of gaseous molecules. Thus, it is likely (and is, indeed, the case) that lattice vibrations are not frozen out at room temperature, but, instead, make their full classical contribution to the molar specific heat of the solid. If the lattice vibrations behave classically then, according to the equipartition theorem, each normal mode of oscillation has an associated mean energy kT in equilibrium at temperature T [(1/2) kT resides in the kinetic energy of the oscillation, and (1/2) kT resides in the potential energy]. Thus, the mean internal energy per mole of the solid is

E = 3 N k T = 3 ν R T.   (7.132)

It follows that the molar heat capacity at constant volume is

cV = (1/ν) (∂E/∂T)_V = 3 R   (7.133)

for solids. This gives a value of 24.9 joules/mole/degree. In fact, at room temperature most solids (in particular, metals) have heat capacities which lie remarkably close to this value. This fact was discovered experimentally by Dulong and Petit at the beginning of the nineteenth century, and was used to make some of the first crude estimates of the molecular weights of solids (if we know the molar heat capacity of a substance then we can easily work out how much of it corresponds to one mole, and, by weighing this amount, we can obtain an estimate of the molecular weight). Table 4 lists the experimental molar heat capacities cp at constant pressure for various solids. The heat capacity at constant volume is somewhat less than the constant pressure value, but not by much, because solids are fairly incompressible.

Solid      cp       Solid                cp
Copper     24.5     Aluminium            24.4
Silver     25.5     Tin (white)          26.4
Lead       26.4     Sulphur (rhombic)    22.4
Zinc       25.4     Carbon (diamond)     6.1

Table 4: Values of cp (joules/mole/degree) for some solids at T = 298 K. From Reif.

It can be seen that Dulong and Petit's law (i.e., that all solids have molar heat capacities close to 24.9 joules/mole/degree) holds pretty well for metals. However, the law fails badly for diamond. This is not surprising. As is well-known, diamond is an extremely hard substance, so its interatomic bonds must be very strong, suggesting that the force constant κ is large. Diamond is also a fairly low density substance, so the mass m involved in lattice vibrations is comparatively small. Both these facts suggest that the typical lattice vibration frequency of diamond (ω ∼ √(κ/m)) is high. In fact, the spacing between the different vibration energy levels (which scales like ħω) is sufficiently large in diamond for the vibrational degrees of freedom to be largely frozen out at room temperature. This accounts for the anomalously low heat capacity of diamond in Tab. 4.

Dulong and Petit's law is essentially a high temperature limit. The molar heat capacity cannot remain a constant as the temperature approaches absolute zero, since, by Eq. (7.120), this would imply S → ∞, which violates the third law of thermodynamics. We can make a crude model of the behaviour of cV at low temperatures by assuming that all the normal modes oscillate at the same frequency, ω, say. This approximation was first employed by Einstein in a paper published in 1907. According to Eq. (7.131), the solid then acts like a set of 3N independent oscillators which, making use of Einstein's approximation, all vibrate at the same frequency. We can use the quantum mechanical result (7.111) for a single oscillator to write the mean energy of the solid in the form

E = 3 N ħω [1/2 + 1/(exp(βħω) − 1)].   (7.134)

The molar heat capacity is defined

cV = (1/ν) (∂E/∂T)_V = (1/ν) (∂E/∂β)_V (∂β/∂T) = −(1/(ν k T²)) (∂E/∂β)_V,   (7.135)

giving

cV = (3 NA (ħω)²/(k T²)) exp(βħω)/[exp(βħω) − 1]²,   (7.136)
which reduces to

cV = 3 R (θE/T)² exp(θE/T)/[exp(θE/T) − 1]².   (7.137)

Here,

θE = ħω/k   (7.138)

is called the Einstein temperature. If the temperature is sufficiently high that T ≫ θE then kT ≫ ħω, and the above expression reduces to cV = 3R, after expansion of the exponential functions. Thus, the law of Dulong and Petit is recovered for temperatures significantly in excess of the Einstein temperature. On the other hand, if the temperature is sufficiently low that T ≪ θE then the exponential factors appearing in Eq. (7.137) become very much larger than unity, giving

cV ∼ 3 R (θE/T)² exp(−θE/T).   (7.139)

So, in this simple model the specific heat approaches zero exponentially as T → 0. In reality, the specific heats of solids do not approach zero quite as quickly as suggested by Einstein's model when T → 0. The experimentally observed low temperature behaviour is more like cV ∝ T³ (see Fig. 6). The reason for this discrepancy is the crude approximation that all normal modes have the same frequency. In fact, long wavelength modes have lower frequencies than short wavelength modes, so the former are much harder to freeze out than the latter (because the spacing between quantum energy levels, ħω, is smaller in the former case). The molar heat capacity does not decrease with temperature as rapidly as suggested by Einstein's model because these long wavelength modes are able to make a significant contribution to the heat capacity even at very low temperatures. A more realistic model of lattice vibrations was developed by the Dutch physicist Peter Debye in 1912. In the Debye model, the frequencies of the normal modes of vibration are estimated by treating the solid as an isotropic continuous medium. This approach is reasonable because the only modes which really matter at low temperatures are the long wavelength modes: i.e., those whose wavelengths greatly exceed the interatomic spacing. It is plausible that these modes are not particularly sensitive to the discrete nature of the solid: i.e., the fact that it is made up of atoms rather than being continuous.
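Before moving on to the Debye model, the Einstein result (7.137) and its two limits can be checked numerically. This is a minimal sketch (the function name is ours; the formula is Eq. (7.137) verbatim):

```python
import math

R = 8.314  # molar gas constant, J/mol/K

def einstein_cv(T, theta_E):
    """Einstein molar heat capacity, Eq. (7.137)."""
    x = theta_E / T
    return 3.0 * R * x**2 * math.exp(x) / (math.exp(x) - 1.0)**2

# High-temperature limit recovers Dulong and Petit: c_V -> 3R ~ 24.9 J/mol/K.
print(einstein_cv(10000.0, 100.0))
# Low-temperature limit: c_V is exponentially suppressed, Eq. (7.139).
print(einstein_cv(10.0, 1000.0))
```

The first value is indistinguishable from 3R, while the second is smaller by dozens of orders of magnitude, which is precisely the exponential freezing-out described above.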
Consider a sound wave propagating through an isotropic continuous medium. The disturbance varies with position vector r and time t like exp[−i (k·r − ω t)], where the wavevector k and the frequency of oscillation ω satisfy the dispersion relation for sound waves in an isotropic medium:

ω = k cs.   (7.140)

Here, cs is the speed of sound in the medium. Suppose, for the sake of argument, that the medium is periodic in the x-, y-, and z-directions with periodicity lengths Lx, Ly, and Lz, respectively. In order to maintain periodicity we need

kx (x + Lx) = kx x + 2π nx,   (7.141)

where nx is an integer. It follows that in a periodic medium the components of the wavevector are quantized, and can only take the values

kx = (2π/Lx) nx,   (7.142)
ky = (2π/Ly) ny,   (7.143)
kz = (2π/Lz) nz,   (7.144)

where nx, ny, and nz are all integers. It is assumed that Lx, Ly, and Lz are macroscopic lengths, so the allowed values of the components of the wavevector are very closely spaced. For given values of ky and kz, the number of allowed values of kx which lie in the range kx to kx + dkx is given by

Δnx = (Lx/2π) dkx.   (7.145)

There are analogous constraints on ky and kz. It follows that the number of allowed values of k (i.e., the number of allowed modes) when kx lies in the range kx to kx + dkx, ky lies in the range ky to ky + dky, and kz lies in the range kz to kz + dkz, is

ρ d³k = (Lx dkx/2π) (Ly dky/2π) (Lz dkz/2π) = [V/(2π)³] dkx dky dkz,   (7.146)

where V = Lx Ly Lz is the periodicity volume, and d³k ≡ dkx dky dkz. The quantity ρ is called the density of modes. Note that this density is independent of k, and proportional to the periodicity volume. Thus, the density of modes per unit volume is a constant, independent of the magnitude or shape of the periodicity volume. The density of modes per unit volume when the magnitude of k lies in the range k to k + dk is given by multiplying the density of modes per unit volume by the "volume" in k-space of the spherical shell lying between radii k and k + dk. Thus,

ρk dk = 4π k² dk/(2π)³ = (k²/2π²) dk.   (7.147)

Consider an isotropic continuous medium of volume V. According to the above relation, the number of normal modes whose frequencies lie between ω and ω + dω (which is equivalent to the number of modes whose k values lie in the range ω/cs to ω/cs + dω/cs) is

σc(ω) dω = 3 V (k²/2π²) dk = (3V/2π² cs³) ω² dω.   (7.148)

The factor of 3 comes from the three possible polarizations of sound waves in solids. For every allowed wavenumber (or frequency) there are two independent torsional modes, where the displacement is perpendicular to the direction of propagation, and one longitudinal mode, where the displacement is parallel to the direction of propagation. Torsion waves are vaguely analogous to electromagnetic waves (these also have two independent polarizations). The longitudinal mode is very similar to the compressional sound wave in gases. Of course, torsion waves can not propagate in gases, because gases have no resistance to deformation without change of volume.

The Debye approach consists in approximating the actual density of normal modes σ(ω) by the density in a continuous medium σc(ω), not only at low frequencies (long wavelengths) where these should be nearly the same, but also at higher frequencies where they may differ substantially. Suppose that we are dealing with a solid consisting of N atoms. We know that there are only 3N independent normal modes. It follows that we must cut off the density of states above some critical frequency, ωD say, otherwise we will have too many modes.
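The mode density of Eq. (7.146) can be verified by brute force. The sketch below counts, for a cubic periodicity volume, the integer triples (nx, ny, nz) whose wavevector lies inside a sphere of radius K, and compares the count with the continuum estimate, i.e. the k-space volume (4/3)πK³ times the density V/(2π)³ (in the dimensionless variable ρ = K L/2π the estimate is just the volume of a sphere of radius ρ):

```python
import math

def count_modes(rho):
    """Count integer points (nx, ny, nz) with nx^2 + ny^2 + nz^2 < rho^2."""
    n_max = int(rho) + 1
    count = 0
    for nx in range(-n_max, n_max + 1):
        for ny in range(-n_max, n_max + 1):
            for nz in range(-n_max, n_max + 1):
                if nx * nx + ny * ny + nz * nz < rho * rho:
                    count += 1
    return count

rho = 20.0                                # rho = K * L / (2*pi), dimensionless
exact = count_modes(rho)
estimate = 4.0 / 3.0 * math.pi * rho**3   # continuum estimate from Eq. (7.146)
print(exact, estimate, exact / estimate)  # the ratio tends to 1 as rho grows
```

Already at ρ = 20 the brute-force count agrees with the continuum estimate to within a few percent, which is why treating the closely spaced modes as a continuum is legitimate for macroscopic Lx, Ly, Lz.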
Thus, in the Debye approximation the density of normal modes takes the form

σD(ω) = σc(ω) for ω < ωD,
σD(ω) = 0 for ω > ωD.   (7.149)

Here, ωD is the Debye frequency. This critical frequency is chosen such that the total number of normal modes is 3N, so

∫₀^∞ σD(ω) dω = ∫₀^{ωD} σc(ω) dω = 3N.   (7.150)

Substituting Eq. (7.148) into the previous formula yields

(3V/2π² cs³) ∫₀^{ωD} ω² dω = (V/2π² cs³) ωD³ = 3N.   (7.151)

This implies that

ωD = cs (6π² N/V)^{1/3}.   (7.152)

Thus, the Debye frequency depends only on the sound velocity in the solid and the number of atoms per unit volume. The wavelength corresponding to the Debye frequency is 2π cs/ωD, which is clearly on the order of the interatomic spacing a ∼ (V/N)^{1/3}. It follows that the cut-off of normal modes whose frequencies exceed the Debye frequency is equivalent to a cut-off of normal modes whose wavelengths are less than the interatomic spacing. Of course, it makes physical sense that such modes should be absent.

Figure 5 compares the actual density of normal modes in diamond with the density predicted by Debye theory. Not surprisingly, there is not a particularly strong resemblance between these two curves, since Debye theory is highly idealized. Nevertheless, both curves exhibit sharp cut-offs at high frequencies, and the areas under both curves are the same. As we shall see, this is sufficient to allow Debye theory to correctly account for the temperature variation of the specific heat of solids at low temperatures. We can use the quantum mechanical expression for the mean energy of a single oscillator, Eq. (7.111), to calculate the mean energy of lattice vibrations in the Debye approximation.
Figure 5: The true density of normal modes in diamond compared with the density of normal modes predicted by Debye theory. From C.B. Walker, Phys. Rev. 103, 547 (1956).

We obtain

E = ∫₀^∞ σD(ω) ħω [1/2 + 1/(exp(βħω) − 1)] dω.   (7.153)

According to Eq. (7.135), the molar heat capacity takes the form

cV = (1/(ν k T²)) ∫₀^∞ σD(ω) (ħω)² exp(βħω)/[exp(βħω) − 1]² dω.   (7.154)

Substituting in Eq. (7.148), we find that

cV = (k/ν) ∫₀^{ωD} (βħω)² {exp(βħω)/[exp(βħω) − 1]²} (3V/2π² cs³) ω² dω,   (7.155)

giving

cV = [3 V k/(2π² ν (cs βħ)³)] ∫₀^{βħωD} x⁴ exp x/(exp x − 1)² dx,   (7.156)

in terms of the dimensionless variable x = βħω. According to Eq. (7.152), the volume can be written

V = 6π² N (cs/ωD)³,   (7.157)
so the heat capacity reduces to

cV = 3 R fD(βħωD) = 3 R fD(θD/T),   (7.158)

where the Debye function is defined

fD(y) ≡ (3/y³) ∫₀^y x⁴ exp x/(exp x − 1)² dx.   (7.159)

We have also defined the Debye temperature θD as

k θD = ħωD.   (7.160)

Consider the asymptotic limit in which T ≫ θD. For small y, we can approximate exp x as 1 + x in the integrand of Eq. (7.159), so that

fD(y) → (3/y³) ∫₀^y x² dx = 1.   (7.161)

Thus, if the temperature greatly exceeds the Debye temperature we recover the law of Dulong and Petit that cV = 3R. Consider, now, the asymptotic limit in which T ≪ θD. For large y,

∫₀^y x⁴ exp x/(exp x − 1)² dx → ∫₀^∞ x⁴ exp x/(exp x − 1)² dx = 4π⁴/15.   (7.162)

The latter integration is standard (if rather obscure), and can be looked up in any (large) reference book on integration. Thus, in the low temperature limit

fD(y) → (4π⁴/5) (1/y³).   (7.163)

This yields

cV ∼ (12π⁴/5) R (T/θD)³   (7.164)

in the limit T ≪ θD: i.e., cV varies with temperature like T³. The fact that cV goes like T³ at low temperatures is quite well verified experimentally, although it is sometimes necessary to go to temperatures as low as 0.02 θD to obtain this asymptotic behaviour.
Theoretically, θD should be calculable from Eq. (7.152) in terms of the sound speed in the solid and the molar volume. Table 5 shows a comparison of Debye temperatures evaluated by this means with temperatures obtained empirically by fitting the law (7.164) to the low temperature variation of the heat capacity. It can be seen that there is fairly good agreement between the theoretical and empirical Debye temperatures. This suggests that the Debye theory affords a good, though not perfect, representation of the behaviour of cV in solids over the entire temperature range.

Solid    θD from low temp.    θD from sound speed
Na Cl    308                  320
K Cl     230                  246
Ag       225                  216
Zn       308                  305

Table 5: Comparison of Debye temperatures (in degrees kelvin) obtained from the low temperature behaviour of the heat capacity with those calculated from the sound speed. From C. Kittel, Introduction to solid-state physics, 2nd Ed. (John Wiley & Sons, New York NY, 1956).

Finally, Fig. 6 shows the actual temperature variation of the molar heat capacities of various solids, as well as that predicted by Debye's theory. The prediction of Einstein's theory is also shown for the sake of comparison. Note that 24.9 joules/mole/degree is about 6 calories/gram-atom/degree (the latter are chemist's units).

Figure 6: The molar heat capacity of various solids.

7.13 The Maxwell distribution

Consider a molecule of mass m in a gas which is sufficiently dilute for the intermolecular forces to be negligible (i.e., an ideal gas). The energy of the molecule is written

ε = p²/2m + ε^int,   (7.165)

where p is its momentum vector, and ε^int is its internal (i.e., non-translational) energy. The latter energy is due to molecular rotation, vibration, etc. Translational degrees of freedom can be treated classically to an excellent approximation, whereas internal degrees of freedom usually require a quantum mechanical approach. Classically, the probability of finding the molecule in a given internal state with a position vector in the range r to r + dr, and a momentum vector in the range p to p + dp, is proportional to the number of cells (of "volume" h0) contained in the corresponding region of phase-space, weighted by the Boltzmann factor. In fact, since classical phase-space is divided up into uniform cells, the number of cells is just proportional to the "volume" of the region under consideration. This "volume" is written d³r d³p. Thus, the probability of finding the molecule in a given internal state s is

Ps(r, p) d³r d³p ∝ exp(−β p²/2m) exp(−β ε^int_s) d³r d³p,   (7.166)

where Ps is a probability density defined in the usual manner. The probability P(r, p) d³r d³p of finding the molecule in any internal state with position and momentum vectors in the specified range is obtained by summing the above expression over all possible internal states. The sum over exp(−β ε^int_s) just contributes a constant of proportionality (since the internal states do not depend on r or p), so

P(r, p) d³r d³p ∝ exp(−β p²/2m) d³r d³p.   (7.167)
Of course, we can multiply this probability by the total number of molecules N in order to obtain the mean number of molecules with position and momentum vectors in the specified range. Suppose that we now want to determine f(r, v) d³r d³v: i.e., the mean number of molecules with positions between r and r + dr, and velocities in the range v and v + dv. Since v = p/m, it is easily seen that

f(r, v) d³r d³v = C exp(−β m v²/2) d³r d³v,   (7.168)

where C is a constant of proportionality. This constant can be determined by the condition

∫_(r) ∫_(v) f(r, v) d³r d³v = N:   (7.169)

i.e., the sum over molecules with all possible positions and velocities gives the total number of molecules, N. The integral over the molecular position coordinates just gives the volume V of the gas, since the Boltzmann factor is independent of position. The integration over the velocity coordinates can be reduced to the product of three identical integrals (one for vx, one for vy, and one for vz), so we have

C V [∫_{−∞}^{∞} exp(−β m vz²/2) dvz]³ = N.   (7.170)

Now,

∫_{−∞}^{∞} exp(−β m vz²/2) dvz = √(2/βm) ∫_{−∞}^{∞} exp(−y²) dy = √(2π/βm),   (7.171)

so C = (N/V)(βm/2π)^{3/2}. Thus, the properly normalized distribution function for molecular velocities is written

f(v) d³r d³v = n (m/2πkT)^{3/2} exp(−m v²/2kT) d³r d³v.   (7.172)

Here, n = N/V is the number density of the molecules. We have omitted the variable r in the argument of f, since f clearly does not depend on position. In other words, the distribution of molecular velocities is uniform in space. This is hardly surprising, since there is nothing to distinguish one region of space from another in our calculation. The above distribution is called the Maxwell velocity distribution, because it was discovered by James Clerk Maxwell in the middle of the nineteenth century.

Let us consider the distribution of a given component of velocity: the z-component, say. Suppose that g(vz) dvz is the average number of molecules per unit volume with the z-component of velocity in the range vz to vz + dvz, irrespective of the values of their other velocity components. It is fairly obvious that this distribution is obtained from the Maxwell distribution by summing (integrating actually) over all possible values of vx and vy:

g(vz) dvz = ∫_(vx) ∫_(vy) f(v) d³v.   (7.173)

This gives

g(vz) dvz = n (m/2πkT)^{3/2} exp(−m vz²/2kT) [∫_{−∞}^{∞} exp(−m vx²/2kT) dvx]² dvz
         = n (m/2πkT)^{3/2} exp(−m vz²/2kT) (2πkT/m) dvz,   (7.174)

or

g(vz) dvz = n (m/2πkT)^{1/2} exp(−m vz²/2kT) dvz.   (7.175)

Of course, this expression is properly normalized, so that

∫_{−∞}^{∞} g(vz) dvz = n.   (7.176)

It is clear that each component (since there is nothing special about the z-component) of the velocity is distributed with a Gaussian probability distribution (see Sect. 2), centred on a mean value

⟨vz⟩ = 0,   (7.177)

with variance

⟨vz²⟩ = kT/m.   (7.178)
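The Gaussian form of g(vz) can be checked by direct sampling. This is a hedged Monte Carlo sketch: sampling each component as a Gaussian with standard deviation √(kT/m) is exactly the distribution (7.175), and the argon-like mass and temperature below are assumed illustrative values.

```python
import math
import random

random.seed(1)
k = 1.381e-23    # Boltzmann constant, J/K
T = 300.0        # temperature, K
m = 6.6e-26      # kg, roughly the mass of an argon atom (assumption)
sigma = math.sqrt(k * T / m)

# Sample the z-component of velocity from the Gaussian of Eq. (7.175).
samples = [random.gauss(0.0, sigma) for _ in range(200000)]
mean = sum(samples) / len(samples)
var = sum((v - mean)**2 for v in samples) / len(samples)

print(mean / sigma)        # close to 0, Eq. (7.177)
print(var / (k * T / m))   # close to 1, Eq. (7.178)
```

The sample mean is consistent with zero and the sample variance with kT/m, as Eqs. (7.177) and (7.178) require.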
Note that Eq. (7.178) can be rearranged to give

(1/2) m ⟨vz²⟩ = (1/2) kT,   (7.179)

in accordance with the equipartition theorem. Equation (7.177) implies that each molecule is just as likely to be moving in the plus z-direction as in the minus z-direction. Equation (7.172) can be rewritten

f(v) d³v/n = [g(vx) dvx/n] [g(vy) dvy/n] [g(vz) dvz/n],   (7.180)

where g(vx) and g(vy) are defined in an analogous way to g(vz). Thus, the probability that the velocity lies in the range v to v + dv is just equal to the product of the probabilities that the velocity components lie in their respective ranges. In other words, the individual velocity components act like statistically independent variables.

Suppose that we now want to calculate F(v) dv: i.e., the average number of molecules per unit volume with a speed v = |v| in the range v to v + dv. It is obvious that we can obtain this quantity by adding up all molecules with speeds in this range, irrespective of the direction of their velocities. Thus,

F(v) dv = ∫ f(v) d³v,   (7.181)

where the integral extends over all velocities satisfying

v < |v| < v + dv.   (7.182)

This inequality is satisfied by a spherical shell of radius v and thickness dv in velocity space. Since f(v) only depends on |v|, so that f(v) ≡ f(v), the above integral is just f(v) multiplied by the volume of the spherical shell in velocity space. Thus,

F(v) dv = 4π f(v) v² dv,   (7.183)
which gives

F(v) dv = 4π n (m/2πkT)^{3/2} v² exp(−m v²/2kT) dv.   (7.184)

This is the famous Maxwell distribution of molecular speeds. Of course, it is properly normalized, so that

∫₀^∞ F(v) dv = n.   (7.185)

Note that the Maxwell distribution exhibits a maximum at some non-zero value of v. The reason for this is quite simple. As v increases, the Boltzmann factor decreases, but the volume of phase-space available to the molecule (which is proportional to v²) increases: the net result is a distribution with a non-zero maximum.

Figure 7: The Maxwell velocity distribution as a function of molecular speed, in units of the most probable speed (vmp). Also shown are the mean speed (v̄) and the root mean square speed (vrms).

The mean molecular speed is given by

v̄ = (1/n) ∫₀^∞ F(v) v dv.   (7.186)

Thus, we obtain

v̄ = 4π (m/2πkT)^{3/2} ∫₀^∞ v³ exp(−m v²/2kT) dv,   (7.187)
or

v̄ = 4π (m/2πkT)^{3/2} (2kT/m)² ∫₀^∞ y³ exp(−y²) dy.   (7.188)

Now,

∫₀^∞ y³ exp(−y²) dy = 1/2,   (7.189)

so

v̄ = √(8kT/πm).   (7.190)

A similar calculation gives

vrms = √⟨v²⟩ = √(3kT/m).   (7.191)

However, this result can also be obtained from the equipartition theorem. Since

(1/2) m ⟨v²⟩ = (1/2) m (⟨vx²⟩ + ⟨vy²⟩ + ⟨vz²⟩) = 3 [(1/2) kT],   (7.192)

then Eq. (7.191) follows immediately. It is easily demonstrated that the most probable molecular speed (i.e., the maximum of the Maxwell distribution function) is

ṽ = √(2kT/m).   (7.193)

The speed of sound in an ideal gas is given by

cs = √(γp/ρ),   (7.194)

where γ is the ratio of specific heats. This can also be written

cs = √(γkT/m),   (7.195)

since p = nkT and ρ = nm. It is clear that the various average speeds which we have just calculated are all of order the sound speed (i.e., a few hundred meters per second at room temperature). In ordinary air (γ = 1.4) the sound speed is about 84% of the most probable molecular speed, and about 74% of the mean molecular speed.
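The closed-form speeds (7.190), (7.191), and (7.193) can be checked against direct numerical integration of the speed distribution. A minimal sketch, in dimensionless units with m = kT = 1 and number density n = 1:

```python
import math

def F(v):
    """Maxwell speed distribution of Eq. (7.184) for n = m = k*T = 1."""
    return 4.0 * math.pi * (1.0 / (2.0 * math.pi))**1.5 * v * v * math.exp(-0.5 * v * v)

dv = 2.0e-4
vs = [i * dv for i in range(1, 100000)]   # grid out to v = 20, where F is negligible
norm = sum(F(v) for v in vs) * dv         # should equal n = 1, Eq. (7.185)
v_mean = sum(v * F(v) for v in vs) * dv
v_rms = math.sqrt(sum(v * v * F(v) for v in vs) * dv)
v_mp = max(vs, key=F)                     # most probable speed on the grid

print(norm)
print(v_mean, math.sqrt(8.0 / math.pi))  # mean speed, Eq. (7.190)
print(v_rms, math.sqrt(3.0))             # rms speed, Eq. (7.191)
print(v_mp, math.sqrt(2.0))              # most probable speed, Eq. (7.193)
```

The numerically integrated values agree with √(8/π), √3, and √2 to the accuracy of the grid, and their ordering ṽ < v̄ < vrms matches Fig. 7.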
Since sound waves ultimately propagate via molecular motion, it makes sense that they travel at slightly less than the most probable and mean molecular speeds. Figure 7 shows the Maxwell velocity distribution as a function of molecular speed in units of the most probable speed. Also shown are the mean speed and the root mean square speed.

It is difficult to directly verify the Maxwell velocity distribution. However, this distribution can be verified indirectly by measuring the velocity distribution of atoms exiting from a small hole in an oven. The velocity distribution of the escaping atoms is closely related to, but slightly different from, the velocity distribution inside the oven, since high velocity atoms escape more readily than low velocity atoms. In fact, the predicted velocity distribution of the escaping atoms varies like v³ exp(−m v²/2kT), in contrast to the v² exp(−m v²/2kT) variation of the velocity distribution inside the oven. Figure 8 compares the measured and theoretically predicted velocity distributions of potassium atoms escaping from an oven at 157°C. Here, the measured transit time is directly proportional to the atomic speed. There is clearly very good agreement between the two.

Figure 8: Comparison of the measured and theoretically predicted velocity distributions of potassium atoms escaping from an oven at 157°C.
8 Quantum statistics

8.1 Introduction

Previously, we investigated the statistical thermodynamics of ideal gases using a rather ad hoc combination of classical and quantum mechanics (see Sects. 7.6 and 7.7). In fact, we employed classical mechanics to deal with the translational degrees of freedom of the constituent particles, and quantum mechanics to deal with the non-translational degrees of freedom. Let us now discuss ideal gases from a purely quantum mechanical standpoint. It turns out that this approach is necessary to deal with either low temperature or high density gases. Furthermore, it also allows us to investigate completely non-classical "gases," such as photons or the conduction electrons in a metal.

8.2 Symmetry requirements in quantum mechanics

Consider a gas consisting of N identical, non-interacting, structureless particles enclosed within a container of volume V. Let Qi denote collectively all the coordinates of the ith particle: i.e., the three Cartesian coordinates which determine its spatial position, as well as the spin coordinate which determines its internal state. Let si be an index labeling the possible quantum states of the ith particle: i.e., each possible value of si corresponds to a specification of the three momentum components of the particle, as well as the direction of its spin orientation. According to quantum mechanics, the overall state of the system when the ith particle is in state si, etc., is completely determined by the complex wavefunction

Ψ_{s1,···,sN}(Q1, Q2, · · · , QN).   (8.1)

In particular, the probability of an observation of the system finding the ith particle with coordinates in the range Qi to Qi + dQi, etc., is simply

|Ψ_{s1,···,sN}(Q1, Q2, · · · , QN)|² dQ1 dQ2 · · · dQN.   (8.2)

One of the fundamental postulates of quantum mechanics is the essential indistinguishability of particles of the same species. What this means, in practice, is
that we cannot label particles of the same species: i.e., a proton is just a proton; we cannot meaningfully talk of proton number 1 and proton number 2, etc. Note that no such constraint arises in classical mechanics. Thus, in classical mechanics particles of the same species are regarded as being distinguishable, and can, therefore, be labelled. Of course, the quantum mechanical approach is the correct one.

Suppose that we interchange the ith and jth particles: i.e.,

Qi ↔ Qj,   (8.3)
si ↔ sj.   (8.4)

If the particles are truly indistinguishable then nothing has changed: i.e., we have a particle in quantum state si and a particle in quantum state sj both before and after the particles are swapped. Thus, the probability of observing the system in a given state also cannot have changed: i.e.,

|Ψ(· · · Qi · · · Qj · · ·)|² = |Ψ(· · · Qj · · · Qi · · ·)|².   (8.5)

Here, we have omitted the subscripts s1, · · · , sN for the sake of clarity. Note that we cannot conclude that the wavefunction Ψ is unaffected when the particles are swapped, because Ψ cannot be observed experimentally. Only the probability density |Ψ|² is observable. Equation (8.5) implies that

Ψ(· · · Qi · · · Qj · · ·) = A Ψ(· · · Qj · · · Qi · · ·),   (8.6)

where A is a complex constant of modulus unity: i.e., |A|² = 1. Suppose that we interchange the ith and jth particles a second time. Swapping the ith and jth particles twice leaves the system completely unchanged: i.e., it is equivalent to doing nothing to the system. Thus, the wavefunctions before and after this process must be identical. It follows from Eq. (8.6) that

A² = 1.   (8.7)

Of course, the only solutions to the above equation are A = ±1. We conclude, from the above discussion, that the wavefunction Ψ is either
completely symmetric under the interchange of particles, or it is completely antisymmetric. In other words, either

Ψ(· · · Qi · · · Qj · · ·) = +Ψ(· · · Qj · · · Qi · · ·),   (8.8)

or

Ψ(· · · Qi · · · Qj · · ·) = −Ψ(· · · Qj · · · Qi · · ·).   (8.9)

In 1940 the Nobel prize winning physicist Wolfgang Pauli demonstrated, via arguments involving relativistic invariance, that the wavefunction associated with a collection of identical integer-spin (i.e., spin 0, 1, 2, etc.) particles satisfies Eq. (8.8), whereas the wavefunction associated with a collection of identical half-integer-spin (i.e., spin 1/2, 3/2, 5/2, etc.) particles satisfies Eq. (8.9). The former type of particles are known as bosons [after the Indian physicist S.N. Bose, who first put forward Eq. (8.8) on empirical grounds]. The latter type of particles are called fermions (after the Italian physicist Enrico Fermi, who first studied the properties of fermion gases). Common examples of bosons are photons and He⁴ atoms. Common examples of fermions are protons, neutrons, and electrons.

Consider a gas made up of identical bosons. Equation (8.8) implies that the interchange of any two particles does not lead to a new state of the system. Bosons must, therefore, be considered as genuinely indistinguishable when enumerating the different possible states of the gas. Note that Eq. (8.8) imposes no restriction on how many particles can occupy a given single-particle quantum state s.

Consider a gas made up of identical fermions. Equation (8.9) implies that the interchange of any two particles does not lead to a new physical state of the system (since |Ψ|² is invariant). Hence, fermions must also be considered genuinely indistinguishable when enumerating the different possible states of the gas. Consider the special case where particles i and j lie in the same quantum state. In this case, the act of swapping the two particles is equivalent to leaving the system unchanged, so

Ψ(· · · Qi · · · Qj · · ·) = Ψ(· · · Qj · · · Qi · · ·).   (8.10)

However, since the two particles are fermions, Eq. (8.9) is also applicable. The only way in which Eqs. (8.9) and (8.10) can be reconciled is if

Ψ = 0   (8.11)
wherever particles i and j lie in the same quantum state. This is another way of saying that it is impossible for any two particles in a gas of fermions to lie in the same single-particle quantum state. This proposition is known as the Pauli exclusion principle, since it was first proposed by W. Pauli in 1924 on empirical grounds.

According to the above discussion, there are three different sets of rules which can be used to enumerate the states of a gas made up of identical particles. Firstly, for a classical gas, the particles must be considered distinguishable when enumerating the different possible states of the gas, and there are no constraints on how many particles can occupy a given quantum state. This set of rules is called Maxwell-Boltzmann statistics, after J.C. Maxwell and L. Boltzmann. Secondly, for a boson gas, the particles must be treated as being indistinguishable, and there is no limit to how many particles can occupy a given quantum state. This set of rules is called Bose-Einstein statistics, after S.N. Bose and A. Einstein, who first developed them. Finally, for a fermion gas, the particles must be treated as being indistinguishable, and there can never be more than one particle in any given quantum state. This set of rules is called Fermi-Dirac statistics, after E. Fermi and P.A.M. Dirac, who first developed them.

8.3 An illustrative example

Consider a very simple gas made up of two identical particles. Suppose that each particle can be in one of three possible quantum states, s = 1, 2, 3. Let us enumerate the possible states of the whole gas according to Maxwell-Boltzmann, Bose-Einstein, and Fermi-Dirac statistics, respectively.

For the case of Maxwell-Boltzmann (MB) statistics, the two particles are considered to be distinguishable. Let us denote them A and B. Furthermore, any number of particles can occupy the same quantum state. The possible different
    1    2    3
   AB    ·    ·
    ·   AB    ·
    ·    ·   AB
    A    B    ·
    B    A    ·
    A    ·    B
    B    ·    A
    ·    A    B
    ·    B    A

Table 6: Two particles distributed amongst three states according to Maxwell-Boltzmann statistics.

For the case of Bose-Einstein (BE) statistics, the two particles are considered to be indistinguishable. Let us denote them both as A. Furthermore, any number of particles can occupy the same quantum state. The possible different states of the gas are shown in Tab. 7. There are clearly 6 distinct states.

    1    2    3
   AA    ·    ·
    ·   AA    ·
    ·    ·   AA
    A    A    ·
    A    ·    A
    ·    A    A

Table 7: Two particles distributed amongst three states according to Bose-Einstein statistics.

Finally, for the case of Fermi-Dirac (FD) statistics, the two particles are again considered to be indistinguishable. Let us denote them both as A. Furthermore, no more than one particle can occupy a given quantum state. The possible different states of the gas are shown in Tab. 8. There are clearly only 3 distinct states.
    1    2    3
    A    A    ·
    A    ·    A
    ·    A    A

Table 8: Two particles distributed amongst three states according to Fermi-Dirac statistics.

It follows, from the above example, that Fermi-Dirac statistics are more restrictive (i.e., there are fewer possible states of the system) than Bose-Einstein statistics, which are, in turn, more restrictive than Maxwell-Boltzmann statistics. Let

ξ ≡ (probability that the two particles are found in the same state)/(probability that the two particles are found in different states).   (8.12)

For the case under investigation,

ξ_MB = 1/2,   (8.13)
ξ_BE = 1,   (8.14)
ξ_FD = 0.   (8.15)

We conclude that in Bose-Einstein statistics there is a greater relative tendency for particles to cluster in the same state than in classical statistics. On the other hand, in Fermi-Dirac statistics there is less tendency for particles to cluster in the same state than in classical statistics.

8.4 Formulation of the statistical problem

Consider a gas consisting of N identical non-interacting particles occupying volume V and in thermal equilibrium at temperature T. Let us label the possible quantum states of a single particle by r (or s). Let the energy of a particle in state r be denoted ε_r, and let the number of particles in state r be written n_r. Finally, let us label the possible quantum states of the whole gas by R. The particles are assumed to be non-interacting, so the total energy of the gas in state R, where there are n_r particles in quantum state r, etc., is simply

E_R = Σ_r n_r ε_r.   (8.16)
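The state counts in Tabs. 6-8, and the values of ξ quoted in Eqs. (8.13)-(8.15), can be checked by brute-force enumeration. The following sketch (Python; not part of the original notes) enumerates the two-particle, three-state gas of Sect. 8.3 under each set of rules:

```python
from itertools import product, combinations_with_replacement, combinations

# Enumerate the states of the two-particle, three-state gas of Sect. 8.3.

states = [1, 2, 3]

# Maxwell-Boltzmann: distinguishable particles (A, B), no occupancy limit.
mb = list(product(states, repeat=2))                 # 9 states (Tab. 6)

# Bose-Einstein: indistinguishable particles, no occupancy limit.
be = list(combinations_with_replacement(states, 2))  # 6 states (Tab. 7)

# Fermi-Dirac: indistinguishable, at most one particle per state.
fd = list(combinations(states, 2))                   # 3 states (Tab. 8)

def xi(gas):
    """Same-state over different-state probability ratio, Eq. (8.12)."""
    same = sum(1 for (a, b) in gas if a == b)
    return same / (len(gas) - same)

print(len(mb), len(be), len(fd))   # 9 6 3
print(xi(mb), xi(be), xi(fd))      # 0.5 1.0 0.0
```

The enumeration reproduces the restrictiveness ordering FD < BE < MB, and the clustering ratios 1/2, 1, and 0.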
Furthermore, since the total number of particles in the gas is known to be N, we must have

N = Σ_r n_r.   (8.17)

In order to calculate the thermodynamic properties of the gas (i.e., its internal energy or its entropy), it is necessary to calculate its partition function,

Z = Σ_R e^{-β E_R} = Σ_R e^{-β(n₁ε₁ + n₂ε₂ + ···)}.   (8.18)

Here, β ≡ 1/k T, and the sum is over all possible states R of the whole gas: i.e., over all the various possible values of the numbers n₁, n₂, ···.

Now, exp[-β(n₁ε₁ + n₂ε₂ + ···)] is the relative probability of finding the gas in a particular state in which there are n₁ particles in state 1, n₂ particles in state 2, etc. Thus, the mean number of particles in quantum state s can be written

n̄_s = Σ_R n_s exp[-β(n₁ε₁ + n₂ε₂ + ···)] / Σ_R exp[-β(n₁ε₁ + n₂ε₂ + ···)].   (8.19)

A comparison of Eqs. (8.18) and (8.19) yields the result

n̄_s = -(1/β) ∂ln Z/∂ε_s.   (8.20)

8.5 Fermi-Dirac statistics

Let us, first of all, consider Fermi-Dirac statistics. According to Eq. (8.19), the average number of particles in quantum state s can be written

n̄_s = [Σ_{n_s} n_s e^{-β n_s ε_s} Σ^{(s)}_{n₁,n₂,···} e^{-β(n₁ε₁ + n₂ε₂ + ···)}] / [Σ_{n_s} e^{-β n_s ε_s} Σ^{(s)}_{n₁,n₂,···} e^{-β(n₁ε₁ + n₂ε₂ + ···)}].   (8.21)

Here, we have rearranged the order of summation, using the multiplicative properties of the exponential function. Note that the first sums in the numerator and
denominator only involve n_s, whereas the last sums omit the particular state s from consideration (this is indicated by the superscript s on the summation symbol). Of course, the sums in the above expression range over all values of the numbers n₁, n₂, ··· such that n_r = 0 and 1 for each r, subject to the overall constraint that

Σ_r n_r = N.   (8.22)

Let us introduce the function

Z_s(N) = Σ^{(s)}_{n₁,n₂,···} e^{-β(n₁ε₁ + n₂ε₂ + ···)},   (8.23)

which is defined as the partition function for N particles distributed over all quantum states, excluding state s, according to Fermi-Dirac statistics. By explicitly performing the sum over n_s = 0 and 1, the expression (8.21) reduces to

n̄_s = [0 + e^{-βε_s} Z_s(N - 1)] / [Z_s(N) + e^{-βε_s} Z_s(N - 1)],   (8.24)

which yields

n̄_s = 1 / ([Z_s(N)/Z_s(N - 1)] e^{βε_s} + 1).   (8.25)

In order to make further progress, we must somehow relate Z_s(N - 1) to Z_s(N). Suppose that ΔN ≪ N. It follows that ln Z_s(N - ΔN) can be Taylor expanded to give

ln Z_s(N - ΔN) ≃ ln Z_s(N) - (∂ln Z_s/∂N) ΔN = ln Z_s(N) - α_s ΔN,   (8.26)

where

α_s ≡ ∂ln Z_s/∂N.   (8.27)

As always, we Taylor expand the slowly varying function ln Z_s(N), rather than the rapidly varying function Z_s(N), because the radius of convergence of the latter Taylor series is too small for the series to be of any practical use. Equation (8.26) can be rearranged to give

Z_s(N - ΔN) = Z_s(N) e^{-α_s ΔN}.   (8.28)
Now, since Z_s(N) is a sum over very many different quantum states, we would not expect the logarithm of this function to be sensitive to which particular state s is excluded from consideration. Let us, therefore, introduce the approximation that α_s is independent of s, so that we can write

α_s ≃ α   (8.29)

for all s. It follows that the derivative (8.27) can be expressed approximately in terms of the derivative of the full partition function Z(N) (in which the N particles are distributed over all quantum states): i.e.,

α ≃ ∂ln Z/∂N.   (8.30)

Making use of Eq. (8.28), with ΔN = 1, plus the approximation (8.29), the expression (8.25) reduces to

n̄_s = 1/(e^{α + βε_s} + 1).   (8.31)

This is called the Fermi-Dirac distribution. The parameter α is determined by the constraint that Σ_r n̄_r = N: i.e.,

Σ_r 1/(e^{α + βε_r} + 1) = N.   (8.32)

Note that n̄_s → 0 if ε_s becomes sufficiently large. On the other hand, since the denominator in Eq. (8.31) can never become less than unity, no matter how small ε_s becomes, it follows that n̄_s ≤ 1. Thus,

0 ≤ n̄_s ≤ 1,   (8.33)

in accordance with the Pauli exclusion principle. Equations (8.20) and (8.31) can be integrated to give

ln Z = α N + Σ_r ln(1 + e^{-α - βε_r}),   (8.34)

where use has been made of Eq. (8.30).
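The way in which the parameter α adjusts itself so as to satisfy the constraint (8.32), whilst keeping every occupation number within the bound (8.33), can be illustrated numerically. In the sketch below (Python; not part of the original notes, and the six energy levels are purely hypothetical), α is found by bisection:

```python
import math

# Determine alpha in the Fermi-Dirac distribution (8.31) from the
# particle-number constraint (8.32). The six energy levels below are
# purely hypothetical, chosen only to illustrate the procedure.

levels = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]   # single-particle energies eps_r
N = 3.0                                    # total number of particles
beta = 2.0                                 # beta = 1/(k T)

def total_occupation(alpha):
    """Left-hand side of the constraint (8.32)."""
    return sum(1.0 / (math.exp(alpha + beta * e) + 1.0) for e in levels)

# total_occupation decreases monotonically with alpha, so bisect.
lo, hi = -50.0, 50.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if total_occupation(mid) > N:
        lo = mid
    else:
        hi = mid
alpha = 0.5 * (lo + hi)

occ = [1.0 / (math.exp(alpha + beta * e) + 1.0) for e in levels]
assert abs(sum(occ) - N) < 1e-6            # constraint (8.32) satisfied
assert all(0.0 <= n <= 1.0 for n in occ)   # bound (8.33): Pauli principle
```

The occupation numbers decrease monotonically with energy, and never exceed unity, however many particles are packed into the system.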
8.6 Photon statistics

Up to now, we have assumed that the number of particles N contained in a given system is a fixed number. This is a reasonable assumption if the particles possess non-zero mass, since we are not generally considering relativistic systems in this course. However, this assumption breaks down for the case of photons, which are zero-mass bosons. In fact, photons enclosed in a container of volume V, maintained at temperature T, can readily be absorbed or emitted by the walls. Thus, for the special case of a gas of photons there is no requirement which limits the total number of particles.

It follows, from the above discussion, that photons obey a simplified form of Bose-Einstein statistics, in which there is an unspecified total number of particles. This type of statistics is called photon statistics.

Consider the expression (8.21). For the case of photons, the numbers n₁, n₂, ··· assume all values n_r = 0, 1, 2, ··· for each r, without any further restriction. It follows that the sums Σ^{(s)} in the numerator and denominator are identical and, therefore, cancel. Hence, Eq. (8.21) reduces to

n̄_s = Σ_{n_s} n_s e^{-β n_s ε_s} / Σ_{n_s} e^{-β n_s ε_s}.   (8.35)

However, the above expression can be rewritten

n̄_s = -(1/β) ∂/∂ε_s [ln Σ_{n_s} e^{-β n_s ε_s}].   (8.36)

Now, the sum on the right-hand side of the above equation is an infinite geometric series, which can easily be evaluated. In fact,

Σ_{n_s=0}^∞ e^{-β n_s ε_s} = 1 + e^{-βε_s} + e^{-2βε_s} + ··· = 1/(1 - e^{-βε_s}).   (8.37)

Thus, Eq. (8.36) gives

n̄_s = (1/β) ∂/∂ε_s [ln(1 - e^{-βε_s})] = e^{-βε_s}/(1 - e^{-βε_s}),   (8.38)
or

n̄_s = 1/(e^{βε_s} - 1).   (8.39)

This is known as the Planck distribution, after the German physicist Max Planck, who first proposed it in 1900 on purely empirical grounds. Equation (8.20) can be integrated to give

ln Z = -Σ_r ln(1 - e^{-βε_r}),   (8.40)

where use has been made of Eq. (8.39).

8.7 Bose-Einstein statistics

Let us now consider Bose-Einstein statistics. The particles in the system are assumed to be massive, so the total number of particles N is a fixed number.

Consider the expression (8.21). For the case of massive bosons, the numbers n₁, n₂, ··· assume all values n_r = 0, 1, 2, ··· for each r, subject to the constraint that Σ_r n_r = N. Performing explicitly the sum over n_s, the expression (8.21) reduces to

n̄_s = [0 + e^{-βε_s} Z_s(N - 1) + 2 e^{-2βε_s} Z_s(N - 2) + ···] / [Z_s(N) + e^{-βε_s} Z_s(N - 1) + e^{-2βε_s} Z_s(N - 2) + ···],   (8.41)

where Z_s(N) is the partition function for N particles distributed over all quantum states, excluding state s, according to Bose-Einstein statistics [cf. Eq. (8.23)]. Using Eq. (8.28), and the approximation (8.29), the above equation reduces to

n̄_s = Σ_{n_s} n_s e^{-n_s(α + βε_s)} / Σ_{n_s} e^{-n_s(α + βε_s)}.   (8.42)

Note that this expression is identical to (8.35), except that βε_s is replaced by α + βε_s. Hence, an analogous calculation to that outlined in the previous subsection yields

n̄_s = 1/(e^{α + βε_s} - 1).   (8.43)
This is called the Bose-Einstein distribution. Note that n̄_s can become very large in this distribution. The parameter α is again determined by the constraint on the total number of particles: i.e.,

Σ_r 1/(e^{α + βε_r} - 1) = N.   (8.44)

Equations (8.20) and (8.43) can be integrated to give

ln Z = α N - Σ_r ln(1 - e^{-α - βε_r}),   (8.45)

where use has been made of Eq. (8.30). Note that photon statistics correspond to the special case of Bose-Einstein statistics in which the parameter α takes the value zero, and the constraint (8.44) does not apply.

8.8 Maxwell-Boltzmann statistics

For the purpose of comparison, it is instructive to consider the purely classical case of Maxwell-Boltzmann statistics. The partition function is written

Z = Σ_R e^{-β(n₁ε₁ + n₂ε₂ + ···)},   (8.46)

where the sum is over all distinct states R of the gas, and the particles are treated as distinguishable. For given values of n₁, n₂, ···, there are

N!/(n₁! n₂! ···)   (8.47)

possible ways in which N distinguishable particles can be put into individual quantum states such that there are n₁ particles in state 1, n₂ particles in state 2, etc. Each of these possible arrangements corresponds to a distinct state for the whole gas. Hence, Eq. (8.46) can be written

Z = Σ_{n₁,n₂,···} [N!/(n₁! n₂! ···)] e^{-β(n₁ε₁ + n₂ε₂ + ···)},   (8.48)
where the sum is over all values of n_r = 0, 1, 2, ··· for each r, subject to the constraint that

Σ_r n_r = N.   (8.49)

Now, Eq. (8.48) can be written

Z = Σ_{n₁,n₂,···} [N!/(n₁! n₂! ···)] (e^{-βε₁})^{n₁} (e^{-βε₂})^{n₂} ···,   (8.50)

which, of course, is just the result of expanding a polynomial. In fact,

Z = (e^{-βε₁} + e^{-βε₂} + ···)^N,   (8.51)

or

ln Z = N ln(Σ_r e^{-βε_r}).   (8.52)

Note that the argument of the logarithm is simply the partition function for a single particle. Equations (8.20) and (8.52) can be combined to give

n̄_s = N e^{-βε_s}/Σ_r e^{-βε_r}.   (8.53)

This is known as the Maxwell-Boltzmann distribution. It is, of course, just the result obtained by applying the Boltzmann distribution to a single particle (see Sect. 7).

8.9 Quantum statistics in the classical limit

The preceding analysis regarding the quantum statistics of ideal gases is summarized in the following statements. The mean number of particles occupying quantum state s is given by

n̄_s = 1/(e^{α + βε_s} ± 1),   (8.54)
where the upper sign corresponds to Fermi-Dirac statistics, and the lower sign corresponds to Bose-Einstein statistics. The parameter α is determined via

Σ_r n̄_r = Σ_r 1/(e^{α + βε_r} ± 1) = N.   (8.55)

Finally, the partition function of the gas is given by

ln Z = α N ± Σ_r ln(1 ± e^{-α - βε_r}).   (8.56)

Let us investigate the magnitude of α in some important limiting cases. Consider, first of all, the case of a gas at a given temperature when its concentration is made sufficiently low: i.e., when N is made sufficiently small. The relation (8.55) can only be satisfied if each term in the sum over states is made sufficiently small: i.e., if n̄_r ≪ 1, or exp(α + βε_r) ≫ 1, for all states r.

Consider, next, the case of a gas made up of a fixed number of particles when its temperature is made sufficiently large: i.e., when β is made sufficiently small. In the sum in Eq. (8.55), the terms of appreciable magnitude are those for which βε_r ≲ α. Thus, it follows that as β → 0 an increasing number of terms with large values of ε_r contribute substantially to this sum. In order to prevent the sum from exceeding N, the parameter α must become large enough that each term is made sufficiently small: i.e., it is again necessary that n̄_r ≪ 1, or exp(α + βε_r) ≫ 1, for all states r.

The above discussion suggests that if the concentration of an ideal gas is made sufficiently low, or the temperature is made sufficiently high, then the number of particles occupying each quantum state must become so small that

n̄_r ≪ 1   (8.57)

for all r. Equivalently, α must become so large that

exp(α + βε_r) ≫ 1   (8.58)

for all r. It is conventional to refer to the limit of sufficiently low concentration, or sufficiently high temperature, in which Eqs. (8.57) and (8.58) are satisfied, as the classical limit.
According to Eqs. (8.54) and (8.58), both the Fermi-Dirac and Bose-Einstein distributions reduce to

n̄_s ≃ e^{-α - βε_s}   (8.59)

in the classical limit, whereas the constraint (8.55) yields

Σ_r e^{-α - βε_r} = N.   (8.60)

The above expressions can be combined to give

n̄_s = N e^{-βε_s}/Σ_r e^{-βε_r}.   (8.61)

It follows that in the classical limit of sufficiently low density, or sufficiently high temperature, the quantum distribution functions, whether Fermi-Dirac or Bose-Einstein, reduce to the Maxwell-Boltzmann distribution. It is easily demonstrated that the physical criterion for the validity of the classical approximation is that the mean separation between particles should be much greater than their mean de Broglie wavelengths.

Let us now consider the behaviour of the partition function (8.56) in the classical limit. We can expand the logarithm to give

ln Z ≃ α N ± Σ_r (± e^{-α - βε_r}) = α N + N.   (8.62)

However, according to Eq. (8.60),

α = -ln N + ln(Σ_r e^{-βε_r}).   (8.63)

It follows that

ln Z = -N ln N + N + N ln(Σ_r e^{-βε_r}).   (8.64)

Note that this does not equal the partition function Z_MB computed in Eq. (8.52) from Maxwell-Boltzmann statistics: i.e.,

ln Z_MB = N ln(Σ_r e^{-βε_r}).   (8.65)
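The reduction of the quantum distributions to the Maxwell-Boltzmann form can also be checked numerically. In the sketch below (Python; not part of the original notes, with a purely illustrative ladder of levels and particle number), α is computed from Eq. (8.63), and the exact Fermi-Dirac and Bose-Einstein occupation numbers are compared with Eq. (8.61):

```python
import math

# Check that the Fermi-Dirac and Bose-Einstein occupation numbers reduce
# to the Maxwell-Boltzmann result at low concentration. The ladder of
# levels and the particle number below are purely illustrative.

levels = [0.1 * r for r in range(100)]     # single-particle energies
beta = 1.0
N = 1.0e-4                                 # very low concentration

Z1 = sum(math.exp(-beta * e) for e in levels)   # single-particle sum
alpha = -math.log(N) + math.log(Z1)             # Eq. (8.63)

for e in levels[:5]:
    n_mb = N * math.exp(-beta * e) / Z1              # Eq. (8.61)
    n_fd = 1.0 / (math.exp(alpha + beta * e) + 1.0)  # Fermi-Dirac
    n_be = 1.0 / (math.exp(alpha + beta * e) - 1.0)  # Bose-Einstein
    assert n_mb < 1e-3                      # occupation numbers are tiny
    assert abs(n_fd - n_mb) / n_mb < 1e-3   # FD -> MB
    assert abs(n_be - n_mb) / n_mb < 1e-3   # BE -> MB
```

Because exp(α + βε_r) ≫ 1 for every level, the ±1 in the denominators is negligible, and the three distributions agree to high accuracy.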
In fact,

ln Z = ln Z_MB - ln N!,   (8.66)

or

Z = Z_MB/N!,   (8.67)

where use has been made of Stirling's approximation (ln N! ≃ N ln N - N), since N is large. Here, the factor N! simply corresponds to the number of different permutations of the N particles: permutations which are physically meaningless when the particles are identical. Recall that we had to introduce precisely this factor, in an ad hoc fashion, in Sect. 7.7 in order to avoid the non-physical consequences of the Gibb's paradox. Clearly, there is no Gibb's paradox when an ideal gas is treated properly via quantum mechanics. In the classical limit, a full quantum mechanical analysis of an ideal gas reproduces the results obtained in Sects. 7.6 and 7.7, except that the arbitrary parameter h₀ is replaced by Planck's constant, h = 6.63 × 10⁻³⁴ J s.

A gas in the classical limit, where the typical de Broglie wavelength of the constituent particles is much smaller than the typical interparticle spacing, is said to be non-degenerate. In the opposite limit, where the concentration and temperature are such that the typical de Broglie wavelength becomes comparable with the typical interparticle spacing, and the actual Fermi-Dirac or Bose-Einstein distributions must be employed, the gas is said to be degenerate.

8.10 The Planck radiation law

Let us now consider the application of statistical thermodynamics to electromagnetic radiation. According to Maxwell's theory, an electromagnetic wave is a coupled self-sustaining oscillation of electric and magnetic fields which propagates through a vacuum at the speed of light, c = 3 × 10⁸ m s⁻¹. The electric component of the wave can be written

E = E₀ exp[i (k·r - ω t)],   (8.68)
where E₀ is a constant, k is the wavevector which determines the wavelength and direction of propagation of the wave, and ω is the frequency. The dispersion relation

ω = k c   (8.69)

ensures that the wave propagates at the speed of light. Note that this dispersion relation is very similar to that of sound waves in solids [see Eq. (7.140)]. Electromagnetic waves always propagate in the direction perpendicular to the coupled electric and magnetic fields (i.e., electromagnetic waves are transverse waves). This means that k·E₀ = 0. Thus, once k is specified, there are only two possible independent directions for the electric field. These correspond to the two independent polarizations of electromagnetic waves.

Consider an enclosure whose walls are maintained at fixed temperature T. What is the nature of the steady-state electromagnetic radiation inside the enclosure? Suppose that the enclosure is a parallelepiped with sides of lengths Lx, Ly, and Lz. Alternatively, suppose that the radiation field inside the enclosure is periodic in the x-, y-, and z-directions, with periodicity lengths Lx, Ly, and Lz, respectively. As long as the smallest of these lengths, L say, is much greater than the longest wavelength of interest in the problem, λ = 2π/k, then these assumptions should not significantly affect the nature of the radiation inside the enclosure. We find, just as in our earlier discussion of sound waves (see Sect. 7.12), that the periodicity constraints ensure that there are only a discrete set of allowed wavevectors (i.e., a discrete set of allowed modes of oscillation of the electromagnetic field inside the enclosure). Let ρ(k) d³k be the number of allowed modes per unit volume with wavevectors in the range k to k + dk. We know, by analogy with Eq. (7.146), that

ρ(k) d³k = d³k/(2π)³.   (8.70)

The number of modes per unit volume for which the magnitude of the wavevector lies in the range k to k + dk is just the density of modes, ρ(k), multiplied by the "volume" in k-space of the spherical shell lying between radii k and k + dk. Thus,

ρ_k(k) dk = [4π k² dk]/(2π)³ = [k²/(2π²)] dk.   (8.71)
Finally, the number of modes per unit volume whose frequencies lie between ω and ω + dω is, by Eq. (8.69),

σ(ω) dω = 2 [ω²/(2π² c³)] dω = [ω²/(π² c³)] dω.   (8.72)

Here, the additional factor 2 is to take account of the two independent polarizations of the electromagnetic field for a given wavevector k.

Let us consider the situation classically. By analogy with sound waves, we can treat each allowable mode of oscillation of the electromagnetic field as an independent harmonic oscillator. According to the equipartition theorem (see Sect. 7.8), each mode possesses a mean energy k T in thermal equilibrium at temperature T. In fact, (1/2) k T resides with the oscillating electric field, and another (1/2) k T with the oscillating magnetic field. Thus, the classical energy density of electromagnetic radiation (i.e., the energy per unit volume associated with modes whose frequencies lie in the range ω to ω + dω) is

u(ω) dω = k T σ(ω) dω = [k T/(π² c³)] ω² dω.   (8.73)

This result is known as the Rayleigh-Jeans radiation law, after Lord Rayleigh and James Jeans who first proposed it in the late nineteenth century.

According to Debye theory (see Sect. 7.12), the energy density of sound waves in a solid is analogous to the Rayleigh-Jeans law, with one very important difference. In Debye theory there is a cut-off frequency (the Debye frequency) above which no modes exist. This cut-off comes about because of the discrete nature of solids (i.e., because solids are made up of atoms instead of being continuous). It is, of course, impossible to have sound waves whose wavelengths are much less than the interatomic spacing. On the other hand, electromagnetic waves propagate through a vacuum, which possesses no discrete structure. It follows that there is no cut-off frequency for electromagnetic waves, and so the Rayleigh-Jeans law holds for all frequencies. This immediately poses a severe problem. The total classical energy density of electromagnetic radiation is given by

U = ∫₀^∞ u(ω) dω = [k T/(π² c³)] ∫₀^∞ ω² dω.   (8.74)
This is an integral which obviously does not converge. Thus, according to classical physics, the total energy density of electromagnetic radiation inside an enclosed cavity is infinite! This is clearly an absurd result, and was recognized as such in the latter half of the nineteenth century. Incidentally, this prediction is known as the ultraviolet catastrophe, because the Rayleigh-Jeans law usually starts to diverge badly from experimental observations (by overestimating the amount of radiation) in the ultraviolet region of the spectrum.

So, how do we obtain a sensible answer? Well, as usual, quantum mechanics comes to our rescue. According to quantum mechanics, each allowable mode of oscillation of the electromagnetic field corresponds to a photon state with energy and momentum

ε = ℏ ω,   (8.75)
p = ℏ k,   (8.76)

respectively. Incidentally, it follows from Eq. (8.69) that

ε = p c,   (8.77)

which implies that photons are massless particles which move at the speed of light. According to the Planck distribution (8.39), the mean number of photons occupying a photon state of frequency ω is

n(ω) = 1/(e^{β ℏ ω} - 1).   (8.78)

Hence, the mean energy of such a state is given by

ε̄(ω) = ℏ ω n(ω) = ℏ ω/(e^{β ℏ ω} - 1).   (8.79)

Note that low frequency states (i.e., ℏ ω ≪ k T) behave classically: i.e.,

ε̄ ≃ k T.   (8.80)

On the other hand, high frequency states (i.e., ℏ ω ≫ k T) are completely "frozen out": i.e.,

ε̄ ≪ k T.   (8.81)
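The limits (8.80) and (8.81), together with the location of the peak of the Planck spectrum (ℏ ω ≃ 2.82 k T, i.e., "about 3 k T"), follow directly from Eq. (8.79). A quick numerical check (Python; not part of the original notes), working in units in which k T = 1:

```python
import math

# Limiting behaviour of the mean photon-state energy (8.79), working in
# units in which k T = 1, so that x = beta*hbar*omega.

def mean_energy(x):
    """Mean energy, in units of kT, of a mode with hbar*omega = x kT."""
    return x / math.expm1(x)   # expm1 avoids round-off for small x

assert abs(mean_energy(1e-4) - 1.0) < 1e-3   # hbar*omega << kT: ~ kT (8.80)
assert mean_energy(50.0) < 1e-19             # hbar*omega >> kT: frozen out (8.81)

# The Planck spectrum is proportional to x^3/(e^x - 1); its peak lies
# near x ~ 2.82, i.e., hbar*omega ~ 3 kT.
xs = [0.001 * i for i in range(1, 20000)]
peak = max(xs, key=lambda x: x**3 / math.expm1(x))
assert abs(peak - 2.821) < 0.01
```

The exponential suppression at large x is what tames the divergent classical integral (8.74).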
The reason for this is simply that it is very difficult for a thermal fluctuation to create a photon with an energy greatly in excess of k T, since k T is the characteristic energy associated with such fluctuations.

According to the above discussion, the true energy density of electromagnetic radiation inside an enclosed cavity is written

u(ω) dω = ε̄(ω) σ(ω) dω,   (8.82)

giving

u(ω) dω = [ℏ/(π² c³)] ω³ dω/[exp(β ℏ ω) - 1].   (8.83)

This famous result is known as the Planck radiation law. The Planck law approximates to the classical Rayleigh-Jeans law for ℏ ω ≪ k T, peaks at about ℏ ω ≃ 3 k T, and falls off exponentially for ℏ ω ≫ k T. The exponential fall off at high frequencies ensures that the total energy density remains finite.

8.11 Black-body radiation

Suppose that we were to make a small hole in the wall of our enclosure, and observe the emitted radiation. A small hole is the best approximation in Physics to a black-body, which is defined as an object which absorbs, and, therefore, emits, radiation perfectly at all wavelengths. What is the power radiated by the hole? Well, the power density inside the enclosure can be written

u(ω) dω = ℏ ω n(ω) dω,   (8.84)

where n(ω) is the mean number of photons per unit volume whose frequencies lie in the range ω to ω + dω. The radiation field inside the enclosure is isotropic (we are assuming that the hole is sufficiently small that it does not distort the field). It follows that the mean number of photons per unit volume whose frequencies lie in the specified range, and whose directions of propagation make an angle in the range θ to θ + dθ with the normal to the hole, is

n(ω, θ) dω dθ = (1/2) n(ω) dω sin θ dθ,   (8.85)
where sin θ is proportional to the solid angle in the specified range of directions, and

∫₀^π n(ω, θ) dω dθ = n(ω) dω.   (8.86)

Photons travel at the velocity of light, so the power per unit area escaping from the hole in the frequency range ω to ω + dω is

P(ω) dω = ∫₀^{π/2} c cos θ ℏ ω n(ω, θ) dω dθ,   (8.87)

where c cos θ is the component of the photon velocity in the direction of the hole. This gives

P(ω) dω = c u(ω) dω (1/2) ∫₀^{π/2} cos θ sin θ dθ = (c/4) u(ω) dω,   (8.88)

so

P(ω) dω = [ℏ/(4π² c²)] ω³ dω/[exp(β ℏ ω) - 1]   (8.89)

is the power per unit area radiated by a black-body in the frequency range ω to ω + dω.

A black-body is very much an idealization. The power spectra of real radiating bodies can deviate quite substantially from black-body spectra. Nevertheless, we can make some useful predictions using this model. The black-body power spectrum peaks when ℏ ω ≃ 3 k T. This means that the peak radiation frequency scales linearly with the temperature of the body. In other words, hot bodies tend to radiate at higher frequencies than cold bodies. This result (in particular, the linear scaling) is known as Wien's displacement law. It allows us to estimate the surface temperatures of stars from their colours (surprisingly enough, stars are fairly good black-bodies). Table 9 shows some stellar temperatures determined by this method (in fact, the whole emission spectrum is fitted to a black-body spectrum). It can be seen that the apparent colours (which correspond quite well to the colours of the peak radiation) scan the whole visible spectrum, from red to blue, as the stellar surface temperatures gradually rise.

Name       Constellation  Spectral Type  Surf. Temp. (K)  Colour
Antares    Scorpio        M              3300             Very Red
Aldebaran  Taurus         K              3800             Reddish
Sun        -              G              5770             Yellow
Procyon    Canis Minor    F              6570             Yellowish
Sirius     Canis Major    A              9250             White
Rigel      Orion          B              11,200           Bluish White

Table 9: Physical properties of some well-known stars.

Probably the most famous black-body spectrum is cosmological in origin. Just after the "big bang" the Universe was essentially a "fireball," with the energy associated with radiation completely dominating that associated with matter. The early Universe was also pretty well described by equilibrium statistical thermodynamics, which means that the radiation had a black-body spectrum. As the Universe expanded, the radiation was gradually Doppler shifted to ever larger wavelengths (in other words, the radiation did work against the expansion of the Universe, and, thereby, lost energy), but its spectrum remained invariant. Nowadays, this primordial radiation is detectable as a faint microwave background which pervades the whole universe. The microwave background was discovered accidentally by Penzias and Wilson in 1965. Until recently, it was difficult to measure the full spectrum with any degree of precision, because of strong microwave absorption and scattering by the Earth's atmosphere. However, all of this changed when the COBE satellite was launched in 1989. It took precisely nine minutes to measure the perfect black-body spectrum reproduced in Fig. 9. This data can be fitted to a black-body curve of characteristic temperature 2.735 K. In a very real sense, this can be regarded as the "temperature of the Universe."

8.12 The Stefan-Boltzmann law

The total power radiated per unit area by a black-body at all frequencies is given by

P_tot(T) = ∫₀^∞ P(ω) dω = [ℏ/(4π² c²)] ∫₀^∞ ω³ dω/[exp(ℏ ω/k T) - 1],   (8.90)

or

P_tot(T) = [k⁴ T⁴/(4π² c² ℏ³)] ∫₀^∞ η³ dη/(e^η - 1),   (8.91)
Figure 9: Cosmic background radiation spectrum measured by the Far Infrared Absolute Spectrometer (FIRAS) aboard the Cosmic Background Explorer satellite (COBE).

where η ≡ ℏ ω/k T. The above integral can easily be looked up in standard mathematical tables. In fact,

∫₀^∞ η³ dη/(e^η - 1) = π⁴/15.   (8.92)

Thus, the total power radiated per unit area by a black-body is

P_tot(T) = [π² k⁴/(60 c² ℏ³)] T⁴ = σ T⁴.   (8.93)

This T⁴ dependence of the radiated power is called the Stefan-Boltzmann law, after Josef Stefan, who first obtained it experimentally, and Ludwig Boltzmann, who first derived it theoretically. The parameter

σ = π² k⁴/(60 c² ℏ³) = 5.67 × 10⁻⁸ W m⁻² K⁻⁴   (8.94)

is called the Stefan-Boltzmann constant.

We can use the Stefan-Boltzmann law to estimate the temperature of the Earth from first principles. The Sun is a ball of glowing gas of radius R_☉ ≃ 7 × 10⁵ km
and surface temperature T_☉ ≃ 5770 K. Its luminosity is

L_☉ = 4π R_☉² σ T_☉⁴,   (8.95)

according to the Stefan-Boltzmann law. The Earth is a globe of radius R_⊕ ∼ 6000 km located an average distance r_⊕ ≃ 1.5 × 10⁸ km from the Sun. The Earth intercepts an amount of energy

P_⊕ = L_☉ (π R_⊕²/r_⊕²)/4π   (8.96)

per second from the Sun's radiative output: i.e., the power output of the Sun reduced by the ratio of the solid angle subtended by the Earth at the Sun to the total solid angle 4π. The Earth absorbs this energy, and then re-radiates it at longer wavelengths. The luminosity of the Earth is

L_⊕ = 4π R_⊕² σ T_⊕⁴,   (8.97)

according to the Stefan-Boltzmann law, where T_⊕ is the average temperature of the Earth's surface. Here, we are ignoring any surface temperature variations between polar and equatorial regions, or between day and night. In steady-state, the luminosity of the Earth must balance the radiative power input from the Sun, so equating L_⊕ and P_⊕ we arrive at

T_⊕ = (R_☉/2 r_⊕)^{1/2} T_☉.   (8.98)

Remarkably, the ratio of the Earth's surface temperature to that of the Sun depends only on the Earth-Sun distance and the solar radius. The above expression yields T_⊕ ∼ 279 K, or 6° C (or 43° F). This is slightly on the cold side, by a few degrees, because of the greenhouse action of the Earth's atmosphere, which was neglected in our calculation. Nevertheless, it is quite encouraging that such a crude calculation comes so close to the correct answer.

8.13 Conduction electrons in a metal

The conduction electrons in a metal are non-localized (i.e., they are not tied to any particular atoms). In conventional metals, each atom contributes a single
such electron. To a first approximation, it is possible to neglect the mutual interaction of the conduction electrons, since this interaction is largely shielded out by the stationary atoms. The conduction electrons can, therefore, be treated as an ideal gas. However, the concentration of such electrons in a metal far exceeds the concentration of particles in a conventional gas. It is, therefore, not surprising that conduction electrons cannot normally be analyzed using classical statistics: in fact, they are subject to Fermi-Dirac statistics (since electrons are fermions).

Recall, from Sect. 8.5, that the mean number of particles occupying state s (energy ε_s) is given by

n̄_s = 1/(e^{β(ε_s - µ)} + 1),   (8.99)

according to the Fermi-Dirac distribution. Here,

µ ≡ -k T α   (8.100)

is termed the Fermi energy of the system. This energy is determined by the condition that

Σ_r n̄_r = Σ_r 1/(e^{β(ε_r - µ)} + 1) = N,   (8.101)

where N is the total number of particles contained in the volume V. It is clear, from the above equation, that the Fermi energy µ is generally a function of the temperature T.

Let us investigate the behaviour of the Fermi function

F(ε) = 1/(e^{β(ε - µ)} + 1)   (8.102)

as ε varies. Here, the energy is measured from its lowest possible value ε = 0. If the Fermi energy µ is such that β µ ≪ -1 then β(ε - µ) ≫ 1, and F reduces to the Maxwell-Boltzmann distribution. However, for the case of conduction electrons in a metal we are interested in the opposite limit, where

β µ ≡ µ/(k T) ≫ 1.   (8.103)

In this limit, if 0 < ε ≪ µ then β(ε - µ) ≪ -1, so that F(ε) ≃ 1. On the other hand, if ε ≫ µ then β(ε - µ) ≫ 1, so that F(ε) ≃ exp[-β(ε - µ)] falls off exponentially
with increasing ε, just like a classical Boltzmann distribution. Note that F = 1/2 when ε = µ. The transition region in which F goes from a value close to unity to a value close to zero corresponds to an energy interval of order k T, centred on ε = µ. This is illustrated in Fig. 10.

[Figure 10: The Fermi function, for T = 0 and T > 0.]

In the limit as T → 0, the transition region becomes infinitesimally narrow. In this case, F = 1 for ε < µ and F = 0 for ε > µ, as illustrated in Fig. 10. This is an obvious result, since when T = 0 the conduction electrons attain their lowest energy, or ground-state, configuration. Since the Pauli exclusion principle requires that there be no more than one electron per single-particle quantum state, the lowest energy configuration is obtained by piling electrons into the lowest available unoccupied states until all of the electrons are used up. Thus, the last electron added to the pile has quite a considerable energy, ε = µ, since all of the lower energy states are already occupied. Clearly, the exclusion principle implies that a Fermi-Dirac gas possesses a large mean energy, even at absolute zero.

Let us calculate the Fermi energy µ = µ_0 of a Fermi-Dirac gas at T = 0. The
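The limiting behaviour of the Fermi function discussed above is easy to verify numerically. A minimal sketch, with an arbitrary energy scale chosen so that µ ≫ kT:

```python
import math

def fermi(eps, mu, kT):
    """Fermi function F(eps) = 1 / (exp[(eps - mu)/kT] + 1), Eq. (8.102)."""
    return 1.0 / (math.exp((eps - mu) / kT) + 1.0)

mu, kT = 1.0, 0.01                 # degenerate limit: mu >> kT
print(fermi(mu, mu, kT))           # exactly 1/2 at eps = mu
print(fermi(mu - 5 * kT, mu, kT))  # ~0.993: states a few kT below mu nearly filled
print(fermi(mu + 5 * kT, mu, kT))  # ~0.007: states a few kT above mu nearly empty
```

The transition from F ≃ 1 to F ≃ 0 is indeed confined to an interval of a few kT about ε = µ.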
energy of each particle is related to its momentum p = ℏ k via

  ε = p²/(2m) = ℏ² k²/(2m),    (8.104)

where k is the de Broglie wavevector. At T = 0, all quantum states whose energy is less than the Fermi energy µ_0 are filled. The Fermi energy corresponds to a Fermi momentum p_F = ℏ k_F which is such that

  µ_0 = p_F²/(2m) = ℏ² k_F²/(2m).    (8.105)

Thus, at T = 0 all quantum states with k < k_F are filled, and all those with k > k_F are empty. Now, we know, by analogy with Eq. (7.146), that there are (2π)⁻³ V allowable translational states per unit volume of k-space. The volume of the sphere of radius k_F in k-space is (4/3) π k_F³. It follows that the Fermi sphere of radius k_F contains (4/3) π k_F³ (2π)⁻³ V translational states. The number of quantum states inside the sphere is twice this, because electrons possess two possible spin states for every possible translational state. Since the total number of occupied states (i.e., the total number of quantum states inside the Fermi sphere) must equal the total number of particles in the gas, it follows that

  2 [V/(2π)³] (4/3) π k_F³ = N.    (8.106)

The above expression can be rearranged to give

  k_F = (3π²)^{1/3} (N/V)^{1/3}.    (8.107)

Hence,

  λ_F ≡ 2π/k_F = [2π/(3π²)^{1/3}] (V/N)^{1/3},    (8.108)

which implies that the de Broglie wavelength λ_F corresponding to the Fermi energy is of order the mean separation between particles, (V/N)^{1/3}. All quantum states with de Broglie wavelengths λ ≡ 2π/k > λ_F are occupied at T = 0, whereas all those with λ < λ_F are empty.
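Combining Eq. (8.107) with µ_0 = ℏ²k_F²/2m from Eq. (8.105) gives a feel for the numbers involved. The electron density used below is a rough literature value for copper, not a figure from the text:

```python
import math

hbar = 1.055e-34   # J s
m_e = 9.11e-31     # electron mass (kg)
k_B = 1.381e-23    # Boltzmann constant (J/K)
n = 8.5e28         # assumed conduction-electron density of copper (m^-3)

k_F = (3.0 * math.pi ** 2 * n) ** (1.0 / 3.0)   # Eq. (8.107)
mu_0 = (hbar * k_F) ** 2 / (2.0 * m_e)          # Eq. (8.105) at T = 0
mu_0_eV = mu_0 / 1.602e-19

print(f"mu_0 ~ {mu_0_eV:.1f} eV")                          # ~7 eV
print(f"mu_0 / kT at 300 K ~ {mu_0 / (k_B * 300.0):.0f}")  # a few hundred
```

The Fermi energy comes out at several electron-volts, hundreds of times larger than kT at room temperature, which anticipates the claim µ_0 ≫ kT made below.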
According to Eq. (8.105), the Fermi energy at T = 0 takes the form

  µ_0 = (ℏ²/2m) (3π² N/V)^{2/3}.    (8.109)

It is easily demonstrated that µ_0 ≫ k T for conventional metals at room temperature.

The heat capacity C_V at constant volume of the conduction electrons can be calculated from a knowledge of their mean energy Ē(T) as a function of T: i.e.,

  C_V = (∂Ē/∂T)_V.    (8.110)

If the electrons obeyed classical Maxwell-Boltzmann statistics, so that F ∝ exp(−β ε) for all electrons, then the equipartition theorem would give

  Ē = (3/2) N k T,    (8.111)
  C_V = (3/2) N k.    (8.112)

However, the actual situation, in which F has the form shown in Fig. 10, is very different. The majority of the conduction electrons in a metal occupy a band of completely filled states with energies far below the Fermi energy. In many cases, such electrons have very little effect on the macroscopic properties of the metal. Consider, for example, the contribution of the conduction electrons to the specific heat of the metal. A small change in T does not affect the mean energies of the majority of the electrons, since these electrons lie in states which are completely filled, and remain so when the temperature is changed. It follows that these electrons contribute nothing whatsoever to the heat capacity. On the other hand, the relatively small number of electrons N_eff in the energy range of order k T, centred on the Fermi energy, in which F is significantly different from 0 and 1, do contribute to the specific heat. In the tail end of this region F ∝ exp(−β ε), so the distribution reverts to a Maxwell-Boltzmann distribution. Hence, from Eq. (8.112), we expect each electron in this region to contribute roughly an amount (3/2) k to the heat capacity. Hence, the heat capacity can be
written

  C_V ≃ (3/2) N_eff k.    (8.113)

However, since only a fraction k T/µ of the total conduction electrons lie in the tail region of the Fermi-Dirac distribution, we expect

  N_eff ≃ (k T/µ) N.    (8.114)

It follows that

  C_V ≃ (3/2) N k (k T/µ).    (8.115)

Since k T ≪ µ in conventional metals, the molar specific heat of the conduction electrons is clearly very much less than the classical value (3/2) R. This accounts for the fact that the molar specific heats of metals at room temperature are about the same as those of insulators. Before the advent of quantum mechanics, the classical theory predicted, incorrectly, that the presence of conduction electrons should raise the heat capacities of metals by 50 percent [i.e., by (3/2) R] compared to those of insulators.

Note that the specific heat (8.115) is not temperature independent. In fact, using the superscript e to denote the electronic specific heat, the molar specific heat can be written

  c_V^(e) = γ T,    (8.116)

where γ is a (positive) constant of proportionality. At room temperature c_V^(e) is completely masked by the much larger specific heat c_V^(L) due to lattice vibrations. However, at very low temperatures c_V^(L) = A T³, where A is a (positive) constant of proportionality (see Sect. 7.12). Clearly, at low temperatures c_V^(L) = A T³ approaches zero far more rapidly than the electronic specific heat, as T is reduced. Hence, it should be possible to measure the electronic contribution to the molar specific heat at low temperatures. The total molar specific heat of a metal at low temperatures takes the form

  c_V = c_V^(e) + c_V^(L) = γ T + A T³.    (8.117)
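To see how severe the suppression factor kT/µ in Eq. (8.115) is, take a Fermi energy of 7 eV (an assumed value, typical of a metal) at room temperature:

```python
k_B = 1.381e-23      # Boltzmann constant (J/K)
eV = 1.602e-19       # J
mu = 7.0 * eV        # assumed Fermi energy (~7 eV, typical metallic value)
T = 300.0            # room temperature (K)

fraction = k_B * T / mu   # fraction of electrons in the thermal tail, Eq. (8.114)
print(f"kT/mu = {fraction:.4f}")   # ~0.004: well under 1 percent
```

Only a fraction of a percent of the conduction electrons contribute, which is why the electronic specific heat is invisible next to the lattice contribution at room temperature.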
Hence,

  c_V/T = γ + A T².    (8.118)

It follows that a plot of c_V/T versus T² should yield a straight line whose intercept on the vertical axis gives the coefficient γ. Figure 11 shows such a plot. The fact that a good straight line is obtained verifies that the temperature dependence of the heat capacity predicted by Eq. (8.117) is indeed correct.

[Figure 11: The low temperature heat capacity of potassium, plotted as C_V/T versus T². From C. Kittel, and H. Kroemer, Thermal Physics (W.H. Freeman & Co., New York NY, 1980).]

8.14 White-dwarf stars

A main-sequence hydrogen-burning star, such as the Sun, is maintained in equilibrium via the balance of the gravitational attraction tending to make it collapse, and the thermal pressure tending to make it expand. Of course, the thermal energy of the star is generated by nuclear reactions occurring deep inside its core. Eventually, however, the star will run out of burnable fuel, and, therefore, start to collapse, as it radiates away its remaining thermal energy. What is the ultimate fate of such a star?

A burnt-out star is basically a gas of electrons and ions. As the star collapses, its density increases, so the mean separation between its constituent particles decreases. Eventually, the mean separation becomes of order the de Broglie wavelength of the electrons, and the electron gas becomes degenerate. Note, however,
8. Thus. a degenerate electron gas exerts a substantial pressure. so the ion gas remains nondegenerate.14 Whitedwarf stars 8 QUANTUM STATISTICS that the de Broglie wavelength of the ions is much smaller than that of the electrons. h Here. the gravitational potential energy takes the form 3 G M2 . The total energy of a whitedwarf star can be written E = K + U. and R is the stellar radius.120) where G is the gravitational constant. which is equivalent to taking the limit T → 0.123) 1/3 . M is the stellar mass. that the density of the star is uniform. even at zero temperature. from the previous section. In this case. Let us assume. we know.121) 196 . (8. because the Pauli exclusion principle prevents the mean electron separation from becoming signiﬁcantly smaller than the typical de Broglie wavelength (see the previous section). V= 4π 3 R 3 (8. for the sake of simplicity. it is possible for a burntout star to maintain itself against complete collapse under gravity via the degeneracy pressure of its constituent electrons. Let us assume that the electron gas is highly degenerate. Now. (8. Such stars are termed whitedwarfs. Let us investigate the physics of whitedwarfs in more detail. In this case. that the Fermi momentum can be written N pF = Λ V where Λ = (3π2 )1/3 ¯ .119) where K is the total kinetic energy of the degenerate electrons (the kinetic energy of the ion is negligible) and U is the gravitational potential energy.122) (8. U=− 5 R (8.
is the stellar volume, and N is the total number of electrons contained in the star. Furthermore, the number of electron states contained in an annular radius of p-space lying between radii p and p + dp is

  dN = (3V/Λ³) p² dp.    (8.124)

Hence, the total kinetic energy of the electron gas can be written

  K = (3V/Λ³) ∫₀^{p_F} (p²/2m) p² dp = (3/5) (V/Λ³) p_F⁵/(2m),    (8.125)

where m is the electron mass. It follows that

  K = (3/5) N (Λ²/2m) (N/V)^{2/3}.    (8.126)

The interior of a white-dwarf star is composed of atoms like C¹² and O¹⁶ which contain equal numbers of protons, neutrons, and electrons. Thus,

  M = 2 N m_p,    (8.127)

where m_p is the proton mass.

Equations (8.119), (8.120), (8.123), (8.126), and (8.127) can be combined to give

  E = A/R² − B/R,    (8.128)

where

  A = (3/20) (9π/8)^{2/3} (ℏ²/m) (M/m_p)^{5/3},    (8.129)
  B = (3/5) G M².    (8.130)

The equilibrium radius of the star, R∗, is that which minimizes the total energy E. In fact, it is easily demonstrated that

  R∗ = 2A/B,    (8.131)
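That R∗ = 2A/B minimizes E(R) = A/R² − B/R follows from setting dE/dR = −2A/R³ + B/R² = 0. A quick numerical check with arbitrary positive constants:

```python
# E(R) = A/R^2 - B/R (Eq. 8.128) with arbitrary positive A, B.
A, B = 3.0, 2.0
R_star = 2.0 * A / B        # Eq. (8.131)

E = lambda R: A / R ** 2 - B / R
eps = 1e-6
# E is larger on either side of R_star, so R_star is indeed a minimum.
assert E(R_star) < E(R_star - eps)
assert E(R_star) < E(R_star + eps)
print(R_star, E(R_star))    # R_star = 3.0, E(R_star) = -1/3
```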
which yields

  R∗ = [(9π)^{2/3}/8] ℏ²/(m G m_p^{5/3} M^{1/3}).    (8.132)

The above formula can also be written

  R∗/R⊙ = 0.010 (M⊙/M)^{1/3},    (8.133)

where R⊙ = 7 × 10⁵ km is the solar radius, and M⊙ = 2 × 10³⁰ kg is the solar mass. It follows that the radius of a typical solar mass white-dwarf is about 7000 km: i.e., about the same as the radius of the Earth. The first white-dwarf to be discovered (in 1862) was the companion of Sirius. Nowadays, thousands of white-dwarfs have been observed, all with properties similar to those described above.

8.15 The Chandrasekhar limit

One curious feature of white-dwarf stars is that their radius decreases as their mass increases [see Eq. (8.133)]. It follows, from Eq. (8.126), that the mean energy of the degenerate electrons inside the star increases strongly as the stellar mass increases: in fact, the mean electron energy scales as M^{4/3}. Hence, if M becomes sufficiently large the electrons become relativistic, and the above analysis needs to be modified. Strictly speaking, the non-relativistic analysis described in the previous section is only valid in the low mass limit M ≪ M⊙. Let us, for the sake of simplicity, consider the ultra-relativistic limit in which p ≫ m c.

The total electron energy (including the rest mass energy) can be written

  K = (3V/Λ³) ∫₀^{p_F} (p² c² + m² c⁴)^{1/2} p² dp,    (8.134)

by analogy with Eq. (8.125). In the ultra-relativistic limit, this reduces to

  K ≃ (3V c/Λ³) ∫₀^{p_F} [p³ + (m² c²/2) p + ···] dp,    (8.135)
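Equation (8.132) can be evaluated directly for a solar-mass star. A sketch using standard SI values for the constants (which are assumptions, not quoted in the text):

```python
import math

hbar = 1.055e-34    # J s
m_e = 9.11e-31      # electron mass (kg)
m_p = 1.673e-27     # proton mass (kg)
G = 6.674e-11       # gravitational constant (m^3 kg^-1 s^-2)
M_sun = 2e30        # solar mass (kg)

M = M_sun
R_star = (9.0 * math.pi) ** (2.0 / 3.0) / 8.0 \
         * hbar ** 2 / (m_e * G * m_p ** (5.0 / 3.0) * M ** (1.0 / 3.0))
print(f"R* ~ {R_star / 1e3:.0f} km")   # ~7000 km, about the Earth's radius
```

This reproduces the 0.010 R⊙ coefficient of Eq. (8.133) to within rounding.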
giving

  K ≃ (3V c/4Λ³) [p_F⁴ + m² c² p_F² + ···].    (8.136)

It follows, from the above, that the total energy of an ultra-relativistic white-dwarf star can be written in the form

  E ≃ (A − B)/R + C R,    (8.137)

where

  A = (3/8) (9π/8)^{1/3} ℏ c (M/m_p)^{4/3},    (8.138)
  B = (3/5) G M²,    (8.139)
  C = [3/(4 (9π)^{1/3})] (m² c³/ℏ) (M/m_p)^{2/3}.    (8.140)

As before, the equilibrium radius R∗ is that which minimizes the total energy E. However, in the ultra-relativistic case, a non-zero value of R∗ only exists for A − B > 0. When A − B < 0, the energy decreases monotonically with decreasing stellar radius: in other words, the degeneracy pressure of the electrons is incapable of halting the collapse of the star under gravity. The criterion which must be satisfied for a relativistic white-dwarf star to be maintained against gravity is that

  A/B > 1.    (8.141)

This criterion can be rewritten

  M < M_C,    (8.142)

where

  M_C = (15/64) (5π)^{1/2} (ℏ c/G)^{3/2}/m_p² = 1.72 M⊙    (8.143)

is known as the Chandrasekhar limit, after S. Chandrasekhar who first derived it in 1931. A more realistic calculation, which does not assume constant density, yields

  M_C ≃ 1.4 M⊙.    (8.144)
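Evaluating Eq. (8.143) with standard SI constants (assumed values, not quoted in the text) reproduces the quoted figure:

```python
import math

hbar = 1.055e-34    # J s
c = 2.998e8         # speed of light (m/s)
G = 6.674e-11       # gravitational constant (m^3 kg^-1 s^-2)
m_p = 1.673e-27     # proton mass (kg)
M_sun = 2e30        # solar mass (kg)

M_C = (15.0 / 64.0) * math.sqrt(5.0 * math.pi) * (hbar * c / G) ** 1.5 / m_p ** 2
print(f"M_C = {M_C / M_sun:.2f} M_sun")   # ~1.7 M_sun
```

Note that M_C depends only on fundamental constants: roughly (ℏc/G)^{3/2}/m_p², a combination of quantum mechanics, relativity, and gravity.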
Thus, if the stellar mass exceeds the Chandrasekhar limit then the star in question cannot become a white-dwarf when its nuclear fuel is exhausted, but, instead, must continue to collapse. What is the ultimate fate of such a star?

8.16 Neutron stars

At stellar densities which greatly exceed white-dwarf densities, the extreme pressures cause electrons to combine with protons to form neutrons. Thus, any star which collapses to such an extent that its radius becomes significantly less than that characteristic of a white-dwarf is effectively transformed into a gas of neutrons. Eventually, the mean separation between the neutrons becomes comparable with their de Broglie wavelength. At this point, it is possible for the degeneracy pressure of the neutrons to halt the collapse of the star. A star which is maintained against gravity in this manner is called a neutron star.

Neutron stars can be analyzed in a very similar manner to white-dwarf stars. In fact, the previous analysis can be simply modified by letting m_p → m_p/2 and m → m_p. Thus, we conclude that non-relativistic neutron stars satisfy the mass-radius law:

  (R∗/R⊙) (M/M⊙)^{1/3} = 0.000011.    (8.145)

It follows that the radius of a typical solar mass neutron star is a mere 10 km. In 1967 Antony Hewish and Jocelyn Bell discovered a class of compact radio sources, called pulsars, which emit extremely regular pulses of radio waves. Pulsars have subsequently been identified as rotating neutron stars. To date, many hundreds of these objects have been observed.

When relativistic effects are taken into account, it is found that there is a critical mass above which a neutron star cannot be maintained against gravity. According to our analysis, this critical mass, which is known as the Oppenheimer-Volkoff limit, is given by

  M_OV = 4 M_C = 6.9 M⊙.    (8.146)

A more realistic calculation, which does not treat the neutrons as point particles, does not assume constant density, and takes general relativity into account, gives a somewhat lower value of

  M_OV = 1.5–2.5 M⊙.    (8.147)

A star whose mass exceeds the Oppenheimer-Volkoff limit cannot be maintained against gravity by degeneracy pressure, and must ultimately collapse to form a black-hole.
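The non-relativistic neutron-star mass-radius law (8.145) gives the quoted radius directly. A sketch using the solar radius from Sect. 8.15:

```python
# Eq. (8.145): (R*/R_sun) (M/M_sun)^(1/3) = 0.000011 for a neutron star.
R_sun = 7e5                      # solar radius (km)

def neutron_star_radius_km(M_over_Msun):
    return 0.000011 * R_sun / M_over_Msun ** (1.0 / 3.0)

R_ns = neutron_star_radius_km(1.0)   # solar-mass neutron star
print(f"R* ~ {R_ns:.1f} km")         # of order 10 km
```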