
Macrostates and microstates

In describing a system made up of a great many particles, it is usually possible to
specify some macroscopically measurable independent parameters x1, x2, ..., xn
which affect the particles' equations of motion. These parameters are termed the
external parameters of the system. Examples of such parameters are the volume (this
gets into the equations of motion because the potential energy becomes infinite when
a particle strays outside the available volume) and any applied electric and magnetic
fields. A microstate of the system is defined as a state for which the motions of the
individual particles are completely specified (subject, of course, to the unavoidable
limitations imposed by the uncertainty principle of quantum mechanics). In general,
the overall energy of a given microstate r is a function of the external parameters:

Er = Er(x1, ..., xn)   (119)

A macrostate of the system is defined by specifying the external parameters, and any
other constraints to which the system is subject. For example, if we are dealing with
an isolated system (i.e., one that can neither exchange heat with nor do work on its
surroundings) then the macrostate might be specified by giving the values of the
volume and the constant total energy. For a many-particle system, there are generally
a very great number of microstates which are consistent with a given macrostate.

Thermodynamic Probability W and Entropy

The section on atoms, molecules and probability has shown that if we want to predict whether a
chemical change is spontaneous or not, we must find some general way of determining whether
the final state is more probable than the initial. This can be done using a number W, called the
thermodynamic probability. W is defined as the number of alternative microscopic
arrangements which correspond to the same macroscopic state. The significance of this
definition becomes more apparent once we have considered a few examples.

Figure 1a illustrates a crystal consisting of only eight atoms at the absolute zero of temperature.
Suppose that the temperature is raised slightly by supplying just enough energy to set one of the
atoms in the crystal vibrating. There are eight possible ways of doing this, since we could supply
the energy to any one of the eight atoms. All eight possibilities are shown in Fig. 1b.
Figure 1 The thermodynamic probability W of a crystal containing eight atoms at three different
temperatures. (a) At 0 K there is only one way in which the crystal can be arranged, so that W = 1. (b)
If enough energy is added to start just one of the atoms vibrating (color), there are eight different
equally likely arrangements possible, and W = 8. (c) If the energy is doubled, two different atoms can
vibrate simultaneously (light color) or a single atom can have all the energy (dark color). The number
of equally likely arrangements is much larger than before; W = 36.

Since all eight possibilities correspond to the crystal having the same temperature, we say that W
= 8 for the crystal at this temperature. Also, we must realize that the crystal will not stay
perpetually in any one of these eight arrangements. Energy will constantly be transferred from one
atom to another, so that all eight arrangements are equally probable.

Let us now supply a second quantity of energy exactly equal to the first, so that there is just
enough to start two atoms vibrating. There are 36 different ways in which this energy can be
assigned to the eight atoms (Fig. 1c). We say that W = 36 for the crystal at this second
temperature. Because energy continually exchanges from one atom to another, there is an equal
probability of finding the crystal in any of the 36 possible arrangements.
A third example of W is our eight-atom crystal at the absolute zero of temperature. Since there is
no energy to be exchanged from atom to atom, only one arrangement is possible, and W = 1. This
is true not only for this hypothetical crystal, but also presumably for a real crystal containing a
large number of atoms, perfectly arranged, at absolute zero.

Figure 2 Heat flow and thermodynamic probability. When two crystals, one containing 64 units of
vibrational energy and the other (at 0 K) containing none, are brought into contact, the 64 units of
energy will distribute themselves over the two crystals since there are many more ways of distributing
64 units among 200 atoms than there are of distributing 64 units over only 100 atoms.

The thermodynamic probability W enables us to decide how much more probable certain
situations are than others. Consider the flow of heat from crystal A to crystal B, as shown in Fig.
2. We shall assume that each crystal contains 100 atoms. Initially crystal B is at absolute zero.
Crystal A is at a higher temperature and contains 64 units of energy, enough to set 64 of the
atoms vibrating. If the two crystals are brought together, the atoms of A lose energy while
those of B gain energy until the 64 units of energy are evenly distributed between both crystals.

In the initial state the 64 units of energy are distributed among 100 atoms. Calculations show that
there are 1.0 × 10^44 alternative ways of making this distribution. Thus W1, the initial thermodynamic
probability, is 1.0 × 10^44. The 100 atoms of crystal A continually exchange energy among
themselves and transfer from one of these 1.0 × 10^44 arrangements to another in rapid succession.
At any instant there is an equal probability of finding the crystal in any of the 1.0 × 10^44
arrangements.

When the two crystals are brought into contact, the energy can distribute itself over twice as
many atoms. The number of possible arrangements rises enormously, and W2, the
thermodynamic probability for this new situation, is 3.6 × 10^60. In the constant reshuffle of
energy among the 200 atoms, each of these 3.6 × 10^60 arrangements will occur with equal
probability. However, only 1.0 × 10^44 of them correspond to all the energy being in crystal A.
Therefore the probability of the heat flow reversing itself and all the energy returning to crystal
A is

W1/W2 = (1.0 × 10^44) / (3.6 × 10^60) = 2.8 × 10^-17

In other words the ratio of W1 to W2 gives us the relative probability of finding the system in its
initial rather than its final state.

This example shows how we can use W as a general criterion for deciding whether a reaction is
spontaneous or not. Movement from a less probable to a more probable molecular situation
corresponds to movement from a state in which W is smaller to a state where W is larger. In other
words W increases for a spontaneous change. If we can find some way of calculating or
measuring the initial and final values of W, the problem of deciding in advance whether a
reaction will be spontaneous or not is solved. If W2 is greater than W1, then the reaction will
occur of its own accord. Although there is nothing wrong in principle with this approach to
spontaneous processes, in practice it turns out to be very cumbersome. For real samples of matter
(as opposed to 200 atoms in the example of Fig. 2) the values of W are on the order of 10^(10^24), so
large that they are difficult to manipulate. The logarithm of W, however, is only on the order of
10^24, since log 10^x = x. This is more manageable, and chemists and physicists use a quantity
called the entropy which is proportional to the logarithm of W.

This way of handling the extremely large thermodynamic probabilities encountered in real
systems was first suggested in 1877 by the Austrian physicist Ludwig Boltzmann (1844 to 1906).
The equation

S = k ln W (1)

is now engraved on Boltzmann’s tomb. The proportionality constant k is called, appropriately


enough, the Boltzmann constant. It corresponds to the gas constant R divided by the Avogadro
constant NA:

k = R/NA   (2)

and we can regard it as the gas constant per molecule rather than per mole. In SI units, the
Boltzmann constant k has the value 1.3805 × 10^-23 J K^-1. The symbol ln in Eq. (1) indicates a
natural logarithm, i.e., a logarithm taken to the base e. Since base 10 logarithms and base e
logarithms are related by the formula

ln x = 2.303 log x

it is easy to convert from one to the other. Equation (1), expressed in base 10 logarithms, thus
becomes
S = 2.303k log W (1a)

EXAMPLE 1 The thermodynamic probability W for 1 mol propane gas at 500 K and 101.3 kPa
has the value 10^(10^25). Calculate the entropy of the gas under these conditions.

Solution Since

W = 10^(10^25)

log W = 10^25

Thus S = 2.303k log W = 1.3805 × 10^-23 J K^-1 × 2.303 × 10^25 = 318 J K^-1

Note: The quantity 318 J K^-1 is obviously much easier to handle than 10^(10^25).

Note also that the dimensions of entropy are energy/temperature.
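
As a concrete check, here is a minimal Python sketch of the calculation in Example 1. Because W itself, 10^(10^25), overflows any floating-point type, the sketch works with log10(W) directly; the constants are the ones quoted above.

import math

k = 1.3805e-23            # Boltzmann constant, J/K
log10_W = 1e25            # log10(W) for 1 mol propane at 500 K (Example 1)

S = 2.303 * k * log10_W   # Eq. (1a): S = 2.303 k log W
print(f"S = {S:.0f} J/K") # prints: S = 318 J/K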

One of the properties of logarithms is that if we increase a number, we also increase the value of
its logarithm. It follows therefore that if the thermodynamic probability W of a system increases,
its entropy S must increase too. Further, since W always increases in a spontaneous change, it
follows that S must also increase in such a change.

The statement that the entropy increases when a spontaneous change occurs is called the second
law of thermodynamics. (The first law is the law of conservation of energy.) The second law, as
it is usually called, is one of the most fundamental and most widely used of scientific laws. In
this book we shall only be able to explore some of its chemical implications, but it is of
importance also in the fields of physics, engineering, astronomy, and biology. Almost all
environmental problems involve the second law. Whenever pollution increases, for instance, we
can be sure that the entropy is increasing along with it.

The second law is often stated in terms of an entropy difference ΔS. If the entropy increases from
an initial value of S1 to a final value of S2 as the result of a spontaneous change, then

ΔS = S2 – S1 (3)
Since S2 is larger than S1, we can write

ΔS > 0 (4)

Equation (4) tells us that for any spontaneous process, ΔS is greater than zero. As an example of
this relationship and of the possibility of calculating an entropy change, let us find ΔS for the
case of 1 mol of gas expanding into a vacuum. We have already argued for this process that the
final state is 10^(1.813 × 10^23) times more probable than the initial state. This can only be because
there are 10^(1.813 × 10^23) times more ways of achieving the final state than the initial state. In other
words, taking logs, we have

log (W2/W1) = 1.813 × 10^23

Thus

ΔS = S2 – S1 = 2.303 × k × log W2 – 2.303 × k × log W1

= 2.303 × k × log (W2/W1)

= 2.303 × 1.3805 × 10^-23 J K^-1 × 1.813 × 10^23

ΔS = 5.76 J K^-1

As entropy changes go, this increase in entropy is quite small. Nevertheless, it corresponds to a
gargantuan change in probabilities.
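
The quoted ratio is consistent with 1 mol of gas doubling its volume on expansion, since NA log10(2) ≈ 1.813 × 10^23; under that reading, ΔS is exactly R ln 2. A minimal Python check of this interpretation (the "doubling" reading is an assumption, though the arithmetic below matches the numbers in the text):

import math

k, NA = 1.3805e-23, 6.022e23
log10_ratio = NA * math.log10(2)          # = 1.813e23, as quoted above
dS = 2.303 * k * log10_ratio              # Eq. (1a) applied to W2/W1
print(f"log(W2/W1) = {log10_ratio:.3e}")  # 1.813e+23
print(f"dS = {dS:.2f} J/K")               # 5.76 J/K
print(f"R ln 2 = {k * NA * math.log(2):.2f} J/K")  # also 5.76 J/K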

The microscopic interpretation of heat and work


Consider a macroscopic system which is known to be in a given macrostate. To be
more exact, consider an ensemble of similar macroscopic systems, where each
system in the ensemble is in one of the many microstates consistent with the given
macrostate. There are two fundamentally different ways in which the average energy
of the system can change due to interaction with its surroundings. If the external parameters of
the system remain constant then the interaction is termed a purely thermal interaction.
Any change in the average energy of the system is attributed to an exchange of heat
with its environment. Thus,

ΔĒ = Q   (120)

where Q is the heat absorbed by the system. On a microscopic level, the energies of
the individual microstates are unaffected by the absorption of heat. In fact, it is the
distribution of the systems in the ensemble over the various microstates which is
modified.

Suppose that the system is thermally insulated from its environment. This can be
achieved by surrounding it by an adiabatic envelope (i.e., an envelope fabricated out
of a material which is a poor conductor of heat, such as fiberglass). Incidentally, the
term adiabatic is derived from the Greek adiabatos which means ``impassable.'' In
scientific terminology, an adiabatic process is one in which there is no exchange of
heat. The system is still capable of interacting with its environment via its external
parameters. This type of interaction is termed mechanical interaction, and any change
in the average energy of the system is attributed to work done on it by its
surroundings. Thus,

ΔĒ = −W   (121)

where W is the work done by the system on its environment. On a microscopic level,
the energy of the system changes because the energies of the individual microstates
are functions of the external parameters [see Eq. (119)]. Thus, if the external
parameters are changed then, in general, the energies of all of the systems in the
ensemble are modified (since each is in a specific microstate). Such a modification
usually gives rise to a redistribution of the systems in the ensemble over the accessible
microstates (without any heat exchange with the environment). Clearly, from a
microscopic viewpoint, performing work on a macroscopic system is quite a
complicated process. Nevertheless, macroscopic work is a quantity which can be
readily measured experimentally. For instance, if the system exerts a force f on its
immediate surroundings, and the change in external parameters corresponds to a
displacement x of the center of mass of the system, then the work done by the system on its
surroundings is simply

W = f x   (122)
i.e., the product of the force and the displacement along the line of action of the force.
In a general interaction of the system with its environment there is both heat
exchange and work performed. We can write

Q ≡ ΔĒ + W   (123)

which serves as the general definition of the absorbed heat Q (hence, the equivalence
sign). The quantity Q is simply the change in the mean energy of the system which is
not due to the modification of the external parameters. Note that the notion of a
quantity of heat has no independent meaning apart from Eq. (123). The mean energy
Ē and the work performed W are both physical quantities which can be determined
experimentally, whereas Q is merely a derived quantity.

Maxwell–Boltzmann statistics
In statistical mechanics, Maxwell–Boltzmann statistics describes the statistical distribution of
material particles over various energy states in thermal equilibrium, when the temperature is high
enough and density is low enough to render quantum effects negligible.

The expected number of particles with energy εi for Maxwell–Boltzmann statistics is Ni, where:

Ni = gi / e^((εi − μ)/kT) = N gi e^(−εi/kT) / Z

where:

• Ni is the number of particles in state i
• εi is the energy of the i-th state
• gi is the degeneracy of energy level i, i.e. the number of states with energy εi
• μ is the chemical potential
• k is Boltzmann's constant
• T is absolute temperature
• N is the total number of particles
• Z is the partition function
• e^(...) is the exponential function

Equivalently, the distribution is sometimes expressed as

Ni = 1 / e^((εi − μ)/kT)

where the index i now specifies a particular state rather than the set of all states with energy εi.

Fermi–Dirac and Bose–Einstein statistics apply when quantum effects are important and the
particles are "indistinguishable". Quantum effects appear if the concentration of particles (N/V) ≥
nq. Here nq is the quantum concentration, for which the interparticle distance is equal to the
thermal de Broglie wavelength, so that the wavefunctions of the particles are touching but not
overlapping. Fermi–Dirac statistics apply to fermions (particles that obey the Pauli exclusion
principle), and Bose–Einstein statistics apply to bosons. Because the quantum concentration depends
on temperature, most systems at high temperatures obey the classical (Maxwell–Boltzmann)
limit unless they have a very high density, as for a white dwarf. Both Fermi–Dirac and Bose–
Einstein become Maxwell–Boltzmann statistics at high temperature or at low concentration.

Maxwell–Boltzmann statistics are often described as the statistics of "distinguishable" classical


particles. In other words the configuration of particle A in state 1 and particle B in state 2 is
different from the case where particle B is in state 1 and particle A is in state 2. This assumption
leads to the proper (Boltzmann) distribution of particles in the energy states, but yields non-
physical results for the entropy, as embodied in the Gibbs paradox. This problem disappears
when it is realized that all particles are in fact indistinguishable. Both of these distributions
approach the Maxwell–Boltzmann distribution in the limit of high temperature and low density,
without the need for any ad hoc assumptions. Maxwell–Boltzmann statistics are particularly
useful for studying gases. Fermi–Dirac statistics are most often used for the study of electrons in
solids. As such, they form the basis of semiconductor device theory and electronics.

A derivation of the Maxwell–Boltzmann distribution


Suppose we have a container with a huge number of very small identical particles. Although the
particles are identical, we still identify them by drawing numbers on them, in the way lottery balls
are labelled with numbers and even colors.
All of those tiny particles are moving inside that container in all directions with great speed.
Because the particles are speeding around, they possess some energy. The Maxwell–
Boltzmann distribution is a mathematical function that describes how many particles in the
container have a certain energy.

It can happen that many particles have the same amount of energy εi. The number of particles with
the same energy εi is Ni. The number of particles possessing another energy εj is Nj. In physical
terminology, we say that the Ni particles with the same energy εi all occupy a so-called "energy
level" i. The concept of an energy level is used to graphically and mathematically describe and
analyse the properties of particles and the events experienced by them. Because there is generally
more than one way of occupying an energy level, the particles were tagged like lottery balls, so
that we can keep track of which particle is where.

To begin with, let us ignore the degeneracy problem: assume that there is only one single way to
put Ni particles into energy level i. What follows next is a bit of combinatorial thinking.

The number of different ways of performing an ordered selection of one single object from N
objects is obviously N. The number of different ways of selecting two objects from N objects, in
a particular order, is thus N(N − 1), and that of selecting n objects in a particular order is seen to
be N!/(N − n)!. The number of ways of selecting 2 objects from N objects without regard to order
is N(N − 1) divided by the number of ways 2 objects can be ordered, which is 2!. It can be seen
that the number of ways of selecting n objects from N objects without regard to order is the
binomial coefficient: N!/(n!(N − n)!). If we now have a set of boxes labelled a, b, c, d, e, ..., k,
then the number of ways of selecting Na objects from a total of N objects and placing them in
box a, then selecting Nb objects from the remaining N − Na objects and placing them in box b,
then selecting Nc objects from the remaining N − Na − Nb objects and placing them in box c, and
continuing until no object is left outside is

W = [N!/(Na!(N − Na)!)] × [(N − Na)!/(Nb!(N − Na − Nb)!)] × [(N − Na − Nb)!/(Nc!(N − Na − Nb − Nc)!)] × ...

and because not even a single object is to be left outside the boxes, the sum of the terms
Na, Nb, Nc, Nd, Ne, ..., Nk must equal N; thus the final factor (N − Na − Nb − Nc − ... − Nk)! in
the relation above evaluates to 0! = 1, and the product telescopes, which makes it possible to
write that relation as

W = N! / (Na! Nb! Nc! ... Nk!)
Now let us go back to the degeneracy problem which characterizes the reservoir of particles. If the i-
th box has a "degeneracy" of gi, that is, it has gi "sub-boxes", such that any way of filling the i-th
box where the number in the sub-boxes is changed is a distinct way of filling the box, then the
number of ways of filling the i-th box must be increased by the number of ways of distributing
the Ni objects in the gi "sub-boxes". The number of ways of placing Ni distinguishable objects in
gi "sub-boxes" is gi^Ni. Thus the number of ways W that a total of N particles can be classified
into energy levels according to their energies, with each level i having gi distinct states and
the i-th level accommodating Ni particles, is:

W = N! ∏i (gi^Ni / Ni!)

We wish to find the Ni for which the function W is maximized, while considering the constraints
that there is a fixed number of particles (N = Σ Ni) and a fixed energy (E = Σ Ni εi) in
the container. The maxima of W and ln(W) are achieved by the same values of Ni and, since it is
easier to accomplish mathematically, we will maximize the latter function instead. We constrain
our solution using Lagrange multipliers, forming the function:

f(N1, N2, ..., Nk) = ln W + α(N − Σ Ni) + β(E − Σ Ni εi)

Using Stirling's approximation for the factorials

ln N! ≈ N ln N − N

we obtain:

ln W ≈ N ln N − N + Σi (Ni ln gi − Ni ln Ni + Ni)

Then

f ≈ N ln N − N + Σi (Ni ln gi − Ni ln Ni + Ni) + α(N − Σ Ni) + β(E − Σ Ni εi)

Finally, in order to maximize the expression above we apply Fermat's theorem (stationary points),
according to which local extrema, if they exist, must be at critical points, where the partial
derivatives vanish:

∂f/∂Ni = ln gi − ln Ni − α − β εi = 0

By solving these equations we arrive at an expression for Ni:

Ni = gi e^(−α − β εi)

It can be shown thermodynamically that β = 1/kT where k is Boltzmann's constant
and T is the temperature, and that α = −μ/kT where μ is the chemical potential, so that finally:

Ni = gi / e^((εi − μ)/kT)

Note that the above formula is sometimes written:

Ni = gi / (z^(−1) e^(εi/kT))

where z = exp(μ / kT) is the absolute activity.

Alternatively, we may use the fact that

Σi Ni = N

to obtain the population numbers as

Ni = N gi e^(−εi/kT) / Z

where Z is the partition function defined by:

Z = Σi gi e^(−εi/kT)
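
A small Python sketch of these population formulas, for a toy three-level system. The energies, degeneracies, kT, and N below are made-up illustrative values, not taken from the text:

import math

kT = 0.05                     # thermal energy in eV (illustrative)
levels = [                    # (energy εi in eV, degeneracy gi) -- made up
    (0.00, 1),
    (0.05, 2),
    (0.10, 2),
]
N = 1000.0                    # total number of particles

Z = sum(g * math.exp(-e / kT) for e, g in levels)          # Z = Σ gi e^(−εi/kT)
populations = [N * g * math.exp(-e / kT) / Z for e, g in levels]

for (e, g), Ni in zip(levels, populations):
    print(f"ε = {e:.2f} eV, g = {g}: Ni = {Ni:.1f}")
print(f"total = {sum(populations):.1f}")                   # recovers N = 1000.0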

Another derivation (not as fundamental)


In the above discussion, the Boltzmann distribution function was obtained via directly analysing
the multiplicities of a system. Alternatively, one can make use of the canonical ensemble. In a
canonical ensemble, a system is in thermal contact with a reservoir. While energy is free to flow
between the system and the reservoir, the reservoir is thought to have infinitely large heat
capacity as to maintain constant temperature, T, for the combined system.
In the present context, our system is assumed to have energy levels εi with degeneracies gi.
As before, we would like to calculate the probability that our system has energy εi.

If our system is in a state s1, then there would be a corresponding number of microstates available
to the reservoir. Call this number ΩR(s1). By assumption, the combined system (of the system
we are interested in and the reservoir) is isolated, so all microstates are equally probable.
Therefore, for instance, if ΩR(s1) = 2 ΩR(s2), we can conclude that our system is twice as
likely to be in state s1 than s2. In general, if P(si) is the probability that our system is in state
si,

P(s1)/P(s2) = ΩR(s1)/ΩR(s2)

Since the entropy of the reservoir is SR = k ln ΩR, the above becomes

P(s1)/P(s2) = e^(SR(s1)/k) / e^(SR(s2)/k)

Next we recall the thermodynamic identity (from the first law of thermodynamics):

dSR = (1/T)(dUR + P dVR − μ dNR)

In a canonical ensemble, there is no exchange of particles, so the dNR term is zero. Similarly, dVR
= 0. This gives

SR(s1) − SR(s2) = (1/T)(UR(s1) − UR(s2)) = −(1/T)(E(s1) − E(s2))

where UR(si) and E(si) denote the energies of the reservoir and the system at si, respectively.
For the second equality we have used the conservation of energy. Substituting into the first
equation relating P(s1) and P(s2):

P(s1)/P(s2) = e^(−E(s1)/kT) / e^(−E(s2)/kT)

which implies, for any state s of the system,

P(s) = (1/Z) e^(−E(s)/kT)

where Z is an appropriately chosen "constant" to make total probability 1. (Z is constant provided
that the temperature T is invariant.) It is obvious that

Z = Σs e^(−E(s)/kT)

where the index s runs through all microstates of the system. Z is sometimes called the
Boltzmann sum over states. If we index the summation via the energy eigenvalues instead of all
possible states, degeneracy must be taken into account. The probability of our system having
energy εi is simply the sum of the probabilities of all corresponding microstates:

P(εi) = (1/Z) gi e^(−εi/kT)

where, with obvious modification,

Z = Σi gi e^(−εi/kT)
this is the same result as before.

Comments

• Notice that in this formulation, the initial assumption "... suppose the system has total N
particles..." is dispensed with. Indeed, the number of particles possessed by the system plays no
role in arriving at the distribution. Rather, how many particles would occupy states with energy
εi follows as an easy consequence.

• What has been presented above is essentially a derivation of the canonical partition function. As
one can tell by comparing the definitions, the Boltzmann sum over states is really no different
from the canonical partition function.

• Exactly the same approach can be used to derive Fermi–Dirac and Bose–Einstein statistics.
However, there one would replace the canonical ensemble with the grand canonical ensemble,
since there is exchange of particles between the system and the reservoir. Also, the system one
considers in those cases is a single particle state, not a particle. (In the above discussion, we
could have assumed our system to be a single atom.)

Limits of applicability
The Bose–Einstein and Fermi–Dirac distributions may be written:

Ni = gi / (e^((εi − μ)/kT) ± 1)

Assuming the minimum value of εi is small, it can be seen that the condition under which the
Maxwell–Boltzmann distribution is valid is when

e^(−μ/kT) >> 1

For an ideal gas, we can calculate the chemical potential using the development in the Sackur–
Tetrode article to show that:

μ = (∂E/∂N)|S,V = −kT ln(V/(N Λ^3))

where E is the total internal energy, S is the entropy, V is the volume, and Λ is the thermal de
Broglie wavelength. The condition for the applicability of the Maxwell–Boltzmann distribution
for an ideal gas is again shown to be

V/(N Λ^3) >> 1

Fermi–Dirac statistics is a part of the science of physics that describes the energies of single
particles in a system comprising many identical particles that obey the Pauli Exclusion Principle.
It is named after Enrico Fermi and Paul Dirac, who each discovered it independently.[1][2]

Fermi–Dirac (F–D) statistics applies to identical particles with half-odd-integer spin in a system
in thermal equilibrium. Additionally, the particles in this system are assumed to have negligible
mutual interaction. This allows the many-particle system to be described in terms of single-
particle energy states. The result is the F–D distribution of particles over these states and
includes the condition that no two particles can occupy the same state, which has a considerable
effect on the properties of the system. Since F–D statistics applies to particles with half-integer
spin, they have come to be called fermions. It is most commonly applied to electrons, which are
fermions with spin 1/2. Fermi–Dirac statistics is a part of the more general field of statistical
mechanics and uses the principles of quantum mechanics.


History
Before the introduction of Fermi–Dirac statistics in 1926, understanding some aspects of electron
behavior was difficult due to seemingly contradictory phenomena. For example, the electronic
heat capacity of a metal at room temperature seemed to come from 100 times fewer electrons
than were in the electric current.[3] It was also difficult to understand why the emission currents,
generated by applying high electric fields to metals at room temperature, were almost
independent of temperature.

The difficulty encountered by the electronic theory of metals at that time was due to considering
that electrons were (according to classical statistics theory) all equivalent. In other words it was
believed that each electron contributed to the specific heat an amount of the order of the
Boltzmann constant k. This statistical problem remained unsolved until the discovery of F–D
statistics.

F–D statistics was first published in 1926 by Enrico Fermi[1] and Paul Dirac.[2] According to an
account, Pascual Jordan developed in 1925 the same statistics which he called Pauli statistics,
but it was not published in a timely manner.[4] According to Dirac, by contrast, it was first studied by
Fermi, and Dirac called it Fermi statistics and the corresponding particles fermions.[5]

F–D statistics was applied in 1926 by Fowler to describe the collapse of a star to a white dwarf.[6]
In 1927 Sommerfeld applied it to electrons in metals[7] and in 1928 Fowler and Nordheim
applied it to field electron emission from metals.[8] Fermi–Dirac statistics continues to be an
important part of physics.

Fermi–Dirac distribution


For a system of identical fermions, the average number of fermions in a single-particle state i is
given by the Fermi–Dirac (F–D) distribution,[9]

n̄i = 1 / (e^((εi − μ)/kT) + 1)

where k is Boltzmann's constant, T is the absolute temperature, εi is the energy of the single-
particle state i, and μ is the chemical potential. At T = 0, the chemical potential is equal to the
Fermi energy. For the case of electrons in a semiconductor, μ is also called the Fermi level.[10][11]
The F–D distribution is valid only if the number of fermions in the system is large enough so that
adding one more fermion to the system has negligible effect on μ.[12] Since the F–D distribution
was derived using the Pauli exclusion principle, which allows at most one electron to occupy
each possible state, a result is that 0 < n̄i < 1.[13]

[Figure: the Fermi–Dirac distribution. One panel shows the energy dependence: the step is more
gradual at higher T, and n̄i = 0.5 when εi = μ; not shown is that μ decreases for higher T.[14]
A second panel shows the temperature dependence for εi > μ.]

Distribution of particles over energy

[Figure: Fermi function F(ε) vs. energy ε, with μ = 0.55 eV, for various temperatures in the range
50 K ≤ T ≤ 375 K.]

The above Fermi–Dirac distribution gives the distribution of identical fermions over single-
particle energy states, where no more than one fermion can occupy a state. Using the F–D
distribution, one can find the distribution of identical fermions over energy, where more than one
fermion can have the same energy.[15]

The average number of fermions with energy εi can be found by multiplying the F–D
distribution n̄i by the degeneracy gi (i.e. the number of states with energy εi),[16]

n̄(εi) = gi n̄i = gi / (e^((εi − μ)/kT) + 1)

When gi ≥ 2, it is possible that n̄(εi) > 1, since there is more than one state that can be
occupied by fermions with the same energy εi.

When a quasi-continuum of energies ε has an associated density of states g(ε) (i.e. the number of
states per unit energy range per unit volume[17]), the average number of fermions per unit energy
range per unit volume is

N̄(ε) = F(ε) g(ε)

where F(ε) is called the Fermi function and is the same function that is used for the F–D
distribution n̄i,[18]

F(ε) = 1 / (e^((ε − μ)/kT) + 1)

so that,

N̄(ε) = g(ε) / (e^((ε − μ)/kT) + 1)
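
A quick Python sketch of the Fermi function; μ = 0.55 eV and the temperature range echo the figure caption above, and the eV-based Boltzmann constant is the standard value:

import math

def fermi(eps, mu, T):
    # Fermi function F(ε) = 1/(exp((ε−μ)/kT) + 1); energies in eV
    kT = 8.617e-5 * T                     # Boltzmann constant in eV/K
    return 1.0 / (math.exp((eps - mu) / kT) + 1.0)

mu = 0.55                                 # eV, as in the figure
for T in (50, 150, 375):
    values = [fermi(mu + d, mu, T) for d in (-0.10, -0.02, 0.0, 0.02, 0.10)]
    print(T, [f"{v:.3f}" for v in values])
# F(μ) = 0.5 at every T; the step around μ broadens as T increases.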

Quantum and classical regimes


The classical regime, where Maxwell–Boltzmann (M–B) statistics can be used as an
approximation to F–D statistics, is found by considering the situation that is far from the limit
imposed by the Heisenberg uncertainty principle for a particle's position and momentum. Using
this approach, it can be shown that the classical situation occurs if the concentration of particles
corresponds to an average interparticle separation R̄ that is much greater than the average de
Broglie wavelength λ̄ of the particles,[19]

R̄ >> λ̄ ≈ h / √(3 m k T)

where h is Planck's constant, and m is the mass of a particle.

For the case of conduction electrons in a typical metal at T = 300 K (i.e. approximately room
temperature), the system is far from the classical regime since R̄ << λ̄. This is due to the
small mass of the electron and the high concentration (i.e. small R̄) of conduction electrons in
the metal. Thus F–D statistics is needed for conduction electrons in a typical metal.[19]
Another example of a system that is not in the classical regime is the system that consists of the
electrons of a star that has collapsed to a white dwarf. Although the white dwarf's temperature is
high (typically T = 10,000 K on its surface[20]), its high electron concentration and the small mass
of each electron preclude using a classical approximation, and again F–D statistics is
required.[6]
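
A rough numerical illustration of the metal case in Python; the conduction-electron concentration used below (roughly that of copper) is an assumed value, not from the text:

import math

h, k, m_e = 6.626e-34, 1.381e-23, 9.109e-31   # SI units
T = 300.0                                     # K
n = 8.5e28                                    # electrons per m^3 (assumed, ~copper)

R = (1.0 / n) ** (1.0 / 3.0)                  # average interparticle separation, m
lam = h / math.sqrt(3.0 * m_e * k * T)        # average de Broglie wavelength, m
print(f"R = {R:.2e} m, lambda = {lam:.2e} m, R/lambda = {R / lam:.3f}")
# R/lambda << 1, so the classical condition R >> lambda fails badly:
# F-D statistics is required for these electrons.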

Two derivations of the Fermi–Dirac distribution


Derivation starting with canonical distribution

Consider a many-particle system composed of N identical fermions that have negligible mutual
interaction and are in thermal equilibrium.[12] Since there is negligible interaction between the
fermions, the energy ER of a state R of the many-particle system can be expressed as a sum of
single-particle energies,

ER = Σr nr εr

where nr is called the occupancy number and is the number of particles in the single-particle state
r with energy εr. The summation is over all possible single-particle states r.

The probability that the many-particle system is in the state R is given by the normalized
canonical distribution,[21]

PR = e^(−β ER) / ΣR' e^(−β ER')

where β = 1/kT, k is Boltzmann's constant, T is the absolute temperature, e^(−β ER) is called the
Boltzmann factor, and the summation is over all possible states R' of the many-particle system.
The average value for an occupancy number ni is[21]

n̄i = ΣR ni PR

Note that the state R of the many-particle system can be specified by the particle occupancy of
the single-particle states, i.e. by specifying n1, n2, ..., so that

PR = P(n1, n2, ...) = e^(−β(n1ε1 + n2ε2 + ...)) / Σn1',n2',... e^(−β(n1'ε1 + n2'ε2 + ...))

and the equation for n̄i becomes

n̄i = Σn1,n2,... ni e^(−β(n1ε1 + n2ε2 + ...)) / Σn1,n2,... e^(−β(n1ε1 + n2ε2 + ...))

where the summation is over all combinations of values of n1, n2, ... which obey the Pauli
exclusion principle, so that nr = 0 or 1 for each r. Furthermore, each combination of values of
n1, n2, ... satisfies the constraint that the total number of particles is N,

Σr nr = N

Rearranging the summations,

n̄i = [Σni=0,1 ni e^(−β ni εi) Σ(i) e^(−β Σr≠i nr εr)] / [Σni=0,1 e^(−β ni εi) Σ(i) e^(−β Σr≠i nr εr)]

where the (i) on the summation sign indicates that the sum is not over ni and is subject to the
constraint that the total number of particles associated with the summation is Ni = N − ni. Note
that Σ(i) still depends on ni through the Ni constraint, since in one case ni = 0 and Σ(i) is evaluated
with Ni = N, while in the other case ni = 1 and Σ(i) is evaluated with Ni = N − 1. To simplify the
notation and to clearly indicate that Σ(i) still depends on ni through N − ni, define

Zi(N − ni) := Σ(i) e^(−β Σr≠i nr εr)

so that the previous expression for n̄i can be rewritten and evaluated in terms of the Zi,

n̄i = [0 + e^(−β εi) Zi(N − 1)] / [Zi(N) + e^(−β εi) Zi(N − 1)]
   = 1 / ([Zi(N)/Zi(N − 1)] e^(β εi) + 1)
The following approximation[22] will be used to find an expression to substitute for
Zi(N) / Zi(N − 1):

ln Zi(N − 1) ≈ ln Zi(N) − ∂ln Zi(N)/∂N = ln Zi(N) − αi

where

αi := ∂ln Zi(N)/∂N

If the number of particles N is large enough so that the change in the chemical potential μ is very
small when a particle is added to the system, then αi ≈ −μ/kT.[23] Taking the base e
antilog[24] of both sides, substituting for αi, and rearranging,

Zi(N)/Zi(N − 1) = e^(−μ/kT)

Substituting the above into the equation for n̄i, and using a previous definition of β to substitute
1/kT for β, results in the Fermi–Dirac distribution:

n̄i = 1 / (e^((εi − μ)/kT) + 1)
Derivation using Lagrange multipliers

A result can be achieved by directly analyzing the multiplicities of the system and using
Lagrange multipliers.[25]

Suppose we have a number of energy levels, labeled by index i, each level having energy εi and
containing a total of ni particles. Suppose each level contains gi distinct sublevels, all of which
have the same energy, and which are distinguishable. For example, two particles may have
different momenta (i.e. their momenta may be along different directions), in which case they are
distinguishable from each other, yet they can still have the same energy. The value of gi
associated with level i is called the "degeneracy" of that energy level. The Pauli exclusion
principle states that only one fermion can occupy any such sublevel.

The number of ways of distributing ni particles among the gi sublevels of an energy level is given
by the binomial coefficient, using its combinatorial interpretation:

w(ni, gi) = gi! / (ni! (gi − ni)!)

The number of ways that a set of occupation numbers ni can be realized is the product of the
ways that each individual energy level can be populated:

W = ∏i gi! / (ni! (gi − ni)!)

Following the same procedure used in deriving the Maxwell–Boltzmann statistics, we wish to
find the set of ni for which W is maximized, subject to the constraint that there be a fixed number
of particles, and a fixed energy. We constrain our solution using Lagrange multipliers, forming
the function:

f(ni) = ln W + α(N − Σ ni) + β(E − Σ ni εi)

Using Stirling's approximation for the factorials, taking the derivative with respect to ni, setting
the result to zero, and solving for ni yields the Fermi–Dirac population numbers:

ni = gi / (e^(α + β εi) + 1)

It can be shown thermodynamically that β = 1/kT where k is Boltzmann's constant and T is the
temperature, and that α = −μ/kT where μ is the chemical potential, so that finally, the probability
that a state will be occupied is:

n̄(εi) = ni/gi = 1 / (e^((εi − μ)/kT) + 1)
Bose–Einstein statistics

In statistical mechanics, Bose–Einstein statistics (or more colloquially B–E statistics)


determines the statistical distribution of identical indistinguishable bosons over the energy states
in thermal equilibrium.

Concept
As discussed above for Maxwell–Boltzmann statistics, quantum effects become important when
the concentration of particles satisfies N/V ≥ nq, where nq is the quantum concentration; at high
temperature or low concentration both Bose–Einstein and Fermi–Dirac statistics reduce to the
classical Maxwell–Boltzmann limit.

Bosons, unlike fermions, are not subject to the Pauli exclusion principle: an unlimited number of
particles may occupy the same state at the same time. This explains why, at low temperatures,
bosons can behave very differently from fermions; all the particles will tend to congregate
together at the same lowest-energy state, forming what is known as a Bose–Einstein condensate.

B–E statistics was introduced for photons in 1924 by Bose and generalized to atoms by Einstein
in 1924-25.

The expected number of particles in an energy state i for B–E statistics is

ni = gi / (e^((εi − μ)/kT) − 1)

with εi > μ and where ni is the number of particles in state i, gi is the degeneracy of state i, εi is
the energy of the ith state, μ is the chemical potential, k is the Boltzmann constant, and T is
absolute temperature.

This reduces to the Rayleigh–Jeans distribution for kT >> εi − μ, namely ni ≈ gi kT / (εi − μ).
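
The B–E, F–D, and M–B occupancies of a single state differ only by the −1, +1, or 0 in the denominator, so their convergence at large (ε − μ)/kT is easy to see numerically. A minimal Python comparison:

import math

def occupancy(x, kind):
    # mean occupancy of one state; x = (ε − μ)/kT
    e = math.exp(x)
    return {"BE": 1.0 / (e - 1.0), "FD": 1.0 / (e + 1.0), "MB": 1.0 / e}[kind]

for x in (0.1, 1.0, 3.0, 10.0):
    be, fd, mb = (occupancy(x, kk) for kk in ("BE", "FD", "MB"))
    print(f"x = {x:5.1f}: BE = {be:.4f}, FD = {fd:.4f}, MB = {mb:.4f}")
# At x = 10 the three agree to ~4 decimal places; at x = 0.1 they differ wildly.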

History
In the early 1920s Satyendra Nath Bose, a Bengali professor at the University of Calcutta in British
India, was intrigued by Einstein's theory of light waves being made of particles called photons.
Bose was interested in mathematically deriving Planck's radiation formula, which Planck
obtained largely by guessing. In 1900 Max Planck had derived his formula by manipulating the
mathematics to fit the empirical evidence. Using the particle picture of Einstein, Bose was able
to derive the radiation formula by systematically developing a statistics of mass-less particles
without the constraint of particle number conservation. Bose derived Planck's Law of Radiation
by proposing different states for the photon. Instead of statistical independence of particles, Bose
put particles into cells and described statistical independence of cells of phase space. Such
systems allow two polarization states, and exhibit totally symmetric wavefunctions.

He developed a statistical law governing the behavior pattern of photons quite successfully.
However, at first he was not able to publish his work: no journal in Europe would accept his paper,
the referees being unable to understand it. Bose sent his paper to Einstein, who saw the significance of
it and used his influence to get it published.[1][2]

A derivation of the Bose–Einstein distribution


Suppose we have a number of energy levels, labeled by index i, each level having energy εi and
containing a total of ni particles. Suppose each level contains gi distinct sublevels, all of which
have the same energy, and which are distinguishable. For example, two particles may have
different momenta, in which case they are distinguishable from each other, yet they can still have
the same energy. The value of gi associated with level i is called the "degeneracy" of that energy
level. Any number of bosons can occupy the same sublevel.

Let w(n, g) be the number of ways of distributing n particles among the g sublevels of an
energy level. There is only one way of distributing n particles with one sublevel, therefore
w(n, 1) = 1. It is easy to see that there are n + 1 ways of distributing n particles in two
sublevels, which we will write as:

w(n, 2) = (n + 1)!/(n! 1!)

With a little thought (see Notes below) it can be seen that the number of ways of distributing n
particles in three sublevels is

w(n, 3) = w(n, 2) + w(n − 1, 2) + ... + w(0, 2)

so that

w(n, 3) = Σk=0..n w(n − k, 2) = Σk=0..n (n − k + 1)!/((n − k)! 1!) = (n + 2)!/(n! 2!)

where we have used the following theorem involving binomial coefficients:

Σk=0..n (k + a)!/(k! a!) = (n + a + 1)!/(n! (a + 1)!)

Continuing this process, we can see that w(n, g) is just a binomial coefficient (see Notes
below):

w(n, g) = (n + g − 1)!/(n! (g − 1)!)

The number of ways that a set of occupation numbers ni can be realized is the product of the
ways that each individual energy level can be populated:

W = ∏i w(ni, gi) = ∏i (ni + gi − 1)!/(ni! (gi − 1)!) ≈ ∏i (ni + gi)!/(ni! gi!)

where the approximation assumes that gi >> 1. Following the same procedure used in deriving
the Maxwell–Boltzmann statistics, we wish to find the set of ni for which W is maximised,
subject to the constraint that there be a fixed number of particles, and a fixed energy. The
maxima of W and ln(W) occur at the same values of ni and, since it is easier to accomplish
mathematically, we will maximise the latter function instead. We constrain our solution using
Lagrange multipliers, forming the function:

f(ni) = ln W + α(N − Σ ni) + β(E − Σ ni εi)

Using the approximation gi >> 1 and using Stirling's approximation for the factorials
gives

f(ni) ≈ Σi [(ni + gi) ln(ni + gi) − ni ln ni − gi ln gi] + α(N − Σ ni) + β(E − Σ ni εi)

Taking the derivative with respect to ni, setting the result to zero and solving for ni, yields
the Bose–Einstein population numbers:

ni = gi / (e^(α + β εi) − 1)

It can be shown thermodynamically that β = 1/kT, where k is Boltzmann's constant and T is
the temperature.

It can also be shown that α = −μ/kT, where μ is the chemical potential, so that finally:

ni = gi / (e^((εi − μ)/kT) − 1)

Note that the above formula is sometimes written:

ni = gi / (z^(−1) e^(εi/kT) − 1)

where z = exp(μ/kT) is the absolute activity.

Notes
A much simpler way to think of the Bose–Einstein distribution function is to consider that n
particles are denoted by n identical balls and g shells are marked by g − 1 line partitions. It is clear
that the distinct permutations of these n balls and g − 1 partitions will give different ways of arranging
bosons in different energy levels.

Say, for 3 (= n) particles and 3 (= g) shells, so that g − 1 = 2, the arrangements may look like

|..|. or ||... or |.|.. etc.

Hence the number of distinct permutations of n + (g − 1) objects which have n identical items and
(g − 1) identical items will be:

(n + g − 1)!/(n!(g − 1)!)

OR

The purpose of these notes is to clarify some aspects of the derivation of the Bose–Einstein (B–
E) distribution for beginners. The enumeration of cases (or ways) in the B–E distribution can be
recast as follows. Consider a game of dice throwing in which there are n dice, with each die
taking values in the set {1, ..., g}, for n ≥ 1. The constraints of the game are that the value
of a die i, denoted by mi, has to be greater than or equal to the value of die (i − 1), denoted
by m(i−1), in the previous throw, i.e., mi ≥ m(i−1). Thus a valid sequence of die throws can be
described by an n-tuple (m1, m2, ..., mn), such that m1 ≤ m2 ≤ ... ≤ mn. Let S(n, g) denote the
set of these valid n-tuples:

S(n, g) = {(m1, m2, ..., mn) | 1 ≤ m1 ≤ m2 ≤ ... ≤ mn ≤ g}   (1)

Then the quantity w(n, g) (defined above as the number of ways to distribute n particles among
the g sublevels of an energy level) is the cardinality of S(n, g), i.e., the number of elements (or
valid n-tuples) in S(n, g). Thus the problem of finding an expression for w(n, g) becomes the
problem of counting the elements in S(n, g).

Example n = 4, g = 3:

S(4, 3) = {(1111), (1112), (1113), (1122), (1123), (1133), (1222), (1223), (1233), (1333),
(2222), (2223), (2233), (2333), (3333)}

(there are w(4, 3) = 15 elements in S(4, 3))

The set can be built up subset by subset: one subset is obtained by fixing all indices to 1
except for the last index, m4, which is incremented from 1 to g = 3; the next subset is obtained by
fixing m1 = m2 = 1, m3 = 2, and incrementing m4 from 2 to g = 3; and so on. Due to the ordering
constraint on the indices, each later index must automatically take values at least as large as the
one before it.

Each element of S(4, 3) can be thought of as a multiset of cardinality n = 4; the elements of
such a multiset are taken from the set {1, 2, 3} of cardinality g = 3, and the number of such
multisets is the multiset coefficient

(3 + 4 − 1 choose 4) = 6!/(4! 2!) = 15

More generally, each element of S(n, g) is a multiset of cardinality n (the number of dice), with
elements taken from the set {1, ..., g} of cardinality g (the number of possible values of each die),
and the number of such multisets, i.e., w(n, g), is the multiset coefficient

w(n, g) = (g + n − 1 choose n) = (g + n − 1)!/(n!(g − 1)!)   (2)

which is exactly the same as the formula for w(n, g) derived above with the aid of a
theorem involving binomial coefficients, namely

w(n, g) = (n + g − 1)!/(n!(g − 1)!)   (3)

To understand the decomposition of S(n, g), note that any valid n-tuple either does not use the
value g at all, in which case it is an element of S(n, g − 1), or it does use the value g, in which
case (because the tuple is non-decreasing) the dice showing g are the last ones, and deleting one
of them leaves a valid (n − 1)-tuple in S(n − 1, g). These two subsets are in one-to-one
correspondence with the smaller sets S(n, g − 1) and S(n − 1, g), respectively; and since the
subsets are non-intersecting, their cardinalities add:

w(n, g) = w(n, g − 1) + w(n − 1, g)   (5)

Iterating the second term, this is equivalent to

w(n, g) = Σk=0..n w(n − k, g − 1)   (6)

with the convention that

w(0, g) = 1   (7)

Continuing the process (reducing g step by step, down to w(n, 1) = 1), we arrive at the
closed-form formula

w(n, g) = (n + g − 1)!/(n!(g − 1)!)   (8)

keeping in mind that for n and g constants we have

Σk=0..n w(n − k, g − 1) = Σk=0..n w(k, g − 1)   (9)

It can then be verified that (8) and (2) give the same result for w(4, 3), w(3, 3), w(3, 2), etc.
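
Either way, the counting result w(n, g) = (n + g − 1)!/(n!(g − 1)!) is easy to verify by brute force. A short Python check that enumerates the multisets directly (the test cases are arbitrary):

import math
from itertools import combinations_with_replacement

def w_enumerated(n, g):
    # each size-n multiset of sublevel labels {0, ..., g-1} is one arrangement
    return sum(1 for _ in combinations_with_replacement(range(g), n))

def w_formula(n, g):
    return math.factorial(n + g - 1) // (math.factorial(n) * math.factorial(g - 1))

for n, g in [(3, 3), (4, 3), (5, 4)]:
    print(n, g, w_enumerated(n, g), w_formula(n, g))  # the two counts agree
# (4, 3) gives 15, matching the S(4, 3) example above.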

Information retrieval
In recent years, Bose–Einstein statistics has also been used as a method for term weighting in
information retrieval. The method is one of a collection of DFR ("Divergence From
Randomness") models, the basic notion being that Bose–Einstein statistics may be a useful
indicator in cases where a particular term and a particular document have a significant
relationship that would not have occurred purely by chance. Source code for implementing this
model is available from the Terrier project at the University of Glasgow.

The partition function and relation to thermodynamics


In principle, we should derive the isothermal-isobaric partition function by coupling our system
to an infinite thermal reservoir as was done for the canonical ensemble and also subject the
system to the action of a movable piston under the influence of an external pressure P. In this
case, both the temperature of the system and its pressure will be controlled, and the energy and
volume will fluctuate accordingly.

However, we saw that the transformation from E to T between the microcanonical and canonical
ensembles turned into a Laplace transform relation between the partition functions. The same
result holds for the transformation from V to T. The relevant "energy" quantity to transform is
the work done by the system against the external pressure P in changing its volume from V = 0 to
V, which will be PV. Thus, the isothermal-isobaric partition function Δ(N, P, T) can be expressed in
terms of the canonical partition function Q(N, V, T) by the Laplace transform:

Δ(N, P, T) = (1/V0) ∫0→∞ dV e^(−βPV) Q(N, V, T)

where V0 is a constant that has units of volume.

The Gibbs free energy is related to the partition function by

G(N, P, T) = −kT ln Δ(N, P, T)

This can be shown in a manner similar to that used to prove the relation A = −kT ln Q. The
differential equation to start with is

G = A + PV
Other thermodynamic relations follow:

Volume:

⟨V⟩ = (∂G/∂P)N,T = −kT (∂ ln Δ/∂P)N,T

Enthalpy:

H̄ = −(∂ ln Δ/∂β)N,P

Heat capacity at constant pressure:

Cp = (∂H̄/∂T)N,P = kβ² (∂² ln Δ/∂β²)N,P

Entropy:

S = −(∂G/∂T)N,P = k ln Δ + H̄/T

The fluctuations in the enthalpy are given, in analogy with the canonical ensemble, by

ΔH = √(⟨H²⟩ − ⟨H⟩²)

so that

ΔH = √(kT² Cp)

and, since Cp and H̄ are both extensive, the relative fluctuations ΔH/H̄ ~ 1/√N, which vanish in the
thermodynamic limit.
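
As a sanity check on the Laplace-transform construction, the relation ⟨V⟩ = −kT ∂ln Δ/∂P can be verified numerically for a toy one-particle system with Q(V) ∝ V; the units, integration cutoff, and form of Q below are assumptions made purely for illustration (the integral uses SciPy's quad):

import math
from scipy.integrate import quad

kT, V0 = 1.0, 1.0                         # arbitrary units

def Delta(P):
    # Δ(P) = (1/V0) ∫ dV e^(−PV/kT) Q(V), with the toy choice Q(V) = V/V0
    val, _ = quad(lambda V: (V / V0) * math.exp(-P * V / kT), 0.0, 200.0)
    return val / V0

def mean_V(P):
    # <V> computed directly from the same statistical weight
    num, _ = quad(lambda V: V * (V / V0) * math.exp(-P * V / kT), 0.0, 200.0)
    return (num / V0) / Delta(P)

P, dP = 2.0, 1e-5
finite_diff = -kT * (math.log(Delta(P + dP)) - math.log(Delta(P - dP))) / (2 * dP)
print(mean_V(P), finite_diff)             # both ≈ 2kT/P = 1.0 for this toy Q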

23 Partition Function

23.1 Basic Properties


In thermal equilibrium, the probability of each microstate is proportional to its Boltzmann factor:

Pi ∝ exp(−Êi/kT) (23.1)
where Pi is the probability of the ith microstate, and Êi is the energy of the ith microstate. You
can think of the Boltzmann factor exp(−Êi/kT) as an unnormalized probability. In some cases an
unnormalized probability is satisfactory, or even desirable, but in other cases you really want the
normalized probability, normalized so that ∑Pi = 1. That is easily arranged:

Pi = exp(−Êi/kT) / ∑j exp(−Êj/kT)   (23.2)

The normalization denominator in equation 23.2 is something we are going to encounter again
and again, so we might as well give it a name. It is called the partition function and is denoted Z.
That is:

Z := ∑ exp(−Êj/kT) (23.3)

Actually there is more to the story; we shall see that Z serves in many roles, not just as a
normalization denominator. However, that is more than we need to know at the moment. For the
time being, it suffices to think of Z as the normalization denominator. Additional motivation for
caring about Z will accumulate in the next few sections.

Before continuing, we need to take care of some housekeeping details.

We will find it convenient to express some things in terms of inverse temperature. Following
convention, we define

β := 1/(kT)   (23.4)
The factor of k means that 1/β is measured in units of energy (per particle). This means we don’t
need to bother with units of temperature; all we need are units of energy.

In this section, we assume constant N, i.e. constant number of particles. We also assume that the
system is fully in equilibrium. That is, this analysis applies only to the Locrian modes, and any
non-Locrian modes will have to be handled by other means.

Remark: The partition function is almost universally denoted Z, which is traceable to the German
word Zustandssumme, meaning literally "sum over states". This etymological remark seems
somewhat glib because although equation 23.3 truly is a sum over all microstates, there are
innumerable other expressions that also take the form of a sum over states. Still, the fact remains
that Z is so important that whenever anybody talks about “the” sum over states, you can assume
they mean equation 23.3 or equivalently equation 23.6.

Here are some basic facts about probabilities and Boltzmann factors:

The probability of the ith state is Pi. The probabilities are normalized such that

∑ Pi = 1   (23.5)

Knowing the probability Pi for every state is somewhat useful, but as we shall see, it is
not nearly as useful as knowing the Boltzmann factors exp(−β Êi).

The Boltzmann factor for state i is exp(−β Êi), where Êi is the energy of the state. The sum of
the Boltzmann factors is called the partition function:

Z := ∑ e^(−β Êi)   (23.6)

If you know the Boltzmann factors, you can calculate all the probabilities in accordance
with equation 23.7, but the converse does not hold: knowing all the probabilities does
not suffice to calculate the Boltzmann factors.

In fact, we shall see that if you know the partition function, you can calculate everything there is
to know about Locrian thermodynamics.

Among its many uses, the partition function can be used to write:

Pi = exp(−β Êi) / Z   (23.7)
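
A minimal Python sketch of equations 23.3 and 23.7, with an arbitrary made-up set of microstate energies:

import math

beta = 2.0                                 # inverse temperature 1/kT (arbitrary)
energies = [0.0, 0.3, 0.7, 1.2]            # microstate energies (made up)

factors = [math.exp(-beta * E) for E in energies]   # Boltzmann factors, eq. 23.6
Z = sum(factors)                                    # partition function, eq. 23.3
probs = [f / Z for f in factors]                    # normalized, eq. 23.7

print(f"Z = {Z:.4f}")
print([f"{p:.4f}" for p in probs], "sum =", round(sum(probs), 10))  # sum = 1.0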

23.2 Calculations Using the Partition Function


A direct application of basic probability ideas is:

⟨X⟩ = ∑i xi Pi   (for any probability distribution)
    = (1/Z) ∑i xi e^(−β Êi)   (for a Boltzmann distribution)   (23.8)

where ⟨⋯⟩ denotes the expectation value of some property. The idea of expectation value applies
to the macrostate. Here xi is the value of the X-property in the ith microstate. So we see that
equation 23.8 is a weighted average, such that each xi is weighted by the probability of state i.
This averaging process relates a macroscopic property X to the corresponding microscopic
property xi.

As a sanity check, you should verify that ⟨1⟩ = 1 by plugging into equation 23.8.

We now begin to explore the real power of the partition function, using it for much more than
just a normalization factor.

We can start from the observation that Z, as defined by equation 23.6, is a perfectly good state
function, just as P, V, T, S, et cetera are state functions. We will soon have more to say about the
physical significance of this state function.

We now illustrate what Z is good for. Here is a justly-famous calculation that starts with ln(Z)
and differentiates with respect to β:
∂ ln(Z)/∂β |{Êi} = (1/Z) ∑i (−Êi) e^(−β Êi)
                 = −⟨Ê⟩   (23.9)
                 = −E

Recall that Êi is the energy of the ith microstate, while E is the energy of the macrostate.

Equation 23.9 tells us that one of the directional derivatives of the partition function is related to
the energy. For a particle in a box, or for an ideal gas, all the energy levels are determined by the
volume of the box, in which case we can write E = −∂ln(Z) / ∂β at constant volume.

You have to pay attention to understand what is happening here. How can the macroscopic
energy ⟨E⟩ be changing when we require all the Êi to be constant? The answer is that the
expectation value ⟨⋯⟩ is a weighted average, weighted according to the probability of finding the
system in the ith microstate, and by changing the inverse temperature β we change the weighting.

As another example calculation using the partition function, it is amusing to express the entropy
in terms of the partition function. We start with the workhorse expression for entropy, equation
2.2 or equation 8.3, and substitute the probability from equation 23.7.

S[P] = −k ∑i Pi ln(Pi)   (23.10)

     = −k ∑i (e^(−β Êi)/Z) ln(e^(−β Êi)/Z)

     = −k ∑i (e^(−β Êi)/Z) [−β Êi − ln(Z)]

     = kβ ∑i Êi e^(−β Êi)/Z + k ln(Z) ∑i e^(−β Êi)/Z

     = kβ ⟨Ê⟩ + k ln(Z) ⟨1⟩

     = kβ E + k ln(Z)

     = −k ∂ ln(Z)/∂ ln(β) |{Êi} + k ln(Z)

We obtained the last line by plugging in the value of E obtained from equation 23.9. This gives
us a handy formula for calculating the entropy directly from the partition function.

Here we have used the fact that ⟨ln(Z)⟩≡ln(Z), as it must be since Z is not a function of the
dummy index i. Also, in the last line we have used equation 23.9.

The next-to-last line of equation 23.10 tells us that E − TS = −kTln(Z) … and equation 13.8 tells
us that the free energy is F := E − TS. Combining these expressions yields a surprisingly simple
expression for the free energy:

F = −kT ln(Z) (23.11)

As an exercise in algebra, you can find the entropy in terms of the free energy, namely

S[P] = −∂F/∂T |{Êi}   (23.12)

by carrying out the derivative in equation 23.12 and comparing with equation 23.10.

We have just established a connection between the free energy F, the temperature T, and the
partition function Z. If at any point you know two of the three, you can immediately calculate the
third.

As another example, consider the case where the microstate energy depends linearly on some
parameter B:

Êi(B) = Êi(0) + B Mi for all i (23.13)

From there, it is straightforward to show that


⟨M⟩ = −(1/β) ∂ ln(Z)/∂B |β,{Êi(0)}   (23.14)

The notation was chosen to suggest that B might be an overall applied magnetic field, and Mi
might be the magnetization of the ith state … but this interpretation is not mandatory. The idea
applies for any parameter that affects the energy linearly as in equation 23.13. Remember
Feynman’s proverb: the same equations have the same solutions.

23.3 Example: Harmonic Oscillator


The partition function Z is defined in terms of a series, but sometimes it is possible to sum the
series analytically to obtain a closed-form expression for Z. The partition function of a quantum
harmonic oscillator is a simple example of this. As discussed in reference 42, it involves
summing a geometric series, which is about as easy as anything could be. The result is

Z = ½ csch(½βℏω)   (23.15)

where csch is the hyperbolic cosecant, i.e. the reciprocal of the hyperbolic sine.

Using the methods described in section 23.2 we can easily find the energy of the harmonic oscillator
in thermal equilibrium. The result is given by equation 23.16 and diagrammed in figure 23.1.
E = ½ℏω coth(½βℏω) (23.16)

Figure 23.1: Energy vs Temperature for a Harmonic Oscillator

The entropy of a harmonic oscillator is:

S = kβ E + k ln(Z)

S/k = ½βℏω coth(½βℏω) + ln[½ csch(½βℏω)]   (23.17)

    = βℏω e^(−βℏω)/(1 − e^(−βℏω)) − ln(1 − e^(−βℏω))

In the high temperature limit (β → 0) this reduces to:

S/k = 1 − ln(βℏω)   (23.18)
    = 1 + ln(kT/ℏω)

The microstates of a harmonic oscillator are definitely not equally populated, but we remark that
the entropy in equation 23.18 is the same as what we would get for a system with e·kT/ℏω
equally-populated microstates. In particular it does not correspond to a picture where every
microstate with energy Ê < kT is occupied and others are not; the probability is spread out over
approximately e times that many states.

In the low-temperature limit, when kT is small, the entropy is very very small:

S/k = (ℏω/kT) exp(−ℏω/kT)   (23.19)

This is most easily understood by reference to the definition of entropy, as expressed by e.g.
equation 2.3. At low temperature, all of the probability is in the ground state, except for a very
very small bit of probability in the first excited state.

For details on all this, see reference 42.
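
The closed forms in equations 23.15 and 23.16 can be checked against a direct sum over the oscillator levels En = (n + ½)ℏω. A Python sketch, in units where ℏω = 1 (the level cutoff and test temperatures below are arbitrary):

import math

def Z_direct(beta, nmax=2000):
    # brute-force sum over levels En = (n + 1/2), in units where ħω = 1
    return sum(math.exp(-beta * (n + 0.5)) for n in range(nmax))

def Z_closed(beta):
    return 0.5 / math.sinh(0.5 * beta)     # eq. 23.15: Z = ½ csch(½βħω)

def E_closed(beta):
    return 0.5 / math.tanh(0.5 * beta)     # eq. 23.16: E = ½ħω coth(½βħω)

for beta in (0.1, 1.0, 5.0):
    d = 1e-6                               # E = −d ln(Z)/dβ (eq. 23.9), by finite difference
    E_num = -(math.log(Z_direct(beta + d)) - math.log(Z_direct(beta - d))) / (2 * d)
    print(f"β = {beta}: Z = {Z_direct(beta):.6f} vs {Z_closed(beta):.6f}, "
          f"E = {E_num:.6f} vs {E_closed(beta):.6f}")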

23.4 Example: Two-State System


Suppose we have a two-state system. Specifically, consider a particle such as an electron or
proton, which has two spin states, UP and DOWN, or equivalently |↑⟩ and |↓⟩. Let’s apply a
magnetic field B, so that the two states have energy

Ê(UP) = +µB
(23.20)
Ê(DOWN) = −µB

where µ is called the magnetic moment. For a single particle, the partition function is simply:

Z1 = ∑i e^(−βÊ(i))        (23.21)
   = e^(−βµB) + e^(+βµB)
   = 2 cosh(βµB)

Next let us consider N such particles, and assume that they are very weakly interacting, so that
when we calculate the energy we can pretend they are non-interacting. Then the overall partition
function is

Z = Z1^N        (23.22)

Using equation 23.9 we find that the energy of this system is

E = − ∂ln(Z)/∂β | {Êi}        (23.23)
  = −NµB tanh(βµB)

We can calculate the entropy directly from the workhorse equation, equation 2.2, or from
equation 23.10, or from equation 23.12. The latter is perhaps easiest:

S = kβE + k ln(Z)        (23.24)
  = −NkβµB tanh(βµB) + Nk ln(2 cosh(βµB))

You can easily verify that at high temperature (β → 0), this reduces to S/N = k ln(2), i.e. one bit
per spin, as it should. Meanwhile, at low temperatures (β → ∞), it reduces to S = 0.
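
A quick numerical check of these two limits, using equation 23.24 divided by N, with k = 1 and an arbitrarily chosen µB:

    import numpy as np

    def S_per_spin(beta, muB):
        # equation 23.24 divided by N, in units where k = 1
        return (-beta * muB * np.tanh(beta * muB)
                + np.log(2.0 * np.cosh(beta * muB)))

    muB = 1.0
    print(S_per_spin(1e-8, muB))             # ~ ln(2) = 0.693..., one bit per spin
    print(S_per_spin(50.0, muB))             # ~ 0 in the low-temperature limit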

It is interesting to plot the entropy as a function of energy, as in figure 23.2.


Figure 23.2: Entropy versus Energy – Two State System

In this figure, the slope of the curve is β, i.e. the inverse temperature. It may not be obvious from
the figure, but the slope of the curve is infinite at both ends. That is, at the low-energy end the
temperature is positive but only slightly above zero, whereas at the high-energy end the
temperature is negative but only slightly below zero. Meanwhile, the peak of the curve
corresponds to infinite temperature, i.e. β=0. The temperature is shown in figure 23.3.
Figure 23.3: Temperature versus Energy – Two State System

In this system, the curve of T as a function of E has infinite slope when E=Emin. You can prove
that by considering the inverse function, E as a function of T, and expanding to first order in T.
To get a fuller understanding of what is happening in the neighborhood of this point, we can
define a new variable b := exp(−µB/kT) and develop a Taylor series as a function of b. That gives
us

(E − Emin)/N = 2µB e^(−2µB/kT)        for T near zero
(23.25)
kT = 2µB / [ln(2NµB) − ln(E − Emin)]

which is what we would expect from basic principles: The energy of the excited state is 2µB
above the ground state, and the probability of the excited state is given by a Boltzmann factor.

Let us briefly mention the pedestrian notion of “equipartition” (i.e. 1/2 kT of energy per degree
of freedom, as suggested by equation 24.7). This notion makes absolutely no sense for our spin
system. We can understand this as follows: The pedestrian result calls for 1/2 kT of energy per
quadratic degree of freedom in the classical limit, whereas (a) this system is not classical, and
(b) it doesn’t have any quadratic degrees of freedom.

For more about the advantages and limitations of the idea of equipartition, see chapter 24.

Indeed, one could well ask the opposite question: Given that we are defining temperature via
equation 6.7, how could «equipartition» ever work at all? Partly the answer has to do with “the
art of the possible”. That is, people learned to apply classical thermodynamics to problems where
it works, and learned to stay away from systems where it doesn't. If you hunt around, you
can find systems that are both harmonic and non-quantized, such as the classical ideal gas, the
phonon gas in a solid (well below the melting point), and the rigid rotor (in the high temperature
limit). Such systems will have 1/2 kT of energy in each quadratic degree of freedom. On the
other hand, if you get the solid too hot, it becomes anharmonic, and if you get the rotor too cold,
it becomes quantized. Furthermore, the two-state system is always anharmonic and always
quantized. Bottom line: Sometimes equipartition works, and sometimes it doesn’t.

23.5 Rescaling the Partition Function


This section is a bit of a digression. Feel free to skip it if you’re in a hurry.

We started out by saying that the probability Pi is “proportional” to the Boltzmann factor
exp(−βÊi).

If Pi is proportional to one thing, it is proportional to lots of other things. So the question arises,
what reason do we have to prefer exp(−βÊi) over other expressions, such as the pseudo-
Boltzmann factor α exp(−βÊi)?

We assume the fudge factor α is the same for every microstate, i.e. for every term in the partition
function. That means that the probability Pi† we calculate based on the pseudo-Boltzmann factor
is the same as what we would calculate based on the regular Boltzmann factor:

Pi† = α exp(−βÊi) / ∑j α exp(−βÊj)        (23.26)
    = Pi

All the microstate probabilities are the same, so anything – such as entropy – that depends
directly on microstate probabilities will be the same, whether or not we rescale the Boltzmann
factors.

Our next steps depend on whether α depends on β or not. If α is a constant, independent of β,
then rescaling the Boltzmann factors by a factor of α has no effect on the entropy, energy, or
anything else. You should verify that any factor of α would drop out of equation 23.9 on the first
line.

We now consider the case where α depends on β. (We are still assuming that α is the same for
every microstate, i.e. independent of i, but it can depend on β.)

If we were only using Z as a normalization denominator, having a fudge factor that depends on β
would not matter. We could just pull the factor out front in the numerator and denominator of
equation 23.26 whereupon it would drop out.

In contrast, if we are interested in derivatives, the derivatives of Z′ := β Z are different from the
derivatives of plain Z. You can easily verify this by plugging Z′ into equation 23.9. The β-
dependence matters in equation 23.9 even though it doesn’t matter in equation 23.10. We
summarize this by saying that Z is not just a normalization factor.

A particularly interesting type of fudge factor is exp(−βφ) for some constant φ. You can easily
verify that this corresponds to shifting all the energies in the problem by φ. This can be
considered a type of gauge invariance. In situations where relativity is not involved, such as the
present situation, you can shift all the energies in the problem by some constant without
changing the observable physics. The numerical value of the energy is changed, but this has no
observable consequences. In particular, shifting the energy does not shift the entropy.
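
The following sketch illustrates this numerically: shifting a set of hypothetical energies by a constant φ leaves the probabilities and the entropy untouched, while the free energy shifts by exactly φ:

    import numpy as np

    beta = 0.9                               # arbitrary inverse temperature (k = 1)
    E_i = np.array([0.0, 1.0, 2.5])          # hypothetical microstate energies
    phi = 10.0                               # arbitrary constant shift

    def P_S_F(E):
        Z = np.sum(np.exp(-beta * E))
        P = np.exp(-beta * E) / Z            # microstate probabilities
        S = -np.sum(P * np.log(P))           # entropy (in units of k)
        F = -np.log(Z) / beta                # free energy, F = -kT ln(Z)
        return P, S, F

    P0, S0, F0 = P_S_F(E_i)
    P1, S1, F1 = P_S_F(E_i + phi)

    assert np.allclose(P0, P1)               # probabilities unchanged
    assert np.isclose(S0, S1)                # entropy unchanged
    assert np.isclose(F1 - F0, phi)          # free energy shifted by exactly phi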

Partition function (statistical mechanics)


In statistical mechanics, the partition function, Z, encodes the statistical properties of a system
in thermodynamic equilibrium. It is a function of temperature and other parameters, such as the
volume enclosing a gas. Most of the aggregate thermodynamic variables of the system, such as
the total energy, free energy, entropy, and pressure, can be expressed in terms of the partition
function or its derivatives.
There are actually several different types of partition functions, each corresponding to different
types of statistical ensemble (or, equivalently, different types of free energy). The canonical
partition function applies to a canonical ensemble, in which the system is allowed to exchange
heat with the environment at fixed temperature, volume, and number of particles. The grand
canonical partition function applies to a grand canonical ensemble, in which the system can
exchange both heat and particles with the environment, at fixed temperature, volume, and
chemical potential. Other types of partition functions can be defined for different circumstances;
see partition function (mathematics) for generalizations.

Canonical partition function


Definition

As a beginning assumption, assume that a thermodynamically large system is in constant thermal
contact with the environment, at a temperature T, and with both the volume of the system and the
number of constituent particles held fixed. This kind of system is called a canonical ensemble. Let us
label with s ( s = 1, 2, 3, ...) the exact states (microstates) that the system can occupy, and
denote the total energy of the system when it is in microstate s as Es . Generally, these
microstates can be regarded as analogous to discrete quantum states of the system.

The canonical partition function is

Z = ∑s exp(−β·Es)

where the "inverse temperature", β, is conventionally defined as

β ≡ 1/(kB·T)

with kB denoting Boltzmann's constant. The term exp(–β·Es) is known as the Boltzmann
factor. In systems with multiple quantum states s sharing the same Es , it is said that the energy
levels of the system are degenerate. In the case of degenerate energy levels, we can write the
partition function in terms of the contribution from energy levels (indexed by j ) as follows:

Z = ∑j gj·exp(−β·Ej)

where gj is the degeneracy factor, or number of quantum states s which have the same energy
level defined by Ej = Es .

The above treatment applies to quantum statistical mechanics, where a physical system inside a
finite-sized box will typically have a discrete set of energy eigenstates, which we can use as the
states s above. In classical statistical mechanics, it is not really correct to express the partition
function as a sum of discrete terms, as we have done. In classical mechanics, the position and
momentum variables of a particle can vary continuously, so the set of microstates is actually
uncountable. In this case we must describe the partition function using an integral rather than a
sum. For instance, the partition function of a gas of N identical classical particles is

Z = (1/(N!·h^(3N))) ∫ exp[−β·H(p1 … pN, x1 … xN)] d³p1 … d³pN d³x1 … d³xN

where

pi indicate particle momenta

xi indicate particle positions

d³ is a shorthand notation serving as a reminder that the pi and xi are vectors in three
dimensional space, and

H is the classical Hamiltonian.

The reason for the N! factor is discussed below. For simplicity, we will use the discrete form of
the partition function in this article. Our results will apply equally well to the continuous form.
The extra constant factor in the denominator was introduced because, unlike the
discrete form, the continuous form shown above is not dimensionless. To make it into a
dimensionless quantity, we must divide it by h^(3N), where h is some quantity with units of action
(usually taken to be Planck's constant).

In quantum mechanics, the partition function can be more formally written as a trace over the
state space (which is independent of the choice of basis):

Z = tr( exp(−β·Ĥ) )

where Ĥ is the quantum Hamiltonian operator. The exponential of an operator can be defined
using the exponential power series. The classical form of Z is recovered when the trace is
expressed in terms of coherent states [1] and when quantum-mechanical uncertainties in the
position and momentum of a particle are regarded as negligible. Formally, one inserts under the
trace for each degree of freedom a resolution of the identity

1 = ∫ (dx dp / h) |x, p⟩⟨x, p|

where |x, p⟩ is a normalised Gaussian wavepacket centered at position x and momentum p.


Thus,

Z = ∫ (dx dp / h) ⟨x, p| exp(−β·Ĥ) |x, p⟩

A coherent state is an approximate eigenstate of both operators x̂ and p̂, hence also of the
Hamiltonian Ĥ, with errors of the size of the uncertainties. If Δx and Δp can be regarded as
zero, the action of Ĥ reduces to multiplication by the classical Hamiltonian, and Z reduces to the
classical configuration integral.

Meaning and significance

It may not be obvious why the partition function, as we have defined it above, is an important
quantity. First, let us consider what goes into it. The partition function is a function of the
temperature T and the microstate energies E1, E2, E3, etc. The microstate energies are determined
by other thermodynamic variables, such as the number of particles and the volume, as well as
microscopic quantities like the mass of the constituent particles. This dependence on microscopic
variables is the central point of statistical mechanics. With a model of the microscopic
constituents of a system, one can calculate the microstate energies, and thus the partition
function, which will then allow us to calculate all the other thermodynamic properties of the
system.

The partition function can be related to thermodynamic properties because it has a very
important statistical meaning. The probability Ps that the system occupies microstate s is

Ps = (1/Z) exp(−β·Es)

where exp(−β·Es) is the well-known Boltzmann factor. (For a detailed derivation of this result, see
canonical ensemble.) The partition function thus plays the role of a normalizing constant (note
that it does not depend on s), ensuring that the probabilities sum up to one:

∑s Ps = (1/Z) ∑s exp(−β·Es) = 1

This is the reason for calling Z the "partition function": it encodes how the probabilities are
partitioned among the different microstates, based on their individual energies. The letter Z
stands for the German word Zustandssumme, "sum over states". This notation also implies
another important meaning of the partition function of a system: it counts the (weighted) number
of states a system can occupy. Hence if all states are equally probable (equal energies) the
partition function is the total number of possible states. Often this is the practical importance of
Z.
Calculating the thermodynamic total energy

In order to demonstrate the usefulness of the partition function, let us calculate the
thermodynamic value of the total energy. This is simply the expected value, or ensemble average
for the energy, which is the sum of the microstate energies weighted by their probabilities:

⟨E⟩ = ∑s Es·Ps = (1/Z) ∑s Es·exp(−β·Es)

or, equivalently,

⟨E⟩ = − ∂ln(Z)/∂β

Incidentally, one should note that if the microstate energies depend on a parameter λ in the
manner

Es = Es(0) + λ·As        for all s

then the expected value of A is

⟨A⟩ = ∑s As·Ps = − (1/β) ∂ln(Z)/∂λ
This provides us with a method for calculating the expected values of many microscopic
quantities. We add the quantity artificially to the microstate energies (or, in the language of
quantum mechanics, to the Hamiltonian), calculate the new partition function and expected
value, and then set λ to zero in the final expression. This is analogous to the source field method
used in the path integral formulation of quantum field theory.
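
Here is a minimal sketch of that trick, with hypothetical base energies and a hypothetical quantity As attached to each microstate; a finite-difference derivative of ln(Z) at λ = 0 reproduces the direct ensemble average ⟨A⟩:

    import numpy as np

    beta = 1.1                               # arbitrary inverse temperature
    E0  = np.array([0.2, 1.0, 1.7])          # hypothetical microstate energies
    A_s = np.array([3.0, -1.0, 0.5])         # hypothetical quantity per microstate

    def lnZ(lam):
        # partition function with the artificial term lam*A_s added to the energies
        return np.log(np.sum(np.exp(-beta * (E0 + lam * A_s))))

    h = 1e-6
    A_from_Z = -(lnZ(h) - lnZ(-h)) / (2 * h) / beta    # -(1/beta) d lnZ / d lam at 0

    P = np.exp(-beta * E0)
    P /= P.sum()
    assert np.isclose(A_from_Z, np.sum(P * A_s))       # matches <A> computed directly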

Relation to thermodynamic variables

In this section, we will state the relationships between the partition function and the various
thermodynamic parameters of the system. These results can be derived using the method of the
previous section and the various thermodynamic relations.

As we have already seen, the thermodynamic energy is

⟨E⟩ = − ∂ln(Z)/∂β

The variance in the energy (or "energy fluctuation") is

⟨(ΔE)²⟩ ≡ ⟨(E − ⟨E⟩)²⟩ = ∂²ln(Z)/∂β²

The heat capacity is

Cv = ∂⟨E⟩/∂T = kB·β²·⟨(ΔE)²⟩

The entropy is

S ≡ −kB ∑s Ps·ln(Ps) = kB·(ln(Z) + β⟨E⟩) = −∂A/∂T

where A is the Helmholtz free energy defined as A = U − TS, where U = ⟨E⟩ is the total energy
and S is the entropy, so that

A = −kB·T·ln(Z)

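As a concrete check on the fluctuation formula, the following sketch, with hypothetical level energies and kB = 1, computes the heat capacity both from the variance of the energy and as a numerical derivative of ⟨E⟩ with respect to T:

    import numpy as np

    kB = 1.0                                 # Boltzmann constant, arbitrary units
    E_s = np.array([0.0, 0.3, 1.1, 2.4])     # hypothetical microstate energies
    T = 1.5                                  # arbitrary temperature

    def E_avg(temp):
        b = 1.0 / (kB * temp)
        p = np.exp(-b * E_s)
        p /= p.sum()
        return np.sum(p * E_s)

    beta = 1.0 / (kB * T)
    P = np.exp(-beta * E_s)
    P /= P.sum()
    var_E = np.sum(P * E_s**2) - np.sum(P * E_s)**2    # energy fluctuation

    C_fluct = var_E / (kB * T**2)            # heat capacity from the variance
    dT = 1e-6
    C_deriv = (E_avg(T + dT) - E_avg(T - dT)) / (2 * dT)
    assert np.isclose(C_fluct, C_deriv)
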
Partition functions of subsystems

Suppose a system is subdivided into N sub-systems with negligible interaction energy. If the
partition functions of the sub-systems are ζ1, ζ2, ..., ζN, then the partition function of the entire
system is the product of the individual partition functions:

If the sub-systems have the same physical properties, then their partition functions are equal, ζ1 =
ζ2 = ... = ζ, in which case

Z = ζ^N.

However, there is a well-known exception to this rule. If the sub-systems are actually identical
particles, in the quantum mechanical sense that they are impossible to distinguish even in
principle, the total partition function must be divided by N! (N factorial):

Z = ζ^N / N!

This is to ensure that we do not "over-count" the number of microstates. While this may seem
like a strange requirement, it is actually necessary to preserve the existence of a thermodynamic
limit for such systems. This is known as the Gibbs paradox.
Examples

A specific example of the partition function, expressed in terms of the mathematical formalism
of measure theory, is presented in the article on the Potts model.

Grand canonical partition function


Definition

In a manner similar to the definition of the canonical partition function for the canonical
ensemble, we can define a grand canonical partition function for a grand canonical ensemble,
a system that can exchange both heat and particles with the environment, which has a constant
temperature T , and a chemical potential μ . The grand canonical partition function, although
conceptually more involved, simplifies the theoretical handling of quantum systems because it
incorporates in a simple way the spin-statistics of the particles (i.e. whether particles are bosons
or fermions). A canonical partition function, with a given number of particles, is in fact
difficult to write down because of spin statistics.

The grand canonical partition function for an ideal quantum gas (a gas of non-interacting
particles in a given potential well) is given by the following expression:

Z = ∑N ∑{ni} exp[−β ∑i ni·(εi − µ)]
where N is the total number of particles in the gas, index i runs over every microstate (that is, a
single particle state in the potential) with ni being the number of particles occupying microstate
i and εi being the energy of a particle in that microstate. The set { ni } is the collection of all
possible occupation numbers for each of these microstates such that Σ ni = N .

For example, consider the N = 3 term in the above sum. One possible set of occupation numbers
would be ni = 0, 1, 0, 2, 0 ... and the contribution of this set of occupation numbers to the
N = 3 term would be

exp[−β·(ε2 + 2ε4 − 3µ)]

For bosons, the occupation numbers can take any integer values as long as their sum is equal to
N . For fermions, the Pauli exclusion principle requires that the occupation numbers only be 0
or 1 , again adding up to N .

Specific expressions

The above expression for the grand partition function can be shown to be mathematically
equivalent to:

Z = ∏i ∑ni exp[−β·ni·(εi − µ)]
(The above product is sometimes taken over all states with equal energy, rather than over each
state, in which case the individual partition functions must be raised to a power gi where gi is
the number of such states. gi is also referred to as the "degeneracy" of states.)

For a system composed of bosons:

Z = ∏i 1 / (1 − exp[−β·(εi − µ)])

and for a system composed of fermions:

Z = ∏i (1 + exp[−β·(εi − µ)])

For the case of a Maxwell-Boltzmann gas, we must use "correct Boltzmann counting" and divide
the Boltzmann factor by ni! .
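
The equivalence between the sum over occupation-number sets and the product over single-particle states is easy to verify by brute force for a small fermionic system, where each ni is 0 or 1. A sketch with three arbitrarily chosen levels:

    import numpy as np
    from itertools import product

    beta, mu = 1.0, 0.2                      # arbitrary inverse temperature and chemical potential
    eps = np.array([0.0, 0.5, 1.3])          # three hypothetical single-particle levels

    # brute-force sum over all fermionic occupation sets {n_i}, n_i in {0, 1}
    Z_sum = 0.0
    for occ in product([0, 1], repeat=len(eps)):
        n = np.array(occ)
        Z_sum += np.exp(-beta * np.sum(n * (eps - mu)))

    # product form for fermions: prod_i (1 + exp(-beta (eps_i - mu)))
    Z_prod = np.prod(1.0 + np.exp(-beta * (eps - mu)))
    assert np.isclose(Z_sum, Z_prod)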

Relation to thermodynamic variables

Just as with the canonical partition function, the grand canonical partition function can be used to
calculate thermodynamic and statistical variables of the system. As with the canonical ensemble,
the thermodynamic quantities are not fixed, but have a statistical distribution about a mean or
expected value.

Occupation numbers

The most probable occupation numbers are:

⟨ni⟩ = − (1/β) ∂ln(Z)/∂εi ,

where α = −β·μ .

For Boltzmann particles this yields:

⟨ni⟩ = exp(−α − β·εi)

For bosons:

⟨ni⟩ = 1 / (exp(α + β·εi) − 1)

For fermions:

⟨ni⟩ = 1 / (exp(α + β·εi) + 1)

which are just the results found using the canonical ensemble for Maxwell-Boltzmann statistics,
Bose-Einstein statistics and Fermi-Dirac statistics, respectively. (The degeneracy gi is missing
from the above equations because the index i is summing over individual microstates rather
than energy eigenvalues.)
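
As a check on the fermion result, this sketch recovers the Fermi-Dirac occupation of a single level from a finite-difference derivative of that level's factor in the grand partition function, using arbitrary values of β, µ, and ε:

    import numpy as np

    beta, mu, eps = 1.0, -0.5, 0.7           # arbitrary parameters

    def lnZeta(e):
        # single-level grand partition function for fermions
        return np.log(1.0 + np.exp(-beta * (e - mu)))

    h = 1e-6
    n_deriv = -(lnZeta(eps + h) - lnZeta(eps - h)) / (2 * h) / beta
    n_FD = 1.0 / (np.exp(beta * (eps - mu)) + 1.0)     # Fermi-Dirac formula
    assert np.isclose(n_deriv, n_FD)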

Total number of particles

⟨N⟩ = (1/β) ∂ln(Z)/∂μ = ∑i ⟨ni⟩

Variance in total number of particles

⟨(ΔN)²⟩ = (1/β²) ∂²ln(Z)/∂μ²

Internal energy

⟨E⟩ = − ∂ln(Z)/∂β |z  (with the fugacity z ≡ exp(βμ) held fixed)

Variance in internal energy

⟨(ΔE)²⟩ = ∂²ln(Z)/∂β² |z

Pressure

⟨p⟩ = kB·T · ∂ln(Z)/∂V

Mechanical equation of state

p·V = kB·T·ln(Z)

Relation to potential V

For the case of a non-interacting gas, using the "Semiclassical Approach" we can write
(approximately) the inverse of the potential in the form:

(valid for high T )

supposing that the Hamiltonian of every particle is H=T+V .

Discussion

Before specific results can be obtained from the grand canonical partition function, the energy
levels of the system under consideration need to be specified. For example, the particle in a box
model or particle in a harmonic oscillator well provide a particular set of energy levels and are a
convenient way to discuss the properties of a quantum fluid. (See the gas in a box and gas in a
harmonic trap articles for a description of quantum fluids.)

These results may be used to construct the grand partition function to describe an ideal Bose gas
or Fermi gas, and can be used as well to describe a classical ideal gas. In the case of a
mesoscopic system made of a few particles, like a quantum dot in a semiconductor, the finite
quantum grand partition ensemble is required, for which the limit of a small finite N is
maintained.
