

Introduction to Nonequilibrium
Statistical Physics

Mention Quantum Engineering

Master QLMN

Jean-Jacques Greffet
Contents

1 Introduction to statistical physics. Basic concepts.
  1.1 Introduction
  1.2 State of a system. Statistical average
    1.2.1 State of a system in classical and in quantum mechanics. Phase space
    1.2.2 Statistical (Ensemble) average and mean value of an observable
    1.2.3 The concept of statistical average and ensemble average
    1.2.4 Thermodynamic equilibrium
    1.2.5 Ergodicity
  1.3 Summing over all possible states. The classical approximation.
  1.4 Fundamental Principle of Statistical Physics
    1.4.1 Probability of a state of an isolated system. Microcanonical ensemble
    1.4.2 Statistical Entropy
    1.4.3 Second formulation of the fundamental principle: maximum entropy

2 System in contact with a thermostat. Canonical ensemble
  2.1 Introduction
  2.2 Probability law for an isolated system. Microcanonical ensemble
    2.2.1 Lagrangian multiplier technique
    2.2.2 Example: isolated system, microcanonical ensemble
  2.3 System in contact with a thermostat. Canonical ensemble
    2.3.1 Probability law for a canonical ensemble
    2.3.2 Mean energy
    2.3.3 Energy fluctuations
    2.3.4 Entropy. Link between the statistical temperature and the temperature in classical thermodynamics
    2.3.5 Heat capacity cv. Connection between fluctuations and linear response
    2.3.6 Free energy and partition function
  2.4 Example: perfect gas
    2.4.1 Partition function
    2.4.2 Thermodynamic properties of the perfect gas
  2.5 Microscopic interpretation of the concepts of heat and work
    2.5.1 Work and heat in classical thermodynamics
    2.5.2 Work and heat in statistical physics
    2.5.3 Work and generalized force. State equation
    2.5.4 Application to the perfect gas

3 System in contact with a reservoir. Grand canonical ensemble
  3.1 Introduction
  3.2 Systems exchanging energy and particles with a reservoir. Grand canonical ensemble
    3.2.1 Probability law
    3.2.2 Mean values
    3.2.3 Entropy. Introduction of the chemical potential
  3.3 System with a fluctuating quantity X
    3.3.1 Probability law
    3.3.2 Entropy
    3.3.3 Mean value and fluctuation
    3.3.4 Fluctuations and linear response
    3.3.5 State equation
  3.4 Thermodynamic limit
    3.4.1 Equivalence between the descriptions with different ensembles
    3.4.2 Relative fluctuations of the number of particles in a perfect gas
    3.4.3 Thermodynamic limit
    3.4.4 Temperature of an isolated system
    3.4.5 Chemical potential of a system with a fixed number of particles
  3.5 Equilibrium condition between two subsystems: equality of T, µ, x
  3.6 Grand potential
    3.6.1 Definition
    3.6.2 Explicit form of the grand potential
    3.6.3 The potential is minimum at equilibrium
    3.6.4 Legendre transform
  3.7 Examples
    3.7.1 Exercise: Grand canonical ensemble P-T

4 Linear response coefficients. General properties
  4.1 Introduction
  4.2 Response function
  4.3 Susceptibility
    4.3.1 Definition
    4.3.2 Dissipation
    4.3.3 Causality. Kramers-Kronig relations
  4.4 Relaxation function
    4.4.1 Examples and definitions
    4.4.2 Response function and relaxation function
  4.5 Example

5 Linear response. Fluctuation-Dissipation theorem
  5.1 Introduction
  5.2 Relaxation function
  5.3 Susceptibility
    5.3.1 Wiener-Khinchine theorem
    5.3.2 Susceptibility. Fluctuation-dissipation theorem
    5.3.3 Application to the computation of fluctuation spectra
  5.4 Applications
    5.4.1 Voltage fluctuations for a resistor
    5.4.2 Fluctuations of the electromagnetic field in vacuum
    5.4.3 Exercise: Microwave polarisability of a dielectric particle

6 Langevin model of Brownian motion. Application: fluctuations of linear systems
  6.1 Introduction
    6.1.1 Brownian motion
    6.1.2 Qualitative discussion
  6.2 Langevin model
  6.3 Fluctuation-dissipation theorem. Second form.
    6.3.1 Newton's equations in harmonic regime. Power spectral density.
    6.3.2 Second form of the Fluctuation-Dissipation theorem
  6.4 First form of the Fluctuation-dissipation theorem
    6.4.1 Susceptibility
    6.4.2 First form of the Fluctuation-dissipation theorem
  6.5 Einstein relation. Brownian motion
    6.5.1 Study of the velocity v(t)
    6.5.2 Derivation of x(t). Diffusion
    6.5.3 Einstein relation
  6.6 Examples of application
    6.6.1 Charge fluctuation of a capacitor
    6.6.2 Problem: spring length fluctuations
    6.6.3 Problem: dipole moment fluctuations

7 Introduction to the kinetic theory of transport phenomena.
  7.1 Introduction
  7.2 Elementary introduction to kinetic theory of transport phenomena
    7.2.1 Fick's law
    7.2.2 Fourier's law
    7.2.3 Thermoelectric effect
  7.3 Diffusive and ballistic regime
    7.3.1 Example: heat transfer across a gap
    7.3.2 Diffusion equation: key ingredients and general form of the solution
    7.3.3 Diffusive regime: the random walk model

8 Introduction to kinetic theory of transport phenomena. Boltzmann equation
  8.1 General form of a flux
  8.2 Boltzmann equation
    8.2.1 Liouville theorem
    8.2.2 Boltzmann equation
  8.3 Relaxation time approximation
  8.4 Perturbative solution
  8.5 Examples
    8.5.1 Electrical conduction in an ionic solution
    8.5.2 Problem: Momentum transport. Viscosity

9 Introduction to irreversible thermodynamics.
  9.1 Introduction
  9.2 Overview of classical thermodynamics
    9.2.1 Fundamental principles
    9.2.2 Local Thermodynamic Equilibrium, intensive variables and state equation
  9.3 Homogeneous systems. Fluxes and Affinities.
    9.3.1 Affinity (or generalized force)
    9.3.2 Entropy production rate
  9.4 Flux and affinity in a heterogeneous system.
    9.4.1 Local thermodynamic equilibrium
    9.4.2 Local form of the conservation law of Xk
    9.4.3 Entropy conservation. Affinity. Local entropy production rate.
    9.4.4 Examples
    9.4.5 Linear response theory for fluxes
  9.5 Principles of irreversible thermodynamics
    9.5.1 Curie Principle
    9.5.2 Clausius-Duhem inequality
    9.5.3 Onsager reciprocity relations
Chapter 1

Introduction to statistical physics. Basic concepts.

1.1 Introduction
From a historical perspective, statistical physics is the consequence of an attempt to un-
derstand the physical properties of gases from a mechanical modelling of the movement of
individual molecules. The goal is therefore to recover concepts of thermodynamics such
as pressure, temperature, entropy from mechanical concepts such as momentum or kinetic
energy. It is clearly an attempt to go from a microscopic description to a macroscopic de-
scription.
From a microscopic description to a macroscopic description. In practice, a me-
chanical description of a gas based on the computation of the position and velocity of all
the molecules is impossible. Numerical simulations can be done on thousands of atoms
but certainly not with a mole of atoms. Furthermore, it is impossible to know with enough
accuracy the initial conditions of a large system to predict its evolution. Hence, we see that
a statistical approach has to rely on some average procedure for practical reasons. At this
point, a statistical approach seems to be necessary to simplify the equations. While this is
true, it is only one of the features of statistical physics.
Complex systems. The statistical approach introduced by Boltzmann and Gibbs to
study gases has been extended to many other physical systems and gases are certainly not
the best example of the potential of a statistical approach. For instance, it is not possible to
understand the behaviour of a neutron star or the properties of a magnetic bit in a hard disk
without using statistical physics. Beyond physical phenomena, the procedure that allows
to move from a microscopic description to a macroscopic description is widely used in
different sciences. Shannon's information theory borrowed the concept of entropy.
Many concepts used in macroeconomics are based on similar ideas: there, the problem is
to move from the behaviour of individuals to the economic quantities relevant at the scale
of a country. Another example is the study of traffic jams, where the issue is to understand
the dynamics of the flow of cars from a description of the behaviour of individual cars. We
see that the statistical approach enables the study of a large class of systems called complex
systems. We now introduce the idea that a large system is more than the mere addition of
its parts.
Emergence of new concepts and new phenomena. One of the most fascinating fea-
tures of statistical physics is its ability to introduce purely macroscopic concepts. To illus-


trate this point, let us consider the example of a gas consisting of N atoms with mass m. If
all velocities and positions are known at a given instant, the classical dynamical variables
are well characterized and we have all the relevant information. From thermodynamics,
we know that a gas can be described using both entropy and temperature. These concepts
do not appear in the microscopic description of the system. They are purely macroscopic
quantities. Hence, we see that statistical physics is not a mere filter that would keep the
essential information and remove the superfluous data. It is a different way to look at a
system. More importantly, statistical physics enables the study of phenomena that cannot be
captured by a complete microscopic description, such as phase transitions. The solid-liquid
phase transition of a substance or the paramagnetic-ferromagnetic phase transition of a metal are
examples where the microscopic description does not help in understanding the observed phe-
nomena. These phenomena are truly collective effects resulting from the interactions between
the elements of the system.
Fluctuations. Finally, it is important to stress that statistical physics goes beyond
classical thermodynamics. Indeed, it makes it possible to compute the fluctuations of
physical quantities, something that is out of reach for classical thermodynamics. This is
all the more important as it will be shown that the knowledge of fluctuations allows deriv-
ing the non-equilibrium properties of physical systems. For example, knowing the voltage
fluctuations of an electric circuit allows deriving its impedance and therefore its response to
an external excitation such as an applied voltage.
Summary of equilibrium statistical physics. The purpose of the first chapter is to in-
troduce the key concepts required to describe a system with a statistical approach. We will
first define what a state of a physical system is. We will then introduce the probability that
the system is in a particular state. With that, we will be able to compute mean physical
quantities. Hence, it follows that finding this probability is the fundamental task. Once
this is achieved, the rest of the lectures will consist in learning how to use this knowledge
in practice to compute physical quantities of interest, such as a state equation.
In order to illustrate the power of the statistical physics formalism, we will show that we
can recover the basic laws of thermodynamics. We will also show that many laws of clas-
sical thermodynamics such as the perfect gas state equation can be derived. When doing
so, we will realize that one of the most remarkable features of statistical physics is that
an oversimplified model of the microscopic systems often produces a remarkably accurate
macroscopic law.
Outline. Non-equilibrium phenomena are the subject of chapters 4-9. We will first
introduce the general properties of linear systems. We will then establish a very general
relation connecting fluctuations and linear response known as the Fluctuation-Dissipation
theorem. We will then introduce the Langevin model which is an alternative technique to
model fluctuations. The next step will be to introduce the Boltzmann equation, which allows
us to discuss transport phenomena. Finally, we will discuss the basic ideas of irreversible
thermodynamics, a theory introduced to extend the capabilities of classical thermodynamics
to non-equilibrium situations.

1.2 State of a system. Statistical average


1.2.1 State of a system in classical and in quantum mechanics. Phase space
The concept of state is an important concept in quantum physics. It is known that the state
of a physical system is fully characterized by the knowledge of the quantum numbers. The

quantum numbers are the eigenvalues of a complete ensemble of observables commuting


with the Hamiltonian. For instance, the state of a particle in a three dimensional potential
well is given by three quantum numbers specifying the particle momentum. Another quan-
tum number needs to be included to account for the spin state. For a molecule, the center-of-
mass degrees of freedom are described by these three quantum numbers. Two additional quan-
tum numbers need to be included to account for the angular momentum and its projection
along a fixed axis, and another one to account for the vibration.
The classical description of the state of a particle is based on the knowledge of the initial
conditions: position r and momentum p. A rotational degree of freedom is described by
the pair of conjugate variables: angular position θ and angular momentum Jθ. With these initial conditions and
the knowledge of all forces applied to a system, the movement can be predicted so that
the system is entirely determined. In the case of a gas containing N atoms, the classical
description of the state requires the position and momentum of all the atoms, i.e. it requires
6N values. These 6N coordinates are considered to be the coordinates of a state in the
so-called "phase space". In other words, a point in phase space corresponds to a particular
state of the system.
In summary,
- a classical state is characterized by a point in phase space.

- a quantum state is characterized by a set of quantum numbers.

1.2.2 Statistical (Ensemble) average and mean value of an observable


Obviously, statistical physics is about averaging. The purpose of this section is to clarify
what we mean by averaging. First of all, we establish the distinction between statistical
average and quantum average.
Let us first recall that if a physical system is in a state φ which is an eigenvector of an
observable Ô, the outcome of the measurement is certain. If, by contrast, the system is in a
state |r⟩ described by a linear superposition of eigenstates,
$$|r\rangle = \sum_i c_i\,|\phi_i\rangle,$$

the measurement may produce different results. The probability of obtaining a given result
Oi corresponding to the eigenstate φi is given by the square modulus |ci|². In summary,
quantum physics introduces the concept of the mean value of the outcome of a measurement
for a given physical quantity and a given physical state. What is fundamentally quantum is
the fact that although the state of the system is fully known and denoted |r⟩, the outcome
of a measurement may not be fully determined. The mean value of the
observable is then given by
$$\langle r|\hat{O}|r\rangle.$$

A mean value in statistical physics has a completely different origin. It is due to the lack
of knowledge of the state occupied by the system. For instance, when dealing with a gas, we
do not know the positions and momenta of the N atoms. In other words, we need
to assign a probability that the system is in state r. The key problem of statistical physics is
to find this probability. Once this probability is known, a mean value can be computed and
therefore the physical quantities can be computed.

Let us denote Pr the probability that the system occupies the quantum state |r⟩. Then,
we know that if the system is in this state, the mean outcome of a measurement of the ob-
servable Ô is ⟨r|Ô|r⟩. The statistical average $\overline{O}$ of the measurement, given the uncertainty
on which state is actually occupied, is thus given by
$$\overline{O} = \sum_r P_r\,\langle r|\hat{O}|r\rangle.$$

We use a bar to denote the statistical average to emphasize the difference with the quan-
tum average. Let us now make three important remarks:
i) If we know the probability law, it is possible to derive the mean values. For instance,
we can derive the mean energy of a gas of N non-interacting atoms by adding the kinetic
energy of the N atoms and averaging over all possible states. For a system of N spins with
unknown orientations, it will be possible to derive the mean magnetization, etc.
ii) A very important feature of a statistical approach is that if the probabilities are known,
then one can also compute the fluctuations of the physical quantities. This is a very impor-
tant property that goes well beyond the concept of average value of a quantity at equilibrium.
This is a major contribution of statistical physics.
iii) The key concept is the probability that a system occupies a given state. It is very
important to note that such a probability can be assigned to a system independently of its
size. In other words, nothing prevents us from using statistical physics when dealing
with a system that contains only three atoms. The specificity of a very small system is that the
fluctuations will be very large. However, all the concepts that we introduce can be applied.
Remark : Although the definition of the statistical average was introduced above using
a quantum description of the system, we emphasize that the concept of statistical average
can also be used in a classical framework. The bottom line is that we consider a system that can
be in different microscopic (classical or quantum) states with different probabilities.

1.2.3 The concept of statistical average and ensemble average


In this paragraph, we give a simple picture of the concept of ensemble average. The key
point is to realize that statistical average and ensemble average are the same thing. Sta-
tistical average is a mathematical terminology while ensemble average is the terminology
introduced by Gibbs in thermodynamics. The concept of ensemble probability can be illus-
trated as follows. Let us consider an isolated physical system consisting of N atoms in a
volume V. As already mentioned, a microscopic state is characterized by 6N real numbers
giving the positions and momenta of the atoms.¹ From a macroscopic point of view, the
system is characterized by its volume, its temperature and its fixed energy. There is a very
large number of microscopic systems that are compatible with the macroscopic constraints.
In order to define a probability for each particular microscopic state, let us imagine that
we produce an ensemble of a very large number of realizations of the system and that we
can measure the exact microscopic state of each of them. Note that each realization has
the same macroscopic properties but corresponds to different microscopic states. It is then
possible to perform a histogram of the number of times that each state was observed. After
normalization, this histogram yields a probability density function. The (imaginary) collec-
tion of microscopic realizations of the system is the so-called "representative ensemble" of
the system. With this simple picture of the procedure used to define the probability density
¹ For the sake of simplicity, the issue of distinguishability of the atoms is not discussed at this stage.

function, it is clearly seen that the probability density can be defined for systems with either
a large or a small number of atoms N. What is needed is a large number of realizations of the
system in the (fictitious) ensemble of realizations.
In summary, there is no need to have a large number of particles to use a statistical
physics approach. The statistical character of the approach comes from the fact that the
microscopic state of the system is unknown. The reason for a statistical model is not related
to the number of particles in the system. It will be seen later that the number of particles
does play a significant role regarding the fluctuations: the relative fluctuations will decay as
$1/\sqrt{N}$. Hence, systems with a small number of particles will display large fluctuations, so
that a statistical approach becomes essential.

1.2.4 Thermodynamic equilibrium


The concept of equilibrium is borrowed from classical thermodynamics. It is a relatively
intuitive concept. In order to define this concept, let us start from an example. We consider
a volume V containing N molecules. It could be for instance a bottle with cold beer. When
opening the bottle, gas escapes from the bottle so that there is an exchange of
matter. After one hour, the bottle will be at ambient temperature as a consequence of a heat
flux. After a few hours, both the mass flux and the heat flux will vanish. Accordingly, the
chemical potential and the temperature will become uniform and stationary. These are the
two key conditions that define equilibrium:
- stationary regime;
- no fluxes (or equivalently, all the intensive parameters such as temperature or pressure or
chemical potential are uniform).

1.2.5 Ergodicity
In order to define the probability, we have introduced the concept of ensemble of micro-
scopic realizations of the same macroscopic state. An alternative approach consists in
studying a single system as a function of time. One could think of measuring all the states
occupied by the system as a function of time and derive from this procedure the probability
density. This would be a time average.
A system is ergodic if the time average is equal to the ensemble average. In other
words, a system is ergodic if the time spent by a given system at each point in phase space
is proportional to the probability density.
The question of ergodicity is often emphasized because it provides an answer
to the following question. In practice, one is interested in a given system. For example, if
we study a gas in a volume V with N atoms, we make measurements on this single system.
Hence, the question arises: why does an average over a collection (an ensemble) of microscopic
realizations give a relevant model of the experimental measurements performed on a single
system? Assuming that the experimental measurement yields a time average
and assuming that the system is ergodic, we can expect an ensemble average to be a good
model of the experimental measurements. While this argument seems plausible, it cannot be
used in the vast majority of experiments, as the time required to explore all of phase space
(and therefore perform a meaningful time average) is usually many orders of magnitude
longer than the duration of the actual measurements.
So it has to be concluded that if the ensemble average provides a good agreement with
the measurements, it is because the macroscopic quantities which are measured are not

sensitive to the details of the microscopic realizations. This argument can be supported by
the direct calculation of the fluctuations over different realizations. We will perform these
calculations and show that indeed, for large systems, the fluctuations are often negligible.

1.3 Summing over all possible states. The classical approximation.
When computing the mean value of a physical quantity, we need to sum over all accessible
states. Actually performing the sum over all these states may be tech-
nically difficult. From the computational point of view, it may be convenient to replace a
sum by an integral. A quantum description is based on a discrete sum over all quantum
numbers. Instead, a classical description can use an integral over phase space. Mov-
ing from the quantum description to the classical description, namely, replacing a sum over
quantum numbers by an integral over phase space, is the so-called "classical approxi-
mation". This approximation is valid if the energy difference between two adjacent states
in the energy spectrum of the system is such that:

$$\Delta E \ll k_B T.$$
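As a rough numerical sketch of this criterion (the values below — a helium atom in a 1 cm box at 300 K — are illustrative choices, not taken from the text), one can compare the spacing of the translational levels of a particle in a box with kBT:

```python
# Order-of-magnitude check of the criterion Delta E << k_B T for the
# translational levels of a helium atom in a 1 cm box (illustrative values).
h = 6.626e-34    # Planck constant (J s)
kB = 1.381e-23   # Boltzmann constant (J/K)
m = 6.64e-27     # mass of a helium atom (kg)
L = 1e-2         # box size (m)
T = 300.0        # temperature (K)

# Particle-in-a-box spectrum: E_n = n^2 h^2 / (8 m L^2); the first gap is
# E_2 - E_1 = 3 h^2 / (8 m L^2).
dE = 3 * h**2 / (8 * m * L**2)
print(f"Delta E = {dE:.1e} J, k_B T = {kB * T:.1e} J, ratio = {dE / (kB * T):.1e}")
# The ratio is ~1e-16: the spectrum is effectively continuous, so the
# classical approximation is excellent for translational motion.
```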
However, some care has to be taken when using an integral over phase space. The
sum over quantum numbers is dimensionless, whereas an integral over phase space is not.
For example, the volume of phase space associated with the position
and momentum of one atom has the dimension of m⁶ kg³ s⁻³. Furthermore, if a state were
associated with a single point, there would be an infinite number of states. We need to introduce
the elementary volume associated with a single state, and this elementary volume must have a
dimension. It can be shown that this elementary volume is given by $h^{N_{df}}$, where $N_{df}$ is the
number of degrees of freedom. We call a degree of freedom a pair of conjugate variables x
and px in phase space. For instance, in the case of a single atom, $N_{df} = 3$, corresponding
to the three axes x, y and z. For N atoms, we have $N_{df} = 3N$. As a practical way of
remembering the elementary volume associated with a single state, the Heisenberg uncertainty
principle can be used. It states that ∆x∆px ≈ h. Hence, a single state cannot be a point
(perfect knowledge of position and momentum), while an area h is consistent with the Heisenberg
principle. This argument is by no means a proof, but it can be used as a practical mnemonic.
More generally, the number of states dN in a system with $N_{df}$ degrees of freedom is
given by the ratio of the volume element in phase space dΓ to the elementary volume
$h^{N_{df}}$:
$$dN \approx \frac{d\Gamma}{h^{N_{df}}}.$$
Here, we give this relation without a proof. However, its validity can be checked easily
using simple systems. Let us consider the case of a simple harmonic oscillator. In what
follows, we derive the number of states using either the quantum approach or the classical
approach. From a classical point of view, we know that there are two variables x and px .
The classical energy of the harmonic oscillator is given by

$$E = \frac{1}{2} m\omega^2 x^2 + \frac{p_x^2}{2m},$$

where ω is the eigenfrequency of the harmonic oscillator. The volume of phase space Γ
with energy lower than E is the area of the ellipse given by the equation $\frac{x^2}{a^2} + \frac{p_x^2}{b^2} = 1$, where
$a^2 = 2E/m\omega^2$ and $b^2 = 2mE$. This area is given by
$$\Gamma = \pi a b = \frac{2\pi E}{\omega}.$$
We now estimate the number of states with energy lower than E using quantum me-
chanics. It is known that the energy spectrum of a harmonic oscillator is given by
$E_n = (n + 1/2)\hbar\omega$. We now assume that $E \gg \hbar\omega$. It follows that the number of states with
energy between 0 and E is given by
$$N = E/\hbar\omega.$$

Upon comparison of the two results, it is seen that $N = \Gamma/h$. Note that the quantum
estimate of N is accurate only if N is much larger than 1. This means that the energy of the
system (which is typically of order kBT) is much larger than the quantum of energy $\hbar\omega$. In
other words, the energy spectrum can be considered to be continuous.
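This comparison can also be checked numerically. The following sketch (the oscillator frequency is an arbitrary illustrative value) evaluates both counts for an energy well above the quantum $\hbar\omega$:

```python
# Check that the semiclassical count N = Gamma/h matches the quantum count
# for a harmonic oscillator when E >> hbar*omega (illustrative values).
import math

h = 6.626e-34               # Planck constant (J s)
hbar = h / (2 * math.pi)
omega = 1e13                # arbitrary oscillator frequency (rad/s)
E = 1000 * hbar * omega     # an energy far above the quantum of energy

Gamma = 2 * math.pi * E / omega   # phase-space area of the ellipse
N_classical = Gamma / h           # semiclassical number of states
N_quantum = E / (hbar * omega)    # number of levels (n + 1/2) hbar omega below E

print(N_classical, N_quantum)     # both equal 1000: the two counts coincide
```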
In summary, if the quantum of energy is much smaller than the actual energy of the
system, it is possible to move from a quantum description to a classical description of the
states of the system. This is called the classical approximation. From a practical point of
view, one has to compute sums over all states weighted by the statistical probability. The
classical approximation enables us to replace the sum over discrete states by an integral over the
phase space volume with the proper normalization:
$$\sum_r \longrightarrow \int_\Gamma \frac{d\Gamma}{h^{N_{df}}}.$$
In the case of N atoms with no other degrees of freedom than their translational degrees
of freedom, we have:
$$\sum_r \longrightarrow \prod_{i=1}^{N} \int \frac{dx_i\, dy_i\, dz_i\, dp_{x_i}\, dp_{y_i}\, dp_{z_i}}{h^3}.$$

It is often practical to perform the sum by grouping all states r with the same energy En. We
then need to introduce the degeneracy gn of the energy level En. The sum can then be cast
in the form:
$$\sum_r \longrightarrow \sum_n g_n,$$
where r is a label for a state and n is a label for an energy level. If the classical approx-
imation is valid, then a similar procedure can be used when performing the integral over
the phase space. We then replace the integral over dΓ by an integral over the energy. We
also need to introduce the number of states with energy in the interval [E, E + dE] which
is given by g(E)dE. Here, we have introduced the function g(E), called the density of states.
Finally, we can write:
$$\sum_r \longrightarrow \sum_n g_n \xrightarrow{\text{classical approximation}} \int g(E)\, dE.$$

It will be seen later on that g(E) is not a mere technical definition but instead a quantity
that plays a fundamental role in the physical description of the system.

1.4 Fundamental Principle of Statistical Physics


1.4.1 Probability of a state of an isolated system. Microcanonical ensemble
Definition

We now address the central question of statistical physics. How can we assign a probability
to a given state? We first consider a system that is isolated and at equilibrium. Hence, the energy
of the system takes a given value E. It follows that we have to consider only states with
energy E. The states which are compatible with the information that we have about the
system (e.g. energy, number of particles, volume) are called accessible states. Having
restricted the number of possible states, we now need to assign a probability to each of
them. As we do not have any additional information allowing us to suppose that one state
is more likely than another, we have to assign the same probability to all of them.
Any other choice would require some additional information to be justified. Given this
consideration, we take as a postulate that all states have equal probability at equilibrium.
This statement is the fundamental principle of statistical physics:

For an isolated system at equilibrium, all accessible states have the same probability.

This choice is justified a posteriori by the fact that the theory succeeds in describing the
experimental observations. The ensemble corresponding to isolated systems is called mi-
crocanonical ensemble. The corresponding probability is called microcanonical probability.

Discussion

In this paragraph, we want to show that the fundamental principle does play a role analogous
to the second principle in classical thermodynamics. We know that the second principle is
the principle that allows us to find the equilibrium state of the system among all possible states.
At first glance, the fundamental principle does not enable such a choice, as it postulates that
all states are equivalent. In order to clarify this issue, we emphasize that the fundamen-
tal principle states that all microscopic states have the same probability. The question of
the choice of the equilibrium state in classical thermodynamics is about the choice of a
macroscopic state. So the actual question is : How many microscopic states correspond to
a given macroscopic state ? The equilibrium macroscopic state is the one corresponding to
the largest number of microscopic states.
Let us take an example. We consider a simple model of paramagnetism. The system
consists of N spins 1/2 without interactions. No magnetic field is applied so that each spin
has a probability 1/2 to be in the state + (sz = 1/2) and a probability 1/2 to be in the
state − (sz = −1/2). We expect such a system to have a zero magnetic moment². We
assume that the spins are localized in sites in a crystal so that they are distinguishable. A
given state is specified by the list of the orientation of the N spins (+, +, −, +, −, −, .....).
According to the fundamental principle, the state (+, +, +, +, +, ..., +, +) where all the
spins are in the state + has the same probability as a state with N/2 spins + and N/2 spins −.
This seems counterintuitive, as we expect the system to be in a state with no magnetization.
To resolve this apparent paradox, we clarify the difference between a microscopic state
(+, +, −, −, −) and a macroscopic state given by the number n+ of spins +.
² We remind the reader that the magnetic moment is proportional to the spin.

We do expect the state n+ = N/2 to be more likely than the state n+ = N . However,
the microscopic state (+, +, +, ..., +, +) has indeed the same probability as any other state.
The key point is that there is only one microscopic state corresponding to n+ = N whereas
there are $N!/[n_+!(N-n_+)!]$ different microscopic states corresponding to the macroscopic
state n+. The observed macroscopic state is the one corresponding to the largest number
of microscopic states. It can be checked, using $\ln N! \approx N \ln N - N$, that it corresponds to
$n_+ = N/2$.
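To make this concrete, a small numerical illustration (N = 50 spins is an arbitrary choice) shows how sharply the multiplicity peaks at $n_+ = N/2$:

```python
# Multiplicity Omega(n+) = N!/(n+!(N-n+)!) of the macrostate with n+ up spins,
# for an illustrative system of N = 50 distinguishable spins 1/2.
from math import comb

N = 50
omega = {n: comb(N, n) for n in range(N + 1)}

print(omega[N])                   # 1: a single microstate has all spins up
print(omega[N // 2])              # ~1.26e14 microstates for n+ = N/2
print(max(omega, key=omega.get))  # 25 = N/2: the multiplicity is maximal there
```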
The key feature that allows us to decide which "macroscopic" state is actually observed is
therefore the number of microscopic states corresponding to a given macroscopic situation.
Let us denote by Ω this number. The entropy introduced by L. Boltzmann is S =
kB ln Ω, so that maximizing the entropy and maximizing the number of microscopic states
corresponding to a given macroscopic situation are one and the same thing.

1.4.2 Statistical Entropy


The concept of statistical entropy was introduced by L. Boltzmann when he was
studying gas properties from a microscopic point of view. A very important development
was later introduced by Shannon when he developed a mathematical theory of information.
From the point of view of statistical physics, its definition is mostly motivated
by its ability to recover classical thermodynamics. Information theory provides a deep
foundation for the concept of entropy. This concept is very useful in information processing,
image processing, and more generally when studying stochastic processes. The statistical
entropy first introduced by Boltzmann is defined by:
$$S = -k_B \sum_r P_r \ln P_r,$$

where Pr is the probability that the system is in a microscopic state r. Boltzmann had
two goals: to find the velocity distribution derived by Maxwell and to establish a form of
the entropy that would allow showing that entropy can only increase. Note that this form
of the statistical entropy allows us to recover the simple Boltzmann formula in the case of an
isolated system. Indeed, if Pr = 1/Ω, then S = kB ln Ω.

The information theory point of view


The goal of this section is to give a brief overview of the information theory point of view.
A key feature of information theory is to provide a natural way of introducing the statistical
entropy as opposed to the standard formulation of statistical physics where it is postulated.
The basic question of information theory is the question of introducing a quantitative
way of measuring the amount of information in an event. What do we mean by "event"?
It is not a soccer match or a political election. We will take two examples. We can think
of tossing a die four times. The event is an ordered list of outcomes such as 1, 6, 4, 2. A
second example is a physical system that can be in different microscopic states. The event
is the choice of a given state r or, in other words, the outcome of a measurement on the
system allowing us to define its state. Our goal here is to illustrate the idea that the entropy
is a measure of the amount of missing information on the system. Let us first consider a
simple case: we know that the system is in state r0 so that Pr0 = 1 and Pr = 0 for all
other states r. Then, there is no missing information, as we know exactly which state
the system is in. In that case, the entropy formula gives S = 0. For any other distribution,

the entropy will provide a positive result S > 0. With this first example, we indeed see
that the entropy increases as the amount of missing information increases. In other words,
the missing information is the fact that we do not know which microscopic state the
system is in. If the number of possible microscopic states increases, then the amount of missing
information increases.
Let us now take a step back and think about the link between information and proba-
bility. We toss the die 600 times and find that the outcome is always 1. This is an
extremely unlikely result. It is so unlikely that we will probably think that the die is
loaded and will ask to take a close look at it. By contrast, if the outcome is 100
times the value 1, 98 times the value 2, etc., we will not be surprised and will not pay much
attention. In the first case, the event is very unlikely and it gets our
attention. It contains a lot of information: the die is loaded! In the second case, the event
is very likely and does not contain much information.
Hence, information increases as probability decreases. If we introduce a function H(Pr )
measuring the amount of information, it has to be a decreasing function of Pr . We now go
a step further and consider the particular case of the event consisting of tossing the die
four times. Obviously, this event A can be split into two independent events B and C: tossing
the die twice and repeating it. As the tosses are independent, the probability is given by
P(A) = P(B and C) = P(B)P(C). It follows that H[P(A)] = H[P(B)P(C)]. We
now look for a measure of information that is additive. It seems natural to define H
as a function such that the amounts of information contained in B and in C can be added. We
then require H[P(B)P(C)] = H[P(B)] + H[P(C)]. This entails that H(P) = −K ln(P),
where K is an arbitrary positive constant.
We have thus found that the measure of missing information must be given by
−K ln P in order to satisfy the two basic requirements: a decreasing function
of P and an additive function. We now introduce the average value of this missing information
over all possible events and call it entropy:
$$S = -K \sum_r P_r \ln P_r.$$
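As an illustration of this formula (the probability values below are invented for the example), one can compare the missing information, taking K = 1, for a fair die, a loaded die, and a perfectly known outcome:

```python
# Missing information S = -K sum_r P_r ln P_r (with K = 1) for three
# illustrative probability distributions over the faces of a die.
import math

def missing_info(probs):
    # Terms with P_r = 0 contribute nothing (P ln P -> 0 as P -> 0).
    return -sum(p * math.log(p) for p in probs if p > 0)

fair = [1 / 6] * 6
loaded = [0.95, 0.01, 0.01, 0.01, 0.01, 0.01]
certain = [1.0, 0, 0, 0, 0, 0]

print(missing_info(fair))     # ln 6 ~ 1.79: maximal ignorance
print(missing_info(loaded))   # ~0.28: much less missing information
print(missing_info(certain))  # 0.0: the outcome is known, so S = 0
```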

1.4.3 Second formulation of the fundamental principle: maximum entropy


It is possible to show that the only probability law that maximizes the statistical entropy is
the equiprobability law. This can be used as an alternative formulation of the fundamental
principle:
At equilibrium, the probability distribution is the distribution consistent with all
constraints that maximizes the entropy.
With this new approach, finding a probability distribution amounts to solving a maximiza-
tion problem given some constraints. It has to be noted that no assumption regarding the
interaction of the system with its environment have been made. Hence, this form of the
principle can be directly used to deal with systems exchanging energy or particles with their
environment.
Chapter 2

System in contact with a thermostat. Canonical ensemble

2.1 Introduction
In this chapter, we study systems in contact with a thermostat. A thermostat can exchange
energy and has a fixed temperature. It can be viewed as a system with an infinite heat ca-
pacity. The first purpose of this chapter is to introduce a technique to derive the probability
for a system to be in a given state from the fundamental principle. The second purpose is to
study how to extract physical quantities (mean value and fluctuations) from the knowledge
of these probabilities. We will see that one quantity plays a key role in this context: the parti-
tion function. The general technique for studying a system in contact with a thermostat will
be illustrated using the example of a perfect gas. Finally, we will compare the formalism of
statistical physics with classical thermodynamics thereby revisiting concepts such as work
and heat in the statistical physics framework.

2.2 Probability law for an isolated system. Microcanonical ensemble
In the previous chapter, we have given two different formulations of the fundamental prin-
ciple of statistical physics. The first formulation states that the probabilities of all accessible
states are equal for an isolated system in equilibrium. The second formulation states that the
probability law must maximise the statistical entropy while accounting for the constraints
imposed on the system. In this paragraph, we will apply the second formulation to an iso-
lated system and check that we recover the first formulation. This will be a good example
to show how to use in practice the principle of maximum entropy to derive an explicit form
of the probability law.
We denote Pr the probability that the system is in state r. If the Pr
were independent variables, the condition of maximum entropy would be:

$$\frac{\partial S}{\partial P_r} = 0, \qquad (2.1)$$
for all states r. Yet, the probabilities are not independent. Indeed, the normalization condi-
tion yields:


$$\sum_r P_r = 1. \qquad (2.2)$$

Hence, we need to deal with the problem of finding the values Pr that maximize the
entropy given the constraint $\sum_r P_r = 1$. This problem of finding an extremum in the
presence of constraints of the form $f(P_1, ..., P_r, ...) = 0$ can be solved using the Lagrangian
multiplier technique. Here, we only show how to use this technique without giving any
proof.

2.2.1 Lagrangian multiplier technique


The problem is to find the values of the variables $x_1, x_2, ..., x_N$ that maximize or minimize
the function $f(x_1, x_2, ..., x_N)$ given the constraints $g(x_1, x_2, ..., x_N) = 0$ and $h(x_1, x_2, ..., x_N) = 0$. To proceed, the Lagrangian multiplier technique introduces the new function
$$f(x_1, x_2, ..., x_N) + \lambda_1 g(x_1, x_2, ..., x_N) + \lambda_2 h(x_1, x_2, ..., x_N), \qquad (2.3)$$
where a parameter $\lambda_i$, called a Lagrangian multiplier, has been introduced for each con-
straint. Here, we have introduced two constraints, but the procedure can be reproduced with
any number of constraints. We now search for the maximum of this new function, considering
that the variables $x_1, x_2, ..., x_N$ are independent. The final result depends on the unknown
Lagrangian multipliers. Their values can be found using the constraints, but this step can be
avoided in most cases, as we will see.
As an example, you can quickly check the efficiency of the technique with the following
exercise. What are the values of the sides a and b of a rectangle with perimeter P that
maximize the area of the rectangle? Here, the variables are a and b, the function is f(a, b) =
ab, and the constraint is g(a, b) = 2(a + b) − P = 0. You should find that the solution is a
square, so that a = b = P/4. We can further check that the extremum is indeed a maximum
in this case. A worked solution is sketched below.
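A sketch of the worked solution, in the notation of this section: introduce the function $\mathcal{L}(a, b) = ab + \lambda[2(a + b) - P]$. Setting $\partial\mathcal{L}/\partial a = b + 2\lambda = 0$ and $\partial\mathcal{L}/\partial b = a + 2\lambda = 0$ gives $a = b = -2\lambda$, and the constraint $2(a + b) = P$ then yields $a = b = P/4$.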

2.2.2 Example: isolated system, microcanonical ensemble


Let us first apply the technique to the case of an isolated system. The ensemble of realiza-
tions of such a system is called the microcanonical ensemble.
We seek the probability law that maximizes the statistical entropy in the presence of
the constraint $\sum_r P_r = 1$. Applying the Lagrangian multiplier technique, we introduce the
function:
$$-k_B \sum_r P_r \ln P_r + \lambda \left( \sum_r P_r - 1 \right). \qquad (2.4)$$

By taking the derivative with respect to $P_{r_0}$, we find
$$-k_B \ln P_{r_0} - k_B + \lambda = 0. \qquad (2.5)$$

Hence, $P_{r_0}$ depends on λ but does not depend on the state label $r_0$, so that the probability
is the same for all states. Accounting for the constraint $\sum_r P_r = 1$, we immediately find
$P_r = 1/\Omega$, where Ω is the number of accessible states, in agreement with the first formulation
of the fundamental principle.

2.3 System in contact with a thermostat. Canonical ensemble


In this section, we consider systems that can exchange energy and only energy with a ther-
mostat. As the energy is not fixed, it fluctuates around a mean value $\overline{E}$, which is fixed as
the system is stationary at equilibrium. Hence, we have the equality $\sum_r P_r E_r = \overline{E}$. This
is an additional constraint on the values of the probabilities. We now proceed to derive the
probability law by maximizing the entropy accounting for the constraints.

2.3.1 Probability law for a canonical ensemble


We seek the probability law Pr that maximizes the entropy given the following constraints:

$$\sum_r P_r = 1, \qquad (2.6)$$
$$\sum_r P_r E_r = \overline{E}. \qquad (2.7)$$

We have to maximise the new function:


$$-k_B \sum_r P_r \ln P_r + \lambda_1 \left( \sum_r P_r - 1 \right) + \lambda_2 \left( \sum_r P_r E_r - \overline{E} \right), \qquad (2.8)$$

where we have introduced two Lagrangian multipliers $\lambda_1$, $\lambda_2$. Writing that the derivative
with respect to $P_{r_0}$ is zero yields:
$$-k_B \ln P_{r_0} - k_B + \lambda_1 + \lambda_2 E_{r_0} = 0. \qquad (2.9)$$
It is seen that Pr0 depends on the energy Er0 and can be cast in the form:
$$P_{r_0} = \frac{\exp(-\beta E_{r_0})}{Z}. \qquad (2.10)$$
In the previous equation, instead of solving in terms of the Lagrangian multipliers, we
have introduced a parameter β called the statistical temperature. We have also introduced a
normalization factor Z called the partition function. By definition, it is given by:
$$Z = \sum_r \exp(-\beta E_r). \qquad (2.11)$$

Hence, we have been able to derive an explicit form of the probability Pr that the system
is in a given state r from the principle of maximum entropy. We now show how to use this
probability law to derive physical quantities.
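Before doing so, here is a minimal numerical sketch of Eqs. (2.10)-(2.11) for a toy two-level system (the energy gap and temperature are arbitrary illustrative values):

```python
# Canonical probabilities and partition function for a toy two-level
# system with energies 0 and eps (illustrative values, not from the text).
import math

kB = 1.381e-23           # Boltzmann constant (J/K)
T = 300.0                # temperature of the thermostat (K)
beta = 1.0 / (kB * T)    # statistical temperature
eps = 1e-21              # energy gap (J), arbitrary choice

energies = [0.0, eps]
Z = sum(math.exp(-beta * E) for E in energies)    # Eq. (2.11)
P = [math.exp(-beta * E) / Z for E in energies]   # Eq. (2.10)

print(Z, P, sum(P))   # probabilities are Boltzmann-weighted and sum to 1
```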

2.3.2 Mean energy


Our first example is to derive the mean energy. By definition, it is given by:
$$\overline{E} = \sum_r P_r E_r = \sum_r E_r \frac{\exp(-\beta E_r)}{Z}. \qquad (2.12)$$

We now note that the numerator can be derived from the partition function by taking a
derivative with respect to β:
$$\frac{\partial Z}{\partial \beta} = -\sum_r E_r \exp(-\beta E_r), \qquad (2.13)$$

so that we can write the mean energy as:

$$\overline{E} = -\frac{\partial \ln Z}{\partial \beta}. \qquad (2.14)$$

We stress that this expression reveals another aspect of the partition function Z. While it
was first introduced as a normalization factor, it now appears that it plays the role of a
potential which contains information on the system. This is the first time that we observe
this feature. It will be seen in what follows that the partition function can be related to the
potentials introduced in classical thermodynamics. It is a fundamental quantity in statistical
physics.

2.3.3 Energy fluctuations


We have used the probability law to derive the mean energy. We can do much more. We can,
for instance, compute the mean square deviation to characterize the fluctuations around
the mean value. At this point, we note that classical thermodynamics does not provide the
fluctuations of physical quantities. The ability to derive fluctuations is an important
feature of statistical physics. Let us compute the mean square deviation:
$$\overline{(E - \overline{E})^2} = \overline{E^2} - \overline{E}^2. \qquad (2.15)$$

$$\overline{E^2} = \sum_r E_r^2 \frac{\exp(-\beta E_r)}{Z} = \frac{1}{Z} \frac{\partial^2 Z}{\partial \beta^2}, \qquad (2.16)$$

$$\overline{E}^2 = \left[ \sum_r E_r \frac{\exp(-\beta E_r)}{Z} \right]^2 = \left( \frac{1}{Z} \frac{\partial Z}{\partial \beta} \right)^2. \qquad (2.17)$$

Hence, we find:

$$\overline{(E - \overline{E})^2} = \frac{1}{Z} \frac{\partial^2 Z}{\partial \beta^2} - \left( \frac{1}{Z} \frac{\partial Z}{\partial \beta} \right)^2 = \frac{\partial}{\partial \beta} \left( \frac{1}{Z} \frac{\partial Z}{\partial \beta} \right) = \frac{\partial^2 \ln Z}{\partial \beta^2}. \qquad (2.18)$$
Again, we note that this quantity can be derived from the knowledge of the partition
function.

2.3.4 Entropy. Link between the statistical temperature and the temperature
in classical thermodynamics
The calculation of the entropy is straightforward. It suffices to insert the form of the proba-
bility into the definition:
$$S = -k_B \sum_r P_r \ln P_r = -k_B \sum_r P_r [-\beta E_r - \ln Z] = k_B \ln Z + k_B \beta \overline{E}. \qquad (2.19)$$

Note that we can compute the entropy and not only an entropy variation during a trans-
formation, in marked contrast with classical thermodynamics. We now proceed to identify
the temperature. We use the thermodynamic identity:
$$\frac{1}{T} = \frac{\partial S}{\partial U}, \qquad (2.20)$$

where U denotes the internal energy of the system. Here, we identify U with $\overline{E}$ and we note
that ln Z is an explicit function of β but does not depend on $\overline{E}$. Indeed, from its definition,
the partition function depends on β and on the list of energy levels of the system. We find:
$$\frac{\partial S}{\partial \overline{E}} = k_B \beta. \qquad (2.21)$$
By comparing with the definition of the temperature in classical thermodynamics, we find:
$$\beta = \frac{1}{k_B T}. \qquad (2.22)$$
We can thus recover the free energy:

$$F = \overline{E} - TS = -k_B T \ln Z, \qquad (2.23)$$

and remark that it is proportional to the logarithm of the partition function. This is the first example where
we can identify $-k_B T \ln Z$ with a thermodynamic potential.

2.3.5 Heat capacity cv . Connection between fluctuations and linear response


In this section, we compute a quantity of interest: the heat capacity at fixed volume. It
is given by $c_v = \partial \overline{E}/\partial T$. It will be seen that this quantity is related to the fluctuations of
the energy so that fluctuations acquire a very different status. Although fluctuations are
often seen in experimental physics as a source of error when performing measurements,
here, they will appear to contain useful information on the system. This is an example of a
general relation between the fluctuations and the linear response of a system (in the sense
that a small modification of the temperature ∆T entails a small modification of the energy
proportional to ∆T ). To proceed, we simply insert the form of the mean energy in the
definition of the heat capacity:

$$c_v = \frac{\partial \overline{E}}{\partial T} = \frac{\partial \overline{E}}{\partial \beta} \frac{\partial \beta}{\partial T} = \frac{1}{k_B T^2} \frac{\partial^2 \ln Z}{\partial \beta^2} = \frac{\overline{(E - \overline{E})^2}}{k_B T^2}. \qquad (2.24)$$
We find that the heat capacity, which is an example of a linear response coefficient, is
proportional to the mean square deviation (the variance) of the energy.
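This fluctuation-response relation can be verified numerically; the following sketch (a toy two-level system with an arbitrary gap eps) compares a finite-difference estimate of $c_v = \partial\overline{E}/\partial T$ with the variance route of Eq. (2.24):

```python
# Numerical check of Eq. (2.24) for a toy two-level system with gap eps:
# the heat capacity d<E>/dT equals the energy variance divided by k_B T^2.
import math

kB = 1.381e-23
eps = 1e-21   # illustrative energy gap (J)

def mean_and_variance(T):
    beta = 1.0 / (kB * T)
    p1 = math.exp(-beta * eps) / (1.0 + math.exp(-beta * eps))  # P(excited)
    E_mean = p1 * eps
    variance = p1 * eps**2 - E_mean**2   # <E^2> - <E>^2
    return E_mean, variance

T, dT = 300.0, 1e-3
cv_diff = (mean_and_variance(T + dT)[0] - mean_and_variance(T - dT)[0]) / (2 * dT)
cv_fluc = mean_and_variance(T)[1] / (kB * T**2)

print(cv_diff, cv_fluc)   # the two estimates agree to high accuracy
```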

2.3.6 Free energy and partition function


So far, we have introduced rather abstract concepts and notations. We need examples to
illustrate how these concepts can be applied to practical systems and derive useful relations.
We will use the perfect gas as the first example to show how to proceed. Before that, we
make a simple remark. We have seen that the knowledge of the partition function allows us to
derive the entropy, the free energy, etc. We will see later that we can also derive the state
equation from it. To make a long story short, it contains all the information on the
system at equilibrium. So, what does it take to compute the partition function? We simply
need a list of states r with their corresponding energies Er; in other words, we need to know
the spectrum of the system with the degeneracy of all the energy levels.
Hence, any model based on statistical physics simply requires identifying the list of
states with their energies. The rest is a matter of computing the partition function Z and
then deducing the relevant quantities from it.

We will introduce a specific chapter on quantum statistics. This is because counting
states for a quantum system of identical particles cannot be done as in classical physics
when the particles are indistinguishable. This is the fundamental reason for the difference
between quantum and classical statistics.
A second major assumption that will be used often in this introductory text is the as-
sumption of independent particles. It amounts to neglecting the interactions between particles, so
that the energy of a system consisting of many different particles is simply the sum of
the energies of the different particles. If this property holds, then the partition function
can be computed in a much simpler way.

2.4 Example: perfect gas


In this section, we illustrate the concepts introduced in this chapter using the perfect gas as
a simple model. The system contains N atoms in a volume V. The gas is in contact with a
thermostat at temperature $T = 1/(k_B \beta)$.
In this section, we will first provide a model of the physical system. The outcome of the
model is the spectrum of the system, namely, the list of states r with their energy Er . The
second step of the study is the computation of the partition function Z. We will then use
the formulas derived in this chapter to compute the mean energy, the energy fluctuations
and the entropy. We will finally derive the heat capacity and establish a general connection
between the heat capacity and the energy fluctuations.
Let us start with the model. A remarkable property of a statistical physics approach is
that a very crude microscopic model is often sufficient to capture the macroscopic behaviour
of the system. The perfect gas model is a good example. We consider that the atoms
are point-like atoms with mass m, position rn and momentum pn where n is an integer
n = 1, . . . , N. In what follows, we assume that the particles are distinguishable. This is not correct, and it will lead to inconsistencies when computing the entropy. We will show
in the chapter on quantum statistics how this issue can be fixed using a proper description
of indistinguishable particles. For the time being, we ignore this quantum property and
consider that the atoms have a label i. We neglect the interaction energy between atoms
so that the energy is the sum of the energies of the particles (assumption of independent
particles)1 . We neglect the internal energy such as electronic or nuclear energy. Hence, only
the kinetic energy of the atoms has to be taken into account. As discussed in chapter 1, the
classical approximation is valid for a macroscopic volume so that the energy quantization
for translational energy can be neglected. Hence, a state is fully characterized by a point
in phase space, namely, the list of position and momentum of all the atoms and its kinetic
energy:

r = {r1 , r2 , ..rn , ...rN ; p1 , p2 , ..pn , ..pN },


$$E_r = \sum_{n=1}^{N}\frac{p_n^2}{2m}. \qquad (2.25)$$

We can now compute the partition function.

1 An interaction energy term between two particles, such as V (r1 , r2 ), could not be assigned to the energy of particle 1 or of particle 2.

2.4.1 Partition function


By definition, the partition function is:
$$Z = \sum_r \exp[-\beta E_r] \qquad (2.26)$$

where $E_r = \sum_{n=1}^{N} p_n^2/2m$. As the classical approximation is valid, we can replace the sum over discrete states by an integral over phase space:
$$Z = \prod_{n=1}^{N}\int \frac{dx_n\,dy_n\,dz_n\,dp_{x,n}\,dp_{y,n}\,dp_{z,n}}{h^3}\,\exp\Big[-\beta\,\frac{p_n^2}{2m}\Big] = \zeta^N, \qquad (2.27)$$

where
$$\zeta = V\int \frac{4\pi p^2\,dp}{h^3}\,e^{-\beta p^2/2m}. \qquad (2.28)$$
Note that the factorization is made possible by the fact that the energy of the system is the sum of the energies of the independent particles, with no interaction terms. Hence, the problem reduces to the computation of a partition function ζ corresponding to a single particle. Using the identity $\int_0^{\infty} t^2\exp(-a t^2)\,dt = \frac{1}{4}\sqrt{\pi/a^3}$, we get
$$\zeta = V\left(\frac{2\pi m}{\beta h^2}\right)^{3/2}. \qquad (2.29)$$

Hence, the partition function is given by
$$Z = V^N\left(\frac{2\pi m}{\beta h^2}\right)^{3N/2}. \qquad (2.30)$$
Let us repeat that we have (incorrectly) assumed that the particles are distinguishable. It
will be shown later that in the classical limit (to be defined later) of a quantum approach,
the result can be cast in the form:
$$Z = \frac{\zeta^N}{N!}, \qquad (2.31)$$
where the term N ! accounts for the indistinguishability. Indeed, all states obtained by
permutations of identical particles were counted as different states although they should be
considered to be the same state. Hence the division by the number of permutations.
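As a quick numerical sanity check (a sketch; the mass, the volume and the particle number are arbitrary illustrative values), one can evaluate ln Z from Eqs. (2.30) and (2.31) and verify by a finite difference that −∂ln Z/∂β reproduces the mean energy 3N kB T/2 derived in the next subsection:

```python
import numpy as np
from scipy.special import gammaln  # ln(N!) without overflow

kB = 1.380649e-23     # J/K
h = 6.62607015e-34    # J s
m = 6.6e-27           # kg, roughly the mass of a helium atom (arbitrary choice)
N, V = 1.0e5, 1.0e-6  # particle number and volume (arbitrary choices)

def ln_Z(beta):
    """ln Z for the perfect gas, Eqs. (2.30)-(2.31), including the N! term."""
    ln_zeta = np.log(V) + 1.5 * np.log(2 * np.pi * m / (beta * h**2))
    return N * ln_zeta - gammaln(N + 1)

# Mean energy from E = -d(ln Z)/d(beta), evaluated by a finite difference.
T = 300.0
beta = 1.0 / (kB * T)
db = 1e-6 * beta
E = -(ln_Z(beta + db) - ln_Z(beta - db)) / (2 * db)
print(E, 1.5 * N * kB * T)  # both give (3/2) N kB T
```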

2.4.2 Thermodynamic properties of the perfect gas


We will now proceed to extract information on the thermodynamic properties of the perfect
gas using the general relations established in the chapter. We start by computing the mean
energy:

$$\overline{E} = -\frac{\partial \ln Z}{\partial \beta} = \frac{3N}{2\beta} = \frac{3N k_B T}{2}. \qquad (2.32)$$
Let us now study the energy fluctuations using Eq. (2.18).

$$\overline{E^2} - \overline{E}^2 = \frac{\partial^2 \ln Z}{\partial \beta^2} = \frac{3N}{2\beta^2}. \qquad (2.33)$$

The relative fluctuations are given by:


$$\frac{\Delta E}{\overline{E}} = \sqrt{\frac{2}{3}}\,\frac{1}{\sqrt{N}}. \qquad (2.34)$$
For a system containing an Avogadro number of atoms, it is seen that the relative energy
fluctuations are on the order of 10−12 so that they are negligible.

2.5 Microscopic interpretation of the concepts of heat and work


We have introduced a general framework which provides a systematic procedure to derive
the mean energy of a system. It is interesting to try to establish a connection with the
classical thermodynamics formulation of the internal energy. This connection will lead us
to revisit the concepts of heat and work in the framework of statistical physics.

2.5.1 Work and heat in classical thermodynamics


In the framework of classical thermodynamics, the variation of internal energy U in an
elementary transformation of a system is given by the first principle:
dU = δW + δQ. (2.35)
where δW is the work provided to the system and δQ is the heat provided to the system.
The internal energy U introduced in classical thermodynamics is the same physical quantity as $\overline{E}$ in statistical physics. We use the notation U when we work in the thermodynamics framework and $\overline{E}$ when we work in the framework of statistical physics. It is useful to recall the form taken by the elementary work. It stems from the basic definition of work in classical mechanics, the product of a force F and an elementary displacement dl. By analogy, the elementary work has the form δW = F dl, where F is a generalized force and dl is the elementary variation of a conjugated quantity. Let us illustrate this statement
with a few examples. The work provided by pressure forces can be written as −P dV; the work provided by the surface tension of a liquid is A dS, where A is the surface tension and dS the surface variation; the work done by the tension of a wire is T dl, where dl is the length variation; the work provided by an electrical generator is V dq, where V is a voltage difference and dq is the charge variation of the system; the work done by a polarization variation is E·dP, where P is the polarization density and E is the electric field; the work done by a magnetic field B on a magnetized medium is M·dB, where M is the magnetization. From this list of examples, we see that the elementary work is the product of a generalized force by an elementary variation of the conjugated quantity.
In classical thermodynamics, the heat exchanged in a reversible transformation can be
cast in the form δQ = T dS. Temperature then plays the role of the force and entropy the
role of the conjugated quantity.

2.5.2 Work and heat in statistical physics


Mean energy variation
Let us now turn to the statistical formulation of the energy variation. Starting from Eq.
(2.12), we find:
$$d\overline{E} = \sum_r E_r\,dP_r + \sum_r P_r\,dE_r. \qquad (2.36)$$

We are thus led to examine whether these two terms can be identified with the elementary heat and work given by Eq. (2.35). This is indeed the case in some particular situations. To proceed, we start with the calculation of the elementary variation of the entropy.

Entropy variation
Since we know the explicit form of the entropy as a function of the probabilities Pr , we can
derive the entropy variation:
$$dS = -k_B\sum_r d[P_r\ln(P_r)]. \qquad (2.37)$$
We note that the sum of the probabilities is equal to 1, so that $\sum_r dP_r = 0$. We thus get:
$$dS = -k_B\sum_r \ln(P_r)\,dP_r. \qquad (2.38)$$

Inserting the form of the canonical probability Pr = exp(−βEr)/Z yields:


$$dS = k_B\beta\sum_r E_r\,dP_r. \qquad (2.39)$$

If we now consider a reversible transformation, the entropy variation is only due to heat exchange, so that T dS = δQ. In that situation, we can identify $\sum_r E_r\,dP_r$:
$$\delta Q = T\,dS = \sum_r E_r\,dP_r. \qquad (2.40)$$

It is seen that providing heat to a system in a reversible transformation amounts to keeping the energy levels constant while modifying the probabilities of the different states. As the canonical probability depends only on the temperature, this amounts to saying that the temperature is changed, as one would expect.

Work
We now consider the second term $\sum_r P_r\,dE_r$. Let us again consider a particular transformation: an adiabatic (δQ = 0) transformation, so that dU = δW. We also consider that the transformation is isentropic ($dS = 0$, i.e. $\sum_r E_r\,dP_r = 0$, as was shown above). In that case, the energy variation $dU = d\overline{E}$ yields:
$$\delta W = \sum_r P_r\,dE_r. \qquad (2.41)$$
It is seen that the elementary work in a reversible and adiabatic transformation is associated with a modification of the energy of the states of the system. Let us illustrate this idea with the perfect gas. The work received by the system is given by −P dV in classical thermodynamics. We now revisit this issue from a microscopic point of view. We consider a vessel with volume V = Lx Ly Lz containing atoms. If the length Lx is modified using a moving piston, the kinetic energy levels of the particles, given by $n_x^2\,4\pi^2\hbar^2/(2m L_x^2)$, are also modified. If this modification is slow enough, it does not induce quantum transitions between states.
In summary, providing energy to the system through work amounts to change slowly
the energy of the states of the system.

Quasistatic transformation. Reversible and irreversible transformations


To obtain a microscopic interpretation of work and heat, we have considered particular
transformations. Here, we summarize the definitions of quasistatic transformation and re-
versible transformation.
Let us first discuss reversible and irreversible transformations. In the general case, an external operator can provide energy to a system by applying a generalized force which is different from the internal force in the system. For instance, a reservoir with a pressure PR different from the internal pressure P of the system can be used, so that the work provided is given by −PR dV and differs from the work received by the system, −P dV. The difference −(PR − P)dV between the work provided by the operator, δWop, and the work received by the system, δWr, is converted into heat and contributes to a modification of the probabilities corresponding to a temperature increase. This process is an entropy creation process. In classical thermodynamics, this situation is described as follows. The first principle yields:
dU = δQ + δWop = δQ + δWr + (δWop − δWr ). (2.42)
The thermodynamic identity can be cast in the form:
dU = T dS + δWr . (2.43)
Combining both equations, we obtain:
T dS = dU − δWr = δQ + (δWop − δWr ) = δQ + T dSirrev . (2.44)
A creation of entropy dSirrev, accounting for irreversible processes, is associated with the difference between the work provided by the operator and the work received by the system. The entropy creation is proportional to the difference (PR − P).
A quasistatic transformation is by definition a transformation where a system changes from an equilibrium situation to another equilibrium situation. In other words, the transfor-
mation is performed in such a way that all internal intensive parameters such as temperature,
pressure, etc, are kept homogeneous and well defined during the transformation. This re-
quires that the evolution of the system takes place over characteristic times longer than the
characteristic times of the natural evolution of the system. This condition does not mean
that the generalized forces (e.g. temperature, pressure, etc) of the system are equal to their
values in the thermostat. A finite difference of the generalized force may lead to entropy
creation as discussed for the example of the volume variation under a pressure difference.
A quasistatic transformation is reversible if the intensive variables of the system are equal to those of its environment. It is seen from the example discussed above that when the pressure difference (PR − P) is negligible, the entropy creation vanishes. Obviously, for a transfor-
mation to exist, the difference cannot be strictly zero. There must be a small temperature
difference to generate a flux, a small pressure difference to generate a volume change, a
small voltage difference to generate a current, etc. A reversible transformation is thus an
idealization of a quasistatic transformation when the differences of generalized forces tend
to zero.
Returning now to Eqs (2.36) and (2.39), we obtain:
$$d\overline{E} = T\,dS + \sum_r P_r\,dE_r. \qquad (2.45)$$
The comparison with (2.43) shows that the term $\sum_r P_r\,dE_r$ is equal to the work received by the system. It is equal to the work provided during a reversible transformation, δWrev. In brief,

$$\sum_r E_r\,dP_r = T\,dS, \qquad \sum_r P_r\,dE_r = \delta W_{\mathrm{rev}}. \qquad (2.46)$$

2.5.3 Work and generalized force. State equation


In this section, we show how it is possible to derive a state equation for the system. Again, it will be derived from the partition function. The procedure is to compare two forms of the work given to a system. On one hand, the work can be cast as the product of a generalized force F and a variation of a conjugated parameter dl: δW = F dl. On the other hand, the variation of internal energy in an isentropic transformation ($dS = 0$, i.e. $\sum_r E_r\,dP_r = 0$) is given by:
$$d\overline{E} = \delta W_{\mathrm{rev}} = \sum_r P_r\,dE_r. \qquad (2.47)$$
The energies Er are the eigenvalues of the Hamiltonian of the system. They change when the parameters of the Hamiltonian change. Let us consider a parameter l:
 
$$dE_r = \left(\frac{\partial E_r}{\partial l}\right) dl. \qquad (2.48)$$

The parameter l can be the volume of the system, a magnetic field, an electric field, etc. We then derive the corresponding generalized force F through the relation $\delta W = F\,dl = \sum_r P_r\,dE_r$:

$$\begin{aligned}\delta W &= F\,dl & (2.49)\\ &= \sum_r P_r\,\frac{\partial E_r}{\partial l}\,dl & (2.50)\\ &= \sum_r \frac{\exp(-\beta E_r)}{Z}\,\frac{\partial E_r}{\partial l}\,dl & (2.51)\\ &= -\frac{1}{\beta}\,\frac{\partial \ln Z}{\partial l}\,dl. & (2.52)\end{aligned}$$
The final result can be cast in the form:
$$F = -\frac{1}{\beta}\,\frac{\partial \ln Z}{\partial l}. \qquad (2.53)$$
This relation connects the generalized force with its conjugate variable. Hence, the relation
is the state equation connecting F and l.

2.5.4 Application to the perfect gas


We apply this formula to the perfect gas model derived previously. The work received by
the system is −P dV . The generalized force is −P , the conjugate variable of the pressure
is the volume V . We thus get

$$P = \frac{1}{\beta}\,\frac{\partial \ln Z}{\partial V}. \qquad (2.54)$$
Inserting the partition function derived in Eq.(2.30), we get

$$P = \frac{N}{\beta V} = \frac{N k_B T}{V}. \qquad (2.55)$$
Chapter 3

System in contact with a reservoir.


Grand canonical ensemble

3.1 Introduction
The goal of this chapter is to extend the canonical approach to systems which can exchange
energy with their environment and which have an additional constraint. We start with an
open system that can exchange energy and particles with a reservoir. In that case, the
number of particles N in the system can fluctuate but is fixed on average. Hence, a new
constraint is given and the probability law is modified. Other examples of systems in equi-
librium include systems with fluctuating quantities such as electric charge, volume, magne-
tization or polarization which are fixed on average upon interaction with the environment.
The first goal of this chapter is to introduce a general technique to deal with these systems.
The ensemble corresponding to systems exchanging particles is called grand canonical and
the ensemble of systems with a constraint on the quantity X will be called generalized grand
canonical.
Another goal of this chapter is to show that the relevant thermodynamic potential will
be given by −kB T ln ZGC where the subscript stands for grand canonical. We will also
introduce the thermodynamic limit and discuss the equilibrium conditions between subsys-
tems.

3.2 Systems exchanging energy and particles with a reservoir.


Grand canonical ensemble
3.2.1 Probability law
In order to find the probability law, we implement the Lagrangian multiplier technique. For
a system in contact with a reservoir of particles, both energy and the number of particles are
fixed on average. Hence, the constraints can be cast in the form:
$$\sum_r P_r = 1 \qquad (3.1)$$
$$\sum_r P_r E_r = \overline{E} \qquad (3.2)$$
$$\sum_r P_r N_r = \overline{N} \qquad (3.3)$$


where Nr is the number of particles when the system is in state r and $\overline{N}$ is the mean number of particles. Applying the same method as in the previous chapter, we find:
$$P_r = \frac{\exp(-\beta E_r + \alpha N_r)}{Z_{GC}}, \qquad (3.4)$$

where the normalization factor ZGC, given by:
$$Z_{GC} = \sum_r \exp(-\beta E_r + \alpha N_r), \qquad (3.5)$$

is the grand canonical partition function.

3.2.2 Mean values


We use the probability law to derive the mean value of the number of particles in the system:
$$\overline{N} = \sum_r P_r N_r = \sum_r N_r\,\frac{\exp(-\beta E_r + \alpha N_r)}{Z_{GC}} = \frac{\partial \ln Z_{GC}}{\partial \alpha}. \qquad (3.6)$$

Reproducing the calculation of the energy fluctuations, we get:


$$\overline{(N - \overline{N})^2} = \overline{N^2} - \overline{N}^2 = \frac{\partial^2 \ln Z_{GC}}{\partial \alpha^2}. \qquad (3.7)$$

3.2.3 Entropy. Introduction of the chemical potential


We insert the probability law in the definition of the statistical entropy:
$$S = -k_B\sum_r P_r\ln P_r = -k_B\sum_r P_r(-\beta E_r + \alpha N_r - \ln Z_{GC}) = k_B\beta\overline{E} - k_B\alpha\overline{N} + k_B\ln Z_{GC}. \qquad (3.8)$$
We now use the thermodynamic identity:

dU = T dS + µdN, (3.9)

so that the chemical potential is related to the entropy by:


 
$$\left(\frac{\partial S}{\partial \overline{N}}\right)_U = -\frac{\mu}{T}. \qquad (3.10)$$
Using the form of the entropy to compute the chemical potential, we find:

µ = kB T α. (3.11)

Hence, it is seen that the chemical potential is the intensive variable that controls the exchange of particles between the system and the reservoir, in the same way as the temperature controls the exchange of energy between the system and the thermostat. We can now cast the probability law in the form:
$$P_r = \frac{\exp(-\beta E_r + \beta\mu N_r)}{Z_{GC}}. \qquad (3.12)$$
Finally, we can introduce the thermodynamic potential:
$$A = -k_B T\ln Z_{GC} = \overline{E} - TS - \mu\overline{N}. \qquad (3.13)$$



3.3 System with a fluctuating quantity X


We now generalize the technique to any system with a fluctuating quantity X. It could
be the electric charge of a system in contact with a generator, the volume of a piston, the
length of a spring with an attached mass, etc. The conjugated variable is denoted x. The
work provided to the system is then given by δW = x dX during a transformation with an elementary variation dX. We now repeat the general procedure for this generalized case.
Every step in what follows can be obtained by simply replacing N by X and µ by x.

3.3.1 Probability law


We use again the Lagrangian multiplier technique. The constraints are now given by:
$$\sum_r P_r = 1 \qquad (3.14)$$
$$\sum_r P_r E_r = \overline{E} \qquad (3.15)$$
$$\sum_r P_r X_r = \overline{X}. \qquad (3.16)$$

It follows that the probability law can be cast in the form:

$$P_r = \frac{\exp(-\beta E_r + \alpha X_r)}{Z_{GCX}}. \qquad (3.17)$$

3.3.2 Entropy
Inserting the probability in the definition of the statistical entropy, we obtain:
$$S = -k_B\sum_r P_r\ln P_r = -k_B\sum_r P_r[-\beta E_r + \alpha X_r - \ln Z_{GCX}] = k_B\beta\overline{E} - k_B\alpha\overline{X} + k_B\ln Z_{GCX}. \qquad (3.18)$$
We note that the Lagrangian multiplier is given by:
 
$$\alpha = -\frac{1}{k_B}\left(\frac{\partial S}{\partial \overline{X}}\right)_{\overline{E}}. \qquad (3.19)$$

We now compare with the classical thermodynamics identity:

dU = T dS + xdX. (3.20)

From this identity, we obtain:


 
$$x = -T\left(\frac{\partial S}{\partial \overline{X}}\right)_{\overline{E}} = k_B\alpha T, \qquad (3.21)$$

so that α = x/(kB T).
By expressing the Lagrangian multiplier in terms of the conjugated parameter x, we find:

$$P_r = \frac{\exp(-\beta E_r + \beta x X_r)}{Z_{GCX}}. \qquad (3.22)$$

3.3.3 Mean value and fluctuation


The mean value of X is given by:
$$\overline{X} = \sum_r P_r X_r = \sum_r X_r\,\frac{\exp(-\beta E_r + \alpha X_r)}{Z_{GCX}} = \frac{\partial \ln Z_{GCX}}{\partial \alpha}. \qquad (3.23)$$

We can also use the probabilities to derive the rms deviation:

$$\overline{X^2} - \overline{X}^2 = \frac{\partial^2 \ln Z_{GCX}}{\partial \alpha^2}. \qquad (3.24)$$

3.3.4 Fluctuations and linear response


We have seen that the heat capacity is related to the fluctuations of the energy. It is an
example of a general relation between fluctuations and linear response.
We introduce a linear response coefficient called susceptibility:

$$\chi = \frac{\partial \overline{X}}{\partial x}, \qquad (3.25)$$
so that ∆X = χ∆x.
We now insert the explicit form of the mean value $\overline{X}$ in the definition of the susceptibility:
$$\chi = k_B T\,\frac{\partial^2 \ln Z_{GCX}}{\partial x^2} = \frac{\overline{X^2} - \overline{X}^2}{k_B T}. \qquad (3.26)$$
Hence, it is seen that for any quantity X, the corresponding susceptibility is proportional to its equilibrium fluctuations. This relation has been derived for a static perturbation. It will later be generalized to a time-dependent perturbation: a frequency-dependent susceptibility will then be defined, and its connection with the time-dependent fluctuations characterized by the correlation function ⟨X(t)X(t′)⟩ is called the fluctuation-dissipation theorem.
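Equation (3.26) is easy to verify on a toy model. In the sketch below, a hypothetical four-state system with arbitrary energies Er and values Xr is used (in units where kB = 1); the susceptibility obtained by numerically differentiating X̄ with respect to x coincides with the equilibrium variance divided by kB T:

```python
import numpy as np

kB, T = 1.0, 2.0  # work in units where kB = 1 (illustrative choice)
beta = 1.0 / (kB * T)

# Hypothetical four-state system: energies E_r and values X_r of the
# fluctuating quantity (both arrays are arbitrary choices).
E_r = np.array([0.0, 1.0, 2.5, 3.0])
X_r = np.array([0.0, 1.0, 1.0, 2.0])

def moments(x):
    """First two moments of X for P_r ~ exp(-beta E_r + beta x X_r), Eq. (3.22)."""
    w = np.exp(-beta * E_r + beta * x * X_r)
    P = w / w.sum()
    return (P * X_r).sum(), (P * X_r**2).sum()

# Susceptibility chi = dX/dx at x = 0, by a symmetric finite difference ...
dx = 1e-6
chi = (moments(dx)[0] - moments(-dx)[0]) / (2 * dx)

# ... compared with the equilibrium fluctuations, Eq. (3.26).
m1, m2 = moments(0.0)
print(chi, (m2 - m1**2) / (kB * T))  # identical up to round-off
```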

3.3.5 State equation


The state equation is the equation that relates the conjugated variables X and x. The em-
blematic example is the relation connecting the pressure of a gas P with its volume V . We
can derive this state equation in the generalized grand canonical ensemble assuming that V
(or more generally X) can fluctuate. Then, we get:

$$\overline{X} = k_B T\,\frac{\partial \ln Z_{GCX}}{\partial x}.$$
This equation relates x and X, it is the state equation of the system. Here, it is given in
terms of the grand canonical partition function. It is also possible to derive the state equation
when using a canonical approach. Indeed, the perfect gas state equation is valid for both a
closed and open system. We now derive this alternative form of the state equation. Instead
of computing the average value of X, we can compute the value Xmax of X that maximizes
the probability P (X). The two values, X and Xmax can be considered to be equal if and
only if the probability P (X) has a very narrow distribution. This assumption
√ is valid for
large systems as we have seen that the relative fluctuations decay as 1/ N .

Let us first write the probability P (X). It is simply the sum of the probabilities Pr over
all the states such that Xr = X:
$$P(X) = \sum_{r/X_r = X}\frac{\exp(-\beta E_r)}{Z_{GCX}}\,\exp(\alpha X) = \frac{Z_C(\beta, X)}{Z_{GCX}}\,\exp(\beta x X). \qquad (3.27)$$

We now remark that if X is fixed, the system is described by a canonical distribution and
its partition function is given by ZC (X). It is now a simple matter to find the maximum of
the probability:
$$\frac{\partial P(X)}{\partial X} = 0. \qquad (3.28)$$
We find:
$$\beta x\,Z_C + \frac{\partial Z_C}{\partial X} = 0, \qquad (3.29)$$
so that:
$$x = -k_B T\left(\frac{\partial \ln Z_C}{\partial X}\right)_{X_{max}} = -k_B T\left(\frac{\partial \ln Z_C}{\partial X}\right)_{\overline{X}}. \qquad (3.30)$$

This equation yields an alternative form of the state equation connecting x and X in
terms of the canonical partition function.

3.4 Thermodynamic limit


3.4.1 Equivalence between the description with different ensembles
When dealing with macroscopic systems, the number of particles is on the order of the Avogadro number NA. In that case, the relative fluctuations are on the order of $1/\sqrt{N_A}$. It follows that the fluctuations of the number of particles are so small that it is possible to pretend that the number of particles is fixed and takes the value $\overline{N}$, and to use a canonical description. As a consequence, it becomes possible to use either a grand canonical or a canonical ensemble to describe the system. Here, we do not mean that fluctuations do not exist. We mean that it is possible, for instance, to derive the state equation of a perfect gas and its heat capacity using a canonical ensemble. Conversely, a closed system can be modelled using a grand canonical approach. This assumption will be used in quantum statistics because it will turn out to be technically easier to compute the partition function using a grand canonical approach even if the systems have a fixed number of particles.

3.4.2 Relative fluctuations of the number of particles in a perfect gas


We consider a volume V in an open system, so that molecules can enter and exit the volume. We will evaluate the fluctuations of the number of molecules in the volume. A physical phenomenon due to these fluctuations is the scattering of light by a gas. The scintillation of a light source observed through the atmosphere is also due to the fluctuations of the number of molecules in a volume element of the atmosphere. In both cases, the refractive index, which is proportional to the number of molecules per unit volume, fluctuates. We will characterize the fluctuations by computing the variance $\overline{N^2} - \overline{N}^2$:

$$\overline{N^2} - \overline{N}^2 = \frac{1}{\beta^2}\,\frac{\partial^2 \ln Z_{GC}}{\partial \mu^2}. \qquad (3.31)$$

We need to derive the grand canonical partition function:
$$Z_{GC} = \sum_r \exp(-\beta E_r + \beta\mu N_r) = \sum_{N=0}^{\infty}\Bigg[\sum_{r/N_r = N}\exp(-\beta E_r)\Bigg]\exp(\beta\mu N) = \sum_{N=0}^{\infty} Z_C(N)\,\exp(\beta\mu N). \qquad (3.32)$$

We have computed ZC(N) for a perfect gas and found ZC(N) = ζ^N, where ζ is the partition function for a single atom. Here, we only take into account the translational degrees of freedom and use this result, valid for atoms. We now use a result that will be derived in the chapter on quantum statistics: to account for the indistinguishability of the atoms, we divide the partition function by the number of permutations N!. The canonical partition function is thus given by ZC(N) = ζ^N/N!. Finally, we find:

$$Z_{GC} = \sum_{N=0}^{\infty}\frac{\zeta^N}{N!}\,\exp(\beta\mu N) = \exp\big(\zeta\exp(\beta\mu)\big), \qquad (3.33)$$

so that the average number of atoms and their fluctuations are given by:
$$\overline{N} = \frac{1}{\beta}\,\frac{\partial \ln Z_{GC}}{\partial \mu} = \zeta\exp(\beta\mu); \qquad \overline{N^2} - \overline{N}^2 = \overline{N}. \qquad (3.34)$$

We see that the relative fluctuations vary as $1/\sqrt{\overline{N}}$. If $\overline{N}$ is on the order of the Avogadro number, this is on the order of 10⁻¹². This conclusion suggests that using a canonical model to describe the system, considering that the number of atoms is fixed and takes the value $\overline{N}$, should provide accurate results.
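The Poissonian character of Eq. (3.34) can be made explicit: since ZGC = exp(ζ e^{βµ}), the probability of finding N atoms, P(N) = ζ^N e^{βµN}/(N! ZGC), is a Poisson law of mean N̄ = ζ e^{βµ}, whose variance equals its mean. A short numerical sketch (the value of N̄ is an arbitrary choice):

```python
import numpy as np
from scipy.special import gammaln

N_bar = 50.0  # zeta * exp(beta*mu), an arbitrary illustrative value

# P(N) = zeta^N exp(beta mu N) / (N! Z_GC) is a Poisson law of mean N_bar.
N = np.arange(0, 500)
P = np.exp(N * np.log(N_bar) - gammaln(N + 1) - N_bar)

mean = (P * N).sum()
var = (P * N**2).sum() - mean**2
print(mean, var)  # variance = mean, as stated by Eq. (3.34)
```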
We repeat the analysis for a perfect gas and compute the fluctuations of the energy for a
perfect gas in contact with a thermostat. The energy fluctuations were shown to be:

$$(\Delta E)^2 = \frac{\partial^2 \ln Z}{\partial \beta^2}. \qquad (3.35)$$
Using the canonical partition function, we find:
$$Z = \frac{\zeta^N}{N!}, \qquad (3.36)$$
$$\zeta = V\left(\frac{2\pi m}{\beta h^2}\right)^{3/2}, \qquad (3.37)$$
$$(\Delta E)^2 = \frac{3N}{2\beta^2}, \qquad (3.38)$$
$$\frac{\Delta E}{\overline{E}} \approx \frac{1}{\sqrt{N}}. \qquad (3.39)$$
Here again, we see that the relative energy fluctuations are negligible if N ≫ 1.

3.4.3 Thermodynamic limit


When the number of particles is much larger than 1, the relative fluctuations can become
much smaller than 1. In this limit, it is possible to use different ensembles to study a

system. This limit is called the thermodynamic limit. It is valid for macroscopic systems
with a number of particles on the order of the Avogadro number.
This limit paves the way to simplify the calculations by choosing the most convenient
ensemble. In practice, we need to know how to compute a temperature in an isolated system,
how to compute a chemical potential in a system which is not in contact with a reservoir,
etc.

3.4.4 Temperature of an isolated system


To compute the temperature of a system which is not in contact with a thermostat, we use the thermodynamic identity dS = dU/T + P dV/T, so that:
$$\frac{1}{T} = \left(\frac{\partial S}{\partial U}\right)_V. \qquad (3.40)$$

3.4.5 Chemical potential of a system with a fixed number of particles


To compute the chemical potential of a system which has a fixed number of particles, we use the general form of the state equation connecting x and X, Eq. (3.30), with x = µ and X = N:
$$\mu = -k_B T\,\frac{\partial \ln Z_C(N)}{\partial N}. \qquad (3.41)$$

3.5 Equilibrium condition between two subsystems: equality of


T, µ, x
In this paragraph, we consider an isolated system that can be split into two subsystems. We will prove that at equilibrium, the two subsystems have the same temperature, chemical potential, etc. To derive this result, we use the fact that the entropy is maximum at equilibrium. The total entropy can be written as S = S₁ + S₂, with U = U₁ + U₂, N = N₁ + N₂ and X = X₁ + X₂. The energy, the number of particles and the quantity X are assumed to be extensive variables which are conserved. Their variations are thus related: dU₁ = −dU₂, dN₁ = −dN₂, dX₁ = −dX₂. The entropy variation during an elementary transformation is given by:
$$dS = \left(\frac{\partial S_1}{\partial U_1} - \frac{\partial S_2}{\partial U_2}\right)dU_1 + \left(\frac{\partial S_1}{\partial N_1} - \frac{\partial S_2}{\partial N_2}\right)dN_1 + \left(\frac{\partial S_1}{\partial X_1} - \frac{\partial S_2}{\partial X_2}\right)dX_1. \qquad (3.42)$$

At equilibrium, the entropy is maximum, so that dS = 0 and d²S < 0. This entails in particular that the partial derivatives are equal. We have seen that these partial derivatives are respectively 1/T, −µ/T and −x/T, so that the temperatures, the chemical potentials and the generalized forces x are equal at equilibrium.

3.6 Grand potential


3.6.1 Definition
The grand potential is defined as A = −kB T ln ZGCX . The name potential stems from the
fact that equilibrium corresponds to a minimum value of this potential just like in classical
mechanics, the equilibrium situation corresponds to a minimum of the potential energy.

We have seen that for a canonical ensemble, it coincides with the free energy. It is worth
pointing out two features of the grand potential:
i) All the relevant information on the properties of the system can be derived from it. Indeed,
we have seen that the partition function contains all the information.
ii) By definition of the partition function, the grand potential is a natural function of β and
x. By contrast, it is not a function of the mean energy and of X.

3.6.2 Explicit form of the grand potential


Starting from the definition of the entropy, we find:

A = −kB T ln ZGCX = E − T S − kB T αX (3.43)

Inserting the form of α in terms of the conjugate variable x, we obtain:

A = E − T S − xX. (3.44)

This is the Legendre transform of the internal energy.

3.6.3 The potential is minimum at equilibrium


We have seen that the equilibrium condition is given by d(S + SR) = 0 and d²(S + SR) < 0, where S denotes the system entropy and SR denotes the reservoir entropy. At equilibrium,
$$dS_R = \frac{1}{T_R}\,dU_R - \frac{x_R}{T_R}\,dX_R = -\frac{1}{T_R}\,dU + \frac{x_R}{T_R}\,dX.$$
Inserting this expression yields:
$$d\left(S - \frac{U}{T_R} + \frac{x_R X}{T_R}\right) = 0, \qquad (3.45)$$

so that TR S−U +xR X is maximum. Thus, the grand potential U −TR S−xR X is minimum
at equilibrium. This form of the equilibrium condition will be very useful to discuss phase
transitions.

3.6.4 Legendre transform


The Legendre transform is introduced in classical thermodynamics. An example of Legendre transform is the passage from the internal energy U to the enthalpy H = U + PV. This transform is useful to account for the fact that a system has either a fixed volume and a varying pressure, or a fixed pressure and a varying volume:

dU = T dS − P dV (3.46)
dH = T dS − P dV + d(P V ) = T dS + V dP. (3.47)

Similarly, systems are either isolated or can exchange heat with their environment so
that there is a variation of entropy. When dealing with systems in contact with a thermostat
that controls the temperature, the natural potential is F = U − T S whose elementary
variation is given by:
dF = −P dV − SdT (3.48)

For open systems exchanging particles with a reservoir, the particle number can fluc-
tuate and the chemical potential is imposed by the reservoir. The relevant potential is then
F − µN = U − T S − µN .
In statistical physics, we have introduced the potential as −kB T ln Z, where Z is the partition function corresponding to the relevant ensemble. Relevant ensemble means that the constraints associated with the fact that a variable X is fixed or fluctuating have already been taken into account. In the general case, the grand potential can be cast in the form

A = E − T S − xX. (3.49)

Let us emphasize that the grand potential A = −kB T ln ZGCX is a function of T and x, as can be seen directly on the explicit form of the partition function:
$$Z_{GCX} = \sum_r \exp(-E_r/k_B T + x X_r/k_B T). \qquad (3.50)$$

3.7 Examples
3.7.1 Exercise: Grand canonical ensemble P-T
Problem text

We consider a perfect gas with a fixed number N and a varying volume V .

1. Use the Lagrange multiplier technique to derive the probability of a state with energy
Er and volume Vr . Show that it can be cast in the form exp(−βEr + αVr )/ZGCT P .
2. Derive the mean volume and the volume fluctuations.
3. Compute the entropy. Derive the grand potential −kB T ln ZGCT P . What is the corre-
sponding thermodynamic potential?
4. Derive the form of α as a function of the pressure P.
5. Derive the mean energy $\overline{E}$:
$$\overline{E} = -P\overline{V} - \frac{\partial \ln Z}{\partial \beta}.$$

Solution

1.1 Probability
We maximize the entropy $S = -\sum_r P_r\ln(P_r)$ accounting for the constraints:
$$\sum_r P_r = 1, \qquad \sum_r P_r E_r = \overline{E}, \qquad \sum_r P_r V_r = \overline{V}, \qquad (3.51)$$

using the Lagrangian multiplier technique. We then set to zero the derivative:
$$\frac{\partial}{\partial P_{r_0}}\left[\sum_r P_r\ln P_r + \alpha_1\Big(\sum_r P_r E_r - \overline{E}\Big) + \alpha_2\Big(\sum_r P_r V_r - \overline{V}\Big) + \alpha_3\Big(\sum_r P_r - 1\Big)\right]. \qquad (3.52)$$
We obtain:
$$-\ln P_{r_0} = 1 + \alpha_3 + \alpha_1 E_{r_0} + \alpha_2 V_{r_0}, \qquad (3.53)$$
so that the probability can be cast in the form:
$$P_{r_0} = \frac{\exp[-\beta E_{r_0} + \alpha V_{r_0}]}{Z_{GCTP}}. \qquad (3.54)$$

1.2 Mean value and rms deviation of the volume
Volume mean value:
$$\overline{V} = \sum_r P_r V_r = \sum_r V_r\,\frac{\exp[-\beta E_r + \alpha V_r]}{Z_{GCTP}} = \frac{1}{Z_{GCTP}}\,\frac{\partial Z_{GCTP}}{\partial \alpha} = \frac{\partial \ln Z_{GCTP}}{\partial \alpha}. \qquad (3.55)$$

Rms fluctuations:
$$\overline{V^2} = \sum_r P_r V_r^2 = \frac{1}{Z_{GCTP}}\,\frac{\partial^2 Z_{GCTP}}{\partial \alpha^2}, \qquad (3.56)$$
$$\overline{V}^2 = \frac{1}{Z_{GCTP}^2}\left(\frac{\partial Z_{GCTP}}{\partial \alpha}\right)^2,$$
$$\overline{V^2} - \overline{V}^2 = \frac{1}{Z_{GCTP}}\,\frac{\partial^2 Z_{GCTP}}{\partial \alpha^2} - \frac{1}{Z_{GCTP}^2}\left(\frac{\partial Z_{GCTP}}{\partial \alpha}\right)^2 = \frac{\partial}{\partial \alpha}\left(\frac{1}{Z_{GCTP}}\,\frac{\partial Z_{GCTP}}{\partial \alpha}\right) = \frac{\partial^2 \ln Z_{GCTP}}{\partial \alpha^2}.$$

1.3 Entropy
$$S = -k_B\sum_r P_r\ln P_r = -k_B\sum_r P_r[-\ln Z_{GCTP} - \beta E_r + \alpha V_r] = k_B\ln Z_{GCTP} + k_B\beta\overline{E} - k_B\alpha\overline{V}. \qquad (3.57)$$

1.4 Connection between P and α
The thermodynamic identity yields dU = T dS − P dV, so that dS = dE/T + P dV/T and
$$\left(\frac{\partial S}{\partial \overline{V}}\right)_{\overline{E}} = \frac{P}{T} = -k_B\alpha. \qquad (3.58)$$
It follows that P = −kB T α.
1.5 Mean energy
Inserting the form of the probability in the definition of the mean energy, we get:
$$\overline{E} = \sum_r P_r E_r = \sum_r E_r\,\frac{\exp[-\beta E_r - \beta P V_r]}{Z_{GCTP}}, \qquad (3.59)$$

so that
$$-\overline{E} - P\overline{V} = \frac{\partial \ln Z_{GCTP}}{\partial \beta}. \qquad (3.60)$$
Chapter 4

Linear response coefficients. General


properties

4.1 Introduction
The concept of linear response applies to a large number of phenomena where it is possi-
ble to identify a cause (or excitation) and a consequence (or response) related by a linear
relation. In these lectures, we are primarily interested in coefficients introduced for de-
scribing transport phenomena. Let us mention for instance Fourier’s law φcd = −k∇T for
heat transport, Ohm's law for charge transport j = σE and Fick's law jN = −D∇N for particle transport. Here, the fluxes are the consequences and the gradients are the causes. Yet, the scope of linear response theory is not restricted to transport phenomena. Linear responses can also be found in different contexts: the polarisation induced by an electric field, P = ε₀χE, the magnetization induced by a magnetic field, M = (µr − 1)H, the length variation of a spring when a force is applied, x = −F/k, the charge in a capacitor when applying a voltage, Q = CU, etc.
It is worth pointing out that the linear coefficients can be introduced phenomenolog-
ically from an experimental approach. There are tables where the electrical or thermal
conductivities can be found for different materials. From a theoretical point of view, these
coefficients can be derived by assuming that the systems are in a situation close to equi-
librium and considering the presence of an excitation (the cause) as a perturbation. The
linearity of the response is a direct consequence of the fact that it is possible to seek a so-
lution using a first order perturbation approximation. This is the route that we will follow
in the next chapter devoted to linear response theory and fluctuation-dissipation theorem.
Here, we assume that the linear relation is an experimental fact and we do not attempt to
derive it from first principles. However, this linear response must be consistent with general
principles of physics. For instance, a linear response relation must:

- be translationally invariant in time for stationary systems;


- be causal: the consequence cannot precede the cause;
- satisfy the second principle.

Accounting for all these conditions entails some specific properties of the linear re-
sponse coefficients. It is the purpose of this chapter to establish these properties. Let us
note that we will only make the assumption of a linear response. Hence, the properties that


we will derive are extremely general.

4.2 Response function


We start by writing the most general form of a linear relation connecting a scalar excitation
F (t) hereafter called generalized force and its response denoted by X(t). Writing a linear
form of the type X(t) = χF(t) is a very particular choice because it is instantaneous. The most general form relates X(t) to all the values F(t′) taken by the force at any time t′. Hence, we can cast the relation in the form:
$$X(t) = \int_{-\infty}^{\infty}\chi(t, t')\,F(t')\,dt', \qquad (4.1)$$

where we have introduced a linear response function χ(t, t′). Let us now consider a stationary system, taking the example of a spring whose force constant does not change in time. The linear response function is then only a function of t − t′, so that if the same experiment is repeated at two different times, it produces the same result because the system itself does not depend on time. Only the interval t − t′ between excitation and response matters. Hence, for a stationary system, the linear response can be cast in the form:
$$X(t) = \int_{-\infty}^{\infty}\chi(t - t')\,F(t')\,dt'. \qquad (4.2)$$

This form suggests that the response at time t may depend on forces at times t′ > t. This would violate the causality principle. Indeed, the cause must precede the consequence. Hence, the linear response function needs to satisfy the condition:
$$\chi(t) = 0 \quad \text{if } t < 0. \qquad (4.3)$$

Finally, we obtain the following general form for a linear response satisfying stationarity
and causality:

$$X(t) = \int_{-\infty}^{t}\chi(t - t')\,F(t')\,dt'. \qquad (4.4)$$

We can now discuss some properties of the linear response function. First of all, let us
assume that we excite the system with a pulse F0 δ(t − t0 ). We obtain a response X(t) =
χ(t − t0 )F0 . It is seen that χ is the response of the system to a pulse. Hence, this response
function is also called impulse response function.
It is important to note that the response function introduces a time scale. Indeed, the
response function decays with a typical time denoted τ . This time scale is typically the time
needed by the system to return to equilibrium. Let us consider for example a circuit with a
capacitor and a resistance. The typical decay time is given by τ = RC. In other words, this
typical time scale characterizes the memory of the system. Note that it is a general property
that the response function tends to zero for long times. This follows from the existence of
dissipation processes.

Finally, we consider the particular case of a force varying very slowly as compared to
the typical system time scale τ . It is then possible to approximate the linear relation:
$$X(t) = \int_{-\infty}^{t}\chi(t - t')\,F(t')\,dt' \approx F(t)\int_{-\infty}^{t}\chi(t - t')\,dt' = \chi_0\,F(t), \qquad (4.5)$$
where we have introduced $\chi_0 = \int_{-\infty}^{t}\chi(t - t')\,dt' = \int_0^{\infty}\chi(u)\,du$. We have obtained a
linear relation connecting the force and the response at the same time. This approximation
amounts to considering that the response function can be replaced by χ₀δ(t − t′). This clearly means that the response is considered to be instantaneous. It is very important to realize that this is strictly equivalent to assuming that dispersion can be neglected. This will be discussed in
the following section.
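As a numerical illustration of Eq. (4.4), here is a sketch computing the response as a discretized causal convolution; the exponential response function and the step force are arbitrary choices:

```python
import numpy as np

dt = 0.01
t = np.arange(0.0, 20.0, dt)
tau = 1.0                      # memory time of the system (arbitrary choice)
chi = np.exp(-t / tau) / tau   # causal response function; chi(t < 0) = 0

# Generalized force: a step switched on at t = 5.
F = np.where(t > 5.0, 1.0, 0.0)

# X(t) = int_{-inf}^{t} chi(t - t') F(t') dt' as a discrete causal convolution.
X = np.convolve(chi, F)[: len(t)] * dt

# Long after the step, X saturates at chi_0 * F with chi_0 = int_0^inf chi = 1,
# in agreement with the slowly-varying-force limit of Eq. (4.5).
print(X[-1])  # close to 1
```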

4.3 Susceptibility
The goal of this section is to introduce a spectral analysis of the linear response. We will
establish that stationarity, causality and the second principle impose severe constraints on
the Fourier transform of the linear response function.

4.3.1 Definition
We first note that for a stationary system, the linear response relation is a convolution product:
$$X(t) = \int_{-\infty}^{\infty}\chi(t - t')\,F(t')\,dt', \quad \text{where } \chi(t) = 0 \text{ if } t < 0. \qquad (4.6)$$

It follows that the Fourier transforms are simply related:

X(ω) = χ(ω)F (ω) , (4.7)

where we have introduced:


$$X(t) = \int_{-\infty}^{\infty}\frac{d\omega}{2\pi}\,X(\omega)\,e^{-i\omega t}; \qquad F(t) = \int_{-\infty}^{\infty}\frac{d\omega}{2\pi}\,F(\omega)\,e^{-i\omega t}; \qquad \chi(t) = \int_{-\infty}^{\infty}\frac{d\omega}{2\pi}\,\chi(\omega)\,e^{-i\omega t}. \qquad (4.8)$$

The Fourier transform of the linear response function χ(ω) is called susceptibility or
complex admittance. Given that the linear response function is a real function, we have the
property:
χ(−ω) = χ∗ (ω). (4.9)
Let us emphasize that it is equivalent to know the linear response function or the susceptibility over the whole spectrum. Knowing the frequency dependence of the susceptibility amounts to knowing the dispersive behaviour of the system. For the particular case of an instantaneous response χ(t) = χ₀δ(t), we find that χ(ω) = χ₀ does not depend on frequency: this is a non-dispersive system.
To summarize, when a system is excited by a generalized force with slow time variations as compared to the time scale of the system, the system "follows" the excitation instantaneously. In the frequency domain, this corresponds to a non-dispersive behaviour.

4.3.2 Dissipation
The goal of this section is to establish that the dissipation rate of the system energy is proportional to the imaginary part of the susceptibility, χ″ = ℑ(χ). We will also prove that at equilibrium χ″ is positive, so that "negative" losses (i.e. amplification) are impossible. This property follows from the first and second principles of thermodynamics. To proceed, let us first consider a system that receives work done by the generalized force and exchanges heat with its environment, which is considered to be a thermostat at temperature T₀. Our starting point is energy conservation expressed by the first principle in the form:

dU = δW + δQ, (4.10)

where δW stands for the work received by the system and δQ is the heat received by the system. The work done by F on the system is given by δW = F dX = F v dt, where we have introduced the notation v = Ẋ. Let us now consider a harmonic force F₀cos(2πt/T) = F₀cos(ωt) = Re[F₀exp(−iωt)]. The response is also monochromatic and given by X(t) = Re[X₀exp(−iωt)] = Re[χ(ω)F₀exp(−iωt)]. During a period, the internal energy changes and comes back to its initial value, so that its net variation is null. It
follows that:
$$\int_t^{t+T} dU = 0 = \int_t^{t+T}\delta W + \int_t^{t+T}\delta Q. \qquad (4.11)$$
Heat and work exchanged during a cycle are therefore related by:
$$\int_t^{t+T}\delta W = \int_t^{t+T} F(t')\,v(t')\,dt' = -\int_t^{t+T}\delta Q. \qquad (4.12)$$

This energy conservation equation shows that the energy received by the system is released
as heat by the system to the environment. The work done during a cycle can be computed
directly:
$$\frac{1}{T}\int_t^{t+T} F\,v\,dt' = \frac{1}{T}\int_t^{t+T}\left(\frac{F_0 e^{-i\omega t'} + F_0^* e^{i\omega t'}}{2}\right) i\omega\left(\frac{-\chi(\omega)F_0 e^{-i\omega t'} + \chi^*(\omega)F_0^* e^{i\omega t'}}{2}\right) dt' = \frac{\omega}{2}\,\Im[\chi(\omega)]\,|F_0|^2. \qquad (4.13)$$
This formula shows that the average power transferred to the system is given by $\frac{\omega}{2}\,|F_0|^2\,\Im(\chi)$.
Finally, the second principle imposes:

$$dS \geq \frac{\delta Q}{T_0}, \qquad (4.14)$$
where T0 is the temperature of the thermostat in contact with the system. We know that
entropy is a state function so that its variation during a cycle is null. It follows that
$$\int_t^{t+T} T_0\,dS = 0 \geq \int_t^{t+T}\delta Q = -\frac{\omega T}{2}\,|F_0|^2\,\Im(\chi). \qquad (4.15)$$

Finally, using the first and second principles, we have established:
$$\chi''(\omega) \geq 0. \qquad (4.16)$$



Note that the final result depends on the conventional choice of the sign in the Fourier transform. If we had used $\chi(t) = \int \chi(\omega)\,e^{i\omega t}\,d\omega/2\pi$, we would have obtained a velocity v = iωX instead of −iωX, so that we would have obtained χ″(ω) ≤ 0.
Let us finally note that equation (4.15) shows that the work done by the generalized force on the system is positive, whereas the heat received by the system is negative. This work corresponds to the energy transferred to the thermostat as heat. In other words, the system transforms work into heat: this is the absorption process. This transformation of work into heat coincides with a production of entropy transferred to the thermostat. The imaginary part of the susceptibility is therefore the quantity that characterizes the absorption by the system.
Let us finally emphasize that we have assumed that the system is at equilibrium at all
times because we have used the first and second principle. A system out of equilibrium can
have a susceptibility with the opposite sign. This is for instance the case of the amplify-
ing medium in a laser where absorption is replaced by amplification. Needless to say, the
amplifying medium is not in equilibrium.
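As a numerical check of Eq. (4.13), the sketch below drives a system with a harmonic force and integrates F v over one period; the first-order susceptibility χ(ω) = χ₀/(1 − iωτ) and all parameter values are arbitrary illustrative choices:

```python
import numpy as np

chi0, tau = 2.0, 0.5   # arbitrary susceptibility parameters
omega, F0 = 3.0, 1.0   # drive frequency and amplitude (arbitrary choices)
chi = chi0 / (1 - 1j * omega * tau)   # a first-order susceptibility

# Harmonic force and response, with the e^{-i omega t} convention of the text.
period = 2 * np.pi / omega
t = np.linspace(0.0, period, 100001)
F = np.real(F0 * np.exp(-1j * omega * t))
X = np.real(chi * F0 * np.exp(-1j * omega * t))
v = np.gradient(X, t)

# Average power F*v over one cycle, compared with (omega/2) chi'' |F0|^2.
power = np.sum(F * v) * (t[1] - t[0]) / period
print(power, 0.5 * omega * chi.imag * F0**2)  # both positive and equal
```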

4.3.3 Causality. Kramers-Kronig relations


The goal of this section is to establish relations which follow directly from causality. We will show that the knowledge of the real part of the susceptibility for all frequencies allows one to derive the imaginary part of the susceptibility, and vice versa. Interestingly, the Kramers-Kronig relations apply to any system with a linear and causal response. The link between the real and imaginary parts of the susceptibility is a direct consequence of causality.
Causality consequence

As discussed above, causality implies:

χ(t) = 0 if t < 0 (4.17)

It follows that the inversion formula of the Fourier transform, Eq. (4.8), can be cast in the form:
$$\chi(\omega) = \int_0^{\infty}\chi(t)\,e^{i\omega t}\,dt. \qquad (4.18)$$
Here, causality appears because we integrate only over positive values of time t. In what follows, we need to define χ(ω) for complex values of ω (ω = ω′ + iω″). Equation (4.18) allows computing χ(ω) for any complex value of ω: this is called an analytic continuation. We now examine the consequences of causality on the properties of χ(ω). Causality implies the following identity:

χ(t) = χ(t)H(t), (4.19)


where H(t) is the Heaviside step function:

H(t) = 1 for t ≥ 0; H(t) = 0 if t < 0. (4.20)

Since the Fourier transform of a product is the convolution product of the Fourier trans-
forms, we find, using the Fourier transform of the Heaviside function, the identity:
$$\chi(\omega) = \int_{-\infty}^{\infty}\frac{d\omega'}{2\pi}\,\chi(\omega')\left[\pi\delta(\omega - \omega') + i\,\mathrm{pv}\,\frac{1}{\omega - \omega'}\right], \qquad (4.21)$$

where pv denotes the principal value of the integral. Finally, we obtain:


$$\chi(\omega) = \frac{1}{i\pi}\,\mathrm{pv}\int_{-\infty}^{\infty}\frac{\chi(\omega')}{\omega' - \omega}\,d\omega'. \qquad (4.22)$$
By taking the real part and the imaginary part of this relation, we find the Kramers-Kronig relations:
$$\chi'(\omega) = \frac{1}{\pi}\,\mathrm{pv}\int_{-\infty}^{\infty}\frac{\chi''(\omega')}{\omega' - \omega}\,d\omega', \qquad \chi''(\omega) = -\frac{1}{\pi}\,\mathrm{pv}\int_{-\infty}^{\infty}\frac{\chi'(\omega')}{\omega' - \omega}\,d\omega'. \qquad (4.23)$$

They can be cast in a slightly different form using:

χ(−ω) = χ∗ (ω). (4.24)


We finally obtain:
$$\chi'(\omega) = \frac{2}{\pi}\,\mathrm{pv}\int_0^{\infty}\frac{\omega'\,\chi''(\omega')}{\omega'^2 - \omega^2}\,d\omega'. \qquad (4.25)$$
This is a very interesting relation because it yields a χ′(ω) satisfying the causality principle when starting from an approximate form of χ″. This is a very powerful technique to obtain the real part of the susceptibility (dispersion information) from an absorption spectrum. For the sake of illustration, let us consider the real part of the susceptibility χ′(ω) produced by an absorption line at ωr due to a resonance of the system. We can write the imaginary part of the susceptibility as:
$$\chi''(\omega') = K\,\delta(\omega' - \omega_r), \qquad (4.26)$$


where K is a constant. Using Eq. (4.25), we derive χ′:
$$\chi'(\omega) = \frac{2\omega_r K}{\pi}\,\frac{1}{\omega_r^2 - \omega^2}. \qquad (4.27)$$
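The relation (4.25) can also be checked numerically. In the sketch below, a narrow Lorentzian line stands in for the delta function of Eq. (4.26) (all parameter values are arbitrary), and the principal value integral is evaluated on a grid and compared with Eq. (4.27):

```python
import numpy as np

# Absorption spectrum: a narrow Lorentzian line centered at omega_r, standing
# in for K * delta(omega' - omega_r) of Eq. (4.26).
omega_r, gamma, K = 5.0, 0.01, 1.0
dw = 2.5e-4
wp = np.arange(0.0, 50.0, dw)  # integration grid for omega'
chi2 = K * (gamma / np.pi) / ((wp - omega_r)**2 + gamma**2)

def chi1(w):
    """Real part from the Kramers-Kronig relation, Eq. (4.25)."""
    mask = np.abs(wp - w) > dw / 2           # crude principal value:
    integrand = wp[mask] * chi2[mask] / (wp[mask]**2 - w**2)
    return (2.0 / np.pi) * integrand.sum() * dw

w = 3.0  # evaluation frequency, away from the absorption line
print(chi1(w), (2 * omega_r * K / np.pi) / (omega_r**2 - w**2))
```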

4.4 Relaxation function


4.4.1 Examples and definitions
Let us first introduce the concept of relaxation by using two examples. We first consider
a volume of water in an electrostatic field. The permanent dipoles of water molecules will
align parallel to the field so that a macroscopic dipole moment per unit volume will appear.
Let us now assume that at time t = 0, the electric field is switched off abruptly. The system will return to its equilibrium state, where the molecular dipole moments are randomly oriented so that the resulting dipole moment is null. This transition to equilibrium is called relaxation. It takes a typical time called the relaxation time. The relaxation time depends on the microscopic mechanisms at work. Here, the collisions between molecules are responsible for the relaxation; the typical time scale is thus essentially the mean time between two collisions.
The second example is the decay of the charge of a capacitor with capacitance C through
a resistor with resistance R. By applying a voltage to the capacitor during a time much
larger than RC, the charge will reach a stationary value. By switching off the voltage,
current will flow through the resistor and the capacitor charge will decay exponentially with

a time constant τ = RC. This is an example of relaxation, i.e. return to equilibrium after
suppressing a constraint.
We now introduce a formal definition of the relaxation function. Let us consider a con-
stant force F0 applied to the system. The response of the system is a displacement X0 . At
time t = 0, the force is suppressed. The system response is then given by F0 Ψ(t) where we
have introduced the relaxation function Ψ(t).

Let us point out three properties:

- Since the system is linear, the relaxation function must be related to the response function.
- Relaxation is fundamentally driven by loss mechanisms. In the first example, the energy given to the system by polarizing it is dissipated in the water through collisions. In the second example, the energy given by the voltage supply to the capacitor is dissipated by the Joule effect in the resistor.
- Relaxation is a measurement of the memory of the system. The decay of the relaxation function tells how long it takes until the system forgets its initial non-equilibrium state.
These remarks give a hint of the fundamental properties underlying the linear response theory based on the fluctuation-dissipation theorem. We can guess that there must be a link between losses, memory time and linear response.

4.4.2 Response function and relaxation function


It is a simple matter to establish a link between the relaxation function and the response
function starting from their definitions. Let us consider a constant force applied to the
system from t = −∞ until t = 0. The response is then given by:
$$X(t) = \int_{-\infty}^{0}\chi(t - t')\,F_0\,dt' = F_0\int_t^{\infty}\chi(u)\,du, \qquad (4.28)$$

where we have introduced the new variable u = t − t′. It is seen that the response of the system can be cast in the form X(t) = F₀Ψ(t), where
$$\Psi(t) = \int_t^{\infty}\chi(u)\,du. \qquad (4.29)$$

Hence, we can derive the response function from the relaxation function. For t < 0, causal-
ity imposes that the response function is null. For t > 0, it can be derived from the relaxation
function so that we finally obtain:

$$\chi(t) = -H(t)\,\frac{d\Psi(t)}{dt}, \qquad (4.30)$$

where H(t) is Heaviside’s function.


Let us note that the response function is the response to a pulse, the relaxation function is
the response to a step function and the susceptibility is the response to a harmonic excitation.
All of them contain all the information on the system.

4.5 Example
Here, we apply the formalism introduced in this chapter to a well-known example. In order
to illustrate the last remark, we will derive the complex impedance Z(ω) of an RC circuit
starting from the knowledge of its relaxation function. This is certainly not the easiest way
to find the complex impedance of a RC circuit. Yet, it nicely illustrates how the spectral
information can be recovered from the relaxation function. The relaxation function Ψ(t)
is identified from the time decay of the capacitor charge Q(t) = Q0 exp(−t/RC). The
generalized force is the voltage V such that Q = CV where C is the capacitance. We have
Q0 = CV0 so that V0 is the generalized force. Hence, Q(t) = V0 Ψ(t) so that

Ψ(t) = C exp(−t/RC).

We now derive the response function:

$$\chi(t) = -H(t)\,\frac{d\Psi(t)}{dt} = \frac{H(t)}{R}\,\exp(-t/RC). \qquad (4.31)$$
The susceptibility is the Fourier transform of the response function:
$$\chi(\omega) = \int_0^{\infty}\frac{1}{R}\,e^{-t/RC}\,e^{i\omega t}\,dt = \frac{C}{1 - i\omega\tau}, \qquad (4.32)$$

where we have used the notation τ = RC. In order to find the impedance, we need to identify the coefficient relating the current to the voltage:
$$I(\omega) = -i\omega\,Q(\omega) = -i\omega\,\chi(\omega)\,U(\omega) = \frac{-i\omega C}{1 - i\omega\tau}\,U(\omega). \qquad (4.33)$$
It follows that the impedance is given by:
$$Z(\omega) = \frac{U(\omega)}{I(\omega)} = R - \frac{1}{i\omega C}. \qquad (4.34)$$
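The chain Ψ(t) → χ(t) → χ(ω) of this example can be reproduced numerically. A minimal sketch (the values of R and C are arbitrary) evaluates the Fourier integral of Eq. (4.32) on a grid and compares it with the analytic result:

```python
import numpy as np

R, C = 1.0e3, 1.0e-6   # ohms and farads (arbitrary values)
tau = R * C

# Response function of the RC circuit, Eq. (4.31).
dt = tau / 1000.0
t = np.arange(0.0, 20.0 * tau, dt)
chi_t = np.exp(-t / tau) / R

# Susceptibility chi(omega) = int_0^inf chi(t) exp(i omega t) dt, Eq. (4.32).
omega = 2 * np.pi * 500.0   # a 500 Hz test frequency
chi_num = np.sum(chi_t * np.exp(1j * omega * t)) * dt
chi_ana = C / (1 - 1j * omega * tau)
print(chi_num, chi_ana)   # agree to discretization accuracy
```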
Chapter 5

Linear response.
Fluctuation-Dissipation theorem

5.1 Introduction
The goal of this chapter is to derive a general form of the coefficients introduced in linear
response theory from a statistical approach. There are two basic ideas that will be developed
here. First, we study the relaxation of the system. As discussed in the preceding chapter,
relaxation contains all the information on the linear response of the system. The second idea
is that the generalized force applied to the system can be considered to be a perturbation.
Hence, a first order perturbative solution can be found leading to a linear response. The
main result that will be established here is a simple relation between the relaxation function
and the time correlation function of the fluctuations at equilibrium.
To be more explicit, let us consider the two examples used in the introduction of the
concept of relaxation in the previous chapter. We had considered water molecules oriented
along an electrostatic field thereby producing a macroscopic polarization. When turning off
the electric field, the polarization returns to zero, its equilibrium value, with a dynamic be-
haviour given by the relaxation function Ψ(t). Similarly, the charge of a capacitor returns to
zero exponentially when the applied voltage is replaced by a short circuit. At equilibrium,
in the absence of external electric field, the mean polarization is null but its instantaneous
value is non zero. At equilibrium, without applied voltage, the charge of the capacitor
also fluctuates in time. In both cases, it is possible to compute the correlation function
< P (t)P (t + τ ) > or < Q(t)Q(t + τ ) >. These equilibrium fluctuations will be shown
to be related to the relaxation function. In other words, the time dependence of relaxation
and fluctuation is the same. This may be surprising at first glance as the relaxation function
deals with the mean value whereas the correlation function deals with fluctuations. This can
be understood qualitatively by realizing that both the relaxation function and the correlation
function are a measurement of the memory of the system. The relaxation function mea-
sures the memory of an ordered system while the fluctuation measures the memory of one
particular random state. In both cases, this memory is governed by the same microscopic
mechanisms.
The chapter is organized as follows. We first derive the relaxation function. The main
result is to show that it is proportional to the correlation function. The second section deals
with the susceptibility χ. We will then be able to establish a relation between the imaginary
part of the susceptibility which is related to dissipation and the fluctuations. In the last


section, we show some applications of this result to the study of fluctuations of systems at
equilibrium.

5.2 Relaxation function


We consider a system in contact with a reservoir of the quantity X at temperature T. X
could be for instance volume, polarisation or charge. We denote x0 the conjugate variable
of X. The conjugate variable x0 would then be pressure, electric field or electric potential, respectively; it is imposed by the reservoir. The probability that the system is in a state r is
given by:
$$P_r = \frac{\exp(-\beta E_r + \beta x_0 X_r)}{Z_{GCX}}. \qquad (5.1)$$
In order to derive the relaxation function, we need to compute the mean value X(t) for
time t > 0 after turning off the generalized force x0 at time t = 0. Here, X could be either
the polarization of water or the charge of a capacitor, while x0 would be either the applied electrostatic field or the voltage, respectively. Hence, the system is no longer in equilibrium.
We cannot use the probability given by Eq.(5.1) for times t > 0.
To proceed, we consider a particular microscopic state r0 occupied by the system at
time t = 0. This microscopic state is the initial condition for the system. We know that
the system will evolve deterministically according to a given hamiltonian. For example, a
quantum system will be described by Schrödinger equation, a classical system of particles
will be described by Newton’s equations, etc. At time t, we can write that the quantity
X will take the value Xr0 (t). Note that the value does depend on the initial condition.
Averaging over all possible states will be performed by averaging over all possible initial
states. This can be done provided that the probability distribution is known at time t = 0. It
turns out that this is the equilibrium probability density in presence of a generalized force
x0 given by Eq.(5.1). The mean value X at time t is thus given by:
$$\langle X\rangle(t) = \sum_{r_0} P_{r_0}\,X_{r_0}(t). \qquad (5.2)$$

Let us insert the explicit form of $P_{r_0}$ given by Eq. (5.1):
$$\langle X\rangle(t) = \sum_{r_0}\frac{\exp(-\beta E_{r_0} + \beta x_0 X_{r_0}(0))}{Z_{GCX}}\,X_{r_0}(t). \qquad (5.3)$$

We now assume that $\beta x_0 X_{r_0}\ll 1$. This assumption is valid if the interaction energy $x_0 X_{r_0}$ is much smaller than the typical thermal energy kB T. A first order expansion yields:
$$\langle X\rangle(t) = \sum_{r_0}\frac{\exp(-\beta E_{r_0})\,[1 + \beta x_0 X_{r_0}(0)]}{Z_{GCX}}\,X_{r_0}(t). \qquad (5.4)$$

Let us derive the partition function within this approximation:
$$Z_{GCX} = \sum_{r_0}\exp(-\beta E_{r_0})\,[1 + \beta x_0 X_{r_0}(0)] = Z_{GC0}\,[1 + \beta x_0\,\overline{X}_0(0)], \qquad (5.5)$$

where we have defined $\overline{X}_0(0)$ by:
$$\overline{X}_0(0) = \sum_{r_0} X_{r_0}(0)\,\frac{\exp(-\beta E_{r_0})}{Z_{GC0}}, \qquad (5.6)$$

and
$$Z_{GC0} = \sum_{r_0}\exp(-\beta E_{r_0}). \qquad (5.7)$$

By inserting this approximation in Eq. (5.4), we obtain:
$$\langle X\rangle(t) = \sum_{r_0}\frac{\exp(-\beta E_{r_0})}{Z_{GC0}\,[1 + \beta x_0\,\overline{X}_0]}\,[1 + \beta x_0 X_{r_0}(0)]\,X_{r_0}(t) = \frac{\overline{X}_0(t) + \beta x_0\,\langle X(0)X(t)\rangle}{1 + \beta x_0\,\overline{X}_0}. \qquad (5.8)$$
We observe that $\overline{X}_0(t)$ is the mean value of X without applied external force x0. Hence, this is the equilibrium mean value without external force; it is therefore time independent, so that $\overline{X}_0 = \overline{X}_0(t) = \overline{X}_0(0)$.
Note that the mean value < X(0)X(t) > is taken for a zero value of the generalized
force x0 . This is the fluctuation at equilibrium. Finally, the mean value at time t can be cast
in the form:

$$\langle X\rangle(t) - \overline{X}_0 = \beta x_0\,\big[\langle X(0)X(t)\rangle - \langle X\rangle(t)\,\overline{X}_0\big]. \qquad (5.9)$$

The structure that we obtain explicitly shows a linear link between the response $\langle X\rangle(t) - \overline{X}_0$ and the cause x₀. Our derivation allows identifying the origin of this linearity: the first order expansion of the probability. Hence, one may expect a linear response if $x_0 X \ll k_B T$. This inequality gives a precise meaning to the condition "close to equilibrium" often invoked as a validity condition for linear response. We now consider the particular case where $\overline{X}$ is null without external applied force (i.e. $\overline{X}_0 = 0$ if $x_0 = 0$):

< X > (t) = β x0 < X(0)X(t) >= β x0 CXX (t), (5.10)

where CXX(t) = < X(0)X(t) > is the correlation function at equilibrium. It follows that
the relaxation function is given by:

Ψ(t) = β < X(0)X(t) > . (5.11)


This is the major result of this chapter: the relaxation function is the product of β and
the time correlation function of the fluctuations of X. Using the results established in the
chapter on linear response, we can derive from the fluctuations the linear response, the
susceptibility and the losses spectrum. The linear response is given by:

χ(t) = −H(t) dΨ/dt = −βH(t) dCXX(t)/dt.    (5.12)
Let us emphasize a remarkable property of this equality. The term on the left-hand side
describes a non-equilibrium property: linear response to an external force. The term on the
right hand side describes time fluctuations at equilibrium with no external force applied. In
other words, the study of the fluctuations at equilibrium contains the information on the lin-
ear response to an external force out of equilibrium. We have already indicated the physical
origin of this property. Both phenomena are driven by the microscopic mechanisms at work
when the system loses the memory of its initial state, be it an ordered state (relaxation) or
a particular microscopic state (time fluctuation). In both cases, the relevant time scale is
given by the collision time.

Let us also stress a second remarkable property. As discussed previously, the relaxation
function contains all the information on the susceptibility and therefore on the absorption
spectrum. In other words, the left-hand side of the equality deals with dissipation while
the right-hand side deals with fluctuations. This is the origin of the name "Fluctuation-
Dissipation" theorem. We will analyse this aspect of the fundamental result in more detail
in the subsequent sections.

5.3 Susceptibility
We have introduced the susceptibility as the Fourier transform of the linear response func-
tion. Since we have established that the linear response is given by the correlation function
of a stationary random process, we need to deal with the spectral analysis of stationary
processes. This is the purpose of the Wiener-Khinchine theorem.

5.3.1 Wiener-Khinchine theorem


Performing the spectral analysis of a square integrable function is an elementary issue. It
suffices to compute its Fourier transform. Unfortunately, this procedure cannot be used
when dealing with stationary random processes which are obviously not square integrable.
This is not a minor point. It should be realized that light emitted by the sun is the result of
a large number of random spontaneous emission processes so that the corresponding field
is a random process. This comment applies to any light source including a laser. Indeed,
there are always spontaneous emission processes into the cavity, which are then amplified.
This mechanism is responsible for the finite spectral width of a laser. Beyond light, most
physical processes can fluctuate and are thus fundamentally described by random processes.
It is thus important to be able to properly define the spectrum of a stationary random process.
To deal with this issue, it is possible to introduce the restriction of the function to a
limited domain. Let us consider a random function f of time t corresponding to a stationary
random process. We introduce its restriction fT to a finite time window with width T :

fT(t) = f(t)  if 0 < t < T,
fT(t) = 0     if t ≤ 0 or t ≥ T.    (5.13)

It is now possible to compute the Fourier transform of fT :


f̃T(ω) = ∫_{−∞}^{∞} fT(t) exp(iωt) dt = ∫_0^T fT(t) exp(iωt) dt.    (5.14)

The Fourier transform f̃T does not converge to a limit as T tends to infinity. However, it is possible to define the limit:

lim_{T→∞} (1/T) |f̃T(ω)|²,    (5.15)

for a class of random processes. This is typically the case for white noise, for instance.
This quantity is called the power spectral density and is denoted If(ω). Here, the word power
refers to the fact that the square of the function is associated with energy for electric quan-
tities. Dividing energy by time yields power. In other words, as T goes to infinity, the
energy diverges but the power remains constant. A major result of spectral analysis of sta-
tionary random processes is the Wiener-Khinchine theorem which states that the correlation
function is the Fourier transform of the power spectral density:


< f(t + τ) f(t) > = Cf(τ) = ∫_{−∞}^{∞} If(ω) exp(−iωτ) dω/2π.    (5.16)

Coming back to light, it is seen that the spectrum can be derived from the knowledge
of the time correlation function of the field. This is the basis of the Fourier transform
spectroscopy technique which is based on measuring the time correlation function using a
Michelson interferometer.
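To make the Wiener-Khinchine theorem concrete, here is a minimal numerical sketch (in Python; all parameter values are illustrative, not taken from the text). It estimates the power spectral density of a synthetic stationary process by averaging |f̃T(ω)|²/T over many realizations, and checks that its inverse Fourier transform reproduces the correlation function:

```python
import numpy as np

# Synthetic stationary process: AR(1), x[n+1] = a*x[n] + w[n], whose
# correlation function C(k) = a^|k| / (1 - a^2) decays exponentially.
rng = np.random.default_rng(0)
a, N, n_real = 0.9, 4096, 200

psd = np.zeros(N)
for _ in range(n_real):
    w = rng.normal(size=N)
    x = np.zeros(N)
    for n in range(N - 1):
        x[n + 1] = a * x[n] + w[n]
    fT = np.fft.fft(x)             # Fourier transform of the restriction f_T
    psd += np.abs(fT) ** 2 / N     # |f_T|^2 / T, averaged over realizations below
psd /= n_real                      # estimate of the power spectral density I_f

# Wiener-Khinchine: the correlation function is the (inverse) Fourier
# transform of the power spectral density.
C_from_psd = np.fft.ifft(psd).real
C_theory = a ** np.arange(5) / (1 - a**2)
print(C_from_psd[:5])              # should approach C_theory
print(C_theory)
```

The AR(1) recursion plays the role of the stationary random process; any process with a finite correlation time would do.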

5.3.2 Susceptibility. Fluctuation-dissipation theorem


Equation (5.12) provides a link between the linear response and the correlation function.
Computing its Fourier transform yields the susceptibility:
χ(ω) = −β ∫_0^∞ (dC(t)/dt) exp(iωt) dt.    (5.17)

Integrating by parts, we get:


χ(ω) = βC(0) + iωβ ∫_0^∞ C(t) exp(iωt) dt.    (5.18)

The imaginary part of the susceptibility is:


χ”(ω) = βω Re[∫_0^∞ CXX(t) exp(iωt) dt]
      = (βω/2) [∫_0^∞ CXX(t) exp(iωt) dt + ∫_0^∞ C*XX(t) exp(−iωt) dt]
      = (βω/2) ∫_{−∞}^{∞} CXX(t) exp(iωt) dt
      = (βω/2) IXX(ω).    (5.19)
Here, we have used two properties: C is a real function, so that CXX(t) = C*XX(t), and
C is an even function. The latter follows from the commutation of X(t) and X(0) as well
as from stationarity, as we now show:

CXX (t) =< X(t)X(0) >=< X(0)X(t) >=< X(−t)X(0) >= CXX (−t). (5.20)

It is important to emphasize that the commutation is not always valid for operators in quan-
tum mechanics. Hence, a quantum derivation of the fluctuation-dissipation theorem yields
a different result. This is left as a problem. The final classical result is the second form of
the fluctuation-dissipation theorem:

IXX(ω) = (2kB T/ω) χ”(ω).    (5.21)

The term on the left-hand side corresponds to fluctuations while the term on the right-hand
side is proportional to the losses spectrum. A quantum derivation produces a different result:

IXX(ω) = (2χ”(ω)/ω) Θ(ω, T),    (5.22)

where Θ(ω, T ) is the mean energy of a quantum harmonic oscillator given by:
 
Θ(ω, T) = ℏω [1/2 + 1/(exp(ℏω/kB T) − 1)].    (5.23)

5.3.3 Application to the computation of fluctuation spectra


The fluctuation-dissipation theorem is a powerful tool to compute the amplitude of the fluctuations
once the imaginary part of the susceptibility is known:

< X²(0) > = CXX(0) = ∫_{−∞}^{∞} (2kB T/ω) χ”(ω) dω/2π = (2kB T/π) ∫_0^∞ χ”(ω)/ω dω,    (5.24)
where we have used the property χ”(−ω) = −χ”(ω). The quantum derivation yields a
different result:

< X²(0) > = CXX(0) = (2/π) ∫_0^∞ χ”(ω) Θ(ω, T)/ω dω.    (5.25)

We have derived the fluctuations of the response X. It is also possible to derive the
fluctuations of the generalized force x(t). Indeed, we have a deterministic relation between
both quantities given by X(ω) = χ(ω)x(ω). It follows that:

< |X(ω)|2 >= |χ(ω)|2 < |x(ω)|2 > (5.26)

Finally, the thermal fluctuations of the generalized force are given by:

< x²(0) > = Cxx(0) = (2kB T/π) ∫_0^∞ [χ”(ω)/|χ(ω)|²] dω/ω,    (5.27)

for classical observables and by

< x²(0) > = Cxx(0) = (2/π) ∫_0^∞ Θ(ω, T) [χ”(ω)/|χ(ω)|²] dω/ω,    (5.28)

in the general case.

5.4 Applications
In this section, we will illustrate how the fluctuation-dissipation theorem can be used to de-
rive the thermal fluctuations. We first consider the voltage fluctuations across an impedance
Z.

5.4.1 Voltage fluctuations for a resistor


The first step consists in identifying the pair of conjugate quantities x and X. We use the
fact that the product xX has the dimension of an energy. Here, we deal with a voltage (x) so
that the conjugate quantity X has to be the electric charge. The susceptibility is thus given
by
Q(ω) = χ(ω)V (ω).
The linear coefficient usually used is the impedance Z(ω) given by V (ω)/I(ω). Charge
conservation yields I(ω) = −iωQ(ω) = −iωχ(ω)V (ω) and the charge can be cast in the
form:
Q(ω) = i V(ω)/(Z(ω) ω),    (5.29)
so that:

χ(ω) = i/(Z(ω) ω),
χ”(ω) = R(ω)/(ω |Z|²).    (5.30)
We have introduced the usual notation R for the real part of the impedance Z. Equation
(5.27) then yields the voltage fluctuations across the impedance:

< V² > = (2kB T/π) ∫_0^∞ R(ω) dω.    (5.31)
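As a sanity check of Eq. (5.31), here is a short sketch evaluating the voltage noise of an ideal resistor over a finite measurement bandwidth (the integral over all frequencies diverges classically, and any real measurement is band-limited; the resistance and bandwidth values are illustrative):

```python
import numpy as np

kB = 1.380649e-23   # J/K
T = 300.0           # K
R = 10e3            # ohm, illustrative
df = 1e4            # Hz, measurement bandwidth, illustrative

# Eq. (5.31) restricted to the band [0, 2*pi*df] in angular frequency:
V2 = (2 * kB * T / np.pi) * R * (2 * np.pi * df)   # = 4 kB T R df
print(f"rms voltage noise: {np.sqrt(V2) * 1e6:.2f} microvolts")  # ~1.3 uV
```

Restricted to a band Δf, Eq. (5.31) reduces to the familiar Johnson-Nyquist result < V² > = 4kB T R Δf.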

5.4.2 Fluctuations of the electromagnetic field in vacuum


Here, we use the fluctuation-dissipation theorem to derive the thermal fluctuations of the
electromagnetic field in vacuum. This amounts to deriving the spectrum of blackbody
radiation. Indeed, the average electromagnetic field is zero in vacuum at equilibrium.
However, its quadratic mean value is non-zero. The corresponding energy is the blackbody
radiation energy.
Here, we follow the original derivation by Callen and Welton. We start by considering
an atom in equilibrium with the black body radiation. This atom is polarized by the inci-
dent field. The link between the n-component of the incident field and the corresponding
component of the dipole moment is given by the susceptibility:
pn = χEn . (5.32)
Here, the pair of variables was chosen so that the product p · E has the dimension of an
energy. The generalized force is the electric field and the dipole moment is the generalized
displacement X. According to Eq. (5.28), we can derive the field fluctuations from χ”/|χ|².
We can derive this quantity by writing that the power P radiated by the atom is equal to the
power absorbed by the atom. The general form of the absorbed power was shown to be:
P = ω χ”(ω)|En|²/2,    (5.33)

in the previous chapter. The power radiated in vacuum by the dipole is:

P = (1/4πε0) |χ|²|En|² ω⁴/(3c³).    (5.34)

By equating these two quantities, we find:

χ”/|χ|² = ω³/(6πε0 c³).    (5.35)

Inserting this result in Eq. (5.28), we obtain:



< En² > = ∫_0^∞ Θ(ω, T) ω²/(3π²ε0 c³) dω.    (5.36)

It is now a simple matter to derive the energy density of the blackbody radiation. First, we
derive the energy density associated with the n-component of the electric field by multiplying
by ε0/2. We need to multiply by a factor 3 to account for the three components of the electric
field. We also multiply by 2 to account for the magnetic energy¹. Hence, we obtain:

U = ∫_0^∞ [ℏω/(exp(ℏω/kB T) − 1)] ω²/(π²c³) dω.    (5.37)

Here, we have not included the additional term ℏω/2 per mode due to the vacuum fluctuations,
which is responsible for the Casimir force between two parallel mirrors.
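As a quick consistency check, Eq. (5.37) can be integrated numerically and compared with the Stefan-Boltzmann energy density U = π²kB⁴T⁴/(15ℏ³c³); the temperature below is an illustrative value:

```python
import numpy as np
from scipy.integrate import quad

hbar = 1.054571817e-34  # J s
kB = 1.380649e-23       # J/K
c = 2.99792458e8        # m/s
T = 300.0               # K, illustrative

# Spectral energy density of Eq. (5.37)
u = lambda w: (hbar * w**3 / (np.pi**2 * c**3)) / np.expm1(hbar * w / (kB * T))

U, _ = quad(u, 0, np.inf)
U_stefan = np.pi**2 * kB**4 * T**4 / (15 * hbar**3 * c**3)
print(U, U_stefan)      # both ~6.1e-6 J/m^3 at 300 K
```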

5.4.3 Exercise: Microwave polarisability of a dielectric particle


Text
We consider a dielectric particle whose size is much smaller than the wavelength of the
illuminating plane wave. When the particle is in a static field E0 , it acquires a dipole moment
given by the static susceptibility χ0 . When turning off the electrostatic field, the dipole
moment decays exponentially with a time constant τ = 1/γ.
Complex susceptibility
Derive:
1. the relaxation function Ψ(t) and the linear response χ(t);
2. the complex susceptibility χ(ω).
Dipole moment fluctuations
3. Derive the correlation function of the dipole moment from Ψ(t).
4. Derive the power spectral density of the dipole moment from the fluctuation-dissipation
theorem.

Solution
1. Linear response
We introduce the linear relation between E0 and the dipole moment

p(t) = E0 Ψ(t) = χ0 exp(−t/τ )E0 . (5.38)

The relaxation function is thus:

Ψ(t) = χ0 exp(−t/τ ) (5.39)


¹ It can be checked directly that the energy associated with the magnetic field is equal to the energy associated
with the electric field for a propagating plane wave: ε0|E|² = |B|²/µ0.

The linear response function can be computed using χ(t) = −H(t) dΨ(t)/dt = H(t) (χ0/τ) exp(−t/τ).
2. Computing the susceptibility amounts to taking a Fourier transform:

χ(ω) = ∫_0^∞ (χ0/τ) exp(−t/τ + iωt) dt = χ0/(1 − iωτ).    (5.40)

Dipole moment fluctuations

3. The relation between relaxation function and correlation function yields < pi (t)pi (0) >=
kB T Ψ(t) = kB T χ0 exp(−t/τ ).
4. A direct application of the FD theorem yields:

Ipp(ω) = (2χ”(ω)/ω) kB T = 2χ0 τ kB T/(1 + ω²τ²).    (5.41)
Chapter 6

Langevin model of Brownian motion.
Application: fluctuations of linear systems

6.1 Introduction
This chapter is an introduction to a technique developed by Langevin to model fluctua-
tions of linear systems. This technique makes it possible to go beyond statistical equilibrium
because the time dynamics of fluctuations can be modelled. It was introduced historically in
order to explain Brownian motion. Hereafter, we summarize the key features of Brownian
motion. We then introduce the Langevin model in the context of 1D Brownian motion. In the
following section, we use the Langevin model to recover the fluctuation-dissipation theorem
in the context of 1D Brownian motion. Finally, we derive the Einstein relation
between the diffusion coefficient and the mobility.

6.1.1 Brownian motion


Brownian motion is the name given to the random motion of tiny pollen particles in water.
This process is named after Robert Brown, a botanist who first reported this observation
in 1827. A remarkable property of Brownian motion is the fact that the trajectory appears
to consist of broken lines. Hence, the velocity of particles is not a continuous function of
time. These discontinuities are explained by assuming that they are due to molecular colli-
sions. This observation played an important role in the experimental proof of the molecular
hypothesis. Jean Perrin used Brownian motion to determine Avogadro's number.
The first theoretical models were proposed by Einstein in 1905 and Smoluchowski
in 1906. Langevin introduced a more general model leading to a stochastic differential
equation. In other words, some terms appearing in the differential equation are
random variables. This type of differential equation had been introduced in a different
context by N. Wiener in 1923 and by L. Bachelier in his thesis on "Speculation theory",
which is one of the seminal works on mathematical finance.
Here, we introduce the Langevin model in the context of Brownian motion. Yet, we stress
that the Langevin model is a general technique for modelling the dynamics of linear random
processes. In addition, it provides a clear and pedagogical example of the fluctuation-
dissipation theorem. Finally, studying Brownian motion leads to a microscopic model of
the phenomenon of diffusion.

6.1.2 Qualitative discussion


Here, our goal is to use Brownian motion as an example to illustrate the link between fluc-
tuation and dissipation. Since the particles have collisions with molecules, the momentum
conservation during a collision imposes a sudden change in momentum for the particle.
Hence, molecular collisions are the microscopic origin of fluctuations of the velocity of the
particle. On average, the momentum transfer is zero and the mean displacement is also
expected to be zero.
We now discuss the link with dissipation. Let us assume that the particle has an initial
velocity along the x axis in a fluid at rest. Let us further assume that the velocity is positive.
Hence, the particle experiences more collisions with molecules having a negative velocity
than with molecules having a positive velocity. Indeed, the relative velocity is increased in
the first case and decreased in the second case. It follows that the particle will have more
collisions that tend to reduce its velocity. This results in a drag force.
In summary, we see that molecular collisions are the microscopic source of both fluctu-
ations and dissipation.

6.2 Langevin model


Our goal in this section is to introduce the Langevin model to analyse the dynamics of
Brownian motion. We start from Newton's equations. The experimental observation shows
that the velocity is not a continuous function of time. Hence, Langevin introduced a random
force in the differential equation. The dynamics of a particle of mass m are thus described
by:

m dv/dt = F(t) = < F(t) > + R(t),    (6.1)
dt
where < F(t) > stands for the mean value of the force F(t) produced by the molecules
on the particle. R(t) is the random component of that force. The mean value is the drag
force. It can be cast in the phenomenological form −γmv(t). This form is only valid at
low velocities. 1
Of course, when using this differential equation, one needs to specify explicitly what
this random force is. The first idea is to try to derive a microscopic model of that random
force by analysing the collisions with molecules. However, we will show that it is possible
to derive the required information without performing such a microscopic analysis.
Indeed, in the framework of the Langevin model, the random force is the source of the
fluctuations at equilibrium. The latter are known. Hence, extracting the required information
on the random force from the mere knowledge of the fluctuations seems possible.
The power spectral density of the velocity fluctuations depends on two factors: i) the
exciting force spectrum and ii) the dynamics of the linear system. The latter is given by the
dynamical equations describing the system. In the case of Brownian motion, it is possible to
simplify the problem. We assume that the random force spectrum is a white noise spectrum.
In other words, the time correlation function of the random force is given by:
1
More precisely, for a spherical particle, the Reynolds number has to be lower than 0.5. This is a condition
more stringent than having a laminar flow.

< R(t)R(t′) > = IR δ(t − t′).    (6.2)


To justify this assumption, we compare the typical time scale of the correlation func-
tion with a typical time scale of the problem. The random force is due to collisions with
molecules. Assuming that different collisions are uncorrelated, the only correlation time is
the duration of a single collision. This time τ′coll can be estimated by taking the range of the
interaction b (typically 0.5 nm) divided by the molecule velocity (typically 300 m s⁻¹ in a gas
at 300 K). Hence, a typical collision time is on the order of a few ps:

τ′coll = b/v.    (6.3)
This time can be compared with the typical time between two collisions, τcoll. It can
be estimated using a hard-sphere model. Let us assume that a molecule with a velocity v
interacts with molecules located in a cylinder with radius 2b and axis parallel to the velocity.
If the cylinder length is vτcoll, it contains one molecule on average. Denoting by n the number
of molecules per unit volume, we get

n v τcoll 4πb² = 1.    (6.4)

It is convenient to introduce the distance d such that nd3 = 1. This length is the side of
a cube that contains one molecule. We can cast the collision time as:
τ′coll ≈ τcoll 4π (b/d)³.    (6.5)

In normal conditions, d takes the value 3.3 nm. Hence, b/d ≪ 1, so that the collision
duration is negligible compared to all other time scales. We can therefore consider that the
collision is instantaneous, so that the correlation function is proportional to δ(t − t′) and the
spectrum is a white noise spectrum.

6.3 Fluctuation-dissipation theorem. Second form.


The goal of this section is to derive the time-dependent correlation function of the velocity
< v(t)v(0) >. To proceed, it is sufficient to compute the Fourier transform of its power
spectral density.

6.3.1 Newton’s equations in harmonic regime. Power spectral density.


The information on the dynamics of the system is contained in Newton’s equation. In order
to derive the power spectral density, we write the equations for a harmonic force. We restrict
the study to the case of a one dimensional system.

−iωmvT (ω) = −γmvT (ω) + RT (ω), (6.6)

where fT (ω) is the Fourier transform of the restriction of the function f (t) on the interval
[0, T ]. We get:
vT(ω) = (RT(ω)/m) · 1/(γ − iω),    (6.7)

so that
|vT|² = (|RT|²/m²) · 1/(γ² + ω²).    (6.8)

The power spectral densities are connected by the equation:


Iv(ω) = IR/[m²(γ² + ω²)].    (6.9)

6.3.2 Second form of the Fluctuation-Dissipation theorem


So far, we have established a formal link between the power spectral density of the velocity
and the power spectral density of the random force. The latter is yet unknown. We will
derive the correlation function of the velocity at equilibrium for t = 0. Indeed, in that
particular case, the correlation function < v(t)v(0) > yields the mean square value of
the velocity fluctuations < v²(0) >, which is known at equilibrium. This will enable us to
derive IR. From the Wiener-Khinchine theorem, we know that < v(t)v(0) > is the Fourier
transform of the power spectral density Iv.
< v(t)v(0) > = ∫_{−∞}^{∞} Iv(ω) exp(−iωt) dω/2π = (IR/m²) ∫_{−∞}^{∞} [1/(γ² + ω²)] exp(−iωt) dω/2π
            = (IR/m²) ∫_{−∞}^{∞} [1/((ω + iγ)(ω − iγ))] exp(−iωt) dω/2π.    (6.10)
The integral can be evaluated by means of the residue theorem using a contour in the
lower half complex plane ( Im(ω) < 0) for t > 0. The only contribution comes from the
pole −iγ. The integral is given by −2iπ Res(−iγ). The residue Res(−iγ) is given by:

Res(−iγ) = (IR/2πm²) exp(−γt)/(−2iγ),    (6.11)
so that we obtain:
< v(t)v(0) > = (−2iπ) (IR/2πm²) exp(−γt)/(−2iγ) = (IR/2γm²) exp(−γt).    (6.12)
Using the equilibrium value, < v 2 (0) >= kB T /m, we finally obtain:

IR = 2γmkB T . (6.13)
This equation allows identifying the amplitude of the force fluctuations. It establishes
a link between the drag coefficient γ and the power spectral density IR of the random
force. Again, we see that there is a fundamental link between dissipation and fluctu-
ation. This relation is often called the second form of the fluctuation-dissipation theorem.
However, in the previous chapter, we have seen that the fluctuation-dissipation theorem not
only provides a link between fluctuation and dissipation but also yields the full susceptibil-
ity (not only its imaginary part), as first shown by Kubo. This more complete form of the
theorem is called the first form of the fluctuation-dissipation theorem. We now show that it
can be derived in the framework of the Langevin model.
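The relation IR = 2γmkB T can be verified numerically. The sketch below (illustrative parameter values) integrates the Langevin equation with an Euler-Maruyama scheme, drawing at each time step a random impulse consistent with a white-noise force of spectral density IR, and checks that the stationary mean square velocity reaches the equipartition value kB T/m:

```python
import numpy as np

kB = 1.380649e-23
T, m, gamma = 300.0, 1e-15, 1e6    # K, kg, s^-1 (illustrative values for a small colloid)
IR = 2 * gamma * m * kB * T        # Eq. (6.13): white-noise force spectral density

dt, nsteps = 1e-8, 400_000         # gamma*dt = 0.01 keeps the scheme accurate
rng = np.random.default_rng(1)
# Euler-Maruyama: the white-noise impulse accumulated over dt has variance IR*dt
kicks = rng.normal(0.0, np.sqrt(IR * dt), size=nsteps)

v = 0.0
vs = np.empty(nsteps)
for step in range(nsteps):
    v += -gamma * v * dt + kicks[step] / m
    vs[step] = v

print("simulated <v^2>:", np.mean(vs[nsteps // 4:] ** 2))  # discard the transient
print("equipartition kB*T/m:", kB * T / m)
```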

6.4 First form of the Fluctuation-dissipation theorem


In this section, our goal is i) to derive the form of the susceptibility and ii) to establish a link
between this susceptibility and the correlation function of the velocity fluctuations.

6.4.1 Susceptibility
Here, we repeat the study of the motion in harmonic regime. Yet, we now assume that there
is an external deterministic force denoted Fext. Newton's equation applied to the functions
restricted to a finite time interval T reads:

−imωvT = −γmvT + RT + Fext,T . (6.14)

Taking the mean value of the previous equation yields:

−imω < vT >= −γm < vT > +Fext,T . (6.15)

It follows that the velocity and the external force are linearly related:
< vT > = Fext,T / [m(γ − iω)].    (6.16)
Thus, we can define a susceptibility:
χv(ω) = 1/[m(γ − iω)].    (6.17)
Note that this linear response has been defined as a coefficient between velocity and
force. This is what is commonly used in the framework of Brownian motion but this is
not the standard approach we used when introducing the Fluctuation-Dissipation theorem.
Indeed, the product of the two quantities v and F does not have the dimension of an energy.

6.4.2 First form of the Fluctuation-dissipation theorem


The first form of the Fluctuation-dissipation theorem establishes a link between the suscep-
tibility and the correlation function. We have shown that the correlation function can be cast
in the form:
< v(t)v(0) > = < v² > exp(−γt) = (kB T/m) exp(−γt),    (6.18)
and we have just derived χv . We can remark that the integral:
∫_0^∞ (kB T/m) exp(−γt) exp(iωt) dt = (kB T/m) · 1/(γ − iω)    (6.19)
yields the susceptibility apart from a constant factor. We finally get:

χv(ω) = (1/kB T) ∫_0^∞ < v(t)v(0) > exp(iωt) dt,    (6.20)

which is the first form of the fluctuation-dissipation theorem, namely the link between the
susceptibility and the correlation function.

6.5 Einstein relation. Brownian motion


6.5.1 Study of the velocity v(t)
In this section, we aim at studying the fluctuations of the position of the particle in time;
we thus study < x²(t) >. We repeat the study of the velocity, but we now work in the time
domain. The result for the velocity correlation function will be the same; only the technical
approach changes. In the next section, we will use this time-dependent approach to derive
the mean square displacement of the particle. This is an introduction to diffusion processes
in the framework of the Langevin model. The dynamic equation governing the particle
velocity in the time domain can be written as:
dv/dt = −γv + R(t)/m.    (6.21)
The solution of the homogeneous equation is v(t) = v(0) exp(−γt). A standard technique
to find a solution of a first order differential equation is the so-called variation of constants
method. We seek a solution of the form v(t) = f(t) exp(−γt). By inserting this
form in the differential equation, we get:

df/dt = (R(t)/m) exp(γt),    (6.22)
so that:
f(t) = f(0) + ∫_0^t (R(t′)/m) exp(γt′) dt′.    (6.23)
It follows that:
v(t) = v(0) exp(−γt) + ∫_0^t (R(t′)/m) exp[−γ(t − t′)] dt′.    (6.24)

Taking the mean value, we obtain:

< v(t) >= v(0) exp(−γt). (6.25)

Finally, upon multiplying the equation (6.24) by v(0) and averaging, we obtain:

< v(t)v(0) >=< v 2 (0) > exp(−γt) (6.26)

6.5.2 Derivation of x(t). Diffusion


The goal of this section is to introduce a microscopic model of the diffusion process based
on the Langevin model. We will characterize the particle motion by computing x(t). We
expect to observe a diffusion behaviour for times much longer than the mean free time
between two collisions. We start by integrating in time the equation (6.24) in order to
obtain x(t).
x(t) − x(0) = (v(0)/γ)[1 − exp(−γt)] + ∫_0^t dt1 ∫_0^{t1} dt2 (R(t2)/m) exp[γ(t2 − t1)].    (6.27)

The average value is given by:

< x(t) > = x(0) + (v(0)/γ)[1 − exp(−γt)],    (6.28)

so that:

x(t) − < x(t) > = ∫_0^t dt1 ∫_0^{t1} dt2 (R(t2)/m) exp[γ(t2 − t1)].    (6.29)

Instead of deriving the average of the square of this quantity, it turns out to be simpler to
study (x − < x >)(v − < v >) = (1/2) d(x − < x >)²/dt and then integrate this quantity:

(x − < x >)(v − < v >) = ∫_0^t dt1 ∫_0^{t1} dt2 (R(t2)/m) exp[γ(t2 − t1)] × ∫_0^t dt3 (R(t3)/m) exp[γ(t3 − t)].    (6.30)
Using the equation < R(t2 )R(t3 ) >= IR δ(t2 − t3 ), we find:

< (x − < x >)(v − < v >) > = (IR/m²) ∫_0^t dt1 ∫_0^{t1} dt2 exp[γ(t2 − t1)] exp[γ(t2 − t)]
    = (IR/m²) exp(−γt) ∫_0^t dt1 exp(−γt1) [exp(2γt1) − 1]/2γ
    = (IR/m²) [ (1 − exp(−γt))/2γ² + (exp(−2γt) − exp(−γt))/2γ² ]
    = (IR/2m²γ²) [1 − exp(−γt)]².    (6.31)

Integrating over time yields:

< [x(t) − < x(t) >]² > = σx²(t) = (IR/m²γ²) [ t + 2 (exp(−γt) − 1)/γ − (exp(−2γt) − 1)/2γ ].    (6.32)
It is seen that for times t >> 1/γ, the linear term provides the leading contribution. Hence,
the standard deviation σx(t) is given by √(2Dt), where we have introduced the diffusion
coefficient D:

D = IR/(2m²γ²).    (6.33)
Using Eq. (6.13), we can establish a link between the diffusion coefficient at temperature T
and the drag coefficient:

D = kB T/(γm).    (6.34)
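The diffusive behaviour σx²(t) ≈ 2Dt with D = kB T/γm can be checked in the same way. The sketch below (same illustrative parameters and scheme as above) propagates many independent trajectories and compares the long-time mean square displacement with 2Dt:

```python
import numpy as np

kB = 1.380649e-23
T, m, gamma = 300.0, 1e-15, 1e6        # illustrative values, as above
IR = 2 * gamma * m * kB * T
D = kB * T / (gamma * m)               # Eq. (6.34)

dt, nsteps, ntraj = 1e-8, 50_000, 2000
rng = np.random.default_rng(2)

v = np.zeros(ntraj)
x = np.zeros(ntraj)
for _ in range(nsteps):
    kicks = rng.normal(0.0, np.sqrt(IR * dt), size=ntraj)
    v += -gamma * v * dt + kicks / m
    x += v * dt

t_total = nsteps * dt                  # gamma*t_total = 500 >> 1: diffusive regime
print("simulated <x^2>:", np.mean(x**2))
print("2*D*t:", 2 * D * t_total)
```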

6.5.3 Einstein relation


The Einstein relation is a link between the mobility and the diffusion coefficient. The
underlying physical idea is again the fluctuation-dissipation theorem: the coefficient D
describes the position fluctuations while the mobility is related to Ohmic dissipation. It is
easy to recover the mobility by writing that in the stationary regime the drag force
compensates the electric force:

0 = −γmv + qE.    (6.35)

It follows that v = qE/γm. By definition, the mobility µ is given by v = µE. Upon identifi-
cation, we find:

µ = q/γm.    (6.36)

The ratio µ/D can now be derived. This is the Einstein relation:

µ/D = q/(kB T).    (6.37)

6.6 Examples of application


Although the Langevin model has been introduced as a technique to study Brownian motion,
it is actually a very powerful and general technique to study fluctuations of linear systems.
We illustrate this by considering two simple examples.

6.6.1 Charge fluctuation of a capacitor


We have studied the RC circuit using the fluctuation-dissipation theorem in the previous
chapter. We now study the charge fluctuations by means of the Langevin model. The
dynamical equation governing the system in the presence of a random external voltage Vf
is:
dq/dt + q/RC = Vf/R.    (6.38)
In harmonic regime, we find:
q(ω) = i Vf(ω)/[R(ω + iγ)],    (6.39)
where we have introduced γ = 1/RC. The power spectral density of the charge fluctuations
is given by:

Iq(ω) = IV/[R²(ω − iγ)(ω + iγ)].    (6.40)
We now use the Wiener-Khinchine theorem to establish the form of the correlation function:

< q(t)q(0) > = ∫_{−∞}^{∞} Iq(ω) exp(−iωt) dω/2π.    (6.41)

The integral can be performed using the residue theorem:


< q(t)q(0) > = (IV/2γR²) exp(−γt) = (IV C/2R) exp(−γt).    (6.42)
The equipartition theorem yields < q² >/2C = kB T/2 at equilibrium, so that:

IV = 2RkB T.    (6.43)

Finally, we get:

< q(t)q(0) > = CkB T exp(−t/RC).    (6.44)
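The equal-time result < q² > = CkB T is the well-known "kTC noise" of a capacitor. A short numerical illustration (the capacitance value is arbitrary):

```python
import numpy as np

kB = 1.380649e-23
T = 300.0
C = 1e-12                      # 1 pF, illustrative
e = 1.602176634e-19

q_rms = np.sqrt(kB * T * C)    # <q^2> = C kB T, Eq. (6.44) at t = 0
print(f"q_rms = {q_rms:.2e} C = {q_rms / e:.0f} electrons")
print(f"V_rms = {q_rms / C * 1e6:.0f} microvolts")   # ~64 uV for 1 pF at 300 K
```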

6.6.2 Problem : spring length fluctuations


Text
The purpose of this exercise is to study the thermal fluctuations of the length of a mechanical
oscillator. This model allows for instance to model the thermal fluctuations of the position
of the tip of an atomic force microscope. Let us assume that the drag force can be cast in
the form −γmẋ. The mass is denoted m and the spring constant is K = mω0².
1) Write the dynamical equation for the system in the presence of a random force R(t).
2) Derive the power spectral density Ix of the position x as a function of the power spectral
density of the force fluctuations IR .
3) We assume that IR does not depend on frequency (white noise assumption). Derive
< x(t)x(t + τ ) > first as an integral and then explicitly.

Solution
1. Dynamical equation

−mω² x0 = −mω0² x0 + iωγm x0 + R.    (6.45)

2. Fluctuation power spectral density. The dynamical equation yields


x0 = (R/m) · 1/(ω0² − ω² − iωγ).    (6.46)
The power spectral density of the position fluctuations is then given by:

Ix(ω) = (IR/m²) · 1/|ω0² − ω² − iωγ|².    (6.47)

3. Time correlation function of the position fluctuations


From the Wiener-Khinchine theorem, we obtain:

< x(t)x(t + τ) > = ∫_{−∞}^{∞} Ix(ω) exp(−iωτ) dω/2π.    (6.48)

The integral can be performed using the residue theorem. The integrand has four poles: the
roots of ω² − ω0² + iωγ = 0 and their complex conjugates. They are given by ±(iγ/2 ± ω0′),
where we use the short notation ω0′ = ω0 √(1 − γ²/4ω0²); we denote ω1, ω2 the two poles in
the lower half plane and ω1*, ω2* their conjugates. The power spectral density is thus:

Ix(ω) = (IR/m²) · 1/[(ω − ω1)(ω − ω1*)(ω − ω2)(ω − ω2*)].    (6.49)
For positive times τ, we evaluate the integral using a contour closed by a semicircle in
the half plane Im(ω) < 0, so that the exponential decay of exp(−iωτ) guarantees that the
contribution of the semicircle vanishes. We get:

< x(t)x(t + τ) > = IR cos(ω0′τ + φ) exp(−γτ/2) / (2m²γ ω0 ω0′),    (6.50)

where tan(φ) = γ/2ω0′. Note that cos(φ) = √(1 − γ²/4ω0²) = ω0′/ω0. At τ = 0, we have
< x² > = kB T/mω0². It follows that:

IR = 2γmkB T,    (6.51)
and finally:
< x(t)x(t + τ) > = (kB T/mω0²) cos(ω0′τ + φ) exp(−γτ/2) / cos(φ).    (6.52)
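This result can also be checked by direct simulation of the noisy oscillator, in the same spirit as the sketches of the previous sections (the parameter values below are illustrative, loosely inspired by an AFM cantilever):

```python
import numpy as np

kB = 1.380649e-23
T, m = 300.0, 1e-12                              # K, kg (illustrative values)
w0, gamma = 2 * np.pi * 1e5, 2 * np.pi * 1e3     # rad/s: f0 = 100 kHz, quality factor ~100
IR = 2 * gamma * m * kB * T                      # Eq. (6.51)

dt, nsteps = 5e-8, 400_000                       # dt*w0 << 1 for a stable integration
rng = np.random.default_rng(3)
kicks = rng.normal(0.0, np.sqrt(IR * dt), size=nsteps)

x, v = 0.0, 0.0
xs = np.empty(nsteps)
for step in range(nsteps):
    v += (-w0**2 * x - gamma * v) * dt + kicks[step] / m
    x += v * dt
    xs[step] = x

x2 = np.mean(xs[nsteps // 4:] ** 2)              # discard the initial transient
print("simulated <x^2>:", x2)
print("equipartition kB*T/(m*w0^2):", kB * T / (m * w0**2))
```

The simulated value agrees with equipartition within the statistical error of the run.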

6.6.3 Problem : dipole moment fluctuations


Text
In the presence of a static electric field E0 , a dielectric particle has a dipole moment p0
given by its static susceptibility χ0 , p0 = χ0 E0 . When the electric field is switched off, the
amplitude of the dipole moment decays exponentially with a time constant τ = 1/γ.

1. Write the differential equation that describes the exponential decay of the dipole mo-
ment (external field switched off).

2. Write the same equation including a noise term f that causes fluctuations. We use the
assumption that < fi (t)fj (t + τ ) >= If δij δ(τ ). Explain this approximation.

3. Write the expression of the power spectral density for the fluctuations of dipole
moment. Use this expression to derive the correlation function < pi (t)pi (0) > as a function
of If . At equilibrium, the correlation is < pi (0)pi (0) >= kB T χ0 . What is the value of If
? Derive the expression of the power spectral density of the dipole moment fluctuations.

Solution
1. The time-dependent dipole moment is governed by the differential equation

dp/dt + γp = 0.

2. In the presence of a random force, the differential equation is:

dp/dt + γp = f(t).

The time correlation function of the random force is modelled by a δ(t) function because its
correlation time is short compared to the decay time τ. The term δij accounts for the
isotropy of the system, resulting in the absence of correlations between different components.
3. In frequency domain, we get

p(ω)[−iω + γ] = f (ω).

It follows that the power spectral densities are related by:

Ipp(ω) = If(ω)/(ω² + γ²).    (6.53)
We now use the Wiener-Khinchine theorem to get the time correlation function. Using the
form of the power spectral density of the noise and integrating over frequencies in the
complex plane, we get:

< pi(t)pi(0) > = If ∫_{−∞}^{∞} e^{−iωt}/[(ω + iγ)(ω − iγ)] dω/2π = If e^{−γt}/2γ.    (6.54)

The time correlation function at t = 0 is given by the equilibrium form kB T χ0. It follows
that If = 2γkB T χ0. Finally, we recover the formula derived in the previous chapter using
the fluctuation-dissipation theorem:

Ipp(ω) = 2kB T χ0 τ/(1 + ω²τ²).    (6.55)
Chapter 7

Introduction to the kinetic theory of
transport phenomena.

7.1 Introduction
When an intensive parameter such as temperature, chemical potential, voltage, etc. is in-
homogeneous, fluxes of heat, particles or charges are generated in order to restore the ho-
mogeneity. We will study the fluxes that appear in systems for situations "close" to
equilibrium. The concept of distance to equilibrium will be given a precise meaning later
when introducing the dimensionless Knudsen number.
Let us give a few examples of the type of systems that will be considered. Phenomeno-
logical laws have been introduced to account for these transport effects:

Ohm’s law j = σE,


Fick’s law jN = −D∇n,
Fourier’s law jcd = −k∇T , etc.

Several different theoretical techniques can be used to establish these laws:


Linear response theory.
Here, we adopt the linear response theory point of view. For Ohm’s law, the electric field
is considered to be the excitation (generalized force) and the current density is the response
(generalized displacement). More generally, the force is proportional to the gradient and the
response is the flux. Close to equilibrium, the fluxes are proportional to the gradients. In
this approach, the transport coefficient is the linear response coefficient. Hence, by applying
fluctuation-dissipation theorem, we find that the linear response coefficient can be derived
from the fluctuations of the fluxes.

Kinetic theory.
The basic idea is to consider that the transport of some quantity X is mediated by the trans-
port of particles. The physical quantity X could be the charge, the mass, the energy, the spin
or the momentum of the particle for instance. Obviously, as a particle moves, it transports
all these quantities. A kinetic approach is based on computing particles fluxes and deriving
from them the fluxes of X. In order to achieve this program, it is necessary to know the
velocity distribution of particles.


Thermodynamics of irreversible processes.

As opposed to kinetic theory, this approach is not based on a microscopic point of view.
It is an extension of the macroscopic thermodynamics to non-equilibrium phenomena. The
approach is based on the concept of local flux density. It allows computing the local entropy
generation rate. It also allows deriving very general reciprocity laws connecting coupled
transport phenomena, such as a charge current due to a temperature gradient.
In this chapter, we will give a sketch of the kinetic theory using elementary flux esti-
mates. In the next chapter, we will introduce a more systematic technique to evaluate
the velocity distribution out of equilibrium based on the so-called Boltzmann equation. We
will show how this equation can be solved using a first order perturbation solution in the
framework of the relaxation time approximation. Finally, we will apply this approach to
both gases and solids.

7.2 Elementary introduction to kinetic theory of transport phenomena
7.2.1 Fick’s law
In order to illustrate the transport phenomena and identify the physical mechanisms respon-
sible for transport, we will introduce the simplest possible model. We start by dealing with
particles diffusion due to a concentration gradient. We assume that the gradient is along
the x-axis. We denote n(x) the number of particles per unit volume. The particles have
a random motion due to collisions. We introduce the mean free path l. For the sake of
simplicity, we consider a one-dimensional motion. In other words, the particles move only
along the x axis. We also consider that the velocity modulus v is the same for all particles.
The question discussed here is :

How many particles cross the plane x = x0 per unit time?

Particles crossing the plane through an area Σ during dt in the positive direction (pos-
itive velocity) are in a cylinder with volume Σv dt. To count how many particles cross,
we simply multiply this volume by the number of particles per unit volume. Among all
particles in the volume, only half of them have a positive velocity so that we divide by a
factor of 2. The number of particles per unit volume is not homogeneous. We note that we
consider particles that will not be scattered before reaching the plane. Hence, they must be
close enough to the plane x0 . On average, particles are coming from a distance given by l
so that the number of molecules that we use is given by n(x0 − l)/2. Finally, the flux per
unit area is n(x0 − l)v/2. 1
1
The choice of l may seem arbitrary. Hereafter, we repeat the derivation with a more accurate justification
of the choice of the mean free path as a typical length. We first revisit the concept of mean free path and try to
offer a more general view. Let us introduce the probability density that a particle has a collision between x and
x+dx. This probability can be written as P (x)dx where P (x) is a probability density. We now assume that the
probability is homogeneous in x so that P (x) is a constant function. As P (x)dx is a probability, P (x) has the
dimension of the inverse of a length. Let us denote l this length so that we have P (x)dx = dx/l. We can now
derive the probability Π(x) that a particle has no collisions between 0 and x. In order to derive this probability,
we need to make a further assumption. We assume that the collisions are a markovian random process. In other
words, we assume that collisions are uncorrelated. Hence the probabilities of having a collision between 0 and

The total flux through the interface results from a balance between particles moving in
both directions. We obtain:

jN = [n(x0 − l)/2] v − [n(x0 + l)/2] v ≈ −l v (dn/dx)(x0).    (7.4)
The result depends on the gradient of the number of particles per unit volume in agreement
with Fick’s law. The diffusion coefficient D is the product of the mean free path l and the
velocity v.
This simple model can be generalized to the three-dimensional case by dividing the
number of particles by a factor of 6 instead of 2. This rough estimate is based on the fact
that the particles can be distributed along six directions, 1/6 along positive axis Ox, 1/6
along the negative Ox axis, 1/6 along the positive Oy axis, etc. This leads to the relation
D = lv/3. More elaborate theories show that this relation is exact.
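As an order-of-magnitude check of D = lv/3, the sketch below evaluates the diffusion coefficient of a gas under normal conditions, combining the hard-sphere mean free path of Eq. (6.4) with the Maxwell-Boltzmann mean speed (the interaction range b is an illustrative value):

```python
import numpy as np

kB = 1.380649e-23
T, P = 300.0, 1.013e5           # normal conditions
m = 4.7e-26                     # kg, roughly the mass of an N2 molecule
b = 0.2e-9                      # m, interaction range (illustrative)

n = P / (kB * T)                              # ideal-gas number density
v_mean = np.sqrt(8 * kB * T / (np.pi * m))    # Maxwell-Boltzmann mean speed
l = 1.0 / (4 * np.pi * n * b**2)              # mean free path l = v*tau_coll from Eq. (6.4)

D = l * v_mean / 3                            # kinetic estimate D = l*v/3
print(f"n = {n:.2e} m^-3, l = {l * 1e9:.0f} nm, v = {v_mean:.0f} m/s")
print(f"D = {D * 1e4:.2f} cm^2/s")            # measured N2 self-diffusion is ~0.2 cm^2/s
```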

7.2.2 Fourier’s law


We now repeat this procedure to derive Fourier’s law. We assume that there is a temperature
gradient along the x axis. The heat flux is associated with the microscopic transport of
energy. Each particle transports its own kinetic energy. The average molecular kinetic
energy is denoted by E(T ). The heat flux can thus be written:
jcd = [n(x0 − l)/6] v(x0 − l) E[T(x0 − l)] − [n(x0 + l)/6] v(x0 + l) E[T(x0 + l)].    (7.5)
Assuming that there is no mass flux, the quantity n(x0 )v(x0 ) needs to be constant. It follows
that:
jcd = (nv/6) [E[T(x0 − l)] − E[T(x0 + l)]] ≈ −(nvl/3) (dE/dT)(dT/dx).    (7.6)
Noting that

n dE/dT = cv
x is independent from the probability of having a collision between x and x + dx. It follows that:

Π(x + dx) = Π(x)[1 − dx/l],    (7.1)

where Π(x + dx) is the probability of not having a collision between 0 and x + dx, Π(x) is the probability
of not having a collision between 0 and x and [1 − dx/l] is the probability of not having a collision between x
and x + dx. Upon solving this equation, we find Π(x) = exp(−x/l). Knowing this probability, it is easy to
compute the mean free path:

mfp = ∫_0^∞ x Π(x) dx/l = l.    (7.2)

We now know the physical meaning of the length l introduced in the probability density: it is the mean free
path. We now return to the derivation of the flux. We need to know from where the particles are coming when
they cross the plane x0. Particles coming from the plane x0 − x′ have a probability dx′/l of having a collision
in the interval dx′ and a probability Π(x′) of reaching the plane x0 without any other collision. Only half of
them have the right velocity sign after the collision. Hence, the particle flux in the positive direction is given by:

jN = ∫_0^∞ [n(x0 − x′)/2] v Π(x′) dx′/l
   ≈ ∫_0^∞ [n(x0) − (dn/dx) x′] (v/2) Π(x′) dx′/l
   ≈ [n(x0) − (dn/dx) l] v/2 ≈ n(x0 − l) v/2.    (7.3)

is the heat capacity per unit volume, we get:

jcd = −(l v cv/3) (dT/dx).    (7.7)
This produces the Fourier law and yields the form of the thermal conductivity k = lvcv /3.

7.2.3 Thermoelectric effect


The goal of this section is to provide a hint of the physical origin of the Seebeck effect. Let
us consider a semiconductor wire. We assume that the doping is low so that the semicon-
ductor is non-degenerate. It can be described by classical statistics. We consider a situation
where a temperature gradient is applied to the system. Hence, the electrons see a gradient
of kinetic energy (equal to 3kB T/2 per particle for a non-degenerate gas) and therefore a
gradient of mean velocity. It follows that electrons tend to migrate from hot regions to cold
regions just because of the difference in average velocity. Since no current can flow in an
open circuit, an electric field must develop to compensate for this diffusion flux. We now
proceed to quantify this effect.
The diffusion flux due to the temperature difference is given by:

j = −e[(n/6) v(x0 − l) − (n/6) v(x0 + l)] ≈ (e n l/3) (dv/dx).    (7.8)
Noting that l = vτ, where τ is the mean time between two collisions, and using mv²/2 =
3kB T/2, we get:

j ≈ (eτ n/6m) d(mv²)/dx ≈ (eτ n kB/2m) dT/dx.    (7.9)
6m dx 2m dx
In an open circuit, the total current density is zero:

j = 0 = σE + (eτ n kB/2m) dT/dx,    (7.10)
where σ = ne²τ/m, so that the field induced by the temperature gradient can be cast in the
form:

E = −(kB/2e) dT/dx.    (7.11)
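Eq. (7.11) corresponds to a Seebeck coefficient of order kB/2e, independent of the material parameters in this crude model. A short numerical check (the temperature difference is an illustrative value):

```python
kB = 1.380649e-23    # J/K
e = 1.602176634e-19  # C

S = kB / (2 * e)     # magnitude of the Seebeck coefficient in Eq. (7.11)
print(f"S = {S * 1e6:.0f} microvolts per kelvin")              # ~43 uV/K
print(f"thermovoltage for a 10 K difference: {S * 10 * 1e3:.2f} mV")
```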

7.3 Diffusive and ballistic regime


Comparing the width of the system with the mean free path. Existence of a local equilib-
rium. Possibility of defining a temperature and its gradient.

7.3.1 Example: heat transfer across a gap


7.3.2 Diffusion equation: key ingredients and general form of the solution
Fick's law + conservation law. Green function for the diffusion equation.

7.3.3 Diffusive regime: the random walk model


Chapter 8

Introduction to kinetic theory of
transport phenomena. Boltzmann equation

8.1 General form of a flux


In this section, we derive the general form of the flux density of a quantity X which is
transported by particles. As explained in the previous chapter, this quantity can be a charge
q, a spin s, a mass m or a kinetic energy mv²/2, for instance.

Velocity distribution function


We denote f (r, v, t)d3 rd3 v the number of particles in the volume element d3 rd3 v of the
micro phase space. After integration over all velocities, we get the number of particles per
unit volume n(r, t):
∫ f(r, v, t) d³v = n(r, t).    (8.1)

For the sake of illustration, we consider the case of a gas of particles with mass m. At
equilibrium, the velocity distribution function f⁽⁰⁾(r, v) is given by the Maxwell-Boltzmann
distribution:

f⁽⁰⁾(r, v) = n0 (m/2πkB T)^{3/2} exp(−mv²/2kB T).    (8.2)
It is useful to note that the Maxwell-Boltzmann distribution function can be factorized
as the product of the number of particles per unit volume n0 and a velocity probability
density function P(v), where:

P(v) = (m/2πkB T)^{3/2} exp(−mv²/2kB T).    (8.3)

Flux density

In order to derive the flux density, we first consider the flux of the quantity X through a
plane normal to the unit vector n due to particles with a velocity v. The particles crossing
the plane through an area Σ during dt are contained in a volume (v · n) dt Σ. In this volume,
there are f(r, v, t) d³v (v · n) dt Σ particles, so that there are f(r, v, t) d³v (v · n) dt particles
per unit area. Each particle transports the quantity X, so that the contribution to the flux
density is given by:

djX = [X] f(r, v, t) d³v (v · n).    (8.4)
To obtain the total flux, we need to add the contributions of all particles with all possible
velocities. Thus, we get:
jX · n = ∫ [X] f(r, v, t) (v · n) d³v.    (8.5)

We note that in this equation, the scalar product v · n changes sign with the velocity sign
along the normal n so that particles with a positive velocity give a positive contribution
to the flux whereas molecules with negative velocity give a negative contribution. We can
define a flux density j in such a way that the flux through an elementary surface dΣn is
ΦX = jX · (dΣn):
jX = ∫ [X] f(r, v, t) v d³v.    (8.6)

From this expression, it is clearly seen that for an isotropic distribution of velocity, the
flux of a quantity X independent of the velocity is null. At equilibrium, the Maxwell-
Boltzmann law is symmetric so that the formula yields a zero flux as expected1 . The same
behaviour is obtained for electrons in degenerate systems governed by Fermi-Dirac law. 2
To evaluate a flux, it is thus necessary to derive the velocity distribution out of equilibrium.
The departure from equilibrium may be due to a perturbation such as an electric field, a
temperature gradient, a concentration gradient, etc. To proceed, it is necessary to derive an
equation for the velocity distribution. This equation is known as the Boltzmann equation
and will be derived in the next section.

8.2 Boltzmann equation


8.2.1 Liouville theorem
The velocity distribution function is defined in the micro phase space with 6 dimensions
(x, y, z, vx , vy , vz ). Using Newton’s equations, it can be shown that the volume d3 rd3 v is
conserved during time evolution. This result is known as Liouville theorem. For the sake of
illustration, a one dimensional example is given in Fig. 1.
We can now derive an evolution equation for the velocity distribution. We start by
noting that if a particle belongs to the volume element d3 rd3 v at time t, then it stays in this
volume element at any time. To prove this statement, let us assume that a particle is leaving
the volume element. Before leaving the volume element, say at time t, the particle is on
its boundary. Hence, its position and initial velocity are arbitrarily close to neighboring
points on the boundary. Regarding a further evolution of the particle in phase space, its
velocity and position at time t define the initial conditions. Since they are the same as those
of the neighboring points on the boundary, the further evolution is the same for the particle and its
¹ By contrast, if the quantity considered is the momentum normal to the surface, the result is non-zero and
yields the momentum flux per unit area, i.e. the pressure. This quantity is obviously non-null at equilibrium.
² Degenerate systems are systems such that the classical limit does not apply, or in other words, such that the
condition n̄ ≪ 1 is not satisfied, where n̄ is the mean number of particles per state (i.e. the mean occupation
number).

neighbors on the boundary, so that the particle stays on the boundary. Since a particle stays
in its volume element and the volume does not change, we can therefore state that the
number of particles per unit phase-space volume is a constant:

df(r, v, t)/dt = 0.    (8.7)
The total derivative can be expanded as follows:

∂f(r, v, t)/∂t + v · ∇r f(r, v, t) + γ · ∇v f(r, v, t) = 0,    (8.8)
where we have introduced the acceleration γ = dv/dt.

8.2.2 Boltzmann equation


The previous analysis is based on Newton’s equations and does not account for collisions.
When a collision happens, the velocity changes abruptly from v to v′. Hence, the corre-
sponding point is teleported in phase space: the particle disappears at (r, v) and pops up at
(r, v′). It follows that f(r, v, t) can no longer be considered constant. We need to include a
collision term in the equation, denoted Γcoll, to account for the creation and annihilation of
particles in phase space. The equation governing the evolution of the velocity distribution becomes
the Boltzmann equation:

∂f(r, v, t)/∂t + v · ∇r f(r, v, t) + γ · ∇v f(r, v, t) = Γcoll.    (8.9)

The derivation of the collision term requires a careful study of the interactions between all
particles. While this can be done, we will skip a detailed treatment and instead introduce a
simple model.

8.3 Relaxation time approximation


The spirit of the relaxation time approximation is to avoid a microscopic analysis of the
collision term and introduce a phenomenological form. The rationale for this simple form
is based on the effect of collisions. They tend to bring the system back to equilibrium with
a typical time scale τ . Hence, we introduce a collision term in the form:

Γcoll = −(f − f⁽⁰⁾)/τ,    (8.10)
where we have introduced the relaxation time τ. This characteristic time usually depends
on the particle energy. In what follows, we will neglect the dependence of the relaxation
time on the energy. However, we stress that this is a rough approximation. The reader needs
to keep in mind that the relaxation time is not the same for particles with different energies,
as the loss channels may strongly depend on the energy of the particles.

8.4 Perturbative solution


The procedure to model transport phenomena consists of three steps:

i) establish the general form of the flux density;

ii) use the Boltzmann equation to find the velocity distribution;

iii) insert the velocity distribution in the flux density formula.

In what follows, we will assume that the system is close to equilibrium. Hence, we look
for a correction of the velocity distribution using a perturbative technique. Thus, we cast
the velocity distribution f in the form:

f (r, v, t) = f (0) (r, v, t)[1 + ϕ(r, v, t)], (8.11)


where |ϕ| ≪ 1 is a first-order perturbative correction. Inserting this form into the
Boltzmann equation, we get:

ϕ(r, v, t) = −(τ/f⁽⁰⁾) df/dt = −(τ/f⁽⁰⁾) d{f⁽⁰⁾(r, v, t)[1 + ϕ(r, v, t)]}/dt.    (8.12)

The term on the left hand side is of first order in ϕ. Hence, we only keep first order terms
on the right-hand side. So far, we have not identified what the perturbative parameter is. We
have only formally postulated that a perturbative expansion is possible. To proceed, it proves
useful to perform a dimensional analysis of the equation. Let us introduce a dimensionless
time t⁺ such that t = t⁺θ, where θ is a characteristic time of the system. The Boltzmann
equation can be cast in the form:

ϕ(r, v, t) = −(τ/θ)(1/f⁽⁰⁾) df/dt⁺ = −(τ/θ)(1/f⁽⁰⁾) d{f⁽⁰⁾[1 + ϕ]}/dt⁺.    (8.13)

We see that the right-hand side term is proportional to a dimensionless number called the
Knudsen number, given by Kn = τ/θ. This number determines the order of magnitude of ϕ. It
provides a quantitative measure of the degree of non-equilibrium of the system. A Knudsen
number small compared to 1 corresponds to a system close to equilibrium. In that
case, a perturbative solution is justified. This situation corresponds to a system that changes
slowly in time as compared to τ. This is called a collisional regime, as many collisions may
take place while the system gradually evolves. These collisions result in the possibility of
maintaining a local equilibrium at any time. Let us now examine in more detail the different
terms of the right-hand side in order to identify the first order terms in Kn.
Before analysing an example, it is important to make a technical remark on the analysis
of the order of magnitude of different terms in an equation when derivatives are involved. Let
us consider a quantity of the form ε cos(2πx/d), where ε ≪ 1, and a function cos(2πx/D).
It is clearly seen that the orders of magnitude of these two functions are given
by ε and 1 respectively, so that the first is much smaller than the second. Let us now suppose
that we compare their derivatives with respect to x. We find −ε(2π/d) sin(2πx/d) and
−(2π/D) sin(2πx/D). Now, the comparison also depends on d/D, so that the first term may
be the largest one. While this example is trivial, when comparing different terms in an
equation, this effect may be hidden. It becomes obvious by introducing dimensionless
variables. Let us introduce here the variable x⁺ = 2πx/d. Hence, for the first function, the
derivative d/dx becomes (2π/d) d/dx⁺; a similar procedure for the second function yields
(2π/D) d/dx⁺. It is seen that the typical length scale appears naturally when introducing
dimensionless variables. Indeed, the function cos(x⁺) has the same order of magnitude
as its derivative −sin(x⁺). When dealing with functions of dimensionless variables,
the functions and their derivatives have the same order of magnitude. Hence, the difficulty
of comparing terms has been removed.
To summarize, when comparing different terms of an equation containing both functions
and their derivatives, it is necessary to first introduce dimensionless variables. By doing so,
the natural length scales of the problem appear explicitly in the equations and the resulting
terms can be compared directly.

8.5 Examples
8.5.1 Electrical conduction in an ionic solution
Here, we study the electrical conduction in a liquid containing n ions per unit volume. We
assume that the system is in stationary regime. The only cause of nonequilibrium is the
presence of the electric field E. Our goal is to derive the current density and therefore to
recover Ohm’s law. The procedure is always the same:
i) General form of the flux density;
ii) Derivation of the velocity distribution by seeking a perturbative solution of Boltzmann
equation in the relaxation time approximation;
iii) Derivation of the transport coefficient.

Here, the quantity X transported is the charge e of each ion. We only consider the
contribution of the ions with positive charge e and mass m. A similar term should be
included to account for the contribution of the negative ions. The flux can be cast in the
form:

j = e ∫ f(r, v) v d³v.    (8.14)

The Boltzmann equation in the relaxation time approximation can be cast in the form:

ϕ(r, v, t) = −(τ/f⁽⁰⁾) [∂f(r, v, t)/∂t + v · ∇r f(r, v, t) + γ · ∇v f(r, v, t)].    (8.15)
Now, we need to analyse this equation in detail in order to identify the first order terms
that contribute to the perturbation of the velocity distribution. It is obviously expected that
the electric field plays a key role. Let us examine where it appears. The first term is zero
because the system is in a stationary regime, so that time derivatives are null. The second
term is also null (∇r f = 0) because the system is homogeneous. The third term introduces
the derivative with respect to velocity. Since f⁽⁰⁾ depends explicitly on velocity, this derivative
is non-zero. The acceleration term is also non-zero because there is an electric force such
that mγ = eE. We finally obtain:

ϕ(r, v, t) = −(τ/f⁽⁰⁾) (eE/m) · ∇v f(r, v, t).    (8.16)
It is easy to identify the Knudsen number from this equation:

Kn = eEτ/(mv0),    (8.17)

where v0 is the order of magnitude of the velocity, given by √(kB T/m). This dimensional
analysis provides the order of magnitude of ϕ. Hence, the perturbative expansion is valid
if the velocity eEτ /m due to the work done by the field between two collisions is much
smaller than the mean thermal velocity v0 . It can also be viewed as a comparison between
two times: the collision time τ and the time θ = mv0 /eE required to accelerate the ion to
a velocity v0. The perturbative solution is valid for τ ≪ θ.
To derive the first-order solution, we write f = f⁽⁰⁾(1 + ϕ) in Eq.(8.16): the term involving
∇v f⁽⁰⁾ yields a contribution of order one in Kn, while the term involving ∇v(ϕf⁽⁰⁾) yields a
contribution of order two. By keeping only first-order terms, we finally obtain:

ϕ(r, v, t) f⁽⁰⁾ = −(eEτ/m) · ∇v f⁽⁰⁾(r, v, t) = −(eEτ/m) ∂f⁽⁰⁾(r, v, t)/∂vx,    (8.18)
where we have chosen the x axis along the electric field. By inserting this result in Eq.(8.14),
we obtain:
j = e ∫ (−(eτE/m) ∂f⁽⁰⁾/∂vx) v d³v.    (8.19)
Since f⁽⁰⁾ is an even function of the velocity, its derivative with respect to vx is an odd
function of vx. It follows that the components of the current density along y and z are zero,
as can be seen by inspection of the integral over vx. By contrast, the component of the
current density along the x axis involves the product of two odd functions, vx and ∂f⁽⁰⁾/∂vx,
so that the integrand is an even function.
Finally, we obtain:

jx = −(e²τE/m) ∫ vx (∂f⁽⁰⁾/∂vx) dvx dvy dvz,
jy = 0,
jz = 0.    (8.20)

The integral can be computed by integrating by parts. Noting that ∫ f⁽⁰⁾ d³v = n, we obtain:

jx = −(e²τE/m) ( [vx f⁽⁰⁾]_{−∞}^{+∞} − ∫ f⁽⁰⁾ d³v ) = (ne²τ/m) E.    (8.21)
This result has the structure of Ohm's law. We now have an explicit form of the conductivity,
σ = ne²τ/m, as a function of the characteristics of the system. This result can be applied to
either ionic solutions or plasmas. It cannot be applied to degenerate electronic systems such as
metals. We will address this issue in the next chapter and we will show that a similar form
is obtained for the conductivity.
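As a rough numerical sketch (the ion mass, density, collision time and field below are assumed order-of-magnitude figures for a monovalent ionic solution, not data from this chapter), we can check both the smallness of the Knudsen number (8.17) and the size of the conductivity read off Eq.(8.21):

import math

# Assumed order-of-magnitude values for a monovalent ionic solution (illustrative only)
e, m = 1.602e-19, 3.8e-26     # ion charge (C) and mass, ~23 amu (kg)
n = 6.0e25                    # ion density, ~0.1 mol/L (m^-3)
tau = 1.0e-13                 # assumed collision time (s)
E = 1.0e3                     # applied field (V/m)
kB, T = 1.381e-23, 300.0

v0 = math.sqrt(kB * T / m)    # thermal velocity scale
Kn = e * E * tau / (m * v0)   # Knudsen number, Eq. (8.17)
sigma = n * e**2 * tau / m    # conductivity, Eq. (8.21)

print(f"v0 ~ {v0:.0f} m/s, Kn ~ {Kn:.1e}, sigma ~ {sigma:.1f} S/m")

With these numbers Kn ~ 10⁻⁶, so the perturbative expansion is comfortably justified, and σ comes out at a few S/m, the correct scale for a concentrated electrolyte.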
8.5.2 Problem : Momentum transport. Viscosity
We consider a flow characterized by a velocity field V = V (z)ex . The fluid is a classical
fluid described by a velocity distribution function given by:
f(r, v) = n (m/(2πkB T))^{3/2} exp( −m(v − V)²/(2kB T) ).    (8.22)
We study the flux of the x-component of the momentum px through a plane z = z0
using the kinetic method. We will show that the flux can be cast in the form:
jpx,z = Pxz = −µ ∂V/∂z,    (8.23)
where µ is the viscosity.
1) Give the expression of the flux density.
2) Derive the velocity distribution function in the relaxation time approximation.
3) Derive the form of the viscosity µ as a function of the density n, the relaxation time τ
and the temperature T .
Solution
1. Momentum flux density px :
jpx,z = ∫ d³v px f(r, v) vz.    (8.24)
2. Using the Boltzmann equation, we find:

vz ∂f(r, v)/∂z = −(f − f⁽⁰⁾)/τ = −ϕf⁽⁰⁾/τ.    (8.25)
To first order, we get:
ϕf⁽⁰⁾ = τ vz (∂f⁽⁰⁾(r, v)/∂vx) (∂V/∂z).    (8.26)
3. Fluid viscosity.
Inserting the velocity distribution function in Eq.(8.24), we get:

jpx,z = −µ ∂V/∂z,    (8.27)

where the viscosity is given by:

µ = −∫ d³v τ m vx vz² ∂f⁽⁰⁾(r, v)/∂vx.    (8.28)
To compute the integral, it is useful to note that the velocity distribution at equilibrium can
be cast in the form of the product of the particle density n and a velocity probability density
function P(v). Thus, we have:

µ = −nmτ ⟨vz²⟩ ∫ vx (∂P(vx)/∂vx) dvx = n kB T τ,    (8.29)

where we have used ⟨vz²⟩ = kB T/m and, integrating by parts, ∫ vx (∂P(vx)/∂vx) dvx = −∫ P(vx) dvx = −1.
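As a quick sketch of the orders of magnitude involved (the density and collision time below are assumed illustrative values for a dilute gas at ambient conditions):

# Kinetic estimate of the viscosity, mu = n kB T tau (assumed illustrative values)
n = 2.5e25          # number density (m^-3), roughly 1 atm at 300 K
kB, T = 1.381e-23, 300.0
tau = 1.0e-10       # assumed collision time (s)

mu = n * kB * T * tau
print(f"mu ~ {mu:.1e} Pa s")    # ~1e-5 Pa s, the right scale for a dilute gas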
Chapter 9

Introduction to irreversible thermodynamics.

9.1 Introduction

Classical thermodynamics is devoted to the study of systems in equilibrium. At equilibrium,


intensive variables are uniform so that all fluxes are null. In addition, all thermodynamic
quantities are stationary so that thermostatics would be a more appropriate name. The goal
of irreversible thermodynamics is to extend the formalism of classical thermodynamics so
that fluxes can be accounted for. Knowing the fluxes allows modelling time dependent prob-
lems so that the dynamics of heat transfer can be studied. The most obvious example is the
diffusion equation for temperature. Knowing Fourier’s law and a local energy conservation
law yields the diffusion equation that governs the temperature field.
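As a one-step illustration (assuming a constant conductivity λ and writing the energy density as ρE = ρc T with a constant volumetric heat capacity ρc), Fourier's law J = −λ∇T combined with the local conservation law ∂ρE/∂t + div J = 0 gives

ρc ∂T/∂t = λ ∇²T,

which is precisely the diffusion equation for the temperature field, with diffusivity D = λ/(ρc).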
When comparing irreversible thermodynamics with a microscopic approach, it appears
that one of the merits of irreversible thermodynamics is to be able to include experimental
laws using measured coefficients such as thermal conductivity or electrical conductivity.
It provides a general framework where these quantities can be related. Indeed, the coeffi-
cients appearing in the laws describing fluxes have to satisfy general principles as already
discussed in the lecture on linear response. Within the framework of irreversible thermody-
namics, it is possible to derive the Onsager reciprocity relations. These relations establish
a link between the transport coefficients in the presence of simultaneously occurring transport
phenomena such as a heat flux and an electrical current. Another remarkable feature of
irreversible thermodynamics is to specify the local rate of entropy creation which is very
useful to identify the origin of irreversibility. Finally, it is possible to generalize the second
principle and to derive a local form called Clausius-Duhem inequality. We will introduce
these three aspects in this chapter.
The first section is devoted to a reminder of the axiomatic formulation of thermody-
namics. Then, we introduce the basic concepts of irreversible thermodynamics: affinities,
fluxes, and local conservation laws for homogeneous systems. We generalize this approach
in section 4 to heterogeneous systems with gradients. The last section deals with Onsager
reciprocity relations and Clausius-Duhem inequalities. To obtain the reciprocity relations, a
systematic prescription on how to define the coefficients is required. Indeed, the coefficients
are not the same if one defines the heat flux as proportional to ∇T or ∇(1/T ). This will be
clarified in section 3.


9.2 Overview of classical thermodynamics


9.2.1 Fundamental principles
The thermodynamic description of systems uses a number of concepts that will be briefly
reviewed hereafter. The system is described using a number of extensive variables such as
energy U , volume V , number of particles of type i denoted Ni .
The second key concept is the definition of equilibrium. The goal of thermodynamics
is to determine the new equilibrium state of the system after removing an internal
constraint in a closed composite system. A typical example consists of a moving piston
separating two gas chambers. The piston can be blocked by a constraint so that the pressures
may differ on the two sides. Another example is two isolated capacitors charged at
different potentials. Removing the constraint in the first case amounts to freeing the piston so
that the subsystems can exchange volume. In the second case, an operator can establish an
electrical contact between the two capacitors, allowing a charge exchange.
Thermodynamics needs to formulate laws governing the evolution of the systems so
that the final equilibrium states can be predicted. To proceed, thermodynamics has three
fundamental postulates. The first postulate is the existence of an extensive function of the
extensive variables of any composite system. This function is called entropy and denoted
S. It is maximum at equilibrium in the absence of any internal constraint:

S(U, V, N1, ..., NN).    (9.1)

The second postulate is that entropy is an additive and increasing function of energy.
Hence, when the system is divided into two subsystems, the entropy of the system is the
sum of the entropy of the two subsystems.
The third postulate fixes the entropy at zero for zero temperature.

9.2.2 Local Thermodynamic Equilibrium, intensive variables and state equation
The concept of local thermodynamic equilibrium is based on the idea that a system can be
divided into subsystems such that their entropy can be defined as a function of the exten-
sive variables in this subsystem. We now consider that the subsystems can exchange the
extensive quantities Xk . We also assume that the fluxes are sufficiently small so that each
subsystem stays in equilibrium.
Let us denote by a and b two subsystems. The entropy of system a is denoted Sa . It is a
function of the subsystem energy Ea. We can then define the temperature of the subsystem
using 1/Ta = ∂Sa/∂Ea. This equation establishes a link between the intensive variable Ta of
subsystem a and the corresponding extensive quantity Ea. Hence, it is a state equation. We
now consider a continuous system. It is possible to define locally a temperature 1/T(r) =
∂ρS/∂ρE, where ρS and ρE are the entropy and energy per unit volume at point r.
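As a minimal example of such a state equation (quoting the standard monatomic ideal gas entropy without deriving it here), S(U, V, N) contains a term (3/2) N kB ln U, so that

1/T = ∂S/∂U = (3/2) N kB / U,   i.e.   U = (3/2) N kB T,

which links the intensive variable T to the extensive variable U.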

9.3 Homogeneous systems. Fluxes and Affinities.


9.3.1 Affinity (or generalized force)
The basic idea at work in irreversible thermodynamics was introduced in the context of
linear response theory. The fluxes are a linear response to an excitation. The purpose of this

section is to introduce in a systematic way a prescription to identify what is the force and
what is the response. For instance, what should be the "force" at work for heat fluxes: ∇T
or ∇(1/T)?
In order to introduce heuristically the relevant quantities for a formulation of a theory
of non-equilibrium phenomena, we revisit the equilibrium case. At equilibrium, we have a
well-established principle: the entropy is maximum. In order to introduce non-equilibrium,
we consider an isolated system Σ0 which can be split into two subsystems:

Σ0 = Σ ∪ Σ′.

The extensive quantities in each subsystem are denoted by (X1, X2, ..., XN) and (X1′, X2′, ..., XN′)
respectively. The conservation law of Xk yields:

Xk0 = Xk + Xk′,

so that

dXk0 = dXk + dXk′ = 0.
According to the second principle, the entropy is maximum at equilibrium so that:

∂S0/∂Xk = ∂S/∂Xk + ∂S′/∂Xk = ∂S/∂Xk − ∂S′/∂Xk′ = 0.    (9.2)

This equation shows that the equilibrium is characterized by the equality of the partial
derivatives of the entropy with respect to its natural variables (i.e. extensive variables).
In the following, we denote by Fk and Fk′ the partial derivatives ∂S/∂Xk and ∂S′/∂Xk′. This
suggests that if they are not equal, then the difference Fk − Fk′ plays the role of the driving
force for the exchange of quantity Xk . Obviously, such an approach is expected to be valid
for small differences so that this is a theory for systems close to equilibrium. In other words,
we are working in the spirit of the linear response theory approach. With this in mind, we
introduce the quantity:

𝓕Xk = ∂S0/∂Xk = FXk − F′Xk,    (9.3)
which is called affinity or generalized force. We now give a number of examples of these
affinities. Let us start by considering the internal energy U . We get:

X = U;   FU = ∂S/∂U = 1/T;   𝓕U = 1/T − 1/T′.
Let us now consider the volume. Our purpose here is to show how to proceed to identify
systematically the correct affinity. We start from the thermodynamic identity as it is usually
expressed in classical thermodynamics in terms of energy:

dU = T dS − P dV + µdN + xdX.

We rewrite this identity by expressing the variation of entropy:

dS = dU/T + (P/T) dV − (µ/T) dN − (x/T) dX.
From this identity, we immediately identify the affinity for the volume:

X = V;   FV = ∂S/∂V = P/T;   𝓕V = P/T − P′/T′,

for the number of particles:

X = n;   Fn = ∂S/∂n = −µ/T;   𝓕n = −µ/T + µ′/T′,

and for any extensive quantity X:

FX = ∂S/∂X = −x/T;   𝓕X = −x/T + x′/T′.
To summarize, let us stress that the main motivation for defining new intensive variables
(e.g. P/T instead of P) is the different roles played by entropy and energy in the theoretical
framework of thermodynamics. While the internal energy U is merely one extensive
variable among others, the entropy plays a central role. The usual formulation of thermody-
namics gives a particular role to the internal energy for historical reasons. From a fundamental
point of view, it is the entropy that has to play the central role.

9.3.2 Entropy production rate


We now consider the evolution of a system when we remove an internal constraint. The
system is composed of two subsystems. The total entropy is the sum of the entropies of the
two subsystems. If the internal variables Xk of the subsystems change, then the entropy also
changes as it depends on the Xk. Hence, we can write the entropy variation in terms of the
rates of variation ∂Xk/∂t:

dS0/dt = Σk (∂S0/∂Xk)(∂Xk/∂t) = Σk (∂S/∂Xk − ∂S′/∂Xk′)(∂Xk/∂t),    (9.4)

which can be cast in the form:

dS0/dt = Σk 𝓕k (∂Xk/∂t).    (9.5)

The physical content of this equation is very important. It clearly shows that there is
a production of entropy associated with the flux of the quantity Xk if the affinity 𝓕k (i.e. the
discontinuity in Fk) is non zero. Furthermore, the rate of entropy production appears as the
product of two terms: the affinity 𝓕k and the flux ∂Xk/∂t.
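As a minimal worked example (exchange of energy only, Xk = U), Eq. (9.5) reduces to

dS0/dt = (1/T − 1/T′) dU/dt.

If T < T′, the affinity 1/T − 1/T′ is positive, and a positive entropy production requires dU/dt > 0: energy flows towards the colder subsystem, and the production stops only when T = T′.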
We now move to systems with a continuous gradient of the internal variables, such as a
temperature gradient or a density gradient.

9.4 Flux and affinity in a heterogeneous system.


9.4.1 Local thermodynamic equilibrium
We define the local thermodynamic equilibrium by assuming that the entropy can be defined
locally, per unit volume. Of course, this is only possible if the extensive variables on which
the entropy depends, such as energy or number of particles, can also be defined locally.
Within the macroscopic approach adopted here, we cannot exhibit validity conditions for this assumption.

We have seen previously, in the framework of the Boltzmann equation, that the validity condi-
tion for local thermodynamic equilibrium was essentially a low Knudsen number. In other
words, the system may change over time scales which are larger than the collision time and
it may change in space with length scales which are larger than the mean free path. We now
establish the thermodynamic identity in its local version. We start from the identity:
dS = Σk (∂S/∂Xk) dXk,    (9.6)

which is well defined for a system with volume V. So far, we have considered a system in
equilibrium. We can always consider the system to be an ensemble of different subsystems.
Let us introduce Fk = ∂S/∂Xk and consider a subvolume δV which can be chosen arbitrarily.
Introducing the entropy density ρS such that S = ρS δV and the density of Xk, denoted ρk,
such that Xk = ρk δV, we see that we can write for this subsystem:
δV dρS = Σk Fk δV dρk,    (9.7)

so that

dρS = Σk Fk dρk.    (9.8)

We can now introduce Fk = ∂S/∂Xk = ∂ρS/∂ρk. Finally, we note that these quantities have been
defined locally. There is no need to assume that the values are homogeneous over the entire
system.

9.4.2 Local form of the conservation law of Xk


The quantities Xk are extensive variables. Each such quantity is conserved, and a
local conservation equation can be written. Let us consider a volume V. The amount of Xk
contained in this volume can change only when there are fluxes through the enclosing surface.
We introduce the corresponding current density Jk. The local form of conservation of the
quantity Xk can be cast in the usual form:

∂ρk/∂t + div Jk = 0.    (9.9)
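A minimal numerical sketch of this conservation law (a 1D diffusive flux Jk = −D ∂ρk/∂x on a periodic grid; the grid size, diffusivity and initial profile are arbitrary choices):

import numpy as np

# 1D continuity equation d(rho_k)/dt + dJ_k/dx = 0 with a diffusive flux J = -D d(rho)/dx
N, Lx, D, dt = 200, 1.0, 1e-3, 1e-3            # grid size, domain, diffusivity, time step
dx = Lx / N
x = np.arange(N) * dx
rho = 1.0 + 0.5 * np.cos(2 * np.pi * x / Lx)   # arbitrary initial density profile

total0 = rho.sum() * dx                        # total amount of X_k at t = 0
for _ in range(1000):
    J = -D * (np.roll(rho, -1) - rho) / dx     # flux on the periodic grid
    rho -= dt * (J - np.roll(J, 1)) / dx       # minus the discrete divergence of J

print(abs(rho.sum() * dx - total0))            # ~1e-15: X_k is conserved to round-off

Because the update is the discrete divergence of a flux, the total amount of Xk can only change through the boundaries, which is exactly the content of Eq. (9.9).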

9.4.3 Entropy conservation. Affinity. Local entropy production rate.


In this section, we aim at deriving the local form of the conservation of entropy. As we
know, entropy can be produced when there are irreversible processes, so that we need to
introduce a local rate of entropy production. We start by studying the entropy variation of a
system of volume V enclosed by a surface Σ. The entropy of that system is given by:
SV = ∫V ρS(ρ1(r, t), ..., ρN(r, t)) dr.
We now analyse the origin of the entropy variation in the system. First of all, as the
entropy is a function of the extensive variables Xk, it is clear that if there is a flux of such a
quantity, Xk in the system will change and therefore the entropy will change. In other
words, a flux of Xk generates a flux of entropy. Hence, we need to introduce an entropy
flux associated with the fluxes Jk. We also know that there is a term describing entropy
generation associated with irreversible processes, denoted (∂ρS/∂t)irrev. A budget of entropy
variation is thus given by:

dSV/dt = d/dt ∫V ρS(ρ1(r, t), ..., ρN(r, t)) dr = ∫V (∂ρS/∂t)irrev dr − ∮Σ JS · dΣ.

The local conservation of entropy can thus be cast in the form:

∂ρS/∂t + div JS = (∂ρS/∂t)irrev,

where the term on the right is the entropy production rate due to irreversible processes. We
will derive its explicit form by studying the left hand side terms. Let us first use the local
thermodynamic identity (9.8) to introduce the time rate of entropy change:

∂ρS/∂t = Σk Fk (∂ρk/∂t),    (9.10)

and the entropy current density:

JS = Σk Fk Jk.    (9.11)

In the previous equation, we have used the fact that at local thermodynamic equilibrium,
the entropy has no explicit dependence on time other than through the time dependence
of the ρk. The divergence of the entropy current density can be cast in the form:
div JS = div Σk Fk Jk = Σk Fk div Jk + Σk ∇Fk · Jk = −Σk Fk (∂ρk/∂t) + Σk ∇Fk · Jk,    (9.12)

where we have used the conservation of Xk (9.9). We finally find:


(∂ρS/∂t)irrev = Σk ∇Fk · Jk.    (9.13)

By analogy with the previous section, we have introduced the affinity defined by ∇Fk .
The above result shows that the source of entropy and therefore of irreversibility is propor-
tional to the gradient of the intensive quantities. More precisely, the entropy production rate
is the product of two terms: the affinity and the corresponding flux. This result provides a
clear justification of the concept of quasistatic transformation. It is seen that as the affinities
go to zero, the entropy production vanishes. This justifies the use of quasistatic transformations
to suppress entropy generation. We now provide two examples to illustrate these concepts.

9.4.4 Examples
Heat transfer
We start by considering a cylinder of uniform section Σ and length L connecting two ther-
mostats at temperature T1 and T2 with T1 > T2 . We consider a transformation such that
an amount δQ of heat is transferred from thermostat 1 to thermostat 2 during a time δt. We
will study the entropy generation during this transformation using two different points of
view: a global thermodynamic study and a local computation of entropy generation. By in-
tegrating over the total volume, the local approach should allow us to recover the result of the
global approach. We start with the global approach. The entropy variation of thermostat
1 is given by:

∆S1 = −δQ/T1,
whereas the entropy variation of thermostat 2 is

∆S2 = δQ/T2.

Since entropy is additive, the total variation of entropy in the system is:
∆S = ∆S1 + ∆S2 = δQ (1/T2 − 1/T1).

This formula provides the entropy generated during the heat transfer. As already pointed
out, it increases as the discontinuity T1 − T2 increases. Let us now study the local rate of
entropy production. We first start from the thermodynamic identity to identify the proper
affinity: dS = dU/T + Σk Fk dXk. The affinity is thus given by ∇(1/T) and the flux is the
internal energy flux, which is the heat flux in this case. It is given by Fourier's law:

(∂ρS/∂t)irrev = Σk ∇Fk · Jk = ∇(1/T) · [−λ∇T].

Using ∇(1/T) = −∇T/T², the entropy production rate is given by:

(∂ρS/∂t)irrev = (λ/T²) (∇T)².

This equation yields the local entropy production rate. It is interesting to observe that
the entropy is only created where the gradient is non zero and therefore, only in the wire
connecting the thermostats. In order to obtain the total entropy produced, it suffices to
sum over the whole volume of the wire with section Σ and length L. By noting that the
temperature field is one dimensional and denoting by x the variable along the wire, we get:
∫0^L dx (λ/T²) (∂T/∂x)² Σ.

We have T (0) = T1 and T (L) = T2 . We choose the flux orientation along the x axis so
that the flux is positive if T1 > T2. Noting that Φ = −λ (∂T/∂x) Σ is constant along the wire in the
stationary regime, we can write the integral in the form:
−∫0^L dx Φ (1/T²)(∂T/∂x) = Φ ∫0^L dx ∂(1/T)/∂x = Φ (1/T2 − 1/T1).

This result is clearly in agreement with the previous approach if we consider that during δt,
there was a heat flow δQ = Φ δt.
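A short numerical check of this equality (a sketch: the conductivity, section, length and temperatures below are arbitrary values, and the stationary profile is taken linear, which holds for a constant λ):

import numpy as np

lam, Sigma, L = 100.0, 1e-4, 1.0       # assumed conductivity (W/m/K), section (m^2), length (m)
T1, T2 = 400.0, 300.0                  # thermostat temperatures, T1 > T2 (K)

x = np.linspace(0.0, L, 10001)
T = T1 + (T2 - T1) * x / L             # stationary profile for a constant lambda
dTdx = (T2 - T1) / L
Phi = -lam * dTdx * Sigma              # heat flux, constant along the wire (W)

# Integral of the local production rate (lambda/T^2)(dT/dx)^2 over the wire volume
integrand = lam / T**2 * dTdx**2 * Sigma
local = np.sum(0.5 * (integrand[:-1] + integrand[1:]) * np.diff(x))   # trapezoid rule
global_budget = Phi * (1.0 / T2 - 1.0 / T1)

print(local, global_budget)            # both ~8.3e-4 W/K: the two approaches agree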

Entropy generated by Joule effect


It is well known that the transformation of mechanical work into heat leads to a production of
entropy. The Joule effect is a particular example of this general phenomenon. In this section,
we aim at deriving the entropy production due to the Joule effect. To proceed, we start with the
general result for the entropy production rate:

(∂ρS/∂t)irrev = Σk ∇Fk · Jk.
Let us identify the relevant affinity and flux. The quantity X is the electrical charge, so that
the flux is simply the usual electric current density, denoted j. Ohm's law yields
a simple expression for this current density: j = σE. We now identify the affinity. We start
from the thermodynamic identity:
dU = T dS + V dq + µdn.
Here, we have included a term that depends on the chemical potential as an exchange of
charges implies an exchange of particles. By noting that n and q are not independent as
they are related by q = −ne in the case of electrons, we obtain the thermodynamic identity:
dU = T dS + (V − µ/e) dq.
It is seen that the electrochemical potential Vec = V − µ/e appears naturally. We can now
cast the thermodynamic identity in the form:
dS = dU/T − (Vec/T) dq.
From this identity, we can now derive the form of the affinity, given by ∇(−Vec/T).
In the particular case of the Joule effect, the metal is isothermal and the chemical potential
is uniform, so that the affinity is simply given by −∇V/T = E/T. It follows that:

(∂ρS/∂t)irrev = (E/T) · σE = σE²/T.
The entropy production rate is thus given by the electromagnetic power dissipated per unit
volume divided by the temperature. We now consider that the Joule effect takes place in a
simple cylindrical resistor with length L and uniform section Σ. We can introduce the voltage
U across the resistor, such that the electric field is given by E = U/L. Integrating the entropy
production rate over the whole volume, we get:

dS/dt = ∫V (σE²/T) Σ dx = (Σ/(ρL)) (U²/T) = (U²/R)(1/T),

where the resistance is R = ρL/Σ = L/(σΣ), with ρ = 1/σ the resistivity.
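As a sketch with assumed values (a hypothetical 1 Ω resistor under 1 V at room temperature):

# Entropy production rate dS/dt = U^2/(R T) for an assumed resistor
U, R, T = 1.0, 1.0, 300.0       # voltage (V), resistance (Ohm), temperature (K)
print(U**2 / (R * T))           # ~3.3e-3 W/K for 1 W dissipated at 300 K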

9.4.5 Linear response theory for fluxes


In the previous section, we have introduced linear response coefficients such as thermal
conductivity and electrical conductivity. In the framework of irreversible thermodynamics,
these linear processes are defined in a systematic way by connecting the fluxes Jk and the
affinities ∇Fl. Thus, we have:

Jk = Σl Lkl ∇Fl.    (9.14)

It is seen that the most general form of a flux involves all the affinities so that crossed
terms are included. Obviously, such a linear response formulation is an approximation
which is valid for systems close to equilibrium. In the framework of irreversible thermody-
namics, the existence of these fluxes is an experimental fact. The theory is a phenomeno-
logical theory. Yet, it is possible to derive general properties that must be fulfilled by the
coefficients. It is the purpose of this section to introduce these properties.
Let us first stress that the formulation of general properties is possible only if the
systematic definition of the coefficients is used. We stress that the values of the coefficients are
not the same if one considers that the heat flux is proportional to ∇T, as is usually done,
or if one uses the affinity ∇(1/T). Similarly, the charge current density is the response to
−∇(Vec/T) and not to E = −∇V. Hence, the linear response coefficient is not the usual
electrical conductivity.
It is worth mentioning that the linear relations established here are valid for affinities
that vary slowly in time as compared to the relaxation times of the system, so that the linear
coefficients are non dispersive. If that condition is satisfied, the systems are said to be
Markovian. Indeed, if the system has a response that is non dispersive in the frequency
domain, its response in the time domain is instantaneous. In other words, its response does
not depend on the values of the excitation at times preceding the observation of the effect.

9.5 Principles of irreversible thermodynamics


In what follows, we introduce a number of principles that must be satisfied by the linear
coefficients, which are measured experimentally. They follow from general physical
properties.

9.5.1 Curie Principle


The Curie symmetry principle can be formulated as follows: "When causes produce effects, all
symmetry properties of the cause are also symmetry properties of the effects."
Hence, if a system is invariant under a transformation, then the fluxes must also be invariant
under the same transformation. Let us consider the form:

Jk^α = Σ_{l,β} L_{kl}^{αβ} ∂β Fl,    (9.15)

where the exponents α, β denote the Cartesian components of the vectors. The most general
form of the transport coefficient is thus a tensor of rank 2. Suppose first that the system has a
preferred direction u but is invariant under any rotation about this axis. By the Curie principle,
the tensor L_{kl}^{αβ} must share this invariance, and the symmetric rank-2 tensors invariant
under rotations about u are spanned by δ^{αβ} and u^α u^β. The linear coefficient can then be
cast in the form A δ^{αβ} + B u^α u^β. This reduces the number of parameters from 9 to 2! If the
system is invariant under any rotation, as is the case for the thermal conductivity of a simple
fluid, only δ^{αβ} survives and a single scalar coefficient remains.

9.5.2 Clausius-Duhem inequality


The Clausius-Duhem inequality is a generalization of the second principle of thermodynamics.
We have shown that the entropy production rate can be cast in the form:

(∂ρS/∂t)irrev = Σk ∇Fk · Jk.    (9.16)

We have also expressed the fluxes using the linear response coefficients of Eq. (9.14). By
inserting this form in the previous expression, the entropy production rate appears as a
bilinear form:

(∂ρS/∂t)irrev = Σ_{k,l} ∇Fk · Lkl ∇Fl.    (9.17)

The Clausius-Duhem principle postulates that the rate of entropy production is positive, so that

Σ_{k,l} ∇Fk · Lkl ∇Fl ≥ 0.    (9.18)

This condition imposes many constraints on the linear coefficients. For instance, it
follows that thermal conductivity and electrical conductivity are positive.
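Since (9.18) must hold for arbitrary affinities, the symmetric part of the matrix Lkl must be positive semidefinite. A minimal numerical check (with an arbitrary illustrative 2×2 matrix, not measured coefficients):

import numpy as np

# Illustrative 2x2 matrix of kinetic coefficients (arbitrary values, not measured ones)
L = np.array([[2.0, 0.8],
              [0.6, 1.0]])

# The bilinear form F . L F equals F . sym(L) F, so (9.18) holds for arbitrary
# affinities if and only if the symmetric part of L is positive semidefinite.
sym = 0.5 * (L + L.T)
eigs = np.linalg.eigvalsh(sym)
print(eigs, bool((eigs >= 0).all()))        # both eigenvalues >= 0 here

for _ in range(5):                          # spot check with random affinities
    F = np.random.randn(2)
    assert F @ L @ F >= 0.0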

9.5.3 Onsager reciprocity relations


The Onsager reciprocity relations enter irreversible thermodynamics as an additional
principle. They were derived by Lars Onsager in 1931, who was awarded the Nobel Prize in
Chemistry for this discovery in 1968. These relations can be proved using time
reversal symmetry properties of microscopic processes. They can be cast in the following
form:
Lkl (B) = Llk (−B) (9.19)
We must insist on the fact that these relations are only valid if the coefficients were
defined following the prescriptions given above. In the above formulation, a dependence
of the coefficients on the magnetic field B has been introduced. The reason is as follows.
As mentioned above, the physical origin of the reciprocity relations is the invariance of
the microscopic equations under time reversal. Yet, when changing time t into time −t,
the Lorentz force is not invariant. It is necessary to also change B into −B as the veloc-
ity changes sign. The interested reader will find the derivation in the original papers by
Onsager. A simplified derivation is given in the textbook by H. Callen.
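As an illustration (a sketch using the Drude magneto-conductivity tensor of a free-electron gas, a standard result that is not derived in this chapter; the carrier parameters are assumed values), one can check Eq. (9.19) numerically for the conductivity tensor in a field B along z:

import numpy as np

def sigma(B, n=1e28, tau=1e-14, m=9.11e-31, e=1.602e-19):
    # Drude magneto-conductivity tensor for a field B along z (standard result;
    # the sign of the off-diagonal terms depends on the carrier charge convention)
    s0 = n * e**2 * tau / m           # zero-field Drude conductivity
    b = e * B * tau / m               # omega_c * tau
    return (s0 / (1 + b**2)) * np.array([[1.0, -b, 0.0],
                                         [b, 1.0, 0.0],
                                         [0.0, 0.0, 1.0 + b**2]])

B = 5.0                               # arbitrary field (T)
# Onsager reciprocity (9.19): sigma_kl(B) = sigma_lk(-B)
print(np.allclose(sigma(B), sigma(-B).T))   # True

The diagonal elements are even in B while the Hall terms are odd, so transposing the tensor and reversing the field leave it unchanged, as the reciprocity relations require.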
