Entropy and Partial Differential Equations
Lawrence C. Evans
Department of Mathematics, UC Berkeley
Inspiring Quotations
A good many times I have been present at gatherings of people who, by the standards
of traditional culture, are thought highly educated and who have with considerable gusto
been expressing their incredulity at the illiteracy of scientists. Once or twice I have been
provoked and have asked the company how many of them could describe the Second Law of
Thermodynamics. The response was cold: it was also negative. Yet I was asking something
which is about the scientific equivalent of: Have you read a work of Shakespeare's?
C. P. Snow, The Two Cultures and the Scientific Revolution
. . . C. P. Snow relates that he occasionally became so provoked at literary colleagues who
scorned the restricted reading habits of scientists that he would challenge them to explain
the second law of thermodynamics. The response was invariably a cold negative silence. The
test was too hard. Even a scientist would be hard-pressed to explain Carnot engines and
refrigerators, reversibility and irreversibility, energy dissipation and entropy increase. . . all in
the span of a cocktail party conversation.
E. E. Daub, Maxwell's demon
He began then, bewilderingly, to talk about something called entropy . . . She did gather
that there were two distinct kinds of this entropy. One having to do with heat engines, the
other with communication. . . Entropy is a figure of speech then. . . a metaphor.
T. Pynchon, The Crying of Lot 49
CONTENTS
Introduction
A. Overview
B. Themes
I. Entropy and equilibrium
A. Thermal systems in equilibrium
B. Examples
1. Simple fluids
2. Other examples
C. Physical interpretations of the model
1. Equilibrium
2. Positivity of temperature
3. Extensive and intensive parameters
4. Concavity of S
5. Convexity of E
6. Entropy maximization, energy minimization
D. Thermodynamic potentials
1. Review of Legendre transform
2. Definitions
3. Maxwell relations
E. Capacities
F. More examples
1. Ideal gas
2. Van der Waals fluid
II. Entropy and irreversibility
A. A model material
1. Definitions
2. Energy and entropy
a. Working and heating
b. First Law, existence of E
c. Carnot cycles
d. Second Law
e. Existence of S
3. Efficiency of cycles
4. Adding dissipation, Clausius inequality
B. Some general theories
1. Entropy and efficiency
a. Definitions
b. Existence of S
2. Entropy, temperature and separating hyperplanes
a. Definitions
b. Second Law
c. Hahn-Banach Theorem
d. Existence of S, T
III. Continuum thermodynamics
A. Kinematics
1. Definitions
2. Physical quantities
3. Kinematic formulas
4. Deformation gradient
B. Conservation laws, Clausius-Duhem inequality
C. Constitutive relations
1. Fluids
2. Elastic materials
D. Workless dissipation
IV. Elliptic and parabolic equations
A. Entropy and elliptic equations
1. Definitions
2. Estimates for equilibrium entropy production
a. A capacity estimate
b. A pointwise bound
3. Harnack's inequality
B. Entropy and parabolic equations
1. Definitions
2. Evolution of entropy
a. Entropy increase
b. Second derivatives in time
c. A differential form of Harnack's inequality
3. Clausius inequality
a. Cycles
b. Heating
c. Almost reversible cycles
V. Conservation laws and kinetic equations
A. Some physical PDE
INTRODUCTION
A. Overview
This course surveys various uses of entropy concepts in the study of PDE, both linear
and nonlinear. We will begin in Chapters I-III with a recounting of entropy in physics, with
particular emphasis on axiomatic approaches to entropy as
(i) characterizing equilibrium states (Chapter I),
(ii) characterizing irreversibility for processes (Chapter II),
and
(iii) characterizing continuum thermodynamics (Chapter III).
Later we will discuss probabilistic theories for entropy as
(iv) characterizing uncertainty (Chapter VII).
I will, especially in Chapters II and III, follow the mathematical derivation of entropy provided by modern rational thermodynamics, thereby avoiding many customary physical arguments. The main references here will be Callen [C], Owen [O], and Coleman-Noll [C-N].
In Chapter IV I follow Day [D] by demonstrating for certain linear second-order elliptic and
parabolic PDE that various estimates are analogues of entropy concepts (e.g. the Clausius
inequality). I as well draw connections with Harnack inequalities. In Chapter V (conservation laws) and Chapter VI (Hamilton-Jacobi equations) I review the proper notions of weak
solutions, illustrating that the inequalities inherent in the definitions can be interpreted as
irreversibility conditions. Chapter VII introduces the probabilistic interpretation of entropy
and Chapter VIII concerns the related theory of large deviations. Following Varadhan [V]
and Rezakhanlou [R], I will explain some connections with entropy, and demonstrate various
PDE applications.
B. Themes
In spite of the longish time spent in Chapters I-III, VII reviewing physics, this is a
mathematics course on partial differential equations. My main concern is PDE and how
various notions involving entropy have influenced our understanding of PDE. As we will
cover a lot of material from many sources, let me explicitly write out here some unifying
themes:
(i) the use of entropy in deriving various physical PDE,
(ii) the use of entropy to characterize irreversibility in PDE evolving in time,
and
I. Entropy and equilibrium

A. Thermal systems in equilibrium

We are given:
(a) an open, convex subset Σ of R^(m+1) (the state space, whose points are called states), and
(b) a C^1 function

(1)  S : Σ → R  (the entropy),

written

(3)  S = S(E, X1, . . . , Xm),

such that

(2)  (i) S is concave,
     (ii) ∂S/∂E > 0,
     (iii) S is positively homogeneous of degree 1.

Here and afterwards we assume without further comment that S and other functions derived
from S are evaluated only in open, convex regions where the various functions make sense.
In particular, when we note that (2)(iii) means

(4)  S(λE, λX1, . . . , λXm) = λS(E, X1, . . . , Xm)   (λ > 0),

we automatically consider in (4) only those states for which both sides of (4) are defined.
Owing to (2)(ii), we can solve (3) for E as a C 1 function of (S, X1 , . . . , Xm ):
(5)
E = E(S, X1 , . . . , Xm ).
(6)  T := ∂E/∂S = temperature,
     Pk := −∂E/∂Xk = k-th generalized force (or pressure)   (k = 1, . . . , m).
Lemma 1  The function E is positively homogeneous of degree 1:

(7)  E(λS, λX1, . . . , λXm) = λE(S, X1, . . . , Xm)   (λ > 0).

Proof. Write W = E(S, X1, . . . , Xm), so that S = S(W, X1, . . . , Xm). Then (4) gives
λS = S(λW, λX1, . . . , λXm); since ∂S/∂E > 0, we may invert to find
E(λS, λX1, . . . , λXm) = λW = λE(S, X1, . . . , Xm). □
Lemma 2  We have

(9)  ∂S/∂E = 1/T,   ∂S/∂Xk = Pk/T   (k = 1, . . . , m).

Proof. Since T = ∂E/∂S > 0, we have ∂S/∂E = (∂E/∂S)^(−1) = 1/T. Also

     W = E(S(W, X1, . . . , Xm), X1, . . . , Xm)

for all W, X1, . . . , Xm. Differentiate with respect to Xk:

     0 = (∂E/∂S)(∂S/∂Xk) + ∂E/∂Xk = T (∂S/∂Xk) − Pk.

Thus ∂S/∂Xk = Pk/T. □
(10)  dE = T dS − ∑_{k=1}^{m} Pk dXk   (Gibbs' formula).
Note carefully: at this point (10) means merely T = ∂E/∂S, Pk = −∂E/∂Xk (k = 1, . . . , m). We will
later in Chapter II interpret T dS as infinitesimal heating and ∑_{k=1}^{m} Pk dXk as infinitesimal
working for a process. In this chapter however there is no notion whatsoever of anything
changing in time: everything is in equilibrium.
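None of the following appears in the notes, but Lemma 2 is easy to sanity-check numerically. The sketch below takes a toy entropy S(E, X1) = log E + log X1 (an assumption of mine, chosen only because it satisfies hypotheses (2)(i), (ii)) and verifies ∂S/∂E = 1/T and ∂S/∂X1 = P1/T by central differences.

```python
import math

def S(E, X):
    # toy concave entropy with dS/dE > 0 (illustrative choice, not Evans's)
    return math.log(E) + math.log(X)

def dSdE(E, X, h=1e-6):
    return (S(E + h, X) - S(E - h, X)) / (2 * h)

def dSdX(E, X, h=1e-6):
    return (S(E, X + h) - S(E, X - h)) / (2 * h)

E0, X0 = 2.0, 3.0
T = 1.0 / dSdE(E0, X0)   # Lemma 2: dS/dE = 1/T
P = T * dSdX(E0, X0)     # Lemma 2: dS/dX = P/T

# for this S one computes by hand T = E and P = E/X
assert abs(T - E0) < 1e-4
assert abs(P - E0 / X0) < 1e-4
print(T, P)
```

For this particular S one finds by hand T = E and P1 = E/X1, which is exactly what the assertions check.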
B. Examples
In applications X1, . . . , Xm may measure many different physical quantities.
1. Simple fluid. An important case is a homogeneous simple fluid, for which
(1)  E = internal energy
     V = volume
     N = mole number
     S = S(E, V, N)
     T = ∂E/∂S = temperature
     P = −∂E/∂V = pressure
     μ = ∂E/∂N = chemical potential,

and so

(2)  dE = T dS − P dV + μ dN.
Remark. We will most often consider the situation that N is identically constant, say
N = 1. Then we write
(3)  S = S(E, V ),

and so

(4)  E = internal energy
     V = volume
     S = S(E, V ) = entropy
     T = ∂E/∂S = temperature
     P = −∂E/∂V = pressure,

with

(5)  dE = T dS − P dV.
Note that S(E, V ) will not satisfy the homogeneity condition (2)(iii) however.
2. Other examples. Although we will for simplicity of exposition mostly discuss simple
fluid systems, it is important to understand that many interpretations are possible. (See,
e.g., Zemansky [Z].)

Extensive parameter X     Intensive parameter P = −∂E/∂X
length                    tension
area                      surface tension
volume                    pressure
electric charge           electric force
magnetization             magnetic intensity
3. Extensive and intensive parameters

Consider a thermal system in equilibrium, with entropy S satisfying (i) (S concave) and
(ii) (∂S/∂E > 0) as before. Imagine dividing the system into two subregions # 1 and # 2,
which comprise the fractions λ and 1 − λ of the whole (0 < λ < 1). Then

(1)  S^1 = λS,  E^1 = λE,  X_k^1 = λXk   (k = 1, . . . , m)

[Figure: the two subregions, with parameters S^1, E^1, . . . , X_k^1, . . . and S^2, E^2, . . . , X_k^2, . . . ]

and

     S^2 = (1 − λ)S,  E^2 = (1 − λ)E,  X_k^2 = (1 − λ)Xk   (k = 1, . . . , m).

Thus

(2)  S = S^1 + S^2,  E = E^1 + E^2,  Xk = X_k^1 + X_k^2   (k = 1, . . . , m).
The homogeneity assumption (iii) is just (1). As a consequence, we see from (2) that
S, E, . . . , Xm are additive over subregions of our thermal system in equilibrium.
On the other hand, if T^1, P_k^1, . . . are the temperatures and generalized forces for subregion
# 1, and T^2, P_k^2, . . . are the same for subregion # 2, we have

     T = T^1 = T^2,   Pk = P_k^1 = P_k^2   (k = 1, . . . , m),

owing to Lemma 1 in §A. Hence T, P1, . . . , Pm are intensive parameters, which take the same
value on each subregion of our thermal system in equilibrium.
4. Concavity of S
Note very carefully that we are hypothesizing the additivity condition (2) only for subregions of a given thermal system in equilibrium.
We next motivate the concavity hypothesis (i) by looking at the quite different physical
situation that we have two isolated fluid bodies A, B of the same substance:
[Figure: bodies A and B, with parameters S^A, E^A, . . . , X_k^A, . . . and S^B, E^B, . . . , X_k^B, . . . ,
combined into a single body C with parameters S^C, E^C, . . . , X_k^C, . . . ]

Here the two bodies are brought together and allowed to exchange heat, but no work is done.
(In Chapter II we will more carefully define heat and work.) After C reaches equilibrium,
we can meaningfully discuss S^C, E^C, . . . , X_k^C, . . . . Since no work has been done, we have

(3)  E^C = E^A + E^B,   X_k^C = X_k^A + X_k^B   (k = 1, . . . , m).
But then, since entropy increases as the combined system approaches equilibrium,

(4)  S^C = S(E^C, . . . , X_k^C, . . . ) = S(E^A + E^B, . . . , X_k^A + X_k^B, . . . )
         ≥ S^A + S^B = S(E^A, . . . , X_k^A, . . . ) + S(E^B, . . . , X_k^B, . . . ).

So S is superadditive; combined with the homogeneity (2)(iii) this gives, for states x, y,

     S(λx + (1 − λ)y) ≥ S(λx) + S((1 − λ)y) = λS(x) + (1 − λ)S(y).

Thus S is concave.
5. Convexity of E
Next we show that

(5)  E is a convex function of (S, X1, . . . , Xm).
To verify (5), take any S^A, S^B, X_1^A, . . . , X_m^A, X_1^B, . . . , X_m^B, and 0 < λ < 1. Define

     E^A := E(S^A, X_1^A, . . . , X_m^A),   E^B := E(S^B, X_1^B, . . . , X_m^B);

so that

     S^A = S(E^A, X_1^A, . . . , X_m^A),   S^B = S(E^B, X_1^B, . . . , X_m^B).

Since S is concave,

(6)  S(λE^A + (1 − λ)E^B, . . . , λX_k^A + (1 − λ)X_k^B, . . . )
        ≥ λS(E^A, . . . , X_k^A, . . . ) + (1 − λ)S(E^B, . . . , X_k^B, . . . ).

Now

     W = E(S(W, . . . , Xk, . . . ), . . . , Xk, . . . )

for all W, X1, . . . , Xm. Hence

     λE^A + (1 − λ)E^B
        = E(S(λE^A + (1 − λ)E^B, . . . , λX_k^A + (1 − λ)X_k^B, . . . ), . . . , λX_k^A + (1 − λ)X_k^B, . . . )
        ≥ E(λS(E^A, . . . , X_k^A, . . . ) + (1 − λ)S(E^B, . . . , X_k^B, . . . ), . . . , λX_k^A + (1 − λ)X_k^B, . . . )

owing to (6), since ∂E/∂S = T > 0. That is,

     E(λS^A + (1 − λ)S^B, . . . , λX_k^A + (1 − λ)X_k^B, . . . ) ≤ λE^A + (1 − λ)E^B,

and so E is convex. □
6. Entropy maximization and energy minimization
Lastly we mention some physical variational principles (taken from Callen [C, p. 131-137])
for isolated thermal systems.
Entropy Maximization Principle. The equilibrium value of any unconstrained internal
parameter is such as to maximize the entropy for the given value of the total internal energy.
Energy Minimization Principle. The equilibrium value of any unconstrained internal
parameter is such as to minimize the energy for the given value of the total entropy.
[Figure: graph of S = S(E, ·, Xk, ·) over the (E, Xk)-plane, intersected with the constraint
plane E = E*; the entropy maximum occurs at (E*, ·, Xk*, ·).]

[Figure: graph of E = E(S, ·, Xk, ·) over the (S, Xk)-plane, intersected with the constraint
plane S = S*; the energy minimum occurs at (S*, ·, Xk*, ·).]
The first picture illustrates the entropy maximization principle: given the energy constraint
E = E*, the values of the unconstrained parameters (X1, . . . , Xm) are such as to maximize

     (X1, . . . , Xm) ↦ S(E*, X1, . . . , Xm).

The second picture is the dual energy minimization principle: given the entropy constraint
S = S*, the values of the unconstrained parameters (X1, . . . , Xm) are such as to minimize

     (X1, . . . , Xm) ↦ E(S*, X1, . . . , Xm).
D. Thermodynamic potentials
Since E is convex and S is concave, we can employ ideas from convex analysis to rewrite
various formulas in terms of the intensive variables T = ∂E/∂S, Pk = −∂E/∂Xk (k = 1, . . . , m).
The primary tool will be the Legendre transform. (See e.g. Sewell [SE], [E1, §III.C], etc.)
1. Review of Legendre transform
Assume that H : R^n → (−∞, +∞] is a convex, lower semicontinuous function, which is
proper (i.e. not identically equal to infinity).

Definition. The Legendre transform of H is

(1)  L(q) := sup_{p ∈ R^n} (p · q − H(p))   (q ∈ R^n).

We usually write L = H*. It is not very hard to prove that L is likewise convex, lower
semicontinuous and proper. Furthermore the Legendre transform of L = H* is H:

(2)  L = H*,  H = L*.

Now suppose in addition that H is C^1 and

(3)  q = DH(p).

Then

(4)  L(q) = p · q − H(p).

Furthermore

     DL(q) = p + (q − DH(p)) Dq p = p   by (3),

and so

(5)  p = DL(q).
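To make the duality (1)-(5) concrete, here is a small numerical sketch (mine, not from the notes): approximate the sup in (1) over a grid for H(p) = p²/2 in one dimension, which is its own Legendre transform, and check the involution (2) pointwise.

```python
def H(p):
    # H(p) = p^2 / 2, a convex function equal to its own Legendre transform
    return 0.5 * p * p

def legendre(f, q, grid):
    # brute-force approximation of sup_p (p*q - f(p)) over a finite grid
    return max(p * q - f(p) for p in grid)

grid = [i * 0.001 for i in range(-4000, 4001)]   # p in [-4, 4]

q = 1.5
L_q = legendre(H, q, grid)
assert abs(L_q - 0.5 * q * q) < 1e-3     # L = H* satisfies L(q) = q^2/2

# since H is self-dual, L(p) = p^2/2 as well, so transforming once more
# recovers H, illustrating H = L* from (2)
H_back = legendre(lambda p: 0.5 * p * p, 1.5, grid)
assert abs(H_back - H(1.5)) < 1e-3
print(L_q, H_back)
```

The maximizing p on the grid is p = q, in agreement with (3) and (5): q = DH(p) = p and p = DL(q) = q.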
2. Definitions
The energy E and entropy S are not directly physically measurable, whereas certain of
the intensive variables (e.g. T, P ) are. It is consequently convenient to employ the Legendre
transform to convert to functions of various intensive variables. Let us consider an energy
function

     E = E(S, V, X2, . . . , Xm),

where we explicitly take X1 = V = volume and regard the remaining parameters X2, . . . , Xm
as being fixed. For simplicity of notation, we do not display (X2, . . . , Xm), and just write

(6)  E = E(S, V ).

Definitions. The Helmholtz free energy F, the enthalpy H, and the Gibbs potential G are

(7)  F (T, V ) := inf_S ( E(S, V ) − T S ),

(8)  H(S, P ) := inf_V ( E(S, V ) + P V ),

(9)  G(T, P ) := inf_{S,V} ( E(S, V ) − T S + P V ).
Remark. The inf in (7) is taken over those S such that (S, V ) lies in the domain of E.
A similar remark applies to (8), (9).
We hereafter assume:

(10)  E is C^2, strictly convex,

(11)  the Hessian of E in (S, V ) is positive definite.

Then the infima in (7)-(9) are attained, and

(12)  F = E − T S,  where T = ∂E/∂S (S, V ),

(13)  H = E + P V,  where P = −∂E/∂V (S, V ),

(14)  G = E − T S + P V,  where T = ∂E/∂S (S, V ), P = −∂E/∂V (S, V ).
Lemma 3
(i) E is locally strictly convex in (S, V ).
(ii) F is locally strictly concave in T , locally strictly convex in V .
(iii) H is locally strictly concave in P , locally strictly convex in S.
(iv) G is locally strictly concave in (T, P ).
Remark. From (9) we see that G is the inf of affine mappings of (T, P ) and thus is concave. However to establish the strict concavity, etc., we will invoke (10), (11) and use the
formulations (12)-(14). Note also that we say "locally strictly" convex, concave in (ii)-(iv),
since what we really establish is the sign of various second derivatives.
Proof. 1. We have

(15)  F (T, V ) = E(S(T, V ), V ) − T S(T, V ),

where S = S(T, V ) solves

(16)  T = ∂E/∂S (S(T, V ), V ).

Then (15) implies

     ∂F/∂T = (∂E/∂S)(∂S/∂T) − S − T (∂S/∂T) = −S,
     ∂F/∂V = (∂E/∂S)(∂S/∂V) + ∂E/∂V − T (∂S/∂V) = ∂E/∂V   (= −P ).

Thus

(17)  ∂²F/∂T² = −∂S/∂T,   ∂²F/∂V² = (∂²E/∂S∂V)(∂S/∂V) + ∂²E/∂V².

Next differentiate (16) with respect to T and to V :

     1 = (∂²E/∂S²)(∂S/∂T),   0 = (∂²E/∂S²)(∂S/∂V) + ∂²E/∂S∂V.

Thus

     ∂S/∂T = (∂²E/∂S²)^(−1),   ∂S/∂V = −(∂²E/∂S∂V)(∂²E/∂S²)^(−1).

Hence:

     ∂²F/∂T² = −(∂²E/∂S²)^(−1) < 0,
     ∂²F/∂V² = ∂²E/∂V² − (∂²E/∂S∂V)² (∂²E/∂S²)^(−1) > 0,

the last inequality holding since the Hessian of E is positive definite.
This proves (ii), and (iii), (iv) are similar. □
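The following numerical sketch (my construction; the constants are illustrative) illustrates Lemma 3(ii). It uses the simple-ideal-gas energy E(S, V) = C_V e^{S/C_V} V^{−R/C_V}, which follows from the formulas of §F below with N = 1, computes F(T, V) as the inf in (7) by brute force, and checks the signs of second differences in T and in V.

```python
import math

CV, R = 1.5, 1.0   # simple ideal gas, illustrative units with R = 1

def E(S, V):
    # invert S = CV log T + R log V together with E = CV T
    return CV * math.exp(S / CV) * V ** (-R / CV)

def F(T, V):
    # Helmholtz free energy as the inf in (7), over a grid of S values
    return min(E(S, V) - T * S for S in (i * 0.001 for i in range(-5000, 5001)))

def second_diff(f, x, h=0.1):
    return f(x + h) - 2 * f(x) + f(x - h)

# Lemma 3(ii): F is concave in T, convex in V
assert second_diff(lambda T: F(T, 1.0), 1.0) < 0
assert second_diff(lambda V: F(1.0, V), 1.0) > 0
print(F(1.0, 1.0))
```

For this E one can also evaluate the inf in closed form, F = C_V T − T(C_V log T + R log V), whose second derivatives −C_V/T < 0 and TR/V² > 0 agree with the signs found numerically.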
3. Maxwell's relations

Notation. We will hereafter regard T, P in some instances as independent variables (and
not, as earlier, as functions of S, V ). We will accordingly need better notation when we compute partial derivatives, to display which independent variables are involved. The standard
notation is to list the other independent variables outside parentheses.

For instance if we think of S as being a function of, say, T and V , we henceforth write

     (∂S/∂T)_V

to denote the partial derivative of S in T , with V held constant, and

     (∂S/∂V)_T

to denote the partial derivative of S in V , T constant. However we will not employ parentheses when computing the partial derivatives of E, F, G, H with respect to their natural
arguments. Thus if we are as usual thinking of F as a function of T, V , we write ∂F/∂T, not
(∂F/∂T)_V.
We next compute the first derivatives of the thermodynamic potentials:

Energy. E = E(S, V )

(18)  ∂E/∂S = T,   ∂E/∂V = −P.

Free energy. F = F (T, V )

(19)  ∂F/∂T = −S,   ∂F/∂V = −P.

Enthalpy. H = H(S, P )

(20)  ∂H/∂S = T,   ∂H/∂P = V.

Gibbs potential. G = G(T, P )

(21)  ∂G/∂T = −S,   ∂G/∂P = V.

Proof. The formulas (18) simply record our definitions of T, P . The remaining identities
are variants of the duality (3), (5). For instance, F = E − T S, where T = ∂E/∂S, S = S(T, V ).
So

     ∂F/∂T = (∂E/∂S)(∂S/∂T)_V − S − T (∂S/∂T)_V = −S,

since T = ∂E/∂S. The other formulas follow similarly. □
We can now equate the mixed second partial derivatives of E, F, G, H to derive further
identities. These are Maxwell's relations:

(22)  (∂T/∂V)_S = −(∂P/∂S)_V

(23)  (∂S/∂V)_T = (∂P/∂T)_V

(24)  (∂T/∂P)_S = (∂V/∂S)_P

(25)  (∂S/∂P)_T = −(∂V/∂T)_P

The equality (22) just says ∂²E/∂V∂S = ∂²E/∂S∂V ; (23) says −∂²F/∂V∂T = −∂²F/∂T∂V , etc.
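Maxwell's relation (23) can be checked numerically for a concrete fluid. The sketch below (not from the notes; it uses the van der Waals formulas from §F.2, with illustrative constants) compares (∂S/∂V)_T with (∂P/∂T)_V by central differences.

```python
import math

R, a, b, CV = 1.0, 0.5, 0.1, 1.5   # illustrative van der Waals constants

def S(T, V):
    # simple van der Waals entropy (see F.2 below), constants dropped
    return R * math.log(V - b) + CV * math.log(T)

def P(T, V):
    # van der Waals equation of state
    return R * T / (V - b) - a / V ** 2

h = 1e-6
T0, V0 = 2.0, 1.0
dS_dV = (S(T0, V0 + h) - S(T0, V0 - h)) / (2 * h)   # (dS/dV)_T
dP_dT = (P(T0 + h, V0) - P(T0 - h, V0)) / (2 * h)   # (dP/dT)_V

assert abs(dS_dV - dP_dT) < 1e-6    # Maxwell's relation (23)
print(dS_dV, dP_dT)
```

Both sides evaluate to R/(V − b), which the finite differences reproduce to high accuracy.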
E. Capacities
For later reference, we record here some notation:
(1)  C_P = T (∂S/∂T)_P = heat capacity at constant pressure

(2)  C_V = T (∂S/∂T)_V = heat capacity at constant volume

(3)  Λ_P = T (∂S/∂P)_T = latent heat with respect to pressure

(4)  Λ_V = T (∂S/∂V)_T = latent heat with respect to volume

(5)  β = (1/V ) (∂V/∂T)_P = coefficient of thermal expansion

(6)  K_T = −(1/V ) (∂V/∂P)_T = isothermal compressibility

(7)  K_S = −(1/V ) (∂V/∂P)_S = adiabatic compressibility.
(See [B-S, p. 786-787] for the origin of the terms latent heat, heat capacity.)
There are many relationships among these quantities:
Lemma 4
(i) C_V = (∂E/∂T)_V
(ii) C_P = (∂H/∂T)_P
(iii) C_P > C_V > 0
(iv) Λ_V − P = (∂E/∂V)_T .
Proof. 1. Think of E as a function of T, V ; that is, E = E(S(T, V ), V ), where S(T, V )
means S as a function of T, V . Then

     (∂E/∂T)_V = (∂E/∂S)(∂S/∂T)_V = T (∂S/∂T)_V = C_V .

Likewise, think of H = H(S(T, P ), P ). Then

     (∂H/∂T)_P = (∂H/∂S)(∂S/∂T)_P = T (∂S/∂T)_P = C_P ,

where we used (20) from §D.

2. According to (19) in §D: S = −∂F/∂T. Thus

(8)  C_V = T (∂S/∂T)_V = −T ∂²F/∂T² > 0,

and similarly, since S = −∂G/∂T by (21),

(9)  C_P = T (∂S/∂T)_P = −T ∂²G/∂T² > 0.

3. Now G = F + P V , where V = V (T, P ). Consequently:

     ∂G/∂T = ∂F/∂T + (∂F/∂V + P ) (∂V/∂T)_P = ∂F/∂T,

since ∂F/∂V = −P , and so

(10)  ∂²G/∂T² = ∂²F/∂T² + (∂²F/∂T∂V) (∂V/∂T)_P .

Hence

     C_P − C_V = −T (∂²F/∂T∂V) (∂V/∂T)_P    (Kelvin's formula).
F. More examples
1. Ideal gas
An ideal gas is a simple fluid with the equation of state
(1)
P V = RT,
where R is the gas constant (Appendix A) and we have normalized by taking N = 1 mole.
As noted in §A, such an expression does not embody the full range of thermodynamic information available from the fundamental equation S = S(E, V ). We will see however that
many conclusions can be had from (1) alone:
Theorem 1 For an ideal gas,
(i) CP , CV are functions of T only:
CP = CP (T ), CV = CV (T ).
(ii) C_P − C_V = R.
(iii) E is a function of T only:

(2)  E = E(T ) = ∫_{T0}^{T} C_V (θ) dθ + E0.

(iv) S as a function of (T, V ) is:

(3)  S = S(T, V ) = R log V + ∫_{T0}^{T} (C_V (θ)/θ) dθ + S0.
Proof. 1. Owing to Kelvin's formula from §E,

     C_P − C_V = −T (∂²F/∂T∂V) (∂V/∂T)_P = −T (∂P/∂T)_V² / (∂P/∂V)_T ,

since ∂²F/∂T∂V = −(∂P/∂T)_V and (∂V/∂T)_P = −(∂P/∂T)_V / (∂P/∂V)_T .
Since P V = RT , we have

     (∂P/∂V)_T = −RT/V²,   (∂P/∂T)_V = R/V.

Thus

     C_P − C_V = T (V²/RT )(R²/V²) = R.

2. By Maxwell's relation (23), (∂S/∂V)_T = (∂P/∂T)_V = R/V . Hence

     (∂E/∂V)_T = (∂E/∂S)(∂S/∂V)_T + ∂E/∂V = T (∂P/∂T)_V − P = RT/V − P = 0,

so E is a function of T alone, with dE/dT = (∂E/∂T)_V = C_V by Lemma 4(i). This gives
(2); in particular C_V , and so C_P = C_V + R, are functions of T only.

3. Then

     (∂S/∂T)_V = (∂S/∂E)(dE/dT ) = (1/T ) C_V (T ),
     (∂S/∂V)_T = P/T = R/V.

Integrating, we obtain (3). □
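A quick numerical check of Theorem 1(ii) (my sketch, with illustrative constants): take the entropy from (3) with C_V constant, S = C_V log T + R log V, substitute V = RT/P to regard S as a function of (T, P), and compare C_P = T(∂S/∂T)_P with C_V = T(∂S/∂T)_V.

```python
import math

R, CV = 1.0, 1.5   # simple ideal gas, illustrative units

def S_TV(T, V):
    # S = CV log T + R log V, from (3) with CV constant
    return CV * math.log(T) + R * math.log(V)

def S_TP(T, P):
    # same entropy viewed as a function of (T, P), via V = R T / P
    return S_TV(T, R * T / P)

h, T0 = 1e-6, 2.0
C_V = T0 * (S_TV(T0 + h, 1.0) - S_TV(T0 - h, 1.0)) / (2 * h)  # T (dS/dT)_V
C_P = T0 * (S_TP(T0 + h, 1.0) - S_TP(T0 - h, 1.0)) / (2 * h)  # T (dS/dT)_P

assert abs(C_V - CV) < 1e-5
assert abs(C_P - C_V - R) < 1e-5    # Theorem 1 (ii): C_P - C_V = R
print(C_P - C_V)
```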
Remark. We can solve (2) for T as a function of E and so determine S = S(T (E), V ) as a
function of (E, V ). Let us check that

     (E, V ) ↦ S is concave,

provided C_V > 0. Now

     ∂S/∂E = (∂S/∂T )(∂T/∂E) = (1/T ) C_V (T ) (1/C_V (T )) = 1/T,
     ∂S/∂V = P/T = R/V.

Thus

     ∂²S/∂E² = −(1/T²) ∂T/∂E = −1/(T² C_V (T )) < 0,
     ∂²S/∂E∂V = 0,
     ∂²S/∂V² = −R/V² < 0,

and so S is concave.
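The concavity just verified can also be confirmed by finite differences (a sketch of mine; constants illustrative). For a simple ideal gas, solving E = C_V T gives S(E, V) = C_V log(E/C_V) + R log V, and the numerical Hessian should be negative definite.

```python
import math

R, CV = 1.0, 1.5   # simple ideal gas, illustrative units

def S(E, V):
    # S(E,V) = CV log(E/CV) + R log V, from solving E = CV T for T
    return CV * math.log(E / CV) + R * math.log(V)

h, E0, V0 = 1e-4, 2.0, 3.0
S_EE = (S(E0 + h, V0) - 2 * S(E0, V0) + S(E0 - h, V0)) / h ** 2
S_VV = (S(E0, V0 + h) - 2 * S(E0, V0) + S(E0, V0 - h)) / h ** 2
S_EV = (S(E0 + h, V0 + h) - S(E0 + h, V0 - h)
        - S(E0 - h, V0 + h) + S(E0 - h, V0 - h)) / (4 * h ** 2)

# Hessian is negative definite: S_EE < 0 and positive determinant
assert S_EE < 0
assert S_EE * S_VV - S_EV ** 2 > 0
print(S_EE, S_VV, S_EV)
```

The exact values here are S_EE = −C_V/E² = −0.375, S_VV = −R/V² = −1/9, and S_EV = 0, matching the Remark's computation.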
For N moles of an ideal gas,

     P V = N RT,

and

(4)  S(E, V, N ) = N S(E/N, V/N, 1) = N S(E/N, V/N ),

where the left-hand S is the function S of (E, V, N ) from §B and the right-hand S is the
function S of (E, V ) from §B. We claim

(5)  (E, V, N ) ↦ S(E, V, N ) is concave.

Indeed if 0 < λ < 1 and N, Ñ > 0, then:

     S(λE + (1 − λ)Ẽ, λV + (1 − λ)Ṽ, λN + (1 − λ)Ñ )
       = (λN + (1 − λ)Ñ ) S( (λE + (1 − λ)Ẽ)/(λN + (1 − λ)Ñ ), (λV + (1 − λ)Ṽ)/(λN + (1 − λ)Ñ ) )
       = (λN + (1 − λ)Ñ ) S( μ E/N + (1 − μ) Ẽ/Ñ, μ V/N + (1 − μ) Ṽ/Ñ ),

where

     μ := λN/(λN + (1 − λ)Ñ ),   1 − μ = (1 − λ)Ñ/(λN + (1 − λ)Ñ ).

Since (E, V ) ↦ S(E, V ) is concave, the last expression is

       ≥ (λN + (1 − λ)Ñ ) [ μ S(E/N, V/N ) + (1 − μ) S(Ẽ/Ñ, Ṽ/Ñ ) ]
       = λN S(E/N, V/N ) + (1 − λ)Ñ S(Ẽ/Ñ, Ṽ/Ñ )
       = λS(E, V, N ) + (1 − λ)S(Ẽ, Ṽ, Ñ ).
A simple ideal gas is an ideal gas for which C_V (and so C_P ) are positive constants. Thus

(6)  γ := C_P /C_V > 1.

From (2), (3) we deduce for a simple ideal gas that

(7)  E = C_V T,   S = R log V + C_V log T + S0   (N = 1).

(The constants S0 in (7), (8) and below differ.) For later reference we record:

(8)  S = C_V log(T V^(γ−1)) + S0,

(9)  S = C_V log(P V^γ) + S0,

where γ = C_P /C_V , C_P − C_V = R, N = 1.
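Formula (9) implies that P V^γ is constant along any adiabat (S constant). The sketch below (mine, with illustrative constants) integrates the adiabat ODE dV/dT = −C_V V/(RT), which follows from holding S = C_V log(T V^(γ−1)) + S0 fixed, and watches the invariant.

```python
R, CV = 1.0, 1.5
CP = CV + R
g = CP / CV                      # gamma = 5/3 for these constants

# forward-Euler integration of the adiabat ODE dV/dT = -CV V / (R T)
T, V = 2.0, 1.0
P0Vg = (R * T / V) * V ** g      # P V^gamma at the start (P = RT/V)
dT = -1e-5
for _ in range(100000):          # march T from 2.0 down to 1.0
    V += (-CV * V / (R * T)) * dT
    T += dT
P1Vg = (R * T / V) * V ** g

assert abs(P1Vg / P0Vg - 1) < 1e-3   # P V^gamma is an adiabatic invariant
print(P0Vg, P1Vg)
```

Analytically V(T) = V0 (T/T0)^(−C_V/R) along the adiabat, and a short computation confirms that P V^γ = R T V^(γ−1) is then constant, as the numerics show.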
2. Van der Waals fluid. A van der Waals fluid is a simple fluid with the equation of state

     P = RT/(V − b) − a/V²   (V > b, N = 1)

for constants a, b > 0. Computations like those for the ideal gas give

     E = E(T, V ) = ∫_{T0}^{T} C_V (θ) dθ − a/V + E0

and

     S = S(T, V ) = R log(V − b) + ∫_{T0}^{T} (C_V (θ)/θ) dθ + S0.

But then

     C_V = (∂E/∂T)_V

is a function of T only. We can define a simple van der Waals fluid, for which C_V is a constant. Then

     E = C_V T − a/V + E0,   S = R log(V − b) + C_V log T + S0   (N = 1).
However if we solve for S = S(E, V ), S is not concave everywhere. Thus a van der Waals fluid
fits into the foregoing framework only if we restrict attention to regions where (E, V ) ↦ S
is concave.
Remark. More generally we can replace S by its concave envelope (= the smallest concave
function greater than or equal to S in some region). See Callen [C] for a discussion of the
physical meaning of all this.
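One concrete way to see the failure (a numerical sketch of mine; constants illustrative): below the critical temperature T_c = 8a/(27Rb), a van der Waals isotherm is non-monotone, i.e. ∂P/∂V > 0 on part of its domain, while above T_c we have ∂P/∂V < 0 everywhere, as the framework of §II.A below requires.

```python
R, a, b = 1.0, 1.0, 0.1
Tc = 8 * a / (27 * R * b)        # critical temperature of the van der Waals fluid

def dPdV(T, V):
    # derivative of P = RT/(V-b) - a/V^2 with respect to V
    return -R * T / (V - b) ** 2 + 2 * a / V ** 3

Vs = [b + 0.01 * k for k in range(1, 1000)]

# below Tc the isotherm is non-monotone: dP/dV > 0 somewhere
assert any(dPdV(0.8 * Tc, V) > 0 for V in Vs)

# above Tc, dP/dV < 0 everywhere on the sampled range
assert all(dPdV(1.2 * Tc, V) < 0 for V in Vs)
print(Tc)
```

The sign change below T_c occurs near the critical volume V = 3b; on that region S(E, V) loses concavity, which is exactly where one passes to the concave envelope.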
We further assume:

(1)  ∂P/∂V < 0,  Λ_V ≠ 0,  C_V > 0  in Σ.
[Figure: a path Γ, parameterized by t ↦ (T (t), V (t)), in the state space Σ.]

We introduce the work 1-form

(2)  đW := P dV

and define the work done by the fluid along Γ:

(3)  W(Γ) = ∫_Γ đW = ∫_Γ P dV.

Similarly, the heating 1-form is

(4)  đQ := C_V dT + Λ_V dV,

and the heat gained by the fluid along Γ is

(5)  Q(Γ) = ∫_Γ đQ = ∫_Γ C_V dT + Λ_V dV.
If Γ is parameterized by (T (t), V (t)), a ≤ t ≤ b, then

     W(Γ) = ∫_a^b P (T (t), V (t)) V̇ (t) dt   and   Q(Γ) = ∫_a^b C_V Ṫ (t) + Λ_V V̇ (t) dt.

We call

(6)  w(t) = P (T (t), V (t)) V̇ (t)

the rate of working and

(7)  q(t) = C_V (T (t), V (t)) Ṫ (t) + Λ_V (T (t), V (t)) V̇ (t)

the rate of heating at time t.
Physical interpretations. (1) If we think of our homogeneous fluid body as occupying the
region U (t) ⊂ R³ at time t, then the rate of work at time t is

     w(t) = ∫_{∂U (t)} P v · ν dS,

v denoting the velocity field and ν the outward unit normal field along ∂U (t). Since we
assume P is independent of position, we have

     w(t) = P ∫_{∂U (t)} v · ν dS = P (d/dt)|U (t)| = P V̇ (t),

in agreement with (6).

A path Γ is called adiabatic if q ≡ 0 along Γ, that is, if C_V Ṫ + Λ_V V̇ ≡ 0. Since Λ_V ≠ 0,
this says

(8)  dV/dT = −C_V (V, T )/Λ_V (V, T )

for V as a function of T , V = V (T ). Taking different initial conditions for (8) gives different
adiabatic paths (a.k.a. adiabats).
Any C^1 parameterization of the graph of V = V (T ) gives an adiabatic path.
b. The First Law, existence of E
We turn now to our basic task, building E, S for our fluid system. The existence of
these quantities will result from physical principles, namely the First and Second Laws of
thermodynamics.
We begin with a form of the First Law: We hereafter assume that for every cycle Γ of
our homogeneous fluid body, we have:

(9)  W(Γ) = Q(Γ).

This is conservation of energy: The work done by the fluid along any cycle equals the
heat gained by the fluid along the cycle.
Remark. We assume in (9) that the units of work and heat are the same. If not, e.g. if heat is
measured in calories and work in Joules (Appendix A), we must include in (9) a multiplicative
factor on the right hand side called the mechanical equivalent of heat (= 4.184J/calorie).
According to (3), (5), the First Law (9) says

(10)  ∮_Γ C_V dT + (Λ_V − P ) dV = 0

for each cycle Γ in Σ. The 1-form C_V dT + (Λ_V − P ) dV is thus exact, since Σ is open, simply
connected. This means there exists a C^2 function E with

(11)  dE = C_V dT + (Λ_V − P ) dV ;

that is,

(12)  ∂E/∂T = C_V ,   ∂E/∂V = Λ_V − P.
c. Carnot cycles
Definition. A Carnot cycle Γ for our fluid is a cycle consisting of two distinct adiabatic
paths and two distinct isothermal paths, as drawn:
[Figure: a Carnot cycle in the (T, V )-plane, with isothermal paths at the temperatures
T1 < T2 joined by two adiabatic paths; the four pieces are labeled Γ_a, Γ_b, Γ_c, Γ_d.]

(We assume Λ_V > 0 for this picture and take a counterclockwise orientation.)
We have Q(Γ_b) = Q(Γ_d) = 0, since Γ_b, Γ_d are adiabatic paths.
Notation. We call Γ a Carnot heat engine if

     Q⁺ > 0 and Q⁻ > 0:

heat is gained at the higher temperature T2 and heat is lost at the lower temperature T1.
The picture above illustrates the correct orientation of Γ for a Carnot heat engine, provided
Λ_V > 0.
Example. Suppose our fluid body is in fact an ideal gas (discussed in §I.F). Then P V = RT
if we consider N = 1 mole, and

(14)  P (T, V ) = RT/V,   C_V (T, V ) = C_V (T ),   Λ_V (T, V ) = RT/V.

(The formula for Λ_V is motivated by our recalling from §I.E that we should have
Λ_V = T (∂S/∂V)_T = T (∂P/∂T)_V = RT/V .) Consider a Carnot heat engine, as drawn:

[Figure: the Carnot cycle in the (T, V )-plane, bounded by the isotherms T = T1, T = T2
and the adiabats V = V1(T ), V = V2(T ); the corner volumes are V1 < V2 at T2 and
V4 < V3 at T1.]
We compute

(15)  Q⁺ = ∫_{V1}^{V2} Λ_V dV = RT2 log(V2/V1).

The equation for the adiabatic parts of the cycle, according to (8), is:

     dV/dT = −C_V (T )/Λ_V = −V C_V (T )/(RT ).

Hence the formulas for the lower and upper adiabats are:

     V1(T ) = V1 exp( −∫_{T2}^{T} (C_V (θ)/(Rθ)) dθ ),
     V2(T ) = V2 exp( −∫_{T2}^{T} (C_V (θ)/(Rθ)) dθ ),

and so

     V4 = V1(T1) = V1 exp( ∫_{T1}^{T2} (C_V (θ)/(Rθ)) dθ ),
     V3 = V2(T1) = V2 exp( ∫_{T1}^{T2} (C_V (θ)/(Rθ)) dθ ).

Therefore V3/V4 = V2/V1 > 1, and

     Q⁻ = −∫_{V3}^{V4} Λ_V dV = RT1 log(V3/V4) = RT1 log(V2/V1) > 0.

The work is

(16)  W = Q⁺ − Q⁻ = R(T2 − T1) log(V2/V1) > 0,

and consequently

(17)  W = (1 − T1/T2) Q⁺.
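The formulas (15)-(17) are easy to evaluate numerically. The sketch below (mine; the numbers are arbitrary) computes Q⁺, Q⁻ and W for an ideal-gas Carnot engine and confirms W = (1 − T1/T2) Q⁺.

```python
import math

R = 1.0
T1, T2 = 300.0, 400.0
V1, V2 = 1.0, 2.0                      # volumes at the ends of the hot isotherm

Qplus  = R * T2 * math.log(V2 / V1)    # heat gained at T2, formula (15)
Qminus = R * T1 * math.log(V2 / V1)    # heat lost at T1
W = Qplus - Qminus                     # work, formula (16)

assert abs(W - (1 - T1 / T2) * Qplus) < 1e-9   # formula (17)
print(W / Qplus)                               # efficiency = 1 - T1/T2 = 0.25
```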
In other words we are assuming that formula (17), which we showed above holds for any
Carnot heat engine for an ideal gas, in fact holds for any Carnot heat engine for our general
homogeneous fluid body.
Physical interpretation. The precise relation (17) can be motivated as follows from this
general, if vague, statement, due essentially to Clausius:

(18)  no cyclic process is possible whose sole effect is the transfer of heat
      from a cooler body to a hotter body.

Consider two Carnot heat engines Γ, Γ̃, with work and heats W, Q⁺ and W̃, Q̃⁺, operating
between the same temperatures T2 > T1. We claim:

(19)  W = W̃ implies Q⁺ = Q̃⁺.
This says that any two Carnot cycles which operate between the same temperatures
and which perform the same work, must absorb the same heat at the higher temperature.
Physical derivation of (19) from (18). To see why (19) is in some sense a consequence
of (18), suppose not. Then for two fluid bodies we could find Carnot heat engines Γ, Γ̃,
operating between the temperatures T2 > T1, such that

     W = W̃,  but Q⁺ > Q̃⁺.

Then since W = W̃, we observe

     Q̃⁻ − Q⁻ = Q̃⁺ − Q⁺ < 0.

Imagine now the process Δ consisting of Γ̃ followed by the reversal of Γ. Then Δ would
absorb Q := Q⁻ − Q̃⁻ > 0 units of heat at the lower temperature T1 and emit the same Q
units of heat at the higher temperature. But since W̃ − W = 0, no work would be performed
by Δ. This would all contradict (18), however.
Physical derivation of (17) from (19). Another way of stating (19) is that for a Carnot
heat engine, Q⁺ is some function Φ(T1, T2, W) of the operating temperatures T1, T2 and the
work W, and further this function Φ is the same for all fluid bodies.
But (16) says

     Q⁺ = (T2/(T2 − T1)) W = Φ(T1, T2, W)

for an ideal gas. Hence (19) implies we have the same formula for any homogeneous fluid
body. This is (17).
Remark. See Owen [O], Truesdell [TR, Appendix 1A], Bharatha-Truesdell [B-T] for a more
coherent discussion.
e. Existence of S
We next exploit (17) to build an entropy function S for our model homogeneous fluid
body:
Theorem 2  For our homogeneous fluid body, there exists a C^2 function S : Σ → R such
that

(20)  ∂S/∂V = Λ_V /T,   ∂S/∂T = C_V /T.
Proof. 1. Fix a point (T*, V*) in Σ, and consider a Carnot heat engine as drawn:

[Figure: Carnot cycle through (T*, V*), bounded by the adiabats V = V1(T ), V = V2(T )
(with corner volumes V1, V2, V3, V4) and the isotherms T = T1, T = T2.]
Now

(21)  Q⁺ = ∫_{V1}^{V2} Λ_V (V, T2) dV.

Furthermore, by Green's theorem,

     W = ∮_Γ P dV = ∫_{T1}^{T2} ∫_{V1(T )}^{V2(T )} (∂P/∂T ) dV dT.

So (17), in the form Q⁺ = (T2/(T2 − T1)) W, says

     ∫_{V1}^{V2} Λ_V (V, T2) dV = (T2/(T2 − T1)) ∫_{T1}^{T2} ∫_{V1(T )}^{V2(T )} (∂P/∂T ) dV dT.

Let T1 → T2 =: T :

     ∫_{V1}^{V2} Λ_V (V, T ) dV = T ∫_{V1}^{V2} (∂P/∂T )(V, T ) dV.

Since V1 < V2 are arbitrary, we deduce

(22)  Λ_V = T ∂P/∂T    (Clapeyron's formula)

at the point (T*, V*). Since this point was arbitrary, the identity (22) is valid everywhere in
Σ.
2. Recall from (12) that

(23)  ∂E/∂V = Λ_V − P,   ∂E/∂T = C_V .

Consequently:

     ∂/∂V (C_V /T ) = (1/T ) ∂C_V /∂V = (1/T ) ∂²E/∂V ∂T
                    = (1/T )( ∂Λ_V /∂T − ∂P/∂T )   by (23)
                    = (1/T ) ∂Λ_V /∂T − Λ_V /T²    by (22)
                    = ∂/∂T (Λ_V /T ).

Thus the 1-form

     (C_V /T ) dT + (Λ_V /T ) dV

is exact: there exists a C^2 function S such that

(24)  dS = (C_V /T ) dT + (Λ_V /T ) dV.

This is (20). □
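Exactness of the 1-form (24) can be tested numerically by integrating dS along two different paths joining the same endpoints; the results must agree. The sketch below (my construction; van der Waals data with illustrative constants, and Λ_V = RT/(V − b) from Clapeyron's formula (22)) does this with a midpoint rule.

```python
import math

R, a, b, CV = 1.0, 0.5, 0.1, 1.5        # illustrative van der Waals constants

def lamV(T, V):
    # Clapeyron (22): Lambda_V = T dP/dT = R T / (V - b)
    return R * T / (V - b)

def dS(T, V, dT, dV):
    # the exact 1-form (24): dS = (CV/T) dT + (Lambda_V/T) dV
    return (CV / T) * dT + (lamV(T, V) / T) * dV

def integrate(path, n=20000):
    # path: t in [0,1] -> (T, V); accumulate dS with a midpoint rule
    s, dt = 0.0, 1.0 / n
    for k in range(n):
        T0, V0 = path(k * dt)
        T1, V1 = path((k + 1) * dt)
        s += dS(0.5 * (T0 + T1), 0.5 * (V0 + V1), T1 - T0, V1 - V0)
    return s

# two different paths from (T, V) = (1, 1) to (2, 3)
straight = lambda t: (1 + t, 1 + 2 * t)
bent     = lambda t: (1 + t ** 2, 1 + 2 * t ** 3)

assert abs(integrate(straight) - integrate(bent)) < 1e-4
print(integrate(straight))
```

Both integrals equal ΔS = C_V log 2 + R log(2.9/0.9), as predicted by the closed-form entropy S = C_V log T + R log(V − b) + S0.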
Notation. From (24) it follows that

(25)  dS = đQ/T ;

that is, 1/T is an integrating factor for the heating 1-form đQ = C_V dT + Λ_V dV, which is
itself not exact.

Next we regard S as a function of (E, V ). Then

(27)  (∂S/∂V)_E = (∂S/∂T)_V (∂T/∂V)_E + (∂S/∂V)_T
               = (C_V /T )(∂T/∂V)_E + Λ_V /T    by (20).

Since ∂E/∂T = C_V and ∂E/∂V = Λ_V − P ,

(28)  (∂T/∂V)_E = −(∂E/∂V)/(∂E/∂T ) = (P − Λ_V )/C_V .

Consequently (27) says

(29)  (∂S/∂V)_E = (P − Λ_V )/T + Λ_V /T = P/T.

Since also (∂S/∂E)_V = (∂S/∂T)_V (∂T/∂E)_V = (C_V /T )(1/C_V ) = 1/T , we have in summary:

(30)  (∂S/∂E)_V = 1/T,   (∂S/∂V)_E = P/T.
We next check that S is a concave function of (E, V ). For proving this, we deduce first
from (29), (30) that for S = S(E, V ):

(31)  ∂²S/∂E² = −(1/T²)(∂T/∂E)_V = −1/(C_V T²) < 0.

Also

     ∂²S/∂V² = (1/T )(∂P/∂V)_E − (P/T²)(∂T/∂V)_E .

Now

     (∂P/∂V)_E = (∂P/∂V)_T + (∂P/∂T)_V (∂T/∂V)_E ,

and, by (28) and Clapeyron's formula (22),

     (∂T/∂V)_E = (P − Λ_V )/C_V ,   (∂P/∂T)_V = Λ_V /T.

Thus

(32)  ∂²S/∂V² = (1/T )(∂P/∂V)_T − (1/(C_V T²))(P − Λ_V )² < 0,

since ∂P/∂V < 0. Furthermore

     ∂²S/∂E∂V = ∂/∂V (1/T )_E = −(1/T²)(∂T/∂V)_E = (Λ_V − P )/(T² C_V ).

Consequently

(33)  (∂²S/∂E²)(∂²S/∂V²) − (∂²S/∂E∂V )² = −(1/(C_V T³))(∂P/∂V)_T > 0,

and so S is a concave function of (E, V ).
3. Efficiency of cycles

Let Γ be a cycle in Σ, with rate of heating q(t), a ≤ t ≤ b. Write

(34)  q⁺(t) = q(t) if q(t) ≥ 0, 0 if q(t) ≤ 0;
      q⁻(t) = 0 if q(t) ≥ 0, −q(t) if q(t) ≤ 0,

so that q = q⁺ − q⁻, and set

     Q⁺(Γ) = ∫_a^b q⁺ dt = heat gained along Γ,
     Q⁻(Γ) = ∫_a^b q⁻ dt = heat emitted along Γ.

Definition. The efficiency of the cycle Γ is

(35)  η := W(Γ)/Q⁺(Γ),

provided Q⁺(Γ) > 0. For a Carnot heat engine operating between the temperatures T1 < T2,
we have η = 1 − T1/T2, according to (16), (17). Assume now that Γ is any cycle operating
between the temperatures T1 ≤ T ≤ T2.
Theorem 3  Let Γ be a cycle as above, and let η denote its efficiency. Then

(37)  η ≤ 1 − T1/T2.
Proof. Since Γ is a cycle, (24) gives

(38)  0 = ∮_Γ dS = ∮_Γ đQ/T = ∫_a^b (q/T ) dt.

Then

(39)  0 = ∫_a^b (q⁺/T − q⁻/T ) dt
        ≥ (1/T2) ∫_a^b q⁺ dt − (1/T1) ∫_a^b q⁻ dt
        = Q⁺(Γ)/T2 − Q⁻(Γ)/T1,

since q⁺ ≥ 0, q⁻ ≥ 0, T1 ≤ T ≤ T2 on Γ.
Consequently:

     η = W(Γ)/Q⁺(Γ) = 1 − Q⁻(Γ)/Q⁺(Γ) ≤ 1 − T1/T2. □
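Theorem 3 can be probed numerically on a cycle that is not a Carnot cycle (a discretized sketch of mine, for the simple ideal gas with illustrative constants): integrate the heating q = C_V Ṫ + Λ_V V̇ around a smooth loop, split it into Q⁺ and Q⁻, and compare the efficiency with 1 − T1/T2.

```python
import math

R, CV = 1.0, 1.5                  # simple ideal gas, Lambda_V = R T / V

n = 200000
dt = 2 * math.pi / n
Qp = Qm = 0.0
T1, T2 = float("inf"), 0.0
for k in range(n):
    t = k * dt
    T = 2.0 + 0.5 * math.sin(t)          # a smooth loop in the (T, V) plane
    V = 1.0 - 0.3 * math.cos(t)          # oriented so the loop does positive work
    Tdot = 0.5 * math.cos(t)
    Vdot = 0.3 * math.sin(t)
    q = (CV * Tdot + (R * T / V) * Vdot) * dt   # heating increment
    if q > 0:
        Qp += q
    else:
        Qm -= q
    T1, T2 = min(T1, T), max(T2, T)

W = Qp - Qm                        # First Law for a cycle
eta = W / Qp
assert eta <= 1 - T1 / T2 + 1e-6   # Theorem 3: efficiency bound (37)
print(eta, 1 - T1 / T2)
```

Here T1 = 1.5 and T2 = 2.5, so the Carnot bound is 0.4; the smooth loop achieves a strictly smaller efficiency, as the theorem predicts.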
We will later (in §B) take the efficiency estimate (37) as a starting point for more general
theory.
4. Adding dissipation, Clausius inequality
We propose next, following Owen [O] and Serrin [S1], to modify our model to include
irreversible, dissipative effects.
Notation. Remember that we represent a parametric curve Γ by writing

(40)  (T (t), V (t))   for a ≤ t ≤ b,

where a < b and V, T : [a, b] → R are C^1. We will henceforth call Γ a process to emphasize
that the following constructions depend on the parameterization. We call Γ a cyclic process
if (T (a), V (a)) = (T (b), V (b)).
A model for a homogeneous fluid body (with dissipation). Assume we are given:
(a) a convex open subset Σ ⊂ (0, ∞) × (0, ∞) (Σ is the state space),
and
(b) two C^2 functions W, Q defined on Σ × R².

Notation. We will write

     W = W(T, V, A, B),   Q = Q(T, V, A, B)

for (T, V ) ∈ Σ, (A, B) ∈ R²; along a process, A and B will be the time derivatives Ṫ , V̇ .
We assume W, Q have the form

(41)  W(T, V, A, B) = P (T, V )B − R1(T, V, A, B),
      Q(T, V, A, B) = C_V (T, V )A + Λ_V (T, V )B − R2(T, V, A, B),

where the remainder terms R1, R2 satisfy

(42)  |R1(T, V, A, B)|, |R2(T, V, A, B)| ≤ C(A² + B²)

for some constant C and all (T, V ), (A, B) as above. Lastly we suppose that P, C_V , Λ_V
satisfy:

(43)  ∂P/∂V < 0,  Λ_V ≠ 0,  C_V > 0  in Σ.
Given a process Γ as above, we define the work done by the fluid along Γ to be

(44)  W(Γ) = ∫_Γ đW := ∫_a^b W(T (t), V (t), Ṫ (t), V̇ (t)) dt,

and the heat gained by the fluid along Γ to be

(45)  Q(Γ) = ∫_Γ đQ := ∫_a^b Q(T (t), V (t), Ṫ (t), V̇ (t)) dt.

Notation. We call

(46)  w(t) = W(T (t), V (t), Ṫ (t), V̇ (t))

the rate of working and

(47)  q(t) = Q(T (t), V (t), Ṫ (t), V̇ (t))   (a ≤ t ≤ b)

the rate of heating at time t. In this model the expressions ∫_Γ đW and ∫_Γ đQ are defined
by (44), (45), but "đW" and "đQ" themselves are not defined.
Remark. Note very carefully: W(Γ), Q(Γ) depend not only on the path described by Γ but
also on the parameterization.
Our intention is to build energy and entropy functions for our new model fluid with
dissipation. As before we start with a form of the First Law: For each cyclic process Γ of
our homogeneous fluid body with dissipation, we assume

(48)  W(Γ) = Q(Γ).

This is again conservation of energy, now hypothesized for every cyclic process, i.e. for
each cycle as a path and each parameterization.
Theorem 4  There exists a C^2 function E : Σ → R such that

(49)  ∂E/∂T = C_V ,   ∂E/∂V = Λ_V − P.
Proof. 1. For a cyclic process Γ, the First Law (48) and the decomposition (41) give

(50)  ∫_a^b C_V Ṫ + (Λ_V − P )V̇ dt = ∫_a^b R2(T, V, Ṫ , V̇ ) − R1(T, V, Ṫ , V̇ ) dt.

This identity must hold for each parameterization. So fix ε > 0, take a = 0 for simplicity,
and set

     (T_ε(t), V_ε(t)) := (T (εt), V (εt))   (0 ≤ t ≤ b/ε).

Then

(51)  ∫_0^{b/ε} C_V Ṫ_ε + (Λ_V − P )V̇_ε dt = ∫_0^{b/ε} R2(T_ε, V_ε, Ṫ_ε, V̇_ε) − R1(T_ε, V_ε, Ṫ_ε, V̇_ε) dt.

Since Ṫ_ε(t) = εṪ (εt), V̇_ε(t) = εV̇ (εt), we can rewrite the left hand side of (51) as

     ∫_a^b C_V Ṫ + (Λ_V − P )V̇ dt = ∮_Γ C_V dT + (Λ_V − P ) dV,

where we use 1-form notation to emphasize that this term does not depend on the parameterization. However (42) implies

     | ∫_0^{b/ε} R2(T_ε, V_ε, Ṫ_ε, V̇_ε) − R1(T_ε, V_ε, Ṫ_ε, V̇_ε) dt |
        ≤ C (b/ε) ε² max_{0≤t≤b} [ (Ṫ )² + (V̇ )² ] ≤ εC.

Sending ε → 0, we conclude

(52)  ∮_Γ C_V dT + (Λ_V − P ) dV = 0

for each cycle in Σ, and the existence of E follows as in the proof of Theorem 4's ideal
counterpart above.

2. We assert further that

     R1(T, V, A, B) = R2(T, V, A, B)

for all (T, V ) ∈ Σ, (A, B) ∈ R². To see this, consider any process Γ parameterized by
(T (t), V (t)), where a ≤ t ≤ 0. Extend Γ to a cyclic process parameterized by (T (t), V (t))
for a ≤ t ≤ 1. Then by the proof above:

(53)  ∫_a^1 R1(T, V, Ṫ , V̇ ) dt = ∫_a^1 R2(T, V, Ṫ , V̇ ) dt.

Reparameterize by writing

     T_ε(t) = T (t) for a ≤ t ≤ 0,  T_ε(t) = T (εt) for 0 ≤ t ≤ 1/ε;
     V_ε(t) = V (t) for a ≤ t ≤ 0,  V_ε(t) = V (εt) for 0 ≤ t ≤ 1/ε.

Formula (53) holds for any parameterization and so is valid with (T_ε, V_ε, Ṫ_ε, V̇_ε) replacing
(T, V, Ṫ , V̇ ). Making this change and letting ε → 0, we deduce

     ∫_a^0 R1(T, V, Ṫ , V̇ ) dt = ∫_a^0 R2(T, V, Ṫ , V̇ ) dt.

Since the process on a ≤ t ≤ 0 is arbitrary, the pointwise equality follows. □
The foregoing proofs demonstrate that our model with dissipation in some sense approximates an ideal model without dissipation in the limit as we consider our processes
on slower and slower time scales. The model without dissipation corresponds simply to
setting R1 ≡ 0, R2 ≡ 0.

To further develop our model we next assume as an instance of the Second Law that

     W > 0

and

(54)  W = (1 − T1/T2) Q⁺

for any Carnot heat engine of the ideal model without dissipation acting between temperatures T2 > T1. This assertion as before implies there exists a C^2 function S : Σ → R
with

(55)  ∂S/∂T = C_V /T,   ∂S/∂V = Λ_V /T.

Finally we assume

(56)  R(T, V, A, B) := R1(T, V, A, B) = R2(T, V, A, B) ≥ 0

for all (T, V ) ∈ Σ, (A, B) ∈ R². Indeed we can interpret for any process Γ the term

     ∫_a^b R1(T, V, Ṫ , V̇ ) dt

as representing the lost amount of work performed by the fluid along Γ owing to velocity
dependent internal friction (= dissipation).
Combining (55), (56), we deduce for any cyclic process Γ that

(57)  ∫_a^b (Q(T, V, Ṫ , V̇ )/T ) dt = ∫_a^b (d/dt) S(T, V ) dt − ∫_a^b (R2(T, V, Ṫ , V̇ )/T ) dt
        = −∫_a^b (R2(T, V, Ṫ , V̇ )/T ) dt
        ≤ 0.

We employ earlier notation to rewrite:

(58)  ∮_Γ đQ/T ≤ 0

for each cyclic process Γ. This is a form of Clausius' inequality.
In particular, if Γ is a cyclic process operating between the temperatures T1 ≤ T ≤ T2,
the proof of Theorem 3 applies with (58) in place of (38): now

     0 ≥ ∮_Γ đQ/T = ∫_a^b (q/T ) dt,

and so the efficiency estimate η ≤ 1 − T1/T2 continues to hold.
B. Some general theories

1. Entropy and efficiency

a. Definitions

We are given functions

     T : Σ → (0, ∞),   Λ : Σ → R^n,

defined on a state space Σ ⊂ R^n. T is the temperature and the components of Λ are the
generalized latent heats:

     Λ = (Λ1, . . . , Λn).

Notation.
(a) A path Γ is an oriented, continuous, piecewise C^1 curve in Σ. A path is a cycle if its
starting and end points coincide.
(b) If Γ is a path, its reversal Γ̃ is the path taken with opposite orientation.
(c) If Γ1 and Γ2 are two paths such that the endpoint of Γ1 is the starting point of Γ2,
we write

     Γ2 ∗ Γ1

to denote the path consisting of Γ1, followed by Γ2.

[Figure: a path Γ, parameterized by Y (t), in the state space Σ ⊂ R^n.]
Definitions. Let Γ be a path in Σ, parameterized by Y (t), a ≤ t ≤ b.

(i)  ∫_Γ đQ := ∫_Γ Λ · dY = ∫_a^b Λ(Y (t)) · Ẏ (t) dt = heat absorbed along Γ.

(ii)  q(t) = Λ(Y (t)) · Ẏ (t) = heating at time t;

     q⁺(t) = q(t) if q(t) ≥ 0, 0 if q(t) ≤ 0;
     q⁻(t) = 0 if q(t) ≥ 0, −q(t) if q(t) ≤ 0.

(iii) If Γ is a path,

     Q⁺(Γ) = ∫_a^b q⁺ dt = heat gained along Γ,
     Q⁻(Γ) = ∫_a^b q⁻ dt = heat emitted along Γ.

If Γ is a cycle,

     W(Γ) = Q(Γ) = Q⁺(Γ) − Q⁻(Γ) = work performed along Γ.

(iv)  t⁺(Γ) = {t ∈ [a, b] | Ẏ (t) exists, q(t) > 0} = times at which heat is gained;
      t⁻(Γ) = {t ∈ [a, b] | Ẏ (t) exists, q(t) < 0} = times at which heat is emitted.

(v)  T⁺(Γ) = sup{ T (Y (t)) | t ∈ t⁺(Γ) ∪ t⁻(Γ) }
           = greatest temperature at which heat is gained or emitted along Γ;
     T⁻(Γ) = inf{ T (Y (t)) | t ∈ t⁺(Γ) ∪ t⁻(Γ) }
           = least temperature at which heat is gained or emitted along Γ.
Remark. For the reversal Γ̃ we have

     Q⁺(Γ̃) = Q⁻(Γ),   Q⁻(Γ̃) = Q⁺(Γ),   Q(Γ̃) = −Q(Γ),
     T⁺(Γ̃) = T⁺(Γ),   T⁻(Γ̃) = T⁻(Γ).
Abbreviations. We introduce these classes of paths:

     A = adiabatic paths, i.e. paths with q ≡ 0;
     M⁺ = paths along which heat is only gained (q ≥ 0);
     M⁻ = paths along which heat is only emitted (q ≤ 0);
     M = M⁺ ∪ M⁻ = monotone paths;
     I = essentially isothermal paths, for which T (Y (t)) is constant on t⁺(Γ) ∪ t⁻(Γ);
     C = Carnot paths, operating between two temperatures T1 < T2, with

     T (Y (t)) = T1 if t ∈ t⁻(Γ),   T (Y (t)) = T2 if t ∈ t⁺(Γ).

Notation. If α, β ∈ Σ,

     P (α, β) = collection of all paths from α to β in Σ,
     A(α, β) = collection of all adiabatic paths from α to β.

M⁺(α, β), M⁻(α, β), I(α, β), C(α, β) are similarly defined.
b. Existence of S

Main Hypothesis. We assume that for each pair of states α, β ∈ Σ and for each temperature level θ attained in Σ, there exists a monotone, essentially isothermal path Π ∈ M (α, β)
such that

(1)  T (Y (t)) = θ  on t⁺(Π) ∪ t⁻(Π).

[Figure: two states α, β joined by a path built from adiabats and an isothermal segment
at temperature level θ, in Σ ⊂ R^n.]
Theorem. Assume

(2)  W(Γ) ≤ (1 − T⁻(Γ)/T⁺(Γ)) Q⁺(Γ)   for each cycle Γ in Σ,

and

(3)  W(Γ) = (1 − T⁻(Γ)/T⁺(Γ)) Q⁺(Γ)   for each Carnot cycle Γ in Σ.

Then there exists a C^2 function S : Σ → R such that

(4)  ∂S/∂Yk = Λk /T   in Σ  (k = 1, 2, . . . , n).

This result says that the efficiency estimates (2), (3) (special cases of which we have
earlier encountered) in fact imply the existence of the entropy S. Conversely it is easy to
check using the arguments in §A.3 that (4) implies (2), (3).
Proof. 1. Define the crude entropy change

(5)  Φ(Γ) := 0 if Γ is adiabatic;
     Φ(Γ) := Q⁺(Γ)/T⁺(Γ) − Q⁻(Γ)/T⁻(Γ)   if not.
2. Claim # 1: For each cycle Γ,

(6)  Φ(Γ) ≤ 0,  with equality if Γ is a Carnot cycle.

Indeed, if Γ is not adiabatic, the First Law W(Γ) = Q⁺(Γ) − Q⁻(Γ) and the estimate (2)
give Q⁺(Γ) − Q⁻(Γ) ≤ (1 − T⁻(Γ)/T⁺(Γ)) Q⁺(Γ); rearranging, Φ(Γ) ≤ 0. Equality for
Carnot cycles follows likewise from (3).

3. Claim # 2: If Π1, Π2 ∈ M⁺(α, β) ∩ I(α, β), then

(8)  Φ(Π1) = Φ(Π2),

and similarly if Π1, Π2 ∈ M⁻(α, β) ∩ I(α, β).

Suppose first that Π1 is not adiabatic, while Π2 ∈ A(α, β) ∩ M (α, β). Then
Q⁻(Π2) = Q⁻(Π1) = 0, and so for the cycle Π̃2 ∗ Π1 we have
Q⁻(Π̃2 ∗ Π1) = Q⁺(Π2) + Q⁻(Π1) = 0, but Π̃2 ∗ Π1 is not adiabatic. Thus

     Φ(Π̃2 ∗ Π1) = Q⁺(Π̃2 ∗ Π1)/T⁺(Π̃2 ∗ Π1)
                 = (Q⁻(Π2) + Q⁺(Π1))/T⁺(Π̃2 ∗ Π1)
                 = Q⁺(Π1)/T⁺(Π̃2 ∗ Π1) > 0,

contradicting (6). So either both of Π1, Π2 are adiabatic, in which case (8) is immediate,
or neither is. In the latter case Γ := Π̃2 ∗ Π1 is a Carnot cycle, with

     T⁺(Γ) = T⁺(Π1),  T⁻(Γ) = T⁺(Π2),
     Q⁺(Γ) = Q⁺(Π1),  Q⁻(Γ) = Q⁺(Π2).

Thus (6) implies

     0 = Φ(Γ) = Q⁺(Γ)/T⁺(Γ) − Q⁻(Γ)/T⁻(Γ)
              = Q⁺(Π1)/T⁺(Π1) − Q⁺(Π2)/T⁺(Π2)
              = Φ(Π1) − Φ(Π2).

This is (8), and a similar proof is valid if Π1, Π2 ∈ M⁻(α, β) ∩ I(α, β).
4. Claim # 3: If Γ ∈ P (α, β) and Π ∈ M (α, β), then

(9)  Φ(Γ) ≤ Φ(Π).

To prove (9), recall the possibilities listed in (7). If M (α, β) = A(α, β), then Π is adiabatic
and Φ(Π) = 0. Also Π̃ ∗ Γ is a cycle, and Φ(Π̃ ∗ Γ) = Φ(Γ). Thus (6) implies

     Φ(Γ) = Φ(Π̃ ∗ Γ) ≤ 0 = Φ(Π).

This is (9).

If M (α, β) = M⁺(α, β), then A(α, β) = M⁻(α, β) = ∅ and so Π is not adiabatic. For the
cycle Π̃ ∗ Γ,

     Q⁺(Π̃ ∗ Γ) = Q⁺(Γ),   Q⁻(Π̃ ∗ Γ) = Q⁻(Γ) + Q⁺(Π).

Thus (6) says

     0 ≥ Φ(Π̃ ∗ Γ) = Q⁺(Π̃ ∗ Γ)/T⁺(Π̃ ∗ Γ) − Q⁻(Π̃ ∗ Γ)/T⁻(Π̃ ∗ Γ)
       ≥ Q⁺(Γ)/T⁺(Γ) − Q⁻(Γ)/T⁻(Γ) − Q⁺(Π)/T⁺(Π)
       = Φ(Γ) − Φ(Π),

the middle inequality using the temperature bounds built into the definitions of T±, and
the last equality using Φ(Π) = Q⁺(Π)/T⁺(Π), valid since Q⁻(Π) = 0 (Claim # 2). Thus
Φ(Γ) ≤ Φ(Π). The proof if M (α, β) = M⁻(α, β)
is similar.
5. Claim # 4: If Γ ∈ P(α, β) and Δ ∈ I(α, β), then
(10) Φ(Γ) ≤ Φ(Δ).
To verify (10), let θ denote the temperature level of Δ and use the Main Hypothesis to select an essentially isothermal Ξ ∈ M(α, β) with T = θ on t⁺(Ξ) ∪ t⁻(Ξ). Since Δ ∈ P(α, β),
Φ(Δ) ≤ Φ(Ξ)
according to Claim # 3. On the other hand Δ̌Ξ is a Carnot cycle, so (6) holds with equality:
0 = Φ(Δ̌Ξ) = (Q(Ξ) − Q(Δ))/θ = Φ(Ξ) − Φ(Δ).
Thus Φ(Ξ) = Φ(Δ). Owing to Claim # 3, then,
Φ(Γ) ≤ Φ(Ξ) = Φ(Δ).
This is (10).
6. Claim # 5: There is a function Ψ : Σ → ℝ such that
(11) Φ(Γ) ≤ Ψ(β) − Ψ(α)
for all α, β ∈ Σ, Γ ∈ P(α, β).
To prove this, note first that Claim # 4 implies
Φ(Δ₁) = Φ(Δ₂) if Δ₁, Δ₂ ∈ I(α, β).
Thus we can define
Φ(α, β) := Φ(Δ) (Δ ∈ I(α, β)),
so that Claim # 4 reads
Φ(Γ) ≤ Φ(α, β) (Γ ∈ P(α, β)).
We must therefore show
(12) Φ(α, β) = Ψ(β) − Ψ(α)
for all α, β.
For this fix a reference state γ ∈ Σ and a temperature level θ. Owing to the Main Hypothesis, there
exist Γ₁ ∈ M(γ, α), Γ₂ ∈ M(α, β) such that
T(Y(t)) = θ on t⁺(Γ₁) ∪ t⁻(Γ₁),
T(Y(t)) = θ on t⁺(Γ₂) ∪ t⁻(Γ₂).
Then Γ₂Γ₁ ∈ I(γ, β) and
Φ(Γ₂Γ₁) = Q⁺(Γ₂Γ₁)/T⁺(Γ₂Γ₁) − Q⁻(Γ₂Γ₁)/T⁻(Γ₂Γ₁)
= (Q⁺(Γ₂) + Q⁺(Γ₁))/θ − (Q⁻(Γ₂) + Q⁻(Γ₁))/θ
= Φ(Γ₂) + Φ(Γ₁).
Hence
(13) Φ(γ, β) = Φ(γ, α) + Φ(α, β).
Define
Ψ(α) := Φ(γ, α)
to deduce (12) from (13).
7. Finally we build the entropy function S. Take any cycle Γ in Σ, parameterized by
{Y(t) | 0 ≤ t ≤ 1}.
Fix ε > 0. Then take N so large that
|1/T(Y(t₁)) − 1/T(Y(t₂))| ≤ ε if t₁, t₂ ∈ [(k−1)/N, k/N] (k = 1, . . . , N). Thus we have
(14) 1/T⁻(Γₖ) − 1/T⁺(Γₖ) ≤ ε (k = 1, . . . , N),
where Γₖ is parameterized by {Y(t) | (k−1)/N ≤ t ≤ k/N}. We here and afterwards assume each
Γₖ is not adiabatic, as the relevant estimates are trivial for any adiabatic Γₖ. Thus
∫_Γ đQ/T := ∫₀¹ (1/T(Y(t))) ω(Y(t))·Ẏ(t) dt
= Σₖ₌₁ᴺ ∫_{(k−1)/N}^{k/N} q(t)/T(Y(t)) dt
≤ Σₖ₌₁ᴺ ( Q⁺(Γₖ)(1/T⁺(Γₖ) + ε) − Q⁻(Γₖ)(1/T⁻(Γₖ) − ε) )
= Σₖ₌₁ᴺ Φ(Γₖ) + ε Σₖ₌₁ᴺ (Q⁺(Γₖ) + Q⁻(Γₖ))
≤ Σₖ₌₁ᴺ Φ(Γₖ) + ε ∫₀¹ |q(t)| dt
≤ Σₖ₌₁ᴺ Φ(Γₖ) + Cε.
Now
Φ(Γₖ) ≤ Ψ(αₖ₊₁) − Ψ(αₖ)
by Claim # 5, with
αₖ = Y((k−1)/N), αₖ₊₁ = Y(k/N).
Since Γ is a cycle, α_{N+1} = α₁, and consequently
Σₖ₌₁ᴺ Φ(Γₖ) ≤ 0.
Therefore
∫_Γ đQ/T = ∫₀¹ q(t)/T(Y(t)) dt ≤ Cε,
and letting ε → 0 gives ∫_Γ đQ/T ≤ 0.
Applying the same reasoning to Γ̌ in place of Γ, we conclude
∫_Γ đQ/T = 0 for each cycle Γ.
As Σ is simply connected, this fact implies the existence of S : Σ → ℝ with
DS = ω/T. □
(iv) We endow M(Σ) with the weak* topology, which is the weakest topology for which
the mappings
μ ↦ ∫_Σ φ dμ (φ ∈ C(Σ))
are continuous.
Physical interpretation. The abstract apparatus above is meant to model the following
situation.
(a) Suppose we have a physical body U made of various material points. At any given
moment of time, we assign to each point x ∈ U a state σ(x) ∈ Σ.

[Figure: the physical body U and the state map σ; σ⁻¹(E) ⊂ U is the set of material points whose state lies in E ⊆ Σ.]

The condition of the body at this fixed time is then defined to be the measure μ ∈ M⁺(Σ)
such that
μ(E) = mass of σ⁻¹(E)
for each Borel subset E ⊆ Σ.
(b) We now imagine a process which somehow acts on U, thereby changing the initial
condition μᵢ to a final condition μ_f. The change of condition is
ν = μ_f − μᵢ.
Observe
ν(Σ) = μ_f(Σ) − μᵢ(Σ)
= mass of U − mass of U = 0.
b. Second Law
We next assume that our abstract system satisfies a form of the Second Law, which for
us means
(17) C ∩ M⁺(Σ) = {0}.
(19)
such that
(20)
[Figure: the hyperplane {Φ = 0} separating the cones K₁ and K₂, which lie within the half-spaces {Φ < 0} and {Φ > 0}.]
(21)
for all (, ) X.
Proof. 1. Suppose X = M(Σ) and
Φ : X → ℝ
is continuous, linear. Fix any point σ ∈ Σ and let δ_σ denote the point mass at σ. Next
define
φ(σ) := Φ(δ_σ) (σ ∈ Σ).
2. We first claim
φ ∈ C(Σ).
To prove this, let σₖ → σ in Σ. Then for every ψ ∈ C(Σ),
∫_Σ ψ dδ_σₖ = ψ(σₖ) → ψ(σ) = ∫_Σ ψ dδ_σ,
and so
δ_σₖ → δ_σ weakly as measures.
This means
δ_σₖ → δ_σ in M(Σ).
As Φ is continuous,
φ(σₖ) = Φ(δ_σₖ) → Φ(δ_σ) = φ(σ).
Thus φ is continuous.
3. Since Φ is linear,
(24) Φ( Σₖ₌₁ᵐ aₖ δ_σₖ ) = Σₖ₌₁ᵐ aₖ φ(σₖ)
for all {σₖ}ₖ₌₁ᵐ ⊂ Σ, {aₖ}ₖ₌₁ᵐ ⊂ ℝ. Finally take any measure μ ∈ M(Σ). We can find
combinations of point masses
μₘ = Σₖ aₖᵐ δ_{σₖᵐ} (m = 1, . . . )
such that
(25) μₘ → μ in M(Σ).
Then (24) implies
Σₖ aₖᵐ φ(σₖᵐ) = Φ(μₘ) → Φ(μ) as m → ∞.
Thus
Φ(μ) = ∫_Σ φ dμ.
This proves the representation formula (22), and the proof of (23) is then immediate.
d. Existence of S, T
Theorem. Assume that our abstract thermodynamic system satisfies (17). Then there exist
continuous functions
S : Σ → ℝ,
T : Σ → (0, ∞)
such that
(26) ∫_Σ S dν ≥ ∫_Σ (1/T) dq
for each process α = (ν, q).
Physical interpretation. We can repackage inequality (26) into more familiar form by
first of all defining
(29) ∫ đQ/T := ∫_Σ (1/T) dq
for each process α = (ν, q). Then (26) says
(30) ∫ đQ/T ≤ 0 if α is a cyclic process.
Fix now any two states α, β. Then if there exists a process α* = (δ_β − δ_α, q) ∈ P, (26)
and (29) read
(31) ∫ đQ/T ≤ S(β) − S(α) if α* is a process from α to β.
Clearly (30) is a kind of Clausius inequality for our abstract thermodynamic system.
Finally let us say a process α* = (δ_β − δ_α, q) ∈ P is reversible if α** := (δ_α − δ_β, −q) ∈ P.
Then
∫ đQ/T = S(β) − S(α) if α* is a reversible process from α to β.
Remark. See Serrin [S2], [S3], Coleman–Owen–Serrin [C-O-S], Owen [O] for another general
approach based upon a different, precise mathematical interpretation of the Second Law.
Note also the very interesting forthcoming book by Man and Serrin [M-S].
x=(X,t)
X
U(t)
U=U(0)
x = (X, T )
X = (x, t).
63
Then
(3)
v(x, t) =
(X, t)
t
dm = dx.
|v|2
dm;
2
edm;
E(t) =
U (t)
S(t) =
sdm.
U (t)
3. Kinematic formulas
Take V to be any smooth subregion of U and write V (t) = (V, t), (t 0).
If f = f (x, t) (x R3 , t 0) is smooth, we compute:
d
f
dx +
f dx =
f v dS,
dt
V (t)
V (t) t
V (t)
where dS denotes 2-dimensional surface measure on V (t), v is the velocity eld and is
the unit outer normal vector eld to V (t). Applying the GaussGreen Theorem we deduce:
f
d
f dx =
(5)
+ div(f v)dx.
dt
V (t)
V (t) t
Take f = above. Then
V (t)
d
+ div(v)dx =
t
dt
dx
= 0,
V (t)
as the total mass within the regions V (t) moving with the deformation is unchanging in
time. Since the region V (t) is arbitrary, we deduce
(6)
+ div(v) = 0
t
conservation of mass.
65
This PDE holds in U (t) at each time t 0. Now again take f = f (x, t) and compute
d
d
f
dm
=
f
dx
by (4)
dt
dt
V (t)
V (t)
= V (t) (ft) + div(f v)dx by (5)
+ v Df dx by (6),
= V (t) f
t
where
Df = Dx f =
f
, f , f
x1 x2 x3
= gradient of f with
respect to the spatial
variables x = (x1 , x2 , x3 ).
(7)
f dm =
V (t)
V (t)
Df
dm,
Dt
where
Df
f
=
+ v Df
Dt
t
is the material derivative. Observe that
d
Df
(X, t), t) =
f (
.
dt
Dt
4. Deformation gradient
We dene as well the deformation gradient
(8)
F (X, t) = DX (X, T )
1
1
. . . X
X1
3
...
=
3
X1
3
X3
L(X, t) =
F
(X, t)F 1 (X, t)
t
for X U , t 0.
66
vxi j
(X, t)
t
Thus
(1 i, j 3).
As F 1 = Dx , we see that
Dv(x, t) =
F
(X, t)F 1 (X, t).
t
Thus
(10)
L(X, t) = Dv(x, t)
(11)
d
dt
(1)
vdm
V (t)
T dS.
bdm +
V (t)
V (t)
This says the rate of change of linear momentum within the moving region V (t) equals the
body force acting on V (t) plus the contact force acting on V (t). We employ (6), (7) in A
and the GaussGreen Theorem to deduce from (1) that
(2)
Dv
= b + div T
Dt
balance of momentum.
We additionally suppose:
67
(3)
d
dt
V (t)
|v|2
+ edm
2
v b + rdm
=
V (t)
v T q dS
+
V (t)
This identity asserts that the rate of change of the total energy (= kinetic energy + internal
energy (including potential energy)) within the moving region V (t) equals the rate of work
performed by the body and contact forces, plus the body heat supplied minus the heat ux
outward through V (t).
Since T is symmetric, v T = (Tv) . Hence (3) and the GaussGreen Theorem imply:
D |v|2
+ e = (v b + r) + div(Tv q).
Dt
2
Simplify, using (2) to deduce:
(4)
De
= r div q + T : Dv
Dt
energy balance.
aij bij .
i,j=1
(5)
d
dt
sdm
V (t)
V (t)
r
dm
V (t)
q
dS.
This asserts that the rate of entropy increase within the moving region V (t) is greater than
or equal to the heat supply divided by temperature integrated over the body plus an entropy
ux term integrated over V (t). Observe that (5) is a kind of continuum version of the
various forms of Clausius inequality we have earlier encountered.
68
q
r
Ds
div
Dt
entropy inequality.
Notation. We dene the local production of entropy per unit mass to be:
q
Ds r 1
+
div
Dt
Ds r div(q) q D
(7)
=
+
Dt
0.
v=
U (t)
such that
e, , T, q
(a)
(b)
(2)
(c)
(d)
(x U (t), t 0)
where
e : R R R.
Equations (b)(d) have similar interpretations. Below we will sometimes omit writing the
and so regard e as a function of (s, v), etc.
+ T : Dv q D.
Dt
Dt
e Ds
e Dv
=
+
.
Dt
s Dt v Dt
(4)
(5)
Insert (4), (5) into (3):
(6)
e
0
s
Ds
e
1
D.
+ T
I : Dv q
Dt
v
The main point is that (6) must hold for all admissible thermodynamic processes, and here
is how we can build them. First we select any deformation as in A and any function s.
Then v =
, F = DX , = (det F )1 0 . Thus we have v = 1 and so can then employ
t
70
(2) to compute e, , T, q. Lastly the balance of momentum and energy equations ((2), (4)
from B) allow us to dene b, r.
We as follows exploit this freedom in choosing , s. First x any time t0 > 0. Then choose
as above so that (, t0 ) = (det F (, t0 ))1 0 is constant, and s so that s(, t0 ) s
is constant. Then v(, t0 ) is constant, whence (2)(b) implies D(, t0 ) = 0. Fix any point
x0 U (t0 ). We can choose , s as above, so that Dv = L = F
F 1 and Ds
at (x0 , t0 ) are
t
Dt
arbitrary. As (6) implies
Ds
e
+ T
I : Dv
0
s Dt
v
at (x0 , t0 ) for all choices of Dv,
dropping the circumex:
Ds
,
Dt
we conclude =
e
=
s
(7)
e
,
s
T =
e
I.
v
temperature formula
and
T = pI,
(8)
for
e
= p
v
(9)
pressure formula.
Equation (9) is the denition of the pressure p = p(s, v). Thus in (7)(9) we regard e, T, p
as functions of (s, v).
Remarks. (i) The balance of momentum equation (for b 0) becomes the compressible
Euler equations
(10)
Dv
= Dp,
Dt
about which more later. If the ow is incompressible, then is constant (say 1) and
thus (6) in A implies div v = 0. We obtain then the incompressible Euler equations
Dv
= Dp
Dt
(11)
div v = 0.
(ii) Note the similarity of (7), (9) with the classical formulas
E
E
= T,
= P
S
V
71
q(s, v, p) p 0
l, q
such that
e, , T,
(a)
(b)
(13)
(c)
(d)
T = T(x,
v) + l(s, v)[Dv]
(s, v, D).
q=q
Here for each (s, v), we assume l(s, v)[] is a linear mapping from M33 into S3 . This term
models the uid viscosity.
As before, we conclude from the ClausiusDuhem inequality that
e
Ds
e
1
0
+ T
I : Dv + l[Dv] : Dv q D.
s Dt
v
Since this estimate must be true for all and s, we deduce as before the temperature formula
(7).
We then pick , s appropriately to deduce:
0 T
I : Dv + l[Dv] : Dv.
v
In this expression Dv is arbitrary, and so in fact
I : Dv + 2 l[Dv] : Dv
0 T
v
72
(14)
follows, as does the inequality
l(L) : L 0
(15)
dissipation inequality
for all L M33 . The heat conduction inequality (12) then results.
Remark. As explained in [C-N] constitutive relations must also satisfy the principle of
material objectivity, which means that an admissible process must remain so after a (possibly
time dependent) orthogonal change of frame. This principle turns out to imply that l[] must
have the form
l[Dv] = (Dv + (Dv)T ) + (div v)I
(16)
where , are scalar functions of (s, v). The dissipative inequality (15) turns out to force
2
0, + 0.
3
Taking , to be constant in the balance of motion equations (for b 0) gives the compressible NavierStokes equations
Dv
= Dp + v + ( + )D(div v).
Dt
If the ow is incompressible, then is constant (say 1) and so the conservation of
momentum equation implies div v 0. We thereby derive the incompressible NavierStokes
equations
Dv
= Dp + v
Dt
(18)
div v = 0.
(17)
2. Elastic materials
We can also model as well elastic solids, for which it is appropriate to display certain
quantities in Lagrangian coordinates (i.e., in terms of X, not x).
a. We call our body a perfect elastic
T,
q
such that
four functions e, ,
(a)
(b)
(19)
(c)
(d)
(x U (t), t 0)
where e : M33 R R and X = (x, t). Equations (b)(d) have similar interpretations.
Ds De
0
Dt
Dt
(20)
1
+ T : Dv q D.
e Ds
e
F
.
=
+
:
+ DX F
Dt
s Dt F
t
Dt
(X, t), t) = X with respect to t, to deduce
Dierentiate the identity (
D
Dt
= 0. So
e Ds
e F
De
=
+
:
.
Dt
s Dt F t
F 1 , we substitute into (20), thereby
Recalling from (9), (10) in A that Dv = L = F
t
deducing
e
Ds
e
1
T
D.
0
(21)
+ T
F
: Dv q
s Dt
F
T =
e T
F
F
stress formula,
where, we recall from (11) in A, = (det F )1 0 . Here we have again dropped the circumex, as so are regarding T, e as functions of (s, F ). Next the heat conduction inequality (12)
follows as usual.
2
In coordinates:
e
F ij
e
Fij ,
(1 i, j 3).
74
Note. Let us check that (22) reduces to (8), (9) in the case of a uid, i.e. e = e(s, v),
v = 1 = (det F )1
0 . Then the (i, j)-th entry of T is
3
3
e
e
(det F )
Fjk =
Fjk .
F
ik
0 v
ik
k=1
k=1
Now
(det F )
= (cof F )ik ,
Fik
cof F denoting the cofactor matrix of F . Thus
e T
T = F
F
=
=
As = (det F )1 0 , we conclude T =
e
I,
v
e
(cof F )F T
v
e
(det F )I.
v
b. We have a viscous elastic material with heat conduction if these constitutive relations
hold:
(a) e = e(s, F )
(b) = (s,
F)
(23)
F ) + l(s, F )[Dv]
(c) T = T(s,
(s, F, D).
(d) q = q
We deduce as above the temperature formula (7), the heat conduction inequality (12), the
dissipation inequality (15), and the identity
T=
e T
F + l[Dv].
F
Remark. Our constitutive rules (2), (13), (19), (23) actually violate the general principle of
equipresence, according to which a quantity present in an independent variable in one constitutive equation should be present in all, unless its presence contradicts some rule of physics
or an invariance rule. See TruesdellNoll [T-N, p. 359363] for an analysis incorporating this
principle.
D. Workless dissipation
Our model from II.A of a homogeneous uid body without dissipation illustrated an
idealized physical situation in which work, but not dissipation, can occur. This is dissipationless work. We now provide a mathematical framework for workless dissipation (from
Gurtin [GU, Chapter 14]).
75
We adopt the terminology from AC above, with the additional proviso that our material
body is now assumed to be rigid. So U (t) = U (for all t 0) and
v 0, b 0.
(1)
(2)
(3)
s r div(q) q D
0.
+
.
t
t
f = e s,
f
q D
+s +
0.
t
t
For our model of heat conduction in a rigid body we introduce next the constitutive relations
(a) e = e(, D)
(b) s = s(, D)
(9)
(c) q = q
(, D)
76
are given functions. We seek to determine what (8) implies about these strucwhere e, s, q
tural functions.
First, dene
f(, p) := e(, p)
s(, p);
(10)
so that (7) says
f = f(, D).
Therefore
f
=
+ Dp f D
.
t
t
t
+ s
+ Dp f D
,
t
f
= s
+
D
q
0.
(13)
and so (10) implies
Dp e(, p) = Dp s(, p)
for all , p. But (12), (13) allow us to deduce
0=
(Dp f) = Dp s.
The energy and entropy thus depend only on and not D. Finally we conclude from (11)
that
(15)
q(, p) p 0
In summary
(17)
cv () = s () = f () .
cv () + div(q(, D)) = r.
(18)
t
This is a PDE for the temperature = (x, t) (x U, t 0). The local production of
entropy is
q(, D) D
1
=
= q(, D) D
(19)
.
2
(20)
is called Fouriers Law, where A M33 , A = ((aij )). Owing to the heat conduction
inequality, we must then have
p (Ap) 0 for all p,
and so
(21)
aij i j 0
( R3 ).
i,j=1
78
u
,
t
u xi =
u
,
xi
u xi xj =
2u
,
xi xj
etc.
i,j=1
u = u(x)
We assume u > 0 in U and u is smooth.
u = temperature
q = ADu = heat ux
(2)
79
We will additionally assume in the heat condition PDE that the heat capacity cv is a constant,
say cv 1. Then, up to additive constants, we see from formula (17) in III.D that
u = internal energy/unit mass
(3)
log u = entropy/unit mass.
The local production of entropy is
(4)
aij
i,j=1
u xi u xj
.
u2
We will henceforth assume the PDE (1) is uniformly elliptic, which means that there
exists a positive constant such that
n
(5)
i,j=1
for all x U , Rn . We assume as well that the coecient functions {aij }ni,j=1 are smooth.
Note that (5) implies 0.
Notation. (i) If V is any smooth region, the outer unit normal vector eld is denoted
= (1 , . . . , n ). The outer A-normal eld is
A = A .
(6)
(7)
G(V ) =
V
n
aij uxi uxj
dx =
dx = internal entropy production
u2
V i,j=1
80
(10)
R(V ) =
V
1 u
dS = entropy ux outward across V .
u A
F (V ) + G(V ) = R(V ).
n
i,j=1
ij uxi
xj
n
aij uxi uxj
f
=
+ .
2
u
u
i,j=1
u
V u i,j=1
V i,j=1
R(V )
G(V )
f
+
dx .
V u
F (V )
We henceforth assume
f 0 in U,
(12)
for q = ADu. This equality says that the heat ux outward through V balances the heat
production within V . Likewise the identity (11) says
f
q
+ dx,
(14)
dS =
u
V u
V
81
which means the entropy ux outward through V balances the entropy production within
V , the later consisting of
dx, the rate of internal entropy generation, and
V f
dx, the entropy supply.
V u
a. A capacity estimate
Now (13) shows clearly that the heat ux outward through U can be made arbitrarily
large if we take f large enough. In contrast if V U , there is a bound on the size of the
entropy ux and the rate of internal entropy production, valid no matter how large f is.
U
w =
value problem
0 in U V
1 on V
0 on U .
CapA (V, U ) =
U V i,j=1
w solving (15).
Integrating by parts shows:
(17)
CapA (V, U ) =
V
82
w
dS,
A
(18)
for all choices of f 0.
w2 u
dS
u A
U V i,j=1
Since w = 1 on V , the term on the left is R(V ), whereas by the denition (16) the term on
the right is CapA (V, U ).
(19)
for all f , the term on the right depending solely on A and the geometry of V, U . The above
calculation is from Day [D].
b. A pointwise bound
Next we demonstrate that in fact we have an internal pointwise bound on within any
region V U , provided
(20)
f 0 in U.
83
(21)
v = log u
i,j=1
becomes
(23)
ij
(a vxi )xj =
i,j=1
i,j=1
So
(24)
ij
a vxi xj +
i,j=1
for
b :=
i
bi vxi = in U
i=1
n
aij
xj
(1 i n).
i,j=1
n
i,j=1
ij
a vxk xi xj +
bi vxk xj = xk + R1 ,
i=1
2. Now
=
k,l=1
Thus
(27)
xi =
xi xj =
n
k,l=1
n
k,l=1
where
|R2 | C(|D2 v| |Dv| + |Dv|2 ).
(28)
Hence
(29)
n
n i
b xi
i=1
n
n i
n
kl
ij
= 2 k,l=1 a vxl i,j=1 a vxk xi xj + i=1 b vxk xi
2 ni,j,k,l=1 aij akl vxk xi vxl xj + R3 ,
i,j=1
aij xi xj +
R3 another remainder term satisfying an estimate like (28). In view of the uniform ellipticity
condition (5), we have
n
aij akl vxk xi vxl xj 2 |D2 v|2 .
i,j,k,l=1
This estimate and the equality (25) allow us to deduce from (29) that
(30)
n
i,j=1
ij
a xi xj +
b xi 2
i
i=1
k,l=1
1 2
b
4
(31)
i,j=1
(32)
(33)
i,j=1
(35)
and compute
(36)
xi = 4 xi + 4 3 xi
xi xj = 4 xi xj + 4 3 (xj xi + xi xj ) + 4( 3 xi )xj .
Consequently
D = 4D at x0 .
(37)
So at the point x0 :
n
i,j=1
n
4
aij xi xj
i,j=1
aij xi xj + R5 ,
where
|R5 | C( 3 |D| + 2 ).
(38)
Invoking (33) we compute
(39)
4 2 C 4 1/2 |D| + C + R6 ,
|R6 | C( 3 |D| + 2 )
C 2
4 4 2 + C.
4 2 C at x0 ,
the constants C, depending only on and the coecients of the PDE. As = 4 attains
its maximum over U at x0 , (40) provides a bound on .
In particular then, since 1 on V , estimate (21) follows.
3. Harnack inequality
As earlier noted, the pointwise estimate (21) is quite signicant from several viewpoints.
As a rst illustration we note that (21) implies
87
Theorem 3 For each connected region V U there exists a constant C, depending only
on U, V , and the coecients, such that
sup u C inf u
(41)
(42)
i,j=1
Proof. Take V W U and r > 0 so small that B(x, r) W for each x V . Let
> 0. Since u = u + > 0 solves (42), Theorem 2 implies
sup
(43)
|Du|
C
u+
for some constant C depending only on W, U , etc. Take any points y, z B(x, r) W .
Then
| log(u(y) + ) log(u(z) + )|
supB(x,r) |D log(u + )| |y z|
2Cr =: C1
owing to (43). So
log(u(y) + ) C1 + log(u(z) + )
and thus
u(y) + C2 (log(u(z) + )
for C2 := eC1 . Let 0 to deduce:
max u C2 min u.
(44)
B(x,r)
B(x,r)
Corollary (Strong Maximum Principle) Assume u C 2 (U ) solves the PDE (42), and U is
bounded, connected. Then either
(45)
88
(x U )
or else
u is constant on U.
(46)
xi dx = 0.
U {
u<0}
i,j=1
D
u =
0 a.e. on {
u 0}
D
u a.e. on {
u < 0}.
(49)
Thus either u > 0 everywhere on V or else u 0 on V . This conclusion is true for each V
as above: the dichotomy (45), (46) follows.
Remark. We have deduced Harnacks inequality and thus the strong maximum principle
from our interior pointwise bound (21) on the local production of entropy. An interesting
philosophical question arises: Are the foregoing PDE computations really entropy calculations from physics? Purely mathematically the point is that change of variables v = log u
converts the linear PDE
n
89
ij
(a vxi )xj =
i,j=1
i,j=1
i,j=1
admits better than linear interior estimates. Is all this just a mathematical accident (since
log is an important elementary function) or is it an instance of basic physical principles
(since entropy is a fundamental thermodynamic concept)? We urgently need a lavishly
funded federal research initiative to answer this question.
ut
i,j=1
f : UT R
and
A : U Sn
A = ((aij ));
(x U , 0 t T ).
u = temperature
q = ADu = heat ux,
(2)
n
aij uxi uxj
.
2
u
i,j=1
(4)
(6)
F (t, V ) =
V
(7)
G(t, V ) =
(, t)dx =
n
V
i,j=1
aij
(8)
f (, t)
dx = entropy supply to V at time t
u(, t)
R(t, V ) =
V
1 u(, t)
dS = entropy ux outward across V at time t.
u(, t) A
d
S(t, V ) = F (t, V ) + G(t, V ) R(t, V ).
dt
This is the entropy production equation.
91
u
u
u
u
i,j=1
i,j=1
xj
Integrate over V :
ut
f
1 u
dS =
dx +
dx +
dx
u A
u
V u
V
V
R(t,V )
d
S(t,v)
dt
G(t,V )
F (t,V ).
2. Evolution of entropy
In this section we suppose that
f 0 in U (0, )
(10)
and also
u
= 0 on U [0, ).
A
(11)
The boundary condition (11) means there is no heat ux across U : the boundary is insulated.
a. Entropy increase
Dene
S(t) =
(t 0).
Theorem 1 Assume u 0 solves (1) and conditions (10), (11) hold. Then
(12)
dS
0 on [0, ).
dt
The proof is trivial: take V = U in the entropy production equation (9) and note that
(11) implies R(t, U ) = 0.
Remarks. (i) Estimate (12) is of course consistent with the physical principle that entropy
cannot decrease in any isolated system. But in fact the same proof shows that
t
(u(, t))dx
U
92
ut u = 0 in U (0, )
as a diusion equation. So now u = u(x, t) > 0 represents the density of some physical
quantity (e.g. a chemical concentration) diusing within U as time evolves. If V U , we
hypothesize that
d
u
u(, t)dx =
(15)
ds
dt
V
V
which says that the rate of change of the total quantity within V equals the outward ux
through V normal to the surface. The identity (15) holding for all t 0 and all V U ,
implies u solves the diusion equation (14).
Next think of the physical quantity (whose density is u) as being passively transported
by a ow with velocity eld v = v(x, t). As in III.A, we have
d
0=
udx =
ut + div(uv)dx,
dt V (t)
V (t)
V (t) denoting a region moving with the ow. Then
(16)
ut + div(uv) = 0.
v=
Du
= Ds
u
93
s = log u denoting the entropy density. So we can perhaps imagine the density as moving
microscopically with velocity equaling the negative of the gradient of the entropy density.
Locally such a motion should tend to decrease s, but globally S(t) = U s(, t)dx increases.
dS
dt
graph of S(t)
t
2
We accordingly might conjecture that t S(t) is concave and so ddtS2 0 on [0, ). This is
true in dimension n = 1, but false in general: Day in [D, Chapter 5] constructs an example
2
where n = 2, U is a square and dtd 2 S(0) > 0.
We turn our attention therefore to a related physical quantity for which a concavity in
time assertion is true. Let us write
H(t) =
(18)
u(, t) log u(, t)dx.
U
Remark. Owing to (3) we should be inclined to think of U u(, t) u(, t) log u(, t)dx as
representing the free energy at time t. This is not so good however, as (z) = z log z z is
convex and so if f 0, t U u(, t) u(, t) log u(, t)dx is nondecreasing. This conclusion
is physically wrong: the free (a.k.a. available) energy should be diminishing during an
irreversible process.
It is worth considering expressions of the form (18) however, as they often appear in the
PDE literature under the name entropy: see e.g. Chow [CB], Hamilton [H], and see also
V.A following.
94
(i) We have
dH
0 on [0, ).
dt
(19)
(ii) If U is convex, then
d2 H
0 on [0, ).
dt2
(20)
We will need this
(21)
Then
|Du|2
0 on U.
(22)
Proof. 1. Fix any point x0 on U . We may without loss assume that x0 = 0 and that near
0, U is the graph in the en -direction of a convex function : Rn1 R, with
(23)
D(0) = 0
(24)
2. Set h =
(25)
n
j=1
(i = 1, . . . , n 1).
uxi xj j + uxj xj i = 0
j=1
95
(i = 1, . . . , n 1)
at 0. Now
n
|Du|2
u xi j u xi xj ,
=2
i,j=1
i,j=1
(26)
1
.
(1+|D|2 )1/2
So for 1 i, j n 1:
xj xk xk xi
xi xj
=
.
(1 + |D|2 )1/2 k=1 (1 + |D|2 )3/2
n1
xj i
i,j=1
n
uxi xj uxi xj
u x ux ux x
4 i uj2 i j
u
ij ux ux
+2|Du|2 u3i j dx
u x ux x j
2 ni,j=1 U i ui j dS.
i,j=1
96
1 |Du|2
dS 0,
u
|Du|2 |D2 u|
u2
=
|Du|2
u3/2
1 |Du|4
2 u3
+
|D2 u|
u1/2
|Du|4
dx
u3
1 |D2 u|2
.
2
u
ut u = 0 in U (0, )
with
(28)
u
= 0 on U [0, ).
We assume U is convex.
Lemma We have
(29)
n
|Du|2
ut
.
+
u
2t
u2
2t
where as usual s = log u is the entropy density and =
production.
Proof. 1. Write v = log u; so that (27) becomes
(30)
vt v = |Dv|2 .
97
|Du|2
u2
Set
(31)
w = v.
(32)
But
(33)
2 2
w .
n
2. We will exploit the good term on the right hand side of (33). Set
w = tw +
(34)
n
.
2
Then
(35)
n
w
t
2t
w 2 nw
n2
+
.
t2
t2
4t2
wt w 2Dv Dw w.
t
(36)
3. Now
= t w
= t
(vt |Dv|2 ) on U [0, )
= 0 on U [0, ),
v
(vt ) =
= 0 on U [0, ).
t
v
1 u
u
98
|Dv|2 0 on U [0, ).
Thus
w
0 on U [0, ).
(37)
4. Fix > 0 so small that
(38)
w = tw +
n
> 0 on U {t = }.
2
Then (36)(38) and the maximum principle for parabolic PDE imply
w 0 in U [, ).
This is true for all small > 0, and so
tw +
But w = v = vt |Dv|2 =
ut
u
n
0 in U (0, ).
2
|Du|2
.
u2
n/2
|x2 x1 |2
t2
e 4(t2 t1 ) u(x2 , t2 ).
u(x1 , t1 )
t1
1
01
d
v(sx2
ds
1
1 2
b and the identity
Here we used the inequality ab a2 + 4
1
t2 t1
t2
.
ds = log
t1
0 t1 + s(t2 t1 )
99
Day [D, Chapters 3,4] has noted that the heat equation (and related second-order parabolic
PDE) admit various estimates which are reminiscent of the classical Clausius inequality from
Chapter II. We recount in this section some simple cases of his calculations.
We hereafter assume u > 0 is a smooth solution of
ut
(1)
i,j=1
u(, t) = (t) on U,
ij
t
i,j=1 (a uxi )xj = 0 in UT
(4)
u = on U [0, T ]
u = g on U {t = 0}.
2. Let g be another smooth function and dene u similarly. Then
d
2
(u
)
dx
=
2
(u ut )(u u)dx
dt
U
U t
= 2 U ni,j=1 aij (uxi uxi )(uxj uxj )dx
2 U |D(u u)|2 dx,
there being no boundary term when we integrate by parts, as u u = = 0 on U . A
version of Poincares inequality states
2
w dx C
|Dw|2 dx
U
100
b. Heating
Let u be the unique cycle corresponding to and recall from A.1 that
q = ADu = heat ux.
Thus
(6)
Q(t) =
q dS =
U
u
dS
A
represents the total heat ux into U from its exterior at time t 0. We dene as well
+ = sup{ (t) | 0 t T, Q(t) > 0}
(7)
= inf{ (t) | 0 t T, Q(t) < 0}
to denote, respectively, the maximum temperature at which heat is absorbed and minimum
temperature at which heat is emitted.
Theorem 1 (i) We have
(8)
0
Q
dt 0,
< +.
Notice that (8) is an obvious analogue of Clausius inequality and (9) is a variant of
classical estimates concerning the eciency of cycles. In particular (9) implies that it is not
101
possible for heat to be absorbed only at temperatures below some given temperature 0 and
to be emitted only at temperatures above 0 . This is of course a form of the Second Law.
Proof. 1. Write v = log u, so that
vt
ij
(a vxi )xj =
i,j=1
Then
d
dt
U
i,j=1
v(, t)dx
v
dS +
U A
1 u
dS
U u A
Q(t)
,
(t)
U
dx
since u(, t) = (t) on U . As t v(, t) is T -periodic, we deduce (8) upon integrating the
above inequality for 0 t T . We obtain as well a strict inequality in (8) unless
dxdt = 0,
|Dv|2 dxdt = 0.
(x U )
for each 0 t T . But then the PDE (1) implies ut 0 in UT and so t (t) is constant.
2. We now adapt a calculation from II.A.3. If is not constant, then
0>
T
Q(t)
dt
(t)
(10)
=
=
+
T
Q+ (t)Q (t)
dt
(t)
T
T
1
Q+ (t)dt 1 0 Q (t)dt
+ 0
T |Q(t)|+Q(t)
T
1
dt 1 0 |Q(t)|Q(t)
dt
+ 0
2
2
T
1
1
1 0 |Q(t)|dt
2 +
T
1
1
1
+
Q(t)dt.
+
0
0
Q(t)dt =
0
d
dt
102
u(, t)dx dt = 0,
(b) c
(11)
(c) T C/
(d) | | C, |
| C2
for all > 0, t 0.
For any as above, let u be the corresponding cycle, and set
u
Q (t) =
(, t)dS.
U A
Theorem 2 We have
(12)
0
Q
dt = O() as 0.
Estimate (12) is a sort of approximation to the Clausius equality for reversible cycles.
Proof. 1. Let w = w(x) be the unique smooth solution of
n
i,j=1 (aij wxi )xj = 1 in U
(13)
w = 0 on U ,
and set
(14)
Q (t) =
u
dS
U A
= |U | (t)
for
=
d
dt
u (, t)dx
w
dx
(t) + R(t),
U
U
(17)
ut (, t)dx.
R(t) =
U
(18)
R2 (t)dt = O(3 ) as 0.
To verify this claim, multiply the PDE (15) by ut and integrate over UT :
T n
T 2
ij
u
dxdt
+
xi uxj t dxdt
t
i,j=1 a u
0
U
0
U
(19)
T
ut dxdt.
= 0 U w
The second term on the left is
T
0
d
dt
n
1
aij uxi uxj dx dt = 0,
2 U i,j=1
owing to the periodicity of u , and thus u. The expression on the right hand side of (19)
is estimated by
T
2
1 T
u
dxdt
+
C
|
|2 dt
t
2 0
U
0
T
12 0 U u2t dxdt + O(3 ),
| C2 . So (19) implies
since T C/, |
T
u2t dxdt = O(3 ),
0
and thus
T
0
2
u
dx
dt
t
0
TU 2
|U | 0 U ut dxdt = O(3 ).
R2 (t)dt =
T
104
T
= |U | 0 dt
T
wdx 0
U
Q
dt
dt
T
0
R
dt.
The rst term on the right is zero, since = dtd (log ) and is T -periodic. The second
term is estimated by
C|T | sup |
| = O(),
and the third term is less than or equal
1/2
T
c
1/2
C
R dt
1/2 3/2 = O(),
where
Q
dt + A
1
A=
|U |2
2
dt = O(2 )
wdx.
U
105
p = p().
This is an equation of state, the precise form of which depends upon the particular uid we
are investigating.
Assume further that the uid is a simple ideal gas. Recall now formula (9) from I.F,
which states that at equilibrium
(3)
P V = constant,
pv = ,
denoting some positive constant. Thus for an isentropic ow of a simple ideal gas the
general equation of state (2) reads
(5)
p = .
106
Hence
(v i )t = t v i + vti
= v i 3j=1 (v j )xj 3j=1 v j vxi j pxi ,
= 3j=1 (v i v j )xj pxi
(i = 1, 2, 3).
velocity v R3
.
107
Assume rst that the particles do not interact. Then for each velocity v R3
d
f (x, v, t)dx = 0,
dt
V (t)
where V (t) = V + tv is the region occupied at time t by those particles initially lying within
V and possessing velocity v. As III.A, we deduce
0=
ft + divx (f v)dx =
ft + v Dx f dx.
V (t)
V (t)
Consequently
(7)
ft + v Dx f = 0 in R3 R3 (0, )
v
v
v
*
v*
We assume the particles all have the same mass m, so that conservation of momentum and
kinetic energy dictate:
(a) v + v = v + v
(8)
(b) |v|2 + |v |2 = |v |2 + |v |2 ,
where v, v , v , v R3 .
We now change variables, by writing
v v = w
108
(9)
v = v [(v v ) w]w
v = v + [(v v ) w]w.
(11)
which diers from (7) with the addition of the collision operator Q(f, f ) on the right hand
side. This term models the decrease in f (v, x, t) and f (v , x, t) and the gain in f (v , x, t) and
f (v , x, t), owing to collisions of the type illustrated above. The term B models the rate of
collisions which start with velocity pairs v, v and result in velocity pairs v , v given by (9).
See Huang [HU, Chapter 3] for the physical derivation of (11).
b. H-Theorem
We henceforth assume f is a smooth solution of (11), with f 0. We suppose as well
that f 0 as |x|, |v| , fast enough to justify the following calculations. Dene then
Boltzmanns H-function
H(t) =
(12)
f log f dvdx
(t 0),
R3
R3
109
(13)
on [0, ).
Thus
Q(f, f ) =
S2
R3
1
(v)Q(f, f )(v)dv =
4
R3
S2
R3
R3
(f f f f )( + )Bdvdv dS.
Next set
B3 :=
S2
R3
R3
66
110
(v , v )
(v, v ) = 1.
and so
Consequently
B3 =
R3
S2
The integrand is
R3
S2
R3
|v v | = |v v |
(v v ) w = (v v ) w;
for
B4 :=
S2
R3
R3
log f (v, )Q(f, f )(v, )dv = 14 S 2 R3 R3 (f f f f )[log(f f ) log(f f )]Bdvdv dS
0,
111
4. Thus
d
H(t)
dt
=
=
=
=
ft (log f + 1)dvdx
R 3 R 3
3 R 3 [v Dx f + Q(f, f )](log f + 1)dvdx
R
R3 R3 v Dx f (log f + 1)dvdx by (15), (16)
R3 v R3 Dx (f log f )dx dv
0.
(17)
(18)
(v R3 )
(19)
(x, v R3 )
for t 1.
c. H and entropy
We provide in this section some physical arguments suggesting a connection between the
H-function and entropy. The starting point is to replace (19) by the equality
(20)
(x, v R3 , t 0),
and is thus determined by the macroscopic parameters a = a(x, t), b = b(x, t), c = c(x, t).
We intend to interpret these physically.
(i) It is rst of all convenient to rewrite (20):
f (x, v, t) =
(21)
|vv|2
n
2
e
,
(2)3/2
(a)
f dv = n
R 3
(b) R3 vf dv = nv
(22)
(23)
(24)
S = R log V + CV log T + S0 ,
113
entropy/mole
heat capacity/mole.
Now a mole of any substance contains Avagadros number NA of molecules: see Appendix A.
Thus the mass of a mole of our gas is NA m. Dene now
s = entropy/unit mass
cv = heat capacity/unit mass.
Then
(28)
s = S/NA m
cv = CV /NA m.
Recall also

γ := C_P/C_V > 1,   C_P − C_V = R.

Then (28) and the formula for S give

(30)   s = (k/m)((γ − 1)^{−1} log T + log V) + s₀,

where

(31)   k = R/N_A

is Boltzmann's constant. We now further hypothesize (as in Chapter III) that formula (30) makes sense at each point x ∈ R³, t ≥ 0 for our nonequilibrium gas. That is, in (30) we replace

(32)   V by 1/ρ(x,t) = volume/unit mass,   T by θ(x,t).

Inserting (32) into (30) gives:

(33)   s = (k/m)((γ − 1)^{−1} log θ − log ρ) + s₀.
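The algebra producing (30) from (28) is short and worth recording; it uses only C_P − C_V = R:

```latex
s \;=\; \frac{S}{N_A m}
  \;=\; \frac{R\log V + C_V\log T + S_0}{N_A m}
  \;=\; \frac{k}{m}\Bigl(\log V + \frac{C_V}{R}\log T\Bigr) + s_0,
\qquad
\frac{C_V}{R} \;=\; \frac{C_V}{C_P - C_V} \;=\; \frac{1}{\gamma-1}.
```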
(iii) We next compute the H-density of the Maxwellian (21). Using log f = log n − (3/2) log(2πθ) − |v − v̄|²/(2θ) together with (22)-(24):

h(x,t) := ∫_{R³} f log f dv = n(log n − (3/2) log(2πθ)) − (3/2)n,

and so, since ρ = nm,

(34)   h(x,t) = (ρ/m)(log ρ − (3/2) log θ + r₀),

r₀ now a different constant.
Comparing now (33), (34), we deduce that up to arbitrary constants

(35)   ρ(x,t) s(x,t) = −k h(x,t)   (x ∈ R³, t ≥ 0),

provided θ is proportional to ρ^{γ−1},

(36)   θ(x,t) = κ ρ^{γ−1}(x,t)   (κ > 0),

for

γ = 5/3.

Integrating over R³ and recalling (12), we conclude that the total entropy is

S(t) := ∫_{R³} ρ s dx = −k H(t)   (t ≥ 0).

So the total entropy at time t is just −kH at time t, and consequently the estimate (13) that dH/dt ≤ 0 is yet another version of Clausius' inequality.
(iv) It remains to compute θ. Owing to (26), we expect

(3/2) ρ θ = (1/2) ∫_{R³} m|v − v̄|² f dv

to represent the total internal energy density at (x,t). That is, (3/2)θ should be the internal energy/unit mass.
Now we saw in I.F that the internal energy/mole at equilibrium is C_V T. Thus we can expect

(39)   (3/2) θ = c_v T,

where we recall c_v is the heat capacity/unit mass. Thus (28), (39) imply:

(40)   θ = (2C_V/(3 N_A m)) T = (R/(N_A m)) T = (k/m) T,

the second equality holding since C_V = (3/2)R for a monatomic ideal gas (consistently with γ = 5/3).
Inserting (40) into (21), we arrive at the Maxwell-Boltzmann distribution

f(x, v, t) = n (m/(2πkT))^{3/2} e^{−m|v−v̄|²/(2kT)},

where

n(x,t) = particle density at (x,t),
T(x,t) = local temperature at (x,t),
v̄(x,t) = macroscopic fluid velocity at (x,t).

Equivalently,

(43)   f = n e^{−βH}/Z,

for

β = 1/(kT),   H = (m/2)|v − v̄|²,   Z = (2πkT/m)^{3/2} = ∫_{R³} e^{−βH} dv.
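The moment identities (22)-(24) for the Maxwellian (21) are easy to confirm numerically, since the 3D Gaussian factorizes into 1D integrals; the sketch below (our own check, with an arbitrary value of θ) verifies the 1D normalization and variance by quadrature, from which ∫ f dv = n and ∫ |v − v̄|² f dv = 3nθ follow:

```python
import math

# 1D factor of the Maxwellian: g(s) = (2 pi theta)^(-1/2) exp(-s^2 / (2 theta)).
# We verify  ∫ g ds = 1  and  ∫ s^2 g ds = theta  by the trapezoid rule, which is
# extremely accurate for rapidly decaying smooth integrands.
theta = 0.7                        # arbitrary sample value
N = 20001
L = 12.0 * math.sqrt(theta)        # integration window: 12 standard deviations
ds = 2 * L / (N - 1)
mass, second = 0.0, 0.0
for i in range(N):
    s = -L + i * ds
    g = math.exp(-s * s / (2 * theta)) / math.sqrt(2 * math.pi * theta)
    w = ds if 0 < i < N - 1 else ds / 2   # trapezoid weights
    mass += w * g
    second += w * s * s * g

assert abs(mass - 1.0) < 1e-7      # so ∫ f dv = n * 1^3 = n
assert abs(second - theta) < 1e-7  # so ∫ |v - vbar|^2 f dv = n * 3 theta
```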
existence, uniqueness and regularity of solutions for either. Much effort has focused therefore on certain simpler model PDE, the rigorous analysis of which presumably provides clues about the structure of solutions of Euler's and Boltzmann's PDE.
In this section we discuss a type of nonlinear first-order PDE called a conservation law and describe how entropy-like notions are employed to record irreversibility phenomena and in fact to define appropriate weak solutions. Following Lions, Perthame, Tadmor [L-P-T1] we introduce also a kinetic approximation, which is a sort of simpler analogue of Boltzmann's equation. In C following we discuss systems of conservation laws.
1. Integral solutions
A PDE of the form

(1)   u_t + div F(u) = 0   in R^n × (0, ∞)

is called a scalar conservation law. Physically, u is the density of some conserved quantity: for each smooth region V ⊂ R^n, the rate of change

(2)   d/dt ∫_V u dx

is balanced by the inward flux through the boundary,

(3)   −∫_{∂V} F·ν dS.

We hypothesize that F is a function of u. Equating (2), (3) for any region V yields the conservation law (1). Note that (1) is a caricature of Euler's equations (6) in A.
If u is a C¹ solution, we can rewrite (1) as

(4)   u_t + b(u)·Du = 0   in R^n × (0, ∞),

where

(5)   b = F',   b = (b¹, ..., b^n).
We can interpret (4) as a nonlinear transport equation, for which the velocity v = b(u) depends on the unknown density u. The left-hand side of (4) is thus a sort of nonlinear variant of the material derivative Du/Dt from Chapter III. The characteristics of (1) are solutions x(·) of the ODE

(6)   ẋ(t) = b(u(x(t), t))   (t ≥ 0).
As solutions of (1) typically develop discontinuities in finite time, we need a notion of weak solution. We say u ∈ L^∞(R^n × (0, ∞)) is an integral solution of (1) with the initial condition u = g on R^n × {t = 0} provided

(8)   ∫₀^∞ ∫_{R^n} u v_t + F(u)·Dv dx dt + ∫_{R^n} g v(·, 0) dx = 0

for each v ∈ C_c¹(R^n × [0, ∞)).

Examples. (a) If

g(x) = { 1, x < 0
       { 0, x > 0,

then

u(x, t) = { 1, x < t/2
          { 0, x > t/2

is an integral solution of

(9)   u_t + (u²/2)_x = 0 in R × (0, ∞),   u = g on R × {t = 0}.
(b) If, instead,

g(x) = { 0, x < 0
       { 1, x > 0,

then

u₁(x, t) = { 0, x < t/2
           { 1, x > t/2

is an integral solution of (9). But so also is the rarefaction wave

u₂(x, t) = { 0, x < 0
           { x/t, 0 < x < t
           { 1, x > t.

(c) If

g(x) = { 1, x < 0
       { 1 − x, 0 < x < 1
       { 0, x > 1,

then

u(x, t) = { 1, x ≤ t, 0 ≤ t ≤ 1
          { (1 − x)/(1 − t), t ≤ x ≤ 1, 0 ≤ t ≤ 1
          { 0, x ≥ 1, 0 ≤ t ≤ 1

for 0 ≤ t ≤ 1, and

u(x, t) = { 1, x < (1 + t)/2, t ≥ 1
          { 0, x > (1 + t)/2, t ≥ 1,

with a shock forming at time t = 1 and thereafter propagating along the curve x = (1 + t)/2.
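The vanishing-viscosity mechanism discussed next is easy to see numerically. The following sketch (our own, not from the notes) runs a Godunov finite-volume scheme on the data of Example (b); the scheme's built-in numerical dissipation selects the rarefaction u₂ rather than the nonphysical shock u₁:

```python
# Godunov scheme for Burgers' equation u_t + (u^2/2)_x = 0.
def godunov_flux(ul, ur):
    # exact Riemann-solver flux for the convex flux F(u) = u^2/2
    if ul <= ur:                                 # rarefaction case
        if ul >= 0.0:
            return 0.5 * ul * ul
        if ur <= 0.0:
            return 0.5 * ur * ur
        return 0.0                               # transonic (sonic point)
    return max(0.5 * ul * ul, 0.5 * ur * ur)     # shock case

def solve(g, xlo=-2.0, xhi=3.0, nx=500, t_end=1.0, cfl=0.4):
    dx = (xhi - xlo) / nx
    x = [xlo + (i + 0.5) * dx for i in range(nx)]
    u = [g(xi) for xi in x]
    t = 0.0
    while t < t_end - 1e-12:
        smax = max(1e-8, max(abs(v) for v in u))
        dt = min(cfl * dx / smax, t_end - t)     # CFL-limited time step
        flux = [godunov_flux(u[i], u[i + 1]) for i in range(nx - 1)]
        for i in range(1, nx - 1):
            u[i] -= dt / dx * (flux[i] - flux[i - 1])
        t += dt
    return x, u

x, u = solve(lambda s: 0.0 if s < 0 else 1.0)    # Example (b) data
i = min(range(len(x)), key=lambda j: abs(x[j] - 0.5))
assert abs(u[i] - 0.5) < 0.05                    # rarefaction value x/t at (0.5, 1)
```

Running the same scheme on the data of Example (a) instead produces the admissible shock travelling with speed 1/2.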
2. Entropy solutions
We next introduce some additional mathematical structure which will clarify what we
mean above by a physically correct solution.
Definition. We call (Φ, Ψ) an entropy/entropy flux pair for the conservation law (1) provided
(i) Φ : R → R is convex
and
(ii) Ψ : R → R^n, Ψ = (Ψ¹, ..., Ψⁿ)
satisfies

(10)   Ψ' = b Φ'.

Thus

(11)   Ψ^i(z) = ∫^z b^i(v) Φ'(v) dv   (i = 1, ..., n)

up to additive constants.
Motivation. Notice that if u is a C¹ solution of (1) in some region of R^n × (0, ∞), then (10) implies

Φ(u)_t + div Ψ(u) = 0

there. We interpret this equality to mean that there is no entropy production within such a region. On the other hand, our examples illustrate that integral solutions can be discontinuous, and in particular Examples (a), (b) suggest certain sorts of discontinuities are physically acceptable, others not.
We intend to employ entropy/entropy flux pairs to provide an inequality criterion, a kind of analogue of the Clausius-Duhem inequality, for selecting the proper integral solution. The easiest way to motivate all this is to introduce the regularized PDE

(12)   u^ε_t + div F(u^ε) = ε Δu^ε   in R^n × (0, ∞),

where ε > 0. By analogy with the incompressible Navier-Stokes equations ((18) in III.C) we can regard the term ε Δu^ε as modelling viscous effects, which presumably tend to smear out discontinuities. And indeed it is possible to prove under various technical hypotheses that (12) has a smooth solution u^ε, subject to the initial condition u^ε = g on R^n × {t = 0}.
Take a smooth entropy/entropy flux pair (Φ, Ψ) and compute:

(13)   Φ(u^ε)_t + div Ψ(u^ε)
       = Φ'(u^ε) u^ε_t + Ψ'(u^ε)·Du^ε
       = Φ'(u^ε)(−b(u^ε)·Du^ε + ε Δu^ε) + Ψ'(u^ε)·Du^ε
       = ε Φ'(u^ε) Δu^ε   by (10)
       = ε div(Φ'(u^ε) Du^ε) − ε Φ''(u^ε)|Du^ε|²
       ≤ ε div(Φ'(u^ε) Du^ε),
the inequality holding since Φ is convex. Now take v ∈ C_c¹(R^n × (0, ∞)), v ≥ 0. Then (13) implies

∫₀^∞ ∫_{R^n} Φ(u^ε) v_t + Ψ(u^ε)·Dv dx dt = −∫₀^∞ ∫_{R^n} v (Φ(u^ε)_t + div Ψ(u^ε)) dx dt
   ≥ −ε ∫₀^∞ ∫_{R^n} v div(Φ'(u^ε) Du^ε) dx dt
   = ε ∫₀^∞ ∫_{R^n} Φ'(u^ε) Dv·Du^ε dx dt.
Assume now that as ε → 0,

(14)   u^ε → u boundedly, a.e.,

and further we have the estimate

(15)   sup_{ε>0} ε ∫₀^∞ ∫_{R^n} |Du^ε|² dx dt < ∞,

which by the Cauchy-Schwarz inequality forces the last term above to vanish in the limit. We conclude:

(16)   ∫₀^∞ ∫_{R^n} Φ(u) v_t + Ψ(u)·Dv dx dt ≥ 0.

Definition. We say u ∈ L^∞(R^n × (0, ∞)) is an entropy solution of

(17)   u_t + div F(u) = 0 in R^n × (0, ∞),   u = g on R^n × {t = 0},

provided

(18)   Φ(u)_t + div Ψ(u) ≤ 0 in R^n × (0, ∞)

for each entropy/entropy flux pair (Φ, Ψ), and

u(·, 0) = g.
Remarks. (i) The meaning of (18) is that the integral inequality (16) holds for all v ∈ C_c¹(R^n × (0, ∞)), v ≥ 0.
(ii) We can regard (18) as a form of the Clausius-Duhem inequality, except that the sign is reversed. Note carefully: if (Φ, Ψ) is an entropy/entropy flux pair, then s = −Φ(u) acts like a physical entropy density. The customary mathematical and physical usages of the term "entropy" differ in sign.
(iii) Since we have seen that Φ(u)_t + div Ψ(u) = 0 in any region where u is C¹, the possible inequalities in (18) must arise only where u is not C¹, e.g. along shocks. We motivated (18)
The existence of an entropy solution of (17) follows via the vanishing viscosity process,
and the really important assertion is this theorem of Kruzkov:
Theorem. Assume that u, û are entropy solutions of

(20)   u_t + div F(u) = 0 in R^n × (0, ∞),   u = g on R^n × {t = 0}

and

(21)   û_t + div F(û) = 0 in R^n × (0, ∞),   û = ĝ on R^n × {t = 0}.

Then

(22)   ‖u(·, t) − û(·, t)‖_{L¹(R^n)} ≤ ‖u(·, s) − û(·, s)‖_{L¹(R^n)}

for each 0 ≤ s ≤ t.
In particular an entropy solution of the initial value problem (20) is unique.
See for instance [E1, 11.4.3] for proof.
3. Condition E
We illustrate the meaning of the entropy condition for n = 1 by examining more closely the case that u is a piecewise smooth solution of

(23)   u_t + F(u)_x = 0   in R × (0, ∞).

More precisely, assume that some region V ⊂ R × (0, ∞) is subdivided into a left region V_l and a right region V_r by a smooth curve C, parametrized by {x = s(t)}:
C={x=s(t)}
t
Vl
Vr
Assume that u is smooth in both V̄_l and V̄_r, and further that u satisfies the entropy condition

(24)   Φ(u)_t + Ψ(u)_x ≤ 0 in V

in the weak sense, for each entropy/entropy flux pair (Φ, Ψ). Then

(25)   u_t + F(u)_x = 0 in V_l,
(26)   Φ(u)_t + Ψ(u)_x = 0 in V_l.

Similarly

(27)   u_t + F(u)_x = 0 in V_r,
(28)   Φ(u)_t + Ψ(u)_x = 0 in V_r.
An integration by parts using (25), (27) shows that

(29)   (F(u_l) − F(u_r)) ν¹ + (u_l − u_r) ν² = 0 along C,

where ν = (ν¹, ν²) is the outer unit normal to V_l along C, u_l is the limit of u from the left along C, and u_r is the limit from the right. Since

ν = (1 + (ṡ)²)^{−1/2} (1, −ṡ),

(29) says

(30)   F(u_r) − F(u_l) = ṡ (u_r − u_l),

the Rankine-Hugoniot jump condition. A similar argument employing the inequality (24) yields

(31)   Ψ(u_r) − Ψ(u_l) ≤ ṡ (Φ(u_r) − Φ(u_l)).
Select now a time t, and suppose u_l < u_r. Fix u_l < u < u_r and define the entropy/entropy flux pair

Φ(z) = (z − u)⁺,   Ψ(z) = ∫_{u_l}^z sgn⁺(v − u) F'(v) dv.

Then

(32)   Φ(u_r) − Φ(u_l) = u_r − u,   Ψ(u_r) − Ψ(u_l) = F(u_r) − F(u).

Combine (31), (32):

(33)   F(u) ≥ ((F(u_r) − F(u_l))/(u_r − u_l)) (u − u_r) + F(u_r)   (u_l ≤ u ≤ u_r).

This inequality holds for each u_l ≤ u ≤ u_r and says that the graph of F on the interval [u_l, u_r] lies above the line segment connecting (u_l, F(u_l)) to (u_r, F(u_r)).
A similar proof shows that if u_l > u_r, then

(34)   F(u) ≤ ((F(u_r) − F(u_l))/(u_r − u_l)) (u − u_r) + F(u_r)   (u_r ≤ u ≤ u_l).

The inequalities (33), (34) are called Oleinik's condition E.
Remarks. (i) In particular, (33) implies

F'(u_r) ≤ (F(u_r) − F(u_l))/(u_r − u_l) = ṡ ≤ F'(u_l).

As characteristics move with speed b = F', we see that characteristics can only collide with the shock and cannot emanate from it. The same conclusion follows from (34). This geometric observation records the irreversibility inherent in the entropy condition (24).
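As a concrete instance (our worked example, not in the original): for the Burgers flux F(u) = u²/2, which is strictly convex, the graph lies below every chord, so (33) can never hold when u_l < u_r, while (34) holds automatically when u_l > u_r. Thus shocks are admissible exactly when u_l > u_r, with speed given by (30):

```latex
\dot s \;=\; \frac{F(u_r)-F(u_l)}{u_r-u_l}
       \;=\; \frac{\tfrac12 u_r^2-\tfrac12 u_l^2}{u_r-u_l}
       \;=\; \frac{u_l+u_r}{2},
\qquad \text{admissible iff } u_l > u_r .
```

This is consistent with Examples (a), (b) above: the shock in (a) has u_l = 1 > u_r = 0 and speed 1/2, while the shock u₁ in (b) has u_l = 0 < u_r = 1 and is rejected in favor of the rarefaction u₂.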
4. Kinetic formulation
Our intention next is to introduce and study a sort of kinetic formulation of the conservation law u_t + div F(u) = 0. If we think of this PDE as a simplified version of Euler's equations, the corresponding kinetic PDE is then a rough analogue of Boltzmann's equation. The following is taken from Perthame-Tadmor [P-T] and Lions-Perthame-Tadmor [L-P-T1].
We will study the kinetic equation

(35)   w_t + b(y)·D_x w = m_y   in R^n × R × (0, ∞),

where

w : R^n × R × (0, ∞) → R,   w = w(x, y, t),

is the unknown, b = F' as in §2, and m is a nonnegative Radon measure on R^n × R × (0, ∞). Hence

m_y = ∂m/∂y,

the derivative taken in the distribution sense. The new variable y ∈ R plays the role of the velocity in Boltzmann's equation, and

(36)   u(x,t) := ∫_R w(x, y, t) dy

should represent the density at (x,t). The idea will be to show under certain circumstances that u solves the conservation law u_t + div F(u) = 0.
To facilitate this interpretation, we introduce the pseudo-Maxwellian
(37)   χ_a(y) = { 1 if 0 < y ≤ a
               { −1 if a ≤ y < 0
               { 0 otherwise,

for each parameter a ∈ R.
As a sort of crude analogue with the theory set forth in A, we might guess that w, u are further related by the formula:

(38)   w(x, y, t) = χ_{u(x,t)}(y).

This equality says that on the mesoscopic scale the velocity distribution is governed by the pseudo-Maxwellian χ, with macroscopic parameter a = u(x,t) at each point x ∈ R^n and time t ≥ 0. It is remarkable that this rough interpretation can be made quite precise.
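Two elementary identities make χ_a behave like a Maxwellian with "density" a: its integral recovers the macroscopic parameter, and integrating it against Φ' recovers Φ. A quick numerical sketch (the sample entropy Φ(y) = y² is our own choice):

```python
# Check, for the pseudo-Maxwellian chi_a of (37):
#   (i)  ∫ chi_a(y) dy = a
#   (ii) ∫ chi_a(y) Phi'(y) dy = Phi(a) - Phi(0)
def chi(a, y):
    if 0 < y <= a:
        return 1.0
    if a <= y < 0:
        return -1.0
    return 0.0

def integrate(f, lo=-5.0, hi=5.0, n=100000):
    h = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * h) for i in range(n)) * h   # midpoint rule

for a in (2.0, -1.5):
    mass = integrate(lambda y: chi(a, y))
    assert abs(mass - a) < 1e-3
    # sample entropy Phi(y) = y^2, so Phi'(y) = 2y and Phi(a) - Phi(0) = a^2
    moment = integrate(lambda y: chi(a, y) * 2.0 * y)
    assert abs(moment - a * a) < 1e-3
```

Identity (ii) is exactly what makes the substitution w = χ_{u(x,t)} reproduce the entropy pairs (Φ, Ψ) in the proof below.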
Theorem.
(i) Let u be a bounded entropy solution of

(39)   u_t + div F(u) = 0   in R^n × (0, ∞)

and define

(40)   w(x, y, t) := χ_{u(x,t)}(y)   (x ∈ R^n, y ∈ R, t ≥ 0).

Then w solves

(41)   w_t + b(y)·D_x w = m_y   in R^n × R × (0, ∞)

for some nonnegative Radon measure m supported in R^n × [−R₀, R₀] × (0, ∞), where R₀ = ‖u‖_{L^∞}.
(ii) Conversely, let w ∈ C([0, ∞); L¹(R^n × R)) ∩ L^∞(R^n_x × (0, ∞); L¹(R_y)) solve (41) for some measure m as above. Assume also w has the form

w = χ_{u(x,t)}.

Then

(42)   u(x,t) = ∫_R w(x, y, t) dy

is an entropy solution of (39).
Proof. 1. We first prove assertion (ii). Fix a convex, C² entropy Φ with Φ(0) = 0, take v ∈ C_c¹(R^n × (0, ∞)) with v ≥ 0, and select a cutoff ζ ∈ C_c^∞(R) with 0 ≤ ζ ≤ 1, ζ ≡ 1 on [−R₀, R₀]. We employ

v(x,t) Φ'(y) ζ(y)

as a test function in the definition of w as a weak solution of the transport equation (41):

(45)   ∫₀^∞ ∫_R ∫_{R^n} w (vΦ'ζ)_t + w b(y)·D_x(vΦ'ζ) dx dy dt = ∫₀^∞ ∫_R ∫_{R^n} (vΦ'ζ)_y dm.

We must examine each term in this identity.
2. Now

∫₀^∞ ∫_R ∫_{R^n} w (vΦ'ζ)_t dx dy dt = ∫₀^∞ ∫_{R^n} v_t ( ∫_R w Φ'ζ dy ) dx dt.

By hypothesis w = χ_{u(x,t)}, and therefore

(46)   ∫_R w(x,y,t) Φ'(y) ζ(y) dy = ∫_R χ_{u(x,t)}(y) Φ'(y) ζ(y) dy
       = ∫₀^{u(x,t)} Φ'(y) dy   if u(x,t) ≥ 0
       = Φ(u(x,t)),

since Φ(0) = 0 and ζ ≡ 1 on [0, u(x,t)]. A similar computation is valid if u(x,t) ≤ 0. Hence

(47)   ∫₀^∞ ∫_R ∫_{R^n} w (vΦ'ζ)_t dx dy dt = ∫₀^∞ ∫_{R^n} v_t Φ(u) dx dt.

Similarly,

∫₀^∞ ∫_R ∫_{R^n} w b(y)·D_x(vΦ'ζ) dx dy dt = ∫₀^∞ ∫_{R^n} D_x v·( ∫_R b(y) w Φ'ζ dy ) dx dt,

and, if u(x,t) ≥ 0,

∫_R w b(y) Φ'(y) ζ(y) dy = ∫₀^{u(x,t)} b(y) Φ'(y) dy = Ψ(u(x,t)),

for

(48)   Ψ(z) := ∫₀^z b(y) Φ'(y) dy;

again the case u(x,t) ≤ 0 is similar. Hence

∫₀^∞ ∫_R ∫_{R^n} w b(y)·D_x(vΦ'ζ) dx dy dt = ∫₀^∞ ∫_{R^n} Dv·Ψ(u) dx dt.

3. We investigate now the term on the right-hand side of (45):

∫₀^∞ ∫_R ∫_{R^n} (vΦ'ζ)_y dm = ∫₀^∞ ∫_{R^n} ∫_R v (Φ''ζ + Φ'ζ') dm
(50)   ≥ ∫₀^∞ ∫_{R^n} ∫_R v Φ'ζ' dm,

since Φ'' ≥ 0, ζ ≥ 0, v ≥ 0 and m is nonnegative. Additionally, since ζ' ≡ 0 on the support of m, we have

(51)   ∫₀^∞ ∫_{R^n} ∫_R v Φ'ζ' dm = 0.

4. Combining (45)-(51), we deduce

∫₀^∞ ∫_{R^n} Φ(u) v_t + Ψ(u)·Dv dx dt ≥ 0

for all v as above. An easy approximation removes the requirement that Φ be C². Thus for each entropy/entropy flux pair we have

Φ(u)_t + div Ψ(u) ≤ 0 in R^n × (0, ∞)

in the weak sense, and consequently u is an entropy solution of (39).
5. Now we prove assertion (i) of the Theorem. Let u be a bounded entropy solution of u_t + div F(u) = 0 and define

(53)   w(x, y, t) := χ_{u(x,t)}(y)   (x ∈ R^n, y ∈ R, t ≥ 0).

Define also the distribution T := w_t + b(y)·D_x w; that is,

(54)   ⟨T, φ⟩ := −∫₀^∞ ∫_R ∫_{R^n} w [φ_t + b(y)·D_x φ] dx dy dt

for all φ ∈ C_c^∞(R^n × R × (0, ∞)). We must manufacture a nonnegative measure m with

T = m_y

in the distribution sense. For this, define the distribution M by

(56)   ⟨M, φ⟩ := ⟨T, Λ⟩,   Λ(x, y, t) := ∫_y^∞ φ(x, z, t) dz;

then, directly from the definitions,

(57)   T = ∂M/∂y

in the distribution sense. It therefore suffices to show that M is represented by a nonnegative measure. So take first

(58)   φ(x, y, t) = ψ(x, t) ϕ(y),

with

ψ ≥ 0, ψ ∈ C_c^∞(R^n × (0, ∞)),
ϕ ≥ 0, ϕ ∈ C_c^∞(R),

so that

(60)   Λ = ψ(x,t) Λ(y),   Λ(y) := ∫_y^∞ ϕ(z) dz.

Then

(61)   ⟨M, φ⟩ = ⟨T, ψΛ⟩   by (56), (60)
       = −∫₀^∞ ∫_R ∫_{R^n} w [(ψΛ)_t + b(y)·D_x(ψΛ)] dx dy dt   by (54)
       = −∫₀^∞ ∫_{R^n} ∫_R χ_u Λ(y) [ψ_t + b(y)·D_x ψ] dy dx dt
       = ∫₀^∞ ∫_{R^n} Φ(u) ψ_t + Ψ(u)·D_x ψ dx dt,

where

Φ(z) := −∫₀^z Λ(y) dy,   Ψ' = b Φ'.

The last equality results from calculations like those in steps 1, 2 above. Now Φ'' = −Λ' = ϕ ≥ 0, so Φ is convex; since u is an entropy solution, the last term in (61) is nonnegative. Thus

(62)   ⟨M, φ⟩ ≥ 0 if φ(x, y, t) = ψ(x, t) ϕ(y),

where ψ ≥ 0, ϕ ≥ 0.
Next, take

φ(x, y, t) = ψ(x, t) ϕ(y),

where ψ, ϕ are smooth, nonnegative and have compact support. Assume also

∫₀^∞ ∫_R ∫_{R^n} φ dx dy dt = 1.

Let φ_ε(·) := ε^{−(n+2)} φ(·/ε) and mollify:

M_ε(x, y, t) := (M * φ_ε)(x, y, t) = ⟨M, φ_ε(x − ·, y − ·, t − ·)⟩.

Each translate of φ_ε is again a nonnegative product of the form (58), so M_ε ≥ 0 by (62). Sending ε → 0, we conclude

M ≥ 0

in the distribution sense; consequently M is represented by a nonnegative Radon measure m, and

T = ∂M/∂y = m_y,

as required. One checks further that m may be taken supported in R^n × [−R₀, R₀] × (0, ∞), since w, and hence T, vanishes for |y| > R₀. □
5. A hydrodynamical limit
To illustrate the usefulness of the kinetic formulation of the conservation law introduced in §4, we consider in this section the problem of understanding the limit as ε → 0 of solutions w^ε of the scaled transport equation

(63)   w^ε_t + b(y)·D_x w^ε = (1/ε)(χ_{u^ε} − w^ε)   in R^n × R × (0, ∞),

where

(64)   u^ε(x,t) := ∫_R w^ε(x, y, t) dy.

Our scaled problem (63), (64) is a vastly simplified variant of the corresponding hydrodynamical limit problem for the Boltzmann equation, for which it is possible to understand rigorously the limit ε → 0. First let us adjoin to (63), (64) the initial condition

(65)   w^ε = χ_g   on R^n × R × {t = 0},

where g = g(x) is a given bounded function.
The main assertion is that as ε → 0 (after passing to a subsequence if necessary) w^ε → w and u^ε → u, where w solves

(67)   w_t + b(y)·D_x w = m_y,   w = χ_u in R^n × R × (0, ∞),   w = χ_g on R^n × R × {t = 0}

for some nonnegative measure m; by the kinetic formulation of §4, u is then an entropy solution of

(68)   u_t + div F(u) = 0 in R^n × (0, ∞),   u = g on R^n × {t = 0}.

Outline of proof. 1. Maximum principle arguments applied to (63), (65) show

(69)   |w^ε| ≤ 1 a.e.,   w^ε ≥ 0 on {y ≥ 0},   w^ε ≤ 0 on {y ≤ 0},

and

(70)   supt(w^ε) ⊂ R^n × [−R₀, R₀] × (0, ∞),

for R₀ := ‖g‖_{L^∞}.
2. We claim next that for each ε > 0 we can write

(71)   (1/ε)(χ_{u^ε} − w^ε) = m^ε_y,

where, for each (x, t),

supt(m^ε) ⊂ [−R₀, R₀],   m^ε ≥ 0.

3. To prove this, fix (x, t), write h(y) := w^ε(x, y, t), a := u^ε(x,t) = ∫_R h dy, and note from (69), (70) that

(72)   supt(h) ⊂ [−R₀, R₀],   −1 ≤ h ≤ 0 if y ≤ 0,   0 ≤ h ≤ 1 if y ≥ 0,   ∫_R h dy = a.

Then

(73)   χ_a(y) − h(y) = q'(y)   (a.e. y ∈ R)

for

q(y) := ∫_{−∞}^y χ_a(z) − h(z) dz.

Recall that

χ_a(z) = { 1 if 0 < z ≤ a
         { −1 if a ≤ z < 0
         { 0 otherwise.

Then

q(R₀) = ∫_{−R₀}^{R₀} χ_a(z) − h(z) dz = a − a = 0.

Suppose a ≥ 0. Then (72) shows χ_a − h ≥ 0 on (−∞, a], so q is nondecreasing there and q(−∞) = 0 gives q ≥ 0 on (−∞, a]; on (a, ∞), χ_a − h = −h ≤ 0, so q is nonincreasing there, and q(+∞) = 0 gives q ≥ 0 there as well. The case a ≤ 0 is symmetric. Hence

(74)   q ≥ 0 on R.

Setting m^ε := (1/ε) q, we obtain (71), where

supt(m^ε) ⊂ [−R₀, R₀],   m^ε ≥ 0

for each (x, t). This is assertion (71).
4. Next we assert:

(75)   sup_{0<ε≤1} m^ε(R^n × R × (0, ∞)) < ∞.

To see this, compute, using (71), (63) and integration by parts:

∫₀^∞ ∫_{R^n} ∫_R dm^ε = −∫₀^∞ ∫_{R^n} ∫_R m^ε_y y dy dx dt
   = −∫₀^∞ ∫_{R^n} ∫_R (w^ε_t + b(y)·D_x w^ε) y dy dx dt
   = ∫_{R^n} ∫_R w^ε(x, y, 0) y dy dx
   = ∫_{R^n} ∫_R χ_{g(x)}(y) y dy dx   by (65)
   = ∫_{R^n} g²/2 dx < ∞.

5. Owing to (69), (70), (75), we may pass to a subsequence ε_r → 0 so that

w^{ε_r} ⇀ w weakly-* in L^∞,
u^{ε_r} → u strongly in L¹_loc,
m^{ε_r} ⇀ m weakly as measures;

the strong convergence of the averages u^{ε_r} is a velocity-averaging compactness assertion, for which see [L-P-T1]. Passing to limits in (63), (71):

w_t + b(y)·D_x w = m_y   in R^n × R × (0, ∞).

6. Finally, for each test function φ, by (71) and (75)

(76)   ∫₀^∞ ∫_{R^n} ∫_R (χ_{u^ε} − w^ε) φ dx dy dt = −ε ∫₀^∞ ∫_{R^n} ∫_R φ_y dm^ε → 0.

Consequently

(77)   χ_{u^{ε_r}} ⇀ w weakly.

Now

χ_{u^{ε_r}}(y) = { 1 if 0 < y ≤ u^{ε_r}
               { −1 if u^{ε_r} ≤ y < 0
               { 0 otherwise,

and so the strong convergence u^{ε_r} → u in L¹_loc implies χ_{u^{ε_r}} → χ_u in L¹_loc. Thus

w = χ_u.

Hence (67) holds and so, according to the kinetic formulation in §4, u solves the conservation law (68). □
(1)   u_t + div F(u) = 0   in R^n × (0, ∞),

where the unknown is

u : R^n × [0, ∞) → R^m,   u = (u¹, ..., u^m),

and

F : R^m → M^{m×n},   F = ((F_i^k))   (1 ≤ k ≤ m, 1 ≤ i ≤ n)

is given.
Notation. (i) We can rewrite (1) into the nondivergence form

(2)   u_t + B(u) : Du = 0   in R^n × (0, ∞)

for

(3)   B := DF.

Indeed, written out in components, (1) says

u^k_t + Σ_{i=1}^n (F_i^k(u))_{x_i} = 0   (k = 1, ..., m),

which for C¹ solutions is

(4)   u^k_t + Σ_{i=1}^n Σ_{l=1}^m (∂F_i^k/∂z_l)(u) u^l_{x_i} = 0   (k = 1, ..., m).
(ii) As for scalar equations, we call u ∈ L^∞ an integral solution of (1) with initial data g provided

(6)   ∫₀^∞ ∫_{R^n} u·v_t + F(u) : Dv dx dt + ∫_{R^n} g·v(·, 0) dx = 0

for all v ∈ C_c¹(R^n × [0, ∞); R^m), where

Dv = ((v^k_{x_i}))   (1 ≤ k ≤ m, 1 ≤ i ≤ n),   F(u) : Dv = Σ_{k=1}^m Σ_{i=1}^n F_i^k(u) v^k_{x_i}.
As for scalar conservation laws this is an inadequate notion of solution, and so we introduce this additional

Definition. We call (Φ, Ψ) an entropy/entropy flux pair for the conservation law (1) provided
(i) Φ : R^m → R is convex
and
(ii) Ψ : R^m → R^n, Ψ = (Ψ¹, ..., Ψⁿ)
satisfies

(7)   DΨ^i = DΦ DF_i   (i = 1, ..., n),   F_i := (F_i¹, ..., F_i^m).

Notation. The identity (7) means:

(8)   Ψ^i_{z_k} = Σ_{l=1}^m Φ_{z_l} (∂F_i^l/∂z_k)   (1 ≤ i ≤ n, 1 ≤ k ≤ m).
Motivation. If u is a C¹ solution of (1) in some region of R^n × (0, ∞), then

(9)   Φ(u)_t + div Ψ(u) = 0

there. Indeed, we compute:

Φ(u)_t + div Ψ(u) = Σ_{k=1}^m Φ_{z_k}(u) u^k_t + Σ_{k=1}^m Σ_{i=1}^n Ψ^i_{z_k}(u) u^k_{x_i}
   = −Σ_{k,l=1}^m Σ_{i=1}^n Φ_{z_k}(u) (∂F_i^k/∂z_l)(u) u^l_{x_i} + Σ_{k=1}^m Σ_{i=1}^n Ψ^i_{z_k}(u) u^k_{x_i}   according to (4)
   = Σ_{k=1}^m Σ_{i=1}^n ( −Σ_{l=1}^m Φ_{z_l}(u) (∂F_i^l/∂z_k)(u) + Ψ^i_{z_k}(u) ) u^k_{x_i}
   = 0,   owing to (8).

Unlike the situation for scalar conservation laws (i.e. m = 1), there need not exist any entropy/entropy flux pairs for a given system of conservation laws. For physically derived PDE, on the other hand, we can hope to discern at least some such pairs.
2. Compressible Euler equations in one space dimension
We return to A.1 and consider now the compressible, isentropic Euler equations in one space dimension. According to (6) in A.1, the relevant PDE are

(10)   ρ_t + (ρv)_x = 0,   (ρv)_t + (ρv² + p)_x = 0   in R × (0, ∞),

where ρ is the density, v the velocity, and

(11)   p = p(ρ)

is the pressure. The system (10) is of the form (1) for m = 2, n = 1, with

u = (ρ, ρv),   F(z) = (z₂, z₂²/z₁ + p(z₁)).
Remark. We have

B = DF = ( 0                      1      )
         ( −z₂²/z₁² + p'(z₁)   2z₂/z₁ ),

with eigenvalues

λ = z₂/z₁ ± (p'(z₁))^{1/2},

assuming

(13)   p' > 0.

Reverting to physical variables, we see

λ = v ± (p'(ρ))^{1/2}.

It follows that the speed of sound for isentropic flow is

(14)   p'(ρ)^{1/2}.
We next look for entropy/entropy flux pairs, written as functions of (ρ, v):

(15)   Φ_t + Ψ_x = 0 in any

region where (ρ, v) are C¹ solutions of (10). For C¹ solutions, (10) is equivalent to

(16)   ρ_t + vρ_x + ρv_x = 0,   v_t + vv_x + (p'(ρ)/ρ) ρ_x = 0.

Observe that the second line here is just the second line of formula (1) from A.1. So if Φ = Φ(ρ, v), Ψ = Ψ(ρ, v), we can compute:

Φ_t + Ψ_x = Φ_ρ ρ_t + Φ_v v_t + Ψ_ρ ρ_x + Ψ_v v_x
   = Φ_ρ (−vρ_x − ρv_x) + Φ_v (−vv_x − (p'/ρ) ρ_x) + Ψ_ρ ρ_x + Ψ_v v_x
   = ρ_x (Ψ_ρ − vΦ_ρ − (p'/ρ)Φ_v) + v_x (Ψ_v − ρΦ_ρ − vΦ_v).

Consequently, Φ_t + Ψ_x ≡ 0 for all smooth solutions (ρ, v) of (10) if and only if

(17)   Ψ_ρ = vΦ_ρ + (p'/ρ)Φ_v,   Ψ_v = ρΦ_ρ + vΦ_v.

We proceed further by noting Ψ_ρv = Ψ_vρ:

(vΦ_ρ + (p'/ρ)Φ_v)_v = (ρΦ_ρ + vΦ_v)_ρ.

Hence

Φ_ρ + vΦ_ρv + (p'/ρ)Φ_vv = Φ_ρ + ρΦ_ρρ + vΦ_vρ,

and consequently

(18)   Φ_ρρ = (p'(ρ)/ρ²) Φ_vv   (ρ > 0, v ∈ R).
The entropies for (10) are therefore obtained by solving the linear wave equation (18), say with the data

Φ = 0,   Φ_ρ = g on R × {ρ = 0},

for given g : R → R. Now specialize to the polytropic equation of state

p(ρ) = κ ρ^γ,   where κ = (γ − 1)²/4,   γ > 1.

Theorem ([L-P-T2]). For p as above, the solution of (18) with the data above is

(21)   Φ(ρ, v) = ∫_R g(y) χ(ρ, y − v) dy,   χ(ρ, ξ) := (ρ^{2θ} − ξ²)₊^λ,

where

λ := (3 − γ)/(2(γ − 1)),

for

θ := (γ − 1)/2.

Moreover Φ is convex in (ρ, ρv) if and only if g is convex.
See [L-P-T2] for proof. We will momentarily see that we can regard χ as a sort of pseudo-Maxwellian, parameterized by the macroscopic parameters ρ, v.
Example. Take g(y) = y². Then

(24)   Φ(ρ, v) = ∫_R y² (ρ^{2θ} − (y − v)²)₊^λ dy = (1/2) ρv² + (κ/(γ − 1)) ρ^γ,

the choice κ = (γ − 1)²/4 having been made so that the constants here come out cleanly. The term (1/2)ρv² is the density of the kinetic energy, and (κ/(γ−1))ρ^γ is the density of the internal energy. Hence Φ is the energy density. If (ρ, v) is an entropy solution of (10), then
Φ_t + Ψ_x ≤ 0,

and so

(25)   sup_{t≥0} ∫_R (1/2) ρ(x,t) v²(x,t) + (κ/(γ − 1)) ρ^γ(x,t) dx < ∞,

provided this energy integral is finite at time t = 0. We call an entropy solution with this property a finite-energy solution.
Theorem. Let (ρ, v) ∈ L^∞((0, ∞); L¹(R; R²)) have finite energy and suppose ρ ≥ 0 a.e. Then (ρ, v) is an entropy solution of Euler's equations

(26)   ρ_t + (ρv)_x = 0,   (ρv)_t + (ρv² + p)_x = 0   in R × (0, ∞)

if and only if there exists a nonpositive measure m on R × R × (0, ∞) such that

(27)   w := χ(ρ, y − v)

satisfies

(28)   w_t + [(θy + (1 − θ)v) w]_x = m_{yy}   in R × R × (0, ∞).

Outline of proof. Write T := w_t + [(θy + (1 − θ)v)w]_x, a distribution on R × R × (0, ∞). Suppose (ρ, v) is an entropy solution. As in B.4, one constructs a distribution M with

(29)   ∂²M/∂y² = T,

and must show M is represented by a nonpositive measure. Suppose now

(30)   φ(x, y, t) = ψ(x, t) ϕ(y),

where

ψ ≥ 0, ψ ∈ C_c^∞(R × (0, ∞)),
ϕ ≥ 0, ϕ ∈ C_c^∞(R).

Take g so that

(32)   g'' = ϕ;

then g is convex, and the pair (Φ, Ψ) generated by g through χ as in §2 satisfies

(31)   ∫₀^∞ ∫_R Φ ψ_t + Ψ ψ_x dx dt = −⟨T, ψg⟩.

Since (ρ, v) is an entropy solution, the left-hand side of (31) is nonnegative, and therefore

0 ≥ ⟨T, ψg⟩ = ⟨M, ψg''⟩ = ⟨M, φ⟩   by (29), (31),

since g'' = ϕ; and thus ⟨M, φ⟩ ≤ 0. This holds for all φ = ψϕ as above and so, as in B.4,

(34)   M is represented by a nonpositive measure m,

which gives (28). The converse direction reverses these calculations. □
(1)   u_t + H(Du) = 0   in R^n × (0, ∞).

Definition. A bounded, uniformly continuous function u is a viscosity solution of (1) provided for each smooth v:
(2) if u − v has a local maximum at a point (x₀, t₀) ∈ R^n × (0, ∞),
then v_t(x₀, t₀) + H(Dv(x₀, t₀)) ≤ 0;
and
(3) if u − v has a local minimum at a point (x₀, t₀) ∈ R^n × (0, ∞),
then v_t(x₀, t₀) + H(Dv(x₀, t₀)) ≥ 0.
The interest in (2), (3) is consequently the possibility of the inequalities holding at points where u is not C¹. This is all obviously some kind of vague analogue of the theory from Chapter V. As in that chapter, let us motivate (2), (3) by the vanishing viscosity method. So fix ε > 0 and consider the regularized PDE

(4)   u^ε_t + H(Du^ε) = ε Δu^ε   in R^n × (0, ∞).

Assume that

(5)   u^ε → u locally uniformly as ε → 0,

and further suppose for some v ∈ C^∞ that u − v has a strict local maximum at some point (x₀, t₀) ∈ R^n × (0, ∞). Then u^ε − v has a local maximum at a nearby point (x_ε, t_ε), with

(6)   (x_ε, t_ε) → (x₀, t₀) as ε → 0.

As v and our solution u^ε of the regularized problem (4) are smooth, we have

(7)   u^ε_t = v_t,   Du^ε = Dv,   D²u^ε ≤ D²v   at (x_ε, t_ε),

whence Δu^ε ≤ Δv there. Consequently at (x_ε, t_ε):

v_t + H(Dv) = u^ε_t + H(Du^ε) = ε Δu^ε ≤ ε Δv.

Sending ε → 0, we obtain

v_t(x₀, t₀) + H(Dv(x₀, t₀)) ≤ 0.

It is easy to modify this proof if u − v has a local maximum which is not strict at (x₀, t₀). A similar proof shows that the reverse inequality holds should u − v have a local minimum at a point (x₀, t₀).
Hence if the u (or a subsequence) converge locally uniformly to a limit u, then u is a
viscosity solution of (1). This construction by the vanishing viscosity method accounts for
the name.3
We will not develop here the theory of viscosity solutions, other than to state the fundamental theorem of Crandall-Lions:

Theorem. Assume that u, û are viscosity solutions of

(8)   u_t + H(Du) = 0 in R^n × (0, ∞),   u = g on R^n × {t = 0}

and

(9)   û_t + H(Dû) = 0 in R^n × (0, ∞),   û = ĝ on R^n × {t = 0}.

Then

‖u(·, t) − û(·, t)‖_{L^∞(R^n)} ≤ ‖u(·, s) − û(·, s)‖_{L^∞(R^n)}

for each 0 ≤ s ≤ t.

In particular a viscosity solution of the initial value problem (8) is unique. See [C-E-L] for proof (cf. also [E1, §10.2]).

³In fact, Crandall and Lions originally considered the name "entropy solutions".
B. Hopf-Lax formula
For use later we record here a representation formula for the viscosity solution of

(1)   u_t + H(Du) = 0 in R^n × (0, ∞),   u = g on R^n × {t = 0},

in the special case that

(2)   H : R^n → R is convex

and

(3)   g : R^n → R is bounded, Lipschitz.

Theorem. We have

(4)   u(x, t) = min_{y ∈ R^n} { t L((x − y)/t) + g(y) }   (x ∈ R^n, t > 0)

for the unique viscosity solution of (1). Here L is the Legendre transform of H:

L(q) = sup_{p ∈ R^n} {p·q − H(p)}.

See [E1, §10.3.4] for a proof. We will invoke this formula in VIII.C.
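The formula (4) is easy to evaluate by brute force. The sketch below (our own; H, g and the grid are illustrative choices) takes H(p) = p²/2, so L(q) = q²/2, and g(x) = |x|, for which the exact viscosity solution is u = x²/(2t) when |x| ≤ t and u = |x| − t/2 when |x| ≥ t:

```python
# Hopf-Lax formula u(x,t) = min_y { t L((x - y)/t) + g(y) }, minimized over a grid.
def hopf_lax(x, t, g, L, ylo=-6.0, yhi=6.0, n=12000):
    h = (yhi - ylo) / n
    best = float("inf")
    for i in range(n + 1):
        y = ylo + i * h
        best = min(best, t * L((x - y) / t) + g(y))
    return best

g = abs                      # Lipschitz initial data g(x) = |x|
L = lambda q: 0.5 * q * q    # Legendre transform of H(p) = p^2/2

assert abs(hopf_lax(0.5, 1.0, g, L) - 0.125) < 1e-5   # |x| <= t: x^2/(2t)
assert abs(hopf_lax(2.0, 1.0, g, L) - 1.5) < 1e-5     # |x| >= t: |x| - t/2
```

Note that u is not differentiable along |x| = t and at x = 0, yet the formula selects the unique viscosity solution there.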
C. A diffusion limit
In Chapter VIII we will employ viscosity solution methods to study several asymptotic problems, involving (as we will see) entropy considerations. As these developments must wait, it is appropriate to include here a rather different application. We introduce for each ε > 0 a coupled linear first-order transport PDE, with terms of orders O(1/ε) and O(1/ε²), and study under appropriate hypotheses the limit of w^ε as ε → 0. This will be a diffusion limit, a sort of twin of the hydrodynamical limit from V.B.5.

1. Formulation
Our PDE system is

(1)   w^ε_t + (1/ε) B Dw^ε = (1/ε²) C w^ε   in R^n × (0, ∞).

The unknown is

w^ε : R^n × (0, ∞) → R^m,   w^ε = (w^{1,ε}, ..., w^{m,ε}).
Notation. In (1) we are given the matrix

C = ((c_{kl}))_{m×m}

and also

B = diag(b¹, ..., b^m),

where the vectors {b^k}_{k=1}^m in R^n are given, b^k = (b^k_1, ..., b^k_n). In terms of the components of w^ε, the system (1) reads

(2)   w^{k,ε}_t + (1/ε) b^k·Dw^{k,ε} = (1/ε²) Σ_{l=1}^m c_{kl} w^{l,ε}

for k = 1, ..., m.
Remark. We can think of (2) as a variant of the PDE (63) in V.B.5, where the velocity parameterization variable y is now discrete. Thus (1) is a scalar PDE for w^ε = w^ε(x, y, t), y ∈ {1, ..., m}.
The left-hand side of (1) is for each k a linear, constant-coefficient transport operator, and the right-hand side of (1) represents linear coupling. As ε → 0, the velocity (1/ε) b^k on the left becomes bigger and the coupling (1/ε²) C on the right gets bigger even faster. What happens in the limit?
To answer we introduce some hypotheses, the meaning of which will be revealed only later. Let us first assume:

(3)   c_{kl} > 0 if k ≠ l,   Σ_{l=1}^m c_{kl} = 0   (k = 1, ..., m).
It follows from Perron-Frobenius theory for matrices that there exists a unique vector

π = (π₁, ..., π_m)

satisfying

(4)   π_k > 0 (k = 1, ..., m),   Σ_{k=1}^m π_k = 1,   Σ_{k=1}^m c_{kl} π_k = 0 (l = 1, ..., m).

(See for instance Gantmacher [G] for Perron-Frobenius theory, and also look at VIII.A below.)
We envision π as a probability vector on {1, ..., m} and then make the assumption of average velocity balance:

(5)   Σ_{k=1}^m π_k b^k = 0.
2. Construction of diffusion coefficients
We must construct the matrix A = ((a^{ij})). First, write 1 = (1, ..., 1) ∈ R^m. Then (3), (4) say

(6)   C1 = 0,   πC = 0.

Now for j = 1, ..., n write b_j := (b_j¹, ..., b_j^m) ∈ R^m. By (5), each b_j is orthogonal to π, the left null vector of C; hence we can solve

(7)   C d_j = −b_j (j = 1, ..., n),   d_j·1 = 0 (j = 1, ..., n),

the solutions d_j ∈ R^m being unique owing to the normalization d_j·1 = 0.
We write d_j = (d_j¹, ..., d_j^m), and then define the diffusion coefficients

(8)   a^{ij} = Σ_{k=1}^m π_k b_i^k d_j^k   (1 ≤ i, j ≤ n).

We claim that

(9)   Σ_{i,j=1}^n a^{ij} ξ_i ξ_j ≥ 0   (ξ ∈ R^n).

To see this, write

η_k := Σ_{j=1}^n d_j^k ξ_j   (k = 1, ..., m),

and note from (7) that

(10)   b_i^k = −Σ_{l=1}^m c_{kl} d_i^l   (1 ≤ k ≤ m, 1 ≤ i ≤ n).
Then

(11)   Σ_{i,j=1}^n a^{ij} ξ_i ξ_j = Σ_{k=1}^m π_k (Σ_{i=1}^n b_i^k ξ_i) η_k
       = −Σ_{k,l=1}^m π_k c_{kl} η_l η_k   by (10)
       = −Σ_{k,l=1}^m s_{kl} η_k η_l,

for

s_{kl} := (π_k c_{kl} + π_l c_{lk})/2   (1 ≤ k, l ≤ m).

The matrix S = ((s_{kl}))_{m×m} is symmetric, with s_{kl} > 0 (k ≠ l) and

S1 = 0,

owing to (3), (4). Since the entries of 1 = (1, ..., 1) are positive, Perron-Frobenius theory asserts that every other eigenvalue of S has real part less than or equal to the eigenvalue (namely 0) associated with 1. But as S is symmetric, each eigenvalue is real. So

λ ≤ 0

for each eigenvalue λ of S. Consequently (9) follows from (11).
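A tiny worked instance of this construction may be helpful; all specific numbers below (the matrix C, the velocities b^k) are our own illustrative choices, not from the notes:

```python
# A concrete instance of the construction (3)-(8): m = 2 states, n = 1.
C = [[-1.0, 1.0],
     [ 2.0, -2.0]]                 # (3): off-diagonal entries > 0, rows sum to 0

pi = [2.0 / 3.0, 1.0 / 3.0]        # (4): pi > 0, sums to 1, and pi C = 0
assert abs(pi[0] * C[0][0] + pi[1] * C[1][0]) < 1e-12
assert abs(pi[0] * C[0][1] + pi[1] * C[1][1]) < 1e-12

b = [1.0, -2.0]                    # (5): pi . b = 0  (average velocity balance)
assert abs(pi[0] * b[0] + pi[1] * b[1]) < 1e-12

# (7): C d = -b with d_1 + d_2 = 0.  Substituting d = (s, -s) into the first
# row gives -2s = -b[0], i.e. s = b[0] / 2.
d = [b[0] / 2.0, -b[0] / 2.0]
assert all(abs(sum(C[k][l] * d[l] for l in range(2)) + b[k]) < 1e-12
           for k in range(2))

# (8): a^{11} = sum_k pi_k b^k d^k, nonnegative as claimed in (9)
a11 = sum(pi[k] * b[k] * d[k] for k in range(2))
assert a11 >= 0.0 and abs(a11 - 2.0 / 3.0) < 1e-12    # equals 2/3 for this data
```

So this two-velocity transport system homogenizes, in the limit below, to the heat equation u_t = (2/3) u_xx.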
3. Passage to limits
Assume now

g : R^n → R

is smooth, with compact support. We introduce the initial value problem for (1):

(12)   w^{k,ε}_t + (1/ε) b^k·Dw^{k,ε} = (1/ε²) Σ_{l=1}^m c_{kl} w^{l,ε}   in R^n × (0, ∞),
       w^{k,ε} = g   on R^n × {t = 0},

for k = 1, ..., m. Linear PDE theory implies there exists a unique smooth solution w^ε.
Theorem. As ε → 0, we have for k = 1, ..., m:

(13)   w^{k,ε} → u locally uniformly in R^n × (0, ∞),

where u is the unique solution of the diffusion equation

(14)   u_t = Σ_{i,j=1}^n a^{ij} u_{x_i x_j} in R^n × (0, ∞),   u = g on R^n × {t = 0}.

Outline of proof. 1.-2. Standard estimates permit us to pass to a subsequence ε_r → 0 with w^{k,ε_r} → w^k in an appropriate weak sense. We first claim

(15)   w¹ = w² = ... = w^m =: u

for some scalar function u = u(x, t). To verify this, take any v ∈ C_c(R^n × (0, ∞); R^m), multiply (1) by ε² v, integrate by parts, and let ε = ε_r → 0. It follows that

∫₀^∞ ∫_{R^n} v·(Cw) dx dt = 0

for all v as above, and so Cw ≡ 0 in R^n × (0, ∞). Since the nullspace of C is the span of 1, (15) follows.
3. Thus

(16)   w^{k,ε_r} → u locally uniformly (k = 1, ..., m).

We next claim that u is a viscosity solution of (14); that is, if v is smooth and u − v has a strict local maximum at a point (x₀, t₀) ∈ R^n × (0, ∞), then

(17)   v_t(x₀, t₀) − Σ_{i,j=1}^n a^{ij} v_{x_i x_j}(x₀, t₀) ≤ 0,

with the reverse inequality at any local minimum.
4. To prove this, let us take v as above and suppose u − v has a strict local maximum at some point (x₀, t₀). Define then the perturbed test functions

v^ε := (v^{1,ε}, ..., v^{m,ε}),

where

(20)   v^{k,ε} := v − ε Σ_{j=1}^n d_j^k v_{x_j}   (k = 1, ..., m),

so that

(21)   v^{k,ε} → v uniformly near (x₀, t₀) as ε → 0.

Since u − v has a strict local maximum at (x₀, t₀), it follows from (16), (21) that

(22)   w^{k,ε} − v^{k,ε} has a local maximum near (x₀, t₀)
       at a point (x^k_ε, t^k_ε) (k = 1, ..., m),

for ε = ε_r, and

(23)   (x^k_ε, t^k_ε) → (x₀, t₀) as ε = ε_r → 0, k = 1, ..., m.
Since w^ε and v^ε are smooth functions, it follows from (22) and the PDE (12) that at (x^k_ε, t^k_ε):

(24)   v^{k,ε}_t + (1/ε) b^k·Dv^{k,ε} = (1/ε²) Σ_{l=1}^m c_{kl} w^{l,ε},

for ε = ε_r. Expanding the left-hand side of (24) using (20):

(25)   v_t − Σ_{i,j=1}^n b_i^k d_j^k v_{x_i x_j} + (1/ε) Σ_{i=1}^n b_i^k v_{x_i} + O(ε) = (1/ε²) Σ_{l=1}^m c_{kl} w^{l,ε}   at (x^k_ε, t^k_ε).

Since (x^l_ε, t^l_ε) locally maximizes w^{l,ε} − v^{l,ε}, we also have the inequalities

(26)   (w^{l,ε} − v^{l,ε})(x^k_ε, t^k_ε) ≤ (w^{l,ε} − v^{l,ε})(x^l_ε, t^l_ε)   (k, l = 1, ..., m).

Recalling that c_{kl} > 0 for k ≠ l, we can employ the inequalities (26) in (25): writing w^{l,ε} = (w^{l,ε} − v^{l,ε}) + v − ε Σ_i d_i^l v_{x_i} and using (23),

v_t(x₀, t₀) − Σ_{i,j=1}^n b_i^k d_j^k v_{x_i x_j}(x₀, t₀)
   ≤ −(1/ε) Σ_{i=1}^n b_i^k v_{x_i}(x^k_ε, t^k_ε)
     + (1/ε²) Σ_{l=1}^m c_{kl} [ (w^{l,ε} − v^{l,ε})(x^l_ε, t^l_ε) + v(x^k_ε, t^k_ε) − ε Σ_{i=1}^n d_i^l v_{x_i}(x^k_ε, t^k_ε) ]
     + o(1).

But (10) says Σ_{l=1}^m c_{kl} d_i^l = −b_i^k, and so the O(1/ε) terms in the foregoing expression cancel out. Thus

v_t(x₀, t₀) − Σ_{i,j=1}^n b_i^k d_j^k v_{x_i x_j}(x₀, t₀)
   ≤ (1/ε²) Σ_{l=1}^m c_{kl} [ (w^{l,ε} − v^{l,ε})(x^l_ε, t^l_ε) + v(x^k_ε, t^k_ε) ] + o(1).

Multiply by π_k > 0 and sum k = 1, ..., m, recalling (3), (4): the remaining bracketed sums vanish, since Σ_{k=1}^m π_k c_{kl} = 0 (the first bracketed term depends only on l) and Σ_{l=1}^m c_{kl} = 0. We deduce:

v_t(x₀, t₀) − Σ_{i,j=1}^n ( Σ_{k=1}^m π_k b_i^k d_j^k ) v_{x_i x_j}(x₀, t₀) ≤ o(1),

the coefficient in parentheses being precisely a^{ij}. Let ε = ε_r → 0 to derive the inequality (17). A simple approximation removes the requirement that u − v have a strict maximum, and a similar argument derives the opposite inequality should u − v have a minimum at (x₀, t₀).
Commentary. The linear system (12) for each fixed ε > 0 represents a system of linear transport PDE with simple linear coupling. This PDE is reversible in time, and yet the limit diffusion equation (14) is not. The interesting question is this: where did the irreversibility come from? Section VIII.A will provide some further insights. See also Pinsky [P] for other techniques, mostly based upon interpreting (12) as a random evolution.
[Figure: a thermally insulated cylinder. Initial state: the gas is confined by a partition to the left half, with vacuum on the right. Final state: the gas fills the entire cylinder.]
Take one mole of a simple ideal gas, and suppose it is initially at equilibrium, being held by a partition in half of a thermally insulated cylinder. The initial volume is V_i, and the initial temperature is T_i. We remove the partition, the gas fills the entire cylinder, and, after coming to equilibrium, it has final volume V_f and final temperature T_f.
What is the change of entropy? According to I.F, we have

(1)   S_i = C_V log T_i + R log V_i + S₀,   S_f = C_V log T_f + R log V_f + S₀,

S₀ being an arbitrary constant. As there is no heat transfer nor work done to or from the exterior, the internal energy is unchanged. Since, furthermore, the energy depends only on the temperature (see I.F), we deduce

T_i = T_f.
since k = R/NA .
As the last sentence suggests, it is convenient now to shift attention to the microscopic
level, at which the gas can be thought of as a highly complex, random motion of NA molecules.
We next imagine that we reinstall the partition, but now with
(a) a small gate
and
(b) a nanotechnology-built robot, which acts as a gatekeeper.
Our robot is programmed to open the door whenever a gas molecule approaches the door
from the right, but to close the door if a gas molecule approaches from the left. After our
robot has been at work for awhile, we will see more particles in the left region than in the
right. This is close to our initial situation.
154
The eect of our tiny robot has thus been to decrease the entropy, with a very small
expenditure of energy on its part. We have here an apparent contradiction of the Second
Law.
Maxwell in 1867 proposed this thought experiment, with an intelligent creature (called
Maxwells demon by Kelvin) in place of our nanoscale robot. Generations of physicists
have reconsidered this problem, most notably L. Szilard [SZ], who argued that the Second
Law is not violated provided the overall entropy of the system increases by k log 2 each time
the robot measures the direction of an incoming molecule in order to decide whether or not
to open the gate. As (2) presumably implies the entropy decreases by k log 2 once a particle is
trapped on the left, the Second Law is saved, providedto repeatwe appropriately assign
an entropy to the robots gaining information about molecule velocities.
We will not attempt to pursue such reasoning any further, being content to learn from
this thought experiment that there seems to be some sort of connection between entropy
and our information about random systems.
Remark. The book [L-R], edited by Le and Rex, is a wonderful source for more on
Maxwells demon, entropy concepts in computer science, etc. See also the website
www.math.washington.edu/~hillman/entropy.html.
B. Maximum entropy
This section introduces a random model for thermal systems and a concept of entropy as
a measure of uncertainty. The following is based upon Huang [HU], Jaynes [J], Bamberg
Sternberg [B-S].
1. A probabilistic model
A probabilistic model for thermal systems in equilibrium
155
We are given:
(i) a triple
(, F, ),
consisting of a set , a -algebra F of subsets of , and a nonnegative measure dened
on F. (We call (, F, ) the system, and the reference measure. A typical point is
a microstate.)
(ii) the collection of all -measurable functions
: [0, ),
such that
(1)
d = 1.
X : Rm+1 , X = (X 0 , . . . , X m ).
E(X, ) = !X" = Xd
= expected value of X, given the
microstate distribution .
156
is the probability that the true (but unobserved) microstate is in E, given the density .
Our goal is to determine, or more precisely to estimate , given certain macroscopic physical
measurements.
These we model using the observables X 0 , . . . , X m .
X=(X0,..,Xm)
AAAAAAA
AAAAAAA
AAAAAAA
AAAAAAA
AAAAAAA
AAAAAAA
Rm+1
E(X, ) = !X" = X.
(4)
= (X
0, . . . , X
m ) as lying in some region Rm+1 , which we may
Think of the point X
thus corresponds to m + 1 physical
interpret as the macroscopic state space. A point X
measurements, presumably of extensive parameters as in Chapter I. To accord with the
notation from I.A, we will often write
E = !X 0 ".
(5)
= (X
0, X
1, . . . , X
m ),
The fundamental problem is this. Given the macroscopic measurements X
there are generally many, many microstate distributions satisfying (4). How do we determine the physically correct distribution?
2. Uncertainty
To answer the question just posed, let us rst consider the special case that
= {1 , . . . , N } is a nite set, F = 2 ,
(6)
and is counting measure.
157
(7)
pi = 1
i=1
1
N
, . . . , N1 is monotonically increasing.
C. Composition
For each N and each probability distribution (p1 , . . . , pN ), set
q1 = p1 + + pk1 , . . . , qj = pkj1 +1 + + pkj , . . .
where 1 = k0 k1 k2 kM = N . Then
S(p1 , . . . , pN ) = S(q1 , . . . , qM )
M
(8)
+
j=1 qj S(pkj1 +1 /qj , . . . , pkj /qj ).
Probabilistic interpretation
The monotonicity rule B says that if all the points in = {1 , . . . , N } have equal
probability N1 , then there is more uncertainty the bigger N is. The composition rule C
)
applies if we think of subdividing = M
i=1 j , where j = {kj1 +1 , . . . , kj }. Then
qj is the probability of the event j . Further (8) says that the uncertainty inherent in
the distribution {p1 , . . . , pN } on should equal the uncertainty of the induced probability
distribution {q1 , . . . , qM } on {1 , . . . , M } plus the average of the uncertainties within each
j . This last expression is the sum on j of qj , the probability of j , times S computed for
the induced probability distribution {pkj1 +1 /qj , . . . , pkj /qj } on j . If some qj = 0, we omit
this term.
158
(9)
pi log pi
i=1
1
1
A(N ) := S
N , . . . , N .
N terms
(10)
Take
pi =
1
N
(i = 1, . . . , N ),
qj =
nj
N
(j = 1, . . . , M ),
and
(11)
where {nj }M
j=1 are integers satisfying
M
(12)
nj = N.
j=1
1
1
,...,
N
N
= S(q1 , . . . , qM ) +
qj S
j=1
1
1
,...,
nj
nj
A(N ) = S(q1 , . . . , qM ) +
qj A(nj ).
j=1
1
S
M
1
L
, . . . , L1
.
Thus in particular
(14)
M bk N ak M bk+1 .
Then
bk log M ak log N (bk + 1) log M,
and so
(17)
ak
log M
log N
bk
1
1+
bk
log M
.
log N
Now since N A(N ) is increasing according to Axiom B, (16) and (14) imply
bk A(M ) ak A(N ) (bk + 1)A(M ).
Then, since A(M ) > 0,
(18)
A(N )
bk + 1
bk
ak
A(M )
ak
(M, N 2).
This identity implies A(N ) = K log N for some constant K, and necessarily K > 0 in light
of Axiom B. This proves (15).
3. Now drop the assumption that nj = L (j = 1, . . . , M ). We then deduce from (11)(15)
that
nj
S nN1 , . . . , nNM = A(N ) M
j=1 N A(nj )
nj
log
n
= K log N M
j
j=1 N
n
M nj
= K j=1 N log Nj
160
pi log pi
i=1
provided 0 pi 1 (i = 1, . . . , N ), N
i=1 pi = 1.
Return now to the general probabilistic model in 1 for a thermal system in equilibrium.
Motivated both by the above formula and our earlier study of Boltzmanns equation in V.A.2,
we hereafter dene
S() = k log d
(19)
to be the entropy of the microstate density , with respect to the reference measure . We interpret S() as measuring the uncertainty or disorder inherent in the probability distribution
d.
C. Maximizing uncertainty
We can now provide an answer to the question posed at the end of 1, namely how to
select the physically correct microstate distribution satisfying the macroscopic constraints
(20)
k
E(X k , ) = X
(k = 0, . . . , m)?
Here is the idea: Since all we really know about are these identities,
constraints (20).
Remark. This is both a principle of physics (that we should seek maximum entropy congurations (I.C.6)) and a principle of statistics (that we must employ unbiased estimators).
See Jaynes [J] for a discussion of the latter.
A = : [0, ) | is -measurable,
(21)
d = 1, E(X, ) = X ,
161
= (X
0, . . . , X
m ) is given. Recall that we write X : Rm+1 , X = (X 0 , . . . , X m ).
where X
Theorem. (i) Assume there exist Rm+1 and Z > 0 such that
(22)
eX
Z
belongs to A. Then
(23)
(ii) Any other maximizer of S() over A diers from only on a set of -measure zero.
Remark. Observe
eX d.
Z=
(24)
1
+ log x
(x > 0)
x
'
1
1 > 0 if x 1
(x) = 2 +
x
x < 0 if 0 < x 1.
(x) :=
satises
Hence
(x) (1) = 1 for all x > 0,
and so
(25)
(x) := x log x x + 1 0
(x > 0),
log + log on ,
log
+1=
0.
162
2. Now integrate (26) over $\Omega$: since $\log\sigma = -\lambda\cdot X - \log Z$,
$$\int_\Omega \rho\log\sigma\,d\mu = -\log Z - \lambda\cdot\int_\Omega X\rho\,d\mu = -\log Z - \lambda\cdot\bar{X},$$
and likewise
$$\int_\Omega \sigma\log\sigma\,d\mu = -\log Z - \lambda\cdot\bar{X} = \int_\Omega \rho\log\sigma\,d\mu. \tag{27}$$
Consequently (26), (27) imply
$$S(\sigma) \ge S(\rho). \tag{28}$$
We have a strict inequality here unless we have equality in (26) $\mu$-a.e., and this in turn holds only if $\rho = \sigma$ a.e. $\square$
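The theorem is easy to test numerically on a finite state space: among densities with a prescribed mean of an observable $X$, the exponential density $e^{-\lambda X}/Z$ has the strictly largest entropy. A sketch under illustrative assumptions (the state space, observable, and target mean below are arbitrary choices, and the perturbation $v$ is chosen in the null space of the constraints so that perturbed densities stay admissible):

```python
import math

omega = [0.0, 1.0, 2.0, 3.0]   # finite state space; reference measure = counting measure
target = 1.2                    # prescribed mean of the observable X(w) = w

def gibbs(lam):
    w = [math.exp(-lam * x) for x in omega]
    Z = sum(w)
    return [wi / Z for wi in w]

def mean(rho):
    return sum(r * x for r, x in zip(rho, omega))

def entropy(rho):
    return -sum(r * math.log(r) for r in rho if r > 0)

# solve mean(gibbs(lam)) = target by bisection (the mean is decreasing in lam)
lo, hi = -10.0, 10.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if mean(gibbs(mid)) > target:
        lo = mid
    else:
        hi = mid
sigma = gibbs(0.5 * (lo + hi))

# v sums to 0 and has zero X-mean, so rho = sigma + t*v stays in the constraint set
v = [1.0, -2.0, 1.0, 0.0]
assert abs(sum(v)) < 1e-12
assert abs(sum(vi * x for vi, x in zip(v, omega))) < 1e-12

for t in (0.01, -0.01, 0.05):
    rho = [s + t * vi for s, vi in zip(sigma, v)]
    assert min(rho) >= 0
    assert entropy(rho) < entropy(sigma)   # sigma maximizes entropy over the class
```

Any admissible perturbation strictly lowers the entropy, as part (ii) of the theorem asserts.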
D. Statistical mechanics
1. Microcanonical distribution
We consider in somewhat more detail first of all the case that there are no observables, in which case we deduce from Theorem 1 in B that the entropy $S(\cdot)$ is maximized by the constant microstate distribution
$$\sigma \equiv \frac{1}{Z}, \tag{1}$$
where
$$Z = \mu(\Omega). \tag{2}$$
Thus each microstate is equally probable. This is the microcanonical distribution (a.k.a.
microcanonical ensemble).
Example 1. Let us take $\Omega$ to be a finite set,
$$\Omega = \{\omega_1, \dots, \omega_N\},$$
and take the reference measure to be counting measure. Then (1), (2) imply $Z = N$, $\sigma(\omega_i) = \frac{1}{N}$ for $i = 1, \dots, N$. The entropy is
$$S(\sigma) = -k\sum_{i=1}^N \sigma(\omega_i)\log(\sigma(\omega_i)) = k\log N.$$
Hence
$$S = k\log W,$$
where $W = N = |\Omega|$.
The issue here is to understand the physical differences between taking the hard constraint $H = E$ versus the limit as $\delta \to 0$ of the softer constraints $E \le H \le E + \delta$.
The expository paper [vK-L] by van Kampen and Lodder discusses this and some related
issues.
2. Canonical distribution
Next we apply the results of B to the case that we have one observable $X^0 = H$, where
$$H : \Omega \to \mathbb{R}$$
is the Hamiltonian. As before we write $E$ for the macroscopic energy:
$$E = \langle H\rangle. \tag{5}$$
We now invoke Theorem 1 from B to deduce that the entropy $S(\cdot)$ is maximized by the microstate distribution
$$\sigma = \frac{e^{-\beta H}}{Z} \quad\text{for some } \beta\in\mathbb{R}, \tag{6}$$
where
$$Z = \int_\Omega e^{-\beta H}\,d\mu. \tag{7}$$
$$H = \frac{m|v|^2}{2} \qquad (v\in\mathbb{R}^3).$$
$H$ is the kinetic energy of a particle with mass $m > 0$, velocity $v$. The canonical distribution is then
$$\sigma = \frac{1}{Z}e^{-\frac{\beta m|v|^2}{2}}.$$
But this is essentially the Boltzmann distribution (43) from V.A.2, with the macroscopic velocity $v = 0$, the macroscopic particle density $n = 1$, and
$$\beta = \frac{1}{k\theta}.$$
We regard (8) as a formula for $Z$ as a function of $\beta$, and call $Z$ the partition function. Remember $H : \Omega \to \mathbb{R}$.
Definitions of thermodynamic quantities in terms of β and Z. We define

(i) the temperature $T$ by the formula
$$\beta = \frac{1}{kT}, \tag{9}$$

(ii) the energy
$$E = -\frac{\partial}{\partial\beta}(\log Z), \tag{10}$$

(iii) the entropy
$$S = k(\beta E + \log Z), \tag{11}$$

and

(iv) the free energy
$$F = -\frac{1}{\beta}\log Z. \tag{12}$$

Theorem. (i) We have
$$E = \langle H\rangle, \tag{13}$$
the expected value of $H$ being computed with respect to the canonical distribution $\sigma = \frac{1}{Z}e^{-\beta H}$.

(ii) Furthermore
$$S = S(\sigma) = -k\int_\Omega \sigma\log\sigma\,d\mu \tag{14}$$
and
$$\frac{\partial S}{\partial E} = \frac{1}{T}. \tag{15}$$

(iii) Finally,
$$F = E - ST \tag{16}$$
and
$$\frac{\partial F}{\partial T} = -S. \tag{17}$$

Proof. 1. In view of (10),
$$E = -\frac{\partial}{\partial\beta}(\log Z) = \frac{1}{Z}\int_\Omega He^{-\beta H}\,d\mu = \int_\Omega H\sigma\,d\mu = \langle H\rangle.$$
This is assertion (i) of the Theorem.
2. We compute as well
$$S(\sigma) = -k\int_\Omega \sigma\log\sigma\,d\mu = -\frac{k}{Z}\int_\Omega e^{-\beta H}(-\beta H - \log Z)\,d\mu = k\beta E + k\log Z.$$
From (11) we conclude that the identity (14) is valid. Furthermore
$$\frac{\partial S}{\partial E} = \frac{\partial S/\partial\beta}{\partial E/\partial\beta} = k\Big(E + \beta\frac{\partial E}{\partial\beta} + \frac{\partial}{\partial\beta}(\log Z)\Big)\frac{1}{\partial E/\partial\beta} = k\beta \ \text{ by (10)} \ = \frac{1}{T} \ \text{ by (9)}.$$
3. Since (12) says $-\beta F = \log Z$, formulas (9), (11) imply $F = E - TS$. This is (16). We next rewrite (12) to read:
$$\int_\Omega e^{-\beta H}\,d\mu = e^{-\beta F}, \tag{18}$$
and so
$$\int_\Omega e^{\beta(F - H)}\,d\mu = 1. \tag{19}$$
Now since $\beta = \frac{1}{kT}$,
$$\frac{\partial\beta}{\partial T} = -\frac{1}{kT^2}.$$
Differentiating (19) with respect to $T$, we find
$$\int_\Omega e^{\beta(F-H)}\Big(\beta\frac{\partial F}{\partial T} + \frac{\partial\beta}{\partial T}(F - H)\Big)\,d\mu = 0. \tag{20}$$
Therefore
$$\frac{1}{kT}\frac{\partial F}{\partial T} = \frac{1}{kT^2}(F - \langle H\rangle) = \frac{1}{kT^2}(F - E),$$
and consequently
$$F = E + T\frac{\partial F}{\partial T}.$$
Owing to formula (16), we deduce $S = -\frac{\partial F}{\partial T}$. This proves (17). $\square$
Remark. We can define as well the heat capacity
$$C_V = k\beta^2\frac{\partial^2}{\partial\beta^2}(\log Z). \tag{21}$$
Since $\frac{\partial\beta}{\partial T} = -k\beta^2$ and $E = -\frac{\partial}{\partial\beta}(\log Z)$, this says
$$C_V = \frac{\partial E}{\partial T}.$$
Differentiating (19) with respect to $\beta$, we have
$$\int_\Omega (E - H)e^{\beta(F-H)}\,d\mu = 0,$$
since $F + \beta\frac{\partial F}{\partial\beta} = F - T\frac{\partial F}{\partial T} = F + TS = E$ by (16), (17). Differentiating once more,
$$\langle(E - H)^2\rangle = -\frac{\partial E}{\partial\beta} = \frac{\partial^2}{\partial\beta^2}(\log Z),$$
and so
$$\langle(E - H)^2\rangle = kT^2C_V,$$
the average on the left computed with respect to the canonical density. This is a probabilistic interpretation of the heat capacity, recording the variance of the microstate energy $H$ from its macroscopic mean value $E = \langle H\rangle$.
Remark. If $\beta\mapsto\log Z(\beta)$ is uniformly convex, then
$$\frac{S}{k} = \min_\beta\,(\beta E + \log Z) \tag{25}$$
and
$$\log Z = \max_E\Big(\frac{S}{k} - \beta E\Big). \tag{26}$$
Indeed, the minimum in (25) is attained where $-\frac{\partial}{\partial\beta}(\log Z) = E$, in accordance with (10). Thus (25) follows from (11). Formula (26) is dual to (25). Observe that (25), (26) imply:
$$E\mapsto S \text{ is concave}, \qquad \beta\mapsto\log Z \text{ is convex}.$$
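The duality (25), (26) is likewise easy to verify on a finite spectrum: sweeping $\beta'$ generates pairs $(E(\beta'), S(\beta'))$, and maximizing $S/k - \beta E$ over them recovers $\log Z(\beta)$. A sketch, again with $k = 1$ and arbitrary energy levels:

```python
import math

levels = [0.0, 1.0, 2.5, 4.0]   # hypothetical energy levels; units with k = 1

def logZ(beta):
    return math.log(sum(math.exp(-beta * h) for h in levels))

def E_and_S(beta):
    """Canonical energy E and entropy S = beta*E + log Z at inverse temperature beta."""
    Z = sum(math.exp(-beta * h) for h in levels)
    p = [math.exp(-beta * h) / Z for h in levels]
    E = sum(pi * h for pi, h in zip(p, levels))
    S = -sum(pi * math.log(pi) for pi in p)
    return E, S

beta = 0.8
pairs = [E_and_S(-5.0 + 0.01 * i) for i in range(1001)]   # sweep beta' over [-5, 5]
dual = max(S - beta * E for E, S in pairs)

print(dual, logZ(beta))   # the maximum of S/k - beta*E reproduces log Z(beta)
```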
$$p(0,\alpha,\beta) = \delta_\alpha(\beta) = \begin{cases}1 & \text{if } \alpha = \beta\\ 0 & \text{otherwise,}\end{cases} \tag{1}$$
$$0 \le p(t,\alpha,\beta) \le 1 \qquad (\alpha,\beta\in\Sigma), \tag{2}$$
$$\sum_{\beta\in\Sigma} p(t,\alpha,\beta) = 1, \tag{3}$$
and
$$p(t+s,\alpha,\beta) = \sum_{\gamma\in\Sigma} p(t,\alpha,\gamma)\,p(s,\gamma,\beta). \tag{4}$$
We define the transition rates by
$$c(\alpha,\beta) = \lim_{t\to0}\frac{p(t,\alpha,\beta) - p(0,\alpha,\beta)}{t}, \tag{5}$$
and set
$$d(\alpha,\beta) = \begin{cases} c(\alpha,\beta) & \text{if } \alpha\ne\beta\\ 0 & \text{if } \alpha = \beta.\end{cases}$$
Note
$$\sum_{\beta\in\Sigma} c(\alpha,\beta) = 0.$$
Since $p(t,\alpha,\beta) \ge 0 = p(0,\alpha,\beta)$ for $\alpha\ne\beta$, we have
$$d(\alpha,\beta) \ge 0 \qquad (\alpha\ne\beta).$$
the expected value of $f(X(t))$, given that $X(0) = \alpha$. Owing to (4), $\{X(t)\}_{t\ge0}$ is a continuous time Markov process, with generator $L$.

so that
$$[S^*(t)\mu](\beta) = \rho(\beta,t) \qquad (\beta\in\Sigma,\ t\ge0). \tag{12}$$

Lemma. We have
$$\frac{\partial\rho}{\partial t} = L^*\rho \quad\text{on } \Sigma\times[0,\infty). \tag{13}$$

Proof. For each $f : \Sigma\to\mathbb{R}$,
$$\int_\Sigma f\,d[S^*(t)\mu] = \int_\Sigma S(t)f\,d\mu.$$
Thus
$$\frac{d}{dt}\int_\Sigma f\,d[S^*(t)\mu] = \int_\Sigma \frac{\partial}{\partial t}S(t)f\,d\mu = \int_\Sigma S(t)Lf\,d\mu = \int_\Sigma Lf\,d[S^*(t)\mu] = \sum_{\beta\in\Sigma} Lf(\beta)\,\rho(\beta,t) = \sum_{\beta\in\Sigma} f(\beta)\,L^*\rho(\beta,t),$$
$L^*$ denoting the adjoint of $L$. $\square$
2. Entropy production

Definition. We say the probability measure $\pi$ on $\Sigma$ is invariant provided
$$S^*(t)\pi = \pi \quad\text{for all } t\ge0, \tag{14}$$
or, equivalently,
$$\int_\Sigma S(t)f\,d\pi = \int_\Sigma f\,d\pi \quad\text{for all } f : \Sigma\to\mathbb{R}. \tag{16}$$

Definition. The relative entropy of $\mu$ with respect to $\pi$ is
$$H(\mu,\pi) := \int_\Sigma \sigma\log\sigma\,d\pi, \tag{17}$$
where $\sigma = d\mu/d\pi$.

Remark. Since $\Sigma$ is finite, (17) says
$$H(\mu,\pi) = \sum_{\alpha\in\Sigma}\log\Big(\frac{\mu(\alpha)}{\pi(\alpha)}\Big)\mu(\alpha). \tag{18}$$
Lemma. Let $\pi$ be an invariant probability measure, with $\pi(\beta) > 0$ for all $\beta\in\Sigma$. Take $\mu$ to be any probability measure. Then
$$\frac{d}{dt}H(S^*(t)\mu,\pi) \le 0. \tag{19}$$

Proof. 1. Write
$$\phi(x) := x\log x - x + 1 \qquad (x\ge0).$$
As noted in VII, $\phi$ is convex, $\phi \ge 0$ for $x\ge0$.

2. Write $\rho = \rho(\cdot,t)$ for the density of $S^*(t)\mu$ and $h := \rho/\pi$. Then
$$\frac{d}{dt}H(S^*(t)\mu,\pi) = \frac{d}{dt}\sum_\alpha \rho\log\frac{\rho}{\pi} = \sum_\alpha \frac{\partial\rho}{\partial t}\log\frac{\rho}{\pi} + \sum_\alpha\frac{\partial\rho}{\partial t} = \sum_\alpha (L^*\rho)\log\frac{\rho}{\pi} + \sum_\alpha L^*\rho. \tag{20}$$
Now
$$\sum_\alpha L^*\rho(\alpha,t) = 0,$$
and, writing out $L^*$ in terms of the rates $d(\cdot,\cdot)$,
$$\sum_\alpha (L^*\rho)\log\frac{\rho}{\pi} = \sum_{\alpha,\beta} d(\beta,\alpha)\,\rho(\beta,t)\Big(\log\frac{\rho(\alpha,t)}{\pi(\alpha)} - \log\frac{\rho(\beta,t)}{\pi(\beta)}\Big).$$
Since $\pi$ is invariant,
$$\sum_{\alpha,\beta} d(\beta,\alpha)\Big(\frac{\pi(\beta)\rho(\alpha,t)}{\pi(\alpha)} - \rho(\beta,t)\Big) = 0,$$
and consequently
$$\frac{d}{dt}H(S^*(t)\mu,\pi) = -\sum_{\alpha,\beta} d(\beta,\alpha)\,\frac{\pi(\beta)\rho(\alpha,t)}{\pi(\alpha)}\,\phi\Big(\frac{\rho(\beta,t)\pi(\alpha)}{\pi(\beta)\rho(\alpha,t)}\Big) \le 0,$$
since $\phi\ge0$. $\square$

Remark. We call
$$G(t) = \sum_{\alpha,\beta} d(\beta,\alpha)\,\frac{\pi(\beta)\rho(\alpha,t)}{\pi(\alpha)}\,\phi\Big(\frac{\rho(\beta,t)\pi(\alpha)}{\pi(\beta)\rho(\alpha,t)}\Big) \ge 0 \tag{21}$$
the rate of entropy production.
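The monotonicity (19) can be watched numerically: discretize the forward equation $\partial\rho/\partial t = L^*\rho$ by forward Euler and track $H(\rho(t),\pi)$. A minimal sketch, with arbitrary illustrative jump rates $d(\alpha,\beta)$ on a 3-state chain (the invariant $\pi$ is found by simply running the flow to stationarity):

```python
import math

# arbitrary jump rates d(alpha, beta) >= 0 for alpha != beta on a 3-state chain
d = [[0.0, 1.0, 0.5],
     [0.2, 0.0, 0.7],
     [0.9, 0.3, 0.0]]
n = 3

def Lstar(rho):
    """Adjoint generator: (L* rho)(a) = sum_b d(b,a) rho(b) - rho(a) sum_b d(a,b)."""
    return [sum(d[b][a] * rho[b] for b in range(n))
            - rho[a] * sum(d[a][b] for b in range(n)) for a in range(n)]

def step(rho, dt):
    """One forward Euler step of the master equation rho' = L* rho."""
    return [r + dt * dr for r, dr in zip(rho, Lstar(rho))]

def H(rho, pi):
    """Relative entropy H(rho, pi) = sum rho log(rho / pi)."""
    return sum(r * math.log(r / p) for r, p in zip(rho, pi) if r > 0)

dt = 0.01
pi = [1.0 / n] * n
for _ in range(20000):      # run to stationarity: pi approximates the invariant measure
    pi = step(pi, dt)

rho = [0.9, 0.05, 0.05]     # arbitrary initial distribution
hs = []
for _ in range(500):
    hs.append(H(rho, pi))
    rho = step(rho, dt)
# hs is nonincreasing, as the entropy production lemma predicts
```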
3. Convergence to equilibrium

Definition. The Markov chain is called irreducible if
$$p(t,\alpha,\beta) > 0 \qquad (t > 0,\ \alpha,\beta\in\Sigma). \tag{22}$$

Remark. It is straightforward to show that the Markov chain is irreducible if and only if for each pair $\alpha,\beta\in\Sigma$, $\alpha\ne\beta$, there exists a path
$$\alpha = \alpha_0, \alpha_1, \dots, \alpha_m = \beta$$
with
$$d(\alpha_i,\alpha_{i+1}) > 0 \qquad (i = 0,\dots,m-1).$$
$$\lim_{t\to\infty} S^*(t)\mu = \pi.$$
Define
$$\tilde\pi(\beta) := \lim_{t_k\to\infty}\rho(t_k,\beta), \tag{24}$$
the sequence $t_k\to\infty$ selected so that the limit (24) exists for each $\beta$. Clearly
$$0 \le \tilde\pi(\beta) \le 1, \qquad \sum_{\beta\in\Sigma}\tilde\pi(\beta) = 1. \tag{25}$$
$$\int_\Sigma Lf\,d\tilde\pi = \sum_{\beta,\gamma\in\Sigma} c(\beta,\gamma)f(\gamma)\tilde\pi(\beta) = 0. \tag{27}$$
Fix $\beta\in\Sigma$ and take
$$f(\gamma) = \begin{cases}1 & \text{if } \gamma = \beta\\ 0 & \text{if } \gamma\ne\beta.\end{cases}$$
Then
$$p(t,\alpha,\beta) = [S^*(t)\delta_\alpha](\beta). \tag{28}$$
Moreover $t\mapsto H(p(t,\alpha,\cdot),\pi)$ is nonincreasing in $t$.
B. Large deviations
1. Thermodynamic limits
We turn next to the theory of large deviations, which will provide links between certain
limit problems in probability, statistical mechanics, and various linear and nonlinear PDE.
Physical motivation. To motivate the central issues, let us rst recall from Chapter VII
these formulas for the canonical distribution:
$$F = -\frac{1}{\beta}\log Z, \qquad Z = \int_\Omega e^{-\beta H}\,d\mu.$$
Now it is most often the case in statistical mechanics that we are interested in a sequence
of free energies and partition functions:
$$F_N = -\frac{1}{\beta}\log Z_N, \qquad Z_N = \int_{\Omega_N} e^{-\beta H_N}\,d\mu_N \qquad (N = 1, 2, \dots). \tag{1}$$
Typically (1) represents the free energy and partition function for a system of $N$ interacting particles (described by microstates in the system $(\Omega_N, \mathcal{F}_N, \mu_N)$, with Hamiltonian $H_N : \Omega_N\to\mathbb{R}$). We often wish to compute the limit as $N\to\infty$ of the free energy per particle:
$$f(\beta) = \lim_{N\to\infty}\frac{1}{N}F_N = -\frac{1}{\beta}\lim_{N\to\infty}\frac{1}{N}\log Z_N \tag{2}$$
for $\beta\in\mathbb{R}^1$. Setting
$$\varepsilon = \frac{1}{N},$$
we recast (2) as
$$f(\beta) = -\frac{1}{\beta}\lim_{\varepsilon\to0}\varepsilon\log\int_{\mathbb{R}} e^{-\beta\lambda/\varepsilon}\,dP_\varepsilon, \tag{4}$$
$P_\varepsilon$ denoting the distribution of the energy per particle $H_N/N$. Suppose now, heuristically, that
$$dP_\varepsilon \approx e^{-\frac{I(\lambda)}{\varepsilon}}\,dQ \quad\text{as } \varepsilon\to0 \tag{5}$$
for some rate function $I(\cdot)$ and some fixed reference measure $Q$. Then Laplace's method applied to (4) suggests
$$f(\beta) = \frac{1}{\beta}\inf_{\lambda\in\mathbb{R}}\,(\beta\lambda + I(\lambda)). \tag{6}$$
What is the physical meaning of this formula? First note that the energy per particle is
$$\frac{1}{N}E_N = \frac{1}{N}\langle H_N\rangle = \frac{1}{N}\frac{\int_{\Omega_N} H_N e^{-\beta H_N}\,d\mu_N}{Z_N} = \frac{\int_{\mathbb{R}} \lambda\,e^{-\beta\lambda/\varepsilon}\,dP_\varepsilon}{\int_{\mathbb{R}} e^{-\beta\lambda/\varepsilon}\,dP_\varepsilon} \approx \frac{\int_{\mathbb{R}} \lambda\,e^{-\frac{\beta\lambda + I(\lambda)}{\varepsilon}}\,dQ}{\int_{\mathbb{R}} e^{-\frac{\beta\lambda + I(\lambda)}{\varepsilon}}\,dQ}.$$
As $\varepsilon\to0$ these integrals concentrate where $\beta\lambda + I(\lambda)$ is least; hence
$$\frac{E_N}{N} \to \bar e, \tag{7}$$
where
$$\beta\bar e + I(\bar e) = \inf_{\lambda\in\mathbb{R}}\,(\beta\lambda + I(\lambda)), \tag{8}$$
and remember
$$F = -\frac{1}{\beta}\log Z.$$
Thus
$$-\beta F = \sup_E\Big(\frac{S}{k} - \beta E\Big), \tag{9}$$
$S$ denoting the entropy. Comparing (6) with (9), applied per particle, we deduce then
$$I(\lambda) = -\frac{S(\lambda)}{k}, \tag{12}$$
so that the heuristic (5) reads
$$dP_\varepsilon \approx e^{\frac{S}{k\varepsilon}}\,dQ. \tag{13}$$
This says that entropy $S$ controls the asymptotics of $\{P_\varepsilon\}_{0<\varepsilon\le1}$ in the thermodynamic limit and that the most likely states for small $\varepsilon > 0$ are those which maximize the entropy.
2. Basic theory
We now follow Donsker-Varadhan (see e.g. Varadhan [V], Dembo-Zeitouni [D-Z], etc.)
and provide a general probabilistic framework within which to understand and generalize
the foregoing heuristics.
a. Rate functions
Notation. Hereafter $\Sigma$ denotes a separable, complete, metric space, and $\{P_\varepsilon\}_{0<\varepsilon\le1}$ is a family of Borel probability measures on $\Sigma$.
Definition. We say that $\{P_\varepsilon\}_{0<\varepsilon\le1}$ satisfies the large deviation principle with rate function $I$ provided

(i) $I : \Sigma\to[0,\infty]$ is lower semicontinuous, $I\not\equiv+\infty$,

(ii) for each $l\in\mathbb{R}$, the set $\{x\in\Sigma \mid I(x)\le l\}$ is compact, \hfill (14)

(iii) for each closed set $C\subseteq\Sigma$,
$$\limsup_{\varepsilon\to0}\varepsilon\log P_\varepsilon(C) \le -\inf_C I, \tag{15}$$

and

(iv) for each open set $U\subseteq\Sigma$,
$$\liminf_{\varepsilon\to0}\varepsilon\log P_\varepsilon(U) \ge -\inf_U I. \tag{16}$$

If $E\subseteq\Sigma$ satisfies $\inf_{E^\circ} I = \inf_{\bar E} I$, then
$$\lim_{\varepsilon\to0}\varepsilon\log P_\varepsilon(E) = -\inf_E I.$$
This gives a precise meaning to the heuristic (5) from the previous section (without the
unnecessary introduction of the reference measure).
(ii) The rate function I is called the entropy function in Ellis [EL]. This book contains
clear explanations of the connections with statistical mechanics.
(iii) It is sometimes convenient to consider instead of $\varepsilon\to0$ an index $n\to\infty$. Thus we say $\{P_n\}_{n=1}^\infty$ satisfies the large deviation principle with rate function $I$ if
$$\limsup_{n\to\infty}\frac{1}{n}\log P_n(C) \le -\inf_C I \quad\text{for each closed set } C\subseteq\Sigma$$
and
$$\liminf_{n\to\infty}\frac{1}{n}\log P_n(U) \ge -\inf_U I \quad\text{for each open set } U\subseteq\Sigma. \tag{17}$$
We now make clearer the connections between large deviation theory and the heuristics
in 1.
Theorem 1. Let $\{P_\varepsilon\}_{0<\varepsilon\le1}$ satisfy the large deviation principle with rate function $I$. Let
$$g : \Sigma\to\mathbb{R}$$
be bounded, continuous. Then
$$\lim_{\varepsilon\to0}\varepsilon\log\int_\Sigma e^{g/\varepsilon}\,dP_\varepsilon = \sup_\Sigma\,(g - I). \tag{18}$$
Proof. 1. Fix $\delta > 0$. Since $g$ is bounded and continuous, we can cover $\Sigma$ with finitely many closed sets,
$$\Sigma = \bigcup_{i=1}^N C_i,$$
on each of which the oscillation of $g$ is at most $\delta$. Then
$$\int_\Sigma e^{g/\varepsilon}\,dP_\varepsilon \le \sum_{i=1}^N \int_{C_i} e^{g/\varepsilon}\,dP_\varepsilon \le \sum_{i=1}^N e^{\frac{g_i+\delta}{\varepsilon}}P_\varepsilon(C_i),$$
where
$$g_i = \inf_{C_i} g \qquad (i = 1,\dots,N).$$
Thus
$$\varepsilon\log\int_\Sigma e^{g/\varepsilon}\,dP_\varepsilon \le \varepsilon\log\Big(N\max_{1\le i\le N} e^{\frac{g_i+\delta}{\varepsilon}}P_\varepsilon(C_i)\Big) = \varepsilon\log N + \max_{1\le i\le N}\big(g_i + \delta + \varepsilon\log P_\varepsilon(C_i)\big).$$
In view of (15), sending $\varepsilon\to0$ and then $\delta\to0$, we consequently obtain
$$\limsup_{\varepsilon\to0}\varepsilon\log\int_\Sigma e^{g/\varepsilon}\,dP_\varepsilon \le \sup_\Sigma\,(g - I). \tag{19}$$
2. Fix any point $x\in\Sigma$ and $\delta > 0$, and take $U$ to be an open neighborhood of $x$ such that
$$g \ge g(x) - \frac{\delta}{2} \quad\text{on } U. \tag{20}$$
Then
$$\int_\Sigma e^{g/\varepsilon}\,dP_\varepsilon \ge \int_U e^{g/\varepsilon}\,dP_\varepsilon \ge e^{\frac{g(x)-\delta/2}{\varepsilon}}P_\varepsilon(U), \tag{21}$$
and so, by (21) and (16),
$$\liminf_{\varepsilon\to0}\varepsilon\log\int_\Sigma e^{g/\varepsilon}\,dP_\varepsilon \ge g(x) - \frac{\delta}{2} - \inf_U I \ge g(x) - I(x) - \frac{\delta}{2}.$$
Hence
$$\liminf_{\varepsilon\to0}\varepsilon\log\int_\Sigma e^{g/\varepsilon}\,dP_\varepsilon \ge \sup_\Sigma\,(g - I).$$
This and (19) prove (18). $\square$
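Theorem 1 is a rigorous form of Laplace's method. For Gaussian measures $P_\varepsilon = N(0,\varepsilon)$, whose rate function is $I(x) = x^2/2$, the limit (18) can be checked by direct numerical integration. A sketch (the bounded Lipschitz function $g$ below is an arbitrary choice, for which $\sup(g - I) = -1/2$ is attained at $x = 1$):

```python
import math

def laplace_value(eps, g, a=-6.0, b=6.0, m=48001):
    """eps * log of the integral of e^{g/eps} dP_eps, P_eps = N(0, eps), trapezoid rule."""
    h = (b - a) / (m - 1)
    total = 0.0
    for i in range(m):
        x = a + i * h
        w = 1.0 if 0 < i < m - 1 else 0.5
        # Gaussian density of N(0, eps) times e^{g(x)/eps}
        total += w * math.exp((g(x) - x * x / 2.0) / eps) / math.sqrt(2 * math.pi * eps)
    return eps * math.log(total * h)

def g(x):
    return -min(abs(x - 1.0), 2.0)   # bounded, Lipschitz

for eps in (0.05, 0.02, 0.01):
    print(laplace_value(eps, g))      # tends to sup(g - I) = -0.5 as eps -> 0
```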
Theorem 2. Assume the limit (18) holds for each bounded, Lipschitz function $g : \Sigma\to\mathbb{R}$, where $I : \Sigma\to[0,\infty]$ is lower semicontinuous, $I\not\equiv+\infty$. Suppose also the sets $\{0\le I\le l\}$ are compact for each $l\in\mathbb{R}$. Then $\{P_\varepsilon\}_{0<\varepsilon\le1}$ satisfies the large deviation principle with rate function $I$.
Proof. We must prove (15), (16).

1. Let $C$ be closed, and for $m = 1,\dots$ set
$$g_m := -m\,\min(\mathrm{dist}(\cdot,C), 1), \tag{22}$$
a bounded, Lipschitz function with $g_m = 0$ on $C$. Thus
$$\limsup_{\varepsilon\to0}\varepsilon\log P_\varepsilon(C) \le \lim_{\varepsilon\to0}\varepsilon\log\int_\Sigma e^{g_m/\varepsilon}\,dP_\varepsilon = \sup_\Sigma\,(g_m - I).$$
Writing $C_k := \{x \mid \mathrm{dist}(x,C)\le\frac{1}{k}\}$, so that $C = \bigcap_{k=1}^\infty C_k$, one checks using the compactness of the sets $\{I\le l\}$ that $\sup_\Sigma(g_m - I)\to -\inf_C I$ as $m\to\infty$. This proves (15).

2. Now let $U$ be open, and fix a closed set $C\subset U$ lying at positive distance from $\Sigma\setminus U$. With $g_m$ as above, rescaled so that $g_m = -m$ on $\Sigma\setminus U$,
$$\varepsilon\log\int_\Sigma e^{g_m/\varepsilon}\,dP_\varepsilon \le \varepsilon\log\big(P_\varepsilon(U) + e^{-m/\varepsilon}P_\varepsilon(\Sigma\setminus U)\big) \le \varepsilon\log\big(P_\varepsilon(U) + e^{-m/\varepsilon}\big).$$
Thus
$$\sup_C\,(-I) \le \sup_\Sigma\,(g_m - I) = \lim_{\varepsilon\to0}\varepsilon\log\int_\Sigma e^{g_m/\varepsilon}\,dP_\varepsilon \le \liminf_{\varepsilon\to0}\varepsilon\log\big(P_\varepsilon(U) + e^{-m/\varepsilon}\big). \tag{23}$$
Sending $m\to\infty$ and then letting $C$ increase to fill $U$, we obtain (16). $\square$
C. Cramér's Theorem

In this section we illustrate the use of PDE methods by presenting an unusual proof, due to R. Jensen, of Cramér's Theorem, characterizing large deviations for sums of i.i.d. random variables.
Assume
$$Y_k : \Omega\to\mathbb{R}^m \qquad (k = 1,\dots) \tag{1}$$
are independent, identically distributed random variables. We write $Y = Y_1$ and assume as well that
$$\text{the exponential generating function } Z = E(e^{p\cdot Y}) \text{ is finite for each } p\in\mathbb{R}^m, \tag{2}$$
where $E(\cdot)$ denotes expected value. Thus
$$Z = \int_\Omega e^{p\cdot Y}\,dP.$$
Write
$$S_n := \frac{Y_1 + \cdots + Y_n}{n}$$
and
$$F := \log Z,$$
that is,
$$F(p) = \log E(e^{p\cdot Y}) = \log\Big(\int_\Omega e^{p\cdot Y}\,dP\Big). \tag{5}$$
Let $L$ denote the Legendre transform of $F$:
$$L(q) = \sup_p\,(p\cdot q - F(p)) \qquad (q\in\mathbb{R}^m).$$
Writing $P_n$ for the distribution of $S_n$, we deduce that for suitable sets $E\subseteq\mathbb{R}^m$
$$P_n(E) = e^{-n(\inf_E L + o(1))}.$$
So if $\bar y := E(Y)\notin E$, then $P_n(E)\to0$ exponentially fast as $n\to\infty$.
Proof. 1. Write $Y = (Y^1,\dots,Y^m)$. Then for $1\le k,l\le m$:
$$\frac{\partial F}{\partial p_k} = \frac{E(Y^ke^{p\cdot Y})}{E(e^{p\cdot Y})},$$
$$\frac{\partial^2 F}{\partial p_k\partial p_l} = \frac{E(Y^kY^le^{p\cdot Y})}{E(e^{p\cdot Y})} - \frac{E(Y^ke^{p\cdot Y})\,E(Y^le^{p\cdot Y})}{E(e^{p\cdot Y})^2}.$$
Thus if $\xi\in\mathbb{R}^m$,
$$\sum_{k,l=1}^m F_{p_kp_l}\xi_k\xi_l = \frac{E((Y\cdot\xi)^2e^{p\cdot Y})E(e^{p\cdot Y}) - E((Y\cdot\xi)e^{p\cdot Y})^2}{E(e^{p\cdot Y})^2} \ge 0,$$
since
$$E((Y\cdot\xi)e^{p\cdot Y}) \le E((Y\cdot\xi)^2e^{p\cdot Y})^{1/2}\,E(e^{p\cdot Y})^{1/2}.$$
Hence
$$F \text{ is convex.} \tag{7}$$
Clearly also
$$F(0) = 0. \tag{8}$$
In addition,
$$\lim_{|q|\to\infty}\frac{L(q)}{|q|} = +\infty. \tag{10}$$
The right hand side of (11) will appear when we invoke the Hopf-Lax formula for the solution of (12).
To carry out this program, we fix any point $x\in\mathbb{R}^m$ and then write
$$t_k = k/n \qquad (k = 0,\dots).$$
We define
$$w_n(x,t_k) := E\Big(h_n\Big(\frac{Y_1+\cdots+Y_k}{n} + x\Big)\Big), \tag{13}$$
where
$$h_n := e^{ng}. \tag{14}$$
3. We first claim:
$$w_n(x,t_{k+1}) = E\Big(w_n\Big(x + \frac{Y_{k+1}}{n},\,t_k\Big)\Big) \qquad (k = 0,\dots). \tag{15}$$
Next set
$$u_n(x,t_k) := \frac{1}{n}\log w_n(x,t_k)$$
and claim
$$\|u_n\|_{L^\infty} \le \|g\|_{L^\infty}, \qquad \|Du_n\|_{L^\infty} \le \|Dg\|_{L^\infty}, \tag{17}$$
$D$ as usual denoting the gradient in the spatial variable $x$. Let us check (17) by first noting from (13), (14) that
$$\|w_n\|_{L^\infty} \le \|h_n\|_{L^\infty} = e^{n\|g\|_{L^\infty}}.$$
The first inequality in (17) follows. Now fix a time $t_k = \frac{k}{n}$. Then for a.e. $x\in\mathbb{R}^m$ we may compute from (13), (14) that
$$Dw_n(x,t_k) = E\Big(Dh_n\Big(\frac{Y_1+\cdots+Y_k}{n} + x\Big)\Big) = nE\Big(Dg\,h_n\Big(\frac{Y_1+\cdots+Y_k}{n} + x\Big)\Big).$$
Consequently
$$|Dw_n| \le n\|Dg\|_{L^\infty}E\Big(h_n\Big(\frac{Y_1+\cdots+Y_k}{n} + x\Big)\Big) = n\|Dg\|_{L^\infty}w_n. \tag{18}$$
Similar computations in time show
$$|u_n(x,t_k) - u_n(x,t_l)| \le C\Big(|t_k - t_l| + \frac{1}{n}\Big) \qquad (k,l\ge1). \tag{19}$$
$$u_t - F(Du) = 0 \quad\text{in } \mathbb{R}^m\times(0,\infty). \tag{22}$$
To verify this, we recall the relevant definitions from Chapter VI, take any $v\in C^2(\mathbb{R}^m\times(0,\infty))$ and suppose
$$u - v \text{ has a strict local maximum at a point } (x_0,t_0).$$
We must prove:
$$v_t(x_0,t_0) - F(Dv(x_0,t_0)) \le 0. \tag{23}$$
We may also assume, upon redefining $v$ outside some neighborhood of $(x_0,t_0)$ if necessary, that $u(x_0,t_0) = v(x_0,t_0)$,
$$\sup\,(|v| + |Dv| + |D^2v|) < \infty, \tag{24}$$
and $v > \sup(u_n)$ except in some region near $(x_0,t_0)$. In view of (21) we can find for $n = n_r$ points $(x_n,t_{k_n})$, $t_{k_n} = \frac{k_n}{n}$, such that
$$u_n(x_n,t_{k_n}) - v(x_n,t_{k_n}) = \max_{x\in\mathbb{R}^m,\,k=0,\dots}\,(u_n - v)(x,t_k) \tag{25}$$
and
$$(x_n,t_{k_n}) \to (x_0,t_0) \quad\text{as } n = n_r\to\infty. \tag{26}$$
Then for $n = n_r$ write
$$v\Big(x_n + \frac{Y}{n},\,t_{k_n-1}\Big) = v(x_n,t_{k_n-1}) + Dv(x_n,t_{k_n-1})\cdot\frac{Y}{n} + \eta_n, \tag{27}$$
where
$$\eta_n := v\Big(x_n + \frac{Y}{n},\,t_{k_n-1}\Big) - v(x_n,t_{k_n-1}) - Dv(x_n,t_{k_n-1})\cdot\frac{Y}{n}. \tag{28}$$
Thus
$$e^{nv(x_n,t_{k_n})} \le e^{nv(x_n,t_{k_n-1})}\,E\big(e^{Dv(x_n,t_{k_n-1})\cdot Y + n\eta_n}\big),$$
and hence
$$\frac{v(x_n,t_{k_n}) - v(x_n,t_{k_n-1})}{1/n} \le \log E\big(e^{Dv(x_n,t_{k_n-1})\cdot Y + n\eta_n}\big). \tag{29}$$
For each fixed value of $Y$, $n\eta_n\to0$ as $n = n_r\to\infty$, and furthermore
$$e^{Dv\cdot Y + n\eta_n} \le e^{C|Y|}.$$
Our assumption (2) implies also that $E(e^{C|Y|}) < \infty$. We consequently may invoke the Dominated Convergence Theorem and pass to limits as $n = n_r\to\infty$:
$$v_t(x_0,t_0) \le \log E\big(e^{Dv(x_0,t_0)\cdot Y}\big) = F(Dv(x_0,t_0)).$$
This is (23), and the reverse inequality likewise holds should $u - v$ have a strict local minimum at a point $(x_0,t_0)$. We have therefore proved $u$ is a viscosity solution of (22). Since $u = g$ on $\mathbb{R}^m\times\{t=0\}$, we conclude that $u$ is the unique viscosity solution of the initial value problem (12). In particular $u_n\to u$.
6. We next transform (12) into a different form, by noting that $\hat u = -u$ is the unique viscosity solution of
$$\begin{cases}\hat u_t + \hat F(D\hat u) = 0 & \text{in } \mathbb{R}^m\times(0,\infty)\\ \hat u = \hat g & \text{on } \mathbb{R}^m\times\{t=0\}\end{cases} \tag{30}$$
for
$$\hat g = -g, \qquad \hat F(p) = F(-p). \tag{31}$$
Indeed if $\hat u - \hat v$ has a local maximum at $(x_0,t_0)$, then $u - v$ has a local minimum there, where $v = -\hat v$. Thus
$$\hat v_t(x_0,t_0) + \hat F(D\hat v(x_0,t_0)) \le 0, \tag{32}$$
Consequently the Hopf-Lax formula gives
$$\hat u(x,t) = \inf_y\Big(t\hat L\Big(\frac{x-y}{t}\Big) + \hat g(y)\Big), \tag{33}$$
where $\hat L$ is the Legendre transform of the convex function $\hat F$. But then
$$\hat L(q) = \sup_p\,(p\cdot q - \hat F(p)) = \sup_p\,(-p\cdot q - F(p)) = L(-q).$$
Therefore
$$u(x,t) = -\hat u(x,t) = \sup_y\Big(-t\hat L\Big(\frac{x-y}{t}\Big) - \hat g(y)\Big) = \sup_y\Big(g(y) - tL\Big(\frac{y-x}{t}\Big)\Big).$$
In particular
$$u(0,1) = \sup_y\,(g(y) - L(y)). \tag{34}$$
But
$$u_n(0,1) = \frac{1}{n}\log w_n(0,t_n) = \frac{1}{n}\log E\Big(h_n\Big(\frac{Y_1+\cdots+Y_n}{n}\Big)\Big) = \frac{1}{n}\log E\big(e^{ng(S_n)}\big) = \frac{1}{n}\log\int_{\mathbb{R}^m} e^{ng}\,dP_n. \tag{35}$$
As $u_n(0,1)\to u(0,1)$, (34) and (35) confirm the limit (11).
The second theorem in B thus implies that $I = L$ is the rate function for $\{P_n\}_{n=1}^\infty$.
Remark. This proof illustrates the vague belief that rate functions, interpreted as functions of appropriate parameters, are viscosity solutions of Hamilton-Jacobi type nonlinear PDE.
The general validity of this principle is unclear, but there are certainly many instances in the research literature. See for instance the next section of these notes, and look also in the book by Freidlin-Wentzell [F-W].
If we accept from B.1 the identification of rate functions and entropy (up to a minus sign and Boltzmann's constant), then the foregoing provides us with a quite new interplay between entropy ideas and nonlinear PDE.
$$\begin{cases}\dot X(t) = b(X(t)) + B(X(t))\,\xi(t) & (t > 0)\\ X(0) = X_0,\end{cases} \tag{3}$$
where $\dot{\ } = \frac{d}{dt}$ and
$$\xi(t) = \frac{dW(t)}{dt} = m\text{-dimensional white noise} \qquad (t\ge0). \tag{4}$$
If $X(\cdot)$, $\hat X(\cdot)$ are two solutions, then
$$P\big(X(t) = \hat X(t) \text{ for all } 0\le t\le T\big) = 1$$
for each $T > 0$. Furthermore, the sample paths $t\mapsto X(t)$ are continuous, with probability one.
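Sample paths of (3) are routinely approximated by the Euler-Maruyama scheme $X_{k+1} = X_k + b(X_k)\,\Delta t + B(X_k)\,\Delta W_k$, with $\Delta W_k$ Gaussian of variance $\Delta t$. A minimal scalar sketch (the drift and noise coefficients below are arbitrary illustrative choices):

```python
import math
import random

def euler_maruyama(b, B, x0, T=1.0, steps=1000, seed=0):
    """Simulate dX = b(X) dt + B(X) dW on [0, T]; returns the sampled path."""
    rng = random.Random(seed)
    dt = T / steps
    x, path = x0, [x0]
    for _ in range(steps):
        dW = rng.gauss(0.0, math.sqrt(dt))
        x = x + b(x) * dt + B(x) * dW
        path.append(x)
    return path

# with B = 0 the scheme reduces to forward Euler for the ODE x' = b(x)
det = euler_maruyama(lambda x: -x, lambda x: 0.0, 1.0)
print(det[-1])   # approximately e^{-1}

# a noisy path of the Ornstein-Uhlenbeck type equation dX = -X dt + dW
noisy = euler_maruyama(lambda x: -x, lambda x: 1.0, 1.0)
```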
2. Itô's formula, elliptic PDE

Solutions of (1) are connected to solutions of certain linear elliptic PDE of second order. Here $A = ((a_{ij}))$ satisfies
$$\sum_{i,j=1}^n a_{ij}\xi_i\xi_j \ge 0 \qquad (\xi\in\mathbb{R}^n). \tag{7}$$
The key is Itô's chain rule, which states that if $u : \mathbb{R}^n\to\mathbb{R}$ is a $C^2$-function, then for $t\ge0$
$$u(X(t)) = u(X(0)) + \int_0^t\Big(\sum_{i=1}^n b_i(X(s))u_{x_i}(X(s)) + \frac12\sum_{i,j=1}^n a_{ij}(X(s))u_{x_ix_j}(X(s))\Big)\,ds + \int_0^t\sum_{i=1}^n\sum_{k=1}^m b_{ik}(X(s))u_{x_i}(X(s))\,dW^k(s), \tag{9}$$
almost surely, along each sample path of $X(\cdot)$. Now define the exit time
$$\tau_x := \min\{t\ge0 \mid X(t)\notin U\}.$$
Assume, as will be the case in 3, 4 following, that $\tau_x < \infty$ a.s. We apply Itô's formula, with $u$ a solution of (10) and the random variable $\tau_x$ replacing $t$:
$$u(X(\tau_x)) = u(X(0)) + \int_0^{\tau_x} Du\cdot B\,dW.^4$$
We take expected values, and recall from [A], [FR], etc. that
$$E\Big(\int_0^{\tau_x} Du\cdot B\,dW\Big) = 0,$$
and so
$$u(x) = E\big(u(X(\tau_x))\big) \qquad (x\in U).$$

4. Note that there is no minus sign in front of the term involving the second derivatives: this differs from the convention in [E1, Chapter VI].
$$\begin{cases}\dot x(t) = b(x(t)) & (t\ge0)\\ x(0) = x.\end{cases} \tag{16}$$
We are therefore interpreting (15) as modeling the dynamics of a particle moving with velocity $v = b$ plus a small noise term. What happens when $\varepsilon\to0$?

This problem fits into the large deviations framework. We take $\Sigma = C([0,T];\mathbb{R}^n)$ for some $T > 0$ and write $P^\varepsilon$ to denote the distribution of the process $X^\varepsilon(\cdot)$ on $\Sigma$. Freidlin and Wentzell have shown that $\{P^\varepsilon\}_{0<\varepsilon\le1}$ satisfies the large deviation principle with a rate function $I[\cdot]$, defined this way:
$$I[y(\cdot)] = \begin{cases}\dfrac12\displaystyle\int_0^T\sum_{i,j=1}^n a^{ij}(y(t))\,\big(\dot y_i(t) - b_i(y(t))\big)\big(\dot y_j(t) - b_j(y(t))\big)\,dt & \text{if } y\in H^1([0,T];\mathbb{R}^n),\ y(0) = x\\[6pt] +\infty & \text{otherwise.}\end{cases} \tag{17}$$
Here $((a^{ij})) = A^{-1}$ is the inverse of the matrix $A = BB^T$ and $H^1([0,T];\mathbb{R}^n)$ denotes the Sobolev space of mappings from $[0,T]$ into $\mathbb{R}^n$ which are absolutely continuous, with square integrable derivatives. We write $y(\cdot) = (y_1(\cdot),\dots,y_n(\cdot))$.
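The functional $I[\cdot]$ vanishes exactly on trajectories of the unperturbed flow $\dot x = b(x)$ and is positive for paths that fight the drift. A discretized sketch with $A = I$ and $b(x) = -x$ (both arbitrary illustrative choices), comparing the flow trajectory with a path moving against the flow:

```python
import math

def action(y, T, b):
    """Discretized I[y] = (1/2) * integral over [0,T] of |y' - b(y)|^2 for a scalar path."""
    m = len(y) - 1
    dt = T / m
    total = 0.0
    for k in range(m):
        ydot = (y[k + 1] - y[k]) / dt
        ymid = 0.5 * (y[k] + y[k + 1])
        total += 0.5 * (ydot - b(ymid)) ** 2 * dt
    return total

b = lambda x: -x
T, m = 1.0, 2000
flow = [math.exp(-k * T / m) for k in range(m + 1)]     # solves y' = b(y), y(0) = 1
against = [1.0 + k * T / m for k in range(m + 1)]        # moves against the drift

print(action(flow, T, b))       # essentially 0: zero cost along the flow
print(action(against, T, b))    # strictly positive cost
```

For the linear path the exact cost is $\frac12\int_0^1 (1+(1+t))^2\,dt = 19/6$.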
b. Perturbations against the flow

We present now a PDE method for deriving an interesting special case of the aforementioned large deviation result, and in particular demonstrate how $I[\cdot]$ above arises. We follow Fleming [FL] and [E-I].
To set up this problem, take $U\subseteq\mathbb{R}^n$ as above, fix a point $x\in U$, and let $\{X^\varepsilon(t)\}_{t\ge0}$ solve the stochastic ODE (15). We now also select a smooth, relatively open subregion $\Gamma\subseteq\partial U$ and ask:

For small $\varepsilon > 0$, what is the probability that $X^\varepsilon(\cdot)$ first exits $U$ through the region $\Gamma$?

This is in general a very difficult problem, and so we turn attention to a special case, by hypothesizing concerning the vector field $b$ that
$$\frac12\int_0^\infty |\dot y(t) - b(y(t))|^2\,dt = +\infty \tag{19}$$
for each curve $y(\cdot)$ remaining within $\bar U$ for all times $t\ge0$. Condition (19) says that it requires an infinite amount of energy for a curve $y(\cdot)$ to resist being swept along with the flow $x(\cdot)$ determined by $b$, staying within $\bar U$ for all times $t\ge0$.
[Figure: flow lines of the ODE $\dot x = b(x)$]
Intuitively we expect that for small $\varepsilon > 0$, the overwhelming majority of the sample paths of $X^\varepsilon(\cdot)$ will stay close to $x(\cdot)$ and so be swept out of $U$ in finite time. If, on the other hand, we take for $\Gamma$ a smooth "window" within $\partial U$ lying upstream from $x$, the probability that a sample path of $X^\varepsilon(\cdot)$ will move against the flow and so exit $U$ through $\Gamma$ should be very small.
Notation. (i)
$$u^\varepsilon(x) = \text{probability that } X^\varepsilon(\cdot) \text{ first exits } U \text{ through } \Gamma = P\big(X^\varepsilon(\tau_x)\in\Gamma\big). \tag{20}$$
(ii)
$$g := \chi_\Gamma = \begin{cases}1 & \text{on } \Gamma\\ 0 & \text{on } \partial U\setminus\Gamma.\end{cases} \tag{21}$$
Then
$$u^\varepsilon(x) = E\big(g(X^\varepsilon(\tau_x))\big), \tag{22}$$
and so $u^\varepsilon$ solves the boundary value problem
$$\begin{cases}\dfrac{\varepsilon^2}{2}\displaystyle\sum_{i,j=1}^n a_{ij}u^\varepsilon_{x_ix_j} + \sum_{i=1}^n b_iu^\varepsilon_{x_i} = 0 & \text{in } U\\ u^\varepsilon = 1 & \text{on } \Gamma\\ u^\varepsilon = 0 & \text{on } \partial U\setminus\Gamma.\end{cases} \tag{23}$$
The main assertion is that
$$u^\varepsilon(x) = e^{-\frac{w(x)+o(1)}{\varepsilon^2}} \quad\text{as } \varepsilon\to0, \tag{24}$$
for an appropriate function $w$.
Proof (Outline). 1. We introduce a rescaled version of the log transform from Chapter IV, by setting
$$w^\varepsilon(x) := -\varepsilon^2\log u^\varepsilon(x) \qquad (x\in U). \tag{27}$$
Then
$$w^\varepsilon_{x_i} = -\varepsilon^2\frac{u^\varepsilon_{x_i}}{u^\varepsilon}, \qquad w^\varepsilon_{x_ix_j} = -\varepsilon^2\frac{u^\varepsilon_{x_ix_j}}{u^\varepsilon} + \varepsilon^2\frac{u^\varepsilon_{x_i}u^\varepsilon_{x_j}}{(u^\varepsilon)^2},$$
and so (23) implies
$$-\frac{\varepsilon^2}{2}\sum_{i,j=1}^n a_{ij}w^\varepsilon_{x_ix_j} + \frac12\sum_{i,j=1}^n a_{ij}w^\varepsilon_{x_i}w^\varepsilon_{x_j} - \sum_{i=1}^n b_iw^\varepsilon_{x_i} = 0 \quad\text{in } U, \tag{28}$$
with $w^\varepsilon = 0$ on $\Gamma$, $w^\varepsilon\to+\infty$ at $\partial U\setminus\Gamma$.

2. Differentiate the PDE (28) with respect to $x_k$:
$$-\frac{\varepsilon^2}{2}\sum_{i,j=1}^n a_{ij}w^\varepsilon_{x_kx_ix_j} + \sum_{i,j=1}^n a_{ij}w^\varepsilon_{x_kx_i}w^\varepsilon_{x_j} - \sum_{i=1}^n b_iw^\varepsilon_{x_kx_i} = R_1, \tag{29}$$
$R_1$ denoting terms involving the derivatives of the coefficients $a_{ij}$, $b_i$.
3. Set
$$z := |Dw^\varepsilon|^2, \tag{30}$$
so that
$$z_{x_i} = 2\sum_{k=1}^n w^\varepsilon_{x_k}w^\varepsilon_{x_kx_i}, \qquad z_{x_ix_j} = 2\sum_{k=1}^n\big(w^\varepsilon_{x_k}w^\varepsilon_{x_kx_ix_j} + w^\varepsilon_{x_kx_i}w^\varepsilon_{x_kx_j}\big). \tag{31}$$
Multiply (29) by $2w^\varepsilon_{x_k}$, sum on $k$, and use the ellipticity bound
$$\sum_{k=1}^n\sum_{i,j=1}^n a_{ij}w^\varepsilon_{x_kx_i}w^\varepsilon_{x_kx_j} \ge \theta|D^2w^\varepsilon|^2,$$
to find
$$-\frac{\varepsilon^2}{2}\sum_{i,j=1}^n a_{ij}z_{x_ix_j} + \sum_{i,j=1}^n a_{ij}z_{x_i}w^\varepsilon_{x_j} - \sum_{i=1}^n b_iz_{x_i} \le -\varepsilon^2\theta|D^2w^\varepsilon|^2 + R_2, \tag{32}$$
where
$$|R_2| \le \frac{\varepsilon^2\theta}{2}|D^2w^\varepsilon|^2 + C(|Dw^\varepsilon|^3 + 1) = \frac{\varepsilon^2\theta}{2}|D^2w^\varepsilon|^2 + C(z^{3/2} + 1).$$
Hence
$$-\frac{\varepsilon^2}{2}\sum_{i,j=1}^n a_{ij}z_{x_ix_j} + \sum_{i,j=1}^n a_{ij}z_{x_i}w^\varepsilon_{x_j} - \sum_{i=1}^n b_iz_{x_i} + \frac{\varepsilon^2\theta}{2}|D^2w^\varepsilon|^2 \le C(z^{3/2} + 1). \tag{33}$$
Note also
$$|Dz| \le C\,z^{1/2}|D^2w^\varepsilon|.$$
This inequality and (33) give us the estimate:
$$-\frac{\varepsilon^2}{2}\sum_{i,j=1}^n a_{ij}z_{x_ix_j} + \frac{\varepsilon^2\theta}{2}|D^2w^\varepsilon|^2 \le C\big(|Dz|\,z^{1/2} + z^{3/2}\big) + C. \tag{34}$$
[Figure: the subregion $V$]
4. Take a smooth cutoff function $\zeta$, with $\zeta\equiv1$ on $V$, $0\le\zeta\le1$. Write
$$\eta := \zeta^4z \tag{35}$$
and compute
$$\eta_{x_i} = \zeta^4z_{x_i} + 4\zeta^3\zeta_{x_i}z, \qquad \eta_{x_ix_j} = \zeta^4z_{x_ix_j} + 4\zeta^3(\zeta_{x_j}z_{x_i} + \zeta_{x_i}z_{x_j}) + 4(\zeta^3\zeta_{x_i})_{x_j}z. \tag{36}$$
Select a point $x_0\in\bar U$ where $\eta$ attains its maximum. Consider first the case that $x_0\in U$, $\zeta(x_0) > 0$. Then
$$D\eta(x_0) = 0, \qquad D^2\eta(x_0) \le 0.$$
Owing to (36),
$$Dz = -4\frac{D\zeta}{\zeta}z \quad\text{at } x_0, \tag{37}$$
and also
$$-\sum_{i,j=1}^n a_{ij}\eta_{x_ix_j} \ge 0 \quad\text{at } x_0.$$
Thus at $x_0$:
$$0 \le -\frac{\varepsilon^2}{2}\sum_{i,j=1}^n a_{ij}\eta_{x_ix_j} = -\frac{\varepsilon^2\zeta^4}{2}\sum_{i,j=1}^n a_{ij}z_{x_ix_j} + R_3,$$
where, by (37),
$$|R_3| \le \varepsilon^2C(\zeta^3|Dz| + \zeta^2z) \le \varepsilon^2C\zeta^2z.$$
A calculation combining this bound with (34) and (37) now yields
$$\eta(x_0) = \zeta^4z(x_0) \le C.$$
If instead $x_0\in\partial U$, a barrier argument gives
$$|Dw^\varepsilon(x_0)| = \varepsilon^2\frac{|Du^\varepsilon(x_0)|}{u^\varepsilon(x_0)} \le C.$$
In either case, since $\zeta\equiv1$ on $V$,
$$\sup_V|Dw^\varepsilon| \le C,$$
and consequently, as $w^\varepsilon = 0$ on $\Gamma$,
$$\sup_V|w^\varepsilon| \le C.$$
$$w = 0 \quad\text{on } \Gamma$$
and
$$\frac12\sum_{i,j=1}^n a_{ij}w_{x_i}w_{x_j} - \sum_{i=1}^n b_iw_{x_i} = 0 \quad\text{in } U, \tag{42}$$
in the viscosity sense: the proof is a straightforward adaptation of the vanishing viscosity calculation in VI.A. Since the PDE (42) holds a.e., we conclude that
$$|Dw| \le C \quad\text{a.e. in } U,$$
and so
$$w\in C^{0,1}(U).$$
We must identify $w$.
n
n
1
ij
a wxi wxj
bi wxi = 0 in U,
2 i,j=1
i=1
in the viscosity sense. To prove this, take a smooth function v and suppose
(46)
We must show
(47)
n
n
1
ij
a v xi v xj
bi vxi 0 at x0 .
2 i,j=1
i=1
203
To this end, consider the ODE
$$\dot y(s) = b(y(s)) - A(y(s))Dv(y(s)) \qquad (s > 0), \tag{48}$$
$$y(0) = x_0. \tag{49}$$
Let $t > 0$ be so small that $y(t)\in B(x_0,r)$. Then (43) implies
$$\tilde w(x_0) \le \tilde w(y(t)) + \frac12\int_0^t\sum_{i,j=1}^n a^{ij}(y(s))\,\big(\dot y_i - b_i(y)\big)\big(\dot y_j - b_j(y)\big)\,ds.$$
2. Next take a smooth function $v$ such that $\tilde w - v$ has a strict local minimum at a point $x_0\in U$, and prove
$$\frac12\sum_{i,j=1}^n a_{ij}v_{x_i}v_{x_j} - \sum_{i=1}^n b_iv_{x_i} \ge 0 \quad\text{at } x_0. \tag{52}$$
To verify this inequality, we assume instead that (52) fails, in which case
$$\frac12\sum_{i,j=1}^n a_{ij}v_{x_i}v_{x_j} - \sum_{i=1}^n b_iv_{x_i} \le -\gamma < 0 \quad\text{near } x_0 \tag{53}$$
for some constant $\gamma > 0$. Now take a small time $t > 0$. Then the definition (43) implies that there exists $y(\cdot)\in\mathcal{A}$ such that
$$\tilde w(x_0) \ge \tilde w(y(t)) + \frac12\int_0^t\sum_{i,j=1}^n a^{ij}(y(s))\,\big(\dot y_i - b_i(y)\big)\big(\dot y_j - b_j(y)\big)\,ds - o(t).$$
Now define
$$\alpha(s) := A^{-1}(y(s))\big(\dot y(s) - b(y(s))\big),$$
so that
$$\dot y(s) = b(y(s)) + A(y(s))\alpha(s) \qquad (s > 0), \qquad y(0) = x_0. \tag{55}$$
Then
$$v(x_0) - v(y(t)) = -\int_0^t\frac{d}{ds}v(y(s))\,ds = -\int_0^t Dv(y(s))\cdot\big(b(y(s)) + A(y(s))\alpha(s)\big)\,ds.$$
It remains to show
$$w = \tilde w \quad\text{in } U. \tag{56}$$
This is in fact true: the proof in [E-I] utilizes various viscosity solution tricks as well as the condition (19). We omit the details here. Finally then, our main assertion (24) follows from (56). $\square$
2. Fundamental quantities and their units:

    time           seconds (s)
    length         meters (m)
    mass           kilograms (kg)
    temperature    degrees Kelvin (K)
    quantity       moles (mol)

Derived quantities and their units:

    force          kg m s^{-2} = newton (N)
    pressure       N m^{-2} = pascal (Pa)
    work, energy   N m = joule (J)
    power          J s^{-1} = watt (W)
    entropy        J K^{-1}
    heat           calorie (cal) = 4.1840 J
References
[A]
[B-G-L] C. Bardos, F. Golse and D. Levermore, Fluid dynamic limits of kinetic equations I, J. Stat. Physics 63 (1991), 323-344.
[BN]
[B-S]
[B-T]
[C]
[CB]
B. Chow, On Harnack's inequality and entropy for Gaussian curvature flow, Comm. Pure Appl. Math. 44 (1991), 469-483.
[C-N]
B. Coleman and W. Noll, The thermodynamics of elastic materials with heat conduction and viscosity, Arch. Rat. Mech. Analysis 13 (1963), 167-178.
[C-O-S] B. Coleman, D. Owen and J. Serrin, The second law of thermodynamics for systems with approximate cycles, Arch. Rat. Mech. Analysis 77 (1981), 103-142.
[D]
[DA]
[D-S]
[D-Z]
[EL]
[E1]
[E2]
L. C. Evans, The perturbed test function method for viscosity solutions of nonlinear PDE, Proc. Royal Soc. Edinburgh 111 (1989), 359-375.
[E-G]
[E-I]
[F]
[F-L1]
[F-L2]
[FL]
[FR]
[F-W]
[G]
[GU]
[G-W]
M. Gurtin and W. Williams, An axiomatic foundation for continuum thermodynamics, Arch. Rat. Mech. Analysis 26 (1968), 83-117.
[H]
R. Hamilton, Remarks on the entropy and Harnack estimates for Gauss curvature flow, Comm. in Analysis and Geom. 1 (1994), 155-165.
[HU]
[J]
[vK-L]
[L-P-S]
[M-S]
[OK]
[O]
[P-T]
B. Perthame and E. Tadmor, A kinetic equation with kinetic entropy functions for scalar conservation laws, Comm. Math. Physics 136 (1991), 501-517.
[P]
[R]
[S1]
J. Serrin, Foundations of Classical Thermodynamics, Lecture Notes, Math. Department, U. of Chicago, 1975.
[S2]
[S3]
J. Serrin, An outline of thermodynamical structure, in New Perspectives in Thermodynamics (ed. by Serrin), Springer, 1986.
[SE]
[S]
[ST]
[SZ]
L. Szilard, On the decrease of entropy in a thermodynamic system by the intervention of intelligent beings, in [L-R].
[T]
[TR]
[T-M]
[T-N]
[V]
[W]
[Z]